Hi,
I have seen an issue very similar to the one Andreas reported. It was found when streaming a mender file (using mender install <url>) from my ARM device. But I have also managed to reproduce a similar issue by flooding the interface using iperf.
On target:
$ sudo iperf -s -u
On host:
$ iperf -c <ip> -u -b 200M -t 300
The LM842 dongle then runs into problems almost instantly and stops working.
I'm using the firmware below:
$ sudo dmesg | grep 8822c
[ 19.282167] Bluetooth: hci0: RTL: loading rtl_bt/rtl8822cu_fw.bin
[ 19.299025] Bluetooth: hci0: RTL: loading rtl_bt/rtl8822cu_config.bin
[ 19.628570] rtw_8822cu 1-1:1.2: WOW Firmware version 9.9.4, H2C version 15
[ 19.641604] rtw_8822cu 1-1:1.2: Firmware version 9.9.15, H2C version 15
$ iperf -s -u
------------------------------------------------------------
Server listening on UDP port 5001
UDP buffer size: 176 KByte (default)
------------------------------------------------------------
[ 415.791320] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
[ 415.797443] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
[ 415.803511] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
[ 415.809635] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
[ 438.102270] rtw_8822cu 1-1:1.2: failed to get tx report from firmware
[ 441.446726] rtw_8822cu 1-1:1.2: failed to send h2c command
[ 471.480932] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
Any ideas what might be the cause of this? I have also tried the latest patch from Sascha, which seems to be aimed at an issue I thought might be related to this (https://lore.kernel.org/linux-wireless/[email protected]/T/#m54b7c8c604b91cfce470fcec8fc7d4c20f3056c9), but I still get the same behavior.
BR,
Petter Mabäcker
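For anyone trying to automate the reproduction, here is a minimal sketch that scans saved kernel log text for the messages shown in the log above and records when each one first appeared. The function names and the choice of which messages count as "stuck" are assumptions made for illustration; they are based only on the messages in this mail, not on anything the driver documents.

```python
import re

# Matches lines such as:
#   [ 415.791320] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
LINE_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(rtw_\S+) \S+: (.+)$")

# Assumption: these three messages mark the non-recoverable state.
# The rx_queue overflow message is deliberately not in this list.
STUCK_MARKERS = (
    "failed to get tx report from firmware",
    "failed to send h2c command",
    "firmware failed to report density after scan",
)


def first_occurrences(log_text):
    """Map each rtw88 message to the timestamp (seconds) of its first line."""
    seen = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            seen.setdefault(match.group(3).strip(), float(match.group(1)))
    return seen


def looks_stuck(log_text):
    """Return True if any of the assumed 'stuck' messages is present."""
    seen = first_occurrences(log_text)
    return any(marker in seen for marker in STUCK_MARKERS)
```

Feeding it the output of dmesg after an iperf run would show whether only the overflow message appeared or whether the firmware-related failures followed it.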
On Thu, Apr 06, 2023 at 10:41:20AM +0000, [email protected] wrote:
> Hi,
>
> I have seen a very similar issue as Andreas. It was found when
> streaming a mender file (using mender install <url> from my arm
> device. But I have also managed to reproduce a similar issue by
> flooding the interface using iperf.
>
> on target:
> $ sudo iperf -s -u
>
> On host:
> $ iperf -c <ip> -u -b 200M -t 300
>
> Then it will almost instantly get problems causing the lm842 dongle to stop working.
I just gave it a try. This works here.
Which Kernel version do you use? Do you have:
709f329ea51c2 wifi: rtw88: usb: drop now unnecessary URB size check
b30dbecf891fb wifi: rtw88: usb: send Zero length packets if necessary
3bcc9db63336e wifi: rtw88: usb: Set qsel correctly
They are contained in v6.2.2 and v6.3-rc1, but not in v6.2.
Also you need these, but you already mentioned you have them:
https://patchwork.kernel.org/project/linux-wireless/patch/[email protected]/
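When the fixes are cherry-picked onto another branch, the commit hashes change, so looking for the hashes above will not work. A minimal sketch that checks by subject line instead is shown below; the function name and the idea of feeding it the text of `git log --oneline --no-decorate` are assumptions for illustration.

```python
# Subjects copied from the list above. Hashes are ignored on purpose,
# because cherry-picked commits get new hashes.
REQUIRED_SUBJECTS = [
    "wifi: rtw88: usb: drop now unnecessary URB size check",
    "wifi: rtw88: usb: send Zero length packets if necessary",
    "wifi: rtw88: usb: Set qsel correctly",
]


def missing_fixes(oneline_log, required=REQUIRED_SUBJECTS):
    """Return the required subjects not found in `git log --oneline` text."""
    subjects = set()
    for line in oneline_log.splitlines():
        parts = line.strip().split(" ", 1)  # "<hash> <subject>"
        if len(parts) == 2:
            subjects.add(parts[1].strip())
    return [subject for subject in required if subject not in subjects]
```

An empty result means all three subjects are present in the given log text. It does not prove the patches were applied without modification.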
Sascha
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
I'm working with a Linux 6.1 based track, but with all the mentioned bug fixes cherry-picked to that track. They have all made the LM842 a lot more stable, but the issue I see with "tx report failed" is currently blocking me from using the LM842, since the mender upgrade is a crucial part of my use case.
I have been trying to find a better way to reproduce the issue, without any success so far. For me it takes just 10-30 sec with the above-mentioned iperf flooding to at least trigger a similar case.
...
[ 671.908527] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
[ 671.914632] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
[ 671.920750] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
[ 671.926792] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
[ 671.932924] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
[ 694.709045] rtw_8822cu 1-1:1.2: failed to get tx report from firmware
[ 710.169496] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
[ 717.701235] rtw_8822cu 1-1:1.2: failed to send h2c command
I can also mention that I'm running this on an i.MX6 SoloX based board.
I will let you know if I find a better way to reproduce the issue. But if you have any good ideas about what the above errors (which bring down the entire interface) really mean (for example, do they indicate a kernel or a firmware issue?), please feel free to share them; it might help me troubleshoot the issue further.
BR Petter
On Mon, May 08, 2023 at 03:29:01PM +0200, Petter Mabacker wrote:
> I'm working with a Linux 6.1 based track, but with all the mentioned bug fixes cherry-picked to that track. They have all made the LM842 a lot more stabile, but the issue I see with "tx report failed" is currently blocking me from using the LM842, since the mender upgrade is a crucial part for my use-case.
>
> I have been trying to find a better way to reproduce the issue, without any success so far. For me it takes just 10-30 sec with above mention flooding using iperf to at least trigger a similar case.
>
> ...
> [ 671.908527] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
> [ 671.914632] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
> [ 671.920750] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
> [ 671.926792] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
> [ 671.932924] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
I am still not sure what to do about this. It happens with high RX load.
One way would be to just drop the log level of this message.
Otherwise this message should be harmless.
>
> [ 694.709045] rtw_8822cu 1-1:1.2: failed to get tx report from firmware
>
> [ 710.169496] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
> [ 717.701235] rtw_8822cu 1-1:1.2: failed to send h2c command
>
> I can also mention that I'm running this in a i.MX6 SoloX based board.
>
> I will let you guys know if I find a better way to reproduce the
> issue. But if you have any good ideas what above error (that brings
> down the entire interface) really mean (for example does it indicate
> kernel or firmware issue), please feel free to share some information
> about it and it might help me in troubleshooting the issue further.
Please try reproducing this with a recent mainline vanilla kernel. It
shouldn't be too hard to bring up an i.MX6 board with a vanilla kernel.
Regards,
Sascha
>> I'm working with a Linux 6.1 based track, but with all the mentioned bug fixes cherry-picked to that track. They have all made the LM842 a lot more stabile, but the issue I see with "tx report failed" is currently blocking me from using the LM842, since the mender upgrade is a crucial part for my use-case.
>>
>> I have been trying to find a better way to reproduce the issue, without any success so far. For me it takes just 10-30 sec with above mention flooding using iperf to at least trigger a similar case.
>>
>> ...
>> [ 671.908527] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
>> [ 671.914632] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
>> [ 671.920750] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
>> [ 671.926792] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
>> [ 671.932924] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
>I am still not sure what to do about this. It happens with high RX load.
>One way would be to just drop the log level of this message.
>Otherwise this message should be harmless.
As stated in earlier mails, the initial problem was found during a mender upgrade (streaming a ~200MB file). In that case the problem occurs without any high RX load warnings. So that is not really related (at least I don't think so).
The real problem is that the driver ends up in a non-working state after this. Not even hot-plugging the dongle helps. Instead, a reboot or a reset of the driver (rmmod/insmod etc.) is required.
>>
>> [ 694.709045] rtw_8822cu 1-1:1.2: failed to get tx report from firmware
>>
>> [ 710.169496] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
>> [ 717.701235] rtw_8822cu 1-1:1.2: failed to send h2c command
>>
>> I can also mention that I'm running this in a i.MX6 SoloX based board.
>>
>> I will let you guys know if I find a better way to reproduce the
>> issue. But if you have any good ideas what above error (that brings
>> down the entire interface) really mean (for example does it indicate
>> kernel or firmware issue), please feel free to share some information
>> about it and it might help me in troubleshooting the issue further.
>Please try reproducing this with a recent mainline vanilla kernel. It
>shouldn't be too hard to bring up a i.MX6 board with a vanilla kernel.
Just to be sure, I have tried this using the latest kernel tree, as you suggested:
~# uname -r
6.4.0-rc1-g5ca44e46dff4
However I get the very same behavior (in this case it's from the failed mender upgrade):
[ 724.788270] rtw_8822cu 1-1:1.2: failed to get tx report from firmware
[ 728.499480] rtw_8822cu 1-1:1.2: failed to send h2c command
[ 758.558511] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
May 09 06:48:17 iotgw mender[643]: time="2023-05-09T06:48:17Z" level=error msg="Download connection broken: read tcp 192.168.68.113:54072->52.239.140.42:443: read: connection timed out"
[ 796.975782] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
[ 835.251656] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
[ 843.586421] rtw_8822cu 1-1:1.2: failed to send h2c command
When I hotplug the dongle (which still doesn't solve the issue), I see the printout below. Any ideas what it really means? (I never see this before the problem occurs, only when hotplugging after the problem has occurred):
[ 2298.729359] wlx34c9f08deb60: Limiting TX power to 23 (23 - 0) dBm as advertised by 1c:3b:f3:55:59:93
Since you cannot reproduce the similar issue (perhaps not even the same root cause) that I saw using iperf, I will focus on trying to reproduce it using something similar to the streaming procedure done by mender. Any other suggestions from your side, or any logs etc. that could be of interest?
BR Petter
>Regards,
> Sascha
On Tue, May 09, 2023 at 09:43:50AM +0200, Petter Mabacker wrote:
> However I get the very same behavior (in this case it's from the failed mender upgrade):
> [ 724.788270] rtw_8822cu 1-1:1.2: failed to get tx report from firmware
> [ 728.499480] rtw_8822cu 1-1:1.2: failed to send h2c command
> [ 758.558511] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
> May 09 06:48:17 iotgw mender[643]: time="2023-05-09T06:48:17Z" level=error msg="Download connection broken: read tcp 192.168.68.113:54072->52.239.140.42:443: read: connection timed out"
> [ 796.975782] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
> [ 835.251656] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
> [ 843.586421] rtw_8822cu 1-1:1.2: failed to send h2c command
Unfortunately it looks like this very often when something goes wrong in
the RTW88 driver. These messages seem to be a general sign for the
device to say that we have touched it wrong somehow and it's stuck now.
>
> When I try to hotplug the dongle (that still don't solve the issue). I
> can see below printout, any ideas what it really mean? (I never see
> this before the problem occurs, only when hotplugging after the
> problem occurs):
>
> [ 2298.729359] wlx34c9f08deb60: Limiting TX power to 23 (23 - 0) dBm as advertised by 1c:3b:f3:55:59:93
>
> Since you cannot reproduce the similar (perhaps not even the same root
> issue) issue I saw using iperf, I will focus on trying to reproduce it
> using something similar as the streaming procedure done by mender. Any
> other suggestions from your side, or any logs etc that could be of
> interest?
You could verify that you are using a recent firmware. The driver prints
it during initialization. It should be 9.9.11.
Other than that I don't have any good idea, sorry.
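A minimal sketch for checking the firmware version from the driver's initialization messages, assuming the log format shown earlier in this thread ("Firmware version 9.9.15, H2C version 15"). The function names are made up for illustration, and the default minimum is simply the 9.9.11 mentioned above.

```python
import re

# Matches both the normal and the "WOW" firmware lines printed at init, e.g.
#   rtw_8822cu 1-1:1.2: Firmware version 9.9.15, H2C version 15
FW_RE = re.compile(r"rtw_\S+ \S+: (WOW )?Firmware version (\d+)\.(\d+)\.(\d+)")


def normal_firmware_version(dmesg_text):
    """Return the non-WOW firmware version as a tuple, or None if absent."""
    for line in dmesg_text.splitlines():
        match = FW_RE.search(line)
        if match and not match.group(1):  # skip the WOW firmware line
            return tuple(int(match.group(i)) for i in (2, 3, 4))
    return None


def is_at_least(version, minimum=(9, 9, 11)):
    """Compare version tuples; a missing version counts as too old."""
    return version is not None and version >= minimum
```

Comparing tuples of integers avoids the pitfall of comparing version strings, where "9.9.4" would sort after "9.9.15".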
Sascha
Actually, there is a more recent firmware, 9.9.15.
Petter uses that one; it is shown in the first email.
I also have an LM842 with the 9.9.15 firmware. I used iperf with TCP and I
could not reproduce that.
I will try iperf with UDP like in your case.
How do you use the LM842 when running iperf? As station or AP?
Gabriel
>Actually there is a recent firmware 9.9.15.
>Petter use that, is displayed in the first email.
>I have also an LM482 with 9.9.15 firmware. I used iperf with TCP and I
>could not reproduce that.
>I will try iperf with UDP like in your case.
>How do you use LM482 when running iperf ? As station or AP ?
>Gabriel
Yes, I have been using 9.9.15, but I have also tested with the 9.9.14 firmware.
I'm running it as a station. Please let me know if you manage to reproduce similar behaviour by flooding over UDP as in my example.
Thanks.
BR Petter
On my machine it works.
# iperf -s -u -p 5077
------------------------------------------------------------
Server listening on UDP port 5077
Receiving 1470 byte datagrams
UDP buffer size: 176 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.199.128 port 5077 connected with 192.168.199.129 port 52400
[ ID] Interval Transfer Bandwidth Jitter Lost/Total Datagrams
[ 3] 0.0-300.2 sec 1.13 GBytes 32.5 Mbits/sec 0.811 ms 447069/1275909 (35%)
C:\work\iperf\iperf-2.0.9-win64>iperf -c 192.168.199.128 -u -p 5077 -b 200M -t 300
------------------------------------------------------------
Client connecting to 192.168.199.128, UDP port 5077
Sending 1470 byte datagrams, IPG target: 58.80 us (kalman adjust)
UDP buffer size: 208 KByte (default)
------------------------------------------------------------
[ 3] local 192.168.199.129 port 52400 connected with 192.168.199.128 port 5077
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-300.0 sec 1.75 GBytes 50.0 Mbits/sec
[ 3] Sent 1275909 datagrams
[ 3] Server Report:
[ 3] 0.0-300.2 sec 1.13 GBytes 32.5 Mbits/sec 0.811 ms 447069/1275909 (35%)
The only problem I have is that, after some time, the link is reported
as coming up again, although the connection does not break and the
stick is still plugged in.
# ip -oneline -family inet monitor link
6: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> \ link/ether
6: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> \ link/ether
# uname -a
Linux 6.0.5-rt14-00133-g9bc110e18268 #23 PREEMPT_RT @1683554137 ppc GNU/Linux
I use Linux 6.0.5 patched with the latest changes from rtw88:
$ git log --oneline
9bc110e18268 (HEAD -> v6.0.5-rt14-rtw88) wifi: rtw88: Update spelling in main.h
3ae5fb6c4817 wifi: rtw88: Fix memory leak in rtw88_usb
45b8a6b717f7 wifi: rtw88: call rtw8821c_switch_rf_set() according to chip variant
9a026c4ca518 wifi: rtw88: set pkg_type correctly for specific rtw8821c variants
1a6a48dcfc62 wifi: rtw88: rtw8821c: Fix rfe_option field width
dc964f05689e wifi: rtw88: usb: fix priority queue to endpoint mapping
551f663748ec wifi: rtw88: 8822c: add iface combination
406db9770c3b wifi: rtw88: prevent scan abort with other VIFs
c4baacf76af1 wifi: rtw88: refine reserved page flow for AP mode
bf443e16ab6b wifi: rtw88: disallow PS during AP mode
a4e31f468776 wifi: rtw88: 8822c: extend reserved page number
7d3459fec41d wifi: rtw88: add port switch for AP mode
1663c8bbfb6c wifi: rtw88: add bitmap for dynamic port settings
177cce5278da wifi: rtw88: Add support for the SDIO based RTL8821CS chipset
0c2d0c2e95e9 wifi: rtw88: Add support for the SDIO based RTL8822CS chipset
54ccc15bf8e8 wifi: rtw88: Add support for the SDIO based RTL8822BS chipset
5ba6cc26d37d wifi: rtw88: main: Reserve 8 bytes of extra TX headroom for SDIO cards
78968acd1cf7 wifi: rtw88: main: Add the {cpwm,rpwm}_addr for SDIO based chipsets
2c84a6fbc425 wifi: rtw88: mac: Support SDIO specific bits in the power on sequence
f8fecd6b4b15 wifi: rtw88: sdio: Add HCI implementation for SDIO based chipsets
18d149b363ec wifi: rtw88: Clear RTW_FLAG_POWERON early in rtw_mac_power_switch()
4428afb018ad wifi: rtw88: Remove redundant pci_clear_master
e03a57505246 wifi: rtw88: remove unused rtw_pci_get_tx_desc function
1a33faaf7ee2 wifi: rtw88: fix memory leak in rtw_usb_probe()
7df514789989 wifi: rtw88: mac: Return the original error from rtw_mac_power_switch()
cdcec0087eed wifi: rtw88: mac: Return the original error from rtw_pwr_seq_parser()
29e7e78aff16 wifi: rtw88: rtw8822c: Implement RTL8822CS (SDIO) efuse parsing
fd1c578236f0 wifi: rtw88: rtw8822b: Implement RTL8822BS (SDIO) efuse parsing
27cf63903f82 wifi: rtw88: rtw8821c: Implement RTL8821CS (SDIO) efuse parsing
880dcc22d96e wifi: rtw88: mac: Add SDIO HCI support in the TX/page table setup
02a905156d9f wifi: rtw88: mac: Add support for the SDIO HCI in rtw_pwr_seq_parser()
3cbb5e76d79d wifi: rtw88: use RTW_FLAG_POWERON flag to prevent to power on/off twice
d645a4334e82 wifi: rtw88: add flag check before enter or leave IPS
96c6c8188d76 wifi: rtw88: usb: drop now unnecessary URB size check
7c6f489cee16 wifi: rtw88: usb: send Zero length packets if necessary
3cd2c456cf5f wifi: rtw88: usb: Set qsel correctly
93da623c5ab9 wifi: rtw88: mac: Use existing macros in rtw_pwr_seq_parser()
2adec4917988 wifi: rtw88: Move enum rtw_tx_queue_type mapping code to tx.{c,h}
25e6d15c23f2 wifi: rtw88: pci: Change queue datatype to enum rtw_tx_queue_type
f915cdb6b40c wifi: rtw88: pci: Use enum type for rtw_hw_queue_mapping() and ac_to_hwq
32c6e075e96f wifi: rtw88: Use non-atomic sta iterator in rtw_ra_mask_info_update()
42509ddaa145 wifi: rtw88: Use rtw_iterate_vifs() for rtw_vif_watch_dog_iter()
a963db0abf71 wifi: rtw88: Move register access from rtw_bf_assoc() outside the RCU
ba948e675516 wifi: rtw88: Add rtw8723du chipset support
5790a74b07bc wifi: rtw88: Add rtw8822cu chipset support
a22a3494b2b0 wifi: rtw88: Add rtw8822bu chipset support
fff42b4bfafe wifi: rtw88: Add rtw8821cu chipset support
bf5a6ce1815f wifi: rtw88: Add common USB chip support
2bd3a3cdff6b wifi: rtw88: iterate over vif/sta list non-atomically
e6198781367b wifi: rtw88: Drop coex mutex
7274d8b3144a wifi: rtw88: Drop h2c.lock
aa76144a028e wifi: rtw88: Drop rf_lock
2d162565d3bb wifi: rtw88: Call rtw_fw_beacon_filter_config() with rtwdev->mutex held
9691b6813454 wifi: rtw88: print firmware type in info message
94243d5a51f9 wifi: rtw88: 8821c: enable BT device recovery mechanism
ec71728e9565 wifi: rtw88: fix race condition when doing H2C command
dd281929b088 wifi: mac80211: extend ieee80211_nullfunc_get() for MLO
040e3123e9d9 (tag: v6.0.5-rt14) v6.0.5-rt14
c775cbedc0b8 net: axienet: Remove the obsolete u64_stats_fetch_*_irq().
19dbfda76f91 (tag: v6.0.5-rt13) v6.0.5-rt13
20d4181d35af Merge tag 'v6.0.5' into linux-6.0.y-rt
3829606fc5df (tag: v6.0.5) Linux 6.0.5
On Thu, May 11, 2023 at 3:26 PM Petter Mabacker <[email protected]> wrote:
>
> >Actually there is a recent firmware 9.9.15.
> >Petter use that, is displayed in the first email.
>
> >I have also an LM482 with 9.9.15 firmware. I used iperf with TCP and I
> >could not reproduce that.
> >I will try iperf with UDP like in your case.
> >How do you use LM482 when running iperf ? As station or AP ?
>
> >Gabriel
>
> Yes, I have been using 9.9.15, but I have also tested using 9.9.14 firmware.
> I'm running it as a station. Please let me know if you manage to reproduce a similar behaviour by flooding the udp like in my example.
>
> Thanks.
> BR Petter
>
> On Tue, May 9, 2023 at 11:18=E2=80=AFAM Sascha Hauer <[email protected]=
> e> wrote:
> >>
> >> On Tue, May 09, 2023 at 09:43:50AM +0200, Petter Mabacker wrote:
> >> > >> I'm working with a Linux 6.1 based track, but with all the mentioned bug fixes cherry-picked to that track. They have all made the LM842 a lot more stable, but the issue I see with "tx report failed" is currently blocking me from using the LM842, since the mender upgrade is a crucial part of my use case.
> >> > >>
> >> > >> I have been trying to find a better way to reproduce the issue, without any success so far. For me it takes just 10-30 sec of the above-mentioned iperf flooding to at least trigger a similar case.
> >> > >>
> >> > >> ...
> >> > >> [ 671.908527] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
> >> > >> [ 671.914632] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
> >> > >> [ 671.920750] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
> >> > >> [ 671.926792] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
> >> > >> [ 671.932924] rtw_8822cu 1-1:1.2: failed to get rx_queue, overflow
> >> >
> >> > >I am still not sure what to do about this. It happens with high RX load.
> >> > >One way would be to just drop the log level of this message.
> >> > >Otherwise this message should be harmless.
> >> >
> >> > Like stated in earlier mails, the initial problem was found during a
> >> > mender upgrade (streaming a ~200MB file). In that case the problem
> >> > occurs without any high RX load warnings. So that is not really
> >> > related (at least I don't think so).
> >> >
> >> > The real problem is that the driver ends up in a non-working state
> >> > after this. Not even hot-plugging the dongle helps. Instead a
> >> > reboot or a reset of the driver (rmmod/insmod etc.) is required.
> >> >
> >> > >>
> >> > >> [ 694.709045] rtw_8822cu 1-1:1.2: failed to get tx report from firmware
> >> > >>
> >> > >> [ 710.169496] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
> >> > >> [ 717.701235] rtw_8822cu 1-1:1.2: failed to send h2c command
> >> > >>
> >> > >> I can also mention that I'm running this on an i.MX6 SoloX based board.
> >> > >>
> >> > >> I will let you guys know if I find a better way to reproduce the
> >> > >> issue. But if you have any good ideas about what the above error (that brings
> >> > >> down the entire interface) really means (for example, whether it indicates
> >> > >> a kernel or firmware issue), please feel free to share some information
> >> > >> about it; it might help me troubleshoot the issue further.
> >> >
> >> > >Please try reproducing this with a recent mainline vanilla kernel. It
> >> > >shouldn't be too hard to bring up a i.MX6 board with a vanilla kernel.
> >> >
> >> > Just to be sure, I have tried this using latest kernel tree as you suggested:
> >> >
> >> > ~# uname -r
> >> > 6.4.0-rc1-g5ca44e46dff4
> >> >
> >> > However I get the very same behavior (in this case it's from the failed mender upgrade):
> >> > [ 724.788270] rtw_8822cu 1-1:1.2: failed to get tx report from firmware
> >> > [ 728.499480] rtw_8822cu 1-1:1.2: failed to send h2c command
> >> > [ 758.558511] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
> >> > May 09 06:48:17 iotgw mender[643]: time="2023-05-09T06:48:17Z" level=error msg="Download connection broken: read tcp 192.168.68.113:54072->52.239.140.42:443: read: connection timed out"
> >> > [ 796.975782] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
> >> > [ 835.251656] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
> >> > [ 843.586421] rtw_8822cu 1-1:1.2: failed to send h2c command
> >>
> >> Unfortunately it looks like this very often when something goes wrong in
> >> the RTW88 driver. These messages seem to be a general sign for the
> >> device to say that we have touched it wrong somehow and it's stuck now.
> >>
> >> >
> >> > When I try to hotplug the dongle (which still doesn't solve the issue), I
> >> > can see the printout below. Any ideas what it really means? (I never see
> >> > this before the problem occurs, only when hotplugging after the
> >> > problem occurs):
> >> >
> >> > [ 2298.729359] wlx34c9f08deb60: Limiting TX power to 23 (23 - 0) dBm as advertised by 1c:3b:f3:55:59:93
> >> >
> >> > Since you cannot reproduce the similar issue I saw using iperf (perhaps
> >> > not even the same root issue), I will focus on trying to reproduce it
> >> > using something similar to the streaming procedure done by mender. Any
> >> > other suggestions from your side, or any logs etc. that could be of
> >> > interest?
> >>
> >> You could verify that you are using a recent firmware. The driver prints
> >> it during initialization. It should be 9.9.11.
> >>
> >> Other than that I don't have any good idea, sorry.
> >>
> >> Sascha
> >>
> >> --
> >> Pengutronix e.K. | |
> >> Steuerwalder Str. 21 | http://www.pengutronix.de/ |
> >> 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
> >> Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
>
Hi Petter,
On Thu, Apr 06, 2023 at 10:41:20AM +0000, [email protected] wrote:
> Hi,
>
> I have seen a very similar issue as Andreas. It was found when streaming a mender file (using mender install <url> from my arm device. But I have also managed to reproduce a similar issue by flooding the interface using iperf.
>
> on target:
> $ sudo iperf -s -u
>
> On host:
> $ iperf -c <ip> -u -b 200M -t 300
>
> Then it will almost instantly get problems causing the lm842 dongle to stop working.
I could finally reproduce this problem by placing an access point close
enough to my device. Only then is the incoming packet rate high enough
that the "failed to get rx_queue, overflow" message triggers.
In my case the time it takes to print this message many times is enough
to confuse the device so that it finally responds with:
[ 126.449305] rtw_8822cu 1-1:1.2: failed to get tx report from firmware
[ 142.081419] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
[ 175.929407] rtw_8822cu 1-1:1.2: firmware failed to report density after scan
I just sent a patch that prints the message with dev_dbg_ratelimited
instead, which fixes that problem for me; you're on Cc.
It likely won't fix Andreas' problem though, as I don't see this message
in his bug report.
Sascha
Nice work. I have tested your patch v1 for the flooding and it solves my
iperf issue. Also, what you describe above is the
very same situation for me: I have been using a board that is very close
to the access point, so this is likely why I could reproduce it quite
easily.
I have however finally managed to make a breakthrough on the
original issue Andreas described, which so far has only been seen when
running mender install. A similar workload is downloading a large amount
of data while writing to the disk. So I can reproduce the
issue on my i.MX6 SoloX (single-CPU board) by running:
$ sudo dd if=/dev/urandom of=/path/to/bigfile bs=4M count=500
and in parallel downloading a large file, such as:
$ wget -O /dev/null http://speedtest.tele2.net/10GB.zip
This triggers the problem quite fast (within 5-15 min or so):
[ 374.763424] rtw_8822cu 1-1.2:1.2: failed to get tx report from firmware
[ 377.771790] rtw_8822cu 1-1.2:1.2: failed to send h2c command
[ 407.813460] rtw_8822cu 1-1.2:1.2: firmware failed to report density after scan
[ 414.965826] rtw_8822cu 1-1.2:1.2: failed to send h2c command
[ 444.993462] rtw_8822cu 1-1.2:1.2: firmware failed to report density after scan
[ 452.144551] rtw_8822cu 1-1.2:1.2: failed to send h2c command
[ 482.183445] rtw_8822cu 1-1.2:1.2: firmware failed to report density after scan
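For anyone who wants to retry this, the two commands above can be wrapped in one script. This is only a sketch: the `run` helper, the `DRY_RUN` switch, the `go` argument and the `BIGFILE`/`URL` variables are names made up here for illustration, and the defaults are simply the values from the commands above. It leaves out `sudo`, so it assumes the target path is writable.

```shell
#!/bin/sh
# Sketch: combine the disk writer and the download from the mail above.
# BIGFILE, URL, DRY_RUN and run() are illustrative names, not part of any tool.
BIGFILE=${BIGFILE:-/path/to/bigfile}
URL=${URL:-http://speedtest.tele2.net/10GB.zip}

# Print the command instead of executing it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

stress() {
    # Disk load in the background (needs write access to BIGFILE) ...
    run dd if=/dev/urandom of="$BIGFILE" bs=4M count=500 &
    dd_pid=$!
    # ... while the download keeps the wifi RX path busy.
    run wget -O /dev/null "$URL"
    wait "$dd_pid"
}

# Only start the load when explicitly asked to: ./stress.sh go
if [ "${1:-}" = "go" ]; then
    stress
fi
```

Setting `DRY_RUN=1` before calling `stress` only prints the two commands, which is a cheap way to check the paths before loading the board for real.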
However, one very interesting thing is that I cannot reproduce this on a
more powerful device, such as i.MX8 or RPi4. But when I tried this
on another less powerful, old single-core device (BCM2835), I was able to
reproduce it quite easily again.
So from my understanding it seems to be related to how the driver
behaves when the network queues/buffers are stretched and the
system is occupied with high I/O and/or system load. By increasing buffer sizes and
priorities for network queues, the system can handle it a bit better,
but enough stress on the system still seems to trigger the driver to
bail out completely.
Any suggestions or ideas around this are most welcome.
BR Petter
Some updates on this. Things seem to work a lot better since I moved to the latest 6.4 with `wifi: rtw89: correct PS calculation for SUPPORTS_DYNAMIC_PS` included (see https://lore.kernel.org/linux-wireless/[email protected]/T/#t and the discussion, for another rtw88 device, in https://github.com/lwfinger/rtw88/issues/129).
With the above fix I cannot reproduce this issue anymore. I can sometimes see "rtw_8822cu 1-1.2:1.2: failed to get tx report from firmware" appear, but the driver continues to operate and does not bail out.
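Since a lone "failed to get tx report from firmware" apparently no longer means the driver is stuck, it can help to count the messages rather than react to the first one. A minimal sketch, assuming the kernel log is fed on stdin; `rtw_log_summary` is a made-up helper name, and the patterns are just the four message strings quoted in this thread.

```shell
#!/bin/sh
# Sketch: count the rtw88 messages quoted in this thread in a kernel log on stdin.
# rtw_log_summary is an illustrative name, not an existing tool.
rtw_log_summary() {
    awk '
        /failed to get rx_queue, overflow/              { rx++ }
        /failed to get tx report from firmware/         { tx++ }
        /failed to send h2c command/                    { h2c++ }
        /firmware failed to report density after scan/  { scan++ }
        END {
            printf "rx_overflow=%d tx_report=%d h2c=%d scan=%d\n", rx, tx, h2c, scan
        }
    '
}
```

Typical use would be `dmesg | rtw_log_summary`. In the logs quoted in this thread the stuck state shows up as h2c and/or scan failures after the tx report message, so those two counters are probably the ones to watch.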
The only way I can still reproduce a similar issue now is when using HW offload scan through NetworkManager in combination with a business application that uses some nmlib callbacks. But I will drive that forward in a separate thread if I manage to find a better way to reproduce it. So in the meantime, running with the above-mentioned patch (e.g. the 6.4 tree) + Sascha's "wifi: rtw88: usb: silence log flooding error message" and https://lore.kernel.org/linux-wireless/[email protected]/T/#m65695e06fefb8cc5ae541dadacdd89ff540b875f + disabled HW offload scan seems to make the driver quite stable on my i.MX6 SoloX board.
Thanks for all input.
BR Petter