Lukas reports skb_over_panic errors on his Banana Pi BPI-CM4 which comes
with an Amlogic A311D (G12B) SoC and a RTL8822CS SDIO wifi/Bluetooth
combo card. The error he observed is identical to what has been fixed
in commit e967229ead0e ("wifi: rtw88: sdio: Check the HISR RX_REQUEST
bit in rtw_sdio_rx_isr()") but that commit didn't fix Lukas' problem.

Lukas found that disabling or limiting RX aggregation fixes the problem
for him. In the following discussion a few key topics have been
discussed which have an impact on this problem:
- The Amlogic A311D (G12B) SoC has a hardware bug in the SDIO controller
  which prevents DMA transfers. Instead all transfers need to go through
  the controller SRAM which limits transfers to 1536 bytes
- rtw88 chips don't split incoming (RX) packets, so if a big packet is
  received this is forwarded to the host in its original form
- rtw88 chips can do RX aggregation, meaning multiple incoming
  packets can be pulled by the host from the card with one MMC/SDIO
  transfer. This depends on settings in the REG_RXDMA_AGG_PG_TH
  register (BIT_RXDMA_AGG_PG_TH limits the number of packets that will
  be aggregated, BIT_DMA_AGG_TO_V1 configures a timeout for aggregation
  and BIT_EN_PRE_CALC makes the chip honor the limits more effectively)

Use multiple consecutive reads in rtw_sdio_read_port() to limit the
number of bytes which are copied by the host from the card in one
MMC/SDIO transfer. This allows receiving a buffer that's larger than
the host's max_req_size (number of bytes which can be transferred in
one MMC/SDIO transfer). As a result of this the skb_over_panic error
is gone as the rtw88 driver is now able to receive more than 1536 bytes
from the card (either because the incoming packet is larger than that
or because multiple packets have been aggregated).

Fixes: 65371a3f14e7 ("wifi: rtw88: sdio: Add HCI implementation for SDIO based chipsets")
Reported-by: Lukas F. Hartmann <[email protected]>
Closes: https://lore.kernel.org/linux-wireless/CAFBinCBaXtebixKbjkWKW_WXc5k=NdGNaGUjVE8NCPNxOhsb2g@mail.gmail.com/
Suggested-by: Ping-Ke Shih <[email protected]>
Signed-off-by: Martin Blumenstingl <[email protected]>
---
drivers/net/wireless/realtek/rtw88/sdio.c | 24 +++++++++++++++++------
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/sdio.c b/drivers/net/wireless/realtek/rtw88/sdio.c
index 2c1fb2dabd40..b19262ec5d8c 100644
--- a/drivers/net/wireless/realtek/rtw88/sdio.c
+++ b/drivers/net/wireless/realtek/rtw88/sdio.c
@@ -500,19 +500,31 @@ static u32 rtw_sdio_get_tx_addr(struct rtw_dev *rtwdev, size_t size,
 static int rtw_sdio_read_port(struct rtw_dev *rtwdev, u8 *buf, size_t count)
 {
 	struct rtw_sdio *rtwsdio = (struct rtw_sdio *)rtwdev->priv;
+	struct mmc_host *host = rtwsdio->sdio_func->card->host;
 	bool bus_claim = rtw_sdio_bus_claim_needed(rtwsdio);
 	u32 rxaddr = rtwsdio->rx_addr++;
+	size_t bytes;
 	int ret;
 
 	if (bus_claim)
 		sdio_claim_host(rtwsdio->sdio_func);
 
-	ret = sdio_memcpy_fromio(rtwsdio->sdio_func, buf,
-				 RTW_SDIO_ADDR_RX_RX0FF_GEN(rxaddr), count);
-	if (ret)
-		rtw_warn(rtwdev,
-			 "Failed to read %zu byte(s) from SDIO port 0x%08x",
-			 count, rxaddr);
+	while (count > 0) {
+		bytes = min_t(size_t, host->max_req_size, count);
+
+		ret = sdio_memcpy_fromio(rtwsdio->sdio_func, buf,
+					 RTW_SDIO_ADDR_RX_RX0FF_GEN(rxaddr),
+					 bytes);
+		if (ret) {
+			rtw_warn(rtwdev,
+				 "Failed to read %zu byte(s) from SDIO port 0x%08x",
+				 bytes, rxaddr);
+			break;
+		}
+
+		count -= bytes;
+		buf += bytes;
+	}
 
 	if (bus_claim)
 		sdio_release_host(rtwsdio->sdio_func);
--
2.41.0
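
The chunking idea in the patch can be tried out in userspace with a minimal
sketch like the one below. This is not driver code: struct fake_card,
fake_transfer() and chunked_read() are hypothetical names made up for this
illustration, and the mock simply rejects any request above max_req_size to
stand in for the host limit described in the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the card's RX FIFO; not part of rtw88. */
struct fake_card {
	const unsigned char *fifo;	/* data waiting on the card */
	size_t pos;			/* offset of the next unread byte */
	size_t max_req_size;		/* host limit per transfer, e.g. 1536 */
	unsigned int transfers;		/* number of successful transfers */
};

/* Mock of one bus transfer: refuses requests above the host limit. */
static int fake_transfer(struct fake_card *card, unsigned char *dst, size_t len)
{
	if (len > card->max_req_size)
		return -1;

	memcpy(dst, card->fifo + card->pos, len);
	card->pos += len;
	card->transfers++;
	return 0;
}

/* Read 'count' bytes using as many transfers as the host limit requires. */
static int chunked_read(struct fake_card *card, unsigned char *buf, size_t count)
{
	int ret = 0;

	/* A limit of zero would never make progress in the loop below. */
	assert(card->max_req_size > 0);

	while (count > 0) {
		size_t bytes = count < card->max_req_size ?
			       count : card->max_req_size;

		ret = fake_transfer(card, buf, bytes);
		if (ret)
			break;

		count -= bytes;
		buf += bytes;
	}

	return ret;
}
```

With max_req_size set to 1536, reading a 4000 byte buffer takes three
transfers in this sketch (1536 + 1536 + 928 bytes).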
> -----Original Message-----
> From: Martin Blumenstingl <[email protected]>
> Sent: Monday, July 10, 2023 3:57 AM
> To: [email protected]
> Cc: [email protected]; [email protected]; Ping-Ke Shih <[email protected]>;
> [email protected]; [email protected]; [email protected]; Martin Blumenstingl
> <[email protected]>; Lukas F . Hartmann <[email protected]>
> Subject: [PATCH] wifi: rtw88: sdio: Honor the host max_req_size in the RX path
>
> Lukas reports skb_over_panic errors on his Banana Pi BPI-CM4 which comes
> with an Amlogic A311D (G12B) SoC and a RTL8822CS SDIO wifi/Bluetooth
> combo card. The error he observed is identical to what has been fixed
> in commit e967229ead0e ("wifi: rtw88: sdio: Check the HISR RX_REQUEST
> bit in rtw_sdio_rx_isr()") but that commit didn't fix Lukas' problem.
>
> Lukas found that disabling or limiting RX aggregation fixes the problem
> for him. In the following discussion a few key topics have been
> discussed which have an impact on this problem:
> - The Amlogic A311D (G12B) SoC has a hardware bug in the SDIO controller
>   which prevents DMA transfers. Instead all transfers need to go through
>   the controller SRAM which limits transfers to 1536 bytes
> - rtw88 chips don't split incoming (RX) packets, so if a big packet is
>   received this is forwarded to the host in its original form
> - rtw88 chips can do RX aggregation, meaning multiple incoming
>   packets can be pulled by the host from the card with one MMC/SDIO
>   transfer. This depends on settings in the REG_RXDMA_AGG_PG_TH
>   register (BIT_RXDMA_AGG_PG_TH limits the number of packets that will
>   be aggregated, BIT_DMA_AGG_TO_V1 configures a timeout for aggregation
>   and BIT_EN_PRE_CALC makes the chip honor the limits more effectively)
>
> Use multiple consecutive reads in rtw_sdio_read_port() to limit the
> number of bytes which are copied by the host from the card in one
> MMC/SDIO transfer. This allows receiving a buffer that's larger than
> the host's max_req_size (number of bytes which can be transferred in
> one MMC/SDIO transfer). As a result of this the skb_over_panic error
> is gone as the rtw88 driver is now able to receive more than 1536 bytes
> from the card (either because the incoming packet is larger than that
> or because multiple packets have been aggregated).

I assume your conclusion is correct for all platforms, so I am adding my
Reviewed-by. However, I think it would be better if Lukas could test this
patch on his platform and give a Tested-by tag before it gets merged.

>
> Fixes: 65371a3f14e7 ("wifi: rtw88: sdio: Add HCI implementation for SDIO based chipsets")
> Reported-by: Lukas F. Hartmann <[email protected]>
> Closes: https://lore.kernel.org/linux-wireless/CAFBinCBaXtebixKbjkWKW_WXc5k=NdGNaGUjVE8NCPNxOhsb2g@mail.gmail.com/
> Suggested-by: Ping-Ke Shih <[email protected]>
> Signed-off-by: Martin Blumenstingl <[email protected]>
Reviewed-by: Ping-Ke Shih <[email protected]>
[...]
> -----Original Message-----
> From: Lukas F. Hartmann <[email protected]>
> Sent: Thursday, July 13, 2023 8:49 PM
> To: Ping-Ke Shih <[email protected]>; Martin Blumenstingl <[email protected]>;
> [email protected]
> Cc: [email protected]; [email protected]; [email protected]; [email protected];
> [email protected]
> Subject: RE: [PATCH] wifi: rtw88: sdio: Honor the host max_req_size in the RX path
>
> Hi,
>
> Ping-Ke Shih <[email protected]> writes:
>
> > I assume your conclusion is correct for all platforms, so I am adding my
> > Reviewed-by. However, I think it would be better if Lukas could test this
> > patch on his platform and give a Tested-by tag before it gets merged.
>
> I have been testing this now more rigorously on my own laptop with
> Kernel 6.4.1 (from Debian experimental) and this patch applied. I first
> had issues with rtw_power_mode_change (and "firmware failed to leave lps
> state"), so I turned off power_save using iw. This made everything
> quiet, but unfortunately after about 1 hour of usage I get
> skb_over_panic again and I believe some memory corruption happens in the
> kernel, as I can do dmesg only once and then another dmesg will hang forever.
> (After WARNING: CPU: 4 PID: 0 at kernel/context_tracking.c:128
> ct_kernel_exit.constprop.0+0xa0/0xa8)
>
> Here are the errors that lead up to this:
> http://dump.mntmn.com/rtw88-failure-1h-dmesg.txt

Hi Martin,

The dmesg shows that
"rtw_8822cs mmc2:0001:1: Failed to read 1536 byte(s) from SDIO port 0x000000d1"

Shouldn't we return an error code (with proper error handling) instead of
just breaking out of the loop? The content of 'buf' isn't usable in that case.

I wonder whether the approach of this patch is still not enough for Lukas'
platform.

Ping-Ke

Hello Ping-Ke,
On Fri, Jul 14, 2023 at 2:34 AM Ping-Ke Shih <[email protected]> wrote:
[...]
> > Here are the errors that lead up to this:
> > http://dump.mntmn.com/rtw88-failure-1h-dmesg.txt
>
> Hi Martin,
>
> The dmesg shows that
> "rtw_8822cs mmc2:0001:1: Failed to read 1536 byte(s) from SDIO port 0x000000d1"
>
> Shouldn't we return an error code (with proper error handling) instead of
> just breaking out of the loop? The content of 'buf' isn't usable in that case.

In my opinion we are properly breaking the loop:
"ret" will be non-zero so the error code is returned from
rtw_sdio_read_port() to the caller.
The (only) caller is rtw_sdio_rxfifo_recv() which sees the non-zero
return code and aborts processing.

What do you think?
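
The error propagation described here can be modelled with a small userspace
sketch. It is only an illustration under stated assumptions: struct mock_bus,
mock_transfer(), read_port() and recv_one() are hypothetical names, the mock
fails a chosen transfer with -EILSEQ, and recv_one() only mirrors the
"see the non-zero return code and abort" behaviour attributed to the caller
above. It does not reproduce the real rtw_sdio_rxfifo_recv().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical mock bus: fails transfer number 'fail_on' (0 = never). */
struct mock_bus {
	unsigned int calls;
	unsigned int fail_on;
};

static int mock_transfer(struct mock_bus *bus, size_t len)
{
	(void)len;

	if (++bus->calls == bus->fail_on)
		return -EILSEQ;

	return 0;
}

/* Chunked read that stops at the first failing transfer. */
static int read_port(struct mock_bus *bus, size_t count, size_t max_req_size)
{
	int ret = 0;

	assert(max_req_size > 0);

	while (count > 0) {
		size_t bytes = count < max_req_size ? count : max_req_size;

		ret = mock_transfer(bus, bytes);
		if (ret)
			break;	/* ret keeps the error code for the caller */

		count -= bytes;
	}

	return ret;
}

/* Caller: only counts a buffer as processed when the read succeeded. */
static int recv_one(struct mock_bus *bus, size_t count, size_t max_req_size,
		    int *processed)
{
	int ret = read_port(bus, count, max_req_size);

	if (ret)
		return ret;	/* abort, the buffer content is not usable */

	(*processed)++;
	return 0;
}
```

In this model a failure in the second of three transfers reaches the caller
as -EILSEQ and the buffer is never counted as processed.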

> I wonder whether the approach of this patch is still not enough for Lukas'
> platform.

On IRC Lukas wrote:
  funny, i can reproduce skb_panic when opening this page in chromium
  https://embedded.avnet.com/product/msc-sm2s-ryz/
and:
  still getting spurious skb_panics, even after disabling rx aggregation.

I haven't had the time to look into this any further yet.
I also don't have any hardware to reproduce this problem, which
unfortunately results in this long ping-pong.

Lukas, could you please add two more prints:
- in the rtw_warn with "Failed to read %zu byte(s) from SDIO port":
  please also print the ret variable (with %d) - I'm curious what the
  reported error is (it could be some CRC error which would mean ret is
  -EILSEQ)
- add something like the following at the end of rtw_sdio_read_port()
  (right before "return ret"):
	if (!ret && count > 1000) {
		printk(KERN_INFO "rtw_sdio_read_port() with %zu bytes:", count);
		print_hex_dump(KERN_INFO, "", DUMP_PREFIX_OFFSET, 16, 1, buf, count, false);
	}
  (note: I only compile-tested this)

The very last dump in this (potentially spammy) output will contain
the full buffer that's causing the problem.
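
One detail worth keeping in mind with the patched loop from the diff: it
decrements count to zero and advances buf on success, so a dump placed after
the loop would need the original values. The userspace sketch below shows
that idea only. read_and_dump() and hex_dump() are hypothetical helpers
(hex_dump() is a rough stand-in for the kernel's print_hex_dump()), and the
memcpy() merely imitates a successful transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Rough userspace stand-in for print_hex_dump(): 16 bytes per line. */
static void hex_dump(const unsigned char *p, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (i % 16 == 0)
			printf("%s%08zx:", i ? "\n" : "", i);
		printf(" %02x", p[i]);
	}
	if (len)
		printf("\n");
}

/*
 * Copy 'count' bytes in chunks of at most 'max_req_size' and dump the
 * whole buffer when it is larger than 'threshold'.
 * Returns the number of bytes dumped (0 when nothing was dumped).
 */
static size_t read_and_dump(const unsigned char *src, unsigned char *buf,
			    size_t count, size_t max_req_size,
			    size_t threshold)
{
	/* Remember the start and length before the loop modifies them. */
	unsigned char *orig_buf = buf;
	size_t orig_count = count;

	assert(max_req_size > 0);

	while (count > 0) {
		size_t bytes = count < max_req_size ? count : max_req_size;

		memcpy(buf, src, bytes);	/* imitates one transfer */
		src += bytes;
		buf += bytes;
		count -= bytes;
	}

	/* Here count is 0 and buf points past the end of the data. */
	if (orig_count > threshold) {
		printf("read of %zu bytes:\n", orig_count);
		hex_dump(orig_buf, orig_count);
		return orig_count;
	}

	return 0;
}
```

Called with a 2000 byte buffer and a threshold of 1000, this dumps all 2000
bytes starting at the original buffer address, while a 500 byte read prints
nothing.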
Best regards,
Martin
Martin Blumenstingl <[email protected]> wrote:
> Lukas reports skb_over_panic errors on his Banana Pi BPI-CM4 which comes
> with an Amlogic A311D (G12B) SoC and a RTL8822CS SDIO wifi/Bluetooth
> combo card. The error he observed is identical to what has been fixed
> in commit e967229ead0e ("wifi: rtw88: sdio: Check the HISR RX_REQUEST
> bit in rtw_sdio_rx_isr()") but that commit didn't fix Lukas' problem.
>
> Lukas found that disabling or limiting RX aggregation fixes the problem
> for him. In the following discussion a few key topics have been
> discussed which have an impact on this problem:
> - The Amlogic A311D (G12B) SoC has a hardware bug in the SDIO controller
>   which prevents DMA transfers. Instead all transfers need to go through
>   the controller SRAM which limits transfers to 1536 bytes
> - rtw88 chips don't split incoming (RX) packets, so if a big packet is
>   received this is forwarded to the host in its original form
> - rtw88 chips can do RX aggregation, meaning multiple incoming
>   packets can be pulled by the host from the card with one MMC/SDIO
>   transfer. This depends on settings in the REG_RXDMA_AGG_PG_TH
>   register (BIT_RXDMA_AGG_PG_TH limits the number of packets that will
>   be aggregated, BIT_DMA_AGG_TO_V1 configures a timeout for aggregation
>   and BIT_EN_PRE_CALC makes the chip honor the limits more effectively)
>
> Use multiple consecutive reads in rtw_sdio_read_port() to limit the
> number of bytes which are copied by the host from the card in one
> MMC/SDIO transfer. This allows receiving a buffer that's larger than
> the host's max_req_size (number of bytes which can be transferred in
> one MMC/SDIO transfer). As a result of this the skb_over_panic error
> is gone as the rtw88 driver is now able to receive more than 1536 bytes
> from the card (either because the incoming packet is larger than that
> or because multiple packets have been aggregated).
>
> Fixes: 65371a3f14e7 ("wifi: rtw88: sdio: Add HCI implementation for SDIO based chipsets")
> Reported-by: Lukas F. Hartmann <[email protected]>
> Closes: https://lore.kernel.org/linux-wireless/CAFBinCBaXtebixKbjkWKW_WXc5k=NdGNaGUjVE8NCPNxOhsb2g@mail.gmail.com/
> Suggested-by: Ping-Ke Shih <[email protected]>
> Signed-off-by: Martin Blumenstingl <[email protected]>
> Reviewed-by: Ping-Ke Shih <[email protected]>

Ping, should I take or drop the patch? It wasn't quite clear to me.

--
https://patchwork.kernel.org/project/linux-wireless/patch/[email protected]/
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches