2019-05-20 10:15:12

by Arend Van Spriel

Subject: Re: [PATCH 0/3] brcmfmac: sdio: Deal better w/ transmission errors waking from sleep

On 5/18/2019 12:54 AM, Douglas Anderson wrote:
> This series attempts to deal better with the expected transmission
> errors that we get when waking up the SDIO-based WiFi on
> rk3288-veyron-minnie, rk3288-veyron-speedy, and rk3288-veyron-mickey.
>
> Some details about those errors can be found in
> <https://crbug.com/960222>, but to summarize it here: if we try to
> send the wakeup command to the WiFi card at the same time it has
> decided to wake up itself then it will behave badly on the SDIO bus.
> This can cause timeouts or CRC errors.
>
> When I tested on 4.19 and 4.20 these CRC errors can be seen to cause
> re-tuning. Since I am currently developing on 4.19 this was the
> original problem I attempted to solve.
>
> On mainline it turns out that you don't see the retuning errors but
> you see tons of spam about timeouts trying to wakeup from sleep. I
> tracked down the commit that was causing that and have partially
> reverted it here. I have no real knowledge about Broadcom WiFi, but
> the commit that was causing problems sounds (from the description) to
> be a hack commit penalizing all Broadcom WiFi users because of a bug
> in a Cypress SD controller. I will let others comment if this is
> truly the case and, if so, what the right solution should be.

Let me give a bit of background. The brcmfmac driver implements its own
runtime-pm like functionality, i.e. if the driver is idle for some time
it will put the device in a low-power state. When it does that it powers
down several cores in the chip, among them the SDIO core. However, the
SDIO bus used to be very bad at handling devices that do that, so instead
the chip has the Always-On-Station (AOS) block take over from the SDIO
core in handling the bus. By default it will send an R1 response, but
only for CMD52 (and CMD14, but no host is using that cruft). In
noCmdDecode mode it does not respond and simply wakes up the SDIO core,
which takes over again. Because it does not respond, timeouts (-110) are
kinda expected in this mode.
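
To make the two modes concrete, here is a rough stand-alone sketch in
plain C. It is not driver code: struct fake_chip, fake_cmd52_write() and
wake_chip() are made-up names, and the model simply assumes that the
first CMD52 sent to a sleeping chip wakes the SDIO core in both modes.
It only shows why a caller would want to treat the first -ETIMEDOUT
(-110 on Linux) as expected and retry rather than report it.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two AOS behaviours described above. */
enum aos_mode { AOS_CMD_DECODE, AOS_NO_CMD_DECODE };

struct fake_chip {
	enum aos_mode mode;
	bool sdio_core_awake;
};

/*
 * Hypothetical stand-in for a CMD52 write on the bus. This is not a
 * brcmfmac or MMC core function, only a model of the chip's reaction.
 */
static int fake_cmd52_write(struct fake_chip *chip)
{
	if (chip->sdio_core_awake)
		return 0;		/* SDIO core answers normally */

	/* Assumption of this model: the command wakes the SDIO core. */
	chip->sdio_core_awake = true;

	if (chip->mode == AOS_NO_CMD_DECODE)
		return -ETIMEDOUT;	/* AOS stays silent, host times out */

	return 0;			/* AOS sent the response itself */
}

/*
 * Retry the wakeup write, counting the timeouts instead of logging
 * them. Returns 0 on success or the last error seen.
 */
static int wake_chip(struct fake_chip *chip, int max_tries, int *timeouts)
{
	int err = -ETIMEDOUT;
	int i;

	*timeouts = 0;
	for (i = 0; i < max_tries; i++) {
		err = fake_cmd52_write(chip);
		if (err != -ETIMEDOUT)
			break;
		(*timeouts)++;
	}
	return err;
}
```

In this model a sleeping chip in noCmdDecode mode costs exactly one
timeout before the retry succeeds, while the default mode succeeds on
the first try. How many retries real hardware needs is not something
this sketch can tell you.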

Regards,
Arend