2021-05-19 20:21:31

by Dave Olsthoorn

Subject: Re: mwifiex firmware crash

Hi,

I'll drop some of the people since this is a sub-thread of the original,
but I'll keep the lists so this stays accessible through lore.kernel.org.

On 2021-05-15 18:53, Pali Rohár wrote:
> Hello!
>
> On Saturday 15 May 2021 18:32:30 Dave Olsthoorn wrote:
>> Hi,
>>
>> On 2021-05-15 17:40, Pali Rohár wrote:
>> > On Saturday 15 May 2021 17:10:31 Dave Olsthoorn wrote:
>> > > The firmware still seems to crash quicker than previously, but that's
>> > > an unrelated problem.
>> >
>> > Hello! Do you have some more details (or links) about mentioned firmware
>> > crash?
>>
>> Sure, firmware crashes have always been a problem on the Surface
>> devices.
>
> What wifi chip you have on these devices? Because very similar firmware
> crashes I see on 88W8997 chip (also with mwifiex) when wifi card is
> configured in SDIO mode (not PCIe).
>

The Surface Pro 2017 has an 88W8897.

> I know that there are new version of firmwares for these 88W8xxx chips,
> but they are available only under NXP NDA and only for NXP customers.
> So it looks like that end users with NXP wifi chips are out of luck.
>
>> They seem to be related, at least for some of the crashes, to power
>> management. For this reason I disabled powersaving in NetworkManager which
>> used to make it at least stable enough for me, in 5.13 this trick does not
>> seem to work.
>>
>> The dmesg log attached shows a firmware crash happening, the card does not
>> work even after a reset or remove & rescan on the pci(e) bus.
>
> Similar issue, card start working again only after whole system
> restart.
>
> So this is something which can be resolved only in NXP.

After a conversation with the author of the patches, it turns out the
problem is not the power management itself (for most hardware revisions
[1]) but a race where PCI commands are written while the device is being
put to sleep. The patches include a fix for this which makes all PCI
commands synchronous instead of asynchronous [2].

After that, the wakeup patch seems relevant [3].

<snip>
>> There are patches [1] which have not been submitted yet and were developed
>> as part of the linux-surface effort [2]. From my experience these patches
>> resolve most if not all of the firmware crashes.
>
> Is somebody going to cleanup these patches and send them for inclusion
> into mainline kernel? I see that most of them are PCIe related, but due
> to seeing same issues also on SDIO bus, I guess adding similar hooks
> also for SDIO could make also SDIO more stable...

The author plans to upstream them; he just hasn't gotten around to it yet.

Regards,
Dave

[1]:
https://github.com/linux-surface/linux-surface/blob/master/patches/5.12/0002-mwifiex.patch#L2237-L2338
[2]:
https://github.com/linux-surface/linux-surface/blob/master/patches/5.12/0002-mwifiex.patch#L1152-L1207
[3]:
https://github.com/linux-surface/linux-surface/blob/master/patches/5.12/0002-mwifiex.patch#L1992-L2079