2021-03-01 13:34:39

by Marc Zyngier

Subject: Re: Aw: Re: [PATCH 09/13] PCI: mediatek: Advertise lack of MSI handling

Frank,

>> > i guess it's a bug in ath10k driver or my r64 board (it is a v1.1
>> > which has missing capacitors on tx lines).
>>
>> No, this definitely looks like a bug in the MTK PCIe driver,
>> where the mutex is either not properly initialised, corrupted,
>> or the wrong pointer is passed.
>
> but why does it happen only with the ath10k-card and not the mt7612 in
> same slot?

Does mt7612 use MSI? What we have here is a bogus mutex in the
MTK PCIe driver, and the only way not to get there would be
to avoid using MSIs.

>
>> This r64 machine is supposed to have working MSIs, right?
>
> imho mt7622 have working MSI
>
>> Do you get the same issue without this series?
>
> tested 5.11.0 [1] without this series (but with your/thomas' patch
> from discussion about my old patch) and got same trace. so this series
> does not break anything here.

Can you retest without any additional patch on top of 5.11?
These two patches only affect platforms that do *not* have MSIs at all.

>
>> > Tried with an mt7612e, this seems to work without any errors.
>> >
>> > so for mt7622/mt7623
>> >
>> > Tested-by: Frank Wunderlich <[email protected]>
>>
>> We definitely need to understand the above.
>
> there is a hardware-bug which may cause this...afair i saw this with
> the card in r64 with earlier Kernel-versions where other cards work
> (like the mt7612e).

I don't think a HW bug affecting PCI would cause what we are seeing
here, unless it results in memory corruption.

Thanks,

M.
--
Jazz is not dead. It just smells funny...


2021-03-03 03:45:06

by Frank Wunderlich

Subject: Aw: Re: Re: [PATCH 09/13] PCI: mediatek: Advertise lack of MSI handling

> Sent: Monday, 01 March 2021 at 14:31
> From: "Marc Zyngier" <[email protected]>
>
> Frank,
>
> >> > i guess it's a bug in ath10k driver or my r64 board (it is a v1.1
> >> > which has missing capacitors on tx lines).
> >>
> >> No, this definitely looks like a bug in the MTK PCIe driver,
> >> where the mutex is either not properly initialised, corrupted,
> >> or the wrong pointer is passed.
> >
> > but why does it happen only with the ath10k-card and not the mt7612 in
> > same slot?
>
> Does mt7612 use MSI? What we have here is a bogus mutex in the
> MTK PCIe driver, and the only way not to get there would be
> to avoid using MSIs.

I guess this card/its driver does not use MSI. I did not find anything about MSI in the "datasheet" [1] or the driver [2].

> >
> >> This r64 machine is supposed to have working MSIs, right?
> >
> > imho mt7622 have working MSI
> >
> >> Do you get the same issue without this series?
> >
> > tested 5.11.0 [1] without this series (but with your/thomas' patch
> > from discussion about my old patch) and got same trace. so this series
> > does not break anything here.
>
> Can you retest without any additional patch on top of 5.11?
> These two patches only affect platforms that do *not* have MSIs at all.

I can revert these 2, but I still need the patches for mt7622 PCIe support [3]. Btw, I see that I was missing these in 5.11-main. With them applied I do not see the traceback (the firmware is not installed, though):

root@bpi-r64:~# dmesg | grep ath
[ 6.450765] ath10k_pci 0000:01:00.0: assign IRQ: got 146
[ 6.661752] ath10k_pci 0000:01:00.0: enabling device (0000 -> 0002)
[ 6.697811] ath10k_pci 0000:01:00.0: enabling bus mastering
[ 6.721293] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
[ 6.921030] ath10k_pci 0000:01:00.0: Failed to find firmware-N.bin (N between 2 and 6) from ath10k/QCA988X/hw2.0: -2
[ 6.931698] ath10k_pci 0000:01:00.0: could not fetch firmware files (-2)
[ 6.940417] ath10k_pci 0000:01:00.0: could not probe fw (-2)

So the traceback was caused by changes to the MTK PCIe driver that are not yet upstream and were missing from my tree. I have added Chuanjia Liu to the thread.

> >
> >> > Tried with an mt7612e, this seems to work without any errors.
> >> >
> >> > so for mt7622/mt7623
> >> >
> >> > Tested-by: Frank Wunderlich <[email protected]>
> >>
> >> We definitely need to understand the above.
> >
> > there is a hardware-bug which may cause this...afair i saw this with
> > the card in r64 with earlier Kernel-versions where other cards work
> > (like the mt7612e).
>
> I don't think a HW bug affecting PCI would cause what we are seeing
> here, unless it results in memory corruption.


[1] https://www.asiarf.com/shop/wifi-wlan/wifi_mini_pcie/ws2433-wifi-11ac-mini-pcie-module-manufacturer/
[2] grep -Rni 'msi' drivers/net/wireless/mediatek/mt76/mt76x2/
[3] https://patchwork.kernel.org/project/linux-mediatek/list/?series=372885

2021-03-04 06:10:51

by Robin Murphy

Subject: Re: Aw: Re: Re: [PATCH 09/13] PCI: mediatek: Advertise lack of MSI handling

On 2021-03-01 14:06, Frank Wunderlich wrote:
>> Sent: Monday, 01 March 2021 at 14:31
>> From: "Marc Zyngier" <[email protected]>
>>
>> Frank,
>>
>>>>> i guess it's a bug in ath10k driver or my r64 board (it is a v1.1
>>>>> which has missing capacitors on tx lines).
>>>>
>>>> No, this definitely looks like a bug in the MTK PCIe driver,
>>>> where the mutex is either not properly initialised, corrupted,
>>>> or the wrong pointer is passed.
>>>
>>> but why does it happen only with the ath10k-card and not the mt7612 in
>>> same slot?
>>
>> Does mt7612 use MSI? What we have here is a bogus mutex in the
>> MTK PCIe driver, and the only way not to get there would be
>> to avoid using MSIs.
>
> I guess this card/its driver does not use MSI. I did not find anything about MSI in the "datasheet" [1] or the driver [2].

FWIW, no need to guess - `lspci -v` (as root) should tell you whether
the card has MSI (and/or MSI-X) capability, and whether it is enabled if so.
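
A minimal sketch of that check, assuming the card sits at 01:00.0 as in the ath10k dmesg lines quoted below. `msi_state` is a hypothetical helper name, not part of pciutils; it only reduces the `MSI:` / `MSI-X:` capability lines that `lspci -v` prints to one short line each.

```shell
# Hypothetical helper: read `lspci -v` output on stdin and print one line per
# MSI/MSI-X capability, e.g. "MSI enabled" or "MSI-X disabled".
msi_state() {
    # lspci marks the enable bit as "Enable+" (set) or "Enable-" (clear).
    sed -nE 's/.*(MSI(-X)?): Enable\+.*/\1 enabled/p; s/.*(MSI(-X)?): Enable-.*/\1 disabled/p'
}

# Usage on the board (as root, otherwise the capability list is hidden).
# The 01:00.0 address is an assumption taken from the dmesg output:
#   lspci -v -s 01:00.0 | msi_state
```

No output at all would mean that the device advertises neither capability.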

Robin.

>>>
>>>> This r64 machine is supposed to have working MSIs, right?
>>>
>>> imho mt7622 have working MSI
>>>
>>>> Do you get the same issue without this series?
>>>
>>> tested 5.11.0 [1] without this series (but with your/thomas' patch
>>> from discussion about my old patch) and got same trace. so this series
>>> does not break anything here.
>>
>> Can you retest without any additional patch on top of 5.11?
>> These two patches only affect platforms that do *not* have MSIs at all.
>
> I can revert these 2, but I still need the patches for mt7622 PCIe support [3]. Btw, I see that I was missing these in 5.11-main. With them applied I do not see the traceback (the firmware is not installed, though):
>
> root@bpi-r64:~# dmesg | grep ath
> [ 6.450765] ath10k_pci 0000:01:00.0: assign IRQ: got 146
> [ 6.661752] ath10k_pci 0000:01:00.0: enabling device (0000 -> 0002)
> [ 6.697811] ath10k_pci 0000:01:00.0: enabling bus mastering
> [ 6.721293] ath10k_pci 0000:01:00.0: pci irq msi oper_irq_mode 2 irq_mode 0 reset_mode 0
> [ 6.921030] ath10k_pci 0000:01:00.0: Failed to find firmware-N.bin (N between 2 and 6) from ath10k/QCA988X/hw2.0: -2
> [ 6.931698] ath10k_pci 0000:01:00.0: could not fetch firmware files (-2)
> [ 6.940417] ath10k_pci 0000:01:00.0: could not probe fw (-2)
>
> So the traceback was caused by changes to the MTK PCIe driver that are not yet upstream and were missing from my tree. I have added Chuanjia Liu to the thread.
>
>>>
>>>>> Tried with an mt7612e, this seems to work without any errors.
>>>>>
>>>>> so for mt7622/mt7623
>>>>>
>>>>> Tested-by: Frank Wunderlich <[email protected]>
>>>>
>>>> We definitely need to understand the above.
>>>
>>> there is a hardware-bug which may cause this...afair i saw this with
>>> the card in r64 with earlier Kernel-versions where other cards work
>>> (like the mt7612e).
>>
>> I don't think a HW bug affecting PCI would cause what we are seeing
>> here, unless it results in memory corruption.
>
>
> [1] https://www.asiarf.com/shop/wifi-wlan/wifi_mini_pcie/ws2433-wifi-11ac-mini-pcie-module-manufacturer/
> [2] grep -Rni 'msi' drivers/net/wireless/mediatek/mt76/mt76x2/
> [3] https://patchwork.kernel.org/project/linux-mediatek/list/?series=372885