2023-06-06 11:04:12

by Nicolas Escande

Subject: [regression] STP on 80211s is broken in 6.4-rc4

Hello Felix,

As a user of the mesh part of mac80211 on multiple products at work, let me say
thank you for all the work you do on wifi, especially on 80211s, and especially
the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat

We upgraded our kernel from 5.15 to 6.4. The problem is that STP no longer
works, and alas we rely on it for now (for better or worse).

What I gathered so far from my setup:
- we use ath9k & ath10k
- in my case STP frames are received as regular packets and not as A-MSDU
- the received packets have a wrong length of 44 in tcpdump
  (instead of 38 with our previous kernel)
- llc_fixup_skb() tries to pull some 41 bytes out of a 35-byte packet;
  this makes llc_rcv() discard the frames & breaks STP
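
(Illustration of the last bullet: a minimal userspace sketch of the kind of
length check that fails here. It is a simplified model, not the kernel's
llc_fixup_skb(). The function name llc_len_check_ok() and the 3-byte LLC
header, which assumes a U-format PDU as STP BPDUs use, are assumptions made
for the example.)

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed LLC header length for a U-format PDU: DSAP + SSAP + 1 control byte. */
#define LLC_U_HDR_LEN 3

/*
 * Simplified model of an 802.3 length check (hypothetical helper, not kernel code).
 *
 * len_field:   value carried in the 802.3 length field of the Ethernet header
 * frame_bytes: bytes actually present after the Ethernet header
 *
 * Returns true if the payload promised by the length field is really present.
 */
bool llc_len_check_ok(uint16_t len_field, uint16_t frame_bytes)
{
	int32_t want, have;

	/* The LLC header itself must fit in the frame. */
	if (frame_bytes < LLC_U_HDR_LEN)
		return false;

	want = (int32_t)len_field - LLC_U_HDR_LEN;   /* payload promised */
	have = (int32_t)frame_bytes - LLC_U_HDR_LEN; /* payload present  */

	return want >= 0 && want <= have;
}
```

With the values from the report, a length field of 44 promises 41 payload
bytes while only 35 are present, so llc_len_check_ok(44, 38) is false, whereas
llc_len_check_ok(38, 38) is true.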

From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
(wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)

I guess that your changes to handle both A-MSDU subframes & normal frames in the
same datapath end up putting a wrong skb->len for STP (multicast) frames?
Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
exact problem.

It seems this change was already in the 6.3 kernel so I guess someone should
have seen it before (but I didn't find anything..) ? Maybe I missed something...

Anyway I'm happy to provide more info or try anything you throw at me.

Thanks,

---
Nicolas E.


2023-06-10 06:50:58

by Bagas Sanjaya

Subject: Re: [regression] STP on 80211s is broken in 6.4-rc4

On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
> Hello Felix,
>
> As user of the mesh part of mac80211 on multiple products at work let me say
> thank you for all the work you do on wifi, especially on 80211s, and especially
> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
>
> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP
> doesn't work anymore and alas we use it for now (for the better or worse).
>
> What I gathered so far from my setup:
> - we use ath9k & ath10k
> - in my case STP frames are received as regular packet and not as amsdu
> - the received packets have a wrong length of 44 in tcpdump
> (instead of 38 with our previous kernel)
> - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
> this makes llc_rcv() discard the frames & breaks STP
>
> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
>
> I guess that your changes to handle both ampdu subframes & normal frames in the
> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
> exact problem.
>
> It seems this change was already in the 6.3 kernel so I guess someone should
> have seen it before (but I didn't find anything..) ? Maybe I missed something...
>
> Anyway I'm happy to provide more info or try anything you throw at me.
>

Thanks for the regression report. I'm adding it to regzbot:

(Felix: it looks like this regression is introduced by a commit authored by you.
Would you like to take a look on it?)

#regzbot ^introduced: 986e43b19ae917

--
An old man doll... just what I always wanted! - Clara


Subject: Re: [regression] STP on 80211s is broken in 6.4-rc4

On 10.06.23 08:44, Bagas Sanjaya wrote:
> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
>> Hello Felix,
>>
>> As user of the mesh part of mac80211 on multiple products at work let me say
>> thank you for all the work you do on wifi, especially on 80211s, and especially
>> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
>>
>> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP
>> doesn't work anymore and alas we use it for now (for the better or worse).
>>
>> What I gathered so far from my setup:
>> - we use ath9k & ath10k
>> - in my case STP frames are received as regular packet and not as amsdu
>> - the received packets have a wrong length of 44 in tcpdump
>> (instead of 38 with our previous kernel)
>> - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
>> this makes llc_rcv() discard the frames & breaks STP
>>
>> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
>> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
>>
>> I guess that your changes to handle both ampdu subframes & normal frames in the
>> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
>> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
>> exact problem.
>>
>> It seems this change was already in the 6.3 kernel so I guess someone should
>> have seen it before (but I didn't find anything..) ? Maybe I missed something...
>>
>> Anyway I'm happy to provide more info or try anything you throw at me.
>>
>
> Thanks for the regression report. I'm adding it to regzbot:
>
> (Felix: it looks like this regression is introduced by a commit authored by you.
> Would you like to take a look on it?)
>
> #regzbot ^introduced: 986e43b19ae917

Hmmm, Felix did not reply. But let's ignore that for now.

Nicolas, I noticed there are a few patches in next that refer to the
culprit. Might be worth giving this series a try:

https://lore.kernel.org/all/[email protected]/

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

2023-06-16 07:54:53

by Nicolas Escande

Subject: Re: [regression] STP on 80211s is broken in 6.4-rc4

On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 10.06.23 08:44, Bagas Sanjaya wrote:
> > On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
> >> Hello Felix,
> >>
> >> As user of the mesh part of mac80211 on multiple products at work let me say
> >> thank you for all the work you do on wifi, especially on 80211s, and especially
> >> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
> >>
> >> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP
> >> doesn't work anymore and alas we use it for now (for the better or worse).
> >>
> >> What I gathered so far from my setup:
> >> - we use ath9k & ath10k
> >> - in my case STP frames are received as regular packet and not as amsdu
> >> - the received packets have a wrong length of 44 in tcpdump
> >> (instead of 38 with our previous kernel)
> >> - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
> >> this makes llc_rcv() discard the frames & breaks STP
> >>
> >> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
> >> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
> >>
> >> I guess that your changes to handle both ampdu subframes & normal frames in the
> >> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
> >> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
> >> exact problem.
> >>
> >> It seems this change was already in the 6.3 kernel so I guess someone should
> >> have seen it before (but I didn't find anything..) ? Maybe I missed something...
> >>
> >> Anyway I'm happy to provide more info or try anything you throw at me.
> >>
> >
> > Thanks for the regression report. I'm adding it to regzbot:
> >
> > (Felix: it looks like this regression is introduced by a commit authored by you.
> > Would you like to take a look on it?)
> >
> > #regzbot ^introduced: 986e43b19ae917
>
> Hmmm, Felix did not reply. But let's ignore that for now.

I haven't seen mails from felix on the list for a few days, I'm guessing he's
unavailable for now but I'll happily wait.

>
> Nicolas, I noticed there are a few patches in next that refer to the
> culprit. Might be worth giving this series a try:
>
> https://lore.kernel.org/all/[email protected]/

Well this series already landed in 6.4 and that is the version I did my initial
testing with. So no luck there.

>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.

Subject: Re: [regression] STP on 80211s is broken in 6.4-rc4

On 16.06.23 09:45, Nicolas Escande wrote:
> On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 10.06.23 08:44, Bagas Sanjaya wrote:
>>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:

>> Hmmm, Felix did not reply. But let's ignore that for now.
> I haven't seen mails from felix on the list for a few days, I'm guessing he's
> unavailable for now but I'll happily wait.

Okay.

>> Nicolas, I noticed there are a few patches in next that refer to the
>> culprit. Might be worth giving this series a try:
>> https://lore.kernel.org/all/[email protected]/
> Well this series already landed in 6.4 and that is the version I did my initial
> testing with. So no luck there.

What? Ohh, sorry for the noise, I had missed that they were in mainline
already.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

2023-06-16 12:39:54

by Bagas Sanjaya

Subject: Re: [regression] STP on 80211s is broken in 6.4-rc4

On 6/16/23 16:25, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 16.06.23 09:45, Nicolas Escande wrote:
>> On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>>> On 10.06.23 08:44, Bagas Sanjaya wrote:
>>>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
>
>>> Hmmm, Felix did not reply. But let's ignore that for now.
>> I haven't seen mails from felix on the list for a few days, I'm guessing he's
>> unavailable for now but I'll happily wait.
>
> Okay.
>
>>> Nicolas, I noticed there are a few patches in next that refer to the
>>> culprit. Might be worth giving this series a try:
>>> https://lore.kernel.org/all/[email protected]/
>> Well this series already landed in 6.4 and that is the version I did my initial
>> testing with. So no luck there.
>
> What? Ohh, sorry for the noise, I had missed that they were in mainline
> already.
>

Hi Thorsten,

Should this be removed from tracking as inconclusive?

--
An old man doll... just what I always wanted! - Clara


Subject: Re: [regression] STP on 80211s is broken in 6.4-rc4

On 16.06.23 14:17, Bagas Sanjaya wrote:
> On 6/16/23 16:25, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 16.06.23 09:45, Nicolas Escande wrote:
>>> On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>>>> On 10.06.23 08:44, Bagas Sanjaya wrote:
>>>>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
>>
>>>> Hmmm, Felix did not reply. But let's ignore that for now.
>>> I haven't seen mails from felix on the list for a few days, I'm guessing he's
>>> unavailable for now but I'll happily wait.
>>
>> Okay.
>>
>>>> Nicolas, I noticed there are a few patches in next that refer to the
>>>> culprit. Might be worth giving this series a try:
>>>> https://lore.kernel.org/all/[email protected]/
>>> Well this series already landed in 6.4 and that is the version I did my initial
>>> testing with. So no luck there.
>>
>> What? Ohh, sorry for the noise, I had missed that they were in mainline
>> already.
>
> Should this be removed from tracking as inconclusive?

Ehh, why? Afaics this is still a regression, just not one the reporter
considers urgent; that is fine for me, unless more people start to
report the problem.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.


Subject: Re: [regression] STP on 80211s is broken in 6.4-rc4

On 16.06.23 09:45, Nicolas Escande wrote:
> On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 10.06.23 08:44, Bagas Sanjaya wrote:
>>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
>>>>
>>>> As user of the mesh part of mac80211 on multiple products at work let me say
>>>> thank you for all the work you do on wifi, especially on 80211s, and especially
>>>> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
>>>>
>>>> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP
>>>> doesn't work anymore and alas we use it for now (for the better or worse).
>>>>
>>>> What I gathered so far from my setup:
>>>> - we use ath9k & ath10k
>>>> - in my case STP frames are received as regular packet and not as amsdu
>>>> - the received packets have a wrong length of 44 in tcpdump
>>>> (instead of 38 with our previous kernel)
>>>> - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
>>>> this makes llc_rcv() discard the frames & breaks STP
>>>>
>>>> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
>>>> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
>>>>
>>>> I guess that your changes to handle both ampdu subframes & normal frames in the
>>>> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
>>>> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
>>>> exact problem.
>>>>
>>>> It seems this change was already in the 6.3 kernel so I guess someone should
>>>> have seen it before (but I didn't find anything..) ? Maybe I missed something...
>>>>
>>>> Anyway I'm happy to provide more info or try anything you throw at me.
>> [...]
>> Hmmm, Felix did not reply. But let's ignore that for now.
>
> I haven't seen mails from felix on the list for a few days, I'm guessing he's
> unavailable for now but I'll happily wait.

Still no progress. Hmmm. Are you still okay with that? I've seen no
other reports about this, so waiting is somewhat (albeit not completely)
fine for me if it is for you.

But in any case it might be good if you could recheck 6.5-rc1.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

2023-07-10 16:51:50

by Nicolas Escande

Subject: Re: [regression] STP on 80211s is broken in 6.4-rc4

On Mon Jul 10, 2023 at 1:32 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 16.06.23 09:45, Nicolas Escande wrote:
> > On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> On 10.06.23 08:44, Bagas Sanjaya wrote:
> >>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
> >>>>
> >>>> As user of the mesh part of mac80211 on multiple products at work let me say
> >>>> thank you for all the work you do on wifi, especially on 80211s, and especially
> >>>> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
> >>>>
> >>>> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP
> >>>> doesn't work anymore and alas we use it for now (for the better or worse).
> >>>>
> >>>> What I gathered so far from my setup:
> >>>> - we use ath9k & ath10k
> >>>> - in my case STP frames are received as regular packet and not as amsdu
> >>>> - the received packets have a wrong length of 44 in tcpdump
> >>>> (instead of 38 with our previous kernel)
> >>>> - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
> >>>> this makes llc_rcv() discard the frames & breaks STP
> >>>>
> >>>> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
> >>>> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
> >>>>
> >>>> I guess that your changes to handle both ampdu subframes & normal frames in the
> >>>> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
> >>>> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
> >>>> exact problem.
> >>>>
> >>>> It seems this change was already in the 6.3 kernel so I guess someone should
> >>>> have seen it before (but I didn't find anything..) ? Maybe I missed something...
> >>>>
> >>>> Anyway I'm happy to provide more info or try anything you throw at me.
> >> [...]
> >> Hmmm, Felix did not reply. But let's ignore that for now.
> >
> > I haven't seen mails from felix on the list for a few days, I'm guessing he's
> > unavailable for now but I'll happily wait.
>
> Still no progress. Hmmm. Are you still okay with that? I've seen no
> other reports about this, so waiting is somewhat (albeit not completely)
> fine for me if it is for you.
I'm not so surprised no one else reported it, using STP on wifi (and 802.11s) is
not a really common thing to do, to be honest (and STP on wifi is unreliable).
Even though some openwrt guys do it for sure, I'm guessing their kernel version
is lagging behind...
>
> But in any case it might be good if you could recheck 6.5-rc1.
Testing on 6.5 as a whole won't be as easy for me as testing a single patch on
top of 6.4. I'll do my best to try but from what I saw nothing got merged that
would even remotely help me on this issue.

I am not losing hope that Felix or someone that understands this stuff better
finds the time to look into this. I'm guessing it's the summer vacation effect.

>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.

2023-07-11 11:28:23

by Felix Fietkau

Subject: Re: [regression] STP on 80211s is broken in 6.4-rc4

On 10.07.23 18:50, Nicolas Escande wrote:
> On Mon Jul 10, 2023 at 1:32 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 16.06.23 09:45, Nicolas Escande wrote:
>> > On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
>> >> On 10.06.23 08:44, Bagas Sanjaya wrote:
>> >>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
>> >>>>
>> >>>> As user of the mesh part of mac80211 on multiple products at work let me say
>> >>>> thank you for all the work you do on wifi, especially on 80211s, and especially
>> >>>> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
>> >>>>
>> >>>> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP
>> >>>> doesn't work anymore and alas we use it for now (for the better or worse).
>> >>>>
>> >>>> What I gathered so far from my setup:
>> >>>> - we use ath9k & ath10k
>> >>>> - in my case STP frames are received as regular packet and not as amsdu
>> >>>> - the received packets have a wrong length of 44 in tcpdump
>> >>>> (instead of 38 with our previous kernel)
>> >>>> - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
>> >>>> this makes llc_rcv() discard the frames & breaks STP
>> >>>>
>> >>>> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
>> >>>> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
>> >>>>
>> >>>> I guess that your changes to handle both ampdu subframes & normal frames in the
>> >>>> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
>> >>>> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
>> >>>> exact problem.
>> >>>>
>> >>>> It seems this change was already in the 6.3 kernel so I guess someone should
>> >>>> have seen it before (but I didn't find anything..) ? Maybe I missed something...
>> >>>>
>> >>>> Anyway I'm happy to provide more info or try anything you throw at me.
>> >> [...]
>> >> Hmmm, Felix did not reply. But let's ignore that for now.
>> >
>> > I haven't seen mails from felix on the list for a few days, I'm guessing he's
>> > unavailable for now but I'll happily wait.
>>
>> Still no progress. Hmmm. Are you still okay with that? I've seen no
>> other reports about this, so waiting is somewhat (albeit not completely)
>> fine for me if it is for you.
> I'm not so surprised no one else reported it, using STP on wifi (and 802.11s) is
> not a really common thing to do, to be honest (and STP on wifi is unreliable).
> Even though some openwrt guys do it for sure, I'm guessing their kernel version
> is lagging behind...
>>
>> But in any case it might be good if you could recheck 6.5-rc1.
> Testing on 6.5 as a whole won't be as easy for me as testing a single patch on
> top of 6.4. I'll do my best to try but from what I saw nothing got merged that
> would even remotely help me on this issue.
>
> I am not losing hope that Felix or someone that understands this stuff better
> finds the time to look into this. I'm guessing it's the summer vacation effect.

Sorry for the delay. This should fix the regression, please test.
I will submit it for 6.5 soon.
---
--- a/net/wireless/util.c
+++ b/net/wireless/util.c
@@ -580,6 +580,8 @@ int ieee80211_strip_8023_mesh_hdr(struct
 		hdrlen += ETH_ALEN + 2;
 	else if (!pskb_may_pull(skb, hdrlen))
 		return -EINVAL;
+	else
+		payload.eth.h_proto = htons(skb->len - hdrlen);

 	mesh_addr = skb->data + sizeof(payload.eth) + ETH_ALEN;
 	switch (payload.flags & MESH_FLAGS_AE) {
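
(Illustration of what the added else branch appears to do: when the frame
carries no RFC 1042 tunnel header, the Ethernet type/length field is rewritten
to the number of bytes left after the headers described by hdrlen, so the
frame reads as a plain 802.3 length frame again. This is a minimal userspace
sketch under assumptions, not kernel code: the helper names are hypothetical,
and 0x0600 is the usual boundary between 802.3 lengths and EtherTypes.)

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h> /* htons(), ntohs() */

/* Values below this are 802.3 lengths; values at or above it are EtherTypes. */
#define ETHERTYPE_MIN 0x0600

/*
 * Hypothetical helper mirroring the patched line:
 *     payload.eth.h_proto = htons(skb->len - hdrlen);
 * Returns the length field in network byte order.
 */
uint16_t mesh_fixup_len(uint16_t skb_len, uint16_t hdrlen)
{
	return htons((uint16_t)(skb_len - hdrlen));
}

/* True if a network-order type/length field holds an 802.3 length. */
bool is_8023_length(uint16_t h_proto_be)
{
	return ntohs(h_proto_be) < ETHERTYPE_MIN;
}
```

Assuming a 14-byte Ethernet header plus a 6-byte mesh header (hdrlen = 20) in
front of a 38-byte LLC frame (skb->len = 58), mesh_fixup_len(58, 20) yields a
length field of 38, the value reported with the older kernel. Those header
sizes are assumptions for the example and are not stated in this thread.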



2023-07-11 12:17:11

by Nicolas Escande

Subject: Re: [regression] STP on 80211s is broken in 6.4-rc4

On Tue Jul 11, 2023 at 1:12 PM CEST, Felix Fietkau wrote:
> On 10.07.23 18:50, Nicolas Escande wrote:
> > On Mon Jul 10, 2023 at 1:32 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> On 16.06.23 09:45, Nicolas Escande wrote:
> >> > On Thu Jun 15, 2023 at 2:54 PM CEST, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> >> On 10.06.23 08:44, Bagas Sanjaya wrote:
> >> >>> On Tue, Jun 06, 2023 at 12:55:57PM +0200, Nicolas Escande wrote:
> >> >>>>
> >> >>>> As user of the mesh part of mac80211 on multiple products at work let me say
> >> >>>> thank you for all the work you do on wifi, especially on 80211s, and especially
> >> >>>> the recent improvements you made for mesh fast RX/TX & cross vendor AMSDU compat
> >> >>>>
> >> >>>> We upgraded our kernel from an older (5.15) to a newer 6.4. The problem is STP
> >> >>>> doesn't work anymore and alas we use it for now (for the better or worse).
> >> >>>>
> >> >>>> What I gathered so far from my setup:
> >> >>>> - we use ath9k & ath10k
> >> >>>> - in my case STP frames are received as regular packet and not as amsdu
> >> >>>> - the received packets have a wrong length of 44 in tcpdump
> >> >>>> (instead of 38 with our previous kernel)
> >> >>>> - llc_fixup_skb() tries to pull some 41 bytes out of a 35 bytes packet
> >> >>>> this makes llc_rcv() discard the frames & breaks STP
> >> >>>>
> >> >>>> From bisecting the culprit seems to be 986e43b19ae9176093da35e0a844e65c8bf9ede7
> >> >>>> (wifi: mac80211: fix receiving A-MSDU frames on mesh interfaces)
> >> >>>>
> >> >>>> I guess that your changes to handle both ampdu subframes & normal frames in the
> >> >>>> same datapath ends up putting a wrong skb->len for STP (multicast) frames ?
> >> >>>> Honestly I don't understand enough of the 80211 internals & spec to pinpoint the
> >> >>>> exact problem.
> >> >>>>
> >> >>>> It seems this change was already in the 6.3 kernel so I guess someone should
> >> >>>> have seen it before (but I didn't find anything..) ? Maybe I missed something...
> >> >>>>
> >> >>>> Anyway I'm happy to provide more info or try anything you throw at me.
> >> >> [...]
> >> >> Hmmm, Felix did not reply. But let's ignore that for now.
> >> >
> >> > I haven't seen mails from felix on the list for a few days, I'm guessing he's
> >> > unavailable for now but I'll happily wait.
> >>
> >> Still no progress. Hmmm. Are you still okay with that? I've seen no
> >> other reports about this, so waiting is somewhat (albeit not completely)
> >> fine for me if it is for you.
> > I'm not so surprised no one else reported it, using STP on wifi (and 802.11s) is
> > not a really common thing to do, to be honest (and STP on wifi is unreliable).
> > Even though some openwrt guys do it for sure, I'm guessing their kernel version
> > is lagging behind...
> >>
> >> But in any case it might be good if you could recheck 6.5-rc1.
> > Testing on 6.5 as a whole won't be as easy for me as testing a single patch on
> > top of 6.4. I'll do my best to try but from what I saw nothing got merged that
> > would even remotely help me on this issue.
> >
> > I am not losing hope that Felix or someone that understands this stuff better
> > finds the time to look into this. I'm guessing it's the summer vacation effect.
>
> Sorry for the delay. This should fix the regression, please test.
> I will submit it for 6.5 soon.
> ---
> --- a/net/wireless/util.c
> +++ b/net/wireless/util.c
> @@ -580,6 +580,8 @@ int ieee80211_strip_8023_mesh_hdr(struct
>  		hdrlen += ETH_ALEN + 2;
>  	else if (!pskb_may_pull(skb, hdrlen))
>  		return -EINVAL;
> +	else
> +		payload.eth.h_proto = htons(skb->len - hdrlen);
>
>  	mesh_addr = skb->data + sizeof(payload.eth) + ETH_ALEN;
>  	switch (payload.flags & MESH_FLAGS_AE) {

Great, that does the trick for me.
Thanks, Felix.