2023-03-17 08:55:27

by Vegard Nossum

[permalink] [raw]
Subject: iwlwifi frequent drops between v6.2-rc3 and v6.3-rc1


Hi,

Somewhere between e8f60cd7db24 (v6.2-rc3) and 2eb29d59ddf0 (v6.3-rc1) my
wifi started dropping a lot. I'm using iwlwifi and I've checked the old
logs that the firmware hasn't changed:

Feb 25 16:22:33 kernel: iwlwifi 0000:04:00.0: loaded firmware version
36.e91976c0.0 8000C-36.ucode op_mode iwlmvm

My logs look like this:

09:01:50 kernel: wlp4s0: Connection to AP [REDACTED]:0b:25 lost
09:01:50 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-DISCONNECTED
bssid=[REDACTED]:0b:25 reason=4 locally_generated=1
09:01:50 NetworkManager[8368]: <warn> [1679040110.6548]
sup-iface[0x5628f2495990,wlp4s0]: connection disconnected (reason -4)
09:01:50 NetworkManager[8368]: <info> [1679040110.6656] device
(wlp4s0): supplicant interface state: completed -> disconnected
09:01:50 NetworkManager[8368]: <info> [1679040110.7609] device
(wlp4s0): supplicant interface state: disconnected -> scanning
09:01:53 wpa_supplicant[1279]: wlp4s0: SME: Trying to authenticate with
[REDACTED]:0b:25 (SSID='[REDACTED]' freq=5500 MHz)
09:01:53 kernel: wlp4s0: authenticate with [REDACTED]:0b:25
09:01:53 kernel: wlp4s0: send auth to [REDACTED]:0b:25 (try 1/3)
09:01:53 kernel: wlp4s0: authenticated
09:01:53 wpa_supplicant[1279]: wlp4s0: Trying to associate with
[REDACTED]:0b:25 (SSID='[REDACTED]' freq=5500 MHz)
09:01:53 kernel: wlp4s0: associate with [REDACTED]:0b:25 (try 1/3)
09:01:53 kernel: wlp4s0: RX AssocResp from [REDACTED]:0b:25
(capab=0x1511 status=0 aid=36)
09:01:53 NetworkManager[8368]: <info> [1679040113.4553] device
(wlp4s0): supplicant interface state: scanning -> authenticating
09:01:53 kernel: wlp4s0: associated
09:01:53 wpa_supplicant[1279]: wlp4s0: Associated with [REDACTED]:0b:25
09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-SUBNET-STATUS-UPDATE
status=0
09:01:53 NetworkManager[8368]: <info> [1679040113.4645] device
(wlp4s0): supplicant interface state: authenticating -> associating
09:01:53 NetworkManager[8368]: <info> [1679040113.4829] device
(wlp4s0): supplicant interface state: associating -> associated
09:01:53 NetworkManager[8368]: <info> [1679040113.5175] device
(wlp4s0): supplicant interface state: associated -> 4-way handshake
09:01:53 kernel: wlp4s0: Connection to AP [REDACTED]:0b:25 lost
09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-DISCONNECTED
bssid=[REDACTED]:0b:25 reason=4 locally_generated=1
09:01:53 wpa_supplicant[1279]: wlp4s0: WPA: 4-Way Handshake failed -
pre-shared key may be incorrect
09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-SSID-TEMP-DISABLED
id=0 ssid="[REDACTED]" auth_failures=1 duration=10 reason=WRONG_KEY
09:01:53 NetworkManager[8368]: <warn> [1679040113.5685]
sup-iface[0x5628f2495990,wlp4s0]: connection disconnected (reason -4)
09:01:53 NetworkManager[8368]: <info> [1679040113.5737] device
(wlp4s0): supplicant interface state: 4-way handshake -> disconnected
09:01:53 NetworkManager[8368]: <info> [1679040113.6759] device
(wlp4s0): supplicant interface state: disconnected -> scanning

I did see that somebody else reported a similar regression, but I don't
know if it could be the same problem or not (I see the driver is
different, but the fix looks generic):

https://linux-regtracking.leemhuis.info/regzbot/regression/217119/

The buggy commit that was identified there does seem to be within the
range of potential culprits:

$ git log --oneline e8f60cd7db24..2eb29d59ddf0 | grep resumption
4444bc2116ae wifi: mac80211: Proper mark iTXQs for resumption

If people think it's the same, I could try the proposed fix -- otherwise
let me know what else I can do to help track this down.

I don't think bisecting this is feasible for me since there are hours
between the drops. But it's really noticeable over the stretch of
multiple days that something has changed -- looking at the logs, the
disconnect rate has basically tripled.

Thanks,


Vegard


2023-03-17 15:30:07

by Alexander Wetzel

[permalink] [raw]
Subject: Re: iwlwifi frequent drops between v6.2-rc3 and v6.3-rc1

On 17.03.23 09:53, Vegard Nossum wrote:
>
> Hi,
>
> Somewhere between e8f60cd7db24 (v6.2-rc3) and 2eb29d59ddf0 (v6.3-rc1) my
> wifi started dropping a lot. I'm using iwlwifi and I've checked the old
> logs that the firmware hasn't changed:
>
> Feb 25 16:22:33 kernel: iwlwifi 0000:04:00.0: loaded firmware version
> 36.e91976c0.0 8000C-36.ucode op_mode iwlmvm
>
> My logs look like this:
>
> 09:01:50 kernel: wlp4s0: Connection to AP [REDACTED]:0b:25 lost
> 09:01:50 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-DISCONNECTED
> bssid=[REDACTED]:0b:25 reason=4 locally_generated=1
> 09:01:50 NetworkManager[8368]: <warn>  [1679040110.6548]
> sup-iface[0x5628f2495990,wlp4s0]: connection disconnected (reason -4)
> 09:01:50 NetworkManager[8368]: <info>  [1679040110.6656] device
> (wlp4s0): supplicant interface state: completed -> disconnected
> 09:01:50 NetworkManager[8368]: <info>  [1679040110.7609] device
> (wlp4s0): supplicant interface state: disconnected -> scanning
> 09:01:53 wpa_supplicant[1279]: wlp4s0: SME: Trying to authenticate with
> [REDACTED]:0b:25 (SSID='[REDACTED]' freq=5500 MHz)
> 09:01:53 kernel: wlp4s0: authenticate with [REDACTED]:0b:25
> 09:01:53 kernel: wlp4s0: send auth to [REDACTED]:0b:25 (try 1/3)
> 09:01:53 kernel: wlp4s0: authenticated
> 09:01:53 wpa_supplicant[1279]: wlp4s0: Trying to associate with
> [REDACTED]:0b:25 (SSID='[REDACTED]' freq=5500 MHz)
> 09:01:53 kernel: wlp4s0: associate with [REDACTED]:0b:25 (try 1/3)
> 09:01:53 kernel: wlp4s0: RX AssocResp from [REDACTED]:0b:25
> (capab=0x1511 status=0 aid=36)
> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4553] device
> (wlp4s0): supplicant interface state: scanning -> authenticating
> 09:01:53 kernel: wlp4s0: associated
> 09:01:53 wpa_supplicant[1279]: wlp4s0: Associated with [REDACTED]:0b:25
> 09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-SUBNET-STATUS-UPDATE
> status=0
> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4645] device
> (wlp4s0): supplicant interface state: authenticating -> associating
> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4829] device
> (wlp4s0): supplicant interface state: associating -> associated
> 09:01:53 NetworkManager[8368]: <info>  [1679040113.5175] device
> (wlp4s0): supplicant interface state: associated -> 4-way handshake
> 09:01:53 kernel: wlp4s0: Connection to AP [REDACTED]:0b:25 lost
> 09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-DISCONNECTED
> bssid=[REDACTED]:0b:25 reason=4 locally_generated=1
> 09:01:53 wpa_supplicant[1279]: wlp4s0: WPA: 4-Way Handshake failed -
> pre-shared key may be incorrect
> 09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-SSID-TEMP-DISABLED
> id=0 ssid="[REDACTED]" auth_failures=1 duration=10 reason=WRONG_KEY
> 09:01:53 NetworkManager[8368]: <warn>  [1679040113.5685]
> sup-iface[0x5628f2495990,wlp4s0]: connection disconnected (reason -4)
> 09:01:53 NetworkManager[8368]: <info>  [1679040113.5737] device
> (wlp4s0): supplicant interface state: 4-way handshake -> disconnected
> 09:01:53 NetworkManager[8368]: <info>  [1679040113.6759] device
> (wlp4s0): supplicant interface state: disconnected -> scanning
>
> I did see that somebody else reported a similar regression, but I don't
> know if it could be the same problem or not (I see the driver is
> different, but the fix looks generic):
>
> https://linux-regtracking.leemhuis.info/regzbot/regression/217119/
>
> The buggy commit that was identified there does seem to be within the
> range of potential culprits:
>
> $ git log --oneline e8f60cd7db24..2eb29d59ddf0 | grep resumption
> 4444bc2116ae wifi: mac80211: Proper mark iTXQs for resumption
>
> If people think it's the same, I could try the proposed fix -- otherwise
> let me know what else I can do to help track this down.
>

You are using a iwlwilf/mvm card. The fix we plan to merge for that
regression above won't help you. (mvm cards do not use the function we
serialized in the end. iwlwifi/dvm on the other uses it.)

But Johannes is working on a comparable issue affecting mvm cards.
Check out
https://lore.kernel.org/r/[email protected]

That may well fix your issue.

> I don't think bisecting this is feasible for me since there are hours
> between the drops. But it's really noticeable over the stretch of
> multiple days that something has changed -- looking at the logs, the
> disconnect rate has basically tripled.
>
> Thanks,
>
>
> Vegard


Subject: Re: iwlwifi frequent drops between v6.2-rc3 and v6.3-rc1

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few templates
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 17.03.23 09:53, Vegard Nossum wrote:
>
> Somewhere between e8f60cd7db24 (v6.2-rc3) and 2eb29d59ddf0 (v6.3-rc1) my
> wifi started dropping a lot. I'm using iwlwifi and I've checked the old
> logs that the firmware hasn't changed:

Thanks for the report. Due to Alexander's reply I hesitated first to add
this regression to the tracking, as I wondered that this might solve
itself quickly. But well, nothing happened since then, so I guess I
better add it to ensure the issue doesn't fall through the cracks:

#regzbot ^introduced e8f60cd7db24..2eb29d59ddf0
#regzbot title iwlwifi: frequent drops between v6.2-rc3 and v6.3-rc1
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

2023-03-25 20:43:40

by Alexander Wetzel

[permalink] [raw]
Subject: Re: iwlwifi frequent drops between v6.2-rc3 and v6.3-rc1

On 17.03.23 16:29, Alexander Wetzel wrote:
> On 17.03.23 09:53, Vegard Nossum wrote:
>>
>> Hi,
>>
>> Somewhere between e8f60cd7db24 (v6.2-rc3) and 2eb29d59ddf0 (v6.3-rc1) my
>> wifi started dropping a lot. I'm using iwlwifi and I've checked the old
>> logs that the firmware hasn't changed:
>>
>> Feb 25 16:22:33 kernel: iwlwifi 0000:04:00.0: loaded firmware version
>> 36.e91976c0.0 8000C-36.ucode op_mode iwlmvm
>>
>> My logs look like this:
>>
>> 09:01:50 kernel: wlp4s0: Connection to AP [REDACTED]:0b:25 lost
>> 09:01:50 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-DISCONNECTED
>> bssid=[REDACTED]:0b:25 reason=4 locally_generated=1
>> 09:01:50 NetworkManager[8368]: <warn>  [1679040110.6548]
>> sup-iface[0x5628f2495990,wlp4s0]: connection disconnected (reason -4)
>> 09:01:50 NetworkManager[8368]: <info>  [1679040110.6656] device
>> (wlp4s0): supplicant interface state: completed -> disconnected
>> 09:01:50 NetworkManager[8368]: <info>  [1679040110.7609] device
>> (wlp4s0): supplicant interface state: disconnected -> scanning
>> 09:01:53 wpa_supplicant[1279]: wlp4s0: SME: Trying to authenticate
>> with [REDACTED]:0b:25 (SSID='[REDACTED]' freq=5500 MHz)
>> 09:01:53 kernel: wlp4s0: authenticate with [REDACTED]:0b:25
>> 09:01:53 kernel: wlp4s0: send auth to [REDACTED]:0b:25 (try 1/3)
>> 09:01:53 kernel: wlp4s0: authenticated
>> 09:01:53 wpa_supplicant[1279]: wlp4s0: Trying to associate with
>> [REDACTED]:0b:25 (SSID='[REDACTED]' freq=5500 MHz)
>> 09:01:53 kernel: wlp4s0: associate with [REDACTED]:0b:25 (try 1/3)
>> 09:01:53 kernel: wlp4s0: RX AssocResp from [REDACTED]:0b:25
>> (capab=0x1511 status=0 aid=36)
>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4553] device
>> (wlp4s0): supplicant interface state: scanning -> authenticating
>> 09:01:53 kernel: wlp4s0: associated
>> 09:01:53 wpa_supplicant[1279]: wlp4s0: Associated with [REDACTED]:0b:25
>> 09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-SUBNET-STATUS-UPDATE
>> status=0
>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4645] device
>> (wlp4s0): supplicant interface state: authenticating -> associating
>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4829] device
>> (wlp4s0): supplicant interface state: associating -> associated
>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.5175] device
>> (wlp4s0): supplicant interface state: associated -> 4-way handshake
>> 09:01:53 kernel: wlp4s0: Connection to AP [REDACTED]:0b:25 lost
>> 09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-DISCONNECTED
>> bssid=[REDACTED]:0b:25 reason=4 locally_generated=1
>> 09:01:53 wpa_supplicant[1279]: wlp4s0: WPA: 4-Way Handshake failed -
>> pre-shared key may be incorrect
>> 09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-SSID-TEMP-DISABLED
>> id=0 ssid="[REDACTED]" auth_failures=1 duration=10 reason=WRONG_KEY
>> 09:01:53 NetworkManager[8368]: <warn>  [1679040113.5685]
>> sup-iface[0x5628f2495990,wlp4s0]: connection disconnected (reason -4)
>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.5737] device
>> (wlp4s0): supplicant interface state: 4-way handshake -> disconnected
>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.6759] device
>> (wlp4s0): supplicant interface state: disconnected -> scanning
>>
>> I did see that somebody else reported a similar regression, but I don't
>> know if it could be the same problem or not (I see the driver is
>> different, but the fix looks generic):
>>
>> https://linux-regtracking.leemhuis.info/regzbot/regression/217119/
>>
>> The buggy commit that was identified there does seem to be within the
>> range of potential culprits:
>>
>> $ git log --oneline e8f60cd7db24..2eb29d59ddf0 | grep resumption
>> 4444bc2116ae wifi: mac80211: Proper mark iTXQs for resumption
>>
>> If people think it's the same, I could try the proposed fix -- otherwise
>> let me know what else I can do to help track this down.
>>
>
> You are using a iwlwilf/mvm card. The fix we plan to merge for that
> regression above won't help you. (mvm cards do not use the function we
> serialized in the end. iwlwifi/dvm on the other uses it.)
>
> But Johannes is working on a comparable issue affecting mvm cards.
> Check out
> https://lore.kernel.org/r/[email protected]
>
> That may well fix your issue.

It's surprisingly quiet here...
Are these fixes helping? Honestly I'm not very optimistic.
I expect that this is something else...

If so, we can debug that here together.

For start, I would like to see the full logs, starting from the initial
(working) connect.

I also would like to get a confirmation, that by booting an older kernel
(v6.2-rc3 or older) the system gets stable again. (Not that
wpa_supplicant or the router got an update, too.)

Alexander

Subject: Re: iwlwifi frequent drops between v6.2-rc3 and v6.3-rc1

Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

Vegard, any news here? Is this still happening?

And out of curiosity: how often is "[wifi started] dropping a lot"
actually? I have seen occasional disconnects with iwlwifi myself, but I
have no idea what's causing them -- and I think it started earlier
already (and might have started when I switched to iwd, not sure). And
the error messages in the log only look similar in a few cases.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 25.03.23 21:33, Alexander Wetzel wrote:
> On 17.03.23 16:29, Alexander Wetzel wrote:
>> On 17.03.23 09:53, Vegard Nossum wrote:
>>>
>>> Hi,
>>>
>>> Somewhere between e8f60cd7db24 (v6.2-rc3) and 2eb29d59ddf0 (v6.3-rc1) my
>>> wifi started dropping a lot. I'm using iwlwifi and I've checked the old
>>> logs that the firmware hasn't changed:
>>>
>>> Feb 25 16:22:33 kernel: iwlwifi 0000:04:00.0: loaded firmware version
>>> 36.e91976c0.0 8000C-36.ucode op_mode iwlmvm
>>>
>>> My logs look like this:
>>>
>>> 09:01:50 kernel: wlp4s0: Connection to AP [REDACTED]:0b:25 lost
>>> 09:01:50 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-DISCONNECTED
>>> bssid=[REDACTED]:0b:25 reason=4 locally_generated=1
>>> 09:01:50 NetworkManager[8368]: <warn>  [1679040110.6548]
>>> sup-iface[0x5628f2495990,wlp4s0]: connection disconnected (reason -4)
>>> 09:01:50 NetworkManager[8368]: <info>  [1679040110.6656] device
>>> (wlp4s0): supplicant interface state: completed -> disconnected
>>> 09:01:50 NetworkManager[8368]: <info>  [1679040110.7609] device
>>> (wlp4s0): supplicant interface state: disconnected -> scanning
>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: SME: Trying to authenticate
>>> with [REDACTED]:0b:25 (SSID='[REDACTED]' freq=5500 MHz)
>>> 09:01:53 kernel: wlp4s0: authenticate with [REDACTED]:0b:25
>>> 09:01:53 kernel: wlp4s0: send auth to [REDACTED]:0b:25 (try 1/3)
>>> 09:01:53 kernel: wlp4s0: authenticated
>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: Trying to associate with
>>> [REDACTED]:0b:25 (SSID='[REDACTED]' freq=5500 MHz)
>>> 09:01:53 kernel: wlp4s0: associate with [REDACTED]:0b:25 (try 1/3)
>>> 09:01:53 kernel: wlp4s0: RX AssocResp from [REDACTED]:0b:25
>>> (capab=0x1511 status=0 aid=36)
>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4553] device
>>> (wlp4s0): supplicant interface state: scanning -> authenticating
>>> 09:01:53 kernel: wlp4s0: associated
>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: Associated with [REDACTED]:0b:25
>>> 09:01:53 wpa_supplicant[1279]: wlp4s0:
>>> CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4645] device
>>> (wlp4s0): supplicant interface state: authenticating -> associating
>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4829] device
>>> (wlp4s0): supplicant interface state: associating -> associated
>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.5175] device
>>> (wlp4s0): supplicant interface state: associated -> 4-way handshake
>>> 09:01:53 kernel: wlp4s0: Connection to AP [REDACTED]:0b:25 lost
>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-DISCONNECTED
>>> bssid=[REDACTED]:0b:25 reason=4 locally_generated=1
>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: WPA: 4-Way Handshake failed -
>>> pre-shared key may be incorrect
>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-SSID-TEMP-DISABLED
>>> id=0 ssid="[REDACTED]" auth_failures=1 duration=10 reason=WRONG_KEY
>>> 09:01:53 NetworkManager[8368]: <warn>  [1679040113.5685]
>>> sup-iface[0x5628f2495990,wlp4s0]: connection disconnected (reason -4)
>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.5737] device
>>> (wlp4s0): supplicant interface state: 4-way handshake -> disconnected
>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.6759] device
>>> (wlp4s0): supplicant interface state: disconnected -> scanning
>>>
>>> I did see that somebody else reported a similar regression, but I don't
>>> know if it could be the same problem or not (I see the driver is
>>> different, but the fix looks generic):
>>>
>>> https://linux-regtracking.leemhuis.info/regzbot/regression/217119/
>>>
>>> The buggy commit that was identified there does seem to be within the
>>> range of potential culprits:
>>>
>>> $ git log --oneline e8f60cd7db24..2eb29d59ddf0 | grep resumption
>>> 4444bc2116ae wifi: mac80211: Proper mark iTXQs for resumption
>>>
>>> If people think it's the same, I could try the proposed fix -- otherwise
>>> let me know what else I can do to help track this down.
>>>
>>
>> You are using a iwlwilf/mvm card. The fix we plan to merge for that
>> regression above won't help you. (mvm cards do not use the function we
>> serialized in the end. iwlwifi/dvm on the other uses it.)
>>
>> But Johannes is working on a comparable issue affecting mvm cards.
>> Check out
>> https://lore.kernel.org/r/[email protected]
>>
>> That may well fix your issue.
>
> It's surprisingly quiet here...
> Are these fixes helping? Honestly I'm not very optimistic.
> I expect that this is something else...
>
> If so, we can debug that here together.
>
> For start, I would like to see the full logs, starting from the initial
> (working) connect.
>
> I also would like to get a confirmation, that by booting an older kernel
> (v6.2-rc3 or older) the system gets stable again. (Not that
> wpa_supplicant or the router got an update, too.)
>
> Alexander
>
>

Subject: Re: iwlwifi frequent drops between v6.2-rc3 and v6.3-rc1

On 12.04.23 13:38, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> Vegard, any news here? Is this still happening?
>
> And out of curiosity: how often is "[wifi started] dropping a lot"
> actually? I have seen occasional disconnects with iwlwifi myself, but I
> have no idea what's causing them -- and I think it started earlier
> already (and might have started when I switched to iwd, not sure). And
> the error messages in the log only look similar in a few cases.
>
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot poke

Putting this on the backburner to reduce the noise in the list of
tracked issues:

#regzbot backburner: not bisected and lack of data to debug this further
#regzbot ignore-activity

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

> On 25.03.23 21:33, Alexander Wetzel wrote:
>> On 17.03.23 16:29, Alexander Wetzel wrote:
>>> On 17.03.23 09:53, Vegard Nossum wrote:
>>>>
>>>> Hi,
>>>>
>>>> Somewhere between e8f60cd7db24 (v6.2-rc3) and 2eb29d59ddf0 (v6.3-rc1) my
>>>> wifi started dropping a lot. I'm using iwlwifi and I've checked the old
>>>> logs that the firmware hasn't changed:
>>>>
>>>> Feb 25 16:22:33 kernel: iwlwifi 0000:04:00.0: loaded firmware version
>>>> 36.e91976c0.0 8000C-36.ucode op_mode iwlmvm
>>>>
>>>> My logs look like this:
>>>>
>>>> 09:01:50 kernel: wlp4s0: Connection to AP [REDACTED]:0b:25 lost
>>>> 09:01:50 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-DISCONNECTED
>>>> bssid=[REDACTED]:0b:25 reason=4 locally_generated=1
>>>> 09:01:50 NetworkManager[8368]: <warn>  [1679040110.6548]
>>>> sup-iface[0x5628f2495990,wlp4s0]: connection disconnected (reason -4)
>>>> 09:01:50 NetworkManager[8368]: <info>  [1679040110.6656] device
>>>> (wlp4s0): supplicant interface state: completed -> disconnected
>>>> 09:01:50 NetworkManager[8368]: <info>  [1679040110.7609] device
>>>> (wlp4s0): supplicant interface state: disconnected -> scanning
>>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: SME: Trying to authenticate
>>>> with [REDACTED]:0b:25 (SSID='[REDACTED]' freq=5500 MHz)
>>>> 09:01:53 kernel: wlp4s0: authenticate with [REDACTED]:0b:25
>>>> 09:01:53 kernel: wlp4s0: send auth to [REDACTED]:0b:25 (try 1/3)
>>>> 09:01:53 kernel: wlp4s0: authenticated
>>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: Trying to associate with
>>>> [REDACTED]:0b:25 (SSID='[REDACTED]' freq=5500 MHz)
>>>> 09:01:53 kernel: wlp4s0: associate with [REDACTED]:0b:25 (try 1/3)
>>>> 09:01:53 kernel: wlp4s0: RX AssocResp from [REDACTED]:0b:25
>>>> (capab=0x1511 status=0 aid=36)
>>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4553] device
>>>> (wlp4s0): supplicant interface state: scanning -> authenticating
>>>> 09:01:53 kernel: wlp4s0: associated
>>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: Associated with [REDACTED]:0b:25
>>>> 09:01:53 wpa_supplicant[1279]: wlp4s0:
>>>> CTRL-EVENT-SUBNET-STATUS-UPDATE status=0
>>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4645] device
>>>> (wlp4s0): supplicant interface state: authenticating -> associating
>>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.4829] device
>>>> (wlp4s0): supplicant interface state: associating -> associated
>>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.5175] device
>>>> (wlp4s0): supplicant interface state: associated -> 4-way handshake
>>>> 09:01:53 kernel: wlp4s0: Connection to AP [REDACTED]:0b:25 lost
>>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-DISCONNECTED
>>>> bssid=[REDACTED]:0b:25 reason=4 locally_generated=1
>>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: WPA: 4-Way Handshake failed -
>>>> pre-shared key may be incorrect
>>>> 09:01:53 wpa_supplicant[1279]: wlp4s0: CTRL-EVENT-SSID-TEMP-DISABLED
>>>> id=0 ssid="[REDACTED]" auth_failures=1 duration=10 reason=WRONG_KEY
>>>> 09:01:53 NetworkManager[8368]: <warn>  [1679040113.5685]
>>>> sup-iface[0x5628f2495990,wlp4s0]: connection disconnected (reason -4)
>>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.5737] device
>>>> (wlp4s0): supplicant interface state: 4-way handshake -> disconnected
>>>> 09:01:53 NetworkManager[8368]: <info>  [1679040113.6759] device
>>>> (wlp4s0): supplicant interface state: disconnected -> scanning
>>>>
>>>> I did see that somebody else reported a similar regression, but I don't
>>>> know if it could be the same problem or not (I see the driver is
>>>> different, but the fix looks generic):
>>>>
>>>> https://linux-regtracking.leemhuis.info/regzbot/regression/217119/
>>>>
>>>> The buggy commit that was identified there does seem to be within the
>>>> range of potential culprits:
>>>>
>>>> $ git log --oneline e8f60cd7db24..2eb29d59ddf0 | grep resumption
>>>> 4444bc2116ae wifi: mac80211: Proper mark iTXQs for resumption
>>>>
>>>> If people think it's the same, I could try the proposed fix -- otherwise
>>>> let me know what else I can do to help track this down.
>>>>
>>>
>>> You are using a iwlwilf/mvm card. The fix we plan to merge for that
>>> regression above won't help you. (mvm cards do not use the function we
>>> serialized in the end. iwlwifi/dvm on the other uses it.)
>>>
>>> But Johannes is working on a comparable issue affecting mvm cards.
>>> Check out
>>> https://lore.kernel.org/r/[email protected]
>>>
>>> That may well fix your issue.
>>
>> It's surprisingly quiet here...
>> Are these fixes helping? Honestly I'm not very optimistic.
>> I expect that this is something else...
>>
>> If so, we can debug that here together.
>>
>> For start, I would like to see the full logs, starting from the initial
>> (working) connect.
>>
>> I also would like to get a confirmation, that by booting an older kernel
>> (v6.2-rc3 or older) the system gets stable again. (Not that
>> wpa_supplicant or the router got an update, too.)
>>
>> Alexander
>>
>>