2017-11-03 02:57:13

by Daniel Drake

Subject: ath9k disconnects in 4.13 with reason=4 locally_generated=1

Hi,

Endless OS recently upgraded from Linux 4.11 to Linux 4.13, and we now
have a few reports of issues with ath9k wireless becoming unusable.

In the logs we can see that it authenticates, associates and completes
the WPA 4 way handshake, before then being disconnected with:

wlp2s0: CTRL-EVENT-DISCONNECTED bssid=74:26:ac:68:2f:c0 reason=4
locally_generated=1

The cycle then repeats with it connecting again before being swiftly
disconnected, etc.

More logs: https://gist.github.com/dsd/49f263c67c2859838ce168628ab043e0

At the same time that we upgraded the kernel, we also upgraded many
other components (e.g. NetworkManager and wpa_supplicant), however the
same problem has been reported on Arch Linux and a user there reports
that he narrowed it down to a kernel regression between 4.12 and 4.13:
https://bbs.archlinux.org/viewtopic.php?id=225199

Unfortunately we cannot reproduce this in our office, so can't offer
much more info yet, but we are continuing to investigate. I have not
found any codepaths in userspace that generate disconnect reason 4, so
I think it must be something in the kernel causing the disconnection,
but I did not see any suspicious changes in recent commit history.

It would be good to hear from anyone who has heard of this or has any
ideas about causes or solutions.

Thanks
Daniel


2017-11-03 09:51:30

by Jouni Malinen

Subject: Re: ath9k disconnects in 4.13 with reason=4 locally_generated=1

On Fri, Nov 03, 2017 at 10:57:11AM +0800, Daniel Drake wrote:
> Endless OS recently upgraded from Linux 4.11 to Linux 4.13, and we now
> have a few reports of issues with ath9k wireless becoming unusable.
>
> In the logs we can see that it authenticates, associates and completes
> the WPA 4 way handshake, before then being disconnected with:
>
> wlp2s0: CTRL-EVENT-DISCONNECTED bssid=74:26:ac:68:2f:c0 reason=4
> locally_generated=1

reason=4 is WLAN_REASON_DISASSOC_DUE_TO_INACTIVITY. I'd expect the most
likely source of this to be one of the mac80211 code paths in mlme.c
where disconnection is triggered if the current AP becomes unreachable.
Getting a debug log from mac80211 might help in figuring out what is
causing this (there seem to be a number of mlme_dbg() calls before most,
but not necessarily all, places where
WLAN_REASON_DISASSOC_DUE_TO_INACTIVITY is used).

-- 
Jouni Malinen PGP id EFC895FA

2017-11-10 01:30:07

by Daniel Drake

Subject: Re: ath9k disconnects in 4.13 with reason=4 locally_generated=1

On Fri, Nov 3, 2017 at 5:51 PM, Jouni Malinen wrote:
> On Fri, Nov 03, 2017 at 10:57:11AM +0800, Daniel Drake wrote:
>> Endless OS recently upgraded from Linux 4.11 to Linux 4.13, and we now
>> have a few reports of issues with ath9k wireless becoming unusable.
>>
>> In the logs we can see that it authenticates, associates and completes
>> the WPA 4 way handshake, before then being disconnected with:
>>
>> wlp2s0: CTRL-EVENT-DISCONNECTED bssid=74:26:ac:68:2f:c0 reason=4
>> locally_generated=1
>
> reason=4 is WLAN_REASON_DISASSOC_DUE_TO_INACTIVITY. I'd expect the most
> likely source of this to be one of the mac80211 code paths in mlme.c
> where disconnection is triggered if the current AP becomes unreachable.
> Getting a debug log from mac80211 might help in figuring out what is
> causing this (there seem to be a number of mlme_dbg() calls before most,
> but not necessarily all, places where
> WLAN_REASON_DISASSOC_DUE_TO_INACTIVITY is used).

We got the log; the disconnect is coming from this branch of
ieee80211_sta_work():

    else if (ieee80211_hw_check(&local->hw, REPORTS_TX_ACK_STATUS)) {
        sdata_info(sdata,
                   "Failed to send nullfunc to AP %pM after %dms, disconnecting\n",
                   bssid, probe_wait_ms);
        ieee80211_sta_connection_lost(sdata, bssid,
                WLAN_REASON_DISASSOC_DUE_TO_INACTIVITY, false);
    }

I looked again at changes between 4.12 and 4.13 and still no idea how
4.13 causes this problem :(

Daniel