2022-07-14 12:14:00

by syzbot

Subject: [syzbot] INFO: trying to register non-static key in ieee80211_do_stop

Hello,

syzbot found the following issue on:

HEAD commit: b11e5f6a3a5c net: sunhme: output link status with a single..
git tree: net
console+strace: https://syzkaller.appspot.com/x/log.txt?x=108ed862080000
kernel config: https://syzkaller.appspot.com/x/.config?x=fa95f12403a2e0d2
dashboard link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=173a7c78080000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1102749a080000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.
CPU: 0 PID: 3615 Comm: syz-executor630 Not tainted 5.19.0-rc5-syzkaller-00263-gb11e5f6a3a5c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/29/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
assign_lock_key kernel/locking/lockdep.c:979 [inline]
register_lock_class+0xf30/0x1130 kernel/locking/lockdep.c:1292
__lock_acquire+0x10a/0x5660 kernel/locking/lockdep.c:4932
lock_acquire kernel/locking/lockdep.c:5665 [inline]
lock_acquire+0x1ab/0x570 kernel/locking/lockdep.c:5630
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:126 [inline]
_raw_spin_lock_bh+0x2f/0x40 kernel/locking/spinlock.c:178
spin_lock_bh include/linux/spinlock.h:354 [inline]
ieee80211_do_stop+0xc3/0x1ff0 net/mac80211/iface.c:380
ieee80211_runtime_change_iftype net/mac80211/iface.c:1789 [inline]
ieee80211_if_change_type+0x383/0x840 net/mac80211/iface.c:1827
ieee80211_change_iface+0x57/0x3f0 net/mac80211/cfg.c:190
rdev_change_virtual_intf net/wireless/rdev-ops.h:69 [inline]
cfg80211_change_iface+0x5e1/0xf10 net/wireless/util.c:1078
nl80211_set_interface+0x64f/0x8c0 net/wireless/nl80211.c:4041
genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:731
genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:792
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2501
genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0x543/0x7f0 net/netlink/af_netlink.c:1345
netlink_sendmsg+0x917/0xe10 net/netlink/af_netlink.c:1921
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:734
____sys_sendmsg+0x6eb/0x810 net/socket.c:2488
___sys_sendmsg+0xf3/0x170 net/socket.c:2542
__sys_sendmsg net/socket.c:2571 [inline]
__do_sys_sendmsg net/socket.c:2580 [inline]
__se_sys_sendmsg net/socket.c:2578 [inline]
__x64_sys_sendmsg+0x132/0x220 net/socket.c:2578
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x46/0xb0
RIP: 0033:0x7f5bf1b37b89
Code: 28 c3 e8 5a 14 00 00 66 2e 0f 1f 84 00 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd682b8a38 EFLAGS: 00000246 ORIG_RAX: 000000000000002e


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


2022-07-17 02:47:11

by Tetsuo Handa

Subject: [PATCH] wifi: mac80211: initialize fq.lock as early as possible

lockdep complains about use of an uninitialized spinlock in
ieee80211_do_stop() [1], because commit f856373e2f31ffd3 ("wifi: mac80211:
do not wake queues on a vif that is being stopped") guards clear_bit()
with fq.lock even before fq_init(), called from ieee80211_txq_setup_flows(),
initializes this spinlock. Initialize this spinlock as early as possible.

Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1]
Reported-by: syzbot <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped")
Tested-by: syzbot <[email protected]>
---
net/mac80211/main.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/mac80211/main.c b/net/mac80211/main.c
index 5a385d4146b9..584e98300bbf 100644
--- a/net/mac80211/main.c
+++ b/net/mac80211/main.c
@@ -642,6 +642,7 @@ struct ieee80211_hw *ieee80211_alloc_hw_nm(size_t priv_data_len,
 	wiphy->bss_priv_size = sizeof(struct ieee80211_bss);

 	local = wiphy_priv(wiphy);
+	spin_lock_init(&local->fq.lock);

 	if (sta_info_init(local))
 		goto err_free;
--
2.18.4
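
The ordering problem described in the patch above can be modelled in plain
userspace C. This is only a sketch under stated assumptions: every toy_*
name is hypothetical and none of them is a mac80211 or lockdep API. The
"initialized" flag stands in for the lock class key that lockdep registers,
and toy_setup_flows() stands in for the late fq_init() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a spinlock with debug tracking. */
struct toy_lock {
	bool initialized;	/* set only by toy_lock_init() */
	bool held;
};

/* Hypothetical stand-in for the per-device structure owning fq.lock. */
struct toy_local {
	struct toy_lock fq_lock;
};

static void toy_lock_init(struct toy_lock *l)
{
	l->initialized = true;
	l->held = false;
}

/* Returns false and warns when the lock was never initialized. */
static bool toy_lock_acquire(struct toy_lock *l)
{
	if (!l->initialized) {
		fprintf(stderr, "toy: locking an uninitialized lock\n");
		return false;
	}
	l->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Allocation path; init_lock_early models the one-line change above. */
static void toy_alloc(struct toy_local *local, bool init_lock_early)
{
	memset(local, 0, sizeof(*local));
	if (init_lock_early)
		toy_lock_init(&local->fq_lock);
}

/* Late setup path that initializes the lock, like fq_init(). */
static void toy_setup_flows(struct toy_local *local)
{
	toy_lock_init(&local->fq_lock);
}

/* Stop path that takes the lock; may run before toy_setup_flows(). */
static bool toy_do_stop(struct toy_local *local)
{
	if (!toy_lock_acquire(&local->fq_lock))
		return false;
	toy_lock_release(&local->fq_lock);
	return true;
}
```

In this model, toy_do_stop() fails when it runs between toy_alloc(local,
false) and toy_setup_flows(), and succeeds when the lock is initialized at
allocation time, which is the approach this v1 patch takes.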

2022-07-17 12:47:23

by Tetsuo Handa

Subject: [PATCH v2] wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()

lockdep complains about use of an uninitialized spinlock in
ieee80211_do_stop() [1], because commit f856373e2f31ffd3 ("wifi: mac80211:
do not wake queues on a vif that is being stopped") guards clear_bit()
with fq.lock even before fq_init(), called from ieee80211_txq_setup_flows(),
initializes this spinlock.

According to discussion [2], Toke was not happy with expanding the usage of
fq.lock. Since __ieee80211_wake_txqs() is called under the RCU read lock, we
can instead use synchronize_rcu() to flush ieee80211_wake_txqs().

Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1]
Link: https://lkml.kernel.org/r/[email protected] [2]
Reported-by: syzbot <[email protected]>
Signed-off-by: Tetsuo Handa <[email protected]>
Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped")
Tested-by: syzbot <[email protected]>
---
Changes in v2:
Use synchronize_rcu() instead of initializing fq.lock early.

This bug is currently the top crasher for syzbot. Please fix as soon as possible.

net/mac80211/iface.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index 15a73b7fdd75..1a9ada411879 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -377,9 +377,8 @@ static void ieee80211_do_stop(struct ieee80211_sub_if_data *sdata, bool going_do
 	bool cancel_scan;
 	struct cfg80211_nan_func *func;

-	spin_lock_bh(&local->fq.lock);
 	clear_bit(SDATA_STATE_RUNNING, &sdata->state);
-	spin_unlock_bh(&local->fq.lock);
+	synchronize_rcu(); /* flush _ieee80211_wake_txqs() */

 	cancel_scan = rcu_access_pointer(local->scan_sdata) == sdata;
 	if (cancel_scan)
--
2.18.4
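
The "clear the flag, then wait for readers" idea in the patch above can be
sketched in userspace C11. This is an illustration under loud assumptions:
all toy_* names are hypothetical, and the shared reader counter is a
deliberate simplification. Real synchronize_rcu() does not use such a
counter and waits only for read-side sections that began before it was
called. toy_readers_drained() is a non-blocking poll so the model can be
stepped through deterministically.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in mac80211. */
struct toy_vif {
	atomic_bool running;	/* plays the role of SDATA_STATE_RUNNING */
	atomic_int readers;	/* read-side sections currently in flight */
	atomic_int wakeups;	/* times a reader woke the queues */
};

static void toy_vif_start(struct toy_vif *v)
{
	atomic_init(&v->running, true);
	atomic_init(&v->readers, 0);
	atomic_init(&v->wakeups, 0);
}

/* Read side: enter the section, then sample the flag. */
static bool toy_read_lock_and_check(struct toy_vif *v)
{
	atomic_fetch_add(&v->readers, 1);
	return atomic_load(&v->running);
}

/* Read side: act on the sampled flag, then leave the section. */
static void toy_wake_and_read_unlock(struct toy_vif *v, bool saw_running)
{
	if (saw_running)
		atomic_fetch_add(&v->wakeups, 1);
	int prev = atomic_fetch_sub(&v->readers, 1);
	assert(prev > 0);
}

/* Stop side, step 1: clear the flag without taking any lock. */
static void toy_stop_begin(struct toy_vif *v)
{
	atomic_store(&v->running, false);
}

/*
 * Stop side, step 2: true once no read-side section is in flight.
 * A blocking caller would poll this until it returns true.
 */
static bool toy_readers_drained(struct toy_vif *v)
{
	return atomic_load(&v->readers) == 0;
}
```

Once toy_readers_drained() returns true after toy_stop_begin(), every
reader that sampled the flag as set has already left its section, and any
later reader samples it as cleared. That is the property the stop path
relies on in place of holding fq.lock around clear_bit().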


2022-07-18 11:24:10

by Toke Høiland-Jørgensen

Subject: Re: [PATCH v2] wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()

Tetsuo Handa <[email protected]> writes:

> lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1],
> for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif
> that is being stopped") guards clear_bit() using fq.lock even before
> fq_init() from ieee80211_txq_setup_flows() initializes this spinlock.
>
> According to discussion [2], Toke was not happy with expanding usage of
> fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we
> can instead use synchronize_rcu() for flushing ieee80211_wake_txqs().

Ah, that's a neat solution! :)

Acked-by: Toke Høiland-Jørgensen <[email protected]>

2022-07-18 12:02:25

by Kalle Valo

Subject: Re: [PATCH v2] wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()

Tetsuo Handa <[email protected]> wrote:

> lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1],
> for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif
> that is being stopped") guards clear_bit() using fq.lock even before
> fq_init() from ieee80211_txq_setup_flows() initializes this spinlock.
>
> According to discussion [2], Toke was not happy with expanding usage of
> fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we
> can instead use synchronize_rcu() for flushing ieee80211_wake_txqs().
>
> Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1]
> Link: https://lkml.kernel.org/r/[email protected] [2]
> Reported-by: syzbot <[email protected]>
> Signed-off-by: Tetsuo Handa <[email protected]>
> Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped")
> Tested-by: syzbot <[email protected]>
> Acked-by: Toke Høiland-Jørgensen <[email protected]>

Patch applied to wireless-next.git, thanks.

3598cb6e1862 wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()

--
https://patchwork.kernel.org/project/linux-wireless/patch/[email protected]/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2022-07-26 07:00:34

by Tetsuo Handa

Subject: Re: [PATCH v2] wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()

Since this patch fixes a regression introduced in 5.19-rc7, can this patch go to 5.19-final?

syzbot has been failing to test linux.git for 12 days due to this regression.
syzbot will fail to bisect new bugs found in the upcoming merge window
if it is unable to test v5.19 due to this regression.

On 2022/07/18 21:01, Kalle Valo wrote:
> Tetsuo Handa <[email protected]> wrote:
>
>> lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1],
>> for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif
>> that is being stopped") guards clear_bit() using fq.lock even before
>> fq_init() from ieee80211_txq_setup_flows() initializes this spinlock.
>>
>> According to discussion [2], Toke was not happy with expanding usage of
>> fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we
>> can instead use synchronize_rcu() for flushing ieee80211_wake_txqs().
>>
>> Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1]
>> Link: https://lkml.kernel.org/r/[email protected] [2]
>> Reported-by: syzbot <[email protected]>
>> Signed-off-by: Tetsuo Handa <[email protected]>
>> Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped")
>> Tested-by: syzbot <[email protected]>
>> Acked-by: Toke Høiland-Jørgensen <[email protected]>
>
> Patch applied to wireless-next.git, thanks.
>
> 3598cb6e1862 wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()
>

2022-07-26 14:46:09

by Kalle Valo

Subject: Re: [PATCH v2] wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()

(please don't top post, I manually fixed that)

Tetsuo Handa <[email protected]> writes:

> On 2022/07/18 21:01, Kalle Valo wrote:
>> Tetsuo Handa <[email protected]> wrote:
>>
>>> lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1],
>>> for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif
>>> that is being stopped") guards clear_bit() using fq.lock even before
>>> fq_init() from ieee80211_txq_setup_flows() initializes this spinlock.
>>>
>>> According to discussion [2], Toke was not happy with expanding usage of
>>> fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we
>>> can instead use synchronize_rcu() for flushing ieee80211_wake_txqs().
>>>
>>> Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1]
>>> Link: https://lkml.kernel.org/r/[email protected] [2]
>>> Reported-by: syzbot <[email protected]>
>>> Signed-off-by: Tetsuo Handa <[email protected]>
>>> Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped")
>>> Tested-by: syzbot <[email protected]>
>>> Acked-by: Toke Høiland-Jørgensen <[email protected]>
>>
>> Patch applied to wireless-next.git, thanks.
>>
>> 3598cb6e1862 wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()
>
> Since this patch fixes a regression introduced in 5.19-rc7, can this patch go to 5.19-final ?
>
> syzbot is failing to test linux.git for 12 days due to this regression.
> syzbot will fail to bisect new bugs found in the upcoming merge window
> if unable to test v5.19 due to this regression.

I took this to wireless-next as I didn't think there's enough time to
get this to v5.19 (and I only heard Linus' -rc8 plans after the fact).
So this will be in v5.20-rc1 and I recommend pushing this to a v5.19
stable release.

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

2022-07-26 15:10:18

by Ben Greear

Subject: Re: [PATCH v2] wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()

On 7/26/22 7:38 AM, Kalle Valo wrote:
> (please don't top post, I manually fixed that)
>
> Tetsuo Handa <[email protected]> writes:
>
>> On 2022/07/18 21:01, Kalle Valo wrote:
>>> Tetsuo Handa <[email protected]> wrote:
>>>
>>>> lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1],
>>>> for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif
>>>> that is being stopped") guards clear_bit() using fq.lock even before
>>>> fq_init() from ieee80211_txq_setup_flows() initializes this spinlock.
>>>>
>>>> According to discussion [2], Toke was not happy with expanding usage of
>>>> fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we
>>>> can instead use synchronize_rcu() for flushing ieee80211_wake_txqs().
>>>>
>>>> Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1]
>>>> Link: https://lkml.kernel.org/r/[email protected] [2]
>>>> Reported-by: syzbot <[email protected]>
>>>> Signed-off-by: Tetsuo Handa <[email protected]>
>>>> Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped")
>>>> Tested-by: syzbot <[email protected]>
>>>> Acked-by: Toke Høiland-Jørgensen <[email protected]>
>>>
>>> Patch applied to wireless-next.git, thanks.
>>>
>>> 3598cb6e1862 wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()
>>
>> Since this patch fixes a regression introduced in 5.19-rc7, can this patch go to 5.19-final ?
>>
>> syzbot is failing to test linux.git for 12 days due to this regression.
>> syzbot will fail to bisect new bugs found in the upcoming merge window
>> if unable to test v5.19 due to this regression.
>
> I took this to wireless-next as I didn't think there's enough time to
> get this to v5.19 (and I only heard Linus' -rc8 plans after the fact).
> So this will be in v5.20-rc1 and I recommend pushing this to a v5.19
> stable release.

Would it be worth reverting the patch that broke things until the first stable 5.19.x
tree then? Seems lame to ship an official kernel with a known bug like this.

Thanks,
Ben


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2022-07-26 21:45:51

by Jakub Kicinski

Subject: Re: [PATCH v2] wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()

On Tue, 26 Jul 2022 08:05:12 -0700 Ben Greear wrote:
> >> Since this patch fixes a regression introduced in 5.19-rc7, can this patch go to 5.19-final ?
> >>
> >> syzbot is failing to test linux.git for 12 days due to this regression.
> >> syzbot will fail to bisect new bugs found in the upcoming merge window
> >> if unable to test v5.19 due to this regression.
> >
> > I took this to wireless-next as I didn't think there's enough time to
> > get this to v5.19 (and I only heard Linus' -rc8 plans after the fact).
> > So this will be in v5.20-rc1 and I recommend pushing this to a v5.19
> > stable release.
>
> Would it be worth reverting the patch that broke things until the first stable 5.19.x
> tree then? Seems lame to ship an official kernel with a known bug like this.

I cherry-picked the fix across the trees after talking to Kalle and
DaveM. Let's see how that goes...

2022-07-28 23:03:35

by Tetsuo Handa

Subject: Re: [PATCH v2] wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()

On 2022/07/27 6:38, Jakub Kicinski wrote:
> On Tue, 26 Jul 2022 08:05:12 -0700 Ben Greear wrote:
>>>> Since this patch fixes a regression introduced in 5.19-rc7, can this patch go to 5.19-final ?
>>>>
>>>> syzbot is failing to test linux.git for 12 days due to this regression.
>>>> syzbot will fail to bisect new bugs found in the upcoming merge window
>>>> if unable to test v5.19 due to this regression.
>>>
>>> I took this to wireless-next as I didn't think there's enough time to
>>> get this to v5.19 (and I only heard Linus' -rc8 plans after the fact).
>>> So this will be in v5.20-rc1 and I recommend pushing this to a v5.19
>>> stable release.
>>
>> Would it be worth reverting the patch that broke things until the first stable 5.19.x
>> tree then? Seems lame to ship an official kernel with a known bug like this.
>
> I cherry-picked the fix across the trees after talking to Kalle and
> DaveM. Let's see how that goes...

This patch successfully arrived at linux.git, in time for 5.19-final.

Thank you.