2024-03-28 11:01:18

by syzbot

Subject: [syzbot] [wireless?] WARNING in kcov_remote_start (3)

Hello,

syzbot found the following issue on:

HEAD commit: a6bd6c933339 Add linux-next specific files for 20240328
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=15c85eb1180000
kernel config: https://syzkaller.appspot.com/x/.config?x=b0058bda1436e073
dashboard link: https://syzkaller.appspot.com/bug?extid=0438378d6f157baae1a2
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/7c1618ff7d25/disk-a6bd6c93.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/875519f620fe/vmlinux-a6bd6c93.xz
kernel image: https://storage.googleapis.com/syzbot-assets/ad92b057fb96/bzImage-a6bd6c93.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

------------[ cut here ]------------
WARNING: CPU: 1 PID: 2400 at kernel/kcov.c:860 kcov_remote_start+0x549/0x7e0 kernel/kcov.c:860
Modules linked in:
CPU: 1 PID: 2400 Comm: kworker/u8:7 Not tainted 6.9.0-rc1-next-20240328-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/29/2024
Workqueue: events_unbound cfg80211_wiphy_work
RIP: 0010:kcov_remote_start+0x549/0x7e0 kernel/kcov.c:860
Code: 4c 89 ff be 03 00 00 00 e8 14 99 16 03 e9 fd fa ff ff e8 8a 26 ea 09 41 f7 c6 00 02 00 00 0f 84 eb fa ff ff e9 7f fc ff ff 90 <0f> 0b 90 e8 8f 43 ea 09 89 c0 48 c7 c7 c8 d4 02 00 48 03 3c c5 d0
RSP: 0018:ffffc90009b17aa8 EFLAGS: 00010002
RAX: 0000000080000000 RBX: ffff888029649e00 RCX: 0000000000000002
RDX: dffffc0000000000 RSI: ffffffff8bcae740 RDI: ffffffff8c1f77c0
RBP: 0000000000000000 R08: ffffffff92f3358f R09: 1ffffffff25e66b1
R10: dffffc0000000000 R11: fffffbfff25e66b2 R12: ffffffff8195747e
R13: ffff88807c8cd748 R14: 0000000000000246 R15: ffff8880b952d4c8
FS: 0000000000000000(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc66d258d58 CR3: 00000000222ca000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
kcov_remote_start_common include/linux/kcov.h:48 [inline]
ieee80211_iface_work+0x21f/0xf10 net/mac80211/iface.c:1654
cfg80211_wiphy_work+0x221/0x260 net/wireless/core.c:437
process_one_work kernel/workqueue.c:3218 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


2024-03-28 11:45:27

by Johannes Berg

Subject: Re: [syzbot] [wireless?] WARNING in kcov_remote_start (3)

On Thu, 2024-03-28 at 04:00 -0700, syzbot wrote:
>
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 2400 at kernel/kcov.c:860 kcov_remote_start+0x549/0x7e0 kernel/kcov.c:860

This is

	/*
	 * Check that kcov_remote_start() is not called twice in background
	 * threads nor called by user tasks (with enabled kcov).
	 */
	mode = READ_ONCE(t->kcov_mode);
	if (WARN_ON(in_task() && kcov_mode_enabled(mode))) {
		local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
		return;
	}

but I have no idea what that even means?

> Workqueue: events_unbound cfg80211_wiphy_work
> RIP: 0010:kcov_remote_start+0x549/0x7e0 kernel/kcov.c:860
...
> Call Trace:
> <TASK>
> kcov_remote_start_common include/linux/kcov.h:48 [inline]
> ieee80211_iface_work+0x21f/0xf10 net/mac80211/iface.c:1654
> cfg80211_wiphy_work+0x221/0x260 net/wireless/core.c:437
> process_one_work kernel/workqueue.c:3218 [inline]
> process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
> worker_thread+0x86d/0xd70 kernel/workqueue.c:3380

It's a worker thread. Was this not intended to be called in threads?

johannes

2024-04-10 10:57:36

by Andrey Konovalov

Subject: Re: [syzbot] [wireless?] WARNING in kcov_remote_start (3)

On Thu, Mar 28, 2024 at 12:45 PM Johannes Berg
<[email protected]> wrote:
>
> On Thu, 2024-03-28 at 04:00 -0700, syzbot wrote:
> >
> > ------------[ cut here ]------------
> > WARNING: CPU: 1 PID: 2400 at kernel/kcov.c:860 kcov_remote_start+0x549/0x7e0 kernel/kcov.c:860
>
> This is
>
> /*
> * Check that kcov_remote_start() is not called twice in background
> * threads nor called by user tasks (with enabled kcov).
> */
> mode = READ_ONCE(t->kcov_mode);
> if (WARN_ON(in_task() && kcov_mode_enabled(mode))) {
> local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
> return;
> }
>
> but I have no idea what that even means?
>
> > Workqueue: events_unbound cfg80211_wiphy_work
> > RIP: 0010:kcov_remote_start+0x549/0x7e0 kernel/kcov.c:860
> ...
> > Call Trace:
> > <TASK>
> > kcov_remote_start_common include/linux/kcov.h:48 [inline]
> > ieee80211_iface_work+0x21f/0xf10 net/mac80211/iface.c:1654
> > cfg80211_wiphy_work+0x221/0x260 net/wireless/core.c:437
> > process_one_work kernel/workqueue.c:3218 [inline]
> > process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
> > worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
>
> It's a worker thread. Was this not intended to be called in threads?

I think the problem is that the KCOV annotations in the NFC code are
buggy: kcov_remote_stop() is never called if the loop in nci_rx_work()
exits on one of the breaks. With the recent addition of the nci_plen()
check, this started happening often. But breaks existed in the loop
before that too.

We need to move kcov_remote_stop() into the loop and call it every
time the loop exits.
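
A minimal sketch of the pattern described above, not the actual NFC or
kcov code. Every name here (fake_task, fake_remote_start,
fake_remote_stop, rx_work_buggy, rx_work_fixed) is a hypothetical
stand-in, and a negative packet value is just an assumed way to trigger
the early break:

```c
#include <assert.h>   /* only needed by callers that check the results */
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-task coverage state; not the kernel's struct. */
struct fake_task {
	bool remote_active;	/* true between a start and its matching stop */
	int warnings;		/* how often the "already active" check fired */
};

/* Models the quoted check: a start while already active warns and returns. */
static void fake_remote_start(struct fake_task *t)
{
	if (t->remote_active) {
		t->warnings++;
		return;
	}
	t->remote_active = true;
}

static void fake_remote_stop(struct fake_task *t)
{
	t->remote_active = false;
}

/* Buggy shape: the break skips the stop at the bottom of the loop body. */
static void rx_work_buggy(struct fake_task *t, const int *pkts, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		fake_remote_start(t);
		if (pkts[i] < 0)	/* assumed "bad packet" condition */
			break;		/* leaves remote_active set */
		fake_remote_stop(t);
	}
}

/* Fixed shape: every exit from the loop passes through a stop. */
static void rx_work_fixed(struct fake_task *t, const int *pkts, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		fake_remote_start(t);
		if (pkts[i] < 0) {
			fake_remote_stop(t);
			break;
		}
		fake_remote_stop(t);
	}
}
```

In this model, after rx_work_buggy() hits the break, the next
fake_remote_start() on the same task increments warnings. That would be
consistent with the warning showing up in an unrelated work item that
later runs on the same worker thread.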

Dmitry, could you PTAL and confirm this? You added the annotation for
NFC, AFAICS.

Thanks!

2024-05-21 04:43:39

by Dmitry Vyukov

Subject: Re: [syzbot] [wireless?] WARNING in kcov_remote_start (3)

On Wed, 10 Apr 2024 at 12:56, Andrey Konovalov <[email protected]> wrote:
>
> On Thu, Mar 28, 2024 at 12:45 PM Johannes Berg
> <[email protected]> wrote:
> >
> > On Thu, 2024-03-28 at 04:00 -0700, syzbot wrote:
> > >
> > > ------------[ cut here ]------------
> > > WARNING: CPU: 1 PID: 2400 at kernel/kcov.c:860 kcov_remote_start+0x549/0x7e0 kernel/kcov.c:860
> >
> > This is
> >
> > /*
> > * Check that kcov_remote_start() is not called twice in background
> > * threads nor called by user tasks (with enabled kcov).
> > */
> > mode = READ_ONCE(t->kcov_mode);
> > if (WARN_ON(in_task() && kcov_mode_enabled(mode))) {
> > local_unlock_irqrestore(&kcov_percpu_data.lock, flags);
> > return;
> > }
> >
> > but I have no idea what that even means?
> >
> > > Workqueue: events_unbound cfg80211_wiphy_work
> > > RIP: 0010:kcov_remote_start+0x549/0x7e0 kernel/kcov.c:860
> > ...
> > > Call Trace:
> > > <TASK>
> > > kcov_remote_start_common include/linux/kcov.h:48 [inline]
> > > ieee80211_iface_work+0x21f/0xf10 net/mac80211/iface.c:1654
> > > cfg80211_wiphy_work+0x221/0x260 net/wireless/core.c:437
> > > process_one_work kernel/workqueue.c:3218 [inline]
> > > process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3299
> > > worker_thread+0x86d/0xd70 kernel/workqueue.c:3380
> >
> > It's a worker thread. Was this not intended to be called in threads?
>
> I think the problem is that the KCOV annotations in the NFC code are
> buggy: kcov_remote_stop() is never called if the loop in nci_rx_work()
> exits on one of the breaks. With the recent addition of the nci_plen()
> check, this started happening often. But breaks existed in the loop
> before that too.
>
> We need to move kcov_remote_stop() into the loop and call it every
> time the loop exits.
>
> Dmitry, could you PTAL and confirm this? You added the annotation for
> NFC, AFAICS.


Missed this before somehow.
The other breaks seem to be from the switch, so should be fine:
https://elixir.bootlin.com/linux/v6.9-rc6/source/net/nfc/nci/core.c#L1528
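
For reference, a small sketch of why a break inside a switch is
harmless here: in C it leaves only the switch, so the rest of the loop
body still runs. The function name count_iterations and its argument
are made up for illustration and are not taken from core.c:

```c
#include <assert.h>   /* only needed by callers that check the result */
#include <stddef.h>

/* Returns how many times the code after the switch was reached. */
size_t count_iterations(const int *kinds, size_t n)
{
	size_t iterations = 0;

	for (size_t i = 0; i < n; i++) {
		switch (kinds[i]) {
		case 0:
			break;	/* exits the switch only, not the for loop */
		default:
			break;
		}
		iterations++;	/* still reached after either break */
	}
	return iterations;
}
```

Anything placed after the switch, such as a stop call, is therefore
still executed once per iteration.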

Tetsuo, thanks for fixing it.