2024-04-10 18:46:20

by syzbot

Subject: [syzbot] [net?] possible deadlock in unix_notinflight

Hello,

syzbot found the following issue on:

HEAD commit: 443574b03387 riscv, bpf: Fix kfunc parameters incompatibil..
git tree: bpf
console+strace: https://syzkaller.appspot.com/x/log.txt?x=12898aa9180000
kernel config: https://syzkaller.appspot.com/x/.config?x=6fb1be60a193d440
dashboard link: https://syzkaller.appspot.com/bug?extid=38b3aa8cd529958bd27a
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13d693e3180000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11aee305180000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/3f355021a085/disk-443574b0.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/44cf4de7472a/vmlinux-443574b0.xz
kernel image: https://storage.googleapis.com/syzbot-assets/a99a36c7ad65/bzImage-443574b0.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+38b3aa8cd529958bd27a@syzkaller.appspotmail.com

============================================
WARNING: possible recursive locking detected
6.8.0-syzkaller-05236-g443574b03387 #0 Not tainted
--------------------------------------------
kworker/u8:0/10 is trying to acquire lock:
ffffffff8f48b798 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffffffff8f48b798 (unix_gc_lock){+.+.}-{2:2}, at: unix_notinflight+0x204/0x390 net/unix/garbage.c:140

but task is already holding lock:
ffffffff8f48b798 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffffffff8f48b798 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf10 net/unix/garbage.c:261

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(unix_gc_lock);
lock(unix_gc_lock);

*** DEADLOCK ***

May be due to missing lock nesting notation

3 locks held by kworker/u8:0/10:
#0: ffff888014c81148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline]
#0: ffff888014c81148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x8e0/0x1770 kernel/workqueue.c:3335
#1: ffffc900000f7d00 (unix_gc_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline]
#1: ffffc900000f7d00 (unix_gc_work){+.+.}-{0:0}, at: process_scheduled_works+0x91b/0x1770 kernel/workqueue.c:3335
#2: ffffffff8f48b798 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
#2: ffffffff8f48b798 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf10 net/unix/garbage.c:261

stack backtrace:
CPU: 1 PID: 10 Comm: kworker/u8:0 Not tainted 6.8.0-syzkaller-05236-g443574b03387 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Workqueue: events_unbound __unix_gc
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
check_deadlock kernel/locking/lockdep.c:3062 [inline]
validate_chain+0x15c1/0x58e0 kernel/locking/lockdep.c:3856
__lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137
lock_acquire+0x1e4/0x530 kernel/locking/lockdep.c:5754
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
unix_notinflight+0x204/0x390 net/unix/garbage.c:140
unix_detach_fds net/unix/af_unix.c:1819 [inline]
unix_destruct_scm+0x221/0x350 net/unix/af_unix.c:1876
skb_release_head_state+0x100/0x250 net/core/skbuff.c:1188
skb_release_all net/core/skbuff.c:1200 [inline]
__kfree_skb net/core/skbuff.c:1216 [inline]
kfree_skb_reason+0x15d/0x390 net/core/skbuff.c:1252
kfree_skb include/linux/skbuff.h:1267 [inline]
__unix_gc+0xaf3/0xf10 net/unix/garbage.c:330
process_one_work kernel/workqueue.c:3254 [inline]
process_scheduled_works+0xa00/0x1770 kernel/workqueue.c:3335
worker_thread+0x86d/0xd70 kernel/workqueue.c:3416
kthread+0x2f0/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:243
</TASK>
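
Reading the trace above: __unix_gc takes unix_gc_lock (garbage.c:261) and, while still holding it, calls kfree_skb (garbage.c:330). The skb destructor path then reaches unix_notinflight, which takes unix_gc_lock again (garbage.c:140). Below is a minimal user-space sketch of that shape, assuming a POSIX error-checking mutex in place of the kernel spinlock. The names gc_lock, gc_lock_init, release_inflight and gc_free_under_lock are hypothetical stand-ins, not kernel symbols.

```c
#define _XOPEN_SOURCE 700	/* for PTHREAD_MUTEX_ERRORCHECK under strict -std modes */
#include <errno.h>
#include <pthread.h>

/* Hypothetical user-space stand-in for unix_gc_lock. */
static pthread_mutex_t gc_lock;

static void gc_lock_init(void)
{
	pthread_mutexattr_t attr;

	pthread_mutexattr_init(&attr);
	/* ERRORCHECK makes a relock by the owning thread fail instead of hanging. */
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&gc_lock, &attr);
	pthread_mutexattr_destroy(&attr);
}

/* Plays the role of unix_notinflight(): it takes the lock itself. */
static int release_inflight(void)
{
	int err = pthread_mutex_lock(&gc_lock);

	if (err)
		return err;	/* EDEADLK when the caller already owns gc_lock */
	pthread_mutex_unlock(&gc_lock);
	return 0;
}

/* Plays the role of __unix_gc() freeing an skb while holding the lock. */
static int gc_free_under_lock(void)
{
	int err;

	pthread_mutex_lock(&gc_lock);
	err = release_inflight();	/* same thread, same lock */
	pthread_mutex_unlock(&gc_lock);
	return err;
}
```

After gc_lock_init(), gc_free_under_lock() returns EDEADLK, while release_inflight() called without the lock held returns 0. A plain spinlock would be expected to spin rather than return an error, which is the scenario lockdep is warning about here.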


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


2024-04-10 22:32:35

by Kuniyuki Iwashima

Subject: Re: [syzbot] [net?] possible deadlock in unix_notinflight

From: syzbot <syzbot+38b3aa8cd529958bd27a@syzkaller.appspotmail.com>
Date: Wed, 10 Apr 2024 11:45:29 -0700
> If the report is already addressed, let syzbot know by replying with:
> #syz fix: exact-commit-title

#syz fix: af_unix: Clear stale u->oob_skb.

2024-04-10 22:58:54

by Hillf Danton

Subject: Re: [syzbot] [net?] possible deadlock in unix_notinflight

On Wed, 10 Apr 2024 11:45:29 -0700
> syzbot found the following issue on:
>
> HEAD commit: 443574b03387 riscv, bpf: Fix kfunc parameters incompatibil..
> git tree: bpf
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11aee305180000

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git 443574b03387

--- x/net/unix/garbage.c
+++ y/net/unix/garbage.c
@@ -327,7 +327,7 @@ static void __unix_gc(struct work_struct

 #if IS_ENABLED(CONFIG_AF_UNIX_OOB)
 		if (u->oob_skb) {
-			kfree_skb(u->oob_skb);
+			__skb_queue_tail(hitlist, u->oob_skb);
 			u->oob_skb = NULL;
 		}
 #endif
--

2024-04-11 19:32:19

by syzbot

Subject: Re: [syzbot] [net?] possible deadlock in unix_notinflight

Hello,

syzbot tried to test the proposed patch but the build/boot failed:

net/unix/garbage.c:330:21: error: passing 'struct sk_buff_head' to parameter of incompatible type 'struct sk_buff_head *'; take the address with &


Tested on:

commit: 443574b0 riscv, bpf: Fix kfunc parameters incompatibil..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git
kernel config: https://syzkaller.appspot.com/x/.config?x=6fb1be60a193d440
dashboard link: https://syzkaller.appspot.com/bug?extid=38b3aa8cd529958bd27a
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=164eb66d180000
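
The build error reported above says that hitlist has type 'struct sk_buff_head', while __skb_queue_tail() expects a 'struct sk_buff_head *', so the call in the test patch would need &hitlist. The sketch below shows that call shape together with the general idea of queueing an object while a lock is held and releasing it afterwards. It is user-space C with hypothetical stand-in types (struct buf, struct buf_head) and helpers (buf_queue_tail, buf_queue_purge, collect), not the kernel API, and it does not claim that queueing oob_skb this way is a correct fix for the reported deadlock.

```c
#include <stddef.h>

/* Hypothetical, minimal stand-ins for struct sk_buff / struct sk_buff_head. */
struct buf {
	struct buf *next;
	int freed;
};

struct buf_head {
	struct buf *first;
	struct buf *last;
	unsigned int qlen;
};

static void buf_head_init(struct buf_head *list)
{
	list->first = list->last = NULL;
	list->qlen = 0;
}

/* Takes a pointer to the list head, as __skb_queue_tail() does. */
static void buf_queue_tail(struct buf_head *list, struct buf *b)
{
	b->next = NULL;
	if (list->last)
		list->last->next = b;
	else
		list->first = b;
	list->last = b;
	list->qlen++;
}

/* Drains the list and returns how many entries were released. */
static unsigned int buf_queue_purge(struct buf_head *list)
{
	unsigned int n = 0;
	struct buf *b = list->first;

	while (b) {
		struct buf *next = b->next;

		b->freed = 1;	/* stand-in for kfree_skb() */
		b = next;
		n++;
	}
	buf_head_init(list);
	return n;
}

/* hitlist is a local struct here, so callees receive its address. */
static unsigned int collect(struct buf *oob)
{
	struct buf_head hitlist;

	buf_head_init(&hitlist);
	/* a lock would be taken here */
	buf_queue_tail(&hitlist, oob);	/* passing hitlist by value would not compile */
	/* a lock would be released here, before anything is freed */
	return buf_queue_purge(&hitlist);
}
```

Calling collect() on a zero-initialised struct buf returns 1 and leaves its freed field set, with the release step running only after the point where the lock would have been dropped.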