2022-03-30 06:23:07

by Dominique Martinet

Subject: Re: [syzbot] possible deadlock in p9_write_work

syzbot wrote on Tue, Mar 29, 2022 at 02:23:17PM -0700:
> ======================================================
> WARNING: possible circular locking dependency detected
> 5.17.0-next-20220328-syzkaller #0 Not tainted
> ------------------------------------------------------
> kworker/1:1/26 is trying to acquire lock:
> ffff88807eece460 (sb_writers#3){.+.+}-{0:0}, at: p9_fd_write net/9p/trans_fd.c:428 [inline]
> ffff88807eece460 (sb_writers#3){.+.+}-{0:0}, at: p9_write_work+0x25e/0xca0 net/9p/trans_fd.c:479
>
> but task is already holding lock:
> ffffc90000a1fda8 ((work_completion)(&m->wq)){+.+.}-{0:0}, at: process_one_work+0x8ae/0x1610 kernel/workqueue.c:2264
>
> which lock already depends on the new lock.
>
>
> the existing dependency chain (in reverse order) is:
>
> -> #3 ((work_completion)(&m->wq)){+.+.}-{0:0}:
> process_one_work+0x905/0x1610 kernel/workqueue.c:2265
> worker_thread+0x665/0x1080 kernel/workqueue.c:2436
> kthread+0x2e9/0x3a0 kernel/kthread.c:376
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
>
> -> #2 ((wq_completion)events){+.+.}-{0:0}:
> flush_workqueue+0x164/0x1440 kernel/workqueue.c:2831
> flush_scheduled_work include/linux/workqueue.h:583 [inline]
> ext4_put_super+0x99/0x1150 fs/ext4/super.c:1202
> generic_shutdown_super+0x14c/0x400 fs/super.c:462
> kill_block_super+0x97/0xf0 fs/super.c:1394
> deactivate_locked_super+0x94/0x160 fs/super.c:332
> deactivate_super+0xad/0xd0 fs/super.c:363
> cleanup_mnt+0x3a2/0x540 fs/namespace.c:1186
> task_work_run+0xdd/0x1a0 kernel/task_work.c:164
> resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
> exit_to_user_mode_loop kernel/entry/common.c:183 [inline]
> exit_to_user_mode_prepare+0x23c/0x250 kernel/entry/common.c:215
> __syscall_exit_to_user_mode_work kernel/entry/common.c:297 [inline]
> syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:308
> do_syscall_64+0x42/0x80 arch/x86/entry/common.c:86
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> -> #1 (&type->s_umount_key#32){++++}-{3:3}:
> down_read+0x98/0x440 kernel/locking/rwsem.c:1461
> iterate_supers+0xdb/0x290 fs/super.c:692
> drop_caches_sysctl_handler+0xdb/0x110 fs/drop_caches.c:62
> proc_sys_call_handler+0x4a1/0x6e0 fs/proc/proc_sysctl.c:604
> call_write_iter include/linux/fs.h:2080 [inline]
> do_iter_readv_writev+0x3d1/0x640 fs/read_write.c:726
> do_iter_write+0x182/0x700 fs/read_write.c:852
> vfs_iter_write+0x70/0xa0 fs/read_write.c:893
> iter_file_splice_write+0x723/0xc70 fs/splice.c:689
> do_splice_from fs/splice.c:767 [inline]
> direct_splice_actor+0x110/0x180 fs/splice.c:936
> splice_direct_to_actor+0x34b/0x8c0 fs/splice.c:891
> do_splice_direct+0x1a7/0x270 fs/splice.c:979
> do_sendfile+0xae0/0x1240 fs/read_write.c:1246
> __do_sys_sendfile64 fs/read_write.c:1305 [inline]
> __se_sys_sendfile64 fs/read_write.c:1297 [inline]
> __x64_sys_sendfile64+0x149/0x210 fs/read_write.c:1297
> do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> do_syscall_64+0x35/0x80 arch/x86/entry/common.c:80
> entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> -> #0 (sb_writers#3){.+.+}-{0:0}:
> check_prev_add kernel/locking/lockdep.c:3096 [inline]
> check_prevs_add kernel/locking/lockdep.c:3219 [inline]
> validate_chain kernel/locking/lockdep.c:3834 [inline]
> __lock_acquire+0x2ac6/0x56c0 kernel/locking/lockdep.c:5060
> lock_acquire kernel/locking/lockdep.c:5672 [inline]
> lock_acquire+0x1ab/0x510 kernel/locking/lockdep.c:5637
> percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
> __sb_start_write include/linux/fs.h:1728 [inline]
> sb_start_write include/linux/fs.h:1798 [inline]
> file_start_write include/linux/fs.h:2815 [inline]
> kernel_write fs/read_write.c:564 [inline]
> kernel_write+0x2ac/0x540 fs/read_write.c:555
> p9_fd_write net/9p/trans_fd.c:428 [inline]
> p9_write_work+0x25e/0xca0 net/9p/trans_fd.c:479
> process_one_work+0x996/0x1610 kernel/workqueue.c:2289
> worker_thread+0x665/0x1080 kernel/workqueue.c:2436
> kthread+0x2e9/0x3a0 kernel/kthread.c:376
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298


So p9_write_work cannot write because there's a umount of the backing
ext4 (I assume 9p was mounted with trans fd on a file that lives on
ext4) and a drop_caches stuck in parallel, and we just got caught in
the crossfire?

I'm not sure why it got stuck there, but this doesn't look like
something we can do anything about. Using trans fd with
filesystem-backed files isn't a usage we care about in the first place.
Maybe there's a way to refuse these and only keep sockets, but I don't
really see the point of artificially limiting the interface (unless
using a 9p mount with a file could have security implications I don't
see).

wontfix/dontcare for me,
--
Dominique