Hello,
syzbot found the following issue on:
HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=132e687c700000
kernel config: https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912
dashboard link: https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
Unfortunately, I don't have any reproducer for this issue yet.
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
Modules linked in:
CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
Code: c7 85 90 01 00 00 00 00 00 00 e8 53 6e a2 fa 48 89 ef 48 83 c4 20 5b 5d 41 5c 41 5d 41 5e 41 5f e9 7d d6 ff ff e8 38 6e a2 fa <0f> 0b e9 46 ff ff ff 48 8b 7c 24 10 e8 87 00 ea fa e9 75 f7 ff ff
RSP: 0018:ffffc9000fe6fa18 EFLAGS: 00010293
RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 0000000000000000
RDX: ffff888021b63a00 RSI: ffffffff86d66fe8 RDI: ffff88801cc200b0
RBP: ffff88801cc20000 R08: 0000000000000001 R09: 0000000000000001
R10: ffffffff817f1e08 R11: 0000000000000000 R12: ffff88801cc200d0
R13: ffff88801cc20120 R14: ffff88801cc200d0 R15: 0000000000000002
FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2de25000 CR3: 000000004c9cd000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
vhost_vsock_dev_release+0x36e/0x4b0 drivers/vhost/vsock.c:771
__fput+0x286/0x9f0 fs/file_table.c:313
task_work_run+0xdd/0x1a0 kernel/task_work.c:164
exit_task_work include/linux/task_work.h:32 [inline]
do_exit+0xb29/0x2a30 kernel/exit.c:806
do_group_exit+0xd2/0x2f0 kernel/exit.c:935
get_signal+0x45a/0x2490 kernel/signal.c:2863
arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
handle_signal_work kernel/entry/common.c:148 [inline]
exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
__syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f4027a46481
Code: Unable to access opcode bytes at RIP 0x7f4027a46457.
RSP: 002b:00007f402808ba68 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
RAX: fffffffffffffffc RBX: 00007f402622e700 RCX: 00007f4027a46481
RDX: 00007f402622e9d0 RSI: 00007f402622e2f0 RDI: 00000000003d0f00
RBP: 00007f402808bcb0 R08: 00007f402622e700 R09: 00007f402622e700
R10: 00007f402622e9d0 R11: 0000000000000206 R12: 00007f402808bb1e
R13: 00007f402808bb1f R14: 00007f402622e300 R15: 0000000000022000
</TASK>
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
On Thu, Feb 17, 2022 at 3:36 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote:
> > On Thu, Feb 17, 2022 at 10:01 AM syzbot
> > <[email protected]> wrote:
> > >
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
> > > git tree: upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=132e687c700000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0
> > > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > >
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: [email protected]
> > >
> > > WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
> > > Modules linked in:
> > > CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
> >
> > Probably a hint that we are missing a flush.
> >
> > Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release():
> >
> > static int vhost_vsock_stop(struct vhost_vsock *vsock)
> > {
> > size_t i;
> > int ret;
> >
> > mutex_lock(&vsock->dev.mutex);
> >
> > ret = vhost_dev_check_owner(&vsock->dev);
> > if (ret)
> > goto err;
> >
> > Where it could fail so the device is not actually stopped.
> >
> > I wonder if this is something related.
> >
> > Thanks
>
>
> But then if that is not the owner then no work should be running, right?
Could it be a buggy user space that passes the fd to another process
and changes the owner just before the mutex_lock() above?
Thanks
>
>
> >
> > > Code: c7 85 90 01 00 00 00 00 00 00 e8 53 6e a2 fa 48 89 ef 48 83 c4 20 5b 5d 41 5c 41 5d 41 5e 41 5f e9 7d d6 ff ff e8 38 6e a2 fa <0f> 0b e9 46 ff ff ff 48 8b 7c 24 10 e8 87 00 ea fa e9 75 f7 ff ff
> > > RSP: 0018:ffffc9000fe6fa18 EFLAGS: 00010293
> > > RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 0000000000000000
> > > RDX: ffff888021b63a00 RSI: ffffffff86d66fe8 RDI: ffff88801cc200b0
> > > RBP: ffff88801cc20000 R08: 0000000000000001 R09: 0000000000000001
> > > R10: ffffffff817f1e08 R11: 0000000000000000 R12: ffff88801cc200d0
> > > R13: ffff88801cc20120 R14: ffff88801cc200d0 R15: 0000000000000002
> > > FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > CR2: 0000001b2de25000 CR3: 000000004c9cd000 CR4: 00000000003506f0
> > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > Call Trace:
> > > <TASK>
> > > vhost_vsock_dev_release+0x36e/0x4b0 drivers/vhost/vsock.c:771
> > > __fput+0x286/0x9f0 fs/file_table.c:313
> > > task_work_run+0xdd/0x1a0 kernel/task_work.c:164
> > > exit_task_work include/linux/task_work.h:32 [inline]
> > > do_exit+0xb29/0x2a30 kernel/exit.c:806
> > > do_group_exit+0xd2/0x2f0 kernel/exit.c:935
> > > get_signal+0x45a/0x2490 kernel/signal.c:2863
> > > arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
> > > handle_signal_work kernel/entry/common.c:148 [inline]
> > > exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
> > > exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
> > > __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
> > > syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
> > > do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
> > > entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > RIP: 0033:0x7f4027a46481
> > > Code: Unable to access opcode bytes at RIP 0x7f4027a46457.
> > > RSP: 002b:00007f402808ba68 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
> > > RAX: fffffffffffffffc RBX: 00007f402622e700 RCX: 00007f4027a46481
> > > RDX: 00007f402622e9d0 RSI: 00007f402622e2f0 RDI: 00000000003d0f00
> > > RBP: 00007f402808bcb0 R08: 00007f402622e700 R09: 00007f402622e700
> > > R10: 00007f402622e9d0 R11: 0000000000000206 R12: 00007f402808bb1e
> > > R13: 00007f402808bb1f R14: 00007f402622e300 R15: 0000000000022000
> > > </TASK>
> > >
> > >
> > > ---
> > > This report is generated by a bot. It may contain errors.
> > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > syzbot engineers can be reached at [email protected].
> > >
> > > syzbot will keep track of this issue. See:
> > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > >
>
On Thu, Feb 17, 2022 at 03:39:48PM +0800, Jason Wang wrote:
> On Thu, Feb 17, 2022 at 3:36 PM Michael S. Tsirkin <[email protected]> wrote:
> >
> > On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote:
> > > On Thu, Feb 17, 2022 at 10:01 AM syzbot
> > > <[email protected]> wrote:
> > > >
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
> > > > git tree: upstream
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=132e687c700000
> > > > kernel config: https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0
> > > > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > > >
> > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: [email protected]
> > > >
> > > > WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
> > > > Modules linked in:
> > > > CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
> > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > > RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
> > >
> > > Probably a hint that we are missing a flush.
> > >
> > > Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release():
> > >
> > > static int vhost_vsock_stop(struct vhost_vsock *vsock)
> > > {
> > > size_t i;
> > > int ret;
> > >
> > > mutex_lock(&vsock->dev.mutex);
> > >
> > > ret = vhost_dev_check_owner(&vsock->dev);
> > > if (ret)
> > > goto err;
> > >
> > > Where it could fail so the device is not actually stopped.
> > >
> > > I wonder if this is something related.
> > >
> > > Thanks
> >
> >
> > But then if that is not the owner then no work should be running, right?
>
> Could it be a buggy user space that passes the fd to another process
> and changes the owner just before the mutex_lock() above?
>
> Thanks
Maybe, but can you be a bit more explicit? what is the set of
conditions you see that can lead to this?
> >
> >
> > >
> > > > Code: c7 85 90 01 00 00 00 00 00 00 e8 53 6e a2 fa 48 89 ef 48 83 c4 20 5b 5d 41 5c 41 5d 41 5e 41 5f e9 7d d6 ff ff e8 38 6e a2 fa <0f> 0b e9 46 ff ff ff 48 8b 7c 24 10 e8 87 00 ea fa e9 75 f7 ff ff
> > > > RSP: 0018:ffffc9000fe6fa18 EFLAGS: 00010293
> > > > RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 0000000000000000
> > > > RDX: ffff888021b63a00 RSI: ffffffff86d66fe8 RDI: ffff88801cc200b0
> > > > RBP: ffff88801cc20000 R08: 0000000000000001 R09: 0000000000000001
> > > > R10: ffffffff817f1e08 R11: 0000000000000000 R12: ffff88801cc200d0
> > > > R13: ffff88801cc20120 R14: ffff88801cc200d0 R15: 0000000000000002
> > > > FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> > > > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > CR2: 0000001b2de25000 CR3: 000000004c9cd000 CR4: 00000000003506f0
> > > > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > > > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > > > Call Trace:
> > > > <TASK>
> > > > vhost_vsock_dev_release+0x36e/0x4b0 drivers/vhost/vsock.c:771
> > > > __fput+0x286/0x9f0 fs/file_table.c:313
> > > > task_work_run+0xdd/0x1a0 kernel/task_work.c:164
> > > > exit_task_work include/linux/task_work.h:32 [inline]
> > > > do_exit+0xb29/0x2a30 kernel/exit.c:806
> > > > do_group_exit+0xd2/0x2f0 kernel/exit.c:935
> > > > get_signal+0x45a/0x2490 kernel/signal.c:2863
> > > > arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
> > > > handle_signal_work kernel/entry/common.c:148 [inline]
> > > > exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
> > > > exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
> > > > __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
> > > > syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
> > > > do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
> > > > entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > RIP: 0033:0x7f4027a46481
> > > > Code: Unable to access opcode bytes at RIP 0x7f4027a46457.
> > > > RSP: 002b:00007f402808ba68 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
> > > > RAX: fffffffffffffffc RBX: 00007f402622e700 RCX: 00007f4027a46481
> > > > RDX: 00007f402622e9d0 RSI: 00007f402622e2f0 RDI: 00000000003d0f00
> > > > RBP: 00007f402808bcb0 R08: 00007f402622e700 R09: 00007f402622e700
> > > > R10: 00007f402622e9d0 R11: 0000000000000206 R12: 00007f402808bb1e
> > > > R13: 00007f402808bb1f R14: 00007f402622e300 R15: 0000000000022000
> > > > </TASK>
> > > >
> > > >
> > > > ---
> > > > This report is generated by a bot. It may contain errors.
> > > > See https://goo.gl/tpsmEJ for more information about syzbot.
> > > > syzbot engineers can be reached at [email protected].
> > > >
> > > > syzbot will keep track of this issue. See:
> > > > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> > > >
> >
On Thu, Feb 17, 2022 at 10:01 AM syzbot
<[email protected]> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=132e687c700000
> kernel config: https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912
> dashboard link: https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: [email protected]
>
> WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
> Modules linked in:
> CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
Probably a hint that we are missing a flush.
Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release():
static int vhost_vsock_stop(struct vhost_vsock *vsock)
{
size_t i;
int ret;
mutex_lock(&vsock->dev.mutex);
ret = vhost_dev_check_owner(&vsock->dev);
if (ret)
goto err;
Where it could fail so the device is not actually stopped.
I wonder if this is something related.
Thanks
> Code: c7 85 90 01 00 00 00 00 00 00 e8 53 6e a2 fa 48 89 ef 48 83 c4 20 5b 5d 41 5c 41 5d 41 5e 41 5f e9 7d d6 ff ff e8 38 6e a2 fa <0f> 0b e9 46 ff ff ff 48 8b 7c 24 10 e8 87 00 ea fa e9 75 f7 ff ff
> RSP: 0018:ffffc9000fe6fa18 EFLAGS: 00010293
> RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 0000000000000000
> RDX: ffff888021b63a00 RSI: ffffffff86d66fe8 RDI: ffff88801cc200b0
> RBP: ffff88801cc20000 R08: 0000000000000001 R09: 0000000000000001
> R10: ffffffff817f1e08 R11: 0000000000000000 R12: ffff88801cc200d0
> R13: ffff88801cc20120 R14: ffff88801cc200d0 R15: 0000000000000002
> FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000001b2de25000 CR3: 000000004c9cd000 CR4: 00000000003506f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> vhost_vsock_dev_release+0x36e/0x4b0 drivers/vhost/vsock.c:771
> __fput+0x286/0x9f0 fs/file_table.c:313
> task_work_run+0xdd/0x1a0 kernel/task_work.c:164
> exit_task_work include/linux/task_work.h:32 [inline]
> do_exit+0xb29/0x2a30 kernel/exit.c:806
> do_group_exit+0xd2/0x2f0 kernel/exit.c:935
> get_signal+0x45a/0x2490 kernel/signal.c:2863
> arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
> handle_signal_work kernel/entry/common.c:148 [inline]
> exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
> exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
> __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
> syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
> do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f4027a46481
> Code: Unable to access opcode bytes at RIP 0x7f4027a46457.
> RSP: 002b:00007f402808ba68 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
> RAX: fffffffffffffffc RBX: 00007f402622e700 RCX: 00007f4027a46481
> RDX: 00007f402622e9d0 RSI: 00007f402622e2f0 RDI: 00000000003d0f00
> RBP: 00007f402808bcb0 R08: 00007f402622e700 R09: 00007f402622e700
> R10: 00007f402622e9d0 R11: 0000000000000206 R12: 00007f402808bb1e
> R13: 00007f402808bb1f R14: 00007f402622e300 R15: 0000000000022000
> </TASK>
>
>
> ---
> This report is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at [email protected].
>
> syzbot will keep track of this issue. See:
> https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
>
On Thu, Feb 17, 2022 at 8:50 AM Michael S. Tsirkin <[email protected]> wrote:
>
> On Thu, Feb 17, 2022 at 03:39:48PM +0800, Jason Wang wrote:
> > On Thu, Feb 17, 2022 at 3:36 PM Michael S. Tsirkin <[email protected]> wrote:
> > >
> > > On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote:
> > > > On Thu, Feb 17, 2022 at 10:01 AM syzbot
> > > > <[email protected]> wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > syzbot found the following issue on:
> > > > >
> > > > > HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
> > > > > git tree: upstream
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=132e687c700000
> > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0
> > > > > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > > > >
> > > > > Unfortunately, I don't have any reproducer for this issue yet.
> > > > >
> > > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > > Reported-by: [email protected]
> > > > >
> > > > > WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
> > > > > Modules linked in:
> > > > > CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
> > > > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > > > > RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
> > > >
> > > > Probably a hint that we are missing a flush.
> > > >
> > > > Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release():
> > > >
> > > > static int vhost_vsock_stop(struct vhost_vsock *vsock)
> > > > {
> > > > size_t i;
> > > > int ret;
> > > >
> > > > mutex_lock(&vsock->dev.mutex);
> > > >
> > > > ret = vhost_dev_check_owner(&vsock->dev);
> > > > if (ret)
> > > > goto err;
> > > >
> > > > Where it could fail so the device is not actually stopped.
> > > >
> > > > I wonder if this is something related.
> > > >
> > > > Thanks
> > >
> > >
> > > But then if that is not the owner then no work should be running, right?
> >
> > Could it be a buggy user space that passes the fd to another process
> > and changes the owner just before the mutex_lock() above?
> >
> > Thanks
>
> Maybe, but can you be a bit more explicit? what is the set of
> conditions you see that can lead to this?
I think the issue could be in the vhost_vsock_stop() as Jason mentioned,
but not related to fd passing, but related to the do_exit() function.
Looking the stack trace, we are in exit_task_work(), that is called
after exit_mm(), so the vhost_dev_check_owner() can fail because
current->mm should be NULL at that point.
It seems the fput work is queued by fput_many() in a worker queue, and
in some cases (maybe a lot of files opened?) the work is still queued
when we enter in do_exit().
That said, I don't know if we can simply remove that check in
vhost_vsock_stop(), or check if current->mm is NULL, to understand if
the process is exiting.
Stefano
On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote:
> On Thu, Feb 17, 2022 at 10:01 AM syzbot
> <[email protected]> wrote:
> >
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=132e687c700000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912
> > dashboard link: https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0
> > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: [email protected]
> >
> > WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
> > Modules linked in:
> > CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> > RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
>
> Probably a hint that we are missing a flush.
>
> Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release():
>
> static int vhost_vsock_stop(struct vhost_vsock *vsock)
> {
> size_t i;
> int ret;
>
> mutex_lock(&vsock->dev.mutex);
>
> ret = vhost_dev_check_owner(&vsock->dev);
> if (ret)
> goto err;
>
> Where it could fail so the device is not actually stopped.
>
> I wonder if this is something related.
>
> Thanks
But then if that is not the owner then no work should be running, right?
>
> > Code: c7 85 90 01 00 00 00 00 00 00 e8 53 6e a2 fa 48 89 ef 48 83 c4 20 5b 5d 41 5c 41 5d 41 5e 41 5f e9 7d d6 ff ff e8 38 6e a2 fa <0f> 0b e9 46 ff ff ff 48 8b 7c 24 10 e8 87 00 ea fa e9 75 f7 ff ff
> > RSP: 0018:ffffc9000fe6fa18 EFLAGS: 00010293
> > RAX: 0000000000000000 RBX: dffffc0000000000 RCX: 0000000000000000
> > RDX: ffff888021b63a00 RSI: ffffffff86d66fe8 RDI: ffff88801cc200b0
> > RBP: ffff88801cc20000 R08: 0000000000000001 R09: 0000000000000001
> > R10: ffffffff817f1e08 R11: 0000000000000000 R12: ffff88801cc200d0
> > R13: ffff88801cc20120 R14: ffff88801cc200d0 R15: 0000000000000002
> > FS: 0000000000000000(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 0000001b2de25000 CR3: 000000004c9cd000 CR4: 00000000003506f0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> > <TASK>
> > vhost_vsock_dev_release+0x36e/0x4b0 drivers/vhost/vsock.c:771
> > __fput+0x286/0x9f0 fs/file_table.c:313
> > task_work_run+0xdd/0x1a0 kernel/task_work.c:164
> > exit_task_work include/linux/task_work.h:32 [inline]
> > do_exit+0xb29/0x2a30 kernel/exit.c:806
> > do_group_exit+0xd2/0x2f0 kernel/exit.c:935
> > get_signal+0x45a/0x2490 kernel/signal.c:2863
> > arch_do_signal_or_restart+0x2a9/0x1c40 arch/x86/kernel/signal.c:868
> > handle_signal_work kernel/entry/common.c:148 [inline]
> > exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
> > exit_to_user_mode_prepare+0x17d/0x290 kernel/entry/common.c:207
> > __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
> > syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
> > do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86
> > entry_SYSCALL_64_after_hwframe+0x44/0xae
> > RIP: 0033:0x7f4027a46481
> > Code: Unable to access opcode bytes at RIP 0x7f4027a46457.
> > RSP: 002b:00007f402808ba68 EFLAGS: 00000206 ORIG_RAX: 0000000000000038
> > RAX: fffffffffffffffc RBX: 00007f402622e700 RCX: 00007f4027a46481
> > RDX: 00007f402622e9d0 RSI: 00007f402622e2f0 RDI: 00000000003d0f00
> > RBP: 00007f402808bcb0 R08: 00007f402622e700 R09: 00007f402622e700
> > R10: 00007f402622e9d0 R11: 0000000000000206 R12: 00007f402808bb1e
> > R13: 00007f402808bb1f R14: 00007f402622e300 R15: 0000000000022000
> > </TASK>
> >
> >
> > ---
> > This report is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at [email protected].
> >
> > syzbot will keep track of this issue. See:
> > https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
> >
On 2/18/22 11:53 AM, Mike Christie wrote:
> On 2/17/22 3:48 AM, Stefano Garzarella wrote:
>>
>> On Thu, Feb 17, 2022 at 8:50 AM Michael S. Tsirkin <[email protected]> wrote:
>>>
>>> On Thu, Feb 17, 2022 at 03:39:48PM +0800, Jason Wang wrote:
>>>> On Thu, Feb 17, 2022 at 3:36 PM Michael S. Tsirkin <[email protected]> wrote:
>>>>>
>>>>> On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote:
>>>>>> On Thu, Feb 17, 2022 at 10:01 AM syzbot
>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> syzbot found the following issue on:
>>>>>>>
>>>>>>> HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
>>>>>>> git tree: upstream
>>>>>>> console output: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/log.txt?x=132e687c700000__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9YisMihn$
>>>>>>> kernel config: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9RjOhplp$
>>>>>>> dashboard link: https://urldefense.com/v3/__https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9bBf5tv0$
>>>>>>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>>>>>
>>>>>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>>>>>
>>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>>> Reported-by: [email protected]
>>>>>>>
>>>>>>> WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
>>>>>>> Modules linked in:
>>>>>>> CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
>>>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>>>>>> RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
>>>>>>
>>>>>> Probably a hint that we are missing a flush.
>>>>>>
>>>>>> Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release():
>>>>>>
>>>>>> static int vhost_vsock_stop(struct vhost_vsock *vsock)
>>>>>> {
>>>>>> size_t i;
>>>>>> int ret;
>>>>>>
>>>>>> mutex_lock(&vsock->dev.mutex);
>>>>>>
>>>>>> ret = vhost_dev_check_owner(&vsock->dev);
>>>>>> if (ret)
>>>>>> goto err;
>>>>>>
>>>>>> Where it could fail so the device is not actually stopped.
>>>>>>
>>>>>> I wonder if this is something related.
>>>>>>
>>>>>> Thanks
>>>>>
>>>>>
>>>>> But then if that is not the owner then no work should be running, right?
>>>>
>>>> Could it be a buggy user space that passes the fd to another process
>>>> and changes the owner just before the mutex_lock() above?
>>>>
>>>> Thanks
>>>
>>> Maybe, but can you be a bit more explicit? what is the set of
>>> conditions you see that can lead to this?
>>
>> I think the issue could be in the vhost_vsock_stop() as Jason mentioned,
>> but not related to fd passing, but related to the do_exit() function.
>>
>> Looking the stack trace, we are in exit_task_work(), that is called
>> after exit_mm(), so the vhost_dev_check_owner() can fail because
>> current->mm should be NULL at that point.
>>
>> It seems the fput work is queued by fput_many() in a worker queue, and
>> in some cases (maybe a lot of files opened?) the work is still queued
>> when we enter in do_exit().
> It normally happens if userspace doesn't do a close() when the VM
Just one clarification. I meant to say it "always" happens when userspace
doesn't do a close.
It doesn't have anything to do with lots of files or something like that.
We are actually running the vhost device's release function from
do_exit->task_work_run and so all those __fputs are done from something
like qemu's context (current == that process).
We are *not* hitting the case:
do_exit->exit_files->put_files_struct->filp_close->fput->fput_many
and then in there hitting the schedule_delayed_work path. For that
the last __fput would be done from a workqueue thread and so the current
pointer would point to a completely different thread.
> is shutdown and instead let's the kernel's reaper code cleanup. The qemu
> vhost-scsi code doesn't do a close() during shutdown and so this is our
> normal code path. It also happens when something like qemu is not
> gracefully shutdown like during a crash.
>
> So fire up qemu, start IO, then crash it or kill 9 it while IO is still
> running and you can hit it.
>
>>
>> That said, I don't know if we can simply remove that check in
>> vhost_vsock_stop(), or check if current->mm is NULL, to understand if
>> the process is exiting.
>>
>
> Should the caller do the vhost_dev_check_owner or tell vhost_vsock_stop
> when to check?
>
> - vhost_vsock_dev_ioctl always wants to check for ownership right?
>
> - For vhost_vsock_dev_release ownership doesn't matter because we
> always want to clean up or it doesn't hurt too much.
>
> For the case where we just do open then close and no ioctls then
> running vhost_vq_set_backend in vhost_vsock_stop is just a minor
> hit of extra work. If we've done ioctls, but are now in
> vhost_vsock_dev_release then we know for the graceful and ungraceful
> case that nothing is going to be accessing this device in the future
> and it's getting completely freed so we must completely clean it up.
>
>
>
>
>
> _______________________________________________
> Virtualization mailing list
> [email protected]
> https://lists.linuxfoundation.org/mailman/listinfo/virtualization
On 2/17/22 3:48 AM, Stefano Garzarella wrote:
>
> On Thu, Feb 17, 2022 at 8:50 AM Michael S. Tsirkin <[email protected]> wrote:
>>
>> On Thu, Feb 17, 2022 at 03:39:48PM +0800, Jason Wang wrote:
>>> On Thu, Feb 17, 2022 at 3:36 PM Michael S. Tsirkin <[email protected]> wrote:
>>>>
>>>> On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote:
>>>>> On Thu, Feb 17, 2022 at 10:01 AM syzbot
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> syzbot found the following issue on:
>>>>>>
>>>>>> HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
>>>>>> git tree: upstream
>>>>>> console output: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/log.txt?x=132e687c700000__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9YisMihn$
>>>>>> kernel config: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9RjOhplp$
>>>>>> dashboard link: https://urldefense.com/v3/__https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9bBf5tv0$
>>>>>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>>>>
>>>>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>>>>
>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>> Reported-by: [email protected]
>>>>>>
>>>>>> WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
>>>>>> Modules linked in:
>>>>>> CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
>>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>>>>> RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
>>>>>
>>>>> Probably a hint that we are missing a flush.
>>>>>
>>>>> Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release():
>>>>>
>>>>> static int vhost_vsock_stop(struct vhost_vsock *vsock)
>>>>> {
>>>>> size_t i;
>>>>> int ret;
>>>>>
>>>>> mutex_lock(&vsock->dev.mutex);
>>>>>
>>>>> ret = vhost_dev_check_owner(&vsock->dev);
>>>>> if (ret)
>>>>> goto err;
>>>>>
>>>>> Where it could fail so the device is not actually stopped.
>>>>>
>>>>> I wonder if this is something related.
>>>>>
>>>>> Thanks
>>>>
>>>>
>>>> But then if that is not the owner then no work should be running, right?
>>>
>>> Could it be a buggy user space that passes the fd to another process
>>> and changes the owner just before the mutex_lock() above?
>>>
>>> Thanks
>>
>> Maybe, but can you be a bit more explicit? what is the set of
>> conditions you see that can lead to this?
>
> I think the issue could be in the vhost_vsock_stop() as Jason mentioned,
> but not related to fd passing, but related to the do_exit() function.
>
> Looking the stack trace, we are in exit_task_work(), that is called
> after exit_mm(), so the vhost_dev_check_owner() can fail because
> current->mm should be NULL at that point.
>
> It seems the fput work is queued by fput_many() in a worker queue, and
> in some cases (maybe a lot of files opened?) the work is still queued
> when we enter in do_exit().
It normally happens if userspace doesn't do a close() when the VM
is shutdown and instead let's the kernel's reaper code cleanup. The qemu
vhost-scsi code doesn't do a close() during shutdown and so this is our
normal code path. It also happens when something like qemu is not
gracefully shutdown like during a crash.
So fire up qemu, start IO, then crash it or kill 9 it while IO is still
running and you can hit it.
>
> That said, I don't know if we can simply remove that check in
> vhost_vsock_stop(), or check if current->mm is NULL, to understand if
> the process is exiting.
>
Should the caller do the vhost_dev_check_owner or tell vhost_vsock_stop
when to check?
- vhost_vsock_dev_ioctl always wants to check for ownership right?
- For vhost_vsock_dev_release ownership doesn't matter because we
always want to clean up or it doesn't hurt too much.
For the case where we just do open then close and no ioctls then
running vhost_vq_set_backend in vhost_vsock_stop is just a minor
hit of extra work. If we've done ioctls, but are now in
vhost_vsock_dev_release then we know for the graceful and ungraceful
case that nothing is going to be accessing this device in the future
and it's getting completely freed so we must completely clean it up.
On Fri, Feb 18, 2022 at 12:23:10PM -0600, Mike Christie wrote:
>On 2/18/22 11:53 AM, Mike Christie wrote:
>> On 2/17/22 3:48 AM, Stefano Garzarella wrote:
>>>
>>> On Thu, Feb 17, 2022 at 8:50 AM Michael S. Tsirkin <[email protected]> wrote:
>>>>
>>>> On Thu, Feb 17, 2022 at 03:39:48PM +0800, Jason Wang wrote:
>>>>> On Thu, Feb 17, 2022 at 3:36 PM Michael S. Tsirkin <[email protected]> wrote:
>>>>>>
>>>>>> On Thu, Feb 17, 2022 at 03:34:13PM +0800, Jason Wang wrote:
>>>>>>> On Thu, Feb 17, 2022 at 10:01 AM syzbot
>>>>>>> <[email protected]> wrote:
>>>>>>>>
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> syzbot found the following issue on:
>>>>>>>>
>>>>>>>> HEAD commit: c5d9ae265b10 Merge tag 'for-linus' of git://git.kernel.org..
>>>>>>>> git tree: upstream
>>>>>>>> console output: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/log.txt?x=132e687c700000__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9YisMihn$
>>>>>>>> kernel config: https://urldefense.com/v3/__https://syzkaller.appspot.com/x/.config?x=a78b064590b9f912__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9RjOhplp$
>>>>>>>> dashboard link: https://urldefense.com/v3/__https://syzkaller.appspot.com/bug?extid=1e3ea63db39f2b4440e0__;!!ACWV5N9M2RV99hQ!fLqQTyosTBm7FK50IVmo0ozZhsvUEPFCivEHFDGU3GjlAHDWl07UdOa-t9uf9bBf5tv0$
>>>>>>>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>>>>>>>
>>>>>>>> Unfortunately, I don't have any reproducer for this issue yet.
>>>>>>>>
>>>>>>>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>>>>>>>> Reported-by: [email protected]
>>>>>>>>
>>>>>>>> WARNING: CPU: 1 PID: 10828 at drivers/vhost/vhost.c:715 vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
>>>>>>>> Modules linked in:
>>>>>>>> CPU: 0 PID: 10828 Comm: syz-executor.0 Not tainted 5.17.0-rc4-syzkaller-00051-gc5d9ae265b10 #0
>>>>>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>>>>>>>> RIP: 0010:vhost_dev_cleanup+0x8b8/0xbc0 drivers/vhost/vhost.c:715
>>>>>>>
>>>>>>> Probably a hint that we are missing a flush.
>>>>>>>
>>>>>>> Looking at vhost_vsock_stop() that is called by vhost_vsock_dev_release():
>>>>>>>
>>>>>>> static int vhost_vsock_stop(struct vhost_vsock *vsock)
>>>>>>> {
>>>>>>> size_t i;
>>>>>>> int ret;
>>>>>>>
>>>>>>> mutex_lock(&vsock->dev.mutex);
>>>>>>>
>>>>>>> ret = vhost_dev_check_owner(&vsock->dev);
>>>>>>> if (ret)
>>>>>>> goto err;
>>>>>>>
>>>>>>> Where it could fail so the device is not actually stopped.
>>>>>>>
>>>>>>> I wonder if this is something related.
>>>>>>>
>>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>> But then if that is not the owner then no work should be running, right?
>>>>>
>>>>> Could it be a buggy user space that passes the fd to another process
>>>>> and changes the owner just before the mutex_lock() above?
>>>>>
>>>>> Thanks
>>>>
>>>> Maybe, but can you be a bit more explicit? what is the set of
>>>> conditions you see that can lead to this?
>>>
>>> I think the issue could be in the vhost_vsock_stop() as Jason mentioned,
>>> but not related to fd passing, but related to the do_exit() function.
>>>
>>> Looking the stack trace, we are in exit_task_work(), that is called
>>> after exit_mm(), so the vhost_dev_check_owner() can fail because
>>> current->mm should be NULL at that point.
>>>
>>> It seems the fput work is queued by fput_many() in a worker queue, and
>>> in some cases (maybe a lot of files opened?) the work is still queued
>>> when we enter in do_exit().
>> It normally happens if userspace doesn't do a close() when the VM
>
>Just one clarification. I meant to say it "always" happens when userspace
>doesn't do a close.
>
>It doesn't have anything to do with lots of files or something like that.
>We are actually running the vhost device's release function from
>do_exit->task_work_run and so all those __fputs are done from something
>like qemu's context (current == that process).
>
>We are *not* hitting the case:
>
>do_exit->exit_files->put_files_struct->filp_close->fput->fput_many
>
>and then in there hitting the schedule_delayed_work path. For that
>the last __fput would be done from a workqueue thread and so the current
>pointer would point to a completely different thread.
>
>
>
>> is shutdown and instead let's the kernel's reaper code cleanup. The qemu
>> vhost-scsi code doesn't do a close() during shutdown and so this is our
>> normal code path. It also happens when something like qemu is not
>> gracefully shutdown like during a crash.
>>
>> So fire up qemu, start IO, then crash it or kill 9 it while IO is still
>> running and you can hit it.
Thank you very much for this detailed explanation!
>>
>>>
>>> That said, I don't know if we can simply remove that check in
>>> vhost_vsock_stop(), or check if current->mm is NULL, to understand if
>>> the process is exiting.
>>>
>>
>> Should the caller do the vhost_dev_check_owner or tell vhost_vsock_stop
>> when to check?
>>
>> - vhost_vsock_dev_ioctl always wants to check for ownership right?
>>
>> - For vhost_vsock_dev_release ownership doesn't matter because we
>> always want to clean up or it doesn't hurt too much.
>>
>> For the case where we just do open then close and no ioctls then
>> running vhost_vq_set_backend in vhost_vsock_stop is just a minor
>> hit of extra work. If we've done ioctls, but are now in
>> vhost_vsock_dev_release then we know for the graceful and ungraceful
>> case that nothing is going to be accessing this device in the future
>> and it's getting completely freed so we must completely clean it up.
Yep, I think the easiest way is to add a parameter to vhost_vsock_stop()
to tell when to call vhost_dev_check_owner() or not. This is because
dev->mm is protected by dev->mutex, acquired in vhost_vsock_stop().
I will send a patch right away, it would be great if you can review.
Thanks,
Stefano