Hello,
syzbot found the following issue on:
HEAD commit: 933174ae28ba Merge tag 'spi-fix-v6.4-rc3' of git://git.ker..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=138d4ae5280000
kernel config: https://syzkaller.appspot.com/x/.config?x=f389ffdf4e9ba3f0
dashboard link: https://syzkaller.appspot.com/bug?extid=d0d442c22fa8db45ff0e
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/21a81b8c2660/disk-933174ae.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/b4951d89e238/vmlinux-933174ae.xz
kernel image: https://storage.googleapis.com/syzbot-assets/21eb405303cc/bzImage-933174ae.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
CPU: 0 PID: 29845 Comm: syz-executor.4 Not tainted 6.4.0-rc3-syzkaller-00032-g933174ae28ba #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/16/2023
RIP: 0010:vhost_work_queue drivers/vhost/vhost.c:259 [inline]
RIP: 0010:vhost_work_queue+0xfc/0x150 drivers/vhost/vhost.c:248
Code: 00 00 fc ff df 48 89 da 48 c1 ea 03 80 3c 02 00 75 56 48 b8 00 00 00 00 00 fc ff df 48 8b 1b 48 8d 7b 70 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 42 48 8b 7b 70 e8 95 9e ae f9 5b 5d 41 5c 41 5d e9
RSP: 0018:ffffc9000333faf8 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc9000d84d000
RDX: 000000000000000e RSI: ffffffff841221d7 RDI: 0000000000000070
RBP: ffff88804b6b95b0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88804b6b00b0
R13: 0000000000000000 R14: ffff88804b6b95e0 R15: ffff88804b6b95c8
FS: 00007f3b445ec700(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2e423000 CR3: 000000005d734000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 000000000000003b DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
vhost_transport_send_pkt+0x268/0x520 drivers/vhost/vsock.c:288
virtio_transport_send_pkt_info+0x54c/0x820 net/vmw_vsock/virtio_transport_common.c:250
virtio_transport_connect+0xb1/0xf0 net/vmw_vsock/virtio_transport_common.c:813
vsock_connect+0x37f/0xcd0 net/vmw_vsock/af_vsock.c:1414
__sys_connect_file+0x153/0x1a0 net/socket.c:2003
__sys_connect+0x165/0x1a0 net/socket.c:2020
__do_sys_connect net/socket.c:2030 [inline]
__se_sys_connect net/socket.c:2027 [inline]
__x64_sys_connect+0x73/0xb0 net/socket.c:2027
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f3b4388c169
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f3b445ec168 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 00007f3b439ac050 RCX: 00007f3b4388c169
RDX: 0000000000000010 RSI: 0000000020000140 RDI: 0000000000000004
RBP: 00007f3b438e7ca1 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f3b43acfb1f R14: 00007f3b445ec300 R15: 0000000000022000
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:vhost_work_queue drivers/vhost/vhost.c:259 [inline]
RIP: 0010:vhost_work_queue+0xfc/0x150 drivers/vhost/vhost.c:248
Code: 00 00 fc ff df 48 89 da 48 c1 ea 03 80 3c 02 00 75 56 48 b8 00 00 00 00 00 fc ff df 48 8b 1b 48 8d 7b 70 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 42 48 8b 7b 70 e8 95 9e ae f9 5b 5d 41 5c 41 5d e9
RSP: 0018:ffffc9000333faf8 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc9000d84d000
RDX: 000000000000000e RSI: ffffffff841221d7 RDI: 0000000000000070
RBP: ffff88804b6b95b0 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88804b6b00b0
R13: 0000000000000000 R14: ffff88804b6b95e0 R15: ffff88804b6b95c8
FS: 00007f3b445ec700(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2e428000 CR3: 000000005d734000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 000000000000003b DR6: 00000000ffff0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess), 5 bytes skipped:
0: 48 89 da mov %rbx,%rdx
3: 48 c1 ea 03 shr $0x3,%rdx
7: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1)
b: 75 56 jne 0x63
d: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
14: fc ff df
17: 48 8b 1b mov (%rbx),%rbx
1a: 48 8d 7b 70 lea 0x70(%rbx),%rdi
1e: 48 89 fa mov %rdi,%rdx
21: 48 c1 ea 03 shr $0x3,%rdx
* 25: 80 3c 02 00 cmpb $0x0,(%rdx,%rax,1) <-- trapping instruction
29: 75 42 jne 0x6d
2b: 48 8b 7b 70 mov 0x70(%rbx),%rdi
2f: e8 95 9e ae f9 callq 0xf9ae9ec9
34: 5b pop %rbx
35: 5d pop %rbp
36: 41 5c pop %r12
38: 41 5d pop %r13
3a: e9 .byte 0xe9
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to change bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
On Tue, May 30, 2023 at 12:30:06AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 933174ae28ba Merge tag 'spi-fix-v6.4-rc3' of git://git.ker..
> dashboard link: https://syzkaller.appspot.com/bug?extid=d0d442c22fa8db45ff0e
>
> general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
> RIP: 0010:vhost_work_queue drivers/vhost/vhost.c:259 [inline]
> RIP: 0010:vhost_work_queue+0xfc/0x150 drivers/vhost/vhost.c:248
> [...]
> Call Trace:
> <TASK>
> vhost_transport_send_pkt+0x268/0x520 drivers/vhost/vsock.c:288
> virtio_transport_send_pkt_info+0x54c/0x820 net/vmw_vsock/virtio_transport_common.c:250
> virtio_transport_connect+0xb1/0xf0 net/vmw_vsock/virtio_transport_common.c:813
> vsock_connect+0x37f/0xcd0 net/vmw_vsock/af_vsock.c:1414
> [...]
Stefano, Stefan, take a look?
On Tue, May 30, 2023 at 1:24 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Tue, May 30, 2023 at 12:30:06AM -0700, syzbot wrote:
> > [...]
>
> Stefano, Stefan, take a look?
I'll take a look.
From a first glance, it looks like an issue when we call vhost_work_queue().
@Mike, does that ring any bells since you recently looked at that code?
Thanks,
Stefano
On Tue, May 30, 2023 at 3:44 PM Stefano Garzarella <[email protected]> wrote:
>
> On Tue, May 30, 2023 at 1:24 PM Michael S. Tsirkin <[email protected]> wrote:
> >
> > On Tue, May 30, 2023 at 12:30:06AM -0700, syzbot wrote:
> > > [...]
> >
> > Stefano, Stefan, take a look?
>
> I'll take a look.
>
> From a first glance, it looks like an issue when we call vhost_work_queue().
> @Mike, does that ring any bells since you recently looked at that code?
I think it is partially related to commit 6e890c5d5021 ("vhost: use
vhost_tasks for worker threads") and commit 1a5f8090c6de ("vhost: move
worker thread fields to new struct"). Maybe those commits just
highlighted an issue that already existed.
In this case I think there is a race between vhost_worker_create() and
vhost_transport_send_pkt(). vhost_transport_send_pkt() calls
vhost_work_queue() without holding the vhost device mutex, so it can run
after vhost_worker_create() has set dev->worker but before it has set
worker->vtsk.
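To make the window concrete, the interleaving is roughly this (just a
simplified sketch of the 6.4-rc3 paths quoted above, not the exact source):

	/* VHOST_SET_OWNER path, simplified from vhost_worker_create() */
	dev->worker = worker;		/* published before the task exists */
	...
	vtsk = vhost_task_create(vhost_worker, worker, name);
	...
	worker->vtsk = vtsk;		/* too late for a concurrent reader */

	/* connect() -> vhost_transport_send_pkt() -> vhost_work_queue(),
	 * called without dev->mutex held
	 */
	if (!dev->worker)		/* sees the worker published above */
		return;
	...
	llist_add(&work->node, &dev->worker->work_list);
	wake_up_process(dev->worker->vtsk->task);	/* vtsk still NULL -> the GPF in the report */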
Before commit 1a5f8090c6de ("vhost: move worker thread fields to new
struct"), dev->worker was set only when everything was ready, but maybe that
was just a case of the instructions not being reordered, and the problem
could still occur.
This happens because VHOST_VSOCK_SET_GUEST_CID can be called before
VHOST_SET_OWNER and then vhost_transport_send_pkt() finds the guest's
CID and tries to send it a packet.
But is it correct to handle VHOST_VSOCK_SET_GUEST_CID before
VHOST_SET_OWNER?
QEMU always calls VHOST_SET_OWNER before anything else, but I don't know
about the other VMMs.
So, could it be an acceptable solution to reject
VHOST_VSOCK_SET_GUEST_CID before VHOST_SET_OWNER?
I mean something like this:
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 6578db78f0ae..33fc0805d189 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -829,7 +829,12 @@ static long vhost_vsock_dev_ioctl(struct file *f, unsigned int ioctl,
 	case VHOST_VSOCK_SET_GUEST_CID:
 		if (copy_from_user(&guest_cid, argp, sizeof(guest_cid)))
 			return -EFAULT;
-		return vhost_vsock_set_cid(vsock, guest_cid);
+		mutex_lock(&vsock->dev.mutex);
+		r = vhost_dev_check_owner(&vsock->dev);
+		if (!r)
+			r = vhost_vsock_set_cid(vsock, guest_cid);
+		mutex_unlock(&vsock->dev.mutex);
+		return r;
 	case VHOST_VSOCK_SET_RUNNING:
 		if (copy_from_user(&start, argp, sizeof(start)))
 			return -EFAULT;
In the documentation, we say:
/* Set current process as the (exclusive) owner of this file descriptor. This
* must be called before any other vhost command. Further calls to
* VHOST_OWNER_SET fail until VHOST_OWNER_RESET is called. */
This should prevent the issue, but it could break a misbehaving userspace.
Other ideas I have in mind are:
- hold vsock->dev.mutex while calling vhost_work_queue() (performance
  degradation? see the rough sketch below)
- use RCU to protect dev->worker
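For the first idea, I mean roughly something like this in the path that
queues the work (just a sketch, reusing the existing send_pkt work item of
vhost_vsock; taking dev.mutex here is exactly the part I worry about for
performance):

	/* e.g. in vhost_transport_send_pkt(), when kicking the worker */
	mutex_lock(&vsock->dev.mutex);
	vhost_work_queue(&vsock->dev, &vsock->send_pkt_work);
	mutex_unlock(&vsock->dev.mutex);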
WDYT?
Thanks,
Stefano
On 5/30/23 10:58 AM, Mike Christie wrote:
> On 5/30/23 8:44 AM, Stefano Garzarella wrote:
>>
>> From a first glance, it looks like an issue when we call vhost_work_queue().
>> @Mike, does that ring any bells since you recently looked at that code?
>
> I see the bug. needed to have set the dev->worker after setting worker->vtsk
> like below:
>
>
> diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> index a92af08e7864..7bd95984a501 100644
> --- a/drivers/vhost/vhost.c
> +++ b/drivers/vhost/vhost.c
> @@ -564,7 +564,6 @@ static int vhost_worker_create(struct vhost_dev *dev)
> if (!worker)
> return -ENOMEM;
>
> - dev->worker = worker;
> worker->kcov_handle = kcov_common_handle();
> init_llist_head(&worker->work_list);
> snprintf(name, sizeof(name), "vhost-%d", current->pid);
> @@ -576,6 +575,7 @@ static int vhost_worker_create(struct vhost_dev *dev)
> }
>
> worker->vtsk = vtsk;
Shoot, oh wait, I think I needed an smp_wmb() to always make sure worker->vtsk
is set before dev->worker, or vhost_work_queue could still end up seeing
dev->worker set before worker->vtsk, right?
> + dev->worker = worker;
> vhost_task_start(vtsk);
> return 0;
>
On 5/30/23 8:44 AM, Stefano Garzarella wrote:
>
> From a first glance, it looks like an issue when we call vhost_work_queue().
> @Mike, does that ring any bells since you recently looked at that code?
I see the bug. I needed to set dev->worker after setting worker->vtsk,
like below:
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index a92af08e7864..7bd95984a501 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -564,7 +564,6 @@ static int vhost_worker_create(struct vhost_dev *dev)
 	if (!worker)
 		return -ENOMEM;
 
-	dev->worker = worker;
 	worker->kcov_handle = kcov_common_handle();
 	init_llist_head(&worker->work_list);
 	snprintf(name, sizeof(name), "vhost-%d", current->pid);
@@ -576,6 +575,7 @@ static int vhost_worker_create(struct vhost_dev *dev)
 	}
 
 	worker->vtsk = vtsk;
+	dev->worker = worker;
 	vhost_task_start(vtsk);
 	return 0;
On 5/30/23 11:00 AM, Stefano Garzarella wrote:
> I think it is partially related to commit 6e890c5d5021 ("vhost: use
> vhost_tasks for worker threads") and commit 1a5f8090c6de ("vhost: move
> worker thread fields to new struct"). Maybe that commits just
> highlighted the issue and it was already existing.
See my mail about the crash. Agree with your analysis about worker->vtsk
not being set yet. It's a bug from my commit, where I should not have set
it so early, or I should be checking for
if (dev->worker && worker->vtsk)
instead of
if (dev->worker)
One question, though, about the behavior before my commit and what we want
going forward. Before that patch we would just drop work if
vhost_work_queue was called before VHOST_SET_OWNER. Was that correct/expected?
The call to vhost_work_queue in vhost_vsock_start was only seeing the
works queued after VHOST_SET_OWNER. Did you want works queued before that?
On Tue, May 30, 2023 at 11:01:11AM -0500, Mike Christie wrote:
>On 5/30/23 10:58 AM, Mike Christie wrote:
>> On 5/30/23 8:44 AM, Stefano Garzarella wrote:
>>>
>>> From a first glance, it looks like an issue when we call vhost_work_queue().
>>> @Mike, does that ring any bells since you recently looked at that code?
>>
>> I see the bug. needed to have set the dev->worker after setting worker->vtsk
Yes, I came to the same conclusion (see my email sent at the same time
:-).
>> like below:
>> [...]
>
>Shoot, oh wait, I think I needed a smp_wmb to always make sure worker->vtask
>is set before dev->worker or vhost_work_queue could still end up seeing
>dev->worker set before worker->vtsk right?
But should we pair smp_wmb() with an smp_rmb() wherever we check
dev->worker?
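I mean something like this, just to sketch the pairing (not a tested patch):

	/* vhost_worker_create(): writer side */
	worker->vtsk = vtsk;
	/* make sure vtsk is visible before the worker pointer is published */
	smp_wmb();
	dev->worker = worker;

	/* vhost_work_queue() and similar readers */
	struct vhost_worker *worker = dev->worker;

	if (!worker)
		return;
	/* pairs with the smp_wmb() above, before dereferencing worker->vtsk */
	smp_rmb();
	...
	wake_up_process(worker->vtsk->task);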
Thanks,
Stefano
On Tue, May 30, 2023 at 11:09:09AM -0500, Mike Christie wrote:
>On 5/30/23 11:00 AM, Stefano Garzarella wrote:
>> I think it is partially related to commit 6e890c5d5021 ("vhost: use
>> vhost_tasks for worker threads") and commit 1a5f8090c6de ("vhost: move
>> worker thread fields to new struct"). Maybe that commits just
>> highlighted the issue and it was already existing.
>
>See my mail about the crash. Agree with your analysis about worker->vtsk
>not being set yet. It's a bug from my commit where I should have not set
>it so early or I should be checking for
>
>if (dev->worker && worker->vtsk)
>
>instead of
>
>if (dev->worker)
Yes, though in my opinion the problem may persist depending on how the
instructions are reordered.
Should we protect dev->worker with RCU to be safe?
>
>One question about the behavior before my commit though and what we want in
>the end going forward. Before that patch we would just drop work if
>vhost_work_queue was called before VHOST_SET_OWNER. Was that correct/expected?
I think so, since we ask the guest to call VHOST_SET_OWNER, before any
other command.
>
>The call to vhost_work_queue in vhost_vsock_start was only seeing the
>works queued after VHOST_SET_OWNER. Did you want works queued before that?
>
Yes, for example if an application in the host has tried to connect and
is waiting for a timeout, we already have work queued up to flush as
soon as we start the device. (See commit 0b841030625c ("vhost: vsock:
kick send_pkt worker once device is started")).
Thanks,
Stefano
On 5/30/23 11:17 AM, Stefano Garzarella wrote:
> On Tue, May 30, 2023 at 11:09:09AM -0500, Mike Christie wrote:
>> On 5/30/23 11:00 AM, Stefano Garzarella wrote:
>>> I think it is partially related to commit 6e890c5d5021 ("vhost: use
>>> vhost_tasks for worker threads") and commit 1a5f8090c6de ("vhost: move
>>> worker thread fields to new struct"). Maybe that commits just
>>> highlighted the issue and it was already existing.
>>
>> See my mail about the crash. Agree with your analysis about worker->vtsk
>> not being set yet. It's a bug from my commit where I should have not set
>> it so early or I should be checking for
>>
>> if (dev->worker && worker->vtsk)
>>
>> instead of
>>
>> if (dev->worker)
>
> Yes, though, in my opinion the problem may persist depending on how the
> instructions are reordered.
Ah ok.
>
> Should we protect dev->worker() with an RCU to be safe?
For those multiple worker patchsets, Jason had asked me about supporting the
case where we don't have a worker while we are swapping workers around. To do
that I had added RCU around dev->worker. I removed it in later patchsets
because I didn't think anyone would use it.
RCU would work for your case and for what Jason had requested.
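The idea was roughly like this (just a sketch of the approach, not the actual
patches; dev->worker would become __rcu annotated):

	/* create: publish the fully initialized worker */
	worker->vtsk = vtsk;
	rcu_assign_pointer(dev->worker, worker);

	/* queue path */
	rcu_read_lock();
	worker = rcu_dereference(dev->worker);
	if (worker) {
		llist_add(&work->node, &worker->work_list);
		wake_up_process(worker->vtsk->task);
	}
	rcu_read_unlock();

	/* free: unpublish, wait for readers, then stop and free */
	RCU_INIT_POINTER(dev->worker, NULL);
	synchronize_rcu();
	vhost_task_stop(worker->vtsk);
	kfree(worker);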
On Tue, May 30, 2023 at 6:30 PM <[email protected]> wrote:
>
> On 5/30/23 11:17 AM, Stefano Garzarella wrote:
> > [...]
> >
> > Yes, though, in my opinion the problem may persist depending on how the
> > instructions are reordered.
>
> Ah ok.
>
> >
> > Should we protect dev->worker() with an RCU to be safe?
>
> For those multiple worker patchsets Jason had asked me about supporting
> where we don't have a worker while we are swapping workers around. To do
> that I had added rcu around the dev->worker. I removed it in later patchsets
> because I didn't think anyone would use it.
>
> rcu would work for your case and for what Jason had requested.
Yeah, so you already have some patches?
Do you want to send them to solve this problem?
Thanks,
Stefano
On 5/31/23 2:27 AM, Stefano Garzarella wrote:
> On Tue, May 30, 2023 at 6:30 PM <[email protected]> wrote:
>> [...]
>>
>> For those multiple worker patchsets Jason had asked me about supporting
>> where we don't have a worker while we are swapping workers around. To do
>> that I had added rcu around the dev->worker. I removed it in later patchsets
>> because I didn't think anyone would use it.
>>
>> rcu would work for your case and for what Jason had requested.
>
> Yeah, so you already have some patches?
>
> Do you want to send it to solve this problem?
>
Yeah, I'll break them out and send them later today when I can retest
rebased patches.
On 5/31/23 10:15 AM, Mike Christie wrote:
>>> rcu would work for your case and for what Jason had requested.
>> Yeah, so you already have some patches?
>>
>> Do you want to send it to solve this problem?
>>
> Yeah, I'll break them out and send them later today when I can retest
> rebased patches.
>
Just one question: do you core vhost developers consider RCU more complex,
or switching to READ_ONCE/WRITE_ONCE? I am asking because, for this immediate
regression fix, we could just switch to the latter, like below, to fix the
crash, if we think that is simpler.
I think RCU is just a little more complex/invasive because it will add the
extra synchronize_rcu() calls.
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index a92af08e7864..03fd47a22a73 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -235,7 +235,7 @@ void vhost_dev_flush(struct vhost_dev *dev)
{
struct vhost_flush_struct flush;
- if (dev->worker) {
+ if (READ_ONCE(dev->worker.vtsk)) {
init_completion(&flush.wait_event);
vhost_work_init(&flush.work, vhost_flush_work);
@@ -247,7 +247,9 @@ EXPORT_SYMBOL_GPL(vhost_dev_flush);
void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work)
{
- if (!dev->worker)
+ struct vhost_task *vtsk = READ_ONCE(dev->worker.vtsk);
+
+ if (!vtsk)
return;
if (!test_and_set_bit(VHOST_WORK_QUEUED, &work->flags)) {
@@ -255,8 +257,8 @@ void vhost_work_queue(struct vhost_dev *dev, struct vhost_work *work)
* sure it was not in the list.
* test_and_set_bit() implies a memory barrier.
*/
- llist_add(&work->node, &dev->worker->work_list);
- wake_up_process(dev->worker->vtsk->task);
+ llist_add(&work->node, &dev->worker.work_list);
+ wake_up_process(vtsk->task);
}
}
EXPORT_SYMBOL_GPL(vhost_work_queue);
@@ -264,7 +266,7 @@ EXPORT_SYMBOL_GPL(vhost_work_queue);
/* A lockless hint for busy polling code to exit the loop */
bool vhost_has_work(struct vhost_dev *dev)
{
- return dev->worker && !llist_empty(&dev->worker->work_list);
+ return !llist_empty(&dev->worker.work_list);
}
EXPORT_SYMBOL_GPL(vhost_has_work);
@@ -468,7 +470,7 @@ void vhost_dev_init(struct vhost_dev *dev,
dev->umem = NULL;
dev->iotlb = NULL;
dev->mm = NULL;
- dev->worker = NULL;
+ memset(&dev->worker, 0, sizeof(dev->worker));
dev->iov_limit = iov_limit;
dev->weight = weight;
dev->byte_weight = byte_weight;
@@ -542,46 +544,38 @@ static void vhost_detach_mm(struct vhost_dev *dev)
static void vhost_worker_free(struct vhost_dev *dev)
{
- struct vhost_worker *worker = dev->worker;
+ struct vhost_task *vtsk = READ_ONCE(dev->worker.vtsk);
- if (!worker)
+ if (!vtsk)
return;
- dev->worker = NULL;
- WARN_ON(!llist_empty(&worker->work_list));
- vhost_task_stop(worker->vtsk);
- kfree(worker);
+ vhost_task_stop(vtsk);
+ WARN_ON(!llist_empty(&dev->worker.work_list));
+ WRITE_ONCE(dev->worker.vtsk, NULL);
}
static int vhost_worker_create(struct vhost_dev *dev)
{
- struct vhost_worker *worker;
struct vhost_task *vtsk;
char name[TASK_COMM_LEN];
int ret;
- worker = kzalloc(sizeof(*worker), GFP_KERNEL_ACCOUNT);
- if (!worker)
- return -ENOMEM;
-
- dev->worker = worker;
- worker->kcov_handle = kcov_common_handle();
- init_llist_head(&worker->work_list);
+ dev->worker.kcov_handle = kcov_common_handle();
+ init_llist_head(&dev->worker.work_list);
snprintf(name, sizeof(name), "vhost-%d", current->pid);
- vtsk = vhost_task_create(vhost_worker, worker, name);
+ vtsk = vhost_task_create(vhost_worker, &dev->worker, name);
if (!vtsk) {
ret = -ENOMEM;
goto free_worker;
}
- worker->vtsk = vtsk;
+ WRITE_ONCE(dev->worker.vtsk, vtsk);
vhost_task_start(vtsk);
return 0;
free_worker:
- kfree(worker);
- dev->worker = NULL;
+ WRITE_ONCE(dev->worker.vtsk, NULL);
return ret;
}
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 0308638cdeee..305ec8593d46 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -154,7 +154,7 @@ struct vhost_dev {
struct vhost_virtqueue **vqs;
int nvqs;
struct eventfd_ctx *log_ctx;
- struct vhost_worker *worker;
+ struct vhost_worker worker;
struct vhost_iotlb *umem;
struct vhost_iotlb *iotlb;
spinlock_t iotlb_lock;
On Wed, May 31, 2023 at 11:27:12AM -0500, Mike Christie wrote:
>On 5/31/23 10:15 AM, Mike Christie wrote:
>>>> rcu would work for your case and for what Jason had requested.
>>> Yeah, so you already have some patches?
>>>
>>> Do you want to send it to solve this problem?
>>>
>> Yeah, I'll break them out and send them later today when I can retest
>> rebased patches.
>>
>
>Just one question. Do you core vhost developers consider RCU more complex
>or switching to READ_ONCE/WRITE_ONCE? I am asking because for this immediate
>regression fix we could just switch to the latter like below to just fix
>the crash if we think that is more simple.
>
>I think RCU is just a little more complex/invasive because it will have the
>extra synchronize_rcu calls.
Yes, you may be right; in this case we should just need
READ_ONCE/WRITE_ONCE if dev->worker is no longer a pointer.
>
>[...]
>
>@@ -542,46 +544,38 @@ static void vhost_detach_mm(struct vhost_dev *dev)
>
> static void vhost_worker_free(struct vhost_dev *dev)
> {
>- struct vhost_worker *worker = dev->worker;
>+ struct vhost_task *vtsk = READ_ONCE(dev->worker.vtsk);
>
>- if (!worker)
>+ if (!vtsk)
> return;
>
>- dev->worker = NULL;
>- WARN_ON(!llist_empty(&worker->work_list));
>- vhost_task_stop(worker->vtsk);
>- kfree(worker);
>+ vhost_task_stop(vtsk);
>+ WARN_ON(!llist_empty(&dev->worker.work_list));
>+ WRITE_ONCE(dev->worker.vtsk, NULL);
The patch LGTM, I just wonder if we should set dev->worker to zero here,
but maybe we don't need to.
Thanks,
Stefano
On 6/1/23 2:47 AM, Stefano Garzarella wrote:
>>
>> static void vhost_worker_free(struct vhost_dev *dev)
>> {
>> - struct vhost_worker *worker = dev->worker;
>> + struct vhost_task *vtsk = READ_ONCE(dev->worker.vtsk);
>>
>> - if (!worker)
>> + if (!vtsk)
>> return;
>>
>> - dev->worker = NULL;
>> - WARN_ON(!llist_empty(&worker->work_list));
>> - vhost_task_stop(worker->vtsk);
>> - kfree(worker);
>> + vhost_task_stop(vtsk);
>> + WARN_ON(!llist_empty(&dev->worker.work_list));
>> + WRITE_ONCE(dev->worker.vtsk, NULL);
>
> The patch LGTM, I just wonder if we should set dev->worker to zero here,
We might want to just set kcov_handle to zero for now.
In 6.3 and older, I think we could do:
1. vhost_dev_set_owner could successfully set dev->worker.
2. vhost_transport_send_pkt runs vhost_work_queue and sees worker
is set and adds the vhost_work to the work_list.
3. vhost_dev_set_owner fails in vhost_attach_cgroups, so we stop
the worker before the work can be run and set worker to NULL.
4. We clear kcov_handle and return.
We leave the work on the work_list.
5. Userspace can then retry vhost_dev_set_owner. If that works, then the
work gets executed ok eventually.
OR
Userspace can just close the device. vhost_vsock_dev_release would
eventually call vhost_dev_cleanup (vhost_dev_flush won't see a worker
so will just return), and that will hit the WARN_ON but we would
proceed ok.
If I do a memset of the worker, then if userspace were to retry
VHOST_SET_OWNER, we would lose the queued work since the work_list would
get zero'd. I think it's unlikely this ever happens, but you know best
so let me know if this is a real issue.
On Thu, Jun 01, 2023 at 11:33:09AM -0500, Mike Christie wrote:
>On 6/1/23 2:47 AM, Stefano Garzarella wrote:
>>> [...]
>>
>> The patch LGTM, I just wonder if we should set dev->worker to zero here,
>
>We might want to just set kcov_handle to zero for now.
>
>In 6.3 and older, I think we could do:
>
>1. vhost_dev_set_owner could successfully set dev->worker.
>2. vhost_transport_send_pkt runs vhost_work_queue and sees worker
>is set and adds the vhost_work to the work_list.
>3. vhost_dev_set_owner fails in vhost_attach_cgroups, so we stop
>the worker before the work can be run and set worker to NULL.
>4. We clear kcov_handle and return.
>
>We leave the work on the work_list.
>
>5. Userspace can then retry vhost_dev_set_owner. If that works, then the
>work gets executed ok eventually.
>
>OR
>
>Userspace can just close the device. vhost_vsock_dev_release would
>eventually call vhost_dev_cleanup (vhost_dev_flush won't see a worker
>so will just return), and that will hit the WARN_ON but we would
>proceed ok.
>
>If I do a memset of the worker, then if userspace were to retry
>VHOST_SET_OWNER, we would lose the queued work since the work_list would
>get zero'd. I think it's unlikely this ever happens, but you know best
>so let me know if this a real issue.
>
I don't think it's a problem; though, you're right, we could hide the
warning and thus future bugs, so it's better as you proposed.
Thanks,
Stefano