2021-04-12 15:48:18

by Hao Sun

Subject: general protection fault in kvm_vm_ioctl_unregister_coalesced_mmio

Hi

While using Healer (https://github.com/SunHao-0/healer/tree/dev) to fuzz
the Linux kernel, I found the following bug, which is triggered when fault
injection is enabled.

commit: 52e44129fba5cfc4e351fdb5e45849afc74d9a53
version: Linux 5.12.0-rc6+
git tree: upstream
C reproduction program, kernel config, and full log can be found in
the attached file.

Fault injection log:
==============================================
FAULT_INJECTION: forcing a failure.
name failslab, interval 1, probability 0, space 0, times 0
CPU: 1 PID: 7974 Comm: executor Not tainted 5.12.0-rc6+ #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x104/0x14e lib/dump_stack.c:120
fail_dump lib/fault-inject.c:52 [inline]
should_fail+0x23e/0x250 lib/fault-inject.c:146
__should_failslab+0x81/0x90 mm/failslab.c:33
should_failslab+0x5/0x20 mm/slab_common.c:1273
slab_pre_alloc_hook mm/slab.h:499 [inline]
slab_alloc mm/slab.c:3306 [inline]
__do_kmalloc mm/slab.c:3693 [inline]
__kmalloc+0x66/0x380 mm/slab.c:3704
kmalloc ./include/linux/slab.h:559 [inline]
kvm_io_bus_unregister_dev+0xe8/0x270 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4507
kvm_vm_ioctl_unregister_coalesced_mmio+0x164/0x1e0 arch/x86/kvm/../../../virt/kvm/coalesced_mmio.c:186
kvm_vm_ioctl+0x6e1/0x1860 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3897
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl+0xab/0x110 fs/ioctl.c:739
__x64_sys_ioctl+0x3f/0x50 fs/ioctl.c:739
do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x47338d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff1bb091c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000059c080 RCX: 000000000047338d
RDX: 0000000020000380 RSI: 000000004010ae68 RDI: 0000000000000004
RBP: 00007ff1bb091c90 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 00007fffc4b6b82f R14: 00007fffc4b6b9d0 R15: 00007ff1bb091dc0

Crash log:
==============================================
kvm: failed to shrink bus, removing it completely
general protection fault, probably for non-canonical address 0xdead000000000100: 0000 [#1] PREEMPT SMP
CPU: 3 PID: 7974 Comm: executor Not tainted 5.12.0-rc6+ #14
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:kvm_vm_ioctl_unregister_coalesced_mmio+0x88/0x1e0 arch/x86/kvm/../../../virt/kvm/coalesced_mmio.c:183
Code: 00 4c 89 74 24 18 4c 89 6c 24 20 48 8b 44 24 10 48 83 c0 08 48
89 44 24 28 48 89 5c 24 08 4c 89 24 24 4c 89 ff e8 d8 9f 49 00 <4d> 8b
37 48 89 df e8 3d 9b 49 00 8b 2b 49 8d 7f 2c e8 32 9b 49 00
RSP: 0018:ffffc90005dfbd58 EFLAGS: 00010246
RAX: ffff88800c3e7188 RBX: ffffc90005dfbe3c RCX: 0000000000000af0
RDX: 0001000000000100 RSI: 000000000000cbab RDI: dead000000000100
RBP: 0000000000000000 R08: 0000000000000000 R09: 0001000000000107
R10: 0001ffffffffffff R11: 00000000000001d2 R12: ffffc90005e7dff8
R13: 0000000000004000 R14: dead000000000100 R15: dead000000000100
FS: 00007ff1bb092700(0000) GS:ffff88807ed00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055d8946c4918 CR3: 0000000012d88000 CR4: 0000000000752ee0
PKRU: 55555554
Call Trace:
kvm_vm_ioctl+0x6e1/0x1860 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3897
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl+0xab/0x110 fs/ioctl.c:739
__x64_sys_ioctl+0x3f/0x50 fs/ioctl.c:739
do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x47338d
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff1bb091c58 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000000000059c080 RCX: 000000000047338d
RDX: 0000000020000380 RSI: 000000004010ae68 RDI: 0000000000000004
RBP: 00007ff1bb091c90 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 00007fffc4b6b82f R14: 00007fffc4b6b9d0 R15: 00007ff1bb091dc0
Modules linked in:
Dumping ftrace buffer:
(ftrace buffer empty)
---[ end trace fa3b3a11a60fdabf ]---
RIP: 0010:kvm_vm_ioctl_unregister_coalesced_mmio+0x88/0x1e0 arch/x86/kvm/../../../virt/kvm/coalesced_mmio.c:183
Code: 00 4c 89 74 24 18 4c 89 6c 24 20 48 8b 44 24 10 48 83 c0 08 48
89 44 24 28 48 89 5c 24 08 4c 89 24 24 4c 89 ff e8 d8 9f 49 00 <4d> 8b
37 48 89 df e8 3d 9b 49 00 8b 2b 49 8d 7f 2c e8 32 9b 49 00
RSP: 0018:ffffc90005dfbd58 EFLAGS: 00010246
RAX: ffff88800c3e7188 RBX: ffffc90005dfbe3c RCX: 0000000000000af0
RDX: 0001000000000100 RSI: 000000000000cbab RDI: dead000000000100
RBP: 0000000000000000 R08: 0000000000000000 R09: 0001000000000107
R10: 0001ffffffffffff R11: 00000000000001d2 R12: ffffc90005e7dff8
R13: 0000000000004000 R14: dead000000000100 R15: dead000000000100
FS: 00007ff1bb092700(0000) GS:ffff88807ed00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc82800d6b0 CR3: 0000000012d88000 CR4: 0000000000752ee0
PKRU: 55555554


The following system call sequence (Syzlang format) can reproduce the crash:
# {Threaded:false Collide:false Repeat:true RepeatTimes:0 Procs:1 Slowdown:1 Sandbox: Fault:true FaultCall:4 FaultNth:3 Leak:false NetInjection:false NetDevices:false NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false KCSAN:false DevlinkPCI:false USB:false VhciInjection:false Wifi:false IEEE802154:false Sysctl:false UseTmpDir:true HandleSegv:false Repro:false Trace:false}

r0 = openat$kvm(0xffffffffffffff9c, &(0x7f0000000740)='/dev/kvm\x00', 0x481001a000001409, 0x0)
r1 = ioctl$KVM_CREATE_VM(r0, 0xae01, 0x0)
ioctl$KVM_REGISTER_COALESCED_MMIO(r1, 0x4010ae67, &(0x7f0000000240)={0x0, 0x8000})
ioctl$KVM_REGISTER_COALESCED_MMIO(r1, 0x4010ae67, &(0x7f0000000040)={0x0, 0x3000})
ioctl$KVM_UNREGISTER_COALESCED_MMIO(r1, 0x4010ae68, &(0x7f0000000380)={0x0, 0x4000})
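
For readers who prefer plain C over Syzlang, the same call sequence can be sketched roughly as below. This is not the attached repro.cprog: the function name run_coalesced_mmio_sequence is hypothetical, the open flags are simplified to O_RDWR (an assumption, the Syzlang program passes a fuzzer-generated flag value), and the sketch does not set up fault injection, so on its own it only exercises the register/unregister path.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/kvm.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper mirroring the Syzlang sequence above.
 * Returns 0 if every call succeeded, -1 if /dev/kvm is unavailable
 * or any ioctl failed. It does NOT enable failslab fault injection.
 */
int run_coalesced_mmio_sequence(void)
{
	/* Zones taken from the Syzlang program: {addr, size}. */
	struct kvm_coalesced_mmio_zone first = { .addr = 0x0, .size = 0x8000 };
	struct kvm_coalesced_mmio_zone second = { .addr = 0x0, .size = 0x3000 };
	struct kvm_coalesced_mmio_zone target = { .addr = 0x0, .size = 0x4000 };
	int kvm_fd, vm_fd, ret = -1;

	/* The ioctl numbers 0x4010ae67/0x4010ae68 encode a 16-byte argument. */
	assert(sizeof(struct kvm_coalesced_mmio_zone) == 16);

	kvm_fd = open("/dev/kvm", O_RDWR); /* assumption: simplified flags */
	if (kvm_fd < 0)
		return -1;

	vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, 0);
	if (vm_fd < 0)
		goto out_kvm;

	if (ioctl(vm_fd, KVM_REGISTER_COALESCED_MMIO, &first) < 0)
		goto out_vm;
	if (ioctl(vm_fd, KVM_REGISTER_COALESCED_MMIO, &second) < 0)
		goto out_vm;
	/* Call 4 in the Syzlang program, the one targeted by FaultCall:4. */
	if (ioctl(vm_fd, KVM_UNREGISTER_COALESCED_MMIO, &target) < 0)
		goto out_vm;

	ret = 0;
out_vm:
	close(vm_fd);
out_kvm:
	close(kvm_fd);
	return ret;
}
```

Calling run_coalesced_mmio_sequence() from a small main() is enough to run the sequence once; reproducing the crash additionally requires the fault-injection settings given in the options line above.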

Using syz-execprog can run this reproduction program directly:
./syz-execprog -repeat 0 -procs 1 -slowdown 1 -fault_call 4 -fault_nth 3 repro.prog


Attachments:
log (6.51 kB)
repro.cprog (6.36 kB)
config (216.34 kB)

2021-04-13 05:57:53

by Sean Christopherson

Subject: Re: general protection fault in kvm_vm_ioctl_unregister_coalesced_mmio

On Mon, Apr 12, 2021, Hao Sun wrote:
> Crash log:
> ==============================================
> kvm: failed to shrink bus, removing it completely
> general protection fault, probably for non-canonical address
> 0xdead000000000100: 0000 [#1] PREEMPT SMP
> CPU: 3 PID: 7974 Comm: executor Not tainted 5.12.0-rc6+ #14
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
> 1.13.0-1ubuntu1.1 04/01/2014
> RIP: 0010:kvm_vm_ioctl_unregister_coalesced_mmio+0x88/0x1e0
> arch/x86/kvm/../../../virt/kvm/coalesced_mmio.c:183

Ugh, this code is a mess. On allocation failure, kvm_io_bus_unregister_dev()
nukes the entire bus and invokes the destructor for all _other_ devices on the
bus. The coalesced MMIO code is iterating over its list of devices, and while
list_for_each_entry_safe() can handle removal of the current entry, it blows up
when future entries are deleted.
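
The failure mode described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct model_node, MODEL_POISON and the model_* helpers are hypothetical stand-ins for the kernel's struct list_head, list poison value and list helpers, and the nodes are never freed, so the model reports the stale successor instead of faulting. The faulting address 0xdead000000000100 in the crash log appears to match the kernel's list poison value on x86-64, which is what a walk sees when it follows a cached successor that was already unlinked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's list poison value (model only). */
#define MODEL_POISON ((struct model_node *)(uintptr_t)0x100)

struct model_node {
	struct model_node *next, *prev;
};

typedef void (*model_visit_fn)(struct model_node *cur, struct model_node *head);

static void model_list_init(struct model_node *head)
{
	head->next = head->prev = head;
}

static void model_list_add_tail(struct model_node *n, struct model_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink a node and poison its pointers, in the style of list_del(). */
static void model_list_del(struct model_node *n)
{
	assert(n->next != MODEL_POISON); /* a double delete would follow poison */
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = n->prev = MODEL_POISON;
}

/* Deletes only the node being visited: the case "safe" iteration handles. */
static void visit_delete_current(struct model_node *cur, struct model_node *head)
{
	(void)head;
	model_list_del(cur);
}

/* Deletes every node on the list, modeling the bus teardown on failure. */
static void visit_delete_all(struct model_node *cur, struct model_node *head)
{
	(void)cur;
	while (head->next != head)
		model_list_del(head->next);
}

/*
 * Walk in the style of list_for_each_entry_safe(): the successor is cached
 * before the visit. Returns the number of nodes visited, or -1 if the cached
 * successor turned out to have been unlinked during a visit.
 */
static int model_walk_safe(struct model_node *head, model_visit_fn visit)
{
	struct model_node *cur = head->next;
	struct model_node *tmp = cur->next;
	int visited = 0;

	while (cur != head) {
		if (cur->next == MODEL_POISON)
			return -1; /* the kernel would dereference this stale node */
		visit(cur, head);
		visited++;
		cur = tmp;
		tmp = cur->next;
	}
	return visited;
}

/* Builds a three-node list and walks it with one of the two visitors. */
int demo_walk(int delete_all)
{
	struct model_node head, nodes[3];

	model_list_init(&head);
	for (size_t i = 0; i < 3; i++)
		model_list_add_tail(&nodes[i], &head);
	return model_walk_safe(&head, delete_all ? visit_delete_all
						 : visit_delete_current);
}
```

In this model, demo_walk(0) completes the walk because each visit removes only the current node, while demo_walk(1) stops with -1 as soon as the cached successor has been unlinked by the first visit.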

The coalesced MMIO code apparently continues to iterate after a match because
KVM_UNREGISTER_COALESCED_MMIO doesn't require an exact match. Whether or not
this is intentional is probably a moot point since it's now baked into the
ABI.

Assuming we can't change kvm_vm_ioctl_unregister_coalesced_mmio() to stop
iterating on match, the least awful fix would be to return success/failure from
kvm_io_bus_unregister_dev().
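
As a rough illustration of that idea, the control flow could look like the userspace model below. It is a sketch, not the eventual patch: struct model_bus, model_bus_unregister() and model_unregister_range() are hypothetical names, the zones live in a fixed array rather than a kernel list, and the containment check is a simplified assumption about how a request is matched against a registered zone.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_ZONES 4

struct model_zone {
	unsigned long addr;
	unsigned long size;
	bool live;
};

struct model_bus {
	struct model_zone zones[MODEL_MAX_ZONES];
	bool destroyed;       /* set once the whole bus has been torn down */
	bool fail_next_alloc; /* models the injected allocation failure */
};

/*
 * Model of an unregister helper that reports its outcome. On the modeled
 * allocation failure it destroys every OTHER zone and returns -ENOMEM,
 * leaving the target zone to the caller.
 */
int model_bus_unregister(struct model_bus *bus, size_t idx)
{
	assert(idx < MODEL_MAX_ZONES && bus->zones[idx].live);
	if (bus->fail_next_alloc) {
		for (size_t i = 0; i < MODEL_MAX_ZONES; i++)
			if (i != idx)
				bus->zones[i].live = false;
		bus->destroyed = true;
		return -ENOMEM;
	}
	return 0;
}

/*
 * Model of the unregister walk: every zone that contains the requested
 * range is removed, but the walk stops as soon as the helper reports that
 * the bus is gone. Returns the number of zones inspected.
 */
int model_unregister_range(struct model_bus *bus, unsigned long addr,
			   unsigned long size)
{
	int inspected = 0;

	for (size_t i = 0; i < MODEL_MAX_ZONES; i++) {
		struct model_zone *z = &bus->zones[i];
		int r;

		assert(!bus->destroyed); /* never touch zones of a dead bus */
		inspected++;
		if (!z->live)
			continue;
		/* Simplified assumption: the request must lie inside the zone. */
		if (addr < z->addr || addr + size > z->addr + z->size)
			continue;

		r = model_bus_unregister(bus, i);
		z->live = false; /* the caller always destroys the matched zone */
		if (r)
			break; /* bus destroyed: no zones are left to walk */
	}
	return inspected;
}
```

With the two zones from the reproducer and fail_next_alloc set, the walk in this model inspects one zone and then stops; without the failure it inspects all slots and leaves the non-matching zone registered.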

Note, there's a second bug in the error path in kvm_io_bus_unregister_dev(), as
it invokes the destructors before nullifying kvm->buses and synchronizing SRCU.
I.e. it's freeing devices on the bus while readers may be in flight. That can
be fixed by deferring the destruction until after SRCU synchronization.

I'll send patches unless someone has a better idea for fixing this.

> Code: 00 4c 89 74 24 18 4c 89 6c 24 20 48 8b 44 24 10 48 83 c0 08 48
> 89 44 24 28 48 89 5c 24 08 4c 89 24 24 4c 89 ff e8 d8 9f 49 00 <4d> 8b
> 37 48 89 df e8 3d 9b 49 00 8b 2b 49 8d 7f 2c e8 32 9b 49 00
> RSP: 0018:ffffc90005dfbd58 EFLAGS: 00010246
> RAX: ffff88800c3e7188 RBX: ffffc90005dfbe3c RCX: 0000000000000af0
> RDX: 0001000000000100 RSI: 000000000000cbab RDI: dead000000000100
> RBP: 0000000000000000 R08: 0000000000000000 R09: 0001000000000107
> R10: 0001ffffffffffff R11: 00000000000001d2 R12: ffffc90005e7dff8
> R13: 0000000000004000 R14: dead000000000100 R15: dead000000000100
> FS: 00007ff1bb092700(0000) GS:ffff88807ed00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000055d8946c4918 CR3: 0000000012d88000 CR4: 0000000000752ee0
> PKRU: 55555554
> Call Trace:
> kvm_vm_ioctl+0x6e1/0x1860 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3897
> vfs_ioctl fs/ioctl.c:48 [inline]
> __do_sys_ioctl fs/ioctl.c:753 [inline]
> __se_sys_ioctl+0xab/0x110 fs/ioctl.c:739
> __x64_sys_ioctl+0x3f/0x50 fs/ioctl.c:739
> do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
> entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x47338d