Hi,
I'm getting some hangs while fuzzing the kernel with syzkaller.
Possibly it happens during the execution of the following syzkaller program:
mmap(&(0x7f0000000000/0xb90000)=nil, (0xb90000), 0x3, 0x32,
0xffffffffffffffff, 0x0)
r0 = open$usb(&(0x7f0000001000)="2f6465762f6761646765742f64756d6d795f75646300",
0xc002, 0x0)
r1 = open$usb(&(0x7f0000002000)="2f6465762f6761646765742f64756d6d795f75646300",
0x1, 0x102)
write$usb(r1, &(0x7f0000003000)={0x0, {0x9, 0x2, 0x1b, 0x0, 0x5, 0x0,
0x80, 0x8, 0x9, 0x4, 0x10000000, 0xfffffefffffffff9, 0x1, 0xff, 0x0,
0x8, 0x80, [{0x9, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}]}, {0x12, 0x1,
0x0, 0x0, 0x4b5, 0x7c, 0x0, 0x3, 0x4, 0x0, 0x8, 0xd686, 0x0, 0x1}},
0x31)
I haven't managed to get the exact same stack trace (or any at all
actually) while trying to reproduce the bug with this program, but the
kernel definitely hangs.
On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+) with
Alan's patch applied.
gadgetfs: bound to dummy_udc driver
gadgetfs: suspended from state 2
INFO: rcu_sched detected stalls on CPUs/tasks:
1-...: (0 ticks this GP) idle=966/140000000000000/0 softirq=37706/37706 fqs=5250
(detected by 2, t=21002 jiffies, g=26575, c=26574, q=183)
Sending NMI from CPU 2 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 1394 Comm: kworker/1:2 Not tainted 4.12.0-rc4+ #24
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88003ebfb640 task.stack: ffffc900024fc000
RIP: 0010:rep_nop arch/x86/include/asm/processor.h:619 [inline]
RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:624 [inline]
RIP: 0010:virt_spin_lock arch/x86/include/asm/qspinlock.h:63 [inline]
RIP: 0010:queued_spin_lock_slowpath+0x20/0x1a0 kernel/locking/qspinlock.c:421
RSP: 0018:ffffc900024ff9f0 EFLAGS: 00000002
RAX: 000000007d72e420 RBX: ffff88007d72e768 RCX: 0000000000010000
RDX: 0000000000000001 RSI: 000000007d72e420 RDI: ffff88007d72e768
RBP: ffffc900024ff9f0 R08: 0000000000000006 R09: 0000000000000020
R10: ffffc900024ffa50 R11: 0000000000d52301 R12: ffff88007da9c298
R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff82cfc5a0
FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f728a487000 CR3: 00000000371e6000 CR4: 00000000000006e0
Call Trace:
__raw_spin_lock include/asm-generic/qspinlock.h:103 [inline]
_raw_spin_lock+0x1c/0x20 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:299 [inline]
gadgetfs_suspend+0x32/0x90 drivers/usb/gadget/legacy/inode.c:1684
set_link_state+0x39c/0x440 drivers/usb/gadget/udc/dummy_hcd.c:455
dummy_hub_control+0x3e7/0x650 drivers/usb/gadget/udc/dummy_hcd.c:2074
rh_call_control drivers/usb/core/hcd.c:689 [inline]
rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
usb_hcd_submit_urb+0x327/0xcf0 drivers/usb/core/hcd.c:1650
usb_submit_urb+0x355/0x6f0 drivers/usb/core/urb.c:542
usb_start_wait_urb+0x5f/0x110 drivers/usb/core/message.c:56
usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
usb_control_msg+0xd9/0x120 drivers/usb/core/message.c:151
usb_clear_port_feature+0x46/0x60 drivers/usb/core/hub.c:412
hub_port_disable+0x65/0x1d0 drivers/usb/core/hub.c:4177
hub_port_init+0x10a/0xee0 drivers/usb/core/hub.c:4648
hub_port_connect drivers/usb/core/hub.c:4826 [inline]
hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
port_event drivers/usb/core/hub.c:5105 [inline]
hub_event+0xa0b/0x16e0 drivers/usb/core/hub.c:5189
process_one_work+0x1fb/0x4c0 kernel/workqueue.c:2097
process_scheduled_works kernel/workqueue.c:2157 [inline]
worker_thread+0x2ab/0x4c0 kernel/workqueue.c:2233
kthread+0x140/0x160 kernel/kthread.c:231
ret_from_fork+0x25/0x30 arch/x86/entry/entry_64.S:424
Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 ba
01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3 f3 90 <eb>
ec 81 fe 00 01 00 00 0f 84 92 00 00 00 41 b8 01 01 00 00 b9
On Fri, Jun 9, 2017 at 2:41 PM, Andrey Konovalov <[email protected]> wrote:
> Hi,
>
> I'm getting some hangs while fuzzing the kernel with syzkaller.
>
> Possibly it happens during the execution of the following syzkaller program:
>
> mmap(&(0x7f0000000000/0xb90000)=nil, (0xb90000), 0x3, 0x32,
> 0xffffffffffffffff, 0x0)
> r0 = open$usb(&(0x7f0000001000)="2f6465762f6761646765742f64756d6d795f75646300",
> 0xc002, 0x0)
> r1 = open$usb(&(0x7f0000002000)="2f6465762f6761646765742f64756d6d795f75646300",
> 0x1, 0x102)
> write$usb(r1, &(0x7f0000003000)={0x0, {0x9, 0x2, 0x1b, 0x0, 0x5, 0x0,
> 0x80, 0x8, 0x9, 0x4, 0x10000000, 0xfffffefffffffff9, 0x1, 0xff, 0x0,
> 0x8, 0x80, [{0x9, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}]}, {0x12, 0x1,
> 0x0, 0x0, 0x4b5, 0x7c, 0x0, 0x3, 0x4, 0x0, 0x8, 0xd686, 0x0, 0x1}},
> 0x31)
>
> I haven't managed to get the exact same stack trace (or any at all
> actually) while trying to reproduce the bug with this program, but the
> kernel definitely hangs.
Never mind, this program is unrelated to the hang.
>
> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+) with
> Alan's patch applied.
>
> gadgetfs: bound to dummy_udc driver
> gadgetfs: suspended from state 2
> INFO: rcu_sched detected stalls on CPUs/tasks:
> 1-...: (0 ticks this GP) idle=966/140000000000000/0 softirq=37706/37706 fqs=5250
> (detected by 2, t=21002 jiffies, g=26575, c=26574, q=183)
> Sending NMI from CPU 2 to CPUs 1:
> NMI backtrace for cpu 1
> CPU: 1 PID: 1394 Comm: kworker/1:2 Not tainted 4.12.0-rc4+ #24
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: usb_hub_wq hub_event
> task: ffff88003ebfb640 task.stack: ffffc900024fc000
> RIP: 0010:rep_nop arch/x86/include/asm/processor.h:619 [inline]
> RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:624 [inline]
> RIP: 0010:virt_spin_lock arch/x86/include/asm/qspinlock.h:63 [inline]
> RIP: 0010:queued_spin_lock_slowpath+0x20/0x1a0 kernel/locking/qspinlock.c:421
> RSP: 0018:ffffc900024ff9f0 EFLAGS: 00000002
> RAX: 000000007d72e420 RBX: ffff88007d72e768 RCX: 0000000000010000
> RDX: 0000000000000001 RSI: 000000007d72e420 RDI: ffff88007d72e768
> RBP: ffffc900024ff9f0 R08: 0000000000000006 R09: 0000000000000020
> R10: ffffc900024ffa50 R11: 0000000000d52301 R12: ffff88007da9c298
> R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff82cfc5a0
> FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f728a487000 CR3: 00000000371e6000 CR4: 00000000000006e0
> Call Trace:
> __raw_spin_lock include/asm-generic/qspinlock.h:103 [inline]
> _raw_spin_lock+0x1c/0x20 kernel/locking/spinlock.c:151
> spin_lock include/linux/spinlock.h:299 [inline]
> gadgetfs_suspend+0x32/0x90 drivers/usb/gadget/legacy/inode.c:1684
> set_link_state+0x39c/0x440 drivers/usb/gadget/udc/dummy_hcd.c:455
> dummy_hub_control+0x3e7/0x650 drivers/usb/gadget/udc/dummy_hcd.c:2074
> rh_call_control drivers/usb/core/hcd.c:689 [inline]
> rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
> usb_hcd_submit_urb+0x327/0xcf0 drivers/usb/core/hcd.c:1650
> usb_submit_urb+0x355/0x6f0 drivers/usb/core/urb.c:542
> usb_start_wait_urb+0x5f/0x110 drivers/usb/core/message.c:56
> usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
> usb_control_msg+0xd9/0x120 drivers/usb/core/message.c:151
> usb_clear_port_feature+0x46/0x60 drivers/usb/core/hub.c:412
> hub_port_disable+0x65/0x1d0 drivers/usb/core/hub.c:4177
> hub_port_init+0x10a/0xee0 drivers/usb/core/hub.c:4648
> hub_port_connect drivers/usb/core/hub.c:4826 [inline]
> hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
> port_event drivers/usb/core/hub.c:5105 [inline]
> hub_event+0xa0b/0x16e0 drivers/usb/core/hub.c:5189
> process_one_work+0x1fb/0x4c0 kernel/workqueue.c:2097
> process_scheduled_works kernel/workqueue.c:2157 [inline]
> worker_thread+0x2ab/0x4c0 kernel/workqueue.c:2233
> kthread+0x140/0x160 kernel/kthread.c:231
> ret_from_fork+0x25/0x30 arch/x86/entry/entry_64.S:424
> Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 ba
> 01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3 f3 90 <eb>
> ec 81 fe 00 01 00 00 0f 84 92 00 00 00 41 b8 01 01 00 00 b9
On Fri, Jun 9, 2017 at 2:41 PM, Andrey Konovalov <[email protected]> wrote:
> Hi,
>
> I'm getting some hangs while fuzzing the kernel with syzkaller.
>
> Possibly it happens during the execution of the following syzkaller program:
>
> mmap(&(0x7f0000000000/0xb90000)=nil, (0xb90000), 0x3, 0x32,
> 0xffffffffffffffff, 0x0)
> r0 = open$usb(&(0x7f0000001000)="2f6465762f6761646765742f64756d6d795f75646300",
> 0xc002, 0x0)
> r1 = open$usb(&(0x7f0000002000)="2f6465762f6761646765742f64756d6d795f75646300",
> 0x1, 0x102)
> write$usb(r1, &(0x7f0000003000)={0x0, {0x9, 0x2, 0x1b, 0x0, 0x5, 0x0,
> 0x80, 0x8, 0x9, 0x4, 0x10000000, 0xfffffefffffffff9, 0x1, 0xff, 0x0,
> 0x8, 0x80, [{0x9, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}]}, {0x12, 0x1,
> 0x0, 0x0, 0x4b5, 0x7c, 0x0, 0x3, 0x4, 0x0, 0x8, 0xd686, 0x0, 0x1}},
> 0x31)
>
> I haven't managed to get the exact same stack trace (or any at all
> actually) while trying to reproduce the bug with this program, but the
> kernel definitely hangs.
>
> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+) with
> Alan's patch applied.
>
> gadgetfs: bound to dummy_udc driver
> gadgetfs: suspended from state 2
> INFO: rcu_sched detected stalls on CPUs/tasks:
> 1-...: (0 ticks this GP) idle=966/140000000000000/0 softirq=37706/37706 fqs=5250
> (detected by 2, t=21002 jiffies, g=26575, c=26574, q=183)
> Sending NMI from CPU 2 to CPUs 1:
> NMI backtrace for cpu 1
> CPU: 1 PID: 1394 Comm: kworker/1:2 Not tainted 4.12.0-rc4+ #24
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: usb_hub_wq hub_event
> task: ffff88003ebfb640 task.stack: ffffc900024fc000
> RIP: 0010:rep_nop arch/x86/include/asm/processor.h:619 [inline]
> RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:624 [inline]
> RIP: 0010:virt_spin_lock arch/x86/include/asm/qspinlock.h:63 [inline]
> RIP: 0010:queued_spin_lock_slowpath+0x20/0x1a0 kernel/locking/qspinlock.c:421
> RSP: 0018:ffffc900024ff9f0 EFLAGS: 00000002
> RAX: 000000007d72e420 RBX: ffff88007d72e768 RCX: 0000000000010000
> RDX: 0000000000000001 RSI: 000000007d72e420 RDI: ffff88007d72e768
> RBP: ffffc900024ff9f0 R08: 0000000000000006 R09: 0000000000000020
> R10: ffffc900024ffa50 R11: 0000000000d52301 R12: ffff88007da9c298
> R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff82cfc5a0
> FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f728a487000 CR3: 00000000371e6000 CR4: 00000000000006e0
> Call Trace:
> __raw_spin_lock include/asm-generic/qspinlock.h:103 [inline]
> _raw_spin_lock+0x1c/0x20 kernel/locking/spinlock.c:151
> spin_lock include/linux/spinlock.h:299 [inline]
> gadgetfs_suspend+0x32/0x90 drivers/usb/gadget/legacy/inode.c:1684
> set_link_state+0x39c/0x440 drivers/usb/gadget/udc/dummy_hcd.c:455
> dummy_hub_control+0x3e7/0x650 drivers/usb/gadget/udc/dummy_hcd.c:2074
> rh_call_control drivers/usb/core/hcd.c:689 [inline]
> rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
> usb_hcd_submit_urb+0x327/0xcf0 drivers/usb/core/hcd.c:1650
> usb_submit_urb+0x355/0x6f0 drivers/usb/core/urb.c:542
> usb_start_wait_urb+0x5f/0x110 drivers/usb/core/message.c:56
> usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
> usb_control_msg+0xd9/0x120 drivers/usb/core/message.c:151
> usb_clear_port_feature+0x46/0x60 drivers/usb/core/hub.c:412
> hub_port_disable+0x65/0x1d0 drivers/usb/core/hub.c:4177
> hub_port_init+0x10a/0xee0 drivers/usb/core/hub.c:4648
> hub_port_connect drivers/usb/core/hub.c:4826 [inline]
> hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
> port_event drivers/usb/core/hub.c:5105 [inline]
> hub_event+0xa0b/0x16e0 drivers/usb/core/hub.c:5189
> process_one_work+0x1fb/0x4c0 kernel/workqueue.c:2097
> process_scheduled_works kernel/workqueue.c:2157 [inline]
> worker_thread+0x2ab/0x4c0 kernel/workqueue.c:2233
> kthread+0x140/0x160 kernel/kthread.c:231
> ret_from_fork+0x25/0x30 arch/x86/entry/entry_64.S:424
> Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 ba
> 01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3 f3 90 <eb>
> ec 81 fe 00 01 00 00 0f 84 92 00 00 00 41 b8 01 01 00 00 b9
Here are a few similar reports:
gadgetfs: bound to dummy_udc driver
gadgetfs: suspended from state 2
BUG: spinlock bad magic on CPU#3, kworker/3:1/222
lock: 0xffff88003cdb9a70, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 3 PID: 222 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x8e/0xcd lib/dump_stack.c:52
spin_dump+0x73/0xd0 kernel/locking/spinlock_debug.c:67
spin_bug kernel/locking/spinlock_debug.c:75 [inline]
debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline]
do_raw_spin_lock+0x6d/0xb0 kernel/locking/spinlock_debug.c:112
__raw_spin_lock include/linux/spinlock_api_smp.h:143 [inline]
_raw_spin_lock+0x32/0x40 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:299 [inline]
gadgetfs_suspend+0x2a/0x80 drivers/usb/gadget/legacy/inode.c:1684
set_link_state+0x28f/0x300 drivers/usb/gadget/udc/dummy_hcd.c:455
dummy_hub_control+0x366/0x580 drivers/usb/gadget/udc/dummy_hcd.c:2074
rh_call_control drivers/usb/core/hcd.c:689 [inline]
rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
usb_hcd_submit_urb+0x26f/0xa20 drivers/usb/core/hcd.c:1650
usb_submit_urb+0x338/0x500 drivers/usb/core/urb.c:542
usb_start_wait_urb+0x5a/0xe0 drivers/usb/core/message.c:56
usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
usb_control_msg+0xbc/0xf0 drivers/usb/core/message.c:151
usb_clear_port_feature+0x30/0x40 drivers/usb/core/hub.c:412
hub_port_disable+0x45/0x150 drivers/usb/core/hub.c:4177
hub_port_init+0xe0/0xb50 drivers/usb/core/hub.c:4648
hub_port_connect drivers/usb/core/hub.c:4826 [inline]
hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
port_event drivers/usb/core/hub.c:5105 [inline]
hub_event+0x92a/0x1260 drivers/usb/core/hub.c:5185
process_one_work+0x1fd/0x620 kernel/workqueue.c:2097
worker_thread+0x48/0x3b0 kernel/workqueue.c:2231
kthread+0x12b/0x150 kernel/kthread.c:231
ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
gadgetfs: bound to dummy_udc driver
gadgetfs: suspended from state 2
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
CPU: 0 PID: 33 Comm: kworker/0:1 Not tainted 4.12.0-rc4+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x8e/0xcd lib/dump_stack.c:52
register_lock_class+0x164/0x540 kernel/locking/lockdep.c:755
__lock_acquire+0xb6/0x12a0 kernel/locking/lockdep.c:3255
lock_acquire+0xa3/0x1f0 kernel/locking/lockdep.c:3855
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
_raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:299 [inline]
gadgetfs_suspend+0x2a/0x80 drivers/usb/gadget/legacy/inode.c:1684
set_link_state+0x28f/0x300 drivers/usb/gadget/udc/dummy_hcd.c:455
dummy_hub_control+0x366/0x580 drivers/usb/gadget/udc/dummy_hcd.c:2074
rh_call_control drivers/usb/core/hcd.c:689 [inline]
rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
usb_hcd_submit_urb+0x26f/0xa20 drivers/usb/core/hcd.c:1650
usb_submit_urb+0x338/0x500 drivers/usb/core/urb.c:542
usb_start_wait_urb+0x5a/0xe0 drivers/usb/core/message.c:56
usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
usb_control_msg+0xbc/0xf0 drivers/usb/core/message.c:151
usb_clear_port_feature+0x30/0x40 drivers/usb/core/hub.c:412
hub_port_disable+0x45/0x150 drivers/usb/core/hub.c:4177
hub_port_init+0xe0/0xb50 drivers/usb/core/hub.c:4648
hub_port_connect drivers/usb/core/hub.c:4826 [inline]
hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
port_event drivers/usb/core/hub.c:5105 [inline]
hub_event+0x92a/0x1260 drivers/usb/core/hub.c:5185
process_one_work+0x1fd/0x620 kernel/workqueue.c:2097
process_scheduled_works kernel/workqueue.c:2157 [inline]
worker_thread+0x1fc/0x3b0 kernel/workqueue.c:2233
kthread+0x12b/0x150 kernel/kthread.c:231
ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
usb 2-1: new full-speed USB device number 5 using dummy_hcd
gadgetfs: suspended from state 2
BUG: spinlock bad magic on CPU#2, kworker/2:1/896
lock: 0xffff88003bd744b0, .magic: ffff8800, .owner: <none>/-1,
.owner_cpu: -20255234
CPU: 2 PID: 896 Comm: kworker/2:1 Not tainted 4.12.0-rc4+ #28
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x8e/0xcd lib/dump_stack.c:52
spin_dump+0x73/0xd0 kernel/locking/spinlock_debug.c:67
spin_bug kernel/locking/spinlock_debug.c:75 [inline]
debug_spin_lock_before kernel/locking/spinlock_debug.c:83 [inline]
do_raw_spin_lock+0x6d/0xb0 kernel/locking/spinlock_debug.c:112
__raw_spin_lock include/linux/spinlock_api_smp.h:143 [inline]
_raw_spin_lock+0x32/0x40 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:299 [inline]
gadgetfs_suspend+0x2a/0x80 drivers/usb/gadget/legacy/inode.c:1684
set_link_state+0x28f/0x300 drivers/usb/gadget/udc/dummy_hcd.c:455
dummy_hub_control+0x366/0x580 drivers/usb/gadget/udc/dummy_hcd.c:2074
rh_call_control drivers/usb/core/hcd.c:689 [inline]
rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
usb_hcd_submit_urb+0x26f/0xa20 drivers/usb/core/hcd.c:1650
usb_submit_urb+0x338/0x500 drivers/usb/core/urb.c:542
usb_start_wait_urb+0x5a/0xe0 drivers/usb/core/message.c:56
usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
usb_control_msg+0xbc/0xf0 drivers/usb/core/message.c:151
usb_clear_port_feature+0x30/0x40 drivers/usb/core/hub.c:412
hub_port_disable+0x45/0x150 drivers/usb/core/hub.c:4177
hub_port_init+0xe0/0xb50 drivers/usb/core/hub.c:4648
hub_port_connect drivers/usb/core/hub.c:4826 [inline]
hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
port_event drivers/usb/core/hub.c:5105 [inline]
hub_event+0x92a/0x1260 drivers/usb/core/hub.c:5185
process_one_work+0x1fd/0x620 kernel/workqueue.c:2097
worker_thread+0x48/0x3b0 kernel/workqueue.c:2231
kthread+0x12b/0x150 kernel/kthread.c:231
ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
On Fri, 9 Jun 2017, Andrey Konovalov wrote:
> On Fri, Jun 9, 2017 at 2:41 PM, Andrey Konovalov <[email protected]> wrote:
> > Hi,
> >
> > I'm getting some hangs while fuzzing the kernel with syzkaller.
> >
> > Possibly it happens during the execution of the following syzkaller program:
> >
> > mmap(&(0x7f0000000000/0xb90000)=nil, (0xb90000), 0x3, 0x32,
> > 0xffffffffffffffff, 0x0)
> > r0 = open$usb(&(0x7f0000001000)="2f6465762f6761646765742f64756d6d795f75646300",
> > 0xc002, 0x0)
> > r1 = open$usb(&(0x7f0000002000)="2f6465762f6761646765742f64756d6d795f75646300",
> > 0x1, 0x102)
> > write$usb(r1, &(0x7f0000003000)={0x0, {0x9, 0x2, 0x1b, 0x0, 0x5, 0x0,
> > 0x80, 0x8, 0x9, 0x4, 0x10000000, 0xfffffefffffffff9, 0x1, 0xff, 0x0,
I don't understand these large constants. They're supposed to be __u8
values. Do they get truncated to the least significant byte?
> > 0x8, 0x80, [{0x9, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}]}, {0x12, 0x1,
> > 0x0, 0x0, 0x4b5, 0x7c, 0x0, 0x3, 0x4, 0x0, 0x8, 0xd686, 0x0, 0x1}},
> > 0x31)
> >
> > I haven't managed to get the exact same stack trace (or any at all
> > actually) while trying to reproduce the bug with this program, but the
> > kernel definitely hangs.
Can you get a usbmon trace for the test? And can you enable debugging
for the usbcore module?
echo "module usbcore =p" >/sys/kernel/debug/dynamic_debug/control
> > On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+) with
> > Alan's patch applied.
> >
> > gadgetfs: bound to dummy_udc driver
> > gadgetfs: suspended from state 2
> > INFO: rcu_sched detected stalls on CPUs/tasks:
> > 1-...: (0 ticks this GP) idle=966/140000000000000/0 softirq=37706/37706 fqs=5250
> > (detected by 2, t=21002 jiffies, g=26575, c=26574, q=183)
> > Sending NMI from CPU 2 to CPUs 1:
> > NMI backtrace for cpu 1
> > CPU: 1 PID: 1394 Comm: kworker/1:2 Not tainted 4.12.0-rc4+ #24
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> > Workqueue: usb_hub_wq hub_event
> > task: ffff88003ebfb640 task.stack: ffffc900024fc000
> > RIP: 0010:rep_nop arch/x86/include/asm/processor.h:619 [inline]
> > RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:624 [inline]
> > RIP: 0010:virt_spin_lock arch/x86/include/asm/qspinlock.h:63 [inline]
> > RIP: 0010:queued_spin_lock_slowpath+0x20/0x1a0 kernel/locking/qspinlock.c:421
> > RSP: 0018:ffffc900024ff9f0 EFLAGS: 00000002
> > RAX: 000000007d72e420 RBX: ffff88007d72e768 RCX: 0000000000010000
> > RDX: 0000000000000001 RSI: 000000007d72e420 RDI: ffff88007d72e768
> > RBP: ffffc900024ff9f0 R08: 0000000000000006 R09: 0000000000000020
> > R10: ffffc900024ffa50 R11: 0000000000d52301 R12: ffff88007da9c298
> > R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff82cfc5a0
> > FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: 00007f728a487000 CR3: 00000000371e6000 CR4: 00000000000006e0
> > Call Trace:
> > __raw_spin_lock include/asm-generic/qspinlock.h:103 [inline]
> > _raw_spin_lock+0x1c/0x20 kernel/locking/spinlock.c:151
> > spin_lock include/linux/spinlock.h:299 [inline]
> > gadgetfs_suspend+0x32/0x90 drivers/usb/gadget/legacy/inode.c:1684
Looks like it's waiting for the spinlock in gadgetfs_suspend. Nothing
else should be holding that lock.
Were any other tasks stalled?
> > set_link_state+0x39c/0x440 drivers/usb/gadget/udc/dummy_hcd.c:455
> > dummy_hub_control+0x3e7/0x650 drivers/usb/gadget/udc/dummy_hcd.c:2074
> > rh_call_control drivers/usb/core/hcd.c:689 [inline]
> > rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
> > usb_hcd_submit_urb+0x327/0xcf0 drivers/usb/core/hcd.c:1650
> > usb_submit_urb+0x355/0x6f0 drivers/usb/core/urb.c:542
> > usb_start_wait_urb+0x5f/0x110 drivers/usb/core/message.c:56
> > usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
> > usb_control_msg+0xd9/0x120 drivers/usb/core/message.c:151
> > usb_clear_port_feature+0x46/0x60 drivers/usb/core/hub.c:412
> > hub_port_disable+0x65/0x1d0 drivers/usb/core/hub.c:4177
> > hub_port_init+0x10a/0xee0 drivers/usb/core/hub.c:4648
> > hub_port_connect drivers/usb/core/hub.c:4826 [inline]
> > hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
> > port_event drivers/usb/core/hub.c:5105 [inline]
> > hub_event+0xa0b/0x16e0 drivers/usb/core/hub.c:5189
> > process_one_work+0x1fb/0x4c0 kernel/workqueue.c:2097
> > process_scheduled_works kernel/workqueue.c:2157 [inline]
> > worker_thread+0x2ab/0x4c0 kernel/workqueue.c:2233
> > kthread+0x140/0x160 kernel/kthread.c:231
> > ret_from_fork+0x25/0x30 arch/x86/entry/entry_64.S:424
> > Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 ba
> > 01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3 f3 90 <eb>
> > ec 81 fe 00 01 00 00 0f 84 92 00 00 00 41 b8 01 01 00 00 b9
Alan Stern
On Fri, Jun 9, 2017 at 9:14 PM, Alan Stern <[email protected]> wrote:
> On Fri, 9 Jun 2017, Andrey Konovalov wrote:
>
>> On Fri, Jun 9, 2017 at 2:41 PM, Andrey Konovalov <[email protected]> wrote:
>> > Hi,
>> >
>> > I'm getting some hangs while fuzzing the kernel with syzkaller.
>> >
>> > Possibly it happens during the execution of the following syzkaller program:
>> >
>> > mmap(&(0x7f0000000000/0xb90000)=nil, (0xb90000), 0x3, 0x32,
>> > 0xffffffffffffffff, 0x0)
>> > r0 = open$usb(&(0x7f0000001000)="2f6465762f6761646765742f64756d6d795f75646300",
>> > 0xc002, 0x0)
>> > r1 = open$usb(&(0x7f0000002000)="2f6465762f6761646765742f64756d6d795f75646300",
>> > 0x1, 0x102)
>> > write$usb(r1, &(0x7f0000003000)={0x0, {0x9, 0x2, 0x1b, 0x0, 0x5, 0x0,
>> > 0x80, 0x8, 0x9, 0x4, 0x10000000, 0xfffffefffffffff9, 0x1, 0xff, 0x0,
>
> I don't understand these large constants. They're supposed to be __u8
> values. Do they get truncated to the least significant byte?
This program doesn't lead to crashes, it was a bug in syzkaller. The
format keeps changing, I'll explain it if I send another program.
>
>> > 0x8, 0x80, [{0x9, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}]}, {0x12, 0x1,
>> > 0x0, 0x0, 0x4b5, 0x7c, 0x0, 0x3, 0x4, 0x0, 0x8, 0xd686, 0x0, 0x1}},
>> > 0x31)
>> >
>> > I haven't managed to get the exact same stack trace (or any at all
>> > actually) while trying to reproduce the bug with this program, but the
>> > kernel definitely hangs.
It seems that I get the stall reports quite rarely and I can't
reproduce them. However I get the "bad spinlock magic" crashes quite
often and able to reproduce them. So I can apply debug patches.
Reproducing sometimes requires quite some time (~10 minutes), so this
seems to be some kind of race. I noticed, that often I unmount
gadgetfs before the USB device has enough time to properly initialize,
so this might be a race of unmount vs device initialization or
something similar.
>
> Can you get a usbmon trace for the test?
I wasn't able to reproduce the crash with usbmon tracing enabled.
Maybe usbmon tracing affects timings somehow and the race is quite
less likely to be triggered. I'll keep trying though.
> And can you enable debugging for the usbcore module?
>
> echo "module usbcore =p" >/sys/kernel/debug/dynamic_debug/control
>
I was able to reproduce the crash with debugging for usbcore enabled:
https://gist.github.com/xairy/52bfb302d824eb55d39776eb156812cc
>> > On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+) with
>> > Alan's patch applied.
>> >
>> > gadgetfs: bound to dummy_udc driver
>> > gadgetfs: suspended from state 2
>> > INFO: rcu_sched detected stalls on CPUs/tasks:
>> > 1-...: (0 ticks this GP) idle=966/140000000000000/0 softirq=37706/37706 fqs=5250
>> > (detected by 2, t=21002 jiffies, g=26575, c=26574, q=183)
>> > Sending NMI from CPU 2 to CPUs 1:
>> > NMI backtrace for cpu 1
>> > CPU: 1 PID: 1394 Comm: kworker/1:2 Not tainted 4.12.0-rc4+ #24
>> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> > Workqueue: usb_hub_wq hub_event
>> > task: ffff88003ebfb640 task.stack: ffffc900024fc000
>> > RIP: 0010:rep_nop arch/x86/include/asm/processor.h:619 [inline]
>> > RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:624 [inline]
>> > RIP: 0010:virt_spin_lock arch/x86/include/asm/qspinlock.h:63 [inline]
>> > RIP: 0010:queued_spin_lock_slowpath+0x20/0x1a0 kernel/locking/qspinlock.c:421
>> > RSP: 0018:ffffc900024ff9f0 EFLAGS: 00000002
>> > RAX: 000000007d72e420 RBX: ffff88007d72e768 RCX: 0000000000010000
>> > RDX: 0000000000000001 RSI: 000000007d72e420 RDI: ffff88007d72e768
>> > RBP: ffffc900024ff9f0 R08: 0000000000000006 R09: 0000000000000020
>> > R10: ffffc900024ffa50 R11: 0000000000d52301 R12: ffff88007da9c298
>> > R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff82cfc5a0
>> > FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
>> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > CR2: 00007f728a487000 CR3: 00000000371e6000 CR4: 00000000000006e0
>> > Call Trace:
>> > __raw_spin_lock include/asm-generic/qspinlock.h:103 [inline]
>> > _raw_spin_lock+0x1c/0x20 kernel/locking/spinlock.c:151
>> > spin_lock include/linux/spinlock.h:299 [inline]
>> > gadgetfs_suspend+0x32/0x90 drivers/usb/gadget/legacy/inode.c:1684
>
> Looks like it's waiting for the spinlock in gadgetfs_suspend. Nothing
> else should be holding that lock.
>
> Were any other tasks stalled?
It seems that I get these stalls quite rarely and I can't reproduce them.
>
>> > set_link_state+0x39c/0x440 drivers/usb/gadget/udc/dummy_hcd.c:455
>> > dummy_hub_control+0x3e7/0x650 drivers/usb/gadget/udc/dummy_hcd.c:2074
>> > rh_call_control drivers/usb/core/hcd.c:689 [inline]
>> > rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
>> > usb_hcd_submit_urb+0x327/0xcf0 drivers/usb/core/hcd.c:1650
>> > usb_submit_urb+0x355/0x6f0 drivers/usb/core/urb.c:542
>> > usb_start_wait_urb+0x5f/0x110 drivers/usb/core/message.c:56
>> > usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
>> > usb_control_msg+0xd9/0x120 drivers/usb/core/message.c:151
>> > usb_clear_port_feature+0x46/0x60 drivers/usb/core/hub.c:412
>> > hub_port_disable+0x65/0x1d0 drivers/usb/core/hub.c:4177
>> > hub_port_init+0x10a/0xee0 drivers/usb/core/hub.c:4648
>> > hub_port_connect drivers/usb/core/hub.c:4826 [inline]
>> > hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
>> > port_event drivers/usb/core/hub.c:5105 [inline]
>> > hub_event+0xa0b/0x16e0 drivers/usb/core/hub.c:5189
>> > process_one_work+0x1fb/0x4c0 kernel/workqueue.c:2097
>> > process_scheduled_works kernel/workqueue.c:2157 [inline]
>> > worker_thread+0x2ab/0x4c0 kernel/workqueue.c:2233
>> > kthread+0x140/0x160 kernel/kthread.c:231
>> > ret_from_fork+0x25/0x30 arch/x86/entry/entry_64.S:424
>> > Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 ba
>> > 01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3 f3 90 <eb>
>> > ec 81 fe 00 01 00 00 0f 84 92 00 00 00 41 b8 01 01 00 00 b9
>
> Alan Stern
>
On Mon, Jun 12, 2017 at 6:11 PM, Andrey Konovalov <[email protected]> wrote:
> On Fri, Jun 9, 2017 at 9:14 PM, Alan Stern <[email protected]> wrote:
>> On Fri, 9 Jun 2017, Andrey Konovalov wrote:
>>
>>> On Fri, Jun 9, 2017 at 2:41 PM, Andrey Konovalov <[email protected]> wrote:
>>> > Hi,
>>> >
>>> > I'm getting some hangs while fuzzing the kernel with syzkaller.
>>> >
>>> > Possibly it happens during the execution of the following syzkaller program:
>>> >
>>> > mmap(&(0x7f0000000000/0xb90000)=nil, (0xb90000), 0x3, 0x32,
>>> > 0xffffffffffffffff, 0x0)
>>> > r0 = open$usb(&(0x7f0000001000)="2f6465762f6761646765742f64756d6d795f75646300",
>>> > 0xc002, 0x0)
>>> > r1 = open$usb(&(0x7f0000002000)="2f6465762f6761646765742f64756d6d795f75646300",
>>> > 0x1, 0x102)
>>> > write$usb(r1, &(0x7f0000003000)={0x0, {0x9, 0x2, 0x1b, 0x0, 0x5, 0x0,
>>> > 0x80, 0x8, 0x9, 0x4, 0x10000000, 0xfffffefffffffff9, 0x1, 0xff, 0x0,
>>
>> I don't understand these large constants. They're supposed to be __u8
>> values. Do they get truncated to the least significant byte?
>
> This program doesn't lead to crashes, it was a bug in syzkaller. The
> format keeps changing, I'll explain it if I send another program.
>
>>
>>> > 0x8, 0x80, [{0x9, 0x5, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}]}, {0x12, 0x1,
>>> > 0x0, 0x0, 0x4b5, 0x7c, 0x0, 0x3, 0x4, 0x0, 0x8, 0xd686, 0x0, 0x1}},
>>> > 0x31)
>>> >
>>> > I haven't managed to get the exact same stack trace (or any at all
>>> > actually) while trying to reproduce the bug with this program, but the
>>> > kernel definitely hangs.
>
> It seems that I get the stall reports quite rarely and I can't
> reproduce them. However I get the "bad spinlock magic" crashes quite
> often and able to reproduce them. So I can apply debug patches.
>
> Reproducing sometimes requires quite some time (~10 minutes), so this
> seems to be some kind of race. I noticed, that often I unmount
> gadgetfs before the USB device has enough time to properly initialize,
> so this might be a race of unmount vs device initialization or
> something similar.
Aha, got KASAN report:
==================================================================
BUG: KASAN: use-after-free in __lock_acquire+0x3069/0x3690
kernel/locking/lockdep.c:3246
Read of size 8 at addr ffff88003a2bdaf8 by task kworker/3:1/903
CPU: 3 PID: 903 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #35
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
Call Trace:
__dump_stack lib/dump_stack.c:16 [inline]
dump_stack+0x292/0x395 lib/dump_stack.c:52
print_address_description+0x78/0x280 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x230/0x340 mm/kasan/report.c:408
__asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:429
__lock_acquire+0x3069/0x3690 kernel/locking/lockdep.c:3246
lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
__raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
_raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
spin_lock include/linux/spinlock.h:299 [inline]
gadgetfs_suspend+0x89/0x130 drivers/usb/gadget/legacy/inode.c:1682
set_link_state+0x88e/0xae0 drivers/usb/gadget/udc/dummy_hcd.c:455
dummy_hub_control+0xd7e/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2074
rh_call_control drivers/usb/core/hcd.c:689 [inline]
rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
usb_clear_port_feature+0x74/0xa0 drivers/usb/core/hub.c:412
hub_port_disable+0x123/0x510 drivers/usb/core/hub.c:4177
hub_port_init+0x1ed/0x2940 drivers/usb/core/hub.c:4648
hub_port_connect drivers/usb/core/hub.c:4826 [inline]
hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
port_event drivers/usb/core/hub.c:5105 [inline]
hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
process_scheduled_works kernel/workqueue.c:2157 [inline]
worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
kthread+0x363/0x440 kernel/kthread.c:231
ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
Allocated by task 9958:
save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
save_stack+0x43/0xd0 mm/kasan/kasan.c:513
set_track mm/kasan/kasan.c:525 [inline]
kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
kmem_cache_alloc_trace+0x87/0x280 mm/slub.c:2745
kmalloc include/linux/slab.h:492 [inline]
kzalloc include/linux/slab.h:665 [inline]
dev_new drivers/usb/gadget/legacy/inode.c:170 [inline]
gadgetfs_fill_super+0x24f/0x540 drivers/usb/gadget/legacy/inode.c:1993
mount_single+0xf6/0x160 fs/super.c:1192
gadgetfs_mount+0x31/0x40 drivers/usb/gadget/legacy/inode.c:2019
mount_fs+0x9c/0x2d0 fs/super.c:1223
vfs_kern_mount.part.25+0xcb/0x490 fs/namespace.c:976
vfs_kern_mount fs/namespace.c:2509 [inline]
do_new_mount fs/namespace.c:2512 [inline]
do_mount+0x41b/0x2d90 fs/namespace.c:2834
SYSC_mount fs/namespace.c:3050 [inline]
SyS_mount+0xb0/0x120 fs/namespace.c:3027
entry_SYSCALL_64_fastpath+0x1f/0xbe
Freed by task 9960:
save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
save_stack+0x43/0xd0 mm/kasan/kasan.c:513
set_track mm/kasan/kasan.c:525 [inline]
kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
slab_free_hook mm/slub.c:1357 [inline]
slab_free_freelist_hook mm/slub.c:1379 [inline]
slab_free mm/slub.c:2961 [inline]
kfree+0xed/0x2b0 mm/slub.c:3882
put_dev+0x124/0x160 drivers/usb/gadget/legacy/inode.c:163
gadgetfs_kill_sb+0x33/0x60 drivers/usb/gadget/legacy/inode.c:2027
deactivate_locked_super+0x8d/0xd0 fs/super.c:309
deactivate_super+0x21e/0x310 fs/super.c:340
cleanup_mnt+0xb7/0x150 fs/namespace.c:1112
__cleanup_mnt+0x1b/0x20 fs/namespace.c:1119
task_work_run+0x1a0/0x280 kernel/task_work.c:116
exit_task_work include/linux/task_work.h:21 [inline]
do_exit+0x18a8/0x2820 kernel/exit.c:878
do_group_exit+0x14e/0x420 kernel/exit.c:982
get_signal+0x784/0x1780 kernel/signal.c:2318
do_signal+0xd7/0x2130 arch/x86/kernel/signal.c:808
exit_to_usermode_loop+0x1ac/0x240 arch/x86/entry/common.c:157
prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
entry_SYSCALL_64_fastpath+0xbc/0xbe
The buggy address belongs to the object at ffff88003a2bdae0
which belongs to the cache kmalloc-1024 of size 1024
The buggy address is located 24 bytes inside of
1024-byte region [ffff88003a2bdae0, ffff88003a2bdee0)
The buggy address belongs to the page:
page:ffffea0000e8ae00 count:1 mapcount:0 mapping: (null)
index:0x0 compound_mapcount: 0
flags: 0x100000000008100(slab|head)
raw: 0100000000008100 0000000000000000 0000000000000000 0000000100170017
raw: ffffea0000ed3020 ffffea0000f5f820 ffff88003e80efc0 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88003a2bd980: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88003a2bda00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88003a2bda80: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb
^
ffff88003a2bdb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88003a2bdb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
>
>>
>> Can you get a usbmon trace for the test?
>
> I wasn't able to reproduce the crash with usbmon tracing enabled.
> Maybe usbmon tracing affects timings somehow and the race is quite
> less likely to be triggered. I'll keep trying though.
>
>> And can you enable debugging for the usbcore module?
>>
>> echo "module usbcore =p" >/sys/kernel/debug/dynamic_debug/control
>>
>
> I was able to reproduce the crash with debugging for usbcore enabled:
>
> https://gist.github.com/xairy/52bfb302d824eb55d39776eb156812cc
>
>>> > On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+) with
>>> > Alan's patch applied.
>>> >
>>> > gadgetfs: bound to dummy_udc driver
>>> > gadgetfs: suspended from state 2
>>> > INFO: rcu_sched detected stalls on CPUs/tasks:
>>> > 1-...: (0 ticks this GP) idle=966/140000000000000/0 softirq=37706/37706 fqs=5250
>>> > (detected by 2, t=21002 jiffies, g=26575, c=26574, q=183)
>>> > Sending NMI from CPU 2 to CPUs 1:
>>> > NMI backtrace for cpu 1
>>> > CPU: 1 PID: 1394 Comm: kworker/1:2 Not tainted 4.12.0-rc4+ #24
>>> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>>> > Workqueue: usb_hub_wq hub_event
>>> > task: ffff88003ebfb640 task.stack: ffffc900024fc000
>>> > RIP: 0010:rep_nop arch/x86/include/asm/processor.h:619 [inline]
>>> > RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:624 [inline]
>>> > RIP: 0010:virt_spin_lock arch/x86/include/asm/qspinlock.h:63 [inline]
>>> > RIP: 0010:queued_spin_lock_slowpath+0x20/0x1a0 kernel/locking/qspinlock.c:421
>>> > RSP: 0018:ffffc900024ff9f0 EFLAGS: 00000002
>>> > RAX: 000000007d72e420 RBX: ffff88007d72e768 RCX: 0000000000010000
>>> > RDX: 0000000000000001 RSI: 000000007d72e420 RDI: ffff88007d72e768
>>> > RBP: ffffc900024ff9f0 R08: 0000000000000006 R09: 0000000000000020
>>> > R10: ffffc900024ffa50 R11: 0000000000d52301 R12: ffff88007da9c298
>>> > R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff82cfc5a0
>>> > FS: 0000000000000000(0000) GS:ffff88003ed00000(0000) knlGS:0000000000000000
>>> > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> > CR2: 00007f728a487000 CR3: 00000000371e6000 CR4: 00000000000006e0
>>> > Call Trace:
>>> > __raw_spin_lock include/asm-generic/qspinlock.h:103 [inline]
>>> > _raw_spin_lock+0x1c/0x20 kernel/locking/spinlock.c:151
>>> > spin_lock include/linux/spinlock.h:299 [inline]
>>> > gadgetfs_suspend+0x32/0x90 drivers/usb/gadget/legacy/inode.c:1684
>>
>> Looks like it's waiting for the spinlock in gadgetfs_suspend. Nothing
>> else should be holding that lock.
>>
>> Were any other tasks stalled?
>
> It seems that I get these stalls quite rarely and I can't reproduce them.
>
>>
>>> > set_link_state+0x39c/0x440 drivers/usb/gadget/udc/dummy_hcd.c:455
>>> > dummy_hub_control+0x3e7/0x650 drivers/usb/gadget/udc/dummy_hcd.c:2074
>>> > rh_call_control drivers/usb/core/hcd.c:689 [inline]
>>> > rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
>>> > usb_hcd_submit_urb+0x327/0xcf0 drivers/usb/core/hcd.c:1650
>>> > usb_submit_urb+0x355/0x6f0 drivers/usb/core/urb.c:542
>>> > usb_start_wait_urb+0x5f/0x110 drivers/usb/core/message.c:56
>>> > usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
>>> > usb_control_msg+0xd9/0x120 drivers/usb/core/message.c:151
>>> > usb_clear_port_feature+0x46/0x60 drivers/usb/core/hub.c:412
>>> > hub_port_disable+0x65/0x1d0 drivers/usb/core/hub.c:4177
>>> > hub_port_init+0x10a/0xee0 drivers/usb/core/hub.c:4648
>>> > hub_port_connect drivers/usb/core/hub.c:4826 [inline]
>>> > hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
>>> > port_event drivers/usb/core/hub.c:5105 [inline]
>>> > hub_event+0xa0b/0x16e0 drivers/usb/core/hub.c:5189
>>> > process_one_work+0x1fb/0x4c0 kernel/workqueue.c:2097
>>> > process_scheduled_works kernel/workqueue.c:2157 [inline]
>>> > worker_thread+0x2ab/0x4c0 kernel/workqueue.c:2233
>>> > kthread+0x140/0x160 kernel/kthread.c:231
>>> > ret_from_fork+0x25/0x30 arch/x86/entry/entry_64.S:424
>>> > Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 0f 1f 44 00 00 ba
>>> > 01 00 00 00 8b 07 85 c0 75 0a f0 0f b1 17 85 c0 75 f2 5d c3 f3 90 <eb>
>>> > ec 81 fe 00 01 00 00 0f 84 92 00 00 00 41 b8 01 01 00 00 b9
>>
>> Alan Stern
>>
On Mon, 12 Jun 2017, Andrey Konovalov wrote:
> Aha, got KASAN report:
>
> ==================================================================
> BUG: KASAN: use-after-free in __lock_acquire+0x3069/0x3690
> kernel/locking/lockdep.c:3246
> Read of size 8 at addr ffff88003a2bdaf8 by task kworker/3:1/903
>
> CPU: 3 PID: 903 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #35
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: usb_hub_wq hub_event
> Call Trace:
> __dump_stack lib/dump_stack.c:16 [inline]
> dump_stack+0x292/0x395 lib/dump_stack.c:52
> print_address_description+0x78/0x280 mm/kasan/report.c:252
> kasan_report_error mm/kasan/report.c:351 [inline]
> kasan_report+0x230/0x340 mm/kasan/report.c:408
> __asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:429
> __lock_acquire+0x3069/0x3690 kernel/locking/lockdep.c:3246
> lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
> _raw_spin_lock+0x2f/0x40 kernel/locking/spinlock.c:151
> spin_lock include/linux/spinlock.h:299 [inline]
> gadgetfs_suspend+0x89/0x130 drivers/usb/gadget/legacy/inode.c:1682
> set_link_state+0x88e/0xae0 drivers/usb/gadget/udc/dummy_hcd.c:455
> dummy_hub_control+0xd7e/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2074
> rh_call_control drivers/usb/core/hcd.c:689 [inline]
> rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
> usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
> usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
> usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
> usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
> usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
> usb_clear_port_feature+0x74/0xa0 drivers/usb/core/hub.c:412
> hub_port_disable+0x123/0x510 drivers/usb/core/hub.c:4177
> hub_port_init+0x1ed/0x2940 drivers/usb/core/hub.c:4648
> hub_port_connect drivers/usb/core/hub.c:4826 [inline]
> hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
> port_event drivers/usb/core/hub.c:5105 [inline]
> hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
> process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
> process_scheduled_works kernel/workqueue.c:2157 [inline]
> worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
> kthread+0x363/0x440 kernel/kthread.c:231
> ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
>
> Allocated by task 9958:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:617
> kmem_cache_alloc_trace+0x87/0x280 mm/slub.c:2745
> kmalloc include/linux/slab.h:492 [inline]
> kzalloc include/linux/slab.h:665 [inline]
> dev_new drivers/usb/gadget/legacy/inode.c:170 [inline]
> gadgetfs_fill_super+0x24f/0x540 drivers/usb/gadget/legacy/inode.c:1993
> mount_single+0xf6/0x160 fs/super.c:1192
> gadgetfs_mount+0x31/0x40 drivers/usb/gadget/legacy/inode.c:2019
> mount_fs+0x9c/0x2d0 fs/super.c:1223
> vfs_kern_mount.part.25+0xcb/0x490 fs/namespace.c:976
> vfs_kern_mount fs/namespace.c:2509 [inline]
> do_new_mount fs/namespace.c:2512 [inline]
> do_mount+0x41b/0x2d90 fs/namespace.c:2834
> SYSC_mount fs/namespace.c:3050 [inline]
> SyS_mount+0xb0/0x120 fs/namespace.c:3027
> entry_SYSCALL_64_fastpath+0x1f/0xbe
>
> Freed by task 9960:
> save_stack_trace+0x1b/0x20 arch/x86/kernel/stacktrace.c:59
> save_stack+0x43/0xd0 mm/kasan/kasan.c:513
> set_track mm/kasan/kasan.c:525 [inline]
> kasan_slab_free+0x72/0xc0 mm/kasan/kasan.c:590
> slab_free_hook mm/slub.c:1357 [inline]
> slab_free_freelist_hook mm/slub.c:1379 [inline]
> slab_free mm/slub.c:2961 [inline]
> kfree+0xed/0x2b0 mm/slub.c:3882
> put_dev+0x124/0x160 drivers/usb/gadget/legacy/inode.c:163
> gadgetfs_kill_sb+0x33/0x60 drivers/usb/gadget/legacy/inode.c:2027
> deactivate_locked_super+0x8d/0xd0 fs/super.c:309
> deactivate_super+0x21e/0x310 fs/super.c:340
> cleanup_mnt+0xb7/0x150 fs/namespace.c:1112
> __cleanup_mnt+0x1b/0x20 fs/namespace.c:1119
> task_work_run+0x1a0/0x280 kernel/task_work.c:116
> exit_task_work include/linux/task_work.h:21 [inline]
> do_exit+0x18a8/0x2820 kernel/exit.c:878
> do_group_exit+0x14e/0x420 kernel/exit.c:982
> get_signal+0x784/0x1780 kernel/signal.c:2318
> do_signal+0xd7/0x2130 arch/x86/kernel/signal.c:808
> exit_to_usermode_loop+0x1ac/0x240 arch/x86/entry/common.c:157
> prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
> syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
> entry_SYSCALL_64_fastpath+0xbc/0xbe
>
> The buggy address belongs to the object at ffff88003a2bdae0
> which belongs to the cache kmalloc-1024 of size 1024
> The buggy address is located 24 bytes inside of
> 1024-byte region [ffff88003a2bdae0, ffff88003a2bdee0)
> The buggy address belongs to the page:
> page:ffffea0000e8ae00 count:1 mapcount:0 mapping: (null)
> index:0x0 compound_mapcount: 0
> flags: 0x100000000008100(slab|head)
> raw: 0100000000008100 0000000000000000 0000000000000000 0000000100170017
> raw: ffffea0000ed3020 ffffea0000f5f820 ffff88003e80efc0 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
> ffff88003a2bd980: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88003a2bda00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> >ffff88003a2bda80: fc fc fc fc fc fc fc fc fc fc fc fc fb fb fb fb
> ^
> ffff88003a2bdb00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ffff88003a2bdb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
As you surmised, this was caused by a race. The race was between
dummy_udc_stop() and set_link_state(), both in dummy_hcd.c. A symptom
of this race is that the first routine clears dum->driver while the
second dereferences it, and neither access is protected by a lock.
It turns out this problem affects at least one other UDC driver
(net2280.c). Below is a patch that adds the missing lock (and removes
some unneeded unlocks) from these drivers. It turns out that the
disconnect, reset, suspend, and resume callbacks should all be invoked
while the UDC's lock is held, because they can race with gadget driver
unbinding.
Adding locked regions can lead to the possibility of deadlock or
lockdep violations. Please let me know if the patch below raises any
problems.
Alan Stern
Index: usb-4.x/drivers/usb/gadget/legacy/inode.c
===================================================================
--- usb-4.x.orig/drivers/usb/gadget/legacy/inode.c
+++ usb-4.x/drivers/usb/gadget/legacy/inode.c
@@ -1679,9 +1679,10 @@ static void
gadgetfs_suspend (struct usb_gadget *gadget)
{
struct dev_data *dev = get_gadget_data (gadget);
+ unsigned long flags;
INFO (dev, "suspended from state %d\n", dev->state);
- spin_lock (&dev->lock);
+ spin_lock_irqsave(&dev->lock, flags);
switch (dev->state) {
case STATE_DEV_SETUP: // VERY odd... host died??
case STATE_DEV_CONNECTED:
@@ -1692,7 +1693,7 @@ gadgetfs_suspend (struct usb_gadget *gad
default:
break;
}
- spin_unlock (&dev->lock);
+ spin_unlock_irqrestore(&dev->lock, flags);
}
static struct usb_gadget_driver gadgetfs_driver = {
Index: usb-4.x/drivers/usb/gadget/udc/dummy_hcd.c
===================================================================
--- usb-4.x.orig/drivers/usb/gadget/udc/dummy_hcd.c
+++ usb-4.x/drivers/usb/gadget/udc/dummy_hcd.c
@@ -442,23 +442,16 @@ static void set_link_state(struct dummy_
/* Report reset and disconnect events to the driver */
if (dum->driver && (disconnect || reset)) {
stop_activity(dum);
- spin_unlock(&dum->lock);
if (reset)
usb_gadget_udc_reset(&dum->gadget, dum->driver);
else
dum->driver->disconnect(&dum->gadget);
- spin_lock(&dum->lock);
}
} else if (dum_hcd->active != dum_hcd->old_active) {
- if (dum_hcd->old_active && dum->driver->suspend) {
- spin_unlock(&dum->lock);
+ if (dum_hcd->old_active && dum->driver->suspend)
dum->driver->suspend(&dum->gadget);
- spin_lock(&dum->lock);
- } else if (!dum_hcd->old_active && dum->driver->resume) {
- spin_unlock(&dum->lock);
+ else if (!dum_hcd->old_active && dum->driver->resume)
dum->driver->resume(&dum->gadget);
- spin_lock(&dum->lock);
- }
}
dum_hcd->old_status = dum_hcd->port_status;
@@ -983,7 +976,9 @@ static int dummy_udc_stop(struct usb_gad
struct dummy_hcd *dum_hcd = gadget_to_dummy_hcd(g);
struct dummy *dum = dum_hcd->dum;
+ spin_lock_irq(&dum->lock);
dum->driver = NULL;
+ spin_unlock_irq(&dum->lock);
return 0;
}
Index: usb-4.x/drivers/usb/gadget/udc/net2280.c
===================================================================
--- usb-4.x.orig/drivers/usb/gadget/udc/net2280.c
+++ usb-4.x/drivers/usb/gadget/udc/net2280.c
@@ -2470,11 +2470,8 @@ static void stop_activity(struct net2280
nuke(&dev->ep[i]);
/* report disconnect; the driver is already quiesced */
- if (driver) {
- spin_unlock(&dev->lock);
+ if (driver)
driver->disconnect(&dev->gadget);
- spin_lock(&dev->lock);
- }
usb_reinit(dev);
}
@@ -3348,8 +3345,6 @@ next_endpoints:
BIT(PCI_RETRY_ABORT_INTERRUPT))
static void handle_stat1_irqs(struct net2280 *dev, u32 stat)
-__releases(dev->lock)
-__acquires(dev->lock)
{
struct net2280_ep *ep;
u32 tmp, num, mask, scratch;
@@ -3390,14 +3385,12 @@ __acquires(dev->lock)
if (disconnect || reset) {
stop_activity(dev, dev->driver);
ep0_start(dev);
- spin_unlock(&dev->lock);
if (reset)
usb_gadget_udc_reset
(&dev->gadget, dev->driver);
else
(dev->driver->disconnect)
(&dev->gadget);
- spin_lock(&dev->lock);
return;
}
}
On Mon, Jun 12, 2017 at 10:25 PM, Alan Stern <[email protected]> wrote:
>
> As you surmised, this was caused by a race. The race was between
> dummy_udc_stop() and set_link_state(), both in dummy_hcd.c. A symptom
> of this race is that the first routine clears dum->driver while the
> second dereferences it, and neither access is protected by a lock.
>
> It turns out this problem affects at least one other UDC driver
> (net2280.c). Below is a patch that adds the missing lock (and removes
> some unneeded unlocks) from these drivers. It turns out that the
> disconnect, reset, suspend, and resume callbacks should all be invoked
> while the UDC's lock is held, because they can race with gadget driver
> unbinding.
>
> Adding locked regions can lead to the possibility of deadlock or
> lockdep violations. Please let me know if the patch below raises any
> problems.
Hi Alan,
Thanks for the patch!
I've been testing with your patch applied and the "bad spinlock magic"
crashes seem to be gone. However I got another crash (happened only
once over the night), which happens during "spin_lock_irqsave
(&dev->lock, flags)":
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 3 PID: 900 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #36
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006c5aadc0 task.stack: ffff88006c620000
RIP: 0010:__lock_acquire+0xbe5/0x3690 kernel/locking/lockdep.c:3246
RSP: 0018:ffff88006c625a80 EFLAGS: 00010006
RAX: dffffc0000000000 RBX: ffff88006c5aadc0 RCX: 0000000000000000
RDX: 0000000000000003 RSI: 0000000000000000 RDI: 1ffff1000d8c4bab
RBP: ffff88006c625fc0 R08: 0000000000000001 R09: 0000000000000001
R10: ffff88006c5ab698 R11: ffffffff87dd2e80 R12: dffffc0000000000
R13: 0000000000000001 R14: 0000000000000018 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88006dd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020fdffff CR3: 000000006797e000 CR4: 00000000000006e0
Call Trace:
lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xc9/0x110 kernel/locking/spinlock.c:159
gadgetfs_disconnect+0xf1/0x230 drivers/usb/gadget/legacy/inode.c:1664
usb_gadget_udc_reset+0x3b/0xb0 drivers/usb/gadget/udc/core.c:1020
set_link_state+0x648/0x9f0 drivers/usb/gadget/udc/dummy_hcd.c:446
dummy_hub_control+0x11bb/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2243
rh_call_control drivers/usb/core/hcd.c:689 [inline]
rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
set_port_feature+0x73/0x90 drivers/usb/core/hub.c:422
hub_port_reset+0x277/0x1550 drivers/usb/core/hub.c:2772
hub_port_init+0x7dc/0x2940 drivers/usb/core/hub.c:4510
hub_port_connect drivers/usb/core/hub.c:4826 [inline]
hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
port_event drivers/usb/core/hub.c:5105 [inline]
hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
process_scheduled_works kernel/workqueue.c:2157 [inline]
worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
kthread+0x363/0x440 kernel/kthread.c:231
ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
Code: e9 03 f3 48 ab 48 81 c4 18 05 00 00 44 89 e0 5b 41 5c 41 5d 41
5e 41 5f 5d c3 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80>
3c 02 00 0f 85 51 24 00 00 49 81 3e 40 ea 13 87 41 bd 00 00
RIP: __lock_acquire+0xbe5/0x3690 kernel/locking/lockdep.c:3246 RSP:
ffff88006c625a80
---[ end trace f5a7b971fc1b0546 ]---
Kernel panic - not syncing: Fatal exception
Shutting down cpus with NMI
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..
>
> Alan Stern
>
>
>
> Index: usb-4.x/drivers/usb/gadget/legacy/inode.c
> ===================================================================
> --- usb-4.x.orig/drivers/usb/gadget/legacy/inode.c
> +++ usb-4.x/drivers/usb/gadget/legacy/inode.c
> @@ -1679,9 +1679,10 @@ static void
> gadgetfs_suspend (struct usb_gadget *gadget)
> {
> struct dev_data *dev = get_gadget_data (gadget);
> + unsigned long flags;
>
> INFO (dev, "suspended from state %d\n", dev->state);
> - spin_lock (&dev->lock);
> + spin_lock_irqsave(&dev->lock, flags);
> switch (dev->state) {
> case STATE_DEV_SETUP: // VERY odd... host died??
> case STATE_DEV_CONNECTED:
> @@ -1692,7 +1693,7 @@ gadgetfs_suspend (struct usb_gadget *gad
> default:
> break;
> }
> - spin_unlock (&dev->lock);
> + spin_unlock_irqrestore(&dev->lock, flags);
> }
>
> static struct usb_gadget_driver gadgetfs_driver = {
> Index: usb-4.x/drivers/usb/gadget/udc/dummy_hcd.c
> ===================================================================
> --- usb-4.x.orig/drivers/usb/gadget/udc/dummy_hcd.c
> +++ usb-4.x/drivers/usb/gadget/udc/dummy_hcd.c
> @@ -442,23 +442,16 @@ static void set_link_state(struct dummy_
> /* Report reset and disconnect events to the driver */
> if (dum->driver && (disconnect || reset)) {
> stop_activity(dum);
> - spin_unlock(&dum->lock);
> if (reset)
> usb_gadget_udc_reset(&dum->gadget, dum->driver);
> else
> dum->driver->disconnect(&dum->gadget);
> - spin_lock(&dum->lock);
> }
> } else if (dum_hcd->active != dum_hcd->old_active) {
> - if (dum_hcd->old_active && dum->driver->suspend) {
> - spin_unlock(&dum->lock);
> + if (dum_hcd->old_active && dum->driver->suspend)
> dum->driver->suspend(&dum->gadget);
> - spin_lock(&dum->lock);
> - } else if (!dum_hcd->old_active && dum->driver->resume) {
> - spin_unlock(&dum->lock);
> + else if (!dum_hcd->old_active && dum->driver->resume)
> dum->driver->resume(&dum->gadget);
> - spin_lock(&dum->lock);
> - }
> }
>
> dum_hcd->old_status = dum_hcd->port_status;
> @@ -983,7 +976,9 @@ static int dummy_udc_stop(struct usb_gad
> struct dummy_hcd *dum_hcd = gadget_to_dummy_hcd(g);
> struct dummy *dum = dum_hcd->dum;
>
> + spin_lock_irq(&dum->lock);
> dum->driver = NULL;
> + spin_unlock_irq(&dum->lock);
>
> return 0;
> }
> Index: usb-4.x/drivers/usb/gadget/udc/net2280.c
> ===================================================================
> --- usb-4.x.orig/drivers/usb/gadget/udc/net2280.c
> +++ usb-4.x/drivers/usb/gadget/udc/net2280.c
> @@ -2470,11 +2470,8 @@ static void stop_activity(struct net2280
> nuke(&dev->ep[i]);
>
> /* report disconnect; the driver is already quiesced */
> - if (driver) {
> - spin_unlock(&dev->lock);
> + if (driver)
> driver->disconnect(&dev->gadget);
> - spin_lock(&dev->lock);
> - }
>
> usb_reinit(dev);
> }
> @@ -3348,8 +3345,6 @@ next_endpoints:
> BIT(PCI_RETRY_ABORT_INTERRUPT))
>
> static void handle_stat1_irqs(struct net2280 *dev, u32 stat)
> -__releases(dev->lock)
> -__acquires(dev->lock)
> {
> struct net2280_ep *ep;
> u32 tmp, num, mask, scratch;
> @@ -3390,14 +3385,12 @@ __acquires(dev->lock)
> if (disconnect || reset) {
> stop_activity(dev, dev->driver);
> ep0_start(dev);
> - spin_unlock(&dev->lock);
> if (reset)
> usb_gadget_udc_reset
> (&dev->gadget, dev->driver);
> else
> (dev->driver->disconnect)
> (&dev->gadget);
> - spin_lock(&dev->lock);
> return;
> }
> }
>
On Tue, 13 Jun 2017, Andrey Konovalov wrote:
> Hi Alan,
>
> Thanks for the patch!
>
> I've been testing with your patch applied and the "bad spinlock magic"
> crashes seem to be gone. However I got another crash (happened only
> once over the night), which happens during "spin_lock_irqsave
> (&dev->lock, flags)":
>
> kasan: GPF could be caused by NULL-ptr deref or user memory access
> general protection fault: 0000 [#1] SMP KASAN
> Dumping ftrace buffer:
> (ftrace buffer empty)
> Modules linked in:
> CPU: 3 PID: 900 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #36
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> Workqueue: usb_hub_wq hub_event
> task: ffff88006c5aadc0 task.stack: ffff88006c620000
> RIP: 0010:__lock_acquire+0xbe5/0x3690 kernel/locking/lockdep.c:3246
> RSP: 0018:ffff88006c625a80 EFLAGS: 00010006
> RAX: dffffc0000000000 RBX: ffff88006c5aadc0 RCX: 0000000000000000
> RDX: 0000000000000003 RSI: 0000000000000000 RDI: 1ffff1000d8c4bab
> RBP: ffff88006c625fc0 R08: 0000000000000001 R09: 0000000000000001
> R10: ffff88006c5ab698 R11: ffffffff87dd2e80 R12: dffffc0000000000
> R13: 0000000000000001 R14: 0000000000000018 R15: 0000000000000000
> FS: 0000000000000000(0000) GS:ffff88006dd00000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000020fdffff CR3: 000000006797e000 CR4: 00000000000006e0
> Call Trace:
> lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
> _raw_spin_lock_irqsave+0xc9/0x110 kernel/locking/spinlock.c:159
> gadgetfs_disconnect+0xf1/0x230 drivers/usb/gadget/legacy/inode.c:1664
> usb_gadget_udc_reset+0x3b/0xb0 drivers/usb/gadget/udc/core.c:1020
> set_link_state+0x648/0x9f0 drivers/usb/gadget/udc/dummy_hcd.c:446
> dummy_hub_control+0x11bb/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2243
> rh_call_control drivers/usb/core/hcd.c:689 [inline]
> rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
> usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
> usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
> usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
> usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
> usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
> set_port_feature+0x73/0x90 drivers/usb/core/hub.c:422
> hub_port_reset+0x277/0x1550 drivers/usb/core/hub.c:2772
> hub_port_init+0x7dc/0x2940 drivers/usb/core/hub.c:4510
> hub_port_connect drivers/usb/core/hub.c:4826 [inline]
> hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
> port_event drivers/usb/core/hub.c:5105 [inline]
> hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
> process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
> process_scheduled_works kernel/workqueue.c:2157 [inline]
> worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
> kthread+0x363/0x440 kernel/kthread.c:231
> ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
> Code: e9 03 f3 48 ab 48 81 c4 18 05 00 00 44 89 e0 5b 41 5c 41 5d 41
> 5e 41 5f 5d c3 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80>
> 3c 02 00 0f 85 51 24 00 00 49 81 3e 40 ea 13 87 41 bd 00 00
> RIP: __lock_acquire+0xbe5/0x3690 kernel/locking/lockdep.c:3246 RSP:
> ffff88006c625a80
> ---[ end trace f5a7b971fc1b0546 ]---
> Kernel panic - not syncing: Fatal exception
> Shutting down cpus with NMI
> Dumping ftrace buffer:
> (ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
I guess the structure containing the spinlock was already deallocated
when gadgetfs_disconnect was called. But I don't see how that could
have happened, with the patched locking.
If you can't replicate this then there's not much hope of tracking it
down.
If you can, then perhaps it would help to trace each time any function
calls either put_dev or get_dev in inode.c.
Alan Stern
On Tue, Jun 13, 2017 at 7:44 PM, Alan Stern <[email protected]> wrote:
> On Tue, 13 Jun 2017, Andrey Konovalov wrote:
>
>> Hi Alan,
>>
>> Thanks for the patch!
>>
>> I've been testing with your patch applied and the "bad spinlock magic"
>> crashes seem to be gone. However I got another crash (happened only
>> once over the night), which happens during "spin_lock_irqsave
>> (&dev->lock, flags)":
>>
>> kasan: GPF could be caused by NULL-ptr deref or user memory access
>> general protection fault: 0000 [#1] SMP KASAN
>> Dumping ftrace buffer:
>> (ftrace buffer empty)
>> Modules linked in:
>> CPU: 3 PID: 900 Comm: kworker/3:1 Not tainted 4.12.0-rc4+ #36
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Workqueue: usb_hub_wq hub_event
>> task: ffff88006c5aadc0 task.stack: ffff88006c620000
>> RIP: 0010:__lock_acquire+0xbe5/0x3690 kernel/locking/lockdep.c:3246
>> RSP: 0018:ffff88006c625a80 EFLAGS: 00010006
>> RAX: dffffc0000000000 RBX: ffff88006c5aadc0 RCX: 0000000000000000
>> RDX: 0000000000000003 RSI: 0000000000000000 RDI: 1ffff1000d8c4bab
>> RBP: ffff88006c625fc0 R08: 0000000000000001 R09: 0000000000000001
>> R10: ffff88006c5ab698 R11: ffffffff87dd2e80 R12: dffffc0000000000
>> R13: 0000000000000001 R14: 0000000000000018 R15: 0000000000000000
>> FS: 0000000000000000(0000) GS:ffff88006dd00000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 0000000020fdffff CR3: 000000006797e000 CR4: 00000000000006e0
>> Call Trace:
>> lock_acquire+0x22d/0x560 kernel/locking/lockdep.c:3855
>> __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
>> _raw_spin_lock_irqsave+0xc9/0x110 kernel/locking/spinlock.c:159
>> gadgetfs_disconnect+0xf1/0x230 drivers/usb/gadget/legacy/inode.c:1664
>> usb_gadget_udc_reset+0x3b/0xb0 drivers/usb/gadget/udc/core.c:1020
>> set_link_state+0x648/0x9f0 drivers/usb/gadget/udc/dummy_hcd.c:446
>> dummy_hub_control+0x11bb/0x1fb0 drivers/usb/gadget/udc/dummy_hcd.c:2243
>> rh_call_control drivers/usb/core/hcd.c:689 [inline]
>> rh_urb_enqueue drivers/usb/core/hcd.c:846 [inline]
>> usb_hcd_submit_urb+0x92f/0x20b0 drivers/usb/core/hcd.c:1650
>> usb_submit_urb+0x8b2/0x12c0 drivers/usb/core/urb.c:542
>> usb_start_wait_urb+0x148/0x5b0 drivers/usb/core/message.c:56
>> usb_internal_control_msg drivers/usb/core/message.c:100 [inline]
>> usb_control_msg+0x341/0x4d0 drivers/usb/core/message.c:151
>> set_port_feature+0x73/0x90 drivers/usb/core/hub.c:422
>> hub_port_reset+0x277/0x1550 drivers/usb/core/hub.c:2772
>> hub_port_init+0x7dc/0x2940 drivers/usb/core/hub.c:4510
>> hub_port_connect drivers/usb/core/hub.c:4826 [inline]
>> hub_port_connect_change drivers/usb/core/hub.c:4999 [inline]
>> port_event drivers/usb/core/hub.c:5105 [inline]
>> hub_event+0x1ae1/0x3d40 drivers/usb/core/hub.c:5185
>> process_one_work+0xc08/0x1bd0 kernel/workqueue.c:2097
>> process_scheduled_works kernel/workqueue.c:2157 [inline]
>> worker_thread+0xb2b/0x1860 kernel/workqueue.c:2233
>> kthread+0x363/0x440 kernel/kthread.c:231
>> ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:424
>> Code: e9 03 f3 48 ab 48 81 c4 18 05 00 00 44 89 e0 5b 41 5c 41 5d 41
>> 5e 41 5f 5d c3 4c 89 f2 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80>
>> 3c 02 00 0f 85 51 24 00 00 49 81 3e 40 ea 13 87 41 bd 00 00
>> RIP: __lock_acquire+0xbe5/0x3690 kernel/locking/lockdep.c:3246 RSP:
>> ffff88006c625a80
>> ---[ end trace f5a7b971fc1b0546 ]---
>> Kernel panic - not syncing: Fatal exception
>> Shutting down cpus with NMI
>> Dumping ftrace buffer:
>> (ftrace buffer empty)
>> Kernel Offset: disabled
>> Rebooting in 86400 seconds..
>
> I guess the structure containing the spinlock was already deallocated
> when gadgetfs_disconnect was called. But I don't see how that could
> have happened, with the patched locking.
>
> If you can't replicate this then there's not much hope of tracking it
> down.
>
> If you can, then perhaps it would help to trace each time any function
> calls either put_dev or get_dev in inode.c.
OK, I'll let you know if I manage to reproduce this. I'll report it in
a separate thread then.
Anyway, your patch fixes the crashes I reported initially, thanks!
>
> Alan Stern
>