Hi,
I've got the following error report while fuzzing the kernel with syzkaller.
On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
This looks quite similar to
https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
I'm able to reproduce this, so I can collect some debug traces if needed.
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 2 PID: 4820 Comm: syz-executor0 Not tainted 4.12.0-rc4+ #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff880039542dc0 task.stack: ffff88003bdd0000
RIP: 0010:__list_del_entry_valid+0x7e/0x170 lib/list_debug.c:51
RSP: 0018:ffff88003bdd6e50 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000010000
RDX: 0000000000000000 RSI: ffffffff86504948 RDI: ffffffff86504950
RBP: ffff88003bdd6e68 R08: ffff880039542dc0 R09: ffffffff8778ce00
R10: ffff88003bdd6e68 R11: dffffc0000000000 R12: 0000000000000000
R13: dffffc0000000000 R14: 1ffff100077badd2 R15: ffffffff864d2e40
FS: 0000000000000000(0000) GS:ffff88006dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002014aff9 CR3: 0000000006022000 CR4: 00000000000006e0
Call Trace:
__list_del_entry include/linux/list.h:116 [inline]
list_del include/linux/list.h:124 [inline]
usb_gadget_unregister_driver+0x166/0x4c0 drivers/usb/gadget/udc/core.c:1387
dev_release+0x80/0x160 drivers/usb/gadget/legacy/inode.c:1187
__fput+0x332/0x7f0 fs/file_table.c:209
____fput+0x15/0x20 fs/file_table.c:245
task_work_run+0x19b/0x270 kernel/task_work.c:116
exit_task_work include/linux/task_work.h:21 [inline]
do_exit+0x18a3/0x2820 kernel/exit.c:878
do_group_exit+0x149/0x420 kernel/exit.c:982
get_signal+0x77f/0x1780 kernel/signal.c:2318
do_signal+0xd2/0x2130 arch/x86/kernel/signal.c:808
exit_to_usermode_loop+0x1a7/0x240 arch/x86/entry/common.c:157
prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
entry_SYSCALL_64_fastpath+0xbc/0xbe
RIP: 0033:0x4461f9
RSP: 002b:00007fdac2b1ecf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000007080c8 RCX: 00000000004461f9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000007080c8
RBP: 00000000007080a8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007fdac2b1f9c0 R15: 00007fdac2b1f700
Code: 00 00 00 00 ad de 49 39 c4 74 6a 48 b8 00 02 00 00 00 00 ad de
48 89 da 48 39 c3 74 74 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80>
3c 02 00 0f 85 92 00 00 00 48 8b 13 48 39 f2 75 66 49 8d 7c
RIP: __list_del_entry_valid+0x7e/0x170 lib/list_debug.c:51 RSP: ffff88003bdd6e50
---[ end trace 30e94b1eec4831c8 ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..
On Wed, 7 Jun 2017, Andrey Konovalov wrote:
> Hi,
>
> I've got the following error report while fuzzing the kernel with syzkaller.
>
> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
>
> This looks quite similar to
> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
It does look very similar, but that problem was supposed to have been
fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
of usb_gadget_probe_driver()").
> I'm able to reproduce this, so I can collect some debug traces if needed.
Can you provide an strace or the equivalent?
Alan Stern
On Wed, Jun 7, 2017 at 4:43 PM, Alan Stern <[email protected]> wrote:
> On Wed, 7 Jun 2017, Andrey Konovalov wrote:
>
>> Hi,
>>
>> I've got the following error report while fuzzing the kernel with syzkaller.
>>
>> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
>>
>> This looks quite similar to
>> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
>
> It does look very similar, but that problem was supposed to have been
> fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
> of usb_gadget_probe_driver()").
>
>> I'm able to reproduce this, so I can collect some debug traces if needed.
>
> Can you provide an strace or the equivalent?
Here's the syzkaller program (which is actually two programs executed
consequently):
https://gist.github.com/xairy/fe0a7531e00df5e8bc23e2e56e413510
Here's the strace log:
https://gist.github.com/xairy/5fadc3b5d8b2b80c97e566538de08bc4
Unfortunately there's a lot of unrelated garbage, but I can't extract
a simple C reproducer.
I can also apply patches with debug printk's, run the reproducer and
send you the result if that will help.
>
> Alan Stern
>
On Wed, 7 Jun 2017, Andrey Konovalov wrote:
> On Wed, Jun 7, 2017 at 4:43 PM, Alan Stern <[email protected]> wrote:
> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
> >
> >> Hi,
> >>
> >> I've got the following error report while fuzzing the kernel with syzkaller.
> >>
> >> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
> >>
> >> This looks quite similar to
> >> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
> >
> > It does look very similar, but that problem was supposed to have been
> > fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
> > of usb_gadget_probe_driver()").
> >
> >> I'm able to reproduce this, so I can collect some debug traces if needed.
> >
> > Can you provide an strace or the equivalent?
>
> Here's the syzkaller program (which is actually two programs executed
> consequently):
> https://gist.github.com/xairy/fe0a7531e00df5e8bc23e2e56e413510
>
> Here's the strace log:
> https://gist.github.com/xairy/5fadc3b5d8b2b80c97e566538de08bc4
Do you know which of the two programs got the GPF? I can't tell from
the strace log.
> Unfortunately there's a lot of unrelated garbage, but I can't extract
> a simple C reproducer.
That's okay, it's easy enough to see what's going on. One program
opens /dev/gadget/dummy_udc, writes an invalid setup string, then
writes a valid setup string, and then exits. The other program just
opens the file and then exits.
> I can also apply patches with debug printk's, run the reproducer and
> send you the result if that will help.
Maybe you can patch usb_gadget_probe_driver() in
drivers/usb/gadget/udc/core.c. Find out whether the "if
(!driver->match_existing_only)" test is executed and whether it
succeeds, and find out whether the code following "found:" is executed.
I would expect that the test is not executed and the jump to "found:"
is taken, so udc_bind_to_driver() is called and returns 0. Thus,
udc->driver should be set to driver.
Also, in usb_gadget_unregister_driver(), in the list_for_each_entry()
loop, we should have udc->driver == driver and therefore ret should get
set to 0. Consequently, the list_del() near the end should not be
executed and so the GPF should not occur.
In particular, do these subroutines get called more than once?
Alan Stern
On Wed, Jun 7, 2017 at 11:20 PM, Alan Stern <[email protected]> wrote:
> On Wed, 7 Jun 2017, Andrey Konovalov wrote:
>
>> On Wed, Jun 7, 2017 at 4:43 PM, Alan Stern <[email protected]> wrote:
>> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
>> >
>> >> Hi,
>> >>
>> >> I've got the following error report while fuzzing the kernel with syzkaller.
>> >>
>> >> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
>> >>
>> >> This looks quite similar to
>> >> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
>> >
>> > It does look very similar, but that problem was supposed to have been
>> > fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
>> > of usb_gadget_probe_driver()").
>> >
>> >> I'm able to reproduce this, so I can collect some debug traces if needed.
>> >
>> > Can you provide an strace or the equivalent?
>>
>> Here's the syzkaller program (which is actually two programs executed
>> consequently):
>> https://gist.github.com/xairy/fe0a7531e00df5e8bc23e2e56e413510
>>
>> Here's the strace log:
>> https://gist.github.com/xairy/5fadc3b5d8b2b80c97e566538de08bc4
>
> Do you know which of the two programs got the GPF? I can't tell from
> the strace log.
>
>> Unfortunately there's a lot of unrelated garbage, but I can't extract
>> a simple C reproducer.
>
> That's okay, it's easy enough to see what's going on. One program
> opens /dev/gadget/dummy_udc, writes an invalid setup string, then
> writes a valid setup string, and then exits. The other program just
> opens the file and then exits.
>
>> I can also apply patches with debug printk's, run the reproducer and
>> send you the result if that will help.
I've extract another crash log, which is a little simpler:
https://gist.github.com/xairy/b8c814cbd731e4632e8e8fa0f51a29e8
>
> Maybe you can patch usb_gadget_probe_driver() in
> drivers/usb/gadget/udc/core.c. Find out whether the "if
> (!driver->match_existing_only)" test is executed and whether it
> succeeds, and find out whether the code following "found:" is executed.
> I would expect that the test is not executed and the jump to "found:"
> is taken, so udc_bind_to_driver() is called and returns 0. Thus,
> udc->driver should be set to driver.
Here's the funcgraph for usb_gadget_probe_driver:
https://gist.github.com/xairy/3221e2cb9c59514880d24c955de30b80
The (!driver->match_existing_only) test is not executed.
The code following "found:" is executed.
>
> Also, in usb_gadget_unregister_driver(), in the list_for_each_entry()
> loop, we should have udc->driver == driver and therefore ret should get
> set to 0. Consequently, the list_del() near the end should not be
> executed and so the GPF should not occur.
Here's the funcgraph for usb_gadget_unregister_driver:
https://gist.github.com/xairy/887c52a12af8c9f9fe8ba3e4fa0ef1f0
What you described happens during the first call of
usb_gadget_unregister_driver(), however there's another one after
that, which is probably triggered by the second program.
>
> In particular, do these subroutines get called more than once?
usb_gadget_unregister_driver() is called twice, the GPF happens during
the second call.
>
> Alan Stern
>
On Thu, 8 Jun 2017, Andrey Konovalov wrote:
> On Wed, Jun 7, 2017 at 11:20 PM, Alan Stern <[email protected]> wrote:
> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
> >
> >> On Wed, Jun 7, 2017 at 4:43 PM, Alan Stern <[email protected]> wrote:
> >> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> I've got the following error report while fuzzing the kernel with syzkaller.
> >> >>
> >> >> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
> >> >>
> >> >> This looks quite similar to
> >> >> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
> >> >
> >> > It does look very similar, but that problem was supposed to have been
> >> > fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
> >> > of usb_gadget_probe_driver()").
> >> >
> >> >> I'm able to reproduce this, so I can collect some debug traces if needed.
> >> >
> >> > Can you provide an strace or the equivalent?
> >>
> >> Here's the syzkaller program (which is actually two programs executed
> >> consequently):
> >> https://gist.github.com/xairy/fe0a7531e00df5e8bc23e2e56e413510
> >>
> >> Here's the strace log:
> >> https://gist.github.com/xairy/5fadc3b5d8b2b80c97e566538de08bc4
> >
> > Do you know which of the two programs got the GPF? I can't tell from
> > the strace log.
> >
> >> Unfortunately there's a lot of unrelated garbage, but I can't extract
> >> a simple C reproducer.
> >
> > That's okay, it's easy enough to see what's going on. One program
> > opens /dev/gadget/dummy_udc, writes an invalid setup string, then
> > writes a valid setup string, and then exits. The other program just
> > opens the file and then exits.
> >
> >> I can also apply patches with debug printk's, run the reproducer and
> >> send you the result if that will help.
>
> I've extract another crash log, which is a little simpler:
> https://gist.github.com/xairy/b8c814cbd731e4632e8e8fa0f51a29e8
>
> >
> > Maybe you can patch usb_gadget_probe_driver() in
> > drivers/usb/gadget/udc/core.c. Find out whether the "if
> > (!driver->match_existing_only)" test is executed and whether it
> > succeeds, and find out whether the code following "found:" is executed.
> > I would expect that the test is not executed and the jump to "found:"
> > is taken, so udc_bind_to_driver() is called and returns 0. Thus,
> > udc->driver should be set to driver.
>
> Here's the funcgraph for usb_gadget_probe_driver:
> https://gist.github.com/xairy/3221e2cb9c59514880d24c955de30b80
>
> The (!driver->match_existing_only) test is not executed.
> The code following "found:" is executed.
>
> >
> > Also, in usb_gadget_unregister_driver(), in the list_for_each_entry()
> > loop, we should have udc->driver == driver and therefore ret should get
> > set to 0. Consequently, the list_del() near the end should not be
> > executed and so the GPF should not occur.
>
> Here's the funcgraph for usb_gadget_unregister_driver:
> https://gist.github.com/xairy/887c52a12af8c9f9fe8ba3e4fa0ef1f0
>
> What you described happens during the first call of
> usb_gadget_unregister_driver(), however there's another one after
> that, which is probably triggered by the second program.
>
> >
> > In particular, do these subroutines get called more than once?
>
> usb_gadget_unregister_driver() is called twice, the GPF happens during
> the second call.
Good, that's definitive. And I feel stupid for missing this bug.
The patch is below.
Alan Stern
Index: usb-4.x/drivers/usb/gadget/legacy/inode.c
===================================================================
--- usb-4.x.orig/drivers/usb/gadget/legacy/inode.c
+++ usb-4.x/drivers/usb/gadget/legacy/inode.c
@@ -1183,8 +1183,10 @@ dev_release (struct inode *inode, struct
/* closing ep0 === shutdown all */
- if (dev->gadget_registered)
+ if (dev->gadget_registered) {
usb_gadget_unregister_driver (&gadgetfs_driver);
+ dev->gadget_registered = false;
+ }
/* at this point "good" hardware has disconnected the
* device from USB; the host won't see it any more.
On Thu, Jun 8, 2017 at 5:55 PM, Alan Stern <[email protected]> wrote:
> On Thu, 8 Jun 2017, Andrey Konovalov wrote:
>
>> On Wed, Jun 7, 2017 at 11:20 PM, Alan Stern <[email protected]> wrote:
>> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
>> >
>> >> On Wed, Jun 7, 2017 at 4:43 PM, Alan Stern <[email protected]> wrote:
>> >> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> I've got the following error report while fuzzing the kernel with syzkaller.
>> >> >>
>> >> >> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
>> >> >>
>> >> >> This looks quite similar to
>> >> >> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
>> >> >
>> >> > It does look very similar, but that problem was supposed to have been
>> >> > fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
>> >> > of usb_gadget_probe_driver()").
>> >> >
>> >> >> I'm able to reproduce this, so I can collect some debug traces if needed.
>> >> >
>> >> > Can you provide an strace or the equivalent?
>> >>
>> >> Here's the syzkaller program (which is actually two programs executed
>> >> consequently):
>> >> https://gist.github.com/xairy/fe0a7531e00df5e8bc23e2e56e413510
>> >>
>> >> Here's the strace log:
>> >> https://gist.github.com/xairy/5fadc3b5d8b2b80c97e566538de08bc4
>> >
>> > Do you know which of the two programs got the GPF? I can't tell from
>> > the strace log.
>> >
>> >> Unfortunately there's a lot of unrelated garbage, but I can't extract
>> >> a simple C reproducer.
>> >
>> > That's okay, it's easy enough to see what's going on. One program
>> > opens /dev/gadget/dummy_udc, writes an invalid setup string, then
>> > writes a valid setup string, and then exits. The other program just
>> > opens the file and then exits.
>> >
>> >> I can also apply patches with debug printk's, run the reproducer and
>> >> send you the result if that will help.
>>
>> I've extract another crash log, which is a little simpler:
>> https://gist.github.com/xairy/b8c814cbd731e4632e8e8fa0f51a29e8
>>
>> >
>> > Maybe you can patch usb_gadget_probe_driver() in
>> > drivers/usb/gadget/udc/core.c. Find out whether the "if
>> > (!driver->match_existing_only)" test is executed and whether it
>> > succeeds, and find out whether the code following "found:" is executed.
>> > I would expect that the test is not executed and the jump to "found:"
>> > is taken, so udc_bind_to_driver() is called and returns 0. Thus,
>> > udc->driver should be set to driver.
>>
>> Here's the funcgraph for usb_gadget_probe_driver:
>> https://gist.github.com/xairy/3221e2cb9c59514880d24c955de30b80
>>
>> The (!driver->match_existing_only) test is not executed.
>> The code following "found:" is executed.
>>
>> >
>> > Also, in usb_gadget_unregister_driver(), in the list_for_each_entry()
>> > loop, we should have udc->driver == driver and therefore ret should get
>> > set to 0. Consequently, the list_del() near the end should not be
>> > executed and so the GPF should not occur.
>>
>> Here's the funcgraph for usb_gadget_unregister_driver:
>> https://gist.github.com/xairy/887c52a12af8c9f9fe8ba3e4fa0ef1f0
>>
>> What you described happens during the first call of
>> usb_gadget_unregister_driver(), however there's another one after
>> that, which is probably triggered by the second program.
>>
>> >
>> > In particular, do these subroutines get called more than once?
>>
>> usb_gadget_unregister_driver() is called twice, the GPF happens during
>> the second call.
>
> Good, that's definitive. And I feel stupid for missing this bug.
> The patch is below.
Perfect, this fixes the issue, thanks!
>
> Alan Stern
>
>
>
> Index: usb-4.x/drivers/usb/gadget/legacy/inode.c
> ===================================================================
> --- usb-4.x.orig/drivers/usb/gadget/legacy/inode.c
> +++ usb-4.x/drivers/usb/gadget/legacy/inode.c
> @@ -1183,8 +1183,10 @@ dev_release (struct inode *inode, struct
>
> /* closing ep0 === shutdown all */
>
> - if (dev->gadget_registered)
> + if (dev->gadget_registered) {
> usb_gadget_unregister_driver (&gadgetfs_driver);
> + dev->gadget_registered = false;
> + }
>
> /* at this point "good" hardware has disconnected the
> * device from USB; the host won't see it any more.
>