2017-06-07 13:39:55

by Andrey Konovalov

[permalink] [raw]
Subject: usb/gadget: another GPF in usb_gadget_unregister_driver

Hi,

I've got the following error report while fuzzing the kernel with syzkaller.

On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).

This looks quite similar to
https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI

I'm able to reproduce this, so I can collect some debug traces if needed.

kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 2 PID: 4820 Comm: syz-executor0 Not tainted 4.12.0-rc4+ #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
task: ffff880039542dc0 task.stack: ffff88003bdd0000
RIP: 0010:__list_del_entry_valid+0x7e/0x170 lib/list_debug.c:51
RSP: 0018:ffff88003bdd6e50 EFLAGS: 00010246
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000010000
RDX: 0000000000000000 RSI: ffffffff86504948 RDI: ffffffff86504950
RBP: ffff88003bdd6e68 R08: ffff880039542dc0 R09: ffffffff8778ce00
R10: ffff88003bdd6e68 R11: dffffc0000000000 R12: 0000000000000000
R13: dffffc0000000000 R14: 1ffff100077badd2 R15: ffffffff864d2e40
FS: 0000000000000000(0000) GS:ffff88006dc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000002014aff9 CR3: 0000000006022000 CR4: 00000000000006e0
Call Trace:
__list_del_entry include/linux/list.h:116 [inline]
list_del include/linux/list.h:124 [inline]
usb_gadget_unregister_driver+0x166/0x4c0 drivers/usb/gadget/udc/core.c:1387
dev_release+0x80/0x160 drivers/usb/gadget/legacy/inode.c:1187
__fput+0x332/0x7f0 fs/file_table.c:209
____fput+0x15/0x20 fs/file_table.c:245
task_work_run+0x19b/0x270 kernel/task_work.c:116
exit_task_work include/linux/task_work.h:21 [inline]
do_exit+0x18a3/0x2820 kernel/exit.c:878
do_group_exit+0x149/0x420 kernel/exit.c:982
get_signal+0x77f/0x1780 kernel/signal.c:2318
do_signal+0xd2/0x2130 arch/x86/kernel/signal.c:808
exit_to_usermode_loop+0x1a7/0x240 arch/x86/entry/common.c:157
prepare_exit_to_usermode arch/x86/entry/common.c:194 [inline]
syscall_return_slowpath+0x3ba/0x410 arch/x86/entry/common.c:263
entry_SYSCALL_64_fastpath+0xbc/0xbe
RIP: 0033:0x4461f9
RSP: 002b:00007fdac2b1ecf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00000000007080c8 RCX: 00000000004461f9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000007080c8
RBP: 00000000007080a8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007fdac2b1f9c0 R15: 00007fdac2b1f700
Code: 00 00 00 00 ad de 49 39 c4 74 6a 48 b8 00 02 00 00 00 00 ad de
48 89 da 48 39 c3 74 74 48 c1 ea 03 48 b8 00 00 00 00 00 fc ff df <80>
3c 02 00 0f 85 92 00 00 00 48 8b 13 48 39 f2 75 66 49 8d 7c
RIP: __list_del_entry_valid+0x7e/0x170 lib/list_debug.c:51 RSP: ffff88003bdd6e50
---[ end trace 30e94b1eec4831c8 ]---
Kernel panic - not syncing: Fatal exception
Dumping ftrace buffer:
(ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..


2017-06-07 14:43:18

by Alan Stern

[permalink] [raw]
Subject: Re: usb/gadget: another GPF in usb_gadget_unregister_driver

On Wed, 7 Jun 2017, Andrey Konovalov wrote:

> Hi,
>
> I've got the following error report while fuzzing the kernel with syzkaller.
>
> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
>
> This looks quite similar to
> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI

It does look very similar, but that problem was supposed to have been
fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
of usb_gadget_probe_driver()").

> I'm able to reproduce this, so I can collect some debug traces if needed.

Can you provide an strace or the equivalent?

Alan Stern

2017-06-07 15:15:46

by Andrey Konovalov

[permalink] [raw]
Subject: Re: usb/gadget: another GPF in usb_gadget_unregister_driver

On Wed, Jun 7, 2017 at 4:43 PM, Alan Stern <[email protected]> wrote:
> On Wed, 7 Jun 2017, Andrey Konovalov wrote:
>
>> Hi,
>>
>> I've got the following error report while fuzzing the kernel with syzkaller.
>>
>> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
>>
>> This looks quite similar to
>> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
>
> It does look very similar, but that problem was supposed to have been
> fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
> of usb_gadget_probe_driver()").
>
>> I'm able to reproduce this, so I can collect some debug traces if needed.
>
> Can you provide an strace or the equivalent?

Here's the syzkaller program (which is actually two programs executed
consequently):
https://gist.github.com/xairy/fe0a7531e00df5e8bc23e2e56e413510

Here's the strace log:
https://gist.github.com/xairy/5fadc3b5d8b2b80c97e566538de08bc4

Unfortunately there's a lot of unrelated garbage, but I can't extract
a simple C reproducer.

I can also apply patches with debug printk's, run the reproducer and
send you the result if that will help.

>
> Alan Stern
>

2017-06-07 21:20:38

by Alan Stern

[permalink] [raw]
Subject: Re: usb/gadget: another GPF in usb_gadget_unregister_driver

On Wed, 7 Jun 2017, Andrey Konovalov wrote:

> On Wed, Jun 7, 2017 at 4:43 PM, Alan Stern <[email protected]> wrote:
> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
> >
> >> Hi,
> >>
> >> I've got the following error report while fuzzing the kernel with syzkaller.
> >>
> >> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
> >>
> >> This looks quite similar to
> >> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
> >
> > It does look very similar, but that problem was supposed to have been
> > fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
> > of usb_gadget_probe_driver()").
> >
> >> I'm able to reproduce this, so I can collect some debug traces if needed.
> >
> > Can you provide an strace or the equivalent?
>
> Here's the syzkaller program (which is actually two programs executed
> consequently):
> https://gist.github.com/xairy/fe0a7531e00df5e8bc23e2e56e413510
>
> Here's the strace log:
> https://gist.github.com/xairy/5fadc3b5d8b2b80c97e566538de08bc4

Do you know which of the two programs got the GPF? I can't tell from
the strace log.

> Unfortunately there's a lot of unrelated garbage, but I can't extract
> a simple C reproducer.

That's okay, it's easy enough to see what's going on. One program
opens /dev/gadget/dummy_udc, writes an invalid setup string, then
writes a valid setup string, and then exits. The other program just
opens the file and then exits.

> I can also apply patches with debug printk's, run the reproducer and
> send you the result if that will help.

Maybe you can patch usb_gadget_probe_driver() in
drivers/usb/gadget/udc/core.c. Find out whether the "if
(!driver->match_existing_only)" test is executed and whether it
succeeds, and find out whether the code following "found:" is executed.
I would expect that the test is not executed and the jump to "found:"
is taken, so udc_bind_to_driver() is called and returns 0. Thus,
udc->driver should be set to driver.

Also, in usb_gadget_unregister_driver(), in the list_for_each_entry()
loop, we should have udc->driver == driver and therefore ret should get
set to 0. Consequently, the list_del() near the end should not be
executed and so the GPF should not occur.

In particular, do these subroutines get called more than once?

Alan Stern

2017-06-08 11:41:03

by Andrey Konovalov

[permalink] [raw]
Subject: Re: usb/gadget: another GPF in usb_gadget_unregister_driver

On Wed, Jun 7, 2017 at 11:20 PM, Alan Stern <[email protected]> wrote:
> On Wed, 7 Jun 2017, Andrey Konovalov wrote:
>
>> On Wed, Jun 7, 2017 at 4:43 PM, Alan Stern <[email protected]> wrote:
>> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
>> >
>> >> Hi,
>> >>
>> >> I've got the following error report while fuzzing the kernel with syzkaller.
>> >>
>> >> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
>> >>
>> >> This looks quite similar to
>> >> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
>> >
>> > It does look very similar, but that problem was supposed to have been
>> > fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
>> > of usb_gadget_probe_driver()").
>> >
>> >> I'm able to reproduce this, so I can collect some debug traces if needed.
>> >
>> > Can you provide an strace or the equivalent?
>>
>> Here's the syzkaller program (which is actually two programs executed
>> consequently):
>> https://gist.github.com/xairy/fe0a7531e00df5e8bc23e2e56e413510
>>
>> Here's the strace log:
>> https://gist.github.com/xairy/5fadc3b5d8b2b80c97e566538de08bc4
>
> Do you know which of the two programs got the GPF? I can't tell from
> the strace log.
>
>> Unfortunately there's a lot of unrelated garbage, but I can't extract
>> a simple C reproducer.
>
> That's okay, it's easy enough to see what's going on. One program
> opens /dev/gadget/dummy_udc, writes an invalid setup string, then
> writes a valid setup string, and then exits. The other program just
> opens the file and then exits.
>
>> I can also apply patches with debug printk's, run the reproducer and
>> send you the result if that will help.

I've extract another crash log, which is a little simpler:
https://gist.github.com/xairy/b8c814cbd731e4632e8e8fa0f51a29e8

>
> Maybe you can patch usb_gadget_probe_driver() in
> drivers/usb/gadget/udc/core.c. Find out whether the "if
> (!driver->match_existing_only)" test is executed and whether it
> succeeds, and find out whether the code following "found:" is executed.
> I would expect that the test is not executed and the jump to "found:"
> is taken, so udc_bind_to_driver() is called and returns 0. Thus,
> udc->driver should be set to driver.

Here's the funcgraph for usb_gadget_probe_driver:
https://gist.github.com/xairy/3221e2cb9c59514880d24c955de30b80

The (!driver->match_existing_only) test is not executed.
The code following "found:" is executed.

>
> Also, in usb_gadget_unregister_driver(), in the list_for_each_entry()
> loop, we should have udc->driver == driver and therefore ret should get
> set to 0. Consequently, the list_del() near the end should not be
> executed and so the GPF should not occur.

Here's the funcgraph for usb_gadget_unregister_driver:
https://gist.github.com/xairy/887c52a12af8c9f9fe8ba3e4fa0ef1f0

What you described happens during the first call of
usb_gadget_unregister_driver(), however there's another one after
that, which is probably triggered by the second program.

>
> In particular, do these subroutines get called more than once?

usb_gadget_unregister_driver() is called twice, the GPF happens during
the second call.

>
> Alan Stern
>

2017-06-08 15:55:16

by Alan Stern

[permalink] [raw]
Subject: Re: usb/gadget: another GPF in usb_gadget_unregister_driver

On Thu, 8 Jun 2017, Andrey Konovalov wrote:

> On Wed, Jun 7, 2017 at 11:20 PM, Alan Stern <[email protected]> wrote:
> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
> >
> >> On Wed, Jun 7, 2017 at 4:43 PM, Alan Stern <[email protected]> wrote:
> >> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> I've got the following error report while fuzzing the kernel with syzkaller.
> >> >>
> >> >> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
> >> >>
> >> >> This looks quite similar to
> >> >> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
> >> >
> >> > It does look very similar, but that problem was supposed to have been
> >> > fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
> >> > of usb_gadget_probe_driver()").
> >> >
> >> >> I'm able to reproduce this, so I can collect some debug traces if needed.
> >> >
> >> > Can you provide an strace or the equivalent?
> >>
> >> Here's the syzkaller program (which is actually two programs executed
> >> consequently):
> >> https://gist.github.com/xairy/fe0a7531e00df5e8bc23e2e56e413510
> >>
> >> Here's the strace log:
> >> https://gist.github.com/xairy/5fadc3b5d8b2b80c97e566538de08bc4
> >
> > Do you know which of the two programs got the GPF? I can't tell from
> > the strace log.
> >
> >> Unfortunately there's a lot of unrelated garbage, but I can't extract
> >> a simple C reproducer.
> >
> > That's okay, it's easy enough to see what's going on. One program
> > opens /dev/gadget/dummy_udc, writes an invalid setup string, then
> > writes a valid setup string, and then exits. The other program just
> > opens the file and then exits.
> >
> >> I can also apply patches with debug printk's, run the reproducer and
> >> send you the result if that will help.
>
> I've extract another crash log, which is a little simpler:
> https://gist.github.com/xairy/b8c814cbd731e4632e8e8fa0f51a29e8
>
> >
> > Maybe you can patch usb_gadget_probe_driver() in
> > drivers/usb/gadget/udc/core.c. Find out whether the "if
> > (!driver->match_existing_only)" test is executed and whether it
> > succeeds, and find out whether the code following "found:" is executed.
> > I would expect that the test is not executed and the jump to "found:"
> > is taken, so udc_bind_to_driver() is called and returns 0. Thus,
> > udc->driver should be set to driver.
>
> Here's the funcgraph for usb_gadget_probe_driver:
> https://gist.github.com/xairy/3221e2cb9c59514880d24c955de30b80
>
> The (!driver->match_existing_only) test is not executed.
> The code following "found:" is executed.
>
> >
> > Also, in usb_gadget_unregister_driver(), in the list_for_each_entry()
> > loop, we should have udc->driver == driver and therefore ret should get
> > set to 0. Consequently, the list_del() near the end should not be
> > executed and so the GPF should not occur.
>
> Here's the funcgraph for usb_gadget_unregister_driver:
> https://gist.github.com/xairy/887c52a12af8c9f9fe8ba3e4fa0ef1f0
>
> What you described happens during the first call of
> usb_gadget_unregister_driver(), however there's another one after
> that, which is probably triggered by the second program.
>
> >
> > In particular, do these subroutines get called more than once?
>
> usb_gadget_unregister_driver() is called twice, the GPF happens during
> the second call.

Good, that's definitive. And I feel stupid for missing this bug.
The patch is below.

Alan Stern



Index: usb-4.x/drivers/usb/gadget/legacy/inode.c
===================================================================
--- usb-4.x.orig/drivers/usb/gadget/legacy/inode.c
+++ usb-4.x/drivers/usb/gadget/legacy/inode.c
@@ -1183,8 +1183,10 @@ dev_release (struct inode *inode, struct

/* closing ep0 === shutdown all */

- if (dev->gadget_registered)
+ if (dev->gadget_registered) {
usb_gadget_unregister_driver (&gadgetfs_driver);
+ dev->gadget_registered = false;
+ }

/* at this point "good" hardware has disconnected the
* device from USB; the host won't see it any more.

2017-06-08 16:34:55

by Andrey Konovalov

[permalink] [raw]
Subject: Re: usb/gadget: another GPF in usb_gadget_unregister_driver

On Thu, Jun 8, 2017 at 5:55 PM, Alan Stern <[email protected]> wrote:
> On Thu, 8 Jun 2017, Andrey Konovalov wrote:
>
>> On Wed, Jun 7, 2017 at 11:20 PM, Alan Stern <[email protected]> wrote:
>> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
>> >
>> >> On Wed, Jun 7, 2017 at 4:43 PM, Alan Stern <[email protected]> wrote:
>> >> > On Wed, 7 Jun 2017, Andrey Konovalov wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> I've got the following error report while fuzzing the kernel with syzkaller.
>> >> >>
>> >> >> On commit b29794ec95c6856b316c2295904208bf11ffddd9 (4.12-rc4+).
>> >> >>
>> >> >> This looks quite similar to
>> >> >> https://groups.google.com/forum/#!topic/syzkaller/HDawLBeeORI
>> >> >
>> >> > It does look very similar, but that problem was supposed to have been
>> >> > fixed by commit 7b0173811260 ("usb: gadget: udc: core: fix return code
>> >> > of usb_gadget_probe_driver()").
>> >> >
>> >> >> I'm able to reproduce this, so I can collect some debug traces if needed.
>> >> >
>> >> > Can you provide an strace or the equivalent?
>> >>
>> >> Here's the syzkaller program (which is actually two programs executed
>> >> consequently):
>> >> https://gist.github.com/xairy/fe0a7531e00df5e8bc23e2e56e413510
>> >>
>> >> Here's the strace log:
>> >> https://gist.github.com/xairy/5fadc3b5d8b2b80c97e566538de08bc4
>> >
>> > Do you know which of the two programs got the GPF? I can't tell from
>> > the strace log.
>> >
>> >> Unfortunately there's a lot of unrelated garbage, but I can't extract
>> >> a simple C reproducer.
>> >
>> > That's okay, it's easy enough to see what's going on. One program
>> > opens /dev/gadget/dummy_udc, writes an invalid setup string, then
>> > writes a valid setup string, and then exits. The other program just
>> > opens the file and then exits.
>> >
>> >> I can also apply patches with debug printk's, run the reproducer and
>> >> send you the result if that will help.
>>
>> I've extract another crash log, which is a little simpler:
>> https://gist.github.com/xairy/b8c814cbd731e4632e8e8fa0f51a29e8
>>
>> >
>> > Maybe you can patch usb_gadget_probe_driver() in
>> > drivers/usb/gadget/udc/core.c. Find out whether the "if
>> > (!driver->match_existing_only)" test is executed and whether it
>> > succeeds, and find out whether the code following "found:" is executed.
>> > I would expect that the test is not executed and the jump to "found:"
>> > is taken, so udc_bind_to_driver() is called and returns 0. Thus,
>> > udc->driver should be set to driver.
>>
>> Here's the funcgraph for usb_gadget_probe_driver:
>> https://gist.github.com/xairy/3221e2cb9c59514880d24c955de30b80
>>
>> The (!driver->match_existing_only) test is not executed.
>> The code following "found:" is executed.
>>
>> >
>> > Also, in usb_gadget_unregister_driver(), in the list_for_each_entry()
>> > loop, we should have udc->driver == driver and therefore ret should get
>> > set to 0. Consequently, the list_del() near the end should not be
>> > executed and so the GPF should not occur.
>>
>> Here's the funcgraph for usb_gadget_unregister_driver:
>> https://gist.github.com/xairy/887c52a12af8c9f9fe8ba3e4fa0ef1f0
>>
>> What you described happens during the first call of
>> usb_gadget_unregister_driver(), however there's another one after
>> that, which is probably triggered by the second program.
>>
>> >
>> > In particular, do these subroutines get called more than once?
>>
>> usb_gadget_unregister_driver() is called twice, the GPF happens during
>> the second call.
>
> Good, that's definitive. And I feel stupid for missing this bug.
> The patch is below.

Perfect, this fixes the issue, thanks!

>
> Alan Stern
>
>
>
> Index: usb-4.x/drivers/usb/gadget/legacy/inode.c
> ===================================================================
> --- usb-4.x.orig/drivers/usb/gadget/legacy/inode.c
> +++ usb-4.x/drivers/usb/gadget/legacy/inode.c
> @@ -1183,8 +1183,10 @@ dev_release (struct inode *inode, struct
>
> /* closing ep0 === shutdown all */
>
> - if (dev->gadget_registered)
> + if (dev->gadget_registered) {
> usb_gadget_unregister_driver (&gadgetfs_driver);
> + dev->gadget_registered = false;
> + }
>
> /* at this point "good" hardware has disconnected the
> * device from USB; the host won't see it any more.
>