2022-02-23 16:36:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [syzbot] KASAN: use-after-free Read in dev_uevent

On Wed, Feb 23, 2022 at 09:38:20AM -0500, [email protected] wrote:
> On Wed, Feb 23, 2022 at 12:29:03PM +0100, [email protected] wrote:
> > On Wed, Feb 23, 2022 at 11:17:07AM +0000, Zhang, Qiang1 wrote:
> > >
> > > syzbot has found a reproducer for the following issue on:
> > >
> > > HEAD commit: 4f12b742eb2b Merge tag 'nfs-for-5.17-3' of git://git.linux..
> > > git tree: upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=110a6df2700000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=f6a069ed94a1ed1d
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=348b571beb5eeb70a582
> > > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12377296700000
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: [email protected]
> > >
> > > ==================================================================
> > > BUG: KASAN: use-after-free in dev_uevent+0x712/0x780 drivers/base/core.c:2320 Read of size 8 at addr ffff88802b934098 by task udevd/3689
> > >
> > > CPU: 2 PID: 3689 Comm: udevd Not tainted 5.17.0-rc4-syzkaller-00229-g4f12b742eb2b #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 Call Trace:
> > > <TASK>
> > > __dump_stack lib/dump_stack.c:88 [inline]
> > > dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
> > > print_address_description.constprop.0.cold+0x8d/0x303 mm/kasan/report.c:255 __kasan_report mm/kasan/report.c:442 [inline] kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
> > > dev_uevent+0x712/0x780 drivers/base/core.c:2320
> > > uevent_show+0x1b8/0x380 drivers/base/core.c:2391
> > > dev_attr_show+0x4b/0x90 drivers/base/core.c:2094
> > > sysfs_kf_seq_show+0x219/0x3d0 fs/sysfs/file.c:59
> > > seq_read_iter+0x4f5/0x1280 fs/seq_file.c:230
> > > kernfs_fop_read_iter+0x514/0x6f0 fs/kernfs/file.c:241 call_read_iter include/linux/fs.h:2068 [inline]
> > > new_sync_read+0x429/0x6e0 fs/read_write.c:400
> > > vfs_read+0x35c/0x600 fs/read_write.c:481
> > > ksys_read+0x12d/0x250 fs/read_write.c:619
> > > do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > > do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > RIP: 0033:0x7f964cc558fe
> > > Code: c0 e9 e6 fe ff ff 50 48 8d 3d 0e c7 09 00 e8 c9 cf 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
> > > RSP: 002b:00007ffc0133d258 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> > > RAX: ffffffffffffffda RBX: 000056497b21a140 RCX: 00007f964cc558fe
> > > RDX: 0000000000001000 RSI: 000056497b218650 RDI: 0000000000000008
> > > RBP: 00007f964cd22380 R08: 0000000000000008 R09: 00007f964cd25a60
> > > R10: 0000000000000008 R11: 0000000000000246 R12: 000056497b21a140
> > > R13: 0000000000000d68 R14: 00007f964cd21780 R15: 0000000000000d68 </TASK>
> > >
> > > Cc: Alan Stern
> > > Felipe Balbi
> > >
> > > Hello syzbot, Please try it:
> > >
> > > From 574d45ff924e2d2f9b9f5cc3e846f8004498a811 Mon Sep 17 00:00:00 2001
> > > From: Zqiang <[email protected]>
> > > Date: Wed, 23 Feb 2022 18:18:22 +0800
> > > Subject: [PATCH] driver core: Fix use-after-free in dev_uevent()
> > >
> > > In dev_uevent(), if the "dev->driver" is valid, the "dev->driver->name"
> > > be accessed, there may be a window period between these two operations.
> >
> > There should not be any such window period. The bus locks should
> > prevent this, unless some driver is doing odd things with the pointers
> > that it should not be doing.
>
> Which bus locks are you referring to? I'm not aware of any locks that
> synchronize dev_uevent() with anything (in particular, with driver
> unbinding).

The locks in the driver core that handle the binding and unbinding of
drivers to devices.

> And as far as I know, usb_gadget_remove_driver() doesn't play any odd
> tricks with pointers.

Ah, I never noticed that this is doing a "fake" bus and does the
bind/unbind itself outside of the driver core. It should just be a
normal bus type and have the core do the work for it, but oh well.

And there is a lock that should serialize all of this already, so it's
odd that this is able to be triggered at all.

Unless the device is being removed at the same time it was manually
unbound from the driver? If so, then this really should be fixed up to
use the driver core logic instead...

thanks,

greg k-h


2022-02-24 00:41:24

by Alan Stern

[permalink] [raw]
Subject: Re: [syzbot] KASAN: use-after-free Read in dev_uevent

On Wed, Feb 23, 2022 at 05:00:12PM +0100, [email protected] wrote:
> On Wed, Feb 23, 2022 at 09:38:20AM -0500, [email protected] wrote:
> > Which bus locks are you referring to? I'm not aware of any locks that
> > synchronize dev_uevent() with anything (in particular, with driver
> > unbinding).
>
> The locks in the driver core that handle the binding and unbinding of
> drivers to devices.
>
> > And as far as I know, usb_gadget_remove_driver() doesn't play any odd
> > tricks with pointers.
>
> Ah, I never noticed that this is doing a "fake" bus and does the
> bind/unbind itself outside of the driver core. It should just be a
> normal bus type and have the core do the work for it, but oh well.
>
> And there is a lock that should serialize all of this already, so it's
> odd that this is able to be triggered at all.

I guess at a minimum the UDC core should hold the device lock when it
registers, unregisters, binds, or unbinds UDC and gadget devices.
Would that be enough to fix the problem? I really don't understand how
sysfs file access gets synchronized with device removal.

> Unless the device is being removed at the same time it was manually
> unbound from the driver? If so, then this really should be fixed up to
> use the driver core logic instead...

Device removal does of course trigger unbinding, but they always take
place in the same thread so it isn't an issue.

Probably part of the reason people don't want to use the driver core
here is so that they can specify which UDC a gadget driver should bind
to. The driver core would always bind each new gadget to the first
registered gadget driver.

When Dave Brownell originally wrote the gadget subsystem, I believe he
didn't bother to integrate it with the driver core because it was a
"bus" with only a single device and a single driver. The ability to
have multiple UDCs in the system was added later.

Alan Stern

2022-02-24 02:15:23

by Zqiang

[permalink] [raw]
Subject: RE: [syzbot] KASAN: use-after-free Read in dev_uevent

On Wed, Feb 23, 2022 at 05:00:12PM +0100, [email protected] wrote:
> On Wed, Feb 23, 2022 at 09:38:20AM -0500, [email protected] wrote:
> > Which bus locks are you referring to? I'm not aware of any locks
> > that synchronize dev_uevent() with anything (in particular, with
> > driver unbinding).
>
> The locks in the driver core that handle the binding and unbinding of
> drivers to devices.
>
> > And as far as I know, usb_gadget_remove_driver() doesn't play any
> > odd tricks with pointers.
>
> Ah, I never noticed that this is doing a "fake" bus and does the
> bind/unbind itself outside of the driver core. It should just be a
> normal bus type and have the core do the work for it, but oh well.
>
> And there is a lock that should serialize all of this already, so it's
> odd that this is able to be triggered at all.

>>I guess at a minimum the UDC core should hold the device lock when it registers, unregisters, binds, or unbinds UDC and gadget devices.
>>Would that be enough to fix the problem? I really don't understand how sysfs file access gets synchronized with device removal.


Agree with you, in usb_gadget_remove_driver() function, the udc->dev.driver and udc->gadget->dev.driver be set to null
without any protection, so when the udevd accessed the dev->driver, this address may be invalid at this time.
maybe the operation of dev->driver can be protected by device_lock().

Thanks,
Zqiang


> Unless the device is being removed at the same time it was manually
> unbound from the driver? If so, then this really should be fixed up
> to use the driver core logic instead...
>>
>>Device removal does of course trigger unbinding, but they always take place in the same thread so it isn't an issue.
>>
>>Probably part of the reason people don't want to use the driver core here is so that they can specify which UDC a gadget driver should bind to. The driver core would always bind each new gadget to the first registered gadget driver.
>>
>>When Dave Brownell originally wrote the gadget subsystem, I believe he didn't bother to integrate it with the driver core because it was a "bus" with only a single device and a single driver. The ability to have multiple UDCs in the system was added later.
>>
>>Alan Stern

2022-02-24 03:15:21

by Zqiang

[permalink] [raw]
Subject: RE: [syzbot] KASAN: use-after-free Read in dev_uevent


On Wed, Feb 23, 2022 at 05:00:12PM +0100, [email protected] wrote:
> On Wed, Feb 23, 2022 at 09:38:20AM -0500, [email protected] wrote:
> > Which bus locks are you referring to? I'm not aware of any locks
> > that synchronize dev_uevent() with anything (in particular, with
> > driver unbinding).
>
> The locks in the driver core that handle the binding and unbinding of
> drivers to devices.
>
> > And as far as I know, usb_gadget_remove_driver() doesn't play any
> > odd tricks with pointers.
>
> Ah, I never noticed that this is doing a "fake" bus and does the
> bind/unbind itself outside of the driver core. It should just be a
> normal bus type and have the core do the work for it, but oh well.
>
> And there is a lock that should serialize all of this already, so it's
> odd that this is able to be triggered at all.

>>I guess at a minimum the UDC core should hold the device lock when it registers, unregisters, binds, or unbinds UDC and gadget devices.
>>Would that be enough to fix the problem? I really don't understand how sysfs file access gets synchronized with device removal.

>>>
>>>Agree with you, in usb_gadget_remove_driver() function, the udc->dev.driver and udc->gadget->dev.driver be set to null without any protection, so when the udevd accessed the dev->driver, this address may be invalid at this time.
>>>maybe the operation of dev->driver can be protected by device_lock().
>>>

Is it enough that we just need to protect "dev.driver" ?

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 3d6430eb0c6a..9bd2624973d7 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2316,8 +2316,10 @@ static int dev_uevent(struct kobject *kobj, struct kobj_uevent_env *env)
if (dev->type && dev->type->name)
add_uevent_var(env, "DEVTYPE=%s", dev->type->name);

+ device_lock(dev);
if (dev->driver)
add_uevent_var(env, "DRIVER=%s", dev->driver->name);
+ device_unlock(dev);

/* Add common DT information about the device */
of_device_uevent(dev, env);
diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index 568534a0d17c..7877142397d3 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -1436,8 +1436,14 @@ static void usb_gadget_remove_driver(struct usb_udc *udc)
usb_gadget_udc_stop(udc);

udc->driver = NULL;
+
+ device_lock(&udc->dev);
udc->dev.driver = NULL;
+ device_unlock(&udc->dev);
+
+ device_lock(&udc->gadget->dev);
udc->gadget->dev.driver = NULL;
+ device_unlock(&udc->gadget->dev);
}

/**
@@ -1498,8 +1504,14 @@ static int udc_bind_to_driver(struct usb_udc *udc, struct usb_gadget_driver *dri
driver->function);

udc->driver = driver;
+
+ device_lock(&udc->dev);
udc->dev.driver = &driver->driver;
+ device_unlock(&udc->dev);
+
+ device_lock(&udc->gadget->dev);
udc->gadget->dev.driver = &driver->driver;
+ device_unlock(&udc->gadget->dev);

usb_gadget_udc_set_speed(udc, driver->max_speed);

@@ -1521,8 +1533,14 @@ static int udc_bind_to_driver(struct usb_udc *udc, struct usb_gadget_driver *dri
dev_err(&udc->dev, "failed to start %s: %d\n",
udc->driver->function, ret);
udc->driver = NULL;
+
+ device_lock(&udc->dev);
udc->dev.driver = NULL;
+ device_unlock(&udc->dev);
+
+ device_lock(&udc->gadget->dev);
udc->gadget->dev.driver = NULL;
+ device_unlock(&udc->gadget->dev);
return ret;
}

Thanks,
Zqiang


>>>Thanks,
>>>Zqiang


> Unless the device is being removed at the same time it was manually
> unbound from the driver? If so, then this really should be fixed up
> to use the driver core logic instead...
>>
>>Device removal does of course trigger unbinding, but they always take place in the same thread so it isn't an issue.
>>
>>Probably part of the reason people don't want to use the driver core here is so that they can specify which UDC a gadget driver should bind to. The driver core would always bind each new gadget to the first registered gadget driver.
>>
>>When Dave Brownell originally wrote the gadget subsystem, I believe he didn't bother to integrate it with the driver core because it was a "bus" with only a single device and a single driver. The ability to have multiple UDCs in the system was added later.
>>
>>Alan Stern

2022-02-25 10:46:37

by Alan Stern

[permalink] [raw]
Subject: Re: [syzbot] KASAN: use-after-free Read in dev_uevent

On Thu, Feb 24, 2022 at 03:14:54AM +0000, Zhang, Qiang1 wrote:
>
> On Wed, Feb 23, 2022 at 05:00:12PM +0100, [email protected] wrote:
> > On Wed, Feb 23, 2022 at 09:38:20AM -0500, [email protected] wrote:
> > > Which bus locks are you referring to? I'm not aware of any locks
> > > that synchronize dev_uevent() with anything (in particular, with
> > > driver unbinding).
> >
> > The locks in the driver core that handle the binding and unbinding of
> > drivers to devices.
> >
> > > And as far as I know, usb_gadget_remove_driver() doesn't play any
> > > odd tricks with pointers.
> >
> > Ah, I never noticed that this is doing a "fake" bus and does the
> > bind/unbind itself outside of the driver core. It should just be a
> > normal bus type and have the core do the work for it, but oh well.
> >
> > And there is a lock that should serialize all of this already, so it's
> > odd that this is able to be triggered at all.
>
> >>I guess at a minimum the UDC core should hold the device lock when it registers, unregisters, binds, or unbinds UDC and gadget devices.
> >>Would that be enough to fix the problem? I really don't understand how sysfs file access gets synchronized with device removal.
>
> >>>
> >>>Agree with you, in usb_gadget_remove_driver() function, the udc->dev.driver and udc->gadget->dev.driver be set to null without any protection, so when the udevd accessed the dev->driver, this address may be invalid at this time.
> >>>maybe the operation of dev->driver can be protected by device_lock().
> >>>
>
> Is it enough that we just need to protect "dev.driver" ?

I don't know, although I doubt it. The right way to fix it is to make
sure that the existing protections, which apply to drivers that are
registered in the driver core, can also work properly with gadgets. But
I don't know what those protections are or how they work.

> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 3d6430eb0c6a..9bd2624973d7 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2316,8 +2316,10 @@ static int dev_uevent(struct kobject *kobj, struct kobj_uevent_env *env)
> if (dev->type && dev->type->name)
> add_uevent_var(env, "DEVTYPE=%s", dev->type->name);
>
> + device_lock(dev);
> if (dev->driver)
> add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> + device_unlock(dev);

You probably should not do this. Unless there's a serious bug, the
driver core already takes all the locks it needs. Doing this might
cause a deadlock (because the caller may already hold the device lock).

>
> /* Add common DT information about the device */
> of_device_uevent(dev, env);
> diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
> index 568534a0d17c..7877142397d3 100644
> --- a/drivers/usb/gadget/udc/core.c
> +++ b/drivers/usb/gadget/udc/core.c
> @@ -1436,8 +1436,14 @@ static void usb_gadget_remove_driver(struct usb_udc *udc)
> usb_gadget_udc_stop(udc);
>
> udc->driver = NULL;
> +
> + device_lock(&udc->dev);
> udc->dev.driver = NULL;
> + device_unlock(&udc->dev);
> +
> + device_lock(&udc->gadget->dev);
> udc->gadget->dev.driver = NULL;
> + device_unlock(&udc->gadget->dev);
> }

These are reasonable things to do, but I don't know if they will fix the
problem.

We really should ask advice from somebody who understands how this stuff
is supposed to work. I'm not sure who to ask, though. Tejun Heo,
perhaps (CC'ed).

Tejun: The USB Gadget core binds and unbinds drivers without using the
normal driver core facilities (see the code in
usb_gadget_remove_driver() above). As a result, unbinding races with
uevent generation, which can lead to a NULL pointer dereference as found
by syzbot testing. In particular, dev->driver can become NULL between
the times when dev_uevent() tests it and uses it (see above).

Can you tell us how this should be fixed?

Alan Stern