2012-06-11 05:14:38

by Ming Lei

[permalink] [raw]
Subject: [PATCH] driver core: fix shutdown races with probe/remove(v2)

Firstly, .shutdown callback may touch a uninitialized hardware
if dev->driver is set and .probe is not completed.

Secondly, device_shutdown() may dereference a null pointer to cause
oops when dev->driver is cleared after it is checked in
device_shutdown().

So just try to hold device lock and its parent lock(if it has) to
fix the races.

Cc: Alan Stern <[email protected]>
Cc: [email protected]
Signed-off-by: Ming Lei <[email protected]>
---
v2:
- take Alan's suggestion to use device_trylock to avoid
hanging during shutdown by buggy device or driver
- hold parent reference counter

drivers/base/core.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 346be8b..f2fc989 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1796,6 +1796,16 @@ out:
}
EXPORT_SYMBOL_GPL(device_move);

+static int __try_lock(struct device *dev)
+{
+ int i = 0;
+
+ while (!device_trylock(dev) && i++ < 100)
+ msleep(10);
+
+ return i < 100;
+}
+
/**
* device_shutdown - call ->shutdown() on each device to shutdown.
*/
@@ -1810,8 +1820,11 @@ void device_shutdown(void)
* devices offline, even as the system is shutting down.
*/
while (!list_empty(&devices_kset->list)) {
+ int nonlocked;
+
dev = list_entry(devices_kset->list.prev, struct device,
kobj.entry);
+ get_device(dev->parent);
get_device(dev);
/*
* Make sure the device is off the kset list, in the
@@ -1820,6 +1833,18 @@ void device_shutdown(void)
list_del_init(&dev->kobj.entry);
spin_unlock(&devices_kset->list_lock);

+ /* hold lock to avoid race with .probe/.release */
+ if (dev->parent && !__try_lock(dev->parent))
+ nonlocked = 2;
+ else if (!__try_lock(dev))
+ nonlocked = 1;
+ else
+ nonlocked = 0;
+
+ if (nonlocked)
+ dev_err(dev, "can't hold %slock for shutdown\n",
+ nonlocked == 1 ? "" : "parent ");
+
/* Don't allow any more runtime suspends */
pm_runtime_get_noresume(dev);
pm_runtime_barrier(dev);
@@ -1831,7 +1856,14 @@ void device_shutdown(void)
dev_dbg(dev, "shutdown\n");
dev->driver->shutdown(dev);
}
+
+ if (!nonlocked)
+ device_unlock(dev);
+ if (dev->parent && nonlocked != 2)
+ device_unlock(dev->parent);
+
put_device(dev);
+ put_device(dev->parent);

spin_lock(&devices_kset->list_lock);
}
--
1.7.9.5


2012-06-11 14:16:19

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Mon, 11 Jun 2012, Ming Lei wrote:

> @@ -1820,6 +1833,18 @@ void device_shutdown(void)
> list_del_init(&dev->kobj.entry);
> spin_unlock(&devices_kset->list_lock);
>
> + /* hold lock to avoid race with .probe/.release */
> + if (dev->parent && !__try_lock(dev->parent))
> + nonlocked = 2;
> + else if (!__try_lock(dev))
> + nonlocked = 1;
> + else
> + nonlocked = 0;
> +
> + if (nonlocked)
> + dev_err(dev, "can't hold %slock for shutdown\n",
> + nonlocked == 1 ? "" : "parent ");

Even if the parent can't be locked, you should still try to lock the
device.

Alan Stern

2012-06-11 14:43:33

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Mon, Jun 11, 2012 at 10:16 PM, Alan Stern <[email protected]> wrote:

>> +
>> + ? ? ? ? ? ? if (nonlocked)
>> + ? ? ? ? ? ? ? ? ? ? dev_err(dev, "can't hold %slock for shutdown\n",
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? nonlocked == 1 ? "" : "parent ");
>
> Even if the parent can't be locked, you should still try to lock the
> device.

I have considered doing it, but which may consume another 1sec.

Also if the parent lock has been held, it is very possibly that the
device can't be probed or removed at the same time, so just logged
the crazy thing and go ahead.

Thanks,
--
Ming Lei

2012-06-11 16:02:36

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Mon, 11 Jun 2012, Ming Lei wrote:

> On Mon, Jun 11, 2012 at 10:16 PM, Alan Stern <[email protected]> wrote:
>
> >> +
> >> + ? ? ? ? ? ? if (nonlocked)
> >> + ? ? ? ? ? ? ? ? ? ? dev_err(dev, "can't hold %slock for shutdown\n",
> >> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? nonlocked == 1 ? "" : "parent ");
> >
> > Even if the parent can't be locked, you should still try to lock the
> > device.
>
> I have considered doing it, but which may consume another 1sec.
>
> Also if the parent lock has been held, it is very possibly that the
> device can't be probed or removed at the same time, so just logged
> the crazy thing and go ahead.

Okay. Then you can add my Acked-by.

Alan Stern

2012-06-12 01:02:56

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Tue, Jun 12, 2012 at 12:02 AM, Alan Stern <[email protected]> wrote:
> On Mon, 11 Jun 2012, Ming Lei wrote:
>>
>> I have considered doing it, but which may consume another 1sec.
>>
>> Also if the parent lock has been held, it is very possibly that the
>> device can't be probed or removed at the same time, so just logged
>> the crazy thing and go ahead.
>
> Okay. ?Then you can add my Acked-by.

Thanks, I think Greg will do it.


Thanks,
--
Ming Lei

2012-06-15 22:03:49

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Mon, Jun 11, 2012 at 01:13:20PM +0800, Ming Lei wrote:
> Firstly, .shutdown callback may touch a uninitialized hardware
> if dev->driver is set and .probe is not completed.
>
> Secondly, device_shutdown() may dereference a null pointer to cause
> oops when dev->driver is cleared after it is checked in
> device_shutdown().
>
> So just try to hold device lock and its parent lock(if it has) to
> fix the races.
>
> Cc: Alan Stern <[email protected]>
> Cc: [email protected]

Why stable? Are there known systems that crash right now without this
change? I don't think we ever heard back from the original poster about
this issue as to what exactly was going wrong.


> Signed-off-by: Ming Lei <[email protected]>
> ---
> v2:
> - take Alan's suggestion to use device_trylock to avoid
> hanging during shutdown by buggy device or driver
> - hold parent reference counter
>
> drivers/base/core.c | 32 ++++++++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 346be8b..f2fc989 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -1796,6 +1796,16 @@ out:
> }
> EXPORT_SYMBOL_GPL(device_move);
>
> +static int __try_lock(struct device *dev)
> +{
> + int i = 0;
> +
> + while (!device_trylock(dev) && i++ < 100)
> + msleep(10);
> +
> + return i < 100;
> +}

That's a totally arbritary time, why does this work and other times do
not? And what is this returning, if the lock was grabbed successfully?
What's with the __ naming?

I really don't like this at all.


> +
> /**
> * device_shutdown - call ->shutdown() on each device to shutdown.
> */
> @@ -1810,8 +1820,11 @@ void device_shutdown(void)
> * devices offline, even as the system is shutting down.
> */
> while (!list_empty(&devices_kset->list)) {
> + int nonlocked;
> +
> dev = list_entry(devices_kset->list.prev, struct device,
> kobj.entry);
> + get_device(dev->parent);

Why grab the parent reference?

> get_device(dev);
> /*
> * Make sure the device is off the kset list, in the
> @@ -1820,6 +1833,18 @@ void device_shutdown(void)
> list_del_init(&dev->kobj.entry);
> spin_unlock(&devices_kset->list_lock);
>
> + /* hold lock to avoid race with .probe/.release */
> + if (dev->parent && !__try_lock(dev->parent))
> + nonlocked = 2;
> + else if (!__try_lock(dev))
> + nonlocked = 1;
> + else
> + nonlocked = 0;

Ick ick ick. Why can't we just grab the lock to try to only call these
callbacks one at a time? What is causing the big problem here that I am
missing?

> +
> + if (nonlocked)
> + dev_err(dev, "can't hold %slock for shutdown\n",
> + nonlocked == 1 ? "" : "parent ");

What can anyone do with this message? I sure wouldn't know what to do
with it, do you? If so, what?

greg k-h

2012-06-18 01:53:03

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Sat, Jun 16, 2012 at 6:03 AM, Greg Kroah-Hartman
<[email protected]> wrote:
> On Mon, Jun 11, 2012 at 01:13:20PM +0800, Ming Lei wrote:
>> Firstly, .shutdown callback may touch a uninitialized hardware
>> if dev->driver is set and .probe is not completed.
>>
>> Secondly, device_shutdown() may dereference a null pointer to cause
>> oops when dev->driver is cleared after it is checked in
>> device_shutdown().
>>
>> So just try to hold device lock and its parent lock(if it has) to
>> fix the races.
>>
>> Cc: Alan Stern <[email protected]>
>> Cc: [email protected]
>
> Why stable? ?Are there known systems that crash right now without this
> change? ?I don't think we ever heard back from the original poster about
> this issue as to what exactly was going wrong.

I marked the patch as stable because it is really a fix on race between
shutdown and probe/remove, and the race can really happen in practice
as discussed in the thread. Once it happened, it will cause a big problem
on production machines.

>
>
>> Signed-off-by: Ming Lei <[email protected]>
>> ---
>> v2:
>> ? ? ? - take Alan's suggestion to use device_trylock to avoid
>> ? ? ? hanging during shutdown by buggy device or driver
>> ? ? ? - hold parent reference counter
>>
>> ?drivers/base/core.c | ? 32 ++++++++++++++++++++++++++++++++
>> ?1 file changed, 32 insertions(+)
>>
>> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> index 346be8b..f2fc989 100644
>> --- a/drivers/base/core.c
>> +++ b/drivers/base/core.c
>> @@ -1796,6 +1796,16 @@ out:
>> ?}
>> ?EXPORT_SYMBOL_GPL(device_move);
>>
>> +static int __try_lock(struct device *dev)
>> +{
>> + ? ? int i = 0;
>> +
>> + ? ? while (!device_trylock(dev) && i++ < 100)
>> + ? ? ? ? ? ? msleep(10);
>> +
>> + ? ? return i < 100;
>> +}
>
> That's a totally arbritary time, why does this work and other times do
> not? ?And what is this returning, if the lock was grabbed successfully?

It is a timeout time and is 1sec now. If the lock can't be held in 1sec, the
function will return 0, otherwise it will return 1 and indicates that the lock
has been held successfully.

Considered device lock is often held during probe and release in most
of situations, 1sec should be a sane value because it may be abnormal
if one driver's probe or release lasts for more than 1sec.

Also taking trylock is to prevent buggy drivers from hanging system during
shutdown. If the timeout is too large, it may prolong shutdown time in
the situation.

I will appreciate it very much if you can suggest a better timeout value.

> What's with the __ naming?

No special meaning, if is not allowed, I can remove the '__'.

>
> I really don't like this at all.
>
>
>> +
>> ?/**
>> ? * device_shutdown - call ->shutdown() on each device to shutdown.
>> ? */
>> @@ -1810,8 +1820,11 @@ void device_shutdown(void)
>> ? ? ? ?* devices offline, even as the system is shutting down.
>> ? ? ? ?*/
>> ? ? ? while (!list_empty(&devices_kset->list)) {
>> + ? ? ? ? ? ? int nonlocked;
>> +
>> ? ? ? ? ? ? ? dev = list_entry(devices_kset->list.prev, struct device,
>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? kobj.entry);
>> + ? ? ? ? ? ? get_device(dev->parent);
>
> Why grab the parent reference?

If it is not grabbed, device_del may happen after the line below

spin_unlock(&devices_kset->list_lock);

so use-after-free may be triggered because the parent's lock
is to be locked/unlocked in this patch.

>
>> ? ? ? ? ? ? ? get_device(dev);
>> ? ? ? ? ? ? ? /*
>> ? ? ? ? ? ? ? ?* Make sure the device is off the kset list, in the
>> @@ -1820,6 +1833,18 @@ void device_shutdown(void)
>> ? ? ? ? ? ? ? list_del_init(&dev->kobj.entry);
>> ? ? ? ? ? ? ? spin_unlock(&devices_kset->list_lock);
>>
>> + ? ? ? ? ? ? /* hold lock to avoid race with .probe/.release */
>> + ? ? ? ? ? ? if (dev->parent && !__try_lock(dev->parent))
>> + ? ? ? ? ? ? ? ? ? ? nonlocked = 2;
>> + ? ? ? ? ? ? else if (!__try_lock(dev))
>> + ? ? ? ? ? ? ? ? ? ? nonlocked = 1;
>> + ? ? ? ? ? ? else
>> + ? ? ? ? ? ? ? ? ? ? nonlocked = 0;
>
> Ick ick ick. ?Why can't we just grab the lock to try to only call these
> callbacks one at a time? ?What is causing the big problem here that I am
> missing?

As discussed before in the thread, trylock is introduced to prevent buggy
drivers from hanging system during shutdown.

>
>> +
>> + ? ? ? ? ? ? if (nonlocked)
>> + ? ? ? ? ? ? ? ? ? ? dev_err(dev, "can't hold %slock for shutdown\n",
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? nonlocked == 1 ? "" : "parent ");
>
> What can anyone do with this message? ?I sure wouldn't know what to do
> with it, do you? ?If so, what?

I think the above message is very useful to troubleshoot the buggy device
drivers which hold device lock for ever.


Thanks,
--
Ming Lei

2012-06-18 22:25:57

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Mon, Jun 18, 2012 at 09:52:57AM +0800, Ming Lei wrote:
> On Sat, Jun 16, 2012 at 6:03 AM, Greg Kroah-Hartman
> <[email protected]> wrote:
> > On Mon, Jun 11, 2012 at 01:13:20PM +0800, Ming Lei wrote:
> >> Firstly, .shutdown callback may touch a uninitialized hardware
> >> if dev->driver is set and .probe is not completed.
> >>
> >> Secondly, device_shutdown() may dereference a null pointer to cause
> >> oops when dev->driver is cleared after it is checked in
> >> device_shutdown().
> >>
> >> So just try to hold device lock and its parent lock(if it has) to
> >> fix the races.
> >>
> >> Cc: Alan Stern <[email protected]>
> >> Cc: [email protected]
> >
> > Why stable? ?Are there known systems that crash right now without this
> > change? ?I don't think we ever heard back from the original poster about
> > this issue as to what exactly was going wrong.
>
> I marked the patch as stable because it is really a fix on race between
> shutdown and probe/remove, and the race can really happen in practice
> as discussed in the thread. Once it happened, it will cause a big problem
> on production machines.

Have you read Documentation/stable_kernel_patches.txt? Please do so and
see why I can't take this patch for a stable tree. Note that no one has
ever reported this as a bug before, and the original poster ran away
never to be heard from again, so I really don't think it was a real
problem that people ever saw.

> >> Signed-off-by: Ming Lei <[email protected]>
> >> ---
> >> v2:
> >> ? ? ? - take Alan's suggestion to use device_trylock to avoid
> >> ? ? ? hanging during shutdown by buggy device or driver
> >> ? ? ? - hold parent reference counter
> >>
> >> ?drivers/base/core.c | ? 32 ++++++++++++++++++++++++++++++++
> >> ?1 file changed, 32 insertions(+)
> >>
> >> diff --git a/drivers/base/core.c b/drivers/base/core.c
> >> index 346be8b..f2fc989 100644
> >> --- a/drivers/base/core.c
> >> +++ b/drivers/base/core.c
> >> @@ -1796,6 +1796,16 @@ out:
> >> ?}
> >> ?EXPORT_SYMBOL_GPL(device_move);
> >>
> >> +static int __try_lock(struct device *dev)
> >> +{
> >> + ? ? int i = 0;
> >> +
> >> + ? ? while (!device_trylock(dev) && i++ < 100)
> >> + ? ? ? ? ? ? msleep(10);
> >> +
> >> + ? ? return i < 100;
> >> +}
> >
> > That's a totally arbritary time, why does this work and other times do
> > not? ?And what is this returning, if the lock was grabbed successfully?
>
> It is a timeout time and is 1sec now. If the lock can't be held in 1sec, the
> function will return 0, otherwise it will return 1 and indicates that the lock
> has been held successfully.

My point is why 1 second? That's completly arbitrary and means nothing.
Why not just do a real lock and try for forever?

> Considered device lock is often held during probe and release in most
> of situations, 1sec should be a sane value because it may be abnormal
> if one driver's probe or release lasts for more than 1sec.

How do you know how long a probe takes? I know of some that take far
longer than 1 second, so your patch just failed there :(

> Also taking trylock is to prevent buggy drivers from hanging system during
> shutdown. If the timeout is too large, it may prolong shutdown time in
> the situation.

If a buggy driver hangs, then we fix the buggy driver. We have the
source, we can do that.

> I will appreciate it very much if you can suggest a better timeout value.

None, spin forever, take a lock for real.

> > What's with the __ naming?
>
> No special meaning, if is not allowed, I can remove the '__'.

Please do, it makes no sense.

> > I really don't like this at all.
> >
> >
> >> +
> >> ?/**
> >> ? * device_shutdown - call ->shutdown() on each device to shutdown.
> >> ? */
> >> @@ -1810,8 +1820,11 @@ void device_shutdown(void)
> >> ? ? ? ?* devices offline, even as the system is shutting down.
> >> ? ? ? ?*/
> >> ? ? ? while (!list_empty(&devices_kset->list)) {
> >> + ? ? ? ? ? ? int nonlocked;
> >> +
> >> ? ? ? ? ? ? ? dev = list_entry(devices_kset->list.prev, struct device,
> >> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? kobj.entry);
> >> + ? ? ? ? ? ? get_device(dev->parent);
> >
> > Why grab the parent reference?
>
> If it is not grabbed, device_del may happen after the line below
>
> spin_unlock(&devices_kset->list_lock);
>
> so use-after-free may be triggered because the parent's lock
> is to be locked/unlocked in this patch.

Then document that.

> >> ? ? ? ? ? ? ? get_device(dev);
> >> ? ? ? ? ? ? ? /*
> >> ? ? ? ? ? ? ? ?* Make sure the device is off the kset list, in the
> >> @@ -1820,6 +1833,18 @@ void device_shutdown(void)
> >> ? ? ? ? ? ? ? list_del_init(&dev->kobj.entry);
> >> ? ? ? ? ? ? ? spin_unlock(&devices_kset->list_lock);
> >>
> >> + ? ? ? ? ? ? /* hold lock to avoid race with .probe/.release */
> >> + ? ? ? ? ? ? if (dev->parent && !__try_lock(dev->parent))
> >> + ? ? ? ? ? ? ? ? ? ? nonlocked = 2;
> >> + ? ? ? ? ? ? else if (!__try_lock(dev))
> >> + ? ? ? ? ? ? ? ? ? ? nonlocked = 1;
> >> + ? ? ? ? ? ? else
> >> + ? ? ? ? ? ? ? ? ? ? nonlocked = 0;
> >
> > Ick ick ick. ?Why can't we just grab the lock to try to only call these
> > callbacks one at a time? ?What is causing the big problem here that I am
> > missing?
>
> As discussed before in the thread, trylock is introduced to prevent buggy
> drivers from hanging system during shutdown.

Fix buggy drivers, don't paper over them.

greg k-h

2012-06-19 02:00:40

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Tue, Jun 19, 2012 at 6:25 AM, Greg Kroah-Hartman
<[email protected]> wrote:

> Have you read Documentation/stable_kernel_patches.txt? ?Please do so and

I haven't read Documentation/stable_kernel_patches.txt, but read
Documentation/stable_kernel_rules.txt, :-)

> see why I can't take this patch for a stable tree. ?Note that no one has
> ever reported this as a bug before, and the original poster ran away
> never to be heard from again, so I really don't think it was a real
> problem that people ever saw.

If Documentation/stable_kernel_rules.txt is the correct doc for stable rule,
looks reporter requirement isn't listed in the file, but the below can be found:

- No "theoretical race condition" issues, unless an explanation of
how the race can be exploited is also provided.

so I marked it as -stable because I have explained how the race can be
exploited in reality.

>
>> >> Signed-off-by: Ming Lei <[email protected]>
>> >> ---
>> >> v2:
>> >> ? ? ? - take Alan's suggestion to use device_trylock to avoid
>> >> ? ? ? hanging during shutdown by buggy device or driver
>> >> ? ? ? - hold parent reference counter
>> >>
>> >> ?drivers/base/core.c | ? 32 ++++++++++++++++++++++++++++++++
>> >> ?1 file changed, 32 insertions(+)
>> >>
>> >> diff --git a/drivers/base/core.c b/drivers/base/core.c
>> >> index 346be8b..f2fc989 100644
>> >> --- a/drivers/base/core.c
>> >> +++ b/drivers/base/core.c
>> >> @@ -1796,6 +1796,16 @@ out:
>> >> ?}
>> >> ?EXPORT_SYMBOL_GPL(device_move);
>> >>
>> >> +static int __try_lock(struct device *dev)
>> >> +{
>> >> + ? ? int i = 0;
>> >> +
>> >> + ? ? while (!device_trylock(dev) && i++ < 100)
>> >> + ? ? ? ? ? ? msleep(10);
>> >> +
>> >> + ? ? return i < 100;
>> >> +}
>> >
>> > That's a totally arbritary time, why does this work and other times do
>> > not? ?And what is this returning, if the lock was grabbed successfully?
>>
>> ?It is a timeout time and is 1sec now. If the lock can't be held in 1sec, the
>> ?function will return 0, otherwise it will return 1 and indicates that the lock
>> ?has been held successfully.
>
> My point is why 1 second? ?That's completly arbitrary and means nothing.

1 second is a estimated value, just like many other timeout values in kernel.

For example, the timeout passed to usb_control_msg() is mostly estimated
first, then may be adjusted later from some new report.

Another example is one recent xhci fix commit:

622eb783fe(xHCI: Increase the timeout for controller save/restore state
operation)

the timeout value is just increased arbitrarily to adapt new device.

I still have more many examples in kernel about timeout value...

> Why not just do a real lock and try for forever?

IMO, there are two advantages not just doing a real lock for forever:

- avoiding buggy device/driver to hang the system
- with trylock, we can log the buggy device so that it is a bit
easier to troubleshoot the buggy drivers, suppose the bug is
only triggered 1 time in one year or more

>
>> Considered device lock is often held during probe and release in most
>> of situations, 1sec should be a sane value because it may be abnormal
>> if one driver's probe or release lasts for more than 1sec.
>
> How do you know how long a probe takes? ?I know of some that take far
> longer than 1 second, so your patch just failed there :(

Could you share the device or driver so that a better timeout value is
set firstly?
Also, the timeout value is just valid for hotplug device.

>
>> Also taking trylock is to prevent buggy drivers from hanging system during
>> shutdown. If the timeout is too large, it may prolong shutdown time in
>> the situation.
>
> If a buggy driver hangs, then we fix the buggy driver. ?We have the
> source, we can do that.

The problem may be triggered one time for one year or more, without the log
provided by trylock, it is a bit difficult to fix it.

>
>> I will appreciate it very much if you can suggest a better timeout value.
>
> None, spin forever, take a lock for real.
>
>> > What's with the __ naming?
>>
>> No special meaning, if is not allowed, I can remove the '__'.
>
> Please do, it makes no sense.

OK.

>
>> > I really don't like this at all.
>> >
>> >
>> >> +
>> >> ?/**
>> >> ? * device_shutdown - call ->shutdown() on each device to shutdown.
>> >> ? */
>> >> @@ -1810,8 +1820,11 @@ void device_shutdown(void)
>> >> ? ? ? ?* devices offline, even as the system is shutting down.
>> >> ? ? ? ?*/
>> >> ? ? ? while (!list_empty(&devices_kset->list)) {
>> >> + ? ? ? ? ? ? int nonlocked;
>> >> +
>> >> ? ? ? ? ? ? ? dev = list_entry(devices_kset->list.prev, struct device,
>> >> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? kobj.entry);
>> >> + ? ? ? ? ? ? get_device(dev->parent);
>> >
>> > Why grab the parent reference?
>>
>> If it is not grabbed, ?device_del may happen after the line below
>>
>> ? ? ? ? ?spin_unlock(&devices_kset->list_lock);
>>
>> so use-after-free may be triggered because the parent's lock
>> is to be locked/unlocked in this patch.
>
> Then document that.

OK.

>
>> >> ? ? ? ? ? ? ? get_device(dev);
>> >> ? ? ? ? ? ? ? /*
>> >> ? ? ? ? ? ? ? ?* Make sure the device is off the kset list, in the
>> >> @@ -1820,6 +1833,18 @@ void device_shutdown(void)
>> >> ? ? ? ? ? ? ? list_del_init(&dev->kobj.entry);
>> >> ? ? ? ? ? ? ? spin_unlock(&devices_kset->list_lock);
>> >>
>> >> + ? ? ? ? ? ? /* hold lock to avoid race with .probe/.release */
>> >> + ? ? ? ? ? ? if (dev->parent && !__try_lock(dev->parent))
>> >> + ? ? ? ? ? ? ? ? ? ? nonlocked = 2;
>> >> + ? ? ? ? ? ? else if (!__try_lock(dev))
>> >> + ? ? ? ? ? ? ? ? ? ? nonlocked = 1;
>> >> + ? ? ? ? ? ? else
>> >> + ? ? ? ? ? ? ? ? ? ? nonlocked = 0;
>> >
>> > Ick ick ick. ?Why can't we just grab the lock to try to only call these
>> > callbacks one at a time? ?What is causing the big problem here that I am
>> > missing?
>>
>> As discussed before in the thread, trylock is introduced to prevent buggy
>> drivers from hanging system during shutdown.
>
> Fix buggy drivers, don't paper over them.

As said above, the log may be very helpful for fixing the buggy drivers.

Thanks,
--
Ming Lei

2012-06-20 22:38:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Tue, Jun 19, 2012 at 10:00:36AM +0800, Ming Lei wrote:
> On Tue, Jun 19, 2012 at 6:25 AM, Greg Kroah-Hartman
> <[email protected]> wrote:
>
> > Have you read Documentation/stable_kernel_patches.txt? ?Please do so and
>
> I haven't read Documentation/stable_kernel_patches.txt, but read
> Documentation/stable_kernel_rules.txt, :-)

Sorry, you are correct :)

> > see why I can't take this patch for a stable tree. ?Note that no one has
> > ever reported this as a bug before, and the original poster ran away
> > never to be heard from again, so I really don't think it was a real
> > problem that people ever saw.
>
> If Documentation/stable_kernel_rules.txt is the correct doc for stable rule,
> looks reporter requirement isn't listed in the file, but the below can be found:
>
> - No "theoretical race condition" issues, unless an explanation of
> how the race can be exploited is also provided.
>
> so I marked it as -stable because I have explained how the race can be
> exploited in reality.

Ok, but as this has been there since when, 2.5, I'll refrain from
marking it this way, as no one has reported a real problem like this
before.

> >> >> Signed-off-by: Ming Lei <[email protected]>
> >> >> ---
> >> >> v2:
> >> >> ? ? ? - take Alan's suggestion to use device_trylock to avoid
> >> >> ? ? ? hanging during shutdown by buggy device or driver
> >> >> ? ? ? - hold parent reference counter
> >> >>
> >> >> ?drivers/base/core.c | ? 32 ++++++++++++++++++++++++++++++++
> >> >> ?1 file changed, 32 insertions(+)
> >> >>
> >> >> diff --git a/drivers/base/core.c b/drivers/base/core.c
> >> >> index 346be8b..f2fc989 100644
> >> >> --- a/drivers/base/core.c
> >> >> +++ b/drivers/base/core.c
> >> >> @@ -1796,6 +1796,16 @@ out:
> >> >> ?}
> >> >> ?EXPORT_SYMBOL_GPL(device_move);
> >> >>
> >> >> +static int __try_lock(struct device *dev)
> >> >> +{
> >> >> + ? ? int i = 0;
> >> >> +
> >> >> + ? ? while (!device_trylock(dev) && i++ < 100)
> >> >> + ? ? ? ? ? ? msleep(10);
> >> >> +
> >> >> + ? ? return i < 100;
> >> >> +}
> >> >
> >> > That's a totally arbritary time, why does this work and other times do
> >> > not? ?And what is this returning, if the lock was grabbed successfully?
> >>
> >> ?It is a timeout time and is 1sec now. If the lock can't be held in 1sec, the
> >> ?function will return 0, otherwise it will return 1 and indicates that the lock
> >> ?has been held successfully.
> >
> > My point is why 1 second? ?That's completly arbitrary and means nothing.
>
> 1 second is a estimated value, just like many other timeout values in kernel.
>
> For example, the timeout passed to usb_control_msg() is mostly estimated
> first, then may be adjusted later from some new report.
>
> Another example is one recent xhci fix commit:
>
> 622eb783fe(xHCI: Increase the timeout for controller save/restore state
> operation)
>
> the timeout value is just increased arbitrarily to adapt new device.
>
> I still have more many examples in kernel about timeout value...

Yes, I know this, but now you are putting a limit on the amount of time
a probe function can take, when before, we have never had one. That's
not something to be taken lightly, and is one I know is not true.

> > Why not just do a real lock and try for forever?
>
> IMO, there are two advantages not just doing a real lock for forever:
>
> - avoiding buggy device/driver to hang the system
> - with trylock, we can log the buggy device so that it is a bit
> easier to troubleshoot the buggy drivers, suppose the bug is
> only triggered 1 time in one year or more

No, just fix the driver, I don't want to put a time limit on how long
probe can take, as we never have in the past and I'm sure that whatever
we pick, will be wrong for someone.

I have seen devices that take many seconds, and minutes for some if bad
things happen (i.e. the firmware doesn't download properly). Don't
break people's working systems.

greg k-h

2012-06-20 23:34:06

by Jonathan Nieder

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

Greg Kroah-Hartman wrote:
> On Tue, Jun 19, 2012 at 10:00:36AM +0800, Ming Lei wrote:

>> If Documentation/stable_kernel_rules.txt is the correct doc for stable rule,
>> looks reporter requirement isn't listed in the file, but the below can be found:
>>
>> - No "theoretical race condition" issues, unless an explanation of
>> how the race can be exploited is also provided.
>>
>> so I marked it as -stable because I have explained how the race can be
>> exploited in reality.
>
> Ok, but as this has been there since when, 2.5, I'll refrain from
> marking it this way, as no one has reported a real problem like this
> before.

Just to clarify, if I understand correctly, exploited != reproduced. :)

That is, if you have an example of how the race can be exploited by an
unprivileged prankster to make a sysadmin's life miserable, that would
definitely be real problem. On the other hand, if you have a test case
for a rare race condition that has not appeared in the wild and is not
exploitable, that's very useful (test cases make programming much
easier!) but it doesn't make it qualify for stable.

Hope that helps,
Jonathan

2012-06-21 01:21:14

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Thu, Jun 21, 2012 at 6:37 AM, Greg Kroah-Hartman
<[email protected]> wrote:
> On Tue, Jun 19, 2012 at 10:00:36AM +0800, Ming Lei wrote:

>> so I marked it as -stable because I have explained how the race can be
>> exploited in reality.
>
> Ok, but as this has been there since when, 2.5, I'll refrain from
> marking it this way, as no one has reported a real problem like this
> before.

So have you agreed on keeping Cc: stable in v3?

>> I still have more many examples in kernel about timeout value...
>
> Yes, I know this, but now you are putting a limit on the amount of time

No, I don't put a limit on it, see my below explanation.

> a probe function can take, when before, we have never had one. ?That's
> not something to be taken lightly, and is one I know is not true.
>
>> > Why not just do a real lock and try for forever?
>>
>> IMO, there are two advantages not just doing a real lock for forever:
>>
>> - avoiding buggy device/driver to hang the system
>> - with trylock, we can log the buggy device so that it is a bit
>> easier to troubleshoot the buggy drivers, suppose the bug is
>> only triggered 1 time in one year or more
>
> No, just fix the driver, I don't want to put a time limit on how long

Surely we need to fix the driver, but the problem is that it may be very
difficult to fix the driver without the log introduced in the patch, so why
not take it without obvious side effect?

> probe can take, as we never have in the past and I'm sure that whatever
> we pick, will be wrong for someone.
>
> I have seen devices that take many seconds, and minutes for some if bad
> things happen (i.e. the firmware doesn't download properly). ?Don't
> break people's working systems.

Suppose some weird devices may take so much time on its probe callback,
the patch does not break anything, just like before applying the patch:
call its .shutdown callback directly without holding the device lock.

So could we just change the timeout value to a larger one? such as 10
seconds or more?

Thanks,
--
Ming Lei

2012-06-21 13:50:00

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Thu, Jun 21, 2012 at 09:21:08AM +0800, Ming Lei wrote:
> On Thu, Jun 21, 2012 at 6:37 AM, Greg Kroah-Hartman
> <[email protected]> wrote:
> > On Tue, Jun 19, 2012 at 10:00:36AM +0800, Ming Lei wrote:
>
> >> so I marked it as -stable because I have explained how the race can be
> >> exploited in reality.
> >
> > Ok, but as this has been there since when, 2.5, I'll refrain from
> > marking it this way, as no one has reported a real problem like this
> > before.
>
> So have you agreed on keeping Cc: stable in v3?

No.

> >> I still have more many examples in kernel about timeout value...
> >
> > Yes, I know this, but now you are putting a limit on the amount of time
>
> No, I don't put a limit on it, see my below explanation.
>
> > a probe function can take, when before, we have never had one. ?That's
> > not something to be taken lightly, and is one I know is not true.
> >
> >> > Why not just do a real lock and try for forever?
> >>
> >> IMO, there are two advantages not just doing a real lock for forever:
> >>
> >> - avoiding buggy device/driver to hang the system
> >> - with trylock, we can log the buggy device so that it is a bit
> >> easier to troubleshoot the buggy drivers, suppose the bug is
> >> only triggered 1 time in one year or more
> >
> > No, just fix the driver, I don't want to put a time limit on how long
>
> Surely we need to fix the driver, but the problem is that it may be very
> difficult to fix the driver without the log introduced in the patch, so why
> not take it without obvious side effect?

Because nothing is wrong with the driver if it takes that long, it's not
"broken" at all, because we have never made the rule that a probe
function has to complete in a specific amount of time.

We also have not made the rule that a shutdown will complete in a
specific amount of time either, so there is no problem if that takes a
long time as well.

greg k-h

2012-06-21 16:04:25

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Thu, Jun 21, 2012 at 9:49 PM, Greg Kroah-Hartman
<[email protected]> wrote:

> We also have not made the rule that a shutdown will complete in a
> specific amount of time either, so there is no problem if that takes a
> long time as well.

OK, I will submit a new patch which just holds device lock.

Alan, do you have objections on the new one?

Thanks,
--
Ming Lei

2012-06-21 16:12:46

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Fri, 22 Jun 2012, Ming Lei wrote:

> On Thu, Jun 21, 2012 at 9:49 PM, Greg Kroah-Hartman
> <[email protected]> wrote:
>
> > We also have not made the rule that a shutdown will complete in a
> > specific amount of time either, so there is no problem if that takes a
> > long time as well.
>
> OK, I will submit a new patch which just holds device lock.
>
> Alan, do you have objections on the new one?

No objections. It should end up being pretty much like one of the
patches you posted earlier.

Should there be an option for a debug log of each device name just
before its shutdown routine is called?

Alan Stern

2012-06-21 16:21:38

by Ming Lei

[permalink] [raw]
Subject: Re: [PATCH] driver core: fix shutdown races with probe/remove(v2)

On Fri, Jun 22, 2012 at 12:12 AM, Alan Stern <[email protected]> wrote:
> On Fri, 22 Jun 2012, Ming Lei wrote:
>
>> On Thu, Jun 21, 2012 at 9:49 PM, Greg Kroah-Hartman
>> <[email protected]> wrote:
>>
>> > We also have not made the rule that a shutdown will complete in a
>> > specific amount of time either, so there is no problem if that takes a
>> > long time as well.
>>
>> OK, I will submit a new patch which just holds device lock.
>>
>> Alan, do you have objections on the new one?
>
> No objections. ?It should end up being pretty much like one of the
> patches you posted earlier.

Holding parent reference count is missed in earlier version.

>
> Should there be an option for a debug log of each device name just
> before its shutdown routine is called?

We have had the below before calling shutdown callback.

dev_dbg(dev, "shutdown\n");

Thanks,
--
Ming Lei