2011-05-18 15:27:42

by Milton Miller

[permalink] [raw]
Subject: [PATCH] of: fix race when matching drivers

If two drivers are probing devices at the same time, both will write
their match table result to the dev->of_match cache at the same time.

Only write the result if the device matches.

In a thread titled "SBus devices sometimes detected, sometimes not",
Meelis reported his SBus hme was not detected about 50% of the time.
>From the debug suggested by Grant it was obvious another driver matched
some devices between the call to match the hme and the hme discovery
failling.

Reported-by: Meelis Roos <[email protected]>
Signed-off-by: Milton Miller <[email protected]>
---

Grant, I really think this of_match cache in the device node is bad a
bad tradeoff, and am willing to submit patches to remove it for 2.6.40.
It is only used by about 26 drivers and all use it once during probe
to fill out their driver data. It comes at the cost of a long for
every struct device in every system.

I'll even offer to throw in a patch to cache the parsing of the
compatible property to speed up of_device_is_compatible if needed.



Index: work.git/include/linux/of_device.h
===================================================================
--- work.git.orig/include/linux/of_device.h 2011-05-18 09:57:01.014386816 -0500
+++ work.git/include/linux/of_device.h 2011-05-18 09:58:27.537431575 -0500
@@ -21,8 +21,15 @@ extern void of_device_make_bus_id(struct
static inline int of_driver_match_device(struct device *dev,
const struct device_driver *drv)
{
- dev->of_match = of_match_device(drv->of_match_table, dev);
- return dev->of_match != NULL;
+ const struct of_device_id *match;
+
+ match = of_match_device(drv->of_match_table, dev);
+ if (match) {
+ dev->of_match = of_match_device(drv->of_match_table, dev);
+ return 1;
+ }
+
+ return 0;
}

extern struct platform_device *of_dev_get(struct platform_device *dev);


2011-05-18 15:42:12

by Grant Likely

[permalink] [raw]
Subject: Re: [PATCH] of: fix race when matching drivers

On Wed, May 18, 2011 at 9:27 AM, Milton Miller <[email protected]> wrote:
> If two drivers are probing devices at the same time, both will write
> their match table result to the dev->of_match cache at the same time.
>
> Only write the result if the device matches.
>
> In a thread titled "SBus devices sometimes detected, sometimes not",
> Meelis reported his SBus hme was not detected about 50% of the time.
> From the debug suggested by Grant it was obvious another driver matched
> some devices between the call to match the hme and the hme discovery
> failling.
>
> Reported-by: Meelis Roos <[email protected]>
> Signed-off-by: Milton Miller <[email protected]>
> ---
>
> Grant, I really think this of_match cache in the device node is bad a
> bad tradeoff, and am willing to submit patches to remove it for 2.6.40.
> It is only used by about 26 drivers and all use it once during probe
> to fill out their driver data. ?It comes at the cost of a long for
> every struct device in every system.

Ah, bugger. I had /thought/ that matching and probing were kept
together with a mutex. So, yes, this is bad and the of_match needs to
be removed. Thanks for volunteering to submit the patch. It should
be backported to 2.6.39 too.

> I'll even offer to throw in a patch to cache the parsing of the
> compatible property to speed up of_device_is_compatible if needed.

That would be useful. :-)

I'll pick up your patch right now and fire it off to Linus.

g.

>
>
>
> Index: work.git/include/linux/of_device.h
> ===================================================================
> --- work.git.orig/include/linux/of_device.h ? ? 2011-05-18 09:57:01.014386816 -0500
> +++ work.git/include/linux/of_device.h ?2011-05-18 09:58:27.537431575 -0500
> @@ -21,8 +21,15 @@ extern void of_device_make_bus_id(struct
> ?static inline int of_driver_match_device(struct device *dev,
> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? const struct device_driver *drv)
> ?{
> - ? ? ? dev->of_match = of_match_device(drv->of_match_table, dev);
> - ? ? ? return dev->of_match != NULL;
> + ? ? ? const struct of_device_id *match;
> +
> + ? ? ? match = of_match_device(drv->of_match_table, dev);
> + ? ? ? if (match) {
> + ? ? ? ? ? ? ? dev->of_match = of_match_device(drv->of_match_table, dev);
> + ? ? ? ? ? ? ? return 1;
> + ? ? ? }
> +
> + ? ? ? return 0;
> ?}
>
> ?extern struct platform_device *of_dev_get(struct platform_device *dev);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at ?http://www.tux.org/lkml/
>



--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.

2011-05-18 16:28:23

by Josip Rodin

[permalink] [raw]
Subject: Re: [PATCH] of: fix race when matching drivers

On Wed, May 18, 2011 at 10:27:39AM -0500, Milton Miller wrote:
> --- work.git.orig/include/linux/of_device.h 2011-05-18 09:57:01.014386816 -0500
> +++ work.git/include/linux/of_device.h 2011-05-18 09:58:27.537431575 -0500
> @@ -21,8 +21,15 @@ extern void of_device_make_bus_id(struct
> static inline int of_driver_match_device(struct device *dev,
> const struct device_driver *drv)
> {
> - dev->of_match = of_match_device(drv->of_match_table, dev);
> - return dev->of_match != NULL;
> + const struct of_device_id *match;
> +
> + match = of_match_device(drv->of_match_table, dev);
> + if (match) {
> + dev->of_match = of_match_device(drv->of_match_table, dev);
> + return 1;
> + }
> +
> + return 0;
> }

Err, is there some reason to avoid simply assigning the existing result:
dev->of_match = match; ?

--
2. That which causes joy or happiness.

2011-05-18 15:47:33

by Grant Likely

[permalink] [raw]
Subject: Re: [PATCH] of: fix race when matching drivers

[cc'ing Linus: This is fix for a 2.6.39 regression, so just a heads up
that I'll be sending you a pull req ASAP (later this morning)]

On Wed, May 18, 2011 at 9:41 AM, Grant Likely <[email protected]> wrote:
> On Wed, May 18, 2011 at 9:27 AM, Milton Miller <[email protected]> wrote:
>> If two drivers are probing devices at the same time, both will write
>> their match table result to the dev->of_match cache at the same time.
>>
>> Only write the result if the device matches.
>>
>> In a thread titled "SBus devices sometimes detected, sometimes not",
>> Meelis reported his SBus hme was not detected about 50% of the time.
>> From the debug suggested by Grant it was obvious another driver matched
>> some devices between the call to match the hme and the hme discovery
>> failling.
>>
>> Reported-by: Meelis Roos <[email protected]>
>> Signed-off-by: Milton Miller <[email protected]>
>> ---
>>
>> Grant, I really think this of_match cache in the device node is bad a
>> bad tradeoff, and am willing to submit patches to remove it for 2.6.40.
>> It is only used by about 26 drivers and all use it once during probe
>> to fill out their driver data. ?It comes at the cost of a long for
>> every struct device in every system.
>
> Ah, bugger. ?I had /thought/ that matching and probing were kept
> together with a mutex. ?So, yes, this is bad and the of_match needs to
> be removed. ?Thanks for volunteering to submit the patch. ?It should
> be backported to 2.6.39 too.
>
>> I'll even offer to throw in a patch to cache the parsing of the
>> compatible property to speed up of_device_is_compatible if needed.
>
> That would be useful. ?:-)
>
> I'll pick up your patch right now and fire it off to Linus.

... it's not perfect since it will break some theoretical drivers
unbind/rebind use-cases, but I cannot think of any actual examples,
and it is far safer than the current code. Regardless, the removal of
of_match will definitely need to go into stable when it is ready.

g.

2011-05-18 16:21:43

by Grant Likely

[permalink] [raw]
Subject: Re: [PATCH] of: fix race when matching drivers

On Wed, May 18, 2011 at 05:45:17PM +0200, Josip Rodin wrote:
> On Wed, May 18, 2011 at 10:27:39AM -0500, Milton Miller wrote:
> > --- work.git.orig/include/linux/of_device.h 2011-05-18 09:57:01.014386816 -0500
> > +++ work.git/include/linux/of_device.h 2011-05-18 09:58:27.537431575 -0500
> > @@ -21,8 +21,15 @@ extern void of_device_make_bus_id(struct
> > static inline int of_driver_match_device(struct device *dev,
> > const struct device_driver *drv)
> > {
> > - dev->of_match = of_match_device(drv->of_match_table, dev);
> > - return dev->of_match != NULL;
> > + const struct of_device_id *match;
> > +
> > + match = of_match_device(drv->of_match_table, dev);
> > + if (match) {
> > + dev->of_match = of_match_device(drv->of_match_table, dev);
> > + return 1;
> > + }
> > +
> > + return 0;
> > }
>
> Err, is there some reason to avoid simply assigning the existing result:
> dev->of_match = match; ?

Good point. I've committed and am currently testing the modified version.

g.

2011-05-18 17:03:38

by Milton Miller

[permalink] [raw]
Subject: Re: [PATCH] of: fix race when matching drivers

On Wed, 18 May 2011 about 10:21:39 -0600, Grant Likely wrote:
> On Wed, May 18, 2011 at 05:45:17PM +0200, Josip Rodin wrote:
> > On Wed, May 18, 2011 at 10:27:39AM -0500, Milton Miller wrote:
> > > --- work.git.orig/include/linux/of_device.h 2011-05-18 09:57:01.014386816 -0500
> > > +++ work.git/include/linux/of_device.h 2011-05-18 09:58:27.537431575 -0500
> > > @@ -21,8 +21,15 @@ extern void of_device_make_bus_id(struct
> > > static inline int of_driver_match_device(struct device *dev,
> > > const struct device_driver *drv)
> > > {
> > > - dev->of_match = of_match_device(drv->of_match_table, dev);
> > > - return dev->of_match != NULL;
> > > + const struct of_device_id *match;
> > > +
> > > + match = of_match_device(drv->of_match_table, dev);
> > > + if (match) {
> > > + dev->of_match = of_match_device(drv->of_match_table, dev);
> > > + return 1;
> > > + }
> > > +
> > > + return 0;
> > > }
> >
> > Err, is there some reason to avoid simply assigning the existing result:
> > dev->of_match = match; ?
>
> Good point. I've committed and am currently testing the modified version.
>
> g.

Sorry about that. I had intended to do that (hence creating the
variable), but obvioiusly I forgot when I hurried to compile test
and send out the patch.

thanks to Josip for finding and Grant for fixing.

milton