2014-04-03 07:12:19

by Peter Ujfalusi

[permalink] [raw]
Subject: [RESEND] drivercore: deferral race condition fix

When the kernel is built with CONFIG_PREEMPT it is possible to reach a state
when all modules loaded but some driver still stuck in the deferred list
and there is a need for external event to kick the deferred queue to probe
these drivers.

The issue has been observed on embedded systems with CONFIG_PREEMPT enabled,
audio support built as modules and using nfsroot for root filesystem.

The following log fragment shows such sequence when all audio modules
were loaded but the sound card is not present since the machine driver has
failed to probe due to missing dependency during it's probe.
The board is am335x-evmsk (McASP<->tlv320aic3106 codec) with davinci-evm
machine driver:

...
[ 12.615118] davinci-mcasp 4803c000.mcasp: davinci_mcasp_probe: ENTER
[ 12.719969] davinci_evm sound.3: davinci_evm_probe: ENTER
[ 12.725753] davinci_evm sound.3: davinci_evm_probe: snd_soc_register_card
[ 12.753846] davinci-mcasp 4803c000.mcasp: davinci_mcasp_probe: snd_soc_register_component
[ 12.922051] davinci-mcasp 4803c000.mcasp: davinci_mcasp_probe: snd_soc_register_component DONE
[ 12.950839] davinci_evm sound.3: ASoC: platform (null) not registered
[ 12.957898] davinci_evm sound.3: davinci_evm_probe: snd_soc_register_card DONE (-517)
[ 13.099026] davinci-mcasp 4803c000.mcasp: Kicking the deferred list
[ 13.177838] davinci-mcasp 4803c000.mcasp: really_probe: probe_count = 2
[ 13.194130] davinci_evm sound.3: snd_soc_register_card failed (-517)
[ 13.346755] davinci_mcasp_driver_init: LEAVE
[ 13.377446] platform sound.3: Driver davinci_evm requests probe deferral
[ 13.592527] platform sound.3: really_probe: probe_count = 0

In the log the machine driver enters it's probe at 12.719969 (this point it
has been removed from the deferred lists). McASP driver already executing
it's probing (since 12.615118).
The machine driver tries to construct the sound card (12.950839) but did
not found one of the components so it fails. After this McASP driver
registers all the ASoC components (the machine driver still in it's probe
function after it failed to construct the card) and the deferred work is
prepared at 13.099026 (note that this time the machine driver is not in the
lists so it is not going to be handled when the work is executing).
Lastly the machine driver exit from it's probe and the core places it to
the deferred list but there will be no other driver going to load and the
deferred queue is not going to be kicked again - till we have external event
like connecting USB stick, etc.

The proposed solution is to try the deferred queue once more when the last
driver is asking for deferring and we had drivers loaded while this last
driver was probing.

This way we can avoid drivers stuck in the deferred queue.

Signed-off-by: Peter Ujfalusi <[email protected]>
---
drivers/base/dd.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 06051767393f..80703de6e6ad 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -53,6 +53,10 @@ static LIST_HEAD(deferred_probe_pending_list);
static LIST_HEAD(deferred_probe_active_list);
static struct workqueue_struct *deferred_wq;

+static atomic_t probe_count = ATOMIC_INIT(0);
+static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
+static bool deferral_retry;
+
/**
* deferred_probe_work_func() - Retry probing devices in the active list.
*/
@@ -141,6 +145,11 @@ static void driver_deferred_probe_trigger(void)
if (!driver_deferred_probe_enable)
return;

+ if (atomic_read(&probe_count) > 1)
+ deferral_retry = true;
+ else
+ deferral_retry = false;
+
/*
* A successful probe means that all the devices in the pending list
* should be triggered to be reprobed. Move all the deferred devices
@@ -259,9 +268,6 @@ int device_bind_driver(struct device *dev)
}
EXPORT_SYMBOL_GPL(device_bind_driver);

-static atomic_t probe_count = ATOMIC_INIT(0);
-static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
-
static int really_probe(struct device *dev, struct device_driver *drv)
{
int ret = 0;
@@ -310,6 +316,16 @@ probe_failed:
/* Driver requested deferred probing */
dev_info(dev, "Driver %s requests probe deferral\n", drv->name);
driver_deferred_probe_add(dev);
+ /*
+ * This is the last driver to load and asking to be deferred.
+ * If other driver(s) loaded while this driver was loading, we
+ * should try the deferred modules again to avoid missing
+ * dependency for this driver.
+ */
+ if (atomic_read(&probe_count) == 1 && deferral_retry) {
+ deferral_retry = false;
+ driver_deferred_probe_trigger();
+ }
} else if (ret != -ENODEV && ret != -ENXIO) {
/* driver matched but the probe failed */
printk(KERN_WARNING
--
1.9.1


2014-04-03 09:41:18

by Mark Brown

[permalink] [raw]
Subject: Re: [RESEND] drivercore: deferral race condition fix

On Thu, Apr 03, 2014 at 10:12:07AM +0300, Peter Ujfalusi wrote:
> When the kernel is built with CONFIG_PREEMPT it is possible to reach a state
> when all modules loaded but some driver still stuck in the deferred list
> and there is a need for external event to kick the deferred queue to probe
> these drivers.

Acked-by: Mark Brown <[email protected]>


Attachments:
(No filename) (355.00 B)
signature.asc (836.00 B)
Digital signature
Download all attachments

2014-04-08 10:37:56

by Grant Likely

[permalink] [raw]
Subject: Re: [RESEND] drivercore: deferral race condition fix

On Thu, 3 Apr 2014 10:40:59 +0100, Mark Brown <[email protected]> wrote:
> On Thu, Apr 03, 2014 at 10:12:07AM +0300, Peter Ujfalusi wrote:
> > When the kernel is built with CONFIG_PREEMPT it is possible to reach a state
> > when all modules loaded but some driver still stuck in the deferred list
> > and there is a need for external event to kick the deferred queue to probe
> > these drivers.
>
> Acked-by: Mark Brown <[email protected]>

It's a pretty crude solution though. The problem is any "in-flight"
probes that are going to defer will not get added to the active list.
Rerunning the entire active list is a bit much (but it does have the
advantage of still being conceptually simple). I think we can do better.

Instead of running the entire list, we could add a check to
driver_deferred_probe_add() that adds the device to the active list
instead of pending list on the condition that another driver probe
completed while the deferred probe was in-flight.

I'm playing with a solution now. I'll email a proposal shortly.

g.



2014-04-08 12:47:44

by Grant Likely

[permalink] [raw]
Subject: Re: [RESEND] drivercore: deferral race condition fix

On Tue, Apr 8, 2014 at 3:27 AM, Grant Likely <[email protected]> wrote:
> On Thu, 3 Apr 2014 10:40:59 +0100, Mark Brown <[email protected]> wrote:
>> On Thu, Apr 03, 2014 at 10:12:07AM +0300, Peter Ujfalusi wrote:
>> > When the kernel is built with CONFIG_PREEMPT it is possible to reach a state
>> > when all modules loaded but some driver still stuck in the deferred list
>> > and there is a need for external event to kick the deferred queue to probe
>> > these drivers.
>>
>> Acked-by: Mark Brown <[email protected]>
>
> It's a pretty crude solution though. The problem is any "in-flight"
> probes that are going to defer will not get added to the active list.
> Rerunning the entire active list is a bit much (but it does have the
> advantage of still being conceptually simple). I think we can do better.
>
> Instead of running the entire list, we could add a check to
> driver_deferred_probe_add() that adds the device to the active list
> instead of pending list on the condition that another driver probe
> completed while the deferred probe was in-flight.
>
> I'm playing with a solution now. I'll email a proposal shortly.

Thinking out loud now...

The race can occur whenever a probe in another thread completes
successfully while the current probe is in-flight. If that has
happened, then the defer condition may be resolved and the driver
should be scheduled for retry immediately. If the core code can check
for that condition, then we can add the driver directly to the active
list and kick the workqueue.

The problem is that we don't currently have an easy way to test if a
probe has completed in another thread. This patch handles it with a
single flag that gets set whenever a probe completes while another
probe is executing. I was worried that this approach would be racy,
but after running through the scenarios I can't find a situation where
it wouldn't get added. I only concern I have remaining on this
approach is that it will trigger unnecessary retries, but even that
isn't really a problem because the pending list will have been moved
to the active list *anyway*. It isn't even a retry of the whole list
that's happening because most likely the only device on the pending
list will be the one that completed with -EPROBE_DEFER.

So, I actually think this is the right approach now. I'll reply to the
patch itself and make some comments on the code.

g.

2014-04-08 12:47:58

by Grant Likely

[permalink] [raw]
Subject: Re: [RESEND] drivercore: deferral race condition fix

On Thu, 3 Apr 2014 10:12:07 +0300, Peter Ujfalusi <[email protected]> wrote:
> When the kernel is built with CONFIG_PREEMPT it is possible to reach a state
> when all modules loaded but some driver still stuck in the deferred list
> and there is a need for external event to kick the deferred queue to probe
> these drivers.
>
> The issue has been observed on embedded systems with CONFIG_PREEMPT enabled,
> audio support built as modules and using nfsroot for root filesystem.
>
> The following log fragment shows such sequence when all audio modules
> were loaded but the sound card is not present since the machine driver has
> failed to probe due to missing dependency during it's probe.
> The board is am335x-evmsk (McASP<->tlv320aic3106 codec) with davinci-evm
> machine driver:
>
> ...
> [ 12.615118] davinci-mcasp 4803c000.mcasp: davinci_mcasp_probe: ENTER
> [ 12.719969] davinci_evm sound.3: davinci_evm_probe: ENTER
> [ 12.725753] davinci_evm sound.3: davinci_evm_probe: snd_soc_register_card
> [ 12.753846] davinci-mcasp 4803c000.mcasp: davinci_mcasp_probe: snd_soc_register_component
> [ 12.922051] davinci-mcasp 4803c000.mcasp: davinci_mcasp_probe: snd_soc_register_component DONE
> [ 12.950839] davinci_evm sound.3: ASoC: platform (null) not registered
> [ 12.957898] davinci_evm sound.3: davinci_evm_probe: snd_soc_register_card DONE (-517)
> [ 13.099026] davinci-mcasp 4803c000.mcasp: Kicking the deferred list
> [ 13.177838] davinci-mcasp 4803c000.mcasp: really_probe: probe_count = 2
> [ 13.194130] davinci_evm sound.3: snd_soc_register_card failed (-517)
> [ 13.346755] davinci_mcasp_driver_init: LEAVE
> [ 13.377446] platform sound.3: Driver davinci_evm requests probe deferral
> [ 13.592527] platform sound.3: really_probe: probe_count = 0
>
> In the log the machine driver enters it's probe at 12.719969 (this point it
> has been removed from the deferred lists). McASP driver already executing
> it's probing (since 12.615118).
> The machine driver tries to construct the sound card (12.950839) but did
> not found one of the components so it fails. After this McASP driver
> registers all the ASoC components (the machine driver still in it's probe
> function after it failed to construct the card) and the deferred work is
> prepared at 13.099026 (note that this time the machine driver is not in the
> lists so it is not going to be handled when the work is executing).
> Lastly the machine driver exit from it's probe and the core places it to
> the deferred list but there will be no other driver going to load and the
> deferred queue is not going to be kicked again - till we have external event
> like connecting USB stick, etc.
>
> The proposed solution is to try the deferred queue once more when the last
> driver is asking for deferring and we had drivers loaded while this last
> driver was probing.
>
> This way we can avoid drivers stuck in the deferred queue.
>
> Signed-off-by: Peter Ujfalusi <[email protected]>
> ---
> drivers/base/dd.c | 22 +++++++++++++++++++---
> 1 file changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 06051767393f..80703de6e6ad 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -53,6 +53,10 @@ static LIST_HEAD(deferred_probe_pending_list);
> static LIST_HEAD(deferred_probe_active_list);
> static struct workqueue_struct *deferred_wq;
>
> +static atomic_t probe_count = ATOMIC_INIT(0);
> +static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
> +static bool deferral_retry;
> +
> /**
> * deferred_probe_work_func() - Retry probing devices in the active list.
> */
> @@ -141,6 +145,11 @@ static void driver_deferred_probe_trigger(void)
> if (!driver_deferred_probe_enable)
> return;
>
> + if (atomic_read(&probe_count) > 1)
> + deferral_retry = true;
> + else
> + deferral_retry = false;

A few comments:
- Really need to comment why these lines are being added.
- I think this hunk needs to be moved to realy_probe(). It
doesn't make any sense when called via deferred_probe_initcall(), and
it doesn't work in the device_bind_driver path because the probe_count
is not incremented there. In fact, the device_bind_driver() path has
the same race condition, but it is unlikely to be a problem in
practice because device_bind_driver() is used very rarely and doesn't
execute any driver code.
- The 'if' is unnecessary:
deferred_retry = (atomic_read(&probe_count) > 1);

> +
> /*
> * A successful probe means that all the devices in the pending list
> * should be triggered to be reprobed. Move all the deferred devices
> @@ -259,9 +268,6 @@ int device_bind_driver(struct device *dev)
> }
> EXPORT_SYMBOL_GPL(device_bind_driver);
>
> -static atomic_t probe_count = ATOMIC_INIT(0);
> -static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
> -
> static int really_probe(struct device *dev, struct device_driver *drv)
> {
> int ret = 0;
> @@ -310,6 +316,16 @@ probe_failed:
> /* Driver requested deferred probing */
> dev_info(dev, "Driver %s requests probe deferral\n", drv->name);
> driver_deferred_probe_add(dev);
> + /*
> + * This is the last driver to load and asking to be deferred.
> + * If other driver(s) loaded while this driver was loading, we
> + * should try the deferred modules again to avoid missing
> + * dependency for this driver.
> + */
> + if (atomic_read(&probe_count) == 1 && deferral_retry) {
> + deferral_retry = false;
> + driver_deferred_probe_trigger();
> + }

Testing the probe count probably isn't necessary. Clearing the flag
though is probably racy if there are two deferred drivers in flight.

I would rather be happier if each probe could track on its own if there
had been any successful probes and then decide whether or not to trigger
again based on that, but when I played with it I found that it just
creates another race condition between calling really_probe() and
really_probe() grabbing a probe state footprint. Everything I tried made
things more complicated than less. Go ahead and add my a-b when you
respin the patch.

Acked-by: Grant Likely <[email protected]>


> } else if (ret != -ENODEV && ret != -ENXIO) {
> /* driver matched but the probe failed */
> printk(KERN_WARNING
> --
> 1.9.1
>

2014-04-08 13:16:38

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RESEND] drivercore: deferral race condition fix

On 04/08/2014 03:47 PM, Grant Likely wrote:
> On Tue, Apr 8, 2014 at 3:27 AM, Grant Likely <[email protected]> wrote:
>> On Thu, 3 Apr 2014 10:40:59 +0100, Mark Brown <[email protected]> wrote:
>>> On Thu, Apr 03, 2014 at 10:12:07AM +0300, Peter Ujfalusi wrote:
>>>> When the kernel is built with CONFIG_PREEMPT it is possible to reach a state
>>>> when all modules loaded but some driver still stuck in the deferred list
>>>> and there is a need for external event to kick the deferred queue to probe
>>>> these drivers.
>>>
>>> Acked-by: Mark Brown <[email protected]>
>>
>> It's a pretty crude solution though. The problem is any "in-flight"
>> probes that are going to defer will not get added to the active list.
>> Rerunning the entire active list is a bit much (but it does have the
>> advantage of still being conceptually simple). I think we can do better.
>>
>> Instead of running the entire list, we could add a check to
>> driver_deferred_probe_add() that adds the device to the active list
>> instead of pending list on the condition that another driver probe
>> completed while the deferred probe was in-flight.
>>
>> I'm playing with a solution now. I'll email a proposal shortly.
>
> Thinking out loud now...
>
> The race can occur whenever a probe in another thread completes
> successfully while the current probe is in-flight. If that has
> happened, then the defer condition may be resolved and the driver
> should be scheduled for retry immediately. If the core code can check
> for that condition, then we can add the driver directly to the active
> list and kick the workqueue.
>
> The problem is that we don't currently have an easy way to test if a
> probe has completed in another thread. This patch handles it with a
> single flag that gets set whenever a probe completes while another
> probe is executing. I was worried that this approach would be racy,
> but after running through the scenarios I can't find a situation where
> it wouldn't get added. I only concern I have remaining on this
> approach is that it will trigger unnecessary retries, but even that
> isn't really a problem because the pending list will have been moved
> to the active list *anyway*. It isn't even a retry of the whole list
> that's happening because most likely the only device on the pending
> list will be the one that completed with -EPROBE_DEFER.

This code will only going to add one retry and it is going to that only under
the condition you have described:
the last driver which finishes it's probe ends up with -EPROBE_DEFER and while
it's probe was in-flight another driver loaded with success.
In all other cases this will not trigger a retry:
If you load only one driver which ends up deferring,
If the driver which differing is not the last driver to load - since we will
have other drivers to be loaded and they will kick the list anyways.

> So, I actually think this is the right approach now. I'll reply to the
> patch itself and make some comments on the code.

Thanks, I'll check the comments.

--
P?ter

2014-04-08 13:35:45

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RESEND] drivercore: deferral race condition fix

On 04/08/2014 03:43 PM, Grant Likely wrote:
> On Thu, 3 Apr 2014 10:12:07 +0300, Peter Ujfalusi <[email protected]> wrote:
>> When the kernel is built with CONFIG_PREEMPT it is possible to reach a state
>> when all modules loaded but some driver still stuck in the deferred list
>> and there is a need for external event to kick the deferred queue to probe
>> these drivers.
>>
>> The issue has been observed on embedded systems with CONFIG_PREEMPT enabled,
>> audio support built as modules and using nfsroot for root filesystem.
>>
>> The following log fragment shows such sequence when all audio modules
>> were loaded but the sound card is not present since the machine driver has
>> failed to probe due to missing dependency during it's probe.
>> The board is am335x-evmsk (McASP<->tlv320aic3106 codec) with davinci-evm
>> machine driver:
>>
>> ...
>> [ 12.615118] davinci-mcasp 4803c000.mcasp: davinci_mcasp_probe: ENTER
>> [ 12.719969] davinci_evm sound.3: davinci_evm_probe: ENTER
>> [ 12.725753] davinci_evm sound.3: davinci_evm_probe: snd_soc_register_card
>> [ 12.753846] davinci-mcasp 4803c000.mcasp: davinci_mcasp_probe: snd_soc_register_component
>> [ 12.922051] davinci-mcasp 4803c000.mcasp: davinci_mcasp_probe: snd_soc_register_component DONE
>> [ 12.950839] davinci_evm sound.3: ASoC: platform (null) not registered
>> [ 12.957898] davinci_evm sound.3: davinci_evm_probe: snd_soc_register_card DONE (-517)
>> [ 13.099026] davinci-mcasp 4803c000.mcasp: Kicking the deferred list
>> [ 13.177838] davinci-mcasp 4803c000.mcasp: really_probe: probe_count = 2
>> [ 13.194130] davinci_evm sound.3: snd_soc_register_card failed (-517)
>> [ 13.346755] davinci_mcasp_driver_init: LEAVE
>> [ 13.377446] platform sound.3: Driver davinci_evm requests probe deferral
>> [ 13.592527] platform sound.3: really_probe: probe_count = 0
>>
>> In the log the machine driver enters it's probe at 12.719969 (this point it
>> has been removed from the deferred lists). McASP driver already executing
>> it's probing (since 12.615118).
>> The machine driver tries to construct the sound card (12.950839) but did
>> not found one of the components so it fails. After this McASP driver
>> registers all the ASoC components (the machine driver still in it's probe
>> function after it failed to construct the card) and the deferred work is
>> prepared at 13.099026 (note that this time the machine driver is not in the
>> lists so it is not going to be handled when the work is executing).
>> Lastly the machine driver exit from it's probe and the core places it to
>> the deferred list but there will be no other driver going to load and the
>> deferred queue is not going to be kicked again - till we have external event
>> like connecting USB stick, etc.
>>
>> The proposed solution is to try the deferred queue once more when the last
>> driver is asking for deferring and we had drivers loaded while this last
>> driver was probing.
>>
>> This way we can avoid drivers stuck in the deferred queue.
>>
>> Signed-off-by: Peter Ujfalusi <[email protected]>
>> ---
>> drivers/base/dd.c | 22 +++++++++++++++++++---
>> 1 file changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>> index 06051767393f..80703de6e6ad 100644
>> --- a/drivers/base/dd.c
>> +++ b/drivers/base/dd.c
>> @@ -53,6 +53,10 @@ static LIST_HEAD(deferred_probe_pending_list);
>> static LIST_HEAD(deferred_probe_active_list);
>> static struct workqueue_struct *deferred_wq;
>>
>> +static atomic_t probe_count = ATOMIC_INIT(0);
>> +static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
>> +static bool deferral_retry;
>> +
>> /**
>> * deferred_probe_work_func() - Retry probing devices in the active list.
>> */
>> @@ -141,6 +145,11 @@ static void driver_deferred_probe_trigger(void)
>> if (!driver_deferred_probe_enable)
>> return;
>>
>> + if (atomic_read(&probe_count) > 1)
>> + deferral_retry = true;
>> + else
>> + deferral_retry = false;
>
> A few comments:
> - Really need to comment why these lines are being added.
> - I think this hunk needs to be moved to realy_probe(). It
> doesn't make any sense when called via deferred_probe_initcall(), and
> it doesn't work in the device_bind_driver path because the probe_count
> is not incremented there. In fact, the device_bind_driver() path has
> the same race condition, but it is unlikely to be a problem in
> practice because device_bind_driver() is used very rarely and doesn't
> execute any driver code.

The reason why I have added the flagging to driver_deferred_probe_trigger()
because this is the place where the deferred drivers will be kicked.
When the drivers are loaded in order during the boot it is not really
interesting for this situation. When a driver has been moved to deferred queue
is the time we need to watch for the 'race' to handle.
I did have this flagging first in really_probe() but as far as I recall it
exhibited random misses.
The driver_deferred_probe_trigger() will be called every time when a driver
probed with success - from driver_bound(), right? So what we are doing is that
we set the deferral_retry flag if we have more than one driver's probe in
progress and see if when the last driver leaves it's probe we had another
loaded with success.
The probe_count will be decremented after the driver_bound() so if we had only
the two racy driver as last, we will have the flag set.
Hrm, probably it might be better for readability to move the deferral_retry
flag code just before the driver_bound() call in really_probe(). Inthis way we
will have these in one place.

> - The 'if' is unnecessary:
> deferred_retry = (atomic_read(&probe_count) > 1);
>
>> +
>> /*
>> * A successful probe means that all the devices in the pending list
>> * should be triggered to be reprobed. Move all the deferred devices
>> @@ -259,9 +268,6 @@ int device_bind_driver(struct device *dev)
>> }
>> EXPORT_SYMBOL_GPL(device_bind_driver);
>>
>> -static atomic_t probe_count = ATOMIC_INIT(0);
>> -static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
>> -
>> static int really_probe(struct device *dev, struct device_driver *drv)
>> {
>> int ret = 0;
>> @@ -310,6 +316,16 @@ probe_failed:
>> /* Driver requested deferred probing */
>> dev_info(dev, "Driver %s requests probe deferral\n", drv->name);
>> driver_deferred_probe_add(dev);
>> + /*
>> + * This is the last driver to load and asking to be deferred.
>> + * If other driver(s) loaded while this driver was loading, we
>> + * should try the deferred modules again to avoid missing
>> + * dependency for this driver.
>> + */
>> + if (atomic_read(&probe_count) == 1 && deferral_retry) {
>> + deferral_retry = false;
>> + driver_deferred_probe_trigger();
>> + }
>
> Testing the probe count probably isn't necessary. Clearing the flag
> though is probably racy if there are two deferred drivers in flight.

I think it is a good thing to have to avoid kicking the deferred list all the
time. If we still have 5 driver still probing we can just wait till the last
is gone and just check if we need to do an 'emergency' kick to the deferred list.

> I would rather be happier if each probe could track on its own if there
> had been any successful probes and then decide whether or not to trigger
> again based on that, but when I played with it I found that it just
> creates another race condition between calling really_probe() and
> really_probe() grabbing a probe state footprint. Everything I tried made
> things more complicated than less.

Yes, I also experimented with other ways but things got more fragile with even
more corner cases to handle and understand...

> Go ahead and add my a-b when you
> respin the patch.
>
> Acked-by: Grant Likely <[email protected]>

Thanks, I'll send the v2 tomorrow.

>
>
>> } else if (ret != -ENODEV && ret != -ENXIO) {
>> /* driver matched but the probe failed */
>> printk(KERN_WARNING
>> --
>> 1.9.1
>>
>


--
P?ter

2014-04-09 07:03:45

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [RESEND] drivercore: deferral race condition fix

On 04/08/2014 04:35 PM, Peter Ujfalusi wrote:
> On 04/08/2014 03:43 PM, Grant Likely wrote:
>>> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
>>> index 06051767393f..80703de6e6ad 100644
>>> --- a/drivers/base/dd.c
>>> +++ b/drivers/base/dd.c
>>> @@ -53,6 +53,10 @@ static LIST_HEAD(deferred_probe_pending_list);
>>> static LIST_HEAD(deferred_probe_active_list);
>>> static struct workqueue_struct *deferred_wq;
>>>
>>> +static atomic_t probe_count = ATOMIC_INIT(0);
>>> +static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
>>> +static bool deferral_retry;
>>> +
>>> /**
>>> * deferred_probe_work_func() - Retry probing devices in the active list.
>>> */
>>> @@ -141,6 +145,11 @@ static void driver_deferred_probe_trigger(void)
>>> if (!driver_deferred_probe_enable)
>>> return;
>>>
>>> + if (atomic_read(&probe_count) > 1)
>>> + deferral_retry = true;
>>> + else
>>> + deferral_retry = false;
>>
>> A few comments:
>> - Really need to comment why these lines are being added.
>> - I think this hunk needs to be moved to realy_probe(). It
>> doesn't make any sense when called via deferred_probe_initcall(), and
>> it doesn't work in the device_bind_driver path because the probe_count
>> is not incremented there. In fact, the device_bind_driver() path has
>> the same race condition, but it is unlikely to be a problem in
>> practice because device_bind_driver() is used very rarely and doesn't
>> execute any driver code.
>
> The reason why I have added the flagging to driver_deferred_probe_trigger()
> because this is the place where the deferred drivers will be kicked.
> When the drivers are loaded in order during the boot it is not really
> interesting for this situation. When a driver has been moved to deferred queue
> is the time we need to watch for the 'race' to handle.
> I did have this flagging first in really_probe() but as far as I recall it
> exhibited random misses.
> The driver_deferred_probe_trigger() will be called every time when a driver
> probed with success - from driver_bound(), right? So what we are doing is that
> we set the deferral_retry flag if we have more than one driver's probe in
> progress and see if when the last driver leaves it's probe we had another
> loaded with success.
> The probe_count will be decremented after the driver_bound() so if we had only
> the two racy driver as last, we will have the flag set.
> Hrm, probably it might be better for readability to move the deferral_retry
> flag code just before the driver_bound() call in really_probe(). Inthis way we
> will have these in one place.

Now that I had time to think about this again I think the really_probe() is a
wrong place for this failsafe mechanism.
At the end we need to look and handle the following case:
When a driver probed with success while other driver(s) still in their probe
(thus not present in the deferred lists) we need to flag this event in the
driver_deferred_probe_trigger() function, just before the
list_splice_tail_init() call - when the deferred list is prepared.
Basically we set a flag for later use, that we have prepared the deferred list
but there were drivers in-fligth which we do not yet know if they are going to
end up deferring.

We need to check this flag in driver_deferred_probe_add() function which adds
the driver to the deferred pending list in case it returned with -EPROBE_DEFER.
Here we check the flag and also check if this is the last driver known to us
probing (probe_count == 1).
If these conditions met, we call driver_deferred_probe_trigger() from here as
an automatic one shot try to see if the previously loaded driver had satisfied
the deferred last driver.

I need to move driver_deferred_probe_add() down a bit in the source file for
this and rename the flag I had to 'deferred_auto_retry' or something.
I think this is going to be more robust and also gives cleaner explanation
what this 'recovery' code meant to do.

>
>> - The 'if' is unnecessary:
>> deferred_retry = (atomic_read(&probe_count) > 1);
>>
>>> +
>>> /*
>>> * A successful probe means that all the devices in the pending list
>>> * should be triggered to be reprobed. Move all the deferred devices
>>> @@ -259,9 +268,6 @@ int device_bind_driver(struct device *dev)
>>> }
>>> EXPORT_SYMBOL_GPL(device_bind_driver);
>>>
>>> -static atomic_t probe_count = ATOMIC_INIT(0);
>>> -static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);
>>> -
>>> static int really_probe(struct device *dev, struct device_driver *drv)
>>> {
>>> int ret = 0;
>>> @@ -310,6 +316,16 @@ probe_failed:
>>> /* Driver requested deferred probing */
>>> dev_info(dev, "Driver %s requests probe deferral\n", drv->name);
>>> driver_deferred_probe_add(dev);
>>> + /*
>>> + * This is the last driver to load and asking to be deferred.
>>> + * If other driver(s) loaded while this driver was loading, we
>>> + * should try the deferred modules again to avoid missing
>>> + * dependency for this driver.
>>> + */
>>> + if (atomic_read(&probe_count) == 1 && deferral_retry) {
>>> + deferral_retry = false;
>>> + driver_deferred_probe_trigger();
>>> + }
>>
>> Testing the probe count probably isn't necessary. Clearing the flag
>> though is probably racy if there are two deferred drivers in flight.
>
> I think it is a good thing to have to avoid kicking the deferred list all the
> time. If we still have 5 driver still probing we can just wait till the last
> is gone and just check if we need to do an 'emergency' kick to the deferred list.
>
>> I would rather be happier if each probe could track on its own if there
>> had been any successful probes and then decide whether or not to trigger
>> again based on that, but when I played with it I found that it just
>> creates another race condition between calling really_probe() and
>> really_probe() grabbing a probe state footprint. Everything I tried made
>> things more complicated than less.
>
> Yes, I also experimented with other ways but things got more fragile with even
> more corner cases to handle and understand...
>
>> Go ahead and add my a-b when you
>> respin the patch.
>>
>> Acked-by: Grant Likely <[email protected]>
>
> Thanks, I'll send the v2 tomorrow.
>
>>
>>
>>> } else if (ret != -ENODEV && ret != -ENXIO) {
>>> /* driver matched but the probe failed */
>>> printk(KERN_WARNING
>>> --
>>> 1.9.1
>>>
>>
>
>


--
P?ter