2020-03-24 17:58:11

by Andy Shevchenko

[permalink] [raw]
Subject: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

Consider the following scenario.

The main driver of USB OTG controller (dwc3-pci), which has the following
functional dependencies on certain platform:
- ULPI (tusb1210)
- extcon (tested with extcon-intel-mrfld)

Note, that first driver, tusb1210, is available at the moment of
dwc3-pci probing, while extcon-intel-mrfld is built as a module and
won't appear till user space does something about it.

This is depicted by kernel configuration excerpt:

CONFIG_PHY_TUSB1210=y
CONFIG_USB_DWC3=y
CONFIG_USB_DWC3_ULPI=y
CONFIG_USB_DWC3_DUAL_ROLE=y
CONFIG_USB_DWC3_PCI=y
CONFIG_EXTCON_INTEL_MRFLD=m

In the Buildroot environment the modules are probed by alphabetical ordering
of their modaliases. The latter comes to the case when USB OTG driver will be
probed first followed by extcon one.

So, if the platform anticipates extcon device to be appeared, in the above case
we will get deferred probe of USB OTG, because of ordering.

Since current implementation, done by the commit 58b116bce136 ("drivercore:
deferral race condition fix") counts the amount of triggered deferred probe,
we never advance the situation -- the change makes it to be an infinite loop.

---8<---8<---

[ 22.187127] driver_deferred_probe_trigger <<< 1

...here is the late initcall triggers deferred probe...

[ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list

...dwc3.0.auto is the only device in the deferred list...

[ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1

...the counter before mutex is unlocked is kept the same...

[ 22.205663] platform dwc3.0.auto: Retrying from deferred list

...mutes has been unlocked, we try to re-probe the driver...

[ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
[ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
[ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
[ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
[ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
[ 22.263723] driver_deferred_probe_trigger <<< 2

...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...

[ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
[ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral

...but extcon driver is still missing...

[ 22.283174] platform dwc3.0.auto: Added to deferred list
[ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2

...and since we had a successful probe, we got counter mismatch...

[ 22.297490] driver_deferred_probe_trigger <<< 3
[ 22.302074] platform dwc3.0.auto: deferred_probe_work_func 2 <<< counter 3

...at the end we have a new counter and loop repeats again, see 22.198727...

---8<---8<---

Revert of the commit helps, but it is probably not helpful for the initially
found regression. Artem Bityutskiy suggested to use counter of the successful
probes instead. This fixes above mentioned case and shouldn't prevent driver
to reprobe deferred ones.

Under "successful probe" we understand the state when a driver of the certain
device is being kept bound after deferred probe trigger cycle. For instance,
in the above mentioned case probing of tusb1210 is not successful because dwc3
driver unbinds device dwc3.0.auto.ulpi. The atomic_dec() call is used to keep
track of this. The amount of bindings is always great than or equal to the
amount of unbindings as guaranteed by design of the driver binding mechanism.

Fixes: 58b116bce136 ("drivercore: deferral race condition fix")
Suggested-by: Artem Bityutskiy <[email protected]>
Cc: Grant Likely <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Felipe Balbi <[email protected]>
Cc: Andrzej Hajda <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Peter Ujfalusi <[email protected]>
Tested-by: Ferry Toth <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
---
v3: added comment about atomic_dec() to the commit message and code (Rafael)

drivers/base/dd.c | 44 ++++++++++++++++++++++++++------------------
1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index b25bcab2a26b..c1b445733150 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -53,7 +53,6 @@
static DEFINE_MUTEX(deferred_probe_mutex);
static LIST_HEAD(deferred_probe_pending_list);
static LIST_HEAD(deferred_probe_active_list);
-static atomic_t deferred_trigger_count = ATOMIC_INIT(0);
static struct dentry *deferred_devices;
static bool initcalls_done;

@@ -147,17 +146,6 @@ static bool driver_deferred_probe_enable = false;
* This functions moves all devices from the pending list to the active
* list and schedules the deferred probe workqueue to process them. It
* should be called anytime a driver is successfully bound to a device.
- *
- * Note, there is a race condition in multi-threaded probe. In the case where
- * more than one device is probing at the same time, it is possible for one
- * probe to complete successfully while another is about to defer. If the second
- * depends on the first, then it will get put on the pending list after the
- * trigger event has already occurred and will be stuck there.
- *
- * The atomic 'deferred_trigger_count' is used to determine if a successful
- * trigger has occurred in the midst of probing a driver. If the trigger count
- * changes in the midst of a probe, then deferred processing should be triggered
- * again.
*/
static void driver_deferred_probe_trigger(void)
{
@@ -170,7 +158,6 @@ static void driver_deferred_probe_trigger(void)
* into the active list so they can be retried by the workqueue
*/
mutex_lock(&deferred_probe_mutex);
- atomic_inc(&deferred_trigger_count);
list_splice_tail_init(&deferred_probe_pending_list,
&deferred_probe_active_list);
mutex_unlock(&deferred_probe_mutex);
@@ -350,6 +337,19 @@ static void __exit deferred_probe_exit(void)
}
__exitcall(deferred_probe_exit);

+/*
+ * Note, there is a race condition in multi-threaded probe. In the case where
+ * more than one device is probing at the same time, it is possible for one
+ * probe to complete successfully while another is about to defer. If the second
+ * depends on the first, then it will get put on the pending list after the
+ * trigger event has already occurred and will be stuck there.
+ *
+ * The atomic 'probe_okay' is used to determine if a successful probe has
+ * occurred in the midst of probing another driver. If the count changes in
+ * the midst of a probe, then deferred processing should be triggered again.
+ */
+static atomic_t probe_okay = ATOMIC_INIT(0);
+
/**
* device_is_bound() - Check if device is bound to a driver
* @dev: device to check
@@ -375,6 +375,7 @@ static void driver_bound(struct device *dev)
pr_debug("driver: '%s': %s: bound to device '%s'\n", dev->driver->name,
__func__, dev_name(dev));

+ atomic_inc(&probe_okay);
klist_add_tail(&dev->p->knode_driver, &dev->driver->p->klist_devices);
device_links_driver_bound(dev);

@@ -481,18 +482,18 @@ static atomic_t probe_count = ATOMIC_INIT(0);
static DECLARE_WAIT_QUEUE_HEAD(probe_waitqueue);

static void driver_deferred_probe_add_trigger(struct device *dev,
- int local_trigger_count)
+ int local_probe_okay_count)
{
driver_deferred_probe_add(dev);
/* Did a trigger occur while probing? Need to re-trigger if yes */
- if (local_trigger_count != atomic_read(&deferred_trigger_count))
+ if (local_probe_okay_count != atomic_read(&probe_okay))
driver_deferred_probe_trigger();
}

static int really_probe(struct device *dev, struct device_driver *drv)
{
int ret = -EPROBE_DEFER;
- int local_trigger_count = atomic_read(&deferred_trigger_count);
+ int local_probe_okay_count = atomic_read(&probe_okay);
bool test_remove = IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE) &&
!drv->suppress_bind_attrs;

@@ -509,7 +510,7 @@ static int really_probe(struct device *dev, struct device_driver *drv)

ret = device_links_check_suppliers(dev);
if (ret == -EPROBE_DEFER)
- driver_deferred_probe_add_trigger(dev, local_trigger_count);
+ driver_deferred_probe_add_trigger(dev, local_probe_okay_count);
if (ret)
return ret;

@@ -619,7 +620,7 @@ static int really_probe(struct device *dev, struct device_driver *drv)
case -EPROBE_DEFER:
/* Driver requested deferred probing */
dev_dbg(dev, "Driver %s requests probe deferral\n", drv->name);
- driver_deferred_probe_add_trigger(dev, local_trigger_count);
+ driver_deferred_probe_add_trigger(dev, local_probe_okay_count);
break;
case -ENODEV:
case -ENXIO:
@@ -1148,6 +1149,13 @@ static void __device_release_driver(struct device *dev, struct device *parent)
dev_pm_set_driver_flags(dev, 0);

klist_remove(&dev->p->knode_driver);
+ /*
+ * If a driver has been unbound from the device
+ * we won't consider the probe of the device
+ * successful.
+ */
+ atomic_dec(&probe_okay);
+
device_pm_check_callbacks(dev);
if (dev->bus)
blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
--
2.25.1


2020-03-25 03:29:47

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied


On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> Consider the following scenario.
>
> The main driver of USB OTG controller (dwc3-pci), which has the following
> functional dependencies on certain platform:
> - ULPI (tusb1210)
> - extcon (tested with extcon-intel-mrfld)
>
> Note, that first driver, tusb1210, is available at the moment of
> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> won't appear till user space does something about it.
>
> This is depicted by kernel configuration excerpt:
>
> CONFIG_PHY_TUSB1210=y
> CONFIG_USB_DWC3=y
> CONFIG_USB_DWC3_ULPI=y
> CONFIG_USB_DWC3_DUAL_ROLE=y
> CONFIG_USB_DWC3_PCI=y
> CONFIG_EXTCON_INTEL_MRFLD=m
>
> In the Buildroot environment the modules are probed by alphabetical ordering
> of their modaliases. The latter comes to the case when USB OTG driver will be
> probed first followed by extcon one.
>
> So, if the platform anticipates extcon device to be appeared, in the above case
> we will get deferred probe of USB OTG, because of ordering.
>
> Since current implementation, done by the commit 58b116bce136 ("drivercore:
> deferral race condition fix") counts the amount of triggered deferred probe,
> we never advance the situation -- the change makes it to be an infinite loop.
>

Hi Andy,

I'm trying to understand this sequence of steps. Sorry if the questions
are stupid -- I'm not very familiar with USB/PCI stuff.

> ---8<---8<---
>
> [ 22.187127] driver_deferred_probe_trigger <<< 1
>
> ...here is the late initcall triggers deferred probe...
>
> [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
>
> ...dwc3.0.auto is the only device in the deferred list...

Ok, dwc3.0.auto is the only unprobed device at this point?

>
> [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
>
> ...the counter before mutex is unlocked is kept the same...
>
> [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
>
> ...mutes has been unlocked, we try to re-probe the driver...
>
> [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> [ 22.263723] driver_deferred_probe_trigger <<< 2
>
> ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
>
> [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210

So where did this dwc3.0.auto.ulpi come from?
Looks like the device is created by dwc3_probe() through this call flow:
dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()

> [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral

Can you please point me to which code patch actually caused the probe
deferral?

> ...but extcon driver is still missing...
>
> [ 22.283174] platform dwc3.0.auto: Added to deferred list
> [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2

I'm not fully aware of all the USB implications, but if extcon is
needed, why can't that check be done before we add and probe the ulpi
device? That'll avoid this whole "fake" probing and avoid the counter
increase. And avoid the need for this patch that's touching the code
code that's already a bit delicate.

Also, with my limited experience with all the possible drivers in the
kernel, it's weird that the ulpi device is added and probed before we
make sure the parent device (dwc3.0.auto) can actually probe
successfully.

Most of the platform device code I've seen in systems with OF (device
tree) add the child devices towards the end of the parent's probe
function.

> ...and since we had a successful probe, we got counter mismatch...
>
> [ 22.297490] driver_deferred_probe_trigger <<< 3
> [ 22.302074] platform dwc3.0.auto: deferred_probe_work_func 2 <<< counter 3
>
> ...at the end we have a new counter and loop repeats again, see 22.198727...
>
> ---8<---8<---
>
> Revert of the commit helps, but it is probably not helpful for the initially
> found regression. Artem Bityutskiy suggested to use counter of the successful
> probes instead. This fixes above mentioned case and shouldn't prevent driver
> to reprobe deferred ones.
>
> Under "successful probe" we understand the state when a driver of the certain
> device is being kept bound after deferred probe trigger cycle. For instance,
> in the above mentioned case probing of tusb1210 is not successful because dwc3
> driver unbinds device dwc3.0.auto.ulpi. The atomic_dec() call is used to keep
> track of this. The amount of bindings is always great than or equal to the
> amount of unbindings as guaranteed by design of the driver binding mechanism.

The unbindings count can increase for other unrelated drivers unbinding
too. Wouldn't it? Seems a bit fragile and racy in a fashion similar to
the issue the original patch was trying to fix.

-Saravana

2020-03-25 12:52:57

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > Consider the following scenario.
> >
> > The main driver of USB OTG controller (dwc3-pci), which has the following
> > functional dependencies on certain platform:
> > - ULPI (tusb1210)
> > - extcon (tested with extcon-intel-mrfld)
> >
> > Note, that first driver, tusb1210, is available at the moment of
> > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > won't appear till user space does something about it.
> >
> > This is depicted by kernel configuration excerpt:
> >
> > CONFIG_PHY_TUSB1210=y
> > CONFIG_USB_DWC3=y
> > CONFIG_USB_DWC3_ULPI=y
> > CONFIG_USB_DWC3_DUAL_ROLE=y
> > CONFIG_USB_DWC3_PCI=y
> > CONFIG_EXTCON_INTEL_MRFLD=m
> >
> > In the Buildroot environment the modules are probed by alphabetical ordering
> > of their modaliases. The latter comes to the case when USB OTG driver will be
> > probed first followed by extcon one.
> >
> > So, if the platform anticipates extcon device to be appeared, in the above case
> > we will get deferred probe of USB OTG, because of ordering.
> >
> > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > deferral race condition fix") counts the amount of triggered deferred probe,
> > we never advance the situation -- the change makes it to be an infinite loop.
>
> Hi Andy,
>
> I'm trying to understand this sequence of steps. Sorry if the questions
> are stupid -- I'm not very familiar with USB/PCI stuff.

Thank you for looking into this. My answer below.

As a first thing I would like to tell that there is another example of bad
behaviour of deferred probe with no relation to USB. The proposed change also
fixes that one (however, less possible to find in real life).

> > ---8<---8<---
> >
> > [ 22.187127] driver_deferred_probe_trigger <<< 1
> >
> > ...here is the late initcall triggers deferred probe...
> >
> > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> >
> > ...dwc3.0.auto is the only device in the deferred list...
>
> Ok, dwc3.0.auto is the only unprobed device at this point?

Correct.

> > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> >
> > ...the counter before mutex is unlocked is kept the same...
> >
> > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> >
> > ...mutes has been unlocked, we try to re-probe the driver...
> >
> > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > [ 22.263723] driver_deferred_probe_trigger <<< 2
> >
> > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> >
> > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
>
> So where did this dwc3.0.auto.ulpi come from?

> Looks like the device is created by dwc3_probe() through this call flow:
> dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()

Correct.

> > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
>
> Can you please point me to which code patch actually caused the probe
> deferral?

Sure, it's in drd.c.

if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
edev = extcon_get_extcon_dev(name);
if (!edev)
return ERR_PTR(-EPROBE_DEFER);
return edev;
}

> > ...but extcon driver is still missing...
> >
> > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
>
> I'm not fully aware of all the USB implications, but if extcon is
> needed, why can't that check be done before we add and probe the ulpi
> device? That'll avoid this whole "fake" probing and avoid the counter
> increase. And avoid the need for this patch that's touching the code
> code that's already a bit delicate.

> Also, with my limited experience with all the possible drivers in the
> kernel, it's weird that the ulpi device is added and probed before we
> make sure the parent device (dwc3.0.auto) can actually probe
> successfully.

As I said above the deferred probe trigger has flaw on its own.
Even if we fix for USB case, there is (and probably will be) others.

> Most of the platform device code I've seen in systems with OF (device
> tree) add the child devices towards the end of the parent's probe
> function.

> > ...and since we had a successful probe, we got counter mismatch...
> >
> > [ 22.297490] driver_deferred_probe_trigger <<< 3
> > [ 22.302074] platform dwc3.0.auto: deferred_probe_work_func 2 <<< counter 3
> >
> > ...at the end we have a new counter and loop repeats again, see 22.198727...
> >
> > ---8<---8<---
> >
> > Revert of the commit helps, but it is probably not helpful for the initially
> > found regression. Artem Bityutskiy suggested to use counter of the successful
> > probes instead. This fixes above mentioned case and shouldn't prevent driver
> > to reprobe deferred ones.
> >
> > Under "successful probe" we understand the state when a driver of the certain
> > device is being kept bound after deferred probe trigger cycle. For instance,
> > in the above mentioned case probing of tusb1210 is not successful because dwc3
> > driver unbinds device dwc3.0.auto.ulpi. The atomic_dec() call is used to keep
> > track of this. The amount of bindings is always great than or equal to the
> > amount of unbindings as guaranteed by design of the driver binding mechanism.
>
> The unbindings count can increase for other unrelated drivers unbinding
> too. Wouldn't it? Seems a bit fragile and racy in a fashion similar to
> the issue the original patch was trying to fix.

Yes, it's (unlikely) possible (*), but it will give one more iteration per such
case. It's definitely better than infinite loop. Do you agree?

*) It means during probe you have _intensive_ removing, of course you may keep
kernel busy with iterations, but it has no practical sense. DoS attacks more
effective in different ways.

--
With Best Regards,
Andy Shevchenko


2020-03-25 22:09:41

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > > Consider the following scenario.
> > >
> > > The main driver of USB OTG controller (dwc3-pci), which has the following
> > > functional dependencies on certain platform:
> > > - ULPI (tusb1210)
> > > - extcon (tested with extcon-intel-mrfld)
> > >
> > > Note, that first driver, tusb1210, is available at the moment of
> > > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > > won't appear till user space does something about it.
> > >
> > > This is depicted by kernel configuration excerpt:
> > >
> > > CONFIG_PHY_TUSB1210=y
> > > CONFIG_USB_DWC3=y
> > > CONFIG_USB_DWC3_ULPI=y
> > > CONFIG_USB_DWC3_DUAL_ROLE=y
> > > CONFIG_USB_DWC3_PCI=y
> > > CONFIG_EXTCON_INTEL_MRFLD=m
> > >
> > > In the Buildroot environment the modules are probed by alphabetical ordering
> > > of their modaliases. The latter comes to the case when USB OTG driver will be
> > > probed first followed by extcon one.
> > >
> > > So, if the platform anticipates extcon device to be appeared, in the above case
> > > we will get deferred probe of USB OTG, because of ordering.
> > >
> > > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > > deferral race condition fix") counts the amount of triggered deferred probe,
> > > we never advance the situation -- the change makes it to be an infinite loop.
> >
> > Hi Andy,
> >
> > I'm trying to understand this sequence of steps. Sorry if the questions
> > are stupid -- I'm not very familiar with USB/PCI stuff.
>
> Thank you for looking into this. My answer below.
>
> As a first thing I would like to tell that there is another example of bad
> behaviour of deferred probe with no relation to USB. The proposed change also
> fixes that one (however, less possible to find in real life).

Unless I see what the other issue is, I can't speak for the unknown.

> > > ---8<---8<---
> > >
> > > [ 22.187127] driver_deferred_probe_trigger <<< 1
> > >
> > > ...here is the late initcall triggers deferred probe...
> > >
> > > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > >
> > > ...dwc3.0.auto is the only device in the deferred list...
> >
> > Ok, dwc3.0.auto is the only unprobed device at this point?
>
> Correct.
>
> > > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > >
> > > ...the counter before mutex is unlocked is kept the same...
> > >
> > > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > >
> > > ...mutes has been unlocked, we try to re-probe the driver...
> > >
> > > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > > [ 22.263723] driver_deferred_probe_trigger <<< 2
> > >
> > > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > >
> > > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> >
> > So where did this dwc3.0.auto.ulpi come from?
>
> > Looks like the device is created by dwc3_probe() through this call flow:
> > dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
>
> Correct.
>
> > > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> >
> > Can you please point me to which code patch actually caused the probe
> > deferral?
>
> Sure, it's in drd.c.
>
> if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> edev = extcon_get_extcon_dev(name);
> if (!edev)
> return ERR_PTR(-EPROBE_DEFER);
> return edev;
> }

Thanks for the confirmations and pointers. I assume
"linux,extcon-name" is a property that's obtained from ACPI? Because I
couldn't find a relevant reference to it elsewhere in the kernel.

> > > ...but extcon driver is still missing...
> > >
> > > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> >
> > I'm not fully aware of all the USB implications, but if extcon is
> > needed, why can't that check be done before we add and probe the ulpi
> > device? That'll avoid this whole "fake" probing and avoid the counter
> > increase. And avoid the need for this patch that's touching the code
> > code that's already a bit delicate.
>
> > Also, with my limited experience with all the possible drivers in the
> > kernel, it's weird that the ulpi device is added and probed before we
> > make sure the parent device (dwc3.0.auto) can actually probe
> > successfully.
>
> As I said above the deferred probe trigger has flaw on its own.

Definitely agree. I'm not saying deferred probe is perfect.

> Even if we fix for USB case, there is (and probably will be) others.
>
> > Most of the platform device code I've seen in systems with OF (device
> > tree) add the child devices towards the end of the parent's probe
> > function.
>
> > > ...and since we had a successful probe, we got counter mismatch...
> > >
> > > [ 22.297490] driver_deferred_probe_trigger <<< 3
> > > [ 22.302074] platform dwc3.0.auto: deferred_probe_work_func 2 <<< counter 3
> > >
> > > ...at the end we have a new counter and loop repeats again, see 22.198727...
> > >
> > > ---8<---8<---
> > >
> > > Revert of the commit helps, but it is probably not helpful for the initially
> > > found regression. Artem Bityutskiy suggested to use counter of the successful
> > > probes instead. This fixes above mentioned case and shouldn't prevent driver
> > > to reprobe deferred ones.
> > >
> > > Under "successful probe" we understand the state when a driver of the certain
> > > device is being kept bound after deferred probe trigger cycle. For instance,
> > > in the above mentioned case probing of tusb1210 is not successful because dwc3
> > > driver unbinds device dwc3.0.auto.ulpi. The atomic_dec() call is used to keep
> > > track of this. The amount of bindings is always great than or equal to the
> > > amount of unbindings as guaranteed by design of the driver binding mechanism.
> >
> > The unbindings count can increase for other unrelated drivers unbinding
> > too. Wouldn't it? Seems a bit fragile and racy in a fashion similar to
> > the issue the original patch was trying to fix.
>
> Yes, it's (unlikely) possible (*), but it will give one more iteration per such
> case. It's definitely better than infinite loop. Do you agree?

Sorry I wasn't being clear (I was in a rush). I'm saying this patch
can reintroduce the bug where the deferred probe isn't triggered when
it should be.

Let's take a simple execution flow.

probe_okay is at 10.

Thread-A
really_probe(Device-A)
local_probe_okay_count = 10
Device-A probe function is running...

Thread-B
really_probe(Device-B)
Device-B probes successfully.
probe_okay incremented to 11

Thread-C
Device-C (which had bound earlier) is unbound (say module is
unloaded or a million other reasons).
probe_okay is decremented to 10.

Thread-A continues
Device-A probe function returns -EPROBE_DEFER
driver_deferred_probe_add_trigger() doesn't do anything because
local_probe_okay_count == probe_okay
But Device-A might have deferred probe waiting on Device-B.
Device-A never probes.

> *) It means during probe you have _intensive_ removing, of course you may keep
> kernel busy with iterations, but it has no practical sense. DoS attacks more
> effective in different ways.

I wasn't worried about DoS attacks. More of a functional correctness
issue what I explained above.

Anyway, if your issue and similar issues can be handles in driver core
in a clean way without breaking other cases, I don't have any problem
with that. Just that, I think the current solution breaks other cases.

As an alternate solution, assuming "linux,extcon-name" is coming
from some firmware, you might want to look into the fw_devlink
feature.

That feature allows driver core to add device links from firmware
information. If you can get that feature to create device links from
your dwc3.0.auto (or its parent pci_dev?) to the extcon supplier
device, all of this can be sidestepped and your dwc3.0.auto's (or the
dwc pci_dev's) probe will be triggered only after extcon is probed.

I have very little familiarity with PCI/ACPI. I spent about an hour or
two poking at ACPI scan/property code. The relationship between a
pci_dev and an acpi_device is a bit confusing to me because I see:

static int dwc3_pci_probe(struct pci_dev *pci, const struct pci_device_id *id)
{
struct property_entry *p = (struct property_entry *)id->driver_data;
struct dwc3_pci *dwc;
struct resource res[2];
int ret;
struct device *dev = &pci->dev;
....
dwc->dwc3 = platform_device_alloc("dwc3", PLATFORM_DEVID_AUTO);
....
ACPI_COMPANION_SET(&dwc->dwc3->dev, ACPI_COMPANION(dev));

And ACPI_COMPANION returns an acpi_device by looking at dev->fwnode.
So how the heck is a pci_device.dev.fwnode pointing to an
acpi_device.fwnode?

Anyway, assuming the fwnode here points to an acpi_device's fwnode,
you can implement add_links() ops for acpi_device_fwnode_ops and get
fw_devlink working for ACPI.

I've implemented something similar for OF in drivers/of/property.c if
you are interested.

-Saravana

2020-03-26 08:41:37

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Wed, Mar 25, 2020 at 11:09 PM Saravana Kannan <[email protected]> wrote:
>
> On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
> <[email protected]> wrote:
> >

[cut]

> >
> > Yes, it's (unlikely) possible (*), but it will give one more iteration per such
> > case. It's definitely better than infinite loop. Do you agree?
>
> Sorry I wasn't being clear (I was in a rush). I'm saying this patch
> can reintroduce the bug where the deferred probe isn't triggered when
> it should be.
>
> Let's take a simple execution flow.
>
> probe_okay is at 10.
>
> Thread-A
> really_probe(Device-A)
> local_probe_okay_count = 10
> Device-A probe function is running...
>
> Thread-B
> really_probe(Device-B)
> Device-B probes successfully.
> probe_okay incremented to 11
>
> Thread-C
> Device-C (which had bound earlier) is unbound (say module is
> unloaded or a million other reasons).
> probe_okay is decremented to 10.
>
> Thread-A continues
> Device-A probe function returns -EPROBE_DEFER
> driver_deferred_probe_add_trigger() doesn't do anything because
> local_probe_okay_count == probe_okay
> But Device-A might have deferred probe waiting on Device-B.
> Device-A never probes.
>
> > *) It means during probe you have _intensive_ removing, of course you may keep
> > kernel busy with iterations, but it has no practical sense. DoS attacks more
> > effective in different ways.
>
> I wasn't worried about DoS attacks. More of a functional correctness
> issue what I explained above.

The code is functionally incorrect as is already AFAICS.

> Anyway, if your issue and similar issues can be handles in driver core
> in a clean way without breaking other cases, I don't have any problem
> with that. Just that, I think the current solution breaks other cases.

OK, so the situation right now is that commit 58b116bce136 has
introduced a regression and so it needs to be fixed or reverted. The
cases that were previously broken and were unbroken by that commit
don't matter here, so you cannot argue that they would be "broken".

It looks to me like the original issue fixed by the commit in question
needs to be addressed differently, so I would vote for reverting it
and starting over.

> As an alternate solution, assuming "linux,extcon-name" is coming
> from some firmware, you might want to look into the fw_devlink
> feature.

That would be a workaround for a driver core issue, though, wouldn't it?

> That feature allows driver core to add device links from firmware
> information. If you can get that feature to create device links from
> your dwc3.0.auto (or its parent pci_dev?) to the extcon supplier
> device, all of this can be sidestepped and your dwc3.0.auto's (or the
> dwc pci_dev's) probe will be triggered only after extcon is probed.
>
> I have very little familiarity with PCI/ACPI. I spent about an hour or
> two poking at ACPI scan/property code. The relationship between a
> pci_dev and an acpi_device is a bit confusing to me because I see:
>
> static int dwc3_pci_probe(struct pci_dev *pci, const struct pci_device_id *id)
> {
> struct property_entry *p = (struct property_entry *)id->driver_data;
> struct dwc3_pci *dwc;
> struct resource res[2];
> int ret;
> struct device *dev = &pci->dev;
> ....
> dwc->dwc3 = platform_device_alloc("dwc3", PLATFORM_DEVID_AUTO);
> ....
> ACPI_COMPANION_SET(&dwc->dwc3->dev, ACPI_COMPANION(dev));
>
> And ACPI_COMPANION returns an acpi_device by looking at dev->fwnode.
> So how the heck is a pci_device.dev.fwnode pointing to an
> acpi_device.fwnode?

acpi_device is an of_node counterpart (or it is an fwnode itself if you will).

Thanks!

2020-03-26 09:46:31

by Peter Ujfalusi

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

Hi,

On 26/03/2020 10.39, Rafael J. Wysocki wrote:
> On Wed, Mar 25, 2020 at 11:09 PM Saravana Kannan <[email protected]> wrote:
>>
>> On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
>> <[email protected]> wrote:
>>>
>
> [cut]
>
>>>
>>> Yes, it's (unlikely) possible (*), but it will give one more iteration per such
>>> case. It's definitely better than infinite loop. Do you agree?
>>
>> Sorry I wasn't being clear (I was in a rush). I'm saying this patch
>> can reintroduce the bug where the deferred probe isn't triggered when
>> it should be.
>>
>> Let's take a simple execution flow.
>>
>> probe_okay is at 10.
>>
>> Thread-A
>> really_probe(Device-A)
>> local_probe_okay_count = 10
>> Device-A probe function is running...
>>
>> Thread-B
>> really_probe(Device-B)
>> Device-B probes successfully.
>> probe_okay incremented to 11
>>
>> Thread-C
>> Device-C (which had bound earlier) is unbound (say module is
>> unloaded or a million other reasons).
>> probe_okay is decremented to 10.
>>
>> Thread-A continues
>> Device-A probe function returns -EPROBE_DEFER
>> driver_deferred_probe_add_trigger() doesn't do anything because
>> local_probe_okay_count == probe_okay
>> But Device-A might have deferred probe waiting on Device-B.
>> Device-A never probes.
>>
>>> *) It means during probe you have _intensive_ removing, of course you may keep
>>> kernel busy with iterations, but it has no practical sense. DoS attacks more
>>> effective in different ways.
>>
>> I wasn't worried about DoS attacks. More of a functional correctness
>> issue what I explained above.
>
> The code is functionally incorrect as is already AFAICS.
>
>> Anyway, if your issue and similar issues can be handles in driver core
>> in a clean way without breaking other cases, I don't have any problem
>> with that. Just that, I think the current solution breaks other cases.
>
> OK, so the situation right now is that commit 58b116bce136 has
> introduced a regression and so it needs to be fixed or reverted. The
> cases that were previously broken and were unbroken by that commit
> don't matter here, so you cannot argue that they would be "broken".

commit 58b116bce136 is from 2014 and the whole ULPI support for dwc3
came in a year later.
While I agree that 58b116bce136 fail to handle came a year later, but
technically it did not introduced a regression.

The revert on the other hand is going to introduce a regression as
things were working fine since 2014. Not sure why the dwc3 issue got
this long to be noticed as the 58b116bce136 was already in kernel when
the ULPI support was added...

> It looks to me like the original issue fixed by the commit in question
> needs to be addressed differently, so I would vote for reverting it
> and starting over.

Fwiw my original approach was a bit different:
https://lore.kernel.org/patchwork/patch/454800/

Greg changed it to what ended up in the kernel:
https://lore.kernel.org/patchwork/cover/454799/

>> As an alternate solution, assuming "linux,extcon-name" is coming
>> from some firmware, you might want to look into the fw_devlink
>> feature.
>
> That would be a workaround for a driver core issue, though, wouldn't it?
>
>> That feature allows driver core to add device links from firmware
>> information. If you can get that feature to create device links from
>> your dwc3.0.auto (or its parent pci_dev?) to the extcon supplier
>> device, all of this can be sidestepped and your dwc3.0.auto's (or the
>> dwc pci_dev's) probe will be triggered only after extcon is probed.
>>
>> I have very little familiarity with PCI/ACPI. I spent about an hour or
>> two poking at ACPI scan/property code. The relationship between a
>> pci_dev and an acpi_device is a bit confusing to me because I see:
>>
>> static int dwc3_pci_probe(struct pci_dev *pci, const struct pci_device_id *id)
>> {
>> struct property_entry *p = (struct property_entry *)id->driver_data;
>> struct dwc3_pci *dwc;
>> struct resource res[2];
>> int ret;
>> struct device *dev = &pci->dev;
>> ....
>> dwc->dwc3 = platform_device_alloc("dwc3", PLATFORM_DEVID_AUTO);
>> ....
>> ACPI_COMPANION_SET(&dwc->dwc3->dev, ACPI_COMPANION(dev));
>>
>> And ACPI_COMPANION returns an acpi_device by looking at dev->fwnode.
>> So how the heck is a pci_device.dev.fwnode pointing to an
>> acpi_device.fwnode?
>
> acpi_device is an of_node counterpart (or it is an fwnode itself if you will).
>
> Thanks!
>

- Péter

Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki.
Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki

2020-03-26 11:55:10

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Wed, Mar 25, 2020 at 03:08:29PM -0700, Saravana Kannan wrote:
> On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
> <[email protected]> wrote:
> > On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > > On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > > > Consider the following scenario.
> > > >
> > > > The main driver of USB OTG controller (dwc3-pci), which has the following
> > > > functional dependencies on certain platform:
> > > > - ULPI (tusb1210)
> > > > - extcon (tested with extcon-intel-mrfld)
> > > >
> > > > Note, that first driver, tusb1210, is available at the moment of
> > > > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > > > won't appear till user space does something about it.
> > > >
> > > > This is depicted by kernel configuration excerpt:
> > > >
> > > > CONFIG_PHY_TUSB1210=y
> > > > CONFIG_USB_DWC3=y
> > > > CONFIG_USB_DWC3_ULPI=y
> > > > CONFIG_USB_DWC3_DUAL_ROLE=y
> > > > CONFIG_USB_DWC3_PCI=y
> > > > CONFIG_EXTCON_INTEL_MRFLD=m
> > > >
> > > > In the Buildroot environment the modules are probed by alphabetical ordering
> > > > of their modaliases. The latter comes to the case when USB OTG driver will be
> > > > probed first followed by extcon one.
> > > >
> > > > So, if the platform anticipates extcon device to be appeared, in the above case
> > > > we will get deferred probe of USB OTG, because of ordering.
> > > >
> > > > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > > > deferral race condition fix") counts the amount of triggered deferred probe,
> > > > we never advance the situation -- the change makes it to be an infinite loop.
> > >
> > > Hi Andy,
> > >
> > > I'm trying to understand this sequence of steps. Sorry if the questions
> > > are stupid -- I'm not very familiar with USB/PCI stuff.
> >
> > Thank you for looking into this. My answer below.
> >
> > As a first thing I would like to tell that there is another example of bad
> > behaviour of deferred probe with no relation to USB. The proposed change also
> > fixes that one (however, less possible to find in real life).
>
> Unless I see what the other issue is, I can't speak for the unknown.

Okay, let's talk about other case (actually it's the one which I had noticed
approximately at the time when culprit patch made the kernel).

For some debugging purposes I have been using pin control table in board code.

Since I would like to boot kernel on different systems I have some tables for
non-existing pin control device. Pin control framework returns -EPROBE_DEFER
when trying to probe device with attached table for wrong pin control. This is
fine, the problem is that *any* successfully probed device, which happens in
the deferred probe initcall will desynchronize existing counter. As a result ->
infinite loop. For the record, I didn't realize and didn't investigate that
time the issue and now I can confirm that this is a culprit which is fixed by
this patch.

> > > > ---8<---8<---
> > > >
> > > > [ 22.187127] driver_deferred_probe_trigger <<< 1
> > > >
> > > > ...here is the late initcall triggers deferred probe...
> > > >
> > > > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > > >
> > > > ...dwc3.0.auto is the only device in the deferred list...
> > >
> > > Ok, dwc3.0.auto is the only unprobed device at this point?
> >
> > Correct.
> >
> > > > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > > >
> > > > ...the counter before mutex is unlocked is kept the same...
> > > >
> > > > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > > >
> > > > ...mutes has been unlocked, we try to re-probe the driver...
> > > >
> > > > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > > > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > > > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > > > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > > > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > > > [ 22.263723] driver_deferred_probe_trigger <<< 2
> > > >
> > > > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > > >
> > > > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> > >
> > > So where did this dwc3.0.auto.ulpi come from?
> >
> > > Looks like the device is created by dwc3_probe() through this call flow:
> > > dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > > dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> >
> > Correct.
> >
> > > > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> > >
> > > Can you please point me to which code patch actually caused the probe
> > > deferral?
> >
> > Sure, it's in drd.c.
> >
> > if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> > edev = extcon_get_extcon_dev(name);
> > if (!edev)
> > return ERR_PTR(-EPROBE_DEFER);
> > return edev;
> > }
>
> Thanks for the confirmations and pointers. I assume
> "linux,extcon-name" is a property that's obtained from ACPI? Because I
> couldn't find a relevant reference to it elsewhere in the kernel.

Yes.

> > > > ...but extcon driver is still missing...
> > > >
> > > > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > > > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> > >
> > > I'm not fully aware of all the USB implications, but if extcon is
> > > needed, why can't that check be done before we add and probe the ulpi
> > > device? That'll avoid this whole "fake" probing and avoid the counter
> > > increase. And avoid the need for this patch that's touching the code
> > > code that's already a bit delicate.
> >
> > > Also, with my limited experience with all the possible drivers in the
> > > kernel, it's weird that the ulpi device is added and probed before we
> > > make sure the parent device (dwc3.0.auto) can actually probe
> > > successfully.
> >
> > As I said above the deferred probe trigger has flaw on its own.
>
> Definitely agree. I'm not saying deferred probe is perfect.
>
> > Even if we fix for USB case, there is (and probably will be) others.
> >
> > > Most of the platform device code I've seen in systems with OF (device
> > > tree) add the child devices towards the end of the parent's probe
> > > function.

I realized also that your fix won't work if we change extcon to be compiled in
and ULPI to be a module. So, any driver with two or more strict dependencies
one of which is satisfied and one is not will end up in this infinite loop.

> > > > ...and since we had a successful probe, we got counter mismatch...
> > > >
> > > > [ 22.297490] driver_deferred_probe_trigger <<< 3
> > > > [ 22.302074] platform dwc3.0.auto: deferred_probe_work_func 2 <<< counter 3
> > > >
> > > > ...at the end we have a new counter and loop repeats again, see 22.198727...
> > > >
> > > > ---8<---8<---
> > > >
> > > > Revert of the commit helps, but it is probably not helpful for the initially
> > > > found regression. Artem Bityutskiy suggested to use counter of the successful
> > > > probes instead. This fixes above mentioned case and shouldn't prevent driver
> > > > to reprobe deferred ones.
> > > >
> > > > Under "successful probe" we understand the state when a driver of the certain
> > > > device is being kept bound after deferred probe trigger cycle. For instance,
> > > > in the above mentioned case probing of tusb1210 is not successful because dwc3
> > > > driver unbinds device dwc3.0.auto.ulpi. The atomic_dec() call is used to keep
> > > > track of this. The amount of bindings is always great than or equal to the
> > > > amount of unbindings as guaranteed by design of the driver binding mechanism.
> > >
> > > The unbindings count can increase for other unrelated drivers unbinding
> > > too. Wouldn't it? Seems a bit fragile and racy in a fashion similar to
> > > the issue the original patch was trying to fix.
> >
> > Yes, it's (unlikely) possible (*), but it will give one more iteration per such
> > case. It's definitely better than infinite loop. Do you agree?
>
> Sorry I wasn't being clear (I was in a rush). I'm saying this patch
> can reintroduce the bug where the deferred probe isn't triggered when
> it should be.

I don't think so. If I'm not mistaken we still have one more cycle to trigger probe.

> Let's take a simple execution flow.
>
> probe_okay is at 10.
>
> Thread-A
> really_probe(Device-A)
> local_probe_okay_count = 10
> Device-A probe function is running...
>
> Thread-B
> really_probe(Device-B)
> Device-B probes successfully.
> probe_okay incremented to 11

And probe trigger task is called. It goes to the loop because counters are not the same.

> Thread-C
> Device-C (which had bound earlier) is unbound (say module is
> unloaded or a million other reasons).
> probe_okay is decremented to 10.


> Thread-A continues
> Device-A probe function returns -EPROBE_DEFER
> driver_deferred_probe_add_trigger() doesn't do anything because
> local_probe_okay_count == probe_okay
> But Device-A might have deferred probe waiting on Device-B.
> Device-A never probes.

See above.

> > *) It means during probe you have _intensive_ removing, of course you may keep
> > kernel busy with iterations, but it has no practical sense. DoS attacks more
> > effective in different ways.
>
> I wasn't worried about DoS attacks. More of a functional correctness
> issue what I explained above.
>
> Anyway, if your issue and similar issues can be handles in driver core
> in a clean way without breaking other cases, I don't have any problem
> with that. Just that, I think the current solution breaks other cases.

> As an alternate solution, assuming "linux,extcon-name" is coming
> from some firmware, you might want to look into the fw_devlink
> feature.

Let's forget about USB. Don't be fixed on it. It one of the particular case out
of others. I'm not going to comment USB particularities, sorry. It seems to me
as not related.

--
With Best Regards,
Andy Shevchenko


2020-03-26 11:58:27

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Thu, Mar 26, 2020 at 09:39:40AM +0100, Rafael J. Wysocki wrote:
> On Wed, Mar 25, 2020 at 11:09 PM Saravana Kannan <[email protected]> wrote:
> > On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
> > <[email protected]> wrote:

> > > Yes, it's (unlikely) possible (*), but it will give one more iteration per such
> > > case. It's definitely better than infinite loop. Do you agree?
> >
> > Sorry I wasn't being clear (I was in a rush). I'm saying this patch
> > can reintroduce the bug where the deferred probe isn't triggered when
> > it should be.
> >
> > Let's take a simple execution flow.
> >
> > probe_okay is at 10.
> >
> > Thread-A
> > really_probe(Device-A)
> > local_probe_okay_count = 10
> > Device-A probe function is running...
> >
> > Thread-B
> > really_probe(Device-B)
> > Device-B probes successfully.
> > probe_okay incremented to 11
> >
> > Thread-C
> > Device-C (which had bound earlier) is unbound (say module is
> > unloaded or a million other reasons).
> > probe_okay is decremented to 10.
> >
> > Thread-A continues
> > Device-A probe function returns -EPROBE_DEFER
> > driver_deferred_probe_add_trigger() doesn't do anything because
> > local_probe_okay_count == probe_okay
> > But Device-A might have deferred probe waiting on Device-B.
> > Device-A never probes.
> >
> > > *) It means during probe you have _intensive_ removing, of course you may keep
> > > kernel busy with iterations, but it has no practical sense. DoS attacks more
> > > effective in different ways.
> >
> > I wasn't worried about DoS attacks. More of a functional correctness
> > issue what I explained above.
>
> The code is functionally incorrect as is already AFAICS.
>
> > Anyway, if your issue and similar issues can be handles in driver core
> > in a clean way without breaking other cases, I don't have any problem
> > with that. Just that, I think the current solution breaks other cases.
>
> OK, so the situation right now is that commit 58b116bce136 has
> introduced a regression and so it needs to be fixed or reverted. The
> cases that were previously broken and were unbroken by that commit
> don't matter here, so you cannot argue that they would be "broken".
>
> It looks to me like the original issue fixed by the commit in question
> needs to be addressed differently, so I would vote for reverting it
> and starting over.

I think Saravana's example is not fully correct as I had responded to his mail.
I would like to hear Grant, but seems he is busy with something and didn't reply.

> > As an alternate solution, assuming "linux,extcon-name" is coming
> > from some firmware, you might want to look into the fw_devlink
> > feature.
>
> That would be a workaround for a driver core issue, though, wouldn't it?

As I explained to him, this issue is not limited to USB case.

--
With Best Regards,
Andy Shevchenko


2020-03-26 12:06:48

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Thu, Mar 26, 2020 at 11:45:18AM +0200, Peter Ujfalusi wrote:
> On 26/03/2020 10.39, Rafael J. Wysocki wrote:
> > On Wed, Mar 25, 2020 at 11:09 PM Saravana Kannan <[email protected]> wrote:
> >> On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
> >> <[email protected]> wrote:

...

> > OK, so the situation right now is that commit 58b116bce136 has
> > introduced a regression and so it needs to be fixed or reverted. The
> > cases that were previously broken and were unbroken by that commit
> > don't matter here, so you cannot argue that they would be "broken".
>
> commit 58b116bce136 is from 2014 and the whole ULPI support for dwc3
> came in a year later.
> While I agree that 58b116bce136 fail to handle came a year later, but
> technically it did not introduced a regression.
>
> The revert on the other hand is going to introduce a regression as
> things were working fine since 2014. Not sure why the dwc3 issue got
> this long to be noticed as the 58b116bce136 was already in kernel when
> the ULPI support was added...

I dare to say that is luck based on people's laziness to figure out the root
cause. As I pointed out in email to Saravana the issue is not limited to USB
case and, if my memory doesn't trick me out, I suffered from it approximately
in ~2014-2015 with pin control tables.

> > It looks to me like the original issue fixed by the commit in question
> > needs to be addressed differently, so I would vote for reverting it
> > and starting over.
>
> Fwiw my original approach was a bit different:
> https://lore.kernel.org/patchwork/patch/454800/

Can you apply above scenario to your patch and see if it solves the issue?
On the other hand to you possess the hardware you can test the original
scenario with my patch applied?

> Greg changed it to what ended up in the kernel:
> https://lore.kernel.org/patchwork/cover/454799/

--
With Best Regards,
Andy Shevchenko


2020-03-26 13:47:02

by Grant Likely

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied



On 26/03/2020 12:03, Andy Shevchenko wrote:
> On Thu, Mar 26, 2020 at 11:45:18AM +0200, Peter Ujfalusi wrote:
>> On 26/03/2020 10.39, Rafael J. Wysocki wrote:
>>> On Wed, Mar 25, 2020 at 11:09 PM Saravana Kannan <[email protected]> wrote:
>>>> On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
>>>> <[email protected]> wrote:
>
> ...
>
>>> OK, so the situation right now is that commit 58b116bce136 has
>>> introduced a regression and so it needs to be fixed or reverted. The
>>> cases that were previously broken and were unbroken by that commit
>>> don't matter here, so you cannot argue that they would be "broken".
>>
>> commit 58b116bce136 is from 2014 and the whole ULPI support for dwc3
>> came in a year later.
>> While I agree that 58b116bce136 fail to handle came a year later, but
>> technically it did not introduced a regression.
>>
>> The revert on the other hand is going to introduce a regression as
>> things were working fine since 2014. Not sure why the dwc3 issue got
>> this long to be noticed as the 58b116bce136 was already in kernel when
>> the ULPI support was added...
>
> I dare to say that is luck based on people's laziness to figure out the root
> cause. As I pointed out in email to Saravana the issue is not limited to USB
> case and, if my memory doesn't trick me out, I suffered from it approximately
> in ~2014-2015 with pin control tables.

I've not been involved in this for a very long time, but from our past
conversations and the description that is given here I still feel that
this problem is a design bug on the dwc3 driver dependencies rather than
a failure with driver core. dwc3 is doing something rather convoluted
and it would be worth reevaluating how probe failures are unwound on
that particular driver stack.

g.

2020-03-26 13:49:34

by Grant Likely

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied



On 26/03/2020 11:57, Andy Shevchenko wrote:
> On Thu, Mar 26, 2020 at 09:39:40AM +0100, Rafael J. Wysocki wrote:
>> On Wed, Mar 25, 2020 at 11:09 PM Saravana Kannan <[email protected]> wrote:
>>> On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
>>> <[email protected]> wrote:
>
>>>> Yes, it's (unlikely) possible (*), but it will give one more iteration per such
>>>> case. It's definitely better than infinite loop. Do you agree?
>>>
>>> Sorry I wasn't being clear (I was in a rush). I'm saying this patch
>>> can reintroduce the bug where the deferred probe isn't triggered when
>>> it should be.
>>>
>>> Let's take a simple execution flow.
>>>
>>> probe_okay is at 10.
>>>
>>> Thread-A
>>> really_probe(Device-A)
>>> local_probe_okay_count = 10
>>> Device-A probe function is running...
>>>
>>> Thread-B
>>> really_probe(Device-B)
>>> Device-B probes successfully.
>>> probe_okay incremented to 11
>>>
>>> Thread-C
>>> Device-C (which had bound earlier) is unbound (say module is
>>> unloaded or a million other reasons).
>>> probe_okay is decremented to 10.
>>>
>>> Thread-A continues
>>> Device-A probe function returns -EPROBE_DEFER
>>> driver_deferred_probe_add_trigger() doesn't do anything because
>>> local_probe_okay_count == probe_okay
>>> But Device-A might have deferred probe waiting on Device-B.
>>> Device-A never probes.
>>>
>>>> *) It means during probe you have _intensive_ removing, of course you may keep
>>>> kernel busy with iterations, but it has no practical sense. DoS attacks more
>>>> effective in different ways.
>>>
>>> I wasn't worried about DoS attacks. More of a functional correctness
>>> issue what I explained above.
>>
>> The code is functionally incorrect as is already AFAICS.
>>
>>> Anyway, if your issue and similar issues can be handles in driver core
>>> in a clean way without breaking other cases, I don't have any problem
>>> with that. Just that, I think the current solution breaks other cases.
>>
>> OK, so the situation right now is that commit 58b116bce136 has
>> introduced a regression and so it needs to be fixed or reverted. The
>> cases that were previously broken and were unbroken by that commit
>> don't matter here, so you cannot argue that they would be "broken".
>>
>> It looks to me like the original issue fixed by the commit in question
>> needs to be addressed differently, so I would vote for reverting it
>> and starting over.
>
> I think Saravana's example is not fully correct as I had responded to his mail.
> I would like to hear Grant, but seems he is busy with something and didn't reply.

Sadly I don't look much like a kernel developer these days. The last
code change I committed to the kernel was over 4 years ago.

g.

2020-03-26 14:25:09

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Thu, Mar 26, 2020 at 01:45:50PM +0000, Grant Likely wrote:
> On 26/03/2020 12:03, Andy Shevchenko wrote:
> > On Thu, Mar 26, 2020 at 11:45:18AM +0200, Peter Ujfalusi wrote:
> > > On 26/03/2020 10.39, Rafael J. Wysocki wrote:
> > > > On Wed, Mar 25, 2020 at 11:09 PM Saravana Kannan <[email protected]> wrote:
> > > > > On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
> > > > > <[email protected]> wrote:
> >
> > ...
> >
> > > > OK, so the situation right now is that commit 58b116bce136 has
> > > > introduced a regression and so it needs to be fixed or reverted. The
> > > > cases that were previously broken and were unbroken by that commit
> > > > don't matter here, so you cannot argue that they would be "broken".
> > >
> > > commit 58b116bce136 is from 2014 and the whole ULPI support for dwc3
> > > came in a year later.
> > > While I agree that 58b116bce136 fail to handle came a year later, but
> > > technically it did not introduced a regression.
> > >
> > > The revert on the other hand is going to introduce a regression as
> > > things were working fine since 2014. Not sure why the dwc3 issue got
> > > this long to be noticed as the 58b116bce136 was already in kernel when
> > > the ULPI support was added...
> >
> > I dare to say that is luck based on people's laziness to figure out the root
> > cause. As I pointed out in email to Saravana the issue is not limited to USB
> > case and, if my memory doesn't trick me out, I suffered from it approximately
> > in ~2014-2015 with pin control tables.
>
> I've not been involved in this for a very long time, but from our past
> conversations and the description that is given here I still feel that this
> problem is a design bug on the dwc3 driver dependencies rather than a
> failure with driver core. dwc3 is doing something rather convoluted and it
> would be worth reevaluating how probe failures are unwound on that
> particular driver stack.

I disagree. Have you chance to look into another example I gave to Saravana?

The unbalanced increment is fragile per se, because you can't guarantee that
it will be no unsynchronization between probed successfully (unrelated!) and
deferred drivers.

--
With Best Regards,
Andy Shevchenko


2020-03-26 14:48:55

by Grant Likely

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied



On 26/03/2020 11:54, Andy Shevchenko wrote:
> On Wed, Mar 25, 2020 at 03:08:29PM -0700, Saravana Kannan wrote:
>> On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
>> <[email protected]> wrote:
>>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
>>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
>>>>> Consider the following scenario.
>>>>>
>>>>> The main driver of USB OTG controller (dwc3-pci), which has the following
>>>>> functional dependencies on certain platform:
>>>>> - ULPI (tusb1210)
>>>>> - extcon (tested with extcon-intel-mrfld)
>>>>>
>>>>> Note, that first driver, tusb1210, is available at the moment of
>>>>> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
>>>>> won't appear till user space does something about it.
>>>>>
>>>>> This is depicted by kernel configuration excerpt:
>>>>>
>>>>> CONFIG_PHY_TUSB1210=y
>>>>> CONFIG_USB_DWC3=y
>>>>> CONFIG_USB_DWC3_ULPI=y
>>>>> CONFIG_USB_DWC3_DUAL_ROLE=y
>>>>> CONFIG_USB_DWC3_PCI=y
>>>>> CONFIG_EXTCON_INTEL_MRFLD=m
>>>>>
>>>>> In the Buildroot environment the modules are probed by alphabetical ordering
>>>>> of their modaliases. The latter comes to the case when USB OTG driver will be
>>>>> probed first followed by extcon one.
>>>>>
>>>>> So, if the platform anticipates extcon device to be appeared, in the above case
>>>>> we will get deferred probe of USB OTG, because of ordering.
>>>>>
>>>>> Since current implementation, done by the commit 58b116bce136 ("drivercore:
>>>>> deferral race condition fix") counts the amount of triggered deferred probe,
>>>>> we never advance the situation -- the change makes it to be an infinite loop.
>>>>
>>>> Hi Andy,
>>>>
>>>> I'm trying to understand this sequence of steps. Sorry if the questions
>>>> are stupid -- I'm not very familiar with USB/PCI stuff.
>>>
>>> Thank you for looking into this. My answer below.
>>>
>>> As a first thing I would like to tell that there is another example of bad
>>> behaviour of deferred probe with no relation to USB. The proposed change also
>>> fixes that one (however, less possible to find in real life).
>>
>> Unless I see what the other issue is, I can't speak for the unknown.
>
> Okay, let's talk about other case (actually it's the one which I had noticed
> approximately at the time when culprit patch made the kernel).
>
> For some debugging purposes I have been using pin control table in board code.
>
> Since I would like to boot kernel on different systems I have some tables for
> non-existing pin control device. Pin control framework returns -EPROBE_DEFER
> when trying to probe device with attached table for wrong pin control. This is
> fine, the problem is that *any* successfully probed device, which happens in
> the deferred probe initcall will desynchronize existing counter. As a result ->
> infinite loop. For the record, I didn't realize and didn't investigate that
> time the issue and now I can confirm that this is a culprit which is fixed by
> this patch.

Specific code path please?

g.

2020-03-26 15:02:47

by Grant Likely

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied



On 25/03/2020 12:51, Andy Shevchenko wrote:
> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
>>> Consider the following scenario.
>>>
>>> The main driver of USB OTG controller (dwc3-pci), which has the following
>>> functional dependencies on certain platform:
>>> - ULPI (tusb1210)
>>> - extcon (tested with extcon-intel-mrfld)
>>>
>>> Note, that first driver, tusb1210, is available at the moment of
>>> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
>>> won't appear till user space does something about it.
>>>
>>> This is depicted by kernel configuration excerpt:
>>>
>>> CONFIG_PHY_TUSB1210=y
>>> CONFIG_USB_DWC3=y
>>> CONFIG_USB_DWC3_ULPI=y
>>> CONFIG_USB_DWC3_DUAL_ROLE=y
>>> CONFIG_USB_DWC3_PCI=y
>>> CONFIG_EXTCON_INTEL_MRFLD=m
>>>
>>> In the Buildroot environment the modules are probed by alphabetical ordering
>>> of their modaliases. The latter comes to the case when USB OTG driver will be
>>> probed first followed by extcon one.
>>>
>>> So, if the platform anticipates extcon device to be appeared, in the above case
>>> we will get deferred probe of USB OTG, because of ordering.
>>>
>>> Since current implementation, done by the commit 58b116bce136 ("drivercore:
>>> deferral race condition fix") counts the amount of triggered deferred probe,
>>> we never advance the situation -- the change makes it to be an infinite loop.
>>
>> Hi Andy,
>>
>> I'm trying to understand this sequence of steps. Sorry if the questions
>> are stupid -- I'm not very familiar with USB/PCI stuff.
>
> Thank you for looking into this. My answer below.
>
> As a first thing I would like to tell that there is another example of bad
> behaviour of deferred probe with no relation to USB. The proposed change also
> fixes that one (however, less possible to find in real life).
>
>>> ---8<---8<---
>>>
>>> [ 22.187127] driver_deferred_probe_trigger <<< 1
>>>
>>> ...here is the late initcall triggers deferred probe...
>>>
>>> [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
>>>
>>> ...dwc3.0.auto is the only device in the deferred list...
>>
>> Ok, dwc3.0.auto is the only unprobed device at this point?
>
> Correct.
>
>>> [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
>>>
>>> ...the counter before mutex is unlocked is kept the same...
>>>
>>> [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
>>>
>>> ...mutes has been unlocked, we try to re-probe the driver...
>>>
>>> [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
>>> [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
>>> [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
>>> [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
>>> [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
>>> [ 22.263723] driver_deferred_probe_trigger <<< 2
>>>
>>> ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
>>>
>>> [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
>>
>> So where did this dwc3.0.auto.ulpi come from?
>
>> Looks like the device is created by dwc3_probe() through this call flow:
>> dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
>> dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
>
> Correct.
>
>>> [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
>>
>> Can you please point me to which code patch actually caused the probe
>> deferral?
>
> Sure, it's in drd.c.
>
> if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> edev = extcon_get_extcon_dev(name);
> if (!edev)
> return ERR_PTR(-EPROBE_DEFER);
> return edev;
> }
>
>>> ...but extcon driver is still missing...
>>>
>>> [ 22.283174] platform dwc3.0.auto: Added to deferred list
>>> [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
>>
>> I'm not fully aware of all the USB implications, but if extcon is
>> needed, why can't that check be done before we add and probe the ulpi
>> device? That'll avoid this whole "fake" probing and avoid the counter
>> increase. And avoid the need for this patch that's touching the code
>> code that's already a bit delicate.
>
>> Also, with my limited experience with all the possible drivers in the
>> kernel, it's weird that the ulpi device is added and probed before we
>> make sure the parent device (dwc3.0.auto) can actually probe
>> successfully.
>
> As I said above the deferred probe trigger has flaw on its own.
> Even if we fix for USB case, there is (and probably will be) others.

Right here is the driver design bug. A driver's probe() hook should
*not* return -EPROBE_DEFER after already creating child devices which
may have already been probed.

It can be solved by refactoring the driver probe routine. If a resource
is required to be present, then check that it is available early; before
registering child devices.

The proposed solution to modify driver core is fragile and susceptible
to side effects from other probe paths. I don't think it is the right
approach.

g.

2020-03-26 15:25:11

by Grant Likely

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On 26/03/2020 15:01, Grant Likely wrote:
>
>
> On 25/03/2020 12:51, Andy Shevchenko wrote:
>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko
>>> <[email protected]> wrote:
[...]
>>>> ...but extcon driver is still missing...
>>>>
>>>> [   22.283174] platform dwc3.0.auto: Added to deferred list
>>>> [   22.288513] platform dwc3.0.auto:
>>>> driver_deferred_probe_add_trigger local counter: 1 new counter 2
>>>
>>> I'm not fully aware of all the USB implications, but if extcon is
>>> needed, why can't that check be done before we add and probe the ulpi
>>> device? That'll avoid this whole "fake" probing and avoid the counter
>>> increase. And avoid the need for this patch that's touching the code
>>> code that's already a bit delicate.
>>
>>> Also, with my limited experience with all the possible drivers in the
>>> kernel, it's weird that the ulpi device is added and probed before we
>>> make sure the parent device (dwc3.0.auto) can actually probe
>>> successfully.
>>
>> As I said above the deferred probe trigger has flaw on its own.
>> Even if we fix for USB case, there is (and probably will be) others.
>
> Right here is the driver design bug. A driver's probe() hook should
> *not* return -EPROBE_DEFER after already creating child devices which
> may have already been probed.
>
> It can be solved by refactoring the driver probe routine. If a resource
> is required to be present, then check that it is available early; before
> registering child devices.

If it is difficult to determine whether extcon is available before
creating the child devices, then there is a way to solve it that still
leverages the driver core. You could refactor dwc3_core_init_mode() into
a set of separate probe() routines, one for each mode. dwc3_probe()
could register the matching child device just before it exits. Then the
driver core will take care of calling it again at a later point in time
without tearing down the ulpi device probe.

Cheers,
g.

2020-03-26 16:33:11

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> On 25/03/2020 12:51, Andy Shevchenko wrote:
> > On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > > On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > > > Consider the following scenario.
> > > >
> > > > The main driver of USB OTG controller (dwc3-pci), which has the following
> > > > functional dependencies on certain platform:
> > > > - ULPI (tusb1210)
> > > > - extcon (tested with extcon-intel-mrfld)
> > > >
> > > > Note, that first driver, tusb1210, is available at the moment of
> > > > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > > > won't appear till user space does something about it.
> > > >
> > > > This is depicted by kernel configuration excerpt:
> > > >
> > > > CONFIG_PHY_TUSB1210=y
> > > > CONFIG_USB_DWC3=y
> > > > CONFIG_USB_DWC3_ULPI=y
> > > > CONFIG_USB_DWC3_DUAL_ROLE=y
> > > > CONFIG_USB_DWC3_PCI=y
> > > > CONFIG_EXTCON_INTEL_MRFLD=m
> > > >
> > > > In the Buildroot environment the modules are probed by alphabetical ordering
> > > > of their modaliases. The latter comes to the case when USB OTG driver will be
> > > > probed first followed by extcon one.
> > > >
> > > > So, if the platform anticipates extcon device to be appeared, in the above case
> > > > we will get deferred probe of USB OTG, because of ordering.
> > > >
> > > > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > > > deferral race condition fix") counts the amount of triggered deferred probe,
> > > > we never advance the situation -- the change makes it to be an infinite loop.
> > >
> > > Hi Andy,
> > >
> > > I'm trying to understand this sequence of steps. Sorry if the questions
> > > are stupid -- I'm not very familiar with USB/PCI stuff.
> >
> > Thank you for looking into this. My answer below.
> >
> > As a first thing I would like to tell that there is another example of bad
> > behaviour of deferred probe with no relation to USB. The proposed change also
> > fixes that one (however, less possible to find in real life).
> >
> > > > ---8<---8<---
> > > >
> > > > [ 22.187127] driver_deferred_probe_trigger <<< 1
> > > >
> > > > ...here is the late initcall triggers deferred probe...
> > > >
> > > > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > > >
> > > > ...dwc3.0.auto is the only device in the deferred list...
> > >
> > > Ok, dwc3.0.auto is the only unprobed device at this point?
> >
> > Correct.
> >
> > > > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > > >
> > > > ...the counter before mutex is unlocked is kept the same...
> > > >
> > > > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > > >
> > > > ...mutes has been unlocked, we try to re-probe the driver...
> > > >
> > > > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > > > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > > > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > > > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > > > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > > > [ 22.263723] driver_deferred_probe_trigger <<< 2
> > > >
> > > > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > > >
> > > > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> > >
> > > So where did this dwc3.0.auto.ulpi come from?
> >
> > > Looks like the device is created by dwc3_probe() through this call flow:
> > > dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > > dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> >
> > Correct.
> >
> > > > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> > >
> > > Can you please point me to which code patch actually caused the probe
> > > deferral?
> >
> > Sure, it's in drd.c.
> >
> > if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> > edev = extcon_get_extcon_dev(name);
> > if (!edev)
> > return ERR_PTR(-EPROBE_DEFER);
> > return edev;
> > }
> >
> > > > ...but extcon driver is still missing...
> > > >
> > > > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > > > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> > >
> > > I'm not fully aware of all the USB implications, but if extcon is
> > > needed, why can't that check be done before we add and probe the ulpi
> > > device? That'll avoid this whole "fake" probing and avoid the counter
> > > increase. And avoid the need for this patch that's touching the code
> > > code that's already a bit delicate.
> >
> > > Also, with my limited experience with all the possible drivers in the
> > > kernel, it's weird that the ulpi device is added and probed before we
> > > make sure the parent device (dwc3.0.auto) can actually probe
> > > successfully.
> >
> > As I said above the deferred probe trigger has flaw on its own.
> > Even if we fix for USB case, there is (and probably will be) others.
>
> Right here is the driver design bug. A driver's probe() hook should *not*
> return -EPROBE_DEFER after already creating child devices which may have
> already been probed.

Any documentation statement for this requirement?

By the way, I may imagine other mechanisms that probe the driver on other CPU
at the same time (let's consider parallel modprobes). The current code has a
flaw with that.

> It can be solved by refactoring the driver probe routine. If a resource is
> required to be present, then check that it is available early; before
> registering child devices.

We fix one and leave others.

> The proposed solution to modify driver core is fragile and susceptible to
> side effects from other probe paths. I don't think it is the right approach.

Have you tested it on your case? Does it fix the issue?

--
With Best Regards,
Andy Shevchenko


2020-03-26 16:41:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Thu, Mar 26, 2020 at 06:31:10PM +0200, Andy Shevchenko wrote:
> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> > On 25/03/2020 12:51, Andy Shevchenko wrote:
> > > On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > > > On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > > > > Consider the following scenario.
> > > > >
> > > > > The main driver of USB OTG controller (dwc3-pci), which has the following
> > > > > functional dependencies on certain platform:
> > > > > - ULPI (tusb1210)
> > > > > - extcon (tested with extcon-intel-mrfld)
> > > > >
> > > > > Note, that first driver, tusb1210, is available at the moment of
> > > > > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > > > > won't appear till user space does something about it.
> > > > >
> > > > > This is depicted by kernel configuration excerpt:
> > > > >
> > > > > CONFIG_PHY_TUSB1210=y
> > > > > CONFIG_USB_DWC3=y
> > > > > CONFIG_USB_DWC3_ULPI=y
> > > > > CONFIG_USB_DWC3_DUAL_ROLE=y
> > > > > CONFIG_USB_DWC3_PCI=y
> > > > > CONFIG_EXTCON_INTEL_MRFLD=m
> > > > >
> > > > > In the Buildroot environment the modules are probed by alphabetical ordering
> > > > > of their modaliases. The latter comes to the case when USB OTG driver will be
> > > > > probed first followed by extcon one.
> > > > >
> > > > > So, if the platform anticipates extcon device to be appeared, in the above case
> > > > > we will get deferred probe of USB OTG, because of ordering.
> > > > >
> > > > > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > > > > deferral race condition fix") counts the amount of triggered deferred probe,
> > > > > we never advance the situation -- the change makes it to be an infinite loop.
> > > >
> > > > Hi Andy,
> > > >
> > > > I'm trying to understand this sequence of steps. Sorry if the questions
> > > > are stupid -- I'm not very familiar with USB/PCI stuff.
> > >
> > > Thank you for looking into this. My answer below.
> > >
> > > As a first thing I would like to tell that there is another example of bad
> > > behaviour of deferred probe with no relation to USB. The proposed change also
> > > fixes that one (however, less possible to find in real life).
> > >
> > > > > ---8<---8<---
> > > > >
> > > > > [ 22.187127] driver_deferred_probe_trigger <<< 1
> > > > >
> > > > > ...here is the late initcall triggers deferred probe...
> > > > >
> > > > > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > > > >
> > > > > ...dwc3.0.auto is the only device in the deferred list...
> > > >
> > > > Ok, dwc3.0.auto is the only unprobed device at this point?
> > >
> > > Correct.
> > >
> > > > > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > > > >
> > > > > ...the counter before mutex is unlocked is kept the same...
> > > > >
> > > > > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > > > >
> > > > > ...mutes has been unlocked, we try to re-probe the driver...
> > > > >
> > > > > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > > > > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > > > > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > > > > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > > > > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > > > > [ 22.263723] driver_deferred_probe_trigger <<< 2
> > > > >
> > > > > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > > > >
> > > > > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> > > >
> > > > So where did this dwc3.0.auto.ulpi come from?
> > >
> > > > Looks like the device is created by dwc3_probe() through this call flow:
> > > > dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > > > dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> > >
> > > Correct.
> > >
> > > > > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> > > >
> > > > Can you please point me to which code patch actually caused the probe
> > > > deferral?
> > >
> > > Sure, it's in drd.c.
> > >
> > > if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> > > edev = extcon_get_extcon_dev(name);
> > > if (!edev)
> > > return ERR_PTR(-EPROBE_DEFER);
> > > return edev;
> > > }
> > >
> > > > > ...but extcon driver is still missing...
> > > > >
> > > > > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > > > > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> > > >
> > > > I'm not fully aware of all the USB implications, but if extcon is
> > > > needed, why can't that check be done before we add and probe the ulpi
> > > > device? That'll avoid this whole "fake" probing and avoid the counter
> > > > increase. And avoid the need for this patch that's touching the code
> > > > code that's already a bit delicate.
> > >
> > > > Also, with my limited experience with all the possible drivers in the
> > > > kernel, it's weird that the ulpi device is added and probed before we
> > > > make sure the parent device (dwc3.0.auto) can actually probe
> > > > successfully.
> > >
> > > As I said above the deferred probe trigger has flaw on its own.
> > > Even if we fix for USB case, there is (and probably will be) others.
> >
> > Right here is the driver design bug. A driver's probe() hook should *not*
> > return -EPROBE_DEFER after already creating child devices which may have
> > already been probed.
>
> Any documentation statement for this requirement?

There shouldn't be. If you return ANY error from a probe function, your
driver is essencially "dead" when it comes to that device, and it had
better have cleaned up after itself.

That includes defering probe, that's not "special" here at all.

> By the way, I may imagine other mechanisms that probe the driver on other CPU
> at the same time (let's consider parallel modprobes). The current code has a
> flaw with that.

That can't happen, the driver core prevents that.

thanks,

greg k-h

2020-03-26 18:07:59

by Grant Likely

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied



On 26/03/2020 16:39, Greg KH wrote:
> On Thu, Mar 26, 2020 at 06:31:10PM +0200, Andy Shevchenko wrote:
>> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
>>> On 25/03/2020 12:51, Andy Shevchenko wrote:
>>>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
>>>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
>>>>>> Consider the following scenario.
>>>>>>
>>>>>> The main driver of USB OTG controller (dwc3-pci), which has the following
>>>>>> functional dependencies on certain platform:
>>>>>> - ULPI (tusb1210)
>>>>>> - extcon (tested with extcon-intel-mrfld)
>>>>>>
>>>>>> Note, that first driver, tusb1210, is available at the moment of
>>>>>> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
>>>>>> won't appear till user space does something about it.
>>>>>>
>>>>>> This is depicted by kernel configuration excerpt:
>>>>>>
>>>>>> CONFIG_PHY_TUSB1210=y
>>>>>> CONFIG_USB_DWC3=y
>>>>>> CONFIG_USB_DWC3_ULPI=y
>>>>>> CONFIG_USB_DWC3_DUAL_ROLE=y
>>>>>> CONFIG_USB_DWC3_PCI=y
>>>>>> CONFIG_EXTCON_INTEL_MRFLD=m
>>>>>>
>>>>>> In the Buildroot environment the modules are probed by alphabetical ordering
>>>>>> of their modaliases. The latter comes to the case when USB OTG driver will be
>>>>>> probed first followed by extcon one.
>>>>>>
>>>>>> So, if the platform anticipates extcon device to be appeared, in the above case
>>>>>> we will get deferred probe of USB OTG, because of ordering.
>>>>>>
>>>>>> Since current implementation, done by the commit 58b116bce136 ("drivercore:
>>>>>> deferral race condition fix") counts the amount of triggered deferred probe,
>>>>>> we never advance the situation -- the change makes it to be an infinite loop.
>>>>>
>>>>> Hi Andy,
>>>>>
>>>>> I'm trying to understand this sequence of steps. Sorry if the questions
>>>>> are stupid -- I'm not very familiar with USB/PCI stuff.
>>>>
>>>> Thank you for looking into this. My answer below.
>>>>
>>>> As a first thing I would like to tell that there is another example of bad
>>>> behaviour of deferred probe with no relation to USB. The proposed change also
>>>> fixes that one (however, less possible to find in real life).
>>>>
>>>>>> ---8<---8<---
>>>>>>
>>>>>> [ 22.187127] driver_deferred_probe_trigger <<< 1
>>>>>>
>>>>>> ...here is the late initcall triggers deferred probe...
>>>>>>
>>>>>> [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
>>>>>>
>>>>>> ...dwc3.0.auto is the only device in the deferred list...
>>>>>
>>>>> Ok, dwc3.0.auto is the only unprobed device at this point?
>>>>
>>>> Correct.
>>>>
>>>>>> [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
>>>>>>
>>>>>> ...the counter before mutex is unlocked is kept the same...
>>>>>>
>>>>>> [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
>>>>>>
>>>>>> ...mutes has been unlocked, we try to re-probe the driver...
>>>>>>
>>>>>> [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
>>>>>> [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
>>>>>> [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
>>>>>> [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
>>>>>> [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
>>>>>> [ 22.263723] driver_deferred_probe_trigger <<< 2
>>>>>>
>>>>>> ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
>>>>>>
>>>>>> [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
>>>>>
>>>>> So where did this dwc3.0.auto.ulpi come from?
>>>>
>>>>> Looks like the device is created by dwc3_probe() through this call flow:
>>>>> dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
>>>>> dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
>>>>
>>>> Correct.
>>>>
>>>>>> [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
>>>>>
>>>>> Can you please point me to which code patch actually caused the probe
>>>>> deferral?
>>>>
>>>> Sure, it's in drd.c.
>>>>
>>>> if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
>>>> edev = extcon_get_extcon_dev(name);
>>>> if (!edev)
>>>> return ERR_PTR(-EPROBE_DEFER);
>>>> return edev;
>>>> }
>>>>
>>>>>> ...but extcon driver is still missing...
>>>>>>
>>>>>> [ 22.283174] platform dwc3.0.auto: Added to deferred list
>>>>>> [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
>>>>>
>>>>> I'm not fully aware of all the USB implications, but if extcon is
>>>>> needed, why can't that check be done before we add and probe the ulpi
>>>>> device? That'll avoid this whole "fake" probing and avoid the counter
>>>>> increase. And avoid the need for this patch that's touching the code
>>>>> code that's already a bit delicate.
>>>>
>>>>> Also, with my limited experience with all the possible drivers in the
>>>>> kernel, it's weird that the ulpi device is added and probed before we
>>>>> make sure the parent device (dwc3.0.auto) can actually probe
>>>>> successfully.
>>>>
>>>> As I said above the deferred probe trigger has flaw on its own.
>>>> Even if we fix for USB case, there is (and probably will be) others.
>>>
>>> Right here is the driver design bug. A driver's probe() hook should *not*
>>> return -EPROBE_DEFER after already creating child devices which may have
>>> already been probed.
>>
>> Any documentation statement for this requirement?
>
> There shouldn't be. If you return ANY error from a probe function, your
> driver is essencially "dead" when it comes to that device, and it had
> better have cleaned up after itself. >
> That includes defering probe, that's not "special" here at all.

What is special in this case is that if a .probe() hook had registered a
child device, then removed that child device (so it did clean up after
itself) and then return -EPROBE_DEFER, then we end up in an endless
probe loop.

But this is unusual behaviour. Normally a .probe() hook checks all
required resources are available before registering any child devices.
This driver doesn't do that. Arguably this is indeed an additional
requirement beyond "clean up after yourself". I cannot find anyplace
where it is documented. In fact, I cannot find any documentation on
EPROBE_DEFER in the Documentation/ tree. How about the below?

>> By the way, I may imagine other mechanisms that probe the driver on other CPU
>> at the same time (let's consider parallel modprobes). The current code has a
>> flaw with that.
>
> That can't happen, the driver core prevents that.

Greg's right, that can't happen. At worst a driver will get an
additional defer event; but it all still works.

g.

---
diff --git a/Documentation/driver-api/driver-model/driver.rst
b/Documentation/driver-api/driver-model/driver.rst
index baa6a85c8287..46adede13aba 100644
--- a/Documentation/driver-api/driver-model/driver.rst
+++ b/Documentation/driver-api/driver-model/driver.rst
@@ -167,7 +167,17 @@ the driver to that device.

A driver's probe() may return a negative errno value to indicate that
the driver did not bind to this device, in which case it should have
-released all resources it allocated::
+released all resources it allocated. Optionally, probe() may return
+-EPROBE_DEFER if the driver depends on resources that are not yet
+available (e.g., supplied by a driver that hasn't initialized yet).
+The driver core will put the device onto the deferred probe list and
+will try to call it again later. Important: -EPROBE_DEFER must not be
+returned if probe() has already created child devices, even if those
+child devices have were removed again in a cleanup path. If -EPROBE_DEFER
+is returned after a child device has been registered, it may result in an
+infinite loop of .probe() calls to the same driver.
+
+::

void (*sync_state)(struct device *dev);

2020-03-26 18:47:20

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Thu, Mar 26, 2020 at 1:39 AM Rafael J. Wysocki <[email protected]> wrote:
>
> On Wed, Mar 25, 2020 at 11:09 PM Saravana Kannan <[email protected]> wrote:
> >
> > On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
> > <[email protected]> wrote:
> > >
>
> [cut]
>
> > >
> > > Yes, it's (unlikely) possible (*), but it will give one more iteration per such
> > > case. It's definitely better than infinite loop. Do you agree?
> >
> > Sorry I wasn't being clear (I was in a rush). I'm saying this patch
> > can reintroduce the bug where the deferred probe isn't triggered when
> > it should be.
> >
> > Let's take a simple execution flow.
> >
> > probe_okay is at 10.
> >
> > Thread-A
> > really_probe(Device-A)
> > local_probe_okay_count = 10
> > Device-A probe function is running...
> >
> > Thread-B
> > really_probe(Device-B)
> > Device-B probes successfully.
> > probe_okay incremented to 11
> >
> > Thread-C
> > Device-C (which had bound earlier) is unbound (say module is
> > unloaded or a million other reasons).
> > probe_okay is decremented to 10.
> >
> > Thread-A continues
> > Device-A probe function returns -EPROBE_DEFER
> > driver_deferred_probe_add_trigger() doesn't do anything because
> > local_probe_okay_count == probe_okay
> > But Device-A might have deferred probe waiting on Device-B.
> > Device-A never probes.
> >
> > > *) It means during probe you have _intensive_ removing, of course you may keep
> > > kernel busy with iterations, but it has no practical sense. DoS attacks more
> > > effective in different ways.
> >
> > I wasn't worried about DoS attacks. More of a functional correctness
> > issue what I explained above.
>
> The code is functionally incorrect as is already AFAICS.
>
> > Anyway, if your issue and similar issues can be handles in driver core
> > in a clean way without breaking other cases, I don't have any problem
> > with that. Just that, I think the current solution breaks other cases.
>
> OK, so the situation right now is that commit 58b116bce136 has
> introduced a regression and so it needs to be fixed or reverted. The
> cases that were previously broken and were unbroken by that commit
> don't matter here, so you cannot argue that they would be "broken".
>
> It looks to me like the original issue fixed by the commit in question
> needs to be addressed differently, so I would vote for reverting it
> and starting over.

I'm fine with whatever approach. My only point is that code that's
been there for 5+ years might be preventing that race in a multitude
of platforms. So I'm just reviewing to make sure fixes aren't
introducing regressions. I'm all for anyone cleaning up/redoing
deferred probe.

> > As an alternate solution, assuming "linux,extcon-name" is coming
> > from some firmware, you might want to look into the fw_devlink
> > feature.
>
> That would be a workaround for a driver core issue, though, wouldn't it?

I'm not saying don't fix it in the driver core if it can be done
without adding regressions.

> > That feature allows driver core to add device links from firmware
> > information. If you can get that feature to create device links from
> > your dwc3.0.auto (or its parent pci_dev?) to the extcon supplier
> > device, all of this can be sidestepped and your dwc3.0.auto's (or the
> > dwc pci_dev's) probe will be triggered only after extcon is probed.
> >
> > I have very little familiarity with PCI/ACPI. I spent about an hour or
> > two poking at ACPI scan/property code. The relationship between a
> > pci_dev and an acpi_device is a bit confusing to me because I see:
> >
> > static int dwc3_pci_probe(struct pci_dev *pci, const struct pci_device_id *id)
> > {
> > struct property_entry *p = (struct property_entry *)id->driver_data;
> > struct dwc3_pci *dwc;
> > struct resource res[2];
> > int ret;
> > struct device *dev = &pci->dev;
> > ....
> > dwc->dwc3 = platform_device_alloc("dwc3", PLATFORM_DEVID_AUTO);
> > ....
> > ACPI_COMPANION_SET(&dwc->dwc3->dev, ACPI_COMPANION(dev));
> >
> > And ACPI_COMPANION returns an acpi_device by looking at dev->fwnode.
> > So how the heck is a pci_device.dev.fwnode pointing to an
> > acpi_device.fwnode?
>
> acpi_device is an of_node counterpart (or it is an fwnode itself if you will).

If I understand correctly, you are saying it's similar to struct
device_node for OF -- as in, a data struct that stores the unpacked
ACPI firmware data. That helps me understand what is going on with
ACPI_COMPANION_SET() in the PCI driver.

But then, why does it have a "struct device dev" field embedded in it?
Does the acpi_device.dev ever get registered with driver core?

Thanks,
Saravana

2020-03-26 19:56:50

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Thu, Mar 26, 2020 at 4:54 AM Andy Shevchenko
<[email protected]> wrote:
>
> On Wed, Mar 25, 2020 at 03:08:29PM -0700, Saravana Kannan wrote:
> > On Wed, Mar 25, 2020 at 5:51 AM Andy Shevchenko
> > <[email protected]> wrote:
> > > On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > > > On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > > > > Consider the following scenario.
> > > > >
> > > > > The main driver of USB OTG controller (dwc3-pci), which has the following
> > > > > functional dependencies on certain platform:
> > > > > - ULPI (tusb1210)
> > > > > - extcon (tested with extcon-intel-mrfld)
> > > > >
> > > > > Note, that first driver, tusb1210, is available at the moment of
> > > > > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > > > > won't appear till user space does something about it.
> > > > >
> > > > > This is depicted by kernel configuration excerpt:
> > > > >
> > > > > CONFIG_PHY_TUSB1210=y
> > > > > CONFIG_USB_DWC3=y
> > > > > CONFIG_USB_DWC3_ULPI=y
> > > > > CONFIG_USB_DWC3_DUAL_ROLE=y
> > > > > CONFIG_USB_DWC3_PCI=y
> > > > > CONFIG_EXTCON_INTEL_MRFLD=m
> > > > >
> > > > > In the Buildroot environment the modules are probed by alphabetical ordering
> > > > > of their modaliases. The latter comes to the case when USB OTG driver will be
> > > > > probed first followed by extcon one.
> > > > >
> > > > > So, if the platform anticipates extcon device to be appeared, in the above case
> > > > > we will get deferred probe of USB OTG, because of ordering.
> > > > >
> > > > > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > > > > deferral race condition fix") counts the amount of triggered deferred probe,
> > > > > we never advance the situation -- the change makes it to be an infinite loop.
> > > >
> > > > Hi Andy,
> > > >
> > > > I'm trying to understand this sequence of steps. Sorry if the questions
> > > > are stupid -- I'm not very familiar with USB/PCI stuff.
> > >
> > > Thank you for looking into this. My answer below.
> > >
> > > As a first thing I would like to tell that there is another example of bad
> > > behaviour of deferred probe with no relation to USB. The proposed change also
> > > fixes that one (however, less possible to find in real life).
> >
> > Unless I see what the other issue is, I can't speak for the unknown.
>
> Okay, let's talk about other case (actually it's the one which I had noticed
> approximately at the time when culprit patch made the kernel).
>
> For some debugging purposes I have been using pin control table in board code.
>
> Since I would like to boot kernel on different systems I have some tables for
> non-existing pin control device. Pin control framework returns -EPROBE_DEFER
> when trying to probe device with attached table for wrong pin control. This is
> fine, the problem is that *any* successfully probed device, which happens in
> the deferred probe initcall will desynchronize existing counter. As a result ->
> infinite loop. For the record, I didn't realize and didn't investigate that
> time the issue and now I can confirm that this is a culprit which is fixed by
> this patch.
>
> > > > > ---8<---8<---
> > > > >
> > > > > [ 22.187127] driver_deferred_probe_trigger <<< 1
> > > > >
> > > > > ...here is the late initcall triggers deferred probe...
> > > > >
> > > > > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > > > >
> > > > > ...dwc3.0.auto is the only device in the deferred list...
> > > >
> > > > Ok, dwc3.0.auto is the only unprobed device at this point?
> > >
> > > Correct.
> > >
> > > > > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > > > >
> > > > > ...the counter before mutex is unlocked is kept the same...
> > > > >
> > > > > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > > > >
> > > > > ...mutes has been unlocked, we try to re-probe the driver...
> > > > >
> > > > > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > > > > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > > > > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > > > > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > > > > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > > > > [ 22.263723] driver_deferred_probe_trigger <<< 2
> > > > >
> > > > > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > > > >
> > > > > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> > > >
> > > > So where did this dwc3.0.auto.ulpi come from?
> > >
> > > > Looks like the device is created by dwc3_probe() through this call flow:
> > > > dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > > > dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> > >
> > > Correct.
> > >
> > > > > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> > > >
> > > > Can you please point me to which code patch actually caused the probe
> > > > deferral?
> > >
> > > Sure, it's in drd.c.
> > >
> > > if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> > > edev = extcon_get_extcon_dev(name);
> > > if (!edev)
> > > return ERR_PTR(-EPROBE_DEFER);
> > > return edev;
> > > }
> >
> > Thanks for the confirmations and pointers. I assume
> > "linux,extcon-name" is a property that's obtained from ACPI? Because I
> > couldn't find a relevant reference to it elsewhere in the kernel.
>
> Yes.
>
> > > > > ...but extcon driver is still missing...
> > > > >
> > > > > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > > > > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> > > >
> > > > I'm not fully aware of all the USB implications, but if extcon is
> > > > needed, why can't that check be done before we add and probe the ulpi
> > > > device? That'll avoid this whole "fake" probing and avoid the counter
> > > > increase. And avoid the need for this patch that's touching the code
> > > > code that's already a bit delicate.
> > >
> > > > Also, with my limited experience with all the possible drivers in the
> > > > kernel, it's weird that the ulpi device is added and probed before we
> > > > make sure the parent device (dwc3.0.auto) can actually probe
> > > > successfully.
> > >
> > > As I said above the deferred probe trigger has flaw on its own.
> >
> > Definitely agree. I'm not saying deferred probe is perfect.
> >
> > > Even if we fix for USB case, there is (and probably will be) others.
> > >
> > > > Most of the platform device code I've seen in systems with OF (device
> > > > tree) add the child devices towards the end of the parent's probe
> > > > function.
>
> I realized also that your fix won't work if we change extcon to be compiled in
> and ULPI to be a module. So, any driver with two or more strict dependencies
> one of which is satisfied and one is not will end up in this infinite loop.
>
> > > > > ...and since we had a successful probe, we got counter mismatch...
> > > > >
> > > > > [ 22.297490] driver_deferred_probe_trigger <<< 3
> > > > > [ 22.302074] platform dwc3.0.auto: deferred_probe_work_func 2 <<< counter 3
> > > > >
> > > > > ...at the end we have a new counter and loop repeats again, see 22.198727...
> > > > >
> > > > > ---8<---8<---
> > > > >
> > > > > Revert of the commit helps, but it is probably not helpful for the initially
> > > > > found regression. Artem Bityutskiy suggested to use counter of the successful
> > > > > probes instead. This fixes above mentioned case and shouldn't prevent driver
> > > > > to reprobe deferred ones.
> > > > >
> > > > > Under "successful probe" we understand the state when a driver of the certain
> > > > > device is being kept bound after deferred probe trigger cycle. For instance,
> > > > > in the above mentioned case probing of tusb1210 is not successful because dwc3
> > > > > driver unbinds device dwc3.0.auto.ulpi. The atomic_dec() call is used to keep
> > > > > track of this. The amount of bindings is always great than or equal to the
> > > > > amount of unbindings as guaranteed by design of the driver binding mechanism.
> > > >
> > > > The unbindings count can increase for other unrelated drivers unbinding
> > > > too. Wouldn't it? Seems a bit fragile and racy in a fashion similar to
> > > > the issue the original patch was trying to fix.
> > >
> > > Yes, it's (unlikely) possible (*), but it will give one more iteration per such
> > > case. It's definitely better than infinite loop. Do you agree?
> >
> > Sorry I wasn't being clear (I was in a rush). I'm saying this patch
> > can reintroduce the bug where the deferred probe isn't triggered when
> > it should be.
>
> I don't think so. If I'm not mistaken we still have one more cycle to trigger probe.
>
> > Let's take a simple execution flow.
> >
> > probe_okay is at 10.
> >
> > Thread-A
> > really_probe(Device-A)
> > local_probe_okay_count = 10
> > Device-A probe function is running...
> >
> > Thread-B
> > really_probe(Device-B)
> > Device-B probes successfully.
> > probe_okay incremented to 11
>
> And probe trigger task is called. It goes to the loop because counters are not the same.

Good point. But Device-A is still not in any of the deferred probed
lists at this point AFAICU. So even the retriggered iteration can
complete before Device-A is back on the deferred probe list?

>
> > Thread-C
> > Device-C (which had bound earlier) is unbound (say module is
> > unloaded or a million other reasons).
> > probe_okay is decremented to 10.
>
>
> > Thread-A continues
> > Device-A probe function returns -EPROBE_DEFER
> > driver_deferred_probe_add_trigger() doesn't do anything because
> > local_probe_okay_count == probe_okay
> > But Device-A might have deferred probe waiting on Device-B.
> > Device-A never probes.
>
> See above.

I think the key to fixing the original issue that commit 58b116bce136
fixes is to make sure the thread that has a device getting deferred is
the one that re-triggers the deferred probe if it sees another device
having probed successfully.

> > > *) It means during probe you have _intensive_ removing, of course you may keep
> > > kernel busy with iterations, but it has no practical sense. DoS attacks more
> > > effective in different ways.
> >
> > I wasn't worried about DoS attacks. More of a functional correctness
> > issue what I explained above.
> >
> > Anyway, if your issue and similar issues can be handles in driver core
> > in a clean way without breaking other cases, I don't have any problem
> > with that. Just that, I think the current solution breaks other cases.
>
> > As an alternate solution, assuming "linux,extcon-name" is coming
> > from some firmware, you might want to look into the fw_devlink
> > feature.
>
> Let's forget about USB. Don't be fixed on it. It one of the particular case out
> of others. I'm not going to comment USB particularities, sorry. It seems to me
> as not related.

Ok, but just FYI, fw_devlink helps a lot with avoiding deferred probe
issues. Has helped a ton with Android on ARM.

-Saravana

2020-03-27 08:04:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Thu, Mar 26, 2020 at 06:06:37PM +0000, Grant Likely wrote:
>
>
> On 26/03/2020 16:39, Greg KH wrote:
> > On Thu, Mar 26, 2020 at 06:31:10PM +0200, Andy Shevchenko wrote:
> > > On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> > > > On 25/03/2020 12:51, Andy Shevchenko wrote:
> > > > > On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > > > > > On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > > > > > > Consider the following scenario.
> > > > > > >
> > > > > > > The main driver of USB OTG controller (dwc3-pci), which has the following
> > > > > > > functional dependencies on certain platform:
> > > > > > > - ULPI (tusb1210)
> > > > > > > - extcon (tested with extcon-intel-mrfld)
> > > > > > >
> > > > > > > Note, that first driver, tusb1210, is available at the moment of
> > > > > > > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > > > > > > won't appear till user space does something about it.
> > > > > > >
> > > > > > > This is depicted by kernel configuration excerpt:
> > > > > > >
> > > > > > > CONFIG_PHY_TUSB1210=y
> > > > > > > CONFIG_USB_DWC3=y
> > > > > > > CONFIG_USB_DWC3_ULPI=y
> > > > > > > CONFIG_USB_DWC3_DUAL_ROLE=y
> > > > > > > CONFIG_USB_DWC3_PCI=y
> > > > > > > CONFIG_EXTCON_INTEL_MRFLD=m
> > > > > > >
> > > > > > > In the Buildroot environment the modules are probed by alphabetical ordering
> > > > > > > of their modaliases. The latter comes to the case when USB OTG driver will be
> > > > > > > probed first followed by extcon one.
> > > > > > >
> > > > > > > So, if the platform anticipates extcon device to be appeared, in the above case
> > > > > > > we will get deferred probe of USB OTG, because of ordering.
> > > > > > >
> > > > > > > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > > > > > > deferral race condition fix") counts the amount of triggered deferred probe,
> > > > > > > we never advance the situation -- the change makes it to be an infinite loop.
> > > > > >
> > > > > > Hi Andy,
> > > > > >
> > > > > > I'm trying to understand this sequence of steps. Sorry if the questions
> > > > > > are stupid -- I'm not very familiar with USB/PCI stuff.
> > > > >
> > > > > Thank you for looking into this. My answer below.
> > > > >
> > > > > As a first thing I would like to tell that there is another example of bad
> > > > > behaviour of deferred probe with no relation to USB. The proposed change also
> > > > > fixes that one (however, less possible to find in real life).
> > > > >
> > > > > > > ---8<---8<---
> > > > > > >
> > > > > > > [ 22.187127] driver_deferred_probe_trigger <<< 1
> > > > > > >
> > > > > > > ...here is the late initcall triggers deferred probe...
> > > > > > >
> > > > > > > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > > > > > >
> > > > > > > ...dwc3.0.auto is the only device in the deferred list...
> > > > > >
> > > > > > Ok, dwc3.0.auto is the only unprobed device at this point?
> > > > >
> > > > > Correct.
> > > > >
> > > > > > > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > > > > > >
> > > > > > > ...the counter before mutex is unlocked is kept the same...
> > > > > > >
> > > > > > > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > > > > > >
> > > > > > > ...mutes has been unlocked, we try to re-probe the driver...
> > > > > > >
> > > > > > > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > > > > > > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > > > > > > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > > > > > > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > > > > > > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > > > > > > [ 22.263723] driver_deferred_probe_trigger <<< 2
> > > > > > >
> > > > > > > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > > > > > >
> > > > > > > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> > > > > >
> > > > > > So where did this dwc3.0.auto.ulpi come from?
> > > > >
> > > > > > Looks like the device is created by dwc3_probe() through this call flow:
> > > > > > dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > > > > > dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> > > > >
> > > > > Correct.
> > > > >
> > > > > > > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> > > > > >
> > > > > > Can you please point me to which code patch actually caused the probe
> > > > > > deferral?
> > > > >
> > > > > Sure, it's in drd.c.
> > > > >
> > > > > if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> > > > > edev = extcon_get_extcon_dev(name);
> > > > > if (!edev)
> > > > > return ERR_PTR(-EPROBE_DEFER);
> > > > > return edev;
> > > > > }
> > > > >
> > > > > > > ...but extcon driver is still missing...
> > > > > > >
> > > > > > > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > > > > > > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> > > > > >
> > > > > > I'm not fully aware of all the USB implications, but if extcon is
> > > > > > needed, why can't that check be done before we add and probe the ulpi
> > > > > > device? That'll avoid this whole "fake" probing and avoid the counter
> > > > > > increase. And avoid the need for this patch that's touching the code
> > > > > > code that's already a bit delicate.
> > > > >
> > > > > > Also, with my limited experience with all the possible drivers in the
> > > > > > kernel, it's weird that the ulpi device is added and probed before we
> > > > > > make sure the parent device (dwc3.0.auto) can actually probe
> > > > > > successfully.
> > > > >
> > > > > As I said above the deferred probe trigger has flaw on its own.
> > > > > Even if we fix for USB case, there is (and probably will be) others.
> > > >
> > > > Right here is the driver design bug. A driver's probe() hook should *not*
> > > > return -EPROBE_DEFER after already creating child devices which may have
> > > > already been probed.
> > >
> > > Any documentation statement for this requirement?
> >
> > There shouldn't be. If you return ANY error from a probe function, your
> > driver is essencially "dead" when it comes to that device, and it had
> > better have cleaned up after itself. >
> > That includes defering probe, that's not "special" here at all.
>
> What is special in this case is that if a .probe() hook had registered a
> child device, then removed that child device (so it did clean up after
> itself) and then return -EPROBE_DEFER, then we end up in an endless probe
> loop.

If all child devices really are cleaned up completly, why would this be
a problem? What is set internally in the driver core that would get
tripped up by this?

> But this is unusual behaviour. Normally a .probe() hook checks all required
> resources are available before registering any child devices. This driver
> doesn't do that. Arguably this is indeed an additional requirement beyond
> "clean up after yourself". I cannot find anyplace where it is documented. In
> fact, I cannot find any documentation on EPROBE_DEFER in the Documentation/
> tree. How about the below?
>
> > > By the way, I may imagine other mechanisms that probe the driver on other CPU
> > > at the same time (let's consider parallel modprobes). The current code has a
> > > flaw with that.
> >
> > That can't happen, the driver core prevents that.
>
> Greg's right, that can't happen. At worst a driver will get an additional
> defer event; but it all still works.
>
> g.
>
> ---
> diff --git a/Documentation/driver-api/driver-model/driver.rst
> b/Documentation/driver-api/driver-model/driver.rst
> index baa6a85c8287..46adede13aba 100644
> --- a/Documentation/driver-api/driver-model/driver.rst
> +++ b/Documentation/driver-api/driver-model/driver.rst
> @@ -167,7 +167,17 @@ the driver to that device.
>
> A driver's probe() may return a negative errno value to indicate that
> the driver did not bind to this device, in which case it should have
> -released all resources it allocated::
> +released all resources it allocated. Optionally, probe() may return
> +-EPROBE_DEFER if the driver depends on resources that are not yet
> +available (e.g., supplied by a driver that hasn't initialized yet).
> +The driver core will put the device onto the deferred probe list and
> +will try to call it again later. Important: -EPROBE_DEFER must not be
> +returned if probe() has already created child devices, even if those
> +child devices have were removed again in a cleanup path. If -EPROBE_DEFER
> +is returned after a child device has been registered, it may result in an
> +infinite loop of .probe() calls to the same driver.

Ok, this is a bug, if that is the case, in the driver core as it should
not matter how many devices were added/removed/whatever while a driver
is in it's probe function.

But, I don't see how this patch solves that problem, another probe call
should never be made for the same bus while in this probe function. If
we do:
device1->probe()
device1 creates device2 and registers it
device2->probe is called
device2->probe returns 0
device1 has problems, unregisters device2
device2->remove is called
device1 deletes device2
device1 returns -EPROBE_DEFER

So then where's the problem? Did device2 somehow not really get
properly cleaned up?

confused,

greg k-h

2020-03-27 12:38:55

by Grant Likely

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied



On 27/03/2020 08:03, Greg KH wrote:
> On Thu, Mar 26, 2020 at 06:06:37PM +0000, Grant Likely wrote:
>>
>>
>> On 26/03/2020 16:39, Greg KH wrote:
>>> On Thu, Mar 26, 2020 at 06:31:10PM +0200, Andy Shevchenko wrote:
>>>> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
>>>>> On 25/03/2020 12:51, Andy Shevchenko wrote:
>>>>>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
>>>>>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
>>>>>>>> Consider the following scenario.
>>>>>>>>
>>>>>>>> The main driver of USB OTG controller (dwc3-pci), which has the following
>>>>>>>> functional dependencies on certain platform:
>>>>>>>> - ULPI (tusb1210)
>>>>>>>> - extcon (tested with extcon-intel-mrfld)
>>>>>>>>
>>>>>>>> Note, that first driver, tusb1210, is available at the moment of
>>>>>>>> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
>>>>>>>> won't appear till user space does something about it.
>>>>>>>>
>>>>>>>> This is depicted by kernel configuration excerpt:
>>>>>>>>
>>>>>>>> CONFIG_PHY_TUSB1210=y
>>>>>>>> CONFIG_USB_DWC3=y
>>>>>>>> CONFIG_USB_DWC3_ULPI=y
>>>>>>>> CONFIG_USB_DWC3_DUAL_ROLE=y
>>>>>>>> CONFIG_USB_DWC3_PCI=y
>>>>>>>> CONFIG_EXTCON_INTEL_MRFLD=m
>>>>>>>>
>>>>>>>> In the Buildroot environment the modules are probed by alphabetical ordering
>>>>>>>> of their modaliases. The latter comes to the case when USB OTG driver will be
>>>>>>>> probed first followed by extcon one.
>>>>>>>>
>>>>>>>> So, if the platform anticipates extcon device to be appeared, in the above case
>>>>>>>> we will get deferred probe of USB OTG, because of ordering.
>>>>>>>>
>>>>>>>> Since current implementation, done by the commit 58b116bce136 ("drivercore:
>>>>>>>> deferral race condition fix") counts the amount of triggered deferred probe,
>>>>>>>> we never advance the situation -- the change makes it to be an infinite loop.
>>>>>>>
>>>>>>> Hi Andy,
>>>>>>>
>>>>>>> I'm trying to understand this sequence of steps. Sorry if the questions
>>>>>>> are stupid -- I'm not very familiar with USB/PCI stuff.
>>>>>>
>>>>>> Thank you for looking into this. My answer below.
>>>>>>
>>>>>> As a first thing I would like to tell that there is another example of bad
>>>>>> behaviour of deferred probe with no relation to USB. The proposed change also
>>>>>> fixes that one (however, less possible to find in real life).
>>>>>>
>>>>>>>> ---8<---8<---
>>>>>>>>
>>>>>>>> [ 22.187127] driver_deferred_probe_trigger <<< 1
>>>>>>>>
>>>>>>>> ...here is the late initcall triggers deferred probe...
>>>>>>>>
>>>>>>>> [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
>>>>>>>>
>>>>>>>> ...dwc3.0.auto is the only device in the deferred list...
>>>>>>>
>>>>>>> Ok, dwc3.0.auto is the only unprobed device at this point?
>>>>>>
>>>>>> Correct.
>>>>>>
>>>>>>>> [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
>>>>>>>>
>>>>>>>> ...the counter before mutex is unlocked is kept the same...
>>>>>>>>
>>>>>>>> [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
>>>>>>>>
>>>>>>>> ...mutes has been unlocked, we try to re-probe the driver...
>>>>>>>>
>>>>>>>> [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
>>>>>>>> [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
>>>>>>>> [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
>>>>>>>> [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
>>>>>>>> [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
>>>>>>>> [ 22.263723] driver_deferred_probe_trigger <<< 2
>>>>>>>>
>>>>>>>> ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
>>>>>>>>
>>>>>>>> [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
>>>>>>>
>>>>>>> So where did this dwc3.0.auto.ulpi come from?
>>>>>>
>>>>>>> Looks like the device is created by dwc3_probe() through this call flow:
>>>>>>> dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
>>>>>>> dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
>>>>>>
>>>>>> Correct.
>>>>>>
>>>>>>>> [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
>>>>>>>
>>>>>>> Can you please point me to which code patch actually caused the probe
>>>>>>> deferral?
>>>>>>
>>>>>> Sure, it's in drd.c.
>>>>>>
>>>>>> if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
>>>>>> edev = extcon_get_extcon_dev(name);
>>>>>> if (!edev)
>>>>>> return ERR_PTR(-EPROBE_DEFER);
>>>>>> return edev;
>>>>>> }
>>>>>>
>>>>>>>> ...but extcon driver is still missing...
>>>>>>>>
>>>>>>>> [ 22.283174] platform dwc3.0.auto: Added to deferred list
>>>>>>>> [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
>>>>>>>
>>>>>>> I'm not fully aware of all the USB implications, but if extcon is
>>>>>>> needed, why can't that check be done before we add and probe the ulpi
>>>>>>> device? That'll avoid this whole "fake" probing and avoid the counter
>>>>>>> increase. And avoid the need for this patch that's touching the code
>>>>>>> code that's already a bit delicate.
>>>>>>
>>>>>>> Also, with my limited experience with all the possible drivers in the
>>>>>>> kernel, it's weird that the ulpi device is added and probed before we
>>>>>>> make sure the parent device (dwc3.0.auto) can actually probe
>>>>>>> successfully.
>>>>>>
>>>>>> As I said above the deferred probe trigger has flaw on its own.
>>>>>> Even if we fix for USB case, there is (and probably will be) others.
>>>>>
>>>>> Right here is the driver design bug. A driver's probe() hook should *not*
>>>>> return -EPROBE_DEFER after already creating child devices which may have
>>>>> already been probed.
>>>>
>>>> Any documentation statement for this requirement?
>>>
>>> There shouldn't be. If you return ANY error from a probe function, your
>>> driver is essencially "dead" when it comes to that device, and it had
>>> better have cleaned up after itself. >
>>> That includes defering probe, that's not "special" here at all.
>>
>> What is special in this case is that if a .probe() hook had registered a
>> child device, then removed that child device (so it did clean up after
>> itself) and then return -EPROBE_DEFER, then we end up in an endless probe
>> loop.
>
> If all child devices really are cleaned up completly, why would this be
> a problem? What is set internally in the driver core that would get
> tripped up by this?
>
>> But this is unusual behaviour. Normally a .probe() hook checks all required
>> resources are available before registering any child devices. This driver
>> doesn't do that. Arguably this is indeed an additional requirement beyond
>> "clean up after yourself". I cannot find anyplace where it is documented. In
>> fact, I cannot find any documentation on EPROBE_DEFER in the Documentation/
>> tree. How about the below?
>>
>>>> By the way, I may imagine other mechanisms that probe the driver on other CPU
>>>> at the same time (let's consider parallel modprobes). The current code has a
>>>> flaw with that.
>>>
>>> That can't happen, the driver core prevents that.
>>
>> Greg's right, that can't happen. At worst a driver will get an additional
>> defer event; but it all still works.
>>
>> g.
>>
>> ---
>> diff --git a/Documentation/driver-api/driver-model/driver.rst
>> b/Documentation/driver-api/driver-model/driver.rst
>> index baa6a85c8287..46adede13aba 100644
>> --- a/Documentation/driver-api/driver-model/driver.rst
>> +++ b/Documentation/driver-api/driver-model/driver.rst
>> @@ -167,7 +167,17 @@ the driver to that device.
>>
>> A driver's probe() may return a negative errno value to indicate that
>> the driver did not bind to this device, in which case it should have
>> -released all resources it allocated::
>> +released all resources it allocated. Optionally, probe() may return
>> +-EPROBE_DEFER if the driver depends on resources that are not yet
>> +available (e.g., supplied by a driver that hasn't initialized yet).
>> +The driver core will put the device onto the deferred probe list and
>> +will try to call it again later. Important: -EPROBE_DEFER must not be
>> +returned if probe() has already created child devices, even if those
>> +child devices have were removed again in a cleanup path. If -EPROBE_DEFER
>> +is returned after a child device has been registered, it may result in an
>> +infinite loop of .probe() calls to the same driver.
>
> Ok, this is a bug, if that is the case, in the driver core as it should
> not matter how many devices were added/removed/whatever while a driver
> is in it's probe function.
Fair enough, call it a bug. However, the solution proposed so far is
broken, and I've not seen or been able to come up with an approach
without adding significant complexity to driver core.

Alternately, highlighting the limitation that EPROBE_DEFER can only be
used before the device gets to the point of registering child devices is
simple, only affects a few drivers, and matches the best practice of if
a driver needs to defer, it should do so *as soon as possible* to avoid
work needing to be unwound multiple times, which reduces boot time delays.

>
> But, I don't see how this patch solves that problem, another probe call
> should never be made for the same bus while in this probe function. If
> we do:
> device1->probe()
> device1 creates device2 and registers it
> device2->probe is called
> device2->probe returns 0
> device1 has problems, unregisters device2
> device2->remove is called
> device1 deletes device2
> device1 returns -EPROBE_DEFER
>
> So then where's the problem? Did device2 somehow not really get
> properly cleaned up?

Using the above sequence with a couple of annotations

device1->probe()
device1 creates device2 and registers it
device2->probe is called
device2->probe returns 0
*Driver Core increments probe_count
(drivers/base/dd.c:516)
device1 has problems, unregisters device2
device2->remove is called
device1 deletes device2
device1 returns -EPROBE_DEFER
*Driver core notices that probe_count got incremented since
device1->probe() call started; therefore it is possible that the
required resource is now available, so try calling device1->probe() again.

g.

2020-03-27 12:52:47

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Fri, Mar 27, 2020 at 12:37:54PM +0000, Grant Likely wrote:
>
>
> On 27/03/2020 08:03, Greg KH wrote:
> > On Thu, Mar 26, 2020 at 06:06:37PM +0000, Grant Likely wrote:
> > >
> > >
> > > On 26/03/2020 16:39, Greg KH wrote:
> > > > On Thu, Mar 26, 2020 at 06:31:10PM +0200, Andy Shevchenko wrote:
> > > > > On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> > > > > > On 25/03/2020 12:51, Andy Shevchenko wrote:
> > > > > > > On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > > > > > > > On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > > > > > > > > Consider the following scenario.
> > > > > > > > >
> > > > > > > > > The main driver of USB OTG controller (dwc3-pci), which has the following
> > > > > > > > > functional dependencies on certain platform:
> > > > > > > > > - ULPI (tusb1210)
> > > > > > > > > - extcon (tested with extcon-intel-mrfld)
> > > > > > > > >
> > > > > > > > > Note, that first driver, tusb1210, is available at the moment of
> > > > > > > > > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > > > > > > > > won't appear till user space does something about it.
> > > > > > > > >
> > > > > > > > > This is depicted by kernel configuration excerpt:
> > > > > > > > >
> > > > > > > > > CONFIG_PHY_TUSB1210=y
> > > > > > > > > CONFIG_USB_DWC3=y
> > > > > > > > > CONFIG_USB_DWC3_ULPI=y
> > > > > > > > > CONFIG_USB_DWC3_DUAL_ROLE=y
> > > > > > > > > CONFIG_USB_DWC3_PCI=y
> > > > > > > > > CONFIG_EXTCON_INTEL_MRFLD=m
> > > > > > > > >
> > > > > > > > > In the Buildroot environment the modules are probed by alphabetical ordering
> > > > > > > > > of their modaliases. The latter comes to the case when USB OTG driver will be
> > > > > > > > > probed first followed by extcon one.
> > > > > > > > >
> > > > > > > > > So, if the platform anticipates extcon device to be appeared, in the above case
> > > > > > > > > we will get deferred probe of USB OTG, because of ordering.
> > > > > > > > >
> > > > > > > > > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > > > > > > > > deferral race condition fix") counts the amount of triggered deferred probe,
> > > > > > > > > we never advance the situation -- the change makes it to be an infinite loop.
> > > > > > > >
> > > > > > > > Hi Andy,
> > > > > > > >
> > > > > > > > I'm trying to understand this sequence of steps. Sorry if the questions
> > > > > > > > are stupid -- I'm not very familiar with USB/PCI stuff.
> > > > > > >
> > > > > > > Thank you for looking into this. My answer below.
> > > > > > >
> > > > > > > As a first thing I would like to tell that there is another example of bad
> > > > > > > behaviour of deferred probe with no relation to USB. The proposed change also
> > > > > > > fixes that one (however, less possible to find in real life).
> > > > > > >
> > > > > > > > > ---8<---8<---
> > > > > > > > >
> > > > > > > > > [ 22.187127] driver_deferred_probe_trigger <<< 1
> > > > > > > > >
> > > > > > > > > ...here is the late initcall triggers deferred probe...
> > > > > > > > >
> > > > > > > > > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > > > > > > > >
> > > > > > > > > ...dwc3.0.auto is the only device in the deferred list...
> > > > > > > >
> > > > > > > > Ok, dwc3.0.auto is the only unprobed device at this point?
> > > > > > >
> > > > > > > Correct.
> > > > > > >
> > > > > > > > > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > > > > > > > >
> > > > > > > > > ...the counter before mutex is unlocked is kept the same...
> > > > > > > > >
> > > > > > > > > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > > > > > > > >
> > > > > > > > > ...mutes has been unlocked, we try to re-probe the driver...
> > > > > > > > >
> > > > > > > > > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > > > > > > > > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > > > > > > > > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > > > > > > > > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > > > > > > > > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > > > > > > > > [ 22.263723] driver_deferred_probe_trigger <<< 2
> > > > > > > > >
> > > > > > > > > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > > > > > > > >
> > > > > > > > > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> > > > > > > >
> > > > > > > > So where did this dwc3.0.auto.ulpi come from?
> > > > > > >
> > > > > > > > Looks like the device is created by dwc3_probe() through this call flow:
> > > > > > > > dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > > > > > > > dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> > > > > > >
> > > > > > > Correct.
> > > > > > >
> > > > > > > > > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> > > > > > > >
> > > > > > > > Can you please point me to which code patch actually caused the probe
> > > > > > > > deferral?
> > > > > > >
> > > > > > > Sure, it's in drd.c.
> > > > > > >
> > > > > > > if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> > > > > > > edev = extcon_get_extcon_dev(name);
> > > > > > > if (!edev)
> > > > > > > return ERR_PTR(-EPROBE_DEFER);
> > > > > > > return edev;
> > > > > > > }
> > > > > > >
> > > > > > > > > ...but extcon driver is still missing...
> > > > > > > > >
> > > > > > > > > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > > > > > > > > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> > > > > > > >
> > > > > > > > I'm not fully aware of all the USB implications, but if extcon is
> > > > > > > > needed, why can't that check be done before we add and probe the ulpi
> > > > > > > > device? That'll avoid this whole "fake" probing and avoid the counter
> > > > > > > > increase. And avoid the need for this patch that's touching the code
> > > > > > > > code that's already a bit delicate.
> > > > > > >
> > > > > > > > Also, with my limited experience with all the possible drivers in the
> > > > > > > > kernel, it's weird that the ulpi device is added and probed before we
> > > > > > > > make sure the parent device (dwc3.0.auto) can actually probe
> > > > > > > > successfully.
> > > > > > >
> > > > > > > As I said above the deferred probe trigger has flaw on its own.
> > > > > > > Even if we fix for USB case, there is (and probably will be) others.
> > > > > >
> > > > > > Right here is the driver design bug. A driver's probe() hook should *not*
> > > > > > return -EPROBE_DEFER after already creating child devices which may have
> > > > > > already been probed.
> > > > >
> > > > > Any documentation statement for this requirement?
> > > >
> > > > There shouldn't be. If you return ANY error from a probe function, your
> > > > driver is essencially "dead" when it comes to that device, and it had
> > > > better have cleaned up after itself. >
> > > > That includes defering probe, that's not "special" here at all.
> > >
> > > What is special in this case is that if a .probe() hook had registered a
> > > child device, then removed that child device (so it did clean up after
> > > itself) and then return -EPROBE_DEFER, then we end up in an endless probe
> > > loop.
> >
> > If all child devices really are cleaned up completly, why would this be
> > a problem? What is set internally in the driver core that would get
> > tripped up by this?
> >
> > > But this is unusual behaviour. Normally a .probe() hook checks all required
> > > resources are available before registering any child devices. This driver
> > > doesn't do that. Arguably this is indeed an additional requirement beyond
> > > "clean up after yourself". I cannot find anyplace where it is documented. In
> > > fact, I cannot find any documentation on EPROBE_DEFER in the Documentation/
> > > tree. How about the below?
> > >
> > > > > By the way, I may imagine other mechanisms that probe the driver on other CPU
> > > > > at the same time (let's consider parallel modprobes). The current code has a
> > > > > flaw with that.
> > > >
> > > > That can't happen, the driver core prevents that.
> > >
> > > Greg's right, that can't happen. At worst a driver will get an additional
> > > defer event; but it all still works.
> > >
> > > g.
> > >
> > > ---
> > > diff --git a/Documentation/driver-api/driver-model/driver.rst
> > > b/Documentation/driver-api/driver-model/driver.rst
> > > index baa6a85c8287..46adede13aba 100644
> > > --- a/Documentation/driver-api/driver-model/driver.rst
> > > +++ b/Documentation/driver-api/driver-model/driver.rst
> > > @@ -167,7 +167,17 @@ the driver to that device.
> > >
> > > A driver's probe() may return a negative errno value to indicate that
> > > the driver did not bind to this device, in which case it should have
> > > -released all resources it allocated::
> > > +released all resources it allocated. Optionally, probe() may return
> > > +-EPROBE_DEFER if the driver depends on resources that are not yet
> > > +available (e.g., supplied by a driver that hasn't initialized yet).
> > > +The driver core will put the device onto the deferred probe list and
> > > +will try to call it again later. Important: -EPROBE_DEFER must not be
> > > +returned if probe() has already created child devices, even if those
> > > +child devices have were removed again in a cleanup path. If -EPROBE_DEFER
> > > +is returned after a child device has been registered, it may result in an
> > > +infinite loop of .probe() calls to the same driver.
> >
> > Ok, this is a bug, if that is the case, in the driver core as it should
> > not matter how many devices were added/removed/whatever while a driver
> > is in it's probe function.
> Fair enough, call it a bug. However, the solution proposed so far is broken,
> and I've not seen or been able to come up with an approach without adding
> significant complexity to driver core.
>
> Alternately, highlighting the limitation that EPROBE_DEFER can only be used
> before the device gets to the point of registering child devices is simple,
> only affects a few drivers, and matches the best practice of if a driver
> needs to defer, it should do so *as soon as possible* to avoid work needing
> to be unwound multiple times, which reduces boot time delays.

Agreed.

> > But, I don't see how this patch solves that problem, another probe call
> > should never be made for the same bus while in this probe function. If
> > we do:
> > device1->probe()
> > device1 creates device2 and registers it
> > device2->probe is called
> > device2->probe returns 0
> > device1 has problems, unregisters device2
> > device2->remove is called
> > device1 deletes device2
> > device1 returns -EPROBE_DEFER
> >
> > So then where's the problem? Did device2 somehow not really get
> > properly cleaned up?
>
> Using the above sequence with a couple of annotations
>
> device1->probe()
> device1 creates device2 and registers it
> device2->probe is called
> device2->probe returns 0
> *Driver Core increments probe_count
> (drivers/base/dd.c:516)
> device1 has problems, unregisters device2
> device2->remove is called
> device1 deletes device2
> device1 returns -EPROBE_DEFER
> *Driver core notices that probe_count got incremented since
> device1->probe() call started; therefore it is possible that the required
> resource is now available, so try calling device1->probe() again.

Ah, that makes more sense now, thanks.

And yes, this specific patch doesn't seem to solve that problem, so I'll
be glad to take the documentation upate that says "do not do this" :)

thanks,

greg k-h

2020-06-08 09:19:46

by Marco Felsch

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On 20-03-26 18:31, Andy Shevchenko wrote:
> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> > On 25/03/2020 12:51, Andy Shevchenko wrote:
> > > On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > > > On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > > > > Consider the following scenario.
> > > > >
> > > > > The main driver of USB OTG controller (dwc3-pci), which has the following
> > > > > functional dependencies on certain platform:
> > > > > - ULPI (tusb1210)
> > > > > - extcon (tested with extcon-intel-mrfld)
> > > > >
> > > > > Note, that first driver, tusb1210, is available at the moment of
> > > > > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > > > > won't appear till user space does something about it.
> > > > >
> > > > > This is depicted by kernel configuration excerpt:
> > > > >
> > > > > CONFIG_PHY_TUSB1210=y
> > > > > CONFIG_USB_DWC3=y
> > > > > CONFIG_USB_DWC3_ULPI=y
> > > > > CONFIG_USB_DWC3_DUAL_ROLE=y
> > > > > CONFIG_USB_DWC3_PCI=y
> > > > > CONFIG_EXTCON_INTEL_MRFLD=m
> > > > >
> > > > > In the Buildroot environment the modules are probed by alphabetical ordering
> > > > > of their modaliases. The latter comes to the case when USB OTG driver will be
> > > > > probed first followed by extcon one.
> > > > >
> > > > > So, if the platform anticipates extcon device to be appeared, in the above case
> > > > > we will get deferred probe of USB OTG, because of ordering.
> > > > >
> > > > > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > > > > deferral race condition fix") counts the amount of triggered deferred probe,
> > > > > we never advance the situation -- the change makes it to be an infinite loop.
> > > >
> > > > Hi Andy,
> > > >
> > > > I'm trying to understand this sequence of steps. Sorry if the questions
> > > > are stupid -- I'm not very familiar with USB/PCI stuff.
> > >
> > > Thank you for looking into this. My answer below.
> > >
> > > As a first thing I would like to tell that there is another example of bad
> > > behaviour of deferred probe with no relation to USB. The proposed change also
> > > fixes that one (however, less possible to find in real life).
> > >
> > > > > ---8<---8<---
> > > > >
> > > > > [ 22.187127] driver_deferred_probe_trigger <<< 1
> > > > >
> > > > > ...here is the late initcall triggers deferred probe...
> > > > >
> > > > > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > > > >
> > > > > ...dwc3.0.auto is the only device in the deferred list...
> > > >
> > > > Ok, dwc3.0.auto is the only unprobed device at this point?
> > >
> > > Correct.
> > >
> > > > > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > > > >
> > > > > ...the counter before mutex is unlocked is kept the same...
> > > > >
> > > > > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > > > >
> > > > > ...mutes has been unlocked, we try to re-probe the driver...
> > > > >
> > > > > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > > > > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > > > > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > > > > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > > > > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > > > > [ 22.263723] driver_deferred_probe_trigger <<< 2
> > > > >
> > > > > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > > > >
> > > > > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> > > >
> > > > So where did this dwc3.0.auto.ulpi come from?
> > >
> > > > Looks like the device is created by dwc3_probe() through this call flow:
> > > > dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > > > dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> > >
> > > Correct.
> > >
> > > > > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> > > >
> > > > Can you please point me to which code patch actually caused the probe
> > > > deferral?
> > >
> > > Sure, it's in drd.c.
> > >
> > > if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> > > edev = extcon_get_extcon_dev(name);
> > > if (!edev)
> > > return ERR_PTR(-EPROBE_DEFER);
> > > return edev;
> > > }
> > >
> > > > > ...but extcon driver is still missing...
> > > > >
> > > > > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > > > > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> > > >
> > > > I'm not fully aware of all the USB implications, but if extcon is
> > > > needed, why can't that check be done before we add and probe the ulpi
> > > > device? That'll avoid this whole "fake" probing and avoid the counter
> > > > increase. And avoid the need for this patch that's touching the code
> > > > code that's already a bit delicate.
> > >
> > > > Also, with my limited experience with all the possible drivers in the
> > > > kernel, it's weird that the ulpi device is added and probed before we
> > > > make sure the parent device (dwc3.0.auto) can actually probe
> > > > successfully.
> > >
> > > As I said above the deferred probe trigger has flaw on its own.
> > > Even if we fix for USB case, there is (and probably will be) others.
> >
> > Right here is the driver design bug. A driver's probe() hook should *not*
> > return -EPROBE_DEFER after already creating child devices which may have
> > already been probed.
>
> Any documentation statement for this requirement?
>
> By the way, I may imagine other mechanisms that probe the driver on other CPU
> at the same time (let's consider parallel modprobes). The current code has a
> flaw with that.

Hi,

sorry for picking this up again but I stumbled above the same issue
within the driver imx/drm driver which is using the component framework.
I end up in a infinity boot loop if I enabled the HDMI (which is the
DesignWare bridge device) and the LVDS support and the LVDS bind return
with EPROBE_DEFER. There are no words within the component framework docs
which says that this is forbidden. Of course we can work-around the
driver-core framework but IMHO this shouldn't be the way to go. I do not
say that we should revert the commit introducing the regression but we
should address this not only by extending the docs since the most
drm-drivers are using the component framework and can end up in the same
situation.

> > It can be solved by refactoring the driver probe routine. If a resource is
> > required to be present, then check that it is available early; before
> > registering child devices.
>
> We fix one and leave others.

E.g. the imx-drm and the sunxi driver...

Regards,
Marco

> > The proposed solution to modify driver core is fragile and susceptible to
> > side effects from other probe paths. I don't think it is the right approach.
>
> Have you tested it on your case? Does it fix the issue?
>

2020-06-08 11:15:35

by Andrzej Hajda

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied


On 08.06.2020 11:17, Marco Felsch wrote:
> On 20-03-26 18:31, Andy Shevchenko wrote:
>> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
>>> On 25/03/2020 12:51, Andy Shevchenko wrote:
>>>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
>>>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
>>>>>> Consider the following scenario.
>>>>>>
>>>>>> The main driver of USB OTG controller (dwc3-pci), which has the following
>>>>>> functional dependencies on certain platform:
>>>>>> - ULPI (tusb1210)
>>>>>> - extcon (tested with extcon-intel-mrfld)
>>>>>>
>>>>>> Note, that first driver, tusb1210, is available at the moment of
>>>>>> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
>>>>>> won't appear till user space does something about it.
>>>>>>
>>>>>> This is depicted by kernel configuration excerpt:
>>>>>>
>>>>>> CONFIG_PHY_TUSB1210=y
>>>>>> CONFIG_USB_DWC3=y
>>>>>> CONFIG_USB_DWC3_ULPI=y
>>>>>> CONFIG_USB_DWC3_DUAL_ROLE=y
>>>>>> CONFIG_USB_DWC3_PCI=y
>>>>>> CONFIG_EXTCON_INTEL_MRFLD=m
>>>>>>
>>>>>> In the Buildroot environment the modules are probed by alphabetical ordering
>>>>>> of their modaliases. The latter comes to the case when USB OTG driver will be
>>>>>> probed first followed by extcon one.
>>>>>>
>>>>>> So, if the platform anticipates extcon device to be appeared, in the above case
>>>>>> we will get deferred probe of USB OTG, because of ordering.
>>>>>>
>>>>>> Since current implementation, done by the commit 58b116bce136 ("drivercore:
>>>>>> deferral race condition fix") counts the amount of triggered deferred probe,
>>>>>> we never advance the situation -- the change makes it to be an infinite loop.
>>>>> Hi Andy,
>>>>>
>>>>> I'm trying to understand this sequence of steps. Sorry if the questions
>>>>> are stupid -- I'm not very familiar with USB/PCI stuff.
>>>> Thank you for looking into this. My answer below.
>>>>
>>>> As a first thing I would like to tell that there is another example of bad
>>>> behaviour of deferred probe with no relation to USB. The proposed change also
>>>> fixes that one (however, less possible to find in real life).
>>>>
>>>>>> ---8<---8<---
>>>>>>
>>>>>> [ 22.187127] driver_deferred_probe_trigger <<< 1
>>>>>>
>>>>>> ...here is the late initcall triggers deferred probe...
>>>>>>
>>>>>> [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
>>>>>>
>>>>>> ...dwc3.0.auto is the only device in the deferred list...
>>>>> Ok, dwc3.0.auto is the only unprobed device at this point?
>>>> Correct.
>>>>
>>>>>> [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
>>>>>>
>>>>>> ...the counter before mutex is unlocked is kept the same...
>>>>>>
>>>>>> [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
>>>>>>
>>>>>> ...mutes has been unlocked, we try to re-probe the driver...
>>>>>>
>>>>>> [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
>>>>>> [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
>>>>>> [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
>>>>>> [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
>>>>>> [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
>>>>>> [ 22.263723] driver_deferred_probe_trigger <<< 2
>>>>>>
>>>>>> ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
>>>>>>
>>>>>> [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
>>>>> So where did this dwc3.0.auto.ulpi come from?
>>>>> Looks like the device is created by dwc3_probe() through this call flow:
>>>>> dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
>>>>> dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
>>>> Correct.
>>>>
>>>>>> [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
>>>>> Can you please point me to which code patch actually caused the probe
>>>>> deferral?
>>>> Sure, it's in drd.c.
>>>>
>>>> if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
>>>> edev = extcon_get_extcon_dev(name);
>>>> if (!edev)
>>>> return ERR_PTR(-EPROBE_DEFER);
>>>> return edev;
>>>> }
>>>>
>>>>>> ...but extcon driver is still missing...
>>>>>>
>>>>>> [ 22.283174] platform dwc3.0.auto: Added to deferred list
>>>>>> [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
>>>>> I'm not fully aware of all the USB implications, but if extcon is
>>>>> needed, why can't that check be done before we add and probe the ulpi
>>>>> device? That'll avoid this whole "fake" probing and avoid the counter
>>>>> increase. And avoid the need for this patch that's touching the code
>>>>> code that's already a bit delicate.
>>>>> Also, with my limited experience with all the possible drivers in the
>>>>> kernel, it's weird that the ulpi device is added and probed before we
>>>>> make sure the parent device (dwc3.0.auto) can actually probe
>>>>> successfully.
>>>> As I said above the deferred probe trigger has flaw on its own.
>>>> Even if we fix for USB case, there is (and probably will be) others.
>>> Right here is the driver design bug. A driver's probe() hook should *not*
>>> return -EPROBE_DEFER after already creating child devices which may have
>>> already been probed.
>> Any documentation statement for this requirement?
>>
>> By the way, I may imagine other mechanisms that probe the driver on other CPU
>> at the same time (let's consider parallel modprobes). The current code has a
>> flaw with that.
> Hi,
>
> sorry for picking this up again but I stumbled above the same issue
> within the driver imx/drm driver which is using the component framework.
> I end up in a infinity boot loop if I enabled the HDMI (which is the
> DesignWare bridge device) and the LVDS support and the LVDS bind return
> with EPROBE_DEFER. There are no words within the component framework docs
> which says that this is forbidden. Of course we can work-around the
> driver-core framework but IMHO this shouldn't be the way to go. I do not
> say that we should revert the commit introducing the regression but we
> should address this not only by extending the docs since the most
> drm-drivers are using the component framework and can end up in the same
> situation.

I am not sure why do you think this is similar issue.

Please describe the issue in more detail. Which drivers defers probe and
why, and why do you have infinite loop.

In general deferring probe from bind is not forbidden, but it should be
used carefully (as everything in kernel :) ). Fixing deferring probe
issues in many cases it is a matter of figuring out 'dependency loops'
and breaking them by splitting device initialization into more than one
phase.


Regards

Andrzej


>
>>> It can be solved by refactoring the driver probe routine. If a resource is
>>> required to be present, then check that it is available early; before
>>> registering child devices.
>> We fix one and leave others.
> E.g. the imx-drm and the sunxi driver...
>
> Regards,
> Marco
>
>>> The proposed solution to modify driver core is fragile and susceptible to
>>> side effects from other probe paths. I don't think it is the right approach.
>> Have you tested it on your case? Does it fix the issue?
>>

2020-06-08 11:15:49

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Mon, Jun 8, 2020 at 12:20 PM Marco Felsch <[email protected]> wrote:
> On 20-03-26 18:31, Andy Shevchenko wrote:
> > On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> > > On 25/03/2020 12:51, Andy Shevchenko wrote:
> > > > On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > > > > On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > > > > > Consider the following scenario.
> > > > > >
> > > > > > The main driver of USB OTG controller (dwc3-pci), which has the following
> > > > > > functional dependencies on certain platform:
> > > > > > - ULPI (tusb1210)
> > > > > > - extcon (tested with extcon-intel-mrfld)
> > > > > >
> > > > > > Note, that first driver, tusb1210, is available at the moment of
> > > > > > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > > > > > won't appear till user space does something about it.
> > > > > >
> > > > > > This is depicted by kernel configuration excerpt:
> > > > > >
> > > > > > CONFIG_PHY_TUSB1210=y
> > > > > > CONFIG_USB_DWC3=y
> > > > > > CONFIG_USB_DWC3_ULPI=y
> > > > > > CONFIG_USB_DWC3_DUAL_ROLE=y
> > > > > > CONFIG_USB_DWC3_PCI=y
> > > > > > CONFIG_EXTCON_INTEL_MRFLD=m
> > > > > >
> > > > > > In the Buildroot environment the modules are probed by alphabetical ordering
> > > > > > of their modaliases. The latter comes to the case when USB OTG driver will be
> > > > > > probed first followed by extcon one.
> > > > > >
> > > > > > So, if the platform anticipates extcon device to be appeared, in the above case
> > > > > > we will get deferred probe of USB OTG, because of ordering.
> > > > > >
> > > > > > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > > > > > deferral race condition fix") counts the amount of triggered deferred probe,
> > > > > > we never advance the situation -- the change makes it to be an infinite loop.
> > > > >
> > > > > Hi Andy,
> > > > >
> > > > > I'm trying to understand this sequence of steps. Sorry if the questions
> > > > > are stupid -- I'm not very familiar with USB/PCI stuff.
> > > >
> > > > Thank you for looking into this. My answer below.
> > > >
> > > > As a first thing I would like to tell that there is another example of bad
> > > > behaviour of deferred probe with no relation to USB. The proposed change also
> > > > fixes that one (however, less possible to find in real life).
> > > >
> > > > > > ---8<---8<---
> > > > > >
> > > > > > [ 22.187127] driver_deferred_probe_trigger <<< 1
> > > > > >
> > > > > > ...here is the late initcall triggers deferred probe...
> > > > > >
> > > > > > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > > > > >
> > > > > > ...dwc3.0.auto is the only device in the deferred list...
> > > > >
> > > > > Ok, dwc3.0.auto is the only unprobed device at this point?
> > > >
> > > > Correct.
> > > >
> > > > > > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > > > > >
> > > > > > ...the counter before mutex is unlocked is kept the same...
> > > > > >
> > > > > > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > > > > >
> > > > > > ...mutes has been unlocked, we try to re-probe the driver...
> > > > > >
> > > > > > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > > > > > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > > > > > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > > > > > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > > > > > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > > > > > [ 22.263723] driver_deferred_probe_trigger <<< 2
> > > > > >
> > > > > > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > > > > >
> > > > > > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> > > > >
> > > > > So where did this dwc3.0.auto.ulpi come from?
> > > >
> > > > > Looks like the device is created by dwc3_probe() through this call flow:
> > > > > dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > > > > dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> > > >
> > > > Correct.
> > > >
> > > > > > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> > > > >
> > > > > Can you please point me to which code patch actually caused the probe
> > > > > deferral?
> > > >
> > > > Sure, it's in drd.c.
> > > >
> > > > if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> > > > edev = extcon_get_extcon_dev(name);
> > > > if (!edev)
> > > > return ERR_PTR(-EPROBE_DEFER);
> > > > return edev;
> > > > }
> > > >
> > > > > > ...but extcon driver is still missing...
> > > > > >
> > > > > > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > > > > > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> > > > >
> > > > > I'm not fully aware of all the USB implications, but if extcon is
> > > > > needed, why can't that check be done before we add and probe the ulpi
> > > > > device? That'll avoid this whole "fake" probing and avoid the counter
> > > > > increase. And avoid the need for this patch that's touching the code
> > > > > code that's already a bit delicate.
> > > >
> > > > > Also, with my limited experience with all the possible drivers in the
> > > > > kernel, it's weird that the ulpi device is added and probed before we
> > > > > make sure the parent device (dwc3.0.auto) can actually probe
> > > > > successfully.
> > > >
> > > > As I said above the deferred probe trigger has flaw on its own.
> > > > Even if we fix for USB case, there is (and probably will be) others.
> > >
> > > Right here is the driver design bug. A driver's probe() hook should *not*
> > > return -EPROBE_DEFER after already creating child devices which may have
> > > already been probed.
> >
> > Any documentation statement for this requirement?
> >
> > By the way, I may imagine other mechanisms that probe the driver on other CPU
> > at the same time (let's consider parallel modprobes). The current code has a
> > flaw with that.
>
> Hi,
>
> sorry for picking this up again but I stumbled above the same issue
> within the driver imx/drm driver which is using the component framework.
> I end up in a infinity boot loop if I enabled the HDMI (which is the
> DesignWare bridge device) and the LVDS support and the LVDS bind return
> with EPROBE_DEFER. There are no words within the component framework docs
> which says that this is forbidden. Of course we can work-around the
> driver-core framework but IMHO this shouldn't be the way to go. I do not
> say that we should revert the commit introducing the regression but we
> should address this not only by extending the docs since the most
> drm-drivers are using the component framework and can end up in the same
> situation.
>
> > > It can be solved by refactoring the driver probe routine. If a resource is
> > > required to be present, then check that it is available early; before
> > > registering child devices.
> >
> > We fix one and leave others.
>
> E.g. the imx-drm and the sunxi driver...

Just out of curiosity, does my patch fix an issue for you?

> > > The proposed solution to modify driver core is fragile and susceptible to
> > > side effects from other probe paths. I don't think it is the right approach.
> >
> > Have you tested it on your case? Does it fix the issue?

--
With Best Regards,
Andy Shevchenko

2020-06-08 12:03:54

by Marco Felsch

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On 20-06-08 14:13, Andy Shevchenko wrote:
> On Mon, Jun 8, 2020 at 12:20 PM Marco Felsch <[email protected]> wrote:
> > On 20-03-26 18:31, Andy Shevchenko wrote:
> > > On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> > > > On 25/03/2020 12:51, Andy Shevchenko wrote:
> > > > > On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > > > > > On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > > > > > > Consider the following scenario.
> > > > > > >
> > > > > > > The main driver of USB OTG controller (dwc3-pci), which has the following
> > > > > > > functional dependencies on certain platform:
> > > > > > > - ULPI (tusb1210)
> > > > > > > - extcon (tested with extcon-intel-mrfld)
> > > > > > >
> > > > > > > Note, that first driver, tusb1210, is available at the moment of
> > > > > > > dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > > > > > > won't appear till user space does something about it.
> > > > > > >
> > > > > > > This is depicted by kernel configuration excerpt:
> > > > > > >
> > > > > > > CONFIG_PHY_TUSB1210=y
> > > > > > > CONFIG_USB_DWC3=y
> > > > > > > CONFIG_USB_DWC3_ULPI=y
> > > > > > > CONFIG_USB_DWC3_DUAL_ROLE=y
> > > > > > > CONFIG_USB_DWC3_PCI=y
> > > > > > > CONFIG_EXTCON_INTEL_MRFLD=m
> > > > > > >
> > > > > > > In the Buildroot environment the modules are probed by alphabetical ordering
> > > > > > > of their modaliases. The latter comes to the case when USB OTG driver will be
> > > > > > > probed first followed by extcon one.
> > > > > > >
> > > > > > > So, if the platform anticipates extcon device to be appeared, in the above case
> > > > > > > we will get deferred probe of USB OTG, because of ordering.
> > > > > > >
> > > > > > > Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > > > > > > deferral race condition fix") counts the amount of triggered deferred probe,
> > > > > > > we never advance the situation -- the change makes it to be an infinite loop.
> > > > > >
> > > > > > Hi Andy,
> > > > > >
> > > > > > I'm trying to understand this sequence of steps. Sorry if the questions
> > > > > > are stupid -- I'm not very familiar with USB/PCI stuff.
> > > > >
> > > > > Thank you for looking into this. My answer below.
> > > > >
> > > > > As a first thing I would like to tell that there is another example of bad
> > > > > behaviour of deferred probe with no relation to USB. The proposed change also
> > > > > fixes that one (however, less possible to find in real life).
> > > > >
> > > > > > > ---8<---8<---
> > > > > > >
> > > > > > > [ 22.187127] driver_deferred_probe_trigger <<< 1
> > > > > > >
> > > > > > > ...here is the late initcall triggers deferred probe...
> > > > > > >
> > > > > > > [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > > > > > >
> > > > > > > ...dwc3.0.auto is the only device in the deferred list...
> > > > > >
> > > > > > Ok, dwc3.0.auto is the only unprobed device at this point?
> > > > >
> > > > > Correct.
> > > > >
> > > > > > > [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > > > > > >
> > > > > > > ...the counter before mutex is unlocked is kept the same...
> > > > > > >
> > > > > > > [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > > > > > >
> > > > > > > ...mutes has been unlocked, we try to re-probe the driver...
> > > > > > >
> > > > > > > [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > > > > > > [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > > > > > > [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > > > > > > [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > > > > > > [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > > > > > > [ 22.263723] driver_deferred_probe_trigger <<< 2
> > > > > > >
> > > > > > > ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > > > > > >
> > > > > > > [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> > > > > >
> > > > > > So where did this dwc3.0.auto.ulpi come from?
> > > > >
> > > > > > Looks like the device is created by dwc3_probe() through this call flow:
> > > > > > dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > > > > > dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> > > > >
> > > > > Correct.
> > > > >
> > > > > > > [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> > > > > >
> > > > > > Can you please point me to which code patch actually caused the probe
> > > > > > deferral?
> > > > >
> > > > > Sure, it's in drd.c.
> > > > >
> > > > > if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> > > > > edev = extcon_get_extcon_dev(name);
> > > > > if (!edev)
> > > > > return ERR_PTR(-EPROBE_DEFER);
> > > > > return edev;
> > > > > }
> > > > >
> > > > > > > ...but extcon driver is still missing...
> > > > > > >
> > > > > > > [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > > > > > > [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> > > > > >
> > > > > > I'm not fully aware of all the USB implications, but if extcon is
> > > > > > needed, why can't that check be done before we add and probe the ulpi
> > > > > > device? That'll avoid this whole "fake" probing and avoid the counter
> > > > > > increase. And avoid the need for this patch that's touching the code
> > > > > > code that's already a bit delicate.
> > > > >
> > > > > > Also, with my limited experience with all the possible drivers in the
> > > > > > kernel, it's weird that the ulpi device is added and probed before we
> > > > > > make sure the parent device (dwc3.0.auto) can actually probe
> > > > > > successfully.
> > > > >
> > > > > As I said above the deferred probe trigger has flaw on its own.
> > > > > Even if we fix for USB case, there is (and probably will be) others.
> > > >
> > > > Right here is the driver design bug. A driver's probe() hook should *not*
> > > > return -EPROBE_DEFER after already creating child devices which may have
> > > > already been probed.
> > >
> > > Any documentation statement for this requirement?
> > >
> > > By the way, I may imagine other mechanisms that probe the driver on other CPU
> > > at the same time (let's consider parallel modprobes). The current code has a
> > > flaw with that.
> >
> > Hi,
> >
> > sorry for picking this up again but I stumbled above the same issue
> > within the driver imx/drm driver which is using the component framework.
> > I end up in a infinity boot loop if I enabled the HDMI (which is the
> > DesignWare bridge device) and the LVDS support and the LVDS bind return
> > with EPROBE_DEFER. There are no words within the component framework docs
> > which says that this is forbidden. Of course we can work-around the
> > driver-core framework but IMHO this shouldn't be the way to go. I do not
> > say that we should revert the commit introducing the regression but we
> > should address this not only by extending the docs since the most
> > drm-drivers are using the component framework and can end up in the same
> > situation.
> >
> > > > It can be solved by refactoring the driver probe routine. If a resource is
> > > > required to be present, then check that it is available early; before
> > > > registering child devices.
> > >
> > > We fix one and leave others.
> >
> > E.g. the imx-drm and the sunxi driver...
>
> Just out of curiosity, does my patch fix an issue for you?

I didn't applied your patch yet. I can test it if you want.

Regards,
Marco

> > > > The proposed solution to modify driver core is fragile and susceptible to
> > > > side effects from other probe paths. I don't think it is the right approach.
> > >
> > > Have you tested it on your case? Does it fix the issue?

2020-06-08 12:13:35

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Mon, Jun 8, 2020 at 2:59 PM Marco Felsch <[email protected]> wrote:
> On 20-06-08 14:13, Andy Shevchenko wrote:
> > On Mon, Jun 8, 2020 at 12:20 PM Marco Felsch <[email protected]> wrote:
> > > On 20-03-26 18:31, Andy Shevchenko wrote:

> > > sorry for picking this up again but I stumbled above the same issue
> > > within the driver imx/drm driver which is using the component framework.
> > > I end up in a infinity boot loop if I enabled the HDMI (which is the
> > > DesignWare bridge device) and the LVDS support and the LVDS bind return
> > > with EPROBE_DEFER. There are no words within the component framework docs
> > > which says that this is forbidden. Of course we can work-around the
> > > driver-core framework but IMHO this shouldn't be the way to go. I do not
> > > say that we should revert the commit introducing the regression but we
> > > should address this not only by extending the docs since the most
> > > drm-drivers are using the component framework and can end up in the same
> > > situation.
> > >
> > > > > It can be solved by refactoring the driver probe routine. If a resource is
> > > > > required to be present, then check that it is available early; before
> > > > > registering child devices.
> > > >
> > > > We fix one and leave others.
> > >
> > > E.g. the imx-drm and the sunxi driver...
> >
> > Just out of curiosity, does my patch fix an issue for you?
>
> I didn't applied your patch yet. I can test it if you want.

If you can, thanks!

> > > > > The proposed solution to modify driver core is fragile and susceptible to
> > > > > side effects from other probe paths. I don't think it is the right approach.

--
With Best Regards,
Andy Shevchenko

2020-06-09 06:51:15

by Marco Felsch

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On 20-06-08 13:11, Andrzej Hajda wrote:
>
> On 08.06.2020 11:17, Marco Felsch wrote:
> > On 20-03-26 18:31, Andy Shevchenko wrote:
> >> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> >>> On 25/03/2020 12:51, Andy Shevchenko wrote:
> >>>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> >>>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> >>>>>> Consider the following scenario.
> >>>>>>
> >>>>>> The main driver of USB OTG controller (dwc3-pci), which has the following
> >>>>>> functional dependencies on certain platform:
> >>>>>> - ULPI (tusb1210)
> >>>>>> - extcon (tested with extcon-intel-mrfld)
> >>>>>>
> >>>>>> Note, that first driver, tusb1210, is available at the moment of
> >>>>>> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> >>>>>> won't appear till user space does something about it.
> >>>>>>
> >>>>>> This is depicted by kernel configuration excerpt:
> >>>>>>
> >>>>>> CONFIG_PHY_TUSB1210=y
> >>>>>> CONFIG_USB_DWC3=y
> >>>>>> CONFIG_USB_DWC3_ULPI=y
> >>>>>> CONFIG_USB_DWC3_DUAL_ROLE=y
> >>>>>> CONFIG_USB_DWC3_PCI=y
> >>>>>> CONFIG_EXTCON_INTEL_MRFLD=m
> >>>>>>
> >>>>>> In the Buildroot environment the modules are probed by alphabetical ordering
> >>>>>> of their modaliases. The latter comes to the case when USB OTG driver will be
> >>>>>> probed first followed by extcon one.
> >>>>>>
> >>>>>> So, if the platform anticipates extcon device to be appeared, in the above case
> >>>>>> we will get deferred probe of USB OTG, because of ordering.
> >>>>>>
> >>>>>> Since current implementation, done by the commit 58b116bce136 ("drivercore:
> >>>>>> deferral race condition fix") counts the amount of triggered deferred probe,
> >>>>>> we never advance the situation -- the change makes it to be an infinite loop.
> >>>>> Hi Andy,
> >>>>>
> >>>>> I'm trying to understand this sequence of steps. Sorry if the questions
> >>>>> are stupid -- I'm not very familiar with USB/PCI stuff.
> >>>> Thank you for looking into this. My answer below.
> >>>>
> >>>> As a first thing I would like to tell that there is another example of bad
> >>>> behaviour of deferred probe with no relation to USB. The proposed change also
> >>>> fixes that one (however, less possible to find in real life).
> >>>>
> >>>>>> ---8<---8<---
> >>>>>>
> >>>>>> [ 22.187127] driver_deferred_probe_trigger <<< 1
> >>>>>>
> >>>>>> ...here is the late initcall triggers deferred probe...
> >>>>>>
> >>>>>> [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> >>>>>>
> >>>>>> ...dwc3.0.auto is the only device in the deferred list...
> >>>>> Ok, dwc3.0.auto is the only unprobed device at this point?
> >>>> Correct.
> >>>>
> >>>>>> [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> >>>>>>
> >>>>>> ...the counter before mutex is unlocked is kept the same...
> >>>>>>
> >>>>>> [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> >>>>>>
> >>>>>> ...mutes has been unlocked, we try to re-probe the driver...
> >>>>>>
> >>>>>> [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> >>>>>> [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> >>>>>> [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> >>>>>> [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> >>>>>> [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> >>>>>> [ 22.263723] driver_deferred_probe_trigger <<< 2
> >>>>>>
> >>>>>> ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> >>>>>>
> >>>>>> [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> >>>>> So where did this dwc3.0.auto.ulpi come from?
> >>>>> Looks like the device is created by dwc3_probe() through this call flow:
> >>>>> dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> >>>>> dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> >>>> Correct.
> >>>>
> >>>>>> [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> >>>>> Can you please point me to which code patch actually caused the probe
> >>>>> deferral?
> >>>> Sure, it's in drd.c.
> >>>>
> >>>> if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> >>>> edev = extcon_get_extcon_dev(name);
> >>>> if (!edev)
> >>>> return ERR_PTR(-EPROBE_DEFER);
> >>>> return edev;
> >>>> }
> >>>>
> >>>>>> ...but extcon driver is still missing...
> >>>>>>
> >>>>>> [ 22.283174] platform dwc3.0.auto: Added to deferred list
> >>>>>> [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> >>>>> I'm not fully aware of all the USB implications, but if extcon is
> >>>>> needed, why can't that check be done before we add and probe the ulpi
> >>>>> device? That'll avoid this whole "fake" probing and avoid the counter
> >>>>> increase. And avoid the need for this patch that's touching the code
> >>>>> code that's already a bit delicate.
> >>>>> Also, with my limited experience with all the possible drivers in the
> >>>>> kernel, it's weird that the ulpi device is added and probed before we
> >>>>> make sure the parent device (dwc3.0.auto) can actually probe
> >>>>> successfully.
> >>>> As I said above the deferred probe trigger has flaw on its own.
> >>>> Even if we fix for USB case, there is (and probably will be) others.
> >>> Right here is the driver design bug. A driver's probe() hook should *not*
> >>> return -EPROBE_DEFER after already creating child devices which may have
> >>> already been probed.
> >> Any documentation statement for this requirement?
> >>
> >> By the way, I may imagine other mechanisms that probe the driver on other CPU
> >> at the same time (let's consider parallel modprobes). The current code has a
> >> flaw with that.
> > Hi,
> >
> > sorry for picking this up again but I stumbled above the same issue
> > within the driver imx/drm driver which is using the component framework.
> > I end up in a infinity boot loop if I enabled the HDMI (which is the
> > DesignWare bridge device) and the LVDS support and the LVDS bind return
> > with EPROBE_DEFER. There are no words within the component framework docs
> > which says that this is forbidden. Of course we can work-around the
> > driver-core framework but IMHO this shouldn't be the way to go. I do not
> > say that we should revert the commit introducing the regression but we
> > should address this not only by extending the docs since the most
> > drm-drivers are using the component framework and can end up in the same
> > situation.
>
> I am not sure why do you think this is similar issue.

Because I see trying to bind the device over and over..

> Please describe the issue in more detail. Which drivers defers probe and
> why, and why do you have infinite loop.

As said I'm currently on the imx-drm driver. The iMX6 devices are
using the synopsis HDMI IP core and so they are using this bridge device
driver (drivers/gpu/drm/bridge/synopsys/). The imx-drm driver can be
build module wise. As example I enabled the LDB and the HDMI support.
The HDMI driver is composed as platform driver with different
(sub-)drivers and devices. Those devices are populated by the HDMI core
driver _probe() function and triggers a driver_deferred_probe_trigger()
after the driver successfully probed. The LDB driver bind() returns
-EPROBE_DEFER because the panel we are looking for depends on a defered
regulator device. Now the defered probe code tries to probe the defered
devices again because the local-trigger count was changed by the HDMI
driver and we are in the never ending loop.

> In general deferring probe from bind is not forbidden, but it should be
> used carefully (as everything in kernel :) ). Fixing deferring probe
> issues in many cases it is a matter of figuring out 'dependency loops'
> and breaking them by splitting device initialization into more than one
> phase.

We are on the way of splitting the imx-drm driver but there are many
other DRM drivers using the component framework. As far as I can see the
sunxi8 driver is component based and uses the same HDMI driver. I'm with
Andy that we should fix that on the common/core place.

Regards,
Marco

> Regards
>
> Andrzej
>
>
> >
> >>> It can be solved by refactoring the driver probe routine. If a resource is
> >>> required to be present, then check that it is available early; before
> >>> registering child devices.
> >> We fix one and leave others.
> > E.g. the imx-drm and the sunxi driver...
> >
> > Regards,
> > Marco
> >
> >>> The proposed solution to modify driver core is fragile and susceptible to
> >>> side effects from other probe paths. I don't think it is the right approach.
> >> Have you tested it on your case? Does it fix the issue?
> >>
>

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

2020-06-09 07:33:45

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Mon, Jun 8, 2020 at 11:45 PM Marco Felsch <[email protected]> wrote:
>
> On 20-06-08 13:11, Andrzej Hajda wrote:
> >
> > On 08.06.2020 11:17, Marco Felsch wrote:
> > > On 20-03-26 18:31, Andy Shevchenko wrote:
> > >> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> > >>> On 25/03/2020 12:51, Andy Shevchenko wrote:
> > >>>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > >>>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> > >>>>>> Consider the following scenario.
> > >>>>>>
> > >>>>>> The main driver of USB OTG controller (dwc3-pci), which has the following
> > >>>>>> functional dependencies on certain platform:
> > >>>>>> - ULPI (tusb1210)
> > >>>>>> - extcon (tested with extcon-intel-mrfld)
> > >>>>>>
> > >>>>>> Note, that first driver, tusb1210, is available at the moment of
> > >>>>>> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> > >>>>>> won't appear till user space does something about it.
> > >>>>>>
> > >>>>>> This is depicted by kernel configuration excerpt:
> > >>>>>>
> > >>>>>> CONFIG_PHY_TUSB1210=y
> > >>>>>> CONFIG_USB_DWC3=y
> > >>>>>> CONFIG_USB_DWC3_ULPI=y
> > >>>>>> CONFIG_USB_DWC3_DUAL_ROLE=y
> > >>>>>> CONFIG_USB_DWC3_PCI=y
> > >>>>>> CONFIG_EXTCON_INTEL_MRFLD=m
> > >>>>>>
> > >>>>>> In the Buildroot environment the modules are probed by alphabetical ordering
> > >>>>>> of their modaliases. The latter comes to the case when USB OTG driver will be
> > >>>>>> probed first followed by extcon one.
> > >>>>>>
> > >>>>>> So, if the platform anticipates extcon device to be appeared, in the above case
> > >>>>>> we will get deferred probe of USB OTG, because of ordering.
> > >>>>>>
> > >>>>>> Since current implementation, done by the commit 58b116bce136 ("drivercore:
> > >>>>>> deferral race condition fix") counts the amount of triggered deferred probe,
> > >>>>>> we never advance the situation -- the change makes it to be an infinite loop.
> > >>>>> Hi Andy,
> > >>>>>
> > >>>>> I'm trying to understand this sequence of steps. Sorry if the questions
> > >>>>> are stupid -- I'm not very familiar with USB/PCI stuff.
> > >>>> Thank you for looking into this. My answer below.
> > >>>>
> > >>>> As a first thing I would like to tell that there is another example of bad
> > >>>> behaviour of deferred probe with no relation to USB. The proposed change also
> > >>>> fixes that one (however, less possible to find in real life).
> > >>>>
> > >>>>>> ---8<---8<---
> > >>>>>>
> > >>>>>> [ 22.187127] driver_deferred_probe_trigger <<< 1
> > >>>>>>
> > >>>>>> ...here is the late initcall triggers deferred probe...
> > >>>>>>
> > >>>>>> [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> > >>>>>>
> > >>>>>> ...dwc3.0.auto is the only device in the deferred list...
> > >>>>> Ok, dwc3.0.auto is the only unprobed device at this point?
> > >>>> Correct.
> > >>>>
> > >>>>>> [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> > >>>>>>
> > >>>>>> ...the counter before mutex is unlocked is kept the same...
> > >>>>>>
> > >>>>>> [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> > >>>>>>
> > >>>>>> ...mutes has been unlocked, we try to re-probe the driver...
> > >>>>>>
> > >>>>>> [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> > >>>>>> [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> > >>>>>> [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> > >>>>>> [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> > >>>>>> [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> > >>>>>> [ 22.263723] driver_deferred_probe_trigger <<< 2
> > >>>>>>
> > >>>>>> ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> > >>>>>>
> > >>>>>> [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> > >>>>> So where did this dwc3.0.auto.ulpi come from?
> > >>>>> Looks like the device is created by dwc3_probe() through this call flow:
> > >>>>> dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> > >>>>> dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> > >>>> Correct.
> > >>>>
> > >>>>>> [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> > >>>>> Can you please point me to which code patch actually caused the probe
> > >>>>> deferral?
> > >>>> Sure, it's in drd.c.
> > >>>>
> > >>>> if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> > >>>> edev = extcon_get_extcon_dev(name);
> > >>>> if (!edev)
> > >>>> return ERR_PTR(-EPROBE_DEFER);
> > >>>> return edev;
> > >>>> }
> > >>>>
> > >>>>>> ...but extcon driver is still missing...
> > >>>>>>
> > >>>>>> [ 22.283174] platform dwc3.0.auto: Added to deferred list
> > >>>>>> [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> > >>>>> I'm not fully aware of all the USB implications, but if extcon is
> > >>>>> needed, why can't that check be done before we add and probe the ulpi
> > >>>>> device? That'll avoid this whole "fake" probing and avoid the counter
> > >>>>> increase. And avoid the need for this patch that's touching the code
> > >>>>> code that's already a bit delicate.
> > >>>>> Also, with my limited experience with all the possible drivers in the
> > >>>>> kernel, it's weird that the ulpi device is added and probed before we
> > >>>>> make sure the parent device (dwc3.0.auto) can actually probe
> > >>>>> successfully.
> > >>>> As I said above the deferred probe trigger has flaw on its own.
> > >>>> Even if we fix for USB case, there is (and probably will be) others.
> > >>> Right here is the driver design bug. A driver's probe() hook should *not*
> > >>> return -EPROBE_DEFER after already creating child devices which may have
> > >>> already been probed.
> > >> Any documentation statement for this requirement?
> > >>
> > >> By the way, I may imagine other mechanisms that probe the driver on other CPU
> > >> at the same time (let's consider parallel modprobes). The current code has a
> > >> flaw with that.
> > > Hi,
> > >
> > > sorry for picking this up again but I stumbled above the same issue
> > > within the driver imx/drm driver which is using the component framework.
> > > I end up in a infinity boot loop if I enabled the HDMI (which is the
> > > DesignWare bridge device) and the LVDS support and the LVDS bind return
> > > with EPROBE_DEFER. There are no words within the component framework docs
> > > which says that this is forbidden. Of course we can work-around the
> > > driver-core framework but IMHO this shouldn't be the way to go. I do not
> > > say that we should revert the commit introducing the regression but we
> > > should address this not only by extending the docs since the most
> > > drm-drivers are using the component framework and can end up in the same
> > > situation.
> >
> > I am not sure why do you think this is similar issue.
>
> Because I see trying to bind the device over and over..
>
> > Please describe the issue in more detail. Which drivers defers probe and
> > why, and why do you have infinite loop.
>
> As said I'm currently on the imx-drm driver. The iMX6 devices are
> using the synopsis HDMI IP core and so they are using this bridge device
> driver (drivers/gpu/drm/bridge/synopsys/). The imx-drm driver can be
> build module wise. As example I enabled the LDB and the HDMI support.
> The HDMI driver is composed as platform driver with different
> (sub-)drivers and devices. Those devices are populated by the HDMI core
> driver _probe() function and triggers a driver_deferred_probe_trigger()
> after the driver successfully probed. The LDB driver bind() returns
> -EPROBE_DEFER because the panel we are looking for depends on a defered
> regulator device. Now the defered probe code tries to probe the defered
> devices again because the local-trigger count was changed by the HDMI
> driver and we are in the never ending loop.
>
> > In general deferring probe from bind is not forbidden, but it should be
> > used carefully (as everything in kernel :) ). Fixing deferring probe
> > issues in many cases it is a matter of figuring out 'dependency loops'
> > and breaking them by splitting device initialization into more than one
> > phase.
>
> We are on the way of splitting the imx-drm driver but there are many
> other DRM drivers using the component framework. As far as I can see the
> sunxi8 driver is component based and uses the same HDMI driver. I'm with
> Andy that we should fix that on the common/core place.

I'm not opposed to fixing this at the common/core level if that's
possible, but Andy's patch still has a bug where it might never probe
a device. I still haven't seen an answer to this.
https://lore.kernel.org/lkml/CAGETcx_3+YH_LmNUCAAk1OaXk6noHEXxcE+ckkoBqKJJhtpDjQ@mail.gmail.com/

-Saravana

2020-06-09 09:29:35

by Andrzej Hajda

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied


On 09.06.2020 08:45, Marco Felsch wrote:
> On 20-06-08 13:11, Andrzej Hajda wrote:
>> On 08.06.2020 11:17, Marco Felsch wrote:
>>> On 20-03-26 18:31, Andy Shevchenko wrote:
>>>> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
>>>>> On 25/03/2020 12:51, Andy Shevchenko wrote:
>>>>>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
>>>>>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
>>>>>>>> Consider the following scenario.
>>>>>>>>
>>>>>>>> The main driver of USB OTG controller (dwc3-pci), which has the following
>>>>>>>> functional dependencies on certain platform:
>>>>>>>> - ULPI (tusb1210)
>>>>>>>> - extcon (tested with extcon-intel-mrfld)
>>>>>>>>
>>>>>>>> Note, that first driver, tusb1210, is available at the moment of
>>>>>>>> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
>>>>>>>> won't appear till user space does something about it.
>>>>>>>>
>>>>>>>> This is depicted by kernel configuration excerpt:
>>>>>>>>
>>>>>>>> CONFIG_PHY_TUSB1210=y
>>>>>>>> CONFIG_USB_DWC3=y
>>>>>>>> CONFIG_USB_DWC3_ULPI=y
>>>>>>>> CONFIG_USB_DWC3_DUAL_ROLE=y
>>>>>>>> CONFIG_USB_DWC3_PCI=y
>>>>>>>> CONFIG_EXTCON_INTEL_MRFLD=m
>>>>>>>>
>>>>>>>> In the Buildroot environment the modules are probed by alphabetical ordering
>>>>>>>> of their modaliases. The latter comes to the case when USB OTG driver will be
>>>>>>>> probed first followed by extcon one.
>>>>>>>>
>>>>>>>> So, if the platform anticipates extcon device to be appeared, in the above case
>>>>>>>> we will get deferred probe of USB OTG, because of ordering.
>>>>>>>>
>>>>>>>> Since current implementation, done by the commit 58b116bce136 ("drivercore:
>>>>>>>> deferral race condition fix") counts the amount of triggered deferred probe,
>>>>>>>> we never advance the situation -- the change makes it to be an infinite loop.
>>>>>>> Hi Andy,
>>>>>>>
>>>>>>> I'm trying to understand this sequence of steps. Sorry if the questions
>>>>>>> are stupid -- I'm not very familiar with USB/PCI stuff.
>>>>>> Thank you for looking into this. My answer below.
>>>>>>
>>>>>> As a first thing I would like to tell that there is another example of bad
>>>>>> behaviour of deferred probe with no relation to USB. The proposed change also
>>>>>> fixes that one (however, less possible to find in real life).
>>>>>>
>>>>>>>> ---8<---8<---
>>>>>>>>
>>>>>>>> [ 22.187127] driver_deferred_probe_trigger <<< 1
>>>>>>>>
>>>>>>>> ...here is the late initcall triggers deferred probe...
>>>>>>>>
>>>>>>>> [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
>>>>>>>>
>>>>>>>> ...dwc3.0.auto is the only device in the deferred list...
>>>>>>> Ok, dwc3.0.auto is the only unprobed device at this point?
>>>>>> Correct.
>>>>>>
>>>>>>>> [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
>>>>>>>>
>>>>>>>> ...the counter before mutex is unlocked is kept the same...
>>>>>>>>
>>>>>>>> [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
>>>>>>>>
>>>>>>>> ...mutes has been unlocked, we try to re-probe the driver...
>>>>>>>>
>>>>>>>> [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
>>>>>>>> [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
>>>>>>>> [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
>>>>>>>> [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
>>>>>>>> [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
>>>>>>>> [ 22.263723] driver_deferred_probe_trigger <<< 2
>>>>>>>>
>>>>>>>> ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
>>>>>>>>
>>>>>>>> [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
>>>>>>> So where did this dwc3.0.auto.ulpi come from?
>>>>>>> Looks like the device is created by dwc3_probe() through this call flow:
>>>>>>> dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
>>>>>>> dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
>>>>>> Correct.
>>>>>>
>>>>>>>> [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
>>>>>>> Can you please point me to which code patch actually caused the probe
>>>>>>> deferral?
>>>>>> Sure, it's in drd.c.
>>>>>>
>>>>>> if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
>>>>>> edev = extcon_get_extcon_dev(name);
>>>>>> if (!edev)
>>>>>> return ERR_PTR(-EPROBE_DEFER);
>>>>>> return edev;
>>>>>> }
>>>>>>
>>>>>>>> ...but extcon driver is still missing...
>>>>>>>>
>>>>>>>> [ 22.283174] platform dwc3.0.auto: Added to deferred list
>>>>>>>> [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
>>>>>>> I'm not fully aware of all the USB implications, but if extcon is
>>>>>>> needed, why can't that check be done before we add and probe the ulpi
>>>>>>> device? That'll avoid this whole "fake" probing and avoid the counter
>>>>>>> increase. And avoid the need for this patch that's touching the code
>>>>>>> code that's already a bit delicate.
>>>>>>> Also, with my limited experience with all the possible drivers in the
>>>>>>> kernel, it's weird that the ulpi device is added and probed before we
>>>>>>> make sure the parent device (dwc3.0.auto) can actually probe
>>>>>>> successfully.
>>>>>> As I said above the deferred probe trigger has flaw on its own.
>>>>>> Even if we fix for USB case, there is (and probably will be) others.
>>>>> Right here is the driver design bug. A driver's probe() hook should *not*
>>>>> return -EPROBE_DEFER after already creating child devices which may have
>>>>> already been probed.
>>>> Any documentation statement for this requirement?
>>>>
>>>> By the way, I may imagine other mechanisms that probe the driver on other CPU
>>>> at the same time (let's consider parallel modprobes). The current code has a
>>>> flaw with that.
>>> Hi,
>>>
>>> sorry for picking this up again but I stumbled above the same issue
>>> within the driver imx/drm driver which is using the component framework.
>>> I end up in a infinity boot loop if I enabled the HDMI (which is the
>>> DesignWare bridge device) and the LVDS support and the LVDS bind return
>>> with EPROBE_DEFER. There are no words within the component framework docs
>>> which says that this is forbidden. Of course we can work-around the
>>> driver-core framework but IMHO this shouldn't be the way to go. I do not
>>> say that we should revert the commit introducing the regression but we
>>> should address this not only by extending the docs since the most
>>> drm-drivers are using the component framework and can end up in the same
>>> situation.
>> I am not sure why do you think this is similar issue.
> Because I see trying to bind the device over and over..
>
>> Please describe the issue in more detail. Which drivers defers probe and
>> why, and why do you have infinite loop.
> As said I'm currently on the imx-drm driver. The iMX6 devices are
> using the synopsis HDMI IP core and so they are using this bridge device
> driver (drivers/gpu/drm/bridge/synopsys/). The imx-drm driver can be
> build module wise. As example I enabled the LDB and the HDMI support.
> The HDMI driver is composed as platform driver with different
> (sub-)drivers and devices. Those devices are populated by the HDMI core
> driver _probe() function and triggers a driver_deferred_probe_trigger()
> after the driver successfully probed. The LDB driver bind() returns
> -EPROBE_DEFER because the panel we are looking for depends on a defered
> regulator device. Now the defered probe code tries to probe the defered
> devices again because the local-trigger count was changed by the HDMI
> driver and we are in the never ending loop.
>
>> In general deferring probe from bind is not forbidden, but it should be
>> used carefully (as everything in kernel :) ). Fixing deferring probe
>> issues in many cases it is a matter of figuring out 'dependency loops'
>> and breaking them by splitting device initialization into more than one
>> phase.
> We are on the way of splitting the imx-drm driver but there are many
> other DRM drivers using the component framework. As far as I can see the
> sunxi8 driver is component based and uses the same HDMI driver. I'm with
> Andy that we should fix that on the common/core place.


I have looked at the drivers and I see the main issue I see is that imx
drivers performs resource acquisition in bind phase. I think rule of
thumb should be "do not expose yourself, until you are ready", which in
this case means "do not call component_add, until resources are
acquired" - ie resource acquisition should be performed in probe. I use
this approach mainly to avoid multiple deferred re-probes, but it should
solve also this issue, so even if there will be solution to "deferred
probe issues" in core it would be good to fix imx drivers.


Regards

Andrzej


>
> Regards,
> Marco
>
>> Regards
>>
>> Andrzej
>>
>>
>>>>> It can be solved by refactoring the driver probe routine. If a resource is
>>>>> required to be present, then check that it is available early; before
>>>>> registering child devices.
>>>> We fix one and leave others.
>>> E.g. the imx-drm and the sunxi driver...
>>>
>>> Regards,
>>> Marco
>>>
>>>>> The proposed solution to modify driver core is fragile and susceptible to
>>>>> side effects from other probe paths. I don't think it is the right approach.
>>>> Have you tested it on your case? Does it fix the issue?
>>>>

2020-06-09 12:52:00

by Marco Felsch

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On 20-06-09 11:27, Andrzej Hajda wrote:
>
> On 09.06.2020 08:45, Marco Felsch wrote:
> > On 20-06-08 13:11, Andrzej Hajda wrote:
> >> On 08.06.2020 11:17, Marco Felsch wrote:
> >>> On 20-03-26 18:31, Andy Shevchenko wrote:
> >>>> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> >>>>> On 25/03/2020 12:51, Andy Shevchenko wrote:
> >>>>>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> >>>>>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
> >>>>>>>> Consider the following scenario.
> >>>>>>>>
> >>>>>>>> The main driver of USB OTG controller (dwc3-pci), which has the following
> >>>>>>>> functional dependencies on certain platform:
> >>>>>>>> - ULPI (tusb1210)
> >>>>>>>> - extcon (tested with extcon-intel-mrfld)
> >>>>>>>>
> >>>>>>>> Note, that first driver, tusb1210, is available at the moment of
> >>>>>>>> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
> >>>>>>>> won't appear till user space does something about it.
> >>>>>>>>
> >>>>>>>> This is depicted by kernel configuration excerpt:
> >>>>>>>>
> >>>>>>>> CONFIG_PHY_TUSB1210=y
> >>>>>>>> CONFIG_USB_DWC3=y
> >>>>>>>> CONFIG_USB_DWC3_ULPI=y
> >>>>>>>> CONFIG_USB_DWC3_DUAL_ROLE=y
> >>>>>>>> CONFIG_USB_DWC3_PCI=y
> >>>>>>>> CONFIG_EXTCON_INTEL_MRFLD=m
> >>>>>>>>
> >>>>>>>> In the Buildroot environment the modules are probed by alphabetical ordering
> >>>>>>>> of their modaliases. The latter comes to the case when USB OTG driver will be
> >>>>>>>> probed first followed by extcon one.
> >>>>>>>>
> >>>>>>>> So, if the platform anticipates extcon device to be appeared, in the above case
> >>>>>>>> we will get deferred probe of USB OTG, because of ordering.
> >>>>>>>>
> >>>>>>>> Since current implementation, done by the commit 58b116bce136 ("drivercore:
> >>>>>>>> deferral race condition fix") counts the amount of triggered deferred probe,
> >>>>>>>> we never advance the situation -- the change makes it to be an infinite loop.
> >>>>>>> Hi Andy,
> >>>>>>>
> >>>>>>> I'm trying to understand this sequence of steps. Sorry if the questions
> >>>>>>> are stupid -- I'm not very familiar with USB/PCI stuff.
> >>>>>> Thank you for looking into this. My answer below.
> >>>>>>
> >>>>>> As a first thing I would like to tell that there is another example of bad
> >>>>>> behaviour of deferred probe with no relation to USB. The proposed change also
> >>>>>> fixes that one (however, less possible to find in real life).
> >>>>>>
> >>>>>>>> ---8<---8<---
> >>>>>>>>
> >>>>>>>> [ 22.187127] driver_deferred_probe_trigger <<< 1
> >>>>>>>>
> >>>>>>>> ...here is the late initcall triggers deferred probe...
> >>>>>>>>
> >>>>>>>> [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
> >>>>>>>>
> >>>>>>>> ...dwc3.0.auto is the only device in the deferred list...
> >>>>>>> Ok, dwc3.0.auto is the only unprobed device at this point?
> >>>>>> Correct.
> >>>>>>
> >>>>>>>> [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
> >>>>>>>>
> >>>>>>>> ...the counter before mutex is unlocked is kept the same...
> >>>>>>>>
> >>>>>>>> [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
> >>>>>>>>
> >>>>>>>> ...mutes has been unlocked, we try to re-probe the driver...
> >>>>>>>>
> >>>>>>>> [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
> >>>>>>>> [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
> >>>>>>>> [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
> >>>>>>>> [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
> >>>>>>>> [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
> >>>>>>>> [ 22.263723] driver_deferred_probe_trigger <<< 2
> >>>>>>>>
> >>>>>>>> ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
> >>>>>>>>
> >>>>>>>> [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
> >>>>>>> So where did this dwc3.0.auto.ulpi come from?
> >>>>>>> Looks like the device is created by dwc3_probe() through this call flow:
> >>>>>>> dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
> >>>>>>> dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
> >>>>>> Correct.
> >>>>>>
> >>>>>>>> [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
> >>>>>>> Can you please point me to which code patch actually caused the probe
> >>>>>>> deferral?
> >>>>>> Sure, it's in drd.c.
> >>>>>>
> >>>>>> if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
> >>>>>> edev = extcon_get_extcon_dev(name);
> >>>>>> if (!edev)
> >>>>>> return ERR_PTR(-EPROBE_DEFER);
> >>>>>> return edev;
> >>>>>> }
> >>>>>>
> >>>>>>>> ...but extcon driver is still missing...
> >>>>>>>>
> >>>>>>>> [ 22.283174] platform dwc3.0.auto: Added to deferred list
> >>>>>>>> [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
> >>>>>>> I'm not fully aware of all the USB implications, but if extcon is
> >>>>>>> needed, why can't that check be done before we add and probe the ulpi
> >>>>>>> device? That'll avoid this whole "fake" probing and avoid the counter
> >>>>>>> increase. And avoid the need for this patch that's touching the code
> >>>>>>> code that's already a bit delicate.
> >>>>>>> Also, with my limited experience with all the possible drivers in the
> >>>>>>> kernel, it's weird that the ulpi device is added and probed before we
> >>>>>>> make sure the parent device (dwc3.0.auto) can actually probe
> >>>>>>> successfully.
> >>>>>> As I said above the deferred probe trigger has flaw on its own.
> >>>>>> Even if we fix for USB case, there is (and probably will be) others.
> >>>>> Right here is the driver design bug. A driver's probe() hook should *not*
> >>>>> return -EPROBE_DEFER after already creating child devices which may have
> >>>>> already been probed.
> >>>> Any documentation statement for this requirement?
> >>>>
> >>>> By the way, I may imagine other mechanisms that probe the driver on other CPU
> >>>> at the same time (let's consider parallel modprobes). The current code has a
> >>>> flaw with that.
> >>> Hi,
> >>>
> >>> sorry for picking this up again but I stumbled above the same issue
> >>> within the driver imx/drm driver which is using the component framework.
> >>> I end up in a infinity boot loop if I enabled the HDMI (which is the
> >>> DesignWare bridge device) and the LVDS support and the LVDS bind return
> >>> with EPROBE_DEFER. There are no words within the component framework docs
> >>> which says that this is forbidden. Of course we can work-around the
> >>> driver-core framework but IMHO this shouldn't be the way to go. I do not
> >>> say that we should revert the commit introducing the regression but we
> >>> should address this not only by extending the docs since the most
> >>> drm-drivers are using the component framework and can end up in the same
> >>> situation.
> >> I am not sure why do you think this is similar issue.
> > Because I see trying to bind the device over and over..
> >
> >> Please describe the issue in more detail. Which drivers defers probe and
> >> why, and why do you have infinite loop.
> > As said I'm currently on the imx-drm driver. The iMX6 devices are
> > using the synopsis HDMI IP core and so they are using this bridge device
> > driver (drivers/gpu/drm/bridge/synopsys/). The imx-drm driver can be
> > build module wise. As example I enabled the LDB and the HDMI support.
> > The HDMI driver is composed as platform driver with different
> > (sub-)drivers and devices. Those devices are populated by the HDMI core
> > driver _probe() function and triggers a driver_deferred_probe_trigger()
> > after the driver successfully probed. The LDB driver bind() returns
> > -EPROBE_DEFER because the panel we are looking for depends on a defered
> > regulator device. Now the defered probe code tries to probe the defered
> > devices again because the local-trigger count was changed by the HDMI
> > driver and we are in the never ending loop.
> >
> >> In general deferring probe from bind is not forbidden, but it should be
> >> used carefully (as everything in kernel :) ). Fixing deferring probe
> >> issues in many cases it is a matter of figuring out 'dependency loops'
> >> and breaking them by splitting device initialization into more than one
> >> phase.
> > We are on the way of splitting the imx-drm driver but there are many
> > other DRM drivers using the component framework. As far as I can see the
> > sunxi8 driver is component based and uses the same HDMI driver. I'm with
> > Andy that we should fix that on the common/core place.
>
>
> I have looked at the drivers and I see the main issue I see is that imx
> drivers performs resource acquisition in bind phase.

As I said we are working on this.

> I think rule of
> thumb should be "do not expose yourself, until you are ready", which in
> this case means "do not call component_add, until resources are
> acquired" - ie resource acquisition should be performed in probe.

Hm.. there are is no documentation which forbid this use-case. I thought
that the component framework bind() equals the driver probe() function..

> I use
> this approach mainly to avoid multiple deferred re-probes, but it should
> solve also this issue, so even if there will be solution to "deferred
> probe issues" in core it would be good to fix imx drivers.

Pls, see my above comments. It is not only the imx driver. Also we
shouldn't expect that driver-developers will follow a rule which is
not written somewhere.

Regards,
Marco

> Regards
>
> Andrzej
>
>
> >
> > Regards,
> > Marco
> >
> >> Regards
> >>
> >> Andrzej
> >>
> >>
> >>>>> It can be solved by refactoring the driver probe routine. If a resource is
> >>>>> required to be present, then check that it is available early; before
> >>>>> registering child devices.
> >>>> We fix one and leave others.
> >>> E.g. the imx-drm and the sunxi driver...
> >>>
> >>> Regards,
> >>> Marco
> >>>
> >>>>> The proposed solution to modify driver core is fragile and susceptible to
> >>>>> side effects from other probe paths. I don't think it is the right approach.
> >>>> Have you tested it on your case? Does it fix the issue?
> >>>>

2020-06-09 13:18:59

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied

On Tue, Jun 09, 2020 at 02:10:29PM +0200, Marco Felsch wrote:
> On 20-06-09 11:27, Andrzej Hajda wrote:
> > On 09.06.2020 08:45, Marco Felsch wrote:
> > > On 20-06-08 13:11, Andrzej Hajda wrote:
> > >> On 08.06.2020 11:17, Marco Felsch wrote:
> > >>> On 20-03-26 18:31, Andy Shevchenko wrote:
> > >>>> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
> > >>>>> On 25/03/2020 12:51, Andy Shevchenko wrote:
> > >>>>>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
> > >>>>>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:

Please delete unneeded context from mails when replying. Doing this
makes it much easier to find your reply in the message, helping ensure
it won't be missed by people scrolling through the irrelevant quoted
material.

> > I think rule of
> > thumb should be "do not expose yourself, until you are ready", which in
> > this case means "do not call component_add, until resources are
> > acquired" - ie resource acquisition should be performed in probe.

> Hm.. there are is no documentation which forbid this use-case. I thought
> that the component framework bind() equals the driver probe() function..

It does, the issue is perhaps more clearly expressed as saying that a
driver should acquire whatever resources it needs before starting to
make resources available to others, this includes but isn't limited to
registering new device nodes. This ensures that the users don't then
start trying to use resources and have them torn down underneath them.

> > I use
> > this approach mainly to avoid multiple deferred re-probes, but it should
> > solve also this issue, so even if there will be solution to "deferred
> > probe issues" in core it would be good to fix imx drivers.

> Pls, see my above comments. It is not only the imx driver. Also we
> shouldn't expect that driver-developers will follow a rule which is
> not written somewhere.

If you've got an idea where this should be documented patches welcome!
I can't think of anywhere sensibly discoverable to put something off the
top of my head.


Attachments:
(No filename) (2.10 kB)
signature.asc (499.00 B)
Download all attachments

2020-06-09 17:16:38

by Andrzej Hajda

[permalink] [raw]
Subject: Re: [PATCH v3] driver core: Break infinite loop when deferred probe can't be satisfied


On 09.06.2020 14:10, Marco Felsch wrote:
> On 20-06-09 11:27, Andrzej Hajda wrote:
>> On 09.06.2020 08:45, Marco Felsch wrote:
>>> On 20-06-08 13:11, Andrzej Hajda wrote:
>>>> On 08.06.2020 11:17, Marco Felsch wrote:
>>>>> On 20-03-26 18:31, Andy Shevchenko wrote:
>>>>>> On Thu, Mar 26, 2020 at 03:01:22PM +0000, Grant Likely wrote:
>>>>>>> On 25/03/2020 12:51, Andy Shevchenko wrote:
>>>>>>>> On Tue, Mar 24, 2020 at 08:29:01PM -0700, Saravana Kannan wrote:
>>>>>>>>> On Tue, Mar 24, 2020 at 5:38 AM Andy Shevchenko <[email protected]> wrote:
>>>>>>>>>> Consider the following scenario.
>>>>>>>>>>
>>>>>>>>>> The main driver of USB OTG controller (dwc3-pci), which has the following
>>>>>>>>>> functional dependencies on certain platform:
>>>>>>>>>> - ULPI (tusb1210)
>>>>>>>>>> - extcon (tested with extcon-intel-mrfld)
>>>>>>>>>>
>>>>>>>>>> Note, that first driver, tusb1210, is available at the moment of
>>>>>>>>>> dwc3-pci probing, while extcon-intel-mrfld is built as a module and
>>>>>>>>>> won't appear till user space does something about it.
>>>>>>>>>>
>>>>>>>>>> This is depicted by kernel configuration excerpt:
>>>>>>>>>>
>>>>>>>>>> CONFIG_PHY_TUSB1210=y
>>>>>>>>>> CONFIG_USB_DWC3=y
>>>>>>>>>> CONFIG_USB_DWC3_ULPI=y
>>>>>>>>>> CONFIG_USB_DWC3_DUAL_ROLE=y
>>>>>>>>>> CONFIG_USB_DWC3_PCI=y
>>>>>>>>>> CONFIG_EXTCON_INTEL_MRFLD=m
>>>>>>>>>>
>>>>>>>>>> In the Buildroot environment the modules are probed by alphabetical ordering
>>>>>>>>>> of their modaliases. The latter comes to the case when USB OTG driver will be
>>>>>>>>>> probed first followed by extcon one.
>>>>>>>>>>
>>>>>>>>>> So, if the platform anticipates extcon device to be appeared, in the above case
>>>>>>>>>> we will get deferred probe of USB OTG, because of ordering.
>>>>>>>>>>
>>>>>>>>>> Since current implementation, done by the commit 58b116bce136 ("drivercore:
>>>>>>>>>> deferral race condition fix") counts the amount of triggered deferred probe,
>>>>>>>>>> we never advance the situation -- the change makes it to be an infinite loop.
>>>>>>>>> Hi Andy,
>>>>>>>>>
>>>>>>>>> I'm trying to understand this sequence of steps. Sorry if the questions
>>>>>>>>> are stupid -- I'm not very familiar with USB/PCI stuff.
>>>>>>>> Thank you for looking into this. My answer below.
>>>>>>>>
>>>>>>>> As a first thing I would like to tell that there is another example of bad
>>>>>>>> behaviour of deferred probe with no relation to USB. The proposed change also
>>>>>>>> fixes that one (however, less possible to find in real life).
>>>>>>>>
>>>>>>>>>> ---8<---8<---
>>>>>>>>>>
>>>>>>>>>> [ 22.187127] driver_deferred_probe_trigger <<< 1
>>>>>>>>>>
>>>>>>>>>> ...here is the late initcall triggers deferred probe...
>>>>>>>>>>
>>>>>>>>>> [ 22.191725] platform dwc3.0.auto: deferred_probe_work_func in deferred list
>>>>>>>>>>
>>>>>>>>>> ...dwc3.0.auto is the only device in the deferred list...
>>>>>>>>> Ok, dwc3.0.auto is the only unprobed device at this point?
>>>>>>>> Correct.
>>>>>>>>
>>>>>>>>>> [ 22.198727] platform dwc3.0.auto: deferred_probe_work_func 1 <<< counter 1
>>>>>>>>>>
>>>>>>>>>> ...the counter before mutex is unlocked is kept the same...
>>>>>>>>>>
>>>>>>>>>> [ 22.205663] platform dwc3.0.auto: Retrying from deferred list
>>>>>>>>>>
>>>>>>>>>> ...mutes has been unlocked, we try to re-probe the driver...
>>>>>>>>>>
>>>>>>>>>> [ 22.211487] bus: 'platform': driver_probe_device: matched device dwc3.0.auto with driver dwc3
>>>>>>>>>> [ 22.220060] bus: 'platform': really_probe: probing driver dwc3 with device dwc3.0.auto
>>>>>>>>>> [ 22.238735] bus: 'ulpi': driver_probe_device: matched device dwc3.0.auto.ulpi with driver tusb1210
>>>>>>>>>> [ 22.247743] bus: 'ulpi': really_probe: probing driver tusb1210 with device dwc3.0.auto.ulpi
>>>>>>>>>> [ 22.256292] driver: 'tusb1210': driver_bound: bound to device 'dwc3.0.auto.ulpi'
>>>>>>>>>> [ 22.263723] driver_deferred_probe_trigger <<< 2
>>>>>>>>>>
>>>>>>>>>> ...the dwc3.0.auto probes ULPI, we got successful bound and bumped counter...
>>>>>>>>>>
>>>>>>>>>> [ 22.268304] bus: 'ulpi': really_probe: bound device dwc3.0.auto.ulpi to driver tusb1210
>>>>>>>>> So where did this dwc3.0.auto.ulpi come from?
>>>>>>>>> Looks like the device is created by dwc3_probe() through this call flow:
>>>>>>>>> dwc3_probe() -> dwc3_core_init() -> dwc3_core_ulpi_init() ->
>>>>>>>>> dwc3_ulpi_init() -> ulpi_register_interface() -> ulpi_register()
>>>>>>>> Correct.
>>>>>>>>
>>>>>>>>>> [ 22.276697] platform dwc3.0.auto: Driver dwc3 requests probe deferral
>>>>>>>>> Can you please point me to which code patch actually caused the probe
>>>>>>>>> deferral?
>>>>>>>> Sure, it's in drd.c.
>>>>>>>>
>>>>>>>> if (device_property_read_string(dev, "linux,extcon-name", &name) == 0) {
>>>>>>>> edev = extcon_get_extcon_dev(name);
>>>>>>>> if (!edev)
>>>>>>>> return ERR_PTR(-EPROBE_DEFER);
>>>>>>>> return edev;
>>>>>>>> }
>>>>>>>>
>>>>>>>>>> ...but extcon driver is still missing...
>>>>>>>>>>
>>>>>>>>>> [ 22.283174] platform dwc3.0.auto: Added to deferred list
>>>>>>>>>> [ 22.288513] platform dwc3.0.auto: driver_deferred_probe_add_trigger local counter: 1 new counter 2
>>>>>>>>> I'm not fully aware of all the USB implications, but if extcon is
>>>>>>>>> needed, why can't that check be done before we add and probe the ulpi
>>>>>>>>> device? That'll avoid this whole "fake" probing and avoid the counter
>>>>>>>>> increase. And avoid the need for this patch that's touching the code
>>>>>>>>> code that's already a bit delicate.
>>>>>>>>> Also, with my limited experience with all the possible drivers in the
>>>>>>>>> kernel, it's weird that the ulpi device is added and probed before we
>>>>>>>>> make sure the parent device (dwc3.0.auto) can actually probe
>>>>>>>>> successfully.
>>>>>>>> As I said above the deferred probe trigger has flaw on its own.
>>>>>>>> Even if we fix for USB case, there is (and probably will be) others.
>>>>>>> Right here is the driver design bug. A driver's probe() hook should *not*
>>>>>>> return -EPROBE_DEFER after already creating child devices which may have
>>>>>>> already been probed.
>>>>>> Any documentation statement for this requirement?
>>>>>>
>>>>>> By the way, I may imagine other mechanisms that probe the driver on other CPU
>>>>>> at the same time (let's consider parallel modprobes). The current code has a
>>>>>> flaw with that.
>>>>> Hi,
>>>>>
>>>>> sorry for picking this up again but I stumbled above the same issue
>>>>> within the driver imx/drm driver which is using the component framework.
>>>>> I end up in a infinity boot loop if I enabled the HDMI (which is the
>>>>> DesignWare bridge device) and the LVDS support and the LVDS bind return
>>>>> with EPROBE_DEFER. There are no words within the component framework docs
>>>>> which says that this is forbidden. Of course we can work-around the
>>>>> driver-core framework but IMHO this shouldn't be the way to go. I do not
>>>>> say that we should revert the commit introducing the regression but we
>>>>> should address this not only by extending the docs since the most
>>>>> drm-drivers are using the component framework and can end up in the same
>>>>> situation.
>>>> I am not sure why do you think this is similar issue.
>>> Because I see trying to bind the device over and over..
>>>
>>>> Please describe the issue in more detail. Which drivers defers probe and
>>>> why, and why do you have infinite loop.
>>> As said I'm currently on the imx-drm driver. The iMX6 devices are
>>> using the synopsis HDMI IP core and so they are using this bridge device
>>> driver (drivers/gpu/drm/bridge/synopsys/). The imx-drm driver can be
>>> build module wise. As example I enabled the LDB and the HDMI support.
>>> The HDMI driver is composed as platform driver with different
>>> (sub-)drivers and devices. Those devices are populated by the HDMI core
>>> driver _probe() function and triggers a driver_deferred_probe_trigger()
>>> after the driver successfully probed. The LDB driver bind() returns
>>> -EPROBE_DEFER because the panel we are looking for depends on a defered
>>> regulator device. Now the defered probe code tries to probe the defered
>>> devices again because the local-trigger count was changed by the HDMI
>>> driver and we are in the never ending loop.
>>>
>>>> In general deferring probe from bind is not forbidden, but it should be
>>>> used carefully (as everything in kernel :) ). Fixing deferring probe
>>>> issues in many cases it is a matter of figuring out 'dependency loops'
>>>> and breaking them by splitting device initialization into more than one
>>>> phase.
>>> We are on the way of splitting the imx-drm driver but there are many
>>> other DRM drivers using the component framework. As far as I can see the
>>> sunxi8 driver is component based and uses the same HDMI driver. I'm with
>>> Andy that we should fix that on the common/core place.
>>
>> I have looked at the drivers and I see the main issue I see is that imx
>> drivers performs resource acquisition in bind phase.
> As I said we are working on this.
>
>> I think rule of
>> thumb should be "do not expose yourself, until you are ready", which in
>> this case means "do not call component_add, until resources are
>> acquired" - ie resource acquisition should be performed in probe.
> Hm.. there are is no documentation which forbid this use-case. I thought
> that the component framework bind() equals the driver probe() function..


In this particular case (components vs deferred probe interaction) I
guess the source code is the only documentation.


>
>> I use
>> this approach mainly to avoid multiple deferred re-probes, but it should
>> solve also this issue, so even if there will be solution to "deferred
>> probe issues" in core it would be good to fix imx drivers.
> Pls, see my above comments. It is not only the imx driver. Also we
> shouldn't expect that driver-developers will follow a rule which is
> not written somewhere.


As I wrote above this is only my advice and my experience, if you have
better idea regarding drivers/documentation/core please post appropriate
patches.


Regards

Andrzej


>
> Regards,
> Marco
>
>> Regards
>>
>> Andrzej
>>
>>
>>> Regards,
>>> Marco
>>>
>>>> Regards
>>>>
>>>> Andrzej
>>>>
>>>>
>>>>>>> It can be solved by refactoring the driver probe routine. If a resource is
>>>>>>> required to be present, then check that it is available early; before
>>>>>>> registering child devices.
>>>>>> We fix one and leave others.
>>>>> E.g. the imx-drm and the sunxi driver...
>>>>>
>>>>> Regards,
>>>>> Marco
>>>>>
>>>>>>> The proposed solution to modify driver core is fragile and susceptible to
>>>>>>> side effects from other probe paths. I don't think it is the right approach.
>>>>>> Have you tested it on your case? Does it fix the issue?
>>>>>>