2021-02-15 11:18:22

by Geert Uytterhoeven

Subject: [PATCH] driver core: Fix double failed probing with fw_devlink=on

With fw_devlink=permissive, devices are added to the deferred probe
pending list if their driver's .probe() method returns -EPROBE_DEFER.

With fw_devlink=on, devices are added to the deferred probe pending list
if they are determined to be a consumer, which happens before their
driver's .probe() method is called. If the actual probe fails later
(real failure, not -EPROBE_DEFER), the device will still be on the
deferred probe pending list, and it will be probed again when deferred
probing kicks in, which is futile.

Fix this by explicitly removing the device from the deferred probe
pending list in case of probe failures.

Fixes: e590474768f1cc04 ("driver core: Set fw_devlink=on by default")
Signed-off-by: Geert Uytterhoeven <[email protected]>
---
Seen on various Renesas R-Car platforms, cf.
https://lore.kernel.org/linux-acpi/CAMuHMdVL-1RKJ5u-HDVA4F4w_+8yGvQQuJQBcZMsdV4yXzzfcw@mail.gmail.com
---
drivers/base/dd.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 9179825ff646f4e3..91c4181093c43709 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -639,11 +639,13 @@ static int really_probe(struct device *dev, struct device_driver *drv)
 	case -ENXIO:
 		pr_debug("%s: probe of %s rejects match %d\n",
 			 drv->name, dev_name(dev), ret);
+		driver_deferred_probe_del(dev);
 		break;
 	default:
 		/* driver matched but the probe failed */
 		pr_warn("%s: probe of %s failed with error %d\n",
 			drv->name, dev_name(dev), ret);
+		driver_deferred_probe_del(dev);
 	}
 	/*
 	 * Ignore errors returned by ->probe so that the next driver can try
--
2.25.1


2021-02-15 15:13:19

by Rafael J. Wysocki

Subject: Re: [PATCH] driver core: Fix double failed probing with fw_devlink=on

On Mon, Feb 15, 2021 at 12:16 PM Geert Uytterhoeven
<[email protected]> wrote:
>
> With fw_devlink=permissive, devices are added to the deferred probe
> pending list if their driver's .probe() method returns -EPROBE_DEFER.
>
> With fw_devlink=on, devices are added to the deferred probe pending list
> if they are determined to be a consumer, which happens before their
> driver's .probe() method is called. If the actual probe fails later
> (real failure, not -EPROBE_DEFER), the device will still be on the
> deferred probe pending list, and it will be probed again when deferred
> probing kicks in, which is futile.
>
> Fix this by explicitly removing the device from the deferred probe
> pending list in case of probe failures.
>
> Fixes: e590474768f1cc04 ("driver core: Set fw_devlink=on by default")
> Signed-off-by: Geert Uytterhoeven <[email protected]>

Good catch:

Reviewed-by: Rafael J. Wysocki <[email protected]>

> ---
> Seen on various Renesas R-Car platforms, cfr.
> https://lore.kernel.org/linux-acpi/CAMuHMdVL-1RKJ5u-HDVA4F4w_+8yGvQQuJQBcZMsdV4yXzzfcw@mail.gmail.com
> ---
> drivers/base/dd.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> index 9179825ff646f4e3..91c4181093c43709 100644
> --- a/drivers/base/dd.c
> +++ b/drivers/base/dd.c
> @@ -639,11 +639,13 @@ static int really_probe(struct device *dev, struct device_driver *drv)
> case -ENXIO:
> pr_debug("%s: probe of %s rejects match %d\n",
> drv->name, dev_name(dev), ret);
> + driver_deferred_probe_del(dev);
> break;
> default:
> /* driver matched but the probe failed */
> pr_warn("%s: probe of %s failed with error %d\n",
> drv->name, dev_name(dev), ret);
> + driver_deferred_probe_del(dev);
> }
> /*
> * Ignore errors returned by ->probe so that the next driver can try
> --
> 2.25.1
>

2021-02-15 18:32:00

by Saravana Kannan

Subject: Re: [PATCH] driver core: Fix double failed probing with fw_devlink=on

On Mon, Feb 15, 2021 at 6:59 AM Rafael J. Wysocki <[email protected]> wrote:
>
> On Mon, Feb 15, 2021 at 12:16 PM Geert Uytterhoeven
> <[email protected]> wrote:
> >
> > With fw_devlink=permissive, devices are added to the deferred probe
> > pending list if their driver's .probe() method returns -EPROBE_DEFER.
> >
> > With fw_devlink=on, devices are added to the deferred probe pending list
> > if they are determined to be a consumer,

If they are determined to be a consumer or if they are determined to
have a supplier that hasn't probed yet?

> > which happens before their
> > driver's .probe() method is called. If the actual probe fails later
> > (real failure, not -EPROBE_DEFER), the device will still be on the
> > deferred probe pending list, and it will be probed again when deferred
> > probing kicks in, which is futile.
> >
> > Fix this by explicitly removing the device from the deferred probe
> > pending list in case of probe failures.
> >
> > Fixes: e590474768f1cc04 ("driver core: Set fw_devlink=on by default")
> > Signed-off-by: Geert Uytterhoeven <[email protected]>
>
> Good catch:
>
> Reviewed-by: Rafael J. Wysocki <[email protected]>

Geert,

The issue is real and needs to be fixed. But I'm confused how this can
happen. We won't even enter really_probe() if the driver isn't ready.
We also won't get to run the driver's .probe() if the suppliers aren't
ready. So how does the device get added to the deferred probe list
before the driver is ready? Is this due to device_links_driver_bound()
on the supplier?

Can you give a more detailed step-by-step description of the case you are hitting?

Greg/Rafael,

Let's hold off picking this patch till I get to take a closer look
(within a day or two) please.

-Saravana

>
> > ---
> > Seen on various Renesas R-Car platforms, cfr.
> > https://lore.kernel.org/linux-acpi/CAMuHMdVL-1RKJ5u-HDVA4F4w_+8yGvQQuJQBcZMsdV4yXzzfcw@mail.gmail.com
> > ---
> > drivers/base/dd.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > index 9179825ff646f4e3..91c4181093c43709 100644
> > --- a/drivers/base/dd.c
> > +++ b/drivers/base/dd.c
> > @@ -639,11 +639,13 @@ static int really_probe(struct device *dev, struct device_driver *drv)
> > case -ENXIO:
> > pr_debug("%s: probe of %s rejects match %d\n",
> > drv->name, dev_name(dev), ret);
> > + driver_deferred_probe_del(dev);
> > break;
> > default:
> > /* driver matched but the probe failed */
> > pr_warn("%s: probe of %s failed with error %d\n",
> > drv->name, dev_name(dev), ret);
> > + driver_deferred_probe_del(dev);
> > }
> > /*
> > * Ignore errors returned by ->probe so that the next driver can try
> > --
> > 2.25.1
> >

2021-02-15 19:12:47

by Geert Uytterhoeven

Subject: Re: [PATCH] driver core: Fix double failed probing with fw_devlink=on

Hi Saravana,

On Mon, Feb 15, 2021 at 7:27 PM Saravana Kannan <[email protected]> wrote:
> On Mon, Feb 15, 2021 at 6:59 AM Rafael J. Wysocki <[email protected]> wrote:
> > On Mon, Feb 15, 2021 at 12:16 PM Geert Uytterhoeven
> > <[email protected]> wrote:
> > > With fw_devlink=permissive, devices are added to the deferred probe
> > > pending list if their driver's .probe() method returns -EPROBE_DEFER.
> > >
> > > With fw_devlink=on, devices are added to the deferred probe pending list
> > > if they are determined to be a consumer,
>
> If they are determined to be a consumer or if they are determined to
> have a supplier that hasn't probed yet?

When the supplier has probed:

    bus: 'platform': driver_probe_device: matched device e6150000.clock-controller with driver renesas-cpg-mssr
    bus: 'platform': really_probe: probing driver renesas-cpg-mssr with device e6150000.clock-controller
    PM: Added domain provider from /soc/clock-controller@e6150000
    driver: 'renesas-cpg-mssr': driver_bound: bound to device 'e6150000.clock-controller'
    platform e6055800.gpio: Added to deferred list
    [...]
    platform e6020000.watchdog: Added to deferred list
    [...]
    platform fe000000.pcie: Added to deferred list

> > > which happens before their
> > > driver's .probe() method is called. If the actual probe fails later
> > > (real failure, not -EPROBE_DEFER), the device will still be on the
> > > deferred probe pending list, and it will be probed again when deferred
> > > probing kicks in, which is futile.
> > >
> > > Fix this by explicitly removing the device from the deferred probe
> > > pending list in case of probe failures.
> > >
> > > Fixes: e590474768f1cc04 ("driver core: Set fw_devlink=on by default")
> > > Signed-off-by: Geert Uytterhoeven <[email protected]>
> >
> > Good catch:
> >
> > Reviewed-by: Rafael J. Wysocki <[email protected]>
>
> The issue is real and needs to be fixed. But I'm confused how this can
> happen. We won't even enter really_probe() if the driver isn't ready.
> We also won't get to run the driver's .probe() if the suppliers aren't
> ready. So how does the device get added to the deferred probe list
> before the driver is ready? Is this due to device_links_driver_bound()
> on the supplier?
>
> Can you give a more detailed step by step on the case you are hitting?

The device is added to the list due to device_links_driver_bound()
calling driver_deferred_probe_add() on all consumer devices.
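
As a rough illustration of the sequence described above, here is a small
userspace model of a consumer that is queued when its supplier binds and
then fails its own probe with a real error. Every name in it (toy_dev,
toy_supplier_bound, toy_probe, toy_boot, the TOY_* constants) is
hypothetical and is not a kernel API. The del_on_failure flag merely stands
in for the driver_deferred_probe_del() calls added by the patch. It is a
simplified sketch under those assumptions, not the driver core's real logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy error codes; the values are only illustrative. */
#define TOY_EPROBE_DEFER 517
#define TOY_EINVAL 22

/* Userspace stand-in for a device; not a kernel structure. */
struct toy_dev {
	const char *name;
	int probe_result;      /* what the toy driver's probe returns */
	bool on_deferred_list; /* models pending-list membership */
	int probe_attempts;    /* how often the toy probe ran */
};

/* Models a supplier binding: every consumer is queued for a later retry. */
static void toy_supplier_bound(struct toy_dev *consumers, size_t n)
{
	for (size_t i = 0; i < n; i++)
		consumers[i].on_deferred_list = true;
}

/* Models one probe attempt and its effect on list membership. */
static int toy_probe(struct toy_dev *dev, bool del_on_failure)
{
	int ret = dev->probe_result;

	dev->probe_attempts++;
	if (ret == 0)
		dev->on_deferred_list = false; /* bound: nothing left to retry */
	else if (ret == -TOY_EPROBE_DEFER)
		dev->on_deferred_list = true;  /* retry later */
	else if (del_on_failure)
		dev->on_deferred_list = false; /* the behaviour the patch proposes */
	return ret;
}

/* Models a regular probe followed by one deferred-probe pass. */
static int toy_boot(struct toy_dev *dev, bool del_on_failure)
{
	toy_supplier_bound(dev, 1);     /* queued before its own probe runs */
	toy_probe(dev, del_on_failure); /* regular probe */
	if (dev->on_deferred_list) {
		dev->on_deferred_list = false; /* dequeued by the deferred pass */
		toy_probe(dev, del_on_failure);
	}
	return dev->probe_attempts;
}
```

In this model, a device whose probe_result is a real error is probed twice
when del_on_failure is false and once when it is true.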

> > > +++ b/drivers/base/dd.c
> > > @@ -639,11 +639,13 @@ static int really_probe(struct device *dev, struct device_driver *drv)
> > > case -ENXIO:
> > > pr_debug("%s: probe of %s rejects match %d\n",
> > > drv->name, dev_name(dev), ret);
> > > + driver_deferred_probe_del(dev);
> > > break;
> > > default:
> > > /* driver matched but the probe failed */
> > > pr_warn("%s: probe of %s failed with error %d\n",
> > > drv->name, dev_name(dev), ret);
> > > + driver_deferred_probe_del(dev);
> > > }
> > > /*
> > > * Ignore errors returned by ->probe so that the next driver can try

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2021-02-15 21:01:54

by Saravana Kannan

Subject: Re: [PATCH] driver core: Fix double failed probing with fw_devlink=on

On Mon, Feb 15, 2021 at 11:08 AM Geert Uytterhoeven
<[email protected]> wrote:
>
> Hi Saravana,
>
> On Mon, Feb 15, 2021 at 7:27 PM Saravana Kannan <[email protected]> wrote:
> > On Mon, Feb 15, 2021 at 6:59 AM Rafael J. Wysocki <[email protected]> wrote:
> > > On Mon, Feb 15, 2021 at 12:16 PM Geert Uytterhoeven
> > > <[email protected]> wrote:
> > > > With fw_devlink=permissive, devices are added to the deferred probe
> > > > pending list if their driver's .probe() method returns -EPROBE_DEFER.
> > > >
> > > > With fw_devlink=on, devices are added to the deferred probe pending list
> > > > if they are determined to be a consumer,
> >
> > If they are determined to be a consumer or if they are determined to
> > have a supplier that hasn't probed yet?
>
> When the supplier has probed:
>
> bus: 'platform': driver_probe_device: matched device
> e6150000.clock-controller with driver renesas-cpg-mssr
> bus: 'platform': really_probe: probing driver renesas-cpg-mssr
> with device e6150000.clock-controller
> PM: Added domain provider from /soc/clock-controller@e6150000
> driver: 'renesas-cpg-mssr': driver_bound: bound to device
> 'e6150000.clock-controller'
> platform e6055800.gpio: Added to deferred list
> [...]
> platform e6020000.watchdog: Added to deferred list
> [...]
> platform fe000000.pcie: Added to deferred list
>
> > > > which happens before their
> > > > driver's .probe() method is called. If the actual probe fails later
> > > > (real failure, not -EPROBE_DEFER), the device will still be on the
> > > > deferred probe pending list, and it will be probed again when deferred
> > > > probing kicks in, which is futile.
> > > >
> > > > Fix this by explicitly removing the device from the deferred probe
> > > > pending list in case of probe failures.
> > > >
> > > > Fixes: e590474768f1cc04 ("driver core: Set fw_devlink=on by default")
> > > > Signed-off-by: Geert Uytterhoeven <[email protected]>
> > >
> > > Good catch:
> > >
> > > Reviewed-by: Rafael J. Wysocki <[email protected]>
> >
> > The issue is real and needs to be fixed. But I'm confused how this can
> > happen. We won't even enter really_probe() if the driver isn't ready.
> > We also won't get to run the driver's .probe() if the suppliers aren't
> > ready. So how does the device get added to the deferred probe list
> > before the driver is ready? Is this due to device_links_driver_bound()
> > on the supplier?
> >
> > Can you give a more detailed step by step on the case you are hitting?
>
> The device is added to the list due to device_links_driver_bound()
> calling driver_deferred_probe_add() on all consumer devices.

Thanks for the explanation. Maybe add more details like this to the
commit text or in the code?

For the code:
Reviewed-by: Saravana Kannan <[email protected]>

-Saravana

>
> > > > +++ b/drivers/base/dd.c
> > > > @@ -639,11 +639,13 @@ static int really_probe(struct device *dev, struct device_driver *drv)
> > > > case -ENXIO:
> > > > pr_debug("%s: probe of %s rejects match %d\n",
> > > > drv->name, dev_name(dev), ret);
> > > > + driver_deferred_probe_del(dev);
> > > > break;
> > > > default:
> > > > /* driver matched but the probe failed */
> > > > pr_warn("%s: probe of %s failed with error %d\n",
> > > > drv->name, dev_name(dev), ret);
> > > > + driver_deferred_probe_del(dev);
> > > > }
> > > > /*
> > > > * Ignore errors returned by ->probe so that the next driver can try
>
> Gr{oetje,eeting}s,
>
> Geert
>
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
>
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
> -- Linus Torvalds

2021-02-16 17:09:42

by Saravana Kannan

Subject: Re: [PATCH] driver core: Fix double failed probing with fw_devlink=on

On Mon, Feb 15, 2021 at 12:59 PM Saravana Kannan <[email protected]> wrote:
>
> On Mon, Feb 15, 2021 at 11:08 AM Geert Uytterhoeven
> <[email protected]> wrote:
> >
> > Hi Saravana,
> >
> > On Mon, Feb 15, 2021 at 7:27 PM Saravana Kannan <[email protected]> wrote:
> > > On Mon, Feb 15, 2021 at 6:59 AM Rafael J. Wysocki <[email protected]> wrote:
> > > > On Mon, Feb 15, 2021 at 12:16 PM Geert Uytterhoeven
> > > > <[email protected]> wrote:
> > > > > With fw_devlink=permissive, devices are added to the deferred probe
> > > > > pending list if their driver's .probe() method returns -EPROBE_DEFER.
> > > > >
> > > > > With fw_devlink=on, devices are added to the deferred probe pending list
> > > > > if they are determined to be a consumer,
> > >
> > > If they are determined to be a consumer or if they are determined to
> > > have a supplier that hasn't probed yet?
> >
> > When the supplier has probed:
> >
> > bus: 'platform': driver_probe_device: matched device
> > e6150000.clock-controller with driver renesas-cpg-mssr
> > bus: 'platform': really_probe: probing driver renesas-cpg-mssr
> > with device e6150000.clock-controller
> > PM: Added domain provider from /soc/clock-controller@e6150000
> > driver: 'renesas-cpg-mssr': driver_bound: bound to device
> > 'e6150000.clock-controller'
> > platform e6055800.gpio: Added to deferred list
> > [...]
> > platform e6020000.watchdog: Added to deferred list
> > [...]
> > platform fe000000.pcie: Added to deferred list
> >
> > > > > which happens before their
> > > > > driver's .probe() method is called. If the actual probe fails later
> > > > > (real failure, not -EPROBE_DEFER), the device will still be on the
> > > > > deferred probe pending list, and it will be probed again when deferred
> > > > > probing kicks in, which is futile.
> > > > >
> > > > > Fix this by explicitly removing the device from the deferred probe
> > > > > pending list in case of probe failures.
> > > > >
> > > > > Fixes: e590474768f1cc04 ("driver core: Set fw_devlink=on by default")
> > > > > Signed-off-by: Geert Uytterhoeven <[email protected]>
> > > >
> > > > Good catch:
> > > >
> > > > Reviewed-by: Rafael J. Wysocki <[email protected]>
> > >
> > > The issue is real and needs to be fixed. But I'm confused how this can
> > > happen. We won't even enter really_probe() if the driver isn't ready.
> > > We also won't get to run the driver's .probe() if the suppliers aren't
> > > ready. So how does the device get added to the deferred probe list
> > > before the driver is ready? Is this due to device_links_driver_bound()
> > > on the supplier?
> > >
> > > Can you give a more detailed step by step on the case you are hitting?
> >
> > The device is added to the list due to device_links_driver_bound()
> > calling driver_deferred_probe_add() on all consumer devices.
>
> Thanks for the explanation. Maybe add more details like this to the
> commit text or in the code?
>
> For the code:
> Reviewed-by: Saravana Kannan <[email protected]>

Ugh... I just realized that I might have to give this a Nak because of
bad locking in deferred_probe_work_func(). The unlock/lock inside the
loop is a terrible hack. If we add this patch, we can end up modifying
a linked list while it's being traversed and cause a crash or busy
loop (you'll accidentally end up on an "empty list"). I ran into a
similar issue during one of my unrelated refactors.

-Saravana
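
The general hazard mentioned above (a list changing under a walker that
has dropped its lock) can be sketched in userspace as follows. The toy_*
names are hypothetical, and the list is only loosely modelled on a circular
doubly linked list whose delete operation leaves the removed node pointing
at itself. This is not the code of deferred_probe_work_func(), and whether
the proposed patch would hit exactly this pattern is not established by the
thread. The sketch only shows how a stale cached pointer can land on an
"empty list" and spin, and how re-reading the head on each step avoids it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy circular doubly linked list node; not the kernel's list_head. */
struct toy_node {
	struct toy_node *prev, *next;
};

static void toy_init(struct toy_node *n)
{
	n->prev = n->next = n;
}

static void toy_add_tail(struct toy_node *n, struct toy_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink a node and leave it pointing at itself (an "empty list" of one). */
static void toy_del_init(struct toy_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	toy_init(n);
}

/*
 * Unsafe walk: caches the next pointer, then that very node is removed
 * (simulating another context acting while the lock is dropped).
 * Returns the number of steps taken, capped at max_steps.
 */
static int toy_walk_cached_next(struct toy_node *head, int max_steps)
{
	int steps = 0;
	struct toy_node *pos = head->next;

	while (pos != head && steps < max_steps) {
		struct toy_node *next = pos->next;

		if (next != head)
			toy_del_init(next); /* simulated concurrent removal */
		pos = next; /* stale: may now point at a self-linked node */
		steps++;
	}
	return steps;
}

/*
 * Safer walk: dequeue the first entry, then re-read the head on every
 * iteration instead of trusting a pointer cached before the removal.
 */
static int toy_walk_pop_head(struct toy_node *head, int max_steps)
{
	int steps = 0;

	while (head->next != head && steps < max_steps) {
		toy_del_init(head->next); /* dequeue before "dropping the lock" */
		if (head->next != head)
			toy_del_init(head->next); /* simulated concurrent removal */
		steps++;
	}
	return steps;
}
```

With a three-entry list, the cached-pointer walk only stops because of the
step cap, while the pop-the-head walk terminates on its own.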

2021-07-07 08:45:51

by Geert Uytterhoeven

Subject: Re: [PATCH] driver core: Fix double failed probing with fw_devlink=on

Hi Saravana,

(going over an old patch I still have in my local tree)

On Tue, Feb 16, 2021 at 6:08 PM Saravana Kannan <[email protected]> wrote:
> On Mon, Feb 15, 2021 at 12:59 PM Saravana Kannan <[email protected]> wrote:
> > On Mon, Feb 15, 2021 at 11:08 AM Geert Uytterhoeven
> > <[email protected]> wrote:
> > > On Mon, Feb 15, 2021 at 7:27 PM Saravana Kannan <[email protected]> wrote:
> > > > On Mon, Feb 15, 2021 at 6:59 AM Rafael J. Wysocki <[email protected]> wrote:
> > > > > On Mon, Feb 15, 2021 at 12:16 PM Geert Uytterhoeven
> > > > > <[email protected]> wrote:
> > > > > > With fw_devlink=permissive, devices are added to the deferred probe
> > > > > > pending list if their driver's .probe() method returns -EPROBE_DEFER.
> > > > > >
> > > > > > With fw_devlink=on, devices are added to the deferred probe pending list
> > > > > > if they are determined to be a consumer,
> > > >
> > > > If they are determined to be a consumer or if they are determined to
> > > > have a supplier that hasn't probed yet?
> > >
> > > When the supplier has probed:
> > >
> > > bus: 'platform': driver_probe_device: matched device
> > > e6150000.clock-controller with driver renesas-cpg-mssr
> > > bus: 'platform': really_probe: probing driver renesas-cpg-mssr
> > > with device e6150000.clock-controller
> > > PM: Added domain provider from /soc/clock-controller@e6150000
> > > driver: 'renesas-cpg-mssr': driver_bound: bound to device
> > > 'e6150000.clock-controller'
> > > platform e6055800.gpio: Added to deferred list
> > > [...]
> > > platform e6020000.watchdog: Added to deferred list
> > > [...]
> > > platform fe000000.pcie: Added to deferred list
> > >
> > > > > > which happens before their
> > > > > > driver's .probe() method is called. If the actual probe fails later
> > > > > > (real failure, not -EPROBE_DEFER), the device will still be on the
> > > > > > deferred probe pending list, and it will be probed again when deferred
> > > > > > probing kicks in, which is futile.
> > > > > >
> > > > > > Fix this by explicitly removing the device from the deferred probe
> > > > > > pending list in case of probe failures.
> > > > > >
> > > > > > Fixes: e590474768f1cc04 ("driver core: Set fw_devlink=on by default")
> > > > > > Signed-off-by: Geert Uytterhoeven <[email protected]>
> > > > >
> > > > > Good catch:
> > > > >
> > > > > Reviewed-by: Rafael J. Wysocki <[email protected]>
> > > >
> > > > The issue is real and needs to be fixed. But I'm confused how this can
> > > > happen. We won't even enter really_probe() if the driver isn't ready.
> > > > We also won't get to run the driver's .probe() if the suppliers aren't
> > > > ready. So how does the device get added to the deferred probe list
> > > > before the driver is ready? Is this due to device_links_driver_bound()
> > > > on the supplier?
> > > >
> > > > Can you give a more detailed step by step on the case you are hitting?
> > >
> > > The device is added to the list due to device_links_driver_bound()
> > > calling driver_deferred_probe_add() on all consumer devices.
> >
> > Thanks for the explanation. Maybe add more details like this to the
> > commit text or in the code?
> >
> > For the code:
> > Reviewed-by: Saravana Kannan <[email protected]>
>
> Ugh... I just realized that I might have to give this a Nak because of
> bad locking in deferred_probe_work_func(). The unlock/lock inside the
> loop is a terrible hack. If we add this patch, we can end up modifying
> a linked list while it's being traversed and cause a crash or busy
> loop (you'll accidentally end up on an "empty list"). I ran into a
> similar issue during one of my unrelated refactors.

Turns out the issue I was seeing went away due to commit
f2db85b64f0af141 ("driver core: Avoid pointless deferred probe
attempts"), so there is no need to apply this patch.


Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2021-07-07 19:04:46

by Saravana Kannan

Subject: Re: [PATCH] driver core: Fix double failed probing with fw_devlink=on

On Wed, Jul 7, 2021 at 1:43 AM Geert Uytterhoeven <[email protected]> wrote:
>
> Hi Saravana,
>
> (going over old patch I still have in my local tree)
>
> On Tue, Feb 16, 2021 at 6:08 PM Saravana Kannan <[email protected]> wrote:
> > On Mon, Feb 15, 2021 at 12:59 PM Saravana Kannan <[email protected]> wrote:
> > > On Mon, Feb 15, 2021 at 11:08 AM Geert Uytterhoeven
> > > <[email protected]> wrote:
> > > > On Mon, Feb 15, 2021 at 7:27 PM Saravana Kannan <[email protected]> wrote:
> > > > > On Mon, Feb 15, 2021 at 6:59 AM Rafael J. Wysocki <[email protected]> wrote:
> > > > > > On Mon, Feb 15, 2021 at 12:16 PM Geert Uytterhoeven
> > > > > > <[email protected]> wrote:
> > > > > > > With fw_devlink=permissive, devices are added to the deferred probe
> > > > > > > pending list if their driver's .probe() method returns -EPROBE_DEFER.
> > > > > > >
> > > > > > > With fw_devlink=on, devices are added to the deferred probe pending list
> > > > > > > if they are determined to be a consumer,
> > > > >
> > > > > If they are determined to be a consumer or if they are determined to
> > > > > have a supplier that hasn't probed yet?
> > > >
> > > > When the supplier has probed:
> > > >
> > > > bus: 'platform': driver_probe_device: matched device
> > > > e6150000.clock-controller with driver renesas-cpg-mssr
> > > > bus: 'platform': really_probe: probing driver renesas-cpg-mssr
> > > > with device e6150000.clock-controller
> > > > PM: Added domain provider from /soc/clock-controller@e6150000
> > > > driver: 'renesas-cpg-mssr': driver_bound: bound to device
> > > > 'e6150000.clock-controller'
> > > > platform e6055800.gpio: Added to deferred list
> > > > [...]
> > > > platform e6020000.watchdog: Added to deferred list
> > > > [...]
> > > > platform fe000000.pcie: Added to deferred list
> > > >
> > > > > > > which happens before their
> > > > > > > driver's .probe() method is called. If the actual probe fails later
> > > > > > > (real failure, not -EPROBE_DEFER), the device will still be on the
> > > > > > > deferred probe pending list, and it will be probed again when deferred
> > > > > > > probing kicks in, which is futile.
> > > > > > >
> > > > > > > Fix this by explicitly removing the device from the deferred probe
> > > > > > > pending list in case of probe failures.
> > > > > > >
> > > > > > > Fixes: e590474768f1cc04 ("driver core: Set fw_devlink=on by default")
> > > > > > > Signed-off-by: Geert Uytterhoeven <[email protected]>
> > > > > >
> > > > > > Good catch:
> > > > > >
> > > > > > Reviewed-by: Rafael J. Wysocki <[email protected]>
> > > > >
> > > > > The issue is real and needs to be fixed. But I'm confused how this can
> > > > > happen. We won't even enter really_probe() if the driver isn't ready.
> > > > > We also won't get to run the driver's .probe() if the suppliers aren't
> > > > > ready. So how does the device get added to the deferred probe list
> > > > > before the driver is ready? Is this due to device_links_driver_bound()
> > > > > on the supplier?
> > > > >
> > > > > Can you give a more detailed step by step on the case you are hitting?
> > > >
> > > > The device is added to the list due to device_links_driver_bound()
> > > > calling driver_deferred_probe_add() on all consumer devices.
> > >
> > > Thanks for the explanation. Maybe add more details like this to the
> > > commit text or in the code?
> > >
> > > For the code:
> > > Reviewed-by: Saravana Kannan <[email protected]>
> >
> > Ugh... I just realized that I might have to give this a Nak because of
> > bad locking in deferred_probe_work_func(). The unlock/lock inside the
> > loop is a terrible hack. If we add this patch, we can end up modifying
> > a linked list while it's being traversed and cause a crash or busy
> > loop (you'll accidentally end up on an "empty list"). I ran into a
> > similar issue during one of my unrelated refactors.
>
> Turns out the issue I was seeing went away due to commit
> f2db85b64f0af141 ("driver core: Avoid pointless deferred probe
> attempts"), so there is no need to apply this patch.
>

Yay! That was the goal :) I'm assuming it wasn't ever applied.

-Saravana