2023-12-13 00:36:37

by Igor Mammedov

[permalink] [raw]
Subject: [RFC 1/2] PCI: acpiphp: enable slot only if it hasn't been enabled already

When SCSI_SCAN_ASYNC is enabled (either via config or via cmd line),
adding device to bus and enabling it will kick in async host scan

scsi_scan_host+0x21/0x1f0
virtscsi_probe+0x2dd/0x350
..
driver_probe_device+0x19/0x80
...
driver_probe_device+0x19/0x80
pci_bus_add_device+0x53/0x80
pci_bus_add_devices+0x2b/0x70
...

which will schedule a job for async scan. That however breaks
if there are more than one SCSI host behind bridge, since
acpiphp_check_bridge() will walk over all slots and try to
enable each of them regardless of whether they were already
enabled.
As result the bridge might be reconfigured several times
and trigger following sequence:

[cpu 0] acpiphp_check_bridge()
[cpu 0] enable_slot(a)
[cpu 0] configure bridge
[cpu 0] pci_bus_add_devices() -> scsi_scan_host(a1)
[cpu 0] enable_slot(b)
...
[cpu 1] do_scsi_scan_host(a1) <- async jib scheduled for slot a
...
[cpu 0] configure bridge <- temporaly disables bridge

and cause do_scsi_scan_host() failure.
The same race affects SHPC (but it manages to avoid hitting the race due to
1sec delay when enabling slot).
To cover case of single device hotplug (at a time) do not attempt to
enable slot that have already been enabled.

Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary")
Reported-by: Dongli Zhang <[email protected]>
Reported-by: iona Ebner <[email protected]>
Signed-off-by: Igor Mammedov <[email protected]>
---
drivers/pci/hotplug/acpiphp_glue.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 601129772b2d..6b11609927d6 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -722,7 +722,9 @@ static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
trim_stale_devices(dev);

/* configure all functions */
- enable_slot(slot, true);
+ if (slot->flags != SLOT_ENABLED) {
+ enable_slot(slot, true);
+ }
} else {
disable_slot(slot);
}
--
2.39.3


2023-12-13 09:47:49

by Fiona Ebner

[permalink] [raw]
Subject: Re: [RFC 1/2] PCI: acpiphp: enable slot only if it hasn't been enabled already

Am 13.12.23 um 01:36 schrieb Igor Mammedov:
> When SCSI_SCAN_ASYNC is enabled (either via config or via cmd line),
> adding device to bus and enabling it will kick in async host scan
>
> scsi_scan_host+0x21/0x1f0
> virtscsi_probe+0x2dd/0x350
> ..
> driver_probe_device+0x19/0x80
> ...
> driver_probe_device+0x19/0x80
> pci_bus_add_device+0x53/0x80
> pci_bus_add_devices+0x2b/0x70
> ...
>
> which will schedule a job for async scan. That however breaks
> if there are more than one SCSI host behind bridge, since
> acpiphp_check_bridge() will walk over all slots and try to
> enable each of them regardless of whether they were already
> enabled.
> As result the bridge might be reconfigured several times
> and trigger following sequence:
>
> [cpu 0] acpiphp_check_bridge()
> [cpu 0] enable_slot(a)
> [cpu 0] configure bridge
> [cpu 0] pci_bus_add_devices() -> scsi_scan_host(a1)
> [cpu 0] enable_slot(b)
> ...
> [cpu 1] do_scsi_scan_host(a1) <- async jib scheduled for slot a
> ...
> [cpu 0] configure bridge <- temporaly disables bridge
>
> and cause do_scsi_scan_host() failure.
> The same race affects SHPC (but it manages to avoid hitting the race due to
> 1sec delay when enabling slot).
> To cover case of single device hotplug (at a time) do not attempt to
> enable slot that have already been enabled.
>
> Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary")
> Reported-by: Dongli Zhang <[email protected]>
> Reported-by: iona Ebner <[email protected]>

Missing an F here ;)

> Signed-off-by: Igor Mammedov <[email protected]>

Thank you! Works for me:

Tested-by: Fiona Ebner <[email protected]>

2023-12-13 10:07:52

by Igor Mammedov

[permalink] [raw]
Subject: Re: [RFC 1/2] PCI: acpiphp: enable slot only if it hasn't been enabled already

On Wed, 13 Dec 2023 10:47:27 +0100
Fiona Ebner <[email protected]> wrote:

> Am 13.12.23 um 01:36 schrieb Igor Mammedov:
> > When SCSI_SCAN_ASYNC is enabled (either via config or via cmd line),
> > adding device to bus and enabling it will kick in async host scan
> >
> > scsi_scan_host+0x21/0x1f0
> > virtscsi_probe+0x2dd/0x350
> > ..
> > driver_probe_device+0x19/0x80
> > ...
> > driver_probe_device+0x19/0x80
> > pci_bus_add_device+0x53/0x80
> > pci_bus_add_devices+0x2b/0x70
> > ...
> >
> > which will schedule a job for async scan. That however breaks
> > if there are more than one SCSI host behind bridge, since
> > acpiphp_check_bridge() will walk over all slots and try to
> > enable each of them regardless of whether they were already
> > enabled.
> > As result the bridge might be reconfigured several times
> > and trigger following sequence:
> >
> > [cpu 0] acpiphp_check_bridge()
> > [cpu 0] enable_slot(a)
> > [cpu 0] configure bridge
> > [cpu 0] pci_bus_add_devices() -> scsi_scan_host(a1)
> > [cpu 0] enable_slot(b)
> > ...
> > [cpu 1] do_scsi_scan_host(a1) <- async jib scheduled for slot a
> > ...
> > [cpu 0] configure bridge <- temporaly disables bridge
> >
> > and cause do_scsi_scan_host() failure.
> > The same race affects SHPC (but it manages to avoid hitting the race due to
> > 1sec delay when enabling slot).
> > To cover case of single device hotplug (at a time) do not attempt to
> > enable slot that have already been enabled.
> >
> > Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary")
> > Reported-by: Dongli Zhang <[email protected]>
> > Reported-by: iona Ebner <[email protected]>
>
> Missing an F here ;)

Sorry for copypaste mistake, I'll fix it up on the next submission.

>
> > Signed-off-by: Igor Mammedov <[email protected]>
>
> Thank you! Works for me:
>
> Tested-by: Fiona Ebner <[email protected]>
>

2023-12-13 13:02:10

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC 1/2] PCI: acpiphp: enable slot only if it hasn't been enabled already

On Wed, Dec 13, 2023 at 1:36 AM Igor Mammedov <[email protected]> wrote:
>
> When SCSI_SCAN_ASYNC is enabled (either via config or via cmd line),
> adding device to bus and enabling it will kick in async host scan
>
> scsi_scan_host+0x21/0x1f0
> virtscsi_probe+0x2dd/0x350
> ..
> driver_probe_device+0x19/0x80
> ...
> driver_probe_device+0x19/0x80
> pci_bus_add_device+0x53/0x80
> pci_bus_add_devices+0x2b/0x70
> ...
>
> which will schedule a job for async scan. That however breaks
> if there are more than one SCSI host behind bridge, since
> acpiphp_check_bridge() will walk over all slots and try to
> enable each of them regardless of whether they were already
> enabled.
> As result the bridge might be reconfigured several times
> and trigger following sequence:
>
> [cpu 0] acpiphp_check_bridge()
> [cpu 0] enable_slot(a)
> [cpu 0] configure bridge
> [cpu 0] pci_bus_add_devices() -> scsi_scan_host(a1)
> [cpu 0] enable_slot(b)
> ...
> [cpu 1] do_scsi_scan_host(a1) <- async jib scheduled for slot a
> ...
> [cpu 0] configure bridge <- temporaly disables bridge
>
> and cause do_scsi_scan_host() failure.
> The same race affects SHPC (but it manages to avoid hitting the race due to
> 1sec delay when enabling slot).
> To cover case of single device hotplug (at a time) do not attempt to
> enable slot that have already been enabled.
>
> Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary")
> Reported-by: Dongli Zhang <[email protected]>
> Reported-by: iona Ebner <[email protected]>
> Signed-off-by: Igor Mammedov <[email protected]>
> ---
> drivers/pci/hotplug/acpiphp_glue.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> index 601129772b2d..6b11609927d6 100644
> --- a/drivers/pci/hotplug/acpiphp_glue.c
> +++ b/drivers/pci/hotplug/acpiphp_glue.c
> @@ -722,7 +722,9 @@ static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
> trim_stale_devices(dev);
>
> /* configure all functions */
> - enable_slot(slot, true);
> + if (slot->flags != SLOT_ENABLED) {
> + enable_slot(slot, true);
> + }

Shouldn't this be following the acpiphp_enable_slot() pattern, that is

if (!(slot->flags & SLOT_ENABLED))
enable_slot(slot, true);

Also the braces are redundant.

> } else {
> disable_slot(slot);
> }
> --

2023-12-13 16:06:54

by Igor Mammedov

[permalink] [raw]
Subject: Re: [RFC 1/2] PCI: acpiphp: enable slot only if it hasn't been enabled already

On Wed, Dec 13, 2023 at 2:01 PM Rafael J. Wysocki <[email protected]> wrote:
>
> On Wed, Dec 13, 2023 at 1:36 AM Igor Mammedov <[email protected]> wrote:
> >
> > When SCSI_SCAN_ASYNC is enabled (either via config or via cmd line),
> > adding device to bus and enabling it will kick in async host scan
> >
> > scsi_scan_host+0x21/0x1f0
> > virtscsi_probe+0x2dd/0x350
> > ..
> > driver_probe_device+0x19/0x80
> > ...
> > driver_probe_device+0x19/0x80
> > pci_bus_add_device+0x53/0x80
> > pci_bus_add_devices+0x2b/0x70
> > ...
> >
> > which will schedule a job for async scan. That however breaks
> > if there are more than one SCSI host behind bridge, since
> > acpiphp_check_bridge() will walk over all slots and try to
> > enable each of them regardless of whether they were already
> > enabled.
> > As result the bridge might be reconfigured several times
> > and trigger following sequence:
> >
> > [cpu 0] acpiphp_check_bridge()
> > [cpu 0] enable_slot(a)
> > [cpu 0] configure bridge
> > [cpu 0] pci_bus_add_devices() -> scsi_scan_host(a1)
> > [cpu 0] enable_slot(b)
> > ...
> > [cpu 1] do_scsi_scan_host(a1) <- async jib scheduled for slot a
> > ...
> > [cpu 0] configure bridge <- temporaly disables bridge
> >
> > and cause do_scsi_scan_host() failure.
> > The same race affects SHPC (but it manages to avoid hitting the race due to
> > 1sec delay when enabling slot).
> > To cover case of single device hotplug (at a time) do not attempt to
> > enable slot that have already been enabled.
> >
> > Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary")
> > Reported-by: Dongli Zhang <[email protected]>
> > Reported-by: iona Ebner <[email protected]>
> > Signed-off-by: Igor Mammedov <[email protected]>
> > ---
> > drivers/pci/hotplug/acpiphp_glue.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
> > index 601129772b2d..6b11609927d6 100644
> > --- a/drivers/pci/hotplug/acpiphp_glue.c
> > +++ b/drivers/pci/hotplug/acpiphp_glue.c
> > @@ -722,7 +722,9 @@ static void acpiphp_check_bridge(struct acpiphp_bridge *bridge)
> > trim_stale_devices(dev);
> >
> > /* configure all functions */
> > - enable_slot(slot, true);
> > + if (slot->flags != SLOT_ENABLED) {
> > + enable_slot(slot, true);
> > + }
>
> Shouldn't this be following the acpiphp_enable_slot() pattern, that is
>
> if (!(slot->flags & SLOT_ENABLED))
> enable_slot(slot, true);
>
> Also the braces are redundant.

I'll fix up on respin if Bjorn is fine with the approach in general.

Patches need respin anyways to fix botched up white spacing.

>
> > } else {
> > disable_slot(slot);
> > }
> > --
>