2023-04-19 05:43:36

by Kai-Heng Feng

Subject: [PATCH] scsi: core: Avoid doing rescan on suspended device

During system resume, if an EH is scheduled after the ATA host is resumed
(i.e. ATA_PFLAG_PM_PENDING cleared), but before the disk device is
resumed, the device_lock held by scsi_rescan_device() is never released,
so the dpm_resume() of the disk is blocked forever.

That's because scsi_attach_vpd() expects the disk device to be in an
operational state, as it doesn't work on a suspended device.

To avoid this deadlock, skip the rescan if the disk is still suspended
so the resume process of the disk device can proceed.

Signed-off-by: Kai-Heng Feng <[email protected]>
---
drivers/scsi/scsi_scan.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index d217be323cc6..36680cb1535b 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -1621,6 +1621,9 @@ void scsi_rescan_device(struct device *dev)
 {
 	struct scsi_device *sdev = to_scsi_device(dev);

+	if (dev->power.is_prepared)
+		return;
+
 	device_lock(dev);

 	scsi_attach_vpd(sdev);
--
2.34.1
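
To make the effect of the added check easier to follow, here is a minimal userspace sketch of the guard. It is a model under stated assumptions, not kernel code: `struct model_device`, its fields, and `model_rescan_device()` are hypothetical stand-ins for `struct device`, `dev->power.is_prepared`, and `scsi_rescan_device()`, and the boolean return value exists only so the outcome can be observed (the real function returns void).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical userspace stand-in for the parts of struct device that
 * the patch touches. This is not the kernel definition.
 */
struct model_device {
	bool is_prepared;	/* models dev->power.is_prepared */
	bool locked;		/* models device_lock() being held */
	int vpd_reads;		/* counts modelled scsi_attach_vpd() calls */
};

/*
 * Models scsi_rescan_device() with the patch applied.
 * Returns true if the rescan ran, false if it was skipped.
 */
static bool model_rescan_device(struct model_device *dev)
{
	assert(dev != NULL);

	/* The added check: bail out before the device lock is taken. */
	if (dev->is_prepared)
		return false;

	dev->locked = true;	/* device_lock(dev) */
	dev->vpd_reads++;	/* scsi_attach_vpd(sdev) */
	dev->locked = false;	/* device_unlock(dev) */
	return true;
}
```

In this model, a device that is still marked as prepared is never locked, so nothing can block a later resume step that needs the same lock. The cost is that the skipped rescan is simply dropped, which is the concern raised in the reply below.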


2023-04-20 17:51:14

by Benjamin Block

Subject: Re: [PATCH] scsi: core: Avoid doing rescan on suspended device

On Wed, Apr 19, 2023 at 01:41:12PM +0800, Kai-Heng Feng wrote:
> During system resume, if an EH is scheduled after the ATA host is resumed
> (i.e. ATA_PFLAG_PM_PENDING cleared), but before the disk device is
> resumed, the device_lock held by scsi_rescan_device() is never released,
> so the dpm_resume() of the disk is blocked forever.
>
> That's because scsi_attach_vpd() expects the disk device to be in an
> operational state, as it doesn't work on a suspended device.
>
> To avoid this deadlock, skip the rescan if the disk is still suspended
> so the resume process of the disk device can proceed.

I'm no expert on suspend/resume, but wouldn't you then potentially miss
changes that have been done to the LUN during suspend?

What takes care of updating the VPDs, scsi-disk re-evaluation and such
in this case, when you block it initially during wakeup?

>
> Signed-off-by: Kai-Heng Feng <[email protected]>
> ---
> drivers/scsi/scsi_scan.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index d217be323cc6..36680cb1535b 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -1621,6 +1621,9 @@ void scsi_rescan_device(struct device *dev)
> {
> struct scsi_device *sdev = to_scsi_device(dev);
>
> + if (dev->power.is_prepared)
> + return;
> +
> device_lock(dev);
>
> scsi_attach_vpd(sdev);
> --
> 2.34.1
>

--
Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
Vors. Aufs.-R.: Gregor Pillen / Geschäftsführung: David Faller
Sitz der Ges.: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294

2023-04-24 03:32:12

by Kai-Heng Feng

Subject: Re: [PATCH] scsi: core: Avoid doing rescan on suspended device

On Fri, Apr 21, 2023 at 1:43 AM Benjamin Block <[email protected]> wrote:
>
> On Wed, Apr 19, 2023 at 01:41:12PM +0800, Kai-Heng Feng wrote:
> > During system resume, if an EH is scheduled after the ATA host is resumed
> > (i.e. ATA_PFLAG_PM_PENDING cleared), but before the disk device is
> > resumed, the device_lock held by scsi_rescan_device() is never released,
> > so the dpm_resume() of the disk is blocked forever.
> >
> > That's because scsi_attach_vpd() expects the disk device to be in an
> > operational state, as it doesn't work on a suspended device.
> >
> > To avoid this deadlock, skip the rescan if the disk is still suspended
> > so the resume process of the disk device can proceed.
>
> I'm no expert on suspend/resume, but wouldn't you then potentially miss
> changes that have been done to the LUN during suspend?

This is a valid concern.

>
> What takes care of updating the VPDs, scsi-disk re-evaluation and such
> in this case, when you block it initially during wakeup?

The other approach is to perform the re-evaluation when the system
resume is about to be completed.
Let me send v2 to address that.


Kai-Heng

>
> >
> > Signed-off-by: Kai-Heng Feng <[email protected]>
> > ---
> > drivers/scsi/scsi_scan.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> > index d217be323cc6..36680cb1535b 100644
> > --- a/drivers/scsi/scsi_scan.c
> > +++ b/drivers/scsi/scsi_scan.c
> > @@ -1621,6 +1621,9 @@ void scsi_rescan_device(struct device *dev)
> > {
> > struct scsi_device *sdev = to_scsi_device(dev);
> >
> > + if (dev->power.is_prepared)
> > + return;
> > +
> > device_lock(dev);
> >
> > scsi_attach_vpd(sdev);
> > --
> > 2.34.1
> >
>
> --
> Best Regards, Benjamin Block / Linux on IBM Z Kernel Development
> IBM Deutschland Research & Development GmbH / https://www.ibm.com/privacy
> Vors. Aufs.-R.: Gregor Pillen / Geschäftsführung: David Faller
> Sitz der Ges.: Böblingen / Registergericht: AmtsG Stuttgart, HRB 243294
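
The alternative mentioned in the last reply, performing the re-evaluation when system resume is about to complete, could look roughly like the userspace sketch below. This is only an illustration of the idea and not the v2 patch, which is not part of this thread. Every name here (`struct deferred_device`, `deferred_request_rescan()`, `deferred_complete()`) is hypothetical, and the sketch assumes that a single pending flag is enough to remember a skipped rescan.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device that can postpone a rescan. */
struct deferred_device {
	bool is_prepared;	/* models dev->power.is_prepared */
	bool rescan_pending;	/* a rescan was requested while suspended */
	int rescans_done;	/* counts rescans that actually ran */
};

/* Stand-in for the real rescan work (VPD update, disk re-evaluation). */
static void deferred_do_rescan(struct deferred_device *dev)
{
	dev->rescans_done++;
}

/* Rescan request: run now if possible, otherwise remember it. */
static void deferred_request_rescan(struct deferred_device *dev)
{
	assert(dev != NULL);

	if (dev->is_prepared) {
		/* Still in the suspend/resume window: do not touch the device. */
		dev->rescan_pending = true;
		return;
	}
	deferred_do_rescan(dev);
}

/* Models the end of system resume for this device. */
static void deferred_complete(struct deferred_device *dev)
{
	assert(dev != NULL);

	dev->is_prepared = false;
	if (dev->rescan_pending) {
		/* Catch up on whatever changed during suspend. */
		dev->rescan_pending = false;
		deferred_do_rescan(dev);
	}
}
```

Compared with silently dropping the rescan, this keeps the resume path free of the lock while still picking up changes made to the LUN during suspend. Several requests arriving during the window collapse into one rescan, which seems acceptable because a rescan re-reads the current state rather than replaying events.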