2020-05-04 18:18:47

by Greg Kroah-Hartman

Subject: [PATCH 4.19 11/37] PM: hibernate: Freeze kernel threads in software_resume()

From: Dexuan Cui <[email protected]>

commit 2351f8d295ed63393190e39c2f7c1fee1a80578f upstream.

Currently the kernel threads are not frozen in software_resume(), so
between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(),
system_freezable_power_efficient_wq can still try to submit SCSI
commands and this can cause a panic since the low level SCSI driver
(e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept
any SCSI commands: https://lkml.org/lkml/2020/4/10/47

At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying
to resolve the issue from hv_storvsc, but with the help of
Bart Van Assche, I realized it's better to fix software_resume(),
since this looks like a generic issue, not only pertaining to SCSI.

Cc: All applicable <[email protected]>
Signed-off-by: Dexuan Cui <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
kernel/power/hibernate.c | 7 +++++++
1 file changed, 7 insertions(+)

--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -901,6 +901,13 @@ static int software_resume(void)
 	error = freeze_processes();
 	if (error)
 		goto Close_Finish;
+
+	error = freeze_kernel_threads();
+	if (error) {
+		thaw_processes();
+		goto Close_Finish;
+	}
+
 	error = load_image_and_restore();
 	thaw_processes();
  Finish:
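
For readers following the control flow, the ordering this hunk establishes can be sketched in userspace roughly as below. This is a minimal illustration under stated assumptions, not kernel code: the stub functions only borrow the names used in the patch, the failure-injection flags, `trace`, `record()` and `resume_sketch()` are hypothetical helpers, and the cleanup labels of the real software_resume() are left out.

```c
#include <string.h>

/* Hypothetical failure-injection flags; nothing like this exists in the kernel. */
static int fail_freeze_processes;
static int fail_freeze_kernel_threads;

/* Records the order in which the stubs are called. */
static char trace[128];

static void record(const char *step)
{
	strncat(trace, step, sizeof(trace) - strlen(trace) - 1);
	strncat(trace, ";", sizeof(trace) - strlen(trace) - 1);
}

/* Userspace stubs that only mimic the kernel function names. */
static int freeze_processes(void)
{
	record("freeze_processes");
	return fail_freeze_processes ? -1 : 0;
}

static int freeze_kernel_threads(void)
{
	record("freeze_kernel_threads");
	return fail_freeze_kernel_threads ? -1 : 0;
}

static void thaw_processes(void)
{
	record("thaw_processes");
}

/* In the kernel a successful restore does not return; this stub simply returns 0. */
static int load_image_and_restore(void)
{
	record("load_image_and_restore");
	return 0;
}

/* Mirrors only the hunk touched by the patch, not all of software_resume(). */
static int resume_sketch(void)
{
	int error;

	trace[0] = '\0';

	error = freeze_processes();
	if (error)
		return error;	/* the real code jumps to Close_Finish */

	/* New step: kernel threads are frozen before the image is loaded. */
	error = freeze_kernel_threads();
	if (error) {
		thaw_processes();	/* undo the freeze that already succeeded */
		return error;
	}

	error = load_image_and_restore();
	thaw_processes();
	return error;
}
```

Calling resume_sketch() with fail_freeze_kernel_threads set to 1 exercises the new error branch: user processes are thawed again and the image load is never attempted.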



2020-05-05 12:12:46

by Pavel Machek

Subject: Re: [PATCH 4.19 11/37] PM: hibernate: Freeze kernel threads in software_resume()

Hi!

> commit 2351f8d295ed63393190e39c2f7c1fee1a80578f upstream.
>
> Currently the kernel threads are not frozen in software_resume(), so
> between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(),
> system_freezable_power_efficient_wq can still try to submit SCSI
> commands and this can cause a panic since the low level SCSI driver
> (e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept
> any SCSI commands: https://lkml.org/lkml/2020/4/10/47
>
> At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying
> to resolve the issue from hv_storvsc, but with the help of
> Bart Van Assche, I realized it's better to fix software_resume(),
> since this looks like a generic issue, not only pertaining to SCSI.

I believe it is too soon to merge this into stable. It is a rather big
hammer. Yes, it is the right thing to do. But I'd wait for 5.7 to be
released before merging it to stable.

It needs some testing and it did not get any.

Best regards,
Pavel

> Cc: All applicable <[email protected]>
> Signed-off-by: Dexuan Cui <[email protected]>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>
> ---
> kernel/power/hibernate.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -901,6 +901,13 @@ static int software_resume(void)
>  	error = freeze_processes();
>  	if (error)
>  		goto Close_Finish;
> +
> +	error = freeze_kernel_threads();
> +	if (error) {
> +		thaw_processes();
> +		goto Close_Finish;
> +	}
> +
>  	error = load_image_and_restore();
>  	thaw_processes();
>   Finish:
>

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html



2020-05-05 16:59:37

by Dexuan Cui

Subject: RE: [PATCH 4.19 11/37] PM: hibernate: Freeze kernel threads in software_resume()

> From: Pavel Machek <[email protected]>
> Sent: Tuesday, May 5, 2020 5:10 AM
> To: Greg Kroah-Hartman <[email protected]>
> Cc: [email protected]; [email protected]; Dexuan Cui
> <[email protected]>; Rafael J. Wysocki <[email protected]>
> Subject: Re: [PATCH 4.19 11/37] PM: hibernate: Freeze kernel threads in
> software_resume()
>
> Hi!
>
> > commit 2351f8d295ed63393190e39c2f7c1fee1a80578f upstream.
> >
> > Currently the kernel threads are not frozen in software_resume(), so
> > between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(),
> > system_freezable_power_efficient_wq can still try to submit SCSI
> > commands and this can cause a panic since the low level SCSI driver
> > (e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept
> > any SCSI commands: https://lkml.org/lkml/2020/4/10/47
> >
> > At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying
> > to resolve the issue from hv_storvsc, but with the help of
> > Bart Van Assche, I realized it's better to fix software_resume(),
> > since this looks like a generic issue, not only pertaining to SCSI.
>
> I believe it is too soon to merge this into stable. It is a rather big
> hammer. Yes, it is the right thing to do. But I'd wait for 5.7 to be
> released before merging it to stable.
>
> It needs some testing and it did not get any.
>
> Best regards,
> Pavel

Hi,
I did do some testing in a Linux VM running on Hyper-V:
Without the patch, I can easily hit the panic I described in the first link
above. With the patch, my Linux VM can hibernate >10K times without
seeing the panic and I don't see any issue caused by the patch.

That being said, I don't mind waiting for 5.7 before we merge the patch
to stable. It would be good for the patch to get more testing from others.

Thanks,
-- Dexuan