2020-04-24 03:43:18

by Dexuan Cui

Subject: [PATCH] PM: hibernate: Freeze kernel threads in software_resume()

Currently the kernel threads are not frozen in software_resume(), so
between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(),
system_freezable_power_efficient_wq can still try to submit SCSI
commands and this can cause a panic since the low level SCSI driver
(e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept
any SCSI commands: https://lkml.org/lkml/2020/4/10/47

At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying
to resolve the issue from hv_storvsc, but with the help of
Bart Van Assche, I realized it's better to fix software_resume(),
since this looks like a generic issue, not only pertaining to SCSI.

Cc: Bart Van Assche <[email protected]>
Cc: [email protected]
Signed-off-by: Dexuan Cui <[email protected]>
---
kernel/power/hibernate.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 86aba8706b16..30bd28d1d418 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -898,6 +898,13 @@ static int software_resume(void)
 	error = freeze_processes();
 	if (error)
 		goto Close_Finish;
+
+	error = freeze_kernel_threads();
+	if (error) {
+		thaw_processes();
+		goto Close_Finish;
+	}
+
 	error = load_image_and_restore();
 	thaw_processes();
  Finish:
--
2.19.1


2020-04-26 15:07:28

by Sasha Levin

Subject: Re: [PATCH] PM: hibernate: Freeze kernel threads in software_resume()

Hi

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.6.7, v5.4.35, v4.19.118, v4.14.177, v4.9.220, v4.4.220.

v5.6.7: Build OK!
v5.4.35: Build OK!
v4.19.118: Build OK!
v4.14.177: Build OK!
v4.9.220: Build OK!
v4.4.220: Failed to apply! Possible dependencies:
ea00f4f4f00c ("PM / sleep: make PM notifiers called symmetrically")
fe12c00d21bb ("PM / hibernate: Introduce test_resume mode for hibernation")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

--
Thanks
Sasha

2020-04-26 16:26:01

by Rafael J. Wysocki

Subject: Re: [PATCH] PM: hibernate: Freeze kernel threads in software_resume()

On Friday, April 24, 2020 5:40:16 AM CEST Dexuan Cui wrote:
> Currently the kernel threads are not frozen in software_resume(), so
> between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(),
> system_freezable_power_efficient_wq can still try to submit SCSI
> commands and this can cause a panic since the low level SCSI driver
> (e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept
> any SCSI commands: https://lkml.org/lkml/2020/4/10/47
>
> At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying
> to resolve the issue from hv_storvsc, but with the help of
> Bart Van Assche, I realized it's better to fix software_resume(),
> since this looks like a generic issue, not only pertaining to SCSI.
>
> Cc: Bart Van Assche <[email protected]>
> Cc: [email protected]
> Signed-off-by: Dexuan Cui <[email protected]>
> ---
> kernel/power/hibernate.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> index 86aba8706b16..30bd28d1d418 100644
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -898,6 +898,13 @@ static int software_resume(void)
>  	error = freeze_processes();
>  	if (error)
>  		goto Close_Finish;
> +
> +	error = freeze_kernel_threads();
> +	if (error) {
> +		thaw_processes();
> +		goto Close_Finish;
> +	}
> +
>  	error = load_image_and_restore();
>  	thaw_processes();
>   Finish:
>

Applied as a fix for 5.7-rc4, thanks!



2020-04-26 18:36:30

by Bart Van Assche

Subject: Re: [PATCH] PM: hibernate: Freeze kernel threads in software_resume()

On 2020-04-26 09:24, Rafael J. Wysocki wrote:
> On Friday, April 24, 2020 5:40:16 AM CEST Dexuan Cui wrote:
>> Currently the kernel threads are not frozen in software_resume(), so
>> between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(),
>> system_freezable_power_efficient_wq can still try to submit SCSI
>> commands and this can cause a panic since the low level SCSI driver
>> (e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept
>> any SCSI commands: https://lkml.org/lkml/2020/4/10/47
>>
>> At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying
>> to resolve the issue from hv_storvsc, but with the help of
>> Bart Van Assche, I realized it's better to fix software_resume(),
>> since this looks like a generic issue, not only pertaining to SCSI.
>>
>> Cc: Bart Van Assche <[email protected]>
>> Cc: [email protected]
>> Signed-off-by: Dexuan Cui <[email protected]>
>> ---
>> kernel/power/hibernate.c | 7 +++++++
>> 1 file changed, 7 insertions(+)
>>
>> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
>> index 86aba8706b16..30bd28d1d418 100644
>> --- a/kernel/power/hibernate.c
>> +++ b/kernel/power/hibernate.c
>> @@ -898,6 +898,13 @@ static int software_resume(void)
>>  	error = freeze_processes();
>>  	if (error)
>>  		goto Close_Finish;
>> +
>> +	error = freeze_kernel_threads();
>> +	if (error) {
>> +		thaw_processes();
>> +		goto Close_Finish;
>> +	}
>> +
>>  	error = load_image_and_restore();
>>  	thaw_processes();
>>   Finish:
>
> Applied as a fix for 5.7-rc4, thanks!

Hi Rafael,

What is not clear to me is how kernel threads are thawed after
load_image_and_restore() has finished? Should a comment perhaps be added
above the freeze_kernel_threads() call that explains how
thaw_kernel_threads() is invoked after load_image_and_restore() has
finished?

Thanks,

Bart.

2020-04-27 01:02:39

by Dexuan Cui

Subject: RE: [PATCH] PM: hibernate: Freeze kernel threads in software_resume()

> From: Bart Van Assche <[email protected]>
> Sent: Sunday, April 26, 2020 11:34 AM
> To: Rafael J. Wysocki <[email protected]>; Dexuan Cui <[email protected]>
> >> --- a/kernel/power/hibernate.c
> >> +++ b/kernel/power/hibernate.c
> >> @@ -898,6 +898,13 @@ static int software_resume(void)
> >>  	error = freeze_processes();
> >>  	if (error)
> >>  		goto Close_Finish;
> >> +
> >> +	error = freeze_kernel_threads();
> >> +	if (error) {
> >> +		thaw_processes();
> >> +		goto Close_Finish;
> >> +	}
> >> +
> >>  	error = load_image_and_restore();
> >>  	thaw_processes();
> >>   Finish:
> >
> > Applied as a fix for 5.7-rc4, thanks!
>
> Hi Rafael,
>
> What is not clear to me is how kernel threads are thawed after
> load_image_and_restore() has finished? Should a comment perhaps be added
> above the freeze_kernel_threads() call that explains how
> thaw_kernel_threads() is invoked after load_image_and_restore() has
> finished?
>
> Bart.

Hi Bart, Rafael, I would suggest the below comment:

If load_image_and_restore() succeeds, it won't return, and
execution will resume in the 'old' kernel at hibernate() ->
hibernation_snapshot() -> create_image() -> swsusp_arch_suspend().
Later, hibernate() -> thaw_processes() will thaw every frozen
kernel thread and userspace process of the 'old' kernel.

Thanks,
-- Dexuan
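
The control flow discussed above can be sketched as a small userspace toy
model. This is not the kernel's code: every toy_* function, the two state
flags and the two error-injection knobs are hypothetical names made up for
illustration. It only mirrors the ordering in the patch, namely that both
failure paths end with thaw_processes(). A successful restore never returns
in the real kernel, so the model is only meaningful when a failure is
injected.

```c
#include <stdbool.h>

/* Toy model state, NOT kernel code: what is currently frozen. */
static bool user_frozen, kthreads_frozen;

/* Hypothetical knobs used to inject failures into the sketch. */
static int kthread_freeze_err;	/* error from the kthread freeze step */
static int image_load_err;	/* error when the image cannot be restored */

static int toy_freeze_processes(void)
{
	user_frozen = true;
	return 0;
}

static int toy_freeze_kernel_threads(void)
{
	if (kthread_freeze_err)
		return kthread_freeze_err;
	kthreads_frozen = true;
	return 0;
}

/* Per the thread, thawing "processes" also thaws kernel threads. */
static void toy_thaw_processes(void)
{
	user_frozen = false;
	kthreads_frozen = false;
}

/*
 * In the real kernel a successful restore does not return here;
 * this stand-in only models the failure return.
 */
static int toy_load_image_and_restore(void)
{
	return image_load_err;
}

/* Same ordering as the patched hunk of software_resume(). */
int toy_software_resume(void)
{
	int error;

	error = toy_freeze_processes();
	if (error)
		return error;

	error = toy_freeze_kernel_threads();
	if (error) {
		toy_thaw_processes();	/* undo the earlier freeze */
		return error;
	}

	error = toy_load_image_and_restore();
	toy_thaw_processes();		/* reached only if the restore failed */
	return error;
}
```

Setting kthread_freeze_err or image_load_err to a non-zero value and calling
toy_software_resume() returns that value with nothing left frozen, which is
the property the extra thaw_processes() call in the patch is meant to keep.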

2020-04-27 08:47:12

by Rafael J. Wysocki

Subject: Re: [PATCH] PM: hibernate: Freeze kernel threads in software_resume()

On Sun, Apr 26, 2020 at 8:34 PM Bart Van Assche <[email protected]> wrote:
>
> On 2020-04-26 09:24, Rafael J. Wysocki wrote:
> > On Friday, April 24, 2020 5:40:16 AM CEST Dexuan Cui wrote:
> >> Currently the kernel threads are not frozen in software_resume(), so
> >> between dpm_suspend_start(PMSG_QUIESCE) and resume_target_kernel(),
> >> system_freezable_power_efficient_wq can still try to submit SCSI
> >> commands and this can cause a panic since the low level SCSI driver
> >> (e.g. hv_storvsc) has quiesced the SCSI adapter and can not accept
> >> any SCSI commands: https://lkml.org/lkml/2020/4/10/47
> >>
> >> At first I posted a fix (https://lkml.org/lkml/2020/4/21/1318) trying
> >> to resolve the issue from hv_storvsc, but with the help of
> >> Bart Van Assche, I realized it's better to fix software_resume(),
> >> since this looks like a generic issue, not only pertaining to SCSI.
> >>
> >> Cc: Bart Van Assche <[email protected]>
> >> Cc: [email protected]
> >> Signed-off-by: Dexuan Cui <[email protected]>
> >> ---
> >> kernel/power/hibernate.c | 7 +++++++
> >> 1 file changed, 7 insertions(+)
> >>
> >> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> >> index 86aba8706b16..30bd28d1d418 100644
> >> --- a/kernel/power/hibernate.c
> >> +++ b/kernel/power/hibernate.c
> >> @@ -898,6 +898,13 @@ static int software_resume(void)
> >>  	error = freeze_processes();
> >>  	if (error)
> >>  		goto Close_Finish;
> >> +
> >> +	error = freeze_kernel_threads();
> >> +	if (error) {
> >> +		thaw_processes();
> >> +		goto Close_Finish;
> >> +	}
> >> +
> >>  	error = load_image_and_restore();
> >>  	thaw_processes();
> >>   Finish:
> >
> > Applied as a fix for 5.7-rc4, thanks!
>
> Hi Rafael,
>
> What is not clear to me is how kernel threads are thawed after
> load_image_and_restore() has finished? Should a comment perhaps be added
> above the freeze_kernel_threads() call that explains how
> thaw_kernel_threads() is invoked after load_image_and_restore() has
> finished?

It isn't, because that is not necessary.

thaw_processes() will thaw them along with the user space.

Cheers!
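
The asymmetry between the two freeze steps and the single thaw step can be
illustrated with a toy userspace model. This is a sketch under stated
assumptions, not the kernel implementation: struct toy_task, the task table
and all toy_* functions are hypothetical, and the is_kthread flag merely
stands in for however the kernel tells kernel threads apart from user tasks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy task descriptor, NOT the kernel's task_struct. */
struct toy_task {
	const char *name;
	bool is_kthread;	/* true for the model's kernel threads */
	bool frozen;
};

/* Hypothetical task table: two user tasks and one kernel thread. */
static struct toy_task toy_tasks[] = {
	{ "bash",        false, false },
	{ "kworker/0:1", true,  false },
	{ "sshd",        false, false },
};

#define TOY_NR_TASKS (sizeof(toy_tasks) / sizeof(toy_tasks[0]))

/* First freeze step: only user space tasks. */
int toy_freeze_processes(void)
{
	for (size_t i = 0; i < TOY_NR_TASKS; i++)
		if (!toy_tasks[i].is_kthread)
			toy_tasks[i].frozen = true;
	return 0;
}

/* Second freeze step: only kernel threads. */
int toy_freeze_kernel_threads(void)
{
	for (size_t i = 0; i < TOY_NR_TASKS; i++)
		if (toy_tasks[i].is_kthread)
			toy_tasks[i].frozen = true;
	return 0;
}

/* Single thaw step: every task, kernel threads included. */
void toy_thaw_processes(void)
{
	for (size_t i = 0; i < TOY_NR_TASKS; i++)
		toy_tasks[i].frozen = false;
}

/* Helper for inspecting the model. */
size_t toy_count_frozen(void)
{
	size_t n = 0;

	for (size_t i = 0; i < TOY_NR_TASKS; i++)
		if (toy_tasks[i].frozen)
			n++;
	return n;
}
```

In this model, calling toy_freeze_processes() and then
toy_freeze_kernel_threads() leaves all three tasks frozen, and one call to
toy_thaw_processes() brings toy_count_frozen() back to zero, so no separate
kernel-thread thaw is needed on that path.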