2023-11-20 17:39:18

by Elliot Berman

[permalink] [raw]
Subject: [PATCH 0/2] freezer,sched: do not restore saved_state of a thawed task

This series applies couple fixes to commit 8f0eed4a78a8 ("freezer,sched:
Use saved_state to reduce some spurious wakeups") which was found while
testing with legacy cgroup freezer. My original testing was only with
system-wide freezer. We found that thaw_task could be called on a task
which was already frozen. Prior to commit 8f0eed4a78a8 ("freezer,sched:
Use saved_state to reduce some spurious wakeups"), this wasn't an issue
as kernel would try to wake up TASK_FROZEN, which wouldn't match the
thawed task state, and no harm done to task. After commit 8f0eed4a78a8
("freezer,sched: Use saved_state to reduce some spurious wakeups"), it
was possible to overwrite the state of thawed task.

To: Rafael J. Wysocki <[email protected]>
To: Pavel Machek <[email protected]>
To: Ingo Molnar <[email protected]>
To: Peter Zijlstra (Intel) <[email protected]>
Cc: <[email protected]>
Cc: Pavan Kondeti <[email protected]>
Cc: Aiqun Yu (Maria) <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>

Originally sent to only linux-arm-msm, resending to correct authors.
- Link to v1: https://lore.kernel.org/r/20231120-freezer-state-multiple-thaws-v1-0-a4c453f50745@quicinc.com

---
Elliot Berman (2):
freezer,sched: do not restore saved_state of a thawed task
freezer,sched: clean saved_state when restoring it during thaw

kernel/freezer.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
---
base-commit: 6d7e4782bcf549221b4ccfffec2cf4d1a473f1a3
change-id: 20231108-freezer-state-multiple-thaws-7a3a8d9dadb3

Best regards,
--
Elliot Berman <[email protected]>


2023-11-20 17:39:23

by Elliot Berman

[permalink] [raw]
Subject: [PATCH 1/2] freezer,sched: do not restore saved_state of a thawed task

It is possible for a task to be thawed multiple times when mixing the
*legacy* cgroup freezer and system-wide freezer. To do this, freeze the
cgroup, do system-wide freeze/thaw, then thaw the cgroup. When this
happens, then a stale saved_state can be written to the task's state
and cause task to hang indefinitely. Fix this by only trying to thaw
tasks that are actually frozen.

This change also has the marginal benefit avoiding unnecessary
wake_up_state(p, TASK_FROZEN) if we know the task is already thawed.
There is not possibility of time-of-compare/time-of-use race when we skip
the wake_up_state because entering/exiting TASK_FROZEN is guarded by
freezer_lock.

Fixes: 8f0eed4a78a8 ("freezer,sched: Use saved_state to reduce some spurious wakeups")
Reviewed-by: Abhijeet Dharmapurikar <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
---
kernel/freezer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/freezer.c b/kernel/freezer.c
index c450fa8b8b5e..759006a9a910 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -201,7 +201,7 @@ void __thaw_task(struct task_struct *p)
if (WARN_ON_ONCE(freezing(p)))
goto unlock;

- if (task_call_func(p, __restore_freezer_state, NULL))
+ if (!frozen(p) || task_call_func(p, __restore_freezer_state, NULL))
goto unlock;

wake_up_state(p, TASK_FROZEN);

--
2.41.0

2023-11-28 09:12:51

by Aiqun Yu (Maria)

[permalink] [raw]
Subject: Re: [PATCH 0/2] freezer,sched: do not restore saved_state of a thawed task

On 11/21/2023 1:36 AM, Elliot Berman wrote:
> This series applies couple fixes to commit 8f0eed4a78a8 ("freezer,sched:
> Use saved_state to reduce some spurious wakeups") which was found while
> testing with legacy cgroup freezer. My original testing was only with
> system-wide freezer. We found that thaw_task could be called on a task
> which was already frozen. Prior to commit 8f0eed4a78a8 ("freezer,sched:
> Use saved_state to reduce some spurious wakeups"), this wasn't an issue
> as kernel would try to wake up TASK_FROZEN, which wouldn't match the
> thawed task state, and no harm done to task. After commit 8f0eed4a78a8
> ("freezer,sched: Use saved_state to reduce some spurious wakeups"), it
> was possible to overwrite the state of thawed task.
>
> To: Rafael J. Wysocki <[email protected]>
> To: Pavel Machek <[email protected]>
> To: Ingo Molnar <[email protected]>
> To: Peter Zijlstra (Intel) <[email protected]>
> Cc: <[email protected]>
> Cc: Pavan Kondeti <[email protected]>
> Cc: Aiqun Yu (Maria) <[email protected]>
> Cc: <[email protected]>
> Cc: <[email protected]>
> Signed-off-by: Elliot Berman <[email protected]>
Shall we add Fixed tag and Cc: [email protected] ?
Since it is fixing a stable user thread hung issue.
>
> Originally sent to only linux-arm-msm, resending to correct authors.
> - Link to v1: https://lore.kernel.org/r/20231120-freezer-state-multiple-thaws-v1-0-a4c453f50745@quicinc.com
>
> ---
> Elliot Berman (2):
> freezer,sched: do not restore saved_state of a thawed task
> freezer,sched: clean saved_state when restoring it during thaw
>
> kernel/freezer.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> ---
> base-commit: 6d7e4782bcf549221b4ccfffec2cf4d1a473f1a3
> change-id: 20231108-freezer-state-multiple-thaws-7a3a8d9dadb3
>
> Best regards,

--
Thx and BRs,
Aiqun(Maria) Yu

2023-11-28 17:31:52

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/2] freezer,sched: do not restore saved_state of a thawed task

On Tue, Nov 28, 2023 at 05:12:00PM +0800, Aiqun(Maria) Yu wrote:
> On 11/21/2023 1:36 AM, Elliot Berman wrote:
> > This series applies couple fixes to commit 8f0eed4a78a8 ("freezer,sched:
> > Use saved_state to reduce some spurious wakeups") which was found while
> > testing with legacy cgroup freezer. My original testing was only with
> > system-wide freezer. We found that thaw_task could be called on a task
> > which was already frozen. Prior to commit 8f0eed4a78a8 ("freezer,sched:
> > Use saved_state to reduce some spurious wakeups"), this wasn't an issue
> > as kernel would try to wake up TASK_FROZEN, which wouldn't match the
> > thawed task state, and no harm done to task. After commit 8f0eed4a78a8
> > ("freezer,sched: Use saved_state to reduce some spurious wakeups"), it
> > was possible to overwrite the state of thawed task.
> >
> > To: Rafael J. Wysocki <[email protected]>
> > To: Pavel Machek <[email protected]>
> > To: Ingo Molnar <[email protected]>
> > To: Peter Zijlstra (Intel) <[email protected]>
> > Cc: <[email protected]>
> > Cc: Pavan Kondeti <[email protected]>
> > Cc: Aiqun Yu (Maria) <[email protected]>
> > Cc: <[email protected]>
> > Cc: <[email protected]>
> > Signed-off-by: Elliot Berman <[email protected]>
> Shall we add Fixed tag and Cc: [email protected] ?
> Since it is fixing a stable user thread hung issue.

If you want this routed through urgent, then yes.

2023-11-28 17:33:51

by Elliot Berman

[permalink] [raw]
Subject: Re: [PATCH 0/2] freezer,sched: do not restore saved_state of a thawed task



On 11/28/2023 11:31 AM, Peter Zijlstra wrote:
> On Tue, Nov 28, 2023 at 05:12:00PM +0800, Aiqun(Maria) Yu wrote:
>> On 11/21/2023 1:36 AM, Elliot Berman wrote:
>>> This series applies couple fixes to commit 8f0eed4a78a8 ("freezer,sched:
>>> Use saved_state to reduce some spurious wakeups") which was found while
>>> testing with legacy cgroup freezer. My original testing was only with
>>> system-wide freezer. We found that thaw_task could be called on a task
>>> which was already frozen. Prior to commit 8f0eed4a78a8 ("freezer,sched:
>>> Use saved_state to reduce some spurious wakeups"), this wasn't an issue
>>> as kernel would try to wake up TASK_FROZEN, which wouldn't match the
>>> thawed task state, and no harm done to task. After commit 8f0eed4a78a8
>>> ("freezer,sched: Use saved_state to reduce some spurious wakeups"), it
>>> was possible to overwrite the state of thawed task.
>>>
>>> To: Rafael J. Wysocki <[email protected]>
>>> To: Pavel Machek <[email protected]>
>>> To: Ingo Molnar <[email protected]>
>>> To: Peter Zijlstra (Intel) <[email protected]>
>>> Cc: <[email protected]>
>>> Cc: Pavan Kondeti <[email protected]>
>>> Cc: Aiqun Yu (Maria) <[email protected]>
>>> Cc: <[email protected]>
>>> Cc: <[email protected]>
>>> Signed-off-by: Elliot Berman <[email protected]>
>> Shall we add Fixed tag and Cc: [email protected] ?
>> Since it is fixing a stable user thread hung issue.
>
> If you want this routed through urgent, then yes.

Fixes tag is added to https://lore.kernel.org/all/20231120-freezer-state-multiple-thaws-v1-1-f2e1dd7ce5a2@quicinc.com/

Is CC'ing stable what triggers routing through urgent?

Thanks,
Elliot

2023-11-28 17:54:46

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 0/2] freezer,sched: do not restore saved_state of a thawed task

On Tue, Nov 28, 2023 at 11:33:23AM -0600, Elliot Berman wrote:
>
>
> On 11/28/2023 11:31 AM, Peter Zijlstra wrote:
> > On Tue, Nov 28, 2023 at 05:12:00PM +0800, Aiqun(Maria) Yu wrote:
> >> On 11/21/2023 1:36 AM, Elliot Berman wrote:
> >>> This series applies couple fixes to commit 8f0eed4a78a8 ("freezer,sched:
> >>> Use saved_state to reduce some spurious wakeups") which was found while
> >>> testing with legacy cgroup freezer. My original testing was only with
> >>> system-wide freezer. We found that thaw_task could be called on a task
> >>> which was already frozen. Prior to commit 8f0eed4a78a8 ("freezer,sched:
> >>> Use saved_state to reduce some spurious wakeups"), this wasn't an issue
> >>> as kernel would try to wake up TASK_FROZEN, which wouldn't match the
> >>> thawed task state, and no harm done to task. After commit 8f0eed4a78a8
> >>> ("freezer,sched: Use saved_state to reduce some spurious wakeups"), it
> >>> was possible to overwrite the state of thawed task.
> >>>
> >>> To: Rafael J. Wysocki <[email protected]>
> >>> To: Pavel Machek <[email protected]>
> >>> To: Ingo Molnar <[email protected]>
> >>> To: Peter Zijlstra (Intel) <[email protected]>
> >>> Cc: <[email protected]>
> >>> Cc: Pavan Kondeti <[email protected]>
> >>> Cc: Aiqun Yu (Maria) <[email protected]>
> >>> Cc: <[email protected]>
> >>> Cc: <[email protected]>
> >>> Signed-off-by: Elliot Berman <[email protected]>
> >> Shall we add Fixed tag and Cc: [email protected] ?
> >> Since it is fixing a stable user thread hung issue.
> >
> > If you want this routed through urgent, then yes.
>
> Fixes tag is added to https://lore.kernel.org/all/20231120-freezer-state-multiple-thaws-v1-1-f2e1dd7ce5a2@quicinc.com/
>
> Is CC'ing stable what triggers routing through urgent?

Fixes tag should be enough, lemme go find that other copy then.. so much
email :/

Subject: [tip: sched/urgent] freezer,sched: Do not restore saved_state of a thawed task

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID: 23ab79e8e469e2605beec2e3ccb40d19c68dd2e0
Gitweb: https://git.kernel.org/tip/23ab79e8e469e2605beec2e3ccb40d19c68dd2e0
Author: Elliot Berman <[email protected]>
AuthorDate: Mon, 20 Nov 2023 09:36:31 -08:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Wed, 29 Nov 2023 15:43:48 +01:00

freezer,sched: Do not restore saved_state of a thawed task

It is possible for a task to be thawed multiple times when mixing the
*legacy* cgroup freezer and system-wide freezer. To do this, freeze the
cgroup, do system-wide freeze/thaw, then thaw the cgroup. When this
happens, then a stale saved_state can be written to the task's state
and cause task to hang indefinitely. Fix this by only trying to thaw
tasks that are actually frozen.

This change also has the marginal benefit avoiding unnecessary
wake_up_state(p, TASK_FROZEN) if we know the task is already thawed.
There is not possibility of time-of-compare/time-of-use race when we skip
the wake_up_state because entering/exiting TASK_FROZEN is guarded by
freezer_lock.

Fixes: 8f0eed4a78a8 ("freezer,sched: Use saved_state to reduce some spurious wakeups")
Signed-off-by: Elliot Berman <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Reviewed-by: Abhijeet Dharmapurikar <[email protected]>
Link: https://lore.kernel.org/r/20231120-freezer-state-multiple-thaws-v1-1-f2e1dd7ce5a2@quicinc.com
---
kernel/freezer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/freezer.c b/kernel/freezer.c
index c450fa8..759006a 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -201,7 +201,7 @@ void __thaw_task(struct task_struct *p)
if (WARN_ON_ONCE(freezing(p)))
goto unlock;

- if (task_call_func(p, __restore_freezer_state, NULL))
+ if (!frozen(p) || task_call_func(p, __restore_freezer_state, NULL))
goto unlock;

wake_up_state(p, TASK_FROZEN);