2015-12-01 01:35:01

by Sasha Levin

[permalink] [raw]
Subject: [PATCH] sched: remove false-positive warning from wake_up_process()

Futex can have a spurious wake up before we actually wake it up on our own,
which will trigger this warning if the task is still stopped.

Fixes: 9067ac85d533651b98c2ff903182a20cbb361fcb ("wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task")
Signed-off-by: Sasha Levin <[email protected]>
---
kernel/sched/core.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4d568ac..fc8c987 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2039,7 +2039,6 @@ out:
*/
int wake_up_process(struct task_struct *p)
{
- WARN_ON(task_is_stopped_or_traced(p));
return try_to_wake_up(p, TASK_NORMAL, 0);
}
EXPORT_SYMBOL(wake_up_process);
--
2.5.0


2015-12-01 01:47:39

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] sched: remove false-positive warning from wake_up_process()

On Mon, Nov 30, 2015 at 5:34 PM, Sasha Levin <[email protected]> wrote:
> Futex can have a spurious wake up before we actually wake it up on our own,
> which will trigger this warning if the task is still stopped.

Actually, I think it would presumably be the other way around: a
spurious stale futex wakeup happens *after* the process has been woken
up for some other reason and moved to stopped state.

(The "wake up and move to stopped state" could be for the same reason:
a SIGSTOP signal).

So the setup is presumably something like this:

- on cpu1: futex code is about to go to sleep, adds itself to the
futex hash chains, but then gets interrupted by a SIGSTOP

- in the meantime, on cpu2, the futex is changed, and the wakup code
sees the process from cpu1 on the futex hash chains

- on cpu1, the process has now removed itself from the hash chains,
and goes through the signal code that sets the state to STOPPED

- in the meantime, on cpu2, the futex code now gets around to waking
things up, and sees that stopped state

Roughly.

Linus

2015-12-01 02:48:44

by Rik van Riel

[permalink] [raw]
Subject: Re: [PATCH] sched: remove false-positive warning from wake_up_process()

On 11/30/2015 08:47 PM, Linus Torvalds wrote:
> On Mon, Nov 30, 2015 at 5:34 PM, Sasha Levin <[email protected]> wrote:
>> Futex can have a spurious wake up before we actually wake it up on our own,
>> which will trigger this warning if the task is still stopped.
>
> Actually, I think it would presumably be the other way around: a
> spurious stale futex wakeup happens *after* the process has been woken
> up for some other reason and moved to stopped state.
>
> (The "wake up and move to stopped state" could be for the same reason:
> a SIGSTOP signal).
>
> So the setup is presumably something like this:
>
> - on cpu1: futex code is about to go to sleep, adds itself to the
> futex hash chains, but then gets interrupted by a SIGSTOP
>
> - in the meantime, on cpu2, the futex is changed, and the wakup code
> sees the process from cpu1 on the futex hash chains
>
> - on cpu1, the process has now removed itself from the hash chains,
> and goes through the signal code that sets the state to STOPPED
>
> - in the meantime, on cpu2, the futex code now gets around to waking
> things up, and sees that stopped state
>
> Roughly.

What would the correct behaviour in that case be?

Does waking up the task while it is being traced, and ptrace
(or gdb) is not expecting a wakeup, break the tracing?

--
All rights reversed

2015-12-01 03:14:06

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] sched: remove false-positive warning from wake_up_process()

On Mon, Nov 30, 2015 at 6:48 PM, Rik van Riel <[email protected]> wrote:
>
> What would the correct behaviour in that case be?
>
> Does waking up the task while it is being traced, and ptrace
> (or gdb) is not expecting a wakeup, break the tracing?

It would.

We already do the right thing (thanks to that commit 9067ac85d533),
namely just ignore the spurious wakeup.

Basically, all "normal" wait events have to be in a loop around the
event condition because of spurious wakeups like this, and they
already are (ie helpers like "wait_event()" etc do the right thing,
and in general it's actually fairly hard to do the wrong thing).

And special things like TASK_STOPPED now only get woken up by properly
serialized things that are supposed to wake them up.

So we're ok. It's just that the sanity check WARN_ON() was racily too
eager to warn about mis-use. The warning was *meant* to trigger in
case somebody depended on the old broken behavior of
"wake_up_process() wakes up anything" that the code moved away from.
But the warning also triggered for this race condition, that was
actually fixed by the commit in question.

Linus

2015-12-03 12:36:57

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] sched: remove false-positive warning from wake_up_process()

On Mon, Nov 30, 2015 at 08:34:20PM -0500, Sasha Levin wrote:
> Futex can have a spurious wake up before we actually wake it up on our own,
> which will trigger this warning if the task is still stopped.

I've edited the changelog like so, please let me know if that is fine
with you.

Thanks.

---
Subject: sched: Remove false-positive warning from wake_up_process()
From: Sasha Levin <[email protected]>
Date: Mon, 30 Nov 2015 20:34:20 -0500

Because wakeups can (fundamentally) be late, a task might not be in
the expected state. Therefore testing against a task's state is racy,
and can yield false positives.

Fixes: 9067ac85d533 ("wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task")
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
---
kernel/sched/core.c | 1 -
1 file changed, 1 deletion(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2160,7 +2160,6 @@ static void try_to_wake_up_local(struct
*/
int wake_up_process(struct task_struct *p)
{
- WARN_ON(task_is_stopped_or_traced(p));
return try_to_wake_up(p, TASK_NORMAL, 0);
}
EXPORT_SYMBOL(wake_up_process);

2015-12-03 18:18:43

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] sched: remove false-positive warning from wake_up_process()

On Thu, Dec 3, 2015 at 4:36 AM, Peter Zijlstra <[email protected]> wrote:
>
> I've edited the changelog like so, please let me know if that is fine
> with you.

Ack.

Linus

Subject: [tip:locking/core] sched/core: Remove false-positive warning from wake_up_process()

Commit-ID: 119d6f6a3be8b424b200dcee56e74484d5445f7e
Gitweb: http://git.kernel.org/tip/119d6f6a3be8b424b200dcee56e74484d5445f7e
Author: Sasha Levin <[email protected]>
AuthorDate: Mon, 30 Nov 2015 20:34:20 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 4 Dec 2015 10:10:16 +0100

sched/core: Remove false-positive warning from wake_up_process()

Because wakeups can (fundamentally) be late, a task might not be in
the expected state. Therefore testing against a task's state is racy,
and can yield false positives.

Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Fixes: 9067ac85d533 ("wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/sched/core.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4d568ac..fc8c987 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2039,7 +2039,6 @@ out:
*/
int wake_up_process(struct task_struct *p)
{
- WARN_ON(task_is_stopped_or_traced(p));
return try_to_wake_up(p, TASK_NORMAL, 0);
}
EXPORT_SYMBOL(wake_up_process);