2018-07-17 19:36:47

by Isaac J. Manjarres

Subject: [PATCH] stop_machine: Disable preemption after queueing stopper threads

This commit:

9fb8d5dc4b64 ("stop_machine, Disable preemption when
waking two stopper threads")

does not fully address the race condition that can occur
as follows:

On one CPU, call it CPU 3, thread 1 invokes
cpu_stop_queue_two_works(2, 3,...), and the execution is such
that thread 1 queues the works for migration/2 and migration/3,
and is preempted after releasing the locks for migration/2 and
migration/3, but before waking the threads.

Then, on CPU 2, a kworker, call it thread 2, is running,
and it invokes cpu_stop_queue_two_works(1, 2,...), such that
thread 2 queues the works for migration/1 and migration/2.
Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
migration/2 and migration/3. This means that when CPU 2
releases the locks for migration/1 and migration/2, but before
it wakes those threads, it can be preempted by migration/2.

If thread 2 is preempted by migration/2, then migration/2 will
execute the first work item successfully, since migration/3
was woken up by CPU 3, but when it goes to execute the second
work item, it disables preemption, calls multi_cpu_stop(),
and thus, CPU 2 will wait forever for migration/1, which should
have been woken up by thread 2. However, migration/1 cannot be
woken up by thread 2, because thread 2 is a kworker affine to
CPU 2, and CPU 2 is busy running migration/2 with preemption
disabled, so thread 2 will never run.
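
To make the circular dependency explicit, the scenario above can be
encoded as a toy wait-for graph (an illustrative, standalone sketch,
not kernel code). Each edge reads "X cannot make progress until Y
does"; following the edges from migration/2 closes the cycle:

#include <stdio.h>

enum { THREAD2, MIG1, MIG2, CPU2, NR_NODES };

static const char *name[NR_NODES] = {
        [THREAD2] = "thread 2 (kworker, affine to CPU 2)",
        [MIG1]    = "migration/1 (wakeup still pending in thread 2)",
        [MIG2]    = "migration/2 (running multi_cpu_stop() on CPU 2)",
        [CPU2]    = "CPU 2 (held with preemption disabled)",
};

/* waits_for[x] == y: x cannot make progress until y does */
static const int waits_for[NR_NODES] = {
        [MIG2]    = MIG1,    /* multi_cpu_stop() rendezvous needs migration/1 */
        [MIG1]    = THREAD2, /* only thread 2 has its wakeup queued */
        [THREAD2] = CPU2,    /* the kworker can only run on CPU 2 */
        [CPU2]    = MIG2,    /* CPU 2 is busy running migration/2 */
};

int main(void)
{
        int seen[NR_NODES] = { 0 };
        int node = MIG2;

        /* follow the chain until a node repeats: that is the deadlock cycle */
        while (!seen[node]) {
                seen[node] = 1;
                printf("%s\n   waits for\n", name[node]);
                node = waits_for[node];
        }
        printf("%s   <-- cycle closed: deadlock\n", name[node]);
        return 0;
}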

Disable preemption after queueing works for stopper threads
to ensure that the operation of queueing the works and waking
the stopper threads is atomic.

Fixes: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads")
Co-Developed-by: Prasad Sodagudi <[email protected]>
Co-Developed-by: Pavankumar Kondeti <[email protected]>
Signed-off-by: Isaac J. Manjarres <[email protected]>
Signed-off-by: Prasad Sodagudi <[email protected]>
Signed-off-by: Pavankumar Kondeti <[email protected]>
Cc: [email protected]
---
kernel/stop_machine.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 1ff523d..e190d1e 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -260,6 +260,15 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
err = 0;
__cpu_stop_queue_work(stopper1, work1, &wakeq);
__cpu_stop_queue_work(stopper2, work2, &wakeq);
+ /*
+ * The waking up of stopper threads has to happen
+ * in the same scheduling context as the queueing.
+ * Otherwise, there is a possibility of one of the
+ * above stoppers being woken up by another CPU,
+ * and preempting us. This will cause us to not
+ * wake up the other stopper forever.
+ */
+ preempt_disable();
unlock:
raw_spin_unlock(&stopper2->lock);
raw_spin_unlock_irq(&stopper1->lock);
@@ -271,7 +280,6 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
}

if (!err) {
- preempt_disable();
wake_up_q(&wakeq);
preempt_enable();
}



2018-07-24 01:14:53

by Isaac J. Manjarres

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

Hi all,

Are there any comments about this patch?

Thanks,
Isaac Manjarres
On 2018-07-17 12:35, Isaac J. Manjarres wrote:
> This commit:
>
> 9fb8d5dc4b64 ("stop_machine, Disable preemption when
> waking two stopper threads")
>
> does not fully address the race condition that can occur
> as follows:
>
> On one CPU, call it CPU 3, thread 1 invokes
> cpu_stop_queue_two_works(2, 3,...), and the execution is such
> that thread 1 queues the works for migration/2 and migration/3,
> and is preempted after releasing the locks for migration/2 and
> migration/3, but before waking the threads.
>
> Then, on CPU 2, a kworker, call it thread 2, is running,
> and it invokes cpu_stop_queue_two_works(1, 2,...), such that
> thread 2 queues the works for migration/1 and migration/2.
> Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
> migration/2 and migration/3. This means that when CPU 2
> releases the locks for migration/1 and migration/2, but before
> it wakes those threads, it can be preempted by migration/2.
>
> If thread 2 is preempted by migration/2, then migration/2 will
> execute the first work item successfully, since migration/3
> was woken up by CPU 3, but when it goes to execute the second
> work item, it disables preemption, calls multi_cpu_stop(),
> and thus, CPU 2 will wait forever for migration/1, which should
> have been woken up by thread 2. However, migration/1 cannot be
> woken up by thread 2, because thread 2 is a kworker affine to
> CPU 2, and CPU 2 is busy running migration/2 with preemption
> disabled, so thread 2 will never run.
>
> Disable preemption after queueing works for stopper threads
> to ensure that the operation of queueing the works and waking
> the stopper threads is atomic.
>
> Fixes: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two
> stopper threads")
> Co-Developed-by: Prasad Sodagudi <[email protected]>
> Co-Developed-by: Pavankumar Kondeti <[email protected]>
> Signed-off-by: Isaac J. Manjarres <[email protected]>
> Signed-off-by: Prasad Sodagudi <[email protected]>
> Signed-off-by: Pavankumar Kondeti <[email protected]>
> Cc: [email protected]
> ---
> kernel/stop_machine.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 1ff523d..e190d1e 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -260,6 +260,15 @@ static int cpu_stop_queue_two_works(int cpu1,
> struct cpu_stop_work *work1,
> err = 0;
> __cpu_stop_queue_work(stopper1, work1, &wakeq);
> __cpu_stop_queue_work(stopper2, work2, &wakeq);
> + /*
> + * The waking up of stopper threads has to happen
> + * in the same scheduling context as the queueing.
> + * Otherwise, there is a possibility of one of the
> + * above stoppers being woken up by another CPU,
> + * and preempting us. This will cause us to not
> + * wake up the other stopper forever.
> + */
> + preempt_disable();
> unlock:
> raw_spin_unlock(&stopper2->lock);
> raw_spin_unlock_irq(&stopper1->lock);
> @@ -271,7 +280,6 @@ static int cpu_stop_queue_two_works(int cpu1,
> struct cpu_stop_work *work1,
> }
>
> if (!err) {
> - preempt_disable();
> wake_up_q(&wakeq);
> preempt_enable();
> }

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

On 2018-07-23 18:13:48 [-0700], [email protected] wrote:
> Hi all,
Hi,

> Are there any comments about this patch?

I haven't look in detail at this but your new preempt_disable() makes
things unbalanced for the err != 0 case.

> Thanks,
> Isaac Manjarres

Sebastian

2018-07-25 04:16:10

by Isaac J. Manjarres

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

Hi Sebastian,

Thanks for the response.

"I haven't look in detail at this but your new preempt_disable() makes
things unbalanced for the err != 0 case."

This cannot happen. The only possible return values of this function
are -ENOENT or 0.

In the case where we return -ENOENT, we go
straight to "unlock," which releases the two locks being held, but
does not disable preemption, and since err != 0, we do not call
preempt_enable().

In the case where we return 0, the works were queued successfully
and preemption was disabled, so after releasing the locks we fall
into the if branch, wake the stoppers, and call preempt_enable(),
which is correct.

(The -EDEADLK case never escapes the function: it jumps to "unlock"
before preemption has been disabled, spins until stop_cpus_in_progress
clears, and retries, so it does not affect the balance either.)

In all cases, the preempt_disable()/preempt_enable() calls remain
balanced.
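
To make this concrete, here is a small standalone model of the patched
control flow (a sketch with stubbed locks, queueing and wakeups; not
the kernel code itself). The stubbed preempt_disable()/preempt_enable()
only track a depth counter, so the balance can be checked on the
-ENOENT, -EDEADLK and success paths:

#include <assert.h>
#include <errno.h>
#include <stdio.h>

static int preempt_depth;               /* stands in for the preempt count */
static void preempt_disable(void) { preempt_depth++; }
static void preempt_enable(void)  { preempt_depth--; }

static int stoppers_enabled = 1;        /* models stopper1/2->enabled */
static int stop_cpus_in_progress;       /* models the global flag */

/* Control-flow model of the patched cpu_stop_queue_two_works(): locks,
 * queueing and wakeups are reduced to comments; only the error paths
 * and the preempt_disable()/preempt_enable() pairing are kept. */
static int queue_two_works_model(void)
{
        int err;

        /* raw_spin_lock_irq(&stopper1->lock); raw_spin_lock_nested(...); */
        err = -ENOENT;
        if (!stoppers_enabled)
                goto unlock;

        err = -EDEADLK;
        if (stop_cpus_in_progress)
                goto unlock;

        err = 0;
        /* __cpu_stop_queue_work() for stopper1 and stopper2 */
        preempt_disable();              /* keep queueing and waking atomic */
unlock:
        /* raw_spin_unlock(&stopper2->lock); raw_spin_unlock_irq(...); */

        if (err == -EDEADLK)
                return err;             /* the real code spins and retries */

        if (!err) {
                /* wake_up_q(&wakeq); */
                preempt_enable();
        }
        return err;
}

int main(void)
{
        stoppers_enabled = 0;                   /* -ENOENT path */
        queue_two_works_model();
        assert(preempt_depth == 0);

        stoppers_enabled = 1;
        stop_cpus_in_progress = 1;              /* -EDEADLK path */
        queue_two_works_model();
        assert(preempt_depth == 0);

        stop_cpus_in_progress = 0;              /* success path */
        queue_two_works_model();
        assert(preempt_depth == 0);

        printf("preempt depth balanced on all three paths\n");
        return 0;
}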

Thanks,
Isaac Manjarres

On 2018-07-23 23:23, Sebastian Andrzej Siewior wrote:
> On 2018-07-23 18:13:48 [-0700], [email protected] wrote:
>> Hi all,
> Hi,
>
>> Are there any comments about this patch?
>
> I haven't look in detail at this but your new preempt_disable() makes
> things unbalanced for the err != 0 case.
>
>> Thanks,
>> Isaac Manjarres
>
> Sebastian

Subject: [tip:sched/core] stop_machine: Disable preemption after queueing stopper threads

Commit-ID: 2610e88946632afb78aa58e61f11368ac4c0af7b
Gitweb: https://git.kernel.org/tip/2610e88946632afb78aa58e61f11368ac4c0af7b
Author: Isaac J. Manjarres <[email protected]>
AuthorDate: Tue, 17 Jul 2018 12:35:29 -0700
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 25 Jul 2018 11:25:08 +0200

stop_machine: Disable preemption after queueing stopper threads

This commit:

9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads")

does not fully address the race condition that can occur
as follows:

On one CPU, call it CPU 3, thread 1 invokes
cpu_stop_queue_two_works(2, 3,...), and the execution is such
that thread 1 queues the works for migration/2 and migration/3,
and is preempted after releasing the locks for migration/2 and
migration/3, but before waking the threads.

Then, on CPU 2, a kworker, call it thread 2, is running,
and it invokes cpu_stop_queue_two_works(1, 2,...), such that
thread 2 queues the works for migration/1 and migration/2.
Meanwhile, on CPU 3, thread 1 resumes execution, and wakes
migration/2 and migration/3. This means that when CPU 2
releases the locks for migration/1 and migration/2, but before
it wakes those threads, it can be preempted by migration/2.

If thread 2 is preempted by migration/2, then migration/2 will
execute the first work item successfully, since migration/3
was woken up by CPU 3, but when it goes to execute the second
work item, it disables preemption, calls multi_cpu_stop(),
and thus, CPU 2 will wait forever for migration/1, which should
have been woken up by thread 2. However, migration/1 cannot be
woken up by thread 2, because thread 2 is a kworker affine to
CPU 2, and CPU 2 is busy running migration/2 with preemption
disabled, so thread 2 will never run.

Disable preemption after queueing works for stopper threads
to ensure that the operation of queueing the works and waking
the stopper threads is atomic.

Co-Developed-by: Prasad Sodagudi <[email protected]>
Co-Developed-by: Pavankumar Kondeti <[email protected]>
Signed-off-by: Isaac J. Manjarres <[email protected]>
Signed-off-by: Prasad Sodagudi <[email protected]>
Signed-off-by: Pavankumar Kondeti <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Fixes: 9fb8d5dc4b64 ("stop_machine, Disable preemption when waking two stopper threads")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/stop_machine.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 1ff523dae6e2..e190d1ef3a23 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -260,6 +260,15 @@ retry:
err = 0;
__cpu_stop_queue_work(stopper1, work1, &wakeq);
__cpu_stop_queue_work(stopper2, work2, &wakeq);
+ /*
+ * The waking up of stopper threads has to happen
+ * in the same scheduling context as the queueing.
+ * Otherwise, there is a possibility of one of the
+ * above stoppers being woken up by another CPU,
+ * and preempting us. This will cause us to not
+ * wake up the other stopper forever.
+ */
+ preempt_disable();
unlock:
raw_spin_unlock(&stopper2->lock);
raw_spin_unlock_irq(&stopper1->lock);
@@ -271,7 +280,6 @@ unlock:
}

if (!err) {
- preempt_disable();
wake_up_q(&wakeq);
preempt_enable();
}

2018-07-30 10:23:08

by Thomas Gleixner

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

On Tue, 24 Jul 2018, Sebastian Andrzej Siewior wrote:
> On 2018-07-23 18:13:48 [-0700], [email protected] wrote:
> > Hi all,
> Hi,
>
> > Are there any comments about this patch?
>
> I haven't look in detail at this but your new preempt_disable() makes
> things unbalanced for the err != 0 case.

It doesn't but that code is really an unreadable pile of ...

2018-07-30 11:22:45

by Peter Zijlstra

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

On Mon, Jul 30, 2018 at 12:20:57PM +0200, Thomas Gleixner wrote:
> On Tue, 24 Jul 2018, Sebastian Andrzej Siewior wrote:
> > On 2018-07-23 18:13:48 [-0700], [email protected] wrote:
> > > Hi all,
> > Hi,
> >
> > > Are there any comments about this patch?
> >
> > I haven't look in detail at this but your new preempt_disable() makes
> > things unbalanced for the err != 0 case.
>
> It doesn't but that code is really an unreadable pile of ...

---
Subject: stop_machine: Reflow cpu_stop_queue_two_works()

The code flow in cpu_stop_queue_two_works() is a little arcane; fix
this by lifting the preempt_disable() to the top to create more natural
nesting wrt the spinlocks and make the wake_up_q() and preempt_enable()
unconditional at the end.

Furthermore, enable preemption in the -EDEADLK case, such that we
spin-wait with preemption enabled.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
kernel/stop_machine.c | 41 +++++++++++++++++++++++------------------
1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index e190d1ef3a23..34b6652e8677 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -236,13 +236,24 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
DEFINE_WAKE_Q(wakeq);
int err;
+
retry:
+ /*
+ * The waking up of stopper threads has to happen in the same
+ * scheduling context as the queueing. Otherwise, there is a
+ * possibility of one of the above stoppers being woken up by another
+ * CPU, and preempting us. This will cause us to not wake up the other
+ * stopper forever.
+ */
+ preempt_disable();
raw_spin_lock_irq(&stopper1->lock);
raw_spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);

- err = -ENOENT;
- if (!stopper1->enabled || !stopper2->enabled)
+ if (!stopper1->enabled || !stopper2->enabled) {
+ err = -ENOENT;
goto unlock;
+ }
+
/*
* Ensure that if we race with __stop_cpus() the stoppers won't get
* queued up in reverse order leading to system deadlock.
@@ -253,36 +264,30 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
* It can be falsely true but it is safe to spin until it is cleared,
* queue_stop_cpus_work() does everything under preempt_disable().
*/
- err = -EDEADLK;
- if (unlikely(stop_cpus_in_progress))
- goto unlock;
+ if (unlikely(stop_cpus_in_progress)) {
+ err = -EDEADLK;
+ goto unlock;
+ }

err = 0;
__cpu_stop_queue_work(stopper1, work1, &wakeq);
__cpu_stop_queue_work(stopper2, work2, &wakeq);
- /*
- * The waking up of stopper threads has to happen
- * in the same scheduling context as the queueing.
- * Otherwise, there is a possibility of one of the
- * above stoppers being woken up by another CPU,
- * and preempting us. This will cause us to not
- * wake up the other stopper forever.
- */
- preempt_disable();
+
unlock:
raw_spin_unlock(&stopper2->lock);
raw_spin_unlock_irq(&stopper1->lock);

if (unlikely(err == -EDEADLK)) {
+ preempt_enable();
+
while (stop_cpus_in_progress)
cpu_relax();
+
goto retry;
}

- if (!err) {
- wake_up_q(&wakeq);
- preempt_enable();
- }
+ wake_up_q(&wakeq);
+ preempt_enable();

return err;
}

2018-07-30 12:42:13

by Thomas Gleixner

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

On Mon, 30 Jul 2018, Peter Zijlstra wrote:

> On Mon, Jul 30, 2018 at 12:20:57PM +0200, Thomas Gleixner wrote:
> > On Tue, 24 Jul 2018, Sebastian Andrzej Siewior wrote:
> > > On 2018-07-23 18:13:48 [-0700], [email protected] wrote:
> > > > Hi all,
> > > Hi,
> > >
> > > > Are there any comments about this patch?
> > >
> > > I haven't look in detail at this but your new preempt_disable() makes
> > > things unbalanced for the err != 0 case.
> >
> > It doesn't but that code is really an unreadable pile of ...
>
> ---
> Subject: stop_machine: Reflow cpu_stop_queue_two_works()
>
> The code flow in cpu_stop_queue_two_works() is a little arcane; fix
> this by lifting the preempt_disable() to the top to create more natural
> nesting wrt the spinlocks and make the wake_up_q() and preempt_enable()
> unconditional at the end.
>
> Furthermore, enable preemption in the -EDEADLK case, such that we
> spin-wait with preemption enabled.
>
> Suggested-by: Thomas Gleixner <[email protected]>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>

Thanks for cleaning that up!

Reviewed-by: Thomas Gleixner <[email protected]>

2018-07-30 17:14:27

by Prasad Sodagudi

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

On 2018-07-30 05:41, Thomas Gleixner wrote:
> On Mon, 30 Jul 2018, Peter Zijlstra wrote:
>
>> On Mon, Jul 30, 2018 at 12:20:57PM +0200, Thomas Gleixner wrote:
>> > On Tue, 24 Jul 2018, Sebastian Andrzej Siewior wrote:
>> > > On 2018-07-23 18:13:48 [-0700], [email protected] wrote:
>> > > > Hi all,
>> > > Hi,
>> > >
>> > > > Are there any comments about this patch?
>> > >
>> > > I haven't look in detail at this but your new preempt_disable() makes
>> > > things unbalanced for the err != 0 case.
>> >
>> > It doesn't but that code is really an unreadable pile of ...
>>
>> ---
>> Subject: stop_machine: Reflow cpu_stop_queue_two_works()
>>
>> The code flow in cpu_stop_queue_two_works() is a little arcane; fix
>> this by lifting the preempt_disable() to the top to create more
>> natural
>> nesting wrt the spinlocks and make the wake_up_q() and
>> preempt_enable()
>> unconditional at the end.
>>
>> Furthermore, enable preemption in the -EDEADLK case, such that we
>> spin-wait with preemption enabled.
>>
>> Suggested-by: Thomas Gleixner <[email protected]>
>> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
>
Hi Peter/Thomas,

How about including below change as well? Currently, there is no way to
identify thread migrations completed or not. When we observe this
issue, the symptom was work queue lock up. It is better to have some
timeout here and induce the bug_on.

There is no way to identify the migration threads stuck or not.

--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -290,6 +290,7 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *
struct cpu_stop_done done;
struct cpu_stop_work work1, work2;
struct multi_stop_data msdata;
+ int ret;

msdata = (struct multi_stop_data){
.fn = fn,
@@ -312,7 +313,10 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2, cpu_stop_fn_t fn, void *
if (cpu_stop_queue_two_works(cpu1, &work1, cpu2, &work2))
return -ENOENT;

- wait_for_completion(&done.completion);
+ ret = wait_for_completion_timeout(&done.completion, msecs_to_jiffies(1000));
+ if (!ret)
+ BUG_ON(1);
+


> Thanks for cleaning that up!
>
> Reviewed-by: Thomas Gleixner <[email protected]>


2018-07-30 17:18:15

by Thomas Gleixner

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

On Mon, 30 Jul 2018, Sodagudi Prasad wrote:
> How about including below change as well? Currently, there is no way to

That would be a completely separate change.

> identify thread migrations completed or not. When we observe this issue, the
> symptom was work queue lock up. It is better to have some timeout here and
> induce the bug_on.

BUG_ON() is wrong. Why kill the thing if there is at least a theoretical
chance that stuff can continue half baken so you can get more info out of
it. The back trace is pretty much uninteresting.

Thanks,

tglx

2018-07-30 21:08:39

by Peter Zijlstra

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

On Mon, Jul 30, 2018 at 10:12:43AM -0700, Sodagudi Prasad wrote:
> How about including below change as well? Currently, there is no way to
> identify thread migrations completed or not. When we observe this issue,
> the symptom was work queue lock up. It is better to have some timeout here
> and induce the bug_on.

You'd trigger the soft-lockup or hung-task detector I think. And if not,
we ought to look at making it trigger at least one of those.

> There is no way to identify the migration threads stuck or not.

Should be pretty obvious from the splat generated by the above, no?

> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -290,6 +290,7 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2,
> cpu_stop_fn_t fn, void *
> struct cpu_stop_done done;
> struct cpu_stop_work work1, work2;
> struct multi_stop_data msdata;
> + int ret;
>
> msdata = (struct multi_stop_data){
> .fn = fn,
> @@ -312,7 +313,10 @@ int stop_two_cpus(unsigned int cpu1, unsigned int cpu2,
> cpu_stop_fn_t fn, void *
> if (cpu_stop_queue_two_works(cpu1, &work1, cpu2, &work2))
> return -ENOENT;
>
> - wait_for_completion(&done.completion);
> + ret = wait_for_completion_timeout(&done.completion,
> msecs_to_jiffies(1000));
> + if (!ret)
> + BUG_ON(1);
> +

That's a random timeout, which if you spuriously trigger it, will take
down your machine. That seems like a cure worse than the disease.

2018-08-01 08:08:07

by Prasad Sodagudi

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

On 2018-07-30 14:07, Peter Zijlstra wrote:
> On Mon, Jul 30, 2018 at 10:12:43AM -0700, Sodagudi Prasad wrote:
>> How about including below change as well? Currently, there is no way
>> to
>> identify thread migrations completed or not. When we observe this
>> issue,
>> the symptom was work queue lock up. It is better to have some timeout
>> here
>> and induce the bug_on.
>
> You'd trigger the soft-lockup or hung-task detector I think. And if
> not,
> we ought to look at making it trigger at least one of those.
>
>> There is no way to identify the migration threads stuck or not.
>
> Should be pretty obvious from the splat generated by the above, no?
Hi Peter and Thomas,

Thanks for your support.
I have another question about this flow and the retry mechanism in the
cpu_stop_queue_two_works() function that uses the global variable
stop_cpus_in_progress.

This variable is used in various paths, such as task migration,
setting task affinity, and CPU hotplug.

For example, in the CPU hotplug path, the stop_cpus_in_progress
variable is set to true without any check:
takedown_cpu()
-stop_machine_cpuslocked()
--stop_cpus()
---__stop_cpus()
----queue_stop_cpus_work()
which sets stop_cpus_in_progress to true directly.

But the stop_cpus_in_progress variable is only used for retry in the
task migration path.

I am thinking that the stop_cpus_in_progress variable could lead to
race conditions when CPU hotplug and task migration happen
simultaneously. Please correct me if my understanding is wrong.

-Thanks, Prasad

>
>> --- a/kernel/stop_machine.c
>> +++ b/kernel/stop_machine.c
>> @@ -290,6 +290,7 @@ int stop_two_cpus(unsigned int cpu1, unsigned int
>> cpu2,
>> cpu_stop_fn_t fn, void *
>> struct cpu_stop_done done;
>> struct cpu_stop_work work1, work2;
>> struct multi_stop_data msdata;
>> + int ret;
>>
>> msdata = (struct multi_stop_data){
>> .fn = fn,
>> @@ -312,7 +313,10 @@ int stop_two_cpus(unsigned int cpu1, unsigned int
>> cpu2,
>> cpu_stop_fn_t fn, void *
>> if (cpu_stop_queue_two_works(cpu1, &work1, cpu2, &work2))
>> return -ENOENT;
>>
>> - wait_for_completion(&done.completion);
>> + ret = wait_for_completion_timeout(&done.completion,
>> msecs_to_jiffies(1000));
>> + if (!ret)
>> + BUG_ON(1);
>> +
>
> That's a random timeout, which if you spuriously trigger it, will take
> down your machine. That seems like a cure worse than the disease.


Subject: [tip:sched/core] stop_machine: Reflow cpu_stop_queue_two_works()

Commit-ID: 2171ce2d470d6e389ebbef3edd22c7643918a02f
Gitweb: https://git.kernel.org/tip/2171ce2d470d6e389ebbef3edd22c7643918a02f
Author: Peter Zijlstra <[email protected]>
AuthorDate: Mon, 30 Jul 2018 13:21:40 +0200
Committer: Thomas Gleixner <[email protected]>
CommitDate: Thu, 2 Aug 2018 14:02:53 +0200

stop_machine: Reflow cpu_stop_queue_two_works()

The code flow in cpu_stop_queue_two_works() is a little arcane; fix this by
lifting the preempt_disable() to the top to create more natural nesting wrt
the spinlocks and make the wake_up_q() and preempt_enable() unconditional
at the end.

Furthermore, enable preemption in the -EDEADLK case, such that we spin-wait
with preemption enabled.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/stop_machine.c | 41 +++++++++++++++++++++++------------------
1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index e190d1ef3a23..34b6652e8677 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -236,13 +236,24 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
DEFINE_WAKE_Q(wakeq);
int err;
+
retry:
+ /*
+ * The waking up of stopper threads has to happen in the same
+ * scheduling context as the queueing. Otherwise, there is a
+ * possibility of one of the above stoppers being woken up by another
+ * CPU, and preempting us. This will cause us to not wake up the other
+ * stopper forever.
+ */
+ preempt_disable();
raw_spin_lock_irq(&stopper1->lock);
raw_spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);

- err = -ENOENT;
- if (!stopper1->enabled || !stopper2->enabled)
+ if (!stopper1->enabled || !stopper2->enabled) {
+ err = -ENOENT;
goto unlock;
+ }
+
/*
* Ensure that if we race with __stop_cpus() the stoppers won't get
* queued up in reverse order leading to system deadlock.
@@ -253,36 +264,30 @@ retry:
* It can be falsely true but it is safe to spin until it is cleared,
* queue_stop_cpus_work() does everything under preempt_disable().
*/
- err = -EDEADLK;
- if (unlikely(stop_cpus_in_progress))
- goto unlock;
+ if (unlikely(stop_cpus_in_progress)) {
+ err = -EDEADLK;
+ goto unlock;
+ }

err = 0;
__cpu_stop_queue_work(stopper1, work1, &wakeq);
__cpu_stop_queue_work(stopper2, work2, &wakeq);
- /*
- * The waking up of stopper threads has to happen
- * in the same scheduling context as the queueing.
- * Otherwise, there is a possibility of one of the
- * above stoppers being woken up by another CPU,
- * and preempting us. This will cause us to not
- * wake up the other stopper forever.
- */
- preempt_disable();
+
unlock:
raw_spin_unlock(&stopper2->lock);
raw_spin_unlock_irq(&stopper1->lock);

if (unlikely(err == -EDEADLK)) {
+ preempt_enable();
+
while (stop_cpus_in_progress)
cpu_relax();
+
goto retry;
}

- if (!err) {
- wake_up_q(&wakeq);
- preempt_enable();
- }
+ wake_up_q(&wakeq);
+ preempt_enable();

return err;
}

Subject: [tip:sched/core] stop_machine: Reflow cpu_stop_queue_two_works()

Commit-ID: b80a2bfce85e1051056d98d04ecb2d0b55cbbc1c
Gitweb: https://git.kernel.org/tip/b80a2bfce85e1051056d98d04ecb2d0b55cbbc1c
Author: Peter Zijlstra <[email protected]>
AuthorDate: Mon, 30 Jul 2018 13:21:40 +0200
Committer: Thomas Gleixner <[email protected]>
CommitDate: Thu, 2 Aug 2018 15:25:20 +0200

stop_machine: Reflow cpu_stop_queue_two_works()

The code flow in cpu_stop_queue_two_works() is a little arcane; fix this by
lifting the preempt_disable() to the top to create more natural nesting wrt
the spinlocks and make the wake_up_q() and preempt_enable() unconditional
at the end.

Furthermore, enable preemption in the -EDEADLK case, such that we spin-wait
with preemption enabled.

Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lkml.kernel.org/r/[email protected]
---
kernel/stop_machine.c | 41 +++++++++++++++++++++++------------------
1 file changed, 23 insertions(+), 18 deletions(-)

diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index e190d1ef3a23..34b6652e8677 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -236,13 +236,24 @@ static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
DEFINE_WAKE_Q(wakeq);
int err;
+
retry:
+ /*
+ * The waking up of stopper threads has to happen in the same
+ * scheduling context as the queueing. Otherwise, there is a
+ * possibility of one of the above stoppers being woken up by another
+ * CPU, and preempting us. This will cause us to not wake up the other
+ * stopper forever.
+ */
+ preempt_disable();
raw_spin_lock_irq(&stopper1->lock);
raw_spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);

- err = -ENOENT;
- if (!stopper1->enabled || !stopper2->enabled)
+ if (!stopper1->enabled || !stopper2->enabled) {
+ err = -ENOENT;
goto unlock;
+ }
+
/*
* Ensure that if we race with __stop_cpus() the stoppers won't get
* queued up in reverse order leading to system deadlock.
@@ -253,36 +264,30 @@ retry:
* It can be falsely true but it is safe to spin until it is cleared,
* queue_stop_cpus_work() does everything under preempt_disable().
*/
- err = -EDEADLK;
- if (unlikely(stop_cpus_in_progress))
- goto unlock;
+ if (unlikely(stop_cpus_in_progress)) {
+ err = -EDEADLK;
+ goto unlock;
+ }

err = 0;
__cpu_stop_queue_work(stopper1, work1, &wakeq);
__cpu_stop_queue_work(stopper2, work2, &wakeq);
- /*
- * The waking up of stopper threads has to happen
- * in the same scheduling context as the queueing.
- * Otherwise, there is a possibility of one of the
- * above stoppers being woken up by another CPU,
- * and preempting us. This will cause us to not
- * wake up the other stopper forever.
- */
- preempt_disable();
+
unlock:
raw_spin_unlock(&stopper2->lock);
raw_spin_unlock_irq(&stopper1->lock);

if (unlikely(err == -EDEADLK)) {
+ preempt_enable();
+
while (stop_cpus_in_progress)
cpu_relax();
+
goto retry;
}

- if (!err) {
- wake_up_q(&wakeq);
- preempt_enable();
- }
+ wake_up_q(&wakeq);
+ preempt_enable();

return err;
}

2018-08-06 08:38:50

by Pavankumar Kondeti

Subject: Re: [PATCH] stop_machine: Disable preemption after queueing stopper threads

Hi Prasad,

On Wed, Aug 01, 2018 at 01:07:03AM -0700, Sodagudi Prasad wrote:
> On 2018-07-30 14:07, Peter Zijlstra wrote:
> >On Mon, Jul 30, 2018 at 10:12:43AM -0700, Sodagudi Prasad wrote:
> >>How about including below change as well? Currently, there is
> >>no way to
> >>identify thread migrations completed or not. When we observe
> >>this issue,
> >>the symptom was work queue lock up. It is better to have some
> >>timeout here
> >>and induce the bug_on.
> >
> >You'd trigger the soft-lockup or hung-task detector I think. And
> >if not,
> >we ought to look at making it trigger at least one of those.
> >
> >>There is no way to identify the migration threads stuck or not.
> >
> >Should be pretty obvious from the splat generated by the above, no?
> Hi Peter and Thomas,
>
> Thanks for your support.
> I have another question about this flow and the retry mechanism in
> the cpu_stop_queue_two_works() function that uses the global variable
> stop_cpus_in_progress.
>
> This variable is used in various paths, such as task migration,
> setting task affinity, and CPU hotplug.
>
> For example, in the CPU hotplug path, the stop_cpus_in_progress
> variable is set to true without any check:
> takedown_cpu()
> -stop_machine_cpuslocked()
> --stop_cpus()
> ---__stop_cpus()
> ----queue_stop_cpus_work()
> which sets stop_cpus_in_progress to true directly.
>
> But the stop_cpus_in_progress variable is only used for retry in
> the task migration path.
>
> I am thinking that the stop_cpus_in_progress variable could lead to
> race conditions when CPU hotplug and task migration happen
> simultaneously. Please correct me if my understanding is wrong.
>

The stop_cpus_in_progress variable is there to guard against out-of-order
queueing. The stopper locks do not protect against this when
cpu_stop_queue_two_works() and stop_cpus() are executing in parallel.

The stop_one_cpu{,_nowait}() functions are called to handle affinity
changes and load balancing. Since they queue work on only one CPU, the
stop_cpus_in_progress protection is not needed there.
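
As an aside, the handshake can be sketched with a minimal standalone
model (pthreads and C11 atomics; an illustration only, not the kernel
code). It shows just the raise-the-flag / spin-and-retry idiom; in the
kernel the ordering guarantee additionally relies on the per-CPU
stopper locks and on queue_stop_cpus_work() running with preemption
disabled:

#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

static atomic_bool stop_cpus_in_progress;
static pthread_mutex_t log_lock = PTHREAD_MUTEX_INITIALIZER;

static void queue_work(const char *who, int cpu)
{
        pthread_mutex_lock(&log_lock);
        printf("%s: queued work for cpu %d\n", who, cpu);
        pthread_mutex_unlock(&log_lock);
}

/* Models queue_stop_cpus_work(): queue on every CPU with the flag raised. */
static void *stop_all_cpus(void *unused)
{
        (void)unused;
        atomic_store(&stop_cpus_in_progress, true);
        for (int cpu = 0; cpu < NR_CPUS; cpu++)
                queue_work("stop_cpus", cpu);
        atomic_store(&stop_cpus_in_progress, false);
        return NULL;
}

/* Models the retry in cpu_stop_queue_two_works(): back off while the
 * all-CPU queueing is in flight, then queue the pair of works. */
static void *stop_two_cpus_path(void *unused)
{
        (void)unused;
retry:
        if (atomic_load(&stop_cpus_in_progress)) {
                sched_yield();          /* cpu_relax() stand-in */
                goto retry;
        }
        queue_work("stop_two_cpus", 1);
        queue_work("stop_two_cpus", 2);
        return NULL;
}

int main(void)
{
        pthread_t a, b;

        pthread_create(&a, NULL, stop_all_cpus, NULL);
        pthread_create(&b, NULL, stop_two_cpus_path, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
}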

Thanks,
Pavan