2018-04-24 09:39:19

by Gaurav Kohli

Subject: [PATCH] kthread/smpboot: Serialize kthread parking against wakeup

The control CPU thread that initiates hotplug calls kthread_park() on
the hotplug thread, which sets KTHREAD_SHOULD_PARK. The control thread
then wakes up the hotplug thread. There is a chance that the wakeup
code sees the hotplug thread (running on an AP core) in
TASK_INTERRUPTIBLE state, but sets its state to TASK_RUNNING only after
the hotplug thread has entered kthread_parkme() and changed its state
to TASK_PARKED. This can result in a panic later in kthread_unpark(),
which sees the KTHREAD_IS_PARKED flag set but fails to rebind the
kthread because it is not in TASK_PARKED state. Fix this by serializing
the wakeup state change against the state change made before parking
the kthread.

Below is the possible race:

Control thread                          Hotplug thread

kthread_park()
set KTHREAD_SHOULD_PARK
                                        smpboot_thread_fn
                                        set_current_state(TASK_INTERRUPTIBLE);
                                        kthread_parkme

wake_up_process()

raw_spin_lock_irqsave(&p->pi_lock, flags);
if (!(p->state & state)) -> this will fail
        goto out;

                                        __kthread_parkme
                                        __set_current_state(TASK_PARKED);

if (p->on_rq && ttwu_remote(p, wake_flags))
    ttwu_remote()
    p->state = TASK_RUNNING;
                                        schedule();

So to avoid this race, take pi_lock to serialize the state changes.
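
The interleaving in the diagram can be replayed on a single thread as a
rough illustration. This is only a userspace sketch: the enum values and
the function name replay_unserialized_race() are made up for this example
and are not kernel definitions, and the steps follow the diagram exactly
as drawn rather than the full smpboot_thread_fn() code.

```c
#include <assert.h>

/* Simplified task states; illustrative values, not the kernel's. */
enum task_state { ST_RUNNING, ST_INTERRUPTIBLE, ST_PARKED };

/*
 * Replay the interleaving from the diagram, step by step.
 * Returns the state the hotplug thread is left in.
 */
static enum task_state replay_unserialized_race(void)
{
	enum task_state state;
	int waker_passed_check;

	/* Hotplug thread: set_current_state(TASK_INTERRUPTIBLE) */
	state = ST_INTERRUPTIBLE;

	/* Control thread: wake_up_process() tests the state under pi_lock */
	waker_passed_check = (state == ST_INTERRUPTIBLE);
	assert(waker_passed_check);	/* so it does not "goto out" */

	/* Hotplug thread: __kthread_parkme() sets TASK_PARKED without pi_lock */
	state = ST_PARKED;

	/* Control thread: ttwu_remote() acts on the now stale test result */
	if (waker_passed_check)
		state = ST_RUNNING;

	return state;
}
```

Calling replay_unserialized_race() returns ST_RUNNING rather than
ST_PARKED, which corresponds to the condition that kthread_unpark() later
trips over.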

Suggested-by: Pavankumar Kondeti <[email protected]>
Co-developed-by: Neeraj Upadhyay <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
Signed-off-by: Gaurav Kohli <[email protected]>

diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 1650578..514b232 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -121,7 +121,9 @@ static int smpboot_thread_fn(void *data)
 		}

 		if (kthread_should_park()) {
+			raw_spin_lock(&current->pi_lock);
 			__set_current_state(TASK_RUNNING);
+			raw_spin_unlock(&current->pi_lock);
 			preempt_enable();
 			if (ht->park && td->status == HP_THREAD_ACTIVE) {
 				BUG_ON(td->cpu != smp_processor_id());
--
1.9.1



2018-04-24 14:44:21

by Gaurav Kohli

Subject: Re: [PATCH] kthread/smpboot: Serialize kthread parking against wakeup

Hi,

We can also fix the race described below in the smpboot code itself:

@@ -109,7 +109,6 @@ static int smpboot_thread_fn(void *data)
        struct smp_hotplug_thread *ht = td->ht;

        while (1) {
-               set_current_state(TASK_INTERRUPTIBLE);
                preempt_disable();
                if (kthread_should_stop()) {
                        __set_current_state(TASK_RUNNING);
@@ -157,6 +156,7 @@ static int smpboot_thread_fn(void *data)
                if (!ht->thread_should_run(td->cpu)) {
                        preempt_enable_no_resched();
+                       set_current_state(TASK_INTERRUPTIBLE);
                        schedule();
                } else {
                        __set_current_state(TASK_RUNNING);

Please let me know whether this approach is better.

Regards

Gaurav

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


2018-04-24 18:06:39

by Peter Zijlstra

Subject: Re: [PATCH] kthread/smpboot: Serialize kthread parking against wakeup

On Tue, Apr 24, 2018 at 08:12:49PM +0530, Kohli, Gaurav wrote:
> @@ -157,6 +156,7 @@ static int smpboot_thread_fn(void *data)
>                 if (!ht->thread_should_run(td->cpu)) {
>                         preempt_enable_no_resched();
> +                       set_current_state(TASK_INTERRUPTIBLE);
>                         schedule();
>                 } else {
>                         __set_current_state(TASK_RUNNING);
>
> Please suggest if this approach is better.

Bah, my brain isn't working... see below for the 'correct' version of
your second patch.

But this violates the normal pattern; see the comment near
set_current_state(). That pattern ensures the thread either sees the
wakeup condition or the actual wakeup.

I'm thinking that with this patch there is a scenario where we'll miss
both the kthread_should_park() and the actual wakeup and end up not
doing anything.
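
To make the "miss both" concern concrete, here is a small userspace model
of the generic sleep/wakeup pattern. It is only a sketch under stated
assumptions: sleeper_stuck(), count_stuck() and the two-state enum are
invented for this illustration, the waker is reduced to "publish the
condition, then wake", and nothing here models smpboot_thread_fn() itself.
It enumerates every interleaving of three sleeper steps and two waker
steps and counts those where the sleeper blocks with nobody left to wake
it.

```c
#include <assert.h>

enum { ST_RUNNING, ST_SLEEPING };

/*
 * Replay one interleaving. Bit i of waker_slots set means slot i (0..4)
 * runs the next waker step, otherwise the next sleeper step.
 * state_before_check != 0: sleeper sets its state, then checks the condition.
 * state_before_check == 0: sleeper checks the condition, then sets its state.
 * Returns 1 if the sleeper is left blocked after the waker has finished.
 */
static int sleeper_stuck(unsigned waker_slots, int state_before_check)
{
	int state = ST_RUNNING, cond = 0;
	int s_step = 0, w_step = 0, done = 0, blocked = 0;

	for (int slot = 0; slot < 5; slot++) {
		if (waker_slots & (1u << slot)) {
			if (w_step++ == 0) {
				cond = 1;		/* publish the condition */
			} else if (state == ST_SLEEPING) {
				state = ST_RUNNING;	/* the actual wakeup */
				blocked = 0;
			}
			continue;
		}

		int sets_state = state_before_check ? (s_step == 0) : (s_step == 1);

		if (done) {
			/* sleeper already saw the condition and moved on */
		} else if (s_step == 2) {
			blocked = (state == ST_SLEEPING);	/* schedule() */
		} else if (sets_state) {
			state = ST_SLEEPING;
		} else if (cond) {
			state = ST_RUNNING;			/* saw the condition */
			done = 1;
		}
		s_step++;
	}
	assert(w_step == 2 && s_step == 3);
	return blocked;
}

/* Count the interleavings (two waker slots out of five) that end stuck. */
static int count_stuck(int state_before_check)
{
	int stuck = 0;

	for (unsigned mask = 0; mask < 32; mask++) {
		unsigned bits = 0;

		for (unsigned m = mask; m; m >>= 1)
			bits += m & 1;
		if (bits == 2)
			stuck += sleeper_stuck(mask, state_before_check);
	}
	return stuck;
}
```

In this model count_stuck(1) is 0 and count_stuck(0) is 1: when the state
is set only after the condition check, the interleaving "check, publish,
wake, set state, schedule" leaves the sleeper blocked even though the
condition is already true.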

I do like the end result, but I suspect it's buggy.


---
 kernel/smpboot.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 5043e7433f4b..5bdf57f2ce68 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -109,10 +109,8 @@ static int smpboot_thread_fn(void *data)
 	struct smp_hotplug_thread *ht = td->ht;

 	while (1) {
-		set_current_state(TASK_INTERRUPTIBLE);
 		preempt_disable();
 		if (kthread_should_stop()) {
-			__set_current_state(TASK_RUNNING);
 			preempt_enable();
 			/* cleanup must mirror setup */
 			if (ht->cleanup && td->status != HP_THREAD_NONE)
@@ -122,7 +120,6 @@ static int smpboot_thread_fn(void *data)
 		}

 		if (kthread_should_park()) {
-			__set_current_state(TASK_RUNNING);
 			preempt_enable();
 			if (ht->park && td->status == HP_THREAD_ACTIVE) {
 				BUG_ON(td->cpu != smp_processor_id());
@@ -139,7 +136,6 @@ static int smpboot_thread_fn(void *data)
 		/* Check for state change setup */
 		switch (td->status) {
 		case HP_THREAD_NONE:
-			__set_current_state(TASK_RUNNING);
 			preempt_enable();
 			if (ht->setup)
 				ht->setup(td->cpu);
@@ -147,7 +143,6 @@ static int smpboot_thread_fn(void *data)
 			continue;

 		case HP_THREAD_PARKED:
-			__set_current_state(TASK_RUNNING);
 			preempt_enable();
 			if (ht->unpark)
 				ht->unpark(td->cpu);
@@ -156,10 +151,10 @@ static int smpboot_thread_fn(void *data)
 		}

 		if (!ht->thread_should_run(td->cpu)) {
+			set_current_state(TASK_IDLE);
 			preempt_enable_no_resched();
 			schedule();
 		} else {
-			__set_current_state(TASK_RUNNING);
 			preempt_enable();
 			ht->thread_fn(td->cpu);
 		}

2018-04-24 18:29:09

by Peter Zijlstra

Subject: Re: [PATCH] kthread/smpboot: Serialize kthread parking against wakeup

On Tue, Apr 24, 2018 at 02:58:25PM +0530, Gaurav Kohli wrote:
> Below is the possible race:
>
> Control thread                          Hotplug thread
>
> kthread_park()
> set KTHREAD_SHOULD_PARK
>                                         smpboot_thread_fn
>                                         set_current_state(TASK_INTERRUPTIBLE);
>                                         kthread_parkme
>
> wake_up_process()
>
> raw_spin_lock_irqsave(&p->pi_lock, flags);
> if (!(p->state & state)) -> this will fail
>         goto out;
>
>                                         __kthread_parkme
>                                         __set_current_state(TASK_PARKED);
>
> if (p->on_rq && ttwu_remote(p, wake_flags))
>     ttwu_remote()
>     p->state = TASK_RUNNING;
>                                         schedule();
>
> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> index 1650578..514b232 100644
> --- a/kernel/smpboot.c
> +++ b/kernel/smpboot.c
> @@ -121,7 +121,9 @@ static int smpboot_thread_fn(void *data)
>  		}
>
>  		if (kthread_should_park()) {
> +			raw_spin_lock(&current->pi_lock);
>  			__set_current_state(TASK_RUNNING);
> +			raw_spin_unlock(&current->pi_lock);
>  			preempt_enable();
>  			if (ht->park && td->status == HP_THREAD_ACTIVE) {
>  				BUG_ON(td->cpu != smp_processor_id());

Note how in your scenario above you didn't actually need the
TASK_RUNNING state; so how is this change going to fix anything?

But yes, I suspect it is right, but it definitely needs a comment
explaining wth we take that lock there.

Like I said earlier, my brain is entirely fried for the day; but I'll
have a try tomorrow.

2018-04-24 18:47:45

by Gaurav Kohli

Subject: Re: [PATCH] kthread/smpboot: Serialize kthread parking against wakeup

On 4/24/2018 11:56 PM, Peter Zijlstra wrote:

> Note how in your scenario above you didn't actually need the
> TASK_RUNNING state; so how is this change going to fix anything?

Hi Peter,

With this change, if the kthread_should_park() path runs first, the
controller's wake_up_process() call exits early because the task is already
marked as running. Otherwise, if the controller runs first, we block here,
then set TASK_RUNNING and only afterwards set TASK_PARKED.

So there is no chance of the cpuhp thread being set to running during the
kthread_parkme() call.
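
The two orderings described above can be checked with a small userspace
model. This is a sketch of the argument, not a proof about the kernel: the
names run_schedule() and count_unparked() and the step labels are
hypothetical, and pi_lock is modelled only as "the thread's TASK_RUNNING
store cannot land between the waker's state test and its TASK_RUNNING
store".

```c
#include <assert.h>
#include <stddef.h>

enum task_state { ST_RUNNING, ST_INTERRUPTIBLE, ST_PARKED };

/*
 * Run one schedule: 't' runs the next hotplug-thread step, 'w' the next
 * waker step. Thread steps: T1 sets RUNNING, T2 sets PARKED (lockless).
 * Waker steps: W1 tests the state, W2 sets RUNNING if the test passed.
 * Returns 1 if the thread ends up parked, 0 if not, and -1 if the
 * schedule cannot happen when T1 is serialized against W1..W2.
 */
static int run_schedule(const char *sched, int serialize)
{
	enum task_state state = ST_INTERRUPTIBLE;
	int t_step = 0, w_step = 0, waker_passed = 0;

	for (; *sched; sched++) {
		assert(*sched == 't' || *sched == 'w');
		if (*sched == 't') {
			if (t_step == 0) {
				if (serialize && w_step == 1)
					return -1;	/* waker holds the lock */
				state = ST_RUNNING;	/* T1 */
			} else {
				state = ST_PARKED;	/* T2 */
			}
			t_step++;
		} else {
			if (w_step == 0)
				waker_passed = (state == ST_INTERRUPTIBLE); /* W1 */
			else if (waker_passed)
				state = ST_RUNNING;	/* W2 */
			w_step++;
		}
	}
	return state == ST_PARKED;
}

/* Count the schedules that leave the thread in a state other than PARKED. */
static int count_unparked(int serialize)
{
	static const char *const schedules[] = {
		"ttww", "twtw", "twwt", "wttw", "wtwt", "wwtt",
	};
	int bad = 0;

	for (size_t i = 0; i < sizeof(schedules) / sizeof(schedules[0]); i++)
		if (run_schedule(schedules[i], serialize) == 0)
			bad++;
	return bad;
}
```

In this model count_unparked(0) is 1 (the schedule "wttw", where the wakeup
store lands after TASK_PARKED) and count_unparked(1) is 0, which matches
the reasoning above under the stated assumptions.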

But as we discussed, this can be fixed by the second patch as well. Once you
have had time to look at it, please let us know. Or do you want me to test
your second patch first?

>
> But yes, I suspect it is right, but it definitely needs a comment
> explaining wth we take that lock there.
>
> Like I said earlier, my brain is entirely fried for the day; but I'll
> have a try tomorrow.
>