The control CPU thread that initiates hotplug calls kthread_park() for
the hotplug thread, which sets KTHREAD_SHOULD_PARK, and then wakes up
the hotplug thread. The wakeup code can see the hotplug thread (running
on the AP core) in TASK_INTERRUPTIBLE state, but set its state to
TASK_RUNNING only after the hotplug thread has entered kthread_parkme()
and changed its state to TASK_PARKED. This can later cause a panic in
kthread_unpark(), which sees the KTHREAD_IS_PARKED flag set but fails to
rebind the kthread because it is not in TASK_PARKED state. Fix this by
serializing the wakeup state change against the state change made
before parking the kthread.

Below is the possible race:

Control thread                                  Hotplug Thread

kthread_park()
  set KTHREAD_SHOULD_PARK
                                                smpboot_thread_fn
                                                set_current_state(TASK_INTERRUPTIBLE);
                                                kthread_parkme

wake_up_process()

raw_spin_lock_irqsave(&p->pi_lock, flags);
if (!(p->state & state)) -> this will fail
        goto out;

                                                __kthread_parkme
                                                __set_current_state(TASK_PARKED);

if (p->on_rq && ttwu_remote(p, wake_flags))
  ttwu_remote()
  p->state = TASK_RUNNING;
                                                schedule();

So to avoid this race, take pi_lock to serialize the state changes.

Suggested-by: Pavankumar Kondeti <[email protected]>
Co-developed-by: Neeraj Upadhyay <[email protected]>
Signed-off-by: Neeraj Upadhyay <[email protected]>
Signed-off-by: Gaurav Kohli <[email protected]>
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 1650578..514b232 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -121,7 +121,9 @@ static int smpboot_thread_fn(void *data)
 		}
 
 		if (kthread_should_park()) {
+			raw_spin_lock(&current->pi_lock);
 			__set_current_state(TASK_RUNNING);
+			raw_spin_unlock(&current->pi_lock);
 			preempt_enable();
 			if (ht->park && td->status == HP_THREAD_ACTIVE) {
 				BUG_ON(td->cpu != smp_processor_id());
--
1.9.1
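
The interleaving described in the commit message can be replayed deterministically in a small userspace model. This is only a sketch under stated assumptions: `struct model_task`, `waker_check()`, `waker_commit()`, `parkme()` and `replay_reported_race()` are hypothetical names, the enum values are arbitrary, and the real wakeup and park paths do far more than these few assignments.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel task states; values are arbitrary. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_PARKED };

/* Hypothetical model of the hotplug kthread, not a kernel structure. */
struct model_task {
	enum task_state state;
	bool should_park;	/* stands in for KTHREAD_SHOULD_PARK */
	bool is_parked;		/* stands in for KTHREAD_IS_PARKED */
};

/* Control thread: the state check at the top of the wakeup path. */
static bool waker_check(const struct model_task *p)
{
	return p->state == TASK_INTERRUPTIBLE;
}

/* Control thread: ttwu_remote() marking the task runnable. */
static void waker_commit(struct model_task *p)
{
	p->state = TASK_RUNNING;
}

/* Hotplug thread: entering the park path (simplified). */
static void parkme(struct model_task *p)
{
	assert(p->should_park);	/* parking is only entered on request */
	p->state = TASK_PARKED;
	p->is_parked = true;
}

/* Replay the steps in the order shown in the race diagram. */
struct model_task replay_reported_race(void)
{
	struct model_task p = { .state = TASK_RUNNING };
	bool wake;

	p.should_park = true;		/* control: kthread_park() */
	p.state = TASK_INTERRUPTIBLE;	/* hotplug: set_current_state() */
	wake = waker_check(&p);		/* control: the check passes */
	parkme(&p);			/* hotplug: TASK_PARKED */
	if (wake)
		waker_commit(&p);	/* control: acts on the stale check */
	return p;
}
```

Calling `replay_reported_race()` leaves the model with `is_parked` set while `state` is `TASK_RUNNING`, which is the inconsistent combination the commit message says kthread_unpark() trips over.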
Hi,

We can also fix the race below in the smpboot code:

@@ -109,7 +109,6 @@ static int smpboot_thread_fn(void *data)
 	struct smp_hotplug_thread *ht = td->ht;
 
 	while (1) {
-		set_current_state(TASK_INTERRUPTIBLE);
 		preempt_disable();
 		if (kthread_should_stop()) {
 			__set_current_state(TASK_RUNNING);
@@ -157,6 +156,7 @@ static int smpboot_thread_fn(void *data)
 
 		if (!ht->thread_should_run(td->cpu)) {
 			preempt_enable_no_resched();
+			set_current_state(TASK_INTERRUPTIBLE);
 			schedule();
 		} else {
 			__set_current_state(TASK_RUNNING);

Please suggest if this approach is better.
Regards
Gaurav
On 4/24/2018 2:58 PM, Gaurav Kohli wrote:
> The control CPU thread that initiates hotplug calls kthread_park() for
> the hotplug thread, which sets KTHREAD_SHOULD_PARK, and then wakes up
> the hotplug thread. The wakeup code can see the hotplug thread (running
> on the AP core) in TASK_INTERRUPTIBLE state, but set its state to
> TASK_RUNNING only after the hotplug thread has entered kthread_parkme()
> and changed its state to TASK_PARKED. This can later cause a panic in
> kthread_unpark(), which sees the KTHREAD_IS_PARKED flag set but fails to
> rebind the kthread because it is not in TASK_PARKED state. Fix this by
> serializing the wakeup state change against the state change made
> before parking the kthread.
>
> Below is the possible race:
>
> Control thread                                  Hotplug Thread
>
> kthread_park()
>   set KTHREAD_SHOULD_PARK
>                                                 smpboot_thread_fn
>                                                 set_current_state(TASK_INTERRUPTIBLE);
>                                                 kthread_parkme
>
> wake_up_process()
>
> raw_spin_lock_irqsave(&p->pi_lock, flags);
> if (!(p->state & state)) -> this will fail
>         goto out;
>
>                                                 __kthread_parkme
>                                                 __set_current_state(TASK_PARKED);
>
> if (p->on_rq && ttwu_remote(p, wake_flags))
>   ttwu_remote()
>   p->state = TASK_RUNNING;
>                                                 schedule();
>
> So to avoid this race, take pi_lock to serialize the state changes.
>
> Suggested-by: Pavankumar Kondeti <[email protected]>
> Co-developed-by: Neeraj Upadhyay <[email protected]>
> Signed-off-by: Neeraj Upadhyay <[email protected]>
> Signed-off-by: Gaurav Kohli <[email protected]>
>
> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> index 1650578..514b232 100644
> --- a/kernel/smpboot.c
> +++ b/kernel/smpboot.c
> @@ -121,7 +121,9 @@ static int smpboot_thread_fn(void *data)
>  		}
>
>  		if (kthread_should_park()) {
> +			raw_spin_lock(&current->pi_lock);
>  			__set_current_state(TASK_RUNNING);
> +			raw_spin_unlock(&current->pi_lock);
>  			preempt_enable();
>  			if (ht->park && td->status == HP_THREAD_ACTIVE) {
>  				BUG_ON(td->cpu != smp_processor_id());
>
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.
On Tue, Apr 24, 2018 at 08:12:49PM +0530, Kohli, Gaurav wrote:
> @@ -157,6 +156,7 @@ static int smpboot_thread_fn(void *data)
>
>  		if (!ht->thread_should_run(td->cpu)) {
>  			preempt_enable_no_resched();
> +			set_current_state(TASK_INTERRUPTIBLE);
>  			schedule();
>  		} else {
>  			__set_current_state(TASK_RUNNING);
>
> Please suggest if this approach is better.
Bah, my brain isn't working... see below for the 'correct' version of
your second patch.
But this violates the normal pattern; see the comment near
set_current_state(). That pattern ensures the thread either sees the
wakeup condition or the actual wakeup.
I'm thinking that with this patch there is a scenario where we'll miss
both the kthread_should_park() and the actual wakeup and end up not
doing anything.
I do like the end result, but I suspect it's buggy.
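
The concern about missing both the condition and the wakeup can be illustrated with a deterministic userspace model of the two orderings. This is a sketch, not kernel code: `struct model`, `waker()` and `sleeper_gets_stuck()` are hypothetical names, the waker is treated as a single indivisible step, and `condition` merely stands in for something like kthread_should_park().

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel task states; values are arbitrary. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct model {
	enum task_state state;
	bool condition;		/* e.g. kthread_should_park() */
	bool asleep;
};

/* Waker: publish the condition, then wake the task if it is not running. */
static void waker(struct model *m)
{
	m->condition = true;
	if (m->state != TASK_RUNNING) {
		m->state = TASK_RUNNING;
		m->asleep = false;
	}
}

/*
 * Run one pass of the sleeper. The waker runs just before sleeper step
 * waker_slot (0, 1 or 2), or after the last step when waker_slot is 3.
 * Returns true if the sleeper ends up asleep although the condition is set.
 */
bool sleeper_gets_stuck(bool state_before_check, int waker_slot)
{
	struct model m = { .state = TASK_RUNNING };
	bool saw_condition = false;

	assert(waker_slot >= 0 && waker_slot <= 3);

	for (int step = 0; step <= 3; step++) {
		if (step == waker_slot)
			waker(&m);

		switch (step) {
		case 0:
			if (state_before_check)
				m.state = TASK_INTERRUPTIBLE;
			else
				saw_condition = m.condition;
			break;
		case 1:
			if (state_before_check)
				saw_condition = m.condition;
			else if (!saw_condition)
				m.state = TASK_INTERRUPTIBLE;
			break;
		case 2:
			/* schedule(): only blocks if the state is not RUNNING */
			if (!saw_condition && m.state != TASK_RUNNING)
				m.asleep = true;
			break;
		default:
			break;
		}
	}
	return m.asleep && m.condition;
}
```

In this model, setting the state before checking the condition never leaves the sleeper stuck for any waker position, whereas checking first and setting the state afterwards gets stuck when the waker runs between the two steps (`sleeper_gets_stuck(false, 1)`).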
---
 kernel/smpboot.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 5043e7433f4b..5bdf57f2ce68 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -109,10 +109,8 @@ static int smpboot_thread_fn(void *data)
 	struct smp_hotplug_thread *ht = td->ht;
 
 	while (1) {
-		set_current_state(TASK_INTERRUPTIBLE);
 		preempt_disable();
 		if (kthread_should_stop()) {
-			__set_current_state(TASK_RUNNING);
 			preempt_enable();
 			/* cleanup must mirror setup */
 			if (ht->cleanup && td->status != HP_THREAD_NONE)
@@ -122,7 +120,6 @@ static int smpboot_thread_fn(void *data)
 		}
 
 		if (kthread_should_park()) {
-			__set_current_state(TASK_RUNNING);
 			preempt_enable();
 			if (ht->park && td->status == HP_THREAD_ACTIVE) {
 				BUG_ON(td->cpu != smp_processor_id());
@@ -139,7 +136,6 @@ static int smpboot_thread_fn(void *data)
 		/* Check for state change setup */
 		switch (td->status) {
 		case HP_THREAD_NONE:
-			__set_current_state(TASK_RUNNING);
 			preempt_enable();
 			if (ht->setup)
 				ht->setup(td->cpu);
@@ -147,7 +143,6 @@ static int smpboot_thread_fn(void *data)
 			continue;
 
 		case HP_THREAD_PARKED:
-			__set_current_state(TASK_RUNNING);
 			preempt_enable();
 			if (ht->unpark)
 				ht->unpark(td->cpu);
@@ -156,10 +151,10 @@ static int smpboot_thread_fn(void *data)
 		}
 
 		if (!ht->thread_should_run(td->cpu)) {
+			set_current_state(TASK_IDLE);
 			preempt_enable_no_resched();
 			schedule();
 		} else {
-			__set_current_state(TASK_RUNNING);
 			preempt_enable();
 			ht->thread_fn(td->cpu);
 		}

On Tue, Apr 24, 2018 at 02:58:25PM +0530, Gaurav Kohli wrote:
> The control CPU thread that initiates hotplug calls kthread_park() for
> the hotplug thread, which sets KTHREAD_SHOULD_PARK, and then wakes up
> the hotplug thread. The wakeup code can see the hotplug thread (running
> on the AP core) in TASK_INTERRUPTIBLE state, but set its state to
> TASK_RUNNING only after the hotplug thread has entered kthread_parkme()
> and changed its state to TASK_PARKED. This can later cause a panic in
> kthread_unpark(), which sees the KTHREAD_IS_PARKED flag set but fails to
> rebind the kthread because it is not in TASK_PARKED state. Fix this by
> serializing the wakeup state change against the state change made
> before parking the kthread.
>
> Below is the possible race:
>
> Control thread                                  Hotplug Thread
>
> kthread_park()
>   set KTHREAD_SHOULD_PARK
>                                                 smpboot_thread_fn
>                                                 set_current_state(TASK_INTERRUPTIBLE);
>                                                 kthread_parkme
>
> wake_up_process()
>
> raw_spin_lock_irqsave(&p->pi_lock, flags);
> if (!(p->state & state)) -> this will fail
>         goto out;
>
>                                                 __kthread_parkme
>                                                 __set_current_state(TASK_PARKED);
>
> if (p->on_rq && ttwu_remote(p, wake_flags))
>   ttwu_remote()
>   p->state = TASK_RUNNING;
>                                                 schedule();
>
> So to avoid this race, take pi_lock to serialize the state changes.
>
> Suggested-by: Pavankumar Kondeti <[email protected]>
> Co-developed-by: Neeraj Upadhyay <[email protected]>
> Signed-off-by: Neeraj Upadhyay <[email protected]>
> Signed-off-by: Gaurav Kohli <[email protected]>
>
> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
> index 1650578..514b232 100644
> --- a/kernel/smpboot.c
> +++ b/kernel/smpboot.c
> @@ -121,7 +121,9 @@ static int smpboot_thread_fn(void *data)
>  		}
>
>  		if (kthread_should_park()) {
> +			raw_spin_lock(&current->pi_lock);
>  			__set_current_state(TASK_RUNNING);
> +			raw_spin_unlock(&current->pi_lock);
>  			preempt_enable();
>  			if (ht->park && td->status == HP_THREAD_ACTIVE) {
>  				BUG_ON(td->cpu != smp_processor_id());
Note how in your scenario above you didn't actually need the
TASK_RUNNING state; so how is this change going to fix anything?
But yes, I suspect it is right, but it definitely needs a comment
explaining wth we take that lock there.
Like I said earlier, my brain is entirely fried for the day; but I'll
have a try tomorrow.
On 4/24/2018 11:56 PM, Peter Zijlstra wrote:
> On Tue, Apr 24, 2018 at 02:58:25PM +0530, Gaurav Kohli wrote:
>> The control CPU thread that initiates hotplug calls kthread_park() for
>> the hotplug thread, which sets KTHREAD_SHOULD_PARK, and then wakes up
>> the hotplug thread. The wakeup code can see the hotplug thread (running
>> on the AP core) in TASK_INTERRUPTIBLE state, but set its state to
>> TASK_RUNNING only after the hotplug thread has entered kthread_parkme()
>> and changed its state to TASK_PARKED. This can later cause a panic in
>> kthread_unpark(), which sees the KTHREAD_IS_PARKED flag set but fails to
>> rebind the kthread because it is not in TASK_PARKED state. Fix this by
>> serializing the wakeup state change against the state change made
>> before parking the kthread.
>>
>> Below is the possible race:
>>
>> Control thread                                  Hotplug Thread
>>
>> kthread_park()
>>   set KTHREAD_SHOULD_PARK
>>                                                 smpboot_thread_fn
>>                                                 set_current_state(TASK_INTERRUPTIBLE);
>>                                                 kthread_parkme
>>
>> wake_up_process()
>>
>> raw_spin_lock_irqsave(&p->pi_lock, flags);
>> if (!(p->state & state)) -> this will fail
>>         goto out;
>>
>>                                                 __kthread_parkme
>>                                                 __set_current_state(TASK_PARKED);
>>
>> if (p->on_rq && ttwu_remote(p, wake_flags))
>>   ttwu_remote()
>>   p->state = TASK_RUNNING;
>>                                                 schedule();
>>
>> So to avoid this race, take pi_lock to serialize the state changes.
>>
>> Suggested-by: Pavankumar Kondeti <[email protected]>
>> Co-developed-by: Neeraj Upadhyay <[email protected]>
>> Signed-off-by: Neeraj Upadhyay <[email protected]>
>> Signed-off-by: Gaurav Kohli <[email protected]>
>>
>> diff --git a/kernel/smpboot.c b/kernel/smpboot.c
>> index 1650578..514b232 100644
>> --- a/kernel/smpboot.c
>> +++ b/kernel/smpboot.c
>> @@ -121,7 +121,9 @@ static int smpboot_thread_fn(void *data)
>>  		}
>>
>>  		if (kthread_should_park()) {
>> +			raw_spin_lock(&current->pi_lock);
>>  			__set_current_state(TASK_RUNNING);
>> +			raw_spin_unlock(&current->pi_lock);
>>  			preempt_enable();
>>  			if (ht->park && td->status == HP_THREAD_ACTIVE) {
>>  				BUG_ON(td->cpu != smp_processor_id());
> Note how in your scenario above you didn't actually need the
> TASK_RUNNING state; so how is this change going to fix anything?

Hi Peter,

With this change, if the kthread_should_park() path runs first, the
controller's wake_up call exits early because the task is already marked
running. If the controller runs first instead, we block on pi_lock here,
then set TASK_RUNNING and afterwards TASK_PARKED. Either way, there is
no chance of the cpuhp thread being set to running during the
kthread_parkme() call.

As we discussed, this can also be fixed by the 2nd patch. Once you have
had time to look at it, please let us know. Or would you like me to test
your 2nd patch first?

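
The two orderings argued above can be written out as a small deterministic model. It is a sketch under strong assumptions: each pi_lock critical section is treated as one indivisible step, `wake_locked()` and `replay_with_lock()` are hypothetical names, and the model says nothing about anything else the real wakeup path does, so it illustrates the argument rather than proving the patch correct.

```c
#include <assert.h>

/* Illustrative stand-ins for the kernel task states; values are arbitrary. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_PARKED };

/* Control thread: whole wakeup (check plus state write) under pi_lock. */
static void wake_locked(enum task_state *state)
{
	if (*state != TASK_INTERRUPTIBLE)
		return;		/* the "goto out" case */
	*state = TASK_RUNNING;
}

/*
 * Hotplug thread with the proposed lock. waker_slot selects where the
 * control thread's critical section runs:
 *   0: before the hotplug thread takes pi_lock
 *   1: after it drops pi_lock, before TASK_PARKED is written
 *   2: after TASK_PARKED is written
 */
enum task_state replay_with_lock(int waker_slot)
{
	/* set_current_state(TASK_INTERRUPTIBLE) at the top of the loop */
	enum task_state state = TASK_INTERRUPTIBLE;

	assert(waker_slot >= 0 && waker_slot <= 2);

	if (waker_slot == 0)
		wake_locked(&state);
	/* lock pi_lock; __set_current_state(TASK_RUNNING); unlock */
	state = TASK_RUNNING;
	if (waker_slot == 1)
		wake_locked(&state);
	/* __kthread_parkme() */
	state = TASK_PARKED;
	if (waker_slot == 2)
		wake_locked(&state);
	return state;
}
```

For every slot the model ends in `TASK_PARKED`: a wakeup that passes its check must complete before the hotplug thread can take the lock, and a wakeup that starts later finds the task no longer in `TASK_INTERRUPTIBLE` and exits.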
>
> But yes, I suspect it is right, but it definitely needs a comment
> explaining wth we take that lock there.
>
> Like I said earlier, my brain is entirely fried for the day; but I'll
> have a try tomorrow.
>