Subject: [PATCH] sched: Fix get_push_task() vs migrate_disable()

push_rt_task() attempts to move the currently running task away if the
next runnable task has migration disabled and therefore is pinned on the
current CPU.

The current task is retrieved via get_push_task() which only checks for
nr_cpus_allowed == 1, but does not check whether the task has migration
disabled and therefore cannot be moved either. The consequence is a
pointless invocation of the migration thread which correctly observes
that the task cannot be moved.

Return NULL if the task has migration disabled and cannot be moved to
another CPU.

Fixes: a7c81556ec4d3 ("sched: Fix migrate_disable() vs rt/dl balancing")
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
 kernel/sched/sched.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index e205b63d6db07..32a4945730a9b 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2259,6 +2259,9 @@ static inline struct task_struct *get_push_task(struct rq *rq)
 	if (p->nr_cpus_allowed == 1)
 		return NULL;

+	if (p->migration_disabled)
+		return NULL;
+
 	rq->push_busy = true;
 	return get_task_struct(p);
 }
--
2.33.0
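
For readers who want to see the effect of the hunk outside the kernel tree, the following is a minimal userspace sketch of get_push_task() with the added check. It is not the kernel implementation: the struct layouts, the plain integer "usage" counter standing in for the task reference count, and the simplified get_task_struct() are assumptions made for illustration, and locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (assumptions, not kernel code). */
struct task_struct {
	int nr_cpus_allowed;               /* number of CPUs the task may run on */
	unsigned short migration_disabled; /* non-zero while migrate_disable() is active */
	int usage;                         /* models the task reference count */
};

struct rq {
	struct task_struct *curr;          /* task currently running on this runqueue */
	bool push_busy;                    /* a push of curr is already in flight */
};

/* Simplified model: take a reference on the task and return it. */
static struct task_struct *get_task_struct(struct task_struct *p)
{
	p->usage++;
	return p;
}

/* Model of get_push_task() after the patch. */
static struct task_struct *get_push_task(struct rq *rq)
{
	struct task_struct *p;

	assert(rq != NULL && rq->curr != NULL);
	p = rq->curr;

	/* Pinned by affinity: there is nowhere to push the task to. */
	if (p->nr_cpus_allowed == 1)
		return NULL;

	/* Added by the patch: pinned by migrate_disable(), cannot move either. */
	if (p->migration_disabled)
		return NULL;

	rq->push_busy = true;
	return get_task_struct(p);
}
```

In this model a current task with migration disabled makes get_push_task() return NULL without setting push_busy or taking a reference, so a caller has nothing to hand to the migration thread.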


2021-08-26 16:27:18

by Tao Zhou

Subject: Re: [PATCH] sched: Fix get_push_task() vs migrate_disable()

Hi Sebastian,

On Thu, Aug 26, 2021 at 03:37:38PM +0200, Sebastian Andrzej Siewior wrote:

> push_rt_task() attempts to move the currently running task away if the
> next runnable task has migration disabled and therefore is pinned on the
> current CPU.
>
> The current task is retrieved via get_push_task() which only checks for
> nr_cpus_allowed == 1, but does not check whether the task has migration
> disabled and therefore cannot be moved either. The consequence is a
> pointless invocation of the migration thread which correctly observes
> that the task cannot be moved.
>
> Return NULL if the task has migration disabled and cannot be moved to
> another CPU.
>
> Fixes: a7c81556ec4d3 ("sched: Fix migrate_disable() vs rt/dl balancing")
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> kernel/sched/sched.h | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index e205b63d6db07..32a4945730a9b 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -2259,6 +2259,9 @@ static inline struct task_struct *get_push_task(struct rq *rq)
>  	if (p->nr_cpus_allowed == 1)
>  		return NULL;
>
> +	if (p->migration_disabled)
> +		return NULL;

Not much I can restore here..

Would is_migration_disabled(p) be the more correct way to check for
disabled migration? Also, pull_rt_task() already checks for disabled
migration before it calls get_push_task(). That means the check this
patch adds to get_push_task() is a redundant second check.

>  	rq->push_busy = true;
>  	return get_task_struct(p);
>  }
> --
> 2.33.0
>



Thanks,
Tao

2021-08-26 17:02:42

by Peter Zijlstra

Subject: Re: [PATCH] sched: Fix get_push_task() vs migrate_disable()

On Thu, Aug 26, 2021 at 03:37:38PM +0200, Sebastian Andrzej Siewior wrote:
> push_rt_task() attempts to move the currently running task away if the
> next runnable task has migration disabled and therefore is pinned on the
> current CPU.
>
> The current task is retrieved via get_push_task() which only checks for
> nr_cpus_allowed == 1, but does not check whether the task has migration
> disabled and therefore cannot be moved either. The consequence is a
> pointless invocation of the migration thread which correctly observes
> that the task cannot be moved.
>
> Return NULL if the task has migration disabled and cannot be moved to
> another CPU.
>
> Fixes: a7c81556ec4d3 ("sched: Fix migrate_disable() vs rt/dl balancing")
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>

Thanks!

2021-08-26 17:12:11

by tip-bot2 for Jacob Pan

Subject: [tip: sched/urgent] sched: Fix get_push_task() vs migrate_disable()

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID: e681dcbaa4b284454fecd09617f8b24231448446
Gitweb: https://git.kernel.org/tip/e681dcbaa4b284454fecd09617f8b24231448446
Author: Sebastian Andrzej Siewior <[email protected]>
AuthorDate: Thu, 26 Aug 2021 15:37:38 +02:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Thu, 26 Aug 2021 19:02:00 +02:00

sched: Fix get_push_task() vs migrate_disable()

push_rt_task() attempts to move the currently running task away if the
next runnable task has migration disabled and therefore is pinned on the
current CPU.

The current task is retrieved via get_push_task() which only checks for
nr_cpus_allowed == 1, but does not check whether the task has migration
disabled and therefore cannot be moved either. The consequence is a
pointless invocation of the migration thread which correctly observes
that the task cannot be moved.

Return NULL if the task has migration disabled and cannot be moved to
another CPU.

Fixes: a7c81556ec4d3 ("sched: Fix migrate_disable() vs rt/dl balancing")
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
 kernel/sched/sched.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index da4295f..ddefb04 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2255,6 +2255,9 @@ static inline struct task_struct *get_push_task(struct rq *rq)
 	if (p->nr_cpus_allowed == 1)
 		return NULL;

+	if (p->migration_disabled)
+		return NULL;
+
 	rq->push_busy = true;
 	return get_task_struct(p);
 }

2021-08-26 19:40:46

by Thomas Gleixner

Subject: Re: [PATCH] sched: Fix get_push_task() vs migrate_disable()

Tao,

On Fri, Aug 27 2021 at 00:24, Tao Zhou wrote:
> On Thu, Aug 26, 2021 at 03:37:38PM +0200, Sebastian Andrzej Siewior wrote:
>> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
>> index e205b63d6db07..32a4945730a9b 100644
>> --- a/kernel/sched/sched.h
>> +++ b/kernel/sched/sched.h
>> @@ -2259,6 +2259,9 @@ static inline struct task_struct *get_push_task(struct rq *rq)
>>  	if (p->nr_cpus_allowed == 1)
>>  		return NULL;
>>
>> +	if (p->migration_disabled)
>> +		return NULL;
>
> Not much I can restore here..
>
> Would is_migration_disabled(p) be the more correct way to check for
> disabled migration?

Kinda, but it's not an issue here because get_push_task() is only available when
CONFIG_SMP=y which makes p->migration_disabled available as well.

> Also, pull_rt_task() already checks for disabled migration before it
> calls get_push_task(). That means the check this patch adds to
> get_push_task() is a redundant second check.

No. The checks are for two different tasks...

Thanks,

tglx
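
To illustrate the two points in the reply above, here is a small userspace sketch. It assumes an accessor shaped like is_migration_disabled() that compiles to a constant false when CONFIG_SMP is not set, and it uses a hypothetical helper, can_push_current_away(), to show that one check applies to the next runnable task while the other applies to the currently running task. The struct layout and the helper are assumptions for illustration only, not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CONFIG_SMP 1 /* assumption: models a CONFIG_SMP=y build */

struct task_struct {
	int nr_cpus_allowed;
#ifdef CONFIG_SMP
	unsigned short migration_disabled; /* field only exists on SMP builds */
#endif
};

/* Accessor that stays valid even when the field is compiled out. */
static inline bool is_migration_disabled(struct task_struct *p)
{
#ifdef CONFIG_SMP
	return p->migration_disabled != 0;
#else
	(void)p;
	return false;
#endif
}

/*
 * Hypothetical helper: "next" is the task that should run but is pinned
 * by migrate_disable(); "curr" is the task occupying the CPU. Pushing
 * curr away only makes sense if curr itself is able to move.
 */
static bool can_push_current_away(struct task_struct *curr,
				  struct task_struct *next)
{
	assert(curr != NULL && next != NULL);

	/* First check, on the next task: is it stuck on this CPU? */
	if (!is_migration_disabled(next))
		return false;

	/* Second check, on the current task: the one the patch adds. */
	if (curr->nr_cpus_allowed == 1 || is_migration_disabled(curr))
		return false;

	return true;
}
```

Because the direct field access in get_push_task() is only compiled with CONFIG_SMP=y, it behaves the same as the accessor there; the accessor mainly matters for code that is also built without SMP.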

2021-08-26 22:56:39

by Tao Zhou

Subject: Re: [PATCH] sched: Fix get_push_task() vs migrate_disable()

Hi Thomas,

On Thu, Aug 26, 2021 at 09:38:17PM +0200, Thomas Gleixner wrote:
> Tao,
>
> On Fri, Aug 27 2021 at 00:24, Tao Zhou wrote:
> > On Thu, Aug 26, 2021 at 03:37:38PM +0200, Sebastian Andrzej Siewior wrote:
> >> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> >> index e205b63d6db07..32a4945730a9b 100644
> >> --- a/kernel/sched/sched.h
> >> +++ b/kernel/sched/sched.h
> >> @@ -2259,6 +2259,9 @@ static inline struct task_struct *get_push_task(struct rq *rq)
> >>  	if (p->nr_cpus_allowed == 1)
> >>  		return NULL;
> >>
> >> +	if (p->migration_disabled)
> >> +		return NULL;
> >
> > Not much I can restore here..
> >
> > Would is_migration_disabled(p) be the more correct way to check for
> > disabled migration?
>
> Kinda, but it's not an issue here because get_push_task() is only available when
> CONFIG_SMP=y which makes p->migration_disabled available as well.
>
> > Also, pull_rt_task() already checks for disabled migration before it
> > calls get_push_task(). That means the check this patch adds to
> > get_push_task() is a redundant second check.
>
> No. The checks are for two different tasks...

Aha, yes. I got lost here. Thanks for your reply.

> Thanks,
>
> tglx



Thanks,
Tao