2020-02-20 07:30:10

by chenqiwu

Subject: [PATCH] sched/fair: add !se->on_rq check before dequeue entity

From: chenqiwu <[email protected]>

We ignore checking the !se->on_rq condition before dequeuing an
entity from its cfs_rq. This check is required in case the entity
has already been dequeued.

Signed-off-by: chenqiwu <[email protected]>
---
kernel/sched/fair.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 3c8a379..945dcaf 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5341,6 +5341,8 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 	bool was_sched_idle = sched_idle_rq(rq);
 
 	for_each_sched_entity(se) {
+		if (!se->on_rq)
+			break;
 		cfs_rq = cfs_rq_of(se);
 		dequeue_entity(cfs_rq, se, flags);
 
--
1.9.1


2020-02-20 09:39:28

by Vincent Guittot

Subject: Re: [PATCH] sched/fair: add !se->on_rq check before dequeue entity

On Thu, 20 Feb 2020 at 08:29, <[email protected]> wrote:
>
> From: chenqiwu <[email protected]>
>
> We ignore checking the !se->on_rq condition before dequeuing an
> entity from its cfs_rq. This check is required in case the entity
> has already been dequeued.

Do you have a use case that triggers this situation?

The only way to reach this situation seems to be dequeuing a
task on a throttled cfs_rq.
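
For context, a simplified sketch of what throttle_cfs_rq() does with the
group entities (paraphrased from kernel/sched/fair.c of that era, not a
verbatim copy):

static void throttle_cfs_rq(struct cfs_rq *cfs_rq)
{
	struct rq *rq = rq_of(cfs_rq);
	struct sched_entity *se = cfs_rq->tg->se[cpu_of(rq)];
	...
	for_each_sched_entity(se) {
		struct cfs_rq *qcfs_rq = cfs_rq_of(se);

		/* stop once we hit an entity that is already dequeued */
		if (!se->on_rq)
			break;

		/* dequeue the group entity from its parent cfs_rq */
		dequeue_entity(qcfs_rq, se, DEQUEUE_SLEEP);
		...
	}
	...
	cfs_rq->throttled = 1;
	...
}

The tasks queued on the throttled cfs_rq stay enqueued there (their
se->on_rq stays set); only the group entities above it are dequeued.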

>
> Signed-off-by: chenqiwu <[email protected]>
> ---
> kernel/sched/fair.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 3c8a379..945dcaf 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5341,6 +5341,8 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
> bool was_sched_idle = sched_idle_rq(rq);
>
> for_each_sched_entity(se) {
> + if (!se->on_rq)
> + break;
> cfs_rq = cfs_rq_of(se);
> dequeue_entity(cfs_rq, se, flags);
>
> --
> 1.9.1
>

2020-02-20 10:10:59

by chenqiwu

Subject: Re: [PATCH] sched/fair: add !se->on_rq check before dequeue entity

On Thu, Feb 20, 2020 at 10:38:02AM +0100, Vincent Guittot wrote:
> On Thu, 20 Feb 2020 at 08:29, <[email protected]> wrote:
> >
> > From: chenqiwu <[email protected]>
> >
> > We ignore checking the !se->on_rq condition before dequeuing an
> > entity from its cfs_rq. This check is required in case the entity
> > has already been dequeued.
>
> Do you have a use case that triggers this situation?
>
> The only way to reach this situation seems to be dequeuing a
> task on a throttled cfs_rq.
>
Sorry, I have no use case that triggers this situation; it was just
found by reading the code.
I agree that the situation you mentioned above may race with
dequeue_task_fair() via the following code path:
__schedule
  pick_next_task_fair
    put_prev_entity
      check_cfs_rq_runtime
        throttle_cfs_rq
          dequeue_entity

So this check is worth adding to dequeue_task_fair().
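
For reference, the relevant bits of that path look roughly like this
(a paraphrased sketch, not a verbatim copy of the kernel code):

static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
{
	/* 'prev' is still queued here, so update its runtime stats first */
	if (prev->on_rq)
		update_curr(cfs_rq);

	/* throttle cfs_rqs exceeding runtime */
	check_cfs_rq_runtime(cfs_rq);
	...
}

static bool check_cfs_rq_runtime(struct cfs_rq *cfs_rq)
{
	if (!cfs_bandwidth_used())
		return false;

	if (likely(!cfs_rq->runtime_enabled || cfs_rq->runtime_remaining > 0))
		return false;

	/* already throttled, nothing more to do */
	if (cfs_rq_throttled(cfs_rq))
		return true;

	/* this walks up the hierarchy and dequeues the group entities */
	throttle_cfs_rq(cfs_rq);
	return true;
}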

2020-02-20 10:33:06

by Vincent Guittot

Subject: Re: [PATCH] sched/fair: add !se->on_rq check before dequeue entity

On Thu, 20 Feb 2020 at 11:09, chenqiwu <[email protected]> wrote:
>
> On Thu, Feb 20, 2020 at 10:38:02AM +0100, Vincent Guittot wrote:
> > On Thu, 20 Feb 2020 at 08:29, <[email protected]> wrote:
> > >
> > > From: chenqiwu <[email protected]>
> > >
> > > We ignore checking the !se->on_rq condition before dequeuing an
> > > entity from its cfs_rq. This check is required in case the entity
> > > has already been dequeued.
> >
> > Do you have a use case that triggers this situation?
> >
> > The only way to reach this situation seems to be dequeuing a
> > task on a throttled cfs_rq.
> >
> Sorry, I have no use case that triggers this situation; it was just
> found by reading the code.
> I agree that the situation you mentioned above may race with
> dequeue_task_fair() via the following code path:
> __schedule
>   pick_next_task_fair
>     put_prev_entity
>       check_cfs_rq_runtime
>         throttle_cfs_rq
>           dequeue_entity
>
> So this check is worth adding to dequeue_task_fair().

In fact the check is already done, thanks to the
if (cfs_rq_throttled(cfs_rq)) test.
AFAICT, there is no other way to enqueue a task on a cfs_rq for which
the group entity is not enqueued.
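
To illustrate, the dequeue loop in dequeue_task_fair() already looks
roughly like this (simplified sketch, not a verbatim copy):

static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
{
	struct cfs_rq *cfs_rq;
	struct sched_entity *se = &p->se;
	...
	for_each_sched_entity(se) {
		cfs_rq = cfs_rq_of(se);
		dequeue_entity(cfs_rq, se, flags);
		...
		/* end evaluation on encountering a throttled cfs_rq */
		if (cfs_rq_throttled(cfs_rq))
			break;
		...
	}
	...
}

So when a task sitting on a throttled cfs_rq is dequeued, the walk stops
at that cfs_rq and never reaches the group entity that throttle_cfs_rq()
has already dequeued.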

2020-02-20 12:16:05

by Vincent Guittot

Subject: Re: [PATCH] sched/fair: add !se->on_rq check before dequeue entity

On Thu, 20 Feb 2020 at 11:31, Vincent Guittot
<[email protected]> wrote:
>
> On Thu, 20 Feb 2020 at 11:09, chenqiwu <[email protected]> wrote:
> >
> > On Thu, Feb 20, 2020 at 10:38:02AM +0100, Vincent Guittot wrote:
> > > On Thu, 20 Feb 2020 at 08:29, <[email protected]> wrote:
> > > >
> > > > From: chenqiwu <[email protected]>
> > > >
> > > > We ignore checking the !se->on_rq condition before dequeuing an
> > > > entity from its cfs_rq. This check is required in case the entity
> > > > has already been dequeued.
> > >
> > > Do you have a use case that triggers this situation?
> > >
> > > The only way to reach this situation seems to be dequeuing a
> > > task on a throttled cfs_rq.
> > >
> > Sorry, I have no use case that triggers this situation; it was just
> > found by reading the code.
> > I agree that the situation you mentioned above may race with
> > dequeue_task_fair() via the following code path:
> > __schedule
> >   pick_next_task_fair
> >     put_prev_entity
> >       check_cfs_rq_runtime
> >         throttle_cfs_rq
> >           dequeue_entity
> >
> > So this check is worth adding to dequeue_task_fair().
>
> In fact the check is already done, thanks to the
> if (cfs_rq_throttled(cfs_rq)) test.
> AFAICT, there is no other way to enqueue a task on a cfs_rq for which
> the group entity is not enqueued.

Hmm, I have been too quick in my reply. I wanted to say:
AFAICT, there is no other way to dequeue a task from a cfs_rq for
which the group entity is not enqueued.

2020-02-20 14:44:00

by chenqiwu

Subject: Re: [PATCH] sched/fair: add !se->on_rq check before dequeue entity

>
> Hmm, I have been too quick in my reply. I wanted to say:
> AFAICT, there is no other way to dequeue a task from a cfs_rq for
> which the group entity is not enqueued.

But we should note the potential racy paths that call deactivate_task().
For example:
One path dequeues a task from its cfs_rq, called from schedule():
__schedule
  deactivate_task
    dequeue_task
      dequeue_task_fair

Another path tries to migrate the same task to a CPU on the preferred node:
numa_migrate_preferred
  task_numa_migrate
    migrate_swap
      stop_two_cpus
        migrate_swap_stop
          __migrate_swap_task
            deactivate_task
              dequeue_task_fair

There could be a race if the task is dequeued from its cfs_rq in parallel.
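
For reference, the migration side of that path looks roughly like this
(a paraphrased sketch of __migrate_swap_task() in kernel/sched/core.c,
not a verbatim copy):

static void __migrate_swap_task(struct task_struct *p, int cpu)
{
	if (task_on_rq_queued(p)) {
		struct rq *src_rq, *dst_rq;
		...
		src_rq = task_rq(p);
		dst_rq = cpu_rq(cpu);

		/* take the task off its current runqueue ... */
		deactivate_task(src_rq, p, 0);
		set_task_cpu(p, cpu);
		/* ... and put it on the destination runqueue */
		activate_task(dst_rq, p, 0);
		check_preempt_curr(dst_rq, p, 0);
		...
	} else {
		...
	}
}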