2019-08-07 17:49:52

by Valentin Schneider

[permalink] [raw]
Subject: [PATCH 2/3] sched/fair: Prevent active LB from preempting higher sched classes

The CFS load balancer can cause the cpu_stopper to run a function to
try and steal a rq's currently running task. However, it so happens
that while only CFS tasks will ever be migrated by that function, we
can end up preempting higher sched class tasks, since it is executed
by the cpu_stopper.

I don't expect this to be exceedingly common: we still need to have
had a busiest group in the first place, which needs

busiest->sum_nr_running != 0

which is a cfs.h_nr_running sum, so we should have something to pull,
but if we fail to pull anything and the remote rq is executing
an RT/DL task we can hit this.

Add an extra check to not trigger the cpu_stopper if the remote
rq's running task isn't CFS.

Signed-off-by: Valentin Schneider <[email protected]>
---
kernel/sched/fair.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b56b8edee3d3..79bd6ead589c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8834,6 +8834,10 @@ static inline enum alb_status active_load_balance(struct lb_env *env)

raw_spin_lock_irqsave(&busiest->lock, flags);

+ /* Make sure we're not about to stop a task from a higher sched class */
+ if (busiest->curr->sched_class != &fair_sched_class)
+ goto unlock;
+
/*
* Don't kick the active_load_balance_cpu_stop, if the curr task on
* busiest CPU can't be moved to dst_cpu:
--
2.22.0


2019-08-08 09:26:02

by Qais Yousef

[permalink] [raw]
Subject: Re: [PATCH 2/3] sched/fair: Prevent active LB from preempting higher sched classes

On 08/07/19 18:40, Valentin Schneider wrote:
> The CFS load balancer can cause the cpu_stopper to run a function to
> try and steal a rq's currently running task. However, it so happens
> that while only CFS tasks will ever be migrated by that function, we
> can end up preempting higher sched class tasks, since it is executed
> by the cpu_stopper.
>
> I don't expect this to be exceedingly common: we still need to have
> had a busiest group in the first place, which needs
>
> busiest->sum_nr_running != 0
>
> which is a cfs.h_nr_running sum, so we should have something to pull,
> but if we fail to pull anything and the remote rq is executing
> an RT/DL task we can hit this.
>
> Add an extra check to not trigger the cpu_stopper if the remote
> rq's running task isn't CFS.
>
> Signed-off-by: Valentin Schneider <[email protected]>
> ---
> kernel/sched/fair.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index b56b8edee3d3..79bd6ead589c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8834,6 +8834,10 @@ static inline enum alb_status active_load_balance(struct lb_env *env)
>
> raw_spin_lock_irqsave(&busiest->lock, flags);
>
> + /* Make sure we're not about to stop a task from a higher sched class */
> + if (busiest->curr->sched_class != &fair_sched_class)
> + goto unlock;
> +

This looks correct to me, but I wonder if this check is something that belongs
to the CONFIG_PREEMPT_RT land. This will give a preference to not disrupt the
RT/DL tasks which is certainly the desired behavior there, but maybe in none
PREEMPT_RT world balancing CFS tasks is more important? Hmmm

--
Qais Yousef

> /*
> * Don't kick the active_load_balance_cpu_stop, if the curr task on
> * busiest CPU can't be moved to dst_cpu:
> --
> 2.22.0
>

2019-08-08 10:47:37

by Valentin Schneider

[permalink] [raw]
Subject: Re: [PATCH 2/3] sched/fair: Prevent active LB from preempting higher sched classes

On 08/08/2019 10:24, Qais Yousef wrote:
>> @@ -8834,6 +8834,10 @@ static inline enum alb_status active_load_balance(struct lb_env *env)
>>
>> raw_spin_lock_irqsave(&busiest->lock, flags);
>>
>> + /* Make sure we're not about to stop a task from a higher sched class */
>> + if (busiest->curr->sched_class != &fair_sched_class)
>> + goto unlock;
>> +
>
> This looks correct to me, but I wonder if this check is something that belongs
> to the CONFIG_PREEMPT_RT land. This will give a preference to not disrupt the
> RT/DL tasks which is certainly the desired behavior there, but maybe in none
> PREEMPT_RT world balancing CFS tasks is more important? Hmmm
>

My take on this is that if the running task isn't CFS, there is no point in
running the cpu_stopper there (PREEMPT_RT or not). We can still try other
things though.

It could be that the running task had been > CFS all along, so if we
failed to move any load then we just couldn't pull any CFS task and should
bail out of load balance at this point.

If the running task was CFS but got preempted by a > CFS task in the
meantime (e.g. after detach_tasks() failed to pull anything), the best we
could do is run detach_one_task() (locally - no need for any cpu_stopper)
to try and nab the now not-running CFS task. Otherwise we'll have to wait
for another round of load_balance().
Not sure how much we care about this case - I think it's extremely unlikely
to repeatedly want to pull a currently-running CFS task and have it
repeatedly preempted by a > CFS task whenever we get to active_load_balance().

Let me try and see if I can come up with something sensible with that
detach_one_task() thingy.

> --
> Qais Yousef
>
>> /*
>> * Don't kick the active_load_balance_cpu_stop, if the curr task on
>> * busiest CPU can't be moved to dst_cpu:
>> --
>> 2.22.0
>>