2023-06-21 08:09:27

by Miaohe Lin

Subject: [PATCH] sched/fair: fix possible active balance misbehavior

In the LBF_DST_PINNED case, env.dst_cpu won't be equal to this_cpu. So
when need_active_balance() returns true, env.dst_cpu should be used as
the target of the active balance instead of this_cpu.

Fixes: 88b8dac0a14c ("sched: Improve balance_cpu() to consider other cpus in its group as target of (pinned) task")
Signed-off-by: Miaohe Lin <[email protected]>
---
 kernel/sched/fair.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5e90e9658528..28ff831ee847 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -10968,14 +10968,14 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 			/*
 			 * Don't kick the active_load_balance_cpu_stop,
 			 * if the curr task on busiest CPU can't be
-			 * moved to this_cpu:
+			 * moved to env.dst_cpu:
 			 */
-			if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
+			if (!cpumask_test_cpu(env.dst_cpu, busiest->curr->cpus_ptr)) {
 				raw_spin_rq_unlock_irqrestore(busiest, flags);
 				goto out_one_pinned;
 			}

-			/* Record that we found at least one task that could run on this_cpu */
+			/* Record that we found at least one task that could run on env.dst_cpu */
 			env.flags &= ~LBF_ALL_PINNED;

 			/*
@@ -10985,7 +10985,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
 			 */
 			if (!busiest->active_balance) {
 				busiest->active_balance = 1;
-				busiest->push_cpu = this_cpu;
+				busiest->push_cpu = env.dst_cpu;
 				active_balance = 1;
 			}
 			raw_spin_rq_unlock_irqrestore(busiest, flags);
--
2.27.0
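
To illustrate the scenario the commit message describes, here is a
minimal user-space sketch. It is not kernel code: the toy_ names and the
bitmask type are hypothetical stand-ins for cpumask_test_cpu() and
busiest->push_cpu, and it assumes load_balance() has already re-selected
env.dst_cpu after hitting LBF_DST_PINNED.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cpumask: bit N set means CPU N is allowed. */
typedef unsigned long toy_cpumask;

/* Rough analogue of cpumask_test_cpu() for the toy mask. */
static bool toy_cpu_allowed(int cpu, toy_cpumask mask)
{
	assert(cpu >= 0 && cpu < (int)(sizeof(mask) * CHAR_BIT));
	return (mask >> cpu) & 1UL;
}

/*
 * Models the affinity check done before kicking active balance.
 * Returns the CPU that would be recorded as push_cpu, or -1 when the
 * current task on the busiest CPU is not allowed on target_cpu (the
 * "goto out_one_pinned" path in the patch).
 */
static int toy_pick_push_cpu(int target_cpu, toy_cpumask curr_allowed)
{
	if (!toy_cpu_allowed(target_cpu, curr_allowed))
		return -1;
	return target_cpu;
}
```

With this_cpu = 0, dst_cpu = 1 and a current task allowed only on CPU 1,
toy_pick_push_cpu(0, 1UL << 1) reports no usable target, while
toy_pick_push_cpu(1, 1UL << 1) returns 1. Checking against this_cpu in
that situation would skip an active balance that the re-selected
dst_cpu could have accepted.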



2023-06-21 08:11:20

by Abel Wu

Subject: Re: [PATCH] sched/fair: fix possible active balance misbehavior

Hi Miaohe,

On 6/21/23 2:53 PM, Miaohe Lin wrote:
> In the LBF_DST_PINNED case, env.dst_cpu won't be equal to this_cpu. So
> when need_active_balance() returns true, env.dst_cpu should be used as
> the target of the active balance instead of this_cpu.

Active LB is the last resort for balancing load, which means that no
task could be found that can be moved to the local group before we
actually do active LB. So I don't think there is much difference
between this CPU and the newly selected dst_cpu, as they are both in
the local sched group, and the sched group is treated as a whole from
the point of view of balancing.

Best,
Abel

>
> Fixes: 88b8dac0a14c ("sched: Improve balance_cpu() to consider other cpus in its group as target of (pinned) task")
> Signed-off-by: Miaohe Lin <[email protected]>
> ---
>  kernel/sched/fair.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5e90e9658528..28ff831ee847 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10968,14 +10968,14 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  			/*
>  			 * Don't kick the active_load_balance_cpu_stop,
>  			 * if the curr task on busiest CPU can't be
> -			 * moved to this_cpu:
> +			 * moved to env.dst_cpu:
>  			 */
> -			if (!cpumask_test_cpu(this_cpu, busiest->curr->cpus_ptr)) {
> +			if (!cpumask_test_cpu(env.dst_cpu, busiest->curr->cpus_ptr)) {
>  				raw_spin_rq_unlock_irqrestore(busiest, flags);
>  				goto out_one_pinned;
>  			}
>
> -			/* Record that we found at least one task that could run on this_cpu */
> +			/* Record that we found at least one task that could run on env.dst_cpu */
>  			env.flags &= ~LBF_ALL_PINNED;
>
>  			/*
> @@ -10985,7 +10985,7 @@ static int load_balance(int this_cpu, struct rq *this_rq,
>  			 */
>  			if (!busiest->active_balance) {
>  				busiest->active_balance = 1;
> -				busiest->push_cpu = this_cpu;
> +				busiest->push_cpu = env.dst_cpu;
>  				active_balance = 1;
>  			}
>  			raw_spin_rq_unlock_irqrestore(busiest, flags);