2023-10-27 17:27:02

by Keisuke Nishimura

Subject: [PATCH] sched/fair: Fix the decision for load balance

should_we_balance() is called to decide whether to do load balancing.
When sched ticks invoke this function, only one CPU should return
true. However, in the current code, two CPUs can return true. The
following situation, where b means busy and i means idle, is an
example in which both CPU 0 and CPU 2 return true.

        [0, 1] [2, 3]
         b  b   i  b

This fix checks if there exists an idle CPU with busy sibling(s)
after looking for a CPU on an idle core. If some idle CPUs with busy
siblings are found, just the first one should do load-balancing.

Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
Signed-off-by: Keisuke Nishimura <[email protected]>
---
 kernel/sched/fair.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2048138ce54b..eff0316d6c7d 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env *env)
 		return cpu == env->dst_cpu;
 	}

-	if (idle_smt == env->dst_cpu)
-		return true;
+	/* Is there an idle CPU with busy siblings? */
+	if (idle_smt != -1)
+		return idle_smt == env->dst_cpu;

 	/* Are we the first CPU of this group ? */
 	return group_balance_cpu(sg) == env->dst_cpu;
--
2.34.1
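
The scenario in the commit message can be checked with a small user-space
model of the tail of should_we_balance(). This is only a sketch, not the
kernel code. It assumes that all four CPUs sit in one sched group, that the
domain being balanced does not share CPU capacity (so an idle CPU on a
half-busy core is only remembered, not picked right away), and that
group_balance_cpu() would return CPU 0. The names model_core_is_idle(),
model_should_we_balance(), count_balancers() and commit_msg_scenario are
made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   4
#define SMT_WIDTH 2	/* assumption: two SMT siblings per core */

/* The commit message's example: [0, 1] [2, 3] = b b i b */
static const bool commit_msg_scenario[NR_CPUS] = { true, true, false, true };

/* True if no SMT sibling on cpu's core is busy. */
static bool model_core_is_idle(const bool busy[NR_CPUS], int cpu)
{
	int first = cpu - cpu % SMT_WIDTH;

	for (int i = first; i < first + SMT_WIDTH; i++)
		if (busy[i])
			return false;
	return true;
}

/* Hypothetical stand-in for should_we_balance(); patched selects the new tail. */
static bool model_should_we_balance(const bool busy[NR_CPUS], int dst_cpu,
				    bool patched)
{
	int idle_smt = -1;

	assert(dst_cpu >= 0 && dst_cpu < NR_CPUS);

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (busy[cpu])
			continue;
		if (!model_core_is_idle(busy, cpu)) {
			/* Remember the first idle CPU with a busy sibling. */
			if (idle_smt == -1)
				idle_smt = cpu;
			continue;
		}
		/* First idle CPU on a fully idle core. */
		return cpu == dst_cpu;
	}

	if (patched) {
		/* New tail: an idle SMT sibling, if any, is the only balancer. */
		if (idle_smt != -1)
			return idle_smt == dst_cpu;
	} else {
		/* Old tail: other CPUs still fall through to the group check. */
		if (idle_smt == dst_cpu)
			return true;
	}

	/* Assumption: group_balance_cpu() would return CPU 0. */
	return dst_cpu == 0;
}

/* Number of CPUs that would decide to balance for a given busy map. */
static int count_balancers(const bool busy[NR_CPUS], bool patched)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (model_should_we_balance(busy, cpu, patched))
			n++;
	return n;
}
```

In this model, count_balancers(commit_msg_scenario, false) counts both CPU 0
(through the group check) and CPU 2 (through idle_smt), while
count_balancers(commit_msg_scenario, true) counts only CPU 2.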


2023-10-28 05:52:17

by Chen Yu

Subject: Re: [PATCH] sched/fair: Fix the decision for load balance

On 2023-10-27 at 19:17:43 +0200, Keisuke Nishimura wrote:
> should_we_balance is called for the decision to do load-balancing.
> When sched ticks invoke this function, only one CPU should return
> true. However, in the current code, two CPUs can return true. The
> following situation, where b means busy and i means idle, is an
> example because CPU 0 and CPU 2 return true.
>
> [0, 1] [2, 3]
> b b i b
>
> This fix checks if there exists an idle CPU with busy sibling(s)
> after looking for a CPU on an idle core. If some idle CPUs with busy
> siblings are found, just the first one should do load-balancing.
>
> Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
> Signed-off-by: Keisuke Nishimura <[email protected]>
> ---
> kernel/sched/fair.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2048138ce54b..eff0316d6c7d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env *env)
> return cpu == env->dst_cpu;
> }
>
> - if (idle_smt == env->dst_cpu)
> - return true;
> + /* Is there an idle CPU with busy siblings? */
> + if (idle_smt != -1)
> + return idle_smt == env->dst_cpu;
>
> /* Are we the first CPU of this group ? */
> return group_balance_cpu(sg) == env->dst_cpu;

Looks reasonable to me. Per my understanding, if there is another idle
SMT (from a half-busy core) in the system, we should leverage that SMT
to do the periodic lb.

Reviewed-by: Chen Yu <[email protected]>

thanks,
Chenyu

2023-10-28 06:38:16

by Julia Lawall

Subject: Re: [PATCH] sched/fair: Fix the decision for load balance



On Sat, 28 Oct 2023, Chen Yu wrote:

> On 2023-10-27 at 19:17:43 +0200, Keisuke Nishimura wrote:
> > should_we_balance is called for the decision to do load-balancing.
> > When sched ticks invoke this function, only one CPU should return
> > true. However, in the current code, two CPUs can return true. The
> > following situation, where b means busy and i means idle, is an
> > example because CPU 0 and CPU 2 return true.
> >
> > [0, 1] [2, 3]
> > b b i b
> >
> > This fix checks if there exists an idle CPU with busy sibling(s)
> > after looking for a CPU on an idle core. If some idle CPUs with busy
> > siblings are found, just the first one should do load-balancing.
> >
> > Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
> > Signed-off-by: Keisuke Nishimura <[email protected]>
> > ---
> > kernel/sched/fair.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 2048138ce54b..eff0316d6c7d 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env *env)
> > return cpu == env->dst_cpu;
> > }
> >
> > - if (idle_smt == env->dst_cpu)
> > - return true;
> > + /* Is there an idle CPU with busy siblings? */
> > + if (idle_smt != -1)
> > + return idle_smt == env->dst_cpu;
> >
> > /* Are we the first CPU of this group ? */
> > return group_balance_cpu(sg) == env->dst_cpu;
>
> Looks reasonable to me, if there is other idle SMT(from half-busy core)
> in the system, we should leverage that SMT to do the periodic lb.
> Per my understanding,

That's not the goal of this patch. The goal of this patch is to avoid
doing return group_balance_cpu(sg) == env->dst_cpu; when a half-busy core
has been identified that is different from env->dst_cpu.

julia

>
> Reviewed-by: Chen Yu <[email protected]>
>
> thanks,
> Chenyu
>

2023-10-28 14:58:42

by Chen Yu

Subject: Re: [PATCH] sched/fair: Fix the decision for load balance

On 2023-10-28 at 08:37:59 +0200, Julia Lawall wrote:
>
>
> On Sat, 28 Oct 2023, Chen Yu wrote:
>
> > On 2023-10-27 at 19:17:43 +0200, Keisuke Nishimura wrote:
> > > should_we_balance is called for the decision to do load-balancing.
> > > When sched ticks invoke this function, only one CPU should return
> > > true. However, in the current code, two CPUs can return true. The
> > > following situation, where b means busy and i means idle, is an
> > > example because CPU 0 and CPU 2 return true.
> > >
> > > [0, 1] [2, 3]
> > > b b i b
> > >
> > > This fix checks if there exists an idle CPU with busy sibling(s)
> > > after looking for a CPU on an idle core. If some idle CPUs with busy
> > > siblings are found, just the first one should do load-balancing.
> > >
> > > Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
> > > Signed-off-by: Keisuke Nishimura <[email protected]>
> > > ---
> > > kernel/sched/fair.c | 5 +++--
> > > 1 file changed, 3 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > index 2048138ce54b..eff0316d6c7d 100644
> > > --- a/kernel/sched/fair.c
> > > +++ b/kernel/sched/fair.c
> > > @@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env *env)
> > > return cpu == env->dst_cpu;
> > > }
> > >
> > > - if (idle_smt == env->dst_cpu)
> > > - return true;
> > > + /* Is there an idle CPU with busy siblings? */
> > > + if (idle_smt != -1)
> > > + return idle_smt == env->dst_cpu;
> > >
> > > /* Are we the first CPU of this group ? */
> > > return group_balance_cpu(sg) == env->dst_cpu;
> >
> > Looks reasonable to me, if there is other idle SMT(from half-busy core)
> > in the system, we should leverage that SMT to do the periodic lb.
> > Per my understanding,
>
> That's not the goal of this patch. The goal of this patch is to avoid
> doing return group_balance_cpu(sg) == env->dst_cpu;

Yes, I mean that without this patch we could incorrectly choose the current
non-idle CPU rather than that idle SMT, when we should actually let that
idle SMT do the idle lb.

thanks,
Chenyu

> when a half-busy core
> has been identified that is different from env->dst_cpu.
>
> julia
>
> >
> > Reviewed-by: Chen Yu <[email protected]>
> >
> > thanks,
> > Chenyu
> >

2023-10-28 15:04:00

by Julia Lawall

Subject: Re: [PATCH] sched/fair: Fix the decision for load balance



On Sat, 28 Oct 2023, Chen Yu wrote:

> On 2023-10-28 at 08:37:59 +0200, Julia Lawall wrote:
> >
> >
> > On Sat, 28 Oct 2023, Chen Yu wrote:
> >
> > > On 2023-10-27 at 19:17:43 +0200, Keisuke Nishimura wrote:
> > > > should_we_balance is called for the decision to do load-balancing.
> > > > When sched ticks invoke this function, only one CPU should return
> > > > true. However, in the current code, two CPUs can return true. The
> > > > following situation, where b means busy and i means idle, is an
> > > > example because CPU 0 and CPU 2 return true.
> > > >
> > > > [0, 1] [2, 3]
> > > > b b i b
> > > >
> > > > This fix checks if there exists an idle CPU with busy sibling(s)
> > > > after looking for a CPU on an idle core. If some idle CPUs with busy
> > > > siblings are found, just the first one should do load-balancing.
> > > >
> > > > Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
> > > > Signed-off-by: Keisuke Nishimura <[email protected]>
> > > > ---
> > > > kernel/sched/fair.c | 5 +++--
> > > > 1 file changed, 3 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > > > index 2048138ce54b..eff0316d6c7d 100644
> > > > --- a/kernel/sched/fair.c
> > > > +++ b/kernel/sched/fair.c
> > > > @@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env *env)
> > > > return cpu == env->dst_cpu;
> > > > }
> > > >
> > > > - if (idle_smt == env->dst_cpu)
> > > > - return true;
> > > > + /* Is there an idle CPU with busy siblings? */
> > > > + if (idle_smt != -1)
> > > > + return idle_smt == env->dst_cpu;
> > > >
> > > > /* Are we the first CPU of this group ? */
> > > > return group_balance_cpu(sg) == env->dst_cpu;
> > >
> > > Looks reasonable to me, if there is other idle SMT(from half-busy core)
> > > in the system, we should leverage that SMT to do the periodic lb.
> > > Per my understanding,
> >
> > That's not the goal of this patch. The goal of this patch is to avoid
> > doing return group_balance_cpu(sg) == env->dst_cpu;
>
> Yes, I mean, without this patch, we could incorrectly choose the current
> non idle CPU rather than that idle SMT, but actually we should let that
> idle SMT to do the idle lb.

OK, agreed. Thanks for the feedback!

julia

>
> thanks,
> Chenyu
>
> > when a half-busy core
> > has been identified that is different from env->dst_cpu.
> >
> > julia
> >
> > >
> > > Reviewed-by: Chen Yu <[email protected]>
> > >
> > > thanks,
> > > Chenyu
> > >
>

2023-10-30 04:04:57

by Shrikanth Hegde

Subject: Re: [PATCH] sched/fair: Fix the decision for load balance



On 10/27/23 10:47 PM, Keisuke Nishimura wrote:
> should_we_balance is called for the decision to do load-balancing.
> When sched ticks invoke this function, only one CPU should return
> true. However, in the current code, two CPUs can return true. The
> following situation, where b means busy and i means idle, is an
> example because CPU 0 and CPU 2 return true.
>
> [0, 1] [2, 3]
> b b i b
>
> This fix checks if there exists an idle CPU with busy sibling(s)
> after looking for a CPU on an idle core. If some idle CPUs with busy
> siblings are found, just the first one should do load-balancing.
>

> Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
> Signed-off-by: Keisuke Nishimura <[email protected]>
> ---
> kernel/sched/fair.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2048138ce54b..eff0316d6c7d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env *env)
> return cpu == env->dst_cpu;
> }
>


There is a comment above this: /* Are we the first idle CPU? */
Maybe update that comment to /* Are we the first idle core */

> - if (idle_smt == env->dst_cpu)
> - return true;
> + /* Is there an idle CPU with busy siblings? */
nit: We can keep the comment style consistent in this function.
/* Are we the first idle CPU with busy siblings */

> + if (idle_smt != -1)
> + return idle_smt == env->dst_cpu;
>
> /* Are we the first CPU of this group ? */
> return group_balance_cpu(sg) == env->dst_cpu;

code changes LGTM
Reviewed-by: Shrikanth Hegde <[email protected]>

2023-10-30 08:03:15

by Vincent Guittot

Subject: Re: [PATCH] sched/fair: Fix the decision for load balance

On Fri, 27 Oct 2023 at 19:25, Keisuke Nishimura
<[email protected]> wrote:
>
> should_we_balance is called for the decision to do load-balancing.
> When sched ticks invoke this function, only one CPU should return
> true. However, in the current code, two CPUs can return true. The
> following situation, where b means busy and i means idle, is an
> example because CPU 0 and CPU 2 return true.
>
> [0, 1] [2, 3]
> b b i b
>
> This fix checks if there exists an idle CPU with busy sibling(s)
> after looking for a CPU on an idle core. If some idle CPUs with busy
> siblings are found, just the first one should do load-balancing.
>
> Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
> Signed-off-by: Keisuke Nishimura <[email protected]>
> ---
> kernel/sched/fair.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 2048138ce54b..eff0316d6c7d 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env *env)
> return cpu == env->dst_cpu;
> }
>
> - if (idle_smt == env->dst_cpu)
> - return true;
> + /* Is there an idle CPU with busy siblings? */

Nit: I agree with Shrikanth that we should keep using a similar comment:

/* Are we the first idle CPU with busy siblings */

Reviewed-by: Vincent Guittot <[email protected]>

> + if (idle_smt != -1)
> + return idle_smt == env->dst_cpu;
>
> /* Are we the first CPU of this group ? */
> return group_balance_cpu(sg) == env->dst_cpu;
> --
> 2.34.1
>

2023-10-30 08:05:38

by Vincent Guittot

Subject: Re: [PATCH] sched/fair: Fix the decision for load balance

On Mon, 30 Oct 2023 at 05:03, Shrikanth Hegde
<[email protected]> wrote:
>
>
>
> On 10/27/23 10:47 PM, Keisuke Nishimura wrote:
> > should_we_balance is called for the decision to do load-balancing.
> > When sched ticks invoke this function, only one CPU should return
> > true. However, in the current code, two CPUs can return true. The
> > following situation, where b means busy and i means idle, is an
> > example because CPU 0 and CPU 2 return true.
> >
> > [0, 1] [2, 3]
> > b b i b
> >
> > This fix checks if there exists an idle CPU with busy sibling(s)
> > after looking for a CPU on an idle core. If some idle CPUs with busy
> > siblings are found, just the first one should do load-balancing.
> >
>
> > Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
> > Signed-off-by: Keisuke Nishimura <[email protected]>
> > ---
> > kernel/sched/fair.c | 5 +++--
> > 1 file changed, 3 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index 2048138ce54b..eff0316d6c7d 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env *env)
> > return cpu == env->dst_cpu;
> > }
> >
>
>
> There is comment above this /* Are we the first idle CPU? */
> Maybe update that comment as /* Are we the first idle core */

I was about to say the same, but it's not always true. If we are at SMT
level, we look for an idle CPU in the core.

>
> > - if (idle_smt == env->dst_cpu)
> > - return true;
> > + /* Is there an idle CPU with busy siblings? */
> nit: We can keep the comment style fixed in this function.
> /* Are we the first idle CPU with busy siblings */
>
> > + if (idle_smt != -1)
> > + return idle_smt == env->dst_cpu;
> >
> > /* Are we the first CPU of this group ? */
> > return group_balance_cpu(sg) == env->dst_cpu;
>
> code changes LGTM
> Reviewed-by: Shrikanth Hegde <[email protected]>

2023-10-30 10:16:35

by Keisuke Nishimura

Subject: Re: [PATCH] sched/fair: Fix the decision for load balance



On 30/10/2023 09:05, Vincent Guittot wrote:
> On Mon, 30 Oct 2023 at 05:03, Shrikanth Hegde
> <[email protected]> wrote:
>>
>>
>>
>> On 10/27/23 10:47 PM, Keisuke Nishimura wrote:
>>> should_we_balance is called for the decision to do load-balancing.
>>> When sched ticks invoke this function, only one CPU should return
>>> true. However, in the current code, two CPUs can return true. The
>>> following situation, where b means busy and i means idle, is an
>>> example because CPU 0 and CPU 2 return true.
>>>
>>> [0, 1] [2, 3]
>>> b b i b
>>>
>>> This fix checks if there exists an idle CPU with busy sibling(s)
>>> after looking for a CPU on an idle core. If some idle CPUs with busy
>>> siblings are found, just the first one should do load-balancing.
>>>
>>
>>> Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the whole core for load balance")
>>> Signed-off-by: Keisuke Nishimura <[email protected]>
>>> ---
>>> kernel/sched/fair.c | 5 +++--
>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 2048138ce54b..eff0316d6c7d 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env *env)
>>> return cpu == env->dst_cpu;
>>> }
>>>
>>
>>
>> There is comment above this /* Are we the first idle CPU? */
>> Maybe update that comment as /* Are we the first idle core */
>
> I was about to say the same but it's not always true. If we are at SMT
> level, we look for an idle CPU in the core
>

Maybe I should update the comment with the additional context:

/*
 * Are we the first idle core in a sched_domain not-sharing capacity,
 * or the first idle CPU in a sched_domain sharing capacity?
 */


>>
>>> - if (idle_smt == env->dst_cpu)
>>> - return true;
>>> + /* Is there an idle CPU with busy siblings? */
>> nit: We can keep the comment style fixed in this function.
>> /* Are we the first idle CPU with busy siblings */
>>

OK, agreed. Should I create version 2?

thanks,
Keisuke

>>> + if (idle_smt != -1)
>>> + return idle_smt == env->dst_cpu;
>>>
>>> /* Are we the first CPU of this group ? */
>>> return group_balance_cpu(sg) == env->dst_cpu;
>>
>> code changes LGTM
>> Reviewed-by: Shrikanth Hegde <[email protected]>

2023-10-30 14:14:26

by Shrikanth Hegde

Subject: Re: [PATCH] sched/fair: Fix the decision for load balance



On 10/30/23 3:32 PM, Keisuke Nishimura wrote:
>
>
> On 30/10/2023 09:05, Vincent Guittot wrote:
>> On Mon, 30 Oct 2023 at 05:03, Shrikanth Hegde
>> <[email protected]> wrote:
>>>
>>>
>>>
>>> On 10/27/23 10:47 PM, Keisuke Nishimura wrote:
>>>> should_we_balance is called for the decision to do load-balancing.
>>>> When sched ticks invoke this function, only one CPU should return
>>>> true. However, in the current code, two CPUs can return true. The
>>>> following situation, where b means busy and i means idle, is an
>>>> example because CPU 0 and CPU 2 return true.
>>>>
>>>>          [0, 1] [2, 3]
>>>>           b  b   i  b
>>>>
>>>> This fix checks if there exists an idle CPU with busy sibling(s)
>>>> after looking for a CPU on an idle core. If some idle CPUs with busy
>>>> siblings are found, just the first one should do load-balancing.
>>>>
>>>
>>>> Fixes: b1bfeab9b002 ("sched/fair: Consider the idle state of the
>>>> whole core for load balance")
>>>> Signed-off-by: Keisuke Nishimura <[email protected]>
>>>> ---
>>>>   kernel/sched/fair.c | 5 +++--
>>>>   1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>> index 2048138ce54b..eff0316d6c7d 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>> @@ -11083,8 +11083,9 @@ static int should_we_balance(struct lb_env
>>>> *env)
>>>>                return cpu == env->dst_cpu;
>>>>        }
>>>>
>>>
>>>
>>> There is comment above this /* Are we the first idle CPU? */
>>> Maybe update that comment as /* Are we the first idle core */
>>
>> I was about to say the same but it's not always true. If we are at SMT
>> level, we look for an idle CPU in the core
>>
>
> Maybe I should update the comment with the additional contexts:
>
> /*
>  * Are we the first idle core in a sched_domain not-sharing capacity,
>  * or the first idle CPU in a sched_domain sharing capacity?
>  */
>


/*
 * Are we the first idle core in a MC or higher domain
 * or the first idle CPU in a SMT domain
 */
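
The two halves of that comment can be illustrated with a small user-space
sketch of the "first idle" scan. It is not the kernel loop: the
shares_capacity parameter merely stands in for the SD_SHARE_CPUCAPACITY
check, and the scanned range (first, len) is a hypothetical parameter, since
which CPUs the kernel actually walks depends on the sched group and is not
modeled here. The names sketch_core_idle(), first_idle_pick(), map_bbib and
map_biii are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   4
#define SMT_WIDTH 2	/* assumption: two SMT siblings per core */

/* Two example busy maps for cores [0, 1] [2, 3]; true means busy. */
static const bool map_bbib[NR_CPUS] = { true, true, false, true };
static const bool map_biii[NR_CPUS] = { true, false, false, false };

/* True if no SMT sibling on cpu's core is busy. */
static bool sketch_core_idle(const bool busy[NR_CPUS], int cpu)
{
	int start = cpu - cpu % SMT_WIDTH;

	for (int i = start; i < start + SMT_WIDTH; i++)
		if (busy[i])
			return false;
	return true;
}

/*
 * Return the CPU the "first idle" scan would settle on, or -1 if none.
 * With shares_capacity set (SMT-like domain) any idle CPU qualifies;
 * without it (MC or higher) only an idle CPU on a fully idle core does.
 */
static int first_idle_pick(const bool busy[NR_CPUS], int first, int len,
			   bool shares_capacity)
{
	assert(first >= 0 && len >= 0 && first + len <= NR_CPUS);

	for (int cpu = first; cpu < first + len; cpu++) {
		if (busy[cpu])
			continue;
		/* Skip idle CPUs on half-busy cores when balancing cores. */
		if (!shares_capacity && !sketch_core_idle(busy, cpu))
			continue;
		return cpu;
	}
	return -1;
}
```

For the map b i i i, scanning all four CPUs without shared capacity skips
CPU 1 (its core is half busy) and settles on CPU 2, whereas scanning core 0
alone with shared capacity settles on CPU 1.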


>
>>>
>>>> -     if (idle_smt == env->dst_cpu)
>>>> -             return true;
>>>> +     /* Is there an idle CPU with busy siblings? */
>>> nit: We can keep the comment style fixed in this function.
>>> /* Are we the first idle CPU with busy siblings */
>>>
>
> OK, agreed. Should I create version 2?

Yes. That would be good.

>
> thanks,
> Keisuke
>
>>>> +     if (idle_smt != -1)
>>>> +             return idle_smt == env->dst_cpu;
>>>>
>>>>        /* Are we the first CPU of this group ? */
>>>>        return group_balance_cpu(sg) == env->dst_cpu;
>>>
>>> code changes LGTM
>>> Reviewed-by: Shrikanth Hegde <[email protected]>