2012-05-01 18:28:42

by Peter Zijlstra

[permalink] [raw]
Subject: [RFC][PATCH 1/5] sched, fair: Let minimally loaded cpu balance the group

Currently we let the leftmost (or first idle) cpu acend the
sched_domain tree and perform load-balancing. The result is that the
busiest cpu in the group might be performing this function and pull
more load to itself. The next load balance pass will then try to
equalize this again.

Change this to pick the least loaded cpu to perform higher domain
balancing.

Signed-off-by: Peter Zijlstra <[email protected]>
---
kernel/sched/fair.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

Index: linux-2.6/kernel/sched/fair.c
===================================================================
--- linux-2.6.orig/kernel/sched/fair.c
+++ linux-2.6/kernel/sched/fair.c
@@ -3779,7 +3779,8 @@ static inline void update_sg_lb_stats(st
{
unsigned long load, max_cpu_load, min_cpu_load, max_nr_running;
int i;
- unsigned int balance_cpu = -1, first_idle_cpu = 0;
+ unsigned int balance_cpu = -1;
+ unsigned long balance_load = ~0UL;
unsigned long avg_load_per_task = 0;

if (local_group)
@@ -3795,12 +3796,11 @@ static inline void update_sg_lb_stats(st

/* Bias balancing toward cpus of our domain */
if (local_group) {
- if (idle_cpu(i) && !first_idle_cpu) {
- first_idle_cpu = 1;
+ load = target_load(i, load_idx);
+ if (load < balance_load || idle_cpu(i)) {
+ balance_load = load;
balance_cpu = i;
}
-
- load = target_load(i, load_idx);
} else {
load = source_load(i, load_idx);
if (load > max_cpu_load) {


2012-05-02 10:25:53

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] sched, fair: Let minimally loaded cpu balance the group

* Peter Zijlstra <[email protected]> [2012-05-01 20:14:31]:

> @@ -3795,12 +3796,11 @@ static inline void update_sg_lb_stats(st
>
> /* Bias balancing toward cpus of our domain */
> if (local_group) {
> - if (idle_cpu(i) && !first_idle_cpu) {
> - first_idle_cpu = 1;
> + load = target_load(i, load_idx);
> + if (load < balance_load || idle_cpu(i)) {
> + balance_load = load;

Let's say load_idx != 0 (ex: a busy cpu doing this load balance). In
that case, for a idle cpu, we could return non-zero load and hence this
would fail to select such a idle cpu? IOW :

balance_load = 0 iff idle_cpu(i) ??

> balance_cpu = i;
> }
> -
> - load = target_load(i, load_idx);
> } else {
> load = source_load(i, load_idx);
> if (load > max_cpu_load) {
>
>

- vatsa

2012-05-02 10:31:45

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] sched, fair: Let minimally loaded cpu balance the group

On Wed, 2012-05-02 at 15:55 +0530, Srivatsa Vaddagiri wrote:
> * Peter Zijlstra <[email protected]> [2012-05-01 20:14:31]:
>
> > @@ -3795,12 +3796,11 @@ static inline void update_sg_lb_stats(st
> >
> > /* Bias balancing toward cpus of our domain */
> > if (local_group) {
> > - if (idle_cpu(i) && !first_idle_cpu) {
> > - first_idle_cpu = 1;
> > + load = target_load(i, load_idx);
> > + if (load < balance_load || idle_cpu(i)) {
> > + balance_load = load;
>
> Let's say load_idx != 0 (ex: a busy cpu doing this load balance). In
> that case, for a idle cpu, we could return non-zero load and hence this
> would fail to select such a idle cpu?

Yep, such is the nature of !0 load_idx.

> IOW :
>
> balance_load = 0 iff idle_cpu(i) ??

I think so, even for !0 load_idx, load will only reach zero when we're
idle, just takes longer.

2012-05-02 10:35:01

by Srivatsa Vaddagiri

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] sched, fair: Let minimally loaded cpu balance the group

* Peter Zijlstra <[email protected]> [2012-05-02 12:31:30]:

> > IOW :
> >
> > balance_load = 0 iff idle_cpu(i) ??
>
> I think so, even for !0 load_idx, load will only reach zero when we're
> idle, just takes longer.

Right ...so should we force it to select a idle_cpu by having
balance_load = 0 for a idle cpu (ignoring what target_load(i, load_idx)
told us as its load?

- vatsa

2012-05-04 00:02:32

by Suresh Siddha

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] sched, fair: Let minimally loaded cpu balance the group

On Wed, 2012-05-02 at 16:04 +0530, Srivatsa Vaddagiri wrote:
> * Peter Zijlstra <[email protected]> [2012-05-02 12:31:30]:
>
> > > IOW :
> > >
> > > balance_load = 0 iff idle_cpu(i) ??
> >
> > I think so, even for !0 load_idx, load will only reach zero when we're
> > idle, just takes longer.
>
> Right ...so should we force it to select a idle_cpu by having
> balance_load = 0 for a idle cpu (ignoring what target_load(i, load_idx)
> told us as its load?

I think Peter is trying to find the leastly loaded among idle cpu's (in
other words the longest idle cpu ;)

should be ok, isn't it?

thanks,
suresh


2012-05-04 16:09:46

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [RFC][PATCH 1/5] sched, fair: Let minimally loaded cpu balance the group

On Thu, 2012-05-03 at 17:05 -0700, Suresh Siddha wrote:
> On Wed, 2012-05-02 at 16:04 +0530, Srivatsa Vaddagiri wrote:
> > * Peter Zijlstra <[email protected]> [2012-05-02 12:31:30]:
> >
> > > > IOW :
> > > >
> > > > balance_load = 0 iff idle_cpu(i) ??
> > >
> > > I think so, even for !0 load_idx, load will only reach zero when we're
> > > idle, just takes longer.
> >
> > Right ...so should we force it to select a idle_cpu by having
> > balance_load = 0 for a idle cpu (ignoring what target_load(i, load_idx)
> > told us as its load?
>
> I think Peter is trying to find the leastly loaded among idle cpu's (in
> other words the longest idle cpu ;)

Nah, Peter isn't trying to do anything smart like that, he's just trying
to pick the least loaded when they're all busy or any idle otherwise.

Afaict the code as it is today is the worst possible choice, always
picking the same (first) will result in that one being the busiest at
all times.

I mean anything will converge (eventually) due to the lower levels
spreading load again, but by pulling to the idlest it should converge
faster.

Picking a random cpu would also work.