2002-01-13 17:02:11

by Manfred Spraul

Subject: cross-cpu balancing with the new scheduler

Is it possible that the inter-cpu balancing is broken in 2.5.2-pre11?

eatcpu is a simple cpu hog ("for(;;);"). Dual CPU i386.
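
For reference, a minimal eatcpu along those lines is nothing more than the
quoted loop (built with something like "gcc -o eatcpu eatcpu.c"):

/* eatcpu.c - trivial CPU hog: spin forever, never block. */
int main(void)
{
        for (;;)
                ;                       /* burn CPU until killed */
        return 0;                       /* never reached */
}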

$ nice -19 ./eatcpu &
<wait>
$ nice -19 ./eatcpu &
<wait>
$ ./eatcpu &

IMHO it should be:
* both niced processes run on one CPU;
* the non-niced process runs with 100% of the other CPU.

But it's the other way around:
one niced process runs at 100%, while the non-niced process and the
second niced process each get 50% of the other CPU.

--
Manfred


2002-01-14 02:19:54

by Rusty Russell

Subject: Re: cross-cpu balancing with the new scheduler

On Sun, 13 Jan 2002 18:01:40 +0100
Manfred Spraul <[email protected]> wrote:

> Is it possible that the inter-cpu balancing is broken in 2.5.2-pre11?
>
> eatcpu is a simple cpu hog ("for(;;);"). Dual CPU i386.
>
> $ nice -19 ./eatcpu &
> <wait>
> $ nice -19 ./eatcpu &
> <wait>
> $ ./eatcpu &
>
> IMHO it should be:
> * both niced processes run on one CPU;
> * the non-niced process runs with 100% of the other CPU.
>
> But it's the other way around:
> one niced process runs at 100%, while the non-niced process and the
> second niced process each get 50% of the other CPU.

This could be fixed by making "nr_running" closer to a "priority sum".

Ingo?

Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2002-01-14 02:43:55

by Davide Libenzi

Subject: Re: cross-cpu balancing with the new scheduler

On Mon, 14 Jan 2002, Rusty Russell wrote:

> On Sun, 13 Jan 2002 18:01:40 +0100
> Manfred Spraul <[email protected]> wrote:
>
> > Is it possible that the inter-cpu balancing is broken in 2.5.2-pre11?
> >
> > eatcpu is a simple cpu hog ("for(;;);"). Dual CPU i386.
> >
> > $ nice -19 ./eatcpu &
> > <wait>
> > $ nice -19 ./eatcpu &
> > <wait>
> > $ ./eatcpu &
> >
> > IMHO it should be:
> > * both niced processes run on one CPU;
> > * the non-niced process runs with 100% of the other CPU.
> >
> > But it's the other way around:
> > one niced process runs at 100%, while the non-niced process and the
> > second niced process each get 50% of the other CPU.
>
> This could be fixed by making "nr_running" closer to a "priority sum".

I've a very simple phrase for when QA is bugging me with these corner cases:

"As Designed"

It's much, much better than adding code and "Return To QA" :-)
I tried priority balancing in BMQS, but I still prefer "As Designed" ...




- Davide


2002-01-14 04:37:41

by Rusty Russell

Subject: Re: cross-cpu balancing with the new scheduler

In message <[email protected]> you write:
> On Mon, 14 Jan 2002, Rusty Russell wrote:
>
> > This could be fixed by making "nr_running" closer to a "priority sum".
>
> I've a very simple phrase for when QA is bugging me with these corner cases:
>
> "As Designed"

My point is: it's just a heuristic number. It currently reflects the
number of tasks on the runqueue, but there's no reason it *has to*
(except the name, of course).

1) The nr_running() function can use rq->active->nr_active +
rq->expired->nr_active. And anyway, it's only used as an "am I
idle?" check.

2) The test inside schedule() can be replaced by checking the result
of the sched_find_first_zero_bit() (I have a patch which does this
to good effect, but for other reasons).

The other uses of nr_running are all of the form "how long is this
runqueue?" for rebalancing, and Ingo *already* modifies his use of this
number, using the "prev_nr_running" hack.
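
To make the "priority sum" idea concrete, here is a rough sketch with
stand-in structures (the struct names, fields and the (40 - prio)
weighting below are made up for the illustration; this is not the actual
2.5.2 runqueue code):

#include <stdio.h>

/* Illustrative only: stand-in structures, not the real runqueue. */
struct toy_task {
        int prio;                       /* 0 = highest, 39 = most niced */
        struct toy_task *next;
};

struct toy_runqueue {
        struct toy_task *head;          /* runnable tasks on this CPU */
};

/* What the balancer effectively compares today: a plain task count. */
static int queue_len(const struct toy_runqueue *rq)
{
        int n = 0;
        const struct toy_task *t;

        for (t = rq->head; t; t = t->next)
                n++;
        return n;
}

/*
 * "Priority sum" variant: each task contributes according to how much
 * CPU it is entitled to, so two nice +19 hogs weigh much less than one
 * nice 0 hog.  (40 - prio) is just one possible weighting.
 */
static int weighted_queue_len(const struct toy_runqueue *rq)
{
        int sum = 0;
        const struct toy_task *t;

        for (t = rq->head; t; t = t->next)
                sum += 40 - t->prio;
        return sum;
}

int main(void)
{
        /* Manfred's first test: two nice +19 hogs vs. one nice 0 hog. */
        struct toy_task niced2 = { 39, NULL };
        struct toy_task niced1 = { 39, &niced2 };
        struct toy_task plain  = { 20, NULL };
        struct toy_runqueue cpu0 = { &niced1 };
        struct toy_runqueue cpu1 = { &plain };

        printf("plain count:  cpu0=%d cpu1=%d\n",
               queue_len(&cpu0), queue_len(&cpu1));
        printf("weighted sum: cpu0=%d cpu1=%d\n",
               weighted_queue_len(&cpu0), weighted_queue_len(&cpu1));
        return 0;
}

The plain count claims the CPU with the two niced hogs is the busier one;
the weighted sum says the opposite, which is what the balancer would need
to see to leave the niced pair alone.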

Hope that clarifies,
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.

2002-01-14 15:39:45

by Manfred Spraul

Subject: Re: cross-cpu balancing with the new scheduler

Davide Libenzi wrote:
>
> I've a very simple phrase for when QA is bugging me with these corner cases:
>
> "As Designed"
>
> It's much, much better than adding code and "Return To QA" :-)
> I tried priority balancing in BMQS, but I still prefer "As Designed" ...
>
Another test, now with 4 processes (dual CPU):
#nice -n 19 ./eatcpu&
#nice -n 19 ./eatcpu&
#./eatcpu&
#nice -n -19 ./eatcpu&

And the top output:
<<<<<<
73 processes: 68 sleeping, 5 running, 0 zombie, 0 stopped
CPU0 states: 100.0% user, 0.0% system, 100.0% nice, 0.0% idle
CPU1 states: 98.0% user, 2.0% system, 33.0% nice, 0.0% idle
[snip]
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
 1163 root      39  19   396  396   324 R N  99.5  0.1   0:28 eatcpu
 1164 root      39  19   396  396   324 R N  33.1  0.1   0:11 eatcpu
 1165 root      39   0   396  396   324 R    33.1  0.1   0:07 eatcpu
 1166 root      39 -19   396  396   324 R <  31.3  0.1   0:06 eatcpu
 1168 manfred    1   0   980  976   768 R     2.7  0.2   0:00 top
[snip]

The nice +19 process still has a CPU to itself, and the "nice -n -19"
process gets only 33% of the second CPU.

IMHO that's buggy: 4 running processes, 1 on CPU0, 3 on CPU1.

--
Manfred

2002-01-14 15:44:35

by Davide Libenzi

Subject: Re: cross-cpu balancing with the new scheduler

On Mon, 14 Jan 2002, Manfred Spraul wrote:

> Davide Libenzi wrote:
> >
> > I've a very simple phrase for when QA is bugging me with these corner cases:
> >
> > "As Designed"
> >
> > It's much, much better than adding code and "Return To QA" :-)
> > I tried priority balancing in BMQS, but I still prefer "As Designed" ...
> >
> Another test, now with 4 processes (dual CPU):
> #nice -n 19 ./eatcpu&
> #nice -n 19 ./eatcpu&
> #./eatcpu&
> #nice -n -19 ./eatcpu&
>
> And the top output:
> <<<<<<
> 73 processes: 68 sleeping, 5 running, 0 zombie, 0 stopped
> CPU0 states: 100.0% user, 0.0% system, 100.0% nice, 0.0% idle
> CPU1 states: 98.0% user, 2.0% system, 33.0% nice, 0.0% idle
> [snip]
>   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
>  1163 root      39  19   396  396   324 R N  99.5  0.1   0:28 eatcpu
>  1164 root      39  19   396  396   324 R N  33.1  0.1   0:11 eatcpu
>  1165 root      39   0   396  396   324 R    33.1  0.1   0:07 eatcpu
>  1166 root      39 -19   396  396   324 R <  31.3  0.1   0:06 eatcpu
>  1168 manfred    1   0   980  976   768 R     2.7  0.2   0:00 top
> [snip]
>
> The nice +19 process still has a CPU to itself, and the "nice -n -19"
> process gets only 33% of the second CPU.
>
> IMHO that's buggy: 4 running processes, 1 on CPU0, 3 on CPU1.

Yes, a long run with a 3:1 split is no longer "As Designed" :-)




- Davide


2002-01-14 15:47:47

by Ingo Molnar

Subject: Re: cross-cpu balancing with the new scheduler


(it turns out that Manfred used 2.5.2-pre11-vanilla for this test.)

On Mon, 14 Jan 2002, Manfred Spraul wrote:

>   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
>  1163 root      39  19   396  396   324 R N  99.5  0.1   0:28 eatcpu
>  1164 root      39  19   396  396   324 R N  33.1  0.1   0:11 eatcpu
>  1165 root      39   0   396  396   324 R    33.1  0.1   0:07 eatcpu
>  1166 root      39 -19   396  396   324 R <  31.3  0.1   0:06 eatcpu

The load-balancer in 2.5.2-pre11 is known to be broken; please try the -H7
patch to get the latest code.

Ingo

2002-01-15 00:29:50

by Anton Blanchard

Subject: Re: cross-cpu balancing with the new scheduler


> eatcpu is a simple cpu hog ("for(;;);"). Dual CPU i386.
>
> $ nice -19 ./eatcpu &
> <wait>
> $ nice -19 ./eatcpu &
> <wait>
> $ ./eatcpu &
>
> IMHO it should be:
> * both niced processes run on one CPU;
> * the non-niced process runs with 100% of the other CPU.
>
> But it's the other way around:
> one niced process runs at 100%, while the non-niced process and the
> second niced process each get 50% of the other CPU.

Rusty and I were talking about this recently. Would it make sense for
the load balancer to use a weighted queue length (sum up all priorities
in the queue?) instead of just balancing the queue length?
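
(As a rough illustration of how that could play out in Manfred's first
test, with an assumed weight of "40 minus the task's priority": each
nice +19 hog counts roughly 1 and the nice 0 hog roughly 20, so the CPU
with the two niced hogs reports a load of about 2 while the CPU with the
plain hog reports about 20 - and the balancer would leave the niced pair
together instead of splitting one of them off, which is the behaviour
Manfred expected.)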

Anton

2002-01-15 14:40:53

by Ingo Molnar

Subject: Re: cross-cpu balancing with the new scheduler


On Mon, 14 Jan 2002, Anton Blanchard wrote:

> Rusty and I were talking about this recently. Would it make sense for
> the load balancer to use a weighted queue length (sum up all
> priorities in the queue?) instead of just balancing the queue length?

something like this would work, but it's not an easy task to *truly*
balance priorities (or timeslice lengths instead) between CPUs.

Eg. in the following situation:

  CPU#0      CPU#1

  prio 1     prio 1
  prio 1     prio 1
  prio 20    prio 1

if the load-balancer only looks at the tail of the runqueue then it finds
that it cannot balance things any better - moving the prio 20 task over
to CPU#1 would not create a better-balanced situation. If it looked at
other runqueue entries then it could create the following,
better-balanced situation:

  CPU#0      CPU#1

  prio 20    prio 1
             prio 1
             prio 1
             prio 1
             prio 1

the solution would be to search the whole runqueue and migrate the task
with the shortest timeslice - but that is a pretty slow and
cache-intensive thing to do.
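
A rough sketch of that exhaustive scan, again with stand-in structures
rather than the real per-CPU runqueue (which keeps per-priority lists
plus a bitmap, so walking every task is far more work than the current
check):

struct scan_task {
        int timeslice;                  /* remaining timeslice, in ticks */
        struct scan_task *next;
};

struct scan_runqueue {
        struct scan_task *head;         /* every runnable task on this CPU */
};

/*
 * Walk the whole runqueue and return the task with the shortest
 * remaining timeslice as the migration candidate.  The walk itself is
 * the cost: it touches every task, which is the slow, cache-intensive
 * part described above.
 */
struct scan_task *migration_candidate(struct scan_runqueue *rq)
{
        struct scan_task *t, *best = rq->head;

        for (t = rq->head; t; t = t->next)
                if (t->timeslice < best->timeslice)
                        best = t;
        return best;                    /* NULL if the runqueue is empty */
}

Whether the better balance is worth that full walk on every rebalance is
exactly the trade-off in question.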

Ingo