LinuxLists.cc - [patch] scheduler cache affinity improvement in 2.4 kernels by Ingo Molnar

2001-11-17 16:58:38

Subject: [patch] scheduler cache affinity improvement in 2.4 kernels by Ingo Molnar

Folks,

The above patch for scheduler cache affinity improvement in 2.4 kernels by
Ingo Molnar was applied to a 2.4.14 kernel. A run of the Volano loopback
benchmark on a Netfinity 8500R (700 MHz PIII, 1MB L2 cache, 1GB of memory)
produced the following results:

The UniProcessor throughput was reduced by 40%.
The 4-way throughput showed a very slight degradation of 1%.
The 8-way throughput showed an improvement of 10%.
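
The percentages above are presumably relative changes against the same
kernel without the patch. A minimal sketch of that arithmetic follows. The
throughput figures in it are hypothetical, chosen only so that the results
match the reported percentages; the post gives no raw numbers, and the
function name is likewise just an illustration.

```python
def relative_change(baseline: float, patched: float) -> float:
    """Return the percentage change of `patched` relative to `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline throughput must be positive")
    return (patched - baseline) * 100.0 / baseline


# Hypothetical throughput figures (messages/second) for the unpatched and
# patched kernel. These are NOT measurements from the post.
hypothetical_runs = {
    "UP": (10000.0, 6000.0),
    "4-way": (20000.0, 19800.0),
    "8-way": (30000.0, 33000.0),
}

for config, (baseline, patched) in hypothetical_runs.items():
    change = relative_change(baseline, patched)
    print(f"{config}: {change:+.0f}%")
```

A negative value is a throughput loss and a positive value is a gain.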

I do not subscribe to lkml and hence please address any future
correspondence on this topic to [email protected].,com.

Thanks,
Partha

2001-11-17 17:21:44

by Mark Hahn

Subject: Re: [patch] scheduler cache affinity improvement in 2.4 kernels by Ingo Molnar

> The UniProcessor throughput was reduced by 40%.
> The 4-way throughput showed a very slight degradation of 1%.
> The 8-way throughput showed an improvement of 10%.

what a waste of time: volanomark loopback is deliberately unlike
any real-world workload, so no one can sanely use these numbers.
why not compare to something vaguely intelligible? even lmbench
would be more interesting...

2001-11-17 17:53:07

by Alan

Subject: Re: [patch] scheduler cache affinity improvement in 2.4 kernels by Ingo Molnar

> The UniProcessor throughput was reduced by 40%.
> The 4-way throughput showed a very slight degradation of 1%.
> The 8-way throughput showed an improvement of 10%.

This is the 10 billion thread volcanomark stuff though I assume ? What
happens in the real world ?

2001-11-17 17:58:07

by Arnaldo Carvalho de Melo

Subject: Re: [patch] scheduler cache affinity improvement in 2.4 kernels by Ingo Molnar

Em Sat, Nov 17, 2001 at 06:00:37PM +0000, Alan Cox escreveu:
> > The UniProcessor throughput was reduced by 40%.
> > The 4-way throughput showed a very slight degradation of 1%.
> > The 8-way throughput showed an improvement of 10%.
>
> This is the 10 billion thread volcanomark stuff though I assume ? What
> happens in the real world ?

Hey, those who need 10 billion threads can afford an 8-way machine or even
bigger ;)

- Arnaldo

``"90% of everything is crap", Its called Sturgeon's law 8)
One of the problems is indeed finding the good bits''
- Alan Cox

2001-11-19 08:40:28

by Albert D. Cahalan

Subject: Re: [patch] scheduler cache affinity improvement in 2.4 kernels by Ingo Molnar

Partha Narayanan writes:

> The above patch for scheduler cache affinity improvement in 2.4 kernels by
> Ingo Molnar was applied to 2.4.14 kernel;

Just a thought: some processors tell you how many cache lines
have been thrown out. Look in whatever performance monitoring
registers are available.

2001-11-19 17:07:34

by Shailabh Nagar

Subject: Re: [patch] scheduler cache affinity improvement in 2.4 kernels by Ingo Molnar

Hi Partha,

Sorry to see the "shoot-the-messenger" replies to your posting. I've yet to
understand why lmbench is relevant in an SMP setting...

Shailabh Nagar
Enterprise Linux Group, IBM TJ Watson Research Center
(914) 945 2851, T/L 862 2851

2001-11-23 09:52:37

by Ingo Molnar

Subject: Re: [patch] scheduler cache affinity improvement in 2.4 kernels by Ingo Molnar

On Sat, 17 Nov 2001, Partha Narayanan wrote:

> The UniProcessor throughput was reduced by 40%.
> The 4-way throughput showed a very slight degradation of 1%.
> The 8-way throughput showed an improvement of 10%.

thanks Partha for the measurements. I'll soon post an updated patch that
also includes some of the suggestions from this list.

Ingo

2001-11-23 11:44:54

by Samium Gromoff

Subject: Re: [patch] scheduler cache affinity improvement in 2.4 kernels by Ingo Molnar

As I see it, the patch in question is being hit in its weakest
spot by that enormous 10 billion thread benchmark.
That weakest spot is the added overhead, which affects
heavy scheduling loads.
I look at it as the absolute worst case. And even in this worst
case we still have a win on an 8-way SMP... I'd like to see some
more real-life benchmarks on the issue...

Maybe the tester missed the point, because the patch was not meant
to improve scheduling itself, but to reduce the cost of improper
scheduling, i.e. cache thrashing.

cheers, Samium Gromoff