2002-09-13 16:04:08

by Con Kolivas

Subject: System response benchmarks in performance patches


I came up with a very simple way of measuring responsiveness that gives me
numbers that are meaningful to me. I timed the old faithful kernel compile
under different background loads to gauge how well the PC performs when it is
busy. I have so far benchmarked 2.4.19 versus 2.4.19-ck7, 2.4.19-ck7-rmap and
2.4.18-6mdk (Mandrake's kernel in 8.2). 2.5.34 has a dead keyboard for me so
I'm unable to test it as yet.

Here is the story so far:

No Load
Kernel            Time      %CPU
2.4.19            1:49.17   98%
2.4.19-ck7        1:47.66   97%
2.4.19-ck7-rmap   1:48.58   98%
2.4.18-6mdk       1:48.18   98%

Memory Load
Kernel            Time      %CPU
2.4.19            2:15.21   78%
2.4.19-ck7        1:55.88   92%
2.4.19-ck7-rmap   2:18.55   79%
2.4.18-6mdk       2:15.68   79%

IO Load
Kernel            Time      %CPU
2.4.19            3:00.76   58%
2.4.19-ck7        2:01.68   86%
2.4.19-ck7-rmap   2:05.95   83%
2.4.18-6mdk       3:01.48   58%

Process Load
Kernel            Time      %CPU
2.4.19            2:09.42   80%
2.4.19-ck7        1:53.52   92%
2.4.19-ck7-rmap   1:54.39   93%
2.4.18-6mdk       2:10.57   80%

Kernel compiles were done with the same kernel config, from a fresh boot, etc.,
on a single PIII 1133 with make -j 4.
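The measurement described above can be sketched roughly as follows. This is only an illustration, not the script used for the numbers in this post: the function name `timed_run` and its parameters are hypothetical, and the CPU percentage is assumed to be the workload's user plus system time divided by wall-clock time, as time(1) reports it.

```python
import os
import subprocess
import time


def timed_run(workload, load=None):
    """Run `workload` to completion, optionally beside a background `load`.

    Both arguments are argv lists. Returns (elapsed_seconds, cpu_percent),
    where cpu_percent is the workload's user+system CPU time as a share of
    wall-clock time.
    """
    background = subprocess.Popen(load) if load else None
    try:
        before = os.times()
        start = time.monotonic()
        subprocess.run(workload, check=True)
        elapsed = time.monotonic() - start
        # Taken before the background load is reaped, so the children_*
        # counters cover only the workload and its descendants.
        after = os.times()
    finally:
        if background is not None:
            background.terminate()
            background.wait()
    cpu = (after.children_user - before.children_user) + (
        after.children_system - before.children_system
    )
    return elapsed, 100.0 * cpu / elapsed
```

A real run would pass something like `["make", "-j", "4"]` as the workload from inside a kernel tree, and one of the load programs as `load`.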

The loads were taken from BMatthew's irman, found here:
http://people.redhat.com/bmatthews/irman/

Unlike the original program, I am not looking at average latencies (which, by
the way, are <.01 msecs).

A brief description of the loads follows:

Memory load - Repeatedly reference 110% of RAM in a pattern designed to cause
cache misses
IO load - Read and write 1K chunks from random places in a file using multiple
processes
Process load - Fork and exec N processes, connected in a unidirectional ring by
pipes. Insert M<<N chunks of data into the ring and pass them around
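To make the process load concrete, here is a minimal sketch of a unidirectional pipe ring. It is not irman's code: the name `run_ring` and its parameters are made up for illustration, the children only fork (they do not exec), the parent acts as the injection point that closes the ring, and the run stops after a fixed number of laps instead of looping as a constant load would.

```python
import os


def run_ring(n_procs, n_tokens, laps):
    """Pass n_tokens one-byte tokens around a ring of n_procs children.

    The parent injects the tokens, reads them back after each lap and
    re-injects them. Returns the number of token laps completed.
    """
    # pipes[i] feeds child i; pipes[n_procs] leads back to the parent.
    pipes = [os.pipe() for _ in range(n_procs + 1)]
    pids = []
    for i in range(n_procs):
        pid = os.fork()
        if pid == 0:
            rfd, wfd = pipes[i][0], pipes[i + 1][1]
            # Close every descriptor this child does not use, so that
            # end-of-file can propagate around the ring at shutdown.
            for pair in pipes:
                for fd in pair:
                    if fd not in (rfd, wfd):
                        os.close(fd)
            while True:
                chunk = os.read(rfd, 1)
                if not chunk:  # upstream neighbour closed its end
                    break
                os.write(wfd, chunk)
            os._exit(0)
        pids.append(pid)

    inject, collect = pipes[0][1], pipes[n_procs][0]
    for pair in pipes:
        for fd in pair:
            if fd not in (inject, collect):
                os.close(fd)

    os.write(inject, b"x" * n_tokens)
    completed = 0
    for lap in range(laps):
        for _ in range(n_tokens):
            token = os.read(collect, 1)
            completed += 1
            if lap < laps - 1:
                os.write(inject, token)

    os.close(inject)  # children see EOF one after another and exit
    os.close(collect)
    for pid in pids:
        os.waitpid(pid, 0)
    return completed
```

With far fewer tokens than processes, most children are blocked in `read` at any moment, so the load is mostly wakeups and context switches rather than raw CPU use.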

I certainly feel these numbers capture the "feel" of how the various kernels
respond under different loads, which I've had trouble quantifying. As you can
see, the kernels vary in how much cpu time they can devote to the compile while
they are busy with each load. Obviously the way the memory is "loaded" will
affect different VM patches differently.
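One way to read the tables is as a slowdown factor: the loaded compile time divided by the same kernel's no-load time. The short sketch below does that arithmetic for two IO Load rows from the tables above; the helper names are purely illustrative.

```python
def to_seconds(stamp):
    """Convert a 'minutes:seconds' time such as '3:00.76' to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def slowdown(loaded, unloaded):
    """Ratio of the loaded compile time to the no-load compile time."""
    return to_seconds(loaded) / to_seconds(unloaded)


# IO Load versus No Load, figures taken from the tables in this post.
io_vanilla = slowdown("3:00.76", "1:49.17")
io_ck7 = slowdown("2:01.68", "1:47.66")
print(f"2.4.19 {io_vanilla:.2f}x, 2.4.19-ck7 {io_ck7:.2f}x")
# prints "2.4.19 1.66x, 2.4.19-ck7 1.13x"
```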

The -ck kernels are from my merged patchsets here:
http://kernel.kolivas.net

I have yet to merge compressed cache fully without bugs, so -ck8 is still not
finished but R. De Castro is working hard to help me do it :)

Please send me your comments and please cc me to ensure I get your email.

Con Kolivas


2002-09-13 16:18:23

by Rik van Riel

Subject: Re: System response benchmarks in performance patches

On Sat, 14 Sep 2002, Con Kolivas wrote:

> I came up with a very simple way of measuring responsiveness that gives
> me numbers that are meaningful to me. What I've done is the old faithful
> kernel compile and measured it under different loads to simulate the
> pc's ability to perform under various loads.

Absolutely wonderful. I'd love to see this easily scriptable
so we can just run it with one command, eg:

$ ./contest

> Kernel            Time      %CPU
> 2.4.19            3:00.76   58%
> 2.4.19-ck7        2:01.68   86%
> 2.4.19-ck7-rmap   2:05.95   83%
> 2.4.18-6mdk       3:01.48   58%

Very interesting results. People benchmarking just one thing
at a time won't get variances anywhere near this big, while
real system workload is pretty much always multitasking.

I think I've finally found a benchmark that gives results which
are meaningful in the context of a multitasking system.

regards,

Rik
--
Bravely reimplemented by the knights who say "NIH".

http://www.surriel.com/ http://distro.conectiva.com/


2002-09-13 20:07:01

by Andrew Morton

Subject: Re: System response benchmarks in performance patches

Con Kolivas wrote:
>
> I came up with a very simple way of measuring responsiveness that gives me
> numbers that are meaningful to me. What I've done is the old faithful kernel
> compile and measured it under different loads to simulate the pc's ability to
> perform under various loads. I have so far benchmarked 2.4.19 versus 2.4.19-ck7,
> 2.4.19-ck7-rmap and 2.4.18-6mdk(mandrake's kernel in 8.2). 2.5.34 has a dead
> keyboard for me so I'm unable to test it as yet.

Yes, this is a wonderful test. Very real-world, easy to do and it
tickles a few fairly serious performance problems which we have.

> ...
> The loads were taken from BMatthew's iman found here:
> http://people.redhat.com/bmatthews/irman/

I have issues with irman (I think - didn't read the code really
closely).

It appears to always perform file overwrites - seeking over files,
rewriting them.

This tends to cause best-case behaviour in the VM. The affected pages
are tucked up out of the way on the active list and we do quite well.

If instead the background application is writing _new_ files then
everything falls apart.

I'd suggest that you stick with the kernel compile as the workload,
and vary the background activity a bit. Try tiobench.

(oh, and try turning on everything in the `input' menu; that might
get the keyboard working again in 2.5)

2002-09-14 12:21:09

by Con Kolivas

Subject: Re: System response benchmarks in performance patches

Quoting Andrew Morton <[email protected]>:

> Con Kolivas wrote:
> > I came up with a very simple way of measuring responsiveness that gives me
> > numbers that are meaningful to me. What I've done is the old faithful kernel
> > compile and measured it under different loads to simulate the pc's ability to
> > perform under various loads. I have so far benchmarked 2.4.19 versus 2.4.19-ck7,...
> Yes, this is a wonderful test. Very real-world, easy to do and it
> tickles a few fairly serious performance problems which we have.

Thank you. I've been thinking hard about this for some time. I hope it becomes
useful.

> > The loads were taken from BMatthew's iman found here:
> > http://people.redhat.com/bmatthews/irman/
> I have issues with irman (I think - didn't read the code really
> closely).

irman isn't really the key to this benchmark, it was only the source of a
constant load in each different area. The actual irman app isn't used, only the
child load apps.

> It appears to always perform file overwrites - seeking over files,
> rewriting them.
>
> This tends to cause best-case behaviour in the VM. The affected pages
> are tucked up out of the way on the active list and we do quite well.
>
> If instead the background application is writing _new_ files then
> everything falls apart.
>
> I'd suggest that you stick with the kernel compile as the workload,
> and vary the background activity a bit. Try tiobench.

I've had a look at tiobench and it looks like a serious IO load. However, the
load varies quite a bit and I'd like something that remains relatively constant
throughout. I'd have to strip the guts out of it and run it as numerous separate
IO tests. That starts taking away from the simplicity the benchmark has at the
moment, and as you can see from the numbers, it is still very revealing in its
current incarnation.

> (oh, and try turning on everything in the `input' menu; that might
> get the keyboard working again in 2.5)

Thanks for that. I've managed to get the keyboard working but haven't been up to
speed with development of 2.5.x so I haven't figured out what to do about the
changing IDE partitions. All of that is moot at the moment, though, as my
benchmark won't yet work on 2.5.x because of changes to the /proc filesystem.
Rik van Riel is helping me sort out this problem.

In the meantime I have created a tarball of this benchmark in a usable form.
I've called it contest (thanks Rik for the name idea) and it's basically a
script and the relevant workload tasks. I've posted it on my kernel page at
http://kernel.kolivas.net under the FAQ. A final reminder: it won't work on
2.5.x.

Con.

2002-09-14 12:24:06

by Paolo Ciarrocchi

Subject: Re: System response benchmarks in performance patches

Con,
I've just tried your benchmark against a vanilla 2.4.19 and 2.5.34

Results:

_NO LOAD_
Kernel   Time       CPU
2.4.19   7:37.99    99%
2.5.34   7:47.68    99%

_IOLOAD_
Kernel   Time       CPU
2.4.19   11:23.86   65%
2.5.34   10:48.24   72%

_CPULOAD_
Kernel   Time       CPU
2.4.19   9:07.80    82%
2.5.34   8:50.56    87%

_MEMLOAD_ [Probably wrong with 2.5*]
Kernel   Time       CPU
2.4.19   10:00.63   78%
2.5.34   7:45.80    99%

Hope it helps.

Ciao,
Paolo
--
Get your free email from http://www.linuxmail.org


Powered by Outblaze

2002-09-14 12:43:38

by Paolo Ciarrocchi

Subject: Re: System response benchmarks in performance patches

[...]
>http://kernel.kolivas.net under the FAQ. A final reminder note: it won't work on
>2.5.x

Con,
I think that only the _memload_ test is not
working with 2.5.*, am I wrong?

Paolo


2002-09-14 18:19:46

by Paolo Ciarrocchi

Subject: Re: System response benchmarks in performance patches

From: Con Kolivas <[email protected]>

> Quoting Paolo Ciarrocchi <[email protected]>:
>
> > [...]
> > >http://kernel.kolivas.net under the FAQ. A final reminder note: it won't work on
> > >2.5.x
> >
> > Con,
> > I think that only the _memload_ test is not
> > working with 2.5.*, am I wrong?
>
> Correct. memload determines the amount of memory to allocate based on
> /proc/meminfo which has changed in 2.5.x
Rik has a patch for this (thanks ;-).
Let me know when you are going to release a new version of the benchmark.
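For illustration only, here is one way a load could size itself from /proc/meminfo by looking up a named field instead of relying on line position. This is not memload's code or Rik's patch: the function names are hypothetical, and the sketch assumes a `MemTotal:` line with a value in kB is present.

```python
def mem_total_kb(meminfo_text):
    """Return the MemTotal value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Expected shape: "MemTotal:   1031016 kB"
            return int(line.split()[1])
    raise ValueError("MemTotal line not found")


def memload_target_kb(meminfo_text, percent=110):
    """Amount of memory to reference, as a percentage of total RAM."""
    return mem_total_kb(meminfo_text) * percent // 100
```

On a live system the text would come from reading `/proc/meminfo`; the 110% default mirrors the memory load description earlier in the thread.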

[...]
> P.S. How does 2.4.19-ck7 compare ;-)
Downloaded, compiled and tested ;-)
BTW, I also tested the latest compressed cache patch.

_NOLOAD_
Kernel            Time       CPU
2.4.19            7:37.99    99%
2.5.34            7:47.68    99%
2.4.19-0.24pre4   7:38.17    99%
2.4.19-ck7        7:35.54    99%

_CPULOAD_
Kernel            Time       CPU
2.4.19            9:07.80    82%
2.5.34            8:50.56    87%
2.4.19-0.24pre4   9:07.61    82%
2.4.19-ck7        8:38.07    87%

_IOLOAD_
Kernel            Time       CPU
2.4.19            11:23.86   65%
2.5.34            10:48.24   72%
2.4.19-0.24pre4   11:51.79   63%
2.4.19-ck7        10:41.56   71%

_MEMLOAD_
Kernel            Time       CPU
2.4.19            10:00.63   78%
2.5.34            [7:45.80]  [99%]
2.4.19-0.24pre4   10:37.85   78%
2.4.19-ck7        10:46.06   72%

Ciao,
Paolo

2002-09-16 00:29:26

by Bill Davidsen

Subject: Re: System response benchmarks in performance patches

On Sat, 14 Sep 2002, Con Kolivas wrote:

>
> I came up with a very simple way of measuring responsiveness that gives me
> numbers that are meaningful to me. What I've done is the old faithful kernel
> compile and measured it under different loads to simulate the pc's ability to
> perform under various loads. I have so far benchmarked 2.4.19 versus 2.4.19-ck7,
> 2.4.19-ck7-rmap and 2.4.18-6mdk(mandrake's kernel in 8.2). 2.5.34 has a dead
> keyboard for me so I'm unable to test it as yet.

If that's a real kernel compile in <2 minutes I'm impressed!

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-09-16 06:14:23

by Con Kolivas

Subject: Re: System response benchmarks in performance patches

Hi Bill

Quoting Bill Davidsen <[email protected]>:

> On Sat, 14 Sep 2002, Con Kolivas wrote:
>
> >
> > I came up with a very simple way of measuring responsiveness that gives me
> > numbers that are meaningful to me. What I've done is the old faithful kernel
> > compile and measured it under different loads to simulate the pc's ability to
> > perform under various loads. I have so far benchmarked 2.4.19 versus 2.4.19-ck7,
> > 2.4.19-ck7-rmap and 2.4.18-6mdk(mandrake's kernel in 8.2). 2.5.34 has a dead
> > keyboard for me so I'm unable to test it as yet.
>
> If that's a real kernel compile in <2 minutes I'm impressed!

If you look at my README in the tarball you'll see that I suggest using a
minimal kernel config (i.e. almost nothing enabled) and include just such a
.config. So, yes, it is a real kernel compile on a 1133MHz PIII.

Con.