2002-10-29 01:37:09

by Con Kolivas

Subject: [BENCHMARK] 2.5.44-mm6 contest results

Contest results for 2.5.44-mm6 for comparison (Shared pagetables = y):

noload:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.44 [6] 74.7 93 0 0 1.05
2.5.44-mm1 [3] 75.0 93 0 0 1.05
2.5.44-mm2 [3] 76.4 93 0 0 1.07
2.5.44-mm4 [3] 75.0 93 0 0 1.05
2.5.44-mm5 [7] 75.3 91 0 0 1.05
2.5.44-mm6 [3] 75.7 91 0 0 1.06

cacherun:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.44 [3] 68.1 99 0 0 0.95
2.5.44-mm5 [2] 68.8 99 0 0 0.96
2.5.44-mm6 [3] 69.3 99 0 0 0.97

This (cacherun) is an experimental addition to contest. It measures how fast
a second kernel compile runs when started immediately after a previous compile,
i.e. with the caches still warm.
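
For reference, the shape of the test is roughly the following. This is a
minimal sketch only: the build command and the intervening make clean are
assumptions, and contest's real harness does more than this.

/*
 * Minimal sketch of the cacherun idea: time the same kernel compile
 * twice back to back.  The second run starts with the sources, headers
 * and toolchain already in the page cache, so the difference between
 * the two times is roughly the cold-cache penalty.
 * The build command and the "make clean" in between are assumptions;
 * contest's real harness is more involved.
 */
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

static double timed_run(const char *cmd)
{
	struct timeval start, end;

	gettimeofday(&start, NULL);
	if (system(cmd) != 0) {
		fprintf(stderr, "command failed: %s\n", cmd);
		exit(1);
	}
	gettimeofday(&end, NULL);
	return (end.tv_sec - start.tv_sec) +
	       (end.tv_usec - start.tv_usec) / 1e6;
}

int main(void)
{
	const char *build = "make clean >/dev/null 2>&1 && make -j4 bzImage";

	printf("first compile:  %.1f s\n", timed_run(build));
	printf("cacherun:       %.1f s\n", timed_run(build));
	return 0;
}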

process_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.44 [3] 90.9 76 32 26 1.27
2.5.44-mm1 [3] 191.5 36 168 64 2.68
2.5.44-mm2 [3] 193.5 38 161 62 2.71
2.5.44-mm4 [3] 191.1 36 166 63 2.68
2.5.44-mm5 [4] 191.4 36 166 63 2.68
2.5.44-mm6 [3] 190.6 36 166 63 2.67

ctar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.44 [3] 97.7 80 1 6 1.37
2.5.44-mm1 [3] 99.2 78 1 6 1.39
2.5.44-mm2 [3] 96.9 79 1 5 1.36
2.5.44-mm4 [3] 97.1 79 1 5 1.36
2.5.44-mm5 [4] 97.7 78 1 5 1.37
2.5.44-mm6 [3] 97.3 79 1 5 1.36

xtar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.44 [3] 117.0 65 1 7 1.64
2.5.44-mm1 [3] 156.2 49 2 7 2.19
2.5.44-mm2 [3] 176.1 44 2 7 2.47
2.5.44-mm4 [3] 183.3 41 2 8 2.57
2.5.44-mm5 [4] 181.1 44 2 7 2.54
2.5.44-mm6 [3] 207.6 37 2 7 2.91

io_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.44 [3] 873.8 9 69 12 12.24
2.5.44-mm1 [3] 347.3 22 35 15 4.86
2.5.44-mm2 [3] 294.2 28 19 10 4.12
2.5.44-mm4 [3] 358.7 23 25 10 5.02
2.5.44-mm5 [4] 270.7 29 18 11 3.79
2.5.44-mm6 [3] 284.1 28 20 10 3.98

read_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.44 [3] 110.8 68 6 3 1.55
2.5.44-mm1 [3] 110.5 69 7 3 1.55
2.5.44-mm2 [3] 104.5 73 7 4 1.46
2.5.44-mm4 [3] 105.6 71 6 4 1.48
2.5.44-mm5 [4] 103.3 74 6 4 1.45
2.5.44-mm6 [3] 104.3 73 7 4 1.46

list_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.44 [3] 99.1 71 1 21 1.39
2.5.44-mm1 [3] 96.5 74 1 22 1.35
2.5.44-mm2 [3] 94.5 75 1 22 1.32
2.5.44-mm4 [3] 96.4 74 1 21 1.35
2.5.44-mm5 [4] 95.0 75 1 20 1.33
2.5.44-mm6 [3] 95.3 75 1 20 1.33

mem_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.44 [3] 114.3 67 30 2 1.60
2.5.44-mm1 [3] 159.7 47 38 2 2.24
2.5.44-mm2 [3] 116.6 64 29 2 1.63
2.5.44-mm4 [3] 114.9 65 28 2 1.61
2.5.44-mm5 [4] 114.1 65 30 2 1.60
2.5.44-mm6 [3] 226.9 33 50 2 3.18

Mem load has dropped off again

Con


2002-10-29 06:40:10

by Andrew Morton

Subject: Re: [BENCHMARK] 2.5.44-mm6 contest results

Con Kolivas wrote:
>
> io_load:
> Kernel [runs] Time CPU% Loads LCPU% Ratio
> 2.5.44 [3] 873.8 9 69 12 12.24
> 2.5.44-mm1 [3] 347.3 22 35 15 4.86
> 2.5.44-mm2 [3] 294.2 28 19 10 4.12
> 2.5.44-mm4 [3] 358.7 23 25 10 5.02
> 2.5.44-mm5 [4] 270.7 29 18 11 3.79
> 2.5.44-mm6 [3] 284.1 28 20 10 3.98

Jens, I think I prefer fifo_batch=16. We do need to expose
these in /somewhere so people can fiddle with them.
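
Purely as an illustration of what "expose" could mean, something like the
sysctl sketch below would do. The table placement, the ctl_name value, the
register_sysctl_table() signature and the assumption that fifo_batch has been
turned into a plain int all vary by kernel version and are guesses here, not
a description of anything in -mm.

/*
 * Hypothetical sketch only: one way a compile-time constant like
 * fifo_batch could be made tunable at runtime via sysctl.
 */
#include <linux/sysctl.h>
#include <linux/init.h>

extern int fifo_batch;		/* assumes the #define became an int */

static ctl_table iosched_table[] = {
	{
		.ctl_name	= 1,	/* arbitrary value for the sketch */
		.procname	= "fifo_batch",
		.data		= &fifo_batch,
		.maxlen		= sizeof(int),
		.mode		= 0644,
		.proc_handler	= &proc_dointvec,
	},
	{ .ctl_name = 0 }
};

static int __init iosched_sysctl_init(void)
{
	/* entries appear wherever the table is hooked into /proc/sys */
	register_sysctl_table(iosched_table, 0);
	return 0;
}
__initcall(iosched_sysctl_init);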

>...
> mem_load:
> Kernel [runs] Time CPU% Loads LCPU% Ratio
> 2.5.44 [3] 114.3 67 30 2 1.60
> 2.5.44-mm1 [3] 159.7 47 38 2 2.24
> 2.5.44-mm2 [3] 116.6 64 29 2 1.63
> 2.5.44-mm4 [3] 114.9 65 28 2 1.61
> 2.5.44-mm5 [4] 114.1 65 30 2 1.60
> 2.5.44-mm6 [3] 226.9 33 50 2 3.18
>
> Mem load has dropped off again

Well that's one interpretation. The other is "goody, that pesky
kernel compile isn't slowing down my important memory-intensive
whateveritis so much". It's a tradeoff.

It appears that this change was caused by increasing the default
value of /proc/sys/vm/page-cluster from 3 to 4. I am surprised.

It was only of small benefit in other tests so I'll ditch that one.
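
For anyone wanting to flip it back locally: page-cluster is the log2 of the
number of pages moved to or from swap in one cluster (3 means 8 pages, 4
means 16), and restoring it is just a write to procfs, equivalent to
echo 3 > /proc/sys/vm/page-cluster. A trivial sketch:

/* /proc/sys/vm/page-cluster is the log2 of the number of pages moved to
 * or from swap in one go, so 3 -> 8 pages and 4 -> 16 pages per cluster.
 * Restoring the old default is just a write (needs root). */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/sys/vm/page-cluster", "w");

	if (!f) {
		perror("/proc/sys/vm/page-cluster");
		return 1;
	}
	fprintf(f, "3\n");	/* back to the pre-mm6 default of 3 */
	fclose(f);
	return 0;
}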

Thanks.

(You're still testing with all IO against the same disk, yes? Please
remember that things change quite significantly when the swap IO
or the io_load is against a different device)

2002-10-29 07:34:20

by Jens Axboe

Subject: Re: [BENCHMARK] 2.5.44-mm6 contest results

On Mon, Oct 28 2002, Andrew Morton wrote:
> Con Kolivas wrote:
> >
> > io_load:
> > Kernel [runs] Time CPU% Loads LCPU% Ratio
> > 2.5.44 [3] 873.8 9 69 12 12.24
> > 2.5.44-mm1 [3] 347.3 22 35 15 4.86
> > 2.5.44-mm2 [3] 294.2 28 19 10 4.12
> > 2.5.44-mm4 [3] 358.7 23 25 10 5.02
> > 2.5.44-mm5 [4] 270.7 29 18 11 3.79
> > 2.5.44-mm6 [3] 284.1 28 20 10 3.98
>
> Jens, I think I prefer fifo_batch=16. We do need to expose
> these in /somewhere so people can fiddle with them.

I was hoping someone else would do comprehensive disk benchmarks with
fifo_batch=32 and fifo_batch=16, but I guess the fact that you currently
have to change the define in the source doesn't make that very likely. I
don't really like your global settings (for per-queue entities), but I
guess they can suffice until a better approach is in place.

I'll do some benching today.

--
Jens Axboe

2002-10-29 07:45:43

by Andrew Morton

Subject: Re: [BENCHMARK] 2.5.44-mm6 contest results

Jens Axboe wrote:
>
> On Mon, Oct 28 2002, Andrew Morton wrote:
> > Con Kolivas wrote:
> > >
> > > io_load:
> > > Kernel [runs] Time CPU% Loads LCPU% Ratio
> > > 2.5.44 [3] 873.8 9 69 12 12.24
> > > 2.5.44-mm1 [3] 347.3 22 35 15 4.86
> > > 2.5.44-mm2 [3] 294.2 28 19 10 4.12
> > > 2.5.44-mm4 [3] 358.7 23 25 10 5.02
> > > 2.5.44-mm5 [4] 270.7 29 18 11 3.79
> > > 2.5.44-mm6 [3] 284.1 28 20 10 3.98
> >
> > Jens, I think I prefer fifo_batch=16. We do need to expose
> > these in /somewhere so people can fiddle with them.
>
> I was hoping someone else would do comprehensive disk benchmarks with
> fifo_batch=32 and fifo_batch=16, but I guess the fact that you currently
> have to change the define in the source doesn't make that very likely. I
> don't really like your global settings (for per-queue entities), but I
> guess they can suffice until a better approach is in place.

Oh sure; it's just a hack so we can experiment with it. Not that
anyone has, to my knowledge.

> I'll do some benching today.

Wouldn't hurt, but it's a complex problem. Another approach would
be to quietly change it and see who squeaks ;)

2002-10-29 08:16:21

by Giuliano Pochini

Subject: Re: [BENCHMARK] 2.5.44-mm6 contest results


On 29-Oct-2002 Andrew Morton wrote:
> Con Kolivas wrote:
>> 2.5.44-mm6 [3] 226.9 33 50 2 3.18
>>
>> Mem load has dropped off again
>
> Well that's one interpretation. The other is "goody, that pesky
> kernel compile isn't slowing down my important memory-intensive
> whateveritis so much". It's a tradeoff.

To be really meaningful, this test should report the speed of the memory
hog, the speed of the kernel compile, and also how much disk I/O
occurred. IMO disk I/O is what actually slows things down, so we should
try to keep it as low as possible (and this test shows nothing about it).
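
One crude way to get such a number would be to sample the kernel's paging
counters around each run. The sketch below assumes the pgpgin/pgpgout fields
of /proc/vmstat; their names, units and availability differ between kernel
versions, and per-disk statistics would be more precise anyway.

/* Crude sketch: sample the global paging counters before and after a run
 * and report the delta. */
#include <stdio.h>
#include <string.h>

static long read_counter(const char *name)
{
	char key[64];
	long val;
	FILE *f = fopen("/proc/vmstat", "r");

	if (!f)
		return -1;
	while (fscanf(f, "%63s %ld", key, &val) == 2) {
		if (!strcmp(key, name)) {
			fclose(f);
			return val;
		}
	}
	fclose(f);
	return -1;
}

int main(void)
{
	long in0 = read_counter("pgpgin");
	long out0 = read_counter("pgpgout");

	/* ... run the compile plus load here ... */

	printf("pgpgin  delta: %ld\n", read_counter("pgpgin") - in0);
	printf("pgpgout delta: %ld\n", read_counter("pgpgout") - out0);
	return 0;
}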


Bye.

2002-10-29 09:05:30

by Con Kolivas

Subject: Re: [BENCHMARK] 2.5.44-mm6 contest results

Quoting Andrew Morton <[email protected]>:

> Con Kolivas wrote:
> >
> > io_load:
> > Kernel [runs] Time CPU% Loads LCPU% Ratio
> > 2.5.44 [3] 873.8 9 69 12 12.24
> > 2.5.44-mm1 [3] 347.3 22 35 15 4.86
> > 2.5.44-mm2 [3] 294.2 28 19 10 4.12
> > 2.5.44-mm4 [3] 358.7 23 25 10 5.02
> > 2.5.44-mm5 [4] 270.7 29 18 11 3.79
> > 2.5.44-mm6 [3] 284.1 28 20 10 3.98
>
> Jens, I think I prefer fifo_batch=16. We do need to expose
> these in /somewhere so people can fiddle with them.
>
> >...
> > mem_load:
> > Kernel [runs] Time CPU% Loads LCPU% Ratio
> > 2.5.44 [3] 114.3 67 30 2 1.60
> > 2.5.44-mm1 [3] 159.7 47 38 2 2.24
> > 2.5.44-mm2 [3] 116.6 64 29 2 1.63
> > 2.5.44-mm4 [3] 114.9 65 28 2 1.61
> > 2.5.44-mm5 [4] 114.1 65 30 2 1.60
> > 2.5.44-mm6 [3] 226.9 33 50 2 3.18
> >
> > Mem load has dropped off again
>
> Well that's one interpretation. The other is "goody, that pesky
> kernel compile isn't slowing down my important memory-intensive
> whateveritis so much". It's a tradeoff.
>
> It appears that this change was caused by increasing the default
> value of /proc/sys/vm/page-cluster from 3 to 4. I am surprised.
>
> It was only of small benefit in other tests so I'll ditch that one.

I understand the trade-off issue. Since make -j4 bzImage is 4 CPU-hungry
processes, ideally I'm guessing mem_load should only extend the duration and
drop the CPU by about 25%.
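
The arithmetic behind that guess, assuming a single CPU and perfectly fair
scheduling between five equally runnable processes (idealised, not something
contest guarantees):

/* 4 compile processes + 1 memory hog all runnable on one CPU: with fair
 * scheduling the compile keeps 4/5 of the CPU, so the ideal outcome is a
 * 1.25x longer compile at roughly 80% CPU. */
#include <stdio.h>

int main(void)
{
	const double compile_procs = 4, hog_procs = 1;
	double share = compile_procs / (compile_procs + hog_procs);

	printf("ideal compile CPU share: %.0f%%\n", share * 100); /* 80%   */
	printf("ideal slowdown factor:   %.2fx\n", 1.0 / share);  /* 1.25x */
	return 0;
}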

> (You're still testing with all IO against the same disk, yes? Please
> remember that things change quite significantly when the swap IO
> or the io_load is against a different device)

Yes I am. Sorry, I just don't have the hardware to do anything else.

Con