2003-01-31 22:21:36

by Con Kolivas

Subject: [BENCHMARK] 2.5.59-mm7 with contest


Here are contest (http://contest.kolivas.net) benchmarks using the osdl
(http://www.osdl.org) hardware, comparing mm7.

no_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 79 94.9 0 0.0 1.00
2.5.59-mm6 1 78 96.2 0 0.0 1.00
2.5.59-mm7 5 78 96.2 0 0.0 1.00
cacherun:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 76 98.7 0 0.0 0.96
2.5.59-mm6 1 76 97.4 0 0.0 0.97
2.5.59-mm7 5 75 98.7 0 0.0 0.96
process_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 92 81.5 28 16.3 1.16
2.5.59-mm6 1 92 81.5 25 15.2 1.18
2.5.59-mm7 4 90 82.2 25 18.3 1.15
ctar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 98 80.6 2 5.1 1.24
2.5.59-mm6 3 112 70.5 2 4.5 1.44
2.5.59-mm7 5 96 80.2 1 3.4 1.23
xtar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 101 75.2 1 4.0 1.28
2.5.59-mm6 3 115 66.1 1 4.3 1.47
2.5.59-mm7 5 96 79.2 0 3.3 1.23
io_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 153 50.3 8 13.7 1.94
2.5.59-mm6 2 90 83.3 2 6.7 1.15
2.5.59-mm7 5 110 68.2 2 6.4 1.41
read_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 102 76.5 5 4.9 1.29
2.5.59-mm6 3 733 10.8 56 6.3 9.40
2.5.59-mm7 4 90 84.4 1 1.3 1.15
list_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 95 80.0 0 6.3 1.20
2.5.59-mm6 3 97 79.4 0 6.2 1.24
2.5.59-mm7 4 94 80.9 0 6.4 1.21
mem_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 97 80.4 56 2.1 1.23
2.5.59-mm6 3 94 83.0 50 2.1 1.21
2.5.59-mm7 4 92 82.6 45 1.4 1.18
dbench_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 126 60.3 3 22.2 1.59
2.5.59-mm6 3 122 61.5 3 25.4 1.56
2.5.59-mm7 4 121 62.0 2 24.8 1.55
io_other:
Kernel [runs] Time CPU% Loads LCPU% Ratio
2.5.59 3 89 84.3 2 5.5 1.13
2.5.59-mm6 2 90 83.3 2 6.7 1.15
2.5.59-mm7 3 90 83.3 2 5.6 1.15

Seems the fix for "reads starves everything" works. Did it affect the tar
loads too?

Con


2003-01-31 22:53:47

by Andrew Morton

Subject: Re: [BENCHMARK] 2.5.59-mm7 with contest

Con Kolivas wrote:
>
> ...
> io_load:
> Kernel [runs] Time CPU% Loads LCPU% Ratio
> 2.5.59 3 153 50.3 8 13.7 1.94
> 2.5.59-mm6 2 90 83.3 2 6.7 1.15
> 2.5.59-mm7 5 110 68.2 2 6.4 1.41
> read_load:
> Kernel [runs] Time CPU% Loads LCPU% Ratio
> 2.5.59 3 102 76.5 5 4.9 1.29
> 2.5.59-mm6 3 733 10.8 56 6.3 9.40
> 2.5.59-mm7 4 90 84.4 1 1.3 1.15

The background loads took some punishment.

2003-01-31 23:04:15

by Con Kolivas

Subject: Re: [BENCHMARK] 2.5.59-mm7 with contest

On Saturday 01 Feb 2003 10:01 am, Andrew Morton wrote:
> Con Kolivas wrote:
> > ...
> > io_load:
> > Kernel [runs] Time CPU% Loads LCPU% Ratio
> > 2.5.59 3 153 50.3 8 13.7 1.94
> > 2.5.59-mm6 2 90 83.3 2 6.7 1.15
> > 2.5.59-mm7 5 110 68.2 2 6.4 1.41
> > read_load:
> > Kernel [runs] Time CPU% Loads LCPU% Ratio
> > 2.5.59 3 102 76.5 5 4.9 1.29
> > 2.5.59-mm6 3 733 10.8 56 6.3 9.40
> > 2.5.59-mm7 4 90 84.4 1 1.3 1.15
>
> The background loads took some punishment.

Yes, and I'd say a ratio of only 1.15 suggests kernel compilation got an
unfair share of the resources.

2003-02-01 00:28:27

by Nick Piggin

Subject: Re: [BENCHMARK] 2.5.59-mm7 with contest

Con Kolivas wrote:

>Here are contest (http://contest.kolivas.net) benchmarks using the osdl
>(http://www.osdl.org) hardware comparing mm7
>
> ...
>
>Seems the fix for "reads starves everything" works. Affected the tar loads
>too?
>
Yes, at the cost of throughput; however, for now it is probably
the best way to go. Hopefully anticipatory scheduling will provide
kernel compile times as good or better, and better throughput.

Con, tell me, are "Loads" normalised to the time they run for?
Is it possible to get a finer grain result for the load tests?

Thanks
Nick

2003-02-01 00:35:34

by Con Kolivas

Subject: Re: [BENCHMARK] 2.5.59-mm7 with contest

On Saturday 01 Feb 2003 11:37 am, Nick Piggin wrote:
> Con Kolivas wrote:
> >Seems the fix for "reads starves everything" works. Affected the tar loads
> >too?
>
> Yes, at the cost of throughput, however for now it is probably
> the best way to go. Hopefully anticipatory scheduling will provide
> as good or better kernel compile times and better throughput.
>
> Con, tell me, are "Loads" normalised to the time they run for?
> Is it possible to get a finer grain result for the load tests?

No, the load is the absolute number of times the load successfully completed.
We battled with the code for a while to see if there were ways to get more
accurate load numbers, but if you write a 256MB file you can only tell whether
the write completed or not, not how much had been written when you stopped it.
The same goes for read and so on. The load rate is a more meaningful number,
but we haven't gotten around to implementing that in the result presentation.

Load rate would be:

loads / ( load_compile_time - no_load_compile_time )

because basically, if the load compile time is longer, more loads are
completed. Most of the time the loads happen at the same rate, but if the
load rate differed it would be a more significant result than just a
scheduling-balance change, which is why load rate would be a useful addition.
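The proposed metric can be sketched in a few lines (a hypothetical illustration; the function name is mine, and the sample figures are the mm7 io_load numbers from the table above):

```python
def load_rate(loads, load_compile_time, no_load_compile_time):
    """Loads completed per extra second of compile time caused by the load."""
    extra = load_compile_time - no_load_compile_time
    if extra <= 0:
        # The load added no compile time: infinite rate if any loads ran.
        return float("inf") if loads else 0.0
    return loads / extra

# mm7 io_load: 2 loads completed, 110 s loaded vs 78 s no_load baseline
print(load_rate(2, 110, 78))  # 0.0625
```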

Con

2003-02-01 01:00:36

by Con Kolivas

Subject: Re: [BENCHMARK] 2.5.59-mm7 with contest

On Saturday 01 Feb 2003 11:55 am, Nick Piggin wrote:
> Con Kolivas wrote:
> >On Saturday 01 Feb 2003 11:37 am, Nick Piggin wrote:
> >>Con Kolivas wrote:
> >>>Seems the fix for "reads starves everything" works. Affected the tar
> >>> loads too?
> >>
> >>Yes, at the cost of throughput, however for now it is probably
> >>the best way to go. Hopefully anticipatory scheduling will provide
> >>as good or better kernel compile times and better throughput.
> >>
> >>Con, tell me, are "Loads" normalised to the time they run for?
> >>Is it possible to get a finer grain result for the load tests?
> >
> >No, the load is the absolute number of times the load successfully
> > completed. We battled with the code for a while to see if there were ways
> > to get more accurate load numbers but if you write a 256Mb file you can
> > only tell if it completes the write or not; not how much has been written
> > when you stop the write. Same goes with read etc. The load rate is a more
> > meaningful number but we haven't gotten around to implementing that in
> > the result presentation.
>
> I don't know how the contest code works, but if you split that into
> a number of smaller writes it should work?

Yes, it would, but the load effect would be significantly diminished. Writing
a file whose size equals physical RAM makes the load effect substantial.

> >Load rate would be:
> >
> >loads / ( load_compile_time - no_load_compile_time )
>
> I think loads / time_load_ran_for should be ok (ie, give you loads per time
> interval). This would be more useful if your loads were getting more
> efficient
> or less because it is possible that an improvement would lower compile time
> _and_ loads, but overall the loads were getting done quicker.

I found the following is how loads almost always occur:
noload time: 60
load time kernel a: 80, loads 20
load time kernel b: 100, loads 40
load time kernel c: 90, loads 30

and loads/total time wouldn't show this effect, as kernel c would appear to
have a better load rate.

If there was:
load time kernel d: 80, loads 40

that would be more significant, no?

2003-02-01 01:14:11

by Nick Piggin

Subject: Re: [BENCHMARK] 2.5.59-mm7 with contest

Con Kolivas wrote:

>On Saturday 01 Feb 2003 11:55 am, Nick Piggin wrote:
>
>>Con Kolivas wrote:
>>
>>>On Saturday 01 Feb 2003 11:37 am, Nick Piggin wrote:
>>>
>>>>Con Kolivas wrote:
>>>>
>>>>>Seems the fix for "reads starves everything" works. Affected the tar
>>>>>loads too?
>>>>>
>>>>Yes, at the cost of throughput, however for now it is probably
>>>>the best way to go. Hopefully anticipatory scheduling will provide
>>>>as good or better kernel compile times and better throughput.
>>>>
>>>>Con, tell me, are "Loads" normalised to the time they run for?
>>>>Is it possible to get a finer grain result for the load tests?
>>>>
>>>No, the load is the absolute number of times the load successfully
>>>completed. We battled with the code for a while to see if there were ways
>>>to get more accurate load numbers but if you write a 256Mb file you can
>>>only tell if it completes the write or not; not how much has been written
>>>when you stop the write. Same goes with read etc. The load rate is a more
>>>meaningful number but we haven't gotten around to implementing that in
>>>the result presentation.
>>>
>>I don't know how the contest code works, but if you split that into
>>a number of smaller writes it should work?
>>
>
>Yes it would but the load effect is significantly diminished. By writing a
>file the size==physical ram the load effect is substantial.
>
Oh yes, of course, but I meant just breaking up the writing of that big file
into smaller write(2)s.
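A sketch of what that suggestion might look like (file name, chunk size, and function name are all hypothetical; contest's actual load code may differ):

```python
import os

CHUNK = 1 << 20            # 1 MiB per write(2) call
TOTAL = 256 * (1 << 20)    # file still sized to physical RAM (256 MB here)

def write_load(path="loadfile", total=TOTAL, chunk=CHUNK):
    """Write `total` bytes in `chunk`-sized write(2) calls; return bytes written.

    Because each call writes at most `chunk` bytes, progress can be reported
    at chunk granularity when the benchmark stops the load mid-file.
    """
    buf = b"\0" * chunk
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        while written < total:
            # The slice caps the final write at the bytes still remaining.
            written += os.write(fd, buf[: total - written])
        return written
    finally:
        os.close(fd)
```

The file is still RAM-sized, so the load effect Con describes is preserved; only the granularity of accounting changes.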

>
>
>>>Load rate would be:
>>>
>>>loads / ( load_compile_time - no_load_compile_time )
>>>
>>I think loads / time_load_ran_for should be ok (ie, give you loads per time
>>interval). This would be more useful if your loads were getting more
>>efficient
>>or less because it is possible that an improvement would lower compile time
>>_and_ loads, but overall the loads were getting done quicker.
>>
>
>I found the following is how loads occur almost always:
>noload time: 60
>load time kernel a: 80, loads 20
>load time kernel b: 100, loads 40
>load time kernel c: 90, loads 30
>
>and loads/total time wouldn't show this effect as kernel c would appear to have
>a better load rate
>
Kernel a would have a rate of 0.25 l/s, b: 0.4 l/s, c: ~0.33 l/s, so b would
be better.

>
>
>if there was
>load time kernel d: 80, loads 40
>
>that would be more significant no?
>
It would, yes... but it would measure 0.5 loads per second done.

The noload time is basically constant anyway, so I don't think incorporating
it into the results would add much value; it would just make the metric harder
to follow than a simple "loads per second".

2003-02-01 01:55:14

by Andrew Morton

Subject: Re: [BENCHMARK] 2.5.59-mm7 with contest

Con Kolivas <[email protected]> wrote:
>
> On Saturday 01 Feb 2003 10:01 am, Andrew Morton wrote:
> > Con Kolivas wrote:
> > > ...
> > > io_load:
> > > Kernel [runs] Time CPU% Loads LCPU% Ratio
> > > 2.5.59 3 153 50.3 8 13.7 1.94
> > > 2.5.59-mm6 2 90 83.3 2 6.7 1.15
> > > 2.5.59-mm7 5 110 68.2 2 6.4 1.41
> > > read_load:
> > > Kernel [runs] Time CPU% Loads LCPU% Ratio
> > > 2.5.59 3 102 76.5 5 4.9 1.29
> > > 2.5.59-mm6 3 733 10.8 56 6.3 9.40
> > > 2.5.59-mm7 4 90 84.4 1 1.3 1.15
> >
> > The background loads took some punishment.
>
> Yes and I'd say a ratio of only 1.15 suggests kernel compilation got an unfair
> share of the resources.

A very important metric is system-wide idle/IO-wait CPU time. As long as
that is kept nice and low, we can then fine-tune the starvation and fairness
aspects.
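On Linux, that metric can be sampled from the aggregate "cpu" line of /proc/stat; a minimal sketch (field layout per proc(5); the iowait column exists only on kernels new enough to report it):

```python
def cpu_idle_iowait(statline):
    """Return the fraction of CPU time spent idle or waiting on IO,
    given the aggregate 'cpu' line from /proc/stat."""
    fields = [int(x) for x in statline.split()[1:]]
    # Field order: user nice system idle iowait irq softirq ...
    idle = fields[3]
    iowait = fields[4] if len(fields) > 4 else 0  # older kernels lack iowait
    return (idle + iowait) / sum(fields)

# Synthetic example (jiffies): user=100 nice=0 system=50 idle=800 iowait=50
print(f"{cpu_idle_iowait('cpu 100 0 50 800 50 0 0'):.0%}")  # 85%
```

In practice one would sample the line twice and use the deltas, since the counters are cumulative since boot.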

2003-02-01 03:12:02

by Con Kolivas

Subject: Re: [BENCHMARK] 2.5.59-mm7 with contest

> >I found the following is how loads occur almost always:
> >noload time: 60
> >load time kernel a: 80, loads 20
> >load time kernel b: 100, loads 40
> >load time kernel c: 90, loads 30
> >
> >and loads/total time wouldn't show this effect as kernel c would appear to
> > have a better load rate
>
> Kernel a would have a rate of .25 l/s, b: .4 l/s, c: .33~ l/s so I b would
> be better.

Err, yeah, that's what I meant, sorry. What I'm getting at is: notice they all
do it at 1/second regardless (1 load per extra second of compile time). It's
only the scheduling balance that has changed, rather than the rate of work.
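Both metrics can be computed side by side for the example kernels (a quick sketch using the figures from the earlier example):

```python
NO_LOAD_TIME = 60  # no_load compile time in seconds

# kernel label -> (compile time under load, loads completed)
kernels = {
    "a": (80, 20),
    "b": (100, 40),
    "c": (90, 30),
    "d": (80, 40),
}

for name, (t, loads) in kernels.items():
    per_second = loads / t                         # Nick's loads per second
    per_extra_second = loads / (t - NO_LOAD_TIME)  # Con's load rate
    print(f"kernel {name}: {per_second:.2f} loads/s, "
          f"{per_extra_second:.2f} loads per extra compile second")
```

Kernels a, b, and c all come out at exactly 1 load per extra compile second, so only the scheduling balance differs between them; kernel d, at 2, would be doing genuinely more work.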

> >if there was
> >load time kernel d: 80, loads 40
> >
> >that would be more significant no?
>
> It would, yes... but it would measure .5 loads per second done.
>
> The noload time is basically constant anyway so I don't think it would add
> much value if it were incorporated into the results, but would make the
> metric harder to follow than simple "loads per second".

At the moment total loads tells the full story either way, so for now I'm
sticking with that.

Con