2002-01-27 04:36:04

by Alex Davis

Subject: some rmap12a benchmarks

I ran dbench on three different kernels: 2.4.17 w/ rmap12a, 2.4.18pre7, and
2.4.18pre7 w/ rmap12a. 2.4.18pre7 had better throughput by a substantial
margin. The results are at http://www.dynamicbullet.com/rmap.html



2002-01-28 11:52:00

by Daniel Phillips

Subject: Don't use dbench for benchmarks

On January 27, 2002 05:35 am, Alex Davis wrote:
> I ran dbench on three different kernels: 2.4.17 w/ rmap12a, 2.4.18pre7, and
> 2.4.18pre7 w/ rmap12a. 2.4.18pre7 had better throughput by a substantial
> margin. The results are at http://www.dynamicbullet.com/rmap.html

I must be having a bad day, I can only think of irritable things to post.
Continuing that theme: please don't use dbench for benchmarks. At all.
It's an unreliable indicator of anything in particular except perhaps
stability. Please, use something else for your benchmarks.

--
Daniel

2002-01-28 13:05:23

by Alan

Subject: Re: Don't use dbench for benchmarks

> I must be having a bad day, I can only think of irritable things to post.
> Continuing that theme: please don't use dbench for benchmarks. At all.
> It's an unreliable indicator of anything in particular except perhaps
> stability. Please, use something else for your benchmarks.

I'm not 100% sure that is the case. Done 30 or 40 times, and done from
a reboot for the 30-40 pass sequence, it's quite a passable guide to
both stability and I/O behaviour under some server loads.
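
Something like this would drive such a pass sequence; a rough Python
sketch, assuming a dbench binary on $PATH that prints its usual
"Throughput N MB/sec" summary line (the exact output format varies
between dbench versions):

    import re, statistics, subprocess

    PASSES = 30    # 30 or 40 passes, started from a fresh boot
    CLIENTS = 8

    results = []
    for i in range(PASSES):
        out = subprocess.run(["dbench", str(CLIENTS)],
                             capture_output=True, text=True).stdout
        m = re.search(r"Throughput\s+([\d.]+)\s+MB/sec", out)
        if m:
            results.append(float(m.group(1)))
            print("pass %2d: %6.2f MB/sec" % (i + 1, results[-1]))

    # The spread across passes matters more than any single figure.
    if len(results) > 1:
        print("mean %.2f MB/sec, stdev %.2f"
              % (statistics.mean(results), statistics.stdev(results)))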

Alan

2002-01-28 13:24:13

by Steve Lord

Subject: Re: Don't use dbench for benchmarks

Alan Cox wrote:

>>I must be having a bad day, I can only think of irritable things to post.
>>Continuing that theme: please don't use dbench for benchmarks. At all.
>>It's an unreliable indicator of anything in particular except perhaps
>>stability. Please, use something else for your benchmarks.
>>
>
>I'm not 100% sure that is the case. Done 30 or 40 times, and done from
>a reboot for the 30-40 pass sequence, it's quite a passable guide to
>both stability and I/O behaviour under some server loads.
>

dbench tells you two things:

o how repeatable your filesystem performance is under load from
  multiple processes, and whether the available bandwidth is shared
  equally between the threads. Various versions of Linux (especially
  the elevator code) have shown different characteristics here, but
  nowadays things are pretty fair.

o once you have repeatable performance, it tells you whether your
  performance regressed or improved (on identical hardware, of course).

However, there are 'interesting' aspects of the test (and of any other
memory-stressing test): it performs better if you push as much out to
swap as you can first. So I find that a dbench 8 runs better after a
dbench 64 than it did before it. This means you need a VERY controlled
environment to make the results mean anything.
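
A rough sketch of how one might show that ordering effect in Python,
assuming a dbench binary that prints a "Throughput N MB/sec" summary
line (the format varies between dbench versions):

    import re, subprocess

    def throughput(clients):
        # One dbench run; parse the "Throughput N MB/sec" summary.
        out = subprocess.run(["dbench", str(clients)],
                             capture_output=True, text=True).stdout
        m = re.search(r"Throughput\s+([\d.]+)\s+MB/sec", out)
        return float(m.group(1)) if m else None

    fresh = throughput(8)    # dbench 8 on a freshly booted system
    throughput(64)           # push as much as possible out to swap
    after = throughput(8)    # the same dbench 8, often faster now
    print("dbench 8 fresh: %s, after dbench 64: %s" % (fresh, after))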

Steve

p.s. I really only use it to see if XFS can survive heavy load and
memory pressure.



2002-01-28 14:43:46

by Alex Davis

Subject: Re: Don't use dbench for benchmarks


--- Daniel Phillips <[email protected]> wrote:
> On January 27, 2002 05:35 am, Alex Davis wrote:
> > I ran dbench on three different kernels: 2.4.17 w/ rmap12a, 2.4.18pre7, and
> > 2.4.18pre7 w/ rmap12a. 2.4.18pre7 had better throughput by a substantial
> > margin. The results are at http://www.dynamicbullet.com/rmap.html
>
> I must be having a bad day, I can only think of irritable things to post.
I don't consider this "irritable".
> Continuing that theme: please don't use dbench for benchmarks. At all.
> It's an unreliable indicator of anything in particular except perhaps
> stability. Please, use something else for your benchmarks.
What do you suggest as an acceptable benchmark???

>
> --
> Daniel


2002-01-28 15:30:20

by Richard B. Johnson

Subject: Re: Don't use dbench for benchmarks

On Mon, 28 Jan 2002, Alex Davis wrote:

>
> --- Daniel Phillips <[email protected]> wrote:
> > On January 27, 2002 05:35 am, Alex Davis wrote:
> > > I ran dbench on three different kernels: 2.4.17 w/ rmap12a, 2.4.18pre7, and
> > > 2.4.18pre7 w/ rmap12a. 2.4.18pre7 had better throughput by a substantial
> > > margin. The results are at http://www.dynamicbullet.com/rmap.html
> >
> > I must be having a bad day, I can only think of irritable things to post.
> I don't consider this "irritable".
> > Continuing that theme: please don't use dbench for benchmarks. At all.
> > It's an unreliable indicator of anything in particular except perhaps
> > stability. Please, use something else for your benchmarks.
> What do you suggest as an acceptable benchmark???
>

A major problem with all known benchmarks is that, once you define one,
an OS can be tuned to maximize perceived performance while, in fact,
destroying other "unmeasurables" like "snappy interactive performance".

It seems that compiling the Linux Kernel while burning a CDROM gives
a good check of "acceptable" performance. But, such operations are
not "benchmarks". The trick is to create a benchmark that performs
many "simultaneous" independent and co-dependent operations using
I/O devices that everyone is likely to have. I haven't seen anything
like this yet.

Such a benchmark might have multiple tasks performing things like:

(1) Real Math on large arrays.

(2) Data-base indexed lookups.

(3) Data-base keys sorting.

(4) Small file I/O with multiple creations and deletions.

(5) Large file I/O operations with many seeks.

(6) Multiple "network" Client/Server tasks through loop-back.

(7) Simulated compiles by searching directory trees for
"include" files, reading them and closing them, while
performing string-searches to simulate compiler parsing.

(8) Two or more tasks communicating using shared-RAM. This
can be a "nasty" performance hog, but tests the performance
of threaded applications without having to write those
applications.

(9) And more....


These tasks would be given a "performance weighting value", a heuristic
that relates to perceived overall performance. But even this is full of
holes. You could tune for a fast machine with a lot of RAM and then get
terrible performance on machines that sleep, waiting for I/O.
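
Purely as an illustration of the weighting idea (the task names, the
scores, and the weights below are all made-up figures, not a proposed
standard), a Python sketch:

    import math

    # Hypothetical per-task scores (higher is better) from tasks like
    # those listed above, with illustrative weights.
    scores  = {"array_math": 1520.0, "db_lookup": 870.0,
               "small_file_io": 440.0, "loopback_net": 310.0}
    weights = {"array_math": 0.2, "db_lookup": 0.3,
               "small_file_io": 0.3, "loopback_net": 0.2}

    # A weighted geometric mean keeps one very fast subsystem from
    # swamping the composite figure.
    composite = math.exp(sum(w * math.log(scores[t])
                             for t, w in weights.items()))
    print("composite score: %.1f" % composite)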

So, one of the first things that has to be done, before any benchmark
can attempt to be valid, is to stabilize the testing environment.
This is difficult to do under software control. For instance,
if I had a RAM-eater which was going to use up RAM until there was
only a certain amount left, I would need to prevent the RAM-eater from
using the CPU after it had performed its work. The RAM-eater would
have to lock pages into place, pages it would not be accessing during
the rest of the benchmark. If I didn't do this, the kernel would
be spending a lot of time swapping, messing up benchmark results.
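
A rough sketch of such a RAM-eater, calling mlock(2) through Python's
ctypes (the size is an arbitrary example, and locking this much memory
needs root or a raised RLIMIT_MEMLOCK):

    import ctypes, signal

    EAT_BYTES = 512 * 1024 * 1024    # arbitrary: tune to leave only
                                     # the desired amount of RAM free

    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    buf = ctypes.create_string_buffer(EAT_BYTES)

    # Touch every page so it is really allocated, then pin it with
    # mlock() so the kernel cannot reclaim it behind our back.
    ctypes.memset(buf, 0xAA, EAT_BYTES)
    if libc.mlock(buf, ctypes.c_size_t(EAT_BYTES)) != 0:
        raise OSError(ctypes.get_errno(), "mlock failed")

    signal.pause()    # hold the pages without using any CPU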

Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (797.90 BogoMips).

I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.


2002-01-28 16:56:53

by Nigel Gamble

Subject: Re: Don't use dbench for benchmarks

On Mon, 28 Jan 2002, Richard B. Johnson wrote:
> It seems that compiling the Linux Kernel while burning a CDROM gives
> a good check of "acceptable" performance. But, such operations are
> not "benchmarks". The trick is to create a benchmark that performs
> many "simultaneous" independent and co-dependent operations using
> I/O devices that everyone is likely to have. I haven't seen anything
> like this yet.
>
> Such a benchmark might have multiple tasks performing things like:
>
> (1) Real Math on large arrays.
>
> (2) Data-base indexed lookups.
>
> (3) Data-base keys sorting.
>
> (4) Small file I/O with multiple creations and deletions.
>
> (5) Large file I/O operations with many seeks.
>
> (6) Multiple "network" Client/Server tasks through loop-back.
>
> (7) Simulated compiles by searching directory trees for
> "include" files, reading them and closing them, while
> performing string-searches to simulate compiler parsing.
>
> (8) Two or more tasks communicating using shared-RAM. This
> can be a "nasty" performance hog, but tests the performance
> of threaded applications without having to write those
> applications.
>
> (9) And more....
>
>
> These tasks would be given a "performance weighting value", a heuristic
> that relates to perceived overall performance.

It sounds like you are describing the AIM Benchmark suite, which has
been used for years to compare Unix system performance, and was
recently released under the GPL by Caldera.

See http://caldera.com/developers/community/contrib/aim.html

Nigel Gamble [email protected]
Mountain View, CA, USA. http://www.nrg.org/

2002-01-28 17:52:56

by Richard B. Johnson

Subject: Re: Don't use dbench for benchmarks

On Mon, 28 Jan 2002, Nigel Gamble wrote:

> On Mon, 28 Jan 2002, Richard B. Johnson wrote:
> > It seems that compiling the Linux Kernel while burning a CDROM gives
> > a good check of "acceptable" performance. But, such operations are
> > not "benchmarks". The trick is to create a benchmark that performs
> > many "simultaneous" independent and co-dependent operations using
> > I/O devices that everyone is likely to have. I haven't seen anything
> > like this yet.
> >
> > Such a benchmark might have multiple tasks performing things like:
> >
> > (1) Real Math on large arrays.
> >
> > (2) Data-base indexed lookups.
> >
> > (3) Data-base keys sorting.
> >
> > (4) Small file I/O with multiple creations and deletions.
> >
> > (5) Large file I/O operations with many seeks.
> >
> > (6) Multiple "network" Client/Server tasks through loop-back.
> >
> > (7) Simulated compiles by searching directory trees for
> > "include" files, reading them and closing them, while
> > performing string-searches to simulate compiler parsing.
> >
> > (8) Two or more tasks communicating using shared-RAM. This
> > can be a "nasty" performance hog, but tests the performance
> > of threaded applications without having to write those
> > applications.
> >
> > (9) And more....
> >
> >
> > These tasks would be given a "performance weighting value", a heuristic
> > that relates to perceived overall performance.
>
> It sounds like you are describing the AIM Benchmark suite, which has
> been used for years to compare Unix system performance, and was
> recently released under the GPL by Caldera.
>
> See http://caldera.com/developers/community/contrib/aim.html
>
> Nigel Gamble [email protected]
> Mountain View, CA, USA. http://www.nrg.org/
>

That sounds good. Have you tried it? Does it seem to provide the
kind of data that will show the effect of various trade-offs?



Cheers,
Dick Johnson

Penguin : Linux version 2.4.1 on an i686 machine (797.90 BogoMips).

I was going to compile a list of innovations that could be
attributed to Microsoft. Once I realized that Ctrl-Alt-Del
was handled in the BIOS, I found that there aren't any.


2002-01-28 18:41:01

by Daniel Phillips

Subject: Re: Don't use dbench for benchmarks

On January 28, 2002 03:43 pm, Alex Davis wrote:
> > Continuing that theme: please don't use dbench for benchmarks. At all.
> > It's an unreliable indicator of anything in particular except perhaps
> > stability. Please, use something else for your benchmarks.
>
> What do you suggest as an acceptable benchmark???

A benchmark that tests disk/file system create/read/write/delete throughput,
as dbench is supposed to? Though I haven't used it personally, others
(Arjan) have suggested tiobench:

http://tiobench.sourceforge.net/

Apparently it does not suffer from the kind of scheduling and caching
variability that dbench does. This needs to be verified. Some
multiple-run benchmarks would do the trick, with results for the
individual runs reported along the lines of what we have often seen
with dbench.

Bonnie++ is another benchmark that is often suggested. Again, I don't
personally have much experience with it.

After that, I'm afraid we tend to enter the realm of commercial benchmarks,
where the name of the game is to establish your own benchmark program as the
standard so that you can charge big bucks for licensing your code (since your
customers have two choices: either buy your code or don't publish their
numbers, sweet deal).

Personally, I normally create my own benchmark tests, tailor-made to exercise
the particular thing I'm working on at the moment. Such quick hacks would
not normally possess all the properties we'd like to see in benchmarks
designed for widespread use and publication of results.

Anybody looking for a kernel-related project who is not quite ready to
hack the kernel itself might well have a good think about what would
constitute good benchmarks for various kernel subsystems, and code
something up, or join up with others who are already interested in that
subject, such as OSDL or the tiobench project mentioned above. This
would be a valuable contribution.

--
Daniel

2002-01-28 19:09:33

by Andrew Morton

Subject: Re: Don't use dbench for benchmarks

Daniel Phillips wrote:
>
> On January 28, 2002 03:43 pm, Alex Davis wrote:
> > > Continuing that theme: please don't use dbench for benchmarks. At all.
> > > It's an unreliable indicator of anything in particular except perhaps
> > > stability. Please, use something else for your benchmarks.
> >
> > What do you suggest as an acceptable benchmark???
>
> A benchmark that tests disk/file system create/read/write/delete throughput,
> as dbench is supposed to? Though I haven't used it personally, others
> (Arjan) have suggested tiobench:
>
> http://tiobench.sourceforge.net/
>

Also http://www.iozone.org/

Really, iozone isn't a benchmark so much as the "engine" of a
benchmark. It has so many options that you can use it to build
higher-level, more intelligent test suites by invoking it in specific
ways: read/write, mmap, MS_SYNC, MS_ASYNC, O_DIRECT, aio, O_SYNC,
fsync(), multiple threads, ...
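
For instance, a sketch of a tiny suite built on top of it in Python
(the option spellings below follow iozone's help text: -i selects a
test, -s and -r set file and record size, -o forces O_SYNC writes,
-I uses O_DIRECT; check "iozone -h" for your version):

    import subprocess

    # Each case is one specific way of invoking the iozone "engine".
    cases = {
        # sequential write/rewrite and read/reread, buffered
        "buffered": ["-i", "0", "-i", "1", "-s", "64m", "-r", "4k"],
        # write/rewrite with O_SYNC
        "o_sync":   ["-i", "0", "-s", "64m", "-r", "4k", "-o"],
        # write and read again, bypassing the page cache
        "o_direct": ["-i", "0", "-i", "1", "-s", "64m", "-r", "4k", "-I"],
    }

    for name, args in cases.items():
        print("== %s ==" % name)
        subprocess.run(["iozone"] + args, check=True)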




2002-01-28 20:30:16

by Nigel Gamble

Subject: Re: Don't use dbench for benchmarks

On Mon, 28 Jan 2002, Richard B. Johnson wrote:
> > It sounds like you are describing the AIM Benchmark suite, which has
> > been used for years to compare Unix system performance, and was
> > recently released under the GPL by Caldera.
> >
> > See http://caldera.com/developers/community/contrib/aim.html
>
> That sounds good. Have you tried it? Does it seem to provide the
> kind of data that will show the effect of various trade-offs?

The last time I personally used it was over 10 years ago, but we got a
lot of use out of it to test system performance after making kernel
changes. Of course, we used other benchmarks and microbenchmarks too.

Now that it has been GPL'd, I think it would be a useful addition to
Linux benchmarking.

Nigel Gamble [email protected]
Mountain View, CA, USA. http://www.nrg.org/