2001-10-30 07:24:42

by Randy Hron

Subject: VM test comparison of 2.4.14-pre5, aa1, and 2.4.13-ac5-fs


2.4.14-pre5 fastest for mtest01, smoothest sound.
2.4.14pre5aa1
2.4.13-ac5-freeswap fastest for mmap001

Summary:

mtest01

2.4.14-pre5 had the lowest wall clock time for mtest01.
2.4.14-pre5 and 2.4.14pre5aa1 played just over 80% of the
mp3. 2.4.13-ac5-freeswap played 23% of the mp3.

mmap001

2.4.13-ac5-freeswap skipped very little and was about 11%
faster than the pre5 kernels.


2.4.14-pre5
-----------

mtest01

mp3 played 263 seconds of 321 second run.

Averages for 10 mtest01 runs
bytes allocated: 1241933414
User time (seconds): 2.088
System time (seconds): 3.129
Elapsed (wall clock) time: 32.104
Percent of CPU this job got: 15.70
Major (requiring I/O) page faults: 105.8
Minor (reclaiming a frame) faults: 304000.6

mmap001

No mp3 skips noted.

Average for 5 mmap001 runs
bytes allocated: 2048000000
User time (seconds): 19.510
System time (seconds): 17.438
Elapsed (wall clock seconds) time: 173.48
Percent of CPU this job got: 20.80
Major (requiring I/O) page faults: 500164.4
Minor (reclaiming a frame) faults: 43.6


2.4.14pre5aa1
-------------

mtest01

mp3 played for 318 seconds of 383 second run.

Averages for 10 mtest01 runs
bytes allocated: 1250217164
User time (seconds): 2.017
System time (seconds): 2.959
Elapsed (wall clock) time: 38.255
Percent of CPU this job got: 12.60
Major (requiring I/O) page faults: 125.7
Minor (reclaiming a frame) faults: 306016.6


mmap001

mp3 played 823 seconds of 878 second run.

Average for 5 mmap001 runs
bytes allocated: 2048000000
User time (seconds): 19.496
System time (seconds): 14.450
Elapsed (wall clock seconds) time: 175.54
Percent of CPU this job got: 18.80
Major (requiring I/O) page faults: 500164.4
Minor (reclaiming a frame) faults: 43.8


2.4.13-ac5-freeswap
-------------------

The freeswap patch is from http://www.surriel.com for 2.4.12-ac3.

mtest01

mp3 played 81 seconds of 352 second run.

Averages for 10 mtest01 runs
bytes allocated: 1244345139
User time (seconds): 2.104
System time (seconds): 3.815
Elapsed (wall clock) time: 35.153
Percent of CPU this job got: 16.40
Major (requiring I/O) page faults: 113.1
Minor (reclaiming a frame) faults: 304585.4


mmap001

mp3 played 773 seconds of 774 second run.

Average for 5 mmap001 runs
bytes allocated: 2048000000
User time (seconds): 19.160
System time (seconds): 16.748
Elapsed (wall clock seconds) time: 154.70
Percent of CPU this job got: 22.60
Major (requiring I/O) page faults: 500174.8
Minor (reclaiming a frame) faults: 20.0
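
The percentages in the summary can be re-derived from the figures above. The following is only a minimal sketch: the dictionary layout and the helper names (`played_percent`, `percent_faster`, `report`) are made up for illustration and are not part of the test scripts used for these runs.

```python
# Figures copied from the per-kernel sections of this post.
# "mp3" is (seconds played, seconds of run); "wall" is the averaged
# elapsed time in seconds. mp3 is None where only "no skips" was noted.
RESULTS = {
    "2.4.14-pre5": {
        "mtest01": {"mp3": (263, 321), "wall": 32.104},
        "mmap001": {"mp3": None, "wall": 173.48},
    },
    "2.4.14pre5aa1": {
        "mtest01": {"mp3": (318, 383), "wall": 38.255},
        "mmap001": {"mp3": (823, 878), "wall": 175.54},
    },
    "2.4.13-ac5-freeswap": {
        "mtest01": {"mp3": (81, 352), "wall": 35.153},
        "mmap001": {"mp3": (773, 774), "wall": 154.70},
    },
}


def played_percent(mp3):
    """Share of the run during which the mp3 was actually playing."""
    played, total = mp3
    return 100.0 * played / total


def percent_faster(wall, baseline):
    """How much shorter `wall` is than `baseline`, in percent of baseline."""
    return 100.0 * (baseline - wall) / baseline


def report(results=RESULTS):
    """Return one formatted line per kernel/test pair."""
    lines = []
    for kernel, tests in results.items():
        for test, data in tests.items():
            mp3 = data["mp3"]
            played = "n/a" if mp3 is None else f"{played_percent(mp3):.1f}%"
            lines.append(
                f"{kernel:22} {test:8} wall={data['wall']:7.2f}s mp3={played}"
            )
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
    ac = RESULTS["2.4.13-ac5-freeswap"]["mmap001"]["wall"]
    pre5 = RESULTS["2.4.14-pre5"]["mmap001"]["wall"]
    print(f"ac5-freeswap vs pre5 on mmap001: {percent_faster(ac, pre5):.1f}% faster")
```

The "faster" figure is taken relative to the slower kernel's wall clock time; a different baseline would give a slightly different percentage.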

--
Randy Hron


2001-10-30 07:34:02

by Jens Axboe

Subject: Re: VM test comparison of 2.4.14-pre5, aa1, and 2.4.13-ac5-fs

On Tue, Oct 30 2001, [email protected] wrote:
>
> 2.4.14-pre5 fastest for mtest01, smoothest sound.
> 2.4.14pre5aa1
> 2.4.13-ac5-freeswap fastest for mmap001

Side note -- you cannot directly call this a vm vs vm test, not if you
are doing any significant amount of I/O. The -ac and Linus trees have
several significant changes in the queueing layer that make this pretty
much an apples and oranges comparison.

--
Jens Axboe

2001-10-30 14:49:04

by Andrea Arcangeli

Subject: Re: VM test comparison of 2.4.14-pre5, aa1, and 2.4.13-ac5-fs

On Tue, Oct 30, 2001 at 02:26:40AM -0500, [email protected] wrote:
> 2.4.14-pre5
> -----------
>
> mtest01
>
> mp3 played 263 seconds of 321 second run.
>
> Averages for 10 mtest01 runs
> bytes allocated: 1241933414
> bytes allocated: 1250217164
> bytes allocated: 1244345139

The mtest01 -p runs may be comparing apples to oranges. Please make sure to
apply the -p fix I posted in the last few days before using the -p option,
otherwise the amount of memory swapped out will be random. In this case
the -aa run was penalized a little, for example (not much, but the
difference is on the order of seconds ...).
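
One rough way to see how much the differing allocation sizes matter is to compare bytes allocated per second of wall clock time rather than raw elapsed time. This is only a sketch: the helper names are hypothetical, the elapsed times are the averages from the original post, and dividing by size is an assumption that ignores the fact that swap behaviour does not scale linearly with the amount allocated.

```python
# (average bytes allocated, average elapsed seconds) per kernel,
# taken from the mtest01 sections of the original post.
MTEST01_RUNS = {
    "2.4.14-pre5": (1241933414, 32.104),
    "2.4.14pre5aa1": (1250217164, 38.255),
    "2.4.13-ac5-freeswap": (1244345139, 35.153),
}


def mb_per_second(bytes_allocated, elapsed_seconds):
    """Allocation rate in units of 10**6 bytes per second."""
    return bytes_allocated / elapsed_seconds / 1e6


def size_spread_percent(runs=MTEST01_RUNS):
    """Gap between largest and smallest allocation, in percent of the smallest."""
    sizes = [size for size, _ in runs.values()]
    return 100.0 * (max(sizes) - min(sizes)) / min(sizes)


if __name__ == "__main__":
    for kernel, (size, elapsed) in MTEST01_RUNS.items():
        print(f"{kernel:22} {mb_per_second(size, elapsed):6.2f} MB/s")
    print(f"allocation size spread: {size_spread_percent():.2f}%")
```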

thanks!

Andrea

2001-10-31 05:39:31

by Randy Hron

Subject: Re: VM test comparison of 2.4.14-pre5, aa1, and 2.4.13-ac5-fs

On Tue, Oct 30, 2001 at 03:49:11PM +0100, Andrea Arcangeli wrote:
> >
> > Averages for 10 mtest01 runs
> > bytes allocated: 1241933414
> > bytes allocated: 1250217164
> > bytes allocated: 1244345139
>
> The mtest01 -p runs may be comparing apples to oranges. Please make sure to
> apply the -p fix I posted in the last few days before using the -p option,
>
> thanks!
>
> Andrea

I put Andrea's patch on mtest01.c to remove the variation of memory allocated
between runs. So far I've only tested it on 2.4.14pre5aa1.

Paul Larson's comments about the test working as designed make sense too.

Jens Axboe's comment about how much I/O comes into play in these tests is
right on. The mmap001 test pounds the disk with 20000-30000 Blk_wrtn/s.
mtest01 can do over 50000 Blk_wrtn/s.
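
To put those block rates in more familiar units, here is a small conversion sketch. It assumes that one block in the Blk_wrtn/s column is 512 bytes; that block size is an assumption and is not stated anywhere in this thread, so adjust `BLOCK_SIZE` if the tool reports a different unit.

```python
# Assumption: one reported block is 512 bytes. Not stated in the thread.
BLOCK_SIZE = 512


def blocks_to_mb_per_s(blocks_per_second, block_size=BLOCK_SIZE):
    """Convert a blocks-per-second rate to units of 10**6 bytes per second."""
    return blocks_per_second * block_size / 1e6


if __name__ == "__main__":
    for label, rate in [
        ("mmap001 (low)", 20000),
        ("mmap001 (high)", 30000),
        ("mtest01", 50000),
    ]:
        print(f"{label:15} {rate:6d} blk/s = {blocks_to_mb_per_s(rate):.2f} MB/s")
```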

Small differences in any results should be ignored, because there is always some
variation in the test. For example, one run may receive 120 lines of input from
the IRC clients, and another only 30. In one run I may type 30 characters, in
another 200. That is not a lot of difference when you think about bytes
processed, but it can affect the results.

This isn't a perfectly controlled test, though I try to keep variations
down to the things listed above. For the last 10 days or so, I always use
the same mp3 sampled at 128k.

There are differences in .config too. They all started with the same .config,
but after "make oldconfig", they change a little. In the diff below, aa=Andrea,
ac=Alan, lt=Linus. (This is from latest pre5aa1, ac5, and pre5 kernels)

diff aa ac
> CONFIG_GENERIC_ISA_DMA=y
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y
> CONFIG_X86_PPRO_FENCE=y
< CONFIG_NO_PAGE_VIRTUAL=y

diff aa lt
> CONFIG_RWSEM_XCHGADD_ALGORITHM=y
< CONFIG_NO_PAGE_VIRTUAL=y

diff ac lt
< CONFIG_GENERIC_ISA_DMA=y
< CONFIG_X86_PPRO_FENCE=y


The kernels themselves vary in size a bit.

2.4.14pre5aa
Memory: 514516k/524224k available (912k kernel code, 9320k reserved, 231k data, 208k init

2.4.13-ac5
Memory: 513492k/524224k available (911k kernel code, 10344k reserved, 235k data, 212k init

2.4.14-pre5
Memory: 514060k/524224k available (904k kernel code, 9776k reserved, 228k data, 208k init


This is from 2.4.14pre5aa1 with mtest01 patch.

mp3 played 288 seconds of 392 second run (new run with patch - more bytes allocated)
mp3 played 318 seconds of 383 second run (previous run).

Averages for 10 mtest01 runs
bytes allocated: 1284505600
User time (seconds): 2.145
System time (seconds): 3.005
Elapsed (wall clock) time: 39.206
Percent of CPU this job got: 12.60
Major (requiring I/O) page faults: 132.1
Minor (reclaiming a frame) faults: 314387.9

mp3 played 848 seconds of 875 second run.
mp3 played 823 seconds of 878 second run (previous run).

Average for 5 mmap001 runs
bytes allocated: 2048000000
User time (seconds): 19.530
System time (seconds): 14.396
Elapsed (wall clock seconds) time: 174.98
Percent of CPU this job got: 19.00
Major (requiring I/O) page faults: 500165.0
Minor (reclaiming a frame) faults: 43.0

For the re-run of mmap001, the seconds played varied by 3% from the
previous run. Any difference of 5% or less should be ignored, imho.
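
The 5% rule of thumb can be written down as a small check. This is a sketch only: the function names are hypothetical, and taking the difference relative to the larger of the two values is one choice among several.

```python
def relative_difference(a, b):
    """Absolute difference between a and b, relative to the larger magnitude."""
    largest = max(abs(a), abs(b))
    if largest == 0:
        return 0.0
    return abs(a - b) / largest


def is_significant(a, b, threshold=0.05):
    """True when two results differ by more than the threshold (default 5%)."""
    return relative_difference(a, b) > threshold


if __name__ == "__main__":
    # Fraction of the mmap001 run during which the mp3 played, new vs previous.
    new_played = 848 / 875
    old_played = 823 / 878
    print("mp3 played:", is_significant(new_played, old_played))

    # mmap001 wall clock seconds, 2.4.13-ac5-freeswap vs 2.4.14-pre5.
    print("ac5 vs pre5 wall time:", is_significant(154.70, 173.48))
```

With these inputs the re-run of mmap001 falls inside the noise band, while the gap between 2.4.13-ac5-freeswap and 2.4.14-pre5 on mmap001 does not.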

So I'm thinking about just continuing to use the mtest01 from LTP,
knowing that with my test conditions, a variation of a few percent
isn't significant.

Andrea, is that okay with you?

--
Randy Hron

2001-11-04 01:46:00

by Andrea Arcangeli

Subject: Re: VM test comparison of 2.4.14-pre5, aa1, and 2.4.13-ac5-fs

On Wed, Oct 31, 2001 at 12:41:29AM -0500, [email protected] wrote:
> So I'm thinking about just continuing to use the mtest01 from LTP,
> knowing that with my test conditions, a variation of a few percent
> isn't significant.
>
> Andrea, is that okay with you?

Why don't you use the -b option with the mean size of all the previous
runs that you did with the default -p option? This way you'd avoid
throwing away all the previous results, and the new ones would be more
reliable. I'd prefer that you always allocate the same amount of
memory. The variations are not huge, so I guess it's better to reduce
the userspace noise.
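
As a sketch of that suggestion, the fixed size could be derived from the per-kernel averages already posted. Each average covers 10 runs, so the mean of the three averages equals the mean over all 30 runs. Only the -b flag comes from this thread; the `./mtest01` path and the helper names are hypothetical, and any other options the test needs are not shown.

```python
# Average bytes allocated per kernel (10 runs each) from the original post.
PREVIOUS_AVERAGES = {
    "2.4.14-pre5": 1241933414,
    "2.4.14pre5aa1": 1250217164,
    "2.4.13-ac5-freeswap": 1244345139,
}


def mean_allocation(averages=PREVIOUS_AVERAGES):
    """Integer mean of the averages; valid because each covers 10 runs."""
    values = list(averages.values())
    return sum(values) // len(values)


def mtest01_command(binary="./mtest01"):
    """Build an argument list using -b; the binary path is a placeholder."""
    return [binary, "-b", str(mean_allocation())]


if __name__ == "__main__":
    print(" ".join(mtest01_command()))
```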

In particular I'm interested in whether you can see significant performance
variations between pre5aa1, pre6aa1 and pre7aa2. I'm also testing
here (mainly Linus's 40m kde test, to verify interactive response in real
life, which unfortunately cannot produce raw numbers) and I didn't have
much time to spend producing numbers since there were also some bugs to
fix until yesterday (they should all be fixed in pre7aa2). The recent changes
were mostly driven by the kde mem=40m workload, which is pretty
usable for me in pre7aa2 (xmms never skips a beat while playing mp3
even with mem=40m, and browsing the web with konqueror is fluid; mozilla
is also usable but much slower than konqueror with mem=40m because it's
a true memory hog, at least when flash starts, and of course it shares
fewer libs with the rest of the desktop).

Andrea