I compared the two leading kernels at the moment: 2.4.11 and 2.4.10-ac10.
The latter of course has Rik's VM while the former uses AA's. I used a
simple test that simulates the loads I am interested in. One process
randomly seeks in a 1GB file, reads a 16MB chunk, then seeks again,
writes a 16MB chunk, and fsync()s the file. The other process allocates
progressively larger amounts of memory and reads and writes it a few
times. The interesting thing happens when this second process starts to
use nearly all the physical memory in the machine.
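Roughly, the two processes look like this (a minimal sketch of the idea,
not the attached source; the file name, the 2MB allocation step, and the
pass count are illustrative assumptions):

/* Minimal sketch of the two test processes.  Not the attached source;
 * "bigfile", the allocation step, and the pass count are illustrative. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>

#define CHUNK    (16UL * 1024 * 1024)		/* 16MB per read/write */
#define FILESIZE (1024UL * 1024 * 1024)		/* 1GB test file */

/* Process 1: seek to a random offset in a pre-built 1GB file, read
 * 16MB, seek again, write 16MB, then fsync(). */
static void io_test(void)
{
	char *buf = malloc(CHUNK);
	int fd = open("bigfile", O_RDWR);	/* file must already exist */
	off_t pos;

	if (fd < 0 || !buf) { perror("io_test"); exit(1); }
	for (;;) {
		pos = (random() % (FILESIZE / CHUNK)) * CHUNK;
		lseek(fd, pos, SEEK_SET);
		read(fd, buf, CHUNK);

		pos = (random() % (FILESIZE / CHUNK)) * CHUNK;
		lseek(fd, pos, SEEK_SET);
		write(fd, buf, CHUNK);
		fsync(fd);
	}
}

/* Process 2: allocate progressively larger blocks and read and write
 * every page of each block a few times before moving on. */
static void flip_test(void)
{
	size_t size, i;
	int pass;

	for (size = 16UL << 20; ; size += 2UL << 20) {
		char *mem = malloc(size);
		volatile char sink = 0;

		if (!mem) {
			fprintf(stderr, "malloc of %lu MB failed\n",
				(unsigned long)(size >> 20));
			break;
		}
		for (pass = 0; pass < 4; pass++) {
			memset(mem, pass, size);	/* write every page */
			for (i = 0; i < size; i += 4096)
				sink += mem[i];		/* read every page */
		}
		printf("%lu MB done\n", (unsigned long)(size >> 20));
		free(mem);
	}
}

int main(void)
{
	if (fork() == 0)
		io_test();
	else
		flip_test();
	return 0;
}

The sketch omits the per-iteration timing/throughput logging that
produces the numbers in the attached .log files.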
I made a few subjective observations about the kernels. Both of them made
progress under heavy swap + I/O loads, where kernels before 2.4.10 would
uniformly livelock in kswapd. 2.4.10-ac10 seemed to have better
interactive performance under swap. Switching xterms and using Galeon was
faster under 2.4.10-ac10 than under 2.4.11. Both systems were usable
under swap, which again is a huge departure from past 2.4 kernels.
2.4.10-ac10 seems to have lengthy stalls in certain cases.
When doing fast sequential I/O (building the 1GB file used in this test
for example), I/O always proceeded but sometimes the X pointer would stick
for 1-2 seconds at a stretch. 2.4.10-ac10 also seems to bounce off the
upper limit of memory: if the disk cache approaches the limit of physical
memory, everything will stop for up to 1 second while the kernel reclaims
some RAM. Then the cache starts to run up again. Hence the "bounce".
2.4.10-ac10 seems to have better disk I/O under stress. This may be due
to VM or elevator changes. In any case, updatedb finished much more
quickly under a swap load on 2.4.10-ac10 than it did on 2.4.11.
My tests were run on a 1.4GHz Athlon CPU with 256 MB main memory and 256
MB swap. Storage is a 2-disk software RAID-1 attached to an aic7xxx host
adapter. This is also where swap lives. File system is ext2.
The source for my two tests is attached, along with the output from both
running simultaneously on each kernel ([io|flip]-[ac|linus].log). I hope
someone finds this useful.
The only hard conclusion I came to was that the AA VM deals with low
memory situations more sanely. On 2.4.10-ac10, the 236MB test took 161
seconds. The 238MB test was killed after 5 minutes. 2.4.11 proceeded
through 244MB at a constant 134 second pace before I just killed it from
boredom.
I'd love to make some charts here but my Gnumeric is too old. If someone
makes any, please send me a few PNG files of them. Note that you can make
a time chart of I/O by dividing 16MB by the throughput numbers to find
how long the iteration took.
-jwb