2002-07-19 02:12:47

by J.A. Magallon

Subject: [PATCHSET] Linux 2.4.19-rc1-jam1

Hi all...

BEWARE: this kernel will probably eat your disk and your dog, but anyway...
Problems:
- I could not merge the Promise part of ide-convert-10 (due to a big update
of the in-kernel code), so I just dropped it. So Promise users are out of the
marketshare. If things keep going this way, perhaps I will have to drop ide-10.
- I have added the scalable-timers and irqrate-limiting patches again.
My box hangs when rmmod'ing modules with irqrate applied (SMP box; perhaps it
is a locking problem and UP is safe...), even without bproc. It also happened
in previous releases, but I did not notice until recently. So irqrate
improves latency, but kills the box ;).

The idea is to ease the way for Randy Hron to compare:
- rc2
- rc2-aa1
- rc2-jam1 minus irqrate == rc2-aa1 plus smptimers (that I think will not
make a big difference)
- rc2-jam1 full == rc2-aa1 + smptimers + irqrate (if you don't rmmod anything...)

I hope this is useful.

Get it at:

http://giga.cps.unizar.es/~magallon/linux/kernel/2.4.19-rc2-jam1.tar.bz2

TIA

--
J.A. Magallon \ Software is like sex: It's better when it's free
mailto:[email protected] \ -- Linus Torvalds, FSF T-shirt
Linux werewolf 2.4.19-rc2-jam1, Mandrake Linux 8.3 (Cooker) for i586
gcc (GCC) 3.1.1 (Mandrake Linux 8.3 3.1.1-0.8mdk)


2002-07-19 02:39:30

by Thunder from the hill

Subject: Re: [PATCHSET] Linux 2.4.19-rc1-jam1

Hi,

On Fri, 19 Jul 2002, J.A. Magallon wrote:
> The idea is to ease the way for Randy Hron to compare:
> - rc2
> - rc2-aa1
> - rc2-jam1 minus irqrate == rc2-aa1 plus smptimers (that I think will not
> make a big difference)
> - rc2-jam1 full == rc2-aa1 + smptimers + irqrate (if you don't rmmod anything...)

So the heading is inaccurate.

Regards,
Thunder
--
(Use http://www.ebb.org/ungeek if you can't decode)
------BEGIN GEEK CODE BLOCK------
Version: 3.12
GCS/E/G/S/AT d- s++:-- a? C++$ ULAVHI++++$ P++$ L++++(+++++)$ E W-$
N--- o? K? w-- O- M V$ PS+ PE- Y- PGP+ t+ 5+ X+ R- !tv b++ DI? !D G
e++++ h* r--- y-
------END GEEK CODE BLOCK------

2002-07-19 15:22:54

by J.A. Magallon

Subject: [PATCHSET] Linux 2.4.19-rc2-jam1 [Was: Re: [PATCHSET] Linux 2.4.19-rc1-jam1]


On 2002.07.19 Thunder from the hill wrote:
>Hi,
>
>On Fri, 19 Jul 2002, J.A. Magallon wrote:
>> The idea is to ease the way for Randy Hron to compare:
>> - rc2
>> - rc2-aa1
>> - rc2-jam1 minus irqrate == rc2-aa1 plus smptimers (that I think will not
>> make a big difference)
>> - rc2-jam1 full == rc2-aa1 + smptimers + irqrate (if you don't rmmod anything...)
>
>So the heading is inaccurate.
>

Oops, yes. It is against rc2.

--
J.A. Magallon \ Software is like sex: It's better when it's free
mailto:[email protected] \ -- Linus Torvalds, FSF T-shirt
Linux werewolf 2.4.19-rc2-jam1, Mandrake Linux 8.3 (Cooker) for i586
gcc (GCC) 3.1.1 (Mandrake Linux 8.3 3.1.1-0.8mdk)

2002-07-20 17:11:23

by Randy Hron

Subject: Re: [PATCHSET] Linux 2.4.19-rc1-jam1

Andrea put many pieces of read_latency2 in -aa. One thing
I'd like to see -jam try is nr_requests = 256 in
drivers/block/ll_rw_blk.c.

Andrew's read_latency2 had nr_requests set to 1024 for most
machines. The nr_requests maximum is 128 in -marcelo, -aa and -jam.

When dbench process count is 64, the original read_latency2
helped throughput.

dbench 64 ext2        Average     High      Low
2.4.19-pre7            146.00   160.24   103.17  MB/sec
2.4.19-pre7-rl         151.41   155.75   137.63

dbench 64 reiserfs
2.4.19-pre7             67.86    68.94    66.70
2.4.19-pre7-rl          70.49    71.11    69.96

dbench 64 ext3
2.4.19-pre7             81.84    89.13    64.81
2.4.19-pre7-rl          81.73    85.28    73.01

It helped a little on ext2 for dbench 192.

dbench 192 ext2       Average     High      Low
2.4.19-pre7            113.99   119.63   107.52  MB/sec
2.4.19-pre7-rl         115.59   120.17   111.55

But for reiserfs and ext3, it hurt on dbench 192.

dbench 192 ext3
2.4.19-pre7             60.24    61.03    58.54
2.4.19-pre7-rl          32.08    32.82    31.59

dbench 192 reiserfs
2.4.19-pre7             49.35    50.63    48.65
2.4.19-pre7-rl          27.30    28.13    26.55
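
To put the dbench numbers above on one scale, here is a minimal sketch that
computes the relative change in average throughput from 2.4.19-pre7 to
2.4.19-pre7-rl. The averages are copied from the tables in this mail; the
names DBENCH_AVG, percent_change and summarize are made up for illustration
and are not part of dbench or any kernel tool.

```python
# Average dbench throughput in MB/sec, copied from the tables in this mail.
# Keys are (process count, filesystem); values are (2.4.19-pre7, 2.4.19-pre7-rl).
# All names here are hypothetical helpers, not part of dbench.
DBENCH_AVG = {
    (64, "ext2"): (146.00, 151.41),
    (64, "reiserfs"): (67.86, 70.49),
    (64, "ext3"): (81.84, 81.73),
    (192, "ext2"): (113.99, 115.59),
    (192, "ext3"): (60.24, 32.08),
    (192, "reiserfs"): (49.35, 27.30),
}


def percent_change(base, patched):
    """Relative change of `patched` against `base`, in percent."""
    return (patched - base) / base * 100.0


def summarize(data):
    """Map each (processes, filesystem) key to its percent change, 1 decimal."""
    return {key: round(percent_change(*pair), 1) for key, pair in data.items()}


if __name__ == "__main__":
    for (procs, fs), change in summarize(DBENCH_AVG).items():
        print(f"dbench {procs:>3} {fs:<8} {change:+.1f}%")
```

Read this way, the dbench 64 changes are all within a few percent, while the
dbench 192 averages on ext3 and reiserfs drop by roughly 45 percent.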


On tiobench, the original read_latency2 made max latency drop from
457 seconds down to 2 seconds when there were 64 threads.
Throughput went up too.

Sequential Reads ext2

                    Num            Maximum
Kernel              Thr   MB/sec   Latency (seconds)
-----------------   ---   ------   -----------------
2.4.19-pre7          64    31.68     457.5
2.4.19-pre7-rl       64    35.77       2.1

At 256 threads, read_latency2 also dropped latency and
improved throughput.

2.4.19-pre7         256    33.18     752.4
2.4.19-pre7-rl      256    36.51     134.0
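
As a rough sketch of the size of these effects, the snippet below computes how
many times smaller the maximum latency is with read_latency2, and the
throughput gain in percent, for the two thread counts. The figures are copied
from the tiobench results in this mail; TIOBENCH and compare are hypothetical
names used only for this illustration.

```python
# Sequential-read ext2 tiobench results copied from this mail:
# threads -> kernel -> (MB/sec, maximum latency in seconds).
# TIOBENCH and compare() are hypothetical names, not part of tiobench.
TIOBENCH = {
    64: {"2.4.19-pre7": (31.68, 457.5), "2.4.19-pre7-rl": (35.77, 2.1)},
    256: {"2.4.19-pre7": (33.18, 752.4), "2.4.19-pre7-rl": (36.51, 134.0)},
}


def compare(threads, base="2.4.19-pre7", patched="2.4.19-pre7-rl"):
    """Return (latency reduction factor, throughput gain in percent)."""
    base_mb, base_lat = TIOBENCH[threads][base]
    rl_mb, rl_lat = TIOBENCH[threads][patched]
    latency_factor = base_lat / rl_lat
    throughput_gain = (rl_mb - base_mb) / base_mb * 100.0
    return latency_factor, throughput_gain


if __name__ == "__main__":
    for threads in sorted(TIOBENCH):
        factor, gain = compare(threads)
        print(f"{threads:>3} threads: max latency {factor:.1f}x lower, "
              f"throughput {gain:+.1f}%")
```

The latency reduction is far larger at 64 threads than at 256, while the
throughput gain is similar in both cases.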

ext3 and reiserfs did not have sequential read throughput
regressions with read_latency2 for tiobench.

On to more recent kernels.

At 64 threads, ac2 has low maximum latency (ac has read_latency2
with nr_requests = 1024).

                    Num            Maximum
Kernel              Thr   MB/sec   Latency (seconds)
-----------------   ---   ------   -----------------
2.4.19-pre10-ac2     64    30.19       1.5
2.4.19-pre10-jam3    64    40.69      29.5
2.4.19-rc1           64    32.86     342.3
2.4.19-rc1-aa2       64    40.72      31.4
2.5.26               64    23.11     561.8

At 256 threads, it's a different story. -aa and -jam have the
lowest max latency numbers, but they are still high.

                    Num            Maximum
Kernel              Thr   MB/sec   Latency (seconds)
-----------------   ---   ------   -----------------
2.4.19-pre10-ac2    256    29.66    1083.4
2.4.19-pre10-jam3   256    40.35     108.0
2.4.19-rc1          256    32.67     855.9
2.4.19-rc1-aa2      256    40.57     129.5
2.5.26              256    22.23    1135.0

queue_nr_requests is 256 in 2.5.26.

dbench ext2 64 processes     Average     High      Low
2.5.26                        220.13   231.97   194.40  MB/sec

dbench ext2 192 processes    Average     High      Low
2.5.26                        185.87   210.57   152.97

It's Andrew's other magic that makes dbench so high in 2.5.26,
but I wonder if nr_requests = 256 would improve latency/throughput
in -aa and -jam without regressing dbench 192 on ext3/reiserfs.
--
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html