2002-09-18 14:04:54

by Randy Hron

Subject: Hint benchmark reaches memory size limit on 4gb box

3.75 gb ram
4 gb swap on 2 disks
quad xeon

Running the FLOAT benchmark from
ftp://ftp.scl.ameslab.gov/pub/HINT/hint.src.tar.gz
on 2.5.34-mm1 gave:

This run was memory limited at 31438643 subintervals -> -1894198156 bytes

The last I noticed, the process was around 2.6 GB.
The process grows over time as it needs memory.
It may have hit 3GB.
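
The negative byte count above is what one would see if the size were held in a 32-bit signed integer that wrapped past 2^31. That is an assumption about how hint computes and prints the value, not something confirmed from its source. A minimal Python sketch of reinterpreting the printed number as unsigned:

```python
def as_unsigned32(value: int) -> int:
    """Reinterpret a 32-bit signed integer as unsigned (two's complement)."""
    return value & 0xFFFFFFFF


reported = -1894198156            # byte count printed by hint
actual = as_unsigned32(reported)  # assumes a 32-bit signed wraparound
print(actual, "bytes, about", round(actual / 2**30, 2), "GiB")
```

Under that assumption the reported size is 2400769140 bytes, a little over 2.2 GiB, which is below the ~3 GB user address space.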

The version of hint is from the tarball's
source/serial/unix directory.

The goal is to combine several benchmarks for
a more rounded workload.

I could run 2 copies of Hint if this is a 3gb
userspace limit issue.

Parts of the combined/concurrent benchmark:
1) hint (possibly FLOAT & LONGLONG together)
2) netperf -t TCP_RR # request/response
3) chat # 2 rooms with semi-long lived clients
4) postmark # 2 directories + lots of files
5) configure && make && make check GNU ed

Any suggestions?

--
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html


2002-09-19 23:53:08

by Randy Hron

Subject: Re: Hint benchmark reaches memory size limit on 4gb box


>> 1) hint (possibly FLOAT & LONGLONG together)
>> 2) netperf -t TCP_RR # request/response
>> 3) chat # 2 rooms with semi-long lived clients
>> 4) postmark # 2 directories + lots of files
>> 5) configure && make && make check GNU ed

>> Any suggestions?

> Dunno, Randy. I'd say, yes, you hit 3G. I guess one
> needs to look to find a way to make it less consumptive.

It's been running for about 20 hours on 2.5.34-mm1.

A few observations:
The swap happy processes from hint _really_ slowed
down when they hit swap. Swap is on two SCSI spindles
shared with standard filesystems. It seems they are
being penalized a lot for being swap hogs, though it
could be just that the swap devices are slow. Hint
may be abnormal in that it really accesses all of the
process's memory space. (I'd prefer a combination
of a big process that uses a lot of mem, and other
processes that are big but relatively inactive so
they get paged out.)

I don't have any other systems to compare the
current run to.

I expect the hint processes to run until either swap
is full, or they hit the ~3gb limit. At the current
rate it may be a day or two.

So I'm wondering if you think I should just abort the
current test, and try 2.5.36-mm1, or if the benchmark
needs adjustment.

Early in the run, netperf had mostly "low confidence"
intervals, i.e. confidence < 60%. In the later runs,
now that swap is heavily utilized, confidence is high.

              Trans.    CPU    CPU     S.dem    S.dem
              Rate      local  remote  local    remote
              per sec   % S    % S     us/Tr    us/Tr
early in run  15423.32  99.98  106.65  282.036  300.834
later in run  17494.21  99.98  106.65  228.648  243.888

I'm not running chat. I may add that if I can teach it
to throttle sensibly.

I was surprised that early in the run ~300 MB of swap was
in use, though the hint processes were only ~500 MB. I.e.,
swap was seeing some action earlier than I expected.

postmark creates ~65 gb of stuff. It uses a lun that
isn't shared with swap.

The ed compile loop is very fast.

This is a bit of top. High system time here.

4:07pm up 3 days, 22:18, 1 user, load average: 5.86, 5.71, 5.65
59 processes: 56 sleeping, 3 running, 0 zombie, 0 stopped
CPU0 states: 3.12% user, 96.2% system, 0.0% nice, 0.0% idle
CPU1 states: 57.1% user, 42.4% system, 0.0% nice, 0.0% idle
CPU2 states: 68.3% user, 31.5% system, 0.0% nice, 0.0% idle
CPU3 states: 51.4% user, 48.1% system, 0.0% nice, 0.0% idle
Mem: 3723360K av, 3717740K used, 5620K free, 0K shrd, 74488K buff
730812K actv, 2662512K in_d, 0K in_c, 0K target
Swap: 4065056K av, 3073364K used, 991692K free 1907636K cached

  PID USER     PRI  NI  SIZE   RSS SHARE STAT %CPU %MEM   TIME COMMAND
12571 root      16   0  1708   728  1636 S    52.1  0.0   1:15 netperf
12572 root      25   0  1656   552  1656 R    47.6  0.0   1:09 netserver
10889 root      15   0 20560   18M  1368 D    25.7  0.4 148:43 postmark-1_5
   11 root      15   0     0     0     0 SW    5.1  0.0 107:05 kswapd0
27998 root      19   0  7408  5792  3788 R     3.8  0.1   0:00 cc1
21393 root      15   0 1626M  328M  1508 D     1.3  9.0 117:13 LONGLONG
21395 root      15   0 1788M  348M  1508 D     1.3  9.5 106:41 DOUBLE
21351 root      17   0  2240  1024  2140 S     0.1  0.0   0:26 run_ed
26002 root      15   0     0     0     0 SW    0.1  0.0   1:06 pdflush
    1 root      15   0  1420   476  1384 S     0.0  0.0   0:19 init


Here is some vmstat 30 output: cs is high. Oddly, si/so, bi/bo and in are all 0.
That's with either procps-2.5.34-mm1 or rml's recent procps.

   procs            memory                swap          io     system       cpu
 r  b  w    swpd   free   buff   cache  si  so    bi    bo   in    cs  us sy id
 1  3  0 3126544   5444  75556 2013780   0   0     0     0    0 36193  37 63  0
 1  3  1 3127780   5828  75832 2014888   0   0     0     0    0 31957  37 63  0
 4  3  0 3127676  11192  76252 2005696   0   0     0     0    0 36403  36 64  0
 3  3  0 3126180   5672  76220 2008508   0   0     0     0    0 31978  39 61  0
 3  3  0 3127720   6700  76364 2010060   0   0     0     0    0 36683  36 64  0
 2  3  0 3126988   5444  76492 2012024   0   0     0     0    0 31689  38 62  0


iostat 30 says there is really disk activity:
Device:     tps  Blk_read/s  Blk_wrtn/s  Blk_read  Blk_wrtn
dev8-0   406.46     6285.98     2083.54    108056     35816  (root/swap)
dev8-1   103.49     1149.51      916.35     19760     15752  (usr/swap)
dev8-2   333.51    16341.13    13502.73    280904    232112  (raid5 array)

Should the bench be adjusted, or should I boot 2.5.36-mm1?

--
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html

2002-09-20 00:20:50

by Andrew Morton

Subject: Re: Hint benchmark reaches memory size limit on 4gb box

Randy Hron wrote:
>
> >> 1) hint (possibly FLOAT & LONGLONG together)
> >> 2) netperf -t TCP_RR # request/response
> >> 3) chat # 2 rooms with semi-long lived clients
> >> 4) postmark # 2 directories + lots of files
> >> 5) configure && make && make check GNU ed
>
> >> Any suggestions?
>
> > Dunno, Randy. I'd say, yes, you hit 3G. I guess one
> > needs to look to find a way to make it less consumptive.
>
> It's been running for about 20 hours on 2.5.34-mm1.

Well it sounds like it's stable. This is on the quad, I assume.

> A few observations:
> The swap happy processes from hint _really_ slowed
> down when they hit swap.

swapout is bust in that kernel. 2.5.36-mm1 has the fix, but
it's just a one-liner:
http://www.zip.com.au/~akpm/linux/patches/2.5/2.5.36/2.5.36-mm1/broken-out/vm-mapping-fix.patch

Really, I just haven't started looking at behaviour under
swappy loads. Even with simple tests the kernel does seem
to be making incorrect eviction decisions, at a slow rate.

(The test: boot with mem=192m, start `vmstat 1', run your
standard memset(malloc(1G)) test. On the second run the kernel
is continuously doing a trickle of reads. Some from swap, some
from executables. It shouldn't. 2.5.26 doesn't. 2.4.19-ac1 does)
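
For readers who want to try something similar, here is a rough Python stand-in for the memset(malloc(1G)) test described above. It is only a sketch: the function name `touch_memory` is made up for illustration, and it dirties one byte per page rather than every byte, which should be enough to force each page to be faulted in.

```python
import mmap


def touch_memory(size: int, page: int = mmap.PAGESIZE) -> int:
    """Map `size` bytes of anonymous memory and write one byte per page.

    Returns the number of pages touched.
    """
    buf = mmap.mmap(-1, size)  # anonymous mapping, not backed by a file
    try:
        touched = 0
        for offset in range(0, size, page):
            buf[offset] = 1  # the write forces the page to be instantiated
            touched += 1
        return touched
    finally:
        buf.close()
```

Calling `touch_memory(1 << 30)` twice in a row on a box booted with mem=192m, with `vmstat 1` running in another terminal, would approximate the procedure above.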

> I expect the hint processes to run until either swap
> is full, or they hit the ~3gb limit. At the current
> rate it may be a day or two.

If a performance test takes more than 5-10 minutes to run, it's
being silly. 30 seconds is enough for most things.

> So I'm wondering if you think I should just abort the
> current test, and try 2.5.36-mm1, or if the benchmark
> needs adjustment.

Both, it looks.

> ...
>
> PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
> 12571 root 16 0 1708 728 1636 S 52.1 0.0 1:15 netperf
> 12572 root 25 0 1656 552 1656 R 47.6 0.0 1:09 netserver
> 10889 root 15 0 20560 18M 1368 D 25.7 0.4 148:43 postmark-1_5
> 11 root 15 0 0 0 0 SW 5.1 0.0 107:05 kswapd0

OK, that's the sort of kswapd load which I see under heavy testing.
That's 1.25% of total CPU, and it really isn't just spinning wheels,
promise.
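
The 1.25% figure follows from top's per-process %CPU being relative to a single processor, which is assumed here to be how top was reporting on this quad box. A small sketch of the arithmetic:

```python
def share_of_total_cpu(per_cpu_percent: float, ncpus: int) -> float:
    """Convert a %CPU value relative to one processor into a share of the whole machine."""
    return per_cpu_percent / ncpus


# kswapd0 shows 5.1 %CPU in the top output; the box has 4 CPUs.
print(share_of_total_cpu(5.1, 4))
```

That comes to a bit under 1.3% of the machine, in line with the rounded 1.25% quoted above.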

> ..
>
> Here is some vmstat 30: cs is high. Oddly si/so bi/bo and in are 0.
> That's with either procps-2.5.34-mm1 or rml's recent procps.

Yup. That info got shuffled over to /proc/vmstat. There will
be some brokenness for a while.
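
Until the tools catch up, the counters can be read from /proc/vmstat directly. The sketch below assumes the file holds one "name value" pair per line and that the swap and paging counters are called pswpin, pswpout, pgpgin and pgpgout, as in current kernels; whether 2.5.36 uses exactly those names is an assumption. The helper names are made up for illustration.

```python
import time


def parse_vmstat(text: str) -> dict:
    """Parse 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def rates(before: dict, after: dict, seconds: float,
          keys=("pswpin", "pswpout", "pgpgin", "pgpgout")) -> dict:
    """Per-second rate of change for each counter present in both samples."""
    return {k: (after[k] - before[k]) / seconds
            for k in keys if k in before and k in after}


def sample(interval: float = 30.0, path: str = "/proc/vmstat") -> dict:
    """Take two snapshots `interval` seconds apart and return the rates."""
    with open(path) as f:
        first = parse_vmstat(f.read())
    time.sleep(interval)
    with open(path) as f:
        second = parse_vmstat(f.read())
    return rates(first, second, interval)
```

Calling `sample(30)` in a loop gives roughly what the si/so and bi/bo columns of `vmstat 30` used to show, in counter units per second rather than vmstat's own units.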

> ..
> iostat 30 says there is really disk activity:
> Device: tps Blk_read/s Blk_wrtn/s Blk_read Blk_wrtn
> dev8-0 406.46 6285.98 2083.54 108056 35816 (root/swap)
> dev8-1 103.49 1149.51 916.35 19760 15752 (usr/swap)
> dev8-2 333.51 16341.13 13502.73 280904 232112 (raid5 array)

The sard code seems to be working nicely.

> Should the bench be adjusted, or should I boot 2.5.36-mm1?

Both, sorry.

2002-09-23 23:46:48

by Randy Hron

Subject: Re: Hint benchmark reaches memory size limit on 4gb box

>> Should the bench be adjusted, or should I boot 2.5.36-mm1?

> Both, sorry.

Lorenzo did an update to qsbench. qsbench is much faster
than hint for a ram shortage simulation.

Current lineup:
qsbench with 2 processes = 120% of TotalMem

1) qsbench alone
2) qsbench with ed compile loop
3) qsbench + very small chat bench loop (5 clients, 1 room)
4) qsbench + postmark loop with ~ 40000 small files

Hm, guess I should also time 2, 3, and 4 without qsbench
for comparison. 2, 3, and 4 run in less than 10 seconds
on the quad xeon. The idea is to count the loops that complete
during the time it takes qsbench to do its thing.
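
The loop counting could be done with a small driver along these lines. This is only a sketch of the idea: `count_loops` is a made-up name, and the real command lines for qsbench and the secondary loads would go where the placeholder lists are passed in.

```python
import subprocess


def count_loops(primary_cmd, loop_cmd) -> int:
    """Run loop_cmd over and over while primary_cmd is alive.

    Returns the number of loop_cmd runs that finished before the
    primary process exited.
    """
    primary = subprocess.Popen(primary_cmd)
    loops = 0
    while primary.poll() is None:
        subprocess.run(loop_cmd, check=True)
        if primary.poll() is None:
            loops += 1  # count only loops completed while the primary still ran
    primary.wait()
    return loops
```

Running the same secondary load for the same wall-clock time without the primary process would give the baseline to compare against.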

The first run on 2.5.38 got overwritten during a mkfs. :(

Any suggestions before I lose the next batch of data? ;)

--
Randy Hron
http://home.earthlink.net/~rwhron/kernel/bigbox.html