2002-12-27 05:37:53

by Con Kolivas

Subject: [BENCHMARK] vm swappiness with contest

Here is a family of contest benchmarks using the osdl hardware in uniprocessor
mode on 2.5.53-mm1 while varying vm swappiness. s020 is vm swappiness=20 and
so on:

noload:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 71.1 95 0 0 1.07
s020 [5] 71.9 95 0 0 1.08
s040 [5] 71.7 95 0 0 1.07
s060 [5] 71.3 96 0 0 1.07
s080 [5] 71.3 95 0 0 1.07
s100 [5] 71.6 95 0 0 1.07

cacherun:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 68.4 99 0 0 1.02
s020 [5] 68.8 99 0 0 1.03
s040 [5] 68.7 99 0 0 1.03
s060 [5] 68.6 99 0 0 1.03
s080 [5] 68.5 99 0 0 1.03
s100 [5] 68.7 99 0 0 1.03

process_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 119.0 57 49 41 1.78
s020 [5] 119.4 57 49 41 1.79
s040 [5] 118.6 57 48 41 1.78
s060 [5] 117.6 57 47 41 1.76
s080 [5] 119.5 57 49 41 1.79
s100 [5] 118.4 58 48 40 1.77

dbench_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 191.6 41 1 43 2.87
s020 [5] 195.5 40 1 44 2.93
s040 [5] 197.9 41 1 43 2.96
s060 [5] 331.4 32 0 23 4.96
s080 [5] 439.4 24 0 10 6.58
s100 [5] 883.6 13 1 9 13.24
This is the first massive effect of changing this value. A recurring theme is
that large file writes with a swappy kernel seem to waste time swapping
application data out in favour of the IO data. Obviously none of these IO
loads use the O_DIRECT option.
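
For reference, using O_DIRECT would look something like the sketch below
(illustrative only - the file name and sizes are made up, and this is not
code from contest):

/* Illustrative sketch: open with O_DIRECT so writes bypass the page
 * cache entirely.  O_DIRECT requires the user buffer and transfer
 * size to be suitably aligned, hence posix_memalign(). */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        void *buf;
        int fd = open("testfile", O_WRONLY | O_CREAT | O_DIRECT, 0644);

        if (fd < 0)
                return 1;
        if (posix_memalign(&buf, 4096, 1 << 20))  /* 1MB, page aligned */
                return 1;
        memset(buf, 0, 1 << 20);
        if (write(fd, buf, 1 << 20) < 0)          /* straight to disk */
                return 1;
        close(fd);
        free(buf);
        return 0;
}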

ctar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 113.4 81 3 10 1.70
s020 [5] 103.3 80 2 9 1.55
s040 [5] 110.0 79 3 9 1.65
s060 [5] 103.0 80 3 9 1.54
s080 [5] 104.4 80 2 8 1.56
s100 [5] 100.2 80 2 8 1.50
Slightly slower with an unswappy kernel

xtar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 139.2 65 3 8 2.09
s020 [5] 128.7 70 2 7 1.93
s040 [5] 152.8 59 3 7 2.29
s060 [5] 137.0 64 2 6 2.05
s080 [5] 124.6 66 2 6 1.87
s100 [5] 127.6 66 2 6 1.91
Up and down, no firm pattern

io_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 121.8 63 21 17 1.82
s020 [5] 125.9 61 22 17 1.89
s040 [5] 130.1 59 22 17 1.95
s080 [5] 174.6 47 27 15 2.62
s100 [5] 208.1 42 25 11 3.12
Increasing the swappiness increases the kernel compile time, just as it does
under dbench_load.


io_other:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 103.6 73 18 17 1.55
s020 [5] 135.4 62 25 18 2.03
s040 [5] 157.7 57 28 18 2.36
s060 [5] 188.1 48 32 16 2.82
s080 [5] 246.2 37 38 15 3.69
s100 [5] 378.8 24 45 11 5.67
Another massive change.

read_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 89.7 80 15 7 1.34
s020 [5] 91.1 79 13 6 1.36
s040 [5] 90.6 79 12 6 1.36
s060 [5] 90.3 79 12 6 1.35
s080 [5] 90.3 79 12 6 1.35
s100 [5] 92.2 78 10 5 1.38

list_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 81.8 85 0 9 1.23
s020 [5] 81.9 85 0 9 1.23
s040 [5] 82.0 85 0 9 1.23
s060 [5] 82.2 85 0 8 1.23
s080 [5] 82.7 85 0 8 1.24
s100 [5] 83.1 85 0 8 1.24

mem_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 233.2 32 65 1 3.49
s020 [5] 172.3 42 50 1 2.58
s040 [5] 163.8 46 44 1 2.45
s060 [5] 192.2 39 38 1 2.88
s080 [5] 186.4 42 37 0 2.79
s100 [5] 128.2 57 37 1 1.92
Here it seems only the extreme values make a difference. An unswappy kernel
slows it down significantly, whereas a swappy kernel speeds it up
significantly.

Paolo, if I were to choose a number from these values I'd suggest lower than
60 rather than higher, BUT that is because the IO load effects become a real
problem when the kernel is swappy - I don't _really_ know what this means for
the rest of the time. Maybe in the 40-50 range. There seems to be a knee
(bend) in the curve (most noticeable in dbench_load) rather than the curves
being linear. That knee, I believe, simply shows how the swappiness algorithm
basically works. I might throw a 50 at the machine as well to see what that
does.
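
For anyone wanting to reproduce this: varying the value just means writing to
/proc/sys/vm/swappiness before each run, roughly like the sketch below
(assuming the proc interface in the -mm kernels):

/* Minimal sketch: set vm swappiness before a contest run.  Assumes
 * the /proc/sys/vm/swappiness interface of the -mm kernels. */
#include <stdio.h>

static int set_swappiness(int value)
{
        FILE *f = fopen("/proc/sys/vm/swappiness", "w");

        if (!f)
                return -1;
        fprintf(f, "%d\n", value);
        return fclose(f);
}

int main(void)
{
        return set_swappiness(50) ? 1 : 0;
}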

Con


2002-12-27 09:52:36

by Con Kolivas

Subject: Re: [BENCHMARK] vm swappiness with contest

On Fri, 27 Dec 2002 04:46 pm, Con Kolivas wrote:
> Here is a family of contest benchmarks using the osdl hardware in
> uniprocessor mode on 2.5.53-mm1 while varying vm swappiness. s020 is vm
> swappiness=20 and so on:
SNIP--->
Here's a set with 50 as well:

noload:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 71.1 95 0 0 1.07
s020 [5] 71.9 95 0 0 1.08
s040 [5] 71.7 95 0 0 1.07
s050 [5] 71.4 96 0 0 1.07
s060 [5] 71.3 96 0 0 1.07
s080 [5] 71.3 95 0 0 1.07
s100 [5] 71.6 95 0 0 1.07

cacherun:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 68.4 99 0 0 1.02
s020 [5] 68.8 99 0 0 1.03
s040 [5] 68.7 99 0 0 1.03
s050 [5] 68.8 99 0 0 1.03
s060 [5] 68.6 99 0 0 1.03
s080 [5] 68.5 99 0 0 1.03
s100 [5] 68.7 99 0 0 1.03

process_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 119.0 57 49 41 1.78
s020 [5] 119.4 57 49 41 1.79
s040 [5] 118.6 57 48 41 1.78
s050 [5] 116.5 58 46 40 1.75
s060 [5] 117.6 57 47 41 1.76
s080 [5] 119.5 57 49 41 1.79
s100 [5] 118.4 58 48 40 1.77

dbench_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 191.6 41 1 43 2.87
s020 [5] 195.5 40 1 44 2.93
s040 [5] 197.9 41 1 43 2.96
s050 [5] 914.6 15 0 6 13.70
s060 [5] 331.4 32 0 23 4.96
s080 [5] 439.4 24 0 10 6.58
s100 [5] 883.6 13 1 9 13.24
Whoa, we're hitting some sort of resonance here with 50.

ctar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 113.4 81 3 10 1.70
s020 [5] 103.3 80 2 9 1.55
s040 [5] 110.0 79 3 9 1.65
s050 [5] 97.8 82 2 7 1.46
s060 [5] 103.0 80 3 9 1.54
s080 [5] 104.4 80 2 8 1.56
s100 [5] 100.2 80 2 8 1.50
and a dip here

xtar_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 139.2 65 3 8 2.09
s020 [5] 128.7 70 2 7 1.93
s040 [5] 152.8 59 3 7 2.29
s050 [5] 112.7 75 1 5 1.69
s060 [5] 137.0 64 2 6 2.05
s080 [5] 124.6 66 2 6 1.87
s100 [5] 127.6 66 2 6 1.91
dip here

io_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 121.8 63 21 17 1.82
s020 [5] 125.9 61 22 17 1.89
s040 [5] 130.1 59 22 17 1.95
s050 [5] 220.0 40 28 12 3.30
s080 [5] 174.6 47 27 15 2.62
s100 [5] 208.1 42 25 11 3.12
peak here

io_other:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 103.6 73 18 17 1.55
s020 [5] 135.4 62 25 18 2.03
s040 [5] 157.7 57 28 18 2.36
s050 [5] 154.7 56 22 14 2.32
s060 [5] 188.1 48 32 16 2.82
s080 [5] 246.2 37 38 15 3.69
s100 [5] 378.8 24 45 11 5.67

read_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 89.7 80 15 7 1.34
s020 [5] 91.1 79 13 6 1.36
s040 [5] 90.6 79 12 6 1.36
s050 [5] 89.1 81 12 5 1.33
s060 [5] 90.3 79 12 6 1.35
s080 [5] 90.3 79 12 6 1.35
s100 [5] 92.2 78 10 5 1.38

list_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 81.8 85 0 9 1.23
s020 [5] 81.9 85 0 9 1.23
s040 [5] 82.0 85 0 9 1.23
s050 [5] 82.9 85 0 8 1.24
s060 [5] 82.2 85 0 8 1.23
s080 [5] 82.7 85 0 8 1.24
s100 [5] 83.1 85 0 8 1.24

mem_load:
Kernel [runs] Time CPU% Loads LCPU% Ratio
s000 [3] 233.2 32 65 1 3.49
s020 [5] 172.3 42 50 1 2.58
s040 [5] 163.8 46 44 1 2.45
s050 [5] 223.6 33 36 1 3.35
s060 [5] 192.2 39 38 1 2.88
s080 [5] 186.4 42 37 0 2.79
s100 [5] 128.2 57 37 1 1.92
and peak here

50 seems really bad for some things. Some sort of algorithmic resonance
occurs at 50.

Con

2002-12-27 13:05:06

by Paolo Ciarrocchi

Subject: Re: [BENCHMARK] vm swappiness with contest

From: Con Kolivas <[email protected]>
[...]
> Paolo, if I were to choose a number from these values I'd suggest lower
> than 60 rather than higher, BUT that is because the IO load effects become
> a real problem when the kernel is swappy - I don't _really_ know what this
> means for the rest of the time. Maybe in the 40-50 range. There seems to be
> a knee (bend) in the curve (most noticeable in dbench_load) rather than the
> curves being linear. That knee, I believe, simply shows how the swappiness
> algorithm basically works. I might throw a 50 at the machine as well to see
> what that does.

Con, thank you for your time and results.
I can confirm that large file writes with a swappy kernel seem to waste
time swapping data; I see it with the resp tool too.

Maybe we can say that a good value for a desktop environment is in the
40-50 range, while for a server it is 70-80; with the osdb benchmark
tool I get the best results at 80.

Ciao,
Paolo

2002-12-28 06:08:38

by Con Kolivas

Subject: Re: [BENCHMARK] vm swappiness with contest

On Fri, 27 Dec 2002 09:00 pm, Con Kolivas wrote:
> On Fri, 27 Dec 2002 04:46 pm, Con Kolivas wrote:
> > Here is a family of contest benchmarks using the osdl hardware in
> > uniprocessor mode on 2.5.53-mm1 while varying vm swappiness. s020 is vm
> > swappiness=20 and so on:
>
> SNIP--->
SNIP SNIP -->

akpm was the first to point out that these results looked unusual, and he
suggested running them in a single sitting. The thing is, I did run these
sequentially in a single sitting, without rebooting, over about 12 hours, so
I thought I'd try a different approach. Look at this first set rearranged
into the order I tested them:

> dbench_load:
> Kernel [runs] Time CPU% Loads LCPU% Ratio
> s000 [3] 191.6 41 1 43 2.87
> s020 [5] 195.5 40 1 44 2.93
> s040 [5] 197.9 41 1 43 2.96
> s060 [5] 331.4 32 0 23 4.96
> s080 [5] 439.4 24 0 10 6.58
> s100 [5] 883.6 13 1 9 13.24
> s050 [5] 914.6 15 0 6 13.70

It appeared that the runs took longer the longer the machine had been up,
even though all memory is "flushed" and swap is turned off and on before each
run. So I ran these again with a reboot between each run:

sw000 [5] 185.1 42 1 42 2.77
sw020 [5] 199.9 39 1 44 2.99
sw040 [5] 210.5 38 2 45 3.15
sw050 [5] 199.7 39 2 46 2.99
sw060 [5] 190.3 41 1 45 2.85
sw080 [5] 196.1 40 1 44 2.94
sw100 [5] 198.7 40 1 43 2.98

Well, these look rather different, shall we say? There's virtually no change
regardless of the swappiness setting.
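
As an aside, the swap on/off part of the per-run "flush" mentioned above
amounts to something like the sketch below (I'm assuming plain
swapoff/swapon semantics here; the real contest reset may do more):

/* Sketch of the per-run reset: cycling swap discards anything the
 * previous run pushed out.  Assumes swapoff/swapon from util-linux. */
#include <stdlib.h>

int main(void)
{
        if (system("swapoff -a") != 0)
                return 1;
        return system("swapon -a") != 0;
}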

Question: why does the above happen when the machine has been running for a
while? All the file writes are deleted between each run, so the filesystem
doesn't change that dramatically; but even if it were a filesystem change,
why does a reboot fix it? (ext3 throughout)

Is there something about the filesystem layer or elsewhere in the kernel that
could decay or fragment over time that only a reboot can fix? This would seem
to be a bad thing.

Comments?
Con

2002-12-28 06:17:49

by Andrew Morton

Subject: Re: [BENCHMARK] vm swappiness with contest

Con Kolivas wrote:
>
> ...
> Is there something about the filesystem layer or elsewhere in the kernel that
> could decay or fragment over time that only a reboot can fix? This would seem
> to be a bad thing.

Not much that I can think of. Apart from a damn great memory leak
somewhere.

Suggest you perform a few runs, keeping an eye on the vm statistics
after each run.
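
A minimal sketch of that bookkeeping (the log file name is just an
assumption) would be to append /proc/vmstat to a log between runs:

/* Append a /proc/vmstat snapshot to a log between contest runs so
 * the counters can be compared afterwards. */
#include <stdio.h>

int main(void)
{
        FILE *in = fopen("/proc/vmstat", "r");
        FILE *out = fopen("vmstat.log", "a");
        int c;

        if (!in || !out)
                return 1;
        fputs("---- run boundary ----\n", out);
        while ((c = fgetc(in)) != EOF)
                fputc(c, out);
        fclose(in);
        fclose(out);
        return 0;
}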

2002-12-28 07:57:56

by Linus Torvalds

Subject: Re: [BENCHMARK] vm swappiness with contest

In article <[email protected]>,
Con Kolivas <[email protected]> wrote:
>
>Is there something about the filesystem layer or elsewhere in the kernel that
>could decay or fragment over time that only a reboot can fix? This would seem
>to be a bad thing.

You might want to save and compare /proc/slabinfo before and after.

It might be things like the dcache growing out of control and not
shrinking gracefully under memory pressure, we've certainly had that
happen before.

Or it might just be a memory leak, of course. That too will be visible
in slabinfo if it's a slab/kmalloc leak (but obviously not if it's a
page allocator leak).
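
A minimal sketch of that comparison (assuming two saved snapshots named
before.txt and after.txt, with the name/active-count columns shown later in
this thread) would print any cache whose active-object count changed:

/* Diff two saved /proc/slabinfo snapshots by active-object count.
 * Assumes the "name num-active ..." column layout of slabinfo 1.2. */
#include <stdio.h>
#include <string.h>

struct slab { char name[64]; unsigned long active; };

static int load(const char *path, struct slab *s, int max)
{
        FILE *f = fopen(path, "r");
        char line[256];
        int n = 0;

        if (!f)
                return -1;
        while (n < max && fgets(line, sizeof(line), f))
                if (sscanf(line, "%63s %lu", s[n].name, &s[n].active) == 2)
                        n++;        /* header lines fail to parse, skipped */
        fclose(f);
        return n;
}

int main(void)
{
        static struct slab before[512], after[512];
        int nb = load("before.txt", before, 512);
        int na = load("after.txt", after, 512);
        int i, j;

        if (nb < 0 || na < 0)
                return 1;
        for (i = 0; i < nb; i++)
                for (j = 0; j < na; j++)
                        if (!strcmp(before[i].name, after[j].name) &&
                            before[i].active != after[j].active)
                                printf("%-24s %8lu -> %8lu\n",
                                       before[i].name, before[i].active,
                                       after[j].active);
        return 0;
}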

Linus

2002-12-31 05:50:36

by Con Kolivas

Subject: Re: [BENCHMARK] vm swappiness with contest

On Saturday 28 Dec 2002 5:16 pm, Con Kolivas wrote:
> Is there something about the filesystem layer or elsewhere in the kernel
> that could decay or fragment over time that only a reboot can fix? This
> would seem to be a bad thing.

OK, Linus suggested I check slabinfo before and after.

I ran contest for a few days until I recreated the problem, and it did recur.
I don't know how to interpret the information, so I'll just dump it here:

before:
slabinfo - version: 1.2
unix_sock 6 18 416 2 2 1 : 120 60
tcp_tw_bucket 0 0 96 0 0 1 : 248 124
tcp_bind_bucket 2 113 32 1 1 1 : 248 124
tcp_open_request 0 0 64 0 0 1 : 248 124
inet_peer_cache 1 59 64 1 1 1 : 248 124
secpath_cache 0 0 32 0 0 1 : 248 124
flow_cache 0 0 64 0 0 1 : 248 124
xfrm4_dst_cache 0 0 224 0 0 1 : 248 124
ip_fib_hash 10 113 32 1 1 1 : 248 124
ip_dst_cache 16 17 224 1 1 1 : 248 124
arp_cache 3 30 128 1 1 1 : 248 124
raw4_sock 0 0 416 0 0 1 : 120 60
udp_sock 0 0 448 0 0 1 : 120 60
tcp_sock 6 9 864 1 1 2 : 120 60
reiser_inode_cache 0 0 384 0 0 1 : 120 60
ext2_inode_cache 0 0 448 0 0 1 : 120 60
journal_head 70 312 48 4 4 1 : 248 124
revoke_table 6 253 12 1 1 1 : 248 124
revoke_record 0 0 32 0 0 1 : 248 124
ext3_inode_cache 1206 1206 448 134 134 1 : 120 60
ext3_xattr 0 0 44 0 0 1 : 248 124
eventpoll pwq 0 0 36 0 0 1 : 248 124
eventpoll epi 0 0 64 0 0 1 : 248 124
kioctx 0 0 192 0 0 1 : 248 124
kiocb 0 0 160 0 0 1 : 248 124
dnotify_cache 0 0 20 0 0 1 : 248 124
file_lock_cache 9 40 96 1 1 1 : 248 124
fasync_cache 0 0 16 0 0 1 : 248 124
shmem_inode_cache 3 9 416 1 1 1 : 120 60
uid_cache 1 113 32 1 1 1 : 248 124
deadline_drq 768 780 48 10 10 1 : 248 124
blkdev_requests 768 784 136 28 28 1 : 248 124
biovec-BIO_MAX_PAGES 256 260 3072 52 52 4 : 54 27
biovec-128 256 260 1536 52 52 2 : 54 27
biovec-64 256 260 768 52 52 1 : 120 60
biovec-16 256 260 192 13 13 1 : 248 124
biovec-4 256 295 64 5 5 1 : 248 124
biovec-1 263 808 16 4 4 1 : 248 124
bio 263 413 64 7 7 1 : 248 124
sock_inode_cache 15 20 384 2 2 1 : 120 60
skbuff_head_cache 162 168 160 7 7 1 : 248 124
sock 3 11 352 1 1 1 : 120 60
proc_inode_cache 77 77 352 7 7 1 : 120 60
sigqueue 7 29 132 1 1 1 : 248 124
radix_tree_node 1118 1125 260 75 75 1 : 120 60
cdev_cache 361 413 64 7 7 1 : 248 124
bdev_cache 10 40 96 1 1 1 : 248 124
mnt_cache 18 59 64 1 1 1 : 248 124
inode_cache 244 264 320 22 22 1 : 120 60
dentry_cache 2373 2376 160 99 99 1 : 248 124
filp 210 210 128 7 7 1 : 248 124
names_cache 1 1 4096 1 1 1 : 54 27
buffer_head 964 1014 48 13 13 1 : 248 124
mm_struct 30 30 384 3 3 1 : 120 60
vm_area_struct 413 413 64 7 7 1 : 248 124
fs_cache 31 59 64 1 1 1 : 248 124
files_cache 18 18 416 2 2 1 : 120 60
signal_act 27 27 1344 9 9 1 : 54 27
task_struct 35 35 1536 7 7 2 : 54 27
pte_chain 1130 1130 32 10 10 1 : 248 124
mm_chain 16 338 8 1 1 1 : 248 124
size-131072(DMA) 0 0 131072 0 0 32 : 8 4
size-131072 0 0 131072 0 0 32 : 8 4
size-65536(DMA) 0 0 65536 0 0 16 : 8 4
size-65536 0 0 65536 0 0 16 : 8 4
size-32768(DMA) 0 0 32768 0 0 8 : 8 4
size-32768 0 0 32768 0 0 8 : 8 4
size-16384(DMA) 0 0 16384 0 0 4 : 8 4
size-16384 0 0 16384 0 0 4 : 8 4
size-8192(DMA) 0 0 8192 0 0 2 : 8 4
size-8192 7 7 8192 7 7 2 : 8 4
size-4096(DMA) 0 0 4096 0 0 1 : 54 27
size-4096 22 22 4096 22 22 1 : 54 27
size-2048(DMA) 0 0 2048 0 0 1 : 54 27
size-2048 108 108 2048 54 54 1 : 54 27
size-1024(DMA) 0 0 1024 0 0 1 : 120 60
size-1024 80 80 1024 20 20 1 : 120 60
size-512(DMA) 0 0 512 0 0 1 : 120 60
size-512 96 96 512 12 12 1 : 120 60
size-256(DMA) 0 0 256 0 0 1 : 248 124
size-256 52 60 256 4 4 1 : 248 124
size-192(DMA) 0 0 192 0 0 1 : 248 124
size-192 20 20 192 1 1 1 : 248 124
size-128(DMA) 0 0 128 0 0 1 : 248 124
size-128 41 60 128 2 2 1 : 248 124
size-96(DMA) 0 0 96 0 0 1 : 248 124
size-96 448 480 96 12 12 1 : 248 124
size-64(DMA) 0 0 64 0 0 1 : 248 124
size-64 177 177 64 3 3 1 : 248 124
size-32(DMA) 0 0 32 0 0 1 : 248 124
size-32 289 339 32 3 3 1 : 248 124
kmem_cache 99 99 116 3 3 1 : 248 124


After:
unix_sock 3 18 416 2 2 1 : 120 60
tcp_tw_bucket 0 0 96 0 0 1 : 248 124
tcp_bind_bucket 2 113 32 1 1 1 : 248 124
tcp_open_request 0 0 64 0 0 1 : 248 124
inet_peer_cache 0 0 64 0 0 1 : 248 124
secpath_cache 0 0 32 0 0 1 : 248 124
flow_cache 0 0 64 0 0 1 : 248 124
xfrm4_dst_cache 0 0 224 0 0 1 : 248 124
ip_fib_hash 10 113 32 1 1 1 : 248 124
ip_dst_cache 23 51 224 3 3 1 : 248 124
arp_cache 1 30 128 1 1 1 : 248 124
raw4_sock 0 0 416 0 0 1 : 120 60
udp_sock 0 0 448 0 0 1 : 120 60
tcp_sock 6 9 864 1 1 2 : 120 60
reiser_inode_cache 0 0 384 0 0 1 : 120 60
ext2_inode_cache 0 0 448 0 0 1 : 120 60
journal_head 113 2028 48 26 26 1 : 248 124
revoke_table 6 253 12 1 1 1 : 248 124
revoke_record 0 0 32 0 0 1 : 248 124
ext3_inode_cache 1361 2277 448 253 253 1 : 120 60
ext3_xattr 0 0 44 0 0 1 : 248 124
eventpoll pwq 0 0 36 0 0 1 : 248 124
eventpoll epi 0 0 64 0 0 1 : 248 124
kioctx 0 0 192 0 0 1 : 248 124
kiocb 0 0 160 0 0 1 : 248 124
dnotify_cache 0 0 20 0 0 1 : 248 124
file_lock_cache 10 40 96 1 1 1 : 248 124
fasync_cache 0 0 16 0 0 1 : 248 124
shmem_inode_cache 3 9 416 1 1 1 : 120 60
uid_cache 0 0 32 0 0 1 : 248 124
deadline_drq 768 780 48 10 10 1 : 248 124
blkdev_requests 768 784 136 28 28 1 : 248 124
biovec-BIO_MAX_PAGES 256 260 3072 52 52 4 : 54 27
biovec-128 256 260 1536 52 52 2 : 54 27
biovec-64 256 260 768 52 52 1 : 120 60
biovec-16 256 260 192 13 13 1 : 248 124
biovec-4 256 295 64 5 5 1 : 248 124
biovec-1 333 404 16 2 2 1 : 248 124
bio 295 295 64 5 5 1 : 248 124
sock_inode_cache 13 20 384 2 2 1 : 120 60
skbuff_head_cache 232 288 160 12 12 1 : 248 124
sock 3 11 352 1 1 1 : 120 60
proc_inode_cache 72 88 352 8 8 1 : 120 60
sigqueue 13 29 132 1 1 1 : 248 124
radix_tree_node 2039 2460 260 164 164 1 : 120 60
cdev_cache 12 118 64 2 2 1 : 248 124
bdev_cache 10 40 96 1 1 1 : 248 124
mnt_cache 18 59 64 1 1 1 : 248 124
inode_cache 244 264 320 22 22 1 : 120 60
dentry_cache 2214 5016 160 209 209 1 : 248 124
filp 775 780 128 26 26 1 : 248 124
names_cache 1 1 4096 1 1 1 : 54 27
buffer_head 46507 58500 48 750 750 1 : 248 124
mm_struct 30 30 384 3 3 1 : 120 60
vm_area_struct 415 944 64 16 16 1 : 248 124
fs_cache 30 118 64 2 2 1 : 248 124
files_cache 27 27 416 3 3 1 : 120 60
signal_act 27 27 1344 9 9 1 : 54 27
task_struct 40 40 1536 8 8 2 : 54 27
pte_chain 415 1808 32 16 16 1 : 248 124
mm_chain 125 338 8 1 1 1 : 248 124
size-131072(DMA) 0 0 131072 0 0 32 : 8 4
size-131072 0 0 131072 0 0 32 : 8 4
size-65536(DMA) 0 0 65536 0 0 16 : 8 4
size-65536 0 0 65536 0 0 16 : 8 4
size-32768(DMA) 0 0 32768 0 0 8 : 8 4
size-32768 0 0 32768 0 0 8 : 8 4
size-16384(DMA) 0 0 16384 0 0 4 : 8 4
size-16384 0 0 16384 0 0 4 : 8 4
size-8192(DMA) 0 0 8192 0 0 2 : 8 4
size-8192 7 7 8192 7 7 2 : 8 4
size-4096(DMA) 0 0 4096 0 0 1 : 54 27
size-4096 22 22 4096 22 22 1 : 54 27
size-2048(DMA) 0 0 2048 0 0 1 : 54 27
size-2048 106 106 2048 53 53 1 : 54 27
size-1024(DMA) 0 0 1024 0 0 1 : 120 60
size-1024 84 84 1024 21 21 1 : 120 60
size-512(DMA) 0 0 512 0 0 1 : 120 60
size-512 144 144 512 18 18 1 : 120 60
size-256(DMA) 0 0 256 0 0 1 : 248 124
size-256 73 75 256 5 5 1 : 248 124
size-192(DMA) 0 0 192 0 0 1 : 248 124
size-192 20 20 192 1 1 1 : 248 124
size-128(DMA) 0 0 128 0 0 1 : 248 124
size-128 41 60 128 2 2 1 : 248 124
size-96(DMA) 0 0 96 0 0 1 : 248 124
size-96 436 480 96 12 12 1 : 248 124
size-64(DMA) 0 0 64 0 0 1 : 248 124
size-64 183 236 64 4 4 1 : 248 124
size-32(DMA) 0 0 32 0 0 1 : 248 124
size-32 246 904 32 8 8 1 : 248 124
kmem_cache 99 99 116 3 3 1 : 248 124

The biggest change I can see is buffer_head.

The machine has been kept online in that state so I can extract more info if
needed.

Con

2002-12-31 06:00:41

by Andrew Morton

Subject: Re: [BENCHMARK] vm swappiness with contest

Con Kolivas wrote:
>
> On Saturday 28 Dec 2002 5:16 pm, Con Kolivas wrote:
> > Is there something about the filesystem layer or elsewhere in the kernel
> > that could decay or fragment over time that only a reboot can fix? This
> > would seem to be a bad thing.
>
> Ok Linus suggested I check slabinfo before and after.
>
> I ran contest for a few days till I recreated the problem and it did recur. I
> don't know how to interpret the information so I'll just dump it here:
>


Looks OK. Could we see /proc/meminfo and /proc/vmstat?

What filesystem are you using? And what kernel?

2002-12-31 06:15:43

by Con Kolivas

Subject: Re: [BENCHMARK] vm swappiness with contest

On Tuesday 31 Dec 2002 5:08 pm, Andrew Morton wrote:
> Con Kolivas wrote:
> > On Saturday 28 Dec 2002 5:16 pm, Con Kolivas wrote:
> > > Is there something about the filesystem layer or elsewhere in the
> > > kernel that could decay or fragment over time that only a reboot can
> > > fix? This would seem to be a bad thing.
> >
> > Ok Linus suggested I check slabinfo before and after.
> >
> > I ran contest for a few days till I recreated the problem and it did
> > recur. I don't know how to interpret the information so I'll just dump it
> > here:
>
> Looks OK. Could we see /proc/meminfo and /proc/vmstat?

meminfo:
MemTotal: 257296 kB
MemFree: 47468 kB
Buffers: 27028 kB
Cached: 7480 kB
SwapCached: 272 kB
Active: 154968 kB
Inactive: 42756 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 257296 kB
LowFree: 47468 kB
SwapTotal: 4194272 kB
SwapFree: 4193816 kB
Dirty: 1116 kB
Writeback: 0 kB
Mapped: 3740 kB
Slab: 8564 kB
Committed_AS: 6580 kB
PageTables: 196 kB
ReverseMaps: 1381

vmstat:
nr_dirty 240
nr_writeback 0
nr_pagecache 8712
nr_page_table_pages 49
nr_reverse_maps 1383
nr_mapped 937
nr_slab 2140
pgpgin 254170865
pgpgout 637037580
pswpin 5126123
pswpout 25405883
pgalloc 510309571
pgfree 510321474
pgactivate 486194881
pgdeactivate 503024404
pgfault 693111528
pgmajfault 1013600
pgscan 1521783411
pgrefill 939783846
pgsteal 146188093
kswapd_steal 105403604
pageoutrun 377963
allocstall 1408123
pgrotated 45339615

>
> What filesystem are you using? And what kernel?
ext3 on 2.5.53-mm1

Con

2002-12-31 06:29:08

by Andrew Morton

Subject: Re: [BENCHMARK] vm swappiness with contest

Con Kolivas wrote:
>
> On Tuesday 31 Dec 2002 5:08 pm, Andrew Morton wrote:
> > Con Kolivas wrote:
> > > On Saturday 28 Dec 2002 5:16 pm, Con Kolivas wrote:
> > > > Is there something about the filesystem layer or elsewhere in the
> > > > kernel that could decay or fragment over time that only a reboot can
> > > > fix? This would seem to be a bad thing.
> > >
> > > Ok Linus suggested I check slabinfo before and after.
> > >
> > > I ran contest for a few days till I recreated the problem and it did
> > > recur. I don't know how to interpret the information so I'll just dump it
> > > here:
> >
> > Looks OK. Could we see /proc/meminfo and /proc/vmstat?
>
> meminfo:
> MemTotal: 257296 kB
> MemFree: 47468 kB
> Buffers: 27028 kB
> Cached: 7480 kB
> SwapCached: 272 kB
> Active: 154968 kB
> Inactive: 42756 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 257296 kB
> LowFree: 47468 kB
> SwapTotal: 4194272 kB
> SwapFree: 4193816 kB
> Dirty: 1116 kB
> Writeback: 0 kB
> Mapped: 3740 kB
> Slab: 8564 kB
> Committed_AS: 6580 kB
> PageTables: 196 kB
> ReverseMaps: 1381
>

These numbers _look_ wrong, but ext3 truncate does funny things.
Could you now run a big usemem/fillmem application to try to allocate and
use 200 megs of memory, then resend /proc/meminfo?
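
For anyone without a usemem binary handy, a minimal sketch of one (the size
is just the 200MB asked for here): allocate the memory and dirty every page
so the kernel has to back the whole allocation.

/* Minimal usemem-style program: allocate ~200MB and touch every
 * page so the VM must find real memory for all of it. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        size_t size = 200UL << 20;              /* 200MB */
        size_t i;
        char *mem = malloc(size);

        if (!mem) {
                perror("malloc");
                return 1;
        }
        for (i = 0; i < size; i += 4096)        /* dirty each page */
                mem[i] = 1;
        printf("touched %lu MB\n", (unsigned long)(size >> 20));
        free(mem);
        return 0;
}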

2002-12-31 06:50:03

by Con Kolivas

Subject: Re: [BENCHMARK] vm swappiness with contest

On Tuesday 31 Dec 2002 5:37 pm, Andrew Morton wrote:
> Con Kolivas wrote:
> > On Tuesday 31 Dec 2002 5:08 pm, Andrew Morton wrote:
> > > Con Kolivas wrote:
> > > > On Saturday 28 Dec 2002 5:16 pm, Con Kolivas wrote:
> > > > > Is there something about the filesystem layer or elsewhere in the
> > > > > kernel that could decay or fragment over time that only a reboot
> > > > > can fix? This would seem to be a bad thing.
> > > >
> > > > Ok Linus suggested I check slabinfo before and after.
> > > >
> > > > I ran contest for a few days till I recreated the problem and it did
> > > > recur. I don't know how to interpret the information so I'll just
> > > > dump it here:
> > >
> > > Looks OK. Could we see /proc/meminfo and /proc/vmstat?
> >
> > meminfo:
> > MemTotal: 257296 kB
> > MemFree: 47468 kB
> > Buffers: 27028 kB
> > Cached: 7480 kB
> > SwapCached: 272 kB
> > Active: 154968 kB
> > Inactive: 42756 kB
> > HighTotal: 0 kB
> > HighFree: 0 kB
> > LowTotal: 257296 kB
> > LowFree: 47468 kB
> > SwapTotal: 4194272 kB
> > SwapFree: 4193816 kB
> > Dirty: 1116 kB
> > Writeback: 0 kB
> > Mapped: 3740 kB
> > Slab: 8564 kB
> > Committed_AS: 6580 kB
> > PageTables: 196 kB
> > ReverseMaps: 1381
>
> These numbers _look_ wrong, but ext3 truncate does funny things.
> Could you now run a big usemem/fillmem application to try to allocate and
> use 200 megs of memory, then resend /proc/meminfo?

post usemem:
MemTotal: 257296 kB
MemFree: 86168 kB
Buffers: 392 kB
Cached: 2244 kB
SwapCached: 632 kB
Active: 159484 kB
Inactive: 1380 kB
HighTotal: 0 kB
HighFree: 0 kB
LowTotal: 257296 kB
LowFree: 86168 kB
SwapTotal: 4194272 kB
SwapFree: 4192668 kB
Dirty: 60 kB
Writeback: 0 kB
Mapped: 1768 kB
Slab: 6748 kB
Committed_AS: 6588 kB
PageTables: 196 kB
ReverseMaps: 619

Con

2002-12-31 07:00:09

by Andrew Morton

Subject: Re: [BENCHMARK] vm swappiness with contest

Con Kolivas wrote:
>
> ...
> post usemem:
> MemTotal: 257296 kB
> MemFree: 86168 kB
> Buffers: 392 kB
> Cached: 2244 kB
> SwapCached: 632 kB
> Active: 159484 kB
> Inactive: 1380 kB
> HighTotal: 0 kB
> HighFree: 0 kB
> LowTotal: 257296 kB
> LowFree: 86168 kB
> SwapTotal: 4194272 kB
> SwapFree: 4192668 kB
> Dirty: 60 kB
> Writeback: 0 kB
> Mapped: 1768 kB
> Slab: 6748 kB
> Committed_AS: 6588 kB
> PageTables: 196 kB
> ReverseMaps: 619

OK, thanks. It's a memory leak.

Could you please send me a detailed description of how to
set about reproducing this?

When you say "I ran contest for a few days till I recreated the problem
and it did recur.", does this imply that the leak was really slowly
increasing, or does it imply that everything was fine for a few days
uptime and then it suddenly leaked a large amount of memory?

2002-12-31 07:12:31

by Con Kolivas

Subject: Re: [BENCHMARK] vm swappiness with contest

On Tuesday 31 Dec 2002 6:08 pm, Andrew Morton wrote:
> Con Kolivas wrote:
> > ...
> > post usemem:
> > MemTotal: 257296 kB
> > MemFree: 86168 kB
> > Buffers: 392 kB
> > Cached: 2244 kB
> > SwapCached: 632 kB
> > Active: 159484 kB
> > Inactive: 1380 kB
> > HighTotal: 0 kB
> > HighFree: 0 kB
> > LowTotal: 257296 kB
> > LowFree: 86168 kB
> > SwapTotal: 4194272 kB
> > SwapFree: 4192668 kB
> > Dirty: 60 kB
> > Writeback: 0 kB
> > Mapped: 1768 kB
> > Slab: 6748 kB
> > Committed_AS: 6588 kB
> > PageTables: 196 kB
> > ReverseMaps: 619
>
> OK, thanks. It's a memory leak.
>
> Could you please send me a detailed description of how to
> set about reproducing this?
>
> When you say "I ran contest for a few days till I recreated the problem
> and it did recur.", does this imply that the leak was really slowly
> increasing, or does it imply that everything was fine for a few days
> uptime and then it suddenly leaked a large amount of memory?

I ran contest -n 100 -l all

which should run all the loads 100 times. The dbench_load results remained
pretty much static until about the 60th run, and then they started increasing
fairly rapidly.

dbench_load doesn't exist in contest 0.51, but running dbench_load by itself
did NOT reproduce the problem - I tried that first. dbench_load just seemed
to notice it the most, so it must be one of the other loads in contest (all
of which are in v0.51) that exposes the leak. Unfortunately I can't say for
certain which of the loads it is.

Con