2001-12-15 12:36:37

by Chris Chabot

Subject: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

Marcelo Tosatti wrote,
>Well, I want people with the "unfreeable" buffer/cache problem to
>confirm with me that 2.4.17-rc1 is working ok.

I'm afraid the problem still seems to exist on my box. It takes a while
to kick in, but after 1 day and 8 hours, I have just lost about 400 MB
of usable memory (!)

I have attached the output from free, slabinfo and a ps aux for 2.4.16
after 1 day and 20 hours, and for 2.4.17-rc1 after 1 day and 8 hours.

When the system comes fresh out of a reboot, there are 900-odd
megabytes of memory 'free' in the 'free' output. This number
remained steady for about a day or so (it was still ok last night), but
when I checked this morning, 400 megs of memory seemed to have
evaporated on me ;-)

The box is a dual p3-600, 1 gig of (ecc) ram, AIC7xxx/u2w, 2 scsi disks
(/ and /home), scsi cdrom, scsi DLT tape, 4 ide disks (internal PIIX4
ide controller) in a single raid0 volume (320 gigs), using ext3 on all
mounts, and 3 network cards (1 3com 3c905tx and 2 intel etherexpresspro
10/100's).

The box runs a basic firewall setup (with an ip route to enable it to
handle both cable modem and adsl), and some other basic services (smb,
named, nfs, dhcp, xinetd, pppd). Nothing fancy really that could explain
this behaviour.

Of course I did check ps aux to see if anything was _using_ this memory,
but unfortunately this is not the case.

Hope the outputs help trace this problem. If any more input is required,
don't hesitate to bomb my inbox ;-)

-- Chris Chabot


PS: I don't know if it's related, but I also found the following messages
in my dmesg. I don't know where they're from, but I figured I'd mention it:
invalidate: busy buffer
invalidate: busy buffer
...
<repeats about 20 times>




Attachments:
slabinfo-2.4.16 (7.79 kB)
slabinfo-2.4.17-rc1 (7.87 kB)

2001-12-15 13:42:39

by Chris Chabot

Subject: Re: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

James Stevenson wrote,
> does the same thing happen when you do
> find / -type f -print0 |xargs -0 cat > /dev/null
<snip>
> what cronjobs are running at night ?
> this could be normal because free ram is really a waste
> so it might as well be used for somthing and whats
> better than speeding up disk access
> if things do start to use memory they take it from the free section then
> the disk cache gets droped and refills the free section.

I appreciate the effort and sentiment. However, if you look at the
output of 'free' (the originally attached files), you will notice I am
talking about free = (available + cache + buffer) and not just
'available' ;-)
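
The "free = available + cache + buffer" figure above can be computed from /proc/meminfo-style text. The following is only a minimal sketch: it assumes "Key: value kB" lines with the field names MemFree, Buffers and Cached, and the helper names are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Return the value (in kB) of the "Key:   12345 kB" line in meminfo-style
 * text, or -1 if no line starts with that key. Hypothetical helper. */
static long meminfo_field_kb(const char *text, const char *key)
{
    size_t keylen = strlen(key);
    const char *line = text;

    while (line && *line) {
        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':') {
            long value;
            if (sscanf(line + keylen + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;             /* step past the newline */
    }
    return -1;
}

/* "Effectively free" memory in the sense used in this message:
 * free pages plus buffers plus page cache (assumed field names). */
static long effectively_free_kb(const char *text)
{
    long mem_free = meminfo_field_kb(text, "MemFree");
    long buffers  = meminfo_field_kb(text, "Buffers");
    long cached   = meminfo_field_kb(text, "Cached");

    if (mem_free < 0 || buffers < 0 || cached < 0)
        return -1;
    return mem_free + buffers + cached;
}
```

To use it, read /proc/meminfo into a NUL-terminated buffer and pass it to effectively_free_kb(). Memory held in slab caches is not part of any of the three fields, so it does not show up in this sum.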

This is a problem I only have on one of the 40 or so servers I manage;
however, the one that has it is my personal gateway & firewall machine,
so it feels pretty sore ;-)

One or two other people on the list also had the same problem. Is it
fixed for you guys, or are you still seeing it? I have a sneaking
suspicion this behaviour isn't gone yet.

Anyway, thanks for trying to educate me, James; I only wish it was
that simple ;-)

-- Chris


2001-12-15 14:25:13

by Andrea Arcangeli

Subject: Re: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

On Sat, Dec 15, 2001 at 01:36:11PM +0100, Chris Chabot wrote:
> inode_cache 686896 686896 480 85862 85862 1 : 124 62
> dentry_cache 696810 696810 128 23227 23227 1 : 252 126

this is an icache/dcache problem, can you reproduce on 2.4.17rc1aa1, it
will shrink more aggressively.
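
As a rough cross-check of the two quoted slabinfo lines, the sketch below turns a line into bytes of slab memory. It assumes the 2.4-era column order (name, active objects, total objects, object size, active slabs, total slabs, pages per slab) and a 4096-byte page; both are assumptions not stated in this thread, and the struct and function names are made up for illustration.

```c
#include <stdio.h>

/* First seven fields of one slabinfo line (assumed 2.4-style layout). */
struct slab_line {
    char name[32];
    unsigned long active_objs, total_objs, obj_size;
    unsigned long active_slabs, total_slabs, pages_per_slab;
};

/* Parse one line; returns 0 on success, -1 if fewer than 7 fields match.
 * Anything after the seventh field (": limit batchcount") is ignored. */
static int parse_slab_line(const char *line, struct slab_line *out)
{
    int n = sscanf(line, "%31s %lu %lu %lu %lu %lu %lu",
                   out->name, &out->active_objs, &out->total_objs,
                   &out->obj_size, &out->active_slabs, &out->total_slabs,
                   &out->pages_per_slab);
    return n == 7 ? 0 : -1;
}

/* Memory pinned by the cache: slabs * pages per slab * page size. */
static unsigned long long slab_bytes(const struct slab_line *s,
                                     unsigned long page_size)
{
    return (unsigned long long)s->total_slabs * s->pages_per_slab * page_size;
}
```

Under those assumptions, inode_cache holds 85862 * 4096 = 351690752 bytes (about 335 MiB) and dentry_cache holds 23227 * 4096 = 95137792 bytes (about 91 MiB), roughly 426 MiB together, which is in the same range as the ~400 MB reported missing.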

really to get an even better balance we should add the icache/dcache
slab pages into the lru as well... that would trigger the icache/dcache
flushes more easily when too much ram is in those caches.

Andrea

2001-12-15 16:06:16

by Ed Tomlinson

Subject: Re: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

Andrea Arcangeli wrote:

> On Sat, Dec 15, 2001 at 01:36:11PM +0100, Chris Chabot wrote:
>> inode_cache 686896 686896 480 85862 85862 1 : 124 62
>> dentry_cache 696810 696810 128 23227 23227 1 : 252 126
>
> this is an icache/dcache problem, can you reproduce on 2.4.17rc1aa1, it
> will shrink more aggressively.
>
> really to get an even better balance we should add the icache/dcache
> slab pages into the lru as well... that would trigger the icache/dcache
> flushes more easily when too much ram is in those caches.

Interesting idea. Is this what you are thinking? We find a slab page at the tail of the
lru, so we call the related shrink function. If the page is still active after shrinking, we
requeue it at the head of the lru. The slab page freeing logic would have to know how to
unlink from the lru.

The priority argument of the shrink functions would now allow us to keep the ratio of
lru size vs icache/dcache/dqcache under control. This might be a knob that would
be interesting to have in proc...
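
The proposal above can be modelled in user space. This is only a toy sketch, not kernel code: every type and function here (struct page, lru_scan_tail, toy_shrink) is hypothetical, and the shrink policy (drop in_use / priority objects, at least one) is an arbitrary stand-in for the real shrink functions.

```c
#include <stddef.h>

enum page_kind { PAGE_CACHE, PAGE_SLAB };

struct page {
    enum page_kind kind;
    int in_use;                 /* slab pages: objects still allocated */
    struct page *prev, *next;
};

struct lru { struct page *head, *tail; };

/* What the slab freeing path would also need: take a page off the lru. */
static void lru_unlink(struct lru *l, struct page *p)
{
    if (p->prev) p->prev->next = p->next; else l->head = p->next;
    if (p->next) p->next->prev = p->prev; else l->tail = p->prev;
    p->prev = p->next = NULL;
}

static void lru_add_head(struct lru *l, struct page *p)
{
    p->prev = NULL;
    p->next = l->head;
    if (l->head) l->head->prev = p; else l->tail = p;
    l->head = p;
}

/* Stand-in for the cache-specific shrink function. */
typedef void (*shrink_fn)(struct page *p, int priority);

/* Toy policy: a smaller priority value frees more objects per call. */
static void toy_shrink(struct page *p, int priority)
{
    int drop = priority > 0 ? p->in_use / priority : p->in_use;
    if (drop == 0 && p->in_use > 0)
        drop = 1;
    p->in_use -= drop;
}

/* Look at the tail page. A slab page is shrunk first; if it still holds
 * objects it is requeued at the head. Returns 1 if a page was reclaimed. */
static int lru_scan_tail(struct lru *l, shrink_fn shrink, int priority)
{
    struct page *p = l->tail;

    if (!p)
        return 0;
    if (p->kind == PAGE_SLAB) {
        shrink(p, priority);
        if (p->in_use > 0) {
            lru_unlink(l, p);
            lru_add_head(l, p);
            return 0;
        }
    }
    lru_unlink(l, p);
    return 1;
}
```

In this model the priority passed to lru_scan_tail() is the knob: it controls how quickly a slab page drains relative to how often it comes around to the tail again.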

Comments?
Ed Tomlinson

2001-12-15 16:05:46

by Chris Chabot

Subject: Re: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

Ok, rc1-aa1 is installed, up and running. Since it took a little more
than a day last time for this behaviour to show up, I'll have to
wait before I can report back anything useful.

PS: a side question. If this is 'inode' and 'dentry' cache, why isn't it
reported as 'used by cache' in free, top, gtop, etc?

Also, why is this such a progressive problem? Even if I do a find on the
full +/- 400 gigs, create and remove 2 gig files, create 1000 temp
files, etc, in the first day the memory isn't 'disappearing' yet
(of course it uses a lot of mem for cache/buffers, but there is no
memory 'unaccounted for in free or top').

Will report back soon,

-- Chris


On Sat, 2001-12-15 at 15:24, Andrea Arcangeli wrote:
> On Sat, Dec 15, 2001 at 01:36:11PM +0100, Chris Chabot wrote:
> > inode_cache 686896 686896 480 85862 85862 1 : 124 62
> > dentry_cache 696810 696810 128 23227 23227 1 : 252 126
>
> this is an icache/dcache problem, can you reproduce on 2.4.17rc1aa1, it
> will shrink more aggressively.
>
> really to get an even better balance we should add the icache/dcache
> slab pages into the lru as well... that would trigger the icache/dcache
> flushes more easily when too much ram is in those caches.
>
> Andrea


2001-12-15 16:10:16

by Ken Brownfield

Subject: Re: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

I think "updatedb" at 4am is what you're looking for... How much disk
space do you have on this system?

--
Ken.

On Sat, Dec 15, 2001 at 05:05:20PM +0100, Chris Chabot wrote:
| Ok, rc1-aa1 is installed, up and running. Since it took a little more
| than a day last time for this behaviour to show up, I'll have to
| wait before I can report back anything useful.
|
| PS: a side question. If this is 'inode' and 'dentry' cache, why isn't it
| reported as 'used by cache' in free, top, gtop, etc?
|
| Also, why is this such a progressive problem? Even if I do a find on the
| full +/- 400 gigs, create and remove 2 gig files, create 1000 temp
| files, etc, in the first day the memory isn't 'disappearing' yet
| (of course it uses a lot of mem for cache/buffers, but there is no
| memory 'unaccounted for in free or top').
|
| Will report back soon,
|
| -- Chris
|
|
| On Sat, 2001-12-15 at 15:24, Andrea Arcangeli wrote:
| > On Sat, Dec 15, 2001 at 01:36:11PM +0100, Chris Chabot wrote:
| > > inode_cache 686896 686896 480 85862 85862 1 : 124 62
| > > dentry_cache 696810 696810 128 23227 23227 1 : 252 126
| >
| > this is an icache/dcache problem, can you reproduce on 2.4.17rc1aa1, it
| > will shrink more aggressively.
| >
| > really to get an even better balance we should add the icache/dcache
| > slab pages into the lru as well... that would trigger the icache/dcache
| > flushes more easily when too much ram is in those caches.
| >
| > Andrea

2001-12-15 17:09:25

by Chris Chabot

Subject: Re: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

Mark Han wrote,
> first, forget silly crap like top; look at the /proc files.
> inode/dentry caches are just slab caches, which afaik
> are not considered part of the 'cached' that top is talking about.
> (since top precedes slab and is referring to buffer+page caches.)

Man, somebody got out of the wrong side of the bed this morning ;-) The
problem is that those pretty user-end lights are the only things the user
sees ;-) So it might be worth considering exporting this 'secret'
information to the user end (count it as cache in /proc/memusage?)

> what makes you think there's anything wrong with this? you have tons of
> memory, and aren't using it hard, so the kernel uses it to cache files,
> inodes, dentries, etc.

I know, cache is good. However, first of all, 400 to 600 MB of cache used
for dentries/inodes seems a little steep to me (as a non-kernel-hacker),
and when I do fire up memory-hogging applications (mysql, apache, java,
etc.) the 'evaporated' memory is not returned to those applications,
resulting in heavy swapping and a non-responsive system. For a dual p3
with 1 gig of ram, this feels like a problem, yes ;-)
Do note that when I do a simple find /, I do see the memory being used
in cached and buffers. This is not the case for the 'missing memory'.
Ken Brownfield wrote,
> I think "updatedb" at 4am is what you're looking for... How much disk
> space do you have on this system?
The system has 2 x 18 GB scsi disks (/ and /home) and a single raid0
volume (4 x 80 gig ide) as archival storage. Doing a find | wc -l on the
archives alone tells me I have more than 340000 files there (ranging
from a few bytes to > 1 gig).

But indeed, when I run updatedb, the problem of non-visible memory (for
me using top / free anyway) does appear. Good catch!

Andrea wrote,
> this is an icache/dcache problem, can you reproduce on 2.4.17rc1aa1,
> it will shrink more aggressively.

Doing that (updatedb), I don't have to wait a day or so to see what
happens: mem free (+ buffers + cache from /proc/meminfo) is around 550 MB,
'used memory' (counting ps aux res usage) is < 100 MB. So quite a few
megabytes have disappeared again ;/

-- Chris


2001-12-16 19:42:56

by Dmitry Volkoff

Subject: Re: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

Hello!

Below is a simple test case which I think is related to the "memory
disappear" problem.

My real program is doing something like this:

// test.c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd;
    int r;
    /* exactly 10 bytes; the array has no room for a terminating NUL */
    char data[10] = "0123456789";
    int i;
    int end = 30;

    /* truncate and rewrite the same small file once per second */
    for (i = 0; i < end; i++) {
        fd = open("testfile", O_WRONLY | O_NDELAY | O_TRUNC | O_CREAT, 0644);
        if (fd == -1) {
            printf("unable to open\n");
            return 1;
        }
        r = write(fd, data, sizeof data);
        if (r == -1) {
            printf("unable to write\n");
            close(fd);
            return 1;
        }
        close(fd);
        sleep(1);
    }
    return 0;
}
// end test.c

Each time I run `free; ./test; free` I see ever-growing memory usage,
by which I mean used memory + buffers. At some point the system just
starts swapping even if no other processes are running. Tested on 2.4.13
and 2.4.17-pre4aa1. I think something is wrong here because the very same
test program does not show such behaviour on 2.2.19. It does not lose
a single byte of memory. I've even tested this on freebsd-4.4 with
the same result as on 2.2.19.

Example output on 2.4.17-pre4aa1:

bash-2.03$ free; ./test; free
             total       used       free     shared    buffers     cached
Mem:        514528     212508     302020          0       7952     150664
-/+ buffers/cache:      53892     460636
Swap:      1028120          0    1028120
             total       used       free     shared    buffers     cached
Mem:        514528     212616     301912          0       8000     150664
-/+ buffers/cache:      53952     460576
Swap:      1028120          0    1028120

bash-2.03$ free; ./test; free
             total       used       free     shared    buffers     cached
Mem:        514528     212616     301912          0       8008     150664
-/+ buffers/cache:      53944     460584
Swap:      1028120          0    1028120
             total       used       free     shared    buffers     cached
Mem:        514528     212700     301828          0       8056     150664
-/+ buffers/cache:      53980     460548
Swap:      1028120          0    1028120

The results are very consistent. I lose 30-40 kB per run.
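
The per-run growth can be read off the two snapshots by recomputing the "-/+ buffers/cache" used figure, which in the output above equals used - buffers - cached, with free reporting kilobytes by default. A minimal sketch, with made-up struct and function names:

```c
/* The three columns of free's "Mem:" line that matter here, in kB. */
struct free_snapshot {
    long used;
    long buffers;
    long cached;
};

/* Memory in use once buffers and page cache are discounted; this is the
 * "used" column of free's "-/+ buffers/cache" line. */
static long used_excluding_cache_kb(const struct free_snapshot *s)
{
    return s->used - s->buffers - s->cached;
}

/* Growth of that figure between two snapshots, in kB. */
static long growth_kb(const struct free_snapshot *before,
                      const struct free_snapshot *after)
{
    return used_excluding_cache_kb(after) - used_excluding_cache_kb(before);
}
```

Fed with the numbers above, the first run goes from 53892 to 53952 (60 kB of growth) and the second from 53944 to 53980 (36 kB).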

--

DV

2001-12-16 20:41:43

by Willy Tarreau

Subject: Re: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

Hi Dmitry !
> Below is simple test case which I think is related to "memory disappear"
> problem.
--snip--
> The results are very consistent. I lose 30-40 byte per run.

Nearly the same here on 2.4.10-ac12, but I lose only 4-8 kB each time.
So I think this is not related to the new VM, and perhaps it's a very
old thing.

Regards,
Willy

2001-12-17 13:05:10

by Denis Vlasenko

Subject: Re: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

On Sunday 16 December 2001 17:39, Dmitry Volkoff wrote:
> Hello!
>
> Below is simple test case which I think is related to "memory disappear"
> problem.
>
> My real program is doing something like this:
>
> // test.c
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <stdio.h>
> #include <unistd.h>
>
> int main(void)
> {
> int fd;
> int r;
> char data[10] = "0123456789";
> int i;
> int end = 30;
> for (i=0;i<end;i++) {
> fd = open("testfile", O_WRONLY | O_NDELAY | O_TRUNC | O_CREAT, 0644);
> if (fd == -1) {
> printf("unable to open\n");
> return;
> }
> r = write(fd,data,sizeof data);
> if (r == -1) {
> printf("unable to write\n");
> close(fd);
> return;
> }
> close(fd);
> sleep(1);
> }
> }
> // end test.c

I removed sleep(1). Is it needed?

After 10000+ runs of this proggy, swap usage hasn't changed on 2.4.17-pre7.
top reports constant 2304K of swap usage.
--
vda

2001-12-18 04:28:31

by Dmitry Volkoff

Subject: Re: Unfreeable buffer/cache problem in 2.4.17-rc1 still there

On Mon, Dec 17, 2001 at 03:01:12PM -0200, vda wrote:
> > if (r == -1) {
> > printf("unable to write\n");
> > close(fd);
> > return;
> > }
> > close(fd);
> > sleep(1);
> > }
> > }
> > // end test.c
>
> I removed sleep(1). Is it needed?
>

Yes, you need it in order to see the memory leakage.

> After 10000+ runs of this proggy swap usage isn't changed on 2.4.17-pre7.
> top reports constant 2304K of swap usage.

I know. You'll notice this effect only after 1000000+ runs.
Try it again with sleep(1).

> --
> vda
>

--

DV