2001-04-11 11:37:08

by Marcin Kowalski

Subject: Fwd: Re: memory usage - dentry_cache

To possibly answer my own question:
if I do a cat on /proc/slabinfo, I get on the machine with the "MISSING" memory:
----
slabinfo - version: 1.1 (SMP)
--- cut out
inode_cache 920558 930264 480 116267 116283 1 : 124 6
--- cut out
dentry_cache 557245 638430 128 21281 21281 1 : 252 126


while on an equivalent server I get
----
slabinfo - version: 1.1 (SMP)

inode_cache 70464 70464 480 8808 8808 1 : 124 62

dentry_cache 72900 72900 128 2430 2430 1 : 252 126

----

Notice the huge size of the inode and dentry caches, bearing in mind that
both machines have 85GB ReiserFS home filesystems.
The first machine has many directories containing many thousands of
files.
BTW, could anyone enlighten me as to the exact meaning of the values in the
slabinfo file?
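For what it's worth, here is a small sketch (added for reference, not from the
original mail) of how one might read a slabinfo 1.1 line; the field order
assumed below - active objects, total objects, object size, active slabs,
total slabs, pages per slab, then limit and batchcount after the colon on
SMP - is my reading of the 2.4 format, so treat it as a guess:

----
/*
 * Sketch: decode the first few fields of a /proc/slabinfo 1.1 line.
 * Assumed layout (my reading of the 2.4 format, not confirmed in this thread):
 *   name active_objs total_objs objsize active_slabs total_slabs
 *   pages_per_slab : limit batchcount
 */
#include <stdio.h>

int main(void)
{
	const char *line = "dentry_cache 557245 638430 128 21281 21281 1 : 252 126";
	char name[64];
	unsigned long active, total, objsize;

	if (sscanf(line, "%63s %lu %lu %lu", name, &active, &total, &objsize) == 4)
		printf("%s: %lu of %lu objects in use, %lu bytes each, ~%lu KB of slab\n",
		       name, active, total, objsize, total * objsize / 1024);
	return 0;
}
----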

Regards
MarCin



> I can use "ps" to see memory usage of daemons and user programs.
> I can't find any memory information of kernel with "top" and "ps".
>
> Do you know how to take memory usage information of kernel ?
> Thanks for your help.

Regarding this issue, I have a similar problem. If I do a free on my system I
get:
---
             total       used       free     shared    buffers     cached
Mem:       1157444    1148120       9324          0      22080     459504
-/+ buffers/cache:     666536     490908
Swap:       641016      19072     621944
---
Now when I do a ps there seems to be no way to account for the 500MB+ of memory
used. No single process or group of processes uses that amount of memory. This is
very disconcerting, and coupled with the extremely high loads when cache is dumped to
disk, locking up the machine, it makes me want to move back to 2.2.19 from 2.4.3.
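As a rough cross-check (arithmetic added here, using the slabinfo numbers
quoted above rather than anything stated in the original mail), the slab
objects alone account for roughly the "missing" 500 MB:

----
/*
 * Back-of-the-envelope check, assuming the object sizes reported in the
 * slabinfo output above (480-byte inodes, 128-byte dentries):
 *   inode_cache : 930264 objects * 480 bytes ~ 425 MB
 *   dentry_cache: 638430 objects * 128 bytes ~  77 MB
 * Together that is ~500 MB, about the amount ps cannot account for.
 */
#include <stdio.h>

int main(void)
{
	unsigned long long inode_bytes  = 930264ULL * 480ULL;
	unsigned long long dentry_bytes = 638430ULL * 128ULL;

	printf("inode_cache : %llu MB\n", inode_bytes  >> 20);
	printf("dentry_cache: %llu MB\n", dentry_bytes >> 20);
	printf("total       : %llu MB\n", (inode_bytes + dentry_bytes) >> 20);
	return 0;
}
----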

I would also be curious to see how the kernel is using memory...

TIA
MARCin


--
-----------------------------
Marcin Kowalski
Linux/Perl Developer
Datrix Solutions
Cel. 082-400-7603
***Open Source Kicks Ass***
-----------------------------


2001-04-12 04:50:26

by Andreas Dilger

Subject: Re: Fwd: Re: memory usage - dentry_cacheg

Marcin Kowalski writes:
> if I do a can on /proc/slabinfo I get on the machine with "MISSING" memory:
> ----
> slabinfo - version: 1.1 (SMP)
> --- cut out
> inode_cache 920558 930264 480 116267 116283 1 : 124 6
> --- cut out
> dentry_cache 557245 638430 128 21281 21281 1 : 252 126

I just discovered a similar problem when testing Daniel Phillips' new ext2
directory indexing code with bonnie++. I was running bonnie under single
user mode (basically nothing else running) to create 100k files with 1 data
block each (in a single directory). This would create a directory about
8MB in size, 32MB of dirty inode tables, and about 400M of dirty buffers.
I have 128MB RAM, no swap for the testing.

In short order, my single-user shell was OOM-killed, and in another test
bonnie was OOM-killed (even though the process itself is only 8MB in size).
There were 80k entries each of icache and dcache (38MB and 10MB respectively)
and only dirty buffers otherwise. Clearly we need some VM pressure on the
icache and dcache in this case. We probably also need more aggressive flushing
of dirty buffers before invoking OOM.

There were patches floating around on l-k which addressed these issues.
It seems it is time to try them out, which I hadn't done before because I
wasn't having any problems myself until now.

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-04-12 05:45:34

by Alexander Viro

Subject: Re: Fwd: Re: memory usage - dentry_cacheg



On Wed, 11 Apr 2001, Andreas Dilger wrote:

> I just discovered a similar problem when testing Daniel Philip's new ext2
> directory indexing code with bonnie++. I was running bonnie under single
> user mode (basically nothing else running) to create 100k files with 1 data
> block each (in a single directory). This would create a directory about
> 8MB in size, 32MB of dirty inode tables, and about 400M of dirty buffers.
> I have 128MB RAM, no swap for the testing.
>
> In short order, my single user shell was OOM killed, and in another test
> bonnie was OOM-killed (even though the process itself is only 8MB in size).
> There were 80k entries each of icache and dcache (38MB and 10MB respectively)
> and only dirty buffers otherwise. Clearly we need some VM pressure on the
> icache and dcache in this case. Probably also need more agressive flushing
> of dirty buffers before invoking OOM.

We _have_ VM pressure there. However, such loads had never been used, so
it's no wonder that the system gets unbalanced under them.

I suspect that a simple replacement of goto next; with continue; in
fs/dcache.c::prune_dcache() may make the situation seriously better.

Al

2001-04-12 06:53:58

by Jeff Garzik

Subject: [PATCH] Re: Fwd: Re: memory usage - dentry_cacheg

Index: fs/dcache.c
===================================================================
RCS file: /cvsroot/gkernel/linux_2_4/fs/dcache.c,v
retrieving revision 1.1.1.16
diff -u -r1.1.1.16 dcache.c
--- fs/dcache.c 2001/03/13 04:23:27 1.1.1.16
+++ fs/dcache.c 2001/04/12 06:51:56
@@ -340,7 +340,7 @@
if (dentry->d_flags & DCACHE_REFERENCED) {
dentry->d_flags &= ~DCACHE_REFERENCED;
list_add(&dentry->d_lru, &dentry_unused);
- goto next;
+ continue;
}
dentry_stat.nr_unused--;



2001-04-12 07:02:59

by Andreas Dilger

Subject: Re: Fwd: Re: memory usage - dentry_cacheg

Al writes:
> We _have_ VM pressure there. However, such loads had never been used, so
> there's no wonder that system gets unbalanced under them.
>
> I suspect that simple replacement of goto next; with continue; in the
> fs/dcache.c::prune_dcache() may make situation seriously better.

Yes, it appears that this would be a bug. We were only _checking_
"count" dentries, rather than pruning "count" dentries.

Testing continues.

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-04-12 07:11:29

by Alexander Viro

Subject: [CFT][PATCH] Re: Fwd: Re: memory usage - dentry_cache



On Thu, 12 Apr 2001, Jeff Garzik wrote:

> Alexander Viro wrote:
> > We _have_ VM pressure there. However, such loads had never been used, so
> > there's no wonder that system gets unbalanced under them.
> >
> > I suspect that simple replacement of goto next; with continue; in the
> > fs/dcache.c::prune_dcache() may make situation seriously better.
>
> Awesome. With the obvious patch attached, some local ramfs problems
> disappeared, and my browser and e-mail program are no longer swapped out
> when doing a kernel build.
>
> Thanks :)

OK, how about wider testing? Theory: prune_dcache() goes through the
list of immediately killable dentries and tries to free a given amount.
It has a "one warning" policy - it kills a dentry if it sees it twice without
a lookup finding that dentry in the interval. Unfortunately, as implemented
it stops when it has freed _or_ warned the given amount. As a result, memory
pressure on the dcache is less than expected.

Patch being:
--- fs/dcache.c Sun Apr 1 23:57:19 2001
+++ /tmp/dcache.c Thu Apr 12 03:07:39 2001
@@ -340,7 +340,7 @@
if (dentry->d_flags & DCACHE_REFERENCED) {
dentry->d_flags &= ~DCACHE_REFERENCED;
list_add(&dentry->d_lru, &dentry_unused);
- goto next;
+ continue;
}
dentry_stat.nr_unused--;

@@ -349,7 +349,6 @@
BUG();

prune_one_dentry(dentry);
- next:
if (!--count)
break;
}


2001-04-12 07:27:32

by Alexander Viro

Subject: [race][RFC] d_flags use



On Thu, 12 Apr 2001, Andreas Dilger wrote:

> Al writes:
> > We _have_ VM pressure there. However, such loads had never been used, so
> > there's no wonder that system gets unbalanced under them.
> >
> > I suspect that simple replacement of goto next; with continue; in the
> > fs/dcache.c::prune_dcache() may make situation seriously better.
>
> Yes, it appears that this would be a bug. We were only _checking_
> "count" dentries, rather than pruning "count" dentries.
>
> Testing continues.

Uh-oh... After looking at prune_dcache for a minute... Folks, what
protects ->d_flags? That may very well be the reason for some NFS
and autofs problems.

If nobody objects I'll go for test_bit/set_bit/clear_bit here.
Al

2001-04-12 08:01:31

by David Miller

Subject: Re: [race][RFC] d_flags use


Alexander Viro writes:
> If nobody objects I'll go for test_bit/set_bit/clear_bit here.

Be sure to make d_flags an unsigned long when you do this! :-)

Later,
David S. Miller
[email protected]

2001-04-12 08:06:21

by Alexander Viro

Subject: Re: [race][RFC] d_flags use



On Thu, 12 Apr 2001, David S. Miller wrote:

>
> Alexander Viro writes:
> > If nobody objects I'll go for test_bit/set_bit/clear_bit here.
>
> Be sure to make d_flags an unsigned long when you do this! :-)

Oh, fsck... Thanks for the reminder - I'd completely forgotten about
that.
Al

2001-04-12 08:45:05

by David Miller

Subject: Re: [CFT][PATCH] Re: Fwd: Re: memory usage - dentry_cache


Alexander Viro writes:
> OK, how about wider testing? Theory: prune_dcache() goes through the
> list of immediately killable dentries and tries to free given amount.
> It has a "one warning" policy - it kills dentry if it sees it twice without
> lookup finding that dentry in the interval. Unfortunately, as implemented
> it stops when it had freed _or_ warned given amount. As the result, memory
> pressure on dcache is less than expected.

The reason the code is the way it is right now is that there used to be a bug
where that goto spot would --count but not check against zero, making
count possibly go negative, and then you'd be there for a _long_ time
:-)

Just an FYI...

Later,
David S. Miller
[email protected]

2001-04-12 11:46:45

by Ed Tomlinson

Subject: [PATCH] Re: memory usage - dentry_cacheg

Hi,

I have been playing around with patches that fix this problem. What seems to happen is
that the VM code is pretty efficient at avoiding the calls to shrink the caches. When they
do get called it's a case of too little, too late. This is especially bad on lightly loaded
systems. The following patch helps here. I also have a more complex version that uses
autotuning, but I would rather push the simple code, _if_ it does the job.

-------------
--- linux.ac3.orig/mm/vmscan.c Sat Apr 7 15:20:49 2001
+++ linux/mm/vmscan.c Sat Apr 7 12:37:27 2001
@@ -997,6 +997,21 @@
*/
refill_inactive_scan(DEF_PRIORITY, 0);

+ /*
+ * Here we apply pressure to the dcache and icache.
+ * The nr_inodes and nr_dentry track the used part of
+ * the slab caches. When there is more than X% objs free
+ * in these lists, as reported by the nr_unused fields,
+ * there is a very good chance that shrinking will free
+ * pages from the slab caches. For the dcache 66% works,
+ * and 80% seems optimal for the icache.
+ */
+
+ if ((dentry_stat.nr_unused+(dentry_stat.nr_unused>>1)) > dentry_stat.nr_dentry)
+ shrink_dcache_memory(DEF_PRIORITY, GFP_KSWAPD);
+ if ((inodes_stat.nr_unused+(inodes_stat.nr_unused>>2)) > inodes_stat.nr_inodes)
+ shrink_icache_memory(DEF_PRIORITY, GFP_KSWAPD);
+
/* Once a second, recalculate some VM stats. */
if (time_after(jiffies, recalc + HZ)) {
recalc = jiffies;
-------------
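As a quick sanity check of the thresholds in the patch (a sketch added here,
not part of the patch itself): nr_unused + (nr_unused >> 1) is nr_unused * 1.5,
so the dcache branch fires once more than about 2/3 of the dentries are unused,
and nr_unused + (nr_unused >> 2) is nr_unused * 1.25, so the icache branch
fires once more than about 4/5 of the inodes are unused - the 66% and 80%
mentioned in the comment.

----
/* Standalone check of the two shift-based thresholds used above. */
#include <assert.h>
#include <stdio.h>

int main(void)
{
	unsigned int nr_dentry = 3000, nr_inodes = 1000;
	unsigned int d_unused = 2001;	/* just over 66% of nr_dentry */
	unsigned int i_unused = 801;	/* just over 80% of nr_inodes */

	assert(d_unused + (d_unused >> 1) > nr_dentry);	/* dcache would be shrunk */
	assert(i_unused + (i_unused >> 2) > nr_inodes);	/* icache would be shrunk */
	printf("both thresholds trip as expected\n");
	return 0;
}
----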

Ed Tomlinson

Alexander Viro wrote:
>
> On Wed, 11 Apr 2001, Andreas Dilger wrote:
>
>> I just discovered a similar problem when testing Daniel Philip's new ext2
>> directory indexing code with bonnie++. I was running bonnie under single
>> user mode (basically nothing else running) to create 100k files with 1 data
>> block each (in a single directory). This would create a directory about
>> 8MB in size, 32MB of dirty inode tables, and about 400M of dirty buffers.
>> I have 128MB RAM, no swap for the testing.
>>
>> In short order, my single user shell was OOM killed, and in another test
>> bonnie was OOM-killed (even though the process itself is only 8MB in size).
>> There were 80k entries each of icache and dcache (38MB and 10MB respectively)
>> and only dirty buffers otherwise. Clearly we need some VM pressure on the
>> icache and dcache in this case. Probably also need more agressive flushing
>> of dirty buffers before invoking OOM.
>
> We _have_ VM pressure there. However, such loads had never been used, so
> there's no wonder that system gets unbalanced under them.
>
> I suspect that simple replacement of goto next; with continue; in the
> fs/dcache.c::prune_dcache() may make situation seriously better.
>
> Al
>

2001-04-12 12:29:42

by Marcin Kowalski

Subject: Re: [CFT][PATCH] Re: Fwd: Re: memory usage - dentry_cache

Hi

Regarding the patch ....

I don't have experience with the Linux kernel internals, but could this patch
not lead to a runaway loop, as the only thing that can break out of the
for(;;) loop is the tmp == &dentry_unused test? So if the required number
of dentries does not exist and this condition is never satisfied, we would have
an infinite loop... sorry if this is a silly question.

Also, regarding the comment >/* If the dentry was recently referenced, don't free it.
*/<, the code inside is executed if the DCACHE_REFERENCED flag is set, and
the code clears the DCACHE_REFERENCED flag on the dentry and adds
it back to the dentry_unused list??? So a referenced entry is marked Not Referenced
and placed in the unused list?? I am unclear about that... is the comment
correct, or is my understanding lacking (which is very probable :-))?

TIA
MarCin


FYI >--------

void prune_dcache(int count)
{
spin_lock(&dcache_lock);
for (;;) {
struct dentry *dentry;
struct list_head *tmp;

tmp = dentry_unused.prev;

if (tmp == &dentry_unused)
break;
list_del_init(tmp);
dentry = list_entry(tmp, struct dentry, d_lru);

/* If the dentry was recently referenced, don't free it. */
if (dentry->d_flags & DCACHE_REFERENCED) {
dentry->d_flags &= ~DCACHE_REFERENCED;
list_add(&dentry->d_lru, &dentry_unused);
continue;
}
dentry_stat.nr_unused--;

/* Unused dentry with a count? */
if (atomic_read(&dentry->d_count))
BUG();

prune_one_dentry(dentry);
if (!--count)
break;
}
spin_unlock(&dcache_lock);
}

-----------------------------
Marcin Kowalski
Linux/Perl Developer
Datrix Solutions
Cel. 082-400-7603
***Open Source Kicks Ass***
-----------------------------

2001-04-12 12:43:23

by Yoann Vandoorselaere

Subject: Re: [CFT][PATCH] Re: Fwd: Re: memory usage - dentry_cache

Marcin Kowalski <[email protected]> writes:

> Hi
>
> Regarding the patch ....
>
> I don't have experience with the linux kernel internals but could this patch
> not lead to a run-loop condition as the only thing that can break our of the
> for(;;) loop is the tmp==&dentry_unused statement. So if the required number
> of dentries does not exist and this condition is not satisfied we would have
> an infinate loop... sorry if this is a silly question.

AFAICT no, because of the list_del_init(tmp) call:
when the list becomes empty, dentry_unused.prev points back at
&dentry_unused itself (it is a circular list), so tmp == &dentry_unused and the loop breaks.
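A minimal illustration of that point (plain userspace C added here; the
two-pointer struct is just a stand-in for the kernel's struct list_head,
not the real thing):

----
/* An empty circular list's .prev points at the list head itself,
 * so the "tmp == &dentry_unused" test in prune_dcache() fires and
 * the loop exits once everything has been removed from the list.
 */
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

int main(void)
{
	struct list_head head = { &head, &head };	/* empty circular list */
	struct list_head *tmp = head.prev;

	if (tmp == &head)
		printf("list is empty: the loop would break here\n");
	return 0;
}
----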

--
Yoann Vandoorselaere | "Programming is a race between programmers, who try and
MandrakeSoft | make more and more idiot-proof software, and universe,
| which produces more and more remarkable idiots. Until
| now, universe leads the race" -- R. Cook

2001-04-12 13:54:44

by Alexander Viro

Subject: Re: [CFT][PATCH] Re: Fwd: Re: memory usage - dentry_cache



On Thu, 12 Apr 2001, Marcin Kowalski wrote:

> Hi
>
> Regarding the patch ....
>
> I don't have experience with the linux kernel internals but could this patch
> not lead to a run-loop condition as the only thing that can break our of the
> for(;;) loop is the tmp==&dentry_unused statement. So if the required number
> of dentries does not exist and this condition is not satisfied we would have
> an infinate loop... sorry if this is a silly question.

Nope. Notice that "warned" dentries are not killed, but they are returned
to the list. If we meet them again - they are goners.

More formally, on each iteration you either decrement count or you
decrement the number of dentries that have DCACHE_REFERENCED. count
can't grow at all. Number of dentries with DCACHE_REFERENCED can't grow
unless you release dcache_lock, which happens only in the branch that
decrements count. I.e. loop does terminate.

> Also the comment >/* If the dentry was recently referenced, don't free it.
> */<, the code inside is excuted if the DCACHE_REFERENCED flags are set and in
> the code is is reversing the DCACHE_REFERENCED flag on the dentry and adding
> it to the dentry_unsed list??? So a Refrenched entry is set Not Referenced
> and place in the unsed list?? I am unclear about that... is the comment
> correct or is my understanding lacking (which is very probable :-))..

"referenced" as in "had been found by d_lookup, don't shoot me at sight".
When prune_dcache() picks it up it moves the thing on the other end of list
and removes the mark. Caught twice - too bad, it will be freed.
Al

2001-04-12 14:34:32

by Jan Harkes

Subject: [PATCH] Re: Fwd: Re: memory usage - dentry_cacheg

On Thu, Apr 12, 2001 at 01:45:08AM -0400, Alexander Viro wrote:
> On Wed, 11 Apr 2001, Andreas Dilger wrote:
>
> > I just discovered a similar problem when testing Daniel Philip's new ext2
> > directory indexing code with bonnie++. I was running bonnie under single
> > user mode (basically nothing else running) to create 100k files with 1 data
> > block each (in a single directory). This would create a directory about
> > 8MB in size, 32MB of dirty inode tables, and about 400M of dirty buffers.
> > I have 128MB RAM, no swap for the testing.
> >
> > In short order, my single user shell was OOM killed, and in another test
> > bonnie was OOM-killed (even though the process itself is only 8MB in size).
> > There were 80k entries each of icache and dcache (38MB and 10MB respectively)
> > and only dirty buffers otherwise. Clearly we need some VM pressure on the
> > icache and dcache in this case. Probably also need more agressive flushing
> > of dirty buffers before invoking OOM.
>
> We _have_ VM pressure there. However, such loads had never been used, so
> there's no wonder that system gets unbalanced under them.

But the VM pressure on the dcache and icache only comes into play when
the system still has a free_shortage _after_ other attempts at freeing
up memory in do_try_to_free_pages.

sync_all_inodes, which is called from shrink_icache_memory, is
counterproductive at this point. Writing dirty inodes to disk,
especially when there are a lot of them, requires additional page
allocations.

I have a patch that unconditionally puts pressure on the dcache
and icache, and avoids sync_all_inodes in shrink_icache_memory. An
additional wakeup for the kupdate thread makes sure that inodes are more
frequently written when there is no more free shortage. Maybe kupdated
should always get woken up.

btw. Alexander, is the following a valid optimization to improve
write-coalescing when calling sync_one for several inodes?

inode.c:sync_one

- filemap_fdatawait(inode->i_mapping);
+ if (sync) filemap_fdatawait(inode->i_mapping);

Jan

================================================
--- fs/buffer.c.orig Mon Mar 26 10:47:09 2001
+++ fs/buffer.c Mon Mar 26 10:48:33 2001
@@ -2593,7 +2593,7 @@
return flushed;
}

-struct task_struct *bdflush_tsk = 0;
+struct task_struct *bdflush_tsk = 0, *kupdate_tsk = 0;

void wakeup_bdflush(int block)
{
@@ -2605,6 +2605,12 @@
}
}

+void wakeup_kupdate(void)
+{
+ if (current != kupdate_tsk)
+ wake_up_process(kupdate_tsk);
+}
+
/*
* Here we attempt to write back old buffers. We also try to flush inodes
* and supers as well, since this function is essentially "update", and
@@ -2751,6 +2757,7 @@
tsk->session = 1;
tsk->pgrp = 1;
strcpy(tsk->comm, "kupdated");
+ kupdate_tsk = tsk;

/* sigstop and sigcont will stop and wakeup kupdate */
spin_lock_irq(&tsk->sigmask_lock);
--- fs/inode.c.orig Thu Mar 22 13:20:55 2001
+++ fs/inode.c Mon Mar 26 10:48:33 2001
@@ -224,7 +224,8 @@
if (dirty & (I_DIRTY_SYNC | I_DIRTY_DATASYNC))
write_inode(inode, sync);

- filemap_fdatawait(inode->i_mapping);
+ if (sync)
+ filemap_fdatawait(inode->i_mapping);

spin_lock(&inode_lock);
inode->i_state &= ~I_LOCK;
@@ -270,19 +271,6 @@
spin_unlock(&inode_lock);
}

-/*
- * Called with the spinlock already held..
- */
-static void sync_all_inodes(void)
-{
- struct super_block * sb = sb_entry(super_blocks.next);
- for (; sb != sb_entry(&super_blocks); sb = sb_entry(sb->s_list.next)) {
- if (!sb->s_dev)
- continue;
- sync_list(&sb->s_dirty);
- }
-}
-
/**
* write_inode_now - write an inode to disk
* @inode: inode to write to disk
@@ -507,8 +495,6 @@
struct inode * inode;

spin_lock(&inode_lock);
- /* go simple and safe syncing everything before starting */
- sync_all_inodes();

entry = inode_unused.prev;
while (entry != &inode_unused)
--- mm/vmscan.c.orig Thu Mar 22 14:00:41 2001
+++ mm/vmscan.c Mon Mar 26 10:48:33 2001
@@ -840,14 +840,7 @@
if (inactive_shortage())
ret += refill_inactive(gfp_mask, user);

- /*
- * Delete pages from the inode and dentry caches and
- * reclaim unused slab cache if memory is low.
- */
- if (free_shortage()) {
- shrink_dcache_memory(DEF_PRIORITY, gfp_mask);
- shrink_icache_memory(DEF_PRIORITY, gfp_mask);
- } else {
+ if (!free_shortage()) {
/*
* Illogical, but true. At least for now.
*
@@ -857,7 +850,14 @@
* which we'll want to keep if under shortage.
*/
kmem_cache_reap(gfp_mask);
+ wakeup_kupdate();
}
+
+ /*
+ * Delete pages from the inode and dentry caches.
+ */
+ shrink_dcache_memory(DEF_PRIORITY, gfp_mask);
+ shrink_icache_memory(DEF_PRIORITY, gfp_mask);

return ret;
}
--- include/linux/fs.h.orig Mon Mar 26 10:48:56 2001
+++ include/linux/fs.h Mon Mar 26 10:48:33 2001
@@ -1248,6 +1248,7 @@
extern unsigned int get_hardblocksize(kdev_t);
extern struct buffer_head * bread(kdev_t, int, int);
extern void wakeup_bdflush(int wait);
+extern void wakeup_kupdate(void);

extern int brw_page(int, struct page *, kdev_t, int [], int);

2001-04-12 14:50:56

by Alexander Viro

Subject: Re: [PATCH] Re: Fwd: Re: memory usage - dentry_cacheg



On Thu, 12 Apr 2001, Jan Harkes wrote:

> But the VM pressure on the dcache and icache only comes into play once
> the system still has a free_shortage _after_ other attempts of freeing
> up memory in do_try_to_free_pages.

I don't think that it's necessarily bad.

> sync_all_inodes, which is called from shrink_icache_memory is
> counterproductive at this point. Writing dirty inodes to disk,
> especially when there is a lot of them, requires additional page
> allocations.

Agreed, but that's
a) a separate story
b) not the case in the situation mentioned above (all inodes are
busy).

> I have a patch that avoids unconditionally puts pressure on the dcache
> and icache, and avoids sync_all_inodes in shrink_icache_memory. An
> additional wakeup for the kupdate thread makes sure that inodes are more
> frequently written when there is no more free shortage. Maybe kupdated
> should be always get woken up.

Maybe, but I really doubt that constant pressure on dcache/icache is a
good idea. I'd rather see what will change from fixing that bug in
prune_dcache() before deciding what to do next.

> btw. Alexander, is the following a valid optimization to improve
> write-coalescing when calling sync_one for several inodes?
>
> inode.c:sync_one
>
> - filemap_fdatawait(inode->i_mapping);
> + if (sync) filemap_fdatawait(inode->i_mapping);

Umm... Probably.

2001-04-12 14:57:07

by Rik van Riel

Subject: Re: [PATCH] Re: memory usage - dentry_cacheg

On Thu, 12 Apr 2001, Ed Tomlinson wrote:

> I have been playing around with patches that fix this problem. What
> seems to happen is that the VM code is pretty efficent at avoiding the
> calls to shrink the caches. When they do get called its a case of to
> little to late. This is espically bad in lightly loaded systems.
> The following patch helps here. I also have a more complex version
> that uses autotuning, but would rather push the simple code, _if_ it
> does the job.

I like this patch. The thing I like most is that it tries to free
from this cache if there is little activity, not when we are low
on memory and it is physically impossible to get rid of the cache.

Remember that evicting early from the inode and dentry cache doesn't
matter since we can easily rebuild this data from the buffer and page
cache.

> -------------
> --- linux.ac3.orig/mm/vmscan.c Sat Apr 7 15:20:49 2001
> +++ linux/mm/vmscan.c Sat Apr 7 12:37:27 2001
> @@ -997,6 +997,21 @@
> */
> refill_inactive_scan(DEF_PRIORITY, 0);
>
> + /*
> + * Here we apply pressure to the dcache and icache.
> + * The nr_inodes and nr_dentry track the used part of
> + * the slab caches. When there is more than X% objs free
> + * in these lists, as reported by the nr_unused fields,
> + * there is a very good chance that shrinking will free
> + * pages from the slab caches. For the dcache 66% works,
> + * and 80% seems optimal for the icache.
> + */
> +
> + if ((dentry_stat.nr_unused+(dentry_stat.nr_unused>>1)) > dentry_stat.nr_dentry)
> + shrink_dcache_memory(DEF_PRIORITY, GFP_KSWAPD);
> + if ((inodes_stat.nr_unused+(inodes_stat.nr_unused>>2)) > inodes_stat.nr_inodes)
> + shrink_icache_memory(DEF_PRIORITY, GFP_KSWAPD);
> +
> /* Once a second, recalculate some VM stats. */
> if (time_after(jiffies, recalc + HZ)) {
> recalc = jiffies;
> -------------
>
> Ed Tomlinson
>
> Alexander Viro wrote:
> >
> > On Wed, 11 Apr 2001, Andreas Dilger wrote:
> >
> >> I just discovered a similar problem when testing Daniel Philip's new ext2
> >> directory indexing code with bonnie++. I was running bonnie under single
> >> user mode (basically nothing else running) to create 100k files with 1 data
> >> block each (in a single directory). This would create a directory about
> >> 8MB in size, 32MB of dirty inode tables, and about 400M of dirty buffers.
> >> I have 128MB RAM, no swap for the testing.
> >>
> >> In short order, my single user shell was OOM killed, and in another test
> >> bonnie was OOM-killed (even though the process itself is only 8MB in size).
> >> There were 80k entries each of icache and dcache (38MB and 10MB respectively)
> >> and only dirty buffers otherwise. Clearly we need some VM pressure on the
> >> icache and dcache in this case. Probably also need more agressive flushing
> >> of dirty buffers before invoking OOM.
> >
> > We _have_ VM pressure there. However, such loads had never been used, so
> > there's no wonder that system gets unbalanced under them.
> >
> > I suspect that simple replacement of goto next; with continue; in the
> > fs/dcache.c::prune_dcache() may make situation seriously better.
> >
> > Al
> >

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com.br/

2001-04-12 15:08:07

by Rik van Riel

Subject: Re: [PATCH] Re: Fwd: Re: memory usage - dentry_cacheg

On Thu, 12 Apr 2001, Alexander Viro wrote:
> On Thu, 12 Apr 2001, Jan Harkes wrote:
>
> > But the VM pressure on the dcache and icache only comes into play once
> > the system still has a free_shortage _after_ other attempts of freeing
> > up memory in do_try_to_free_pages.
>
> I don't think that it's necessary bad.

Please take a look at Ed Tomlinson's patch. It also puts pressure
on the dcache and icache independent of VM pressure, but it does
so based on the (lack of) pressure inside the dcache and icache
themselves.

The patch looks simple and sane, and it might save us quite a bit of
trouble in making the prune_{icache,dcache} functions both able
to avoid low-memory deadlocks *AND* at the same time able to run
fast under low-memory situations ... we'd just prune from the
icache and dcache as soon as a "large portion" of the cache isn't
in use.

regards,

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com.br/

2001-04-12 15:13:17

by Alexander Viro

Subject: Re: [PATCH] Re: memory usage - dentry_cacheg



On Thu, 12 Apr 2001, Rik van Riel wrote:

> On Thu, 12 Apr 2001, Ed Tomlinson wrote:
>
> > I have been playing around with patches that fix this problem. What
> > seems to happen is that the VM code is pretty efficent at avoiding the
> > calls to shrink the caches. When they do get called its a case of to
> > little to late. This is espically bad in lightly loaded systems.
> > The following patch helps here. I also have a more complex version
> > that uses autotuning, but would rather push the simple code, _if_ it
> > does the job.
>
> I like this patch. The thing I like most is that it tries to free
> from this cache if there is little activity, not when we are low
> on memory and it is physically impossible to get rid of the cache.
>
> Remember that evicting early from the inode and dentry cache doesn't
> matter since we can easily rebuild this data from the buffer and page
> cache.

Ahem. Yes, for local block-based filesystems, provided that directories are
small and that indirect blocks will not flush the inode table buffers out of
buffer cache, etc., etc.

Keeping inodes clean when pressure is low is a nice idea. That way you can
easily evict when needed. Evicting early... Not really.

2001-04-12 15:27:39

by Alexander Viro

Subject: Re: [PATCH] Re: Fwd: Re: memory usage - dentry_cacheg



On Thu, 12 Apr 2001, Rik van Riel wrote:

> Please take a look at Ed Tomlinson's patch. It also puts pressure
> on the dcache and icache independent of VM pressure, but it does
> so based on the (lack of) pressure inside the dcache and icache
> themselves.
>
> The patch looks simple, sane and it might save us quite a bit of
> trouble in making the prune_{icache,dcache} functions both able
> to avoid low-memory deadlocks *AND* at the same time able to run
> fast under low-memory situations ... we'd just prune from the
> icache and dcache as soon as a "large portion" of the cache isn't
> in use.

Bad idea. If you do loops over directory contents you will almost
permanently have almost all dentries freeable. That doesn't make freeing
them a good thing - think of the effects it would have.

Simple question: how many of the dentries in /usr/src/linux/include/linux
are busy at any given moment during a compile? At most 10, I suspect.
I.e. ~4%.

I would rather go for actively keeping the number of dirty inodes low,
so that freeing would be cheap. Doing a massive write_inode when we
get low on memory is, indeed, a bad thing, but you don't have to
tie that to freeing stuff. Heck, IIRC you are using quite similar
logic for the pagecache...

2001-04-12 15:31:28

by Marcin Kowalski

Subject: Re: [PATCH] Re: memory usage - dentry_cacheg

Hi

I have applied this (Ed Tomlinson's) patch as well as the small change to
dcache.c (thanks Andreas, David, Alexander and all). I ran some tests and so
far so good; both the dcache and inode cache entries in slabinfo are keeping
nice and low, even though I tested by creating thousands of files and then
deleting them. The dentry and inode caches both purged successfully.

Under the high memory usage test, 350 MB of swap was left behind after the
process terminated, leaving 750 MB of free (physical) memory. Why is this??
I did a swapoff, which brought the machine to its knees for the better part of
2 minutes, and then a swapon, and I was left with 170 MB of used memory in
total... I must say that I find it odd that the swap is not cleared if there
is available "real" memory.

Anyway, time will tell; I shall see how it performs over the long weekend.
Hopefully no *crashes*. Thanks to all concerned for their help... I learned
*lots*.

Regards
MarCin


-----------------------------
Marcin Kowalski
Linux/Perl Developer
Datrix Solutions
Cel. 082-400-7603
***Open Source Kicks Ass***
-----------------------------

2001-04-12 15:42:51

by Alexander Viro

Subject: Re: [PATCH] Re: Fwd: Re: memory usage - dentry_cacheg



On Thu, 12 Apr 2001, Alexander Viro wrote:

> Bad idea. If you do loops over directory contents you will almost
> permanently have almost all dentries freeable. Doesn't make freeing
> them a good thing - think of the effects it would have.
>
> Simple question: how many of dentries in /usr/src/linux/include/linux
> are busy at any given moment during the compile? At most 10, I suspect.
> I.e. ~4%.
>
> I would rather go for active keeping the amount of dirty inodes low,
> so that freeing would be cheap. Doing massive write_inode when we
> get low on memory is, indeed, a bad thing, but you don't have to
> tie that to freeing stuff. Heck, IIRC you are using quite a similar
> logics for pagecache...

PS: with your approach negative entries are dead meat - they won't be
caught used unless you look at them exactly at the moment of d_lookup().

Welcome to massive lookups in /bin due to /usr/bin stuff (and no, shell
own cache doesn't help - it's not shared; think of scripts).

IOW. keeping dcache/icache size low is not a good thing, unless you
have a memory pressure that requires it. More agressive kupdate _is_
a good thing, though - possibly kupdate sans flushing buffers, so that
it would just keep the icache clean and let bdflush do the actual IO.

2001-04-12 15:49:11

by Rik van Riel

Subject: Re: [PATCH] Re: Fwd: Re: memory usage - dentry_cacheg

On Thu, 12 Apr 2001, Alexander Viro wrote:

> IOW. keeping dcache/icache size low is not a good thing, unless you
> have a memory pressure that requires it. More agressive kupdate _is_
> a good thing, though - possibly kupdate sans flushing buffers, so that
> it would just keep the icache clean and let bdflush do the actual IO.

Very well. Then I'll leave the balancing between eating from the
page cache and eating from the dcache/icache to you. Have fun.

Rik
--
Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

http://www.surriel.com/
http://www.conectiva.com/ http://distro.conectiva.com.br/

2001-04-12 17:28:17

by Andreas Dilger

Subject: Re: [CFT][PATCH] Re: Fwd: Re: memory usage - dentry_cache

David writes:
> Alexander Viro writes:
> > OK, how about wider testing? Theory: prune_dcache() goes through the
> > list of immediately killable dentries and tries to free given amount.
> > It has a "one warning" policy - it kills dentry if it sees it twice without
> > lookup finding that dentry in the interval. Unfortunately, as implemented
> > it stops when it had freed _or_ warned given amount. As the result, memory
> > pressure on dcache is less than expected.
>
> The reason the code is how it is right now is there used to be a bug
> where that goto spot would --count but not check against zero, making
> count possibly go negative and then you'd be there for a _long_ time
> :-)

Actually, this is the case if we call shrink_dcache_memory() with priority
zero. It calls prune_dcache(count = 0), which gets into the situation you
describe (i.e. negative count). I first thought this was a bug, but then
realized that for priority 0 (i.e. highest priority) we want to check the whole
dentry_unused list for unreferenced dentries.
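A tiny illustration of the count == 0 behaviour (a userspace sketch added
here, assuming count is a plain int as in the prune_dcache() quoted earlier
in the thread):

----
/* With count == 0, the pre-decrement makes it -1 on the first pass, so
 * "if (!--count) break;" effectively never triggers again and the loop
 * only ends when the unused list itself runs out - i.e. the whole list
 * gets scanned.
 */
#include <stdio.h>

int main(void)
{
	int count = 0;
	int entries = 5;	/* pretend the unused list holds 5 dentries */
	int scanned = 0;

	while (entries--) {
		scanned++;
		if (!--count)	/* 0 -> -1 -> -2 ...: stays nonzero */
			break;
	}
	printf("scanned %d entries, count ended at %d\n", scanned, count);
	return 0;
}
----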

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2001-04-13 01:34:44

by Ed Tomlinson

Subject: Re: [PATCH] Re: memory usage - dentry_cacheg

On Thursday 12 April 2001 11:12, Alexander Viro wrote:
> On Thu, 12 Apr 2001, Rik van Riel wrote:
> > On Thu, 12 Apr 2001, Ed Tomlinson wrote:
> > > I have been playing around with patches that fix this problem. What
> > > seems to happen is that the VM code is pretty efficent at avoiding the
> > > calls to shrink the caches. When they do get called its a case of to
> > > little to late. This is espically bad in lightly loaded systems.
> > > The following patch helps here. I also have a more complex version
> > > that uses autotuning, but would rather push the simple code, _if_ it
> > > does the job.
> >
> > I like this patch. The thing I like most is that it tries to free
> > from this cache if there is little activity, not when we are low
> > on memory and it is physically impossible to get rid of the cache.
> >
> > Remember that evicting early from the inode and dentry cache doesn't
> > matter since we can easily rebuild this data from the buffer and page
> > cache.
>
> Ahem. Yes, for local block-based filesystems, provided that directories are
> small and that indirect blocks will not flush the inode table buffers out
> of buffer cache, etc., etc.
>
> Keeping inodes clean when pressure is low is a nice idea. That way you can
> easily evict when needed. Evicting early... Not really.

What prompted my patch was observing situations where the icache (and dcache
too) got so big that they were applying artificial pressure to the page and
buffer caches. I say artificial since, checking the stats, these caches showed
over 95% of the entries unused. At this point there is usually another 10%
or so of objects allocated by the slab caches but not accounted for in the
stats (not a problem; they are accounted for if the cache starts using them).

I suspect your change to the prune logic is not going to help the above situation
much - if the shrink functions are not called often enough we end up with
oversized caches.

Comments?
Ed Tomlinson <[email protected]>

2001-04-13 02:04:17

by Alexander Viro

Subject: Re: [PATCH] Re: memory usage - dentry_cacheg



On Thu, 12 Apr 2001, Ed Tomlinson wrote:

> On Thursday 12 April 2001 11:12, Alexander Viro wrote:
> What prompted my patch was observing situations where the icache (and dcache
> too) got so big that they were applying artifical pressure to the page and
> buffer caches. I say artifical since checking the stats these caches showed
> over 95% of the entries unused. At this point there is usually another 10%
> or so of objects allocated by the slab caches but not accounted for in the
> stats (not a problem they are accounted if the cache starts using them).

"Unused" as in "->d_count==0"? That _is_ OK. Basically, you will have
positive ->d_count only on directories and currently opened files.
E.g. during compile in /usr/include/* you will have 3-5 file dentries
with ->d_count > 0 - ones that are opened _now_. It doesn't mean that
everything else rest is unused in any meaningful sense. Can be freed - yes,
but that's a different story.

If you are talking about "unused" from the slab POV - _ouch_. Looks like
extremely bad fragmentation ;-/ It's surprising, and if that's thte case
I'd like to see more details.
Al

2001-04-13 04:46:05

by Ed Tomlinson

Subject: Re: [PATCH] Re: memory usage - dentry_cacheg

On Thursday 12 April 2001 22:03, Alexander Viro wrote:
> On Thu, 12 Apr 2001, Ed Tomlinson wrote:
> > On Thursday 12 April 2001 11:12, Alexander Viro wrote:
> > What prompted my patch was observing situations where the icache (and
> > dcache too) got so big that they were applying artifical pressure to the
> > page and buffer caches. I say artifical since checking the stats these
> > caches showed over 95% of the entries unused. At this point there is
> > usually another 10% or so of objects allocated by the slab caches but not
> > accounted for in the stats (not a problem they are accounted if the cache
> > starts using them).
>
> "Unused" as in "->d_count==0"? That _is_ OK. Basically, you will have
> positive ->d_count only on directories and currently opened files.
> E.g. during compile in /usr/include/* you will have 3-5 file dentries
> with ->d_count > 0 - ones that are opened _now_. It doesn't mean that
> everything else rest is unused in any meaningful sense. Can be freed - yes,
> but that's a different story.
>
> If you are talking about "unused" from the slab POV - _ouch_. Looks like
> extremely bad fragmentation ;-/ It's surprising, and if that's thte case
> I'd like to see more details.

From the POV of dentry_stat.nr_unused. From the slab POV, dentry_stat.nr_dentry
always equals the number of objects used as reported in /proc/slabinfo. If I
could remember my stats from ages back I could take a stab at estimating the
fragmentation... From experience, if you look at memory_pressure before and
after a shrink of the dcache you will usually see it decrease if
there is more than 75% or so free reported by dentry_stat.nr_unused.

The inode cache is not as good. With fewer inodes per page (slab) I
would expect that percentage to be lower. Instead it usually has to be
above 80% to get pages free...

I am trying your change now.

Ed Tomlinson

2001-04-13 13:36:27

by Ed Tomlinson

Subject: Re: [PATCH] Re: memory usage - dentry_cacheg

On Friday 13 April 2001 00:45, Ed Tomlinson wrote:
> On Thursday 12 April 2001 22:03, Alexander Viro wrote:

> > If you are talking about "unused" from the slab POV - _ouch_. Looks like
> > extremely bad fragmentation ;-/ It's surprising, and if that's thte case
> > I'd like to see more details.

> From the POV of dentry_stat.nr_unused. From the slab POV,
> dentry_stat.nr_dentry always equals the number of objects used as reported
> in /proc/slabinfo. If I could remember my stats from ages back I could
> take a stab at estimating the fragmentation... From experience if you look
> at memory_pressure before and after a shrink of the dcache you will usually
> see it decrease if there if there is more that 75% or so free reported by
> dentry_stat.nr_unused.
>
> The inode cache is not as good. With fewer inodes per page (slab) I
> would expect that percentage to be lower. Instead it usually has to be
> above 80% to get pages free...
>
> I am trying your change now.

And it does seem to help here. Worst case during an afio backup was:

inode_cache 14187 16952 480 2119 2119 1
dentry_cache 1832 3840 128 128 128 1
4 1 0 45256 1600 36828 165156 4 8 576 79 163 612 39 6 55

Without the patch, 20000+ inode slab objects were not uncommon.

Here are some numbers, snapshotting every 120
seconds at the start of a backup.

oscar% while true;do cat /proc/slabinfo | egrep "dentry|inode"; vmstat | tail -1;
sleep 120; done
inode_cache 11083 11592 480 1449 1449 1
dentry_cache 4477 4500 128 150 150 1
0 0 0 136 7116 17048 198072 0 0 36 64 129 443 20 3 77
inode_cache 11493 11816 480 1477 1477 1
dentry_cache 2611 3690 128 123 123 1
4 0 0 8784 1596 66728 152484 0 1 44 65 131 448 20 3 77
inode_cache 4512 6168 480 771 771 1
dentry_cache 2708 4320 128 144 144 1
3 0 0 24168 2936 170108 50196 0 3 62 66 135 457 20 4 76
inode_cache 1651 4184 480 523 523 1
dentry_cache 778 3330 128 111 111 1
2 0 0 156 18560 130504 74848 4 5 77 68 138 462 21 4 75
inode_cache 11426 11432 480 1429 1429 1
dentry_cache 672 3240 128 108 108 1
2 0 0 44928 1740 58292 151932 4 11 101 77 140 467 21 4 74
inode_cache 10572 11480 480 1435 1435 1
dentry_cache 1099 3240 128 108 108 1
3 0 0 45668 1852 21412 189600 4 11 126 79 142 474 22 4 74
inode_cache 10620 11416 480 1427 1427 1
dentry_cache 1611 3240 128 108 108 1
3 0 0 45648 2068 13020 202140 4 11 152 78 143 482 23 4 73
inode_cache 10637 11416 480 1427 1427 1
dentry_cache 1628 3240 128 108 108 1
3 0 0 45648 1588 12412 200832 4 11 171 77 143 489 24 4 72
inode_cache 10652 11416 480 1427 1427 1
dentry_cache 1643 3240 128 108 108 1
2 0 0 45648 1808 12556 191080 4 11 190 76 143 497 25 5 71
inode_cache 10698 11416 480 1427 1427 1
dentry_cache 1697 3240 128 108 108 1
2 0 0 45648 1736 12788 191300 4 10 208 75 143 504 26 5 70
inode_cache 10729 11416 480 1427 1427 1
dentry_cache 1728 3240 128 108 108 1

Looks like there is some fragmentation occurring. It stays near a 1:2 ratio for
most of the backup (using afio) and ends up with the slab cache having 10-20% more
entries than the dcache is using.

Thanks
Ed Tomlinson <[email protected]>

2001-04-14 03:30:18

by Paul

Subject: Re: [PATCH] Re: memory usage - dentry_cacheg

Marcin Kowalski <[email protected]>, on Thu Apr 12, 2001 [05:30:59 PM] said:
> Hi
>
> I have applied this(Tom's) patch as well as the small change to
> dcache.c(thanx Andreas, David, Alexander and All), I ran some tests and so
> far so good, both the dcache and inode cache entries in slabinfo are keeping
> nice and low even though I tested by creating thousands of files and then
> deleting then. The dentry and icache both pruged succesfully.
>

I applied these patches to 2.4.3-ac5, and it made a world
of difference. I can run kernel compiles, things like 'find /',
and move between desktops running netscape, mutt with 15000
messages threaded, etc. without sluggish delays... e.g. previously
netscape used to take a second or so to repaint under this type
of 'load' upon returning to it from a brief visit to another
desktop.
This is a subjective assessment of my desktop-type system, a
K6-333 with 64MB; 2.4 is much more usable for me now.
If anyone wants me to run specific tests, I am willing.

Paul
[email protected]