2002-01-09 17:27:26

by Matt Dainty

Subject: Where's all my memory going?

Hi,

I've fashioned a qmail mail server using an HP NetServer with an HP NetRaid
4M & 1GB RAM, running 2.4.17 with aacraid, LVM, ext3 and highmem. The box
has 6x 9GB disks, one for system, one for qmail's queue, and the remaining
four are RAID5'd with LVM. ext3 is only on the queue disk, ext2 everywhere
else.

Before I stick the box live, I wanted to test that it doesn't fall over
under anything resembling real stress, so I've run postal to simulate lots of mail
connections.

Nothing too hard to begin with, but I'm seeing a degradation in performance
over time, using a maximum message size of 10KB, 5 simultaneous connections,
and limiting to 1500 messages per minute.

Initially the box memory situation is like this:

root@plum:~# free
total used free shared buffers cached
Mem: 1029524 78948 950576 0 26636 23188
-/+ buffers/cache: 29124 1000400
Swap: 2097136 0 2097136

...running postal, it seems to cope fine. Checking the queue using
qmail-qstat shows no messages being delayed for delivery, everything I chuck
at it is being delivered straight away.

However, over time (30-45 minutes), more and more memory seems to just
disappear from the system until it looks like this (note that swap is
hardly ever touched):

root@plum:~# free
total used free shared buffers cached
Mem: 1029524 1018032 11492 0 49380 245568
-/+ buffers/cache: 723084 306440
Swap: 2097136 676 2096460

...and qmail-qstat reports a few thousand queued messages. Even if I stop
the postal process, let the queue empty and start again, it never attains
the same performance as it did initially and the queue gets slowly filled.

I haven't left it long enough to see if the box grinds itself into the
ground, but it appears to stay at pretty much the same level as above once
it gets there. CPU load stays at around 5.0 (PIII 533), but it's still
very responsive to input and launching stuff.

Looking at the processes, the biggest memory hog is a copy of dnscache that
claims to have used ~10MB, which is fine as I specified a cache of that size.
Nothing else shows any hint of excessive memory usage.

Can anyone offer any advice or a solution to this behaviour (or more tricks
or settings I can try)? I'd like the mail server to be able to handle 1500
messages a minute instead of 150! :-) If any extra info is required, please
let me know, I'm not sure what else to provide at the moment.

Cheers

Matt
--
"Phased plasma rifle in a forty-watt range?"
"Hey, just what you see, pal"


2002-01-09 17:37:06

by Alan

Subject: Re: Where's all my memory going?

> However, over time, (30-45 minutes), more and more memory seems to just
> disappear from the system until it looks like this, (note that swap is
> hardly ever touched):

I don't see any disappearing memory. Remember that Linux will intentionally
keep memory filled with cache pages when it is possible. The rest I can't
help with - I'm not familiar enough with qmail to know what limits it places
internally, or where it and/or the kernel might interact to cause
bottlenecks.

2002-01-09 22:37:11

by Rik van Riel

Subject: Re: Where's all my memory going?

On Wed, 9 Jan 2002, Alan Cox wrote:

> > However, over time, (30-45 minutes), more and more memory seems to just
> > disappear from the system until it looks like this, (note that swap is
> > hardly ever touched):
>
> I don't see any disappearing memory. Remember that Linux will
> intentionally keep memory filled with cache pages when it is possible.

Matt's system seems to go from 900 MB free to about
300 MB (free + cache).

I doubt qmail would eat 600 MB of RAM (it might, I
just doubt it) so I'm curious where the RAM is going.

Matt, do you see any suspiciously high numbers in
/proc/slabinfo ?

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/

2002-01-10 08:45:22

by Bruce Guenter

Subject: Re: Where's all my memory going?

On Wed, Jan 09, 2002 at 08:36:13PM -0200, Rik van Riel wrote:
> Matt's system seems to go from 900 MB free to about
> 300 MB (free + cache).
>
> I doubt qmail would eat 600 MB of RAM (it might, I
> just doubt it) so I'm curious where the RAM is going.

I am seeing the same symptoms, with similar use -- ext3 filesystems
running qmail. Adding up the RSS of all the processes in use gives
about 75MB, while free shows:

total used free shared buffers cached
Mem: 901068 894088 6980 0 157568 113856
-/+ buffers/cache: 622664 278404
Swap: 1028152 10468 1017684

These are fairly consistent numbers. buffers hovers around 150MB and
cached around 110MB all day. The server is heavy on write traffic.
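
(For reference, a rough sketch of how that RSS total can be reproduced,
assuming the 2.4 layout of /proc/<pid>/statm where the second field is the
resident size in pages, and 4kB pages; shared pages are counted once per
process, so it overestimates a little:)

#include <ctype.h>
#include <dirent.h>
#include <stdio.h>

int main(void)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;
    unsigned long total = 0;

    if (!proc)
        return 1;
    while ((de = readdir(proc)) != NULL) {
        char path[64];
        unsigned long size, resident;
        FILE *f;

        if (!isdigit(de->d_name[0]))
            continue;                   /* not a pid directory */
        snprintf(path, sizeof(path), "/proc/%s/statm", de->d_name);
        f = fopen(path, "r");
        if (!f)
            continue;                   /* process exited in the meantime */
        if (fscanf(f, "%lu %lu", &size, &resident) == 2)
            total += resident;          /* second field: resident pages */
        fclose(f);
    }
    closedir(proc);
    printf("total RSS: %lu pages (%lu kB)\n", total, total * 4);
    return 0;
}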

> Matt, do you see any suspiciously high numbers in
> /proc/slabinfo ?

What would be suspiciously high? The four biggest numbers I see are:

inode_cache 139772 204760 480 25589 25595 1
dentry_cache 184024 326550 128 10885 10885 1
buffer_head 166620 220480 96 4487 5512 1
size-64 102388 174876 64 2964 2964 1
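
(For what it's worth, the 2.4 columns are <name> <active-objs> <total-objs>
<obj-size> <active-slabs> <total-slabs> <pages-per-slab>, so the memory
pinned by a cache is roughly total-slabs * pages-per-slab * 4kB. A
quick-and-dirty totaller over /proc/slabinfo, assuming that layout:)

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/proc/slabinfo", "r");
    char line[256], name[64];
    unsigned long active_objs, total_objs, objsize;
    unsigned long active_slabs, total_slabs, pages;
    unsigned long total_kb = 0;

    if (!f)
        return 1;
    while (fgets(line, sizeof(line), f)) {
        /* name active total objsize active_slabs total_slabs pages */
        if (sscanf(line, "%63s %lu %lu %lu %lu %lu %lu", name,
                   &active_objs, &total_objs, &objsize,
                   &active_slabs, &total_slabs, &pages) != 7)
            continue;                   /* header line */
        total_kb += total_slabs * pages * 4;    /* 4kB pages */
    }
    fclose(f);
    printf("slab caches hold %lu kB\n", total_kb);
    return 0;
}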

I can post complete details for any who wish to investigate further. I
am not seeing a huge slowdown, but I have no real baseline to compare
against.
--
Bruce Guenter <[email protected]> http://em.ca/~bruceg/ http://untroubled.org/
OpenPGP key: 699980E8 / D0B7 C8DD 365D A395 29DA 2E2A E96F B2DC 6999 80E8



2002-01-10 10:06:18

by Andreas Dilger

Subject: Re: Where's all my memory going?

On Jan 10, 2002 02:45 -0600, Bruce Guenter wrote:
> On Wed, Jan 09, 2002 at 08:36:13PM -0200, Rik van Riel wrote:
> > Matt's system seems to go from 900 MB free to about
> > 300 MB (free + cache).
> >
> > I doubt qmail would eat 600 MB of RAM (it might, I
> > just doubt it) so I'm curious where the RAM is going.
>
> I am seeing the same symptoms, with similar use -- ext3 filesystems
> running qmail.

Hmm, does qmail put each piece of email in a separate file? That
might explain a lot about what is going on here.

> Adding up the RSS of all the processes in use gives
> about 75MB, while free shows:
>
> total used free shared buffers cached
> Mem: 901068 894088 6980 0 157568 113856
> -/+ buffers/cache: 622664 278404
> Swap: 1028152 10468 1017684
>
> These are fairly consistent numbers. buffers hovers around 150MB and
> cached around 110MB all day. The server is heavy on write traffic.
>
> > Matt, do you see any suspiciously high numbers in
> > /proc/slabinfo ?
>
> What would be suspiciously high? The four biggest numbers I see are:
>
> inode_cache 139772 204760 480 25589 25595 1
> dentry_cache 184024 326550 128 10885 10885 1
> buffer_head 166620 220480 96 4487 5512 1
> size-64 102388 174876 64 2964 2964 1

Well, these numbers _are_ high, but with 1GB of RAM you have to use it all
_somewhere_. It looks like you don't have much memory pressure, because
there is lots of free space in these slabs that could probably be freed
easily.

I'm thinking that if you get _lots_ of dentry and inode items (especially
under the "postal" benchmark) you may not be able to free the negative
dentries for all of the created/deleted files in the mailspool (all of
which will have unique names). There is a deadlock path in the VM that
has to be avoided, and as a result it is harder to free dentries
under certain uncommon loads.
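
(As an aside, negative dentries are easy to generate: every failed lookup of
a unique name leaves one behind. A throwaway illustration -- nothing to do
with qmail's actual code -- run it and watch dentry_cache in /proc/slabinfo
grow without a single file being created:)

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(void)
{
    struct stat st;
    char name[64];
    unsigned long i;

    for (i = 0; i < 1000000; i++) {
        snprintf(name, sizeof(name), "/tmp/no-such-file-%lu", i);
        stat(name, &st);    /* fails with ENOENT, leaves a negative dentry */
    }
    return 0;
}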

I had a "use once" patch for negative dentries that allowed the VM to
free negative dentries easily if they are never referenced again. It
is a bit old, but it should be pretty close to applying. I have been
using it for months without problems (although I don't really stress
it very much in this regard).

The other question would of course be whether we are calling into
shrink_dcache_memory() enough, but that is an issue for Matt to
see by testing "postal" with and without the patch, and keeping an
eye on the slab caches.

Cheers, Andreas
======================= dcache-2.4.13-neg.diff ============================
--- linux.orig/fs/dcache.c Thu Oct 25 01:50:30 2001
+++ linux/fs/dcache.c Thu Oct 25 00:02:58 2001
@@ -137,7 +137,16 @@
/* Unreachable? Get rid of it */
if (list_empty(&dentry->d_hash))
goto kill_it;
- list_add(&dentry->d_lru, &dentry_unused);
+ if (dentry->d_inode) {
+ list_add(&dentry->d_lru, &dentry_unused);
+ } else {
+ /* Put an unused negative inode to the end of the list.
+ * If it is not referenced again before we need to free some
+ * memory, it will be the first to be freed.
+ */
+ dentry->d_vfs_flags &= ~DCACHE_REFERENCED;
+ list_add_tail(&dentry->d_lru, &dentry_unused);
+ }
dentry_stat.nr_unused++;
spin_unlock(&dcache_lock);
return;
@@ -306,8 +315,9 @@
}

/**
- * prune_dcache - shrink the dcache
+ * _prune_dcache - shrink the dcache
* @count: number of entries to try and free
+ * @gfp_mask: context under which we are trying to free memory
*
* Shrink the dcache. This is done when we need
* more memory, or simply when we need to unmount
@@ -318,7 +328,7 @@
* all the dentries are in use.
*/

-void prune_dcache(int count)
+void _prune_dcache(int count, unsigned int gfp_mask)
{
spin_lock(&dcache_lock);
for (;;) {
@@ -329,15 +339,32 @@

if (tmp == &dentry_unused)
break;
- list_del_init(tmp);
dentry = list_entry(tmp, struct dentry, d_lru);

/* If the dentry was recently referenced, don't free it. */
if (dentry->d_vfs_flags & DCACHE_REFERENCED) {
+ list_del_init(tmp);
dentry->d_vfs_flags &= ~DCACHE_REFERENCED;
list_add(&dentry->d_lru, &dentry_unused);
continue;
}
+
+ /*
+ * Nasty deadlock avoidance.
+ *
+ * ext2_new_block->getblk->GFP->shrink_dcache_memory->
+ * prune_dcache->prune_one_dentry->dput->dentry_iput->iput->
+ * inode->i_sb->s_op->put_inode->ext2_discard_prealloc->
+ * ext2_free_blocks->lock_super->DEADLOCK.
+ *
+ * We should make sure we don't hold the superblock lock over
+ * block allocations, but for now we will only free unused
+ * negative dentries (which are added at the end of the list).
+ */
+ if (dentry->d_inode && !(gfp_mask & __GFP_FS))
+ break;
+
+ list_del_init(tmp);
dentry_stat.nr_unused--;

/* Unused dentry with a count? */
@@ -351,6 +378,11 @@
spin_unlock(&dcache_lock);
}

+void prune_dcache(int count)
+{
+ _prune_dcache(count, __GFP_FS);
+}
+
/*
* Shrink the dcache for the specified super block.
* This allows us to unmount a device without disturbing
@@ -549,26 +581,11 @@
*/
int shrink_dcache_memory(int priority, unsigned int gfp_mask)
{
- int count = 0;
-
- /*
- * Nasty deadlock avoidance.
- *
- * ext2_new_block->getblk->GFP->shrink_dcache_memory->prune_dcache->
- * prune_one_dentry->dput->dentry_iput->iput->inode->i_sb->s_op->
- * put_inode->ext2_discard_prealloc->ext2_free_blocks->lock_super->
- * DEADLOCK.
- *
- * We should make sure we don't hold the superblock lock over
- * block allocations, but for now:
- */
- if (!(gfp_mask & __GFP_FS))
- return 0;
-
- count = dentry_stat.nr_unused / priority;
+ int count = dentry_stat.nr_unused / (priority + 1);

- prune_dcache(count);
+ _prune_dcache(count, gfp_mask);
kmem_cache_shrink(dentry_cache);
+
return 0;
}

@@ -590,8 +607,15 @@
struct dentry *dentry;

dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL);
- if (!dentry)
- return NULL;
+ if (!dentry) {
+ /* Try to free some unused dentries from the cache, but do
+ * not call into the filesystem to do so (avoid deadlock).
+ */
+ _prune_dcache(16, GFP_NOFS);
+ dentry = kmem_cache_alloc(dentry_cache, GFP_KERNEL);
+ if (!dentry)
+ return NULL;
+ }

if (name->len > DNAME_INLINE_LEN-1) {
str = kmalloc(NAME_ALLOC_LEN(name->len), GFP_KERNEL);
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2002-01-10 11:19:13

by Matt Dainty

Subject: Re: Where's all my memory going?

On Thu, Jan 10, 2002 at 03:05:38AM -0700, Andreas Dilger wrote:
> On Jan 10, 2002 02:45 -0600, Bruce Guenter wrote:
> > On Wed, Jan 09, 2002 at 08:36:13PM -0200, Rik van Riel wrote:
> > > Matt's system seems to go from 900 MB free to about
> > > 300 MB (free + cache).
> > >
> > > I doubt qmail would eat 600 MB of RAM (it might, I
> > > just doubt it) so I'm curious where the RAM is going.

I'm fairly sure we can eliminate qmail, as most of its processes are
short-lived, often invoked per-delivery; only a few processes stay running
for any length of time.

> > I am seeing the same symptoms, with similar use -- ext3 filesystems
> > running qmail.

Heh, it's your fault I'm using ext3 for the queue! :P

> Hmm, does qmail put each piece of email in a separate file? That
> might explain a lot about what is going on here.

Yes, in more places than one in the best setups. The queue stores each
message in separate areas as it moves through the system, and names each
message after the inode it occupies. (I probably haven't explained that
too well, I'm sure Bruce can elaborate :-). When the message is delivered
locally using djb's Maildir mailbox format, it will be stored as a
separate file too, most commonly under ~user/Maildir/new/.

I originally thought ReiserFS would be good for this, but the benchmarks
Bruce did showed that ext3 is in fact better (using data=journal, and
using the syncdir library to force synchronous behaviour on open(), etc.,
similar to chattr +S). I've also used 'noatime' to coax some more speed
out of it.
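
(For the record, the queue filesystem ends up mounted along these lines; the
device and mount point are just placeholders for whatever your box uses, and
the synchronous behaviour itself comes from syncdir in the application
rather than from a mount option:)

/dev/sda6  /var/qmail/queue  ext3  defaults,noatime,data=journal  0  2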

> > Adding up the RSS of all the processes in use gives
> > about 75MB, while free shows:
> >
> > total used free shared buffers cached
> > Mem: 901068 894088 6980 0 157568 113856
> > -/+ buffers/cache: 622664 278404
> > Swap: 1028152 10468 1017684
> >
> > These are fairly consistent numbers. buffers hovers around 150MB and
> > cached around 110MB all day. The server is heavy on write traffic.
> >
> > > Matt, do you see any suspiciously high numbers in
> > > /proc/slabinfo ?

I'll have another run and see what happens...

> The other question would of course be whether we are calling into
> shrink_dcache_memory() enough, but that is an issue for Matt to
> see by testing "postal" with and without the patch, and keeping an
> eye on the slab caches.

I'll try this patch and see how it performs.

Cheers

Matt
--
"Phased plasma rifle in a forty-watt range?"
"Hey, just what you see, pal"

2002-01-10 14:46:24

by Matt Dainty

Subject: Re: Where's all my memory going?

On Thu, Jan 10, 2002 at 03:05:38AM -0700, Andreas Dilger wrote:
> On Jan 10, 2002 02:45 -0600, Bruce Guenter wrote:
> > On Wed, Jan 09, 2002 at 08:36:13PM -0200, Rik van Riel wrote:
> >
> > > Matt, do you see any suspiciously high numbers in
> > > /proc/slabinfo ?
> >
> > What would be suspiciously high? The four biggest numbers I see are:
> >
> > inode_cache 139772 204760 480 25589 25595 1
> > dentry_cache 184024 326550 128 10885 10885 1
> > buffer_head 166620 220480 96 4487 5512 1
> > size-64 102388 174876 64 2964 2964 1

Pretty much the same as Bruce here, mostly same culprits anyway:

inode_cache 84352 90800 480 11340 11350 1 : 124 62
dentry_cache 240060 240060 128 8002 8002 1 : 252 126
buffer_head 215417 227760 96 5694 5694 1 : 252 126
size-32 209954 209954 32 1858 1858 1 : 252 126

> The other question would of course be whether we are calling into
> shrink_dcache_memory() enough, but that is an issue for Matt to
> see by testing "postal" with and without the patch, and keeping an
> eye on the slab caches.

Patch applied cleanly, and I redid the 'test'. I've attached the output
of free and /proc/slabinfo; *.1 is without the patch, *.2 is with. In both
cases postal was left to run for about 35 minutes, by which time it had
delivered around 54000 messages locally.

With the patch, the large numbers in /proc/slabinfo are *still* large,
just not as large as without it. Overall memory usage still seems similar.

Matt
--
"Phased plasma rifle in a forty-watt range?"
"Hey, just what you see, pal"


Attachments:
before.1 (4.12 kB)
after.1 (4.12 kB)
before.2 (4.12 kB)
after.2 (4.12 kB)

2002-01-10 16:17:25

by David Rees

Subject: Re: Where's all my memory going?

On Thu, Jan 10, 2002 at 02:55:42PM +0000, Matt Dainty wrote:
>
> Patch applied cleanly, and I redid the 'test'. I've attached the output
> of free and /proc/slabinfo, *.1 is without patch, *.2 is with. In both
> cases postal was left to run for about 35 minutes by which time it had
> delivered around ~54000 messages locally.
>
> Overall, with the patch, the large numbers in /proc/slabinfo are *still*
> large, but not as large as without the patch. Overall memory usage still
> seems similar.

So the performance of the test was the same with or without the patch?

Does top or vmstat indicate any kind of difference on the system when the
benchmark is pushing 1500 msgs/min vs 150 msgs/min?

There's a kernel profiling tool somewhere that might also help if there's a
large amount of system time being used up. (I think this is it:
http://oss.sgi.com/projects/kernprof/)

-Dave

2002-01-10 20:47:31

by Andreas Dilger

Subject: Re: Where's all my memory going?

On Jan 10, 2002 14:55 +0000, Matt Dainty wrote:
> Pretty much the same as Bruce here, mostly same culprits anyway:
>
> inode_cache 84352 90800 480 11340 11350 1 : 124 62
> dentry_cache 240060 240060 128 8002 8002 1 : 252 126
> buffer_head 215417 227760 96 5694 5694 1 : 252 126
> size-32 209954 209954 32 1858 1858 1 : 252 126
>
> On Thu, Jan 10, 2002 at 03:05:38AM -0700, Andreas Dilger wrote:
> > The other question would of course be whether we are calling into
> > shrink_dcache_memory() enough, but that is an issue for Matt to
> > see by testing "postal" with and without the patch, and keeping an
> > eye on the slab caches.
>
> Patch applied cleanly, and I redid the 'test'. I've attached the output
> of free and /proc/slabinfo, *.1 is without patch, *.2 is with. In both
> cases postal was left to run for about 35 minutes by which time it had
> delivered around ~54000 messages locally.

One question - what happens to the emails after they are delivered? Are
they kept on the local filesystem? That would be one reason why you have
so many inodes and dentries in the cache - they are caching all of these
newly-accessed inodes on the assumption that they may be used again soon.

Even on my system, I have about 35000 items in the inode and dentry caches.

> Overall, with the patch, the large numbers in /proc/slabinfo are *still*
> large, but not as large as without the patch. Overall memory usage still
> seems similar.

Well, Linux will pretty much always use up all of your memory. The real
question always boils down to how to use it most effectively.

Without patch:
> total used free shared buffers cached
> Mem: 1029524 992848 36676 0 54296 139212
> -/+ buffers/cache: 799340 230184
> Swap: 2097136 116 2097020
>
> inode_cache 84352 90800 480 11340 11350 1 : 124 62
> dentry_cache 240060 240060 128 8002 8002 1 : 252 126
> buffer_head 215417 227760 96 5694 5694 1 : 252 126
> size-32 209954 209954 32 1858 1858 1 : 252 126

With patch:
> total used free shared buffers cached
> Mem: 1029524 992708 36816 0 43792 144516
> -/+ buffers/cache: 804400 225124
> Swap: 2097136 116 2097020
>
> inode_cache 55744 62440 480 7801 7805 1 : 124 62
> dentry_cache 125400 125400 128 4180 4180 1 : 252 126
> buffer_head 223430 236160 96 5904 5904 1 : 252 126
> size-32 100005 100005 32 885 885 1 : 252 126

Well with the patch, you have:

((11350 - 7805) + (8002 - 4180) + (1858 - 885)) * 4096 bytes = 32MB

more RAM to play with. Granted, it is not a ton on a 1GB machine,
but it is nothing to sneeze at either. In your case, we could still
be a lot more aggressive in removing dentries and inodes from the cache,
but under many workloads (not the artificial use-once case of such a
benchmark) that may be a net performance loss.

One interesting tidbit is the number of size-32 items in use. This means
that the filename does not fit into the 16 bytes provided inline with
the dentry. There was another patch available which increased the size
of DNAME_INLINE_LEN so that it filled the rest of the cacheline, since
the dentries are aligned this way anyways. Depending on how much space
that gives you, and the length of the filenames used by qmail, you could
save another (whopping) 3.5MB of space and avoid extra allocations for
each and every file.
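
(For context, the relevant bit of d_alloc() in 2.4 fs/dcache.c, paraphrased
here so treat it as a sketch: names longer than DNAME_INLINE_LEN-1
characters spill out of the dentry's inline d_iname buffer into a separate
kmalloc, which is where those size-32/size-64 objects come from:)

        if (name->len > DNAME_INLINE_LEN - 1) {
                /* name doesn't fit inline: one extra allocation per dentry,
                 * served from the size-32/size-64 general caches */
                str = kmalloc(NAME_ALLOC_LEN(name->len), GFP_KERNEL);
                if (!str) {
                        kmem_cache_free(dentry_cache, dentry);
                        return NULL;
                }
        } else
                str = dentry->d_iname;  /* fits in the dentry itself */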

Still, these slabs only total 18744 pages = 73 MB, so there must be
another culprit hiding elsewhere using the other 720MB of RAM. What
does the output of Ctrl-Alt-SysRQ-M show you (either kernel is fine)?

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2002-01-10 22:18:59

by Bruce Guenter

Subject: Re: Where's all my memory going?

On Thu, Jan 10, 2002 at 03:05:38AM -0700, Andreas Dilger wrote:
> On Jan 10, 2002 02:45 -0600, Bruce Guenter wrote:
> > On Wed, Jan 09, 2002 at 08:36:13PM -0200, Rik van Riel wrote:
> > > Matt's system seems to go from 900 MB free to about
> > > 300 MB (free + cache).
> > >
> > > I doubt qmail would eat 600 MB of RAM (it might, I
> > > just doubt it) so I'm curious where the RAM is going.
> >
> > I am seeing the same symptoms, with similar use -- ext3 filesystems
> > running qmail.
>
> Hmm, does qmail put each piece of email in a separate file? That
> might explain a lot about what is going on here.

There are actually three to five individual files used as part of the
equation. qmail stores each message as three or four individual files
while it is in the queue (which for local deliveries is very brief).
In addition, each delivered message is saved as an individual file,
until the client picks it up (and deletes it) with POP.

> Well, these numbers _are_ high, but with 1GB of RAM you have to use it all
> _somewhere_.

Agreed. Free RAM is wasted RAM. However, when adding up the numbers
buffers+cache+RSS+slab, the totals I am reading account for roughly
half of the used RAM:

RSS      84MB (including shared pages counted multiple times)
slabs    82MB
buffers 154MB
cache   152MB
-------------
total   477MB

Yet free reports 895MB as used. What am I missing?

> I'm thinking that if you get _lots_ of dentry and inode items (especially
> under the "postal" benchmark) you may not be able to free the negative
> dentries for all of the created/deleted files in the mailspool (all of
> which will have unique names). There is a deadlock path in the VM that
> has to be avoided, and as a result it makes it harder to free dentries
> under certain uncommon loads.

The names in the queue are actually reused fairly frequently. qmail
creates an initial file named for the creating PID, and then renames it
to the inode number of the file. These inode numbers are of course
recycled, as are the filenames.
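
(Sketching the pattern just described -- illustrative C, not qmail's actual
source: create the file under a pid-based name, fstat() it to learn the
inode number, then rename it to that:)

#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int queue_new_message(void)
{
    char tmp[64], final[64];
    struct stat st;
    int fd;

    snprintf(tmp, sizeof(tmp), "pid/%lu", (unsigned long)getpid());
    fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    /* final queue name is the inode number, so names recur as inodes
     * are recycled */
    snprintf(final, sizeof(final), "mess/%lu", (unsigned long)st.st_ino);
    if (rename(tmp, final) < 0) {
        close(fd);
        return -1;
    }
    /* ... write the message via fd, fsync(), close() ... */
    close(fd);
    return 0;
}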

> The other question would of course be whether we are calling into
> shrink_dcache_memory() enough, but that is an issue for Matt to
> see by testing "postal" with and without the patch, and keeping an
> eye on the slab caches.

I'd love to test this as well, but this is a production server. I'll
see if I can put one of my home systems to the task.
--
Bruce Guenter <[email protected]> http://em.ca/~bruceg/ http://untroubled.org/
OpenPGP key: 699980E8 / D0B7 C8DD 365D A395 29DA 2E2A E96F B2DC 6999 80E8



2002-01-10 22:24:49

by Bruce Guenter

Subject: Re: Where's all my memory going?

On Thu, Jan 10, 2002 at 01:46:57PM -0700, Andreas Dilger wrote:
> One question - what happens to the emails after they are delivered? Are
> they kept on the local filesystem?

Messages in the queue are deleted after delivery (of course). Messages
delivered locally are stored on the local filesystem until they're
picked up by POP (typically within 15 minutes).
--
Bruce Guenter <[email protected]> http://em.ca/~bruceg/ http://untroubled.org/
OpenPGP key: 699980E8 / D0B7 C8DD 365D A395 29DA 2E2A E96F B2DC 6999 80E8



2002-01-10 22:37:00

by Andreas Dilger

Subject: Re: Where's all my memory going?

On Jan 10, 2002 16:24 -0600, Bruce Guenter wrote:
> On Thu, Jan 10, 2002 at 01:46:57PM -0700, Andreas Dilger wrote:
> > One question - what happens to the emails after they are delivered? Are
> > they kept on the local filesystem?
>
> Messages in the queue are deleted after delivery (of course). Messages
> delivered locally are stored on the local filesystem until they're
> picked up by POP (typically within 15 minutes).

Sorry, I meant for the "Postal" benchmark only. I would hope that locally
delivered emails are kept until the recipient gets them in the normal case.

In any case, you also pointed out the same thing I did, namely that these
slab entries (while having some high numbers) do not account for the large
amount of used memory in the system. Maybe SysRQ-M output can help a bit?

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2002-01-11 15:30:57

by Rolf Lear

Subject: Re: Where's all my memory going?


Matt Dainty <[email protected]> writes:
>Hi,
>
>I've fashioned a qmail mail server using an HP NetServer with an HP NetRaid
>4M & 1GB RAM, running 2.4.17 with aacraid, LVM, ext3 and highmem. The box
>has 6x 9GB disks, one for system, one for qmail's queue, and the remaining
>four are RAID5'd with LVM. ext3 is only on the queue disk, ext2 everywhere
>else.
>
....

qmail is very file intensive (which is a good thing ...), and RAID5 is very resource intensive (every write to RAID5 involves a number of reads and a write).

It is quite conceivable that the data volume (throughput) generated by the tests is too large for the throughput of your RAID system. From your mail I understand that the mails are being delivered locally to the RAID5 disk array.

One explanation for your results is that your qmail queue is being filled (by qmail-smtpd) at the rate of the network (presumably 100Mbit or about 10-12MiB/s). This queue is then delivered locally to the RAID5. Files in the queue do not last long (they are created and then deleted, and the cache probably never gets flushed to disk ...). Delivered e-mails fill the cache though, and the kernel will eventually begin flushing these cache entries to disk. At some point (and I am guessing this is your 35-40 minute point) all pages in the cache are dirty (i.e. the kernel has not been able to write the cache to disk as fast as it is being filled ...). This will cause the disk to become your bottleneck.

This is based on the assumption that the RAID5 is slower than the network. In my experience, this is often the case. A good test for this would be tools like bonnie++, or tools like vmstat. On a saturated raid array with a cache, it is typical to get 'vmstat 1' output which shows rapid bursts of data writes (bo's), followed by periods of inactivity. A longer vmstat like 'vmstat 10' will probably even out these bursts, and show an 'averaged' throughput of your disks.

It is possible that I am completely off base, but I have been battling similar problems myself recently, and discovered to my horror that RAID5 disk arrays are pathetically slow. Check your disk performance for the bottleneck.
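
(One way to check that theory independently of qmail is to time synchronous
~10kB file writes on the array; a throwaway sketch, where the file names and
message count are arbitrary. If it can't comfortably beat the target of 1500
deliveries/minute, i.e. 25 files/second, then the disks are the ceiling:)

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

int main(void)
{
    char buf[10240];                    /* ~10kB, like the test messages */
    struct timeval start, end;
    double secs;
    int i, n = 1000;

    memset(buf, 'x', sizeof(buf));
    gettimeofday(&start, NULL);
    for (i = 0; i < n; i++) {
        char name[64];
        int fd;

        snprintf(name, sizeof(name), "probe.%d", i);
        fd = open(name, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return 1;
        if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf))
            return 1;
        fsync(fd);                      /* force it to disk, like a delivery */
        close(fd);
        unlink(name);
    }
    gettimeofday(&end, NULL);
    secs = (end.tv_sec - start.tv_sec) + (end.tv_usec - start.tv_usec) / 1e6;
    printf("%d files in %.1f s = %.0f files/s\n", n, secs, n / secs);
    return 0;
}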

Rolf

2002-01-14 11:30:17

by Matt Dainty

Subject: Re: Where's all my memory going?

On Thu, Jan 10, 2002 at 03:36:39PM -0700, Andreas Dilger wrote:
>
> In any case, you also pointed out the same thing I did, namely that these
> slab entries (while having some high numbers) do not account for the large
> amount of used memory in the system. Maybe SysRQ-M output can help a bit?

Running this on the box after it's settled down a bit (over the weekend
the usage hasn't altered), with all mail delivered and collected so the box
is currently quiet, produces the following 'free' output:

root@plum:~# free
total used free shared buffers cached
Mem: 1029524 965344 64180 0 45204 22936
-/+ buffers/cache: 897204 132320
Swap: 2097136 116 2097020

And SysRQ+M yields the following:

SysRq : Show Memory
Mem-info:
Free pages: 66612kB ( 4424kB HighMem)
Zone:DMA freepages: 4848kB min: 128kB low: 256kB high: 384kB
Zone:Normal freepages: 57340kB min: 1020kB low: 2040kB high: 3060kB
Zone:HighMem freepages: 4424kB min: 1020kB low: 2040kB high: 3060kB
( Active: 160155, inactive: 58253, free: 16653 )
124*4kB 98*8kB 41*16kB 13*32kB 3*64kB 0*128kB 1*256kB 0*512kB 0*1024kB 1*2048kB = 4848kB)
11029*4kB 1097*8kB 140*16kB 7*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 0*2048kB = 57340kB)
328*4kB 55*8kB 19*16kB 10*32kB 2*64kB 3*128kB 0*256kB 1*512kB 1*1024kB 0*2048kB = 4424kB)
Swap cache: add 48, delete 24, find 13/14, race 0+0
Free swap: 2097020kB
262128 pages of RAM
32752 pages of HIGHMEM
4747 reserved pages
20447 pages shared
24 pages swap cached
0 pages in page table cache
Buffer memory: 44424kB
CLEAN: 209114 buffers, 836405 kbyte, 188 used (last=209112), 0 locked, 0 dirty
^^^^^^ Is this our magic value?
DIRTY: 8 buffers, 32 kbyte, 0 used (last=0), 0 locked, 8 dirty

Cheers

Matt
--
"Phased plasma rifle in a forty-watt range?"
"Hey, just what you see, pal"