2005-04-15 08:00:50

by Eduard de Boer

Subject: MD / RAID5: Memory leak?

Hi,

I've been struggling for the last couple of weeks to get a new box
working. The problem is that I have a Promise FastTrak SX4 with 512MB
cache and four 300GB Maxtor SATA drives connected to it. I've used
mdadm to create a RAID5 array, which is about 879G.

On this array, I've tried LVM (with ReiserFS, XFS, JFS, Ext2, Ext3 on
top of it) as well as creating the same kind of file systems without
LVM.

I use rsync to copy a bunch of files (several GB) to the designated
filesystems. But after a while, all file systems get corrupted and
'dmesg' lists all kinds of memory corruptions in 'dm' and so on.
Hence, the file copying stops.

Kernels I've tried:
- 2.6.9-r1
- 2.6.11-r4
- 2.6.11-r5
- 2.6.11.7
- 2.6.12-rc2
... but the problem persists.

When I run 'top' in a separate window, I see the amount of free
physical memory dropping steadily from about 838MB down to about 8MB.
That is when the problems arise.

So I have now tried several times to stop the file copying before free
memory gets too low. The amount of free physical memory then stays at
that low level (instead of being freed again).

Only when I umount the filesystem and remount it does the amount of
free memory go back to 838MB, and I can then copy another bunch of
files (without any corruption).
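
The 'free' figure in 'top' does not count memory the kernel is using
for buffers and page cache, so it may help to log those numbers next to
each other during a copy. Below is a minimal sketch that reads them
from /proc/meminfo. The function names (parse_meminfo, summarize) are
made up for illustration, and it assumes the "Name:   value kB" layout
used by 2.6 kernels.

```python
import re


def parse_meminfo(text):
    """Return {field: value} parsed from the text of /proc/meminfo.

    Most values are in kB; a few fields (e.g. HugePages_Total) are
    plain counts and are stored as-is.
    """
    values = {}
    for line in text.splitlines():
        match = re.match(r"^([^:]+):\s+(\d+)", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def summarize(values):
    """Split memory into truly free and buffer/page-cache parts (kB)."""
    free = values.get("MemFree", 0)
    cache = values.get("Buffers", 0) + values.get("Cached", 0)
    return {
        "free_kb": free,
        "cache_kb": cache,
        "free_plus_cache_kb": free + cache,
    }


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print(summarize(parse_meminfo(f.read())))
    except OSError:
        print("/proc/meminfo is not readable on this system")
```

Running this once a minute during the rsync shows whether the missing
memory is sitting in buffers/cache or is not accounted for there at
all.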

Is this pointing to a memory leak problem in md, or is it a problem
with RAID5 on md?

Regards, Eduard.


2005-04-15 10:14:53

by bert hubert

Subject: Re: MD / RAID5: Memory leak?

On Fri, Apr 15, 2005 at 10:00:39AM +0200, Eduard de Boer wrote:
> I use rsync to copy a bunch of files (several GB's) to the designated
> filesystems. But after a while, all file systems get corrupted and
> 'dmesg' lists all kinds of memory corruptions in 'dm' and so on.
> Hence, the file copying stops.

That doesn't sound like a memory leak per se. Can you watch the contents of
/proc/slabinfo? Or perhaps compare it before and after a copy and check
whether one of the entries has grown huge.
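
The before/after comparison could be done roughly like this. It is only
a sketch: the function names (parse_slabinfo, slab_growth) are made up
for illustration, and it assumes the slabinfo 2.x column order (name,
active_objs, num_objs, objsize, ...), so check the header line of your
own /proc/slabinfo first. Save a snapshot with
'cat /proc/slabinfo > before.txt' ahead of the copy and another one
afterwards.

```python
import sys


def parse_slabinfo(text):
    """Return {cache_name: approx_bytes} from the text of /proc/slabinfo.

    approx_bytes is num_objs * objsize, which ignores per-slab overhead.
    """
    usage = {}
    for line in text.splitlines():
        # Skip blank lines, the version line and the column header.
        if not line.strip() or line.startswith(("slabinfo -", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue
        try:
            num_objs, objsize = int(fields[2]), int(fields[3])
        except ValueError:
            continue  # not a data line in the assumed layout
        usage[fields[0]] = num_objs * objsize
    return usage


def slab_growth(before, after, top=10):
    """Return the `top` caches that grew the most, largest first."""
    growth = {name: size - before.get(name, 0) for name, size in after.items()}
    ranked = sorted(growth.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]


# Usage: python slabdiff.py before.txt after.txt
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f:
        before = parse_slabinfo(f.read())
    with open(sys.argv[2]) as f:
        after = parse_slabinfo(f.read())
    for name, delta in slab_growth(before, after):
        print(f"{name:<28} {delta / 1024:>12.0f} kB")
```

A cache that tops this list by a wide margin is the entry worth
reporting.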

Also, if you are able, can you let the machine run memtest86 for a night?

Do you see activity of the OOM-killer?

--
http://www.PowerDNS.com Open source, database driven DNS Software
http://netherlabs.nl Open and Closed source services