Hello,

I have a tough one for the VM hackers to debug.

I "killed" my kernel (it did nothing but swap and was otherwise
unresponsive) just by aborting rsync v2.5.2.

0. Hardware: Single Duron 700, 320 MB RAM, 1.5 GB swap distributed over
   several disks. One SCSI disk (Fujitsu, attached to SYM53C875) holding
   the system (LVM), three IDE disks (Maxtor and IBM) holding data (no
   LVM, attached to VIA KT133). The -ac kernels were compiled with gcc
   2.95.3; I am not sure about SuSE's kernel.

1. Use rsync v2.5.2 to copy some files from one partition (reiserfs, on
   IDE) to another partition (ext3).

2. Watch the kernel nicely use all free RAM for cache, with no swap in
   use.

3. While rsync is still running, press Ctrl+C.

4. Now watch how the kernel turns CACHE into USED RAM (xosview), starts
   swapping like hell (page-out is high, about 100 MB swapped out within
   a few seconds) and makes the machine unusable.

Careful, save your data before you try this!

With 2.4.18-pre9-ac3 + Morton's Mini-LL patch, I was able to run "swapoff
-av" and have the machine recover within a few minutes. With 2.4.18-rc4
or with 2.4.19-pre1-ac2 without the LL patch, there was no way to get a
single character into my xterm.

Using GNU cp v4.1 instead of rsync v2.5.2 does not exhibit this
behaviour, but I'm not sure what rsync does that upsets the kernel.

Initially, I thought it was related to rmap, but I cannot find any hint
that Hubert's kernel uses rmap. I haven't got around to trying -aa
kernels yet. Will do later unless someone has a fix by then ;-)

Some bug must lurk in the kernel that lets rsync wreak havoc with the
memory management.

--
Matthias Andree
GPG encrypted mail welcome, unless it's unsolicited commercial email.
On Fri, 01 Mar 2002, Matthias Andree wrote:
> I have some tough stuff to debug for VM hackers.
Actually, that's stuff for rsync hackers, not kernel stuff. When I
tried all of this on a virtual console, the console was quick enough
to let me spot another undead rsync process eating up memory, so I
believe it's not a kernel issue.

Shame on me for not seeing this before sending my previous mail. My
apologies.
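
A leftover rsync process like the one described above can also be
spotted without xosview by reading /proc. The following is only a
minimal sketch, assuming a Linux-style /proc with a per-process "status"
file; the helper names parse_status and find_processes are hypothetical
and not part of rsync or any kernel tool.

```python
import os


def parse_status(text):
    """Return (name, vm_kib) parsed from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    name = fields.get("Name", "")
    # Kernel threads have no VmSize line; report 0 for them.
    vm_kib = int(fields["VmSize"].split()[0]) if "VmSize" in fields else 0
    return name, vm_kib


def find_processes(name, proc_root="/proc"):
    """List (pid, vm_kib) for every process whose name equals `name`."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                pname, vm_kib = parse_status(f.read())
        except OSError:
            continue  # the process exited while we were scanning
        if pname == name:
            found.append((int(entry), vm_kib))
    return sorted(found)
```

Calling find_processes("rsync") a few times after pressing Ctrl+C would
show whether any rsync process survives and whether its virtual size
keeps growing.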
On Fri, 01 Mar 2002, Matthias Andree wrote:
> I "killed" my kernel (it was just swapping and otherwise irresponsive)
> with just aborting rsync v2.5.2.
My followup didn't make it, so here's a quick resend:

In that rsync version, not all of its processes are killed after ^C,
and the remaining one wreaks havoc and eats all memory. NOT A KERNEL
PROBLEM, my apologies. (Running "ulimit -v 30000" beforehand, or
running rsync under softlimit, works around this by preventing the
remaining runaway process from eating all memory.)
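
For anyone who wants the same cap from a wrapper script rather than the
shell, here is a rough Python equivalent of the "ulimit -v" workaround.
It is a sketch only: the function name run_with_vm_limit is made up for
illustration, and it assumes a Linux system where RLIMIT_AS is enforced.
"ulimit -v" takes KiB while setrlimit takes bytes, hence the conversion.

```python
import resource
import subprocess
import sys


def run_with_vm_limit(argv, limit_kib=30000):
    """Run argv with its virtual address space capped, like `ulimit -v`."""
    limit_bytes = limit_kib * 1024

    def apply_limit():
        # Runs in the child between fork() and exec(), so the parent
        # keeps its own limits. Children forked later inherit the cap.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(argv, preexec_fn=apply_limit)
```

Something like run_with_vm_limit(["rsync", "-a", "src/", "dst/"]) (the
paths are placeholders) would then let a runaway process fail its
allocations instead of pushing the whole machine into swap.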