2001-11-08 18:28:01

by Jeffrey W. Baker

Subject: 2.4.13 infinite loop in all processes

The setting: 2-way P-III machine, 2 GB RAM, 1GB swap, ext3 filesystem on
aic7xxx storage. Kernel 2.4.13 + ext3. Last night, the load on the
machine went to 45. I investigated, and saw that every process was
getting equal CPU time, and all CPU time (on both CPUs) was spent in the
kernel. No process was making much progress. I could log in to the
machine and execute 'ps' or read from /proc, but I could not read from
or write to the disk. Any attempted disk I/O left the process stuck in
the D state.
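
A minimal sketch of how processes stuck in the D state could be listed
by reading /proc alone. It assumes the Linux /proc/<pid>/stat layout, in
which the state letter is the first field after the parenthesized
command name. The function names are hypothetical, chosen only for
illustration.

```python
import os


def parse_stat_state(stat_text):
    """Return (pid, comm, state) from the text of a /proc/<pid>/stat file."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the first '(' and the last ')'.
    open_paren = stat_text.index("(")
    close_paren = stat_text.rindex(")")
    pid = int(stat_text[:open_paren].strip())
    comm = stat_text[open_paren + 1:close_paren]
    state = stat_text[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """Return a list of (pid, comm) for every process in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the file was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

This only reports which processes are blocked. It does not say what
they are waiting on.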

The physical memory was mostly used up, with 100MB free, 600MB cached,
and 1300MB used. Swap was completely virgin: not a single byte had been
used. The workload was entirely ordinary: a lot of Perl scripts reading
and writing over the network to a database. No oddball mlock activity or
anything else out of the ordinary.

I left the machine overnight to see if it would sort itself out, but it
did not. I had to power cycle it. I upgraded (?) to 2.4.14 + ext3
0.9.15, and will report if the same happens again.

-jwb