2004-11-23 13:37:43

by Nico Schottelius

[permalink] [raw]
Subject: vmstat: zero cs | system debug

Hello!

Some minutes ago I had the following problem:

- a fileserver with many process in 'D' state
- vmstat shows zero context switches
- more or less zero access to the system
- system load of ~150 (many smbd processes with 'D')
- raid looked fine
- no errors in dmesg
- no mysterious processes
- system not hacked (as its in our company lan)
- after rebooting it seems to work
- the initialization of the Adaptec AHA-2940U/UW/D / AIC-7881U took
quite long, but it works

My questions:

- what could I have done to find out the problem?
-> I did ps axu, vmstat, cat /proc/mdstat, free, netstat -an
- does zero context switches mean there is one process using
completly the cpu? if so, why was I able to start vmstat?

- what todo to fix the problem?
-> tried killall -9 smbd apache ...

- is there some TFM for reading about "Linux system analyzing"?

Thanks for any answer,

Nico


Attachments:
(No filename) (890.00 B)
(No filename) (827.00 B)
Download all attachments

2004-11-25 09:26:36

by Alex Riesen

[permalink] [raw]
Subject: Re: vmstat: zero cs | system debug

On Tue, 23 Nov 2004 14:38:05 +0100, Nico Schottelius
<[email protected]> wrote:
> - what could I have done to find out the problem?
> -> I did ps axu, vmstat, cat /proc/mdstat, free, netstat -an

uname -a, lsmod, free, cat /proc/slabinfo (slabtop)
ps aux -L (to show threads)

> - what todo to fix the problem?
> -> tried killall -9 smbd apache ...

SAK (Alt-SysRq-K) - kill everyone except init