2005-12-01 11:57:23

by Tomasz Chmielewski

Subject: loadavg always equal or above 1.00 - how to explain?

I noticed that one of my Samba + OpenLDAP servers, running a 2.6.11.4 kernel,
has a loadavg that is always at or above 1.00, and I can't explain it.

# cat /proc/loadavg
1.00 1.10 1.06 1/65 782

This server is barely used, and as I remember, loadavg was always close
to 0.00 on that system.
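
For readers unfamiliar with the five fields printed by /proc/loadavg, here is a minimal Python sketch that splits them up. The names LoadAvg and parse_loadavg are made up for illustration and are not part of any tool mentioned in this thread. The fourth field counts currently runnable scheduling entities over the total; as far as I know it does not include tasks in uninterruptible sleep, so it can read 1 (plausibly just the cat process itself) while the averages stay at 1.00.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one_min: float      # load average over the last minute
    five_min: float     # load average over the last 5 minutes
    fifteen_min: float  # load average over the last 15 minutes
    runnable: int       # scheduling entities runnable right now
    total: int          # scheduling entities that currently exist
    last_pid: int       # PID most recently handed out


def parse_loadavg(text: str) -> LoadAvg:
    """Parse one line in the /proc/loadavg format."""
    one, five, fifteen, entities, last_pid = text.split()
    runnable, total = entities.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(runnable), int(total), int(last_pid))


# The line from the message above.
print(parse_loadavg("1.00 1.10 1.06 1/65 782"))
```

On a live Linux system the same function could be fed open("/proc/loadavg").read().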

When I view the process list with top, no process takes more than 1% of
CPU time; RAM usage is also minimal:


# free
             total       used       free     shared    buffers     cached
Mem:        320836     241016      79820          0      23308     177232
-/+ buffers/cache:      40476     280360
Swap:       811272      14612     796660

The machine has about 50 processes running (ps aux | wc -l) and about 50
network connections (netstat -tupna | wc -l), so everything looks normal.

Nothing unusual in dmesg, either.

What can cause this abnormal load, and how can I spot it?


--
Tomek
http://wpkg.org
WPKG - software deployment and upgrades with Samba


2005-12-01 12:27:47

by Norbert van Nobelen

Subject: Re: loadavg always equal or above 1.00 - how to explain?

Can you use top to determine which process is requesting most of the CPU?


On Thursday 01 December 2005 12:57, Tomasz Chmielewski wrote:
> I noticed one of my Samba + OpenLDAP servers, running 2.6.11.4 kernel
> has loadavg always equal or above 1.00, although I can't explain it.
>
> # cat /proc/loadavg
> 1.00 1.10 1.06 1/65 782
>
> This server is barely used, and as I remember, loadavg was always close
> to 0.00 on that system.
>
> When I view the process list with top, no process takes more than 1% of
> CPU time; RAM usage is also minimal:
>
>
> # free
>              total       used       free     shared    buffers     cached
> Mem:        320836     241016      79820          0      23308     177232
> -/+ buffers/cache:      40476     280360
> Swap:       811272      14612     796660
>
> This has ~ 50 processes running (ps aux|wc -l), and ~ 50 network
> connections (netstat -tupna|wc -l), so everything normal.
>
> Nothing unusual in dmesg, too.
>
> What can cause this abnormal load, and how can I spot it?

--
________
http://www.hipersonik.com : Open source experts

2005-12-01 12:28:31

by Xavier Bestel

Subject: Re: loadavg always equal or above 1.00 - how to explain?

On Thu, 2005-12-01 at 12:57, Tomasz Chmielewski wrote:
> I noticed one of my Samba + OpenLDAP servers, running 2.6.11.4 kernel
> has loadavg always equal or above 1.00, although I can't explain it.
[...]
> What can cause this abnormal load, and how can I spot it?

Some hidden rootkit?

Xav


2005-12-01 12:41:16

by Arjan van de Ven

Subject: Re: loadavg always equal or above 1.00 - how to explain?

On Thu, 2005-12-01 at 12:57 +0100, Tomasz Chmielewski wrote:
>
> 1.00 1.10 1.06 1/65 782
>
> This server is barely used, and as I remember, loadavg was always
> close
> to 0.00 on that system.

Remember that load is the sum of running/runnable processes and processes
in D state (uninterruptible sleep: generally waiting for I/O, but not
always). I'm pretty sure your load comes from one of the latter...

ps ought to tell you which one it is... (if not, an
"echo t > /proc/sysrq-trigger" will dump the kernel state including the
offending process, and will also tell us where exactly that process is)
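
As a rough illustration of the "ps ought to tell you" step, here is a Python sketch that does by hand what ps does: it reads /proc/<pid>/stat for every process and keeps those in state R (running/runnable) or D (uninterruptible sleep). The function names parse_stat and load_contributors are invented for this example. It assumes a Linux /proc layout and only looks at processes, not at individual threads under /proc/<pid>/task.

```python
import os


def parse_stat(line: str):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so cut at the first '(' and the last ')'.
    open_idx = line.index("(")
    close_idx = line.rindex(")")
    pid = int(line[:open_idx])
    comm = line[open_idx + 1:close_idx]
    state = line[close_idx + 1:].split()[0]
    return pid, comm, state


def load_contributors(proc_root: str = "/proc"):
    """List (pid, comm, state) for processes currently in R or D state."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no /proc here (not Linux, or not mounted)
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue  # not a PID directory
        path = os.path.join(proc_root, entry, "stat")
        try:
            with open(path, errors="replace") as fh:
                pid, comm, state = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were scanning
        if state in ("R", "D"):
            found.append((pid, comm, state))
    return found


if __name__ == "__main__":
    for pid, comm, state in load_contributors():
        print(state, pid, comm)
```

A process that shows up with D on every run, minutes apart, is the likely source of a load that never drops below 1.00.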

Greetings,
Arjan van de Ven

2005-12-01 12:46:23

by Tomasz Chmielewski

Subject: Re: loadavg always equal or above 1.00 - how to explain?

Xavier Bestel wrote:
> On Thu, 2005-12-01 at 12:57, Tomasz Chmielewski wrote:
>
>>I noticed one of my Samba + OpenLDAP servers, running 2.6.11.4 kernel
>>has loadavg always equal or above 1.00, although I can't explain it.
>
> [...]
>
>>What can cause this abnormal load, and how can I spot it?
>
>
> Some hidden rootkit?

Highly unlikely: this machine is completely isolated from the outside world.
To break into it, someone would first have to break into several other
servers (and there are no problems with those).


--
Tomek
http://wpkg.org
WPKG - software deployment and upgrades with Samba

2005-12-01 12:51:01

by Tomasz Chmielewski

Subject: Re: loadavg always equal or above 1.00 - how to explain?

Norbert van Nobelen wrote:
> Can you use top to determine which process is requesting most of the CPU?

Actually, when I press Shift+P in top (sort by CPU usage), top itself is the
busiest process for a while, at around 1%; then it drops to ~0.0-0.3% and
stays there. Other processes (such as sshd and smbd) reach at most ~0.5%,
and only a few times a minute.

The same goes for memory usage.

vmstat output:

procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 2  0  14612  69748  26112 184004    0    0     1     1    3     3  3  3 94  0


iostat output:

avg-cpu:  %user   %nice    %sys %iowait   %idle
           1,79    1,52    2,66    0,33   93,70

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
hda               2,51        49,99        50,09  538412978  539475864
hdb               0,00         0,00         0,00       1744          0
fd0               0,00         0,00         0,00          6          0


--
Tomek
http://wpkg.org
WPKG - software deployment and upgrades with Samba

2005-12-01 12:58:44

by Bernd Eckenfels

Subject: Re: loadavg always equal or above 1.00 - how to explain?

In article <[email protected]> you wrote:
> What can cause this abnormal load, and how can I spot it?

Look for a process in D state; it might be caused by a (network) filesystem
failure or a dead kernel thread.

Regards
Bernd

2005-12-01 13:00:35

by Tomasz Chmielewski

Subject: Re: loadavg always equal or above 1.00 - how to explain?

Arjan van de Ven wrote:
> On Thu, 2005-12-01 at 12:57 +0100, Tomasz Chmielewski wrote:
>
>>1.00 1.10 1.06 1/65 782
>>
>>This server is barely used, and as I remember, loadavg was always
>>close
>>to 0.00 on that system.
>
>
> remember that load is the sum of running/runnable processes and processes
> in D state (waiting for IO generally, but not always). I'm pretty sure
> your load comes from one of the latter...
>
> ps ought to tell you which one it is... (if not, an
> "echo t > /proc/sysrq-trigger" will dump the kernel state including the
> offending process, and will also tell us where exactly that process is)

Wohoo, you're great, that was it:

root     29547  0.0  0.3   7516   996 ?        D    Nov25   0:00 CROND
root     29548  0.0  0.3   7516   996 ?        Ss   Nov25   0:00 CROND

I stopped it, and loadavg is back to 0.
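
To see why a single stuck process pins the load at about 1.00 and why it falls back toward 0 once the process is gone, here is a small Python sketch of an exponentially damped average. It assumes the scheme Linux is generally described as using: a sample of the active task count every 5 seconds, blended in with a decay factor of exp(-5/window). It uses floating point rather than the kernel's fixed-point arithmetic, and the names step and simulate are made up for this example.

```python
import math

SAMPLE_INTERVAL = 5.0  # seconds between samples (assumption, see above)


def step(load: float, active: int, window: float) -> float:
    """Blend one sample of the active task count into the running average."""
    decay = math.exp(-SAMPLE_INTERVAL / window)
    return load * decay + active * (1.0 - decay)


def simulate(load: float, active: int, seconds: int,
             window: float = 60.0) -> float:
    """Advance the average for `seconds` with a constant active count."""
    for _ in range(int(seconds / SAMPLE_INTERVAL)):
        load = step(load, active, window)
    return load


# One task permanently in D state, otherwise idle machine.
stuck = simulate(0.0, active=1, seconds=600)
print(f"1-min load after 10 minutes with one stuck task: {stuck:.2f}")

# The stuck task is stopped; nothing else is active.
after = simulate(stuck, active=0, seconds=300)
print(f"1-min load 5 minutes after stopping it: {after:.2f}")
```

With window=300.0 or window=900.0 the same functions approximate the 5 and 15 minute figures, which is why those react more slowly.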

Now I have to figure out what CROND was doing...

Does ps always show processes in D state in CAPITAL letters?

After restarting cron it shows up as "crond", as usual.


--
Tomek
http://wpkg.org
WPKG - software deployment and upgrades with Samba

2005-12-01 13:33:13

by Arjan van de Ven

Subject: Re: loadavg always equal or above 1.00 - how to explain?


> Now I have to figure out what CROND was doing...
>
If it happens again, run "echo t > /proc/sysrq-trigger" and look for
"crond" in the output (it goes to the kernel log, so check dmesg); that
will give a kernel backtrace showing where crond was.

> Does ps always show processes in D state in CAPITAL letters?
>

Hmm, I never noticed that before, but then again "ps" may be undergoing
innovations ;)



2005-12-01 17:52:28

by Zan Lynx

Subject: Re: loadavg always equal or above 1.00 - how to explain?

On Thu, 2005-12-01 at 14:00 +0100, Tomasz Chmielewski wrote:
[snip]
> Now I have to figure out what CROND was doing...
>
> Does ps always show processes in D state in CAPITAL letters?
>
> After cron restart it is "crond", as usual.

crond is the regular cron daemon. CROND is what cron names its child
processes as they run scheduled commands.

I've seen cron stuck in D from running user crontabs on unavailable NFS
mounts.
--
Zan Lynx <[email protected]>

