2004-01-02 14:48:40

by Mario Vanoni

Subject: 2.6.1-rc1-mm1: kernel or NFS problem?

a) I am not subscribed to lkml, please cc me if needed
b) 5 machines otherwise 100% working, congrats
c) tested on 4 machines; the 5th is in production, taboo
d) top versions used: 2x 2.0.16, 2x 3.1.0

Scenario:

machine 1: top, sees itself, <1% CPU
machine 2: ssh to machine 1; top
machine 1: top sees 2 tops, <1% CPU each
machine 2: reboot while the ssh session running top is still _active_
machine 1: the top started from machine 2 is not killed
and now consumes >99% CPU,
it must be killed explicitly
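
To cross-check the >99% figure on machine 1 without relying on top or sar, the CPU time of the leftover process can be read directly from /proc. The following is only a minimal sketch, assuming a Linux /proc/<pid>/stat with utime and stime in fields 14 and 15; the function names parse_cpu_ticks, cpu_percent and sample are made up for illustration and are not part of procps or sysstat.

```python
import os
import time


def parse_cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split only after the last ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before, ticks_after, elapsed_seconds, ticks_per_second):
    """CPU used by one process over the interval, as a percentage of one CPU."""
    used = ticks_after - ticks_before
    return 100.0 * used / (ticks_per_second * elapsed_seconds)


def sample(pid, interval=1.0):
    """Sample a live process twice and return its CPU percentage (Linux only)."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    path = "/proc/%d/stat" % pid
    with open(path) as f:
        before = parse_cpu_ticks(f.read())
    start = time.time()
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_ticks(f.read())
    return cpu_percent(before, after, time.time() - start, hz)
```

Calling sample(pid) with the PID of the orphaned top on machine 1 should give a value close to 100 if the process is really spinning, and close to 0 for a process that is merely still alive.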

Same behaviour on all 4 machines:
UP: Celeron-1000 + 512MB mem and P4-3066HT + 1GB mem,
dual SMP: P3-550 + 1GB mem and P3-1266 + 2GB mem.

sar(1) -U ALL 1 0 reports the same values as top(1).

It does not look like a procps problem; same conditions with sar:

machine 1: top and sar -U ALL 1 0
machine 2: ssh to machine 1; sar -U ALL 1 0
machine 1: ps -efw shows the sar, top shows no CPU used by it
machine 2: reboot while the ssh session running sar -U ALL 1 0 is still _active_
machine 1: top shows no CPU usage; according to ps -efw, sar -U ALL 1 0
and the corresponding sadc 1 are still running,
killall sar kills both

The ethernet is 10Mb, mixed RJ and BNC connectors.

Mario