2002-04-21 09:58:19

by Denis Vlasenko

[permalink] [raw]
Subject: /proc/stat weirdness

I was curious about top showing unwieldy numbers for idle%
(start top, hold down [space] and you'll see).

top reads /proc/stat in order to get these percents.
A little script which cats /proc/stat continually
and greps for 'cpu ' yield:
cpu 39778 0 46829 337191
cpu 39778 0 46831 337192
cpu 39778 0 46833 337193
cpu 39778 0 46834 337194
cpu 39778 0 46835 337195
cpu 39778 0 46836 337196
cpu 39778 0 46838 337197 <<<
cpu 39778 0 46840 337196 <<<
cpu 39780 0 46840 337198
cpu 39780 0 46842 337199
cpu 39780 0 46843 337201
cpu 39782 0 46844 337201
cpu 39782 0 46846 337201
cpu 39782 0 46848 337201
cpu 39782 0 46849 337202
cpu 39782 0 46849 337204
cpu 39782 0 46850 337205
cpu 39782 0 46852 337205
cpu 39782 0 46853 337206
cpu 39783 0 46853 337207
cpu 39783 0 46855 337207 <<<
cpu 39784 0 46856 337206 <<<
...

Kernel: 2.4.18-pre6
--
vda


2002-04-21 17:03:54

by Denis Vlasenko

[permalink] [raw]
Subject: Re: /proc/stat weirdness

On 21 April 2002 13:04, Mark Hahn wrote:
> > I was curious about top showing unwieldy numbers for idle%
> > (start top, hold down [space] and you'll see).
>
> you need to be more explicit. unweildy? do you mean
> very large?

Yes, stuff like 1231687123,23%

> > top reads /proc/stat in order to get these percents.
> > A little script which cats /proc/stat continually
> > and greps for 'cpu ' yield:
> > cpu 39778 0 46829 337191
> > cpu 39778 0 46831 337192
> > cpu 39778 0 46833 337193
> > cpu 39778 0 46834 337194
> > cpu 39778 0 46835 337195
> > cpu 39778 0 46836 337196
> > cpu 39778 0 46838 337197 <<<
> > cpu 39778 0 46840 337196 <<<
>
> your clock jumped back; do you have a via-based computer?

Nope. It's a HP Vectra, a loyal Intel based box.
Right now I ssh'ed to my NFS server, ran top and held [space] down.
I saw it there too. What do you see on your box?

I modified top to show 'raw' counter difference too:
0000001b 00000014 00000000 000001cb 5.3% user, 3.9% system, 0.0% nice, 90.7% idle
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
and it's easy to notice that last number turns into ffffffff sometimes.
--
vda

2002-04-22 06:40:52

by Denis Vlasenko

[permalink] [raw]
Subject: Re: /proc/stat weirdness

On 21 April 2002 16:27, Mark Hahn wrote:
> > * top with continuous update
> > * output of "while true; do cat /proc/stat; done | grep -F 'cpu '"
>
> I told you, I don't ever see any of this phenomenon.

I'd like to add that it is easily seen only on slower boxes.
It was really hard to catch on PIII 800 (1458 BogoMIPS):
top was too slow refreshing, cat script however was fast enough when
redirected to file to show backstepping. See below.

Hey lkml folks, please run:
# while true; do cat /proc/stat; done | grep -F 'cpu ' >/tmp/log
do you see something like this?

cpu 1170 0 3066 28631
cpu 1170 0 3066 28632
cpu 1170 0 3066 28634
cpu 1170 0 3067 28634
cpu 1171 0 3067 28634
cpu 1171 0 3067 28635 <<<
cpu 1171 0 3069 28634 <<<
cpu 1171 0 3070 28634 <<<
cpu 1171 0 3070 28635
cpu 1171 0 3070 28636
cpu 1172 0 3070 28637
cpu 1172 0 3071 28637

Or better yet, this hunter script:
#!/bin/sh
prev=0
while true; do cat /proc/stat; done | \
grep -F 'cpu ' | \
cut -d ' ' -f 6 | \
while read next; do
diff=$(($next-$prev))
if test $diff -lt 0; then
echo "$prev -> $next"
fi
prev=$next
done

It will print only 'bad' idle counter pairs
--
vda

2002-04-22 07:09:51

by Denis Vlasenko

[permalink] [raw]
Subject: Re: /proc/stat weirdness

On 22 April 2002 04:18, Mark Hahn wrote:
> why not figure out if theres a code-based reason?
> just add some code in the kernel to check for the case,
> and dump some extra data.

Hehe. I caught it (imho, not tested).
jiffies is saved to local variable before user,nice,system is calculated,
what if delta(user+nice+system) was == delta(jiffies)
[i.e. no idle ticks since last readout by e.g. top]
and a jiffy just ended and got accounted into one of user,nice,system?
We'll get delta(user+nice+system) > delta(jiffies),
delta(idle)=delta(jiffies)-delta(user+nice+system) - negative!

File: proc_misc.c
....
static int kstat_read_proc(char *page, char **start, off_t off,
int count, int *eof, void *data)
{
int i, len;
extern unsigned long total_forks;
unsigned long jif = jiffies; <*********** jiffies saved
unsigned int sum = 0, user = 0, nice = 0, system = 0;
int major, disk;

for (i = 0 ; i < smp_num_cpus; i++) {
int cpu = cpu_logical_map(i), j;

user += kstat.per_cpu_user[cpu]; <***
nice += kstat.per_cpu_nice[cpu]; <*** accounting
system += kstat.per_cpu_system[cpu]; <***
#if !defined(CONFIG_ARCH_S390)
for (j = 0 ; j < NR_IRQS ; j++)
sum += kstat.irqs[cpu][j];
#endif
}

len = sprintf(page, "cpu %u %u %u %lu\n", user, nice, system,
jif * smp_num_cpus - (user + nice + system));
--
vda