2002-02-18 08:09:14

by Tom Holroyd

Subject: Unknown HZ value! (1908) Assume 1024.

After about 50 days of uptime on 2.4.17 on an Alpha, I started getting
this message from ps, et al. The adjtimex program says:

mode: 0
offset: -2942
frequency: -11020216
etc. (I'm running ntpd).

50 days is about 4320000000 clock ticks (normally 1024 Hz) which is
suspiciously close to 2^32. Perhaps something is rolling over?
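
As a rough check on the rollover guess, the sketch below works out how long a 32-bit tick counter lasts at a given tick rate. The helper names are made up for illustration and come from neither the kernel nor procps. Note that 4320000000 corresponds to 1000 ticks per second; at 1024 Hz, 50 days is closer to 4.42e9 ticks. Either way the count is just past 2^32.

```c
#include <stdint.h>

/* Days until a 32-bit tick counter wraps at `hz` ticks per second.
 * Hypothetical helper for illustration only. */
double days_until_wrap32(unsigned int hz)
{
    const double ticks = 4294967296.0;   /* 2^32 */
    return ticks / hz / 86400.0;         /* 86400 seconds per day */
}

/* Ticks accumulated after `days` days at `hz` ticks per second,
 * computed in 64 bits so the result itself cannot wrap here. */
uint64_t ticks_after_days(unsigned int days, unsigned int hz)
{
    return (uint64_t)days * 86400u * hz;
}
```

With hz = 1024 the wrap comes after roughly 48.5 days, which fits the uptime at which the message first appeared.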


2002-02-18 21:42:00

by Tim Schmielau

Subject: Re: Unknown HZ value! (1908) Assume 1024.

On Mon, 18 Feb 2002, Tom Holroyd wrote:

> After about 50 days of uptime on 2.4.17 on an Alpha, I started getting
> this message from ps, et al. The adjtimex program says:
[...]
> 50 days is about 4320000000 clock ticks (normally 1024 Hz) which is
> suspiciously close to 2^32. Perhaps something is rolling over?

I guess this is a userspace problem, where the tools just use a 32 bit
value somewhere.
To make sure, can you post /proc/uptime and /proc/stat output? Also, is
this uniprocessor or SMP?

Tim

2002-02-19 01:48:05

by Tom Holroyd

Subject: Re: Unknown HZ value! (1908) Assume 1024.

> > After about 50 days of uptime on 2.4.17 on an Alpha, I started getting
> > this message from ps, et al.
> To make sure, can you post /proc/uptime and /proc/stat output? Also, is
> this uniprocessor or SMP?

It's not userspace, this is a kernel problem (proc_misc.c):

/proc/uptime:
4919324.13 1615.50

/proc/stat:
cpu 2427984276 2540057284 67737892 4296620451
cpu0 2427984276 2540057284 67737892 4296620451
...

ps --version:
Unknown HZ value! (1897) Assume 1024.
procps version 2.0.2

All the tools that give me that message are in procps. I tried it
with procps 2.0.7 and it still happens. The message is coming from
sysinfo.c in the libproc part of that package. What it does is, it
takes the "cpu" line from /proc/stat, and adds up all the numbers, and
then divides by the value in /proc/uptime. So it gets:

2427984276 + 2540057284 + 67737892 + 4296620451 = 9332399903
9332399903 / 4919324.13 = 1897 (wrong)

Apparently, that 4th value in /proc/stat is bogus. If I do:

2427984276 + 2540057284 + 67737892 = 5035779452
5035779452 / 4919324.13 = 1023.67 (close enough)
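
The two divisions above can be reproduced with a few lines of C. This is only a sketch of the arithmetic as described in this message, not the actual code in procps' sysinfo.c; `estimate_hz` is a hypothetical name, and the single-CPU case is assumed.

```c
#include <stdint.h>

/* Estimate HZ as (sum of the counters on the "cpu" line) / uptime in
 * seconds. Assumes one CPU, so no division by the CPU count. */
double estimate_hz(const uint64_t *fields, int nfields, double uptime_sec)
{
    uint64_t total = 0;
    for (int i = 0; i < nfields; i++)
        total += fields[i];              /* summed in 64 bits */
    return (double)total / uptime_sec;
}
```

Called with all four posted values and an uptime of 4919324.13 it gives about 1897; with only the first three it gives about 1023.67.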

So what is the 4th value in /proc/stat (procps calls it "other", while
the first 3 are "user", "nice", and "sys")? According to
linux/fs/proc/proc_misc.c, it is:

jif * smp_num_cpus - (user + nice + system)

formatted with a %lu (the others are just %u). smp_num_cpus is 1.
Things are declared this way:

unsigned long jif = jiffies;
unsigned int sum = 0, user = 0, nice = 0, system = 0;

So, the problem is that user + nice + system overflows (I'm compiling
with gcc 3.0, BTW).
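
The wrap can be reproduced in user space. In the sketch below `uint32_t` and `uint64_t` stand in for `unsigned int` and `unsigned long` on Alpha, and the function names are hypothetical. The sum is formed in 32 bits and wraps before it is widened for the subtraction.

```c
#include <stdint.h>

/* Mirrors the quoted expression: user + nice + sys is computed in
 * 32 bits, so it wraps modulo 2^32 before being subtracted from jif. */
uint64_t other_wrapping(uint64_t jif, uint32_t user, uint32_t nice, uint32_t sys)
{
    uint32_t busy = (uint32_t)(user + nice + sys);   /* wraps */
    return jif - busy;
}

/* Same calculation with the sum widened to 64 bits first. */
uint64_t other_widened(uint64_t jif, uint32_t user, uint32_t nice, uint32_t sys)
{
    uint64_t busy = (uint64_t)user + nice + sys;     /* no wrap */
    return jif - busy;
}
```

The jiffies value itself was not posted. Assuming jif = 5037432607, which is inferred from the posted "other" column, `other_wrapping` yields the bogus 4296620451, while `other_widened` yields 1653155, which at 1024 Hz is close to the 1615.50 seconds of idle time in /proc/uptime.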

Thanks for the clue; now, how to fix it?

2002-02-19 09:05:58

by Tim Schmielau

Subject: Re: Unknown HZ value! (1908) Assume 1024.

On Tue, 19 Feb 2002, Tom Holroyd wrote:
> > > After about 50 days of uptime on 2.4.17 on an Alpha, I started getting
> > > this message from ps, et al.
> > To make sure, can you post /proc/uptime and /proc/stat output? Also, is
> > this uniprocessor or SMP?
>
> It's not userspace, this is a kernel problem (proc_misc.c):
>
> /proc/uptime:
> 4919324.13 1615.50
>
> /proc/stat:
> cpu 2427984276 2540057284 67737892 4296620451
> cpu0 2427984276 2540057284 67737892 4296620451
> ...
>
> ps --version:
> Unknown HZ value! (1897) Assume 1024.
> procps version 2.0.2
>
> All the tools that give me that message are in procps. I tried it
> with procps 2.0.7 and it still happens. The message is coming from
> sysinfo.c in the libproc part of that package. What it does is, it
> takes the "cpu" line from /proc/stat, and adds up all the numbers, and
> then divides by the value in /proc/uptime. So it gets:
>
> 2427984276 + 2540057284 + 67737892 + 4296620451 = 9332399903
> 9332399903 / 4919324.13 = 1897 (wrong)
>
> Apparently, that 4th value in /proc/stat is bogus. If I do:
>
> 2427984276 + 2540057284 + 67737892 = 5035779452
> 5035779452 / 4919324.13 = 1023.67 (close enough)
>
> So what is the 4th value in /proc/stat (procps calls it "other", while
> the first 3 are "user", "nice", and "sys")? According to
> linux/fs/proc/proc_misc.c, it is:
>
> jif * smp_num_cpus - (user + nice + system)
>
> formatted with a %lu (the others are just %u). smp_num_cpus is 1.
> Things are declared this way:
>
> unsigned long jif = jiffies;
> unsigned int sum = 0, user = 0, nice = 0, system = 0;
>
> So, the problem is that user + nice + system overflows (I'm compiling
> with gcc 3.0, BTW).
>
> Thanks for the clue; now, how to fix it?
>

Good analysis!
I'd suggest changing the declarations in kstat_read_proc to

unsigned long jif = jiffies, user = 0, nice = 0, system = 0;
unsigned int sum = 0;

so that ticks that are lost due to overflow count as "other".

For increased symmetry we might also consider changing the "other" value
to 32 bits as well, like user, nice, and system. This, however, would also
require userspace changes.
Extending all values to unsigned long would cause problems on 32 bit
platforms. While they might be solved as in the 64 bit jiffies patch
(http://www.lib.uaa.alaska.edu/linux-kernel/archive/2001-Week-47/0736.html)
this looks like pure overkill to me.

Tim

2002-02-19 09:29:42

by Tom Holroyd

Subject: Re: Unknown HZ value! (1908) Assume 1024.

> > /proc/stat:
> > cpu 2427984276 2540057284 67737892 4296620451
> > cpu0 2427984276 2540057284 67737892 4296620451
> > ...
> >
> I'd suggest changing the declarations in kstat_read_proc to
>
> unsigned long jif = jiffies, user = 0, nice = 0, system = 0;
> unsigned int sum = 0;
>
> so that ticks that are lost due to overflow count as "other".

Isn't it also necessary to change the sprintf() format strings to %lu?
That is,
    len = sprintf(page, "cpu %lu %lu %lu %lu\n", user, nice, system,
                  jif * smp_num_cpus - (user + nice + system));

Also, it looks like the per_cpu loop that follows the sprintf() also
has this bug, i.e., adding three ints to get a long. Short of making
the kstat fields longs, it should suffice to just use a cast there,
and adjust the related sprintf(), e.g., to
    len += sprintf(page + len, "cpu%d %lu %lu %lu %lu\n",
                   i,
                   (unsigned long) kstat.per_cpu_user[cpu_logical_map(i)],
                   (unsigned long) kstat.per_cpu_nice[cpu_logical_map(i)],
                   (unsigned long) kstat.per_cpu_system[cpu_logical_map(i)],
                   jif -
                   ((unsigned long) kstat.per_cpu_user[cpu_logical_map(i)] +
                    kstat.per_cpu_nice[cpu_logical_map(i)] +
                    kstat.per_cpu_system[cpu_logical_map(i)]));

Might be worth sticking
    int cpu = cpu_logical_map(i);

at the top of the loop too, to clean that code up a bit.
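
The per-CPU line can be tried out in user space. The sketch below is one possible shape, not the kernel code: it keeps the three counters as `unsigned int` printed with `%u` and widens only the "other" column. `format_cpu_line` is a hypothetical helper, and `unsigned long long` with `%llu` stands in for the 64-bit `unsigned long` and `%lu` on Alpha. A `%lu` conversion needs an `unsigned long` argument, so any column printed that way would have to be cast as well.

```c
#include <stdio.h>

/* Format one per-CPU line into buf. Each conversion matches the type of
 * its argument: %u for the unsigned int counters, %llu for the widened
 * "other" value. Returns what snprintf returns. */
int format_cpu_line(char *buf, size_t size, int i, unsigned long long jif,
                    unsigned int user, unsigned int nice, unsigned int sys)
{
    /* Widen before adding so the sum cannot wrap at 32 bits. */
    unsigned long long busy = (unsigned long long)user + nice + sys;

    return snprintf(buf, size, "cpu%d %u %u %u %llu\n",
                    i, user, nice, sys, jif - busy);
}
```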

Anyway, I'll have a go at the above and get back to you in about 48.5
days. :-)

Thanks!

2002-02-19 09:43:34

by Tim Schmielau

Subject: Re: Unknown HZ value! (1908) Assume 1024.

> > > /proc/stat:
> > > cpu 2427984276 2540057284 67737892 4296620451
> > > cpu0 2427984276 2540057284 67737892 4296620451
> > > ...
> > >
> > I'd suggest changing the declarations in kstat_read_proc to
> >
> > unsigned long jif = jiffies, user = 0, nice = 0, system = 0;
> > unsigned int sum = 0;
> >
> > so that ticks that are lost due to overflow count as "other".
>
> Isn't it also necessary to change the sprintf() format strings to %lu?
> That is,
> len = sprintf(page, "cpu %lu %lu %lu %lu\n", user, nice, system,
> jif * smp_num_cpus - (user + nice + system));
>

Sorry, I got it wrong. What I actually wanted to suggest is:
leave the declarations as they are now, but do the "other"
calculation with longs, i.e.:

    len = sprintf(page, "cpu %u %u %u %lu\n", user, nice, system,
                  jif * smp_num_cpus - ((unsigned long) user + nice + system));



2002-02-19 10:47:54

by Andreas Schwab

Subject: Re: Unknown HZ value! (1908) Assume 1024.

Tom Holroyd writes:

|> So what is the 4th value in /proc/stat (procps calls it "other", while
|> the first 3 are "user", "nice", and "sys")? According to
|> linux/fs/proc/proc_misc.c, it is:
|>
|> jif * smp_num_cpus - (user + nice + system)
|>
|> formatted with a %lu (the others are just %u). smp_num_cpus is 1.
|> Things are declared this way:
|>
|> unsigned long jif = jiffies;
|> unsigned int sum = 0, user = 0, nice = 0, system = 0;
|>
|> So, the problem is that user + nice + system overflows (I'm compiling
|> with gcc 3.0, BTW).
|>
|> Thanks for the clue; now, how to fix it?

Changing the line to this:

jif * smp_num_cpus - user - nice - system

should avoid the overflow.
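
The reason this works is that subtraction associates left to right: each 32-bit operand is converted to the wider type of the running result before it is subtracted, so no 32-bit intermediate sum is ever formed. A small sketch, with `uint32_t` and `uint64_t` standing in for `unsigned int` and `unsigned long` on Alpha and a hypothetical function name:

```c
#include <stdint.h>

/* Evaluated as ((jif - user) - nice) - sys. Every step is a 64-bit
 * subtraction because the left operand is already 64 bits wide. */
uint64_t other_chained(uint64_t jif, uint32_t user, uint32_t nice, uint32_t sys)
{
    return jif - user - nice - sys;
}
```

With the counters posted earlier in the thread and an assumed jif of 5037432607 (inferred, not posted), this gives 1653155, the same as casting the sum to a long.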

Andreas.

--
Andreas Schwab, SuSE Labs
SuSE GmbH, Deutschherrnstr. 15-19, D-90429 Nürnberg
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."

2002-02-20 02:12:21

by Tom Holroyd

Subject: Re: Unknown HZ value! (1908) Assume 1024.

> |> jif * smp_num_cpus - (user + nice + system)
>
> Changing the line to this:
>
> jif * smp_num_cpus - user - nice - system
>
> should avoid the overflow.

True. It still might be a good idea to make them longs, though,
because they are really totals of all the CPUs, as in:
user += kstat.per_cpu_user[cpu];

Now ultimately, kstat.per_cpu_user[cpu] will overflow, and I don't
know what to do about that, but making user, nice, and system unsigned
long will at least allow SMP systems to last a little while longer.
(Actually I don't know why procps needs these values at all -- the
claim in the code is that all of this is just to compute the HZ value,
which is presumably needed to be able to interpret jiffies. It'd be a
lot simpler just to have /proc/stat export the HZ value directly.)
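
To illustrate the SMP point: when 32-bit per-CPU counters are summed into a 32-bit total, the total can wrap even though no single counter has. A minimal sketch with made-up function names, using a plain array in place of the kstat fields:

```c
#include <stdint.h>

/* Total kept in 32 bits: wraps as soon as the sum reaches 2^32. */
uint32_t total32(const uint32_t *per_cpu, int ncpus)
{
    uint32_t sum = 0;
    for (int cpu = 0; cpu < ncpus; cpu++)
        sum += per_cpu[cpu];
    return sum;
}

/* Total kept in 64 bits: exact as long as each per-CPU counter is. */
uint64_t total64(const uint32_t *per_cpu, int ncpus)
{
    uint64_t sum = 0;
    for (int cpu = 0; cpu < ncpus; cpu++)
        sum += per_cpu[cpu];
    return sum;
}
```

For two CPUs at 3000000000 ticks each, `total32` returns 1705032704 while `total64` returns 6000000000.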

2002-02-20 09:59:17

by Tim Schmielau

Subject: Re: Unknown HZ value! (1908) Assume 1024.

On Wed, 20 Feb 2002, Tom Holroyd wrote:

> > |> jif * smp_num_cpus - (user + nice + system)
> >
> > Changing the line to this:
> >
> > jif * smp_num_cpus - user - nice - system
> >
> > should avoid the overflow.
>
> True. It still might be a good idea to make them longs, though,
> because they are really totals of all the CPUs, as in:
> user += kstat.per_cpu_user[cpu];
>
> Now ultimately, kstat.per_cpu_user[cpu] will overflow, and I don't
> know what to do about that, but making user, nice, and system unsigned
> long will at least allow SMP systems to last a little while longer.
> (Actually I don't know why Procps needs these values at all -- the
> claim in the code is that all of this is just to compute the HZ value,
> which is presumably needed to be able to interpret jiffies. It'd be a
> lot simpler just to have /proc/stat export the HZ value directly.)
>

I'd still prefer to export only 32 bits of user, nice, and system. This
way they overflow in a clearly defined way - the 32 bits we export are
exact, only the higher bits are missing.
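
One consequence of a clean modulo-2^32 wrap is that a reader can still compute exact differences between two samples, provided fewer than 2^32 ticks elapsed between them. A sketch of that userspace-side calculation, with a hypothetical helper name:

```c
#include <stdint.h>

/* Ticks elapsed between two samples of a counter that wraps modulo
 * 2^32. Unsigned subtraction is itself modulo 2^32, so the result is
 * correct across a single wrap. */
uint32_t delta32(uint32_t prev, uint32_t now)
{
    return now - prev;
}
```

For example, going from 4294967000 to 500 across the wrap gives a delta of 796 ticks.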

Tim

2002-04-01 08:29:40

by Albert D. Cahalan

Subject: Re: Unknown HZ value! (1908) Assume 1024.

Tim Schmielau writes:
> On Wed, 20 Feb 2002, Tom Holroyd wrote:

>>>> jif * smp_num_cpus - (user + nice + system)
>>>
>>> Changing the line to this:
>>>
>>> jif * smp_num_cpus - user - nice - system
>>>
>>> should avoid the overflow.
>>
>> True. It still might be a good idea to make them longs, though,
>> because they are really totals of all the CPUs, as in:
>> user += kstat.per_cpu_user[cpu];
>>
>> Now ultimately, kstat.per_cpu_user[cpu] will overflow, and I don't
>> know what to do about that, but making user, nice, and system unsigned
>> long will at least allow SMP systems to last a little while longer.
>> (Actually I don't know why Procps needs these values at all -- the
>> claim in the code is that all of this is just to compute the HZ value,
>> which is presumably needed to be able to interpret jiffies. It'd be a
>> lot simpler just to have /proc/stat export the HZ value directly.)

Yeah, it would be a lot simpler. Try telling that to Linus. :-(

> I'd still prefer to export only 32 bits of user, nice, and system. This
> way they overflow in a clearly defined way - the 32 bits we export are
> exact, only the higher bits are missing.

The higher bits are absolutely required.

There are ways to push the work of doing a 64-bit counter out into the
proc filesystem and a timer that goes off every 31 bits worth of time.
I've posted an explanation before; you may search for it if you like.
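
The earlier explanation is not reproduced in this thread, so the following is only a guess at the general idea, not necessarily the scheme referred to: keep a 64-bit total next to the raw 32-bit counter and fold in the wrapped difference on every read and from a periodic timer. All names here are hypothetical.

```c
#include <stdint.h>

struct wide_counter {
    uint64_t total;   /* accumulated 64-bit value */
    uint32_t last;    /* raw 32-bit counter at the previous update */
};

/* Fold the current raw 32-bit sample into the 64-bit total. Must be
 * called before the raw counter advances a full 2^32 ticks since the
 * last call; a timer firing every 2^31 ticks leaves a margin. */
uint64_t wide_counter_update(struct wide_counter *w, uint32_t raw)
{
    w->total += (uint32_t)(raw - w->last);   /* delta modulo 2^32 */
    w->last = raw;
    return w->total;
}
```

Starting from zero, updating with 4000000000 and then with 100 (after a wrap) leaves the total at 4294967396.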

2002-04-02 12:09:07

by Tim Schmielau

Subject: Re: Unknown HZ value! (1908) Assume 1024.

On Mon, 1 Apr 2002, Albert D. Cahalan wrote:

> Tim Schmielau writes:
>
> > I'd still prefer to export only 32 bits of user, nice, and system. This
> > way they overflow in a clearly defined way - the 32 bits we export are
> > exact, only the higher bits are missing.
>
> The higher bits are absolutely required.

At least on Alpha (with HZ=1024) I definitely agree. At HZ=100 I'm a bit
uncertain. What I don't want is to export higher bits that sometimes wrap
and sometimes don't.

> There are ways to push the work of doing a 64-bit counter out into the
> proc filesystem and a timer that goes off every 31 bits worth of time.
> I've posted an explanation before; you may search for it if you like.
>

Like I did for idle time in the >497 days uptime patch? Then I'll include
a chunk for user, nice, and system time in the next version and we can see
if Linus takes it.

Tim