2006-01-19 08:39:32

by RUMI Szabolcs

Subject: 2.4.x kernel uptime counter problem

Hello!

I've got a Linux system running the 2.4.26 kernel which was about
to pass the 500-day mark, and now I suddenly see that the uptime
counter has reset:

$ uname -a && w && cat /proc/uptime && last -1 reboot
Linux quasar 2.4.26 #3 SMP Tue Sep 7 09:22:08 CEST 2004 i686 Intel(R) Pentium(R) 4 CPU 2.60GHz GenuineIntel GNU/Linux
09:38:08 up 1 day, 12:49, 5 users, load average: 0.00, 0.00, 0.00
USER TTY LOGIN@ IDLE JCPU PCPU WHAT
rumi pty/s0 08:53 0.00s 0.04s 0.02s screen -r
rumi ttyp1 10Sep04 31:58 9:12 9:12 epic
rumi ttyp3 Tue12 44:33m 0.01s 0.01s -/bin/bash
rumi ttyp2 13Feb05 8days 0.11s 0.11s -/bin/bash
rumi ttypc 11Dec05 0.00s 0.12s 0.11s -/bin/bash
132596.51 39801752.60
reboot system boot 2.4.26 Tue Sep 7 18:47 (498+15:50)

From the above it can be seen that the system is running continuously
and wasn't rebooted 36 hours ago as the uptime counter would suggest.

Is this a known bug?

Regards,

Sab


2006-01-19 09:29:53

by Nick Warne

Subject: Re: 2.4.x kernel uptime counter problem

On 1/19/06, Rumi Szabolcs wrote:
> Hello!
>
> I've got a Linux system running the 2.4.26 kernel which was about
> to pass the 500 day mark these days and now suddenly what I see is
> that the uptime counter has reset:
>
> $ uname -a && w && cat /proc/uptime && last -1 reboot
> Linux quasar 2.4.26 #3 SMP Tue Sep 7 09:22:08 CEST 2004 i686 Intel(R) Pentium(R) 4 CPU 2.60GHz GenuineIntel GNU/Linux
> 09:38:08 up 1 day, 12:49, 5 users, load average: 0.00, 0.00, 0.00
> USER TTY LOGIN@ IDLE JCPU PCPU WHAT
> rumi pty/s0 08:53 0.00s 0.04s 0.02s screen -r
> rumi ttyp1 10Sep04 31:58 9:12 9:12 epic
> rumi ttyp3 Tue12 44:33m 0.01s 0.01s -/bin/bash
> rumi ttyp2 13Feb05 8days 0.11s 0.11s -/bin/bash
> rumi ttypc 11Dec05 0.00s 0.12s 0.11s -/bin/bash
> 132596.51 39801752.60
> reboot system boot 2.4.26 Tue Sep 7 18:47 (498+15:50)
>
> From the above it can be seen that the system is running continuously
> and wasn't rebooted 36 hours ago as the uptime counter would suggest.
>
> Is this a known bug?


It's not a bug - it is a feature. uptime rolls over after 497 days.

[sic]
It computes the result of the "uptime" based on the internal "jiffies"
counter, which counts the time since boot, in units of 10
milliseconds.
This is typecast as an "unsigned long" - on the Intel boxes, that's an
unsigned 32-bit number.
Well, it turns out that in a 32-bit number, you can store 497.1 days
before the number wraps.
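
As a rough check of the 497.1-day figure, here is a minimal Python sketch
of the arithmetic. It assumes HZ = 100 (10 ms ticks) and a 32-bit counter,
as described above; the function names are made up for illustration and
are not kernel code.

```python
# Illustrative arithmetic only, not kernel code.
# Assumptions: HZ = 100 (one tick every 10 ms) and a 32-bit tick counter.
HZ = 100
COUNTER_BITS = 32
SECONDS_PER_DAY = 86400


def wrap_period_days(hz=HZ, bits=COUNTER_BITS):
    """Days until a free-running tick counter of this width wraps to zero."""
    return (2 ** bits) / hz / SECONDS_PER_DAY


def wrapped_uptime_seconds(true_uptime_seconds, hz=HZ, bits=COUNTER_BITS):
    """Uptime as it would be derived from a counter that has wrapped."""
    ticks = int(true_uptime_seconds * hz) % (2 ** bits)
    return ticks / hz


if __name__ == "__main__":
    print(round(wrap_period_days(), 1))  # 497.1
    # "last -1 reboot" in the original report shows 498 days, 15:50.
    true_uptime = 498 * SECONDS_PER_DAY + 15 * 3600 + 50 * 60
    print(wrapped_uptime_seconds(true_uptime))
```

Fed the 498+15:50 figure from the original report, the second call gives
about 134527 seconds, i.e. roughly 1 day 13 hours, which is in the same
range as the "up 1 day, 12:49" that w printed.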


You can use:
last -xf /var/run/utmp runlevel

to get true uptime in this instance.

Nick

--
http://sourceforge.net/projects/quake2plus/

"Person who say it cannot be done should not interrupt person doing it."
-Chinese Proverb

2006-01-19 20:19:06

by Willy Tarreau

Subject: Re: 2.4.x kernel uptime counter problem

On Thu, Jan 19, 2006 at 09:29:51AM +0000, Nick wrote:
> On 1/19/06, Rumi Szabolcs wrote:
> > Hello!
> >
> > I've got a Linux system running the 2.4.26 kernel which was about
> > to pass the 500 day mark these days and now suddenly what I see is
> > that the uptime counter has reset:
> >
> > $ uname -a && w && cat /proc/uptime && last -1 reboot
> > Linux quasar 2.4.26 #3 SMP Tue Sep 7 09:22:08 CEST 2004 i686 Intel(R) Pentium(R) 4 CPU 2.60GHz GenuineIntel GNU/Linux
> > 09:38:08 up 1 day, 12:49, 5 users, load average: 0.00, 0.00, 0.00
> > USER TTY LOGIN@ IDLE JCPU PCPU WHAT
> > rumi pty/s0 08:53 0.00s 0.04s 0.02s screen -r
> > rumi ttyp1 10Sep04 31:58 9:12 9:12 epic
> > rumi ttyp3 Tue12 44:33m 0.01s 0.01s -/bin/bash
> > rumi ttyp2 13Feb05 8days 0.11s 0.11s -/bin/bash
> > rumi ttypc 11Dec05 0.00s 0.12s 0.11s -/bin/bash
> > 132596.51 39801752.60
> > reboot system boot 2.4.26 Tue Sep 7 18:47 (498+15:50)
> >
> > From the above it can be seen that the system is running continuously
> > and wasn't rebooted 36 hours ago as the uptime counter would suggest.
> >
> > Is this a known bug?
>
>
> It's not a bug - it is a feature. uptime rolls over after 497 days.
>
> [sic]
> It computes the result of the "uptime" based on the internal "jiffies"
> counter, which counts the time since boot, in units of 10
> milliseconds.
> This is typecast as an "unsigned long" - on the Intel boxes, that's an
> unsigned 32-bit number.
> Well, it turns out that in a 32-bit number, you can store 497.1 days
> before the number wraps.
>
>
> You can use:
> last -xf /var/run/utmp runlevel
>
> to get true uptime in this instance.
>
> Nick

I would add that if you need to get valid outputs after such an uptime,
you can apply the vhz-j64 patch available at Robert Love's (RML) on
kernel.org.

Regards,
Willy

2006-01-19 20:22:54

by Nick Warne

Subject: Re: 2.4.x kernel uptime counter problem

On Thursday 19 January 2006 20:18, Willy Tarreau wrote:

> > You can use:
> > last -xf /var/run/utmp runlevel
> >
> > to get true uptime in this instance.
> >
> > Nick
>
> I would add that if you need to get valid outputs after such an uptime,
> you can apply the vhz-j64 patch available at Robert Love's (RML) on
> kernel.org.

:-( Then you would have to start all over again and wait 497.1 days to see if
it works... :-0

Seriously, is this patch to be added to 2.4.x tree at all in the future?

Nick
--
"Person who say it cannot be done should not interrupt person doing it."
-Chinese Proverb

2006-01-19 20:27:39

by Willy Tarreau

Subject: Re: 2.4.x kernel uptime counter problem

On Thu, Jan 19, 2006 at 08:22:42PM +0000, Nick Warne wrote:
> On Thursday 19 January 2006 20:18, Willy Tarreau wrote:
>
> > > You can use:
> > > last -xf /var/run/utmp runlevel
> > >
> > > to get true uptime in this instance.
> > >
> > > Nick
> >
> > I would add that if you need to get valid outputs after such an uptime,
> > you can apply the vhz-j64 patch available at Robert Love's (RML) on
> > kernel.org.
>
> :-( Then you would have to start all over again and wait 497.1 days to see if
> it works... :-0

No, you can apply the debugjiffies patch which basically sets your time
to -5 min at boot to test the wrapping. Anyway, believe me, it works, I
successfully wrapped twice on 2.4.18 a long time ago, and it was not
that clean back then.

> Seriously, is this patch to be added to 2.4.x tree at all in the future?

No, because it uses dirty (but very clever) tricks to avoid locking around
the jiffies manipulations, and 2.4 is in critical fixes-only mode right now.

> Nick

Regards,
Willy

2006-01-20 14:15:29

by Jan Engelhardt

Subject: Re: 2.4.x kernel uptime counter problem

>> > You can use:
>> > last -xf /var/run/utmp runlevel
>> >
>> > to get true uptime in this instance.

Or use a dedicated program; IIRC there is an "uprecords" program
(http://podgorny.cz/moin/Uptimed). It requires no reboot and should
work right away.


Jan Engelhardt
--

2006-01-20 15:17:44

by Bill Davidsen

Subject: Re: 2.4.x kernel uptime counter problem

Jan Engelhardt wrote:
>>>>You can use:
>>>>last -xf /var/run/utmp runlevel
>>>>
>>>>to get true uptime in this instance.
>
>
> Or use some dedicated programs, IIRC there is a "uprecords" program
> (http://podgorny.cz/moin/Uptimed). Does require no reboot and should
> work right away.

I never understood uptime anyway, the boot time, to the second, is
available in /proc/stat (btime), and it isn't that hard to turn it into
whatever format you find human readable. I have a perl script which
presents uptime as fractional days, days, hours, min, sec, and/or boot
time. Took me about two minutes to write.
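
The perl script itself was not posted. As a stand-in, here is a minimal
Python sketch of the same idea: read the btime line from /proc/stat and
turn it into a boot time and an uptime. The helper names are hypothetical,
and the thread does not establish whether btime itself stays correct
across a jiffies wrap on every kernel, so treat the result with the same
caution.

```python
import time


def boot_time_from_stat(stat_text):
    """Return the btime value (seconds since the epoch) from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "btime":
            return int(fields[1])
    raise ValueError("no btime line found")


def format_uptime(seconds):
    """Format a duration in seconds as '<days>d HH:MM:SS'."""
    days, rest = divmod(int(seconds), 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{days}d {hours:02d}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    try:
        with open("/proc/stat") as f:
            btime = boot_time_from_stat(f.read())
    except (OSError, ValueError) as exc:
        # Not a Linux system, or an unexpected /proc/stat layout.
        print("could not read boot time:", exc)
    else:
        print("booted:", time.ctime(btime))
        print("up:    ", format_uptime(time.time() - btime))
        print("days:  ", round((time.time() - btime) / 86400, 2))
```

The parsing is kept separate from the file access so it can be tried on
a pasted copy of /proc/stat from another machine.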

--
-bill davidsen
"The secret to procrastination is to put things off until the
last possible moment - but no longer" -me

2006-01-20 16:32:24

by Jan Engelhardt

Subject: Re: 2.4.x kernel uptime counter problem

>> Or use some dedicated programs, IIRC there is a "uprecords" program
>> (http://podgorny.cz/moin/Uptimed). Does require no reboot and should work
>> right away.
>
> I never understood uptime anyway, the boot time, to the second, is available in
> /proc/stat (btime), and it isn't that hard to turn it into whatever format you
> find human readable. I have a perl script which presents uptime as fractional
> days, days, hours, min, sec, and/or boot time. Took me about two minutes to
> write.

uptime or uptimed/uprecords? (Those are two different things.)
The "uptime" command is the same as "w | head -n1" (which reads
/proc/uptime) and therefore suffers from the jiffies wrap.



Jan Engelhardt
--

2006-01-21 13:27:41

by Bill Davidsen

Subject: Re: 2.4.x kernel uptime counter problem

Jan Engelhardt wrote:

>>>Or use some dedicated programs, IIRC there is a "uprecords" program
>>>(http://podgorny.cz/moin/Uptimed). Does require no reboot and should work
>>>right away.
>>>
>>>
>>I never understood uptime anyway, the boot time, to the second, is available in
>>/proc/stat (btime), and it isn't that hard to turn it into whatever format you
>>find human readable. I have a perl script which presents uptime as fractional
>>days, days, hours, min, sec, and/or boot time. Took me about two minutes to
>>write.
>>
>>
>
>uptime or uptimed/uprecords? (That's two different things.)
>The "uptime" commands is the same as "w | head -n1" (which reads
>/proc/uptime) and therefore suffers from jiffies wrap.
>

I understand why it has this limitation, the question is why it was
written to have it, when correct function is easily possible. I
understand extra effort to get it right, but not to get it wrong...

--
bill davidsen
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979