2002-01-14 18:02:42

by Simon Kirby

Subject: [2.4.16] Clock locking bugs?

Just had a server's clock stop at 9:02:30am. Very interesting
results:

[sroot@pro:/]# cat /proc/interrupts
CPU0 CPU1
0: 172839353 172896882 IO-APIC-edge timer
1: 578 522 IO-APIC-edge keyboard
2: 0 0 XT-PIC cascade
4: 22002 21840 IO-APIC-edge serial
10: 581309135 580556518 IO-APIC-level eth0
12: 63077142 63023533 IO-APIC-level aic7xxx
NMI: 0 0
LOC: 345979794 345980089
ERR: 0
MIS: 0

...

[sroot@pro:/]# cat /proc/interrupts
CPU0 CPU1
0: 172839353 172896882 IO-APIC-edge timer
1: 578 522 IO-APIC-edge keyboard
2: 0 0 XT-PIC cascade
4: 22002 21840 IO-APIC-edge serial
10: 581309219 580556518 IO-APIC-level eth0
12: 63077142 63023533 IO-APIC-level aic7xxx
NMI: 0 0
LOC: 345979883 345980178
ERR: 0
MIS: 0
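
What stands out in the two samples is that the timer line (IRQ 0) is identical in both, while eth0 and LOC keep counting. A minimal sketch of that comparison follows, assuming the flattened layout shown above. The function names `parse_interrupts` and `interrupt_deltas` are made up for illustration, and the sample strings are abbreviated copies of the output in this post.

```python
# Abbreviated copies of the two samples above (timer, eth0, LOC, ERR only).
BEFORE = """\
CPU0 CPU1
0: 172839353 172896882 IO-APIC-edge timer
10: 581309135 580556518 IO-APIC-level eth0
LOC: 345979794 345980089
ERR: 0
"""

AFTER = """\
CPU0 CPU1
0: 172839353 172896882 IO-APIC-edge timer
10: 581309219 580556518 IO-APIC-level eth0
LOC: 345979883 345980178
ERR: 0
"""


def parse_interrupts(text):
    """Map each interrupt label to its list of per-CPU counts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        counts = []
        # Take at most one count per CPU; stop at the first non-numeric field
        # (rows such as ERR carry a single count and no description).
        for field in rest.split()[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = counts
    return table


def interrupt_deltas(before, after):
    """Per-CPU count differences for labels present in both samples."""
    old, new = parse_interrupts(before), parse_interrupts(after)
    return {
        label: [n - o for o, n in zip(old[label], new[label])]
        for label in old
        if label in new
    }


for label, delta in interrupt_deltas(BEFORE, AFTER).items():
    print(label, delta)
```

A delta of zero on both CPUs for IRQ 0, alongside non-zero deltas for LOC, is the pattern seen here. Quiet devices such as the keyboard also show zero deltas, so the check only means something for a source that should always be ticking.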

Alan says this is due to locking problems with the timer I/O code.

On the console were a lot of "set_rtc_mmss: can't update from 79 to 32"
type messages that have always happened on SMP kernels with ntpd.

Has anybody created any patches for this?

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ Opinions expressed are not necessarily those of my employers. ]


2002-04-13 19:21:01

by Simon Kirby

Subject: Re: [2.4.16] Clock locking bugs?

Hrm...Just had this happen on my dual celeron desktop, exactly the same
problem. Kernel 2.5.7. Everything I typed in an rxvt was one character
lagged. :)

Simon-

On Mon, Jan 14, 2002 at 10:02:15AM -0800, Simon Kirby wrote:

> Just had a server's clock stop at 9:02:30am. Very interesting
> results:
>
> [sroot@pro:/]# cat /proc/interrupts
> CPU0 CPU1
> 0: 172839353 172896882 IO-APIC-edge timer
> 1: 578 522 IO-APIC-edge keyboard
> 2: 0 0 XT-PIC cascade
> 4: 22002 21840 IO-APIC-edge serial
> 10: 581309135 580556518 IO-APIC-level eth0
> 12: 63077142 63023533 IO-APIC-level aic7xxx
> NMI: 0 0
> LOC: 345979794 345980089
> ERR: 0
> MIS: 0
>
> ...
>
> [sroot@pro:/]# cat /proc/interrupts
> CPU0 CPU1
> 0: 172839353 172896882 IO-APIC-edge timer
> 1: 578 522 IO-APIC-edge keyboard
> 2: 0 0 XT-PIC cascade
> 4: 22002 21840 IO-APIC-edge serial
> 10: 581309219 580556518 IO-APIC-level eth0
> 12: 63077142 63023533 IO-APIC-level aic7xxx
> NMI: 0 0
> LOC: 345979883 345980178
> ERR: 0
> MIS: 0
>
> Alan says this is due to locking problems with the timer I/O code.
>
> On the console were a lot of "set_rtc_mmss: can't update from 79 to 32"
> type messages that have always happened on SMP kernels with ntpd.
>
> Has anybody created any patches for this?
>
> Simon-
>
> [ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
> [ Opinions expressed are not necessarily those of my employers. ]

2002-04-16 06:07:49

by Paul Gortmaker

Subject: Re: [2.4.16] Clock locking bugs?

You've just given another example of why set_rtc_mmss should die (or at a
minimum, not be enabled by default) - the timer died, and it wants to push
its incorrect time onto the rtc as well. I've posted patches to this effect
several times before; now that we are into 2.5, I'll push them again...

I doubt that such a change would influence your timer problem, but feel
free to #if 0 out the set_rtc_mmss routine, and the cruft that calls
it from the timer interrupt - i.e. if (time_status & STA_UNSYNC)
in arch/i386/kernel/time.c - you can then make the *policy* decision
in userspace as to whether you want to sync the rtc with the kernel time.

[At a minimum, I guarantee the annoying set_rtc_mmss messages go away :) ]
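
As a rough illustration of moving that policy into userspace, the sketch below only builds a `hwclock --systohc` command when the caller says the system clock is trustworthy. The function name `rtc_sync_command` and its parameters are hypothetical. How `clock_is_synced` gets determined (for example by asking ntpd) is left to the caller and is an assumption here, not something taken from the patches mentioned above.

```python
def rtc_sync_command(clock_is_synced, rtc_in_utc=True):
    """Return the hwclock command to run, or None to leave the RTC alone.

    clock_is_synced: the caller's own judgment that the system time is good
                     (hypothetical input; this sketch does not query ntpd).
    rtc_in_utc:      whether the hardware clock is kept in UTC.
    """
    if not clock_is_synced:
        # Policy: never push a possibly wrong system time onto the RTC.
        return None
    cmd = ["hwclock", "--systohc"]
    cmd.append("--utc" if rtc_in_utc else "--localtime")
    return cmd


# Print the decisions instead of running them; a cron job could hand the
# returned list to subprocess.run() when it is not None.
print(rtc_sync_command(True))
print(rtc_sync_command(False))
```

With the in-kernel update disabled, a stopped timer like the one in this thread would no longer be able to write its stale time to the RTC. Whether to write at all becomes a decision made by a script like this one.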

Paul.

Simon Kirby wrote:
>
> Hrm...Just had this happen on my dual celeron desktop, exactly the same
> problem. Kernel 2.5.7. Everything I typed in an rxvt was one character
> lagged. :)
>
> Simon-
>
> On Mon, Jan 14, 2002 at 10:02:15AM -0800, Simon Kirby wrote:
>
> > Just had a server's clock stop at 9:02:30am. Very interesting
> > results:
> >
> > [sroot@pro:/]# cat /proc/interrupts
> > CPU0 CPU1
> > 0: 172839353 172896882 IO-APIC-edge timer
...
> > [sroot@pro:/]# cat /proc/interrupts
> > CPU0 CPU1
> > 0: 172839353 172896882 IO-APIC-edge timer
> >
> > Alan says this is due to locking problems with the timer I/O code.
> >
> > On the console were a lot of "set_rtc_mmss: can't update from 79 to 32"
> > type messages that have always happened on SMP kernels with ntpd.
> >
> > Has anybody created any patches for this?