2009-01-02 19:25:53

by Linas Vepstas

Subject: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Slashdot reported a story of Linux machines crashing on New Year's Eve.

http://ask.slashdot.org/article.pl?sid=09/01/01/1930202

Below follows a summary of the reported crashes. I'm ignoring the
zillions of "mine didn't crash" reports, or the "you're a paranoid
conspiracy theorist, it's random chance" reports.

So far, 31 users reported 53 hard crashes at/near midnight, New Year's.
Symptoms are:
-- hard hang (systems not pingable)
-- IRQs not serviced (if disk was active at time of crash,
the disk activity light stays lit)
-- cold reboot (poweroff) required
-- systems work normally after reboot
-- no messages in syslog, no kernel oops, no core file crash dumps
-- not reproducible (simply setting the clock back is not enough
to reproduce; guessing that a simulation of stratum 0 ntp server
is needed to force the leap-second.)
-- The affected machines seem to be running either 2.6.21, 2.6.26 or 2.6.27

I suspect it's a kernel race condition triggered by ntp bumping the second.
-- it's the leap second, since this doesn't happen other years,
-- it's a race condition, since some identically configured machines
didn't go down, while others did.
-- it's a race condition, since the majority of systems were not affected.
-- it's a race condition, since affected systems seem to have been
mostly non-idle servers, or some non-idle desktops/TV set-tops.
-- ntpd is the only service that monkeys with time adjustments.

There is a "well-known" deadlock in 2.6.21 kernels that caused this:
http://www.mail-archive.com/[email protected]/msg15039.html

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=746976a301ac9c9aa10d7d42454f8d6cdad8ff2b;hp=872aad45d6174570dd2e1defc3efee50f2cfcc72

Here's the synopsis of individual reports. Most folks left very little
info, nor did they leave a way to contact them.


aputerguy
Fedora 8 Linux server 2.6.26.6-49.fc8 Intel p4 2.8GHz.
ASUS P4PE 2 1-TB Seagate SATA hard drives, 1 200MB PATA drive. 2GB DDR. 1
pchdtv5500 card, 1 winfast 2000XP tv card, 1 nVidia 6200 graphics card.


MentalMooMan (785571) <[email protected]>
mythtv box (running mythbuntu) used to be something like 7.10 upgraded to 8.10
2.6.27-9.19-generic ntpd version 4.2.4p4. The CPU is an AMD Athlon XP
1700+ or 1800+. The motherboard is an EPOX EP-8KTA3Pro. message in
/var/log/messages at boot-time:
"warning: `ntpd' uses 32-bit capabilities (legacy support in use)"


Anonymous Coward MythTV box on Fedora 8 (Athlon XP1700+)


athakur999 (44340) Mythbuntu-based HTPC


AZPolarBear (661815) Fedora 8 system


Anonymous Coward 5 of about 70 of our production servers


Anonymous Coward I did have two 2.6.21 servers crash last night


Anonymous Coward Ubuntu 8.10 MythTV box.


SanjuroE (131728) Debian testing and at that time Debian kernel
2.6.26-11.


lukas84 (912874) internal testing machine that's still on 2.6.21


Anonymous Coward Debian testing Kernel 2.6.26


zerosumgame (1429741) kernel 2.6.21 on older Dell 1850's


Wibla (677432) Both my fileservers running debian etch installed from
custom install media (pre-etchnhalf) running 2.6.21-2 and 2.6.21-6
crashed,


Maow (620678) Ubuntu 8.04 on AMD64


Burdell (228580) <[email protected]> RHEL 4 server
RHEL 4 update 6
kernel-smp-2.6.9-67.0.7.EL.x86_64
ntp-4.2.0.a.20040617-6.el4.x86_64
Penguin Computing Altus 2600
dual dual-core Opteron 2212 HE
4G RAM
nVidia MCP55 chipset
I have 9 servers (mostly
different hardware, but one the same as above), all running the exact
same kernel and package set. Only one crashed; the others logged the
leap second and went on fine.


Pretzalzz (577309) Travis Crump <[email protected]>
http://lists.debian.org/debian-user/2009/01/msg00006.html
debian lenny ntp=1:4.2.4p4+dfsg-7;
Linux version 2.6.26-1-amd64 (Debian 2.6.26-4[since updated])
Processor : 2x AMD Athlon(tm) 64 X2 Dual Core Processor 3800+


kst (168867) Ubuntu 8.10


arodland (127775) Debian 2.6.21


Goodgerster (904325) <goodgerster AT gmail DOT com>
Debian


Morgor (542294) Fedora 8 Two of our production servers running fedora 8


Lightjumper (532700) Fedora 9 server


blit (90883) 10 machines running Fedora Core 7


Qwell (684661) Ubuntu 8.10


Jim Fenton (514449) Fedora 8 machine (kernel: 2.6.26.6-49.fc8)


Anonymous Coward My laptop


dmrobbin (560931) F8 box went down,


Anonymous Coward F8 server also "hung"


Anonymous Coward 2.6.26-1-amd64. Four of the 20 locked up


Anonymous Coward (actually, me, Linas) amd64 dual core, ubuntu 8.04
custom-compiled kernel, 2.6.26-64-bit


Anonymous Coward kernel 2.6.26.3-29.fc9.


Anonymous Coward 3 Ubuntu 8.10 Virtual Box (3 of them)


2009-01-02 20:03:40

by Diego Calleja

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Fri, 2 Jan 2009 13:25:38 -0600, "Linas Vepstas" <[email protected]> wrote:

> Suspect it's a kernel race condition triggered by ntp bumping the second.

How could I create a test case that reproduces what ntp does? Just add
a second?

2009-01-02 20:26:15

by Robert Hancock

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Diego Calleja wrote:
> On Fri, 2 Jan 2009 13:25:38 -0600, "Linas Vepstas" <[email protected]> wrote:
>
>> Suspect it's a kernel race condition triggered by ntp bumping the second.
>
> How could I create a test case that reproduces what ntp does? Just add
> a second?

I'd think that setting the clock to just before midnight on Dec.31 and
using the adjtimex syscall to set the TIME_INS state on the clock, then
waiting until midnight rolls around would be a reasonable test..
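
A minimal sketch of that test setup in C, under some assumptions: the
function names below are made up for illustration, the program has to run
as root, and ntpd should probably be stopped first so it does not rewrite
the status word. From user space the request is made by setting the
STA_INS status flag through adjtimex(); the kernel then moves its own
clock state to TIME_INS. Setting the clock to just before midnight UTC is
left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/timex.h>

/*
 * Fill *tx so that a following adjtimex() call asks the kernel to insert
 * a leap second at the end of the current UTC day. cur_status is the
 * status word returned by an earlier read-only adjtimex() call.
 */
void prepare_leap_insert(struct timex *tx, int cur_status)
{
	assert(tx != NULL);
	memset(tx, 0, sizeof(*tx));
	tx->modes = ADJ_STATUS;                         /* only change the status word */
	tx->status = (cur_status & ~STA_DEL) | STA_INS; /* insert, never delete */
}

/*
 * Query the current status, then arm the leap-second insertion.
 * Returns the clock state reported by adjtimex() (TIME_OK, TIME_INS, ...)
 * or -1 with errno set (EPERM when not running as root).
 */
int arm_leap_insert(void)
{
	struct timex tx;
	int status;

	memset(&tx, 0, sizeof(tx));     /* modes == 0 means read-only query */
	if (adjtimex(&tx) == -1)
		return -1;
	status = tx.status;

	prepare_leap_insert(&tx, status);
	return adjtimex(&tx);
}
```

Calling arm_leap_insert() shortly before 23:59:59 UTC on a busy test box,
then waiting for midnight, would be one way to exercise the leap-second
path without a real stratum 0 source.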

2009-01-02 20:29:18

by Linas Vepstas

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/2 Diego Calleja <[email protected]>:
> On Fri, 2 Jan 2009 13:25:38 -0600, "Linas Vepstas" <[email protected]> wrote:
>
>> Suspect it's a kernel race condition triggered by ntp bumping the second.
>
> How could I create a test case that reproduces what ntp does? Just add
> a second?

It might be more subtle than that. One of these cases is discussed in a
Debian mailing list thread, where one user claims his hardware clock runs
so poorly, it loses a second every hour, and he doesn't have problems.
ntp normally drifts to adjust time; for exceptional jumps in time, it won't
drift, but just set.

There's another thread of bug reports on Oracle servers (Linux-based) which
apparently hit the same problem, although they think it has something to
do with a backwards leap-second jump.

--linas

2009-01-02 21:30:06

by Ben Goodger

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/2 Ben Goodger <[email protected]>
>
> 2009/1/2 Linas Vepstas <[email protected]>
>>
>> Slashdot reported a story of Linux machines crashing on New years eve.
>>
>> So far, 31 users reported 53 hard crashes at/near midnight, new years.
>
> Further details about my crash (Goodgerster):
> -- system works normally after reboot;
> -- no messages were written to /var/log/kernel;
> -- affected machine was running 2.6.26-1-amd64 from Debian testing;
> -- the other machine on the network was unaffected (to the extent that it continues normal operation as an NFS server) and is running 2.6.18 from Debian Etch;
> -- the affected machine was using NTP (not sure about the server machine.)
>
> I was unable to find any logs on the Etch machine that would tell us whether the affected machine continued writing to its NFS share after the crash. File corruption is evident, but this could equally have been caused by the hard reset or by the crash. Unfortunately, I was careless enough to just hit the reset button after hitting ctrl-alt-backspace a couple of times, but I know that either the X window system or the kernel hung entirely (I do not know whether the NumLock key was inoperable, but the cursor/system monitor/clock stopped moving). The clock displayed 23:59:59 when I returned to it at around 00:15. I am in the UTC+0 timezone; the system clock was therefore in UTC, but I had set it to "windows compatibility" mode (i.e. local timezone).
>
> Hope this helps (?)...
>
> --
> Benjamin Goodger
>
> -----BEGIN GEEK CODE BLOCK-----
> Version: 3.1
> GCS/S/M/B d- s++:-- a18 c++$ UL>+++ P--- L++>+++ E- W+++$ N--- K? w--- O? M- V? PS+(++) PE-() Y+ PGP+ t 5? X-- R- !tv() b+++>++++ DI+++ D+ G e>++++ h! !r*(-) y
> ------END GEEK CODE BLOCK------



--
Benjamin Goodger


2009-01-03 00:32:00

by Chris Adams

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Once upon a time, Linas Vepstas <[email protected]> said:
> Below follows a summary of the reported crashes. I'm ignoring the
> zillions of "mine didn't crash" reports, or the "you're a paranoid
> conspiracy theorist, its random chance" reports.

I have reproduced this and got a stack trace (this is with Fedora 8 and
kernel kernel-2.6.26.6-49.fc8.x86_64):

#0 ktime_get_ts (ts=0xffffffff8158bb30) at include/asm/processor.h:691
#1 0xffffffff8104c09a in ktime_get () at kernel/hrtimer.c:59
#2 0xffffffff8102a39a in hrtick_start_fair (rq=0xffff810009013880,
p=<value optimized out>) at kernel/sched.c:1064
#3 0xffffffff8102decc in enqueue_task_fair (rq=0xffff810009013880,
p=0xffff81003fb02d40, wakeup=1) at kernel/sched_fair.c:863
#4 0xffffffff81029a08 in enqueue_task (rq=0xffffffff8158bb30,
p=0xffff81003b8ac418, wakeup=-994836480) at kernel/sched.c:1550
#5 0xffffffff81029a39 in activate_task (rq=0xffff810009013880,
p=0xffff81003b8ac418, wakeup=20045) at kernel/sched.c:1614
#6 0xffffffff8102be38 in try_to_wake_up (p=0xffff81003fb02d40,
state=<value optimized out>, sync=0) at kernel/sched.c:2173
#7 0xffffffff8102be9c in default_wake_function (curr=<value optimized out>,
mode=998949912, sync=20045, key=0x4c4b40000) at kernel/sched.c:4366
#8 0xffffffff810492ed in autoremove_wake_function (wait=0xffffffff8158bb30,
mode=998949912, sync=20045, key=0x4c4b40000) at kernel/wait.c:132
#9 0xffffffff810296a2 in __wake_up_common (q=0xffffffff813d3180, mode=1,
nr_exclusive=1, sync=0, key=0x0) at kernel/sched.c:4387
#10 0xffffffff8102b97b in __wake_up (q=0xffffffff813d3180, mode=1,
nr_exclusive=1, key=0x0) at kernel/sched.c:4406
#11 0xffffffff8103692f in wake_up_klogd () at kernel/printk.c:1005
#12 0xffffffff81036abb in release_console_sem () at kernel/printk.c:1051
#13 0xffffffff81036fd1 in vprintk (fmt=<value optimized out>,
args=<value optimized out>) at kernel/printk.c:789
#14 0xffffffff81037081 in printk (
fmt=0xffffffff8158bb30 "yj$\201????\2008\001\t") at kernel/printk.c:613
#15 0xffffffff8104ec16 in ntp_leap_second (timer=<value optimized out>)
at kernel/time/ntp.c:143
#16 0xffffffff8104b7a6 in run_hrtimer_pending (cpu_base=0xffff81000900f740)
at kernel/hrtimer.c:1204
#17 0xffffffff8104b86a in run_hrtimer_softirq (h=<value optimized out>)
at kernel/hrtimer.c:1355
#18 0xffffffff8103b31f in __do_softirq () at kernel/softirq.c:234
#19 0xffffffff8100d52c in call_softirq () at include/asm/current_64.h:10
#20 0xffffffff8100ed5e in do_softirq () at arch/x86/kernel/irq_64.c:262
#21 0xffffffff8103b280 in irq_exit () at kernel/softirq.c:310
#22 0xffffffff8101b0fe in smp_apic_timer_interrupt (regs=<value optimized out>)
at arch/x86/kernel/apic_64.c:514
#23 0xffffffff8100cf52 in apic_timer_interrupt ()
at include/asm/current_64.h:10
#24 0xffff81003b9d5a90 in ?? ()
#25 0x0000000000000000 in ?? ()


Basically (to my untrained eye), the leap second code is called from the
timer interrupt handler, which holds xtime_lock. The leap second code
does a printk to notify about the leap second. The printk code tries to
wake up klogd (I assume to prioritize kernel messages), and (under some
conditions), the scheduler attempts to get the current time, which tries
to get xtime_lock => deadlock.

I can only reproduce this if the system is busy. If the system is
otherwise idle at the timer interrupt, I guess the scheduler doesn't try
to get the time. I can run a "find / | xargs cat > /dev/null" in one
window and then trigger the leap second in another, and the system dies
most of the time.

I'm looking at the source for the RHEL 4 kernel 2.6.9-67.0.7.EL (which I
had crash on a system), and the scheduler is different enough that I am
not finding the path to the deadlock right off.

In any case, the quick-n-dirty fix would be to not try to printk while
holding xtime_lock (I think the NTP code is the only thing that does).
However, it would be nice to still get the leap second notification, so
some other fix would be better I guess.
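
The deadlock path and the quick fix can be imitated in user space. What
follows is only a toy model, not kernel code: every name in it is made
up, the sequence counter merely stands in for the xtime_lock seqlock, and
the reader gives up after MAX_SPINS tries, whereas a real seqlock reader
would keep retrying forever (which is the hang).

```c
#include <assert.h>
#include <stdio.h>

#define MAX_SPINS 1000          /* toy bound; the real reader never gives up */

static volatile unsigned seq;   /* even: stable, odd: write in progress */
static long wall_seconds = 1230768000;

void toy_write_begin(void) { seq++; }   /* stands in for write_seqlock() */
void toy_write_end(void)   { seq++; }   /* stands in for write_sequnlock() */

/*
 * Reader side, standing in for ktime_get(): retry while a writer is
 * active. Returns 0 and stores the time, or -1 after MAX_SPINS retries.
 */
int toy_read_time(long *out)
{
	assert(out != NULL);
	for (int spins = 0; spins < MAX_SPINS; spins++) {
		unsigned start = seq;
		long value = wall_seconds;

		if (!(start & 1) && start == seq) {
			*out = value;
			return 0;
		}
	}
	return -1;
}

/* Stands in for printk -> wake_up_klogd -> scheduler -> ktime_get. */
int toy_log(const char *msg)
{
	long now;

	if (toy_read_time(&now) != 0)
		return -1;
	printf("[%ld] %s\n", now, msg);
	return 0;
}

/* Buggy shape: logs inside the write section, so the reader never succeeds. */
int leap_insert_buggy(void)
{
	int rc;

	toy_write_begin();
	wall_seconds--;
	rc = toy_log("inserting leap second");
	toy_write_end();
	return rc;
}

/* Fixed shape: finish the write section first, then log. */
int leap_insert_fixed(void)
{
	toy_write_begin();
	wall_seconds--;
	toy_write_end();
	return toy_log("inserting leap second");
}
```

In this model leap_insert_buggy() reports failure because the reader runs
while the counter is odd, and leap_insert_fixed() succeeds because the
message is printed only after the write section has ended.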
--
Chris Adams <[email protected]>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.

2009-01-03 02:24:19

by Duane Griffin

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Fri, Jan 02, 2009 at 06:21:14PM -0600, Chris Adams wrote:
> Once upon a time, Linas Vepstas <[email protected]> said:
> > Below follows a summary of the reported crashes. I'm ignoring the
> > zillions of "mine didn't crash" reports, or the "you're a paranoid
> > conspiracy theorist, its random chance" reports.
>
> I have reproduced this and got a stack trace (this is with Fedora 8 and
> kernel kernel-2.6.26.6-49.fc8.x86_64):
>
> Basically (to my untrained eye), the leap second code is called from the
> timer interrupt handler, which holds xtime_lock. The leap second code
> does a printk to notify about the leap second. The printk code tries to
> wake up klogd (I assume to prioritize kernel messages), and (under some
> conditions), the scheduler attempts to get the current time, which tries
> to get xtime_lock => deadlock.
>
> I can only reproduce this if the system is busy. If the system is
> otherwise idle at the timer interrupt, I guess the scheduler doesn't try
> to get the time. I can run a "find / | xargs cat > /dev/null" in one
> window and then trigger the leap second in another, and the system dies
> most of the time.
>
> I'm looking at the source for the RHEL 4 kernel 2.6.9-67.0.7.EL (which I
> had crash on a system), and the scheduler is different enough that I am
> not finding the path to the deadlock right off.
>
> In any case, the quick-n-dirty fix would be to not try to printk while
> holding xtime_lock (I think the NTP code is the only thing that does).
> However, it would be nice to still get the leap second notification, so
> some other fix would be better I guess.

How about just moving the printk out of the lock? I.e. something like
this:

diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index f5f793d..ad3e2b7 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -140,8 +140,6 @@ static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)
 		xtime.tv_sec--;
 		wall_to_monotonic.tv_sec++;
 		time_state = TIME_OOP;
-		printk(KERN_NOTICE "Clock: "
-		       "inserting leap second 23:59:60 UTC\n");
 		hrtimer_add_expires_ns(&leap_timer, NSEC_PER_SEC);
 		res = HRTIMER_RESTART;
 		break;
@@ -166,6 +164,10 @@ static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)

 	write_sequnlock(&xtime_lock);

+	if (res == HRTIMER_RESTART)
+		printk(KERN_NOTICE "Clock: "
+		       "inserting leap second 23:59:60 UTC\n");
+
 	return res;
 }


> --
> Chris Adams <[email protected]>
> Systems and Network Administrator - HiWAAY Internet Services
> I don't speak for anybody but myself - that's enough trouble.

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan

2009-01-03 03:45:29

by Linas Vepstas

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/2 Duane Griffin <[email protected]>:
> On Fri, Jan 02, 2009 at 06:21:14PM -0600, Chris Adams wrote:
>> Once upon a time, Linas Vepstas <[email protected]> said:
>> > Below follows a summary of the reported crashes. I'm ignoring the
>> > zillions of "mine didn't crash" reports, or the "you're a paranoid
>> > conspiracy theorist, its random chance" reports.
>>
>> I have reproduced this and got a stack trace (this is with Fedora 8 and
>> kernel kernel-2.6.26.6-49.fc8.x86_64):
>>
>> Basically (to my untrained eye), the leap second code is called from the
>> timer interrupt handler, which holds xtime_lock. The leap second code
>> does a printk to notify about the leap second. The printk code tries to
>> wake up klogd (I assume to prioritize kernel messages), and (under some
>> conditions), the scheduler attempts to get the current time, which tries
>> to get xtime_lock => deadlock.
>
> How about just moving the printk out of the lock? I.e. something like
> this:

[...]

Sure looks like the right fix to me. (Although there's more than
one printk under that lock). Who's going to write the formal patch?

--linas

2009-01-03 03:50:17

by Linas Vepstas

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/2 Linas Vepstas <[email protected]>:
> Slashdot reported a story of Linux machines crashing on New years eve.

FYI, looks like the bug has been found, and there's a patch!

http://lkml.org/lkml/2009/1/2/389

--linas

2009-01-03 04:02:22

by Ben Goodger

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/3 Linas Vepstas <[email protected]>
> > Slashdot reported a story of Linux machines crashing on New years eve.
>
> FYI, looks like the bug has been found, and there's a patch!
>
> http://lkml.org/lkml/2009/1/2/389

Great. I look forward to not crashing the next time it is 2008-31-31-23:59:59.
Sarcasm aside, please pass on my thanks to Mr Griffin.

--
Benjamin Goodger

2009-01-03 04:41:58

by Chris Adams

Subject: [PATCH] Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Once upon a time, Duane Griffin <[email protected]> said:
> On Fri, Jan 02, 2009 at 06:21:14PM -0600, Chris Adams wrote:
> > In any case, the quick-n-dirty fix would be to not try to printk while
> > holding xtime_lock (I think the NTP code is the only thing that does).
> > However, it would be nice to still get the leap second notification, so
> > some other fix would be better I guess.
>
> How about just moving the printk out of the lock? I.e. something like
> this:

Well, you've only fixed the inserting a leap second case, not the
removing a leap second case. AFAIK we've never actually had a leap
second removed, but it could happen (and the code is already there), so
it should be fixed as well.

Also, I didn't notice the locking was right there in the ntp_leap_second
function in the 2.6.26.6 kernel I was looking at, because I've also been
looking at the 2.6.9-based RHEL 4 kernel (which is a good bit different;
the lock is held outside the function, so it wouldn't be easy to drop it
for the printk). I guess that's Red Hat's (and other long-term support
vendors') problem. The simplest thing for them is still probably to
just remove the printks.

Here's a patch that moves both printks outside the lock. I am unable to
make a kernel with this patch crash on a leap second insertion or
deletion.
--
Chris Adams <[email protected]>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.


From: Chris Adams <[email protected]>

The code to handle leap seconds printks an informational message when the
second is inserted or deleted. It does this while holding xtime_lock.
However, printk wakes up klogd, and in some cases, the scheduler tries
to get the current kernel time, trying to get xtime_lock (which results
in a deadlock). This moves the printks outside of the lock.

Signed-off-by: Chris Adams <[email protected]>
---
diff -urpN linux-2.6.28-git5-vanilla/kernel/time/ntp.c linux-2.6.28-git5/kernel/time/ntp.c
--- linux-2.6.28-git5-vanilla/kernel/time/ntp.c 2009-01-02 22:09:34.000000000 -0600
+++ linux-2.6.28-git5/kernel/time/ntp.c 2009-01-02 22:11:23.000000000 -0600
@@ -130,6 +130,7 @@ void ntp_clear(void)
 static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)
 {
 	enum hrtimer_restart res = HRTIMER_NORESTART;
+	int msg = 0;

 	write_seqlock(&xtime_lock);

@@ -140,8 +141,7 @@ static enum hrtimer_restart ntp_leap_sec
 		xtime.tv_sec--;
 		wall_to_monotonic.tv_sec++;
 		time_state = TIME_OOP;
-		printk(KERN_NOTICE "Clock: "
-		       "inserting leap second 23:59:60 UTC\n");
+		msg = 1;
 		hrtimer_add_expires_ns(&leap_timer, NSEC_PER_SEC);
 		res = HRTIMER_RESTART;
 		break;
@@ -150,8 +150,7 @@ static enum hrtimer_restart ntp_leap_sec
 		time_tai--;
 		wall_to_monotonic.tv_sec--;
 		time_state = TIME_WAIT;
-		printk(KERN_NOTICE "Clock: "
-		       "deleting leap second 23:59:59 UTC\n");
+		msg = 2;
 		break;
 	case TIME_OOP:
 		time_tai++;
@@ -166,6 +165,17 @@ static enum hrtimer_restart ntp_leap_sec

 	write_sequnlock(&xtime_lock);

+	switch (msg) {
+	case 1:
+		printk(KERN_NOTICE "Clock: "
+		       "inserting leap second 23:59:60 UTC\n");
+		break;
+	case 2:
+		printk(KERN_NOTICE "Clock: "
+		       "deleting leap second 23:59:59 UTC\n");
+		break;
+	}
+
 	return res;
 }

2009-01-03 04:46:43

by Duane Griffin

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/3 Ben Goodger <[email protected]>:
> 2009/1/3 Linas Vepstas <[email protected]>
>> > Slashdot reported a story of Linux machines crashing on New years eve.
>>
>> FYI, looks like the bug has been found, and there's a patch!
>>
>> http://lkml.org/lkml/2009/1/2/389
>
> Great. I look forward to not crashing the next time it is 2008-31-31-23:59:59.
> Sarcasm aside, please pass on my thanks to Mr Griffin.

Thanks, but I'm not the one who deserves the thanks: Chris Adams did
all the work in reproducing and diagnosing the problem. My patch was
entirely trivial (and indeed, incomplete).

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan

2009-01-03 04:51:01

by Ben Goodger

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/3 Duane Griffin <[email protected]>:
> Thanks, but I'm not the one who deserves the thanks: Chris Adams did
> all the work in reproducing and diagnosing the problem. My patch was
> entirely trivial (and indeed, incomplete).

I mean 2009-_12_-31, of course.
Thank you, Mr Adams...

--
Benjamin Goodger


2009-01-03 04:52:42

by Duane Griffin

Subject: Re: [PATCH] Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Fri, Jan 02, 2009 at 10:41:43PM -0600, Chris Adams wrote:
> Once upon a time, Duane Griffin <[email protected]> said:
> > On Fri, Jan 02, 2009 at 06:21:14PM -0600, Chris Adams wrote:
> > > In any case, the quick-n-dirty fix would be to not try to printk while
> > > holding xtime_lock (I think the NTP code is the only thing that does).
> > > However, it would be nice to still get the leap second notification, so
> > > some other fix would be better I guess.
> >
> > How about just moving the printk out of the lock? I.e. something like
> > this:
>
> Well, you've only fixed the inserting a leap second case, not the
> removing a leap second case. AFAIK we've never actually had a leap
> second removed, but it could happen (and the code is already there), so
> it should be fixed as well.

Quite right...

> Also, I didn't notice the locking was right there in the ntp_leap_second
> function in the 2.6.26.6 kernel I was looking at, because I've also been
> looking at the 2.6.9-based RHEL 4 kernel (which is a good bit different;
> the lock is held outside the function, so it wouldn't be easy to drop it
> for the printk). I guess that's Red Hat's (and other long-term support
> vendors') problem. The simplest thing for them is still probably to
> just remove the printks.
>
> Here's a patch that moves both printks outside the lock. I am unable to
> make a kernel with this patch crash on a leap second insertion or
> deletion.

How about instead of a switch statement, assigning the message to a
variable and printing that. I.e. something like:

static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)
{
	enum hrtimer_restart res = HRTIMER_NORESTART;
	const char *msg = NULL;

	...
		msg = "Clock: inserting leap second 23:59:60 UTC";
	...
		msg = "Clock: deleting leap second 23:59:59 UTC";
	...

	if (msg)
		printk(KERN_NOTICE "%s\n", msg);

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan

2009-01-03 06:33:46

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Robert Hancock wrote:
> Diego Calleja wrote:
>> How could I create a test case that reproduces what ntp does? Just add
>> a second?
>
> I'd think that setting the clock to just before midnight on Dec.31 and
> using the adjtimex syscall to set the TIME_INS state on the clock,
> then waiting until midnight rolls around would be a reasonable test..

I don't understand this idea, nor the patch for the problem. I don't
see why adding a leap second would impact the kernel in any way.
Shouldn't this be a simple zoneinfo change, whereby the last two seconds
of the year (in each timezone) both map to 31dec2008 23:59:59? That's
the way the change has worked in the real world. Why would ntp or the
kernel be involved?

2009-01-03 06:37:49

by Ben Goodger

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/3 David Newall <[email protected]>:
> Robert Hancock wrote:
>> Diego Calleja wrote:
>>> How could I create a test case that reproduces what ntp does? Just add
>>> a second?
>>
>> I'd think that setting the clock to just before midnight on Dec.31 and
>> using the adjtimex syscall to set the TIME_INS state on the clock,
>> then waiting until midnight rolls around would be a reasonable test..
>
> I don't understand this idea, nor the patch for the problem. I don't
> see why adding a leap second would impact the kernel in any way.
> Shouldn't this be a simple zoneinfo change, whereby the last two seconds
> of the year (in each timezone) both map to 31dec2008 23:59:59? That's
> the way the change has worked in the real world. Why would ntp or the
> kernel be involved?

Actually, the change has worked in the real world with the
introduction of a new second named 23:59:60, or else ignoring the leap
second entirely and correcting the clock (or not) later...

--
Benjamin Goodger


2009-01-03 07:00:44

by Chris Adams

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Once upon a time, David Newall <[email protected]> said:
> I don't understand this idea, nor the patch for the problem. I don't
> see why adding a leap second would impact the kernel in any way.
> Shouldn't this be a simple zoneinfo change, whereby the last two seconds
> of the year (in each timezone) both map to 31dec2008 23:59:59? That's
> the way the change has worked in the real world. Why would ntp or the
> kernel be involved?

The leap second isn't a simple thing like a time zone. Zones account
for an offset from UTC, but a leap second is an extra second inserted
into (or possibly removed from) UTC itself. There was actually a 61
second minute on Dec. 31. The trouble comes in keeping the "seconds
since the epoch" counter sane, meaning (seconds % 86400) == 0 at
00:00:00 UTC. Since there were 86401 seconds Dec. 31, the kernel had to
tick the last second twice to keep correct UTC time.

NTP is used to distribute and synchronize time information, including
leap second info.

See Wikipedia and Google for more information.
--
Chris Adams <[email protected]>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.
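
To make the "tick the last second twice" point concrete, here is a small sketch of the counter around the 2008 leap second. It assumes the leap second is represented by repeating the value for 23:59:59, which is what the xtime.tv_sec-- in the patch posted later in this thread does. The function name and the choice of starting point are made up for illustration.

```c
#include <stdint.h>

/* POSIX timestamp of 2009-01-01 00:00:00 UTC (14245 days * 86400). */
#define MIDNIGHT_2009 INT64_C(1230768000)

/*
 * elapsed: real seconds since 2008-12-31 23:59:58 UTC.
 * Returns the "seconds since the epoch" value shown at that instant,
 * with the leap second 23:59:60 repeating the value of 23:59:59.
 */
int64_t counter_across_leap(int64_t elapsed)
{
    int64_t start = MIDNIGHT_2009 - 2;   /* 23:59:58 */

    if (elapsed < 2)
        return start + elapsed;          /* 23:59:58, 23:59:59 */
    return start + elapsed - 1;          /* 23:59:60 repeats, then 00:00:00 */
}
```

With this convention the value at 00:00:00 UTC is still a multiple of 86400, at the cost of one counter value covering two real seconds.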

2009-01-03 18:02:10

by Chris Adams

Subject: Re: [PATCH] v2 Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Once upon a time, Duane Griffin <[email protected]> said:
> How about instead of a switch statement, assigning the message to a
> variable and printing that. I.e. something like:

Good point. Here's an updated version that also adds a comment to the
xtime_lock definition about not using printk.
--
Chris Adams <[email protected]>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.


From: Chris Adams <[email protected]>

The code to handle leap seconds printks an information message when the
second is inserted or deleted. It does this while holding xtime_lock.
However, printk wakes up klogd, and in some cases, the scheduler tries
to get the current kernel time, trying to get xtime_lock (which results
in a deadlock). This moved the printks outside of the lock. It also
adds a comment to not use printk while holding xtime_lock.

Signed-off-by: Chris Adams <[email protected]>
---
diff -urpN linux-2.6.28-git5-vanilla/include/linux/time.h linux-2.6.28-git5/include/linux/time.h
--- linux-2.6.28-git5-vanilla/include/linux/time.h 2009-01-02 22:09:10.000000000 -0600
+++ linux-2.6.28-git5/include/linux/time.h 2009-01-03 11:57:27.000000000 -0600
@@ -99,6 +99,12 @@ static inline struct timespec timespec_s

 extern struct timespec xtime;
 extern struct timespec wall_to_monotonic;
+
+/*
+ * Do not call printk while holding this lock; it wakes klogd and the
+ * scheduler may try to get the current kernel time, which will try to get
+ * this lock.
+ */
 extern seqlock_t xtime_lock;

 extern unsigned long read_persistent_clock(void);
diff -urpN linux-2.6.28-git5-vanilla/kernel/time/ntp.c linux-2.6.28-git5/kernel/time/ntp.c
--- linux-2.6.28-git5-vanilla/kernel/time/ntp.c 2009-01-02 22:09:34.000000000 -0600
+++ linux-2.6.28-git5/kernel/time/ntp.c 2009-01-03 11:57:46.000000000 -0600
@@ -130,6 +130,7 @@ void ntp_clear(void)
 static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)
 {
 	enum hrtimer_restart res = HRTIMER_NORESTART;
+	const char *msg = NULL;

 	write_seqlock(&xtime_lock);

@@ -140,8 +141,7 @@ static enum hrtimer_restart ntp_leap_sec
 		xtime.tv_sec--;
 		wall_to_monotonic.tv_sec++;
 		time_state = TIME_OOP;
-		printk(KERN_NOTICE "Clock: "
-		       "inserting leap second 23:59:60 UTC\n");
+		msg = "Clock: inserting leap second 23:59:60 UTC";
 		hrtimer_add_expires_ns(&leap_timer, NSEC_PER_SEC);
 		res = HRTIMER_RESTART;
 		break;
@@ -150,8 +150,7 @@ static enum hrtimer_restart ntp_leap_sec
 		time_tai--;
 		wall_to_monotonic.tv_sec--;
 		time_state = TIME_WAIT;
-		printk(KERN_NOTICE "Clock: "
-		       "deleting leap second 23:59:59 UTC\n");
+		msg = "Clock: deleting leap second 23:59:59 UTC";
 		break;
 	case TIME_OOP:
 		time_tai++;
@@ -166,6 +165,9 @@ static enum hrtimer_restart ntp_leap_sec

 	write_sequnlock(&xtime_lock);

+	if (msg)
+		printk(KERN_NOTICE "%s\n", msg);
+
 	return res;
 }
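
The idea behind the patch (remember what to say while the lock is held, say it only after the lock is dropped) can be illustrated outside the kernel with a toy model. This is only a sketch: the "lock" is a plain flag rather than a seqlock, the logger merely counts the cases where it would have needed the lock while it was already held, and every name here is made up.

```c
#include <stdio.h>
#include <stddef.h>

static int clock_lock_held;             /* stand-in for xtime_lock */
static int would_deadlock;              /* logger calls made under the lock */
static long clock_seconds = 1230768000; /* stand-in for xtime.tv_sec */

/* A logger that needs the clock (and thus the lock) to do its work. */
static void log_line(const char *msg)
{
    if (clock_lock_held) {
        would_deadlock++;               /* the real kernel would hang here */
        return;
    }
    printf("[%ld] %s\n", clock_seconds, msg);
}

/* Old shape: log while still holding the lock. */
int insert_leap_second_naive(void)
{
    clock_lock_held = 1;
    clock_seconds--;
    log_line("Clock: inserting leap second 23:59:60 UTC");
    clock_lock_held = 0;
    return would_deadlock;
}

/* Patched shape: remember the message, log after unlocking. */
int insert_leap_second_deferred(void)
{
    const char *msg = NULL;

    clock_lock_held = 1;
    clock_seconds--;
    msg = "Clock: inserting leap second 23:59:60 UTC";
    clock_lock_held = 0;

    if (msg)
        log_line(msg);
    return would_deadlock;
}
```

Both functions return the running count of problem cases, so calling the deferred variant first leaves the count at zero and the naive variant then raises it to one.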

2009-01-03 19:04:36

by Duane Griffin

Subject: Re: [PATCH] v2 Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/3 Chris Adams <[email protected]>:
> Once upon a time, Duane Griffin <[email protected]> said:
>> How about instead of a switch statement, assigning the message to a
>> variable and printing that. I.e. something like:
>
> Good point. Here's an updated version that also adds a comment to the
> xtime_lock definition about not using printk.

Good idea.

> --
> Chris Adams <[email protected]>
> Systems and Network Administrator - HiWAAY Internet Services
> I don't speak for anybody but myself - that's enough trouble.
>
>
> From: Chris Adams <[email protected]>
>
> The code to handle leap seconds printks an information message when the
> second is inserted or deleted. It does this while holding xtime_lock.
> However, printk wakes up klogd, and in some cases, the scheduler tries
> to get the current kernel time, trying to get xtime_lock (which results
> in a deadlock). This moved the printks outside of the lock. It also
> adds a comment to not use printk while holding xtime_lock.
>
> Signed-off-by: Chris Adams <[email protected]>
> ---
> diff -urpN linux-2.6.28-git5-vanilla/include/linux/time.h linux-2.6.28-git5/include/linux/time.h
> --- linux-2.6.28-git5-vanilla/include/linux/time.h 2009-01-02 22:09:10.000000000 -0600
> +++ linux-2.6.28-git5/include/linux/time.h 2009-01-03 11:57:27.000000000 -0600
> @@ -99,6 +99,12 @@ static inline struct timespec timespec_s
>
>  extern struct timespec xtime;
>  extern struct timespec wall_to_monotonic;
> +
> +/*
> + * Do not call printk while holding this lock; it wakes klogd and the
> + * scheduler may try to get the current kernel time, which will try to get
> + * this lock.
> + */
>  extern seqlock_t xtime_lock;
>
>  extern unsigned long read_persistent_clock(void);
> diff -urpN linux-2.6.28-git5-vanilla/kernel/time/ntp.c linux-2.6.28-git5/kernel/time/ntp.c
> --- linux-2.6.28-git5-vanilla/kernel/time/ntp.c 2009-01-02 22:09:34.000000000 -0600
> +++ linux-2.6.28-git5/kernel/time/ntp.c 2009-01-03 11:57:46.000000000 -0600
> @@ -130,6 +130,7 @@ void ntp_clear(void)
>  static enum hrtimer_restart ntp_leap_second(struct hrtimer *timer)
>  {
>  	enum hrtimer_restart res = HRTIMER_NORESTART;
> +	const char *msg = NULL;
>
>  	write_seqlock(&xtime_lock);
>
> @@ -140,8 +141,7 @@ static enum hrtimer_restart ntp_leap_sec
>  		xtime.tv_sec--;
>  		wall_to_monotonic.tv_sec++;
>  		time_state = TIME_OOP;
> -		printk(KERN_NOTICE "Clock: "
> -		       "inserting leap second 23:59:60 UTC\n");
> +		msg = "Clock: inserting leap second 23:59:60 UTC";
>  		hrtimer_add_expires_ns(&leap_timer, NSEC_PER_SEC);
>  		res = HRTIMER_RESTART;
>  		break;
> @@ -150,8 +150,7 @@ static enum hrtimer_restart ntp_leap_sec
>  		time_tai--;
>  		wall_to_monotonic.tv_sec--;
>  		time_state = TIME_WAIT;
> -		printk(KERN_NOTICE "Clock: "
> -		       "deleting leap second 23:59:59 UTC\n");
> +		msg = "Clock: deleting leap second 23:59:59 UTC";
>  		break;
>  	case TIME_OOP:
>  		time_tai++;
> @@ -166,6 +165,9 @@ static enum hrtimer_restart ntp_leap_sec
>
>  	write_sequnlock(&xtime_lock);
>
> +	if (msg)
> +		printk(KERN_NOTICE "%s\n", msg);
> +
>  	return res;
>  }

Looks good to me!

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan

2009-01-03 20:01:33

by Linas Vepstas

Subject: Re: [PATCH] v2 Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/3 Chris Adams <[email protected]>:

>
> From: Chris Adams <[email protected]>
>
> The code to handle leap seconds printks an information message when the
> second is inserted or deleted. It does this while holding xtime_lock.
> However, printk wakes up klogd, and in some cases, the scheduler tries
> to get the current kernel time, trying to get xtime_lock (which results
> in a deadlock). This moved the printks outside of the lock. It also
> adds a comment to not use printk while holding xtime_lock.
>
> Signed-off-by: Chris Adams <[email protected]>

Acked-by: Linas Vepstas <[email protected]>

BTW, I audited the other code in kernel/time/*.c and it looks like there
are no other printk's under the lock. Not surprising -- if there were,
they'd have been found by now. Indeed, in timekeeping.c line 198,
it seems that someone else had indeed tripped over this :-P

--linas

2009-01-03 22:59:12

by Jeffrey J. Kosowsky

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Linas Vepstas wrote at about 21:49:53 -0600 on Friday, January 2, 2009:
> 2009/1/2 Linas Vepstas <[email protected]>:
> > Slashdot reported a story of Linux machines crashing on New years eve.
>
> FYI, Looks like the bug has been found, and there's a patch!
>
> http://lkml.org/lkml/2009/1/2/389
>
> --linas
>

As the OP, good to know that all those who said "it's just a
coincidence that your machine that has been rock stable for 6 years
just happened to crash at midnight GMT when the leap second was
inserted..." were wrong :)

Thanks for the good follow-up and detective work.
Hopefully, it's not too late for Fedora to provide the patch before
support for Fedora 8 expires on the 7th...

2009-01-04 08:41:33

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Chris Adams wrote:
> The leap second isn't a simple thing like a time zone. Zones account
> for an offset from UTC
Time zones are described that way, but I was wondering why not use
zoneinfo, which describes the local time after an arbitrary number of
seconds since the epoch. The leap second is a textbook case for
updating zoneinfo.


> , but a leap second is an extra second inserted
> into (or possibly removed from) UTC itself. There was actually a 61
> second minute on Dec. 31.


> The trouble comes in keeping the "seconds
> since the epoch" counter sane, meaning (seconds % 86400) == 0 at
> 00:00:00 UTC.

That sounds like an irrelevant quality, and as we've seen, striving for
it has caused difficulties. Worse, we've now got the situation where the
number of seconds between midnight starting December 31 and midnight
starting January 1 is incorrect. The correct value is 86401, because
that's how many seconds there were.

> Since there were 86401 seconds Dec. 31, the kernel had to
> tick the last second twice to keep correct UTC time.

It didn't have to, but apparently, and regrettably, that's what was
done; leaving an even bigger problem. How many seconds does the
computer claim were in 2008? Probably not enough.

2009-01-04 08:43:45

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Ben Goodger wrote:
> 2009/1/3 David Newall <[email protected]>:
>
>> Shouldn't this be a simple zoneinfo change, whereby the last two seconds
>> of the year (in each timezone) both map to 31dec2008 23:59:59? That's
>> the way the change has worked in the real world. Why would ntp or the
>> kernel be involved?
>>
>
> Actually, the change has worked in the real world with the
> introduction of a new second named 23:59:60

Fine. However you want to describe that last second is immaterial. The
point is that diddling the clock is not a true solution. Take seconds
since epoch for January 1 and subtract the seconds since epoch for the
previous day and if the result isn't 86401 it's wrong. Is Linux wrong?
(I gather it is.)
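
The arithmetic in question can be checked directly. The sketch below assumes the POSIX convention, in which every day contributes exactly 86400 to the counter, and uses a hypothetical one-entry leap table to recover the number of seconds that really elapsed. The table and function names are illustrative only.

```c
#include <stddef.h>
#include <stdint.h>

#define MIDNIGHT_2008_12_31 INT64_C(1230681600)
#define MIDNIGHT_2009_01_01 INT64_C(1230768000)

/*
 * POSIX timestamp of the first second after each inserted leap second.
 * Illustrative only: just the end-of-2008 leap second is listed.
 */
static const int64_t leap_ends[] = { MIDNIGHT_2009_01_01 };

/* Plain subtraction of two POSIX timestamps. */
int64_t timestamp_difference(int64_t from, int64_t to)
{
    return to - from;
}

/* Real seconds elapsed: the difference plus any leap seconds in (from, to]. */
int64_t elapsed_seconds(int64_t from, int64_t to)
{
    int64_t n = to - from;
    size_t i;

    for (i = 0; i < sizeof(leap_ends) / sizeof(leap_ends[0]); i++)
        if (leap_ends[i] > from && leap_ends[i] <= to)
            n++;
    return n;
}
```

Between the two midnights the plain difference is 86400, while the table-corrected count is 86401. Which of the two an application should see is exactly what this subthread is arguing about.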

2009-01-04 09:00:29

by Kyle Moffett

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Sun, Jan 4, 2009 at 3:43 AM, David Newall <[email protected]> wrote:
> Ben Goodger wrote:
>> 2009/1/3 David Newall <[email protected]>:
>> Actually, the change has worked in the real world with the
>> introduction of a new second named 23:59:60
>
> Fine. However you want to describe that last second is immaterial. The
> point is that diddling the clock is not a true solution. Take seconds
> since epoch for January 1 and subtract the seconds since epoch for the
> previous day and if the result isn't 86401 it's wrong. Is Linux wrong?
> (I gather it is.)

Actually, "diddling the clock" is really the only valid solution to
the leap-second problem. The leap-second is such a fine adjustment
that it is actually affected by random "noise" introduced into the
solar-system from the chaotic gravitational interactions of the
planets with each other. It's impossible to reliably calculate which
future years will have leap seconds, and in which direction they will
occur.

Our calendar year is pretty damn close after we have accounted for the
standard leap-year algorithm, but that algorithm cannot be modified
without breaking a great number of existing date-time systems. The
proper answer (currently implemented in systems all over the world)
coordinates atomic-clock systems across the world with the measured
traversal of the earth (as referenced against the sun and the stars).
If the clocks are slightly "ahead" of where they should be, a leap
second is scheduled to be inserted, and if they're behind, a second is
removed. The flow-of-time is then adjusted for the last minute so
that it runs either 101.6949% of the normal rate (59-second minute) or
98.3606% of the normal rate (61-second minute).

The effective end result is that time actually flows smoothly but the
assumed date of the epoch is adjusted slightly relative to real time
based on subtle fluctuations of the earth's rotation and orbit.

Cheers,
Kyle Moffett
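
The rates for a slewed final minute follow from a single division: 60 nominal seconds have to fit into however many real seconds that minute lasts. A tiny sketch, with a made-up function name:

```c
/*
 * Clock rate, as a percentage of normal, needed to count 60 nominal
 * seconds during a final minute that lasts real_seconds real seconds.
 */
double slew_rate_percent(int real_seconds)
{
    return 100.0 * 60.0 / (double)real_seconds;
}
```

For a 59-second minute this gives 60/59, about 101.69%, and for a 61-second minute 60/61, about 98.36%.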

2009-01-04 10:04:08

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Kyle Moffett wrote:
> Actually, "diddling the clock" is really the only valid solution to
> the leap-second problem. The leap-second is such a fine adjustment
> that it is actually affected by random "noise" introduced into the
> solar-system from the chaotic gravitational interactions of the
> planets with each other. It's impossible to reliably calculate which
> future years will have leap seconds, and in which direction they will
> occur.
>

You're confusing the system of keeping time with those characteristics
of the real-world which it represents. They are, in fact, two different
things, hence we regularly adjust the system. Now in the case of UNIX
and derivatives, the system records the number of seconds since an
arbitrary point-in-time, and presents a "wall time" (i.e. the time
displayed by the clock on the wall) using, amongst other things, a set
of adjustment rules codified by a zoneinfo file. The number of seconds
between 1 minute to- and midnight-ending 31 December is 61. If Linux
does not reflect that it is wrong and must be fixed. If it isn't fixed
we will increasingly discover a discrepancy between time-data that
originates on Linux versus other, correct systems.

I don't understand why such a simple thing was unnecessarily
complicated. And causing crashes! Ha ha ha or what? A simple addition
to zoneinfo was (and still is) all that is required.

2009-01-04 10:11:26

by David Lang

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Sun, 4 Jan 2009, David Newall wrote:

> Kyle Moffett wrote:
>> Actually, "diddling the clock" is really the only valid solution to
>> the leap-second problem. The leap-second is such a fine adjustment
>> that it is actually affected by random "noise" introduced into the
>> solar-system from the chaotic gravitational interactions of the
>> planets with each other. It's impossible to reliably calculate which
>> future years will have leap seconds, and in which direction they will
>> occur.
>>
>
> You're confusing the system of keeping time with those characteristics
> of the real-world which it represents. They are, in fact, two different
> things, hence we regularly adjust the system. Now in the case of UNIX
> and derivatives, the system records the number of seconds since an
> arbitrary point-in-time, and presents a "wall time" (i.e. the time
> displayed by the clock on the wall) using, amongst other things, a set
> of adjustment rules codified by a zoneinfo file. The number of seconds
> between 1 minute to- and midnight-ending 31 December is 61. If Linux
> does not reflect that it is wrong and must be fixed. If it isn't fixed
> we will increasingly discover a discrepancy between time-data that
> originates on Linux versus other, correct systems.
>
> I don't understand why such a simple thing was unnecessarily
> complicated. And causing crashes! Ha ha ha or what? A simple addition
> to zoneinfo was (and still is) all that is required.

so are you saying that other 'correct' OS's have patches issued every time
a leap second is declared so that they have an in-kernel table of them to
use to calculate the correct time?

what about systems that have hit end of life? what about systems that
users don't want to have to reboot to install a new kernel for a 1 second
shift (which NTP will take care of as far as they are concerned anyway)

when the daylight savings time definitions change all the vendors had to
issue patches, I saw those. I didn't see any patches for the leap second,
so how do these other systems deal with it?

David Lang

2009-01-04 11:36:16

by Valdis Klētnieks

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Sun, 04 Jan 2009 20:33:41 +1030, David Newall said:

> I don't understand why such a simple thing was unnecessarily
> complicated. And causing crashes! Ha ha ha or what? A simple addition
> to zoneinfo was (and still is) all that is required.

Something to keep in mind is that the Posix standard does *NOT* say anything
about leap seconds - poke around in a 'struct tm' sometime.
That's why /usr/share/zoneinfo has separate 'posix' and 'right' subdirectories.

The fun starts when software using the 'right' rules tries to interact with
other software using the Posix rules (quite possibly running on a non-Unixy
system that doesn't even *use* zoneinfo).

Repeat after me: Not all the world is Linux.


2009-01-04 16:16:20

by Sitsofe Wheeler

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

[email protected] wrote:
> so are you saying that other 'correct' OS's have patches issued every
> time a leap second is declared so that they have an in-kernel table of
> them to use to calculate the correct time?

I think the number of other "correct" OSes that actually step the time
on leap seconds is not that large (at least doing the announcement via
NTP). According to http://www.ntp.org/ntpfaq/NTP-s-algo-real.htm#AEN2499
leap seconds are only changed via stepping if you have the right kernel
discipline (notes on how to check whether a given OS has the kernel
discipline are mentioned on
http://www.ntp.org/ntpfaq/NTP-s-algo-kernel.htm#AEN2220 ).

I have a feeling that OSX doesn't do it (there's a mailing list post
from 2005 where someone was trying to add FreeBSD's ntp_adjtime to
Darwin
http://lists.apple.com/archives/Darwin-kernel/2005/Jan/msg00004.html ).
Additionally folks I know using ntpd synchronized OSX machines said
their machines were off by one second right after the new year.

Windows is also known not to do it without slewing:
http://www.meinberg.de/english/info/leap-second.htm#os .

2009-01-04 17:20:20

by Kyle Moffett

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Sun, Jan 4, 2009 at 5:03 AM, David Newall <[email protected]> wrote:
> You're confusing the system of keeping time with those characteristics
> of the real-world which it represents. They are, in fact, two different
> things, hence we regularly adjust the system. Now in the case of UNIX
> and derivatives, the system records the number of seconds since an
> arbitrary point-in-time, and presents a "wall time" (i.e. the time
> displayed by the clock on the wall) using, amongst other things, a set
> of adjustment rules codified by a zoneinfo file. The number of seconds
> between 1 minute to- and midnight-ending 31 December is 61. If Linux
> does not reflect that it is wrong and must be fixed. If it isn't fixed
> we will increasingly discover a discrepancy between time-data that
> originates on Linux versus other, correct systems.
>
> I don't understand why such a simple thing was unnecessarily
> complicated. And causing crashes! Ha ha ha or what? A simple addition
> to zoneinfo was (and still is) all that is required.

Leap seconds are an integral part of the NTP standard for the reasons
I described. You can't "update zoneinfo" because a leap second is
applied to *all* timezones... not just a single one. Specifically,
each NTP message includes some bits indicating what the next
leap-second is going to be (at the end of the current month), whether
+1, 0, or -1.

I believe that under Linux if you request a monotonic clock then you
won't "experience" leap-seconds at all; although such a clock will
probably stop while your computer is suspended. On the other hand, if
you explicitly ask for a wall-clock, it is the responsibility of NTP
to keep the wall-clock accurate to the actual passage of days, even if
that involves slight slewing adjustments.

The UTC timezone is explicitly defined to include "leap seconds", and
so we cannot honestly claim to implement the standard unless we
provide a method for those leap seconds to be applied. If you don't
want leap-seconds, submit a patch to the ntp daemon to allow it to run
in "UT1" mode in which it will ignore leap second notifications over
the NTP protocol, or just use a GPS clock.

Cheers,
Kyle Moffett
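
The "some bits" mentioned above are the two-bit leap indicator (LI) at the top of the first byte of the NTP packet header. A minimal decoding sketch follows; the enum and function names are made up, while the bit layout and the meaning of the four values are those of the NTP packet format.

```c
#include <stdint.h>

enum {
    NTP_LI_NONE   = 0,  /* no warning */
    NTP_LI_INSERT = 1,  /* last minute of the day has 61 seconds */
    NTP_LI_DELETE = 2,  /* last minute of the day has 59 seconds */
    NTP_LI_UNSYNC = 3   /* unknown, clock not synchronized */
};

/* Extract the leap indicator from the first header byte (LI | VN | Mode). */
int ntp_leap_indicator(uint8_t first_byte)
{
    return (first_byte >> 6) & 0x3;
}

/* Map the indicator to +1 (insert), -1 (delete) or 0 (nothing pending). */
int ntp_leap_adjustment(uint8_t first_byte)
{
    switch (ntp_leap_indicator(first_byte)) {
    case NTP_LI_INSERT:
        return 1;
    case NTP_LI_DELETE:
        return -1;
    default:
        return 0;
    }
}
```

For example, a first byte of 0x63 (LI=1, version 4, client mode) announces an inserted second, and 0xA3 (LI=2) a deleted one.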

2009-01-04 17:27:15

by Kyle Moffett

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Sun, Jan 4, 2009 at 11:15 AM, Sitsofe Wheeler <[email protected]> wrote:
> Windows is also known not to do it without slewing:
> http://www.meinberg.de/english/info/leap-second.htm#os .

Well... Microsoft "[does] not guarantee and [does] not support the
accuracy of the W32Time service between nodes on a network. The
W32Time service is not a full-featured NTP solution that meets
time-sensitive application needs." (See
http://support.microsoft.com/kb/939322). The w32time daemon is not
guaranteed to be within a few seconds of UTC at *any* time of the
year, let alone immediately after a leap-second. In addition, windows
does not have any built-in interpolation between timer ticks, so time
increases in ~15ms steps regardless of how accurate your clock is.

Cheers,
Kyle Moffett

2009-01-04 23:16:19

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

[email protected] wrote:
> so are you saying that other 'correct' OS's have patches issued every
> time a leap second is declared so that they have an in-kernel table of
> them to use to calculate the correct time?

No. Exactly the contrary. I'm saying that through use of zoneinfo, for
example, no kernel support is required for leap seconds. And! this
provides correct results for seconds-between two dates.

2009-01-04 23:25:54

by Chris Adams

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Once upon a time, David Newall <[email protected]> said:
> No. Exactly the contrary. I'm saying that through use of zoneinfo, for
> example, no kernel support is required for leap seconds. And! this
> provides correct results for seconds-between two dates.

Again: zoneinfo provides offset from UTC. Leap seconds are changes in
UTC itself, not time zones, so zoneinfo can't handle that.

Please go read Google, Wikipedia, and NTP lists.
--
Chris Adams <[email protected]>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.

2009-01-04 23:28:06

by David Lang

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Mon, 5 Jan 2009, David Newall wrote:

> [email protected] wrote:
>> so are you saying that other 'correct' OS's have patches issued every
>> time a leap second is declared so that they have an in-kernel table of
>> them to use to calculate the correct time?
>
> No. Exactly the contrary. I'm saying that through use of zoneinfo, for
> example, no kernel support is required for leap seconds. And! this
> provides correct results for seconds-between two dates.

then new zoneinfo files need to be sent out every time there is a leap
second (which from other posts on this thread is potentially every month)

and if it is something to be fixed in zoneinfo, then complaining to the
kernel list and demanding that 'Linux be fixed' is not productive.

David Lang

2009-01-04 23:37:50

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

[email protected] wrote:
> then new zoneinfo files need to be sent out every time there is a leap
> second (which from other posts on this thread is potentially every month)

Not every few months, for goodness sake! Leap seconds aren't that
common! These files do change regularly, however, sometimes on a yearly
basis, because that's how often the date might be changed when daylight
savings transitions. This is to say that leap seconds don't
particularly change the frequency of zoneinfo updates.

> if it is something to be fixed in zoneinfo, then complaining to the
> kernel list and demanding that 'Linux be fixed' is not productive.

Updating zoneinfo is trivial. On the other hand if something has been
done to Linux to support leap seconds (I gather this is the case), then
the point is that it need not have been done, should not have been done,
and needs to be replaced by the standard tools that have worked
satisfactorily for decades, and continue doing so. So yes, I think it
is productive.

2009-01-05 00:01:49

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Chris Adams wrote:
> Once upon a time, David Newall <[email protected]> said:
>
>> No. Exactly the contrary. I'm saying that through use of zoneinfo, for
>> example, no kernel support is required for leap seconds. And! this
>> provides correct results for seconds-between two dates.
>>
>
> Again: zoneinfo provides offset from UTC. Leap seconds are changes in
> UTC itself, not time zones, so zoneinfo can't handle that.
>

Yes, but zoneinfo ALSO provides support for leap seconds. Do read man
zic for specific details.

> Please go read Google, Wikipedia, and NTP lists.

I think you particularly mean NTP. I think your reasoning is that
because NTP's timestamp doesn't include leap seconds, and because we all
like to use NTP to synchronise our clocks, Linux has to make up the
difference. But there is an alternative; which is for the NTP client to
insert those missing leap seconds, which number it can get from
zoneinfo. Epoch remains the start of 1970; seconds between any two
dates include leap-seconds and no special kernel support is required.

2009-01-05 00:04:18

by David Lang

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Mon, 5 Jan 2009, David Newall wrote:

> [email protected] wrote:
>> then new zoneinfo files need to be sent out every time there is a leap
>> second (which from other posts on this thread is potentially every month)
>
> Not every few months, for goodness sake! Leap seconds aren't that
> common! These files do change regularly, however, sometimes on a yearly
> basis, because that's how often the date might be changed when daylight
> savings transitions. This is to say that leap seconds don't
> particularly change the frequency of zoneinfo updates.

another poster said that NTP packets include information about this
month's leap second, so that implies that they could change monthly.

the zoneinfo files normally do not change every year, they only change
when !$#@$# politicians decide to monkey with things and change the rules
(for the US this is once in the history of Linux IIRC), the aggregate of
some country somewhere changing the rules means that more updates are
created, but most people can ignore those updates if they don't directly
apply to them.

David Lang

2009-01-05 00:09:19

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

[email protected] wrote:
> On Sun, 04 Jan 2009 20:33:41 +1030, David Newall said:
>
>
>> I don't understand why such a simple thing was unnecessarily
>> complicated. And causing crashes! Ha ha ha or what? A simple addition
>> to zoneinfo was (and still is) all that is required.
>>
>
> Something to keep in mind is that the Posix standard does *NOT* say anything
> about leap seconds - poke around in a 'struct tm' sometime.
>

I have poked, decades ago. There's nothing in struct tm that's a problem.

> That's why /usr/share/zoneinfo has separate 'posix' and 'right' subdirectories.
>
> The fun starts when software using the 'right' rules tries to interact with
> other software using the Posix rules (quite possibly running on a non-Unixy
> system that doesn't even *use* zoneinfo).
>
> Repeat after me: Not all the world is Linux.

But Linux is; and that's what we're discussing.

2009-01-05 00:14:46

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

[email protected] wrote:
> On Mon, 5 Jan 2009, David Newall wrote:
>
>> [email protected] wrote:
>>> then new zoneinfo files need to be sent out every time there is a leap
>>> second (which from other posts on this thread is potentially every
>>> month)
>>
>> Not every few months, for goodness sake! Leap seconds aren't that
>> common! These files do change regularly, however, sometimes on a yearly
>> basis, because that's how often the date might be changed when daylight
>> savings transitions. This is to say that leap seconds don't
>> particularly change the frequency of zoneinfo updates.
>
> another poster said that NTP packets include information about this
> month's leap second, so that implies that they could change monthly.

Not "could change monthly" rather, "could change at any month".

The frequency of zoneinfo updates would therefore be: every time the
zones you care about change; and every time there's a leap second. No
big effort.

2009-01-05 00:21:22

by Ben Goodger

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/5 David Newall <[email protected]>:
>> another poster said that NTP packets include information about this
>> month's leap second, so that implies that they could change monthly.
>
> Not "could change monthly" rather, "could change at any month".
>
> The frequency of zoneinfo updates would therefore be: every time the
> zones you care about change; and every time there's a leap second. No
> big effort.

Unfortunately, as has been pointed out, timezones are completely
unrelated to leap seconds.
NB. Leap seconds, positive or negative, potentially occur every six
months (June 30 or Dec 31), but since their introduction this
frequency has happened only once (in 1972); historically they have
been inserted on average every 1.5 years, but there have been only two
since 2000.

--
Benjamin Goodger

-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/S/M/B d- s++:-- a18 c++$ UL>+++ P--- L++>+++ E- W+++$ N--- K? w---
O? M- V? PS+(++) PE-() Y+ PGP+ t 5? X-- R- !tv() b+++>++++ DI+++ D+ G
e>++++ h! !r*(-) y
------END GEEK CODE BLOCK------

2009-01-05 00:41:46

by Alan

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> zoneinfo. Epoch remains the start of 1970; seconds between any two
> dates included leap-seconds and no special kernel support is required.

Your time() values then disagree with the rest of the universe. See POSIX
1003.1 Annex B 2.2.2 if you want the whole story.

For any given time based on the 1970 Epoch there is a single correct
answer for the translation between each value and a UTC time.

2009-01-05 00:44:36

by Alan

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> another poster said that NTP packets include information about this
> month's leap second, so that implies that they could change monthly.

The current rules don't permit this, but any rule could be changed. However,
it's unlikely. Currently the earth rotation people do this twice a year if
necessary. It is also too complex to predict significantly (i.e. years) in
advance when it will be necessary, because the complexity of the tidal
forces involved is beyond simulation.

2009-01-05 05:48:42

by Linas Vepstas

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/4 David Newall <[email protected]>:
> [email protected] wrote:
>> then new zoneinfo files need to be sent out every time there is a leap
>> second (which from other posts on this thread is potentially every month)
>
> Not every few months, for goodness sake! Leap seconds aren't that
> common!

But they could be. Apparently, there was a very long,
multi-year lull, where no leap-seconds were required.
It's not fully understood why, as they used to be common.
Maybe it's the melting glaciers :-)

There *was* talk of eliminating them forever (so as to
avoid this kind of bug, which affects banks, satellites,
telecom equipment, etc.) but I guess they didn't do it.

--linas

2009-01-05 14:33:51

by Nick Andrew

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Sun, Jan 04, 2009 at 11:48:31PM -0600, Linas Vepstas wrote:
> There *was* talk of eliminating them forever (so as to
> avoid this kind of bug, which affects banks, satellites,
> telecom equipment, etc.) but I guess they didn't do it.

I can sympathise with the opinion that linux should be able to accurately
distinguish xx:59:60 when a leap second is added (or the missing :59 when
one is subtracted) but not at the expense of making a day which is not
86400 seconds long.

To fix the problem would require accurately modeling international
timekeeping standards such as TAI and use of different syscalls to
return time in TAI and UTC-with-leap-seconds represented. It
wouldn't be good to change the semantics of time().

* http://en.wikipedia.org/wiki/International_Atomic_Time
* http://en.wikipedia.org/wiki/Leap_second

Arguably the kernel's responsibility should be to keep track of the
most fundamental representation of time possible for a machine (that's
probably TAI) and it is a userspace responsibility to map from that
value to other time standards including UTC, using control files
which are updated as leap seconds are declared. Just so long as the
existing behaviour of time() which doesn't recognise leap seconds
is preserved.

Nick.

2009-01-05 16:09:06

by Linas Vepstas

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/5 Nick Andrew <[email protected]>:
> On Sun, Jan 04, 2009 at 11:48:31PM -0600, Linas Vepstas wrote:
>> There *was* talk of eliminating them forever (so as to
>> avoid this kind of bug, which affects banks, satellites,
>> telecom equipment, etc.) but I guess they didn't do it.
>
> I can sympathise with the opinion that linux should be able to accurately
> distinguish xx:59:60 when a leap second is added (or the missing :59 when
> one is subtracted) but not at the expense of making a day which is not
> 86400 seconds long.

Careful: This seems to be *exactly* the intent of the
maintainers of the UTC definition: some days really
will have 86401 seconds in them. That's why there's
all this talk of 'solar time' (see e.g. the wikipedia article)

> To fix the problem would require accurately modeling international
> timekeeping standards such as TAI and use of different syscalls to
> return time in TAI and UTC-with-leap-seconds represented. It
> wouldn't be good to change the semantics of time().

Now, this is the first proposal that I've heard that makes
sense. I believe that the Linux kernel/userspace
infrastructure already "accurately models international
timekeeping standards", so we're good.

Changing the kernel to track TAI instead of UTC seems
like an excellent idea -- but not one without a significant
amount of work -- maybe new syscalls are needed, as
well as new monkeying-about in glibc, maybe in ntpd, etc.

> * http://en.wikipedia.org/wiki/International_Atomic_Time
> * http://en.wikipedia.org/wiki/Leap_second
>
> Arguably the kernel's responsibility should be to keep track of the
> most fundamental representation of time possible for a machine (that's
> probably TAI) and it is a userspace responsibility to map from that
> value to other time standards including UTC,

Yes, this really does seem like the right solution.

> using control files
> which are updated as leap seconds are declared.

Let's be clear on what "control files" means. This does
*NOT* mean some config file shipped by some distro
for some package. That would be a horrid solution.
People don't install updates, patches, etc. Distros
ship them late, or never, if the distro is old enough.

A more appropriate solution would be to have
either the kernel or ntpd track the leap seconds
automatically. First, the ntp protocol already provides
the needed notification of a leap second to anyone
who cares about it (i.e. there is no point in getting a
Linux distro involved in this -- a distribution mechanism
already exists, and works *better* than having a distro
do it).
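
For reference, the notification mentioned above is the two-bit Leap
Indicator carried in the first byte of every NTP packet header (RFC 1305).
A minimal sketch of decoding it follows; the function name
parse_ntp_flags is made up for illustration, and this is not ntpd's code.

```python
# Decode the first byte of an NTP packet header (RFC 1305 layout):
# bits 7-6 = Leap Indicator, bits 5-3 = version, bits 2-0 = mode.
LEAP_MEANING = {
    0: "no warning",
    1: "last minute of the day has 61 seconds",
    2: "last minute of the day has 59 seconds",
    3: "alarm condition (clock not synchronized)",
}

def parse_ntp_flags(first_byte):
    """Split the first header byte into (leap, version, mode)."""
    leap = (first_byte >> 6) & 0x3
    version = (first_byte >> 3) & 0x7
    mode = first_byte & 0x7
    return leap, version, mode

# 0x5C = LI 1, version 3, mode 4 (server): a leap second is pending.
leap, version, mode = parse_ntp_flags(0x5C)
print(leap, version, mode, LEAP_MEANING[leap])
```

Note that the field only announces a pending leap second for the current
day; it carries no history and no TAI offset.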

If the kernel needs to track leap seconds, it could do
so using a mechanism similar to the "random pool"
that is saved across reboots. Alternately, ntpd already
stores slew rates etc. in files, and could track leap
seconds likewise.

> Just so long as the
> existing behaviour of time() which doesn't recognise leap seconds
> is preserved.

Well, 'man 2 time' is as clear as mud. It talks about leap seconds,
but I can't figure out what it's saying. I rather
doubt that time() is doing what POSIX.1 seems to want
it to do (which is to ignore leap seconds?)

The reason I'm guessing that time() is wrong, is because
it seems that POSIX wants time() to use TAI time, and
we don't have that handy anywhere (because we've lost
track of those leap seconds)

--linas

2009-01-05 16:50:01

by David Lang

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Mon, 5 Jan 2009, Linas Vepstas wrote:

>> Arguably the kernel's responsibility should be to keep track of the
>> most fundamental representation of time possible for a machine (that's
>> probably TAI) and it is a userspace responsibility to map from that
>> value to other time standards including UTC,
>
> Yes, this really does seem like the right solution.
>
>> using control files
>> which are updated as leap seconds are declared.
>
> Let's be clear on what "control files" means. This does
> *NOT* mean some config file shipped by some distro
> for some package. That would be a horrid solution.
> People don't install updates, patches, etc. Distros
> ship them late, or never, if the distro is old enough.
>
> A more appropriate solution would be to have
> either the kernel or ntpd track the leap seconds
> automatically. First, the ntp protocol already provides
> the needed notification of a leap second to anyone
> who cares about it (i.e. there is no point in getting a
> Linux distro involved in this -- a distribution mechanism
> already exists, and works *better* than having a distro
> do it).

I disagree with this. NTP will only know about leap seconds if it was
running and connected to a server that advertised the leap seconds during
that month.

for example, if you installed a new server today, how would it ever know
that there was a leap second a couple of days ago?

David Lang

> If the kernel needs to track leap seconds, it could do
> so using a mechanism similar to the "random pool"
> that is saved across reboots. Alternately, ntpd already
> stores slew rates etc. in files, and could track leap
> seconds likewise.
>
>> Just so long as the
>> existing behaviour of time() which doesn't recognise leap seconds
>> is preserved.
>
> Well, 'man 2 time' is as clear as mud. It talks about leap seconds,
> but I can't figure out what it's saying. I rather
> doubt that time() is doing what POSIX.1 seems to want
> it to do (which is to ignore leap seconds?)
>
> The reason I'm guessing that time() is wrong, is because
> it seems that POSIX wants time() to use TAI time, and
> we don't have that handy anywhere (because we've lost
> track of those leap seconds)
>
> --linas
>

2009-01-05 17:42:47

by Linas Vepstas

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/5 <[email protected]>:
> On Mon, 5 Jan 2009, Linas Vepstas wrote:
>
>>> Arguably the kernel's responsibility should be to keep track of the
>>> most fundamental representation of time possible for a machine (that's
>>> probably TAI) and it is a userspace responsibility to map from that
>>> value to other time standards including UTC,
>>
>> Yes, this really does seem like the right solution.
>>
>>> using control files
>>> which are updated as leap seconds are declared.
>>
>> Let's be clear on what "control files" means. This does
>> *NOT* mean some config file shipped by some distro
>> for some package. That would be a horrid solution.
>> People don't install updates, patches, etc. Distros
>> ship them late, or never, if the distro is old enough.
>>
>> A more appropriate solution would be to have
>> either the kernel or ntpd track the leap seconds
>> automatically. First, the ntp protocol already provides
>> the needed notification of a leap second to anyone
>> who cares about it (i.e. there is no point in getting a
>> Linux distro involved in this -- a distribution mechanism
>> already exists, and works *better* than having a distro
>> do it).
>
> I disagree with this. NTP will only know about leap seconds if it was
> running and connected to a server that advertised the leap seconds during
> that month.
>
> for example, if you installed a new server today, how would it ever know
> that there was a leap second a couple of days ago?

OK, good point. Unless your distro was less
than a few days old (unlikely), you are faced with the
same problem. Sure, eventually, the distro will publish
an update (which will add to the existing list of 24 leap
seconds -- which is needed in any case, since no one
has a server that's been up since 1972), but this is
unlikely to happen during this install window.

The long term solution would be to write an RFC to extend
NTP to also provide TAI information -- e.g. to add a
message that indicates the current leap-second offset
between UTC and TAI.

--linas

2009-01-05 19:18:16

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Alan Cox wrote:
>> zoneinfo. Epoch remains the start of 1970; seconds between any two
>> dates included leap-seconds and no special kernel support is required.
>>
>
> Your time() values then disagree with the rest of the universe. See POSIX
> 1003.1 Annex B 2.2.2 if you want the whole story.
>

I can't find this, except possibly (but maybe not) at a cost from IEEE,
and I'm not inclined to pay. If you could post a sentence from this
annex it might help me to find it.


> For any given time based on the 1970 Epoch there is a single correct
> answer for the translation between each value and a UTC time.

This confused me because the sense that I've got from this thread
suggests otherwise. Unless I've misunderstood, the time() value for the
first second of 2009 is one greater than the value for the second to
last second of 2008 (i.e. 23:59:59), which means that there is no
translation for the last second. Put another way, my understanding of
what's been said is that the epoch is effectively increased by one
second for each leap second. Have I got this wrong?

2009-01-05 19:34:37

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Ben Goodger wrote:
> 2009/1/5 David Newall <[email protected]>:
>
>>> another poster said that NTP packets include information about this
>>> month's leap second, so that implies that they could change monthly.
>>>
>> Not "could change monthly" rather, "could change at any month".
>>
>> The frequency of zoneinfo updates would therefore be: every time the
>> zones you care about change; and every time there's a leap second. No
>> big effort.
>>
>
> Unfortunately, as has been pointed out, timezones are completely
> unrelated to leap seconds.
>

Zoneinfo files cater for leap seconds.

> NB. Leap seconds, positive or negative, potentially occur every six
> months (June 30 or Dec 31), but since their introduction this
> frequency has happened only once (in 1972); historically they have
> been inserted on average every 1.5 years, but there have been only two
> since 2000.

If you know in advance, you can update zoneinfo files with multiple leap
seconds.

2009-01-05 19:47:34

by Alan

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> > For any given time based on the 1970 Epoch there is a single correct
> > answer for the translation between each value and a UTC time.
>
> This confused me because the sense that I've got from this thread
> suggests otherwise. Unless I've misunderstood, the time() value for the
> first second of 2009 is one greater than the value for the second to
> last second of 2008 (i.e. 23:59:59), which means that there is no
> translation for the last second. Put another way, my understanding of
> what's been said is that the epoch is effectively increased by one
> second for each leap second. Have I got this wrong?

No, I should have said from a UTC time to a value; the reverse is slightly
ambiguous - as you say, leap seconds cannot be distinguished (well, unless
you are using floating point, but that's a whole can of worms).

Glibc has /usr/share/zoneinfo/right as well as posix zones which I guess
is Ulrich's vote on the subject.

In a strictly POSIX environment, for 1003.1 post-2001, the definition
is non-leap seconds since (a notional) 1/1/70 UTC 00:00:00. Including
leap seconds in the definition would have caused problems with existing
date stamps moving them by about half a minute.

The kernel doesn't give a brass monkeys about interpretation on the whole
with one main exception - the CMOS RTC time conversion is done without
factoring in leap seconds.

Alan

2009-01-05 23:03:57

by Linas Vepstas

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/5 David Newall <[email protected]>:

> Zoneinfo files cater for leap seconds.

As has been (repeatedly) pointed out, the leap seconds
only apply to UTC, so there is no way, given UTC, to
use a zoneinfo file to twiddle UTC.

Your argument *might* work if the kernel maintained
TAI instead of UTC. Then there could, in principle, be
a zoneinfo file that converted from TAI to UTC (by
subtracting 34 seconds from TAI).

However, this requires converting the kernel to track
TAI instead of UTC, and reviewing all sorts of code
in the kernel, glibc, ntp, and myriads of other libraries
to figure out what's affected and what's not. As well
they're backwards/forwards compatible with the
timekeeping change, so that users aren't screwed
when they put new kernels on old distros, or new
zonefiles on old kernels.

This is a fairly big chunk of work, requiring coordination
between lots of different parties.

> If you know in advance, you can update zoneinfo files with multiple leap
> seconds.

Heh. You miss the point. The whole point of leap seconds
is that they're unknowable in advance. You only know if they
already happened, or seem likely to happen real soon now.
The previously cited wikipedia article reviews this nicely.

--linas

p.s. Yes, you could say I'm coming around to your point
of view. If you had said, from the beginning, something
like "the kernel should keep TAI instead of UTC, and
compute UTC in user-space from the TAI time", then
you might have met a lot less resistance. But that's not
how your argument came across.

2009-01-06 02:00:18

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Nick Andrew wrote:
> I can sympathise with the opinion that linux should be able to accurately
> distinguish xx:59:60 when a leap second is added (or the missing :59 when
> one is subtracted) but not at the expense of making a day which is not
> 86400 seconds long.
>

Some days are not 86400 seconds long. That's a fact and regardless of
how inconvenient it is, we have to live with it. Some years don't have
365 days; some months don't have 30 days; some Februaries don't have 28
days; and now, some days don't have 86400 seconds. What's the point in
fighting this?

If you want to know the days between two times, dividing by 86400
doesn't cut it.


> Arguably the kernel's responsibility should be to keep track of the
> most fundamental representation of time possible for a machine (that's
> probably TAI) and it is a userspace responsibility to map from that
> value to other time standards including UTC, using control files
> which are updated as leap seconds are declared.

We have this already; zoneinfo

> Just so long as the
> existing behaviour of time() which doesn't recognise leap seconds
> is preserved.

I haven't been able to find this Annex B that Alan talked of, so I can
only go by the man page, which states, simply and explicitly, that
time() returns seconds since Epoch, and also that Epoch is start of
January 1 1970. To my mind, time *does* recognise leap seconds.

2009-01-06 02:18:51

by Chris Adams

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Once upon a time, David Newall <[email protected]> said:
> We have this already; zoneinfo

How many times: zoneinfo is for offset from UTC, not changes in UTC.

> I haven't been able to find this Annex B that Alan talked of, so I can
> only go by the man page, which states, simply and explicitly, that
> time() returns seconds since Epoch, and also that Epoch is start of
> January 1 1970. To my mind, time *does* recognise leap seconds.

Part of the rationale for SUSv3 (aka 1003.1-2001), xbd_chap04.html in my
copy:

The topic of whether seconds since the Epoch should account for leap
seconds has been debated on a number of occasions, and each time
consensus was reached (with acknowledged dissent each time) that the
majority of users are best served by treating all days identically.
(That is, the majority of applications were judged to assume a single
length -- as measured in seconds since the Epoch -- for all days. Thus,
leap seconds are not applied to seconds since the Epoch.) Those
applications which do care about leap seconds can determine how to
handle them in whatever way those applications feel is best. This
was particularly emphasized because there was disagreement about what
the best way of handling leap seconds might be. It is a practical
impossibility to mandate that a conforming implementation must have a
fixed relationship to any particular official clock (consider
isolated systems, or systems performing "reruns" by setting the clock
to some arbitrary time).

Now, you are wrong, the standard says so, please take this somewhere
else and stop CCing me.

--
Chris Adams <[email protected]>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.

2009-01-06 02:21:48

by john stultz-lkml

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Fri, Jan 2, 2009 at 4:21 PM, Chris Adams <[email protected]> wrote:
> Once upon a time, Linas Vepstas <[email protected]> said:
>> Below follows a summary of the reported crashes. I'm ignoring the
>> zillions of "mine didn't crash" reports, or the "you're a paranoid
>> conspiracy theorist, its random chance" reports.
>
> I have reproduced this and got a stack trace (this is with Fedora 8 and
> kernel kernel-2.6.26.6-49.fc8.x86_64):
>
[snip]
> Basically (to my untrained eye), the leap second code is called from the
> timer interrupt handler, which holds xtime_lock. The leap second code
> does a printk to notify about the leap second. The printk code tries to
> wake up klogd (I assume to prioritize kernel messages), and (under some
> conditions), the scheduler attempts to get the current time, which tries
> to get xtime_lock => deadlock.

This analysis looks correct to me.

Grrrr. This has bit us a few times since the "no printk while holding
the xtime lock" restriction was added.

Thomas: Do you think this warrants adding a check to the printk path
to make sure the xtime lock isn't held? This way we can at least get a
warning when someone accidentally adds a printk or calls a function
that does while holding the xtime_lock.
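
The deadlock in the analysis above, and the kind of check being proposed,
can be imitated in user space. This is a toy sketch only: threading.Lock
stands in for xtime_lock, the function names and log message are invented,
and a short timeout is used so the example reports the deadlock instead of
hanging the way the kernel did.

```python
import threading

# Toy stand-in for xtime_lock. threading.Lock is not re-entrant, so a
# second acquire on the same code path blocks, as in the reported hang.
xtime_lock = threading.Lock()

def get_time(timeout=0.05):
    """Stand-in for the scheduler reading the clock; needs xtime_lock."""
    if not xtime_lock.acquire(timeout=timeout):
        return "deadlock"  # the real kernel would spin here forever
    try:
        return "ok"
    finally:
        xtime_lock.release()

def printk(msg, log, check=False):
    """Stand-in for printk: log the message, then wake klogd."""
    log.append(msg)
    if check and xtime_lock.locked():
        # Proposed check: warn and skip the wakeup rather than deadlock.
        log.append("WARNING: printk called with xtime_lock held")
        return "skipped wakeup"
    return get_time()  # waking klogd ends up reading the clock

def leap_second_tick(log, check=False):
    """Stand-in for the timer interrupt path that holds xtime_lock."""
    with xtime_lock:
        return printk("leap second inserted", log, check)

print(leap_second_tick([]))              # the unchecked path
print(leap_second_tick([], check=True))  # the checked path
```

In this toy model locked() cannot tell who holds the lock, so the check
would also fire when another thread holds it; a real check would need to
know the owner.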

thanks
-john

2009-01-06 02:26:21

by Chris Adams

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Once upon a time, john stultz-lkml <[email protected]> said:
> Grrrr. This has bit us a few times since the "no printk while holding
> the xtime lock" restriction was added.

I didn't see that documented anywhere, so my patch adds a comment to
that effect.

> Thomas: Do you think this warrants adding a check to the printk path
> to make sure the xtime lock isn't held? This way we can at least get a
> warning when someone accidentally adds a printk or calls a function
> that does while holding the xtime_lock.

I'm no kernel locking or scheduling (or anything else) expert, but if
printk can check to see if xtime_lock is held, can it skip trying to
wake klogd (so messages still get logged, just maybe not quite as fast)?
Is there anything else that will wake klogd later?

--
Chris Adams <[email protected]>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.

2009-01-06 02:27:30

by john stultz-lkml

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Mon, Jan 5, 2009 at 9:42 AM, Linas Vepstas <[email protected]> wrote:
[snip]
> The long term solution would be to write an RFC to extend
> NTP to also provide TAI information -- e.g. to add a
> message that indicates the current leap-second offset
> between UTC and TAI.

I believe Roman has already added this ability:
http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=153b5d054ac2d98ea0d86504884326b6777f683d;hp=9f14f669d18477fe3df071e2fa4da36c00acee8e

thanks
-john

2009-01-06 02:31:39

by Nick Andrew

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Mon, Jan 05, 2009 at 10:08:50AM -0600, Linas Vepstas wrote:
> 2009/1/5 Nick Andrew <[email protected]>:
> > On Sun, Jan 04, 2009 at 11:48:31PM -0600, Linas Vepstas wrote:
> > Arguably the kernel's responsibility should be to keep track of the
> > most fundamental representation of time possible for a machine (that's
> > probably TAI) and it is a userspace responsibility to map from that
> > value to other time standards including UTC,
>
> Yes, this really does seem like the right solution.
>
> > using control files
> > which are updated as leap seconds are declared.
>
> Let's be clear on what "control files" means. This does
> *NOT* mean some config file shipped by some distro
> for some package. That would be a horrid solution.
> People don't install updates, patches, etc. Distros
> ship them late, or never, if the distro is old enough.

To clarify - as far as I know, TAI is a fundamental time scale
because it's regular and monotonically increasing. Wikipedia
talks about specifying TAI using both Julian Dates and the
Gregorian Calendar - I don't know whether that means representations
of TAI time may suffer gaps depending on declared (subtracted)
leap seconds. In any case I was thinking of something like
Bernstein's TAI64 (http://cr.yp.to/libtai/tai64.html) which is
just a count of seconds (and nanoseconds using TAI64N).
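
To make the format concrete: as I read the TAI64 page, a label is a
big-endian 64-bit integer equal to 2^62 plus the number of seconds since
the start of 1970 TAI, and TAI64N appends a big-endian 32-bit nanosecond
count. A minimal sketch under that reading (the helper names are
hypothetical, not libtai's API):

```python
import struct

TAI64_BASE = 2 ** 62  # label value for the first second of 1970 TAI

def tai64n_pack(seconds, nanoseconds=0):
    """Encode seconds since the start of 1970 TAI as a 12-byte TAI64N label."""
    return struct.pack(">QI", TAI64_BASE + seconds, nanoseconds)

def tai64n_unpack(label):
    """Decode a 12-byte TAI64N label back to (seconds, nanoseconds)."""
    raw, nanoseconds = struct.unpack(">QI", label)
    return raw - TAI64_BASE, nanoseconds

label = tai64n_pack(0, 0)
print(label.hex())           # 400000000000000000000000
print(tai64n_unpack(label))  # (0, 0)
```

Because the value is a plain count, comparing two labels bytewise gives
the same ordering as comparing the instants they name.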

Considering TAI64 as a count of seconds, other time values (UTC,
unix epoch time) can be derived from TAI64 by applying some mapping
function which takes into account all the irregularities introduced
by our complex time systems (including leap years, leap seconds, DST,
pre-Gregorian calendars and so on).

Unix epoch time (seconds since 1 Jan 1970 00:00:00 GMT) is
also regular and monotonically increasing however it's no
longer suitable as a fundamental timebase because it doesn't
recognise the existence of leap seconds. In unix epoch time
a day is always 86400 seconds long and when I said "preserve
the existing behaviour of time()" I meant that this constant
must be maintained.

As Linas correctly noted, UTC allows a distinct representation of a
leap second (xx:59:60). It follows from the previous paragraph that
a mapping from time_t to UTC can never result in ":60". Mapping from
UTC to time_t is lossy: if the input is a leap second then something
must be done with it: mktime() for 09:59:60 returns the same time_t
value as for 10:00:00.
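
The same loss can be seen with Python's calendar.timegm(), used here as a
UTC analogue of mktime() (an assumption of this sketch; the C function
works in local time). The dates are those of the leap second under
discussion.

```python
import calendar
import time

# 2008-12-31 23:59:60 UTC (the leap second) and 2009-01-01 00:00:00 UTC.
leap = calendar.timegm((2008, 12, 31, 23, 59, 60, 0, 0, 0))
next_second = calendar.timegm((2009, 1, 1, 0, 0, 0, 0, 0, 0))
print(leap, next_second, leap == next_second)  # 1230768000 1230768000 True

# Going the other way, no time_t value ever maps back to :60.
print(time.gmtime(leap)[:6])      # (2009, 1, 1, 0, 0, 0)
print(time.gmtime(leap - 1)[:6])  # (2008, 12, 31, 23, 59, 59)
```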

Mapping from TAI64 to UTC or time_t requires knowledge of what leap
seconds were already applied, and when. Wikipedia says TAI is 34
seconds ahead of UTC right now, but I'm talking about converting any
past TAI value, not just current time. So it's not really suitable for
the kernel to just learn about leap seconds on the fly, there needs to
be a persistent table of some kind which states what changes happened
and when. This is analogous to the zoneinfo file, which states not
just the current DST rules but also all past ones.
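
A minimal sketch of such a table and lookup, assuming a simple list of
(effective date, TAI - UTC) pairs. Only the three most recent entries are
shown; a real table would go back to 1972 and would be extended whenever
a leap second is declared. The names are invented for illustration.

```python
import bisect
from datetime import date

# Excerpt only: (UTC date the offset took effect, TAI - UTC in seconds).
LEAP_TABLE = [
    (date(1999, 1, 1), 32),
    (date(2006, 1, 1), 33),
    (date(2009, 1, 1), 34),
]

def tai_minus_utc(day):
    """Return TAI - UTC in seconds for a UTC date covered by the excerpt."""
    starts = [start for start, _ in LEAP_TABLE]
    i = bisect.bisect_right(starts, day) - 1
    if i < 0:
        raise ValueError("date precedes the table excerpt")
    return LEAP_TABLE[i][1]

print(tai_minus_utc(date(2008, 12, 31)))  # 33
print(tai_minus_utc(date(2009, 1, 5)))    # 34
```

A host whose table stops at 2006 would still answer 33 for January 2009,
which is the stale-file problem in miniature.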

There will certainly be hosts where this mapping file is out of
date, however it is supplied. That's the case with zoneinfo too,
and there's a general problem in that politicians keep mucking about
with daylight saving time. We're experiencing that now in Australia,
where the state of Western Australia which never had DST in the past,
now has it as a "test". So WA has got it now, much to my displeasure,
and may or may not have it in future. In general it's not possible to
reliably convert future dates from time_t to local time, where future
dates are anything more recent than your zoneinfo file. The same
constraint applies to conversion from TAI64.

There's a good argument for including up-to-date conversion information
in the NTP protocol. I don't know enough about NTP to say whether it has this
capability already. Hosts which don't have up-to-date zoneinfo files
and don't sync time with NTP probably don't care about accurate time
conversion anyway.

> Well, 'man 2 time' is as clear as mud. It talks about leap seconds,
> but I can't figure out what it's saying. I rather
> doubt that time() is doing what POSIX.1 seems to want
> it to do (which is to ignore leap seconds?)

I think I read that linux "ticks the second twice" (I don't know
whether that's the 59 second or the 00 second, it should be 00 for
ctime(3) to make any sense) and I don't know whether gettimeofday(2)
will show tv_usec returning to zero and re-counting the microseconds.

I think POSIX.1 wants time_t to ignore leap seconds as if they
didn't exist. That means that the :59:60 and :00:00 wall clock
seconds share a single time_t value ... in other words, one
time_t second in linux persists for two wall clock seconds during
a leap second.

Sane behaviour would be for tv_sec and tv_usec to be monotonically
increasing while this is going on; the microseconds should pass
at half the usual rate to preserve this.
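
A sketch of that half-rate idea, purely as an illustration of the
arithmetic and not a description of what Linux does. The function and
parameter names are hypothetical: elapsed counts real seconds from an
arbitrary origin at which the clock read 0, and leap_start is the elapsed
value at which 23:59:60 begins.

```python
def slewed_time_t(elapsed, leap_start):
    """Map real elapsed seconds to a time_t-style value across a leap second.

    For the two real seconds covering :59:60 and :00:00 the clock runs at
    half rate, so it never repeats a value and never steps backwards.
    """
    if elapsed < leap_start:
        return float(elapsed)
    if elapsed < leap_start + 2:
        return leap_start + (elapsed - leap_start) / 2.0
    return elapsed - 1.0

for t in (9.5, 10.0, 11.0, 12.0, 13.0):
    print(t, slewed_time_t(t, leap_start=10))
# 9.5 -> 9.5, 10.0 -> 10.0, 11.0 -> 10.5, 12.0 -> 11.0, 13.0 -> 12.0
```

After the slowed interval the clock is exactly one second behind the real
elapsed count, which is the leap second absorbed.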

> The reason I'm guessing that time() is wrong, is because
> it seems that POSIX wants time() to use TAI time, and
> we don't have that handy anywhere (because we've lost
> track of those leap seconds)

I don't think POSIX wants TAI, but it makes sense for a kernel to
provide an unambiguous time reference to userspace. time_t is a
convenient approximation but it is non-linear due to ignoring the
leap seconds and it probably causes havoc for any precise measurements
occurring during the leap second.

Nick.
--
PGP Key ID = 0x418487E7 http://www.nick-andrew.net/
PGP Key fingerprint = B3ED 6894 8E49 1770 C24A 67E3 6266 6EB9 4184 87E7

2009-01-06 02:51:50

by Nick Andrew

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Tue, Jan 06, 2009 at 12:29:47PM +1030, David Newall wrote:
> Nick Andrew wrote:
> > I can sympathise with the opinion that linux should be able to accurately
> > distinguish xx:59:60 when a leap second is added (or the missing :59 when
> > one is subtracted) but not at the expense of making a day which is not
> > 86400 seconds long.
> >
>
> Some days are not 86400 seconds long. That's a fact and regardless of
> how inconvenient it is, we have to live with it.

Sorry, but you're wrong - in the context of time_t, every day is 86400
seconds long. man 2 time says so clearly in the notes:

NOTES
POSIX.1 defines seconds since the Epoch as a value to be interpreted as the number of
seconds between a specified time and the Epoch, according to a formula for conversion from
UTC equivalent to conversion on the naive basis that leap seconds are ignored and all
years divisible by 4 are leap years. This value is not the same as the actual number of
seconds between the time and the Epoch, because of leap seconds and because clocks are not
required to be synchronized to a standard reference.
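
For the curious, here is the conversion written out. The expression below
is the one I understand POSIX.1-2001 to give for "seconds since the
Epoch"; unlike the man page wording it includes the century corrections.
Field names follow struct tm, and the function name is made up.

```python
import calendar

def posix_seconds(tm_year, tm_yday, tm_hour, tm_min, tm_sec):
    """POSIX 'seconds since the Epoch' expression.

    tm_year is years since 1900 and tm_yday is days since January 1,
    as in struct tm. Every day contributes exactly 86400 seconds.
    """
    return (tm_sec + tm_min * 60 + tm_hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000
            + ((tm_year - 69) // 4) * 86400
            - ((tm_year - 1) // 100) * 86400
            + ((tm_year + 299) // 400) * 86400)

# 2009-01-01 00:00:00 UTC: tm_year = 109, tm_yday = 0
print(posix_seconds(109, 0, 0, 0, 0))                   # 1230768000
print(calendar.timegm((2009, 1, 1, 0, 0, 0, 0, 0, 0)))  # 1230768000
```

No term in the expression depends on how many leap seconds have occurred,
so the result divided by 86400 is always a whole number of days at
midnight UTC.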

> Some years don't have
> 365 days; some months don't have 30 days; some Februaries don't have 28
> days; and now, some days don't have 86400 seconds. What's the point in
> fighting this?

I'm not fighting this - the real world has all these issues but the
world of time_t does not. You want to redefine time_t to include all
the leap seconds that were already added (24) or perhaps only the
future ones; either approach is a disaster. It's unreasonable to change
the semantics of something as fundamental as time_t when so much code
depends on those semantics.

Instead, define a new timebase which counts time predictably and
unambiguously then a set of mappings to derived time values like
time_t, UTC and local time.

> > Just so long as the
> > existing behaviour of time() which doesn't recognise leap seconds
> > is preserved.
>
> I haven't been able to find this Annex B that Alan talked of, so I can
> only go by the man page, which states, simply and explicitly, that
> time() returns seconds since Epoch, and also that Epoch is start of
> January 1 1970. To my mind, time *does* recognise leap seconds.

Please read the NOTES section, which clarifies what "seconds since
the Epoch" means.

Nick.
--
PGP Key ID = 0x418487E7 http://www.nick-andrew.net/
PGP Key fingerprint = B3ED 6894 8E49 1770 C24A 67E3 6266 6EB9 4184 87E7

2009-01-06 03:53:33

by Valdis Klētnieks

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Mon, 05 Jan 2009 10:38:48 +1030, David Newall said:
> [email protected] wrote:
> > Something to keep in mind is that the Posix standard does *NOT* say anything
> > about leap seconds - poke around in a 'struct tm' sometime.

> I have poked, decades ago. There's nothing in struct tm that's a problem.

More correctly: "There's nothing in struct tm - that's the problem."


2009-01-06 04:35:45

by Linas Vepstas

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/5 john stultz-lkml <[email protected]>:
> On Fri, Jan 2, 2009 at 4:21 PM, Chris Adams <[email protected]> wrote:
>> Basically (to my untrained eye), the leap second code is called from the
>> timer interrupt handler, which holds xtime_lock. The leap second code
>> does a printk to notify about the leap second. The printk code tries to
>> wake up klogd (I assume to prioritize kernel messages), and (under some
>> conditions), the scheduler attempts to get the current time, which tries
>> to get xtime_lock => deadlock.
>
> This analysis looks correct to me.
>
> Grrrr. This has bit us a few times since the "no printk while holding
> the xtime lock" restriction was added.
>
> Thomas: Do you think this warrants adding a check to the printk path
> to make sure the xtime lock isn't held?

No.

> This way we can at least get a
> warning when someone accidentally adds a printk or calls a function
> that does while holding the xtime_lock.

This seems like a basic mistake, that should be avoidable
with code review. I'm sort-of surprised to even see it; anyone
even vaguely familiar with that code would spot it quickly.
Heh. Take that with a grain of salt -- not like I never make
mistakes ;-/

I mean, how many more times can the mistake be made?
I'm arguing it's gonna be zero.
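
The deadlock Chris describes can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: lock_held, model_log() and the two leap_tick_*() functions are invented stand-ins for xtime_lock, printk and the leap second handler, and a trylock is used so that the model reports the deadlock instead of hanging. The second variant shows one conventional way around the problem (emit the message after dropping the lock); it is not claimed to be the fix that was merged.

```c
#include <stdio.h>

static int lock_held;	/* 1 while the modelled xtime_lock is held */

/* Non-blocking acquire, so the model can report a deadlock rather than hang. */
static int model_trylock(void)
{
	if (lock_held)
		return 0;
	lock_held = 1;
	return 1;
}

static void model_unlock(void)
{
	lock_held = 0;
}

/*
 * Stand-in for printk: logging wakes the log daemon, the wakeup path reads
 * the clock, and reading the clock needs the lock.
 * Returns -1 where a real spinning lock would never return.
 */
static int model_log(const char *msg)
{
	if (!model_trylock())
		return -1;
	model_unlock();
	puts(msg);
	return 0;
}

/* Buggy pattern: log while the lock is still held. */
static int leap_tick_buggy(void)
{
	int rc;

	model_trylock();
	rc = model_log("leap second inserted");
	model_unlock();
	return rc;
}

/* Safer pattern: remember the message, log only after unlocking. */
static int leap_tick_deferred(void)
{
	const char *msg;

	model_trylock();
	msg = "leap second inserted";
	model_unlock();
	return model_log(msg);
}
```

Calling leap_tick_buggy() returns -1 (the point where a real machine would hang), while leap_tick_deferred() returns 0 and prints the message.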

--linas

2009-01-06 04:53:57

by Linas Vepstas

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/5 john stultz-lkml <[email protected]>:
> On Mon, Jan 5, 2009 at 9:42 AM, Linas Vepstas <[email protected]> wrote:
> [snip]
>> The long term solution would be write an RFC to extend
>> NTP to also provide TAI information -- e.g. to add a
>> message that indicates the current leap-second offset
>> between UTC and TAI.
>
> I believe Roman has already added this ability:
> http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=153b5d054ac2d98ea0d86504884326b6777f683d;hp=9f14f669d18477fe3df071e2fa4da36c00acee8e

Well, you're answering a different statement than what
I was talking about -- I wanted to make sure that TAI
information was available via NTP -- this has nothing
to do with the kernel, and would be something available
to all operating systems.

Anyway -- I'm looking at the patch you reference, and
maybe I'm being dumb -- but -- I think I see a bug.

case TIME_DEL decrements TAI, but TIME_INS does
not increment it. Instead, there's a lonely increment in
TIME_OOP which seems wrong. ??


--linas

2009-01-06 05:00:42

by Linas Vepstas

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Oops.

2009/1/5 Linas Vepstas <[email protected]>:
> 2009/1/5 john stultz-lkml <[email protected]>:
>> On Mon, Jan 5, 2009 at 9:42 AM, Linas Vepstas <[email protected]> wrote:
>> [snip]
>>> The long term solution would be write an RFC to extend
>>> NTP to also provide TAI information -- e.g. to add a
>>> message that indicates the current leap-second offset
>>> between UTC and TAI.
>>
>> I believe Roman has already added this ability:
>> http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=153b5d054ac2d98ea0d86504884326b6777f683d;hp=9f14f669d18477fe3df071e2fa4da36c00acee8e
>
> Well, you're answering a different statement than what
> I was talking about -- I wanted to make sure that TAI
> information was available via NTP -- this has nothing
> to do with the kernel, and would be something available
> to all operating systems.
>
> Anyway -- I'm looking at the patch you reference, and
> maybe I'm being dumb -- but -- I think I see a bug.
>
> case TIME_DEL decrements TAI, but TIME_INS does
> not increment it. Instead, there's a lonely increment in
> TIME_OOP which seems wrong. ??

Never mind. Sorry, I'm wrong, the code looks right.
Time to stop reading email, and go to bed. :-)

--linas

2009-01-06 18:32:16

by Alan

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> UTC equivalent to conversion on the naive basis that leap seconds are ignored and all
> years divisible by 4 are leap years. This value is not the same as the actual number of
> seconds between the time and the Epoch, because of leap seconds and because clocks are not
> required to be synchronized to a standard reference.

I'm not sure what you are quoting from but it is out of date on the
subject of leap years.

The rest looks right.

2009-01-06 19:57:13

by Warner Losh

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

In message: <[email protected]>
"Linas Vepstas" <[email protected]> writes:
: 2009/1/5 john stultz-lkml <[email protected]>:
: > On Mon, Jan 5, 2009 at 9:42 AM, Linas Vepstas <[email protected]> wrote:
: > [snip]
: >> The long term solution would be write an RFC to extend
: >> NTP to also provide TAI information -- e.g. to add a
: >> message that indicates the current leap-second offset
: >> between UTC and TAI.
: >
: > I believe Roman has already added this ability:
: > http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=153b5d054ac2d98ea0d86504884326b6777f683d;hp=9f14f669d18477fe3df071e2fa4da36c00acee8e
:
: Well, you're answering a different statement than what
: I was talking about -- I wanted to make sure that TAI
: information was available via NTP -- this has nothing
: to do with the kernel, and would be something available
: to all operating systems.
:
: Anyway -- I'm looking at the patch you reference, and
: maybe I'm being dumb -- but -- I think I see a bug.
:
: case TIME_DEL decrements TAI, but TIME_INS does
: not increment it. Instead, there's a lonely increment in
: TIME_OOP which seems wrong. ??

No. That's right. The increment doesn't happen until the leap second
has happened. The TIME_OOP exists to increment the TAI offset at the
right time. The decrement would happen right away, since the second
is deleted at the end of :58.
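
In code form the ordering looks roughly like this. It is a simplified model, not the kernel or FreeBSD source: the LS_* states mirror TIME_INS, TIME_OOP, TIME_DEL and TIME_WAIT, while struct clock_model and leap_event() are names invented for the sketch.

```c
/* Model states, named after TIME_OK, TIME_INS, TIME_DEL, TIME_OOP, TIME_WAIT. */
enum leap_state { LS_OK, LS_INS, LS_DEL, LS_OOP, LS_WAIT };

struct clock_model {
	long utc_sec;		/* UTC seconds as the system counts them */
	int tai_offset;		/* TAI - UTC in seconds */
	enum leap_state state;
};

/* Called each time the leap second timer fires. */
static void leap_event(struct clock_model *c)
{
	switch (c->state) {
	case LS_INS:
		/* Midnight reached: step back so 23:59:59 is repeated. */
		c->utc_sec--;
		c->state = LS_OOP;	/* leap second now in progress */
		break;
	case LS_OOP:
		/* Inserted second is over: only now does TAI - UTC grow. */
		c->tai_offset++;
		c->state = LS_WAIT;
		break;
	case LS_DEL:
		/* End of :58: skip :59, so the offset shrinks right away. */
		c->utc_sec++;
		c->tai_offset--;
		c->state = LS_WAIT;
		break;
	default:
		break;
	}
}
```

Starting from { 1230768000, 33, LS_INS }, the first call steps the clock back to 1230767999 and leaves the offset at 33; after one more elapsed second the second call raises the offset to 34.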

I had to draw lots of pictures when I was working this code out in
FreeBSD.

Warner

2009-01-06 20:00:00

by Warner Losh

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

In message: <[email protected]>
"Linas Vepstas" <[email protected]> writes:
: 2009/1/5 <[email protected]>:
: > On Mon, 5 Jan 2009, Linas Vepstas wrote:
: >
: >>> Arguably the kernel's responsibility should be to keep track of the
: >>> most fundamental representation of time possible for a machine (that's
: >>> probably TAI) and it is a userspace responsibility to map from that
: >>> value to other time standards including UTC,
: >>
: >> Yes, this really does seem like the right solution.
: >>
: >>> using control files
: >>> which are updated as leap seconds are declared.
: >>
: >> Lets be clear on what "control files" means. This does
: >> *NOT* mean some config file shipped by some distro
: >> for some package. That would be a horrid solution.
: >> People don't install updates, patches, etc. Distros
: >> ship them late, or never, if the distro is old enough.
: >>
: >> A more appropriate solution would be to have
: >> either the kernel or ntpd track the leap seconds
: >> automatically. First, the ntp protocol already provides
: >> the needed notification of a leap second to anyone
: >> who cares about it (i.e. there is no point in getting a
: >> Linux distro involved in this -- a distribution mechanism
: >> already exists, and works *better* than having a distro
: >> do it).
: >
: > I disagree with this. NTP will only know about leap seconds if it was
: > running and connected to a server that advertised the leap seconds during
: > that month.
: >
: > for example, if you installed a new server today, how would it ever know
: > that there was a leap second a couple of days ago?
:
: OK, good point. Unless your distro was less
: than a few days old (unlikely), you are faced with the
: same problem. Sure, eventually, the distro will publish
: an update (which will add to the existing list of 36 leap

List of 24 leap seconds. Although the delta is 34 right now, the
first 10 leap seconds were done as tiny steps (~50-100ms) plus
frequency offsets. Well, the first 'leap' was 1.4228180s on Jan 1,
1961. Everybody assumes that those seconds don't exist to simplify
things (or that there was simply a 10s step between TAI and UTC on
1-Jan-1972). The leapsecond file from NIST doesn't even have them.

: seconds -- which is needed in any case, since no one
: has a server that's been up since 1958), but this is
: unlikely to happen during this install window.
:
: The long term solution would be write an RFC to extend
: NTP to also provide TAI information -- e.g. to add a
: message that indicates the current leap-second offset
: between UTC and TAI.

I'd love that. There's likely going to be some resistance to that
because the leapfile is available via the crypto-authenticated means.
However, there's no real-time information available... Also, there
are many reference clocks that would need this information plugged
into it somehow (IRIG doesn't report leap seconds in any meaningful[*]
way, let alone UTC-TAI offset).

[*] Some IRIG extensions do support reporting leap seconds at the end
of the hour, but that's too late...

Warner

2009-01-07 01:17:29

by Nick Andrew

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Tue, Jan 06, 2009 at 09:40:58AM +0000, Alan Cox wrote:
> > UTC equivalent to conversion on the naive basis that leap seconds are ignored and all
> > years divisible by 4 are leap years. This value is not the same as the actual number of
> > seconds between the time and the Epoch, because of leap seconds and because clocks are not
> > required to be synchronized to a standard reference.
>
> I'm not sure what you are quoting from but it is out of date on the
> subject of leap years.

"man 2 time" on Debian Lenny. The treatment of leap years looks ridiculous, but
within the context of a 32-bit time_t, all divisible-by-4 years between 1901 and
2038 are leap years. It's a bit of a problem for 64-bit time_t though.
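
That claim is easy to check mechanically. A small sketch (the function names are mine, chosen for illustration) that counts the years on which the naive rule and the Gregorian rule disagree:

```c
/* Naive rule from the man page wording: every 4th year is a leap year. */
static int is_leap_naive(int year)
{
	return year % 4 == 0;
}

/* Gregorian rule: century years are leap years only if divisible by 400. */
static int is_leap_gregorian(int year)
{
	return (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
}

/* Count the years in [first, last] on which the two rules disagree. */
static int leap_rule_mismatches(int first, int last)
{
	int year, n = 0;

	for (year = first; year <= last; year++)
		if (is_leap_naive(year) != is_leap_gregorian(year))
			n++;
	return n;
}
```

leap_rule_mismatches(1901, 2038) is 0, whereas widening the range to include 1900 or 2100 turns up a mismatch for each of those years.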

Nick.
--
PGP Key ID = 0x418487E7 http://www.nick-andrew.net/
PGP Key fingerprint = B3ED 6894 8E49 1770 C24A 67E3 6266 6EB9 4184 87E7

2009-01-07 04:19:36

by Danny Mayer

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Linas Vepstas wrote:
> 2009/1/5 <[email protected]>:
>> On Mon, 5 Jan 2009, Linas Vepstas wrote:
>>
>>>> Arguably the kernel's responsibility should be to keep track of the
>>>> most fundamental representation of time possible for a machine (that's
>>>> probably TAI) and it is a userspace responsibility to map from that
>>>> value to other time standards including UTC,
>>> Yes, this really does seem like the right solution.
>>>
>>>> using control files
>>>> which are updated as leap seconds are declared.
>>> Lets be clear on what "control files" means. This does
>>> *NOT* mean some config file shipped by some distro
>>> for some package. That would be a horrid solution.
>>> People don't install updates, patches, etc. Distros
>>> ship them late, or never, if the distro is old enough.
>>>
>>> A more appropriate solution would be to have
>>> either the kernel or ntpd track the leap seconds
>>> automatically. First, the ntp protocol already provides
>>> the needed notification of a leap second to anyone
>>> who cares about it (i.e. there is no point in getting a
>>> Linux distro involved in this -- a distribution mechanism
>>> already exists, and works *better* than having a distro
>>> do it).
>> I disagree with this. NTP will only know about leap seconds if it was
>> running and connected to a server that advertised the leap seconds during
>> that month.
>>
>> for example, if you installed a new server today, how would it ever know
>> that there was a leap second a couple of days ago?

Because it gets its time from an upstream server that already has
incorporated the leap second so it doesn't really need to know that the
leap second happened a few days ago or even a few years ago.

> OK, good point. Unless your distro was less
> than a few days old (unlikely), you are faced with the
> same problem. Sure, eventually, the distro will publish
> an update (which will add to the existing list of 36 leap
> seconds -- which is needed in any case, since no one
> has a server that's been up since 1958), but this is
> unlikely to happen during this install window.
>

This is nonsense. That's not how NTP works.

> The long term solution would be write an RFC to extend
> NTP to also provide TAI information -- e.g. to add a
> message that indicates the current leap-second offset
> between UTC and TAI.
>
> --linas

I don't know what this discussion is really about and why this was sent
to the working group in the middle of the discussion, but there is no
need for NTP to provide TAI information since NTP only uses UTC. Leap
Seconds are automatically signaled and incorporated when they become
due. If you don't have NTP running for some reason when a leap second is
signaled it doesn't matter since your server source will already have
incorporated the leap second so the NTP packet includes the timestamps
that include the leap second adjustment.
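
For readers unfamiliar with the signalling: the warning travels in the two-bit Leap Indicator at the top of the first octet of every NTP packet header, followed by a three-bit version and a three-bit mode. A small sketch of decoding that octet follows; the enum and helper names are invented for illustration.

```c
#include <stdint.h>

/* Leap Indicator values carried in the NTP packet header. */
enum ntp_leap {
	NTP_LEAP_NONE   = 0,	/* no warning */
	NTP_LEAP_INSERT = 1,	/* last minute of the day has 61 seconds */
	NTP_LEAP_DELETE = 2,	/* last minute of the day has 59 seconds */
	NTP_LEAP_ALARM  = 3	/* clock not synchronized */
};

/* Top two bits of the first header octet. */
static enum ntp_leap ntp_leap_indicator(uint8_t first_octet)
{
	return (enum ntp_leap)(first_octet >> 6);
}

/* Next three bits: protocol version number. */
static unsigned ntp_version(uint8_t first_octet)
{
	return (first_octet >> 3) & 0x7;
}

/* Low three bits: association mode (3 = client, 4 = server). */
static unsigned ntp_mode(uint8_t first_octet)
{
	return first_octet & 0x7;
}
```

A first octet of 0x64, for example, decodes as an insert warning from a version 4 server.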

Operating Systems use UTC and not TAI by universal agreement and the
ones that don't are extremely rare.

Why don't you tell us what the real problem is instead of telling us
that you need TAI offset information?

Danny

2009-01-07 04:52:32

by Linas Vepstas

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

2009/1/6 Danny Mayer <[email protected]>:
Hi,

> I don't know what this discussion is really about and why this was sent
> to the working group in the middle of the discussion, but there is no
> need for NTP to provide TAI information since NTP only uses UTC. Leap
> Seconds are automatically signaled and incorporated when they become
> due. If you don't have NTP running for some reason when a leap second is
> signaled it doesn't matter since your server source will already have
> incorporated the leap second so the NTP packet includes the timestamps
> that include the leap second adjustment.
>
> Operating Systems use UTC and not TAI by universal agreement and the
> ones that don't are extremely rare.
>
> Why don't you tell us what the real problem is instead of telling us
> that you need TAI offset information?

Currently, the Linux kernel keeps time in UTC. This means
that it must take special actions to tick twice when a leap
second comes by. Due to a (stupid) bug, some fraction
of linux systems crashed; this includes everything from
laptops to servers, to DVR's, to cell phones and cell
phone towers. There's now a fix for this.

However, during the discussion, the idea came out that
maybe keeping UTC time in the kernel is just plain stupid.
So there's this idea floating around that maybe the kernel
should keep TAI time instead. The hope is that this will
reduce the complexity in the kernel, and push it out to
user space, "where it belongs" (to repeat a well-worn
mantra).

However, *if* we were to kick UTC out of the kernel,
and push it to user-land, then, of course, there's a
different problem: how does the kernel know what the
correct TAI time is? As your reply makes abundantly
clear, NTP is not a good source for TAI information.

The comments which you labelled as "non-sense" were
a mis-understanding of a discussion of a particular issue
that would arise if the kernel were to keep TAI -- if it did,
then user-space systems would need to have a reliable
source for leap-seconds. Since NTP does not
provide this, there was discussion about how that
could be worked-around. This then led to the comment
that, "gee, wouldn't the right long-term solution be that
NTP provide TAI info?"

Clearly, it would be a lot of work to get the kernel to keep
TAI instead of UTC, so this is not, at this time, a "serious
proposal". But if it were possible, and all the various
little issues that result were solvable, then it does seem
like a better long-term solution.

--linas

p.s. the opinions above are not my own; I'm just
summarizing the points made by the most vocal
posters to this list.

2009-01-07 09:38:18

by Alan

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> "man 2 time" on Debian Lenny. The treatment of leap years looks ridiculous, but
> within the context of a 32-bit time_t, all divisible-by-4 years between 1901 and
> 2038 are leap years. It's a bit of a problem for 64-bit time_t though.

Then Debian documentation needs fixing. POSIX fixed their definition some
years ago.

2009-01-07 09:46:41

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Alan Cox wrote:
>> UTC equivalent to conversion on the naive basis that leap seconds are ignored and all
>> years divisible by 4 are leap years. This value is not the same as the actual number of
>> seconds between the time and the Epoch, because of leap seconds and because clocks are not
>> required to be synchronized to a standard reference.
>>
>
> I'm not sure what you are quoting from but it is out of date on the
> subject of leap years.
>

The range of signed 32-bit times is 1901 through 2039, which has only
one century, 2000, which is a leap year. So the caveat for leap years
is correct but unnecessary.


So I've discovered, at least on Ubuntu, something wonderful and
reassuring. It already works exactly the way I think is correct. Look:
I create a test timezone with no daylight saving and one leap second:

davidn@takauji:~/timetest$ cat tz
Zone testzone 0:00 0 XXX/YYY
davidn@takauji:~/timetest$ cat leapseconds
Leap 2008 Dec 31 23:59:59 + S
davidn@takauji:~/timetest$ zic -d . -L leapseconds tz

Then the test program, which makes a time_t (what time() returns) for a
few seconds before the leap second, then counts off seconds...

davidn@takauji:~/timetest$ cat timetest.c
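#include <stdio.h>
#include <stdlib.h>	/* setenv() */
#include <time.h>

int main(void)
{
	/* Use the test zone compiled by zic above (one leap second). */
	setenv("TZ", ":/home/davidn/timetest/testzone", 1);

	/* 2008-12-31 23:59:55: tm_sec, tm_min, tm_hour, tm_mday, tm_mon, tm_year */
	struct tm tm1 = { 55, 59, 23, 31, 11, 108 };
	time_t t1 = mktime(&tm1);
	int i;

	/* Print nine consecutive time_t values spanning the leap second. */
	for (i = 10; --i; t1++)
		printf("ctime(%ld) = %s", (long)t1, ctime(&t1));

	return 0;
}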


Observe two 23:59:59's. Apparently it could be better if the second
23:59:59 was 23:59:60, but I prefer it this way.

davidn@takauji:~/timetest$ ./timetest
ctime(1230767995) = Wed Dec 31 23:59:55 2008
ctime(1230767996) = Wed Dec 31 23:59:56 2008
ctime(1230767997) = Wed Dec 31 23:59:57 2008
ctime(1230767998) = Wed Dec 31 23:59:58 2008
ctime(1230767999) = Wed Dec 31 23:59:59 2008
ctime(1230768000) = Wed Dec 31 23:59:59 2008
ctime(1230768001) = Thu Jan 1 00:00:00 2009
ctime(1230768002) = Thu Jan 1 00:00:01 2009
ctime(1230768003) = Thu Jan 1 00:00:02 2009


Perhaps this is distribution-dependent, but even so, there's no need for
the kernel to drop the second (and it's wrong if it does.)

2009-01-07 09:54:43

by Alan

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> The range of signed 32-bit times is 1901 through 2039, which has only
> one century, 2000, which is a leap year. So the caveat for leap years
> is correct but unnecessary.

The standard however (and library code) were updated many years ago, so
the description is still wrong.

> So I've discovered, at least on Ubuntu, something wonderful and
> reassuring. It already works exactly the way I think is correct. Look:
> I create a test timezone with no daylight saving and one leap second:

This is entirely configurable - see my earlier post about the "right" and
posix timezones. Really however that belongs on the glibc list.

As far as the kernel and leapseconds go - remember the kernel RTC support
does not know about leap seconds

2009-01-07 10:04:25

by David Newall

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Linas Vepstas wrote:
> Currently, the Linux kernel keeps time in UTC. This means
> that it must take special actions to tick twice when a leap
> second comes by.

Except it doesn't have to tick twice. Refer to
http://lkml.org/lkml/2009/1/7/78 in which I show that a time_t (what
time() returns) counts leap seconds (According to Bernstein this is what
UTC means), and using zoneinfo, the library processes leap seconds
correctly.

I just realised that the Notes in man 2 time are confusing and probably
unnecessary. Suffice to say that (assuming correctly configured
zoneinfo) time() returns the number of seconds elapsed since start 1970.

2009-01-07 10:18:42

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Alan Cox wrote:
> As far as the kernel and leapseconds go - remember the kernel RTC support
> does not know about leap seconds
>

True but irrelevant because the RTC returns a timestamp. And it's
quietly understood that the RTC is only an approximation.

The remaining fly in the ointment, if indeed the NTP client doesn't
already do what I've outlined, is that leap seconds aren't reckoned into
NTP broadcasts. As intimated, this is correctable using leap second
information from zoneinfo.

Even though this is manifestly not a kernel issue, I'll work up a patch
for ntpdate (apparently what I use) and post here, which I'm sure will be
useful for all other NTP clients.

However it is now clear that no special kernel support is required for
leap-seconds, and any such code that's been incorporated needs to be
removed. Removed I say!

2009-01-07 10:52:43

by Alan

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> True but irrelevant because the RTC returns a timestamp. And it's
> quietly understood that the RTC is only an approximation.

You miss the point.

The RTC stores the CMOS time in MM DD YY HH:MM:SS format. That conversion
is done kernel side when reading/writing the RTC chip. Thus if you are
using leap second timing your BIOS RTC values will not agree with the
expected value.

> However it is now clear that no special kernel support is required for
> leap-seconds, and any such code that's been incorporated needs to be
> removed. Removed I say!

There never has been any. It's all handled (both posix and sane) by glibc.

Alan

2009-01-07 13:33:33

by Chris Adams

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Once upon a time, David Newall <[email protected]> said:
> The remaining fly in the ointment, if indeed the NTP client doesn't
> already do what I've outlined, is that leap seconds aren't reckoned into
> NTP broadcasts. As intimated, this is correctable using leap second
> information from zoneinfo.

No it isn't; you are still wrong. Yet again, you are ignoring the
facts:

- zoneinfo is for offset from UTC, leap seconds are changes in UTC

- the standards say that time() returns seconds since the epoch in UTC
_except_ explicitly NOT including leap seconds

- NTP already has a way to distribute leap second information to trusted
clients

> Even though this is manifestly not a kernel issue, I'll work up a patch
> for ntpdate (apparently what I use) and post here, which I'm sure will be
> useful for all other NTP clients.

ntpdate is obsolete.

> However it is now clear that no special kernel support is required for
> leap-seconds, and any such code that's been incorporated needs to be
> removed. Removed I say!

And you are wrong.
--
Chris Adams <[email protected]>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.

2009-01-07 13:37:53

by Alan

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> - zoneinfo is for offset from UTC, leap seconds are changes in UTC

If you two would stop throwing toys at each other and read the glibc
documentation and source you might get somewhere.

> - the standards say that time() returns seconds since the epoch in UTC
> _except_ explicitly NOT including leap seconds

Glibc has timezone support for both leap second inclusive ("right" as
it calls them) and posix time offsets.

Alan

2009-01-07 13:46:18

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Alan Cox wrote:
>> True but irrelevant because the RTC returns a timestamp. And it's
>> quietly understood that the RTC is only an approximation.
>>
>
> You miss the point.
>

No, I got the point. I see no problem.

> The RTC stores the CMOS time in MM DD YY HH:MM:SS format.

Yes, which is perfect for mktime(), which knows about leap seconds and
so produces the correct time_t.


>> However it is now clear that no special kernel support is required for
>> leap-seconds, and any such code that's been incorporated needs to be
>> removed. Removed I say!
>>
>
> There never has been any. It's all handled (both posix and sane) by glibc.

Which is what one would expect. It's reports of crashes and kernel bugs
being found and fixed in code to handle leap seconds which led me to a
different understanding. I thought it was said that there's kernel
support to handle the leap second flag in NTP's broadcasts, and that
that was where the bug was.

So. What is the situation?

2009-01-07 14:10:01

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Chris Adams wrote:
> Once upon a time, David Newall <[email protected]> said:
>
>> The remaining fly in the ointment, if indeed the NTP client doesn't
>> already do what I've outlined, is that leap seconds aren't reckoned into
>> NTP broadcasts. As intimated, this is correctable using leap second
>> information from zoneinfo.
>>
>
> No it isn't; you are still wrong. Yet again, you are ignoring the
> facts:
>

Curiously strong opinions when I've already demonstrated otherwise. On
my system, and possibly also on yours, a time_t, which is what time()
returns, is the number of seconds since epoch; which, in turn, is the
start of 1970. And on my system, zoneinfo handles leap seconds.

Just saying that I'm wrong is contrary and stubborn since evidence shows
my understanding has been correct from the start. If you're sure I'm
wrong, take my demonstration and find a flaw. Otherwise I see little value
in your contribution to this discussion.

> - the standards say that time() returns seconds since the epoch in UTC
> _except_ explicitly NOT including leap seconds
>

Don't believe everything you read. For example, the time(2) man page
says what POSIX does, but doesn't actually say that Linux also does the
same. It also says what you paraphrased above, but demonstrably that's
not the case. Man pages often are wrong in some details. Hence RTSL.

> - NTP already has a way to distribute leap second information to trusted
> clients
>

Yesterday, when I scanned the RFC, it was clear that NTP broadcasts do
not factor leap seconds; that every day has 86400 seconds. However if
NTP does have a way of distributing leap seconds, other than the almost
pointless leap-second flag, then that's great. If it doesn't (which
is what I understand), there's no problem anyway (as explained.)

>> However it is now clear that no special kernel support is required for
>> leap-seconds, and any such code that's been incorporated needs to be
>> removed. Removed I say!
>>
>
> And you are wrong.
>

So you say, but you have no code to back you up, whereas I do.

2009-01-07 14:13:22

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Alan Cox wrote:
> If you two would stop throwing toys at each other

I object! I'd accept, if that were your claim, that I should have
ignored Chris, but don't accept that I've been bickering or "throwing
toys."

2009-01-07 14:16:18

by Alan

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> > The RTC stores the CMOS time in MM DD YY HH:MM:SS format.
>
> Yes, which is perfect for mktime(), which knows about leap seconds and
> so produces the correct time_t.

mktime in the kernel has no knowledge of leap seconds whatsoever. Go read
kernel/time.c

> different understanding. I thought it was said that there's kernel
> support to handle the leap second flag in NTP's broadcasts, and that
> that was where the bug was.

All the kernel knows how to do is to slew time (in general) and to repeat
or remove one second. It has no knowledge of leap seconds and it doesn't
know how to convert between UTC/TAI/Unix Epoch etc
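
For anyone without the source to hand, the conversion is essentially a Gauss-style day count. The sketch below is written from memory in the same spirit and is not a copy of kernel/time.c; rtc_to_epoch() is a made-up name. Nothing in it can tell 23:59:60 apart from the following 00:00:00.

```c
/*
 * Convert a Gregorian date and time (as read from an RTC, assumed UTC)
 * to seconds since 1970-01-01 00:00:00. Hypothetical helper for illustration.
 * Valid for 1970 onwards; mon is 1..12, day is 1..31.
 */
static long long rtc_to_epoch(int year, int mon, int day,
			      int hour, int min, int sec)
{
	long long days;

	/* Shift the year to start in March so the leap day falls last. */
	mon -= 2;
	if (mon <= 0) {
		mon += 12;
		year -= 1;
	}

	days = year / 4 - year / 100 + year / 400	/* leap days so far */
	     + 367 * mon / 12 + day			/* days within the year */
	     + year * 365LL - 719499;			/* rebase to 1970-01-01 */

	/* Every day is 86400 seconds; there is no leap second table. */
	return ((days * 24 + hour) * 60 + min) * 60 + sec;
}
```

Both rtc_to_epoch(2008, 12, 31, 23, 59, 60) and rtc_to_epoch(2009, 1, 1, 0, 0, 0) come out as 1230768000.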

2009-01-07 14:35:52

by Danny Mayer

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Linas Vepstas wrote:
> 2009/1/6 Danny Mayer <[email protected]>:
> Hi,
>
>> I don't know what this discussion is really about and why this was sent
>> to the working group in the middle of the discussion, but there is no
>> need for NTP to provide TAI information since NTP only uses UTC. Leap
>> Seconds are automatically signaled and incorporated when they become
>> due. If you don't have NTP running for some reason when a leap second is
>> signaled it doesn't matter since your server source will already have
>> incorporated the leap second so the NTP packet includes the timestamps
>> that include the leap second adjustment.
>>
>> Operating Systems use UTC and not TAI by universal agreement and the
>> ones that don't are extremely rare.
>>
>> Why don't you tell us what the real problem is instead of telling us
>> that you need TAI offset information?
>
> Currently, the Linux kernel keeps time in UTC. This means
> that it must take special actions to tick twice when a leap
> second comes by. Due to a (stupid) bug, some fraction
> of linux systems crashed; this includes everything from
> laptops to servers, to DVR's, to cell phones and cell
> phone towers. There's now a fix for this.
>
> However, during the discussion, the idea came out that
> maybe keeping UTC time in the kernel is just plain stupid.
> So there's this idea floating around that maybe the kernel
> should keep TAI time instead. The hope is that this will
> reduce the complexity in the kernel, and push it out to
> user space, "where it belongs" (to repeat a well-worn
> mantra).
>
> However, *if* we were to kick UTC out of the kernel,
> and push it to user-land, then, of course, there's a
> different problem: how does the kernel know what the
> correct TAI time is? As your reply makes abundantly
> clear, NTP is not a good source for TAI information.
>
> The comments which you labelled as "non-sense" were
> a mis-understanding of a discussion of a particular issue
> that would arise if the kernel were to keep TAI -- if it did,
> then user-space systems would need to have a reliable
> source for leap-seconds. Since NTP does not
> provide this, there was discussion about how that
> could be worked-around. This then led to the comment
> that, "gee, wouldn't the right long-term solution be that
> NTP provide TAI info?"

It was nonsense because the summary didn't contain all of the
information required to provide context and you copied the Working Group
in the middle of all this.

NTP can provide leap-second information via an autokey protocol request,
see Section 10.6 Leapseconds Values Message (LEAP)
http://www.ietf.org/internet-drafts/draft-ietf-ntp-autokey-04.txt but
that means you need to have autokey set up with another NTP server and
that means adding infrastructure that you probably don't want and are
not prepared to handle.

>
> Clearly, it would be a lot of work to get the kernel to keep
> TAI instead of UTC, so this is not, at this time, a "serious
> proposal". But if it were possible, and all the various
> little issues that result were solvable, then it does seem
> like a better long-term solution.
>

This is a *lot* more complicated than you might think. If you are
thinking of implementing this similarly to the way timezone information
is added for display purposes, you need the whole list of leap seconds
and when the change happened since you now have to look at a timestamp
and see when it was and then apply all of the leapseconds up to that
point in time and none of the leapseconds beyond that. In addition, you
have legacy files that have UTC timestamps on them so you would need to
distinguish between UTC (legacy) and TAI timestamps in the file system
among other places (anywhere where a timestamp exists) and what would
you do about database tables which contain timestamps? The list goes on.
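The lookup described above (apply every leap second up to the timestamp
and none after it) can be sketched as a table search. This is only an
illustration: the table is a three-entry excerpt of the published
TAI-UTC history, and the names LEAP_TABLE and tai_minus_utc are made up
for this example rather than taken from any kernel or NTP source.

```python
import bisect
import calendar

# Excerpt of the TAI-UTC history: (UTC date the offset takes effect,
# TAI-UTC in seconds). A real table would start in 1972.
LEAP_TABLE = [
    ((1999, 1, 1), 32),
    ((2006, 1, 1), 33),
    ((2009, 1, 1), 34),
]

# POSIX timestamps at which each offset becomes valid.
_STARTS = [calendar.timegm((y, m, d, 0, 0, 0)) for (y, m, d), _ in LEAP_TABLE]
_OFFSETS = [offset for _, offset in LEAP_TABLE]


def tai_minus_utc(posix_ts):
    """Return TAI-UTC (seconds) in force at the given POSIX timestamp."""
    i = bisect.bisect_right(_STARTS, posix_ts) - 1
    if i < 0:
        raise ValueError("timestamp predates the excerpted table")
    return _OFFSETS[i]
```

A timestamp from late 2008 gets 33 and one from 2009 gets 34, which is
why a single "current offset" is not enough for old file or database
timestamps.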

I'd much rather you spend the time tackling the clock interrupt losses
that many of our Linux users complain about. See:
https://support.ntp.org/bin/view/Support/KnownOsIssues#Section_9.2.4.
for some of the gorier details. I'm sure you don't really want us
recommending that they set HZ=100 in the kernel to alleviate the problem.

Danny

> --linas
>
> p.s. the opinions above are not my own; I'm just
> summarizing the points made by the most vocal
> posters to this list.
>
>

2009-01-07 14:36:27

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Alan Cox wrote:
>>> The RTC stores the CMOS time in MM DD YY HH:MM:SS format.
>>>
>> Yes, which is perfect for mktime(), which knows about leap seconds and
>> so produces the correct time_t.
>>
>
> mktime in the kernel has no knowledge of leap seconds whatsoever. Go read
> kernel/time.c
>
Is there a mktime() in the kernel? Isn't it pure user-space? Mktime
does appear to know all about leap seconds (assuming they're in zoneinfo.)


>> different understanding. I thought it was said that there's kernel
>> support to handle the leap second flag in NTP's broadcasts, and that
>> that was where the bug was.
>>
>
> All the kernel knows how to do is to slew time (in general) and to repeat
> or remove one second. It has no knowledge of leap seconds and it doesn't
> know how to convert between UTC/TAI/Unix Epoch etc.

I went back to the start of the thread. Chris posted a stack trace
showing "#15 0xffffffff8104ec16 in ntp_leap_second (timer=<value
optimized out>) at kernel/time/ntp.c:143". That would be kernel code to
process leap seconds from NTP broadcasts, I think. That code needs to
be removed.

2009-01-07 15:41:21

by Alan

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> Is there a mktime() in the kernel? Isn't it pure user-space? Mktime
> does appear to know all about leap seconds (assuming they're in zoneinfo.)

The GPL goes to great trouble to ensure you get the kernel source code.
Why not use it.

> showing "#15 0xffffffff8104ec16 in ntp_leap_second (timer=<value
> optimized out>) at kernel/time/ntp.c:143". That would be kernel code to
> process leap seconds from NTP broadcasts, I think. That code needs to
> be removed.

I suggest you read that code and understand it.

2009-01-07 15:43:12

by Linas Vepstas

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Thanks for the reply.

2009/1/7 Danny Mayer <[email protected]>:
> Linas Vepstas wrote:
>> 2009/1/6 Danny Mayer <[email protected]>:
>>> Why don't you tell us what the real problem is instead of telling us
>>> that you need TAI offset information?
>>
>> Currently, the Linux kernel keeps time in UTC. This means
>> that it must take special actions to tick twice when a leap
>> second comes by. Due to a (stupid) bug, some fraction
>> of linux systems crashed; this includes everything from
>> laptops to servers, to DVR's, to cell phones and cell
>> phone towers. There's now a fix for this.
>>
>> However, during the discussion, the idea came out that
>> maybe keeping UTC time in the kernel is just plain stupid.
>> So there's this idea floating around that maybe the kernel
>> should keep TAI time instead. The hope is that this will
>> reduce the complexity in the kernel, and push it out to
>> user space, "where it belongs" (to repeat a well-worn
>> mantra).
>>
>> However, *if* we were to kick UTC out of the kernel,
>> and push it to user-land, then, of course, there's a
>> different problem: how does the kernel know what the
>> correct TAI time is? As your reply makes abundantly
>> clear, NTP is not a good source for TAI information.

[...]

>> a discussion of a particular issue
>> that would arise if the kernel were to keep TAI -- if it did,
>> then user-space systems would need to have a reliable
>> source for leap-seconds. Since NTP does not
>> provide this, there was discussion about how that
>> could be worked-around. This then led to the comment
>> that, "gee, wouldn't the right long-term solution be that
>> NTP provide TAI info?"
>
> NTP can provide leap-second information via an autokey protocol request,
> see Section 10.6 Leapseconds Values Message (LEAP)
> http://www.ietf.org/internet-drafts/draft-ietf-ntp-autokey-04.txt but

Yes, that looks like exactly what would be wanted. It would be nice
if such a message was available in the regular, non-encrypted protocol.

> that means you need to have autokey set up with another NTP server and
> that means adding infrastructure that you probably don't want and are
> not prepared to handle.

Heh. Yes, well, I still haven't figured out how to secure DNS. Yet clearly
this whole security mess must march on, and somehow the security
infrastructure must eventually become easy to install.

>> Clearly, it would be a lot of work to get the kernel to keep
>> TAI instead of UTC, so this is not, at this time, a "serious
>> proposal". But if it were possible, and all the various
>> little issues that result were solvable, then it does seem
>> like a better long-term solution.
>>
>
> This is a *lot* more complicated than you might think. If you are
> thinking of implementing this similarly to the way timezone information
> is added for display purposes, you need the whole list of leap seconds
> and when the change happened since you now have to look at a timestamp
> and see when it was and then apply all of the leapseconds up to that
> point in time and none of the leapseconds beyond that. In addition, you
> have legacy files that have UTC timestamps on them so you would need to
> distinguish between UTC (legacy) and TAI timestamps in the file system
> among other places (anywhere where a timestamp exists) and what would
> you do about database tables which contain timestamps? The list goes on.

Yes.

> I'd much rather you spend the time tackling the clock interrupt losses
> that many of our Linux users complain about. See:
> https://support.ntp.org/bin/view/Support/KnownOsIssues#Section_9.2.4.
> for some of the gorier details. I'm sure you don't really want us
> recommending that they set HZ=100 in the kernel to alleviate the problem.

Actually, this is rather sorely lacking in 'gory details'; rather, it's
a complaint that 'things don't work' with no discussion of the actual
problem. It would be much better if there was a link to any previous
discussions on LKML on this issue.

My knee-jerk reaction on reading about the lost-interrupts issue is that,
yes, setting HZ=100 and disabling ACPI is indeed a decent short-term
work-around (APIC is something completely different and not something
you can disable). The correct long-term solution would be to use real-time
kernels, which are designed to make sure that things like lost interrupts
never happen.

I have no idea what the status of real-time Linux is, whether it would now
have guarantees for timer ticks, and whether anything there would now
be mergeable into the mainline kernel.

--linas

2009-01-07 16:04:25

by john stultz

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Wed, Jan 7, 2009 at 6:34 AM, Danny Mayer <[email protected]> wrote:
> I'd much rather you spend the time tackling the clock interrupt losses
> that many of our Linux users complain about. See:
> https://support.ntp.org/bin/view/Support/KnownOsIssues#Section_9.2.4.
> for some of the gorier details. I'm sure you don't really want us
> recommending that they set HZ=100 in the kernel to alleviate the problem.

I believe the lost tick issue as well as the HZ=100 suggestions at the
page above are out of date for 2.6.21 and higher kernels as the
generic timekeeping rework addressed these problems.

Please let me know if you're still seeing any such issues with NTP.

thanks
-john

2009-01-07 17:27:37

by Warner Losh

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

In message: <[email protected]>
David Newall <[email protected]> writes:
: Linas Vepstas wrote:
: > Currently, the Linux kernel keeps time in UTC. This means
: > that it must take special actions to tick twice when a leap
: > second comes by.
:
: Except it doesn't have to tick twice. Refer to
: http://lkml.org/lkml/2009/1/7/78 in which I show that a time_t (what
: time() returns) counts leap seconds (According to Bernstein this is what
: UTC means), and using zoneinfo, the library processes leap seconds
: correctly.

This is *NOT* POSIX time_t. In order to be posix compliant, you can't
do what Bernstein suggests. You can be non-compliant and deal with it via
zoneinfo.

: I just realised that the Notes in man 2 time are confusing and probably
: unnecessary. Suffice to say that (assuming correctly configured
: zoneinfo) time() returns the number of seconds elapsed since start 1970.

That's not POSIX compliant.

Warner

2009-01-07 17:38:58

by Warner Losh

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

In message: <[email protected]>
"Linas Vepstas" <[email protected]> writes:
: However, during the discussion, the idea came out that
: maybe keeping UTC time in the kernel is just plain stupid.
: So there's this idea floating around that maybe the kernel
: should keep TAI time instead. The hope is that this will
: reduce the complexity in the kernel, and push it out to
: user space, "where it belongs" (to repeat a well-worn
: mantra).

I agree that this is where it belongs, but it is hard to do that in a
POSIX compliant way. It also becomes hard to timestamp things in
filesystems using UTC rather than TAI. There are other protocols that
deal with UTC times as well.

: However, *if* we were to kick UTC out of the kernel,
: and push it to user-land, then, of course, there's a
: different problem: how does the kernel know what the
: correct TAI time is? As your reply makes abundantly
: clear, NTP is not a good source for TAI information.

Agreed. That's the whole crux of the 'multiple time scales suck'
threads that I've talked about in other forums. You have to know this
information before you start, and have to deal with the 'dusty system'
problem for systems that have been off for 6 months or not upgraded. You also
have to cope with learning after the fact that your initial guess was
wrong.

I've had many systems that would get this information from GPS and
stall the rest of the system until this data came in. I did this
mostly because there were big issues with the software down stream if
you changed the delta between your putative UTC and TAI after the
fact.

: The comments which you labelled as "non-sense" were
: a mis-understanding of a discussion of a particular issue
: that would arise if the kernel were to keep TAI -- if it did,
: then user-space systems would need to have a reliable
: source for leap-seconds. Since NTP does not
: provide this, there was discussion about how that
: could be worked-around. This then led to the comment
: that, "gee, wouldn't the right long-term solution be that
: NTP provide TAI info?"

I've wanted this for a long time...

: Clearly, it would be a lot of work to get the kernel to keep
: TAI instead of UTC, so this is not, at this time, a "serious
: proposal". But if it were possible, and all the various
: little issues that result were solvable, then it does seem
: like a better long-term solution.

Yes. The kernel would need to be able to return both UTC and TAI
times to the kernel as well, since there are requirements for NFS to
return timestamps in UTC, not in TAI. Many file systems specify UTC
time, or have traditionally been implemented that way.

Warner

2009-01-07 17:41:52

by Warner Losh

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

In message: <[email protected]>
Danny Mayer <[email protected]> writes:
: Why don't you tell us what the real problem is instead of telling us
: that you need TAI offset information?

The real problem is that POSIX time_t totally ignores leap seconds.
This forces systems that are rolling through a leap second to repeat
time, causing time to jump backwards by 1s (or violate POSIX time_t's
invariant that midnight time_t is % 86400 == 0). This jump backwards
is a pita in the kernel, and violates the assumption that many
programs have that time doesn't flow backwards.

The suggested solution would be to tick in TAI time, and force
userland to cope with the leapsecond issues. Of course, there's a
number of problems with this solution as well, but it feels like it
belongs there...
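The repeated second falls straight out of the POSIX "Seconds Since the
Epoch" formula. The sketch below transcribes that formula (the function
name posix_seconds is invented for this example; Python's // matches C
integer division here because every operand is non-negative for dates
after 1970) and feeds it the leap second at the end of 2008.

```python
def posix_seconds(tm_year, tm_yday, tm_hour, tm_min, tm_sec):
    """POSIX seconds-since-the-Epoch formula.

    tm_year is years since 1900; tm_yday is days since January 1 (0-based).
    """
    return (tm_sec + tm_min * 60 + tm_hour * 3600 + tm_yday * 86400
            + (tm_year - 70) * 31536000
            + ((tm_year - 69) // 4) * 86400
            - ((tm_year - 1) // 100) * 86400
            + ((tm_year + 299) // 400) * 86400)


# 2008 is a leap year, so December 31 is tm_yday 365.
before = posix_seconds(108, 365, 23, 59, 59)   # 2008-12-31 23:59:59
leap = posix_seconds(108, 365, 23, 59, 60)     # 2008-12-31 23:59:60
midnight = posix_seconds(109, 0, 0, 0, 0)      # 2009-01-01 00:00:00
```

Here leap and midnight come out as the same number, so a clock that
follows the formula through 23:59:60 has to show one value for two
consecutive seconds, or step back, while midnight stays a multiple of
86400.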

Warner

2009-01-07 19:24:48

by Danny Mayer

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Linas Vepstas wrote:
> [...]
>
>>> a discussion of a particular issue
>>> that would arise if the kernel were to keep TAI -- if it did,
>>> then user-space systems would need to have a reliable
>>> source for leap-seconds. Since NTP does not
>>> provide this, there was discussion about how that
>>> could be worked-around. This then led to the comment
>>> that, "gee, wouldn't the right long-term solution be that
>>> NTP provide TAI info?"
>> NTP can provide leap-second information via an autokey protocol request,
>> see Section 10.6 Leapseconds Values Message (LEAP)
>> http://www.ietf.org/internet-drafts/draft-ietf-ntp-autokey-04.txt but
>
> Yes, that looks like exactly what would be wanted. It would be nice
> if such a message was available in the regular, non-encrypted protocol.

It's not encrypted, it's an authentication protocol. You really do need
to know that you are receiving a reliable set of information otherwise
anyone can spoof you with bad data and play havoc with your clock and
timestamps.

>> that means you need to have autokey set up with another NTP server and
>> that means adding infrastructure that you probably don't want and are
>> not prepared to handle.
>
> Heh. Yes, well, I still haven't figured out how to secure DNS. Yet clearly
> this whole security mess must march on, and somehow the security
> infrastructure must eventually become easy to install.
>
<DNS hat>
That's pretty easy. Install BIND 9.6.0. Read the DNSSEC deployment
instructions here: https://www.isc.org/files/DNSSEC_in_6_minutes.pdf and
implement. You should be done in almost no time.
</DNS hat>

>>> Clearly, it would be a lot of work to get the kernel to keep
>>> TAI instead of UTC, so this is not, at this time, a "serious
>>> proposal". But if it were possible, and all the various
>>> little issues that result were solvable, then it does seem
>>> like a better long-term solution.
>>>
>> This is a *lot* more complicated than you might think. If you are
>> thinking of implementing this similarly to the way timezone information
>> is added for display purposes, you need the whole list of leap seconds
>> and when the change happened since you now have to look at a timestamp
>> and see when it was and then apply all of the leapseconds up to that
>> point in time and none of the leapseconds beyond that. In addition, you
>> have legacy files that have UTC timestamps on them so you would need to
>> distinguish between UTC (legacy) and TAI timestamps in the file system
>> among other places (anywhere where a timestamp exists) and what would
>> you do about database tables which contain timestamps? The list goes on.
>
> Yes.
>
>> I'd much rather you spend the time tackling the clock interrupt losses
>> that many of our Linux users complain about. See:
>> https://support.ntp.org/bin/view/Support/KnownOsIssues#Section_9.2.4.
>> for some of the gorier details. I'm sure you don't really want us
>> recommending that they set HZ=100 in the kernel to alleviate the problem.
>
> Actually, this is rather sorely lacking in 'gory details'; rather, it's
> a complaint that 'things don't work' with no discussion of the actual
> problem. It would be much better if there was a link to any previous
> discussions on LKML on this issue.

Sorry, but that's not my area of expertise. I just know we have many
people running Linux and have these issues.

>
> My knee-jerk reaction on reading about the lost-interrupts issue is that,
> yes, setting HZ=100 and disabling ACPI is indeed a decent short-term
> work-around (APIC is something completely different and not something
> you can disable). The correct long-term solution would be to use real-time
> kernels, which are designed to make sure that things like lost interrupts
> never happen.
>

I bow to your superior knowledge in this area.

Danny

2009-01-07 19:33:42

by Alan

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> time, causing time to jump backwards by 1s (or violate POSIX time_t's
> invariant that midnight time_t is % 86400 == 0). This jump backwards
> is a pita in the kernel, and violates the assumption that many
> programs have that time doesn't flow backwards.

They can slew the clock slowly as well. There is a wonderful quote from
one of the summaries of the POSIX committee discussions on time that says
quite simply "the posix clock is not guaranteed to be accurate"

As it currently stands the kernel contains sufficient support that at the
point you know a leap second is coming you can adjust the second length
marginally over the entire period.

The current behaviour is an implementation decision. Jumping on a second
shouldn't be an issue to most people, jumping back is asking for badness
but isn't in fact used in the world today. Slewing the entire day so that
each second is 1/86400 of a second longer or shorter wouldn't be noticed
by anyone.
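As a rough check of the arithmetic behind slewing over a whole day,
here is a small sketch. The function names and the choice of a linear
slew over exactly 86400 seconds are assumptions made for illustration;
this is not a description of what the kernel or ntpd actually does.

```python
SECONDS_PER_DAY = 86400


def slew_rate_ppm(correction_s=1.0, window_s=SECONDS_PER_DAY):
    """Frequency adjustment, in parts per million, needed to absorb
    correction_s seconds evenly over window_s seconds."""
    return correction_s / window_s * 1e6


def accumulated_offset(elapsed_s, correction_s=1.0, window_s=SECONDS_PER_DAY):
    """Seconds of correction already absorbed elapsed_s into the window."""
    elapsed_s = min(max(elapsed_s, 0), window_s)
    return correction_s * elapsed_s / window_s
```

Spreading one second over a day works out to roughly 11.6 ppm, and
halfway through the window the slewed clock is half a second away from
a clock that steps at the leap second instead.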

Alan

2009-01-07 19:44:58

by Warner Losh

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

In message: <[email protected]>
Alan Cox <[email protected]> writes:
: > time, causing time to jump backwards by 1s (or violate POSIX time_t's
: > invariant that midnight time_t is % 86400 == 0). This jump backwards
: > is a pita in the kernel, and violates the assumption that many
: > programs have that time doesn't flow backwards.
:
: They can slew the clock slowly as well. There is a wonderful quote from
: one of the summaries of the POSIX committee discussions on time that says
: quite simply "the posix clock is not guaranteed to be accurate"

True, you can. However, anybody you peer with via ntpd will have
issues unless things are coordinated with ntpd (and aren't a leaf
node). There you have much higher tolerances for correctness.

: As it currently stands the kernel contains sufficient support that at the
: point you know a leap second is coming you can adjust the second length
: marginally over the entire period.
:
: The current behaviour is an implementation decision. Jumping on a second
: shouldn't be an issue to most people, jumping back is asking for badness
: but isn't in fact used in the world today. Slewing the entire day so that
: each second is 1/86400 of a second longer or shorter wouldn't be noticed
: by anyone.

If you are an ntp leaf node, that doesn't care about UTC accurate to
the second, this will work well. For most users, this effectively
papers over the problem.

If you do care about UTC time being more accurate than this, slewing
will be too large and introduce errors that are too big. Likewise for
non-leaf ntp nodes. For these machines, having time be off by 1/2
second can be very bad. There are many real-time systems that fall
into this category, trading systems on wall street, systems that
control things based on doing things at certain points within UTC
second, etc. For those types of systems, changing the length of the
second by this much isn't going to work at all.

ntpd also lights the INS bit only on 'leap day' so depending on when
you poll, you might not have a full day's notice of these changes, but
that can be managed...

Warner

2009-01-07 21:42:25

by Chris Adams

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Once upon a time, David Newall <[email protected]> said:
> > - the standards say that time() returns seconds since the epoch in UTC
> > _except_ explicitly NOT including leap seconds
>
> Don't believe everything you read. For example, the time(2) man page
> says what POSIX does, but doesn't actually say that Linux also does the
> same. It also says what you paraphrased above, but demonstrably that's
> not the case. Man pages often are wrong in some details. Hence RTSL.

I wasn't talking about man pages. I already quoted the section from the
Single Unix Specification version 3 (which supersedes POSIX) that
explicitly says leap seconds are ignored (despite sometimes heated
disagreement, as seen repeated here). The standard "seconds since the
epoch" is seconds since 1970-01-01 00:00:00 UTC but not including leap
seconds (00:00:00 UTC is always 86400*n seconds since the epoch).
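A quick way to see the "86400*n" property, and what it costs, is to
compare two midnights on either side of the 2008 leap second. This is a
minimal sketch using Python's calendar.timegm, which follows the same
no-leap-seconds convention; the variable names are just for
illustration.

```python
import calendar


def posix_midnight(year, month, day):
    """POSIX timestamp of 00:00:00 UTC on the given date."""
    return calendar.timegm((year, month, day, 0, 0, 0))


dec31 = posix_midnight(2008, 12, 31)
jan01 = posix_midnight(2009, 1, 1)

# What subtracting the two timestamps reports.
posix_elapsed = jan01 - dec31

# What actually elapsed: one leap second was inserted at 23:59:60.
si_elapsed = posix_elapsed + 1
```

Both midnights are exact multiples of 86400, and the difference between
them is 86400 even though 86401 SI seconds passed that day.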

As long as Linux wants to work like all the other POSIX systems (which
it should unless there is huge advantage in doing otherwise), time(),
gettimeofday(), etc., all must work without leap seconds.

--
Chris Adams <[email protected]>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.

2009-01-07 22:13:37

by Chris Adams

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Once upon a time, David Newall <[email protected]> said:
> I went back to the start of the thread. Chris posted a stack trace
> showing "#15 0xffffffff8104ec16 in ntp_leap_second (timer=<value
> optimized out>) at kernel/time/ntp.c:143". That would be kernel code to
> process leap seconds from NTP broadcasts, I think. That code needs to
> be removed.

Well, the code is to process when the kernel is told about leap seconds
(it doesn't have to be NTP, you can do it with adjtimex, which is what I
did to track down the problem).
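For readers who have not looked at that code, the leap-insert handling
can be modelled as a small state machine. The constants below use the
values from <sys/timex.h> (TIME_OK, TIME_INS, TIME_OOP, TIME_WAIT,
STA_INS); the LeapModel class itself is a hypothetical, heavily
simplified user-space model with one-second ticks, not the kernel
implementation.

```python
TIME_OK, TIME_INS, TIME_OOP, TIME_WAIT = 0, 1, 3, 4
STA_INS = 0x0010  # status bit: insert a leap second at end of day


class LeapModel:
    """Toy model of a clock that repeats one second for a leap insert."""

    def __init__(self, now):
        self.now = now          # seconds since the epoch
        self.status = 0         # would be set via adjtimex() in real life
        self.state = TIME_OK

    def tick(self):
        """Advance one second and return the clock reading."""
        self.now += 1
        if self.state == TIME_OK and self.status & STA_INS:
            self.state = TIME_INS
        if self.state == TIME_INS and self.now % 86400 == 0:
            self.now -= 1               # repeat 23:59:59
            self.state = TIME_OOP
        elif self.state == TIME_OOP:
            self.state = TIME_WAIT      # leap second is over
        elif self.state == TIME_WAIT and not (self.status & STA_INS):
            self.state = TIME_OK
        return self.now
```

Started three seconds before 2009-01-01 00:00:00 UTC with STA_INS set,
the model reads ...998, ...999, ...999, ...000, ...001: the last second
of the day appears twice, which is the "repeat one second" behaviour
discussed in this thread.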

But why should it be removed? Why change Linux to be incompatible with
POSIX and other Unix systems? This could create real problems with
things like network file systems for example. Even trying to get
interoperation between UTC-Linux and TAI-Linux would be a PITA.

There was a bug, there is a patch, it should be fixed. There's no
reason to reinvent the wheel just because there was a bug.

Looking at comments, there was another bug related to the same
xtime_lock/printk issue but not in leap second related code; it was
trying to print a message about changing clock sources. Should we now
re-architect all of that as well?

--
Chris Adams <[email protected]>
Systems and Network Administrator - HiWAAY Internet Services
I don't speak for anybody but myself - that's enough trouble.

2009-01-08 03:58:52

by Danny Mayer

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Alan Cox wrote:
>> time, causing time to jump backwards by 1s (or violate POSIX time_t's
>> invariant that midnight time_t is % 86400 == 0). This jump backwards
>> is a pita in the kernel, and violates the assumption that many
>> programs have that time doesn't flow backwards.
>
> They can slew the clock slowly as well. There is a wonderful quote from
> one of the summaries of the POSIX committee discussions on time that says
> quite simply "the posix clock is not guaranteed to be accurate"
>
> As it currently stands the kernel contains sufficient support that at the
> point you know a leap second is coming you can adjust the second length
> marginally over the entire period.
>
> The current behaviour is an implementation decision. Jumping on a second
> shouldn't be an issue to most people, jumping back is asking for badness
> but isn't in fact used in the world today. Slewing the entire day so that
> each second is 1/86400 of a second longer or shorter wouldn't be noticed
> by anyone.

NTP handles most of this, but it needs the cooperation of the O/S kernel
and most of the Unix kernels are able to provide the required API's.
FreeBSD doesn't have any of these problems but Linux historically has.
Most of that code was designed by Dave Mills but since each kernel is
different we should not expect them all to behave the same way and
generally requires an understanding of what NTP expects and that's not
always clear to kernel developers who are not expected to know NTP.

Danny

2009-01-08 04:44:32

by Warner Losh

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

In message: <[email protected]>
Danny Mayer <[email protected]> writes:
: Alan Cox wrote:
: >> time, causing time to jump backwards by 1s (or violate POSIX time_t's
: >> invariant that midnight time_t is % 86400 == 0). This jump backwards
: >> is a pita in the kernel, and violates the assumption that many
: >> programs have that time doesn't flow backwards.
: >
: > They can slew the clock slowly as well. There is a wonderful quote from
: > one of the summaries of the POSIX committee discussions on time that says
: > quite simply "the posix clock is not guaranteed to be accurate"
: >
: > As it currently stands the kernel contains sufficient support that at the
: > point you know a leap second is coming you can adjust the second length
: > marginally over the entire period.
: >
: > The current behaviour is an implementation decision. Jumping on a second
: > shouldn't be an issue to most people, jumping back is asking for badness
: > but isn't in fact used in the world today. Slewing the entire day so that
: > each second is 1/86400 of a second longer or shorter wouldn't be noticed
: > by anyone.
:
: NTP handles most of this, but it needs the cooperation of the O/S kernel
: and most of the Unix kernels are able to provide the required API's.
: FreeBSD doesn't have any of these problems but Linux historically has.
: Most of that code was designed by Dave Mills but since each kernel is
: different we should not expect them all to behave the same way and
: generally requires an understanding of what NTP expects and that's not
: always clear to kernel developers who are not expected to know NTP.

On FreeBSD, Solaris and Digital Unix, I'll point out, that jumping
backwards is used, and has been used since at least 1994. So saying
it isn't used in the world today is flat out wrong.

Warner

2009-01-08 10:50:59

by Alan

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

> On FreeBSD, Solaris and Digital Unix, I'll point out, that jumping
> backwards is used, and has been used since at least 1994. So saying
> it isn't used in the world today is flat out wrong.

I stand by my comment - when was the last time the IERS used a leap
second removal ? The code may exist but it doesn't happen.

Alan

2009-01-08 10:58:09

by Alan

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Thu, 8 Jan 2009 10:48:54 +0000
Alan Cox <[email protected]> wrote:

> > On FreeBSD, Solaris and Digital Unix, I'll point out, that jumping
> > backwards is used, and has been used since at least 1994. So saying
> > it isn't used in the world today is flat out wrong.

[Ignore previous email, must remember not to post before waking up ;)]

You are correct - and provided gettimeofday() is being used on Linux
rather than time() (which simply appears to stall due to its
resolution), the same is true.

Some users do run with the "right" timezone data in non posix mode
because they want their seconds 'sane' but that isn't the default.

Alan

2009-01-08 15:05:44

by Warner Losh

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

In message: <[email protected]>
Alan Cox <[email protected]> writes:
: > On FreeBSD, Solaris and Digital Unix, I'll point out, that jumping
: > backwards is used, and has been used since at least 1994. So saying
: > it isn't used in the world today is flat out wrong.
:
: I stand by my comment - when was the last time the IERS used a leap
: second removal ? The code may exist but it doesn't happen.

Jumping backwards is used for every leap second that IERS has ever
done, which was your original comment. There has never been a case
where there was a leap second for a jump forward, though. The proper
technical term here is 'negative leap second'. All leap seconds up
until now have been positive leap seconds, and it is unlikely there
ever will be a negative one.

Warner

2009-01-08 16:51:46

by Magnus Danielson

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

M. Warner Losh skrev:
> In message: <[email protected]>
> David Newall <[email protected]> writes:
> : Linas Vepstas wrote:
> : > Currently, the Linux kernel keeps time in UTC. This means
> : > that it must take special actions to tick twice when a leap
> : > second comes by.
> :
> : Except it doesn't have to tick twice. Refer to
> : http://lkml.org/lkml/2009/1/7/78 in which I show that a time_t (what
> : time() returns) counts leap seconds (According to Bernstein this is what
> : UTC means), and using zoneinfo, the library processes leap seconds
> : correctly.
>
> This is *NOT* POSIX time_t. In order to be posix compliant, you can't
> do what Bernstein suggests. You can be non-compliant and deal with it via
> zoneinfo.

You are free to keep your core time in whatever form you wish, but if
you want your time_t to be POSIX compatible when accessed over POSIX
interfaces you would need to honour the POSIX time_t mapping. While
POSIX tried to avoid the leapsecond issue, the mapping they do perform
has a peculiar effect on what happens to time_t if you also want to
honour the UTC to time_t mapping while accepting UTC from external sources.

> : I just realised that the Notes in man 2 time are confusing and probably
> : unnecessary. Suffice to say that (assuming correctly configured
> : zoneinfo) time() returns the number of seconds elapsed since start 1970.
>
> That's not POSIX compliant.

It just *appears* to be the number of "seconds" since 1970. This
appearance is important to some and causes grief to others.

Cheers,
Magnus

2009-01-08 19:57:46

by Marshall Eubanks

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009


On Jan 8, 2009, at 10:02 AM, M. Warner Losh wrote:

> In message: <[email protected]>
> Alan Cox <[email protected]> writes:
> : > On FreeBSD, Solaris and Digital Unix, I'll point out, that jumping
> : > backwards is used, and has been used since at least 1994. So saying
> : > it isn't used in the world today is flat out wrong.
> :
> : I stand by my comment - when was the last time the IERS used a leap
> : second removal ? The code may exist but it doesn't happen.
>
> Jumping backwards is used for every leap second that IERS has ever
> done, which was your original comment. There has never been a case
> where a leap second required a jump forward, though. The proper
> technical term here is 'negative leap second'. All leap seconds up
> until now have been positive leap seconds, and it is unlikely there
> ever will be a negative one.

I disagree. In the 1970s, the excess LOD (length of day) was as much as
3 msec. After going down some, in the mid 1990s it rose to around 2 msec.
Now, it is around 1 msec.

Here is a plot

http://www.iers.org/MainDisp.csl?pid=95-100

Only the long period variations count for leap seconds - the seasonal
and other high frequency oscillations tend to average out.

In the early part of the last century (~1905), it decreased by ~5 msec
in a year or so. If that happened right now, it would go to ~ -4 msec,
and we would be seeing 2 negative leap seconds or more per year. Even if
the decrease from 1975 to 1985 happened again, it would be at -1 msec,
and we would have a negative leap second every two years or so.
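
A back-of-the-envelope sketch of the arithmetic behind those rates,
assuming the excess LOD stays constant over the year (the function name
and the 365.25-day year are assumptions made here for illustration). An
excess of x msec per day accumulates to x * 365.25 msec per year, and a
leap second is due each time a full second has built up.

```python
DAYS_PER_YEAR = 365.25  # assumed average year length


def leap_seconds_per_year(excess_lod_ms):
    """Seconds of drift accumulated per year for a constant excess LOD.

    excess_lod_ms is in milliseconds per day. A positive result means
    positive (inserted) leap seconds, a negative result means negative
    (deleted) leap seconds.
    """
    return excess_lod_ms * DAYS_PER_YEAR / 1000.0


for lod in (3.0, 2.0, 1.0, -1.0, -4.0):
    rate = leap_seconds_per_year(lod)
    years_between = 1.0 / abs(rate)
    print(f"excess LOD {lod:+.1f} msec: {rate:+.2f} s/year, "
          f"one leap second every {years_between:.1f} years")
```

Under these assumptions -4 msec works out to roughly one and a half
negative leap seconds per year and -1 msec to one every two to three
years, the same order of magnitude as the estimates above.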

What is a reasonable assumption is that we would likely have a year or
more warning of the likelihood of a negative leap second.

Regards
Marshall Eubanks

>
>
> Warner
> _______________________________________________
> ntpwg mailing list
> [email protected]
> https://lists.ntp.org/mailman/listinfo/ntpwg

2009-01-08 20:39:28

by Steve Allen

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Wed 2009-01-07T10:39:47 -0700, M. Warner Losh hath writ:
> The suggestion to solving this would be to tick in TAI time, and force
> userland to cope with the leapsecond issues. Of course, there's a
> number of problems with this solution as well, but it feels like it
> belongs there...

Agreed that leap seconds belong in userland, but BIPM itself
refuses to agree with the idea of the underlying time scale being TAI.
TAI has no standing as an international recommendation, and it is not
available via the established broadcast mechanisms, and BIPM does not
want those things to happen. What would be needed is a leap-free time
scale with an international recommendation standing behind it so as to
legitimize its use.

The most recent public insight to the ITU-R process of reconsidering
leap seconds in UTC is from September, here
http://www.navcen.uscg.gov/cgsic/meetings/48thmeeting/Reports/Timing%20Subcommittee/48-LS%2020080916.pdf
In the schedule given on page 16 we see that even if the ITU-R process
goes smoothly there will be leap seconds at least until 2017, so we
have to live with them for at least that long. However, at the
October meeting of ITU-R WP7A things did not go smoothly. There were
two countries objecting to any change to UTC, so the process of
considering any change to the broadcast time scale is stalled.

Basically, any change to UTC is currently stalled by the
international political/diplomatic process which controls it.

Way back in 2003 the ITU-R asked for advice on the broadcast time
scale, and the advice from the experts included changing the name if
leaps are dropped.
http://www.inrim.it/luc/cesio/itu/closure.pdf
At that point nobody managed to point out that POSIX demands that the
zoneinfo mechanisms allow for offsets of seconds as well as minutes,
so there was no clear path for implementing that advice while
preserving compliance with specifications that still demand UTC.

Any epoch-based time scale has issues with UTC as it has been defined
http://www.ucolick.org/~sla/leapsecs/epochtime.html
and by ignoring leap seconds POSIX makes that even harder to implement.
During the past century we have seen the creation of at least 4
different uniform time scales, two of which are widely available by
broadcast (LORAN-C and GPS), but none of which has the backing of an
international standard behind it.
http://www.ucolick.org/~sla/leapsecs/deltat.html

All civil time scales are conventional constructs, and zoneinfo is
designed to handle the arbitrary nature of changes to civil time. If
the underlying time scale changes its name and stops having leaps,
then leap seconds in UTC are just another form of conventional change
to civil time. UTC could become a time zone. Processes which happen
when POSIX time_t % 86400 == 0 would happen at "atomic midnight"
instead of "civil" midnight, not a big difference.

If the ITU-R were to take the advice of the colloquium it organized,
if they were to abandon the name UTC, and establish a new
international broadcast time scale with a new name, then the
operational systems of the world receiving those broadcasts would not
notice. There would be some rewriting of documents, specifications,
and some extra work streamlining zoneinfo.

It's not just an engineering tradeoff, it's a political tradeoff.
The question for the NTP implementors, kernel hackers, application
writers is whether it's worth waiting to see if the current political
impasse about UTC can be broken, or whether it seems better, easier,
and quicker to lobby the ITU-R delegations to abandon the name UTC
and give a new name to a broadcast time scale without leaps.

Either way we will have to handle another decade of leap seconds
before the broadcast time scale can change its characteristics.

replies directed to the LEAPSECS list

--
Steve Allen <[email protected]> WGS-84 (GPS)
UCO/Lick Observatory Natural Sciences II, Room 165 Lat +36.99855
University of California Voice: +1 831 459 3046 Lng -122.06015
Santa Cruz, CA 95064 http://www.ucolick.org/~sla/ Hgt +250 m

2009-01-08 22:47:54

by David Mills

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Folks,

You are not correct. The kernel software clock variable is in fact
stepped back, but the routine that actually reads the clock does not
step the clock back unless it is set back more than two seconds.
Otherwise, the clock is strictly monotonic. That is the advice I gave in
rfc1589 and implemented in the Digital Unix kernel, because I wrote the
code. Other kernelmongers might or might not have taken the advice.
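
A minimal user-space sketch of the reading rule described above. This is
not the Digital Unix code: the class name, the microsecond tick and the
choice to creep forward by one tick during a small setback are all
assumptions made for illustration. Only the two-second threshold comes
from the description.

```python
class MonotonicReader:
    """Hides small backward steps of an underlying clock variable."""

    STEP_THRESHOLD = 2.0  # seconds; larger setbacks are passed through
    TICK = 1e-6           # assumed smallest increment handed to readers

    def __init__(self):
        self.last = None  # last value returned to a caller

    def read(self, raw):
        """Return a reading for the raw clock value `raw` (seconds)."""
        if self.last is None or raw > self.last:
            # Normal case: the clock moved forward.
            self.last = raw
        elif self.last - raw > self.STEP_THRESHOLD:
            # The clock was deliberately set back a long way: accept it.
            self.last = raw
        else:
            # Small setback, such as a leap second: never go backwards,
            # creep forward until the raw clock catches up.
            self.last += self.TICK
        return self.last


reader = MonotonicReader()
print(reader.read(100.0))   # forward reading
print(reader.read(100.5))   # forward reading
print(reader.read(99.6))    # raw clock stepped back by about a second
print(reader.read(50.0))    # set back far more than two seconds
```

The third reading stays just above 100.5 even though the raw value went
backwards, while the fourth follows the large setback.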

As for the TAI issue discussed earlier, note that the generic NTP kernel
support from me since 1991 has TAI. However, reading it requires the
ntp_gettime() syscall, and not all kernels support it.

The recent leap was observed to work correctly in Solaris and FreeBSD.
It worked fine with the WWV driver and the Spectracom GPS driver, but
not the NMEA, Arbiter, Meinberg nor any of the NIST or USNO primary
servers. It probably did work with the Canadian servers, since the
Ottawa primary server is synchronized via my CHU audio driver. It didn't
work on my carefully contrived backroom servers, as they lost power
during the event.

See http://www.eecis.udel.edu/~mills/leap.html and/or the online NTP
documentation and/or my book.

Dave

Alan Cox wrote:

>On Thu, 8 Jan 2009 10:48:54 +0000
>Alan Cox <[email protected]> wrote:
>
>
>
>>>On FreeBSD, Solaris and Digital Unix, I'll point out, that jumping
>>>backwards is used, and has been used since at least 1994. So saying
>>>it isn't used in the world today is flat out wrong.
>>>
>>>
>
>[Ignore previous email, must remember not to post before waking up ;)]
>
>You are correct - and providing gettimeofday() is being used on Linux
>rather than time() which simply appears to stall due to resolution the
>same is true.
>
>Some users do run with the "right" timezone data in non posix mode
>because they want their seconds 'sane' but that isn't the default.
>
>Alan

2009-01-10 09:46:30

by David Newall

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Alan Cox wrote:
>> Is there a mktime() in the kernel? Isn't it pure user-space? Mktime
>> does appear to know all about leap seconds (assuming they're in zoneinfo.)
>>
>
> The GPL goes to great trouble to ensure you get the kernel source code.
> Why not use it.
>

Okay. I'm not sure how long you have realised that two, completely
different mktimes have been confused with each other, but surely longer
than me.

The kernel mktime, as far as I can tell without spending a week on it,
is used only on some platforms, during startup to read and set real time
clocks and alarms. Where it matters is RTC ioctls which, tragically, I
think, are passed a struct rtc_time. They should be passed a time_t or
struct timeval because these are what's used everywhere else that I can
think of, between kernel and user-space.

Ideally, struct rtc_time should be deprecated in favour of time_t.
Changes to user-space programs should be trivial; probably, they
currently look like

{ struct rtc_time *rt = gmtime(&t); ioctl(fd, RTC_xxx, rt); }


This is not going to happen without a huge song and dance, which I
certainly don't have the energy for. I think it makes no practical
difference, and only affects how tidy the kernel looks. Assuming
leap-seconds are properly configured in zoneinfo, user-space programs
which use gmtime() to set the RTC will run it fast by the current number
of leap-seconds. However the RTC will continue to advance by one second
per second, and being that fast, mktime will produce the correct time_t
when RTC is read back. This will eventually cause a real problem with
RTCs that handle leap years. When we have 4 years worth of
leap-seconds, or maybe it's 96 years worth, the RTC will be set for a
leap-year when it is not, or vice versa. That's a long time away.

It's something of a farce that some systems crashed at the leap-second
because no adjustment was needed. Trying to turn two seconds into one
was a mistake. I gather the mistake is in the NTP client, which should
just ignore the LEAP-SECOND bit, and use the leap-second information
from zoneinfo to convert from the NTP timebase to Linux's.

>> showing "#15 0xffffffff8104ec16 in ntp_leap_second (timer=<value
>> optimized out>) at kernel/time/ntp.c:143". That would be kernel code to
>> process leap seconds from NTP broadcasts, I think. That code needs to
>> be removed.
>>
>
> I suggest you read that code and understand it.
>

Well, there's rather a lot wrong with it, isn't there? All of the stuff
that tries to handle leap seconds is wrong; that goes for timex.h, too.
The kernel needs to do nothing special to handle leap-seconds; they're
just seconds, like every other one.

For the third time, this code has to come out.

2009-01-12 16:11:47

by Pavel Machek

Subject: Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Mon 2009-01-05 11:42:35, Linas Vepstas wrote:
> 2009/1/5 <[email protected]>:
> > On Mon, 5 Jan 2009, Linas Vepstas wrote:
> >
> >>> Arguably the kernel's responsibility should be to keep track of the
> >>> most fundamental representation of time possible for a machine (that's
> >>> probably TAI) and it is a userspace responsibility to map from that
> >>> value to other time standards including UTC,
> >>
> >> Yes, this really does seem like the right solution.
> >>
> >>> using control files
> >>> which are updated as leap seconds are declared.
> >>
> >> Lets be clear on what "control files" means. This does
> >> *NOT* mean some config file shipped by some distro
> >> for some package. That would be a horrid solution.
> >> People don't install updates, patches, etc. Distros
> >> ship them late, or never, if the distro is old enough.
> >>
> >> A more appropriate solution would be to have
> >> either the kernel or ntpd track the leap seconds
> >> automatically. First, the ntp protocol already provides
> >> the needed notification of a leap second to anyone
> >> who cares about it (i.e. there is no point in getting a
> >> Linux distro involved in this -- a distribution mechanism
> >> already exists, and works *better* than having a distro
> >> do it).
> >
> > I disagree with this. NTP will only know about leap seconds if it was
> > running and connected to a server that advertised the leap seconds during
> > that month.
> >
> > for example, if you installed a new server today, how would it ever know
> > that there was a leap second a couple of days ago?
>
> OK, good point. Unless your distro was less
> than a few days old (unlikely), you are faced with the
> same problem. Sure, eventually, the distro will publish
> an update (which will add to the existing list of 36 leap
> seconds -- which is needed in any case, since no one
> has a server that's been up since 1958), but this is
> unlikely to happen during this install window.
>
> The long term solution would be write an RFC to extend
> NTP to also provide TAI information -- e.g. to add a
> message that indicates the current leap-second offset
> between UTC and TAI.

Offset is not enough; you'd have to provide list of all previous leap
seconds with 'when it happened' timestamps.
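
A small sketch of why a single offset is not enough, assuming a table of
(effective date, TAI-UTC) pairs. Only the three most recent entries as of
January 2009 are listed here; a complete table would go back to 1972. The
names LEAP_TABLE and tai_minus_utc are illustrative.

```python
import bisect
import calendar

# (UTC date from which the offset applies, TAI-UTC in seconds).
LEAP_TABLE = [
    ((1999, 1, 1), 32),
    ((2006, 1, 1), 33),
    ((2009, 1, 1), 34),
]

# POSIX timestamps at which each offset takes effect.
_STARTS = [calendar.timegm(date + (0, 0, 0)) for date, _ in LEAP_TABLE]
_OFFSETS = [offset for _, offset in LEAP_TABLE]


def tai_minus_utc(posix_time):
    """Look up TAI-UTC for a POSIX timestamp covered by the table."""
    i = bisect.bisect_right(_STARTS, posix_time) - 1
    if i < 0:
        raise ValueError("timestamp predates the table")
    return _OFFSETS[i]


before = calendar.timegm((2008, 12, 31, 23, 59, 59))
after = calendar.timegm((2009, 1, 1, 0, 0, 0))
print(tai_minus_utc(before), tai_minus_utc(after))  # 33 34
```

With only the current offset, any timestamp from before the latest leap
second would be converted with the wrong value.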

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2009-01-12 17:08:49

by Warner Losh

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

In message: <[email protected]>
Pavel Machek <[email protected]> writes:
: On Mon 2009-01-05 11:42:35, Linas Vepstas wrote:
: > 2009/1/5 <[email protected]>:
: > > On Mon, 5 Jan 2009, Linas Vepstas wrote:
: > >
: > >>> Arguably the kernel's responsibility should be to keep track of the
: > >>> most fundamental representation of time possible for a machine (that's
: > >>> probably TAI) and it is a userspace responsibility to map from that
: > >>> value to other time standards including UTC,
: > >>
: > >> Yes, this really does seem like the right solution.
: > >>
: > >>> using control files
: > >>> which are updated as leap seconds are declared.
: > >>
: > >> Lets be clear on what "control files" means. This does
: > >> *NOT* mean some config file shipped by some distro
: > >> for some package. That would be a horrid solution.
: > >> People don't install updates, patches, etc. Distros
: > >> ship them late, or never, if the distro is old enough.
: > >>
: > >> A more appropriate solution would be to have
: > >> either the kernel or ntpd track the leap seconds
: > >> automatically. First, the ntp protocol already provides
: > >> the needed notification of a leap second to anyone
: > >> who cares about it (i.e. there is no point in getting a
: > >> Linux distro involved in this -- a distribution mechanism
: > >> already exists, and works *better* than having a distro
: > >> do it).
: > >
: > > I disagree with this. NTP will only know about leap seconds if it was
: > > running and connected to a server that advertised the leap seconds during
: > > that month.
: > >
: > > for example, if you installed a new server today, how would it ever know
: > > that there was a leap second a couple of days ago?
: >
: > OK, good point. Unless your distro was less
: > than a few days old (unlikely), you are faced with the
: > same problem. Sure, eventually, the distro will publish
: > an update (which will add to the existing list of 36 leap
: > seconds -- which is needed in any case, since no one
: > has a server that's been up since 1958), but this is
: > unlikely to happen during this install window.
: >
: > The long term solution would be write an RFC to extend
: > NTP to also provide TAI information -- e.g. to add a
: > message that indicates the current leap-second offset
: > between UTC and TAI.
:
: Offset is not enough; you'd have to provide list of all previous leap
: seconds with 'when it happened' timestamps.

Well, today you can ftp the leapseconds.txt file from NIST. Of
course, that assumes your machine is on the network, and not a dumb
slave of a smart head-end that's off the net...
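
A sketch of reading such a file once it has been fetched, assuming it
uses the layout of the NIST leap-seconds list: comment lines start with
'#', and each data line holds an NTP-era timestamp (seconds since
1900-01-01) followed by the TAI-UTC offset. The sample text is embedded
rather than downloaded, and the function name is illustrative.

```python
NTP_TO_UNIX = 2208988800  # seconds from 1900-01-01 to 1970-01-01

SAMPLE = """\
# NTP timestamp, TAI-UTC, date the offset takes effect
3124137600\t32\t# 1 Jan 1999
3345062400\t33\t# 1 Jan 2006
3439756800\t34\t# 1 Jan 2009
"""


def parse_leap_list(text):
    """Return a list of (posix_time, tai_minus_utc) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        ntp_seconds = int(fields[0])
        offset = int(fields[1])
        entries.append((ntp_seconds - NTP_TO_UNIX, offset))
    return entries


for posix_time, offset in parse_leap_list(SAMPLE):
    print(posix_time, offset)
```

The last sample entry converts to POSIX time 1230768000, that is
2009-01-01 00:00:00 UTC, with TAI-UTC equal to 34 seconds.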

Warner

2009-01-12 21:47:29

by Valdis Klētnieks

Subject: Re: [ntpwg] Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Mon, 12 Jan 2009 10:07:12 MST, "M. Warner Losh" said:

> Well, today you can ftp the leapseconds.txt file from NIST. Of
> course, that assumes your machine is on the network, and not a dumb
> slave of a smart head-end that's off the net...

If you're a dumb slave of a smart head-end, the sysadmin has already solved
the problem of getting files from the outside to the dumb slave, just for their
own sanity in pushing patches and *other* config file updates.

And if you're *not* getting updates pushed to you for all the *other* stuff,
the leapseconds is probably the least of your worries.



2009-06-08 02:18:31

by Ben Hutchings

Subject: Re: [PATCH] v2 Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Sat, 2009-01-03 at 12:01 -0600, Chris Adams wrote:
> Once upon a time, Duane Griffin <[email protected]> said:
> > How about instead of a switch statement, assigning the message to a
> > variable and printing that. I.e. something like:
>
> Good point. Here's an updated version that also adds a comment to the
> xtime_lock definition about not using printk.
> --
> Chris Adams <[email protected]>
> Systems and Network Administrator - HiWAAY Internet Services
> I don't speak for anybody but myself - that's enough trouble.
>
>
> From: Chris Adams <[email protected]>
>
> The code to handle leap seconds printks an information message when the
> second is inserted or deleted. It does this while holding xtime_lock.
> However, printk wakes up klogd, and in some cases, the scheduler tries
> to get the current kernel time, trying to get xtime_lock (which results
> in a deadlock). This moved the printks outside of the lock. It also
> adds a comment to not use printk while holding xtime_lock.
[...]

This patch doesn't seem to have gone anywhere. Was this bug fixed in
some other way or has it been forgotten?

Ben.

--
Ben Hutchings
Logic doesn't apply to the real world. - Marvin Minsky



2009-06-18 22:35:18

by Chris Friesen

Subject: Re: [PATCH] v2 Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Ben Hutchings wrote:
> On Sat, 2009-01-03 at 12:01 -0600, Chris Adams wrote:
>> Once upon a time, Duane Griffin <[email protected]> said:
>>> How about instead of a switch statement, assigning the message to a
>>> variable and printing that. I.e. something like:
>> Good point. Here's an updated version that also adds a comment to the
>> xtime_lock definition about not using printk.
>> --
>> Chris Adams <[email protected]>
>> Systems and Network Administrator - HiWAAY Internet Services
>> I don't speak for anybody but myself - that's enough trouble.
>>
>>
>> From: Chris Adams <[email protected]>
>>
>> The code to handle leap seconds printks an information message when the
>> second is inserted or deleted. It does this while holding xtime_lock.
>> However, printk wakes up klogd, and in some cases, the scheduler tries
>> to get the current kernel time, trying to get xtime_lock (which results
>> in a deadlock). This moved the printks outside of the lock. It also
>> adds a comment to not use printk while holding xtime_lock.
> [...]
>
> This patch doesn't seem to have gone anywhere. Was this bug fixed in
> some other way or has it been forgotten?

I'm interested in this as well...the current code still issues a
printk() while holding the xtime_lock for writing. Is this allowed or not?

In addition, is it allowed for older kernels also or is Chris Adams'
patch something that should get picked up for the 2.6.27 stable series?

Chris

2009-06-18 22:58:24

by Ben Hutchings

Subject: Re: [PATCH] v2 Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

On Thu, 2009-06-18 at 16:34 -0600, Chris Friesen wrote:
> Ben Hutchings wrote:
> > On Sat, 2009-01-03 at 12:01 -0600, Chris Adams wrote:
> >> Once upon a time, Duane Griffin <[email protected]> said:
> >>> How about instead of a switch statement, assigning the message to a
> >>> variable and printing that. I.e. something like:
> >> Good point. Here's an updated version that also adds a comment to the
> >> xtime_lock definition about not using printk.
> >> --
> >> Chris Adams <[email protected]>
> >> Systems and Network Administrator - HiWAAY Internet Services
> >> I don't speak for anybody but myself - that's enough trouble.
> >>
> >>
> >> From: Chris Adams <[email protected]>
> >>
> >> The code to handle leap seconds printks an information message when the
> >> second is inserted or deleted. It does this while holding xtime_lock.
> >> However, printk wakes up klogd, and in some cases, the scheduler tries
> >> to get the current kernel time, trying to get xtime_lock (which results
> >> in a deadlock). This moved the printks outside of the lock. It also
> >> adds a comment to not use printk while holding xtime_lock.
> > [...]
> >
> > This patch doesn't seem to have gone anywhere. Was this bug fixed in
> > some other way or has it been forgotten?
>
> I'm interested in this as well...the current code still issues a
> printk() while holding the xtime_lock for writing. Is this allowed or not?

Having investigated further, I believe it has been safe since this
change made in 2.6.27 (which cleverly preempted the new year):

commit b845b517b5e3706a3729f6ea83b88ab85f0725b0
Author: Peter Zijlstra <[email protected]>
Date: Fri Aug 8 21:47:09 2008 +0200

printk: robustify printk

Avoid deadlocks against rq->lock and xtime_lock by deferring the klogd
wakeup by polling from the timer tick.

Signed-off-by: Peter Zijlstra <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>

> In addition, is it allowed for older kernels also or is Chris Adams'
> patch something that should get picked up for the 2.6.27 stable series?

Anything older than 2.6.27 appears to need a change along the lines of
the above-mentioned commit or Chris's patch. Note that this was not the
only case where printk() could be called under xtime_lock. For example,
in arch/alpha/kernel/time.c timer_interrupt() calls set_rtc_mmss() which
can call printk().
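
A user-space sketch of the pattern in Chris's patch as described in the
quoted text: decide on the message while holding the lock, but emit it
only after the lock is released. This is not kernel code. The lock,
clock and log function here are stand-ins, with threading.Lock playing
the part of a non-reentrant xtime_lock.

```python
import threading

clock_lock = threading.Lock()   # stand-in for xtime_lock, non-reentrant
clock = {"sec": 1230768000}     # 2009-01-01 00:00:00 as a POSIX time
log_lines = []


def read_clock():
    """Read the time; like the scheduler path, this needs the lock."""
    with clock_lock:
        return clock["sec"]


def log(msg):
    """Stand-in for printk: logging ends up reading the clock."""
    log_lines.append((read_clock(), msg))


def insert_leap_second():
    msg = None
    with clock_lock:
        clock["sec"] -= 1        # repeat the last second of the day
        msg = "Clock: inserting leap second 23:59:60 UTC"
        # Calling log(msg) here would block forever on clock_lock.
    if msg is not None:
        log(msg)                 # safe: the lock has been released


insert_leap_second()
print(log_lines)
```

The change credited above to 2.6.27 takes a different route and defers
the klogd wakeup instead, so callers no longer have to avoid logging
under the lock.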

Ben.

--
Ben Hutchings
The generation of random numbers is too important to be left to chance.
- Robert Coveyou



2009-06-18 23:49:18

by Chris Friesen

Subject: Re: [PATCH] v2 Re: Bug: Status/Summary of slashdot leap-second crash on new years 2008-2009

Ben Hutchings wrote:
> On Thu, 2009-06-18 at 16:34 -0600, Chris Friesen wrote:

> Having investigated further, I believe it has been safe since this
> change made in 2.6.27 (which cleverly preempted the new year):
>
> commit b845b517b5e3706a3729f6ea83b88ab85f0725b0
> Author: Peter Zijlstra <[email protected]>
> Date: Fri Aug 8 21:47:09 2008 +0200
>
> printk: robustify printk
>
> Avoid deadlocks against rq->lock and xtime_lock by deferring the klogd
> wakeup by polling from the timer tick.
>
> Signed-off-by: Peter Zijlstra <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>
>
>> In addition, is it allowed for older kernels also or is Chris Adams'
>> patch something that should get picked up for the 2.6.27 stable series?
>
> Anything older than 2.6.27 appears to need a change along the lines of
> the above-mentioned commit or Chris's patch. Note that this was not the
> only case where printk() could be called under xtime_lock. For example,
> in arch/alpha/kernel/time.c timer_interrupt() calls set_rtc_mmss() which
> can call printk().


It appears that the patch in question went into mainline in 2.6.28-rc1
after being developed on the -tip tree. So it doesn't appear to be
present in the mainline 2.6.27 kernel.

Chris