2007-08-11 19:09:20

by Willy Tarreau

Subject: [2.6.20.16 review 27/28] NTP: remove clock_was_set() call to prevent deadlock

The clock_was_set() call in second_overflow(), which happens only when
leap seconds are inserted / deleted, is wrong in two respects:

1. it results in a call to on_each_cpu() with interrupts disabled
2. it is a potential deadlock source vs. call_lock in smp_call_function()

The only possible side effect of the removal is that an absolute
CLOCK_REALTIME timer might fire 1 second too late, in the rare case
where a leap second is deleted and the timer expires in the affected
time frame. It will never fire too early.
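
For illustration only, and not part of the patch: the timers affected are
the kind that user space arms against an absolute wall-clock time. A
minimal sketch, assuming a POSIX system that provides clock_nanosleep();
the helper names deadline_after_ms() and sleep_until_realtime() are made
up for this example:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/*
 * Hypothetical helper: fill *deadline with an absolute CLOCK_REALTIME
 * time delay_ms milliseconds from now. Returns 0 on success, -1 on error.
 */
static int deadline_after_ms(long delay_ms, struct timespec *deadline)
{
	assert(delay_ms >= 0);

	if (clock_gettime(CLOCK_REALTIME, deadline) != 0)
		return -1;

	deadline->tv_sec += delay_ms / 1000;
	deadline->tv_nsec += (delay_ms % 1000) * 1000000L;

	/* Keep tv_nsec normalized to [0, NSEC_PER_SEC). */
	if (deadline->tv_nsec >= NSEC_PER_SEC) {
		deadline->tv_sec += 1;
		deadline->tv_nsec -= NSEC_PER_SEC;
	}
	return 0;
}

/*
 * Hypothetical helper: sleep until the given absolute wall-clock time.
 * TIMER_ABSTIME makes the expiry absolute rather than relative, so it
 * tracks CLOCK_REALTIME. Returns 0 or an error number.
 */
static int sleep_until_realtime(const struct timespec *deadline)
{
	int err;

	do {
		err = clock_nanosleep(CLOCK_REALTIME, TIMER_ABSTIME,
				      deadline, NULL);
	} while (err == EINTR);	/* restart if interrupted by a signal */

	return err;
}
```

A caller would compute a deadline with deadline_after_ms() and pass it to
sleep_until_realtime(). Relative sleeps and CLOCK_MONOTONIC timers are not
the absolute CLOCK_REALTIME case described above.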

This was probably observed by the reporter of a June 30th -> July 1st
hang: http://lkml.org/lkml/2007/7/3/

A similar problem was observed by Dave Jones, who provided a screenshot
with a lockdep backtrace, which made it possible to analyse the problem.

Signed-off-by: Thomas Gleixner <[email protected]>
Cc: john stultz <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Vincent Fortier <[email protected]>
Signed-off-by: Chris Wright <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
kernel/time/ntp.c | 2 --
1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 3afeaa3..64744bb 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -116,7 +116,6 @@ void second_overflow(void)
*/
time_interpolator_update(-NSEC_PER_SEC);
time_state = TIME_OOP;
- clock_was_set();
printk(KERN_NOTICE "Clock: inserting leap second "
"23:59:60 UTC\n");
}
@@ -131,7 +130,6 @@ void second_overflow(void)
*/
time_interpolator_update(NSEC_PER_SEC);
time_state = TIME_WAIT;
- clock_was_set();
printk(KERN_NOTICE "Clock: deleting leap second "
"23:59:59 UTC\n");
}
--
1.5.2.4

--


2007-08-12 11:21:39

by Jason Uhlenkott

Subject: Re: [2.6.20.16 review 27/28] NTP: remove clock_was_set() call to prevent deadlock

On Sat, Aug 11, 2007 at 21:48:19 +0200, Willy Tarreau wrote:
> The clock_was_set() call in second_overflow(), which happens only when
> leap seconds are inserted / deleted, is wrong in two respects:
>
> 1. it results in a call to on_each_cpu() with interrupts disabled
> 2. it is a potential deadlock source vs. call_lock in smp_call_function()

clock_was_set() is a no-op in 2.6.20, so this one looks unnecessary
(but harmless). Thankfully the "hang every Linux box on the planet
simultaneously" regression (okay, that's *slight* hyperbole) was
limited to 2.6.21.

2007-08-12 11:49:57

by Willy Tarreau

Subject: Re: [2.6.20.16 review 27/28] NTP: remove clock_was_set() call to prevent deadlock

On Sun, Aug 12, 2007 at 04:15:58AM -0700, Jason Uhlenkott wrote:
> On Sat, Aug 11, 2007 at 21:48:19 +0200, Willy Tarreau wrote:
> > The clock_was_set() call in second_overflow(), which happens only when
> > leap seconds are inserted / deleted, is wrong in two respects:
> >
> > 1. it results in a call to on_each_cpu() with interrupts disabled
> > 2. it is a potential deadlock source vs. call_lock in smp_call_function()
>
> clock_was_set() is a no-op in 2.6.20, so this one looks unnecessary
> (but harmless). Thankfully the "hang every Linux box on the planet
> simultaneously" regression (okay, that's *slight* hyperbole) was
> limited to 2.6.21.

OK, patch removed.

Thanks for your help,
Willy