We have been seeing "BUG: soft lockup detected on CPU#0!" messages
from testing runs of mainline for some time. This has only been
showing on a very small subset of the systems under test, as it
turns out the slower ones.
Resolving this issue is complex, not because the fix itself is
complex but because of the timer rework which is currently pending
in -mm. As a result this patch is against 2.6.16-rc1. So far
we have had no such errors from runs against -mm, but I am unsure
whether that system eliminates this issue, or mearly is lucky as
faster systems are currently with mainline.
John perhaps you could comment? Also, how experimental is the timer
code, is it likely to go into 2.6.16 or is it more experimental
than that? If so perhaps we need to try and slip a fix like this
underneath it.
Comments?
-apw
=== 8< ===
timer tsc ensure we allow for initial tsc and tsc sync
During early initialisation we select the timer source to use for
the high resolution time source. When selecting the TSC we don't
take into account the initial value of the TSC, this leads to a
jump in the clock at the next clock tick. We also fail to take into
account that the TSC synchronisation in an SMP system resets the TSC.
In both cases this will lead to the timer believing that 0-N TSC
ticks have passed since the last real timer tick. This will lead
to the clock jumping ahead by nearly the maximum time represented
by the lower 32bits of the TSC. For a 1GHz machine this is close to
4s, on slower boxes this can exceed 10s and trip the softlock tests.
In this example a 360Mhz system is booted with CONFIG_PRINTK_TIME
enabled. Not that in both cases 11s 'passes':
[...]
[17179569.184000] Detected 360.156 MHz processor.
[17179569.184000] Using tsc for high-res timesource
[17179569.184000] Console: colour VGA+ 80x25
[17179580.184000] BUG: soft lockup detected on CPU#0!
[...]
[17179581.844000] ...trying to set up timer as Virtual Wire IRQ... works.
[17179582.136000] checking TSC synchronization across 4 CPUs: passed.
[17179593.032000] BUG: soft lockup detected on CPU#0!
[...]
Signed-off-by: Andy Whitcroft <[email protected]>
---
arch/i386/kernel/smpboot.c | 4 ++++
arch/i386/kernel/timers/timer_tsc.c | 16 ++++++++++++++++
include/asm-i386/timer.h | 1 +
3 files changed, 21 insertions(+)
diff -upN reference/arch/i386/kernel/smpboot.c current/arch/i386/kernel/smpboot.c
--- reference/arch/i386/kernel/smpboot.c
+++ current/arch/i386/kernel/smpboot.c
@@ -52,6 +52,7 @@
#include <asm/tlbflush.h>
#include <asm/desc.h>
#include <asm/arch_hooks.h>
+#include <asm/timer.h>
#include <mach_apic.h>
#include <mach_wakecpu.h>
@@ -283,6 +284,9 @@ static void __init synchronize_tsc_bp (v
atomic_inc(&tsc_count_stop);
}
+ /* Ensure we know that the tsc based timer is aware of the change. */
+ tsc_changed();
+
sum = 0;
for (i = 0; i < NR_CPUS; i++) {
if (cpu_isset(i, cpu_callout_map)) {
diff -upN reference/arch/i386/kernel/timers/timer_tsc.c current/arch/i386/kernel/timers/timer_tsc.c
--- reference/arch/i386/kernel/timers/timer_tsc.c
+++ current/arch/i386/kernel/timers/timer_tsc.c
@@ -516,11 +516,15 @@ static int __init init_tsc(char* overrid
result++; /* rounding the result */
hpet_usec_quotient = result;
+
+ hpet_last = hpet_readl(HPET_COUNTER);
} else
#endif
{
tsc_quotient = calibrate_tsc();
}
+ /* Assume this is the last mark offset time */
+ rdtsc(last_tsc_low, last_tsc_high);
if (tsc_quotient) {
fast_gettimeoffset_quotient = tsc_quotient;
@@ -561,6 +565,18 @@ static int tsc_resume(void)
return 0;
}
+/*
+ * When SMP systems are booted we synchronise their TSC's, part of this
+ * process involves resetting them all to zero. At this point it is
+ * important that we rerecord the tsc value else we will jump the clock
+ * forward by 0-N tsc ticks. For a 1Ghz machine this is approximatly 4s
+ * for slower systems it can exceed the 10s softlock timer.
+ */
+void tsc_changed(void)
+{
+ tsc_resume();
+}
+
#ifndef CONFIG_X86_TSC
/* disable flag for tsc. Takes effect by clearing the TSC cpu flag
* in cpu/common.c */
diff -upN reference/include/asm-i386/timer.h current/include/asm-i386/timer.h
--- reference/include/asm-i386/timer.h
+++ current/include/asm-i386/timer.h
@@ -38,6 +38,7 @@ struct init_timer_opts {
extern struct timer_opts* __init select_timer(void);
extern void clock_fallback(void);
void setup_pit_timer(void);
+void tsc_changed(void);
/* Modifiers for buggy PIT handling */
On Fri, 2006-01-20 at 12:53 +0000, Andy Whitcroft wrote:
> We have been seeing "BUG: soft lockup detected on CPU#0!" messages
> from testing runs of mainline for some time. This has only been
> showing on a very small subset of the systems under test, as it
> turns out the slower ones.
>
> Resolving this issue is complex, not because the fix itself is
> complex but because of the timer rework which is currently pending
> in -mm. As a result this patch is against 2.6.16-rc1. So far
> we have had no such errors from runs against -mm, but I am unsure
> whether that system eliminates this issue, or mearly is lucky as
> faster systems are currently with mainline.
Hey Andy, Sorry for the slow reply.
The timekeeping rework is not going to go into 2.6.16 and is currently
out of -mm until I can resolve a few laptop issues.
> John perhaps you could comment? Also, how experimental is the timer
> code, is it likely to go into 2.6.16 or is it more experimental
> than that? If so perhaps we need to try and slip a fix like this
> underneath it.
I'd def try to push a fix in for the issue. I'll just merge my code
around the fix.
> timer tsc ensure we allow for initial tsc and tsc sync
>
> During early initialisation we select the timer source to use for
> the high resolution time source. When selecting the TSC we don't
> take into account the initial value of the TSC, this leads to a
> jump in the clock at the next clock tick. We also fail to take into
> account that the TSC synchronisation in an SMP system resets the TSC.
>
> In both cases this will lead to the timer believing that 0-N TSC
> ticks have passed since the last real timer tick. This will lead
> to the clock jumping ahead by nearly the maximum time represented
> by the lower 32bits of the TSC. For a 1GHz machine this is close to
> 4s, on slower boxes this can exceed 10s and trip the softlock tests.
This sounds very similar to bugme bug #5366
http://bugzilla.kernel.org/show_bug.cgi?id=5366
There's a test patch in there that maybe you could try?
thanks
-john
From: John Stultz <[email protected]>
Suppress lost tick detection until we are fully initialised.
This prevents any modifications to the high resolution timers
from causing non-linearities in the flow of time. For example on
an SMP system we resyncronise the TSC values for all processors.
This results in a TSC reset which will be seen as a huge apparent
tick loss. This can cause premature expiry of timers and in extreme
cases can cause the soft lockup detection to fire.
Acked-by: Andy Whitcroft <[email protected]>
diff --git a/arch/i386/kernel/timers/timer_tsc.c b/arch/i386/kernel/timers/timer_tsc.c
--- a/arch/i386/kernel/timers/timer_tsc.c
+++ b/arch/i386/kernel/timers/timer_tsc.c
@@ -45,6 +45,15 @@ static unsigned long last_tsc_high; /* m
static unsigned long long monotonic_base;
static seqlock_t monotonic_lock = SEQLOCK_UNLOCKED;
+/* Avoid compensating for lost ticks before TSCs are synched */
+static int detect_lost_ticks;
+static int __init start_lost_tick_compensation(void)
+{
+ detect_lost_ticks = 1;
+ return 0;
+}
+late_initcall(start_lost_tick_compensation);
+
/* convert from cycles(64bits) => nanoseconds (64bits)
* basic equation:
* ns = cycles / (freq / ns_per_sec)
@@ -196,7 +205,8 @@ static void mark_offset_tsc_hpet(void)
/* lost tick compensation */
offset = hpet_readl(HPET_T0_CMP) - hpet_tick;
- if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))) {
+ if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))
+ && detect_lost_ticks) {
int lost_ticks = (offset - hpet_last) / hpet_tick;
jiffies_64 += lost_ticks;
}
@@ -419,7 +429,7 @@ static void mark_offset_tsc(void)
delta += delay_at_last_interrupt;
lost = delta/(1000000/HZ);
delay = delta%(1000000/HZ);
- if (lost >= 2) {
+ if (lost >= 2 && detect_lost_ticks) {
jiffies_64 += lost-1;
/* sanity check to ensure we're not always losing ticks */
Andy Whitcroft <[email protected]> wrote:
>
> From: John Stultz <[email protected]>
>
> Suppress lost tick detection until we are fully initialised.
> This prevents any modifications to the high resolution timers
> from causing non-linearities in the flow of time. For example on
> an SMP system we resyncronise the TSC values for all processors.
> This results in a TSC reset which will be seen as a huge apparent
> tick loss. This can cause premature expiry of timers and in extreme
> cases can cause the soft lockup detection to fire.
>
> Acked-by: Andy Whitcroft <[email protected]>
>
> diff --git a/arch/i386/kernel/timers/timer_tsc.c b/arch/i386/kernel/timers/timer_tsc.c
> --- a/arch/i386/kernel/timers/timer_tsc.c
> +++ b/arch/i386/kernel/timers/timer_tsc.c
> @@ -45,6 +45,15 @@ static unsigned long last_tsc_high; /* m
> static unsigned long long monotonic_base;
> static seqlock_t monotonic_lock = SEQLOCK_UNLOCKED;
>
> +/* Avoid compensating for lost ticks before TSCs are synched */
> +static int detect_lost_ticks;
> +static int __init start_lost_tick_compensation(void)
> +{
> + detect_lost_ticks = 1;
> + return 0;
> +}
> +late_initcall(start_lost_tick_compensation);
> +
> /* convert from cycles(64bits) => nanoseconds (64bits)
> * basic equation:
> * ns = cycles / (freq / ns_per_sec)
> @@ -196,7 +205,8 @@ static void mark_offset_tsc_hpet(void)
>
> /* lost tick compensation */
> offset = hpet_readl(HPET_T0_CMP) - hpet_tick;
> - if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))) {
> + if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))
> + && detect_lost_ticks) {
Simple enough. John, so you feel that this is 2.6.16 material?
Note that the time changes in -mm will blow this change away, so I'd be
needing a fresh version of this patch against next-mm, please.
On Tue, 2006-01-31 at 15:21 -0800, Andrew Morton wrote:
> Andy Whitcroft <[email protected]> wrote:
> >
> > From: John Stultz <[email protected]>
> >
> > Suppress lost tick detection until we are fully initialised.
> > This prevents any modifications to the high resolution timers
> > from causing non-linearities in the flow of time. For example on
> > an SMP system we resyncronise the TSC values for all processors.
> > This results in a TSC reset which will be seen as a huge apparent
> > tick loss. This can cause premature expiry of timers and in extreme
> > cases can cause the soft lockup detection to fire.
> >
> > Acked-by: Andy Whitcroft <[email protected]>
> >
> > diff --git a/arch/i386/kernel/timers/timer_tsc.c b/arch/i386/kernel/timers/timer_tsc.c
> > --- a/arch/i386/kernel/timers/timer_tsc.c
> > +++ b/arch/i386/kernel/timers/timer_tsc.c
> > @@ -45,6 +45,15 @@ static unsigned long last_tsc_high; /* m
> > static unsigned long long monotonic_base;
> > static seqlock_t monotonic_lock = SEQLOCK_UNLOCKED;
> >
> > +/* Avoid compensating for lost ticks before TSCs are synched */
> > +static int detect_lost_ticks;
> > +static int __init start_lost_tick_compensation(void)
> > +{
> > + detect_lost_ticks = 1;
> > + return 0;
> > +}
> > +late_initcall(start_lost_tick_compensation);
> > +
> > /* convert from cycles(64bits) => nanoseconds (64bits)
> > * basic equation:
> > * ns = cycles / (freq / ns_per_sec)
> > @@ -196,7 +205,8 @@ static void mark_offset_tsc_hpet(void)
> >
> > /* lost tick compensation */
> > offset = hpet_readl(HPET_T0_CMP) - hpet_tick;
> > - if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))) {
> > + if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))
> > + && detect_lost_ticks) {
>
> Simple enough. John, so you feel that this is 2.6.16 material?
Yep. There's a signed off version somewhere in your inbox.
> Note that the time changes in -mm will blow this change away, so I'd be
> needing a fresh version of this patch against next-mm, please.
Uh, not sure I followed that. Do mean you'd want a new set of the
generic timefoday patches to apply ontop of this fix?
thanks
-john
john stultz <[email protected]> wrote:
>
> > > offset = hpet_readl(HPET_T0_CMP) - hpet_tick;
> > > - if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))) {
> > > + if (unlikely(((offset - hpet_last) > hpet_tick) && (hpet_last != 0))
> > > + && detect_lost_ticks) {
> >
> > Simple enough. John, so you feel that this is 2.6.16 material?
>
> Yep. There's a signed off version somewhere in your inbox.
<looks>
Oh yeah. Hate it when that happens.
> > Note that the time changes in -mm will blow this change away, so I'd be
> > needing a fresh version of this patch against next-mm, please.
>
> Uh, not sure I followed that. Do mean you'd want a new set of the
> generic timefoday patches to apply ontop of this fix?
argh, spare me from that.
No, I'd like a new version of this patch which applies on top of the
generic-tod patches please. Could do it myself, but you moved all the code
around and I can't find anything ;)