2018-03-20 08:44:54

by Liu, Changcheng

Subject: [PATCH] x86/ioapic: don't use unstable TSC to detect timer IRQ

In rare cases, the TSC is very unstable or can't sync with the
real-time hardware clock. When "tsc=unstable" is set on the
command line, the system should use delay_without_tsc() to detect
the timer IRQ. Otherwise the system can panic, as shown in the log below:

[ 0.000000] Command line: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
reboot_panic=p,w tsc=unstable gpt loglevel=8 xxxxxxxxxxxxxx
vga=current nomodeset console=ttyS0,115200n8 xxxxxxxxxxxxxx
[ 0.048000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.049000] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
[ 0.049000] ...trying to set up timer (IRQ0) through the 8259A ...
[ 0.049000] ..... (found apic 0 pin 2) ...
[ 0.052000] ....... failed.
[ 0.052000] ...trying to set up timer as Virtual Wire IRQ...
[ 0.052000] ..... failed.
[ 0.052000] ...trying to set up timer as ExtINT IRQ...
[ 0.052000] ..... failed :(.
[ 0.052000] Kernel panic - not syncing: IO-APIC + timer
doesn't work! Boot with apic=debug and send a report. Then try
booting with the 'noapic' option.
[ 0.052000]
[ 0.052000] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.15.0-ga804a55 #5
[ 0.052000] Hardware name: Intel Corporation Tiger Lake
Client Platform/TigerLake U LPDDR4 UDIMM, BIOS
TGLSFWR1.R00.1063.A00.1802071025 02/07/2018
[ 0.052000] Call Trace:
[ 0.052000] dump_stack+0x68/0x9d

Signed-off-by: Liu Changcheng <[email protected]>

diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
index cf5d53c..dcfc5b9 100644
--- a/arch/x86/include/asm/tsc.h
+++ b/arch/x86/include/asm/tsc.h
@@ -17,6 +17,7 @@ typedef unsigned long long cycles_t;

extern unsigned int cpu_khz;
extern unsigned int tsc_khz;
+extern int tsc_unstable;

extern void disable_TSC(void);

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 7c55387..0809ec6 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -1643,7 +1643,7 @@ static int __init timer_irq_works(void)
 	local_save_flags(flags);
 	local_irq_enable();
 
-	if (boot_cpu_has(X86_FEATURE_TSC))
+	if (boot_cpu_has(X86_FEATURE_TSC) && !tsc_unstable)
 		delay_with_tsc();
 	else
 		delay_without_tsc();
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index fb43027..27b1bae 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -36,7 +36,8 @@ EXPORT_SYMBOL(tsc_khz);
/*
* TSC can be unstable due to cpufreq or due to unsynced TSCs
*/
-static int __read_mostly tsc_unstable;
+int __read_mostly tsc_unstable;
+EXPORT_SYMBOL(tsc_unstable);

/* native_sched_clock() is called before tsc_init(), so
we must start with the TSC soft disabled to prevent
--
2.7.4



2018-03-20 08:50:56

by Peter Zijlstra

Subject: Re: [PATCH] x86/ioapic: don't use unstable TSC to detect timer IRQ

On Tue, Mar 20, 2018 at 04:42:55PM +0800, Liu, Changcheng wrote:
> In rare cases, the TSC is very unstable or can't sync with the
> real-time hardware clock.

However did you manage that? Please provide _FAR_ more details.

> diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
> index cf5d53c..dcfc5b9 100644
> --- a/arch/x86/include/asm/tsc.h
> +++ b/arch/x86/include/asm/tsc.h
> @@ -17,6 +17,7 @@ typedef unsigned long long cycles_t;
>
> extern unsigned int cpu_khz;
> extern unsigned int tsc_khz;
> +extern int tsc_unstable;
>
> extern void disable_TSC(void);
>
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 7c55387..0809ec6 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -1643,7 +1643,7 @@ static int __init timer_irq_works(void)
> local_save_flags(flags);
> local_irq_enable();
>
> - if (boot_cpu_has(X86_FEATURE_TSC))
> + if (boot_cpu_has(X86_FEATURE_TSC) && !tsc_unstable)
> delay_with_tsc();
> else
> delay_without_tsc();
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index fb43027..27b1bae 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -36,7 +36,8 @@ EXPORT_SYMBOL(tsc_khz);
> /*
> * TSC can be unstable due to cpufreq or due to unsynced TSCs
> */
> -static int __read_mostly tsc_unstable;
> +int __read_mostly tsc_unstable;
> +EXPORT_SYMBOL(tsc_unstable);
>
> /* native_sched_clock() is called before tsc_init(), so
> we must start with the TSC soft disabled to prevent

No, absolutely not. Even when the TSC is normally deemed unstable, which
typically means it is not sync'ed between cores, it is still perfectly
suitable to be used for delay loops.
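
To illustrate the point above, here is a minimal userspace sketch (not kernel code) of why a per-core offset does not disturb a delay loop: the loop only looks at the difference between two reads taken on the same core, so the offset cancels. The names model_core, model_rdtsc() and model_delay() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: each core sees the shared cycle count plus its own offset. */
struct model_core {
	uint64_t offset;	/* per-core skew; differs between unsynced cores */
};

/* Stand-in for reading the TSC on one core at a given point in "global" time. */
static uint64_t model_rdtsc(const struct model_core *core, uint64_t global_cycles)
{
	return global_cycles + core->offset;	/* unsigned wrap is well defined */
}

/*
 * Busy-wait until `budget` cycles have elapsed on `core`, starting at
 * global time `start_global`. Returns how many global cycles really passed.
 */
static uint64_t model_delay(const struct model_core *core,
			    uint64_t start_global, uint64_t budget)
{
	uint64_t start, now_global = start_global;

	assert(core != NULL);
	start = model_rdtsc(core, start_global);

	/* The offset appears in both reads, so it cancels in the subtraction. */
	while (model_rdtsc(core, now_global) - start < budget)
		now_global++;

	return now_global - start_global;
}
```

In this model, comparing model_rdtsc() values taken on two different cores would be meaningless, which is the kind of problem "unstable" usually describes; a delta measured on a single core is unaffected.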

2018-03-20 09:02:00

by Liu, Changcheng

Subject: Re: [PATCH] x86/ioapic: don't use unstable TSC to detect timer IRQ

On 09:49 Tue 20 Mar, Peter Zijlstra wrote:
> On Tue, Mar 20, 2018 at 04:42:55PM +0800, Liu, Changcheng wrote:
> > In rare cases, the TSC is very unstable or can't sync with the
> > real-time hardware clock.
>
> However did you manage that? Please provide _FAR_ more details.
[Changcheng] The TSC is simulated, while the HPET is implemented in
hardware, so the TSC can't sync with the HPET. When running Linux, the
TSC advances too fast, and the HPET can't deliver the periodic timer
interrupt, which is used to update jiffies, in time.
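
The failure described above can be modelled with simple arithmetic. The sketch below is a userspace model, not the kernel's code: it assumes the TSC-based wait in timer_irq_works() spins for a fixed cycle budget (taken here as 40000000000/HZ cycles) or until more than four ticks have arrived, and that the check afterwards requires more than four ticks. Those constants, MODEL_HZ, and the function names are assumptions made for illustration and should be checked against the tree in question.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_HZ		1000ULL				/* assumed tick rate */
#define MODEL_CYCLE_BUDGET	(40000000000ULL / MODEL_HZ)	/* assumed spin budget */
#define MODEL_MIN_TICKS		4ULL				/* check wants more than this */

/*
 * Ticks the HPET-driven timer delivers while the delay loop spins.
 * tsc_hz is how many TSC counts elapse per second of HPET time.
 */
static uint64_t model_ticks_during_delay(uint64_t tsc_hz)
{
	uint64_t ticks;

	assert(tsc_hz > 0);

	/* wall time = budget / tsc_hz seconds; ticks = wall time * HZ */
	ticks = MODEL_CYCLE_BUDGET * MODEL_HZ / tsc_hz;

	/* The loop also stops once more than MODEL_MIN_TICKS ticks have arrived. */
	if (ticks > MODEL_MIN_TICKS + 1)
		ticks = MODEL_MIN_TICKS + 1;
	return ticks;
}

/* Returns 1 if the modelled check would see enough ticks, 0 otherwise. */
static int model_timer_irq_works(uint64_t tsc_hz)
{
	return model_ticks_during_delay(tsc_hz) > MODEL_MIN_TICKS;
}
```

In this model, a TSC counting at 4 GHz relative to the HPET leaves room for the required ticks, while one counting ten times faster exhausts the budget after a single tick, so the check fails even though the timer interrupt itself is wired correctly.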

>
> > diff --git a/arch/x86/include/asm/tsc.h b/arch/x86/include/asm/tsc.h
> > index cf5d53c..dcfc5b9 100644
> > --- a/arch/x86/include/asm/tsc.h
> > +++ b/arch/x86/include/asm/tsc.h
> > @@ -17,6 +17,7 @@ typedef unsigned long long cycles_t;
> >
> > extern unsigned int cpu_khz;
> > extern unsigned int tsc_khz;
> > +extern int tsc_unstable;
> >
> > extern void disable_TSC(void);
> >
> > diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> > index 7c55387..0809ec6 100644
> > --- a/arch/x86/kernel/apic/io_apic.c
> > +++ b/arch/x86/kernel/apic/io_apic.c
> > @@ -1643,7 +1643,7 @@ static int __init timer_irq_works(void)
> > local_save_flags(flags);
> > local_irq_enable();
> >
> > - if (boot_cpu_has(X86_FEATURE_TSC))
> > + if (boot_cpu_has(X86_FEATURE_TSC) && !tsc_unstable)
> > delay_with_tsc();
> > else
> > delay_without_tsc();
> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index fb43027..27b1bae 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -36,7 +36,8 @@ EXPORT_SYMBOL(tsc_khz);
> > /*
> > * TSC can be unstable due to cpufreq or due to unsynced TSCs
> > */
> > -static int __read_mostly tsc_unstable;
> > +int __read_mostly tsc_unstable;
> > +EXPORT_SYMBOL(tsc_unstable);
> >
> > /* native_sched_clock() is called before tsc_init(), so
> > we must start with the TSC soft disabled to prevent
>
> No, absolutely not. Even when the TSC is normally deemed unstable, which
> typically means it is not sync'ed between cores, it is still perfectly
> suitable to be used for delay loops.

2018-03-20 09:02:16

by Thomas Gleixner

Subject: Re: [PATCH] x86/ioapic: don't use unstable TSC to detect timer IRQ

On Tue, 20 Mar 2018, Liu, Changcheng wrote:

> In rare cases, the TSC is very unstable or can't sync with the
> real-time hardware clock.

What does that mean?

> When "tsc=unstable" is set on the command line, the system should use
> delay_without_tsc() to detect the timer IRQ. Otherwise the system can
> panic, as shown in the log below:

tsc=unstable has nothing to do with TSC being usable for delay loops unless
your TSC is completely broken. Please explain in more detail in which way
this TSC is defective in the hardware. Aside from that, is this production
hardware or some experimental silicon?

> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index fb43027..27b1bae 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -36,7 +36,8 @@ EXPORT_SYMBOL(tsc_khz);
> /*
> * TSC can be unstable due to cpufreq or due to unsynced TSCs
> */
> -static int __read_mostly tsc_unstable;
> +int __read_mostly tsc_unstable;
> +EXPORT_SYMBOL(tsc_unstable);

Even if we decided to do that, there is no need to export that symbol ever.

Thanks,

tglx

2018-03-20 09:05:20

by Thomas Gleixner

Subject: Re: [PATCH] x86/ioapic: don't use unstable TSC to detect timer IRQ

On Tue, 20 Mar 2018, Liu, Changcheng wrote:

> On 09:49 Tue 20 Mar, Peter Zijlstra wrote:
> > On Tue, Mar 20, 2018 at 04:42:55PM +0800, Liu, Changcheng wrote:
> > > In rare cases, the TSC is very unstable or can't sync with the
> > > real-time hardware clock.
> >
> > However did you manage that? Please provide _FAR_ more details.
> [Changcheng] The TSC is simulated, while the HPET is implemented in
> hardware, so the TSC can't sync with the HPET. When running Linux, the
> TSC advances too fast, and the HPET can't deliver the periodic timer
> interrupt, which is used to update jiffies, in time.

Why on earth is that system claiming it has a TSC at all? That's just
broken.

Thanks,

tglx

2018-03-20 09:11:15

by Peter Zijlstra

Subject: Re: [PATCH] x86/ioapic: don't use unstable TSC to detect timer IRQ

On Tue, Mar 20, 2018 at 04:58:35PM +0800, Liu, Changcheng wrote:
> On 09:49 Tue 20 Mar, Peter Zijlstra wrote:
> > On Tue, Mar 20, 2018 at 04:42:55PM +0800, Liu, Changcheng wrote:
> > > In rare cases, the TSC is very unstable or can't sync with the
> > > real-time hardware clock.
> >
> > However did you manage that? Please provide _FAR_ more details.

> [Changcheng] The TSC is simulated, while the HPET is implemented in
> hardware, so the TSC can't sync with the HPET. When running Linux, the
> TSC advances too fast, and the HPET can't deliver the periodic timer
> interrupt, which is used to update jiffies, in time.

How is that not utterly broken, and how is that even allowed behaviour
as per the SDM?

If the TSC is so utterly broken as to not function according to spec,
you should not advertise the TSC, at which point you'll find that it is
_very_ hard to run modern linux on x86 without TSC.