2019-04-08 10:43:05

by You-Sheng Yang

[permalink] [raw]
Subject: [PATCH] x86/tsc: mark tsc reliable on CoffeeLake

From: You-Sheng Yang <[email protected]>

On Intel CoffeeLake it's observed tsc is always marked unstable
unexpectedly after entering idle state Package C10(PC10), and then clock
source is switched to hpet. This patch marks tsc as reliable when CPUID
matches CoffeeLake.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=203183
Signed-off-by: You-Sheng Yang <[email protected]>
---
arch/x86/kernel/tsc.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index aab0c82e0a0d..2abbadc9cff0 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1161,6 +1161,16 @@ static void __init check_system_tsc_reliable(void)
#endif
if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
tsc_clocksource_reliable = 1;
+
+ /*
+ * On Intel CoffeeLake, tsc may be marked unstable unexpectedly after
+ * entering PC10.
+ */
+ if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+ (boot_cpu_data.x86_model == INTEL_FAM6_KABYLAKE_MOBILE ||
+ boot_cpu_data.x86_model == INTEL_FAM6_KABYLAKE_DESKTOP) &&
+ boot_cpu_data.x86_stepping >= 0x0a)
+ tsc_clocksource_reliable = 1;
}

/*
--
2.20.1


2019-04-08 12:05:17

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] x86/tsc: mark tsc reliable on CoffeeLake

On Mon, 8 Apr 2019, You-Sheng Yang wrote:

> From: You-Sheng Yang <[email protected]>
>
> On Intel CoffeeLake it's observed tsc is always marked unstable
> unexpectedly after entering idle state Package C10(PC10), and then clock
> source is switched to hpet. This patch marks tsc as reliable when CPUID
> matches CoffeeLake.

This lacks a proper analysis:

1) Why is it marked unstable

2) Why is it correct to set that for coffeelake

> Link: https://bugzilla.kernel.org/show_bug.cgi?id=203183
> Signed-off-by: You-Sheng Yang <[email protected]>
> ---
> arch/x86/kernel/tsc.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index aab0c82e0a0d..2abbadc9cff0 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -1161,6 +1161,16 @@ static void __init check_system_tsc_reliable(void)
> #endif
> if (boot_cpu_has(X86_FEATURE_TSC_RELIABLE))
> tsc_clocksource_reliable = 1;
> +
> + /*
> + * On Intel CoffeeLake, tsc may be marked unstable unexpectedly after
> + * entering PC10.
> + */
> + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> + (boot_cpu_data.x86_model == INTEL_FAM6_KABYLAKE_MOBILE ||
> + boot_cpu_data.x86_model == INTEL_FAM6_KABYLAKE_DESKTOP) &&
> + boot_cpu_data.x86_stepping >= 0x0a)
> + tsc_clocksource_reliable = 1;

No. We are not starting that family/model/stepping game especially not
with random stepping cutoffs which are pulled out of thin air. That's
going to spiral out of control sooner than later.

There must be a better way to do that. Rafael?

Thanks,

tglx


2019-04-10 09:16:06

by You-Sheng Yang

[permalink] [raw]
Subject: Re: [PATCH] x86/tsc: mark tsc reliable on CoffeeLake

On 2019/4/8 8:03 PM, Thomas Gleixner wrote:
> On Mon, 8 Apr 2019, You-Sheng Yang wrote:
>
>> From: You-Sheng Yang <[email protected]>
>>
>> On Intel CoffeeLake it's observed tsc is always marked unstable
>> unexpectedly after entering idle state Package C10(PC10), and then clock
>> source is switched to hpet. This patch marks tsc as reliable when CPUID
>> matches CoffeeLake.
>
> This lacks a proper analysis:
>
> 1) Why is it marked unstable

Usually the differences between wd_nsec and cs_nsec in function
clocksource_watchdog in kernel/time/clocksource.c would be less than a
few thousand nanoseconds. However, when CPU is entering deeper idle
state, PC10, the hpet clocksource readings starts to give inaccurate
values for unknown reason, and the differences to cs_nsec varies from a
few hundred nanoseconds to several hundred millisecond, which is larger
than WATCHDOG_THRESHOLD (62.5ms) and finally results in tsc being marked
unreliable. No HPET overflow is found when this occurs.

> 2) Why is it correct to set that for coffeelake

So far this strange behaviour is only found on coffeelake. Besides this,
no much I can tell actually. This could be probably wrong, but may serve
as a start to bring up some more discussion/investigation to solve the
problem. I would be more than willing to help verifying further
appropriate fixes.

Thank you.

You-Sheng Yang

2019-04-12 03:30:23

by You-Sheng Yang

[permalink] [raw]
Subject: Re: [PATCH] x86/tsc: mark tsc reliable on CoffeeLake

On 2019/4/8 8:03 PM, Thomas Gleixner wrote:
> On Mon, 8 Apr 2019, You-Sheng Yang wrote:
>> + /*
>> + * On Intel CoffeeLake, tsc may be marked unstable unexpectedly after
>> + * entering PC10.
>> + */
>> + if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
>> + (boot_cpu_data.x86_model == INTEL_FAM6_KABYLAKE_MOBILE ||
>> + boot_cpu_data.x86_model == INTEL_FAM6_KABYLAKE_DESKTOP) &&
>> + boot_cpu_data.x86_stepping >= 0x0a)
>> + tsc_clocksource_reliable = 1;
>
> No. We are not starting that family/model/stepping game especially not
> with random stepping cutoffs which are pulled out of thin air. That's
> going to spiral out of control sooner than later.

What about we simply disable clocksource watchdog if this is an
invariant TSC?

> There must be a better way to do that. Rafael?
>
> Thanks,
>
> tglx

You-Sheng Yang


Attachments:
signature.asc (499.00 B)
OpenPGP digital signature