2023-05-22 03:55:18

by Feng Tang

[permalink] [raw]
Subject: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

Commit a7ec817d5542 ("x86/tsc: Add option to force frequency
recalibration with HW timer") was added to handle cases that the
firmware has bug and provides a wrong TSC frequency number, and it
is optional given that this kind of firmware issue rarely happens
(Paul reported once [1]).

But Rui reported that some Sapphire Rapids platform met this issue
again recently, and as firmware is also a kind of 'software' which
can't be bug free, make the recalibration default on. When the
values from firmware and HW timer's calibration have big gap,
raise a warning and let vendor to check which side is broken.

One downside is, many VMs also has X86_FEATURE_TSC_KNOWN_FREQ set,
and they will also do this recalibration.

[1]. https://lore.kernel.org/lkml/20221117230910.GI4001@paulmck-ThinkPad-P17-Gen-1/
Signed-off-by: Feng Tang <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 4 ----
arch/x86/kernel/tsc.c | 14 ++++----------
2 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 9e5bab29685f..a826ab3b5dfb 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6451,10 +6451,6 @@
in situations with strict latency requirements (where
interruptions from clocksource watchdog are not
acceptable).
- [x86] recalibrate: force recalibration against a HW timer
- (HPET or PM timer) on systems whose TSC frequency was
- obtained from HW or FW using either an MSR or CPUID(0x15).
- Warn if the difference is more than 500 ppm.
[x86] watchdog: Use TSC as the watchdog clocksource with
which to check other HW timers (HPET or PM timer), but
only on systems where TSC has been deemed trustworthy.
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 344698852146..b77c5b1ad304 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -48,8 +48,6 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);

int tsc_clocksource_reliable;

-static int __read_mostly tsc_force_recalibrate;
-
static u32 art_to_tsc_numerator;
static u32 art_to_tsc_denominator;
static u64 art_to_tsc_offset;
@@ -310,8 +308,6 @@ static int __init tsc_setup(char *str)
__func__);
tsc_as_watchdog = 0;
}
- if (!strcmp(str, "recalibrate"))
- tsc_force_recalibrate = 1;
if (!strcmp(str, "watchdog")) {
if (no_tsc_watchdog)
pr_alert("%s: tsc=watchdog overridden by earlier tsc=nowatchdog\n",
@@ -1395,7 +1391,6 @@ static void tsc_refine_calibration_work(struct work_struct *work)
else
freq = calc_pmtimer_ref(delta, ref_start, ref_stop);

- /* Will hit this only if tsc_force_recalibrate has been set */
if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {

/* Warn if the deviation exceeds 500 ppm */
@@ -1456,17 +1451,16 @@ static int __init init_tsc_clocksource(void)
clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;

/*
- * When TSC frequency is known (retrieved via MSR or CPUID), we skip
- * the refined calibration and directly register it as a clocksource.
+ * When TSC frequency is known (retrieved via MSR or CPUID), we
+ * directly register it as a clocksource. As the firmware could
+ * be wrong (very unlikely) about the values, the recalibration
+ * by hardware timer is kept just as a sanity check.
*/
if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {
if (boot_cpu_has(X86_FEATURE_ART))
art_related_clocksource = &clocksource_tsc;
clocksource_register_khz(&clocksource_tsc, tsc_khz);
clocksource_unregister(&clocksource_tsc_early);
-
- if (!tsc_force_recalibrate)
- return 0;
}

schedule_delayed_work(&tsc_irqwork, 0);
--
2.34.1



2023-05-22 08:16:41

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Mon, May 22 2023 at 11:30, Feng Tang wrote:
> Commit a7ec817d5542 ("x86/tsc: Add option to force frequency
> recalibration with HW timer") was added to handle cases that the
> firmware has bug and provides a wrong TSC frequency number, and it
> is optional given that this kind of firmware issue rarely happens
> (Paul reported once [1]).
>
> But Rui reported that some Sapphire Rapids platform met this issue
> again recently, and as firmware is also a kind of 'software' which
> can't be bug free, make the recalibration default on. When the
> values from firmware and HW timer's calibration have big gap,
> raise a warning and let vendor to check which side is broken.

Sure firmware can have bugs, but if firmware validation does not even
catch such a trivially to detect bug, then their validation is nothing
else than rubber stamping. Seriously.

Are any of these affected platforms shipping already or is this just
Intel internal muck?

> One downside is, many VMs also has X86_FEATURE_TSC_KNOWN_FREQ set,
> and they will also do this recalibration.

It's also pointless for those SoCs which lack legacy hardware.

So why do you force this on everyone?

Thanks,

tglx

2023-05-22 09:26:11

by Feng Tang

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

Hi Thomas,

Thanks for the review!

On Mon, May 22, 2023 at 10:14:08AM +0200, Thomas Gleixner wrote:
> On Mon, May 22 2023 at 11:30, Feng Tang wrote:
> > Commit a7ec817d5542 ("x86/tsc: Add option to force frequency
> > recalibration with HW timer") was added to handle cases that the
> > firmware has bug and provides a wrong TSC frequency number, and it
> > is optional given that this kind of firmware issue rarely happens
> > (Paul reported once [1]).
> >
> > But Rui reported that some Sapphire Rapids platform met this issue
> > again recently, and as firmware is also a kind of 'software' which
> > can't be bug free, make the recalibration default on. When the
> > values from firmware and HW timer's calibration have big gap,
> > raise a warning and let vendor to check which side is broken.
>
> Sure firmware can have bugs, but if firmware validation does not even
> catch such a trivially to detect bug, then their validation is nothing
> else than rubber stamping. Seriously.

Yes, agree.

> Are any of these affected platforms shipping already or is this just
> Intel internal muck?

Paul and Rui can provide more info. AFAIK, those problems were raised
by external customers, so the platform were already shipped from
Intel. But I'm not sure they are commercial versions or early
engineering drops.

>
> > One downside is, many VMs also has X86_FEATURE_TSC_KNOWN_FREQ set,
> > and they will also do this recalibration.
>
> It's also pointless for those SoCs which lack legacy hardware.

Yes.

> So why do you force this on everyone?

How about we keep the optional parameter, and enforce the check for
bare metal platforms which got TSC frequency info from CPUID(0x15),
like:

---
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 344698852146..ec1ff6aaf5a9 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -670,8 +670,10 @@ unsigned long native_calibrate_tsc(void)
* frequency and is the most accurate one so far we have. This
* is considered a known frequency.
*/
- if (crystal_khz != 0)
+ if (crystal_khz != 0) {
setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
+ tsc_force_recalibrate = 1;
+ }

/*
* Some Intel SoCs like Skylake and Kabylake don't report the crystal

Thanks,
Feng

> Thanks,
>
> tglx

2023-05-22 11:52:17

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Mon, May 22 2023 at 16:47, Feng Tang wrote:
> On Mon, May 22, 2023 at 10:14:08AM +0200, Thomas Gleixner wrote:
>> On Mon, May 22 2023 at 11:30, Feng Tang wrote:
>> Are any of these affected platforms shipping already or is this just
>> Intel internal muck?
>
> Paul and Rui can provide more info. AFAIK, those problems were raised
> by external customers, so the platform were already shipped from
> Intel. But I'm not sure they are commercial versions or early
> engineering drops.

So its at a company which knows how to update firmware, right?

>> So why do you force this on everyone?
>
> How about we keep the optional parameter, and enforce the check for
> bare metal platforms which got TSC frequency info from CPUID(0x15),
> like:

What prevents a hypervisor from providing this info in CPUID(0x15)?

> @@ -670,8 +670,10 @@ unsigned long native_calibrate_tsc(void)
> * frequency and is the most accurate one so far we have. This
> * is considered a known frequency.
> */
> - if (crystal_khz != 0)
> + if (crystal_khz != 0) {
> setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
> + tsc_force_recalibrate = 1;
> + }
>
> /*
> * Some Intel SoCs like Skylake and Kabylake don't report the crystal

and five lines further down:

/*
* For Atom SoCs TSC is the only reliable clocksource.
* Mark TSC reliable so no watchdog on it.
*/
if (boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT)
setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);

So its reliable and needs recalibration against hardware which does not
exist.

Thanks,

tglx

2023-05-22 13:50:55

by Feng Tang

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Mon, May 22, 2023 at 01:49:53PM +0200, Thomas Gleixner wrote:
> On Mon, May 22 2023 at 16:47, Feng Tang wrote:
> > On Mon, May 22, 2023 at 10:14:08AM +0200, Thomas Gleixner wrote:
> >> On Mon, May 22 2023 at 11:30, Feng Tang wrote:
> >> Are any of these affected platforms shipping already or is this just
> >> Intel internal muck?
> >
> > Paul and Rui can provide more info. AFAIK, those problems were raised
> > by external customers, so the platform were already shipped from
> > Intel. But I'm not sure they are commercial versions or early
> > engineering drops.
>
> So its at a company which knows how to update firmware, right?

Yes. And the recalibration may help to exposed the bug quickly.

> >> So why do you force this on everyone?
> >
> > How about we keep the optional parameter, and enforce the check for
> > bare metal platforms which got TSC frequency info from CPUID(0x15),
> > like:
>
> What prevents a hypervisor from providing this info in CPUID(0x15)?
>
> > @@ -670,8 +670,10 @@ unsigned long native_calibrate_tsc(void)
> > * frequency and is the most accurate one so far we have. This
> > * is considered a known frequency.
> > */
> > - if (crystal_khz != 0)
> > + if (crystal_khz != 0) {
> > setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
> > + tsc_force_recalibrate = 1;
> > + }
> >
> > /*
> > * Some Intel SoCs like Skylake and Kabylake don't report the crystal
>
> and five lines further down:
>
> /*
> * For Atom SoCs TSC is the only reliable clocksource.
> * Mark TSC reliable so no watchdog on it.
> */
> if (boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT)
> setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
>
> So its reliable and needs recalibration against hardware which does not
> exist.

I misunderstood it. When you said 'SOCs which lack legacy hardware',
I thought you were referring those old Merrifield/Medfield things,
which may have no HPET/ACPI PM_Timer but an APB timer, and mainly go
through MSR way (tsc_msr.c) for TSC frequency.

In this native_calibrate_tsc(), which touches the INTEL_FAM6_ATOM_GOLDMONT
and INTEL_FAM6_ATOM_GOLDMONT_D, I dug out one Apollo Lake and one
Denverton platform (which comply to those GOLDMNOT model), and they
both have 'hpet' and 'acpi_pm' clocksource registered.

Thanks,
Feng

>
> Thanks,
>
> tglx

2023-05-22 14:46:25

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Mon, May 22 2023 at 21:00, Feng Tang wrote:
> On Mon, May 22, 2023 at 01:49:53PM +0200, Thomas Gleixner wrote:
>> > Paul and Rui can provide more info. AFAIK, those problems were raised
>> > by external customers, so the platform were already shipped from
>> > Intel. But I'm not sure they are commercial versions or early
>> > engineering drops.
>>
>> So its at a company which knows how to update firmware, right?
>
> Yes. And the recalibration may help to exposed the bug quickly.

That should be exposed _before_ crappy firmware is shipped and
validation can use the command line parameter. I'm tired of this
constant source of embarrassing stupidity. It's not rocket science to
catch this before shipping.

And guess what. Making this easy to recover from is just not making the
situation any better because firmware people will even care less.

>> and five lines further down:
>>
>> /*
>> * For Atom SoCs TSC is the only reliable clocksource.
>> * Mark TSC reliable so no watchdog on it.
>> */
>> if (boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT)
>> setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
>>
>> So its reliable and needs recalibration against hardware which does not
>> exist.
>
> I misunderstood it. When you said 'SOCs which lack legacy hardware',
> I thought you were referring those old Merrifield/Medfield things,
> which may have no HPET/ACPI PM_Timer but an APB timer, and mainly go
> through MSR way (tsc_msr.c) for TSC frequency.
>
> In this native_calibrate_tsc(), which touches the INTEL_FAM6_ATOM_GOLDMONT
> and INTEL_FAM6_ATOM_GOLDMONT_D, I dug out one Apollo Lake and one
> Denverton platform (which comply to those GOLDMNOT model), and they
> both have 'hpet' and 'acpi_pm' clocksource registered.

So that comment is wrong and that commit log is from fantasy land?

http://lkml.kernel.org/r/[email protected]

Clearly the left hand is not knowing what the right hand is doing.

Thanks,

tglx

2023-05-22 15:44:04

by Feng Tang

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Mon, May 22, 2023 at 04:31:28PM +0200, Thomas Gleixner wrote:
> On Mon, May 22 2023 at 21:00, Feng Tang wrote:
> > On Mon, May 22, 2023 at 01:49:53PM +0200, Thomas Gleixner wrote:
> >> > Paul and Rui can provide more info. AFAIK, those problems were raised
> >> > by external customers, so the platform were already shipped from
> >> > Intel. But I'm not sure they are commercial versions or early
> >> > engineering drops.
> >>
> >> So its at a company which knows how to update firmware, right?
> >
> > Yes. And the recalibration may help to exposed the bug quickly.
>
> That should be exposed _before_ crappy firmware is shipped and
> validation can use the command line parameter. I'm tired of this
> constant source of embarrassing stupidity. It's not rocket science to
> catch this before shipping.
>
> And guess what. Making this easy to recover from is just not making the
> situation any better because firmware people will even care less.

I can't argue with that :)

> >> and five lines further down:
> >>
> >> /*
> >> * For Atom SoCs TSC is the only reliable clocksource.
> >> * Mark TSC reliable so no watchdog on it.
> >> */
> >> if (boot_cpu_data.x86_model == INTEL_FAM6_ATOM_GOLDMONT)
> >> setup_force_cpu_cap(X86_FEATURE_TSC_RELIABLE);
> >>
> >> So its reliable and needs recalibration against hardware which does not
> >> exist.
> >
> > I misunderstood it. When you said 'SOCs which lack legacy hardware',
> > I thought you were referring those old Merrifield/Medfield things,
> > which may have no HPET/ACPI PM_Timer but an APB timer, and mainly go
> > through MSR way (tsc_msr.c) for TSC frequency.
> >
> > In this native_calibrate_tsc(), which touches the INTEL_FAM6_ATOM_GOLDMONT
> > and INTEL_FAM6_ATOM_GOLDMONT_D, I dug out one Apollo Lake and one
> > Denverton platform (which comply to those GOLDMNOT model), and they
> > both have 'hpet' and 'acpi_pm' clocksource registered.
>
> So that comment is wrong and that commit log is from fantasy land?
>
> http://lkml.kernel.org/r/[email protected]
>
> Clearly the left hand is not knowing what the right hand is doing.

I started working on Atom (Moorestown) in about 2008, and moved to
other platforms before the time of the patch.

And I don't understand the commit log: "On Intel GOLDMONT Atom SoC
TSC is the only reliable clocksource. We mark TSC reliable to avoid
watchdog on it."

Clearly the Denventon I found today has both HPET and ACPI_PM timer:

[root@dnv0 ~]# grep . /sys/devices/system/clocksource/clocksource0/*
/sys/devices/system/clocksource/clocksource0/available_clocksource:tsc hpet acpi_pm
/sys/devices/system/clocksource/clocksource0/current_clocksource:tsc

The lscpu info is:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 39 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Vendor ID: GenuineIntel
BIOS Vendor ID: Intel(R) Corporation
Model name: Intel(R) Atom(TM) CPU C3850 @ 2.10GHz
BIOS Model name: Intel(R) Atom(TM) CPU C3850 @ 2.10GHz CPU @ 2.1GHz
BIOS CPU family: 43
CPU family: 6
Model: 95
Thread(s) per core: 1
Core(s) per socket: 12
Socket(s): 1
Stepping: 1

Maybe this cpu model (0x5F) has been used by some type of platforms
which has met the false alarm watchdog issue.

Thanks,
Feng

> Thanks,
>
> tglx

2023-05-22 16:59:35

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Mon, May 22, 2023 at 11:20:15PM +0800, Feng Tang wrote:

> And I don't understand the commit log: "On Intel GOLDMONT Atom SoC
> TSC is the only reliable clocksource. We mark TSC reliable to avoid
> watchdog on it."
>
> Clearly the Denventon I found today has both HPET and ACPI_PM timer:
>
> [root@dnv0 ~]# grep . /sys/devices/system/clocksource/clocksource0/*
> /sys/devices/system/clocksource/clocksource0/available_clocksource:tsc hpet acpi_pm
> /sys/devices/system/clocksource/clocksource0/current_clocksource:tsc
>
> The lscpu info is:
>
> Architecture: x86_64
> CPU op-mode(s): 32-bit, 64-bit
> Address sizes: 39 bits physical, 48 bits virtual
> Byte Order: Little Endian
> CPU(s): 12
> On-line CPU(s) list: 0-11
> Vendor ID: GenuineIntel
> BIOS Vendor ID: Intel(R) Corporation
> Model name: Intel(R) Atom(TM) CPU C3850 @ 2.10GHz
> BIOS Model name: Intel(R) Atom(TM) CPU C3850 @ 2.10GHz CPU @ 2.1GHz
> BIOS CPU family: 43
> CPU family: 6
> Model: 95
> Thread(s) per core: 1
> Core(s) per socket: 12
> Socket(s): 1
> Stepping: 1
>
> Maybe this cpu model (0x5F) has been used by some type of platforms
> which has met the false alarm watchdog issue.

It has them; but they are not *reliable*.




2023-05-23 01:42:32

by Feng Tang

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Mon, May 22, 2023 at 06:11:58PM +0200, Peter Zijlstra wrote:
> On Mon, May 22, 2023 at 11:20:15PM +0800, Feng Tang wrote:
>
> > And I don't understand the commit log: "On Intel GOLDMONT Atom SoC
> > TSC is the only reliable clocksource. We mark TSC reliable to avoid
> > watchdog on it."
> >
> > Clearly the Denventon I found today has both HPET and ACPI_PM timer:
> >
> > [root@dnv0 ~]# grep . /sys/devices/system/clocksource/clocksource0/*
> > /sys/devices/system/clocksource/clocksource0/available_clocksource:tsc hpet acpi_pm
> > /sys/devices/system/clocksource/clocksource0/current_clocksource:tsc
> >
> > The lscpu info is:
> >
> > Architecture: x86_64
> > CPU op-mode(s): 32-bit, 64-bit
> > Address sizes: 39 bits physical, 48 bits virtual
> > Byte Order: Little Endian
> > CPU(s): 12
> > On-line CPU(s) list: 0-11
> > Vendor ID: GenuineIntel
> > BIOS Vendor ID: Intel(R) Corporation
> > Model name: Intel(R) Atom(TM) CPU C3850 @ 2.10GHz
> > BIOS Model name: Intel(R) Atom(TM) CPU C3850 @ 2.10GHz CPU @ 2.1GHz
> > BIOS CPU family: 43
> > CPU family: 6
> > Model: 95
> > Thread(s) per core: 1
> > Core(s) per socket: 12
> > Socket(s): 1
> > Stepping: 1
> >
> > Maybe this cpu model (0x5F) has been used by some type of platforms
> > which has met the false alarm watchdog issue.
>
> It has them; but they are not *reliable*.

Yes, that's possible. I tried to CC the author Bin in case he can
provide more background or information for his statement, but his
email address is unreachable now.

Thanks,
Feng

2023-05-23 09:05:47

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Tue, May 23, 2023 at 09:18:23AM +0800, Feng Tang wrote:
> On Mon, May 22, 2023 at 06:11:58PM +0200, Peter Zijlstra wrote:
> > On Mon, May 22, 2023 at 11:20:15PM +0800, Feng Tang wrote:
> >
> > > And I don't understand the commit log: "On Intel GOLDMONT Atom SoC
> > > TSC is the only reliable clocksource. We mark TSC reliable to avoid
> > > watchdog on it."
> > >
> > > Clearly the Denventon I found today has both HPET and ACPI_PM timer:
> > >
> > > [root@dnv0 ~]# grep . /sys/devices/system/clocksource/clocksource0/*
> > > /sys/devices/system/clocksource/clocksource0/available_clocksource:tsc hpet acpi_pm
> > > /sys/devices/system/clocksource/clocksource0/current_clocksource:tsc
> > >
> > > The lscpu info is:
> > >
> > > Architecture: x86_64
> > > CPU op-mode(s): 32-bit, 64-bit
> > > Address sizes: 39 bits physical, 48 bits virtual
> > > Byte Order: Little Endian
> > > CPU(s): 12
> > > On-line CPU(s) list: 0-11
> > > Vendor ID: GenuineIntel
> > > BIOS Vendor ID: Intel(R) Corporation
> > > Model name: Intel(R) Atom(TM) CPU C3850 @ 2.10GHz
> > > BIOS Model name: Intel(R) Atom(TM) CPU C3850 @ 2.10GHz CPU @ 2.1GHz
> > > BIOS CPU family: 43
> > > CPU family: 6
> > > Model: 95
> > > Thread(s) per core: 1
> > > Core(s) per socket: 12
> > > Socket(s): 1
> > > Stepping: 1
> > >
> > > Maybe this cpu model (0x5F) has been used by some type of platforms
> > > which has met the false alarm watchdog issue.
> >
> > It has them; but they are not *reliable*.
>
> Yes, that's possible. I tried to CC the author Bin in case he can
> provide more background or information for his statement, but his
> email address is unreachable now.

IIRC HPET stops in C10 or something stupid like that, I forgot what the
problem with ACPI_PM is.

2023-05-23 14:55:39

by Feng Tang

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Tue, May 23, 2023 at 10:11:15AM +0200, Peter Zijlstra wrote:
> On Tue, May 23, 2023 at 09:18:23AM +0800, Feng Tang wrote:
> > On Mon, May 22, 2023 at 06:11:58PM +0200, Peter Zijlstra wrote:
> > > On Mon, May 22, 2023 at 11:20:15PM +0800, Feng Tang wrote:
> > >
> > > > And I don't understand the commit log: "On Intel GOLDMONT Atom SoC
> > > > TSC is the only reliable clocksource. We mark TSC reliable to avoid
> > > > watchdog on it."
> > > >
> > > > Clearly the Denventon I found today has both HPET and ACPI_PM timer:
> > > >
> > > > [root@dnv0 ~]# grep . /sys/devices/system/clocksource/clocksource0/*
> > > > /sys/devices/system/clocksource/clocksource0/available_clocksource:tsc hpet acpi_pm
> > > > /sys/devices/system/clocksource/clocksource0/current_clocksource:tsc
> > > >
> > > > The lscpu info is:
> > > >
> > > > Architecture: x86_64
> > > > CPU op-mode(s): 32-bit, 64-bit
> > > > Address sizes: 39 bits physical, 48 bits virtual
> > > > Byte Order: Little Endian
> > > > CPU(s): 12
> > > > On-line CPU(s) list: 0-11
> > > > Vendor ID: GenuineIntel
> > > > BIOS Vendor ID: Intel(R) Corporation
> > > > Model name: Intel(R) Atom(TM) CPU C3850 @ 2.10GHz
> > > > BIOS Model name: Intel(R) Atom(TM) CPU C3850 @ 2.10GHz CPU @ 2.1GHz
> > > > BIOS CPU family: 43
> > > > CPU family: 6
> > > > Model: 95
> > > > Thread(s) per core: 1
> > > > Core(s) per socket: 12
> > > > Socket(s): 1
> > > > Stepping: 1
> > > >
> > > > Maybe this cpu model (0x5F) has been used by some type of platforms
> > > > which has met the false alarm watchdog issue.
> > >
> > > It has them; but they are not *reliable*.
> >
> > Yes, that's possible. I tried to CC the author Bin in case he can
> > provide more background or information for his statement, but his
> > email address is unreachable now.
>
> IIRC HPET stops in C10 or something stupid like that, I forgot what the
> problem with ACPI_PM is.

Yes, that HPET issue is a common one for several generations of
client platforms like Baytrail, Coffee Lake etc.


2023-06-02 18:35:57

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Mon, May 22, 2023 at 10:14:08AM +0200, Thomas Gleixner wrote:
> On Mon, May 22 2023 at 11:30, Feng Tang wrote:
> > Commit a7ec817d5542 ("x86/tsc: Add option to force frequency
> > recalibration with HW timer") was added to handle cases that the
> > firmware has bug and provides a wrong TSC frequency number, and it
> > is optional given that this kind of firmware issue rarely happens
> > (Paul reported once [1]).
> >
> > But Rui reported that some Sapphire Rapids platform met this issue
> > again recently, and as firmware is also a kind of 'software' which
> > can't be bug free, make the recalibration default on. When the
> > values from firmware and HW timer's calibration have big gap,
> > raise a warning and let vendor to check which side is broken.
>
> Sure firmware can have bugs, but if firmware validation does not even
> catch such a trivially to detect bug, then their validation is nothing
> else than rubber stamping. Seriously.
>
> Are any of these affected platforms shipping already or is this just
> Intel internal muck?
>
> > One downside is, many VMs also has X86_FEATURE_TSC_KNOWN_FREQ set,
> > and they will also do this recalibration.
>
> It's also pointless for those SoCs which lack legacy hardware.
>
> So why do you force this on everyone?

Just for the record, this patch could be helpful in allowing victims
of TSC mis-synchronization to more easily provide a more complete bug
report to the firmware people. There is of course no point if there is
already a fix available.

But it is not all that hard to work around not having this patch upstream.
This can be hand-applied as needed, NTP drift rates can be pressed
into service for those of us having atomic clocks near all our servers,
or the firmware guys can be tasked with figuring it out.

So this patch would be nice to have, but we could live without it.

Thanx, Paul

2023-06-02 18:43:52

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On 6/2/23 11:29, Paul E. McKenney wrote:
>>> One downside is, many VMs also has X86_FEATURE_TSC_KNOWN_FREQ set,
>>> and they will also do this recalibration.
>> It's also pointless for those SoCs which lack legacy hardware.
>>
>> So why do you force this on everyone?
> Just for the record, this patch could be helpful in allowing victims
> of TSC mis-synchronization to more easily provide a more complete bug
> report to the firmware people. There is of course no point if there is
> already a fix available.
>
> But it is not all that hard to work around not having this patch upstream.
> This can be hand-applied as needed, NTP drift rates can be pressed
> into service for those of us having atomic clocks near all our servers,
> or the firmware guys can be tasked with figuring it out.
>
> So this patch would be nice to have, but we could live without it.

Is this the kind of thing we could relegate to a kernel unit test? Like
make the recalibration logic _available_, but don't have it affect the
rest of the system.

I love patching my kernel as much as the next guy. But, you know what I
*don't* love? Explaining how to patch kernels to other people. ;)

2023-06-02 19:33:44

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Fri, Jun 02, 2023 at 11:36:54AM -0700, Dave Hansen wrote:
> On 6/2/23 11:29, Paul E. McKenney wrote:
> >>> One downside is, many VMs also has X86_FEATURE_TSC_KNOWN_FREQ set,
> >>> and they will also do this recalibration.
> >> It's also pointless for those SoCs which lack legacy hardware.
> >>
> >> So why do you force this on everyone?
> > Just for the record, this patch could be helpful in allowing victims
> > of TSC mis-synchronization to more easily provide a more complete bug
> > report to the firmware people. There is of course no point if there is
> > already a fix available.
> >
> > But it is not all that hard to work around not having this patch upstream.
> > This can be hand-applied as needed, NTP drift rates can be pressed
> > into service for those of us having atomic clocks near all our servers,
> > or the firmware guys can be tasked with figuring it out.
> >
> > So this patch would be nice to have, but we could live without it.
>
> Is this the kind of thing we could relegate to a kernel unit test? Like
> make the recalibration logic _available_, but don't have it affect the
> rest of the system.
>
> I love patching my kernel as much as the next guy. But, you know what I
> *don't* love? Explaining how to patch kernels to other people. ;)

One could argue that we already have the TSC equivalent of a kernel unit
test with the tsc=recalibrate kernel boot parameter.

So, would it make sense to have something like tsc=recalibrate (already
present) in the guise of something like hpet=recalibrate and/or
pmtmr=recalibrate in order to allow people to opt into recalibrating
whatever timer is marked unstable?

Thanx, Paul

2023-06-05 01:32:37

by Feng Tang

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Fri, Jun 02, 2023 at 12:08:10PM -0700, Paul E. McKenney wrote:
> On Fri, Jun 02, 2023 at 11:36:54AM -0700, Dave Hansen wrote:
> > On 6/2/23 11:29, Paul E. McKenney wrote:
> > >>> One downside is, many VMs also has X86_FEATURE_TSC_KNOWN_FREQ set,
> > >>> and they will also do this recalibration.
> > >> It's also pointless for those SoCs which lack legacy hardware.
> > >>
> > >> So why do you force this on everyone?
> > > Just for the record, this patch could be helpful in allowing victims
> > > of TSC mis-synchronization to more easily provide a more complete bug
> > > report to the firmware people. There is of course no point if there is
> > > already a fix available.
> > >
> > > But it is not all that hard to work around not having this patch upstream.
> > > This can be hand-applied as needed, NTP drift rates can be pressed
> > > into service for those of us having atomic clocks near all our servers,
> > > or the firmware guys can be tasked with figuring it out.
> > >
> > > So this patch would be nice to have, but we could live without it.
> >
> > Is this the kind of thing we could relegate to a kernel unit test? Like
> > make the recalibration logic _available_, but don't have it affect the
> > rest of the system.
> >
> > I love patching my kernel as much as the next guy. But, you know what I
> > *don't* love? Explaining how to patch kernels to other people. ;)
>
> One could argue that we already have the TSC equivalent of a kernel unit
> test with the tsc=recalibrate kernel boot parameter.
>
> So, would it make sense to have something like tsc=recalibrate (already
> present) in the guise of something like hpet=recalibrate and/or
> pmtmr=recalibrate in order to allow people to opt into recalibrating
> whatever timer is marked unstable?

This kind of hint parsing should be in tsc.c, so some name like
'tsc_recal=hpet/pmtmr' ?

As hpet and pmtimer are the only 2 choices and hpet is the default
one, if people want to use pmtimer, they can use combined parameter
"tsc=recalibrate nohpet".

Also, thanks for sharing your thoughts form a victim's viewpoint
in the other mail! :)

Thanks,
Feng

> Thanx, Paul

2023-06-08 17:19:46

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH RFC] x86/tsc: Make recalibration default on for TSC_KNOWN_FREQ cases

On Mon, Jun 05, 2023 at 09:04:45AM +0800, Feng Tang wrote:
> On Fri, Jun 02, 2023 at 12:08:10PM -0700, Paul E. McKenney wrote:
> > On Fri, Jun 02, 2023 at 11:36:54AM -0700, Dave Hansen wrote:
> > > On 6/2/23 11:29, Paul E. McKenney wrote:
> > > >>> One downside is, many VMs also has X86_FEATURE_TSC_KNOWN_FREQ set,
> > > >>> and they will also do this recalibration.
> > > >> It's also pointless for those SoCs which lack legacy hardware.
> > > >>
> > > >> So why do you force this on everyone?
> > > > Just for the record, this patch could be helpful in allowing victims
> > > > of TSC mis-synchronization to more easily provide a more complete bug
> > > > report to the firmware people. There is of course no point if there is
> > > > already a fix available.
> > > >
> > > > But it is not all that hard to work around not having this patch upstream.
> > > > This can be hand-applied as needed, NTP drift rates can be pressed
> > > > into service for those of us having atomic clocks near all our servers,
> > > > or the firmware guys can be tasked with figuring it out.
> > > >
> > > > So this patch would be nice to have, but we could live without it.
> > >
> > > Is this the kind of thing we could relegate to a kernel unit test? Like
> > > make the recalibration logic _available_, but don't have it affect the
> > > rest of the system.
> > >
> > > I love patching my kernel as much as the next guy. But, you know what I
> > > *don't* love? Explaining how to patch kernels to other people. ;)
> >
> > One could argue that we already have the TSC equivalent of a kernel unit
> > test with the tsc=recalibrate kernel boot parameter.
> >
> > So, would it make sense to have something like tsc=recalibrate (already
> > present) in the guise of something like hpet=recalibrate and/or
> > pmtmr=recalibrate in order to allow people to opt into recalibrating
> > whatever timer is marked unstable?
>
> This kind of hint parsing should be in tsc.c, so some name like
> 'tsc_recal=hpet/pmtmr' ?
>
> As hpet and pmtimer are the only 2 choices and hpet is the default
> one, if people want to use pmtimer, they can use combined parameter
> "tsc=recalibrate nohpet".

Or something like "tsc=recalibrateother"?

I am not all that worried about what the boot parameters look like,
as long as we have a way of telling the kernel to recalibrate whatever
clocksource just got marked unstable. ;-)

> Also, thanks for sharing your thoughts form a victim's viewpoint
> in the other mail! :)

Glad it helped! ;-)

Thanx, Paul