2024-04-08 10:06:25

by Lei Chen

[permalink] [raw]
Subject: [PATCH RESEND] x86/tsc: print some log if calibrated tsc freq deviates from original too much

In most cases, tsc_khz is refined by hpet on boot. But in a few
production-level nodes, the refinement fails because calibrated
freq diviates from origin tsc freq more than 1%. Printing some
logs will help get this info.

Signed-off-by: Lei Chen <[email protected]>
---
arch/x86/kernel/tsc.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 15f97c0abc9d..a68b16e72df1 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1435,8 +1435,15 @@ static void tsc_refine_calibration_work(struct work_struct *work)
}

/* Make sure we're within 1% */
- if (abs(tsc_khz - freq) > tsc_khz/100)
+ if (abs(tsc_khz - freq) > tsc_khz/100) {
+ pr_warn("Warning: TSC freq calibrated by [%s]: %lu.%03lu MHz deviates too much from original freq: %lu.%03lu MHz\n",
+ hpet ? "HPET" : "PM_TIMER",
+ (unsigned long)freq / 1000,
+ (unsigned long)freq % 1000,
+ (unsigned long)tsc_khz / 1000,
+ (unsigned long)tsc_khz % 1000);
goto out;
+ }

tsc_khz = freq;
pr_info("Refined TSC clocksource calibration: %lu.%03lu MHz\n",
--
2.44.0



2024-04-09 08:06:14

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH RESEND] x86/tsc: print some log if calibrated tsc freq deviates from original too much


* Lei Chen <[email protected]> wrote:

> In most cases, tsc_khz is refined by hpet on boot. But in a few
> production-level nodes, the refinement fails because calibrated
> freq diviates from origin tsc freq more than 1%. Printing some
> logs will help get this info.
>
> Signed-off-by: Lei Chen <[email protected]>
> ---
> arch/x86/kernel/tsc.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index 15f97c0abc9d..a68b16e72df1 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -1435,8 +1435,15 @@ static void tsc_refine_calibration_work(struct work_struct *work)
> }
>
> /* Make sure we're within 1% */
> - if (abs(tsc_khz - freq) > tsc_khz/100)
> + if (abs(tsc_khz - freq) > tsc_khz/100) {
> + pr_warn("Warning: TSC freq calibrated by [%s]: %lu.%03lu MHz deviates too much from original freq: %lu.%03lu MHz\n",

Yeah, so it wouldn't cost us anything to more precisely define 'too much':

s/deviates too much from
/deviates by more than 1% from

Right?

> + hpet ? "HPET" : "PM_TIMER",
> + (unsigned long)freq / 1000,
> + (unsigned long)freq % 1000,
> + (unsigned long)tsc_khz / 1000,
> + (unsigned long)tsc_khz % 1000);
> goto out;
> + }

The warning makes sense I suppose, if it's one per system and once per
bootup [right?], but I think pr_info() would be plenty enough priority for
this condition - especially as we didn't have the warning before and don't
know how frequent it is?

Thanks,

Ingo

2024-04-10 04:53:16

by Lei Chen

[permalink] [raw]
Subject: Re: [PATCH RESEND] x86/tsc: print some log if calibrated tsc freq deviates from original too much

On Tue, Apr 9, 2024 at 4:05 PM Ingo Molnar <[email protected]> wrote:
>
>
> * Lei Chen <[email protected]> wrote:
>
> > In most cases, tsc_khz is refined by hpet on boot. But in a few
> > production-level nodes, the refinement fails because calibrated
> > freq diviates from origin tsc freq more than 1%. Printing some
> > logs will help get this info.
> >
> > Signed-off-by: Lei Chen <[email protected]>
> > ---
> > arch/x86/kernel/tsc.c | 9 ++++++++-
> > 1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index 15f97c0abc9d..a68b16e72df1 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -1435,8 +1435,15 @@ static void tsc_refine_calibration_work(struct work_struct *work)
> > }
> >
> > /* Make sure we're within 1% */
> > - if (abs(tsc_khz - freq) > tsc_khz/100)
> > + if (abs(tsc_khz - freq) > tsc_khz/100) {
> > + pr_warn("Warning: TSC freq calibrated by [%s]: %lu.%03lu MHz deviates too much from original freq: %lu.%03lu MHz\n",
>
> Yeah, so it wouldn't cost us anything to more precisely define 'too much':
>
> s/deviates too much from
> /deviates by more than 1% from
>
> Right?

Thanks for your suggestion, I'll change it.


>
> > + hpet ? "HPET" : "PM_TIMER",
> > + (unsigned long)freq / 1000,
> > + (unsigned long)freq % 1000,
> > + (unsigned long)tsc_khz / 1000,
> > + (unsigned long)tsc_khz % 1000);
> > goto out;
> > + }
>
> The warning makes sense I suppose, if it's one per system and once per
> bootup [right?], but I think pr_info() would be plenty enough priority for
> this condition - especially as we didn't have the warning before and don't
> know how frequent it is?
>
I'm afraid tsc frequency deviates from HPET more than 1% is a system risk.
It means either HPET or tsc might be not stable. Users should be cautious of
the risk and turn to ntp for help. In our production nodes, it indeed
causes more
time skew. So maybe warning is a better way to expose the risk ?