From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
    corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org,
    kernel-team@meta.com, neeraju@codeaurora.org, ak@linux.intel.com,
    feng.tang@intel.com, zhengjun.xing@intel.com, Ingo Molnar,
    Borislav Petkov, Dave Hansen, "H. Peter Anvin", x86@kernel.org,
    linux-doc@vger.kernel.org, "Paul E. McKenney"
Subject: [PATCH v2 clocksource 7/7] x86/tsc: Add option to force frequency recalibration with HW timer
Date: Tue, 24 Jan 2023 16:27:30 -0800
Message-Id: <20230125002730.1471349-7-paulmck@kernel.org>
In-Reply-To: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
References: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>

From: Feng Tang

The kernel assumes that the TSC frequency provided by the hardware /
firmware via MSRs or CPUID(0x15) is correct after applying a few basic
consistency checks. This disables the TSC recalibration against HPET or
PM timer.

As a result, there is no mechanism to validate that frequency in cases
where a firmware or hardware defect is suspected. In one such case, a
user measured the TSC frequency with an atomic clock and reported an
inaccuracy, which was later fixed in firmware.

Add a 'recalibrate' option to the 'tsc' kernel parameter to force
recalibration of the TSC frequency against the HPET or PM timer, and
warn if the deviation from the previous value is more than about
500 PPM. This provides a way to verify the data from hardware /
firmware.

There is no functional change to the existing workflow.

Recently there was a real-world case: "The 40ms/s divergence between
TSC and HPET was observed on hardware that is quite recent" [1]. On
that platform, the TSC frequency of 1896 MHz was obtained from
CPUID(0x15), while forced recalibration against both HPET and PMTIMER
yielded 1975 MHz. That value also matched the check done by 'chronyd',
indicating a BIOS or firmware problem.

[Thanks tglx for helping improve the commit log]
[ paulmck: Wordsmith Kconfig help text. ]

[1]. https://lore.kernel.org/lkml/20221117230910.GI4001@paulmck-ThinkPad-P17-Gen-1/

Signed-off-by: Feng Tang
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: "H. Peter Anvin"
Cc: Jonathan Corbet
Cc:
Cc:
Signed-off-by: Paul E. McKenney
---
 .../admin-guide/kernel-parameters.txt         |  4 +++
 arch/x86/kernel/tsc.c                         | 34 ++++++++++++++++---
 2 files changed, 34 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6cfa6e3996cf7..95f0d104c2322 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6369,6 +6369,10 @@
 			in situations with strict latency requirements (where
 			interruptions from clocksource watchdog are not
 			acceptable).
+			[x86] recalibrate: force recalibration against a HW timer
+			(HPET or PM timer) on systems whose TSC frequency was
+			obtained from HW or FW using either an MSR or CPUID(0x15).
+			Warn if the difference is more than 500 ppm.
 
 	tsc_early_khz=  [X86] Skip early TSC calibration and use the given
 			value instead. Useful when the early TSC frequency discovery
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index af3782fb6200c..a5371c6d4b64b 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -48,6 +48,8 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);
 
 int tsc_clocksource_reliable;
 
+static int __read_mostly tsc_force_recalibrate;
+
 static u32 art_to_tsc_numerator;
 static u32 art_to_tsc_denominator;
 static u64 art_to_tsc_offset;
@@ -303,6 +305,8 @@ static int __init tsc_setup(char *str)
 		mark_tsc_unstable("boot parameter");
 	if (!strcmp(str, "nowatchdog"))
 		no_tsc_watchdog = 1;
+	if (!strcmp(str, "recalibrate"))
+		tsc_force_recalibrate = 1;
 	return 1;
 }
 
@@ -1379,6 +1383,25 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	else
 		freq = calc_pmtimer_ref(delta, ref_start, ref_stop);
 
+	/* Will hit this only if tsc_force_recalibrate has been set */
+	if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {
+
+		/* Warn if the deviation exceeds 500 ppm */
+		if (abs(tsc_khz - freq) > (tsc_khz >> 11)) {
+			pr_warn("Warning: TSC freq calibrated by CPUID/MSR differs from what is calibrated by HW timer, please check with vendor!!\n");
+			pr_info("Previous calibrated TSC freq:\t %lu.%03lu MHz\n",
+				(unsigned long)tsc_khz / 1000,
+				(unsigned long)tsc_khz % 1000);
+		}
+
+		pr_info("TSC freq recalibrated by [%s]:\t %lu.%03lu MHz\n",
+			hpet ? "HPET" : "PM_TIMER",
+			(unsigned long)freq / 1000,
+			(unsigned long)freq % 1000);
+
+		return;
+	}
+
 	/* Make sure we're within 1% */
 	if (abs(tsc_khz - freq) > tsc_khz/100)
 		goto out;
@@ -1412,8 +1435,10 @@ static int __init init_tsc_clocksource(void)
 	if (!boot_cpu_has(X86_FEATURE_TSC) || !tsc_khz)
 		return 0;
 
-	if (tsc_unstable)
-		goto unreg;
+	if (tsc_unstable) {
+		clocksource_unregister(&clocksource_tsc_early);
+		return 0;
+	}
 
 	if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC_S3))
 		clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
@@ -1426,9 +1451,10 @@ static int __init init_tsc_clocksource(void)
 		if (boot_cpu_has(X86_FEATURE_ART))
 			art_related_clocksource = &clocksource_tsc;
 		clocksource_register_khz(&clocksource_tsc, tsc_khz);
-unreg:
 		clocksource_unregister(&clocksource_tsc_early);
-		return 0;
+
+		if (!tsc_force_recalibrate)
+			return 0;
 	}
 
 	schedule_delayed_work(&tsc_irqwork, 0);
-- 
2.31.1.189.g2e36527f23
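
Note on the threshold: the check in tsc_refine_calibration_work()
compares the deviation against tsc_khz >> 11, i.e. tsc_khz / 2048,
which is roughly 488 ppm and is what the commit log calls "about
500 PPM". The user-space sketch below only illustrates that arithmetic
and is not part of the patch. The helper names
tsc_deviation_exceeds_limit() and tsc_deviation_ppm() are hypothetical,
and the inputs are assumed to be frequencies in kHz, as tsc_khz and
freq are in the kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the patch's comparison:
 * returns true when |tsc_khz - freq_khz| exceeds tsc_khz / 2048
 * (about 488 ppm of the frequency reported by CPUID/MSR).
 */
static bool tsc_deviation_exceeds_limit(uint32_t tsc_khz, uint32_t freq_khz)
{
	uint32_t diff = tsc_khz > freq_khz ? tsc_khz - freq_khz
					   : freq_khz - tsc_khz;

	return diff > (tsc_khz >> 11);
}

/*
 * Hypothetical helper: deviation of freq_khz from tsc_khz in parts
 * per million, truncated toward zero. Returns 0 if tsc_khz is 0.
 */
static uint64_t tsc_deviation_ppm(uint32_t tsc_khz, uint32_t freq_khz)
{
	uint64_t diff = tsc_khz > freq_khz ? tsc_khz - freq_khz
					   : freq_khz - tsc_khz;

	if (!tsc_khz)
		return 0;

	return diff * 1000000ULL / tsc_khz;
}
```

With the figures from the commit log (1896 MHz from CPUID(0x15),
1975 MHz from the HW timer), tsc_deviation_ppm(1896000, 1975000) gives
41666 ppm, about 41.7 ms per second and in line with the reported
"40ms/s" divergence. The limit for that frequency is
1896000 >> 11 = 925 kHz, so the warning would fire.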