Date: Wed, 4 Jan 2023 06:32:27 -0800
From: "Paul E. McKenney"
To: Feng Tang
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	"H . Peter Anvin", Peter Zijlstra, Jonathan Corbet, x86@kernel.org,
	linux-kernel@vger.kernel.org, rui.zhang@intel.com, len.brown@intel.com,
	tim.c.chen@intel.com
Subject: Re: [PATCH v5] x86/tsc: Add option to force frequency recalibration with HW timer
Message-ID: <20230104143227.GC4028633@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <20230104081938.1014511-1-feng.tang@intel.com>
In-Reply-To: <20230104081938.1014511-1-feng.tang@intel.com>

On Wed, Jan 04, 2023 at 04:19:38PM +0800, Feng Tang wrote:
> The kernel assumes that the TSC frequency which is provided by the
> hardware / firmware via MSRs or CPUID(0x15) is correct after applying
> a few basic consistency checks. This disables the TSC recalibration
> against HPET or PM timer.
>
> As a result there is no mechanism to validate that frequency in cases
> where a firmware or hardware defect is suspected. There was a case
> where a user measured the TSC frequency with an atomic clock and
> reported an inaccuracy issue, which was later fixed in firmware.
>
> Add an option 'recalibrate' for the 'tsc' kernel parameter to force
> TSC frequency recalibration with HPET or PM timer, and warn if the
> deviation from the previous value is more than about 500 PPM, which
> provides a way to verify the data from hardware / firmware.
>
> There is no functional change to the existing workflow.
>
> Recently there was a real-world case: "The 40ms/s divergence between
> TSC and HPET was observed on hardware that is quite recent" [1]. On
> that platform the TSC frequency of 1896 MHz was obtained from
> CPUID(0x15), and forced recalibration with HPET and PM timer both
> produced a value of 1975 MHz, which also matched the check from the
> 'chronyd' software, indicating a BIOS or firmware problem.
>
> [Thanks tglx for helping to improve the commit log]
>
> [1]. https://lore.kernel.org/lkml/20221117230910.GI4001@paulmck-ThinkPad-P17-Gen-1/
> Signed-off-by: Feng Tang

Nice!!!

Tested-by: Paul E. McKenney

> ---
> Changelog:
>
>   since v4:
>     * add the real world case, where the patch helped to root
>       cause a BIOS/FW problem of inaccurate CPUID-0x15 info
>     * rebase against v6.2-rc1
>
>   since v3:
>     * add some real world case into commit log
>     * rebase against v6.0-rc1
>
>   since v2:
>     * revise the option description in kernel-parameters.txt
>     * rebase against v5.19-rc2
>
>   since v1:
>     * refine commit log to state clearly the problem and intention
>       of the patch by copying Thomas' words.
>
>  .../admin-guide/kernel-parameters.txt         |  4 +++
>  arch/x86/kernel/tsc.c                         | 34 ++++++++++++++++---
>  2 files changed, 34 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 6cfa6e3996cf..d9eb98e748d5 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -6369,6 +6369,10 @@
>  			in situations with strict latency requirements (where
>  			interruptions from clocksource watchdog are not
>  			acceptable).
> +			[x86] recalibrate: force to do frequency recalibration
> +			with a HW timer (HPET or PM timer) for systems whose
> +			TSC frequency comes from HW or FW through MSR or CPUID(0x15),
> +			and warn if the difference is more than 500 ppm.
>
>  	tsc_early_khz=  [X86] Skip early TSC calibration and use the given
>  			value instead. Useful when the early TSC frequency discovery
> diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> index a78e73da4a74..92bbc4a6b3fc 100644
> --- a/arch/x86/kernel/tsc.c
> +++ b/arch/x86/kernel/tsc.c
> @@ -48,6 +48,8 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);
>
>  int tsc_clocksource_reliable;
>
> +static int __read_mostly tsc_force_recalibrate;
> +
>  static u32 art_to_tsc_numerator;
>  static u32 art_to_tsc_denominator;
>  static u64 art_to_tsc_offset;
> @@ -303,6 +305,8 @@ static int __init tsc_setup(char *str)
>  		mark_tsc_unstable("boot parameter");
>  	if (!strcmp(str, "nowatchdog"))
>  		no_tsc_watchdog = 1;
> +	if (!strcmp(str, "recalibrate"))
> +		tsc_force_recalibrate = 1;
>  	return 1;
>  }
>
> @@ -1374,6 +1378,25 @@ static void tsc_refine_calibration_work(struct work_struct *work)
>  	else
>  		freq = calc_pmtimer_ref(delta, ref_start, ref_stop);
>
> +	/* Will hit this only if tsc_force_recalibrate has been set */
> +	if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {
> +
> +		/* Warn if the deviation exceeds 500 ppm */
> +		if (abs(tsc_khz - freq) > (tsc_khz >> 11)) {
> +			pr_warn("Warning: TSC freq calibrated by CPUID/MSR differs from what is calibrated by HW timer, please check with vendor!!\n");
> +			pr_info("Previous calibrated TSC freq:\t %lu.%03lu MHz\n",
> +				(unsigned long)tsc_khz / 1000,
> +				(unsigned long)tsc_khz % 1000);
> +		}
> +
> +		pr_info("TSC freq recalibrated by [%s]:\t %lu.%03lu MHz\n",
> +			hpet ? "HPET" : "PM_TIMER",
> +			(unsigned long)freq / 1000,
> +			(unsigned long)freq % 1000);
> +
> +		return;
> +	}
> +
>  	/* Make sure we're within 1% */
>  	if (abs(tsc_khz - freq) > tsc_khz/100)
>  		goto out;
> @@ -1407,8 +1430,10 @@ static int __init init_tsc_clocksource(void)
>  	if (!boot_cpu_has(X86_FEATURE_TSC) || !tsc_khz)
>  		return 0;
>
> -	if (tsc_unstable)
> -		goto unreg;
> +	if (tsc_unstable) {
> +		clocksource_unregister(&clocksource_tsc_early);
> +		return 0;
> +	}
>
>  	if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC_S3))
>  		clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
> @@ -1421,9 +1446,10 @@ static int __init init_tsc_clocksource(void)
>  		if (boot_cpu_has(X86_FEATURE_ART))
>  			art_related_clocksource = &clocksource_tsc;
>  		clocksource_register_khz(&clocksource_tsc, tsc_khz);
> -unreg:
>  		clocksource_unregister(&clocksource_tsc_early);
> -		return 0;
> +
> +		if (!tsc_force_recalibrate)
> +			return 0;
>  	}
>
>  	schedule_delayed_work(&tsc_irqwork, 0);
> --
> 2.34.1
>