Date: Wed, 4 Jan 2023 09:46:34 -0800
From: "Paul E. McKenney"
To: Feng Tang
Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	"H . Peter Anvin", Peter Zijlstra, Jonathan Corbet, x86@kernel.org,
	linux-kernel@vger.kernel.org, rui.zhang@intel.com,
	len.brown@intel.com, tim.c.chen@intel.com
Subject: Re: [PATCH v5] x86/tsc: Add option to force frequency recalibration with HW timer
Message-ID: <20230104174634.GA1735127@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <20230104081938.1014511-1-feng.tang@intel.com> <20230104143227.GC4028633@paulmck-ThinkPad-P17-Gen-1>
In-Reply-To: <20230104143227.GC4028633@paulmck-ThinkPad-P17-Gen-1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 04, 2023 at 06:32:27AM -0800, Paul E. McKenney wrote:
> On Wed, Jan 04, 2023 at 04:19:38PM +0800, Feng Tang wrote:
> > The kernel assumes that the TSC frequency which is provided by the
> > hardware / firmware via MSRs or CPUID(0x15) is correct after applying
> > a few basic consistency checks. This disables the TSC recalibration
> > against HPET or PM timer.
> >
> > As a result there is no mechanism to validate that frequency in cases
> > where a firmware or hardware defect is suspected. There was also a
> > case in which a user measured the TSC frequency with an atomic clock
> > and reported an inaccuracy, which was later fixed in firmware.
> >
> > Add an option 'recalibrate' to the 'tsc' kernel parameter to force
> > TSC frequency recalibration with HPET or PM timer, and warn if the
> > deviation from the previous value is more than about 500 PPM, which
> > provides a way to verify the data from hardware / firmware.
> >
> > There is no functional change to the existing workflow.
> >
> > Recently there was a real-world case: "The 40ms/s divergence between
> > TSC and HPET was observed on hardware that is quite recent" [1]. On
> > that platform the TSC frequency of 1896 MHz was obtained from
> > CPUID(0x15), and forced recalibration against both HPET and PM timer
> > produced a value of 1975 MHz, which also matched the check done by
> > the 'chronyd' software, indicating a BIOS or firmware problem.
> >
> > [Thanks tglx for helping improve the commit log]
> >
> > [1]. https://lore.kernel.org/lkml/20221117230910.GI4001@paulmck-ThinkPad-P17-Gen-1/
> > Signed-off-by: Feng Tang
>
> Nice!!!
>
> Tested-by: Paul E. McKenney

And I have queued this on -rcu for further review and testing, in
particular, to get it into -next sooner rather than later. Hope that
is OK!

I was thinking that this recalibrate patch made mine unnecessary:

b32498162f5c ("clocksource: Verify HPET and PMTMR when TSC unverified")

But upon further thought, I remembered that what we here at Meta need
is for TSC to remain in use on systems for which it is deemed
trustworthy. The reason is that even a short switch to HPET can
terminally annoy some of our systems. So I must keep b32498162f5c.

							Thanx, Paul

> > ---
> > Changelog:
> >
> >   since v4:
> >     * add the real-world case, where the patch helped to root-cause
> >       a BIOS/FW problem of inaccurate CPUID-0x15 info
> >     * rebase against v6.2-rc1
> >
> >   since v3:
> >     * add a real-world case to the commit log
> >     * rebase against v6.0-rc1
> >
> >   since v2:
> >     * revise the option description in kernel-parameters.txt
> >     * rebase against v5.19-rc2
> >
> >   since v1:
> >     * refine commit log to state clearly the problem and intention
> >       of the patch by copying Thomas' words.
> >
> >  .../admin-guide/kernel-parameters.txt         |  4 +++
> >  arch/x86/kernel/tsc.c                         | 34 ++++++++++++++++---
> >  2 files changed, 34 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index 6cfa6e3996cf..d9eb98e748d5 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -6369,6 +6369,10 @@
> >  			in situations with strict latency requirements (where
> >  			interruptions from clocksource watchdog are not
> >  			acceptable).
> > +			[x86] recalibrate: force to do frequency recalibration
> > +			with a HW timer (HPET or PM timer) for systems whose
> > +			TSC frequency comes from HW or FW through MSR or CPUID(0x15),
> > +			and warn if the difference is more than 500 ppm.
> >
> >  	tsc_early_khz=  [X86] Skip early TSC calibration and use the given
> >  			value instead. Useful when the early TSC frequency discovery
> > diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
> > index a78e73da4a74..92bbc4a6b3fc 100644
> > --- a/arch/x86/kernel/tsc.c
> > +++ b/arch/x86/kernel/tsc.c
> > @@ -48,6 +48,8 @@ static DEFINE_STATIC_KEY_FALSE(__use_tsc);
> >
> >  int tsc_clocksource_reliable;
> >
> > +static int __read_mostly tsc_force_recalibrate;
> > +
> >  static u32 art_to_tsc_numerator;
> >  static u32 art_to_tsc_denominator;
> >  static u64 art_to_tsc_offset;
> > @@ -303,6 +305,8 @@ static int __init tsc_setup(char *str)
> >  		mark_tsc_unstable("boot parameter");
> >  	if (!strcmp(str, "nowatchdog"))
> >  		no_tsc_watchdog = 1;
> > +	if (!strcmp(str, "recalibrate"))
> > +		tsc_force_recalibrate = 1;
> >  	return 1;
> >  }
> >
> > @@ -1374,6 +1378,25 @@ static void tsc_refine_calibration_work(struct work_struct *work)
> >  	else
> >  		freq = calc_pmtimer_ref(delta, ref_start, ref_stop);
> >
> > +	/* Will hit this only if tsc_force_recalibrate has been set */
> > +	if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ)) {
> > +
> > +		/* Warn if the deviation exceeds 500 ppm */
> > +		if (abs(tsc_khz - freq) > (tsc_khz >> 11)) {
> > +			pr_warn("Warning: TSC freq calibrated by CPUID/MSR differs from what is calibrated by HW timer, please check with vendor!!\n");
> > +			pr_info("Previous calibrated TSC freq:\t %lu.%03lu MHz\n",
> > +				(unsigned long)tsc_khz / 1000,
> > +				(unsigned long)tsc_khz % 1000);
> > +		}
> > +
> > +		pr_info("TSC freq recalibrated by [%s]:\t %lu.%03lu MHz\n",
> > +			hpet ? "HPET" : "PM_TIMER",
> > +			(unsigned long)freq / 1000,
> > +			(unsigned long)freq % 1000);
> > +
> > +		return;
> > +	}
> > +
> >  	/* Make sure we're within 1% */
> >  	if (abs(tsc_khz - freq) > tsc_khz/100)
> >  		goto out;
> > @@ -1407,8 +1430,10 @@ static int __init init_tsc_clocksource(void)
> >  	if (!boot_cpu_has(X86_FEATURE_TSC) || !tsc_khz)
> >  		return 0;
> >
> > -	if (tsc_unstable)
> > -		goto unreg;
> > +	if (tsc_unstable) {
> > +		clocksource_unregister(&clocksource_tsc_early);
> > +		return 0;
> > +	}
> >
> >  	if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC_S3))
> >  		clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
> > @@ -1421,9 +1446,10 @@ static int __init init_tsc_clocksource(void)
> >  		if (boot_cpu_has(X86_FEATURE_ART))
> >  			art_related_clocksource = &clocksource_tsc;
> >  		clocksource_register_khz(&clocksource_tsc, tsc_khz);
> > -unreg:
> >  		clocksource_unregister(&clocksource_tsc_early);
> > -		return 0;
> > +
> > +		if (!tsc_force_recalibrate)
> > +			return 0;
> >  	}
> >
> >  	schedule_delayed_work(&tsc_irqwork, 0);
> > --
> > 2.34.1
> >