From: Thomas Gleixner
To: Feng Tang, Ingo Molnar, Borislav Petkov, Dave Hansen,
 "H. Peter Anvin", Peter Zijlstra, x86@kernel.org,
 linux-kernel@vger.kernel.org
Cc: rui.zhang@intel.com, tim.c.chen@intel.com, Xiongfeng Wang,
 Feng Tang, Yu Liao
Subject: Re: [PATCH v2] x86/tsc: Extend watchdog check exemption to 4-Sockets platform
In-Reply-To: <20221013131200.973649-1-feng.tang@intel.com>
References: <20221013131200.973649-1-feng.tang@intel.com>
Date: Wed, 19 Oct 2022 11:18:43 +0200
Message-ID: <87tu40p3ws.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Oct 13 2022 at 21:12, Feng Tang wrote:
> There is report again that the tsc clocksource on a 4 sockets x86
> Skylake server was wrongly judged as 'unstable' by 'jiffies' watchdog,
> and disabled [1].
>
> Commit b50db7095fe0 ("x86/tsc: Disable clocksource watchdog for TSC
> on qualified platorms") was introduce to deal with these false
> alarms of tsc unstable issues, covering qualified platforms for 2
> sockets or smaller ones.
>
> Extend the exemption to 4 sockets to fix the issue.
>
> We also got similar reports on 8 sockets platform from internal test,
> but as Peter pointed out, there was tsc sync issues for 8-sockets
> platform, and it'd better be handled architecture by architecture,
> instead of directly changing the threshold to 8 here.
>
> Rui also proposed another way to disable 'jiffies' as clocksource
> watchdog [2], which can also solve this specific problem in an
> architecture independent way, with one limitation that some tsc false
> alarms are reported by other watchdogs like HPET in post-boot time,
> while 'jiffies' is mostly used in boot phase before hardware
> clocksources are initialized.

HPET is initialized early, but if HPET is disabled or not advertised
then the only other hardware clocksource is PMTIMER, which is
initialized late via fs_initcall.

PMTIMER is initialized late because of broken Pentium era chipsets,
which are sorted out with PCI quirks. For anything else we can
initialize it early. Something like the below.

I'm sure I said this more than once, but I'm happy to repeat myself
forever:

  Instead of proliferating lousy hacks, can the X86 vendors finally get
  their act together and provide some architected information whether
  the TSC is trustworthy or not?

Thanks,

        tglx
---
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -10,6 +10,7 @@
  *
  */
 
+#include
 #include
 #include
 #include
@@ -75,6 +76,14 @@ static void __init setup_default_timer_i
 void __init hpet_time_init(void)
 {
 	if (!hpet_enable()) {
+		/*
+		 * Some Pentium chipsets have broken PM timers and need
+		 * PCI quirks to run before init.
+		 */
+		if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
+		    boot_cpu_data.x86 != 5)
+			init_acpi_pm_clocksource();
+
 		if (!pit_timer_init())
 			return;
 	}
--- a/drivers/clocksource/acpi_pm.c
+++ b/drivers/clocksource/acpi_pm.c
@@ -30,6 +30,7 @@
  * in arch/i386/kernel/acpi/boot.c
  */
 u32 pmtmr_ioport __read_mostly;
+static bool pmtmr_initialized __initdata;
 
 static inline u32 read_pmtmr(void)
 {
@@ -142,7 +143,7 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SE
  * Some boards have the PMTMR running way too fast. We check
  * the PMTMR rate against PIT channel 2 to catch these cases.
  */
-static int verify_pmtmr_rate(void)
+static int __init verify_pmtmr_rate(void)
 {
 	u64 value1, value2;
 	unsigned long count, delta;
@@ -172,14 +173,18 @@ static int verify_pmtmr_rate(void)
 /* Number of reads we try to get two different values */
 #define ACPI_PM_READ_CHECKS 10000
 
-static int __init init_acpi_pm_clocksource(void)
+int __init init_acpi_pm_clocksource(void)
 {
 	u64 value1, value2;
 	unsigned int i, j = 0;
+	int ret;
 
 	if (!pmtmr_ioport)
 		return -ENODEV;
 
+	if (pmtmr_initialized)
+		return 0;
+
 	/* "verify" this timing source: */
 	for (j = 0; j < ACPI_PM_MONOTONICITY_CHECKS; j++) {
 		udelay(100 * j);
@@ -210,10 +215,11 @@ static int __init init_acpi_pm_clocksour
 		return -ENODEV;
 	}
 
-	return clocksource_register_hz(&clocksource_acpi_pm,
-						PMTMR_TICKS_PER_SEC);
+	ret = clocksource_register_hz(&clocksource_acpi_pm, PMTMR_TICKS_PER_SEC);
+	if (!ret)
+		pmtmr_initialized = true;
+	return ret;
 }
-
 /* We use fs_initcall because we want the PCI fixups to have run
  * but we still need to load before device_initcall
  */
--- a/include/linux/acpi_pmtmr.h
+++ b/include/linux/acpi_pmtmr.h
@@ -13,6 +13,8 @@
 /* Overrun value */
 #define ACPI_PM_OVRRUN	(1<<24)
 
+extern int __init init_acpi_pm_clocksource(void);
+
 #ifdef CONFIG_X86_PM_TIMER
 
 extern u32 acpi_pm_read_verified(void);
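
For illustration only: the guard added to init_acpi_pm_clocksource()
above makes the function safe to call from both the early path and the
existing fs_initcall. A rough userspace sketch of that pattern follows.
It is not kernel code, and timer_init_once(), fake_register_clocksource()
and register_calls are hypothetical names made up for this sketch.

```c
#include <stdbool.h>

/* Hypothetical counter so the sketch can show how often registration ran. */
static int register_calls;

/* Hypothetical stand-in for the real registration call; 0 means success. */
static int fake_register_clocksource(void)
{
	register_calls++;
	return 0;
}

/* Set only after a successful registration, so a failed attempt can be retried. */
static bool timer_initialized;

/* Callable from an early path and again from a late initcall-style path. */
static int timer_init_once(void)
{
	int ret;

	if (timer_initialized)
		return 0;

	ret = fake_register_clocksource();
	if (!ret)
		timer_initialized = true;
	return ret;
}
```

Calling timer_init_once() twice registers once: the second call returns 0
without touching fake_register_clocksource(). Because the flag is only set
on success, a failed early attempt leaves the late call free to try again.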