From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org,
    corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org,
    kernel-team@meta.com, neeraju@codeaurora.org, ak@linux.intel.com,
    feng.tang@intel.com, zhengjun.xing@intel.com,
    daniel.lezcano@linaro.org, Yu Liao, "Paul E. McKenney"
Subject: [PATCH clocksource 2/2] x86/tsc: Extend watchdog check exemption to 4-Sockets platform
Date: Mon, 17 Jul 2023 11:28:14 -0700
Message-Id: <20230717182814.1099419-2-paulmck@kernel.org>

From: Feng Tang

There were reports again that the TSC clocksource on 4-socket x86
servers was wrongly judged 'unstable' by 'jiffies' and other watchdogs,
and disabled [1][2].

Commit b50db7095fe0 ("x86/tsc: Disable clocksource watchdog for TSC
on qualified platorms") was introduced to deal with these false alarms
of TSC instability, covering qualified platforms with 2 sockets or
fewer. And from the history of chasing TSC issues, Thomas and Peter
only saw real TSC synchronization issues on 8-socket machines. So
extend the exemption to 4 sockets to fix the issue.

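For illustration only, the exemption condition can be modelled in plain
userspace C. This is a sketch, not the kernel code: struct tsc_features,
tsc_watchdog_exempt() and the max_nodes parameter are hypothetical names
introduced here, and the three booleans merely stand in for the
boot_cpu_has() feature checks. It shows how raising the node limit from
2 to 4 changes the outcome for a 4-node system while leaving an 8-node
system under watchdog supervision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the CPU feature bits the kernel tests. */
struct tsc_features {
	bool constant_tsc;	/* models X86_FEATURE_CONSTANT_TSC */
	bool nonstop_tsc;	/* models X86_FEATURE_NONSTOP_TSC */
	bool tsc_adjust;	/* models X86_FEATURE_TSC_ADJUST */
};

/*
 * Return true when the clocksource watchdog would be skipped for the
 * TSC: all three features present and the online node count within
 * the limit (2 before this patch, 4 after it).
 */
static bool tsc_watchdog_exempt(const struct tsc_features *f,
				unsigned int online_nodes,
				unsigned int max_nodes)
{
	assert(f != NULL);

	return f->constant_tsc && f->nonstop_tsc && f->tsc_adjust &&
	       online_nodes <= max_nodes;
}
```

With all three features set, tsc_watchdog_exempt(&f, 4, 2) is false and
tsc_watchdog_exempt(&f, 4, 4) is true, while an 8-node system is not
exempt under either limit.
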
Rui also proposed another way, disabling 'jiffies' as a clocksource
watchdog [3], which can also solve the problem in [1] in an
architecture-independent way, but can't cure the problem in [2], whose
watchdog is HPET or PMTIMER, while 'jiffies' is mostly used as the
watchdog during the boot phase.

'nr_online_nodes' is known to be inaccurate in cases such as platforms
with CPU-less memory nodes, sub-NUMA clustering enabled, fakenuma, the
kernel cmdline parameter 'maxcpus=', etc. The harmful case is
'maxcpus', which could underestimate the package count and thereby
disable the watchdog; the bright side is that it is mostly used for
debugging. All of these will be addressed in other patches, as
discussed in thread [4].

[1]. https://lore.kernel.org/all/9d3bf570-3108-0336-9c52-9bee15767d29@huawei.com/
[2]. https://lore.kernel.org/lkml/06df410c-2177-4671-832f-339cff05b1d9@paulmck-laptop/
[3]. https://lore.kernel.org/all/bd5b97f89ab2887543fc262348d1c7cafcaae536.camel@intel.com/
[4]. https://lore.kernel.org/all/20221021062131.1826810-1-feng.tang@intel.com/

Reported-by: Yu Liao
Reported-by: Paul E. McKenney
Signed-off-by: Feng Tang
Signed-off-by: Paul E. McKenney
---
 arch/x86/kernel/tsc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 3425c6a943e4..15f97c0abc9d 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1258,7 +1258,7 @@ static void __init check_system_tsc_reliable(void)
 	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
 	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
 	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
-	    nr_online_nodes <= 2)
+	    nr_online_nodes <= 4)
 		tsc_disable_clocksource_watchdog();
 }
 
-- 
2.40.1