Date: Wed, 1 Feb 2023 20:54:17 -0800
From: "Paul E. McKenney"
To: Waiman Long
Cc: Thomas Gleixner, linux-kernel@vger.kernel.org, john.stultz@linaro.org,
	sboyd@kernel.org, corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org,
	kernel-team@meta.com, neeraju@codeaurora.org, ak@linux.intel.com,
	feng.tang@intel.com, zhengjun.xing@intel.com, Ingo Molnar,
	Borislav Petkov, Dave Hansen, "H. Peter Anvin", Daniel Lezcano,
	x86@kernel.org
Subject: Re: [PATCH v2 clocksource 6/7] clocksource: Verify HPET and PMTMR when TSC unverified
Message-ID: <20230202045417.GT2948950@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <20230125002708.GA1471122@paulmck-ThinkPad-P17-Gen-1>
	<20230125002730.1471349-6-paulmck@kernel.org>
	<87wn51znsh.ffs@tglx>
	<15e8c929-845e-ef65-dc04-a51f071dd256@redhat.com>
	<20230201195517.GM2948950@paulmck-ThinkPad-P17-Gen-1>
	<39752908-cc10-d63f-d02e-381693060af8@redhat.com>
In-Reply-To: <39752908-cc10-d63f-d02e-381693060af8@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Feb 01, 2023 at 10:40:56PM -0500, Waiman Long wrote:
> On 2/1/23 14:55, Paul E. McKenney wrote:
> > On Wed, Feb 01, 2023 at 02:26:29PM -0500, Waiman Long wrote:
> > > On 2/1/23 05:24, Thomas Gleixner wrote:
> > > > Paul!
> > > >
> > > > On Tue, Jan 24 2023 at 16:27, Paul E. McKenney wrote:
> > > > > On systems with two or fewer sockets, when the boot CPU has CONSTANT_TSC,
> > > > > NONSTOP_TSC, and TSC_ADJUST, clocksource watchdog verification of the
> > > > > TSC is disabled. This works well much of the time, but there is the
> > > > > occasional production-level system that meets all of these criteria, but
> > > > > which still has a TSC that skews significantly from atomic-clock time.
> > > > > This is usually attributed to a firmware or hardware fault.
> > > > > Yes, the various NTP daemons do express their opinions of
> > > > > userspace-to-atomic-clock time skew, but they put them in various
> > > > > places, depending on the daemon and distro in question. It would
> > > > > therefore be good for the kernel to have some clue that there is a
> > > > > problem.
> > > > >
> > > > > The old behavior of marking the TSC unstable is a non-starter because a
> > > > > great many workloads simply cannot tolerate the overheads and latencies
> > > > > of the various non-TSC clocksources. In addition, NTP-corrected systems
> > > > > sometimes can tolerate significant kernel-space time skew as long as
> > > > > the userspace time sources are within epsilon of atomic-clock time.
> > > > >
> > > > > Therefore, when watchdog verification of TSC is disabled, enable it for
> > > > > HPET and PMTMR (AKA ACPI PM timer). This provides the needed in-kernel
> > > > > time-skew diagnostic without degrading the system's performance.
> > > > I'm more than unhappy about this. We finally have a point where the TSC
> > > > watchdog overhead can go away without adding TSC=reliable to the kernel
> > > > commandline.
> > > >
> > > > Now you add and unconditionally enforce the watchdog again in a way which
> > > > even cannot be disabled on the kernel command line.
> > > >
> > > > Patently bad idea, no cookies for you!
> > > I have a similar concern about this patch as well. That is why I was
> > > suggesting to have this enabled for a limited time after boot for sanity
> > > checking purpose only.
> > Fair enough!
> >
> > If the watchdog checking of HPET and/or PMTMR against TSC happens
> > only when the sysadm asks for it, would you still want to have the ability
> > to enable such watchdog checking at boot time, and then to disable it
> > once the system had been running for some limited time?
>
> Yes, being optional is another way to avoid the overhead for the majority of
> users. The paranoids can turn it on if they want to.

Very good, thank you!

							Thanx, Paul