Received: by 2002:a05:6a10:a841:0:0:0:0 with SMTP id d1csp1902600pxy; Thu, 29 Apr 2021 18:01:27 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzzt9wwOZFh++KKoLyaITrT7KgzDOw+oZYj7gCEDButnjiH/itNk/eZ/jpCGS5Zw0MRVThn X-Received: by 2002:a17:90a:510d:: with SMTP id t13mr11461743pjh.1.1619744487420; Thu, 29 Apr 2021 18:01:27 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1619744487; cv=none; d=google.com; s=arc-20160816; b=GETz7q5vFAtdfMEWovXY/qu/7bdFyZ6AJ099vYiyoq8ebUusYG5Fl8tEg7tQUdfjLR npt/XnzQ+A99fy/r1ZeigMfZKa3pORd1fPZqirQVLN5oDcFrHZBNnMo5f9PmPKZ1DEmI mss2vSGVAE9O7OlkTEwx08BNcpFcMLFslQKddt39I/HgTTk4WDb6qRQz0995BbrtwhAZ SMm71EL8uIQvdvEjbrMgYw6nxizhOuWPAEKVnpFL/J417+XfMUDtvRxups5vOm/rcZSa mHS8bhqstjgn1BmBUbLWm7crEEl4T4lEev4HgAgDSEpZSiE5KpO7MWYrpcxlOH9nhfSA cK7w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=E0jr23aVaMw4VqPOwjSPzSmyPoDADuxV97m0LG/Awps=; b=QYMk9qcWT+hHAFS7iWaww9dVonNqnm05PcfWaTSWP6kjdTh57VXg7NCUceHhxabawj buf/W/ECEdIv8dUUuBaUblnnnLTnb1AK2TVpDwS6RkWrVD1mvMYz2WC19Ypmb6NigFCT rmWHvC4iOQ0yCc3R+GPFsSDLqFyBTrMb3yprRtV3Jyv/be7sBM1NQ3GPiBCDSe1DW6Uu lqQa7CjWFyfDUixvAp9x8diCVlgUm3EunnE9CkIxCd6ukHepjUJBjhBNVynWSZ/mKYHa Yg90pO3/Anz10wkYq5uGZT3KHxmAtZl9ssgUoVNUx2c8xd+60OfzGhtay0K14Dz5yv7A 8mnA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Ji7kRmZ3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id n7si291839pfd.76.2021.04.29.18.01.13; Thu, 29 Apr 2021 18:01:27 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Ji7kRmZ3; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229861AbhD3BAH (ORCPT + 99 others); Thu, 29 Apr 2021 21:00:07 -0400 Received: from mail.kernel.org ([198.145.29.99]:36868 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229519AbhD3BAG (ORCPT ); Thu, 29 Apr 2021 21:00:06 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 9E1F6613ED; Fri, 30 Apr 2021 00:59:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1619744359; bh=GZNvsGnrPxvovRidLnOihcjHTpLGj9pQkIEbFHJe7UA=; h=Date:From:To:Cc:Subject:Reply-To:References:In-Reply-To:From; b=Ji7kRmZ3VS2RmjiM8EjIZPT5xnmxoZUPzOK7vE5oBSaprYIT5zVAcAsL9HwAU2dEy XEA08Zwqexp+B+qBLK7pSKqz0m0VbYBGR+BxUIISmSoUqEwwy6TnxI6Ctg5dZ2YJV4 Yttj2pJyesdqIXK09xZI2psobWMwfgiapU03fNKPkiIyH/aH3S7VKLEZACI9ds3Wf5 xLBK6yzUrFBHT8AW8U9wbYG2f7KY/mJwVe+DUpAmw4lrf46M6SBntdL0W02PYVJHf/ 9bFdiKnkFsiUMG3Fk1W8ULU0DXsVvgcbxw9G/kgHVd8+7VDzIUJm6Nox4skslNB4qu +JpMFfPtjQfOQ== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 538AB5C051D; Thu, 29 Apr 2021 17:59:19 -0700 (PDT) Date: Thu, 29 Apr 2021 17:59:19 -0700 From: "Paul E. McKenney" To: Andi Kleen Cc: Thomas Gleixner , Feng Tang , kernel test robot , 0day robot , John Stultz , Stephen Boyd , Jonathan Corbet , Mark Rutland , Marc Zyngier , Xing Zhengjun , LKML , lkp@lists.01.org, kernel-team@fb.com, neeraju@codeaurora.org, zhengjun.xing@intel.com, x86@kernel.org, Paolo Bonzini Subject: Re: [clocksource] 8c30ace35d: WARNING:at_kernel/time/clocksource.c:#clocksource_watchdog Message-ID: <20210430005919.GA983840@paulmck-ThinkPad-P17-Gen-1> Reply-To: paulmck@kernel.org References: <87y2d3mo2q.ffs@nanos.tec.linutronix.de> <87a6pimt1f.ffs@nanos.tec.linutronix.de> <20210428183118.GR975577@paulmck-ThinkPad-P17-Gen-1> <878s517axu.ffs@nanos.tec.linutronix.de> <20210429142641.GU975577@paulmck-ThinkPad-P17-Gen-1> <87lf91f177.ffs@nanos.tec.linutronix.de> <20210429230411.GK4032392@tassilo.jf.intel.com> <20210430002459.GB975577@paulmck-ThinkPad-P17-Gen-1> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210430002459.GB975577@paulmck-ThinkPad-P17-Gen-1> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Apr 29, 2021 at 05:24:59PM -0700, Paul E. McKenney wrote: > On Thu, Apr 29, 2021 at 04:04:11PM -0700, Andi Kleen wrote: > > > > The idea is to leave the watchdog code in kernel/time/clocksource.c, > > > > but to move the fault injection into kernel/time/clocksourcefault.c or > > > > some such. In this new file, new clocksource structures are created that > > > > use some existing timebase/clocksource under the covers. These then > > > > inject delays based on module parameters (one senstive to CPU number, > > > > the other unconditional). They register these clocksources using the > > > > normal interfaces, and verify that they are eventually marked unstable > > > > when the fault-injection parameters warrant it. This is combined with > > > > the usual checking of the console log. > > > > > > > > Or am I missing your point? > > > > > > That's what I meant. > > > > I still think all this stuff should be in the fault injection framework, > > like other fault injections, to have a consistent discoverable interface. > > > > I actually checked now and the standard fault injection supports boot arguments, > > so needing it at boot time shouldn't be a barrier. > > Per Thomas's feedback, I am in the midst of converting this to a unit > test implemented as a kernel module, at which point the only fault > injection will be in the unit test. > > At the moment, the code just registers, reads, unregisters, and verifies > that the bogus unit-test clocksources act normally. Fault injection is > next on the list for the fine-grained clocksource. Which, as Thomas > noted, is quite a bit simpler, as I just need to force a delay until > the clocksource gets marked unstable with no need for fancy counting. And this is what I currently get on the console from a successful test: ------------------------------------------------------------------------ clocksource_wdtest: --- holdoff=20 clocksource_wdtest: --- Verify jiffies-like uncertainty margin. clocksource: wdtest-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns clocksource_wdtest: --- Verify tsc-like uncertainty margin. clocksource: wdtest-ktime: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns clocksource_wdtest: --- tsc-like times: 1619743817068433427 - 1619743817068432303 = 1124. clocksource_wdtest: --- Watchdog without error injection. clocksource_wdtest: --- Watchdog with singleton error injection. clocksource_wdtest: --- Watchdog with doublet error injection, expect console messages. clocksource: timekeeping watchdog on CPU4: kvm-clock retried 2 times before success clocksource_wdtest: --- Watchdog with quadruplet error injection, expect clock skew. clocksource: timekeeping watchdog on CPU8: kvm-clock read-back delay of 401209ns, attempt 4, marking unstable clocksource_wdtest: --- Marking wdtest-ktime unstable due to clocksource watchdog. clocksource_wdtest: --- Done with test. ------------------------------------------------------------------------ The code currently looks like a dog's breakfast, so I will clean it up before sending it out. And of course add the time-readout error injection to test the other clock-skew code path. And yes, there are WARNs to verify that skew happens when it is supposed to and so on. Thanx, Paul