Date: Fri, 7 May 2021 10:12:59 -0700
From: "Paul E. McKenney"
To: kernel test robot
Cc: Thomas Gleixner, John Stultz, Stephen Boyd, Jonathan Corbet,
	Mark Rutland, Marc Zyngier, Andi Kleen, Feng Tang, Xing Zhengjun,
	Chris Mason, LKML, Linux Memory Management List,
	lkp@lists.01.org, lkp@intel.com
Subject: Re: [clocksource] 8e614d5b58: WARNING:at_kernel/time/clocksource-wdtest.c:#wdtest_func.cold
Message-ID: <20210507171259.GA236800@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <20210505143616.GC9038@xsang-OptiPlex-9020>
	<20210505180312.GM975577@paulmck-ThinkPad-P17-Gen-1>
In-Reply-To: <20210505180312.GM975577@paulmck-ThinkPad-P17-Gen-1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 05, 2021 at 11:03:12AM -0700, Paul E. McKenney wrote:
> On Wed, May 05, 2021 at 10:36:16PM +0800, kernel test robot wrote:
> >
> >
> > Greeting,
> >
> > FYI, we noticed the following commit (built with gcc-9):
> >
> > commit: 8e614d5b58992e722f07de7c2426f2c44668092b ("clocksource: Provide kernel module to test clocksource watchdog")
> > https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
> >
> >
> > in testcase: boot
> >
> > on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >
> >
> > +-------------------------------------------------------------------------+------------+------------+
> > |                                                                         | bdbd9c673e | 8e614d5b58 |
> > +-------------------------------------------------------------------------+------------+------------+
> > | WARNING:at_kernel/time/clocksource-wdtest.c:#wdtest_func.cold           | 0          | 11         |
> > | RIP:wdtest_func.cold                                                    | 0          | 11         |
> > +-------------------------------------------------------------------------+------------+------------+
>
> Might it be useful to address the lockdep issues that preceded this splat?
>
> Leaving that aside, the system appears to still be booting.  There are
> RCU CPU stall warning messages later on, and then the system hangs more
> than six minutes while still booting, presumably due to the large number
> of self-tests and debug options enabled.
>
> The intent is that the clocksource-wdtest tests run after boot has
> completed.  One approach would be to test it using modprobe after boot
> has completed.  In addition, the clocksource-wdtest module is not designed
> to handle CPU overload conditions, and making it do so would reduce the
> effectiveness of the test.
>
> I suggest setting clocksource-wdtest.holdoff=N, where "N" is in seconds
> and is large enough that boot has completed.  Alternatively, use modprobe
> to activate this module from userspace after boot has completed.
>
> What I do is just set CONFIG_TEST_CLOCKSOURCE_WATCHDOG=y in an ordinary
> rcutorture run, if that helps.

All that aside, does the patch below help in your environment?  If so,
I can adjust so that my testing gets done quickly and yours avoids
false-positive failures.

							Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/time/clocksource-wdtest.c b/kernel/time/clocksource-wdtest.c
index 01df12395c0e..0d8542f8b1d2 100644
--- a/kernel/time/clocksource-wdtest.c
+++ b/kernel/time/clocksource-wdtest.c
@@ -149,7 +149,7 @@ static int wdtest_func(void *arg)
 			s = ", expect clock skew";
 		pr_info("--- Watchdog with %dx error injection, %lu retries%s.\n", i, max_cswd_read_retries, s);
 		WRITE_ONCE(wdtest_ktime_read_ndelays, i);
-		schedule_timeout_uninterruptible(2 * HZ);
+		schedule_timeout_uninterruptible(60 * HZ);
 		WARN_ON_ONCE(READ_ONCE(wdtest_ktime_read_ndelays));
 		WARN_ON_ONCE((i <= max_cswd_read_retries) !=
 			     !(clocksource_wdtest_ktime.flags & CLOCK_SOURCE_UNSTABLE));