Date: Sun, 23 Jul 2023 10:19:27 -0700
From: "Paul E. McKenney"
Reply-To: paulmck@kernel.org
To: Joel Fernandes
Cc: linux-kernel@vger.kernel.org, stable@vger.kernel.org, rcu@vger.kernel.org, Greg KH
Subject: Re: [BUG] Re: Linux 6.4.4

On Sun, Jul 23, 2023 at 10:50:26AM -0400, Joel Fernandes wrote:
>
>
> On 7/22/23 13:27, Paul E. McKenney wrote:
> [..]
> >
> > OK, if this kernel is non-preemptible, you are not running TREE03,
> > correct?
> >
> >> Next plan of action is to get sched_waking stack traces since I have a
> >> very reliable repro of this now.
> >
> > Too much fun!  ;-)
>
> For TREE07 issue, it is actually the schedule_timeout_interruptible(1)
> in stutter_wait() that is beating up the CPU0 for 4 seconds.
>
> This is very similar to the issue I fixed in New year in d52d3a2bf408
> ("torture: Fix hang during kthread shutdown phase")

Agreed, if there are enough kthreads, and all the kthreads are on a
single CPU, this could consume that CPU.

> Adding a cond_resched() there also did not help.
>
> I think the issue is the stutter thread fails to move spt forward
> because it does not get CPU time. But spt == 1 should be very brief
> AFAIU. I was wondering if we could set that to RT.

Or just use a single hrtimer-based wait for each kthread?

> But also maybe the following will cure it like it did for the shutdown
> issue, giving the stutter thread just enough CPU time to move spt forward.
>
> Now I am trying the following and will let it run while I go do other
> family related things. ;)

Good point, if this avoids the problem, that gives a strong indication
that your hypothesis on the root cause is correct.

							Thanx, Paul

> +++ b/kernel/torture.c
> @@ -733,6 +733,6 @@ bool stutter_wait(const char *title)
>  			ret = true;
>  		}
>  		if (spt == 1) {
> -			schedule_timeout_interruptible(1);
> +			schedule_timeout_interruptible(HZ / 20);
>  			cond_resched();
>  		} else if (spt == 2) {
>
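
For a rough sense of what the one-line change in the diff buys, here is a
userspace C sketch (not kernel code) that compares timer wakeups per second
when each kthread sleeps for 1 jiffy versus HZ / 20 jiffies. The HZ values
and the kthread count used with it are assumptions for illustration only;
the thread states neither, and the helper names are hypothetical. It also
ignores the time each kthread spends running between sleeps.

```c
#include <assert.h>

/*
 * Approximate wakeups per second for one kthread that sleeps for
 * timeout_jiffies on every loop iteration. hz is the number of
 * jiffies per second (the kernel's HZ). Hypothetical helper.
 */
unsigned long wakeups_per_sec(unsigned long hz, unsigned long timeout_jiffies)
{
	/* Avoid dividing by zero if HZ / 20 were ever to round down to 0. */
	if (timeout_jiffies == 0)
		timeout_jiffies = 1;
	return hz / timeout_jiffies;
}

/* Aggregate wakeups per second when nr_kthreads all share one CPU. */
unsigned long total_wakeups_per_sec(unsigned long hz,
				    unsigned long timeout_jiffies,
				    unsigned long nr_kthreads)
{
	return nr_kthreads * wakeups_per_sec(hz, timeout_jiffies);
}

/* Self-check with assumed values: HZ=1000 and 64 kthreads on one CPU. */
void check_examples(void)
{
	/* Old code: a 1-jiffy sleep means one wakeup per tick. */
	assert(wakeups_per_sec(1000, 1) == 1000);
	/* New code: HZ / 20 jiffies is 1/20 second, so about 20 wakeups. */
	assert(wakeups_per_sec(1000, 1000 / 20) == 20);
	assert(total_wakeups_per_sec(1000, 1, 64) == 64000);
	assert(total_wakeups_per_sec(1000, 1000 / 20, 64) == 1280);
}
```

Under these assumptions the change cuts the wakeup load on the shared CPU
by a factor of about HZ / 20, which would leave far more room for the
stutter thread to run and move spt forward.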