Date: Wed, 3 Aug 2022 07:06:53 -0700
From: "Paul E. McKenney"
To: Steven Rostedt
Cc: Liu Song, mingo@redhat.com, peterz@infradead.org, juri.lelli@redhat.com,
 vincent.guittot@linaro.org, dietmar.eggemann@arm.com, bsegall@google.com,
 mgorman@suse.de, bristot@redhat.com, vschneid@redhat.com,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched/debug: avoid executing show_state and causing rcu stall warning
Message-ID: <20220803140653.GD2125313@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org
References: <1659489525-82994-1-git-send-email-liusong@linux.alibaba.com>
 <20220803084235.5d56d1e4@gandalf.local.home>
In-Reply-To: <20220803084235.5d56d1e4@gandalf.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Aug 03, 2022 at 08:42:35AM -0400, Steven Rostedt wrote:
>
> [ Adding Paul ]
>
> On Wed, 3 Aug 2022 09:18:45 +0800
> Liu Song wrote:
>
> > From: Liu Song
> >
> > If the number of CPUs is large, "sysrq_sched_debug_show" will execute for
> > a long time. Every time I execute "echo t > /proc/sysrq-trigger" on my
> > 128-core machine, the rcu stall warning will be triggered. Moreover,
> > sysrq_sched_debug_show does not need to be protected by rcu_read_lock,
> > and no rcu stall warning will appear after adjustment.
> >
> > Signed-off-by: Liu Song
> > ---
> >  kernel/sched/core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 5555e49..82c117e 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -8879,11 +8879,11 @@ void show_state_filter(unsigned int state_filter)
> >  			sched_show_task(p);
> >  	}
> >
> > +	rcu_read_unlock();
> >  #ifdef CONFIG_SCHED_DEBUG
> >  	if (!state_filter)
> >  		sysrq_sched_debug_show();
>
> If this is just because sysrq_sched_debug_show() is very slow, does RCU
> have a way to "touch" it? Like the watchdogs have? That is, to tell RCU
> "Yes I know I'm taking a long time, but I'm still making forward progress,
> don't complain about me". Then the sysrq_sched_debug_show() could have:
>
> 	for_each_online_cpu(cpu) {
> 		/*
> 		 * Need to reset softlockup watchdogs on all CPUs, because
> 		 * another CPU might be blocked waiting for us to process
> 		 * an IPI or stop_machine.
> 		 */
> 		touch_nmi_watchdog();
> 		touch_all_softlockup_watchdogs();
> +		touch_rcu();
> 		print_cpu(NULL, cpu);
> 	}
>
> ??

There is an rcu_sysrq_start() and rcu_sysrq_end() to suppress this.
These are invoked by __handle_sysrq(). The value of rcu_cpu_stall_suppress
should be non-zero during the sysrq execution, and this should prevent
RCU CPU stall warnings from being printed.

That said, the code currently does not support overlapping calls to the
various functions that suppress RCU CPU stall warnings. Except that the
only other use in current mainline is rcu_panic(), which never unsuppresses.

So could you please check the value of rcu_cpu_stall_suppress? Just in
case some other form of suppression was added somewhere that I missed?

							Thanx, Paul

> -- Steve
>
>
> >  #endif
> > -	rcu_read_unlock();
> >  	/*
> >  	 * Only show locks if all tasks are dumped:
> >  	 */
>
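
For readers following the suppression discussion above, here is a minimal
userspace sketch of a flag-based scheme like the one described. It is not
the kernel source. Every model_* name is hypothetical, and the specific
flag values (1 for the panic path, 2 for the sysrq path) are assumptions
made for illustration. It shows a start/end pair that keeps warnings quiet
while the flag is non-zero, a panic path that never unsuppresses, and why
overlapping start/end calls do not nest.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: stands in for rcu_cpu_stall_suppress. */
int model_stall_suppress;

/* Models rcu_sysrq_start(): suppress unless something already did. */
void model_sysrq_start(void)
{
	if (!model_stall_suppress)
		model_stall_suppress = 2;	/* assumed "set by sysrq" value */
}

/* Models rcu_sysrq_end(): undo only a sysrq-style suppression. */
void model_sysrq_end(void)
{
	if (model_stall_suppress == 2)
		model_stall_suppress = 0;
}

/* Models rcu_panic(): suppress and never unsuppress. */
void model_panic(void)
{
	model_stall_suppress = 1;	/* assumed "set by panic" value */
}

/* A stall warning would be printed only while the flag is zero. */
bool model_stall_warning_allowed(void)
{
	return model_stall_suppress == 0;
}

/*
 * Two overlapping sysrq-style sections. The flag is not a counter, so
 * the inner end clears it while the outer section is still running.
 */
bool model_overlap_reenables_early(void)
{
	model_stall_suppress = 0;
	model_sysrq_start();			/* outer section begins */
	assert(!model_stall_warning_allowed());
	model_sysrq_start();			/* inner section: flag already set */
	model_sysrq_end();			/* inner section ends, flag cleared */
	return model_stall_warning_allowed();	/* true: suppression ended early */
}
```

In this model a single start/end pair behaves as expected, and an end call
issued after the panic path leaves suppression in place. Only the
overlapping case loses suppression early, which is why checking the flag's
value during the sysrq run is a useful first diagnostic.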