Date: Wed, 1 Nov 2023 10:40:14 -0700
From: "Paul E. McKenney"
To: Linus Torvalds
Cc: Frederic Weisbecker, linux-kernel@vger.kernel.org, Thomas Gleixner, Ingo Molnar, rcu@vger.kernel.org, Boqun Feng, Joel Fernandes, Neeraj Upadhyay, Uladzislau Rezki, Z qiang
Subject: Re: [GIT PULL] RCU changes for v6.7
Reply-To: paulmck@kernel.org

On Wed, Nov 01, 2023 at 07:11:54AM -1000, Linus Torvalds wrote:
> On Tue, 31 Oct 2023 at 15:08, Paul E. McKenney wrote:
> >
> > Here are the ways forward I can see:
> >
> > 1.      Status quo.  This has all the issues that you call out.
> >         People will hurt themselves with it and consume time and effort.
> >         So let's not do this.
>
> Well, at a *minimum*, I really want that notifier chain call to be
> done *after* the core printk's.
>
> That way, if it deadlocks or does something else stupid, at least the
> core printouts make it out.
>
> IOW, I think the notifier should be done perhaps just before the
> "panic_on_rcu_stall()" call, not at the top before you've even
> reported any stall conditions at all.

Understood.  But my problem is that the core printk()s destroy the state
that the notifier is trying to output.

> And yes, I think the trace_rcu_stall_warning() might be better off
> later too, but at least trace events are things that get regular
> testing in nasty conditions (including NMI etc), so I'm *much* less
> worried about those than about "random developers who think they know
> what they do and add a notifier".

Agreed, this is a special debug facility, not something that anyone
should use in production.  And also not something that should be used
where gdb would do the job.

> And yes, I do think the notifier should be narrowed down a lot, if you
> actually want to keep it.

Understood, thus a new default-disabled Kconfig option that depends on
RCU_EXPERT and DEBUG_KERNEL, along with a default-disabled kernel boot
parameter, both of which have to be selected to make anything happen.

> I did not actually hear you say that there is a good use-case for it.
> I only saw you say "Those of us who need this", without showing *any*
> kind of indication of why anybody would use it in reality.
>
> Why the secrecy? There is certainly no current user, nor any
> description of what a user would be and what makes that notifier
> useful.
>
> The commit message also just says "It is sometimes helpful" and some
> strange reference to "the subsystem causing the stall to dump its
> state". It all sounds very fishy. Why would anybody ever have a known
> subsystem causing RCU stalls? Except, of course, for the rcutorture
> testing.

One use case is dumping out the qspinlock state for an extremely rare
lockup.  If you even look at the system cross-eyed, the lockup goes away.
And yes, I should have mentioned this in the commit log, and I apologize
for having failed to do so.

I do not expect that the state-dump code would ever be appropriate for
mainline.

> Anyway, that all absolutely SCREAMS to me "this is not something
> useful in any normal kernel", and so yes:

Agreed, definitely not for any normal kernel!

> > 3.      Add a default-n Kconfig option that depends on RCU_EXPERT
> >         and DEBUG_KERNEL, so that these problems can only arise in
> >         specially built kernels.
> >
> > 4.      Same as #3, but use a kernel boot parameter instead of a
> >         Kconfig option.
>
> let's make it clear that this is *not* something that any upstream
> kernel would ever do, and the *only* possible use for it is some kind
> of external temporary debug patch.
>
> See why I so hate things like this? Let's head off any crazy use long
> *long* before somebody decides that "Oh, I want to use this".

You are absolutely right, a debug tool with this many sharp edges should
definitely not be default-enabled.  And needs some scary words in the
Kconfig help text.  And a boot-time splat to make people think twice
before using it.

Apologies for not having thought this through!

I will send a fixup patch before the end of today.

							Thanx, Paul
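
The double gating discussed in this message (a default-off build option
plus a default-off boot parameter, both required, with a boot-time splat)
can be sketched in user-space C.  This is only a minimal illustration
under stated assumptions, not the actual kernel patch: every identifier
beginning with sketch_ or SKETCH_ is hypothetical and merely stands in
for the Kconfig option, the boot parameter, and the notifier chain.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical compile-time gate standing in for a default-n Kconfig
 * option.  It is set to 1 here only so the enabled path can be exercised;
 * a default-n option would leave it at 0 unless explicitly selected.
 */
#ifndef SKETCH_STALL_NOTIFIER_BUILT_IN
#define SKETCH_STALL_NOTIFIER_BUILT_IN 1
#endif

/* Hypothetical run-time gate standing in for a default-off boot parameter. */
static bool sketch_stall_notifier_param = false;

/* Hypothetical single-slot "notifier chain" and a splat counter. */
typedef void (*sketch_notifier_fn)(void *state);
static sketch_notifier_fn sketch_notifier;
static int sketch_splat_count;

/* Register the debug callback that would dump subsystem state. */
static void sketch_register(sketch_notifier_fn fn)
{
	sketch_notifier = fn;
}

/* "Boot": record the parameter and warn loudly if both gates are open. */
static void sketch_boot(bool param_on)
{
	sketch_stall_notifier_param = param_on;
	if (SKETCH_STALL_NOTIFIER_BUILT_IN && sketch_stall_notifier_param) {
		sketch_splat_count++;
		fprintf(stderr,
			"WARNING: stall notifier enabled, debug use only\n");
	}
}

/* Invoke the callback only when both gates are open; report whether it ran. */
static bool sketch_report_stall(void *state)
{
	if (!SKETCH_STALL_NOTIFIER_BUILT_IN || !sketch_stall_notifier_param ||
	    !sketch_notifier)
		return false;
	sketch_notifier(state);
	return true;
}

/* Example callback: counts invocations through the state pointer. */
static void sketch_count_calls(void *state)
{
	++*(int *)state;
}
```

With either gate closed, sketch_report_stall() does nothing, which is the
property the Kconfig-plus-boot-parameter combination is meant to provide.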