Received: by 2002:a05:7412:1703:b0:e2:908c:2ebd with SMTP id dm3csp1403863rdb; Sat, 26 Aug 2023 00:38:30 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFJjAcX6ApRfr1Y9WaFC+ce934ZPk4iGa+aus1AX0GudNdFJ+90hy/HCiJ9N6h0SYxHevIB X-Received: by 2002:a05:6808:658:b0:3a3:bf6a:ba55 with SMTP id z24-20020a056808065800b003a3bf6aba55mr4442843oih.3.1693035510627; Sat, 26 Aug 2023 00:38:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1693035510; cv=none; d=google.com; s=arc-20160816; b=uGzWWnzxpkBzcrl1lKjtblLvnaJDHtvvsCu/BCuykE5IbaqXAnaYUHiVUQFqAevt6P 1cf3iqsOtmK+DvKcM8275xYTykfrllSjtvI/HM+mzuOt+2P60LJzcv0IAmEFXndyfDCG PcB20kSnDGLA3EIbXP2icHgqoXIHcTv6nAKvrmjhCIxRlfVI2n7LsGhXSQpqASr0BZN2 +VjRqTbT60ic7r00hFNh4CdGPcjaM1C6/sjwYUxJDjZkZEd6tROzslAXHZWrWQAT+k17 4HZ5Sb5AhCyDC3GTKKeHc28Bog/ThAXIlFzv3910h88P/wApXFMZgsTqjlPU+sBNX9iq EZGA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:reply-to:message-id :subject:cc:to:from:date:dkim-signature; bh=B3/4gML5F4Dx8NzU5xDiTxan65DOR7yg/MQuoz8lN58=; fh=mIQv1AdCVklLKY8JsLAojvGTUX7uVbB8j5EPR0wTWKM=; b=eXWd0InXm0OPOKZmVjTKbw7seFB1RRxH3ufmi1vyPv2rhu5M5Bq2dD6272Hm6GOr9a 2odJeKB1Z2nR7HGOgG3UUdVHB5d50hanKPXp9ujJGkYvkbAZdTlQkgiqggjr5XKAWHis bXWrH0zpcNgGjodvJ61zKo6C0w3/Mq0D5y2SCkIV17ZbetFhzfJGE7iwqXilHPMYpzHw FNmVHdrmgoPnXIyfXxUrKEFS4lIdtV3qa7JyuRYciorEwTkv8DPprx88DVd115Sy1umt gxMI+ZGMbwk6P4OLq4F7tqghVVvhwl+iELYmzIw0AFyH9KjGyEEa/iEUix1i9b8vjYf5 Qfyw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=iQfk4J7N; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z9-20020a056a001d8900b0064f78c32b89si3359998pfw.95.2023.08.26.00.37.43; Sat, 26 Aug 2023 00:38:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=iQfk4J7N; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231670AbjHZBp7 (ORCPT + 99 others); Fri, 25 Aug 2023 21:45:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37560 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231615AbjHZBpm (ORCPT ); Fri, 25 Aug 2023 21:45:42 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 981101FF3; Fri, 25 Aug 2023 18:45:40 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 14E1D608D5; Sat, 26 Aug 2023 01:45:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 60266C433C8; Sat, 26 Aug 2023 01:45:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1693014339; bh=BbMo1JBT1UtE2tnVg5AXhzHXnZUjJbWD10qmfiKFfI4=; h=Date:From:To:Cc:Subject:Reply-To:References:In-Reply-To:From; b=iQfk4J7NhAUinBTw9JGI7PTCCuoK/Kmskvjh3YSkV23PVcTPVFfMCooNX47vO3H9K RgJk4BgHxKo0DMZM8DVru/eAbhpV+8ZSkYEcgvFXbyBqvO3Hm2sfBbobk3IX0i1qI/ qz0TGtjy+tnV8DZQupN0jQyuHYjAEqez3MhD0EqLff+5X3r5hfAeQnxo6HhMuUu6oV 337uUkylNtkQPHJwVZj3dTTV2FEJ9O7cKXemII1JIHjxpiKMoNe8sFcJoNhoJjy7l2 waLykBjMNRk/pHwB7i9e17njqpFohyDue4GB7tsFiYZpgLTZ/QfwqurSKeeE6zJImL CQaTqAVG9Xlig== Received: by paulmck-ThinkPad-P17-Gen-1.home (Postfix, from userid 1000) id 073F5CE137E; Fri, 25 Aug 2023 18:45:38 -0700 (PDT) Date: Fri, 25 Aug 2023 18:45:38 -0700 From: "Paul E. McKenney" To: Huacai Chen Cc: Thomas Gleixner , Joel Fernandes , Z qiang , Huacai Chen , Frederic Weisbecker , Neeraj Upadhyay , Josh Triplett , Boqun Feng , Ingo Molnar , John Stultz , Stephen Boyd , Steven Rostedt , Mathieu Desnoyers , Lai Jiangshan , Sergey Senozhatsky , rcu@vger.kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, Binbin Zhou Subject: Re: [PATCH V4 2/2] rcu: Update jiffies in rcu_cpu_stall_reset() Message-ID: <037d4c84-d18d-40be-b311-76c29effa922@paulmck-laptop> Reply-To: paulmck@kernel.org References: <87ttspct76.ffs@tglx> <03fe7084-0509-45fa-87ee-8f8705a221a6@paulmck-laptop> <16827b4e-9823-456d-a6be-157fbfae64c3@paulmck-laptop> <8792da20-a58e-4cc0-b3d2-231d5ade2242@paulmck-laptop> <24e34f50-32d2-4b67-8ec0-1034c984d035@paulmck-laptop> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Aug 25, 2023 at 07:15:44PM +0800, Huacai Chen wrote: > Hi, Paul, > > On Fri, Aug 25, 2023 at 2:28 AM Paul E. McKenney wrote: > > > > On Thu, Aug 24, 2023 at 11:43:04PM +0800, Huacai Chen wrote: > > > Hi, Paul, > > > > > > On Thu, Aug 24, 2023 at 9:24 PM Paul E. McKenney wrote: > > > > > > > > On Thu, Aug 24, 2023 at 08:40:00PM +0800, Huacai Chen wrote: > > > > > Hi, Paul, > > > > > On Thu, Aug 24, 2023 at 7:40 PM Paul E. McKenney wrote: > > > > > > On Thu, Aug 24, 2023 at 10:50:41AM +0800, Huacai Chen wrote: > > > > > > > Hi, Paul, > > > > > > > On Thu, Aug 24, 2023 at 6:41 AM Paul E. McKenney wrote: > > > > > > > > On Thu, Aug 24, 2023 at 12:03:25AM +0200, Thomas Gleixner wrote: > > > > > > > > > On Thu, Aug 17 2023 at 16:06, Huacai Chen wrote: > > > > > > > > > > On Thu, Aug 17, 2023 at 3:27 AM Joel Fernandes wrote: > > > > > > > > > >> > If do_update_jiffies_64() cannot be used in NMI context, > > > > > > > > > >> > > > > > > > > > >> Can you not make the jiffies update conditional on whether it is > > > > > > > > > >> called within NMI context? > > > > > > > > > > > > > > > > > > Which solves what? If KGDB has a breakpoint in the jiffies lock held > > > > > > > > > region then you still dead lock. > > > > > > > > > > > > > > > > > > >> I dislike that.. > > > > > > > > > > Is this acceptable? > > > > > > > > > > > > > > > > > > > > void rcu_cpu_stall_reset(void) > > > > > > > > > > { > > > > > > > > > > unsigned long delta; > > > > > > > > > > > > > > > > > > > > delta = nsecs_to_jiffies(ktime_get_ns() - ktime_get_coarse_ns()); > > > > > > > > > > > > > > > > > > > > WRITE_ONCE(rcu_state.jiffies_stall, > > > > > > > > > > jiffies + delta + rcu_jiffies_till_stall_check()); > > > > > > > > > > } > > > > > > > > > > > > > > > > > > > > This can update jiffies_stall without updating jiffies (but has the > > > > > > > > > > same effect). > > > > > > > > > > > > > > > > > > Now you traded the potential dead lock on jiffies lock for a potential > > > > > > > > > live lock vs. tk_core.seq. Not really an improvement, right? > > > > > > > > > > > > > > > > > > The only way you can do the above is something like the incomplete and > > > > > > > > > uncompiled below. NMI safe and therefore livelock proof time interfaces > > > > > > > > > exist for a reason. > > > > > > > > > > > > > > > > Just for completeness, another approach, with its own advantages > > > > > > > > and disadvantage, is to add something like ULONG_MAX/4 to > > > > > > > > rcu_state.jiffies_stall, but also set a counter indicating that this > > > > > > > > has been done. Then RCU's force-quiescent processing could decrement > > > > > > > > that counter (if non-zero) and reset rcu_state.jiffies_stall when it > > > > > > > > does reach zero. > > > > > > > > > > > > > > > > Setting the counter to three should cover most cases, but "live by the > > > > > > > > heuristic, die by the heuristic". ;-) > > > > > > > > > > > > > > > > It would be good to have some indication when gdb exited, but things > > > > > > > > like the gdb "next" command can make that "interesting" when applied to > > > > > > > > a long-running function. > > > > > > > > > > > > > > The original code is adding ULONG_MAX/2, so adding ULONG_MAX/4 may > > > > > > > make no much difference? The simplest way is adding 300*HZ, but Joel > > > > > > > dislikes that. > > > > > > > > > > > > I am not seeing the ULONG_MAX/2, so could you please point me to that > > > > > > original code? > > > > > > > > > > Maybe I misunderstand something, I say the original code means code > > > > > before commit a80be428fbc1f1f3bc9ed924 ("rcu: Do not disable GP stall > > > > > detection in rcu_cpu_stall_reset()"). > > > > > > > > Yes, my suggestion would essentially revert that patch. It would > > > > compensate by resetting rcu_state.jiffies_stall after a few calls > > > > to rcu_gp_fqs(). > > > > > > > > Alternatively, we could simply provide a way for gdb users to manually > > > > disable RCU CPU stall warnings at the beginning of their debug sessions > > > > and to manually re-enable them when they are done. > > > > > > This problem is not KGDB-specific (though it is firstly found in the > > > KGDB case), so I want to fix it in the rcu code rather than in the > > > kgdb code. > > > > Sure, for example, there is also PowerPC XMON. > > > > But this problem also is not RCU-specific. There are also hardlockups, > > softlockups, workqueue lockups, networking timeouts, and who knows what > > all else. > > > > Plus, and again to Thomas's point, gdb breakpoints can happen anywhere. > > For example, immediately after RCU computes the RCU CPU stall time for > > a new grace period, and right before it stores it. The gdb callout > > updates rcu_state.jiffies_stall, but that update is overwritten with a > > stale value as soon as the system starts back up. > > > > Low probabillity, to be sure, but there are quite a few places in > > the kernel right after a read from some timebase or another, and many > > (perhaps all) of these can see similar stale-time-use problems. > > > > The only way I know of to avoid these sorts of false positives is for > > the user to manually suppress all timeouts (perhaps using a kernel-boot > > parameter for your early-boot case), do the gdb work, and then unsuppress > > all stalls. Even that won't work for networking, because the other > > system's clock will be running throughout. > > > > In other words, from what I know now, there is no perfect solution. > > Therefore, there are sharp limits to the complexity of any solution that > > I will be willing to accept. > I think the simplest solution is (I hope Joel will not angry): > > void rcu_cpu_stall_reset(void) > { > WRITE_ONCE(rcu_state.jiffies_stall, jiffies + 300*HZ); > } > > 300s is the upper limit of "stall timeout" we can configure > (RCU_CPU_STALL_TIMEOUT in kernel/rcu/Kconfig.debug), so it isn't just > a "magic number". In practice, 300s is also enough for any normal kgdb > operation. And compared to "resetting after a few calls to > rcu_gp_fqs()", this simple solution means "automatically resetting > after 300s". Please keep in mind that the long-ago choice of 300s did not take things like kernel debuggers into account. But that 300s limit still makes sense in the absence of things like kernel debuggers. So this code is what takes up the difference. > If this is completely unacceptable, I prefer Thomas's > tick_estimate_stale_jiffies() solution. Thomas's tick_estimate_stale_jiffies() does have its merits, but the advantage of ULONG_MAX/4 is that you don't have to care about jiffies being stale. Thanx, Paul > > > > > > The advantage of ULONG_MAX/4 over ULONG_MAX/2 is that the time_after() > > > > > > and time_before() macros have ULONG_MAX/4 slop in either direction > > > > > > before giving you the wrong answer. You can get nearly the same result > > > > > > using ULONG_MAX/2, but it requires a bit more care. And even on 32-bit > > > > > > HZ=1000 systems, ULONG_MAX/4 gets you more than 12 days of gdb session > > > > > > or jiffies-update delay before you start getting false positives. > > > > > > > > > > > > Then things can be reset after (say) 3 calls to rcu_gp_fqs() and > > > > > > also the current reset at the beginning of a grace period, which > > > > > > is in record_gp_stall_check_time(). > > > > > > > > > > > > It would be better if RCU could get notified at both ends of the debug > > > > > > session, but given gdb commands such as "next", along with Thomas's > > > > > > point about gdb breakpoints being pretty much anywhere, this might or > > > > > > might not be so helpful in real life. But worth looking into. > > > > > > > > > > > > Thanx, Paul > > > > > > > > > > > > > Huacai > > > > > > > > > > > > > > > > > > > > > > > Thanx, Paul > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > tglx > > > > > > > > > --- > > > > > > > > > --- a/kernel/time/tick-sched.c > > > > > > > > > +++ b/kernel/time/tick-sched.c > > > > > > > > > @@ -51,6 +51,13 @@ struct tick_sched *tick_get_tick_sched(i > > > > > > > > > */ > > > > > > > > > static ktime_t last_jiffies_update; > > > > > > > > > > > > > > > > > > +unsigned long tick_estimate_stale_jiffies(void) > > > > > > > > > +{ > > > > > > > > > + ktime_t delta = ktime_get_mono_fast_ns() - READ_ONCE(last_jiffies_update); > > > > > > > > > + > > > > > > > > > + return delta < 0 ? 0 : div_s64(delta, TICK_NSEC); > > > > > > > > > +} > > > > > > > > > + > > > > > > > > > /* > > > > > > > > > * Must be called with interrupts disabled ! > > > > > > > > > */ > > > > > > > > > > > > > > > > > >