From: Joel Fernandes
Date: Fri, 8 Apr 2022 13:14:35 -0400
Subject: Re: [PATCH v2] EXP rcu: Move expedited grace period (GP) work to RT kthread_worker
To: "Paul E. McKenney"
Cc: Kalesh Singh, Suren Baghdasaryan, kernel-team, Tejun Heo, Tim Murray,
 Wei Wang, Kyle Lin, Chunwei Lu, Lulu Wang, Frederic Weisbecker,
 Neeraj Upadhyay, Josh Triplett, Steven Rostedt, Mathieu Desnoyers,
 Lai Jiangshan, rcu, LKML
In-Reply-To: <20220408153447.GE4285@paulmck-ThinkPad-P17-Gen-1>
References: <20220408045734.1158817-1-kaleshsingh@google.com>
 <20220408143444.GC4285@paulmck-ThinkPad-P17-Gen-1>
 <20220408153447.GE4285@paulmck-ThinkPad-P17-Gen-1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Apr 8, 2022 at 11:34 AM Paul E. McKenney wrote:
>
> On Fri, Apr 08, 2022 at 10:41:26AM -0400, Joel Fernandes wrote:
> > On Fri, Apr 8, 2022 at 10:34 AM Paul E. McKenney wrote:
> > >
> > > On Fri, Apr 08, 2022 at 06:42:42AM -0400, Joel Fernandes wrote:
> > > > On Fri, Apr 8, 2022 at 12:57 AM Kalesh Singh wrote:
> > > > >
> > > > [...]
> > > > > @@ -334,15 +334,13 @@ static bool exp_funnel_lock(unsigned long s)
> > > > >   * Select the CPUs within the specified rcu_node that the upcoming
> > > > >   * expedited grace period needs to wait for.
> > > > >   */
> > > > > -static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
> > > > > +static void __sync_rcu_exp_select_node_cpus(struct rcu_exp_work *rewp)
> > > > >  {
> > > > >          int cpu;
> > > > >          unsigned long flags;
> > > > >          unsigned long mask_ofl_test;
> > > > >          unsigned long mask_ofl_ipi;
> > > > >          int ret;
> > > > > -        struct rcu_exp_work *rewp =
> > > > > -                container_of(wp, struct rcu_exp_work, rew_work);
> > > > >          struct rcu_node *rnp = container_of(rewp, struct rcu_node, rew);
> > > > >
> > > > >          raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > > > > @@ -417,13 +415,119 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
> > > > >          rcu_report_exp_cpu_mult(rnp, mask_ofl_test, false);
> > > > >  }
> > > > >
> > > > > +static void rcu_exp_sel_wait_wake(unsigned long s);
> > > > > +
> > > > > +#ifdef CONFIG_RCU_EXP_KTHREAD
> > > >
> > > > Just my 2c:
> > > >
> > > > Honestly, I am not sure the benefits of duplicating the code to use
> > > > normal workqueues outweigh the drawbacks (namely code complexity and
> > > > code duplication, which can in turn cause more bugs and maintenance
> > > > headaches down the line). The code is harder to read, and adding more
> > > > 30-character function names does not help.
> > > >
> > > > For something as important as expedited GPs, I can't imagine a
> > > > scenario where an RT kthread worker would cause "issues". If it does
> > > > cause issues, that's what the -rc cycles and the stable releases are
> > > > for. I prefer to trust the process rather than take a
> > > > one-foot-in-the-door approach.
> > > >
> > > > So please, can we just keep it simple?
> > >
> > > Yes and no.
> > >
> > > This is a bug fix, but only for those systems that are expecting real-time
> > > response from synchronize_rcu_expedited().  As far as I know, this is only
> > > Android.  The rest of the systems are just fine with the current behavior.
> >
> > As far as you know, but are you sure?
>
> None of us are sure.  We are balancing risks and potential benefits.

Right.

> > > In addition, this bug fix introduces significant risks, especially in
> > > terms of performance for throughput-oriented workloads.
> >
> > Could you explain what the risk is? That's the part I did not follow.
> > How can giving synchronize_rcu_expedited()'s work higher priority
> > introduce throughput issues?
>
> Status quo has synchronize_rcu_expedited() workqueues running as
> SCHED_OTHER.

Yeah, so if we go by this, you are saying RCU_BOOST likely does not
work correctly in the status quo, right? Also, I do not see what in the
commit message indicates that this is an Android-only issue; let me
know what I am missing.

> The users affected by this will instead have these running
> as SCHED_FIFO.  That changes scheduling.  For users not explicitly
> needing low-latency synchronize_rcu_expedited(), this change is very
> unlikely to be for the better.  And historically, unnecessarily running
> portions of RCU at real-time priorities has been a change for the worse.
> As in greatly increased context-switch rates and consequently degraded
> performance.  Please note that this is not a theoretical statement: Real
> users have really been burned by too much SCHED_FIFO in RCU kthreads in
> the past.

Android has also suffered from too much SCHED_FIFO in the past. I
remember the display thread called 'surfaceflinger' had to be dropped
from RT privilege at one point.
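To make the scheduling side concrete: the pattern being debated boils
down to creating a dedicated kthread_worker and then promoting its
kthread to SCHED_FIFO. Here is a minimal sketch; the worker name,
priority value, and error handling are illustrative placeholders rather
than code lifted from Kalesh's patch:

#include <linux/err.h>
#include <linux/kthread.h>
#include <linux/sched.h>
#include <uapi/linux/sched/types.h>

static struct kthread_worker *rcu_exp_gp_kworker;

static void __init rcu_start_exp_gp_kworker(void)
{
	/* Priority value is a placeholder for illustration. */
	struct sched_param param = { .sched_priority = 2 };

	/* The worker's kthread starts life as ordinary SCHED_OTHER. */
	rcu_exp_gp_kworker = kthread_create_worker(0, "rcu_exp_gp_kworker");
	if (IS_ERR_OR_NULL(rcu_exp_gp_kworker)) {
		rcu_exp_gp_kworker = NULL;
		return;
	}

	/* The contentious step: run the worker at RT priority. */
	sched_setscheduler_nocheck(rcu_exp_gp_kworker->task, SCHED_FIFO, &param);
}

Work items are then queued with kthread_queue_work() instead of
queue_work(), which is exactly what changes whom the scheduler favors
when the system is busy.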
> > > So yes, let's do this bug fix (with appropriate adjustment), but let's
> > > also avoid exposing the non-Android workloads to risks from the
> > > inevitable unintended consequences. ;-)
> >
> > I would argue the risk is also adding code complexity and more bugs
> > without a clear rationale for why it is being done. There's always risk
> > with any change, but that's what the -rc cycles and stable kernels help
> > catch. I think we should not add more code complexity if it is only a
> > theoretical concern.
> >
> > There's also another possible risk: a hidden problem here that the
> > non-Android folks probably haven't noticed or been able to debug. I
> > would rather just do the right thing.
> >
> > Just my 2c,
>
> Sorry, but my answer is still "no".
>
> Your suggested change risks visiting unacceptable performance
> degradation on a very large number of innocent users for whom current
> synchronize_rcu_expedited() latency is plenty good enough.

I believe the process will catch any such risk, but it is your call! ;-)

Thanks,

- Joel
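P.S. For anyone skimming the thread, the duplication I am objecting to
has roughly the shape below. This is a sketch reconstructed from the
hunk quoted above; the kthread_work flavor is inferred from the patch's
approach, so treat the details as illustrative rather than exact:

/* Shared implementation, now independent of the queueing mechanism. */
static void __sync_rcu_exp_select_node_cpus(struct rcu_exp_work *rewp)
{
	/* ... CPU-selection logic, unchanged from before ... */
}

#ifdef CONFIG_RCU_EXP_KTHREAD
/* kthread_worker flavor: rew_work is assumed to be a struct kthread_work. */
static void sync_rcu_exp_select_node_cpus(struct kthread_work *wp)
{
	struct rcu_exp_work *rewp =
		container_of(wp, struct rcu_exp_work, rew_work);

	__sync_rcu_exp_select_node_cpus(rewp);
}
#else /* !CONFIG_RCU_EXP_KTHREAD */
/* Workqueue flavor: rew_work is a struct work_struct, as in current mainline. */
static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
{
	struct rcu_exp_work *rewp =
		container_of(wp, struct rcu_exp_work, rew_work);

	__sync_rcu_exp_select_node_cpus(rewp);
}
#endif /* CONFIG_RCU_EXP_KTHREAD */

Either flavor just recovers the enclosing rcu_exp_work with
container_of() and calls the shared helper, so the real logic is not
duplicated; only the glue and the 30-character names are.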