Date: Thu, 10 May 2018 10:22:40 -0700
From: Joel Fernandes
To: "Paul E. McKenney"
McKenney" Cc: linux-kernel@vger.kernel.org, mingo@kernel.org, jiangshanlai@gmail.com, dipankar@in.ibm.com, akpm@linux-foundation.org, mathieu.desnoyers@efficios.com, josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org, rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com, fweisbec@gmail.com, oleg@redhat.com, joel.opensrc@gmail.com, torvalds@linux-foundation.org, npiggin@gmail.com Subject: Re: [tip/core/rcu, 05/21] rcu: Make rcu_gp_cleanup() more accurately predict need for new GP Message-ID: <20180510172240.GA228531@joelaf.mtv.corp.google.com> References: <1524452624-27589-5-git-send-email-paulmck@linux.vnet.ibm.com> <20180510072133.GA122810@joelaf.mtv.corp.google.com> <20180510131546.GN26088@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20180510131546.GN26088@linux.vnet.ibm.com> User-Agent: Mutt/1.9.2 (2017-12-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, May 10, 2018 at 06:15:46AM -0700, Paul E. McKenney wrote: > On Thu, May 10, 2018 at 12:21:33AM -0700, Joel Fernandes wrote: > > Hi Paul, > > > > On Sun, Apr 22, 2018 at 08:03:28PM -0700, Paul E. McKenney wrote: > > > Currently, rcu_gp_cleanup() scans the rcu_node tree in order to reset > > > state to reflect the end of the grace period. It also checks to see > > > whether a new grace period is needed, but in a number of cases, rather > > > than directly cause the new grace period to be immediately started, it > > > instead leaves the grace-period-needed state where various fail-safes > > > can find it. This works fine, but results in higher contention on the > > > root rcu_node structure's ->lock, which is undesirable, and contention > > > on that lock has recently become noticeable. > > > > > > This commit therefore makes rcu_gp_cleanup() immediately start a new > > > grace period if there is any need for one. > > > > > > It is quite possible that it will later be necessary to throttle the > > > grace-period rate, but that can be dealt with when and if. > > > > > > Reported-by: Nicholas Piggin > > > Signed-off-by: Paul E. McKenney > > > --- > > > kernel/rcu/tree.c | 16 ++++++++++------ > > > kernel/rcu/tree.h | 1 - > > > kernel/rcu/tree_plugin.h | 17 ----------------- > > > 3 files changed, 10 insertions(+), 24 deletions(-) > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > > > index 497f139056c7..afc5e32f0da4 100644 > > > --- a/kernel/rcu/tree.c > > > +++ b/kernel/rcu/tree.c > > > @@ -1763,14 +1763,14 @@ rcu_start_future_gp(struct rcu_node *rnp, struct rcu_data *rdp, > > > * Clean up any old requests for the just-ended grace period. Also return > > > * whether any additional grace periods have been requested. > > > */ > > > -static int rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) > > > +static bool rcu_future_gp_cleanup(struct rcu_state *rsp, struct rcu_node *rnp) > > > { > > > int c = rnp->completed; > > > - int needmore; > > > + bool needmore; > > > struct rcu_data *rdp = this_cpu_ptr(rsp->rda); > > > > > > need_future_gp_element(rnp, c) = 0; > > > - needmore = need_future_gp_element(rnp, c + 1); > > > + needmore = need_any_future_gp(rnp); > > > trace_rcu_future_gp(rnp, rdp, c, > > > needmore ? 
TPS("CleanupMore") : TPS("Cleanup")); > > > return needmore; > > > @@ -2113,7 +2113,6 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) > > > { > > > unsigned long gp_duration; > > > bool needgp = false; > > > - int nocb = 0; > > > struct rcu_data *rdp; > > > struct rcu_node *rnp = rcu_get_root(rsp); > > > struct swait_queue_head *sq; > > > @@ -2152,7 +2151,7 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) > > > if (rnp == rdp->mynode) > > > needgp = __note_gp_changes(rsp, rnp, rdp) || needgp; > > > /* smp_mb() provided by prior unlock-lock pair. */ > > > - nocb += rcu_future_gp_cleanup(rsp, rnp); > > > + needgp = rcu_future_gp_cleanup(rsp, rnp) || needgp; > > > sq = rcu_nocb_gp_get(rnp); > > > raw_spin_unlock_irq_rcu_node(rnp); > > > rcu_nocb_gp_cleanup(sq); > > > @@ -2162,13 +2161,18 @@ static void rcu_gp_cleanup(struct rcu_state *rsp) > > > } > > > rnp = rcu_get_root(rsp); > > > raw_spin_lock_irq_rcu_node(rnp); /* Order GP before ->completed update. */ > > > - rcu_nocb_gp_set(rnp, nocb); > > > > > > /* Declare grace period done. */ > > > WRITE_ONCE(rsp->completed, rsp->gpnum); > > > trace_rcu_grace_period(rsp->name, rsp->completed, TPS("end")); > > > rsp->gp_state = RCU_GP_IDLE; > > > + /* Check for GP requests since above loop. */ > > > rdp = this_cpu_ptr(rsp->rda); > > > + if (need_any_future_gp(rnp)) { > > > + trace_rcu_future_gp(rnp, rdp, rsp->completed - 1, > > > + TPS("CleanupMore")); > > > + needgp = true; > > > > Patch makes sense to me. > > > > I didn't get the "rsp->completed - 1" bit in the call to trace_rcu_future_gp. > > The grace period that just completed is in rsp->completed. The future one > > should be completed + 1. What is meaning of the third argument 'c' to the > > trace event? > > The thought was that the grace period must have been requested while > rsp->completed was one less than it is now. > > In the current code, it uses rnp->gp_seq_needed, which is instead the grace > period that is being requested. Oh ok, IIUC from the code, the 'c' parameter passed to trace_rcu_future_gp is the grace-period number in the future. Perhaps we should clarify this in the include/trace/events/rcu.h code what this parameter means. Probably 'future_gp' or something like that. > > Also in rcu_future_gp_cleanup, we call: > > trace_rcu_future_gp(rnp, rdp, c, > > needmore ? TPS("CleanupMore") : TPS("Cleanup")); > > For this case, in the final trace event record, rnp->completed and c will be > > the same, since c is set to rnp->completed before calling > > trace_rcu_future_gp. I was thinking they should be different, do you expect > > them to be the same? > > Hmmm... That does look a bit inconsistent. And it currently uses > rnp->gp_seq instead of rnp->gp_seq_needed despite having the same > "CleanupMore" name. Yes I was thinking in rcu_future_gp_cleanup, the call to trace_rcu_future_gp should be trace_rcu_future_gp(rnp, rdp, c + 1, needmore...); This is because in rcu_future_gp_cleanup, c is set to rnp->completed. Just before this point rnp->completed was set to rsp->gpnum, which marks the end of the GP for the node. The next gp would be c + 1 right? > Looks like a review of the calls to trace_rcu_this_gp() is in order. Yes, I'll do some tracing and see if something else doesn't make sense to me and let you know. > Or did you have suggestions for name/gp assocations for this trace > message type? I think the name for this one is fine but also that "CleanupMore" sounds like more clean up is needed. 
> Looks like a review of the calls to trace_rcu_this_gp() is in order.

Yes, I'll do some tracing, see if something else doesn't make sense to me,
and let you know.

> Or did you have suggestions for name/gp associations for this trace
> message type?

I think the name for this one is fine, but "CleanupMore" sounds like more
cleanup is needed. It could be improved to "CleanupNeedgp" or
"CleanupAndStart" or something like that.

thanks!

- Joel