From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org
Cc: mingo@kernel.org, jiangshanlai@gmail.com, dipankar@in.ibm.com,
	akpm@linux-foundation.org, mathieu.desnoyers@efficios.com,
	josh@joshtriplett.org, tglx@linutronix.de, peterz@infradead.org,
	rostedt@goodmis.org, dhowells@redhat.com, edumazet@google.com,
	fweisbec@gmail.com, oleg@redhat.com, joel.opensrc@gmail.com,
	torvalds@linux-foundation.org, npiggin@gmail.com,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: [PATCH tip/core/rcu 16/21] rcu: Add funnel locking to rcu_start_this_gp()
Date: Sun, 22 Apr 2018 20:03:39 -0700
X-Mailer: git-send-email 2.5.2
In-Reply-To: <20180423030258.GA23370@linux.vnet.ibm.com>
References: <20180423030258.GA23370@linux.vnet.ibm.com>
Message-Id: <1524452624-27589-16-git-send-email-paulmck@linux.vnet.ibm.com>

The rcu_start_this_gp() function had a simple form of funnel locking that
used only the leaves and root of the rcu_node tree, which is fine for
systems with only a few hundred CPUs, but sub-optimal for systems having
thousands of CPUs.  This commit therefore adds full-tree funnel locking.

This variant of funnel locking is unusual in the following ways:

1.	The leaf-level rcu_node structure's ->lock is held throughout.
	Other funnel-locking implementations drop the leaf-level lock
	before progressing to the next level of the tree.

2.	Funnel locking can be started at the root, which is convenient
	for code that already holds the root rcu_node structure's ->lock.
	Other funnel-locking implementations start at the leaves.

3.	If an rcu_node structure other than the initial one believes
	that a grace period is in progress, it is not necessary to
	go further up the tree.  This is because grace-period cleanup
	scans the full tree, so that marking the need for a subsequent
	grace period anywhere in the tree suffices -- but only if a
	grace period is currently in progress.
4.	It is possible that the RCU grace-period kthread has not yet
	started, and this case must be handled appropriately.

However, the general approach of using a tree to control lock contention
is still in place.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
---
 kernel/rcu/tree.c | 92 +++++++++++++++++++++----------------------------------
 1 file changed, 35 insertions(+), 57 deletions(-)
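
For readers who want to see the funnel-locking pattern in isolation, here is
a minimal user-space sketch of the walk described above.  All of the names
(struct node, funnel_record_gp(), gp_needed, gp_in_progress) and the pthread
mutexes are hypothetical stand-ins for the rcu_node tree,
need_future_gp_element(), and raw_spin_lock_rcu_node(); the sketch only
illustrates the shape of the algorithm: the initial lock is held throughout,
intermediate locks are dropped on the way up, and the walk bails out early
once the need for a grace period has already been recorded (or, at a
non-initial node, a grace period is already in progress).

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct node {
	struct node *parent;		/* NULL at the root of the tree. */
	pthread_mutex_t lock;
	bool gp_needed;			/* Future grace period already requested here? */
	bool gp_in_progress;		/* Does this node believe a GP is running? */
};

/*
 * Walk from "start" (normally a leaf) toward the root, recording the need
 * for a future grace period.  The caller holds start->lock, and that lock
 * is held throughout (point 1 above).  Intermediate locks are dropped as
 * the walk moves rootward, and the walk stops early if the need is already
 * recorded or a non-initial node sees a grace period in progress (point 3).
 * Returns true if the caller should wake the (hypothetical) GP kthread.
 */
static bool funnel_record_gp(struct node *start)
{
	struct node *np;
	bool ret = false;

	for (np = start; ; np = np->parent) {
		if (np != start)
			pthread_mutex_lock(&np->lock);
		if (np->gp_needed ||
		    (np != start && np->gp_in_progress)) {
			if (np != start)
				pthread_mutex_unlock(&np->lock);
			return ret;	/* Someone beat us to it. */
		}
		np->gp_needed = true;
		if (!np->parent)
			break;		/* At the root; keep its lock. */
		if (np != start)
			pthread_mutex_unlock(&np->lock);
	}

	/* Root lock (or only the initial lock) is held here. */
	if (!np->gp_in_progress)
		ret = true;		/* Caller would wake the GP kthread. */
	if (np != start)
		pthread_mutex_unlock(&np->lock);
	return ret;
}

int main(void)
{
	/* Two-level tree: a single leaf below the root. */
	struct node root = { .parent = NULL, .lock = PTHREAD_MUTEX_INITIALIZER };
	struct node leaf = { .parent = &root, .lock = PTHREAD_MUTEX_INITIALIZER };

	pthread_mutex_lock(&leaf.lock);		/* Caller holds the leaf lock. */
	printf("first call, wake kthread:  %d\n", funnel_record_gp(&leaf));
	printf("second call, already done: %d\n", funnel_record_gp(&leaf));
	pthread_mutex_unlock(&leaf.lock);
	return 0;
}

Compile with something like "gcc -pthread sketch.c"; the second call prints 0
because the need was already recorded by the first.
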
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 94519c7d552f..d3c769502929 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1682,74 +1682,52 @@ static bool rcu_start_this_gp(struct rcu_node *rnp, struct rcu_data *rdp,
 {
 	bool ret = false;
 	struct rcu_state *rsp = rdp->rsp;
-	struct rcu_node *rnp_root = rcu_get_root(rsp);
-
-	raw_lockdep_assert_held_rcu_node(rnp);
-
-	/* If the specified GP is already known needed, return to caller. */
-	trace_rcu_this_gp(rnp, rdp, c, TPS("Startleaf"));
-	if (need_future_gp_element(rnp, c)) {
-		trace_rcu_this_gp(rnp, rdp, c, TPS("Prestartleaf"));
-		goto out;
-	}
+	struct rcu_node *rnp_root;
 
 	/*
-	 * If this rcu_node structure believes that a grace period is in
-	 * progress, then we must wait for the one following, which is in
-	 * "c".  Because our request will be noticed at the end of the
-	 * current grace period, we don't need to explicitly start one.
+	 * Use funnel locking to either acquire the root rcu_node
+	 * structure's lock or bail out if the need for this grace period
+	 * has already been recorded -- or has already started.  If there
+	 * is already a grace period in progress in a non-leaf node, no
+	 * recording is needed because the end of the grace period will
+	 * scan the leaf rcu_node structures.  Note that rnp->lock must
+	 * not be released.
 	 */
-	if (rnp->gpnum != rnp->completed) {
-		need_future_gp_element(rnp, c) = true;
-		trace_rcu_this_gp(rnp, rdp, c, TPS("Startedleaf"));
-		goto out;
+	raw_lockdep_assert_held_rcu_node(rnp);
+	trace_rcu_this_gp(rnp, rdp, c, TPS("Startleaf"));
+	for (rnp_root = rnp; 1; rnp_root = rnp_root->parent) {
+		if (rnp_root != rnp)
+			raw_spin_lock_rcu_node(rnp_root);
+		if (need_future_gp_element(rnp_root, c) ||
+		    ULONG_CMP_GE(rnp_root->gpnum, c) ||
+		    (rnp != rnp_root &&
+		     rnp_root->gpnum != rnp_root->completed)) {
+			trace_rcu_this_gp(rnp_root, rdp, c, TPS("Prestarted"));
+			goto unlock_out;
+		}
+		need_future_gp_element(rnp_root, c) = true;
+		if (rnp_root != rnp && rnp_root->parent != NULL)
+			raw_spin_unlock_rcu_node(rnp_root);
+		if (!rnp_root->parent)
+			break;  /* At root, and perhaps also leaf. */
 	}
 
-	/*
-	 * There might be no grace period in progress.  If we don't already
-	 * hold it, acquire the root rcu_node structure's lock in order to
-	 * start one (if needed).
-	 */
-	if (rnp != rnp_root)
-		raw_spin_lock_rcu_node(rnp_root);
-
-	/*
-	 * Get a new grace-period number.  If there really is no grace
-	 * period in progress, it will be smaller than the one we obtained
-	 * earlier.  Adjust callbacks as needed.
-	 */
-	c = rcu_cbs_completed(rsp, rnp_root);
-	if (!rcu_is_nocb_cpu(rdp->cpu))
-		(void)rcu_segcblist_accelerate(&rdp->cblist, c);
-
-	/*
-	 * If the needed for the required grace period is already
-	 * recorded, trace and leave.
-	 */
-	if (need_future_gp_element(rnp_root, c)) {
-		trace_rcu_this_gp(rnp, rdp, c, TPS("Prestartedroot"));
+	/* If GP already in progress, just leave, otherwise start one. */
+	if (rnp_root->gpnum != rnp_root->completed) {
+		trace_rcu_this_gp(rnp_root, rdp, c, TPS("Startedleafroot"));
 		goto unlock_out;
 	}
-
-	/* Record the need for the future grace period. */
-	need_future_gp_element(rnp_root, c) = true;
-
-	/* If a grace period is not already in progress, start one. */
-	if (rnp_root->gpnum != rnp_root->completed) {
-		trace_rcu_this_gp(rnp, rdp, c, TPS("Startedleafroot"));
-	} else {
-		trace_rcu_this_gp(rnp, rdp, c, TPS("Startedroot"));
-		if (!rsp->gp_kthread)
-			goto unlock_out; /* No grace-period kthread yet! */
-		WRITE_ONCE(rsp->gp_flags, rsp->gp_flags | RCU_GP_FLAG_INIT);
-		trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gpnum),
-				       TPS("newreq"));
-		ret = true;  /* Caller must wake GP kthread. */
+	trace_rcu_this_gp(rnp_root, rdp, c, TPS("Startedroot"));
+	WRITE_ONCE(rsp->gp_flags, rsp->gp_flags | RCU_GP_FLAG_INIT);
+	if (!rsp->gp_kthread) {
+		trace_rcu_this_gp(rnp_root, rdp, c, TPS("NoGPkthread"));
+		goto unlock_out;
 	}
+	trace_rcu_grace_period(rsp->name, READ_ONCE(rsp->gpnum), TPS("newreq"));
+	ret = true;  /* Caller must wake GP kthread. */
 unlock_out:
 	if (rnp != rnp_root)
 		raw_spin_unlock_rcu_node(rnp_root);
-out:
 	return ret;
 }
-- 
2.5.2