Date: Tue, 16 Oct 2018 04:26:11 -0700
From: "Paul E. McKenney"
To: Joel Fernandes
Cc: Nikolay Borisov, linux-kernel@vger.kernel.org, Jonathan Corbet,
 Josh Triplett, Lai Jiangshan, linux-doc@vger.kernel.org,
 Mathieu Desnoyers, Steven Rostedt
Subject: Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about
 disabling preemption
Reply-To: paulmck@linux.ibm.com
References: <20181015020827.GA217384@joelaf.mtv.corp.google.com>
 <20181015021349.GB217384@joelaf.mtv.corp.google.com>
 <20181015023328.GP2674@linux.ibm.com>
 <20181015024758.GA227989@joelaf.mtv.corp.google.com>
 <5151da01-343b-bb37-353e-b6652ae530f5@suse.com>
 <20181015112112.GT2674@linux.ibm.com>
 <20181015193951.GA33528@joelaf.mtv.corp.google.com>
 <20181015195426.GD2674@linux.ibm.com>
 <20181015201556.GA43575@joelaf.mtv.corp.google.com>
 <20181015210856.GE2674@linux.ibm.com>
In-Reply-To: <20181015210856.GE2674@linux.ibm.com>
Message-Id: <20181016112611.GA27405@linux.ibm.com>
X-Mailing-List:
 linux-kernel@vger.kernel.org

On Mon, Oct 15, 2018 at 02:08:56PM -0700, Paul E. McKenney wrote:
> On Mon, Oct 15, 2018 at 01:15:56PM -0700, Joel Fernandes wrote:
> > On Mon, Oct 15, 2018 at 12:54:26PM -0700, Paul E. McKenney wrote:
> > [...]
> > > > > In any case, please don't spin for milliseconds with preemption disabled.
> > > > > The real-time guys are unlikely to be happy with you if you do this!
> > > >
> > > > Well just to clarify, I was just running Oleg's test which did this. This
> > > > test was mentioned in the original documentation that I deleted. Of course I
> > > > would not dare do such a thing in production code :-D. I guess to Oleg's
> > > > defense, he did it to verify that synchronize_rcu() was not blocked on
> > > > preempt-disable sections which was a different test.
> > >
> > > Understood! Just pointing out that RCU's tolerating a given action does
> > > not necessarily mean that it is a good idea to take that action. ;-)
> >
> > Makes sense :-) thanks.
>
> Don't worry, that won't happen again. ;-)
>
> > > > > > > +		pr_crit("SPIN done!\n");
> > > > > > > +		preempt_enable();
> > > > > > > +		break;
> > > > > > > +	case 777:
> > > > > > > +		pr_crit("SYNC start\n");
> > > > > > > +		synchronize_rcu();
> > > > > > > +		pr_crit("SYNC done!\n");
> > > > > >
> > > > > > But you are using the console printing infrastructure which is rather
> > > > > > heavyweight. Try replacing pr_* calls with trace_printk so that you
> > > > > > write to the lock-free ring buffer, this will reduce the noise from the
> > > > > > heavy console printing infrastructure.
> > > > >
> > > > > And this might be a problem as well.
> > > >
> > > > This was not the issue (or at least not fully the issue) since I saw the same
> > > > thing with trace_printk. It was exactly what you said - which is the
> > > > excessively long preempt disabled times.
> > >
> > > One approach would be to apply this patch against (say) v4.18, which
> > > does not have consolidated grace periods.
> > > You might then be able to
> > > tell if the pr_crit() calls make any difference.
> >
> > I could do that, yeah. But since the original problem went away due to
> > disabling preempts for a short while, I will move on and continue to focus on
> > updating other parts of the documentation. Just to mention I
> > brought this up because I thought it's better to do that than not to, just
> > in case there is any lurking issue with the consolidation. Sorry if that ended
> > up with me being noisy.
>
> Not a problem, no need to apologize!

Besides, digging through the code did point out a reasonable optimization.
In the common case, this would buy 100s of microseconds rather than
milliseconds, but it seems simple enough to be worthwhile. Thoughts?

							Thanx, Paul

------------------------------------------------------------------------

commit 07921e8720907f58f82b142f2027fc56d5abdbfd
Author: Paul E. McKenney
Date:   Tue Oct 16 04:12:58 2018 -0700

    rcu: Speed up expedited GPs when interrupting RCU reader

    In PREEMPT kernels, an expedited grace period might send an IPI to a
    CPU that is executing an RCU read-side critical section. In that case,
    it would be nice if the rcu_read_unlock() directly interacted with the
    RCU core code to immediately report the quiescent state. And this does
    happen in the case where the reader has been preempted. But it would
    also be a nice performance optimization if immediate reporting also
    happened in the preemption-free case.

    This commit therefore adds an ->exp_hint field to the task_struct
    structure's ->rcu_read_unlock_special field. The IPI handler sets this
    hint when it has interrupted an RCU read-side critical section, and
    this causes the outermost rcu_read_unlock() call to invoke
    rcu_read_unlock_special(), which, if preemption is enabled, reports the
    quiescent state immediately. If preemption is disabled, then the report
    is required to be deferred until preemption (or bottom halves or
    interrupts or whatever) is re-enabled.

    Because this is a hint, it does nothing for more complicated cases. For
    example, if the IPI interrupts an RCU reader, interrupts are disabled
    across the rcu_read_unlock(), and another rcu_read_lock() is executed
    before interrupts are re-enabled, the hint will already have been
    cleared. If you do crazy things like this, reporting will be deferred
    until some later RCU_SOFTIRQ handler, context switch, cond_resched(),
    or similar.

    Reported-by: Joel Fernandes
    Signed-off-by: Paul E. McKenney

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 004ca21f7e80..64ce751b5fe9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -571,8 +571,10 @@ union rcu_special {
 	struct {
 		u8			blocked;
 		u8			need_qs;
+		u8			exp_hint; /* Hint for performance. */
+		u8			pad; /* No garbage from compiler! */
 	} b; /* Bits. */
-	u16 s; /* Set of bits. */
+	u32 s; /* Set of bits. */
 };
 
 enum perf_event_task_context {
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index e669ccf3751b..928fe5893a57 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -692,8 +692,10 @@ static void sync_rcu_exp_handler(void *unused)
 	 */
 	if (t->rcu_read_lock_nesting > 0) {
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
-		if (rnp->expmask & rdp->grpmask)
+		if (rnp->expmask & rdp->grpmask) {
 			rdp->deferred_qs = true;
+			WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, true);
+		}
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	}
 
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 8b48bb7c224c..d6286eb6e77e 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -643,8 +643,9 @@ static void rcu_read_unlock_special(struct task_struct *t)
 	local_irq_save(flags);
 	irqs_were_disabled = irqs_disabled_flags(flags);
 	if ((preempt_bh_were_disabled || irqs_were_disabled) &&
-	    t->rcu_read_unlock_special.b.blocked) {
+	    t->rcu_read_unlock_special.s) {
 		/* Need to defer quiescent state until everything is enabled. */
+		WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, false);
 		raise_softirq_irqoff(RCU_SOFTIRQ);
 		local_irq_restore(flags);
 		return;
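
For anyone who wants to poke at the idea outside the kernel, here is a
rough userspace sketch of how the hint is meant to flow. It is only a
model under loud assumptions: the names rcu_special_model, task_model,
and the model_*() functions are made up for illustration, a single
preempt_disabled flag stands in for the preempt/bh/irq state, and there
are no IPIs, locks, softirqs, or memory ordering here. It does show why
widening ->s to cover all four bytes (with a zeroed pad byte) lets one
word-sized test catch the new hint alongside the existing flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the patched union rcu_special (assumed layout). */
union rcu_special_model {
	struct {
		uint8_t blocked;
		uint8_t need_qs;
		uint8_t exp_hint;	/* Hint for performance. */
		uint8_t pad;		/* Kept zero so .s carries no stray bits. */
	} b;
	uint32_t s;			/* All four bytes viewed as one word. */
};

/* Hypothetical, heavily reduced model of one task. */
struct task_model {
	int nesting;			/* rcu_read_lock() depth */
	bool preempt_disabled;		/* stands in for preempt/bh/irq state */
	bool qs_reported;		/* quiescent state reported right away */
	bool qs_deferred;		/* report left for some later event */
	union rcu_special_model special;
};

static void model_read_lock(struct task_model *t)
{
	t->nesting++;
}

/* Models the IPI handler: set the hint only if a reader was interrupted. */
static void model_exp_ipi(struct task_model *t)
{
	if (t->nesting > 0)
		t->special.b.exp_hint = 1;
}

/* Models the slow path taken by the outermost unlock. */
static void model_unlock_special(struct task_model *t)
{
	if (t->preempt_disabled && t->special.s) {
		/* Cannot report yet: clear the hint and defer. */
		t->special.b.exp_hint = 0;
		t->qs_deferred = true;
		return;
	}
	t->special.s = 0;
	t->qs_reported = true;
}

/* Models rcu_read_unlock(): one word-sized test covers every flag. */
static void model_read_unlock(struct task_model *t)
{
	assert(t->nesting > 0);
	if (--t->nesting == 0 && t->special.s)
		model_unlock_special(t);
}

/* Returns 1 for an immediate report, 0 for a deferred one, -1 for neither. */
int model_scenario(bool ipi, bool preempt_disabled)
{
	struct task_model t = { .preempt_disabled = preempt_disabled };

	model_read_lock(&t);
	if (ipi)
		model_exp_ipi(&t);
	model_read_unlock(&t);
	if (t.qs_reported)
		return 1;
	return t.qs_deferred ? 0 : -1;
}
```

In this model, model_scenario(true, false) takes the immediate-report
path, model_scenario(true, true) takes the deferred path, and
model_scenario(false, false) never enters the slow path at all, which
is the fast-path behavior the hint is supposed to leave untouched.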