Date: Wed, 17 Oct 2018 09:11:00 -0700
From: "Paul E. McKenney"
To: Joel Fernandes
Cc: Nikolay Borisov, linux-kernel@vger.kernel.org, Jonathan Corbet,
    Josh Triplett, Lai Jiangshan, linux-doc@vger.kernel.org,
    Mathieu Desnoyers, Steven Rostedt
Subject: Re: [PATCH RFC] doc: rcu: remove obsolete (non-)requirement about disabling preemption
Reply-To: paulmck@linux.ibm.com
In-Reply-To: <20181016204122.GA8176@joelaf.mtv.corp.google.com>
Message-Id: <20181017161100.GP2674@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 16, 2018 at 01:41:22PM -0700, Joel Fernandes wrote:
> On Tue, Oct 16, 2018 at 04:26:11AM -0700, Paul E. McKenney wrote:
> > On Mon, Oct 15, 2018 at 02:08:56PM -0700, Paul E. McKenney wrote:
> > > On Mon, Oct 15, 2018 at 01:15:56PM -0700, Joel Fernandes wrote:
> > > > On Mon, Oct 15, 2018 at 12:54:26PM -0700, Paul E. McKenney wrote:
> > > > [...]
> > > > > > > In any case, please don't spin for milliseconds with preemption disabled.
> > > > > > > The real-time guys are unlikely to be happy with you if you do this!
> > > > > >
> > > > > > Well just to clarify, I was just running Oleg's test which did this. This
> > > > > > test was mentioned in the original documentation that I deleted. Of course I
> > > > > > would not dare do such a thing in production code :-D. I guess to Oleg's
> > > > > > defense, he did it to verify that synchronize_rcu() was not blocked on
> > > > > > preempt-disable sections which was a different test.
> > > > >
> > > > > Understood! Just pointing out that RCU's tolerating a given action does
> > > > > not necessarily mean that it is a good idea to take that action. ;-)
> > > >
> > > > Makes sense :-) thanks.
> > >
> > > Don't worry, that won't happen again. ;-)
> > >
> > > > > > > > > +		pr_crit("SPIN done!\n");
> > > > > > > > > +		preempt_enable();
> > > > > > > > > +		break;
> > > > > > > > > +	case 777:
> > > > > > > > > +		pr_crit("SYNC start\n");
> > > > > > > > > +		synchronize_rcu();
> > > > > > > > > +		pr_crit("SYNC done!\n");
> > > > > > > >
> > > > > > > > But you are using the console printing infrastructure which is rather
> > > > > > > > heavyweight. Try replacing pr_* calls with trace_printk so that you
> > > > > > > > write to the lock-free ring buffer, this will reduce the noise from the
> > > > > > > > heavy console printing infrastructure.
> > > > > > >
> > > > > > > And this might be a problem as well.
> > > > > >
> > > > > > This was not the issue (or at least not fully the issue) since I saw the same
> > > > > > thing with trace_printk. It was exactly what you said - which is the
> > > > > > excessively long preempt disabled times.
> > > > >
> > > > > One approach would be to apply this patch against (say) v4.18, which
> > > > > does not have consolidated grace periods. You might then be able to
> > > > > tell if the pr_crit() calls make any difference.
> > > >
> > > > I could do that, yeah. But since the original problem went away due to
> > > > disabling preempts for a short while, I will move on and continue to focus on
> > > > updating other parts of the documentation. Just to mention I
> > > > brought this up because I thought it's better to do that than not to, just
> > > > in case there is any lurking issue with the consolidation. Sorry if that ended
> > > > up with me being noisy.
> > >
> > > Not a problem, no need to apologize!
> >
> > Besides, digging through the code did point out a reasonable optimization.
> > In the common case, this would buy 100s of microseconds rather than
> > milliseconds, but it seems simple enough to be worthwhile. Thoughts?
>
> Cool, thanks. One comment below:
>
> > ------------------------------------------------------------------------
> >
> > commit 07921e8720907f58f82b142f2027fc56d5abdbfd
> > Author: Paul E. McKenney
> > Date:   Tue Oct 16 04:12:58 2018 -0700
> >
> >     rcu: Speed up expedited GPs when interrupting RCU reader
> >
> >     In PREEMPT kernels, an expedited grace period might send an IPI to a
> >     CPU that is executing an RCU read-side critical section. In that case,
> >     it would be nice if the rcu_read_unlock() directly interacted with the
> >     RCU core code to immediately report the quiescent state. And this does
> >     happen in the case where the reader has been preempted. But it would
> >     also be a nice performance optimization if immediate reporting also
> >     happened in the preemption-free case.
> >
> >     This commit therefore adds an ->exp_hint field to the task_struct structure's
> >     ->rcu_read_unlock_special field. The IPI handler sets this hint when
> >     it has interrupted an RCU read-side critical section, and this causes
> >     the outermost rcu_read_unlock() call to invoke rcu_read_unlock_special(),
> >     which, if preemption is enabled, reports the quiescent state immediately.
> >     If preemption is disabled, then the report is required to be deferred
> >     until preemption (or bottom halves or interrupts or whatever) is re-enabled.
> >
> >     Because this is a hint, it does nothing for more complicated cases. For
> >     example, if the IPI interrupts an RCU reader, but interrupts are disabled
> >     across the rcu_read_unlock(), but another rcu_read_lock() is executed
> >     before interrupts are re-enabled, the hint will already have been cleared.
> >     If you do crazy things like this, reporting will be deferred until some
> >     later RCU_SOFTIRQ handler, context switch, cond_resched(), or similar.
> >
> >     Reported-by: Joel Fernandes
> >     Signed-off-by: Paul E. McKenney
> >
> > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > index 004ca21f7e80..64ce751b5fe9 100644
> > --- a/include/linux/sched.h
> > +++ b/include/linux/sched.h
> > @@ -571,8 +571,10 @@ union rcu_special {
> >  	struct {
> >  		u8			blocked;
> >  		u8			need_qs;
> > +		u8			exp_hint; /* Hint for performance. */
> > +		u8			pad;	/* No garbage from compiler! */
> >  	} b; /* Bits. */
> > -	u16 s; /* Set of bits. */
> > +	u32 s; /* Set of bits. */
> >  };
> >
> >  enum perf_event_task_context {
> > diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
> > index e669ccf3751b..928fe5893a57 100644
> > --- a/kernel/rcu/tree_exp.h
> > +++ b/kernel/rcu/tree_exp.h
> > @@ -692,8 +692,10 @@ static void sync_rcu_exp_handler(void *unused)
> >  	 */
> >  	if (t->rcu_read_lock_nesting > 0) {
> >  		raw_spin_lock_irqsave_rcu_node(rnp, flags);
> > -		if (rnp->expmask & rdp->grpmask)
> > +		if (rnp->expmask & rdp->grpmask) {
> >  			rdp->deferred_qs = true;
> > +			WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, true);
> > +		}
> >  		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
> >  	}
> >
> > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > index 8b48bb7c224c..d6286eb6e77e 100644
> > --- a/kernel/rcu/tree_plugin.h
> > +++ b/kernel/rcu/tree_plugin.h
> > @@ -643,8 +643,9 @@ static void rcu_read_unlock_special(struct task_struct *t)
> >  	local_irq_save(flags);
> >  	irqs_were_disabled = irqs_disabled_flags(flags);
> >  	if ((preempt_bh_were_disabled || irqs_were_disabled) &&
> > -	    t->rcu_read_unlock_special.b.blocked) {
> > +	    t->rcu_read_unlock_special.s) {
> >  		/* Need to defer quiescent state until everything is enabled. */
> > +		WRITE_ONCE(t->rcu_read_unlock_special.b.exp_hint, false);
> >  		raise_softirq_irqoff(RCU_SOFTIRQ);
>
> Still going through this patch, but it seems to me like the fact that
> rcu_read_unlock_special is called means someone has requested for a grace
> period. Then in that case, does it not make sense to raise the softirq
> for processing anyway?

Not necessarily. Another reason that rcu_read_unlock_special() might
be called is if the RCU read-side critical section had been preempted,
in which case there might not even be a grace period in progress.
In addition, if interrupts, bottom halves, and preemption are all enabled,
the code in rcu_preempt_deferred_qs_irqrestore() doesn't need to bother
raising softirq, as it can instead just immediately report the quiescent
state.

							Thanx, Paul
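
For anyone following along, the defer-versus-report decision discussed above can be tried out in user space. What follows is only a rough sketch under stated assumptions, not kernel code: rcu_special_model, model_ipi_sets_hint(), model_unlock_special(), and the unlock_action values are hypothetical names, uint8_t/uint32_t stand in for u8/u32, and clearing every bit on an immediate report is a simplification of what the real reporting path does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * User-space model of the patched union rcu_special.  Hypothetical
 * stand-in: uint8_t/uint32_t replace the kernel's u8/u32.
 */
union rcu_special_model {
	struct {
		uint8_t blocked;
		uint8_t need_qs;
		uint8_t exp_hint;	/* Hint for performance. */
		uint8_t pad;		/* Keeps .s free of stray bytes. */
	} b;
	uint32_t s;			/* All four bytes viewed as one word. */
};

/* With the pad byte, the byte fields exactly fill the 32-bit view. */
static_assert(sizeof(union rcu_special_model) == sizeof(uint32_t),
	      "byte fields must exactly fill the 32-bit word");

/* Hypothetical outcome labels for the model, not kernel identifiers. */
enum unlock_action { ACTION_NONE, ACTION_DEFER, ACTION_REPORT_NOW };

/* Models the IPI handler noticing that it interrupted a reader. */
static void model_ipi_sets_hint(union rcu_special_model *sp)
{
	sp->b.exp_hint = 1;
}

/*
 * Models the choice made at the outermost rcu_read_unlock():
 * nothing to do if no bit is set, defer if anything is still
 * disabled, otherwise report right away.
 */
static enum unlock_action
model_unlock_special(union rcu_special_model *sp,
		     bool preempt_bh_were_disabled, bool irqs_were_disabled)
{
	if (sp->s == 0)
		return ACTION_NONE;	/* No special handling requested. */

	if (preempt_bh_were_disabled || irqs_were_disabled) {
		sp->b.exp_hint = 0;	/* The hint is consumed on deferral. */
		return ACTION_DEFER;	/* Kernel would raise RCU_SOFTIRQ. */
	}

	sp->s = 0;			/* Simplification: report clears all. */
	return ACTION_REPORT_NOW;
}
```

Starting from a zeroed union, calling model_ipi_sets_hint() and then model_unlock_special() with either "disabled" flag set yields ACTION_DEFER and clears the hint. Setting b.blocked instead and calling with both flags false yields ACTION_REPORT_NOW, which matches the preempted-reader case where no grace period need be in progress.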