Date: Mon, 14 Nov 2016 10:12:37 -0800
From: "Paul E. McKenney"
To: Peter Zijlstra
Cc: Josh Triplett, linux-kernel@vger.kernel.org, mingo@kernel.org,
    jiangshanlai@gmail.com, dipankar@in.ibm.com, akpm@linux-foundation.org,
    mathieu.desnoyers@efficios.com, tglx@linutronix.de, rostedt@goodmis.org,
    dhowells@redhat.com, edumazet@google.com, dvhart@linux.intel.com,
    fweisbec@gmail.com, oleg@redhat.com, bobby.prani@gmail.com
Subject: Re: [PATCH tip/core/rcu 6/7] rcu: Make expedited grace periods recheck dyntick idle state
Reply-To: paulmck@linux.vnet.ibm.com
Message-Id: <20161114181237.GM4127@linux.vnet.ibm.com>
In-Reply-To: <20161114173733.GJ3142@twins.programming.kicks-ass.net>
References: <20161114165648.GA15216@linux.vnet.ibm.com>
    <1479142633-15315-6-git-send-email-paulmck@linux.vnet.ibm.com>
    <20161114172512.bcwdy66elesds5t4@jtriplet-mobl2.jf.intel.com>
    <20161114173733.GJ3142@twins.programming.kicks-ass.net>

On Mon, Nov 14, 2016 at 06:37:33PM +0100, Peter Zijlstra wrote:
> On Mon, Nov 14, 2016 at 09:25:12AM -0800, Josh Triplett wrote:
> > On Mon, Nov 14, 2016 at 08:57:12AM -0800, Paul E. McKenney wrote:
> > > Expedited grace periods check dyntick-idle state, and avoid sending
> > > IPIs to idle CPUs, including those running guest OSes, and, on NOHZ_FULL
> > > kernels, nohz_full CPUs. However, the kernel has been observed checking
> > > a CPU while it was non-idle, but sending the IPI after it has gone
> > > idle. This commit therefore rechecks idle state immediately before
> > > sending the IPI, refraining from IPIing CPUs that have since gone idle.
> > >
> > > Reported-by: Rik van Riel
> > > Signed-off-by: Paul E. McKenney
> >
> > atomic_add_return(0, ...) seems odd. Do you actually want that, rather
> > than atomic_read(...)? If so, can you please document exactly why?
>
> Yes that is weird. The only effective difference is that it would do a
> load-exclusive instead of a regular load.

It is weird, and checking to see if it is safe to convert it and its
friends to something with less overhead is on my list. This starts with
a patch series I will post soon that consolidates all these
atomic_add_return() calls into a single function, which will ease testing
and other verification.

All that aside, please keep in mind that much is required from this load.
It is part of a network of ordered operations that guarantee that any
operation from any CPU preceding a given grace period is seen to precede
any other operation from any CPU following that same grace period. And
each and every CPU must agree on the order of those two operations;
otherwise, RCU is broken.

In addition, please note also that these operations are nowhere near
any fastpaths. In fact, the specific operation you are concerned about
is in an expedited grace period, which has significant overhead. This
added load can substitute for an IPI, so, believe it or not, it is an
optimization.

							Thanx, Paul
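
For readers following the thread, the recheck described in the quoted
commit log can be sketched in userspace C11 roughly as follows. This is
an illustration only, not the kernel code: struct cpu_state,
ordered_read(), toggle_idle(), send_ipi_unless_idle() and run_demo() are
hypothetical names, the even-means-idle counter convention is an
assumption made for the sketch, and a sequentially consistent
atomic_fetch_add() of zero merely stands in for the kernel's
atomic_add_return(0, ...). The demo is single-threaded, so it shows the
control flow of the recheck and says nothing about the cross-CPU
ordering guarantees discussed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for one CPU's dyntick-idle counter.
 * Assumed convention for this sketch: even = idle, odd = non-idle. */
struct cpu_state {
    atomic_int dynticks;
    int ipis_sent;
};

/* Fully ordered read: a userspace analogue of atomic_add_return(0, ...).
 * Adding zero leaves the value unchanged but is a read-modify-write. */
static int ordered_read(atomic_int *v)
{
    return atomic_fetch_add_explicit(v, 0, memory_order_seq_cst);
}

static bool is_idle(int snap)
{
    return (snap & 0x1) == 0;
}

/* Each idle entry or exit increments the counter, flipping its parity. */
static void toggle_idle(struct cpu_state *cpu)
{
    atomic_fetch_add_explicit(&cpu->dynticks, 1, memory_order_seq_cst);
}

/* Recheck idle state immediately before "sending" the IPI.
 * Returns true if an IPI was sent. */
static bool send_ipi_unless_idle(struct cpu_state *cpu)
{
    if (is_idle(ordered_read(&cpu->dynticks)))
        return false;       /* CPU has since gone idle: skip the IPI */
    cpu->ipis_sent++;       /* stand-in for the real IPI */
    return true;
}

/* Returns the total number of IPIs sent across both simulated CPUs. */
static int run_demo(void)
{
    struct cpu_state busy = { .ipis_sent = 0 };
    struct cpu_state sleepy = { .ipis_sent = 0 };

    atomic_init(&busy.dynticks, 1);
    atomic_init(&sleepy.dynticks, 1);

    /* First check: both CPUs look non-idle. */
    assert(!is_idle(ordered_read(&busy.dynticks)));
    assert(!is_idle(ordered_read(&sleepy.dynticks)));

    /* One CPU goes idle between the first check and the IPI. */
    toggle_idle(&sleepy);

    send_ipi_unless_idle(&busy);
    send_ipi_unless_idle(&sleepy);
    return busy.ipis_sent + sleepy.ipis_sent;
}
```

Calling run_demo() sends an IPI only to the CPU that stayed non-idle;
the CPU that went idle after the first check is skipped by the recheck.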