Date: Thu, 29 Jun 2017 11:47:35 -0700
From: "Paul E. McKenney"
To: Boqun Feng
Cc: Linus Torvalds, Alan Stern, Andrea Parri, Linux Kernel Mailing List,
	priyalee.kushwaha@intel.com, Stanisław Drozd, Arnd Bergmann,
	ldr709@gmail.com, Thomas Gleixner, Peter Zijlstra, Josh Triplett,
	Nicolas Pitre, Krister Johansen, Vegard Nossum, dcb314@hotmail.com,
	Wu Fengguang, Frederic Weisbecker, Rik van Riel, Steven Rostedt,
	Ingo Molnar, Luc Maranget, Jade Alglave
Subject: Re: [GIT PULL rcu/next] RCU commits for 4.13
Reply-To: paulmck@linux.vnet.ibm.com
References: <20170628170321.GQ3721@linux.vnet.ibm.com> <20170628235412.GB3721@linux.vnet.ibm.com> <20170629004556.GD3721@linux.vnet.ibm.com> <20170629031726.pb5dhjnxxiif25ma@tardis>
In-Reply-To: <20170629031726.pb5dhjnxxiif25ma@tardis>
Message-Id: <20170629184735.GC2393@linux.vnet.ibm.com>
On Thu, Jun 29, 2017 at 11:17:26AM +0800, Boqun Feng wrote:
> On Wed, Jun 28, 2017 at 05:45:56PM -0700, Paul E. McKenney wrote:
> > On Wed, Jun 28, 2017 at 05:05:46PM -0700, Linus Torvalds wrote:
> > > On Wed, Jun 28, 2017 at 4:54 PM, Paul E. McKenney
> > > wrote:
> > > >
> > > > Linus, are you dead-set against defining spin_unlock_wait() to be
> > > > spin_lock + spin_unlock?  For example, is the current x86 implementation
> > > > of spin_unlock_wait() really a non-negotiable hard requirement?  Or
> > > > would you be willing to live with the spin_lock + spin_unlock semantics?
> > > 
> > > So I think the "same as spin_lock + spin_unlock" semantics are kind of insane.
> > > 
> > > One of the issues is that the same as "spin_lock + spin_unlock" is
> > > basically now architecture-dependent.  Is it really the
> > > architecture-dependent ordering you want to define this as?
> > > 
> > > So I just think it's a *bad* definition.  If somebody wants something
> > > that is exactly equivalent to spin_lock+spin_unlock, then dammit, just
> > > do *THAT*.  It's completely pointless to me to define
> > > spin_unlock_wait() in those terms.
> > > 
> > > And if it's not equivalent to the *architecture* behavior of
> > > spin_lock+spin_unlock, then I think it should be described in terms
> > > that aren't about the architecture implementation (so you shouldn't
> > > describe it as "spin_lock+spin_unlock", you should describe it in
> > > terms of memory barrier semantics).
> > > 
> > > And if we really have to use the spin_lock+spin_unlock semantics for
> > > this, then what is the advantage of spin_unlock_wait at all, if it
> > > doesn't fundamentally avoid some locking overhead of just taking the
> > > spinlock in the first place?
> > > 
> > > And if we can't use a cheaper model, maybe we should just get rid of
> > > it entirely?
> > > 
> > > Finally: if the memory barrier semantics are exactly the same, and
> > > it's purely about avoiding some nasty contention case, I think the
> > > concept is broken - contention is almost never an actual issue, and if
> > > it is, the problem is much deeper than spin_unlock_wait().
> > 
> > All good points!
> > 
> > I must confess that your sentence about getting rid of spin_unlock_wait()
> > entirely does resonate with me, especially given the repeated bouts of
> > "but what -exactly- is it -supposed- to do?" over the past 18 months
> > or so.  ;-)
> > 
> > Just for completeness, here is a list of the definitions that have been
> > put forward, just in case it inspires someone to come up with something
> > better:
> > 
> > 1.	spin_unlock_wait() provides only acquire semantics.  Code
> > 	placed after the spin_unlock_wait() will see the effects of
> > 	all previous critical sections, but there are no guarantees for
> > 	subsequent critical sections.  The x86 implementation provides
> > 	this.  I -think- that the ARM and PowerPC implementations could
> > 	get rid of a memory-barrier instruction and still provide this.
> 
> Yes, except we still need a smp_lwsync() in powerpc's
> spin_unlock_wait().
> 
> And FWIW, the two smp_mb()s in spin_unlock_wait() on PowerPC exist there
> just because when Peter worked on commit 726328d92a42, we decided to let
> the fix for spin_unlock_wait() on PowerPC (i.e. commit 6262db7c088bb) go
> into the tree first to avoid some possible conflicts.  And... I forgot to
> do the clean-up for an acquire-semantics spin_unlock_wait() later...
> ;-)
> 
> I could send out the necessary fix once we have a conclusion for the
> semantics part.

If we end up still having spin_unlock_wait(), I will be happy to take
you up on that.

							Thanx, Paul

> Regards,
> Boqun
> 
> > 2.	As #1 above, but a "smp_mb(); spin_unlock_wait();" provides the
> > 	additional guarantee that code placed before this construct is
> > 	seen by all subsequent critical sections.  The x86 implementation
> > 	provides this, as do ARM and PowerPC, but it is not clear that all
> > 	architectures do.  As Alan noted, this is an extremely unnatural
> > 	definition for the current memory model.
> > 
> > 3.	[ Just for completeness, yes, this is off the table! ]  The
> > 	spin_unlock_wait() has the same semantics as a spin_lock()
> > 	followed immediately by a spin_unlock().
> > 
> > 4.	spin_unlock_wait() is analogous to synchronize_rcu(), where
> > 	spin_unlock_wait()'s "read-side critical sections" are the lock's
> > 	normal critical sections.  This was the first definition I heard
> > 	that made any sense to me, but it turns out to be equivalent
> > 	to #3.  Thus, also off the table.
> > 
> > Does anyone know of any other possible definitions?
> > 
> > 							Thanx, Paul