Date: Thu, 29 Jun 2017 11:46:51 -0700
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Will Deacon
Cc: Linus Torvalds, Alan Stern, Andrea Parri, Linux Kernel Mailing List,
	priyalee.kushwaha@intel.com, Stanisław Drozd, Arnd Bergmann,
	ldr709@gmail.com, Thomas Gleixner, Peter Zijlstra, Josh Triplett,
	Nicolas Pitre, Krister Johansen, Vegard Nossum, dcb314@hotmail.com,
	Wu Fengguang, Frederic Weisbecker, Rik van Riel, Steven Rostedt,
	Ingo Molnar, Luc Maranget, Jade Alglave
Subject: Re: [GIT PULL rcu/next] RCU commits for 4.13
Message-Id: <20170629184651.GB2393@linux.vnet.ibm.com>
In-Reply-To: <20170629113848.GA18630@arm.com>

On Thu, Jun 29, 2017 at 12:38:48PM +0100, Will Deacon wrote:
> [turns out I've not been on cc for this thread, but Jade pointed me to it
> and I see my name came up at some point!]

My bad for not having you on Cc: for the original patch, apologies!

> On Wed, Jun 28, 2017 at 05:05:46PM -0700, Linus Torvalds wrote:
> > On Wed, Jun 28, 2017 at 4:54 PM, Paul E. McKenney
> > <paulmck@linux.vnet.ibm.com> wrote:
> > >
> > > Linus, are you dead-set against defining spin_unlock_wait() to be
> > > spin_lock + spin_unlock?  For example, is the current x86 implementation
> > > of spin_unlock_wait() really a non-negotiable hard requirement?  Or
> > > would you be willing to live with the spin_lock + spin_unlock semantics?
> >
> > So I think the "same as spin_lock + spin_unlock" semantics are kind of
> > insane.
> >
> > One of the issues is that the same as "spin_lock + spin_unlock" is
> > basically now architecture-dependent.  Is it really the
> > architecture-dependent ordering you want to define this as?
> >
> > So I just think it's a *bad* definition.  If somebody wants something
> > that is exactly equivalent to spin_lock+spin_unlock, then dammit, just
> > do *THAT*.  It's completely pointless to me to define
> > spin_unlock_wait() in those terms.
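For concreteness, the strong definition I was asking about would amount
to something like the sketch below -- purely illustrative, not a
proposed patch, and ignoring the per-architecture headers that a real
change would have to touch:

	/*
	 * Illustrative only: spin_unlock_wait() defined to have exactly
	 * the ordering semantics of acquiring and then immediately
	 * releasing the lock (an empty critical section).
	 */
	static inline void spin_unlock_wait(spinlock_t *lock)
	{
		spin_lock(lock);
		spin_unlock(lock);
	}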
> >
> > And if it's not equivalent to the *architecture* behavior of
> > spin_lock+spin_unlock, then I think it should be described in terms
> > that aren't about the architecture implementation (so you shouldn't
> > describe it as "spin_lock+spin_unlock"; you should describe it in
> > terms of memory barrier semantics).
> >
> > And if we really have to use the spin_lock+spin_unlock semantics for
> > this, then what is the advantage of spin_unlock_wait at all, if it
> > doesn't fundamentally avoid some locking overhead of just taking the
> > spinlock in the first place?
>
> Just on this point -- the arm64 code provides the same ordering semantics
> as you would get from a lock;unlock sequence, but we can optimise that
> when compared to an actual lock;unlock sequence because we don't need to
> wait in turn for our ticket.  I suspect something similar could be done
> if/when we move to qspinlocks.
>
> Whether or not this is actually worth optimising is another question, but
> it is worth noting that unlock_wait can be implemented more cheaply than
> lock;unlock, whilst providing the same ordering guarantees (if that's
> really what we want -- see my reply to Paul).
>
> Simplicity tends to be my preference, so ripping this out would suit me
> best ;)

Creating the series to do just that, with you on Cc this time!

							Thanx, Paul
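P.S.  For anyone following along at home, the cheaper-than-lock+unlock
implementation Will mentions looks roughly like the sketch below,
written against a made-up generic ticket lock.  The struct and function
names are invented for illustration, and a real implementation would
snapshot the lock word atomically and use the architecture's
acquire/release primitives -- getting that ordering exactly right being,
of course, the whole debate:

	/* Hypothetical ticket lock, for illustration only. */
	struct tkt_lock {
		u16 owner;	/* ticket currently being served */
		u16 next;	/* next ticket to be handed out */
	};

	static inline void tkt_unlock_wait(struct tkt_lock *lock)
	{
		smp_mb();	/* order earlier accesses before the spin */

		/*
		 * Spin until the lock is observed free, without ever
		 * incrementing ->next, so the waiter never delays
		 * other acquirers by holding a ticket of its own.
		 */
		while (READ_ONCE(lock->owner) != READ_ONCE(lock->next))
			cpu_relax();

		smp_mb();	/* order the observation before later accesses */
	}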