Date: Wed, 28 Jun 2017 16:54:12 -0700
From: "Paul E. McKenney"
To: Alan Stern
Cc: Linus Torvalds, Andrea Parri, Linux Kernel Mailing List,
    priyalee.kushwaha@intel.com, drozdziak1@gmail.com, Arnd Bergmann,
    ldr709@gmail.com, Thomas Gleixner, Peter Zijlstra, Josh Triplett,
    Nicolas Pitre, Krister Johansen, Vegard Nossum, dcb314@hotmail.com,
    Wu Fengguang, Frederic Weisbecker, Rik van Riel, Steven Rostedt,
    Ingo Molnar, Luc Maranget, Jade Alglave
Subject: Re: [GIT PULL rcu/next] RCU commits for 4.13
Reply-To: paulmck@linux.vnet.ibm.com
References: <20170628170321.GQ3721@linux.vnet.ibm.com>
Message-Id: <20170628235412.GB3721@linux.vnet.ibm.com>

On Wed, Jun 28, 2017 at 04:16:03PM -0400, Alan Stern wrote:
> On Wed, 28 Jun 2017, Paul E. McKenney wrote:

[ . . . ]

> Yes.  Bear in mind that the PowerPC version of arch_spin_unlock_wait
> ends with smp_mb.  That already makes it a lot stronger than the
> smp_cond_load_acquire implementation on other architectures.

Fair enough!

> > So what am I missing here?
>
> Our memory model is (deliberately) weaker than the guarantees provided
> by any of the hardware implementations.

Agreed -- the model cannot say that something is guaranteed unless -all-
the architectures provide that guarantee.  If I remember correctly, each
architecture currently running Linux is stronger in some way than at
least one other architecture.  So yes, the model has to be strictly
weaker than each of the architectures taken individually, because
otherwise the model would be making a guarantee that at least one
architecture does not meet, which would disqualify it from being a
valid Linux-kernel memory model.

Plus, yes, there are a few places where the model is a bit weaker than
it needs to be in order to keep the model's complexity from exploding
beyond all reason.

> So while adding smp_mb in front of smp_cond_load_acquire may suffice
> to give the desired semantics in many cases, it might not suffice for
> all architectures (because it doesn't suffice in the model).  In fact,
> we would need to change the model to make it accept this idiom.

Understood.  And if Murphy is being his usual self, the change wouldn't
make the model any simpler.
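For concreteness, the guarantee at issue can be written in litmus-test
form.  This is a sketch only: the variable names are invented for the
example, and the tools may or may not accept spin_unlock_wait() as
written.

C unlock-wait-sketch

{}

P0(int *x, int *y, spinlock_t *s)
{
        int r0;

        WRITE_ONCE(*x, 1);
        smp_mb();               /* the barrier under discussion */
        spin_unlock_wait(s);    /* wait until s is not held */
        r0 = READ_ONCE(*y);
}

P1(int *x, int *y, spinlock_t *s)
{
        int r1;

        spin_lock(s);
        WRITE_ONCE(*y, 1);
        r1 = READ_ONCE(*x);
        spin_unlock(s);
}

exists (0:r0=0 /\ 1:r1=0)

If the idiom provides the desired semantics, the "exists" clause is
forbidden: either P1's critical section completes before
spin_unlock_wait() returns, in which case r0 must be 1, or it begins
afterward, in which case r1 must see P0's prior write.  The question
is whether smp_mb() plus an acquire-only wait rules out the remaining
outcome on all architectures.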
> I admit not being able to point to any architectures where the
> difference matters.  But if you take this approach, you do need to
> document that for any future architectures, putting smp_mb in front of
> spin_unlock_wait must be guaranteed by the arch-specific code to have
> the desired effect.
>
> Also, it would not be obvious to a reader _why_ putting an explicit
> smp_mb before spin_unlock_wait would make prior writes visible to later
> critical sections, since it's not obvious that spin_unlock_wait needs
> to do any writes.

Agreed, my current comments need help.  I will fix them as you suggest.

> Replacing the whole thing with spin_lock + spin_unlock would be easier
> to understand and wouldn't need any special guarantees or explanations.

That was my initial choice, so I would look pretty silly arguing all
that much.  And I have had much better success getting the lock+unlock
definition across to people, so my experience agrees with yours on the
easy-to-understand part.

> > > If there are any places where this would add unacceptable overhead,
> > > maybe those places need some rethinking.  For instance, perhaps we
> > > could add a separate primitive that provides only release semantics.
> > > (But would using the new primitive together with spin_unlock_wait
> > > really be significantly better than lock-unlock?)
> >
> > At the moment, I have no idea on the relative overheads.  ;-)
>
> I don't either.  But for architectures where spin_unlock_wait already
> does the equivalent of load-locked + store-conditional, it's hard to
> see why replacing it with spin_lock + spin_unlock would make it any
> slower.

Well, the current implementations don't actually take the lock, which
means that they don't delay other lock holders quite so much.  Which
might not be a huge difference even at high levels of lock contention,
and would definitely be down in the noise for low levels of lock
contention.

Digital packrat that I am, I am still carrying both sets of patches,
so can still go either way.

Linus, are you dead-set against defining spin_unlock_wait() to be
spin_lock + spin_unlock?  For example, is the current x86 implementation
of spin_unlock_wait() really a non-negotiable hard requirement?  Or
would you be willing to live with the spin_lock + spin_unlock semantics?

							Thanx, Paul
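For reference, the lock-based definition in question would be roughly
the following sketch (generic form only; per-architecture variants and
lockdep annotations are omitted):

/*
 * Sketch: spin_unlock_wait() as acquire-then-release.  This orders
 * the caller against all prior and subsequent critical sections
 * without extra barriers or arch-specific guarantees, at the cost
 * of actually taking the lock and thus briefly delaying other
 * would-be lock holders.
 */
static inline void spin_unlock_wait(spinlock_t *lock)
{
	spin_lock(lock);
	spin_unlock(lock);
}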