Date: Fri, 4 Mar 2016 14:53:54 -0800
From: Darren Hart
To: "Paul E. McKenney"
Cc: Jianyu Zhan, LKML, Peter Zijlstra, Thomas Gleixner, dave@stgolabs.net, Andrew Morton, Ingo Molnar, Rasmus Villemoes, dvhart@linux.intel.com, Christian Borntraeger, Fengguang Wu, bigeasy@linutronix.de
Subject: Re: [PATCH] futex: replace bare barrier() with more lightweight READ_ONCE()
Message-ID: <20160304225354.GH1092@dvhart-mobl5.amr.corp.intel.com>
In-Reply-To: <20160304224511.GX3577@linux.vnet.ibm.com>

On Fri, Mar 04, 2016 at 02:45:11PM -0800, Paul McKenney wrote:
> On Fri, Mar 04, 2016 at 02:38:01PM -0800, Darren Hart wrote:
> > On Fri, Mar 04, 2016 at 01:57:20PM -0800, Paul McKenney wrote:
> > > On Fri, Mar 04, 2016 at 01:05:24PM -0800, Darren Hart wrote:
> > > > On Fri, Mar 04, 2016 at 09:12:31AM +0800, Jianyu Zhan wrote:
> > > > > On Fri, Mar 4, 2016 at 1:05 AM, Darren Hart wrote:
> > > > > > I thought I provided a corrected comment block.... maybe I didn't.
> > > > > > We have been
> > > > > > working on improving the futex documentation, so we're paying close attention to
> > > > > > terminology as well as grammar. This one needs a couple minor tweaks. I suggest:
> > > > > >
> > > > > > /*
> > > > > >  * Use READ_ONCE to forbid the compiler from reloading q->lock_ptr and
> > > > > >  * optimizing lock_ptr out of the logic below.
> > > > > >  */
> > > > > >
> > > > > > The bit about q->lock_ptr possibly changing is already covered by the large
> > > > > > comment block below the spin_lock(lock_ptr) call.
> > > > >
> > > > > The large comment block is explaining the why the retry logic is required.
> > > > > To achieve this semantic requirement, the READ_ONCE is needed to prevent
> > > > > compiler optimizing it by doing double loads.
> > > > >
> > > > > So I think the comment above should explain this tricky part.
> > > >
> > > > Fair point. Consider:
> > > >
> > > >
> > > > /*
> > > >  * q->lock_ptr can change between this read and the following spin_lock.
> > > >  * Use READ_ONCE to forbid the compiler from reloading q->lock_ptr and
> > > >  * optimizing lock_ptr out of the logic below.
> > > >  */
> > > > >
> > > > >
> > > > > /* Use READ_ONCE to forbid the compiler from reloading q->lock_ptr in spin_lock() */
> > > > >
> > > > > And as for preventing from optimizing the lock_ptr out of the retry
> > > > > code block, I have consult
> > > > > Paul Mckenney, he suggests one more READ_ONCE should be added here:
> > > >
> > > > Let's keep this discussion together so we have a record of the
> > > > justification.
> > > >
> > > > +Paul McKenney
> > > >
> > > > Paul, my understanding was that spin_lock was a CPU memory barrier,
> > > > which in turn is an implicit compiler barrier (aka barrier()), of which
> > > > READ_ONCE is described as a weaker form. Reviewing this, I realize the
> > > > scope of barrier() wasn't clear to me.
> > > > It seems while barrier() ensures
> > > > ordering, it does not offer the same guarantee regarding reloading that
> > > > READ_ONCE offers. So READ_ONCE is not strictly a weaker form of
> > > > barrier() as I had gathered from a spotty reading of
> > > > memory-barriers.txt, but it also offers guarantees regarding memory
> > > > references that barrier() does not.
> > > >
> > > > Correct?
> > > >
> > > If q->lock_ptr is never changed except under that lock, then there is
> > > indeed no reason for the ACCESS_ONCE().
> > >
> > The only location where a q->lock_ptr is updated without that lock being held is
> > in queue_lock(). This is safe as the futex_q is not yet queued onto an hb until
> > after the lock is held (so unqueue_me() cannot race with queue_lock()).
> >
> > > So, is q->lock_ptr ever changed while the lock is -not- held? If so,
> > > I suggest that you put an ACCESS_ONCE() there.
> >
> > It is not.
>
> If I followed that correctly, then I agree that you don't need an
> ACCESS_ONCE() in this case.
>

Thanks for offering your time and expertise Paul. It's a major effort every time
I open memory-barriers.txt :-)

-- 
Darren Hart
Intel Open Source Technology Center
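
For readers following the thread, the read-once-then-lock-then-recheck pattern
that the proposed comment describes can be sketched in userspace C. This is an
illustration under stated assumptions, not the kernel's code: READ_ONCE() below
is a volatile-cast stand-in for the kernel macro (it relies on the GCC/Clang
__typeof__ extension), a pthread mutex stands in for the spinlock, and
struct fake_q and fake_unqueue() are hypothetical names that only loosely
mirror futex_q and unqueue_me().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/*
 * ASSUMPTION: userspace stand-in for the kernel's READ_ONCE(). The volatile
 * access forces the compiler to perform exactly one load here and to use the
 * loaded value, rather than re-reading the location later.
 */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* HYPOTHETICAL: simplified stand-in for struct futex_q. */
struct fake_q {
	pthread_mutex_t *lock_ptr;	/* NULL once the entry is unqueued */
};

/*
 * HYPOTHETICAL: simplified stand-in for unqueue_me().
 * Returns 1 if this call unqueued q, 0 if q was already unqueued.
 */
static int fake_unqueue(struct fake_q *q)
{
	pthread_mutex_t *lock_ptr;

	assert(q != NULL);
retry:
	/* Take one snapshot; the NULL check and the lock call both use it. */
	lock_ptr = READ_ONCE(q->lock_ptr);
	if (lock_ptr == NULL)
		return 0;

	pthread_mutex_lock(lock_ptr);
	/* q->lock_ptr may have changed between the snapshot and the lock. */
	if (lock_ptr != q->lock_ptr) {
		pthread_mutex_unlock(lock_ptr);
		goto retry;
	}
	q->lock_ptr = NULL;		/* "unqueue" while holding the lock */
	pthread_mutex_unlock(lock_ptr);
	return 1;
}
```

With a plain load in place of READ_ONCE(), the compiler would be permitted to
read q->lock_ptr once for the NULL check and again for the lock call, which is
the reload the proposed comment warns about. Whether that protection is needed
in the real code depends on the question discussed above, namely whether
q->lock_ptr can change while the lock is not held.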