Date: Thu, 10 Aug 2017 12:41:22 +0200
From: Peter Zijlstra
To: Andrea Parri
Cc: Waiman Long, Prateek Sood, mingo@redhat.com, sramana@codeaurora.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] rwsem: fix missed wakeup due to reordering of load
Message-ID: <20170810104122.mhxpayi7hvjcyuoy@hirez.programming.kicks-ass.net>
References: <1501100272-16338-1-git-send-email-prsood@codeaurora.org>
	<680a3f82-b36b-e09f-b1db-fa23d9f21fc2@redhat.com>
	<20170810083255.GA3913@andrea>
In-Reply-To: <20170810083255.GA3913@andrea>

On Thu, Aug 10, 2017 at 10:32:56AM +0200, Andrea Parri wrote:
> On Thu, Jul 27, 2017 at 11:48:53AM -0400, Waiman Long wrote:
> > On 07/26/2017 04:17 PM, Prateek Sood wrote:
> > > If a spinner is present, there is a chance that the load of
> > > rwsem_has_spinner() in rwsem_wake() can be reordered with
> > > respect to decrement of rwsem count in __up_write() leading
> > > to wakeup being missed.
> > >
> > >  spinning writer                  up_write caller
> > >  ---------------                  -----------------------
> > >  [S] osq_unlock()                 [L] osq
> > >   spin_lock(wait_lock)
> > >   sem->count=0xFFFFFFFF00000001
> > >             +0xFFFFFFFF00000000
> > >   count=sem->count
> > >   MB
> > >                                    sem->count=0xFFFFFFFE00000001
> > >                                              -0xFFFFFFFF00000001
> > >                                    spin_trylock(wait_lock)
> > >                                    return
> > >  rwsem_try_write_lock(count)
> > >  spin_unlock(wait_lock)
> > >  schedule()
> > >
> > > Reordering of atomic_long_sub_return_release() in __up_write()
> > > and rwsem_has_spinner() in rwsem_wake() can cause missing of
> > > wakeup in up_write() context. In spinning writer, sem->count
> > > and local variable count is 0XFFFFFFFE00000001. It would result
> > > in rwsem_try_write_lock() failing to acquire rwsem and spinning
> > > writer going to sleep in rwsem_down_write_failed().
> > >
> > > The smp_rmb() will make sure that the spinner state is
> > > consulted after sem->count is updated in up_write context.
> > >
> > > Signed-off-by: Prateek Sood
> >
> > Did you actually observe that the reordering happens?
> >
> > I am not sure if some architectures can actually speculatively execute
> > instruction ahead of a branch and then ahead into a function call. I
> > know it can happen if the function call is inlined, but rwsem_wake()
> > will not be inlined into __up_read() or __up_write().
>
> Branches/control dependencies targeting a read do not necessarily preserve
> program order; this is for example the case for PowerPC and ARM.
>
> I'd not expect more than a compiler barrier from a function call (in fact,
> not even that if the function happens to be inlined).

Indeed. Reads can be speculated by deep out-of-order CPUs no problem.
That's what branch predictors are for.

> > Even if that is the case, I am not sure if smp_rmb() alone is enough to
> > guarantee the ordering as I think it will depend on how the
> > atomic_long_sub_return_release() is implemented.
>
> AFAICT, the pattern under discussion is MP with:
>
>   - a store-release to osq->tail(unlock) followed by a store to sem->count,
>     separated by a MB (from atomic_long_add_return()) on CPU0;
>
>   - a load of sem->count (for atomic_long_sub_return_release()) followed by

Which is a regular load, as 'release' only need apply to the store.

>     a load of osq->tail (rwsem_has_spinner()) on CPU1.
>
> Thus a RMW between the two loads suffices to forbid the weak behaviour.

Agreed.
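
For reference, the two sides of the MP pattern described above can be
sketched in userspace C11 atomics. This is only an illustration, not the
kernel code: struct model_sem, the MODEL_* constants and the model_*()
helpers are all made-up names, the count values are simplified, and an
acquire fence is used as an approximation of smp_rmb() (it is somewhat
stronger than a pure read barrier).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rwsem fields; not the kernel's layout. */
struct model_sem {
	atomic_long count;	/* models sem->count */
	atomic_int osq_tail;	/* models osq->tail, nonzero = spinner queued */
};

#define MODEL_ACTIVE_WRITE	1L	/* a writer holds the lock */
#define MODEL_WAITING_BIAS	0x100L	/* a waiter has been accounted for */

/* Start state: lock held by a writer, one optimistic spinner queued. */
static void model_init(struct model_sem *sem)
{
	atomic_init(&sem->count, MODEL_ACTIVE_WRITE);
	atomic_init(&sem->osq_tail, 1);
}

/* CPU0, spinning writer: leave the OSQ, then add the waiting bias. */
static long model_spinner_gives_up(struct model_sem *sem)
{
	/* [S] osq_unlock(): store-release to the queue tail */
	atomic_store_explicit(&sem->osq_tail, 0, memory_order_release);

	/* seq_cst RMW stands in for the fully ordered atomic_long_add_return() */
	return atomic_fetch_add(&sem->count, MODEL_WAITING_BIAS)
		+ MODEL_WAITING_BIAS;
}

/*
 * CPU1, up_write caller: drop the write bias, then look for a spinner.
 * Returns true when a waiter was seen and no spinner is queued, i.e. the
 * wakeup path would go ahead instead of bailing out early.
 */
static bool model_up_write_must_wake(struct model_sem *sem)
{
	/* RMW with release ordering: its load half is a plain load */
	long old = atomic_fetch_sub_explicit(&sem->count, MODEL_ACTIVE_WRITE,
					     memory_order_release);

	if (old - MODEL_ACTIVE_WRITE < MODEL_WAITING_BIAS)
		return false;	/* no waiter accounted for yet */

	/* Approximates smp_rmb(): keeps the osq load after the count load */
	atomic_thread_fence(memory_order_acquire);

	/* [L] osq: models rwsem_has_spinner() */
	return atomic_load_explicit(&sem->osq_tail, memory_order_relaxed) == 0;
}
```

With the fence in place, an up_write side that observes the waiting bias
in the count cannot also observe the stale "spinner queued" value. Without
it, C11 (like PowerPC and ARM) allows the osq load to be satisfied early.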