Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755399AbdIGOIg (ORCPT ); Thu, 7 Sep 2017 10:08:36 -0400 Received: from smtp.codeaurora.org ([198.145.29.96]:41458 "EHLO smtp.codeaurora.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1754273AbdIGOIe (ORCPT ); Thu, 7 Sep 2017 10:08:34 -0400 DMARC-Filter: OpenDMARC Filter v1.3.2 smtp.codeaurora.org D5C846071C Authentication-Results: pdx-caf-mail.web.codeaurora.org; dmarc=none (p=none dis=none) header.from=codeaurora.org Authentication-Results: pdx-caf-mail.web.codeaurora.org; spf=none smtp.mailfrom=prsood@codeaurora.org Subject: Re: [PATCH] rwsem: fix missed wakeup due to reordering of load To: Peter Zijlstra References: <1503487735-4362-1-git-send-email-prsood@codeaurora.org> <20170824112927.tqejopswi7mcy4sq@hirez.programming.kicks-ass.net> <20170824123304.5pqw53qbsytpfbrp@hirez.programming.kicks-ass.net> <20170824125233.nmgfau45sh4jgsqf@hirez.programming.kicks-ass.net> Cc: mingo@redhat.com, sramana@codeaurora.org, linux-kernel@vger.kernel.org, Waiman Long , Davidlohr Bueso , Andrea Parri , Will Deacon From: Prateek Sood Message-ID: <85a99bb9-d7e6-3844-8a41-89c5225710a7@codeaurora.org> Date: Thu, 7 Sep 2017 19:38:18 +0530 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0 MIME-Version: 1.0 In-Reply-To: <20170824125233.nmgfau45sh4jgsqf@hirez.programming.kicks-ass.net> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3635 Lines: 99 On 08/24/2017 06:22 PM, Peter Zijlstra wrote: > On Thu, Aug 24, 2017 at 02:33:04PM +0200, Peter Zijlstra wrote: >> On Thu, Aug 24, 2017 at 01:29:27PM +0200, Peter Zijlstra wrote: >>> >>> WTH did you not Cc the people that commented on your patch last time? >>> >>> On Wed, Aug 23, 2017 at 04:58:55PM +0530, Prateek Sood wrote: >>>> If a spinner is present, there is a chance that the load of >>>> rwsem_has_spinner() in rwsem_wake() can be reordered with >>>> respect to decrement of rwsem count in __up_write() leading >>>> to wakeup being missed. >>> >>>> spinning writer up_write caller >>>> --------------- ----------------------- >>>> [S] osq_unlock() [L] osq >>>> spin_lock(wait_lock) >>>> sem->count=0xFFFFFFFF00000001 >>>> +0xFFFFFFFF00000000 >>>> count=sem->count >>>> MB >>>> sem->count=0xFFFFFFFE00000001 >>>> -0xFFFFFFFF00000001 >>>> RMB >>> >>> This doesn't make sense, it appears to order a STORE against something >>> else. >>> >>>> spin_trylock(wait_lock) >>>> return >>>> rwsem_try_write_lock(count) >>>> spin_unlock(wait_lock) >>>> schedule() >> >> Is this what you wanted to write? > > And ideally there should be a comment near the atomic_long_add_return() > in __rwsem_down_write_failed_common() to indicate we rely on the implied > smp_mb() before it -- just in case someone goes and makes it > atomic_long_add_return_relaxed(). > > And I suppose someone should look at the waiting branch of that thing > too.. because I'm not sure what happens if waiting is true but count > isn't big enough. > > I bloody hate the rwsem code, that BIAS stuff forever confuses me. I > have a start at rewriting the thing to put the owner in the lock word > just like we now do for mutex, but never seem to get around to finishing > it. > >> --- >> kernel/locking/rwsem-xadd.c | 27 +++++++++++++++++++++++++++ >> 1 file changed, 27 insertions(+) >> >> diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c >> index 02f660666ab8..813b5d3654ce 100644 >> --- a/kernel/locking/rwsem-xadd.c >> +++ b/kernel/locking/rwsem-xadd.c >> @@ -613,6 +613,33 @@ struct rw_semaphore *rwsem_wake(struct rw_semaphore *sem) >> DEFINE_WAKE_Q(wake_q); >> >> /* >> + * __rwsem_down_write_failed_common(sem) >> + * rwsem_optimistic_spin(sem) >> + * osq_unlock(sem->osq) >> + * ... >> + * atomic_long_add_return(&sem->count) >> + * >> + * - VS - >> + * >> + * __up_write() >> + * if (atomic_long_sub_return_release(&sem->count) < 0) >> + * rwsem_wake(sem) >> + * osq_is_locked(&sem->osq) >> + * >> + * And __up_write() must observe !osq_is_locked() when it observes the >> + * atomic_long_add_return() in order to not miss a wakeup. >> + * >> + * This boils down to: >> + * >> + * [S.rel] X = 1 [RmW] r0 = (Y += 0) >> + * MB RMB >> + * [RmW] Y += 1 [L] r1 = X >> + * >> + * exists (r0=1 /\ r1=0) >> + */ >> + smp_rmb(); >> + >> + /* >> * If a spinner is present, it is not necessary to do the wakeup. >> * Try to do wakeup only if the trylock succeeds to minimize >> * spinlock contention which may introduce too much delay in the Thanks Peter for your suggestion on comments. I will resend the patch with updated comments -- Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc., is a member of Code Aurora Forum, a Linux Foundation Collaborative Project