Date: Mon, 03 Aug 2015 23:26:02 -0400
From: Waiman Long
To: Peter Zijlstra
Cc: Ingo Molnar, Thomas Gleixner, "H. Peter Anvin", x86@kernel.org,
    linux-kernel@vger.kernel.org, Scott J Norton, Douglas Hatch,
    Davidlohr Bueso
Subject: Re: [PATCH v4 1/7] locking/pvqspinlock: Unconditional PV kick with _Q_SLOW_VAL
Message-ID: <55C030CA.5060206@hp.com>
In-Reply-To: <20150801222903.GC25159@twins.programming.kicks-ass.net>
References: <1438395724-25910-1-git-send-email-Waiman.Long@hp.com>
 <1438395724-25910-2-git-send-email-Waiman.Long@hp.com>
 <20150801222903.GC25159@twins.programming.kicks-ass.net>

On 08/01/2015 06:29 PM, Peter Zijlstra wrote:
> On Fri, Jul 31, 2015 at 10:21:58PM -0400, Waiman Long wrote:
>> The smp_store_release() is not a full barrier. In order to avoid missed
>> wakeups, we may need to add memory barriers around the locked and cpu
>> state variables, adding complexity. As the chance of a spurious wakeup
>> is very low, it is easier and safer to just do an unconditional kick at
>> unlock time.
>>
>> Signed-off-by: Waiman Long
>> ---
>>  kernel/locking/qspinlock_paravirt.h | 11 ++++++++---
>>  1 files changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
>> index 15d3733..2dd4b39 100644
>> --- a/kernel/locking/qspinlock_paravirt.h
>> +++ b/kernel/locking/qspinlock_paravirt.h
>> @@ -240,7 +240,6 @@ static void pv_wait_head(struct qspinlock *lock, struct mcs_spinlock *node)
>>  			cpu_relax();
>>  		}
>>
>> -		WRITE_ONCE(pn->state, vcpu_halted);
>>  		if (!lp) { /* ONCE */
>>  			lp = pv_hash(lock, pn);
>>  			/*
>> @@ -320,9 +319,15 @@ __visible void __pv_queued_spin_unlock(struct qspinlock *lock)
>>  	/*
>>  	 * At this point the memory pointed at by lock can be freed/reused,
>>  	 * however we can still use the pv_node to kick the CPU.
>> +	 *
>> +	 * As smp_store_release() is not a full barrier, adding a check to
>> +	 * the node->state doesn't guarantee the checking is really done
>> +	 * after clearing the lock byte
>
> This is true, but _WHY_ is that a problem ?
>
>>                                     since they are in 2 separate
>> +	 * cachelines and so hardware can reorder them.
>
> That's just gibberish, even in the same cacheline stuff can get
> reordered.
>
>>                                                      So either we insert
>> +	 * memory barrier here and in the corresponding pv_wait_head()
>> +	 * function or we do an unconditional kick which is what is done here.
>
> why, why why ? You've added words, but you've not actually described
> what the problem is you're trying to fix.
>
> AFAICT the only thing we really care about here is that the load in
> question happens _after_ we observe SLOW, and that is still true.
>
> The order against the unlock is irrelevant.
>
> So we set ->state before we hash and before we set SLOW. Given that
> we've seen SLOW, we must therefore also see ->state.
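
To restate that ordering in a self-contained form, here is a simplified
userspace C11 model of the two sides (an illustration only, not the kernel
code; state, hashed and locked stand in for pn->state, the hash entry and
l->locked, and the RUNNING/HALTED/LOCKED/SLOW constants for their kernel
counterparts). Once the unlock side has observed SLOW through the acquire
cmpxchg, the waiter's earlier stores to the hash entry and to ->state must
be visible as well:

#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

enum { RUNNING, HALTED };                /* vcpu_running / vcpu_halted       */
enum { UNLOCKED, LOCKED, SLOW = 3 };     /* 0 / _Q_LOCKED_VAL / _Q_SLOW_VAL  */

static _Atomic int state  = RUNNING;     /* stands in for pn->state          */
static _Atomic int hashed = 0;           /* stands in for the hash entry     */
static _Atomic int locked = LOCKED;      /* stands in for l->locked          */

static void *waiter(void *arg)           /* models the pv_wait_head() side   */
{
	(void)arg;
	atomic_store_explicit(&state, HALTED, memory_order_relaxed);
	atomic_store_explicit(&hashed, 1, memory_order_relaxed);
	/* the atomic op that sets SLOW orders the two stores above before it */
	atomic_exchange_explicit(&locked, SLOW, memory_order_release);
	return NULL;
}

static void *unlocker(void *arg)         /* models __pv_queued_spin_unlock() */
{
	int old = LOCKED;

	(void)arg;
	if (!atomic_compare_exchange_strong_explicit(&locked, &old, UNLOCKED,
			memory_order_acquire, memory_order_acquire) &&
	    old == SLOW) {
		/* saw SLOW, so the hash entry and ->state are visible too */
		assert(atomic_load_explicit(&hashed, memory_order_relaxed) == 1);
		assert(atomic_load_explicit(&state,  memory_order_relaxed) == HALTED);
		/* the real code unhashes the node, clears the lock byte and
		 * (with this patch) kicks node->cpu unconditionally here */
	}
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, waiter, NULL);
	pthread_create(&b, NULL, unlocker, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}

In this model the asserts can never fire. The only question the patch
comment was trying to raise is whether the later READ_ONCE(node->state)
can be reordered with the smp_store_release() that clears the lock byte,
which the unconditional kick sidesteps.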
>
> If ->state == halted, this means the CPU in question is blocked and the
> pv_node will not get re-used -- if it does get re-used, it wasn't
> blocked and we don't care either.
>
> Therefore, ->cpu is stable and we'll kick it into action.
>
> How do you end up not waking a waiting cpu? Explain that.
>

Yes, it is safe in the current code. In some versions of my pvqspinlock
patch, I was resetting the state back to running in pv_wait_head(), and
that caused a race (spelled out in the P.S. below). The current code,
however, does not reset the state back to running, so the check is
redundant. I will clarify that in the next patch.

>>  	 */
>> -	if (READ_ONCE(node->state) == vcpu_halted)
>> -		pv_kick(node->cpu);
>> +	pv_kick(node->cpu);
>>  }
>
> Also, this patch clearly isn't against my tree.
>

Yes, I was basing it on the latest tip tree. As some of the files in the
patch were modified in the latest tip tree, I will rebase my patch and
update it. Please let me know if I should be using your tree instead.

Cheers,
Longman
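
P.S. To spell out the race mentioned above: the interleaving below is
against a hypothetical older variant of pv_wait_head() that reset ->state
back to vcpu_running, not against the code posted here. With the
conditional kick, that reset can make the unlocker skip the kick while the
waiter is already committed to halting:

  1. waiter:   WRITE_ONCE(pn->state, vcpu_halted);
  2. waiter:   pv_hash(lock, pn); set l->locked to _Q_SLOW_VAL;
  3. unlocker: cmpxchg() fails, observes _Q_SLOW_VAL; node = pv_unhash(lock);
  4. waiter:   WRITE_ONCE(pn->state, vcpu_running);   /* the problematic reset */
  5. waiter:   pv_wait(&l->locked, _Q_SLOW_VAL);      /* vCPU halts            */
  6. unlocker: smp_store_release(&l->locked, 0);
  7. unlocker: READ_ONCE(node->state) == vcpu_running, so the conditional
               pv_kick(node->cpu) is skipped and the halted vCPU is never woken.

Either the reset has to go away (as in the current code) or the kick has to
be unconditional (as this patch does); with an unconditional pv_kick() the
worst case is a spurious kick, which is harmless.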