Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753597AbbLKM0w (ORCPT ); Fri, 11 Dec 2015 07:26:52 -0500
Received: from casper.infradead.org ([85.118.1.10]:45338 "EHLO
	casper.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751238AbbLKM0u (ORCPT );
	Fri, 11 Dec 2015 07:26:50 -0500
Date: Fri, 11 Dec 2015 13:26:47 +0100
From: Peter Zijlstra
To: Will Deacon
Cc: Andrew Pinski, Davidlohr Bueso, Thomas Gleixner,
	"Paul E. McKenney", Ingo Molnar, Linux Kernel Mailing List,
	"linux-arm-kernel@lists.infradead.org", david.daney@cavium.com
Subject: Re: FW: Commit 81a43adae3b9 (locking/mutex: Use acquire/release
	semantics) causing failures on arm64 (ThunderX)
Message-ID: <20151211122647.GM6356@twins.programming.kicks-ass.net>
References: <5669D5F2.5050004@caviumnetworks.com>
	<20151211084133.GE6356@twins.programming.kicks-ass.net>
	<20151211120419.GD18828@arm.com>
	<20151211121319.GK6356@twins.programming.kicks-ass.net>
	<20151211121759.GE18828@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20151211121759.GE18828@arm.com>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1780
Lines: 42

On Fri, Dec 11, 2015 at 12:18:00PM +0000, Will Deacon wrote:
> On Fri, Dec 11, 2015 at 01:13:19PM +0100, Peter Zijlstra wrote:
> > On Fri, Dec 11, 2015 at 12:04:19PM +0000, Will Deacon wrote:
> > > I think Andrew meant the atomic_xchg_acquire at the start of osq_lock,
> > > as opposed to "compare and swap". In which case, it does look like
> > > there's a bug here because there is nothing to order the initialisation
> > > of the node fields with publishing of the node, whether that's
> > > indirectly as a result of setting the tail to the current CPU or
> > > directly as a result of the WRITE_ONCE.
> >
> > Agreed, this does indeed look like a bug.
> > If confirmed please write a shiny changelog and I'll queue asap.
>
> Yup. I've failed to reproduce the issue locally, so we'll need to wait
> for Andrew and/or David to get back to us first.

While we're there, the acquire in osq_wait_next() seems somewhat
ill-documented too.

I _think_ we need ACQUIRE semantics there because we want to strictly
order the lock-unqueue A,B,C steps and we get that with:

 A: SC
 B: ACQ
 C: Relaxed

Similarly, for unlock we want the WRITE_ONCE to happen after
osq_wait_next(), but in that case we can even rely on the control
dependency there.

As noted in a previous email, the ACQUIRE for osq_wait_next() does not
come from its use in lock, since it's on the fail path, and trylock
failure doesn't imply any barriers.

Nor should it have RELEASE semantics for its use in unlock, since we
already have that covered by the xchg() done prior.
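
For reference, the node-publication ordering problem quoted at the top of
this mail can be sketched in userspace with C11 atomics. This is only an
illustration under stated assumptions, not the kernel code: struct
spin_node, struct spin_queue, queue_publish(), queue_link() and
queue_peek_tail() are hypothetical names made up for the example. The
point it tries to show is that an acquire-only exchange would not order
the preceding node initialisation before the publish, whereas an exchange
with release semantics (acq_rel here) does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a per-CPU queue node. */
struct spin_node {
	_Atomic(struct spin_node *) next;
	atomic_int locked;
	int cpu;
};

/* Hypothetical queue; a NULL tail means the queue is empty. */
struct spin_queue {
	_Atomic(struct spin_node *) tail;
};

/*
 * Initialise the node, then publish it by swapping it into the tail.
 *
 * The exchange is acq_rel: the release half orders the initialising
 * stores above before the node becomes visible through the tail, and
 * the acquire half orders later accesses after the exchange. With
 * memory_order_acquire alone, another thread could observe the new
 * tail before it observes the initialised fields.
 *
 * Returns the previous tail (NULL if the queue was empty).
 */
static struct spin_node *queue_publish(struct spin_queue *q,
				       struct spin_node *node, int cpu)
{
	assert(q != NULL && node != NULL);

	atomic_store_explicit(&node->next, (struct spin_node *)NULL,
			      memory_order_relaxed);
	atomic_store_explicit(&node->locked, 0, memory_order_relaxed);
	node->cpu = cpu;

	return atomic_exchange_explicit(&q->tail, node, memory_order_acq_rel);
}

/*
 * Link the node behind its predecessor. This is the second way the
 * node gets published, so it is a release store as well.
 */
static void queue_link(struct spin_node *prev, struct spin_node *node)
{
	atomic_store_explicit(&prev->next, node, memory_order_release);
}

/* Observer side: the acquire load pairs with the release half above. */
static struct spin_node *queue_peek_tail(struct spin_queue *q)
{
	return atomic_load_explicit(&q->tail, memory_order_acquire);
}
```

A thread that gets a node pointer back from queue_peek_tail() is then
guaranteed to see that node's cpu, locked and next fields as initialised
by queue_publish(). Whether the acquire half is also needed at the
publish point depends on what follows it; the sketch keeps it only to
stay close to a fully ordered exchange.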