Date: Fri, 11 Dec 2015 06:17:47 -0800
From: Davidlohr Bueso
To: Will Deacon
Cc: Peter Zijlstra, Andrew Pinski, Davidlohr Bueso, Thomas Gleixner,
    "Paul E. McKenney", Ingo Molnar, Linux Kernel Mailing List,
    "linux-arm-kernel@lists.infradead.org"
Subject: Re: FW: Commit 81a43adae3b9 (locking/mutex: Use acquire/release
    semantics) causing failures on arm64 (ThunderX)
Message-ID: <20151211141747.GC5650@linux-uzut.site>
References: <5669D5F2.5050004@caviumnetworks.com>
    <20151211084133.GE6356@twins.programming.kicks-ass.net>
    <20151211120419.GD18828@arm.com>
In-Reply-To: <20151211120419.GD18828@arm.com>

On Fri, 11 Dec 2015, Will Deacon wrote:

>I think Andrew meant the atomic_xchg_acquire at the start of osq_lock,
>as opposed to "compare and swap". In which case, it does look like
>there's a bug here because there is nothing to order the initialisation
>of the node fields with publishing of the node, whether that's
>indirectly as a result of setting the tail to the current CPU or
>directly as a result of the WRITE_ONCE.

Sorry I'm late to the party. Duh yes this is obviously bogus, and worse
I recall triggering a similar tail initialization issue in osq_lock on
some experimental work on x86, so this is very much a point of failure.
Ack.

>
>Andrew, David: does making that atomic_xchg_acquire and atomic_xchg
>fix things for you?
>
>I don't fully grok what 81a43adae3b9 has to do with any of this, so
>maybe there's another bug too.

I think this is mainly because mutex_optimistic_spin is where the stack
shows the lockup, which really translates to c55a6ffa62.

Thanks,
Davidlohr
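
For readers following the ordering argument quoted above, here is a
rough user-space sketch in C11 atomics. It is not the kernel code: the
names spin_node, queue_tail and publish_node are hypothetical, the real
osq_lock stores an encoded CPU number in an atomic_t rather than a
pointer, and memory_order_seq_cst is only an assumed stand-in for the
kernel's fully ordered atomic_xchg. The point it tries to illustrate is
that an acquire-only exchange does not, by itself, order the preceding
node initialisation before the node becomes visible as the new tail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical queue node, loosely modelled on an optimistic-spin node. */
struct spin_node {
    _Atomic(struct spin_node *) next;
    atomic_int locked;
};

/* Hypothetical global tail; NULL means the queue is empty. */
static _Atomic(struct spin_node *) queue_tail;

/*
 * Initialise the node, then publish it as the new tail.
 * Returns the previous tail (NULL if the queue was empty).
 */
static struct spin_node *publish_node(struct spin_node *node)
{
    assert(node != NULL);

    /* Plain initialisation of the node fields. */
    atomic_store_explicit(&node->locked, 0, memory_order_relaxed);
    atomic_store_explicit(&node->next, NULL, memory_order_relaxed);

    /*
     * With memory_order_acquire here, the two stores above could become
     * visible to other threads after the exchange. A stronger ordering
     * (seq_cst in this sketch) keeps them ordered before the publish.
     */
    return atomic_exchange_explicit(&queue_tail, node,
                                    memory_order_seq_cst);
}
```

A second caller of publish_node() gets the first node back as the
previous tail; with the stronger exchange, a thread that reads the new
tail with acquire ordering and then reaches that node is meant to see
its fields already initialised.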