Date: Fri, 9 Jun 2017 13:13:05 +0200
From: Peter Zijlstra
To: Will Deacon, Paul McKenney, Boqun Feng
Cc: linux-kernel@vger.kernel.org, Ingo Molnar, Thomas Gleixner,
    vgupta@synopsys.com, rkuo@codeaurora.org, james.hogan@imgtec.com,
    jejb@parisc-linux.org, davem@davemloft.net, cmetcalf@mellanox.com
Subject: Re: [RFC][PATCH] atomic: Fix atomic_set_release() for 'funny' architectures
Message-ID: <20170609111305.bn4ca4uscbp6pgxn@hirez.programming.kicks-ass.net>
References: <20170609092450.jwmldgtli57ozxgq@hirez.programming.kicks-ass.net>
    <20170609110506.yod47flaav3wgoj5@hirez.programming.kicks-ass.net>
In-Reply-To: <20170609110506.yod47flaav3wgoj5@hirez.programming.kicks-ass.net>

On Fri, Jun 09, 2017 at 01:05:06PM +0200, Peter Zijlstra wrote:
> The spinlock based atomics should be SC, that is, none of them appear to
> place extra barriers in atomic_cmpxchg() or any of the other SC atomic
> primitives and therefore seem to rely on their spinlock implementation
> being SC (I did not fully validate all that).

So I did see that ARC and PARISC have 'superfluous' smp_mb() calls
around their spinlock implementation. That is, for spinlock semantics
you only need one _after_ lock and one _before_ unlock. But the atomic
stuff relies on being SC and thus would need one before and after both
lock and unlock.

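To make the two barrier placements concrete, here is a minimal user-space
sketch. It is not the ARC or PARISC code: C11 atomic_thread_fence() stands
in for smp_mb(), and every sketch_* name is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Assumption: user-space stand-in for the kernel's smp_mb(). */
static inline void full_barrier(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Hypothetical test-and-set lock, for illustration only. */
typedef struct {
	atomic_bool locked;
} sketch_lock_t;

/* Plain lock semantics: a single barrier _after_ taking the lock ... */
static inline void sketch_lock(sketch_lock_t *l)
{
	while (atomic_exchange_explicit(&l->locked, true,
					memory_order_relaxed))
		;	/* spin until the previous value was 'unlocked' */
	full_barrier();
}

/* ... and a single barrier _before_ dropping it. */
static inline void sketch_unlock(sketch_lock_t *l)
{
	full_barrier();
	atomic_store_explicit(&l->locked, false, memory_order_relaxed);
}

/*
 * Variant with the 'superfluous' barriers: one before and one after
 * both lock and unlock, which is what SC atomics built on top of the
 * lock would rely on.
 */
static inline void sketch_lock_sc(sketch_lock_t *l)
{
	full_barrier();
	while (atomic_exchange_explicit(&l->locked, true,
					memory_order_relaxed))
		;
	full_barrier();
}

static inline void sketch_unlock_sc(sketch_lock_t *l)
{
	full_barrier();
	atomic_store_explicit(&l->locked, false, memory_order_relaxed);
	full_barrier();
}
```

The first pair is enough to keep the critical section from leaking out of
the lock; only the second pair also orders accesses outside the critical
section against it.
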
Now, afaict PARISC doesn't even have memory barriers (it uses
asm-generic/barrier.h) so that's a bit of a puzzle. But ARC could
probably optimize (if they still care about that hardware) by pulling
those barriers out of the spinlock and putting them in the atomic
implementation.
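
Roughly what that could look like, as a user-space sketch rather than
actual ARC code: the lock keeps only acquire/release ordering and the
spinlock-based cmpxchg adds the extra full barriers itself. The names
sketch_atomic_t, sketch_atomic_cmpxchg() and raw_lock()/raw_unlock() are
made up for the example, and atomic_thread_fence() again stands in for
smp_mb().

```c
#include <stdatomic.h>

/* Assumption: user-space stand-in for the kernel's smp_mb(). */
static inline void full_barrier(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/* Hypothetical single lock protecting all emulated atomics. */
static atomic_flag sketch_atomic_lock = ATOMIC_FLAG_INIT;

/* Hypothetical counterpart of atomic_t on a lock-based architecture. */
typedef struct {
	int counter;
} sketch_atomic_t;

/* Lock with plain lock semantics only: acquire on lock ... */
static inline void raw_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&sketch_atomic_lock,
						 memory_order_acquire))
		;	/* spin */
}

/* ... and release on unlock. */
static inline void raw_unlock(void)
{
	atomic_flag_clear_explicit(&sketch_atomic_lock,
				   memory_order_release);
}

/*
 * cmpxchg built on the lock. The barriers that make it fully ordered
 * live here instead of in the lock, so ordinary lock users do not pay
 * for them.
 */
static inline int sketch_atomic_cmpxchg(sketch_atomic_t *v, int old,
					int newval)
{
	int prev;

	full_barrier();		/* extra barrier before lock */
	raw_lock();
	prev = v->counter;
	if (prev == old)
		v->counter = newval;
	raw_unlock();
	full_barrier();		/* extra barrier after unlock */

	return prev;
}
```

With that split, plain spin_lock()/spin_unlock() users would see only the
one barrier per operation, and only the emulated atomics carry the rest.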