Date: Thu, 2 Nov 2017 16:21:56 -0400 (EDT)
From: Alan Stern
To: Will Deacon
cc: Peter Zijlstra, "Reshetova, Elena", "linux-kernel@vger.kernel.org", "gregkh@linuxfoundation.org", "keescook@chromium.org", "tglx@linutronix.de", "mingo@redhat.com", "ishkamiel@gmail.com", Paul McKenney
Subject: Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
In-Reply-To: <20171102171644.GD595@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 2 Nov 2017, Will Deacon wrote:

> > Right.  To address your point: release + acquire isn't the same as a
> > full barrier either.
> > The SB pattern illustrates the difference:
> >
> >         P0              P1
> >         Write x=1       Write y=1
> >         Release a       smp_mb
> >         Acquire b       Read x=0
> >         Read y=0
> >
> > This would not be allowed if the release + acquire sequence was
> > replaced by smp_mb.  But as it stands, this is allowed because nothing
> > prevents the CPU from interchanging the order of the release and the
> > acquire -- and then you're back to the acquire + release case.
> >
> > However, there is one circumstance where this interchange isn't
> > allowed: when the release and acquire access the same memory
> > location.  Thus:
> >
> > P0(int *x, int *y, int *a)
> > {
> >         int r0;
> >
> >         WRITE_ONCE(*x, 1);
> >         smp_store_release(a, 1);
> >         smp_load_acquire(a);
> >         r0 = READ_ONCE(*y);
> > }
> >
> > P1(int *x, int *y)
> > {
> >         int r1;
> >
> >         WRITE_ONCE(*y, 1);
> >         smp_mb();
> >         r1 = READ_ONCE(*x);
> > }
> >
> > exists (0:r0=0 /\ 1:r1=0)
> >
> > This is forbidden.  It would remain forbidden even if the smp_mb in P1
> > were replaced by a similar release/acquire pair for the same memory
> > location.

I have to apologize; this was totally wrong.  This test is not
forbidden under the LKMM, and it certainly isn't forbidden if the
smp_mb is replaced by a release/acquire pair.

I was trying to think of something completely different.  If you have a
release/acquire to the same address, it creates a happens-before
ordering:

        Access x
        Release a
        Acquire a
        Access y

Here the access to x happens-before the access to y.  This is true
even on x86, even in the presence of forwarding -- the CPU still has to
execute the instructions in order.  But if the release and acquire are
to different addresses:

        Access x
        Release a
        Acquire b
        Access y

then there is no happens-before ordering for x and y -- the CPU can
execute the last two instructions before the first two.  x86 and
PowerPC won't do this, but I believe ARMv8 can.  (Please correct me if
it can't.)

But happens-before is much weaker than a strong fence.
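
For reference, the different-address shape can be written out in the same
litmus notation as the quoted test.  This is only a sketch: the test name
and the extra variables b and r2 are made up for illustration, and it has
not been run through herd here.  By the reasoning above, nothing orders
P0's write to x before its read of y, so the outcome in the exists clause
should be reachable.  Replacing the release/acquire pair in P0 with
smp_mb() would instead give the SB shape with full barriers on both
sides, which the quoted text says is not allowed.

```litmus
C SB+rel-acq-diffaddr+mb

(*
 * Hypothetical test name; illustration only, not run through herd.
 * P0 releases a but acquires b, so the release and acquire touch
 * different addresses and create no happens-before ordering between
 * the write to x and the read of y.
 *)

{}

P0(int *x, int *y, int *a, int *b)
{
	int r0;
	int r2;

	WRITE_ONCE(*x, 1);
	smp_store_release(a, 1);
	r2 = smp_load_acquire(b);
	r0 = READ_ONCE(*y);
}

P1(int *x, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)
```
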
So in short, release + acquire, even to the same address, is no
replacement for smp_mb().

> Isn't this allowed on x86 mapping smp_mb() to mfence, store-release to plain
> store and load-acquire to plain load? All we're saying is that you can forward
> from a release to an acquire, which is fine for RCpc semantics.
>
> e.g.
>
> X86 SB+mfence+po-rfi-po
> "MFencedWR Fre PodWW Rfi PodRR Fre"
> Generator=diyone7 (version 7.46+3)
> Prefetch=0:x=F,0:y=T,1:y=F,1:x=T
> Com=Fr Fr
> Orig=MFencedWR Fre PodWW Rfi PodRR Fre
> {
> }
>  P0          | P1          ;
>  MOV [x],$1  | MOV [y],$1  ;
>  MFENCE      | MOV [z],$1  ;
>  MOV EAX,[y] | MOV EAX,[z] ;
>              | MOV EBX,[x] ;
> exists
> (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0)
>
> which herd says is allowed:
>
> Test SB+mfence+po-rfi-po Allowed
> States 4
> 0:EAX=0; 1:EAX=1; 1:EBX=0;
> 0:EAX=0; 1:EAX=1; 1:EBX=1;
> 0:EAX=1; 1:EAX=1; 1:EBX=0;
> 0:EAX=1; 1:EAX=1; 1:EBX=1;
> Ok
> Witnesses
> Positive: 1 Negative: 3
> Condition exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0)
> Observation SB+mfence+po-rfi-po Sometimes 1 3
> Time SB+mfence+po-rfi-po 0.00
> Hash=0f983e2d7579e5c04c332f9ac620c31f
>
> and I can reproduce using litmus to actually run it on my x86 box:
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> % Results for SB+mfence+po-rfi-po.litmus %
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> X86 SB+mfence+po-rfi-po
> "MFencedWR Fre PodWW Rfi PodRR Fre"
>
> {}
>
>  P0          | P1          ;
>  MOV [x],$1  | MOV [y],$1  ;
>  MFENCE      | MOV [z],$1  ;
>  MOV EAX,[y] | MOV EAX,[z] ;
>              | MOV EBX,[x] ;
>
> exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0)
> Generated assembler
>         #START _litmus_P1
>         movl $1,(%r8,%rcx)
>         movl $1,(%r9,%rcx)
>         movl (%r9,%rcx),%eax
>         movl (%rdi,%rcx),%edx
>         #START _litmus_P0
>         movl $1,(%rdx,%rcx)
>         mfence
>         movl (%rdi,%rcx),%eax
>
> Test SB+mfence+po-rfi-po Allowed
> Histogram (4 states)
> 8      *>0:EAX=0; 1:EAX=1; 1:EBX=0;
> 1999851:>0:EAX=1; 1:EAX=1; 1:EBX=0;
> 1999549:>0:EAX=0; 1:EAX=1; 1:EBX=1;
> 592    :>0:EAX=1; 1:EAX=1; 1:EBX=1;
> Ok
>
> Witnesses
> Positive: 8, Negative: 3999992
> Condition exists (0:EAX=0 /\ 1:EAX=1 /\ 1:EBX=0) is validated
> Hash=0f983e2d7579e5c04c332f9ac620c31f
> Generator=diyone7 (version 7.46+3)
> Com=Fr Fr
> Orig=MFencedWR Fre PodWW Rfi PodRR Fre
> Observation SB+mfence+po-rfi-po Sometimes 8 3999992
> Time SB+mfence+po-rfi-po 0.17

Yes, you are quite correct.  Thanks for pointing out my mistake.

Alan Stern