Date: Thu, 2 Nov 2017 13:08:52 -0400 (EDT)
From: Alan Stern
To: Peter Zijlstra
cc: "Reshetova, Elena", "linux-kernel@vger.kernel.org",
    "gregkh@linuxfoundation.org", "keescook@chromium.org",
    "tglx@linutronix.de", "mingo@redhat.com", "ishkamiel@gmail.com",
    Will Deacon, Paul McKenney
Subject: Re: [PATCH] refcount: provide same memory ordering guarantees as in atomic_t
In-Reply-To: <20171102160237.t2xkryg6joskf77y@hirez.programming.kicks-ass.net>

On Thu, 2 Nov 2017, Peter Zijlstra wrote:

> On Thu, Nov 02, 2017 at 11:40:35AM -0400, Alan Stern wrote:
> > On Thu, 2 Nov 2017, Peter Zijlstra wrote:
> >
> > > > Lock functions such as refcount_dec_and_lock()
> > > > & refcount_dec_and_mutex_lock() Provide exactly the same guarantees as
> > > > they atomic counterparts.
> > >
> > > Nope. The atomic_dec_and_lock() provides smp_mb() while
> > > refcount_dec_and_lock() merely orders all prior load/store's against all
> > > later load/store's.
> >
> > In fact there is no guaranteed ordering when refcount_dec_and_lock()
> > returns false;
>
> It should provide a release:
>
>  - if !=1, dec_not_one will provide release
>  - if ==1, dec_not_one will no-op, but then we'll acquire the lock and
>    dec_and_test will provide the release, even if the test fails and we
>    unlock again it should still dec.
>
> The one exception is when the counter is saturated, but in that case
> we'll never free the object and the ordering is moot in any case.

Also if the counter is 0, but that will never happen if the refcounting
is correct.

> > it provides ordering only if the return value is true.
> > In which case it provides acquire ordering (thanks to the spin_lock),
> > and both release ordering and a control dependency (thanks to the
> > refcount_dec_and_test).
> >
> > > The difference is subtle and involves at least 3 CPUs. I can't seem to
> > > write up anything simple, keeps turning into monsters :/ Will, Paul,
> > > have you got anything simple around?
> >
> > The combination of acquire + release is not the same as smp_mb, because
>
> acquire+release is nothing, its release+acquire that I meant which
> should order things locally, but now that you've got me looking at it
> again, we don't in fact do that.
>
> So refcount_dec_and_lock() will provide a release, irrespective of the
> return value (assuming we're not saturated). If it returns true, it also
> does an acquire for the lock.
>
> But combined they're acquire+release, which is unfortunate.. it means
> the lock section and the refcount stuff overlaps, but I don't suppose
> that's actually a problem. Need to consider more.

Right.
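For anyone following along, the structure Peter describes (dec_not_one
first, then the lock plus dec_and_test) can be sketched in user space
roughly as below. This is only a model under stated assumptions: struct
model_ref and the model_* functions are hypothetical names, C11 atomics
and a pthread mutex stand in for refcount_t and the spinlock, and
saturation handling is left out entirely. It is not the kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for refcount_t; not the kernel type. */
struct model_ref {
	atomic_uint count;
};

/*
 * Decrement unless the count is 1. A successful decrement is a release.
 * Saturation is not modelled; a count of 0 is left alone with no ordering.
 */
static bool model_dec_not_one(struct model_ref *r)
{
	unsigned int old = atomic_load_explicit(&r->count, memory_order_relaxed);

	do {
		if (old == 1)
			return false;	/* no-op; caller falls back to the lock */
		if (old == 0)
			return true;	/* refcounting bug; nothing is ordered */
	} while (!atomic_compare_exchange_weak_explicit(&r->count, &old, old - 1,
							 memory_order_release,
							 memory_order_relaxed));
	return true;
}

/* Always decrements, with release ordering; true if the count hit zero. */
static bool model_dec_and_test(struct model_ref *r)
{
	unsigned int old = atomic_fetch_sub_explicit(&r->count, 1,
						     memory_order_release);

	assert(old != 0);	/* underflow would be a refcounting bug */
	return old == 1;
}

/* Returns true with *lock held when the last reference was dropped. */
bool model_dec_and_lock(struct model_ref *r, pthread_mutex_t *lock)
{
	if (model_dec_not_one(r))
		return false;		/* release only, lock never taken */

	pthread_mutex_lock(lock);	/* acquire */
	if (!model_dec_and_test(r)) {	/* release, after the acquire */
		pthread_mutex_unlock(lock);
		return false;
	}
	return true;
}
```

On the true path the acquire (lock) comes before the release
(dec_and_test), which is the acquire+release combination discussed
above; every other path that decrements gets a release and nothing more.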
To address your point: release + acquire isn't the same as a full
barrier either.  The SB pattern illustrates the difference:

	P0		P1
	Write x=1	Write y=1
	Release a	smp_mb
	Acquire b	Read x=0
	Read y=0

This would not be allowed if the release + acquire sequence were
replaced by smp_mb.  But as it stands, this is allowed because nothing
prevents the CPU from interchanging the order of the release and the
acquire -- and then you're back to the acquire + release case.

However, there is one circumstance where this interchange isn't
allowed: when the release and acquire access the same memory
location.  Thus:

P0(int *x, int *y, int *a)
{
	int r0;

	WRITE_ONCE(*x, 1);
	smp_store_release(a, 1);
	smp_load_acquire(a);
	r0 = READ_ONCE(*y);
}

P1(int *x, int *y)
{
	int r1;

	WRITE_ONCE(*y, 1);
	smp_mb();
	r1 = READ_ONCE(*x);
}

exists (0:r0=0 /\ 1:r1=0)

This is forbidden.  It would remain forbidden even if the smp_mb in P1
were replaced by a similar release/acquire pair for the same memory
location.

Seeing the difference between smp_mb and release/acquire requires
three threads:

	P0		P1		P2
	Write x=1	Read y=1	Read z=1
	Release a	data dep.	smp_rmb
	Acquire a	Write z=1	Read x=0
	Write y=1

The Linux Kernel Memory Model allows this execution, although as far as
I know, no existing hardware will do it.  But with smp_mb in P0, the
execution would be forbidden.

None of this should be a problem for refcount_dec_and_lock, assuming it
is used purely for reference counting.

Alan Stern
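The store-buffering case with a full barrier on both sides can be tried
in user space along the following lines. This is a sketch under stated
assumptions, not a statement about the LKMM: a C11 seq_cst fence is used
as a stand-in for smp_mb(), relaxed atomics stand in for WRITE_ONCE and
READ_ONCE, and sb_p0, sb_p1 and sb_both_zero are hypothetical names.
Under the C11 model, both reads returning 0 is forbidden when each
thread has the fence between its store and its load, so the count
returned should be 0.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int x, y;
static int r0, r1;	/* written by the threads, read after pthread_join */

static void *sb_p0(void *arg)
{
	(void)arg;
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* stand-in for smp_mb() */
	r0 = atomic_load_explicit(&y, memory_order_relaxed);
	return NULL;
}

static void *sb_p1(void *arg)
{
	(void)arg;
	atomic_store_explicit(&y, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);	/* stand-in for smp_mb() */
	r1 = atomic_load_explicit(&x, memory_order_relaxed);
	return NULL;
}

/*
 * Run the two threads `rounds` times and count how often both reads
 * returned 0. Returns -1 if a thread could not be created.
 */
int sb_both_zero(int rounds)
{
	int hits = 0;

	for (int i = 0; i < rounds; i++) {
		pthread_t t0, t1;

		atomic_store(&x, 0);
		atomic_store(&y, 0);
		if (pthread_create(&t0, NULL, sb_p0, NULL))
			return -1;
		if (pthread_create(&t1, NULL, sb_p1, NULL)) {
			pthread_join(t0, NULL);
			return -1;
		}
		pthread_join(t0, NULL);
		pthread_join(t1, NULL);
		if (r0 == 0 && r1 == 0)
			hits++;
	}
	return hits;
}
```

Replacing the fence in sb_p0 with a store-release and a load-acquire on
two different locations would make the both-zero outcome permitted,
though a short run like this may well never observe it.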