Subject: Re: [PATCH] refcount_t: documentation for memory ordering differences
From: Randy Dunlap
To: Elena Reshetova, peterz@infradead.org
Cc: linux-kernel@vger.kernel.org, gregkh@linuxfoundation.org,
 keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com,
 ishkamiel@gmail.com
Date: Mon, 6 Nov 2017 10:58:35 -0800

On 11/06/2017 05:32 AM, Elena Reshetova wrote:
> Some functions from refcount_t API provide different
> memory ordering guarantees than their atomic counterparts.
> This adds a document outlining the differences and
> showing examples.
>
> Signed-off-by: Elena Reshetova
> ---
>  Documentation/refcount-vs-atomic.txt | 234 +++++++++++++++++++++++++++++++++++
>  1 file changed, 234 insertions(+)
>  create mode 100644 Documentation/refcount-vs-atomic.txt
>
> diff --git a/Documentation/refcount-vs-atomic.txt b/Documentation/refcount-vs-atomic.txt
> new file mode 100644
> index 0000000..09efd2b
> --- /dev/null
> +++ b/Documentation/refcount-vs-atomic.txt
> @@ -0,0 +1,234 @@
> +===================================
> +refcount_t API compared to atomic_t
> +===================================
> +
> +The goal of the refcount_t API is to provide a minimal API for implementing
> +an object's reference counters. While a generic architecture-independent
> +implementation from lib/refcount.c uses atomic operations underneath,
> +there is a number of differences between some of the refcount_*() and
           there are
> +atomic_*() functions with regards to the memory ordering guarantees.
> +This document outlines the differences and provides respective examples
> +in order to help maintainers validate their code against the change in
> +these memory ordering guarantees.
> +
> +memory-barriers.txt and atomic_t.txt provide more background on
> +memory ordering in general and for atomic operations specifically.
> +
> +Summary of the differences
> +==========================
> +
> + 1) There is no difference between the respective non-RMW ops, i.e.
> +    refcount_set() & refcount_read() have exactly the same ordering
> +    guarantees (meaning fully unordered) as atomic_set() and atomic_read().
> + 2) For the increment-based ops that return no value (namely
> +    refcount_inc() & refcount_add()) memory ordering guarantees are
> +    exactly the same (meaning fully unordered) as the respective atomic
> +    functions (atomic_inc() & atomic_add()).
> + 3) For the decrement-based ops that return no value (namely
> +    refcount_dec()) memory ordering guarantees are slightly
> +    stronger than the respective atomic counterpart (atomic_dec()).
> +    While atomic_dec() is fully unordered, refcount_dec() does
> +    provide a RELEASE memory ordering guarantee (see next section).
> + 4) For the rest of the increment-based RMW ops (refcount_inc_not_zero(),
> +    refcount_add_not_zero()) the memory ordering guarantees are relaxed
> +    compare to their atomic counterparts (atomic_inc_not_zero()).
       compared
> +    Refcount variants provide no memory ordering guarantees apart from
> +    control dependency on success, while atomics provide a full memory
                                                   provide full memory
> +    ordering guarantees (see next section).
> + 5) The rest of the decrement-based RMW ops (refcount_dec_and_test(),
> +    refcount_sub_and_test(), refcount_dec_if_one(), refcount_dec_not_one())
> +    provide only RELEASE memory ordering and control dependency on success
> +    (see next section). The respective atomic counterparts
> +    (atomic_dec_and_test(), atomic_sub_and_test()) provide full memory ordering.
> + 6) The lock-based RMW ops (refcount_dec_and_lock() &
> +    refcount_dec_and_mutex_lock()) always provide RELEASE memory ordering
> +    and ACQUIRE memory ordering & control dependency on success
> +    (see next section). The respective atomic counterparts
> +    (atomic_dec_and_lock() & atomic_dec_and_mutex_lock())
> +    provide full memory ordering.
> +
> +
> +Details and examples
> +====================
> +
> +Here we consider the cases 3)-6) that do present differences together
> +with respective examples.
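
Before diving into the individual cases, it might help readers to see
where these ops sit in a typical user, so they can map the case numbers
onto real code.  Something like the following, perhaps?  (Just a sketch;
struct obj, obj_alloc(), obj_get() and obj_put() are made-up names for
illustration.)

	#include <linux/refcount.h>
	#include <linux/slab.h>

	/* A hypothetical reference-counted object. */
	struct obj {
		refcount_t ref;
		/* ... payload ... */
	};

	static struct obj *obj_alloc(void)
	{
		struct obj *o = kzalloc(sizeof(*o), GFP_KERNEL);

		if (o)
			refcount_set(&o->ref, 1);  /* non-RMW, unordered: case 1 */
		return o;
	}

	static void obj_get(struct obj *o)
	{
		refcount_inc(&o->ref);  /* unordered, like atomic_inc(): case 2 */
	}

	static void obj_put(struct obj *o)
	{
		if (refcount_dec_and_test(&o->ref))  /* RELEASE + control dep: case 5 */
			kfree(o);
	}
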
> +
> +case 3) - decrement-based RMW ops that return no value
> +------------------------------------------------------
> +
> +Function changes:
> +    atomic_dec() --> refcount_dec()
> +
> +Memory ordering guarantee changes:
> +    fully unordered --> RELEASE ordering
> +
> +RELEASE ordering guarantees that prior loads and stores are
> +completed before the operation. Implemented using smp_store_release().
> +
> +Examples:
> +~~~~~~~~~
> +
> +For fully unordered operations stores to a, b and c can
> +happen in any sequence:
> +
> +P0(int *a, int *b, int *c)
> +  {
> +  WRITE_ONCE(*a, 1);
> +  WRITE_ONCE(*b, 1);
> +  WRITE_ONCE(*c, 1);
> +  }
> +
> +
> +For a RELEASE ordered operation, read and write from/to @a
                                    read or write (??)
> +is guaranteed to happen before store to @b. There are no

If you want to keep "read and write" above, please change "is" to "are".
Are "write" and "store" the same?  They seem to be used interchangeably.

> +guarantees on the order of store/read to/from @c:
> +
> +P0(int *a, int *b, int *c)
> +  {
> +  READ_ONCE(*a);
> +  WRITE_ONCE(*a, 1);
> +  smp_store_release(b, 1);
> +  WRITE_ONCE(*c, 1);
> +  READ_ONCE(*c);
> +  }
> +
> +
> +case 4) - increment-based RMW ops that return a value
> +-----------------------------------------------------
> +
> +Function changes:
> +    atomic_inc_not_zero() --> refcount_inc_not_zero()
> +    no atomic counterpart --> refcount_add_not_zero()
> +
> +Memory ordering guarantee changes:
> +    fully ordered --> control dependency on success for stores
> +
> +Control dependency on success guarantees that if a reference for an
> +object was successfully obtained (reference counter increment or
> +addition happened, functions returned true), then further stores are ordered
> +against this operation. Control dependency on stores is not implemented
> +using any explicit barriers, but we rely on the CPU not to speculate on stores.
> +
> +*Note*: we really assume here that necessary ordering is provided as a result
> +of obtaining a pointer to the object!
> +
> +Examples:
> +~~~~~~~~~
> +
> +For a fully ordered atomic operation smp_mb() barriers are inserted before
> +and after the actual operation:
> +
> +P0(int *a, int *b, int *c)
> +  {
> +  WRITE_ONCE(*b, 2);
> +  READ_ONCE(*c);
> +  if ( ({ smp_mb(); ret = do_atomic_inc_not_zero(*a); smp_mb(); ret; }) ) {
> +     safely_perform_operation_on_object_protected_by_@a();
> +     ...
> +  }
> +  WRITE_ONCE(*c, 2);
> +  READ_ONCE(*b);
> +  }

fix indentation above?  or is it meant to be funky?

> +
> +These barriers guarantee that all prior loads and stores (@b and @c)
> +are completed before the operation, as well as all later loads and
> +stores (@b and @c) are completed after the operation.
> +
> +For a fully unordered refcount operation smp_mb() barriers are absent
> +and only control dependency on stores is guaranteed:
                                           are
> +
> +P0(int *a, int *b, int *c)
> +  {
> +  WRITE_ONCE(*b, 2);
> +  READ_ONCE(*c);
> +  if ( ({ ret = do_refcount_inc_not_zero(*a); ret; }) ) {
> +     perform_store_operation_on_object_protected_by_@a();
> +     /* here we assume that necessary ordering is provided
> +      * using other means, such as locks etc. */
> +     ...
> +  }
> +  WRITE_ONCE(*c, 2);
> +  READ_ONCE(*b);
> +  }

indentation?

> +
> +No guarantees on order of stores and loads to/from @b and @c.
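
It might also be worth showing where the weaker ordering of
refcount_inc_not_zero() typically appears in practice.  A sketch (reusing
the hypothetical struct obj from my earlier comment; obj_cache stands in
for an RCU-published pointer that is assumed to be assigned elsewhere):

	#include <linux/rcupdate.h>
	#include <linux/refcount.h>

	static struct obj __rcu *obj_cache;	/* assumed published elsewhere */

	static struct obj *obj_lookup(void)
	{
		struct obj *o;

		rcu_read_lock();
		o = rcu_dereference(obj_cache);
		/*
		 * The object may be concurrently dropping to zero, so only
		 * take a reference if the count is still non-zero.  On
		 * success, later stores to the object are ordered by the
		 * control dependency (plus whatever ordering obtaining the
		 * pointer gave us), not by a full barrier as they would be
		 * with atomic_inc_not_zero().
		 */
		if (o && !refcount_inc_not_zero(&o->ref))
			o = NULL;
		rcu_read_unlock();
		return o;
	}
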
> +
> +
> +case 5) - decrement-based RMW ops that return a value
> +-----------------------------------------------------
> +
> +Function changes:
> +    atomic_dec_and_test() --> refcount_dec_and_test()
> +    atomic_sub_and_test() --> refcount_sub_and_test()
> +    no atomic counterpart --> refcount_dec_if_one()
> +    atomic_add_unless(&var, -1, 1) --> refcount_dec_not_one(&var)
> +
> +Memory ordering guarantee changes:
> +    fully ordered --> RELEASE ordering + control dependency on success for stores
> +
> +Note: atomic_add_unless() only provides full order on success.
> +
> +Examples:
> +~~~~~~~~~
> +
> +For a fully ordered atomic operation smp_mb() barriers are inserted before
> +and after the actual operation:
> +
> +P0(int *a, int *b, int *c)
> +  {
> +  WRITE_ONCE(*b, 2);
> +  READ_ONCE(*c);
> +  if ( ({ smp_mb(); ret = do_atomic_dec_and_test(*a); smp_mb(); ret; }) ) {
> +     safely_free_the_object_protected_by_@a();
> +     ...
> +  }
> +  WRITE_ONCE(*c, 2);
> +  READ_ONCE(*b);
> +  }

indentation?

> +
> +These barriers guarantee that all prior loads and stores (@b and @c)
> +are completed before the operation, as well as all later loads and
> +stores (@b and @c) are completed after the operation.
> +
> +
> +P0(int *a, int *b, int *c)
> +  {
> +  WRITE_ONCE(*b, 2);
> +  READ_ONCE(*c);
> +  if ( ({ smp_store_release(*a); ret = do_refcount_dec_and_test(*a); ret; }) ) {
> +     safely_free_the_object_protected_by_@a();
> +     /* here we know that this is 1 --> 0 transition
> +      * and therefore we are the last user of this object
> +      * so no concurrency issues are present */
> +     ...
> +  }
> +  WRITE_ONCE(*c, 2);
> +  READ_ONCE(*b);
> +  }

odd indentation intended?

> +
> +Here smp_store_release() guarantees that a store to @b and read
> +from @c happens before the operation. However, there is no
           happen
> +guarantee on the order of store to @c and read to @b following
> +the if cause.
         clause (?)
> +
> +
> +case 6) - lock-based RMW
> +------------------------
> +
> +Function changes:
> +
> +    atomic_dec_and_lock() --> refcount_dec_and_lock()
> +    atomic_dec_and_mutex_lock() --> refcount_dec_and_mutex_lock()
> +
> +Memory ordering guarantee changes:
> +    fully ordered --> RELEASE ordering always, and on success ACQUIRE
> +                      ordering & control dependency for stores
> +
> +ACQUIRE ordering guarantees that loads and stores issued after the ACQUIRE
> +operation are completed after the operation. In this case it is implemented
> +using spin_lock().
> +
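
One more thought: case 6) is the only case without an example.  Showing
the usual "drop the last reference under the lock that protects the lookup
structure" pattern would round it out.  Just a sketch, extending the
hypothetical struct obj from my earlier comment with a list_head; obj_lock
and obj_put_locked() are made-up names:

	#include <linux/list.h>
	#include <linux/refcount.h>
	#include <linux/slab.h>
	#include <linux/spinlock.h>

	struct obj {
		refcount_t ref;
		struct list_head node;	/* on a list protected by obj_lock */
	};

	static DEFINE_SPINLOCK(obj_lock);

	static void obj_put_locked(struct obj *o)
	{
		/*
		 * refcount_dec_and_lock() returns true, with obj_lock held,
		 * only when the count hit zero: RELEASE ordering on the
		 * decrement always, plus ACQUIRE from spin_lock() on
		 * success, exactly as described above.
		 * atomic_dec_and_lock() would also be correct here, just
		 * fully ordered and therefore stronger than required.
		 */
		if (refcount_dec_and_lock(&o->ref, &obj_lock)) {
			list_del(&o->node);
			spin_unlock(&obj_lock);
			kfree(o);
		}
	}

--
~Randy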