From: "Reshetova, Elena"
To: Kees Cook
CC: Peter Zijlstra, LKML, Dave Chinner, "linux-doc@vger.kernel.org", Jonathan Corbet
Subject: RE: [PATCH] refcount_t: documentation for memory ordering differences
Date: Tue, 5 Dec 2017 10:36:40 +0000
Message-ID: <2236FBA76BA1254E88B949DDB74E612B802C8208@IRSMSX102.ger.corp.intel.com>
References: <1511958996-19501-1-git-send-email-elena.reshetova@intel.com>
 <1511958996-19501-2-git-send-email-elena.reshetova@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On
Wed, Nov 29, 2017 at 4:36 AM, Elena Reshetova wrote:
> > Some functions from refcount_t API provide different
> > memory ordering guarantees than their atomic counterparts.
> > This adds a document outlining these differences.
> >
> > Signed-off-by: Elena Reshetova
>
> Thanks for the improvements!
>
> I have some markup changes to add, but I'll send that as a separate patch.

Thank you Kees! I guess I was too minimal in my markup use, so the doc was
pretty plain before. I have now combined your changes with mine and put both
of our sign-offs on the resulting patch. I think this makes it easier for
reviewers, since ultimately the content is the same. I will now fix one more
thing Randy noticed and then send it to linux-doc and Jon Corbet.

Best Regards,
Elena.

>
> Acked-by: Kees Cook
>
> -Kees
>
> > ---
> >  Documentation/core-api/index.rst              |   1 +
> >  Documentation/core-api/refcount-vs-atomic.rst | 129 ++++++++++++++++++++++++++
> >  2 files changed, 130 insertions(+)
> >  create mode 100644 Documentation/core-api/refcount-vs-atomic.rst
> >
> > diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
> > index d5bbe03..d4d54b0 100644
> > --- a/Documentation/core-api/index.rst
> > +++ b/Documentation/core-api/index.rst
> > @@ -14,6 +14,7 @@ Core utilities
> >     kernel-api
> >     assoc_array
> >     atomic_ops
> > +   refcount-vs-atomic
> >     cpu_hotplug
> >     local_ops
> >     workqueue
> >
> > diff --git a/Documentation/core-api/refcount-vs-atomic.rst b/Documentation/core-api/refcount-vs-atomic.rst
> > new file mode 100644
> > index 0000000..5619d48
> > --- /dev/null
> > +++ b/Documentation/core-api/refcount-vs-atomic.rst
> > @@ -0,0 +1,129 @@
> > +===================================
> > +refcount_t API compared to atomic_t
> > +===================================
> > +
> > +The goal of the refcount_t API is to provide a minimal API for implementing
> > +an object's reference counters.
> > +While a generic architecture-independent
> > +implementation from lib/refcount.c uses atomic operations underneath,
> > +there are a number of differences between some of the refcount_*() and
> > +atomic_*() functions with regard to the memory ordering guarantees.
> > +This document outlines the differences and provides respective examples
> > +in order to help maintainers validate their code against the change in
> > +these memory ordering guarantees.
> > +
> > +memory-barriers.txt and atomic_t.txt provide more background on
> > +memory ordering in general and on atomic operations specifically.
> > +
> > +Relevant types of memory ordering
> > +=================================
> > +
> > +**Note**: the following section only covers some of the memory
> > +ordering types that are relevant for the atomics and reference
> > +counters and used throughout this document. For a much broader picture
> > +please consult the memory-barriers.txt document.
> > +
> > +In the absence of any memory ordering guarantees (i.e. fully unordered),
> > +atomics & refcounters only provide atomicity and
> > +program order (po) relation (on the same CPU). It guarantees that
> > +each atomic_*() and refcount_*() operation is atomic and instructions
> > +are executed in program order on a single CPU.
> > +This is implemented using READ_ONCE()/WRITE_ONCE() and
> > +compare-and-swap primitives.
> > +
> > +A strong (full) memory ordering guarantees that all prior loads and
> > +stores (all po-earlier instructions) on the same CPU are completed
> > +before any po-later instruction is executed on the same CPU.
> > +It also guarantees that all po-earlier stores on the same CPU
> > +and all propagated stores from other CPUs must propagate to all
> > +other CPUs before any po-later instruction is executed on the original
> > +CPU (A-cumulative property). This is implemented using smp_mb().
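[ To make the ordering flavours above a bit more concrete, the usual
get/put pattern they apply to can be sketched with userspace C11
atomics. This is only a hypothetical illustration -- the names obj_get()
and obj_put() are made up here, and the kernel's refcount_t is
implemented in lib/refcount.c with additional saturation semantics, not
like this: ]

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

struct obj {
	atomic_uint refs;	/* stands in for refcount_t */
	int payload;
};

/* Roughly refcount_inc(): atomicity only, fully unordered
 * (memory_order_relaxed), as described in the text above. */
static void obj_get(struct obj *o)
{
	atomic_fetch_add_explicit(&o->refs, 1, memory_order_relaxed);
}

/* Roughly refcount_dec_and_test(): RELEASE ordering on the decrement,
 * so every po-earlier store to the object is visible to whoever drops
 * the final reference; the acquire fence on the zero branch pairs with
 * those releases before the object is freed. */
static bool obj_put(struct obj *o)
{
	if (atomic_fetch_sub_explicit(&o->refs, 1,
				      memory_order_release) == 1) {
		atomic_thread_fence(memory_order_acquire);
		free(o);
		return true;	/* object is gone */
	}
	return false;
}
```

[ A caller would pair these as obj_get(o); ... obj_put(o); and must not
touch *o after obj_put() returns true. ]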
> > +
> > +A RELEASE memory ordering guarantees that all prior loads and
> > +stores (all po-earlier instructions) on the same CPU are completed
> > +before the operation. It also guarantees that all po-earlier
> > +stores on the same CPU and all propagated stores from other CPUs
> > +must propagate to all other CPUs before the release operation
> > +(A-cumulative property). This is implemented using smp_store_release().
> > +
> > +A control dependency (on success) for refcounters guarantees that
> > +if a reference for an object was successfully obtained (reference
> > +counter increment or addition happened, function returned true),
> > +then further stores are ordered against this operation.
> > +A control dependency on stores is not implemented using any explicit
> > +barriers, but relies on the CPU not speculating on stores. This is only
> > +a single-CPU relation and provides no guarantees for other CPUs.
> > +
> > +
> > +Comparison of functions
> > +=======================
> > +
> > +case 1) - non-"Read/Modify/Write" (RMW) ops
> > +-------------------------------------------
> > +
> > +Function changes:
> > + atomic_set() --> refcount_set()
> > + atomic_read() --> refcount_read()
> > +
> > +Memory ordering guarantee changes:
> > + none (both fully unordered)
> > +
> > +case 2) - increment-based ops that return no value
> > +--------------------------------------------------
> > +
> > +Function changes:
> > + atomic_inc() --> refcount_inc()
> > + atomic_add() --> refcount_add()
> > +
> > +Memory ordering guarantee changes:
> > + none (both fully unordered)
> > +
> > +
> > +case 3) - decrement-based RMW ops that return no value
> > +------------------------------------------------------
> > +
> > +Function changes:
> > + atomic_dec() --> refcount_dec()
> > +
> > +Memory ordering guarantee changes:
> > + fully unordered --> RELEASE ordering
> > +
> > +
> > +case 4) - increment-based RMW ops that return a value
> > +-----------------------------------------------------
> > +
> > +Function changes:
> > + atomic_inc_not_zero() --> refcount_inc_not_zero()
> > + no atomic counterpart --> refcount_add_not_zero()
> > +
> > +Memory ordering guarantee changes:
> > + fully ordered --> control dependency on success for stores
> > +
> > +*Note*: we really assume here that the necessary ordering is provided
> > +as a result of obtaining the pointer to the object!
> > +
> > +
> > +case 5) - decrement-based RMW ops that return a value
> > +-----------------------------------------------------
> > +
> > +Function changes:
> > + atomic_dec_and_test() --> refcount_dec_and_test()
> > + atomic_sub_and_test() --> refcount_sub_and_test()
> > + no atomic counterpart --> refcount_dec_if_one()
> > + atomic_add_unless(&var, -1, 1) --> refcount_dec_not_one(&var)
> > +
> > +Memory ordering guarantee changes:
> > + fully ordered --> RELEASE ordering + control dependency
> > +
> > +Note: atomic_add_unless() only provides full ordering on success.
> > +
> > +
> > +case 6) - lock-based RMW
> > +------------------------
> > +
> > +Function changes:
> > +
> > + atomic_dec_and_lock() --> refcount_dec_and_lock()
> > + atomic_dec_and_mutex_lock() --> refcount_dec_and_mutex_lock()
> > +
> > +Memory ordering guarantee changes:
> > + fully ordered --> RELEASE ordering + control dependency +
> > + hold spin_lock() on success
> > --
> > 2.7.4
> >
>
> --
> Kees Cook
> Pixel Security
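[ P.S. One more sketch, for the control-dependency point in case 4): an
inc_not_zero-style operation can be approximated in userspace C11
atomics as below. Again a purely hypothetical illustration -- the name
obj_get_unless_zero() is invented here, and the kernel's real
refcount_inc_not_zero() lives in lib/refcount.c: ]

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Take a reference only if the count is still non-zero, i.e. the
 * object has not already begun to be torn down.  All atomics are
 * relaxed: on success, later stores are ordered only by the control
 * dependency on the returned value, with no full barrier -- which is
 * exactly the guarantee change case 4) describes relative to the
 * fully ordered atomic_inc_not_zero(). */
static bool obj_get_unless_zero(atomic_uint *refs)
{
	unsigned int old = atomic_load_explicit(refs, memory_order_relaxed);

	do {
		if (old == 0)
			return false;	/* object already on its way out */
	} while (!atomic_compare_exchange_weak_explicit(refs, &old, old + 1,
							memory_order_relaxed,
							memory_order_relaxed));
	return true;
}
```

[ Callers must branch on the return value before storing to the object;
that branch is the control dependency the text relies on. ]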