Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754686AbcKYNdI (ORCPT ); Fri, 25 Nov 2016 08:33:08 -0500 Received: from ec2-52-27-115-49.us-west-2.compute.amazonaws.com ([52.27.115.49]:59452 "EHLO osg.samsung.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1753808AbcKYNdA (ORCPT ); Fri, 25 Nov 2016 08:33:00 -0500 Date: Fri, 25 Nov 2016 11:25:12 -0200 From: Mauro Carvalho Chehab To: Silvio Fricke Cc: linux-doc@vger.kernel.org, Jonathan Corbet , Ming Lei , "Luis R . Rodriguez" , Mauro Carvalho Chehab , linux-kernel@vger.kernel.org Subject: Re: [PATCH v2 2/4] Documentation/atomic_ops.txt: convert to ReST markup Message-ID: <20161125112512.76cc54cf@vento.lan> In-Reply-To: <64a2300a71a50d1c11d424eee7716a4590286227.1480068123.git-series.silvio.fricke@gmail.com> References: <64a2300a71a50d1c11d424eee7716a4590286227.1480068123.git-series.silvio.fricke@gmail.com> Organization: Samsung X-Mailer: Claws Mail 3.14.0 (GTK+ 2.24.31; x86_64-redhat-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 44047 Lines: 1087 Em Fri, 25 Nov 2016 11:02:40 +0100 Silvio Fricke escreveu: > ... and move to core-api folder. > > Signed-off-by: Silvio Fricke > --- > Documentation/atomic_ops.txt => Documentation/core-api/atomic_ops.rst | 745 +++++++++++++++++++++++++++++++++++++----------------------------------- > Documentation/core-api/index.rst | 1 +- > Documentation/process/volatile-considered-harmful.rst | 3 +- > 3 files changed, 391 insertions(+), 358 deletions(-) > > diff --git a/Documentation/atomic_ops.txt b/Documentation/core-api/atomic_ops.rst > similarity index 47% > rename from Documentation/atomic_ops.txt > rename to Documentation/core-api/atomic_ops.rst > index 6c5e8a9..3ac94ff 100644 > --- a/Documentation/atomic_ops.txt > +++ b/Documentation/core-api/atomic_ops.rst > @@ -1,204 +1,212 @@ > - Semantics and Behavior of Atomic and > - Bitmask Operations > +======================================================= > +Semantics and Behavior of Atomic and Bitmask Operations > +======================================================= > > - David S. Miller > +:Author: David S. Miller > > - This document is intended to serve as a guide to Linux port > +This document is intended to serve as a guide to Linux port > maintainers on how to implement atomic counter, bitops, and spinlock > interfaces properly. > > - The atomic_t type should be defined as a signed integer and > -the atomic_long_t type as a signed long integer. Also, they should > +Atomic Type And Operations > +========================== > + > +The ``atomic_t type`` should be defined as a signed integer and > +the ``atomic_long_t`` type as a signed long integer. Also, they should > be made opaque such that any kind of cast to a normal C integer type > -will fail. Something like the following should suffice: > +will fail. Something like the following should suffice:: > > - typedef struct { int counter; } atomic_t; > - typedef struct { long counter; } atomic_long_t; > + typedef struct { int counter; } atomic_t; > + typedef struct { long counter; } atomic_long_t; > > Historically, counter has been declared volatile. This is now discouraged. > -See Documentation/process/volatile-considered-harmful.rst for the complete rationale. > +See :ref:`Documentation/process/volatile-considered-harmful.rst > +` for the complete rationale. 
> > -local_t is very similar to atomic_t. If the counter is per CPU and only > -updated by one CPU, local_t is probably more appropriate. Please see > -Documentation/local_ops.txt for the semantics of local_t. > +``local_t`` is very similar to ``atomic_t``. If the counter is per CPU and only > +updated by one CPU, ``local_t`` is probably more appropriate. Please see > +:ref:`Documentation/local_ops.txt ` for the semantics of ``local_t``. > > -The first operations to implement for atomic_t's are the initializers and > -plain reads. > +The first operations to implement for ``atomic_t``'s are the initializers and > +plain reads. :: > > - #define ATOMIC_INIT(i) { (i) } > - #define atomic_set(v, i) ((v)->counter = (i)) > + #define ATOMIC_INIT(i) { (i) } > + #define atomic_set(v, i) ((v)->counter = (i)) > > -The first macro is used in definitions, such as: > +The first macro is used in definitions, such as:: > > -static atomic_t my_counter = ATOMIC_INIT(1); > + static atomic_t my_counter = ATOMIC_INIT(1); > > The initializer is atomic in that the return values of the atomic operations > are guaranteed to be correct reflecting the initialized value if the > initializer is used before runtime. If the initializer is used at runtime, a > proper implicit or explicit read memory barrier is needed before reading the > -value with atomic_read from another thread. > +value with ``atomic_read`` from another thread. > > -As with all of the atomic_ interfaces, replace the leading "atomic_" > -with "atomic_long_" to operate on atomic_long_t. > +As with all of the ``atomic_`` interfaces, replace the leading ``atomic_`` > +with ``atomic_long_`` to operate on ``atomic_long_t``. > > -The second interface can be used at runtime, as in: > +The second interface can be used at runtime, as in:: > > - struct foo { atomic_t counter; }; > - ... > + struct foo { atomic_t counter; }; > + ... > > - struct foo *k; > + struct foo *k; > > - k = kmalloc(sizeof(*k), GFP_KERNEL); > - if (!k) > - return -ENOMEM; > - atomic_set(&k->counter, 0); > + k = kmalloc(sizeof(*k), GFP_KERNEL); > + if (!k) > + return -ENOMEM; > + atomic_set(&k->counter, 0); > > The setting is atomic in that the return values of the atomic operations by > all threads are guaranteed to be correct reflecting either the value that has > been set with this operation or set with another operation. A proper implicit > or explicit memory barrier is needed before the value set with the operation > -is guaranteed to be readable with atomic_read from another thread. > +is guaranteed to be readable with ``atomic_read`` from another thread. > > -Next, we have: > +Next, we have:: > > - #define atomic_read(v) ((v)->counter) > + #define atomic_read(v) ((v)->counter) > > which simply reads the counter value currently visible to the calling thread. > The read is atomic in that the return value is guaranteed to be one of the > values initialized or modified with the interface operations if a proper > implicit or explicit memory barrier is used after possible runtime > initialization by any other thread and the value is modified only with the > -interface operations. atomic_read does not guarantee that the runtime > +interface operations. ``atomic_read`` does not guarantee that the runtime > initialization by any other thread is visible yet, so the user of the > interface must take care of that with a proper implicit or explicit memory > barrier. > > -*** WARNING: atomic_read() and atomic_set() DO NOT IMPLY BARRIERS! *** > +.. 
warning:: > > -Some architectures may choose to use the volatile keyword, barriers, or inline > -assembly to guarantee some degree of immediacy for atomic_read() and > -atomic_set(). This is not uniformly guaranteed, and may change in the future, > -so all users of atomic_t should treat atomic_read() and atomic_set() as simple > -C statements that may be reordered or optimized away entirely by the compiler > -or processor, and explicitly invoke the appropriate compiler and/or memory > -barrier for each use case. Failure to do so will result in code that may > -suddenly break when used with different architectures or compiler > -optimizations, or even changes in unrelated code which changes how the > -compiler optimizes the section accessing atomic_t variables. > + ``atomic_read()`` and ``atomic_set()`` DO NOT IMPLY BARRIERS! > > -*** YOU HAVE BEEN WARNED! *** > + Some architectures may choose to use the ``volatile`` keyword, barriers, or > + ``inline`` assembly to guarantee some degree of immediacy for ``atomic_read()`` > + and ``atomic_set()``. This is not uniformly guaranteed, and may change in the > + future, so all users of ``atomic_t`` should treat ``atomic_read()`` and > + ``atomic_set()`` as simple C statements that may be reordered or optimized away > + entirely by the compiler or processor, and explicitly invoke the appropriate > + compiler and/or memory barrier for each use case. Failure to do so will result > + in code that may suddenly break when used with different architectures or > + compiler optimizations, or even changes in unrelated code which changes how the > + compiler optimizes the section accessing ``atomic_t`` variables. > > Properly aligned pointers, longs, ints, and chars (and unsigned > equivalents) may be atomically loaded from and stored to in the same > -sense as described for atomic_read() and atomic_set(). The READ_ONCE() > -and WRITE_ONCE() macros should be used to prevent the compiler from using > -optimizations that might otherwise optimize accesses out of existence on > -the one hand, or that might create unsolicited accesses on the other. > +sense as described for ``atomic_read()`` and ``atomic_set()``. The > +``READ_ONCE()`` and ``WRITE_ONCE()`` macros should be used to prevent the > +compiler from using optimizations that might otherwise optimize accesses out of > +existence on the one hand, or that might create unsolicited accesses on the > +other. > > -For example consider the following code: > +For example consider the following code:: > > - while (a > 0) > - do_something(); > + while (a > 0) > + do_something(); > > -If the compiler can prove that do_something() does not store to the > +If the compiler can prove that ``do_something()`` does not store to the > variable a, then the compiler is within its rights transforming this to > -the following: > +the following:: > > - tmp = a; > - if (a > 0) > - for (;;) > - do_something(); > + tmp = a; > + if (a > 0) > + for (;;) > + do_something(); > > If you don't want the compiler to do this (and you probably don't), then > -you should use something like the following: > +you should use something like the following:: > > - while (READ_ONCE(a) < 0) > - do_something(); > + while (READ_ONCE(a) < 0) > + do_something(); > > -Alternatively, you could place a barrier() call in the loop. > +Alternatively, you could place a ``barrier()`` call in the loop. 
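
(An aside for the thread, not a change request for this patch: the barrier() alternative mentioned just above would look roughly like the sketch below, reusing the placeholder names "a" and "do_something()" from the text. barrier() only constrains the compiler -- it forces a to be reloaded on every pass through the loop, but emits no CPU memory fence. Unrelated nit: the "while (READ_ONCE(a) < 0)" condition quoted above looks inverted with respect to the original "while (a > 0)" loop; that appears to be a pre-existing typo in the document rather than something this conversion introduced.)

        while (a > 0) {
                do_something();
                barrier();      /* compiler barrier: forces a to be re-read */
        }
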
> > -For another example, consider the following code: > +For another example, consider the following code:: > > - tmp_a = a; > - do_something_with(tmp_a); > - do_something_else_with(tmp_a); > + tmp_a = a; > + do_something_with(tmp_a); > + do_something_else_with(tmp_a); > > -If the compiler can prove that do_something_with() does not store to the > -variable a, then the compiler is within its rights to manufacture an > -additional load as follows: > +If the compiler can prove that ``do_something_with()`` does not store to the > +variable ``a``, then the compiler is within its rights to manufacture an > +additional load as follows:: > > - tmp_a = a; > - do_something_with(tmp_a); > - tmp_a = a; > - do_something_else_with(tmp_a); > + tmp_a = a; > + do_something_with(tmp_a); > + tmp_a = a; > + do_something_else_with(tmp_a); > > This could fatally confuse your code if it expected the same value > -to be passed to do_something_with() and do_something_else_with(). > +to be passed to ``do_something_with()`` and ``do_something_else_with()``. > > The compiler would be likely to manufacture this additional load if > -do_something_with() was an inline function that made very heavy use > +``do_something_with()`` was an inline function that made very heavy use > of registers: reloading from variable a could save a flush to the > stack and later reload. To prevent the compiler from attacking your > -code in this manner, write the following: > +code in this manner, write the following:: > > - tmp_a = READ_ONCE(a); > - do_something_with(tmp_a); > - do_something_else_with(tmp_a); > + tmp_a = READ_ONCE(a); > + do_something_with(tmp_a); > + do_something_else_with(tmp_a); > > For a final example, consider the following code, assuming that the > -variable a is set at boot time before the second CPU is brought online > -and never changed later, so that memory barriers are not needed: > +variable ``a`` is set at boot time before the second CPU is brought online > +and never changed later, so that memory barriers are not needed:: > > - if (a) > - b = 9; > - else > - b = 42; > + if (a) > + b = 9; > + else > + b = 42; > > The compiler is within its rights to manufacture an additional store > -by transforming the above code into the following: > +by transforming the above code into the following:: > > - b = 42; > - if (a) > - b = 9; > + b = 42; > + if (a) > + b = 9; > > This could come as a fatal surprise to other code running concurrently > -that expected b to never have the value 42 if a was zero. To prevent > -the compiler from doing this, write something like: > +that expected ``b`` to never have the value ``42`` if ``a`` was zero. To > +prevent the compiler from doing this, write something like:: > + > + if (a) > + WRITE_ONCE(b, 9); > + else > + WRITE_ONCE(b, 42); > > - if (a) > - WRITE_ONCE(b, 9); > - else > - WRITE_ONCE(b, 42); > +Don't even **think** about doing this without proper use of memory barriers, > +locks, or atomic operations if variable ``a`` can change at runtime! > > -Don't even -think- about doing this without proper use of memory barriers, > -locks, or atomic operations if variable a can change at runtime! > +.. warning:: > > -*** WARNING: READ_ONCE() OR WRITE_ONCE() DO NOT IMPLY A BARRIER! *** > + ``READ_ONCE()`` OR ``WRITE_ONCE()`` DO NOT IMPLY A BARRIER! > > Now, we move onto the atomic operation interfaces typically implemented with > -the help of assembly code. > +the help of assembly code. 
:: > > - void atomic_add(int i, atomic_t *v); > - void atomic_sub(int i, atomic_t *v); > - void atomic_inc(atomic_t *v); > - void atomic_dec(atomic_t *v); > + void atomic_add(int i, atomic_t *v); > + void atomic_sub(int i, atomic_t *v); > + void atomic_inc(atomic_t *v); > + void atomic_dec(atomic_t *v); > > These four routines add and subtract integral values to/from the given > -atomic_t value. The first two routines pass explicit integers by > +``atomic_t`` value. The first two routines pass explicit integers by > which to make the adjustment, whereas the latter two use an implicit > adjustment value of "1". > > -One very important aspect of these two routines is that they DO NOT > +One very important aspect of these two routines is that they **DO NOT** > require any explicit memory barriers. They need only perform the > -atomic_t counter update in an SMP safe manner. > +``atomic_t`` counter update in an SMP safe manner. > > -Next, we have: > +Next, we have:: > > - int atomic_inc_return(atomic_t *v); > - int atomic_dec_return(atomic_t *v); > + int atomic_inc_return(atomic_t *v); > + int atomic_dec_return(atomic_t *v); > > These routines add 1 and subtract 1, respectively, from the given > -atomic_t and return the new counter value after the operation is > +``atomic_t`` and return the new counter value after the operation is > performed. > > Unlike the above routines, it is required that these primitives > @@ -207,317 +215,330 @@ the operation. It must be done such that all memory operations before > and after the atomic operation calls are strongly ordered with respect > to the atomic operation itself. > > -For example, it should behave as if a smp_mb() call existed both > +For example, it should behave as if a ``smp_mb()`` call existed both > before and after the atomic operation. > > If the atomic instructions used in an implementation provide explicit > memory barrier semantics which satisfy the above requirements, that is > fine as well. > > -Let's move on: > +Let's move on:: > > - int atomic_add_return(int i, atomic_t *v); > - int atomic_sub_return(int i, atomic_t *v); > + int atomic_add_return(int i, atomic_t *v); > + int atomic_sub_return(int i, atomic_t *v); > > -These behave just like atomic_{inc,dec}_return() except that an > +These behave just like ``atomic_{inc,dec}_return()`` except that an > explicit counter adjustment is given instead of the implicit "1". > -This means that like atomic_{inc,dec}_return(), the memory barrier > +This means that like ``atomic_{inc,dec}_return()``, the memory barrier > semantics are required. > > -Next: > +Next:: > > - int atomic_inc_and_test(atomic_t *v); > - int atomic_dec_and_test(atomic_t *v); > + int atomic_inc_and_test(atomic_t *v); > + int atomic_dec_and_test(atomic_t *v); > > These two routines increment and decrement by 1, respectively, the > given atomic counter. They return a boolean indicating whether the > resulting counter value was zero or not. > > Again, these primitives provide explicit memory barrier semantics around > -the atomic operation. > +the atomic operation. :: Better to remove the dot here: the atomic operation:: And the same on other places. > > - int atomic_sub_and_test(int i, atomic_t *v); > + int atomic_sub_and_test(int i, atomic_t *v); > > -This is identical to atomic_dec_and_test() except that an explicit > +This is identical to ``atomic_dec_and_test()`` except that an explicit > decrement is given instead of the implicit "1". 
This primitive must > -provide explicit memory barrier semantics around the operation. > +provide explicit memory barrier semantics around the operation. :: > > - int atomic_add_negative(int i, atomic_t *v); > + int atomic_add_negative(int i, atomic_t *v); > > The given increment is added to the given atomic counter value. A boolean > is return which indicates whether the resulting counter value is negative. > This primitive must provide explicit memory barrier semantics around > the operation. > > -Then: > +Then:: > > - int atomic_xchg(atomic_t *v, int new); > + int atomic_xchg(atomic_t *v, int new); > > This performs an atomic exchange operation on the atomic variable v, setting > the given new value. It returns the old value that the atomic variable v had > just before the operation. > > -atomic_xchg must provide explicit memory barriers around the operation. > +``atomic_xchg`` must provide explicit memory barriers around the operation. :: > > - int atomic_cmpxchg(atomic_t *v, int old, int new); > + int atomic_cmpxchg(atomic_t *v, int old, int new); > > This performs an atomic compare exchange operation on the atomic value v, > -with the given old and new values. Like all atomic_xxx operations, > -atomic_cmpxchg will only satisfy its atomicity semantics as long as all > -other accesses of *v are performed through atomic_xxx operations. > +with the given old and new values. Like all ``atomic_*`` operations, > +``atomic_cmpxchg`` will only satisfy its atomicity semantics as long as all > +other accesses of ``*v`` are performed through ``atomic_*`` operations. > > -atomic_cmpxchg must provide explicit memory barriers around the operation, > +``atomic_cmpxchg`` must provide explicit memory barriers around the operation, > although if the comparison fails then no memory ordering guarantees are > required. > > -The semantics for atomic_cmpxchg are the same as those defined for 'cas' > +The semantics for ``atomic_cmpxchg`` are the same as those defined for 'cas' > below. > > -Finally: > +Finally:: > > - int atomic_add_unless(atomic_t *v, int a, int u); > + int atomic_add_unless(atomic_t *v, int a, int u); > > If the atomic value v is not equal to u, this function adds a to v, and > returns non zero. If v is equal to u then it returns zero. This is done as > an atomic operation. > > -atomic_add_unless must provide explicit memory barriers around the > +``atomic_add_unless`` must provide explicit memory barriers around the > operation unless it fails (returns 0). 
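
(Another aside, purely illustrative and not part of the patch: a minimal use of the atomic_add_unless() semantics described above is the classic "take a reference only if the object is still live" helper. The struct and field names below are borrowed from the reference-counting example later in this document; obj_tryget() itself is a made-up name.)

        #include <linux/atomic.h>

        struct obj {
                atomic_t refcnt;
                /* ... */
        };

        /*
         * Take a reference only if the count has not already dropped to
         * zero, i.e. never resurrect a dying object.  Returns non-zero
         * if a reference was taken.
         */
        static int obj_tryget(struct obj *obj)
        {
                return atomic_add_unless(&obj->refcnt, 1, 0);
        }

Spelled out by hand, this is just the atomic_inc_not_zero() idiom mentioned next.
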
> > -atomic_inc_not_zero, equivalent to atomic_add_unless(v, 1, 0) > +``atomic_inc_not_zero``, equivalent to ``atomic_add_unless(v, 1, 0)`` > > > -If a caller requires memory barrier semantics around an atomic_t > +If a caller requires memory barrier semantics around an ``atomic_t`` > operation which does not return a value, a set of interfaces are > -defined which accomplish this: > +defined which accomplish this:: > > - void smp_mb__before_atomic(void); > - void smp_mb__after_atomic(void); > + void smp_mb__before_atomic(void); > + void smp_mb__after_atomic(void); > > -For example, smp_mb__before_atomic() can be used like so: > +For example, ``smp_mb__before_atomic()`` can be used like so:: > > - obj->dead = 1; > - smp_mb__before_atomic(); > - atomic_dec(&obj->ref_count); > + obj->dead = 1; > + smp_mb__before_atomic(); > + atomic_dec(&obj->ref_count); > > -It makes sure that all memory operations preceding the atomic_dec() > +It makes sure that all memory operations preceding the ``atomic_dec()`` > call are strongly ordered with respect to the atomic counter > operation. In the above example, it guarantees that the assignment of > -"1" to obj->dead will be globally visible to other cpus before the > +"1" to ``obj->dead`` will be globally visible to other cpus before the > atomic counter decrement. > > -Without the explicit smp_mb__before_atomic() call, the > +Without the explicit ``smp_mb__before_atomic()`` call, the > implementation could legally allow the atomic counter update visible > -to other cpus before the "obj->dead = 1;" assignment. > +to other cpus before the ``obj->dead = 1;`` assignment. > > A missing memory barrier in the cases where they are required by the > -atomic_t implementation above can have disastrous results. Here is > +``atomic_t`` implementation above can have disastrous results. Here is > an example, which follows a pattern occurring frequently in the Linux > kernel. It is the use of atomic counters to implement reference > counting, and it works such that once the counter falls to zero it can > be guaranteed that no other entity can be accessing the object: > > -static void obj_list_add(struct obj *obj, struct list_head *head) > -{ > - obj->active = 1; > - list_add(&obj->list, head); > -} > - > -static void obj_list_del(struct obj *obj) > -{ > - list_del(&obj->list); > - obj->active = 0; > -} > - > -static void obj_destroy(struct obj *obj) > -{ > - BUG_ON(obj->active); > - kfree(obj); > -} > - > -struct obj *obj_list_peek(struct list_head *head) > -{ > - if (!list_empty(head)) { > - struct obj *obj; > - > - obj = list_entry(head->next, struct obj, list); > - atomic_inc(&obj->refcnt); > - return obj; > - } > - return NULL; > -} > - > -void obj_poke(void) > -{ > - struct obj *obj; > - > - spin_lock(&global_list_lock); > - obj = obj_list_peek(&global_list); > - spin_unlock(&global_list_lock); > - > - if (obj) { > - obj->ops->poke(obj); > - if (atomic_dec_and_test(&obj->refcnt)) > - obj_destroy(obj); > - } > -} > - > -void obj_timeout(struct obj *obj) > -{ > - spin_lock(&global_list_lock); > - obj_list_del(obj); > - spin_unlock(&global_list_lock); > - > - if (atomic_dec_and_test(&obj->refcnt)) > - obj_destroy(obj); > -} > - > -(This is a simplification of the ARP queue management in the > - generic neighbour discover code of the networking. Olaf Kirch > - found a bug wrt. memory barriers in kfree_skb() that exposed > - the atomic_t memory barrier requirements quite clearly.) > - > -Given the above scheme, it must be the case that the obj->active > +.. 
code-block:: c > + > + static void obj_list_add(struct obj *obj, struct list_head *head) > + { > + obj->active = 1; > + list_add(&obj->list, head); > + } > + > + static void obj_list_del(struct obj *obj) > + { > + list_del(&obj->list); > + obj->active = 0; > + } > + > + static void obj_destroy(struct obj *obj) > + { > + BUG_ON(obj->active); > + kfree(obj); > + } > + > + struct obj *obj_list_peek(struct list_head *head) > + { > + if (!list_empty(head)) { > + struct obj *obj; > + > + obj = list_entry(head->next, struct obj, list); > + atomic_inc(&obj->refcnt); > + return obj; > + } > + return NULL; > + } > + > + void obj_poke(void) > + { > + struct obj *obj; > + > + spin_lock(&global_list_lock); > + obj = obj_list_peek(&global_list); > + spin_unlock(&global_list_lock); > + > + if (obj) { > + obj->ops->poke(obj); > + if (atomic_dec_and_test(&obj->refcnt)) > + obj_destroy(obj); > + } > + } > + > + void obj_timeout(struct obj *obj) > + { > + spin_lock(&global_list_lock); > + obj_list_del(obj); > + spin_unlock(&global_list_lock); > + > + if (atomic_dec_and_test(&obj->refcnt)) > + obj_destroy(obj); > + } > + > +.. note: > + > + This is a simplification of the ARP queue management in the > + generic neighbour discover code of the networking. Olaf Kirch > + found a bug wrt. memory barriers in kfree_skb() that exposed > + the atomic_t memory barrier requirements quite clearly. > + > +Given the above scheme, it must be the case that the ``obj->active`` > update done by the obj list deletion be visible to other processors > before the atomic counter decrement is performed. > > -Otherwise, the counter could fall to zero, yet obj->active would still > -be set, thus triggering the assertion in obj_destroy(). The error > -sequence looks like this: > - > - cpu 0 cpu 1 > - obj_poke() obj_timeout() > - obj = obj_list_peek(); > - ... gains ref to obj, refcnt=2 > - obj_list_del(obj); > - obj->active = 0 ... > - ... visibility delayed ... > - atomic_dec_and_test() > - ... refcnt drops to 1 ... > - atomic_dec_and_test() > - ... refcount drops to 0 ... > - obj_destroy() > - BUG() triggers since obj->active > - still seen as one > - obj->active update visibility occurs > +Otherwise, the counter could fall to zero, yet ``obj->active`` would still > +be set, thus triggering the assertion in ``obj_destroy()``. The error > +sequence looks like this:: > + > + cpu 0 cpu 1 > + obj_poke() obj_timeout() > + obj = obj_list_peek(); > + ... gains ref to obj, refcnt=2 > + obj_list_del(obj); > + obj->active = 0 ... > + ... visibility delayed ... > + atomic_dec_and_test() > + ... refcnt drops to 1 ... > + atomic_dec_and_test() > + ... refcount drops to 0 ... > + obj_destroy() > + BUG() triggers since obj->active > + still seen as one > + obj->active update visibility occurs > > With the memory barrier semantics required of the atomic_t operations > which return values, the above sequence of memory visibility can never > -happen. Specifically, in the above case the atomic_dec_and_test() > +happen. Specifically, in the above case the ``atomic_dec_and_test()`` > counter decrement would not become globally visible until the > -obj->active update does. > +``obj->active`` update does. > > As a historical note, 32-bit Sparc used to only allow usage of > -24-bits of its atomic_t type. This was because it used 8 bits > +24-bits of its ``atomic_t`` type. This was because it used 8 bits > as a spinlock for SMP safety. Sparc32 lacked a "compare and swap" > type instruction. 
However, 32-bit Sparc has since been moved over > to a "hash table of spinlocks" scheme, that allows the full 32-bit > counter to be realized. Essentially, an array of spinlocks are > -indexed into based upon the address of the atomic_t being operated > +indexed into based upon the address of the ``atomic_t`` being operated > on, and that lock protects the atomic operation. Parisc uses the > same scheme. > > -Another note is that the atomic_t operations returning values are > +Another note is that the ``atomic_t`` operations returning values are > extremely slow on an old 386. > > +Atomic Bitmask > +============== > + > We will now cover the atomic bitmask operations. You will find that > their SMP and memory barrier semantics are similar in shape and scope > -to the atomic_t ops above. > +to the ``atomic_t`` ops above. > > Native atomic bit operations are defined to operate on objects aligned > -to the size of an "unsigned long" C data type, and are least of that > -size. The endianness of the bits within each "unsigned long" are the > -native endianness of the cpu. > +to the size of an ``unsigned long`` C data type, and are least of that > +size. The endianness of the bits within each ``unsigned long`` are the > +native endianness of the cpu. :: > > - void set_bit(unsigned long nr, volatile unsigned long *addr); > - void clear_bit(unsigned long nr, volatile unsigned long *addr); > - void change_bit(unsigned long nr, volatile unsigned long *addr); > + void set_bit(unsigned long nr, volatile unsigned long *addr); > + void clear_bit(unsigned long nr, volatile unsigned long *addr); > + void change_bit(unsigned long nr, volatile unsigned long *addr); > > These routines set, clear, and change, respectively, the bit number > -indicated by "nr" on the bit mask pointed to by "ADDR". > +indicated by ``nr`` on the bit mask pointed to by ``addr``. > > They must execute atomically, yet there are no implicit memory barrier > -semantics required of these interfaces. > +semantics required of these interfaces. :: > > - int test_and_set_bit(unsigned long nr, volatile unsigned long *addr); > - int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr); > - int test_and_change_bit(unsigned long nr, volatile unsigned long *addr); > + int test_and_set_bit(unsigned long nr, volatile unsigned long *addr); > + int test_and_clear_bit(unsigned long nr, volatile unsigned long *addr); > + int test_and_change_bit(unsigned long nr, volatile unsigned long *addr); > > Like the above, except that these routines return a boolean which > indicates whether the changed bit was set _BEFORE_ the atomic bit > operation. > > -WARNING! It is incredibly important that the value be a boolean, > -ie. "0" or "1". Do not try to be fancy and save a few instructions by > -declaring the above to return "long" and just returning something like > -"old_val & mask" because that will not work. > +.. warning:: > > -For one thing, this return value gets truncated to int in many code > -paths using these interfaces, so on 64-bit if the bit is set in the > -upper 32-bits then testers will never see that. > + It is incredibly important that the value be a boolean, > + ie. "0" or "1". Do not try to be fancy and save a few instructions by > + declaring the above to return "long" and just returning something like > + "old_val & mask" because that will not work. > > -One great example of where this problem crops up are the thread_info > -flag operations. Routines such as test_and_set_ti_thread_flag() chop > -the return value into an int. 
There are other places where things > -like this occur as well. > + For one thing, this return value gets truncated to int in many code > + paths using these interfaces, so on 64-bit if the bit is set in the > + upper 32-bits then testers will never see that. > > -These routines, like the atomic_t counter operations returning values, > + One great example of where this problem crops up are the ``thread_info`` > + flag operations. Routines such as ``test_and_set_ti_thread_flag()`` chop > + the return value into an int. There are other places where things > + like this occur as well. > + > +These routines, like the ``atomic_t`` counter operations returning values, > must provide explicit memory barrier semantics around their execution. > All memory operations before the atomic bit operation call must be > made visible globally before the atomic bit operation is made visible. > Likewise, the atomic bit operation must be visible globally before any > subsequent memory operation is made visible. For example: > > - obj->dead = 1; > - if (test_and_set_bit(0, &obj->flags)) > - /* ... */; > - obj->killed = 1; > +.. code-block:: c Instead of using code-block, better to do: subsequent memory operation is made visible. For example:: > + > + obj->dead = 1; > + if (test_and_set_bit(0, &obj->flags)) > + /* ... */; > + obj->killed = 1; > > -The implementation of test_and_set_bit() must guarantee that > -"obj->dead = 1;" is visible to cpus before the atomic memory operation > -done by test_and_set_bit() becomes visible. Likewise, the atomic > -memory operation done by test_and_set_bit() must become visible before > -"obj->killed = 1;" is visible. > +The implementation of ``test_and_set_bit()`` must guarantee that > +``obj->dead = 1;`` is visible to cpus before the atomic memory operation > +done by ``test_and_set_bit()`` becomes visible. Likewise, the atomic > +memory operation done by ``test_and_set_bit()`` must become visible before > +``obj->killed = 1;`` is visible. > > -Finally there is the basic operation: > +Finally there is the basic operation:: > > - int test_bit(unsigned long nr, __const__ volatile unsigned long *addr); > + int test_bit(unsigned long nr, __const__ volatile unsigned long *addr); > > -Which returns a boolean indicating if bit "nr" is set in the bitmask > -pointed to by "addr". > +Which returns a boolean indicating if bit ``nr`` is set in the bitmask > +pointed to by ``addr``. > > -If explicit memory barriers are required around {set,clear}_bit() (which do > +If explicit memory barriers are required around ``{set,clear}_bit()`` (which do > not return a value, and thus does not need to provide memory barrier > -semantics), two interfaces are provided: > +semantics), two interfaces are provided:: > > - void smp_mb__before_atomic(void); > - void smp_mb__after_atomic(void); > + void smp_mb__before_atomic(void); > + void smp_mb__after_atomic(void); > > -They are used as follows, and are akin to their atomic_t operation > +They are used as follows, and are akin to their ``atomic_t`` operation > brothers: > > - /* All memory operations before this call will > - * be globally visible before the clear_bit(). > - */ > - smp_mb__before_atomic(); > - clear_bit( ... ); > +.. code-block:: c > > - /* The clear_bit() will be visible before all > - * subsequent memory operations. > - */ > - smp_mb__after_atomic(); > + /* All memory operations before this call will > + * be globally visible before the clear_bit(). > + */ > + smp_mb__before_atomic(); > + clear_bit( ... 
); > + > + /* The clear_bit() will be visible before all > + * subsequent memory operations. > + */ > + smp_mb__after_atomic(); > > There are two special bitops with lock barrier semantics (acquire/release, > same as spinlocks). These operate in the same way as their non-_lock/unlock > postfixed variants, except that they are to provide acquire/release semantics, > -respectively. This means they can be used for bit_spin_trylock and > -bit_spin_unlock type operations without specifying any more barriers. > +respectively. This means they can be used for ``bit_spin_trylock`` and > +``bit_spin_unlock`` type operations without specifying any more barriers. :: > > - int test_and_set_bit_lock(unsigned long nr, unsigned long *addr); > - void clear_bit_unlock(unsigned long nr, unsigned long *addr); > - void __clear_bit_unlock(unsigned long nr, unsigned long *addr); > + int test_and_set_bit_lock(unsigned long nr, unsigned long *addr); > + void clear_bit_unlock(unsigned long nr, unsigned long *addr); > + void __clear_bit_unlock(unsigned long nr, unsigned long *addr); > > -The __clear_bit_unlock version is non-atomic, however it still implements > +The ``__clear_bit_unlock`` version is non-atomic, however it still implements > unlock barrier semantics. This can be useful if the lock itself is protecting > the other bits in the word. > > @@ -526,41 +547,43 @@ provided. They are used in contexts where some other higher-level SMP > locking scheme is being used to protect the bitmask, and thus less > expensive non-atomic operations may be used in the implementation. > They have names similar to the above bitmask operation interfaces, > -except that two underscores are prefixed to the interface name. > +except that two underscores are prefixed to the interface name. :: > > - void __set_bit(unsigned long nr, volatile unsigned long *addr); > - void __clear_bit(unsigned long nr, volatile unsigned long *addr); > - void __change_bit(unsigned long nr, volatile unsigned long *addr); > - int __test_and_set_bit(unsigned long nr, volatile unsigned long *addr); > - int __test_and_clear_bit(unsigned long nr, volatile unsigned long *addr); > - int __test_and_change_bit(unsigned long nr, volatile unsigned long *addr); > + void __set_bit(unsigned long nr, volatile unsigned long *addr); > + void __clear_bit(unsigned long nr, volatile unsigned long *addr); > + void __change_bit(unsigned long nr, volatile unsigned long *addr); > + int __test_and_set_bit(unsigned long nr, volatile unsigned long *addr); > + int __test_and_clear_bit(unsigned long nr, volatile unsigned long *addr); > + int __test_and_change_bit(unsigned long nr, volatile unsigned long *addr); > > These non-atomic variants also do not require any special memory > barrier semantics. > > -The routines xchg() and cmpxchg() must provide the same exact > +The routines ``xchg()`` and ``cmpxchg()`` must provide the same exact > memory-barrier semantics as the atomic and bit operations returning > values. > > -Note: If someone wants to use xchg(), cmpxchg() and their variants, > -linux/atomic.h should be included rather than asm/cmpxchg.h, unless > -the code is in arch/* and can take care of itself. > +.. note:: > + > + If someone wants to use ``xchg()``, ``cmpxchg()`` and their variants, > + ``linux/atomic.h`` should be included rather than ``asm/cmpxchg.h``, unless > + the code is in arch/* and can take care of itself. > > Spinlocks and rwlocks have memory barrier expectations as well. 
> The rule to follow is simple: > > -1) When acquiring a lock, the implementation must make it globally > +1. When acquiring a lock, the implementation must make it globally > visible before any subsequent memory operation. > > -2) When releasing a lock, the implementation must make it such that > +2. When releasing a lock, the implementation must make it such that > all previous memory operations are globally visible before the > lock release. > > -Which finally brings us to _atomic_dec_and_lock(). There is an > -architecture-neutral version implemented in lib/dec_and_lock.c, > -but most platforms will wish to optimize this in assembler. > +Which finally brings us to ``_atomic_dec_and_lock()``. There is an > +architecture-neutral version implemented in ``lib/dec_and_lock.c``, > +but most platforms will wish to optimize this in assembler. :: > > - int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock); > + int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock); > > Atomically decrement the given counter, and if will drop to zero > atomically acquire the given spinlock and perform the decrement > @@ -573,68 +596,74 @@ sure the spinlock operation is globally visible before any > subsequent memory operation. > > We can demonstrate this operation more clearly if we define > -an abstract atomic operation: > +an abstract atomic operation:: > > - long cas(long *mem, long old, long new); > + long cas(long *mem, long old, long new); > > -"cas" stands for "compare and swap". It atomically: > +"``cas``" stands for "compare and swap". It atomically: > > -1) Compares "old" with the value currently at "mem". > -2) If they are equal, "new" is written to "mem". > -3) Regardless, the current value at "mem" is returned. > +1. Compares "old" with the value currently at "mem". > +2. If they are equal, "new" is written to "mem". > +3. Regardless, the current value at "mem" is returned. > > As an example usage, here is what an atomic counter update > might look like: > > -void example_atomic_inc(long *counter) > -{ > - long old, new, ret; > +.. code-block:: c Same here. Use :: instead of code-block. > > - while (1) { > - old = *counter; > - new = old + 1; > + void example_atomic_inc(long *counter) > + { > + long old, new, ret; > > - ret = cas(counter, old, new); > - if (ret == old) > - break; > - } > -} > + while (1) { > + old = *counter; > + new = old + 1; > + > + ret = cas(counter, old, new); > + if (ret == old) > + break; > + } > + } > > Let's use cas() in order to build a pseudo-C atomic_dec_and_lock(): > > -int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock) > -{ > - long old, new, ret; > - int went_to_zero; > - > - went_to_zero = 0; > - while (1) { > - old = atomic_read(atomic); > - new = old - 1; > - if (new == 0) { > - went_to_zero = 1; > - spin_lock(lock); > - } > - ret = cas(atomic, old, new); > - if (ret == old) > - break; > - if (went_to_zero) { > - spin_unlock(lock); > - went_to_zero = 0; > - } > - } > - > - return went_to_zero; > -} > - > -Now, as far as memory barriers go, as long as spin_lock() > +.. 
code-block:: c And here > + > + int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock) > + { > + long old, new, ret; > + int went_to_zero; > + > + went_to_zero = 0; > + while (1) { > + old = atomic_read(atomic); > + new = old - 1; > + if (new == 0) { > + went_to_zero = 1; > + spin_lock(lock); > + } > + ret = cas(atomic, old, new); > + if (ret == old) > + break; > + if (went_to_zero) { > + spin_unlock(lock); > + went_to_zero = 0; > + } > + } > + > + return went_to_zero; > + } > + > +Now, as far as memory barriers go, as long as i``spin_lock()`` > strictly orders all subsequent memory operations (including > -the cas()) with respect to itself, things will be fine. > +the ``cas()``) with respect to itself, things will be fine. > > -Said another way, _atomic_dec_and_lock() must guarantee that > +Said another way, ``_atomic_dec_and_lock()`` must guarantee that > a counter dropping to zero is never made visible before the > spinlock being acquired. > > -Note that this also means that for the case where the counter > -is not dropping to zero, there are no memory ordering > -requirements. > +.. note:: > + > + Note that this also means that for the case where the counter > + is not dropping to zero, there are no memory ordering > + requirements. > diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst > index 480d9a3..f3e5f5e 100644 > --- a/Documentation/core-api/index.rst > +++ b/Documentation/core-api/index.rst > @@ -8,6 +8,7 @@ Kernel and driver related documentation. > :maxdepth: 1 > > assoc_array > + atomic_ops > workqueue > > .. only:: subproject > diff --git a/Documentation/process/volatile-considered-harmful.rst b/Documentation/process/volatile-considered-harmful.rst > index e0d042a..4934e65 100644 > --- a/Documentation/process/volatile-considered-harmful.rst > +++ b/Documentation/process/volatile-considered-harmful.rst > @@ -1,3 +1,6 @@ > + > +.. _volatile_considered_harmful: > + > Why the "volatile" type class should not be used > ------------------------------------------------ > Thanks, Mauro