2011-06-14 09:00:07

by Lai Jiangshan

[permalink] [raw]
Subject: [PATCH] rcu,doc: lock-free update site

Add a document which describes a pattern of using RCU to implement lock-free(lockless)
update site.

Singed-off-by: Lai Jiangshan <[email protected]>
---
Documentation/RCU/00-INDEX | 2 +
Documentation/RCU/lock-free-update-site.txt | 143 +++++++++++++++++++++++++++
2 files changed, 145 insertions(+), 0 deletions(-)

diff --git a/Documentation/RCU/00-INDEX b/Documentation/RCU/00-INDEX
index 1d7a885..7178dd5 100644
--- a/Documentation/RCU/00-INDEX
+++ b/Documentation/RCU/00-INDEX
@@ -8,6 +8,8 @@ listRCU.txt
- Using RCU to Protect Read-Mostly Linked Lists
lockdep.txt
- RCU and lockdep checking
+lock-free-update-site.txt
+ - RCU pattern of lock-free update site
NMI-RCU.txt
- Using RCU to Protect Dynamic NMI Handlers
rcubarrier.txt
diff --git a/Documentation/RCU/lock-free-update-site.txt b/Documentation/RCU/lock-free-update-site.txt
new file mode 100644
index 0000000..9b6984a
--- /dev/null
+++ b/Documentation/RCU/lock-free-update-site.txt
@@ -0,0 +1,143 @@
+Lock-free(lockless) update site
+
+This article describes a pattern of using RCU to implement lock-free(lockless)
+update site. RCU update site is considered call-rare and it is protected
+by a update-site lock generally. But blocking algorithms are undesirable
+in some cases for some reasons, thus, this pattern may help.
+
+This pattern can only protect a single pointer which is the only reference
+of the object.
+
+object pointer:
+
+struct my_struct *gptr;
+
+wait-free read site:
+{
+ rcu_read_lock();
+ ptr = rcu_dereference(gptr);
+ my_struct_read(ptr);
+ rcu_read_unlock();
+}
+
+lock-free update site(update as new):
+{
+ new_ptr = my_struct_alloc();
+ for (;;) {
+ rcu_read_lock();
+
+ old_ptr = rcu_dereference(gptr);
+
+ /* copy data from old_ptr to new_ptr and update it */
+ my_struct_update(new_ptr, old_ptr);
+
+ /* atomically publish the new_ptr and de-publish the old_ptr */
+ if (cmpxchg(&gptr, old_ptr, new_ptr) == old_ptr) {
+ rcu_read_unlock();
+
+ /*
+ * free it after a grace-period, read sites and other
+ * update sites may be reading it in parallel.
+ */
+ kfree_rcu(old_ptr);
+
+ /* success, exit the loop */
+ break;
+ } else {
+ rcu_read_unlock();
+
+ /*
+ * Other update site successfully update it, we need
+ * to read the latest data and try the update again.
+ *
+ * If the other update site did the same thing we need,
+ * we can free the new_ptr and exit this loop too,
+ * and it may becomes a wait-free algorithm.
+ */
+ }
+ }
+}
+
+1) In update site, rcu_read_lock() is needed for my_struct_update().
+
+ In this kind of lock-free update site, many update sides
+ may run parallel, other update side may had successfully
+ de-published old_ptr and tried to free it. rcu_read_lock()
+ prevents old_ptr from freeing and ensures it valid for
+ my_struct_update().
+
+2) In update site, rcu_read_lock() is needed until cmpxchg() finished.
+
+ Although the content of old_ptr is not accessed when cmpxchg(),
+ but old_ptr should not be freed until cmpxchg() finished.
+ Otherwise we may miss other successful update and publish a
+ new_ptr without information from the latest object.
+
+ Example:(wrong update site code, rcu_read_unlock() is moved up before cmpxchg())
+ (cause ABA-problem: http://en.wikipedia.org/wiki/ABA_problem)
+
+ CPU0 CPU1
+ rcu_read_lock()
+ old_ptr = rcu_dereference(gptr);
+ my_struct_update(new_ptr, old_ptr);
+ rcu_read_unlock();
+ . successfully update, now gptr=other_ptr
+ . old_ptr is freed
+ .
+ . other update, my_struct_alloc() returns old_ptr
+ . successfully publish and de-publish
+ . now gptr=old_ptr again
+ .
+ cmpxchg(&gptr, old_ptr, new_ptr)
+ cmpxchg() success, but the 2 updates
+ of CPU1 are completely missed.
+
+ This exmaple shows rcu_read_lock() is needed to prevent old_ptr from reusing
+ before cmpxchg() finished and to prevent ABA-problem.
+
+3) Beware NULL pointer.
+
+ Some use cases may set gptr to NULL when needed. (the previous gptr != NULL)
+
+lock-free update site(dispose, wait-free):
+{
+ old_ptr = xchg(&gptr, NULL);
+ if (old_ptr != NULL)
+ kfree_rcu(old_ptr);
+}
+
+ This code cause NULL reusing and may cause ABA-problem like above example:
+
+ CPU0 CPU1
+ rcu_read_lock()
+ old_ptr = rcu_dereference(gptr);
+ /* old_ptr = NULL */
+ my_struct_update(new_ptr, NULL);
+ . successfully update, now gptr=other_ptr
+ .
+ . successfully dispose
+ . now gptr=NULL again
+ .
+ cmpxchg(&gptr, NULL, new_ptr)
+ cmpxchg() success, but the update
+ and the dispose of CPU1 are missed
+ consideration by CPU0.
+ rcu_read_unlock();
+
+ In many use cases, these behaviors are OK. In these use cases,
+ my_struct_update(new_ptr, NULL) give us the same result even we retry.
+
+ But in some raw use cases(I can't find any use-case now, I believe it exist),
+ the missed considerations of the updates are not acceptable, in this case,
+ we should use different null-value for NULL pointer for every disposing.
+
+lock-free update site(dispose, wait-free, paranoid version):
+{
+ null_ptr = alloc_null_ptr();
+ old_ptr = xchg(&gptr, null_ptr);
+ if (is_null_ptr(old_ptr))
+ free_null_ptr_by_rcu_for_preventing_it_from_reusing(old_ptr);
+ else
+ kfree_rcu(old_ptr);
+}
+


2011-06-14 12:50:41

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH] rcu,doc: lock-free update site

* Lai Jiangshan ([email protected]) wrote:
> Add a document which describes a pattern of using RCU to implement lock-free(lockless)
> update site.
>
[...]
> @@ -0,0 +1,143 @@
> +Lock-free(lockless) update site
> +
> +This article describes a pattern of using RCU to implement lock-free(lockless)
> +update site. RCU update site is considered call-rare and it is protected
> +by a update-site lock generally. But blocking algorithms are undesirable
> +in some cases for some reasons, thus, this pattern may help.

Hi Lai,

Yes, using this kind of rcu read-side lock to protect against the
cmpxchg ABA problem is well-known (to me at least) ;) I used this
technique in the userspace RCU library "lock-free queue" and "lock-free
stack" in 2010*. Please feel free to dig through my RCU data containers code
to bring in more data structure examples:

http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD

http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD

Thanks!

Mathieu

* AFAIK I introduced this technique using RCU read-side C.S. to deal
with cmpxchg ABA at that point, but someone might have thought about
it before me without my knowledge. My litterature survey so far
indicates that using a double-word CAS on a pointer/counter was one of
the usual technique used to protect against cmpxchg ABA so far. Other
techniques imply allocating elements in a limited-size array (so a
simple cmpxchg can update the array index and counter atomically),
Hasard Pointers, or having a full-blown GC which provides similar
guarantees to the RCU grace period with a read-side lock held.
Ref.:

[1998] Maged Michael, Michael Scott "Simple, fast, and practical non-blocking and blocking concurrent queue algorithms"
[2002] Maged M.Michael "Safe memory reclamation for dynamic lock-free objects using atomic reads and writes"
[2003] Maged M.Michael "Hazard Pointers: Safe memory reclamation for lock-free objects"

> +
> +This pattern can only protect a single pointer which is the only reference
> +of the object.
> +
> +object pointer:
> +
> +struct my_struct *gptr;
> +
> +wait-free read site:
> +{
> + rcu_read_lock();
> + ptr = rcu_dereference(gptr);
> + my_struct_read(ptr);
> + rcu_read_unlock();
> +}
> +
> +lock-free update site(update as new):
> +{
> + new_ptr = my_struct_alloc();
> + for (;;) {
> + rcu_read_lock();
> +
> + old_ptr = rcu_dereference(gptr);
> +
> + /* copy data from old_ptr to new_ptr and update it */
> + my_struct_update(new_ptr, old_ptr);
> +
> + /* atomically publish the new_ptr and de-publish the old_ptr */
> + if (cmpxchg(&gptr, old_ptr, new_ptr) == old_ptr) {
> + rcu_read_unlock();
> +
> + /*
> + * free it after a grace-period, read sites and other
> + * update sites may be reading it in parallel.
> + */
> + kfree_rcu(old_ptr);
> +
> + /* success, exit the loop */
> + break;
> + } else {
> + rcu_read_unlock();
> +
> + /*
> + * Other update site successfully update it, we need
> + * to read the latest data and try the update again.
> + *
> + * If the other update site did the same thing we need,
> + * we can free the new_ptr and exit this loop too,
> + * and it may becomes a wait-free algorithm.
> + */
> + }



> + }
> +}
> +
> +1) In update site, rcu_read_lock() is needed for my_struct_update().
> +
> + In this kind of lock-free update site, many update sides
> + may run parallel, other update side may had successfully
> + de-published old_ptr and tried to free it. rcu_read_lock()
> + prevents old_ptr from freeing and ensures it valid for
> + my_struct_update().
> +
> +2) In update site, rcu_read_lock() is needed until cmpxchg() finished.
> +
> + Although the content of old_ptr is not accessed when cmpxchg(),
> + but old_ptr should not be freed until cmpxchg() finished.
> + Otherwise we may miss other successful update and publish a
> + new_ptr without information from the latest object.
> +
> + Example:(wrong update site code, rcu_read_unlock() is moved up before cmpxchg())
> + (cause ABA-problem: http://en.wikipedia.org/wiki/ABA_problem)
> +
> + CPU0 CPU1
> + rcu_read_lock()
> + old_ptr = rcu_dereference(gptr);
> + my_struct_update(new_ptr, old_ptr);
> + rcu_read_unlock();
> + . successfully update, now gptr=other_ptr
> + . old_ptr is freed
> + .
> + . other update, my_struct_alloc() returns old_ptr
> + . successfully publish and de-publish
> + . now gptr=old_ptr again
> + .
> + cmpxchg(&gptr, old_ptr, new_ptr)
> + cmpxchg() success, but the 2 updates
> + of CPU1 are completely missed.
> +
> + This exmaple shows rcu_read_lock() is needed to prevent old_ptr from reusing
> + before cmpxchg() finished and to prevent ABA-problem.
> +
> +3) Beware NULL pointer.
> +
> + Some use cases may set gptr to NULL when needed. (the previous gptr != NULL)
> +
> +lock-free update site(dispose, wait-free):
> +{
> + old_ptr = xchg(&gptr, NULL);
> + if (old_ptr != NULL)
> + kfree_rcu(old_ptr);
> +}
> +
> + This code cause NULL reusing and may cause ABA-problem like above example:
> +
> + CPU0 CPU1
> + rcu_read_lock()
> + old_ptr = rcu_dereference(gptr);
> + /* old_ptr = NULL */
> + my_struct_update(new_ptr, NULL);
> + . successfully update, now gptr=other_ptr
> + .
> + . successfully dispose
> + . now gptr=NULL again
> + .
> + cmpxchg(&gptr, NULL, new_ptr)
> + cmpxchg() success, but the update
> + and the dispose of CPU1 are missed
> + consideration by CPU0.
> + rcu_read_unlock();
> +
> + In many use cases, these behaviors are OK. In these use cases,
> + my_struct_update(new_ptr, NULL) give us the same result even we retry.
> +
> + But in some raw use cases(I can't find any use-case now, I believe it exist),
> + the missed considerations of the updates are not acceptable, in this case,
> + we should use different null-value for NULL pointer for every disposing.
> +
> +lock-free update site(dispose, wait-free, paranoid version):
> +{
> + null_ptr = alloc_null_ptr();
> + old_ptr = xchg(&gptr, null_ptr);
> + if (is_null_ptr(old_ptr))
> + free_null_ptr_by_rcu_for_preventing_it_from_reusing(old_ptr);
> + else
> + kfree_rcu(old_ptr);
> +}
> +

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

2011-06-16 02:40:12

by Lai Jiangshan

[permalink] [raw]
Subject: Re: [PATCH] rcu,doc: lock-free update site

On 06/14/2011 08:50 PM, Mathieu Desnoyers wrote:
> * Lai Jiangshan ([email protected]) wrote:
>> Add a document which describes a pattern of using RCU to implement lock-free(lockless)
>> update site.
>>
> [...]
>> @@ -0,0 +1,143 @@
>> +Lock-free(lockless) update site
>> +
>> +This article describes a pattern of using RCU to implement lock-free(lockless)
>> +update site. RCU update site is considered call-rare and it is protected
>> +by a update-site lock generally. But blocking algorithms are undesirable
>> +in some cases for some reasons, thus, this pattern may help.
>
> Hi Lai,
>
> Yes, using this kind of rcu read-side lock to protect against the
> cmpxchg ABA problem is well-known (to me at least) ;) I used this
> technique in the userspace RCU library "lock-free queue" and "lock-free
> stack" in 2010*. Please feel free to dig through my RCU data containers code
> to bring in more data structure examples:
>
> http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD
>
> http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD
>
> Thanks!
>
> Mathieu


Hi, Mathieu

I will try to make simple update site wait-free, so I wrote a simple guide/document
at first. I didn't notice your work. Your work is excellent,
I will add some references. I hope you rewrite/pretty this document also.

The lock-free stack is a good example, but the lock-free queue is not good here I think.

Thanks,
Lai


>
> * AFAIK I introduced this technique using RCU read-side C.S. to deal
> with cmpxchg ABA at that point, but someone might have thought about
> it before me without my knowledge. My litterature survey so far
> indicates that using a double-word CAS on a pointer/counter was one of
> the usual technique used to protect against cmpxchg ABA so far. Other
> techniques imply allocating elements in a limited-size array (so a
> simple cmpxchg can update the array index and counter atomically),
> Hasard Pointers, or having a full-blown GC which provides similar
> guarantees to the RCU grace period with a read-side lock held.
> Ref.:
>
> [1998] Maged Michael, Michael Scott "Simple, fast, and practical non-blocking and blocking concurrent queue algorithms"
> [2002] Maged M.Michael "Safe memory reclamation for dynamic lock-free objects using atomic reads and writes"
> [2003] Maged M.Michael "Hazard Pointers: Safe memory reclamation for lock-free objects"
>

2011-06-16 04:06:43

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH] rcu,doc: lock-free update site

On Thu, Jun 16, 2011 at 10:40:16AM +0800, Lai Jiangshan wrote:
> On 06/14/2011 08:50 PM, Mathieu Desnoyers wrote:
> > * Lai Jiangshan ([email protected]) wrote:
> >> Add a document which describes a pattern of using RCU to implement lock-free(lockless)
> >> update site.
> >>
> > [...]
> >> @@ -0,0 +1,143 @@
> >> +Lock-free(lockless) update site
> >> +
> >> +This article describes a pattern of using RCU to implement lock-free(lockless)
> >> +update site. RCU update site is considered call-rare and it is protected
> >> +by a update-site lock generally. But blocking algorithms are undesirable
> >> +in some cases for some reasons, thus, this pattern may help.
> >
> > Hi Lai,
> >
> > Yes, using this kind of rcu read-side lock to protect against the
> > cmpxchg ABA problem is well-known (to me at least) ;) I used this
> > technique in the userspace RCU library "lock-free queue" and "lock-free
> > stack" in 2010*. Please feel free to dig through my RCU data containers code
> > to bring in more data structure examples:
> >
> > http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD
> >
> > http://git.lttng.org/?p=userspace-rcu.git;a=blob;f=urcu/static/rculfqueue.h;h=b627e450cfdd581692b474d89437e3fd47f18463;hb=HEAD
> >
> > Thanks!
> >
> > Mathieu
>
>
> Hi, Mathieu
>
> I will try to make simple update site wait-free, so I wrote a simple guide/document
> at first. I didn't notice your work. Your work is excellent,
> I will add some references. I hope you rewrite/pretty this document also.
>
> The lock-free stack is a good example, but the lock-free queue is not good here I think.

I look forward to seeing the update!

Thanx, Paul

> Thanks,
> Lai
>
>
> >
> > * AFAIK I introduced this technique using RCU read-side C.S. to deal
> > with cmpxchg ABA at that point, but someone might have thought about
> > it before me without my knowledge. My litterature survey so far
> > indicates that using a double-word CAS on a pointer/counter was one of
> > the usual technique used to protect against cmpxchg ABA so far. Other
> > techniques imply allocating elements in a limited-size array (so a
> > simple cmpxchg can update the array index and counter atomically),
> > Hasard Pointers, or having a full-blown GC which provides similar
> > guarantees to the RCU grace period with a read-side lock held.
> > Ref.:
> >
> > [1998] Maged Michael, Michael Scott "Simple, fast, and practical non-blocking and blocking concurrent queue algorithms"
> > [2002] Maged M.Michael "Safe memory reclamation for dynamic lock-free objects using atomic reads and writes"
> > [2003] Maged M.Michael "Hazard Pointers: Safe memory reclamation for lock-free objects"
> >
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/