2016-03-02 13:08:44

by Jianyu Zhan

[permalink] [raw]
Subject: [PATCH v2] futex: replace bare barrier() with a READ_ONCE()

Commit e91467ecd1ef ("bug in futex unqueue_me") introduces a barrier()
in unqueue_me(), to address below problem.

The scenario is like this:

====================
original code:

retry:
lock_ptr = q->lock_ptr;
if (lock_ptr != 0) {
spin_lock(lock_ptr)
if (unlikely(lock_ptr != q->lock_ptr)) {
spin_unlock(lock_ptr);
goto retry;
}
...
}

====================
s390x generates code that is equivalent to:

retry:
if (q->lock_ptr != 0) {
spin_lock(q->lock_ptr)
if (unlikely(lock_ptr != q->lock_ptr)) {
spin_unlock(lock_ptr);
goto retry;
}
...
}

since q->lock_ptr might change between test of non-nullness and spin_lock(),
the double load may lead to problem. So that commit use a barrier() to prevent this.

This patch replaces this bare barrier() with a READ_ONCE().

The reasons are:

1) READ_ONCE() is a more weak form of barrier() that affect only the specific
accesses, while barrier() is a more general compiler level memroy barrier.

2) READ_ONCE() which could be more informative by its name, while a bare barrier()
without comment leads to quite a bit of perplexity.

3) READ_ONCE() _might_ prevent more _theoretical_ "optimizations" by the compiler:

The above retry logic is effectively the same as:

while (lock_ptr = READ_ONCE(q->lock_ptr)) {
spin_lock(lock_ptr)
if (unlikely(lock_ptr != q->lock_ptr)) {
spin_unlock(lock_ptr);
}

...
}

If without the READ_ONCE, the compiler _might_ optimize the load of q->lock_ptr out of the
test satement, which will also lead to the same pointer aliasing problem.

For why do compiler might do this, here is a snippet quoted from Documentation/memory-barriers.txt:

|(*) The compiler is within its rights to reload a variable, for example,
| in cases where high register pressure prevents the compiler from
| keeping all data of interest in registers. The compiler might
| therefore optimize the variable 'tmp' out of our previous example:
|
| while (tmp = a)
| do_something_with(tmp);
|
| This could result in the following code, which is perfectly safe in
| single-threaded code, but can be fatal in concurrent code:
|
| while (a)
| do_something_with(a);
|
| For example, the optimized version of this code could result in
| passing a zero to do_something_with() in the case where the variable
| a was modified by some other CPU between the "while" statement and
| the call to do_something_with().
|
| Again, use READ_ONCE() to prevent the compiler from doing this:
|
| while (tmp = READ_ONCE(a))
| do_something_with(tmp);


Suggested-by: Darren Hart <[email protected]>
Signed-off-by: Jianyu Zhan <[email protected]>
---
v1->v2:
According to suggestion by Darren Hart, revise the commit log to justify the replacement.

I also cc'ed the s390 maintainers, if they could help provide assembly code after this patch
to prove the correctness, that would be better.

kernel/futex.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index 5d6ce64..6796084 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1927,8 +1927,13 @@ static int unqueue_me(struct futex_q *q)

/* In the common case we don't take the spinlock, which is nice. */
retry:
- lock_ptr = q->lock_ptr;
- barrier();
+ /*
+ * On s390x, it was observed that compiler generates such code that spin_lock() will operate on
+ * another load of q->lock_ptr, instead of on @lock_ptr, and since q->lock_ptr might change between
+ * the test of non-nullness and the spin_lock(), which leads to problem. So use READ_ONCE() here to
+ * prevent this compiler "optimization".
+ */
+ lock_ptr = READ_ONCE(q->lock_ptr);
if (lock_ptr != NULL) {
spin_lock(lock_ptr);
/*
--
2.4.3


2016-03-02 13:36:02

by Christian Borntraeger

[permalink] [raw]
Subject: Re: [PATCH v2] futex: replace bare barrier() with a READ_ONCE()

On 03/02/2016 02:08 PM, Jianyu Zhan wrote:
> Commit e91467ecd1ef ("bug in futex unqueue_me") introduces a barrier()
> in unqueue_me(), to address below problem.
>
> The scenario is like this:
>
> ====================
> original code:
>
> retry:
> lock_ptr = q->lock_ptr;
> if (lock_ptr != 0) {
> spin_lock(lock_ptr)
> if (unlikely(lock_ptr != q->lock_ptr)) {
> spin_unlock(lock_ptr);
> goto retry;
> }
> ...
> }
>
> ====================
> s390x generates code that is equivalent to:
>
> retry:
> if (q->lock_ptr != 0) {
> spin_lock(q->lock_ptr)
> if (unlikely(lock_ptr != q->lock_ptr)) {
> spin_unlock(lock_ptr);
> goto retry;
> }
> ...
> }
>
> since q->lock_ptr might change between test of non-nullness and spin_lock(),
> the double load may lead to problem. So that commit use a barrier() to prevent this.
>
> This patch replaces this bare barrier() with a READ_ONCE().
>
> The reasons are:
>
> 1) READ_ONCE() is a more weak form of barrier() that affect only the specific
> accesses, while barrier() is a more general compiler level memroy barrier.
>
> 2) READ_ONCE() which could be more informative by its name, while a bare barrier()
> without comment leads to quite a bit of perplexity.
>
> 3) READ_ONCE() _might_ prevent more _theoretical_ "optimizations" by the compiler:
>
> The above retry logic is effectively the same as:
>
> while (lock_ptr = READ_ONCE(q->lock_ptr)) {
> spin_lock(lock_ptr)
> if (unlikely(lock_ptr != q->lock_ptr)) {
> spin_unlock(lock_ptr);
> }
>
> ...
> }
>
> If without the READ_ONCE, the compiler _might_ optimize the load of q->lock_ptr out of the
> test satement, which will also lead to the same pointer aliasing problem.
>
> For why do compiler might do this, here is a snippet quoted from Documentation/memory-barriers.txt:
>
> |(*) The compiler is within its rights to reload a variable, for example,
> | in cases where high register pressure prevents the compiler from
> | keeping all data of interest in registers. The compiler might
> | therefore optimize the variable 'tmp' out of our previous example:
> |
> | while (tmp = a)
> | do_something_with(tmp);
> |
> | This could result in the following code, which is perfectly safe in
> | single-threaded code, but can be fatal in concurrent code:
> |
> | while (a)
> | do_something_with(a);
> |
> | For example, the optimized version of this code could result in
> | passing a zero to do_something_with() in the case where the variable
> | a was modified by some other CPU between the "while" statement and
> | the call to do_something_with().
> |
> | Again, use READ_ONCE() to prevent the compiler from doing this:
> |
> | while (tmp = READ_ONCE(a))
> | do_something_with(tmp);


Not sure if you need that huge patch description, but it does not hurt either.


>
>
> Suggested-by: Darren Hart <[email protected]>
> Signed-off-by: Jianyu Zhan <[email protected]>
> ---
> v1->v2:
> According to suggestion by Darren Hart, revise the commit log to justify the replacement.
>
> I also cc'ed the s390 maintainers, if they could help provide assembly code after this patch
> to prove the correctness, that would be better.
>
> kernel/futex.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/futex.c b/kernel/futex.c
> index 5d6ce64..6796084 100644
> --- a/kernel/futex.c
> +++ b/kernel/futex.c
> @@ -1927,8 +1927,13 @@ static int unqueue_me(struct futex_q *q)
>
> /* In the common case we don't take the spinlock, which is nice. */
> retry:
> - lock_ptr = q->lock_ptr;
> - barrier();
> + /*
> + * On s390x, it was observed that compiler generates such code that spin_lock() will operate on
> + * another load of q->lock_ptr, instead of on @lock_ptr, and since q->lock_ptr might change between
> + * the test of non-nullness and the spin_lock(), which leads to problem. So use READ_ONCE() here to
> + * prevent this compiler "optimization".
> + */

That change makes sense. I did the original barrier back in 2006 (could you cc me next time?)
ACCESS_ONCE or READ_ONCE was not available at that time and its now the better way.

This is not an s390 specific problem, it was just triggered there as the gcc cost model
considered the memory read as cheap as a register read.

Maybe simplify the comment to something like

/* Prevent the compiler to read lock_ptr twice (if and spin_lock) */


> + lock_ptr = READ_ONCE(q->lock_ptr);
> if (lock_ptr != NULL) {
> spin_lock(lock_ptr);
> /*
>

2016-03-02 13:57:15

by Jianyu Zhan

[permalink] [raw]
Subject: Re: [PATCH v2] futex: replace bare barrier() with a READ_ONCE()

On Wed, Mar 2, 2016 at 9:35 PM, Christian Borntraeger
<[email protected]> wrote:
> That change makes sense. I did the original barrier back in 2006 (could you cc me next time?)

Sorry for that, I thought get_maintainer.pl would had spit the
original author's email, but apparently it didn't :(

> ACCESS_ONCE or READ_ONCE was not available at that time and its now the better way.
>
> This is not an s390 specific problem, it was just triggered there as the gcc cost model
> considered the memory read as cheap as a register read.
>

Thanks for the clarification.

> Maybe simplify the comment to something like
>
> /* Prevent the compiler to read lock_ptr twice (if and spin_lock) */

Will do.


Regards,
Jianyu Zhan

2016-03-02 16:41:30

by Darren Hart

[permalink] [raw]
Subject: Re: [PATCH v2] futex: replace bare barrier() with a READ_ONCE()

On Wed, Mar 02, 2016 at 09:56:32PM +0800, Jianyu Zhan wrote:
> On Wed, Mar 2, 2016 at 9:35 PM, Christian Borntraeger
> <[email protected]> wrote:
> > That change makes sense. I did the original barrier back in 2006 (could you cc me next time?)
>
> Sorry for that, I thought get_maintainer.pl would had spit the
> original author's email, but apparently it didn't :(

get_maintainer gives you the documented maintainers per the MAINTAINERS file, as
well as frequent committers. This patch was in 2008. Always include the author
of patches you are referencing (and that's manual).

>
> > ACCESS_ONCE or READ_ONCE was not available at that time and its now the better way.
> >

Noting this specifically in the commit message helps the argument and why the
change is being made now.

> > This is not an s390 specific problem, it was just triggered there as the gcc cost model
> > considered the memory read as cheap as a register read.
> >
>
> Thanks for the clarification.
>

And as with the original patch (e91467ecd1ef), we should be able to justify the
change with an assembly dump, as Christian did in his commit.

--
Darren Hart
Intel Open Source Technology Center

2016-03-02 16:48:29

by Darren Hart

[permalink] [raw]
Subject: Re: [PATCH v2] futex: replace bare barrier() with a READ_ONCE()

On Wed, Mar 02, 2016 at 09:08:31PM +0800, Jianyu Zhan wrote:

...

> 3) READ_ONCE() _might_ prevent more _theoretical_ "optimizations" by the compiler:
>
> The above retry logic is effectively the same as:
>
> while (lock_ptr = READ_ONCE(q->lock_ptr)) {
> spin_lock(lock_ptr)


The spin_lock() is memory barrier, and therefor a general compiler barrier. The
READ_ONCE would be redundant in this case.

Unless you can demonstrate a failure mode in disassembly, or can point out how
the spin_lock barrier is insufficient that I have missed, this third point is
already covered.

--
Darren Hart
Intel Open Source Technology Center