2015-08-06 12:41:46

by Vineet Gupta

Subject: [PATCH 0/4] ARC futex fixes

Hi Thomas/Peter,

Of late I've been debugging a seemingly lost wakeup when running a specific
EEMBC Multibench workload (4M-check -w4) on a quad-core HS38 config on FPGA
(CONFIG_SMP, CONFIG_PREEMPT).

I've yet to nail that issue down, but in the process found some deficiencies
in the ARC futex backend. Can you please take a quick look and shout if there's
something wrong there?

The futex backend for ARC, despite its age, only recently started getting
real testing, due to the recent switch to an NPTL-based userland (uClibc).

Thx,
-Vineet

Vineet Gupta (4):
ARC: add barriers to futex code
ARC: futex cosmetics
ARC: make futex_atomic_cmpxchg_inatomic() return bimodal
ARC: Enable HAVE_FUTEX_CMPXCHG

arch/arc/Kconfig | 1 +
arch/arc/include/asm/futex.h | 44 +++++++++++++++++++++++++-------------------
2 files changed, 26 insertions(+), 19 deletions(-)

--
1.9.1


2015-08-06 12:37:21

by Vineet Gupta

Subject: [PATCH 1/4] ARC: add barriers to futex code

The atomic ops on futex need to provide the full barrier just like
regular atomics in kernel.

Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic()
as core code already does that
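
In effect each futex op is now bracketed by full barriers, i.e. roughly
(a sketch of the resulting shape, not the literal macro expansion):

	smp_mb();		/* order prior accesses before the atomic r-m-w */
	__asm__ __volatile__(	/* llock/<op>/scond retry loop (or ld/st for !LLSC), as below */
				... );
	smp_mb();		/* order the atomic r-m-w before later accesses */

i.e. the op is fully ordered, matching what the regular kernel atomics provide.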

Cc: David Hildenbrand <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
arch/arc/include/asm/futex.h | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/arc/include/asm/futex.h b/arch/arc/include/asm/futex.h
index 70cfe16b742d..160656d0a15a 100644
--- a/arch/arc/include/asm/futex.h
+++ b/arch/arc/include/asm/futex.h
@@ -20,6 +20,7 @@

#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg)\
\
+ smp_mb(); \
__asm__ __volatile__( \
"1: llock %1, [%2] \n" \
insn "\n" \
@@ -40,12 +41,14 @@
\
: "=&r" (ret), "=&r" (oldval) \
: "r" (uaddr), "r" (oparg), "ir" (-EFAULT) \
- : "cc", "memory")
+ : "cc", "memory"); \
+ smp_mb(); \

#else /* !CONFIG_ARC_HAS_LLSC */

#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg)\
\
+ smp_mb(); \
__asm__ __volatile__( \
"1: ld %1, [%2] \n" \
insn "\n" \
@@ -65,7 +68,8 @@
\
: "=&r" (ret), "=&r" (oldval) \
: "r" (uaddr), "r" (oparg), "ir" (-EFAULT) \
- : "cc", "memory")
+ : "cc", "memory"); \
+ smp_mb(); \

#endif

@@ -151,7 +155,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval,
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
return -EFAULT;

- pagefault_disable();
+ smp_mb();

__asm__ __volatile__(
#ifdef CONFIG_ARC_HAS_LLSC
@@ -178,7 +182,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval,
: "r"(oldval), "r"(newval), "r"(uaddr), "ir"(-EFAULT)
: "cc", "memory");

- pagefault_enable();
+ smp_mb();

*uval = val;
return val;
--
1.9.1

2015-08-06 12:36:07

by Vineet Gupta

Subject: [PATCH 2/4] ARC: futex cosmetics

Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
arch/arc/include/asm/futex.h | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arc/include/asm/futex.h b/arch/arc/include/asm/futex.h
index 160656d0a15a..82902ff0517f 100644
--- a/arch/arc/include/asm/futex.h
+++ b/arch/arc/include/asm/futex.h
@@ -94,6 +94,7 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
__futex_atomic_op("mov %0, %3", ret, oldval, uaddr, oparg);
break;
case FUTEX_OP_ADD:
+ /* oldval = *uaddr; *uaddr += oparg ; ret = *uaddr */
__futex_atomic_op("add %0, %1, %3", ret, oldval, uaddr, oparg);
break;
case FUTEX_OP_OR:
@@ -147,12 +148,12 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
* -user access r/w fails: return -EFAULT
*/
static inline int
-futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval,
- u32 newval)
+futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 expval,
+ u32 newval)
{
- u32 val;
+ u32 existval;

- if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
+ if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
return -EFAULT;

smp_mb();
@@ -178,14 +179,14 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval,
" .word 1b, 4b \n"
" .word 2b, 4b \n"
" .previous\n"
- : "=&r"(val)
- : "r"(oldval), "r"(newval), "r"(uaddr), "ir"(-EFAULT)
+ : "=&r"(existval)
+ : "r"(expval), "r"(newval), "r"(uaddr), "ir"(-EFAULT)
: "cc", "memory");

smp_mb();

- *uval = val;
- return val;
+ *uval = existval;
+ return existval;
}

#endif
--
1.9.1

2015-08-06 12:36:13

by Vineet Gupta

Subject: [PATCH 3/4] ARC: make futex_atomic_cmpxchg_inatomic() return bimodal

Callers of cmpxchg_futex_value_locked() in futex code expect a bimodal
return value:
  !0 (essentially -EFAULT, i.e. failure)
   0 (success)

Before this patch, the success return value was the old value of the futex,
which could very well be non-zero, causing the caller to possibly take the
failure path erroneously.

Fix that by returning 0 for success

(This fix was done back in 2011 for all upstream arches, which ARC
obviously missed)
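
For illustration, a caller along these lines (hypothetical snippet, not the
actual kernel/futex.c code) treats any non-zero return as a fault:

	u32 curval;

	if (cmpxchg_futex_value_locked(&curval, uaddr, uval, newval))
		return -EFAULT;		/* any non-zero return taken as failure */

	/* success: curval holds the value read back from the user futex */

so returning the (possibly non-zero) old value on success sends such callers
down the failure path.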

Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
arch/arc/include/asm/futex.h | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/arch/arc/include/asm/futex.h b/arch/arc/include/asm/futex.h
index 82902ff0517f..6c8f207a5f02 100644
--- a/arch/arc/include/asm/futex.h
+++ b/arch/arc/include/asm/futex.h
@@ -151,6 +151,7 @@ static inline int
futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 expval,
u32 newval)
{
+ int ret = 0;
u32 existval;

if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
@@ -160,18 +161,18 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 expval,

__asm__ __volatile__(
#ifdef CONFIG_ARC_HAS_LLSC
- "1: llock %0, [%3] \n"
- " brne %0, %1, 3f \n"
- "2: scond %2, [%3] \n"
+ "1: llock %1, [%4] \n"
+ " brne %1, %2, 3f \n"
+ "2: scond %3, [%4] \n"
" bnz 1b \n"
#else
- "1: ld %0, [%3] \n"
- " brne %0, %1, 3f \n"
- "2: st %2, [%3] \n"
+ "1: ld %1, [%4] \n"
+ " brne %1, %2, 3f \n"
+ "2: st %3, [%4] \n"
#endif
"3: \n"
" .section .fixup,\"ax\" \n"
- "4: mov %0, %4 \n"
+ "4: mov %0, %5 \n"
" b 3b \n"
" .previous \n"
" .section __ex_table,\"a\" \n"
@@ -179,14 +180,14 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 expval,
" .word 1b, 4b \n"
" .word 2b, 4b \n"
" .previous\n"
- : "=&r"(existval)
+ : "+&r"(ret), "=&r"(existval)
: "r"(expval), "r"(newval), "r"(uaddr), "ir"(-EFAULT)
: "cc", "memory");

smp_mb();

*uval = existval;
- return existval;
+ return ret;
}

#endif
--
1.9.1

2015-08-06 12:41:47

by Vineet Gupta

Subject: [PATCH 4/4] ARC: Enable HAVE_FUTEX_CMPXCHG

ARC doesn't need the runtime detection of futex cmpxchg op
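
For context, without HAVE_FUTEX_CMPXCHG the futex core probes the arch at
boot, roughly along these lines (paraphrased sketch, shown only to illustrate
what the select short-circuits):

	/* futex_init() probe: a cmpxchg on a NULL user address must fault */
	u32 curval;

	if (cmpxchg_futex_value_locked(&curval, NULL, 0, 0) == -EFAULT)
		futex_cmpxchg_enabled = 1;

ARC always provides a working futex_atomic_cmpxchg_inatomic(), so the runtime
probe is unnecessary.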

Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
arch/arc/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index bd4670d1b89b..bb0d09159714 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -22,6 +22,7 @@ config ARC
select GENERIC_SMP_IDLE_THREAD
select HAVE_ARCH_KGDB
select HAVE_ARCH_TRACEHOOK
+ select HAVE_FUTEX_CMPXCHG
select HAVE_IOREMAP_PROT
select HAVE_KPROBES
select HAVE_KRETPROBES
--
1.9.1

2015-08-06 13:15:52

by David Hildenbrand

Subject: Re: [PATCH 1/4] ARC: add barriers to futex code

> The atomic ops on futex need to provide the full barrier just like
> regular atomics in kernel.
>
> Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic()
> as core code already does that
>
> Cc: David Hildenbrand <[email protected]>
> Cc: Peter Zijlstra (Intel) <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Michel Lespinasse <[email protected]>
> Signed-off-by: Vineet Gupta <[email protected]>
> ---
> arch/arc/include/asm/futex.h | 12 ++++++++----
> 1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arc/include/asm/futex.h b/arch/arc/include/asm/futex.h
> index 70cfe16b742d..160656d0a15a 100644
> --- a/arch/arc/include/asm/futex.h
> +++ b/arch/arc/include/asm/futex.h
> @@ -20,6 +20,7 @@
>
> #define __futex_atomic_op(insn, ret, oldval, uaddr, oparg)\
> \
> + smp_mb(); \
> __asm__ __volatile__( \
> "1: llock %1, [%2] \n" \
> insn "\n" \
> @@ -40,12 +41,14 @@
> \
> : "=&r" (ret), "=&r" (oldval) \
> : "r" (uaddr), "r" (oparg), "ir" (-EFAULT) \
> - : "cc", "memory")
> + : "cc", "memory"); \
> + smp_mb(); \

I think you should drop the ;

>
> #else /* !CONFIG_ARC_HAS_LLSC */
>
> #define __futex_atomic_op(insn, ret, oldval, uaddr, oparg)\
> \
> + smp_mb(); \
> __asm__ __volatile__( \
> "1: ld %1, [%2] \n" \
> insn "\n" \
> @@ -65,7 +68,8 @@
> \
> : "=&r" (ret), "=&r" (oldval) \
> : "r" (uaddr), "r" (oparg), "ir" (-EFAULT) \
> - : "cc", "memory")
> + : "cc", "memory"); \
> + smp_mb(); \

ditto

>
> #endif
>
> @@ -151,7 +155,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval,
> if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
> return -EFAULT;
>
> - pagefault_disable();
> + smp_mb();
>
> __asm__ __volatile__(
> #ifdef CONFIG_ARC_HAS_LLSC
> @@ -178,7 +182,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval,
> : "r"(oldval), "r"(newval), "r"(uaddr), "ir"(-EFAULT)
> : "cc", "memory");
>
> - pagefault_enable();
> + smp_mb();
>
> *uval = val;
> return val;

Looks like pagefault_() magic is only required for futex_atomic_op_inuser. So
this should be fine (and arc seems to be the only arch left that has it in
_inatomic).

Not sure if you want to change the comment:

/* Compare-xchg with pagefaults disabled.
* Notes:
* -Best-Effort: Exchg happens only if compare succeeds.

Maybe something like "Compare-xchg: pagefaults have to be disabled by the
caller"

Looks sane to me.

David

2015-08-06 13:28:14

by Vineet Gupta

Subject: Re: [PATCH 1/4] ARC: add barriers to futex code

On Thursday 06 August 2015 06:45 PM, David Hildenbrand wrote:
>> The atomic ops on futex need to provide the full barrier just like
>> regular atomics in kernel.
>>
>> Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic()
>> as core code already does that
>>
>> Cc: David Hildenbrand <[email protected]>
>> Cc: Peter Zijlstra (Intel) <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: Michel Lespinasse <[email protected]>
>> Signed-off-by: Vineet Gupta <[email protected]>
>> ---
>> arch/arc/include/asm/futex.h | 12 ++++++++----
>> 1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/arc/include/asm/futex.h b/arch/arc/include/asm/futex.h
>> index 70cfe16b742d..160656d0a15a 100644
>> --- a/arch/arc/include/asm/futex.h
>> +++ b/arch/arc/include/asm/futex.h
>> @@ -20,6 +20,7 @@
>>
>> #define __futex_atomic_op(insn, ret, oldval, uaddr, oparg)\
>> \
>> + smp_mb(); \
>> __asm__ __volatile__( \
>> "1: llock %1, [%2] \n" \
>> insn "\n" \
>> @@ -40,12 +41,14 @@
>> \
>> : "=&r" (ret), "=&r" (oldval) \
>> : "r" (uaddr), "r" (oparg), "ir" (-EFAULT) \
>> - : "cc", "memory")
>> + : "cc", "memory"); \
>> + smp_mb(); \
> I think you should drop the ;

OK sure !

>
>> #else /* !CONFIG_ARC_HAS_LLSC */
>>
>> #define __futex_atomic_op(insn, ret, oldval, uaddr, oparg)\
>> \
>> + smp_mb(); \
>> __asm__ __volatile__( \
>> "1: ld %1, [%2] \n" \
>> insn "\n" \
>> @@ -65,7 +68,8 @@
>> \
>> : "=&r" (ret), "=&r" (oldval) \
>> : "r" (uaddr), "r" (oparg), "ir" (-EFAULT) \
>> - : "cc", "memory")
>> + : "cc", "memory"); \
>> + smp_mb(); \
> ditto

OK !

>
>> #endif
>>
>> @@ -151,7 +155,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval,
>> if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
>> return -EFAULT;
>>
>> - pagefault_disable();
>> + smp_mb();
>>
>> __asm__ __volatile__(
>> #ifdef CONFIG_ARC_HAS_LLSC
>> @@ -178,7 +182,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 oldval,
>> : "r"(oldval), "r"(newval), "r"(uaddr), "ir"(-EFAULT)
>> : "cc", "memory");
>>
>> - pagefault_enable();
>> + smp_mb();
>>
>> *uval = val;
>> return val;
> Looks like pagefault_() magic is only required for futex_atomic_op_inuser. So
> this should be fine (and arc seems to be the only arch left that has it in
> _inatomic).
>
> Not sure if you want to change the comment:
>
> /* Compare-xchg with pagefaults disabled.
> * Notes:
> * -Best-Effort: Exchg happens only if compare succeeds.
>
> Maybe something like "Compare-xchg: pagefaults have to be disabled by the
> caller"

Will do !

> Looks sane to me.

Thx for the quick review David. It seems ARC also needs the preempt disable magic
for !LLSC config. I'll send a separate patch to that effect.

Thx,
-Vineet

>
> David
>
>

2015-08-06 13:46:43

by Vineet Gupta

Subject: [PATCH 5/4] ARC: ensure futex ops are atomic in !LLSC config

W/o hardware-assisted atomic r-m-w, the best we can do is to disable
preemption.
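
To illustrate the window being closed in the !LLSC case (sketch with
hypothetical tasks A and B on the same CPU):

	/*
	 * The !LLSC r-m-w is a plain load + store:
	 *
	 *   task A:  ld  r1, [uaddr]      ; reads old value
	 *      -- A preempted; task B runs the same futex op on uaddr --
	 *   task A:  st  r2, [uaddr]      ; stores stale result, B's update lost
	 *
	 * Holding off preemption across the sequence keeps the r-m-w atomic
	 * on such configurations.
	 */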

Cc: David Hildenbrand <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Michel Lespinasse <[email protected]>
Signed-off-by: Vineet Gupta <[email protected]>
---
arch/arc/include/asm/futex.h | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/arch/arc/include/asm/futex.h b/arch/arc/include/asm/futex.h
index 0ea8bcc7b846..8f449982523b 100644
--- a/arch/arc/include/asm/futex.h
+++ b/arch/arc/include/asm/futex.h
@@ -87,6 +87,9 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(int)))
return -EFAULT;

+#ifndef CONFIG_ARC_HAS_LLSC
+ preempt_disable(); /* to guarantee atomic r-m-w of futex op */
+#endif
pagefault_disable();

switch (op) {
@@ -111,6 +114,9 @@ static inline int futex_atomic_op_inuser(int encoded_op, u32 __user *uaddr)
}

pagefault_enable();
+#ifndef CONFIG_ARC_HAS_LLSC
+ preempt_enable();
+#endif

if (!ret) {
switch (cmp) {
@@ -153,6 +159,9 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 expval,
if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
return -EFAULT;

+#ifndef CONFIG_ARC_HAS_LLSC
+ preempt_disable(); /* to guarantee atomic r-m-w of futex op */
+#endif
smp_mb();

__asm__ __volatile__(
@@ -182,6 +191,9 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr, u32 expval,

smp_mb();

+#ifndef CONFIG_ARC_HAS_LLSC
+ preempt_enable();
+#endif
*uval = existval;
return ret;
}
--
1.9.1

2015-08-06 13:48:31

by Peter Zijlstra

Subject: Re: [PATCH 1/4] ARC: add barriers to futex code

On Thu, Aug 06, 2015 at 06:05:20PM +0530, Vineet Gupta wrote:
> The atomic ops on futex need to provide the full barrier just like
> regular atomics in kernel.
>
> Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic()
> as core code already does that

Urgh, and of course tglx just left for holidays :-)

> +++ b/arch/arc/include/asm/futex.h
> @@ -20,6 +20,7 @@
>
> #define __futex_atomic_op(insn, ret, oldval, uaddr, oparg)\
> \
> + smp_mb(); \
> __asm__ __volatile__( \
> "1: llock %1, [%2] \n" \
> insn "\n" \
> @@ -40,12 +41,14 @@
> \
> : "=&r" (ret), "=&r" (oldval) \
> : "r" (uaddr), "r" (oparg), "ir" (-EFAULT) \
> - : "cc", "memory")
> + : "cc", "memory"); \
> + smp_mb(); \
>


So:

- alpha: only has the first smp_mb(), suggesting RELEASE
- arm: only has the first smp_mb(), suggesting RELEASE
- arm64: has store-release + smp_mb(), suggesting full barriers
- MIPS: has LLSC_MB after, suggesting ACQUIRE
- powerpc: lwsync before, sync after, full barrier

x86 is of course boring and fully ordered

Looking at the usage site of futex_atomic_op_inuser(), that's in
futex_wake_op() which might suggest RELEASE is indeed sufficient.

Which leaves me puzzled on MIPS, but what do I know.

At the very least this patch isn't wrong, fully ordered is sufficient.

2015-08-06 14:11:45

by Will Deacon

Subject: Re: [PATCH 1/4] ARC: add barriers to futex code

On Thu, Aug 06, 2015 at 02:48:26PM +0100, Peter Zijlstra wrote:
> On Thu, Aug 06, 2015 at 06:05:20PM +0530, Vineet Gupta wrote:
> > The atomic ops on futex need to provide the full barrier just like
> > regular atomics in kernel.
> >
> > Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic()
> > as core code already does that
>
> Urgh, and of course tglx just left for holidays :-)

Damn, he's really missing out on this!

> > +++ b/arch/arc/include/asm/futex.h
> > @@ -20,6 +20,7 @@
> >
> > #define __futex_atomic_op(insn, ret, oldval, uaddr, oparg)\
> > \
> > + smp_mb(); \
> > __asm__ __volatile__( \
> > "1: llock %1, [%2] \n" \
> > insn "\n" \
> > @@ -40,12 +41,14 @@
> > \
> > : "=&r" (ret), "=&r" (oldval) \
> > : "r" (uaddr), "r" (oparg), "ir" (-EFAULT) \
> > - : "cc", "memory")
> > + : "cc", "memory"); \
> > + smp_mb(); \
> >
>
>
> So:
>
> - alpha: only has the first smp_mb(), suggesting RELEASE
> - arm: only has the first smp_mb(), suggesting RELEASE
> - arm64: has store-release + smp_mb(), suggesting full barriers

I'd be ok relaxing that to smp_mb() but I don't think I'm brave enough
to go all the way to an STLXR. You can lose SC if you combine explicit
barrier instructions with the acquire/release instructions and I dread
to think what userspace is doing...

> - MIPS: has LLSC_MB after, suggesting ACQUIRE

Yikes, so there's a fun semantic difference there. Maybe we should go
look at glibc (which only uses one of the futex ops in pthread_cond_wait
iirc).

> - powerpc: lwsync before, sync after, full barrier
>
> x86 is of course boring and fully ordered
>
> Looking at the usage site of futex_atomic_op_inuser(), that's in
> futex_wake_op() which might suggest RELEASE is indeed sufficient.
>
> Which leaves me puzzled on MIPS, but what do I know.
>
> At the very least this patch isn't wrong, fully ordered is sufficient.

Agreed.

Will

2015-08-07 11:40:54

by Peter Zijlstra

Subject: Re: [PATCH 1/4] ARC: add barriers to futex code

On Thu, Aug 06, 2015 at 03:48:26PM +0200, Peter Zijlstra wrote:
> On Thu, Aug 06, 2015 at 06:05:20PM +0530, Vineet Gupta wrote:
> > The atomic ops on futex need to provide the full barrier just like
> > regular atomics in kernel.
> >
> > Also remove pagefault_enable/disable in futex_atomic_cmpxchg_inatomic()
> > as core code already does that
>
> Urgh, and of course tglx just left for holidays :-)
>
> > +++ b/arch/arc/include/asm/futex.h
> > @@ -20,6 +20,7 @@
> >
> > #define __futex_atomic_op(insn, ret, oldval, uaddr, oparg)\
> > \
> > + smp_mb(); \
> > __asm__ __volatile__( \
> > "1: llock %1, [%2] \n" \
> > insn "\n" \
> > @@ -40,12 +41,14 @@
> > \
> > : "=&r" (ret), "=&r" (oldval) \
> > : "r" (uaddr), "r" (oparg), "ir" (-EFAULT) \
> > - : "cc", "memory")
> > + : "cc", "memory"); \
> > + smp_mb(); \
> >
>
>
> So:
>
> - alpha: only has the first smp_mb(), suggesting RELEASE
> - arm: only has the first smp_mb(), suggesting RELEASE
> - arm64: has store-release + smp_mb(), suggesting full barriers
> - MIPS: has LLSC_MB after, suggesting ACQUIRE
> - powerpc: lwsync before, sync after, full barrier
>
> x86 is of course boring and fully ordered
>
> Looking at the usage site of futex_atomic_op_inuser(), that's in
> futex_wake_op() which might suggest RELEASE is indeed sufficient.
>
> Which leaves me puzzled on MIPS, but what do I know.

So I _think_ the MIPS code is broken. The mips futex code is from 2006
and the mips smp_mb__before_llsc bits are from 2010, so it's well
possible this was missed.

Ralf, David, did I miss the obvious or does the below patch make sense?

---
Subject: MIPS: Fix __futex_atomic_op() for WEAK_REORDERING_BEYOND_LLSC

Current __futex_atomic_op() doesn't have smp_mb__before_llsc(), meaning
it would allow prior load/stores to escape out and re-order against the
ll/sc itself.

While the exact requirements on __futex_atomic_op() are a little unclear
at the moment, it must be either RELEASE or fully ordered; without
smp_mb__before_llsc() the MIPS code is neither.

Therefore add smp_mb__before_llsc().

Cc: Ralf Baechle <[email protected]>
Cc: David Daney <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
arch/mips/include/asm/futex.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/arch/mips/include/asm/futex.h b/arch/mips/include/asm/futex.h
index 1de190bdfb9c..2b8023b9b661 100644
--- a/arch/mips/include/asm/futex.h
+++ b/arch/mips/include/asm/futex.h
@@ -20,6 +20,8 @@

#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg) \
{ \
+ smp_mb__before_llsc(); \
+ \
if (cpu_has_llsc && R10000_LLSC_WAR) { \
__asm__ __volatile__( \
" .set push \n" \

2015-08-17 07:47:04

by Vineet Gupta

Subject: Re: [PATCH 0/4] ARC futex fixes

On Thursday 06 August 2015 06:05 PM, Vineet Gupta wrote:

Hi Thomas/Peter,

Of late I've been debugging a seemingly lost wakeup when running a specific
EEMBC Multibench workload (4M-check -w4) on a quad-core HS38 config on FPGA
(CONFIG_SMP, CONFIG_PREEMPT).

I've yet to nail that issue down, but in the process found some deficiencies
in the ARC futex backend. Can you please take a quick look and shout if there's
something wrong there?

The futex backend for ARC, despite its age, only recently started getting
real testing, due to the recent switch to an NPTL-based userland (uClibc).

Thx,
-Vineet

Vineet Gupta (4):
ARC: add barriers to futex code
ARC: futex cosmetics
ARC: make futex_atomic_cmpxchg_inatomic() return bimodal
ARC: Enable HAVE_FUTEX_CMPXCHG

arch/arc/Kconfig | 1 +
arch/arc/include/asm/futex.h | 44 +++++++++++++++++++++++++-------------------
2 files changed, 26 insertions(+), 19 deletions(-)




Hi Thomas,

Could you please take a look at these patches when you get a chance?

Thx,
-Vineet