2016-04-22 09:58:52

by Peter Zijlstra

Subject: [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()

Implement FETCH-OP atomic primitives, these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).
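
As a hypothetical usage sketch (FLAG_PENDING and clear_pending() are made-up
names, not part of this patch), the fetch variant lets a caller learn whether a
bit was still set while clearing it, which plain atomic_andnot() cannot
reconstruct after the fact:

#include <linux/atomic.h>

#define FLAG_PENDING	0x1

static atomic_t flags = ATOMIC_INIT(FLAG_PENDING);

/* Returns non-zero iff the bit was still set at the moment we cleared it. */
static int clear_pending(void)
{
	/* old is the value of 'flags' *before* the andnot was applied */
	int old = atomic_fetch_andnot(FLAG_PENDING, &flags);

	return old & FLAG_PENDING;
}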

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
arch/arc/include/asm/atomic.h | 69 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 64 insertions(+), 5 deletions(-)

--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -102,6 +102,38 @@ static inline int atomic_##op##_return(i
return val; \
}

+#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
+static inline int atomic_fetch_##op(int i, atomic_t *v) \
+{ \
+ unsigned int val, result; \
+ SCOND_FAIL_RETRY_VAR_DEF \
+ \
+ /* \
+ * Explicit full memory barrier needed before/after as \
+ * LLOCK/SCOND themselves don't provide any such semantics \
+ */ \
+ smp_mb(); \
+ \
+ __asm__ __volatile__( \
+ "1: llock %[val], [%[ctr]] \n" \
+ " mov %[result], %[val] \n" \
+ " " #asm_op " %[val], %[val], %[i] \n" \
+ " scond %[val], [%[ctr]] \n" \
+ " \n" \
+ SCOND_FAIL_RETRY_ASM \
+ \
+ : [val] "=&r" (val), \
+ [result] "=&r" (result) \
+ SCOND_FAIL_RETRY_VARS \
+ : [ctr] "r" (&v->counter), \
+ [i] "ir" (i) \
+ : "cc"); \
+ \
+ smp_mb(); \
+ \
+ return result; \
+}
+
#else /* !CONFIG_ARC_HAS_LLSC */

#ifndef CONFIG_SMP
@@ -164,23 +196,50 @@ static inline int atomic_##op##_return(i
return temp; \
}

+#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
+static inline int atomic_fetch_##op(int i, atomic_t *v) \
+{ \
+ unsigned long flags; \
+ unsigned long temp, result; \
+ \
+ /* \
+ * spin lock/unlock provides the needed smp_mb() before/after \
+ */ \
+ atomic_ops_lock(flags); \
+ result = temp = v->counter; \
+ temp c_op i; \
+ v->counter = temp; \
+ atomic_ops_unlock(flags); \
+ \
+ return result; \
+}
+
#endif /* !CONFIG_ARC_HAS_LLSC */

#define ATOMIC_OPS(op, c_op, asm_op) \
ATOMIC_OP(op, c_op, asm_op) \
- ATOMIC_OP_RETURN(op, c_op, asm_op)
+ ATOMIC_OP_RETURN(op, c_op, asm_op) \
+ ATOMIC_FETCH_OP(op, c_op, asm_op)

ATOMIC_OPS(add, +=, add)
ATOMIC_OPS(sub, -=, sub)

#define atomic_andnot atomic_andnot

-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(andnot, &= ~, bic)
-ATOMIC_OP(or, |=, or)
-ATOMIC_OP(xor, ^=, xor)
+#define atomic_fetch_or atomic_fetch_or
+
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op) \
+ ATOMIC_OP(op, c_op, asm_op) \
+ ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(andnot, &= ~, bic)
+ATOMIC_OPS(or, |=, or)
+ATOMIC_OPS(xor, ^=, xor)

#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
#undef ATOMIC_OP_RETURN
#undef ATOMIC_OP
#undef SCOND_FAIL_RETRY_VAR_DEF
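
For reference, a hand-expanded sketch (illustration only, not part of the diff)
of what ATOMIC_FETCH_OP(add, +=, add) produces on the !CONFIG_ARC_HAS_LLSC path
above, where atomic_ops_lock() serializes the read-modify-write (irq disabling
on UP, a lock on SMP):

static inline int atomic_fetch_add(int i, atomic_t *v)
{
	unsigned long flags;
	unsigned long temp, result;

	/* spin lock/unlock provides the needed smp_mb() before/after */
	atomic_ops_lock(flags);
	result = temp = v->counter;	/* snapshot the pre-op value */
	temp += i;			/* 'temp c_op i' with c_op == '+=' */
	v->counter = temp;
	atomic_ops_unlock(flags);

	return result;
}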



2016-04-22 10:51:07

by Vineet Gupta

Subject: Re: [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()

On Friday 22 April 2016 03:13 PM, Peter Zijlstra wrote:
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
>
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> ---
> arch/arc/include/asm/atomic.h | 69 ++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 64 insertions(+), 5 deletions(-)
>
> --- a/arch/arc/include/asm/atomic.h
> +++ b/arch/arc/include/asm/atomic.h
> @@ -102,6 +102,38 @@ static inline int atomic_##op##_return(i
> return val; \
> }
>
> +#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
> +static inline int atomic_fetch_##op(int i, atomic_t *v) \
> +{ \
> + unsigned int val, result; \
> + SCOND_FAIL_RETRY_VAR_DEF \
> + \
> + /* \
> + * Explicit full memory barrier needed before/after as \
> + * LLOCK/SCOND themselves don't provide any such semantics \
> + */ \
> + smp_mb(); \
> + \
> + __asm__ __volatile__( \
> + "1: llock %[val], [%[ctr]] \n" \
> + " mov %[result], %[val] \n" \

Calling it "result" could be a bit confusing; this is meant to be the "orig" value.
It is indeed the "result" of the API, but from the atomic operation's point of view
it is the pristine, pre-modification value.

Also, we can optimize away that MOV, given there are plenty of registers, so

> + " " #asm_op " %[val], %[val], %[i] \n" \
> + " scond %[val], [%[ctr]] \n" \

Instead have

+ " " #asm_op " %[result], %[val], %[i] \n" \
+ " scond %[result], [%[ctr]] \n" \



> + " \n" \
> + SCOND_FAIL_RETRY_ASM \
> + \
> + : [val] "=&r" (val), \
> + [result] "=&r" (result) \
> + SCOND_FAIL_RETRY_VARS \
> + : [ctr] "r" (&v->counter), \
> + [i] "ir" (i) \
> + : "cc"); \
> + \
> + smp_mb(); \
> + \
> + return result; \

This then needs to be the following, since %[val] now holds the original value loaded by LLOCK:

+ return val; \



> +}
> +
> #else /* !CONFIG_ARC_HAS_LLSC */
>
> #ifndef CONFIG_SMP
> @@ -164,23 +196,50 @@ static inline int atomic_##op##_return(i
> return temp; \
> }
>
> +#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
> +static inline int atomic_fetch_##op(int i, atomic_t *v) \
> +{ \
> + unsigned long flags; \
> + unsigned long temp, result; \

Same as above; I wouldn't call it "result" here!

Also, per your other comments/patches, converting ARC to _relaxed atomics sounds
trivial; I can provide a fixup patch once your series is stable-ish and you point me
to your git tree or some such.

Thx,
-Vineet

> + \
> + /* \
> + * spin lock/unlock provides the needed smp_mb() before/after \
> + */ \
> + atomic_ops_lock(flags); \
> + result = temp = v->counter; \
> + temp c_op i; \
> + v->counter = temp; \
> + atomic_ops_unlock(flags); \
> + \
> + return result; \
> +}
> +
> #endif /* !CONFIG_ARC_HAS_LLSC */
>
> #define ATOMIC_OPS(op, c_op, asm_op) \
> ATOMIC_OP(op, c_op, asm_op) \
> - ATOMIC_OP_RETURN(op, c_op, asm_op)
> + ATOMIC_OP_RETURN(op, c_op, asm_op) \
> + ATOMIC_FETCH_OP(op, c_op, asm_op)
>
> ATOMIC_OPS(add, +=, add)
> ATOMIC_OPS(sub, -=, sub)
>
> #define atomic_andnot atomic_andnot
>
> -ATOMIC_OP(and, &=, and)
> -ATOMIC_OP(andnot, &= ~, bic)
> -ATOMIC_OP(or, |=, or)
> -ATOMIC_OP(xor, ^=, xor)
> +#define atomic_fetch_or atomic_fetch_or
> +
> +#undef ATOMIC_OPS
> +#define ATOMIC_OPS(op, c_op, asm_op) \
> + ATOMIC_OP(op, c_op, asm_op) \
> + ATOMIC_FETCH_OP(op, c_op, asm_op)
> +
> +ATOMIC_OPS(and, &=, and)
> +ATOMIC_OPS(andnot, &= ~, bic)
> +ATOMIC_OPS(or, |=, or)
> +ATOMIC_OPS(xor, ^=, xor)
>
> #undef ATOMIC_OPS
> +#undef ATOMIC_FETCH_OP
> #undef ATOMIC_OP_RETURN
> #undef ATOMIC_OP
> #undef SCOND_FAIL_RETRY_VAR_DEF

2016-04-22 14:16:50

by Peter Zijlstra

Subject: Re: [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()

On Fri, Apr 22, 2016 at 10:50:41AM +0000, Vineet Gupta wrote:

> > +#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
> > +static inline int atomic_fetch_##op(int i, atomic_t *v) \
> > +{ \
> > + unsigned int val, result; \
> > + SCOND_FAIL_RETRY_VAR_DEF \
> > + \
> > + /* \
> > + * Explicit full memory barrier needed before/after as \
> > + * LLOCK/SCOND themselves don't provide any such semantics \
> > + */ \
> > + smp_mb(); \
> > + \
> > + __asm__ __volatile__( \
> > + "1: llock %[val], [%[ctr]] \n" \
> > + " mov %[result], %[val] \n" \
>
> Calling it "result" could be a bit confusing; this is meant to be the "orig" value.
> It is indeed the "result" of the API, but from the atomic operation's point of view
> it is the pristine, pre-modification value.
>
> Also, we can optimize away that MOV, given there are plenty of registers, so
>
> > + " " #asm_op " %[val], %[val], %[i] \n" \
> > + " scond %[val], [%[ctr]] \n" \
>
> Instead have
>
> + " " #asm_op " %[result], %[val], %[i] \n" \
> + " scond %[result], [%[ctr]] \n" \
>
>

Indeed, how about something like so?

---
Subject: locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()
From: Peter Zijlstra <[email protected]>
Date: Mon Apr 18 01:16:09 CEST 2016

Implement FETCH-OP atomic primitives, these are very similar to the
existing OP-RETURN primitives we already have, except they return the
value of the atomic variable _before_ modification.

This is especially useful for irreversible operations -- such as
bitops (because it becomes impossible to reconstruct the state prior
to modification).

Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
---
arch/arc/include/asm/atomic.h | 69 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 64 insertions(+), 5 deletions(-)

--- a/arch/arc/include/asm/atomic.h
+++ b/arch/arc/include/asm/atomic.h
@@ -102,6 +102,37 @@ static inline int atomic_##op##_return(i
return val; \
}

+#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
+static inline int atomic_fetch_##op(int i, atomic_t *v) \
+{ \
+ unsigned int val, orig; \
+ SCOND_FAIL_RETRY_VAR_DEF \
+ \
+ /* \
+ * Explicit full memory barrier needed before/after as \
+ * LLOCK/SCOND themselves don't provide any such semantics \
+ */ \
+ smp_mb(); \
+ \
+ __asm__ __volatile__( \
+ "1: llock %[orig], [%[ctr]] \n" \
+ " " #asm_op " %[val], %[orig], %[i] \n" \
+ " scond %[val], [%[ctr]] \n" \
+ " \n" \
+ SCOND_FAIL_RETRY_ASM \
+ \
+ : [val] "=&r" (val), \
+ [orig] "=&r" (orig) \
+ SCOND_FAIL_RETRY_VARS \
+ : [ctr] "r" (&v->counter), \
+ [i] "ir" (i) \
+ : "cc"); \
+ \
+ smp_mb(); \
+ \
+ return orig; \
+}
+
#else /* !CONFIG_ARC_HAS_LLSC */

#ifndef CONFIG_SMP
@@ -164,23 +196,49 @@ static inline int atomic_##op##_return(i
return temp; \
}

+#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
+static inline int atomic_fetch_##op(int i, atomic_t *v) \
+{ \
+ unsigned long flags; \
+ unsigned long orig; \
+ \
+ /* \
+ * spin lock/unlock provides the needed smp_mb() before/after \
+ */ \
+ atomic_ops_lock(flags); \
+ orig = v->counter; \
+ v->counter c_op i; \
+ atomic_ops_unlock(flags); \
+ \
+ return orig; \
+}
+
#endif /* !CONFIG_ARC_HAS_LLSC */

#define ATOMIC_OPS(op, c_op, asm_op) \
ATOMIC_OP(op, c_op, asm_op) \
- ATOMIC_OP_RETURN(op, c_op, asm_op)
+ ATOMIC_OP_RETURN(op, c_op, asm_op) \
+ ATOMIC_FETCH_OP(op, c_op, asm_op)

ATOMIC_OPS(add, +=, add)
ATOMIC_OPS(sub, -=, sub)

#define atomic_andnot atomic_andnot

-ATOMIC_OP(and, &=, and)
-ATOMIC_OP(andnot, &= ~, bic)
-ATOMIC_OP(or, |=, or)
-ATOMIC_OP(xor, ^=, xor)
+#define atomic_fetch_or atomic_fetch_or
+
+#undef ATOMIC_OPS
+#define ATOMIC_OPS(op, c_op, asm_op) \
+ ATOMIC_OP(op, c_op, asm_op) \
+ ATOMIC_FETCH_OP(op, c_op, asm_op)
+
+ATOMIC_OPS(and, &=, and)
+ATOMIC_OPS(andnot, &= ~, bic)
+ATOMIC_OPS(or, |=, or)
+ATOMIC_OPS(xor, ^=, xor)

#undef ATOMIC_OPS
+#undef ATOMIC_FETCH_OP
#undef ATOMIC_OP_RETURN
#undef ATOMIC_OP
#undef SCOND_FAIL_RETRY_VAR_DEF
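
For comparison, a hand-expanded sketch (illustration only) of what the revised
ATOMIC_FETCH_OP(add, +=, add) generates on the LL/SC path, assuming the common
configuration in which SCOND_FAIL_RETRY_VAR_DEF / SCOND_FAIL_RETRY_VARS expand
to nothing and SCOND_FAIL_RETRY_ASM is just the retry branch. The loaded value
now goes straight into %[orig], so no extra MOV is needed:

static inline int atomic_fetch_add(int i, atomic_t *v)
{
	unsigned int val, orig;

	/* LLOCK/SCOND provide no ordering, hence the explicit barriers */
	smp_mb();

	__asm__ __volatile__(
	"1:	llock   %[orig], [%[ctr]]	\n"
	"	add     %[val], %[orig], %[i]	\n"
	"	scond   %[val], [%[ctr]]	\n"
	"	bnz     1b			\n"
	: [val] "=&r" (val),
	  [orig] "=&r" (orig)
	: [ctr] "r" (&v->counter),
	  [i] "ir" (i)
	: "cc");

	smp_mb();

	return orig;	/* value before the add */
}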

2016-04-22 14:27:17

by Peter Zijlstra

Subject: Re: [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()

On Fri, Apr 22, 2016 at 10:50:41AM +0000, Vineet Gupta wrote:
> Also, per your other comments/patches, converting ARC to _relaxed atomics sounds
> trivial; I can provide a fixup patch once your series is stable-ish and you point me
> to your git tree or some such.

Yeah, that change is pretty simple; I ran out of steam and only did
alpha and mips in this series.
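
For context, the conversion being discussed boils down to the architecture
providing only the *_relaxed ops (the same LL/SC loops minus the smp_mb() pair)
and letting linux/atomic.h synthesize the fully ordered variants. A simplified
sketch of that generic wrapper, paraphrased from memory rather than quoted from
the header:

/*
 * If an arch only defines atomic_fetch_add_relaxed(), the ordered
 * version is built by bracketing the relaxed op with full barriers.
 */
#define __atomic_op_fence(op, args...)					\
({									\
	typeof(op##_relaxed(args)) __ret;				\
	smp_mb__before_atomic();					\
	__ret = op##_relaxed(args);					\
	smp_mb__after_atomic();						\
	__ret;								\
})

#ifndef atomic_fetch_add
#define atomic_fetch_add(...)						\
	__atomic_op_fence(atomic_fetch_add, __VA_ARGS__)
#endif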


2016-04-25 04:27:07

by Vineet Gupta

Subject: Re: [RFC][PATCH 03/31] locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()

On Friday 22 April 2016 07:46 PM, Peter Zijlstra wrote:
> On Fri, Apr 22, 2016 at 10:50:41AM +0000, Vineet Gupta wrote:
>
>>> > > +#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
>>> > > +static inline int atomic_fetch_##op(int i, atomic_t *v) \
>>> > > +{ \
>>> > > + unsigned int val, result; \
>>> > > + SCOND_FAIL_RETRY_VAR_DEF \
>>> > > + \
>>> > > + /* \
>>> > > + * Explicit full memory barrier needed before/after as \
>>> > > + * LLOCK/SCOND themselves don't provide any such semantics \
>>> > > + */ \
>>> > > + smp_mb(); \
>>> > > + \
>>> > > + __asm__ __volatile__( \
>>> > > + "1: llock %[val], [%[ctr]] \n" \
>>> > > + " mov %[result], %[val] \n" \
>> >
>> > Calling it "result" could be a bit confusing; this is meant to be the "orig" value.
>> > It is indeed the "result" of the API, but from the atomic operation's point of view
>> > it is the pristine, pre-modification value.
>> >
>> > Also, we can optimize away that MOV, given there are plenty of registers, so
>> >
>>> > > + " " #asm_op " %[val], %[val], %[i] \n" \
>>> > > + " scond %[val], [%[ctr]] \n" \
>> >
>> > Instead have
>> >
>> > + " " #asm_op " %[result], %[val], %[i] \n" \
>> > + " scond %[result], [%[ctr]] \n" \
>> >
>> >
> Indeed, how about something like so?
>
> ---
> Subject: locking,arc: Implement atomic_fetch_{add,sub,and,andnot,or,xor}()
> From: Peter Zijlstra <[email protected]>
> Date: Mon Apr 18 01:16:09 CEST 2016
>
> Implement FETCH-OP atomic primitives, these are very similar to the
> existing OP-RETURN primitives we already have, except they return the
> value of the atomic variable _before_ modification.
>
> This is especially useful for irreversible operations -- such as
> bitops (because it becomes impossible to reconstruct the state prior
> to modification).
>
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>

Acked-by: Vineet Gupta <[email protected]>