This series consists of a bunch of cleanups to the way we handle memory
barriers (though no changes to the sync instructions we use to implement
them) & atomic memory accesses. One major goal was to ensure the
Loongson3 LL/SC errata workarounds are applied in a safe manner from
within inline-asm & that we can automatically verify the resulting
kernel binary looks reasonable. Many patches are cleanups found along
the way.
Applies atop v5.4-rc1.
Changes in v2:
- Keep our fls/ffs implementations. It turns out GCC's builtins call
  intrinsics in some configurations, and if we'd have to implement those
  ourselves then switching to the generic fls/ffs doesn't seem like much
  of a win.
- De-string __WEAK_LLSC_MB to allow use with __SYNC_ELSE().
- Only try to build the loongson3-llsc-check tool from
  arch/mips/Makefile when CONFIG_CPU_LOONGSON3_WORKAROUNDS is enabled.
Paul Burton (36):
MIPS: Unify sc beqz definition
MIPS: Use compact branch for LL/SC loops on MIPSr6+
MIPS: barrier: Add __SYNC() infrastructure
MIPS: barrier: Clean up rmb() & wmb() definitions
MIPS: barrier: Clean up __smp_mb() definition
MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery
MIPS: barrier: Clean up __sync() definition
MIPS: barrier: Clean up sync_ginv()
MIPS: atomic: Fix whitespace in ATOMIC_OP macros
MIPS: atomic: Handle !kernel_uses_llsc first
MIPS: atomic: Use one macro to generate 32b & 64b functions
MIPS: atomic: Emit Loongson3 sync workarounds within asm
MIPS: atomic: Use _atomic barriers in atomic_sub_if_positive()
MIPS: atomic: Unify 32b & 64b sub_if_positive
MIPS: atomic: Deduplicate 32b & 64b read, set, xchg, cmpxchg
MIPS: bitops: Handle !kernel_uses_llsc first
MIPS: bitops: Only use ins for bit 16 or higher
MIPS: bitops: Use MIPS_ISA_REV, not #ifdefs
MIPS: bitops: ins start position is always an immediate
MIPS: bitops: Implement test_and_set_bit() in terms of _lock variant
MIPS: bitops: Allow immediates in test_and_{set,clear,change}_bit
MIPS: bitops: Use the BIT() macro
MIPS: bitops: Avoid redundant zero-comparison for non-LLSC
MIPS: bitops: Abstract LL/SC loops
MIPS: bitops: Use BIT_WORD() & BITS_PER_LONG
MIPS: bitops: Emit Loongson3 sync workarounds within asm
MIPS: bitops: Use smp_mb__before_atomic in test_* ops
MIPS: cmpxchg: Emit Loongson3 sync workarounds within asm
MIPS: cmpxchg: Omit redundant barriers for Loongson3
MIPS: futex: Emit Loongson3 sync workarounds within asm
MIPS: syscall: Emit Loongson3 sync workarounds within asm
MIPS: barrier: Remove loongson_llsc_mb()
MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3
MIPS: genex: Add Loongson3 LL/SC workaround to ejtag_debug_handler
MIPS: genex: Don't reload address unnecessarily
MIPS: Check Loongson3 LL/SC errata workaround correctness
arch/mips/Makefile | 3 +
arch/mips/Makefile.postlink | 10 +-
arch/mips/include/asm/atomic.h | 571 +++++++++----------------
arch/mips/include/asm/barrier.h | 228 ++--------
arch/mips/include/asm/bitops.h | 443 ++++++-------------
arch/mips/include/asm/cmpxchg.h | 59 +--
arch/mips/include/asm/futex.h | 15 +-
arch/mips/include/asm/llsc.h | 19 +-
arch/mips/include/asm/sync.h | 207 +++++++++
arch/mips/kernel/genex.S | 6 +-
arch/mips/kernel/pm-cps.c | 20 +-
arch/mips/kernel/syscall.c | 3 +-
arch/mips/lib/bitops.c | 57 +--
arch/mips/loongson64/Platform | 2 +-
arch/mips/tools/.gitignore | 1 +
arch/mips/tools/Makefile | 5 +
arch/mips/tools/loongson3-llsc-check.c | 307 +++++++++++++
17 files changed, 981 insertions(+), 975 deletions(-)
create mode 100644 arch/mips/include/asm/sync.h
create mode 100644 arch/mips/tools/loongson3-llsc-check.c
--
2.23.0
We currently duplicate the definition of __scbeqz in asm/atomic.h &
asm/cmpxchg.h. Move it to asm/llsc.h & rename it to __SC_BEQZ to fit
better with the existing __SC macro provided there.
We include a tab in the string so that users don't need to indent their
code any further in order to add whitespace of their own after the
instruction mnemonic.
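As an illustration only (a user-space sketch, not part of the patch),
adjacent string literal concatenation is what makes this work; callers
just place the operand list straight after the macro:
  #include <stdio.h>
  #define __SC_BEQZ "beqz "	/* trailing whitespace included, as in the patch */
  int main(void)
  {
  	/* The template an LL/SC loop would hand to asm(); the literals
  	 * concatenate into "\tbeqz %0, 1b\n" with no extra effort from
  	 * the caller. */
  	const char *tmpl = "\t" __SC_BEQZ "%0, 1b\n";
  	fputs(tmpl, stdout);
  	return 0;
  }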
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/atomic.h | 28 +++++++++-------------------
arch/mips/include/asm/cmpxchg.h | 20 ++++----------------
arch/mips/include/asm/llsc.h | 11 +++++++++++
3 files changed, 24 insertions(+), 35 deletions(-)
diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index bb8658cc7f12..7578c807ef98 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -20,19 +20,9 @@
#include <asm/compiler.h>
#include <asm/cpu-features.h>
#include <asm/cmpxchg.h>
+#include <asm/llsc.h>
#include <asm/war.h>
-/*
- * Using a branch-likely instruction to check the result of an sc instruction
- * works around a bug present in R10000 CPUs prior to revision 3.0 that could
- * cause ll-sc sequences to execute non-atomically.
- */
-#if R10000_LLSC_WAR
-# define __scbeqz "beqzl"
-#else
-# define __scbeqz "beqz"
-#endif
-
#define ATOMIC_INIT(i) { (i) }
/*
@@ -65,7 +55,7 @@ static __inline__ void atomic_##op(int i, atomic_t * v) \
"1: ll %0, %1 # atomic_" #op " \n" \
" " #asm_op " %0, %2 \n" \
" sc %0, %1 \n" \
- "\t" __scbeqz " %0, 1b \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
" .set pop \n" \
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter) \
: "Ir" (i) : __LLSC_CLOBBER); \
@@ -93,7 +83,7 @@ static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \
"1: ll %1, %2 # atomic_" #op "_return \n" \
" " #asm_op " %0, %1, %3 \n" \
" sc %0, %2 \n" \
- "\t" __scbeqz " %0, 1b \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
" " #asm_op " %0, %1, %3 \n" \
" .set pop \n" \
: "=&r" (result), "=&r" (temp), \
@@ -127,7 +117,7 @@ static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v) \
"1: ll %1, %2 # atomic_fetch_" #op " \n" \
" " #asm_op " %0, %1, %3 \n" \
" sc %0, %2 \n" \
- "\t" __scbeqz " %0, 1b \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
" .set pop \n" \
" move %0, %1 \n" \
: "=&r" (result), "=&r" (temp), \
@@ -205,7 +195,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
" .set push \n"
" .set "MIPS_ISA_LEVEL" \n"
" sc %1, %2 \n"
- "\t" __scbeqz " %1, 1b \n"
+ "\t" __SC_BEQZ "%1, 1b \n"
"2: \n"
" .set pop \n"
: "=&r" (result), "=&r" (temp),
@@ -267,7 +257,7 @@ static __inline__ void atomic64_##op(s64 i, atomic64_t * v) \
"1: lld %0, %1 # atomic64_" #op " \n" \
" " #asm_op " %0, %2 \n" \
" scd %0, %1 \n" \
- "\t" __scbeqz " %0, 1b \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
" .set pop \n" \
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter) \
: "Ir" (i) : __LLSC_CLOBBER); \
@@ -295,7 +285,7 @@ static __inline__ s64 atomic64_##op##_return_relaxed(s64 i, atomic64_t * v) \
"1: lld %1, %2 # atomic64_" #op "_return\n" \
" " #asm_op " %0, %1, %3 \n" \
" scd %0, %2 \n" \
- "\t" __scbeqz " %0, 1b \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
" " #asm_op " %0, %1, %3 \n" \
" .set pop \n" \
: "=&r" (result), "=&r" (temp), \
@@ -329,7 +319,7 @@ static __inline__ s64 atomic64_fetch_##op##_relaxed(s64 i, atomic64_t * v) \
"1: lld %1, %2 # atomic64_fetch_" #op "\n" \
" " #asm_op " %0, %1, %3 \n" \
" scd %0, %2 \n" \
- "\t" __scbeqz " %0, 1b \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
" move %0, %1 \n" \
" .set pop \n" \
: "=&r" (result), "=&r" (temp), \
@@ -404,7 +394,7 @@ static __inline__ s64 atomic64_sub_if_positive(s64 i, atomic64_t * v)
" move %1, %0 \n"
" bltz %0, 1f \n"
" scd %1, %2 \n"
- "\t" __scbeqz " %1, 1b \n"
+ "\t" __SC_BEQZ "%1, 1b \n"
"1: \n"
" .set pop \n"
: "=&r" (result), "=&r" (temp),
diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h
index 79bf34efbc04..5d3f0e3513b4 100644
--- a/arch/mips/include/asm/cmpxchg.h
+++ b/arch/mips/include/asm/cmpxchg.h
@@ -11,19 +11,9 @@
#include <linux/bug.h>
#include <linux/irqflags.h>
#include <asm/compiler.h>
+#include <asm/llsc.h>
#include <asm/war.h>
-/*
- * Using a branch-likely instruction to check the result of an sc instruction
- * works around a bug present in R10000 CPUs prior to revision 3.0 that could
- * cause ll-sc sequences to execute non-atomically.
- */
-#if R10000_LLSC_WAR
-# define __scbeqz "beqzl"
-#else
-# define __scbeqz "beqz"
-#endif
-
/*
* These functions doesn't exist, so if they are called you'll either:
*
@@ -57,7 +47,7 @@ extern unsigned long __xchg_called_with_bad_pointer(void)
" move $1, %z3 \n" \
" .set " MIPS_ISA_ARCH_LEVEL " \n" \
" " st " $1, %1 \n" \
- "\t" __scbeqz " $1, 1b \n" \
+ "\t" __SC_BEQZ "$1, 1b \n" \
" .set pop \n" \
: "=&r" (__ret), "=" GCC_OFF_SMALL_ASM() (*m) \
: GCC_OFF_SMALL_ASM() (*m), "Jr" (val) \
@@ -130,7 +120,7 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
" move $1, %z4 \n" \
" .set "MIPS_ISA_ARCH_LEVEL" \n" \
" " st " $1, %1 \n" \
- "\t" __scbeqz " $1, 1b \n" \
+ "\t" __SC_BEQZ "$1, 1b \n" \
" .set pop \n" \
"2: \n" \
: "=&r" (__ret), "=" GCC_OFF_SMALL_ASM() (*m) \
@@ -268,7 +258,7 @@ static inline unsigned long __cmpxchg64(volatile void *ptr,
/* Attempt to store new at ptr */
" scd %L1, %2 \n"
/* If we failed, loop! */
- "\t" __scbeqz " %L1, 1b \n"
+ "\t" __SC_BEQZ "%L1, 1b \n"
" .set pop \n"
"2: \n"
: "=&r"(ret),
@@ -311,6 +301,4 @@ static inline unsigned long __cmpxchg64(volatile void *ptr,
# endif /* !CONFIG_SMP */
#endif /* !CONFIG_64BIT */
-#undef __scbeqz
-
#endif /* __ASM_CMPXCHG_H */
diff --git a/arch/mips/include/asm/llsc.h b/arch/mips/include/asm/llsc.h
index c6d17d171147..9b19f38562ac 100644
--- a/arch/mips/include/asm/llsc.h
+++ b/arch/mips/include/asm/llsc.h
@@ -25,4 +25,15 @@
#define __EXT "dext "
#endif
+/*
+ * Using a branch-likely instruction to check the result of an sc instruction
+ * works around a bug present in R10000 CPUs prior to revision 3.0 that could
+ * cause ll-sc sequences to execute non-atomically.
+ */
+#if R10000_LLSC_WAR
+# define __SC_BEQZ "beqzl "
+#else
+# define __SC_BEQZ "beqz "
+#endif
+
#endif /* __ASM_LLSC_H */
--
2.23.0
When targeting MIPSr6 or higher make use of a compact branch in LL/SC
loops, preventing the insertion of a delay slot nop that only serves to
waste space.
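For illustration, a minimal sketch (assuming a MIPSr6 toolchain; the
function name, operands and constraints are made up and simplified, not
taken from the patch) of the kind of loop this produces, where the beqzc
needs no delay-slot nop after it:
  static inline void sketch_inc(int *p)
  {
  	int tmp;
  	__asm__ __volatile__(
  	"1:	ll	%0, %1		\n"
  	"	addiu	%0, %0, 1	\n"
  	"	sc	%0, %1		\n"
  	"	beqzc	%0, 1b		\n"	/* compact branch: no delay slot to fill */
  	: "=&r" (tmp), "+m" (*p)
  	:
  	: "memory");
  }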
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/llsc.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/mips/include/asm/llsc.h b/arch/mips/include/asm/llsc.h
index 9b19f38562ac..d240a4a2d1c4 100644
--- a/arch/mips/include/asm/llsc.h
+++ b/arch/mips/include/asm/llsc.h
@@ -9,6 +9,8 @@
#ifndef __ASM_LLSC_H
#define __ASM_LLSC_H
+#include <asm/isa-rev.h>
+
#if _MIPS_SZLONG == 32
#define SZLONG_LOG 5
#define SZLONG_MASK 31UL
@@ -32,6 +34,8 @@
*/
#if R10000_LLSC_WAR
# define __SC_BEQZ "beqzl "
+#elif MIPS_ISA_REV >= 6
+# define __SC_BEQZ "beqzc "
#else
# define __SC_BEQZ "beqz "
#endif
--
2.23.0
The definition of fast_mb() is the same in both the Octeon & non-Octeon
cases, so remove the duplication & define it only once.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/barrier.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 8a5abc1c85a6..657ec01120a4 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -38,6 +38,8 @@ static inline void wmb(void)
}
#define wmb wmb
+#define fast_mb() __sync()
+
#define __fast_iob() \
__asm__ __volatile__( \
".set push\n\t" \
@@ -49,10 +51,8 @@ static inline void wmb(void)
: "m" (*(int *)CKSEG1) \
: "memory")
#ifdef CONFIG_CPU_CAVIUM_OCTEON
-# define fast_mb() __sync()
# define fast_iob() do { } while (0)
#else /* ! CONFIG_CPU_CAVIUM_OCTEON */
-# define fast_mb() __sync()
# ifdef CONFIG_SGI_IP28
# define fast_iob() \
__asm__ __volatile__( \
--
2.23.0
Simplify our definitions of rmb() & wmb() using the new __SYNC()
infrastructure.
The fast_rmb() & fast_wmb() macros are removed, since they only provided
a level of indirection that made the code less readable & weren't
directly used anywhere in the kernel tree.
The Octeon #ifdef'ery is removed, since the "syncw" instruction
previously used is merely an alias for "sync 4" which __SYNC() will emit
for the wmb sync type when the kernel is configured for an Octeon CPU.
Similarly __SYNC() will emit nothing for the rmb sync type in Octeon
configurations.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/barrier.h | 28 ++++++++++++++--------------
1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 5ad39bfd3b6d..f36cab87cfde 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -26,6 +26,18 @@
#define __sync() do { } while(0)
#endif
+static inline void rmb(void)
+{
+ asm volatile(__SYNC(rmb, always) ::: "memory");
+}
+#define rmb rmb
+
+static inline void wmb(void)
+{
+ asm volatile(__SYNC(wmb, always) ::: "memory");
+}
+#define wmb wmb
+
#define __fast_iob() \
__asm__ __volatile__( \
".set push\n\t" \
@@ -37,16 +49,9 @@
: "m" (*(int *)CKSEG1) \
: "memory")
#ifdef CONFIG_CPU_CAVIUM_OCTEON
-# define OCTEON_SYNCW_STR ".set push\n.set arch=octeon\nsyncw\nsyncw\n.set pop\n"
-# define __syncw() __asm__ __volatile__(OCTEON_SYNCW_STR : : : "memory")
-
-# define fast_wmb() __syncw()
-# define fast_rmb() barrier()
# define fast_mb() __sync()
# define fast_iob() do { } while (0)
#else /* ! CONFIG_CPU_CAVIUM_OCTEON */
-# define fast_wmb() __sync()
-# define fast_rmb() __sync()
# define fast_mb() __sync()
# ifdef CONFIG_SGI_IP28
# define fast_iob() \
@@ -83,19 +88,14 @@
#endif /* !CONFIG_CPU_HAS_WB */
-#define wmb() fast_wmb()
-#define rmb() fast_rmb()
-
#if defined(CONFIG_WEAK_ORDERING)
# ifdef CONFIG_CPU_CAVIUM_OCTEON
# define __smp_mb() __sync()
-# define __smp_rmb() barrier()
-# define __smp_wmb() __syncw()
# else
# define __smp_mb() __asm__ __volatile__("sync" : : :"memory")
-# define __smp_rmb() __asm__ __volatile__("sync" : : :"memory")
-# define __smp_wmb() __asm__ __volatile__("sync" : : :"memory")
# endif
+# define __smp_rmb() rmb()
+# define __smp_wmb() wmb()
#else
#define __smp_mb() barrier()
#define __smp_rmb() barrier()
--
2.23.0
Handle the !kernel_uses_llsc path first in our ATOMIC_OP(),
ATOMIC_OP_RETURN() & ATOMIC_FETCH_OP() macros & return from within the
block. This allows us to de-indent the kernel_uses_llsc path by one
level, which will be useful when making further changes.
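The shape of the change, as a stand-alone sketch with illustrative names
rather than the real kernel code:
  static int have_llsc;	/* stand-in for kernel_uses_llsc */
  static void op(int i, int *v)
  {
  	if (!have_llsc) {
  		*v += i;	/* analogous to the irq-disabled fallback */
  		return;
  	}
  	/* the main (LL/SC) implementation now sits at the function's top
  	 * indentation level instead of inside an else block */
  	__atomic_fetch_add(v, i, __ATOMIC_RELAXED);
  }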
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/atomic.h | 99 +++++++++++++++++-----------------
1 file changed, 49 insertions(+), 50 deletions(-)
diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 2d2a8a74c51b..ace2ea005588 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -45,51 +45,36 @@
#define ATOMIC_OP(op, c_op, asm_op) \
static __inline__ void atomic_##op(int i, atomic_t * v) \
{ \
- if (kernel_uses_llsc) { \
- int temp; \
+ int temp; \
\
- loongson_llsc_mb(); \
- __asm__ __volatile__( \
- " .set push \n" \
- " .set "MIPS_ISA_LEVEL" \n" \
- "1: ll %0, %1 # atomic_" #op " \n" \
- " " #asm_op " %0, %2 \n" \
- " sc %0, %1 \n" \
- "\t" __SC_BEQZ "%0, 1b \n" \
- " .set pop \n" \
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter) \
- : "Ir" (i) : __LLSC_CLOBBER); \
- } else { \
+ if (!kernel_uses_llsc) { \
unsigned long flags; \
\
raw_local_irq_save(flags); \
v->counter c_op i; \
raw_local_irq_restore(flags); \
+ return; \
} \
+ \
+ loongson_llsc_mb(); \
+ __asm__ __volatile__( \
+ " .set push \n" \
+ " .set " MIPS_ISA_LEVEL " \n" \
+ "1: ll %0, %1 # atomic_" #op " \n" \
+ " " #asm_op " %0, %2 \n" \
+ " sc %0, %1 \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
+ " .set pop \n" \
+ : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter) \
+ : "Ir" (i) : __LLSC_CLOBBER); \
}
#define ATOMIC_OP_RETURN(op, c_op, asm_op) \
static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \
{ \
- int result; \
- \
- if (kernel_uses_llsc) { \
- int temp; \
+ int temp, result; \
\
- loongson_llsc_mb(); \
- __asm__ __volatile__( \
- " .set push \n" \
- " .set "MIPS_ISA_LEVEL" \n" \
- "1: ll %1, %2 # atomic_" #op "_return \n" \
- " " #asm_op " %0, %1, %3 \n" \
- " sc %0, %2 \n" \
- "\t" __SC_BEQZ "%0, 1b \n" \
- " " #asm_op " %0, %1, %3 \n" \
- " .set pop \n" \
- : "=&r" (result), "=&r" (temp), \
- "+" GCC_OFF_SMALL_ASM() (v->counter) \
- : "Ir" (i) : __LLSC_CLOBBER); \
- } else { \
+ if (!kernel_uses_llsc) { \
unsigned long flags; \
\
raw_local_irq_save(flags); \
@@ -97,41 +82,55 @@ static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \
result c_op i; \
v->counter = result; \
raw_local_irq_restore(flags); \
+ return result; \
} \
\
+ loongson_llsc_mb(); \
+ __asm__ __volatile__( \
+ " .set push \n" \
+ " .set " MIPS_ISA_LEVEL " \n" \
+ "1: ll %1, %2 # atomic_" #op "_return \n" \
+ " " #asm_op " %0, %1, %3 \n" \
+ " sc %0, %2 \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
+ " " #asm_op " %0, %1, %3 \n" \
+ " .set pop \n" \
+ : "=&r" (result), "=&r" (temp), \
+ "+" GCC_OFF_SMALL_ASM() (v->counter) \
+ : "Ir" (i) : __LLSC_CLOBBER); \
+ \
return result; \
}
#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v) \
{ \
- int result; \
+ int temp, result; \
\
- if (kernel_uses_llsc) { \
- int temp; \
- \
- loongson_llsc_mb(); \
- __asm__ __volatile__( \
- " .set push \n" \
- " .set "MIPS_ISA_LEVEL" \n" \
- "1: ll %1, %2 # atomic_fetch_" #op " \n" \
- " " #asm_op " %0, %1, %3 \n" \
- " sc %0, %2 \n" \
- "\t" __SC_BEQZ "%0, 1b \n" \
- " .set pop \n" \
- " move %0, %1 \n" \
- : "=&r" (result), "=&r" (temp), \
- "+" GCC_OFF_SMALL_ASM() (v->counter) \
- : "Ir" (i) : __LLSC_CLOBBER); \
- } else { \
+ if (!kernel_uses_llsc) { \
unsigned long flags; \
\
raw_local_irq_save(flags); \
result = v->counter; \
v->counter c_op i; \
raw_local_irq_restore(flags); \
+ return result; \
} \
\
+ loongson_llsc_mb(); \
+ __asm__ __volatile__( \
+ " .set push \n" \
+ " .set "MIPS_ISA_LEVEL" \n" \
+ "1: ll %1, %2 # atomic_fetch_" #op " \n" \
+ " " #asm_op " %0, %1, %3 \n" \
+ " sc %0, %2 \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
+ " .set pop \n" \
+ " move %0, %1 \n" \
+ : "=&r" (result), "=&r" (temp), \
+ "+" GCC_OFF_SMALL_ASM() (v->counter) \
+ : "Ir" (i) : __LLSC_CLOBBER); \
+ \
return result; \
}
--
2.23.0
Use the new __SYNC() infrastructure to implement sync_ginv(), for
consistency with much of the rest of asm/barrier.h.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/barrier.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index a117c6d95038..c7e05e832da9 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -163,7 +163,7 @@ static inline void wmb(void)
static inline void sync_ginv(void)
{
- asm volatile("sync\t%0" :: "i"(__SYNC_ginv));
+ asm volatile(__SYNC(ginv, always));
}
#include <asm-generic/barrier.h>
--
2.23.0
Unify the definitions of atomic_sub_if_positive() &
atomic64_sub_if_positive() using a macro like we do for most other
atomic functions. This allows us to share the implementation, ensuring
consistency between the two. Notably this provides the appropriate
loongson3_war barriers in the atomic64_sub_if_positive() case, which were
previously missing.
The code is rearranged a little to handle the !kernel_uses_llsc case
first in order to de-indent the LL/SC case & avoid exceeding 80
characters per line.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/atomic.h | 164 ++++++++++++---------------------
1 file changed, 58 insertions(+), 106 deletions(-)
diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 24443ef29337..96ef50fa2817 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -192,65 +192,71 @@ ATOMIC_OPS(atomic64, xor, s64, ^=, xor, lld, scd)
* Atomically test @v and subtract @i if @v is greater or equal than @i.
* The function returns the old value of @v minus @i.
*/
-static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
-{
- int result;
-
- smp_mb__before_atomic();
-
- if (kernel_uses_llsc) {
- int temp;
-
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_LEVEL" \n"
- " " __SYNC(full, loongson3_war) " \n"
- "1: ll %1, %2 # atomic_sub_if_positive\n"
- " .set pop \n"
- " subu %0, %1, %3 \n"
- " move %1, %0 \n"
- " bltz %0, 2f \n"
- " .set push \n"
- " .set "MIPS_ISA_LEVEL" \n"
- " sc %1, %2 \n"
- "\t" __SC_BEQZ "%1, 1b \n"
- "2: " __SYNC(full, loongson3_war) " \n"
- " .set pop \n"
- : "=&r" (result), "=&r" (temp),
- "+" GCC_OFF_SMALL_ASM() (v->counter)
- : "Ir" (i) : __LLSC_CLOBBER);
- } else {
- unsigned long flags;
+#define ATOMIC_SIP_OP(pfx, type, op, ll, sc) \
+static __inline__ int pfx##_sub_if_positive(type i, pfx##_t * v) \
+{ \
+ type temp, result; \
+ \
+ smp_mb__before_atomic(); \
+ \
+ if (!kernel_uses_llsc) { \
+ unsigned long flags; \
+ \
+ raw_local_irq_save(flags); \
+ result = v->counter; \
+ result -= i; \
+ if (result >= 0) \
+ v->counter = result; \
+ raw_local_irq_restore(flags); \
+ smp_mb__after_atomic(); \
+ return result; \
+ } \
+ \
+ __asm__ __volatile__( \
+ " .set push \n" \
+ " .set " MIPS_ISA_LEVEL " \n" \
+ " " __SYNC(full, loongson3_war) " \n" \
+ "1: " #ll " %1, %2 # atomic_sub_if_positive\n" \
+ " .set pop \n" \
+ " " #op " %0, %1, %3 \n" \
+ " move %1, %0 \n" \
+ " bltz %0, 2f \n" \
+ " .set push \n" \
+ " .set " MIPS_ISA_LEVEL " \n" \
+ " " #sc " %1, %2 \n" \
+ " " __SC_BEQZ "%1, 1b \n" \
+ "2: " __SYNC(full, loongson3_war) " \n" \
+ " .set pop \n" \
+ : "=&r" (result), "=&r" (temp), \
+ "+" GCC_OFF_SMALL_ASM() (v->counter) \
+ : "Ir" (i) \
+ : __LLSC_CLOBBER); \
+ \
+ /* \
+ * In the Loongson3 workaround case we already have a \
+ * completion barrier at 2: above, which is needed due to the \
+ * bltz that can branch to code outside of the LL/SC loop. As \
+ * such, we don't need to emit another barrier here. \
+ */ \
+ if (!__SYNC_loongson3_war) \
+ smp_mb__after_atomic(); \
+ \
+ return result; \
+}
- raw_local_irq_save(flags);
- result = v->counter;
- result -= i;
- if (result >= 0)
- v->counter = result;
- raw_local_irq_restore(flags);
- }
+ATOMIC_SIP_OP(atomic, int, subu, ll, sc)
+#define atomic_dec_if_positive(v) atomic_sub_if_positive(1, v)
- /*
- * In the Loongson3 workaround case we already have a completion
- * barrier at 2: above, which is needed due to the bltz that can branch
- * to code outside of the LL/SC loop. As such, we don't need to emit
- * another barrier here.
- */
- if (!__SYNC_loongson3_war)
- smp_mb__after_atomic();
+#ifdef CONFIG_64BIT
+ATOMIC_SIP_OP(atomic64, s64, dsubu, lld, scd)
+#define atomic64_dec_if_positive(v) atomic64_sub_if_positive(1, v)
+#endif
- return result;
-}
+#undef ATOMIC_SIP_OP
#define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
#define atomic_xchg(v, new) (xchg(&((v)->counter), (new)))
-/*
- * atomic_dec_if_positive - decrement by 1 if old value positive
- * @v: pointer of type atomic_t
- */
-#define atomic_dec_if_positive(v) atomic_sub_if_positive(1, v)
-
#ifdef CONFIG_64BIT
#define ATOMIC64_INIT(i) { (i) }
@@ -269,64 +275,10 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
*/
#define atomic64_set(v, i) WRITE_ONCE((v)->counter, (i))
-/*
- * atomic64_sub_if_positive - conditionally subtract integer from atomic
- * variable
- * @i: integer value to subtract
- * @v: pointer of type atomic64_t
- *
- * Atomically test @v and subtract @i if @v is greater or equal than @i.
- * The function returns the old value of @v minus @i.
- */
-static __inline__ s64 atomic64_sub_if_positive(s64 i, atomic64_t * v)
-{
- s64 result;
-
- smp_mb__before_llsc();
-
- if (kernel_uses_llsc) {
- s64 temp;
-
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_LEVEL" \n"
- "1: lld %1, %2 # atomic64_sub_if_positive\n"
- " dsubu %0, %1, %3 \n"
- " move %1, %0 \n"
- " bltz %0, 1f \n"
- " scd %1, %2 \n"
- "\t" __SC_BEQZ "%1, 1b \n"
- "1: \n"
- " .set pop \n"
- : "=&r" (result), "=&r" (temp),
- "+" GCC_OFF_SMALL_ASM() (v->counter)
- : "Ir" (i));
- } else {
- unsigned long flags;
-
- raw_local_irq_save(flags);
- result = v->counter;
- result -= i;
- if (result >= 0)
- v->counter = result;
- raw_local_irq_restore(flags);
- }
-
- smp_llsc_mb();
-
- return result;
-}
-
#define atomic64_cmpxchg(v, o, n) \
((__typeof__((v)->counter))cmpxchg(&((v)->counter), (o), (n)))
#define atomic64_xchg(v, new) (xchg(&((v)->counter), (new)))
-/*
- * atomic64_dec_if_positive - decrement by 1 if old value positive
- * @v: pointer of type atomic64_t
- */
-#define atomic64_dec_if_positive(v) atomic64_sub_if_positive(1, v)
-
#endif /* CONFIG_64BIT */
#endif /* _ASM_ATOMIC_H */
--
2.23.0
Reorder conditions in our various bitops functions that check
kernel_uses_llsc such that they handle the !kernel_uses_llsc case first.
This allows us to avoid the need to duplicate the kernel_uses_llsc check
in all the other cases. For functions that don't involve barriers common
to the various implementations, we switch to returning from within each
if block, making each case easier to read in isolation.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 213 ++++++++++++++++-----------------
1 file changed, 105 insertions(+), 108 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 985d6a02f9ea..e300960717e0 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -52,11 +52,16 @@ int __mips_test_and_change_bit(unsigned long nr,
*/
static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
{
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
+ unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
unsigned long temp;
- if (kernel_uses_llsc && R10000_LLSC_WAR) {
+ if (!kernel_uses_llsc) {
+ __mips_set_bit(nr, addr);
+ return;
+ }
+
+ if (R10000_LLSC_WAR) {
__asm__ __volatile__(
" .set push \n"
" .set arch=r4000 \n"
@@ -68,8 +73,11 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
: "=&r" (temp), "=" GCC_OFF_SMALL_ASM() (*m)
: "ir" (1UL << bit), GCC_OFF_SMALL_ASM() (*m)
: __LLSC_CLOBBER);
+ return;
+ }
+
#if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6)
- } else if (kernel_uses_llsc && __builtin_constant_p(bit)) {
+ if (__builtin_constant_p(bit)) {
loongson_llsc_mb();
do {
__asm__ __volatile__(
@@ -80,23 +88,23 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
: "ir" (bit), "r" (~0)
: __LLSC_CLOBBER);
} while (unlikely(!temp));
+ return;
+ }
#endif /* CONFIG_CPU_MIPSR2 || CONFIG_CPU_MIPSR6 */
- } else if (kernel_uses_llsc) {
- loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_ARCH_LEVEL" \n"
- " " __LL "%0, %1 # set_bit \n"
- " or %0, %2 \n"
- " " __SC "%0, %1 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (1UL << bit)
- : __LLSC_CLOBBER);
- } while (unlikely(!temp));
- } else
- __mips_set_bit(nr, addr);
+
+ loongson_llsc_mb();
+ do {
+ __asm__ __volatile__(
+ " .set push \n"
+ " .set "MIPS_ISA_ARCH_LEVEL" \n"
+ " " __LL "%0, %1 # set_bit \n"
+ " or %0, %2 \n"
+ " " __SC "%0, %1 \n"
+ " .set pop \n"
+ : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
+ : "ir" (1UL << bit)
+ : __LLSC_CLOBBER);
+ } while (unlikely(!temp));
}
/*
@@ -111,11 +119,16 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
*/
static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
{
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
+ unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
unsigned long temp;
- if (kernel_uses_llsc && R10000_LLSC_WAR) {
+ if (!kernel_uses_llsc) {
+ __mips_clear_bit(nr, addr);
+ return;
+ }
+
+ if (R10000_LLSC_WAR) {
__asm__ __volatile__(
" .set push \n"
" .set arch=r4000 \n"
@@ -127,8 +140,11 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
: "ir" (~(1UL << bit))
: __LLSC_CLOBBER);
+ return;
+ }
+
#if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6)
- } else if (kernel_uses_llsc && __builtin_constant_p(bit)) {
+ if (__builtin_constant_p(bit)) {
loongson_llsc_mb();
do {
__asm__ __volatile__(
@@ -139,23 +155,23 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
: "ir" (bit)
: __LLSC_CLOBBER);
} while (unlikely(!temp));
+ return;
+ }
#endif /* CONFIG_CPU_MIPSR2 || CONFIG_CPU_MIPSR6 */
- } else if (kernel_uses_llsc) {
- loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_ARCH_LEVEL" \n"
- " " __LL "%0, %1 # clear_bit \n"
- " and %0, %2 \n"
- " " __SC "%0, %1 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (~(1UL << bit))
- : __LLSC_CLOBBER);
- } while (unlikely(!temp));
- } else
- __mips_clear_bit(nr, addr);
+
+ loongson_llsc_mb();
+ do {
+ __asm__ __volatile__(
+ " .set push \n"
+ " .set "MIPS_ISA_ARCH_LEVEL" \n"
+ " " __LL "%0, %1 # clear_bit \n"
+ " and %0, %2 \n"
+ " " __SC "%0, %1 \n"
+ " .set pop \n"
+ : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
+ : "ir" (~(1UL << bit))
+ : __LLSC_CLOBBER);
+ } while (unlikely(!temp));
}
/*
@@ -183,12 +199,16 @@ static inline void clear_bit_unlock(unsigned long nr, volatile unsigned long *ad
*/
static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
{
+ unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
+ unsigned long temp;
- if (kernel_uses_llsc && R10000_LLSC_WAR) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
+ if (!kernel_uses_llsc) {
+ __mips_change_bit(nr, addr);
+ return;
+ }
+ if (R10000_LLSC_WAR) {
__asm__ __volatile__(
" .set push \n"
" .set arch=r4000 \n"
@@ -200,25 +220,22 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
: "ir" (1UL << bit)
: __LLSC_CLOBBER);
- } else if (kernel_uses_llsc) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
+ return;
+ }
- loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_ARCH_LEVEL" \n"
- " " __LL "%0, %1 # change_bit \n"
- " xor %0, %2 \n"
- " " __SC "%0, %1 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (1UL << bit)
- : __LLSC_CLOBBER);
- } while (unlikely(!temp));
- } else
- __mips_change_bit(nr, addr);
+ loongson_llsc_mb();
+ do {
+ __asm__ __volatile__(
+ " .set push \n"
+ " .set "MIPS_ISA_ARCH_LEVEL" \n"
+ " " __LL "%0, %1 # change_bit \n"
+ " xor %0, %2 \n"
+ " " __SC "%0, %1 \n"
+ " .set pop \n"
+ : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
+ : "ir" (1UL << bit)
+ : __LLSC_CLOBBER);
+ } while (unlikely(!temp));
}
/*
@@ -232,15 +249,15 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
static inline int test_and_set_bit(unsigned long nr,
volatile unsigned long *addr)
{
+ unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
- unsigned long res;
+ unsigned long res, temp;
smp_mb__before_llsc();
- if (kernel_uses_llsc && R10000_LLSC_WAR) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
-
+ if (!kernel_uses_llsc) {
+ res = __mips_test_and_set_bit(nr, addr);
+ } else if (R10000_LLSC_WAR) {
__asm__ __volatile__(
" .set push \n"
" .set arch=r4000 \n"
@@ -253,10 +270,7 @@ static inline int test_and_set_bit(unsigned long nr,
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
: "r" (1UL << bit)
: __LLSC_CLOBBER);
- } else if (kernel_uses_llsc) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
-
+ } else {
loongson_llsc_mb();
do {
__asm__ __volatile__(
@@ -272,8 +286,7 @@ static inline int test_and_set_bit(unsigned long nr,
} while (unlikely(!res));
res = temp & (1UL << bit);
- } else
- res = __mips_test_and_set_bit(nr, addr);
+ }
smp_llsc_mb();
@@ -291,13 +304,13 @@ static inline int test_and_set_bit(unsigned long nr,
static inline int test_and_set_bit_lock(unsigned long nr,
volatile unsigned long *addr)
{
+ unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
- unsigned long res;
-
- if (kernel_uses_llsc && R10000_LLSC_WAR) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
+ unsigned long res, temp;
+ if (!kernel_uses_llsc) {
+ res = __mips_test_and_set_bit_lock(nr, addr);
+ } else if (R10000_LLSC_WAR) {
__asm__ __volatile__(
" .set push \n"
" .set arch=r4000 \n"
@@ -310,11 +323,7 @@ static inline int test_and_set_bit_lock(unsigned long nr,
: "=&r" (temp), "+m" (*m), "=&r" (res)
: "r" (1UL << bit)
: __LLSC_CLOBBER);
- } else if (kernel_uses_llsc) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
-
- loongson_llsc_mb();
+ } else {
do {
__asm__ __volatile__(
" .set push \n"
@@ -329,8 +338,7 @@ static inline int test_and_set_bit_lock(unsigned long nr,
} while (unlikely(!res));
res = temp & (1UL << bit);
- } else
- res = __mips_test_and_set_bit_lock(nr, addr);
+ }
smp_llsc_mb();
@@ -347,15 +355,15 @@ static inline int test_and_set_bit_lock(unsigned long nr,
static inline int test_and_clear_bit(unsigned long nr,
volatile unsigned long *addr)
{
+ unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
- unsigned long res;
+ unsigned long res, temp;
smp_mb__before_llsc();
- if (kernel_uses_llsc && R10000_LLSC_WAR) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
-
+ if (!kernel_uses_llsc) {
+ res = __mips_test_and_clear_bit(nr, addr);
+ } else if (R10000_LLSC_WAR) {
__asm__ __volatile__(
" .set push \n"
" .set arch=r4000 \n"
@@ -370,10 +378,7 @@ static inline int test_and_clear_bit(unsigned long nr,
: "r" (1UL << bit)
: __LLSC_CLOBBER);
#if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6)
- } else if (kernel_uses_llsc && __builtin_constant_p(nr)) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
-
+ } else if (__builtin_constant_p(nr)) {
loongson_llsc_mb();
do {
__asm__ __volatile__(
@@ -386,10 +391,7 @@ static inline int test_and_clear_bit(unsigned long nr,
: __LLSC_CLOBBER);
} while (unlikely(!temp));
#endif
- } else if (kernel_uses_llsc) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
-
+ } else {
loongson_llsc_mb();
do {
__asm__ __volatile__(
@@ -406,8 +408,7 @@ static inline int test_and_clear_bit(unsigned long nr,
} while (unlikely(!res));
res = temp & (1UL << bit);
- } else
- res = __mips_test_and_clear_bit(nr, addr);
+ }
smp_llsc_mb();
@@ -425,15 +426,15 @@ static inline int test_and_clear_bit(unsigned long nr,
static inline int test_and_change_bit(unsigned long nr,
volatile unsigned long *addr)
{
+ unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
- unsigned long res;
+ unsigned long res, temp;
smp_mb__before_llsc();
- if (kernel_uses_llsc && R10000_LLSC_WAR) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
-
+ if (!kernel_uses_llsc) {
+ res = __mips_test_and_change_bit(nr, addr);
+ } else if (R10000_LLSC_WAR) {
__asm__ __volatile__(
" .set push \n"
" .set arch=r4000 \n"
@@ -446,10 +447,7 @@ static inline int test_and_change_bit(unsigned long nr,
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
: "r" (1UL << bit)
: __LLSC_CLOBBER);
- } else if (kernel_uses_llsc) {
- unsigned long *m = ((unsigned long *) addr) + (nr >> SZLONG_LOG);
- unsigned long temp;
-
+ } else {
loongson_llsc_mb();
do {
__asm__ __volatile__(
@@ -465,8 +463,7 @@ static inline int test_and_change_bit(unsigned long nr,
} while (unlikely(!res));
res = temp & (1UL << bit);
- } else
- res = __mips_test_and_change_bit(nr, addr);
+ }
smp_llsc_mb();
--
2.23.0
Cut down on duplication by generalizing the ATOMIC_OP(),
ATOMIC_OP_RETURN() & ATOMIC_FETCH_OP() macros to work for both 32b &
64b atomics, and removing the ATOMIC64_ variants. This ensures
consistency between our atomic_* & atomic64_* functions.
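The technique is plain token pasting; a stand-alone user-space sketch
(illustrative names only - the real macros also take the ll/sc mnemonics
and emit inline asm) looks like this:
  #include <stdio.h>
  #define GEN_ADD(pfx, type)				\
  static type pfx##_add(type i, type *v)		\
  {							\
  	*v += i;					\
  	return *v;					\
  }
  GEN_ADD(atomic, int)		/* defines atomic_add()   */
  GEN_ADD(atomic64, long long)	/* defines atomic64_add() */
  int main(void)
  {
  	int a = 1;
  	long long b = 1;
  	printf("%d %lld\n", atomic_add(2, &a), atomic64_add(2, &b));
  	return 0;
  }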
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/atomic.h | 196 ++++++++-------------------------
1 file changed, 45 insertions(+), 151 deletions(-)
diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index ace2ea005588..b834af5a7382 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -42,10 +42,10 @@
*/
#define atomic_set(v, i) WRITE_ONCE((v)->counter, (i))
-#define ATOMIC_OP(op, c_op, asm_op) \
-static __inline__ void atomic_##op(int i, atomic_t * v) \
+#define ATOMIC_OP(pfx, op, type, c_op, asm_op, ll, sc) \
+static __inline__ void pfx##_##op(type i, pfx##_t * v) \
{ \
- int temp; \
+ type temp; \
\
if (!kernel_uses_llsc) { \
unsigned long flags; \
@@ -60,19 +60,19 @@ static __inline__ void atomic_##op(int i, atomic_t * v) \
__asm__ __volatile__( \
" .set push \n" \
" .set " MIPS_ISA_LEVEL " \n" \
- "1: ll %0, %1 # atomic_" #op " \n" \
+ "1: " #ll " %0, %1 # " #pfx "_" #op " \n" \
" " #asm_op " %0, %2 \n" \
- " sc %0, %1 \n" \
+ " " #sc " %0, %1 \n" \
"\t" __SC_BEQZ "%0, 1b \n" \
" .set pop \n" \
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter) \
: "Ir" (i) : __LLSC_CLOBBER); \
}
-#define ATOMIC_OP_RETURN(op, c_op, asm_op) \
-static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \
+#define ATOMIC_OP_RETURN(pfx, op, type, c_op, asm_op, ll, sc) \
+static __inline__ type pfx##_##op##_return_relaxed(type i, pfx##_t * v) \
{ \
- int temp, result; \
+ type temp, result; \
\
if (!kernel_uses_llsc) { \
unsigned long flags; \
@@ -89,9 +89,9 @@ static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \
__asm__ __volatile__( \
" .set push \n" \
" .set " MIPS_ISA_LEVEL " \n" \
- "1: ll %1, %2 # atomic_" #op "_return \n" \
+ "1: " #ll " %1, %2 # " #pfx "_" #op "_return\n" \
" " #asm_op " %0, %1, %3 \n" \
- " sc %0, %2 \n" \
+ " " #sc " %0, %2 \n" \
"\t" __SC_BEQZ "%0, 1b \n" \
" " #asm_op " %0, %1, %3 \n" \
" .set pop \n" \
@@ -102,8 +102,8 @@ static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \
return result; \
}
-#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
-static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v) \
+#define ATOMIC_FETCH_OP(pfx, op, type, c_op, asm_op, ll, sc) \
+static __inline__ type pfx##_fetch_##op##_relaxed(type i, pfx##_t * v) \
{ \
int temp, result; \
\
@@ -120,10 +120,10 @@ static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v) \
loongson_llsc_mb(); \
__asm__ __volatile__( \
" .set push \n" \
- " .set "MIPS_ISA_LEVEL" \n" \
- "1: ll %1, %2 # atomic_fetch_" #op " \n" \
+ " .set " MIPS_ISA_LEVEL " \n" \
+ "1: " #ll " %1, %2 # " #pfx "_fetch_" #op "\n" \
" " #asm_op " %0, %1, %3 \n" \
- " sc %0, %2 \n" \
+ " " #sc " %0, %2 \n" \
"\t" __SC_BEQZ "%0, 1b \n" \
" .set pop \n" \
" move %0, %1 \n" \
@@ -134,32 +134,50 @@ static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v) \
return result; \
}
-#define ATOMIC_OPS(op, c_op, asm_op) \
- ATOMIC_OP(op, c_op, asm_op) \
- ATOMIC_OP_RETURN(op, c_op, asm_op) \
- ATOMIC_FETCH_OP(op, c_op, asm_op)
+#define ATOMIC_OPS(pfx, op, type, c_op, asm_op, ll, sc) \
+ ATOMIC_OP(pfx, op, type, c_op, asm_op, ll, sc) \
+ ATOMIC_OP_RETURN(pfx, op, type, c_op, asm_op, ll, sc) \
+ ATOMIC_FETCH_OP(pfx, op, type, c_op, asm_op, ll, sc)
-ATOMIC_OPS(add, +=, addu)
-ATOMIC_OPS(sub, -=, subu)
+ATOMIC_OPS(atomic, add, int, +=, addu, ll, sc)
+ATOMIC_OPS(atomic, sub, int, -=, subu, ll, sc)
#define atomic_add_return_relaxed atomic_add_return_relaxed
#define atomic_sub_return_relaxed atomic_sub_return_relaxed
#define atomic_fetch_add_relaxed atomic_fetch_add_relaxed
#define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed
+#ifdef CONFIG_64BIT
+ATOMIC_OPS(atomic64, add, s64, +=, daddu, lld, scd)
+ATOMIC_OPS(atomic64, sub, s64, -=, dsubu, lld, scd)
+# define atomic64_add_return_relaxed atomic64_add_return_relaxed
+# define atomic64_sub_return_relaxed atomic64_sub_return_relaxed
+# define atomic64_fetch_add_relaxed atomic64_fetch_add_relaxed
+# define atomic64_fetch_sub_relaxed atomic64_fetch_sub_relaxed
+#endif /* CONFIG_64BIT */
+
#undef ATOMIC_OPS
-#define ATOMIC_OPS(op, c_op, asm_op) \
- ATOMIC_OP(op, c_op, asm_op) \
- ATOMIC_FETCH_OP(op, c_op, asm_op)
+#define ATOMIC_OPS(pfx, op, type, c_op, asm_op, ll, sc) \
+ ATOMIC_OP(pfx, op, type, c_op, asm_op, ll, sc) \
+ ATOMIC_FETCH_OP(pfx, op, type, c_op, asm_op, ll, sc)
-ATOMIC_OPS(and, &=, and)
-ATOMIC_OPS(or, |=, or)
-ATOMIC_OPS(xor, ^=, xor)
+ATOMIC_OPS(atomic, and, int, &=, and, ll, sc)
+ATOMIC_OPS(atomic, or, int, |=, or, ll, sc)
+ATOMIC_OPS(atomic, xor, int, ^=, xor, ll, sc)
#define atomic_fetch_and_relaxed atomic_fetch_and_relaxed
#define atomic_fetch_or_relaxed atomic_fetch_or_relaxed
#define atomic_fetch_xor_relaxed atomic_fetch_xor_relaxed
+#ifdef CONFIG_64BIT
+ATOMIC_OPS(atomic64, and, s64, &=, and, lld, scd)
+ATOMIC_OPS(atomic64, or, s64, |=, or, lld, scd)
+ATOMIC_OPS(atomic64, xor, s64, ^=, xor, lld, scd)
+# define atomic64_fetch_and_relaxed atomic64_fetch_and_relaxed
+# define atomic64_fetch_or_relaxed atomic64_fetch_or_relaxed
+# define atomic64_fetch_xor_relaxed atomic64_fetch_xor_relaxed
+#endif
+
#undef ATOMIC_OPS
#undef ATOMIC_FETCH_OP
#undef ATOMIC_OP_RETURN
@@ -243,130 +261,6 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
*/
#define atomic64_set(v, i) WRITE_ONCE((v)->counter, (i))
-#define ATOMIC64_OP(op, c_op, asm_op) \
-static __inline__ void atomic64_##op(s64 i, atomic64_t * v) \
-{ \
- if (kernel_uses_llsc) { \
- s64 temp; \
- \
- loongson_llsc_mb(); \
- __asm__ __volatile__( \
- " .set push \n" \
- " .set "MIPS_ISA_LEVEL" \n" \
- "1: lld %0, %1 # atomic64_" #op " \n" \
- " " #asm_op " %0, %2 \n" \
- " scd %0, %1 \n" \
- "\t" __SC_BEQZ "%0, 1b \n" \
- " .set pop \n" \
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter) \
- : "Ir" (i) : __LLSC_CLOBBER); \
- } else { \
- unsigned long flags; \
- \
- raw_local_irq_save(flags); \
- v->counter c_op i; \
- raw_local_irq_restore(flags); \
- } \
-}
-
-#define ATOMIC64_OP_RETURN(op, c_op, asm_op) \
-static __inline__ s64 atomic64_##op##_return_relaxed(s64 i, atomic64_t * v) \
-{ \
- s64 result; \
- \
- if (kernel_uses_llsc) { \
- s64 temp; \
- \
- loongson_llsc_mb(); \
- __asm__ __volatile__( \
- " .set push \n" \
- " .set "MIPS_ISA_LEVEL" \n" \
- "1: lld %1, %2 # atomic64_" #op "_return\n" \
- " " #asm_op " %0, %1, %3 \n" \
- " scd %0, %2 \n" \
- "\t" __SC_BEQZ "%0, 1b \n" \
- " " #asm_op " %0, %1, %3 \n" \
- " .set pop \n" \
- : "=&r" (result), "=&r" (temp), \
- "+" GCC_OFF_SMALL_ASM() (v->counter) \
- : "Ir" (i) : __LLSC_CLOBBER); \
- } else { \
- unsigned long flags; \
- \
- raw_local_irq_save(flags); \
- result = v->counter; \
- result c_op i; \
- v->counter = result; \
- raw_local_irq_restore(flags); \
- } \
- \
- return result; \
-}
-
-#define ATOMIC64_FETCH_OP(op, c_op, asm_op) \
-static __inline__ s64 atomic64_fetch_##op##_relaxed(s64 i, atomic64_t * v) \
-{ \
- s64 result; \
- \
- if (kernel_uses_llsc) { \
- s64 temp; \
- \
- loongson_llsc_mb(); \
- __asm__ __volatile__( \
- " .set push \n" \
- " .set "MIPS_ISA_LEVEL" \n" \
- "1: lld %1, %2 # atomic64_fetch_" #op "\n" \
- " " #asm_op " %0, %1, %3 \n" \
- " scd %0, %2 \n" \
- "\t" __SC_BEQZ "%0, 1b \n" \
- " move %0, %1 \n" \
- " .set pop \n" \
- : "=&r" (result), "=&r" (temp), \
- "+" GCC_OFF_SMALL_ASM() (v->counter) \
- : "Ir" (i) : __LLSC_CLOBBER); \
- } else { \
- unsigned long flags; \
- \
- raw_local_irq_save(flags); \
- result = v->counter; \
- v->counter c_op i; \
- raw_local_irq_restore(flags); \
- } \
- \
- return result; \
-}
-
-#define ATOMIC64_OPS(op, c_op, asm_op) \
- ATOMIC64_OP(op, c_op, asm_op) \
- ATOMIC64_OP_RETURN(op, c_op, asm_op) \
- ATOMIC64_FETCH_OP(op, c_op, asm_op)
-
-ATOMIC64_OPS(add, +=, daddu)
-ATOMIC64_OPS(sub, -=, dsubu)
-
-#define atomic64_add_return_relaxed atomic64_add_return_relaxed
-#define atomic64_sub_return_relaxed atomic64_sub_return_relaxed
-#define atomic64_fetch_add_relaxed atomic64_fetch_add_relaxed
-#define atomic64_fetch_sub_relaxed atomic64_fetch_sub_relaxed
-
-#undef ATOMIC64_OPS
-#define ATOMIC64_OPS(op, c_op, asm_op) \
- ATOMIC64_OP(op, c_op, asm_op) \
- ATOMIC64_FETCH_OP(op, c_op, asm_op)
-
-ATOMIC64_OPS(and, &=, and)
-ATOMIC64_OPS(or, |=, or)
-ATOMIC64_OPS(xor, ^=, xor)
-
-#define atomic64_fetch_and_relaxed atomic64_fetch_and_relaxed
-#define atomic64_fetch_or_relaxed atomic64_fetch_or_relaxed
-#define atomic64_fetch_xor_relaxed atomic64_fetch_xor_relaxed
-
-#undef ATOMIC64_OPS
-#undef ATOMIC64_FETCH_OP
-#undef ATOMIC64_OP_RETURN
-#undef ATOMIC64_OP
-
/*
* atomic64_sub_if_positive - conditionally subtract integer from atomic
* variable
--
2.23.0
Generate the sync instructions required to work around the Loongson3
LL/SC errata within inline asm blocks. This feels a little safer than
doing it from C, where strictly speaking the compiler would be well
within its rights to insert a memory access between the separate asm
statements we previously had, containing sync & ll instructions
respectively.
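A rough sketch of the hazard being closed (MIPS-targeted, with made-up
function names and simplified operands/constraints, not the patch
itself): ordinary accesses may be scheduled between two separate asm
statements, but not within a single one:
  /* Separate statements: the compiler may place loads/stores of other
   * variables between the sync and the ll. */
  static inline int risky_read(int *p)
  {
  	int val;
  	__asm__ __volatile__("sync" ::: "memory");
  	__asm__ __volatile__("ll %0, %1" : "=r" (val) : "m" (*p));
  	return val;
  }
  /* One statement: the two instructions are guaranteed to stay adjacent. */
  static inline int safe_read(int *p)
  {
  	int val;
  	__asm__ __volatile__(
  	"	sync			\n"
  	"	ll	%0, %1		\n"
  	: "=r" (val)
  	: "m" (*p));
  	return val;
  }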
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/atomic.h | 20 ++++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index b834af5a7382..841ff274ada6 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -21,6 +21,7 @@
#include <asm/cpu-features.h>
#include <asm/cmpxchg.h>
#include <asm/llsc.h>
+#include <asm/sync.h>
#include <asm/war.h>
#define ATOMIC_INIT(i) { (i) }
@@ -56,10 +57,10 @@ static __inline__ void pfx##_##op(type i, pfx##_t * v) \
return; \
} \
\
- loongson_llsc_mb(); \
__asm__ __volatile__( \
" .set push \n" \
" .set " MIPS_ISA_LEVEL " \n" \
+ " " __SYNC(full, loongson3_war) " \n" \
"1: " #ll " %0, %1 # " #pfx "_" #op " \n" \
" " #asm_op " %0, %2 \n" \
" " #sc " %0, %1 \n" \
@@ -85,10 +86,10 @@ static __inline__ type pfx##_##op##_return_relaxed(type i, pfx##_t * v) \
return result; \
} \
\
- loongson_llsc_mb(); \
__asm__ __volatile__( \
" .set push \n" \
" .set " MIPS_ISA_LEVEL " \n" \
+ " " __SYNC(full, loongson3_war) " \n" \
"1: " #ll " %1, %2 # " #pfx "_" #op "_return\n" \
" " #asm_op " %0, %1, %3 \n" \
" " #sc " %0, %2 \n" \
@@ -117,10 +118,10 @@ static __inline__ type pfx##_fetch_##op##_relaxed(type i, pfx##_t * v) \
return result; \
} \
\
- loongson_llsc_mb(); \
__asm__ __volatile__( \
" .set push \n" \
" .set " MIPS_ISA_LEVEL " \n" \
+ " " __SYNC(full, loongson3_war) " \n" \
"1: " #ll " %1, %2 # " #pfx "_fetch_" #op "\n" \
" " #asm_op " %0, %1, %3 \n" \
" " #sc " %0, %2 \n" \
@@ -200,10 +201,10 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
if (kernel_uses_llsc) {
int temp;
- loongson_llsc_mb();
__asm__ __volatile__(
" .set push \n"
" .set "MIPS_ISA_LEVEL" \n"
+ " " __SYNC(full, loongson3_war) " \n"
"1: ll %1, %2 # atomic_sub_if_positive\n"
" .set pop \n"
" subu %0, %1, %3 \n"
@@ -213,7 +214,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
" .set "MIPS_ISA_LEVEL" \n"
" sc %1, %2 \n"
"\t" __SC_BEQZ "%1, 1b \n"
- "2: \n"
+ "2: " __SYNC(full, loongson3_war) " \n"
" .set pop \n"
: "=&r" (result), "=&r" (temp),
"+" GCC_OFF_SMALL_ASM() (v->counter)
@@ -229,7 +230,14 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
raw_local_irq_restore(flags);
}
- smp_llsc_mb();
+ /*
+ * In the Loongson3 workaround case we already have a completion
+ * barrier at 2: above, which is needed due to the bltz that can branch
+ * to code outside of the LL/SC loop. As such, we don't need to emit
+ * another barrier here.
+ */
+ if (!__SYNC_loongson3_war)
+ smp_llsc_mb();
return result;
}
--
2.23.0
Use smp_mb__before_atomic() & smp_mb__after_atomic() in
atomic_sub_if_positive() rather than the equivalent
smp_mb__before_llsc() & smp_llsc_mb(). The former are more standard &
this prepares us to avoid redundant barriers on Loongson3 in
a later patch.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/atomic.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 841ff274ada6..24443ef29337 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -196,7 +196,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
{
int result;
- smp_mb__before_llsc();
+ smp_mb__before_atomic();
if (kernel_uses_llsc) {
int temp;
@@ -237,7 +237,7 @@ static __inline__ int atomic_sub_if_positive(int i, atomic_t * v)
* another barrier here.
*/
if (!__SYNC_loongson3_war)
- smp_llsc_mb();
+ smp_mb__after_atomic();
return result;
}
--
2.23.0
set_bit() can set bits 0-15 using an ori instruction, rather than
loading the value -1 into a register & then using an ins instruction.
That is, rather than the following:
  li   t0, -1
  ll   t1, 0(t2)
  ins  t1, t0, 4, 1
  sc   t1, 0(t2)
We can have the simpler:
  ll   t1, 0(t2)
  ori  t1, t1, 0x10
  sc   t1, 0(t2)
The or path already allows immediates to be used, so simply restricting
the ins path to bits that don't fit in immediates is sufficient to take
advantage of this.
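For reference, a small user-space sketch (not part of the patch) of why
bit 16 is the cut-off: ori takes a 16-bit zero-extended immediate, so the
single-bit masks for bits 0-15 fit it directly:
  #include <stdio.h>
  int main(void)
  {
  	for (unsigned int bit = 14; bit < 18; bit++) {
  		unsigned long mask = 1UL << bit;
  		printf("bit %2u: mask 0x%-7lx %s\n", bit, mask,
  		       mask <= 0xffff ? "fits an ori immediate"
  				      : "needs ins or a register");
  	}
  	return 0;
  }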
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index e300960717e0..1e5739191ddf 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -77,7 +77,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
}
#if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6)
- if (__builtin_constant_p(bit)) {
+ if (__builtin_constant_p(bit) && (bit >= 16)) {
loongson_llsc_mb();
do {
__asm__ __volatile__(
--
2.23.0
Use smp_mb__before_atomic() rather than smp_mb__before_llsc() in
test_and_set_bit(), test_and_clear_bit() & test_and_change_bit(). The
_atomic() versions make semantic sense in these cases, and will allow a
later patch to omit redundant barriers for Loongson3 systems that
already include a barrier within __test_bit_op().
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index c08b6d225f10..a74769940fbd 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -209,7 +209,7 @@ static inline int test_and_set_bit_lock(unsigned long nr,
static inline int test_and_set_bit(unsigned long nr,
volatile unsigned long *addr)
{
- smp_mb__before_llsc();
+ smp_mb__before_atomic();
return test_and_set_bit_lock(nr, addr);
}
@@ -228,7 +228,7 @@ static inline int test_and_clear_bit(unsigned long nr,
int bit = nr % BITS_PER_LONG;
unsigned long res, orig;
- smp_mb__before_llsc();
+ smp_mb__before_atomic();
if (!kernel_uses_llsc) {
res = __mips_test_and_clear_bit(nr, addr);
@@ -265,7 +265,7 @@ static inline int test_and_change_bit(unsigned long nr,
int bit = nr % BITS_PER_LONG;
unsigned long res, orig;
- smp_mb__before_llsc();
+ smp_mb__before_atomic();
if (!kernel_uses_llsc) {
res = __mips_test_and_change_bit(nr, addr);
--
2.23.0
The start position for an ins instruction is always encoded as an
immediate, so allowing registers to be used by the inline asm makes no
sense. It should never happen anyway since a bit index should always be
small enough to be treated as an immediate, but remove the nonsensical
"r" for sanity.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 0f5329e32e87..03532ae9f528 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -85,7 +85,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
" " __INS "%0, %3, %2, 1 \n"
" " __SC "%0, %1 \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (bit), "r" (~0)
+ : "i" (bit), "r" (~0)
: __LLSC_CLOBBER);
} while (unlikely(!temp));
return;
@@ -150,7 +150,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
" " __INS "%0, $0, %2, 1 \n"
" " __SC "%0, %1 \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (bit)
+ : "i" (bit)
: __LLSC_CLOBBER);
} while (unlikely(!temp));
return;
@@ -383,7 +383,7 @@ static inline int test_and_clear_bit(unsigned long nr,
" " __INS "%0, $0, %3, 1 \n"
" " __SC "%0, %1 \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (bit)
+ : "i" (bit)
: __LLSC_CLOBBER);
} while (unlikely(!temp));
} else {
--
2.23.0
The only difference between test_and_set_bit() & test_and_set_bit_lock()
is their memory barrier semantics: the former provides a full
barrier whilst the latter only provides acquire semantics.
We can therefore implement test_and_set_bit() in terms of
test_and_set_bit_lock() with the addition of the extra memory barrier.
Do this in order to avoid duplicating logic.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 66 +++++++---------------------------
arch/mips/lib/bitops.c | 26 --------------
2 files changed, 13 insertions(+), 79 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 03532ae9f528..ea35a2e87b6d 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -31,8 +31,6 @@
void __mips_set_bit(unsigned long nr, volatile unsigned long *addr);
void __mips_clear_bit(unsigned long nr, volatile unsigned long *addr);
void __mips_change_bit(unsigned long nr, volatile unsigned long *addr);
-int __mips_test_and_set_bit(unsigned long nr,
- volatile unsigned long *addr);
int __mips_test_and_set_bit_lock(unsigned long nr,
volatile unsigned long *addr);
int __mips_test_and_clear_bit(unsigned long nr,
@@ -236,24 +234,22 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
}
/*
- * test_and_set_bit - Set a bit and return its old value
+ * test_and_set_bit_lock - Set a bit and return its old value
* @nr: Bit to set
* @addr: Address to count from
*
- * This operation is atomic and cannot be reordered.
- * It also implies a memory barrier.
+ * This operation is atomic and implies acquire ordering semantics
+ * after the memory operation.
*/
-static inline int test_and_set_bit(unsigned long nr,
+static inline int test_and_set_bit_lock(unsigned long nr,
volatile unsigned long *addr)
{
unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
unsigned long res, temp;
- smp_mb__before_llsc();
-
if (!kernel_uses_llsc) {
- res = __mips_test_and_set_bit(nr, addr);
+ res = __mips_test_and_set_bit_lock(nr, addr);
} else if (R10000_LLSC_WAR) {
__asm__ __volatile__(
" .set push \n"
@@ -264,7 +260,7 @@ static inline int test_and_set_bit(unsigned long nr,
" beqzl %2, 1b \n"
" and %2, %0, %3 \n"
" .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
+ : "=&r" (temp), "+m" (*m), "=&r" (res)
: "r" (1UL << bit)
: __LLSC_CLOBBER);
} else {
@@ -291,56 +287,20 @@ static inline int test_and_set_bit(unsigned long nr,
}
/*
- * test_and_set_bit_lock - Set a bit and return its old value
+ * test_and_set_bit - Set a bit and return its old value
* @nr: Bit to set
* @addr: Address to count from
*
- * This operation is atomic and implies acquire ordering semantics
- * after the memory operation.
+ * This operation is atomic and cannot be reordered.
+ * It also implies a memory barrier.
*/
-static inline int test_and_set_bit_lock(unsigned long nr,
+static inline int test_and_set_bit(unsigned long nr,
volatile unsigned long *addr)
{
- unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
- int bit = nr & SZLONG_MASK;
- unsigned long res, temp;
-
- if (!kernel_uses_llsc) {
- res = __mips_test_and_set_bit_lock(nr, addr);
- } else if (R10000_LLSC_WAR) {
- __asm__ __volatile__(
- " .set push \n"
- " .set arch=r4000 \n"
- "1: " __LL "%0, %1 # test_and_set_bit \n"
- " or %2, %0, %3 \n"
- " " __SC "%2, %1 \n"
- " beqzl %2, 1b \n"
- " and %2, %0, %3 \n"
- " .set pop \n"
- : "=&r" (temp), "+m" (*m), "=&r" (res)
- : "r" (1UL << bit)
- : __LLSC_CLOBBER);
- } else {
- do {
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_ARCH_LEVEL" \n"
- " " __LL "%0, %1 # test_and_set_bit \n"
- " or %2, %0, %3 \n"
- " " __SC "%2, %1 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "r" (1UL << bit)
- : __LLSC_CLOBBER);
- } while (unlikely(!res));
-
- res = temp & (1UL << bit);
- }
-
- smp_llsc_mb();
-
- return res != 0;
+ smp_mb__before_llsc();
+ return test_and_set_bit_lock(nr, addr);
}
+
/*
* test_and_clear_bit - Clear a bit and return its old value
* @nr: Bit to clear
diff --git a/arch/mips/lib/bitops.c b/arch/mips/lib/bitops.c
index 3b2a1e78a543..fba402c0879d 100644
--- a/arch/mips/lib/bitops.c
+++ b/arch/mips/lib/bitops.c
@@ -77,32 +77,6 @@ void __mips_change_bit(unsigned long nr, volatile unsigned long *addr)
EXPORT_SYMBOL(__mips_change_bit);
-/**
- * __mips_test_and_set_bit - Set a bit and return its old value. This is
- * called by test_and_set_bit() if it cannot find a faster solution.
- * @nr: Bit to set
- * @addr: Address to count from
- */
-int __mips_test_and_set_bit(unsigned long nr,
- volatile unsigned long *addr)
-{
- unsigned long *a = (unsigned long *)addr;
- unsigned bit = nr & SZLONG_MASK;
- unsigned long mask;
- unsigned long flags;
- int res;
-
- a += nr >> SZLONG_LOG;
- mask = 1UL << bit;
- raw_local_irq_save(flags);
- res = (mask & *a) != 0;
- *a |= mask;
- raw_local_irq_restore(flags);
- return res;
-}
-EXPORT_SYMBOL(__mips_test_and_set_bit);
-
-
/**
* __mips_test_and_set_bit_lock - Set a bit and return its old value. This is
* called by test_and_set_bit_lock() if it cannot find a faster solution.
--
2.23.0
When building a kernel configured to support Loongson3 LL/SC workarounds
(ie. CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) the inline assembly in
__xchg_asm() & __cmpxchg_asm() already emits completion barriers, and as
such we don't need to emit extra barriers from the xchg() or cmpxchg()
macros. Add compile-time constant checks causing us to omit the
redundant memory barriers.
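This relies only on __SYNC_loongson3_war being a compile-time constant
which is non-zero exactly when CONFIG_CPU_LOONGSON3_WORKAROUNDS=y, so the
compiler can discard the dead branch entirely. A minimal sketch of such a
definition (the exact value used by asm/sync.h may differ; only its
truthiness matters here):

    #ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS
    # define __SYNC_loongson3_war	(1 << 31)	/* any non-zero constant */
    #else
    # define __SYNC_loongson3_war	0
    #endif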
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/cmpxchg.h | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h
index fc121d20a980..820df68e32e1 100644
--- a/arch/mips/include/asm/cmpxchg.h
+++ b/arch/mips/include/asm/cmpxchg.h
@@ -94,7 +94,13 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
({ \
__typeof__(*(ptr)) __res; \
\
- smp_mb__before_llsc(); \
+ /* \
+ * In the Loongson3 workaround case __xchg_asm() already \
+ * contains a completion barrier prior to the LL, so we don't \
+ * need to emit an extra one here. \
+ */ \
+ if (!__SYNC_loongson3_war) \
+ smp_mb__before_llsc(); \
\
__res = (__typeof__(*(ptr))) \
__xchg((ptr), (unsigned long)(x), sizeof(*(ptr))); \
@@ -179,9 +185,23 @@ static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
({ \
__typeof__(*(ptr)) __res; \
\
- smp_mb__before_llsc(); \
+ /* \
+ * In the Loongson3 workaround case __cmpxchg_asm() already \
+ * contains a completion barrier prior to the LL, so we don't \
+ * need to emit an extra one here. \
+ */ \
+ if (!__SYNC_loongson3_war) \
+ smp_mb__before_llsc(); \
+ \
__res = cmpxchg_local((ptr), (old), (new)); \
- smp_llsc_mb(); \
+ \
+ /* \
+ * In the Loongson3 workaround case __cmpxchg_asm() already \
+ * contains a completion barrier after the SC, so we don't \
+ * need to emit an extra one here. \
+ */ \
+ if (!__SYNC_loongson3_war) \
+ smp_llsc_mb(); \
\
__res; \
})
--
2.23.0
Use the BIT() macro in asm/bitops.h rather than open-coding its
equivalent.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 7314ba5a3683..0f8ff896e86b 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -13,6 +13,7 @@
#error only <linux/bitops.h> can be included directly
#endif
+#include <linux/bits.h>
#include <linux/compiler.h>
#include <linux/types.h>
#include <asm/barrier.h>
@@ -70,7 +71,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
" beqzl %0, 1b \n"
" .set pop \n"
: "=&r" (temp), "=" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (1UL << bit), GCC_OFF_SMALL_ASM() (*m)
+ : "ir" (BIT(bit)), GCC_OFF_SMALL_ASM() (*m)
: __LLSC_CLOBBER);
return;
}
@@ -99,7 +100,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
" " __SC "%0, %1 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (1UL << bit)
+ : "ir" (BIT(bit))
: __LLSC_CLOBBER);
} while (unlikely(!temp));
}
@@ -135,7 +136,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
" beqzl %0, 1b \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (~(1UL << bit))
+ : "ir" (~(BIT(bit)))
: __LLSC_CLOBBER);
return;
}
@@ -164,7 +165,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
" " __SC "%0, %1 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (~(1UL << bit))
+ : "ir" (~(BIT(bit)))
: __LLSC_CLOBBER);
} while (unlikely(!temp));
}
@@ -213,7 +214,7 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
" beqzl %0, 1b \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (1UL << bit)
+ : "ir" (BIT(bit))
: __LLSC_CLOBBER);
return;
}
@@ -228,7 +229,7 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
" " __SC "%0, %1 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (1UL << bit)
+ : "ir" (BIT(bit))
: __LLSC_CLOBBER);
} while (unlikely(!temp));
}
@@ -261,7 +262,7 @@ static inline int test_and_set_bit_lock(unsigned long nr,
" and %2, %0, %3 \n"
" .set pop \n"
: "=&r" (temp), "+m" (*m), "=&r" (res)
- : "ir" (1UL << bit)
+ : "ir" (BIT(bit))
: __LLSC_CLOBBER);
} else {
loongson_llsc_mb();
@@ -274,11 +275,11 @@ static inline int test_and_set_bit_lock(unsigned long nr,
" " __SC "%2, %1 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (1UL << bit)
+ : "ir" (BIT(bit))
: __LLSC_CLOBBER);
} while (unlikely(!res));
- res = temp & (1UL << bit);
+ res = temp & BIT(bit);
}
smp_llsc_mb();
@@ -332,7 +333,7 @@ static inline int test_and_clear_bit(unsigned long nr,
" and %2, %0, %3 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (1UL << bit)
+ : "ir" (BIT(bit))
: __LLSC_CLOBBER);
} else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) {
loongson_llsc_mb();
@@ -358,11 +359,11 @@ static inline int test_and_clear_bit(unsigned long nr,
" " __SC "%2, %1 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (1UL << bit)
+ : "ir" (BIT(bit))
: __LLSC_CLOBBER);
} while (unlikely(!res));
- res = temp & (1UL << bit);
+ res = temp & BIT(bit);
}
smp_llsc_mb();
@@ -400,7 +401,7 @@ static inline int test_and_change_bit(unsigned long nr,
" and %2, %0, %3 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (1UL << bit)
+ : "ir" (BIT(bit))
: __LLSC_CLOBBER);
} else {
loongson_llsc_mb();
@@ -413,11 +414,11 @@ static inline int test_and_change_bit(unsigned long nr,
" " __SC "\t%2, %1 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (1UL << bit)
+ : "ir" (BIT(bit))
: __LLSC_CLOBBER);
} while (unlikely(!res));
- res = temp & (1UL << bit);
+ res = temp & BIT(bit);
}
smp_llsc_mb();
--
2.23.0
Generate the sync instructions required to work around Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C, where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync & ll instructions respectively.
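As an illustration, the old pattern was roughly equivalent to the
following (a simplified sketch, not the actual kernel code; declarations
and .set directives are omitted for brevity):

    loongson_llsc_mb();		/* separate asm: "sync" */
    /*
     * Nothing forbids the compiler from scheduling a memory access
     * here - a spill/reload of a local, for instance - which would
     * then sit between the sync & the ll, defeating the workaround.
     */
    __asm__ __volatile__(
    "1:	" __LL	"%0, %1		\n"
    "	or	%0, %2		\n"
    "	" __SC	"%0, %1		\n"
    "	beqz	%0, 1b		\n"
    : "=&r"(temp), "+m"(*m)
    : "ir"(mask));

With __SYNC(full, loongson3_war) emitted inside the same asm block as the
ll, no such window exists.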
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 11 ++---------
1 file changed, 2 insertions(+), 9 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index d39fca2def60..c08b6d225f10 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -31,6 +31,7 @@
asm volatile( \
" .set push \n" \
" .set " MIPS_ISA_LEVEL " \n" \
+ " " __SYNC(full, loongson3_war) " \n" \
"1: " __LL "%0, %1 \n" \
" " insn " \n" \
" " __SC "%0, %1 \n" \
@@ -47,6 +48,7 @@
asm volatile( \
" .set push \n" \
" .set " MIPS_ISA_LEVEL " \n" \
+ " " __SYNC(full, loongson3_war) " \n" \
"1: " __LL ll_dst ", %2 \n" \
" " insn " \n" \
" " __SC "%1, %2 \n" \
@@ -96,12 +98,10 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
}
if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit) && (bit >= 16)) {
- loongson_llsc_mb();
__bit_op(*m, __INS "%0, %3, %2, 1", "i"(bit), "r"(~0));
return;
}
- loongson_llsc_mb();
__bit_op(*m, "or\t%0, %2", "ir"(BIT(bit)));
}
@@ -126,12 +126,10 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
}
if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit)) {
- loongson_llsc_mb();
__bit_op(*m, __INS "%0, $0, %2, 1", "i"(bit));
return;
}
- loongson_llsc_mb();
__bit_op(*m, "and\t%0, %2", "ir"(~BIT(bit)));
}
@@ -168,7 +166,6 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
return;
}
- loongson_llsc_mb();
__bit_op(*m, "xor\t%0, %2", "ir"(BIT(bit)));
}
@@ -190,7 +187,6 @@ static inline int test_and_set_bit_lock(unsigned long nr,
if (!kernel_uses_llsc) {
res = __mips_test_and_set_bit_lock(nr, addr);
} else {
- loongson_llsc_mb();
orig = __test_bit_op(*m, "%0",
"or\t%1, %0, %3",
"ir"(BIT(bit)));
@@ -237,13 +233,11 @@ static inline int test_and_clear_bit(unsigned long nr,
if (!kernel_uses_llsc) {
res = __mips_test_and_clear_bit(nr, addr);
} else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) {
- loongson_llsc_mb();
res = __test_bit_op(*m, "%1",
__EXT "%0, %1, %3, 1;"
__INS "%1, $0, %3, 1",
"i"(bit));
} else {
- loongson_llsc_mb();
orig = __test_bit_op(*m, "%0",
"or\t%1, %0, %3;"
"xor\t%1, %1, %3",
@@ -276,7 +270,6 @@ static inline int test_and_change_bit(unsigned long nr,
if (!kernel_uses_llsc) {
res = __mips_test_and_change_bit(nr, addr);
} else {
- loongson_llsc_mb();
orig = __test_bit_op(*m, "%0",
"xor\t%1, %0, %3",
"ir"(BIT(bit)));
--
2.23.0
The IRQ-disabling non-LLSC fallbacks for bitops on UP systems already
return zero or one, so there's no need to perform another comparison
against zero. Move these comparisons into the LLSC paths to avoid the
redundant work.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 18 ++++++++++++------
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 0f8ff896e86b..7671db2a7b73 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -264,6 +264,8 @@ static inline int test_and_set_bit_lock(unsigned long nr,
: "=&r" (temp), "+m" (*m), "=&r" (res)
: "ir" (BIT(bit))
: __LLSC_CLOBBER);
+
+ res = res != 0;
} else {
loongson_llsc_mb();
do {
@@ -279,12 +281,12 @@ static inline int test_and_set_bit_lock(unsigned long nr,
: __LLSC_CLOBBER);
} while (unlikely(!res));
- res = temp & BIT(bit);
+ res = (temp & BIT(bit)) != 0;
}
smp_llsc_mb();
- return res != 0;
+ return res;
}
/*
@@ -335,6 +337,8 @@ static inline int test_and_clear_bit(unsigned long nr,
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
: "ir" (BIT(bit))
: __LLSC_CLOBBER);
+
+ res = res != 0;
} else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) {
loongson_llsc_mb();
do {
@@ -363,12 +367,12 @@ static inline int test_and_clear_bit(unsigned long nr,
: __LLSC_CLOBBER);
} while (unlikely(!res));
- res = temp & BIT(bit);
+ res = (temp & BIT(bit)) != 0;
}
smp_llsc_mb();
- return res != 0;
+ return res;
}
/*
@@ -403,6 +407,8 @@ static inline int test_and_change_bit(unsigned long nr,
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
: "ir" (BIT(bit))
: __LLSC_CLOBBER);
+
+ res = res != 0;
} else {
loongson_llsc_mb();
do {
@@ -418,12 +424,12 @@ static inline int test_and_change_bit(unsigned long nr,
: __LLSC_CLOBBER);
} while (unlikely(!res));
- res = temp & BIT(bit);
+ res = (temp & BIT(bit)) != 0;
}
smp_llsc_mb();
- return res != 0;
+ return res;
}
#include <asm-generic/bitops/non-atomic.h>
--
2.23.0
Rather than using custom SZLONG_LOG & SZLONG_MASK macros to shift & mask
a bit index to form word & bit offsets respectively, make use of the
standard BIT_WORD() & BITS_PER_LONG macros for the same purpose.
volatile is added to the definition of pointers to the long-sized word
we'll operate on, in order to prevent the compiler complaining that we
cast away the volatile qualifier of the addr argument. This should have
no effect on generated code, which in the LL/SC case is inline asm
anyway & in the non-LLSC case access is constrained by compiler barriers
provided by raw_local_irq_{save,restore}().
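For reference, the generic BIT_WORD(nr) is defined as (nr) /
BITS_PER_LONG, so on a 64-bit kernel the old & new forms compute the same
word pointer & bit offset (a worked example):

    old:	m = (unsigned long *)addr + (nr >> 6);	bit = nr & 63UL;
    new:	m = &addr[BIT_WORD(nr)];		bit = nr % BITS_PER_LONG;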
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 24 ++++++++++++------------
arch/mips/include/asm/llsc.h | 4 ----
arch/mips/lib/bitops.c | 31 +++++++++++++------------------
3 files changed, 25 insertions(+), 34 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index fba0a842b98a..d39fca2def60 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -87,8 +87,8 @@ int __mips_test_and_change_bit(unsigned long nr,
*/
static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
{
- unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
- int bit = nr & SZLONG_MASK;
+ volatile unsigned long *m = &addr[BIT_WORD(nr)];
+ int bit = nr % BITS_PER_LONG;
if (!kernel_uses_llsc) {
__mips_set_bit(nr, addr);
@@ -117,8 +117,8 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
*/
static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
{
- unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
- int bit = nr & SZLONG_MASK;
+ volatile unsigned long *m = &addr[BIT_WORD(nr)];
+ int bit = nr % BITS_PER_LONG;
if (!kernel_uses_llsc) {
__mips_clear_bit(nr, addr);
@@ -160,8 +160,8 @@ static inline void clear_bit_unlock(unsigned long nr, volatile unsigned long *ad
*/
static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
{
- unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
- int bit = nr & SZLONG_MASK;
+ volatile unsigned long *m = &addr[BIT_WORD(nr)];
+ int bit = nr % BITS_PER_LONG;
if (!kernel_uses_llsc) {
__mips_change_bit(nr, addr);
@@ -183,8 +183,8 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
static inline int test_and_set_bit_lock(unsigned long nr,
volatile unsigned long *addr)
{
- unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
- int bit = nr & SZLONG_MASK;
+ volatile unsigned long *m = &addr[BIT_WORD(nr)];
+ int bit = nr % BITS_PER_LONG;
unsigned long res, orig;
if (!kernel_uses_llsc) {
@@ -228,8 +228,8 @@ static inline int test_and_set_bit(unsigned long nr,
static inline int test_and_clear_bit(unsigned long nr,
volatile unsigned long *addr)
{
- unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
- int bit = nr & SZLONG_MASK;
+ volatile unsigned long *m = &addr[BIT_WORD(nr)];
+ int bit = nr % BITS_PER_LONG;
unsigned long res, orig;
smp_mb__before_llsc();
@@ -267,8 +267,8 @@ static inline int test_and_clear_bit(unsigned long nr,
static inline int test_and_change_bit(unsigned long nr,
volatile unsigned long *addr)
{
- unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
- int bit = nr & SZLONG_MASK;
+ volatile unsigned long *m = &addr[BIT_WORD(nr)];
+ int bit = nr % BITS_PER_LONG;
unsigned long res, orig;
smp_mb__before_llsc();
diff --git a/arch/mips/include/asm/llsc.h b/arch/mips/include/asm/llsc.h
index d240a4a2d1c4..c49738bc3bda 100644
--- a/arch/mips/include/asm/llsc.h
+++ b/arch/mips/include/asm/llsc.h
@@ -12,15 +12,11 @@
#include <asm/isa-rev.h>
#if _MIPS_SZLONG == 32
-#define SZLONG_LOG 5
-#define SZLONG_MASK 31UL
#define __LL "ll "
#define __SC "sc "
#define __INS "ins "
#define __EXT "ext "
#elif _MIPS_SZLONG == 64
-#define SZLONG_LOG 6
-#define SZLONG_MASK 63UL
#define __LL "lld "
#define __SC "scd "
#define __INS "dins "
diff --git a/arch/mips/lib/bitops.c b/arch/mips/lib/bitops.c
index fba402c0879d..116d0bd8b2ae 100644
--- a/arch/mips/lib/bitops.c
+++ b/arch/mips/lib/bitops.c
@@ -7,6 +7,7 @@
* Copyright (c) 1999, 2000 Silicon Graphics, Inc.
*/
#include <linux/bitops.h>
+#include <linux/bits.h>
#include <linux/irqflags.h>
#include <linux/export.h>
@@ -19,12 +20,11 @@
*/
void __mips_set_bit(unsigned long nr, volatile unsigned long *addr)
{
- unsigned long *a = (unsigned long *)addr;
- unsigned bit = nr & SZLONG_MASK;
+ volatile unsigned long *a = &addr[BIT_WORD(nr)];
+ unsigned int bit = nr % BITS_PER_LONG;
unsigned long mask;
unsigned long flags;
- a += nr >> SZLONG_LOG;
mask = 1UL << bit;
raw_local_irq_save(flags);
*a |= mask;
@@ -41,12 +41,11 @@ EXPORT_SYMBOL(__mips_set_bit);
*/
void __mips_clear_bit(unsigned long nr, volatile unsigned long *addr)
{
- unsigned long *a = (unsigned long *)addr;
- unsigned bit = nr & SZLONG_MASK;
+ volatile unsigned long *a = &addr[BIT_WORD(nr)];
+ unsigned int bit = nr % BITS_PER_LONG;
unsigned long mask;
unsigned long flags;
- a += nr >> SZLONG_LOG;
mask = 1UL << bit;
raw_local_irq_save(flags);
*a &= ~mask;
@@ -63,12 +62,11 @@ EXPORT_SYMBOL(__mips_clear_bit);
*/
void __mips_change_bit(unsigned long nr, volatile unsigned long *addr)
{
- unsigned long *a = (unsigned long *)addr;
- unsigned bit = nr & SZLONG_MASK;
+ volatile unsigned long *a = &addr[BIT_WORD(nr)];
+ unsigned int bit = nr % BITS_PER_LONG;
unsigned long mask;
unsigned long flags;
- a += nr >> SZLONG_LOG;
mask = 1UL << bit;
raw_local_irq_save(flags);
*a ^= mask;
@@ -86,13 +84,12 @@ EXPORT_SYMBOL(__mips_change_bit);
int __mips_test_and_set_bit_lock(unsigned long nr,
volatile unsigned long *addr)
{
- unsigned long *a = (unsigned long *)addr;
- unsigned bit = nr & SZLONG_MASK;
+ volatile unsigned long *a = &addr[BIT_WORD(nr)];
+ unsigned int bit = nr % BITS_PER_LONG;
unsigned long mask;
unsigned long flags;
int res;
- a += nr >> SZLONG_LOG;
mask = 1UL << bit;
raw_local_irq_save(flags);
res = (mask & *a) != 0;
@@ -111,13 +108,12 @@ EXPORT_SYMBOL(__mips_test_and_set_bit_lock);
*/
int __mips_test_and_clear_bit(unsigned long nr, volatile unsigned long *addr)
{
- unsigned long *a = (unsigned long *)addr;
- unsigned bit = nr & SZLONG_MASK;
+ volatile unsigned long *a = &addr[BIT_WORD(nr)];
+ unsigned int bit = nr % BITS_PER_LONG;
unsigned long mask;
unsigned long flags;
int res;
- a += nr >> SZLONG_LOG;
mask = 1UL << bit;
raw_local_irq_save(flags);
res = (mask & *a) != 0;
@@ -136,13 +132,12 @@ EXPORT_SYMBOL(__mips_test_and_clear_bit);
*/
int __mips_test_and_change_bit(unsigned long nr, volatile unsigned long *addr)
{
- unsigned long *a = (unsigned long *)addr;
- unsigned bit = nr & SZLONG_MASK;
+ volatile unsigned long *a = &addr[BIT_WORD(nr)];
+ unsigned int bit = nr % BITS_PER_LONG;
unsigned long mask;
unsigned long flags;
int res;
- a += nr >> SZLONG_LOG;
mask = 1UL << bit;
raw_local_irq_save(flags);
res = (mask & *a) != 0;
--
2.23.0
Generate the sync instructions required to work around Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C, where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync & ll instructions respectively.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/cmpxchg.h | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/arch/mips/include/asm/cmpxchg.h b/arch/mips/include/asm/cmpxchg.h
index 5d3f0e3513b4..fc121d20a980 100644
--- a/arch/mips/include/asm/cmpxchg.h
+++ b/arch/mips/include/asm/cmpxchg.h
@@ -12,6 +12,7 @@
#include <linux/irqflags.h>
#include <asm/compiler.h>
#include <asm/llsc.h>
+#include <asm/sync.h>
#include <asm/war.h>
/*
@@ -36,12 +37,12 @@ extern unsigned long __xchg_called_with_bad_pointer(void)
__typeof(*(m)) __ret; \
\
if (kernel_uses_llsc) { \
- loongson_llsc_mb(); \
__asm__ __volatile__( \
" .set push \n" \
" .set noat \n" \
" .set push \n" \
" .set " MIPS_ISA_ARCH_LEVEL " \n" \
+ " " __SYNC(full, loongson3_war) " \n" \
"1: " ld " %0, %2 # __xchg_asm \n" \
" .set pop \n" \
" move $1, %z3 \n" \
@@ -108,12 +109,12 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
__typeof(*(m)) __ret; \
\
if (kernel_uses_llsc) { \
- loongson_llsc_mb(); \
__asm__ __volatile__( \
" .set push \n" \
" .set noat \n" \
" .set push \n" \
" .set "MIPS_ISA_ARCH_LEVEL" \n" \
+ " " __SYNC(full, loongson3_war) " \n" \
"1: " ld " %0, %2 # __cmpxchg_asm \n" \
" bne %0, %z3, 2f \n" \
" .set pop \n" \
@@ -122,11 +123,10 @@ static inline unsigned long __xchg(volatile void *ptr, unsigned long x,
" " st " $1, %1 \n" \
"\t" __SC_BEQZ "$1, 1b \n" \
" .set pop \n" \
- "2: \n" \
+ "2: " __SYNC(full, loongson3_war) " \n" \
: "=&r" (__ret), "=" GCC_OFF_SMALL_ASM() (*m) \
: GCC_OFF_SMALL_ASM() (*m), "Jr" (old), "Jr" (new) \
: __LLSC_CLOBBER); \
- loongson_llsc_mb(); \
} else { \
unsigned long __flags; \
\
@@ -222,11 +222,11 @@ static inline unsigned long __cmpxchg64(volatile void *ptr,
*/
local_irq_save(flags);
- loongson_llsc_mb();
asm volatile(
" .set push \n"
" .set " MIPS_ISA_ARCH_LEVEL " \n"
/* Load 64 bits from ptr */
+ " " __SYNC(full, loongson3_war) " \n"
"1: lld %L0, %3 # __cmpxchg64 \n"
/*
* Split the 64 bit value we loaded into the 2 registers that hold the
@@ -260,7 +260,7 @@ static inline unsigned long __cmpxchg64(volatile void *ptr,
/* If we failed, loop! */
"\t" __SC_BEQZ "%L1, 1b \n"
" .set pop \n"
- "2: \n"
+ "2: " __SYNC(full, loongson3_war) " \n"
: "=&r"(ret),
"=&r"(tmp),
"=" GCC_OFF_SMALL_ASM() (*(unsigned long long *)ptr)
@@ -268,7 +268,6 @@ static inline unsigned long __cmpxchg64(volatile void *ptr,
"r" (old),
"r" (new)
: "memory");
- loongson_llsc_mb();
local_irq_restore(flags);
return ret;
--
2.23.0
Introduce __bit_op() & __test_bit_op() macros which abstract away the
implementation of LL/SC loops. This cuts down on a lot of duplicate
boilerplate code, and also allows R10000_LLSC_WAR to be handled outside
of the individual bitop functions.
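Conceptually each expansion is the usual LL/SC retry loop; in C-like
pseudocode (a sketch only - the real implementations are the inline asm
blocks below):

    do {
            orig = *mem;				/* LL */
            temp = <insn applied to orig>;		/* or/and/xor/ins/ext */
    } while (!store_conditional(mem, temp));	/* SC + __SC_BEQZ */

__bit_op() discards orig, whilst __test_bit_op() yields it to the caller
so that the test_and_*() functions can extract the old bit value.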
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 267 ++++++++-------------------------
1 file changed, 63 insertions(+), 204 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 7671db2a7b73..fba0a842b98a 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -25,6 +25,41 @@
#include <asm/sgidefs.h>
#include <asm/war.h>
+#define __bit_op(mem, insn, inputs...) do { \
+ unsigned long temp; \
+ \
+ asm volatile( \
+ " .set push \n" \
+ " .set " MIPS_ISA_LEVEL " \n" \
+ "1: " __LL "%0, %1 \n" \
+ " " insn " \n" \
+ " " __SC "%0, %1 \n" \
+ " " __SC_BEQZ "%0, 1b \n" \
+ " .set pop \n" \
+ : "=&r"(temp), "+" GCC_OFF_SMALL_ASM()(mem) \
+ : inputs \
+ : __LLSC_CLOBBER); \
+} while (0)
+
+#define __test_bit_op(mem, ll_dst, insn, inputs...) ({ \
+ unsigned long orig, temp; \
+ \
+ asm volatile( \
+ " .set push \n" \
+ " .set " MIPS_ISA_LEVEL " \n" \
+ "1: " __LL ll_dst ", %2 \n" \
+ " " insn " \n" \
+ " " __SC "%1, %2 \n" \
+ " " __SC_BEQZ "%1, 1b \n" \
+ " .set pop \n" \
+ : "=&r"(orig), "=&r"(temp), \
+ "+" GCC_OFF_SMALL_ASM()(mem) \
+ : inputs \
+ : __LLSC_CLOBBER); \
+ \
+ orig; \
+})
+
/*
* These are the "slower" versions of the functions and are in bitops.c.
* These functions call raw_local_irq_{save,restore}().
@@ -54,55 +89,20 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
{
unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
- unsigned long temp;
if (!kernel_uses_llsc) {
__mips_set_bit(nr, addr);
return;
}
- if (R10000_LLSC_WAR) {
- __asm__ __volatile__(
- " .set push \n"
- " .set arch=r4000 \n"
- "1: " __LL "%0, %1 # set_bit \n"
- " or %0, %2 \n"
- " " __SC "%0, %1 \n"
- " beqzl %0, 1b \n"
- " .set pop \n"
- : "=&r" (temp), "=" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (BIT(bit)), GCC_OFF_SMALL_ASM() (*m)
- : __LLSC_CLOBBER);
- return;
- }
-
if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit) && (bit >= 16)) {
loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " " __LL "%0, %1 # set_bit \n"
- " " __INS "%0, %3, %2, 1 \n"
- " " __SC "%0, %1 \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "i" (bit), "r" (~0)
- : __LLSC_CLOBBER);
- } while (unlikely(!temp));
+ __bit_op(*m, __INS "%0, %3, %2, 1", "i"(bit), "r"(~0));
return;
}
loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_ARCH_LEVEL" \n"
- " " __LL "%0, %1 # set_bit \n"
- " or %0, %2 \n"
- " " __SC "%0, %1 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (BIT(bit))
- : __LLSC_CLOBBER);
- } while (unlikely(!temp));
+ __bit_op(*m, "or\t%0, %2", "ir"(BIT(bit)));
}
/*
@@ -119,55 +119,20 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
{
unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
- unsigned long temp;
if (!kernel_uses_llsc) {
__mips_clear_bit(nr, addr);
return;
}
- if (R10000_LLSC_WAR) {
- __asm__ __volatile__(
- " .set push \n"
- " .set arch=r4000 \n"
- "1: " __LL "%0, %1 # clear_bit \n"
- " and %0, %2 \n"
- " " __SC "%0, %1 \n"
- " beqzl %0, 1b \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (~(BIT(bit)))
- : __LLSC_CLOBBER);
- return;
- }
-
if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit)) {
loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " " __LL "%0, %1 # clear_bit \n"
- " " __INS "%0, $0, %2, 1 \n"
- " " __SC "%0, %1 \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "i" (bit)
- : __LLSC_CLOBBER);
- } while (unlikely(!temp));
+ __bit_op(*m, __INS "%0, $0, %2, 1", "i"(bit));
return;
}
loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_ARCH_LEVEL" \n"
- " " __LL "%0, %1 # clear_bit \n"
- " and %0, %2 \n"
- " " __SC "%0, %1 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (~(BIT(bit)))
- : __LLSC_CLOBBER);
- } while (unlikely(!temp));
+ __bit_op(*m, "and\t%0, %2", "ir"(~BIT(bit)));
}
/*
@@ -197,41 +162,14 @@ static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
{
unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
- unsigned long temp;
if (!kernel_uses_llsc) {
__mips_change_bit(nr, addr);
return;
}
- if (R10000_LLSC_WAR) {
- __asm__ __volatile__(
- " .set push \n"
- " .set arch=r4000 \n"
- "1: " __LL "%0, %1 # change_bit \n"
- " xor %0, %2 \n"
- " " __SC "%0, %1 \n"
- " beqzl %0, 1b \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (BIT(bit))
- : __LLSC_CLOBBER);
- return;
- }
-
loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_ARCH_LEVEL" \n"
- " " __LL "%0, %1 # change_bit \n"
- " xor %0, %2 \n"
- " " __SC "%0, %1 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m)
- : "ir" (BIT(bit))
- : __LLSC_CLOBBER);
- } while (unlikely(!temp));
+ __bit_op(*m, "xor\t%0, %2", "ir"(BIT(bit)));
}
/*
@@ -247,41 +185,16 @@ static inline int test_and_set_bit_lock(unsigned long nr,
{
unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
- unsigned long res, temp;
+ unsigned long res, orig;
if (!kernel_uses_llsc) {
res = __mips_test_and_set_bit_lock(nr, addr);
- } else if (R10000_LLSC_WAR) {
- __asm__ __volatile__(
- " .set push \n"
- " .set arch=r4000 \n"
- "1: " __LL "%0, %1 # test_and_set_bit \n"
- " or %2, %0, %3 \n"
- " " __SC "%2, %1 \n"
- " beqzl %2, 1b \n"
- " and %2, %0, %3 \n"
- " .set pop \n"
- : "=&r" (temp), "+m" (*m), "=&r" (res)
- : "ir" (BIT(bit))
- : __LLSC_CLOBBER);
-
- res = res != 0;
} else {
loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_ARCH_LEVEL" \n"
- " " __LL "%0, %1 # test_and_set_bit \n"
- " or %2, %0, %3 \n"
- " " __SC "%2, %1 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (BIT(bit))
- : __LLSC_CLOBBER);
- } while (unlikely(!res));
-
- res = (temp & BIT(bit)) != 0;
+ orig = __test_bit_op(*m, "%0",
+ "or\t%1, %0, %3",
+ "ir"(BIT(bit)));
+ res = (orig & BIT(bit)) != 0;
}
smp_llsc_mb();
@@ -317,57 +230,25 @@ static inline int test_and_clear_bit(unsigned long nr,
{
unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
- unsigned long res, temp;
+ unsigned long res, orig;
smp_mb__before_llsc();
if (!kernel_uses_llsc) {
res = __mips_test_and_clear_bit(nr, addr);
- } else if (R10000_LLSC_WAR) {
- __asm__ __volatile__(
- " .set push \n"
- " .set arch=r4000 \n"
- "1: " __LL "%0, %1 # test_and_clear_bit \n"
- " or %2, %0, %3 \n"
- " xor %2, %3 \n"
- " " __SC "%2, %1 \n"
- " beqzl %2, 1b \n"
- " and %2, %0, %3 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (BIT(bit))
- : __LLSC_CLOBBER);
-
- res = res != 0;
} else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) {
loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " " __LL "%0, %1 # test_and_clear_bit \n"
- " " __EXT "%2, %0, %3, 1 \n"
- " " __INS "%0, $0, %3, 1 \n"
- " " __SC "%0, %1 \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "i" (bit)
- : __LLSC_CLOBBER);
- } while (unlikely(!temp));
+ res = __test_bit_op(*m, "%1",
+ __EXT "%0, %1, %3, 1;"
+ __INS "%1, $0, %3, 1",
+ "i"(bit));
} else {
loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_ARCH_LEVEL" \n"
- " " __LL "%0, %1 # test_and_clear_bit \n"
- " or %2, %0, %3 \n"
- " xor %2, %3 \n"
- " " __SC "%2, %1 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (BIT(bit))
- : __LLSC_CLOBBER);
- } while (unlikely(!res));
-
- res = (temp & BIT(bit)) != 0;
+ orig = __test_bit_op(*m, "%0",
+ "or\t%1, %0, %3;"
+ "xor\t%1, %1, %3",
+ "ir"(BIT(bit)));
+ res = (orig & BIT(bit)) != 0;
}
smp_llsc_mb();
@@ -388,43 +269,18 @@ static inline int test_and_change_bit(unsigned long nr,
{
unsigned long *m = ((unsigned long *)addr) + (nr >> SZLONG_LOG);
int bit = nr & SZLONG_MASK;
- unsigned long res, temp;
+ unsigned long res, orig;
smp_mb__before_llsc();
if (!kernel_uses_llsc) {
res = __mips_test_and_change_bit(nr, addr);
- } else if (R10000_LLSC_WAR) {
- __asm__ __volatile__(
- " .set push \n"
- " .set arch=r4000 \n"
- "1: " __LL "%0, %1 # test_and_change_bit \n"
- " xor %2, %0, %3 \n"
- " " __SC "%2, %1 \n"
- " beqzl %2, 1b \n"
- " and %2, %0, %3 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (BIT(bit))
- : __LLSC_CLOBBER);
-
- res = res != 0;
} else {
loongson_llsc_mb();
- do {
- __asm__ __volatile__(
- " .set push \n"
- " .set "MIPS_ISA_ARCH_LEVEL" \n"
- " " __LL "%0, %1 # test_and_change_bit \n"
- " xor %2, %0, %3 \n"
- " " __SC "\t%2, %1 \n"
- " .set pop \n"
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "ir" (BIT(bit))
- : __LLSC_CLOBBER);
- } while (unlikely(!res));
-
- res = (temp & BIT(bit)) != 0;
+ orig = __test_bit_op(*m, "%0",
+ "xor\t%1, %0, %3",
+ "ir"(BIT(bit)));
+ res = (orig & BIT(bit)) != 0;
}
smp_llsc_mb();
@@ -432,6 +288,9 @@ static inline int test_and_change_bit(unsigned long nr,
return res;
}
+#undef __bit_op
+#undef __test_bit_op
+
#include <asm-generic/bitops/non-atomic.h>
/*
--
2.23.0
Generate the sync instructions required to work around Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C, where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync & ll instructions respectively.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2:
- De-string __WEAK_LLSC_MB to allow its use with __SYNC_ELSE().
arch/mips/include/asm/barrier.h | 13 +++++++------
arch/mips/include/asm/futex.h | 15 +++++++--------
2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index c7e05e832da9..133afd565067 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -95,13 +95,14 @@ static inline void wmb(void)
* ordering will be done by smp_llsc_mb() and friends.
*/
#if defined(CONFIG_WEAK_REORDERING_BEYOND_LLSC) && defined(CONFIG_SMP)
-#define __WEAK_LLSC_MB " sync \n"
-#define smp_llsc_mb() __asm__ __volatile__(__WEAK_LLSC_MB : : :"memory")
-#define __LLSC_CLOBBER
+# define __WEAK_LLSC_MB sync
+# define smp_llsc_mb() \
+ __asm__ __volatile__(__stringify(__WEAK_LLSC_MB) : : :"memory")
+# define __LLSC_CLOBBER
#else
-#define __WEAK_LLSC_MB " \n"
-#define smp_llsc_mb() do { } while (0)
-#define __LLSC_CLOBBER "memory"
+# define __WEAK_LLSC_MB
+# define smp_llsc_mb() do { } while (0)
+# define __LLSC_CLOBBER "memory"
#endif
#ifdef CONFIG_CPU_CAVIUM_OCTEON
diff --git a/arch/mips/include/asm/futex.h b/arch/mips/include/asm/futex.h
index b83b0397462d..54cf20530931 100644
--- a/arch/mips/include/asm/futex.h
+++ b/arch/mips/include/asm/futex.h
@@ -16,6 +16,7 @@
#include <asm/barrier.h>
#include <asm/compiler.h>
#include <asm/errno.h>
+#include <asm/sync.h>
#include <asm/war.h>
#define __futex_atomic_op(insn, ret, oldval, uaddr, oparg) \
@@ -32,7 +33,7 @@
" .set arch=r4000 \n" \
"2: sc $1, %2 \n" \
" beqzl $1, 1b \n" \
- __WEAK_LLSC_MB \
+ __stringify(__WEAK_LLSC_MB) \
"3: \n" \
" .insn \n" \
" .set pop \n" \
@@ -50,19 +51,19 @@
"i" (-EFAULT) \
: "memory"); \
} else if (cpu_has_llsc) { \
- loongson_llsc_mb(); \
__asm__ __volatile__( \
" .set push \n" \
" .set noat \n" \
" .set push \n" \
" .set "MIPS_ISA_ARCH_LEVEL" \n" \
+ " " __SYNC(full, loongson3_war) " \n" \
"1: "user_ll("%1", "%4")" # __futex_atomic_op\n" \
" .set pop \n" \
" " insn " \n" \
" .set "MIPS_ISA_ARCH_LEVEL" \n" \
"2: "user_sc("$1", "%2")" \n" \
" beqz $1, 1b \n" \
- __WEAK_LLSC_MB \
+ __stringify(__WEAK_LLSC_MB) \
"3: \n" \
" .insn \n" \
" .set pop \n" \
@@ -147,7 +148,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
" .set arch=r4000 \n"
"2: sc $1, %2 \n"
" beqzl $1, 1b \n"
- __WEAK_LLSC_MB
+ __stringify(__WEAK_LLSC_MB)
"3: \n"
" .insn \n"
" .set pop \n"
@@ -164,13 +165,13 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
"i" (-EFAULT)
: "memory");
} else if (cpu_has_llsc) {
- loongson_llsc_mb();
__asm__ __volatile__(
"# futex_atomic_cmpxchg_inatomic \n"
" .set push \n"
" .set noat \n"
" .set push \n"
" .set "MIPS_ISA_ARCH_LEVEL" \n"
+ " " __SYNC(full, loongson3_war) " \n"
"1: "user_ll("%1", "%3")" \n"
" bne %1, %z4, 3f \n"
" .set pop \n"
@@ -178,8 +179,7 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
" .set "MIPS_ISA_ARCH_LEVEL" \n"
"2: "user_sc("$1", "%2")" \n"
" beqz $1, 1b \n"
- __WEAK_LLSC_MB
- "3: \n"
+ "3: " __SYNC_ELSE(full, loongson3_war, __WEAK_LLSC_MB) "\n"
" .insn \n"
" .set pop \n"
" .section .fixup,\"ax\" \n"
@@ -194,7 +194,6 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
: GCC_OFF_SMALL_ASM() (*uaddr), "Jr" (oldval), "Jr" (newval),
"i" (-EFAULT)
: "memory");
- loongson_llsc_mb();
} else
return -ENOSYS;
--
2.23.0
Generate the sync instructions required to work around Loongson3 LL/SC
errata within inline asm blocks, which feels a little safer than doing
it from C, where strictly speaking the compiler would be well within its
rights to insert a memory access between the separate asm statements we
previously had, containing sync & ll instructions respectively.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/kernel/syscall.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/mips/kernel/syscall.c b/arch/mips/kernel/syscall.c
index b0e25e913bdb..3ea288ca35f1 100644
--- a/arch/mips/kernel/syscall.c
+++ b/arch/mips/kernel/syscall.c
@@ -37,6 +37,7 @@
#include <asm/signal.h>
#include <asm/sim.h>
#include <asm/shmparam.h>
+#include <asm/sync.h>
#include <asm/sysmips.h>
#include <asm/switch_to.h>
@@ -132,12 +133,12 @@ static inline int mips_atomic_set(unsigned long addr, unsigned long new)
[efault] "i" (-EFAULT)
: "memory");
} else if (cpu_has_llsc) {
- loongson_llsc_mb();
__asm__ __volatile__ (
" .set push \n"
" .set "MIPS_ISA_ARCH_LEVEL" \n"
" li %[err], 0 \n"
"1: \n"
+ " " __SYNC(full, loongson3_war) " \n"
user_ll("%[old]", "(%[addr])")
" move %[tmp], %[new] \n"
"2: \n"
--
2.23.0
Loongson3 systems with CONFIG_CPU_LOONGSON3_WORKAROUNDS enabled already
emit a full completion barrier as part of the inline assembly containing
LL/SC loops for atomic operations. As such the barrier emitted by
__smp_mb__before_atomic() is redundant, and we can remove it.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/barrier.h | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 6d92d5ccdafa..49ff172a72b9 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -119,7 +119,17 @@ static inline void wmb(void)
#define nudge_writes() mb()
#endif
-#define __smp_mb__before_atomic() __smp_mb__before_llsc()
+/*
+ * In the Loongson3 LL/SC workaround case, all of our LL/SC loops already have
+ * a completion barrier immediately preceding the LL instruction. Therefore we
+ * can skip emitting a barrier from __smp_mb__before_atomic().
+ */
+#ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS
+# define __smp_mb__before_atomic()
+#else
+# define __smp_mb__before_atomic() __smp_mb__before_llsc()
+#endif
+
#define __smp_mb__after_atomic() smp_llsc_mb()
static inline void sync_ginv(void)
--
2.23.0
In ejtag_debug_handler() we must reload the address of
ejtag_debug_buffer_spinlock if an sc fails, since the address in k0 will
have been clobbered by the result of the sc instruction. In the case
where we load a non-zero value (ie. there's contention for the lock) the
address is not clobbered, because the value stored to take the lock is
the address of ejtag_debug_buffer_spinlock itself, so we can simply
branch back to repeat the load from memory without reloading the address
into k0.
The primary motivation for this change is that it moves the target of
the bnez instruction to an instruction within the LL/SC loop (the LL
itself), which we know contains no other memory accesses & therefore
isn't affected by Loongson3 LL/SC errata.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/kernel/genex.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
index ac4f2b835165..60ede6b75a3b 100644
--- a/arch/mips/kernel/genex.S
+++ b/arch/mips/kernel/genex.S
@@ -355,8 +355,8 @@ NESTED(ejtag_debug_handler, PT_SIZE, sp)
#ifdef CONFIG_SMP
1: PTR_LA k0, ejtag_debug_buffer_spinlock
__SYNC(full, loongson3_war)
- ll k0, 0(k0)
- bnez k0, 1b
+2: ll k0, 0(k0)
+ bnez k0, 2b
PTR_LA k0, ejtag_debug_buffer_spinlock
sc k0, 0(k0)
beqz k0, 1b
--
2.23.0
In ejtag_debug_handler we use LL & SC instructions to acquire & release
an open-coded spinlock. For Loongson3 systems affected by LL/SC errata
this requires that we insert a sync instruction prior to the LL in order
to ensure correct behavior of the LL/SC loop.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/kernel/genex.S | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
index efde27c99414..ac4f2b835165 100644
--- a/arch/mips/kernel/genex.S
+++ b/arch/mips/kernel/genex.S
@@ -18,6 +18,7 @@
#include <asm/fpregdef.h>
#include <asm/mipsregs.h>
#include <asm/stackframe.h>
+#include <asm/sync.h>
#include <asm/war.h>
#include <asm/thread_info.h>
@@ -353,6 +354,7 @@ NESTED(ejtag_debug_handler, PT_SIZE, sp)
#ifdef CONFIG_SMP
1: PTR_LA k0, ejtag_debug_buffer_spinlock
+ __SYNC(full, loongson3_war)
ll k0, 0(k0)
bnez k0, 1b
PTR_LA k0, ejtag_debug_buffer_spinlock
--
2.23.0
The logical operations or & xor used in the test_and_set_bit_lock(),
test_and_clear_bit() & test_and_change_bit() functions currently force
the value 1<<bit to be placed in a register. If the bit is a compile-time
constant & fits within the immediate field of an or/xor instruction (ie.
16 bits) then we can make use of the ori/xori instruction variants &
avoid the use of an extra register. Add the extra "i" constraints in
order to allow use of these immediate encodings.
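For example, with nr compile-time constant such that 1UL << bit fits in
16 bits, the compiler may now select the immediate encoding (illustrative
assembly, not actual generated output; register names are arbitrary):

    before:	li	t1, 0x100	# materialize 1UL << 8
    		or	t2, t0, t1
    after:	ori	t2, t0, 0x100

saving both a register & an instruction within the LL/SC loop.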
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index ea35a2e87b6d..7314ba5a3683 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -261,7 +261,7 @@ static inline int test_and_set_bit_lock(unsigned long nr,
" and %2, %0, %3 \n"
" .set pop \n"
: "=&r" (temp), "+m" (*m), "=&r" (res)
- : "r" (1UL << bit)
+ : "ir" (1UL << bit)
: __LLSC_CLOBBER);
} else {
loongson_llsc_mb();
@@ -274,7 +274,7 @@ static inline int test_and_set_bit_lock(unsigned long nr,
" " __SC "%2, %1 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "r" (1UL << bit)
+ : "ir" (1UL << bit)
: __LLSC_CLOBBER);
} while (unlikely(!res));
@@ -332,7 +332,7 @@ static inline int test_and_clear_bit(unsigned long nr,
" and %2, %0, %3 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "r" (1UL << bit)
+ : "ir" (1UL << bit)
: __LLSC_CLOBBER);
} else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) {
loongson_llsc_mb();
@@ -358,7 +358,7 @@ static inline int test_and_clear_bit(unsigned long nr,
" " __SC "%2, %1 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "r" (1UL << bit)
+ : "ir" (1UL << bit)
: __LLSC_CLOBBER);
} while (unlikely(!res));
@@ -400,7 +400,7 @@ static inline int test_and_change_bit(unsigned long nr,
" and %2, %0, %3 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "r" (1UL << bit)
+ : "ir" (1UL << bit)
: __LLSC_CLOBBER);
} else {
loongson_llsc_mb();
@@ -413,7 +413,7 @@ static inline int test_and_change_bit(unsigned long nr,
" " __SC "\t%2, %1 \n"
" .set pop \n"
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
- : "r" (1UL << bit)
+ : "ir" (1UL << bit)
: __LLSC_CLOBBER);
} while (unlikely(!res));
--
2.23.0
When Loongson3 LL/SC errata workarounds are enabled (ie.
CONFIG_CPU_LOONGSON3_WORKAROUNDS=y) run a tool to scan through the
compiled kernel & ensure that the workaround is applied correctly. That
is, ensure that:
- Every LL or LLD instruction is preceded by a sync instruction.
- Any branches from within an LL/SC loop to outside of that loop
target a sync instruction.
Reasoning for these conditions can be found by reading the comment above
the definition of __SYNC_loongson3_war in arch/mips/include/asm/sync.h.
This tool will help ensure that we don't inadvertently introduce code
paths that miss the required workarounds.
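For example, the tool should accept a sequence like the following (a
hand-written sketch; registers & values are illustrative), where the only
branch leaving the LL/SC loop targets a sync:

    	sync
    1:	ll	t0, 0(a0)
    	bne	t0, a1, 2f
    	move	t1, a2
    	sc	t1, 0(a0)
    	beqz	t1, 1b
    2:	sync

and should reject the same sequence with either sync removed.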
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2:
- Only try to build loongson3-llsc-check from arch/mips/Makefile when
CONFIG_CPU_LOONGSON3_WORKAROUNDS is enabled.
arch/mips/Makefile | 3 +
arch/mips/Makefile.postlink | 10 +-
arch/mips/tools/.gitignore | 1 +
arch/mips/tools/Makefile | 5 +
arch/mips/tools/loongson3-llsc-check.c | 307 +++++++++++++++++++++++++
5 files changed, 325 insertions(+), 1 deletion(-)
create mode 100644 arch/mips/tools/loongson3-llsc-check.c
diff --git a/arch/mips/Makefile b/arch/mips/Makefile
index cdc09b71febe..0a5eab626260 100644
--- a/arch/mips/Makefile
+++ b/arch/mips/Makefile
@@ -14,6 +14,9 @@
archscripts: scripts_basic
$(Q)$(MAKE) $(build)=arch/mips/tools elf-entry
+ifeq ($(CONFIG_CPU_LOONGSON3_WORKAROUNDS),y)
+ $(Q)$(MAKE) $(build)=arch/mips/tools loongson3-llsc-check
+endif
$(Q)$(MAKE) $(build)=arch/mips/boot/tools relocs
KBUILD_DEFCONFIG := 32r2el_defconfig
diff --git a/arch/mips/Makefile.postlink b/arch/mips/Makefile.postlink
index 4eea4188cb20..f03fdc95143e 100644
--- a/arch/mips/Makefile.postlink
+++ b/arch/mips/Makefile.postlink
@@ -3,7 +3,8 @@
# Post-link MIPS pass
# ===========================================================================
#
-# 1. Insert relocations into vmlinux
+# 1. Check that Loongson3 LL/SC workarounds are applied correctly
+# 2. Insert relocations into vmlinux
PHONY := __archpost
__archpost:
@@ -11,6 +12,10 @@ __archpost:
-include include/config/auto.conf
include scripts/Kbuild.include
+CMD_LS3_LLSC = arch/mips/tools/loongson3-llsc-check
+quiet_cmd_ls3_llsc = LLSCCHK $@
+ cmd_ls3_llsc = $(CMD_LS3_LLSC) $@
+
CMD_RELOCS = arch/mips/boot/tools/relocs
quiet_cmd_relocs = RELOCS $@
cmd_relocs = $(CMD_RELOCS) $@
@@ -19,6 +24,9 @@ quiet_cmd_relocs = RELOCS $@
vmlinux: FORCE
@true
+ifeq ($(CONFIG_CPU_LOONGSON3_WORKAROUNDS),y)
+ $(call if_changed,ls3_llsc)
+endif
ifeq ($(CONFIG_RELOCATABLE),y)
$(call if_changed,relocs)
endif
diff --git a/arch/mips/tools/.gitignore b/arch/mips/tools/.gitignore
index 56d34ccccce4..b0209450d9ff 100644
--- a/arch/mips/tools/.gitignore
+++ b/arch/mips/tools/.gitignore
@@ -1 +1,2 @@
elf-entry
+loongson3-llsc-check
diff --git a/arch/mips/tools/Makefile b/arch/mips/tools/Makefile
index 3baee4bc6775..aaef688749f5 100644
--- a/arch/mips/tools/Makefile
+++ b/arch/mips/tools/Makefile
@@ -3,3 +3,8 @@ hostprogs-y := elf-entry
PHONY += elf-entry
elf-entry: $(obj)/elf-entry
@:
+
+hostprogs-$(CONFIG_CPU_LOONGSON3_WORKAROUNDS) += loongson3-llsc-check
+PHONY += loongson3-llsc-check
+loongson3-llsc-check: $(obj)/loongson3-llsc-check
+ @:
diff --git a/arch/mips/tools/loongson3-llsc-check.c b/arch/mips/tools/loongson3-llsc-check.c
new file mode 100644
index 000000000000..0ebddd0ae46f
--- /dev/null
+++ b/arch/mips/tools/loongson3-llsc-check.c
@@ -0,0 +1,307 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <byteswap.h>
+#include <elf.h>
+#include <endian.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <unistd.h>
+
+#ifdef be32toh
+/* If libc provides le{16,32,64}toh() then we'll use them */
+#elif BYTE_ORDER == LITTLE_ENDIAN
+# define le16toh(x) (x)
+# define le32toh(x) (x)
+# define le64toh(x) (x)
+#elif BYTE_ORDER == BIG_ENDIAN
+# define le16toh(x) bswap_16(x)
+# define le32toh(x) bswap_32(x)
+# define le64toh(x) bswap_64(x)
+#endif
+
+/* MIPS opcodes, in bits 31:26 of an instruction */
+#define OP_SPECIAL 0x00
+#define OP_REGIMM 0x01
+#define OP_BEQ 0x04
+#define OP_BNE 0x05
+#define OP_BLEZ 0x06
+#define OP_BGTZ 0x07
+#define OP_BEQL 0x14
+#define OP_BNEL 0x15
+#define OP_BLEZL 0x16
+#define OP_BGTZL 0x17
+#define OP_LL 0x30
+#define OP_LLD 0x34
+#define OP_SC 0x38
+#define OP_SCD 0x3c
+
+/* Bits 20:16 of OP_REGIMM instructions */
+#define REGIMM_BLTZ 0x00
+#define REGIMM_BGEZ 0x01
+#define REGIMM_BLTZL 0x02
+#define REGIMM_BGEZL 0x03
+#define REGIMM_BLTZAL 0x10
+#define REGIMM_BGEZAL 0x11
+#define REGIMM_BLTZALL 0x12
+#define REGIMM_BGEZALL 0x13
+
+/* Bits 5:0 of OP_SPECIAL instructions */
+#define SPECIAL_SYNC 0x0f
+
+static void usage(FILE *f)
+{
+ fprintf(f, "Usage: loongson3-llsc-check /path/to/vmlinux\n");
+}
+
+static int se16(uint16_t x)
+{
+ return (int16_t)x;
+}
+
+static bool is_ll(uint32_t insn)
+{
+ switch (insn >> 26) {
+ case OP_LL:
+ case OP_LLD:
+ return true;
+
+ default:
+ return false;
+ }
+}
+
+static bool is_sc(uint32_t insn)
+{
+ switch (insn >> 26) {
+ case OP_SC:
+ case OP_SCD:
+ return true;
+
+ default:
+ return false;
+ }
+}
+
+static bool is_sync(uint32_t insn)
+{
+ /* Bits 31:11 should all be zeroes */
+ if (insn >> 11)
+ return false;
+
+ /* Bits 5:0 specify the SYNC special encoding */
+ if ((insn & 0x3f) != SPECIAL_SYNC)
+ return false;
+
+ return true;
+}
+
+static bool is_branch(uint32_t insn, int *off)
+{
+ switch (insn >> 26) {
+ case OP_BEQ:
+ case OP_BEQL:
+ case OP_BNE:
+ case OP_BNEL:
+ case OP_BGTZ:
+ case OP_BGTZL:
+ case OP_BLEZ:
+ case OP_BLEZL:
+ *off = se16(insn) + 1;
+ return true;
+
+ case OP_REGIMM:
+ switch ((insn >> 16) & 0x1f) {
+ case REGIMM_BGEZ:
+ case REGIMM_BGEZL:
+ case REGIMM_BGEZAL:
+ case REGIMM_BGEZALL:
+ case REGIMM_BLTZ:
+ case REGIMM_BLTZL:
+ case REGIMM_BLTZAL:
+ case REGIMM_BLTZALL:
+ *off = se16(insn) + 1;
+ return true;
+
+ default:
+ return false;
+ }
+
+ default:
+ return false;
+ }
+}
+
+static int check_ll(uint64_t pc, uint32_t *code, size_t sz)
+{
+ ssize_t i, max, sc_pos;
+ int off;
+
+ /*
+ * Every LL must be preceded by a sync instruction in order to ensure
+ * that instruction reordering doesn't allow a prior memory access to
+ * execute after the LL & cause erroneous results.
+ */
+ if (!is_sync(le32toh(code[-1]))) {
+ fprintf(stderr, "%" PRIx64 ": LL not preceded by sync\n", pc);
+ return -EINVAL;
+ }
+
+ /* Find the matching SC instruction */
+ max = sz / 4;
+ for (sc_pos = 0; sc_pos < max; sc_pos++) {
+ if (is_sc(le32toh(code[sc_pos])))
+ break;
+ }
+ if (sc_pos >= max) {
+ fprintf(stderr, "%" PRIx64 ": LL has no matching SC\n", pc);
+ return -EINVAL;
+ }
+
+ /*
+ * Check branches within the LL/SC loop target sync instructions,
+ * ensuring that speculative execution can't generate memory accesses
+ * due to instructions outside of the loop.
+ */
+ for (i = 0; i < sc_pos; i++) {
+ if (!is_branch(le32toh(code[i]), &off))
+ continue;
+
+ /*
+ * If the branch target is within the LL/SC loop then we don't
+ * need to worry about it.
+ */
+ if ((off >= -i) && (off <= sc_pos))
+ continue;
+
+ /* If the branch targets a sync instruction we're all good... */
+ if (is_sync(le32toh(code[i + off])))
+ continue;
+
+ /* ...but if not, we have a problem */
+ fprintf(stderr, "%" PRIx64 ": Branch target not a sync\n",
+ pc + (i * 4));
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int check_code(uint64_t pc, uint32_t *code, size_t sz)
+{
+ int err = 0;
+
+ if (sz % 4) {
+ fprintf(stderr, "%" PRIx64 ": Section size not a multiple of 4\n",
+ pc);
+ err = -EINVAL;
+ sz -= (sz % 4);
+ }
+
+ if (is_ll(le32toh(code[0]))) {
+ fprintf(stderr, "%" PRIx64 ": First instruction in section is an LL\n",
+ pc);
+ err = -EINVAL;
+ }
+
+#define advance() ( \
+ code++, \
+ pc += 4, \
+ sz -= 4 \
+)
+
+ /*
+ * Skip the first instruction, allowing check_ll to look backwards
+ * unconditionally.
+ */
+ advance();
+
+ /* Now scan through the code looking for LL instructions */
+ for (; sz; advance()) {
+ if (is_ll(le32toh(code[0])))
+ err |= check_ll(pc, code, sz);
+ }
+
+ return err;
+}
+
+int main(int argc, char *argv[])
+{
+ int vmlinux_fd, status, err, i;
+ const char *vmlinux_path;
+ struct stat st;
+ Elf64_Ehdr *eh;
+ Elf64_Shdr *sh;
+ void *vmlinux;
+
+ status = EXIT_FAILURE;
+
+ if (argc < 2) {
+ usage(stderr);
+ goto out_ret;
+ }
+
+ vmlinux_path = argv[1];
+ vmlinux_fd = open(vmlinux_path, O_RDONLY);
+ if (vmlinux_fd == -1) {
+ perror("Unable to open vmlinux");
+ goto out_ret;
+ }
+
+ err = fstat(vmlinux_fd, &st);
+ if (err) {
+ perror("Unable to stat vmlinux");
+ goto out_close;
+ }
+
+ vmlinux = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, vmlinux_fd, 0);
+ if (vmlinux == MAP_FAILED) {
+ perror("Unable to mmap vmlinux");
+ goto out_close;
+ }
+
+ eh = vmlinux;
+ if (memcmp(eh->e_ident, ELFMAG, SELFMAG)) {
+ fprintf(stderr, "vmlinux is not an ELF?\n");
+ goto out_munmap;
+ }
+
+ if (eh->e_ident[EI_CLASS] != ELFCLASS64) {
+ fprintf(stderr, "vmlinux is not 64b?\n");
+ goto out_munmap;
+ }
+
+ if (eh->e_ident[EI_DATA] != ELFDATA2LSB) {
+ fprintf(stderr, "vmlinux is not little endian?\n");
+ goto out_munmap;
+ }
+
+ for (i = 0; i < le16toh(eh->e_shnum); i++) {
+ sh = vmlinux + le64toh(eh->e_shoff) + (i * le16toh(eh->e_shentsize));
+
+ if (sh->sh_type != SHT_PROGBITS)
+ continue;
+ if (!(sh->sh_flags & SHF_EXECINSTR))
+ continue;
+
+ err = check_code(le64toh(sh->sh_addr),
+ vmlinux + le64toh(sh->sh_offset),
+ le64toh(sh->sh_size));
+ if (err)
+ goto out_munmap;
+ }
+
+ status = EXIT_SUCCESS;
+out_munmap:
+ munmap(vmlinux, st.st_size);
+out_close:
+ close(vmlinux_fd);
+out_ret:
+ return status;
+}
--
2.23.0
The loongson_llsc_mb() macro is no longer used; instead, barriers are
emitted as part of inline asm using the __SYNC() macro. Remove the
now-defunct loongson_llsc_mb() macro.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/barrier.h | 40 ---------------------------------
arch/mips/loongson64/Platform | 2 +-
2 files changed, 1 insertion(+), 41 deletions(-)
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 133afd565067..6d92d5ccdafa 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -122,46 +122,6 @@ static inline void wmb(void)
#define __smp_mb__before_atomic() __smp_mb__before_llsc()
#define __smp_mb__after_atomic() smp_llsc_mb()
-/*
- * Some Loongson 3 CPUs have a bug wherein execution of a memory access (load,
- * store or prefetch) in between an LL & SC can cause the SC instruction to
- * erroneously succeed, breaking atomicity. Whilst it's unusual to write code
- * containing such sequences, this bug bites harder than we might otherwise
- * expect due to reordering & speculation:
- *
- * 1) A memory access appearing prior to the LL in program order may actually
- * be executed after the LL - this is the reordering case.
- *
- * In order to avoid this we need to place a memory barrier (ie. a SYNC
- * instruction) prior to every LL instruction, in between it and any earlier
- * memory access instructions.
- *
- * This reordering case is fixed by 3A R2 CPUs, ie. 3A2000 models and later.
- *
- * 2) If a conditional branch exists between an LL & SC with a target outside
- * of the LL-SC loop, for example an exit upon value mismatch in cmpxchg()
- * or similar, then misprediction of the branch may allow speculative
- * execution of memory accesses from outside of the LL-SC loop.
- *
- * In order to avoid this we need a memory barrier (ie. a SYNC instruction)
- * at each affected branch target, for which we also use loongson_llsc_mb()
- * defined below.
- *
- * This case affects all current Loongson 3 CPUs.
- *
- * The above described cases cause an error in the cache coherence protocol;
- * such that the Invalidate of a competing LL-SC goes 'missing' and SC
- * erroneously observes its core still has Exclusive state and lets the SC
- * proceed.
- *
- * Therefore the error only occurs on SMP systems.
- */
-#ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS /* Loongson-3's LLSC workaround */
-#define loongson_llsc_mb() __asm__ __volatile__("sync" : : :"memory")
-#else
-#define loongson_llsc_mb() do { } while (0)
-#endif
-
static inline void sync_ginv(void)
{
asm volatile(__SYNC(ginv, always));
diff --git a/arch/mips/loongson64/Platform b/arch/mips/loongson64/Platform
index c1a4d4dc4665..28172500f95a 100644
--- a/arch/mips/loongson64/Platform
+++ b/arch/mips/loongson64/Platform
@@ -27,7 +27,7 @@ cflags-$(CONFIG_CPU_LOONGSON3) += -Wa,--trap
#
# Some versions of binutils, not currently mainline as of 2019/02/04, support
# an -mfix-loongson3-llsc flag which emits a sync prior to each ll instruction
-# to work around a CPU bug (see loongson_llsc_mb() in asm/barrier.h for a
+# to work around a CPU bug (see __SYNC_loongson3_war in asm/sync.h for a
# description).
#
# We disable this in order to prevent the assembler meddling with the
--
2.23.0
Remove the remaining duplication between 32b & 64b in asm/atomic.h by
making use of an ATOMIC_OPS() macro to generate:
- atomic_read()/atomic64_read()
- atomic_set()/atomic64_set()
- atomic_cmpxchg()/atomic64_cmpxchg()
- atomic_xchg()/atomic64_xchg()
This is consistent with the way all other functions in asm/atomic.h are
generated, and ensures consistency between the 32b & 64b functions.
Note that this turns the above into static inline functions rather than
macros.
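For instance ATOMIC_OPS(atomic64, s64) generates (prototypes only, for
illustration):

    static __always_inline s64 atomic64_read(const atomic64_t *v);
    static __always_inline void atomic64_set(atomic64_t *v, s64 i);
    static __always_inline s64 atomic64_cmpxchg(atomic64_t *v, s64 o, s64 n);
    static __always_inline s64 atomic64_xchg(atomic64_t *v, s64 n);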
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/atomic.h | 70 +++++++++++++---------------------
1 file changed, 27 insertions(+), 43 deletions(-)
diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 96ef50fa2817..e5ac88392d1f 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -24,24 +24,34 @@
#include <asm/sync.h>
#include <asm/war.h>
-#define ATOMIC_INIT(i) { (i) }
+#define ATOMIC_OPS(pfx, type) \
+static __always_inline type pfx##_read(const pfx##_t *v) \
+{ \
+ return READ_ONCE(v->counter); \
+} \
+ \
+static __always_inline void pfx##_set(pfx##_t *v, type i) \
+{ \
+ WRITE_ONCE(v->counter, i); \
+} \
+ \
+static __always_inline type pfx##_cmpxchg(pfx##_t *v, type o, type n) \
+{ \
+ return cmpxchg(&v->counter, o, n); \
+} \
+ \
+static __always_inline type pfx##_xchg(pfx##_t *v, type n) \
+{ \
+ return xchg(&v->counter, n); \
+}
-/*
- * atomic_read - read atomic variable
- * @v: pointer of type atomic_t
- *
- * Atomically reads the value of @v.
- */
-#define atomic_read(v) READ_ONCE((v)->counter)
+#define ATOMIC_INIT(i) { (i) }
+ATOMIC_OPS(atomic, int)
-/*
- * atomic_set - set atomic variable
- * @v: pointer of type atomic_t
- * @i: required value
- *
- * Atomically sets the value of @v to @i.
- */
-#define atomic_set(v, i) WRITE_ONCE((v)->counter, (i))
+#ifdef CONFIG_64BIT
+# define ATOMIC64_INIT(i) { (i) }
+ATOMIC_OPS(atomic64, s64)
+#endif
#define ATOMIC_OP(pfx, op, type, c_op, asm_op, ll, sc) \
static __inline__ void pfx##_##op(type i, pfx##_t * v) \
@@ -135,6 +145,7 @@ static __inline__ type pfx##_fetch_##op##_relaxed(type i, pfx##_t * v) \
return result; \
}
+#undef ATOMIC_OPS
#define ATOMIC_OPS(pfx, op, type, c_op, asm_op, ll, sc) \
ATOMIC_OP(pfx, op, type, c_op, asm_op, ll, sc) \
ATOMIC_OP_RETURN(pfx, op, type, c_op, asm_op, ll, sc) \
@@ -254,31 +265,4 @@ ATOMIC_SIP_OP(atomic64, s64, dsubu, lld, scd)
#undef ATOMIC_SIP_OP
-#define atomic_cmpxchg(v, o, n) (cmpxchg(&((v)->counter), (o), (n)))
-#define atomic_xchg(v, new) (xchg(&((v)->counter), (new)))
-
-#ifdef CONFIG_64BIT
-
-#define ATOMIC64_INIT(i) { (i) }
-
-/*
- * atomic64_read - read atomic variable
- * @v: pointer of type atomic64_t
- *
- */
-#define atomic64_read(v) READ_ONCE((v)->counter)
-
-/*
- * atomic64_set - set atomic variable
- * @v: pointer of type atomic64_t
- * @i: required value
- */
-#define atomic64_set(v, i) WRITE_ONCE((v)->counter, (i))
-
-#define atomic64_cmpxchg(v, o, n) \
- ((__typeof__((v)->counter))cmpxchg(&((v)->counter), (o), (n)))
-#define atomic64_xchg(v, new) (xchg(&((v)->counter), (new)))
-
-#endif /* CONFIG_64BIT */
-
#endif /* _ASM_ATOMIC_H */
--
2.23.0
Several macros in asm/atomic.h end each line with space characters
before the backslash used for line continuation. Remove the space
characters, leaving tabs as the only whitespace, for conformity with
coding convention.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/atomic.h | 184 ++++++++++++++++-----------------
1 file changed, 92 insertions(+), 92 deletions(-)
diff --git a/arch/mips/include/asm/atomic.h b/arch/mips/include/asm/atomic.h
index 7578c807ef98..2d2a8a74c51b 100644
--- a/arch/mips/include/asm/atomic.h
+++ b/arch/mips/include/asm/atomic.h
@@ -42,102 +42,102 @@
*/
#define atomic_set(v, i) WRITE_ONCE((v)->counter, (i))
-#define ATOMIC_OP(op, c_op, asm_op) \
-static __inline__ void atomic_##op(int i, atomic_t * v) \
-{ \
- if (kernel_uses_llsc) { \
- int temp; \
- \
- loongson_llsc_mb(); \
- __asm__ __volatile__( \
- " .set push \n" \
- " .set "MIPS_ISA_LEVEL" \n" \
- "1: ll %0, %1 # atomic_" #op " \n" \
- " " #asm_op " %0, %2 \n" \
- " sc %0, %1 \n" \
- "\t" __SC_BEQZ "%0, 1b \n" \
- " .set pop \n" \
- : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter) \
- : "Ir" (i) : __LLSC_CLOBBER); \
- } else { \
- unsigned long flags; \
- \
- raw_local_irq_save(flags); \
- v->counter c_op i; \
- raw_local_irq_restore(flags); \
- } \
+#define ATOMIC_OP(op, c_op, asm_op) \
+static __inline__ void atomic_##op(int i, atomic_t * v) \
+{ \
+ if (kernel_uses_llsc) { \
+ int temp; \
+ \
+ loongson_llsc_mb(); \
+ __asm__ __volatile__( \
+ " .set push \n" \
+ " .set "MIPS_ISA_LEVEL" \n" \
+ "1: ll %0, %1 # atomic_" #op " \n" \
+ " " #asm_op " %0, %2 \n" \
+ " sc %0, %1 \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
+ " .set pop \n" \
+ : "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (v->counter) \
+ : "Ir" (i) : __LLSC_CLOBBER); \
+ } else { \
+ unsigned long flags; \
+ \
+ raw_local_irq_save(flags); \
+ v->counter c_op i; \
+ raw_local_irq_restore(flags); \
+ } \
}
-#define ATOMIC_OP_RETURN(op, c_op, asm_op) \
-static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \
-{ \
- int result; \
- \
- if (kernel_uses_llsc) { \
- int temp; \
- \
- loongson_llsc_mb(); \
- __asm__ __volatile__( \
- " .set push \n" \
- " .set "MIPS_ISA_LEVEL" \n" \
- "1: ll %1, %2 # atomic_" #op "_return \n" \
- " " #asm_op " %0, %1, %3 \n" \
- " sc %0, %2 \n" \
- "\t" __SC_BEQZ "%0, 1b \n" \
- " " #asm_op " %0, %1, %3 \n" \
- " .set pop \n" \
- : "=&r" (result), "=&r" (temp), \
- "+" GCC_OFF_SMALL_ASM() (v->counter) \
- : "Ir" (i) : __LLSC_CLOBBER); \
- } else { \
- unsigned long flags; \
- \
- raw_local_irq_save(flags); \
- result = v->counter; \
- result c_op i; \
- v->counter = result; \
- raw_local_irq_restore(flags); \
- } \
- \
- return result; \
+#define ATOMIC_OP_RETURN(op, c_op, asm_op) \
+static __inline__ int atomic_##op##_return_relaxed(int i, atomic_t * v) \
+{ \
+ int result; \
+ \
+ if (kernel_uses_llsc) { \
+ int temp; \
+ \
+ loongson_llsc_mb(); \
+ __asm__ __volatile__( \
+ " .set push \n" \
+ " .set "MIPS_ISA_LEVEL" \n" \
+ "1: ll %1, %2 # atomic_" #op "_return \n" \
+ " " #asm_op " %0, %1, %3 \n" \
+ " sc %0, %2 \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
+ " " #asm_op " %0, %1, %3 \n" \
+ " .set pop \n" \
+ : "=&r" (result), "=&r" (temp), \
+ "+" GCC_OFF_SMALL_ASM() (v->counter) \
+ : "Ir" (i) : __LLSC_CLOBBER); \
+ } else { \
+ unsigned long flags; \
+ \
+ raw_local_irq_save(flags); \
+ result = v->counter; \
+ result c_op i; \
+ v->counter = result; \
+ raw_local_irq_restore(flags); \
+ } \
+ \
+ return result; \
}
-#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
-static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v) \
-{ \
- int result; \
- \
- if (kernel_uses_llsc) { \
- int temp; \
- \
- loongson_llsc_mb(); \
- __asm__ __volatile__( \
- " .set push \n" \
- " .set "MIPS_ISA_LEVEL" \n" \
- "1: ll %1, %2 # atomic_fetch_" #op " \n" \
- " " #asm_op " %0, %1, %3 \n" \
- " sc %0, %2 \n" \
- "\t" __SC_BEQZ "%0, 1b \n" \
- " .set pop \n" \
- " move %0, %1 \n" \
- : "=&r" (result), "=&r" (temp), \
- "+" GCC_OFF_SMALL_ASM() (v->counter) \
- : "Ir" (i) : __LLSC_CLOBBER); \
- } else { \
- unsigned long flags; \
- \
- raw_local_irq_save(flags); \
- result = v->counter; \
- v->counter c_op i; \
- raw_local_irq_restore(flags); \
- } \
- \
- return result; \
+#define ATOMIC_FETCH_OP(op, c_op, asm_op) \
+static __inline__ int atomic_fetch_##op##_relaxed(int i, atomic_t * v) \
+{ \
+ int result; \
+ \
+ if (kernel_uses_llsc) { \
+ int temp; \
+ \
+ loongson_llsc_mb(); \
+ __asm__ __volatile__( \
+ " .set push \n" \
+ " .set "MIPS_ISA_LEVEL" \n" \
+ "1: ll %1, %2 # atomic_fetch_" #op " \n" \
+ " " #asm_op " %0, %1, %3 \n" \
+ " sc %0, %2 \n" \
+ "\t" __SC_BEQZ "%0, 1b \n" \
+ " .set pop \n" \
+ " move %0, %1 \n" \
+ : "=&r" (result), "=&r" (temp), \
+ "+" GCC_OFF_SMALL_ASM() (v->counter) \
+ : "Ir" (i) : __LLSC_CLOBBER); \
+ } else { \
+ unsigned long flags; \
+ \
+ raw_local_irq_save(flags); \
+ result = v->counter; \
+ v->counter c_op i; \
+ raw_local_irq_restore(flags); \
+ } \
+ \
+ return result; \
}
-#define ATOMIC_OPS(op, c_op, asm_op) \
- ATOMIC_OP(op, c_op, asm_op) \
- ATOMIC_OP_RETURN(op, c_op, asm_op) \
+#define ATOMIC_OPS(op, c_op, asm_op) \
+ ATOMIC_OP(op, c_op, asm_op) \
+ ATOMIC_OP_RETURN(op, c_op, asm_op) \
ATOMIC_FETCH_OP(op, c_op, asm_op)
ATOMIC_OPS(add, +=, addu)
@@ -149,8 +149,8 @@ ATOMIC_OPS(sub, -=, subu)
#define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed
#undef ATOMIC_OPS
-#define ATOMIC_OPS(op, c_op, asm_op) \
- ATOMIC_OP(op, c_op, asm_op) \
+#define ATOMIC_OPS(op, c_op, asm_op) \
+ ATOMIC_OP(op, c_op, asm_op) \
ATOMIC_FETCH_OP(op, c_op, asm_op)
ATOMIC_OPS(and, &=, and)
--
2.23.0
Rather than #ifdef on CONFIG_CPU_* to determine whether the ins
instruction is supported, we can simply check MIPS_ISA_REV to discover
whether we're targeting MIPSr2 or higher. Do so in order to clean up the
code.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/bitops.h | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/arch/mips/include/asm/bitops.h b/arch/mips/include/asm/bitops.h
index 1e5739191ddf..0f5329e32e87 100644
--- a/arch/mips/include/asm/bitops.h
+++ b/arch/mips/include/asm/bitops.h
@@ -19,6 +19,7 @@
#include <asm/byteorder.h> /* sigh ... */
#include <asm/compiler.h>
#include <asm/cpu-features.h>
+#include <asm/isa-rev.h>
#include <asm/llsc.h>
#include <asm/sgidefs.h>
#include <asm/war.h>
@@ -76,8 +77,7 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
return;
}
-#if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6)
- if (__builtin_constant_p(bit) && (bit >= 16)) {
+ if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit) && (bit >= 16)) {
loongson_llsc_mb();
do {
__asm__ __volatile__(
@@ -90,7 +90,6 @@ static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
} while (unlikely(!temp));
return;
}
-#endif /* CONFIG_CPU_MIPSR2 || CONFIG_CPU_MIPSR6 */
loongson_llsc_mb();
do {
@@ -143,8 +142,7 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
return;
}
-#if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6)
- if (__builtin_constant_p(bit)) {
+ if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(bit)) {
loongson_llsc_mb();
do {
__asm__ __volatile__(
@@ -157,7 +155,6 @@ static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
} while (unlikely(!temp));
return;
}
-#endif /* CONFIG_CPU_MIPSR2 || CONFIG_CPU_MIPSR6 */
loongson_llsc_mb();
do {
@@ -377,8 +374,7 @@ static inline int test_and_clear_bit(unsigned long nr,
: "=&r" (temp), "+" GCC_OFF_SMALL_ASM() (*m), "=&r" (res)
: "r" (1UL << bit)
: __LLSC_CLOBBER);
-#if defined(CONFIG_CPU_MIPSR2) || defined(CONFIG_CPU_MIPSR6)
- } else if (__builtin_constant_p(nr)) {
+ } else if ((MIPS_ISA_REV >= 2) && __builtin_constant_p(nr)) {
loongson_llsc_mb();
do {
__asm__ __volatile__(
@@ -390,7 +386,6 @@ static inline int test_and_clear_bit(unsigned long nr,
: "ir" (bit)
: __LLSC_CLOBBER);
} while (unlikely(!temp));
-#endif
} else {
loongson_llsc_mb();
do {
--
2.23.0
Implement __sync() using the new __SYNC() infrastructure, which will
take care of not emitting an instruction for old R3k CPUs that don't
support it. The only behavioral difference is that __sync() will now
provide a compiler barrier on these old CPUs, but that seems like
reasonable behavior anyway.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/barrier.h | 18 ++++--------------
1 file changed, 4 insertions(+), 14 deletions(-)
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 657ec01120a4..a117c6d95038 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -11,20 +11,10 @@
#include <asm/addrspace.h>
#include <asm/sync.h>
-#ifdef CONFIG_CPU_HAS_SYNC
-#define __sync() \
- __asm__ __volatile__( \
- ".set push\n\t" \
- ".set noreorder\n\t" \
- ".set mips2\n\t" \
- "sync\n\t" \
- ".set pop" \
- : /* no output */ \
- : /* no input */ \
- : "memory")
-#else
-#define __sync() do { } while(0)
-#endif
+static inline void __sync(void)
+{
+ asm volatile(__SYNC(full, always) ::: "memory");
+}
static inline void rmb(void)
{
--
2.23.0
We #ifdef on Cavium Octeon CPUs, but emit the same sync instruction in
both cases. Remove the #ifdef & simply expand to the __sync() macro.
Whilst here, indent the strong ordering case definitions to match the
indentation of the weak ordering ones, helping readability.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/barrier.h | 12 ++++--------
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index f36cab87cfde..8a5abc1c85a6 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -89,17 +89,13 @@ static inline void wmb(void)
#endif /* !CONFIG_CPU_HAS_WB */
#if defined(CONFIG_WEAK_ORDERING)
-# ifdef CONFIG_CPU_CAVIUM_OCTEON
-# define __smp_mb() __sync()
-# else
-# define __smp_mb() __asm__ __volatile__("sync" : : :"memory")
-# endif
+# define __smp_mb() __sync()
# define __smp_rmb() rmb()
# define __smp_wmb() wmb()
#else
-#define __smp_mb() barrier()
-#define __smp_rmb() barrier()
-#define __smp_wmb() barrier()
+# define __smp_mb() barrier()
+# define __smp_rmb() barrier()
+# define __smp_wmb() barrier()
#endif
/*
--
2.23.0
Introduce an asm/sync.h header which provides infrastructure that can be
used to generate sync instructions of various types, and for various
reasons. For example if we need a sync instruction that provides a full
completion barrier but only on systems which have weak memory ordering,
we can generate the appropriate assembly code using:
__SYNC(full, weak_ordering)
When the kernel is configured to run on systems with weak memory
ordering (ie. CONFIG_WEAK_ORDERING is selected) we'll emit a sync
instruction. When the kernel is configured to run on systems with strong
memory ordering (ie. CONFIG_WEAK_ORDERING is not selected) we'll emit
nothing. The caller doesn't need to know which happened - it simply says
what it needs & when, with no concern for checking the kernel
configuration.
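As a rough illustration (not code from this patch; the helper name is
made up), using the macro from inline asm looks like this:

static inline void example_mb(void)
{
	/* Emits a 'sync' instruction when CONFIG_WEAK_ORDERING is
	 * selected; on strongly ordered systems only the compiler
	 * barrier provided by the "memory" clobber remains. */
	asm volatile(__SYNC(full, weak_ordering) ::: "memory");
}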
There are some scenarios in which we may want to emit code only when we
*didn't* emit a sync instruction. For example, some Loongson3 CPUs
suffer from a bug that requires us to emit a sync instruction prior to
each ll instruction (enabled by CONFIG_CPU_LOONGSON3_WORKAROUNDS). In
cases where this bug workaround is enabled, it's wasteful to then have
more generic code emit another sync instruction to provide barriers we
need in general. A __SYNC_ELSE() macro allows for this, providing an
extra argument that contains code to be assembled only in cases where
the sync instruction was not emitted. For example if we have a scenario
in which we generally want to emit a release barrier but for affected
Loongson3 configurations upgrade that to a full completion barrier, we
can do that like so:
__SYNC_ELSE(full, loongson3_war, __SYNC(rl, always))
The assembly generated by these macros can be used either as inline
assembly or in assembly source files.
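As a simplified sketch of the inline-asm usage (the function & loop are
illustrative, not lifted verbatim from a later patch), the loongson3_war
reason lets the workaround sync be emitted inside the LL/SC asm itself,
in the style of the later "Emit Loongson3 sync workarounds within asm"
patches:

static inline int example_fetch_inc(int *p)
{
	int old, tmp;

	/* The workaround 'sync' lands immediately before the ll, and
	 * only when CONFIG_CPU_LOONGSON3_WORKAROUNDS is enabled. */
	asm volatile(
	"	.set	push					\n"
	"	.set	" MIPS_ISA_LEVEL "			\n"
	"	" __SYNC(full, loongson3_war) "			\n"
	"1:	ll	%0, %2					\n"
	"	addiu	%1, %0, 1				\n"
	"	sc	%1, %2					\n"
	"	beqz	%1, 1b					\n"
	"	.set	pop					\n"
	: "=&r"(old), "=&r"(tmp), "+" GCC_OFF_SMALL_ASM() (*p)
	: /* no inputs */
	: "memory");

	return old;
}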
Differing types of sync as provided by MIPSr6 are defined, but currently
they all generate a full completion barrier except in kernels configured
for Cavium Octeon systems. There the wmb sync-type is used, and rmb
syncs are omitted, as has been the case since commit 6b07d38aaa52
("MIPS: Octeon: Use optimized memory barrier primitives."). Using
__SYNC() with the wmb or rmb types will abstract away the Octeon
specific behavior and allow us to later clean up asm/barrier.h code that
currently includes a plethora of #ifdef's.
Signed-off-by: Paul Burton <[email protected]>
---
Changes in v2: None
arch/mips/include/asm/barrier.h | 113 +----------------
arch/mips/include/asm/sync.h | 207 ++++++++++++++++++++++++++++++++
arch/mips/kernel/pm-cps.c | 20 +--
3 files changed, 219 insertions(+), 121 deletions(-)
create mode 100644 arch/mips/include/asm/sync.h
diff --git a/arch/mips/include/asm/barrier.h b/arch/mips/include/asm/barrier.h
index 9228f7386220..5ad39bfd3b6d 100644
--- a/arch/mips/include/asm/barrier.h
+++ b/arch/mips/include/asm/barrier.h
@@ -9,116 +9,7 @@
#define __ASM_BARRIER_H
#include <asm/addrspace.h>
-
-/*
- * Sync types defined by the MIPS architecture (document MD00087 table 6.5)
- * These values are used with the sync instruction to perform memory barriers.
- * Types of ordering guarantees available through the SYNC instruction:
- * - Completion Barriers
- * - Ordering Barriers
- * As compared to the completion barrier, the ordering barrier is a
- * lighter-weight operation as it does not require the specified instructions
- * before the SYNC to be already completed. Instead it only requires that those
- * specified instructions which are subsequent to the SYNC in the instruction
- * stream are never re-ordered for processing ahead of the specified
- * instructions which are before the SYNC in the instruction stream.
- * This potentially reduces how many cycles the barrier instruction must stall
- * before it completes.
- * Implementations that do not use any of the non-zero values of stype to define
- * different barriers, such as ordering barriers, must make those stype values
- * act the same as stype zero.
- */
-
-/*
- * Completion barriers:
- * - Every synchronizable specified memory instruction (loads or stores or both)
- * that occurs in the instruction stream before the SYNC instruction must be
- * already globally performed before any synchronizable specified memory
- * instructions that occur after the SYNC are allowed to be performed, with
- * respect to any other processor or coherent I/O module.
- *
- * - The barrier does not guarantee the order in which instruction fetches are
- * performed.
- *
- * - A stype value of zero will always be defined such that it performs the most
- * complete set of synchronization operations that are defined.This means
- * stype zero always does a completion barrier that affects both loads and
- * stores preceding the SYNC instruction and both loads and stores that are
- * subsequent to the SYNC instruction. Non-zero values of stype may be defined
- * by the architecture or specific implementations to perform synchronization
- * behaviors that are less complete than that of stype zero. If an
- * implementation does not use one of these non-zero values to define a
- * different synchronization behavior, then that non-zero value of stype must
- * act the same as stype zero completion barrier. This allows software written
- * for an implementation with a lighter-weight barrier to work on another
- * implementation which only implements the stype zero completion barrier.
- *
- * - A completion barrier is required, potentially in conjunction with SSNOP (in
- * Release 1 of the Architecture) or EHB (in Release 2 of the Architecture),
- * to guarantee that memory reference results are visible across operating
- * mode changes. For example, a completion barrier is required on some
- * implementations on entry to and exit from Debug Mode to guarantee that
- * memory effects are handled correctly.
- */
-
-/*
- * stype 0 - A completion barrier that affects preceding loads and stores and
- * subsequent loads and stores.
- * Older instructions which must reach the load/store ordering point before the
- * SYNC instruction completes: Loads, Stores
- * Younger instructions which must reach the load/store ordering point only
- * after the SYNC instruction completes: Loads, Stores
- * Older instructions which must be globally performed when the SYNC instruction
- * completes: Loads, Stores
- */
-#define STYPE_SYNC 0x0
-
-/*
- * Ordering barriers:
- * - Every synchronizable specified memory instruction (loads or stores or both)
- * that occurs in the instruction stream before the SYNC instruction must
- * reach a stage in the load/store datapath after which no instruction
- * re-ordering is possible before any synchronizable specified memory
- * instruction which occurs after the SYNC instruction in the instruction
- * stream reaches the same stage in the load/store datapath.
- *
- * - If any memory instruction before the SYNC instruction in program order,
- * generates a memory request to the external memory and any memory
- * instruction after the SYNC instruction in program order also generates a
- * memory request to external memory, the memory request belonging to the
- * older instruction must be globally performed before the time the memory
- * request belonging to the younger instruction is globally performed.
- *
- * - The barrier does not guarantee the order in which instruction fetches are
- * performed.
- */
-
-/*
- * stype 0x10 - An ordering barrier that affects preceding loads and stores and
- * subsequent loads and stores.
- * Older instructions which must reach the load/store ordering point before the
- * SYNC instruction completes: Loads, Stores
- * Younger instructions which must reach the load/store ordering point only
- * after the SYNC instruction completes: Loads, Stores
- * Older instructions which must be globally performed when the SYNC instruction
- * completes: N/A
- */
-#define STYPE_SYNC_MB 0x10
-
-/*
- * stype 0x14 - A completion barrier specific to global invalidations
- *
- * When a sync instruction of this type completes any preceding GINVI or GINVT
- * operation has been globalized & completed on all coherent CPUs. Anything
- * that the GINV* instruction should invalidate will have been invalidated on
- * all coherent CPUs when this instruction completes. It is implementation
- * specific whether the GINV* instructions themselves will ensure completion,
- * or this sync type will.
- *
- * In systems implementing global invalidates (ie. with Config5.GI == 2 or 3)
- * this sync type also requires that previous SYNCI operations have completed.
- */
-#define STYPE_GINV 0x14
+#include <asm/sync.h>
#ifdef CONFIG_CPU_HAS_SYNC
#define __sync() \
@@ -286,7 +177,7 @@
static inline void sync_ginv(void)
{
- asm volatile("sync\t%0" :: "i"(STYPE_GINV));
+ asm volatile("sync\t%0" :: "i"(__SYNC_ginv));
}
#include <asm-generic/barrier.h>
diff --git a/arch/mips/include/asm/sync.h b/arch/mips/include/asm/sync.h
new file mode 100644
index 000000000000..7c6a1095f556
--- /dev/null
+++ b/arch/mips/include/asm/sync.h
@@ -0,0 +1,207 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef __MIPS_ASM_SYNC_H__
+#define __MIPS_ASM_SYNC_H__
+
+/*
+ * sync types are defined by the MIPS64 Instruction Set documentation in Volume
+ * II-A of the MIPS Architecture Reference Manual, which can be found here:
+ *
+ * https://www.mips.com/?do-download=the-mips64-instruction-set-v6-06
+ *
+ * Two types of barrier are provided:
+ *
+ * 1) Completion barriers, which ensure that a memory operation has actually
+ * completed & often involve stalling the CPU pipeline to do so.
+ *
+ * 2) Ordering barriers, which only ensure that affected memory operations
+ * won't be reordered in the CPU pipeline in a manner that violates the
+ * restrictions imposed by the barrier.
+ *
+ * Ordering barriers can be more efficient than completion barriers, since:
+ *
+ * a) Ordering barriers only require memory access instructions which precede
+ * them in program order (older instructions) to reach a point in the
+ * load/store datapath beyond which reordering is not possible before
+ * allowing memory access instructions which follow them (younger
+ * instructions) to be performed. That is, older instructions don't
+ * actually need to complete - they just need to get far enough that all
+ * other coherent CPUs will observe their completion before they observe
+ * the effects of younger instructions.
+ *
+ * b) Multiple variants of ordering barrier are provided which allow the
+ * effects to be restricted to different combinations of older or younger
+ * loads or stores. By way of example, if we only care that stores older
+ * than a barrier are observed prior to stores that are younger than a
+ * barrier & don't care about the ordering of loads then the 'wmb'
+ * ordering barrier can be used. Limiting the barrier's effects to stores
+ * allows loads to continue unaffected & potentially allows the CPU to
+ * make progress faster than if younger loads had to wait for older stores
+ * to complete.
+ */
+
+/*
+ * No sync instruction at all; used to allow code to nullify the effect of the
+ * __SYNC() macro without needing lots of #ifdefery.
+ */
+#define __SYNC_none -1
+
+/*
+ * A full completion barrier; all memory accesses appearing prior to this sync
+ * instruction in program order must complete before any memory accesses
+ * appearing after this sync instruction in program order.
+ */
+#define __SYNC_full 0x00
+
+/*
+ * For now we use a full completion barrier to implement all sync types, until
+ * we're satisfied that lightweight ordering barriers defined by MIPSr6 are
+ * sufficient to uphold our desired memory model.
+ */
+#define __SYNC_aq __SYNC_full
+#define __SYNC_rl __SYNC_full
+#define __SYNC_mb __SYNC_full
+
+/*
+ * ...except on Cavium Octeon CPUs, which have been using the 'wmb' ordering
+ * barrier since 2010 & omit 'rmb' barriers because the CPUs don't perform
+ * speculative reads.
+ */
+#ifdef CONFIG_CPU_CAVIUM_OCTEON
+# define __SYNC_rmb __SYNC_none
+# define __SYNC_wmb 0x04
+#else
+# define __SYNC_rmb __SYNC_full
+# define __SYNC_wmb __SYNC_full
+#endif
+
+/*
+ * A GINV sync is a little different; it doesn't relate directly to loads or
+ * stores, but instead causes synchronization of an icache or TLB global
+ * invalidation operation triggered by the ginvi or ginvt instructions
+ * respectively. In cases where we need to know that a ginvi or ginvt operation
+ * has been performed by all coherent CPUs, we must issue a sync instruction of
+ * this type. Once this instruction graduates all coherent CPUs will have
+ * observed the invalidation.
+ */
+#define __SYNC_ginv 0x14
+
+/* Trivial; indicate that we always need this sync instruction. */
+#define __SYNC_always (1 << 0)
+
+/*
+ * Indicate that we need this sync instruction only on systems with weakly
+ * ordered memory access. In general this is most MIPS systems, but there are
+ * exceptions which provide strongly ordered memory.
+ */
+#ifdef CONFIG_WEAK_ORDERING
+# define __SYNC_weak_ordering (1 << 1)
+#else
+# define __SYNC_weak_ordering 0
+#endif
+
+/*
+ * Indicate that we need this sync instruction only on systems where LL/SC
+ * don't implicitly provide a memory barrier. In general this is most MIPS
+ * systems.
+ */
+#ifdef CONFIG_WEAK_REORDERING_BEYOND_LLSC
+# define __SYNC_weak_llsc (1 << 2)
+#else
+# define __SYNC_weak_llsc 0
+#endif
+
+/*
+ * Some Loongson 3 CPUs have a bug wherein execution of a memory access (load,
+ * store or prefetch) in between an LL & SC can cause the SC instruction to
+ * erroneously succeed, breaking atomicity. Whilst it's unusual to write code
+ * containing such sequences, this bug bites harder than we might otherwise
+ * expect due to reordering & speculation:
+ *
+ * 1) A memory access appearing prior to the LL in program order may actually
+ * be executed after the LL - this is the reordering case.
+ *
+ * In order to avoid this we need to place a memory barrier (ie. a SYNC
+ * instruction) prior to every LL instruction, in between it and any earlier
+ * memory access instructions.
+ *
+ * This reordering case is fixed by 3A R2 CPUs, ie. 3A2000 models and later.
+ *
+ * 2) If a conditional branch exists between an LL & SC with a target outside
+ * of the LL-SC loop, for example an exit upon value mismatch in cmpxchg()
+ * or similar, then misprediction of the branch may allow speculative
+ * execution of memory accesses from outside of the LL-SC loop.
+ *
+ * In order to avoid this we need a memory barrier (ie. a SYNC instruction)
+ * at each affected branch target.
+ *
+ * This case affects all current Loongson 3 CPUs.
+ *
+ * The above described cases cause an error in the cache coherence protocol;
+ * such that the Invalidate of a competing LL-SC goes 'missing' and SC
+ * erroneously observes its core still has Exclusive state and lets the SC
+ * proceed.
+ *
+ * Therefore the error only occurs on SMP systems.
+ */
+#ifdef CONFIG_CPU_LOONGSON3_WORKAROUNDS
+# define __SYNC_loongson3_war (1 << 31)
+#else
+# define __SYNC_loongson3_war 0
+#endif
+
+/*
+ * Some Cavium Octeon CPUs suffer from a bug that causes a single wmb ordering
+ * barrier to be ineffective, requiring the use of 2 in sequence to provide an
+ * effective barrier as noted by commit 6b07d38aaa52 ("MIPS: Octeon: Use
+ * optimized memory barrier primitives."). Here we specify that the affected
+ * sync instructions should be emitted twice.
+ */
+#ifdef CONFIG_CPU_CAVIUM_OCTEON
+# define __SYNC_rpt(type) (1 + (type == __SYNC_wmb))
+#else
+# define __SYNC_rpt(type) 1
+#endif
+
+/*
+ * The main event. Here we actually emit a sync instruction of a given type, if
+ * reason is non-zero.
+ *
+ * In future we have the option of emitting entries in a fixups-style table
+ * here that would allow us to opportunistically remove some sync instructions
+ * when we detect at runtime that we're running on a CPU that doesn't need
+ * them.
+ */
+#ifdef CONFIG_CPU_HAS_SYNC
+# define ____SYNC(_type, _reason, _else) \
+ .if (( _type ) != -1) && ( _reason ); \
+ .set push; \
+ .set MIPS_ISA_LEVEL_RAW; \
+ .rept __SYNC_rpt(_type); \
+ sync _type; \
+ .endr; \
+ .set pop; \
+ .else; \
+ _else; \
+ .endif
+#else
+# define ____SYNC(_type, _reason, _else)
+#endif
+
+/*
+ * Preprocessor magic to expand macros used as arguments before we insert them
+ * into assembly code.
+ */
+#ifdef __ASSEMBLY__
+# define ___SYNC(type, reason, else) \
+ ____SYNC(type, reason, else)
+#else
+# define ___SYNC(type, reason, else) \
+ __stringify(____SYNC(type, reason, else))
+#endif
+
+#define __SYNC(type, reason) \
+ ___SYNC(__SYNC_##type, __SYNC_##reason, )
+#define __SYNC_ELSE(type, reason, else) \
+ ___SYNC(__SYNC_##type, __SYNC_##reason, else)
+
+#endif /* __MIPS_ASM_SYNC_H__ */
diff --git a/arch/mips/kernel/pm-cps.c b/arch/mips/kernel/pm-cps.c
index a26f40db15d0..9bf60d7d44d3 100644
--- a/arch/mips/kernel/pm-cps.c
+++ b/arch/mips/kernel/pm-cps.c
@@ -307,7 +307,7 @@ static int cps_gen_flush_fsb(u32 **pp, struct uasm_label **pl,
}
/* Barrier ensuring previous cache invalidates are complete */
- uasm_i_sync(pp, STYPE_SYNC);
+ uasm_i_sync(pp, __SYNC_full);
uasm_i_ehb(pp);
/* Check whether the pipeline stalled due to the FSB being full */
@@ -397,7 +397,7 @@ static void *cps_gen_entry_code(unsigned cpu, enum cps_pm_state state)
if (coupled_coherence) {
/* Increment ready_count */
- uasm_i_sync(&p, STYPE_SYNC_MB);
+ uasm_i_sync(&p, __SYNC_mb);
uasm_build_label(&l, p, lbl_incready);
uasm_i_ll(&p, t1, 0, r_nc_count);
uasm_i_addiu(&p, t2, t1, 1);
@@ -406,7 +406,7 @@ static void *cps_gen_entry_code(unsigned cpu, enum cps_pm_state state)
uasm_i_addiu(&p, t1, t1, 1);
/* Barrier ensuring all CPUs see the updated r_nc_count value */
- uasm_i_sync(&p, STYPE_SYNC_MB);
+ uasm_i_sync(&p, __SYNC_mb);
/*
* If this is the last VPE to become ready for non-coherence
@@ -473,7 +473,7 @@ static void *cps_gen_entry_code(unsigned cpu, enum cps_pm_state state)
Index_Writeback_Inv_D, lbl_flushdcache);
/* Barrier ensuring previous cache invalidates are complete */
- uasm_i_sync(&p, STYPE_SYNC);
+ uasm_i_sync(&p, __SYNC_full);
uasm_i_ehb(&p);
if (mips_cm_revision() < CM_REV_CM3) {
@@ -487,7 +487,7 @@ static void *cps_gen_entry_code(unsigned cpu, enum cps_pm_state state)
uasm_i_lw(&p, t0, 0, r_pcohctl);
/* Barrier to ensure write to coherence control is complete */
- uasm_i_sync(&p, STYPE_SYNC);
+ uasm_i_sync(&p, __SYNC_full);
uasm_i_ehb(&p);
}
@@ -534,7 +534,7 @@ static void *cps_gen_entry_code(unsigned cpu, enum cps_pm_state state)
}
/* Barrier to ensure write to CPC command is complete */
- uasm_i_sync(&p, STYPE_SYNC);
+ uasm_i_sync(&p, __SYNC_full);
uasm_i_ehb(&p);
}
@@ -572,13 +572,13 @@ static void *cps_gen_entry_code(unsigned cpu, enum cps_pm_state state)
uasm_i_lw(&p, t0, 0, r_pcohctl);
/* Barrier to ensure write to coherence control is complete */
- uasm_i_sync(&p, STYPE_SYNC);
+ uasm_i_sync(&p, __SYNC_full);
uasm_i_ehb(&p);
if (coupled_coherence && (state == CPS_PM_NC_WAIT)) {
/* Decrement ready_count */
uasm_build_label(&l, p, lbl_decready);
- uasm_i_sync(&p, STYPE_SYNC_MB);
+ uasm_i_sync(&p, __SYNC_mb);
uasm_i_ll(&p, t1, 0, r_nc_count);
uasm_i_addiu(&p, t2, t1, -1);
uasm_i_sc(&p, t2, 0, r_nc_count);
@@ -586,7 +586,7 @@ static void *cps_gen_entry_code(unsigned cpu, enum cps_pm_state state)
uasm_i_andi(&p, v0, t1, (1 << fls(smp_num_siblings)) - 1);
/* Barrier ensuring all CPUs see the updated r_nc_count value */
- uasm_i_sync(&p, STYPE_SYNC_MB);
+ uasm_i_sync(&p, __SYNC_mb);
}
if (coupled_coherence && (state == CPS_PM_CLOCK_GATED)) {
@@ -608,7 +608,7 @@ static void *cps_gen_entry_code(unsigned cpu, enum cps_pm_state state)
uasm_build_label(&l, p, lbl_secondary_cont);
/* Barrier ensuring all CPUs see the updated r_nc_count value */
- uasm_i_sync(&p, STYPE_SYNC_MB);
+ uasm_i_sync(&p, __SYNC_mb);
}
/* The core is coherent, time to return to C code */
--
2.23.0
Hello,
Paul Burton wrote:
> This series consists of a bunch of cleanups to the way we handle memory
> barriers (though no changes to the sync instructions we use to implement
> them) & atomic memory accesses. One major goal was to ensure the
> Loongson3 LL/SC errata workarounds are applied in a safe manner from
> within inline-asm & that we can automatically verify the resulting
> kernel binary looks reasonable. Many patches are cleanups found along
> the way.
>
> Applies atop v5.4-rc1.
>
> Changes in v2:
> - Keep our fls/ffs implementations. Turns out GCC's builtins call
> intrinsics in some configurations, and if we'd need to go implement
> those then using the generic fls/ffs doesn't seem like such a win.
> - De-string __WEAK_LLSC_MB to allow use with __SYNC_ELSE().
> - Only try to build the loongson3-llsc-check tool from
> arch/mips/Makefile when CONFIG_CPU_LOONGSON3_WORKAROUNDS is enabled.
>
> Paul Burton (36):
> MIPS: Unify sc beqz definition
> MIPS: Use compact branch for LL/SC loops on MIPSr6+
> MIPS: barrier: Add __SYNC() infrastructure
> MIPS: barrier: Clean up rmb() & wmb() definitions
> MIPS: barrier: Clean up __smp_mb() definition
> MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery
> MIPS: barrier: Clean up __sync() definition
> MIPS: barrier: Clean up sync_ginv()
> MIPS: atomic: Fix whitespace in ATOMIC_OP macros
> MIPS: atomic: Handle !kernel_uses_llsc first
> MIPS: atomic: Use one macro to generate 32b & 64b functions
> MIPS: atomic: Emit Loongson3 sync workarounds within asm
> MIPS: atomic: Use _atomic barriers in atomic_sub_if_positive()
> MIPS: atomic: Unify 32b & 64b sub_if_positive
> MIPS: atomic: Deduplicate 32b & 64b read, set, xchg, cmpxchg
> MIPS: bitops: Handle !kernel_uses_llsc first
> MIPS: bitops: Only use ins for bit 16 or higher
> MIPS: bitops: Use MIPS_ISA_REV, not #ifdefs
> MIPS: bitops: ins start position is always an immediate
> MIPS: bitops: Implement test_and_set_bit() in terms of _lock variant
> MIPS: bitops: Allow immediates in test_and_{set,clear,change}_bit
> MIPS: bitops: Use the BIT() macro
> MIPS: bitops: Avoid redundant zero-comparison for non-LLSC
> MIPS: bitops: Abstract LL/SC loops
> MIPS: bitops: Use BIT_WORD() & BITS_PER_LONG
> MIPS: bitops: Emit Loongson3 sync workarounds within asm
> MIPS: bitops: Use smp_mb__before_atomic in test_* ops
> MIPS: cmpxchg: Emit Loongson3 sync workarounds within asm
> MIPS: cmpxchg: Omit redundant barriers for Loongson3
> MIPS: futex: Emit Loongson3 sync workarounds within asm
> MIPS: syscall: Emit Loongson3 sync workarounds within asm
> MIPS: barrier: Remove loongson_llsc_mb()
> MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3
> MIPS: genex: Add Loongson3 LL/SC workaround to ejtag_debug_handler
> MIPS: genex: Don't reload address unnecessarily
> MIPS: Check Loongson3 LL/SC errata workaround correctness
>
> arch/mips/Makefile | 3 +
> arch/mips/Makefile.postlink | 10 +-
Series applied to mips-next.
> MIPS: Unify sc beqz definition
> commit 878f75c7a253
> https://git.kernel.org/mips/c/878f75c7a253
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: Use compact branch for LL/SC loops on MIPSr6+
> commit ef85d057a605
> https://git.kernel.org/mips/c/ef85d057a605
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: barrier: Add __SYNC() infrastructure
> commit bf92927251b3
> https://git.kernel.org/mips/c/bf92927251b3
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: barrier: Clean up rmb() & wmb() definitions
> commit 21e3134b3ec0
> https://git.kernel.org/mips/c/21e3134b3ec0
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: barrier: Clean up __smp_mb() definition
> commit 05e6da742b5b
> https://git.kernel.org/mips/c/05e6da742b5b
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: barrier: Remove fast_mb() Octeon #ifdef'ery
> commit 5c12a6eff6ae
> https://git.kernel.org/mips/c/5c12a6eff6ae
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: barrier: Clean up __sync() definition
> commit fe0065e56227
> https://git.kernel.org/mips/c/fe0065e56227
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: barrier: Clean up sync_ginv()
> commit 185d7d7a5819
> https://git.kernel.org/mips/c/185d7d7a5819
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: atomic: Fix whitespace in ATOMIC_OP macros
> commit 36d3295c5a0d
> https://git.kernel.org/mips/c/36d3295c5a0d
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: atomic: Handle !kernel_uses_llsc first
> commit 9537db24c65a
> https://git.kernel.org/mips/c/9537db24c65a
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: atomic: Use one macro to generate 32b & 64b functions
> commit a38ee6bb14a4
> https://git.kernel.org/mips/c/a38ee6bb14a4
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: atomic: Emit Loongson3 sync workarounds within asm
> commit 4d1dbfe6cbec
> https://git.kernel.org/mips/c/4d1dbfe6cbec
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: atomic: Use _atomic barriers in atomic_sub_if_positive()
> commit 77d281b7966e
> https://git.kernel.org/mips/c/77d281b7966e
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: atomic: Unify 32b & 64b sub_if_positive
> commit 40e784b4d4bc
> https://git.kernel.org/mips/c/40e784b4d4bc
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: atomic: Deduplicate 32b & 64b read, set, xchg, cmpxchg
> commit 1da7bce8591d
> https://git.kernel.org/mips/c/1da7bce8591d
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Handle !kernel_uses_llsc first
> commit fe7cd97e68fa
> https://git.kernel.org/mips/c/fe7cd97e68fa
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Only use ins for bit 16 or higher
> commit 3d2920cf4fd4
> https://git.kernel.org/mips/c/3d2920cf4fd4
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Use MIPS_ISA_REV, not #ifdefs
> commit 59361e9975fd
> https://git.kernel.org/mips/c/59361e9975fd
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: ins start position is always an immediate
> commit 27aab27259ae
> https://git.kernel.org/mips/c/27aab27259ae
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Implement test_and_set_bit() in terms of _lock variant
> commit 6bbe043bd3f4
> https://git.kernel.org/mips/c/6bbe043bd3f4
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Allow immediates in test_and_{set,clear,change}_bit
> commit a2e66b862cc7
> https://git.kernel.org/mips/c/a2e66b862cc7
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Use the BIT() macro
> commit d6103510e7cc
> https://git.kernel.org/mips/c/d6103510e7cc
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Avoid redundant zero-comparison for non-LLSC
> commit aad028cadb17
> https://git.kernel.org/mips/c/aad028cadb17
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Abstract LL/SC loops
> commit cc99987c375e
> https://git.kernel.org/mips/c/cc99987c375e
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Use BIT_WORD() & BITS_PER_LONG
> commit c042be02d730
> https://git.kernel.org/mips/c/c042be02d730
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Emit Loongson3 sync workarounds within asm
> commit 5bb29275df7a
> https://git.kernel.org/mips/c/5bb29275df7a
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: bitops: Use smp_mb__before_atomic in test_* ops
> commit 9026737703ae
> https://git.kernel.org/mips/c/9026737703ae
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: cmpxchg: Emit Loongson3 sync workarounds within asm
> commit 6a57d2d1e7c3
> https://git.kernel.org/mips/c/6a57d2d1e7c3
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: cmpxchg: Omit redundant barriers for Loongson3
> commit a91f2a1dba44
> https://git.kernel.org/mips/c/a91f2a1dba44
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: futex: Emit Loongson3 sync workarounds within asm
> commit 3c1d3f097972
> https://git.kernel.org/mips/c/3c1d3f097972
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: syscall: Emit Loongson3 sync workarounds within asm
> commit e84957e6ae04
> https://git.kernel.org/mips/c/e84957e6ae04
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: barrier: Remove loongson_llsc_mb()
> commit 7f56b1235481
> https://git.kernel.org/mips/c/7f56b1235481
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: barrier: Make __smp_mb__before_atomic() a no-op for Loongson3
> commit ae4cd0b1a475
> https://git.kernel.org/mips/c/ae4cd0b1a475
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: genex: Add Loongson3 LL/SC workaround to ejtag_debug_handler
> commit 12dbb04f2ac1
> https://git.kernel.org/mips/c/12dbb04f2ac1
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: genex: Don't reload address unnecessarily
> commit 4dee90d7b579
> https://git.kernel.org/mips/c/4dee90d7b579
>
> Signed-off-by: Paul Burton <[email protected]>
>
> MIPS: Check Loongson3 LL/SC errata workaround correctness
> commit e4acfbc18fc9
> https://git.kernel.org/mips/c/e4acfbc18fc9
>
> Signed-off-by: Paul Burton <[email protected]>
Thanks,
Paul
[ This message was auto-generated; if you believe anything is incorrect
then please email [email protected] to report it. ]