2021-03-28 06:36:14

by Guo Ren

Subject: [PATCH v5 0/7] riscv: Add qspinlock/qrwlock

From: Guo Ren <[email protected]>

The current riscv port still uses a simple test-and-set ("baby")
spinlock implementation, which causes fairness and cache-line
bouncing problems. Many people have been involved and have put in
effort to improve it:

- The first version of the patch was posted in January 2019:
https://lore.kernel.org/linux-riscv/[email protected]/#r

- The second version was posted in November 2020:
https://lore.kernel.org/linux-riscv/[email protected]/

- A good discussion was held at the Platform HSC meeting on 2021-03-08:
https://drive.google.com/drive/folders/1ooqdnIsYx7XKor5O1XTtM6D1CHp4hc0p

Comments and Tested-by, Co-developed-by, or Reviewed-by tags are welcome.

Let's get qspinlock into riscv now (and into the other architectures
that lack an xchg16 atomic instruction).
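
For context, when an architecture selects ARCH_USE_QUEUED_SPINLOCKS_XCHG32,
xchg_tail falls back to a full-word cmpxchg loop over the whole lock word.
Below is a simplified sketch of that fallback, modeled on the existing code
in kernel/locking/qspinlock.c (comments are mine):

static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
{
	u32 old, new, val = atomic_read(&lock->val);

	for (;;) {
		/* Preserve the locked/pending bits, replace the tail. */
		new = (val & _Q_LOCKED_PENDING_MASK) | tail;
		/*
		 * Relaxed ordering is enough here: the caller makes the
		 * MCS node valid before publishing the new tail.
		 */
		old = atomic_cmpxchg_relaxed(&lock->val, val, new);
		if (old == val)
			break;
		val = old;
	}
	return old;
}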

Changes in V5:
- Fix an #endif comment typo reported by Waiman
- Drop the cmpxchg coding-convention patches; per Arnd's advice they
  will be sent later as a separate patchset
- Try to involve more architectures in the discussion

Changes in V4:
- Remove the custom sub-word xchg implementation
- Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in locking/qspinlock

Changes in V3:
- Coding-convention fixes following Peter Zijlstra's advice

Changes in V2:
- Coding-convention fixes in cmpxchg.h
- Re-implement the short xchg
- Remove the char & cmpxchg implementations

Guo Ren (6):
locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32
csky: Convert custom spinlock/rwlock to generic qspinlock/qrwlock
powerpc/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32
openrisc: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32
sparc: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32
xtensa: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

Michael Clark (1):
riscv: Convert custom spinlock/rwlock to generic qspinlock/qrwlock

arch/csky/Kconfig | 2 +
arch/csky/include/asm/Kbuild | 2 +
arch/csky/include/asm/spinlock.h | 82 +--------------
arch/csky/include/asm/spinlock_types.h | 16 +--
arch/openrisc/Kconfig | 1 +
arch/powerpc/Kconfig | 1 +
arch/riscv/Kconfig | 3 +
arch/riscv/include/asm/Kbuild | 3 +
arch/riscv/include/asm/spinlock.h | 126 +-----------------------
arch/riscv/include/asm/spinlock_types.h | 15 +--
arch/sparc/Kconfig | 1 +
arch/xtensa/Kconfig | 1 +
kernel/Kconfig.locks | 3 +
kernel/locking/qspinlock.c | 46 +++++----
14 files changed, 49 insertions(+), 253 deletions(-)

--
2.17.1


2021-03-28 06:36:23

by Guo Ren

Subject: [PATCH v5 2/7] riscv: Convert custom spinlock/rwlock to generic qspinlock/qrwlock

From: Michael Clark <[email protected]>

Update the RISC-V port to use the generic qspinlock and qrwlock.

This patch requires the full-word xchg_tail support added by a
previous patch:

Guo added 'select ARCH_USE_QUEUED_SPINLOCKS_XCHG32' in Kconfig.

Guo fixed a compile error caused by the include sequence below:
+#include <asm/qrwlock.h>
+#include <asm/qspinlock.h>

Signed-off-by: Michael Clark <[email protected]>
Co-developed-by: Guo Ren <[email protected]>
Tested-by: Guo Ren <[email protected]>
Signed-off-by: Guo Ren <[email protected]>
Link: https://lore.kernel.org/linux-riscv/[email protected]/
Cc: Peter Zijlstra <[email protected]>
Cc: Anup Patel <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
---
arch/riscv/Kconfig | 3 +
arch/riscv/include/asm/Kbuild | 3 +
arch/riscv/include/asm/spinlock.h | 126 +-----------------------
arch/riscv/include/asm/spinlock_types.h | 15 +--
4 files changed, 11 insertions(+), 136 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 87d7b52f278f..67cc65ba1ea1 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -33,6 +33,9 @@ config RISCV
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
select ARCH_WANT_FRAME_POINTERS
select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
+ select ARCH_USE_QUEUED_RWLOCKS
+ select ARCH_USE_QUEUED_SPINLOCKS
+ select ARCH_USE_QUEUED_SPINLOCKS_XCHG32
select CLONE_BACKWARDS
select CLINT_TIMER if !MMU
select COMMON_CLK
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 445ccc97305a..750c1056b90f 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -3,5 +3,8 @@ generic-y += early_ioremap.h
generic-y += extable.h
generic-y += flat.h
generic-y += kvm_para.h
+generic-y += mcs_spinlock.h
+generic-y += qrwlock.h
+generic-y += qspinlock.h
generic-y += user.h
generic-y += vmlinux.lds.h
diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
index f4f7fa1b7ca8..a557de67a425 100644
--- a/arch/riscv/include/asm/spinlock.h
+++ b/arch/riscv/include/asm/spinlock.h
@@ -7,129 +7,7 @@
#ifndef _ASM_RISCV_SPINLOCK_H
#define _ASM_RISCV_SPINLOCK_H

-#include <linux/kernel.h>
-#include <asm/current.h>
-#include <asm/fence.h>
-
-/*
- * Simple spin lock operations. These provide no fairness guarantees.
- */
-
-/* FIXME: Replace this with a ticket lock, like MIPS. */
-
-#define arch_spin_is_locked(x) (READ_ONCE((x)->lock) != 0)
-
-static inline void arch_spin_unlock(arch_spinlock_t *lock)
-{
- smp_store_release(&lock->lock, 0);
-}
-
-static inline int arch_spin_trylock(arch_spinlock_t *lock)
-{
- int tmp = 1, busy;
-
- __asm__ __volatile__ (
- " amoswap.w %0, %2, %1\n"
- RISCV_ACQUIRE_BARRIER
- : "=r" (busy), "+A" (lock->lock)
- : "r" (tmp)
- : "memory");
-
- return !busy;
-}
-
-static inline void arch_spin_lock(arch_spinlock_t *lock)
-{
- while (1) {
- if (arch_spin_is_locked(lock))
- continue;
-
- if (arch_spin_trylock(lock))
- break;
- }
-}
-
-/***********************************************************/
-
-static inline void arch_read_lock(arch_rwlock_t *lock)
-{
- int tmp;
-
- __asm__ __volatile__(
- "1: lr.w %1, %0\n"
- " bltz %1, 1b\n"
- " addi %1, %1, 1\n"
- " sc.w %1, %1, %0\n"
- " bnez %1, 1b\n"
- RISCV_ACQUIRE_BARRIER
- : "+A" (lock->lock), "=&r" (tmp)
- :: "memory");
-}
-
-static inline void arch_write_lock(arch_rwlock_t *lock)
-{
- int tmp;
-
- __asm__ __volatile__(
- "1: lr.w %1, %0\n"
- " bnez %1, 1b\n"
- " li %1, -1\n"
- " sc.w %1, %1, %0\n"
- " bnez %1, 1b\n"
- RISCV_ACQUIRE_BARRIER
- : "+A" (lock->lock), "=&r" (tmp)
- :: "memory");
-}
-
-static inline int arch_read_trylock(arch_rwlock_t *lock)
-{
- int busy;
-
- __asm__ __volatile__(
- "1: lr.w %1, %0\n"
- " bltz %1, 1f\n"
- " addi %1, %1, 1\n"
- " sc.w %1, %1, %0\n"
- " bnez %1, 1b\n"
- RISCV_ACQUIRE_BARRIER
- "1:\n"
- : "+A" (lock->lock), "=&r" (busy)
- :: "memory");
-
- return !busy;
-}
-
-static inline int arch_write_trylock(arch_rwlock_t *lock)
-{
- int busy;
-
- __asm__ __volatile__(
- "1: lr.w %1, %0\n"
- " bnez %1, 1f\n"
- " li %1, -1\n"
- " sc.w %1, %1, %0\n"
- " bnez %1, 1b\n"
- RISCV_ACQUIRE_BARRIER
- "1:\n"
- : "+A" (lock->lock), "=&r" (busy)
- :: "memory");
-
- return !busy;
-}
-
-static inline void arch_read_unlock(arch_rwlock_t *lock)
-{
- __asm__ __volatile__(
- RISCV_RELEASE_BARRIER
- " amoadd.w x0, %1, %0\n"
- : "+A" (lock->lock)
- : "r" (-1)
- : "memory");
-}
-
-static inline void arch_write_unlock(arch_rwlock_t *lock)
-{
- smp_store_release(&lock->lock, 0);
-}
+#include <asm/qspinlock.h>
+#include <asm/qrwlock.h>

#endif /* _ASM_RISCV_SPINLOCK_H */
diff --git a/arch/riscv/include/asm/spinlock_types.h b/arch/riscv/include/asm/spinlock_types.h
index f398e7638dd6..d033a973f287 100644
--- a/arch/riscv/include/asm/spinlock_types.h
+++ b/arch/riscv/include/asm/spinlock_types.h
@@ -6,20 +6,11 @@
#ifndef _ASM_RISCV_SPINLOCK_TYPES_H
#define _ASM_RISCV_SPINLOCK_TYPES_H

-#ifndef __LINUX_SPINLOCK_TYPES_H
+#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(_ASM_RISCV_SPINLOCK_H)
# error "please don't include this file directly"
#endif

-typedef struct {
- volatile unsigned int lock;
-} arch_spinlock_t;
-
-#define __ARCH_SPIN_LOCK_UNLOCKED { 0 }
-
-typedef struct {
- volatile unsigned int lock;
-} arch_rwlock_t;
-
-#define __ARCH_RW_LOCK_UNLOCKED { 0 }
+#include <asm-generic/qspinlock_types.h>
+#include <asm-generic/qrwlock_types.h>

#endif /* _ASM_RISCV_SPINLOCK_TYPES_H */
--
2.17.1

2021-03-28 06:37:21

by Guo Ren

Subject: [PATCH v5 1/7] locking/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

From: Guo Ren <[email protected]>

Some architectures don't have a sub-word swap atomic instruction;
they only have a full-word one.

The sub-word swap only improves performance when NR_CPUS < 16K, in
which case the lock word is laid out as:
* 0- 7: locked byte
* 8: pending
* 9-15: not used
* 16-17: tail index
* 18-31: tail cpu (+1)

Bits 9-15 are wasted so that xchg16 can be used in xchg_tail.

Let the architecture select whether to implement xchg_tail with
xchg16 or xchg32.
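
As a worked example of the layout above, here is a sketch mirroring
encode_tail() in kernel/locking/qspinlock.c, using the _Q_TAIL_*
constants from asm-generic/qspinlock_types.h:

static inline u32 encode_tail(int cpu, int idx)
{
	u32 tail;

	tail  = (cpu + 1) << _Q_TAIL_CPU_OFFSET;	/* bits 18-31 */
	tail |= idx << _Q_TAIL_IDX_OFFSET;		/* bits 16-17 */

	return tail;	/* e.g. cpu 2, idx 1 -> (3 << 18) | (1 << 16) */
}

An xchg16 on the tail only works because the tail occupies exactly the
upper halfword (bits 16-31); keeping it halfword-aligned is what forces
bits 9-15 to stay unused.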

Signed-off-by: Guo Ren <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Anup Patel <[email protected]>
---
kernel/Kconfig.locks | 3 +++
kernel/locking/qspinlock.c | 46 +++++++++++++++++++++-----------------
2 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/kernel/Kconfig.locks b/kernel/Kconfig.locks
index 3de8fd11873b..d02f1261f73f 100644
--- a/kernel/Kconfig.locks
+++ b/kernel/Kconfig.locks
@@ -239,6 +239,9 @@ config LOCK_SPIN_ON_OWNER
config ARCH_USE_QUEUED_SPINLOCKS
bool

+config ARCH_USE_QUEUED_SPINLOCKS_XCHG32
+ bool
+
config QUEUED_SPINLOCKS
def_bool y if ARCH_USE_QUEUED_SPINLOCKS
depends on SMP
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index cbff6ba53d56..4bfaa969bd15 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -163,26 +163,6 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
WRITE_ONCE(lock->locked_pending, _Q_LOCKED_VAL);
}

-/*
- * xchg_tail - Put in the new queue tail code word & retrieve previous one
- * @lock : Pointer to queued spinlock structure
- * @tail : The new queue tail code word
- * Return: The previous queue tail code word
- *
- * xchg(lock, tail), which heads an address dependency
- *
- * p,*,* -> n,*,* ; prev = xchg(lock, node)
- */
-static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
-{
- /*
- * We can use relaxed semantics since the caller ensures that the
- * MCS node is properly initialized before updating the tail.
- */
- return (u32)xchg_relaxed(&lock->tail,
- tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
-}
-
#else /* _Q_PENDING_BITS == 8 */

/**
@@ -206,6 +186,30 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
{
atomic_add(-_Q_PENDING_VAL + _Q_LOCKED_VAL, &lock->val);
}
+#endif /* _Q_PENDING_BITS == 8 */
+
+#if _Q_PENDING_BITS == 8 && !defined(CONFIG_ARCH_USE_QUEUED_SPINLOCKS_XCHG32)
+/*
+ * xchg_tail - Put in the new queue tail code word & retrieve previous one
+ * @lock : Pointer to queued spinlock structure
+ * @tail : The new queue tail code word
+ * Return: The previous queue tail code word
+ *
+ * xchg(lock, tail), which heads an address dependency
+ *
+ * p,*,* -> n,*,* ; prev = xchg(lock, node)
+ */
+static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
+{
+ /*
+ * We can use relaxed semantics since the caller ensures that the
+ * MCS node is properly initialized before updating the tail.
+ */
+ return (u32)xchg_relaxed(&lock->tail,
+ tail >> _Q_TAIL_OFFSET) << _Q_TAIL_OFFSET;
+}
+
+#else

/**
* xchg_tail - Put in the new queue tail code word & retrieve previous one
@@ -236,7 +240,7 @@ static __always_inline u32 xchg_tail(struct qspinlock *lock, u32 tail)
}
return old;
}
-#endif /* _Q_PENDING_BITS == 8 */
+#endif

/**
* queued_fetch_set_pending_acquire - fetch the whole lock value and set pending
--
2.17.1

2021-03-28 06:40:19

by Guo Ren

Subject: [PATCH v5 6/7] sparc: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

From: Guo Ren <[email protected]>

We don't have a native hardware xchg16 instruction, so let the
generic qspinlock code deal with it.

Implementing xchg16 with the full-word atomic xchg instructions
carries a semantic risk for atomic operations.

This patch removes the generic qspinlock code's dependency on the
architecture's xchg16.
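
To make the risk concrete, here is a generic sketch (not code from
this series) of a sub-word xchg emulated with a full-word cmpxchg.
Unlike a native xchg16, the loop can fail indefinitely while other
CPUs keep updating the neighbouring bytes of the same word, so the
forward-progress guarantee is lost:

static inline u16 emulated_xchg16(u32 *word, int shift, u16 new)
{
	u32 old, tmp, mask = 0xffffU << shift;

	do {
		old = READ_ONCE(*word);
		/* Splice the new halfword into the full word. */
		tmp = (old & ~mask) | ((u32)new << shift);
	} while (cmpxchg_relaxed(word, old, tmp) != old);

	return (old & mask) >> shift;
}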

Signed-off-by: Guo Ren <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Rob Gardner <[email protected]>
---
arch/sparc/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 164a5254c91c..1079fe3f058c 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -91,6 +91,7 @@ config SPARC64
select HAVE_REGS_AND_STACK_ACCESS_API
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_USE_QUEUED_SPINLOCKS
+ select ARCH_USE_QUEUED_SPINLOCKS_XCHG32
select GENERIC_TIME_VSYSCALL
select ARCH_CLOCKSOURCE_DATA
select ARCH_HAS_PTE_SPECIAL
--
2.17.1

2021-03-28 06:40:31

by Guo Ren

Subject: [PATCH v5 5/7] openrisc: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

From: Guo Ren <[email protected]>

We don't have a native hardware xchg16 instruction, so let the
generic qspinlock code deal with it.

Implementing xchg16 with the full-word atomic xchg instructions
carries a semantic risk for atomic operations.

This patch removes the generic qspinlock code's dependency on the
architecture's xchg16.

Signed-off-by: Guo Ren <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: [email protected]
---
arch/openrisc/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
index 591acc5990dc..b299e409429f 100644
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -33,6 +33,7 @@ config OPENRISC
select OR1K_PIC
select CPU_NO_EFFICIENT_FFS if !OPENRISC_HAVE_INST_FF1
select ARCH_USE_QUEUED_SPINLOCKS
+ select ARCH_USE_QUEUED_SPINLOCKS_XCHG32
select ARCH_USE_QUEUED_RWLOCKS
select OMPIC if SMP
select ARCH_WANT_FRAME_POINTERS
--
2.17.1

2021-03-28 06:40:55

by Guo Ren

Subject: [PATCH v5 3/7] csky: Convert custom spinlock/rwlock to generic qspinlock/qrwlock

From: Guo Ren <[email protected]>

Update the C-SKY port to use the generic qspinlock and qrwlock.

C-SKY only supports ldex.w/stex.w with word (and double-word) size
and aligned access, so it must select XCHG32 to make qspinlock use
only the word-wide atomic xchg_tail.

Signed-off-by: Guo Ren <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Arnd Bergmann <[email protected]>
---
arch/csky/Kconfig | 2 +
arch/csky/include/asm/Kbuild | 2 +
arch/csky/include/asm/spinlock.h | 82 +-------------------------
arch/csky/include/asm/spinlock_types.h | 16 +----
4 files changed, 6 insertions(+), 96 deletions(-)

diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index 34e91224adc3..5910eb6ddde2 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -8,6 +8,8 @@ config CSKY
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_QUEUED_RWLOCKS
+ select ARCH_USE_QUEUED_SPINLOCKS
+ select ARCH_USE_QUEUED_SPINLOCKS_XCHG32
select ARCH_WANT_FRAME_POINTERS if !CPU_CK610
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
select COMMON_CLK
diff --git a/arch/csky/include/asm/Kbuild b/arch/csky/include/asm/Kbuild
index cc24bb8e539f..2a2d09963bb9 100644
--- a/arch/csky/include/asm/Kbuild
+++ b/arch/csky/include/asm/Kbuild
@@ -2,6 +2,8 @@
generic-y += asm-offsets.h
generic-y += gpio.h
generic-y += kvm_para.h
+generic-y += mcs_spinlock.h
generic-y += qrwlock.h
+generic-y += qspinlock.h
generic-y += user.h
generic-y += vmlinux.lds.h
diff --git a/arch/csky/include/asm/spinlock.h b/arch/csky/include/asm/spinlock.h
index 69f5aa249c5f..fcff36753c25 100644
--- a/arch/csky/include/asm/spinlock.h
+++ b/arch/csky/include/asm/spinlock.h
@@ -3,87 +3,7 @@
#ifndef __ASM_CSKY_SPINLOCK_H
#define __ASM_CSKY_SPINLOCK_H

-#include <linux/spinlock_types.h>
-#include <asm/barrier.h>
-
-/*
- * Ticket-based spin-locking.
- */
-static inline void arch_spin_lock(arch_spinlock_t *lock)
-{
- arch_spinlock_t lockval;
- u32 ticket_next = 1 << TICKET_NEXT;
- u32 *p = &lock->lock;
- u32 tmp;
-
- asm volatile (
- "1: ldex.w %0, (%2) \n"
- " mov %1, %0 \n"
- " add %0, %3 \n"
- " stex.w %0, (%2) \n"
- " bez %0, 1b \n"
- : "=&r" (tmp), "=&r" (lockval)
- : "r"(p), "r"(ticket_next)
- : "cc");
-
- while (lockval.tickets.next != lockval.tickets.owner)
- lockval.tickets.owner = READ_ONCE(lock->tickets.owner);
-
- smp_mb();
-}
-
-static inline int arch_spin_trylock(arch_spinlock_t *lock)
-{
- u32 tmp, contended, res;
- u32 ticket_next = 1 << TICKET_NEXT;
- u32 *p = &lock->lock;
-
- do {
- asm volatile (
- " ldex.w %0, (%3) \n"
- " movi %2, 1 \n"
- " rotli %1, %0, 16 \n"
- " cmpne %1, %0 \n"
- " bt 1f \n"
- " movi %2, 0 \n"
- " add %0, %0, %4 \n"
- " stex.w %0, (%3) \n"
- "1: \n"
- : "=&r" (res), "=&r" (tmp), "=&r" (contended)
- : "r"(p), "r"(ticket_next)
- : "cc");
- } while (!res);
-
- if (!contended)
- smp_mb();
-
- return !contended;
-}
-
-static inline void arch_spin_unlock(arch_spinlock_t *lock)
-{
- smp_mb();
- WRITE_ONCE(lock->tickets.owner, lock->tickets.owner + 1);
-}
-
-static inline int arch_spin_value_unlocked(arch_spinlock_t lock)
-{
- return lock.tickets.owner == lock.tickets.next;
-}
-
-static inline int arch_spin_is_locked(arch_spinlock_t *lock)
-{
- return !arch_spin_value_unlocked(READ_ONCE(*lock));
-}
-
-static inline int arch_spin_is_contended(arch_spinlock_t *lock)
-{
- struct __raw_tickets tickets = READ_ONCE(lock->tickets);
-
- return (tickets.next - tickets.owner) > 1;
-}
-#define arch_spin_is_contended arch_spin_is_contended
-
+#include <asm/qspinlock.h>
#include <asm/qrwlock.h>

#endif /* __ASM_CSKY_SPINLOCK_H */
diff --git a/arch/csky/include/asm/spinlock_types.h b/arch/csky/include/asm/spinlock_types.h
index 8ff0f6ff3a00..757594760e65 100644
--- a/arch/csky/include/asm/spinlock_types.h
+++ b/arch/csky/include/asm/spinlock_types.h
@@ -7,21 +7,7 @@
# error "please don't include this file directly"
#endif

-#define TICKET_NEXT 16
-
-typedef struct {
- union {
- u32 lock;
- struct __raw_tickets {
- /* little endian */
- u16 owner;
- u16 next;
- } tickets;
- };
-} arch_spinlock_t;
-
-#define __ARCH_SPIN_LOCK_UNLOCKED { { 0 } }
-
+#include <asm-generic/qspinlock_types.h>
#include <asm-generic/qrwlock_types.h>

#endif /* __ASM_CSKY_SPINLOCK_TYPES_H */
--
2.17.1

2021-03-28 06:41:11

by Guo Ren

Subject: [PATCH v5 4/7] powerpc/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

From: Guo Ren <[email protected]>

We don't have a native hardware xchg16 instruction, so let the
generic qspinlock code deal with it.

Implementing xchg16 with the full-word atomic xchg instructions
carries a semantic risk for atomic operations.

This patch removes the generic qspinlock code's dependency on the
architecture's xchg16.

Signed-off-by: Guo Ren <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Paul Mackerras <[email protected]>
---
arch/powerpc/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 386ae12d8523..69ec4ade6521 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -151,6 +151,7 @@ config PPC
select ARCH_USE_CMPXCHG_LOCKREF if PPC64
select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS
select ARCH_USE_QUEUED_SPINLOCKS if PPC_QUEUED_SPINLOCKS
+ select ARCH_USE_QUEUED_SPINLOCKS_XCHG32 if PPC_QUEUED_SPINLOCKS
select ARCH_WANT_IPC_PARSE_VERSION
select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
select ARCH_WANT_LD_ORPHAN_WARN
--
2.17.1

2021-03-28 06:44:12

by Guo Ren

[permalink] [raw]
Subject: [PATCH v5 7/7] xtensa: qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

From: Guo Ren <[email protected]>

We don't have a native hardware xchg16 instruction, so let the
generic qspinlock code deal with it.

Implementing xchg16 with the full-word atomic xchg instructions
carries a semantic risk for atomic operations.

This patch removes the generic qspinlock code's dependency on the
architecture's xchg16.

Signed-off-by: Guo Ren <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: Max Filippov <[email protected]>
---
arch/xtensa/Kconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index 9ad6b7b82707..f19d780638f7 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -9,6 +9,7 @@ config XTENSA
select ARCH_HAS_DMA_SET_UNCACHED if MMU
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_USE_QUEUED_SPINLOCKS
+ select ARCH_USE_QUEUED_SPINLOCKS_XCHG32
select ARCH_WANT_FRAME_POINTERS
select ARCH_WANT_IPC_PARSE_VERSION
select BUILDTIME_TABLE_SORT
--
2.17.1

2021-03-28 11:20:49

by Christophe Leroy

Subject: Re: [PATCH v5 4/7] powerpc/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32



On 28/03/2021 at 08:30, [email protected] wrote:
> From: Guo Ren <[email protected]>
>
> We don't have a native hardware xchg16 instruction, so let the
> generic qspinlock code deal with it.

We have the lharx/sthcx pair on some versions of powerpc.

See https://patchwork.ozlabs.org/project/linuxppc-dev/patch/[email protected]/
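
For reference, a hypothetical, untested sketch of a 16-bit xchg built
on that pair (the linked patch has the real implementation):

static inline u16 xchg16(u16 *p, u16 val)
{
	u16 old;

	__asm__ __volatile__(
"1:	lharx	%0,0,%2\n"	/* load halfword and reserve */
"	sthcx.	%1,0,%2\n"	/* store halfword conditional */
"	bne-	1b"		/* retry if the reservation was lost */
	: "=&r" (old)
	: "r" (val), "r" (p)
	: "cc", "memory");

	return old;
}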

Christophe

>
> Implementing xchg16 with the full-word atomic xchg instructions
> carries a semantic risk for atomic operations.
>
> This patch removes the generic qspinlock code's dependency on the
> architecture's xchg16.
>
> Signed-off-by: Guo Ren <[email protected]>
> Cc: Michael Ellerman <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> ---
> arch/powerpc/Kconfig | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 386ae12d8523..69ec4ade6521 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -151,6 +151,7 @@ config PPC
> select ARCH_USE_CMPXCHG_LOCKREF if PPC64
> select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS
> select ARCH_USE_QUEUED_SPINLOCKS if PPC_QUEUED_SPINLOCKS
> + select ARCH_USE_QUEUED_SPINLOCKS_XCHG32 if PPC_QUEUED_SPINLOCKS
> select ARCH_WANT_IPC_PARSE_VERSION
> select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
> select ARCH_WANT_LD_ORPHAN_WARN
>

2021-03-28 11:43:11

by Guo Ren

Subject: Re: [PATCH v5 4/7] powerpc/qspinlock: Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32

On Sun, Mar 28, 2021 at 7:14 PM Christophe Leroy
<[email protected]> wrote:
>
>
>
> On 28/03/2021 at 08:30, [email protected] wrote:
> > From: Guo Ren <[email protected]>
> >
> > We don't have a native hardware xchg16 instruction, so let the
> > generic qspinlock code deal with it.
>
> We have lharx/sthcx pair on some versions of powerpc.
>
> See https://patchwork.ozlabs.org/project/linuxppc-dev/patch/[email protected]/
Got it, thanks for the information.

>
> Christophe
>
> >
> > Implementing xchg16 with the full-word atomic xchg instructions
> > carries a semantic risk for atomic operations.
> >
> > This patch removes the generic qspinlock code's dependency on the
> > architecture's xchg16.
> >
> > Signed-off-by: Guo Ren <[email protected]>
> > Cc: Michael Ellerman <[email protected]>
> > Cc: Benjamin Herrenschmidt <[email protected]>
> > Cc: Paul Mackerras <[email protected]>
> > ---
> > arch/powerpc/Kconfig | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> > index 386ae12d8523..69ec4ade6521 100644
> > --- a/arch/powerpc/Kconfig
> > +++ b/arch/powerpc/Kconfig
> > @@ -151,6 +151,7 @@ config PPC
> > select ARCH_USE_CMPXCHG_LOCKREF if PPC64
> > select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS
> > select ARCH_USE_QUEUED_SPINLOCKS if PPC_QUEUED_SPINLOCKS
> > + select ARCH_USE_QUEUED_SPINLOCKS_XCHG32 if PPC_QUEUED_SPINLOCKS
> > select ARCH_WANT_IPC_PARSE_VERSION
> > select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
> > select ARCH_WANT_LD_ORPHAN_WARN
> >



--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/