Peter sent an RFC out about a year ago
<https://lore.kernel.org/lkml/[email protected]/>,
but after a spirited discussion it looks like we lost track of things.
IIRC there was broad consensus on this being the way to go, but there
was a lot of discussion so I wasn't sure. Given that it's been a year,
I figured it'd be best to just send this out again formatted a bit more
explicitly as a patch.
This has had almost no testing (just a build test on RISC-V defconfig),
but I wanted to send it out largely as-is because I didn't have a SOB
from Peter on the code. I had sent around something sort of similar in
spirit, but this looks completely re-written. Just to play it safe I
wanted to send out almost exactly as it was posted. I'd probably rename
this tspinlock and tspinlock_types, as the mis-match kind of makes my
eyes go funny, but I don't really care that much. I'll also go through
the other ports and see if there's any more candidates, I seem to
remember there having been more than just OpenRISC but it's been a
while.
I'm in no big rush for this and given the complex HW dependencies I
think it's best to target it for 5.19, that'd give us a full merge
window for folks to test/benchmark it on their systems to make sure it's
OK. RISC-V has a forward progress guarantee so we should be safe, but
these can always trip things up.
From: Peter Zijlstra <[email protected]>
This is a simple, fair spinlock. Specifically it doesn't have all the
subtle memory model dependencies that qspinlock has, which makes it more
suitable for simple systems as it is more likely to be correct.
[Palmer: commit text]
Signed-off-by: Palmer Dabbelt <[email protected]>
--
I have specifically not included Peter's SOB on this, as he sent his
original patch
<https://lore.kernel.org/lkml/[email protected]/>
without one.
---
include/asm-generic/ticket-lock-types.h | 11 ++++
include/asm-generic/ticket-lock.h | 86 +++++++++++++++++++++++++
2 files changed, 97 insertions(+)
create mode 100644 include/asm-generic/ticket-lock-types.h
create mode 100644 include/asm-generic/ticket-lock.h
diff --git a/include/asm-generic/ticket-lock-types.h b/include/asm-generic/ticket-lock-types.h
new file mode 100644
index 000000000000..829759aedda8
--- /dev/null
+++ b/include/asm-generic/ticket-lock-types.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_GENERIC_TICKET_LOCK_TYPES_H
+#define __ASM_GENERIC_TICKET_LOCK_TYPES_H
+
+#include <linux/types.h>
+typedef atomic_t arch_spinlock_t;
+
+#define __ARCH_SPIN_LOCK_UNLOCKED ATOMIC_INIT(0)
+
+#endif /* __ASM_GENERIC_TICKET_LOCK_TYPES_H */
diff --git a/include/asm-generic/ticket-lock.h b/include/asm-generic/ticket-lock.h
new file mode 100644
index 000000000000..3f0d53e21a37
--- /dev/null
+++ b/include/asm-generic/ticket-lock.h
@@ -0,0 +1,86 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * 'Generic' ticket-lock implementation.
+ *
+ * It relies on atomic_fetch_add() having well defined forward progress
+ * guarantees under contention. If your architecture cannot provide this, stick
+ * to a test-and-set lock.
+ *
+ * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
+ * sub-word of the value. This is generally true for anything LL/SC although
+ * you'd be hard pressed to find anything useful in architecture specifications
+ * about this. If your architecture cannot do this you might be better off with
+ * a test-and-set.
+ *
+ * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
+ * uses atomic_fetch_add() which is SC to create an RCsc lock.
+ *
+ * The implementation uses smp_cond_load_acquire() to spin, so if the
+ * architecture has WFE like instructions to sleep instead of poll for word
+ * modifications be sure to implement that (see ARM64 for example).
+ *
+ */
+
+#ifndef __ASM_GENERIC_TICKET_LOCK_H
+#define __ASM_GENERIC_TICKET_LOCK_H
+
+#include <linux/atomic.h>
+#include <asm/ticket-lock-types.h>
+
+static __always_inline void ticket_lock(arch_spinlock_t *lock)
+{
+ u32 val = atomic_fetch_add(1<<16, lock); /* SC, gives us RCsc */
+ u16 ticket = val >> 16;
+
+ if (ticket == (u16)val)
+ return;
+
+ atomic_cond_read_acquire(lock, ticket == (u16)VAL);
+}
+
+static __always_inline bool ticket_trylock(arch_spinlock_t *lock)
+{
+ u32 old = atomic_read(lock);
+
+ if ((old >> 16) != (old & 0xffff))
+ return false;
+
+ return atomic_try_cmpxchg(lock, &old, old + (1<<16)); /* SC, for RCsc */
+}
+
+static __always_inline void ticket_unlock(arch_spinlock_t *lock)
+{
+ u16 *ptr = (u16 *)lock + __is_defined(__BIG_ENDIAN);
+ u32 val = atomic_read(lock);
+
+ smp_store_release(ptr, (u16)val + 1);
+}
+
+static __always_inline int ticket_is_locked(arch_spinlock_t *lock)
+{
+ u32 val = atomic_read(lock);
+
+ return ((val >> 16) != (val & 0xffff));
+}
+
+static __always_inline int ticket_is_contended(arch_spinlock_t *lock)
+{
+ u32 val = atomic_read(lock);
+
+ return (s16)((val >> 16) - (val & 0xffff)) > 1;
+}
+
+static __always_inline int ticket_value_unlocked(arch_spinlock_t lock)
+{
+ return !ticket_is_locked(&lock);
+}
+
+#define arch_spin_lock(l) ticket_lock(l)
+#define arch_spin_trylock(l) ticket_trylock(l)
+#define arch_spin_unlock(l) ticket_unlock(l)
+#define arch_spin_is_locked(l) ticket_is_locked(l)
+#define arch_spin_is_contended(l) ticket_is_contended(l)
+#define arch_spin_value_unlocked(l) ticket_value_unlocked(l)
+
+#endif /* __ASM_GENERIC_TICKET_LOCK_H */
--
2.34.1
From: Palmer Dabbelt <[email protected]>
Our existing spinlocks aren't fair and replacing them has been on the
TODO list for a long time. This moves to the recently-introduced ticket
spinlocks, which are simple enough that they are likely to be correct
and fast on the vast majority of extant implementations.
Signed-off-by: Palmer Dabbelt <[email protected]>
---
arch/riscv/include/asm/Kbuild | 2 ++
arch/riscv/include/asm/spinlock.h | 41 +------------------------
arch/riscv/include/asm/spinlock_types.h | 6 +---
3 files changed, 4 insertions(+), 45 deletions(-)
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 57b86fd9916c..42b1961af1a6 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -2,5 +2,7 @@
generic-y += early_ioremap.h
generic-y += flat.h
generic-y += kvm_para.h
+generic-y += ticket-lock.h
+generic-y += ticket-lock-types.h
generic-y += user.h
generic-y += vmlinux.lds.h
diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
index f4f7fa1b7ca8..38089cbdea92 100644
--- a/arch/riscv/include/asm/spinlock.h
+++ b/arch/riscv/include/asm/spinlock.h
@@ -10,46 +10,7 @@
#include <linux/kernel.h>
#include <asm/current.h>
#include <asm/fence.h>
-
-/*
- * Simple spin lock operations. These provide no fairness guarantees.
- */
-
-/* FIXME: Replace this with a ticket lock, like MIPS. */
-
-#define arch_spin_is_locked(x) (READ_ONCE((x)->lock) != 0)
-
-static inline void arch_spin_unlock(arch_spinlock_t *lock)
-{
- smp_store_release(&lock->lock, 0);
-}
-
-static inline int arch_spin_trylock(arch_spinlock_t *lock)
-{
- int tmp = 1, busy;
-
- __asm__ __volatile__ (
- " amoswap.w %0, %2, %1\n"
- RISCV_ACQUIRE_BARRIER
- : "=r" (busy), "+A" (lock->lock)
- : "r" (tmp)
- : "memory");
-
- return !busy;
-}
-
-static inline void arch_spin_lock(arch_spinlock_t *lock)
-{
- while (1) {
- if (arch_spin_is_locked(lock))
- continue;
-
- if (arch_spin_trylock(lock))
- break;
- }
-}
-
-/***********************************************************/
+#include <asm-generic/ticket-lock.h>
static inline void arch_read_lock(arch_rwlock_t *lock)
{
diff --git a/arch/riscv/include/asm/spinlock_types.h b/arch/riscv/include/asm/spinlock_types.h
index 5a35a49505da..431ee08e26c4 100644
--- a/arch/riscv/include/asm/spinlock_types.h
+++ b/arch/riscv/include/asm/spinlock_types.h
@@ -10,11 +10,7 @@
# error "please don't include this file directly"
#endif
-typedef struct {
- volatile unsigned int lock;
-} arch_spinlock_t;
-
-#define __ARCH_SPIN_LOCK_UNLOCKED { 0 }
+#include <asm-generic/ticket-lock-types.h>
typedef struct {
volatile unsigned int lock;
--
2.34.1
From: Palmer Dabbelt <[email protected]>
With the move to fair spinlocks, we might as well move to fair rwlocks.
Signed-off-by: Palmer Dabbelt <[email protected]>
---
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/Kbuild | 2 +
arch/riscv/include/asm/spinlock.h | 82 +------------------------
arch/riscv/include/asm/spinlock_types.h | 7 +--
4 files changed, 5 insertions(+), 87 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 5adcbd9b5e88..feb7030cfb6d 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -38,6 +38,7 @@ config RISCV
select ARCH_SUPPORTS_DEBUG_PAGEALLOC if MMU
select ARCH_SUPPORTS_HUGETLBFS if MMU
select ARCH_USE_MEMTEST
+ select ARCH_USE_QUEUED_RWLOCKS
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT if MMU
select ARCH_WANT_FRAME_POINTERS
select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
diff --git a/arch/riscv/include/asm/Kbuild b/arch/riscv/include/asm/Kbuild
index 42b1961af1a6..e8714070cbb9 100644
--- a/arch/riscv/include/asm/Kbuild
+++ b/arch/riscv/include/asm/Kbuild
@@ -4,5 +4,7 @@ generic-y += flat.h
generic-y += kvm_para.h
generic-y += ticket-lock.h
generic-y += ticket-lock-types.h
+generic-y += qrwlock.h
+generic-y += qrwlock_types.h
generic-y += user.h
generic-y += vmlinux.lds.h
diff --git a/arch/riscv/include/asm/spinlock.h b/arch/riscv/include/asm/spinlock.h
index 38089cbdea92..97dfb150d18c 100644
--- a/arch/riscv/include/asm/spinlock.h
+++ b/arch/riscv/include/asm/spinlock.h
@@ -11,86 +11,6 @@
#include <asm/current.h>
#include <asm/fence.h>
#include <asm-generic/ticket-lock.h>
-
-static inline void arch_read_lock(arch_rwlock_t *lock)
-{
- int tmp;
-
- __asm__ __volatile__(
- "1: lr.w %1, %0\n"
- " bltz %1, 1b\n"
- " addi %1, %1, 1\n"
- " sc.w %1, %1, %0\n"
- " bnez %1, 1b\n"
- RISCV_ACQUIRE_BARRIER
- : "+A" (lock->lock), "=&r" (tmp)
- :: "memory");
-}
-
-static inline void arch_write_lock(arch_rwlock_t *lock)
-{
- int tmp;
-
- __asm__ __volatile__(
- "1: lr.w %1, %0\n"
- " bnez %1, 1b\n"
- " li %1, -1\n"
- " sc.w %1, %1, %0\n"
- " bnez %1, 1b\n"
- RISCV_ACQUIRE_BARRIER
- : "+A" (lock->lock), "=&r" (tmp)
- :: "memory");
-}
-
-static inline int arch_read_trylock(arch_rwlock_t *lock)
-{
- int busy;
-
- __asm__ __volatile__(
- "1: lr.w %1, %0\n"
- " bltz %1, 1f\n"
- " addi %1, %1, 1\n"
- " sc.w %1, %1, %0\n"
- " bnez %1, 1b\n"
- RISCV_ACQUIRE_BARRIER
- "1:\n"
- : "+A" (lock->lock), "=&r" (busy)
- :: "memory");
-
- return !busy;
-}
-
-static inline int arch_write_trylock(arch_rwlock_t *lock)
-{
- int busy;
-
- __asm__ __volatile__(
- "1: lr.w %1, %0\n"
- " bnez %1, 1f\n"
- " li %1, -1\n"
- " sc.w %1, %1, %0\n"
- " bnez %1, 1b\n"
- RISCV_ACQUIRE_BARRIER
- "1:\n"
- : "+A" (lock->lock), "=&r" (busy)
- :: "memory");
-
- return !busy;
-}
-
-static inline void arch_read_unlock(arch_rwlock_t *lock)
-{
- __asm__ __volatile__(
- RISCV_RELEASE_BARRIER
- " amoadd.w x0, %1, %0\n"
- : "+A" (lock->lock)
- : "r" (-1)
- : "memory");
-}
-
-static inline void arch_write_unlock(arch_rwlock_t *lock)
-{
- smp_store_release(&lock->lock, 0);
-}
+#include <asm/qrwlock.h>
#endif /* _ASM_RISCV_SPINLOCK_H */
diff --git a/arch/riscv/include/asm/spinlock_types.h b/arch/riscv/include/asm/spinlock_types.h
index 431ee08e26c4..3779f13706fa 100644
--- a/arch/riscv/include/asm/spinlock_types.h
+++ b/arch/riscv/include/asm/spinlock_types.h
@@ -11,11 +11,6 @@
#endif
#include <asm-generic/ticket-lock-types.h>
-
-typedef struct {
- volatile unsigned int lock;
-} arch_rwlock_t;
-
-#define __ARCH_RW_LOCK_UNLOCKED { 0 }
+#include <asm/qrwlock_types.h>
#endif /* _ASM_RISCV_SPINLOCK_TYPES_H */
--
2.34.1
From: Peter Zijlstra <[email protected]>
We have no indications that openrisc meets the qspinlock requirements,
so move to ticket-spinlock as that is more likey to be correct.
Signed-off-by: Palmer Dabbelt <[email protected]>
---
I have specifically not included Peter's SOB on this, as he sent his
original patch
<https://lore.kernel.org/lkml/[email protected]/>
without one.
---
arch/openrisc/Kconfig | 1 -
arch/openrisc/include/asm/Kbuild | 5 ++---
arch/openrisc/include/asm/spinlock.h | 3 +--
arch/openrisc/include/asm/spinlock_types.h | 2 +-
4 files changed, 4 insertions(+), 7 deletions(-)
diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
index f724b3f1aeed..f5fa226362f6 100644
--- a/arch/openrisc/Kconfig
+++ b/arch/openrisc/Kconfig
@@ -30,7 +30,6 @@ config OPENRISC
select HAVE_DEBUG_STACKOVERFLOW
select OR1K_PIC
select CPU_NO_EFFICIENT_FFS if !OPENRISC_HAVE_INST_FF1
- select ARCH_USE_QUEUED_SPINLOCKS
select ARCH_USE_QUEUED_RWLOCKS
select OMPIC if SMP
select ARCH_WANT_FRAME_POINTERS
diff --git a/arch/openrisc/include/asm/Kbuild b/arch/openrisc/include/asm/Kbuild
index ca5987e11053..cb260e7d73db 100644
--- a/arch/openrisc/include/asm/Kbuild
+++ b/arch/openrisc/include/asm/Kbuild
@@ -1,9 +1,8 @@
# SPDX-License-Identifier: GPL-2.0
generic-y += extable.h
generic-y += kvm_para.h
-generic-y += mcs_spinlock.h
-generic-y += qspinlock_types.h
-generic-y += qspinlock.h
+generic-y += ticket-lock.h
+generic-y += ticket-lock-types.h
generic-y += qrwlock_types.h
generic-y += qrwlock.h
generic-y += user.h
diff --git a/arch/openrisc/include/asm/spinlock.h b/arch/openrisc/include/asm/spinlock.h
index 264944a71535..40e4c9fdc349 100644
--- a/arch/openrisc/include/asm/spinlock.h
+++ b/arch/openrisc/include/asm/spinlock.h
@@ -15,8 +15,7 @@
#ifndef __ASM_OPENRISC_SPINLOCK_H
#define __ASM_OPENRISC_SPINLOCK_H
-#include <asm/qspinlock.h>
-
+#include <asm/ticket-lock.h>
#include <asm/qrwlock.h>
#define arch_spin_relax(lock) cpu_relax()
diff --git a/arch/openrisc/include/asm/spinlock_types.h b/arch/openrisc/include/asm/spinlock_types.h
index 7c6fb1208c88..58ea31fa65ce 100644
--- a/arch/openrisc/include/asm/spinlock_types.h
+++ b/arch/openrisc/include/asm/spinlock_types.h
@@ -1,7 +1,7 @@
#ifndef _ASM_OPENRISC_SPINLOCK_TYPES_H
#define _ASM_OPENRISC_SPINLOCK_TYPES_H
-#include <asm/qspinlock_types.h>
+#include <asm/ticket-lock-types.h>
#include <asm/qrwlock_types.h>
#endif /* _ASM_OPENRISC_SPINLOCK_TYPES_H */
--
2.34.1
On Wed, Mar 16, 2022 at 04:25:58PM -0700, Palmer Dabbelt wrote:
> From: Peter Zijlstra <[email protected]>
>
> We have no indications that openrisc meets the qspinlock requirements,
> so move to ticket-spinlock as that is more likey to be correct.
>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Palmer Dabbelt <[email protected]>
On Thu, Mar 17, 2022 at 12:25 AM Palmer Dabbelt <[email protected]> wrote:
>
> Peter sent an RFC out about a year ago
> <https://lore.kernel.org/lkml/[email protected]/>,
> but after a spirited discussion it looks like we lost track of things.
> IIRC there was broad consensus on this being the way to go, but there
> was a lot of discussion so I wasn't sure. Given that it's been a year,
> I figured it'd be best to just send this out again formatted a bit more
> explicitly as a patch.
>
> This has had almost no testing (just a build test on RISC-V defconfig),
> but I wanted to send it out largely as-is because I didn't have a SOB
> from Peter on the code. I had sent around something sort of similar in
> spirit, but this looks completely re-written. Just to play it safe I
> wanted to send out almost exactly as it was posted. I'd probably rename
> this tspinlock and tspinlock_types, as the mis-match kind of makes my
> eyes go funny, but I don't really care that much. I'll also go through
> the other ports and see if there's any more candidates, I seem to
> remember there having been more than just OpenRISC but it's been a
> while.
>
> I'm in no big rush for this and given the complex HW dependencies I
> think it's best to target it for 5.19, that'd give us a full merge
> window for folks to test/benchmark it on their systems to make sure it's
> OK. RISC-V has a forward progress guarantee so we should be safe, but
> these can always trip things up.
This all looks good to me, feel free to merge the asm-generic
bits through the riscv tree.
Regarding the naming, my preference would be to just use
this version in place of the (currently useless) asm-generic/spinlock.h,
and just naming it arch_spin_lock() etc.
This way, converting an architecture to the generic ticket lock can
be done simply by removing its custom asm/spinlock.h. Or it
could stay with the current name, but then have a new
asm-generic/spinlock.h that just includes both asm/ticket_lock.h
and asm/qrwlock.h.
Arnd
Hi,
Am Donnerstag, 17. M?rz 2022, 00:25:55 CET schrieb Palmer Dabbelt:
> Peter sent an RFC out about a year ago
> <https://lore.kernel.org/lkml/[email protected]/>,
> but after a spirited discussion it looks like we lost track of things.
> IIRC there was broad consensus on this being the way to go, but there
> was a lot of discussion so I wasn't sure. Given that it's been a year,
> I figured it'd be best to just send this out again formatted a bit more
> explicitly as a patch.
>
> This has had almost no testing (just a build test on RISC-V defconfig),
> but I wanted to send it out largely as-is because I didn't have a SOB
> from Peter on the code. I had sent around something sort of similar in
> spirit, but this looks completely re-written. Just to play it safe I
> wanted to send out almost exactly as it was posted. I'd probably rename
> this tspinlock and tspinlock_types, as the mis-match kind of makes my
> eyes go funny, but I don't really care that much. I'll also go through
> the other ports and see if there's any more candidates, I seem to
> remember there having been more than just OpenRISC but it's been a
> while.
>
> I'm in no big rush for this and given the complex HW dependencies I
> think it's best to target it for 5.19, that'd give us a full merge
> window for folks to test/benchmark it on their systems to make sure it's
> OK. RISC-V has a forward progress guarantee so we should be safe, but
> these can always trip things up.
I've tested this on both the Qemu-Virt machine as well as the
Allwinner Nezha board (with a D1 SoC).
Both of those are of course not necessarily the best platforms
for benchmarks I guess, as from what I gathered before I'd need
need multiple cores to actually get interesting measurements when
comparing different implementations. But at least everything that
worked before still works with this series ;-)
So, Series
Tested-by: Heiko Stuebner <[email protected]>
Heiko
On Wed, Mar 16, 2022 at 04:26:00PM -0700, Palmer Dabbelt wrote:
> From: Palmer Dabbelt <[email protected]>
>
> With the move to fair spinlocks, we might as well move to fair rwlocks.
s/might as well/can/ ?
Note that qrwlock relies on spinlock to be/provide fairness.
> Signed-off-by: Palmer Dabbelt <[email protected]>
On Wed, Mar 16, 2022 at 04:25:57PM -0700, Palmer Dabbelt wrote:
> From: Peter Zijlstra <[email protected]>
>
> This is a simple, fair spinlock. Specifically it doesn't have all the
> subtle memory model dependencies that qspinlock has, which makes it more
> suitable for simple systems as it is more likely to be correct.
>
> [Palmer: commit text]
> Signed-off-by: Palmer Dabbelt <[email protected]>
>
> --
>
> I have specifically not included Peter's SOB on this, as he sent his
> original patch
> <https://lore.kernel.org/lkml/[email protected]/>
> without one.
> ---
> include/asm-generic/ticket-lock-types.h | 11 ++++
> include/asm-generic/ticket-lock.h | 86 +++++++++++++++++++++++++
> 2 files changed, 97 insertions(+)
> create mode 100644 include/asm-generic/ticket-lock-types.h
> create mode 100644 include/asm-generic/ticket-lock.h
>
> diff --git a/include/asm-generic/ticket-lock-types.h b/include/asm-generic/ticket-lock-types.h
> new file mode 100644
> index 000000000000..829759aedda8
> --- /dev/null
> +++ b/include/asm-generic/ticket-lock-types.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __ASM_GENERIC_TICKET_LOCK_TYPES_H
> +#define __ASM_GENERIC_TICKET_LOCK_TYPES_H
> +
> +#include <linux/types.h>
> +typedef atomic_t arch_spinlock_t;
> +
> +#define __ARCH_SPIN_LOCK_UNLOCKED ATOMIC_INIT(0)
> +
> +#endif /* __ASM_GENERIC_TICKET_LOCK_TYPES_H */
> diff --git a/include/asm-generic/ticket-lock.h b/include/asm-generic/ticket-lock.h
> new file mode 100644
> index 000000000000..3f0d53e21a37
> --- /dev/null
> +++ b/include/asm-generic/ticket-lock.h
> @@ -0,0 +1,86 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * 'Generic' ticket-lock implementation.
> + *
> + * It relies on atomic_fetch_add() having well defined forward progress
> + * guarantees under contention. If your architecture cannot provide this, stick
> + * to a test-and-set lock.
> + *
> + * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
> + * sub-word of the value. This is generally true for anything LL/SC although
> + * you'd be hard pressed to find anything useful in architecture specifications
> + * about this. If your architecture cannot do this you might be better off with
> + * a test-and-set.
> + *
> + * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
> + * uses atomic_fetch_add() which is SC to create an RCsc lock.
> + *
Probably it's better to use "fully-ordered" instead of "SC", because our
atomic documents never use "SC" or "Sequential Consisteny" to describe
the semantics, further I'm not sure our "fully-ordered" is equivalent to
SC, better not cause misunderstanding in the future here.
> + * The implementation uses smp_cond_load_acquire() to spin, so if the
> + * architecture has WFE like instructions to sleep instead of poll for word
> + * modifications be sure to implement that (see ARM64 for example).
> + *
> + */
> +
> +#ifndef __ASM_GENERIC_TICKET_LOCK_H
> +#define __ASM_GENERIC_TICKET_LOCK_H
> +
> +#include <linux/atomic.h>
> +#include <asm/ticket-lock-types.h>
> +
> +static __always_inline void ticket_lock(arch_spinlock_t *lock)
> +{
> + u32 val = atomic_fetch_add(1<<16, lock); /* SC, gives us RCsc */
> + u16 ticket = val >> 16;
> +
> + if (ticket == (u16)val)
> + return;
> +
> + atomic_cond_read_acquire(lock, ticket == (u16)VAL);
If you want to make the lock RCsc, you will also need to make the above
atomic_cond_read_acquire() a RCsc acquire, now it's only RCpc.
Regards,
Boqun
> +}
> +
> +static __always_inline bool ticket_trylock(arch_spinlock_t *lock)
> +{
> + u32 old = atomic_read(lock);
> +
> + if ((old >> 16) != (old & 0xffff))
> + return false;
> +
> + return atomic_try_cmpxchg(lock, &old, old + (1<<16)); /* SC, for RCsc */
> +}
> +
> +static __always_inline void ticket_unlock(arch_spinlock_t *lock)
> +{
> + u16 *ptr = (u16 *)lock + __is_defined(__BIG_ENDIAN);
> + u32 val = atomic_read(lock);
> +
> + smp_store_release(ptr, (u16)val + 1);
> +}
> +
> +static __always_inline int ticket_is_locked(arch_spinlock_t *lock)
> +{
> + u32 val = atomic_read(lock);
> +
> + return ((val >> 16) != (val & 0xffff));
> +}
> +
> +static __always_inline int ticket_is_contended(arch_spinlock_t *lock)
> +{
> + u32 val = atomic_read(lock);
> +
> + return (s16)((val >> 16) - (val & 0xffff)) > 1;
> +}
> +
> +static __always_inline int ticket_value_unlocked(arch_spinlock_t lock)
> +{
> + return !ticket_is_locked(&lock);
> +}
> +
> +#define arch_spin_lock(l) ticket_lock(l)
> +#define arch_spin_trylock(l) ticket_trylock(l)
> +#define arch_spin_unlock(l) ticket_unlock(l)
> +#define arch_spin_is_locked(l) ticket_is_locked(l)
> +#define arch_spin_is_contended(l) ticket_is_contended(l)
> +#define arch_spin_value_unlocked(l) ticket_value_unlocked(l)
> +
> +#endif /* __ASM_GENERIC_TICKET_LOCK_H */
> --
> 2.34.1
>
On Thu, Mar 17, 2022 at 11:03:40AM -0400, Waiman Long wrote:
[...]
> > > + * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
> > > + * sub-word of the value. This is generally true for anything LL/SC although
> > > + * you'd be hard pressed to find anything useful in architecture specifications
> > > + * about this. If your architecture cannot do this you might be better off with
> > > + * a test-and-set.
> > > + *
> > > + * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
> > > + * uses atomic_fetch_add() which is SC to create an RCsc lock.
> > > + *
> > Probably it's better to use "fully-ordered" instead of "SC", because our
> > atomic documents never use "SC" or "Sequential Consisteny" to describe
> > the semantics, further I'm not sure our "fully-ordered" is equivalent to
> > SC, better not cause misunderstanding in the future here.
>
> The terms RCpc, RCsc comes from academia. I believe we can keep this but add
I'm not saying we cannot keep "RCpc" and "RCsc", and we actually use
them to describe the memory ordering attributes of our lock or atomic
primitives. These terms are well defined.
The thing is that instead of "SC" we use "fully-ordered" to describe the
memory ordering semantics of atomics like cmpxchg(), and IIUC the
definition of "SC" isn't equivalent to "fully-ordered", in other words,
there is no "SC" atomic in Linux kernel right now. So using "SC" here
is not quite right. Just say "...which is fully-ordered to create an
RCsc lock."
But yes, maybe I'm wrong, and "SC" can be used exchangably with
"fully-ordered", but at least some reasoning is needed.
Regards,
Boqun
> more comment to elaborate what they are and what do they mean for the
> average kernel engineer.
>
> Cheers,
> Longman
>
On Wed, Mar 16, 2022 at 04:25:57PM -0700, Palmer Dabbelt wrote:
> From: Peter Zijlstra <[email protected]>
>
> This is a simple, fair spinlock. Specifically it doesn't have all the
> subtle memory model dependencies that qspinlock has, which makes it more
> suitable for simple systems as it is more likely to be correct.
>
> [Palmer: commit text]
> Signed-off-by: Palmer Dabbelt <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
>
> --
>
> I have specifically not included Peter's SOB on this, as he sent his
> original patch
> <https://lore.kernel.org/lkml/[email protected]/>
> without one.
Fixed ;-)
On 3/17/22 09:57, Boqun Feng wrote:
> On Wed, Mar 16, 2022 at 04:25:57PM -0700, Palmer Dabbelt wrote:
>> From: Peter Zijlstra <[email protected]>
>>
>> This is a simple, fair spinlock. Specifically it doesn't have all the
>> subtle memory model dependencies that qspinlock has, which makes it more
>> suitable for simple systems as it is more likely to be correct.
>>
>> [Palmer: commit text]
>> Signed-off-by: Palmer Dabbelt <[email protected]>
>>
>> --
>>
>> I have specifically not included Peter's SOB on this, as he sent his
>> original patch
>> <https://lore.kernel.org/lkml/[email protected]/>
>> without one.
>> ---
>> include/asm-generic/ticket-lock-types.h | 11 ++++
>> include/asm-generic/ticket-lock.h | 86 +++++++++++++++++++++++++
>> 2 files changed, 97 insertions(+)
>> create mode 100644 include/asm-generic/ticket-lock-types.h
>> create mode 100644 include/asm-generic/ticket-lock.h
>>
>> diff --git a/include/asm-generic/ticket-lock-types.h b/include/asm-generic/ticket-lock-types.h
>> new file mode 100644
>> index 000000000000..829759aedda8
>> --- /dev/null
>> +++ b/include/asm-generic/ticket-lock-types.h
>> @@ -0,0 +1,11 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +#ifndef __ASM_GENERIC_TICKET_LOCK_TYPES_H
>> +#define __ASM_GENERIC_TICKET_LOCK_TYPES_H
>> +
>> +#include <linux/types.h>
>> +typedef atomic_t arch_spinlock_t;
>> +
>> +#define __ARCH_SPIN_LOCK_UNLOCKED ATOMIC_INIT(0)
>> +
>> +#endif /* __ASM_GENERIC_TICKET_LOCK_TYPES_H */
>> diff --git a/include/asm-generic/ticket-lock.h b/include/asm-generic/ticket-lock.h
>> new file mode 100644
>> index 000000000000..3f0d53e21a37
>> --- /dev/null
>> +++ b/include/asm-generic/ticket-lock.h
>> @@ -0,0 +1,86 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +/*
>> + * 'Generic' ticket-lock implementation.
>> + *
>> + * It relies on atomic_fetch_add() having well defined forward progress
>> + * guarantees under contention. If your architecture cannot provide this, stick
>> + * to a test-and-set lock.
>> + *
>> + * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
>> + * sub-word of the value. This is generally true for anything LL/SC although
>> + * you'd be hard pressed to find anything useful in architecture specifications
>> + * about this. If your architecture cannot do this you might be better off with
>> + * a test-and-set.
>> + *
>> + * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
>> + * uses atomic_fetch_add() which is SC to create an RCsc lock.
>> + *
> Probably it's better to use "fully-ordered" instead of "SC", because our
> atomic documents never use "SC" or "Sequential Consisteny" to describe
> the semantics, further I'm not sure our "fully-ordered" is equivalent to
> SC, better not cause misunderstanding in the future here.
The terms RCpc, RCsc comes from academia. I believe we can keep this but
add more comment to elaborate what they are and what do they mean for
the average kernel engineer.
Cheers,
Longman
On 3/16/22 19:25, Palmer Dabbelt wrote:
> From: Peter Zijlstra <[email protected]>
>
> This is a simple, fair spinlock. Specifically it doesn't have all the
> subtle memory model dependencies that qspinlock has, which makes it more
> suitable for simple systems as it is more likely to be correct.
>
> [Palmer: commit text]
> Signed-off-by: Palmer Dabbelt <[email protected]>
>
> --
>
> I have specifically not included Peter's SOB on this, as he sent his
> original patch
> <https://lore.kernel.org/lkml/[email protected]/>
> without one.
> ---
> include/asm-generic/ticket-lock-types.h | 11 ++++
> include/asm-generic/ticket-lock.h | 86 +++++++++++++++++++++++++
> 2 files changed, 97 insertions(+)
> create mode 100644 include/asm-generic/ticket-lock-types.h
> create mode 100644 include/asm-generic/ticket-lock.h
>
> diff --git a/include/asm-generic/ticket-lock-types.h b/include/asm-generic/ticket-lock-types.h
> new file mode 100644
> index 000000000000..829759aedda8
> --- /dev/null
> +++ b/include/asm-generic/ticket-lock-types.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __ASM_GENERIC_TICKET_LOCK_TYPES_H
> +#define __ASM_GENERIC_TICKET_LOCK_TYPES_H
> +
> +#include <linux/types.h>
> +typedef atomic_t arch_spinlock_t;
> +
> +#define __ARCH_SPIN_LOCK_UNLOCKED ATOMIC_INIT(0)
> +
> +#endif /* __ASM_GENERIC_TICKET_LOCK_TYPES_H */
> diff --git a/include/asm-generic/ticket-lock.h b/include/asm-generic/ticket-lock.h
> new file mode 100644
> index 000000000000..3f0d53e21a37
> --- /dev/null
> +++ b/include/asm-generic/ticket-lock.h
> @@ -0,0 +1,86 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +/*
> + * 'Generic' ticket-lock implementation.
> + *
> + * It relies on atomic_fetch_add() having well defined forward progress
> + * guarantees under contention. If your architecture cannot provide this, stick
> + * to a test-and-set lock.
> + *
> + * It also relies on atomic_fetch_add() being safe vs smp_store_release() on a
> + * sub-word of the value. This is generally true for anything LL/SC although
> + * you'd be hard pressed to find anything useful in architecture specifications
> + * about this. If your architecture cannot do this you might be better off with
> + * a test-and-set.
> + *
> + * It further assumes atomic_*_release() + atomic_*_acquire() is RCpc and hence
> + * uses atomic_fetch_add() which is SC to create an RCsc lock.
> + *
> + * The implementation uses smp_cond_load_acquire() to spin, so if the
> + * architecture has WFE like instructions to sleep instead of poll for word
> + * modifications be sure to implement that (see ARM64 for example).
> + *
> + */
> +
> +#ifndef __ASM_GENERIC_TICKET_LOCK_H
> +#define __ASM_GENERIC_TICKET_LOCK_H
> +
> +#include <linux/atomic.h>
> +#include <asm/ticket-lock-types.h>
> +
> +static __always_inline void ticket_lock(arch_spinlock_t *lock)
> +{
> + u32 val = atomic_fetch_add(1<<16, lock); /* SC, gives us RCsc */
> + u16 ticket = val >> 16;
> +
> + if (ticket == (u16)val)
> + return;
> +
> + atomic_cond_read_acquire(lock, ticket == (u16)VAL);
> +}
> +
> +static __always_inline bool ticket_trylock(arch_spinlock_t *lock)
> +{
> + u32 old = atomic_read(lock);
> +
> + if ((old >> 16) != (old & 0xffff))
> + return false;
> +
> + return atomic_try_cmpxchg(lock, &old, old + (1<<16)); /* SC, for RCsc */
> +}
> +
> +static __always_inline void ticket_unlock(arch_spinlock_t *lock)
> +{
> + u16 *ptr = (u16 *)lock + __is_defined(__BIG_ENDIAN);
> + u32 val = atomic_read(lock);
> +
> + smp_store_release(ptr, (u16)val + 1);
> +}
> +
> +static __always_inline int ticket_is_locked(arch_spinlock_t *lock)
> +{
> + u32 val = atomic_read(lock);
> +
> + return ((val >> 16) != (val & 0xffff));
> +}
> +
> +static __always_inline int ticket_is_contended(arch_spinlock_t *lock)
> +{
> + u32 val = atomic_read(lock);
> +
> + return (s16)((val >> 16) - (val & 0xffff)) > 1;
> +}
> +
> +static __always_inline int ticket_value_unlocked(arch_spinlock_t lock)
> +{
> + return !ticket_is_locked(&lock);
> +}
> +
> +#define arch_spin_lock(l) ticket_lock(l)
> +#define arch_spin_trylock(l) ticket_trylock(l)
> +#define arch_spin_unlock(l) ticket_unlock(l)
> +#define arch_spin_is_locked(l) ticket_is_locked(l)
> +#define arch_spin_is_contended(l) ticket_is_contended(l)
> +#define arch_spin_value_unlocked(l) ticket_value_unlocked(l)
> +
> +#endif /* __ASM_GENERIC_TICKET_LOCK_H */
Acked-by: Waiman Long <[email protected]>
Tested-by: Guo Ren <[email protected]>
On Thu, Mar 17, 2022 at 8:58 PM Heiko Stübner <[email protected]> wrote:
>
> Hi,
>
> Am Donnerstag, 17. März 2022, 00:25:55 CET schrieb Palmer Dabbelt:
> > Peter sent an RFC out about a year ago
> > <https://lore.kernel.org/lkml/[email protected]/>,
> > but after a spirited discussion it looks like we lost track of things.
> > IIRC there was broad consensus on this being the way to go, but there
> > was a lot of discussion so I wasn't sure. Given that it's been a year,
> > I figured it'd be best to just send this out again formatted a bit more
> > explicitly as a patch.
> >
> > This has had almost no testing (just a build test on RISC-V defconfig),
> > but I wanted to send it out largely as-is because I didn't have a SOB
> > from Peter on the code. I had sent around something sort of similar in
> > spirit, but this looks completely re-written. Just to play it safe I
> > wanted to send out almost exactly as it was posted. I'd probably rename
> > this tspinlock and tspinlock_types, as the mis-match kind of makes my
> > eyes go funny, but I don't really care that much. I'll also go through
> > the other ports and see if there's any more candidates, I seem to
> > remember there having been more than just OpenRISC but it's been a
> > while.
> >
> > I'm in no big rush for this and given the complex HW dependencies I
> > think it's best to target it for 5.19, that'd give us a full merge
> > window for folks to test/benchmark it on their systems to make sure it's
> > OK. RISC-V has a forward progress guarantee so we should be safe, but
> > these can always trip things up.
>
> I've tested this on both the Qemu-Virt machine as well as the
> Allwinner Nezha board (with a D1 SoC).
>
> Both of those are of course not necessarily the best platforms
> for benchmarks I guess, as from what I gathered before I'd need
> need multiple cores to actually get interesting measurements when
> comparing different implementations. But at least everything that
> worked before still works with this series ;-)
>
>
> So, Series
> Tested-by: Heiko Stuebner <[email protected]>
>
>
> Heiko
>
>
--
Best Regards
Guo Ren
ML: https://lore.kernel.org/linux-csky/
Hi Palmer,
Tested-by: Guo Ren <[email protected]>
Could help involve the below patch in your series?
https://lore.kernel.org/linux-arch/[email protected]/T/#u
On Thu, Mar 17, 2022 at 1:14 PM Palmer Dabbelt <[email protected]> wrote:
>
> Peter sent an RFC out about a year ago
> <https://lore.kernel.org/lkml/[email protected]/>,
> but after a spirited discussion it looks like we lost track of things.
> IIRC there was broad consensus on this being the way to go, but there
> was a lot of discussion so I wasn't sure. Given that it's been a year,
> I figured it'd be best to just send this out again formatted a bit more
> explicitly as a patch.
>
> This has had almost no testing (just a build test on RISC-V defconfig),
> but I wanted to send it out largely as-is because I didn't have a SOB
> from Peter on the code. I had sent around something sort of similar in
> spirit, but this looks completely re-written. Just to play it safe I
> wanted to send out almost exactly as it was posted. I'd probably rename
> this tspinlock and tspinlock_types, as the mis-match kind of makes my
> eyes go funny, but I don't really care that much. I'll also go through
> the other ports and see if there's any more candidates, I seem to
> remember there having been more than just OpenRISC but it's been a
> while.
>
> I'm in no big rush for this and given the complex HW dependencies I
> think it's best to target it for 5.19, that'd give us a full merge
> window for folks to test/benchmark it on their systems to make sure it's
> OK. RISC-V has a forward progress guarantee so we should be safe, but
> these can always trip things up.
--
Best Regards
Guo Ren
ML: https://lore.kernel.org/linux-csky/
On Wed, Mar 16, 2022 at 04:25:58PM -0700, Palmer Dabbelt wrote:
> From: Peter Zijlstra <[email protected]>
>
> We have no indications that openrisc meets the qspinlock requirements,
> so move to ticket-spinlock as that is more likey to be correct.
>
> Signed-off-by: Palmer Dabbelt <[email protected]>
>
> ---
>
> I have specifically not included Peter's SOB on this, as he sent his
> original patch
> <https://lore.kernel.org/lkml/[email protected]/>
> without one.
> ---
> arch/openrisc/Kconfig | 1 -
> arch/openrisc/include/asm/Kbuild | 5 ++---
> arch/openrisc/include/asm/spinlock.h | 3 +--
> arch/openrisc/include/asm/spinlock_types.h | 2 +-
> 4 files changed, 4 insertions(+), 7 deletions(-)
Hello,
This series breaks SMP support on OpenRISC. I haven't traced it down yet, it
seems trivial but I have a few places to check.
I replied to this on a kbuild warning thread, but also going to reply here with
more information.
https://lore.kernel.org/lkml/YjeY7CfaFKjr8IUc@antec/#R
So far this is what I see:
* ticket_lock is stuck trying to lock console_sem
* it is stuck on atomic_cond_read_acquire
reading lock value: returns 0 (*lock is 0x10000)
ticket value: is 1
* possible issues:
- OpenRISC is big endian, that seems to impact ticket_unlock, it looks
like we are failing on the first call to ticket_lock though
Backtrace:
(gdb) target remote :10001
Remote debugging using :10001
ticket_lock (lock=0xc049e078 <console_sem>) at include/asm-generic/ticket-lock.h:39
39 atomic_cond_read_acquire(lock, ticket == (u16)VAL);
(gdb) bt
#0 ticket_lock (lock=0xc049e078 <console_sem>) at include/asm-generic/ticket-lock.h:39
#1 do_raw_spin_lock (lock=0xc049e078 <console_sem>) at include/linux/spinlock.h:185
#2 __raw_spin_lock_irqsave (lock=0xc049e078 <console_sem>) at include/linux/spinlock_api_smp.h:111
#3 _raw_spin_lock_irqsave (lock=0xc049e078 <console_sem>) at kernel/locking/spinlock.c:162
Backtrace stopped: Cannot access memory at address 0xc0491ee8
Code:
31 static __always_inline void ticket_lock(arch_spinlock_t *lock)
32 {
33 u32 val = atomic_fetch_add(1<<16, lock); /* SC, gives us RCsc */
34 u16 ticket = val >> 16;
35
36 if (ticket == (u16)val)
37 return;
38
39 atomic_cond_read_acquire(lock, ticket == (u16)VAL); <--- stuck here
40 }
Assembly:
c04232ac <_raw_spin_lock_irqsave>:
c04232ac: 9c 21 ff f0 l.addi r1,r1,-16
c04232b0: d4 01 10 08 l.sw 8(r1),r2
c04232b4: 9c 41 00 10 l.addi r2,r1,16
c04232b8: d4 01 80 00 l.sw 0(r1),r16
c04232bc: d4 01 90 04 l.sw 4(r1),r18
c04232c0: d4 01 48 0c l.sw 12(r1),r9
c04232c4: 07 ef 8b 35 l.jal c0005f98 <arch_local_save_flags>
c04232c8: e2 03 18 04 l.or r16,r3,r3
c04232cc: 18 60 00 00 l.movhi r3,0x0
c04232d0: 07 ef 8b 3c l.jal c0005fc0 <arch_local_irq_restore>
c04232d4: e2 4b 58 04 l.or r18,r11,r11
c04232d8: 1a 60 00 01 l.movhi r19,0x1
atomic_fetch_add:
c04232dc: 6e 30 00 00 l.lwa r17,0(r16)
c04232e0: e2 31 98 00 l.add r17,r17,r19
c04232e4: cc 10 88 00 l.swa 0(r16),r17
c04232e8: 0f ff ff fd l.bnf c04232dc <_raw_spin_lock_irqsave+0x30>
c04232ec: 15 00 00 00 l.nop 0x0
u16 ticket = val >> 16;
c04232f0: ba 71 00 50 l.srli r19,r17,0x10
c04232f4: a6 31 ff ff l.andi r17,r17,0xffff
c04232f8: e4 13 88 00 l.sfeq r19,r17
c04232fc: 10 00 00 0e l.bf c0423334 <_raw_spin_lock_irqsave+0x88>
c0423300: e1 72 90 04 l.or r11,r18,r18
if (ticket == (u16)val):
c0423304: 86 30 00 00 l.lwz r17,0(r16)
c0423308: a6 31 ff ff l.andi r17,r17,0xffff
c042330c: e4 11 98 00 l.sfeq r17,r19
c0423310: 10 00 00 07 l.bf c042332c <_raw_spin_lock_irqsave+0x80>
c0423314: 15 00 00 00 l.nop 0x0
atomic_cond_read_acquire:
c0423318: 86 30 00 00 l.lwz r17,0(r16)
c042331c: a6 31 ff ff l.andi r17,r17,0xffff
c0423320: e4 33 88 00 l.sfne r19,r17
c0423324: 13 ff ff fd l.bf c0423318 <_raw_spin_lock_irqsave+0x6c>
c0423328: 15 00 00 00 l.nop 0x0
c042332c: 22 00 00 00 l.msync
c0423330: e1 72 90 04 l.or r11,r18,r18
c0423334: 86 01 00 00 l.lwz r16,0(r1)
c0423338: 86 41 00 04 l.lwz r18,4(r1)
c042333c: 84 41 00 08 l.lwz r2,8(r1)
c0423340: 85 21 00 0c l.lwz r9,12(r1)
c0423344: 44 00 48 00 l.jr r9
c0423348: 9c 21 00 10 l.addi r1,r1,16
-Stafford
> diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
> index f724b3f1aeed..f5fa226362f6 100644
> --- a/arch/openrisc/Kconfig
> +++ b/arch/openrisc/Kconfig
> @@ -30,7 +30,6 @@ config OPENRISC
> select HAVE_DEBUG_STACKOVERFLOW
> select OR1K_PIC
> select CPU_NO_EFFICIENT_FFS if !OPENRISC_HAVE_INST_FF1
> - select ARCH_USE_QUEUED_SPINLOCKS
> select ARCH_USE_QUEUED_RWLOCKS
> select OMPIC if SMP
> select ARCH_WANT_FRAME_POINTERS
> diff --git a/arch/openrisc/include/asm/Kbuild b/arch/openrisc/include/asm/Kbuild
> index ca5987e11053..cb260e7d73db 100644
> --- a/arch/openrisc/include/asm/Kbuild
> +++ b/arch/openrisc/include/asm/Kbuild
> @@ -1,9 +1,8 @@
> # SPDX-License-Identifier: GPL-2.0
> generic-y += extable.h
> generic-y += kvm_para.h
> -generic-y += mcs_spinlock.h
> -generic-y += qspinlock_types.h
> -generic-y += qspinlock.h
> +generic-y += ticket-lock.h
> +generic-y += ticket-lock-types.h
> generic-y += qrwlock_types.h
> generic-y += qrwlock.h
> generic-y += user.h
> diff --git a/arch/openrisc/include/asm/spinlock.h b/arch/openrisc/include/asm/spinlock.h
> index 264944a71535..40e4c9fdc349 100644
> --- a/arch/openrisc/include/asm/spinlock.h
> +++ b/arch/openrisc/include/asm/spinlock.h
> @@ -15,8 +15,7 @@
> #ifndef __ASM_OPENRISC_SPINLOCK_H
> #define __ASM_OPENRISC_SPINLOCK_H
>
> -#include <asm/qspinlock.h>
> -
> +#include <asm/ticket-lock.h>
> #include <asm/qrwlock.h>
>
> #define arch_spin_relax(lock) cpu_relax()
> diff --git a/arch/openrisc/include/asm/spinlock_types.h b/arch/openrisc/include/asm/spinlock_types.h
> index 7c6fb1208c88..58ea31fa65ce 100644
> --- a/arch/openrisc/include/asm/spinlock_types.h
> +++ b/arch/openrisc/include/asm/spinlock_types.h
> @@ -1,7 +1,7 @@
> #ifndef _ASM_OPENRISC_SPINLOCK_TYPES_H
> #define _ASM_OPENRISC_SPINLOCK_TYPES_H
>
> -#include <asm/qspinlock_types.h>
> +#include <asm/ticket-lock-types.h>
> #include <asm/qrwlock_types.h>
>
> #endif /* _ASM_OPENRISC_SPINLOCK_TYPES_H */
> --
> 2.34.1
>
On Tue, Mar 22, 2022 at 7:23 AM Stafford Horne <[email protected]> wrote:
>
> On Wed, Mar 16, 2022 at 04:25:58PM -0700, Palmer Dabbelt wrote:
> > From: Peter Zijlstra <[email protected]>
> >
> > We have no indications that openrisc meets the qspinlock requirements,
> > so move to ticket-spinlock as that is more likey to be correct.
> >
> > Signed-off-by: Palmer Dabbelt <[email protected]>
> >
> > ---
> >
> > I have specifically not included Peter's SOB on this, as he sent his
> > original patch
> > <https://lore.kernel.org/lkml/[email protected]/>
> > without one.
> > ---
> > arch/openrisc/Kconfig | 1 -
> > arch/openrisc/include/asm/Kbuild | 5 ++---
> > arch/openrisc/include/asm/spinlock.h | 3 +--
> > arch/openrisc/include/asm/spinlock_types.h | 2 +-
> > 4 files changed, 4 insertions(+), 7 deletions(-)
>
> Hello,
>
> This series breaks SMP support on OpenRISC. I haven't traced it down yet, it
> seems trivial but I have a few places to check.
>
> I replied to this on a kbuild warning thread, but also going to reply here with
> more information.
>
> https://lore.kernel.org/lkml/YjeY7CfaFKjr8IUc@antec/#R
>
> So far this is what I see:
>
> * ticket_lock is stuck trying to lock console_sem
> * it is stuck on atomic_cond_read_acquire
> reading lock value: returns 0 (*lock is 0x10000)
> ticket value: is 1
> * possible issues:
> - OpenRISC is big endian, that seems to impact ticket_unlock, it looks
All csky & riscv are little-endian, it seems the series has a bug with
big-endian. Is that all right for qemu? (If qemu was all right, but
real hardware failed.)
> like we are failing on the first call to ticket_lock though
>
> Backtrace:
> (gdb) target remote :10001
> Remote debugging using :10001
> ticket_lock (lock=0xc049e078 <console_sem>) at include/asm-generic/ticket-lock.h:39
> 39 atomic_cond_read_acquire(lock, ticket == (u16)VAL);
> (gdb) bt
> #0 ticket_lock (lock=0xc049e078 <console_sem>) at include/asm-generic/ticket-lock.h:39
> #1 do_raw_spin_lock (lock=0xc049e078 <console_sem>) at include/linux/spinlock.h:185
> #2 __raw_spin_lock_irqsave (lock=0xc049e078 <console_sem>) at include/linux/spinlock_api_smp.h:111
> #3 _raw_spin_lock_irqsave (lock=0xc049e078 <console_sem>) at kernel/locking/spinlock.c:162
> Backtrace stopped: Cannot access memory at address 0xc0491ee8
>
> Code:
>
> 31 static __always_inline void ticket_lock(arch_spinlock_t *lock)
> 32 {
> 33 u32 val = atomic_fetch_add(1<<16, lock); /* SC, gives us RCsc */
> 34 u16 ticket = val >> 16;
> 35
> 36 if (ticket == (u16)val)
> 37 return;
> 38
> 39 atomic_cond_read_acquire(lock, ticket == (u16)VAL); <--- stuck here
> 40 }
>
> Assembly:
>
> c04232ac <_raw_spin_lock_irqsave>:
> c04232ac: 9c 21 ff f0 l.addi r1,r1,-16
> c04232b0: d4 01 10 08 l.sw 8(r1),r2
> c04232b4: 9c 41 00 10 l.addi r2,r1,16
> c04232b8: d4 01 80 00 l.sw 0(r1),r16
> c04232bc: d4 01 90 04 l.sw 4(r1),r18
> c04232c0: d4 01 48 0c l.sw 12(r1),r9
> c04232c4: 07 ef 8b 35 l.jal c0005f98 <arch_local_save_flags>
> c04232c8: e2 03 18 04 l.or r16,r3,r3
> c04232cc: 18 60 00 00 l.movhi r3,0x0
> c04232d0: 07 ef 8b 3c l.jal c0005fc0 <arch_local_irq_restore>
> c04232d4: e2 4b 58 04 l.or r18,r11,r11
> c04232d8: 1a 60 00 01 l.movhi r19,0x1
> atomic_fetch_add:
> c04232dc: 6e 30 00 00 l.lwa r17,0(r16)
> c04232e0: e2 31 98 00 l.add r17,r17,r19
> c04232e4: cc 10 88 00 l.swa 0(r16),r17
> c04232e8: 0f ff ff fd l.bnf c04232dc <_raw_spin_lock_irqsave+0x30>
> c04232ec: 15 00 00 00 l.nop 0x0
> u16 ticket = val >> 16;
> c04232f0: ba 71 00 50 l.srli r19,r17,0x10
> c04232f4: a6 31 ff ff l.andi r17,r17,0xffff
> c04232f8: e4 13 88 00 l.sfeq r19,r17
> c04232fc: 10 00 00 0e l.bf c0423334 <_raw_spin_lock_irqsave+0x88>
> c0423300: e1 72 90 04 l.or r11,r18,r18
> if (ticket == (u16)val):
> c0423304: 86 30 00 00 l.lwz r17,0(r16)
> c0423308: a6 31 ff ff l.andi r17,r17,0xffff
> c042330c: e4 11 98 00 l.sfeq r17,r19
> c0423310: 10 00 00 07 l.bf c042332c <_raw_spin_lock_irqsave+0x80>
> c0423314: 15 00 00 00 l.nop 0x0
> atomic_cond_read_acquire:
> c0423318: 86 30 00 00 l.lwz r17,0(r16)
> c042331c: a6 31 ff ff l.andi r17,r17,0xffff
> c0423320: e4 33 88 00 l.sfne r19,r17
> c0423324: 13 ff ff fd l.bf c0423318 <_raw_spin_lock_irqsave+0x6c>
> c0423328: 15 00 00 00 l.nop 0x0
> c042332c: 22 00 00 00 l.msync
> c0423330: e1 72 90 04 l.or r11,r18,r18
> c0423334: 86 01 00 00 l.lwz r16,0(r1)
> c0423338: 86 41 00 04 l.lwz r18,4(r1)
> c042333c: 84 41 00 08 l.lwz r2,8(r1)
> c0423340: 85 21 00 0c l.lwz r9,12(r1)
> c0423344: 44 00 48 00 l.jr r9
> c0423348: 9c 21 00 10 l.addi r1,r1,16
>
>
> -Stafford
>
> > diff --git a/arch/openrisc/Kconfig b/arch/openrisc/Kconfig
> > index f724b3f1aeed..f5fa226362f6 100644
> > --- a/arch/openrisc/Kconfig
> > +++ b/arch/openrisc/Kconfig
> > @@ -30,7 +30,6 @@ config OPENRISC
> > select HAVE_DEBUG_STACKOVERFLOW
> > select OR1K_PIC
> > select CPU_NO_EFFICIENT_FFS if !OPENRISC_HAVE_INST_FF1
> > - select ARCH_USE_QUEUED_SPINLOCKS
> > select ARCH_USE_QUEUED_RWLOCKS
> > select OMPIC if SMP
> > select ARCH_WANT_FRAME_POINTERS
> > diff --git a/arch/openrisc/include/asm/Kbuild b/arch/openrisc/include/asm/Kbuild
> > index ca5987e11053..cb260e7d73db 100644
> > --- a/arch/openrisc/include/asm/Kbuild
> > +++ b/arch/openrisc/include/asm/Kbuild
> > @@ -1,9 +1,8 @@
> > # SPDX-License-Identifier: GPL-2.0
> > generic-y += extable.h
> > generic-y += kvm_para.h
> > -generic-y += mcs_spinlock.h
> > -generic-y += qspinlock_types.h
> > -generic-y += qspinlock.h
> > +generic-y += ticket-lock.h
> > +generic-y += ticket-lock-types.h
> > generic-y += qrwlock_types.h
> > generic-y += qrwlock.h
> > generic-y += user.h
> > diff --git a/arch/openrisc/include/asm/spinlock.h b/arch/openrisc/include/asm/spinlock.h
> > index 264944a71535..40e4c9fdc349 100644
> > --- a/arch/openrisc/include/asm/spinlock.h
> > +++ b/arch/openrisc/include/asm/spinlock.h
> > @@ -15,8 +15,7 @@
> > #ifndef __ASM_OPENRISC_SPINLOCK_H
> > #define __ASM_OPENRISC_SPINLOCK_H
> >
> > -#include <asm/qspinlock.h>
> > -
> > +#include <asm/ticket-lock.h>
> > #include <asm/qrwlock.h>
> >
> > #define arch_spin_relax(lock) cpu_relax()
> > diff --git a/arch/openrisc/include/asm/spinlock_types.h b/arch/openrisc/include/asm/spinlock_types.h
> > index 7c6fb1208c88..58ea31fa65ce 100644
> > --- a/arch/openrisc/include/asm/spinlock_types.h
> > +++ b/arch/openrisc/include/asm/spinlock_types.h
> > @@ -1,7 +1,7 @@
> > #ifndef _ASM_OPENRISC_SPINLOCK_TYPES_H
> > #define _ASM_OPENRISC_SPINLOCK_TYPES_H
> >
> > -#include <asm/qspinlock_types.h>
> > +#include <asm/ticket-lock-types.h>
> > #include <asm/qrwlock_types.h>
> >
> > #endif /* _ASM_OPENRISC_SPINLOCK_TYPES_H */
> > --
> > 2.34.1
> >
--
Best Regards
Guo Ren
ML: https://lore.kernel.org/linux-csky/
On Tue, Mar 22, 2022 at 11:29:13AM +0800, Guo Ren wrote:
> On Tue, Mar 22, 2022 at 7:23 AM Stafford Horne <[email protected]> wrote:
> >
> > On Wed, Mar 16, 2022 at 04:25:58PM -0700, Palmer Dabbelt wrote:
> > > From: Peter Zijlstra <[email protected]>
> > >
> > > We have no indications that openrisc meets the qspinlock requirements,
> > > so move to ticket-spinlock as that is more likey to be correct.
> > >
> > > Signed-off-by: Palmer Dabbelt <[email protected]>
> > >
> > > ---
> > >
> > > I have specifically not included Peter's SOB on this, as he sent his
> > > original patch
> > > <https://lore.kernel.org/lkml/[email protected]/>
> > > without one.
> > > ---
> > > arch/openrisc/Kconfig | 1 -
> > > arch/openrisc/include/asm/Kbuild | 5 ++---
> > > arch/openrisc/include/asm/spinlock.h | 3 +--
> > > arch/openrisc/include/asm/spinlock_types.h | 2 +-
> > > 4 files changed, 4 insertions(+), 7 deletions(-)
> >
> > Hello,
> >
> > This series breaks SMP support on OpenRISC. I haven't traced it down yet, it
> > seems trivial but I have a few places to check.
> >
> > I replied to this on a kbuild warning thread, but also going to reply here with
> > more information.
> >
> > https://lore.kernel.org/lkml/YjeY7CfaFKjr8IUc@antec/#R
> >
> > So far this is what I see:
> >
> > * ticket_lock is stuck trying to lock console_sem
> > * it is stuck on atomic_cond_read_acquire
> > reading lock value: returns 0 (*lock is 0x10000)
> > ticket value: is 1
> > * possible issues:
> > - OpenRISC is big endian, that seems to impact ticket_unlock, it looks
> All csky & riscv are little-endian, it seems the series has a bug with
> big-endian. Is that all right for qemu? (If qemu was all right, but
> real hardware failed.)
Hi Guo Ren,
OpenRISC real hardware and QEMU are both big-endian. It fails on both.
I replied on patch 1/5 with a suggested patch which fixes the issue for me.
Please have a look.
BTW. now I can look into the sparse warnings.
-Stafford
On Tue, Mar 22, 2022 at 12:10 PM Stafford Horne <[email protected]> wrote:
>
> On Tue, Mar 22, 2022 at 11:29:13AM +0800, Guo Ren wrote:
> > On Tue, Mar 22, 2022 at 7:23 AM Stafford Horne <[email protected]> wrote:
> > >
> > > On Wed, Mar 16, 2022 at 04:25:58PM -0700, Palmer Dabbelt wrote:
> > > > From: Peter Zijlstra <[email protected]>
> > > >
> > > > We have no indications that openrisc meets the qspinlock requirements,
> > > > so move to ticket-spinlock as that is more likey to be correct.
> > > >
> > > > Signed-off-by: Palmer Dabbelt <[email protected]>
> > > >
> > > > ---
> > > >
> > > > I have specifically not included Peter's SOB on this, as he sent his
> > > > original patch
> > > > <https://lore.kernel.org/lkml/[email protected]/>
> > > > without one.
> > > > ---
> > > > arch/openrisc/Kconfig | 1 -
> > > > arch/openrisc/include/asm/Kbuild | 5 ++---
> > > > arch/openrisc/include/asm/spinlock.h | 3 +--
> > > > arch/openrisc/include/asm/spinlock_types.h | 2 +-
> > > > 4 files changed, 4 insertions(+), 7 deletions(-)
> > >
> > > Hello,
> > >
> > > This series breaks SMP support on OpenRISC. I haven't traced it down yet, it
> > > seems trivial but I have a few places to check.
> > >
> > > I replied to this on a kbuild warning thread, but also going to reply here with
> > > more information.
> > >
> > > https://lore.kernel.org/lkml/YjeY7CfaFKjr8IUc@antec/#R
> > >
> > > So far this is what I see:
> > >
> > > * ticket_lock is stuck trying to lock console_sem
> > > * it is stuck on atomic_cond_read_acquire
> > > reading lock value: returns 0 (*lock is 0x10000)
> > > ticket value: is 1
> > > * possible issues:
> > > - OpenRISC is big endian, that seems to impact ticket_unlock, it looks
> > All csky & riscv are little-endian, it seems the series has a bug with
> > big-endian. Is that all right for qemu? (If qemu was all right, but
> > real hardware failed.)
>
> Hi Guo Ren,
>
> OpenRISC real hardware and QEMU are both big-endian. It fails on both.
>
> I replied on patch 1/5 with a suggested patch which fixes the issue for me.
> Please have a look.
Great, I saw that:
static __always_inline void ticket_unlock(arch_spinlock_t *lock)
{
u16 *ptr = (u16 *)lock + __is_defined(__BIG_ENDIAN);
//__is_defined(__BIG_ENDIAN) would be zero for openrisc.
Using CONFIG_CPU_BIG_ENDIAN is correct, Arnd has also asked me using
CONFIG_CPU_BIG_ENDIAN in compat.h:
diff --git a/include/asm-generic/compat.h b/include/asm-generic/compat.h
index 11653d6846cc..d06308a2a7a8 100644
--- a/include/asm-generic/compat.h
+++ b/include/asm-generic/compat.h
@@ -14,6 +14,13 @@
#define COMPAT_OFF_T_MAX 0x7fffffff
#endif
+#if !defined(compat_arg_u64) && !defined(CONFIG_CPU_BIG_ENDIAN)
+#define compat_arg_u64(name) u32 name##_lo, u32 name##_hi
+#define compat_arg_u64_dual(name) u32, name##_lo, u32, name##_hi
+#define compat_arg_u64_glue(name) (((u64)name##_lo & 0xffffffffUL) | \
+ ((u64)name##_hi << 32))
+#endif
+
/* These types are common across all compat ABIs */
typedef u32 compat_size_t;
typedef s32 compat_ssize_t;
> BTW. now I can look into the sparse warnings.
>
> -Stafford
>
--
Best Regards
Guo Ren
ML: https://lore.kernel.org/linux-csky/
On Tue, 22 Mar 2022 11:18:18 PDT (-0700), [email protected] wrote:
> On 16/03/2022 23:25, Palmer Dabbelt wrote:
>> Peter sent an RFC out about a year ago
>> <https://lore.kernel.org/lkml/[email protected]/>,
>> but after a spirited discussion it looks like we lost track of things.
>> IIRC there was broad consensus on this being the way to go, but there
>> was a lot of discussion so I wasn't sure. Given that it's been a year,
>> I figured it'd be best to just send this out again formatted a bit more
>> explicitly as a patch.
>>
>> This has had almost no testing (just a build test on RISC-V defconfig),
>> but I wanted to send it out largely as-is because I didn't have a SOB
>> from Peter on the code. I had sent around something sort of similar in
>> spirit, but this looks completely re-written. Just to play it safe I
>> wanted to send out almost exactly as it was posted. I'd probably rename
>> this tspinlock and tspinlock_types, as the mis-match kind of makes my
>> eyes go funny, but I don't really care that much. I'll also go through
>> the other ports and see if there's any more candidates, I seem to
>> remember there having been more than just OpenRISC but it's been a
>> while.
>>
>> I'm in no big rush for this and given the complex HW dependencies I
>> think it's best to target it for 5.19, that'd give us a full merge
>> window for folks to test/benchmark it on their systems to make sure it's
>> OK.
>
> Is there a specific way you have been testing/benching things, or is it
> just a case of test what we ourselves care about?
I do a bunch of functional testing in QEMU (it's all in my
riscv-systems-ci repo, but that's not really fit for human consumption
so I don't tell folks to use it). That's pretty much useless for
something like this: sure it'd find something just straight-up broken in
the lock implementation, but the stuff I'm really worried about here
would be poor interactions with hardware that wasn't designed/tested
against this flavor of locks.
I don't currently do any regular testing on HW, but there's a handful of
folks who do. If you've got HW you care about then the best bet is to
give this a shot on it. There's already been some boot test reports, so
it's at least mostly there (on RISC-V, last I saw it was breaking
OpenRISC so there's probably some lurking issue somewhere). I was
hoping we'd get enough coverage that way to have confidence in this, but
if not then I've got a bunch of RISC-V hardware lying around that I can
spin up to fill the gaps.
As far as what workloads, I really don't know here. At least on RISC-V,
I think any lock microbenchmarks would be essentially meaningless: this
is fair, so even if lock/unlock is a bit slower that's probably a win
for real workloads. That said, I'm not sure any of the existing
hardware runs any workloads that I'm personally interested in so unless
this is some massive hit to just general system responsiveness or
make/GCC then I'm probably not going to find anything.
Happy to hear if anyone has ideas, though.
>
> Thanks,
> Conor.
>
>> RISC-V has a forward progress guarantee so we should be safe, but
>> these can always trip things up.