2022-09-22 11:35:43

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 00/25] RSEQ node id and virtual cpu id extensions

Extend the rseq ABI to expose a NUMA node ID and a vm_vcpu_id field.

The NUMA node ID field allows implementing a faster getcpu(2) in libc.

The virtual cpu id allows ideal scaling (down or up) of user-space
per-cpu data structures. The virtual cpu ids allocated within a memory
space are tracked by the scheduler, which takes into account the number
of concurrently running threads, thus implicitly considering the number
of threads, the cpu affinity, the cpusets applying to those threads, and
the number of logical cores on the system.

This series is based on the v5.19 tag.

Thanks,

Mathieu

Mathieu Desnoyers (25):
rseq: Introduce feature size and alignment ELF auxiliary vector
entries
rseq: Introduce extensible rseq ABI
rseq: Extend struct rseq with numa node id
selftests/rseq: Use ELF auxiliary vector for extensible rseq
selftests/rseq: Implement rseq numa node id field selftest
lib: Invert _find_next_bit source arguments
lib: Implement find_{first,next}_{zero,one}_and_zero_bit
cpumask: Implement cpumask_{first,next}_{zero,one}_and_zero
sched: Introduce per memory space current virtual cpu id
rseq: Extend struct rseq with per memory space vcpu id
selftests/rseq: Remove RSEQ_SKIP_FASTPATH code
selftests/rseq: Implement rseq vm_vcpu_id field support
selftests/rseq: x86: Template memory ordering and percpu access mode
selftests/rseq: arm: Template memory ordering and percpu access mode
selftests/rseq: arm64: Template memory ordering and percpu access mode
selftests/rseq: mips: Template memory ordering and percpu access mode
selftests/rseq: ppc: Template memory ordering and percpu access mode
selftests/rseq: s390: Template memory ordering and percpu access mode
selftests/rseq: riscv: Template memory ordering and percpu access mode
selftests/rseq: Implement basic percpu ops vm_vcpu_id test
selftests/rseq: Implement parametrized vm_vcpu_id test
selftests/rseq: x86: Implement rseq_load_u32_u32
selftests/rseq: Implement numa node id vs vm_vcpu_id invariant test
selftests/rseq: parametrized test: Report/abort on negative cpu id
tracing/rseq: Add mm_vcpu_id field to rseq_update

fs/binfmt_elf.c | 5 +
fs/exec.c | 6 +
include/linux/cpumask.h | 86 ++
include/linux/find.h | 123 +-
include/linux/mm.h | 25 +
include/linux/mm_types.h | 110 +-
include/linux/sched.h | 9 +
include/trace/events/rseq.h | 7 +-
include/uapi/linux/auxvec.h | 2 +
include/uapi/linux/rseq.h | 22 +
init/Kconfig | 4 +
kernel/fork.c | 11 +-
kernel/ptrace.c | 2 +-
kernel/rseq.c | 61 +-
kernel/sched/core.c | 49 +
kernel/sched/sched.h | 166 +++
kernel/signal.c | 2 +
lib/find_bit.c | 17 +-
tools/include/linux/find.h | 9 +-
tools/lib/find_bit.c | 17 +-
tools/testing/selftests/rseq/.gitignore | 5 +
tools/testing/selftests/rseq/Makefile | 20 +-
.../testing/selftests/rseq/basic_numa_test.c | 117 ++
.../selftests/rseq/basic_percpu_ops_test.c | 46 +-
tools/testing/selftests/rseq/basic_test.c | 4 +
tools/testing/selftests/rseq/compiler.h | 6 +
tools/testing/selftests/rseq/param_test.c | 157 ++-
tools/testing/selftests/rseq/rseq-abi.h | 22 +
tools/testing/selftests/rseq/rseq-arm-bits.h | 505 +++++++
tools/testing/selftests/rseq/rseq-arm.h | 701 +---------
.../testing/selftests/rseq/rseq-arm64-bits.h | 392 ++++++
tools/testing/selftests/rseq/rseq-arm64.h | 520 +------
.../testing/selftests/rseq/rseq-bits-reset.h | 10 +
.../selftests/rseq/rseq-bits-template.h | 39 +
tools/testing/selftests/rseq/rseq-mips-bits.h | 462 +++++++
tools/testing/selftests/rseq/rseq-mips.h | 646 +--------
tools/testing/selftests/rseq/rseq-ppc-bits.h | 454 +++++++
tools/testing/selftests/rseq/rseq-ppc.h | 617 +--------
.../testing/selftests/rseq/rseq-riscv-bits.h | 410 ++++++
tools/testing/selftests/rseq/rseq-riscv.h | 529 +-------
tools/testing/selftests/rseq/rseq-s390-bits.h | 474 +++++++
tools/testing/selftests/rseq/rseq-s390.h | 495 +------
tools/testing/selftests/rseq/rseq-skip.h | 65 -
tools/testing/selftests/rseq/rseq-x86-bits.h | 1036 ++++++++++++++
tools/testing/selftests/rseq/rseq-x86.h | 1193 +----------------
tools/testing/selftests/rseq/rseq.c | 86 +-
tools/testing/selftests/rseq/rseq.h | 229 +++-
.../testing/selftests/rseq/run_param_test.sh | 5 +
48 files changed, 5286 insertions(+), 4692 deletions(-)
create mode 100644 tools/testing/selftests/rseq/basic_numa_test.c
create mode 100644 tools/testing/selftests/rseq/rseq-arm-bits.h
create mode 100644 tools/testing/selftests/rseq/rseq-arm64-bits.h
create mode 100644 tools/testing/selftests/rseq/rseq-bits-reset.h
create mode 100644 tools/testing/selftests/rseq/rseq-bits-template.h
create mode 100644 tools/testing/selftests/rseq/rseq-mips-bits.h
create mode 100644 tools/testing/selftests/rseq/rseq-ppc-bits.h
create mode 100644 tools/testing/selftests/rseq/rseq-riscv-bits.h
create mode 100644 tools/testing/selftests/rseq/rseq-s390-bits.h
delete mode 100644 tools/testing/selftests/rseq/rseq-skip.h
create mode 100644 tools/testing/selftests/rseq/rseq-x86-bits.h

--
2.25.1


2022-09-22 11:53:49

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries

Export the rseq feature size supported by the kernel as well as the
required allocation alignment for the rseq per-thread area to user-space
through ELF auxiliary vector entries.

This is part of the extensible rseq ABI.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
fs/binfmt_elf.c | 5 +++++
include/uapi/linux/auxvec.h | 2 ++
include/uapi/linux/rseq.h | 5 +++++
3 files changed, 12 insertions(+)

diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
index 63c7ebb0da89..04fca1e4cbd2 100644
--- a/fs/binfmt_elf.c
+++ b/fs/binfmt_elf.c
@@ -46,6 +46,7 @@
#include <linux/cred.h>
#include <linux/dax.h>
#include <linux/uaccess.h>
+#include <linux/rseq.h>
#include <asm/param.h>
#include <asm/page.h>

@@ -288,6 +289,10 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
if (bprm->have_execfd) {
NEW_AUX_ENT(AT_EXECFD, bprm->execfd);
}
+#ifdef CONFIG_RSEQ
+ NEW_AUX_ENT(AT_RSEQ_FEATURE_SIZE, offsetof(struct rseq, end));
+ NEW_AUX_ENT(AT_RSEQ_ALIGN, __alignof__(struct rseq));
+#endif
#undef NEW_AUX_ENT
/* AT_NULL is zero; clear the rest too */
memset(elf_info, 0, (char *)mm->saved_auxv +
diff --git a/include/uapi/linux/auxvec.h b/include/uapi/linux/auxvec.h
index c7e502bf5a6f..6991c4b8ab18 100644
--- a/include/uapi/linux/auxvec.h
+++ b/include/uapi/linux/auxvec.h
@@ -30,6 +30,8 @@
* differ from AT_PLATFORM. */
#define AT_RANDOM 25 /* address of 16 random bytes */
#define AT_HWCAP2 26 /* extension of AT_HWCAP */
+#define AT_RSEQ_FEATURE_SIZE 27 /* rseq supported feature size */
+#define AT_RSEQ_ALIGN 28 /* rseq allocation alignment */

#define AT_EXECFN 31 /* filename of program */

diff --git a/include/uapi/linux/rseq.h b/include/uapi/linux/rseq.h
index 77ee207623a9..05d3c4cdeb40 100644
--- a/include/uapi/linux/rseq.h
+++ b/include/uapi/linux/rseq.h
@@ -130,6 +130,11 @@ struct rseq {
* this thread.
*/
__u32 flags;
+
+ /*
+ * Flexible array member at end of structure, after last feature field.
+ */
+ char end[];
} __attribute__((aligned(4 * sizeof(__u64))));

#endif /* _UAPI_LINUX_RSEQ_H */
--
2.25.1

2022-09-22 11:57:08

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 14/25] selftests/rseq: arm: Template memory ordering and percpu access mode

Introduce a rseq-arm-bits.h template header which is internally included
to generate the static inline functions covering:

- relaxed and release memory ordering,
- per-cpu-id and per-vm-vcpu-id per-cpu data access.

Signed-off-by: Mathieu Desnoyers <[email protected]>
Cc: Russell King <[email protected]>
Cc: Mark Rutland <[email protected]>
---
tools/testing/selftests/rseq/rseq-arm-bits.h | 505 ++++++++++++++
tools/testing/selftests/rseq/rseq-arm.h | 695 +------------------
2 files changed, 530 insertions(+), 670 deletions(-)
create mode 100644 tools/testing/selftests/rseq/rseq-arm-bits.h

diff --git a/tools/testing/selftests/rseq/rseq-arm-bits.h b/tools/testing/selftests/rseq/rseq-arm-bits.h
new file mode 100644
index 000000000000..60c149d40ac5
--- /dev/null
+++ b/tools/testing/selftests/rseq/rseq-arm-bits.h
@@ -0,0 +1,505 @@
+/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
+/*
+ * rseq-arm-bits.h
+ *
+ * (C) Copyright 2016-2022 - Mathieu Desnoyers <[email protected]>
+ */
+
+#include "rseq-bits-template.h"
+
+#if defined(RSEQ_TEMPLATE_MO_RELAXED) && \
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID))
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_storev)(intptr_t *v, intptr_t expect, intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+ "ldr r0, %[v]\n\t"
+ "cmp %[expect], r0\n\t"
+ "bne %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
+ "ldr r0, %[v]\n\t"
+ "cmp %[expect], r0\n\t"
+ "bne %l[error2]\n\t"
+#endif
+ /* final store */
+ "str %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(5)
+ "b 5f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
+ "5:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv)
+ RSEQ_INJECT_INPUT
+ : "r0", "memory", "cc"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpnev_storeoffp_load)(intptr_t *v, intptr_t expectnot,
+ long voffp, intptr_t *load, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+ "ldr r0, %[v]\n\t"
+ "cmp %[expectnot], r0\n\t"
+ "beq %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
+ "ldr r0, %[v]\n\t"
+ "cmp %[expectnot], r0\n\t"
+ "beq %l[error2]\n\t"
+#endif
+ "str r0, %[load]\n\t"
+ "add r0, %[voffp]\n\t"
+ "ldr r0, [r0]\n\t"
+ /* final store */
+ "str r0, %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(5)
+ "b 5f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
+ "5:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ /* final store input */
+ [v] "m" (*v),
+ [expectnot] "r" (expectnot),
+ [voffp] "Ir" (voffp),
+ [load] "m" (*load)
+ RSEQ_INJECT_INPUT
+ : "r0", "memory", "cc"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_addv)(intptr_t *v, intptr_t count, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
+#endif
+ "ldr r0, %[v]\n\t"
+ "add r0, %[count]\n\t"
+ /* final store */
+ "str r0, %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(4)
+ "b 5f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
+ "5:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ [v] "m" (*v),
+ [count] "Ir" (count)
+ RSEQ_INJECT_INPUT
+ : "r0", "memory", "cc"
+ RSEQ_INJECT_CLOBBER
+ : abort
+#ifdef RSEQ_COMPARE_TWICE
+ , error1
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_cmpeqv_storev)(intptr_t *v, intptr_t expect,
+ intptr_t *v2, intptr_t expect2,
+ intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error3])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+ "ldr r0, %[v]\n\t"
+ "cmp %[expect], r0\n\t"
+ "bne %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+ "ldr r0, %[v2]\n\t"
+ "cmp %[expect2], r0\n\t"
+ "bne %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(5)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
+ "ldr r0, %[v]\n\t"
+ "cmp %[expect], r0\n\t"
+ "bne %l[error2]\n\t"
+ "ldr r0, %[v2]\n\t"
+ "cmp %[expect2], r0\n\t"
+ "bne %l[error3]\n\t"
+#endif
+ /* final store */
+ "str %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ "b 5f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
+ "5:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ /* cmp2 input */
+ [v2] "m" (*v2),
+ [expect2] "r" (expect2),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv)
+ RSEQ_INJECT_INPUT
+ : "r0", "memory", "cc"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2, error3
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("1st expected value comparison failed");
+error3:
+ rseq_after_asm_goto();
+ rseq_bug("2nd expected value comparison failed");
+#endif
+}
+
+#endif /* #if defined(RSEQ_TEMPLATE_MO_RELAXED) &&
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID)) */
+
+#if (defined(RSEQ_TEMPLATE_MO_RELAXED) || defined(RSEQ_TEMPLATE_MO_RELEASE)) && \
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID))
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_trystorev_storev)(intptr_t *v, intptr_t expect,
+ intptr_t *v2, intptr_t newv2,
+ intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+ "ldr r0, %[v]\n\t"
+ "cmp %[expect], r0\n\t"
+ "bne %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
+ "ldr r0, %[v]\n\t"
+ "cmp %[expect], r0\n\t"
+ "bne %l[error2]\n\t"
+#endif
+ /* try store */
+ "str %[newv2], %[v2]\n\t"
+ RSEQ_INJECT_ASM(5)
+#ifdef RSEQ_TEMPLATE_MO_RELEASE
+ "dmb\n\t" /* full mb provides store-release */
+#endif
+ /* final store */
+ "str %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ "b 5f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
+ "5:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ /* try store input */
+ [v2] "m" (*v2),
+ [newv2] "r" (newv2),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv)
+ RSEQ_INJECT_INPUT
+ : "r0", "memory", "cc"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_trymemcpy_storev)(intptr_t *v, intptr_t expect,
+ void *dst, void *src, size_t len,
+ intptr_t newv, int cpu)
+{
+ uint32_t rseq_scratch[3];
+
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ "str %[src], %[rseq_scratch0]\n\t"
+ "str %[dst], %[rseq_scratch1]\n\t"
+ "str %[len], %[rseq_scratch2]\n\t"
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+ "ldr r0, %[v]\n\t"
+ "cmp %[expect], r0\n\t"
+ "bne 5f\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 6f)
+ "ldr r0, %[v]\n\t"
+ "cmp %[expect], r0\n\t"
+ "bne 7f\n\t"
+#endif
+ /* try memcpy */
+ "cmp %[len], #0\n\t" \
+ "beq 333f\n\t" \
+ "222:\n\t" \
+ "ldrb %%r0, [%[src]]\n\t" \
+ "strb %%r0, [%[dst]]\n\t" \
+ "adds %[src], #1\n\t" \
+ "adds %[dst], #1\n\t" \
+ "subs %[len], #1\n\t" \
+ "bne 222b\n\t" \
+ "333:\n\t" \
+ RSEQ_INJECT_ASM(5)
+#ifdef RSEQ_TEMPLATE_MO_RELEASE
+ "dmb\n\t" /* full mb provides store-release */
+#endif
+ /* final store */
+ "str %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ /* teardown */
+ "ldr %[len], %[rseq_scratch2]\n\t"
+ "ldr %[dst], %[rseq_scratch1]\n\t"
+ "ldr %[src], %[rseq_scratch0]\n\t"
+ "b 8f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4,
+ /* teardown */
+ "ldr %[len], %[rseq_scratch2]\n\t"
+ "ldr %[dst], %[rseq_scratch1]\n\t"
+ "ldr %[src], %[rseq_scratch0]\n\t",
+ abort, 1b, 2b, 4f)
+ RSEQ_ASM_DEFINE_CMPFAIL(5,
+ /* teardown */
+ "ldr %[len], %[rseq_scratch2]\n\t"
+ "ldr %[dst], %[rseq_scratch1]\n\t"
+ "ldr %[src], %[rseq_scratch0]\n\t",
+ cmpfail)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_CMPFAIL(6,
+ /* teardown */
+ "ldr %[len], %[rseq_scratch2]\n\t"
+ "ldr %[dst], %[rseq_scratch1]\n\t"
+ "ldr %[src], %[rseq_scratch0]\n\t",
+ error1)
+ RSEQ_ASM_DEFINE_CMPFAIL(7,
+ /* teardown */
+ "ldr %[len], %[rseq_scratch2]\n\t"
+ "ldr %[dst], %[rseq_scratch1]\n\t"
+ "ldr %[src], %[rseq_scratch0]\n\t",
+ error2)
+#endif
+ "8:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv),
+ /* try memcpy input */
+ [dst] "r" (dst),
+ [src] "r" (src),
+ [len] "r" (len),
+ [rseq_scratch0] "m" (rseq_scratch[0]),
+ [rseq_scratch1] "m" (rseq_scratch[1]),
+ [rseq_scratch2] "m" (rseq_scratch[2])
+ RSEQ_INJECT_INPUT
+ : "r0", "memory", "cc"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+#endif /* #if (defined(RSEQ_TEMPLATE_MO_RELAXED) || defined(RSEQ_TEMPLATE_MO_RELEASE)) &&
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID)) */
+
+#include "rseq-bits-reset.h"
diff --git a/tools/testing/selftests/rseq/rseq-arm.h b/tools/testing/selftests/rseq/rseq-arm.h
index 7445107f842b..eb906db604f0 100644
--- a/tools/testing/selftests/rseq/rseq-arm.h
+++ b/tools/testing/selftests/rseq/rseq-arm.h
@@ -2,7 +2,7 @@
/*
* rseq-arm.h
*
- * (C) Copyright 2016-2018 - Mathieu Desnoyers <[email protected]>
+ * (C) Copyright 2016-2022 - Mathieu Desnoyers <[email protected]>
*/

/*
@@ -143,679 +143,34 @@ do { \
teardown \
"b %l[" __rseq_str(cmpfail_label) "]\n\t"

-static inline __attribute__((always_inline))
-int rseq_cmpeqv_storev(intptr_t *v, intptr_t expect, intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
+/* Per-cpu-id indexing. */

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne %l[error2]\n\t"
-#endif
- /* final store */
- "str %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(5)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- RSEQ_INJECT_INPUT
- : "r0", "memory", "cc"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot,
- long voffp, intptr_t *load, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- "ldr r0, %[v]\n\t"
- "cmp %[expectnot], r0\n\t"
- "beq %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
- "ldr r0, %[v]\n\t"
- "cmp %[expectnot], r0\n\t"
- "beq %l[error2]\n\t"
-#endif
- "str r0, %[load]\n\t"
- "add r0, %[voffp]\n\t"
- "ldr r0, [r0]\n\t"
- /* final store */
- "str r0, %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(5)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* final store input */
- [v] "m" (*v),
- [expectnot] "r" (expectnot),
- [voffp] "Ir" (voffp),
- [load] "m" (*load)
- RSEQ_INJECT_INPUT
- : "r0", "memory", "cc"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_addv(intptr_t *v, intptr_t count, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
-#endif
- "ldr r0, %[v]\n\t"
- "add r0, %[count]\n\t"
- /* final store */
- "str r0, %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(4)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- [v] "m" (*v),
- [count] "Ir" (count)
- RSEQ_INJECT_INPUT
- : "r0", "memory", "cc"
- RSEQ_INJECT_CLOBBER
- : abort
-#ifdef RSEQ_COMPARE_TWICE
- , error1
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trystorev_storev(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t newv2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne %l[error2]\n\t"
-#endif
- /* try store */
- "str %[newv2], %[v2]\n\t"
- RSEQ_INJECT_ASM(5)
- /* final store */
- "str %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* try store input */
- [v2] "m" (*v2),
- [newv2] "r" (newv2),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- RSEQ_INJECT_INPUT
- : "r0", "memory", "cc"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
+#define RSEQ_TEMPLATE_CPU_ID
+#define RSEQ_TEMPLATE_MO_RELAXED
+#include "rseq-arm-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELAXED

-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trystorev_storev_release(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t newv2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
+#define RSEQ_TEMPLATE_MO_RELEASE
+#include "rseq-arm-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELEASE
+#undef RSEQ_TEMPLATE_CPU_ID

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne %l[error2]\n\t"
-#endif
- /* try store */
- "str %[newv2], %[v2]\n\t"
- RSEQ_INJECT_ASM(5)
- "dmb\n\t" /* full mb provides store-release */
- /* final store */
- "str %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* try store input */
- [v2] "m" (*v2),
- [newv2] "r" (newv2),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- RSEQ_INJECT_INPUT
- : "r0", "memory", "cc"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_cmpeqv_storev(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t expect2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error3])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
- "ldr r0, %[v2]\n\t"
- "cmp %[expect2], r0\n\t"
- "bne %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(5)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne %l[error2]\n\t"
- "ldr r0, %[v2]\n\t"
- "cmp %[expect2], r0\n\t"
- "bne %l[error3]\n\t"
-#endif
- /* final store */
- "str %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* cmp2 input */
- [v2] "m" (*v2),
- [expect2] "r" (expect2),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- RSEQ_INJECT_INPUT
- : "r0", "memory", "cc"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2, error3
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("1st expected value comparison failed");
-error3:
- rseq_after_asm_goto();
- rseq_bug("2nd expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trymemcpy_storev(intptr_t *v, intptr_t expect,
- void *dst, void *src, size_t len,
- intptr_t newv, int cpu)
-{
- uint32_t rseq_scratch[3];
+/* Per-vm-vcpu-id indexing. */

- RSEQ_INJECT_C(9)
+#define RSEQ_TEMPLATE_VM_VCPU_ID
+#define RSEQ_TEMPLATE_MO_RELAXED
+#include "rseq-arm-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELAXED

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- "str %[src], %[rseq_scratch0]\n\t"
- "str %[dst], %[rseq_scratch1]\n\t"
- "str %[len], %[rseq_scratch2]\n\t"
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne 5f\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 6f)
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne 7f\n\t"
-#endif
- /* try memcpy */
- "cmp %[len], #0\n\t" \
- "beq 333f\n\t" \
- "222:\n\t" \
- "ldrb %%r0, [%[src]]\n\t" \
- "strb %%r0, [%[dst]]\n\t" \
- "adds %[src], #1\n\t" \
- "adds %[dst], #1\n\t" \
- "subs %[len], #1\n\t" \
- "bne 222b\n\t" \
- "333:\n\t" \
- RSEQ_INJECT_ASM(5)
- /* final store */
- "str %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- /* teardown */
- "ldr %[len], %[rseq_scratch2]\n\t"
- "ldr %[dst], %[rseq_scratch1]\n\t"
- "ldr %[src], %[rseq_scratch0]\n\t"
- "b 8f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4,
- /* teardown */
- "ldr %[len], %[rseq_scratch2]\n\t"
- "ldr %[dst], %[rseq_scratch1]\n\t"
- "ldr %[src], %[rseq_scratch0]\n\t",
- abort, 1b, 2b, 4f)
- RSEQ_ASM_DEFINE_CMPFAIL(5,
- /* teardown */
- "ldr %[len], %[rseq_scratch2]\n\t"
- "ldr %[dst], %[rseq_scratch1]\n\t"
- "ldr %[src], %[rseq_scratch0]\n\t",
- cmpfail)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_CMPFAIL(6,
- /* teardown */
- "ldr %[len], %[rseq_scratch2]\n\t"
- "ldr %[dst], %[rseq_scratch1]\n\t"
- "ldr %[src], %[rseq_scratch0]\n\t",
- error1)
- RSEQ_ASM_DEFINE_CMPFAIL(7,
- /* teardown */
- "ldr %[len], %[rseq_scratch2]\n\t"
- "ldr %[dst], %[rseq_scratch1]\n\t"
- "ldr %[src], %[rseq_scratch0]\n\t",
- error2)
-#endif
- "8:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv),
- /* try memcpy input */
- [dst] "r" (dst),
- [src] "r" (src),
- [len] "r" (len),
- [rseq_scratch0] "m" (rseq_scratch[0]),
- [rseq_scratch1] "m" (rseq_scratch[1]),
- [rseq_scratch2] "m" (rseq_scratch[2])
- RSEQ_INJECT_INPUT
- : "r0", "memory", "cc"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trymemcpy_storev_release(intptr_t *v, intptr_t expect,
- void *dst, void *src, size_t len,
- intptr_t newv, int cpu)
-{
- uint32_t rseq_scratch[3];
+#define RSEQ_TEMPLATE_MO_RELEASE
+#include "rseq-arm-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELEASE
+#undef RSEQ_TEMPLATE_VM_VCPU_ID

- RSEQ_INJECT_C(9)
+/* APIs which are not based on cpu ids. */

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- "str %[src], %[rseq_scratch0]\n\t"
- "str %[dst], %[rseq_scratch1]\n\t"
- "str %[len], %[rseq_scratch2]\n\t"
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne 5f\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 6f)
- "ldr r0, %[v]\n\t"
- "cmp %[expect], r0\n\t"
- "bne 7f\n\t"
-#endif
- /* try memcpy */
- "cmp %[len], #0\n\t" \
- "beq 333f\n\t" \
- "222:\n\t" \
- "ldrb %%r0, [%[src]]\n\t" \
- "strb %%r0, [%[dst]]\n\t" \
- "adds %[src], #1\n\t" \
- "adds %[dst], #1\n\t" \
- "subs %[len], #1\n\t" \
- "bne 222b\n\t" \
- "333:\n\t" \
- RSEQ_INJECT_ASM(5)
- "dmb\n\t" /* full mb provides store-release */
- /* final store */
- "str %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- /* teardown */
- "ldr %[len], %[rseq_scratch2]\n\t"
- "ldr %[dst], %[rseq_scratch1]\n\t"
- "ldr %[src], %[rseq_scratch0]\n\t"
- "b 8f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4,
- /* teardown */
- "ldr %[len], %[rseq_scratch2]\n\t"
- "ldr %[dst], %[rseq_scratch1]\n\t"
- "ldr %[src], %[rseq_scratch0]\n\t",
- abort, 1b, 2b, 4f)
- RSEQ_ASM_DEFINE_CMPFAIL(5,
- /* teardown */
- "ldr %[len], %[rseq_scratch2]\n\t"
- "ldr %[dst], %[rseq_scratch1]\n\t"
- "ldr %[src], %[rseq_scratch0]\n\t",
- cmpfail)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_CMPFAIL(6,
- /* teardown */
- "ldr %[len], %[rseq_scratch2]\n\t"
- "ldr %[dst], %[rseq_scratch1]\n\t"
- "ldr %[src], %[rseq_scratch0]\n\t",
- error1)
- RSEQ_ASM_DEFINE_CMPFAIL(7,
- /* teardown */
- "ldr %[len], %[rseq_scratch2]\n\t"
- "ldr %[dst], %[rseq_scratch1]\n\t"
- "ldr %[src], %[rseq_scratch0]\n\t",
- error2)
-#endif
- "8:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv),
- /* try memcpy input */
- [dst] "r" (dst),
- [src] "r" (src),
- [len] "r" (len),
- [rseq_scratch0] "m" (rseq_scratch[0]),
- [rseq_scratch1] "m" (rseq_scratch[1]),
- [rseq_scratch2] "m" (rseq_scratch[2])
- RSEQ_INJECT_INPUT
- : "r0", "memory", "cc"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
+#define RSEQ_TEMPLATE_CPU_ID_NONE
+#define RSEQ_TEMPLATE_MO_RELAXED
+#include "rseq-arm-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELAXED
+#undef RSEQ_TEMPLATE_CPU_ID_NONE
--
2.25.1

2022-09-22 11:57:23

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 23/25] selftests/rseq: Implement numa node id vs vm_vcpu_id invariant test

On all architectures except Power, the NUMA topology is never
reconfigured after a CPU has been associated with a NUMA node in the
system lifetime.

Even on Power, we can assume that NUMA topology reconfiguration happens
rarely, and therefore we do not expect it to happen while the NUMA test
is running.

This test validates that the mapping between a vm_vcpu_id and a numa
node id remains valid for the process lifetime. In other words, it
validates that if any thread within the process running on behalf of a
vm_vcpu_id N observes a NUMA node id M, all threads within this process
will always observe the same NUMA node id value when running on behalf
of that same vm_vcpu_id.

This characteristic is important for NUMA locality.

This test is skipped on architectures that do not implement
rseq_load_u32_u32.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
tools/testing/selftests/rseq/.gitignore | 1 +
tools/testing/selftests/rseq/Makefile | 2 +-
.../testing/selftests/rseq/basic_numa_test.c | 117 ++++++++++++++++++
3 files changed, 119 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/rseq/basic_numa_test.c

diff --git a/tools/testing/selftests/rseq/.gitignore b/tools/testing/selftests/rseq/.gitignore
index db5c1a124c6c..9231abed69cc 100644
--- a/tools/testing/selftests/rseq/.gitignore
+++ b/tools/testing/selftests/rseq/.gitignore
@@ -1,4 +1,5 @@
# SPDX-License-Identifier: GPL-2.0-only
+basic_numa_test
basic_percpu_ops_test
basic_percpu_ops_vm_vcpu_id_test
basic_test
diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile
index 3eec8e166385..4bf5b7202254 100644
--- a/tools/testing/selftests/rseq/Makefile
+++ b/tools/testing/selftests/rseq/Makefile
@@ -12,7 +12,7 @@ LDLIBS += -lpthread -ldl
# still track changes to header files and depend on shared object.
OVERRIDE_TARGETS = 1

-TEST_GEN_PROGS = basic_test basic_percpu_ops_test basic_percpu_ops_vm_vcpu_id_test param_test \
+TEST_GEN_PROGS = basic_test basic_numa_test basic_percpu_ops_test basic_percpu_ops_vm_vcpu_id_test param_test \
param_test_benchmark param_test_compare_twice param_test_vm_vcpu_id \
param_test_vm_vcpu_id_benchmark param_test_vm_vcpu_id_compare_twice

diff --git a/tools/testing/selftests/rseq/basic_numa_test.c b/tools/testing/selftests/rseq/basic_numa_test.c
new file mode 100644
index 000000000000..45cb714b135c
--- /dev/null
+++ b/tools/testing/selftests/rseq/basic_numa_test.c
@@ -0,0 +1,117 @@
+// SPDX-License-Identifier: LGPL-2.1
+/*
+ * Basic rseq NUMA test. Validate that (vm_vcpu_id, numa_node_id) pairs are
+ * invariant. The only known scenario where this is untrue is on Power which
+ * can reconfigure the NUMA topology on CPU hotunplug/hotplug sequence.
+ */
+
+#define _GNU_SOURCE
+#include <assert.h>
+#include <sched.h>
+#include <signal.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/time.h>
+
+#include "rseq.h"
+
+#define NR_LOOPS 100000000
+#define NR_THREADS 16
+
+#ifdef RSEQ_ARCH_HAS_LOAD_U32_U32
+
+static
+int cpu_numa_id[CPU_SETSIZE];
+
+static
+void numa_id_init(void)
+{
+ int i;
+
+ for (i = 0; i < CPU_SETSIZE; i++)
+ cpu_numa_id[i] = -1;
+}
+
+static
+void *test_thread(void *arg)
+{
+ int i;
+
+ if (rseq_register_current_thread()) {
+ fprintf(stderr, "Error: rseq_register_current_thread(...) failed(%d): %s\n",
+ errno, strerror(errno));
+ abort();
+ }
+
+ for (i = 0; i < NR_LOOPS; i++) {
+ uint32_t vm_vcpu_id, node;
+ int cached_node_id;
+
+ while (rseq_load_u32_u32(RSEQ_MO_RELAXED, &vm_vcpu_id, &rseq_get_abi()->vm_vcpu_id,
+ &node, &rseq_get_abi()->node_id) != 0) {
+ /* Retry. */
+ }
+ cached_node_id = RSEQ_READ_ONCE(cpu_numa_id[vm_vcpu_id]);
+ if (cached_node_id == -1) {
+ RSEQ_WRITE_ONCE(cpu_numa_id[vm_vcpu_id], node);
+ } else {
+ if (node != cached_node_id) {
+ fprintf(stderr, "Error: NUMA node id discrepancy: vm_vcpu_id %u cached node id %d node id %u.\n",
+ vm_vcpu_id, cached_node_id, node);
+ fprintf(stderr, "This is likely a kernel bug, or caused by a concurrent NUMA topology reconfiguration.\n");
+ abort();
+ }
+ }
+ }
+
+ if (rseq_unregister_current_thread()) {
+ fprintf(stderr, "Error: rseq_unregister_current_thread(...) failed(%d): %s\n",
+ errno, strerror(errno));
+ abort();
+ }
+ return NULL;
+}
+
+static
+int test_numa(void)
+{
+ pthread_t tid[NR_THREADS];
+ int err, i;
+ void *tret;
+
+ numa_id_init();
+
+ printf("testing rseq (vm_vcpu_id, numa_node_id) invariant, single thread\n");
+
+ (void) test_thread(NULL);
+
+ printf("testing rseq (vm_vcpu_id, numa_node_id) invariant, multi-threaded\n");
+
+ for (i = 0; i < NR_THREADS; i++) {
+ err = pthread_create(&tid[i], NULL, test_thread, NULL);
+ if (err != 0)
+ abort();
+ }
+
+ for (i = 0; i < NR_THREADS; i++) {
+ err = pthread_join(tid[i], &tret);
+ if (err != 0)
+ abort();
+ }
+
+ return 0;
+}
+#else
+static
+int test_numa(void)
+{
+ fprintf(stderr, "rseq_load_u32_u32 is not implemented on this architecture. "
+ "Skipping numa test.\n");
+ return 0;
+}
+#endif
+
+int main(int argc, char **argv)
+{
+ return test_numa();
+}
--
2.25.1

2022-09-22 12:07:21

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 25/25] tracing/rseq: Add mm_vcpu_id field to rseq_update

Add the mm_vcpu_id field to the rseq_update event, allowing tracers to
follow which vcpu_id is observed by user-space, and whether negative
vcpu_id values are visible in case of internal scheduler implementation
issues.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
include/trace/events/rseq.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/rseq.h b/include/trace/events/rseq.h
index 6bd442697354..10b236fc047a 100644
--- a/include/trace/events/rseq.h
+++ b/include/trace/events/rseq.h
@@ -17,14 +17,17 @@ TRACE_EVENT(rseq_update,
TP_STRUCT__entry(
__field(s32, cpu_id)
__field(s32, node_id)
+ __field(s32, mm_vcpu_id)
),

TP_fast_assign(
__entry->cpu_id = raw_smp_processor_id();
__entry->node_id = cpu_to_node(raw_smp_processor_id());
+ __entry->mm_vcpu_id = t->mm_vcpu;
),

- TP_printk("cpu_id=%d node_id=%d", __entry->cpu_id, __entry->node_id)
+ TP_printk("cpu_id=%d node_id=%d mm_vcpu_id=%d", __entry->cpu_id,
+ __entry->node_id, __entry->mm_vcpu_id)
);

TRACE_EVENT(rseq_ip_fixup,
--
2.25.1

2022-09-22 12:07:21

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 16/25] selftests/rseq: mips: Template memory ordering and percpu access mode

Introduce a rseq-mips-bits.h template header which is internally
included to generate the static inline functions covering:

- relaxed and release memory ordering,
- per-cpu-id and per-vm-vcpu-id per-cpu data access.

Signed-off-by: Mathieu Desnoyers <[email protected]>
Cc: Paul Burton <[email protected]>
---
tools/testing/selftests/rseq/rseq-mips-bits.h | 462 +++++++++++++
tools/testing/selftests/rseq/rseq-mips.h | 640 +-----------------
2 files changed, 487 insertions(+), 615 deletions(-)
create mode 100644 tools/testing/selftests/rseq/rseq-mips-bits.h

diff --git a/tools/testing/selftests/rseq/rseq-mips-bits.h b/tools/testing/selftests/rseq/rseq-mips-bits.h
new file mode 100644
index 000000000000..890e9123403f
--- /dev/null
+++ b/tools/testing/selftests/rseq/rseq-mips-bits.h
@@ -0,0 +1,462 @@
+/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
+/*
+ * Author: Paul Burton <[email protected]>
+ * (C) Copyright 2018 MIPS Tech LLC
+ * (C) Copyright 2016-2022 - Mathieu Desnoyers <[email protected]>
+ */
+
+#include "rseq-bits-template.h"
+
+#if defined(RSEQ_TEMPLATE_MO_RELAXED) && \
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID))
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_storev)(intptr_t *v, intptr_t expect, intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+ LONG_L " $4, %[v]\n\t"
+ "bne $4, %[expect], %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
+ LONG_L " $4, %[v]\n\t"
+ "bne $4, %[expect], %l[error2]\n\t"
+#endif
+ /* final store */
+ LONG_S " %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(5)
+ "b 5f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
+ "5:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv)
+ RSEQ_INJECT_INPUT
+ : "$4", "memory"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ return 0;
+abort:
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpnev_storeoffp_load)(intptr_t *v, intptr_t expectnot,
+ long voffp, intptr_t *load, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+ LONG_L " $4, %[v]\n\t"
+ "beq $4, %[expectnot], %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
+ LONG_L " $4, %[v]\n\t"
+ "beq $4, %[expectnot], %l[error2]\n\t"
+#endif
+ LONG_S " $4, %[load]\n\t"
+ LONG_ADDI " $4, %[voffp]\n\t"
+ LONG_L " $4, 0($4)\n\t"
+ /* final store */
+ LONG_S " $4, %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(5)
+ "b 5f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
+ "5:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ /* final store input */
+ [v] "m" (*v),
+ [expectnot] "r" (expectnot),
+ [voffp] "Ir" (voffp),
+ [load] "m" (*load)
+ RSEQ_INJECT_INPUT
+ : "$4", "memory"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ return 0;
+abort:
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_addv)(intptr_t *v, intptr_t count, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
+#endif
+ LONG_L " $4, %[v]\n\t"
+ LONG_ADDI " $4, %[count]\n\t"
+ /* final store */
+ LONG_S " $4, %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(4)
+ "b 5f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
+ "5:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ [v] "m" (*v),
+ [count] "Ir" (count)
+ RSEQ_INJECT_INPUT
+ : "$4", "memory"
+ RSEQ_INJECT_CLOBBER
+ : abort
+#ifdef RSEQ_COMPARE_TWICE
+ , error1
+#endif
+ );
+ return 0;
+abort:
+ RSEQ_INJECT_FAILED
+ return -1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_bug("cpu_id comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_cmpeqv_storev)(intptr_t *v, intptr_t expect,
+ intptr_t *v2, intptr_t expect2,
+ intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error3])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+ LONG_L " $4, %[v]\n\t"
+ "bne $4, %[expect], %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+ LONG_L " $4, %[v2]\n\t"
+ "bne $4, %[expect2], %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(5)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
+ LONG_L " $4, %[v]\n\t"
+ "bne $4, %[expect], %l[error2]\n\t"
+ LONG_L " $4, %[v2]\n\t"
+ "bne $4, %[expect2], %l[error3]\n\t"
+#endif
+ /* final store */
+ LONG_S " %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ "b 5f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
+ "5:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ /* cmp2 input */
+ [v2] "m" (*v2),
+ [expect2] "r" (expect2),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv)
+ RSEQ_INJECT_INPUT
+ : "$4", "memory"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2, error3
+#endif
+ );
+ return 0;
+abort:
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_bug("1st expected value comparison failed");
+error3:
+ rseq_bug("2nd expected value comparison failed");
+#endif
+}
+
+#endif /* #if defined(RSEQ_TEMPLATE_MO_RELAXED) &&
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID)) */
+
+#if (defined(RSEQ_TEMPLATE_MO_RELAXED) || defined(RSEQ_TEMPLATE_MO_RELEASE)) && \
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID))
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_trystorev_storev)(intptr_t *v, intptr_t expect,
+ intptr_t *v2, intptr_t newv2,
+ intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+ LONG_L " $4, %[v]\n\t"
+ "bne $4, %[expect], %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
+ LONG_L " $4, %[v]\n\t"
+ "bne $4, %[expect], %l[error2]\n\t"
+#endif
+ /* try store */
+ LONG_S " %[newv2], %[v2]\n\t"
+ RSEQ_INJECT_ASM(5)
+#ifdef RSEQ_TEMPLATE_MO_RELEASE
+ "sync\n\t" /* full sync provides store-release */
+#endif
+ /* final store */
+ LONG_S " %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ "b 5f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
+ "5:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ /* try store input */
+ [v2] "m" (*v2),
+ [newv2] "r" (newv2),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv)
+ RSEQ_INJECT_INPUT
+ : "$4", "memory"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ return 0;
+abort:
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_trymemcpy_storev)(intptr_t *v, intptr_t expect,
+ void *dst, void *src, size_t len,
+ intptr_t newv, int cpu)
+{
+ uintptr_t rseq_scratch[3];
+
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ LONG_S " %[src], %[rseq_scratch0]\n\t"
+ LONG_S " %[dst], %[rseq_scratch1]\n\t"
+ LONG_S " %[len], %[rseq_scratch2]\n\t"
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
+ RSEQ_INJECT_ASM(3)
+ LONG_L " $4, %[v]\n\t"
+ "bne $4, %[expect], 5f\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 6f)
+ LONG_L " $4, %[v]\n\t"
+ "bne $4, %[expect], 7f\n\t"
+#endif
+ /* try memcpy */
+ "beqz %[len], 333f\n\t" \
+ "222:\n\t" \
+ "lb $4, 0(%[src])\n\t" \
+ "sb $4, 0(%[dst])\n\t" \
+ LONG_ADDI " %[src], 1\n\t" \
+ LONG_ADDI " %[dst], 1\n\t" \
+ LONG_ADDI " %[len], -1\n\t" \
+ "bnez %[len], 222b\n\t" \
+ "333:\n\t" \
+ RSEQ_INJECT_ASM(5)
+#ifdef RSEQ_TEMPLATE_MO_RELEASE
+ "sync\n\t" /* full sync provides store-release */
+#endif
+ /* final store */
+ LONG_S " %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ /* teardown */
+ LONG_L " %[len], %[rseq_scratch2]\n\t"
+ LONG_L " %[dst], %[rseq_scratch1]\n\t"
+ LONG_L " %[src], %[rseq_scratch0]\n\t"
+ "b 8f\n\t"
+ RSEQ_ASM_DEFINE_ABORT(3, 4,
+ /* teardown */
+ LONG_L " %[len], %[rseq_scratch2]\n\t"
+ LONG_L " %[dst], %[rseq_scratch1]\n\t"
+ LONG_L " %[src], %[rseq_scratch0]\n\t",
+ abort, 1b, 2b, 4f)
+ RSEQ_ASM_DEFINE_CMPFAIL(5,
+ /* teardown */
+ LONG_L " %[len], %[rseq_scratch2]\n\t"
+ LONG_L " %[dst], %[rseq_scratch1]\n\t"
+ LONG_L " %[src], %[rseq_scratch0]\n\t",
+ cmpfail)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_CMPFAIL(6,
+ /* teardown */
+ LONG_L " %[len], %[rseq_scratch2]\n\t"
+ LONG_L " %[dst], %[rseq_scratch1]\n\t"
+ LONG_L " %[src], %[rseq_scratch0]\n\t",
+ error1)
+ RSEQ_ASM_DEFINE_CMPFAIL(7,
+ /* teardown */
+ LONG_L " %[len], %[rseq_scratch2]\n\t"
+ LONG_L " %[dst], %[rseq_scratch1]\n\t"
+ LONG_L " %[src], %[rseq_scratch0]\n\t",
+ error2)
+#endif
+ "8:\n\t"
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
+ [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv),
+ /* try memcpy input */
+ [dst] "r" (dst),
+ [src] "r" (src),
+ [len] "r" (len),
+ [rseq_scratch0] "m" (rseq_scratch[0]),
+ [rseq_scratch1] "m" (rseq_scratch[1]),
+ [rseq_scratch2] "m" (rseq_scratch[2])
+ RSEQ_INJECT_INPUT
+ : "$4", "memory"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ return 0;
+abort:
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+#endif /* #if (defined(RSEQ_TEMPLATE_MO_RELAXED) || defined(RSEQ_TEMPLATE_MO_RELEASE)) &&
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID)) */
+
+#include "rseq-bits-reset.h"
diff --git a/tools/testing/selftests/rseq/rseq-mips.h b/tools/testing/selftests/rseq/rseq-mips.h
index dd199952d649..0ca65cc088df 100644
--- a/tools/testing/selftests/rseq/rseq-mips.h
+++ b/tools/testing/selftests/rseq/rseq-mips.h
@@ -2,9 +2,7 @@
/*
* Author: Paul Burton <[email protected]>
* (C) Copyright 2018 MIPS Tech LLC
- *
- * Based on rseq-arm.h:
- * (C) Copyright 2016-2018 - Mathieu Desnoyers <[email protected]>
+ * (C) Copyright 2016-2022 - Mathieu Desnoyers <[email protected]>
*/

/*
@@ -150,622 +148,34 @@ do { \
teardown \
"b %l[" __rseq_str(cmpfail_label) "]\n\t"

-static inline __attribute__((always_inline))
-int rseq_cmpeqv_storev(intptr_t *v, intptr_t expect, intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
+/* Per-cpu-id indexing. */

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], %l[error2]\n\t"
-#endif
- /* final store */
- LONG_S " %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(5)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- RSEQ_INJECT_INPUT
- : "$4", "memory"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- return 0;
-abort:
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_bug("expected value comparison failed");
-#endif
-}
+#define RSEQ_TEMPLATE_CPU_ID
+#define RSEQ_TEMPLATE_MO_RELAXED
+#include "rseq-mips-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELAXED

-static inline __attribute__((always_inline))
-int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot,
- long voffp, intptr_t *load, int cpu)
-{
- RSEQ_INJECT_C(9)
+#define RSEQ_TEMPLATE_MO_RELEASE
+#include "rseq-mips-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELEASE
+#undef RSEQ_TEMPLATE_CPU_ID

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- LONG_L " $4, %[v]\n\t"
- "beq $4, %[expectnot], %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
- LONG_L " $4, %[v]\n\t"
- "beq $4, %[expectnot], %l[error2]\n\t"
-#endif
- LONG_S " $4, %[load]\n\t"
- LONG_ADDI " $4, %[voffp]\n\t"
- LONG_L " $4, 0($4)\n\t"
- /* final store */
- LONG_S " $4, %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(5)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* final store input */
- [v] "m" (*v),
- [expectnot] "r" (expectnot),
- [voffp] "Ir" (voffp),
- [load] "m" (*load)
- RSEQ_INJECT_INPUT
- : "$4", "memory"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- return 0;
-abort:
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_bug("expected value comparison failed");
-#endif
-}
+/* Per-vm-vcpu-id indexing. */

-static inline __attribute__((always_inline))
-int rseq_addv(intptr_t *v, intptr_t count, int cpu)
-{
- RSEQ_INJECT_C(9)
+#define RSEQ_TEMPLATE_VM_VCPU_ID
+#define RSEQ_TEMPLATE_MO_RELAXED
+#include "rseq-mips-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELAXED

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
-#endif
- LONG_L " $4, %[v]\n\t"
- LONG_ADDI " $4, %[count]\n\t"
- /* final store */
- LONG_S " $4, %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(4)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- [v] "m" (*v),
- [count] "Ir" (count)
- RSEQ_INJECT_INPUT
- : "$4", "memory"
- RSEQ_INJECT_CLOBBER
- : abort
-#ifdef RSEQ_COMPARE_TWICE
- , error1
-#endif
- );
- return 0;
-abort:
- RSEQ_INJECT_FAILED
- return -1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_bug("cpu_id comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trystorev_storev(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t newv2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
+#define RSEQ_TEMPLATE_MO_RELEASE
+#include "rseq-mips-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELEASE
+#undef RSEQ_TEMPLATE_VM_VCPU_ID

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], %l[error2]\n\t"
-#endif
- /* try store */
- LONG_S " %[newv2], %[v2]\n\t"
- RSEQ_INJECT_ASM(5)
- /* final store */
- LONG_S " %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* try store input */
- [v2] "m" (*v2),
- [newv2] "r" (newv2),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- RSEQ_INJECT_INPUT
- : "$4", "memory"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- return 0;
-abort:
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_bug("expected value comparison failed");
-#endif
-}
+/* APIs which are not based on cpu ids. */

-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trystorev_storev_release(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t newv2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], %l[error2]\n\t"
-#endif
- /* try store */
- LONG_S " %[newv2], %[v2]\n\t"
- RSEQ_INJECT_ASM(5)
- "sync\n\t" /* full sync provides store-release */
- /* final store */
- LONG_S " %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* try store input */
- [v2] "m" (*v2),
- [newv2] "r" (newv2),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- RSEQ_INJECT_INPUT
- : "$4", "memory"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- return 0;
-abort:
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_cmpeqv_storev(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t expect2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error3])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
- LONG_L " $4, %[v2]\n\t"
- "bne $4, %[expect2], %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(5)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, %l[error1])
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], %l[error2]\n\t"
- LONG_L " $4, %[v2]\n\t"
- "bne $4, %[expect2], %l[error3]\n\t"
-#endif
- /* final store */
- LONG_S " %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- "b 5f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4, "", abort, 1b, 2b, 4f)
- "5:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* cmp2 input */
- [v2] "m" (*v2),
- [expect2] "r" (expect2),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- RSEQ_INJECT_INPUT
- : "$4", "memory"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2, error3
-#endif
- );
- return 0;
-abort:
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_bug("1st expected value comparison failed");
-error3:
- rseq_bug("2nd expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trymemcpy_storev(intptr_t *v, intptr_t expect,
- void *dst, void *src, size_t len,
- intptr_t newv, int cpu)
-{
- uintptr_t rseq_scratch[3];
-
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- LONG_S " %[src], %[rseq_scratch0]\n\t"
- LONG_S " %[dst], %[rseq_scratch1]\n\t"
- LONG_S " %[len], %[rseq_scratch2]\n\t"
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], 5f\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 6f)
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], 7f\n\t"
-#endif
- /* try memcpy */
- "beqz %[len], 333f\n\t" \
- "222:\n\t" \
- "lb $4, 0(%[src])\n\t" \
- "sb $4, 0(%[dst])\n\t" \
- LONG_ADDI " %[src], 1\n\t" \
- LONG_ADDI " %[dst], 1\n\t" \
- LONG_ADDI " %[len], -1\n\t" \
- "bnez %[len], 222b\n\t" \
- "333:\n\t" \
- RSEQ_INJECT_ASM(5)
- /* final store */
- LONG_S " %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- /* teardown */
- LONG_L " %[len], %[rseq_scratch2]\n\t"
- LONG_L " %[dst], %[rseq_scratch1]\n\t"
- LONG_L " %[src], %[rseq_scratch0]\n\t"
- "b 8f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4,
- /* teardown */
- LONG_L " %[len], %[rseq_scratch2]\n\t"
- LONG_L " %[dst], %[rseq_scratch1]\n\t"
- LONG_L " %[src], %[rseq_scratch0]\n\t",
- abort, 1b, 2b, 4f)
- RSEQ_ASM_DEFINE_CMPFAIL(5,
- /* teardown */
- LONG_L " %[len], %[rseq_scratch2]\n\t"
- LONG_L " %[dst], %[rseq_scratch1]\n\t"
- LONG_L " %[src], %[rseq_scratch0]\n\t",
- cmpfail)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_CMPFAIL(6,
- /* teardown */
- LONG_L " %[len], %[rseq_scratch2]\n\t"
- LONG_L " %[dst], %[rseq_scratch1]\n\t"
- LONG_L " %[src], %[rseq_scratch0]\n\t",
- error1)
- RSEQ_ASM_DEFINE_CMPFAIL(7,
- /* teardown */
- LONG_L " %[len], %[rseq_scratch2]\n\t"
- LONG_L " %[dst], %[rseq_scratch1]\n\t"
- LONG_L " %[src], %[rseq_scratch0]\n\t",
- error2)
-#endif
- "8:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv),
- /* try memcpy input */
- [dst] "r" (dst),
- [src] "r" (src),
- [len] "r" (len),
- [rseq_scratch0] "m" (rseq_scratch[0]),
- [rseq_scratch1] "m" (rseq_scratch[1]),
- [rseq_scratch2] "m" (rseq_scratch[2])
- RSEQ_INJECT_INPUT
- : "$4", "memory"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- return 0;
-abort:
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trymemcpy_storev_release(intptr_t *v, intptr_t expect,
- void *dst, void *src, size_t len,
- intptr_t newv, int cpu)
-{
- uintptr_t rseq_scratch[3];
-
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(9, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- LONG_S " %[src], %[rseq_scratch0]\n\t"
- LONG_S " %[dst], %[rseq_scratch1]\n\t"
- LONG_S " %[len], %[rseq_scratch2]\n\t"
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3f, rseq_cs)
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 4f)
- RSEQ_INJECT_ASM(3)
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], 5f\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, current_cpu_id, 6f)
- LONG_L " $4, %[v]\n\t"
- "bne $4, %[expect], 7f\n\t"
-#endif
- /* try memcpy */
- "beqz %[len], 333f\n\t" \
- "222:\n\t" \
- "lb $4, 0(%[src])\n\t" \
- "sb $4, 0(%[dst])\n\t" \
- LONG_ADDI " %[src], 1\n\t" \
- LONG_ADDI " %[dst], 1\n\t" \
- LONG_ADDI " %[len], -1\n\t" \
- "bnez %[len], 222b\n\t" \
- "333:\n\t" \
- RSEQ_INJECT_ASM(5)
- "sync\n\t" /* full sync provides store-release */
- /* final store */
- LONG_S " %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- /* teardown */
- LONG_L " %[len], %[rseq_scratch2]\n\t"
- LONG_L " %[dst], %[rseq_scratch1]\n\t"
- LONG_L " %[src], %[rseq_scratch0]\n\t"
- "b 8f\n\t"
- RSEQ_ASM_DEFINE_ABORT(3, 4,
- /* teardown */
- LONG_L " %[len], %[rseq_scratch2]\n\t"
- LONG_L " %[dst], %[rseq_scratch1]\n\t"
- LONG_L " %[src], %[rseq_scratch0]\n\t",
- abort, 1b, 2b, 4f)
- RSEQ_ASM_DEFINE_CMPFAIL(5,
- /* teardown */
- LONG_L " %[len], %[rseq_scratch2]\n\t"
- LONG_L " %[dst], %[rseq_scratch1]\n\t"
- LONG_L " %[src], %[rseq_scratch0]\n\t",
- cmpfail)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_CMPFAIL(6,
- /* teardown */
- LONG_L " %[len], %[rseq_scratch2]\n\t"
- LONG_L " %[dst], %[rseq_scratch1]\n\t"
- LONG_L " %[src], %[rseq_scratch0]\n\t",
- error1)
- RSEQ_ASM_DEFINE_CMPFAIL(7,
- /* teardown */
- LONG_L " %[len], %[rseq_scratch2]\n\t"
- LONG_L " %[dst], %[rseq_scratch1]\n\t"
- LONG_L " %[src], %[rseq_scratch0]\n\t",
- error2)
-#endif
- "8:\n\t"
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [current_cpu_id] "m" (rseq_get_abi()->cpu_id),
- [rseq_cs] "m" (rseq_get_abi()->rseq_cs.arch.ptr),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv),
- /* try memcpy input */
- [dst] "r" (dst),
- [src] "r" (src),
- [len] "r" (len),
- [rseq_scratch0] "m" (rseq_scratch[0]),
- [rseq_scratch1] "m" (rseq_scratch[1]),
- [rseq_scratch2] "m" (rseq_scratch[2])
- RSEQ_INJECT_INPUT
- : "$4", "memory"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- return 0;
-abort:
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_bug("expected value comparison failed");
-#endif
-}
+#define RSEQ_TEMPLATE_CPU_ID_NONE
+#define RSEQ_TEMPLATE_MO_RELAXED
+#include "rseq-mips-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELAXED
+#undef RSEQ_TEMPLATE_CPU_ID_NONE
--
2.25.1

2022-09-22 12:08:52

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 07/25] lib: Implement find_{first,next}_{zero,one}_and_zero_bit

Allow finding the first or next bit within two input bitmasks which is
either:

- both zero and zero,
- respectively one and zero.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
include/linux/find.h | 110 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 110 insertions(+)

diff --git a/include/linux/find.h b/include/linux/find.h
index 935892da576e..325707b8bd56 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -76,6 +76,66 @@ unsigned long find_next_and_bit(const unsigned long *addr1,
}
#endif

+#ifndef find_next_one_and_zero_bit
+/**
+ * find_next_one_and_zero_bit - find the next bit which is one in addr1 and zero in addr2 memory region
+ * @addr1: The first address to base the search on
+ * @addr2: The second address to base the search on
+ * @offset: The bitnumber to start searching at
+ * @size: The bitmap size in bits
+ *
+ * Returns the bit number for the next bit set in addr1 and cleared in addr2
+ * If no corresponding bits meet this criterion, returns @size.
+ */
+static inline
+unsigned long find_next_one_and_zero_bit(const unsigned long *addr1,
+ const unsigned long *addr2, unsigned long size,
+ unsigned long offset)
+{
+ if (small_const_nbits(size)) {
+ unsigned long val;
+
+ if (unlikely(offset >= size))
+ return size;
+
+ val = *addr1 & ~*addr2 & GENMASK(size - 1, offset);
+ return val ? __ffs(val) : size;
+ }
+
+ return _find_next_bit(addr1, addr2, size, offset, 0UL, ~0UL, 0);
+}
+#endif
+
+#ifndef find_next_zero_and_zero_bit
+/**
+ * find_next_zero_and_zero_bit - find the next bit which is zero in addr1 and addr2 memory regions
+ * @addr1: The first address to base the search on
+ * @addr2: The second address to base the search on
+ * @offset: The bitnumber to start searching at
+ * @size: The bitmap size in bits
+ *
+ * Returns the bit number for the next bit cleared in addr1 and addr2
+ * If no corresponding bits meet this criterion, returns @size.
+ */
+static inline
+unsigned long find_next_zero_and_zero_bit(const unsigned long *addr1,
+ const unsigned long *addr2, unsigned long size,
+ unsigned long offset)
+{
+ if (small_const_nbits(size)) {
+ unsigned long val;
+
+ if (unlikely(offset >= size))
+ return size;
+
+ val = ~*addr1 & ~*addr2 & GENMASK(size - 1, offset);
+ return val ? __ffs(val) : size;
+ }
+
+ return _find_next_bit(addr1, addr2, size, offset, ~0UL, ~0UL, 0);
+}
+#endif
+
#ifndef find_next_zero_bit
/**
* find_next_zero_bit - find the next cleared bit in a memory region
@@ -173,6 +233,56 @@ unsigned long find_first_zero_bit(const unsigned long *addr, unsigned long size)
}
#endif

+#ifndef find_first_one_and_zero_bit
+/**
+ * find_first_one_and_zero_bit - find the first bit which is one in addr1 and zero in addr2 memory region
+ * @addr1: The first address to base the search on
+ * @addr2: The second address to base the search on
+ * @size: The bitmap size in bits
+ *
+ * Returns the bit number for the first bit set in addr1 and cleared in addr2
+ * If no corresponding bits meet this criterion, returns @size.
+ */
+static inline
+unsigned long find_first_one_and_zero_bit(const unsigned long *addr1,
+ const unsigned long *addr2,
+ unsigned long size)
+{
+ if (small_const_nbits(size)) {
+ unsigned long val = *addr1 & ~*addr2 & GENMASK(size - 1, 0);
+
+ return val ? __ffs(val) : size;
+ }
+
+ return _find_next_bit(addr1, addr2, size, 0, 0UL, ~0UL, 0);
+}
+#endif
+
+#ifndef find_first_zero_and_zero_bit
+/**
+ * find_first_zero_and_zero_bit - find the first bit which is zero in addr1 and addr2 memory regions
+ * @addr1: The first address to base the search on
+ * @addr2: The second address to base the search on
+ * @size: The bitmap size in bits
+ *
+ * Returns the bit number for the first bit cleared in addr1 and addr2
+ * If no corresponding bits meet this criterion, returns @size.
+ */
+static inline
+unsigned long find_first_zero_and_zero_bit(const unsigned long *addr1,
+ const unsigned long *addr2,
+ unsigned long size)
+{
+ if (small_const_nbits(size)) {
+ unsigned long val = ~*addr1 & ~*addr2 & GENMASK(size - 1, 0);
+
+ return val ? __ffs(val) : size;
+ }
+
+ return _find_next_bit(addr1, addr2, size, 0, ~0UL, ~0UL, 0);
+}
+#endif
+
#ifndef find_last_bit
/**
* find_last_bit - find the last set bit in a memory region
--
2.25.1

2022-09-22 12:10:30

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 08/25] cpumask: Implement cpumask_{first,next}_{zero,one}_and_zero

Allow finding the first or next bit within two input cpumasks which is
either:

- both zero and zero,
- respectively one and zero.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
include/linux/cpumask.h | 86 +++++++++++++++++++++++++++++++++++++++++
1 file changed, 86 insertions(+)

diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index fe29ac7cc469..0f5c3e47423f 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -134,6 +134,18 @@ static inline unsigned int cpumask_first_and(const struct cpumask *srcp1,
return 0;
}

+static inline unsigned int cpumask_first_one_and_zero(const struct cpumask *srcp1,
+ const struct cpumask *srcp2)
+{
+ return 0;
+}
+
+static inline unsigned int cpumask_first_zero_and_zero(const struct cpumask *srcp1,
+ const struct cpumask *srcp2)
+{
+ return 0;
+}
+
static inline unsigned int cpumask_last(const struct cpumask *srcp)
{
return 0;
@@ -157,6 +169,20 @@ static inline unsigned int cpumask_next_and(int n,
return n+1;
}

+static inline unsigned int cpumask_next_one_and_zero(int n,
+ const struct cpumask *srcp1,
+ const struct cpumask *srcp2)
+{
+ return n+1;
+}
+
+static inline unsigned int cpumask_next_zero_and_zero(int n,
+ const struct cpumask *srcp1,
+ const struct cpumask *srcp2)
+{
+ return n+1;
+}
+
static inline unsigned int cpumask_next_wrap(int n, const struct cpumask *mask,
int start, bool wrap)
{
@@ -230,6 +256,32 @@ unsigned int cpumask_first_and(const struct cpumask *srcp1, const struct cpumask
return find_first_and_bit(cpumask_bits(srcp1), cpumask_bits(srcp2), nr_cpumask_bits);
}

+/**
+ * cpumask_first_one_and_zero - return the first cpu from *srcp1 & ~*srcp2
+ * @src1p: the first input
+ * @src2p: the second input
+ *
+ * Returns >= nr_cpu_ids if no cpus match in both.
+ */
+static inline
+unsigned int cpumask_first_one_and_zero(const struct cpumask *srcp1, const struct cpumask *srcp2)
+{
+ return find_first_one_and_zero_bit(cpumask_bits(srcp1), cpumask_bits(srcp2), nr_cpumask_bits);
+}
+
+/**
+ * cpumask_first_zero_and_zero - return the first cpu from ~*srcp1 & ~*srcp2
+ * @src1p: the first input
+ * @src2p: the second input
+ *
+ * Returns >= nr_cpu_ids if no cpus match in both.
+ */
+static inline
+unsigned int cpumask_first_zero_and_zero(const struct cpumask *srcp1, const struct cpumask *srcp2)
+{
+ return find_first_zero_and_zero_bit(cpumask_bits(srcp1), cpumask_bits(srcp2), nr_cpumask_bits);
+}
+
/**
* cpumask_last - get the last CPU in a cpumask
* @srcp: - the cpumask pointer
@@ -258,6 +310,40 @@ static inline unsigned int cpumask_next_zero(int n, const struct cpumask *srcp)
return find_next_zero_bit(cpumask_bits(srcp), nr_cpumask_bits, n+1);
}

+/**
+ * cpumask_next_one_and_zero - return the next cpu from *srcp1 & ~*srcp2
+ * @n: the cpu prior to the place to search (ie. return will be > @n)
+ * @src1p: the first input
+ * @src2p: the second input
+ *
+ * Returns >= nr_cpu_ids if no cpus match in both.
+ */
+static inline
+unsigned int cpumask_next_one_and_zero(int n, const struct cpumask *srcp1, const struct cpumask *srcp2)
+{
+ /* -1 is a legal arg here. */
+ if (n != -1)
+ cpumask_check(n);
+ return find_next_one_and_zero_bit(cpumask_bits(srcp1), cpumask_bits(srcp2), nr_cpumask_bits, n+1);
+}
+
+/**
+ * cpumask_next_zero_and_zero - return the next cpu from ~*srcp1 & ~*srcp2
+ * @n: the cpu prior to the place to search (ie. return will be > @n)
+ * @src1p: the first input
+ * @src2p: the second input
+ *
+ * Returns >= nr_cpu_ids if no cpus match in both.
+ */
+static inline
+unsigned int cpumask_next_zero_and_zero(int n, const struct cpumask *srcp1, const struct cpumask *srcp2)
+{
+ /* -1 is a legal arg here. */
+ if (n != -1)
+ cpumask_check(n);
+ return find_next_zero_and_zero_bit(cpumask_bits(srcp1), cpumask_bits(srcp2), nr_cpumask_bits, n+1);
+}
+
int __pure cpumask_next_and(int n, const struct cpumask *, const struct cpumask *);
int __pure cpumask_any_but(const struct cpumask *mask, unsigned int cpu);
unsigned int cpumask_local_spread(unsigned int i, int node);
--
2.25.1

2022-09-22 12:12:30

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 05/25] selftests/rseq: Implement rseq numa node id field selftest

Test the NUMA node id extension rseq field. Compare it against the value
returned by the getcpu(2) system call while pinned on a specific core.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
tools/testing/selftests/rseq/basic_test.c | 4 ++++
tools/testing/selftests/rseq/rseq-abi.h | 8 +++++++
tools/testing/selftests/rseq/rseq.c | 18 +++++++++++++++
tools/testing/selftests/rseq/rseq.h | 28 +++++++++++++++++++++++
4 files changed, 58 insertions(+)

diff --git a/tools/testing/selftests/rseq/basic_test.c b/tools/testing/selftests/rseq/basic_test.c
index d8efbfb89193..295eea16466f 100644
--- a/tools/testing/selftests/rseq/basic_test.c
+++ b/tools/testing/selftests/rseq/basic_test.c
@@ -22,6 +22,8 @@ void test_cpu_pointer(void)
CPU_ZERO(&test_affinity);
for (i = 0; i < CPU_SETSIZE; i++) {
if (CPU_ISSET(i, &affinity)) {
+ int node;
+
CPU_SET(i, &test_affinity);
sched_setaffinity(0, sizeof(test_affinity),
&test_affinity);
@@ -29,6 +31,8 @@ void test_cpu_pointer(void)
assert(rseq_current_cpu() == i);
assert(rseq_current_cpu_raw() == i);
assert(rseq_cpu_start() == i);
+ node = rseq_fallback_current_node();
+ assert(rseq_current_node_id() == node);
CPU_CLR(i, &test_affinity);
}
}
diff --git a/tools/testing/selftests/rseq/rseq-abi.h b/tools/testing/selftests/rseq/rseq-abi.h
index 00ac846d85b0..a1faa9162d52 100644
--- a/tools/testing/selftests/rseq/rseq-abi.h
+++ b/tools/testing/selftests/rseq/rseq-abi.h
@@ -147,6 +147,14 @@ struct rseq_abi {
*/
__u32 flags;

+ /*
+ * Restartable sequences node_id field. Updated by the kernel. Read by
+ * user-space with single-copy atomicity semantics. This field should
+ * only be read by the thread which registered this data structure.
+ * Aligned on 32-bit. Contains the current NUMA node ID.
+ */
+ __u32 node_id;
+
/*
* Flexible array member at end of structure, after last feature field.
*/
diff --git a/tools/testing/selftests/rseq/rseq.c b/tools/testing/selftests/rseq/rseq.c
index 20ea536d1012..0a96c3c779cd 100644
--- a/tools/testing/selftests/rseq/rseq.c
+++ b/tools/testing/selftests/rseq/rseq.c
@@ -79,6 +79,11 @@ static int sys_rseq(struct rseq_abi *rseq_abi, uint32_t rseq_len,
return syscall(__NR_rseq, rseq_abi, rseq_len, flags, sig);
}

+static int sys_getcpu(unsigned *cpu, unsigned *node)
+{
+ return syscall(__NR_getcpu, cpu, node, NULL);
+}
+
int rseq_available(void)
{
int rc;
@@ -199,3 +204,16 @@ int32_t rseq_fallback_current_cpu(void)
}
return cpu;
}
+
+int32_t rseq_fallback_current_node(void)
+{
+ uint32_t cpu_id, node_id;
+ int ret;
+
+ ret = sys_getcpu(&cpu_id, &node_id);
+ if (ret) {
+ perror("sys_getcpu()");
+ return ret;
+ }
+ return (int32_t) node_id;
+}
diff --git a/tools/testing/selftests/rseq/rseq.h b/tools/testing/selftests/rseq/rseq.h
index 95adc1e1b0db..fd17d0e54a1b 100644
--- a/tools/testing/selftests/rseq/rseq.h
+++ b/tools/testing/selftests/rseq/rseq.h
@@ -20,6 +20,15 @@
#include "rseq-abi.h"
#include "compiler.h"

+#ifndef rseq_sizeof_field
+#define rseq_sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
+#endif
+
+#ifndef rseq_offsetofend
+#define rseq_offsetofend(TYPE, MEMBER) \
+ (offsetof(TYPE, MEMBER) + rseq_sizeof_field(TYPE, MEMBER))
+#endif
+
/*
* Empty code injection macros, override when testing.
* It is important to consider that the ASM injection macros need to be
@@ -128,6 +137,11 @@ int rseq_unregister_current_thread(void);
*/
int32_t rseq_fallback_current_cpu(void);

+/*
+ * Restartable sequence fallback for reading the current node number.
+ */
+int32_t rseq_fallback_current_node(void);
+
/*
* Values returned can be either the current CPU number, -1 (rseq is
* uninitialized), or -2 (rseq initialization has failed).
@@ -163,6 +177,20 @@ static inline uint32_t rseq_current_cpu(void)
return cpu;
}

+static inline bool rseq_node_id_available(void)
+{
+ return (int) rseq_feature_size >= rseq_offsetofend(struct rseq_abi, node_id);
+}
+
+/*
+ * Current NUMA node number.
+ */
+static inline uint32_t rseq_current_node_id(void)
+{
+ assert(rseq_node_id_available());
+ return RSEQ_ACCESS_ONCE(rseq_get_abi()->node_id);
+}
+
static inline void rseq_clear_rseq_cs(void)
{
RSEQ_WRITE_ONCE(rseq_get_abi()->rseq_cs.arch.ptr, 0);
--
2.25.1

2022-09-22 12:14:23

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 20/25] selftests/rseq: Implement basic percpu ops vm_vcpu_id test

Adapt to the rseq.h API changes introduced by commits
"selftests/rseq: <arch>: Template memory ordering and percpu access mode".

Build a new basic_percpu_ops_vm_vcpu_id_test to test the new
"vm_vcpu_id" rseq field.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
tools/testing/selftests/rseq/.gitignore | 1 +
tools/testing/selftests/rseq/Makefile | 5 +-
.../selftests/rseq/basic_percpu_ops_test.c | 46 ++++++++++++++++---
3 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/rseq/.gitignore b/tools/testing/selftests/rseq/.gitignore
index 5910888ebfe1..5a7e5acc628c 100644
--- a/tools/testing/selftests/rseq/.gitignore
+++ b/tools/testing/selftests/rseq/.gitignore
@@ -1,5 +1,6 @@
# SPDX-License-Identifier: GPL-2.0-only
basic_percpu_ops_test
+basic_percpu_ops_vm_vcpu_id_test
basic_test
basic_rseq_op_test
param_test
diff --git a/tools/testing/selftests/rseq/Makefile b/tools/testing/selftests/rseq/Makefile
index 215e1067f037..4210c135e621 100644
--- a/tools/testing/selftests/rseq/Makefile
+++ b/tools/testing/selftests/rseq/Makefile
@@ -12,7 +12,7 @@ LDLIBS += -lpthread -ldl
# still track changes to header files and depend on shared object.
OVERRIDE_TARGETS = 1

-TEST_GEN_PROGS = basic_test basic_percpu_ops_test param_test \
+TEST_GEN_PROGS = basic_test basic_percpu_ops_test basic_percpu_ops_vm_vcpu_id_test param_test \
param_test_benchmark param_test_compare_twice

TEST_GEN_PROGS_EXTENDED = librseq.so
@@ -29,6 +29,9 @@ $(OUTPUT)/librseq.so: rseq.c rseq.h rseq-*.h
$(OUTPUT)/%: %.c $(TEST_GEN_PROGS_EXTENDED) rseq.h rseq-*.h
$(CC) $(CFLAGS) $< $(LDLIBS) -lrseq -o $@

+$(OUTPUT)/basic_percpu_ops_vm_vcpu_id_test: basic_percpu_ops_test.c $(TEST_GEN_PROGS_EXTENDED) rseq.h rseq-*.h
+ $(CC) $(CFLAGS) -DBUILDOPT_RSEQ_PERCPU_VM_VCPU_ID $< $(LDLIBS) -lrseq -o $@
+
$(OUTPUT)/param_test_benchmark: param_test.c $(TEST_GEN_PROGS_EXTENDED) \
rseq.h rseq-*.h
$(CC) $(CFLAGS) -DBENCHMARK $< $(LDLIBS) -lrseq -o $@
diff --git a/tools/testing/selftests/rseq/basic_percpu_ops_test.c b/tools/testing/selftests/rseq/basic_percpu_ops_test.c
index 517756afc2a4..719ff9910e23 100644
--- a/tools/testing/selftests/rseq/basic_percpu_ops_test.c
+++ b/tools/testing/selftests/rseq/basic_percpu_ops_test.c
@@ -12,6 +12,32 @@
#include "../kselftest.h"
#include "rseq.h"

+#ifdef BUILDOPT_RSEQ_PERCPU_VM_VCPU_ID
+# define RSEQ_PERCPU RSEQ_PERCPU_VM_VCPU_ID
+static
+int get_current_cpu_id(void)
+{
+ return rseq_current_vm_vcpu_id();
+}
+static
+bool rseq_validate_cpu_id(void)
+{
+ return rseq_vm_vcpu_id_available();
+}
+#else
+# define RSEQ_PERCPU RSEQ_PERCPU_CPU_ID
+static
+int get_current_cpu_id(void)
+{
+ return rseq_cpu_start();
+}
+static
+bool rseq_validate_cpu_id(void)
+{
+ return rseq_current_cpu_raw() >= 0;
+}
+#endif
+
struct percpu_lock_entry {
intptr_t v;
} __attribute__((aligned(128)));
@@ -51,9 +77,9 @@ int rseq_this_cpu_lock(struct percpu_lock *lock)
for (;;) {
int ret;

- cpu = rseq_cpu_start();
- ret = rseq_cmpeqv_storev(&lock->c[cpu].v,
- 0, 1, cpu);
+ cpu = get_current_cpu_id();
+ ret = rseq_cmpeqv_storev(RSEQ_MO_RELAXED, RSEQ_PERCPU,
+ &lock->c[cpu].v, 0, 1, cpu);
if (rseq_likely(!ret))
break;
/* Retry if comparison fails or rseq aborts. */
@@ -141,13 +167,14 @@ void this_cpu_list_push(struct percpu_list *list,
intptr_t *targetptr, newval, expect;
int ret;

- cpu = rseq_cpu_start();
+ cpu = get_current_cpu_id();
/* Load list->c[cpu].head with single-copy atomicity. */
expect = (intptr_t)RSEQ_READ_ONCE(list->c[cpu].head);
newval = (intptr_t)node;
targetptr = (intptr_t *)&list->c[cpu].head;
node->next = (struct percpu_list_node *)expect;
- ret = rseq_cmpeqv_storev(targetptr, expect, newval, cpu);
+ ret = rseq_cmpeqv_storev(RSEQ_MO_RELAXED, RSEQ_PERCPU,
+ targetptr, expect, newval, cpu);
if (rseq_likely(!ret))
break;
/* Retry if comparison fails or rseq aborts. */
@@ -170,12 +197,13 @@ struct percpu_list_node *this_cpu_list_pop(struct percpu_list *list,
long offset;
int ret, cpu;

- cpu = rseq_cpu_start();
+ cpu = get_current_cpu_id();
targetptr = (intptr_t *)&list->c[cpu].head;
expectnot = (intptr_t)NULL;
offset = offsetof(struct percpu_list_node, next);
load = (intptr_t *)&head;
- ret = rseq_cmpnev_storeoffp_load(targetptr, expectnot,
+ ret = rseq_cmpnev_storeoffp_load(RSEQ_MO_RELAXED, RSEQ_PERCPU,
+ targetptr, expectnot,
offset, load, cpu);
if (rseq_likely(!ret)) {
if (_cpu)
@@ -295,6 +323,10 @@ int main(int argc, char **argv)
errno, strerror(errno));
goto error;
}
+ if (!rseq_validate_cpu_id()) {
+ fprintf(stderr, "Error: cpu id getter unavailable\n");
+ goto error;
+ }
printf("spinlock\n");
test_percpu_spinlock();
printf("percpu_list\n");
--
2.25.1

2022-09-22 12:14:24

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 06/25] lib: Invert _find_next_bit source arguments

Apply bit-invert operations before the AND operation in _find_next_bit.
Allows AND operations on combined bitmasks in which we search either for
one or zero, e.g.: find first bit which is both zero in one bitmask AND
one in the second bitmask.

The existing use for find first zero bit does not use the second
argument, so whether the inversion is performed before or after the AND
operator does not matter.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
include/linux/find.h | 13 +++++++------
lib/find_bit.c | 17 ++++++++---------
tools/include/linux/find.h | 9 +++++----
tools/lib/find_bit.c | 17 ++++++++---------
4 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/include/linux/find.h b/include/linux/find.h
index 424ef67d4a42..935892da576e 100644
--- a/include/linux/find.h
+++ b/include/linux/find.h
@@ -10,7 +10,8 @@

extern unsigned long _find_next_bit(const unsigned long *addr1,
const unsigned long *addr2, unsigned long nbits,
- unsigned long start, unsigned long invert, unsigned long le);
+ unsigned long start, unsigned long invert_src1,
+ unsigned long src2, unsigned long le);
extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size);
extern unsigned long _find_first_and_bit(const unsigned long *addr1,
const unsigned long *addr2, unsigned long size);
@@ -41,7 +42,7 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
return val ? __ffs(val) : size;
}

- return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
+ return _find_next_bit(addr, NULL, size, offset, 0UL, 0UL, 0);
}
#endif

@@ -71,7 +72,7 @@ unsigned long find_next_and_bit(const unsigned long *addr1,
return val ? __ffs(val) : size;
}

- return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
+ return _find_next_bit(addr1, addr2, size, offset, 0UL, 0UL, 0);
}
#endif

@@ -99,7 +100,7 @@ unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
return val == ~0UL ? size : ffz(val);
}

- return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
+ return _find_next_bit(addr, NULL, size, offset, ~0UL, 0UL, 0);
}
#endif

@@ -247,7 +248,7 @@ unsigned long find_next_zero_bit_le(const void *addr, unsigned
return val == ~0UL ? size : ffz(val);
}

- return _find_next_bit(addr, NULL, size, offset, ~0UL, 1);
+ return _find_next_bit(addr, NULL, size, offset, ~0UL, 0UL, 1);
}
#endif

@@ -266,7 +267,7 @@ unsigned long find_next_bit_le(const void *addr, unsigned
return val ? __ffs(val) : size;
}

- return _find_next_bit(addr, NULL, size, offset, 0UL, 1);
+ return _find_next_bit(addr, NULL, size, offset, 0UL, 0UL, 1);
}
#endif

diff --git a/lib/find_bit.c b/lib/find_bit.c
index 1b8e4b2a9cba..73e78565e691 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -25,23 +25,23 @@
/*
* This is a common helper function for find_next_bit, find_next_zero_bit, and
* find_next_and_bit. The differences are:
- * - The "invert" argument, which is XORed with each fetched word before
- * searching it for one bits.
+ * - The "invert_src1" and "invert_src2" arguments, which are XORed to
+ * each source word before applying the 'and' operator.
* - The optional "addr2", which is anded with "addr1" if present.
*/
unsigned long _find_next_bit(const unsigned long *addr1,
const unsigned long *addr2, unsigned long nbits,
- unsigned long start, unsigned long invert, unsigned long le)
+ unsigned long start, unsigned long invert_src1,
+ unsigned long invert_src2, unsigned long le)
{
unsigned long tmp, mask;

if (unlikely(start >= nbits))
return nbits;

- tmp = addr1[start / BITS_PER_LONG];
+ tmp = addr1[start / BITS_PER_LONG] ^ invert_src1;
if (addr2)
- tmp &= addr2[start / BITS_PER_LONG];
- tmp ^= invert;
+ tmp &= addr2[start / BITS_PER_LONG] ^ invert_src2;

/* Handle 1st word. */
mask = BITMAP_FIRST_WORD_MASK(start);
@@ -57,10 +57,9 @@ unsigned long _find_next_bit(const unsigned long *addr1,
if (start >= nbits)
return nbits;

- tmp = addr1[start / BITS_PER_LONG];
+ tmp = addr1[start / BITS_PER_LONG] ^ invert_src1;
if (addr2)
- tmp &= addr2[start / BITS_PER_LONG];
- tmp ^= invert;
+ tmp &= addr2[start / BITS_PER_LONG] ^ invert_src2;
}

if (le)
diff --git a/tools/include/linux/find.h b/tools/include/linux/find.h
index 47e2bd6c5174..5ab0c95086ad 100644
--- a/tools/include/linux/find.h
+++ b/tools/include/linux/find.h
@@ -10,7 +10,8 @@

extern unsigned long _find_next_bit(const unsigned long *addr1,
const unsigned long *addr2, unsigned long nbits,
- unsigned long start, unsigned long invert, unsigned long le);
+ unsigned long start, unsigned long invert_src1,
+ unsigned long src2, unsigned long le);
extern unsigned long _find_first_bit(const unsigned long *addr, unsigned long size);
extern unsigned long _find_first_and_bit(const unsigned long *addr1,
const unsigned long *addr2, unsigned long size);
@@ -41,7 +42,7 @@ unsigned long find_next_bit(const unsigned long *addr, unsigned long size,
return val ? __ffs(val) : size;
}

- return _find_next_bit(addr, NULL, size, offset, 0UL, 0);
+ return _find_next_bit(addr, NULL, size, offset, 0UL, 0UL, 0);
}
#endif

@@ -71,7 +72,7 @@ unsigned long find_next_and_bit(const unsigned long *addr1,
return val ? __ffs(val) : size;
}

- return _find_next_bit(addr1, addr2, size, offset, 0UL, 0);
+ return _find_next_bit(addr1, addr2, size, offset, 0UL, 0UL, 0);
}
#endif

@@ -99,7 +100,7 @@ unsigned long find_next_zero_bit(const unsigned long *addr, unsigned long size,
return val == ~0UL ? size : ffz(val);
}

- return _find_next_bit(addr, NULL, size, offset, ~0UL, 0);
+ return _find_next_bit(addr, NULL, size, offset, ~0UL, 0UL, 0);
}
#endif

diff --git a/tools/lib/find_bit.c b/tools/lib/find_bit.c
index ba4b8d94e004..4176232de7f9 100644
--- a/tools/lib/find_bit.c
+++ b/tools/lib/find_bit.c
@@ -24,13 +24,14 @@
/*
* This is a common helper function for find_next_bit, find_next_zero_bit, and
* find_next_and_bit. The differences are:
- * - The "invert" argument, which is XORed with each fetched word before
- * searching it for one bits.
+ * - The "invert_src1" and "invert_src2" arguments, which are XORed to
+ * each source word before applying the 'and' operator.
* - The optional "addr2", which is anded with "addr1" if present.
*/
unsigned long _find_next_bit(const unsigned long *addr1,
const unsigned long *addr2, unsigned long nbits,
- unsigned long start, unsigned long invert, unsigned long le)
+ unsigned long start, unsigned long invert_src1,
+ unsigned long invert_src2, unsigned long le)
{
unsigned long tmp, mask;
(void) le;
@@ -38,10 +39,9 @@ unsigned long _find_next_bit(const unsigned long *addr1,
if (unlikely(start >= nbits))
return nbits;

- tmp = addr1[start / BITS_PER_LONG];
+ tmp = addr1[start / BITS_PER_LONG] ^ invert_src1;
if (addr2)
- tmp &= addr2[start / BITS_PER_LONG];
- tmp ^= invert;
+ tmp &= addr2[start / BITS_PER_LONG] ^ invert_src2;

/* Handle 1st word. */
mask = BITMAP_FIRST_WORD_MASK(start);
@@ -64,10 +64,9 @@ unsigned long _find_next_bit(const unsigned long *addr1,
if (start >= nbits)
return nbits;

- tmp = addr1[start / BITS_PER_LONG];
+ tmp = addr1[start / BITS_PER_LONG] ^ invert_src1;
if (addr2)
- tmp &= addr2[start / BITS_PER_LONG];
- tmp ^= invert;
+ tmp &= addr2[start / BITS_PER_LONG] ^ invert_src2;
}

#if (0)
--
2.25.1

2022-09-22 12:16:34

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4 13/25] selftests/rseq: x86: Template memory ordering and percpu access mode

Introduce a rseq-x86-bits.h template header which is internally included
to generate the static inline functions covering:

- relaxed and release memory ordering,
- per-cpu-id and per-vm-vcpu-id per-cpu data access.

This introduces changes to the rseq.h selftests API which require to
update the rseq selftest programs. Similar API/templating changes need
to be done for other architectures.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
tools/testing/selftests/rseq/compiler.h | 6 +
.../testing/selftests/rseq/rseq-bits-reset.h | 10 +
.../selftests/rseq/rseq-bits-template.h | 39 +
tools/testing/selftests/rseq/rseq-x86-bits.h | 993 ++++++++++++++
tools/testing/selftests/rseq/rseq-x86.h | 1181 +----------------
tools/testing/selftests/rseq/rseq.h | 159 +++
6 files changed, 1238 insertions(+), 1150 deletions(-)
create mode 100644 tools/testing/selftests/rseq/rseq-bits-reset.h
create mode 100644 tools/testing/selftests/rseq/rseq-bits-template.h
create mode 100644 tools/testing/selftests/rseq/rseq-x86-bits.h

diff --git a/tools/testing/selftests/rseq/compiler.h b/tools/testing/selftests/rseq/compiler.h
index 876eb6a7f75b..f47092bddeba 100644
--- a/tools/testing/selftests/rseq/compiler.h
+++ b/tools/testing/selftests/rseq/compiler.h
@@ -27,4 +27,10 @@
*/
#define rseq_after_asm_goto() asm volatile ("" : : : "memory")

+/* Combine two tokens. */
+#define RSEQ__COMBINE_TOKENS(_tokena, _tokenb) \
+ _tokena##_tokenb
+#define RSEQ_COMBINE_TOKENS(_tokena, _tokenb) \
+ RSEQ__COMBINE_TOKENS(_tokena, _tokenb)
+
#endif /* RSEQ_COMPILER_H_ */
diff --git a/tools/testing/selftests/rseq/rseq-bits-reset.h b/tools/testing/selftests/rseq/rseq-bits-reset.h
new file mode 100644
index 000000000000..3016070212f3
--- /dev/null
+++ b/tools/testing/selftests/rseq/rseq-bits-reset.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
+/*
+ * rseq-bits-reset.h
+ *
+ * (C) Copyright 2016-2022 - Mathieu Desnoyers <[email protected]>
+ */
+
+#undef RSEQ_TEMPLATE_IDENTIFIER
+#undef RSEQ_TEMPLATE_CPU_ID_OFFSET
+#undef RSEQ_TEMPLATE_SUFFIX
diff --git a/tools/testing/selftests/rseq/rseq-bits-template.h b/tools/testing/selftests/rseq/rseq-bits-template.h
new file mode 100644
index 000000000000..ea0fe06ef314
--- /dev/null
+++ b/tools/testing/selftests/rseq/rseq-bits-template.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
+/*
+ * rseq-bits-template.h
+ *
+ * (C) Copyright 2016-2022 - Mathieu Desnoyers <[email protected]>
+ */
+
+#ifdef RSEQ_TEMPLATE_CPU_ID
+# define RSEQ_TEMPLATE_CPU_ID_OFFSET RSEQ_CPU_ID_OFFSET
+# ifdef RSEQ_TEMPLATE_MO_RELEASE
+# define RSEQ_TEMPLATE_SUFFIX _release_cpu_id
+# elif defined (RSEQ_TEMPLATE_MO_RELAXED)
+# define RSEQ_TEMPLATE_SUFFIX _relaxed_cpu_id
+# else
+# error "Never use <rseq-bits-template.h> directly; include <rseq.h> instead."
+# endif
+#elif defined(RSEQ_TEMPLATE_VM_VCPU_ID)
+# define RSEQ_TEMPLATE_CPU_ID_OFFSET RSEQ_VM_VCPU_ID_OFFSET
+# ifdef RSEQ_TEMPLATE_MO_RELEASE
+# define RSEQ_TEMPLATE_SUFFIX _release_vm_vcpu_id
+# elif defined (RSEQ_TEMPLATE_MO_RELAXED)
+# define RSEQ_TEMPLATE_SUFFIX _relaxed_vm_vcpu_id
+# else
+# error "Never use <rseq-bits-template.h> directly; include <rseq.h> instead."
+# endif
+#elif defined (RSEQ_TEMPLATE_CPU_ID_NONE)
+# ifdef RSEQ_TEMPLATE_MO_RELEASE
+# define RSEQ_TEMPLATE_SUFFIX _release
+# elif defined (RSEQ_TEMPLATE_MO_RELAXED)
+# define RSEQ_TEMPLATE_SUFFIX _relaxed
+# else
+# error "Never use <rseq-bits-template.h> directly; include <rseq.h> instead."
+# endif
+#else
+# error "Never use <rseq-bits-template.h> directly; include <rseq.h> instead."
+#endif
+
+#define RSEQ_TEMPLATE_IDENTIFIER(x) RSEQ_COMBINE_TOKENS(x, RSEQ_TEMPLATE_SUFFIX)
+
diff --git a/tools/testing/selftests/rseq/rseq-x86-bits.h b/tools/testing/selftests/rseq/rseq-x86-bits.h
new file mode 100644
index 000000000000..28ca77cc876c
--- /dev/null
+++ b/tools/testing/selftests/rseq/rseq-x86-bits.h
@@ -0,0 +1,993 @@
+/* SPDX-License-Identifier: LGPL-2.1 OR MIT */
+/*
+ * rseq-x86-bits.h
+ *
+ * (C) Copyright 2016-2022 - Mathieu Desnoyers <[email protected]>
+ */
+
+#include "rseq-bits-template.h"
+
+#ifdef __x86_64__
+
+#if defined(RSEQ_TEMPLATE_MO_RELAXED) && \
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID))
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_storev)(intptr_t *v, intptr_t expect, intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+ "cmpq %[v], %[expect]\n\t"
+ "jnz %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+ "cmpq %[v], %[expect]\n\t"
+ "jnz %l[error2]\n\t"
+#endif
+ /* final store */
+ "movq %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(5)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv)
+ : "memory", "cc", "rax"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+/*
+ * Compare @v against @expectnot. When it does _not_ match, load @v
+ * into @load, and store the content of *@v + voffp into @v.
+ */
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpnev_storeoffp_load)(intptr_t *v, intptr_t expectnot,
+ long voffp, intptr_t *load, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+ "movq %[v], %%rbx\n\t"
+ "cmpq %%rbx, %[expectnot]\n\t"
+ "je %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+ "movq %[v], %%rbx\n\t"
+ "cmpq %%rbx, %[expectnot]\n\t"
+ "je %l[error2]\n\t"
+#endif
+ "movq %%rbx, %[load]\n\t"
+ "addq %[voffp], %%rbx\n\t"
+ "movq (%%rbx), %%rbx\n\t"
+ /* final store */
+ "movq %%rbx, %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(5)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* final store input */
+ [v] "m" (*v),
+ [expectnot] "r" (expectnot),
+ [voffp] "er" (voffp),
+ [load] "m" (*load)
+ : "memory", "cc", "rax", "rbx"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_addv)(intptr_t *v, intptr_t count, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+#endif
+ /* final store */
+ "addq %[count], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(4)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* final store input */
+ [v] "m" (*v),
+ [count] "er" (count)
+ : "memory", "cc", "rax"
+ RSEQ_INJECT_CLOBBER
+ : abort
+#ifdef RSEQ_COMPARE_TWICE
+ , error1
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+#endif
+}
+
+#define RSEQ_ARCH_HAS_OFFSET_DEREF_ADDV
+
+/*
+ * pval = *(ptr+off)
+ * *pval += inc;
+ */
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_offset_deref_addv)(intptr_t *ptr, long off, intptr_t inc, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+#endif
+ /* get p+v */
+ "movq %[ptr], %%rbx\n\t"
+ "addq %[off], %%rbx\n\t"
+ /* get pv */
+ "movq (%%rbx), %%rcx\n\t"
+ /* *pv += inc */
+ "addq %[inc], (%%rcx)\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(4)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* final store input */
+ [ptr] "m" (*ptr),
+ [off] "er" (off),
+ [inc] "er" (inc)
+ : "memory", "cc", "rax", "rbx", "rcx"
+ RSEQ_INJECT_CLOBBER
+ : abort
+#ifdef RSEQ_COMPARE_TWICE
+ , error1
+#endif
+ );
+ return 0;
+abort:
+ RSEQ_INJECT_FAILED
+ return -1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_bug("cpu_id comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_cmpeqv_storev)(intptr_t *v, intptr_t expect,
+ intptr_t *v2, intptr_t expect2,
+ intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error3])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+ "cmpq %[v], %[expect]\n\t"
+ "jnz %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+ "cmpq %[v2], %[expect2]\n\t"
+ "jnz %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(5)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+ "cmpq %[v], %[expect]\n\t"
+ "jnz %l[error2]\n\t"
+ "cmpq %[v2], %[expect2]\n\t"
+ "jnz %l[error3]\n\t"
+#endif
+ /* final store */
+ "movq %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* cmp2 input */
+ [v2] "m" (*v2),
+ [expect2] "r" (expect2),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv)
+ : "memory", "cc", "rax"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2, error3
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("1st expected value comparison failed");
+error3:
+ rseq_after_asm_goto();
+ rseq_bug("2nd expected value comparison failed");
+#endif
+}
+
+#endif /* #if defined(RSEQ_TEMPLATE_MO_RELAXED) &&
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID)) */
+
+#if (defined(RSEQ_TEMPLATE_MO_RELAXED) || defined(RSEQ_TEMPLATE_MO_RELEASE)) && \
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID))
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_trystorev_storev)(intptr_t *v, intptr_t expect,
+ intptr_t *v2, intptr_t newv2,
+ intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+ "cmpq %[v], %[expect]\n\t"
+ "jnz %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+ "cmpq %[v], %[expect]\n\t"
+ "jnz %l[error2]\n\t"
+#endif
+ /* try store */
+ "movq %[newv2], %[v2]\n\t"
+ RSEQ_INJECT_ASM(5)
+ /* final store */
+ "movq %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* try store input */
+ [v2] "m" (*v2),
+ [newv2] "r" (newv2),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv)
+ : "memory", "cc", "rax"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_trymemcpy_storev)(intptr_t *v, intptr_t expect,
+ void *dst, void *src, size_t len,
+ intptr_t newv, int cpu)
+{
+ uint64_t rseq_scratch[3];
+
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ "movq %[src], %[rseq_scratch0]\n\t"
+ "movq %[dst], %[rseq_scratch1]\n\t"
+ "movq %[len], %[rseq_scratch2]\n\t"
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+ "cmpq %[v], %[expect]\n\t"
+ "jnz 5f\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 6f)
+ "cmpq %[v], %[expect]\n\t"
+ "jnz 7f\n\t"
+#endif
+ /* try memcpy */
+ "test %[len], %[len]\n\t" \
+ "jz 333f\n\t" \
+ "222:\n\t" \
+ "movb (%[src]), %%al\n\t" \
+ "movb %%al, (%[dst])\n\t" \
+ "inc %[src]\n\t" \
+ "inc %[dst]\n\t" \
+ "dec %[len]\n\t" \
+ "jnz 222b\n\t" \
+ "333:\n\t" \
+ RSEQ_INJECT_ASM(5)
+ /* final store */
+ "movq %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ /* teardown */
+ "movq %[rseq_scratch2], %[len]\n\t"
+ "movq %[rseq_scratch1], %[dst]\n\t"
+ "movq %[rseq_scratch0], %[src]\n\t"
+ RSEQ_ASM_DEFINE_ABORT(4,
+ "movq %[rseq_scratch2], %[len]\n\t"
+ "movq %[rseq_scratch1], %[dst]\n\t"
+ "movq %[rseq_scratch0], %[src]\n\t",
+ abort)
+ RSEQ_ASM_DEFINE_CMPFAIL(5,
+ "movq %[rseq_scratch2], %[len]\n\t"
+ "movq %[rseq_scratch1], %[dst]\n\t"
+ "movq %[rseq_scratch0], %[src]\n\t",
+ cmpfail)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_CMPFAIL(6,
+ "movq %[rseq_scratch2], %[len]\n\t"
+ "movq %[rseq_scratch1], %[dst]\n\t"
+ "movq %[rseq_scratch0], %[src]\n\t",
+ error1)
+ RSEQ_ASM_DEFINE_CMPFAIL(7,
+ "movq %[rseq_scratch2], %[len]\n\t"
+ "movq %[rseq_scratch1], %[dst]\n\t"
+ "movq %[rseq_scratch0], %[src]\n\t",
+ error2)
+#endif
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv),
+ /* try memcpy input */
+ [dst] "r" (dst),
+ [src] "r" (src),
+ [len] "r" (len),
+ [rseq_scratch0] "m" (rseq_scratch[0]),
+ [rseq_scratch1] "m" (rseq_scratch[1]),
+ [rseq_scratch2] "m" (rseq_scratch[2])
+ : "memory", "cc", "rax"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+#endif /* #if (defined(RSEQ_TEMPLATE_MO_RELAXED) || defined(RSEQ_TEMPLATE_MO_RELEASE)) &&
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID)) */
+
+#elif defined(__i386__)
+
+#if defined(RSEQ_TEMPLATE_MO_RELAXED) && \
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID))
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_storev)(intptr_t *v, intptr_t expect, intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+ "cmpl %[v], %[expect]\n\t"
+ "jnz %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+ "cmpl %[v], %[expect]\n\t"
+ "jnz %l[error2]\n\t"
+#endif
+ /* final store */
+ "movl %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(5)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "r" (newv)
+ : "memory", "cc", "eax"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+/*
+ * Compare @v against @expectnot. When it does _not_ match, load @v
+ * into @load, and store the content of *@v + voffp into @v.
+ */
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpnev_storeoffp_load)(intptr_t *v, intptr_t expectnot,
+ long voffp, intptr_t *load, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+ "movl %[v], %%ebx\n\t"
+ "cmpl %%ebx, %[expectnot]\n\t"
+ "je %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+ "movl %[v], %%ebx\n\t"
+ "cmpl %%ebx, %[expectnot]\n\t"
+ "je %l[error2]\n\t"
+#endif
+ "movl %%ebx, %[load]\n\t"
+ "addl %[voffp], %%ebx\n\t"
+ "movl (%%ebx), %%ebx\n\t"
+ /* final store */
+ "movl %%ebx, %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(5)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* final store input */
+ [v] "m" (*v),
+ [expectnot] "r" (expectnot),
+ [voffp] "ir" (voffp),
+ [load] "m" (*load)
+ : "memory", "cc", "eax", "ebx"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_addv)(intptr_t *v, intptr_t count, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+#endif
+ /* final store */
+ "addl %[count], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(4)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* final store input */
+ [v] "m" (*v),
+ [count] "ir" (count)
+ : "memory", "cc", "eax"
+ RSEQ_INJECT_CLOBBER
+ : abort
+#ifdef RSEQ_COMPARE_TWICE
+ , error1
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+#endif
+}
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_cmpeqv_storev)(intptr_t *v, intptr_t expect,
+ intptr_t *v2, intptr_t expect2,
+ intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error3])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+ "cmpl %[v], %[expect]\n\t"
+ "jnz %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+ "cmpl %[expect2], %[v2]\n\t"
+ "jnz %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(5)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+ "cmpl %[v], %[expect]\n\t"
+ "jnz %l[error2]\n\t"
+ "cmpl %[expect2], %[v2]\n\t"
+ "jnz %l[error3]\n\t"
+#endif
+ "movl %[newv], %%eax\n\t"
+ /* final store */
+ "movl %%eax, %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* cmp2 input */
+ [v2] "m" (*v2),
+ [expect2] "r" (expect2),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "r" (expect),
+ [newv] "m" (newv)
+ : "memory", "cc", "eax"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2, error3
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("1st expected value comparison failed");
+error3:
+ rseq_after_asm_goto();
+ rseq_bug("2nd expected value comparison failed");
+#endif
+}
+
+#endif /* #if defined(RSEQ_TEMPLATE_MO_RELAXED) &&
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID)) */
+
+#if (defined(RSEQ_TEMPLATE_MO_RELAXED) || defined(RSEQ_TEMPLATE_MO_RELEASE)) && \
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID))
+
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_trystorev_storev)(intptr_t *v, intptr_t expect,
+ intptr_t *v2, intptr_t newv2,
+ intptr_t newv, int cpu)
+{
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+ "movl %[expect], %%eax\n\t"
+ "cmpl %[v], %%eax\n\t"
+ "jnz %l[cmpfail]\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
+ "movl %[expect], %%eax\n\t"
+ "cmpl %[v], %%eax\n\t"
+ "jnz %l[error2]\n\t"
+#endif
+ /* try store */
+ "movl %[newv2], %[v2]\n\t"
+ RSEQ_INJECT_ASM(5)
+#ifdef RSEQ_TEMPLATE_MO_RELEASE
+ "lock; addl $0,-128(%%esp)\n\t"
+#endif
+ /* final store */
+ "movl %[newv], %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ RSEQ_ASM_DEFINE_ABORT(4, "", abort)
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* try store input */
+ [v2] "m" (*v2),
+ [newv2] "r" (newv2),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "m" (expect),
+ [newv] "r" (newv)
+ : "memory", "cc", "eax"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+
+}
+
+/* TODO: implement a faster memcpy. */
+static inline __attribute__((always_inline))
+int RSEQ_TEMPLATE_IDENTIFIER(rseq_cmpeqv_trymemcpy_storev)(intptr_t *v, intptr_t expect,
+ void *dst, void *src, size_t len,
+ intptr_t newv, int cpu)
+{
+ uint32_t rseq_scratch[3];
+
+ RSEQ_INJECT_C(9)
+
+ __asm__ __volatile__ goto (
+ RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
+ RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
+#endif
+ "movl %[src], %[rseq_scratch0]\n\t"
+ "movl %[dst], %[rseq_scratch1]\n\t"
+ "movl %[len], %[rseq_scratch2]\n\t"
+ /* Start rseq by storing table entry pointer into rseq_cs. */
+ RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 4f)
+ RSEQ_INJECT_ASM(3)
+ "movl %[expect], %%eax\n\t"
+ "cmpl %%eax, %[v]\n\t"
+ "jnz 5f\n\t"
+ RSEQ_INJECT_ASM(4)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_TEMPLATE_CPU_ID_OFFSET(%[rseq_offset]), 6f)
+ "movl %[expect], %%eax\n\t"
+ "cmpl %%eax, %[v]\n\t"
+ "jnz 7f\n\t"
+#endif
+ /* try memcpy */
+ "test %[len], %[len]\n\t" \
+ "jz 333f\n\t" \
+ "222:\n\t" \
+ "movb (%[src]), %%al\n\t" \
+ "movb %%al, (%[dst])\n\t" \
+ "inc %[src]\n\t" \
+ "inc %[dst]\n\t" \
+ "dec %[len]\n\t" \
+ "jnz 222b\n\t" \
+ "333:\n\t" \
+ RSEQ_INJECT_ASM(5)
+#ifdef RSEQ_TEMPLATE_MO_RELEASE
+ "lock; addl $0,-128(%%esp)\n\t"
+#endif
+ "movl %[newv], %%eax\n\t"
+ /* final store */
+ "movl %%eax, %[v]\n\t"
+ "2:\n\t"
+ RSEQ_INJECT_ASM(6)
+ /* teardown */
+ "movl %[rseq_scratch2], %[len]\n\t"
+ "movl %[rseq_scratch1], %[dst]\n\t"
+ "movl %[rseq_scratch0], %[src]\n\t"
+ RSEQ_ASM_DEFINE_ABORT(4,
+ "movl %[rseq_scratch2], %[len]\n\t"
+ "movl %[rseq_scratch1], %[dst]\n\t"
+ "movl %[rseq_scratch0], %[src]\n\t",
+ abort)
+ RSEQ_ASM_DEFINE_CMPFAIL(5,
+ "movl %[rseq_scratch2], %[len]\n\t"
+ "movl %[rseq_scratch1], %[dst]\n\t"
+ "movl %[rseq_scratch0], %[src]\n\t",
+ cmpfail)
+#ifdef RSEQ_COMPARE_TWICE
+ RSEQ_ASM_DEFINE_CMPFAIL(6,
+ "movl %[rseq_scratch2], %[len]\n\t"
+ "movl %[rseq_scratch1], %[dst]\n\t"
+ "movl %[rseq_scratch0], %[src]\n\t",
+ error1)
+ RSEQ_ASM_DEFINE_CMPFAIL(7,
+ "movl %[rseq_scratch2], %[len]\n\t"
+ "movl %[rseq_scratch1], %[dst]\n\t"
+ "movl %[rseq_scratch0], %[src]\n\t",
+ error2)
+#endif
+ : /* gcc asm goto does not allow outputs */
+ : [cpu_id] "r" (cpu),
+ [rseq_offset] "r" (rseq_offset),
+ /* final store input */
+ [v] "m" (*v),
+ [expect] "m" (expect),
+ [newv] "m" (newv),
+ /* try memcpy input */
+ [dst] "r" (dst),
+ [src] "r" (src),
+ [len] "r" (len),
+ [rseq_scratch0] "m" (rseq_scratch[0]),
+ [rseq_scratch1] "m" (rseq_scratch[1]),
+ [rseq_scratch2] "m" (rseq_scratch[2])
+ : "memory", "cc", "eax"
+ RSEQ_INJECT_CLOBBER
+ : abort, cmpfail
+#ifdef RSEQ_COMPARE_TWICE
+ , error1, error2
+#endif
+ );
+ rseq_after_asm_goto();
+ return 0;
+abort:
+ rseq_after_asm_goto();
+ RSEQ_INJECT_FAILED
+ return -1;
+cmpfail:
+ rseq_after_asm_goto();
+ return 1;
+#ifdef RSEQ_COMPARE_TWICE
+error1:
+ rseq_after_asm_goto();
+ rseq_bug("cpu_id comparison failed");
+error2:
+ rseq_after_asm_goto();
+ rseq_bug("expected value comparison failed");
+#endif
+}
+
+#endif /* #if (defined(RSEQ_TEMPLATE_MO_RELAXED) || defined(RSEQ_TEMPLATE_MO_RELEASE)) &&
+ (defined(RSEQ_TEMPLATE_CPU_ID) || defined(RSEQ_TEMPLATE_VM_VCPU_ID)) */
+
+#endif
+
+#include "rseq-bits-reset.h"
diff --git a/tools/testing/selftests/rseq/rseq-x86.h b/tools/testing/selftests/rseq/rseq-x86.h
index e148dfb2f68a..a526a6ad3a81 100644
--- a/tools/testing/selftests/rseq/rseq-x86.h
+++ b/tools/testing/selftests/rseq/rseq-x86.h
@@ -2,9 +2,13 @@
/*
* rseq-x86.h
*
- * (C) Copyright 2016-2018 - Mathieu Desnoyers <[email protected]>
+ * (C) Copyright 2016-2022 - Mathieu Desnoyers <[email protected]>
*/

+#ifndef RSEQ_H
+#error "Never use <rseq-x86.h> directly; include <rseq.h> instead."
+#endif
+
#include <stdint.h>

/*
@@ -22,9 +26,10 @@
* address through a "r" input operand.
*/

-/* Offset of cpu_id and rseq_cs fields in struct rseq. */
+/* Offset of cpu_id, rseq_cs, and vm_vcpu_id fields in struct rseq. */
#define RSEQ_CPU_ID_OFFSET 4
#define RSEQ_CS_OFFSET 8
+#define RSEQ_VM_VCPU_ID_OFFSET 24

#ifdef __x86_64__

@@ -108,523 +113,6 @@ do { \
"jmp %l[" __rseq_str(cmpfail_label) "]\n\t" \
".popsection\n\t"

-static inline __attribute__((always_inline))
-int rseq_cmpeqv_storev(intptr_t *v, intptr_t expect, intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "cmpq %[v], %[expect]\n\t"
- "jnz %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
- "cmpq %[v], %[expect]\n\t"
- "jnz %l[error2]\n\t"
-#endif
- /* final store */
- "movq %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(5)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- : "memory", "cc", "rax"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-/*
- * Compare @v against @expectnot. When it does _not_ match, load @v
- * into @load, and store the content of *@v + voffp into @v.
- */
-static inline __attribute__((always_inline))
-int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot,
- long voffp, intptr_t *load, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "movq %[v], %%rbx\n\t"
- "cmpq %%rbx, %[expectnot]\n\t"
- "je %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
- "movq %[v], %%rbx\n\t"
- "cmpq %%rbx, %[expectnot]\n\t"
- "je %l[error2]\n\t"
-#endif
- "movq %%rbx, %[load]\n\t"
- "addq %[voffp], %%rbx\n\t"
- "movq (%%rbx), %%rbx\n\t"
- /* final store */
- "movq %%rbx, %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(5)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* final store input */
- [v] "m" (*v),
- [expectnot] "r" (expectnot),
- [voffp] "er" (voffp),
- [load] "m" (*load)
- : "memory", "cc", "rax", "rbx"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_addv(intptr_t *v, intptr_t count, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
-#endif
- /* final store */
- "addq %[count], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(4)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* final store input */
- [v] "m" (*v),
- [count] "er" (count)
- : "memory", "cc", "rax"
- RSEQ_INJECT_CLOBBER
- : abort
-#ifdef RSEQ_COMPARE_TWICE
- , error1
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-#endif
-}
-
-#define RSEQ_ARCH_HAS_OFFSET_DEREF_ADDV
-
-/*
- * pval = *(ptr+off)
- * *pval += inc;
- */
-static inline __attribute__((always_inline))
-int rseq_offset_deref_addv(intptr_t *ptr, long off, intptr_t inc, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
-#endif
- /* get p+v */
- "movq %[ptr], %%rbx\n\t"
- "addq %[off], %%rbx\n\t"
- /* get pv */
- "movq (%%rbx), %%rcx\n\t"
- /* *pv += inc */
- "addq %[inc], (%%rcx)\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(4)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* final store input */
- [ptr] "m" (*ptr),
- [off] "er" (off),
- [inc] "er" (inc)
- : "memory", "cc", "rax", "rbx", "rcx"
- RSEQ_INJECT_CLOBBER
- : abort
-#ifdef RSEQ_COMPARE_TWICE
- , error1
-#endif
- );
- return 0;
-abort:
- RSEQ_INJECT_FAILED
- return -1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_bug("cpu_id comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trystorev_storev(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t newv2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "cmpq %[v], %[expect]\n\t"
- "jnz %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
- "cmpq %[v], %[expect]\n\t"
- "jnz %l[error2]\n\t"
-#endif
- /* try store */
- "movq %[newv2], %[v2]\n\t"
- RSEQ_INJECT_ASM(5)
- /* final store */
- "movq %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* try store input */
- [v2] "m" (*v2),
- [newv2] "r" (newv2),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- : "memory", "cc", "rax"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-/* x86-64 is TSO. */
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trystorev_storev_release(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t newv2,
- intptr_t newv, int cpu)
-{
- return rseq_cmpeqv_trystorev_storev(v, expect, v2, newv2, newv, cpu);
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_cmpeqv_storev(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t expect2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error3])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "cmpq %[v], %[expect]\n\t"
- "jnz %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
- "cmpq %[v2], %[expect2]\n\t"
- "jnz %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(5)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
- "cmpq %[v], %[expect]\n\t"
- "jnz %l[error2]\n\t"
- "cmpq %[v2], %[expect2]\n\t"
- "jnz %l[error3]\n\t"
-#endif
- /* final store */
- "movq %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* cmp2 input */
- [v2] "m" (*v2),
- [expect2] "r" (expect2),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- : "memory", "cc", "rax"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2, error3
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("1st expected value comparison failed");
-error3:
- rseq_after_asm_goto();
- rseq_bug("2nd expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trymemcpy_storev(intptr_t *v, intptr_t expect,
- void *dst, void *src, size_t len,
- intptr_t newv, int cpu)
-{
- uint64_t rseq_scratch[3];
-
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- "movq %[src], %[rseq_scratch0]\n\t"
- "movq %[dst], %[rseq_scratch1]\n\t"
- "movq %[len], %[rseq_scratch2]\n\t"
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "cmpq %[v], %[expect]\n\t"
- "jnz 5f\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 6f)
- "cmpq %[v], %[expect]\n\t"
- "jnz 7f\n\t"
-#endif
- /* try memcpy */
- "test %[len], %[len]\n\t" \
- "jz 333f\n\t" \
- "222:\n\t" \
- "movb (%[src]), %%al\n\t" \
- "movb %%al, (%[dst])\n\t" \
- "inc %[src]\n\t" \
- "inc %[dst]\n\t" \
- "dec %[len]\n\t" \
- "jnz 222b\n\t" \
- "333:\n\t" \
- RSEQ_INJECT_ASM(5)
- /* final store */
- "movq %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- /* teardown */
- "movq %[rseq_scratch2], %[len]\n\t"
- "movq %[rseq_scratch1], %[dst]\n\t"
- "movq %[rseq_scratch0], %[src]\n\t"
- RSEQ_ASM_DEFINE_ABORT(4,
- "movq %[rseq_scratch2], %[len]\n\t"
- "movq %[rseq_scratch1], %[dst]\n\t"
- "movq %[rseq_scratch0], %[src]\n\t",
- abort)
- RSEQ_ASM_DEFINE_CMPFAIL(5,
- "movq %[rseq_scratch2], %[len]\n\t"
- "movq %[rseq_scratch1], %[dst]\n\t"
- "movq %[rseq_scratch0], %[src]\n\t",
- cmpfail)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_CMPFAIL(6,
- "movq %[rseq_scratch2], %[len]\n\t"
- "movq %[rseq_scratch1], %[dst]\n\t"
- "movq %[rseq_scratch0], %[src]\n\t",
- error1)
- RSEQ_ASM_DEFINE_CMPFAIL(7,
- "movq %[rseq_scratch2], %[len]\n\t"
- "movq %[rseq_scratch1], %[dst]\n\t"
- "movq %[rseq_scratch0], %[src]\n\t",
- error2)
-#endif
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv),
- /* try memcpy input */
- [dst] "r" (dst),
- [src] "r" (src),
- [len] "r" (len),
- [rseq_scratch0] "m" (rseq_scratch[0]),
- [rseq_scratch1] "m" (rseq_scratch[1]),
- [rseq_scratch2] "m" (rseq_scratch[2])
- : "memory", "cc", "rax"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-/* x86-64 is TSO. */
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trymemcpy_storev_release(intptr_t *v, intptr_t expect,
- void *dst, void *src, size_t len,
- intptr_t newv, int cpu)
-{
- return rseq_cmpeqv_trymemcpy_storev(v, expect, dst, src, len,
- newv, cpu);
-}
-
#elif defined(__i386__)

#define RSEQ_ASM_TP_SEGMENT %%gs
@@ -711,643 +199,36 @@ do { \
"jmp %l[" __rseq_str(cmpfail_label) "]\n\t" \
".popsection\n\t"

-static inline __attribute__((always_inline))
-int rseq_cmpeqv_storev(intptr_t *v, intptr_t expect, intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "cmpl %[v], %[expect]\n\t"
- "jnz %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
- "cmpl %[v], %[expect]\n\t"
- "jnz %l[error2]\n\t"
-#endif
- /* final store */
- "movl %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(5)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- : "memory", "cc", "eax"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-/*
- * Compare @v against @expectnot. When it does _not_ match, load @v
- * into @load, and store the content of *@v + voffp into @v.
- */
-static inline __attribute__((always_inline))
-int rseq_cmpnev_storeoffp_load(intptr_t *v, intptr_t expectnot,
- long voffp, intptr_t *load, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "movl %[v], %%ebx\n\t"
- "cmpl %%ebx, %[expectnot]\n\t"
- "je %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
- "movl %[v], %%ebx\n\t"
- "cmpl %%ebx, %[expectnot]\n\t"
- "je %l[error2]\n\t"
-#endif
- "movl %%ebx, %[load]\n\t"
- "addl %[voffp], %%ebx\n\t"
- "movl (%%ebx), %%ebx\n\t"
- /* final store */
- "movl %%ebx, %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(5)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* final store input */
- [v] "m" (*v),
- [expectnot] "r" (expectnot),
- [voffp] "ir" (voffp),
- [load] "m" (*load)
- : "memory", "cc", "eax", "ebx"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_addv(intptr_t *v, intptr_t count, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
-#endif
- /* final store */
- "addl %[count], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(4)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* final store input */
- [v] "m" (*v),
- [count] "ir" (count)
- : "memory", "cc", "eax"
- RSEQ_INJECT_CLOBBER
- : abort
-#ifdef RSEQ_COMPARE_TWICE
- , error1
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trystorev_storev(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t newv2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "cmpl %[v], %[expect]\n\t"
- "jnz %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
- "cmpl %[v], %[expect]\n\t"
- "jnz %l[error2]\n\t"
-#endif
- /* try store */
- "movl %[newv2], %%eax\n\t"
- "movl %%eax, %[v2]\n\t"
- RSEQ_INJECT_ASM(5)
- /* final store */
- "movl %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* try store input */
- [v2] "m" (*v2),
- [newv2] "m" (newv2),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "r" (newv)
- : "memory", "cc", "eax"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trystorev_storev_release(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t newv2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
-
- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "movl %[expect], %%eax\n\t"
- "cmpl %[v], %%eax\n\t"
- "jnz %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
- "movl %[expect], %%eax\n\t"
- "cmpl %[v], %%eax\n\t"
- "jnz %l[error2]\n\t"
-#endif
- /* try store */
- "movl %[newv2], %[v2]\n\t"
- RSEQ_INJECT_ASM(5)
- "lock; addl $0,-128(%%esp)\n\t"
- /* final store */
- "movl %[newv], %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* try store input */
- [v2] "m" (*v2),
- [newv2] "r" (newv2),
- /* final store input */
- [v] "m" (*v),
- [expect] "m" (expect),
- [newv] "r" (newv)
- : "memory", "cc", "eax"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
#endif

-}
+/* Per-cpu-id indexing. */

-static inline __attribute__((always_inline))
-int rseq_cmpeqv_cmpeqv_storev(intptr_t *v, intptr_t expect,
- intptr_t *v2, intptr_t expect2,
- intptr_t newv, int cpu)
-{
- RSEQ_INJECT_C(9)
+#define RSEQ_TEMPLATE_CPU_ID
+#define RSEQ_TEMPLATE_MO_RELAXED
+#include "rseq-x86-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELAXED

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error3])
-#endif
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "cmpl %[v], %[expect]\n\t"
- "jnz %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(4)
- "cmpl %[expect2], %[v2]\n\t"
- "jnz %l[cmpfail]\n\t"
- RSEQ_INJECT_ASM(5)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), %l[error1])
- "cmpl %[v], %[expect]\n\t"
- "jnz %l[error2]\n\t"
- "cmpl %[expect2], %[v2]\n\t"
- "jnz %l[error3]\n\t"
-#endif
- "movl %[newv], %%eax\n\t"
- /* final store */
- "movl %%eax, %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- RSEQ_ASM_DEFINE_ABORT(4, "", abort)
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* cmp2 input */
- [v2] "m" (*v2),
- [expect2] "r" (expect2),
- /* final store input */
- [v] "m" (*v),
- [expect] "r" (expect),
- [newv] "m" (newv)
- : "memory", "cc", "eax"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2, error3
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("1st expected value comparison failed");
-error3:
- rseq_after_asm_goto();
- rseq_bug("2nd expected value comparison failed");
-#endif
-}
+#define RSEQ_TEMPLATE_MO_RELEASE
+#include "rseq-x86-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELEASE
+#undef RSEQ_TEMPLATE_CPU_ID

-/* TODO: implement a faster memcpy. */
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trymemcpy_storev(intptr_t *v, intptr_t expect,
- void *dst, void *src, size_t len,
- intptr_t newv, int cpu)
-{
- uint32_t rseq_scratch[3];
+/* Per-vm-vcpu-id indexing. */

- RSEQ_INJECT_C(9)
+#define RSEQ_TEMPLATE_VM_VCPU_ID
+#define RSEQ_TEMPLATE_MO_RELAXED
+#include "rseq-x86-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELAXED

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- "movl %[src], %[rseq_scratch0]\n\t"
- "movl %[dst], %[rseq_scratch1]\n\t"
- "movl %[len], %[rseq_scratch2]\n\t"
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "movl %[expect], %%eax\n\t"
- "cmpl %%eax, %[v]\n\t"
- "jnz 5f\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 6f)
- "movl %[expect], %%eax\n\t"
- "cmpl %%eax, %[v]\n\t"
- "jnz 7f\n\t"
-#endif
- /* try memcpy */
- "test %[len], %[len]\n\t" \
- "jz 333f\n\t" \
- "222:\n\t" \
- "movb (%[src]), %%al\n\t" \
- "movb %%al, (%[dst])\n\t" \
- "inc %[src]\n\t" \
- "inc %[dst]\n\t" \
- "dec %[len]\n\t" \
- "jnz 222b\n\t" \
- "333:\n\t" \
- RSEQ_INJECT_ASM(5)
- "movl %[newv], %%eax\n\t"
- /* final store */
- "movl %%eax, %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- /* teardown */
- "movl %[rseq_scratch2], %[len]\n\t"
- "movl %[rseq_scratch1], %[dst]\n\t"
- "movl %[rseq_scratch0], %[src]\n\t"
- RSEQ_ASM_DEFINE_ABORT(4,
- "movl %[rseq_scratch2], %[len]\n\t"
- "movl %[rseq_scratch1], %[dst]\n\t"
- "movl %[rseq_scratch0], %[src]\n\t",
- abort)
- RSEQ_ASM_DEFINE_CMPFAIL(5,
- "movl %[rseq_scratch2], %[len]\n\t"
- "movl %[rseq_scratch1], %[dst]\n\t"
- "movl %[rseq_scratch0], %[src]\n\t",
- cmpfail)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_CMPFAIL(6,
- "movl %[rseq_scratch2], %[len]\n\t"
- "movl %[rseq_scratch1], %[dst]\n\t"
- "movl %[rseq_scratch0], %[src]\n\t",
- error1)
- RSEQ_ASM_DEFINE_CMPFAIL(7,
- "movl %[rseq_scratch2], %[len]\n\t"
- "movl %[rseq_scratch1], %[dst]\n\t"
- "movl %[rseq_scratch0], %[src]\n\t",
- error2)
-#endif
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* final store input */
- [v] "m" (*v),
- [expect] "m" (expect),
- [newv] "m" (newv),
- /* try memcpy input */
- [dst] "r" (dst),
- [src] "r" (src),
- [len] "r" (len),
- [rseq_scratch0] "m" (rseq_scratch[0]),
- [rseq_scratch1] "m" (rseq_scratch[1]),
- [rseq_scratch2] "m" (rseq_scratch[2])
- : "memory", "cc", "eax"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
-
-/* TODO: implement a faster memcpy. */
-static inline __attribute__((always_inline))
-int rseq_cmpeqv_trymemcpy_storev_release(intptr_t *v, intptr_t expect,
- void *dst, void *src, size_t len,
- intptr_t newv, int cpu)
-{
- uint32_t rseq_scratch[3];
-
- RSEQ_INJECT_C(9)
+#define RSEQ_TEMPLATE_MO_RELEASE
+#include "rseq-x86-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELEASE
+#undef RSEQ_TEMPLATE_VM_VCPU_ID

- __asm__ __volatile__ goto (
- RSEQ_ASM_DEFINE_TABLE(3, 1f, 2f, 4f) /* start, commit, abort */
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[cmpfail])
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error1])
- RSEQ_ASM_DEFINE_EXIT_POINT(1f, %l[error2])
-#endif
- "movl %[src], %[rseq_scratch0]\n\t"
- "movl %[dst], %[rseq_scratch1]\n\t"
- "movl %[len], %[rseq_scratch2]\n\t"
- /* Start rseq by storing table entry pointer into rseq_cs. */
- RSEQ_ASM_STORE_RSEQ_CS(1, 3b, RSEQ_ASM_TP_SEGMENT:RSEQ_CS_OFFSET(%[rseq_offset]))
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 4f)
- RSEQ_INJECT_ASM(3)
- "movl %[expect], %%eax\n\t"
- "cmpl %%eax, %[v]\n\t"
- "jnz 5f\n\t"
- RSEQ_INJECT_ASM(4)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_CMP_CPU_ID(cpu_id, RSEQ_ASM_TP_SEGMENT:RSEQ_CPU_ID_OFFSET(%[rseq_offset]), 6f)
- "movl %[expect], %%eax\n\t"
- "cmpl %%eax, %[v]\n\t"
- "jnz 7f\n\t"
-#endif
- /* try memcpy */
- "test %[len], %[len]\n\t" \
- "jz 333f\n\t" \
- "222:\n\t" \
- "movb (%[src]), %%al\n\t" \
- "movb %%al, (%[dst])\n\t" \
- "inc %[src]\n\t" \
- "inc %[dst]\n\t" \
- "dec %[len]\n\t" \
- "jnz 222b\n\t" \
- "333:\n\t" \
- RSEQ_INJECT_ASM(5)
- "lock; addl $0,-128(%%esp)\n\t"
- "movl %[newv], %%eax\n\t"
- /* final store */
- "movl %%eax, %[v]\n\t"
- "2:\n\t"
- RSEQ_INJECT_ASM(6)
- /* teardown */
- "movl %[rseq_scratch2], %[len]\n\t"
- "movl %[rseq_scratch1], %[dst]\n\t"
- "movl %[rseq_scratch0], %[src]\n\t"
- RSEQ_ASM_DEFINE_ABORT(4,
- "movl %[rseq_scratch2], %[len]\n\t"
- "movl %[rseq_scratch1], %[dst]\n\t"
- "movl %[rseq_scratch0], %[src]\n\t",
- abort)
- RSEQ_ASM_DEFINE_CMPFAIL(5,
- "movl %[rseq_scratch2], %[len]\n\t"
- "movl %[rseq_scratch1], %[dst]\n\t"
- "movl %[rseq_scratch0], %[src]\n\t",
- cmpfail)
-#ifdef RSEQ_COMPARE_TWICE
- RSEQ_ASM_DEFINE_CMPFAIL(6,
- "movl %[rseq_scratch2], %[len]\n\t"
- "movl %[rseq_scratch1], %[dst]\n\t"
- "movl %[rseq_scratch0], %[src]\n\t",
- error1)
- RSEQ_ASM_DEFINE_CMPFAIL(7,
- "movl %[rseq_scratch2], %[len]\n\t"
- "movl %[rseq_scratch1], %[dst]\n\t"
- "movl %[rseq_scratch0], %[src]\n\t",
- error2)
-#endif
- : /* gcc asm goto does not allow outputs */
- : [cpu_id] "r" (cpu),
- [rseq_offset] "r" (rseq_offset),
- /* final store input */
- [v] "m" (*v),
- [expect] "m" (expect),
- [newv] "m" (newv),
- /* try memcpy input */
- [dst] "r" (dst),
- [src] "r" (src),
- [len] "r" (len),
- [rseq_scratch0] "m" (rseq_scratch[0]),
- [rseq_scratch1] "m" (rseq_scratch[1]),
- [rseq_scratch2] "m" (rseq_scratch[2])
- : "memory", "cc", "eax"
- RSEQ_INJECT_CLOBBER
- : abort, cmpfail
-#ifdef RSEQ_COMPARE_TWICE
- , error1, error2
-#endif
- );
- rseq_after_asm_goto();
- return 0;
-abort:
- rseq_after_asm_goto();
- RSEQ_INJECT_FAILED
- return -1;
-cmpfail:
- rseq_after_asm_goto();
- return 1;
-#ifdef RSEQ_COMPARE_TWICE
-error1:
- rseq_after_asm_goto();
- rseq_bug("cpu_id comparison failed");
-error2:
- rseq_after_asm_goto();
- rseq_bug("expected value comparison failed");
-#endif
-}
+/* APIs which are not based on cpu ids. */

-#endif
+#define RSEQ_TEMPLATE_CPU_ID_NONE
+#define RSEQ_TEMPLATE_MO_RELAXED
+#include "rseq-x86-bits.h"
+#undef RSEQ_TEMPLATE_MO_RELAXED
+#undef RSEQ_TEMPLATE_CPU_ID_NONE
diff --git a/tools/testing/selftests/rseq/rseq.h b/tools/testing/selftests/rseq/rseq.h
index 003e0e3750ce..95a76a1c3b27 100644
--- a/tools/testing/selftests/rseq/rseq.h
+++ b/tools/testing/selftests/rseq/rseq.h
@@ -74,6 +74,20 @@ extern unsigned int rseq_flags;
*/
extern unsigned int rseq_feature_size;

+enum rseq_mo {
+ RSEQ_MO_RELAXED = 0,
+ RSEQ_MO_CONSUME = 1, /* Unused */
+ RSEQ_MO_ACQUIRE = 2, /* Unused */
+ RSEQ_MO_RELEASE = 3,
+ RSEQ_MO_ACQ_REL = 4, /* Unused */
+ RSEQ_MO_SEQ_CST = 5, /* Unused */
+};
+
+enum rseq_percpu_mode {
+ RSEQ_PERCPU_CPU_ID = 0,
+ RSEQ_PERCPU_VM_VCPU_ID = 1,
+};
+
static inline struct rseq_abi *rseq_get_abi(void)
{
return (struct rseq_abi *) ((uintptr_t) rseq_thread_pointer() + rseq_offset);
@@ -222,4 +236,149 @@ static inline void rseq_prepare_unload(void)
rseq_clear_rseq_cs();
}

+static inline __attribute__((always_inline))
+int rseq_cmpeqv_storev(enum rseq_mo rseq_mo, enum rseq_percpu_mode percpu_mode,
+ intptr_t *v, intptr_t expect,
+ intptr_t newv, int cpu)
+{
+ if (rseq_mo != RSEQ_MO_RELAXED)
+ return -1;
+ switch (percpu_mode) {
+ case RSEQ_PERCPU_CPU_ID:
+ return rseq_cmpeqv_storev_relaxed_cpu_id(v, expect, newv, cpu);
+ case RSEQ_PERCPU_VM_VCPU_ID:
+ return rseq_cmpeqv_storev_relaxed_vm_vcpu_id(v, expect, newv, cpu);
+ }
+ return -1;
+}
+
+/*
+ * Compare @v against @expectnot. When it does _not_ match, load @v
+ * into @load, and store the content of *@v + voffp into @v.
+ */
+static inline __attribute__((always_inline))
+int rseq_cmpnev_storeoffp_load(enum rseq_mo rseq_mo, enum rseq_percpu_mode percpu_mode,
+ intptr_t *v, intptr_t expectnot, long voffp, intptr_t *load,
+ int cpu)
+{
+ if (rseq_mo != RSEQ_MO_RELAXED)
+ return -1;
+ switch (percpu_mode) {
+ case RSEQ_PERCPU_CPU_ID:
+ return rseq_cmpnev_storeoffp_load_relaxed_cpu_id(v, expectnot, voffp, load, cpu);
+ case RSEQ_PERCPU_VM_VCPU_ID:
+ return rseq_cmpnev_storeoffp_load_relaxed_vm_vcpu_id(v, expectnot, voffp, load, cpu);
+ }
+ return -1;
+}
+
+static inline __attribute__((always_inline))
+int rseq_addv(enum rseq_mo rseq_mo, enum rseq_percpu_mode percpu_mode,
+ intptr_t *v, intptr_t count, int cpu)
+{
+ if (rseq_mo != RSEQ_MO_RELAXED)
+ return -1;
+ switch (percpu_mode) {
+ case RSEQ_PERCPU_CPU_ID:
+ return rseq_addv_relaxed_cpu_id(v, count, cpu);
+ case RSEQ_PERCPU_VM_VCPU_ID:
+ return rseq_addv_relaxed_vm_vcpu_id(v, count, cpu);
+ }
+ return -1;
+}
+
+#ifdef RSEQ_ARCH_HAS_OFFSET_DEREF_ADDV
+/*
+ * pval = *(ptr+off)
+ * *pval += inc;
+ */
+static inline __attribute__((always_inline))
+int rseq_offset_deref_addv(enum rseq_mo rseq_mo, enum rseq_percpu_mode percpu_mode,
+ intptr_t *ptr, long off, intptr_t inc, int cpu)
+{
+ if (rseq_mo != RSEQ_MO_RELAXED)
+ return -1;
+ switch (percpu_mode) {
+ case RSEQ_PERCPU_CPU_ID:
+ return rseq_offset_deref_addv_relaxed_cpu_id(ptr, off, inc, cpu);
+ case RSEQ_PERCPU_VM_VCPU_ID:
+ return rseq_offset_deref_addv_relaxed_vm_vcpu_id(ptr, off, inc, cpu);
+ }
+ return -1;
+}
+#endif
+
+static inline __attribute__((always_inline))
+int rseq_cmpeqv_trystorev_storev(enum rseq_mo rseq_mo, enum rseq_percpu_mode percpu_mode,
+ intptr_t *v, intptr_t expect,
+ intptr_t *v2, intptr_t newv2,
+ intptr_t newv, int cpu)
+{
+ switch (rseq_mo) {
+ case RSEQ_MO_RELAXED:
+ switch (percpu_mode) {
+ case RSEQ_PERCPU_CPU_ID:
+ return rseq_cmpeqv_trystorev_storev_relaxed_cpu_id(v, expect, v2, newv2, newv, cpu);
+ case RSEQ_PERCPU_VM_VCPU_ID:
+ return rseq_cmpeqv_trystorev_storev_relaxed_vm_vcpu_id(v, expect, v2, newv2, newv, cpu);
+ }
+ return -1;
+ case RSEQ_MO_RELEASE:
+ switch (percpu_mode) {
+ case RSEQ_PERCPU_CPU_ID:
+ return rseq_cmpeqv_trystorev_storev_release_cpu_id(v, expect, v2, newv2, newv, cpu);
+ case RSEQ_PERCPU_VM_VCPU_ID:
+ return rseq_cmpeqv_trystorev_storev_release_vm_vcpu_id(v, expect, v2, newv2, newv, cpu);
+ }
+ return -1;
+ default:
+ return -1;
+ }
+}
+
+static inline __attribute__((always_inline))
+int rseq_cmpeqv_cmpeqv_storev(enum rseq_mo rseq_mo, enum rseq_percpu_mode percpu_mode,
+ intptr_t *v, intptr_t expect,
+ intptr_t *v2, intptr_t expect2,
+ intptr_t newv, int cpu)
+{
+ if (rseq_mo != RSEQ_MO_RELAXED)
+ return -1;
+ switch (percpu_mode) {
+ case RSEQ_PERCPU_CPU_ID:
+ return rseq_cmpeqv_cmpeqv_storev_relaxed_cpu_id(v, expect, v2, expect2, newv, cpu);
+ case RSEQ_PERCPU_VM_VCPU_ID:
+ return rseq_cmpeqv_cmpeqv_storev_relaxed_vm_vcpu_id(v, expect, v2, expect2, newv, cpu);
+ }
+ return -1;
+}
+
+static inline __attribute__((always_inline))
+int rseq_cmpeqv_trymemcpy_storev(enum rseq_mo rseq_mo, enum rseq_percpu_mode percpu_mode,
+ intptr_t *v, intptr_t expect,
+ void *dst, void *src, size_t len,
+ intptr_t newv, int cpu)
+{
+ switch (rseq_mo) {
+ case RSEQ_MO_RELAXED:
+ switch (percpu_mode) {
+ case RSEQ_PERCPU_CPU_ID:
+ return rseq_cmpeqv_trymemcpy_storev_relaxed_cpu_id(v, expect, dst, src, len, newv, cpu);
+ case RSEQ_PERCPU_VM_VCPU_ID:
+ return rseq_cmpeqv_trymemcpy_storev_relaxed_vm_vcpu_id(v, expect, dst, src, len, newv, cpu);
+ }
+ return -1;
+ case RSEQ_MO_RELEASE:
+ switch (percpu_mode) {
+ case RSEQ_PERCPU_CPU_ID:
+ return rseq_cmpeqv_trymemcpy_storev_release_cpu_id(v, expect, dst, src, len, newv, cpu);
+ case RSEQ_PERCPU_VM_VCPU_ID:
+ return rseq_cmpeqv_trymemcpy_storev_release_vm_vcpu_id(v, expect, dst, src, len, newv, cpu);
+ }
+ return -1;
+ default:
+ return -1;
+ }
+}
+
#endif /* RSEQ_H_ */
--
2.25.1

2022-09-22 15:33:09

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v4 25/25] tracing/rseq: Add mm_vcpu_id field to rseq_update

Hi Mathieu,

I love your patch! Yet something to improve:

[auto build test ERROR on rostedt-trace/for-next]
[cannot apply to shuah-kselftest/next tip/sched/core kees/for-next/execve linus/master v6.0-rc6 next-20220921]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Mathieu-Desnoyers/RSEQ-node-id-and-virtual-cpu-id-extensions/20220922-191315
base: https://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git for-next
config: i386-randconfig-a001
compiler: gcc-11 (Debian 11.3.0-5) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/1e36316a4b14e773e904e677149f9f757057fd90
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Mathieu-Desnoyers/RSEQ-node-id-and-virtual-cpu-id-extensions/20220922-191315
git checkout 1e36316a4b14e773e904e677149f9f757057fd90
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from include/trace/define_trace.h:102,
from include/trace/events/rseq.h:62,
from kernel/rseq.c:19:
include/trace/events/rseq.h: In function 'trace_event_raw_event_rseq_update':
>> include/trace/events/rseq.h:26:40: error: 'struct task_struct' has no member named 'mm_vcpu'
26 | __entry->mm_vcpu_id = t->mm_vcpu;
| ^~
include/trace/trace_events.h:402:11: note: in definition of macro 'DECLARE_EVENT_CLASS'
402 | { assign; } \
| ^~~~~~
include/trace/trace_events.h:44:30: note: in expansion of macro 'PARAMS'
44 | PARAMS(assign), \
| ^~~~~~
include/trace/events/rseq.h:11:1: note: in expansion of macro 'TRACE_EVENT'
11 | TRACE_EVENT(rseq_update,
| ^~~~~~~~~~~
include/trace/events/rseq.h:23:9: note: in expansion of macro 'TP_fast_assign'
23 | TP_fast_assign(
| ^~~~~~~~~~~~~~
In file included from include/trace/define_trace.h:103,
from include/trace/events/rseq.h:62,
from kernel/rseq.c:19:
include/trace/events/rseq.h: In function 'perf_trace_rseq_update':
>> include/trace/events/rseq.h:26:40: error: 'struct task_struct' has no member named 'mm_vcpu'
26 | __entry->mm_vcpu_id = t->mm_vcpu;
| ^~
include/trace/perf.h:89:11: note: in definition of macro 'DECLARE_EVENT_CLASS'
89 | { assign; } \
| ^~~~~~
include/trace/trace_events.h:44:30: note: in expansion of macro 'PARAMS'
44 | PARAMS(assign), \
| ^~~~~~
include/trace/events/rseq.h:11:1: note: in expansion of macro 'TRACE_EVENT'
11 | TRACE_EVENT(rseq_update,
| ^~~~~~~~~~~
include/trace/events/rseq.h:23:9: note: in expansion of macro 'TP_fast_assign'
23 | TP_fast_assign(
| ^~~~~~~~~~~~~~


vim +26 include/trace/events/rseq.h

12
13 TP_PROTO(struct task_struct *t),
14
15 TP_ARGS(t),
16
17 TP_STRUCT__entry(
18 __field(s32, cpu_id)
19 __field(s32, node_id)
20 __field(s32, mm_vcpu_id)
21 ),
22
23 TP_fast_assign(
24 __entry->cpu_id = raw_smp_processor_id();
25 __entry->node_id = cpu_to_node(raw_smp_processor_id());
> 26 __entry->mm_vcpu_id = t->mm_vcpu;
27 ),
28
29 TP_printk("cpu_id=%d node_id=%d mm_vcpu_id=%d", __entry->cpu_id,
30 __entry->node_id, __entry->mm_vcpu_id)
31 );
32

--
0-DAY CI Kernel Test Service
https://01.org/lkp


Attachments:
(No filename) (4.32 kB)
config (152.95 kB)
Download all attachments

2022-09-22 16:33:43

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v4.1 25/25] tracing/rseq: Add mm_vcpu_id field to rseq_update

Add the mm_vcpu_id field to the rseq_update event, allowing tracers to
follow which vcpu_id is observed by user-space, and whether negative
vcpu_id values are visible in case of internal scheduler implementation
issues.

Signed-off-by: Mathieu Desnoyers <[email protected]>
---
Changes since v4:
- use task_mm_vcpu_id() to get the mm_vcpu_id from the task struct.
---
include/trace/events/rseq.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/rseq.h b/include/trace/events/rseq.h
index 6bd442697354..cef5a91a9ba0 100644
--- a/include/trace/events/rseq.h
+++ b/include/trace/events/rseq.h
@@ -17,14 +17,17 @@ TRACE_EVENT(rseq_update,
TP_STRUCT__entry(
__field(s32, cpu_id)
__field(s32, node_id)
+ __field(s32, mm_vcpu_id)
),

TP_fast_assign(
__entry->cpu_id = raw_smp_processor_id();
__entry->node_id = cpu_to_node(raw_smp_processor_id());
+ __entry->mm_vcpu_id = task_mm_vcpu_id(t);
),

- TP_printk("cpu_id=%d node_id=%d", __entry->cpu_id, __entry->node_id)
+ TP_printk("cpu_id=%d node_id=%d mm_vcpu_id=%d", __entry->cpu_id,
+ __entry->node_id, __entry->mm_vcpu_id)
);

TRACE_EVENT(rseq_ip_fixup,
--
2.25.1

2022-09-23 10:07:19

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v4 25/25] tracing/rseq: Add mm_vcpu_id field to rseq_update

Hi Mathieu,

I love your patch! Yet something to improve:

[auto build test ERROR on rostedt-trace/for-next]
[cannot apply to shuah-kselftest/next tip/sched/core kees/for-next/execve linus/master v6.0-rc6 next-20220921]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Mathieu-Desnoyers/RSEQ-node-id-and-virtual-cpu-id-extensions/20220922-191315
base: https://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git for-next
config: x86_64-randconfig-a001 (https://download.01.org/0day-ci/archive/20220923/[email protected]/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/1e36316a4b14e773e904e677149f9f757057fd90
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Mathieu-Desnoyers/RSEQ-node-id-and-virtual-cpu-id-extensions/20220922-191315
git checkout 1e36316a4b14e773e904e677149f9f757057fd90
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from kernel/rseq.c:19:
In file included from include/trace/events/rseq.h:62:
In file included from include/trace/define_trace.h:102:
In file included from include/trace/trace_events.h:419:
>> include/trace/events/rseq.h:26:28: error: no member named 'mm_vcpu' in 'struct task_struct'
__entry->mm_vcpu_id = t->mm_vcpu;
~ ^
include/trace/stages/stage6_event_callback.h:112:33: note: expanded from macro 'TP_fast_assign'
#define TP_fast_assign(args...) args
^~~~
include/trace/trace_events.h:44:16: note: expanded from macro 'TRACE_EVENT'
PARAMS(assign), \
^~~~~~
include/linux/tracepoint.h:107:25: note: expanded from macro 'PARAMS'
#define PARAMS(args...) args
^~~~
include/trace/trace_events.h:402:4: note: expanded from macro 'DECLARE_EVENT_CLASS'
{ assign; } \
^~~~~~
In file included from kernel/rseq.c:19:
In file included from include/trace/events/rseq.h:62:
In file included from include/trace/define_trace.h:103:
In file included from include/trace/perf.h:113:
>> include/trace/events/rseq.h:26:28: error: no member named 'mm_vcpu' in 'struct task_struct'
__entry->mm_vcpu_id = t->mm_vcpu;
~ ^
include/trace/stages/stage6_event_callback.h:112:33: note: expanded from macro 'TP_fast_assign'
#define TP_fast_assign(args...) args
^~~~
include/trace/trace_events.h:44:16: note: expanded from macro 'TRACE_EVENT'
PARAMS(assign), \
^~~~~~
include/linux/tracepoint.h:107:25: note: expanded from macro 'PARAMS'
#define PARAMS(args...) args
^~~~
include/trace/perf.h:89:4: note: expanded from macro 'DECLARE_EVENT_CLASS'
{ assign; } \
^~~~~~
2 errors generated.


vim +26 include/trace/events/rseq.h

12
13 TP_PROTO(struct task_struct *t),
14
15 TP_ARGS(t),
16
17 TP_STRUCT__entry(
18 __field(s32, cpu_id)
19 __field(s32, node_id)
20 __field(s32, mm_vcpu_id)
21 ),
22
23 TP_fast_assign(
24 __entry->cpu_id = raw_smp_processor_id();
25 __entry->node_id = cpu_to_node(raw_smp_processor_id());
> 26 __entry->mm_vcpu_id = t->mm_vcpu;
27 ),
28
29 TP_printk("cpu_id=%d node_id=%d mm_vcpu_id=%d", __entry->cpu_id,
30 __entry->node_id, __entry->mm_vcpu_id)
31 );
32

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-09-23 13:54:49

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH v4 00/25] RSEQ node id and virtual cpu id extensions

On 2022-09-22 16:10, Chris Kennelly wrote:
> Hi,
>
> I still need to update the code in TCMalloc to cooperate with the new
> glibc ABI/convention.  One concern I have is that it looks like I might
> need to add a extra memory dereference (or two) to get the early
> initialized offsets provided by glibc folded into the read of the cpu_id
> field.

If you have a concrete example of this, I'd be happy to help and perhaps
we can improve your usage pattern.

>
> I think I can avoid this by using %gs to point to the address of the
> cpu_id field itself (which I think could be used to select between vCPUs
> or not*), but %gs is a global piece of state that all of the libraries
> in the program need to cooperate on.

I think what we are all looking for here is a scheme that would allow us
the fastest per-vcpu data structure accesses possible from userspace.

I think we could do something similar to what is done in the Linux
kernel for that, but in userspace. Here are some random ideas I have on
this topic:

We could introduce a new prctl(2) PT_{SET,GET}_GS_MODE on x86-64. This
would take as arguments the indexing mode and offset multiplier we want
to be applied to the GS segment selector on return to userspace:

enum gs_index_mode {
GS_INDEX_MODE_MM_VCPU,
};

struct prctl_set_gs_mode {
enum gs_index_mode index_mode;
u64 stride;
};

For a memory space which has this gs mode set, the return to userspace
code would populate the GS segment selector register with:

stride * current->mm_vcpu_id

The "stride" would be the virtual address space size allowed for
per-vcpu-data. This could be decided by the libc, with a tunable
allowing to increase/decrease this size. Another libc tunable could
disable populating the GS segment selector altogether (e.g. for
compatibility with applications like Wine which AFAIK use it).

With this in place, I hope we could then do per-vcpu data access by
simply prefixing memory access instructions with a %%gs: segment
selector prefix.

Thoughts ?

Thanks,

Mathieu


>
> Thanks,
> Chris
>
> * TCMalloc is already paying a load+pointer arithmetic to select between
> cpu_id versus vcpu_id, so this would actually make things a little bit
> faster.
>
> On Thu, Sep 22, 2022 at 3:21 PM Mathieu Desnoyers
> <[email protected] <mailto:[email protected]>>
> wrote:
>
> Hi Chris,
>
> Sorry it looks like I forgot to CC you on this series. If you can give
> it a spin with tcmalloc I would be very much interested in the result.
>
> Thanks,
>
> Mathieu
>


--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

2022-09-27 08:50:32

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v4 06/25] lib: Invert _find_next_bit source arguments

Hi Mathieu,

I love your patch! Perhaps something to improve:

[auto build test WARNING on rostedt-trace/for-next]
[cannot apply to shuah-kselftest/next tip/sched/core kees/for-next/execve linus/master v6.0-rc7 next-20220923]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Mathieu-Desnoyers/RSEQ-node-id-and-virtual-cpu-id-extensions/20220922-191315
base: https://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git for-next
config: loongarch-randconfig-c023-20220925
compiler: loongarch64-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/d91697c3ffe8e93dffb44ee588ab2764e0b14b88
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Mathieu-Desnoyers/RSEQ-node-id-and-virtual-cpu-id-extensions/20220922-191315
git checkout d91697c3ffe8e93dffb44ee588ab2764e0b14b88
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=loongarch SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>, old ones prefixed by <<):

>> WARNING: modpost: vmlinux.o(.text.unlikely+0xe440): Section mismatch in reference from the variable L0 to the variable .init.data:acpi_masked_gpes_map
The function L0() references
the variable __initdata acpi_masked_gpes_map.
This is often because L0 lacks a __initdata
annotation or the annotation of acpi_masked_gpes_map is wrong.

Note: the below error/warnings can be found in parent commit:
<< WARNING: modpost: vmlinux.o(___ksymtab+init_numa_memory+0x0): Section mismatch in reference from the variable __ksymtab_init_numa_memory to the function .init.text:init_numa_memory()

--
0-DAY CI Kernel Test Service
https://01.org/lkp


Attachments:
(No filename) (2.28 kB)
config (150.44 kB)
Download all attachments

2022-10-10 13:06:02

by Florian Weimer

[permalink] [raw]
Subject: Re: [PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries

* Mathieu Desnoyers:

> Export the rseq feature size supported by the kernel as well as the
> required allocation alignment for the rseq per-thread area to user-space
> through ELF auxiliary vector entries.
>
> This is part of the extensible rseq ABI.
>
> Signed-off-by: Mathieu Desnoyers <[email protected]>
> ---
> fs/binfmt_elf.c | 5 +++++
> include/uapi/linux/auxvec.h | 2 ++
> include/uapi/linux/rseq.h | 5 +++++
> 3 files changed, 12 insertions(+)
>
> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
> index 63c7ebb0da89..04fca1e4cbd2 100644
> --- a/fs/binfmt_elf.c
> +++ b/fs/binfmt_elf.c
> @@ -46,6 +46,7 @@
> #include <linux/cred.h>
> #include <linux/dax.h>
> #include <linux/uaccess.h>
> +#include <linux/rseq.h>
> #include <asm/param.h>
> #include <asm/page.h>
>
> @@ -288,6 +289,10 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
> if (bprm->have_execfd) {
> NEW_AUX_ENT(AT_EXECFD, bprm->execfd);
> }
> +#ifdef CONFIG_RSEQ
> + NEW_AUX_ENT(AT_RSEQ_FEATURE_SIZE, offsetof(struct rseq, end));
> + NEW_AUX_ENT(AT_RSEQ_ALIGN, __alignof__(struct rseq));
> +#endif
> #undef NEW_AUX_ENT
> /* AT_NULL is zero; clear the rest too */
> memset(elf_info, 0, (char *)mm->saved_auxv +
> diff --git a/include/uapi/linux/auxvec.h b/include/uapi/linux/auxvec.h
> index c7e502bf5a6f..6991c4b8ab18 100644
> --- a/include/uapi/linux/auxvec.h
> +++ b/include/uapi/linux/auxvec.h
> @@ -30,6 +30,8 @@
> * differ from AT_PLATFORM. */
> #define AT_RANDOM 25 /* address of 16 random bytes */
> #define AT_HWCAP2 26 /* extension of AT_HWCAP */
> +#define AT_RSEQ_FEATURE_SIZE 27 /* rseq supported feature size */
> +#define AT_RSEQ_ALIGN 28 /* rseq allocation alignment */
>
> #define AT_EXECFN 31 /* filename of program */

Do we need the alignment? Or can we keep it perpetually at 32? Or we
could steal some bits from AT_RSEQ_FEATURE_SIZE? (Not the lower
bits—they aren't unused due to the way the feature size works.)

Thanks,
Florian

2022-10-10 13:13:03

by Florian Weimer

[permalink] [raw]
Subject: Re: [PATCH v4 00/25] RSEQ node id and virtual cpu id extensions

* Mathieu Desnoyers:

> Extend the rseq ABI to expose a NUMA node ID and a vm_vcpu_id field.
>
> The NUMA node ID field allows implementing a faster getcpu(2) in libc.
>
> The virtual cpu id allows ideal scaling (down or up) of user-space
> per-cpu data structures. The virtual cpu ids allocated within a memory
> space are tracked by the scheduler, which takes into account the number
> of concurrently running threads, thus implicitly considering the number
> of threads, the cpu affinity, the cpusets applying to those threads, and
> the number of logical cores on the system.

Do you have some code that shows how the userspace application handshake
is supposed to work with the existing three __rseq_* symbols? Maybe I'm
missing something.

From an application perspective, it would be best to add 8 more shared
bytes in use, to push the new feature size over 32. This would be
clearly visible in __rseq_size, helping applications a lot.

Alternatively, we could sacrifice a bit to indicate that the this round
of extensions is present. But we'll need another bit to indicate that
the last remaining 4 bytes are in use, for consistency. Or come up with
something to put their today. The TID seems like an obvious choice.

If we want to the 8 more bytes route, TID and PID should be
uncontroversal? The PID cache is clearly something that userspace
likes, not just as a defeat device for the old BYTE benchmark.

Thanks,
Florian

2022-10-17 16:11:19

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH v4 00/25] RSEQ node id and virtual cpu id extensions

On 2022-10-10 09:04, Florian Weimer wrote:
> * Mathieu Desnoyers:
>
>> Extend the rseq ABI to expose a NUMA node ID and a vm_vcpu_id field.
>>
>> The NUMA node ID field allows implementing a faster getcpu(2) in libc.
>>
>> The virtual cpu id allows ideal scaling (down or up) of user-space
>> per-cpu data structures. The virtual cpu ids allocated within a memory
>> space are tracked by the scheduler, which takes into account the number
>> of concurrently running threads, thus implicitly considering the number
>> of threads, the cpu affinity, the cpusets applying to those threads, and
>> the number of logical cores on the system.
>
> Do you have some code that shows how the userspace application handshake
> is supposed to work with the existing three __rseq_* symbols? Maybe I'm
> missing something.

see https://lore.kernel.org/lkml/[email protected]/

+static
+unsigned int get_rseq_feature_size(void)
+{
+ unsigned long auxv_rseq_feature_size, auxv_rseq_align;
+
+ auxv_rseq_align = getauxval(AT_RSEQ_ALIGN);
+ assert(!auxv_rseq_align || auxv_rseq_align <= RSEQ_THREAD_AREA_ALLOC_SIZE);
+
+ auxv_rseq_feature_size = getauxval(AT_RSEQ_FEATURE_SIZE);
+ assert(!auxv_rseq_feature_size || auxv_rseq_feature_size <= RSEQ_THREAD_AREA_ALLOC_SIZE);
+ if (auxv_rseq_feature_size)
+ return auxv_rseq_feature_size;
+ else
+ return ORIG_RSEQ_FEATURE_SIZE;
+}

then in rseq_init():

+ rseq_feature_size = get_rseq_feature_size();
+ if (rseq_feature_size == ORIG_RSEQ_FEATURE_SIZE)
+ rseq_size = ORIG_RSEQ_ALLOC_SIZE;
+ else
+ rseq_size = RSEQ_THREAD_AREA_ALLOC_SIZE;

Then using it for e.g. node_id:

https://lore.kernel.org/lkml/[email protected]/

+#ifndef rseq_sizeof_field
+#define rseq_sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))
+#endif
+
+#ifndef rseq_offsetofend
+#define rseq_offsetofend(TYPE, MEMBER) \
+ (offsetof(TYPE, MEMBER) + rseq_sizeof_field(TYPE, MEMBER))
+#endif

+static inline bool rseq_node_id_available(void)
+{
+ return (int) rseq_feature_size >= rseq_offsetofend(struct rseq_abi, node_id);
+}
+
+/*
+ * Current NUMA node number.
+ */
+static inline uint32_t rseq_current_node_id(void)
+{
+ assert(rseq_node_id_available());
+ return RSEQ_ACCESS_ONCE(rseq_get_abi()->node_id);
+}

>
> From an application perspective, it would be best to add 8 more shared
> bytes in use, to push the new feature size over 32. This would be
> clearly visible in __rseq_size, helping applications a lot.

[ I guess you meant 12 bytes ]

The fool-proof approach here would be to skip the 12 bytes of padding
currently at the end of struct rseq.

Maybe this is something we should do in order to make sure the userspace
check is regular for all fields.

>
> Alternatively, we could sacrifice a bit to indicate that the this round
> of extensions is present. But we'll need another bit to indicate that
> the last remaining 4 bytes are in use, for consistency. Or come up with
> something to put their today. The TID seems like an obvious choice.

Whatever we add into those bits would need to be "special" and use
something like a flag check to validate whether the field is populated
or not. Perhaps keeping things simpler and skipping those 12 bytes
entirely is preferable.

>
> If we want to the 8 more bytes route, TID and PID should be
> uncontroversal? The PID cache is clearly something that userspace
> likes, not just as a defeat device for the old BYTE benchmark.

I agree that having the PID and TID there might be relevant, but I
would rather prefer to have all fields use a check that is regular
from the point of view of userspace. This minimizes the risk of user
errors.

Thoughts ?

Thanks,

Mathieu


>
> Thanks,
> Florian
>

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

2022-10-17 16:39:47

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries

On 2022-10-10 08:42, Florian Weimer wrote:
> * Mathieu Desnoyers:
>
>> Export the rseq feature size supported by the kernel as well as the
>> required allocation alignment for the rseq per-thread area to user-space
>> through ELF auxiliary vector entries.
>>
>> This is part of the extensible rseq ABI.
>>
>> Signed-off-by: Mathieu Desnoyers <[email protected]>
>> ---
>> fs/binfmt_elf.c | 5 +++++
>> include/uapi/linux/auxvec.h | 2 ++
>> include/uapi/linux/rseq.h | 5 +++++
>> 3 files changed, 12 insertions(+)
>>
>> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
>> index 63c7ebb0da89..04fca1e4cbd2 100644
>> --- a/fs/binfmt_elf.c
>> +++ b/fs/binfmt_elf.c
>> @@ -46,6 +46,7 @@
>> #include <linux/cred.h>
>> #include <linux/dax.h>
>> #include <linux/uaccess.h>
>> +#include <linux/rseq.h>
>> #include <asm/param.h>
>> #include <asm/page.h>
>>
>> @@ -288,6 +289,10 @@ create_elf_tables(struct linux_binprm *bprm, const struct elfhdr *exec,
>> if (bprm->have_execfd) {
>> NEW_AUX_ENT(AT_EXECFD, bprm->execfd);
>> }
>> +#ifdef CONFIG_RSEQ
>> + NEW_AUX_ENT(AT_RSEQ_FEATURE_SIZE, offsetof(struct rseq, end));
>> + NEW_AUX_ENT(AT_RSEQ_ALIGN, __alignof__(struct rseq));
>> +#endif
>> #undef NEW_AUX_ENT
>> /* AT_NULL is zero; clear the rest too */
>> memset(elf_info, 0, (char *)mm->saved_auxv +
>> diff --git a/include/uapi/linux/auxvec.h b/include/uapi/linux/auxvec.h
>> index c7e502bf5a6f..6991c4b8ab18 100644
>> --- a/include/uapi/linux/auxvec.h
>> +++ b/include/uapi/linux/auxvec.h
>> @@ -30,6 +30,8 @@
>> * differ from AT_PLATFORM. */
>> #define AT_RANDOM 25 /* address of 16 random bytes */
>> #define AT_HWCAP2 26 /* extension of AT_HWCAP */
>> +#define AT_RSEQ_FEATURE_SIZE 27 /* rseq supported feature size */
>> +#define AT_RSEQ_ALIGN 28 /* rseq allocation alignment */
>>
>> #define AT_EXECFN 31 /* filename of program */
>
> Do we need the alignment? Or can we keep it perpetually at 32? Or we
> could steal some bits from AT_RSEQ_FEATURE_SIZE? (Not the lower
> bits—they aren't unused due to the way the feature size works.)

I cannot imagine a use-case that would require us to bump the alignment
requirement over 32 bytes, so we may very well leave it at 32. But
perhaps someone else has a better imagination than mine ?

Thanks,

Mathieu

>
> Thanks,
> Florian
>

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

2022-10-17 17:37:03

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries

On 2022-10-17 12:09, Mathieu Desnoyers wrote:
> On 2022-10-10 08:42, Florian Weimer wrote:
>> * Mathieu Desnoyers:
>>
>>> Export the rseq feature size supported by the kernel as well as the
>>> required allocation alignment for the rseq per-thread area to user-space
>>> through ELF auxiliary vector entries.
>>>
>>> This is part of the extensible rseq ABI.
>>>
>>> Signed-off-by: Mathieu Desnoyers <[email protected]>
>>> ---
>>>   fs/binfmt_elf.c             | 5 +++++
>>>   include/uapi/linux/auxvec.h | 2 ++
>>>   include/uapi/linux/rseq.h   | 5 +++++
>>>   3 files changed, 12 insertions(+)
>>>
>>> diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c
>>> index 63c7ebb0da89..04fca1e4cbd2 100644
>>> --- a/fs/binfmt_elf.c
>>> +++ b/fs/binfmt_elf.c
>>> @@ -46,6 +46,7 @@
>>>   #include <linux/cred.h>
>>>   #include <linux/dax.h>
>>>   #include <linux/uaccess.h>
>>> +#include <linux/rseq.h>
>>>   #include <asm/param.h>
>>>   #include <asm/page.h>
>>> @@ -288,6 +289,10 @@ create_elf_tables(struct linux_binprm *bprm,
>>> const struct elfhdr *exec,
>>>       if (bprm->have_execfd) {
>>>           NEW_AUX_ENT(AT_EXECFD, bprm->execfd);
>>>       }
>>> +#ifdef CONFIG_RSEQ
>>> +    NEW_AUX_ENT(AT_RSEQ_FEATURE_SIZE, offsetof(struct rseq, end));
>>> +    NEW_AUX_ENT(AT_RSEQ_ALIGN, __alignof__(struct rseq));
>>> +#endif
>>>   #undef NEW_AUX_ENT
>>>       /* AT_NULL is zero; clear the rest too */
>>>       memset(elf_info, 0, (char *)mm->saved_auxv +
>>> diff --git a/include/uapi/linux/auxvec.h b/include/uapi/linux/auxvec.h
>>> index c7e502bf5a6f..6991c4b8ab18 100644
>>> --- a/include/uapi/linux/auxvec.h
>>> +++ b/include/uapi/linux/auxvec.h
>>> @@ -30,6 +30,8 @@
>>>                    * differ from AT_PLATFORM. */
>>>   #define AT_RANDOM 25    /* address of 16 random bytes */
>>>   #define AT_HWCAP2 26    /* extension of AT_HWCAP */
>>> +#define AT_RSEQ_FEATURE_SIZE    27    /* rseq supported feature size */
>>> +#define AT_RSEQ_ALIGN        28    /* rseq allocation alignment */
>>>   #define AT_EXECFN  31    /* filename of program */
>>
>> Do we need the alignment?  Or can we keep it perpetually at 32?  Or we
>> could steal some bits from AT_RSEQ_FEATURE_SIZE?  (Not the lower
>> bits—they aren't unused due to the way the feature size works.)
>
> I cannot imagine a use-case that would require us to bump the alignment
> requirement over 32 bytes, so we may very well leave it at 32. But
> perhaps someone else has a better imagination than mine ?

Actually, here is a scenario that warrants exposing the required alignment:

Note that struct rseq is *not* packed.

If we extend struct rseq to a size that makes the compiler use an
alignment larger than 32 bytes in the future, and if the compiler uses
that larger alignment knowledge to issue instructions that require the
larger alignment, then it would be incorrect for user-space to allocate
the struct rseq on an alignment lower than the required alignment.

Indeed, on rseq registration, we have the following check:

if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq))
[...]
return -EINVAL;

Which would break if the size of struct rseq is large enough that the
alignment grows larger than 32 bytes.

You mentioned we could steal some high bits from AT_RSEQ_FEATURE_SIZE to
put the alignment. What is the issue with exposing an explicit
AT_RSEQ_ALIGN ? It's just a auxv entry, so I don't see it as a huge
performance concern to access 2 entries rather than one.

Thanks,

Mathieu

>
> Thanks,
>
> Mathieu
>
>>
>> Thanks,
>> Florian
>>
>

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

2022-10-18 15:52:17

by Florian Weimer

[permalink] [raw]
Subject: Re: [PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries

* Mathieu Desnoyers:

> If we extend struct rseq to a size that makes the compiler use an
> alignment larger than 32 bytes in the future, and if the compiler uses
> that larger alignment knowledge to issue instructions that require the
> larger alignment, then it would be incorrect for user-space to
> allocate the struct rseq on an alignment lower than the required
> alignment.
>
> Indeed, on rseq registration, we have the following check:
>
> if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq))
> [...]
> return -EINVAL;
>
> Which would break if the size of struct rseq is large enough that the
> alignment grows larger than 32 bytes.

I never quite understood the reason for that check, it certainly made
the glibc implementation more complicated. But to support variable
sizes internally, we'll have to put in some extra effort anyway, so that
it won't matter much in the end. As long as the required alignment
isn't larger than the page size. 8-/

> You mentioned we could steal some high bits from AT_RSEQ_FEATURE_SIZE
> to put the alignment. What is the issue with exposing an explicit
> AT_RSEQ_ALIGN ? It's just a auxv entry, so I don't see it as a huge
> performance concern to access 2 entries rather than one.

I don't mind too much, we already have a large on-stack array in the
loader so that we can decode the auxiliary vector without a humongous
switch statement. But eventually that approach will stop working if the
set of interesting AT_* values become too large and discontinuous.

Thanks,
Florian

2022-10-18 19:07:25

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH v4 01/25] rseq: Introduce feature size and alignment ELF auxiliary vector entries

On 2022-10-18 11:34, Florian Weimer wrote:
> * Mathieu Desnoyers:
>
>> If we extend struct rseq to a size that makes the compiler use an
>> alignment larger than 32 bytes in the future, and if the compiler uses
>> that larger alignment knowledge to issue instructions that require the
>> larger alignment, then it would be incorrect for user-space to
>> allocate the struct rseq on an alignment lower than the required
>> alignment.
>>
>> Indeed, on rseq registration, we have the following check:
>>
>> if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq))
>> [...]
>> return -EINVAL;
>>
>> Which would break if the size of struct rseq is large enough that the
>> alignment grows larger than 32 bytes.
>
> I never quite understood the reason for that check, it certainly made
> the glibc implementation more complicated. But to support variable
> sizes internally, we'll have to put in some extra effort anyway, so that
> it won't matter much in the end. As long as the required alignment
> isn't larger than the page size. 8-/

I don't expect it to grow so large.

There is one more reason why increasing the alignment of struct rseq may
become useful as the structure grows: it would guarantee that it fits in
a single lower level cache line as its size increases. It's not
something I expect would break if not properly aligned, but it's a nice
optimization.

I see two possible approaches here:

1) We expose the rseq alignment explicitly through auxv, and we can keep
the IS_ALIGNED validation on rseq registration. This "IS_ALIGNED" check
would probably have to be tweaked though, because if the registered
rseq size is 32, then an alignment of 32 is all we require. It's only if
the rseq_len is different from 32 that we need to validate that the
alignment matches the alignment of struct rseq.

2) We don't expose the rseq alignment through auxv, effectively fixing
it at 32. We would need to modify the IS_ALIGNED check on rseq
registration so it validates an alignment of 32 rather than using the
alignment of struct rseq.

>
>> You mentioned we could steal some high bits from AT_RSEQ_FEATURE_SIZE
>> to put the alignment. What is the issue with exposing an explicit
>> AT_RSEQ_ALIGN ? It's just a auxv entry, so I don't see it as a huge
>> performance concern to access 2 entries rather than one.
>
> I don't mind too much, we already have a large on-stack array in the
> loader so that we can decode the auxiliary vector without a humongous
> switch statement. But eventually that approach will stop working if the
> set of interesting AT_* values become too large and discontinuous.

OK. So I guess the main question here is whether we want fixed-32-bytes
alignment, or do we want to be able to increase the mandated alignment
in the future as struct rseq expands ?

The possible reasons for increasing the alignment over 32-bytes would be:

- Unforeseen compiler requirement on a structure alignment larger than
32-bytes as we extend the size of struct rseq.
- Optimization to fit within a single LLC cache line as struct rseq grows.

Thoughts ?

Thanks,

Mathieu

>
> Thanks,
> Florian
>

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com