x86 CPUs use a TSO memory model. Apple Silicon CPUs have the ability to
selectively use a TSO memory model. This can be done by setting the
ACTLR.TSOEN bit to 1. This feature is useful for x86 emulators, since it
removes the need for emulators to insert memory barriers in order to abide
by the TSO memory model. This patch series will add ACTLR.TSOEN support to
virtualized Linux on Apple Silicon machines. Userspace will be able to use
a prctl to change the memory model of the CPU from the default ARM64 memory
model to a TSO memory model.
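For illustration, a userspace caller might look roughly like the sketch
below. The constants are copied from the include/uapi/linux/prctl.h hunk of
this series; set_mem_model() and emulator_needs_barriers() are hypothetical
helpers made up for this example and are not part of the patches. The
number 71 is only meaningful on a kernel carrying this series; other
kernels may reject it or assign it to an unrelated option.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/*
 * Values copied from this series' uapi hunk. They are an assumption on any
 * kernel that does not carry these patches.
 */
#ifndef PR_SET_MEM_MODEL
#define PR_SET_MEM_MODEL		71
#define PR_SET_MEM_MODEL_DEFAULT	0
#define PR_SET_MEM_MODEL_TSO		1
#endif

/* Hypothetical helper: returns 0 on success or a negative errno value. */
static int set_mem_model(unsigned long model)
{
	/* The patch rejects the call unless arg3..arg5 are all zero. */
	if (prctl(PR_SET_MEM_MODEL, model, 0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}

/*
 * Hypothetical helper for an emulator: try to enable TSO for the calling
 * thread and report whether the translated code still needs explicit
 * memory barriers (1) or can rely on the hardware (0).
 */
static int emulator_needs_barriers(void)
{
	int err = set_mem_model(PR_SET_MEM_MODEL_TSO);

	if (err) {
		fprintf(stderr, "TSO not available (%s), keeping barriers\n",
			strerror(-err));
		return 1;
	}
	return 0;
}
```

Since the patch records the setting in task->thread.tso, the request applies
to the calling thread, so an emulator would presumably issue it on every
thread that runs translated code.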

A simple test can be used to determine if the TSO memory model is in use.
This must be done on an Apple Silicon machine running macOS Sonoma 14.4 or
later, since earlier versions do not support modification of the TSOEN bit:

https://github.com/saagarjha/TSOEnabler/blob/master/testtso/main.c

This program will hang indefinitely if TSO is in use, and will crash almost
immediately if it is not in use.
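
To make the difference between the two models concrete, here is a minimal
message-passing sketch. It is not the linked testtso program, only an
illustration of the kind of reordering that TSO rules out, and every name in
it (count_reorderings() and so on) is made up for this example. The writer
stores data and then flag; the reader loads flag and then data. Under TSO
neither the two stores nor the two loads may be reordered, so the reader
can never see a data value older than the flag it just read. Under the
default arm64 model it can.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_ulong data, flag;
static unsigned long iterations;

/* Writer: publish data first, then flag, with no hardware barrier. */
static void *writer(void *arg)
{
	(void)arg;
	for (unsigned long i = 1; i <= iterations; i++) {
		atomic_store_explicit(&data, i, memory_order_relaxed);
		/* Compiler-only barrier: emits no fence instruction. */
		atomic_signal_fence(memory_order_seq_cst);
		atomic_store_explicit(&flag, i, memory_order_relaxed);
	}
	return NULL;
}

/*
 * Reader: load flag, then data. Returns how many times data was observed
 * to be older than flag, or -1 if the writer thread could not be started.
 */
static long count_reorderings(unsigned long n)
{
	pthread_t thread;
	unsigned long seen, value;
	long violations = 0;

	iterations = n;
	atomic_store(&data, 0);
	atomic_store(&flag, 0);

	if (pthread_create(&thread, NULL, writer, NULL))
		return -1;

	do {
		seen = atomic_load_explicit(&flag, memory_order_relaxed);
		atomic_signal_fence(memory_order_seq_cst);
		value = atomic_load_explicit(&data, memory_order_relaxed);
		if (value < seen)
			violations++;
	} while (seen < n);

	pthread_join(thread, NULL);
	return violations;
}
```

With TSO enabled, count_reorderings() should always return 0. Without it, a
non-zero count is possible but not guaranteed on any given run, so a clean
run by itself proves nothing.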

Zayd Qumsieh (3):
  tso: aarch64: allow linux kernel to read/write ACTLR.TSOEN
  tso: aarch64: context-switch tso bit on thread switch
  tso: aarch64: allow userspace to set tso bit using prctl

 arch/arm64/Kconfig                 | 19 +++++++++
 arch/arm64/include/asm/processor.h |  4 ++
 arch/arm64/include/asm/sysreg.h    |  7 ++++
 arch/arm64/include/asm/tso.h       | 19 +++++++++
 arch/arm64/kernel/Makefile         |  2 +-
 arch/arm64/kernel/process.c        | 61 +++++++++++++++++++++++++++++
 arch/arm64/kernel/tso.c            | 62 ++++++++++++++++++++++++++++++
 include/uapi/linux/prctl.h         |  9 +++++
 kernel/sys.c                       | 11 ++++++
 9 files changed, 193 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/tso.h
 create mode 100644 arch/arm64/kernel/tso.c

--
2.39.3 (Apple Git-146)
Add a new prctl to allow userspace to change the TSO bit. This is
useful for emulators that recompile x86_64 to ARM64. Until now, such
programs had to emulate TSO by hand, inserting memory barriers into the
translated code, which carries a significant performance cost. With
this change, emulators can use prctl to set the TSO bit at will and
avoid emulating TSO.

Signed-off-by: Zayd Qumsieh <[email protected]>
---
 arch/arm64/Kconfig           |  6 +++++
 arch/arm64/include/asm/tso.h |  1 +
 arch/arm64/kernel/process.c  | 52 ++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/tso.c      |  1 +
 include/uapi/linux/prctl.h   |  9 +++++++
 kernel/sys.c                 | 11 ++++++++
 6 files changed, 80 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 35162e5a0705..ecb7e1f080af 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2088,10 +2088,16 @@ config ARM64_TSO
dynamically switched between the default ARM64 memory model
and x86_64's memory model (TSO).
+ This option enables support for toggling TSO mode for
+ userspace threads.
+
Selecting this option allows the feature to be detected at
runtime. If the CPU doesn't implement TSO mode, then this
feature will be disabled.
+ Userspace threads that want to use this feature must
+ explicitly opt in via a prctl().
+
endmenu # "ARMv8.5 architectural features"
menu "ARMv8.7 architectural features"
diff --git a/arch/arm64/include/asm/tso.h b/arch/arm64/include/asm/tso.h
index 405e9a5efdf5..cf31c685b1dd 100644
--- a/arch/arm64/include/asm/tso.h
+++ b/arch/arm64/include/asm/tso.h
@@ -13,6 +13,7 @@
int modify_tso_enable(bool tso_enable);
void tso_thread_switch(struct task_struct *next);
+int arch_set_mem_model(struct task_struct *task, int memory_model);
#endif /* CONFIG_ARM64_TSO */
#endif /* __ASM_TSO_H */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 3831c1a97f79..2b0e9a5331e0 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -763,3 +763,55 @@ int arch_elf_adjust_prot(int prot, const struct arch_elf_state *state,
return prot;
}
#endif
+
+static int arch_set_mem_model_default(struct task_struct *task)
+{
+	int return_error = 0;
+
+#ifdef CONFIG_ARM64_TSO
+	int modify_tso_enable_error = modify_tso_enable(false);
+
+	if (modify_tso_enable_error == -EOPNOTSUPP)
+		// TSO is the only other memory model on arm64.
+		// If TSO is not supported, then the default memory
+		// model must already be set.
+		return_error = 0;
+	else
+		return_error = modify_tso_enable_error;
+
+	if (!return_error)
+		task->thread.tso = false;
+
+	return return_error;
+#endif
+
+	return return_error;
+}
+
+#ifdef CONFIG_ARM64_TSO
+
+static int arch_set_mem_model_tso(struct task_struct *task)
+{
+	int error = modify_tso_enable(true);
+
+	if (!error)
+		task->thread.tso = true;
+
+	return error;
+}
+
+#endif /* CONFIG_ARM64_TSO */
+
+int arch_set_mem_model(struct task_struct *task, int memory_model)
+{
+	switch (memory_model) {
+	case PR_SET_MEM_MODEL_DEFAULT:
+		return arch_set_mem_model_default(task);
+#ifdef CONFIG_ARM64_TSO
+	case PR_SET_MEM_MODEL_TSO:
+		return arch_set_mem_model_tso(task);
+#endif /* CONFIG_ARM64_TSO */
+	default:
+		return -EINVAL;
+	}
+}
diff --git a/arch/arm64/kernel/tso.c b/arch/arm64/kernel/tso.c
index 9a15d825943f..44749f1f5e10 100644
--- a/arch/arm64/kernel/tso.c
+++ b/arch/arm64/kernel/tso.c
@@ -58,4 +58,5 @@ void tso_thread_switch(struct task_struct *next)
}
}
+
#endif /* CONFIG_ARM64_TSO */
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 370ed14b1ae0..62b767e6efcf 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -1,4 +1,8 @@
/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * Copyright © 2024 Apple Inc. All rights reserved.
+ */
+
#ifndef _LINUX_PRCTL_H
#define _LINUX_PRCTL_H
@@ -306,4 +310,9 @@ struct prctl_mm_map {
# define PR_RISCV_V_VSTATE_CTRL_NEXT_MASK 0xc
# define PR_RISCV_V_VSTATE_CTRL_MASK 0x1f
+/* Set the CPU memory model */
+#define PR_SET_MEM_MODEL 71
+# define PR_SET_MEM_MODEL_DEFAULT 0
+# define PR_SET_MEM_MODEL_TSO 1
+
#endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index 8bb106a56b3a..94c18700b849 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -3,6 +3,7 @@
* linux/kernel/sys.c
*
* Copyright (C) 1991, 1992 Linus Torvalds
+ * Copyright © 2024 Apple Inc. All rights reserved.
*/
#include <linux/export.h>
@@ -2315,6 +2316,11 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which,
return -EINVAL;
}
+int __weak arch_set_mem_model(struct task_struct *task, int memory_model)
+{
+	return -EINVAL;
+}
+
#define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LOCAL_THROTTLE)
#ifdef CONFIG_ANON_VMA_NAME
@@ -2760,6 +2766,11 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	case PR_RISCV_V_GET_CONTROL:
 		error = RISCV_V_GET_CONTROL();
 		break;
+	case PR_SET_MEM_MODEL:
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+		error = arch_set_mem_model(current, arg2);
+		break;
 	default:
 		error = -EINVAL;
 		break;
--
2.39.3 (Apple Git-146)
On 2024/04/11 6:16, Zayd Qumsieh wrote:
> x86 CPUs use a TSO memory model. Apple Silicon CPUs have the ability to
> selectively use a TSO memory model. This can be done by setting the
> ACTLR.TSOEN bit to 1. This feature is useful for x86 emulators, since it
> removes the need for emulators to insert memory barriers in order to abide
> by the TSO memory model. This patch series will add ACTLR.TSOEN support to
> virtualized linux on Apple Silicon machines. Userspace will be able to use
> a prctl to change the memory model of the CPU from the default ARM64 memory
> model to a TSO memory model.
>
> A simple test can be used to determine if the TSO memory model is in use.
> This must be done on Apple Silicon MacOS Sonoma version 14.4 or later,
> since earlier versions do not support modification of the TSOEN bit.
> https://github.com/saagarjha/TSOEnabler/blob/master/testtso/main.c
>
> This program will hang indefinitely if TSO is in use, and will crash almost
> immediately if it is not in use.
Well this is unexpected, given I talked to Justin Lu at Apple back in
December and I thought our plan was to work together on the series I've
had cooking in the Asahi tree [1] for a while now, which is actually
shipping in thousands of Asahi Linux systems in production and actually
already supported by the FEX-Emu project with our ABI. You CCed 30+
people, but not me nor the [email protected] mailing list...
[1] https://github.com/AsahiLinux/linux/tree/bits/220-tso
Given that we're here now, I'll send out my series for review and see
what people think about that one.
>
> Zayd Qumsieh (3):
> tso: aarch64: allow linux kernel to read/write ACTLR.TSOEN
> tso: aarch64: context-switch tso bit on thread switch
> tso: aarch64: allow userspace to set tso bit using prctl
>
> arch/arm64/Kconfig | 19 +++++++++
> arch/arm64/include/asm/processor.h | 4 ++
> arch/arm64/include/asm/sysreg.h | 7 ++++
> arch/arm64/include/asm/tso.h | 19 +++++++++
> arch/arm64/kernel/Makefile | 2 +-
> arch/arm64/kernel/process.c | 61 +++++++++++++++++++++++++++++
> arch/arm64/kernel/tso.c | 62 ++++++++++++++++++++++++++++++
> include/uapi/linux/prctl.h | 9 +++++
> kernel/sys.c | 11 ++++++
> 9 files changed, 193 insertions(+), 1 deletion(-)
> create mode 100644 arch/arm64/include/asm/tso.h
> create mode 100644 arch/arm64/kernel/tso.c
>
- Hector
Hi Zayd,
It makes a nice change to see an apple.com address on the mailing list!
On Wed, Apr 10, 2024 at 02:16:38PM -0700, Zayd Qumsieh wrote:
> x86 CPUs use a TSO memory model. Apple Silicon CPUs have the ability to
> selectively use a TSO memory model. This can be done by setting the
> ACTLR.TSOEN bit to 1. This feature is useful for x86 emulators, since it
> removes the need for emulators to insert memory barriers in order to abide
> by the TSO memory model. This patch series will add ACTLR.TSOEN support to
> virtualized linux on Apple Silicon machines. Userspace will be able to use
> a prctl to change the memory model of the CPU from the default ARM64 memory
> model to a TSO memory model.
FWIW: I've replied on the other series from Hector:
https://lore.kernel.org/lkml/20240411132853.GA26481@willie-the-truck/T/#t
Will