2023-05-09 14:28:21

by Kristina Martsenko

Subject: [PATCH v2 00/11] arm64: Support for Armv8.8 memcpy instructions in userspace

The Armv8.8 extension adds new instructions to perform memcpy(), memset() and
memmove() operations in hardware (FEAT_MOPS). This series adds support for
using the new instructions in userspace. More information can be found in the
cover letter for v1:
https://lore.kernel.org/linux-arm-kernel/[email protected]/
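
As a rough illustration of what this enables in userspace (a sketch only, not
part of this series, and assuming a toolchain that accepts -march=armv8.8-a),
a memcpy() built on the new instructions reduces to the prologue/main/epilogue
triple:

    /* Illustrative FEAT_MOPS-based memcpy sketch */
    #include <stddef.h>

    static void *mops_memcpy(void *dst, const void *src, size_t n)
    {
            void *ret = dst;

            asm volatile("cpyp [%0]!, [%1]!, %2!\n\t"   /* prologue */
                         "cpym [%0]!, [%1]!, %2!\n\t"   /* main */
                         "cpye [%0]!, [%1]!, %2!"       /* epilogue */
                         : "+r" (dst), "+r" (src), "+r" (n)
                         :
                         : "cc", "memory");
            return ret;
    }

The CPU may take page faults part-way through and, as patch 07 below
describes, the kernel restarts the sequence from the prologue if the task
migrates to a CPU that implements the other option.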

Changes in v2:
- Removed booting.rst requirement for HCRX_EL2.MCE2
- Changed HCRX_EL2 cpucap to be STRICT_BOOT type
- Changed HCRX_EL2.SMPME to be set for the guest and unset for the host
- Moved HCRX_EL2 initialization into init_el2_state(), dropped ISB
- Simplified conditional checks in mops exception handler with XOR
- Added comments from Arm ARM into mops exception handler
- Converted cpucaps to use the new ARM64_CPUID_FIELDS() helper
- Added MOPS to hwcaps kselftest
- Improved commit messages
- Rebased onto v6.4-rc1
- v1: https://lore.kernel.org/linux-arm-kernel/[email protected]/


Kristina Martsenko (11):
KVM: arm64: initialize HCRX_EL2
arm64: cpufeature: detect FEAT_HCX
KVM: arm64: switch HCRX_EL2 between host and guest
arm64: mops: document boot requirements for MOPS
arm64: mops: don't disable host MOPS instructions from EL2
KVM: arm64: hide MOPS from guests
arm64: mops: handle MOPS exceptions
arm64: mops: handle single stepping after MOPS exception
arm64: mops: detect and enable FEAT_MOPS
arm64: mops: allow disabling MOPS from the kernel command line
kselftest/arm64: add MOPS to hwcap test

.../admin-guide/kernel-parameters.txt | 3 +
Documentation/arm64/booting.rst | 6 ++
Documentation/arm64/cpu-feature-registers.rst | 2 +
Documentation/arm64/elf_hwcaps.rst | 3 +
arch/arm64/include/asm/el2_setup.h | 18 +++---
arch/arm64/include/asm/esr.h | 11 +++-
arch/arm64/include/asm/exception.h | 1 +
arch/arm64/include/asm/hwcap.h | 1 +
arch/arm64/include/asm/kvm_arm.h | 4 ++
arch/arm64/include/uapi/asm/hwcap.h | 1 +
arch/arm64/kernel/cpufeature.c | 23 ++++++++
arch/arm64/kernel/cpuinfo.c | 1 +
arch/arm64/kernel/entry-common.c | 11 ++++
arch/arm64/kernel/idreg-override.c | 2 +
arch/arm64/kernel/traps.c | 58 +++++++++++++++++++
arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ++
arch/arm64/kvm/sys_regs.c | 1 +
arch/arm64/tools/cpucaps | 2 +
tools/testing/selftests/arm64/abi/hwcap.c | 22 +++++++
19 files changed, 167 insertions(+), 9 deletions(-)


base-commit: ac9a78681b921877518763ba0e89202254349d1b
--
2.25.1


2023-05-09 14:31:38

by Kristina Martsenko

Subject: [PATCH v2 07/11] arm64: mops: handle MOPS exceptions

The memory copy/set instructions added as part of FEAT_MOPS can take an
exception (e.g. page fault) part-way through their execution and resume
execution afterwards.

If however the task is re-scheduled and execution resumes on a different
CPU, then the CPU may take a new type of exception to indicate this.
This is because the architecture allows two options (Option A and Option
B) to implement the instructions and a heterogeneous system can have
different implementations between CPUs.

In this case the OS has to reset the registers and restart execution
from the prologue instruction. The algorithm for doing this is provided
as part of the Arm ARM.
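
As an illustration with made-up numbers, consider a forward SET* of 0x100
bytes at address 0x4000 whose registers are in the Option A format (the
destination register parked at the end of the buffer, the size register
holding a negated count of remaining bytes). If the exception is taken after
0x40 bytes have been set, the registers contain:

    Xd = 0x4100    (end of the buffer)
    Xn = -0xc0     (remaining bytes, negated)

and the handler rewrites them into valid prologue arguments for the remaining
region:

    Xd = Xd + Xn = 0x4040
    Xn = -Xn     = 0xc0

before moving the PC back to the prologue instruction so that execution
restarts there on the new CPU.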

Add an exception handler for the new exception and wire it up for
userspace tasks.

Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Kristina Martsenko <[email protected]>
---
arch/arm64/include/asm/esr.h | 11 ++++++-
arch/arm64/include/asm/exception.h | 1 +
arch/arm64/kernel/entry-common.c | 11 +++++++
arch/arm64/kernel/traps.c | 52 ++++++++++++++++++++++++++++++
4 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 8487aec9b658..ca954f566861 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -47,7 +47,7 @@
#define ESR_ELx_EC_DABT_LOW (0x24)
#define ESR_ELx_EC_DABT_CUR (0x25)
#define ESR_ELx_EC_SP_ALIGN (0x26)
-/* Unallocated EC: 0x27 */
+#define ESR_ELx_EC_MOPS (0x27)
#define ESR_ELx_EC_FP_EXC32 (0x28)
/* Unallocated EC: 0x29 - 0x2B */
#define ESR_ELx_EC_FP_EXC64 (0x2C)
@@ -356,6 +356,15 @@
#define ESR_ELx_SME_ISS_ZA_DISABLED 3
#define ESR_ELx_SME_ISS_ZT_DISABLED 4

+/* ISS field definitions for MOPS exceptions */
+#define ESR_ELx_MOPS_ISS_MEM_INST (UL(1) << 24)
+#define ESR_ELx_MOPS_ISS_FROM_EPILOGUE (UL(1) << 18)
+#define ESR_ELx_MOPS_ISS_WRONG_OPTION (UL(1) << 17)
+#define ESR_ELx_MOPS_ISS_OPTION_A (UL(1) << 16)
+#define ESR_ELx_MOPS_ISS_DESTREG(esr) (((esr) & (UL(0x1f) << 10)) >> 10)
+#define ESR_ELx_MOPS_ISS_SRCREG(esr) (((esr) & (UL(0x1f) << 5)) >> 5)
+#define ESR_ELx_MOPS_ISS_SIZEREG(esr) (((esr) & (UL(0x1f) << 0)) >> 0)
+
#ifndef __ASSEMBLY__
#include <asm/types.h>

diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
index e73af709cb7a..72e83af0135f 100644
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -77,6 +77,7 @@ void do_el0_svc(struct pt_regs *regs);
void do_el0_svc_compat(struct pt_regs *regs);
void do_el0_fpac(struct pt_regs *regs, unsigned long esr);
void do_el1_fpac(struct pt_regs *regs, unsigned long esr);
+void do_el0_mops(struct pt_regs *regs, unsigned long esr);
void do_serror(struct pt_regs *regs, unsigned long esr);
void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags);

diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
index 3af3c01c93a6..a8ec174e5b0e 100644
--- a/arch/arm64/kernel/entry-common.c
+++ b/arch/arm64/kernel/entry-common.c
@@ -611,6 +611,14 @@ static void noinstr el0_bti(struct pt_regs *regs)
exit_to_user_mode(regs);
}

+static void noinstr el0_mops(struct pt_regs *regs, unsigned long esr)
+{
+ enter_from_user_mode(regs);
+ local_daif_restore(DAIF_PROCCTX);
+ do_el0_mops(regs, esr);
+ exit_to_user_mode(regs);
+}
+
static void noinstr el0_inv(struct pt_regs *regs, unsigned long esr)
{
enter_from_user_mode(regs);
@@ -688,6 +696,9 @@ asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
case ESR_ELx_EC_BTI:
el0_bti(regs);
break;
+ case ESR_ELx_EC_MOPS:
+ el0_mops(regs, esr);
+ break;
case ESR_ELx_EC_BREAKPT_LOW:
case ESR_ELx_EC_SOFTSTP_LOW:
case ESR_ELx_EC_WATCHPT_LOW:
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 4bb1b8f47298..32dc692bffd3 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -514,6 +514,57 @@ void do_el1_fpac(struct pt_regs *regs, unsigned long esr)
die("Oops - FPAC", regs, esr);
}

+void do_el0_mops(struct pt_regs *regs, unsigned long esr)
+{
+ bool wrong_option = esr & ESR_ELx_MOPS_ISS_WRONG_OPTION;
+ bool option_a = esr & ESR_ELx_MOPS_ISS_OPTION_A;
+ int dstreg = ESR_ELx_MOPS_ISS_DESTREG(esr);
+ int srcreg = ESR_ELx_MOPS_ISS_SRCREG(esr);
+ int sizereg = ESR_ELx_MOPS_ISS_SIZEREG(esr);
+ unsigned long dst, src, size;
+
+ dst = pt_regs_read_reg(regs, dstreg);
+ src = pt_regs_read_reg(regs, srcreg);
+ size = pt_regs_read_reg(regs, sizereg);
+
+ /*
+ * Put the registers back in the original format suitable for a
+ * prologue instruction, using the generic return routine from the
+ * Arm ARM (DDI 0487I.a) rules CNTMJ and MWFQH.
+ */
+ if (esr & ESR_ELx_MOPS_ISS_MEM_INST) {
+ /* SET* instruction */
+ if (option_a ^ wrong_option) {
+ /* Format is from Option A; forward set */
+ pt_regs_write_reg(regs, dstreg, dst + size);
+ pt_regs_write_reg(regs, sizereg, -size);
+ }
+ } else {
+ /* CPY* instruction */
+ if (!(option_a ^ wrong_option)) {
+ /* Format is from Option B */
+ if (regs->pstate & PSR_N_BIT) {
+ /* Backward copy */
+ pt_regs_write_reg(regs, dstreg, dst - size);
+ pt_regs_write_reg(regs, srcreg, src - size);
+ }
+ } else {
+ /* Format is from Option A */
+ if (size & BIT(63)) {
+ /* Forward copy */
+ pt_regs_write_reg(regs, dstreg, dst + size);
+ pt_regs_write_reg(regs, srcreg, src + size);
+ pt_regs_write_reg(regs, sizereg, -size);
+ }
+ }
+ }
+
+ if (esr & ESR_ELx_MOPS_ISS_FROM_EPILOGUE)
+ regs->pc -= 8;
+ else
+ regs->pc -= 4;
+}
+
#define __user_cache_maint(insn, address, res) \
if (address >= TASK_SIZE_MAX) { \
res = -EFAULT; \
@@ -824,6 +875,7 @@ static const char *esr_class_str[] = {
[ESR_ELx_EC_DABT_LOW] = "DABT (lower EL)",
[ESR_ELx_EC_DABT_CUR] = "DABT (current EL)",
[ESR_ELx_EC_SP_ALIGN] = "SP Alignment",
+ [ESR_ELx_EC_MOPS] = "MOPS",
[ESR_ELx_EC_FP_EXC32] = "FP (AArch32)",
[ESR_ELx_EC_FP_EXC64] = "FP (AArch64)",
[ESR_ELx_EC_SERROR] = "SError",
--
2.25.1

2023-05-09 14:49:29

by Kristina Martsenko

Subject: [PATCH v2 04/11] arm64: mops: document boot requirements for MOPS

FEAT_MOPS introduces new instructions; we require that these
instructions not execute as UNDEFINED when we identify that the feature
is supported.

Signed-off-by: Kristina Martsenko <[email protected]>
---
Documentation/arm64/booting.rst | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
index ffeccdd6bdac..b3bbf330ed0a 100644
--- a/Documentation/arm64/booting.rst
+++ b/Documentation/arm64/booting.rst
@@ -379,6 +379,12 @@ Before jumping into the kernel, the following conditions must be met:

- SMCR_EL2.EZT0 (bit 30) must be initialised to 0b1.

+ For CPUs with Memory Copy and Memory Set instructions (FEAT_MOPS):
+
+ - If the kernel is entered at EL1 and EL2 is present:
+
+ - HCRX_EL2.MSCEn (bit 11) must be initialised to 0b1.
+
The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs. All CPUs must
enter the kernel in the same exception level. Where the values documented
--
2.25.1

2023-05-09 14:52:19

by Kristina Martsenko

Subject: [PATCH v2 03/11] KVM: arm64: switch HCRX_EL2 between host and guest

Switch the HCRX_EL2 register between host and guest configurations, in
order to enable different features in the host and guest.

Now that there are separate guest flags, we can also remove SMPME from
the host flags, as SMPME is used for virtualizing SME priorities and has
no use in the host.

Signed-off-by: Kristina Martsenko <[email protected]>
---
arch/arm64/include/asm/kvm_arm.h | 3 ++-
arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ++++++
2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index fb7fe28b8eb8..7bb2fbddda54 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -93,7 +93,8 @@
#define HCR_HOST_NVHE_PROTECTED_FLAGS (HCR_HOST_NVHE_FLAGS | HCR_TSC)
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)

-#define HCRX_HOST_FLAGS (HCRX_EL2_SMPME)
+#define HCRX_GUEST_FLAGS (HCRX_EL2_SMPME)
+#define HCRX_HOST_FLAGS 0

/* TCR_EL2 Registers bits */
#define TCR_EL2_RES1 ((1U << 31) | (1 << 23))
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index c41166f1a1dd..8f95bcbe6cdf 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -130,6 +130,9 @@ static inline void ___activate_traps(struct kvm_vcpu *vcpu)

if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN) && (hcr & HCR_VSE))
write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2);
+
+ if (cpus_have_final_cap(ARM64_HAS_HCX))
+ write_sysreg_s(HCRX_GUEST_FLAGS, SYS_HCRX_EL2);
}

static inline void ___deactivate_traps(struct kvm_vcpu *vcpu)
@@ -144,6 +147,9 @@ static inline void ___deactivate_traps(struct kvm_vcpu *vcpu)
vcpu->arch.hcr_el2 &= ~HCR_VSE;
vcpu->arch.hcr_el2 |= read_sysreg(hcr_el2) & HCR_VSE;
}
+
+ if (cpus_have_final_cap(ARM64_HAS_HCX))
+ write_sysreg_s(HCRX_HOST_FLAGS, SYS_HCRX_EL2);
}

static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
--
2.25.1

2023-05-09 14:55:33

by Kristina Martsenko

Subject: [PATCH v2 02/11] arm64: cpufeature: detect FEAT_HCX

Detect if the system has the new HCRX_EL2 register added in ARMv8.7/9.2,
so that subsequent patches can check for its presence.

KVM currently relies on the register being present on all CPUs (or
none), so the kernel will panic if that is not the case. Fortunately no
such systems currently exist, but this can be revisited if they appear.
Note that the kernel will not panic if CONFIG_KVM is disabled.

Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Kristina Martsenko <[email protected]>
---
arch/arm64/kernel/cpufeature.c | 8 ++++++++
arch/arm64/tools/cpucaps | 1 +
2 files changed, 9 insertions(+)

diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 7d7128c65161..9898ad77b1db 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -364,6 +364,7 @@ static const struct arm64_ftr_bits ftr_id_aa64mmfr0[] = {
static const struct arm64_ftr_bits ftr_id_aa64mmfr1[] = {
ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_EL1_TIDCP1_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_EL1_AFP_SHIFT, 4, 0),
+ ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_EL1_HCX_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_EL1_ETS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_EL1_TWED_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR1_EL1_XNX_SHIFT, 4, 0),
@@ -2309,6 +2310,13 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.matches = is_kvm_protected_mode,
},
+ {
+ .desc = "HCRX_EL2 register",
+ .capability = ARM64_HAS_HCX,
+ .type = ARM64_CPUCAP_STRICT_BOOT_CPU_FEATURE,
+ .matches = has_cpuid_feature,
+ ARM64_CPUID_FIELDS(ID_AA64MMFR1_EL1, HCX, IMP)
+ },
#endif
{
.desc = "Kernel page table isolation (KPTI)",
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 40ba95472594..e1de10fa080e 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -32,6 +32,7 @@ HAS_GENERIC_AUTH_IMP_DEF
HAS_GIC_CPUIF_SYSREGS
HAS_GIC_PRIO_MASKING
HAS_GIC_PRIO_RELAXED_SYNC
+HAS_HCX
HAS_LDAPR
HAS_LSE_ATOMICS
HAS_NESTED_VIRT
--
2.25.1

2023-05-09 15:03:57

by Kristina Martsenko

Subject: [PATCH v2 01/11] KVM: arm64: initialize HCRX_EL2

ARMv8.7/9.2 adds a new hypervisor configuration register HCRX_EL2.
Initialize the register to a safe value (all fields 0), to be robust
against firmware that has not initialized it. This is also needed to
ensure that the register is reinitialized after a kexec by a future
kernel.

In addition, move SMPME setup over to the new flags, as it would
otherwise get overridden. It is safe to set the bit even if SME is not
(uniformly) supported, as it will write to a RES0 bit (having no
effect), and SME will be disabled by the cpufeature framework.
(Similar to how e.g. the API bit is handled in HCR_HOST_NVHE_FLAGS.)

Signed-off-by: Kristina Martsenko <[email protected]>
---
arch/arm64/include/asm/el2_setup.h | 18 ++++++++++--------
arch/arm64/include/asm/kvm_arm.h | 3 +++
2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index 037724b19c5c..0201577863ca 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -22,6 +22,15 @@
isb
.endm

+.macro __init_el2_hcrx
+ mrs x0, id_aa64mmfr1_el1
+ ubfx x0, x0, #ID_AA64MMFR1_EL1_HCX_SHIFT, #4
+ cbz x0, .Lskip_hcrx_\@
+ mov_q x0, HCRX_HOST_FLAGS
+ msr_s SYS_HCRX_EL2, x0
+.Lskip_hcrx_\@:
+.endm
+
/*
* Allow Non-secure EL1 and EL0 to access physical timer and counter.
* This is not necessary for VHE, since the host kernel runs in EL2,
@@ -184,6 +193,7 @@
*/
.macro init_el2_state
__init_el2_sctlr
+ __init_el2_hcrx
__init_el2_timers
__init_el2_debug
__init_el2_lor
@@ -284,14 +294,6 @@
cbz x1, .Lskip_sme_\@

msr_s SYS_SMPRIMAP_EL2, xzr // Make all priorities equal
-
- mrs x1, id_aa64mmfr1_el1 // HCRX_EL2 present?
- ubfx x1, x1, #ID_AA64MMFR1_EL1_HCX_SHIFT, #4
- cbz x1, .Lskip_sme_\@
-
- mrs_s x1, SYS_HCRX_EL2
- orr x1, x1, #HCRX_EL2_SMPME_MASK // Enable priority mapping
- msr_s SYS_HCRX_EL2, x1
.Lskip_sme_\@:
.endm

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index baef29fcbeee..fb7fe28b8eb8 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -9,6 +9,7 @@

#include <asm/esr.h>
#include <asm/memory.h>
+#include <asm/sysreg.h>
#include <asm/types.h>

/* Hyp Configuration Register (HCR) bits */
@@ -92,6 +93,8 @@
#define HCR_HOST_NVHE_PROTECTED_FLAGS (HCR_HOST_NVHE_FLAGS | HCR_TSC)
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)

+#define HCRX_HOST_FLAGS (HCRX_EL2_SMPME)
+
/* TCR_EL2 Registers bits */
#define TCR_EL2_RES1 ((1U << 31) | (1 << 23))
#define TCR_EL2_TBI (1 << 20)
--
2.25.1

2023-05-09 15:06:41

by Kristina Martsenko

Subject: [PATCH v2 06/11] KVM: arm64: hide MOPS from guests

As FEAT_MOPS is not supported in guests yet, hide it from the ID
registers for guests.

The MOPS instructions are UNDEFINED in guests as HCRX_EL2.MSCEn is not
set in HCRX_GUEST_FLAGS, and will take an exception to EL1 if executed.

Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Kristina Martsenko <[email protected]>
---
arch/arm64/kvm/sys_regs.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 71b12094d613..6dae7fe91cfa 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1252,6 +1252,7 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu, struct sys_reg_desc const *r
ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_GPA3));
if (!cpus_have_final_cap(ARM64_HAS_WFXT))
val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_WFxT);
+ val &= ~ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_MOPS);
break;
case SYS_ID_AA64DFR0_EL1:
/* Limit debug to ARMv8.0 */
--
2.25.1

2023-05-09 15:07:52

by Kristina Martsenko

Subject: [PATCH v2 05/11] arm64: mops: don't disable host MOPS instructions from EL2

To allow nVHE host EL0 and EL1 to use FEAT_MOPS instructions, configure
EL2 to not cause these instructions to be treated as UNDEFINED. A VHE
host is unaffected by this control.

Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Kristina Martsenko <[email protected]>
---
arch/arm64/include/asm/kvm_arm.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 7bb2fbddda54..d2d4f4cd12b8 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -94,7 +94,7 @@
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)

#define HCRX_GUEST_FLAGS (HCRX_EL2_SMPME)
-#define HCRX_HOST_FLAGS 0
+#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn)

/* TCR_EL2 Registers bits */
#define TCR_EL2_RES1 ((1U << 31) | (1 << 23))
--
2.25.1

2023-05-09 15:08:16

by Kristina Martsenko

Subject: [PATCH v2 11/11] kselftest/arm64: add MOPS to hwcap test

Add the MOPS hwcap to the hwcap kselftest and check that a SIGILL is not
generated when the feature is detected. A SIGILL is reliable when the
feature is not detected as SCTLR_EL1.MSCEn won't have been set.

Signed-off-by: Kristina Martsenko <[email protected]>
---
tools/testing/selftests/arm64/abi/hwcap.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c
index 93333a90bf3a..d4ad813fed10 100644
--- a/tools/testing/selftests/arm64/abi/hwcap.c
+++ b/tools/testing/selftests/arm64/abi/hwcap.c
@@ -39,6 +39,20 @@ static void cssc_sigill(void)
asm volatile(".inst 0xdac01c00" : : : "x0");
}

+static void mops_sigill(void)
+{
+ char dst[1], src[1];
+ register char *dstp asm ("x0") = dst;
+ register char *srcp asm ("x1") = src;
+ register long size asm ("x2") = 1;
+
+ /* CPYP [x0]!, [x1]!, x2! */
+ asm volatile(".inst 0x1d010440"
+ : "+r" (dstp), "+r" (srcp), "+r" (size)
+ :
+ : "cc", "memory");
+}
+
static void rng_sigill(void)
{
asm volatile("mrs x0, S3_3_C2_C4_0" : : : "x0");
@@ -209,6 +223,14 @@ static const struct hwcap_data {
.cpuinfo = "cssc",
.sigill_fn = cssc_sigill,
},
+ {
+ .name = "MOPS",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_MOPS,
+ .cpuinfo = "mops",
+ .sigill_fn = mops_sigill,
+ .sigill_reliable = true,
+ },
{
.name = "RNG",
.at_hwcap = AT_HWCAP2,
--
2.25.1

2023-05-09 15:13:58

by Kristina Martsenko

Subject: [PATCH v2 08/11] arm64: mops: handle single stepping after MOPS exception

When a MOPS main or epilogue instruction is being executed, the task may
get scheduled on a different CPU and restart execution from the prologue
instruction. If the main or epilogue instruction is being single stepped
then it makes sense to finish the step and take the step exception
before starting to execute the next (prologue) instruction. So
fast-forward the single step state machine when taking a MOPS exception.

This means that if a main or epilogue instruction is single stepped with
ptrace, the debugger will sometimes observe the PC moving back to the
prologue instruction. (As already mentioned, this should be rare as it
only happens when the task is scheduled to another CPU during the step.)

This also ensures that perf breakpoints count prologue instructions
consistently (i.e. every time they are executed), rather than skipping
them when there also happens to be a breakpoint on a main or epilogue
instruction.

Acked-by: Catalin Marinas <[email protected]>
Signed-off-by: Kristina Martsenko <[email protected]>
---
arch/arm64/kernel/traps.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 32dc692bffd3..4363e3b53a81 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -563,6 +563,12 @@ void do_el0_mops(struct pt_regs *regs, unsigned long esr)
regs->pc -= 8;
else
regs->pc -= 4;
+
+ /*
+ * If single stepping then finish the step before executing the
+ * prologue instruction.
+ */
+ user_fastforward_single_step(current);
}

#define __user_cache_maint(insn, address, res) \
--
2.25.1

2023-05-09 15:17:00

by Kristina Martsenko

Subject: [PATCH v2 09/11] arm64: mops: detect and enable FEAT_MOPS

The Arm v8.8/9.3 FEAT_MOPS feature provides new instructions that
perform a memory copy or set. Wire up the cpufeature code to detect the
presence of FEAT_MOPS and enable it.
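
With this in place, userspace can gate its use of the instructions on the new
hwcap. A minimal check could look like the following (illustrative only;
HWCAP2_MOPS is the bit added by this patch):

    #include <elf.h>
    #include <stdio.h>
    #include <sys/auxv.h>

    #ifndef HWCAP2_MOPS
    #define HWCAP2_MOPS    (1UL << 43)    /* matches this series */
    #endif

    int main(void)
    {
            if (getauxval(AT_HWCAP2) & HWCAP2_MOPS)
                    printf("FEAT_MOPS usable from EL0\n");
            else
                    printf("FEAT_MOPS not available\n");
            return 0;
    }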

Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Kristina Martsenko <[email protected]>
---
Documentation/arm64/cpu-feature-registers.rst | 2 ++
Documentation/arm64/elf_hwcaps.rst | 3 +++
arch/arm64/include/asm/hwcap.h | 1 +
arch/arm64/include/uapi/asm/hwcap.h | 1 +
arch/arm64/kernel/cpufeature.c | 15 +++++++++++++++
arch/arm64/kernel/cpuinfo.c | 1 +
arch/arm64/tools/cpucaps | 1 +
7 files changed, 24 insertions(+)

diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
index c7adc7897df6..4e4625f2455f 100644
--- a/Documentation/arm64/cpu-feature-registers.rst
+++ b/Documentation/arm64/cpu-feature-registers.rst
@@ -288,6 +288,8 @@ infrastructure:
+------------------------------+---------+---------+
| Name | bits | visible |
+------------------------------+---------+---------+
+ | MOPS | [19-16] | y |
+ +------------------------------+---------+---------+
| RPRES | [7-4] | y |
+------------------------------+---------+---------+
| WFXT | [3-0] | y |
diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
index 83e57e4d38e2..8f847d0dcf57 100644
--- a/Documentation/arm64/elf_hwcaps.rst
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -302,6 +302,9 @@ HWCAP2_SMEB16B16
HWCAP2_SMEF16F16
Functionality implied by ID_AA64SMFR0_EL1.F16F16 == 0b1

+HWCAP2_MOPS
+ Functionality implied by ID_AA64ISAR2_EL1.MOPS == 0b0001.
+
4. Unused AT_HWCAP bits
-----------------------

diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 5d45f19fda7f..692b1ec663b2 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -137,6 +137,7 @@
#define KERNEL_HWCAP_SME_BI32I32 __khwcap2_feature(SME_BI32I32)
#define KERNEL_HWCAP_SME_B16B16 __khwcap2_feature(SME_B16B16)
#define KERNEL_HWCAP_SME_F16F16 __khwcap2_feature(SME_F16F16)
+#define KERNEL_HWCAP_MOPS __khwcap2_feature(MOPS)

/*
* This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 69a4fb749c65..a2cac4305b1e 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -102,5 +102,6 @@
#define HWCAP2_SME_BI32I32 (1UL << 40)
#define HWCAP2_SME_B16B16 (1UL << 41)
#define HWCAP2_SME_F16F16 (1UL << 42)
+#define HWCAP2_MOPS (1UL << 43)

#endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 9898ad77b1db..3badc4fa7154 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -223,6 +223,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar2[] = {
ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_CSSC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_RPRFM_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_HIGHER_SAFE, ID_AA64ISAR2_EL1_BC_SHIFT, 4, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_MOPS_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
FTR_STRICT, FTR_EXACT, ID_AA64ISAR2_EL1_APA3_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_PTR_AUTH),
@@ -2187,6 +2188,11 @@ static void cpu_enable_dit(const struct arm64_cpu_capabilities *__unused)
set_pstate_dit(1);
}

+static void cpu_enable_mops(const struct arm64_cpu_capabilities *__unused)
+{
+ sysreg_clear_set(sctlr_el1, 0, SCTLR_EL1_MSCEn);
+}
+
/* Internal helper functions to match cpu capability type */
static bool
cpucap_late_cpu_optional(const struct arm64_cpu_capabilities *cap)
@@ -2649,6 +2655,14 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.cpu_enable = cpu_enable_dit,
ARM64_CPUID_FIELDS(ID_AA64PFR0_EL1, DIT, IMP)
},
+ {
+ .desc = "Memory Copy and Memory Set instructions",
+ .capability = ARM64_HAS_MOPS,
+ .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .matches = has_cpuid_feature,
+ .cpu_enable = cpu_enable_mops,
+ ARM64_CPUID_FIELDS(ID_AA64ISAR2_EL1, MOPS, IMP)
+ },
{},
};

@@ -2777,6 +2791,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_CAP(ID_AA64ISAR2_EL1, RPRFM, IMP, CAP_HWCAP, KERNEL_HWCAP_RPRFM),
HWCAP_CAP(ID_AA64ISAR2_EL1, RPRES, IMP, CAP_HWCAP, KERNEL_HWCAP_RPRES),
HWCAP_CAP(ID_AA64ISAR2_EL1, WFxT, IMP, CAP_HWCAP, KERNEL_HWCAP_WFXT),
+ HWCAP_CAP(ID_AA64ISAR2_EL1, MOPS, IMP, CAP_HWCAP, KERNEL_HWCAP_MOPS),
#ifdef CONFIG_ARM64_SME
HWCAP_CAP(ID_AA64PFR1_EL1, SME, IMP, CAP_HWCAP, KERNEL_HWCAP_SME),
HWCAP_CAP(ID_AA64SMFR0_EL1, FA64, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_FA64),
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index eb4378c23b3c..076a124255d0 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -125,6 +125,7 @@ static const char *const hwcap_str[] = {
[KERNEL_HWCAP_SME_BI32I32] = "smebi32i32",
[KERNEL_HWCAP_SME_B16B16] = "smeb16b16",
[KERNEL_HWCAP_SME_F16F16] = "smef16f16",
+ [KERNEL_HWCAP_MOPS] = "mops",
};

#ifdef CONFIG_COMPAT
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index e1de10fa080e..debc4609f129 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -35,6 +35,7 @@ HAS_GIC_PRIO_RELAXED_SYNC
HAS_HCX
HAS_LDAPR
HAS_LSE_ATOMICS
+HAS_MOPS
HAS_NESTED_VIRT
HAS_NO_FPSIMD
HAS_NO_HW_PREFETCH
--
2.25.1

2023-05-09 15:43:34

by Kristina Martsenko

Subject: [PATCH v2 10/11] arm64: mops: allow disabling MOPS from the kernel command line

Make it possible to disable the MOPS extension at runtime using the
kernel command line. This can be useful for testing or working around
hardware issues. For example it could be used to test new memory copy
routines that do not use MOPS instructions (e.g. from Arm Optimized
Routines).
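
For example (illustrative), booting with the following appended to the kernel
command line overrides ID_AA64ISAR2_EL1.MOPS to 0, so neither the MOPS hwcap
nor SCTLR_EL1.MSCEn gets set:

    arm64.nomops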

Reviewed-by: Catalin Marinas <[email protected]>
Signed-off-by: Kristina Martsenko <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 3 +++
arch/arm64/kernel/idreg-override.c | 2 ++
2 files changed, 5 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 9e5bab29685f..e01fbfd78ae9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -429,6 +429,9 @@
arm64.nosme [ARM64] Unconditionally disable Scalable Matrix
Extension support

+ arm64.nomops [ARM64] Unconditionally disable Memory Copy and Memory
+ Set instructions support
+
ataflop= [HW,M68k]

atarimouse= [HW,MOUSE] Atari Mouse
diff --git a/arch/arm64/kernel/idreg-override.c b/arch/arm64/kernel/idreg-override.c
index 370ab84fd06e..8439248c21d3 100644
--- a/arch/arm64/kernel/idreg-override.c
+++ b/arch/arm64/kernel/idreg-override.c
@@ -123,6 +123,7 @@ static const struct ftr_set_desc isar2 __initconst = {
.fields = {
FIELD("gpa3", ID_AA64ISAR2_EL1_GPA3_SHIFT, NULL),
FIELD("apa3", ID_AA64ISAR2_EL1_APA3_SHIFT, NULL),
+ FIELD("mops", ID_AA64ISAR2_EL1_MOPS_SHIFT, NULL),
{}
},
};
@@ -174,6 +175,7 @@ static const struct {
"id_aa64isar1.gpi=0 id_aa64isar1.gpa=0 "
"id_aa64isar1.api=0 id_aa64isar1.apa=0 "
"id_aa64isar2.gpa3=0 id_aa64isar2.apa3=0" },
+ { "arm64.nomops", "id_aa64isar2.mops=0" },
{ "arm64.nomte", "id_aa64pfr1.mte=0" },
{ "nokaslr", "kaslr.disabled=1" },
};
--
2.25.1

2023-05-12 04:11:08

by Mark Brown

Subject: Re: [PATCH v2 11/11] kselftest/arm64: add MOPS to hwcap test

On Tue, May 09, 2023 at 03:22:35PM +0100, Kristina Martsenko wrote:
> Add the MOPS hwcap to the hwcap kselftest and check that a SIGILL is not
> generated when the feature is detected. A SIGILL is reliable when the
> feature is not detected as SCTLR_EL1.MSCEn won't have been set.

Reviewed-by: Mark Brown <[email protected]>

> + /* CPYP [x0]!, [x1]!, x2! */
> + asm volatile(".inst 0x1d010440"
> + : "+r" (dstp), "+r" (srcp), "+r" (size)
> + :
> + : "cc", "memory");

Verified against DDI0602 2023-03.



2023-05-25 19:52:02

by Colton Lewis

Subject: RE: [PATCH v2 07/11] arm64: mops: handle MOPS exceptions

> + if (esr & ESR_ELx_MOPS_ISS_MEM_INST) {
> + /* SET* instruction */
> + if (option_a ^ wrong_option) {
> + /* Format is from Option A; forward set */
> + pt_regs_write_reg(regs, dstreg, dst + size);
> + pt_regs_write_reg(regs, sizereg, -size);
> + }
> + } else {
> + /* CPY* instruction */
> + if (!(option_a ^ wrong_option)) {
> + /* Format is from Option B */
> + if (regs->pstate & PSR_N_BIT) {
> + /* Backward copy */
> + pt_regs_write_reg(regs, dstreg, dst - size);
> + pt_regs_write_reg(regs, srcreg, src - size);
> + }
> + } else {
> + /* Format is from Option A */
> + if (size & BIT(63)) {
> + /* Forward copy */
> + pt_regs_write_reg(regs, dstreg, dst + size);
> + pt_regs_write_reg(regs, srcreg, src + size);
> + pt_regs_write_reg(regs, sizereg, -size);
> + }
> + }
> + }

I can see an argument for styling things closely to the ARM manual as
you have done here, but Linux style recommends against deep nesting. In
this case it is unneeded. I believe this can be written as a single
if-else chain and that makes it easier to distinguish the three options.

if ((esr & ESR_ELx_MOPS_ISS_MEM_INST) && (option_a ^ wrong_option)) {
    /* Format is from Option A; forward set */
    pt_regs_write_reg(regs, dstreg, dst + size);
    pt_regs_write_reg(regs, sizereg, -size);
} else if ((option_a ^ wrong_option) && (size & BIT(63))) {
    /* Forward copy */
    pt_regs_write_reg(regs, dstreg, dst + size);
    pt_regs_write_reg(regs, srcreg, src + size);
    pt_regs_write_reg(regs, sizereg, -size);
} else if (regs->pstate & PSR_N_BIT) {
    /* Backward copy */
    pt_regs_write_reg(regs, dstreg, dst - size);
    pt_regs_write_reg(regs, srcreg, src - size);
}

2023-05-25 19:59:44

by Colton Lewis

Subject: RE: [PATCH v2 06/11] KVM: arm64: hide MOPS from guests

> As FEAT_MOPS is not supported in guests yet, hide it from the ID
> registers for guests.

> The MOPS instructions are UNDEFINED in guests as HCRX_EL2.MSCEn is not
> set in HCRX_GUEST_FLAGS, and will take an exception to EL1 if executed.

For my benefit, could you please explain why no support for guests yet?
Why not set HCRX_EL2.MSCEn in this series?

2023-05-30 16:47:46

by Kristina Martsenko

Subject: Re: [PATCH v2 06/11] KVM: arm64: hide MOPS from guests

On 25/05/2023 20:26, Colton Lewis wrote:
>> As FEAT_MOPS is not supported in guests yet, hide it from the ID
>> registers for guests.
>
>> The MOPS instructions are UNDEFINED in guests as HCRX_EL2.MSCEn is not
>> set in HCRX_GUEST_FLAGS, and will take an exception to EL1 if executed.
>
> For my benefit, could you please explain why no support for guests yet?
> Why not set HCRX_EL2.MSCEn in this series?

There's probably a few more things that need doing for guest support, such as
setting the HCRX_EL2.MCE2 bit and handling the mops exception in KVM. I'm
currently having a look at guest support.

Thanks,
Kristina

2023-05-30 16:58:29

by Kristina Martsenko

Subject: Re: [PATCH v2 07/11] arm64: mops: handle MOPS exceptions

On 25/05/2023 20:50, Colton Lewis wrote:
>> +    if (esr & ESR_ELx_MOPS_ISS_MEM_INST) {
>> +        /* SET* instruction */
>> +        if (option_a ^ wrong_option) {
>> +            /* Format is from Option A; forward set */
>> +            pt_regs_write_reg(regs, dstreg, dst + size);
>> +            pt_regs_write_reg(regs, sizereg, -size);
>> +        }
>> +    } else {
>> +        /* CPY* instruction */
>> +        if (!(option_a ^ wrong_option)) {
>> +            /* Format is from Option B */
>> +            if (regs->pstate & PSR_N_BIT) {
>> +                /* Backward copy */
>> +                pt_regs_write_reg(regs, dstreg, dst - size);
>> +                pt_regs_write_reg(regs, srcreg, src - size);
>> +            }
>> +        } else {
>> +            /* Format is from Option A */
>> +            if (size & BIT(63)) {
>> +                /* Forward copy */
>> +                pt_regs_write_reg(regs, dstreg, dst + size);
>> +                pt_regs_write_reg(regs, srcreg, src + size);
>> +                pt_regs_write_reg(regs, sizereg, -size);
>> +            }
>> +        }
>> +    }
>
> I can see an argument for styling things closely to the ARM manual as
> you have done here, but Linux style recommends against deep nesting. In
> this case it is unneeded. I believe this can be written as a single
> if-else chain and that makes it easier to distinguish the three options.
>
> if ((esr & ESR_ELx_MOPS_ISS_MEM_INST) && (option_a ^ wrong_option)) {
>     /* Format is from Option A; forward set */
>     pt_regs_write_reg(regs, dstreg, dst + size);
>     pt_regs_write_reg(regs, sizereg, -size);
> } else if ((option_a ^ wrong_option) && (size & BIT(63))) {
>     /* Forward copy */
>     pt_regs_write_reg(regs, dstreg, dst + size);
>     pt_regs_write_reg(regs, srcreg, src + size);
>     pt_regs_write_reg(regs, sizereg, -size);
> } else if (regs->pstate & PSR_N_BIT) {
>     /* Backward copy */
>     pt_regs_write_reg(regs, dstreg, dst - size);
>     pt_regs_write_reg(regs, srcreg, src - size);
> }

Yeah, the nesting gets a bit deep here, but there are 6 cases in total, i.e. 6
ways the hardware can set up the registers and pstate (in 3 of them the kernel
doesn't need to modify the registers), and I think the current structure makes
it clearer what the 6 are, so I'd prefer to keep it as it is for now.

Thanks,
Kristina


2023-06-02 14:07:13

by Catalin Marinas

Subject: Re: [PATCH v2 03/11] KVM: arm64: switch HCRX_EL2 between host and guest

On Tue, May 09, 2023 at 03:22:27PM +0100, Kristina Martsenko wrote:
> Switch the HCRX_EL2 register between host and guest configurations, in
> order to enable different features in the host and guest.
>
> Now that there are separate guest flags, we can also remove SMPME from
> the host flags, as SMPME is used for virtualizing SME priorities and has
> no use in the host.
>
> Signed-off-by: Kristina Martsenko <[email protected]>

Same here, it could be good to have Marc/Oliver look at this.

--
Catalin

2023-06-02 14:13:13

by Catalin Marinas

Subject: Re: [PATCH v2 01/11] KVM: arm64: initialize HCRX_EL2

On Tue, May 09, 2023 at 03:22:25PM +0100, Kristina Martsenko wrote:
> ARMv8.7/9.2 adds a new hypervisor configuration register HCRX_EL2.
> Initialize the register to a safe value (all fields 0), to be robust
> against firmware that has not initialized it. This is also needed to
> ensure that the register is reinitialized after a kexec by a future
> kernel.
>
> In addition, move SMPME setup over to the new flags, as it would
> otherwise get overridden. It is safe to set the bit even if SME is not
> (uniformly) supported, as it will write to a RES0 bit (having no
> effect), and SME will be disabled by the cpufeature framework.
> (Similar to how e.g. the API bit is handled in HCR_HOST_NVHE_FLAGS.)

This looks fine to me but I may have lost track of the VHE/nVHE code
initialisation paths.

Marc/Oliver, are you ok with this patch (or this series in general)? I'd
like to merge it through the arm64 tree.

Thanks.

--
Catalin

2023-06-03 08:56:51

by Marc Zyngier

Subject: Re: [PATCH v2 01/11] KVM: arm64: initialize HCRX_EL2

On Tue, 09 May 2023 15:22:25 +0100,
Kristina Martsenko <[email protected]> wrote:
>
> ARMv8.7/9.2 adds a new hypervisor configuration register HCRX_EL2.
> Initialize the register to a safe value (all fields 0), to be robust
> against firmware that has not initialized it. This is also needed to
> ensure that the register is reinitialized after a kexec by a future
> kernel.
>
> In addition, move SMPME setup over to the new flags, as it would
> otherwise get overridden. It is safe to set the bit even if SME is not
> (uniformly) supported, as it will write to a RES0 bit (having no
> effect), and SME will be disabled by the cpufeature framework.
> (Similar to how e.g. the API bit is handled in HCR_HOST_NVHE_FLAGS.)
>
> Signed-off-by: Kristina Martsenko <[email protected]>

Acked-by: Marc Zyngier <[email protected]>

M.

--
Without deviation from the norm, progress is not possible.

2023-06-03 09:02:49

by Marc Zyngier

Subject: Re: [PATCH v2 06/11] KVM: arm64: hide MOPS from guests

On Tue, 09 May 2023 15:22:30 +0100,
Kristina Martsenko <[email protected]> wrote:
>
> As FEAT_MOPS is not supported in guests yet, hide it from the ID
> registers for guests.
>
> The MOPS instructions are UNDEFINED in guests as HCRX_EL2.MSCEn is not
> set in HCRX_GUEST_FLAGS, and will take an exception to EL1 if executed.
>
> Acked-by: Catalin Marinas <[email protected]>
> Signed-off-by: Kristina Martsenko <[email protected]>

This is very likely to clash with Jing's series that completely
reworks the whole idreg series, but as long as this is on its own
branch, we can deal with that.

Acked-by: Marc Zyngier <[email protected]>

M.

--
Without deviation from the norm, progress is not possible.

2023-06-03 09:05:39

by Marc Zyngier

Subject: Re: [PATCH v2 03/11] KVM: arm64: switch HCRX_EL2 between host and guest

On Tue, 09 May 2023 15:22:27 +0100,
Kristina Martsenko <[email protected]> wrote:
>
> Switch the HCRX_EL2 register between host and guest configurations, in
> order to enable different features in the host and guest.
>
> Now that there are separate guest flags, we can also remove SMPME from
> the host flags, as SMPME is used for virtualizing SME priorities and has
> no use in the host.
>
> Signed-off-by: Kristina Martsenko <[email protected]>

Acked-by: Marc Zyngier <[email protected]>

M.

--
Without deviation from the norm, progress is not possible.

2023-06-05 11:48:14

by Shaoqin Huang

Subject: Re: [PATCH v2 07/11] arm64: mops: handle MOPS exceptions

Hi Kristina,

On 5/9/23 22:22, Kristina Martsenko wrote:
> The memory copy/set instructions added as part of FEAT_MOPS can take an
> exception (e.g. page fault) part-way through their execution and resume
> execution afterwards.
>
> If however the task is re-scheduled and execution resumes on a different
> CPU, then the CPU may take a new type of exception to indicate this.
> This is because the architecture allows two options (Option A and Option
> B) to implement the instructions and a heterogeneous system can have
> different implementations between CPUs.
>
> In this case the OS has to reset the registers and restart execution
> from the prologue instruction. The algorithm for doing this is provided
> as part of the Arm ARM.
What is the Arm ARM? I don't quite understand it.
>
> Add an exception handler for the new exception and wire it up for
> userspace tasks.
>
> Reviewed-by: Catalin Marinas <[email protected]>
> Signed-off-by: Kristina Martsenko <[email protected]>
> ---
> arch/arm64/include/asm/esr.h | 11 ++++++-
> arch/arm64/include/asm/exception.h | 1 +
> arch/arm64/kernel/entry-common.c | 11 +++++++
> arch/arm64/kernel/traps.c | 52 ++++++++++++++++++++++++++++++
> 4 files changed, 74 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
> index 8487aec9b658..ca954f566861 100644
> --- a/arch/arm64/include/asm/esr.h
> +++ b/arch/arm64/include/asm/esr.h
> @@ -47,7 +47,7 @@
> #define ESR_ELx_EC_DABT_LOW (0x24)
> #define ESR_ELx_EC_DABT_CUR (0x25)
> #define ESR_ELx_EC_SP_ALIGN (0x26)
> -/* Unallocated EC: 0x27 */
> +#define ESR_ELx_EC_MOPS (0x27)
> #define ESR_ELx_EC_FP_EXC32 (0x28)
> /* Unallocated EC: 0x29 - 0x2B */
> #define ESR_ELx_EC_FP_EXC64 (0x2C)
> @@ -356,6 +356,15 @@
> #define ESR_ELx_SME_ISS_ZA_DISABLED 3
> #define ESR_ELx_SME_ISS_ZT_DISABLED 4
>
> +/* ISS field definitions for MOPS exceptions */
> +#define ESR_ELx_MOPS_ISS_MEM_INST (UL(1) << 24)
> +#define ESR_ELx_MOPS_ISS_FROM_EPILOGUE (UL(1) << 18)
> +#define ESR_ELx_MOPS_ISS_WRONG_OPTION (UL(1) << 17)
> +#define ESR_ELx_MOPS_ISS_OPTION_A (UL(1) << 16)
> +#define ESR_ELx_MOPS_ISS_DESTREG(esr) (((esr) & (UL(0x1f) << 10)) >> 10)
> +#define ESR_ELx_MOPS_ISS_SRCREG(esr) (((esr) & (UL(0x1f) << 5)) >> 5)
> +#define ESR_ELx_MOPS_ISS_SIZEREG(esr) (((esr) & (UL(0x1f) << 0)) >> 0)
> +
> #ifndef __ASSEMBLY__
> #include <asm/types.h>
>
> diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
> index e73af709cb7a..72e83af0135f 100644
> --- a/arch/arm64/include/asm/exception.h
> +++ b/arch/arm64/include/asm/exception.h
> @@ -77,6 +77,7 @@ void do_el0_svc(struct pt_regs *regs);
> void do_el0_svc_compat(struct pt_regs *regs);
> void do_el0_fpac(struct pt_regs *regs, unsigned long esr);
> void do_el1_fpac(struct pt_regs *regs, unsigned long esr);
> +void do_el0_mops(struct pt_regs *regs, unsigned long esr);
> void do_serror(struct pt_regs *regs, unsigned long esr);
> void do_notify_resume(struct pt_regs *regs, unsigned long thread_flags);
>
> diff --git a/arch/arm64/kernel/entry-common.c b/arch/arm64/kernel/entry-common.c
> index 3af3c01c93a6..a8ec174e5b0e 100644
> --- a/arch/arm64/kernel/entry-common.c
> +++ b/arch/arm64/kernel/entry-common.c
> @@ -611,6 +611,14 @@ static void noinstr el0_bti(struct pt_regs *regs)
> exit_to_user_mode(regs);
> }
>
> +static void noinstr el0_mops(struct pt_regs *regs, unsigned long esr)
> +{
> + enter_from_user_mode(regs);
> + local_daif_restore(DAIF_PROCCTX);
> + do_el0_mops(regs, esr);
> + exit_to_user_mode(regs);
> +}
> +
> static void noinstr el0_inv(struct pt_regs *regs, unsigned long esr)
> {
> enter_from_user_mode(regs);
> @@ -688,6 +696,9 @@ asmlinkage void noinstr el0t_64_sync_handler(struct pt_regs *regs)
> case ESR_ELx_EC_BTI:
> el0_bti(regs);
> break;
> + case ESR_ELx_EC_MOPS:
> + el0_mops(regs, esr);
> + break;
> case ESR_ELx_EC_BREAKPT_LOW:
> case ESR_ELx_EC_SOFTSTP_LOW:
> case ESR_ELx_EC_WATCHPT_LOW:
> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> index 4bb1b8f47298..32dc692bffd3 100644
> --- a/arch/arm64/kernel/traps.c
> +++ b/arch/arm64/kernel/traps.c
> @@ -514,6 +514,57 @@ void do_el1_fpac(struct pt_regs *regs, unsigned long esr)
> die("Oops - FPAC", regs, esr);
> }
>
> +void do_el0_mops(struct pt_regs *regs, unsigned long esr)
> +{
> + bool wrong_option = esr & ESR_ELx_MOPS_ISS_WRONG_OPTION;
> + bool option_a = esr & ESR_ELx_MOPS_ISS_OPTION_A;
> + int dstreg = ESR_ELx_MOPS_ISS_DESTREG(esr);
> + int srcreg = ESR_ELx_MOPS_ISS_SRCREG(esr);
> + int sizereg = ESR_ELx_MOPS_ISS_SIZEREG(esr);
> + unsigned long dst, src, size;
> +
> + dst = pt_regs_read_reg(regs, dstreg);
> + src = pt_regs_read_reg(regs, srcreg);
> + size = pt_regs_read_reg(regs, sizereg);
> +
> + /*
> + * Put the registers back in the original format suitable for a
> + * prologue instruction, using the generic return routine from the
> + * Arm ARM (DDI 0487I.a) rules CNTMJ and MWFQH.
> + */
> + if (esr & ESR_ELx_MOPS_ISS_MEM_INST) {
> + /* SET* instruction */
> + if (option_a ^ wrong_option) {
> + /* Format is from Option A; forward set */
> + pt_regs_write_reg(regs, dstreg, dst + size);
> + pt_regs_write_reg(regs, sizereg, -size);
> + }
> + } else {
> + /* CPY* instruction */
> + if (!(option_a ^ wrong_option)) {
> + /* Format is from Option B */
> + if (regs->pstate & PSR_N_BIT) {
> + /* Backward copy */
> + pt_regs_write_reg(regs, dstreg, dst - size);
> + pt_regs_write_reg(regs, srcreg, src - size);
> + }
> + } else {
> + /* Format is from Option A */
> + if (size & BIT(63)) {
> + /* Forward copy */
> + pt_regs_write_reg(regs, dstreg, dst + size);
> + pt_regs_write_reg(regs, srcreg, src + size);
> + pt_regs_write_reg(regs, sizereg, -size);
> + }
> + }
> + }
> +
> + if (esr & ESR_ELx_MOPS_ISS_FROM_EPILOGUE)
> + regs->pc -= 8;
> + else
> + regs->pc -= 4;
> +}
> +
> #define __user_cache_maint(insn, address, res) \
> if (address >= TASK_SIZE_MAX) { \
> res = -EFAULT; \
> @@ -824,6 +875,7 @@ static const char *esr_class_str[] = {
> [ESR_ELx_EC_DABT_LOW] = "DABT (lower EL)",
> [ESR_ELx_EC_DABT_CUR] = "DABT (current EL)",
> [ESR_ELx_EC_SP_ALIGN] = "SP Alignment",
> + [ESR_ELx_EC_MOPS] = "MOPS",
> [ESR_ELx_EC_FP_EXC32] = "FP (AArch32)",
> [ESR_ELx_EC_FP_EXC64] = "FP (AArch64)",
> [ESR_ELx_EC_SERROR] = "SError",

--
Shaoqin


2023-06-05 12:12:10

by Catalin Marinas

Subject: Re: [PATCH v2 07/11] arm64: mops: handle MOPS exceptions

On Mon, Jun 05, 2023 at 07:43:27PM +0800, Shaoqin Huang wrote:
> Hi Kristina,
>
> On 5/9/23 22:22, Kristina Martsenko wrote:
> > The memory copy/set instructions added as part of FEAT_MOPS can take an
> > exception (e.g. page fault) part-way through their execution and resume
> > execution afterwards.
> >
> > If however the task is re-scheduled and execution resumes on a different
> > CPU, then the CPU may take a new type of exception to indicate this.
> > This is because the architecture allows two options (Option A and Option
> > B) to implement the instructions and a heterogeneous system can have
> > different implementations between CPUs.
> >
> > In this case the OS has to reset the registers and restart execution
> > from the prologue instruction. The algorithm for doing this is provided
> > as part of the Arm ARM.
>
> What is the Arm ARM? I don't quite understand it.

The Arm Architecture Reference Manual:

https://developer.arm.com/documentation/ddi0487/latest

(the acronym is pretty well known among the arm/arm64 developers)

--
Catalin

2023-06-05 16:12:52

by Oliver Upton

Subject: Re: [PATCH v2 06/11] KVM: arm64: hide MOPS from guests

On Sat, Jun 03, 2023 at 09:42:18AM +0100, Marc Zyngier wrote:
> On Tue, 09 May 2023 15:22:30 +0100,
> Kristina Martsenko <[email protected]> wrote:
> >
> > As FEAT_MOPS is not supported in guests yet, hide it from the ID
> > registers for guests.
> >
> > The MOPS instructions are UNDEFINED in guests as HCRX_EL2.MSCEn is not
> > set in HCRX_GUEST_FLAGS, and will take an exception to EL1 if executed.
> >
> > Acked-by: Catalin Marinas <[email protected]>
> > Signed-off-by: Kristina Martsenko <[email protected]>
>
> This is very likely to clash with Jing's series that completely
> reworks the whole idreg series, but as long as this is on its own
> branch, we can deal with that.

Yup, we will definitely want to get that ironed out. I'll pull Catalin's
branch when this all gets queued up.

Acked-by: Oliver Upton <[email protected]>

--
Thanks,
Oliver

2023-06-05 16:15:28

by Oliver Upton

Subject: Re: [PATCH v2 01/11] KVM: arm64: initialize HCRX_EL2

On Fri, Jun 02, 2023 at 02:49:50PM +0100, Catalin Marinas wrote:
> On Tue, May 09, 2023 at 03:22:25PM +0100, Kristina Martsenko wrote:
> > ARMv8.7/9.2 adds a new hypervisor configuration register HCRX_EL2.
> > Initialize the register to a safe value (all fields 0), to be robust
> > against firmware that has not initialized it. This is also needed to
> > ensure that the register is reinitialized after a kexec by a future
> > kernel.
> >
> > In addition, move SMPME setup over to the new flags, as it would
> > otherwise get overridden. It is safe to set the bit even if SME is not
> > (uniformly) supported, as it will write to a RES0 bit (having no
> > effect), and SME will be disabled by the cpufeature framework.
> > (Similar to how e.g. the API bit is handled in HCR_HOST_NVHE_FLAGS.)
>
> This looks fine to me but I may have lost track of the VHE/nVHE code
> initialisation paths.
>
> Marc/Oliver, are you ok with this patch (or this series in general)? I'd
> like to merge it through the arm64 tree.

Acked-by: Oliver Upton <[email protected]>

--
Thanks,
Oliver

2023-06-05 16:25:19

by Oliver Upton

Subject: Re: [PATCH v2 03/11] KVM: arm64: switch HCRX_EL2 between host and guest

On Fri, Jun 02, 2023 at 02:51:53PM +0100, Catalin Marinas wrote:
> On Tue, May 09, 2023 at 03:22:27PM +0100, Kristina Martsenko wrote:
> > Switch the HCRX_EL2 register between host and guest configurations, in
> > order to enable different features in the host and guest.
> >
> > Now that there are separate guest flags, we can also remove SMPME from
> > the host flags, as SMPME is used for virtualizing SME priorities and has
> > no use in the host.
> >
> > Signed-off-by: Kristina Martsenko <[email protected]>
>
> Same here, it could be good to have Marc/Oliver look at this.

Acked-by: Oliver Upton <[email protected]>

--
Thanks,
Oliver

2023-06-05 18:01:20

by Catalin Marinas

Subject: Re: [PATCH v2 00/11] arm64: Support for Armv8.8 memcpy instructions in userspace

On Tue, 09 May 2023 15:22:24 +0100, Kristina Martsenko wrote:
> The Armv8.8 extension adds new instructions to perform memcpy(), memset() and
> memmove() operations in hardware (FEAT_MOPS). This series adds support for
> using the new instructions in userspace. More information can be found in the
> cover letter for v1:
> https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
> Changes in v2:
> - Removed booting.rst requirement for HCRX_EL2.MCE2
> - Changed HCRX_EL2 cpucap to be STRICT_BOOT type
> - Changed HCRX_EL2.SMPME to be set for the guest and unset for the host
> - Moved HCRX_EL2 initialization into init_el2_state(), dropped ISB
> - Simplified conditional checks in mops exception handler with XOR
> - Added comments from Arm ARM into mops exception handler
> - Converted cpucaps to use the new ARM64_CPUID_FIELDS() helper
> - Added MOPS to hwcaps kselftest
> - Improved commit messages
> - Rebased onto v6.4-rc1
> - v1: https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
> [...]

Applied to arm64 (for-next/feat_mops), thanks!

[01/11] KVM: arm64: initialize HCRX_EL2
https://git.kernel.org/arm64/c/af94aad4c915
[02/11] arm64: cpufeature: detect FEAT_HCX
https://git.kernel.org/arm64/c/b0c756fe996a
[03/11] KVM: arm64: switch HCRX_EL2 between host and guest
https://git.kernel.org/arm64/c/306b4c9f7120
[04/11] arm64: mops: document boot requirements for MOPS
https://git.kernel.org/arm64/c/f32c053b9806
[05/11] arm64: mops: don't disable host MOPS instructions from EL2
https://git.kernel.org/arm64/c/b1319c0e9559
[06/11] KVM: arm64: hide MOPS from guests
https://git.kernel.org/arm64/c/3172613fbcbb
[07/11] arm64: mops: handle MOPS exceptions
https://git.kernel.org/arm64/c/8536ceaa7471
[08/11] arm64: mops: handle single stepping after MOPS exception
https://git.kernel.org/arm64/c/8cd076a67dc8
[09/11] arm64: mops: detect and enable FEAT_MOPS
https://git.kernel.org/arm64/c/b7564127ffcb
[10/11] arm64: mops: allow disabling MOPS from the kernel command line
https://git.kernel.org/arm64/c/3e1dedb29d0f
[11/11] kselftest/arm64: add MOPS to hwcap test
https://git.kernel.org/arm64/c/d8a324f102cc

--
Catalin