2024-01-22 17:25:25

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 00/14] arm64: Support for 2023 DPISA extensions

This series enables support for the data processing extensions in the
newly released 2023 architecture, this is mainly support for 8 bit
floating point formats. Most of the extensions only introduce new
instructions and therefore only require hwcaps but there is a new EL0
visible control register FPMR used to control the 8 bit floating point
formats, we need to manage traps for this and context switch it.

Due to uncertainty with the plan for parsing ID registers to identify
which features to expose to the guest the KVM support is placed at the
end of the series, it will need to be revised once that issue is
resolved. The sharing of floating point save code between the host and
guest kernels slightly complicates the introduction of KVM support, we
first introduce host support with some placeholders for KVM then replace
those with the actual KVM support.

I've not added test coverage for ptrace, I've got a test program which
exercises all the FP ptrace interfaces and their interactions together,
my plan is to cover it there rather than add another tiny test program
that duplicates the boilerplace for tracing a target and doesn't
actually run the traced program.

Signed-off-by: Mark Brown <[email protected]>
---
Changes in v4:
- Rebase onto v6.8-rc1.
- Move KVM support to the end of the series.
- Link to v3: https://lore.kernel.org/r/[email protected]

Changes in v3:
- Rebase onto v6.7-rc3.
- Hook up traps for FPMR in emulate-nested.c.
- Link to v2: https://lore.kernel.org/r/[email protected]

Changes in v2:
- Rebase onto v6.7-rc1.
- Link to v1: https://lore.kernel.org/r/[email protected]

---
Mark Brown (14):
arm64/cpufeature: Hook new identification registers up to cpufeature
arm64/fpsimd: Enable host kernel access to FPMR
arm64/fpsimd: Support FEAT_FPMR
arm64/signal: Add FPMR signal handling
arm64/ptrace: Expose FPMR via ptrace
arm64/hwcap: Define hwcaps for 2023 DPISA features
kselftest/arm64: Handle FPMR context in generic signal frame parser
kselftest/arm64: Add basic FPMR test
kselftest/arm64: Add 2023 DPISA hwcap test coverage
KVM: arm64: Share all userspace hardened thread data with the hypervisor
KVM: arm64: Add newly allocated ID registers to register descriptions
KVM: arm64: Support FEAT_FPMR for guests
KVM: arm64: selftests: Document feature registers added in 2023 extensions
KVM: arm64: selftests: Teach get-reg-list about FPMR

Documentation/arch/arm64/elf_hwcaps.rst | 49 +++++
arch/arm64/include/asm/cpu.h | 3 +
arch/arm64/include/asm/cpufeature.h | 5 +
arch/arm64/include/asm/fpsimd.h | 2 +
arch/arm64/include/asm/hwcap.h | 15 ++
arch/arm64/include/asm/kvm_arm.h | 4 +-
arch/arm64/include/asm/kvm_host.h | 5 +-
arch/arm64/include/asm/processor.h | 6 +-
arch/arm64/include/uapi/asm/hwcap.h | 15 ++
arch/arm64/include/uapi/asm/sigcontext.h | 8 +
arch/arm64/kernel/cpufeature.c | 72 +++++++
arch/arm64/kernel/cpuinfo.c | 18 ++
arch/arm64/kernel/fpsimd.c | 13 ++
arch/arm64/kernel/ptrace.c | 42 ++++
arch/arm64/kernel/signal.c | 59 ++++++
arch/arm64/kvm/emulate-nested.c | 8 +
arch/arm64/kvm/fpsimd.c | 14 +-
arch/arm64/kvm/hyp/include/hyp/switch.h | 9 +-
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 4 +-
arch/arm64/kvm/sys_regs.c | 17 +-
arch/arm64/tools/cpucaps | 1 +
include/uapi/linux/elf.h | 1 +
tools/testing/selftests/arm64/abi/hwcap.c | 217 +++++++++++++++++++++
tools/testing/selftests/arm64/signal/.gitignore | 1 +
.../arm64/signal/testcases/fpmr_siginfo.c | 82 ++++++++
.../selftests/arm64/signal/testcases/testcases.c | 8 +
.../selftests/arm64/signal/testcases/testcases.h | 1 +
tools/testing/selftests/kvm/aarch64/get-reg-list.c | 11 +-
28 files changed, 670 insertions(+), 20 deletions(-)
---
base-commit: 6613476e225e090cc9aad49be7fa504e290dd33d
change-id: 20231003-arm64-2023-dpisa-2f3d25746474

Best regards,
--
Mark Brown <[email protected]>



2024-01-22 17:26:33

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 02/14] arm64/fpsimd: Enable host kernel access to FPMR

FEAT_FPMR provides a new generally accessible architectural register FPMR.
This is only accessible to EL0 and EL1 when HCRX_EL2.EnFPM is set to 1,
do this when the host is running. The guest part will be done along with
context switching the new register and exposing it via guest management.

Signed-off-by: Mark Brown <[email protected]>
---
arch/arm64/include/asm/kvm_arm.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 3c6f8ba1e479..7f45ce9170bb 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -105,7 +105,7 @@
#define HCRX_GUEST_FLAGS \
(HCRX_EL2_SMPME | HCRX_EL2_TCR2En | \
(cpus_have_final_cap(ARM64_HAS_MOPS) ? (HCRX_EL2_MSCEn | HCRX_EL2_MCE2) : 0))
-#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En)
+#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM)

/* TCR_EL2 Registers bits */
#define TCR_EL2_DS (1UL << 32)

--
2.30.2


2024-01-22 17:26:44

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 03/14] arm64/fpsimd: Support FEAT_FPMR

FEAT_FPMR defines a new EL0 accessible register FPMR use to configure the
FP8 related features added to the architecture at the same time. Detect
support for this register and context switch it for EL0 when present.

Due to the sharing of responsibility for saving floating point state
between the host kernel and KVM FP8 support is not yet implemented in KVM
and a stub similar to that used for SVCR is provided for FPMR in order to
avoid bisection issues. To make it easier to share host state with the
hypervisor we store FPMR as a hardened usercopy field in uw (along with
some padding).

Signed-off-by: Mark Brown <[email protected]>
---
arch/arm64/include/asm/cpufeature.h | 5 +++++
arch/arm64/include/asm/fpsimd.h | 2 ++
arch/arm64/include/asm/kvm_host.h | 1 +
arch/arm64/include/asm/processor.h | 4 ++++
arch/arm64/kernel/cpufeature.c | 9 +++++++++
arch/arm64/kernel/fpsimd.c | 13 +++++++++++++
arch/arm64/kvm/fpsimd.c | 1 +
arch/arm64/tools/cpucaps | 1 +
8 files changed, 36 insertions(+)

diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 21c824edf8ce..34fcdbc65d7d 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -768,6 +768,11 @@ static __always_inline bool system_supports_tpidr2(void)
return system_supports_sme();
}

+static __always_inline bool system_supports_fpmr(void)
+{
+ return alternative_has_cap_unlikely(ARM64_HAS_FPMR);
+}
+
static __always_inline bool system_supports_cnp(void)
{
return alternative_has_cap_unlikely(ARM64_HAS_CNP);
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 50e5f25d3024..6cf72b0d2c04 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -89,6 +89,7 @@ struct cpu_fp_state {
void *sve_state;
void *sme_state;
u64 *svcr;
+ unsigned long *fpmr;
unsigned int sve_vl;
unsigned int sme_vl;
enum fp_type *fp_type;
@@ -154,6 +155,7 @@ extern void cpu_enable_sve(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_sme(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_sme2(const struct arm64_cpu_capabilities *__unused);
extern void cpu_enable_fa64(const struct arm64_cpu_capabilities *__unused);
+extern void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__unused);

extern u64 read_smcr_features(void);

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 21c57b812569..7993694a54af 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -543,6 +543,7 @@ struct kvm_vcpu_arch {
enum fp_type fp_type;
unsigned int sve_max_vl;
u64 svcr;
+ unsigned long fpmr;

/* Stage 2 paging state used by the hardware on next switch */
struct kvm_s2_mmu *hw_mmu;
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 5b0a04810b23..b453c66d3fae 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -155,6 +155,8 @@ struct thread_struct {
struct {
unsigned long tp_value; /* TLS register */
unsigned long tp2_value;
+ unsigned long fpmr;
+ unsigned long pad;
struct user_fpsimd_state fpsimd_state;
} uw;

@@ -253,6 +255,8 @@ static inline void arch_thread_struct_whitelist(unsigned long *offset,
BUILD_BUG_ON(sizeof_field(struct thread_struct, uw) !=
sizeof_field(struct thread_struct, uw.tp_value) +
sizeof_field(struct thread_struct, uw.tp2_value) +
+ sizeof_field(struct thread_struct, uw.fpmr) +
+ sizeof_field(struct thread_struct, uw.pad) +
sizeof_field(struct thread_struct, uw.fpsimd_state));

*offset = offsetof(struct thread_struct, uw);
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index eae59ec0f4b0..0263565f617a 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -272,6 +272,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
};

static const struct arm64_ftr_bits ftr_id_aa64pfr2[] = {
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR2_EL1_FPMR_SHIFT, 4, 0),
ARM64_FTR_END,
};

@@ -2767,6 +2768,14 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.type = ARM64_CPUCAP_SYSTEM_FEATURE,
.matches = has_lpa2,
},
+ {
+ .desc = "FPMR",
+ .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .capability = ARM64_HAS_FPMR,
+ .matches = has_cpuid_feature,
+ .cpu_enable = cpu_enable_fpmr,
+ ARM64_CPUID_FIELDS(ID_AA64PFR2_EL1, FPMR, IMP)
+ },
{},
};

diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index a5dc6f764195..8e24b5e5e192 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -359,6 +359,9 @@ static void task_fpsimd_load(void)
WARN_ON(preemptible());
WARN_ON(test_thread_flag(TIF_KERNEL_FPSTATE));

+ if (system_supports_fpmr())
+ write_sysreg_s(current->thread.uw.fpmr, SYS_FPMR);
+
if (system_supports_sve() || system_supports_sme()) {
switch (current->thread.fp_type) {
case FP_STATE_FPSIMD:
@@ -446,6 +449,9 @@ static void fpsimd_save_user_state(void)
if (test_thread_flag(TIF_FOREIGN_FPSTATE))
return;

+ if (system_supports_fpmr())
+ *(last->fpmr) = read_sysreg_s(SYS_FPMR);
+
/*
* If a task is in a syscall the ABI allows us to only
* preserve the state shared with FPSIMD so don't bother
@@ -688,6 +694,12 @@ static void sve_to_fpsimd(struct task_struct *task)
}
}

+void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__always_unused p)
+{
+ write_sysreg_s(read_sysreg_s(SYS_SCTLR_EL1) | SCTLR_EL1_EnFPM_MASK,
+ SYS_SCTLR_EL1);
+}
+
#ifdef CONFIG_ARM64_SVE
/*
* Call __sve_free() directly only if you know task can't be scheduled
@@ -1680,6 +1692,7 @@ static void fpsimd_bind_task_to_cpu(void)
last->sve_vl = task_get_sve_vl(current);
last->sme_vl = task_get_sme_vl(current);
last->svcr = &current->thread.svcr;
+ last->fpmr = &current->thread.uw.fpmr;
last->fp_type = &current->thread.fp_type;
last->to_save = FP_STATE_CURRENT;
current->thread.fpsimd_cpu = smp_processor_id();
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 8c1d0d4853df..e3e611e30e91 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -153,6 +153,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
fp_state.sve_vl = vcpu->arch.sve_max_vl;
fp_state.sme_state = NULL;
fp_state.svcr = &vcpu->arch.svcr;
+ fp_state.fpmr = &vcpu->arch.fpmr;
fp_state.fp_type = &vcpu->arch.fp_type;

if (vcpu_has_sve(vcpu))
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index b912b1409fc0..63283550c8e8 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -26,6 +26,7 @@ HAS_ECV
HAS_ECV_CNTPOFF
HAS_EPAN
HAS_EVT
+HAS_FPMR
HAS_FGT
HAS_FPSIMD
HAS_GENERIC_AUTH

--
2.30.2


2024-01-22 17:27:30

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 04/14] arm64/signal: Add FPMR signal handling

Expose FPMR in the signal context on systems where it is supported. The
kernel validates the exact size of the FPSIMD registers so we can't readily
add it to fpsimd_context without disruption.

Signed-off-by: Mark Brown <[email protected]>
---
arch/arm64/include/uapi/asm/sigcontext.h | 8 +++++
arch/arm64/kernel/signal.c | 59 ++++++++++++++++++++++++++++++++
2 files changed, 67 insertions(+)

diff --git a/arch/arm64/include/uapi/asm/sigcontext.h b/arch/arm64/include/uapi/asm/sigcontext.h
index f23c1dc3f002..8a45b7a411e0 100644
--- a/arch/arm64/include/uapi/asm/sigcontext.h
+++ b/arch/arm64/include/uapi/asm/sigcontext.h
@@ -152,6 +152,14 @@ struct tpidr2_context {
__u64 tpidr2;
};

+/* FPMR context */
+#define FPMR_MAGIC 0x46504d52
+
+struct fpmr_context {
+ struct _aarch64_ctx head;
+ __u64 fpmr;
+};
+
#define ZA_MAGIC 0x54366345

struct za_context {
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 0e8beb3349ea..460823baa603 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -60,6 +60,7 @@ struct rt_sigframe_user_layout {
unsigned long tpidr2_offset;
unsigned long za_offset;
unsigned long zt_offset;
+ unsigned long fpmr_offset;
unsigned long extra_offset;
unsigned long end_offset;
};
@@ -182,6 +183,8 @@ struct user_ctxs {
u32 za_size;
struct zt_context __user *zt;
u32 zt_size;
+ struct fpmr_context __user *fpmr;
+ u32 fpmr_size;
};

static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
@@ -227,6 +230,33 @@ static int restore_fpsimd_context(struct user_ctxs *user)
return err ? -EFAULT : 0;
}

+static int preserve_fpmr_context(struct fpmr_context __user *ctx)
+{
+ int err = 0;
+
+ current->thread.uw.fpmr = read_sysreg_s(SYS_FPMR);
+
+ __put_user_error(FPMR_MAGIC, &ctx->head.magic, err);
+ __put_user_error(sizeof(*ctx), &ctx->head.size, err);
+ __put_user_error(current->thread.uw.fpmr, &ctx->fpmr, err);
+
+ return err;
+}
+
+static int restore_fpmr_context(struct user_ctxs *user)
+{
+ u64 fpmr;
+ int err = 0;
+
+ if (user->fpmr_size != sizeof(*user->fpmr))
+ return -EINVAL;
+
+ __get_user_error(fpmr, &user->fpmr->fpmr, err);
+ if (!err)
+ write_sysreg_s(fpmr, SYS_FPMR);
+
+ return err;
+}

#ifdef CONFIG_ARM64_SVE

@@ -590,6 +620,7 @@ static int parse_user_sigframe(struct user_ctxs *user,
user->tpidr2 = NULL;
user->za = NULL;
user->zt = NULL;
+ user->fpmr = NULL;

if (!IS_ALIGNED((unsigned long)base, 16))
goto invalid;
@@ -684,6 +715,17 @@ static int parse_user_sigframe(struct user_ctxs *user,
user->zt_size = size;
break;

+ case FPMR_MAGIC:
+ if (!system_supports_fpmr())
+ goto invalid;
+
+ if (user->fpmr)
+ goto invalid;
+
+ user->fpmr = (struct fpmr_context __user *)head;
+ user->fpmr_size = size;
+ break;
+
case EXTRA_MAGIC:
if (have_extra_context)
goto invalid;
@@ -806,6 +848,9 @@ static int restore_sigframe(struct pt_regs *regs,
if (err == 0 && system_supports_tpidr2() && user.tpidr2)
err = restore_tpidr2_context(&user);

+ if (err == 0 && system_supports_fpmr() && user.fpmr)
+ err = restore_fpmr_context(&user);
+
if (err == 0 && system_supports_sme() && user.za)
err = restore_za_context(&user);

@@ -928,6 +973,13 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
}
}

+ if (system_supports_fpmr()) {
+ err = sigframe_alloc(user, &user->fpmr_offset,
+ sizeof(struct fpmr_context));
+ if (err)
+ return err;
+ }
+
return sigframe_alloc_end(user);
}

@@ -983,6 +1035,13 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
err |= preserve_tpidr2_context(tpidr2_ctx);
}

+ /* FPMR if supported */
+ if (system_supports_fpmr() && err == 0) {
+ struct fpmr_context __user *fpmr_ctx =
+ apply_user_offset(user, user->fpmr_offset);
+ err |= preserve_fpmr_context(fpmr_ctx);
+ }
+
/* ZA state if present */
if (system_supports_sme() && err == 0 && user->za_offset) {
struct za_context __user *za_ctx =

--
2.30.2


2024-01-22 17:27:44

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 06/14] arm64/hwcap: Define hwcaps for 2023 DPISA features

The 2023 architecture extensions include a large number of floating point
features, most of which simply add new instructions. Add hwcaps so that
userspace can enumerate these features.

Signed-off-by: Mark Brown <[email protected]>
---
Documentation/arch/arm64/elf_hwcaps.rst | 49 +++++++++++++++++++++++++++++++++
arch/arm64/include/asm/hwcap.h | 15 ++++++++++
arch/arm64/include/uapi/asm/hwcap.h | 15 ++++++++++
arch/arm64/kernel/cpufeature.c | 35 +++++++++++++++++++++++
arch/arm64/kernel/cpuinfo.c | 15 ++++++++++
5 files changed, 129 insertions(+)

diff --git a/Documentation/arch/arm64/elf_hwcaps.rst b/Documentation/arch/arm64/elf_hwcaps.rst
index ced7b335e2e0..448c1664879b 100644
--- a/Documentation/arch/arm64/elf_hwcaps.rst
+++ b/Documentation/arch/arm64/elf_hwcaps.rst
@@ -317,6 +317,55 @@ HWCAP2_LRCPC3
HWCAP2_LSE128
Functionality implied by ID_AA64ISAR0_EL1.Atomic == 0b0011.

+HWCAP2_FPMR
+ Functionality implied by ID_AA64PFR2_EL1.FMR == 0b0001.
+
+HWCAP2_LUT
+ Functionality implied by ID_AA64ISAR2_EL1.LUT == 0b0001.
+
+HWCAP2_FAMINMAX
+ Functionality implied by ID_AA64ISAR3_EL1.FAMINMAX == 0b0001.
+
+HWCAP2_F8CVT
+ Functionality implied by ID_AA64FPFR0_EL1.F8CVT == 0b1.
+
+HWCAP2_F8FMA
+ Functionality implied by ID_AA64FPFR0_EL1.F8FMA == 0b1.
+
+HWCAP2_F8DP4
+ Functionality implied by ID_AA64FPFR0_EL1.F8DP4 == 0b1.
+
+HWCAP2_F8DP2
+ Functionality implied by ID_AA64FPFR0_EL1.F8DP2 == 0b1.
+
+HWCAP2_F8E4M3
+ Functionality implied by ID_AA64FPFR0_EL1.F8E4M3 == 0b1.
+
+HWCAP2_F8E5M2
+ Functionality implied by ID_AA64FPFR0_EL1.F8E5M2 == 0b1.
+
+HWCAP2_SME_LUTV2
+ Functionality implied by ID_AA64SMFR0_EL1.LUTv2 == 0b1.
+
+HWCAP2_SME_F8F16
+ Functionality implied by ID_AA64SMFR0_EL1.F8F16 == 0b1.
+
+HWCAP2_SME_F8F32
+ Functionality implied by ID_AA64SMFR0_EL1.F8F32 == 0b1.
+
+HWCAP2_SME_SF8FMA
+ Functionality implied by ID_AA64SMFR0_EL1.SF8FMA == 0b1.
+
+HWCAP2_SME_SF8DP4
+ Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1.
+
+HWCAP2_SME_SF8DP2
+ Functionality implied by ID_AA64SMFR0_EL1.SF8DP2 == 0b1.
+
+HWCAP2_SME_SF8DP4
+ Functionality implied by ID_AA64SMFR0_EL1.SF8DP4 == 0b1.
+
+
4. Unused AT_HWCAP bits
-----------------------

diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index cd71e09ea14d..4edd3b61df11 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -142,6 +142,21 @@
#define KERNEL_HWCAP_SVE_B16B16 __khwcap2_feature(SVE_B16B16)
#define KERNEL_HWCAP_LRCPC3 __khwcap2_feature(LRCPC3)
#define KERNEL_HWCAP_LSE128 __khwcap2_feature(LSE128)
+#define KERNEL_HWCAP_FPMR __khwcap2_feature(FPMR)
+#define KERNEL_HWCAP_LUT __khwcap2_feature(LUT)
+#define KERNEL_HWCAP_FAMINMAX __khwcap2_feature(FAMINMAX)
+#define KERNEL_HWCAP_F8CVT __khwcap2_feature(F8CVT)
+#define KERNEL_HWCAP_F8FMA __khwcap2_feature(F8FMA)
+#define KERNEL_HWCAP_F8DP4 __khwcap2_feature(F8DP4)
+#define KERNEL_HWCAP_F8DP2 __khwcap2_feature(F8DP2)
+#define KERNEL_HWCAP_F8E4M3 __khwcap2_feature(F8E4M3)
+#define KERNEL_HWCAP_F8E5M2 __khwcap2_feature(F8E5M2)
+#define KERNEL_HWCAP_SME_LUTV2 __khwcap2_feature(SME_LUTV2)
+#define KERNEL_HWCAP_SME_F8F16 __khwcap2_feature(SME_F8F16)
+#define KERNEL_HWCAP_SME_F8F32 __khwcap2_feature(SME_F8F32)
+#define KERNEL_HWCAP_SME_SF8FMA __khwcap2_feature(SME_SF8FMA)
+#define KERNEL_HWCAP_SME_SF8DP4 __khwcap2_feature(SME_SF8DP4)
+#define KERNEL_HWCAP_SME_SF8DP2 __khwcap2_feature(SME_SF8DP2)

/*
* This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 5023599fa278..285610e626f5 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -107,5 +107,20 @@
#define HWCAP2_SVE_B16B16 (1UL << 45)
#define HWCAP2_LRCPC3 (1UL << 46)
#define HWCAP2_LSE128 (1UL << 47)
+#define HWCAP2_FPMR (1UL << 48)
+#define HWCAP2_LUT (1UL << 49)
+#define HWCAP2_FAMINMAX (1UL << 50)
+#define HWCAP2_F8CVT (1UL << 51)
+#define HWCAP2_F8FMA (1UL << 52)
+#define HWCAP2_F8DP4 (1UL << 53)
+#define HWCAP2_F8DP2 (1UL << 54)
+#define HWCAP2_F8E4M3 (1UL << 55)
+#define HWCAP2_F8E5M2 (1UL << 56)
+#define HWCAP2_SME_LUTV2 (1UL << 57)
+#define HWCAP2_SME_F8F16 (1UL << 58)
+#define HWCAP2_SME_F8F32 (1UL << 59)
+#define HWCAP2_SME_SF8FMA (1UL << 60)
+#define HWCAP2_SME_SF8DP4 (1UL << 61)
+#define HWCAP2_SME_SF8DP2 (1UL << 62)

#endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 0263565f617a..aefda789f510 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -220,6 +220,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
};

static const struct arm64_ftr_bits ftr_id_aa64isar2[] = {
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_LUT_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_CSSC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_RPRFM_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR2_EL1_CLRBHB_SHIFT, 4, 0),
@@ -235,6 +236,7 @@ static const struct arm64_ftr_bits ftr_id_aa64isar2[] = {
};

static const struct arm64_ftr_bits ftr_id_aa64isar3[] = {
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64ISAR3_EL1_FAMINMAX_SHIFT, 4, 0),
ARM64_FTR_END,
};

@@ -303,6 +305,8 @@ static const struct arm64_ftr_bits ftr_id_aa64zfr0[] = {
static const struct arm64_ftr_bits ftr_id_aa64smfr0[] = {
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_FA64_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+ FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_LUTv2_SHIFT, 1, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SMEver_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
@@ -315,6 +319,10 @@ static const struct arm64_ftr_bits ftr_id_aa64smfr0[] = {
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_B16B16_SHIFT, 1, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F16F16_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+ FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F8F16_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+ FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F8F32_SHIFT, 1, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_I8I32_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
@@ -325,10 +333,22 @@ static const struct arm64_ftr_bits ftr_id_aa64smfr0[] = {
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_BI32I32_SHIFT, 1, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_F32F32_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+ FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SF8FMA_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+ FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SF8DP4_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SME),
+ FTR_STRICT, FTR_EXACT, ID_AA64SMFR0_EL1_SF8DP2_SHIFT, 1, 0),
ARM64_FTR_END,
};

static const struct arm64_ftr_bits ftr_id_aa64fpfr0[] = {
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8CVT_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8FMA_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8DP4_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8DP2_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8E4M3_SHIFT, 1, 0),
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_EXACT, ID_AA64FPFR0_EL1_F8E5M2_SHIFT, 1, 0),
ARM64_FTR_END,
};

@@ -2859,6 +2879,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_CAP(ID_AA64PFR0_EL1, AdvSIMD, IMP, CAP_HWCAP, KERNEL_HWCAP_ASIMD),
HWCAP_CAP(ID_AA64PFR0_EL1, AdvSIMD, FP16, CAP_HWCAP, KERNEL_HWCAP_ASIMDHP),
HWCAP_CAP(ID_AA64PFR0_EL1, DIT, IMP, CAP_HWCAP, KERNEL_HWCAP_DIT),
+ HWCAP_CAP(ID_AA64PFR2_EL1, FPMR, IMP, CAP_HWCAP, KERNEL_HWCAP_FPMR),
HWCAP_CAP(ID_AA64ISAR1_EL1, DPB, IMP, CAP_HWCAP, KERNEL_HWCAP_DCPOP),
HWCAP_CAP(ID_AA64ISAR1_EL1, DPB, DPB2, CAP_HWCAP, KERNEL_HWCAP_DCPODP),
HWCAP_CAP(ID_AA64ISAR1_EL1, JSCVT, IMP, CAP_HWCAP, KERNEL_HWCAP_JSCVT),
@@ -2872,6 +2893,8 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_CAP(ID_AA64ISAR1_EL1, BF16, EBF16, CAP_HWCAP, KERNEL_HWCAP_EBF16),
HWCAP_CAP(ID_AA64ISAR1_EL1, DGH, IMP, CAP_HWCAP, KERNEL_HWCAP_DGH),
HWCAP_CAP(ID_AA64ISAR1_EL1, I8MM, IMP, CAP_HWCAP, KERNEL_HWCAP_I8MM),
+ HWCAP_CAP(ID_AA64ISAR2_EL1, LUT, IMP, CAP_HWCAP, KERNEL_HWCAP_LUT),
+ HWCAP_CAP(ID_AA64ISAR3_EL1, FAMINMAX, IMP, CAP_HWCAP, KERNEL_HWCAP_FAMINMAX),
HWCAP_CAP(ID_AA64MMFR2_EL1, AT, IMP, CAP_HWCAP, KERNEL_HWCAP_USCAT),
#ifdef CONFIG_ARM64_SVE
HWCAP_CAP(ID_AA64PFR0_EL1, SVE, IMP, CAP_HWCAP, KERNEL_HWCAP_SVE),
@@ -2912,6 +2935,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
#ifdef CONFIG_ARM64_SME
HWCAP_CAP(ID_AA64PFR1_EL1, SME, IMP, CAP_HWCAP, KERNEL_HWCAP_SME),
HWCAP_CAP(ID_AA64SMFR0_EL1, FA64, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_FA64),
+ HWCAP_CAP(ID_AA64SMFR0_EL1, LUTv2, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_LUTV2),
HWCAP_CAP(ID_AA64SMFR0_EL1, SMEver, SME2p1, CAP_HWCAP, KERNEL_HWCAP_SME2P1),
HWCAP_CAP(ID_AA64SMFR0_EL1, SMEver, SME2, CAP_HWCAP, KERNEL_HWCAP_SME2),
HWCAP_CAP(ID_AA64SMFR0_EL1, I16I64, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_I16I64),
@@ -2919,12 +2943,23 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_CAP(ID_AA64SMFR0_EL1, I16I32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_I16I32),
HWCAP_CAP(ID_AA64SMFR0_EL1, B16B16, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_B16B16),
HWCAP_CAP(ID_AA64SMFR0_EL1, F16F16, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F16F16),
+ HWCAP_CAP(ID_AA64SMFR0_EL1, F8F16, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F8F16),
+ HWCAP_CAP(ID_AA64SMFR0_EL1, F8F32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F8F32),
HWCAP_CAP(ID_AA64SMFR0_EL1, I8I32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_I8I32),
HWCAP_CAP(ID_AA64SMFR0_EL1, F16F32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F16F32),
HWCAP_CAP(ID_AA64SMFR0_EL1, B16F32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_B16F32),
HWCAP_CAP(ID_AA64SMFR0_EL1, BI32I32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_BI32I32),
HWCAP_CAP(ID_AA64SMFR0_EL1, F32F32, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_F32F32),
+ HWCAP_CAP(ID_AA64SMFR0_EL1, SF8FMA, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8FMA),
+ HWCAP_CAP(ID_AA64SMFR0_EL1, SF8DP4, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8DP4),
+ HWCAP_CAP(ID_AA64SMFR0_EL1, SF8DP2, IMP, CAP_HWCAP, KERNEL_HWCAP_SME_SF8DP2),
#endif /* CONFIG_ARM64_SME */
+ HWCAP_CAP(ID_AA64FPFR0_EL1, F8CVT, IMP, CAP_HWCAP, KERNEL_HWCAP_F8CVT),
+ HWCAP_CAP(ID_AA64FPFR0_EL1, F8FMA, IMP, CAP_HWCAP, KERNEL_HWCAP_F8FMA),
+ HWCAP_CAP(ID_AA64FPFR0_EL1, F8DP4, IMP, CAP_HWCAP, KERNEL_HWCAP_F8DP4),
+ HWCAP_CAP(ID_AA64FPFR0_EL1, F8DP2, IMP, CAP_HWCAP, KERNEL_HWCAP_F8DP2),
+ HWCAP_CAP(ID_AA64FPFR0_EL1, F8E4M3, IMP, CAP_HWCAP, KERNEL_HWCAP_F8E4M3),
+ HWCAP_CAP(ID_AA64FPFR0_EL1, F8E5M2, IMP, CAP_HWCAP, KERNEL_HWCAP_F8E5M2),
{},
};

diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 12b192060156..f0abb150f73e 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -128,6 +128,21 @@ static const char *const hwcap_str[] = {
[KERNEL_HWCAP_SVE_B16B16] = "sveb16b16",
[KERNEL_HWCAP_LRCPC3] = "lrcpc3",
[KERNEL_HWCAP_LSE128] = "lse128",
+ [KERNEL_HWCAP_FPMR] = "fpmr",
+ [KERNEL_HWCAP_LUT] = "lut",
+ [KERNEL_HWCAP_FAMINMAX] = "faminmax",
+ [KERNEL_HWCAP_F8CVT] = "f8cvt",
+ [KERNEL_HWCAP_F8FMA] = "f8fma",
+ [KERNEL_HWCAP_F8DP4] = "f8dp4",
+ [KERNEL_HWCAP_F8DP2] = "f8dp2",
+ [KERNEL_HWCAP_F8E4M3] = "f8e4m3",
+ [KERNEL_HWCAP_F8E5M2] = "f8e5m2",
+ [KERNEL_HWCAP_SME_LUTV2] = "smelutv2",
+ [KERNEL_HWCAP_SME_F8F16] = "smef8f16",
+ [KERNEL_HWCAP_SME_F8F32] = "smef8f32",
+ [KERNEL_HWCAP_SME_SF8FMA] = "smesf8fma",
+ [KERNEL_HWCAP_SME_SF8DP4] = "smesf8dp4",
+ [KERNEL_HWCAP_SME_SF8DP2] = "smesf8dp2",
};

#ifdef CONFIG_COMPAT

--
2.30.2


2024-01-22 17:27:48

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 07/14] kselftest/arm64: Handle FPMR context in generic signal frame parser

Teach the generic signal frame parsing code about the newly added FPMR
frame, avoiding warnings every time one is generated.

Signed-off-by: Mark Brown <[email protected]>
---
tools/testing/selftests/arm64/signal/testcases/testcases.c | 8 ++++++++
tools/testing/selftests/arm64/signal/testcases/testcases.h | 1 +
2 files changed, 9 insertions(+)

diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.c b/tools/testing/selftests/arm64/signal/testcases/testcases.c
index 9f580b55b388..674b88cc8c39 100644
--- a/tools/testing/selftests/arm64/signal/testcases/testcases.c
+++ b/tools/testing/selftests/arm64/signal/testcases/testcases.c
@@ -209,6 +209,14 @@ bool validate_reserved(ucontext_t *uc, size_t resv_sz, char **err)
zt = (struct zt_context *)head;
new_flags |= ZT_CTX;
break;
+ case FPMR_MAGIC:
+ if (flags & FPMR_CTX)
+ *err = "Multiple FPMR_MAGIC";
+ else if (head->size !=
+ sizeof(struct fpmr_context))
+ *err = "Bad size for fpmr_context";
+ new_flags |= FPMR_CTX;
+ break;
case EXTRA_MAGIC:
if (flags & EXTRA_CTX)
*err = "Multiple EXTRA_MAGIC";
diff --git a/tools/testing/selftests/arm64/signal/testcases/testcases.h b/tools/testing/selftests/arm64/signal/testcases/testcases.h
index a08ab0d6207a..7727126347e0 100644
--- a/tools/testing/selftests/arm64/signal/testcases/testcases.h
+++ b/tools/testing/selftests/arm64/signal/testcases/testcases.h
@@ -19,6 +19,7 @@
#define ZA_CTX (1 << 2)
#define EXTRA_CTX (1 << 3)
#define ZT_CTX (1 << 4)
+#define FPMR_CTX (1 << 5)

#define KSFT_BAD_MAGIC 0xdeadbeef


--
2.30.2


2024-01-22 17:27:50

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 09/14] kselftest/arm64: Add 2023 DPISA hwcap test coverage

Add the hwcaps added for the 2023 DPISA extensions to the hwcaps test
program.

Signed-off-by: Mark Brown <[email protected]>
---
tools/testing/selftests/arm64/abi/hwcap.c | 217 ++++++++++++++++++++++++++++++
1 file changed, 217 insertions(+)

diff --git a/tools/testing/selftests/arm64/abi/hwcap.c b/tools/testing/selftests/arm64/abi/hwcap.c
index 1189e77c8152..d8909b2b535a 100644
--- a/tools/testing/selftests/arm64/abi/hwcap.c
+++ b/tools/testing/selftests/arm64/abi/hwcap.c
@@ -58,11 +58,46 @@ static void cssc_sigill(void)
asm volatile(".inst 0xdac01c00" : : : "x0");
}

+static void f8cvt_sigill(void)
+{
+ /* FSCALE V0.4H, V0.4H, V0.4H */
+ asm volatile(".inst 0x2ec03c00");
+}
+
+static void f8dp2_sigill(void)
+{
+ /* FDOT V0.4H, V0.4H, V0.5H */
+ asm volatile(".inst 0xe40fc00");
+}
+
+static void f8dp4_sigill(void)
+{
+ /* FDOT V0.2S, V0.2S, V0.2S */
+ asm volatile(".inst 0xe00fc00");
+}
+
+static void f8fma_sigill(void)
+{
+ /* FMLALB V0.8H, V0.16B, V0.16B */
+ asm volatile(".inst 0xec0fc00");
+}
+
+static void faminmax_sigill(void)
+{
+ /* FAMIN V0.4H, V0.4H, V0.4H */
+ asm volatile(".inst 0x2ec01c00");
+}
+
static void fp_sigill(void)
{
asm volatile("fmov s0, #1");
}

+static void fpmr_sigill(void)
+{
+ asm volatile("mrs x0, S3_3_C4_C4_2" : : : "x0");
+}
+
static void ilrcpc_sigill(void)
{
/* LDAPUR W0, [SP, #8] */
@@ -95,6 +130,12 @@ static void lse128_sigill(void)
: "cc", "memory");
}

+static void lut_sigill(void)
+{
+ /* LUTI2 V0.16B, { V0.16B }, V[0] */
+ asm volatile(".inst 0x4e801000");
+}
+
static void mops_sigill(void)
{
char dst[1], src[1];
@@ -216,6 +257,78 @@ static void smef16f16_sigill(void)
asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
}

+static void smef8f16_sigill(void)
+{
+ /* SMSTART */
+ asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+ /* FDOT ZA.H[W0, 0], Z0.B-Z1.B, Z0.B-Z1.B */
+ asm volatile(".inst 0xc1a01020" : : : );
+
+ /* SMSTOP */
+ asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smef8f32_sigill(void)
+{
+ /* SMSTART */
+ asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+ /* FDOT ZA.S[W0, 0], { Z0.B-Z1.B }, Z0.B[0] */
+ asm volatile(".inst 0xc1500038" : : : );
+
+ /* SMSTOP */
+ asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smelutv2_sigill(void)
+{
+ /* SMSTART */
+ asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+ /* LUTI4 { Z0.B-Z3.B }, ZT0, { Z0-Z1 } */
+ asm volatile(".inst 0xc08b0000" : : : );
+
+ /* SMSTOP */
+ asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smesf8dp2_sigill(void)
+{
+ /* SMSTART */
+ asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+ /* FDOT Z0.H, Z0.B, Z0.B[0] */
+ asm volatile(".inst 0x64204400" : : : );
+
+ /* SMSTOP */
+ asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smesf8dp4_sigill(void)
+{
+ /* SMSTART */
+ asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+ /* FDOT Z0.S, Z0.B, Z0.B[0] */
+ asm volatile(".inst 0xc1a41C00" : : : );
+
+ /* SMSTOP */
+ asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
+static void smesf8fma_sigill(void)
+{
+ /* SMSTART */
+ asm volatile("msr S0_3_C4_C7_3, xzr" : : : );
+
+ /* FMLALB V0.8H, V0.16B, V0.16B */
+ asm volatile(".inst 0xec0fc00");
+
+ /* SMSTOP */
+ asm volatile("msr S0_3_C4_C6_3, xzr" : : : );
+}
+
static void sve_sigill(void)
{
/* RDVL x0, #0 */
@@ -353,6 +466,53 @@ static const struct hwcap_data {
.cpuinfo = "cssc",
.sigill_fn = cssc_sigill,
},
+ {
+ .name = "F8CVT",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_F8CVT,
+ .cpuinfo = "f8cvt",
+ .sigill_fn = f8cvt_sigill,
+ },
+ {
+ .name = "F8DP4",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_F8DP4,
+ .cpuinfo = "f8dp4",
+ .sigill_fn = f8dp4_sigill,
+ },
+ {
+ .name = "F8DP2",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_F8DP2,
+ .cpuinfo = "f8dp4",
+ .sigill_fn = f8dp2_sigill,
+ },
+ {
+ .name = "F8E5M2",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_F8E5M2,
+ .cpuinfo = "f8e5m2",
+ },
+ {
+ .name = "F8E4M3",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_F8E4M3,
+ .cpuinfo = "f8e4m3",
+ },
+ {
+ .name = "F8FMA",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_F8FMA,
+ .cpuinfo = "f8fma",
+ .sigill_fn = f8fma_sigill,
+ },
+ {
+ .name = "FAMINMAX",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_FAMINMAX,
+ .cpuinfo = "faminmax",
+ .sigill_fn = faminmax_sigill,
+ },
{
.name = "FP",
.at_hwcap = AT_HWCAP,
@@ -360,6 +520,14 @@ static const struct hwcap_data {
.cpuinfo = "fp",
.sigill_fn = fp_sigill,
},
+ {
+ .name = "FPMR",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_FPMR,
+ .cpuinfo = "fpmr",
+ .sigill_fn = fpmr_sigill,
+ .sigill_reliable = true,
+ },
{
.name = "JSCVT",
.at_hwcap = AT_HWCAP,
@@ -411,6 +579,13 @@ static const struct hwcap_data {
.cpuinfo = "lse128",
.sigill_fn = lse128_sigill,
},
+ {
+ .name = "LUT",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_LUT,
+ .cpuinfo = "lut",
+ .sigill_fn = lut_sigill,
+ },
{
.name = "MOPS",
.at_hwcap = AT_HWCAP2,
@@ -511,6 +686,48 @@ static const struct hwcap_data {
.cpuinfo = "smef16f16",
.sigill_fn = smef16f16_sigill,
},
+ {
+ .name = "SME F8F16",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_SME_F8F16,
+ .cpuinfo = "smef8f16",
+ .sigill_fn = smef8f16_sigill,
+ },
+ {
+ .name = "SME F8F32",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_SME_F8F32,
+ .cpuinfo = "smef8f32",
+ .sigill_fn = smef8f32_sigill,
+ },
+ {
+ .name = "SME LUTV2",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_SME_LUTV2,
+ .cpuinfo = "smelutv2",
+ .sigill_fn = smelutv2_sigill,
+ },
+ {
+ .name = "SME SF8FMA",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_SME_SF8FMA,
+ .cpuinfo = "smesf8fma",
+ .sigill_fn = smesf8fma_sigill,
+ },
+ {
+ .name = "SME SF8DP2",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_SME_SF8DP2,
+ .cpuinfo = "smesf8dp2",
+ .sigill_fn = smesf8dp2_sigill,
+ },
+ {
+ .name = "SME SF8DP4",
+ .at_hwcap = AT_HWCAP2,
+ .hwcap_bit = HWCAP2_SME_SF8DP4,
+ .cpuinfo = "smesf8dp4",
+ .sigill_fn = smesf8dp4_sigill,
+ },
{
.name = "SVE",
.at_hwcap = AT_HWCAP,

--
2.30.2


2024-01-22 17:28:09

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 10/14] KVM: arm64: Share all userspace hardened thread data with the hypervisor

As part of the lazy FPSIMD state transitioning done by the hypervisor we
currently share the userpsace FPSIMD state in thread->uw.fpsimd_state with
the host. Since this struct is non-extensible userspace ABI we have to keep
the definition as is but the addition of FPMR in the 2023 dpISA means that
we will want to share more storage with the host. To facilitate this
refactor the current code to share the entire thread->uw rather than just
the one field.

The large number of references to fpsimd_state make it very inconvenient
to add an additional wrapper struct.

Signed-off-by: Mark Brown <[email protected]>
---
arch/arm64/include/asm/kvm_host.h | 3 ++-
arch/arm64/include/asm/processor.h | 2 +-
arch/arm64/kvm/fpsimd.c | 13 ++++++-------
arch/arm64/kvm/hyp/include/hyp/switch.h | 2 +-
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 4 ++--
5 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7993694a54af..c4fdcc94d733 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -27,6 +27,7 @@
#include <asm/fpsimd.h>
#include <asm/kvm.h>
#include <asm/kvm_asm.h>
+#include <asm/processor.h>
#include <asm/vncr_mapping.h>

#define __KVM_HAVE_ARCH_INTC_INITIALIZED
@@ -601,7 +602,7 @@ struct kvm_vcpu_arch {
struct kvm_guest_debug_arch vcpu_debug_state;
struct kvm_guest_debug_arch external_debug_state;

- struct user_fpsimd_state *host_fpsimd_state; /* hyp VA */
+ struct thread_struct_uw *host_uw; /* hyp VA */
struct task_struct *parent_task;

struct {
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index b453c66d3fae..544baa57f9b9 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -152,7 +152,7 @@ struct thread_struct {
* Maintainers must ensure manually that this contains no
* implicit padding.
*/
- struct {
+ struct thread_struct_uw {
unsigned long tp_value; /* TLS register */
unsigned long tp2_value;
unsigned long fpmr;
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index e3e611e30e91..6cf22cd8f020 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -17,13 +17,13 @@
void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
{
struct task_struct *p = vcpu->arch.parent_task;
- struct user_fpsimd_state *fpsimd;
+ struct thread_struct_uw *uw;

if (!is_protected_kvm_enabled() || !p)
return;

- fpsimd = &p->thread.uw.fpsimd_state;
- kvm_unshare_hyp(fpsimd, fpsimd + 1);
+ uw = &p->thread.uw;
+ kvm_unshare_hyp(uw, uw + 1);
put_task_struct(p);
}

@@ -39,17 +39,16 @@ void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
{
int ret;
-
- struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
+ struct thread_struct_uw *uw = &current->thread.uw;

kvm_vcpu_unshare_task_fp(vcpu);

/* Make sure the host task fpsimd state is visible to hyp: */
- ret = kvm_share_hyp(fpsimd, fpsimd + 1);
+ ret = kvm_share_hyp(uw, uw + 1);
if (ret)
return ret;

- vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
+ vcpu->arch.host_uw = kern_hyp_va(uw);

/*
* We need to keep current's task_struct pinned until its data has been
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index a038320cdb08..27fcdfd432b9 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -371,7 +371,7 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)

/* Write out the host state if it's in the registers */
if (vcpu->arch.fp_state == FP_STATE_HOST_OWNED)
- __fpsimd_save_state(vcpu->arch.host_fpsimd_state);
+ __fpsimd_save_state(&(vcpu->arch.host_uw->fpsimd_state));

/* Restore the guest state */
if (sve_guest)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 2385fd03ed87..eb2208009875 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -42,7 +42,7 @@ static void flush_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
hyp_vcpu->vcpu.arch.fp_state = host_vcpu->arch.fp_state;

hyp_vcpu->vcpu.arch.debug_ptr = kern_hyp_va(host_vcpu->arch.debug_ptr);
- hyp_vcpu->vcpu.arch.host_fpsimd_state = host_vcpu->arch.host_fpsimd_state;
+ hyp_vcpu->vcpu.arch.host_uw = host_vcpu->arch.host_uw;

hyp_vcpu->vcpu.arch.vsesr_el2 = host_vcpu->arch.vsesr_el2;

@@ -64,7 +64,7 @@ static void sync_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu)
host_vcpu->arch.fault = hyp_vcpu->vcpu.arch.fault;

host_vcpu->arch.iflags = hyp_vcpu->vcpu.arch.iflags;
- host_vcpu->arch.fp_state = hyp_vcpu->vcpu.arch.fp_state;
+ host_vcpu->arch.host_uw = hyp_vcpu->vcpu.arch.host_uw;

host_cpu_if->vgic_hcr = hyp_cpu_if->vgic_hcr;
for (i = 0; i < hyp_cpu_if->used_lrs; ++i)

--
2.30.2


2024-01-22 17:28:23

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 11/14] KVM: arm64: Add newly allocated ID registers to register descriptions

The 2023 architecture extensions have allocated some new ID registers, add
them to the KVM system register descriptions so that they are visible to
guests.

Signed-off-by: Mark Brown <[email protected]>
---
arch/arm64/kvm/sys_regs.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 30253bd19917..38503b1cd2eb 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2292,12 +2292,12 @@ static const struct sys_reg_desc sys_reg_descs[] = {
ID_AA64PFR0_EL1_AdvSIMD |
ID_AA64PFR0_EL1_FP), },
ID_SANITISED(ID_AA64PFR1_EL1),
- ID_UNALLOCATED(4,2),
+ ID_SANITISED(ID_AA64PFR2_EL1),
ID_UNALLOCATED(4,3),
ID_WRITABLE(ID_AA64ZFR0_EL1, ~ID_AA64ZFR0_EL1_RES0),
ID_HIDDEN(ID_AA64SMFR0_EL1),
ID_UNALLOCATED(4,6),
- ID_UNALLOCATED(4,7),
+ ID_SANITISED(ID_AA64FPFR0_EL1),

/* CRm=5 */
{ SYS_DESC(SYS_ID_AA64DFR0_EL1),
@@ -2324,7 +2324,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
ID_WRITABLE(ID_AA64ISAR2_EL1, ~(ID_AA64ISAR2_EL1_RES0 |
ID_AA64ISAR2_EL1_APA3 |
ID_AA64ISAR2_EL1_GPA3)),
- ID_UNALLOCATED(6,3),
+ ID_WRITABLE(ID_AA64ISAR3_EL1, ~ID_AA64ISAR3_EL1_RES0),
ID_UNALLOCATED(6,4),
ID_UNALLOCATED(6,5),
ID_UNALLOCATED(6,6),

--
2.30.2


2024-01-22 17:31:02

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 05/14] arm64/ptrace: Expose FPMR via ptrace

Add a new regset to expose FPMR via ptrace. It is not added to the FPSIMD
registers since that structure is exposed elsewhere without any allowance
for extension we don't add there.

Signed-off-by: Mark Brown <[email protected]>
---
arch/arm64/kernel/ptrace.c | 42 ++++++++++++++++++++++++++++++++++++++++++
include/uapi/linux/elf.h | 1 +
2 files changed, 43 insertions(+)

diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index dc6cf0e37194..aacb45bd36e6 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -698,6 +698,39 @@ static int tls_set(struct task_struct *target, const struct user_regset *regset,
return ret;
}

+static int fpmr_get(struct task_struct *target, const struct user_regset *regset,
+ struct membuf to)
+{
+ if (!system_supports_fpmr())
+ return -EINVAL;
+
+ if (target == current)
+ fpsimd_preserve_current_state();
+
+ return membuf_store(&to, target->thread.uw.fpmr);
+}
+
+static int fpmr_set(struct task_struct *target, const struct user_regset *regset,
+ unsigned int pos, unsigned int count,
+ const void *kbuf, const void __user *ubuf)
+{
+ int ret;
+ unsigned long fpmr;
+
+ if (!system_supports_fpmr())
+ return -EINVAL;
+
+ ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &fpmr, 0, count);
+ if (ret)
+ return ret;
+
+ target->thread.uw.fpmr = fpmr;
+
+ fpsimd_flush_task_state(target);
+
+ return 0;
+}
+
static int system_call_get(struct task_struct *target,
const struct user_regset *regset,
struct membuf to)
@@ -1419,6 +1452,7 @@ enum aarch64_regset {
REGSET_HW_BREAK,
REGSET_HW_WATCH,
#endif
+ REGSET_FPMR,
REGSET_SYSTEM_CALL,
#ifdef CONFIG_ARM64_SVE
REGSET_SVE,
@@ -1497,6 +1531,14 @@ static const struct user_regset aarch64_regsets[] = {
.regset_get = system_call_get,
.set = system_call_set,
},
+ [REGSET_FPMR] = {
+ .core_note_type = NT_ARM_FPMR,
+ .n = 1,
+ .size = sizeof(u64),
+ .align = sizeof(u64),
+ .regset_get = fpmr_get,
+ .set = fpmr_set,
+ },
#ifdef CONFIG_ARM64_SVE
[REGSET_SVE] = { /* Scalable Vector Extension */
.core_note_type = NT_ARM_SVE,
diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
index 9417309b7230..b54b313bcf07 100644
--- a/include/uapi/linux/elf.h
+++ b/include/uapi/linux/elf.h
@@ -440,6 +440,7 @@ typedef struct elf64_shdr {
#define NT_ARM_SSVE 0x40b /* ARM Streaming SVE registers */
#define NT_ARM_ZA 0x40c /* ARM SME ZA registers */
#define NT_ARM_ZT 0x40d /* ARM SME ZT registers */
+#define NT_ARM_FPMR 0x40e /* ARM floating point mode register */
#define NT_ARC_V2 0x600 /* ARCv2 accumulator/extra registers */
#define NT_VMCOREDD 0x700 /* Vmcore Device Dump Note */
#define NT_MIPS_DSP 0x800 /* MIPS DSP ASE registers */

--
2.30.2


2024-01-22 17:31:36

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 08/14] kselftest/arm64: Add basic FPMR test

Verify that a FPMR frame is generated on systems that support FPMR and not
generated otherwise.

Signed-off-by: Mark Brown <[email protected]>
---
tools/testing/selftests/arm64/signal/.gitignore | 1 +
.../arm64/signal/testcases/fpmr_siginfo.c | 82 ++++++++++++++++++++++
2 files changed, 83 insertions(+)

diff --git a/tools/testing/selftests/arm64/signal/.gitignore b/tools/testing/selftests/arm64/signal/.gitignore
index 839e3a252629..1ce5b5eac386 100644
--- a/tools/testing/selftests/arm64/signal/.gitignore
+++ b/tools/testing/selftests/arm64/signal/.gitignore
@@ -1,6 +1,7 @@
# SPDX-License-Identifier: GPL-2.0-only
mangle_*
fake_sigreturn_*
+fpmr_*
sme_*
ssve_*
sve_*
diff --git a/tools/testing/selftests/arm64/signal/testcases/fpmr_siginfo.c b/tools/testing/selftests/arm64/signal/testcases/fpmr_siginfo.c
new file mode 100644
index 000000000000..e9d24685e741
--- /dev/null
+++ b/tools/testing/selftests/arm64/signal/testcases/fpmr_siginfo.c
@@ -0,0 +1,82 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023 ARM Limited
+ *
+ * Verify that the FPMR register context in signal frames is set up as
+ * expected.
+ */
+
+#include <signal.h>
+#include <ucontext.h>
+#include <sys/auxv.h>
+#include <sys/prctl.h>
+#include <unistd.h>
+#include <asm/sigcontext.h>
+
+#include "test_signals_utils.h"
+#include "testcases.h"
+
+static union {
+ ucontext_t uc;
+ char buf[1024 * 128];
+} context;
+
+#define SYS_FPMR "S3_3_C4_C4_2"
+
+static uint64_t get_fpmr(void)
+{
+ uint64_t val;
+
+ asm volatile (
+ "mrs %0, " SYS_FPMR "\n"
+ : "=r"(val)
+ :
+ : "cc");
+
+ return val;
+}
+
+int fpmr_present(struct tdescr *td, siginfo_t *si, ucontext_t *uc)
+{
+ struct _aarch64_ctx *head = GET_BUF_RESV_HEAD(context);
+ struct fpmr_context *fpmr_ctx;
+ size_t offset;
+ bool in_sigframe;
+ bool have_fpmr;
+ __u64 orig_fpmr;
+
+ have_fpmr = getauxval(AT_HWCAP2) & HWCAP2_FPMR;
+ if (have_fpmr)
+ orig_fpmr = get_fpmr();
+
+ if (!get_current_context(td, &context.uc, sizeof(context)))
+ return 1;
+
+ fpmr_ctx = (struct fpmr_context *)
+ get_header(head, FPMR_MAGIC, td->live_sz, &offset);
+
+ in_sigframe = fpmr_ctx != NULL;
+
+ fprintf(stderr, "FPMR sigframe %s on system %s FPMR\n",
+ in_sigframe ? "present" : "absent",
+ have_fpmr ? "with" : "without");
+
+ td->pass = (in_sigframe == have_fpmr);
+
+ if (have_fpmr && fpmr_ctx) {
+ if (fpmr_ctx->fpmr != orig_fpmr) {
+ fprintf(stderr, "FPMR in frame is %llx, was %llx\n",
+ fpmr_ctx->fpmr, orig_fpmr);
+ td->pass = false;
+ }
+ }
+
+ return 0;
+}
+
+struct tdescr tde = {
+ .name = "FPMR",
+ .descr = "Validate that FPMR is present as expected",
+ .timeout = 3,
+ .run = fpmr_present,
+};

--
2.30.2


2024-01-22 17:32:31

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 13/14] KVM: arm64: selftests: Document feature registers added in 2023 extensions

The 2023 architecture extensions allocated some previously usused feature
registers, add comments mapping the names in get-reg-list as we do for the
other allocated registers.

Signed-off-by: Mark Brown <[email protected]>
---
tools/testing/selftests/kvm/aarch64/get-reg-list.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/aarch64/get-reg-list.c b/tools/testing/selftests/kvm/aarch64/get-reg-list.c
index 709d7d721760..71ea6ecec7ce 100644
--- a/tools/testing/selftests/kvm/aarch64/get-reg-list.c
+++ b/tools/testing/selftests/kvm/aarch64/get-reg-list.c
@@ -428,7 +428,7 @@ static __u64 base_regs[] = {
ARM64_SYS_REG(3, 0, 0, 4, 4), /* ID_AA64ZFR0_EL1 */
ARM64_SYS_REG(3, 0, 0, 4, 5), /* ID_AA64SMFR0_EL1 */
ARM64_SYS_REG(3, 0, 0, 4, 6),
- ARM64_SYS_REG(3, 0, 0, 4, 7),
+ ARM64_SYS_REG(3, 0, 0, 4, 7), /* ID_AA64FPFR_EL1 */
ARM64_SYS_REG(3, 0, 0, 5, 0), /* ID_AA64DFR0_EL1 */
ARM64_SYS_REG(3, 0, 0, 5, 1), /* ID_AA64DFR1_EL1 */
ARM64_SYS_REG(3, 0, 0, 5, 2),
@@ -440,7 +440,7 @@ static __u64 base_regs[] = {
ARM64_SYS_REG(3, 0, 0, 6, 0), /* ID_AA64ISAR0_EL1 */
ARM64_SYS_REG(3, 0, 0, 6, 1), /* ID_AA64ISAR1_EL1 */
ARM64_SYS_REG(3, 0, 0, 6, 2), /* ID_AA64ISAR2_EL1 */
- ARM64_SYS_REG(3, 0, 0, 6, 3),
+ ARM64_SYS_REG(3, 0, 0, 6, 3), /* ID_AA64ISAR3_EL1 */
ARM64_SYS_REG(3, 0, 0, 6, 4),
ARM64_SYS_REG(3, 0, 0, 6, 5),
ARM64_SYS_REG(3, 0, 0, 6, 6),

--
2.30.2


2024-01-22 17:36:43

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 01/14] arm64/cpufeature: Hook new identification registers up to cpufeature

The 2023 architecture extensions have defined several new ID registers,
hook them up to the cpufeature code so we can add feature checks and hwcaps
based on their contents.

Signed-off-by: Mark Brown <[email protected]>
---
arch/arm64/include/asm/cpu.h | 3 +++
arch/arm64/kernel/cpufeature.c | 28 ++++++++++++++++++++++++++++
arch/arm64/kernel/cpuinfo.c | 3 +++
3 files changed, 34 insertions(+)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index b1e43f56ee46..96379be913cd 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -52,14 +52,17 @@ struct cpuinfo_arm64 {
u64 reg_id_aa64isar0;
u64 reg_id_aa64isar1;
u64 reg_id_aa64isar2;
+ u64 reg_id_aa64isar3;
u64 reg_id_aa64mmfr0;
u64 reg_id_aa64mmfr1;
u64 reg_id_aa64mmfr2;
u64 reg_id_aa64mmfr3;
u64 reg_id_aa64pfr0;
u64 reg_id_aa64pfr1;
+ u64 reg_id_aa64pfr2;
u64 reg_id_aa64zfr0;
u64 reg_id_aa64smfr0;
+ u64 reg_id_aa64fpfr0;

struct cpuinfo_32bit aarch32;
};
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 8d1a634a403e..eae59ec0f4b0 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -234,6 +234,10 @@ static const struct arm64_ftr_bits ftr_id_aa64isar2[] = {
ARM64_FTR_END,
};

+static const struct arm64_ftr_bits ftr_id_aa64isar3[] = {
+ ARM64_FTR_END,
+};
+
static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL1_CSV3_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_EL1_CSV2_SHIFT, 4, 0),
@@ -267,6 +271,10 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
ARM64_FTR_END,
};

+static const struct arm64_ftr_bits ftr_id_aa64pfr2[] = {
+ ARM64_FTR_END,
+};
+
static const struct arm64_ftr_bits ftr_id_aa64zfr0[] = {
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE),
FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ZFR0_EL1_F64MM_SHIFT, 4, 0),
@@ -319,6 +327,10 @@ static const struct arm64_ftr_bits ftr_id_aa64smfr0[] = {
ARM64_FTR_END,
};

+static const struct arm64_ftr_bits ftr_id_aa64fpfr0[] = {
+ ARM64_FTR_END,
+};
+
static const struct arm64_ftr_bits ftr_id_aa64mmfr0[] = {
ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_EL1_ECV_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64MMFR0_EL1_FGT_SHIFT, 4, 0),
@@ -702,10 +714,12 @@ static const struct __ftr_reg_entry {
&id_aa64pfr0_override),
ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64PFR1_EL1, ftr_id_aa64pfr1,
&id_aa64pfr1_override),
+ ARM64_FTR_REG(SYS_ID_AA64PFR2_EL1, ftr_id_aa64pfr2),
ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64ZFR0_EL1, ftr_id_aa64zfr0,
&id_aa64zfr0_override),
ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64SMFR0_EL1, ftr_id_aa64smfr0,
&id_aa64smfr0_override),
+ ARM64_FTR_REG(SYS_ID_AA64FPFR0_EL1, ftr_id_aa64fpfr0),

/* Op1 = 0, CRn = 0, CRm = 5 */
ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0),
@@ -717,6 +731,7 @@ static const struct __ftr_reg_entry {
&id_aa64isar1_override),
ARM64_FTR_REG_OVERRIDE(SYS_ID_AA64ISAR2_EL1, ftr_id_aa64isar2,
&id_aa64isar2_override),
+ ARM64_FTR_REG(SYS_ID_AA64ISAR3_EL1, ftr_id_aa64isar3),

/* Op1 = 0, CRn = 0, CRm = 7 */
ARM64_FTR_REG(SYS_ID_AA64MMFR0_EL1, ftr_id_aa64mmfr0),
@@ -1043,14 +1058,17 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
init_cpu_ftr_reg(SYS_ID_AA64ISAR0_EL1, info->reg_id_aa64isar0);
init_cpu_ftr_reg(SYS_ID_AA64ISAR1_EL1, info->reg_id_aa64isar1);
init_cpu_ftr_reg(SYS_ID_AA64ISAR2_EL1, info->reg_id_aa64isar2);
+ init_cpu_ftr_reg(SYS_ID_AA64ISAR3_EL1, info->reg_id_aa64isar3);
init_cpu_ftr_reg(SYS_ID_AA64MMFR0_EL1, info->reg_id_aa64mmfr0);
init_cpu_ftr_reg(SYS_ID_AA64MMFR1_EL1, info->reg_id_aa64mmfr1);
init_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2);
init_cpu_ftr_reg(SYS_ID_AA64MMFR3_EL1, info->reg_id_aa64mmfr3);
init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);
init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
+ init_cpu_ftr_reg(SYS_ID_AA64PFR2_EL1, info->reg_id_aa64pfr2);
init_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0);
init_cpu_ftr_reg(SYS_ID_AA64SMFR0_EL1, info->reg_id_aa64smfr0);
+ init_cpu_ftr_reg(SYS_ID_AA64FPFR0_EL1, info->reg_id_aa64fpfr0);

if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0))
init_32bit_cpu_features(&info->aarch32);
@@ -1272,6 +1290,8 @@ void update_cpu_features(int cpu,
info->reg_id_aa64isar1, boot->reg_id_aa64isar1);
taint |= check_update_ftr_reg(SYS_ID_AA64ISAR2_EL1, cpu,
info->reg_id_aa64isar2, boot->reg_id_aa64isar2);
+ taint |= check_update_ftr_reg(SYS_ID_AA64ISAR3_EL1, cpu,
+ info->reg_id_aa64isar3, boot->reg_id_aa64isar3);

/*
* Differing PARange support is fine as long as all peripherals and
@@ -1291,6 +1311,8 @@ void update_cpu_features(int cpu,
info->reg_id_aa64pfr0, boot->reg_id_aa64pfr0);
taint |= check_update_ftr_reg(SYS_ID_AA64PFR1_EL1, cpu,
info->reg_id_aa64pfr1, boot->reg_id_aa64pfr1);
+ taint |= check_update_ftr_reg(SYS_ID_AA64PFR2_EL1, cpu,
+ info->reg_id_aa64pfr2, boot->reg_id_aa64pfr2);

taint |= check_update_ftr_reg(SYS_ID_AA64ZFR0_EL1, cpu,
info->reg_id_aa64zfr0, boot->reg_id_aa64zfr0);
@@ -1298,6 +1320,9 @@ void update_cpu_features(int cpu,
taint |= check_update_ftr_reg(SYS_ID_AA64SMFR0_EL1, cpu,
info->reg_id_aa64smfr0, boot->reg_id_aa64smfr0);

+ taint |= check_update_ftr_reg(SYS_ID_AA64FPFR0_EL1, cpu,
+ info->reg_id_aa64fpfr0, boot->reg_id_aa64fpfr0);
+
/* Probe vector lengths */
if (IS_ENABLED(CONFIG_ARM64_SVE) &&
id_aa64pfr0_sve(read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1))) {
@@ -1410,8 +1435,10 @@ u64 __read_sysreg_by_encoding(u32 sys_id)

read_sysreg_case(SYS_ID_AA64PFR0_EL1);
read_sysreg_case(SYS_ID_AA64PFR1_EL1);
+ read_sysreg_case(SYS_ID_AA64PFR2_EL1);
read_sysreg_case(SYS_ID_AA64ZFR0_EL1);
read_sysreg_case(SYS_ID_AA64SMFR0_EL1);
+ read_sysreg_case(SYS_ID_AA64FPFR0_EL1);
read_sysreg_case(SYS_ID_AA64DFR0_EL1);
read_sysreg_case(SYS_ID_AA64DFR1_EL1);
read_sysreg_case(SYS_ID_AA64MMFR0_EL1);
@@ -1421,6 +1448,7 @@ u64 __read_sysreg_by_encoding(u32 sys_id)
read_sysreg_case(SYS_ID_AA64ISAR0_EL1);
read_sysreg_case(SYS_ID_AA64ISAR1_EL1);
read_sysreg_case(SYS_ID_AA64ISAR2_EL1);
+ read_sysreg_case(SYS_ID_AA64ISAR3_EL1);

read_sysreg_case(SYS_CNTFRQ_EL0);
read_sysreg_case(SYS_CTR_EL0);
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 47043c0d95ec..12b192060156 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -443,14 +443,17 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
info->reg_id_aa64isar0 = read_cpuid(ID_AA64ISAR0_EL1);
info->reg_id_aa64isar1 = read_cpuid(ID_AA64ISAR1_EL1);
info->reg_id_aa64isar2 = read_cpuid(ID_AA64ISAR2_EL1);
+ info->reg_id_aa64isar3 = read_cpuid(ID_AA64ISAR3_EL1);
info->reg_id_aa64mmfr0 = read_cpuid(ID_AA64MMFR0_EL1);
info->reg_id_aa64mmfr1 = read_cpuid(ID_AA64MMFR1_EL1);
info->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1);
info->reg_id_aa64mmfr3 = read_cpuid(ID_AA64MMFR3_EL1);
info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1);
info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1);
+ info->reg_id_aa64pfr2 = read_cpuid(ID_AA64PFR2_EL1);
info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1);
info->reg_id_aa64smfr0 = read_cpuid(ID_AA64SMFR0_EL1);
+ info->reg_id_aa64fpfr0 = read_cpuid(ID_AA64FPFR0_EL1);

if (id_aa64pfr1_mte(info->reg_id_aa64pfr1))
info->reg_gmid = read_cpuid(GMID_EL1);

--
2.30.2


2024-01-22 17:56:05

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 12/14] KVM: arm64: Support FEAT_FPMR for guests

FEAT_FPMR introduces a new system register FPMR which allows configuration
of floating point behaviour, currently for FP8 specific features. Allow use
of this in guests, disabling the trap while guests are running and saving
and restoring the value along with the rest of the floating point state.
Since FPMR is stored immediately after the main floating point state we
share it with the hypervisor by adjusting the size of the shared region.

Access to FPMR is covered by both a register specific trap HCRX_EL2.EnFPM
and the overall floating point access trap so we just unconditionally
enable the FPMR specific trap and rely on the floating point access trap to
detect guest floating point usage.

Signed-off-by: Mark Brown <[email protected]>
---
arch/arm64/include/asm/kvm_arm.h | 2 +-
arch/arm64/include/asm/kvm_host.h | 3 ++-
arch/arm64/kvm/emulate-nested.c | 8 ++++++++
arch/arm64/kvm/fpsimd.c | 2 +-
arch/arm64/kvm/hyp/include/hyp/switch.h | 7 ++++++-
arch/arm64/kvm/sys_regs.c | 11 +++++++++++
6 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 7f45ce9170bb..b2ddb6165953 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -103,7 +103,7 @@
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)

#define HCRX_GUEST_FLAGS \
- (HCRX_EL2_SMPME | HCRX_EL2_TCR2En | \
+ (HCRX_EL2_SMPME | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM | \
(cpus_have_final_cap(ARM64_HAS_MOPS) ? (HCRX_EL2_MSCEn | HCRX_EL2_MCE2) : 0))
#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index c4fdcc94d733..99c0f8944f04 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -384,6 +384,8 @@ enum vcpu_sysreg {
APGAKEYLO_EL1,
APGAKEYHI_EL1,

+ FPMR,
+
/* Memory Tagging Extension registers */
RGSR_EL1, /* Random Allocation Tag Seed Register */
GCR_EL1, /* Tag Control Register */
@@ -544,7 +546,6 @@ struct kvm_vcpu_arch {
enum fp_type fp_type;
unsigned int sve_max_vl;
u64 svcr;
- unsigned long fpmr;

/* Stage 2 paging state used by the hardware on next switch */
struct kvm_s2_mmu *hw_mmu;
diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c
index 431fd429932d..3af5fd0e28dc 100644
--- a/arch/arm64/kvm/emulate-nested.c
+++ b/arch/arm64/kvm/emulate-nested.c
@@ -67,6 +67,8 @@ enum cgt_group_id {
CGT_HCR_TTLBIS,
CGT_HCR_TTLBOS,

+ CGT_HCRX_EnFPM,
+
CGT_MDCR_TPMCR,
CGT_MDCR_TPM,
CGT_MDCR_TDE,
@@ -279,6 +281,11 @@ static const struct trap_bits coarse_trap_bits[] = {
.mask = HCR_TTLBOS,
.behaviour = BEHAVE_FORWARD_ANY,
},
+ [CGT_HCRX_EnFPM] = {
+ .index = HCRX_EL2,
+ .mask = HCRX_EL2_EnFPM,
+ .behaviour = BEHAVE_FORWARD_ANY,
+ },
[CGT_MDCR_TPMCR] = {
.index = MDCR_EL2,
.value = MDCR_EL2_TPMCR,
@@ -478,6 +485,7 @@ static const struct encoding_to_trap_config encoding_to_cgt[] __initconst = {
SR_TRAP(SYS_AIDR_EL1, CGT_HCR_TID1),
SR_TRAP(SYS_SMIDR_EL1, CGT_HCR_TID1),
SR_TRAP(SYS_CTR_EL0, CGT_HCR_TID2),
+ SR_TRAP(SYS_FPMR, CGT_HCRX_EnFPM),
SR_TRAP(SYS_CCSIDR_EL1, CGT_HCR_TID2_TID4),
SR_TRAP(SYS_CCSIDR2_EL1, CGT_HCR_TID2_TID4),
SR_TRAP(SYS_CLIDR_EL1, CGT_HCR_TID2_TID4),
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 6cf22cd8f020..9e002489c843 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -152,7 +152,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
fp_state.sve_vl = vcpu->arch.sve_max_vl;
fp_state.sme_state = NULL;
fp_state.svcr = &vcpu->arch.svcr;
- fp_state.fpmr = &vcpu->arch.fpmr;
+ fp_state.fpmr = (unsigned long *)&__vcpu_sys_reg(vcpu, FPMR);
fp_state.fp_type = &vcpu->arch.fp_type;

if (vcpu_has_sve(vcpu))
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 27fcdfd432b9..abf785c473d0 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -370,10 +370,15 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
isb();

/* Write out the host state if it's in the registers */
- if (vcpu->arch.fp_state == FP_STATE_HOST_OWNED)
+ if (vcpu->arch.fp_state == FP_STATE_HOST_OWNED) {
__fpsimd_save_state(&(vcpu->arch.host_uw->fpsimd_state));
+ if (cpus_have_final_cap(ARM64_HAS_FPMR))
+ vcpu->arch.host_uw->fpmr = read_sysreg_s(SYS_FPMR);
+ }

/* Restore the guest state */
+ if (cpus_have_final_cap(ARM64_HAS_FPMR))
+ write_sysreg_s(__vcpu_sys_reg(vcpu, FPMR), SYS_FPMR);
if (sve_guest)
__hyp_sve_restore_guest(vcpu);
else
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 38503b1cd2eb..216eac44c124 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -2067,6 +2067,15 @@ static unsigned int hidden_user_visibility(const struct kvm_vcpu *vcpu,
.visibility = hidden_user_visibility, \
}

+static unsigned int fpmr_visibility(const struct kvm_vcpu *vcpu,
+ const struct sys_reg_desc *rd)
+{
+ if (cpus_have_final_cap(ARM64_HAS_FPMR))
+ return 0;
+
+ return REG_HIDDEN;
+}
+
/*
* Since reset() callback and field val are not used for idregs, they will be
* used for specific purposes for idregs.
@@ -2463,6 +2472,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 },
{ SYS_DESC(SYS_CTR_EL0), access_ctr },
{ SYS_DESC(SYS_SVCR), undef_access },
+ { SYS_DESC(SYS_FPMR), access_rw, reset_unknown, FPMR,
+ .visibility = fpmr_visibility },

{ PMU_SYS_REG(PMCR_EL0), .access = access_pmcr, .reset = reset_pmcr,
.reg = PMCR_EL0, .get_user = get_pmcr, .set_user = set_pmcr },

--
2.30.2


2024-01-22 18:09:16

by Mark Brown

[permalink] [raw]
Subject: [PATCH v4 14/14] KVM: arm64: selftests: Teach get-reg-list about FPMR

FEAT_FPMR defines a new register FMPR which is available at all ELs and is
discovered via ID_AA64PFR2_EL1.FPMR, add this to the set of registers that
get-reg-list knows to check for with the required identification register
depdendency.

Signed-off-by: Mark Brown <[email protected]>
---
tools/testing/selftests/kvm/aarch64/get-reg-list.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/tools/testing/selftests/kvm/aarch64/get-reg-list.c b/tools/testing/selftests/kvm/aarch64/get-reg-list.c
index 71ea6ecec7ce..1e43511d1440 100644
--- a/tools/testing/selftests/kvm/aarch64/get-reg-list.c
+++ b/tools/testing/selftests/kvm/aarch64/get-reg-list.c
@@ -40,6 +40,12 @@ static struct feature_id_reg feat_id_regs[] = {
ARM64_SYS_REG(3, 0, 0, 7, 3), /* ID_AA64MMFR3_EL1 */
4,
1
+ },
+ {
+ ARM64_SYS_REG(3, 3, 4, 4, 2), /* FPMR */
+ ARM64_SYS_REG(3, 0, 0, 4, 2), /* ID_AA64PFR2_EL1 */
+ 32,
+ 1
}
};

@@ -481,6 +487,7 @@ static __u64 base_regs[] = {
ARM64_SYS_REG(3, 3, 14, 2, 1), /* CNTP_CTL_EL0 */
ARM64_SYS_REG(3, 3, 14, 2, 2), /* CNTP_CVAL_EL0 */
ARM64_SYS_REG(3, 4, 3, 0, 0), /* DACR32_EL2 */
+ ARM64_SYS_REG(3, 3, 4, 4, 2), /* FPMR */
ARM64_SYS_REG(3, 4, 5, 0, 1), /* IFSR32_EL2 */
ARM64_SYS_REG(3, 4, 5, 3, 0), /* FPEXC32_EL2 */
};

--
2.30.2


2024-02-23 11:33:41

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v4 11/14] KVM: arm64: Add newly allocated ID registers to register descriptions

On Mon, 22 Jan 2024 16:28:14 +0000,
Mark Brown <[email protected]> wrote:
>
> The 2023 architecture extensions have allocated some new ID registers, add
> them to the KVM system register descriptions so that they are visible to
> guests.
>
> Signed-off-by: Mark Brown <[email protected]>
> ---
> arch/arm64/kvm/sys_regs.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 30253bd19917..38503b1cd2eb 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -2292,12 +2292,12 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> ID_AA64PFR0_EL1_AdvSIMD |
> ID_AA64PFR0_EL1_FP), },
> ID_SANITISED(ID_AA64PFR1_EL1),
> - ID_UNALLOCATED(4,2),
> + ID_SANITISED(ID_AA64PFR2_EL1),

So you now expose all sort of MTE things to the guest?

> ID_UNALLOCATED(4,3),
> ID_WRITABLE(ID_AA64ZFR0_EL1, ~ID_AA64ZFR0_EL1_RES0),
> ID_HIDDEN(ID_AA64SMFR0_EL1),
> ID_UNALLOCATED(4,6),
> - ID_UNALLOCATED(4,7),
> + ID_SANITISED(ID_AA64FPFR0_EL1),
>
> /* CRm=5 */
> { SYS_DESC(SYS_ID_AA64DFR0_EL1),
> @@ -2324,7 +2324,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> ID_WRITABLE(ID_AA64ISAR2_EL1, ~(ID_AA64ISAR2_EL1_RES0 |
> ID_AA64ISAR2_EL1_APA3 |
> ID_AA64ISAR2_EL1_GPA3)),
> - ID_UNALLOCATED(6,3),
> + ID_WRITABLE(ID_AA64ISAR3_EL1, ~ID_AA64ISAR3_EL1_RES0),

How about the non dpISA stuff that is advertised in the same register,
and for which no support exists?

> ID_UNALLOCATED(6,4),
> ID_UNALLOCATED(6,5),
> ID_UNALLOCATED(6,6),
>

M.

--
Without deviation from the norm, progress is not possible.

2024-02-23 11:34:14

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v4 02/14] arm64/fpsimd: Enable host kernel access to FPMR

On Mon, 22 Jan 2024 16:28:05 +0000,
Mark Brown <[email protected]> wrote:
>
> FEAT_FPMR provides a new generally accessible architectural register FPMR.
> This is only accessible to EL0 and EL1 when HCRX_EL2.EnFPM is set to 1,
> do this when the host is running. The guest part will be done along with
> context switching the new register and exposing it via guest management.
>
> Signed-off-by: Mark Brown <[email protected]>
> ---
> arch/arm64/include/asm/kvm_arm.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
> index 3c6f8ba1e479..7f45ce9170bb 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -105,7 +105,7 @@
> #define HCRX_GUEST_FLAGS \
> (HCRX_EL2_SMPME | HCRX_EL2_TCR2En | \
> (cpus_have_final_cap(ARM64_HAS_MOPS) ? (HCRX_EL2_MSCEn | HCRX_EL2_MCE2) : 0))
> -#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En)
> +#define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM)
>
> /* TCR_EL2 Registers bits */
> #define TCR_EL2_DS (1UL << 32)
>

Acked-by: Marc Zyngier <[email protected]>

M.

--
Without deviation from the norm, progress is not possible.

2024-02-23 11:34:52

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v4 12/14] KVM: arm64: Support FEAT_FPMR for guests

On Mon, 22 Jan 2024 16:28:15 +0000,
Mark Brown <[email protected]> wrote:
>
> FEAT_FPMR introduces a new system register FPMR which allows configuration
> of floating point behaviour, currently for FP8 specific features. Allow use
> of this in guests, disabling the trap while guests are running and saving
> and restoring the value along with the rest of the floating point state.
> Since FPMR is stored immediately after the main floating point state we
> share it with the hypervisor by adjusting the size of the shared region.
>
> Access to FPMR is covered by both a register specific trap HCRX_EL2.EnFPM
> and the overall floating point access trap so we just unconditionally
> enable the FPMR specific trap and rely on the floating point access trap to
> detect guest floating point usage.
>
> Signed-off-by: Mark Brown <[email protected]>
> ---
> arch/arm64/include/asm/kvm_arm.h | 2 +-
> arch/arm64/include/asm/kvm_host.h | 3 ++-
> arch/arm64/kvm/emulate-nested.c | 8 ++++++++
> arch/arm64/kvm/fpsimd.c | 2 +-
> arch/arm64/kvm/hyp/include/hyp/switch.h | 7 ++++++-
> arch/arm64/kvm/sys_regs.c | 11 +++++++++++
> 6 files changed, 29 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
> index 7f45ce9170bb..b2ddb6165953 100644
> --- a/arch/arm64/include/asm/kvm_arm.h
> +++ b/arch/arm64/include/asm/kvm_arm.h
> @@ -103,7 +103,7 @@
> #define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
>
> #define HCRX_GUEST_FLAGS \
> - (HCRX_EL2_SMPME | HCRX_EL2_TCR2En | \
> + (HCRX_EL2_SMPME | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM | \

No. We don't do that anymore. This can only be enabled if the guest
has it advertised via ID_AA64PFR2_EL1.FPMR.

> (cpus_have_final_cap(ARM64_HAS_MOPS) ? (HCRX_EL2_MSCEn | HCRX_EL2_MCE2) : 0))
> #define HCRX_HOST_FLAGS (HCRX_EL2_MSCEn | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index c4fdcc94d733..99c0f8944f04 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -384,6 +384,8 @@ enum vcpu_sysreg {
> APGAKEYLO_EL1,
> APGAKEYHI_EL1,
>
> + FPMR,
> +
> /* Memory Tagging Extension registers */
> RGSR_EL1, /* Random Allocation Tag Seed Register */
> GCR_EL1, /* Tag Control Register */
> @@ -544,7 +546,6 @@ struct kvm_vcpu_arch {
> enum fp_type fp_type;
> unsigned int sve_max_vl;
> u64 svcr;
> - unsigned long fpmr;
>
> /* Stage 2 paging state used by the hardware on next switch */
> struct kvm_s2_mmu *hw_mmu;
> diff --git a/arch/arm64/kvm/emulate-nested.c b/arch/arm64/kvm/emulate-nested.c
> index 431fd429932d..3af5fd0e28dc 100644
> --- a/arch/arm64/kvm/emulate-nested.c
> +++ b/arch/arm64/kvm/emulate-nested.c
> @@ -67,6 +67,8 @@ enum cgt_group_id {
> CGT_HCR_TTLBIS,
> CGT_HCR_TTLBOS,
>
> + CGT_HCRX_EnFPM,
> +
> CGT_MDCR_TPMCR,
> CGT_MDCR_TPM,
> CGT_MDCR_TDE,
> @@ -279,6 +281,11 @@ static const struct trap_bits coarse_trap_bits[] = {
> .mask = HCR_TTLBOS,
> .behaviour = BEHAVE_FORWARD_ANY,
> },
> + [CGT_HCRX_EnFPM] = {
> + .index = HCRX_EL2,
> + .mask = HCRX_EL2_EnFPM,
> + .behaviour = BEHAVE_FORWARD_ANY,

This is obviously incorrect.

> + },
> [CGT_MDCR_TPMCR] = {
> .index = MDCR_EL2,
> .value = MDCR_EL2_TPMCR,
> @@ -478,6 +485,7 @@ static const struct encoding_to_trap_config encoding_to_cgt[] __initconst = {
> SR_TRAP(SYS_AIDR_EL1, CGT_HCR_TID1),
> SR_TRAP(SYS_SMIDR_EL1, CGT_HCR_TID1),
> SR_TRAP(SYS_CTR_EL0, CGT_HCR_TID2),
> + SR_TRAP(SYS_FPMR, CGT_HCRX_EnFPM),
> SR_TRAP(SYS_CCSIDR_EL1, CGT_HCR_TID2_TID4),
> SR_TRAP(SYS_CCSIDR2_EL1, CGT_HCR_TID2_TID4),
> SR_TRAP(SYS_CLIDR_EL1, CGT_HCR_TID2_TID4),
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 6cf22cd8f020..9e002489c843 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -152,7 +152,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
> fp_state.sve_vl = vcpu->arch.sve_max_vl;
> fp_state.sme_state = NULL;
> fp_state.svcr = &vcpu->arch.svcr;
> - fp_state.fpmr = &vcpu->arch.fpmr;
> + fp_state.fpmr = (unsigned long *)&__vcpu_sys_reg(vcpu, FPMR);
> fp_state.fp_type = &vcpu->arch.fp_type;
>
> if (vcpu_has_sve(vcpu))
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index 27fcdfd432b9..abf785c473d0 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -370,10 +370,15 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
> isb();
>
> /* Write out the host state if it's in the registers */
> - if (vcpu->arch.fp_state == FP_STATE_HOST_OWNED)
> + if (vcpu->arch.fp_state == FP_STATE_HOST_OWNED) {
> __fpsimd_save_state(&(vcpu->arch.host_uw->fpsimd_state));
> + if (cpus_have_final_cap(ARM64_HAS_FPMR))

Same thing as above: this cannot be unconditional.

> + vcpu->arch.host_uw->fpmr = read_sysreg_s(SYS_FPMR);
> + }
>
> /* Restore the guest state */
> + if (cpus_have_final_cap(ARM64_HAS_FPMR))
> + write_sysreg_s(__vcpu_sys_reg(vcpu, FPMR), SYS_FPMR);
> if (sve_guest)
> __hyp_sve_restore_guest(vcpu);
> else
> diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
> index 38503b1cd2eb..216eac44c124 100644
> --- a/arch/arm64/kvm/sys_regs.c
> +++ b/arch/arm64/kvm/sys_regs.c
> @@ -2067,6 +2067,15 @@ static unsigned int hidden_user_visibility(const struct kvm_vcpu *vcpu,
> .visibility = hidden_user_visibility, \
> }
>
> +static unsigned int fpmr_visibility(const struct kvm_vcpu *vcpu,
> + const struct sys_reg_desc *rd)
> +{
> + if (cpus_have_final_cap(ARM64_HAS_FPMR))
> + return 0;

Same thing.

> +
> + return REG_HIDDEN;
> +}
> +
> /*
> * Since reset() callback and field val are not used for idregs, they will be
> * used for specific purposes for idregs.
> @@ -2463,6 +2472,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
> { SYS_DESC(SYS_CSSELR_EL1), access_csselr, reset_unknown, CSSELR_EL1 },
> { SYS_DESC(SYS_CTR_EL0), access_ctr },
> { SYS_DESC(SYS_SVCR), undef_access },
> + { SYS_DESC(SYS_FPMR), access_rw, reset_unknown, FPMR,
> + .visibility = fpmr_visibility },
>
> { PMU_SYS_REG(PMCR_EL0), .access = access_pmcr, .reset = reset_pmcr,
> .reg = PMCR_EL0, .get_user = get_pmcr, .set_user = set_pmcr },
>

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2024-02-23 11:37:22

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v4 03/14] arm64/fpsimd: Support FEAT_FPMR

On Mon, 22 Jan 2024 16:28:06 +0000,
Mark Brown <[email protected]> wrote:
>
> FEAT_FPMR defines a new EL0 accessible register FPMR use to configure the
> FP8 related features added to the architecture at the same time. Detect
> support for this register and context switch it for EL0 when present.
>
> Due to the sharing of responsibility for saving floating point state
> between the host kernel and KVM FP8 support is not yet implemented in KVM
> and a stub similar to that used for SVCR is provided for FPMR in order to
> avoid bisection issues. To make it easier to share host state with the
> hypervisor we store FPMR as a hardened usercopy field in uw (along with
> some padding).
>
> Signed-off-by: Mark Brown <[email protected]>
> ---
> arch/arm64/include/asm/cpufeature.h | 5 +++++
> arch/arm64/include/asm/fpsimd.h | 2 ++
> arch/arm64/include/asm/kvm_host.h | 1 +
> arch/arm64/include/asm/processor.h | 4 ++++
> arch/arm64/kernel/cpufeature.c | 9 +++++++++
> arch/arm64/kernel/fpsimd.c | 13 +++++++++++++
> arch/arm64/kvm/fpsimd.c | 1 +
> arch/arm64/tools/cpucaps | 1 +
> 8 files changed, 36 insertions(+)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 21c824edf8ce..34fcdbc65d7d 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -768,6 +768,11 @@ static __always_inline bool system_supports_tpidr2(void)
> return system_supports_sme();
> }
>
> +static __always_inline bool system_supports_fpmr(void)
> +{
> + return alternative_has_cap_unlikely(ARM64_HAS_FPMR);
> +}
> +
> static __always_inline bool system_supports_cnp(void)
> {
> return alternative_has_cap_unlikely(ARM64_HAS_CNP);
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index 50e5f25d3024..6cf72b0d2c04 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -89,6 +89,7 @@ struct cpu_fp_state {
> void *sve_state;
> void *sme_state;
> u64 *svcr;
> + unsigned long *fpmr;
> unsigned int sve_vl;
> unsigned int sme_vl;
> enum fp_type *fp_type;
> @@ -154,6 +155,7 @@ extern void cpu_enable_sve(const struct arm64_cpu_capabilities *__unused);
> extern void cpu_enable_sme(const struct arm64_cpu_capabilities *__unused);
> extern void cpu_enable_sme2(const struct arm64_cpu_capabilities *__unused);
> extern void cpu_enable_fa64(const struct arm64_cpu_capabilities *__unused);
> +extern void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__unused);
>
> extern u64 read_smcr_features(void);
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 21c57b812569..7993694a54af 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -543,6 +543,7 @@ struct kvm_vcpu_arch {
> enum fp_type fp_type;
> unsigned int sve_max_vl;
> u64 svcr;
> + unsigned long fpmr;

As this directly represents a register, I'd rather you use a type that
represents the size of that register unambiguously (u64).

>
> /* Stage 2 paging state used by the hardware on next switch */
> struct kvm_s2_mmu *hw_mmu;
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 5b0a04810b23..b453c66d3fae 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -155,6 +155,8 @@ struct thread_struct {
> struct {
> unsigned long tp_value; /* TLS register */
> unsigned long tp2_value;
> + unsigned long fpmr;
> + unsigned long pad;
> struct user_fpsimd_state fpsimd_state;
> } uw;
>
> @@ -253,6 +255,8 @@ static inline void arch_thread_struct_whitelist(unsigned long *offset,
> BUILD_BUG_ON(sizeof_field(struct thread_struct, uw) !=
> sizeof_field(struct thread_struct, uw.tp_value) +
> sizeof_field(struct thread_struct, uw.tp2_value) +
> + sizeof_field(struct thread_struct, uw.fpmr) +
> + sizeof_field(struct thread_struct, uw.pad) +
> sizeof_field(struct thread_struct, uw.fpsimd_state));
>
> *offset = offsetof(struct thread_struct, uw);
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index eae59ec0f4b0..0263565f617a 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -272,6 +272,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
> };
>
> static const struct arm64_ftr_bits ftr_id_aa64pfr2[] = {
> + ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR2_EL1_FPMR_SHIFT, 4, 0),
> ARM64_FTR_END,
> };
>
> @@ -2767,6 +2768,14 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
> .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> .matches = has_lpa2,
> },
> + {
> + .desc = "FPMR",
> + .type = ARM64_CPUCAP_SYSTEM_FEATURE,
> + .capability = ARM64_HAS_FPMR,
> + .matches = has_cpuid_feature,
> + .cpu_enable = cpu_enable_fpmr,
> + ARM64_CPUID_FIELDS(ID_AA64PFR2_EL1, FPMR, IMP)
> + },
> {},
> };
>
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index a5dc6f764195..8e24b5e5e192 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -359,6 +359,9 @@ static void task_fpsimd_load(void)
> WARN_ON(preemptible());
> WARN_ON(test_thread_flag(TIF_KERNEL_FPSTATE));
>
> + if (system_supports_fpmr())
> + write_sysreg_s(current->thread.uw.fpmr, SYS_FPMR);
> +
> if (system_supports_sve() || system_supports_sme()) {
> switch (current->thread.fp_type) {
> case FP_STATE_FPSIMD:
> @@ -446,6 +449,9 @@ static void fpsimd_save_user_state(void)
> if (test_thread_flag(TIF_FOREIGN_FPSTATE))
> return;
>
> + if (system_supports_fpmr())
> + *(last->fpmr) = read_sysreg_s(SYS_FPMR);
> +
> /*
> * If a task is in a syscall the ABI allows us to only
> * preserve the state shared with FPSIMD so don't bother
> @@ -688,6 +694,12 @@ static void sve_to_fpsimd(struct task_struct *task)
> }
> }
>
> +void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__always_unused p)
> +{
> + write_sysreg_s(read_sysreg_s(SYS_SCTLR_EL1) | SCTLR_EL1_EnFPM_MASK,
> + SYS_SCTLR_EL1);
> +}
> +
> #ifdef CONFIG_ARM64_SVE
> /*
> * Call __sve_free() directly only if you know task can't be scheduled
> @@ -1680,6 +1692,7 @@ static void fpsimd_bind_task_to_cpu(void)
> last->sve_vl = task_get_sve_vl(current);
> last->sme_vl = task_get_sme_vl(current);
> last->svcr = &current->thread.svcr;
> + last->fpmr = &current->thread.uw.fpmr;
> last->fp_type = &current->thread.fp_type;
> last->to_save = FP_STATE_CURRENT;
> current->thread.fpsimd_cpu = smp_processor_id();
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 8c1d0d4853df..e3e611e30e91 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -153,6 +153,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
> fp_state.sve_vl = vcpu->arch.sve_max_vl;
> fp_state.sme_state = NULL;
> fp_state.svcr = &vcpu->arch.svcr;
> + fp_state.fpmr = &vcpu->arch.fpmr;
> fp_state.fp_type = &vcpu->arch.fp_type;

Given the number of fields you keep track of, it would make a lot more
sense if these FP-related fields were in their own little structure
and tracked by a single pointer (I don't think there is a case where
we track them independently).

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2024-02-23 14:47:27

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v4 12/14] KVM: arm64: Support FEAT_FPMR for guests

On Fri, Feb 23, 2024 at 11:18:51AM +0000, Marc Zyngier wrote:
> Mark Brown <[email protected]> wrote:

> > #define HCRX_GUEST_FLAGS \
> > - (HCRX_EL2_SMPME | HCRX_EL2_TCR2En | \
> > + (HCRX_EL2_SMPME | HCRX_EL2_TCR2En | HCRX_EL2_EnFPM | \

> No. We don't do that anymore. This can only be enabled if the guest
> has it advertised via ID_AA64PFR2_EL1.FPMR.

Right, as mentioned in the cover letter (and previously discussed with
one of the other serieses) this needs a rework against your at the time
of posting still pending changes to parse the ID registers. It looks
like everything is there for those now so I'll do that after the merge
window.


Attachments:
(No filename) (666.00 B)
signature.asc (499.00 B)
Download all attachments

2024-03-06 16:41:56

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v4 03/14] arm64/fpsimd: Support FEAT_FPMR

On Fri, Feb 23, 2024 at 11:07:02AM +0000, Marc Zyngier wrote:
> Mark Brown <[email protected]> wrote:

> > --- a/arch/arm64/kvm/fpsimd.c
> > +++ b/arch/arm64/kvm/fpsimd.c
> > @@ -153,6 +153,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
> > fp_state.sve_vl = vcpu->arch.sve_max_vl;
> > fp_state.sme_state = NULL;
> > fp_state.svcr = &vcpu->arch.svcr;
> > + fp_state.fpmr = &vcpu->arch.fpmr;
> > fp_state.fp_type = &vcpu->arch.fp_type;

> Given the number of fields you keep track of, it would make a lot more
> sense if these FP-related fields were in their own little structure
> and tracked by a single pointer (I don't think there is a case where
> we track them independently).

I sent a patch series that attempted to address this somewhat by
refactoring things so that we initialise the struct we bind during vCPU
setup rather than each time we prepare to switch away from the guest:

https://lore.kernel.org/linux-arm-kernel/[email protected]/

which you weren't terribly enthusiastic about:

https://lore.kernel.org/all/[email protected]/

As covered in the changelog for patch 2 of that series the FP state
includes both system registers which KVM wants to expose via it's system
register ABI and FPSIMD state which the host kernel wants to flag as
suitable for hardended usercopy by embedding in thread_struct.uw. This
means that pulling everything into a single structure which is used
throughout the kernel would require a bunch of surgery which I'm not
sure is worth the effort or ongoing maintainance cost.

The current approach is verbose but not complicated and localised to the
code managing the FP state, as I said in the discussion on that thread I
think it represents a reasonable tradeoff.


Attachments:
(No filename) (1.80 kB)
signature.asc (499.00 B)
Download all attachments