2020-06-18 17:56:51

by David Brazdil

Subject: [PATCH v3 00/15] Split off nVHE hyp code

Refactor files in arch/arm64/kvm/hyp to compile all code which runs in EL2
under nVHE into separate object files from the rest of KVM. This is done in
preparation for being able to unmap hyp code from EL1 and kernel code/data
from EL2 but has other benefits too, notably:
* safe use of KASAN/UBSAN/GCOV instrumentation on VHE code,
* no need for __hyp_text annotations.

nVHE-specific code is moved to hyp/nvhe and compiled with custom build rules
similar to those used by the EFI stub. Shared source files are compiled under both
VHE and nVHE build rules. Where a source file contained both VHE and nVHE code,
it is split into a shared header file and two C source files. This is done one
file per commit to make review easier.
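
For illustration, the split follows the pattern sketched below (simplified from
the sysreg-sr patch later in the series; the other __sysreg_save_*() helpers the
nVHE wrapper calls live in the same shared header):

/* hyp/sysreg-sr.h -- shared routines, inlined into both objects */
static inline void __hyp_text
__sysreg_save_common_state(struct kvm_cpu_context *ctxt)
{
	ctxt->sys_regs[MDSCR_EL1] = read_sysreg(mdscr_el1);
}

/* hyp/sysreg-sr.c -- compiled only under the VHE rules */
void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt)
{
	__sysreg_save_common_state(ctxt);
}

/* hyp/nvhe/sysreg-sr.c -- compiled only under the new nVHE rules */
void __hyp_text __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt)
{
	__sysreg_save_el1_state(ctxt);
	__sysreg_save_common_state(ctxt);
	__sysreg_save_user_state(ctxt);
	__sysreg_save_el2_return_state(ctxt);
}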

All nVHE symbols are prefixed with "__kvm_nvhe_" to avoid collisions with VHE
variants (also inspired by the EFI stub). Since unresolved symbols are prefixed
too, image-vars.h contains a list of kernel symbol aliases for the places where
nVHE code still refers to kernel proper. The list grows fairly large as the patch series
progresses and code is moved around, but at the end contains 20 symbols. These
remaining dependencies on kernel proper will be further reduced in the future.
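
For reference, each alias in image-vars.h is a plain linker-level assignment,
e.g. (two entries taken from an intermediate state of the series):

__kvm_nvhe___kvm_flush_vm_context = __kvm_flush_vm_context;
__kvm_nvhe___vgic_v3_init_lrs = __vgic_v3_init_lrs;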

No functional changes are intended but code was simplified whenever the
refactoring made it possible.

Tested by running kvm-unit-tests on QEMU 5.0 with VHE/nVHE and GIC v2/v3.

Dual compilation of the code shared by VHE/nVHE increases the size of the kernel.
Bloat-o-meter vmlinux diff shows an increase of 21 KB on the ELF symbol level.
Size of Image.gz is up by 10 KB; size of Image is unchanged, presumably due
to ELF section alignment.
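
For reference, the symbol-level delta was measured with the in-tree script,
along the lines of (file names illustrative):

  ./scripts/bloat-o-meter vmlinux.old vmlinux.new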

This is based on v5.8-rc1. Available in branch 'topic/el2-obj-v3' of the git repo:
https://android-kvm.googlesource.com/linux

Changes v2 -> v3:
* rebase onto v5.8-rc1
* remove patch changing hypcall interface to function IDs
* move hyp-init.S to nVHE
* fix symbol aliasing under CONFIG_ARM64_PSEUDO_NMI and CONFIG_ARM64_SVE
* remove VHE's unused __kvm_enable_ssbs()
* make nVHE use VHE's hyp_panic_string for consistent use of absolute relocs
when returning pointers back to the kernel
* replace __noscs annotation (added to __hyp_text macro) with build rule

Changes v1 -> v2:
* change nVHE symbol prefix from __hyp_text_ to __kvm_nvhe_
* rename __HYPERVISOR__ macro to __KVM_NVHE_HYPERVISOR__
* use hcall jump table instead of array of function pointers
* drop patch to unify HVC callers
* move __smccc_workaround_1_smc to own file
* header guards for hyp/*.h
* improve helpers for handling VHE/nVHE hyp syms in kernel proper
* improve commit messages, cover letter

-David


Andrew Scull (2):
arm64: kvm: Handle calls to prefixed hyp functions
arm64: kvm: Move hyp-init.S to nVHE

David Brazdil (13):
arm64: kvm: Fix symbol dependency in __hyp_call_panic_nvhe
arm64: kvm: Move __smccc_workaround_1_smc to .rodata
arm64: kvm: Add build rules for separate nVHE object files
arm64: kvm: Build hyp-entry.S separately for VHE/nVHE
arm64: kvm: Split hyp/tlb.c to VHE/nVHE
arm64: kvm: Split hyp/switch.c to VHE/nVHE
arm64: kvm: Split hyp/debug-sr.c to VHE/nVHE
arm64: kvm: Split hyp/sysreg-sr.c to VHE/nVHE
arm64: kvm: Split hyp/timer-sr.c to VHE/nVHE
arm64: kvm: Compile remaining hyp/ files for both VHE/nVHE
arm64: kvm: Add comments around __kvm_nvhe_ symbol aliases
arm64: kvm: Remove __hyp_text macro, use build rules instead
arm64: kvm: Lift instrumentation restrictions on VHE

arch/arm64/include/asm/kvm_asm.h | 32 +-
arch/arm64/include/asm/kvm_emulate.h | 2 +-
arch/arm64/include/asm/kvm_host.h | 19 +-
arch/arm64/include/asm/kvm_hyp.h | 13 +-
arch/arm64/include/asm/kvm_mmu.h | 16 +-
arch/arm64/include/asm/mmu.h | 7 -
arch/arm64/kernel/cpu_errata.c | 4 +-
arch/arm64/kernel/image-vars.h | 50 ++
arch/arm64/kvm/Makefile | 2 +-
arch/arm64/kvm/arm.c | 8 +-
arch/arm64/kvm/hyp/Makefile | 11 +-
arch/arm64/kvm/hyp/aarch32.c | 6 +-
arch/arm64/kvm/hyp/debug-sr.c | 210 +------
arch/arm64/kvm/hyp/debug-sr.h | 170 +++++
arch/arm64/kvm/hyp/entry.S | 1 -
arch/arm64/kvm/hyp/fpsimd.S | 1 -
arch/arm64/kvm/hyp/hyp-entry.S | 21 +-
arch/arm64/kvm/hyp/nvhe/Makefile | 43 ++
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 77 +++
arch/arm64/kvm/{ => hyp/nvhe}/hyp-init.S | 0
arch/arm64/kvm/hyp/nvhe/switch.c | 271 ++++++++
arch/arm64/kvm/hyp/nvhe/sysreg-sr.c | 56 ++
arch/arm64/kvm/hyp/nvhe/timer-sr.c | 43 ++
arch/arm64/kvm/hyp/nvhe/tlb.c | 68 ++
arch/arm64/kvm/hyp/smccc_wa.S | 30 +
arch/arm64/kvm/hyp/switch.c | 749 +----------------------
arch/arm64/kvm/hyp/switch.h | 504 +++++++++++++++
arch/arm64/kvm/hyp/sysreg-sr.c | 215 +------
arch/arm64/kvm/hyp/sysreg-sr.h | 204 ++++++
arch/arm64/kvm/hyp/timer-sr.c | 38 +-
arch/arm64/kvm/hyp/tlb.c | 169 +----
arch/arm64/kvm/hyp/tlb.h | 131 ++++
arch/arm64/kvm/hyp/vgic-v2-cpuif-proxy.c | 4 +-
arch/arm64/kvm/hyp/vgic-v3-sr.c | 130 ++--
arch/arm64/kvm/mmu.c | 2 +-
arch/arm64/kvm/va_layout.c | 2 +-
scripts/kallsyms.c | 1 +
37 files changed, 1829 insertions(+), 1481 deletions(-)
create mode 100644 arch/arm64/kvm/hyp/debug-sr.h
create mode 100644 arch/arm64/kvm/hyp/nvhe/Makefile
create mode 100644 arch/arm64/kvm/hyp/nvhe/debug-sr.c
rename arch/arm64/kvm/{ => hyp/nvhe}/hyp-init.S (100%)
create mode 100644 arch/arm64/kvm/hyp/nvhe/switch.c
create mode 100644 arch/arm64/kvm/hyp/nvhe/sysreg-sr.c
create mode 100644 arch/arm64/kvm/hyp/nvhe/timer-sr.c
create mode 100644 arch/arm64/kvm/hyp/nvhe/tlb.c
create mode 100644 arch/arm64/kvm/hyp/smccc_wa.S
create mode 100644 arch/arm64/kvm/hyp/switch.h
create mode 100644 arch/arm64/kvm/hyp/sysreg-sr.h
create mode 100644 arch/arm64/kvm/hyp/tlb.h

--
2.27.0


2020-06-18 17:56:58

by David Brazdil

Subject: [PATCH v3 04/15] arm64: kvm: Handle calls to prefixed hyp functions

From: Andrew Scull <[email protected]>

This patch is part of a series which builds KVM's non-VHE hyp code separately
from VHE and the rest of the kernel.

Once hyp functions are moved to a hyp object, their symbols will carry the nVHE
prefix. This change declares the prefixed version of each hyp function and takes
its address when a call is made from kernel proper.

To aid migration, hyp functions that have not yet moved have their prefixed
versions aliased to their non-prefixed ones. The alias list initially contains
every hyp function and shrinks to nothing once the migration is complete.

Signed-off-by: Andrew Scull <[email protected]>

Extracted the kvm_call_hyp nVHE branches into their own helper macros.
Signed-off-by: David Brazdil <[email protected]>
---
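Illustrative note (not part of the commit message): with the helpers above, a
call such as kvm_call_hyp(__kvm_flush_vm_context) on a non-VHE system roughly
expands to the following (__kvm_flush_vm_context is just an example symbol):

	do {
		extern char __kvm_nvhe___kvm_flush_vm_context[];	/* DECLARE_KVM_NVHE_SYM() */
		__kvm_call_hyp(kvm_ksym_ref(__kvm_nvhe___kvm_flush_vm_context));
	} while (0);

i.e. the HVC is issued against the prefixed nVHE symbol, which for now resolves
back to the original function via the image-vars.h aliases added below.
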
arch/arm64/include/asm/kvm_asm.h | 19 +++++++++++++++++++
arch/arm64/include/asm/kvm_host.h | 19 ++++++++++++++++---
arch/arm64/kernel/image-vars.h | 15 +++++++++++++++
3 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 352aaebf4198..6a682d66a640 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -42,6 +42,24 @@

#include <linux/mm.h>

+/*
+ * Translate name of a symbol defined in nVHE hyp to the name seen
+ * by kernel proper. All nVHE symbols are prefixed by the build system
+ * to avoid clashes with the VHE variants.
+ */
+#define kvm_nvhe_sym(sym) __kvm_nvhe_##sym
+
+#define DECLARE_KVM_VHE_SYM(sym) extern char sym[]
+#define DECLARE_KVM_NVHE_SYM(sym) extern char kvm_nvhe_sym(sym)[]
+
+/*
+ * Define a pair of symbols sharing the same name but one defined in
+ * VHE and the other in nVHE hyp implementations.
+ */
+#define DECLARE_KVM_HYP_SYM(sym) \
+ DECLARE_KVM_VHE_SYM(sym); \
+ DECLARE_KVM_NVHE_SYM(sym)
+
/* Translate a kernel address of @sym into its equivalent linear mapping */
#define kvm_ksym_ref(sym) \
({ \
@@ -50,6 +68,7 @@
val = lm_alias(&sym); \
val; \
})
+#define kvm_ksym_ref_nvhe(sym) kvm_ksym_ref(kvm_nvhe_sym(sym))

struct kvm;
struct kvm_vcpu;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index c3e6fcc664b1..e782f98243d3 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -448,6 +448,20 @@ void kvm_arm_resume_guest(struct kvm *kvm);

u64 __kvm_call_hyp(void *hypfn, ...);

+#define kvm_call_hyp_nvhe(f, ...) \
+ do { \
+ DECLARE_KVM_NVHE_SYM(f); \
+ __kvm_call_hyp(kvm_ksym_ref_nvhe(f), ##__VA_ARGS__); \
+ } while(0)
+
+#define kvm_call_hyp_nvhe_ret(f, ...) \
+ ({ \
+ DECLARE_KVM_NVHE_SYM(f); \
+ typeof(f(__VA_ARGS__)) ret; \
+ ret = __kvm_call_hyp(kvm_ksym_ref_nvhe(f), \
+ ##__VA_ARGS__); \
+ })
+
/*
* The couple of isb() below are there to guarantee the same behaviour
* on VHE as on !VHE, where the eret to EL1 acts as a context
@@ -459,7 +473,7 @@ u64 __kvm_call_hyp(void *hypfn, ...);
f(__VA_ARGS__); \
isb(); \
} else { \
- __kvm_call_hyp(kvm_ksym_ref(f), ##__VA_ARGS__); \
+ kvm_call_hyp_nvhe(f, ##__VA_ARGS__); \
} \
} while(0)

@@ -471,8 +485,7 @@ u64 __kvm_call_hyp(void *hypfn, ...);
ret = f(__VA_ARGS__); \
isb(); \
} else { \
- ret = __kvm_call_hyp(kvm_ksym_ref(f), \
- ##__VA_ARGS__); \
+ ret = kvm_call_hyp_nvhe_ret(f, ##__VA_ARGS__); \
} \
\
ret; \
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index f32b406e90c0..89affa38b143 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -61,6 +61,21 @@ __efistub__ctype = _ctype;
* memory mappings.
*/

+__kvm_nvhe___kvm_enable_ssbs = __kvm_enable_ssbs;
+__kvm_nvhe___kvm_flush_vm_context = __kvm_flush_vm_context;
+__kvm_nvhe___kvm_get_mdcr_el2 = __kvm_get_mdcr_el2;
+__kvm_nvhe___kvm_timer_set_cntvoff = __kvm_timer_set_cntvoff;
+__kvm_nvhe___kvm_tlb_flush_local_vmid = __kvm_tlb_flush_local_vmid;
+__kvm_nvhe___kvm_tlb_flush_vmid = __kvm_tlb_flush_vmid;
+__kvm_nvhe___kvm_tlb_flush_vmid_ipa = __kvm_tlb_flush_vmid_ipa;
+__kvm_nvhe___kvm_vcpu_run_nvhe = __kvm_vcpu_run_nvhe;
+__kvm_nvhe___vgic_v3_get_ich_vtr_el2 = __vgic_v3_get_ich_vtr_el2;
+__kvm_nvhe___vgic_v3_init_lrs = __vgic_v3_init_lrs;
+__kvm_nvhe___vgic_v3_read_vmcr = __vgic_v3_read_vmcr;
+__kvm_nvhe___vgic_v3_restore_aprs = __vgic_v3_restore_aprs;
+__kvm_nvhe___vgic_v3_save_aprs = __vgic_v3_save_aprs;
+__kvm_nvhe___vgic_v3_write_vmcr = __vgic_v3_write_vmcr;
+
#endif /* CONFIG_KVM */

#endif /* __ARM64_KERNEL_IMAGE_VARS_H */
--
2.27.0

2020-06-18 17:58:52

by David Brazdil

Subject: [PATCH v3 10/15] arm64: kvm: Split hyp/sysreg-sr.c to VHE/nVHE

This patch is part of a series which builds KVM's non-VHE hyp code separately
from VHE and the rest of the kernel.

sysreg-sr.c contains KVM's code for saving/restoring system registers, with
some parts shared between VHE/nVHE. These common routines are moved to
sysreg-sr.h, VHE-specific code is left in sysreg-sr.c and nVHE-specific code is
moved to nvhe/sysreg-sr.c.

Signed-off-by: David Brazdil <[email protected]>
---
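Note (illustrative, not part of the commit message): __kvm_enable_ssbs() now
exists only in the nVHE object, so its single caller in kernel proper switches
to the nVHE-specific helper; the resulting call site looks like this:

	/* arch/arm64/kvm/arm.c :: cpu_init_hyp_mode() */
	if (this_cpu_has_cap(ARM64_SSBS) &&
	    arm64_get_ssbd_state() == ARM64_SSBD_FORCE_DISABLE) {
		kvm_call_hyp_nvhe(__kvm_enable_ssbs);
	}
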
arch/arm64/include/asm/kvm_hyp.h | 4 +
arch/arm64/kernel/image-vars.h | 5 -
arch/arm64/kvm/arm.c | 2 +-
arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
arch/arm64/kvm/hyp/nvhe/sysreg-sr.c | 56 ++++++++
arch/arm64/kvm/hyp/sysreg-sr.c | 215 +---------------------------
arch/arm64/kvm/hyp/sysreg-sr.h | 211 +++++++++++++++++++++++++++
7 files changed, 279 insertions(+), 216 deletions(-)
create mode 100644 arch/arm64/kvm/hyp/nvhe/sysreg-sr.c
create mode 100644 arch/arm64/kvm/hyp/sysreg-sr.h

diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 1cb5903a2693..c8bbd221aac0 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -66,12 +66,16 @@ int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu);
void __timer_enable_traps(struct kvm_vcpu *vcpu);
void __timer_disable_traps(struct kvm_vcpu *vcpu);

+#ifdef __KVM_NVHE_HYPERVISOR__
void __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt);
void __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt);
+#else
void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt);
void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt);
void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt);
void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt);
+#endif
+
void __sysreg32_save_state(struct kvm_vcpu *vcpu);
void __sysreg32_restore_state(struct kvm_vcpu *vcpu);

diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 8096e6f1f2bf..ddaae7267ab1 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -68,12 +68,7 @@ __kvm_nvhe___guest_exit = __guest_exit;
__kvm_nvhe___hyp_panic_string = __hyp_panic_string;
__kvm_nvhe___hyp_stub_vectors = __hyp_stub_vectors;
__kvm_nvhe___icache_flags = __icache_flags;
-__kvm_nvhe___kvm_enable_ssbs = __kvm_enable_ssbs;
__kvm_nvhe___kvm_timer_set_cntvoff = __kvm_timer_set_cntvoff;
-__kvm_nvhe___sysreg32_restore_state = __sysreg32_restore_state;
-__kvm_nvhe___sysreg32_save_state = __sysreg32_save_state;
-__kvm_nvhe___sysreg_restore_state_nvhe = __sysreg_restore_state_nvhe;
-__kvm_nvhe___sysreg_save_state_nvhe = __sysreg_save_state_nvhe;
__kvm_nvhe___timer_disable_traps = __timer_disable_traps;
__kvm_nvhe___timer_enable_traps = __timer_enable_traps;
__kvm_nvhe___vgic_v2_perform_cpuif_access = __vgic_v2_perform_cpuif_access;
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 59bbe6ce2d54..62ceb546393e 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1302,7 +1302,7 @@ static void cpu_init_hyp_mode(void)
*/
if (this_cpu_has_cap(ARM64_SSBS) &&
arm64_get_ssbd_state() == ARM64_SSBD_FORCE_DISABLE) {
- kvm_call_hyp(__kvm_enable_ssbs);
+ kvm_call_hyp_nvhe(__kvm_enable_ssbs);
}
}

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 95a06786bf26..d242e437cf89 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -7,7 +7,7 @@ asflags-y := -D__KVM_NVHE_HYPERVISOR__
ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -fno-stack-protector \
-DDISABLE_BRANCH_PROFILING $(DISABLE_STACKLEAK_PLUGIN)

-obj-y := debug-sr.o switch.o tlb.o hyp-init.o ../hyp-entry.o
+obj-y := sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o ../hyp-entry.o

obj-y := $(patsubst %.o,%.hyp.o,$(obj-y))
extra-y := $(patsubst %.hyp.o,%.hyp.tmp.o,$(obj-y))
diff --git a/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c b/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c
new file mode 100644
index 000000000000..55ab924d841a
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/sysreg-sr.c
@@ -0,0 +1,56 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2012-2015 - ARM Ltd
+ * Author: Marc Zyngier <[email protected]>
+ */
+
+#include <linux/compiler.h>
+#include <linux/kvm_host.h>
+
+#include <asm/kprobes.h>
+#include <asm/kvm_asm.h>
+#include <asm/kvm_emulate.h>
+#include <asm/kvm_hyp.h>
+
+#include "../sysreg-sr.h"
+
+/*
+ * Non-VHE: Both host and guest must save everything.
+ */
+
+void __hyp_text __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt)
+{
+ __sysreg_save_el1_state(ctxt);
+ __sysreg_save_common_state(ctxt);
+ __sysreg_save_user_state(ctxt);
+ __sysreg_save_el2_return_state(ctxt);
+}
+
+void __hyp_text __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt)
+{
+ __sysreg_restore_el1_state(ctxt);
+ __sysreg_restore_common_state(ctxt);
+ __sysreg_restore_user_state(ctxt);
+ __sysreg_restore_el2_return_state(ctxt);
+}
+
+void __hyp_text __sysreg32_save_state(struct kvm_vcpu *vcpu)
+{
+ ___sysreg32_save_state(vcpu);
+}
+
+void __hyp_text __sysreg32_restore_state(struct kvm_vcpu *vcpu)
+{
+ ___sysreg32_restore_state(vcpu);
+}
+
+void __hyp_text __kvm_enable_ssbs(void)
+{
+ u64 tmp;
+
+ asm volatile(
+ "mrs %0, sctlr_el2\n"
+ "orr %0, %0, %1\n"
+ "msr sctlr_el2, %0"
+ : "=&r" (tmp) : "L" (SCTLR_ELx_DSSBS));
+}
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index 2493439a5c54..5b559b00e9e5 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -12,9 +12,9 @@
#include <asm/kvm_emulate.h>
#include <asm/kvm_hyp.h>

+#include "sysreg-sr.h"
+
/*
- * Non-VHE: Both host and guest must save everything.
- *
* VHE: Host and guest must save mdscr_el1 and sp_el0 (and the PC and
* pstate, which are handled as part of the el2 return state) on every
* switch (sp_el0 is being dealt with in the assembly code).
@@ -24,59 +24,6 @@
* classes are handled as part of kvm_arch_vcpu_load and kvm_arch_vcpu_put.
*/

-static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
-{
- ctxt->sys_regs[MDSCR_EL1] = read_sysreg(mdscr_el1);
-}
-
-static void __hyp_text __sysreg_save_user_state(struct kvm_cpu_context *ctxt)
-{
- ctxt->sys_regs[TPIDR_EL0] = read_sysreg(tpidr_el0);
- ctxt->sys_regs[TPIDRRO_EL0] = read_sysreg(tpidrro_el0);
-}
-
-static void __hyp_text __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
-{
- ctxt->sys_regs[CSSELR_EL1] = read_sysreg(csselr_el1);
- ctxt->sys_regs[SCTLR_EL1] = read_sysreg_el1(SYS_SCTLR);
- ctxt->sys_regs[CPACR_EL1] = read_sysreg_el1(SYS_CPACR);
- ctxt->sys_regs[TTBR0_EL1] = read_sysreg_el1(SYS_TTBR0);
- ctxt->sys_regs[TTBR1_EL1] = read_sysreg_el1(SYS_TTBR1);
- ctxt->sys_regs[TCR_EL1] = read_sysreg_el1(SYS_TCR);
- ctxt->sys_regs[ESR_EL1] = read_sysreg_el1(SYS_ESR);
- ctxt->sys_regs[AFSR0_EL1] = read_sysreg_el1(SYS_AFSR0);
- ctxt->sys_regs[AFSR1_EL1] = read_sysreg_el1(SYS_AFSR1);
- ctxt->sys_regs[FAR_EL1] = read_sysreg_el1(SYS_FAR);
- ctxt->sys_regs[MAIR_EL1] = read_sysreg_el1(SYS_MAIR);
- ctxt->sys_regs[VBAR_EL1] = read_sysreg_el1(SYS_VBAR);
- ctxt->sys_regs[CONTEXTIDR_EL1] = read_sysreg_el1(SYS_CONTEXTIDR);
- ctxt->sys_regs[AMAIR_EL1] = read_sysreg_el1(SYS_AMAIR);
- ctxt->sys_regs[CNTKCTL_EL1] = read_sysreg_el1(SYS_CNTKCTL);
- ctxt->sys_regs[PAR_EL1] = read_sysreg(par_el1);
- ctxt->sys_regs[TPIDR_EL1] = read_sysreg(tpidr_el1);
-
- ctxt->gp_regs.sp_el1 = read_sysreg(sp_el1);
- ctxt->gp_regs.elr_el1 = read_sysreg_el1(SYS_ELR);
- ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(SYS_SPSR);
-}
-
-static void __hyp_text __sysreg_save_el2_return_state(struct kvm_cpu_context *ctxt)
-{
- ctxt->gp_regs.regs.pc = read_sysreg_el2(SYS_ELR);
- ctxt->gp_regs.regs.pstate = read_sysreg_el2(SYS_SPSR);
-
- if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN))
- ctxt->sys_regs[DISR_EL1] = read_sysreg_s(SYS_VDISR_EL2);
-}
-
-void __hyp_text __sysreg_save_state_nvhe(struct kvm_cpu_context *ctxt)
-{
- __sysreg_save_el1_state(ctxt);
- __sysreg_save_common_state(ctxt);
- __sysreg_save_user_state(ctxt);
- __sysreg_save_el2_return_state(ctxt);
-}
-
void sysreg_save_host_state_vhe(struct kvm_cpu_context *ctxt)
{
__sysreg_save_common_state(ctxt);
@@ -90,111 +37,6 @@ void sysreg_save_guest_state_vhe(struct kvm_cpu_context *ctxt)
}
NOKPROBE_SYMBOL(sysreg_save_guest_state_vhe);

-static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
-{
- write_sysreg(ctxt->sys_regs[MDSCR_EL1], mdscr_el1);
-}
-
-static void __hyp_text __sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
-{
- write_sysreg(ctxt->sys_regs[TPIDR_EL0], tpidr_el0);
- write_sysreg(ctxt->sys_regs[TPIDRRO_EL0], tpidrro_el0);
-}
-
-static void __hyp_text __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
-{
- write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2);
- write_sysreg(ctxt->sys_regs[CSSELR_EL1], csselr_el1);
-
- if (has_vhe() ||
- !cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
- write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], SYS_SCTLR);
- write_sysreg_el1(ctxt->sys_regs[TCR_EL1], SYS_TCR);
- } else if (!ctxt->__hyp_running_vcpu) {
- /*
- * Must only be done for guest registers, hence the context
- * test. We're coming from the host, so SCTLR.M is already
- * set. Pairs with nVHE's __activate_traps().
- */
- write_sysreg_el1((ctxt->sys_regs[TCR_EL1] |
- TCR_EPD1_MASK | TCR_EPD0_MASK),
- SYS_TCR);
- isb();
- }
-
- write_sysreg_el1(ctxt->sys_regs[CPACR_EL1], SYS_CPACR);
- write_sysreg_el1(ctxt->sys_regs[TTBR0_EL1], SYS_TTBR0);
- write_sysreg_el1(ctxt->sys_regs[TTBR1_EL1], SYS_TTBR1);
- write_sysreg_el1(ctxt->sys_regs[ESR_EL1], SYS_ESR);
- write_sysreg_el1(ctxt->sys_regs[AFSR0_EL1], SYS_AFSR0);
- write_sysreg_el1(ctxt->sys_regs[AFSR1_EL1], SYS_AFSR1);
- write_sysreg_el1(ctxt->sys_regs[FAR_EL1], SYS_FAR);
- write_sysreg_el1(ctxt->sys_regs[MAIR_EL1], SYS_MAIR);
- write_sysreg_el1(ctxt->sys_regs[VBAR_EL1], SYS_VBAR);
- write_sysreg_el1(ctxt->sys_regs[CONTEXTIDR_EL1],SYS_CONTEXTIDR);
- write_sysreg_el1(ctxt->sys_regs[AMAIR_EL1], SYS_AMAIR);
- write_sysreg_el1(ctxt->sys_regs[CNTKCTL_EL1], SYS_CNTKCTL);
- write_sysreg(ctxt->sys_regs[PAR_EL1], par_el1);
- write_sysreg(ctxt->sys_regs[TPIDR_EL1], tpidr_el1);
-
- if (!has_vhe() &&
- cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT) &&
- ctxt->__hyp_running_vcpu) {
- /*
- * Must only be done for host registers, hence the context
- * test. Pairs with nVHE's __deactivate_traps().
- */
- isb();
- /*
- * At this stage, and thanks to the above isb(), S2 is
- * deconfigured and disabled. We can now restore the host's
- * S1 configuration: SCTLR, and only then TCR.
- */
- write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], SYS_SCTLR);
- isb();
- write_sysreg_el1(ctxt->sys_regs[TCR_EL1], SYS_TCR);
- }
-
- write_sysreg(ctxt->gp_regs.sp_el1, sp_el1);
- write_sysreg_el1(ctxt->gp_regs.elr_el1, SYS_ELR);
- write_sysreg_el1(ctxt->gp_regs.spsr[KVM_SPSR_EL1],SYS_SPSR);
-}
-
-static void __hyp_text
-__sysreg_restore_el2_return_state(struct kvm_cpu_context *ctxt)
-{
- u64 pstate = ctxt->gp_regs.regs.pstate;
- u64 mode = pstate & PSR_AA32_MODE_MASK;
-
- /*
- * Safety check to ensure we're setting the CPU up to enter the guest
- * in a less privileged mode.
- *
- * If we are attempting a return to EL2 or higher in AArch64 state,
- * program SPSR_EL2 with M=EL2h and the IL bit set which ensures that
- * we'll take an illegal exception state exception immediately after
- * the ERET to the guest. Attempts to return to AArch32 Hyp will
- * result in an illegal exception return because EL2's execution state
- * is determined by SCR_EL3.RW.
- */
- if (!(mode & PSR_MODE32_BIT) && mode >= PSR_MODE_EL2t)
- pstate = PSR_MODE_EL2h | PSR_IL_BIT;
-
- write_sysreg_el2(ctxt->gp_regs.regs.pc, SYS_ELR);
- write_sysreg_el2(pstate, SYS_SPSR);
-
- if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN))
- write_sysreg_s(ctxt->sys_regs[DISR_EL1], SYS_VDISR_EL2);
-}
-
-void __hyp_text __sysreg_restore_state_nvhe(struct kvm_cpu_context *ctxt)
-{
- __sysreg_restore_el1_state(ctxt);
- __sysreg_restore_common_state(ctxt);
- __sysreg_restore_user_state(ctxt);
- __sysreg_restore_el2_return_state(ctxt);
-}
-
void sysreg_restore_host_state_vhe(struct kvm_cpu_context *ctxt)
{
__sysreg_restore_common_state(ctxt);
@@ -208,48 +50,14 @@ void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt)
}
NOKPROBE_SYMBOL(sysreg_restore_guest_state_vhe);

-void __hyp_text __sysreg32_save_state(struct kvm_vcpu *vcpu)
+void __sysreg32_save_state(struct kvm_vcpu *vcpu)
{
- u64 *spsr, *sysreg;
-
- if (!vcpu_el1_is_32bit(vcpu))
- return;
-
- spsr = vcpu->arch.ctxt.gp_regs.spsr;
- sysreg = vcpu->arch.ctxt.sys_regs;
-
- spsr[KVM_SPSR_ABT] = read_sysreg(spsr_abt);
- spsr[KVM_SPSR_UND] = read_sysreg(spsr_und);
- spsr[KVM_SPSR_IRQ] = read_sysreg(spsr_irq);
- spsr[KVM_SPSR_FIQ] = read_sysreg(spsr_fiq);
-
- sysreg[DACR32_EL2] = read_sysreg(dacr32_el2);
- sysreg[IFSR32_EL2] = read_sysreg(ifsr32_el2);
-
- if (has_vhe() || vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY)
- sysreg[DBGVCR32_EL2] = read_sysreg(dbgvcr32_el2);
+ ___sysreg32_save_state(vcpu);
}

-void __hyp_text __sysreg32_restore_state(struct kvm_vcpu *vcpu)
+void __sysreg32_restore_state(struct kvm_vcpu *vcpu)
{
- u64 *spsr, *sysreg;
-
- if (!vcpu_el1_is_32bit(vcpu))
- return;
-
- spsr = vcpu->arch.ctxt.gp_regs.spsr;
- sysreg = vcpu->arch.ctxt.sys_regs;
-
- write_sysreg(spsr[KVM_SPSR_ABT], spsr_abt);
- write_sysreg(spsr[KVM_SPSR_UND], spsr_und);
- write_sysreg(spsr[KVM_SPSR_IRQ], spsr_irq);
- write_sysreg(spsr[KVM_SPSR_FIQ], spsr_fiq);
-
- write_sysreg(sysreg[DACR32_EL2], dacr32_el2);
- write_sysreg(sysreg[IFSR32_EL2], ifsr32_el2);
-
- if (has_vhe() || vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY)
- write_sysreg(sysreg[DBGVCR32_EL2], dbgvcr32_el2);
+ ___sysreg32_restore_state(vcpu);
}

/**
@@ -320,14 +128,3 @@ void kvm_vcpu_put_sysregs(struct kvm_vcpu *vcpu)

vcpu->arch.sysregs_loaded_on_cpu = false;
}
-
-void __hyp_text __kvm_enable_ssbs(void)
-{
- u64 tmp;
-
- asm volatile(
- "mrs %0, sctlr_el2\n"
- "orr %0, %0, %1\n"
- "msr sctlr_el2, %0"
- : "=&r" (tmp) : "L" (SCTLR_ELx_DSSBS));
-}
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/sysreg-sr.h
new file mode 100644
index 000000000000..7bc102c60294
--- /dev/null
+++ b/arch/arm64/kvm/hyp/sysreg-sr.h
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2012-2015 - ARM Ltd
+ * Author: Marc Zyngier <[email protected]>
+ */
+
+#ifndef __ARM64_KVM_HYP_SYSREG_SR_H__
+#define __ARM64_KVM_HYP_SYSREG_SR_H__
+
+#include <linux/compiler.h>
+#include <linux/kvm_host.h>
+
+#include <asm/kprobes.h>
+#include <asm/kvm_asm.h>
+#include <asm/kvm_emulate.h>
+#include <asm/kvm_hyp.h>
+
+static inline void __hyp_text
+__sysreg_save_common_state(struct kvm_cpu_context *ctxt)
+{
+ ctxt->sys_regs[MDSCR_EL1] = read_sysreg(mdscr_el1);
+}
+
+static inline void __hyp_text
+__sysreg_save_user_state(struct kvm_cpu_context *ctxt)
+{
+ ctxt->sys_regs[TPIDR_EL0] = read_sysreg(tpidr_el0);
+ ctxt->sys_regs[TPIDRRO_EL0] = read_sysreg(tpidrro_el0);
+}
+
+static inline void __hyp_text
+__sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
+{
+ ctxt->sys_regs[CSSELR_EL1] = read_sysreg(csselr_el1);
+ ctxt->sys_regs[SCTLR_EL1] = read_sysreg_el1(SYS_SCTLR);
+ ctxt->sys_regs[CPACR_EL1] = read_sysreg_el1(SYS_CPACR);
+ ctxt->sys_regs[TTBR0_EL1] = read_sysreg_el1(SYS_TTBR0);
+ ctxt->sys_regs[TTBR1_EL1] = read_sysreg_el1(SYS_TTBR1);
+ ctxt->sys_regs[TCR_EL1] = read_sysreg_el1(SYS_TCR);
+ ctxt->sys_regs[ESR_EL1] = read_sysreg_el1(SYS_ESR);
+ ctxt->sys_regs[AFSR0_EL1] = read_sysreg_el1(SYS_AFSR0);
+ ctxt->sys_regs[AFSR1_EL1] = read_sysreg_el1(SYS_AFSR1);
+ ctxt->sys_regs[FAR_EL1] = read_sysreg_el1(SYS_FAR);
+ ctxt->sys_regs[MAIR_EL1] = read_sysreg_el1(SYS_MAIR);
+ ctxt->sys_regs[VBAR_EL1] = read_sysreg_el1(SYS_VBAR);
+ ctxt->sys_regs[CONTEXTIDR_EL1] = read_sysreg_el1(SYS_CONTEXTIDR);
+ ctxt->sys_regs[AMAIR_EL1] = read_sysreg_el1(SYS_AMAIR);
+ ctxt->sys_regs[CNTKCTL_EL1] = read_sysreg_el1(SYS_CNTKCTL);
+ ctxt->sys_regs[PAR_EL1] = read_sysreg(par_el1);
+ ctxt->sys_regs[TPIDR_EL1] = read_sysreg(tpidr_el1);
+
+ ctxt->gp_regs.sp_el1 = read_sysreg(sp_el1);
+ ctxt->gp_regs.elr_el1 = read_sysreg_el1(SYS_ELR);
+ ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(SYS_SPSR);
+}
+
+static inline void __hyp_text
+__sysreg_save_el2_return_state(struct kvm_cpu_context *ctxt)
+{
+ ctxt->gp_regs.regs.pc = read_sysreg_el2(SYS_ELR);
+ ctxt->gp_regs.regs.pstate = read_sysreg_el2(SYS_SPSR);
+
+ if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN))
+ ctxt->sys_regs[DISR_EL1] = read_sysreg_s(SYS_VDISR_EL2);
+}
+
+static inline void __hyp_text
+__sysreg_restore_common_state(struct kvm_cpu_context *ctxt)
+{
+ write_sysreg(ctxt->sys_regs[MDSCR_EL1], mdscr_el1);
+}
+
+static inline void __hyp_text
+__sysreg_restore_user_state(struct kvm_cpu_context *ctxt)
+{
+ write_sysreg(ctxt->sys_regs[TPIDR_EL0], tpidr_el0);
+ write_sysreg(ctxt->sys_regs[TPIDRRO_EL0], tpidrro_el0);
+}
+
+static inline void __hyp_text
+__sysreg_restore_el1_state(struct kvm_cpu_context *ctxt)
+{
+ write_sysreg(ctxt->sys_regs[MPIDR_EL1], vmpidr_el2);
+ write_sysreg(ctxt->sys_regs[CSSELR_EL1], csselr_el1);
+
+ if (has_vhe() ||
+ !cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
+ write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], SYS_SCTLR);
+ write_sysreg_el1(ctxt->sys_regs[TCR_EL1], SYS_TCR);
+ } else if (!ctxt->__hyp_running_vcpu) {
+ /*
+ * Must only be done for guest registers, hence the context
+ * test. We're coming from the host, so SCTLR.M is already
+ * set. Pairs with nVHE's __activate_traps().
+ */
+ write_sysreg_el1((ctxt->sys_regs[TCR_EL1] |
+ TCR_EPD1_MASK | TCR_EPD0_MASK),
+ SYS_TCR);
+ isb();
+ }
+
+ write_sysreg_el1(ctxt->sys_regs[CPACR_EL1], SYS_CPACR);
+ write_sysreg_el1(ctxt->sys_regs[TTBR0_EL1], SYS_TTBR0);
+ write_sysreg_el1(ctxt->sys_regs[TTBR1_EL1], SYS_TTBR1);
+ write_sysreg_el1(ctxt->sys_regs[ESR_EL1], SYS_ESR);
+ write_sysreg_el1(ctxt->sys_regs[AFSR0_EL1], SYS_AFSR0);
+ write_sysreg_el1(ctxt->sys_regs[AFSR1_EL1], SYS_AFSR1);
+ write_sysreg_el1(ctxt->sys_regs[FAR_EL1], SYS_FAR);
+ write_sysreg_el1(ctxt->sys_regs[MAIR_EL1], SYS_MAIR);
+ write_sysreg_el1(ctxt->sys_regs[VBAR_EL1], SYS_VBAR);
+ write_sysreg_el1(ctxt->sys_regs[CONTEXTIDR_EL1],SYS_CONTEXTIDR);
+ write_sysreg_el1(ctxt->sys_regs[AMAIR_EL1], SYS_AMAIR);
+ write_sysreg_el1(ctxt->sys_regs[CNTKCTL_EL1], SYS_CNTKCTL);
+ write_sysreg(ctxt->sys_regs[PAR_EL1], par_el1);
+ write_sysreg(ctxt->sys_regs[TPIDR_EL1], tpidr_el1);
+
+ if (!has_vhe() &&
+ cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT) &&
+ ctxt->__hyp_running_vcpu) {
+ /*
+ * Must only be done for host registers, hence the context
+ * test. Pairs with nVHE's __deactivate_traps().
+ */
+ isb();
+ /*
+ * At this stage, and thanks to the above isb(), S2 is
+ * deconfigured and disabled. We can now restore the host's
+ * S1 configuration: SCTLR, and only then TCR.
+ */
+ write_sysreg_el1(ctxt->sys_regs[SCTLR_EL1], SYS_SCTLR);
+ isb();
+ write_sysreg_el1(ctxt->sys_regs[TCR_EL1], SYS_TCR);
+ }
+
+ write_sysreg(ctxt->gp_regs.sp_el1, sp_el1);
+ write_sysreg_el1(ctxt->gp_regs.elr_el1, SYS_ELR);
+ write_sysreg_el1(ctxt->gp_regs.spsr[KVM_SPSR_EL1],SYS_SPSR);
+}
+
+static inline void __hyp_text
+__sysreg_restore_el2_return_state(struct kvm_cpu_context *ctxt)
+{
+ u64 pstate = ctxt->gp_regs.regs.pstate;
+ u64 mode = pstate & PSR_AA32_MODE_MASK;
+
+ /*
+ * Safety check to ensure we're setting the CPU up to enter the guest
+ * in a less privileged mode.
+ *
+ * If we are attempting a return to EL2 or higher in AArch64 state,
+ * program SPSR_EL2 with M=EL2h and the IL bit set which ensures that
+ * we'll take an illegal exception state exception immediately after
+ * the ERET to the guest. Attempts to return to AArch32 Hyp will
+ * result in an illegal exception return because EL2's execution state
+ * is determined by SCR_EL3.RW.
+ */
+ if (!(mode & PSR_MODE32_BIT) && mode >= PSR_MODE_EL2t)
+ pstate = PSR_MODE_EL2h | PSR_IL_BIT;
+
+ write_sysreg_el2(ctxt->gp_regs.regs.pc, SYS_ELR);
+ write_sysreg_el2(pstate, SYS_SPSR);
+
+ if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN))
+ write_sysreg_s(ctxt->sys_regs[DISR_EL1], SYS_VDISR_EL2);
+}
+
+static inline void __hyp_text ___sysreg32_save_state(struct kvm_vcpu *vcpu)
+{
+ u64 *spsr, *sysreg;
+
+ if (!vcpu_el1_is_32bit(vcpu))
+ return;
+
+ spsr = vcpu->arch.ctxt.gp_regs.spsr;
+ sysreg = vcpu->arch.ctxt.sys_regs;
+
+ spsr[KVM_SPSR_ABT] = read_sysreg(spsr_abt);
+ spsr[KVM_SPSR_UND] = read_sysreg(spsr_und);
+ spsr[KVM_SPSR_IRQ] = read_sysreg(spsr_irq);
+ spsr[KVM_SPSR_FIQ] = read_sysreg(spsr_fiq);
+
+ sysreg[DACR32_EL2] = read_sysreg(dacr32_el2);
+ sysreg[IFSR32_EL2] = read_sysreg(ifsr32_el2);
+
+ if (has_vhe() || vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY)
+ sysreg[DBGVCR32_EL2] = read_sysreg(dbgvcr32_el2);
+}
+
+static inline void __hyp_text ___sysreg32_restore_state(struct kvm_vcpu *vcpu)
+{
+ u64 *spsr, *sysreg;
+
+ if (!vcpu_el1_is_32bit(vcpu))
+ return;
+
+ spsr = vcpu->arch.ctxt.gp_regs.spsr;
+ sysreg = vcpu->arch.ctxt.sys_regs;
+
+ write_sysreg(spsr[KVM_SPSR_ABT], spsr_abt);
+ write_sysreg(spsr[KVM_SPSR_UND], spsr_und);
+ write_sysreg(spsr[KVM_SPSR_IRQ], spsr_irq);
+ write_sysreg(spsr[KVM_SPSR_FIQ], spsr_fiq);
+
+ write_sysreg(sysreg[DACR32_EL2], dacr32_el2);
+ write_sysreg(sysreg[IFSR32_EL2], ifsr32_el2);
+
+ if (has_vhe() || vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY)
+ write_sysreg(sysreg[DBGVCR32_EL2], dbgvcr32_el2);
+}
+
+#endif /* __ARM64_KVM_HYP_SYSREG_SR_H__ */
--
2.27.0

2020-06-18 20:27:34

by David Brazdil

Subject: [PATCH v3 09/15] arm64: kvm: Split hyp/debug-sr.c to VHE/nVHE

This patch is part of a series which builds KVM's non-VHE hyp code separately
from VHE and the rest of the kernel.

debug-sr.c contains KVM's code for context-switching debug registers, with some
parts shared between VHE/nVHE. These common routines are moved to debug-sr.h,
VHE-specific code is left in debug-sr.c and nVHE-specific code is moved to
nvhe/debug-sr.c.

Functions are slightly refactored to move code hidden behind `has_vhe()` checks
to the corresponding .c files.

Signed-off-by: David Brazdil <[email protected]>
---
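Note (illustrative, not part of the commit message): after the split the two
entry points have the following shape, with the SPE handling kept only on the
nVHE side (condensed from the hunks below):

	/* hyp/nvhe/debug-sr.c */
	void __hyp_text __debug_switch_to_guest(struct kvm_vcpu *vcpu)
	{
		/* Disable and flush SPE data generation */
		__debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
		__debug_switch_to_guest_common(vcpu);
	}

	/* hyp/debug-sr.c (VHE) */
	void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
	{
		__debug_switch_to_guest_common(vcpu);
	}
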
arch/arm64/kernel/image-vars.h | 3 -
arch/arm64/kvm/hyp/debug-sr.c | 210 +----------------------------
arch/arm64/kvm/hyp/debug-sr.h | 172 +++++++++++++++++++++++
arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
arch/arm64/kvm/hyp/nvhe/debug-sr.c | 77 +++++++++++
5 files changed, 256 insertions(+), 208 deletions(-)
create mode 100644 arch/arm64/kvm/hyp/debug-sr.h
create mode 100644 arch/arm64/kvm/hyp/nvhe/debug-sr.c

diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 855f9718d6d9..8096e6f1f2bf 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -61,8 +61,6 @@ __efistub__ctype = _ctype;
* memory mappings.
*/

-__kvm_nvhe___debug_switch_to_guest = __debug_switch_to_guest;
-__kvm_nvhe___debug_switch_to_host = __debug_switch_to_host;
__kvm_nvhe___fpsimd_restore_state = __fpsimd_restore_state;
__kvm_nvhe___fpsimd_save_state = __fpsimd_save_state;
__kvm_nvhe___guest_enter = __guest_enter;
@@ -71,7 +69,6 @@ __kvm_nvhe___hyp_panic_string = __hyp_panic_string;
__kvm_nvhe___hyp_stub_vectors = __hyp_stub_vectors;
__kvm_nvhe___icache_flags = __icache_flags;
__kvm_nvhe___kvm_enable_ssbs = __kvm_enable_ssbs;
-__kvm_nvhe___kvm_get_mdcr_el2 = __kvm_get_mdcr_el2;
__kvm_nvhe___kvm_timer_set_cntvoff = __kvm_timer_set_cntvoff;
__kvm_nvhe___sysreg32_restore_state = __sysreg32_restore_state;
__kvm_nvhe___sysreg32_save_state = __sysreg32_save_state;
diff --git a/arch/arm64/kvm/hyp/debug-sr.c b/arch/arm64/kvm/hyp/debug-sr.c
index e95af204fec7..28c0a54cda2a 100644
--- a/arch/arm64/kvm/hyp/debug-sr.c
+++ b/arch/arm64/kvm/hyp/debug-sr.c
@@ -4,221 +4,23 @@
* Author: Marc Zyngier <[email protected]>
*/

-#include <linux/compiler.h>
#include <linux/kvm_host.h>

-#include <asm/debug-monitors.h>
-#include <asm/kvm_asm.h>
#include <asm/kvm_hyp.h>
-#include <asm/kvm_mmu.h>

-#define read_debug(r,n) read_sysreg(r##n##_el1)
-#define write_debug(v,r,n) write_sysreg(v, r##n##_el1)
+#include "debug-sr.h"

-#define save_debug(ptr,reg,nr) \
- switch (nr) { \
- case 15: ptr[15] = read_debug(reg, 15); \
- /* Fall through */ \
- case 14: ptr[14] = read_debug(reg, 14); \
- /* Fall through */ \
- case 13: ptr[13] = read_debug(reg, 13); \
- /* Fall through */ \
- case 12: ptr[12] = read_debug(reg, 12); \
- /* Fall through */ \
- case 11: ptr[11] = read_debug(reg, 11); \
- /* Fall through */ \
- case 10: ptr[10] = read_debug(reg, 10); \
- /* Fall through */ \
- case 9: ptr[9] = read_debug(reg, 9); \
- /* Fall through */ \
- case 8: ptr[8] = read_debug(reg, 8); \
- /* Fall through */ \
- case 7: ptr[7] = read_debug(reg, 7); \
- /* Fall through */ \
- case 6: ptr[6] = read_debug(reg, 6); \
- /* Fall through */ \
- case 5: ptr[5] = read_debug(reg, 5); \
- /* Fall through */ \
- case 4: ptr[4] = read_debug(reg, 4); \
- /* Fall through */ \
- case 3: ptr[3] = read_debug(reg, 3); \
- /* Fall through */ \
- case 2: ptr[2] = read_debug(reg, 2); \
- /* Fall through */ \
- case 1: ptr[1] = read_debug(reg, 1); \
- /* Fall through */ \
- default: ptr[0] = read_debug(reg, 0); \
- }
-
-#define restore_debug(ptr,reg,nr) \
- switch (nr) { \
- case 15: write_debug(ptr[15], reg, 15); \
- /* Fall through */ \
- case 14: write_debug(ptr[14], reg, 14); \
- /* Fall through */ \
- case 13: write_debug(ptr[13], reg, 13); \
- /* Fall through */ \
- case 12: write_debug(ptr[12], reg, 12); \
- /* Fall through */ \
- case 11: write_debug(ptr[11], reg, 11); \
- /* Fall through */ \
- case 10: write_debug(ptr[10], reg, 10); \
- /* Fall through */ \
- case 9: write_debug(ptr[9], reg, 9); \
- /* Fall through */ \
- case 8: write_debug(ptr[8], reg, 8); \
- /* Fall through */ \
- case 7: write_debug(ptr[7], reg, 7); \
- /* Fall through */ \
- case 6: write_debug(ptr[6], reg, 6); \
- /* Fall through */ \
- case 5: write_debug(ptr[5], reg, 5); \
- /* Fall through */ \
- case 4: write_debug(ptr[4], reg, 4); \
- /* Fall through */ \
- case 3: write_debug(ptr[3], reg, 3); \
- /* Fall through */ \
- case 2: write_debug(ptr[2], reg, 2); \
- /* Fall through */ \
- case 1: write_debug(ptr[1], reg, 1); \
- /* Fall through */ \
- default: write_debug(ptr[0], reg, 0); \
- }
-
-static void __hyp_text __debug_save_spe_nvhe(u64 *pmscr_el1)
-{
- u64 reg;
-
- /* Clear pmscr in case of early return */
- *pmscr_el1 = 0;
-
- /* SPE present on this CPU? */
- if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
- ID_AA64DFR0_PMSVER_SHIFT))
- return;
-
- /* Yes; is it owned by EL3? */
- reg = read_sysreg_s(SYS_PMBIDR_EL1);
- if (reg & BIT(SYS_PMBIDR_EL1_P_SHIFT))
- return;
-
- /* No; is the host actually using the thing? */
- reg = read_sysreg_s(SYS_PMBLIMITR_EL1);
- if (!(reg & BIT(SYS_PMBLIMITR_EL1_E_SHIFT)))
- return;
-
- /* Yes; save the control register and disable data generation */
- *pmscr_el1 = read_sysreg_s(SYS_PMSCR_EL1);
- write_sysreg_s(0, SYS_PMSCR_EL1);
- isb();
-
- /* Now drain all buffered data to memory */
- psb_csync();
- dsb(nsh);
-}
-
-static void __hyp_text __debug_restore_spe_nvhe(u64 pmscr_el1)
-{
- if (!pmscr_el1)
- return;
-
- /* The host page table is installed, but not yet synchronised */
- isb();
-
- /* Re-enable data generation */
- write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1);
-}
-
-static void __hyp_text __debug_save_state(struct kvm_vcpu *vcpu,
- struct kvm_guest_debug_arch *dbg,
- struct kvm_cpu_context *ctxt)
-{
- u64 aa64dfr0;
- int brps, wrps;
-
- aa64dfr0 = read_sysreg(id_aa64dfr0_el1);
- brps = (aa64dfr0 >> 12) & 0xf;
- wrps = (aa64dfr0 >> 20) & 0xf;
-
- save_debug(dbg->dbg_bcr, dbgbcr, brps);
- save_debug(dbg->dbg_bvr, dbgbvr, brps);
- save_debug(dbg->dbg_wcr, dbgwcr, wrps);
- save_debug(dbg->dbg_wvr, dbgwvr, wrps);
-
- ctxt->sys_regs[MDCCINT_EL1] = read_sysreg(mdccint_el1);
-}
-
-static void __hyp_text __debug_restore_state(struct kvm_vcpu *vcpu,
- struct kvm_guest_debug_arch *dbg,
- struct kvm_cpu_context *ctxt)
+void __debug_switch_to_guest(struct kvm_vcpu *vcpu)
{
- u64 aa64dfr0;
- int brps, wrps;
-
- aa64dfr0 = read_sysreg(id_aa64dfr0_el1);
-
- brps = (aa64dfr0 >> 12) & 0xf;
- wrps = (aa64dfr0 >> 20) & 0xf;
-
- restore_debug(dbg->dbg_bcr, dbgbcr, brps);
- restore_debug(dbg->dbg_bvr, dbgbvr, brps);
- restore_debug(dbg->dbg_wcr, dbgwcr, wrps);
- restore_debug(dbg->dbg_wvr, dbgwvr, wrps);
-
- write_sysreg(ctxt->sys_regs[MDCCINT_EL1], mdccint_el1);
+ __debug_switch_to_guest_common(vcpu);
}

-void __hyp_text __debug_switch_to_guest(struct kvm_vcpu *vcpu)
+void __debug_switch_to_host(struct kvm_vcpu *vcpu)
{
- struct kvm_cpu_context *host_ctxt;
- struct kvm_cpu_context *guest_ctxt;
- struct kvm_guest_debug_arch *host_dbg;
- struct kvm_guest_debug_arch *guest_dbg;
-
- /*
- * Non-VHE: Disable and flush SPE data generation
- * VHE: The vcpu can run, but it can't hide.
- */
- if (!has_vhe())
- __debug_save_spe_nvhe(&vcpu->arch.host_debug_state.pmscr_el1);
-
- if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
- return;
-
- host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
- guest_ctxt = &vcpu->arch.ctxt;
- host_dbg = &vcpu->arch.host_debug_state.regs;
- guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);
-
- __debug_save_state(vcpu, host_dbg, host_ctxt);
- __debug_restore_state(vcpu, guest_dbg, guest_ctxt);
-}
-
-void __hyp_text __debug_switch_to_host(struct kvm_vcpu *vcpu)
-{
- struct kvm_cpu_context *host_ctxt;
- struct kvm_cpu_context *guest_ctxt;
- struct kvm_guest_debug_arch *host_dbg;
- struct kvm_guest_debug_arch *guest_dbg;
-
- if (!has_vhe())
- __debug_restore_spe_nvhe(vcpu->arch.host_debug_state.pmscr_el1);
-
- if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
- return;
-
- host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
- guest_ctxt = &vcpu->arch.ctxt;
- host_dbg = &vcpu->arch.host_debug_state.regs;
- guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);
-
- __debug_save_state(vcpu, guest_dbg, guest_ctxt);
- __debug_restore_state(vcpu, host_dbg, host_ctxt);
-
- vcpu->arch.flags &= ~KVM_ARM64_DEBUG_DIRTY;
+ __debug_switch_to_host_common(vcpu);
}

-u32 __hyp_text __kvm_get_mdcr_el2(void)
+u32 __kvm_get_mdcr_el2(void)
{
return read_sysreg(mdcr_el2);
}
diff --git a/arch/arm64/kvm/hyp/debug-sr.h b/arch/arm64/kvm/hyp/debug-sr.h
new file mode 100644
index 000000000000..62b5deeb301d
--- /dev/null
+++ b/arch/arm64/kvm/hyp/debug-sr.h
@@ -0,0 +1,172 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2015 - ARM Ltd
+ * Author: Marc Zyngier <[email protected]>
+ */
+
+#ifndef __ARM64_KVM_HYP_DEBUG_SR_H__
+#define __ARM64_KVM_HYP_DEBUG_SR_H__
+
+#include <linux/compiler.h>
+#include <linux/kvm_host.h>
+
+#include <asm/debug-monitors.h>
+#include <asm/kvm_asm.h>
+#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
+
+#define read_debug(r,n) read_sysreg(r##n##_el1)
+#define write_debug(v,r,n) write_sysreg(v, r##n##_el1)
+
+#define save_debug(ptr,reg,nr) \
+ switch (nr) { \
+ case 15: ptr[15] = read_debug(reg, 15); \
+ /* Fall through */ \
+ case 14: ptr[14] = read_debug(reg, 14); \
+ /* Fall through */ \
+ case 13: ptr[13] = read_debug(reg, 13); \
+ /* Fall through */ \
+ case 12: ptr[12] = read_debug(reg, 12); \
+ /* Fall through */ \
+ case 11: ptr[11] = read_debug(reg, 11); \
+ /* Fall through */ \
+ case 10: ptr[10] = read_debug(reg, 10); \
+ /* Fall through */ \
+ case 9: ptr[9] = read_debug(reg, 9); \
+ /* Fall through */ \
+ case 8: ptr[8] = read_debug(reg, 8); \
+ /* Fall through */ \
+ case 7: ptr[7] = read_debug(reg, 7); \
+ /* Fall through */ \
+ case 6: ptr[6] = read_debug(reg, 6); \
+ /* Fall through */ \
+ case 5: ptr[5] = read_debug(reg, 5); \
+ /* Fall through */ \
+ case 4: ptr[4] = read_debug(reg, 4); \
+ /* Fall through */ \
+ case 3: ptr[3] = read_debug(reg, 3); \
+ /* Fall through */ \
+ case 2: ptr[2] = read_debug(reg, 2); \
+ /* Fall through */ \
+ case 1: ptr[1] = read_debug(reg, 1); \
+ /* Fall through */ \
+ default: ptr[0] = read_debug(reg, 0); \
+ }
+
+#define restore_debug(ptr,reg,nr) \
+ switch (nr) { \
+ case 15: write_debug(ptr[15], reg, 15); \
+ /* Fall through */ \
+ case 14: write_debug(ptr[14], reg, 14); \
+ /* Fall through */ \
+ case 13: write_debug(ptr[13], reg, 13); \
+ /* Fall through */ \
+ case 12: write_debug(ptr[12], reg, 12); \
+ /* Fall through */ \
+ case 11: write_debug(ptr[11], reg, 11); \
+ /* Fall through */ \
+ case 10: write_debug(ptr[10], reg, 10); \
+ /* Fall through */ \
+ case 9: write_debug(ptr[9], reg, 9); \
+ /* Fall through */ \
+ case 8: write_debug(ptr[8], reg, 8); \
+ /* Fall through */ \
+ case 7: write_debug(ptr[7], reg, 7); \
+ /* Fall through */ \
+ case 6: write_debug(ptr[6], reg, 6); \
+ /* Fall through */ \
+ case 5: write_debug(ptr[5], reg, 5); \
+ /* Fall through */ \
+ case 4: write_debug(ptr[4], reg, 4); \
+ /* Fall through */ \
+ case 3: write_debug(ptr[3], reg, 3); \
+ /* Fall through */ \
+ case 2: write_debug(ptr[2], reg, 2); \
+ /* Fall through */ \
+ case 1: write_debug(ptr[1], reg, 1); \
+ /* Fall through */ \
+ default: write_debug(ptr[0], reg, 0); \
+ }
+
+static inline void __hyp_text
+__debug_save_state(struct kvm_vcpu *vcpu, struct kvm_guest_debug_arch *dbg,
+ struct kvm_cpu_context *ctxt)
+{
+ u64 aa64dfr0;
+ int brps, wrps;
+
+ aa64dfr0 = read_sysreg(id_aa64dfr0_el1);
+ brps = (aa64dfr0 >> 12) & 0xf;
+ wrps = (aa64dfr0 >> 20) & 0xf;
+
+ save_debug(dbg->dbg_bcr, dbgbcr, brps);
+ save_debug(dbg->dbg_bvr, dbgbvr, brps);
+ save_debug(dbg->dbg_wcr, dbgwcr, wrps);
+ save_debug(dbg->dbg_wvr, dbgwvr, wrps);
+
+ ctxt->sys_regs[MDCCINT_EL1] = read_sysreg(mdccint_el1);
+}
+
+static inline void __hyp_text
+__debug_restore_state(struct kvm_vcpu *vcpu, struct kvm_guest_debug_arch *dbg,
+ struct kvm_cpu_context *ctxt)
+{
+ u64 aa64dfr0;
+ int brps, wrps;
+
+ aa64dfr0 = read_sysreg(id_aa64dfr0_el1);
+
+ brps = (aa64dfr0 >> 12) & 0xf;
+ wrps = (aa64dfr0 >> 20) & 0xf;
+
+ restore_debug(dbg->dbg_bcr, dbgbcr, brps);
+ restore_debug(dbg->dbg_bvr, dbgbvr, brps);
+ restore_debug(dbg->dbg_wcr, dbgwcr, wrps);
+ restore_debug(dbg->dbg_wvr, dbgwvr, wrps);
+
+ write_sysreg(ctxt->sys_regs[MDCCINT_EL1], mdccint_el1);
+}
+
+static inline void __hyp_text
+__debug_switch_to_guest_common(struct kvm_vcpu *vcpu)
+{
+ struct kvm_cpu_context *host_ctxt;
+ struct kvm_cpu_context *guest_ctxt;
+ struct kvm_guest_debug_arch *host_dbg;
+ struct kvm_guest_debug_arch *guest_dbg;
+
+ if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
+ return;
+
+ host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
+ guest_ctxt = &vcpu->arch.ctxt;
+ host_dbg = &vcpu->arch.host_debug_state.regs;
+ guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);
+
+ __debug_save_state(vcpu, host_dbg, host_ctxt);
+ __debug_restore_state(vcpu, guest_dbg, guest_ctxt);
+}
+
+static inline void __hyp_text
+__debug_switch_to_host_common(struct kvm_vcpu *vcpu)
+{
+ struct kvm_cpu_context *host_ctxt;
+ struct kvm_cpu_context *guest_ctxt;
+ struct kvm_guest_debug_arch *host_dbg;
+ struct kvm_guest_debug_arch *guest_dbg;
+
+ if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
+ return;
+
+ host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
+ guest_ctxt = &vcpu->arch.ctxt;
+ host_dbg = &vcpu->arch.host_debug_state.regs;
+ guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);
+
+ __debug_save_state(vcpu, guest_dbg, guest_ctxt);
+ __debug_restore_state(vcpu, host_dbg, host_ctxt);
+
+ vcpu->arch.flags &= ~KVM_ARM64_DEBUG_DIRTY;
+}
+
+#endif /* __ARM64_KVM_HYP_DEBUG_SR_H__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 336b1bf64ceb..95a06786bf26 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -7,7 +7,7 @@ asflags-y := -D__KVM_NVHE_HYPERVISOR__
ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -fno-stack-protector \
-DDISABLE_BRANCH_PROFILING $(DISABLE_STACKLEAK_PLUGIN)

-obj-y := switch.o tlb.o hyp-init.o ../hyp-entry.o
+obj-y := debug-sr.o switch.o tlb.o hyp-init.o ../hyp-entry.o

obj-y := $(patsubst %.o,%.hyp.o,$(obj-y))
extra-y := $(patsubst %.hyp.o,%.hyp.tmp.o,$(obj-y))
diff --git a/arch/arm64/kvm/hyp/nvhe/debug-sr.c b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
new file mode 100644
index 000000000000..b3752cfdcf3d
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/debug-sr.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2015 - ARM Ltd
+ * Author: Marc Zyngier <[email protected]>
+ */
+
+#include <linux/compiler.h>
+#include <linux/kvm_host.h>
+
+#include <asm/debug-monitors.h>
+#include <asm/kvm_asm.h>
+#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
+
+#include "../debug-sr.h"
+
+static void __hyp_text __debug_save_spe(u64 *pmscr_el1)
+{
+ u64 reg;
+
+ /* Clear pmscr in case of early return */
+ *pmscr_el1 = 0;
+
+ /* SPE present on this CPU? */
+ if (!cpuid_feature_extract_unsigned_field(read_sysreg(id_aa64dfr0_el1),
+ ID_AA64DFR0_PMSVER_SHIFT))
+ return;
+
+ /* Yes; is it owned by EL3? */
+ reg = read_sysreg_s(SYS_PMBIDR_EL1);
+ if (reg & BIT(SYS_PMBIDR_EL1_P_SHIFT))
+ return;
+
+ /* No; is the host actually using the thing? */
+ reg = read_sysreg_s(SYS_PMBLIMITR_EL1);
+ if (!(reg & BIT(SYS_PMBLIMITR_EL1_E_SHIFT)))
+ return;
+
+ /* Yes; save the control register and disable data generation */
+ *pmscr_el1 = read_sysreg_s(SYS_PMSCR_EL1);
+ write_sysreg_s(0, SYS_PMSCR_EL1);
+ isb();
+
+ /* Now drain all buffered data to memory */
+ psb_csync();
+ dsb(nsh);
+}
+
+static void __hyp_text __debug_restore_spe(u64 pmscr_el1)
+{
+ if (!pmscr_el1)
+ return;
+
+ /* The host page table is installed, but not yet synchronised */
+ isb();
+
+ /* Re-enable data generation */
+ write_sysreg_s(pmscr_el1, SYS_PMSCR_EL1);
+}
+
+void __hyp_text __debug_switch_to_guest(struct kvm_vcpu *vcpu)
+{
+ /* Disable and flush SPE data generation */
+ __debug_save_spe(&vcpu->arch.host_debug_state.pmscr_el1);
+ __debug_switch_to_guest_common(vcpu);
+}
+
+void __hyp_text __debug_switch_to_host(struct kvm_vcpu *vcpu)
+{
+ __debug_restore_spe(vcpu->arch.host_debug_state.pmscr_el1);
+ __debug_switch_to_host_common(vcpu);
+}
+
+u32 __hyp_text __kvm_get_mdcr_el2(void)
+{
+ return read_sysreg(mdcr_el2);
+}
--
2.27.0

2020-06-18 20:31:02

by David Brazdil

Subject: [PATCH v3 07/15] arm64: kvm: Split hyp/tlb.c to VHE/nVHE

This patch is part of a series which builds KVM's non-VHE hyp code separately
from VHE and the rest of the kernel.

tlb.c contains code for flushing the TLB, with parts shared between VHE/nVHE.
These common routines are moved into a header file tlb.h, VHE-specific code
remains in tlb.c and nVHE-specific code is moved to nvhe/tlb.c.

The header file expects its users to implement two helper functions declared
at the top of the file.

Signed-off-by: David Brazdil <[email protected]>
---
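Note (illustrative, not part of the commit message): tlb.h forward-declares the
two context-switch helpers and each flavour provides its own definition, then
wraps the shared flush routines; e.g. the nVHE side ends up with roughly:

	/* declared at the top of hyp/tlb.h, defined by each user */
	static void __hyp_text __tlb_switch_to_guest(struct kvm *kvm,
						     struct tlb_inv_context *cxt);
	static void __hyp_text __tlb_switch_to_host(struct kvm *kvm,
						    struct tlb_inv_context *cxt);

	/* hyp/nvhe/tlb.c then wraps the shared flush routines, e.g.: */
	void __hyp_text __kvm_flush_vm_context(void)
	{
		__tlb_flush_vm_context();
	}
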
arch/arm64/kernel/image-vars.h | 8 +-
arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
arch/arm64/kvm/hyp/nvhe/tlb.c | 70 +++++++++++++
arch/arm64/kvm/hyp/tlb.c | 171 +++----------------------------
arch/arm64/kvm/hyp/tlb.h | 134 ++++++++++++++++++++++++
5 files changed, 222 insertions(+), 163 deletions(-)
create mode 100644 arch/arm64/kvm/hyp/nvhe/tlb.c
create mode 100644 arch/arm64/kvm/hyp/tlb.h

diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 4dc969ccda9e..e8a8aa6bc7bd 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -63,13 +63,10 @@ __efistub__ctype = _ctype;

__kvm_nvhe___guest_exit = __guest_exit;
__kvm_nvhe___hyp_stub_vectors = __hyp_stub_vectors;
+__kvm_nvhe___icache_flags = __icache_flags;
__kvm_nvhe___kvm_enable_ssbs = __kvm_enable_ssbs;
-__kvm_nvhe___kvm_flush_vm_context = __kvm_flush_vm_context;
__kvm_nvhe___kvm_get_mdcr_el2 = __kvm_get_mdcr_el2;
__kvm_nvhe___kvm_timer_set_cntvoff = __kvm_timer_set_cntvoff;
-__kvm_nvhe___kvm_tlb_flush_local_vmid = __kvm_tlb_flush_local_vmid;
-__kvm_nvhe___kvm_tlb_flush_vmid = __kvm_tlb_flush_vmid;
-__kvm_nvhe___kvm_tlb_flush_vmid_ipa = __kvm_tlb_flush_vmid_ipa;
__kvm_nvhe___kvm_vcpu_run_nvhe = __kvm_vcpu_run_nvhe;
__kvm_nvhe___vgic_v3_get_ich_vtr_el2 = __vgic_v3_get_ich_vtr_el2;
__kvm_nvhe___vgic_v3_init_lrs = __vgic_v3_init_lrs;
@@ -79,8 +76,11 @@ __kvm_nvhe___vgic_v3_save_aprs = __vgic_v3_save_aprs;
__kvm_nvhe___vgic_v3_write_vmcr = __vgic_v3_write_vmcr;
__kvm_nvhe_abort_guest_exit_end = abort_guest_exit_end;
__kvm_nvhe_abort_guest_exit_start = abort_guest_exit_start;
+__kvm_nvhe_arm64_const_caps_ready = arm64_const_caps_ready;
__kvm_nvhe_arm64_enable_wa2_handling = arm64_enable_wa2_handling;
__kvm_nvhe_arm64_ssbd_callback_required = arm64_ssbd_callback_required;
+__kvm_nvhe_cpu_hwcap_keys = cpu_hwcap_keys;
+__kvm_nvhe_cpu_hwcaps = cpu_hwcaps;
__kvm_nvhe_hyp_panic = hyp_panic;
__kvm_nvhe_idmap_t0sz = idmap_t0sz;
__kvm_nvhe_kimage_voffset = kimage_voffset;
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index fef6f1881765..3bfc51de1679 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -7,7 +7,7 @@ asflags-y := -D__KVM_NVHE_HYPERVISOR__
ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -fno-stack-protector \
-DDISABLE_BRANCH_PROFILING $(DISABLE_STACKLEAK_PLUGIN)

-obj-y := hyp-init.o ../hyp-entry.o
+obj-y := tlb.o hyp-init.o ../hyp-entry.o

obj-y := $(patsubst %.o,%.hyp.o,$(obj-y))
extra-y := $(patsubst %.hyp.o,%.hyp.tmp.o,$(obj-y))
diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
new file mode 100644
index 000000000000..111c4b0a23d3
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2015 - ARM Ltd
+ * Author: Marc Zyngier <[email protected]>
+ */
+
+#include <linux/irqflags.h>
+
+#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
+#include <asm/tlbflush.h>
+
+#include "../tlb.h"
+
+static void __hyp_text __tlb_switch_to_guest(struct kvm *kvm,
+ struct tlb_inv_context *cxt)
+{
+ if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
+ u64 val;
+
+ /*
+ * For CPUs that are affected by ARM 1319367, we need to
+ * avoid a host Stage-1 walk while we have the guest's
+ * VMID set in the VTTBR in order to invalidate TLBs.
+ * We're guaranteed that the S1 MMU is enabled, so we can
+ * simply set the EPD bits to avoid any further TLB fill.
+ */
+ val = cxt->tcr = read_sysreg_el1(SYS_TCR);
+ val |= TCR_EPD1_MASK | TCR_EPD0_MASK;
+ write_sysreg_el1(val, SYS_TCR);
+ isb();
+ }
+
+ /* __load_guest_stage2() includes an ISB for the workaround. */
+ __load_guest_stage2(kvm);
+ asm(ALTERNATIVE("isb", "nop", ARM64_WORKAROUND_SPECULATIVE_AT));
+}
+
+static void __hyp_text __tlb_switch_to_host(struct kvm *kvm,
+ struct tlb_inv_context *cxt)
+{
+ write_sysreg(0, vttbr_el2);
+
+ if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
+ /* Ensure write of the host VMID */
+ isb();
+ /* Restore the host's TCR_EL1 */
+ write_sysreg_el1(cxt->tcr, SYS_TCR);
+ }
+}
+
+void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
+{
+ __tlb_flush_vmid_ipa(kvm, ipa);
+}
+
+void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
+{
+ __tlb_flush_vmid(kvm);
+}
+
+void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
+{
+ __tlb_flush_local_vmid(vcpu);
+}
+
+void __hyp_text __kvm_flush_vm_context(void)
+{
+ __tlb_flush_vm_context();
+}
diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c
index d063a576d511..4e190f8c7e9c 100644
--- a/arch/arm64/kvm/hyp/tlb.c
+++ b/arch/arm64/kvm/hyp/tlb.c
@@ -10,14 +10,10 @@
#include <asm/kvm_mmu.h>
#include <asm/tlbflush.h>

-struct tlb_inv_context {
- unsigned long flags;
- u64 tcr;
- u64 sctlr;
-};
+#include "tlb.h"

-static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm,
- struct tlb_inv_context *cxt)
+static void __hyp_text __tlb_switch_to_guest(struct kvm *kvm,
+ struct tlb_inv_context *cxt)
{
u64 val;

@@ -60,41 +56,8 @@ static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm,
isb();
}

-static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm,
- struct tlb_inv_context *cxt)
-{
- if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
- u64 val;
-
- /*
- * For CPUs that are affected by ARM 1319367, we need to
- * avoid a host Stage-1 walk while we have the guest's
- * VMID set in the VTTBR in order to invalidate TLBs.
- * We're guaranteed that the S1 MMU is enabled, so we can
- * simply set the EPD bits to avoid any further TLB fill.
- */
- val = cxt->tcr = read_sysreg_el1(SYS_TCR);
- val |= TCR_EPD1_MASK | TCR_EPD0_MASK;
- write_sysreg_el1(val, SYS_TCR);
- isb();
- }
-
- /* __load_guest_stage2() includes an ISB for the workaround. */
- __load_guest_stage2(kvm);
- asm(ALTERNATIVE("isb", "nop", ARM64_WORKAROUND_SPECULATIVE_AT));
-}
-
-static void __hyp_text __tlb_switch_to_guest(struct kvm *kvm,
- struct tlb_inv_context *cxt)
-{
- if (has_vhe())
- __tlb_switch_to_guest_vhe(kvm, cxt);
- else
- __tlb_switch_to_guest_nvhe(kvm, cxt);
-}
-
-static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm,
- struct tlb_inv_context *cxt)
+static void __hyp_text __tlb_switch_to_host(struct kvm *kvm,
+ struct tlb_inv_context *cxt)
{
/*
* We're done with the TLB operation, let's restore the host's
@@ -113,130 +76,22 @@ static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm,
local_irq_restore(cxt->flags);
}

-static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm,
- struct tlb_inv_context *cxt)
-{
- write_sysreg(0, vttbr_el2);
-
- if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
- /* Ensure write of the host VMID */
- isb();
- /* Restore the host's TCR_EL1 */
- write_sysreg_el1(cxt->tcr, SYS_TCR);
- }
-}
-
-static void __hyp_text __tlb_switch_to_host(struct kvm *kvm,
- struct tlb_inv_context *cxt)
-{
- if (has_vhe())
- __tlb_switch_to_host_vhe(kvm, cxt);
- else
- __tlb_switch_to_host_nvhe(kvm, cxt);
-}
-
-void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
+void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
{
- struct tlb_inv_context cxt;
-
- dsb(ishst);
-
- /* Switch to requested VMID */
- kvm = kern_hyp_va(kvm);
- __tlb_switch_to_guest(kvm, &cxt);
-
- /*
- * We could do so much better if we had the VA as well.
- * Instead, we invalidate Stage-2 for this IPA, and the
- * whole of Stage-1. Weep...
- */
- ipa >>= 12;
- __tlbi(ipas2e1is, ipa);
-
- /*
- * We have to ensure completion of the invalidation at Stage-2,
- * since a table walk on another CPU could refill a TLB with a
- * complete (S1 + S2) walk based on the old Stage-2 mapping if
- * the Stage-1 invalidation happened first.
- */
- dsb(ish);
- __tlbi(vmalle1is);
- dsb(ish);
- isb();
-
- /*
- * If the host is running at EL1 and we have a VPIPT I-cache,
- * then we must perform I-cache maintenance at EL2 in order for
- * it to have an effect on the guest. Since the guest cannot hit
- * I-cache lines allocated with a different VMID, we don't need
- * to worry about junk out of guest reset (we nuke the I-cache on
- * VMID rollover), but we do need to be careful when remapping
- * executable pages for the same guest. This can happen when KSM
- * takes a CoW fault on an executable page, copies the page into
- * a page that was previously mapped in the guest and then needs
- * to invalidate the guest view of the I-cache for that page
- * from EL1. To solve this, we invalidate the entire I-cache when
- * unmapping a page from a guest if we have a VPIPT I-cache but
- * the host is running at EL1. As above, we could do better if
- * we had the VA.
- *
- * The moral of this story is: if you have a VPIPT I-cache, then
- * you should be running with VHE enabled.
- */
- if (!has_vhe() && icache_is_vpipt())
- __flush_icache_all();
-
- __tlb_switch_to_host(kvm, &cxt);
+ __tlb_flush_vmid_ipa(kvm, ipa);
}

-void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
+void __kvm_tlb_flush_vmid(struct kvm *kvm)
{
- struct tlb_inv_context cxt;
-
- dsb(ishst);
-
- /* Switch to requested VMID */
- kvm = kern_hyp_va(kvm);
- __tlb_switch_to_guest(kvm, &cxt);
-
- __tlbi(vmalls12e1is);
- dsb(ish);
- isb();
-
- __tlb_switch_to_host(kvm, &cxt);
+ __tlb_flush_vmid(kvm);
}

-void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
+void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
{
- struct kvm *kvm = kern_hyp_va(kern_hyp_va(vcpu)->kvm);
- struct tlb_inv_context cxt;
-
- /* Switch to requested VMID */
- __tlb_switch_to_guest(kvm, &cxt);
-
- __tlbi(vmalle1);
- dsb(nsh);
- isb();
-
- __tlb_switch_to_host(kvm, &cxt);
+ __tlb_flush_local_vmid(vcpu);
}

-void __hyp_text __kvm_flush_vm_context(void)
+void __kvm_flush_vm_context(void)
{
- dsb(ishst);
- __tlbi(alle1is);
-
- /*
- * VIPT and PIPT caches are not affected by VMID, so no maintenance
- * is necessary across a VMID rollover.
- *
- * VPIPT caches constrain lookup and maintenance to the active VMID,
- * so we need to invalidate lines with a stale VMID to avoid an ABA
- * race after multiple rollovers.
- *
- */
- if (icache_is_vpipt())
- asm volatile("ic ialluis");
-
- dsb(ish);
+ __tlb_flush_vm_context();
}
diff --git a/arch/arm64/kvm/hyp/tlb.h b/arch/arm64/kvm/hyp/tlb.h
new file mode 100644
index 000000000000..841ef400c8ec
--- /dev/null
+++ b/arch/arm64/kvm/hyp/tlb.h
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2015 - ARM Ltd
+ * Author: Marc Zyngier <[email protected]>
+ */
+
+#ifndef __ARM64_KVM_HYP_TLB_H__
+#define __ARM64_KVM_HYP_TLB_H__
+
+#include <linux/irqflags.h>
+
+#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
+#include <asm/tlbflush.h>
+
+struct tlb_inv_context {
+ unsigned long flags;
+ u64 tcr;
+ u64 sctlr;
+};
+
+static void __hyp_text __tlb_switch_to_guest(struct kvm *kvm,
+ struct tlb_inv_context *cxt);
+static void __hyp_text __tlb_switch_to_host(struct kvm *kvm,
+ struct tlb_inv_context *cxt);
+
+static inline void __hyp_text
+__tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
+{
+ struct tlb_inv_context cxt;
+
+ dsb(ishst);
+
+ /* Switch to requested VMID */
+ kvm = kern_hyp_va(kvm);
+ __tlb_switch_to_guest(kvm, &cxt);
+
+ /*
+ * We could do so much better if we had the VA as well.
+ * Instead, we invalidate Stage-2 for this IPA, and the
+ * whole of Stage-1. Weep...
+ */
+ ipa >>= 12;
+ __tlbi(ipas2e1is, ipa);
+
+ /*
+ * We have to ensure completion of the invalidation at Stage-2,
+ * since a table walk on another CPU could refill a TLB with a
+ * complete (S1 + S2) walk based on the old Stage-2 mapping if
+ * the Stage-1 invalidation happened first.
+ */
+ dsb(ish);
+ __tlbi(vmalle1is);
+ dsb(ish);
+ isb();
+
+ /*
+ * If the host is running at EL1 and we have a VPIPT I-cache,
+ * then we must perform I-cache maintenance at EL2 in order for
+ * it to have an effect on the guest. Since the guest cannot hit
+ * I-cache lines allocated with a different VMID, we don't need
+ * to worry about junk out of guest reset (we nuke the I-cache on
+ * VMID rollover), but we do need to be careful when remapping
+ * executable pages for the same guest. This can happen when KSM
+ * takes a CoW fault on an executable page, copies the page into
+ * a page that was previously mapped in the guest and then needs
+ * to invalidate the guest view of the I-cache for that page
+ * from EL1. To solve this, we invalidate the entire I-cache when
+ * unmapping a page from a guest if we have a VPIPT I-cache but
+ * the host is running at EL1. As above, we could do better if
+ * we had the VA.
+ *
+ * The moral of this story is: if you have a VPIPT I-cache, then
+ * you should be running with VHE enabled.
+ */
+ if (!has_vhe() && icache_is_vpipt())
+ __flush_icache_all();
+
+ __tlb_switch_to_host(kvm, &cxt);
+}
+
+static inline void __hyp_text __tlb_flush_vmid(struct kvm *kvm)
+{
+ struct tlb_inv_context cxt;
+
+ dsb(ishst);
+
+ /* Switch to requested VMID */
+ kvm = kern_hyp_va(kvm);
+ __tlb_switch_to_guest(kvm, &cxt);
+
+ __tlbi(vmalls12e1is);
+ dsb(ish);
+ isb();
+
+ __tlb_switch_to_host(kvm, &cxt);
+}
+
+static inline void __hyp_text __tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = kern_hyp_va(kern_hyp_va(vcpu)->kvm);
+ struct tlb_inv_context cxt;
+
+ /* Switch to requested VMID */
+ __tlb_switch_to_guest(kvm, &cxt);
+
+ __tlbi(vmalle1);
+ dsb(nsh);
+ isb();
+
+ __tlb_switch_to_host(kvm, &cxt);
+}
+
+static inline void __hyp_text __tlb_flush_vm_context(void)
+{
+ dsb(ishst);
+ __tlbi(alle1is);
+
+ /*
+ * VIPT and PIPT caches are not affected by VMID, so no maintenance
+ * is necessary across a VMID rollover.
+ *
+ * VPIPT caches constrain lookup and maintenance to the active VMID,
+ * so we need to invalidate lines with a stale VMID to avoid an ABA
+ * race after multiple rollovers.
+ *
+ */
+ if (icache_is_vpipt())
+ asm volatile("ic ialluis");
+
+ dsb(ish);
+}
+
+#endif /* __ARM64_KVM_HYP_TLB_H__ */
--
2.27.0

2020-06-18 21:18:08

by David Brazdil

[permalink] [raw]
Subject: [PATCH v3 15/15] arm64: kvm: Lift instrumentation restrictions on VHE

With VHE and nVHE executable code completely separated, remove the build config
that disabled GCOV/KASAN/UBSAN/KCOV instrumentation for VHE, as that code now
executes under the same memory mappings as the rest of the kernel.

No violations are currently being reported by either KASAN or UBSAN.

Signed-off-by: David Brazdil <[email protected]>
---
arch/arm64/kvm/hyp/Makefile | 8 --------
1 file changed, 8 deletions(-)

diff --git a/arch/arm64/kvm/hyp/Makefile b/arch/arm64/kvm/hyp/Makefile
index 5f4f217532e0..cd0c3936d266 100644
--- a/arch/arm64/kvm/hyp/Makefile
+++ b/arch/arm64/kvm/hyp/Makefile
@@ -11,11 +11,3 @@ obj-$(CONFIG_KVM_INDIRECT_VECTORS) += smccc_wa.o

hyp-y := vgic-v3-sr.o timer-sr.o aarch32.o vgic-v2-cpuif-proxy.o sysreg-sr.o \
debug-sr.o entry.o switch.o fpsimd.o tlb.o hyp-entry.o
-
-# KVM code is run at a different exception code with a different map, so
-# compiler instrumentation that inserts callbacks or checks into the code may
-# cause crashes. Just disable it.
-GCOV_PROFILE := n
-KASAN_SANITIZE := n
-UBSAN_SANITIZE := n
-KCOV_INSTRUMENT := n
--
2.27.0

2020-06-19 08:30:43

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH v3 07/15] arm64: kvm: Split hyp/tlb.c to VHE/nVHE

On 2020-06-18 13:25, David Brazdil wrote:
> This patch is part of a series which builds KVM's non-VHE hyp code
> separately from VHE and the rest of the kernel.
>
> tlb.c contains code for flushing the TLB, with parts shared between
> VHE/nVHE. These common routines are moved into a header file tlb.h,
> VHE-specific code remains in tlb.c and nVHE-specific code is moved to
> nvhe/tlb.c.
>
> The header file expects its users to implement two helper functions
> declared at the top of the file.
>
> Signed-off-by: David Brazdil <[email protected]>
> ---
> arch/arm64/kernel/image-vars.h | 8 +-
> arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
> arch/arm64/kvm/hyp/nvhe/tlb.c | 70 +++++++++++++
> arch/arm64/kvm/hyp/tlb.c | 171 +++----------------------------
> arch/arm64/kvm/hyp/tlb.h | 134 ++++++++++++++++++++++++
> 5 files changed, 222 insertions(+), 163 deletions(-)
> create mode 100644 arch/arm64/kvm/hyp/nvhe/tlb.c
> create mode 100644 arch/arm64/kvm/hyp/tlb.h
>
> diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
> index 4dc969ccda9e..e8a8aa6bc7bd 100644
> --- a/arch/arm64/kernel/image-vars.h
> +++ b/arch/arm64/kernel/image-vars.h
> @@ -63,13 +63,10 @@ __efistub__ctype = _ctype;
>
> __kvm_nvhe___guest_exit = __guest_exit;
> __kvm_nvhe___hyp_stub_vectors = __hyp_stub_vectors;
> +__kvm_nvhe___icache_flags = __icache_flags;

This is new, and definitely deserves a comment, as do many of the other
kernel symbols you added. You probably want to keep these symbols
separate from the KVM-specific symbols for ease of maintenance.
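
A possible shape for that, as a sketch only (the KVM_NVHE_ALIAS() helper and
the grouping below are illustrative, not something this series introduces):

	/* Alias a kernel-proper symbol so prefixed nVHE code can link against it. */
	#define KVM_NVHE_ALIAS(sym)	__kvm_nvhe_##sym = sym;

	/* Kernel-proper symbols used by nVHE hyp code (cpufeature, alternatives). */
	KVM_NVHE_ALIAS(arm64_const_caps_ready);
	KVM_NVHE_ALIAS(cpu_hwcap_keys);
	KVM_NVHE_ALIAS(cpu_hwcaps);
	KVM_NVHE_ALIAS(__icache_flags);

	/* KVM-specific symbols shared between VHE and nVHE. */
	KVM_NVHE_ALIAS(__guest_exit);
	KVM_NVHE_ALIAS(__hyp_stub_vectors);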

> __kvm_nvhe___kvm_enable_ssbs = __kvm_enable_ssbs;
> -__kvm_nvhe___kvm_flush_vm_context = __kvm_flush_vm_context;
> __kvm_nvhe___kvm_get_mdcr_el2 = __kvm_get_mdcr_el2;
> __kvm_nvhe___kvm_timer_set_cntvoff = __kvm_timer_set_cntvoff;
> -__kvm_nvhe___kvm_tlb_flush_local_vmid = __kvm_tlb_flush_local_vmid;
> -__kvm_nvhe___kvm_tlb_flush_vmid = __kvm_tlb_flush_vmid;
> -__kvm_nvhe___kvm_tlb_flush_vmid_ipa = __kvm_tlb_flush_vmid_ipa;
> __kvm_nvhe___kvm_vcpu_run_nvhe = __kvm_vcpu_run_nvhe;
> __kvm_nvhe___vgic_v3_get_ich_vtr_el2 = __vgic_v3_get_ich_vtr_el2;
> __kvm_nvhe___vgic_v3_init_lrs = __vgic_v3_init_lrs;
> @@ -79,8 +76,11 @@ __kvm_nvhe___vgic_v3_save_aprs = __vgic_v3_save_aprs;
> __kvm_nvhe___vgic_v3_write_vmcr = __vgic_v3_write_vmcr;
> __kvm_nvhe_abort_guest_exit_end = abort_guest_exit_end;
> __kvm_nvhe_abort_guest_exit_start = abort_guest_exit_start;
> +__kvm_nvhe_arm64_const_caps_ready = arm64_const_caps_ready;
> __kvm_nvhe_arm64_enable_wa2_handling = arm64_enable_wa2_handling;
> __kvm_nvhe_arm64_ssbd_callback_required = arm64_ssbd_callback_required;
> +__kvm_nvhe_cpu_hwcap_keys = cpu_hwcap_keys;
> +__kvm_nvhe_cpu_hwcaps = cpu_hwcaps;
> __kvm_nvhe_hyp_panic = hyp_panic;
> __kvm_nvhe_idmap_t0sz = idmap_t0sz;
> __kvm_nvhe_kimage_voffset = kimage_voffset;
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index fef6f1881765..3bfc51de1679 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -7,7 +7,7 @@ asflags-y := -D__KVM_NVHE_HYPERVISOR__
> ccflags-y := -D__KVM_NVHE_HYPERVISOR__ -fno-stack-protector \
> -DDISABLE_BRANCH_PROFILING $(DISABLE_STACKLEAK_PLUGIN)
>
> -obj-y := hyp-init.o ../hyp-entry.o
> +obj-y := tlb.o hyp-init.o ../hyp-entry.o
>
> obj-y := $(patsubst %.o,%.hyp.o,$(obj-y))
> extra-y := $(patsubst %.hyp.o,%.hyp.tmp.o,$(obj-y))
> diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
> new file mode 100644
> index 000000000000..111c4b0a23d3
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
> @@ -0,0 +1,70 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2015 - ARM Ltd
> + * Author: Marc Zyngier <[email protected]>
> + */
> +
> +#include <linux/irqflags.h>

Unnecessary on !VHE.

> +
> +#include <asm/kvm_hyp.h>
> +#include <asm/kvm_mmu.h>
> +#include <asm/tlbflush.h>

At least tlbflush.h should be dragged in directly by tlb.h, since that's
where the actual TLB ops are done.
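
In other words, with tlb.h pulling in the headers the TLB ops need, the
per-mode C file's include list could plausibly shrink to something like this
(a sketch, assuming no other direct dependencies remain):

	/* arch/arm64/kvm/hyp/nvhe/tlb.c: no <linux/irqflags.h>, no direct tlbflush.h */
	#include <asm/kvm_hyp.h>
	#include <asm/kvm_mmu.h>

	#include "../tlb.h"	/* already includes <asm/tlbflush.h> */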

> +
> +#include "../tlb.h"
> +
> +static void __hyp_text __tlb_switch_to_guest(struct kvm *kvm,
> + struct tlb_inv_context *cxt)
> +{
> + if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
> + u64 val;
> +
> + /*
> + * For CPUs that are affected by ARM 1319367, we need to
> + * avoid a host Stage-1 walk while we have the guest's
> + * VMID set in the VTTBR in order to invalidate TLBs.
> + * We're guaranteed that the S1 MMU is enabled, so we can
> + * simply set the EPD bits to avoid any further TLB fill.
> + */
> + val = cxt->tcr = read_sysreg_el1(SYS_TCR);
> + val |= TCR_EPD1_MASK | TCR_EPD0_MASK;
> + write_sysreg_el1(val, SYS_TCR);
> + isb();
> + }
> +
> + /* __load_guest_stage2() includes an ISB for the workaround. */
> + __load_guest_stage2(kvm);
> + asm(ALTERNATIVE("isb", "nop", ARM64_WORKAROUND_SPECULATIVE_AT));
> +}
> +
> +static void __hyp_text __tlb_switch_to_host(struct kvm *kvm,
> + struct tlb_inv_context *cxt)
> +{
> + write_sysreg(0, vttbr_el2);
> +
> + if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
> + /* Ensure write of the host VMID */
> + isb();
> + /* Restore the host's TCR_EL1 */
> + write_sysreg_el1(cxt->tcr, SYS_TCR);
> + }
> +}
> +
> +void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> +{
> + __tlb_flush_vmid_ipa(kvm, ipa);
> +}
> +
> +void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
> +{
> + __tlb_flush_vmid(kvm);
> +}
> +
> +void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
> +{
> + __tlb_flush_local_vmid(vcpu);
> +}
> +
> +void __hyp_text __kvm_flush_vm_context(void)
> +{
> + __tlb_flush_vm_context();
> +}

Overall, I find the result hard to reason about. Too many things happen
in the .h file, and reading the .c file feels puzzling (no apparent
users of the two static functions, for example).

The VHE/nVHE files only differ by the __tlb_switch_to_*() helpers
(and the __hyp_text annotation that ends up going away). Why can't this
be structured in a more conventional way, where the TLB code stays in a
C file, and per-mode .h files provide the two helpers? It would
certainly look much more readable.
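
One way to read that suggestion, as a rough sketch only (the file names, the
#ifdef-based selection and the stubbed helper bodies are illustrative, not
something proposed in this series):

	/* arch/arm64/kvm/hyp/nvhe/tlb_switch.h (illustrative name): nVHE helpers */
	static inline void __tlb_switch_to_guest(struct kvm *kvm,
						 struct tlb_inv_context *cxt)
	{
		/* nVHE-only body: TCR_EL1 EPD workaround + __load_guest_stage2() */
	}

	static inline void __tlb_switch_to_host(struct kvm *kvm,
						struct tlb_inv_context *cxt)
	{
		/* nVHE-only body: clear VTTBR_EL2, restore TCR_EL1 if needed */
	}

	/* arch/arm64/kvm/hyp/tlb.c: shared TLB code, compiled once per mode */
	#ifdef __KVM_NVHE_HYPERVISOR__
	#include "nvhe/tlb_switch.h"
	#else
	#include "vhe/tlb_switch.h"
	#endif

	void __kvm_tlb_flush_vmid(struct kvm *kvm)
	{
		struct tlb_inv_context cxt;

		dsb(ishst);

		/* Switch to requested VMID */
		kvm = kern_hyp_va(kvm);
		__tlb_switch_to_guest(kvm, &cxt);

		__tlbi(vmalls12e1is);
		dsb(ish);
		isb();

		__tlb_switch_to_host(kvm, &cxt);
	}

The shared flush routines would then read exactly as they did before the
split, and each mode only supplies the two switch helpers.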

> diff --git a/arch/arm64/kvm/hyp/tlb.c b/arch/arm64/kvm/hyp/tlb.c
> index d063a576d511..4e190f8c7e9c 100644
> --- a/arch/arm64/kvm/hyp/tlb.c
> +++ b/arch/arm64/kvm/hyp/tlb.c
> @@ -10,14 +10,10 @@
> #include <asm/kvm_mmu.h>
> #include <asm/tlbflush.h>
>
> -struct tlb_inv_context {
> - unsigned long flags;
> - u64 tcr;
> - u64 sctlr;
> -};
> +#include "tlb.h"
>
> -static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm,
> - struct tlb_inv_context *cxt)
> +static void __hyp_text __tlb_switch_to_guest(struct kvm *kvm,
> + struct tlb_inv_context *cxt)
> {
> u64 val;
>
> @@ -60,41 +56,8 @@ static void __hyp_text __tlb_switch_to_guest_vhe(struct kvm *kvm,
> isb();
> }
>
> -static void __hyp_text __tlb_switch_to_guest_nvhe(struct kvm *kvm,
> - struct tlb_inv_context *cxt)
> -{
> - if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
> - u64 val;
> -
> - /*
> - * For CPUs that are affected by ARM 1319367, we need to
> - * avoid a host Stage-1 walk while we have the guest's
> - * VMID set in the VTTBR in order to invalidate TLBs.
> - * We're guaranteed that the S1 MMU is enabled, so we can
> - * simply set the EPD bits to avoid any further TLB fill.
> - */
> - val = cxt->tcr = read_sysreg_el1(SYS_TCR);
> - val |= TCR_EPD1_MASK | TCR_EPD0_MASK;
> - write_sysreg_el1(val, SYS_TCR);
> - isb();
> - }
> -
> - /* __load_guest_stage2() includes an ISB for the workaround. */
> - __load_guest_stage2(kvm);
> - asm(ALTERNATIVE("isb", "nop", ARM64_WORKAROUND_SPECULATIVE_AT));
> -}
> -
> -static void __hyp_text __tlb_switch_to_guest(struct kvm *kvm,
> - struct tlb_inv_context *cxt)
> -{
> - if (has_vhe())
> - __tlb_switch_to_guest_vhe(kvm, cxt);
> - else
> - __tlb_switch_to_guest_nvhe(kvm, cxt);
> -}
> -
> -static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm,
> - struct tlb_inv_context *cxt)
> +static void __hyp_text __tlb_switch_to_host(struct kvm *kvm,
> + struct tlb_inv_context *cxt)
> {
> /*
> * We're done with the TLB operation, let's restore the host's
> @@ -113,130 +76,22 @@ static void __hyp_text __tlb_switch_to_host_vhe(struct kvm *kvm,
> local_irq_restore(cxt->flags);
> }
>
> -static void __hyp_text __tlb_switch_to_host_nvhe(struct kvm *kvm,
> - struct tlb_inv_context *cxt)
> -{
> - write_sysreg(0, vttbr_el2);
> -
> - if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
> - /* Ensure write of the host VMID */
> - isb();
> - /* Restore the host's TCR_EL1 */
> - write_sysreg_el1(cxt->tcr, SYS_TCR);
> - }
> -}
> -
> -static void __hyp_text __tlb_switch_to_host(struct kvm *kvm,
> - struct tlb_inv_context *cxt)
> -{
> - if (has_vhe())
> - __tlb_switch_to_host_vhe(kvm, cxt);
> - else
> - __tlb_switch_to_host_nvhe(kvm, cxt);
> -}
> -
> -void __hyp_text __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> +void __kvm_tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)
> {
> - struct tlb_inv_context cxt;
> -
> - dsb(ishst);
> -
> - /* Switch to requested VMID */
> - kvm = kern_hyp_va(kvm);
> - __tlb_switch_to_guest(kvm, &cxt);
> -
> - /*
> - * We could do so much better if we had the VA as well.
> - * Instead, we invalidate Stage-2 for this IPA, and the
> - * whole of Stage-1. Weep...
> - */
> - ipa >>= 12;
> - __tlbi(ipas2e1is, ipa);
> -
> - /*
> - * We have to ensure completion of the invalidation at Stage-2,
> - * since a table walk on another CPU could refill a TLB with a
> - * complete (S1 + S2) walk based on the old Stage-2 mapping if
> - * the Stage-1 invalidation happened first.
> - */
> - dsb(ish);
> - __tlbi(vmalle1is);
> - dsb(ish);
> - isb();
> -
> - /*
> - * If the host is running at EL1 and we have a VPIPT I-cache,
> - * then we must perform I-cache maintenance at EL2 in order for
> - * it to have an effect on the guest. Since the guest cannot hit
> - * I-cache lines allocated with a different VMID, we don't need
> - * to worry about junk out of guest reset (we nuke the I-cache on
> - * VMID rollover), but we do need to be careful when remapping
> - * executable pages for the same guest. This can happen when KSM
> - * takes a CoW fault on an executable page, copies the page into
> - * a page that was previously mapped in the guest and then needs
> - * to invalidate the guest view of the I-cache for that page
> - * from EL1. To solve this, we invalidate the entire I-cache when
> - * unmapping a page from a guest if we have a VPIPT I-cache but
> - * the host is running at EL1. As above, we could do better if
> - * we had the VA.
> - *
> - * The moral of this story is: if you have a VPIPT I-cache, then
> - * you should be running with VHE enabled.
> - */
> - if (!has_vhe() && icache_is_vpipt())
> - __flush_icache_all();
> -
> - __tlb_switch_to_host(kvm, &cxt);
> + __tlb_flush_vmid_ipa(kvm, ipa);
> }
>
> -void __hyp_text __kvm_tlb_flush_vmid(struct kvm *kvm)
> +void __kvm_tlb_flush_vmid(struct kvm *kvm)
> {
> - struct tlb_inv_context cxt;
> -
> - dsb(ishst);
> -
> - /* Switch to requested VMID */
> - kvm = kern_hyp_va(kvm);
> - __tlb_switch_to_guest(kvm, &cxt);
> -
> - __tlbi(vmalls12e1is);
> - dsb(ish);
> - isb();
> -
> - __tlb_switch_to_host(kvm, &cxt);
> + __tlb_flush_vmid(kvm);
> }
>
> -void __hyp_text __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
> +void __kvm_tlb_flush_local_vmid(struct kvm_vcpu *vcpu)
> {
> - struct kvm *kvm = kern_hyp_va(kern_hyp_va(vcpu)->kvm);
> - struct tlb_inv_context cxt;
> -
> - /* Switch to requested VMID */
> - __tlb_switch_to_guest(kvm, &cxt);
> -
> - __tlbi(vmalle1);
> - dsb(nsh);
> - isb();
> -
> - __tlb_switch_to_host(kvm, &cxt);
> + __tlb_flush_local_vmid(vcpu);
> }
>
> -void __hyp_text __kvm_flush_vm_context(void)
> +void __kvm_flush_vm_context(void)
> {
> - dsb(ishst);
> - __tlbi(alle1is);
> -
> - /*
> - * VIPT and PIPT caches are not affected by VMID, so no maintenance
> - * is necessary across a VMID rollover.
> - *
> - * VPIPT caches constrain lookup and maintenance to the active VMID,
> - * so we need to invalidate lines with a stale VMID to avoid an ABA
> - * race after multiple rollovers.
> - *
> - */
> - if (icache_is_vpipt())
> - asm volatile("ic ialluis");
> -
> - dsb(ish);
> + __tlb_flush_vm_context();
> }
> diff --git a/arch/arm64/kvm/hyp/tlb.h b/arch/arm64/kvm/hyp/tlb.h
> new file mode 100644
> index 000000000000..841ef400c8ec
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/tlb.h
> @@ -0,0 +1,134 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2015 - ARM Ltd
> + * Author: Marc Zyngier <[email protected]>
> + */
> +
> +#ifndef __ARM64_KVM_HYP_TLB_H__
> +#define __ARM64_KVM_HYP_TLB_H__
> +
> +#include <linux/irqflags.h>
> +
> +#include <asm/kvm_hyp.h>
> +#include <asm/kvm_mmu.h>
> +#include <asm/tlbflush.h>
> +
> +struct tlb_inv_context {
> + unsigned long flags;
> + u64 tcr;
> + u64 sctlr;
> +};
> +
> +static void __hyp_text __tlb_switch_to_guest(struct kvm *kvm,
> + struct tlb_inv_context *cxt);
> +static void __hyp_text __tlb_switch_to_host(struct kvm *kvm,
> + struct tlb_inv_context *cxt);
> +
> +static inline void __hyp_text
> +__tlb_flush_vmid_ipa(struct kvm *kvm, phys_addr_t ipa)

For things that you move around, please do not reformat the code. This
will lead to unnecessary conflicts. And long lines are just fine (screw
checkpatch!).

Thanks,

M.
--
Jazz is not dead. It just smells funny...