2020-09-03 09:18:51

by David Brazdil

Subject: [PATCH v2 00/10] Independent per-CPU data section for nVHE

Introduce '.hyp.data..percpu' as part of an ongoing effort to make nVHE
hyp code self-contained and independent of the rest of the kernel.

The series builds on top of the "Split off nVHE hyp code" series, which
used objcopy to rename '.text' to '.hyp.text' and to prefix all ELF
symbols with '__kvm_nvhe_' for all object files under kvm/hyp/nvhe.

The series is structured as follows:

- patch 1: Modify generic PERCPU_* linker script macros to make it
possible to define multiple per-CPU ELF sections with prefixed
section and symbol names.

- patch 2: Improve the existing hyp build rules. This could be sent and merged
independently of the per-CPU work, but this series builds on it.

- patches 3-4: Replace the hyp helpers for accessing per-CPU variables
with the common helpers, modified to work correctly in hyp. Per-CPU
variables can now be accessed through a single API anywhere.

- patches 5-7: Where VHE and nVHE use per-CPU variables defined in
kernel proper, move their definitions to hyp/ where they are
duplicated and owned by VHE/nVHE, respectively. Non-VHE hyp code
now refers only to per-CPU variables defined in its source files.
Helpers are added so that kernel proper can continue to access
nVHE hyp variables, the same way it does with other nVHE symbols
(see the sketch after this list).

- patches 8-10: Introduce the '.hyp.data..percpu' ELF section and allocate
memory for every CPU core during KVM init. All nVHE per-CPU state
is now grouped together in ELF and in memory. Introducing a new
per-CPU variable no longer requires adding new memory mappings.
nVHE hyp code cannot accidentally refer to kernel-proper
per-CPU data as it only has the pointer to its own per-CPU memory.
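
As a taste of the end state, kernel proper accesses hyp per-CPU state
through helpers like these (a condensed sketch; the exact definitions
are in patches 5 and 9):

  /* Resolves to the VHE or nVHE instance based on is_kernel_in_hyp_mode(). */
  struct kvm_host_data *ctx = this_cpu_ptr_hyp(kvm_host_data);

  /* nVHE copy, computed as kvm_arm_hyp_percpu_base[cpu] + offset. */
  u64 *ptr = per_cpu_ptr_nvhe(arm64_ssbd_callback_required, cpu);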

Patches are rebased on v5.9-rc3 and available in branch 'topic/percpu-v2' at:
https://android-kvm.googlesource.com/linux

Changes v1 -> v2:
* 5.9-rc3 base
* partially link hyp code, add linker script

David Brazdil (10):
Macros to override naming of percpu symbols and sections
kvm: arm64: Partially link nVHE hyp code, simplify HYPCOPY
kvm: arm64: Remove __hyp_this_cpu_read
kvm: arm64: Remove hyp_adr/ldr_this_cpu
kvm: arm64: Add helpers for accessing nVHE hyp per-cpu vars
kvm: arm64: Duplicate arm64_ssbd_callback_required for nVHE hyp
kvm: arm64: Create separate instances of kvm_host_data for VHE/nVHE
kvm: arm64: Mark hyp stack pages reserved
kvm: arm64: Set up hyp percpu data for nVHE
kvm: arm64: Remove unnecessary hyp mappings

arch/arm64/include/asm/assembler.h | 27 ++++--
arch/arm64/include/asm/kvm_asm.h | 74 ++++++++-------
arch/arm64/include/asm/kvm_host.h | 2 +-
arch/arm64/include/asm/kvm_mmu.h | 23 ++---
arch/arm64/include/asm/percpu.h | 33 ++++++-
arch/arm64/include/asm/sections.h | 1 +
arch/arm64/kernel/image-vars.h | 2 -
arch/arm64/kernel/vmlinux.lds.S | 10 ++
arch/arm64/kvm/arm.c | 110 ++++++++++++++++++----
arch/arm64/kvm/hyp/hyp-entry.S | 2 +-
arch/arm64/kvm/hyp/include/hyp/debug-sr.h | 4 +-
arch/arm64/kvm/hyp/include/hyp/switch.h | 8 +-
arch/arm64/kvm/hyp/nvhe/Makefile | 56 +++++------
arch/arm64/kvm/hyp/nvhe/hyp.lds.S | 19 ++++
arch/arm64/kvm/hyp/nvhe/switch.c | 8 +-
arch/arm64/kvm/hyp/vhe/switch.c | 5 +-
arch/arm64/kvm/hyp/vhe/sysreg-sr.c | 4 +-
arch/arm64/kvm/pmu.c | 13 ++-
include/asm-generic/vmlinux.lds.h | 40 +++++---
19 files changed, 304 insertions(+), 137 deletions(-)
create mode 100644 arch/arm64/kvm/hyp/nvhe/hyp.lds.S

--
2.28.0.402.g5ffc5be6b7-goog


2020-09-03 09:19:10

by David Brazdil

Subject: [PATCH v2 01/10] Macros to override naming of percpu symbols and sections

Modify the generic linker script macros to generate section/symbol names
for the percpu area using overridable macros. No functional changes.

This will allow the arm64 linker script to define a second, KVM-specific
percpu data section using the generic PERCPU_SECTION macro.
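
For illustration, an architecture can then emit a second, prefixed percpu
section by redefining the override macros before invoking PERCPU_SECTION
again. A sketch of what the arm64 linker script does later in this series
(patch 9; CONCAT3 is a small token-pasting helper added alongside):

  #undef PERCPU_SECTION_NAME
  #undef PERCPU_SYMBOL_NAME
  #define PERCPU_SECTION_NAME(suffix) CONCAT3(.hyp, PERCPU_SECTION_BASE_NAME, suffix)
  #define PERCPU_SYMBOL_NAME(name)    __kvm_nvhe_ ## name
  PERCPU_SECTION(L1_CACHE_BYTES)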

Signed-off-by: David Brazdil <[email protected]>
---
include/asm-generic/vmlinux.lds.h | 40 +++++++++++++++++++++----------
1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 5430febd34be..8f3f5c45e891 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -920,6 +920,20 @@
#define INIT_RAM_FS
#endif

+/*
+ * Macros to override the naming of percpu symbols and sections.
+ * Used by arm64 linker script to define a separate percpu area for KVM.
+ */
+#define PERCPU_SECTION_BASE_NAME .data..percpu
+
+#ifndef PERCPU_SECTION_NAME
+#define PERCPU_SECTION_NAME(suffix) PERCPU_SECTION_BASE_NAME ## suffix
+#endif
+
+#ifndef PERCPU_SYMBOL_NAME
+#define PERCPU_SYMBOL_NAME(name) name
+#endif
+
/*
* Memory encryption operates on a page basis. Since we need to clear
* the memory encryption mask for this section, it needs to be aligned
@@ -931,7 +945,7 @@
#ifdef CONFIG_AMD_MEM_ENCRYPT
#define PERCPU_DECRYPTED_SECTION \
. = ALIGN(PAGE_SIZE); \
- *(.data..percpu..decrypted) \
+ *(PERCPU_SECTION_NAME(..decrypted)) \
. = ALIGN(PAGE_SIZE);
#else
#define PERCPU_DECRYPTED_SECTION
@@ -975,17 +989,17 @@
* sharing between subsections for different purposes.
*/
#define PERCPU_INPUT(cacheline) \
- __per_cpu_start = .; \
- *(.data..percpu..first) \
+ PERCPU_SYMBOL_NAME(__per_cpu_start) = .; \
+ *(PERCPU_SECTION_NAME(..first)) \
. = ALIGN(PAGE_SIZE); \
- *(.data..percpu..page_aligned) \
+ *(PERCPU_SECTION_NAME(..page_aligned)) \
. = ALIGN(cacheline); \
- *(.data..percpu..read_mostly) \
+ *(PERCPU_SECTION_NAME(..read_mostly)) \
. = ALIGN(cacheline); \
- *(.data..percpu) \
- *(.data..percpu..shared_aligned) \
+ *(PERCPU_SECTION_NAME()) \
+ *(PERCPU_SECTION_NAME(..shared_aligned)) \
PERCPU_DECRYPTED_SECTION \
- __per_cpu_end = .;
+ PERCPU_SYMBOL_NAME(__per_cpu_end) = .;

/**
* PERCPU_VADDR - define output section for percpu area
@@ -1012,11 +1026,11 @@
* address, use PERCPU_SECTION.
*/
#define PERCPU_VADDR(cacheline, vaddr, phdr) \
- __per_cpu_load = .; \
- .data..percpu vaddr : AT(__per_cpu_load - LOAD_OFFSET) { \
+ PERCPU_SYMBOL_NAME(__per_cpu_load) = .; \
+ PERCPU_SECTION_NAME() vaddr : AT(PERCPU_SYMBOL_NAME(__per_cpu_load) - LOAD_OFFSET) { \
PERCPU_INPUT(cacheline) \
} phdr \
- . = __per_cpu_load + SIZEOF(.data..percpu);
+ . = PERCPU_SYMBOL_NAME(__per_cpu_load) + SIZEOF(PERCPU_SECTION_NAME());

/**
* PERCPU_SECTION - define output section for percpu area, simple version
@@ -1032,8 +1046,8 @@
*/
#define PERCPU_SECTION(cacheline) \
. = ALIGN(PAGE_SIZE); \
- .data..percpu : AT(ADDR(.data..percpu) - LOAD_OFFSET) { \
- __per_cpu_load = .; \
+ PERCPU_SECTION_NAME() : AT(ADDR(PERCPU_SECTION_NAME()) - LOAD_OFFSET) { \
+ PERCPU_SYMBOL_NAME(__per_cpu_load) = .; \
PERCPU_INPUT(cacheline) \
}

--
2.28.0.402.g5ffc5be6b7-goog

2020-09-03 09:19:19

by David Brazdil

Subject: [PATCH v2 03/10] kvm: arm64: Remove __hyp_this_cpu_read

this_cpu_ptr is meant for use in kernel proper because it selects between
TPIDR_EL1/2 based on nVHE/VHE. __hyp_this_cpu_ptr was used in hyp to always
select TPIDR_EL2. Unify all users behind this_cpu_ptr and friends by
selecting the _EL2 register under __KVM_NVHE_HYPERVISOR__.

Under CONFIG_DEBUG_PREEMPT, the kernel helpers perform a preemption check
which is omitted by the hyp helpers. Preserve the behavior for nVHE by
overriding the corresponding macros under __KVM_NVHE_HYPERVISOR__. Extend
the checks into VHE hyp code.
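
The resulting selection, in essence (a simplified sketch of the percpu.h
change below; the real helpers also include the fake stack read that
hazards against barrier()):

  #ifdef __KVM_NVHE_HYPERVISOR__
  #define __my_cpu_offset __hyp_my_cpu_offset()  /* always reads TPIDR_EL2 */
  #else
  #define __my_cpu_offset __kern_my_cpu_offset() /* TPIDR_EL1/EL2 via alternatives */
  #endif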

Signed-off-by: David Brazdil <[email protected]>
---
arch/arm64/include/asm/kvm_asm.h | 20 --------------
arch/arm64/include/asm/percpu.h | 33 +++++++++++++++++++++--
arch/arm64/kvm/hyp/include/hyp/debug-sr.h | 4 +--
arch/arm64/kvm/hyp/include/hyp/switch.h | 8 +++---
arch/arm64/kvm/hyp/nvhe/switch.c | 2 +-
arch/arm64/kvm/hyp/vhe/switch.c | 2 +-
arch/arm64/kvm/hyp/vhe/sysreg-sr.c | 4 +--
7 files changed, 41 insertions(+), 32 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 6f98fbd0ac81..9149079f0269 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -149,26 +149,6 @@ extern char __smccc_workaround_1_smc[__SMCCC_WORKAROUND_1_SMC_SZ];
addr; \
})

-/*
- * Home-grown __this_cpu_{ptr,read} variants that always work at HYP,
- * provided that sym is really a *symbol* and not a pointer obtained from
- * a data structure. As for SHIFT_PERCPU_PTR(), the creative casting keeps
- * sparse quiet.
- */
-#define __hyp_this_cpu_ptr(sym) \
- ({ \
- void *__ptr; \
- __verify_pcpu_ptr(&sym); \
- __ptr = hyp_symbol_addr(sym); \
- __ptr += read_sysreg(tpidr_el2); \
- (typeof(sym) __kernel __force *)__ptr; \
- })
-
-#define __hyp_this_cpu_read(sym) \
- ({ \
- *__hyp_this_cpu_ptr(sym); \
- })
-
#define __KVM_EXTABLE(from, to) \
" .pushsection __kvm_ex_table, \"a\"\n" \
" .align 3\n" \
diff --git a/arch/arm64/include/asm/percpu.h b/arch/arm64/include/asm/percpu.h
index 0b6409b89e5e..b4008331475b 100644
--- a/arch/arm64/include/asm/percpu.h
+++ b/arch/arm64/include/asm/percpu.h
@@ -19,7 +19,21 @@ static inline void set_my_cpu_offset(unsigned long off)
:: "r" (off) : "memory");
}

-static inline unsigned long __my_cpu_offset(void)
+static inline unsigned long __hyp_my_cpu_offset(void)
+{
+ unsigned long off;
+
+ /*
+ * We want to allow caching the value, so avoid using volatile and
+ * instead use a fake stack read to hazard against barrier().
+ */
+ asm("mrs %0, tpidr_el2" : "=r" (off) :
+ "Q" (*(const unsigned long *)current_stack_pointer));
+
+ return off;
+}
+
+static inline unsigned long __kern_my_cpu_offset(void)
{
unsigned long off;

@@ -35,7 +49,12 @@ static inline unsigned long __my_cpu_offset(void)

return off;
}
-#define __my_cpu_offset __my_cpu_offset()
+
+#ifdef __KVM_NVHE_HYPERVISOR__
+#define __my_cpu_offset __hyp_my_cpu_offset()
+#else
+#define __my_cpu_offset __kern_my_cpu_offset()
+#endif

#define PERCPU_RW_OPS(sz) \
static inline unsigned long __percpu_read_##sz(void *ptr) \
@@ -227,4 +246,14 @@ PERCPU_RET_OP(add, add, ldadd)

#include <asm-generic/percpu.h>

+/* Redefine macros for nVHE hyp under DEBUG_PREEMPT to avoid its dependencies. */
+#if defined(__KVM_NVHE_HYPERVISOR__) && defined(CONFIG_DEBUG_PREEMPT)
+#undef this_cpu_ptr
+#define this_cpu_ptr raw_cpu_ptr
+#undef __this_cpu_read
+#define __this_cpu_read raw_cpu_read
+#undef __this_cpu_write
+#define __this_cpu_write raw_cpu_write
+#endif
+
#endif /* __ASM_PERCPU_H */
diff --git a/arch/arm64/kvm/hyp/include/hyp/debug-sr.h b/arch/arm64/kvm/hyp/include/hyp/debug-sr.h
index 5e28ea6aa097..4ebe9f558f3a 100644
--- a/arch/arm64/kvm/hyp/include/hyp/debug-sr.h
+++ b/arch/arm64/kvm/hyp/include/hyp/debug-sr.h
@@ -135,7 +135,7 @@ static inline void __debug_switch_to_guest_common(struct kvm_vcpu *vcpu)
if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
return;

- host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
+ host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
guest_ctxt = &vcpu->arch.ctxt;
host_dbg = &vcpu->arch.host_debug_state.regs;
guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);
@@ -154,7 +154,7 @@ static inline void __debug_switch_to_host_common(struct kvm_vcpu *vcpu)
if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
return;

- host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
+ host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
guest_ctxt = &vcpu->arch.ctxt;
host_dbg = &vcpu->arch.host_debug_state.regs;
guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 5b6b8fa00f0a..f150407fa798 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -386,7 +386,7 @@ static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
!esr_is_ptrauth_trap(kvm_vcpu_get_esr(vcpu)))
return false;

- ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
+ ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
__ptrauth_save_key(ctxt, APIA);
__ptrauth_save_key(ctxt, APIB);
__ptrauth_save_key(ctxt, APDA);
@@ -495,7 +495,7 @@ static inline void __set_guest_arch_workaround_state(struct kvm_vcpu *vcpu)
* guest wants it disabled, so be it...
*/
if (__needs_ssbd_off(vcpu) &&
- __hyp_this_cpu_read(arm64_ssbd_callback_required))
+ __this_cpu_read(arm64_ssbd_callback_required))
arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, 0, NULL);
#endif
}
@@ -507,7 +507,7 @@ static inline void __set_host_arch_workaround_state(struct kvm_vcpu *vcpu)
* If the guest has disabled the workaround, bring it back on.
*/
if (__needs_ssbd_off(vcpu) &&
- __hyp_this_cpu_read(arm64_ssbd_callback_required))
+ __this_cpu_read(arm64_ssbd_callback_required))
arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, 1, NULL);
#endif
}
@@ -521,7 +521,7 @@ static inline void __kvm_unexpected_el2_exception(void)

entry = hyp_symbol_addr(__start___kvm_ex_table);
end = hyp_symbol_addr(__stop___kvm_ex_table);
- host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
+ host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;

while (entry < end) {
addr = (unsigned long)&entry->insn + entry->insn;
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 0970442d2dbc..cc4f8e790fb3 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -175,7 +175,7 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)

vcpu = kern_hyp_va(vcpu);

- host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
+ host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
host_ctxt->__hyp_running_vcpu = vcpu;
guest_ctxt = &vcpu->arch.ctxt;

diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index c1da4f86ccac..575e8054f116 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -108,7 +108,7 @@ static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
struct kvm_cpu_context *guest_ctxt;
u64 exit_code;

- host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
+ host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
host_ctxt->__hyp_running_vcpu = vcpu;
guest_ctxt = &vcpu->arch.ctxt;

diff --git a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
index 996471e4c138..2a0b8c88d74f 100644
--- a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
@@ -66,7 +66,7 @@ void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu)
struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt;
struct kvm_cpu_context *host_ctxt;

- host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
+ host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
__sysreg_save_user_state(host_ctxt);

/*
@@ -100,7 +100,7 @@ void kvm_vcpu_put_sysregs_vhe(struct kvm_vcpu *vcpu)
struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt;
struct kvm_cpu_context *host_ctxt;

- host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
+ host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
deactivate_traps_vhe_put();

__sysreg_save_el1_state(guest_ctxt);
--
2.28.0.402.g5ffc5be6b7-goog

2020-09-03 09:19:33

by David Brazdil

Subject: [PATCH v2 08/10] kvm: arm64: Mark hyp stack pages reserved

In preparation for unmapping hyp pages from host stage-2, allocate/free the
hyp stack using new helpers which automatically mark the pages reserved.
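
The intended usage mirrors __get_free_page()/free_page(), except the pages
stay reserved while hyp owns them (a sketch, based on the helpers below):

  unsigned long stack_page = alloc_hyp_page(GFP_KERNEL); /* marked reserved */
  if (!stack_page)
          return -ENOMEM;
  ...
  free_hyp_page(stack_page); /* drops the reserved flag and frees */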

Signed-off-by: David Brazdil <[email protected]>
---
arch/arm64/kvm/arm.c | 49 ++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 47 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index d437052c5481..8a1fcf4dffca 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1453,13 +1453,58 @@ static int init_subsystems(void)
return err;
}

+/*
+ * Alloc pages and mark them reserved so the kernel never tries to
+ * take them away from the hypervisor.
+ */
+static unsigned long alloc_hyp_pages(gfp_t flags, unsigned int order)
+{
+ struct page *page;
+ unsigned long i;
+
+ page = alloc_pages(flags, order);
+ if (!page)
+ return 0;
+
+ for (i = 0; i < (1ul << order); ++i)
+ mark_page_reserved(page + i);
+
+ return (unsigned long)page_address(page);
+}
+
+static unsigned long alloc_hyp_page(gfp_t flags)
+{
+ return alloc_hyp_pages(flags, 0);
+}
+
+/*
+ * Free pages which were previously marked reserved for the hypervisor.
+ */
+static void free_hyp_pages(unsigned long addr, unsigned int order)
+{
+ unsigned long i;
+ struct page *page;
+
+ if (!addr)
+ return;
+
+ page = virt_to_page(addr);
+ for (i = 0; i < (1ul << order); ++i)
+ free_reserved_page(page + i);
+}
+
+static void free_hyp_page(unsigned long addr)
+{
+ return free_hyp_pages(addr, 0);
+}
+
static void teardown_hyp_mode(void)
{
int cpu;

free_hyp_pgds();
for_each_possible_cpu(cpu)
- free_page(per_cpu(kvm_arm_hyp_stack_page, cpu));
+ free_hyp_page(per_cpu(kvm_arm_hyp_stack_page, cpu));
}

/**
@@ -1483,7 +1528,7 @@ static int init_hyp_mode(void)
for_each_possible_cpu(cpu) {
unsigned long stack_page;

- stack_page = __get_free_page(GFP_KERNEL);
+ stack_page = alloc_hyp_page(GFP_KERNEL);
if (!stack_page) {
err = -ENOMEM;
goto out_err;
--
2.28.0.402.g5ffc5be6b7-goog

2020-09-03 09:19:33

by David Brazdil

Subject: [PATCH v2 04/10] kvm: arm64: Remove hyp_adr/ldr_this_cpu

The hyp_adr/ldr_this_cpu helpers were introduced for use in hyp code
because they always needed to use TPIDR_EL2 for base, while
adr/ldr_this_cpu from kernel proper would select between TPIDR_EL2 and
_EL1 based on VHE/nVHE.

Simplify this now that the nVHE hyp mode case can be handled using the
__KVM_NVHE_HYPERVISOR__ macro. VHE selects _EL2 with alternatives.

Signed-off-by: David Brazdil <[email protected]>
---
arch/arm64/include/asm/assembler.h | 27 +++++++++++++++++----------
arch/arm64/include/asm/kvm_asm.h | 14 +-------------
arch/arm64/kvm/hyp/hyp-entry.S | 2 +-
3 files changed, 19 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 54d181177656..b392a977efb6 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -218,6 +218,21 @@ lr .req x30 // link register
str \src, [\tmp, :lo12:\sym]
.endm

+ /*
+ * @dst: destination register (32 or 64 bit wide)
+ */
+ .macro this_cpu_offset, dst
+#ifdef __KVM_NVHE_HYPERVISOR__
+ mrs \dst, tpidr_el2
+#else
+alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
+ mrs \dst, tpidr_el1
+alternative_else
+ mrs \dst, tpidr_el2
+alternative_endif
+#endif
+ .endm
+
/*
* @dst: Result of per_cpu(sym, smp_processor_id()) (can be SP)
* @sym: The name of the per-cpu variable
@@ -226,11 +241,7 @@ lr .req x30 // link register
.macro adr_this_cpu, dst, sym, tmp
adrp \tmp, \sym
add \dst, \tmp, #:lo12:\sym
-alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
- mrs \tmp, tpidr_el1
-alternative_else
- mrs \tmp, tpidr_el2
-alternative_endif
+ this_cpu_offset \tmp
add \dst, \dst, \tmp
.endm

@@ -241,11 +252,7 @@ alternative_endif
*/
.macro ldr_this_cpu dst, sym, tmp
adr_l \dst, \sym
-alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
- mrs \tmp, tpidr_el1
-alternative_else
- mrs \tmp, tpidr_el2
-alternative_endif
+ this_cpu_offset \tmp
ldr \dst, [\dst, \tmp]
.endm

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 9149079f0269..469c0662f7f3 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -179,20 +179,8 @@ extern char __smccc_workaround_1_smc[__SMCCC_WORKAROUND_1_SMC_SZ];

#else /* __ASSEMBLY__ */

-.macro hyp_adr_this_cpu reg, sym, tmp
- adr_l \reg, \sym
- mrs \tmp, tpidr_el2
- add \reg, \reg, \tmp
-.endm
-
-.macro hyp_ldr_this_cpu reg, sym, tmp
- adr_l \reg, \sym
- mrs \tmp, tpidr_el2
- ldr \reg, [\reg, \tmp]
-.endm
-
.macro get_host_ctxt reg, tmp
- hyp_adr_this_cpu \reg, kvm_host_data, \tmp
+ adr_this_cpu \reg, kvm_host_data, \tmp
add \reg, \reg, #HOST_DATA_CONTEXT
.endm

diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
index 46b4dab933d0..fba91c2ab410 100644
--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -132,7 +132,7 @@ alternative_cb_end
str x0, [x2, #VCPU_WORKAROUND_FLAGS]

/* Check that we actually need to perform the call */
- hyp_ldr_this_cpu x0, arm64_ssbd_callback_required, x2
+ ldr_this_cpu x0, arm64_ssbd_callback_required, x2
cbz x0, wa2_end

mov w0, #ARM_SMCCC_ARCH_WORKAROUND_2
--
2.28.0.402.g5ffc5be6b7-goog

2020-09-03 09:19:56

by David Brazdil

Subject: [PATCH v2 09/10] kvm: arm64: Set up hyp percpu data for nVHE

Add a hyp percpu section to the linker script and rename the corresponding
ELF sections of hyp/nvhe object files. This moves all nVHE-specific percpu
variables to the new hyp percpu section.

Allocate a sufficient amount of memory for all percpu hyp regions at global
KVM init time and create the corresponding hyp mappings.

The base addresses of hyp percpu regions are kept in a dynamically allocated
array in the kernel.

Add NULL checks in PMU event-reset code as it may run before KVM memory is
initialized.
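
A pointer to a hyp per-CPU variable is then computed as the CPU's region
base plus the variable's offset within the hyp percpu section;
conceptually (a condensed sketch of the per_cpu_ptr_nvhe() macro below):

  off = (unsigned long)&kvm_nvhe_sym(sym) -
        (unsigned long)&kvm_nvhe_sym(__per_cpu_start);
  ptr = kvm_arm_hyp_percpu_base ?
        (void *)(kvm_arm_hyp_percpu_base[cpu] + off) : NULL;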

Signed-off-by: David Brazdil <[email protected]>
---
arch/arm64/include/asm/kvm_asm.h | 19 +++++++++--
arch/arm64/include/asm/sections.h | 1 +
arch/arm64/kernel/vmlinux.lds.S | 10 ++++++
arch/arm64/kvm/arm.c | 56 +++++++++++++++++++++++++++++--
arch/arm64/kvm/hyp/nvhe/hyp.lds.S | 5 +++
arch/arm64/kvm/pmu.c | 5 ++-
6 files changed, 90 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 2b89817cdb01..c87111c25d9e 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -72,8 +72,23 @@
#define CHOOSE_VHE_SYM(sym) sym
#define CHOOSE_NVHE_SYM(sym) kvm_nvhe_sym(sym)

-#define this_cpu_ptr_nvhe(sym) this_cpu_ptr(&kvm_nvhe_sym(sym))
-#define per_cpu_ptr_nvhe(sym, cpu) per_cpu_ptr(&kvm_nvhe_sym(sym), cpu)
+/* Array of percpu base addresses. Length of the array is nr_cpu_ids. */
+extern unsigned long *kvm_arm_hyp_percpu_base;
+
+/*
+ * Compute pointer to a symbol defined in nVHE percpu region.
+ * Returns NULL if percpu memory has not been allocated yet.
+ */
+#define this_cpu_ptr_nvhe(sym) per_cpu_ptr_nvhe(sym, smp_processor_id())
+#define per_cpu_ptr_nvhe(sym, cpu) \
+ ({ \
+ unsigned long base, off; \
+ base = kvm_arm_hyp_percpu_base \
+ ? kvm_arm_hyp_percpu_base[cpu] : 0; \
+ off = (unsigned long)&kvm_nvhe_sym(sym) - \
+ (unsigned long)&kvm_nvhe_sym(__per_cpu_start); \
+ base ? (typeof(kvm_nvhe_sym(sym))*)(base + off) : NULL; \
+ })

#ifndef __KVM_NVHE_HYPERVISOR__
/*
diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h
index 3994169985ef..5062553a6847 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -18,5 +18,6 @@ extern char __exittext_begin[], __exittext_end[];
extern char __irqentry_text_start[], __irqentry_text_end[];
extern char __mmuoff_data_start[], __mmuoff_data_end[];
extern char __entry_tramp_text_start[], __entry_tramp_text_end[];
+extern char __kvm_nvhe___per_cpu_start[], __kvm_nvhe___per_cpu_end[];

#endif /* __ASM_SECTIONS_H */
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7cba7623fcec..5904a4de9f40 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -15,6 +15,9 @@

#include "image.h"

+#define __CONCAT3(x, y, z) x ## y ## z
+#define CONCAT3(x, y, z) __CONCAT3(x, y, z)
+
OUTPUT_ARCH(aarch64)
ENTRY(_text)

@@ -191,6 +194,13 @@ SECTIONS

PERCPU_SECTION(L1_CACHE_BYTES)

+ /* KVM nVHE per-cpu section */
+ #undef PERCPU_SECTION_NAME
+ #undef PERCPU_SYMBOL_NAME
+ #define PERCPU_SECTION_NAME(suffix) CONCAT3(.hyp, PERCPU_SECTION_BASE_NAME, suffix)
+ #define PERCPU_SYMBOL_NAME(name) __kvm_nvhe_ ## name
+ PERCPU_SECTION(L1_CACHE_BYTES)
+
.rela.dyn : ALIGN(8) {
*(.rela .rela*)
}
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 8a1fcf4dffca..df7d133056ce 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -47,6 +47,7 @@ __asm__(".arch_extension virt");
#endif

static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
+unsigned long *kvm_arm_hyp_percpu_base;

/* The VMID used in the VTTBR */
static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
@@ -1255,6 +1256,15 @@ long kvm_arch_vm_ioctl(struct file *filp,
}
}

+#define kvm_hyp_percpu_base(cpu) ((unsigned long)per_cpu_ptr_nvhe(__per_cpu_start, cpu))
+#define kvm_hyp_percpu_array_size (nr_cpu_ids * sizeof(*kvm_arm_hyp_percpu_base))
+#define kvm_hyp_percpu_array_order (get_order(kvm_hyp_percpu_array_size))
+#define kvm_hyp_percpu_start (CHOOSE_NVHE_SYM(__per_cpu_start))
+#define kvm_hyp_percpu_end (CHOOSE_NVHE_SYM(__per_cpu_end))
+#define kvm_hyp_percpu_size ((unsigned long)kvm_hyp_percpu_end - \
+ (unsigned long)kvm_hyp_percpu_start)
+#define kvm_hyp_percpu_order (kvm_hyp_percpu_size ? get_order(kvm_hyp_percpu_size) : 0)
+
static void cpu_init_hyp_mode(void)
{
phys_addr_t pgd_ptr;
@@ -1270,8 +1280,8 @@ static void cpu_init_hyp_mode(void)
* kernel's mapping to the linear mapping, and store it in tpidr_el2
* so that we can use adr_l to access per-cpu variables in EL2.
*/
- tpidr_el2 = ((unsigned long)this_cpu_ptr(&kvm_host_data) -
- (unsigned long)kvm_ksym_ref(&kvm_host_data));
+ tpidr_el2 = (unsigned long)this_cpu_ptr_nvhe(__per_cpu_start) -
+ (unsigned long)kvm_ksym_ref(kvm_hyp_percpu_start);

pgd_ptr = kvm_mmu_get_httbr();
hyp_stack_ptr = __this_cpu_read(kvm_arm_hyp_stack_page) + PAGE_SIZE;
@@ -1503,8 +1513,11 @@ static void teardown_hyp_mode(void)
int cpu;

free_hyp_pgds();
- for_each_possible_cpu(cpu)
+ for_each_possible_cpu(cpu) {
free_hyp_page(per_cpu(kvm_arm_hyp_stack_page, cpu));
+ free_hyp_pages(kvm_hyp_percpu_base(cpu), kvm_hyp_percpu_order);
+ }
+ free_hyp_pages((unsigned long)kvm_arm_hyp_percpu_base, kvm_hyp_percpu_array_order);
}

/**
@@ -1537,6 +1550,28 @@ static int init_hyp_mode(void)
per_cpu(kvm_arm_hyp_stack_page, cpu) = stack_page;
}

+ /*
+ * Allocate and initialize pages for Hypervisor-mode percpu regions.
+ */
+ kvm_arm_hyp_percpu_base = (unsigned long *)alloc_hyp_pages(
+ GFP_KERNEL | __GFP_ZERO, kvm_hyp_percpu_array_order);
+ if (!kvm_arm_hyp_percpu_base) {
+ err = -ENOMEM;
+ goto out_err;
+ }
+ for_each_possible_cpu(cpu) {
+ unsigned long percpu_base;
+
+ percpu_base = alloc_hyp_pages(GFP_KERNEL, kvm_hyp_percpu_order);
+ if (!percpu_base) {
+ err = -ENOMEM;
+ goto out_err;
+ }
+
+ memcpy((void *)percpu_base, kvm_hyp_percpu_start, kvm_hyp_percpu_size);
+ kvm_arm_hyp_percpu_base[cpu] = percpu_base;
+ }
+
/*
* Map the Hyp-code called directly from the host
*/
@@ -1581,6 +1616,21 @@ static int init_hyp_mode(void)
}
}

+ /*
+ * Map Hyp percpu pages
+ */
+ for_each_possible_cpu(cpu) {
+ char *percpu_begin = (char *)kvm_arm_hyp_percpu_base[cpu];
+ char *percpu_end = percpu_begin + PAGE_ALIGN(kvm_hyp_percpu_size);
+
+ err = create_hyp_mappings(percpu_begin, percpu_end, PAGE_HYP);
+
+ if (err) {
+ kvm_err("Cannot map hyp percpu region\n");
+ goto out_err;
+ }
+ }
+
for_each_possible_cpu(cpu) {
kvm_host_data_t *cpu_data;

diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
index aaa0ce133a32..7d8c3fa004f4 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
@@ -11,4 +11,9 @@

SECTIONS {
HYP_SECTION(.text)
+ HYP_SECTION(.data..percpu)
+ HYP_SECTION(.data..percpu..first)
+ HYP_SECTION(.data..percpu..page_aligned)
+ HYP_SECTION(.data..percpu..read_mostly)
+ HYP_SECTION(.data..percpu..shared_aligned)
}
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index 6d80ffe1ebfc..cd653091f213 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -33,7 +33,7 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
{
struct kvm_host_data *ctx = this_cpu_ptr_hyp(kvm_host_data);

- if (!kvm_pmu_switch_needed(attr))
+ if (!ctx || !kvm_pmu_switch_needed(attr))
return;

if (!attr->exclude_host)
@@ -49,6 +49,9 @@ void kvm_clr_pmu_events(u32 clr)
{
struct kvm_host_data *ctx = this_cpu_ptr_hyp(kvm_host_data);

+ if (!ctx)
+ return;
+
ctx->pmu_events.events_host &= ~clr;
ctx->pmu_events.events_guest &= ~clr;
}
--
2.28.0.402.g5ffc5be6b7-goog

2020-09-03 09:20:44

by David Brazdil

Subject: [PATCH v2 06/10] kvm: arm64: Duplicate arm64_ssbd_callback_required for nVHE hyp

Hyp keeps track of which cores require the SSBD callback by accessing a
kernel-proper global variable. Create an nVHE symbol of the same name
and copy the value from kernel proper to nVHE at KVM init time.

This is done in preparation for separating the percpu memory owned by
kernel proper from that owned by nVHE.
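
The pattern, condensed (a sketch of the changes below; the nVHE definition
automatically receives the __kvm_nvhe_ symbol prefix):

  /* hyp/nvhe/switch.c: non-VHE copy of the kernel symbol */
  DEFINE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);

  /* KVM init: seed the hyp copy from the kernel-proper value */
  for_each_possible_cpu(cpu)
          *per_cpu_ptr_nvhe(arm64_ssbd_callback_required, cpu) =
                  per_cpu(arm64_ssbd_callback_required, cpu);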

Signed-off-by: David Brazdil <[email protected]>
---
arch/arm64/include/asm/kvm_mmu.h | 10 +++++++---
arch/arm64/kernel/image-vars.h | 1 -
arch/arm64/kvm/arm.c | 2 +-
arch/arm64/kvm/hyp/nvhe/switch.c | 3 +++
4 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 189839c3706a..9db93da35606 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -529,23 +529,27 @@ static inline int kvm_map_vectors(void)

#ifdef CONFIG_ARM64_SSBD
DECLARE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
+DECLARE_KVM_NVHE_PER_CPU(u64, arm64_ssbd_callback_required);

-static inline int hyp_map_aux_data(void)
+static inline int hyp_init_aux_data(void)
{
int cpu, err;

for_each_possible_cpu(cpu) {
u64 *ptr;

- ptr = per_cpu_ptr(&arm64_ssbd_callback_required, cpu);
+ ptr = per_cpu_ptr_nvhe(arm64_ssbd_callback_required, cpu);
err = create_hyp_mappings(ptr, ptr + 1, PAGE_HYP);
if (err)
return err;
+
+ /* Copy value from kernel to hyp. */
+ *ptr = per_cpu(arm64_ssbd_callback_required, cpu);
}
return 0;
}
#else
-static inline int hyp_map_aux_data(void)
+static inline int hyp_init_aux_data(void)
{
return 0;
}
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 8982b68289b7..5fc5eadfb3e6 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -69,7 +69,6 @@ KVM_NVHE_ALIAS(kvm_patch_vector_branch);
KVM_NVHE_ALIAS(kvm_update_va_mask);

/* Global kernel state accessed by nVHE hyp code. */
-KVM_NVHE_ALIAS(arm64_ssbd_callback_required);
KVM_NVHE_ALIAS(kvm_host_data);
KVM_NVHE_ALIAS(kvm_vgic_global_state);

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 46dc3d75cf13..1bb960812493 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1549,7 +1549,7 @@ static int init_hyp_mode(void)
}
}

- err = hyp_map_aux_data();
+ err = hyp_init_aux_data();
if (err)
kvm_err("Cannot map host auxiliary data: %d\n", err);

diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index cc4f8e790fb3..4662df6330d7 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -27,6 +27,9 @@
#include <asm/processor.h>
#include <asm/thread_info.h>

+/* Non-VHE copy of the kernel symbol. */
+DEFINE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
+
static void __activate_traps(struct kvm_vcpu *vcpu)
{
u64 val;
--
2.28.0.402.g5ffc5be6b7-goog

2020-09-03 09:21:28

by David Brazdil

Subject: [PATCH v2 02/10] kvm: arm64: Partially link nVHE hyp code, simplify HYPCOPY

The previous series introduced custom build rules for nVHE hyp code, using
objcopy to prefix ELF section and symbol names in order to separate nVHE
code into its own "namespace". This approach was limited by the
expressiveness of objcopy's command-line interface, e.g. its missing
support for wildcards.

Improve the build rules by partially linking all '.hyp.o' files and
prefixing their ELF section names using a linker script. Continue using
objcopy for prefixing ELF symbol names.

One immediate advantage of this approach is that all subsections
matching a pattern can be merged into a single prefixed section, e.g.
.text and .text.* can be linked into a single '.hyp.text'. This removes
the need for -fno-reorder-functions on GCC and will be useful in the
future too: LTO builds use .text subsections, compilers routinely
generate .rodata subsections, etc.

Partially linking all hyp code into a single object file also makes it
easier to analyze.
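
For example, HYP_SECTION(.text) in the new linker script expands to
roughly the following, merging .text and all of its subsections into a
single prefixed output section:

  .hyp.text : { *(.text .text.[0-9a-zA-Z_]*) }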

Signed-off-by: David Brazdil <[email protected]>
---
arch/arm64/kvm/hyp/nvhe/Makefile | 56 ++++++++++++++++---------------
arch/arm64/kvm/hyp/nvhe/hyp.lds.S | 14 ++++++++
2 files changed, 43 insertions(+), 27 deletions(-)
create mode 100644 arch/arm64/kvm/hyp/nvhe/hyp.lds.S

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index aef76487edc2..1b2fbb19f3e8 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -10,40 +10,42 @@ obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o
obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
../fpsimd.o ../hyp-entry.o

-obj-y := $(patsubst %.o,%.hyp.o,$(obj-y))
-extra-y := $(patsubst %.hyp.o,%.hyp.tmp.o,$(obj-y))
+##
+## Build rules for compiling nVHE hyp code
+## Output of this folder is `hyp.o`, a partially linked object file containing
+## all nVHE hyp code and data.
+##

-$(obj)/%.hyp.tmp.o: $(src)/%.c FORCE
+hyp-obj := $(patsubst %.o,%.hyp.o,$(obj-y))
+obj-y := hyp.o
+extra-y := $(hyp-obj) hyp.tmp.o hyp.lds
+
+# 1) Compile all source files to `.hyp.o` object files. The file extension
+# avoids file name clashes for files shared with VHE.
+$(obj)/%.hyp.o: $(src)/%.c FORCE
$(call if_changed_rule,cc_o_c)
-$(obj)/%.hyp.tmp.o: $(src)/%.S FORCE
+$(obj)/%.hyp.o: $(src)/%.S FORCE
$(call if_changed_rule,as_o_S)
-$(obj)/%.hyp.o: $(obj)/%.hyp.tmp.o FORCE
- $(call if_changed,hypcopy)

-# Disable reordering functions by GCC (enabled at -O2).
-# This pass puts functions into '.text.*' sections to aid the linker
-# in optimizing ELF layout. See HYPCOPY comment below for more info.
-ccflags-y += $(call cc-option,-fno-reorder-functions)
+# 2) Compile linker script.
+$(obj)/hyp.lds: $(src)/hyp.lds.S FORCE
+ $(call if_changed_dep,cpp_lds_S)
+
+# 3) Partially link all '.hyp.o' files and apply the linker script.
+# Prefixes names of ELF sections with '.hyp', eg. '.hyp.text'.
+LDFLAGS_hyp.tmp.o := -r -T $(obj)/hyp.lds
+$(obj)/hyp.tmp.o: $(addprefix $(obj)/,$(hyp-obj)) $(obj)/hyp.lds FORCE
+ $(call if_changed,ld)
+
+# 4) Produce the final 'hyp.o', ready to be linked into 'vmlinux'.
+# Prefixes names of ELF symbols with '__kvm_nvhe_'.
+$(obj)/hyp.o: $(obj)/hyp.tmp.o FORCE
+ $(call if_changed,hypcopy)

# The HYPCOPY command uses `objcopy` to prefix all ELF symbol names
-# and relevant ELF section names to avoid clashes with VHE code/data.
-#
-# Hyp code is assumed to be in the '.text' section of the input object
-# files (with the exception of specialized sections such as
-# '.hyp.idmap.text'). This assumption may be broken by a compiler that
-# divides code into sections like '.text.unlikely' so as to optimize
-# ELF layout. HYPCOPY checks that no such sections exist in the input
-# using `objdump`, otherwise they would be linked together with other
-# kernel code and not memory-mapped correctly at runtime.
+# to avoid clashes with VHE code/data.
quiet_cmd_hypcopy = HYPCOPY $@
- cmd_hypcopy = \
- if $(OBJDUMP) -h $< | grep -F '.text.'; then \
- echo "$@: function reordering not supported in nVHE hyp code" >&2; \
- /bin/false; \
- fi; \
- $(OBJCOPY) --prefix-symbols=__kvm_nvhe_ \
- --rename-section=.text=.hyp.text \
- $< $@
+ cmd_hypcopy = $(OBJCOPY) --prefix-symbols=__kvm_nvhe_ $< $@

# Remove ftrace and Shadow Call Stack CFLAGS.
# This is equivalent to the 'notrace' and '__noscs' annotations.
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
new file mode 100644
index 000000000000..aaa0ce133a32
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Linker script used during partial linking of nVHE EL2 object files.
+ * Written by David Brazdil <[email protected]>
+ */
+
+/*
+ * Defines an ELF hyp section from input section @NAME and its subsections.
+ */
+#define HYP_SECTION(NAME) .hyp##NAME : { *(NAME NAME##.[0-9a-zA-Z_]*) }
+
+SECTIONS {
+ HYP_SECTION(.text)
+}
--
2.28.0.402.g5ffc5be6b7-goog

2020-09-03 09:21:42

by David Brazdil

Subject: [PATCH v2 07/10] kvm: arm64: Create separate instances of kvm_host_data for VHE/nVHE

Host CPU context is stored in a global per-cpu variable `kvm_host_data`.
In preparation for introducing an independent per-CPU region for nVHE hyp,
create two separate instances of `kvm_host_data`, one for VHE and one
for nVHE.

Signed-off-by: David Brazdil <[email protected]>
---
arch/arm64/include/asm/kvm_host.h | 2 +-
arch/arm64/kernel/image-vars.h | 1 -
arch/arm64/kvm/arm.c | 5 ++---
arch/arm64/kvm/hyp/nvhe/switch.c | 3 +++
arch/arm64/kvm/hyp/vhe/switch.c | 3 +++
arch/arm64/kvm/pmu.c | 8 ++++----
6 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e52c927aade5..964e05777fe3 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -565,7 +565,7 @@ void kvm_set_sei_esr(struct kvm_vcpu *vcpu, u64 syndrome);

struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);

-DECLARE_PER_CPU(kvm_host_data_t, kvm_host_data);
+DECLARE_KVM_HYP_PER_CPU(kvm_host_data_t, kvm_host_data);

static inline void kvm_init_host_cpu_context(struct kvm_cpu_context *cpu_ctxt)
{
diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 5fc5eadfb3e6..21307e2db3fc 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -69,7 +69,6 @@ KVM_NVHE_ALIAS(kvm_patch_vector_branch);
KVM_NVHE_ALIAS(kvm_update_va_mask);

/* Global kernel state accessed by nVHE hyp code. */
-KVM_NVHE_ALIAS(kvm_host_data);
KVM_NVHE_ALIAS(kvm_vgic_global_state);

/* Kernel constant needed to compute idmap addresses. */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 1bb960812493..d437052c5481 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -46,7 +46,6 @@
__asm__(".arch_extension virt");
#endif

-DEFINE_PER_CPU(kvm_host_data_t, kvm_host_data);
static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);

/* The VMID used in the VTTBR */
@@ -1305,7 +1304,7 @@ static void cpu_hyp_reset(void)

static void cpu_hyp_reinit(void)
{
- kvm_init_host_cpu_context(&this_cpu_ptr(&kvm_host_data)->host_ctxt);
+ kvm_init_host_cpu_context(&this_cpu_ptr_hyp(kvm_host_data)->host_ctxt);

cpu_hyp_reset();

@@ -1540,7 +1539,7 @@ static int init_hyp_mode(void)
for_each_possible_cpu(cpu) {
kvm_host_data_t *cpu_data;

- cpu_data = per_cpu_ptr(&kvm_host_data, cpu);
+ cpu_data = per_cpu_ptr_hyp(kvm_host_data, cpu);
err = create_hyp_mappings(cpu_data, cpu_data + 1, PAGE_HYP);

if (err) {
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 4662df6330d7..a7e9b03bd9d1 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -30,6 +30,9 @@
/* Non-VHE copy of the kernel symbol. */
DEFINE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);

+/* Non-VHE instance of kvm_host_data. */
+DEFINE_PER_CPU(kvm_host_data_t, kvm_host_data);
+
static void __activate_traps(struct kvm_vcpu *vcpu)
{
u64 val;
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 575e8054f116..0949fc97bf03 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -28,6 +28,9 @@

const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n";

+/* VHE instance of kvm_host_data. */
+DEFINE_PER_CPU(kvm_host_data_t, kvm_host_data);
+
static void __activate_traps(struct kvm_vcpu *vcpu)
{
u64 val;
diff --git a/arch/arm64/kvm/pmu.c b/arch/arm64/kvm/pmu.c
index 3c224162b3dd..6d80ffe1ebfc 100644
--- a/arch/arm64/kvm/pmu.c
+++ b/arch/arm64/kvm/pmu.c
@@ -31,7 +31,7 @@ static bool kvm_pmu_switch_needed(struct perf_event_attr *attr)
*/
void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
{
- struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
+ struct kvm_host_data *ctx = this_cpu_ptr_hyp(kvm_host_data);

if (!kvm_pmu_switch_needed(attr))
return;
@@ -47,7 +47,7 @@ void kvm_set_pmu_events(u32 set, struct perf_event_attr *attr)
*/
void kvm_clr_pmu_events(u32 clr)
{
- struct kvm_host_data *ctx = this_cpu_ptr(&kvm_host_data);
+ struct kvm_host_data *ctx = this_cpu_ptr_hyp(kvm_host_data);

ctx->pmu_events.events_host &= ~clr;
ctx->pmu_events.events_guest &= ~clr;
@@ -173,7 +173,7 @@ void kvm_vcpu_pmu_restore_guest(struct kvm_vcpu *vcpu)
return;

preempt_disable();
- host = this_cpu_ptr(&kvm_host_data);
+ host = this_cpu_ptr_hyp(kvm_host_data);
events_guest = host->pmu_events.events_guest;
events_host = host->pmu_events.events_host;

@@ -193,7 +193,7 @@ void kvm_vcpu_pmu_restore_host(struct kvm_vcpu *vcpu)
if (!has_vhe())
return;

- host = this_cpu_ptr(&kvm_host_data);
+ host = this_cpu_ptr_hyp(kvm_host_data);
events_guest = host->pmu_events.events_guest;
events_host = host->pmu_events.events_host;

--
2.28.0.402.g5ffc5be6b7-goog

2020-09-03 09:22:16

by David Brazdil

Subject: [PATCH v2 05/10] kvm: arm64: Add helpers for accessing nVHE hyp per-cpu vars

Defining a per-CPU variable in hyp/nvhe will result in its name being
prefixed with __kvm_nvhe_. Add helpers for declaring these variables
in kernel proper and accessing them with this_cpu_ptr and per_cpu_ptr.
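
Usage then looks like this (a sketch; kvm_host_data is converted to these
helpers in a later patch of the series):

  /* Declare both the VHE and the __kvm_nvhe_-prefixed nVHE instance. */
  DECLARE_KVM_HYP_PER_CPU(kvm_host_data_t, kvm_host_data);

  /* Picks the VHE or nVHE copy based on is_kernel_in_hyp_mode(). */
  struct kvm_host_data *ctx = this_cpu_ptr_hyp(kvm_host_data);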

Signed-off-by: David Brazdil <[email protected]>
---
arch/arm64/include/asm/kvm_asm.h | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 469c0662f7f3..2b89817cdb01 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -60,9 +60,21 @@
DECLARE_KVM_VHE_SYM(sym); \
DECLARE_KVM_NVHE_SYM(sym)

+#define DECLARE_KVM_VHE_PER_CPU(type, sym) \
+ DECLARE_PER_CPU(type, sym)
+#define DECLARE_KVM_NVHE_PER_CPU(type, sym) \
+ DECLARE_PER_CPU(type, kvm_nvhe_sym(sym))
+
+#define DECLARE_KVM_HYP_PER_CPU(type, sym) \
+ DECLARE_KVM_VHE_PER_CPU(type, sym); \
+ DECLARE_KVM_NVHE_PER_CPU(type, sym)
+
#define CHOOSE_VHE_SYM(sym) sym
#define CHOOSE_NVHE_SYM(sym) kvm_nvhe_sym(sym)

+#define this_cpu_ptr_nvhe(sym) this_cpu_ptr(&kvm_nvhe_sym(sym))
+#define per_cpu_ptr_nvhe(sym, cpu) per_cpu_ptr(&kvm_nvhe_sym(sym), cpu)
+
#ifndef __KVM_NVHE_HYPERVISOR__
/*
* BIG FAT WARNINGS:
@@ -75,12 +87,21 @@
* - Don't let the nVHE hypervisor have access to this, as it will
* pick the *wrong* symbol (yes, it runs at EL2...).
*/
-#define CHOOSE_HYP_SYM(sym) (is_kernel_in_hyp_mode() ? CHOOSE_VHE_SYM(sym) \
+#define CHOOSE_HYP_SYM(sym) (is_kernel_in_hyp_mode() \
+ ? CHOOSE_VHE_SYM(sym) \
: CHOOSE_NVHE_SYM(sym))
+#define this_cpu_ptr_hyp(sym) (is_kernel_in_hyp_mode() \
+ ? this_cpu_ptr(&sym) \
+ : this_cpu_ptr_nvhe(sym))
+#define per_cpu_ptr_hyp(sym, cpu) (is_kernel_in_hyp_mode() \
+ ? per_cpu_ptr(&sym, cpu) \
+ : per_cpu_ptr_nvhe(sym, cpu))
#else
/* The nVHE hypervisor shouldn't even try to access anything */
extern void *__nvhe_undefined_symbol;
-#define CHOOSE_HYP_SYM(sym) __nvhe_undefined_symbol
+#define CHOOSE_HYP_SYM(sym) __nvhe_undefined_symbol
+#define this_cpu_ptr_hyp(sym) (&__nvhe_undefined_symbol)
+#define per_cpu_ptr_hyp(sym, cpu) (&__nvhe_undefined_symbol)
#endif

/* Translate a kernel address @ptr into its equivalent linear mapping */
--
2.28.0.402.g5ffc5be6b7-goog

2020-09-03 09:22:32

by David Brazdil

Subject: [PATCH v2 10/10] kvm: arm64: Remove unnecessary hyp mappings

With all nVHE per-CPU variables being part of the hyp per-CPU region,
mapping them individually is no longer necessary. They are mapped to hyp
as part of the overall per-CPU region.

Signed-off-by: David Brazdil <[email protected]>
---
arch/arm64/include/asm/kvm_mmu.h | 25 +++++++------------------
arch/arm64/kvm/arm.c | 17 +----------------
2 files changed, 8 insertions(+), 34 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 9db93da35606..bbe9df76ff42 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -531,28 +531,17 @@ static inline int kvm_map_vectors(void)
DECLARE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
DECLARE_KVM_NVHE_PER_CPU(u64, arm64_ssbd_callback_required);

-static inline int hyp_init_aux_data(void)
+static inline void hyp_init_aux_data(void)
{
- int cpu, err;
+ int cpu;

- for_each_possible_cpu(cpu) {
- u64 *ptr;
-
- ptr = per_cpu_ptr_nvhe(arm64_ssbd_callback_required, cpu);
- err = create_hyp_mappings(ptr, ptr + 1, PAGE_HYP);
- if (err)
- return err;
-
- /* Copy value from kernel to hyp. */
- *ptr = per_cpu(arm64_ssbd_callback_required, cpu);
- }
- return 0;
+ /* Copy arm64_ssbd_callback_required values from kernel to hyp. */
+ for_each_possible_cpu(cpu)
+ *(per_cpu_ptr_nvhe(arm64_ssbd_callback_required, cpu)) =
+ per_cpu(arm64_ssbd_callback_required, cpu);
}
#else
-static inline int hyp_init_aux_data(void)
-{
- return 0;
-}
+static inline void hyp_init_aux_data(void) {}
#endif

#define kvm_phys_to_vttbr(addr) phys_to_ttbr(addr)
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index df7d133056ce..dfe1baa5bbb7 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1631,22 +1631,7 @@ static int init_hyp_mode(void)
}
}

- for_each_possible_cpu(cpu) {
- kvm_host_data_t *cpu_data;
-
- cpu_data = per_cpu_ptr_hyp(kvm_host_data, cpu);
- err = create_hyp_mappings(cpu_data, cpu_data + 1, PAGE_HYP);
-
- if (err) {
- kvm_err("Cannot map host CPU state: %d\n", err);
- goto out_err;
- }
- }
-
- err = hyp_init_aux_data();
- if (err)
- kvm_err("Cannot map host auxiliary data: %d\n", err);
-
+ hyp_init_aux_data();
return 0;

out_err:
--
2.28.0.402.g5ffc5be6b7-goog

2020-09-10 10:01:11

by Andrew Scull

Subject: Re: [PATCH v2 02/10] kvm: arm64: Partially link nVHE hyp code, simplify HYPCOPY

On Thu, Sep 03, 2020 at 11:17:04AM +0200, 'David Brazdil' via kernel-team wrote:
> Previous series introduced custom build rules for nVHE hyp code, using
> objcopy to prefix ELF section and symbol names to separate nVHE code
> into its own "namespace". This approach was limited by the expressiveness
> of objcopy's command line interface, eg. missing support for wildcards.
>
> Improve the build rules by partially linking all '.hyp.o' files and
> prefixing their ELF section names using a linker script. Continue using
> objcopy for prefixing ELF symbol names.
>
> One immediate advantage of this approach is that all subsections
> matching a pattern can be merged into a single prefixed section, eg.
> .text and .text.* can be linked into a single '.hyp.text'. This removes
> the need for -fno-reorder-functions on GCC and will be useful in the
> future too: LTO builds use .text subsections, compilers routinely
> generate .rodata subsections, etc.

This certainly feels like a more robust and controlled approach to the
sections now that we have an explicit list of those that are allowed.

> Partially linking all hyp code into a single object file also makes it
> easier to analyze.
>
> Signed-off-by: David Brazdil <[email protected]>
> ---
> arch/arm64/kvm/hyp/nvhe/Makefile | 56 ++++++++++++++++---------------
> arch/arm64/kvm/hyp/nvhe/hyp.lds.S | 14 ++++++++
> 2 files changed, 43 insertions(+), 27 deletions(-)
> create mode 100644 arch/arm64/kvm/hyp/nvhe/hyp.lds.S
>
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index aef76487edc2..1b2fbb19f3e8 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -10,40 +10,42 @@ obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o
> obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
> ../fpsimd.o ../hyp-entry.o
>
> -obj-y := $(patsubst %.o,%.hyp.o,$(obj-y))
> -extra-y := $(patsubst %.hyp.o,%.hyp.tmp.o,$(obj-y))
> +##
> +## Build rules for compiling nVHE hyp code
> +## Output of this folder is `hyp.o`, a partially linked object file containing
> +## all nVHE hyp code and data.
> +##
>
> -$(obj)/%.hyp.tmp.o: $(src)/%.c FORCE
> +hyp-obj := $(patsubst %.o,%.hyp.o,$(obj-y))
> +obj-y := hyp.o
> +extra-y := $(hyp-obj) hyp.tmp.o hyp.lds
> +
> +# 1) Compile all source files to `.hyp.o` object files. The file extension
> +# avoids file name clashes for files shared with VHE.

Very much a nit, but possibly .nvhe.o or .kvm_nvhe.o would make the
intended distinction more obvious and line up with the prefix being
applied to the symbols.

> +$(obj)/%.hyp.o: $(src)/%.c FORCE
> $(call if_changed_rule,cc_o_c)
> -$(obj)/%.hyp.tmp.o: $(src)/%.S FORCE
> +$(obj)/%.hyp.o: $(src)/%.S FORCE
> $(call if_changed_rule,as_o_S)
> -$(obj)/%.hyp.o: $(obj)/%.hyp.tmp.o FORCE
> - $(call if_changed,hypcopy)
>
> -# Disable reordering functions by GCC (enabled at -O2).
> -# This pass puts functions into '.text.*' sections to aid the linker
> -# in optimizing ELF layout. See HYPCOPY comment below for more info.
> -ccflags-y += $(call cc-option,-fno-reorder-functions)
> +# 2) Compile linker script.
> +$(obj)/hyp.lds: $(src)/hyp.lds.S FORCE
> + $(call if_changed_dep,cpp_lds_S)
> +
> +# 3) Partially link all '.hyp.o' files and apply the linker script.
> +# Prefixes names of ELF sections with '.hyp', eg. '.hyp.text'.
> +LDFLAGS_hyp.tmp.o := -r -T $(obj)/hyp.lds
> +$(obj)/hyp.tmp.o: $(addprefix $(obj)/,$(hyp-obj)) $(obj)/hyp.lds FORCE
> + $(call if_changed,ld)
> +
> +# 4) Produce the final 'hyp.o', ready to be linked into 'vmlinux'.
> +# Prefixes names of ELF symbols with '__kvm_nvhe_'.
> +$(obj)/hyp.o: $(obj)/hyp.tmp.o FORCE
> + $(call if_changed,hypcopy)
>
> # The HYPCOPY command uses `objcopy` to prefix all ELF symbol names
> -# and relevant ELF section names to avoid clashes with VHE code/data.
> -#
> -# Hyp code is assumed to be in the '.text' section of the input object
> -# files (with the exception of specialized sections such as
> -# '.hyp.idmap.text'). This assumption may be broken by a compiler that
> -# divides code into sections like '.text.unlikely' so as to optimize
> -# ELF layout. HYPCOPY checks that no such sections exist in the input
> -# using `objdump`, otherwise they would be linked together with other
> -# kernel code and not memory-mapped correctly at runtime.
> +# to avoid clashes with VHE code/data.
> quiet_cmd_hypcopy = HYPCOPY $@
> - cmd_hypcopy = \
> - if $(OBJDUMP) -h $< | grep -F '.text.'; then \
> - echo "$@: function reordering not supported in nVHE hyp code" >&2; \
> - /bin/false; \
> - fi; \
> - $(OBJCOPY) --prefix-symbols=__kvm_nvhe_ \
> - --rename-section=.text=.hyp.text \
> - $< $@
> + cmd_hypcopy = $(OBJCOPY) --prefix-symbols=__kvm_nvhe_ $< $@
>
> # Remove ftrace and Shadow Call Stack CFLAGS.
> # This is equivalent to the 'notrace' and '__noscs' annotations.
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
> new file mode 100644
> index 000000000000..aaa0ce133a32
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Linker script used during partial linking of nVHE EL2 object files.
> + * Written by David Brazdil <[email protected]>
> + */

Should this file have the standard copyright line?

> +
> +/*
> + * Defines an ELF hyp section from input section @NAME and its subsections.
> + */
> +#define HYP_SECTION(NAME) .hyp##NAME : { *(NAME NAME##.[0-9a-zA-Z_]*) }
> +
> +SECTIONS {
> + HYP_SECTION(.text)
> +}
> --
> 2.28.0.402.g5ffc5be6b7-goog
>

2020-09-10 11:35:04

by Andrew Scull

Subject: Re: [PATCH v2 03/10] kvm: arm64: Remove __hyp_this_cpu_read

On Thu, Sep 03, 2020 at 11:17:05AM +0200, 'David Brazdil' via kernel-team wrote:
> this_cpu_ptr is meant for use in kernel proper because it selects between
> TPIDR_EL1/2 based on nVHE/VHE. __hyp_this_cpu_ptr was used in hyp to always
> select TPIDR_EL2. Unify all users behind this_cpu_ptr and friends by
> selecting _EL2 register under __KVM_NVHE_HYPERVISOR__.
>
> Under CONFIG_DEBUG_PREEMPT, the kernel helpers perform a preemption check
> which is omitted by the hyp helpers. Preserve the behavior for nVHE by
> overriding the corresponding macros under __KVM_NVHE_HYPERVISOR__. Extend
> the checks into VHE hyp code.
>
> Signed-off-by: David Brazdil <[email protected]>
> ---
> arch/arm64/include/asm/kvm_asm.h | 20 --------------
> arch/arm64/include/asm/percpu.h | 33 +++++++++++++++++++++--
> arch/arm64/kvm/hyp/include/hyp/debug-sr.h | 4 +--
> arch/arm64/kvm/hyp/include/hyp/switch.h | 8 +++---
> arch/arm64/kvm/hyp/nvhe/switch.c | 2 +-
> arch/arm64/kvm/hyp/vhe/switch.c | 2 +-
> arch/arm64/kvm/hyp/vhe/sysreg-sr.c | 4 +--
> 7 files changed, 41 insertions(+), 32 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 6f98fbd0ac81..9149079f0269 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -149,26 +149,6 @@ extern char __smccc_workaround_1_smc[__SMCCC_WORKAROUND_1_SMC_SZ];
> addr; \
> })
>
> -/*
> - * Home-grown __this_cpu_{ptr,read} variants that always work at HYP,
> - * provided that sym is really a *symbol* and not a pointer obtained from
> - * a data structure. As for SHIFT_PERCPU_PTR(), the creative casting keeps
> - * sparse quiet.
> - */
> -#define __hyp_this_cpu_ptr(sym) \
> - ({ \
> - void *__ptr; \
> - __verify_pcpu_ptr(&sym); \
> - __ptr = hyp_symbol_addr(sym); \
> - __ptr += read_sysreg(tpidr_el2); \
> - (typeof(sym) __kernel __force *)__ptr; \
> - })
> -
> -#define __hyp_this_cpu_read(sym) \
> - ({ \
> - *__hyp_this_cpu_ptr(sym); \
> - })
> -
> #define __KVM_EXTABLE(from, to) \
> " .pushsection __kvm_ex_table, \"a\"\n" \
> " .align 3\n" \
> diff --git a/arch/arm64/include/asm/percpu.h b/arch/arm64/include/asm/percpu.h
> index 0b6409b89e5e..b4008331475b 100644
> --- a/arch/arm64/include/asm/percpu.h
> +++ b/arch/arm64/include/asm/percpu.h
> @@ -19,7 +19,21 @@ static inline void set_my_cpu_offset(unsigned long off)
> :: "r" (off) : "memory");
> }
>
> -static inline unsigned long __my_cpu_offset(void)
> +static inline unsigned long __hyp_my_cpu_offset(void)
> +{
> + unsigned long off;
> +
> + /*
> + * We want to allow caching the value, so avoid using volatile and
> + * instead use a fake stack read to hazard against barrier().
> + */
> + asm("mrs %0, tpidr_el2" : "=r" (off) :
> + "Q" (*(const unsigned long *)current_stack_pointer));
> +
> + return off;
> +}
> +
> +static inline unsigned long __kern_my_cpu_offset(void)
> {
> unsigned long off;
>
> @@ -35,7 +49,12 @@ static inline unsigned long __my_cpu_offset(void)
>
> return off;
> }
> -#define __my_cpu_offset __my_cpu_offset()
> +
> +#ifdef __KVM_NVHE_HYPERVISOR__
> +#define __my_cpu_offset __hyp_my_cpu_offset()

Is there a benefit to this being used for __KVM_VHE_HYPERVISOR__ too?
That is "hyp" as well and doesn't need the alternative, since it will
always pick EL2.
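
(I.e. something along these lines -- just sketching the question, not a
tested change:

  #if defined(__KVM_NVHE_HYPERVISOR__) || defined(__KVM_VHE_HYPERVISOR__)
  #define __my_cpu_offset __hyp_my_cpu_offset()
  #else
  #define __my_cpu_offset __kern_my_cpu_offset()
  #endif
)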

> +#else
> +#define __my_cpu_offset __kern_my_cpu_offset()
> +#endif
>
> #define PERCPU_RW_OPS(sz) \
> static inline unsigned long __percpu_read_##sz(void *ptr) \
> @@ -227,4 +246,14 @@ PERCPU_RET_OP(add, add, ldadd)
>
> #include <asm-generic/percpu.h>
>
> +/* Redefine macros for nVHE hyp under DEBUG_PREEMPT to avoid its dependencies. */
> +#if defined(__KVM_NVHE_HYPERVISOR__) && defined(CONFIG_DEBUG_PREEMPT)
> +#undef this_cpu_ptr
> +#define this_cpu_ptr raw_cpu_ptr
> +#undef __this_cpu_read
> +#define __this_cpu_read raw_cpu_read
> +#undef __this_cpu_write
> +#define __this_cpu_write raw_cpu_write
> +#endif

This is an incomplete, cherry-picked list of macros that are redefined to
remove the call to __this_cpu_preempt_check, which would otherwise result
in a linker failure since that symbol is not defined for nVHE hyp.

I remember there being some dislike of truly defining that symbol with
an nVHE hyp implementation, but I can't see why. Yes, nVHE hyp always
has preemption disabled, so the implementation is just an empty function,
but why is it preferable to redefine some of these macros instead?
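
(The alternative would be roughly the following, defined for nVHE hyp --
a sketch of the suggestion, not an actual patch:

  /* nVHE hyp always runs with preemption disabled; nothing to check. */
  void __this_cpu_preempt_check(const char *op) { }
)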

> +
> #endif /* __ASM_PERCPU_H */
> diff --git a/arch/arm64/kvm/hyp/include/hyp/debug-sr.h b/arch/arm64/kvm/hyp/include/hyp/debug-sr.h
> index 5e28ea6aa097..4ebe9f558f3a 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/debug-sr.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/debug-sr.h
> @@ -135,7 +135,7 @@ static inline void __debug_switch_to_guest_common(struct kvm_vcpu *vcpu)
> if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
> return;
>
> - host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
> + host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> guest_ctxt = &vcpu->arch.ctxt;
> host_dbg = &vcpu->arch.host_debug_state.regs;
> guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);
> @@ -154,7 +154,7 @@ static inline void __debug_switch_to_host_common(struct kvm_vcpu *vcpu)
> if (!(vcpu->arch.flags & KVM_ARM64_DEBUG_DIRTY))
> return;
>
> - host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
> + host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> guest_ctxt = &vcpu->arch.ctxt;
> host_dbg = &vcpu->arch.host_debug_state.regs;
> guest_dbg = kern_hyp_va(vcpu->arch.debug_ptr);
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index 5b6b8fa00f0a..f150407fa798 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -386,7 +386,7 @@ static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
> !esr_is_ptrauth_trap(kvm_vcpu_get_esr(vcpu)))
> return false;
>
> - ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
> + ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> __ptrauth_save_key(ctxt, APIA);
> __ptrauth_save_key(ctxt, APIB);
> __ptrauth_save_key(ctxt, APDA);
> @@ -495,7 +495,7 @@ static inline void __set_guest_arch_workaround_state(struct kvm_vcpu *vcpu)
> * guest wants it disabled, so be it...
> */
> if (__needs_ssbd_off(vcpu) &&
> - __hyp_this_cpu_read(arm64_ssbd_callback_required))
> + __this_cpu_read(arm64_ssbd_callback_required))
> arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, 0, NULL);
> #endif
> }
> @@ -507,7 +507,7 @@ static inline void __set_host_arch_workaround_state(struct kvm_vcpu *vcpu)
> * If the guest has disabled the workaround, bring it back on.
> */
> if (__needs_ssbd_off(vcpu) &&
> - __hyp_this_cpu_read(arm64_ssbd_callback_required))
> + __this_cpu_read(arm64_ssbd_callback_required))
> arm_smccc_1_1_smc(ARM_SMCCC_ARCH_WORKAROUND_2, 1, NULL);
> #endif
> }
> @@ -521,7 +521,7 @@ static inline void __kvm_unexpected_el2_exception(void)
>
> entry = hyp_symbol_addr(__start___kvm_ex_table);
> end = hyp_symbol_addr(__stop___kvm_ex_table);
> - host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
> + host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
>
> while (entry < end) {
> addr = (unsigned long)&entry->insn + entry->insn;
> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> index 0970442d2dbc..cc4f8e790fb3 100644
> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> @@ -175,7 +175,7 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>
> vcpu = kern_hyp_va(vcpu);
>
> - host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
> + host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> host_ctxt->__hyp_running_vcpu = vcpu;
> guest_ctxt = &vcpu->arch.ctxt;
>
> diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
> index c1da4f86ccac..575e8054f116 100644
> --- a/arch/arm64/kvm/hyp/vhe/switch.c
> +++ b/arch/arm64/kvm/hyp/vhe/switch.c
> @@ -108,7 +108,7 @@ static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
> struct kvm_cpu_context *guest_ctxt;
> u64 exit_code;
>
> - host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
> + host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> host_ctxt->__hyp_running_vcpu = vcpu;
> guest_ctxt = &vcpu->arch.ctxt;
>
> diff --git a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
> index 996471e4c138..2a0b8c88d74f 100644
> --- a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
> +++ b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
> @@ -66,7 +66,7 @@ void kvm_vcpu_load_sysregs_vhe(struct kvm_vcpu *vcpu)
> struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt;
> struct kvm_cpu_context *host_ctxt;
>
> - host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
> + host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> __sysreg_save_user_state(host_ctxt);
>
> /*
> @@ -100,7 +100,7 @@ void kvm_vcpu_put_sysregs_vhe(struct kvm_vcpu *vcpu)
> struct kvm_cpu_context *guest_ctxt = &vcpu->arch.ctxt;
> struct kvm_cpu_context *host_ctxt;
>
> - host_ctxt = &__hyp_this_cpu_ptr(kvm_host_data)->host_ctxt;
> + host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> deactivate_traps_vhe_put();
>
> __sysreg_save_el1_state(guest_ctxt);
> --
> 2.28.0.402.g5ffc5be6b7-goog
>

2020-09-10 12:46:02

by Andrew Scull

[permalink] [raw]
Subject: Re: [PATCH v2 04/10] kvm: arm64: Remove hyp_adr/ldr_this_cpu

On Thu, Sep 03, 2020 at 11:17:06AM +0200, 'David Brazdil' via kernel-team wrote:
> The hyp_adr/ldr_this_cpu helpers were introduced for use in hyp code
> because they always needed to use TPIDR_EL2 for base, while
> adr/ldr_this_cpu from kernel proper would select between TPIDR_EL2 and
> _EL1 based on VHE/nVHE.
>
> Simplify this now that the nVHE hyp mode case can be handled using the
> __KVM_NVHE_HYPERVISOR__ macro. VHE selects _EL2 with alternatives.
>
> Signed-off-by: David Brazdil <[email protected]>

Acked-by: Andrew Scull <[email protected]>

> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index 54d181177656..b392a977efb6 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -218,6 +218,21 @@ lr .req x30 // link register
> str \src, [\tmp, :lo12:\sym]
> .endm
>
> + /*
> + * @dst: destination register (32 or 64 bit wide)
> + */
> + .macro this_cpu_offset, dst
> +#ifdef __KVM_NVHE_HYPERVISOR__
> + mrs \dst, tpidr_el2

Another part that might also apply to __KVM_VHE_HYPERVISOR__.

> +#else
> +alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
> + mrs \dst, tpidr_el1
> +alternative_else
> + mrs \dst, tpidr_el2
> +alternative_endif
> +#endif
> + .endm


2020-09-10 14:21:42

by Andrew Scull

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] kvm: arm64: Remove unnecessary hyp mappings

On Thu, Sep 03, 2020 at 11:17:12AM +0200, 'David Brazdil' via kernel-team wrote:
> With all nVHE per-CPU variables being part of the hyp per-CPU region,
> mapping them individually is not necessary any longer. They are mapped to hyp
> as part of the overall per-CPU region.
>
> Signed-off-by: David Brazdil <[email protected]>

Acked-by: Andrew Scull <[email protected]>

> ---
> arch/arm64/include/asm/kvm_mmu.h | 25 +++++++------------------
> arch/arm64/kvm/arm.c | 17 +----------------
> 2 files changed, 8 insertions(+), 34 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 9db93da35606..bbe9df76ff42 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -531,28 +531,17 @@ static inline int kvm_map_vectors(void)
> DECLARE_PER_CPU_READ_MOSTLY(u64, arm64_ssbd_callback_required);
> DECLARE_KVM_NVHE_PER_CPU(u64, arm64_ssbd_callback_required);
>
> -static inline int hyp_init_aux_data(void)
> +static inline void hyp_init_aux_data(void)
> {
> - int cpu, err;
> + int cpu;
>
> - for_each_possible_cpu(cpu) {
> - u64 *ptr;
> -
> - ptr = per_cpu_ptr_nvhe(arm64_ssbd_callback_required, cpu);
> - err = create_hyp_mappings(ptr, ptr + 1, PAGE_HYP);
> - if (err)
> - return err;
> -
> - /* Copy value from kernel to hyp. */
> - *ptr = per_cpu(arm64_ssbd_callback_required, cpu);
> - }
> - return 0;
> + /* Copy arm64_ssbd_callback_required values from kernel to hyp. */
> + for_each_possible_cpu(cpu)
> + *(per_cpu_ptr_nvhe(arm64_ssbd_callback_required, cpu)) =
> + per_cpu(arm64_ssbd_callback_required, cpu);

Careful with breaking assignments across lines; that seems to be taboo
in this subsystem.
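
Keeping a local variable would avoid that, e.g. (just a sketch):

	for_each_possible_cpu(cpu) {
		u64 *ptr = per_cpu_ptr_nvhe(arm64_ssbd_callback_required, cpu);

		/* Copy the kernel's value into the hyp copy. */
		*ptr = per_cpu(arm64_ssbd_callback_required, cpu);
	}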

> }
> #else
> -static inline int hyp_init_aux_data(void)
> -{
> - return 0;
> -}
> +static inline void hyp_init_aux_data(void) {}
> #endif
>
> #define kvm_phys_to_vttbr(addr) phys_to_ttbr(addr)
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index df7d133056ce..dfe1baa5bbb7 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1631,22 +1631,7 @@ static int init_hyp_mode(void)
> }
> }
>
> - for_each_possible_cpu(cpu) {
> - kvm_host_data_t *cpu_data;
> -
> - cpu_data = per_cpu_ptr_hyp(kvm_host_data, cpu);
> - err = create_hyp_mappings(cpu_data, cpu_data + 1, PAGE_HYP);
> -
> - if (err) {
> - kvm_err("Cannot map host CPU state: %d\n", err);
> - goto out_err;
> - }
> - }
> -
> - err = hyp_init_aux_data();
> - if (err)
> - kvm_err("Cannot map host auxiliary data: %d\n", err);
> -
> + hyp_init_aux_data();
> return 0;
>
> out_err:
> --
> 2.28.0.402.g5ffc5be6b7-goog
>

2020-09-14 13:20:29

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 02/10] kvm: arm64: Partially link nVHE hyp code, simplify HYPCOPY

On Thu, Sep 03, 2020 at 11:17:04AM +0200, David Brazdil wrote:
> Previous series introduced custom build rules for nVHE hyp code, using
> objcopy to prefix ELF section and symbol names to separate nVHE code
> into its own "namespace". This approach was limited by the expressiveness
> of objcopy's command line interface, e.g. missing support for wildcards.

nit: "Previous series" isn't a lot of use here or in the git log. You can
just say something like:

"Relying on objcopy to prefix the ELF section names of the nVHE hyp code
is brittle and prevents us from using wildcards to match specific
section names."

and then go on to explain what the change is doing (see
Documentation/process/submitting-patches.rst for more help here)

Also, given that this is independent of the other patches, please can you
move it right to the start of the series? I'm a bit worried about the
potential for regressions given the changes to the way in which we link,
so the sooner we can get this patch some more exposure, the better.

> Improve the build rules by partially linking all '.hyp.o' files and
> prefixing their ELF section names using a linker script. Continue using
> objcopy for prefixing ELF symbol names.
>
> One immediate advantage of this approach is that all subsections
> matching a pattern can be merged into a single prefixed section, e.g.
> .text and .text.* can be linked into a single '.hyp.text'. This removes
> the need for -fno-reorder-functions on GCC and will be useful in the
> future too: LTO builds use .text subsections, compilers routinely
> generate .rodata subsections, etc.
>
> Partially linking all hyp code into a single object file also makes it
> easier to analyze.
>
> Signed-off-by: David Brazdil <[email protected]>
> ---
> arch/arm64/kvm/hyp/nvhe/Makefile | 56 ++++++++++++++++---------------
> arch/arm64/kvm/hyp/nvhe/hyp.lds.S | 14 ++++++++
> 2 files changed, 43 insertions(+), 27 deletions(-)
> create mode 100644 arch/arm64/kvm/hyp/nvhe/hyp.lds.S
>
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index aef76487edc2..1b2fbb19f3e8 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -10,40 +10,42 @@ obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o
> obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
> ../fpsimd.o ../hyp-entry.o
>
> -obj-y := $(patsubst %.o,%.hyp.o,$(obj-y))
> -extra-y := $(patsubst %.hyp.o,%.hyp.tmp.o,$(obj-y))
> +##
> +## Build rules for compiling nVHE hyp code
> +## Output of this folder is `hyp.o`, a partially linked object file containing
> +## all nVHE hyp code and data.
> +##
>
> -$(obj)/%.hyp.tmp.o: $(src)/%.c FORCE
> +hyp-obj := $(patsubst %.o,%.hyp.o,$(obj-y))
> +obj-y := hyp.o
> +extra-y := $(hyp-obj) hyp.tmp.o hyp.lds
> +
> +# 1) Compile all source files to `.hyp.o` object files. The file extension
> +# avoids file name clashes for files shared with VHE.
> +$(obj)/%.hyp.o: $(src)/%.c FORCE
> $(call if_changed_rule,cc_o_c)
> -$(obj)/%.hyp.tmp.o: $(src)/%.S FORCE
> +$(obj)/%.hyp.o: $(src)/%.S FORCE
> $(call if_changed_rule,as_o_S)
> -$(obj)/%.hyp.o: $(obj)/%.hyp.tmp.o FORCE
> - $(call if_changed,hypcopy)
>
> -# Disable reordering functions by GCC (enabled at -O2).
> -# This pass puts functions into '.text.*' sections to aid the linker
> -# in optimizing ELF layout. See HYPCOPY comment below for more info.
> -ccflags-y += $(call cc-option,-fno-reorder-functions)
> +# 2) Compile linker script.
> +$(obj)/hyp.lds: $(src)/hyp.lds.S FORCE
> + $(call if_changed_dep,cpp_lds_S)

Why is it not sufficient just to list the linker script as a target, like
we do for vmlinux.lds in extra-y?

> +# 3) Partially link all '.hyp.o' files and apply the linker script.
> +# Prefixes names of ELF sections with '.hyp', eg. '.hyp.text'.
> +LDFLAGS_hyp.tmp.o := -r -T $(obj)/hyp.lds
> +$(obj)/hyp.tmp.o: $(addprefix $(obj)/,$(hyp-obj)) $(obj)/hyp.lds FORCE
> + $(call if_changed,ld)
> +
> +# 4) Produce the final 'hyp.o', ready to be linked into 'vmlinux'.
> +# Prefixes names of ELF symbols with '__kvm_nvhe_'.
> +$(obj)/hyp.o: $(obj)/hyp.tmp.o FORCE
> + $(call if_changed,hypcopy)
>
> # The HYPCOPY command uses `objcopy` to prefix all ELF symbol names
> -# and relevant ELF section names to avoid clashes with VHE code/data.
> -#
> -# Hyp code is assumed to be in the '.text' section of the input object
> -# files (with the exception of specialized sections such as
> -# '.hyp.idmap.text'). This assumption may be broken by a compiler that
> -# divides code into sections like '.text.unlikely' so as to optimize
> -# ELF layout. HYPCOPY checks that no such sections exist in the input
> -# using `objdump`, otherwise they would be linked together with other
> -# kernel code and not memory-mapped correctly at runtime.
> +# to avoid clashes with VHE code/data.
> quiet_cmd_hypcopy = HYPCOPY $@
> - cmd_hypcopy = \
> - if $(OBJDUMP) -h $< | grep -F '.text.'; then \
> - echo "$@: function reordering not supported in nVHE hyp code" >&2; \
> - /bin/false; \
> - fi; \
> - $(OBJCOPY) --prefix-symbols=__kvm_nvhe_ \
> - --rename-section=.text=.hyp.text \
> - $< $@
> + cmd_hypcopy = $(OBJCOPY) --prefix-symbols=__kvm_nvhe_ $< $@
>
> # Remove ftrace and Shadow Call Stack CFLAGS.
> # This is equivalent to the 'notrace' and '__noscs' annotations.
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
> new file mode 100644
> index 000000000000..aaa0ce133a32
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Linker script used during partial linking of nVHE EL2 object files.
> + * Written by David Brazdil <[email protected]>
> + */
> +
> +/*
> + * Defines an ELF hyp section from input section @NAME and its subsections.
> + */
> +#define HYP_SECTION(NAME) .hyp##NAME : { *(NAME NAME##.[0-9a-zA-Z_]*) }

Is 'NAME##.*' likely to cause a problem here?

Will

2020-09-14 17:41:58

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Independent per-CPU data section for nVHE

Hi David,

On Thu, Sep 03, 2020 at 11:17:02AM +0200, David Brazdil wrote:
> Introduce '.hyp.data..percpu' as part of ongoing effort to make nVHE
> hyp code self-contained and independent of the rest of the kernel.
>
> The series builds on top of the "Split off nVHE hyp code" series which
> used objcopy to rename '.text' to '.hyp.text' and prefix all ELF
> symbols with '__kvm_nvhe' for all object files under kvm/hyp/nvhe.

I've been playing around with this series this afternoon, trying to see
if we can reduce the coupling between the nVHE code and the core code. I've
ended up with the diff below on top of your series, but I think it actually
removes the need to change the core code at all. The idea is to collapse
the percpu sections during prelink, and then we can just deal with the
resulting data section a bit like we do for .hyp.text already.

Have I missed something critical?

Cheers,

Will

--->8

diff --git a/arch/arm64/include/asm/hyp_image.h b/arch/arm64/include/asm/hyp_image.h
new file mode 100644
index 000000000000..40bbf2ddb50f
--- /dev/null
+++ b/arch/arm64/include/asm/hyp_image.h
@@ -0,0 +1,20 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_HYP_IMAGE_H
+#define __ASM_HYP_IMAGE_H
+
+/*
+ * KVM nVHE code has its own symbol namespace prefixed with __kvm_nvhe_, to
+ * separate it from the kernel proper.
+ */
+#define kvm_nvhe_sym(sym) __kvm_nvhe_##sym
+
+#ifdef LINKER_SCRIPT
+/*
+ * Defines an ELF hyp section from input section @NAME and its subsections.
+ */
+#define HYP_SECTION(NAME) .hyp ## NAME : { *(NAME NAME ## .*) }
+#define KVM_NVHE_ALIAS(sym) kvm_nvhe_sym(sym) = sym;
+#endif /* LINKER_SCRIPT */
+
+#endif /* __ASM_HYP_IMAGE_H */
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index c87111c25d9e..e0e1e404f6eb 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -7,6 +7,7 @@
#ifndef __ARM_KVM_ASM_H__
#define __ARM_KVM_ASM_H__

+#include <asm/hyp_image.h>
#include <asm/virt.h>

#define VCPU_WORKAROUND_2_FLAG_SHIFT 0
@@ -42,13 +43,6 @@

#include <linux/mm.h>

-/*
- * Translate name of a symbol defined in nVHE hyp to the name seen
- * by kernel proper. All nVHE symbols are prefixed by the build system
- * to avoid clashes with the VHE variants.
- */
-#define kvm_nvhe_sym(sym) __kvm_nvhe_##sym
-
#define DECLARE_KVM_VHE_SYM(sym) extern char sym[]
#define DECLARE_KVM_NVHE_SYM(sym) extern char kvm_nvhe_sym(sym)[]

diff --git a/arch/arm64/kernel/image-vars.h b/arch/arm64/kernel/image-vars.h
index 21307e2db3fc..f16205300dbc 100644
--- a/arch/arm64/kernel/image-vars.h
+++ b/arch/arm64/kernel/image-vars.h
@@ -54,15 +54,11 @@ __efistub__ctype = _ctype;
#ifdef CONFIG_KVM

/*
- * KVM nVHE code has its own symbol namespace prefixed with __kvm_nvhe_, to
- * separate it from the kernel proper. The following symbols are legally
- * accessed by it, therefore provide aliases to make them linkable.
- * Do not include symbols which may not be safely accessed under hypervisor
- * memory mappings.
+ * The following symbols are legally accessed by the KVM nVHE code, therefore
+ * provide aliases to make them linkable. Do not include symbols which may not
+ * be safely accessed under hypervisor memory mappings.
*/

-#define KVM_NVHE_ALIAS(sym) __kvm_nvhe_##sym = sym;
-
/* Alternative callbacks for init-time patching of nVHE hyp code. */
KVM_NVHE_ALIAS(arm64_enable_wa2_handling);
KVM_NVHE_ALIAS(kvm_patch_vector_branch);
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 5904a4de9f40..c06e6860adfd 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -9,27 +9,37 @@

#include <asm-generic/vmlinux.lds.h>
#include <asm/cache.h>
+#include <asm/hyp_image.h>
#include <asm/kernel-pgtable.h>
#include <asm/memory.h>
#include <asm/page.h>

#include "image.h"

-#define __CONCAT3(x, y, z) x ## y ## z
-#define CONCAT3(x, y, z) __CONCAT3(x, y, z)
-
OUTPUT_ARCH(aarch64)
ENTRY(_text)

jiffies = jiffies_64;

-
+#ifdef CONFIG_KVM
#define HYPERVISOR_EXTABLE \
. = ALIGN(SZ_8); \
__start___kvm_ex_table = .; \
*(__kvm_ex_table) \
__stop___kvm_ex_table = .;

+#define HYPERVISOR_PERCPU_SECTION \
+ . = ALIGN(PAGE_SIZE); \
+ .hyp.data..percpu : { \
+ kvm_nvhe_sym(__per_cpu_start) = .; \
+ *(.hyp.data..percpu) \
+ kvm_nvhe_sym(__per_cpu_end) = .; \
+ }
+#else
+#define HYPERVISOR_EXTABLE
+#define HYPERVISOR_PERCPU_SECTION
+#endif
+
#define HYPERVISOR_TEXT \
/* \
* Align to 4 KB so that \
@@ -193,13 +203,7 @@ SECTIONS
}

PERCPU_SECTION(L1_CACHE_BYTES)
-
- /* KVM nVHE per-cpu section */
- #undef PERCPU_SECTION_NAME
- #undef PERCPU_SYMBOL_NAME
- #define PERCPU_SECTION_NAME(suffix) CONCAT3(.hyp, PERCPU_SECTION_BASE_NAME, suffix)
- #define PERCPU_SYMBOL_NAME(name) __kvm_nvhe_ ## name
- PERCPU_SECTION(L1_CACHE_BYTES)
+ HYPERVISOR_PERCPU_SECTION

.rela.dyn : ALIGN(8) {
*(.rela .rela*)
diff --git a/arch/arm64/kvm/hyp/nvhe/.gitignore b/arch/arm64/kvm/hyp/nvhe/.gitignore
new file mode 100644
index 000000000000..695d73d0249e
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+hyp.lds
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 1b2fbb19f3e8..decc2373aa6c 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -33,8 +33,8 @@ $(obj)/hyp.lds: $(src)/hyp.lds.S FORCE

# 3) Partially link all '.hyp.o' files and apply the linker script.
# Prefixes names of ELF sections with '.hyp', eg. '.hyp.text'.
-LDFLAGS_hyp.tmp.o := -r -T $(obj)/hyp.lds
-$(obj)/hyp.tmp.o: $(addprefix $(obj)/,$(hyp-obj)) $(obj)/hyp.lds FORCE
+LDFLAGS_hyp.tmp.o := -r -T
+$(obj)/hyp.tmp.o: $(obj)/hyp.lds $(addprefix $(obj)/,$(hyp-obj)) FORCE
$(call if_changed,ld)

# 4) Produce the final 'hyp.o', ready to be linked into 'vmlinux'.
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
index 7d8c3fa004f4..8121f2a6aedf 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
@@ -4,16 +4,9 @@
* Written by David Brazdil <[email protected]>
*/

-/*
- * Defines an ELF hyp section from input section @NAME and its subsections.
- */
-#define HYP_SECTION(NAME) .hyp##NAME : { *(NAME NAME##.[0-9a-zA-Z_]*) }
+#include <asm/hyp_image.h>

SECTIONS {
HYP_SECTION(.text)
HYP_SECTION(.data..percpu)
- HYP_SECTION(.data..percpu..first)
- HYP_SECTION(.data..percpu..page_aligned)
- HYP_SECTION(.data..percpu..read_mostly)
- HYP_SECTION(.data..percpu..shared_aligned)
}

2020-09-16 17:57:55

by David Brazdil

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Independent per-CPU data section for nVHE

> > SECTIONS {
> > HYP_SECTION(.text)
> > - HYP_SECTION(.data..percpu)
> > - HYP_SECTION(.data..percpu..first)
> > - HYP_SECTION(.data..percpu..page_aligned)
> > - HYP_SECTION(.data..percpu..read_mostly)
> > - HYP_SECTION(.data..percpu..shared_aligned)
> > +
> > + .hyp..data..percpu : {
>
> Too many '.'s here?
Oops

>
> > + __per_cpu_load = .;
>
> I don't think we need this symbol.
True

>
> Otherwise, idea looks good to me. Can you respin like this, but also
> incorporating some of the cleanup in the diff I posted, please?

On it! :)

David

2020-09-16 19:06:23

by David Brazdil

[permalink] [raw]
Subject: Re: [PATCH v2 10/10] kvm: arm64: Remove unnecessary hyp mappings

> > + for_each_possible_cpu(cpu)
> > + *(per_cpu_ptr_nvhe(arm64_ssbd_callback_required, cpu)) =
> > + per_cpu(arm64_ssbd_callback_required, cpu);
>
> Careful with breaking assignments across lines; that seems to be taboo
> in this subsystem.

Happy to put the `ptr` var back. Sorry *embarrassed emoji*.

Thanks for reviewing,
David

2020-09-16 19:08:22

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Independent per-CPU data section for nVHE

On Wed, Sep 16, 2020 at 01:24:12PM +0100, David Brazdil wrote:
> > I was also wondering about another approach - using the PERCPU_SECTION macro
> > unchanged in the hyp linker script. It would lay out a single .data..percpu and
> > we would then prefix it with .hyp and the symbols with __kvm_nvhe_ as with
> > everything else. WDYT? Haven't tried that yet, could be a naive idea.
>
> Seems to work. Can't use PERCPU_SECTION directly because then we couldn't
> rename it in the same linker script, but if we just unwrap that one layer
> we can use PERCPU_INPUT. No global macro changes needed.
>
> Let me know what you think.
>
> ------8<------
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 5904a4de9f40..9e6bf21268f1 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -195,11 +195,9 @@ SECTIONS
> PERCPU_SECTION(L1_CACHE_BYTES)
>
> /* KVM nVHE per-cpu section */
> - #undef PERCPU_SECTION_NAME
> - #undef PERCPU_SYMBOL_NAME
> - #define PERCPU_SECTION_NAME(suffix) CONCAT3(.hyp, PERCPU_SECTION_BASE_NAME, suffix)
> - #define PERCPU_SYMBOL_NAME(name) __kvm_nvhe_ ## name
> - PERCPU_SECTION(L1_CACHE_BYTES)
> + . = ALIGN(PAGE_SIZE);
> + .hyp.data..percpu : { *(.hyp.data..percpu) }
> + . = ALIGN(PAGE_SIZE);
>
> .rela.dyn : ALIGN(8) {
> *(.rela .rela*)
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
> index 7d8c3fa004f4..1d8e4f7edc29 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
> @@ -4,6 +4,10 @@
> * Written by David Brazdil <[email protected]>
> */
>
> +#include <asm-generic/vmlinux.lds.h>
> +#include <asm/cache.h>
> +#include <asm/memory.h>
> +
> /*
> * Defines an ELF hyp section from input section @NAME and its subsections.
> */
> @@ -11,9 +15,9 @@
>
> SECTIONS {
> HYP_SECTION(.text)
> - HYP_SECTION(.data..percpu)
> - HYP_SECTION(.data..percpu..first)
> - HYP_SECTION(.data..percpu..page_aligned)
> - HYP_SECTION(.data..percpu..read_mostly)
> - HYP_SECTION(.data..percpu..shared_aligned)
> +
> + .hyp..data..percpu : {

Too many '.'s here?

> + __per_cpu_load = .;

I don't think we need this symbol.

Otherwise, idea looks good to me. Can you respin like this, but also
incorporating some of the cleanup in the diff I posted, please?

Will

2020-09-16 21:02:25

by David Brazdil

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Independent per-CPU data section for nVHE

> I was also wondering about another approach - using the PERCPU_SECTION macro
> unchanged in the hyp linker script. It would lay out a single .data..percpu and
> we would then prefix it with .hyp and the symbols with __kvm_nvhe_ as with
> everything else. WDYT? Haven't tried that yet, could be a naive idea.

Seems to work. Can't use PERCPU_SECTION directly because then we couldn't
rename it in the same linker script, but if we just unwrap that one layer
we can use PERCPU_INPUT. No global macro changes needed.

Let me know what you think.

------8<------
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 5904a4de9f40..9e6bf21268f1 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -195,11 +195,9 @@ SECTIONS
PERCPU_SECTION(L1_CACHE_BYTES)

/* KVM nVHE per-cpu section */
- #undef PERCPU_SECTION_NAME
- #undef PERCPU_SYMBOL_NAME
- #define PERCPU_SECTION_NAME(suffix) CONCAT3(.hyp, PERCPU_SECTION_BASE_NAME, suffix)
- #define PERCPU_SYMBOL_NAME(name) __kvm_nvhe_ ## name
- PERCPU_SECTION(L1_CACHE_BYTES)
+ . = ALIGN(PAGE_SIZE);
+ .hyp.data..percpu : { *(.hyp.data..percpu) }
+ . = ALIGN(PAGE_SIZE);

.rela.dyn : ALIGN(8) {
*(.rela .rela*)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
index 7d8c3fa004f4..1d8e4f7edc29 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
+++ b/arch/arm64/kvm/hyp/nvhe/hyp.lds.S
@@ -4,6 +4,10 @@
* Written by David Brazdil <[email protected]>
*/

+#include <asm-generic/vmlinux.lds.h>
+#include <asm/cache.h>
+#include <asm/memory.h>
+
/*
* Defines an ELF hyp section from input section @NAME and its subsections.
*/
@@ -11,9 +15,9 @@

SECTIONS {
HYP_SECTION(.text)
- HYP_SECTION(.data..percpu)
- HYP_SECTION(.data..percpu..first)
- HYP_SECTION(.data..percpu..page_aligned)
- HYP_SECTION(.data..percpu..read_mostly)
- HYP_SECTION(.data..percpu..shared_aligned)
+
+ .hyp..data..percpu : {
+ __per_cpu_load = .;
+ PERCPU_INPUT(L1_CACHE_BYTES)
+ }
}
-----8<------

David

2020-09-16 21:10:17

by David Brazdil

[permalink] [raw]
Subject: Re: [PATCH v2 00/10] Independent per-CPU data section for nVHE

Hi Will,

On Mon, Sep 14, 2020 at 06:40:09PM +0100, Will Deacon wrote:
> Hi David,
>
> On Thu, Sep 03, 2020 at 11:17:02AM +0200, David Brazdil wrote:
> > Introduce '.hyp.data..percpu' as part of ongoing effort to make nVHE
> > hyp code self-contained and independent of the rest of the kernel.
> >
> > The series builds on top of the "Split off nVHE hyp code" series which
> > used objcopy to rename '.text' to '.hyp.text' and prefix all ELF
> > symbols with '__kvm_nvhe' for all object files under kvm/hyp/nvhe.
>
> I've been playing around with this series this afternoon, trying to see
> if we can reduce the coupling between the nVHE code and the core code. I've
> ended up with the diff below on top of your series, but I think it actually
> removes the need to change the core code at all. The idea is to collapse
> the percpu sections during prelink, and then we can just deal with the
> resulting data section a bit like we do for .hyp.text already.
>
> Have I missed something critical?

I was wondering whether this approach would be sufficient as well, given
its simplicity. We'd just need to be careful about correctly preserving the
semantics of the different .data..percpu..* sections.

For instance, I've noticed you make .hyp.data..percpu page-aligned rather than
cacheline-aligned. We need that for stage-2 unmapping, but it also happens to
correctly align DEFINE_PER_CPU_PAGE_ALIGNED variables when collapsed into the
single hyp section. The reason I ended up reusing the global macro was to
avoid introducing subtleties like that into the arm64 linker script. Do you
think it's a worthwhile trade-off?
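
To illustrate with a hypothetical variable (not one from the series):

	/* Lands in .data..percpu..page_aligned with PAGE_SIZE alignment. */
	DEFINE_PER_CPU_PAGE_ALIGNED(unsigned long, hyp_scratch);

Once the ..page_aligned input section is collapsed into the single
.hyp.data..percpu, it is the page alignment of the output section that
keeps such a variable correctly aligned in every CPU's copy of the
region.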

One place where this approach doesn't work is DEFINE_PER_CPU_FIRST. But I'm
guessing that's something we can live without.

I was also wondering about another approach - using the PERCPU_SECTION macro
unchanged in the hyp linker script. It would lay out a single .data..percpu and
we would then prefix it with .hyp and the symbols with __kvm_nvhe_ as with
everything else. WDYT? Haven't tried that yet, could be a naive idea.

Thanks for reviewing,
David

2020-09-17 08:38:08

by David Brazdil

[permalink] [raw]
Subject: Re: [PATCH v2 03/10] kvm: arm64: Remove __hyp_this_cpu_read

Hey Andrew,

> > +#ifdef __KVM_NVHE_HYPERVISOR__
> > +#define __my_cpu_offset __hyp_my_cpu_offset()
>
> Is there a benefit to using this for __KVM_VHE_HYPERVISOR__ as well?
> That is "hyp" too and doesn't need the alternative, since it will
> always pick EL2.

Minor time and space savings, but you're right, makes sense to treat them
equally. Updated in v3.
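
i.e. roughly this shape (a sketch of the v3 change):

	#if defined(__KVM_NVHE_HYPERVISOR__) || defined(__KVM_VHE_HYPERVISOR__)
	#define __my_cpu_offset __hyp_my_cpu_offset()
	#else
	#define __my_cpu_offset __kern_my_cpu_offset()
	#endif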

> > +/* Redefine macros for nVHE hyp under DEBUG_PREEMPT to avoid its dependencies. */
> > +#if defined(__KVM_NVHE_HYPERVISOR__) && defined(CONFIG_DEBUG_PREEMPT)
> > +#undef this_cpu_ptr
> > +#define this_cpu_ptr raw_cpu_ptr
> > +#undef __this_cpu_read
> > +#define __this_cpu_read raw_cpu_read
> > +#undef __this_cpu_write
> > +#define __this_cpu_write raw_cpu_write
> > +#endif
>
> This is an incomplete, cherry-picked list of the macros that need
> redefining to remove the call to __this_cpu_preempt_check, which would
> otherwise result in a linker failure since that symbol is not defined
> for nVHE hyp.
>
> I remember there being some dislike of truly defining that symbol with
> an nVHE hyp implementation, but I can't see why. Yes, nVHE hyp always
> has preemption disabled, so the implementation is just an empty
> function, but why is it preferable to redefine some of these macros
> instead?

That was my initial implementation and we could probably sway others in that
direction, too. That said, I just tried it on 5.9-rc5 and there are two more
dependencies. No idea what changed since the last time I tried; maybe I simply
messed up back then.

Basically, this_cpu_ptr translates to:
__this_cpu_preempt_check(); per_cpu_ptr(sym, debug_smp_processor_id())

__this_cpu_preempt_check: should be empty for hyp
per_cpu_ptr: needs mapping of the array of bases in hyp, otherwise easy
debug_smp_processor_id: needs a clone of 'cpu_number' percpu variable
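
For illustration, the last one could look roughly like this (an assumed
hyp-side definition mirroring the kernel's 'cpu_number' in kernel/smp.c,
not an actual patch):

	/* nVHE hyp clone of 'cpu_number', read by debug_smp_processor_id(). */
	DEFINE_PER_CPU_READ_MOSTLY(int, cpu_number);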

None of these is particularly difficult to implement, and two will even be
useful going forward, but it still feels too fiddly for posting this late in
the 5.10 cycle. So I suggest we stick to the macro redefines for now and I'll
post those patches for 5.11. WDYT?

You can find the patches on branch 'topic/percpu-v3-debug-preempt' of
https://android-kvm.googlesource.com/linux.

David