Hi all,
This is v3 of the nVHE hypervisor stack enhancements.
Previous versions can be found at:
v2: https://lore.kernel.org/r/[email protected]/
v1: https://lore.kernel.org/r/[email protected]/
The main update in this version is that the unwinder now reuses the core logic
of the regular kernel stack unwinder to avoid duplicating code, per Mark. It
also includes fixes for the other issues identified in v2.
The previous cover letter (with updated call trace) has been copied below.
Thanks,
Kalesh
-----
This series is based on 5.17-rc5 and adds the following stack features to
the KVM nVHE hypervisor:
== Hyp Stack Guard Pages ==
Based on the technique used by arm64 VMAP_STACK to detect overflow:
the stack is aligned to twice its size, which ensures that the
'stack shift' bit of any valid SP is 0. The 'stack shift' bit can be
tested in the exception entry to detect overflow without corrupting GPRs.
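To illustrate the alignment argument above, here is a minimal user-space
sketch in C. It is not the hypervisor's entry code (which performs the test
in assembly); STACK_SHIFT, sp_overflowed() and the example addresses are
assumptions chosen for illustration, with a single 4 KiB page of stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STACK_SHIFT	12u				/* assumed: 4 KiB stack */
#define STACK_SIZE	(UINT64_C(1) << STACK_SHIFT)

/* The base must be aligned to twice the stack size for the trick to work. */
static bool stack_base_is_valid(uint64_t base)
{
	return (base & (2 * STACK_SIZE - 1)) == 0;
}

/* A set 'stack shift' bit means SP has dropped below the stack base. */
static bool sp_overflowed(uint64_t sp)
{
	return ((sp >> STACK_SHIFT) & 1u) != 0;
}

/* Hypothetical stack occupying [0x80002000, 0x80003000). */
static void stack_shift_selftest(void)
{
	const uint64_t base = UINT64_C(0x80002000);

	assert(stack_base_is_valid(base));
	assert(!sp_overflowed(base));			/* lowest valid slot */
	assert(!sp_overflowed(base + STACK_SIZE - 16));	/* near the top */
	assert(sp_overflowed(base - 16));		/* just into the guard page */
	assert(sp_overflowed(base - STACK_SIZE));	/* bottom of the guard page */
}
```

Because the base is a multiple of 2 * STACK_SIZE, every address inside the
stack has the bit clear, and every address in the STACK_SIZE bytes directly
below it has the bit set, so a single bit test is enough.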
== Hyp Stack Unwinder ==
Based on the arm64 kernel stack unwinder
(See: arch/arm64/kernel/stacktrace.c)
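For readers unfamiliar with frame-pointer unwinding, the general idea can be
sketched as below. This is a simplified user-space illustration and not the
code in this series: struct stack_view, unwind_frames() and the sanity checks
are hypothetical, and the only architectural fact relied on is that an
AArch64 frame record is a {frame pointer, return address} pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* AArch64 frame record: saved frame pointer followed by the return address. */
struct frame_record {
	uint64_t fp;
	uint64_t pc;
};

/* Hypothetical view of the stack being walked, as seen by the unwinder. */
struct stack_view {
	const uint8_t *mem;	/* readable copy or mapping of the stack */
	uint64_t low;		/* lowest address covered by mem */
	uint64_t high;		/* one past the highest address */
};

/* Walk frame records starting at fp; store return addresses in pcs[]. */
static size_t unwind_frames(const struct stack_view *sv, uint64_t fp,
			    uint64_t *pcs, size_t max)
{
	size_t n = 0;

	while (n < max) {
		struct frame_record rec;

		/* Reject misaligned or out-of-bounds frame pointers. */
		if (fp & 0x7)
			break;
		if (fp < sv->low || fp > sv->high - sizeof(rec))
			break;

		memcpy(&rec, sv->mem + (fp - sv->low), sizeof(rec));
		pcs[n++] = rec.pc;

		/* A zero FP ends the chain; callers must sit at higher addresses. */
		if (rec.fp == 0 || rec.fp <= fp)
			break;
		fp = rec.fp;
	}
	return n;
}

static void unwind_selftest(void)
{
	/* Three chained records at 0x1000, 0x1010 and 0x1020. */
	const uint64_t words[8] = { 0x1010, 0xa, 0x1020, 0xb, 0, 0xc, 0, 0 };
	struct stack_view sv = { (const uint8_t *)words, 0x1000,
				 0x1000 + sizeof(words) };
	uint64_t pcs[4];

	assert(unwind_frames(&sv, 0x1000, pcs, 4) == 3);
	assert(pcs[0] == 0xa && pcs[1] == 0xb && pcs[2] == 0xc);
}
```

Requiring each saved frame pointer to be strictly higher than the current one
is a choice made in this sketch so that a corrupted chain cannot loop forever.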
The unwinding and dumping of the hyp stack is not enabled by default and
depends on CONFIG_NVHE_EL2_DEBUG to avoid potential information leaks.
When CONFIG_NVHE_EL2_DEBUG is enabled, the host stage 2 protection is
disabled, allowing the host to read the hypervisor stack pages and unwind
the stack from EL1. This allows us to print the hypervisor stacktrace
before panicking the host, as shown below.
Example call trace:
[ 98.916444][ T426] kvm [426]: nVHE hyp panic at: [<ffffffc0096156fc>] __kvm_nvhe_overflow_stack+0x8/0x34!
[ 98.918360][ T426] nVHE HYP call trace:
[ 98.918692][ T426] kvm [426]: [<ffffffc009615aac>] __kvm_nvhe_cpu_prepare_nvhe_panic_info+0x4c/0x68
[ 98.919545][ T426] kvm [426]: [<ffffffc0096159a4>] __kvm_nvhe_hyp_panic+0x2c/0xe8
[ 98.920107][ T426] kvm [426]: [<ffffffc009615ad8>] __kvm_nvhe_hyp_panic_bad_stack+0x10/0x10
[ 98.920665][ T426] kvm [426]: [<ffffffc009610a4c>] __kvm_nvhe___kvm_hyp_host_vector+0x24c/0x794
[ 98.921292][ T426] kvm [426]: [<ffffffc009615718>] __kvm_nvhe_overflow_stack+0x24/0x34
. . .
[ 98.973382][ T426] kvm [426]: [<ffffffc009615718>] __kvm_nvhe_overflow_stack+0x24/0x34
[ 98.973816][ T426] kvm [426]: [<ffffffc0096152f4>] __kvm_nvhe___kvm_vcpu_run+0x38/0x438
[ 98.974255][ T426] kvm [426]: [<ffffffc009616f80>] __kvm_nvhe_handle___kvm_vcpu_run+0x1c4/0x364
[ 98.974719][ T426] kvm [426]: [<ffffffc009616928>] __kvm_nvhe_handle_trap+0xa8/0x130
[ 98.975152][ T426] kvm [426]: [<ffffffc009610064>] __kvm_nvhe___host_exit+0x64/0x64
[ 98.975588][ T426] ---- end of nVHE HYP call trace ----
Kalesh Singh (8):
KVM: arm64: Introduce hyp_alloc_private_va_range()
KVM: arm64: Introduce pkvm_alloc_private_va_range()
KVM: arm64: Add guard pages for KVM nVHE hypervisor stack
KVM: arm64: Add guard pages for pKVM (protected nVHE) hypervisor stack
KVM: arm64: Detect and handle hypervisor stack overflows
KVM: arm64: Add hypervisor overflow stack
KVM: arm64: Unwind and dump nVHE HYP stacktrace
KVM: arm64: Symbolize the nVHE HYP backtrace
arch/arm64/include/asm/kvm_asm.h | 20 +++
arch/arm64/include/asm/kvm_mmu.h | 4 +
arch/arm64/include/asm/stacktrace.h | 12 ++
arch/arm64/kernel/stacktrace.c | 210 ++++++++++++++++++++++++---
arch/arm64/kvm/Kconfig | 5 +-
arch/arm64/kvm/arm.c | 34 ++++-
arch/arm64/kvm/handle_exit.c | 16 +-
arch/arm64/kvm/hyp/include/nvhe/mm.h | 3 +-
arch/arm64/kvm/hyp/nvhe/host.S | 29 ++++
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 5 +-
arch/arm64/kvm/hyp/nvhe/mm.c | 51 ++++---
arch/arm64/kvm/hyp/nvhe/setup.c | 25 +++-
arch/arm64/kvm/hyp/nvhe/switch.c | 30 +++-
arch/arm64/kvm/mmu.c | 62 +++++---
scripts/kallsyms.c | 2 +-
15 files changed, 422 insertions(+), 86 deletions(-)
base-commit: cfb92440ee71adcc2105b0890bb01ac3cddb8507
--
2.35.1.473.g83b2b277ed-goog
Map the stack pages in the flexible private VA range and allocate
guard pages below the stack as unbacked VA space. The stack is aligned
to twice its size to aid overflow detection (implemented in a subsequent
patch in the series).
Signed-off-by: Kalesh Singh <[email protected]>
---
Changes in v3:
- Handle null ptr in IS_ERR_OR_NULL checks, per Mark
arch/arm64/kvm/hyp/nvhe/setup.c | 25 +++++++++++++++++++++----
1 file changed, 21 insertions(+), 4 deletions(-)
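
The resulting VA layout can be modelled with a small user-space sketch. It
does not call the real allocator: struct va_allocator, va_alloc() and
alloc_guarded_stack() are hypothetical stand-ins that only mimic an allocator
growing upwards, with a guard page reserved first and the stack then placed
at an address aligned to twice its size. A 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE	UINT64_C(4096)		/* assumed page size */

/* Hypothetical stand-in for a private VA allocator that grows upwards. */
struct va_allocator {
	uint64_t next;
};

struct stack_layout {
	uint64_t guard;		/* unbacked guard page */
	uint64_t stack_lo;	/* lowest stack address */
	uint64_t stack_hi;	/* initial SP, one past the stack */
};

/* Round x up to a power-of-two alignment. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

static uint64_t va_alloc(struct va_allocator *a, uint64_t size, uint64_t align)
{
	uint64_t va = align_up(a->next, align);

	a->next = va + size;
	return va;
}

/* Reserve the guard page first, then the stack aligned to 2 * PAGE_SIZE. */
static struct stack_layout alloc_guarded_stack(struct va_allocator *a)
{
	struct stack_layout l;

	l.guard = va_alloc(a, PAGE_SIZE, PAGE_SIZE);
	l.stack_lo = va_alloc(a, PAGE_SIZE, 2 * PAGE_SIZE);
	l.stack_hi = l.stack_lo + PAGE_SIZE;
	return l;
}

static void layout_selftest(void)
{
	struct va_allocator a = { UINT64_C(0x11000) };
	struct stack_layout l = alloc_guarded_stack(&a);

	assert(l.guard == UINT64_C(0x11000));
	assert(l.stack_lo == UINT64_C(0x12000));	/* directly above the guard */
	assert(l.stack_lo % (2 * PAGE_SIZE) == 0);
	assert(l.stack_hi == UINT64_C(0x13000));
}
```

In this model, if the allocator cursor is not already positioned so that the
stack lands right after the guard page, the alignment step skips an extra
page, which is also left unbacked.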
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index 27af337f9fea..5f3a4002f9c5 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -105,11 +105,28 @@ static int recreate_hyp_mappings(phys_addr_t phys, unsigned long size,
 		if (ret)
 			return ret;
 
-		end = (void *)per_cpu_ptr(&kvm_init_params, i)->stack_hyp_va;
+		/*
+		 * Private mappings are allocated upwards from __io_map_base
+		 * so allocate the guard page first then the stack.
+		 */
+		start = (void *)pkvm_alloc_private_va_range(PAGE_SIZE, PAGE_SIZE);
+		if (IS_ERR_OR_NULL(start))
+			return start ? PTR_ERR(start) : -ENOMEM;
+
+		/*
+		 * The stack is aligned to twice its size to facilitate overflow
+		 * detection.
+		 */
+		end = (void *)per_cpu_ptr(&kvm_init_params, i)->stack_pa;
 		start = end - PAGE_SIZE;
-		ret = pkvm_create_mappings(start, end, PAGE_HYP);
-		if (ret)
-			return ret;
+		start = (void *)__pkvm_create_private_mapping((phys_addr_t)start,
+					PAGE_SIZE, PAGE_SIZE * 2, PAGE_HYP);
+		if (IS_ERR_OR_NULL(start))
+			return start ? PTR_ERR(start) : -ENOMEM;
+		end = start + PAGE_SIZE;
+
+		/* Update stack_hyp_va to end of the stack's private VA range */
+		per_cpu_ptr(&kvm_init_params, i)->stack_hyp_va = (unsigned long) end;
 	}
 
 	/*
--
2.35.1.473.g83b2b277ed-goog
On Wed, Feb 23, 2022 at 9:15 PM Kalesh Singh <[email protected]> wrote:
>
> Hi all,
>
> This is v3 of the nVHE hypervisor stack enhancements.
Please find the latest version v4, posted at:
https://lore.kernel.org/r/[email protected]/
Thanks,
Kalesh