2022-07-21 05:58:29

by Kalesh Singh

Subject: [PATCH v5 00/17] KVM nVHE Hypervisor stack unwinder

Hi all,

This is v5 of the nVHE hypervisor stacktrace support. The series is based
on arm64 for-next/stacktrace.

The previous versions were posted at:
v4: https://lore.kernel.org/r/[email protected]/
v3: https://lore.kernel.org/r/[email protected]/
v2: https://lore.kernel.org/r/[email protected]/
v1: https://lore.kernel.org/r/[email protected]/

The main updates in this version are some refactoring to move code out of
stacktrace/nvhe.h (leaving only the unwinder implementation in the header);
fixing the symbolization of the hyp stacktrace when KASLR is enabled;
and addressing the other minor review comments.

Patch 18 (KVM: arm64: Dump nVHE hypervisor stack on panic) was also squashed
into earlier patches.

The previous cover letter is copied below for convenience.

Thanks all for your feedback.

--Kalesh

============

KVM nVHE Stack Unwinding
===

nVHE has two modes of operation: protected (pKVM) and unprotected
(conventional nVHE). Depending on the mode, a slightly different approach
is used to dump the hypervisor stacktrace but the core unwinding logic
remains the same.

Protected nVHE (pKVM) stacktraces
====

In protected nVHE mode, the host cannot directly access hypervisor memory.

The hypervisor stack unwinding happens in EL2 and is made accessible to
the host via a shared buffer. Symbolizing and printing the stacktrace
addresses is delegated to the host and happens in EL1.
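
A rough sketch of the shared-buffer contract (the names match the patches
in this series; the consumer loop is illustrative, not the exact code):

  /*
   * Per-CPU buffer written by the hypervisor in EL2 and read by the
   * host in EL1: an array of return addresses, terminated by a zero
   * entry.
   *
   *   pkvm_stacktrace[cpu]: [ pc0 | pc1 | ... | pcN | 0 ]
   */
  unsigned long *entry = (unsigned long *)this_cpu_ptr_nvhe_sym(pkvm_stacktrace);

  for (; *entry; entry++)
          /* symbolize and print *entry */;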

Non-protected (Conventional) nVHE stacktraces
====

In non-protected mode, the host is able to directly access the hypervisor
stack pages.

The hypervisor stack unwinding and dumping of the stacktrace is performed
by the host in EL1, as this avoids the memory overhead of setting up
shared buffers between the host and hypervisor.

Reusing the Core Unwinding Logic
====

Since the hypervisor cannot link against kernel code in protected mode,
the common stack unwinding code is moved to a shared header to allow reuse
in the nVHE hypervisor.

Reducing the memory footprint
====

In this version the below steps were taken to reduce the memory usage of
nVHE stack unwinding:

1) The nVHE overflow stack is reduced from PAGE_SIZE to 4KB; beneficial
for configurations with non-4KB pages (16KB or 64KB pages).
2) In protected nVHE mode (pKVM), the shared stacktrace buffers with the
host are reduced from PAGE_SIZE to the minimum size required.
3) On systems other than Android, conventional nVHE makes up the vast
majority of use cases, so pKVM stack tracing is disabled by default
(!CONFIG_PROTECTED_NVHE_STACKTRACE), which avoids the memory cost of
setting up shared buffers.
4) In non-protected nVHE mode (conventional nVHE), the stack unwinding
is done directly in EL1 by the host and no shared buffers with the
hypervisor are needed.

Sample Output
====

Below is an example of the output from a simple stack overflow test:

[ 126.862960] kvm [371]: nVHE hyp panic at: [<ffff8000090a51d0>] __kvm_nvhe_recursive_death+0x10/0x34!
[ 126.869920] kvm [371]: Protected nVHE HYP call trace:
[ 126.870528] kvm [371]: [<ffff8000090a5570>] __kvm_nvhe_hyp_panic+0xac/0xf8
[ 126.871342] kvm [371]: [<ffff8000090a55cc>] __kvm_nvhe_hyp_panic_bad_stack+0x10/0x10
[ 126.872174] kvm [371]: [<ffff8000090a51e4>] __kvm_nvhe_recursive_death+0x24/0x34
[ 126.872971] kvm [371]: [<ffff8000090a51e4>] __kvm_nvhe_recursive_death+0x24/0x34
. . .

[ 126.927314] kvm [371]: [<ffff8000090a51e4>] __kvm_nvhe_recursive_death+0x24/0x34
[ 126.927727] kvm [371]: [<ffff8000090a51e4>] __kvm_nvhe_recursive_death+0x24/0x34
[ 126.928137] kvm [371]: [<ffff8000090a4de4>] __kvm_nvhe___kvm_vcpu_run+0x30/0x40c
[ 126.928561] kvm [371]: [<ffff8000090a7b64>] __kvm_nvhe_handle___kvm_vcpu_run+0x30/0x48
[ 126.928984] kvm [371]: [<ffff8000090a78b8>] __kvm_nvhe_handle_trap+0xc4/0x128
[ 126.929385] kvm [371]: [<ffff8000090a6864>] __kvm_nvhe___host_exit+0x64/0x64
[ 126.929804] kvm [371]: ---- End of Protected nVHE HYP call trace ----

============

Kalesh Singh (17):
arm64: stacktrace: Add shared header for common stack unwinding code
arm64: stacktrace: Factor out on_accessible_stack_common()
arm64: stacktrace: Factor out unwind_next_common()
arm64: stacktrace: Handle frame pointer from different address spaces
arm64: stacktrace: Factor out common unwind()
arm64: stacktrace: Add description of stacktrace/common.h
KVM: arm64: On stack overflow switch to hyp overflow_stack
KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig
KVM: arm64: Allocate shared pKVM hyp stacktrace buffers
KVM: arm64: Stub implementation of pKVM HYP stack unwinder
KVM: arm64: Stub implementation of non-protected nVHE HYP stack
unwinder
KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace
KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace
KVM: arm64: Implement protected nVHE hyp stack unwinder
KVM: arm64: Implement non-protected nVHE hyp stack unwinder
KVM: arm64: Introduce pkvm_dump_backtrace()
KVM: arm64: Introduce hyp_dump_backtrace()

arch/arm64/include/asm/kvm_asm.h | 16 ++
arch/arm64/include/asm/memory.h | 8 +
arch/arm64/include/asm/stacktrace.h | 92 +++++----
arch/arm64/include/asm/stacktrace/common.h | 230 +++++++++++++++++++++
arch/arm64/include/asm/stacktrace/nvhe.h | 199 ++++++++++++++++++
arch/arm64/kernel/stacktrace.c | 157 --------------
arch/arm64/kvm/Kconfig | 15 ++
arch/arm64/kvm/arm.c | 2 +-
arch/arm64/kvm/handle_exit.c | 101 +++++++++
arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
arch/arm64/kvm/hyp/nvhe/host.S | 9 +-
arch/arm64/kvm/hyp/nvhe/stacktrace.c | 116 +++++++++++
arch/arm64/kvm/hyp/nvhe/switch.c | 6 +
13 files changed, 749 insertions(+), 204 deletions(-)
create mode 100644 arch/arm64/include/asm/stacktrace/common.h
create mode 100644 arch/arm64/include/asm/stacktrace/nvhe.h
create mode 100644 arch/arm64/kvm/hyp/nvhe/stacktrace.c


base-commit: 82a592c13b0aeff94d84d54183dae0b26384c95f
--
2.37.0.170.g444d1eabd0-goog


2022-07-21 05:58:38

by Kalesh Singh

Subject: [PATCH v5 01/17] arm64: stacktrace: Add shared header for common stack unwinding code

In order to reuse the arm64 stack unwinding logic for the nVHE
hypervisor stack, move the common code to a shared header
(arch/arm64/include/asm/stacktrace/common.h).

The nVHE hypervisor cannot safely link against kernel code, so we
make use of the shared header to avoid duplicated logic later in
this series.

Signed-off-by: Kalesh Singh <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
Reviewed-by: Fuad Tabba <[email protected]>
---

Changes in v5:
- Add Reviewed-by tags from Mark Brown and Fuad

arch/arm64/include/asm/stacktrace.h | 35 +------
arch/arm64/include/asm/stacktrace/common.h | 105 +++++++++++++++++++++
arch/arm64/kernel/stacktrace.c | 57 -----------
3 files changed, 106 insertions(+), 91 deletions(-)
create mode 100644 arch/arm64/include/asm/stacktrace/common.h

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index aec9315bf156..79f455b37c84 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -8,52 +8,19 @@
#include <linux/percpu.h>
#include <linux/sched.h>
#include <linux/sched/task_stack.h>
-#include <linux/types.h>
#include <linux/llist.h>

#include <asm/memory.h>
#include <asm/ptrace.h>
#include <asm/sdei.h>

-enum stack_type {
- STACK_TYPE_UNKNOWN,
- STACK_TYPE_TASK,
- STACK_TYPE_IRQ,
- STACK_TYPE_OVERFLOW,
- STACK_TYPE_SDEI_NORMAL,
- STACK_TYPE_SDEI_CRITICAL,
- __NR_STACK_TYPES
-};
-
-struct stack_info {
- unsigned long low;
- unsigned long high;
- enum stack_type type;
-};
+#include <asm/stacktrace/common.h>

extern void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk,
const char *loglvl);

DECLARE_PER_CPU(unsigned long *, irq_stack_ptr);

-static inline bool on_stack(unsigned long sp, unsigned long size,
- unsigned long low, unsigned long high,
- enum stack_type type, struct stack_info *info)
-{
- if (!low)
- return false;
-
- if (sp < low || sp + size < sp || sp + size > high)
- return false;
-
- if (info) {
- info->low = low;
- info->high = high;
- info->type = type;
- }
- return true;
-}
-
static inline bool on_irq_stack(unsigned long sp, unsigned long size,
struct stack_info *info)
{
diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
new file mode 100644
index 000000000000..64ae4f6b06fe
--- /dev/null
+++ b/arch/arm64/include/asm/stacktrace/common.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Common arm64 stack unwinder code.
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ */
+#ifndef __ASM_STACKTRACE_COMMON_H
+#define __ASM_STACKTRACE_COMMON_H
+
+#include <linux/bitmap.h>
+#include <linux/bitops.h>
+#include <linux/types.h>
+
+enum stack_type {
+ STACK_TYPE_UNKNOWN,
+ STACK_TYPE_TASK,
+ STACK_TYPE_IRQ,
+ STACK_TYPE_OVERFLOW,
+ STACK_TYPE_SDEI_NORMAL,
+ STACK_TYPE_SDEI_CRITICAL,
+ __NR_STACK_TYPES
+};
+
+struct stack_info {
+ unsigned long low;
+ unsigned long high;
+ enum stack_type type;
+};
+
+/*
+ * A snapshot of a frame record or fp/lr register values, along with some
+ * accounting information necessary for robust unwinding.
+ *
+ * @fp: The fp value in the frame record (or the real fp)
+ * @pc: The lr value in the frame record (or the real lr)
+ *
+ * @stacks_done: Stacks which have been entirely unwound, for which it is no
+ * longer valid to unwind to.
+ *
+ * @prev_fp: The fp that pointed to this frame record, or a synthetic value
+ * of 0. This is used to ensure that within a stack, each
+ * subsequent frame record is at an increasing address.
+ * @prev_type: The type of stack this frame record was on, or a synthetic
+ * value of STACK_TYPE_UNKNOWN. This is used to detect a
+ * transition from one stack to another.
+ *
+ * @kr_cur: When KRETPROBES is selected, holds the kretprobe instance
+ * associated with the most recently encountered replacement lr
+ * value.
+ *
+ * @task: The task being unwound.
+ */
+struct unwind_state {
+ unsigned long fp;
+ unsigned long pc;
+ DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
+ unsigned long prev_fp;
+ enum stack_type prev_type;
+#ifdef CONFIG_KRETPROBES
+ struct llist_node *kr_cur;
+#endif
+ struct task_struct *task;
+};
+
+static inline bool on_stack(unsigned long sp, unsigned long size,
+ unsigned long low, unsigned long high,
+ enum stack_type type, struct stack_info *info)
+{
+ if (!low)
+ return false;
+
+ if (sp < low || sp + size < sp || sp + size > high)
+ return false;
+
+ if (info) {
+ info->low = low;
+ info->high = high;
+ info->type = type;
+ }
+ return true;
+}
+
+static inline void unwind_init_common(struct unwind_state *state,
+ struct task_struct *task)
+{
+ state->task = task;
+#ifdef CONFIG_KRETPROBES
+ state->kr_cur = NULL;
+#endif
+
+ /*
+ * Prime the first unwind.
+ *
+ * In unwind_next() we'll check that the FP points to a valid stack,
+ * which can't be STACK_TYPE_UNKNOWN, and the first unwind will be
+ * treated as a transition to whichever stack that happens to be. The
+ * prev_fp value won't be used, but we set it to 0 such that it is
+ * definitely not an accessible stack address.
+ */
+ bitmap_zero(state->stacks_done, __NR_STACK_TYPES);
+ state->prev_fp = 0;
+ state->prev_type = STACK_TYPE_UNKNOWN;
+}
+
+#endif /* __ASM_STACKTRACE_COMMON_H */
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index fcaa151b81f1..94a5dd2ab8fd 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -18,63 +18,6 @@
#include <asm/stack_pointer.h>
#include <asm/stacktrace.h>

-/*
- * A snapshot of a frame record or fp/lr register values, along with some
- * accounting information necessary for robust unwinding.
- *
- * @fp: The fp value in the frame record (or the real fp)
- * @pc: The lr value in the frame record (or the real lr)
- *
- * @stacks_done: Stacks which have been entirely unwound, for which it is no
- * longer valid to unwind to.
- *
- * @prev_fp: The fp that pointed to this frame record, or a synthetic value
- * of 0. This is used to ensure that within a stack, each
- * subsequent frame record is at an increasing address.
- * @prev_type: The type of stack this frame record was on, or a synthetic
- * value of STACK_TYPE_UNKNOWN. This is used to detect a
- * transition from one stack to another.
- *
- * @kr_cur: When KRETPROBES is selected, holds the kretprobe instance
- * associated with the most recently encountered replacement lr
- * value.
- *
- * @task: The task being unwound.
- */
-struct unwind_state {
- unsigned long fp;
- unsigned long pc;
- DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
- unsigned long prev_fp;
- enum stack_type prev_type;
-#ifdef CONFIG_KRETPROBES
- struct llist_node *kr_cur;
-#endif
- struct task_struct *task;
-};
-
-static void unwind_init_common(struct unwind_state *state,
- struct task_struct *task)
-{
- state->task = task;
-#ifdef CONFIG_KRETPROBES
- state->kr_cur = NULL;
-#endif
-
- /*
- * Prime the first unwind.
- *
- * In unwind_next() we'll check that the FP points to a valid stack,
- * which can't be STACK_TYPE_UNKNOWN, and the first unwind will be
- * treated as a transition to whichever stack that happens to be. The
- * prev_fp value won't be used, but we set it to 0 such that it is
- * definitely not an accessible stack address.
- */
- bitmap_zero(state->stacks_done, __NR_STACK_TYPES);
- state->prev_fp = 0;
- state->prev_type = STACK_TYPE_UNKNOWN;
-}
-
/*
* Start an unwind from a pt_regs.
*
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 05:58:51

by Kalesh Singh

Subject: [PATCH v5 03/17] arm64: stacktrace: Factor out unwind_next_common()

Move common unwind_next logic to stacktrace/common.h. This allows
reusing the code in the implementation of the nVHE hypervisor stack
unwinder, later in this series.

Signed-off-by: Kalesh Singh <[email protected]>
Reviewed-by: Fuad Tabba <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
---

Changes in v5:
- Add Reviewed-by tags from Mark Brown and Fuad

arch/arm64/include/asm/stacktrace/common.h | 50 ++++++++++++++++++++++
arch/arm64/kernel/stacktrace.c | 41 ++----------------
2 files changed, 54 insertions(+), 37 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
index f58b786460d3..0c5cbfdb56b5 100644
--- a/arch/arm64/include/asm/stacktrace/common.h
+++ b/arch/arm64/include/asm/stacktrace/common.h
@@ -65,6 +65,10 @@ struct unwind_state {
static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
struct stack_info *info);

+static inline bool on_accessible_stack(const struct task_struct *tsk,
+ unsigned long sp, unsigned long size,
+ struct stack_info *info);
+
static inline bool on_stack(unsigned long sp, unsigned long size,
unsigned long low, unsigned long high,
enum stack_type type, struct stack_info *info)
@@ -120,4 +124,50 @@ static inline void unwind_init_common(struct unwind_state *state,
state->prev_type = STACK_TYPE_UNKNOWN;
}

+static inline int unwind_next_common(struct unwind_state *state,
+ struct stack_info *info)
+{
+ struct task_struct *tsk = state->task;
+ unsigned long fp = state->fp;
+
+ if (fp & 0x7)
+ return -EINVAL;
+
+ if (!on_accessible_stack(tsk, fp, 16, info))
+ return -EINVAL;
+
+ if (test_bit(info->type, state->stacks_done))
+ return -EINVAL;
+
+ /*
+ * As stacks grow downward, any valid record on the same stack must be
+ * at a strictly higher address than the prior record.
+ *
+ * Stacks can nest in several valid orders, e.g.
+ *
+ * TASK -> IRQ -> OVERFLOW -> SDEI_NORMAL
+ * TASK -> SDEI_NORMAL -> SDEI_CRITICAL -> OVERFLOW
+ *
+ * ... but the nesting itself is strict. Once we transition from one
+ * stack to another, it's never valid to unwind back to that first
+ * stack.
+ */
+ if (info->type == state->prev_type) {
+ if (fp <= state->prev_fp)
+ return -EINVAL;
+ } else {
+ __set_bit(state->prev_type, state->stacks_done);
+ }
+
+ /*
+ * Record this frame record's values and location. The prev_fp and
+ * prev_type are only meaningful to the next unwind_next() invocation.
+ */
+ state->fp = READ_ONCE(*(unsigned long *)(fp));
+ state->pc = READ_ONCE(*(unsigned long *)(fp + 8));
+ state->prev_fp = fp;
+ state->prev_type = info->type;
+
+ return 0;
+}
#endif /* __ASM_STACKTRACE_COMMON_H */
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 94a5dd2ab8fd..834851939364 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -81,48 +81,15 @@ static int notrace unwind_next(struct unwind_state *state)
struct task_struct *tsk = state->task;
unsigned long fp = state->fp;
struct stack_info info;
+ int err;

/* Final frame; nothing to unwind */
if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
return -ENOENT;

- if (fp & 0x7)
- return -EINVAL;
-
- if (!on_accessible_stack(tsk, fp, 16, &info))
- return -EINVAL;
-
- if (test_bit(info.type, state->stacks_done))
- return -EINVAL;
-
- /*
- * As stacks grow downward, any valid record on the same stack must be
- * at a strictly higher address than the prior record.
- *
- * Stacks can nest in several valid orders, e.g.
- *
- * TASK -> IRQ -> OVERFLOW -> SDEI_NORMAL
- * TASK -> SDEI_NORMAL -> SDEI_CRITICAL -> OVERFLOW
- *
- * ... but the nesting itself is strict. Once we transition from one
- * stack to another, it's never valid to unwind back to that first
- * stack.
- */
- if (info.type == state->prev_type) {
- if (fp <= state->prev_fp)
- return -EINVAL;
- } else {
- __set_bit(state->prev_type, state->stacks_done);
- }
-
- /*
- * Record this frame record's values and location. The prev_fp and
- * prev_type are only meaningful to the next unwind_next() invocation.
- */
- state->fp = READ_ONCE(*(unsigned long *)(fp));
- state->pc = READ_ONCE(*(unsigned long *)(fp + 8));
- state->prev_fp = fp;
- state->prev_type = info.type;
+ err = unwind_next_common(state, &info);
+ if (err)
+ return err;

state->pc = ptrauth_strip_insn_pac(state->pc);

--
2.37.0.170.g444d1eabd0-goog

2022-07-21 05:59:06

by Kalesh Singh

Subject: [PATCH v5 06/17] arm64: stacktrace: Add description of stacktrace/common.h

Add a brief description of how to use stacktrace/common.h to implement
a stack unwinder, as illustrated by the sketch below.
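
For illustration, a minimal (hypothetical) unwinder over a single stack
could be built on the header like this; the my_stack_* bounds are
placeholders, everything else uses the helpers added earlier in this
series:

  #include <asm/stacktrace/common.h>

  /* Placeholder bounds for a single, hypothetical stack */
  extern unsigned long my_stack_low, my_stack_high;

  static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
                                       struct stack_info *info)
  {
          return false;   /* this example has no overflow stack */
  }

  static inline bool on_accessible_stack(const struct task_struct *tsk,
                                         unsigned long sp, unsigned long size,
                                         struct stack_info *info)
  {
          if (on_accessible_stack_common(tsk, sp, size, info))
                  return true;

          return on_stack(sp, size, my_stack_low, my_stack_high,
                          STACK_TYPE_TASK, info);
  }

  static inline int notrace unwind_next(struct unwind_state *state)
  {
          struct stack_info info;

          /* Validate the frame pointer and step to the next frame record */
          return unwind_next_common(state, &info, NULL);
  }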

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Add short description of each required function, per Fuad and Marc
- Add Reviewed-by tag from Fuad

arch/arm64/include/asm/stacktrace/common.h | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
index 7807752aaab1..be7920ba70b0 100644
--- a/arch/arm64/include/asm/stacktrace/common.h
+++ b/arch/arm64/include/asm/stacktrace/common.h
@@ -2,6 +2,21 @@
/*
* Common arm64 stack unwinder code.
*
+ * To implement a new arm64 stack unwinder:
+ * 1) Include this header
+ *
+ * 2) Provide implementations for the following functions:
+ * on_overflow_stack(): Returns true if SP is on the overflow
+ * stack.
+ * on_accessible_stack(): Returns true if SP is on any accessible
+ * stack.
+ * unwind_next(): Performs validation checks on the frame
+ * pointer, and transitions unwind_state
+ * to the next frame.
+ *
+ * See: arch/arm64/include/asm/stacktrace.h for reference
+ * implementations.
+ *
* Copyright (C) 2012 ARM Ltd.
*/
#ifndef __ASM_STACKTRACE_COMMON_H
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 05:59:24

by Kalesh Singh

Subject: [PATCH v5 09/17] KVM: arm64: Allocate shared pKVM hyp stacktrace buffers

In protected nVHE mode the host cannot directly access
hypervisor memory, so we will dump the hypervisor stacktrace
to a shared buffer with the host.

The minimum buffer size required, assuming the minimum frame size of
[x29, x30] (2 * sizeof(long)), is half the combined size of the
hypervisor and overflow stacks, plus an additional entry to delimit the
end of the stacktrace.
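
A worked example (assuming 8-byte longs, so a minimum 16-byte frame
yields one 8-byte entry):

  max entries = (OVERFLOW_STACK_SIZE + PAGE_SIZE) / 16
  4KB pages:  (4096 + 4096)  / 16 = 512 entries;   512 * 8 + 8 = 4104 bytes
  64KB pages: (4096 + 65536) / 16 = 4352 entries; 4352 * 8 + 8 = 34824 bytes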

The stacktrace buffers are used later in the series to dump the
nVHE hypervisor stacktrace when running in protected mode.

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Fix typo in commit text, per Marc

arch/arm64/include/asm/memory.h | 8 ++++++++
arch/arm64/kvm/hyp/nvhe/stacktrace.c | 4 ++++
2 files changed, 12 insertions(+)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 0af70d9abede..cab80a9a4086 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -113,6 +113,14 @@

#define OVERFLOW_STACK_SIZE SZ_4K

+/*
+ * With the minimum frame size of [x29, x30], exactly half the combined
+ * sizes of the hyp and overflow stacks is the maximum size needed to
+ * save the unwound stacktrace, plus an additional entry to delimit the
+ * end.
+ */
+#define NVHE_STACKTRACE_SIZE ((OVERFLOW_STACK_SIZE + PAGE_SIZE) / 2 + sizeof(long))
+
/*
* Alignment of kernel segments (e.g. .text, .data).
*
diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
index a3d5b34e1249..69e65b457f1c 100644
--- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
+++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
@@ -9,3 +9,7 @@

DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)
__aligned(16);
+
+#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
+DEFINE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)], pkvm_stacktrace);
+#endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 05:59:42

by Kalesh Singh

Subject: [PATCH v5 12/17] KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace

In protected nVHE mode, the host cannot access privately owned hypervisor
memory. Also, the hypervisor aims to remain simple to reduce the attack
surface, and does not provide any printk support.

For the above reasons, the approach taken to provide hypervisor stacktraces
in protected mode is:
1) Unwind and save the hyp stack addresses in EL2 to a shared buffer
with the host (done in this patch).
2) Delegate the dumping and symbolization of the addresses to the
host in EL1 (later patch in the series).

On hyp_panic(), the hypervisor prepares the stacktrace before returning to
the host.

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Comment/clarify pkvm_save_backtrace_entry(), per Fuad
- kvm_nvhe_unwind_init() doesn't need to be always inline; make it
inline instead to avoid linking issues, per Marc
- Use regular comments instead of doc comments, per Fuad

arch/arm64/include/asm/stacktrace/nvhe.h | 17 ++++++
arch/arm64/kvm/hyp/nvhe/stacktrace.c | 78 ++++++++++++++++++++++++
arch/arm64/kvm/hyp/nvhe/switch.c | 6 ++
3 files changed, 101 insertions(+)

diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
index 3078501f8e22..05d7e03e0a8c 100644
--- a/arch/arm64/include/asm/stacktrace/nvhe.h
+++ b/arch/arm64/include/asm/stacktrace/nvhe.h
@@ -21,6 +21,23 @@

#include <asm/stacktrace/common.h>

+/*
+ * kvm_nvhe_unwind_init - Start an unwind from the given nVHE HYP fp and pc
+ *
+ * @state : unwind_state to initialize
+ * @fp : frame pointer at which to start the unwinding.
+ * @pc : program counter at which to start the unwinding.
+ */
+static inline void kvm_nvhe_unwind_init(struct unwind_state *state,
+ unsigned long fp,
+ unsigned long pc)
+{
+ unwind_init_common(state, NULL);
+
+ state->fp = fp;
+ state->pc = pc;
+}
+
static inline bool on_accessible_stack(const struct task_struct *tsk,
unsigned long sp, unsigned long size,
struct stack_info *info)
diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
index 96c8b93320eb..60461c033a04 100644
--- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
+++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
@@ -11,4 +11,82 @@ DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)

#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
DEFINE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)], pkvm_stacktrace);
+
+/*
+ * pkvm_save_backtrace_entry - Saves a protected nVHE HYP stacktrace entry
+ *
+ * @arg : the position of the entry in the stacktrace buffer
+ * @where : the program counter corresponding to the stack frame
+ *
+ * Save the return address of a stack frame to the shared stacktrace buffer.
+ * The host can access this shared buffer from EL1 to dump the backtrace.
+ */
+static bool pkvm_save_backtrace_entry(void *arg, unsigned long where)
+{
+ unsigned long **stacktrace_entry = (unsigned long **)arg;
+ int nr_entries = NVHE_STACKTRACE_SIZE / sizeof(long);
+ unsigned long *stacktrace_start, *stacktrace_end;
+
+ stacktrace_start = (unsigned long *)this_cpu_ptr(pkvm_stacktrace);
+ stacktrace_end = stacktrace_start + nr_entries;
+
+ /*
+ * Need 2 free slots: 1 for current entry and 1 for the
+ * trailing zero entry delimiter.
+ */
+ if (*stacktrace_entry > stacktrace_end - 2)
+ return false;
+
+ /* Save the current entry */
+ **stacktrace_entry = where;
+
+ /* Add trailing zero entry delimiter */
+ *(*stacktrace_entry + 1) = 0UL;
+
+ /*
+ * Increment the current entry position. The zero entry
+ * will be overwritten by the next backtrace entry (if any)
+ */
+ ++*stacktrace_entry;
+
+ return true;
+}
+
+/*
+ * pkvm_save_backtrace - Saves the protected nVHE HYP stacktrace
+ *
+ * @fp : frame pointer at which to start the unwinding.
+ * @pc : program counter at which to start the unwinding.
+ *
+ * Save the unwound stack addresses to the shared stacktrace buffer.
+ * The host can access this shared buffer from EL1 to dump the backtrace.
+ */
+static void pkvm_save_backtrace(unsigned long fp, unsigned long pc)
+{
+ void *stacktrace_entry = (void *)this_cpu_ptr(pkvm_stacktrace);
+ struct unwind_state state;
+
+ kvm_nvhe_unwind_init(&state, fp, pc);
+
+ unwind(&state, pkvm_save_backtrace_entry, &stacktrace_entry);
+}
+#else /* !CONFIG_PROTECTED_NVHE_STACKTRACE */
+static void pkvm_save_backtrace(unsigned long fp, unsigned long pc)
+{
+}
#endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
+
+/*
+ * kvm_nvhe_prepare_backtrace - prepare to dump the nVHE backtrace
+ *
+ * @fp : frame pointer at which to start the unwinding.
+ * @pc : program counter at which to start the unwinding.
+ *
+ * Saves the information needed by the host to dump the nVHE hypervisor
+ * backtrace.
+ */
+void kvm_nvhe_prepare_backtrace(unsigned long fp, unsigned long pc)
+{
+ if (is_protected_kvm_enabled())
+ pkvm_save_backtrace(fp, pc);
+}
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 6db801db8f27..64e13445d0d9 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -34,6 +34,8 @@ DEFINE_PER_CPU(struct kvm_host_data, kvm_host_data);
DEFINE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);
DEFINE_PER_CPU(unsigned long, kvm_hyp_vector);

+extern void kvm_nvhe_prepare_backtrace(unsigned long fp, unsigned long pc);
+
static void __activate_traps(struct kvm_vcpu *vcpu)
{
u64 val;
@@ -375,6 +377,10 @@ asmlinkage void __noreturn hyp_panic(void)
__sysreg_restore_state_nvhe(host_ctxt);
}

+ /* Prepare to dump kvm nvhe hyp stacktrace */
+ kvm_nvhe_prepare_backtrace((unsigned long)__builtin_frame_address(0),
+ _THIS_IP_);
+
__hyp_do_panic(host_ctxt, spsr, elr, par);
unreachable();
}
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 05:59:57

by Kalesh Singh

Subject: [PATCH v5 13/17] KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace

In non-protected nVHE mode (non-pKVM) the host can directly access
hypervisor memory, and unwinding of the hypervisor stacktrace is
done from EL1 to save on memory for shared buffers.

To unwind the hypervisor stack from EL1 the host needs to know the
starting point for the unwind and information that will allow it to
translate hypervisor stack addresses to the corresponding kernel
addresses. This patch sets up this bookkeeping, which is made use of
later in the series.
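
The translation the host performs with this information is a fixed-offset
rebase, roughly as sketched here; the real helper
(kvm_nvhe_stack_kern_va(), added later in the series) does this per stack
type, and hyp_va / kern_stack_base are hypothetical names:

  unsigned long hyp_offset = hyp_va - stacktrace_info->stack_base;
  unsigned long kern_va = kern_stack_base + hyp_offset;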

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Use regular comments instead of doc comments, per Fuad

arch/arm64/include/asm/kvm_asm.h | 16 ++++++++++++++++
arch/arm64/include/asm/stacktrace/nvhe.h | 4 ++++
arch/arm64/kvm/hyp/nvhe/stacktrace.c | 24 ++++++++++++++++++++++++
3 files changed, 44 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 2e277f2ed671..53035763e48e 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -176,6 +176,22 @@ struct kvm_nvhe_init_params {
unsigned long vtcr;
};

+/*
+ * Used by the host in EL1 to dump the nVHE hypervisor backtrace on
+ * hyp_panic() in non-protected mode.
+ *
+ * @stack_base: hyp VA of the hyp_stack base.
+ * @overflow_stack_base: hyp VA of the hyp_overflow_stack base.
+ * @fp: hyp FP where the backtrace begins.
+ * @pc: hyp PC where the backtrace begins.
+ */
+struct kvm_nvhe_stacktrace_info {
+ unsigned long stack_base;
+ unsigned long overflow_stack_base;
+ unsigned long fp;
+ unsigned long pc;
+};
+
/* Translate a kernel address @ptr into its equivalent linear mapping */
#define kvm_ksym_ref(ptr) \
({ \
diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
index 05d7e03e0a8c..8f02803a005f 100644
--- a/arch/arm64/include/asm/stacktrace/nvhe.h
+++ b/arch/arm64/include/asm/stacktrace/nvhe.h
@@ -19,6 +19,7 @@
#ifndef __ASM_STACKTRACE_NVHE_H
#define __ASM_STACKTRACE_NVHE_H

+#include <asm/kvm_asm.h>
#include <asm/stacktrace/common.h>

/*
@@ -52,6 +53,9 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
* In protected mode, the unwinding is done by the hypervisor in EL2.
*/

+DECLARE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack);
+DECLARE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
+
#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
struct stack_info *info)
diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
index 60461c033a04..cbd365f4f26a 100644
--- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
+++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
@@ -9,6 +9,28 @@
DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)
__aligned(16);

+DEFINE_PER_CPU(struct kvm_nvhe_stacktrace_info, kvm_stacktrace_info);
+
+/*
+ * hyp_prepare_backtrace - Prepare non-protected nVHE backtrace.
+ *
+ * @fp : frame pointer at which to start the unwinding.
+ * @pc : program counter at which to start the unwinding.
+ *
+ * Save the information needed by the host to unwind the non-protected
+ * nVHE hypervisor stack in EL1.
+ */
+static void hyp_prepare_backtrace(unsigned long fp, unsigned long pc)
+{
+ struct kvm_nvhe_stacktrace_info *stacktrace_info = this_cpu_ptr(&kvm_stacktrace_info);
+ struct kvm_nvhe_init_params *params = this_cpu_ptr(&kvm_init_params);
+
+ stacktrace_info->stack_base = (unsigned long)(params->stack_hyp_va - PAGE_SIZE);
+ stacktrace_info->overflow_stack_base = (unsigned long)this_cpu_ptr(overflow_stack);
+ stacktrace_info->fp = fp;
+ stacktrace_info->pc = pc;
+}
+
#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
DEFINE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)], pkvm_stacktrace);

@@ -89,4 +111,6 @@ void kvm_nvhe_prepare_backtrace(unsigned long fp, unsigned long pc)
{
if (is_protected_kvm_enabled())
pkvm_save_backtrace(fp, pc);
+ else
+ hyp_prepare_backtrace(fp, pc);
}
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 06:00:54

by Kalesh Singh

Subject: [PATCH v5 15/17] KVM: arm64: Implement non-protected nVHE hyp stack unwinder

Implements the common framework necessary for unwind() to work
in non-protected nVHE mode:
- on_accessible_stack()
- on_overflow_stack()
- unwind_next()

The non-protected nVHE unwind() is used by the host in EL1 to unwind
and dump the hypervisor stacktrace.
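
Putting the pieces together, host-side usage ends up looking roughly like
the sketch below. The actual dumping code is introduced with
hyp_dump_backtrace() later in the series; dump_entry() and the wrapper
are hypothetical here:

  static bool dump_entry(void *arg, unsigned long where)
  {
          kvm_err(" [<%016lx>]\n", where);        /* symbolization omitted */
          return true;
  }

  static void dump_hyp_backtrace_sketch(void)
  {
          struct kvm_nvhe_stacktrace_info *stacktrace_info =
                  this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
          struct unwind_state state;

          kvm_nvhe_unwind_init(&state, stacktrace_info->fp, stacktrace_info->pc);
          unwind(&state, dump_entry, NULL);
  }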

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Use regular comments instead of doc comments, per Fuad

arch/arm64/include/asm/stacktrace/nvhe.h | 67 +++++++++++++++++++++++-
arch/arm64/kvm/arm.c | 2 +-
2 files changed, 66 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
index c3688e717136..7a6e761aa443 100644
--- a/arch/arm64/include/asm/stacktrace/nvhe.h
+++ b/arch/arm64/include/asm/stacktrace/nvhe.h
@@ -120,15 +120,78 @@ NOKPROBE_SYMBOL(unwind_next);
* (by the host in EL1).
*/

+DECLARE_KVM_NVHE_PER_CPU(unsigned long [PAGE_SIZE/sizeof(long)], overflow_stack);
+DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_stacktrace_info, kvm_stacktrace_info);
+DECLARE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
+
+/*
+ * kvm_nvhe_stack_kern_va - Convert KVM nVHE HYP stack addresses to kernel VAs
+ *
+ * The nVHE hypervisor stack is mapped in the flexible 'private' VA range, to
+ * allow for guard pages below the stack. Consequently, the fixed offset address
+ * translation macros won't work here.
+ *
+ * The kernel VA is calculated as an offset from the kernel VA of the hypervisor
+ * stack base.
+ *
+ * Returns true on success and updates @addr to its corresponding kernel VA;
+ * otherwise returns false.
+ */
+static inline bool kvm_nvhe_stack_kern_va(unsigned long *addr,
+ enum stack_type type)
+{
+ struct kvm_nvhe_stacktrace_info *stacktrace_info;
+ unsigned long hyp_base, kern_base, hyp_offset;
+
+ stacktrace_info = this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
+
+ switch (type) {
+ case STACK_TYPE_HYP:
+ kern_base = (unsigned long)*this_cpu_ptr(&kvm_arm_hyp_stack_page);
+ hyp_base = (unsigned long)stacktrace_info->stack_base;
+ break;
+ case STACK_TYPE_OVERFLOW:
+ kern_base = (unsigned long)this_cpu_ptr_nvhe_sym(overflow_stack);
+ hyp_base = (unsigned long)stacktrace_info->overflow_stack_base;
+ break;
+ default:
+ return false;
+ }
+
+ hyp_offset = *addr - hyp_base;
+
+ *addr = kern_base + hyp_offset;
+
+ return true;
+}
+
static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
struct stack_info *info)
{
- return false;
+ struct kvm_nvhe_stacktrace_info *stacktrace_info
+ = this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
+ unsigned long low = (unsigned long)stacktrace_info->overflow_stack_base;
+ unsigned long high = low + OVERFLOW_STACK_SIZE;
+
+ return on_stack(sp, size, low, high, STACK_TYPE_OVERFLOW, info);
+}
+
+static inline bool on_hyp_stack(unsigned long sp, unsigned long size,
+ struct stack_info *info)
+{
+ struct kvm_nvhe_stacktrace_info *stacktrace_info
+ = this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
+ unsigned long low = (unsigned long)stacktrace_info->stack_base;
+ unsigned long high = low + PAGE_SIZE;
+
+ return on_stack(sp, size, low, high, STACK_TYPE_HYP, info);
}

static inline int notrace unwind_next(struct unwind_state *state)
{
- return 0;
+ struct stack_info info;
+
+ return unwind_next_common(state, &info, kvm_nvhe_stack_kern_va);
}
NOKPROBE_SYMBOL(unwind_next);

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a0188144a122..6a64293108c5 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -49,7 +49,7 @@ DEFINE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);

DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector);

-static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
+DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
unsigned long kvm_arm_hyp_percpu_base[NR_CPUS];
DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);

--
2.37.0.170.g444d1eabd0-goog

2022-07-21 06:26:06

by Kalesh Singh

Subject: [PATCH v5 07/17] KVM: arm64: On stack overflow switch to hyp overflow_stack

On hyp stack overflow, switch to a 16-byte aligned secondary stack.
This provides stack space to better handle overflows, and is used in
a subsequent patch to dump the hypervisor stacktrace.

Signed-off-by: Kalesh Singh <[email protected]>
Reviewed-by: Fuad Tabba <[email protected]>
---

Changes in v5:
- Add Reviewed-by tag from Fuad

arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
arch/arm64/kvm/hyp/nvhe/host.S | 9 ++-------
arch/arm64/kvm/hyp/nvhe/stacktrace.c | 11 +++++++++++
3 files changed, 14 insertions(+), 8 deletions(-)
create mode 100644 arch/arm64/kvm/hyp/nvhe/stacktrace.c

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index f9fe4dc21b1f..524e7dad5739 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))

obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
hyp-main.o hyp-smp.o psci-relay.o early_alloc.o page_alloc.o \
- cache.o setup.o mm.o mem_protect.o sys_regs.o pkvm.o
+ cache.o setup.o mm.o mem_protect.o sys_regs.o pkvm.o stacktrace.o
obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
obj-$(CONFIG_DEBUG_LIST) += list_debug.o
diff --git a/arch/arm64/kvm/hyp/nvhe/host.S b/arch/arm64/kvm/hyp/nvhe/host.S
index ea6a397b64a6..b6c0188c4b35 100644
--- a/arch/arm64/kvm/hyp/nvhe/host.S
+++ b/arch/arm64/kvm/hyp/nvhe/host.S
@@ -177,13 +177,8 @@ SYM_FUNC_END(__host_hvc)
b hyp_panic

.L__hyp_sp_overflow\@:
- /*
- * Reset SP to the top of the stack, to allow handling the hyp_panic.
- * This corrupts the stack but is ok, since we won't be attempting
- * any unwinding here.
- */
- ldr_this_cpu x0, kvm_init_params + NVHE_INIT_STACK_HYP_VA, x1
- mov sp, x0
+ /* Switch to the overflow stack */
+ adr_this_cpu sp, overflow_stack + OVERFLOW_STACK_SIZE, x0

b hyp_panic_bad_stack
ASM_BUG()
diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
new file mode 100644
index 000000000000..a3d5b34e1249
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * KVM nVHE hypervisor stack tracing support.
+ *
+ * Copyright (C) 2022 Google LLC
+ */
+#include <asm/memory.h>
+#include <asm/percpu.h>
+
+DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)
+ __aligned(16);
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 06:28:20

by Kalesh Singh

Subject: [PATCH v5 08/17] KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig

This option can be used to disable stacktraces for the protected KVM
nVHE hypervisor, in order to save on the associated memory usage.

It is disabled by default, since protected KVM is currently not widely
used on platforms other than Android.
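
For reference, enabling it amounts to a config fragment like the
following (assuming an arm64 kernel with KVM already configured):

  # PROTECTED_NVHE_STACKTRACE depends on NVHE_EL2_DEBUG, which depends on KVM
  CONFIG_KVM=y
  CONFIG_NVHE_EL2_DEBUG=y
  CONFIG_PROTECTED_NVHE_STACKTRACE=y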

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Make PROTECTED_NVHE_STACKTRACE depend on NVHE_EL2_DEBUG, per Marc

arch/arm64/kvm/Kconfig | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index 8a5fbbf084df..09c995869916 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -46,6 +46,21 @@ menuconfig KVM

If unsure, say N.

+config PROTECTED_NVHE_STACKTRACE
+ bool "Protected KVM hypervisor stacktraces"
+ depends on NVHE_EL2_DEBUG
+ default n
+ help
+ Say Y here to enable pKVM hypervisor stacktraces on hyp_panic()
+
+ If you are not using protected nVHE (pKVM), say N.
+
+ If using protected nVHE mode, but cannot afford the associated
+ memory cost (less than 0.75 page per CPU) of pKVM stacktraces,
+ say N.
+
+ If unsure, say N.
+
config NVHE_EL2_DEBUG
bool "Debug mode for non-VHE EL2 object"
depends on KVM
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 06:29:30

by Kalesh Singh

Subject: [PATCH v5 16/17] KVM: arm64: Introduce pkvm_dump_backtrace()

Dumps the pKVM hypervisor backtrace from EL1 by reading the unwound
addresses from the shared stacktrace buffer.

The nVHE hyp backtrace is dumped on hyp_panic(), before panicking the
host.

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Move code out from nvhe.h header to handle_exit.c, per Marc
- Fix stacktrace symbolization when CONFIG_RANDOMIZE_BASE is enabled,
per Fuad
- Use regular comments instead of doc comments, per Fuad

arch/arm64/kvm/handle_exit.c | 54 ++++++++++++++++++++++++++++++++++++
1 file changed, 54 insertions(+)

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index f66c0142b335..ad568da5c7d7 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -318,6 +318,57 @@ void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index)
kvm_handle_guest_serror(vcpu, kvm_vcpu_get_esr(vcpu));
}

+#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
+DECLARE_KVM_NVHE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)],
+ pkvm_stacktrace);
+
+/*
+ * pkvm_dump_backtrace - Dump the protected nVHE HYP backtrace.
+ *
+ * @hyp_offset: hypervisor offset, used for address translation.
+ *
+ * Dumping of the pKVM HYP backtrace is done by reading the
+ * stack addresses from the shared stacktrace buffer, since the
+ * host cannot directly access hypervisor memory in protected
+ * mode.
+ */
+static void pkvm_dump_backtrace(unsigned long hyp_offset)
+{
+ unsigned long *stacktrace_entry
+ = (unsigned long *)this_cpu_ptr_nvhe_sym(pkvm_stacktrace);
+ unsigned long va_mask, pc;
+
+ va_mask = GENMASK_ULL(vabits_actual - 1, 0);
+
+ kvm_err("Protected nVHE HYP call trace:\n");
+
+ /* The stack trace is terminated by a null entry */
+ for (; *stacktrace_entry; stacktrace_entry++) {
+ /* Mask tags and convert to kern addr */
+ pc = (*stacktrace_entry & va_mask) + hyp_offset;
+ kvm_err(" [<%016lx>] %pB\n", pc, (void *)(pc + kaslr_offset()));
+ }
+
+ kvm_err("---- End of Protected nVHE HYP call trace ----\n");
+}
+#else /* !CONFIG_PROTECTED_NVHE_STACKTRACE */
+static void pkvm_dump_backtrace(unsigned long hyp_offset)
+{
+ kvm_err("Cannot dump pKVM nVHE stacktrace: !CONFIG_PROTECTED_NVHE_STACKTRACE\n");
+}
+#endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
+
+/*
+ * kvm_nvhe_dump_backtrace - Dump KVM nVHE hypervisor backtrace.
+ *
+ * @hyp_offset: hypervisor offset, used for address translation.
+ */
+static void kvm_nvhe_dump_backtrace(unsigned long hyp_offset)
+{
+ if (is_protected_kvm_enabled())
+ pkvm_dump_backtrace(hyp_offset);
+}
+
void __noreturn __cold nvhe_hyp_panic_handler(u64 esr, u64 spsr,
u64 elr_virt, u64 elr_phys,
u64 par, uintptr_t vcpu,
@@ -353,6 +404,9 @@ void __noreturn __cold nvhe_hyp_panic_handler(u64 esr, u64 spsr,
(void *)panic_addr);
}

+ /* Dump the nVHE hypervisor backtrace */
+ kvm_nvhe_dump_backtrace(hyp_offset);
+
/*
* Hyp has panicked and we're going to handle that by panicking the
* kernel. The kernel offset will be revealed in the panic so we're
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 06:32:35

by Kalesh Singh

Subject: [PATCH v5 02/17] arm64: stacktrace: Factor out on_accessible_stack_common()

Move common on_accessible_stack checks to stacktrace/common.h. This is
used in the implementation of the nVHE hypervisor unwinder later in
this series.

Signed-off-by: Kalesh Singh <[email protected]>
Reviewed-by: Fuad Tabba <[email protected]>
Reviewed-by: Mark Brown <[email protected]>
---

Changes in v5:
- Add Reviewed-by tags from Mark Brown and Fuad
- Remove random whitespace change, per Mark Brown

arch/arm64/include/asm/stacktrace.h | 6 ++----
arch/arm64/include/asm/stacktrace/common.h | 18 ++++++++++++++++++
2 files changed, 20 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 79f455b37c84..43f4b4a6d383 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -65,8 +65,8 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
unsigned long sp, unsigned long size,
struct stack_info *info)
{
- if (info)
- info->type = STACK_TYPE_UNKNOWN;
+ if (on_accessible_stack_common(tsk, sp, size, info))
+ return true;

if (on_task_stack(tsk, sp, size, info))
return true;
@@ -74,8 +74,6 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
return false;
if (on_irq_stack(sp, size, info))
return true;
- if (on_overflow_stack(sp, size, info))
- return true;
if (on_sdei_stack(sp, size, info))
return true;

diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
index 64ae4f6b06fe..f58b786460d3 100644
--- a/arch/arm64/include/asm/stacktrace/common.h
+++ b/arch/arm64/include/asm/stacktrace/common.h
@@ -62,6 +62,9 @@ struct unwind_state {
struct task_struct *task;
};

+static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
+ struct stack_info *info);
+
static inline bool on_stack(unsigned long sp, unsigned long size,
unsigned long low, unsigned long high,
enum stack_type type, struct stack_info *info)
@@ -80,6 +83,21 @@ static inline bool on_stack(unsigned long sp, unsigned long size,
return true;
}

+static inline bool on_accessible_stack_common(const struct task_struct *tsk,
+ unsigned long sp,
+ unsigned long size,
+ struct stack_info *info)
+{
+ if (info)
+ info->type = STACK_TYPE_UNKNOWN;
+
+ /*
+ * Both the kernel and nvhe hypervisor make use of
+ * an overflow_stack
+ */
+ return on_overflow_stack(sp, size, info);
+}
+
static inline void unwind_init_common(struct unwind_state *state,
struct task_struct *task)
{
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 06:33:25

by Kalesh Singh

Subject: [PATCH v5 10/17] KVM: arm64: Stub implementation of pKVM HYP stack unwinder

Add stub implementations of the protected nVHE stack unwinder, so
that the series continues to build. These are implemented fully
later in this series.

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Mark unwind_next() as inline, per Marc

arch/arm64/include/asm/stacktrace/nvhe.h | 59 ++++++++++++++++++++++++
arch/arm64/kvm/hyp/nvhe/stacktrace.c | 3 +-
2 files changed, 60 insertions(+), 2 deletions(-)
create mode 100644 arch/arm64/include/asm/stacktrace/nvhe.h

diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
new file mode 100644
index 000000000000..80d71932afff
--- /dev/null
+++ b/arch/arm64/include/asm/stacktrace/nvhe.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * KVM nVHE hypervisor stack tracing support.
+ *
+ * The unwinder implementation depends on the nVHE mode:
+ *
+ * 1) pKVM (protected nVHE) mode - the host cannot directly access
+ * the HYP memory. The stack is unwound in EL2 and dumped to a shared
+ * buffer where the host can read and print the stacktrace.
+ *
+ * Copyright (C) 2022 Google LLC
+ */
+#ifndef __ASM_STACKTRACE_NVHE_H
+#define __ASM_STACKTRACE_NVHE_H
+
+#include <asm/stacktrace/common.h>
+
+static inline bool on_accessible_stack(const struct task_struct *tsk,
+ unsigned long sp, unsigned long size,
+ struct stack_info *info)
+{
+ return false;
+}
+
+#ifdef __KVM_NVHE_HYPERVISOR__
+/*
+ * Protected nVHE HYP stack unwinder
+ *
+ * In protected mode, the unwinding is done by the hypervisor in EL2.
+ */
+
+#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
+static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
+ struct stack_info *info)
+{
+ return false;
+}
+
+static inline int notrace unwind_next(struct unwind_state *state)
+{
+ return 0;
+}
+NOKPROBE_SYMBOL(unwind_next);
+#else /* !CONFIG_PROTECTED_NVHE_STACKTRACE */
+static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
+ struct stack_info *info)
+{
+ return false;
+}
+
+static inline int notrace unwind_next(struct unwind_state *state)
+{
+ return 0;
+}
+NOKPROBE_SYMBOL(unwind_next);
+#endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
+
+#endif /* __KVM_NVHE_HYPERVISOR__ */
+#endif /* __ASM_STACKTRACE_NVHE_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
index 69e65b457f1c..96c8b93320eb 100644
--- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
+++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
@@ -4,8 +4,7 @@
*
* Copyright (C) 2022 Google LLC
*/
-#include <asm/memory.h>
-#include <asm/percpu.h>
+#include <asm/stacktrace/nvhe.h>

DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)
__aligned(16);
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 06:33:41

by Kalesh Singh

Subject: [PATCH v5 04/17] arm64: stacktrace: Handle frame pointer from different address spaces

The unwinder code is made reusable so that it can be used to
unwind various types of stacks. One use case is unwinding the
nVHE hyp stack from the host (EL1) in non-protected mode. This
means that the unwinder must be able to translate HYP stack
addresses to kernel addresses.

Add a callback (stack_trace_translate_fp_fn) to allow specifying
the translation function.
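
For example, a (hypothetical) callback translating frame pointers that
live on a foreign overflow stack might look like the sketch below; the
*_base variables are placeholder bookkeeping, and a real callback
(kvm_nvhe_stack_kern_va()) is added later in the series:

  /* Placeholder bookkeeping for the foreign stack's two mappings */
  extern unsigned long kern_overflow_base, foreign_overflow_base;

  static bool my_translate_fp(unsigned long *fp, enum stack_type type)
  {
          if (type != STACK_TYPE_OVERFLOW)
                  return false;

          /* Rebase the fp onto the kernel's mapping of the same stack */
          *fp = kern_overflow_base + (*fp - foreign_overflow_base);
          return true;
  }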

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Fix typo in commit text, per Fuad
- Update unwind_next_common() to not have side effects on failure, per Fuad
- Use regular comment instead of doc comments, per Fuad

arch/arm64/include/asm/stacktrace/common.h | 29 +++++++++++++++++++---
arch/arm64/kernel/stacktrace.c | 2 +-
2 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
index 0c5cbfdb56b5..e89c8c39858d 100644
--- a/arch/arm64/include/asm/stacktrace/common.h
+++ b/arch/arm64/include/asm/stacktrace/common.h
@@ -124,11 +124,25 @@ static inline void unwind_init_common(struct unwind_state *state,
state->prev_type = STACK_TYPE_UNKNOWN;
}

+/*
+ * stack_trace_translate_fp_fn() - Translates a non-kernel frame pointer to
+ * a kernel address.
+ *
+ * @fp: the frame pointer to be updated to its kernel address.
+ * @type: the stack type associated with frame pointer @fp
+ *
+ * Returns true on success, and @fp is updated to the corresponding
+ * kernel virtual address; otherwise returns false.
+ */
+typedef bool (*stack_trace_translate_fp_fn)(unsigned long *fp,
+ enum stack_type type);
+
static inline int unwind_next_common(struct unwind_state *state,
- struct stack_info *info)
+ struct stack_info *info,
+ stack_trace_translate_fp_fn translate_fp)
{
+ unsigned long fp = state->fp, kern_fp = fp;
struct task_struct *tsk = state->task;
- unsigned long fp = state->fp;

if (fp & 0x7)
return -EINVAL;
@@ -139,6 +153,13 @@ static inline int unwind_next_common(struct unwind_state *state,
if (test_bit(info->type, state->stacks_done))
return -EINVAL;

+ /*
+ * If fp is not from the current address space perform the necessary
+ * translation before dereferencing it to get the next fp.
+ */
+ if (translate_fp && !translate_fp(&kern_fp, info->type))
+ return -EINVAL;
+
/*
* As stacks grow downward, any valid record on the same stack must be
* at a strictly higher address than the prior record.
@@ -163,8 +184,8 @@ static inline int unwind_next_common(struct unwind_state *state,
* Record this frame record's values and location. The prev_fp and
* prev_type are only meaningful to the next unwind_next() invocation.
*/
- state->fp = READ_ONCE(*(unsigned long *)(fp));
- state->pc = READ_ONCE(*(unsigned long *)(fp + 8));
+ state->fp = READ_ONCE(*(unsigned long *)(kern_fp));
+ state->pc = READ_ONCE(*(unsigned long *)(kern_fp + 8));
state->prev_fp = fp;
state->prev_type = info->type;

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index 834851939364..eef3cf6bf2d7 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -87,7 +87,7 @@ static int notrace unwind_next(struct unwind_state *state)
if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
return -ENOENT;

- err = unwind_next_common(state, &info);
+ err = unwind_next_common(state, &info, NULL);
if (err)
return err;

--
2.37.0.170.g444d1eabd0-goog

2022-07-21 06:33:48

by Kalesh Singh

Subject: [PATCH v5 14/17] KVM: arm64: Implement protected nVHE hyp stack unwinder

Implements the common framework necessary for unwind() to work in
the protected nVHE context:
- on_accessible_stack()
- on_overflow_stack()
- unwind_next()

Protected nVHE unwind() is used to unwind and save the hyp stack
addresses to the shared stacktrace buffer. The host reads the
entries in this buffer, then symbolizes and dumps the stacktrace
(in a later patch in the series).

Signed-off-by: Kalesh Singh <[email protected]>
---
arch/arm64/include/asm/stacktrace/common.h | 2 ++
arch/arm64/include/asm/stacktrace/nvhe.h | 34 ++++++++++++++++++++--
2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
index be7920ba70b0..73fd9e143c4a 100644
--- a/arch/arm64/include/asm/stacktrace/common.h
+++ b/arch/arm64/include/asm/stacktrace/common.h
@@ -34,6 +34,7 @@ enum stack_type {
STACK_TYPE_OVERFLOW,
STACK_TYPE_SDEI_NORMAL,
STACK_TYPE_SDEI_CRITICAL,
+ STACK_TYPE_HYP,
__NR_STACK_TYPES
};

@@ -186,6 +187,7 @@ static inline int unwind_next_common(struct unwind_state *state,
*
* TASK -> IRQ -> OVERFLOW -> SDEI_NORMAL
* TASK -> SDEI_NORMAL -> SDEI_CRITICAL -> OVERFLOW
+ * HYP -> OVERFLOW
*
* ... but the nesting itself is strict. Once we transition from one
* stack to another, it's never valid to unwind back to that first
diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
index 8f02803a005f..c3688e717136 100644
--- a/arch/arm64/include/asm/stacktrace/nvhe.h
+++ b/arch/arm64/include/asm/stacktrace/nvhe.h
@@ -39,10 +39,19 @@ static inline void kvm_nvhe_unwind_init(struct unwind_state *state,
state->pc = pc;
}

+static inline bool on_hyp_stack(unsigned long sp, unsigned long size,
+ struct stack_info *info);
+
static inline bool on_accessible_stack(const struct task_struct *tsk,
unsigned long sp, unsigned long size,
struct stack_info *info)
{
+ if (on_accessible_stack_common(tsk, sp, size, info))
+ return true;
+
+ if (on_hyp_stack(sp, size, info))
+ return true;
+
return false;
}

@@ -60,12 +69,27 @@ DECLARE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
struct stack_info *info)
{
- return false;
+ unsigned long low = (unsigned long)this_cpu_ptr(overflow_stack);
+ unsigned long high = low + OVERFLOW_STACK_SIZE;
+
+ return on_stack(sp, size, low, high, STACK_TYPE_OVERFLOW, info);
+}
+
+static inline bool on_hyp_stack(unsigned long sp, unsigned long size,
+ struct stack_info *info)
+{
+ struct kvm_nvhe_init_params *params = this_cpu_ptr(&kvm_init_params);
+ unsigned long high = params->stack_hyp_va;
+ unsigned long low = high - PAGE_SIZE;
+
+ return on_stack(sp, size, low, high, STACK_TYPE_HYP, info);
}

static inline int notrace unwind_next(struct unwind_state *state)
{
- return 0;
+ struct stack_info info;
+
+ return unwind_next_common(state, &info, NULL);
}
NOKPROBE_SYMBOL(unwind_next);
#else /* !CONFIG_PROTECTED_NVHE_STACKTRACE */
@@ -75,6 +99,12 @@ static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
return false;
}

+static inline bool on_hyp_stack(unsigned long sp, unsigned long size,
+ struct stack_info *info)
+{
+ return false;
+}
+
static inline int notrace unwind_next(struct unwind_state *state)
{
return 0;
--
2.37.0.170.g444d1eabd0-goog

2022-07-21 06:40:11

by Kalesh Singh

Subject: [PATCH v5 11/17] KVM: arm64: Stub implementation of non-protected nVHE HYP stack unwinder

Add stub implementations of the non-protected nVHE stack unwinder,
so that the series continues to build. These are implemented fully
later in this series.

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Mark unwind_next() as inline, per Marc
- Comment !__KVM_NVHE_HYPERVISOR__ unwinder path, per Marc

arch/arm64/include/asm/stacktrace/nvhe.h | 26 ++++++++++++++++++++++++
1 file changed, 26 insertions(+)

diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
index 80d71932afff..3078501f8e22 100644
--- a/arch/arm64/include/asm/stacktrace/nvhe.h
+++ b/arch/arm64/include/asm/stacktrace/nvhe.h
@@ -8,6 +8,12 @@
* the HYP memory. The stack is unwound in EL2 and dumped to a shared
* buffer where the host can read and print the stacktrace.
*
+ * 2) Non-protected nVHE mode - the host can directly access the
+ * HYP stack pages and unwind the HYP stack in EL1. This saves having
+ * to allocate shared buffers for the host to read the unwound
+ * stacktrace.
+ *
+ *
* Copyright (C) 2022 Google LLC
*/
#ifndef __ASM_STACKTRACE_NVHE_H
@@ -55,5 +61,25 @@ static inline int notrace unwind_next(struct unwind_state *state)
NOKPROBE_SYMBOL(unwind_next);
#endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */

+#else /* !__KVM_NVHE_HYPERVISOR__ */
+/*
+ * Conventional (non-protected) nVHE HYP stack unwinder
+ *
+ * In non-protected mode, the unwinding is done from kernel proper context
+ * (by the host in EL1).
+ */
+
+static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
+ struct stack_info *info)
+{
+ return false;
+}
+
+static inline int notrace unwind_next(struct unwind_state *state)
+{
+ return 0;
+}
+NOKPROBE_SYMBOL(unwind_next);
+
#endif /* __KVM_NVHE_HYPERVISOR__ */
#endif /* __ASM_STACKTRACE_NVHE_H */
--
2.37.0.170.g444d1eabd0-goog
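
After this patch, the overall shape of stacktrace/nvhe.h is as follows
(a condensed sketch of the preprocessor structure, not literal contents):

#ifdef __KVM_NVHE_HYPERVISOR__
        /* Protected unwinder: runs at EL2, inside the hyp object. */
        #ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
                /* real pKVM implementation (later in the series) */
        #else
                /* stubs: on_overflow_stack()/unwind_next() do nothing */
        #endif
#else   /* !__KVM_NVHE_HYPERVISOR__ */
        /* Non-protected unwinder: run by the host kernel at EL1. */
#endif  /* __KVM_NVHE_HYPERVISOR__ */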

2022-07-21 06:41:23

by Kalesh Singh

[permalink] [raw]
Subject: [PATCH v5 05/17] arm64: stacktrace: Factor out common unwind()

Move unwind() to stacktrace/common.h and, as a result, the kernel's
unwind_next() to asm/stacktrace.h. This allows reusing unwind() in the
implementation of the nVHE HYP stack unwinder later in the series.

Signed-off-by: Kalesh Singh <[email protected]>
Reviewed-by: Fuad Tabba <[email protected]>
---

Changes in v5:
- Add Reviewed-by tag from Fuad

arch/arm64/include/asm/stacktrace.h | 51 ++++++++++++++++
arch/arm64/include/asm/stacktrace/common.h | 19 ++++++
arch/arm64/kernel/stacktrace.c | 67 ----------------------
3 files changed, 70 insertions(+), 67 deletions(-)

diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 43f4b4a6d383..ea828579a98b 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -11,6 +11,7 @@
#include <linux/llist.h>

#include <asm/memory.h>
+#include <asm/pointer_auth.h>
#include <asm/ptrace.h>
#include <asm/sdei.h>

@@ -80,4 +81,54 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
return false;
}

+/*
+ * Unwind from one frame record (A) to the next frame record (B).
+ *
+ * We terminate early if the location of B indicates a malformed chain of frame
+ * records (e.g. a cycle), determined based on the location and fp value of A
+ * and the location (but not the fp value) of B.
+ */
+static inline int notrace unwind_next(struct unwind_state *state)
+{
+ struct task_struct *tsk = state->task;
+ unsigned long fp = state->fp;
+ struct stack_info info;
+ int err;
+
+ /* Final frame; nothing to unwind */
+ if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
+ return -ENOENT;
+
+ err = unwind_next_common(state, &info, NULL);
+ if (err)
+ return err;
+
+ state->pc = ptrauth_strip_insn_pac(state->pc);
+
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+ if (tsk->ret_stack &&
+ (state->pc == (unsigned long)return_to_handler)) {
+ unsigned long orig_pc;
+ /*
+ * This is a case where function graph tracer has
+ * modified a return address (LR) in a stack frame
+ * to hook a function return.
+ * So replace it to an original value.
+ */
+ orig_pc = ftrace_graph_ret_addr(tsk, NULL, state->pc,
+ (void *)state->fp);
+ if (WARN_ON_ONCE(state->pc == orig_pc))
+ return -EINVAL;
+ state->pc = orig_pc;
+ }
+#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+#ifdef CONFIG_KRETPROBES
+ if (is_kretprobe_trampoline(state->pc))
+ state->pc = kretprobe_find_ret_addr(tsk, (void *)state->fp, &state->kr_cur);
+#endif
+
+ return 0;
+}
+NOKPROBE_SYMBOL(unwind_next);
+
#endif /* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
index e89c8c39858d..7807752aaab1 100644
--- a/arch/arm64/include/asm/stacktrace/common.h
+++ b/arch/arm64/include/asm/stacktrace/common.h
@@ -9,6 +9,7 @@

#include <linux/bitmap.h>
#include <linux/bitops.h>
+#include <linux/kprobes.h>
#include <linux/types.h>

enum stack_type {
@@ -69,6 +70,8 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
unsigned long sp, unsigned long size,
struct stack_info *info);

+static inline int unwind_next(struct unwind_state *state);
+
static inline bool on_stack(unsigned long sp, unsigned long size,
unsigned long low, unsigned long high,
enum stack_type type, struct stack_info *info)
@@ -191,4 +194,20 @@ static inline int unwind_next_common(struct unwind_state *state,

return 0;
}
+
+static inline void notrace unwind(struct unwind_state *state,
+ stack_trace_consume_fn consume_entry,
+ void *cookie)
+{
+ while (1) {
+ int ret;
+
+ if (!consume_entry(cookie, state->pc))
+ break;
+ ret = unwind_next(state);
+ if (ret < 0)
+ break;
+ }
+}
+NOKPROBE_SYMBOL(unwind);
#endif /* __ASM_STACKTRACE_COMMON_H */
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index eef3cf6bf2d7..9fa60ee48499 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -7,14 +7,12 @@
#include <linux/kernel.h>
#include <linux/export.h>
#include <linux/ftrace.h>
-#include <linux/kprobes.h>
#include <linux/sched.h>
#include <linux/sched/debug.h>
#include <linux/sched/task_stack.h>
#include <linux/stacktrace.h>

#include <asm/irq.h>
-#include <asm/pointer_auth.h>
#include <asm/stack_pointer.h>
#include <asm/stacktrace.h>

@@ -69,71 +67,6 @@ static inline void unwind_init_from_task(struct unwind_state *state,
state->pc = thread_saved_pc(task);
}

-/*
- * Unwind from one frame record (A) to the next frame record (B).
- *
- * We terminate early if the location of B indicates a malformed chain of frame
- * records (e.g. a cycle), determined based on the location and fp value of A
- * and the location (but not the fp value) of B.
- */
-static int notrace unwind_next(struct unwind_state *state)
-{
- struct task_struct *tsk = state->task;
- unsigned long fp = state->fp;
- struct stack_info info;
- int err;
-
- /* Final frame; nothing to unwind */
- if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
- return -ENOENT;
-
- err = unwind_next_common(state, &info, NULL);
- if (err)
- return err;
-
- state->pc = ptrauth_strip_insn_pac(state->pc);
-
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
- if (tsk->ret_stack &&
- (state->pc == (unsigned long)return_to_handler)) {
- unsigned long orig_pc;
- /*
- * This is a case where function graph tracer has
- * modified a return address (LR) in a stack frame
- * to hook a function return.
- * So replace it to an original value.
- */
- orig_pc = ftrace_graph_ret_addr(tsk, NULL, state->pc,
- (void *)state->fp);
- if (WARN_ON_ONCE(state->pc == orig_pc))
- return -EINVAL;
- state->pc = orig_pc;
- }
-#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
-#ifdef CONFIG_KRETPROBES
- if (is_kretprobe_trampoline(state->pc))
- state->pc = kretprobe_find_ret_addr(tsk, (void *)state->fp, &state->kr_cur);
-#endif
-
- return 0;
-}
-NOKPROBE_SYMBOL(unwind_next);
-
-static void notrace unwind(struct unwind_state *state,
- stack_trace_consume_fn consume_entry, void *cookie)
-{
- while (1) {
- int ret;
-
- if (!consume_entry(cookie, state->pc))
- break;
- ret = unwind_next(state);
- if (ret < 0)
- break;
- }
-}
-NOKPROBE_SYMBOL(unwind);
-
static bool dump_backtrace_entry(void *arg, unsigned long where)
{
char *loglvl = arg;
--
2.37.0.170.g444d1eabd0-goog
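
As a usage sketch of the factored-out unwind(): a consumer callback
returns false to stop the walk. The caller below is hypothetical; the
in-tree user is dump_backtrace(), whose callback is partially visible
above.

/* Hypothetical consumer: print every PC until the unwinder terminates. */
static bool print_entry(void *cookie, unsigned long where)
{
        printk("%s [<%016lx>] %pS\n", (char *)cookie, where, (void *)where);
        return true;
}

static void example_dump(struct task_struct *tsk)
{
        struct unwind_state state;

        unwind_init_from_task(&state, tsk);     /* shown earlier in this file */
        unwind(&state, print_entry, (void *)KERN_INFO);
}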

2022-07-21 06:42:30

by Kalesh Singh

[permalink] [raw]
Subject: [PATCH v5 17/17] KVM: arm64: Introduce hyp_dump_backtrace()

In non-protected nVHE mode, unwind and dump the hypervisor backtrace
from EL1. This is possible because the host can directly access the
hypervisor stack pages in non-protected mode.

Signed-off-by: Kalesh Singh <[email protected]>
---

Changes in v5:
- Move code out from nvhe.h header to handle_exit.c, per Marc
- Fix stacktrace symbolization when CONFIG_RANDOMIZE_BASE is enabled,
per Fuad
- Use regular comments instead of doc comments, per Fuad

arch/arm64/kvm/handle_exit.c | 65 +++++++++++++++++++++++++++++++-----
1 file changed, 56 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index ad568da5c7d7..432b6b26f4ad 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -17,6 +17,7 @@
#include <asm/kvm_emulate.h>
#include <asm/kvm_mmu.h>
#include <asm/debug-monitors.h>
+#include <asm/stacktrace/nvhe.h>
#include <asm/traps.h>

#include <kvm/arm_hypercalls.h>
@@ -318,6 +319,56 @@ void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index)
kvm_handle_guest_serror(vcpu, kvm_vcpu_get_esr(vcpu));
}

+/*
+ * kvm_nvhe_print_backtrace_entry - Symbolizes and prints the HYP stack address
+ */
+static void kvm_nvhe_print_backtrace_entry(unsigned long addr,
+ unsigned long hyp_offset)
+{
+ unsigned long va_mask = GENMASK_ULL(vabits_actual - 1, 0);
+
+ /* Mask tags and convert to kern addr */
+ addr = (addr & va_mask) + hyp_offset;
+ kvm_err(" [<%016lx>] %pB\n", addr, (void *)(addr + kaslr_offset()));
+}
+
+/*
+ * hyp_dump_backtrace_entry - Dump an entry of the non-protected nVHE HYP stacktrace
+ *
+ * @arg : the hypervisor offset, used for address translation
+ * @where : the program counter corresponding to the stack frame
+ */
+static bool hyp_dump_backtrace_entry(void *arg, unsigned long where)
+{
+ kvm_nvhe_print_backtrace_entry(where, (unsigned long)arg);
+
+ return true;
+}
+
+/*
+ * hyp_dump_backtrace - Dump the non-protected nVHE HYP backtrace.
+ *
+ * @hyp_offset: hypervisor offset, used for address translation.
+ *
+ * The host can directly access HYP stack pages in non-protected
+ * mode, so the unwinding is done directly from EL1. This removes
+ * the need for shared buffers between host and hypervisor for
+ * the stacktrace.
+ */
+static void hyp_dump_backtrace(unsigned long hyp_offset)
+{
+ struct kvm_nvhe_stacktrace_info *stacktrace_info;
+ struct unwind_state state;
+
+ stacktrace_info = this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
+
+ kvm_nvhe_unwind_init(&state, stacktrace_info->fp, stacktrace_info->pc);
+
+ kvm_err("Non-protected nVHE HYP call trace:\n");
+ unwind(&state, hyp_dump_backtrace_entry, (void *)hyp_offset);
+ kvm_err("---- End of Non-protected nVHE HYP call trace ----\n");
+}
+
#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
DECLARE_KVM_NVHE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)],
pkvm_stacktrace);
@@ -336,18 +387,12 @@ static void pkvm_dump_backtrace(unsigned long hyp_offset)
{
unsigned long *stacktrace_entry
= (unsigned long *)this_cpu_ptr_nvhe_sym(pkvm_stacktrace);
- unsigned long va_mask, pc;
-
- va_mask = GENMASK_ULL(vabits_actual - 1, 0);

kvm_err("Protected nVHE HYP call trace:\n");

- /* The stack trace is terminated by a null entry */
- for (; *stacktrace_entry; stacktrace_entry++) {
- /* Mask tags and convert to kern addr */
- pc = (*stacktrace_entry & va_mask) + hyp_offset;
- kvm_err(" [<%016lx>] %pB\n", pc, (void *)(pc + kaslr_offset()));
- }
+ /* The saved stacktrace is terminated by a null entry */
+ for (; *stacktrace_entry; stacktrace_entry++)
+ kvm_nvhe_print_backtrace_entry(*stacktrace_entry, hyp_offset);

kvm_err("---- End of Protected nVHE HYP call trace ----\n");
}
@@ -367,6 +412,8 @@ static void kvm_nvhe_dump_backtrace(unsigned long hyp_offset)
{
if (is_protected_kvm_enabled())
pkvm_dump_backtrace(hyp_offset);
+ else
+ hyp_dump_backtrace(hyp_offset);
}

void __noreturn __cold nvhe_hyp_panic_handler(u64 esr, u64 spsr,
--
2.37.0.170.g444d1eabd0-goog
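
The translation in kvm_nvhe_print_backtrace_entry() splits into two
independent offsets: hyp_offset rebases the address into the kernel's
view, and kaslr_offset() is only added afterwards so that %pB can
symbolize against the relocated image. A sketch isolating the first
step (the helper itself is hypothetical; the names mirror the patch):

static unsigned long hyp_to_kern_addr(unsigned long addr,
                                      unsigned long hyp_offset)
{
        unsigned long va_mask = GENMASK_ULL(vabits_actual - 1, 0);

        /* Strip the upper tag bits, then rebase from hyp VA to kernel VA */
        return (addr & va_mask) + hyp_offset;
}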

2022-07-21 10:16:17

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 08/17] KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig

Hi Kalesh,

On Thu, Jul 21, 2022 at 6:57 AM Kalesh Singh <[email protected]> wrote:
>
> This can be used to disable stacktrace for the protected KVM
> nVHE hypervisor, in order to save on the associated memory usage.
>
> This option is disabled by default, since protected KVM is not widely
> used on platforms other than Android currently.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---

Just wanted to point out that I have specifically tested this as well,
enabling PROTECTED_NVHE_STACKTRACE but not NVHE_EL2_DEBUG. Works as
expected.

Tested-by: Fuad Tabba <[email protected]>
Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


>
> Changes in v5:
> - Make PROTECTED_NVHE_STACKTRACE depend on NVHE_EL2_DEBUG, per Marc
>
> arch/arm64/kvm/Kconfig | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
> index 8a5fbbf084df..09c995869916 100644
> --- a/arch/arm64/kvm/Kconfig
> +++ b/arch/arm64/kvm/Kconfig
> @@ -46,6 +46,21 @@ menuconfig KVM
>
> If unsure, say N.
>
> +config PROTECTED_NVHE_STACKTRACE
> + bool "Protected KVM hypervisor stacktraces"
> + depends on NVHE_EL2_DEBUG
> + default n
> + help
> + Say Y here to enable pKVM hypervisor stacktraces on hyp_panic()
> +
> + If you are not using protected nVHE (pKVM), say N.
> +
> + If using protected nVHE mode, but cannot afford the associated
> + memory cost (less than 0.75 page per CPU) of pKVM stacktraces,
> + say N.
> +
> + If unsure, say N.
> +
> config NVHE_EL2_DEBUG
> bool "Debug mode for non-VHE EL2 object"
> depends on KVM
> --
> 2.37.0.170.g444d1eabd0-goog
>

2022-07-21 10:18:36

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 11/17] KVM: arm64: Stub implementation of non-protected nVHE HYP stack unwinder

Hi Kalesh,

On Thu, Jul 21, 2022 at 6:58 AM Kalesh Singh <[email protected]> wrote:
>
> Add stub implementations of non-protected nVHE stack unwinder, for
> building. These are implemented later in this series.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad

>
> Changes in v5:
> - Mark unwind_next() as inline, per Marc
> - Comment !__KVM_NVHE_HYPERVISOR__ unwinder path, per Marc
>
> arch/arm64/include/asm/stacktrace/nvhe.h | 26 ++++++++++++++++++++++++
> 1 file changed, 26 insertions(+)
>
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
> index 80d71932afff..3078501f8e22 100644
> --- a/arch/arm64/include/asm/stacktrace/nvhe.h
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -8,6 +8,12 @@
> * the HYP memory. The stack is unwinded in EL2 and dumped to a shared
> * buffer where the host can read and print the stacktrace.
> *
> + * 2) Non-protected nVHE mode - the host can directly access the
> + * HYP stack pages and unwind the HYP stack in EL1. This saves having
> + * to allocate shared buffers for the host to read the unwinded
> + * stacktrace.
> + *
> + *
> * Copyright (C) 2022 Google LLC
> */
> #ifndef __ASM_STACKTRACE_NVHE_H
> @@ -55,5 +61,25 @@ static inline int notrace unwind_next(struct unwind_state *state)
> NOKPROBE_SYMBOL(unwind_next);
> #endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
>
> +#else /* !__KVM_NVHE_HYPERVISOR__ */
> +/*
> + * Conventional (non-protected) nVHE HYP stack unwinder
> + *
> + * In non-protected mode, the unwinding is done from kernel proper context
> + * (by the host in EL1).
> + */
> +
> +static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
> + struct stack_info *info)
> +{
> + return false;
> +}
> +
> +static inline int notrace unwind_next(struct unwind_state *state)
> +{
> + return 0;
> +}
> +NOKPROBE_SYMBOL(unwind_next);
> +
> #endif /* __KVM_NVHE_HYPERVISOR__ */
> #endif /* __ASM_STACKTRACE_NVHE_H */
> --
> 2.37.0.170.g444d1eabd0-goog
>

2022-07-21 10:18:41

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 12/17] KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace

Hi Kalesh,

On Thu, Jul 21, 2022 at 6:58 AM Kalesh Singh <[email protected]> wrote:
>
> In protected nVHE mode, the host cannot access private owned hypervisor
> memory. Also the hypervisor aims to remains simple to reduce the attack
> surface and does not provide any printk support.
>
> For the above reasons, the approach taken to provide hypervisor stacktraces
> in protected mode is:
> 1) Unwind and save the hyp stack addresses in EL2 to a shared buffer
> with the host (done in this patch).
> 2) Delegate the dumping and symbolization of the addresses to the
> host in EL1 (later patch in the series).
>
> On hyp_panic(), the hypervisor prepares the stacktrace before returning to
> the host.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


>
> Changes in v5:
> - Comment/clarify pkvm_save_backtrace_entry(), per Fuad
> - kvm_nvhe_unwind_init(), doesn't need to be always inline, make it
> inline instead to avoid linking issues, per Marc
> - Use regular comments instead of doc comments, per Fuad
>
> arch/arm64/include/asm/stacktrace/nvhe.h | 17 ++++++
> arch/arm64/kvm/hyp/nvhe/stacktrace.c | 78 ++++++++++++++++++++++++
> arch/arm64/kvm/hyp/nvhe/switch.c | 6 ++
> 3 files changed, 101 insertions(+)
>
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
> index 3078501f8e22..05d7e03e0a8c 100644
> --- a/arch/arm64/include/asm/stacktrace/nvhe.h
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -21,6 +21,23 @@
>
> #include <asm/stacktrace/common.h>
>
> +/*
> + * kvm_nvhe_unwind_init - Start an unwind from the given nVHE HYP fp and pc
> + *
> + * @state : unwind_state to initialize
> + * @fp : frame pointer at which to start the unwinding.
> + * @pc : program counter at which to start the unwinding.
> + */
> +static inline void kvm_nvhe_unwind_init(struct unwind_state *state,
> + unsigned long fp,
> + unsigned long pc)
> +{
> + unwind_init_common(state, NULL);
> +
> + state->fp = fp;
> + state->pc = pc;
> +}
> +
> static inline bool on_accessible_stack(const struct task_struct *tsk,
> unsigned long sp, unsigned long size,
> struct stack_info *info)
> diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> index 96c8b93320eb..60461c033a04 100644
> --- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> +++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> @@ -11,4 +11,82 @@ DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)
>
> #ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
> DEFINE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)], pkvm_stacktrace);
> +
> +/*
> + * pkvm_save_backtrace_entry - Saves a protected nVHE HYP stacktrace entry
> + *
> + * @arg : the position of the entry in the stacktrace buffer
> + * @where : the program counter corresponding to the stack frame
> + *
> + * Save the return address of a stack frame to the shared stacktrace buffer.
> + * The host can access this shared buffer from EL1 to dump the backtrace.
> + */
> +static bool pkvm_save_backtrace_entry(void *arg, unsigned long where)
> +{
> + unsigned long **stacktrace_entry = (unsigned long **)arg;
> + int nr_entries = NVHE_STACKTRACE_SIZE / sizeof(long);
> + unsigned long *stacktrace_start, *stacktrace_end;
> +
> + stacktrace_start = (unsigned long *)this_cpu_ptr(pkvm_stacktrace);
> + stacktrace_end = stacktrace_start + nr_entries;
> +
> + /*
> + * Need 2 free slots: 1 for current entry and 1 for the
> + * trailing zero entry delimiter.
> + */
> + if (*stacktrace_entry > stacktrace_end - 2)
> + return false;
> +
> + /* Save the current entry */
> + **stacktrace_entry = where;
> +
> + /* Add trailing zero entry delimiter */
> + *(*stacktrace_entry + 1) = 0UL;
> +
> + /*
> + * Increment the current entry position. The zero entry
> + * will be overwritten by the next backtrace entry (if any)
> + */
> + ++*stacktrace_entry;
> +
> + return true;
> +}
> +
> +/*
> + * pkvm_save_backtrace - Saves the protected nVHE HYP stacktrace
> + *
> + * @fp : frame pointer at which to start the unwinding.
> + * @pc : program counter at which to start the unwinding.
> + *
> + * Save the unwinded stack addresses to the shared stacktrace buffer.
> + * The host can access this shared buffer from EL1 to dump the backtrace.
> + */
> +static void pkvm_save_backtrace(unsigned long fp, unsigned long pc)
> +{
> + void *stacktrace_entry = (void *)this_cpu_ptr(pkvm_stacktrace);
> + struct unwind_state state;
> +
> + kvm_nvhe_unwind_init(&state, fp, pc);
> +
> + unwind(&state, pkvm_save_backtrace_entry, &stacktrace_entry);
> +}
> +#else /* !CONFIG_PROTECTED_NVHE_STACKTRACE */
> +static void pkvm_save_backtrace(unsigned long fp, unsigned long pc)
> +{
> +}
> #endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
> +
> +/*
> + * kvm_nvhe_prepare_backtrace - prepare to dump the nVHE backtrace
> + *
> + * @fp : frame pointer at which to start the unwinding.
> + * @pc : program counter at which to start the unwinding.
> + *
> + * Saves the information needed by the host to dump the nVHE hypervisor
> + * backtrace.
> + */
> +void kvm_nvhe_prepare_backtrace(unsigned long fp, unsigned long pc)
> +{
> + if (is_protected_kvm_enabled())
> + pkvm_save_backtrace(fp, pc);
> +}
> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> index 6db801db8f27..64e13445d0d9 100644
> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> @@ -34,6 +34,8 @@ DEFINE_PER_CPU(struct kvm_host_data, kvm_host_data);
> DEFINE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);
> DEFINE_PER_CPU(unsigned long, kvm_hyp_vector);
>
> +extern void kvm_nvhe_prepare_backtrace(unsigned long fp, unsigned long pc);
> +
> static void __activate_traps(struct kvm_vcpu *vcpu)
> {
> u64 val;
> @@ -375,6 +377,10 @@ asmlinkage void __noreturn hyp_panic(void)
> __sysreg_restore_state_nvhe(host_ctxt);
> }
>
> + /* Prepare to dump kvm nvhe hyp stacktrace */
> + kvm_nvhe_prepare_backtrace((unsigned long)__builtin_frame_address(0),
> + _THIS_IP_);
> +
> __hyp_do_panic(host_ctxt, spsr, elr, par);
> unreachable();
> }
> --
> 2.37.0.170.g444d1eabd0-goog
>
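
A sketch of the shared buffer that pkvm_save_backtrace_entry() above
maintains, assuming three frames A, B and C have been saved:

/*
 * pkvm_stacktrace (n = NVHE_STACKTRACE_SIZE / sizeof(long) entries):
 *
 *   index:   0    1    2    3    4        n-1
 *   value: [ A ][ B ][ C ][ 0 ][ ? ] ... [ ? ]
 *                          ^
 *                          trailing zero delimiter; overwritten by the
 *                          next entry (if any)
 *
 * The "2 free slots" check guarantees slot n-1 is never used for a
 * frame, so the delimiter always fits.
 */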

2022-07-21 10:19:38

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 14/17] KVM: arm64: Implement protected nVHE hyp stack unwinder

Hi Kalesh,

On Thu, Jul 21, 2022 at 6:58 AM Kalesh Singh <[email protected]> wrote:
>
> Implements the common framework necessary for unwind() to work in
> the protected nVHE context:
> - on_accessible_stack()
> - on_overflow_stack()
> - unwind_next()
>
> Protected nVHE unwind() is used to unwind and save the hyp stack
> addresses to the shared stacktrace buffer. The host reads the
> entries in this buffer, symbolizes and dumps the stacktrace (later
> patch in the series).
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


> arch/arm64/include/asm/stacktrace/common.h | 2 ++
> arch/arm64/include/asm/stacktrace/nvhe.h | 34 ++++++++++++++++++++--
> 2 files changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
> index be7920ba70b0..73fd9e143c4a 100644
> --- a/arch/arm64/include/asm/stacktrace/common.h
> +++ b/arch/arm64/include/asm/stacktrace/common.h
> @@ -34,6 +34,7 @@ enum stack_type {
> STACK_TYPE_OVERFLOW,
> STACK_TYPE_SDEI_NORMAL,
> STACK_TYPE_SDEI_CRITICAL,
> + STACK_TYPE_HYP,
> __NR_STACK_TYPES
> };
>
> @@ -186,6 +187,7 @@ static inline int unwind_next_common(struct unwind_state *state,
> *
> * TASK -> IRQ -> OVERFLOW -> SDEI_NORMAL
> * TASK -> SDEI_NORMAL -> SDEI_CRITICAL -> OVERFLOW
> + * HYP -> OVERFLOW
> *
> * ... but the nesting itself is strict. Once we transition from one
> * stack to another, it's never valid to unwind back to that first
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
> index 8f02803a005f..c3688e717136 100644
> --- a/arch/arm64/include/asm/stacktrace/nvhe.h
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -39,10 +39,19 @@ static inline void kvm_nvhe_unwind_init(struct unwind_state *state,
> state->pc = pc;
> }
>
> +static inline bool on_hyp_stack(unsigned long sp, unsigned long size,
> + struct stack_info *info);
> +
> static inline bool on_accessible_stack(const struct task_struct *tsk,
> unsigned long sp, unsigned long size,
> struct stack_info *info)
> {
> + if (on_accessible_stack_common(tsk, sp, size, info))
> + return true;
> +
> + if (on_hyp_stack(sp, size, info))
> + return true;
> +
> return false;
> }
>
> @@ -60,12 +69,27 @@ DECLARE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
> static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
> struct stack_info *info)
> {
> - return false;
> + unsigned long low = (unsigned long)this_cpu_ptr(overflow_stack);
> + unsigned long high = low + OVERFLOW_STACK_SIZE;
> +
> + return on_stack(sp, size, low, high, STACK_TYPE_OVERFLOW, info);
> +}
> +
> +static inline bool on_hyp_stack(unsigned long sp, unsigned long size,
> + struct stack_info *info)
> +{
> + struct kvm_nvhe_init_params *params = this_cpu_ptr(&kvm_init_params);
> + unsigned long high = params->stack_hyp_va;
> + unsigned long low = high - PAGE_SIZE;
> +
> + return on_stack(sp, size, low, high, STACK_TYPE_HYP, info);
> }
>
> static inline int notrace unwind_next(struct unwind_state *state)
> {
> - return 0;
> + struct stack_info info;
> +
> + return unwind_next_common(state, &info, NULL);
> }
> NOKPROBE_SYMBOL(unwind_next);
> #else /* !CONFIG_PROTECTED_NVHE_STACKTRACE */
> @@ -75,6 +99,12 @@ static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
> return false;
> }
>
> +static inline bool on_hyp_stack(unsigned long sp, unsigned long size,
> + struct stack_info *info)
> +{
> + return false;
> +}
> +
> static inline int notrace unwind_next(struct unwind_state *state)
> {
> return 0;
> --
> 2.37.0.170.g444d1eabd0-goog
>
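
The strict nesting quoted above (HYP -> OVERFLOW, never back) is
enforced by the stacks_done bitmap in unwind_next_common(). A condensed
sketch of that mechanism (simplified from stacktrace/common.h, wrapped
in a hypothetical helper):

static int check_stack_transition(struct unwind_state *state,
                                  struct stack_info *info, unsigned long fp)
{
        /* A stack we already unwound out of is never valid again. */
        if (test_bit(info->type, state->stacks_done))
                return -EINVAL;

        /*
         * Stacks grow downward: a record on the same stack must be at a
         * strictly higher address than the prior record. Otherwise this
         * is a transition to a new stack, so retire the old one.
         */
        if (fp <= state->prev_fp) {
                if (info->type == state->prev_type)
                        return -EINVAL;
                __set_bit(state->prev_type, state->stacks_done);
        }

        return 0;
}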

2022-07-21 10:21:16

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 04/17] arm64: stacktrace: Handle frame pointer from different address spaces

Hi Kalesh,


On Thu, Jul 21, 2022 at 6:57 AM Kalesh Singh <[email protected]> wrote:
>
> The unwinder code is made reusable so that it can be used to
> unwind various types of stacks. One usecase is unwinding the
> nVHE hyp stack from the host (EL1) in non-protected mode. This
> means that the unwinder must be able to translate HYP stack
> addresses to kernel addresses.
>
> Add a callback (stack_trace_translate_fp_fn) to allow specifying
> the translation function.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---
> Changes in v5:
> - Fix typo in commit text, per Fuad
> - Update unwind_next_common() to not have side effects on failure, per Fuad
> - Use regular comment instead of doc comments, per Fuad
>
> arch/arm64/include/asm/stacktrace/common.h | 29 +++++++++++++++++++---
> arch/arm64/kernel/stacktrace.c | 2 +-
> 2 files changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
> index 0c5cbfdb56b5..e89c8c39858d 100644
> --- a/arch/arm64/include/asm/stacktrace/common.h
> +++ b/arch/arm64/include/asm/stacktrace/common.h
> @@ -124,11 +124,25 @@ static inline void unwind_init_common(struct unwind_state *state,
> state->prev_type = STACK_TYPE_UNKNOWN;
> }
>
> +/*
> + * stack_trace_translate_fp_fn() - Translates a non-kernel frame pointer to
> + * a kernel address.
> + *
> + * @fp: the frame pointer to be updated to it's kernel address.

nit: it's -> its

Otherwise

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad





> + * @type: the stack type associated with frame pointer @fp
> + *
> + * Returns true and success and @fp is updated to the corresponding
> + * kernel virtual address; otherwise returns false.
> + */
> +typedef bool (*stack_trace_translate_fp_fn)(unsigned long *fp,
> + enum stack_type type);
> +
> static inline int unwind_next_common(struct unwind_state *state,
> - struct stack_info *info)
> + struct stack_info *info,
> + stack_trace_translate_fp_fn translate_fp)
> {
> + unsigned long fp = state->fp, kern_fp = fp;
> struct task_struct *tsk = state->task;
> - unsigned long fp = state->fp;
>
> if (fp & 0x7)
> return -EINVAL;
> @@ -139,6 +153,13 @@ static inline int unwind_next_common(struct unwind_state *state,
> if (test_bit(info->type, state->stacks_done))
> return -EINVAL;
>
> + /*
> + * If fp is not from the current address space perform the necessary
> + * translation before dereferencing it to get the next fp.
> + */
> + if (translate_fp && !translate_fp(&kern_fp, info->type))
> + return -EINVAL;
> +
> /*
> * As stacks grow downward, any valid record on the same stack must be
> * at a strictly higher address than the prior record.
> @@ -163,8 +184,8 @@ static inline int unwind_next_common(struct unwind_state *state,
> * Record this frame record's values and location. The prev_fp and
> * prev_type are only meaningful to the next unwind_next() invocation.
> */
> - state->fp = READ_ONCE(*(unsigned long *)(fp));
> - state->pc = READ_ONCE(*(unsigned long *)(fp + 8));
> + state->fp = READ_ONCE(*(unsigned long *)(kern_fp));
> + state->pc = READ_ONCE(*(unsigned long *)(kern_fp + 8));
> state->prev_fp = fp;
> state->prev_type = info->type;
>
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index 834851939364..eef3cf6bf2d7 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -87,7 +87,7 @@ static int notrace unwind_next(struct unwind_state *state)
> if (fp == (unsigned long)task_pt_regs(tsk)->stackframe)
> return -ENOENT;
>
> - err = unwind_next_common(state, &info);
> + err = unwind_next_common(state, &info, NULL);
> if (err)
> return err;
>
> --
> 2.37.0.170.g444d1eabd0-goog
>
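
A minimal sketch of a stack_trace_translate_fp_fn implementation. The
translator and its hyp_stack_base/kern_stack_base bookkeeping are
hypothetical; the real user is kvm_nvhe_stack_kern_va(), added later in
the series.

/* Assumed bookkeeping for the example; not real kernel symbols. */
static unsigned long hyp_stack_base, kern_stack_base;

static bool example_translate_fp(unsigned long *fp, enum stack_type type)
{
        if (type != STACK_TYPE_HYP)
                return false;   /* unwind_next_common() then fails with -EINVAL */

        /* Preserve the fp's offset within the stack; swap the mapping. */
        *fp = *fp - hyp_stack_base + kern_stack_base;
        return true;
}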

2022-07-21 10:21:18

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 10/17] KVM: arm64: Stub implementation of pKVM HYP stack unwinder

Hi Kalesh,

On Thu, Jul 21, 2022 at 6:57 AM Kalesh Singh <[email protected]> wrote:
>
> Add some stub implementations of protected nVHE stack unwinder, for
> building. These are implemented later in this series.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


>
> Changes in v5:
> - Mark unwind_next() as inline, per Marc
>
> arch/arm64/include/asm/stacktrace/nvhe.h | 59 ++++++++++++++++++++++++
> arch/arm64/kvm/hyp/nvhe/stacktrace.c | 3 +-
> 2 files changed, 60 insertions(+), 2 deletions(-)
> create mode 100644 arch/arm64/include/asm/stacktrace/nvhe.h
>
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
> new file mode 100644
> index 000000000000..80d71932afff
> --- /dev/null
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -0,0 +1,59 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * KVM nVHE hypervisor stack tracing support.
> + *
> + * The unwinder implementation depends on the nVHE mode:
> + *
> + * 1) pKVM (protected nVHE) mode - the host cannot directly access
> + * the HYP memory. The stack is unwinded in EL2 and dumped to a shared
> + * buffer where the host can read and print the stacktrace.
> + *
> + * Copyright (C) 2022 Google LLC
> + */
> +#ifndef __ASM_STACKTRACE_NVHE_H
> +#define __ASM_STACKTRACE_NVHE_H
> +
> +#include <asm/stacktrace/common.h>
> +
> +static inline bool on_accessible_stack(const struct task_struct *tsk,
> + unsigned long sp, unsigned long size,
> + struct stack_info *info)
> +{
> + return false;
> +}
> +
> +#ifdef __KVM_NVHE_HYPERVISOR__
> +/*
> + * Protected nVHE HYP stack unwinder
> + *
> + * In protected mode, the unwinding is done by the hypervisor in EL2.
> + */
> +
> +#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
> +static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
> + struct stack_info *info)
> +{
> + return false;
> +}
> +
> +static inline int notrace unwind_next(struct unwind_state *state)
> +{
> + return 0;
> +}
> +NOKPROBE_SYMBOL(unwind_next);
> +#else /* !CONFIG_PROTECTED_NVHE_STACKTRACE */
> +static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
> + struct stack_info *info)
> +{
> + return false;
> +}
> +
> +static inline int notrace unwind_next(struct unwind_state *state)
> +{
> + return 0;
> +}
> +NOKPROBE_SYMBOL(unwind_next);
> +#endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
> +
> +#endif /* __KVM_NVHE_HYPERVISOR__ */
> +#endif /* __ASM_STACKTRACE_NVHE_H */
> diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> index 69e65b457f1c..96c8b93320eb 100644
> --- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> +++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> @@ -4,8 +4,7 @@
> *
> * Copyright (C) 2022 Google LLC
> */
> -#include <asm/memory.h>
> -#include <asm/percpu.h>
> +#include <asm/stacktrace/nvhe.h>
>
> DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)
> __aligned(16);
> --
> 2.37.0.170.g444d1eabd0-goog
>

2022-07-21 10:35:25

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 16/17] KVM: arm64: Introduce pkvm_dump_backtrace()

Hi Kalesh,

On Thu, Jul 21, 2022 at 6:58 AM Kalesh Singh <[email protected]> wrote:
>
> Dumps the pKVM hypervisor backtrace from EL1 by reading the unwinded
> addresses from the shared stacktrace buffer.
>
> The nVHE hyp backtrace is dumped on hyp_panic(), before panicking the
> host.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---
>
> Changes in v5:
> - Move code out from nvhe.h header to handle_exit.c, per Marc
> - Fix stacktrace symoblization when CONFIG_RAMDOMIZE_BASE is enabled,
> per Fuad
> - Use regular comments instead of doc comments, per Fuad
>
> arch/arm64/kvm/handle_exit.c | 54 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 54 insertions(+)
>
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index f66c0142b335..ad568da5c7d7 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -318,6 +318,57 @@ void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index)
> kvm_handle_guest_serror(vcpu, kvm_vcpu_get_esr(vcpu));
> }
>
> +#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
> +DECLARE_KVM_NVHE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)],
> + pkvm_stacktrace);
> +
> +/*
> + * pkvm_dump_backtrace - Dump the protected nVHE HYP backtrace.
> + *
> + * @hyp_offset: hypervisor offset, used for address translation.
> + *
> + * Dumping of the pKVM HYP backtrace is done by reading the
> + * stack addresses from the shared stacktrace buffer, since the
> + * host cannot direclty access hyperviosr memory in protected

directly and hypervisor

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


> + * mode.
> + */
> +static void pkvm_dump_backtrace(unsigned long hyp_offset)
> +{
> + unsigned long *stacktrace_entry
> + = (unsigned long *)this_cpu_ptr_nvhe_sym(pkvm_stacktrace);
> + unsigned long va_mask, pc;
> +
> + va_mask = GENMASK_ULL(vabits_actual - 1, 0);
> +
> + kvm_err("Protected nVHE HYP call trace:\n");
> +
> + /* The stack trace is terminated by a null entry */
> + for (; *stacktrace_entry; stacktrace_entry++) {
> + /* Mask tags and convert to kern addr */
> + pc = (*stacktrace_entry & va_mask) + hyp_offset;
> + kvm_err(" [<%016lx>] %pB\n", pc, (void *)(pc + kaslr_offset()));
> + }
> +
> + kvm_err("---- End of Protected nVHE HYP call trace ----\n");
> +}
> +#else /* !CONFIG_PROTECTED_NVHE_STACKTRACE */
> +static void pkvm_dump_backtrace(unsigned long hyp_offset)
> +{
> + kvm_err("Cannot dump pKVM nVHE stacktrace: !CONFIG_PROTECTED_NVHE_STACKTRACE\n");
> +}
> +#endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
> +
> +/*
> + * kvm_nvhe_dump_backtrace - Dump KVM nVHE hypervisor backtrace.
> + *
> + * @hyp_offset: hypervisor offset, used for address translation.
> + */
> +static void kvm_nvhe_dump_backtrace(unsigned long hyp_offset)
> +{
> + if (is_protected_kvm_enabled())
> + pkvm_dump_backtrace(hyp_offset);
> +}
> +
> void __noreturn __cold nvhe_hyp_panic_handler(u64 esr, u64 spsr,
> u64 elr_virt, u64 elr_phys,
> u64 par, uintptr_t vcpu,
> @@ -353,6 +404,9 @@ void __noreturn __cold nvhe_hyp_panic_handler(u64 esr, u64 spsr,
> (void *)panic_addr);
> }
>
> + /* Dump the nVHE hypervisor backtrace */
> + kvm_nvhe_dump_backtrace(hyp_offset);
> +
> /*
> * Hyp has panicked and we're going to handle that by panicking the
> * kernel. The kernel offset will be revealed in the panic so we're
> --
> 2.37.0.170.g444d1eabd0-goog
>

2022-07-21 10:36:58

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 09/17] KVM: arm64: Allocate shared pKVM hyp stacktrace buffers

Hi Kalesh,

On Thu, Jul 21, 2022 at 6:57 AM Kalesh Singh <[email protected]> wrote:
>
> In protected nVHE mode the host cannot directly access
> hypervisor memory, so we will dump the hypervisor stacktrace
> to a shared buffer with the host.
>
> The minimum size for the buffer required, assuming the min frame
> size of [x29, x30] (2 * sizeof(long)), is half the combined size of
> the hypervisor and overflow stacks plus an additional entry to
> delimit the end of the stacktrace.
>
> The stacktrace buffers are used later in the seried to dump the

nit: seried/series

> nVHE hypervisor stacktrace when using protected-mode.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


> Changes in v5:
> - Fix typo in commit text, per Marc
>
> arch/arm64/include/asm/memory.h | 8 ++++++++
> arch/arm64/kvm/hyp/nvhe/stacktrace.c | 4 ++++
> 2 files changed, 12 insertions(+)
>
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 0af70d9abede..cab80a9a4086 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -113,6 +113,14 @@
>
> #define OVERFLOW_STACK_SIZE SZ_4K
>
> +/*
> + * With the minimum frame size of [x29, x30], exactly half the combined
> + * sizes of the hyp and overflow stacks is the maximum size needed to
> + * save the unwinded stacktrace; plus an additional entry to delimit the
> + * end.
> + */
> +#define NVHE_STACKTRACE_SIZE ((OVERFLOW_STACK_SIZE + PAGE_SIZE) / 2 + sizeof(long))
> +
> /*
> * Alignment of kernel segments (e.g. .text, .data).
> *
> diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> index a3d5b34e1249..69e65b457f1c 100644
> --- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> +++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> @@ -9,3 +9,7 @@
>
> DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)
> __aligned(16);
> +
> +#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
> +DEFINE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)], pkvm_stacktrace);
> +#endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
> --
> 2.37.0.170.g444d1eabd0-goog
>
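
The size bound above works out as follows for a 4K-page configuration
(illustrative arithmetic only):

/*
 * PAGE_SIZE = OVERFLOW_STACK_SIZE = SZ_4K:
 *
 *   bytes of stack to cover  = 4096 + 4096         = 8192
 *   minimum frame [x29, x30] = 2 * sizeof(long)    = 16 bytes
 *   maximum frames           = 8192 / 16           = 512
 *   bytes to save their PCs  = 512 * sizeof(long)  = 4096
 *   NVHE_STACKTRACE_SIZE     = 4096 + sizeof(long) = 4104
 *
 * i.e. half the combined stack size, plus one delimiting entry.
 */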

2022-07-21 10:37:33

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 00/17] KVM nVHE Hypervisor stack unwinder

Hi Kalesh,


On Thu, Jul 21, 2022 at 6:57 AM Kalesh Singh <[email protected]> wrote:
>
> Hi all,
>
> This is v5 of nVHE hypervisor stacktraces support. The series is based on
> arm64 for-next/stacktrace.
>
> The previous versions were posted at:
> v4: https://lore.kernel.org/r/[email protected]/
> v3: https://lore.kernel.org/r/[email protected]/
> v2: https://lore.kernel.org/r/[email protected]/
> v1: https://lore.kernel.org/r/[email protected]/
>
> The main updates in this version are some refactoring to move stuff out of
> stacktrace/nvhe.h (leaving only the unwinder implementation in the header);
> and fixing the symbolization of the hyp stacktrace when KASLR is enabled;
> along with the addressing the other minor comments.
>
> Patch 18 (KVM: arm64: Dump nVHE hypervisor stack on panic) was also squashed
> into earlier patches.
>
> The previous cover letter is copied below for convenience.
>
> Thanks all for your feedback.

Thank you for this. This will be very helpful.

For the whole series:
Tested-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


>
> --Kalesh
>
> ============
>
> KVM nVHE Stack unwinding.
> ===
>
> nVHE has two modes of operation: protected (pKVM) and unprotected
> (conventional nVHE). Depending on the mode, a slightly different approach
> is used to dump the hypervisor stacktrace but the core unwinding logic
> remains the same.
>
> Protected nVHE (pKVM) stacktraces
> ====
>
> In protected nVHE mode, the host cannot directly access hypervisor memory.
>
> The hypervisor stack unwinding happens in EL2 and is made accessible to
> the host via a shared buffer. Symbolizing and printing the stacktrace
> addresses is delegated to the host and happens in EL1.
>
> Non-protected (Conventional) nVHE stacktraces
> ====
>
> In non-protected mode, the host is able to directly access the hypervisor
> stack pages.
>
> The hypervisor stack unwinding and dumping of the stacktrace is performed
> by the host in EL1, as this avoids the memory overhead of setting up
> shared buffers between the host and hypervisor.
>
> Resuing the Core Unwinding Logic
> ====
>
> Since the hypervisor cannot link against the kernel code in proteced mode.
> The common stack unwinding code is moved to a shared header to allow reuse
> in the nVHE hypervisor.
>
> Reducing the memory footprint
> ====
>
> In this version the below steps were taken to reduce the memory usage of
> nVHE stack unwinding:
>
> 1) The nVHE overflow stack is reduced from PAGE_SIZE to 4KB; benificial
> for configurations with non 4KB pages (16KB or 64KB pages).
> 2) In protected nVHE mode (pKVM), the shared stacktrace buffers with the
> host are reduced from PAGE_SIZE to the minimum size required.
> 3) In systems other than Android, conventional nVHE makes up the vast
> majority of use case. So the pKVM stack tracing is disabled by default
> (!CONFIG_PROTECTED_NVHE_STACKTRACE), which avoid the memory usage for
> setting up shared buffers.
> 4) In non-protected nVHE mode (conventional nVHE), the stack unwinding
> is done directly in EL1 by the host and no shared buffers with the
> hypervisor are needed.
>
> Sample Output
> ====
>
> The below shows an example output from a simple stack overflow test:
>
> [ 126.862960] kvm [371]: nVHE hyp panic at: [<ffff8000090a51d0>] __kvm_nvhe_recursive_death+0x10/0x34!
> [ 126.869920] kvm [371]: Protected nVHE HYP call trace:
> [ 126.870528] kvm [371]: [<ffff8000090a5570>] __kvm_nvhe_hyp_panic+0xac/0xf8
> [ 126.871342] kvm [371]: [<ffff8000090a55cc>] __kvm_nvhe_hyp_panic_bad_stack+0x10/0x10
> [ 126.872174] kvm [371]: [<ffff8000090a51e4>] __kvm_nvhe_recursive_death+0x24/0x34
> [ 126.872971] kvm [371]: [<ffff8000090a51e4>] __kvm_nvhe_recursive_death+0x24/0x34
> . . .
>
> [ 126.927314] kvm [371]: [<ffff8000090a51e4>] __kvm_nvhe_recursive_death+0x24/0x34
> [ 126.927727] kvm [371]: [<ffff8000090a51e4>] __kvm_nvhe_recursive_death+0x24/0x34
> [ 126.928137] kvm [371]: [<ffff8000090a4de4>] __kvm_nvhe___kvm_vcpu_run+0x30/0x40c
> [ 126.928561] kvm [371]: [<ffff8000090a7b64>] __kvm_nvhe_handle___kvm_vcpu_run+0x30/0x48
> [ 126.928984] kvm [371]: [<ffff8000090a78b8>] __kvm_nvhe_handle_trap+0xc4/0x128
> [ 126.929385] kvm [371]: [<ffff8000090a6864>] __kvm_nvhe___host_exit+0x64/0x64
> [ 126.929804] kvm [371]: ---- End of Protected nVHE HYP call trace ----
>
> ============
>
> Kalesh Singh (17):
> arm64: stacktrace: Add shared header for common stack unwinding code
> arm64: stacktrace: Factor out on_accessible_stack_common()
> arm64: stacktrace: Factor out unwind_next_common()
> arm64: stacktrace: Handle frame pointer from different address spaces
> arm64: stacktrace: Factor out common unwind()
> arm64: stacktrace: Add description of stacktrace/common.h
> KVM: arm64: On stack overflow switch to hyp overflow_stack
> KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig
> KVM: arm64: Allocate shared pKVM hyp stacktrace buffers
> KVM: arm64: Stub implementation of pKVM HYP stack unwinder
> KVM: arm64: Stub implementation of non-protected nVHE HYP stack
> unwinder
> KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace
> KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace
> KVM: arm64: Implement protected nVHE hyp stack unwinder
> KVM: arm64: Implement non-protected nVHE hyp stack unwinder
> KVM: arm64: Introduce pkvm_dump_backtrace()
> KVM: arm64: Introduce hyp_dump_backtrace()
>
> arch/arm64/include/asm/kvm_asm.h | 16 ++
> arch/arm64/include/asm/memory.h | 8 +
> arch/arm64/include/asm/stacktrace.h | 92 +++++----
> arch/arm64/include/asm/stacktrace/common.h | 230 +++++++++++++++++++++
> arch/arm64/include/asm/stacktrace/nvhe.h | 199 ++++++++++++++++++
> arch/arm64/kernel/stacktrace.c | 157 --------------
> arch/arm64/kvm/Kconfig | 15 ++
> arch/arm64/kvm/arm.c | 2 +-
> arch/arm64/kvm/handle_exit.c | 101 +++++++++
> arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
> arch/arm64/kvm/hyp/nvhe/host.S | 9 +-
> arch/arm64/kvm/hyp/nvhe/stacktrace.c | 116 +++++++++++
> arch/arm64/kvm/hyp/nvhe/switch.c | 6 +
> 13 files changed, 749 insertions(+), 204 deletions(-)
> create mode 100644 arch/arm64/include/asm/stacktrace/common.h
> create mode 100644 arch/arm64/include/asm/stacktrace/nvhe.h
> create mode 100644 arch/arm64/kvm/hyp/nvhe/stacktrace.c
>
>
> base-commit: 82a592c13b0aeff94d84d54183dae0b26384c95f
> --
> 2.37.0.170.g444d1eabd0-goog
>

2022-07-21 10:37:44

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 17/17] KVM: arm64: Introduce hyp_dump_backtrace()

Hi Kalesh,

On Thu, Jul 21, 2022 at 6:58 AM Kalesh Singh <[email protected]> wrote:
>
> In non-protected nVHE mode, unwinds and dumps the hypervisor backtrace
> from EL1. This is possible beacuase the host can directly access the
> hypervisor stack pages in non-proteced mode.

because and non-protected

>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---
>
> Changes in v5:
> - Move code out from nvhe.h header to handle_exit.c, per Marc
> - Fix stacktrace symoblization when CONFIG_RAMDOMIZE_BASE is enabled,
> per Fuad
> - Use regular comments instead of doc comments, per Fuad
>
> arch/arm64/kvm/handle_exit.c | 65 +++++++++++++++++++++++++++++++-----
> 1 file changed, 56 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index ad568da5c7d7..432b6b26f4ad 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c
> @@ -17,6 +17,7 @@
> #include <asm/kvm_emulate.h>
> #include <asm/kvm_mmu.h>
> #include <asm/debug-monitors.h>
> +#include <asm/stacktrace/nvhe.h>
> #include <asm/traps.h>
>
> #include <kvm/arm_hypercalls.h>
> @@ -318,6 +319,56 @@ void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index)
> kvm_handle_guest_serror(vcpu, kvm_vcpu_get_esr(vcpu));
> }
>
> +/*
> + * kvm_nvhe_print_backtrace_entry - Symbolizes and prints the HYP stack address
> + */
> +static void kvm_nvhe_print_backtrace_entry(unsigned long addr,
> + unsigned long hyp_offset)
> +{
> + unsigned long va_mask = GENMASK_ULL(vabits_actual - 1, 0);
> +
> + /* Mask tags and convert to kern addr */
> + addr = (addr & va_mask) + hyp_offset;
> + kvm_err(" [<%016lx>] %pB\n", addr, (void *)(addr + kaslr_offset()));
> +}
> +
> +/*
> + * hyp_dump_backtrace_entry - Dump an entry of the non-protected nVHE HYP stacktrace
> + *
> + * @arg : the hypervisor offset, used for address translation
> + * @where : the program counter corresponding to the stack frame
> + */
> +static bool hyp_dump_backtrace_entry(void *arg, unsigned long where)
> +{
> + kvm_nvhe_print_backtrace_entry(where, (unsigned long)arg);
> +
> + return true;
> +}
> +
> +/*
> + * hyp_dump_backtrace - Dump the non-proteced nVHE HYP backtrace.

non-protected

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


> + *
> + * @hyp_offset: hypervisor offset, used for address translation.
> + *
> + * The host can directly access HYP stack pages in non-protected
> + * mode, so the unwinding is done directly from EL1. This removes
> + * the need for shared buffers between host and hypervisor for
> + * the stacktrace.
> + */
> +static void hyp_dump_backtrace(unsigned long hyp_offset)
> +{
> + struct kvm_nvhe_stacktrace_info *stacktrace_info;
> + struct unwind_state state;
> +
> + stacktrace_info = this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
> +
> + kvm_nvhe_unwind_init(&state, stacktrace_info->fp, stacktrace_info->pc);
> +
> + kvm_err("Non-protected nVHE HYP call trace:\n");
> + unwind(&state, hyp_dump_backtrace_entry, (void *)hyp_offset);
> + kvm_err("---- End of Non-protected nVHE HYP call trace ----\n");
> +}
> +
> #ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
> DECLARE_KVM_NVHE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)],
> pkvm_stacktrace);
> @@ -336,18 +387,12 @@ static void pkvm_dump_backtrace(unsigned long hyp_offset)
> {
> unsigned long *stacktrace_entry
> = (unsigned long *)this_cpu_ptr_nvhe_sym(pkvm_stacktrace);
> - unsigned long va_mask, pc;
> -
> - va_mask = GENMASK_ULL(vabits_actual - 1, 0);
>
> kvm_err("Protected nVHE HYP call trace:\n");
>
> - /* The stack trace is terminated by a null entry */
> - for (; *stacktrace_entry; stacktrace_entry++) {
> - /* Mask tags and convert to kern addr */
> - pc = (*stacktrace_entry & va_mask) + hyp_offset;
> - kvm_err(" [<%016lx>] %pB\n", pc, (void *)(pc + kaslr_offset()));
> - }
> + /* The saved stacktrace is terminated by a null entry */
> + for (; *stacktrace_entry; stacktrace_entry++)
> + kvm_nvhe_print_backtrace_entry(*stacktrace_entry, hyp_offset);
>
> kvm_err("---- End of Protected nVHE HYP call trace ----\n");
> }
> @@ -367,6 +412,8 @@ static void kvm_nvhe_dump_backtrace(unsigned long hyp_offset)
> {
> if (is_protected_kvm_enabled())
> pkvm_dump_backtrace(hyp_offset);
> + else
> + hyp_dump_backtrace(hyp_offset);
> }
>
> void __noreturn __cold nvhe_hyp_panic_handler(u64 esr, u64 spsr,
> --
> 2.37.0.170.g444d1eabd0-goog
>

2022-07-21 10:37:48

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 06/17] arm64: stacktrace: Add description of stacktrace/common.h

Hi Kalesh,

On Thu, Jul 21, 2022 at 6:57 AM Kalesh Singh <[email protected]> wrote:
>
> Add brief description on how to use stacktrace/common.h to implement
> a stack unwinder.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---
>
> Changes in v5:
> - Add short description of each required function, per Fuad and Marc
> - Add Reviewed-by tag from Fuad

Actually it's missing :)

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


>
> arch/arm64/include/asm/stacktrace/common.h | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/arch/arm64/include/asm/stacktrace/common.h b/arch/arm64/include/asm/stacktrace/common.h
> index 7807752aaab1..be7920ba70b0 100644
> --- a/arch/arm64/include/asm/stacktrace/common.h
> +++ b/arch/arm64/include/asm/stacktrace/common.h
> @@ -2,6 +2,21 @@
> /*
> * Common arm64 stack unwinder code.
> *
> + * To implement a new arm64 stack unwinder:
> + * 1) Include this header
> + *
> + * 2) Provide implementations for the following functions:
> + * on_overflow_stack(): Returns true if SP is on the overflow
> + * stack.
> + * on_accessible_stack(): Returns true is SP is on any accessible
> + * stack.
> + * unwind_next(): Performs validation checks on the frame
> + * pointer, and transitions unwind_state
> + * to the next frame.
> + *
> + * See: arch/arm64/include/asm/stacktrace.h for reference
> + * implementations.
> + *
> * Copyright (C) 2012 ARM Ltd.
> */
> #ifndef __ASM_STACKTRACE_COMMON_H
> --
> 2.37.0.170.g444d1eabd0-goog
>
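
Put together, a bare-bones unwinder satisfying the contract described
above could look like the following. This is a hypothetical single-stack
example: my_stack_low/my_stack_high are assumed bounds, and a real
implementation would handle multiple stack types as the reference
implementations do.

#include <asm/stacktrace/common.h>

static unsigned long my_stack_low, my_stack_high;

static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
                                     struct stack_info *info)
{
        return false;   /* no overflow stack in this toy example */
}

static inline bool on_accessible_stack(const struct task_struct *tsk,
                                       unsigned long sp, unsigned long size,
                                       struct stack_info *info)
{
        return on_stack(sp, size, my_stack_low, my_stack_high,
                        STACK_TYPE_TASK, info);
}

static inline int notrace unwind_next(struct unwind_state *state)
{
        struct stack_info info;

        /* No fp translation needed within a single address space. */
        return unwind_next_common(state, &info, NULL);
}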

2022-07-21 10:37:50

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 13/17] KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace

Hi Kalesh,


On Thu, Jul 21, 2022 at 6:58 AM Kalesh Singh <[email protected]> wrote:
>
> In non-protected nVHE mode (non-pKVM) the host can directly access
> hypervisor memory; and unwinding of the hypervisor stacktrace is
> done from EL1 to save on memory for shared buffers.
>
> To unwind the hypervisor stack from EL1 the host needs to know the
> starting point for the unwind and information that will allow it to
> translate hypervisor stack addresses to the corresponding kernel
> addresses. This patch sets up this book keeping. It is made use of
> later in the series.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


>
> Changes in v5:
> - Use regular comments instead of doc comments, per Fuad
>
> arch/arm64/include/asm/kvm_asm.h | 16 ++++++++++++++++
> arch/arm64/include/asm/stacktrace/nvhe.h | 4 ++++
> arch/arm64/kvm/hyp/nvhe/stacktrace.c | 24 ++++++++++++++++++++++++
> 3 files changed, 44 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index 2e277f2ed671..53035763e48e 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -176,6 +176,22 @@ struct kvm_nvhe_init_params {
> unsigned long vtcr;
> };
>
> +/*
> + * Used by the host in EL1 to dump the nVHE hypervisor backtrace on
> + * hyp_panic() in non-protected mode.
> + *
> + * @stack_base: hyp VA of the hyp_stack base.
> + * @overflow_stack_base: hyp VA of the hyp_overflow_stack base.
> + * @fp: hyp FP where the backtrace begins.
> + * @pc: hyp PC where the backtrace begins.
> + */
> +struct kvm_nvhe_stacktrace_info {
> + unsigned long stack_base;
> + unsigned long overflow_stack_base;
> + unsigned long fp;
> + unsigned long pc;
> +};
> +
> /* Translate a kernel address @ptr into its equivalent linear mapping */
> #define kvm_ksym_ref(ptr) \
> ({ \
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
> index 05d7e03e0a8c..8f02803a005f 100644
> --- a/arch/arm64/include/asm/stacktrace/nvhe.h
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -19,6 +19,7 @@
> #ifndef __ASM_STACKTRACE_NVHE_H
> #define __ASM_STACKTRACE_NVHE_H
>
> +#include <asm/kvm_asm.h>
> #include <asm/stacktrace/common.h>
>
> /*
> @@ -52,6 +53,9 @@ static inline bool on_accessible_stack(const struct task_struct *tsk,
> * In protected mode, the unwinding is done by the hypervisor in EL2.
> */
>
> +DECLARE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack);
> +DECLARE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
> +
> #ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
> static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
> struct stack_info *info)
> diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> index 60461c033a04..cbd365f4f26a 100644
> --- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> +++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> @@ -9,6 +9,28 @@
> DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)
> __aligned(16);
>
> +DEFINE_PER_CPU(struct kvm_nvhe_stacktrace_info, kvm_stacktrace_info);
> +
> +/*
> + * hyp_prepare_backtrace - Prepare non-protected nVHE backtrace.
> + *
> + * @fp : frame pointer at which to start the unwinding.
> + * @pc : program counter at which to start the unwinding.
> + *
> + * Save the information needed by the host to unwind the non-protected
> + * nVHE hypervisor stack in EL1.
> + */
> +static void hyp_prepare_backtrace(unsigned long fp, unsigned long pc)
> +{
> + struct kvm_nvhe_stacktrace_info *stacktrace_info = this_cpu_ptr(&kvm_stacktrace_info);
> + struct kvm_nvhe_init_params *params = this_cpu_ptr(&kvm_init_params);
> +
> + stacktrace_info->stack_base = (unsigned long)(params->stack_hyp_va - PAGE_SIZE);
> + stacktrace_info->overflow_stack_base = (unsigned long)this_cpu_ptr(overflow_stack);
> + stacktrace_info->fp = fp;
> + stacktrace_info->pc = pc;
> +}
> +
> #ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
> DEFINE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)], pkvm_stacktrace);
>
> @@ -89,4 +111,6 @@ void kvm_nvhe_prepare_backtrace(unsigned long fp, unsigned long pc)
> {
> if (is_protected_kvm_enabled())
> pkvm_save_backtrace(fp, pc);
> + else
> + hyp_prepare_backtrace(fp, pc);
> }
> --
> 2.37.0.170.g444d1eabd0-goog
>
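
A sketch of the bookkeeping recorded by hyp_prepare_backtrace() above
(the stack grows down; how EL1 uses it is anticipated from the next
patch in the series):

/*
 *   params->stack_hyp_va ---------> high end of the hyp stack
 *        |
 *        |  PAGE_SIZE
 *        v
 *   stacktrace_info->stack_base --> low end; paired with the host's
 *                                   kvm_arm_hyp_stack_page, this lets
 *                                   EL1 rebase hyp fps onto its own
 *                                   mapping of the same page
 */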

2022-07-21 10:39:18

by Fuad Tabba

[permalink] [raw]
Subject: Re: [PATCH v5 15/17] KVM: arm64: Implement non-protected nVHE hyp stack unwinder

Hi Kalesh,

On Thu, Jul 21, 2022 at 6:58 AM Kalesh Singh <[email protected]> wrote:
>
> Implements the common framework necessary for unwind() to work
> for non-protected nVHE mode:
> - on_accessible_stack()
> - on_overflow_stack()
> - unwind_next()
>
> Non-protected nVHE unwind() is used to unwind and dump the hypervisor
> stacktrace by the host in EL1
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---

Reviewed-by: Fuad Tabba <[email protected]>

Cheers,
/fuad


>
> Changes in v5:
> - Use regular comments instead of doc comments, per Fuad
>
> arch/arm64/include/asm/stacktrace/nvhe.h | 67 +++++++++++++++++++++++-
> arch/arm64/kvm/arm.c | 2 +-
> 2 files changed, 66 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
> index c3688e717136..7a6e761aa443 100644
> --- a/arch/arm64/include/asm/stacktrace/nvhe.h
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -120,15 +120,78 @@ NOKPROBE_SYMBOL(unwind_next);
> * (by the host in EL1).
> */
>
> +DECLARE_KVM_NVHE_PER_CPU(unsigned long [PAGE_SIZE/sizeof(long)], overflow_stack);
> +DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_stacktrace_info, kvm_stacktrace_info);
> +DECLARE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
> +
> +/*
> + * kvm_nvhe_stack_kern_va - Convert KVM nVHE HYP stack addresses to a kernel VAs
> + *
> + * The nVHE hypervisor stack is mapped in the flexible 'private' VA range, to
> + * allow for guard pages below the stack. Consequently, the fixed offset address
> + * translation macros won't work here.
> + *
> + * The kernel VA is calculated as an offset from the kernel VA of the hypervisor
> + * stack base.
> + *
> + * Returns true on success and updates @addr to its corresponding kernel VA;
> + * otherwise returns false.
> + */
> +static inline bool kvm_nvhe_stack_kern_va(unsigned long *addr,
> + enum stack_type type)
> +{
> + struct kvm_nvhe_stacktrace_info *stacktrace_info;
> + unsigned long hyp_base, kern_base, hyp_offset;
> +
> + stacktrace_info = this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
> +
> + switch (type) {
> + case STACK_TYPE_HYP:
> + kern_base = (unsigned long)*this_cpu_ptr(&kvm_arm_hyp_stack_page);
> + hyp_base = (unsigned long)stacktrace_info->stack_base;
> + break;
> + case STACK_TYPE_OVERFLOW:
> + kern_base = (unsigned long)this_cpu_ptr_nvhe_sym(overflow_stack);
> + hyp_base = (unsigned long)stacktrace_info->overflow_stack_base;
> + break;
> + default:
> + return false;
> + }
> +
> + hyp_offset = *addr - hyp_base;
> +
> + *addr = kern_base + hyp_offset;
> +
> + return true;
> +}
> +
> static inline bool on_overflow_stack(unsigned long sp, unsigned long size,
> struct stack_info *info)
> {
> - return false;
> + struct kvm_nvhe_stacktrace_info *stacktrace_info
> + = this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
> + unsigned long low = (unsigned long)stacktrace_info->overflow_stack_base;
> + unsigned long high = low + OVERFLOW_STACK_SIZE;
> +
> + return on_stack(sp, size, low, high, STACK_TYPE_OVERFLOW, info);
> +}
> +
> +static inline bool on_hyp_stack(unsigned long sp, unsigned long size,
> + struct stack_info *info)
> +{
> + struct kvm_nvhe_stacktrace_info *stacktrace_info
> + = this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
> + unsigned long low = (unsigned long)stacktrace_info->stack_base;
> + unsigned long high = low + PAGE_SIZE;
> +
> + return on_stack(sp, size, low, high, STACK_TYPE_HYP, info);
> }
>
> static inline int notrace unwind_next(struct unwind_state *state)
> {
> - return 0;
> + struct stack_info info;
> +
> + return unwind_next_common(state, &info, kvm_nvhe_stack_kern_va);
> }
> NOKPROBE_SYMBOL(unwind_next);
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index a0188144a122..6a64293108c5 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -49,7 +49,7 @@ DEFINE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
>
> DECLARE_KVM_HYP_PER_CPU(unsigned long, kvm_hyp_vector);
>
> -static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
> +DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
> unsigned long kvm_arm_hyp_percpu_base[NR_CPUS];
> DECLARE_KVM_NVHE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
>
> --
> 2.37.0.170.g444d1eabd0-goog
>

2022-07-21 16:10:28

by Kalesh Singh

[permalink] [raw]
Subject: Re: [PATCH v5 00/17] KVM nVHE Hypervisor stack unwinder

On Thu, Jul 21, 2022 at 2:56 AM Fuad Tabba <[email protected]> wrote:
>
> Hi Kalesh,
>
>
> On Thu, Jul 21, 2022 at 6:57 AM Kalesh Singh <[email protected]> wrote:
> >
> > [...]
> >
> > Thanks all for your feedback.
>
> Thank you for this. This will be very helpful.
>
> For the whole series:
> Tested-by: Fuad Tabba <[email protected]>

Thanks for your reviews, Fuad.

--Kalesh

>
> Cheers,
> /fuad
>
>
> > [...]

2022-07-21 20:51:09

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH v5 17/17] KVM: arm64: Introduce hyp_dump_backtrace()

Hi Kalesh,

Nifty series! Had the chance to take it for a spin :) Few comments
below:

On Wed, Jul 20, 2022 at 10:57:28PM -0700, Kalesh Singh wrote:
> In non-protected nVHE mode, unwind and dump the hypervisor backtrace
> from EL1. This is possible because the host can directly access the
> hypervisor stack pages in non-protected mode.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---
>
> Changes in v5:
> - Move code out from nvhe.h header to handle_exit.c, per Marc
> - Fix stacktrace symbolization when CONFIG_RANDOMIZE_BASE is enabled,
> per Fuad
> - Use regular comments instead of doc comments, per Fuad
>
> arch/arm64/kvm/handle_exit.c | 65 +++++++++++++++++++++++++++++++-----
> 1 file changed, 56 insertions(+), 9 deletions(-)
>
> diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> index ad568da5c7d7..432b6b26f4ad 100644
> --- a/arch/arm64/kvm/handle_exit.c
> +++ b/arch/arm64/kvm/handle_exit.c

[...]

> @@ -318,6 +319,56 @@ void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index)
> kvm_handle_guest_serror(vcpu, kvm_vcpu_get_esr(vcpu));
> }
>
> +/*
> + * kvm_nvhe_print_backtrace_entry - Symbolizes and prints the HYP stack address
> + */
> +static void kvm_nvhe_print_backtrace_entry(unsigned long addr,
> + unsigned long hyp_offset)
> +{
> + unsigned long va_mask = GENMASK_ULL(vabits_actual - 1, 0);
> +
> + /* Mask tags and convert to kern addr */
> + addr = (addr & va_mask) + hyp_offset;
> + kvm_err(" [<%016lx>] %pB\n", addr, (void *)(addr + kaslr_offset()));
> +}

It is a bit odd to see this get churned from the last patch. Is it
possible to introduce the helper earlier on? In fact, the non-protected
patch should come first to layer the series better.

Also, it seems to me that there isn't much need for the indirection if
the pKVM unwinder is made to work around the below function signature:

<snip>

> +/*
> + * hyp_dump_backtrace_entry - Dump an entry of the non-protected nVHE HYP stacktrace
> + *
> + * @arg : the hypervisor offset, used for address translation
> + * @where : the program counter corresponding to the stack frame
> + */
> +static bool hyp_dump_backtrace_entry(void *arg, unsigned long where)
> +{
> + kvm_nvhe_print_backtrace_entry(where, (unsigned long)arg);
> +
> + return true;
> +}

</snip>

> +/*
> + * hyp_dump_backtrace - Dump the non-protected nVHE HYP backtrace.
> + *
> + * @hyp_offset: hypervisor offset, used for address translation.
> + *
> + * The host can directly access HYP stack pages in non-protected
> + * mode, so the unwinding is done directly from EL1. This removes
> + * the need for shared buffers between host and hypervisor for
> + * the stacktrace.
> + */
> +static void hyp_dump_backtrace(unsigned long hyp_offset)
> +{
> + struct kvm_nvhe_stacktrace_info *stacktrace_info;
> + struct unwind_state state;
> +
> + stacktrace_info = this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
> +
> + kvm_nvhe_unwind_init(&state, stacktrace_info->fp, stacktrace_info->pc);
> +
> + kvm_err("Non-protected nVHE HYP call trace:\n");

I don't see the value in discerning non-protected vs. protected in the
preamble, as panic() will dump the kernel commandline which presumably
would have a `kvm-arm.mode=protected` in the case of pKVM.

<nit>

Not entirely your problem, but we should really use a consistent name
for that thing we have living at EL2. Hyp or nVHE (but not both) would
be appropriate.

</nit>

> + unwind(&state, hyp_dump_backtrace_entry, (void *)hyp_offset);
> + kvm_err("---- End of Non-protected nVHE HYP call trace ----\n");

Maybe:

kvm_err("---[ end ${NAME_FOR_EL2} trace ]---");

(more closely matches the pattern of the arm64 stacktrace)

--
Thanks,
Oliver

2022-07-22 00:06:55

by Kalesh Singh

[permalink] [raw]
Subject: Re: [PATCH v5 17/17] KVM: arm64: Introduce hyp_dump_backtrace()

On Thu, Jul 21, 2022 at 1:35 PM Oliver Upton <[email protected]> wrote:
>
> Hi Kalesh,
>
> Nifty series! Had the chance to take it for a spin :) Few comments
> below:

Hi Oliver. Thanks for giving it a try :)

>
> On Wed, Jul 20, 2022 at 10:57:28PM -0700, Kalesh Singh wrote:
> > In non-protected nVHE mode, unwind and dump the hypervisor backtrace
> > from EL1. This is possible because the host can directly access the
> > hypervisor stack pages in non-protected mode.
> >
> > Signed-off-by: Kalesh Singh <[email protected]>
> > ---
> >
> > Changes in v5:
> > - Move code out from nvhe.h header to handle_exit.c, per Marc
> > - Fix stacktrace symbolization when CONFIG_RANDOMIZE_BASE is enabled,
> > per Fuad
> > - Use regular comments instead of doc comments, per Fuad
> >
> > arch/arm64/kvm/handle_exit.c | 65 +++++++++++++++++++++++++++++++-----
> > 1 file changed, 56 insertions(+), 9 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
> > index ad568da5c7d7..432b6b26f4ad 100644
> > --- a/arch/arm64/kvm/handle_exit.c
> > +++ b/arch/arm64/kvm/handle_exit.c
>
> [...]
>
> > @@ -318,6 +319,56 @@ void handle_exit_early(struct kvm_vcpu *vcpu, int exception_index)
> > kvm_handle_guest_serror(vcpu, kvm_vcpu_get_esr(vcpu));
> > }
> >
> > +/*
> > + * kvm_nvhe_print_backtrace_entry - Symbolizes and prints the HYP stack address
> > + */
> > +static void kvm_nvhe_print_backtrace_entry(unsigned long addr,
> > + unsigned long hyp_offset)
> > +{
> > + unsigned long va_mask = GENMASK_ULL(vabits_actual - 1, 0);
> > +
> > + /* Mask tags and convert to kern addr */
> > + addr = (addr & va_mask) + hyp_offset;
> > + kvm_err(" [<%016lx>] %pB\n", addr, (void *)(addr + kaslr_offset()));
> > +}
>
> It is a bit odd to see this get churned from the last patch. Is it
> possible to introduce the helper earlier on? In fact, the non-protected
> patch should come first to layer the series better.

Agreed. pKVM is the one with the extra layer. I'll reorder this in the
next version.

>
> Also, it seems to me that there isn't much need for the indirection if
> the pKVM unwinder is made to work around the below function signature:

Ok, I'll fold it into the below function.
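
Something like this, perhaps (untested sketch; just the two existing
helpers folded together, no new logic):

	static bool hyp_dump_backtrace_entry(void *arg, unsigned long where)
	{
		unsigned long va_mask = GENMASK_ULL(vabits_actual - 1, 0);
		unsigned long hyp_offset = (unsigned long)arg;

		/* Mask tags and convert to a kernel address */
		where = (where & va_mask) + hyp_offset;
		kvm_err(" [<%016lx>] %pB\n", where, (void *)(where + kaslr_offset()));

		return true;
	}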

>
> <snip>
>
> > +/*
> > + * hyp_dump_backtrace_entry - Dump an entry of the non-protected nVHE HYP stacktrace
> > + *
> > + * @arg : the hypervisor offset, used for address translation
> > + * @where : the program counter corresponding to the stack frame
> > + */
> > +static bool hyp_dump_backtrace_entry(void *arg, unsigned long where)
> > +{
> > + kvm_nvhe_print_backtrace_entry(where, (unsigned long)arg);
> > +
> > + return true;
> > +}
>
> </snip>
>
> > +/*
> > + * hyp_dump_backtrace - Dump the non-protected nVHE HYP backtrace.
> > + *
> > + * @hyp_offset: hypervisor offset, used for address translation.
> > + *
> > + * The host can directly access HYP stack pages in non-protected
> > + * mode, so the unwinding is done directly from EL1. This removes
> > + * the need for shared buffers between host and hypervisor for
> > + * the stacktrace.
> > + */
> > +static void hyp_dump_backtrace(unsigned long hyp_offset)
> > +{
> > + struct kvm_nvhe_stacktrace_info *stacktrace_info;
> > + struct unwind_state state;
> > +
> > + stacktrace_info = this_cpu_ptr_nvhe_sym(kvm_stacktrace_info);
> > +
> > + kvm_nvhe_unwind_init(&state, stacktrace_info->fp, stacktrace_info->pc);
> > +
> > + kvm_err("Non-protected nVHE HYP call trace:\n");
>
> I don't see the value in discerning non-protected vs. protected in the
> preamble, as panic() will dump the kernel commandline which presumably
> would have a `kvm-arm.mode=protected` in the case of pKVM.

Ok, I'll remove the distinction.

>
> <nit>
>
> Not entirely your problem, but we should really use a consistent name
> for that thing we have living at EL2. Hyp or nVHE (but not both) would
> be appropriate.
>

Right, I think just "nVHE" is probably sufficient. (Open to other suggestions)

> </nit>
>
> > + unwind(&state, hyp_dump_backtrace_entry, (void *)hyp_offset);
> > + kvm_err("---- End of Non-protected nVHE HYP call trace ----\n");
>
> Maybe:
>
> kvm_err("---[ end ${NAME_FOR_EL2} trace ]---");
>
> (more closely matches the pattern of the arm64 stacktrace)

Agreed.

Thanks,
Kalesh

>
> --
> Thanks,
> Oliver

2022-07-22 11:05:24

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH v5 00/17] KVM nVHE Hypervisor stack unwinder

Hi Kalesh,

On Wed, Jul 20, 2022 at 10:57:11PM -0700, Kalesh Singh wrote:

[...]

> Kalesh Singh (17):
> arm64: stacktrace: Add shared header for common stack unwinding code
> arm64: stacktrace: Factor out on_accessible_stack_common()
> arm64: stacktrace: Factor out unwind_next_common()
> arm64: stacktrace: Handle frame pointer from different address spaces
> arm64: stacktrace: Factor out common unwind()
> arm64: stacktrace: Add description of stacktrace/common.h
> KVM: arm64: On stack overflow switch to hyp overflow_stack
> KVM: arm64: Add PROTECTED_NVHE_STACKTRACE Kconfig
> KVM: arm64: Allocate shared pKVM hyp stacktrace buffers
> KVM: arm64: Stub implementation of pKVM HYP stack unwinder
> KVM: arm64: Stub implementation of non-protected nVHE HYP stack
> unwinder
> KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace
> KVM: arm64: Prepare non-protected nVHE hypervisor stacktrace
> KVM: arm64: Implement protected nVHE hyp stack unwinder
> KVM: arm64: Implement non-protected nVHE hyp stack unwinder
> KVM: arm64: Introduce pkvm_dump_backtrace()
> KVM: arm64: Introduce hyp_dump_backtrace()

Adding a general comment on the organization of the series. I think for
the next spin it'd be good to organize the entire non-pKVM
implementation first, followed by the pKVM implementation. Otherwise,
reviewers need to jump around the series a lot in order to page in the
appropriate context.

I had mentioned this about the last two patches earlier, but after
grokking the stack I see the comment applies to the entire KVM portion
of the series.

--
Thanks,
Oliver

2022-07-22 11:23:57

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH v5 16/17] KVM: arm64: Introduce pkvm_dump_backtrace()

Hi Kalesh,

On Wed, Jul 20, 2022 at 10:57:27PM -0700, Kalesh Singh wrote:

[...]

> +/*
> + * pkvm_dump_backtrace - Dump the protected nVHE HYP backtrace.
> + *
> + * @hyp_offset: hypervisor offset, used for address translation.
> + *
> + * Dumping of the pKVM HYP backtrace is done by reading the
> + * stack addresses from the shared stacktrace buffer, since the
> + * host cannot directly access hypervisor memory in protected
> + * mode.
> + */
> +static void pkvm_dump_backtrace(unsigned long hyp_offset)
> +{
> + unsigned long *stacktrace_entry
> + = (unsigned long *)this_cpu_ptr_nvhe_sym(pkvm_stacktrace);
> + unsigned long va_mask, pc;
> +
> + va_mask = GENMASK_ULL(vabits_actual - 1, 0);
> +
> + kvm_err("Protected nVHE HYP call trace:\n");

This and the footer printks should be put in respective helpers to share
between pKVM and non-pKVM backtrace implementations. I imagine users
will invariably bake some pattern matching to scrape traces, and it
should be consistent between both flavors.
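
e.g. (helper names made up here, only to illustrate the shape):

	static void kvm_nvhe_dump_backtrace_start(void)
	{
		kvm_err("nVHE call trace:\n");
	}

	static void kvm_nvhe_dump_backtrace_end(void)
	{
		kvm_err("---[ end nVHE call trace ]---\n");
	}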

> + /* The stack trace is terminated by a null entry */
> + for (; *stacktrace_entry; stacktrace_entry++) {

At the point we're dumping the backtrace we know that EL2 has already
soiled itself, so we shouldn't explicitly depend on it providing NULL
terminators. I believe this loop should have an explicit range && NULL
check.
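
Maybe something like this (untested; assumes the
kvm_nvhe_print_backtrace_entry() helper used elsewhere in the series
and the NVHE_STACKTRACE_SIZE definition from the earlier patch):

	unsigned long *stacktrace = (unsigned long *)this_cpu_ptr_nvhe_sym(pkvm_stacktrace);
	int i;

	/* Stop at the end of the buffer or at the first NULL entry */
	for (i = 0; i < NVHE_STACKTRACE_SIZE / sizeof(long); i++) {
		if (!stacktrace[i])
			break;
		kvm_nvhe_print_backtrace_entry(stacktrace[i], hyp_offset);
	}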

--
Thanks,
Oliver

2022-07-22 15:46:18

by Oliver Upton

[permalink] [raw]
Subject: Re: [PATCH v5 12/17] KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace

Hi Kalesh,

On Wed, Jul 20, 2022 at 10:57:23PM -0700, Kalesh Singh wrote:
> In protected nVHE mode, the host cannot access privately owned hypervisor
> memory. Also, the hypervisor aims to remain simple to reduce the attack
> surface and does not provide any printk support.
>
> For the above reasons, the approach taken to provide hypervisor stacktraces
> in protected mode is:
> 1) Unwind and save the hyp stack addresses in EL2 to a shared buffer
> with the host (done in this patch).
> 2) Delegate the dumping and symbolization of the addresses to the
> host in EL1 (later patch in the series).
>
> On hyp_panic(), the hypervisor prepares the stacktrace before returning to
> the host.
>
> Signed-off-by: Kalesh Singh <[email protected]>
> ---

Tried reworking this a bit and here is what I arrived at, WDYT?
Untested, of course :)

--
Thanks,
Oliver

diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
index 3078501f8e22..05d7e03e0a8c 100644
--- a/arch/arm64/include/asm/stacktrace/nvhe.h
+++ b/arch/arm64/include/asm/stacktrace/nvhe.h
@@ -21,6 +21,23 @@

#include <asm/stacktrace/common.h>

+/*
+ * kvm_nvhe_unwind_init - Start an unwind from the given nVHE HYP fp and pc
+ *
+ * @state : unwind_state to initialize
+ * @fp : frame pointer at which to start the unwinding.
+ * @pc : program counter at which to start the unwinding.
+ */
+static inline void kvm_nvhe_unwind_init(struct unwind_state *state,
+ unsigned long fp,
+ unsigned long pc)
+{
+ unwind_init_common(state, NULL);
+
+ state->fp = fp;
+ state->pc = pc;
+}
+
static inline bool on_accessible_stack(const struct task_struct *tsk,
unsigned long sp, unsigned long size,
struct stack_info *info)
diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
index 96c8b93320eb..644276fb02af 100644
--- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
+++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
@@ -11,4 +11,69 @@ DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)

#ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
DEFINE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)], pkvm_stacktrace);
+
+/*
+ * pkvm_save_backtrace_entry - Saves a protected nVHE HYP stacktrace entry
+ *
+ * @arg : the index of the entry in the stacktrace buffer
+ * @where : the program counter corresponding to the stack frame
+ *
+ * Save the return address of a stack frame to the shared stacktrace buffer.
+ * The host can access this shared buffer from EL1 to dump the backtrace.
+ */
+static bool pkvm_save_backtrace_entry(void *arg, unsigned long where)
+{
+ unsigned long *stacktrace = this_cpu_ptr(pkvm_stacktrace);
+ int *idx = (int *)arg;
+
+ /*
+ * Need 2 free slots: 1 for current entry and 1 for the
+ * delimiter.
+ */
+ if (*idx > ARRAY_SIZE(pkvm_stacktrace) - 2)
+ return false;
+
+ stacktrace[*idx] = where;
+ stacktrace[++*idx] = 0UL;
+
+ return true;
+}
+
+/*
+ * pkvm_save_backtrace - Saves the protected nVHE HYP stacktrace
+ *
+ * @fp : frame pointer at which to start the unwinding.
+ * @pc : program counter at which to start the unwinding.
+ *
> + * Save the unwound stack addresses to the shared stacktrace buffer.
+ * The host can access this shared buffer from EL1 to dump the backtrace.
+ */
+static void pkvm_save_backtrace(unsigned long fp, unsigned long pc)
+{
+ struct unwind_state state;
+ int idx = 0;
+
+ kvm_nvhe_unwind_init(&state, fp, pc);
+
+ unwind(&state, pkvm_save_backtrace_entry, &idx);
+}
+#else /* !CONFIG_PROTECTED_NVHE_STACKTRACE */
+static void pkvm_save_backtrace(unsigned long fp, unsigned long pc)
+{
+}
#endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
+
+/*
+ * kvm_nvhe_prepare_backtrace - prepare to dump the nVHE backtrace
+ *
+ * @fp : frame pointer at which to start the unwinding.
+ * @pc : program counter at which to start the unwinding.
+ *
+ * Saves the information needed by the host to dump the nVHE hypervisor
+ * backtrace.
+ */
+void kvm_nvhe_prepare_backtrace(unsigned long fp, unsigned long pc)
+{
+ if (is_protected_kvm_enabled())
+ pkvm_save_backtrace(fp, pc);
+}
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 6db801db8f27..64e13445d0d9 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -34,6 +34,8 @@ DEFINE_PER_CPU(struct kvm_host_data, kvm_host_data);
DEFINE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);
DEFINE_PER_CPU(unsigned long, kvm_hyp_vector);

+extern void kvm_nvhe_prepare_backtrace(unsigned long fp, unsigned long pc);
+
static void __activate_traps(struct kvm_vcpu *vcpu)
{
u64 val;
@@ -375,6 +377,10 @@ asmlinkage void __noreturn hyp_panic(void)
__sysreg_restore_state_nvhe(host_ctxt);
}

+ /* Prepare to dump kvm nvhe hyp stacktrace */
+ kvm_nvhe_prepare_backtrace((unsigned long)__builtin_frame_address(0),
+ _THIS_IP_);
+
__hyp_do_panic(host_ctxt, spsr, elr, par);
unreachable();
}

2022-07-22 17:27:40

by Kalesh Singh

[permalink] [raw]
Subject: Re: [PATCH v5 16/17] KVM: arm64: Introduce pkvm_dump_backtrace()

On Fri, Jul 22, 2022 at 4:16 AM Oliver Upton <[email protected]> wrote:
>
> Hi Kalesh,
>
> On Wed, Jul 20, 2022 at 10:57:27PM -0700, Kalesh Singh wrote:
>
> [...]
>
> > +/*
> > + * pkvm_dump_backtrace - Dump the protected nVHE HYP backtrace.
> > + *
> > + * @hyp_offset: hypervisor offset, used for address translation.
> > + *
> > + * Dumping of the pKVM HYP backtrace is done by reading the
> > + * stack addresses from the shared stacktrace buffer, since the
> > + * host cannot directly access hypervisor memory in protected
> > + * mode.
> > + */
> > +static void pkvm_dump_backtrace(unsigned long hyp_offset)
> > +{
> > + unsigned long *stacktrace_entry
> > + = (unsigned long *)this_cpu_ptr_nvhe_sym(pkvm_stacktrace);
> > + unsigned long va_mask, pc;
> > +
> > + va_mask = GENMASK_ULL(vabits_actual - 1, 0);
> > +
> > + kvm_err("Protected nVHE HYP call trace:\n");
>
> This and the footer printks should be put in respective helpers to share
> between pKVM and non-pKVM backtrace implementations. I imagine users
> will invariably bake some pattern matching to scrape traces, and it
> should be consistent between both flavors.

Hi Oliver,

Ok will split these out into helpers.

>
> > + /* The stack trace is terminated by a null entry */
> > + for (; *stacktrace_entry; stacktrace_entry++) {
>
> At the point we're dumping the backtrace we know that EL2 has already
> soiled itself, so we shouldn't explicitly depend on it providing NULL
> terminators. I believe this loop should have an explicit range && NULL
> check.

Good point, I'll add the additional checks in the next version,

Thanks,
Kalesh
>
> --
> Thanks,
> Oliver

2022-07-22 17:59:25

by Kalesh Singh

[permalink] [raw]
Subject: Re: [PATCH v5 12/17] KVM: arm64: Save protected-nVHE (pKVM) hyp stacktrace

On Fri, Jul 22, 2022 at 8:33 AM Oliver Upton <[email protected]> wrote:
>
> Hi Kalesh,
>
> On Wed, Jul 20, 2022 at 10:57:23PM -0700, Kalesh Singh wrote:
> > In protected nVHE mode, the host cannot access privately owned hypervisor
> > memory. Also, the hypervisor aims to remain simple to reduce the attack
> > surface and does not provide any printk support.
> >
> > For the above reasons, the approach taken to provide hypervisor stacktraces
> > in protected mode is:
> > 1) Unwind and save the hyp stack addresses in EL2 to a shared buffer
> > with the host (done in this patch).
> > 2) Delegate the dumping and symbolization of the addresses to the
> > host in EL1 (later patch in the series).
> >
> > On hyp_panic(), the hypervisor prepares the stacktrace before returning to
> > the host.
> >
> > Signed-off-by: Kalesh Singh <[email protected]>
> > ---
>
> Tried reworking this a bit and here is what I arrived at, WDYT?
> Untested, of course :)

Hi Oliver,

I think what you have is a lot cleaner. :) I'll incorporate it for the
next spin.

Thanks,
Kalesh

>
> --
> Thanks,
> Oliver
>
> diff --git a/arch/arm64/include/asm/stacktrace/nvhe.h b/arch/arm64/include/asm/stacktrace/nvhe.h
> index 3078501f8e22..05d7e03e0a8c 100644
> --- a/arch/arm64/include/asm/stacktrace/nvhe.h
> +++ b/arch/arm64/include/asm/stacktrace/nvhe.h
> @@ -21,6 +21,23 @@
>
> #include <asm/stacktrace/common.h>
>
> +/*
> + * kvm_nvhe_unwind_init - Start an unwind from the given nVHE HYP fp and pc
> + *
> + * @state : unwind_state to initialize
> + * @fp : frame pointer at which to start the unwinding.
> + * @pc : program counter at which to start the unwinding.
> + */
> +static inline void kvm_nvhe_unwind_init(struct unwind_state *state,
> + unsigned long fp,
> + unsigned long pc)
> +{
> + unwind_init_common(state, NULL);
> +
> + state->fp = fp;
> + state->pc = pc;
> +}
> +
> static inline bool on_accessible_stack(const struct task_struct *tsk,
> unsigned long sp, unsigned long size,
> struct stack_info *info)
> diff --git a/arch/arm64/kvm/hyp/nvhe/stacktrace.c b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> index 96c8b93320eb..644276fb02af 100644
> --- a/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> +++ b/arch/arm64/kvm/hyp/nvhe/stacktrace.c
> @@ -11,4 +11,69 @@ DEFINE_PER_CPU(unsigned long [OVERFLOW_STACK_SIZE/sizeof(long)], overflow_stack)
>
> #ifdef CONFIG_PROTECTED_NVHE_STACKTRACE
> DEFINE_PER_CPU(unsigned long [NVHE_STACKTRACE_SIZE/sizeof(long)], pkvm_stacktrace);
> +
> +/*
> + * pkvm_save_backtrace_entry - Saves a protected nVHE HYP stacktrace entry
> + *
> + * @arg : the index of the entry in the stacktrace buffer
> + * @where : the program counter corresponding to the stack frame
> + *
> + * Save the return address of a stack frame to the shared stacktrace buffer.
> + * The host can access this shared buffer from EL1 to dump the backtrace.
> + */
> +static bool pkvm_save_backtrace_entry(void *arg, unsigned long where)
> +{
> + unsigned long *stacktrace = this_cpu_ptr(pkvm_stacktrace);
> + int *idx = (int *)arg;
> +
> + /*
> + * Need 2 free slots: 1 for current entry and 1 for the
> + * delimiter.
> + */
> + if (*idx > ARRAY_SIZE(pkvm_stacktrace) - 2)
> + return false;
> +
> + stacktrace[*idx] = where;
> + stacktrace[++*idx] = 0UL;
> +
> + return true;
> +}
> +
> +/*
> + * pkvm_save_backtrace - Saves the protected nVHE HYP stacktrace
> + *
> + * @fp : frame pointer at which to start the unwinding.
> + * @pc : program counter at which to start the unwinding.
> + *
> + * Save the unwound stack addresses to the shared stacktrace buffer.
> + * The host can access this shared buffer from EL1 to dump the backtrace.
> + */
> +static void pkvm_save_backtrace(unsigned long fp, unsigned long pc)
> +{
> + struct unwind_state state;
> + int idx = 0;
> +
> + kvm_nvhe_unwind_init(&state, fp, pc);
> +
> + unwind(&state, pkvm_save_backtrace_entry, &idx);
> +}
> +#else /* !CONFIG_PROTECTED_NVHE_STACKTRACE */
> +static void pkvm_save_backtrace(unsigned long fp, unsigned long pc)
> +{
> +}
> #endif /* CONFIG_PROTECTED_NVHE_STACKTRACE */
> +
> +/*
> + * kvm_nvhe_prepare_backtrace - prepare to dump the nVHE backtrace
> + *
> + * @fp : frame pointer at which to start the unwinding.
> + * @pc : program counter at which to start the unwinding.
> + *
> + * Saves the information needed by the host to dump the nVHE hypervisor
> + * backtrace.
> + */
> +void kvm_nvhe_prepare_backtrace(unsigned long fp, unsigned long pc)
> +{
> + if (is_protected_kvm_enabled())
> + pkvm_save_backtrace(fp, pc);
> +}
> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> index 6db801db8f27..64e13445d0d9 100644
> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> @@ -34,6 +34,8 @@ DEFINE_PER_CPU(struct kvm_host_data, kvm_host_data);
> DEFINE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);
> DEFINE_PER_CPU(unsigned long, kvm_hyp_vector);
>
> +extern void kvm_nvhe_prepare_backtrace(unsigned long fp, unsigned long pc);
> +
> static void __activate_traps(struct kvm_vcpu *vcpu)
> {
> u64 val;
> @@ -375,6 +377,10 @@ asmlinkage void __noreturn hyp_panic(void)
> __sysreg_restore_state_nvhe(host_ctxt);
> }
>
> + /* Prepare to dump kvm nvhe hyp stacktrace */
> + kvm_nvhe_prepare_backtrace((unsigned long)__builtin_frame_address(0),
> + _THIS_IP_);
> +
> __hyp_do_panic(host_ctxt, spsr, elr, par);
> unreachable();
> }

2022-07-25 14:21:54

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH v5 05/17] arm64: stacktrace: Factor out common unwind()

On Wed, Jul 20, 2022 at 10:57:16PM -0700, Kalesh Singh wrote:
> Move unwind() to stacktrace/common.h, and as a result move the kernel
> unwind_next() to asm/stacktrace.h. This allows reusing unwind() in the
> implementation of the nVHE HYP stack unwinder, later in the series.

Reviewed-by: Mark Brown <[email protected]>

