2022-05-14 01:09:55

by Quentin Perret

[permalink] [raw]
Subject: [PATCH] KVM: arm64: Don't hypercall before EL2 init

Will reported the following splat when running with Protected KVM
enabled:

[ 2.427181] ------------[ cut here ]------------
[ 2.427668] WARNING: CPU: 3 PID: 1 at arch/arm64/kvm/mmu.c:489 __create_hyp_private_mapping+0x118/0x1ac
[ 2.428424] Modules linked in:
[ 2.429040] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc2-00084-g8635adc4efc7 #1
[ 2.429589] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[ 2.430286] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 2.430734] pc : __create_hyp_private_mapping+0x118/0x1ac
[ 2.431091] lr : create_hyp_exec_mappings+0x40/0x80
[ 2.431377] sp : ffff80000803baf0
[ 2.431597] x29: ffff80000803bb00 x28: 0000000000000000 x27: 0000000000000000
[ 2.432156] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 2.432561] x23: ffffcd96c343b000 x22: 0000000000000000 x21: ffff80000803bb40
[ 2.433004] x20: 0000000000000004 x19: 0000000000001800 x18: 0000000000000000
[ 2.433343] x17: 0003e68cf7efdd70 x16: 0000000000000004 x15: fffffc81f602a2c8
[ 2.434053] x14: ffffdf8380000000 x13: ffffcd9573200000 x12: ffffcd96c343b000
[ 2.434401] x11: 0000000000000004 x10: ffffcd96c1738000 x9 : 0000000000000004
[ 2.434812] x8 : ffff80000803bb40 x7 : 7f7f7f7f7f7f7f7f x6 : 544f422effff306b
[ 2.435136] x5 : 000000008020001e x4 : ffff207d80a88c00 x3 : 0000000000000005
[ 2.435480] x2 : 0000000000001800 x1 : 000000014f4ab800 x0 : 000000000badca11
[ 2.436149] Call trace:
[ 2.436600] __create_hyp_private_mapping+0x118/0x1ac
[ 2.437576] create_hyp_exec_mappings+0x40/0x80
[ 2.438180] kvm_init_vector_slots+0x180/0x194
[ 2.458941] kvm_arch_init+0x80/0x274
[ 2.459220] kvm_init+0x48/0x354
[ 2.459416] arm_init+0x20/0x2c
[ 2.459601] do_one_initcall+0xbc/0x238
[ 2.459809] do_initcall_level+0x94/0xb4
[ 2.460043] do_initcalls+0x54/0x94
[ 2.460228] do_basic_setup+0x1c/0x28
[ 2.460407] kernel_init_freeable+0x110/0x178
[ 2.460610] kernel_init+0x20/0x1a0
[ 2.460817] ret_from_fork+0x10/0x20
[ 2.461274] ---[ end trace 0000000000000000 ]---

Indeed, the Protected KVM mode promotes __create_hyp_private_mapping()
to a hypercall as EL1 no longer has access to the hypervisor's stage-1
page-table. However, the call from kvm_init_vector_slots() happens after
pKVM has been initialized on the primary CPU, but before it has been
initialized on secondaries. As such, if the KVM initcall procedure is
migrated from one CPU to another in this window, the hypercall may end up
running on a CPU for which EL2 has not been initialized.

Fortunately, the pKVM hypervisor doesn't rely on the host to re-map the
vectors in the private range, so the hypercall in question is in fact
superfluous. Skip it when pKVM is enabled.

Reported-by: Will Deacon <[email protected]>
Signed-off-by: Quentin Perret <[email protected]>
---
arch/arm64/kvm/arm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 523bc934fe2f..7347c133efc4 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1436,7 +1436,7 @@ static int kvm_init_vector_slots(void)
base = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs));
kvm_init_vector_slot(base, HYP_VECTOR_SPECTRE_DIRECT);

- if (kvm_system_needs_idmapped_vectors() && !has_vhe()) {
+ if (kvm_system_needs_idmapped_vectors() && !has_vhe() && !is_protected_kvm_enabled()) {
err = create_hyp_exec_mappings(__pa_symbol(__bp_harden_hyp_vecs),
__BP_HARDEN_HYP_VECS_SZ, &base);
if (err)
--
2.36.0.550.gb090851708-goog



2022-05-16 06:33:47

by Marc Zyngier

[permalink] [raw]
Subject: Re: [PATCH] KVM: arm64: Don't hypercall before EL2 init

On Fri, 13 May 2022 10:26:07 +0100,
Quentin Perret <[email protected]> wrote:
>
> Will reported the following splat when running with Protected KVM
> enabled:
>
> [ 2.427181] ------------[ cut here ]------------
> [ 2.427668] WARNING: CPU: 3 PID: 1 at arch/arm64/kvm/mmu.c:489 __create_hyp_private_mapping+0x118/0x1ac
> [ 2.428424] Modules linked in:
> [ 2.429040] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc2-00084-g8635adc4efc7 #1
> [ 2.429589] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> [ 2.430286] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 2.430734] pc : __create_hyp_private_mapping+0x118/0x1ac
> [ 2.431091] lr : create_hyp_exec_mappings+0x40/0x80
> [ 2.431377] sp : ffff80000803baf0
> [ 2.431597] x29: ffff80000803bb00 x28: 0000000000000000 x27: 0000000000000000
> [ 2.432156] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
> [ 2.432561] x23: ffffcd96c343b000 x22: 0000000000000000 x21: ffff80000803bb40
> [ 2.433004] x20: 0000000000000004 x19: 0000000000001800 x18: 0000000000000000
> [ 2.433343] x17: 0003e68cf7efdd70 x16: 0000000000000004 x15: fffffc81f602a2c8
> [ 2.434053] x14: ffffdf8380000000 x13: ffffcd9573200000 x12: ffffcd96c343b000
> [ 2.434401] x11: 0000000000000004 x10: ffffcd96c1738000 x9 : 0000000000000004
> [ 2.434812] x8 : ffff80000803bb40 x7 : 7f7f7f7f7f7f7f7f x6 : 544f422effff306b
> [ 2.435136] x5 : 000000008020001e x4 : ffff207d80a88c00 x3 : 0000000000000005
> [ 2.435480] x2 : 0000000000001800 x1 : 000000014f4ab800 x0 : 000000000badca11
> [ 2.436149] Call trace:
> [ 2.436600] __create_hyp_private_mapping+0x118/0x1ac
> [ 2.437576] create_hyp_exec_mappings+0x40/0x80
> [ 2.438180] kvm_init_vector_slots+0x180/0x194
> [ 2.458941] kvm_arch_init+0x80/0x274
> [ 2.459220] kvm_init+0x48/0x354
> [ 2.459416] arm_init+0x20/0x2c
> [ 2.459601] do_one_initcall+0xbc/0x238
> [ 2.459809] do_initcall_level+0x94/0xb4
> [ 2.460043] do_initcalls+0x54/0x94
> [ 2.460228] do_basic_setup+0x1c/0x28
> [ 2.460407] kernel_init_freeable+0x110/0x178
> [ 2.460610] kernel_init+0x20/0x1a0
> [ 2.460817] ret_from_fork+0x10/0x20
> [ 2.461274] ---[ end trace 0000000000000000 ]---
>
> Indeed, the Protected KVM mode promotes __create_hyp_private_mapping()
> to a hypercall as EL1 no longer has access to the hypervisor's stage-1
> page-table. However, the call from kvm_init_vector_slots() happens after
> pKVM has been initialized on the primary CPU, but before it has been
> initialized on secondaries. As such, if the KVM initcall procedure is
> migrated from one CPU to another in this window, the hypercall may end up
> running on a CPU for which EL2 has not been initialized.
>
> Fortunately, the pKVM hypervisor doesn't rely on the host to re-map the
> vectors in the private range, so the hypercall in question is in fact
> superfluous. Skip it when pKVM is enabled.
>
> Reported-by: Will Deacon <[email protected]>
> Signed-off-by: Quentin Perret <[email protected]>
> ---
> arch/arm64/kvm/arm.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 523bc934fe2f..7347c133efc4 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1436,7 +1436,7 @@ static int kvm_init_vector_slots(void)
> base = kern_hyp_va(kvm_ksym_ref(__bp_harden_hyp_vecs));
> kvm_init_vector_slot(base, HYP_VECTOR_SPECTRE_DIRECT);
>
> - if (kvm_system_needs_idmapped_vectors() && !has_vhe()) {
> + if (kvm_system_needs_idmapped_vectors() && !has_vhe() && !is_protected_kvm_enabled()) {
> err = create_hyp_exec_mappings(__pa_symbol(__bp_harden_hyp_vecs),
> __BP_HARDEN_HYP_VECS_SZ, &base);
> if (err)

Can we simplify the condition? ARM64_SPECTRE_V3A is only set when
!VHE, and we already bail in kvm_patch_vector_branch() if we see
VHE+V3A, because the combination makes no sense at all. I think this
can be rewritten as:

if (kvm_system_needs_idmapped_vectors() &&
!is_protected_lvm_enabled())

Thoughts?

M.

--
Without deviation from the norm, progress is not possible.

2022-05-17 02:04:34

by Quentin Perret

[permalink] [raw]
Subject: Re: [PATCH] KVM: arm64: Don't hypercall before EL2 init

On Sunday 15 May 2022 at 12:10:20 (+0100), Marc Zyngier wrote:
> Can we simplify the condition? ARM64_SPECTRE_V3A is only set when
> !VHE, and we already bail in kvm_patch_vector_branch() if we see
> VHE+V3A, because the combination makes no sense at all. I think this
> can be rewritten as:
>
> if (kvm_system_needs_idmapped_vectors() &&
> !is_protected_lvm_enabled())
>
> Thoughts?

Yup I think this works as both CPUs that are vulnerable to V3A aren't
VHE-capable. But if we ever get a VHE-capable CPU that's vulnerable I
think the next call to create_hyp_exec_mappings() will BUG(). Perhaps
the alternative would be to have has_spectre_v3a() say no in VHE to be
on the safe side? That is, prevent the cap from being set to begin with.

Cheers,
Quentin