2022-01-21 20:52:24

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 0/9] KVM: SVM: Fix and clean up "can emulate" mess

Revert an amusing/embarassing goof reported by Liam Merwick, where KVM
attempts to determine if RIP is backed by a valid memslot without first
translating RIP to its associated GPA/GFN. Fix the underlying bug that
was "fixed" by the misguided memslots check by (a) never rejecting
emulation for !SEV guests and (b) using the #NPF error code to determine
if the fault happened on the code fetch or on guest page tables, which is
effectively what the memslots check attempted to do.

Further clean up, harden, and document SVM's "can emulate" helper, and
fix a #GP interception SEV bug found in the process of doing so.

Sean Christopherson (9):
KVM: SVM: Never reject emulation due to SMAP errata for !SEV guests
Revert "KVM: SVM: avoid infinite loop on NPF from bad address"
KVM: SVM: Don't intercept #GP for SEV guests
KVM: SVM: Explicitly require DECODEASSISTS to enable SEV support
KVM: x86: Pass emulation type to can_emulate_instruction()
KVM: SVM: WARN if KVM attempts emulation on #UD or #GP for SEV guests
KVM: SVM: Inject #UD on attempted emulation for SEV guest w/o insn
buffer
KVM: SVM: Don't apply SEV+SMAP workaround on code fetch or PT access
KVM: SVM: Don't kill SEV guest if SMAP erratum triggers in usermode

arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/kvm/svm/sev.c | 9 +-
arch/x86/kvm/svm/svm.c | 162 ++++++++++++++++++++++----------
arch/x86/kvm/vmx/vmx.c | 7 +-
arch/x86/kvm/x86.c | 11 ++-
virt/kvm/kvm_main.c | 1 -
6 files changed, 135 insertions(+), 58 deletions(-)


base-commit: edb9e50dbe18394d0fc9d0494f5b6046fc912d33
--
2.34.1.703.g22d0c6ccf7-goog


2022-01-21 20:52:31

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 1/9] KVM: SVM: Never reject emulation due to SMAP errata for !SEV guests

Always signal that emulation is possible for !SEV guests regardless of
whether or not the CPU provided a valid instruction byte stream. KVM can
read all guest state (memory and registers) for !SEV guests, i.e. can
fetch the code stream from memory even if the CPU failed to do so because
of the SMAP errata.

Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)")
Cc: [email protected]
Cc: Tom Lendacky <[email protected]>
Cc: Brijesh Singh <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 6d31d357a83b..aa1649b8cd8f 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4257,8 +4257,13 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int i
bool smep, smap, is_user;
unsigned long cr4;

+ /* Emulation is always possible when KVM has access to all guest state. */
+ if (!sev_guest(vcpu->kvm))
+ return true;
+
/*
- * When the guest is an SEV-ES guest, emulation is not possible.
+ * Emulation is impossible for SEV-ES guests as KVM doesn't have access
+ * to guest register state.
*/
if (sev_es_guest(vcpu->kvm))
return false;
@@ -4318,9 +4323,6 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int i
smap = cr4 & X86_CR4_SMAP;
is_user = svm_get_cpl(vcpu) == 3;
if (smap && (!smep || is_user)) {
- if (!sev_guest(vcpu->kvm))
- return true;
-
pr_err_ratelimited("KVM: SEV Guest triggered AMD Erratum 1096\n");
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
}
--
2.34.1.703.g22d0c6ccf7-goog

2022-01-21 20:52:46

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 4/9] KVM: SVM: Explicitly require DECODEASSISTS to enable SEV support

Add a sanity check on DECODEASSIST being support if SEV is supported, as
KVM cannot read guest private memory and thus relies on the CPU to
provide the instruction byte stream on #NPF for emulation. The intent of
the check is to document the dependency, it should never fail in practice
as producing hardware that supports SEV but not DECODEASSISTS would be
non-sensical.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/sev.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index 6a22798eaaee..17b53457d866 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2100,8 +2100,13 @@ void __init sev_hardware_setup(void)
if (!sev_enabled || !npt_enabled)
goto out;

- /* Does the CPU support SEV? */
- if (!boot_cpu_has(X86_FEATURE_SEV))
+ /*
+ * SEV must obviously be supported in hardware. Sanity check that the
+ * CPU supports decode assists, which is mandatory for SEV guests to
+ * support instruction emulation.
+ */
+ if (!boot_cpu_has(X86_FEATURE_SEV) ||
+ WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_DECODEASSISTS)))
goto out;

/* Retrieve SEV CPUID information */
--
2.34.1.703.g22d0c6ccf7-goog

2022-01-21 20:52:51

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 5/9] KVM: x86: Pass emulation type to can_emulate_instruction()

Pass the emulation type to kvm_x86_ops.can_emulate_insutrction() so that
a future commit can harden KVM's SEV support to WARN on emulation
scenarios that should never happen.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/svm/svm.c | 3 ++-
arch/x86/kvm/vmx/vmx.c | 7 ++++---
arch/x86/kvm/x86.c | 11 +++++++++--
4 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 682ad02a4e58..c890931c9c65 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1482,7 +1482,8 @@ struct kvm_x86_ops {

int (*get_msr_feature)(struct kvm_msr_entry *entry);

- bool (*can_emulate_instruction)(struct kvm_vcpu *vcpu, void *insn, int insn_len);
+ bool (*can_emulate_instruction)(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len);

bool (*apic_init_signal_blocked)(struct kvm_vcpu *vcpu);
int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index edea52be6c01..994224ae2731 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4257,7 +4257,8 @@ static void svm_enable_smi_window(struct kvm_vcpu *vcpu)
}
}

-static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int insn_len)
+static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len)
{
bool smep, smap, is_user;
unsigned long cr4;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a02a28ce7cc3..4b4c1dfa6842 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1487,11 +1487,12 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
return 0;
}

-static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int insn_len)
+static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len)
{
/*
* Emulation of instructions in SGX enclaves is impossible as RIP does
- * not point tthe failing instruction, and even if it did, the code
+ * not point at the failing instruction, and even if it did, the code
* stream is inaccessible. Inject #UD instead of exiting to userspace
* so that guest userspace can't DoS the guest simply by triggering
* emulation (enclaves are CPL3 only).
@@ -5397,7 +5398,7 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
{
gpa_t gpa;

- if (!vmx_can_emulate_instruction(vcpu, NULL, 0))
+ if (!vmx_can_emulate_instruction(vcpu, EMULTYPE_PF, NULL, 0))
return 1;

/*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 55518b7d3b96..2fa4687de8e4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6810,6 +6810,13 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
}
EXPORT_SYMBOL_GPL(kvm_write_guest_virt_system);

+static int kvm_can_emulate_insn(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len)
+{
+ return static_call(kvm_x86_can_emulate_instruction)(vcpu, emul_type,
+ insn, insn_len);
+}
+
int handle_ud(struct kvm_vcpu *vcpu)
{
static const char kvm_emulate_prefix[] = { __KVM_EMULATE_PREFIX };
@@ -6817,7 +6824,7 @@ int handle_ud(struct kvm_vcpu *vcpu)
char sig[5]; /* ud2; .ascii "kvm" */
struct x86_exception e;

- if (unlikely(!static_call(kvm_x86_can_emulate_instruction)(vcpu, NULL, 0)))
+ if (unlikely(!kvm_can_emulate_insn(vcpu, emul_type, NULL, 0)))
return 1;

if (force_emulation_prefix &&
@@ -8193,7 +8200,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
bool writeback = true;
bool write_fault_to_spt;

- if (unlikely(!static_call(kvm_x86_can_emulate_instruction)(vcpu, insn, insn_len)))
+ if (unlikely(!kvm_can_emulate_insn(vcpu, emulation_type, insn, insn_len)))
return 1;

vcpu->arch.l1tf_flush_l1d = true;
--
2.34.1.703.g22d0c6ccf7-goog

2022-01-21 20:52:57

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 7/9] KVM: SVM: Inject #UD on attempted emulation for SEV guest w/o insn buffer

Inject #UD if KVM attempts emulation for an SEV guests without an insn
buffer and instruction decoding is required. The previous behavior of
allowing emulation if there is no insn buffer is undesirable as doing so
means KVM is reading guest private memory and thus decoding cyphertext,
i.e. is emulating garbage. The check was previously necessary as the
emulation type was not provided, i.e. SVM needed to allow emulation to
handle completion of emulation after exiting to userspace to handle I/O.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 89 ++++++++++++++++++++++++++----------------
1 file changed, 55 insertions(+), 34 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index ed2ca875b84b..d324183fc596 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4277,49 +4277,70 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
if (sev_es_guest(vcpu->kvm))
return false;

+ /*
+ * Emulation is possible if the instruction is already decoded, e.g.
+ * when completing I/O after returning from userspace.
+ */
+ if (emul_type & EMULTYPE_NO_DECODE)
+ return true;
+
+ /*
+ * Emulation is possible for SEV guests if and only if a prefilled
+ * buffer containing the bytes of the intercepted instruction is
+ * available. SEV guest memory is encrypted with a guest specific key
+ * and cannot be decrypted by KVM, i.e. KVM would read cyphertext and
+ * decode garbage.
+ *
+ * Inject #UD if KVM reached this point without an instruction buffer.
+ * In practice, this path should never be hit by a well-behaved guest,
+ * e.g. KVM doesn't intercept #UD or #GP for SEV guests, but this path
+ * is still theoretically reachable, e.g. via unaccelerated fault-like
+ * AVIC access, and needs to be handled by KVM to avoid putting the
+ * guest into an infinite loop. Injecting #UD is somewhat arbitrary,
+ * but its the least awful option given lack of insight into the guest.
+ */
+ if (unlikely(!insn)) {
+ kvm_queue_exception(vcpu, UD_VECTOR);
+ return false;
+ }
+
+ /*
+ * Emulate for SEV guests if the insn buffer is not empty. The buffer
+ * will be empty if the DecodeAssist microcode cannot fetch bytes for
+ * the faulting instruction because the code fetch itself faulted, e.g.
+ * the guest attempted to fetch from emulated MMIO or a guest page
+ * table used to translate CS:RIP resides in emulated MMIO.
+ */
+ if (likely(insn_len))
+ return true;
+
/*
* Detect and workaround Errata 1096 Fam_17h_00_0Fh.
*
* Errata:
- * When CPU raise #NPF on guest data access and vCPU CR4.SMAP=1, it is
- * possible that CPU microcode implementing DecodeAssist will fail
- * to read bytes of instruction which caused #NPF. In this case,
- * GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
- * return 0 instead of the correct guest instruction bytes.
- *
- * This happens because CPU microcode reading instruction bytes
- * uses a special opcode which attempts to read data using CPL=0
- * privileges. The microcode reads CS:RIP and if it hits a SMAP
- * fault, it gives up and returns no instruction bytes.
+ * When CPU raises #NPF on guest data access and vCPU CR4.SMAP=1, it is
+ * possible that CPU microcode implementing DecodeAssist will fail to
+ * read guest memory at CS:RIP and vmcb.GuestIntrBytes will incorrectly
+ * be '0'. This happens because microcode reads CS:RIP using a _data_
+ * loap uop with CPL=0 privileges. If the load hits a SMAP #PF, ucode
+ * gives up and does not fill the instruction bytes buffer.
*
* Detection:
- * We reach here in case CPU supports DecodeAssist, raised #NPF and
- * returned 0 in GuestIntrBytes field of the VMCB.
- * First, errata can only be triggered in case vCPU CR4.SMAP=1.
- * Second, if vCPU CR4.SMEP=1, errata could only be triggered
- * in case vCPU CPL==3 (Because otherwise guest would have triggered
- * a SMEP fault instead of #NPF).
- * Otherwise, vCPU CR4.SMEP=0, errata could be triggered by any vCPU CPL.
- * As most guests enable SMAP if they have also enabled SMEP, use above
- * logic in order to attempt minimize false-positive of detecting errata
- * while still preserving all cases semantic correctness.
+ * KVM reaches this point if the VM is an SEV guest, the CPU supports
+ * DecodeAssist, a #NPF was raised, KVM's page fault handler triggered
+ * emulation (e.g. for MMIO), and the CPU returned 0 in GuestIntrBytes
+ * field of the VMCB.
*
- * Workaround:
- * To determine what instruction the guest was executing, the hypervisor
- * will have to decode the instruction at the instruction pointer.
+ * This does _not_ mean that the erratum has been encountered, as the
+ * DecodeAssist will also fail if the load for CS:RIP hits a legitimate
+ * #PF, e.g. if the guest attempt to execute from emulated MMIO and
+ * encountered a reserved/not-present #PF.
*
- * In non SEV guest, hypervisor will be able to read the guest
- * memory to decode the instruction pointer when insn_len is zero
- * so we return true to indicate that decoding is possible.
- *
- * But in the SEV guest, the guest memory is encrypted with the
- * guest specific key and hypervisor will not be able to decode the
- * instruction pointer so we will not able to workaround it. Lets
- * print the error and request to kill the guest.
+ * To reduce the likelihood of false positives, take action if and only
+ * if CR4.SMAP=1 (obviously required to hit the erratum) and CR4.SMEP=0
+ * or CPL=3. If SMEP=1 and CPL!=3, the erratum cannot have been hit as
+ * the guest would have encountered a SMEP violation #PF, not a #NPF.
*/
- if (likely(!insn || insn_len))
- return true;
-
cr4 = kvm_read_cr4(vcpu);
smep = cr4 & X86_CR4_SMEP;
smap = cr4 & X86_CR4_SMAP;
--
2.34.1.703.g22d0c6ccf7-goog

2022-01-21 20:53:39

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 9/9] KVM: SVM: Don't kill SEV guest if SMAP erratum triggers in usermode

Inject a #GP instead of synthesizing triple fault to try to avoid killing
the guest if emulation of an SEV guest fails due to encountering the SMAP
erratum. The injected #GP may still be fatal to the guest, e.g. if the
userspace process is providing critical functionality, but KVM should
make every attempt to keep the guest alive.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index a4b02a6217fd..88f5bbb0e6a1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4357,7 +4357,21 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
is_user = svm_get_cpl(vcpu) == 3;
if (smap && (!smep || is_user)) {
pr_err_ratelimited("KVM: SEV Guest triggered AMD Erratum 1096\n");
- kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+
+ /*
+ * If the fault occurred in userspace, arbitrarily inject #GP
+ * to avoid killing the guest and to hopefully avoid confusing
+ * the guest kernel too much, e.g. injecting #PF would not be
+ * coherent with respect to the guest's page tables. Request
+ * triple fault if the fault occurred in the kernel as there's
+ * no fault that KVM can inject without confusing the guest.
+ * In practice, the triple fault is moot as no sane SEV kernel
+ * will execute from user memory while also running with SMAP=1.
+ */
+ if (is_user)
+ kvm_inject_gp(vcpu, 0);
+ else
+ kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
}

resume_guest:
--
2.34.1.703.g22d0c6ccf7-goog

2022-01-21 20:53:44

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 3/9] KVM: SVM: Don't intercept #GP for SEV guests

Never intercept #GP for SEV guests as reading SEV guest private memory
will return cyphertext, i.e. emulating on #GP can't work as intended.

Cc: [email protected]
Cc: Tom Lendacky <[email protected]>
Cc: Brijesh Singh <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 85703145eb0a..edea52be6c01 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -312,7 +312,11 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
return ret;
}

- if (svm_gp_erratum_intercept)
+ /*
+ * Never intercept #GP for SEV guests, KVM can't
+ * decrypt guest memory to workaround the erratum.
+ */
+ if (svm_gp_erratum_intercept && !sev_guest(vcpu->kvm))
set_exception_intercept(svm, GP_VECTOR);
}
}
@@ -1010,9 +1014,10 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
* Guest access to VMware backdoor ports could legitimately
* trigger #GP because of TSS I/O permission bitmap.
* We intercept those #GP and allow access to them anyway
- * as VMware does.
+ * as VMware does. Don't intercept #GP for SEV guests as KVM can't
+ * decrypt guest memory to decode the faulting instruction.
*/
- if (enable_vmware_backdoor)
+ if (enable_vmware_backdoor && !sev_guest(vcpu->kvm))
set_exception_intercept(svm, GP_VECTOR);

svm_set_intercept(svm, INTERCEPT_INTR);
--
2.34.1.703.g22d0c6ccf7-goog

2022-01-21 20:53:44

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 2/9] Revert "KVM: SVM: avoid infinite loop on NPF from bad address"

Revert a completely broken check on an "invalid" RIP in SVM's workaround
for the DecodeAssists SMAP errata. kvm_vcpu_gfn_to_memslot() obviously
expects a gfn, i.e. operates in the guest physical address space, whereas
RIP is a virtual (not even linear) address. The "fix" worked for the
problematic KVM selftest because the test identity mapped RIP.

Fully revert the hack instead of trying to translate RIP to a GPA, as the
non-SEV case is now handled earlier, and KVM cannot access guest page
tables to translate RIP.

This reverts commit e72436bc3a5206f95bb384e741154166ddb3202e.

Fixes: e72436bc3a52 ("KVM: SVM: avoid infinite loop on NPF from bad address")
Reported-by: Liam Merwick <[email protected]>
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 7 -------
virt/kvm/kvm_main.c | 1 -
2 files changed, 8 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index aa1649b8cd8f..85703145eb0a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4311,13 +4311,6 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int i
if (likely(!insn || insn_len))
return true;

- /*
- * If RIP is invalid, go ahead with emulation which will cause an
- * internal error exit.
- */
- if (!kvm_vcpu_gfn_to_memslot(vcpu, kvm_rip_read(vcpu) >> PAGE_SHIFT))
- return true;
-
cr4 = kvm_read_cr4(vcpu);
smep = cr4 & X86_CR4_SMEP;
smap = cr4 & X86_CR4_SMAP;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5a1164483e6c..0bacecda79cf 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2248,7 +2248,6 @@ struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn

return NULL;
}
-EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot);

bool kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn)
{
--
2.34.1.703.g22d0c6ccf7-goog

2022-01-21 20:53:51

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 6/9] KVM: SVM: WARN if KVM attempts emulation on #UD or #GP for SEV guests

WARN if KVM attempts to emulate in response to #UD or #GP for SEV guests,
i.e. if KVM intercepts #UD or #GP, as emulation on any fault except #NPF
is impossible since KVM cannot read guest private memory to get the code
stream, and the CPU's DecodeAssists feature only provides the instruction
bytes on #NPF.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 994224ae2731..ed2ca875b84b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4267,6 +4267,9 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
if (!sev_guest(vcpu->kvm))
return true;

+ /* #UD and #GP should never be intercepted for SEV guests. */
+ WARN_ON_ONCE(emul_type & (EMULTYPE_TRAP_UD | EMULTYPE_VMWARE_GP));
+
/*
* Emulation is impossible for SEV-ES guests as KVM doesn't have access
* to guest register state.
--
2.34.1.703.g22d0c6ccf7-goog

2022-01-21 20:54:15

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 8/9] KVM: SVM: Don't apply SEV+SMAP workaround on code fetch or PT access

Resume the guest instead of synthesizing a triple fault shutdown if the
instruction bytes buffer is empty due to the #NPF being on the code fetch
itself or on a page table access. The SMAP errata applies if and only if
the code fetch was successful and ucode's subsequent data read from the
code page encountered a SMAP violation. In practice, the guest is likely
hosed either way, but crashing the guest on a code fetch to emulated MMIO
is technically wrong according to the behavior described in the APM.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 43 +++++++++++++++++++++++++++++++++---------
1 file changed, 34 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d324183fc596..a4b02a6217fd 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4262,6 +4262,7 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
{
bool smep, smap, is_user;
unsigned long cr4;
+ u64 error_code;

/* Emulation is always possible when KVM has access to all guest state. */
if (!sev_guest(vcpu->kvm))
@@ -4325,22 +4326,31 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
* loap uop with CPL=0 privileges. If the load hits a SMAP #PF, ucode
* gives up and does not fill the instruction bytes buffer.
*
- * Detection:
- * KVM reaches this point if the VM is an SEV guest, the CPU supports
- * DecodeAssist, a #NPF was raised, KVM's page fault handler triggered
- * emulation (e.g. for MMIO), and the CPU returned 0 in GuestIntrBytes
- * field of the VMCB.
+ * As above, KVM reaches this point iff the VM is an SEV guest, the CPU
+ * supports DecodeAssist, a #NPF was raised, KVM's page fault handler
+ * triggered emulation (e.g. for MMIO), and the CPU returned 0 in the
+ * GuestIntrBytes field of the VMCB.
*
* This does _not_ mean that the erratum has been encountered, as the
* DecodeAssist will also fail if the load for CS:RIP hits a legitimate
* #PF, e.g. if the guest attempt to execute from emulated MMIO and
* encountered a reserved/not-present #PF.
*
- * To reduce the likelihood of false positives, take action if and only
- * if CR4.SMAP=1 (obviously required to hit the erratum) and CR4.SMEP=0
- * or CPL=3. If SMEP=1 and CPL!=3, the erratum cannot have been hit as
- * the guest would have encountered a SMEP violation #PF, not a #NPF.
+ * To hit the erratum, the following conditions must be true:
+ * 1. CR4.SMAP=1 (obviously).
+ * 2. CR4.SMEP=0 || CPL=3. If SMEP=1 and CPL<3, the erratum cannot
+ * have been hit as the guest would have encountered a SMEP
+ * violation #PF, not a #NPF.
+ * 3. The #NPF is not due to a code fetch, in which case failure to
+ * retrieve the instruction bytes is legitimate (see abvoe).
+ *
+ * In addition, don't apply the erratum workaround if the #NPF occurred
+ * while translating guest page tables (see below).
*/
+ error_code = to_svm(vcpu)->vmcb->control.exit_info_1;
+ if (error_code & (PFERR_GUEST_PAGE_MASK | PFERR_FETCH_MASK))
+ goto resume_guest;
+
cr4 = kvm_read_cr4(vcpu);
smep = cr4 & X86_CR4_SMEP;
smap = cr4 & X86_CR4_SMAP;
@@ -4350,6 +4360,21 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
}

+resume_guest:
+ /*
+ * If the erratum was not hit, simply resume the guest and let it fault
+ * again. While awful, e.g. the vCPU may get stuck in an infinite loop
+ * if the fault is at CPL=0, it's the lesser of all evils. Exiting to
+ * userspace will kill the guest, and letting the emulator read garbage
+ * will yield random behavior and potentially corrupt the guest.
+ *
+ * Simply resuming the guest is technically not a violation of the SEV
+ * architecture. AMD's APM states that all code fetches and page table
+ * accesses for SEV guest are encrypted, regardless of the C-Bit. The
+ * APM also states that encrypted accesses to MMIO are "ignored", but
+ * doesn't explicitly define "ignored", i.e. doing nothing and letting
+ * the guest spin is technically "ignoring" the access.
+ */
return false;
}

--
2.34.1.703.g22d0c6ccf7-goog

2022-01-21 22:14:14

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 1/9] KVM: SVM: Never reject emulation due to SMAP errata for !SEV guests

On 20/01/2022 01:07, Sean Christopherson wrote:
> Always signal that emulation is possible for !SEV guests regardless of
> whether or not the CPU provided a valid instruction byte stream. KVM can
> read all guest state (memory and registers) for !SEV guests, i.e. can
> fetch the code stream from memory even if the CPU failed to do so because
> of the SMAP errata.
>
> Fixes: 05d5a4863525 ("KVM: SVM: Workaround errata#1096 (insn_len maybe zero on SMAP violation)")
> Cc: [email protected]
> Cc: Tom Lendacky <[email protected]>
> Cc: Brijesh Singh <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>

Reviewed-by: Liam Merwick <[email protected]>

> ---
> arch/x86/kvm/svm/svm.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 6d31d357a83b..aa1649b8cd8f 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4257,8 +4257,13 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int i
> bool smep, smap, is_user;
> unsigned long cr4;
>
> + /* Emulation is always possible when KVM has access to all guest state. */
> + if (!sev_guest(vcpu->kvm))
> + return true;
> +
> /*
> - * When the guest is an SEV-ES guest, emulation is not possible.
> + * Emulation is impossible for SEV-ES guests as KVM doesn't have access
> + * to guest register state.
> */
> if (sev_es_guest(vcpu->kvm))
> return false;
> @@ -4318,9 +4323,6 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int i
> smap = cr4 & X86_CR4_SMAP;
> is_user = svm_get_cpl(vcpu) == 3;
> if (smap && (!smep || is_user)) {
> - if (!sev_guest(vcpu->kvm))
> - return true;
> -
> pr_err_ratelimited("KVM: SEV Guest triggered AMD Erratum 1096\n");
> kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> }

2022-01-21 22:14:21

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 2/9] Revert "KVM: SVM: avoid infinite loop on NPF from bad address"

On 20/01/2022 01:07, Sean Christopherson wrote:
> Revert a completely broken check on an "invalid" RIP in SVM's workaround
> for the DecodeAssists SMAP errata. kvm_vcpu_gfn_to_memslot() obviously
> expects a gfn, i.e. operates in the guest physical address space, whereas
> RIP is a virtual (not even linear) address. The "fix" worked for the
> problematic KVM selftest because the test identity mapped RIP.
>
> Fully revert the hack instead of trying to translate RIP to a GPA, as the
> non-SEV case is now handled earlier, and KVM cannot access guest page
> tables to translate RIP.
>
> This reverts commit e72436bc3a5206f95bb384e741154166ddb3202e.
>
> Fixes: e72436bc3a52 ("KVM: SVM: avoid infinite loop on NPF from bad address")
> Reported-by: Liam Merwick <[email protected]>
> Cc: [email protected]
> Signed-off-by: Sean Christopherson <[email protected]>

Reviewed-by: Liam Merwick <[email protected]>


> ---
> arch/x86/kvm/svm/svm.c | 7 -------
> virt/kvm/kvm_main.c | 1 -
> 2 files changed, 8 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index aa1649b8cd8f..85703145eb0a 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4311,13 +4311,6 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int i
> if (likely(!insn || insn_len))
> return true;
>
> - /*
> - * If RIP is invalid, go ahead with emulation which will cause an
> - * internal error exit.
> - */
> - if (!kvm_vcpu_gfn_to_memslot(vcpu, kvm_rip_read(vcpu) >> PAGE_SHIFT))
> - return true;
> -
> cr4 = kvm_read_cr4(vcpu);
> smep = cr4 & X86_CR4_SMEP;
> smap = cr4 & X86_CR4_SMAP;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 5a1164483e6c..0bacecda79cf 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -2248,7 +2248,6 @@ struct kvm_memory_slot *kvm_vcpu_gfn_to_memslot(struct kvm_vcpu *vcpu, gfn_t gfn
>
> return NULL;
> }
> -EXPORT_SYMBOL_GPL(kvm_vcpu_gfn_to_memslot);
>
> bool kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn)
> {

2022-01-21 22:15:54

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 4/9] KVM: SVM: Explicitly require DECODEASSISTS to enable SEV support

On 20/01/2022 01:07, Sean Christopherson wrote:
> Add a sanity check on DECODEASSIST being support if SEV is supported, as

"DECODEASSISTS being supported" or "DECODEASSISTS support"

> KVM cannot read guest private memory and thus relies on the CPU to
> provide the instruction byte stream on #NPF for emulation. The intent of
> the check is to document the dependency, it should never fail in practice
> as producing hardware that supports SEV but not DECODEASSISTS would be
> non-sensical.
>
> Signed-off-by: Sean Christopherson <[email protected]>

Reviewed-by: Liam Merwick <[email protected]>


> ---
> arch/x86/kvm/svm/sev.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index 6a22798eaaee..17b53457d866 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2100,8 +2100,13 @@ void __init sev_hardware_setup(void)
> if (!sev_enabled || !npt_enabled)
> goto out;
>
> - /* Does the CPU support SEV? */
> - if (!boot_cpu_has(X86_FEATURE_SEV))
> + /*
> + * SEV must obviously be supported in hardware. Sanity check that the
> + * CPU supports decode assists, which is mandatory for SEV guests to
> + * support instruction emulation.
> + */
> + if (!boot_cpu_has(X86_FEATURE_SEV) ||
> + WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_DECODEASSISTS)))
> goto out;
>
> /* Retrieve SEV CPUID information */

2022-01-21 22:16:24

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 3/9] KVM: SVM: Don't intercept #GP for SEV guests

On 20/01/2022 01:07, Sean Christopherson wrote:
> Never intercept #GP for SEV guests as reading SEV guest private memory
> will return cyphertext, i.e. emulating on #GP can't work as intended.
>

"ciphertext" seems to be the convention.


> Cc: [email protected]
> Cc: Tom Lendacky <[email protected]>
> Cc: Brijesh Singh <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>

Reviewed-by: Liam Merwick <[email protected]>

> ---
> arch/x86/kvm/svm/svm.c | 11 ++++++++---
> 1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 85703145eb0a..edea52be6c01 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -312,7 +312,11 @@ int svm_set_efer(struct kvm_vcpu *vcpu, u64 efer)
> return ret;
> }
>
> - if (svm_gp_erratum_intercept)
> + /*
> + * Never intercept #GP for SEV guests, KVM can't
> + * decrypt guest memory to workaround the erratum.
> + */
> + if (svm_gp_erratum_intercept && !sev_guest(vcpu->kvm))
> set_exception_intercept(svm, GP_VECTOR);
> }
> }
> @@ -1010,9 +1014,10 @@ static void init_vmcb(struct kvm_vcpu *vcpu)
> * Guest access to VMware backdoor ports could legitimately
> * trigger #GP because of TSS I/O permission bitmap.
> * We intercept those #GP and allow access to them anyway
> - * as VMware does.
> + * as VMware does. Don't intercept #GP for SEV guests as KVM can't
> + * decrypt guest memory to decode the faulting instruction.
> */
> - if (enable_vmware_backdoor)
> + if (enable_vmware_backdoor && !sev_guest(vcpu->kvm))
> set_exception_intercept(svm, GP_VECTOR);
>
> svm_set_intercept(svm, INTERCEPT_INTR);

2022-01-21 22:17:34

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 5/9] KVM: x86: Pass emulation type to can_emulate_instruction()

On 20/01/2022 01:07, Sean Christopherson wrote:
> Pass the emulation type to kvm_x86_ops.can_emulate_insutrction() so that

typo in function name - should be can_emulate_instruction()

> a future commit can harden KVM's SEV support to WARN on emulation
> scenarios that should never happen.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <[email protected]>


Reviewed-by: Liam Merwick <[email protected]>


> ---
> arch/x86/include/asm/kvm_host.h | 3 ++-
> arch/x86/kvm/svm/svm.c | 3 ++-
> arch/x86/kvm/vmx/vmx.c | 7 ++++---
> arch/x86/kvm/x86.c | 11 +++++++++--
> 4 files changed, 17 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 682ad02a4e58..c890931c9c65 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1482,7 +1482,8 @@ struct kvm_x86_ops {
>
> int (*get_msr_feature)(struct kvm_msr_entry *entry);
>
> - bool (*can_emulate_instruction)(struct kvm_vcpu *vcpu, void *insn, int insn_len);
> + bool (*can_emulate_instruction)(struct kvm_vcpu *vcpu, int emul_type,
> + void *insn, int insn_len);
>
> bool (*apic_init_signal_blocked)(struct kvm_vcpu *vcpu);
> int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index edea52be6c01..994224ae2731 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4257,7 +4257,8 @@ static void svm_enable_smi_window(struct kvm_vcpu *vcpu)
> }
> }
>
> -static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int insn_len)
> +static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> + void *insn, int insn_len)
> {
> bool smep, smap, is_user;
> unsigned long cr4;
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index a02a28ce7cc3..4b4c1dfa6842 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1487,11 +1487,12 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
> return 0;
> }
>
> -static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, void *insn, int insn_len)
> +static bool vmx_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> + void *insn, int insn_len)
> {
> /*
> * Emulation of instructions in SGX enclaves is impossible as RIP does
> - * not point tthe failing instruction, and even if it did, the code
> + * not point at the failing instruction, and even if it did, the code
> * stream is inaccessible. Inject #UD instead of exiting to userspace
> * so that guest userspace can't DoS the guest simply by triggering
> * emulation (enclaves are CPL3 only).
> @@ -5397,7 +5398,7 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
> {
> gpa_t gpa;
>
> - if (!vmx_can_emulate_instruction(vcpu, NULL, 0))
> + if (!vmx_can_emulate_instruction(vcpu, EMULTYPE_PF, NULL, 0))
> return 1;
>
> /*
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 55518b7d3b96..2fa4687de8e4 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -6810,6 +6810,13 @@ int kvm_write_guest_virt_system(struct kvm_vcpu *vcpu, gva_t addr, void *val,
> }
> EXPORT_SYMBOL_GPL(kvm_write_guest_virt_system);
>
> +static int kvm_can_emulate_insn(struct kvm_vcpu *vcpu, int emul_type,
> + void *insn, int insn_len)
> +{
> + return static_call(kvm_x86_can_emulate_instruction)(vcpu, emul_type,
> + insn, insn_len);
> +}
> +
> int handle_ud(struct kvm_vcpu *vcpu)
> {
> static const char kvm_emulate_prefix[] = { __KVM_EMULATE_PREFIX };
> @@ -6817,7 +6824,7 @@ int handle_ud(struct kvm_vcpu *vcpu)
> char sig[5]; /* ud2; .ascii "kvm" */
> struct x86_exception e;
>
> - if (unlikely(!static_call(kvm_x86_can_emulate_instruction)(vcpu, NULL, 0)))
> + if (unlikely(!kvm_can_emulate_insn(vcpu, emul_type, NULL, 0)))
> return 1;
>
> if (force_emulation_prefix &&
> @@ -8193,7 +8200,7 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> bool writeback = true;
> bool write_fault_to_spt;
>
> - if (unlikely(!static_call(kvm_x86_can_emulate_instruction)(vcpu, insn, insn_len)))
> + if (unlikely(!kvm_can_emulate_insn(vcpu, emulation_type, insn, insn_len)))
> return 1;
>
> vcpu->arch.l1tf_flush_l1d = true;

2022-01-21 22:23:16

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 6/9] KVM: SVM: WARN if KVM attempts emulation on #UD or #GP for SEV guests

On 20/01/2022 01:07, Sean Christopherson wrote:
> WARN if KVM attempts to emulate in response to #UD or #GP for SEV guests,
> i.e. if KVM intercepts #UD or #GP, as emulation on any fault except #NPF
> is impossible since KVM cannot read guest private memory to get the code
> stream, and the CPU's DecodeAssists feature only provides the instruction
> bytes on #NPF.
>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/svm/svm.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 994224ae2731..ed2ca875b84b 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4267,6 +4267,9 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> if (!sev_guest(vcpu->kvm))
> return true;
>
> + /* #UD and #GP should never be intercepted for SEV guests. */
> + WARN_ON_ONCE(emul_type & (EMULTYPE_TRAP_UD | EMULTYPE_VMWARE_GP));

What about EMULTYPE_TRAP_UD_FORCED?

Otherwise
Reviewed-by: Liam Merwick <[email protected]>


> +
> /*
> * Emulation is impossible for SEV-ES guests as KVM doesn't have access
> * to guest register state.

2022-01-21 22:23:39

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 7/9] KVM: SVM: Inject #UD on attempted emulation for SEV guest w/o insn buffer

On 20/01/2022 01:07, Sean Christopherson wrote:
> Inject #UD if KVM attempts emulation for an SEV guests without an insn
> buffer and instruction decoding is required. The previous behavior of
> allowing emulation if there is no insn buffer is undesirable as doing so
> means KVM is reading guest private memory and thus decoding cyphertext,
> i.e. is emulating garbage. The check was previously necessary as the
> emulation type was not provided, i.e. SVM needed to allow emulation to
> handle completion of emulation after exiting to userspace to handle I/O.
>

A few cyphertext references...

> Signed-off-by: Sean Christopherson <[email protected]>

Reviewed-by: Liam Merwick <[email protected]>

> ---
> arch/x86/kvm/svm/svm.c | 89 ++++++++++++++++++++++++++----------------
> 1 file changed, 55 insertions(+), 34 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index ed2ca875b84b..d324183fc596 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4277,49 +4277,70 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> if (sev_es_guest(vcpu->kvm))
> return false;
>
> + /*
> + * Emulation is possible if the instruction is already decoded, e.g.
> + * when completing I/O after returning from userspace.
> + */
> + if (emul_type & EMULTYPE_NO_DECODE)
> + return true;
> +
> + /*
> + * Emulation is possible for SEV guests if and only if a prefilled
> + * buffer containing the bytes of the intercepted instruction is
> + * available. SEV guest memory is encrypted with a guest specific key
> + * and cannot be decrypted by KVM, i.e. KVM would read cyphertext and
> + * decode garbage.
> + *
> + * Inject #UD if KVM reached this point without an instruction buffer.
> + * In practice, this path should never be hit by a well-behaved guest,
> + * e.g. KVM doesn't intercept #UD or #GP for SEV guests, but this path
> + * is still theoretically reachable, e.g. via unaccelerated fault-like
> + * AVIC access, and needs to be handled by KVM to avoid putting the
> + * guest into an infinite loop. Injecting #UD is somewhat arbitrary,
> + * but its the least awful option given lack of insight into the guest.
> + */
> + if (unlikely(!insn)) {
> + kvm_queue_exception(vcpu, UD_VECTOR);
> + return false;
> + }
> +
> + /*
> + * Emulate for SEV guests if the insn buffer is not empty. The buffer
> + * will be empty if the DecodeAssist microcode cannot fetch bytes for
> + * the faulting instruction because the code fetch itself faulted, e.g.
> + * the guest attempted to fetch from emulated MMIO or a guest page
> + * table used to translate CS:RIP resides in emulated MMIO.
> + */
> + if (likely(insn_len))
> + return true;
> +
> /*
> * Detect and workaround Errata 1096 Fam_17h_00_0Fh.
> *
> * Errata:
> - * When CPU raise #NPF on guest data access and vCPU CR4.SMAP=1, it is
> - * possible that CPU microcode implementing DecodeAssist will fail
> - * to read bytes of instruction which caused #NPF. In this case,
> - * GuestIntrBytes field of the VMCB on a VMEXIT will incorrectly
> - * return 0 instead of the correct guest instruction bytes.
> - *
> - * This happens because CPU microcode reading instruction bytes
> - * uses a special opcode which attempts to read data using CPL=0
> - * privileges. The microcode reads CS:RIP and if it hits a SMAP
> - * fault, it gives up and returns no instruction bytes.
> + * When CPU raises #NPF on guest data access and vCPU CR4.SMAP=1, it is
> + * possible that CPU microcode implementing DecodeAssist will fail to
> + * read guest memory at CS:RIP and vmcb.GuestIntrBytes will incorrectly
> + * be '0'. This happens because microcode reads CS:RIP using a _data_
> + * loap uop with CPL=0 privileges. If the load hits a SMAP #PF, ucode
> + * gives up and does not fill the instruction bytes buffer.
> *
> * Detection:
> - * We reach here in case CPU supports DecodeAssist, raised #NPF and
> - * returned 0 in GuestIntrBytes field of the VMCB.
> - * First, errata can only be triggered in case vCPU CR4.SMAP=1.
> - * Second, if vCPU CR4.SMEP=1, errata could only be triggered
> - * in case vCPU CPL==3 (Because otherwise guest would have triggered
> - * a SMEP fault instead of #NPF).
> - * Otherwise, vCPU CR4.SMEP=0, errata could be triggered by any vCPU CPL.
> - * As most guests enable SMAP if they have also enabled SMEP, use above
> - * logic in order to attempt minimize false-positive of detecting errata
> - * while still preserving all cases semantic correctness.
> + * KVM reaches this point if the VM is an SEV guest, the CPU supports
> + * DecodeAssist, a #NPF was raised, KVM's page fault handler triggered
> + * emulation (e.g. for MMIO), and the CPU returned 0 in GuestIntrBytes
> + * field of the VMCB.
> *
> - * Workaround:
> - * To determine what instruction the guest was executing, the hypervisor
> - * will have to decode the instruction at the instruction pointer.
> + * This does _not_ mean that the erratum has been encountered, as the
> + * DecodeAssist will also fail if the load for CS:RIP hits a legitimate
> + * #PF, e.g. if the guest attempt to execute from emulated MMIO and
> + * encountered a reserved/not-present #PF.
> *
> - * In non SEV guest, hypervisor will be able to read the guest
> - * memory to decode the instruction pointer when insn_len is zero
> - * so we return true to indicate that decoding is possible.
> - *
> - * But in the SEV guest, the guest memory is encrypted with the
> - * guest specific key and hypervisor will not be able to decode the
> - * instruction pointer so we will not able to workaround it. Lets
> - * print the error and request to kill the guest.
> + * To reduce the likelihood of false positives, take action if and only
> + * if CR4.SMAP=1 (obviously required to hit the erratum) and CR4.SMEP=0
> + * or CPL=3. If SMEP=1 and CPL!=3, the erratum cannot have been hit as
> + * the guest would have encountered a SMEP violation #PF, not a #NPF.
> */
> - if (likely(!insn || insn_len))
> - return true;
> -
> cr4 = kvm_read_cr4(vcpu);
> smep = cr4 & X86_CR4_SMEP;
> smap = cr4 & X86_CR4_SMAP;

2022-01-21 22:25:09

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 8/9] KVM: SVM: Don't apply SEV+SMAP workaround on code fetch or PT access

On 20/01/2022 01:07, Sean Christopherson wrote:
> Resume the guest instead of synthesizing a triple fault shutdown if the
> instruction bytes buffer is empty due to the #NPF being on the code fetch
> itself or on a page table access. The SMAP errata applies if and only if
> the code fetch was successful and ucode's subsequent data read from the
> code page encountered a SMAP violation. In practice, the guest is likely
> hosed either way, but crashing the guest on a code fetch to emulated MMIO
> is technically wrong according to the behavior described in the APM.
>
> Signed-off-by: Sean Christopherson <[email protected]>


Reviewed-by: Liam Merwick <[email protected]>

> ---
> arch/x86/kvm/svm/svm.c | 43 +++++++++++++++++++++++++++++++++---------
> 1 file changed, 34 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index d324183fc596..a4b02a6217fd 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4262,6 +4262,7 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> {
> bool smep, smap, is_user;
> unsigned long cr4;
> + u64 error_code;
>
> /* Emulation is always possible when KVM has access to all guest state. */
> if (!sev_guest(vcpu->kvm))
> @@ -4325,22 +4326,31 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> * loap uop with CPL=0 privileges. If the load hits a SMAP #PF, ucode
> * gives up and does not fill the instruction bytes buffer.
> *
> - * Detection:
> - * KVM reaches this point if the VM is an SEV guest, the CPU supports
> - * DecodeAssist, a #NPF was raised, KVM's page fault handler triggered
> - * emulation (e.g. for MMIO), and the CPU returned 0 in GuestIntrBytes
> - * field of the VMCB.
> + * As above, KVM reaches this point iff the VM is an SEV guest, the CPU
> + * supports DecodeAssist, a #NPF was raised, KVM's page fault handler
> + * triggered emulation (e.g. for MMIO), and the CPU returned 0 in the
> + * GuestIntrBytes field of the VMCB.
> *
> * This does _not_ mean that the erratum has been encountered, as the
> * DecodeAssist will also fail if the load for CS:RIP hits a legitimate
> * #PF, e.g. if the guest attempt to execute from emulated MMIO and
> * encountered a reserved/not-present #PF.
> *
> - * To reduce the likelihood of false positives, take action if and only
> - * if CR4.SMAP=1 (obviously required to hit the erratum) and CR4.SMEP=0
> - * or CPL=3. If SMEP=1 and CPL!=3, the erratum cannot have been hit as
> - * the guest would have encountered a SMEP violation #PF, not a #NPF.
> + * To hit the erratum, the following conditions must be true:
> + * 1. CR4.SMAP=1 (obviously).
> + * 2. CR4.SMEP=0 || CPL=3. If SMEP=1 and CPL<3, the erratum cannot
> + * have been hit as the guest would have encountered a SMEP
> + * violation #PF, not a #NPF.
> + * 3. The #NPF is not due to a code fetch, in which case failure to
> + * retrieve the instruction bytes is legitimate (see abvoe).
> + *
> + * In addition, don't apply the erratum workaround if the #NPF occurred
> + * while translating guest page tables (see below).
> */
> + error_code = to_svm(vcpu)->vmcb->control.exit_info_1;
> + if (error_code & (PFERR_GUEST_PAGE_MASK | PFERR_FETCH_MASK))
> + goto resume_guest;
> +
> cr4 = kvm_read_cr4(vcpu);
> smep = cr4 & X86_CR4_SMEP;
> smap = cr4 & X86_CR4_SMAP;
> @@ -4350,6 +4360,21 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> }
>
> +resume_guest:
> + /*
> + * If the erratum was not hit, simply resume the guest and let it fault
> + * again. While awful, e.g. the vCPU may get stuck in an infinite loop
> + * if the fault is at CPL=0, it's the lesser of all evils. Exiting to
> + * userspace will kill the guest, and letting the emulator read garbage
> + * will yield random behavior and potentially corrupt the guest.
> + *
> + * Simply resuming the guest is technically not a violation of the SEV
> + * architecture. AMD's APM states that all code fetches and page table
> + * accesses for SEV guest are encrypted, regardless of the C-Bit. The
> + * APM also states that encrypted accesses to MMIO are "ignored", but
> + * doesn't explicitly define "ignored", i.e. doing nothing and letting
> + * the guest spin is technically "ignoring" the access.
> + */
> return false;
> }
>

2022-01-21 22:25:22

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 9/9] KVM: SVM: Don't kill SEV guest if SMAP erratum triggers in usermode

On 20/01/2022 01:07, Sean Christopherson wrote:
> Inject a #GP instead of synthesizing triple fault to try to avoid killing
> the guest if emulation of an SEV guest fails due to encountering the SMAP
> erratum. The injected #GP may still be fatal to the guest, e.g. if the
> userspace process is providing critical functionality, but KVM should
> make every attempt to keep the guest alive.
>
> Signed-off-by: Sean Christopherson <[email protected]>


Reviewed-by: Liam Merwick <[email protected]>

> ---
> arch/x86/kvm/svm/svm.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index a4b02a6217fd..88f5bbb0e6a1 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4357,7 +4357,21 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> is_user = svm_get_cpl(vcpu) == 3;
> if (smap && (!smep || is_user)) {
> pr_err_ratelimited("KVM: SEV Guest triggered AMD Erratum 1096\n");
> - kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> +
> + /*
> + * If the fault occurred in userspace, arbitrarily inject #GP
> + * to avoid killing the guest and to hopefully avoid confusing
> + * the guest kernel too much, e.g. injecting #PF would not be
> + * coherent with respect to the guest's page tables. Request
> + * triple fault if the fault occurred in the kernel as there's
> + * no fault that KVM can inject without confusing the guest.
> + * In practice, the triple fault is moot as no sane SEV kernel
> + * will execute from user memory while also running with SMAP=1.
> + */
> + if (is_user)
> + kvm_inject_gp(vcpu, 0);
> + else
> + kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
> }
>
> resume_guest:

2022-01-21 22:25:48

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 3/9] KVM: SVM: Don't intercept #GP for SEV guests

On Thu, Jan 20, 2022, Liam Merwick wrote:
> On 20/01/2022 01:07, Sean Christopherson wrote:
> > Never intercept #GP for SEV guests as reading SEV guest private memory
> > will return cyphertext, i.e. emulating on #GP can't work as intended.
> >
>
> "ciphertext" seems to be the convention.

Huh, indeed it does seem to be way more common, and cipher is the proper root.
Stupid English, why can't we have encrypt+cypher or encript+cipher?

Thanks! I'll get these fixed.

2022-01-21 22:26:19

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 6/9] KVM: SVM: WARN if KVM attempts emulation on #UD or #GP for SEV guests

On Thu, Jan 20, 2022, Liam Merwick wrote:
> On 20/01/2022 01:07, Sean Christopherson wrote:
> > WARN if KVM attempts to emulate in response to #UD or #GP for SEV guests,
> > i.e. if KVM intercepts #UD or #GP, as emulation on any fault except #NPF
> > is impossible since KVM cannot read guest private memory to get the code
> > stream, and the CPU's DecodeAssists feature only provides the instruction
> > bytes on #NPF.
> >
> > Signed-off-by: Sean Christopherson <[email protected]>
> > ---
> > arch/x86/kvm/svm/svm.c | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> > index 994224ae2731..ed2ca875b84b 100644
> > --- a/arch/x86/kvm/svm/svm.c
> > +++ b/arch/x86/kvm/svm/svm.c
> > @@ -4267,6 +4267,9 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
> > if (!sev_guest(vcpu->kvm))
> > return true;
> > + /* #UD and #GP should never be intercepted for SEV guests. */
> > + WARN_ON_ONCE(emul_type & (EMULTYPE_TRAP_UD | EMULTYPE_VMWARE_GP));
>
> What about EMULTYPE_TRAP_UD_FORCED?

Hmm, yeah, it's worth adding, there's no additional cost. I was thinking it was
a modifier to EMULTYPE_TRAP_UD, but it's a replacement specifically to bypass
the EmulateOnUD check (which I should have remembered since I added the type...).

2022-01-21 22:27:12

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 0/9] KVM: SVM: Fix and clean up "can emulate" mess

On 20/01/2022 01:07, Sean Christopherson wrote:
> Revert an amusing/embarassing goof reported by Liam Merwick, where KVM
> attempts to determine if RIP is backed by a valid memslot without first
> translating RIP to its associated GPA/GFN. Fix the underlying bug that
> was "fixed" by the misguided memslots check by (a) never rejecting
> emulation for !SEV guests and (b) using the #NPF error code to determine
> if the fault happened on the code fetch or on guest page tables, which is
> effectively what the memslots check attempted to do.
>
> Further clean up, harden, and document SVM's "can emulate" helper, and
> fix a #GP interception SEV bug found in the process of doing so.


FYI: I've applied all 9 commits to a 5.15 based branch (applied cleanly)
and the 3 stable candidates to a 5.4 based branch (applied with minor
contextual conflicts) and have been running my SEV test case (sysbench)
and kvm-unit-tests without issues for a number of hours now.

Regards,
Liam


>
> Sean Christopherson (9):
> KVM: SVM: Never reject emulation due to SMAP errata for !SEV guests
> Revert "KVM: SVM: avoid infinite loop on NPF from bad address"
> KVM: SVM: Don't intercept #GP for SEV guests
> KVM: SVM: Explicitly require DECODEASSISTS to enable SEV support
> KVM: x86: Pass emulation type to can_emulate_instruction()
> KVM: SVM: WARN if KVM attempts emulation on #UD or #GP for SEV guests
> KVM: SVM: Inject #UD on attempted emulation for SEV guest w/o insn
> buffer
> KVM: SVM: Don't apply SEV+SMAP workaround on code fetch or PT access
> KVM: SVM: Don't kill SEV guest if SMAP erratum triggers in usermode
>
> arch/x86/include/asm/kvm_host.h | 3 +-
> arch/x86/kvm/svm/sev.c | 9 +-
> arch/x86/kvm/svm/svm.c | 162 ++++++++++++++++++++++----------
> arch/x86/kvm/vmx/vmx.c | 7 +-
> arch/x86/kvm/x86.c | 11 ++-
> virt/kvm/kvm_main.c | 1 -
> 6 files changed, 135 insertions(+), 58 deletions(-)
>
>
> base-commit: edb9e50dbe18394d0fc9d0494f5b6046fc912d33

2022-01-22 00:41:12

by Liam Merwick

[permalink] [raw]
Subject: Re: [PATCH 0/9] KVM: SVM: Fix and clean up "can emulate" mess

On 20/01/2022 16:58, Liam Merwick wrote:
> On 20/01/2022 01:07, Sean Christopherson wrote:
>> Revert an amusing/embarassing goof reported by Liam Merwick, where KVM
>> attempts to determine if RIP is backed by a valid memslot without first
>> translating RIP to its associated GPA/GFN.  Fix the underlying bug that
>> was "fixed" by the misguided memslots check by (a) never rejecting
>> emulation for !SEV guests and (b) using the #NPF error code to determine
>> if the fault happened on the code fetch or on guest page tables, which is
>> effectively what the memslots check attempted to do.
>>
>> Further clean up, harden, and document SVM's "can emulate" helper, and
>> fix a #GP interception SEV bug found in the process of doing so.
>
>
> FYI: I've applied all 9 commits to a 5.15 based branch (applied cleanly)
> and the 3 stable candidates to a 5.4 based branch (applied with minor
> contextual conflicts) and have been running my SEV test case (sysbench)
> and kvm-unit-tests without issues for a number of hours now.
>

Tested-by: Liam Merwick <[email protected]>


>
>>
>> Sean Christopherson (9):
>>    KVM: SVM: Never reject emulation due to SMAP errata for !SEV guests
>>    Revert "KVM: SVM: avoid infinite loop on NPF from bad address"
>>    KVM: SVM: Don't intercept #GP for SEV guests
>>    KVM: SVM: Explicitly require DECODEASSISTS to enable SEV support
>>    KVM: x86: Pass emulation type to can_emulate_instruction()
>>    KVM: SVM: WARN if KVM attempts emulation on #UD or #GP for SEV guests
>>    KVM: SVM: Inject #UD on attempted emulation for SEV guest w/o insn
>>      buffer
>>    KVM: SVM: Don't apply SEV+SMAP workaround on code fetch or PT access
>>    KVM: SVM: Don't kill SEV guest if SMAP erratum triggers in usermode
>>
>>   arch/x86/include/asm/kvm_host.h |   3 +-
>>   arch/x86/kvm/svm/sev.c          |   9 +-
>>   arch/x86/kvm/svm/svm.c          | 162 ++++++++++++++++++++++----------
>>   arch/x86/kvm/vmx/vmx.c          |   7 +-
>>   arch/x86/kvm/x86.c              |  11 ++-
>>   virt/kvm/kvm_main.c             |   1 -
>>   6 files changed, 135 insertions(+), 58 deletions(-)
>>
>>
>> base-commit: edb9e50dbe18394d0fc9d0494f5b6046fc912d33
>

2022-01-25 22:45:12

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 0/9] KVM: SVM: Fix and clean up "can emulate" mess

On 1/20/22 02:07, Sean Christopherson wrote:
> Revert an amusing/embarassing goof reported by Liam Merwick, where KVM
> attempts to determine if RIP is backed by a valid memslot without first
> translating RIP to its associated GPA/GFN. Fix the underlying bug that
> was "fixed" by the misguided memslots check by (a) never rejecting
> emulation for !SEV guests and (b) using the #NPF error code to determine
> if the fault happened on the code fetch or on guest page tables, which is
> effectively what the memslots check attempted to do.
>
> Further clean up, harden, and document SVM's "can emulate" helper, and
> fix a #GP interception SEV bug found in the process of doing so.
>
> Sean Christopherson (9):
> KVM: SVM: Never reject emulation due to SMAP errata for !SEV guests
> Revert "KVM: SVM: avoid infinite loop on NPF from bad address"
> KVM: SVM: Don't intercept #GP for SEV guests
> KVM: SVM: Explicitly require DECODEASSISTS to enable SEV support
> KVM: x86: Pass emulation type to can_emulate_instruction()
> KVM: SVM: WARN if KVM attempts emulation on #UD or #GP for SEV guests
> KVM: SVM: Inject #UD on attempted emulation for SEV guest w/o insn
> buffer
> KVM: SVM: Don't apply SEV+SMAP workaround on code fetch or PT access
> KVM: SVM: Don't kill SEV guest if SMAP erratum triggers in usermode
>
> arch/x86/include/asm/kvm_host.h | 3 +-
> arch/x86/kvm/svm/sev.c | 9 +-
> arch/x86/kvm/svm/svm.c | 162 ++++++++++++++++++++++----------
> arch/x86/kvm/vmx/vmx.c | 7 +-
> arch/x86/kvm/x86.c | 11 ++-
> virt/kvm/kvm_main.c | 1 -
> 6 files changed, 135 insertions(+), 58 deletions(-)
>
>
> base-commit: edb9e50dbe18394d0fc9d0494f5b6046fc912d33

Queued, thanks.

Paolo

2022-01-25 22:45:52

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH 6/9] KVM: SVM: WARN if KVM attempts emulation on #UD or #GP for SEV guests

On 1/20/22 18:04, Sean Christopherson wrote:
>>> + WARN_ON_ONCE(emul_type & (EMULTYPE_TRAP_UD | EMULTYPE_VMWARE_GP));
>> What about EMULTYPE_TRAP_UD_FORCED?
> Hmm, yeah, it's worth adding, there's no additional cost. I was thinking it was
> a modifier to EMULTYPE_TRAP_UD, but it's a replacement specifically to bypass
> the EmulateOnUD check (which I should have remembered since I added the type...).
>

Added on top:

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index d5fe71862bcb..85bbfba1fa07 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4269,7 +4269,9 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
return true;

/* #UD and #GP should never be intercepted for SEV guests. */
- WARN_ON_ONCE(emul_type & (EMULTYPE_TRAP_UD | EMULTYPE_VMWARE_GP));
+ WARN_ON_ONCE(emul_type & (EMULTYPE_TRAP_UD |
+ EMULTYPE_TRAP_UD_FORCED |
+ EMULTYPE_VMWARE_GP));

/*
* Emulation is impossible for SEV-ES guests as KVM doesn't have access


Paolo