2019-08-27 21:45:15

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v2 00/14] KVM: x86: Remove emulation_result enums

Rework the emulator and its users to handle failure scenarios entirely
within the emulator.

{x86,kvm}_emulate_instruction() currently returns a tri-state value to
indicate success/continue, userspace exit needed, and failure. The
intent of returning EMULATE_FAIL is to let the caller handle failure in
a manner that is appropriate for the current context. In practice,
the emulator has ended up with a mixture of failure handling, i.e.
whether or not the emulator takes action on failure is dependent on the
specific flavor of emulation.

The mixed handling has proven to be rather fragile, e.g. many flows
incorrectly assume their specific flavor of emulation cannot fail or
that the emulator sets state to report the failure back to userspace.

Move everything inside the emulator, piece by piece, so that the
emulation routines can return '0' for exit to userspace and '1' for
resume the guest, just like every other VM-Exit handler.

Patch 13/14 is a tangentially related bug fix that conflicts heavily with
this series, so I tacked it on here.

Patch 14/14 documents the emulation types. I added it as a separate
patch at the very end so that the comments could reference the final
state of the code base, e.g. incorporate the rule change for using
EMULTYPE_SKIP that is introduced in patch 13/14.

v1:
- https://patchwork.kernel.org/cover/11110331/

v2:
- Collect reviews. [Vitaly and Liran]
- Squash VMware emultype changes into a single patch. [Liran]
- Add comments in VMX/SVM for VMware #GP handling. [Vitaly]
- Tack on the EPT misconfig bug fix.
- Add a patch to comment/document the emultypes. [Liran]

Sean Christopherson (14):
KVM: x86: Relocate MMIO exit stats counting
KVM: x86: Clean up handle_emulation_failure()
KVM: x86: Refactor kvm_vcpu_do_singlestep() to remove out param
KVM: x86: Don't attempt VMWare emulation on #GP with non-zero error
code
KVM: x86: Move #GP injection for VMware into x86_emulate_instruction()
KVM: x86: Add explicit flag for forced emulation on #UD
KVM: x86: Move #UD injection for failed emulation into emulation code
KVM: x86: Exit to userspace on emulation skip failure
KVM: x86: Handle emulation failure directly in kvm_task_switch()
KVM: x86: Move triple fault request into RM int injection
KVM: VMX: Remove EMULATE_FAIL handling in handle_invalid_guest_state()
KVM: x86: Remove emulation_result enums, EMULATE_{DONE,FAIL,USER_EXIT}
KVM: VMX: Handle single-step #DB for EMULTYPE_SKIP on EPT misconfig
KVM: x86: Add comments to document various emulation types

arch/x86/include/asm/kvm_host.h | 40 +++++++--
arch/x86/kvm/mmu.c | 16 +---
arch/x86/kvm/svm.c | 62 ++++++--------
arch/x86/kvm/vmx/vmx.c | 147 +++++++++++++-------------------
arch/x86/kvm/x86.c | 133 ++++++++++++++++-------------
arch/x86/kvm/x86.h | 2 +-
6 files changed, 195 insertions(+), 205 deletions(-)

--
2.22.0


2019-08-27 21:45:23

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v2 07/14] KVM: x86: Move #UD injection for failed emulation into emulation code

Immediately inject a #UD and return EMULATE done if emulation fails when
handling an intercepted #UD. This helps pave the way for removing
EMULATE_FAIL altogether.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/x86.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a1f9e36b2d58..bff3320aa78e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5328,7 +5328,6 @@ EXPORT_SYMBOL_GPL(kvm_write_guest_virt_system);
int handle_ud(struct kvm_vcpu *vcpu)
{
int emul_type = EMULTYPE_TRAP_UD;
- enum emulation_result er;
char sig[5]; /* ud2; .ascii "kvm" */
struct x86_exception e;

@@ -5340,12 +5339,7 @@ int handle_ud(struct kvm_vcpu *vcpu)
emul_type = EMULTYPE_TRAP_UD_FORCED;
}

- er = kvm_emulate_instruction(vcpu, emul_type);
- if (er == EMULATE_USER_EXIT)
- return 0;
- if (er != EMULATE_DONE)
- kvm_queue_exception(vcpu, UD_VECTOR);
- return 1;
+ return kvm_emulate_instruction(vcpu, emul_type) != EMULATE_USER_EXIT;
}
EXPORT_SYMBOL_GPL(handle_ud);

@@ -6533,8 +6527,10 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
++vcpu->stat.insn_emulation;
if (r != EMULATION_OK) {
if ((emulation_type & EMULTYPE_TRAP_UD) ||
- (emulation_type & EMULTYPE_TRAP_UD_FORCED))
- return EMULATE_FAIL;
+ (emulation_type & EMULTYPE_TRAP_UD_FORCED)) {
+ kvm_queue_exception(vcpu, UD_VECTOR);
+ return EMULATE_DONE;
+ }
if (reexecute_instruction(vcpu, cr2, write_fault_to_spt,
emulation_type))
return EMULATE_DONE;
--
2.22.0

2019-08-27 21:45:30

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v2 14/14] KVM: x86: Add comments to document various emulation types

Document the intended usage of each emulation type as each exists to
handle an edge case of one kind or another and can be easily
misinterpreted at first glance.

Cc: Liran Alon <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 3f524cea20ad..8221e68bbb0c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1308,6 +1308,36 @@ extern u64 kvm_default_tsc_scaling_ratio;

extern u64 kvm_mce_cap_supported;

+/*
+ * EMULTYPE_NO_DECODE - Set when re-emulating an instruction (after completing
+ * userspace I/O) to indicate that the emulation context
+ * should be resued as is, i.e. skip initialization of
+ * emulation context, instruction fetch and decode.
+ *
+ * EMULTYPE_TRAP_UD - Set when emulating an intercepted #UD from hardware.
+ * Indicates that only select instructions (tagged with
+ * EmulateOnUD) should be emulated (to minimize the emulator
+ * attack surface). See also EMULTYPE_TRAP_UD_FORCED.
+ *
+ * EMULTYPE_SKIP - Set when emulating solely to skip an instruction, i.e. to
+ * decode the instruction length. For use *only* by
+ * kvm_x86_ops->skip_emulated_instruction() implementations.
+ *
+ * EMULTYPE_ALLOW_RETRY - Set when the emulator should resume the guest to
+ * retry native execution under certain conditions.
+ *
+ * EMULTYPE_TRAP_UD_FORCED - Set when emulating an intercepted #UD that was
+ * triggered by KVM's magic "force emulation" prefix,
+ * which is opt in via module param (off by default).
+ * Bypasses EmulateOnUD restriction despite emulating
+ * due to an intercepted #UD (see EMULTYPE_TRAP_UD).
+ * Used to test the full emulator from userspace.
+ *
+ * EMULTYPE_VMWARE_GP - Set when emulating an intercepted #GP for VMware
+ * backdoor emulation, which is opt in via module param.
+ * VMware backoor emulation handles select instructions
+ * and reinjects the #GP for all other cases.
+ */
#define EMULTYPE_NO_DECODE (1 << 0)
#define EMULTYPE_TRAP_UD (1 << 1)
#define EMULTYPE_SKIP (1 << 2)
--
2.22.0

2019-08-27 21:45:50

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v2 06/14] KVM: x86: Add explicit flag for forced emulation on #UD

Add an explicit emulation type for forced #UD emulation and use it to
detect that KVM should unconditionally inject a #UD instead of falling
into its standard emulation failure handling.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/x86.c | 5 +++--
2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d1d5b5ca1195..a38c93362945 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1318,6 +1318,7 @@ enum emulation_result {
#define EMULTYPE_TRAP_UD (1 << 1)
#define EMULTYPE_SKIP (1 << 2)
#define EMULTYPE_ALLOW_RETRY (1 << 3)
+#define EMULTYPE_TRAP_UD_FORCED (1 << 4)
#define EMULTYPE_VMWARE_GP (1 << 5)
int kvm_emulate_instruction(struct kvm_vcpu *vcpu, int emulation_type);
int kvm_emulate_instruction_from_buffer(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 228ca71d5b01..a1f9e36b2d58 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5337,7 +5337,7 @@ int handle_ud(struct kvm_vcpu *vcpu)
sig, sizeof(sig), &e) == 0 &&
memcmp(sig, "\xf\xbkvm", sizeof(sig)) == 0) {
kvm_rip_write(vcpu, kvm_rip_read(vcpu) + sizeof(sig));
- emul_type = 0;
+ emul_type = EMULTYPE_TRAP_UD_FORCED;
}

er = kvm_emulate_instruction(vcpu, emul_type);
@@ -6532,7 +6532,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
trace_kvm_emulate_insn_start(vcpu);
++vcpu->stat.insn_emulation;
if (r != EMULATION_OK) {
- if (emulation_type & EMULTYPE_TRAP_UD)
+ if ((emulation_type & EMULTYPE_TRAP_UD) ||
+ (emulation_type & EMULTYPE_TRAP_UD_FORCED))
return EMULATE_FAIL;
if (reexecute_instruction(vcpu, cr2, write_fault_to_spt,
emulation_type))
--
2.22.0

2019-08-27 21:45:55

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v2 13/14] KVM: VMX: Handle single-step #DB for EMULTYPE_SKIP on EPT misconfig

VMX's EPT misconfig flow to handle fast-MMIO path falls back to decoding
the instruction to determine the instruction length when running as a
guest (Hyper-V doesn't fill VMCS.VM_EXIT_INSTRUCTION_LEN because it's
technically not defined for EPT misconfigs). Rather than implement the
slow skip in VMX's generic skip_emulated_instruction(),
handle_ept_misconfig() directly calls kvm_emulate_instruction() with
EMULTYPE_SKIP, which intentionally doesn't do single-step detection, and
so handle_ept_misconfig() misses a single-step #DB.

Rework the EPT misconfig fallback case to route it through
kvm_skip_emulated_instruction() so that single-step #DBs and interrupt
shadow updates are handled automatically. I.e. make VMX's slow skip
logic match SVM's and have the SVM flow not intentionally avoid the
shadow update.

Alternatively, the handle_ept_misconfig() could manually handle single-
step detection, but that results in EMULTYPE_SKIP having split logic for
the interrupt shadow vs. single-step #DBs, and split emulator logic is
largely what led to this mess in the first place.

Modifying SVM to mirror VMX flow isn't really an option as SVM's case
isn't limited to a specific exit reason, i.e. handling the slow skip in
skip_emulated_instruction() is mandatory for all intents and purposes.

Drop VMX's skip_emulated_instruction() wrapper since it can now fail,
and instead WARN if it fails unexpectedly, e.g. if exit_reason somehow
becomes corrupted.

Cc: Vitaly Kuznetsov <[email protected]>
Fixes: d391f12070672 ("x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested")
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm.c | 17 +++++++-------
arch/x86/kvm/vmx/vmx.c | 52 ++++++++++++++++++------------------------
arch/x86/kvm/x86.c | 6 ++++-
3 files changed, 36 insertions(+), 39 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 2a76b955c230..4e1f79348aa1 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -777,14 +777,15 @@ static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
svm->next_rip = svm->vmcb->control.next_rip;
}

- if (!svm->next_rip)
- return kvm_emulate_instruction(vcpu, EMULTYPE_SKIP);
-
- if (svm->next_rip - kvm_rip_read(vcpu) > MAX_INST_SIZE)
- printk(KERN_ERR "%s: ip 0x%lx next 0x%llx\n",
- __func__, kvm_rip_read(vcpu), svm->next_rip);
-
- kvm_rip_write(vcpu, svm->next_rip);
+ if (!svm->next_rip) {
+ if (!kvm_emulate_instruction(vcpu, EMULTYPE_SKIP))
+ return 0;
+ } else {
+ if (svm->next_rip - kvm_rip_read(vcpu) > MAX_INST_SIZE)
+ pr_err("%s: ip 0x%lx next 0x%llx\n",
+ __func__, kvm_rip_read(vcpu), svm->next_rip);
+ kvm_rip_write(vcpu, svm->next_rip);
+ }
svm_set_interrupt_shadow(vcpu, 0);

return 1;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ebb97fc11a91..ec20781f2226 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1472,17 +1472,27 @@ static int vmx_rtit_ctl_check(struct kvm_vcpu *vcpu, u64 data)
return 0;
}

-/*
- * Returns an int to be compatible with SVM implementation (which can fail).
- * Do not use directly, use skip_emulated_instruction() instead.
- */
-static int __skip_emulated_instruction(struct kvm_vcpu *vcpu)
+static int skip_emulated_instruction(struct kvm_vcpu *vcpu)
{
unsigned long rip;

- rip = kvm_rip_read(vcpu);
- rip += vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
- kvm_rip_write(vcpu, rip);
+ /*
+ * Using VMCS.VM_EXIT_INSTRUCTION_LEN on EPT misconfig depends on
+ * undefined behavior: Intel's SDM doesn't mandate the VMCS field be
+ * set when EPT misconfig occurs. In practice, real hardware updates
+ * VM_EXIT_INSTRUCTION_LEN on EPT misconfig, but other hypervisors
+ * (namely Hyper-V) don't set it due to it being undefined behavior,
+ * i.e. we end up advancing IP with some random value.
+ */
+ if (!static_cpu_has(X86_FEATURE_HYPERVISOR) ||
+ to_vmx(vcpu)->exit_reason != EXIT_REASON_EPT_MISCONFIG) {
+ rip = kvm_rip_read(vcpu);
+ rip += vmcs_read32(VM_EXIT_INSTRUCTION_LEN);
+ kvm_rip_write(vcpu, rip);
+ } else {
+ if (!kvm_emulate_instruction(vcpu, EMULTYPE_SKIP))
+ return 0;
+ }

/* skipping an emulated instruction also counts */
vmx_set_interrupt_shadow(vcpu, 0);
@@ -1490,11 +1500,6 @@ static int __skip_emulated_instruction(struct kvm_vcpu *vcpu)
return 1;
}

-static inline void skip_emulated_instruction(struct kvm_vcpu *vcpu)
-{
- (void)__skip_emulated_instruction(vcpu);
-}
-
static void vmx_clear_hlt(struct kvm_vcpu *vcpu)
{
/*
@@ -4557,7 +4562,7 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)
vcpu->arch.dr6 &= ~DR_TRAP_BITS;
vcpu->arch.dr6 |= dr6 | DR6_RTM;
if (is_icebp(intr_info))
- skip_emulated_instruction(vcpu);
+ WARN_ON(!skip_emulated_instruction(vcpu));

kvm_queue_exception(vcpu, DB_VECTOR);
return 1;
@@ -5067,7 +5072,7 @@ static int handle_task_switch(struct kvm_vcpu *vcpu)
if (!idt_v || (type != INTR_TYPE_HARD_EXCEPTION &&
type != INTR_TYPE_EXT_INTR &&
type != INTR_TYPE_NMI_INTR))
- skip_emulated_instruction(vcpu);
+ WARN_ON(!skip_emulated_instruction(vcpu));

/*
* TODO: What about debug traps on tss switch?
@@ -5134,20 +5139,7 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
if (!is_guest_mode(vcpu) &&
!kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) {
trace_kvm_fast_mmio(gpa);
- /*
- * Doing kvm_skip_emulated_instruction() depends on undefined
- * behavior: Intel's manual doesn't mandate
- * VM_EXIT_INSTRUCTION_LEN to be set in VMCS when EPT MISCONFIG
- * occurs and while on real hardware it was observed to be set,
- * other hypervisors (namely Hyper-V) don't set it, we end up
- * advancing IP with some random value. Disable fast mmio when
- * running nested and keep it for real hardware in hope that
- * VM_EXIT_INSTRUCTION_LEN will always be set correctly.
- */
- if (!static_cpu_has(X86_FEATURE_HYPERVISOR))
- return kvm_skip_emulated_instruction(vcpu);
- else
- return kvm_emulate_instruction(vcpu, EMULTYPE_SKIP);
+ return kvm_skip_emulated_instruction(vcpu);
}

return kvm_mmu_page_fault(vcpu, gpa, PFERR_RSVD_MASK, NULL, 0);
@@ -7701,7 +7693,7 @@ static struct kvm_x86_ops vmx_x86_ops __ro_after_init = {

.run = vmx_vcpu_run,
.handle_exit = vmx_handle_exit,
- .skip_emulated_instruction = __skip_emulated_instruction,
+ .skip_emulated_instruction = skip_emulated_instruction,
.set_interrupt_shadow = vmx_set_interrupt_shadow,
.get_interrupt_shadow = vmx_get_interrupt_shadow,
.patch_hypercall = vmx_patch_hypercall,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 53a16eb2aba8..9d5a2b77473b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6552,11 +6552,15 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
return 1;
}

+ /*
+ * Note, EMULTYPE_SKIP is intended for use *only* by vendor callbacks
+ * for kvm_skip_emulated_instruction(). The caller is responsible for
+ * updating interruptibility state and injecting single-step #DBs.
+ */
if (emulation_type & EMULTYPE_SKIP) {
kvm_rip_write(vcpu, ctxt->_eip);
if (ctxt->eflags & X86_EFLAGS_RF)
kvm_set_rflags(vcpu, ctxt->eflags & ~X86_EFLAGS_RF);
- kvm_x86_ops->set_interrupt_shadow(vcpu, 0);
return 1;
}

--
2.22.0

2019-08-27 21:46:11

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v2 04/14] KVM: x86: Don't attempt VMWare emulation on #GP with non-zero error code

The VMware backdoor hooks #GP faults on IN{S}, OUT{S}, and RDPMC, none
of which generate a non-zero error code for their #GP. Re-injecting #GP
instead of attempting emulation on a non-zero error code will allow a
future patch to move #GP injection (for emulation failure) into
kvm_emulate_instruction() without having to plumb in the error code.

Reviewed-and-tested-by: Vitaly Kuznetsov <[email protected]>
Reviewed-by: Liran Alon <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm.c | 10 +++++++++-
arch/x86/kvm/vmx/vmx.c | 12 +++++++++++-
2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 1f220a85514f..7242142573d6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2772,12 +2772,20 @@ static int gp_interception(struct vcpu_svm *svm)

WARN_ON_ONCE(!enable_vmware_backdoor);

+ /*
+ * VMware backdoor emulation on #GP interception only handles IN{S},
+ * OUT{S}, and RDPMC, none of which generate a non-zero error code.
+ */
+ if (error_code) {
+ kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
+ return 1;
+ }
er = kvm_emulate_instruction(vcpu,
EMULTYPE_VMWARE | EMULTYPE_NO_UD_ON_FAIL);
if (er == EMULATE_USER_EXIT)
return 0;
else if (er != EMULATE_DONE)
- kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
+ kvm_queue_exception_e(vcpu, GP_VECTOR, 0);
return 1;
}

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 18286e5b5983..8a65e1122376 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4509,12 +4509,22 @@ static int handle_exception_nmi(struct kvm_vcpu *vcpu)

if (!vmx->rmode.vm86_active && is_gp_fault(intr_info)) {
WARN_ON_ONCE(!enable_vmware_backdoor);
+
+ /*
+ * VMware backdoor emulation on #GP interception only handles
+ * IN{S}, OUT{S}, and RDPMC, none of which generate a non-zero
+ * error code on #GP.
+ */
+ if (error_code) {
+ kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
+ return 1;
+ }
er = kvm_emulate_instruction(vcpu,
EMULTYPE_VMWARE | EMULTYPE_NO_UD_ON_FAIL);
if (er == EMULATE_USER_EXIT)
return 0;
else if (er != EMULATE_DONE)
- kvm_queue_exception_e(vcpu, GP_VECTOR, error_code);
+ kvm_queue_exception_e(vcpu, GP_VECTOR, 0);
return 1;
}

--
2.22.0

2019-08-27 21:46:15

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v2 10/14] KVM: x86: Move triple fault request into RM int injection

Request triple fault in kvm_inject_realmode_interrupt() instead of
returning EMULATE_FAIL and deferring to the caller. All existing
callers request triple fault and it's highly unlikely Real Mode is
going to acquire new features. While this consolidates a small amount
of code, the real goal is to remove the last reference to EMULATE_FAIL.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 9 +++------
arch/x86/kvm/x86.c | 17 ++++++++---------
arch/x86/kvm/x86.h | 2 +-
3 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index a80613a5921f..85a378075725 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1527,8 +1527,7 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu)
int inc_eip = 0;
if (kvm_exception_is_soft(nr))
inc_eip = vcpu->arch.event_exit_inst_len;
- if (kvm_inject_realmode_interrupt(vcpu, nr, inc_eip) != EMULATE_DONE)
- kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+ kvm_inject_realmode_interrupt(vcpu, nr, inc_eip);
return;
}

@@ -4276,8 +4275,7 @@ static void vmx_inject_irq(struct kvm_vcpu *vcpu)
int inc_eip = 0;
if (vcpu->arch.interrupt.soft)
inc_eip = vcpu->arch.event_exit_inst_len;
- if (kvm_inject_realmode_interrupt(vcpu, irq, inc_eip) != EMULATE_DONE)
- kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+ kvm_inject_realmode_interrupt(vcpu, irq, inc_eip);
return;
}
intr = irq | INTR_INFO_VALID_MASK;
@@ -4313,8 +4311,7 @@ static void vmx_inject_nmi(struct kvm_vcpu *vcpu)
vmx->loaded_vmcs->nmi_known_unmasked = false;

if (vmx->rmode.vm86_active) {
- if (kvm_inject_realmode_interrupt(vcpu, NMI_VECTOR, 0) != EMULATE_DONE)
- kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+ kvm_inject_realmode_interrupt(vcpu, NMI_VECTOR, 0);
return;
}

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 83b3c7e9fce7..e457363622e5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6176,7 +6176,7 @@ static void init_emulate_ctxt(struct kvm_vcpu *vcpu)
vcpu->arch.emulate_regs_need_sync_from_vcpu = false;
}

-int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
+void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
{
struct x86_emulate_ctxt *ctxt = &vcpu->arch.emulate_ctxt;
int ret;
@@ -6188,14 +6188,13 @@ int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip)
ctxt->_eip = ctxt->eip + inc_eip;
ret = emulate_int_real(ctxt, irq);

- if (ret != X86EMUL_CONTINUE)
- return EMULATE_FAIL;
-
- ctxt->eip = ctxt->_eip;
- kvm_rip_write(vcpu, ctxt->eip);
- kvm_set_rflags(vcpu, ctxt->eflags);
-
- return EMULATE_DONE;
+ if (ret != X86EMUL_CONTINUE) {
+ kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+ } else {
+ ctxt->eip = ctxt->_eip;
+ kvm_rip_write(vcpu, ctxt->eip);
+ kvm_set_rflags(vcpu, ctxt->eflags);
+ }
}
EXPORT_SYMBOL_GPL(kvm_inject_realmode_interrupt);

diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index b5274e2a53cf..dbf7442a822b 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -261,7 +261,7 @@ static inline bool kvm_check_has_quirk(struct kvm *kvm, u64 quirk)
}

void kvm_set_pending_timer(struct kvm_vcpu *vcpu);
-int kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);
+void kvm_inject_realmode_interrupt(struct kvm_vcpu *vcpu, int irq, int inc_eip);

void kvm_write_tsc(struct kvm_vcpu *vcpu, struct msr_data *msr);
u64 get_kvmclock_ns(struct kvm *kvm);
--
2.22.0

2019-08-27 21:47:33

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v2 01/14] KVM: x86: Relocate MMIO exit stats counting

Move the stat.mmio_exits update into x86_emulate_instruction(). This is
both a bug fix, e.g. the current update flows will incorrectly increment
mmio_exits on emulation failure, and a preparatory change to set the
stage for eliminating EMULATE_DONE and company.

Reviewed-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/mmu.c | 2 --
arch/x86/kvm/vmx/vmx.c | 1 -
arch/x86/kvm/x86.c | 2 ++
3 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4c45ff0cfbd0..845e39d8a970 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -5437,8 +5437,6 @@ int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t cr2, u64 error_code,
case EMULATE_DONE:
return 1;
case EMULATE_USER_EXIT:
- ++vcpu->stat.mmio_exits;
- /* fall through */
case EMULATE_FAIL:
return 0;
default:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 570a233e272b..18286e5b5983 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -5200,7 +5200,6 @@ static int handle_invalid_guest_state(struct kvm_vcpu *vcpu)
err = kvm_emulate_instruction(vcpu, 0);

if (err == EMULATE_USER_EXIT) {
- ++vcpu->stat.mmio_exits;
ret = 0;
goto out;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b4cfd786d0b6..cd425f54096a 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6598,6 +6598,8 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
}
r = EMULATE_USER_EXIT;
} else if (vcpu->mmio_needed) {
+ ++vcpu->stat.mmio_exits;
+
if (!vcpu->mmio_is_write)
writeback = false;
r = EMULATE_USER_EXIT;
--
2.22.0

2019-08-27 21:47:34

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v2 08/14] KVM: x86: Exit to userspace on emulation skip failure

Kill a few birds with one stone by forcing an exit to userspace on skip
emulation failure. This removes a reference to EMULATE_FAIL, fixes a
bug in handle_ept_misconfig() where it would exit to userspace without
setting run->exit_reason, and fixes a theoretical bug in SVM's
task_switch_interception() where it would overwrite run->exit_reason on
a return of EMULATE_USER_EXIT.

Note, this technically doesn't fully fix task_switch_interception()
as it now incorrectly handles EMULATE_FAIL, but in practice there is no
bug as EMULATE_FAIL will never be returned for EMULTYPE_SKIP.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm.c | 4 ++--
arch/x86/kvm/x86.c | 9 +++++++--
2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index c4b72db48bc5..9c46031e96cc 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3885,8 +3885,8 @@ static int task_switch_interception(struct vcpu_svm *svm)
int_type == SVM_EXITINTINFO_TYPE_SOFT ||
(int_type == SVM_EXITINTINFO_TYPE_EXEPT &&
(int_vec == OF_VECTOR || int_vec == BP_VECTOR))) {
- if (skip_emulated_instruction(&svm->vcpu) != EMULATE_DONE)
- goto fail;
+ if (skip_emulated_instruction(&svm->vcpu) == EMULATE_USER_EXIT)
+ return 0;
}

if (int_type != SVM_EXITINTINFO_TYPE_SOFT)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bff3320aa78e..1a886ec6957d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6209,6 +6209,13 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu, int emulation_type)
return EMULATE_DONE;
}

+ if (emulation_type & EMULTYPE_SKIP) {
+ vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
+ vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
+ vcpu->run->internal.ndata = 0;
+ return EMULATE_USER_EXIT;
+ }
+
kvm_queue_exception(vcpu, UD_VECTOR);

if (!is_guest_mode(vcpu) && kvm_x86_ops->get_cpl(vcpu) == 0) {
@@ -6536,8 +6543,6 @@ int x86_emulate_instruction(struct kvm_vcpu *vcpu,
return EMULATE_DONE;
if (ctxt->have_exception && inject_emulated_exception(vcpu))
return EMULATE_DONE;
- if (emulation_type & EMULTYPE_SKIP)
- return EMULATE_FAIL;
return handle_emulation_failure(vcpu, emulation_type);
}
}
--
2.22.0

2019-09-17 18:11:10

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 00/14] KVM: x86: Remove emulation_result enums

On 27/08/19 23:40, Sean Christopherson wrote:
> Rework the emulator and its users to handle failure scenarios entirely
> within the emulator.
>
> {x86,kvm}_emulate_instruction() currently returns a tri-state value to
> indicate success/continue, userspace exit needed, and failure. The
> intent of returning EMULATE_FAIL is to let the caller handle failure in
> a manner that is appropriate for the current context. In practice,
> the emulator has ended up with a mixture of failure handling, i.e.
> whether or not the emulator takes action on failure is dependent on the
> specific flavor of emulation.
>
> The mixed handling has proven to be rather fragile, e.g. many flows
> incorrectly assume their specific flavor of emulation cannot fail or
> that the emulator sets state to report the failure back to userspace.
>
> Move everything inside the emulator, piece by piece, so that the
> emulation routines can return '0' for exit to userspace and '1' for
> resume the guest, just like every other VM-Exit handler.
>
> Patch 13/14 is a tangentially related bug fix that conflicts heavily with
> this series, so I tacked it on here.
>
> Patch 14/14 documents the emulation types. I added it as a separate
> patch at the very end so that the comments could reference the final
> state of the code base, e.g. incorporate the rule change for using
> EMULTYPE_SKIP that is introduced in patch 13/14.
>
> v1:
> - https://patchwork.kernel.org/cover/11110331/
>
> v2:
> - Collect reviews. [Vitaly and Liran]
> - Squash VMware emultype changes into a single patch. [Liran]
> - Add comments in VMX/SVM for VMware #GP handling. [Vitaly]
> - Tack on the EPT misconfig bug fix.
> - Add a patch to comment/document the emultypes. [Liran]
>
> Sean Christopherson (14):
> KVM: x86: Relocate MMIO exit stats counting
> KVM: x86: Clean up handle_emulation_failure()
> KVM: x86: Refactor kvm_vcpu_do_singlestep() to remove out param
> KVM: x86: Don't attempt VMWare emulation on #GP with non-zero error
> code
> KVM: x86: Move #GP injection for VMware into x86_emulate_instruction()
> KVM: x86: Add explicit flag for forced emulation on #UD
> KVM: x86: Move #UD injection for failed emulation into emulation code
> KVM: x86: Exit to userspace on emulation skip failure
> KVM: x86: Handle emulation failure directly in kvm_task_switch()
> KVM: x86: Move triple fault request into RM int injection
> KVM: VMX: Remove EMULATE_FAIL handling in handle_invalid_guest_state()
> KVM: x86: Remove emulation_result enums, EMULATE_{DONE,FAIL,USER_EXIT}
> KVM: VMX: Handle single-step #DB for EMULTYPE_SKIP on EPT misconfig
> KVM: x86: Add comments to document various emulation types
>
> arch/x86/include/asm/kvm_host.h | 40 +++++++--
> arch/x86/kvm/mmu.c | 16 +---
> arch/x86/kvm/svm.c | 62 ++++++--------
> arch/x86/kvm/vmx/vmx.c | 147 +++++++++++++-------------------
> arch/x86/kvm/x86.c | 133 ++++++++++++++++-------------
> arch/x86/kvm/x86.h | 2 +-
> 6 files changed, 195 insertions(+), 205 deletions(-)
>

Queued, thanks (a couple conflicts had to be sorted out, but nothing
requiring a respin).

Paolo

2019-10-25 19:43:44

by Alexander Graf

[permalink] [raw]
Subject: Re: [PATCH v2 00/14] KVM: x86: Remove emulation_result enums

On 17.09.19 17:14, Paolo Bonzini wrote:
> On 27/08/19 23:40, Sean Christopherson wrote:
>> Rework the emulator and its users to handle failure scenarios entirely
>> within the emulator.
>>
>> {x86,kvm}_emulate_instruction() currently returns a tri-state value to
>> indicate success/continue, userspace exit needed, and failure. The
>> intent of returning EMULATE_FAIL is to let the caller handle failure in
>> a manner that is appropriate for the current context. In practice,
>> the emulator has ended up with a mixture of failure handling, i.e.
>> whether or not the emulator takes action on failure is dependent on the
>> specific flavor of emulation.
>>
>> The mixed handling has proven to be rather fragile, e.g. many flows
>> incorrectly assume their specific flavor of emulation cannot fail or
>> that the emulator sets state to report the failure back to userspace.
>>
>> Move everything inside the emulator, piece by piece, so that the
>> emulation routines can return '0' for exit to userspace and '1' for
>> resume the guest, just like every other VM-Exit handler.
>>
>> Patch 13/14 is a tangentially related bug fix that conflicts heavily with
>> this series, so I tacked it on here.
>>
>> Patch 14/14 documents the emulation types. I added it as a separate
>> patch at the very end so that the comments could reference the final
>> state of the code base, e.g. incorporate the rule change for using
>> EMULTYPE_SKIP that is introduced in patch 13/14.
>>
>> v1:
>> - https://patchwork.kernel.org/cover/11110331/
>>
>> v2:
>> - Collect reviews. [Vitaly and Liran]
>> - Squash VMware emultype changes into a single patch. [Liran]
>> - Add comments in VMX/SVM for VMware #GP handling. [Vitaly]
>> - Tack on the EPT misconfig bug fix.
>> - Add a patch to comment/document the emultypes. [Liran]
>>
>> Sean Christopherson (14):
>> KVM: x86: Relocate MMIO exit stats counting
>> KVM: x86: Clean up handle_emulation_failure()
>> KVM: x86: Refactor kvm_vcpu_do_singlestep() to remove out param
>> KVM: x86: Don't attempt VMWare emulation on #GP with non-zero error
>> code
>> KVM: x86: Move #GP injection for VMware into x86_emulate_instruction()
>> KVM: x86: Add explicit flag for forced emulation on #UD
>> KVM: x86: Move #UD injection for failed emulation into emulation code
>> KVM: x86: Exit to userspace on emulation skip failure
>> KVM: x86: Handle emulation failure directly in kvm_task_switch()
>> KVM: x86: Move triple fault request into RM int injection
>> KVM: VMX: Remove EMULATE_FAIL handling in handle_invalid_guest_state()
>> KVM: x86: Remove emulation_result enums, EMULATE_{DONE,FAIL,USER_EXIT}
>> KVM: VMX: Handle single-step #DB for EMULTYPE_SKIP on EPT misconfig
>> KVM: x86: Add comments to document various emulation types
>>
>> arch/x86/include/asm/kvm_host.h | 40 +++++++--
>> arch/x86/kvm/mmu.c | 16 +---
>> arch/x86/kvm/svm.c | 62 ++++++--------
>> arch/x86/kvm/vmx/vmx.c | 147 +++++++++++++-------------------
>> arch/x86/kvm/x86.c | 133 ++++++++++++++++-------------
>> arch/x86/kvm/x86.h | 2 +-
>> 6 files changed, 195 insertions(+), 205 deletions(-)
>>
>
> Queued, thanks (a couple conflicts had to be sorted out, but nothing
> requiring a respin).

Ugh, I just stumbled over this commit. Is this really the right
direction to move towards?

I appreciate the move to reduce the emulator logic from the many-fold
enum into a simple binary "worked" or "needs a user space exit". But are
"0" and "1" really the right names for that? I find the readability of
the current intercept handlers bad enough, trickling that into even more
code sounds like a situation that will decrease readability even more.

Why can't we just use names throughout? Something like

enum kvm_return {
KVM_RET_USER_EXIT = 0,
KVM_RET_GUEST = 1,
};

and then consistently use them as return values? That way anyone who has
not worked on kvm before can still make sense of the code.


Alex



Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Ralf Herbrich
Eingetragen am Amtsgericht Charlottenburg unter HRB 149173 B
Sitz: Berlin
Ust-ID: DE 289 237 879


2019-11-06 00:59:52

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v2 00/14] KVM: x86: Remove emulation_result enums

On Fri, Oct 25, 2019 at 01:00:03PM +0200, Alexander Graf wrote:
> On 17.09.19 17:14, Paolo Bonzini wrote:
> >On 27/08/19 23:40, Sean Christopherson wrote:
> >>Rework the emulator and its users to handle failure scenarios entirely
> >>within the emulator.
> >>
> >>{x86,kvm}_emulate_instruction() currently returns a tri-state value to
> >>indicate success/continue, userspace exit needed, and failure. The
> >>intent of returning EMULATE_FAIL is to let the caller handle failure in
> >>a manner that is appropriate for the current context. In practice,
> >>the emulator has ended up with a mixture of failure handling, i.e.
> >>whether or not the emulator takes action on failure is dependent on the
> >>specific flavor of emulation.
> >>
> >>The mixed handling has proven to be rather fragile, e.g. many flows
> >>incorrectly assume their specific flavor of emulation cannot fail or
> >>that the emulator sets state to report the failure back to userspace.
> >>
> >>Move everything inside the emulator, piece by piece, so that the
> >>emulation routines can return '0' for exit to userspace and '1' for
> >>resume the guest, just like every other VM-Exit handler.
> >>
> >>Patch 13/14 is a tangentially related bug fix that conflicts heavily with
> >>this series, so I tacked it on here.
> >>
> >>Patch 14/14 documents the emulation types. I added it as a separate
> >>patch at the very end so that the comments could reference the final
> >>state of the code base, e.g. incorporate the rule change for using
> >>EMULTYPE_SKIP that is introduced in patch 13/14.
> >>
> >>v1:
> >> - https://patchwork.kernel.org/cover/11110331/
> >>
> >>v2:
> >> - Collect reviews. [Vitaly and Liran]
> >> - Squash VMware emultype changes into a single patch. [Liran]
> >> - Add comments in VMX/SVM for VMware #GP handling. [Vitaly]
> >> - Tack on the EPT misconfig bug fix.
> >> - Add a patch to comment/document the emultypes. [Liran]
> >>
> >>Sean Christopherson (14):
> >> KVM: x86: Relocate MMIO exit stats counting
> >> KVM: x86: Clean up handle_emulation_failure()
> >> KVM: x86: Refactor kvm_vcpu_do_singlestep() to remove out param
> >> KVM: x86: Don't attempt VMWare emulation on #GP with non-zero error
> >> code
> >> KVM: x86: Move #GP injection for VMware into x86_emulate_instruction()
> >> KVM: x86: Add explicit flag for forced emulation on #UD
> >> KVM: x86: Move #UD injection for failed emulation into emulation code
> >> KVM: x86: Exit to userspace on emulation skip failure
> >> KVM: x86: Handle emulation failure directly in kvm_task_switch()
> >> KVM: x86: Move triple fault request into RM int injection
> >> KVM: VMX: Remove EMULATE_FAIL handling in handle_invalid_guest_state()
> >> KVM: x86: Remove emulation_result enums, EMULATE_{DONE,FAIL,USER_EXIT}
> >> KVM: VMX: Handle single-step #DB for EMULTYPE_SKIP on EPT misconfig
> >> KVM: x86: Add comments to document various emulation types
> >>
> >> arch/x86/include/asm/kvm_host.h | 40 +++++++--
> >> arch/x86/kvm/mmu.c | 16 +---
> >> arch/x86/kvm/svm.c | 62 ++++++--------
> >> arch/x86/kvm/vmx/vmx.c | 147 +++++++++++++-------------------
> >> arch/x86/kvm/x86.c | 133 ++++++++++++++++-------------
> >> arch/x86/kvm/x86.h | 2 +-
> >> 6 files changed, 195 insertions(+), 205 deletions(-)
> >>
> >
> >Queued, thanks (a couple conflicts had to be sorted out, but nothing
> >requiring a respin).
>
> Ugh, I just stumbled over this commit. Is this really the right direction to
> move towards?

As you basically surmised below, removing the enum was just a side effect
of cleaning up the emulation error handling, it wasn't really a goal in
and of itself.

> I appreciate the move to reduce the emulator logic from the many-fold enum
> into a simple binary "worked" or "needs a user space exit". But are "0" and
> "1" really the right names for that? I find the readability of the current
> intercept handlers bad enough, trickling that into even more code sounds
> like a situation that will decrease readability even more.
>
> Why can't we just use names throughout? Something like
>
> enum kvm_return {
> KVM_RET_USER_EXIT = 0,
> KVM_RET_GUEST = 1,
> };
>
> and then consistently use them as return values? That way anyone who has not
> worked on kvm before can still make sense of the code.

Hmm, I think it'd make more sense to use #define instead of enum to
hopefully make it clear that they aren't the *only* values that can be
returned. That'd also prevent anyone from changing the return types from
'int' to 'enum kvm_return', which IMO would hurt readability overall.

And maybe KVM_EXIT_TO_USERSPACE and KVM_RETURN_TO_GUEST?

2019-11-06 12:18:40

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 00/14] KVM: x86: Remove emulation_result enums

On 06/11/19 01:58, Sean Christopherson wrote:
>> enum kvm_return {
>> KVM_RET_USER_EXIT = 0,
>> KVM_RET_GUEST = 1,
>> };
>>
>> and then consistently use them as return values? That way anyone who has not
>> worked on kvm before can still make sense of the code.
> Hmm, I think it'd make more sense to use #define instead of enum to
> hopefully make it clear that they aren't the *only* values that can be
> returned. That'd also prevent anyone from changing the return types from
> 'int' to 'enum kvm_return', which IMO would hurt readability overall.
>
> And maybe KVM_EXIT_TO_USERSPACE and KVM_RETURN_TO_GUEST?

That would be quite some work. Right now there is some consistency
between all of:

- x86_emulate_instruction and its callers

- vcpu->arch.complete_userspace_io

- vcpu_enter_guest/vcpu_block

- kvm_x86_ops->handle_exit

so it would be very easy to end up with a half-int-half-enum state that
is more confusing than before...

I'm more worried about cases where we have functions returning either 0
or -errno, but 0 lets you enter the guest. I'm not sure if the only one
is kvm_mmu_reload or there are others.

Paolo

2019-11-06 22:28:37

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v2 00/14] KVM: x86: Remove emulation_result enums

On Wed, Nov 06, 2019 at 01:17:40PM +0100, Paolo Bonzini wrote:
> On 06/11/19 01:58, Sean Christopherson wrote:
> >> enum kvm_return {
> >> KVM_RET_USER_EXIT = 0,
> >> KVM_RET_GUEST = 1,
> >> };
> >>
> >> and then consistently use them as return values? That way anyone who has not
> >> worked on kvm before can still make sense of the code.
> > Hmm, I think it'd make more sense to use #define instead of enum to
> > hopefully make it clear that they aren't the *only* values that can be
> > returned. That'd also prevent anyone from changing the return types from
> > 'int' to 'enum kvm_return', which IMO would hurt readability overall.
> >
> > And maybe KVM_EXIT_TO_USERSPACE and KVM_RETURN_TO_GUEST?
>
> That would be quite some work. Right now there is some consistency
> between all of:
>
> - x86_emulate_instruction and its callers
>
> - vcpu->arch.complete_userspace_io
>
> - vcpu_enter_guest/vcpu_block
>
> - kvm_x86_ops->handle_exit
>
> so it would be very easy to end up with a half-int-half-enum state that
> is more confusing than before...

Ya, my thought was to update the obvious cases, essentially the ones you
listed above, to use the define. So almost intentionally end up in a
half-n-half state, at least in the short term, the thought being that the
extra annotation would do more harm than good. But there's really no way
to determine whether or not it would actually be a net positive without
writing the code...

> I'm more worried about cases where we have functions returning either 0
> or -errno, but 0 lets you enter the guest. I'm not sure if the only one
> is kvm_mmu_reload or there are others.

There are lots of those, e.g. basically all of the helpers for nested
consistency checks.