2023-10-25 15:26:24

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 00/14] KVM: x86: Make Hyper-V emulation optional

RFC:
https://lore.kernel.org/kvm/[email protected]/

Changes since RFC:
- Enhance Kconfig message [Max, Sean]
- Introduce nested_vmx_is_evmptr12_valid [Max, Sean]
- Collected R-b tags [Max]
- Add a common function to allocate partition assist page [Max]
- Fix KVM selftest to skip gracefully when !CONFIG_KVM_HYPERV
- Other minor tweaks.

KVM supports emulating Microsoft Hyper-V as well as running as a nested
(L1) hypervisor on top of Hyper-V whileusing Hyper-V specific PV features.
Support for the later is only compiled in when CONFIG_HYPERV is set. This
series makes it possible to disable the former with a newly introduced
CONFIG_KVM_HYPERV option. This helps to reduce sized of KVM modules as well
as the attack surface for some specific deployments when no Windows/Hyper-V
guests are expected. The size gain is significant:

# CONFIG_KVM_HYPERV is not set
# CONFIG_HYPERV is not set

-rw-r--r--. 1 user user 3612632 Oct 10 16:53 arch/x86/kvm/kvm-amd.ko
-rw-r--r--. 1 user user 5343968 Oct 10 16:53 arch/x86/kvm/kvm-intel.ko

CONFIG_KVM_HYPERV=y
# CONFIG_HYPERV is not set

-rw-r--r--. 1 user user 3925704 Oct 10 16:51 arch/x86/kvm/kvm-amd.ko
-rw-r--r--. 1 user user 5819192 Oct 10 16:51 arch/x86/kvm/kvm-intel.ko

# CONFIG_KVM_HYPERV is not set
CONFIG_HYPERV=m

-rw-r--r--. 1 user user 3928440 Oct 10 16:40 arch/x86/kvm/kvm-amd.ko
-rw-r--r--. 1 user user 8156464 Oct 10 16:40 arch/x86/kvm/kvm-intel.ko

CONFIG_KVM_HYPERV=y
CONFIG_HYPERV=m

-rw-r--r--. 1 user user 4245440 Oct 10 16:37 arch/x86/kvm/kvm-amd.ko
-rw-r--r--. 1 user user 8583872 Oct 10 16:37 arch/x86/kvm/kvm-intel.ko

The series should not supposed to introduce any functional change for the
"CONFIG_KVM_HYPERV=y && CONFIG_HYPERV=m/y" case. Tested with KVM selftests,
kvm-unit-tests and real Windows guests on VMX and SVM. Note, kvm-unit-tests
have to be updated to not fail miserably when CONFIG_KVM_HYPERV is not set,
I'll send a separate series.

Vitaly Kuznetsov (14):
KVM: x86: xen: Remove unneeded xen context from struct kvm_arch when
!CONFIG_KVM_XEN
KVM: x86: hyper-v: Move Hyper-V partition assist page out of Hyper-V
emulation context
KVM: VMX: Split off vmx_onhyperv.{ch} from hyperv.{ch}
KVM: x86: hyper-v: Introduce kvm_hv_synic_auto_eoi_set()
KVM: x86: hyper-v: Introduce kvm_hv_synic_has_vector()
KVM: VMX: Split off hyperv_evmcs.{ch}
KVM: x86: hyper-v: Introduce kvm_hv_nested_transtion_tlb_flush()
helper
KVM: selftests: Make all Hyper-V tests explicitly dependent on Hyper-V
emulation support in KVM
KVM: selftests: Fix vmxon_pa == vmcs12_pa == -1ull
vmx_set_nested_state_test for !eVMCS case
KVM: x86: Make Hyper-V emulation optional
KVM: nVMX: hyper-v: Introduce nested_vmx_evmptr12() and
nested_vmx_is_evmptr12_valid() helpers
KVM: nVMX: hyper-v: Introduce nested_vmx_evmcs() accessor
KVM: nVMX: hyper-v: Hide more stuff under CONFIG_KVM_HYPERV
KVM: nSVM: hyper-v: Hide more stuff under
CONFIG_KVM_HYPERV/CONFIG_HYPERV

arch/x86/include/asm/kvm_host.h | 11 +-
arch/x86/kvm/Kconfig | 14 +
arch/x86/kvm/Makefile | 19 +-
arch/x86/kvm/cpuid.c | 6 +
arch/x86/kvm/hyperv.h | 52 +-
arch/x86/kvm/irq.c | 2 +
arch/x86/kvm/irq_comm.c | 9 +-
arch/x86/kvm/kvm_onhyperv.h | 20 +
arch/x86/kvm/lapic.c | 5 +-
arch/x86/kvm/svm/hyperv.h | 7 +
arch/x86/kvm/svm/nested.c | 30 +-
arch/x86/kvm/svm/svm.h | 2 +
arch/x86/kvm/svm/svm_onhyperv.c | 10 +-
arch/x86/kvm/vmx/hyperv.c | 447 ------------------
arch/x86/kvm/vmx/hyperv.h | 196 ++------
arch/x86/kvm/vmx/hyperv_evmcs.c | 315 ++++++++++++
arch/x86/kvm/vmx/hyperv_evmcs.h | 166 +++++++
arch/x86/kvm/vmx/nested.c | 102 ++--
arch/x86/kvm/vmx/nested.h | 3 +-
arch/x86/kvm/vmx/vmx.c | 18 +-
arch/x86/kvm/vmx/vmx.h | 2 +
arch/x86/kvm/vmx/vmx_onhyperv.c | 36 ++
arch/x86/kvm/vmx/vmx_onhyperv.h | 125 +++++
arch/x86/kvm/vmx/vmx_ops.h | 2 +-
arch/x86/kvm/x86.c | 66 ++-
.../selftests/kvm/x86_64/hyperv_clock.c | 2 +
.../selftests/kvm/x86_64/hyperv_evmcs.c | 5 +-
.../kvm/x86_64/hyperv_extended_hypercalls.c | 2 +
.../selftests/kvm/x86_64/hyperv_features.c | 2 +
.../testing/selftests/kvm/x86_64/hyperv_ipi.c | 2 +
.../selftests/kvm/x86_64/hyperv_svm_test.c | 1 +
.../selftests/kvm/x86_64/hyperv_tlb_flush.c | 2 +
.../kvm/x86_64/vmx_set_nested_state_test.c | 16 +-
33 files changed, 963 insertions(+), 734 deletions(-)
create mode 100644 arch/x86/kvm/vmx/hyperv_evmcs.c
create mode 100644 arch/x86/kvm/vmx/hyperv_evmcs.h
create mode 100644 arch/x86/kvm/vmx/vmx_onhyperv.c
create mode 100644 arch/x86/kvm/vmx/vmx_onhyperv.h

--
2.41.0


2023-10-25 15:26:26

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 09/14] KVM: selftests: Fix vmxon_pa == vmcs12_pa == -1ull vmx_set_nested_state_test for !eVMCS case

The "vmxon_pa == vmcs12_pa == -1ull" test happens to work by accident: as
Enlightened VMCS is always supported, set_default_vmx_state() adds
'KVM_STATE_NESTED_EVMCS' to 'flags' and the following branch of
vmx_set_nested_state() is executed:

if ((kvm_state->flags & KVM_STATE_NESTED_EVMCS) &&
(!guest_can_use(vcpu, X86_FEATURE_VMX) ||
!vmx->nested.enlightened_vmcs_enabled))
return -EINVAL;

as 'enlightened_vmcs_enabled' is false. In fact, "vmxon_pa == vmcs12_pa ==
-1ull" is a valid state when not tainted by wrong flags so the test should
aim for this branch:

if (kvm_state->hdr.vmx.vmxon_pa == INVALID_GPA)
return 0;

Test all this properly:
- Without KVM_STATE_NESTED_EVMCS in the flags, the expected return value is
'0'.
- With KVM_STATE_NESTED_EVMCS flag (when supported) set, the expected
return value is '-EINVAL' prior to enabling eVMCS and '0' after.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
.../kvm/x86_64/vmx_set_nested_state_test.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c b/tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c
index 41ea7028a1f8..67a62a5a8895 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_set_nested_state_test.c
@@ -125,21 +125,25 @@ void test_vmx_nested_state(struct kvm_vcpu *vcpu)

/*
* Setting vmxon_pa == -1ull and vmcs_pa == -1ull exits early without
- * setting the nested state but flags other than eVMCS must be clear.
- * The eVMCS flag can be set if the enlightened VMCS capability has
- * been enabled.
+ * setting the nested state. When the eVMCS flag is not set, the
+ * expected return value is '0'.
*/
set_default_vmx_state(state, state_sz);
+ state->flags = 0;
state->hdr.vmx.vmxon_pa = -1ull;
state->hdr.vmx.vmcs12_pa = -1ull;
- test_nested_state_expect_einval(vcpu, state);
+ test_nested_state(vcpu, state);

- state->flags &= KVM_STATE_NESTED_EVMCS;
+ /*
+ * When eVMCS is supported, the eVMCS flag can only be set if the
+ * enlightened VMCS capability has been enabled.
+ */
if (have_evmcs) {
+ state->flags = KVM_STATE_NESTED_EVMCS;
test_nested_state_expect_einval(vcpu, state);
vcpu_enable_evmcs(vcpu);
+ test_nested_state(vcpu, state);
}
- test_nested_state(vcpu, state);

/* It is invalid to have vmxon_pa == -1ull and SMM flags non-zero. */
state->hdr.vmx.smm.flags = 1;
--
2.41.0

2023-10-25 15:27:10

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 11/14] KVM: nVMX: hyper-v: Introduce nested_vmx_evmptr12() and nested_vmx_is_evmptr12_valid() helpers

'vmx->nested.hv_evmcs_vmptr' accesses are all over the place so hiding
'hv_evmcs_vmptr' under 'ifdef CONFIG_KVM_HYPERV' would take a lot of
ifdefs. Introduce 'nested_vmx_evmptr12()' accessor and
'nested_vmx_is_evmptr12_valid()' checker instead. Note, several explicit

nested_vmx_evmptr12(vmx) != EVMPTR_INVALID

comparisons exist for a reson: 'nested_vmx_is_evmptr12_valid()' also checks
against 'EVMPTR_MAP_PENDING' and in these places this is undesireable. It
is possible to e.g. introduce 'nested_vmx_is_evmptr12_invalid()' and turn
these sites into

!nested_vmx_is_evmptr12_invalid(vmx)

eliminating the need for 'nested_vmx_evmptr12()' but this seems to create
even more confusion.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/hyperv.h | 10 +++++++++
arch/x86/kvm/vmx/nested.c | 44 +++++++++++++++++++--------------------
arch/x86/kvm/vmx/nested.h | 2 +-
3 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/arch/x86/kvm/vmx/hyperv.h b/arch/x86/kvm/vmx/hyperv.h
index 933ef6cad5e6..ba1a95ea72b7 100644
--- a/arch/x86/kvm/vmx/hyperv.h
+++ b/arch/x86/kvm/vmx/hyperv.h
@@ -4,6 +4,7 @@

#include <linux/kvm_host.h>
#include "vmcs12.h"
+#include "vmx.h"

#define EVMPTR_INVALID (-1ULL)
#define EVMPTR_MAP_PENDING (-2ULL)
@@ -20,7 +21,14 @@ enum nested_evmptrld_status {
EVMPTRLD_ERROR,
};

+struct vcpu_vmx;
+
#ifdef CONFIG_KVM_HYPERV
+static inline gpa_t nested_vmx_evmptr12(struct vcpu_vmx *vmx) { return vmx->nested.hv_evmcs_vmptr; }
+static inline bool nested_vmx_is_evmptr12_valid(struct vcpu_vmx *vmx)
+{
+ return evmptr_is_valid(vmx->nested.hv_evmcs_vmptr);
+}
u64 nested_get_evmptr(struct kvm_vcpu *vcpu);
uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu);
int nested_enable_evmcs(struct kvm_vcpu *vcpu,
@@ -30,6 +38,8 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
bool nested_evmcs_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu);
void vmx_hv_inject_synthetic_vmexit_post_tlb_flush(struct kvm_vcpu *vcpu);
#else
+static inline gpa_t nested_vmx_evmptr12(struct vcpu_vmx *vmx) { return EVMPTR_INVALID; }
+static inline bool nested_vmx_is_evmptr12_valid(struct vcpu_vmx *vmx) { return false; }
static inline u64 nested_get_evmptr(struct kvm_vcpu *vcpu) { return EVMPTR_INVALID; }
static inline void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata) {}
static inline bool nested_evmcs_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu) { return false; }
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index d0d735974b2c..b45586588bae 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -179,7 +179,7 @@ static int nested_vmx_failValid(struct kvm_vcpu *vcpu,
* VM_INSTRUCTION_ERROR is not shadowed. Enlightened VMCS 'shadows' all
* fields and thus must be synced.
*/
- if (to_vmx(vcpu)->nested.hv_evmcs_vmptr != EVMPTR_INVALID)
+ if (nested_vmx_evmptr12(to_vmx(vcpu)) != EVMPTR_INVALID)
to_vmx(vcpu)->nested.need_vmcs12_to_shadow_sync = true;

return kvm_skip_emulated_instruction(vcpu);
@@ -194,7 +194,7 @@ static int nested_vmx_fail(struct kvm_vcpu *vcpu, u32 vm_instruction_error)
* can't be done if there isn't a current VMCS.
*/
if (vmx->nested.current_vmptr == INVALID_GPA &&
- !evmptr_is_valid(vmx->nested.hv_evmcs_vmptr))
+ !nested_vmx_is_evmptr12_valid(vmx))
return nested_vmx_failInvalid(vcpu);

return nested_vmx_failValid(vcpu, vm_instruction_error);
@@ -230,7 +230,7 @@ static inline void nested_release_evmcs(struct kvm_vcpu *vcpu)
struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
struct vcpu_vmx *vmx = to_vmx(vcpu);

- if (evmptr_is_valid(vmx->nested.hv_evmcs_vmptr)) {
+ if (nested_vmx_is_evmptr12_valid(vmx)) {
kvm_vcpu_unmap(vcpu, &vmx->nested.hv_evmcs_map, true);
vmx->nested.hv_evmcs = NULL;
}
@@ -2011,7 +2011,7 @@ static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
return EVMPTRLD_DISABLED;
}

- if (unlikely(evmcs_gpa != vmx->nested.hv_evmcs_vmptr)) {
+ if (unlikely(evmcs_gpa != nested_vmx_evmptr12(vmx))) {
vmx->nested.current_vmptr = INVALID_GPA;

nested_release_evmcs(vcpu);
@@ -2089,7 +2089,7 @@ void nested_sync_vmcs12_to_shadow(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);

- if (evmptr_is_valid(vmx->nested.hv_evmcs_vmptr))
+ if (nested_vmx_is_evmptr12_valid(vmx))
copy_vmcs12_to_enlightened(vmx);
else
copy_vmcs12_to_shadow(vmx);
@@ -2243,7 +2243,7 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0
u32 exec_control;
u64 guest_efer = nested_vmx_calc_efer(vmx, vmcs12);

- if (vmx->nested.dirty_vmcs12 || evmptr_is_valid(vmx->nested.hv_evmcs_vmptr))
+ if (vmx->nested.dirty_vmcs12 || nested_vmx_is_evmptr12_valid(vmx))
prepare_vmcs02_early_rare(vmx, vmcs12);

/*
@@ -2538,11 +2538,11 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
struct vcpu_vmx *vmx = to_vmx(vcpu);
bool load_guest_pdptrs_vmcs12 = false;

- if (vmx->nested.dirty_vmcs12 || evmptr_is_valid(vmx->nested.hv_evmcs_vmptr)) {
+ if (vmx->nested.dirty_vmcs12 || nested_vmx_is_evmptr12_valid(vmx)) {
prepare_vmcs02_rare(vmx, vmcs12);
vmx->nested.dirty_vmcs12 = false;

- load_guest_pdptrs_vmcs12 = !evmptr_is_valid(vmx->nested.hv_evmcs_vmptr) ||
+ load_guest_pdptrs_vmcs12 = !nested_vmx_is_evmptr12_valid(vmx) ||
!(vmx->nested.hv_evmcs->hv_clean_fields &
HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1);
}
@@ -2665,7 +2665,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
* bits when it changes a field in eVMCS. Mark all fields as clean
* here.
*/
- if (evmptr_is_valid(vmx->nested.hv_evmcs_vmptr))
+ if (nested_vmx_is_evmptr12_valid(vmx))
vmx->nested.hv_evmcs->hv_clean_fields |=
HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL;

@@ -3173,7 +3173,7 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)
* properly reflected.
*/
if (guest_cpuid_has_evmcs(vcpu) &&
- vmx->nested.hv_evmcs_vmptr == EVMPTR_MAP_PENDING) {
+ nested_vmx_evmptr12(vmx) == EVMPTR_MAP_PENDING) {
enum nested_evmptrld_status evmptrld_status =
nested_vmx_handle_enlightened_vmptrld(vcpu, false);

@@ -3543,7 +3543,7 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,

load_vmcs12_host_state(vcpu, vmcs12);
vmcs12->vm_exit_reason = exit_reason.full;
- if (enable_shadow_vmcs || evmptr_is_valid(vmx->nested.hv_evmcs_vmptr))
+ if (enable_shadow_vmcs || nested_vmx_is_evmptr12_valid(vmx))
vmx->nested.need_vmcs12_to_shadow_sync = true;
return NVMX_VMENTRY_VMEXIT;
}
@@ -3576,7 +3576,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
if (CC(evmptrld_status == EVMPTRLD_VMFAIL))
return nested_vmx_failInvalid(vcpu);

- if (CC(!evmptr_is_valid(vmx->nested.hv_evmcs_vmptr) &&
+ if (CC(!nested_vmx_is_evmptr12_valid(vmx) &&
vmx->nested.current_vmptr == INVALID_GPA))
return nested_vmx_failInvalid(vcpu);

@@ -3591,7 +3591,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
if (CC(vmcs12->hdr.shadow_vmcs))
return nested_vmx_failInvalid(vcpu);

- if (evmptr_is_valid(vmx->nested.hv_evmcs_vmptr)) {
+ if (nested_vmx_is_evmptr12_valid(vmx)) {
copy_enlightened_to_vmcs12(vmx, vmx->nested.hv_evmcs->hv_clean_fields);
/* Enlightened VMCS doesn't have launch state */
vmcs12->launch_state = !launch;
@@ -4336,11 +4336,11 @@ static void sync_vmcs02_to_vmcs12(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);

- if (evmptr_is_valid(vmx->nested.hv_evmcs_vmptr))
+ if (nested_vmx_is_evmptr12_valid(vmx))
sync_vmcs02_to_vmcs12_rare(vcpu, vmcs12);

vmx->nested.need_sync_vmcs02_to_vmcs12_rare =
- !evmptr_is_valid(vmx->nested.hv_evmcs_vmptr);
+ !nested_vmx_is_evmptr12_valid(vmx);

vmcs12->guest_cr0 = vmcs12_guest_cr0(vcpu, vmcs12);
vmcs12->guest_cr4 = vmcs12_guest_cr4(vcpu, vmcs12);
@@ -4861,7 +4861,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
}

if ((vm_exit_reason != -1) &&
- (enable_shadow_vmcs || evmptr_is_valid(vmx->nested.hv_evmcs_vmptr)))
+ (enable_shadow_vmcs || nested_vmx_is_evmptr12_valid(vmx)))
vmx->nested.need_vmcs12_to_shadow_sync = true;

/* in case we halted in L2 */
@@ -5327,7 +5327,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
vmptr + offsetof(struct vmcs12,
launch_state),
&zero, sizeof(zero));
- } else if (vmx->nested.hv_evmcs && vmptr == vmx->nested.hv_evmcs_vmptr) {
+ } else if (vmx->nested.hv_evmcs && vmptr == nested_vmx_evmptr12(vmx)) {
nested_release_evmcs(vcpu);
}

@@ -5367,7 +5367,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
/* Decode instruction info and find the field to read */
field = kvm_register_read(vcpu, (((instr_info) >> 28) & 0xf));

- if (!evmptr_is_valid(vmx->nested.hv_evmcs_vmptr)) {
+ if (!nested_vmx_is_evmptr12_valid(vmx)) {
/*
* In VMX non-root operation, when the VMCS-link pointer is INVALID_GPA,
* any VMREAD sets the ALU flags for VMfailInvalid.
@@ -5593,7 +5593,7 @@ static int handle_vmptrld(struct kvm_vcpu *vcpu)
return nested_vmx_fail(vcpu, VMXERR_VMPTRLD_VMXON_POINTER);

/* Forbid normal VMPTRLD if Enlightened version was used */
- if (evmptr_is_valid(vmx->nested.hv_evmcs_vmptr))
+ if (nested_vmx_is_evmptr12_valid(vmx))
return 1;

if (vmx->nested.current_vmptr != vmptr) {
@@ -5656,7 +5656,7 @@ static int handle_vmptrst(struct kvm_vcpu *vcpu)
if (!nested_vmx_check_permission(vcpu))
return 1;

- if (unlikely(evmptr_is_valid(to_vmx(vcpu)->nested.hv_evmcs_vmptr)))
+ if (unlikely(nested_vmx_is_evmptr12_valid(to_vmx(vcpu))))
return 1;

if (get_vmx_mem_address(vcpu, exit_qual, instr_info,
@@ -6442,7 +6442,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
kvm_state.size += sizeof(user_vmx_nested_state->vmcs12);

/* 'hv_evmcs_vmptr' can also be EVMPTR_MAP_PENDING here */
- if (vmx->nested.hv_evmcs_vmptr != EVMPTR_INVALID)
+ if (nested_vmx_evmptr12(vmx) != EVMPTR_INVALID)
kvm_state.flags |= KVM_STATE_NESTED_EVMCS;

if (is_guest_mode(vcpu) &&
@@ -6498,7 +6498,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
} else {
copy_vmcs02_to_vmcs12_rare(vcpu, get_vmcs12(vcpu));
if (!vmx->nested.need_vmcs12_to_shadow_sync) {
- if (evmptr_is_valid(vmx->nested.hv_evmcs_vmptr))
+ if (nested_vmx_is_evmptr12_valid(vmx))
/*
* L1 hypervisor is not obliged to keep eVMCS
* clean fields data always up-to-date while
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index b0f2e26c1aea..0cedb80c5c94 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -58,7 +58,7 @@ static inline int vmx_has_valid_vmcs12(struct kvm_vcpu *vcpu)

/* 'hv_evmcs_vmptr' can also be EVMPTR_MAP_PENDING here */
return vmx->nested.current_vmptr != -1ull ||
- vmx->nested.hv_evmcs_vmptr != EVMPTR_INVALID;
+ nested_vmx_evmptr12(vmx) != EVMPTR_INVALID;
}

static inline u16 nested_get_vpid02(struct kvm_vcpu *vcpu)
--
2.41.0

2023-10-25 15:27:17

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 07/14] KVM: x86: hyper-v: Introduce kvm_hv_nested_transtion_tlb_flush() helper

As a preparation to making Hyper-V emulation optional, introduce a helper
to handle pending KVM_REQ_HV_TLB_FLUSH requests.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.h | 12 ++++++++++++
arch/x86/kvm/svm/nested.c | 10 ++--------
arch/x86/kvm/vmx/nested.c | 10 ++--------
3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index ddb1d0b019e6..75dcbe598fbc 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -246,6 +246,18 @@ static inline int kvm_hv_verify_vp_assist(struct kvm_vcpu *vcpu)
return kvm_hv_get_assist_page(vcpu);
}

+static inline void kvm_hv_nested_transtion_tlb_flush(struct kvm_vcpu *vcpu, bool tdp_enabled)
+{
+ /*
+ * KVM_REQ_HV_TLB_FLUSH flushes entries from either L1's VP_ID or
+ * L2's VP_ID upon request from the guest. Make sure we check for
+ * pending entries in the right FIFO upon L1/L2 transition as these
+ * requests are put by other vCPUs asynchronously.
+ */
+ if (to_hv_vcpu(vcpu) && tdp_enabled)
+ kvm_make_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
+}
+
int kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);

#endif
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 3fea8c47679e..74c04102ef01 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -487,14 +487,8 @@ static void nested_save_pending_event_to_vmcb12(struct vcpu_svm *svm,

static void nested_svm_transition_tlb_flush(struct kvm_vcpu *vcpu)
{
- /*
- * KVM_REQ_HV_TLB_FLUSH flushes entries from either L1's VP_ID or
- * L2's VP_ID upon request from the guest. Make sure we check for
- * pending entries in the right FIFO upon L1/L2 transition as these
- * requests are put by other vCPUs asynchronously.
- */
- if (to_hv_vcpu(vcpu) && npt_enabled)
- kvm_make_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
+ /* Handle pending Hyper-V TLB flush requests */
+ kvm_hv_nested_transtion_tlb_flush(vcpu, npt_enabled);

/*
* TODO: optimize unconditional TLB flush/MMU sync. A partial list of
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index c5ec0ef51ff7..382c0746d069 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1139,14 +1139,8 @@ static void nested_vmx_transition_tlb_flush(struct kvm_vcpu *vcpu,
{
struct vcpu_vmx *vmx = to_vmx(vcpu);

- /*
- * KVM_REQ_HV_TLB_FLUSH flushes entries from either L1's VP_ID or
- * L2's VP_ID upon request from the guest. Make sure we check for
- * pending entries in the right FIFO upon L1/L2 transition as these
- * requests are put by other vCPUs asynchronously.
- */
- if (to_hv_vcpu(vcpu) && enable_ept)
- kvm_make_request(KVM_REQ_HV_TLB_FLUSH, vcpu);
+ /* Handle pending Hyper-V TLB flush requests */
+ kvm_hv_nested_transtion_tlb_flush(vcpu, enable_ept);

/*
* If vmcs12 doesn't use VPID, L1 expects linear and combined mappings
--
2.41.0

2023-10-25 15:27:21

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 13/14] KVM: nVMX: hyper-v: Hide more stuff under CONFIG_KVM_HYPERV

'hv_evmcs_vmptr'/'hv_evmcs_map'/'hv_evmcs' fields in 'struct nested_vmx'
should not be used when !CONFIG_KVM_HYPERV, hide them when
!CONFIG_KVM_HYPERV.

No functional change intended.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 2 ++
arch/x86/kvm/vmx/vmx.c | 3 +++
arch/x86/kvm/vmx/vmx.h | 2 ++
3 files changed, 7 insertions(+)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index fa04aa38ad52..4777d867419c 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6642,6 +6642,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
return -EINVAL;

set_current_vmptr(vmx, kvm_state->hdr.vmx.vmcs12_pa);
+#ifdef CONFIG_KVM_HYPERV
} else if (kvm_state->flags & KVM_STATE_NESTED_EVMCS) {
/*
* nested_vmx_handle_enlightened_vmptrld() cannot be called
@@ -6651,6 +6652,7 @@ static int vmx_set_nested_state(struct kvm_vcpu *vcpu,
*/
vmx->nested.hv_evmcs_vmptr = EVMPTR_MAP_PENDING;
kvm_make_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu);
+#endif
} else {
return -EINVAL;
}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 27411665bef9..6e0ff015c5ff 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4826,7 +4826,10 @@ static void __vmx_vcpu_reset(struct kvm_vcpu *vcpu)
vmx->nested.posted_intr_nv = -1;
vmx->nested.vmxon_ptr = INVALID_GPA;
vmx->nested.current_vmptr = INVALID_GPA;
+
+#ifdef CONFIG_KVM_HYPERV
vmx->nested.hv_evmcs_vmptr = EVMPTR_INVALID;
+#endif

vcpu->arch.microcode_version = 0x100000000ULL;
vmx->msr_ia32_feature_control_valid_bits = FEAT_CTL_LOCKED;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index c2130d2c8e24..55addb8eef01 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -241,9 +241,11 @@ struct nested_vmx {
bool guest_mode;
} smm;

+#ifdef CONFIG_KVM_HYPERV
gpa_t hv_evmcs_vmptr;
struct kvm_host_map hv_evmcs_map;
struct hv_enlightened_vmcs *hv_evmcs;
+#endif
};

struct vcpu_vmx {
--
2.41.0

2023-10-25 15:27:29

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 12/14] KVM: nVMX: hyper-v: Introduce nested_vmx_evmcs() accessor

There's a number of 'vmx->nested.hv_evmcs' accesses in nested.c, introduce
'nested_vmx_evmcs()' accessor to hide them all in !CONFIG_KVM_HYPERV case.

No functional change intended.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/hyperv.h | 8 ++++++++
arch/x86/kvm/vmx/nested.c | 33 ++++++++++++++++++---------------
2 files changed, 26 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kvm/vmx/hyperv.h b/arch/x86/kvm/vmx/hyperv.h
index ba1a95ea72b7..92bde3b75ca0 100644
--- a/arch/x86/kvm/vmx/hyperv.h
+++ b/arch/x86/kvm/vmx/hyperv.h
@@ -29,6 +29,10 @@ static inline bool nested_vmx_is_evmptr12_valid(struct vcpu_vmx *vmx)
{
return evmptr_is_valid(vmx->nested.hv_evmcs_vmptr);
}
+static inline struct hv_enlightened_vmcs *nested_vmx_evmcs(struct vcpu_vmx *vmx)
+{
+ return vmx->nested.hv_evmcs;
+}
u64 nested_get_evmptr(struct kvm_vcpu *vcpu);
uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu);
int nested_enable_evmcs(struct kvm_vcpu *vcpu,
@@ -40,6 +44,10 @@ void vmx_hv_inject_synthetic_vmexit_post_tlb_flush(struct kvm_vcpu *vcpu);
#else
static inline gpa_t nested_vmx_evmptr12(struct vcpu_vmx *vmx) { return EVMPTR_INVALID; }
static inline bool nested_vmx_is_evmptr12_valid(struct vcpu_vmx *vmx) { return false; }
+static inline struct hv_enlightened_vmcs *nested_vmx_evmcs(struct vcpu_vmx *vmx)
+{
+ return NULL;
+}
static inline u64 nested_get_evmptr(struct kvm_vcpu *vcpu) { return EVMPTR_INVALID; }
static inline void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata) {}
static inline bool nested_evmcs_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu) { return false; }
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index b45586588bae..fa04aa38ad52 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -574,7 +574,6 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
int msr;
unsigned long *msr_bitmap_l1;
unsigned long *msr_bitmap_l0 = vmx->nested.vmcs02.msr_bitmap;
- struct hv_enlightened_vmcs *evmcs = vmx->nested.hv_evmcs;
struct kvm_host_map *map = &vmx->nested.msr_bitmap_map;

/* Nothing to do if the MSR bitmap is not in use. */
@@ -590,10 +589,13 @@ static inline bool nested_vmx_prepare_msr_bitmap(struct kvm_vcpu *vcpu,
* - Nested hypervisor (L1) has enabled 'Enlightened MSR Bitmap' feature
* and tells KVM (L0) there were no changes in MSR bitmap for L2.
*/
- if (!vmx->nested.force_msr_bitmap_recalc && evmcs &&
- evmcs->hv_enlightenments_control.msr_bitmap &&
- evmcs->hv_clean_fields & HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP)
- return true;
+ if (!vmx->nested.force_msr_bitmap_recalc) {
+ struct hv_enlightened_vmcs *evmcs = nested_vmx_evmcs(vmx);
+
+ if (evmcs && evmcs->hv_enlightenments_control.msr_bitmap &&
+ evmcs->hv_clean_fields & HV_VMX_ENLIGHTENED_CLEAN_FIELD_MSR_BITMAP)
+ return true;
+ }

if (kvm_vcpu_map(vcpu, gpa_to_gfn(vmcs12->msr_bitmap), map))
return false;
@@ -1576,7 +1578,7 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields)
{
struct vmcs12 *vmcs12 = vmx->nested.cached_vmcs12;
- struct hv_enlightened_vmcs *evmcs = vmx->nested.hv_evmcs;
+ struct hv_enlightened_vmcs *evmcs = nested_vmx_evmcs(vmx);
struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(&vmx->vcpu);

/* HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE */
@@ -1820,7 +1822,7 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
static void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx)
{
struct vmcs12 *vmcs12 = vmx->nested.cached_vmcs12;
- struct hv_enlightened_vmcs *evmcs = vmx->nested.hv_evmcs;
+ struct hv_enlightened_vmcs *evmcs = nested_vmx_evmcs(vmx);

/*
* Should not be changed by KVM:
@@ -2404,7 +2406,7 @@ static void prepare_vmcs02_early(struct vcpu_vmx *vmx, struct loaded_vmcs *vmcs0

static void prepare_vmcs02_rare(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
{
- struct hv_enlightened_vmcs *hv_evmcs = vmx->nested.hv_evmcs;
+ struct hv_enlightened_vmcs *hv_evmcs = nested_vmx_evmcs(vmx);

if (!hv_evmcs || !(hv_evmcs->hv_clean_fields &
HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP2)) {
@@ -2536,6 +2538,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
enum vm_entry_failure_code *entry_failure_code)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
+ struct hv_enlightened_vmcs *evmcs = nested_vmx_evmcs(vmx);
bool load_guest_pdptrs_vmcs12 = false;

if (vmx->nested.dirty_vmcs12 || nested_vmx_is_evmptr12_valid(vmx)) {
@@ -2543,8 +2546,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
vmx->nested.dirty_vmcs12 = false;

load_guest_pdptrs_vmcs12 = !nested_vmx_is_evmptr12_valid(vmx) ||
- !(vmx->nested.hv_evmcs->hv_clean_fields &
- HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1);
+ !(evmcs->hv_clean_fields & HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1);
}

if (vmx->nested.nested_run_pending &&
@@ -2666,8 +2668,7 @@ static int prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12,
* here.
*/
if (nested_vmx_is_evmptr12_valid(vmx))
- vmx->nested.hv_evmcs->hv_clean_fields |=
- HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL;
+ evmcs->hv_clean_fields |= HV_VMX_ENLIGHTENED_CLEAN_FIELD_ALL;

return 0;
}
@@ -3592,7 +3593,9 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
return nested_vmx_failInvalid(vcpu);

if (nested_vmx_is_evmptr12_valid(vmx)) {
- copy_enlightened_to_vmcs12(vmx, vmx->nested.hv_evmcs->hv_clean_fields);
+ struct hv_enlightened_vmcs *evmcs = nested_vmx_evmcs(vmx);
+
+ copy_enlightened_to_vmcs12(vmx, evmcs->hv_clean_fields);
/* Enlightened VMCS doesn't have launch state */
vmcs12->launch_state = !launch;
} else if (enable_shadow_vmcs) {
@@ -5327,7 +5330,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
vmptr + offsetof(struct vmcs12,
launch_state),
&zero, sizeof(zero));
- } else if (vmx->nested.hv_evmcs && vmptr == nested_vmx_evmptr12(vmx)) {
+ } else if (nested_vmx_evmcs(vmx) && vmptr == nested_vmx_evmptr12(vmx)) {
nested_release_evmcs(vcpu);
}

@@ -5405,7 +5408,7 @@ static int handle_vmread(struct kvm_vcpu *vcpu)
return nested_vmx_fail(vcpu, VMXERR_UNSUPPORTED_VMCS_COMPONENT);

/* Read the field, zero-extended to a u64 value */
- value = evmcs_read_any(vmx->nested.hv_evmcs, field, offset);
+ value = evmcs_read_any(nested_vmx_evmcs(vmx), field, offset);
}

/*
--
2.41.0

2023-10-25 15:27:32

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 02/14] KVM: x86: hyper-v: Move Hyper-V partition assist page out of Hyper-V emulation context

Hyper-V partition assist page is used when KVM runs on top of Hyper-V and
is not used for Windows/Hyper-V guests on KVM, this means that 'hv_pa_pg'
placement in 'struct kvm_hv' is unfortunate. As a preparation to making
Hyper-V emulation optional, move 'hv_pa_pg' to 'struct kvm_arch' and put it
under CONFIG_HYPERV.

While on it, introduce hv_get_partition_assist_page() helper to allocate
partition assist page. Move the comment explaining why we use a single page
for all vCPUs from VMX and expand it a bit.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/kvm_onhyperv.h | 20 ++++++++++++++++++++
arch/x86/kvm/svm/svm_onhyperv.c | 10 +++-------
arch/x86/kvm/vmx/vmx.c | 14 +++-----------
arch/x86/kvm/x86.c | 4 +++-
5 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d107516b4591..7fb2810f4573 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1125,7 +1125,6 @@ struct kvm_hv {
*/
unsigned int synic_auto_eoi_used;

- struct hv_partition_assist_pg *hv_pa_pg;
struct kvm_hv_syndbg hv_syndbg;
};

@@ -1447,6 +1446,7 @@ struct kvm_arch {
#if IS_ENABLED(CONFIG_HYPERV)
hpa_t hv_root_tdp;
spinlock_t hv_root_tdp_lock;
+ struct hv_partition_assist_pg *hv_pa_pg;
#endif
/*
* VM-scope maximum vCPU ID. Used to determine the size of structures
diff --git a/arch/x86/kvm/kvm_onhyperv.h b/arch/x86/kvm/kvm_onhyperv.h
index f9ca3e7432b2..eefab3dc8498 100644
--- a/arch/x86/kvm/kvm_onhyperv.h
+++ b/arch/x86/kvm/kvm_onhyperv.h
@@ -10,6 +10,26 @@
int hv_flush_remote_tlbs_range(struct kvm *kvm, gfn_t gfn, gfn_t nr_pages);
int hv_flush_remote_tlbs(struct kvm *kvm);
void hv_track_root_tdp(struct kvm_vcpu *vcpu, hpa_t root_tdp);
+static inline hpa_t hv_get_partition_assist_page(struct kvm_vcpu *vcpu)
+{
+ /*
+ * Partition assist page is something which Hyper-V running in L0
+ * requires from KVM running in L1 before direct TLB flush for L2
+ * guests can be enabled. KVM doesn't currently use the page but to
+ * comply with TLFS it still needs to be allocated. For now, this
+ * is a single page shared among all vCPUs.
+ */
+ struct hv_partition_assist_pg **p_hv_pa_pg =
+ &vcpu->kvm->arch.hv_pa_pg;
+
+ if (!*p_hv_pa_pg)
+ *p_hv_pa_pg = kzalloc(PAGE_SIZE, GFP_KERNEL_ACCOUNT);
+
+ if (!*p_hv_pa_pg)
+ return INVALID_PAGE;
+
+ return __pa(*p_hv_pa_pg);
+}
#else /* !CONFIG_HYPERV */
static inline int hv_flush_remote_tlbs(struct kvm *kvm)
{
diff --git a/arch/x86/kvm/svm/svm_onhyperv.c b/arch/x86/kvm/svm/svm_onhyperv.c
index 7af8422d3382..3971b3ea5d04 100644
--- a/arch/x86/kvm/svm/svm_onhyperv.c
+++ b/arch/x86/kvm/svm/svm_onhyperv.c
@@ -18,18 +18,14 @@
int svm_hv_enable_l2_tlb_flush(struct kvm_vcpu *vcpu)
{
struct hv_vmcb_enlightenments *hve;
- struct hv_partition_assist_pg **p_hv_pa_pg =
- &to_kvm_hv(vcpu->kvm)->hv_pa_pg;
+ hpa_t partition_assist_page = hv_get_partition_assist_page(vcpu);

- if (!*p_hv_pa_pg)
- *p_hv_pa_pg = kzalloc(PAGE_SIZE, GFP_KERNEL);
-
- if (!*p_hv_pa_pg)
+ if (partition_assist_page == INVALID_PAGE)
return -ENOMEM;

hve = &to_svm(vcpu)->vmcb->control.hv_enlightenments;

- hve->partition_assist_page = __pa(*p_hv_pa_pg);
+ hve->partition_assist_page = partition_assist_page;
hve->hv_vm_id = (unsigned long)vcpu->kvm;
if (!hve->hv_enlightenments_control.nested_flush_hypercall) {
hve->hv_enlightenments_control.nested_flush_hypercall = 1;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index be20a60047b1..cb4591405f14 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -523,22 +523,14 @@ module_param(enlightened_vmcs, bool, 0444);
static int hv_enable_l2_tlb_flush(struct kvm_vcpu *vcpu)
{
struct hv_enlightened_vmcs *evmcs;
- struct hv_partition_assist_pg **p_hv_pa_pg =
- &to_kvm_hv(vcpu->kvm)->hv_pa_pg;
- /*
- * Synthetic VM-Exit is not enabled in current code and so All
- * evmcs in singe VM shares same assist page.
- */
- if (!*p_hv_pa_pg)
- *p_hv_pa_pg = kzalloc(PAGE_SIZE, GFP_KERNEL_ACCOUNT);
+ hpa_t partition_assist_page = hv_get_partition_assist_page(vcpu);

- if (!*p_hv_pa_pg)
+ if (partition_assist_page == INVALID_PAGE)
return -ENOMEM;

evmcs = (struct hv_enlightened_vmcs *)to_vmx(vcpu)->loaded_vmcs->vmcs;

- evmcs->partition_assist_page =
- __pa(*p_hv_pa_pg);
+ evmcs->partition_assist_page = partition_assist_page;
evmcs->hv_vm_id = (unsigned long)vcpu->kvm;
evmcs->hv_enlightenments_control.nested_flush_hypercall = 1;

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d632931fa545..cc2524598368 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -12425,7 +12425,9 @@ void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu)

void kvm_arch_free_vm(struct kvm *kvm)
{
- kfree(to_kvm_hv(kvm)->hv_pa_pg);
+#if IS_ENABLED(CONFIG_HYPERV)
+ kfree(kvm->arch.hv_pa_pg);
+#endif
__kvm_arch_free_vm(kvm);
}

--
2.41.0

2023-10-25 15:27:57

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 08/14] KVM: selftests: Make all Hyper-V tests explicitly dependent on Hyper-V emulation support in KVM

In preparation for conditional Hyper-V emulation enablement in KVM, make
Hyper-V specific tests check skip gracefully instead of failing when the
support is not there.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
tools/testing/selftests/kvm/x86_64/hyperv_clock.c | 2 ++
tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c | 5 +++--
.../selftests/kvm/x86_64/hyperv_extended_hypercalls.c | 2 ++
tools/testing/selftests/kvm/x86_64/hyperv_features.c | 2 ++
tools/testing/selftests/kvm/x86_64/hyperv_ipi.c | 2 ++
tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c | 1 +
tools/testing/selftests/kvm/x86_64/hyperv_tlb_flush.c | 2 ++
7 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_clock.c b/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
index f25749eaa6a8..f5e1e98f04f9 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_clock.c
@@ -211,6 +211,8 @@ int main(void)
vm_vaddr_t tsc_page_gva;
int stage;

+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_TIME));
+
vm = vm_create_with_one_vcpu(&vcpu, guest_main);

vcpu_set_hv_cpuid(vcpu);
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c b/tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c
index 7bde0c4dfdbd..4c7257ecd2a6 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_evmcs.c
@@ -240,11 +240,12 @@ int main(int argc, char *argv[])
struct ucall uc;
int stage;

- vm = vm_create_with_one_vcpu(&vcpu, guest_code);
-
TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_VMX));
TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_ENLIGHTENED_VMCS));
+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_DIRECT_TLBFLUSH));
+
+ vm = vm_create_with_one_vcpu(&vcpu, guest_code);

hcall_page = vm_vaddr_alloc_pages(vm, 1);
memset(addr_gva2hva(vm, hcall_page), 0x0, getpagesize());
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_extended_hypercalls.c b/tools/testing/selftests/kvm/x86_64/hyperv_extended_hypercalls.c
index e036db1f32b9..949e08e98f31 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_extended_hypercalls.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_extended_hypercalls.c
@@ -43,6 +43,8 @@ int main(void)
uint64_t *outval;
struct ucall uc;

+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_CPUID));
+
/* Verify if extended hypercalls are supported */
if (!kvm_cpuid_has(kvm_get_supported_hv_cpuid(),
HV_ENABLE_EXTENDED_HYPERCALLS)) {
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_features.c b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
index 9f28aa276c4e..387c605a3077 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_features.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_features.c
@@ -690,6 +690,8 @@ static void guest_test_hcalls_access(void)

int main(void)
{
+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_ENFORCE_CPUID));
+
pr_info("Testing access to Hyper-V specific MSRs\n");
guest_test_msrs_access();

diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_ipi.c b/tools/testing/selftests/kvm/x86_64/hyperv_ipi.c
index 6feb5ddb031d..65e5f4c05068 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_ipi.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_ipi.c
@@ -248,6 +248,8 @@ int main(int argc, char *argv[])
int stage = 1, r;
struct ucall uc;

+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_SEND_IPI));
+
vm = vm_create_with_one_vcpu(&vcpu[0], sender_guest_code);

/* Hypercall input/output */
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c b/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c
index 6c1278562090..c9b18707edc0 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_svm_test.c
@@ -158,6 +158,7 @@ int main(int argc, char *argv[])
int stage;

TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_SVM));
+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_DIRECT_TLBFLUSH));

/* Create VM */
vm = vm_create_with_one_vcpu(&vcpu, guest_code);
diff --git a/tools/testing/selftests/kvm/x86_64/hyperv_tlb_flush.c b/tools/testing/selftests/kvm/x86_64/hyperv_tlb_flush.c
index 4758b6ef5618..c4443f71f8dd 100644
--- a/tools/testing/selftests/kvm/x86_64/hyperv_tlb_flush.c
+++ b/tools/testing/selftests/kvm/x86_64/hyperv_tlb_flush.c
@@ -590,6 +590,8 @@ int main(int argc, char *argv[])
struct ucall uc;
int stage = 1, r, i;

+ TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_TLBFLUSH));
+
vm = vm_create_with_one_vcpu(&vcpu[0], sender_guest_code);

/* Test data page */
--
2.41.0

2023-10-25 15:28:17

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 04/14] KVM: x86: hyper-v: Introduce kvm_hv_synic_auto_eoi_set()

As a preparation to making Hyper-V emulation optional, create a dedicated
kvm_hv_synic_auto_eoi_set() helper to avoid extra ifdefs in lapic.c

No functional change intended.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.h | 5 +++++
arch/x86/kvm/lapic.c | 2 +-
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index f83b8db72b11..1897a219981d 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -105,6 +105,11 @@ int kvm_hv_synic_set_irq(struct kvm *kvm, u32 vcpu_id, u32 sint);
void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector);
int kvm_hv_activate_synic(struct kvm_vcpu *vcpu, bool dont_zero_synic_pages);

+static inline bool kvm_hv_synic_auto_eoi_set(struct kvm_vcpu *vcpu, int vector)
+{
+ return to_hv_vcpu(vcpu) && test_bit(vector, to_hv_synic(vcpu)->auto_eoi_bitmap);
+}
+
void kvm_hv_vcpu_uninit(struct kvm_vcpu *vcpu);

bool kvm_hv_assist_page_enabled(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 245b20973cae..f7abc1008cad 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2905,7 +2905,7 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
*/

apic_clear_irr(vector, apic);
- if (to_hv_vcpu(vcpu) && test_bit(vector, to_hv_synic(vcpu)->auto_eoi_bitmap)) {
+ if (kvm_hv_synic_auto_eoi_set(vcpu, vector)) {
/*
* For auto-EOI interrupts, there might be another pending
* interrupt above PPR, so check whether to raise another
--
2.41.0

2023-10-25 15:28:44

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 10/14] KVM: x86: Make Hyper-V emulation optional

Hyper-V emulation in KVM is a fairly big chunk and in some cases it may be
desirable to not compile it in to reduce module sizes as well as the attack
surface. Introduce CONFIG_KVM_HYPERV option to make it possible.

Note, there's room for further nVMX/nSVM code optimizations when
!CONFIG_KVM_HYPERV, this will be done in follow-up patches.

Reorganize Makefile a bit so all CONFIG_HYPERV and CONFIG_KVM_HYPERV files
are grouped together.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 4 +++
arch/x86/kvm/Kconfig | 14 ++++++++
arch/x86/kvm/Makefile | 23 ++++++------
arch/x86/kvm/cpuid.c | 6 ++++
arch/x86/kvm/hyperv.h | 30 ++++++++++++++--
arch/x86/kvm/irq_comm.c | 9 ++++-
arch/x86/kvm/svm/hyperv.h | 7 ++++
arch/x86/kvm/vmx/hyperv.h | 8 +++++
arch/x86/kvm/vmx/nested.c | 15 ++++++++
arch/x86/kvm/x86.c | 62 ++++++++++++++++++++++++---------
10 files changed, 147 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7fb2810f4573..e5b881dda747 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1095,6 +1095,7 @@ enum hv_tsc_page_status {
HV_TSC_PAGE_BROKEN,
};

+#ifdef CONFIG_KVM_HYPERV
/* Hyper-V emulation context */
struct kvm_hv {
struct mutex hv_lock;
@@ -1127,6 +1128,7 @@ struct kvm_hv {

struct kvm_hv_syndbg hv_syndbg;
};
+#endif

struct msr_bitmap_range {
u32 flags;
@@ -1349,7 +1351,9 @@ struct kvm_arch {
/* reads protected by irq_srcu, writes by irq_lock */
struct hlist_head mask_notifier_list;

+#ifdef CONFIG_KVM_HYPERV
struct kvm_hv hyperv;
+#endif

#ifdef CONFIG_KVM_XEN
struct kvm_xen xen;
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index 950c12868d30..93930cef9b3b 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -129,6 +129,20 @@ config KVM_SMM

If unsure, say Y.

+config KVM_HYPERV
+ bool "Support for Microsoft Hyper-V emulation"
+ depends on KVM
+ default y
+ help
+ Provides KVM support for emulating Microsoft Hyper-V. This allows KVM
+ to expose a subset of the paravirtualized interfaces defined in the
+ Hyper-V Hypervisor Top-Level Functional Specification (TLFS):
+ https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
+ These interfaces are required for the correct and performant functioning
+ of Windows and Hyper-V guests on KVM.
+
+ If unsure, say "Y".
+
config KVM_XEN
bool "Support for Xen hypercall interface"
depends on KVM
diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
index 8ea872401cd6..b97b875ad75f 100644
--- a/arch/x86/kvm/Makefile
+++ b/arch/x86/kvm/Makefile
@@ -11,32 +11,33 @@ include $(srctree)/virt/kvm/Makefile.kvm

kvm-y += x86.o emulate.o i8259.o irq.o lapic.o \
i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
- hyperv.o debugfs.o mmu/mmu.o mmu/page_track.o \
+ debugfs.o mmu/mmu.o mmu/page_track.o \
mmu/spte.o

-ifdef CONFIG_HYPERV
-kvm-y += kvm_onhyperv.o
-endif
-
kvm-$(CONFIG_X86_64) += mmu/tdp_iter.o mmu/tdp_mmu.o
+kvm-$(CONFIG_KVM_HYPERV) += hyperv.o
kvm-$(CONFIG_KVM_XEN) += xen.o
kvm-$(CONFIG_KVM_SMM) += smm.o

kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
- vmx/hyperv.o vmx/hyperv_evmcs.o vmx/nested.o vmx/posted_intr.o
-kvm-intel-$(CONFIG_X86_SGX_KVM) += vmx/sgx.o
+ vmx/nested.o vmx/posted_intr.o

-ifdef CONFIG_HYPERV
-kvm-intel-y += vmx/vmx_onhyperv.o
-endif
+kvm-intel-$(CONFIG_X86_SGX_KVM) += vmx/sgx.o

kvm-amd-y += svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o \
- svm/sev.o svm/hyperv.o
+ svm/sev.o

ifdef CONFIG_HYPERV
+kvm-y += kvm_onhyperv.o
+kvm-intel-y += vmx/vmx_onhyperv.o vmx/hyperv_evmcs.o
kvm-amd-y += svm/svm_onhyperv.o
endif

+ifdef CONFIG_KVM_HYPERV
+kvm-intel-y += vmx/hyperv.o vmx/hyperv_evmcs.o
+kvm-amd-y += svm/hyperv.o
+endif
+
obj-$(CONFIG_KVM) += kvm.o
obj-$(CONFIG_KVM_INTEL) += kvm-intel.o
obj-$(CONFIG_KVM_AMD) += kvm-amd.o
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 464b23ac5f93..da8e0873f63a 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -314,11 +314,15 @@ EXPORT_SYMBOL_GPL(kvm_update_cpuid_runtime);

static bool kvm_cpuid_has_hyperv(struct kvm_cpuid_entry2 *entries, int nent)
{
+#ifdef CONFIG_KVM_HYPERV
struct kvm_cpuid_entry2 *entry;

entry = cpuid_entry2_find(entries, nent, HYPERV_CPUID_INTERFACE,
KVM_CPUID_INDEX_NOT_SIGNIFICANT);
return entry && entry->eax == HYPERV_CPUID_SIGNATURE_EAX;
+#else
+ return false;
+#endif
}

static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
@@ -433,11 +437,13 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
return 0;
}

+#ifdef CONFIG_KVM_HYPERV
if (kvm_cpuid_has_hyperv(e2, nent)) {
r = kvm_hv_vcpu_init(vcpu);
if (r)
return r;
}
+#endif

r = kvm_check_cpuid(vcpu, e2, nent);
if (r)
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index 75dcbe598fbc..5c5ec7015136 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -24,6 +24,8 @@
#include <linux/kvm_host.h>
#include "x86.h"

+#ifdef CONFIG_KVM_HYPERV
+
/* "Hv#1" signature */
#define HYPERV_CPUID_SIGNATURE_EAX 0x31237648

@@ -259,5 +261,29 @@ static inline void kvm_hv_nested_transtion_tlb_flush(struct kvm_vcpu *vcpu, bool
}

int kvm_hv_vcpu_flush_tlb(struct kvm_vcpu *vcpu);
-
-#endif
+#else /* CONFIG_KVM_HYPERV */
+static inline void kvm_hv_setup_tsc_page(struct kvm *kvm,
+ struct pvclock_vcpu_time_info *hv_clock) {}
+static inline void kvm_hv_request_tsc_page_update(struct kvm *kvm) {}
+static inline void kvm_hv_init_vm(struct kvm *kvm) {}
+static inline void kvm_hv_destroy_vm(struct kvm *kvm) {}
+static inline int kvm_hv_vcpu_init(struct kvm_vcpu *vcpu) { return 0; }
+static inline void kvm_hv_vcpu_uninit(struct kvm_vcpu *vcpu) {}
+static inline bool kvm_hv_hypercall_enabled(struct kvm_vcpu *vcpu) { return false; }
+static inline int kvm_hv_hypercall(struct kvm_vcpu *vcpu) { return HV_STATUS_ACCESS_DENIED; }
+static inline void kvm_hv_vcpu_purge_flush_tlb(struct kvm_vcpu *vcpu) {}
+static inline void kvm_hv_free_pa_page(struct kvm *kvm) {}
+static inline bool kvm_hv_synic_has_vector(struct kvm_vcpu *vcpu, int vector) { return false; }
+static inline bool kvm_hv_synic_auto_eoi_set(struct kvm_vcpu *vcpu, int vector) { return false; }
+static inline void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector) {}
+static inline bool kvm_hv_invtsc_suppressed(struct kvm_vcpu *vcpu) { return false; }
+static inline void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu, bool hyperv_enabled) {}
+static inline bool kvm_hv_has_stimer_pending(struct kvm_vcpu *vcpu) { return false; }
+static inline bool kvm_hv_is_tlb_flush_hcall(struct kvm_vcpu *vcpu) { return false; }
+static inline bool guest_hv_cpuid_has_l2_tlb_flush(struct kvm_vcpu *vcpu) { return false; }
+static inline int kvm_hv_verify_vp_assist(struct kvm_vcpu *vcpu) { return 0; }
+static inline u32 kvm_hv_get_vpindex(struct kvm_vcpu *vcpu) { return vcpu->vcpu_idx; }
+static inline void kvm_hv_nested_transtion_tlb_flush(struct kvm_vcpu *vcpu, bool tdp_enabled) {}
+#endif /* CONFIG_KVM_HYPERV */
+
+#endif /* __ARCH_X86_KVM_HYPERV_H__ */
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 16d076a1b91a..68f3f6c26046 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -144,7 +144,7 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
return kvm_irq_delivery_to_apic(kvm, NULL, &irq, NULL);
}

-
+#ifdef CONFIG_KVM_HYPERV
static int kvm_hv_set_sint(struct kvm_kernel_irq_routing_entry *e,
struct kvm *kvm, int irq_source_id, int level,
bool line_status)
@@ -154,6 +154,7 @@ static int kvm_hv_set_sint(struct kvm_kernel_irq_routing_entry *e,

return kvm_hv_synic_set_irq(kvm, e->hv_sint.vcpu, e->hv_sint.sint);
}
+#endif

int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e,
struct kvm *kvm, int irq_source_id, int level,
@@ -163,9 +164,11 @@ int kvm_arch_set_irq_inatomic(struct kvm_kernel_irq_routing_entry *e,
int r;

switch (e->type) {
+#ifdef CONFIG_KVM_HYPERV
case KVM_IRQ_ROUTING_HV_SINT:
return kvm_hv_set_sint(e, kvm, irq_source_id, level,
line_status);
+#endif

case KVM_IRQ_ROUTING_MSI:
if (kvm_msi_route_invalid(kvm, e))
@@ -314,11 +317,13 @@ int kvm_set_routing_entry(struct kvm *kvm,
if (kvm_msi_route_invalid(kvm, e))
return -EINVAL;
break;
+#ifdef CONFIG_KVM_HYPERV
case KVM_IRQ_ROUTING_HV_SINT:
e->set = kvm_hv_set_sint;
e->hv_sint.vcpu = ue->u.hv_sint.vcpu;
e->hv_sint.sint = ue->u.hv_sint.sint;
break;
+#endif
#ifdef CONFIG_KVM_XEN
case KVM_IRQ_ROUTING_XEN_EVTCHN:
return kvm_xen_setup_evtchn(kvm, e, ue);
@@ -438,5 +443,7 @@ void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,

void kvm_arch_irq_routing_update(struct kvm *kvm)
{
+#ifdef CONFIG_KVM_HYPERV
kvm_hv_irq_routing_update(kvm);
+#endif
}
diff --git a/arch/x86/kvm/svm/hyperv.h b/arch/x86/kvm/svm/hyperv.h
index 02f4784b5d44..14eec2d9b6be 100644
--- a/arch/x86/kvm/svm/hyperv.h
+++ b/arch/x86/kvm/svm/hyperv.h
@@ -11,6 +11,7 @@
#include "../hyperv.h"
#include "svm.h"

+#ifdef CONFIG_KVM_HYPERV
static inline void nested_svm_hv_update_vm_vp_ids(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
@@ -41,5 +42,11 @@ static inline bool nested_svm_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu)
}

void svm_hv_inject_synthetic_vmexit_post_tlb_flush(struct kvm_vcpu *vcpu);
+#else /* CONFIG_KVM_HYPERV */
+static inline void nested_svm_hv_update_vm_vp_ids(struct kvm_vcpu *vcpu) {}
+static inline bool nested_svm_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu) { return false; }
+static inline void svm_hv_inject_synthetic_vmexit_post_tlb_flush(struct kvm_vcpu *vcpu) {}
+#endif /* CONFIG_KVM_HYPERV */
+

#endif /* __ARCH_X86_KVM_SVM_HYPERV_H__ */
diff --git a/arch/x86/kvm/vmx/hyperv.h b/arch/x86/kvm/vmx/hyperv.h
index d4ed99008518..933ef6cad5e6 100644
--- a/arch/x86/kvm/vmx/hyperv.h
+++ b/arch/x86/kvm/vmx/hyperv.h
@@ -20,6 +20,7 @@ enum nested_evmptrld_status {
EVMPTRLD_ERROR,
};

+#ifdef CONFIG_KVM_HYPERV
u64 nested_get_evmptr(struct kvm_vcpu *vcpu);
uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu);
int nested_enable_evmcs(struct kvm_vcpu *vcpu,
@@ -28,5 +29,12 @@ void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *
int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
bool nested_evmcs_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu);
void vmx_hv_inject_synthetic_vmexit_post_tlb_flush(struct kvm_vcpu *vcpu);
+#else
+static inline u64 nested_get_evmptr(struct kvm_vcpu *vcpu) { return EVMPTR_INVALID; }
+static inline void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata) {}
+static inline bool nested_evmcs_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu) { return false; }
+static inline int nested_evmcs_check_controls(struct vmcs12 *vmcs12) { return 0; }
+#endif
+

#endif /* __KVM_X86_VMX_HYPERV_H */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 382c0746d069..d0d735974b2c 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -226,6 +226,7 @@ static void vmx_disable_shadow_vmcs(struct vcpu_vmx *vmx)

static inline void nested_release_evmcs(struct kvm_vcpu *vcpu)
{
+#ifdef CONFIG_KVM_HYPERV
struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
struct vcpu_vmx *vmx = to_vmx(vcpu);

@@ -241,6 +242,7 @@ static inline void nested_release_evmcs(struct kvm_vcpu *vcpu)
hv_vcpu->nested.vm_id = 0;
hv_vcpu->nested.vp_id = 0;
}
+#endif
}

static void vmx_sync_vmcs_host_state(struct vcpu_vmx *vmx,
@@ -1570,6 +1572,7 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
vmcs_load(vmx->loaded_vmcs->vmcs);
}

+#ifdef CONFIG_KVM_HYPERV
static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields)
{
struct vmcs12 *vmcs12 = vmx->nested.cached_vmcs12;
@@ -2077,6 +2080,10 @@ static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(

return EVMPTRLD_SUCCEEDED;
}
+#else /* CONFIG_KVM_HYPERV */
+static inline void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields) {}
+static inline void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx) {}
+#endif /* CONFIG_KVM_HYPERV */

void nested_sync_vmcs12_to_shadow(struct kvm_vcpu *vcpu)
{
@@ -3155,6 +3162,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
return 0;
}

+#ifdef CONFIG_KVM_HYPERV
static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -3182,6 +3190,9 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)

return true;
}
+#else
+static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu) { return true; }
+#endif

static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
{
@@ -3552,11 +3563,13 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
if (!nested_vmx_check_permission(vcpu))
return 1;

+#ifdef CONFIG_KVM_HYPERV
evmptrld_status = nested_vmx_handle_enlightened_vmptrld(vcpu, launch);
if (evmptrld_status == EVMPTRLD_ERROR) {
kvm_queue_exception(vcpu, UD_VECTOR);
return 1;
}
+#endif

kvm_pmu_trigger_event(vcpu, PERF_COUNT_HW_BRANCH_INSTRUCTIONS);

@@ -7090,7 +7103,9 @@ struct kvm_x86_nested_ops vmx_nested_ops = {
.set_state = vmx_set_nested_state,
.get_nested_state_pages = vmx_get_nested_state_pages,
.write_log_dirty = nested_vmx_write_pml_buffer,
+#ifdef CONFIG_KVM_HYPERV
.enable_evmcs = nested_enable_evmcs,
.get_evmcs_version = nested_get_evmcs_version,
.hv_inject_synthetic_vmexit_post_tlb_flush = vmx_hv_inject_synthetic_vmexit_post_tlb_flush,
+#endif
};
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cc2524598368..8ef9898092cd 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1504,6 +1504,8 @@ static unsigned num_msrs_to_save;
static const u32 emulated_msrs_all[] = {
MSR_KVM_SYSTEM_TIME, MSR_KVM_WALL_CLOCK,
MSR_KVM_SYSTEM_TIME_NEW, MSR_KVM_WALL_CLOCK_NEW,
+
+#ifdef CONFIG_KVM_HYPERV
HV_X64_MSR_GUEST_OS_ID, HV_X64_MSR_HYPERCALL,
HV_X64_MSR_TIME_REF_COUNT, HV_X64_MSR_REFERENCE_TSC,
HV_X64_MSR_TSC_FREQUENCY, HV_X64_MSR_APIC_FREQUENCY,
@@ -1521,6 +1523,7 @@ static const u32 emulated_msrs_all[] = {
HV_X64_MSR_SYNDBG_CONTROL, HV_X64_MSR_SYNDBG_STATUS,
HV_X64_MSR_SYNDBG_SEND_BUFFER, HV_X64_MSR_SYNDBG_RECV_BUFFER,
HV_X64_MSR_SYNDBG_PENDING_BUFFER,
+#endif

MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
MSR_KVM_PV_EOI_EN, MSR_KVM_ASYNC_PF_INT, MSR_KVM_ASYNC_PF_ACK,
@@ -4022,6 +4025,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
* the need to ignore the workaround.
*/
break;
+#ifdef CONFIG_KVM_HYPERV
case HV_X64_MSR_GUEST_OS_ID ... HV_X64_MSR_SINT15:
case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
case HV_X64_MSR_SYNDBG_OPTIONS:
@@ -4034,6 +4038,7 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case HV_X64_MSR_TSC_INVARIANT_CONTROL:
return kvm_hv_set_msr_common(vcpu, msr, data,
msr_info->host_initiated);
+#endif
case MSR_IA32_BBL_CR_CTL3:
/* Drop writes to this legacy MSR -- see rdmsr
* counterpart for further detail.
@@ -4378,6 +4383,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
*/
msr_info->data = 0x20000000;
break;
+#ifdef CONFIG_KVM_HYPERV
case HV_X64_MSR_GUEST_OS_ID ... HV_X64_MSR_SINT15:
case HV_X64_MSR_SYNDBG_CONTROL ... HV_X64_MSR_SYNDBG_PENDING_BUFFER:
case HV_X64_MSR_SYNDBG_OPTIONS:
@@ -4391,6 +4397,7 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return kvm_hv_get_msr_common(vcpu,
msr_info->index, &msr_info->data,
msr_info->host_initiated);
+#endif
case MSR_IA32_BBL_CR_CTL3:
/* This legacy MSR exists but isn't fully documented in current
* silicon. It is however accessed by winxp in very narrow
@@ -4528,6 +4535,7 @@ static inline bool kvm_can_mwait_in_guest(void)
boot_cpu_has(X86_FEATURE_ARAT);
}

+#ifdef CONFIG_KVM_HYPERV
static int kvm_ioctl_get_supported_hv_cpuid(struct kvm_vcpu *vcpu,
struct kvm_cpuid2 __user *cpuid_arg)
{
@@ -4548,6 +4556,7 @@ static int kvm_ioctl_get_supported_hv_cpuid(struct kvm_vcpu *vcpu,

return 0;
}
+#endif

int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
{
@@ -4574,9 +4583,11 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_PIT_STATE2:
case KVM_CAP_SET_IDENTITY_MAP_ADDR:
case KVM_CAP_VCPU_EVENTS:
+#ifdef CONFIG_KVM_HYPERV
case KVM_CAP_HYPERV:
case KVM_CAP_HYPERV_VAPIC:
case KVM_CAP_HYPERV_SPIN:
+ case KVM_CAP_HYPERV_TIME:
case KVM_CAP_HYPERV_SYNIC:
case KVM_CAP_HYPERV_SYNIC2:
case KVM_CAP_HYPERV_VP_INDEX:
@@ -4586,6 +4597,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_HYPERV_CPUID:
case KVM_CAP_HYPERV_ENFORCE_CPUID:
case KVM_CAP_SYS_HYPERV_CPUID:
+#endif
case KVM_CAP_PCI_SEGMENT:
case KVM_CAP_DEBUGREGS:
case KVM_CAP_X86_ROBUST_SINGLESTEP:
@@ -4595,7 +4607,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_GET_TSC_KHZ:
case KVM_CAP_KVMCLOCK_CTRL:
case KVM_CAP_READONLY_MEM:
- case KVM_CAP_HYPERV_TIME:
case KVM_CAP_IOAPIC_POLARITY_IGNORED:
case KVM_CAP_TSC_DEADLINE_TIMER:
case KVM_CAP_DISABLE_QUIRKS:
@@ -4705,12 +4716,14 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = kvm_x86_ops.nested_ops->get_state ?
kvm_x86_ops.nested_ops->get_state(NULL, NULL, 0) : 0;
break;
+#ifdef CONFIG_KVM_HYPERV
case KVM_CAP_HYPERV_DIRECT_TLBFLUSH:
r = kvm_x86_ops.enable_l2_tlb_flush != NULL;
break;
case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
r = kvm_x86_ops.nested_ops->enable_evmcs != NULL;
break;
+#endif
case KVM_CAP_SMALLER_MAXPHYADDR:
r = (int) allow_smaller_maxphyaddr;
break;
@@ -4872,9 +4885,11 @@ long kvm_arch_dev_ioctl(struct file *filp,
case KVM_GET_MSRS:
r = msr_io(NULL, argp, do_get_msr_feature, 1);
break;
+#ifdef CONFIG_KVM_HYPERV
case KVM_GET_SUPPORTED_HV_CPUID:
r = kvm_ioctl_get_supported_hv_cpuid(NULL, argp);
break;
+#endif
case KVM_GET_DEVICE_ATTR: {
struct kvm_device_attr attr;
r = -EFAULT;
@@ -5700,14 +5715,11 @@ static int kvm_vcpu_ioctl_device_attr(struct kvm_vcpu *vcpu,
static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
struct kvm_enable_cap *cap)
{
- int r;
- uint16_t vmcs_version;
- void __user *user_ptr;
-
if (cap->flags)
return -EINVAL;

switch (cap->cap) {
+#ifdef CONFIG_KVM_HYPERV
case KVM_CAP_HYPERV_SYNIC2:
if (cap->args[0])
return -EINVAL;
@@ -5719,16 +5731,22 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
return kvm_hv_activate_synic(vcpu, cap->cap ==
KVM_CAP_HYPERV_SYNIC2);
case KVM_CAP_HYPERV_ENLIGHTENED_VMCS:
- if (!kvm_x86_ops.nested_ops->enable_evmcs)
- return -ENOTTY;
- r = kvm_x86_ops.nested_ops->enable_evmcs(vcpu, &vmcs_version);
- if (!r) {
- user_ptr = (void __user *)(uintptr_t)cap->args[0];
- if (copy_to_user(user_ptr, &vmcs_version,
- sizeof(vmcs_version)))
- r = -EFAULT;
+ {
+ int r;
+ uint16_t vmcs_version;
+ void __user *user_ptr;
+
+ if (!kvm_x86_ops.nested_ops->enable_evmcs)
+ return -ENOTTY;
+ r = kvm_x86_ops.nested_ops->enable_evmcs(vcpu, &vmcs_version);
+ if (!r) {
+ user_ptr = (void __user *)(uintptr_t)cap->args[0];
+ if (copy_to_user(user_ptr, &vmcs_version,
+ sizeof(vmcs_version)))
+ r = -EFAULT;
+ }
+ return r;
}
- return r;
case KVM_CAP_HYPERV_DIRECT_TLBFLUSH:
if (!kvm_x86_ops.enable_l2_tlb_flush)
return -ENOTTY;
@@ -5737,6 +5755,7 @@ static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,

case KVM_CAP_HYPERV_ENFORCE_CPUID:
return kvm_hv_set_enforce_cpuid(vcpu, cap->args[0]);
+#endif

case KVM_CAP_ENFORCE_PV_FEATURE_CPUID:
vcpu->arch.pv_cpuid.enforce = cap->args[0];
@@ -6129,9 +6148,11 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
srcu_read_unlock(&vcpu->kvm->srcu, idx);
break;
}
+#ifdef CONFIG_KVM_HYPERV
case KVM_GET_SUPPORTED_HV_CPUID:
r = kvm_ioctl_get_supported_hv_cpuid(vcpu, argp);
break;
+#endif
#ifdef CONFIG_KVM_XEN
case KVM_XEN_VCPU_GET_ATTR: {
struct kvm_xen_vcpu_attr xva;
@@ -7189,6 +7210,7 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
r = static_call(kvm_x86_mem_enc_unregister_region)(kvm, &region);
break;
}
+#ifdef CONFIG_KVM_HYPERV
case KVM_HYPERV_EVENTFD: {
struct kvm_hyperv_eventfd hvevfd;

@@ -7198,6 +7220,7 @@ int kvm_arch_vm_ioctl(struct file *filp, unsigned int ioctl, unsigned long arg)
r = kvm_vm_ioctl_hv_eventfd(kvm, &hvevfd);
break;
}
+#endif
case KVM_SET_PMU_EVENT_FILTER:
r = kvm_vm_ioctl_set_pmu_event_filter(kvm, argp);
break;
@@ -10576,19 +10599,20 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)

static void vcpu_load_eoi_exitmap(struct kvm_vcpu *vcpu)
{
- u64 eoi_exit_bitmap[4];
-
if (!kvm_apic_hw_enabled(vcpu->arch.apic))
return;

+#ifdef CONFIG_KVM_HYPERV
if (to_hv_vcpu(vcpu)) {
+ u64 eoi_exit_bitmap[4];
+
bitmap_or((ulong *)eoi_exit_bitmap,
vcpu->arch.ioapic_handled_vectors,
to_hv_synic(vcpu)->vec_bitmap, 256);
static_call_cond(kvm_x86_load_eoi_exitmap)(vcpu, eoi_exit_bitmap);
return;
}
-
+#endif
static_call_cond(kvm_x86_load_eoi_exitmap)(
vcpu, (u64 *)vcpu->arch.ioapic_handled_vectors);
}
@@ -10679,9 +10703,11 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
* the flushes are considered "remote" and not "local" because
* the requests can be initiated from other vCPUs.
*/
+#ifdef CONFIG_KVM_HYPERV
if (kvm_check_request(KVM_REQ_HV_TLB_FLUSH, vcpu) &&
kvm_hv_vcpu_flush_tlb(vcpu))
kvm_vcpu_flush_tlb_guest(vcpu);
+#endif

if (kvm_check_request(KVM_REQ_REPORT_TPR_ACCESS, vcpu)) {
vcpu->run->exit_reason = KVM_EXIT_TPR_ACCESS;
@@ -10734,6 +10760,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
vcpu_load_eoi_exitmap(vcpu);
if (kvm_check_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu))
kvm_vcpu_reload_apic_access_page(vcpu);
+#ifdef CONFIG_KVM_HYPERV
if (kvm_check_request(KVM_REQ_HV_CRASH, vcpu)) {
vcpu->run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
vcpu->run->system_event.type = KVM_SYSTEM_EVENT_CRASH;
@@ -10764,6 +10791,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
*/
if (kvm_check_request(KVM_REQ_HV_STIMER, vcpu))
kvm_hv_process_stimers(vcpu);
+#endif
if (kvm_check_request(KVM_REQ_APICV_UPDATE, vcpu))
kvm_vcpu_update_apicv(vcpu);
if (kvm_check_request(KVM_REQ_APF_READY, vcpu))
--
2.41.0

2023-10-25 15:29:12

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 14/14] KVM: nSVM: hyper-v: Hide more stuff under CONFIG_KVM_HYPERV/CONFIG_HYPERV

'struct hv_vmcb_enlightenments' in VMCB only make sense when either
CONFIG_KVM_HYPERV or CONFIG_HYPERV is enabled.

No functional change intended.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/svm/nested.c | 20 ++++++++++++++------
arch/x86/kvm/svm/svm.h | 2 ++
2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 74c04102ef01..20212aac050b 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -187,7 +187,6 @@ void recalc_intercepts(struct vcpu_svm *svm)
*/
static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm)
{
- struct hv_vmcb_enlightenments *hve = &svm->nested.ctl.hv_enlightenments;
int i;

/*
@@ -198,11 +197,16 @@ static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm)
* - Nested hypervisor (L1) is using Hyper-V emulation interface and
* tells KVM (L0) there were no changes in MSR bitmap for L2.
*/
- if (!svm->nested.force_msr_bitmap_recalc &&
- kvm_hv_hypercall_enabled(&svm->vcpu) &&
- hve->hv_enlightenments_control.msr_bitmap &&
- (svm->nested.ctl.clean & BIT(HV_VMCB_NESTED_ENLIGHTENMENTS)))
- goto set_msrpm_base_pa;
+#ifdef CONFIG_KVM_HYPERV
+ if (!svm->nested.force_msr_bitmap_recalc) {
+ struct hv_vmcb_enlightenments *hve = &svm->nested.ctl.hv_enlightenments;
+
+ if (kvm_hv_hypercall_enabled(&svm->vcpu) &&
+ hve->hv_enlightenments_control.msr_bitmap &&
+ (svm->nested.ctl.clean & BIT(HV_VMCB_NESTED_ENLIGHTENMENTS)))
+ goto set_msrpm_base_pa;
+ }
+#endif

if (!(vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_MSR_PROT)))
return true;
@@ -230,7 +234,9 @@ static bool nested_svm_vmrun_msrpm(struct vcpu_svm *svm)

svm->nested.force_msr_bitmap_recalc = false;

+#ifdef CONFIG_KVM_HYPERV
set_msrpm_base_pa:
+#endif
svm->vmcb->control.msrpm_base_pa = __sme_set(__pa(svm->nested.msrpm));

return true;
@@ -378,12 +384,14 @@ void __nested_copy_vmcb_control_to_cache(struct kvm_vcpu *vcpu,
to->msrpm_base_pa &= ~0x0fffULL;
to->iopm_base_pa &= ~0x0fffULL;

+#ifdef CONFIG_KVM_HYPERV
/* Hyper-V extensions (Enlightened VMCB) */
if (kvm_hv_hypercall_enabled(vcpu)) {
to->clean = from->clean;
memcpy(&to->hv_enlightenments, &from->hv_enlightenments,
sizeof(to->hv_enlightenments));
}
+#endif
}

void nested_copy_vmcb_control_to_cache(struct vcpu_svm *svm,
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index be67ab7fdd10..59adff7bbf55 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -148,7 +148,9 @@ struct vmcb_ctrl_area_cached {
u64 virt_ext;
u32 clean;
union {
+#if IS_ENABLED(CONFIG_HYPERV) || IS_ENABLED(CONFIG_KVM_HYPERV)
struct hv_vmcb_enlightenments hv_enlightenments;
+#endif
u8 reserved_sw[32];
};
};
--
2.41.0

2023-11-30 01:24:54

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 11/14] KVM: nVMX: hyper-v: Introduce nested_vmx_evmptr12() and nested_vmx_is_evmptr12_valid() helpers

On Wed, Oct 25, 2023, Vitaly Kuznetsov wrote:
> 'vmx->nested.hv_evmcs_vmptr' accesses are all over the place so hiding
> 'hv_evmcs_vmptr' under 'ifdef CONFIG_KVM_HYPERV' would take a lot of
> ifdefs. Introduce 'nested_vmx_evmptr12()' accessor and
> 'nested_vmx_is_evmptr12_valid()' checker instead. Note, several explicit
>
> nested_vmx_evmptr12(vmx) != EVMPTR_INVALID
>
> comparisons exist for a reson: 'nested_vmx_is_evmptr12_valid()' also checks
> against 'EVMPTR_MAP_PENDING' and in these places this is undesireable. It
> is possible to e.g. introduce 'nested_vmx_is_evmptr12_invalid()' and turn
> these sites into
>
> !nested_vmx_is_evmptr12_invalid(vmx)
>
> eliminating the need for 'nested_vmx_evmptr12()' but this seems to create
> even more confusion.
>
> No functional change intended.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/kvm/vmx/hyperv.h | 10 +++++++++
> arch/x86/kvm/vmx/nested.c | 44 +++++++++++++++++++--------------------
> arch/x86/kvm/vmx/nested.h | 2 +-
> 3 files changed, 33 insertions(+), 23 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/hyperv.h b/arch/x86/kvm/vmx/hyperv.h
> index 933ef6cad5e6..ba1a95ea72b7 100644
> --- a/arch/x86/kvm/vmx/hyperv.h
> +++ b/arch/x86/kvm/vmx/hyperv.h
> @@ -4,6 +4,7 @@
>
> #include <linux/kvm_host.h>
> #include "vmcs12.h"
> +#include "vmx.h"
>
> #define EVMPTR_INVALID (-1ULL)
> #define EVMPTR_MAP_PENDING (-2ULL)
> @@ -20,7 +21,14 @@ enum nested_evmptrld_status {
> EVMPTRLD_ERROR,
> };
>
> +struct vcpu_vmx;

This forward declaration should be unnecessary as it's defined by vmx.h, which
is included above.

2023-11-30 01:32:14

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 10/14] KVM: x86: Make Hyper-V emulation optional

On Wed, Oct 25, 2023, Vitaly Kuznetsov wrote:
> Hyper-V emulation in KVM is a fairly big chunk and in some cases it may be
> desirable to not compile it in to reduce module sizes as well as the attack
> surface. Introduce CONFIG_KVM_HYPERV option to make it possible.
>
> Note, there's room for further nVMX/nSVM code optimizations when
> !CONFIG_KVM_HYPERV, this will be done in follow-up patches.
>
> Reorganize Makefile a bit so all CONFIG_HYPERV and CONFIG_KVM_HYPERV files
> are grouped together.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---

...

> diff --git a/arch/x86/kvm/Makefile b/arch/x86/kvm/Makefile
> index 8ea872401cd6..b97b875ad75f 100644
> --- a/arch/x86/kvm/Makefile
> +++ b/arch/x86/kvm/Makefile
> @@ -11,32 +11,33 @@ include $(srctree)/virt/kvm/Makefile.kvm
>
> kvm-y += x86.o emulate.o i8259.o irq.o lapic.o \
> i8254.o ioapic.o irq_comm.o cpuid.o pmu.o mtrr.o \
> - hyperv.o debugfs.o mmu/mmu.o mmu/page_track.o \
> + debugfs.o mmu/mmu.o mmu/page_track.o \
> mmu/spte.o
>
> -ifdef CONFIG_HYPERV
> -kvm-y += kvm_onhyperv.o
> -endif
> -
> kvm-$(CONFIG_X86_64) += mmu/tdp_iter.o mmu/tdp_mmu.o
> +kvm-$(CONFIG_KVM_HYPERV) += hyperv.o
> kvm-$(CONFIG_KVM_XEN) += xen.o
> kvm-$(CONFIG_KVM_SMM) += smm.o
>
> kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
> - vmx/hyperv.o vmx/hyperv_evmcs.o vmx/nested.o vmx/posted_intr.o
> -kvm-intel-$(CONFIG_X86_SGX_KVM) += vmx/sgx.o
> + vmx/nested.o vmx/posted_intr.o
>
> -ifdef CONFIG_HYPERV
> -kvm-intel-y += vmx/vmx_onhyperv.o
> -endif
> +kvm-intel-$(CONFIG_X86_SGX_KVM) += vmx/sgx.o
>
> kvm-amd-y += svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o \
> - svm/sev.o svm/hyperv.o
> + svm/sev.o
>
> ifdef CONFIG_HYPERV
> +kvm-y += kvm_onhyperv.o
> +kvm-intel-y += vmx/vmx_onhyperv.o vmx/hyperv_evmcs.o
> kvm-amd-y += svm/svm_onhyperv.o
> endif
>
> +ifdef CONFIG_KVM_HYPERV
> +kvm-intel-y += vmx/hyperv.o vmx/hyperv_evmcs.o
> +kvm-amd-y += svm/hyperv.o
> +endif

My strong preference is to avoid the unnecessary ifdef and instead do:

kvm-intel-y += vmx/vmx.o vmx/vmenter.o vmx/pmu_intel.o vmx/vmcs12.o \
vmx/nested.o vmx/posted_intr.o

kvm-intel-$(CONFIG_X86_SGX_KVM) += vmx/sgx.o
kvm-intel-$(CONFIG_KVM_HYPERV) += vmx/hyperv.o vmx/hyperv_evmcs.o

kvm-amd-y += svm/svm.o svm/vmenter.o svm/pmu.o svm/nested.o svm/avic.o \
svm/sev.o
kvm-amd-$(CONFIG_KVM_HYPERV) += svm/hyperv.o


CONFIG_HYPERV needs an ifdef because it can be 'y' or 'm', but otherwise ifdefs
just tend to be noisier.

> static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
> {
> @@ -3552,11 +3563,13 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
> if (!nested_vmx_check_permission(vcpu))
> return 1;
>
> +#ifdef CONFIG_KVM_HYPERV
> evmptrld_status = nested_vmx_handle_enlightened_vmptrld(vcpu, launch);
> if (evmptrld_status == EVMPTRLD_ERROR) {
> kvm_queue_exception(vcpu, UD_VECTOR);
> return 1;
> }
> +#endif
>
> kvm_pmu_trigger_event(vcpu, PERF_COUNT_HW_BRANCH_INSTRUCTIONS);

This fails to build with CONFIG_KVM_HYPERV=n && CONFIG_KVM_WERROR=y:

arch/x86/kvm/vmx/nested.c:3577:9: error: variable 'evmptrld_status' is uninitialized when used here [-Werror,-Wuninitialized]
if (CC(evmptrld_status == EVMPTRLD_VMFAIL))
^~~~~~~~~~~~~~~

Sadly, simply wrapping with an #ifdef also fails because then evmptrld_status
becomes unused. I'd really prefer to avoid having to tag it __maybe_unused, and
adding more #ifdef would also be ugly. Any ideas?

2023-11-30 01:39:09

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 10/14] KVM: x86: Make Hyper-V emulation optional

On Wed, Oct 25, 2023, Vitaly Kuznetsov wrote:
> @@ -1570,6 +1572,7 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
> vmcs_load(vmx->loaded_vmcs->vmcs);
> }
>
> +#ifdef CONFIG_KVM_HYPERV
> static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields)
> {
> struct vmcs12 *vmcs12 = vmx->nested.cached_vmcs12;
> @@ -2077,6 +2080,10 @@ static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
>
> return EVMPTRLD_SUCCEEDED;
> }
> +#else /* CONFIG_KVM_HYPERV */
> +static inline void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields) {}
> +static inline void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx) {}

I'm not sure I love the stubs in .c code. What if we instead throw the #ifdef
inside the helper, and then add a KVM_BUG_ON() in the CONFIG_KVM_HYPERV=n path?

> +#endif /* CONFIG_KVM_HYPERV */
>
> void nested_sync_vmcs12_to_shadow(struct kvm_vcpu *vcpu)
> {
> @@ -3155,6 +3162,7 @@ static int nested_vmx_check_vmentry_hw(struct kvm_vcpu *vcpu)
> return 0;
> }
>
> +#ifdef CONFIG_KVM_HYPERV
> static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)
> {
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -3182,6 +3190,9 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)
>
> return true;
> }
> +#else
> +static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu) { return true; }
> +#endif

And this one seems like it could be cleaner to just #ifdef the callers.

2023-11-30 01:42:18

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: x86: Make Hyper-V emulation optional

On Wed, Oct 25, 2023, Vitaly Kuznetsov wrote:
> Vitaly Kuznetsov (14):
> KVM: x86: xen: Remove unneeded xen context from struct kvm_arch when
> !CONFIG_KVM_XEN
> KVM: x86: hyper-v: Move Hyper-V partition assist page out of Hyper-V
> emulation context
> KVM: VMX: Split off vmx_onhyperv.{ch} from hyperv.{ch}
> KVM: x86: hyper-v: Introduce kvm_hv_synic_auto_eoi_set()
> KVM: x86: hyper-v: Introduce kvm_hv_synic_has_vector()
> KVM: VMX: Split off hyperv_evmcs.{ch}
> KVM: x86: hyper-v: Introduce kvm_hv_nested_transtion_tlb_flush()
> helper
> KVM: selftests: Make all Hyper-V tests explicitly dependent on Hyper-V
> emulation support in KVM
> KVM: selftests: Fix vmxon_pa == vmcs12_pa == -1ull
> vmx_set_nested_state_test for !eVMCS case
> KVM: x86: Make Hyper-V emulation optional
> KVM: nVMX: hyper-v: Introduce nested_vmx_evmptr12() and
> nested_vmx_is_evmptr12_valid() helpers
> KVM: nVMX: hyper-v: Introduce nested_vmx_evmcs() accessor
> KVM: nVMX: hyper-v: Hide more stuff under CONFIG_KVM_HYPERV
> KVM: nSVM: hyper-v: Hide more stuff under
> CONFIG_KVM_HYPERV/CONFIG_HYPERV

No major complaints. If it hadn't been for the build failure, I likely wouldn't
have even bothered with most of my nits ;-)

I'll wait for v2 though, trying to fixup as I go will be a bit risky.

2023-11-30 15:12:03

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH 10/14] KVM: x86: Make Hyper-V emulation optional

Sean Christopherson <[email protected]> writes:

...

>
>> static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
>> {
>> @@ -3552,11 +3563,13 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
>> if (!nested_vmx_check_permission(vcpu))
>> return 1;
>>
>> +#ifdef CONFIG_KVM_HYPERV
>> evmptrld_status = nested_vmx_handle_enlightened_vmptrld(vcpu, launch);
>> if (evmptrld_status == EVMPTRLD_ERROR) {
>> kvm_queue_exception(vcpu, UD_VECTOR);
>> return 1;
>> }
>> +#endif
>>
>> kvm_pmu_trigger_event(vcpu, PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
>
> This fails to build with CONFIG_KVM_HYPERV=n && CONFIG_KVM_WERROR=y:
>
> arch/x86/kvm/vmx/nested.c:3577:9: error: variable 'evmptrld_status' is uninitialized when used here [-Werror,-Wuninitialized]
> if (CC(evmptrld_status == EVMPTRLD_VMFAIL))
> ^~~~~~~~~~~~~~~
>
> Sadly, simply wrapping with an #ifdef also fails because then evmptrld_status
> becomes unused. I'd really prefer to avoid having to tag it __maybe_unused, and
> adding more #ifdef would also be ugly. Any ideas?

A couple,

- we can try putting all eVMPTR logic under 'if (1)' just to create a
block where we can define evmptrld_status. Not sure this is nicer than
another #ifdef wrapping evmptrld_status and I'm not sure what to do
with kvm_pmu_trigger_event() -- can it just go above
nested_vmx_handle_enlightened_vmptrld()?

- we can add a helper, e.g. 'evmptr_is_vmfail()' and make it just return
'false' when !CONFIG_KVM_HYPERV.

- rewrite this as a switch to avoid the need for having the local
variable, (untested)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index c5ec0ef51ff7..b26ce7d596e9 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -3553,22 +3553,23 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
enum nvmx_vmentry_status status;
struct vcpu_vmx *vmx = to_vmx(vcpu);
u32 interrupt_shadow = vmx_get_interrupt_shadow(vcpu);
- enum nested_evmptrld_status evmptrld_status;

if (!nested_vmx_check_permission(vcpu))
return 1;

- evmptrld_status = nested_vmx_handle_enlightened_vmptrld(vcpu, launch);
- if (evmptrld_status == EVMPTRLD_ERROR) {
+ switch (nested_vmx_handle_enlightened_vmptrld(vcpu, launch)) {
+ case EVMPTRLD_ERROR:
kvm_queue_exception(vcpu, UD_VECTOR);
return 1;
+ case EVMPTRLD_VMFAIL:
+ trace_kvm_nested_vmenter_failed("evmptrld_status", 0);
+ return nested_vmx_failInvalid(vcpu);
+ default:
+ break;
}

kvm_pmu_trigger_event(vcpu, PERF_COUNT_HW_BRANCH_INSTRUCTIONS);

- if (CC(evmptrld_status == EVMPTRLD_VMFAIL))
- return nested_vmx_failInvalid(vcpu);
-
if (CC(!evmptr_is_valid(vmx->nested.hv_evmcs_vmptr) &&
vmx->nested.current_vmptr == INVALID_GPA))
return nested_vmx_failInvalid(vcpu);

Unfortunately, I had to open code CC() ;-(

Or maybe just another "#ifdef" is not so ugly after all? :-)

--
Vitaly

2023-11-30 18:29:07

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 10/14] KVM: x86: Make Hyper-V emulation optional

On Thu, Nov 30, 2023, Vitaly Kuznetsov wrote:
> Sean Christopherson <[email protected]> writes:
>
> ...
>
> >
> >> static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
> >> {
> >> @@ -3552,11 +3563,13 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
> >> if (!nested_vmx_check_permission(vcpu))
> >> return 1;
> >>
> >> +#ifdef CONFIG_KVM_HYPERV
> >> evmptrld_status = nested_vmx_handle_enlightened_vmptrld(vcpu, launch);
> >> if (evmptrld_status == EVMPTRLD_ERROR) {
> >> kvm_queue_exception(vcpu, UD_VECTOR);
> >> return 1;
> >> }
> >> +#endif
> >>
> >> kvm_pmu_trigger_event(vcpu, PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
> >
> > This fails to build with CONFIG_KVM_HYPERV=n && CONFIG_KVM_WERROR=y:
> >
> > arch/x86/kvm/vmx/nested.c:3577:9: error: variable 'evmptrld_status' is uninitialized when used here [-Werror,-Wuninitialized]
> > if (CC(evmptrld_status == EVMPTRLD_VMFAIL))
> > ^~~~~~~~~~~~~~~
> >
> > Sadly, simply wrapping with an #ifdef also fails because then evmptrld_status
> > becomes unused. I'd really prefer to avoid having to tag it __maybe_unused, and
> > adding more #ifdef would also be ugly. Any ideas?
>
> A couple,
>
> - we can try putting all eVMPTR logic under 'if (1)' just to create a
> block where we can define evmptrld_status. Not sure this is nicer than
> another #ifdef wrapping evmptrld_status and I'm not sure what to do
> with kvm_pmu_trigger_event() -- can it just go above
> nested_vmx_handle_enlightened_vmptrld()?
>
> - we can add a helper, e.g. 'evmptr_is_vmfail()' and make it just return
> 'false' when !CONFIG_KVM_HYPERV.
>
> - rewrite this as a switch to avoid the need for having the local
> variable, (untested)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index c5ec0ef51ff7..b26ce7d596e9 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3553,22 +3553,23 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
> enum nvmx_vmentry_status status;
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> u32 interrupt_shadow = vmx_get_interrupt_shadow(vcpu);
> - enum nested_evmptrld_status evmptrld_status;
>
> if (!nested_vmx_check_permission(vcpu))
> return 1;
>
> - evmptrld_status = nested_vmx_handle_enlightened_vmptrld(vcpu, launch);
> - if (evmptrld_status == EVMPTRLD_ERROR) {
> + switch (nested_vmx_handle_enlightened_vmptrld(vcpu, launch)) {
> + case EVMPTRLD_ERROR:
> kvm_queue_exception(vcpu, UD_VECTOR);
> return 1;
> + case EVMPTRLD_VMFAIL:
> + trace_kvm_nested_vmenter_failed("evmptrld_status", 0);
> + return nested_vmx_failInvalid(vcpu);
> + default:
> + break;
> }
>
> kvm_pmu_trigger_event(vcpu, PERF_COUNT_HW_BRANCH_INSTRUCTIONS);
>
> - if (CC(evmptrld_status == EVMPTRLD_VMFAIL))
> - return nested_vmx_failInvalid(vcpu);
> -
> if (CC(!evmptr_is_valid(vmx->nested.hv_evmcs_vmptr) &&
> vmx->nested.current_vmptr == INVALID_GPA))
> return nested_vmx_failInvalid(vcpu);
>
> Unfortunately, I had to open code CC() ;-(
>
> Or maybe just another "#ifdef" is not so ugly after all? :-)

Ah, just have nested_vmx_handle_enlightened_vmptrld() return EVMPTRLD_DISABLED
for its "stub", e.g. with some otehr tangentially de-stubbing:

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4777d867419c..5a27a2ebbb32 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2000,6 +2000,7 @@ static void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx)
static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
struct kvm_vcpu *vcpu, bool from_launch)
{
+#ifdef CONFIG_KVM_HYPERV
struct vcpu_vmx *vmx = to_vmx(vcpu);
bool evmcs_gpa_changed = false;
u64 evmcs_gpa;
@@ -2081,11 +2082,10 @@ static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
}

return EVMPTRLD_SUCCEEDED;
+#else
+ return EVMPTRLD_DISABLED;
+#endif
}
-#else /* CONFIG_KVM_HYPERV */
-static inline void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields) {}
-static inline void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx) {}
-#endif /* CONFIG_KVM_HYPERV */

void nested_sync_vmcs12_to_shadow(struct kvm_vcpu *vcpu)
{
@@ -3191,8 +3191,6 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)

return true;
}
-#else
-static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu) { return true; }
#endif

static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
@@ -3285,6 +3283,7 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)

static bool vmx_get_nested_state_pages(struct kvm_vcpu *vcpu)
{
+#ifdef CONFIG_KVM_HYPERV
/*
* Note: nested_get_evmcs_page() also updates 'vp_assist_page' copy
* in 'struct kvm_vcpu_hv' in case eVMCS is in use, this is mandatory
@@ -3301,7 +3300,7 @@ static bool vmx_get_nested_state_pages(struct kvm_vcpu *vcpu)

return false;
}
-
+#endif
if (is_guest_mode(vcpu) && !nested_get_vmcs12_pages(vcpu))
return false;

@@ -3564,7 +3563,6 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
if (!nested_vmx_check_permission(vcpu))
return 1;

-#ifdef CONFIG_KVM_HYPERV
evmptrld_status = nested_vmx_handle_enlightened_vmptrld(vcpu, launch);
if (evmptrld_status == EVMPTRLD_ERROR) {
kvm_queue_exception(vcpu, UD_VECTOR);
@@ -4743,6 +4741,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
WARN_ON_ONCE(vmx->nested.nested_run_pending);

if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
+#ifdef CONFIG_KVM_HYPERV
/*
* KVM_REQ_GET_NESTED_STATE_PAGES is also used to map
* Enlightened VMCS after migration and we still need to
@@ -4750,6 +4749,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
* the first L2 run.
*/
(void)nested_get_evmcs_page(vcpu);
+#endif
}

/* Service pending TLB flush requests for L2 before switching to L1. */

2023-11-30 19:14:10

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 11/14] KVM: nVMX: hyper-v: Introduce nested_vmx_evmptr12() and nested_vmx_is_evmptr12_valid() helpers

On Wed, Oct 25, 2023, Vitaly Kuznetsov wrote:
> #ifdef CONFIG_KVM_HYPERV
> +static inline gpa_t nested_vmx_evmptr12(struct vcpu_vmx *vmx) { return vmx->nested.hv_evmcs_vmptr; }
> +static inline bool nested_vmx_is_evmptr12_valid(struct vcpu_vmx *vmx)
> +{
> + return evmptr_is_valid(vmx->nested.hv_evmcs_vmptr);
> +}

...

> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index d0d735974b2c..b45586588bae 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -179,7 +179,7 @@ static int nested_vmx_failValid(struct kvm_vcpu *vcpu,
> * VM_INSTRUCTION_ERROR is not shadowed. Enlightened VMCS 'shadows' all
> * fields and thus must be synced.
> */
> - if (to_vmx(vcpu)->nested.hv_evmcs_vmptr != EVMPTR_INVALID)
> + if (nested_vmx_evmptr12(to_vmx(vcpu)) != EVMPTR_INVALID)
> to_vmx(vcpu)->nested.need_vmcs12_to_shadow_sync = true;
>
> return kvm_skip_emulated_instruction(vcpu);
> @@ -194,7 +194,7 @@ static int nested_vmx_fail(struct kvm_vcpu *vcpu, u32 vm_instruction_error)
> * can't be done if there isn't a current VMCS.
> */
> if (vmx->nested.current_vmptr == INVALID_GPA &&
> - !evmptr_is_valid(vmx->nested.hv_evmcs_vmptr))
> + !nested_vmx_is_evmptr12_valid(vmx))

Hrm, this results in a bit of a mess, because it's essentially a half-baked
conversion, and the existing evmptr_is_valid() makes it extra confusing.

Specifically, I don't think nested_vmx_evmptr12() should exist. The only code
that should ever care about the evmptr12 _value_ should be guarded with
CONFIG_KVM_HYPERV, everything else should simply be checking for validity.

I don't know what names would be best, but we should end up with two helpers:
one that checks "evmptr != INVALID && evmptr != MAP_PENDING" and another that
checks "evmptr != INVALID".

And the inner helpers, e.g. evmptr_is_valid(), should also have stubs. I doubt
it will matter in practice, but I see no reason not to make it super duper obvious
that evmptr_is_valid() can never be %true if CONFIG_KVM_HYPERV=n.

> @@ -30,6 +38,8 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
> bool nested_evmcs_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu);
> void vmx_hv_inject_synthetic_vmexit_post_tlb_flush(struct kvm_vcpu *vcpu);
> #else
> +static inline gpa_t nested_vmx_evmptr12(struct vcpu_vmx *vmx) { return EVMPTR_INVALID; }
> +static inline bool nested_vmx_is_evmptr12_valid(struct vcpu_vmx *vmx) { return false; }
> static inline u64 nested_get_evmptr(struct kvm_vcpu *vcpu) { return EVMPTR_INVALID; }
> static inline void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata) {}
> static inline bool nested_evmcs_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu) { return false; }

Pretty much all of these stubs are gratuitous. They either have single users
that can be wrapped with CONFIG_KVM_HYPERV, or can be eliminated by the adding
the extra helper as above.

Oh, and switching back to a topic that I think I brought up in a different response,
the stubs for nested_get_evmcs_page() is *really* nasty, as it returns %true when
eVMCS is disabled. It's functionally correct for vmx_get_nested_state_pages(), but
super odd because obviously KVM was unable to get any eVMCS pages. That's one of
the reasons why my preference is to avoid most stubs and instead either handle
things entirely within the eVMCS helpers or use #ifdefs at the call sites (when
there is only 1, maybe 2, callers).

Pulling in another goof, hyperv_enabled and hyperv in struct kvm_vcpu_arch should
be #ifdef'd away.

Last thought, my fairly strong preference is to not squish any of these helpers
into a single line, i.e. do

static inline bool evmptr_is_valid(u64 evmptr)
{
return false;
}

which IMO is much easier to read. It's quite easy to trim this down to five stubs,
at which point making the code as dense as possible is a net negative, e.g. even
with the multi-line stubs, the entirety of arch/x86/kvm/vmx/hyperv.h fits on my
monitor without scrolling.

This is what I ended up with after a bit of hacking. It compiles, but otherwise
is untested. As above, I have no idea what to call evmptr_is_tbd() :-)

---
arch/x86/include/asm/kvm_host.h | 2 +
arch/x86/kvm/vmx/hyperv.h | 57 ++++++++++++++++++-------
arch/x86/kvm/vmx/nested.c | 73 +++++++++++++++++++++------------
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 2 +
arch/x86/kvm/vmx/vmx.h | 10 -----
6 files changed, 94 insertions(+), 52 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 394dc11bf232..0ecfb16c611c 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -937,8 +937,10 @@ struct kvm_vcpu_arch {
/* used for guest single stepping over the given code position */
unsigned long singlestep_rip;

+#ifdef CONFIG_KVM_HYPERV
bool hyperv_enabled;
struct kvm_vcpu_hv *hyperv;
+#endif
#ifdef CONFIG_KVM_XEN
struct kvm_vcpu_xen xen;
#endif
diff --git a/arch/x86/kvm/vmx/hyperv.h b/arch/x86/kvm/vmx/hyperv.h
index 92bde3b75ca0..ce2b03ea2d30 100644
--- a/arch/x86/kvm/vmx/hyperv.h
+++ b/arch/x86/kvm/vmx/hyperv.h
@@ -9,11 +9,6 @@
#define EVMPTR_INVALID (-1ULL)
#define EVMPTR_MAP_PENDING (-2ULL)

-static inline bool evmptr_is_valid(u64 evmptr)
-{
- return evmptr != EVMPTR_INVALID && evmptr != EVMPTR_MAP_PENDING;
-}
-
enum nested_evmptrld_status {
EVMPTRLD_DISABLED,
EVMPTRLD_SUCCEEDED,
@@ -21,18 +16,41 @@ enum nested_evmptrld_status {
EVMPTRLD_ERROR,
};

-struct vcpu_vmx;
-
#ifdef CONFIG_KVM_HYPERV
-static inline gpa_t nested_vmx_evmptr12(struct vcpu_vmx *vmx) { return vmx->nested.hv_evmcs_vmptr; }
+static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
+{
+ /*
+ * eVMCS is exposed to the guest if Hyper-V is enabled in CPUID and
+ * eVMCS has been explicitly enabled by userspace.
+ */
+ return vcpu->arch.hyperv_enabled &&
+ to_vmx(vcpu)->nested.enlightened_vmcs_enabled;
+}
+static inline gpa_t nested_vmx_evmptr12(struct vcpu_vmx *vmx)
+{
+ return vmx->nested.hv_evmcs_vmptr;
+}
+static inline bool evmptr_is_valid(u64 evmptr)
+{
+ return evmptr != EVMPTR_INVALID && evmptr != EVMPTR_MAP_PENDING;
+}
static inline bool nested_vmx_is_evmptr12_valid(struct vcpu_vmx *vmx)
{
return evmptr_is_valid(vmx->nested.hv_evmcs_vmptr);
}
+static inline bool evmptr_is_tbd(u64 evmptr)
+{
+ return evmptr != EVMPTR_INVALID;
+}
+static inline bool nested_vmx_is_evmptr12_tbd(struct vcpu_vmx *vmx)
+{
+ return evmptr_is_tbd(vmx->nested.hv_evmcs_vmptr);
+}
static inline struct hv_enlightened_vmcs *nested_vmx_evmcs(struct vcpu_vmx *vmx)
{
return vmx->nested.hv_evmcs;
}
+
u64 nested_get_evmptr(struct kvm_vcpu *vcpu);
uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu);
int nested_enable_evmcs(struct kvm_vcpu *vcpu,
@@ -42,17 +60,26 @@ int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
bool nested_evmcs_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu);
void vmx_hv_inject_synthetic_vmexit_post_tlb_flush(struct kvm_vcpu *vcpu);
#else
-static inline gpa_t nested_vmx_evmptr12(struct vcpu_vmx *vmx) { return EVMPTR_INVALID; }
-static inline bool nested_vmx_is_evmptr12_valid(struct vcpu_vmx *vmx) { return false; }
+static inline bool evmptr_is_valid(u64 evmptr)
+{
+ return false;
+}
+static inline bool nested_vmx_is_evmptr12_valid(struct vcpu_vmx *vmx)
+{
+ return false;
+}
+static inline bool evmptr_is_tbd(u64 evmptr)
+{
+ return false;
+}
+static inline bool nested_vmx_is_evmptr12_tbd(struct vcpu_vmx *vmx)
+{
+ return false;
+}
static inline struct hv_enlightened_vmcs *nested_vmx_evmcs(struct vcpu_vmx *vmx)
{
return NULL;
}
-static inline u64 nested_get_evmptr(struct kvm_vcpu *vcpu) { return EVMPTR_INVALID; }
-static inline void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata) {}
-static inline bool nested_evmcs_l2_tlb_flush_enabled(struct kvm_vcpu *vcpu) { return false; }
-static inline int nested_evmcs_check_controls(struct vmcs12 *vmcs12) { return 0; }
#endif

-
#endif /* __KVM_X86_VMX_HYPERV_H */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4777d867419c..0045d6a2da54 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -179,7 +179,7 @@ static int nested_vmx_failValid(struct kvm_vcpu *vcpu,
* VM_INSTRUCTION_ERROR is not shadowed. Enlightened VMCS 'shadows' all
* fields and thus must be synced.
*/
- if (nested_vmx_evmptr12(to_vmx(vcpu)) != EVMPTR_INVALID)
+ if (nested_vmx_is_evmptr12_tbd(to_vmx(vcpu)))
to_vmx(vcpu)->nested.need_vmcs12_to_shadow_sync = true;

return kvm_skip_emulated_instruction(vcpu);
@@ -245,6 +245,34 @@ static inline void nested_release_evmcs(struct kvm_vcpu *vcpu)
#endif
}

+static bool nested_evmcs_handle_vmclear(struct kvm_vcpu *vcpu, gpa_t vmptr)
+{
+#ifdef CONFIG_KVM_HYPERV
+ struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+ /*
+ * When Enlightened VMEntry is enabled on the calling CPU we treat
+ * memory area pointer by vmptr as Enlightened VMCS (as there's no good
+ * way to distinguish it from VMCS12) and we must not corrupt it by
+ * writing to the non-existent 'launch_state' field. The area doesn't
+ * have to be the currently active EVMCS on the calling CPU and there's
+ * nothing KVM has to do to transition it from 'active' to 'non-active'
+ * state. It is possible that the area will stay mapped as
+ * vmx->nested.hv_evmcs but this shouldn't be a problem.
+ */
+ if (!guest_cpuid_has_evmcs(vcpu) ||
+ !evmptr_is_valid(nested_get_evmptr(vcpu)))
+ return false;
+
+ if (nested_vmx_evmcs(vmx) && vmptr == nested_vmx_evmptr12(vmx))
+ nested_release_evmcs(vcpu);
+
+ return true;
+#else
+ return false;
+#endif
+}
+
static void vmx_sync_vmcs_host_state(struct vcpu_vmx *vmx,
struct loaded_vmcs *prev)
{
@@ -1574,9 +1602,9 @@ static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx)
vmcs_load(vmx->loaded_vmcs->vmcs);
}

-#ifdef CONFIG_KVM_HYPERV
static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields)
{
+#ifdef CONFIG_KVM_HYPERV
struct vmcs12 *vmcs12 = vmx->nested.cached_vmcs12;
struct hv_enlightened_vmcs *evmcs = nested_vmx_evmcs(vmx);
struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(&vmx->vcpu);
@@ -1817,10 +1845,12 @@ static void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields
*/

return;
+#endif
}

static void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx)
{
+#ifdef CONFIG_KVM_HYPERV
struct vmcs12 *vmcs12 = vmx->nested.cached_vmcs12;
struct hv_enlightened_vmcs *evmcs = nested_vmx_evmcs(vmx);

@@ -1991,6 +2021,7 @@ static void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx)
evmcs->guest_bndcfgs = vmcs12->guest_bndcfgs;

return;
+#endif
}

/*
@@ -2000,6 +2031,7 @@ static void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx)
static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
struct kvm_vcpu *vcpu, bool from_launch)
{
+#ifdef CONFIG_KVM_HYPERV
struct vcpu_vmx *vmx = to_vmx(vcpu);
bool evmcs_gpa_changed = false;
u64 evmcs_gpa;
@@ -2081,11 +2113,10 @@ static enum nested_evmptrld_status nested_vmx_handle_enlightened_vmptrld(
}

return EVMPTRLD_SUCCEEDED;
+#else
+ return EVMPTRLD_DISABLED;
+#endif
}
-#else /* CONFIG_KVM_HYPERV */
-static inline void copy_enlightened_to_vmcs12(struct vcpu_vmx *vmx, u32 hv_clean_fields) {}
-static inline void copy_vmcs12_to_enlightened(struct vcpu_vmx *vmx) {}
-#endif /* CONFIG_KVM_HYPERV */

void nested_sync_vmcs12_to_shadow(struct kvm_vcpu *vcpu)
{
@@ -2890,8 +2921,10 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
nested_check_vm_entry_controls(vcpu, vmcs12))
return -EINVAL;

+#ifdef CONFIG_KVM_HYPERV
if (guest_cpuid_has_evmcs(vcpu))
return nested_evmcs_check_controls(vmcs12);
+#endif

return 0;
}
@@ -3191,8 +3224,6 @@ static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu)

return true;
}
-#else
-static bool nested_get_evmcs_page(struct kvm_vcpu *vcpu) { return true; }
#endif

static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
@@ -3285,6 +3316,7 @@ static bool nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)

static bool vmx_get_nested_state_pages(struct kvm_vcpu *vcpu)
{
+#ifdef CONFIG_KVM_HYPERV
/*
* Note: nested_get_evmcs_page() also updates 'vp_assist_page' copy
* in 'struct kvm_vcpu_hv' in case eVMCS is in use, this is mandatory
@@ -3301,7 +3333,7 @@ static bool vmx_get_nested_state_pages(struct kvm_vcpu *vcpu)

return false;
}
-
+#endif
if (is_guest_mode(vcpu) && !nested_get_vmcs12_pages(vcpu))
return false;

@@ -3564,13 +3596,11 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu, bool launch)
if (!nested_vmx_check_permission(vcpu))
return 1;

-#ifdef CONFIG_KVM_HYPERV
evmptrld_status = nested_vmx_handle_enlightened_vmptrld(vcpu, launch);
if (evmptrld_status == EVMPTRLD_ERROR) {
kvm_queue_exception(vcpu, UD_VECTOR);
return 1;
}
-#endif

kvm_pmu_trigger_event(vcpu, PERF_COUNT_HW_BRANCH_INSTRUCTIONS);

@@ -4743,6 +4773,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
WARN_ON_ONCE(vmx->nested.nested_run_pending);

if (kvm_check_request(KVM_REQ_GET_NESTED_STATE_PAGES, vcpu)) {
+#ifdef CONFIG_KVM_HYPERV
/*
* KVM_REQ_GET_NESTED_STATE_PAGES is also used to map
* Enlightened VMCS after migration and we still need to
@@ -4750,6 +4781,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 vm_exit_reason,
* the first L2 run.
*/
(void)nested_get_evmcs_page(vcpu);
+#endif
}

/* Service pending TLB flush requests for L2 before switching to L1. */
@@ -5302,18 +5334,7 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
if (vmptr == vmx->nested.vmxon_ptr)
return nested_vmx_fail(vcpu, VMXERR_VMCLEAR_VMXON_POINTER);

- /*
- * When Enlightened VMEntry is enabled on the calling CPU we treat
- * memory area pointer by vmptr as Enlightened VMCS (as there's no good
- * way to distinguish it from VMCS12) and we must not corrupt it by
- * writing to the non-existent 'launch_state' field. The area doesn't
- * have to be the currently active EVMCS on the calling CPU and there's
- * nothing KVM has to do to transition it from 'active' to 'non-active'
- * state. It is possible that the area will stay mapped as
- * vmx->nested.hv_evmcs but this shouldn't be a problem.
- */
- if (likely(!guest_cpuid_has_evmcs(vcpu) ||
- !evmptr_is_valid(nested_get_evmptr(vcpu)))) {
+ if (likely(!nested_evmcs_handle_vmclear(vcpu, vmptr))) {
if (vmptr == vmx->nested.current_vmptr)
nested_release_vmcs12(vcpu);

@@ -5330,8 +5351,6 @@ static int handle_vmclear(struct kvm_vcpu *vcpu)
vmptr + offsetof(struct vmcs12,
launch_state),
&zero, sizeof(zero));
- } else if (nested_vmx_evmcs(vmx) && vmptr == nested_vmx_evmptr12(vmx)) {
- nested_release_evmcs(vcpu);
}

return nested_vmx_succeed(vcpu);
@@ -6218,11 +6237,13 @@ static bool nested_vmx_l0_wants_exit(struct kvm_vcpu *vcpu,
* Handle L2's bus locks in L0 directly.
*/
return true;
+#ifdef CONFIG_KVM_HYPERV
case EXIT_REASON_VMCALL:
/* Hyper-V L2 TLB flush hypercall is handled by L0 */
return guest_hv_cpuid_has_l2_tlb_flush(vcpu) &&
nested_evmcs_l2_tlb_flush_enabled(vcpu) &&
kvm_hv_is_tlb_flush_hcall(vcpu);
+#endif
default:
break;
}
@@ -6445,7 +6466,7 @@ static int vmx_get_nested_state(struct kvm_vcpu *vcpu,
kvm_state.size += sizeof(user_vmx_nested_state->vmcs12);

/* 'hv_evmcs_vmptr' can also be EVMPTR_MAP_PENDING here */
- if (nested_vmx_evmptr12(vmx) != EVMPTR_INVALID)
+ if (nested_vmx_is_evmptr12_tbd(vmx))
kvm_state.flags |= KVM_STATE_NESTED_EVMCS;

if (is_guest_mode(vcpu) &&
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index 0cedb80c5c94..891cc1df6a60 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -58,7 +58,7 @@ static inline int vmx_has_valid_vmcs12(struct kvm_vcpu *vcpu)

/* 'hv_evmcs_vmptr' can also be EVMPTR_MAP_PENDING here */
return vmx->nested.current_vmptr != -1ull ||
- nested_vmx_evmptr12(vmx) != EVMPTR_INVALID;
+ nested_vmx_is_evmptr12_tbd(vmx);
}

static inline u16 nested_get_vpid02(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6e0ff015c5ff..942e10166777 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2048,6 +2048,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (vmx_get_vmx_msr(&vmx->nested.msrs, msr_info->index,
&msr_info->data))
return 1;
+#ifdef CONFIG_KVM_HYPERV
/*
* Enlightened VMCS v1 doesn't have certain VMCS fields but
* instead of just ignoring the features, different Hyper-V
@@ -2058,6 +2059,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (!msr_info->host_initiated && guest_cpuid_has_evmcs(vcpu))
nested_evmcs_filter_control_msr(vcpu, msr_info->index,
&msr_info->data);
+#endif
break;
case MSR_IA32_RTIT_CTL:
if (!vmx_pt_mode_is_host_guest())
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 55addb8eef01..8fe6eb2b4a34 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -747,14 +747,4 @@ static inline bool vmx_can_use_ipiv(struct kvm_vcpu *vcpu)
return lapic_in_kernel(vcpu) && enable_ipiv;
}

-static inline bool guest_cpuid_has_evmcs(struct kvm_vcpu *vcpu)
-{
- /*
- * eVMCS is exposed to the guest if Hyper-V is enabled in CPUID and
- * eVMCS has been explicitly enabled by userspace.
- */
- return vcpu->arch.hyperv_enabled &&
- to_vmx(vcpu)->nested.enlightened_vmcs_enabled;
-}
-
#endif /* __KVM_X86_VMX_H */

base-commit: 3f84682d96d3d77f35f661a3787ddb90c0477b75
--