2022-07-08 15:17:17

by Vitaly Kuznetsov

Subject: [PATCH v3 00/25] KVM: VMX: Support updated eVMCSv1 revision + use vmcs_config for L1 VMX MSRs

This series continues the
"[PATCH v2 00/28] KVM: VMX: Support TscScaling and EnclsExitingBitmap
 with eVMCS + use vmcs_config for L1 VMX MSRs" work:
https://lore.kernel.org/kvm/[email protected]/

Changes since v2:
- It turns out the updated eVMCSv1 revision comes with a CPUID feature bit,
and this changes a lot as we don't need to invent our own eVMCS
revisions. Adjust the whole series accordingly, drop the now-unneeded
KVM_CAP_HYPERV_ENLIGHTENED_VMCS2.
- VM_{EXIT,ENTRY}_LOAD_IA32_PERF_GLOBAL_CTRL are now supported for both
Hyper-V on KVM and KVM on Hyper-V.
- Make CPU_BASED_NMI_WINDOW_EXITING optional [Sean].
- Drop erroneous "KVM: VMX: Add missing VMENTRY controls to vmcs_config"
[Jim].
- Include Jim's "KVM: x86: VMX: Replace some Intel model numbers with
mnemonics" into the series.
- "KVM: nVMX: Always set required-1 bits of pinbased_ctls to
PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR" added [Jim]. It allows to get rid
of required-1 bits caching in vmcs_config, patches dropped.
- Collect R-b tags [Jim].
- Other minor tweaks (descriptions, comments) [Jim, Sean].

Original description:

The Enlightened VMCS v1 definition was updated to include fields for the
following features:
- PerfGlobalCtrl
- EnclsExitingBitmap
- TSC scaling
- GuestLbrCtl
- CET
- SSP
While the information is still missing from the publicly available TLFS,
the updated definition comes with a new feature bit in CPUID.0x4000000A.EBX
(bit 0). A made-up name, HV_X64_NESTED_EVMCS1_2022_UPDATE, is used for it.

Add support for the new revision to KVM. SSP, CET and GuestLbrCtl
features are not currently supported by KVM.

While at it, implement Sean's idea to use vmcs_config for setting up
L1 VMX control MSRs instead of re-reading the host MSRs.
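
For illustration only (not part of the series): a VMM which wants to
expose the updated eVMCSv1 revision to its Windows/Hyper-V guest is
expected to set the new bit in the guest-visible
HYPERV_CPUID_NESTED_FEATURES (0x4000000A) leaf before setting the
vCPU's CPUID. Roughly (the define's name and bit position come from
this series, the rest is the standard KVM_SET_CPUID2 uAPI; error
handling omitted):

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Hypothetical helper, sketch only. */
static void enable_evmcs_2022_update(int vcpu_fd, struct kvm_cpuid2 *cpuid)
{
	int i;

	for (i = 0; i < cpuid->nent; i++) {
		struct kvm_cpuid_entry2 *e = &cpuid->entries[i];

		if (e->function == 0x4000000a) {	/* HYPERV_CPUID_NESTED_FEATURES */
			e->ebx |= 1u << 0;	/* HV_X64_NESTED_EVMCS1_2022_UPDATE */
			break;
		}
	}

	ioctl(vcpu_fd, KVM_SET_CPUID2, cpuid);
}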

Jim Mattson (1):
KVM: x86: VMX: Replace some Intel model numbers with mnemonics

Sean Christopherson (1):
KVM: VMX: Adjust CR3/INVLPG interception for EPT=y at runtime, not
setup

Vitaly Kuznetsov (23):
KVM: x86: hyper-v: Expose access to debug MSRs in the partition
privilege flags
x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition
x86/hyperv: Update 'struct hv_enlightened_vmcs' definition
KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields
KVM: nVMX: Support several new fields in eVMCSv1
KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf
KVM: selftests: Add ENCLS_EXITING_BITMAP{,HIGH} VMCS fields
KVM: selftests: Switch to updated eVMCSv1 definition
KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with
enlightened VMCS
KVM: selftests: Enable TSC scaling in evmcs selftest
KVM: VMX: Get rid of eVMCS specific VMX controls sanitization
KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
setup_vmcs_config()
KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
in setup_vmcs_config()
KVM: VMX: Extend VMX controls macro shenanigans
KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
setup_vmcs_config()
KVM: VMX: Add missing VMEXIT controls to vmcs_config
KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of
setup_vmcs_config()
KVM: nVMX: Always set required-1 bits of pinbased_ctls to
PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR
KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
nested MSR

arch/x86/include/asm/hyperv-tlfs.h | 22 +-
arch/x86/include/asm/kvm_host.h | 2 +
arch/x86/kvm/hyperv.c | 12 +-
arch/x86/kvm/vmx/capabilities.h | 14 +-
arch/x86/kvm/vmx/evmcs.c | 127 +++++++---
arch/x86/kvm/vmx/evmcs.h | 18 +-
arch/x86/kvm/vmx/nested.c | 70 ++++--
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 235 ++++++++----------
arch/x86/kvm/vmx/vmx.h | 116 +++++++++
include/asm-generic/hyperv-tlfs.h | 2 +
.../selftests/kvm/include/x86_64/evmcs.h | 45 +++-
.../selftests/kvm/include/x86_64/vmx.h | 2 +
.../testing/selftests/kvm/x86_64/evmcs_test.c | 31 ++-
14 files changed, 477 insertions(+), 221 deletions(-)

--
2.35.3


2022-07-08 15:18:06

by Vitaly Kuznetsov

Subject: [PATCH v3 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS

Enlightened VMCS v1 got updated and now includes the required fields
for TSC scaling and for loading PERF_GLOBAL_CTRL upon VMENTER/VMEXIT.
For KVM on Hyper-V enablement, KVM can just observe VMX control
MSRs and use the features (with or without eVMCS) when possible.
The Hyper-V on KVM case is trickier because of live migration: the
new features require explicit enablement from the VMM so as not to
break it. Luckily, the updated eVMCS revision comes with a feature
bit in CPUID.0x4000000A.EBX.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.c | 2 +-
arch/x86/kvm/vmx/evmcs.c | 88 +++++++++++++++++++++++++++++++--------
arch/x86/kvm/vmx/evmcs.h | 17 ++------
arch/x86/kvm/vmx/nested.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 2 +-
5 files changed, 78 insertions(+), 33 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index b666902da4d9..406c5e273983 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2562,7 +2562,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
case HYPERV_CPUID_NESTED_FEATURES:
ent->eax = evmcs_ver;
ent->eax |= HV_X64_NESTED_MSR_BITMAP;
-
+ ent->ebx |= HV_X64_NESTED_EVMCS1_2022_UPDATE;
break;

case HYPERV_CPUID_SYNDBG_VENDOR_AND_MAX_FUNCTIONS:
diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 8bea5dea0341..52a53debd806 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -368,7 +368,60 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
return 0;
}

-void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
+enum evmcs_v1_revision {
+ EVMCSv1_2016,
+ EVMCSv1_2022,
+};
+
+enum evmcs_unsupported_ctrl_type {
+ EVMCS_EXIT_CTLS,
+ EVMCS_ENTRY_CTLS,
+ EVMCS_2NDEXEC,
+ EVMCS_PINCTRL,
+ EVMCS_VMFUNC,
+};
+
+static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
+ enum evmcs_unsupported_ctrl_type ctrl_type)
+{
+ struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
+ enum evmcs_v1_revision evmcs_rev = EVMCSv1_2016;
+
+ if (!hv_vcpu)
+ return 0;
+
+ if (hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_2022_UPDATE)
+ evmcs_rev = EVMCSv1_2022;
+
+ switch (ctrl_type) {
+ case EVMCS_EXIT_CTLS:
+ if (evmcs_rev == EVMCSv1_2016)
+ return EVMCS1_UNSUPPORTED_VMEXIT_CTRL |
+ VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+ else
+ return EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+ case EVMCS_ENTRY_CTLS:
+ if (evmcs_rev == EVMCSv1_2016)
+ return EVMCS1_UNSUPPORTED_VMENTRY_CTRL |
+ VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+ else
+ return EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+ case EVMCS_2NDEXEC:
+ if (evmcs_rev == EVMCSv1_2016)
+ return EVMCS1_UNSUPPORTED_2NDEXEC |
+ SECONDARY_EXEC_TSC_SCALING;
+ else
+ return EVMCS1_UNSUPPORTED_2NDEXEC;
+ case EVMCS_PINCTRL:
+ return EVMCS1_UNSUPPORTED_PINCTRL;
+ case EVMCS_VMFUNC:
+ return EVMCS1_UNSUPPORTED_VMFUNC;
+ }
+
+ return 0;
+}
+
+void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
{
u32 ctl_low = (u32)*pdata;
u32 ctl_high = (u32)(*pdata >> 32);
@@ -380,72 +433,73 @@ void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
switch (msr_index) {
case MSR_IA32_VMX_EXIT_CTLS:
case MSR_IA32_VMX_TRUE_EXIT_CTLS:
- ctl_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+ ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_EXIT_CTLS);
break;
case MSR_IA32_VMX_ENTRY_CTLS:
case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
- ctl_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+ ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_ENTRY_CTLS);
break;
case MSR_IA32_VMX_PROCBASED_CTLS2:
- ctl_high &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
+ ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_2NDEXEC);
break;
case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
case MSR_IA32_VMX_PINBASED_CTLS:
- ctl_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
+ ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_PINCTRL);
break;
case MSR_IA32_VMX_VMFUNC:
- ctl_low &= ~EVMCS1_UNSUPPORTED_VMFUNC;
+ ctl_low &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_VMFUNC);
break;
}

*pdata = ctl_low | ((u64)ctl_high << 32);
}

-int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
+int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
{
int ret = 0;
u32 unsupp_ctl;

unsupp_ctl = vmcs12->pin_based_vm_exec_control &
- EVMCS1_UNSUPPORTED_PINCTRL;
+ evmcs_get_unsupported_ctls(vcpu, EVMCS_PINCTRL);
if (unsupp_ctl) {
trace_kvm_nested_vmenter_failed(
- "eVMCS: unsupported pin-based VM-execution controls",
+ "eVMCS: unsupported pin-based VM-execution controls: ",
unsupp_ctl);
ret = -EINVAL;
}

unsupp_ctl = vmcs12->secondary_vm_exec_control &
- EVMCS1_UNSUPPORTED_2NDEXEC;
+ evmcs_get_unsupported_ctls(vcpu, EVMCS_2NDEXEC);
if (unsupp_ctl) {
trace_kvm_nested_vmenter_failed(
- "eVMCS: unsupported secondary VM-execution controls",
+ "eVMCS: unsupported secondary VM-execution controls: ",
unsupp_ctl);
ret = -EINVAL;
}

unsupp_ctl = vmcs12->vm_exit_controls &
- EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
+ evmcs_get_unsupported_ctls(vcpu, EVMCS_EXIT_CTLS);
if (unsupp_ctl) {
trace_kvm_nested_vmenter_failed(
- "eVMCS: unsupported VM-exit controls",
+ "eVMCS: unsupported VM-exit controls: ",
unsupp_ctl);
ret = -EINVAL;
}

unsupp_ctl = vmcs12->vm_entry_controls &
- EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
+ evmcs_get_unsupported_ctls(vcpu, EVMCS_ENTRY_CTLS);
if (unsupp_ctl) {
trace_kvm_nested_vmenter_failed(
- "eVMCS: unsupported VM-entry controls",
+ "eVMCS: unsupported VM-entry controls: ",
unsupp_ctl);
ret = -EINVAL;
}

- unsupp_ctl = vmcs12->vm_function_control & EVMCS1_UNSUPPORTED_VMFUNC;
+ unsupp_ctl = vmcs12->vm_function_control &
+ evmcs_get_unsupported_ctls(vcpu, EVMCS_VMFUNC);
if (unsupp_ctl) {
trace_kvm_nested_vmenter_failed(
- "eVMCS: unsupported VM-function controls",
+ "eVMCS: unsupported VM-function controls: ",
unsupp_ctl);
ret = -EINVAL;
}
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index f886a8ff0342..4b809c79ae63 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -37,16 +37,9 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
* EPTP_LIST_ADDRESS = 0x00002024,
* VMREAD_BITMAP = 0x00002026,
* VMWRITE_BITMAP = 0x00002028,
- *
- * TSC_MULTIPLIER = 0x00002032,
* PLE_GAP = 0x00004020,
* PLE_WINDOW = 0x00004022,
* VMX_PREEMPTION_TIMER_VALUE = 0x0000482E,
- * GUEST_IA32_PERF_GLOBAL_CTRL = 0x00002808,
- * HOST_IA32_PERF_GLOBAL_CTRL = 0x00002c04,
- *
- * Currently unsupported in KVM:
- * GUEST_IA32_RTIT_CTL = 0x00002814,
*/
#define EVMCS1_UNSUPPORTED_PINCTRL (PIN_BASED_POSTED_INTR | \
PIN_BASED_VMX_PREEMPTION_TIMER)
@@ -58,12 +51,10 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
SECONDARY_EXEC_ENABLE_PML | \
SECONDARY_EXEC_ENABLE_VMFUNC | \
SECONDARY_EXEC_SHADOW_VMCS | \
- SECONDARY_EXEC_TSC_SCALING | \
SECONDARY_EXEC_PAUSE_LOOP_EXITING)
#define EVMCS1_UNSUPPORTED_VMEXIT_CTRL \
- (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | \
- VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
-#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
+ (VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
+#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (0)
#define EVMCS1_UNSUPPORTED_VMFUNC (VMX_VMFUNC_EPTP_SWITCHING)

struct evmcs_field {
@@ -243,7 +234,7 @@ bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa);
uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu);
int nested_enable_evmcs(struct kvm_vcpu *vcpu,
uint16_t *vmcs_version);
-void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata);
-int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
+void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata);
+int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12);

#endif /* __KVM_X86_VMX_EVMCS_H */
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 4fc84f0f5875..dcf3ee645212 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2891,7 +2891,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
return -EINVAL;

if (to_vmx(vcpu)->nested.enlightened_vmcs_enabled)
- return nested_evmcs_check_controls(vmcs12);
+ return nested_evmcs_check_controls(vcpu, vmcs12);

return 0;
}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c30115b9cb33..b4915d841357 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1858,7 +1858,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
*/
if (!msr_info->host_initiated &&
vmx->nested.enlightened_vmcs_enabled)
- nested_evmcs_filter_control_msr(msr_info->index,
+ nested_evmcs_filter_control_msr(vcpu, msr_info->index,
&msr_info->data);
break;
case MSR_IA32_RTIT_CTL:
--
2.35.3

2022-07-08 15:19:18

by Vitaly Kuznetsov

Subject: [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization

With the updated eVMCSv1 definition, there are no known 'problematic'
controls which are exposed in VMX control MSRs but are not present in
eVMCSv1. Get rid of the VMX control MSR filtering for KVM on Hyper-V.

Note: VMX control MSR filtering for Hyper-V on KVM
(nested_evmcs_filter_control_msr()) stays, as even the updated eVMCSv1
definition doesn't cover all the features implemented by KVM and some
fields are still missing. Moreover, nested_evmcs_filter_control_msr()
has to support the original eVMCSv1 version when the VMM wishes so.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/evmcs.c | 13 -------------
arch/x86/kvm/vmx/evmcs.h | 1 -
arch/x86/kvm/vmx/vmx.c | 5 -----
3 files changed, 19 deletions(-)

diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
index 52a53debd806..b5cfbf7d487b 100644
--- a/arch/x86/kvm/vmx/evmcs.c
+++ b/arch/x86/kvm/vmx/evmcs.c
@@ -320,19 +320,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
};
const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);

-#if IS_ENABLED(CONFIG_HYPERV)
-__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
-{
- vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
- vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
- vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
- vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
-
- vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
- vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
-}
-#endif
-
bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
{
struct hv_vp_assist_page assist_page;
diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
index 4b809c79ae63..0feac101cce4 100644
--- a/arch/x86/kvm/vmx/evmcs.h
+++ b/arch/x86/kvm/vmx/evmcs.h
@@ -203,7 +203,6 @@ static inline void evmcs_load(u64 phys_addr)
vp_ap->enlighten_vmentry = 1;
}

-__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
#else /* !IS_ENABLED(CONFIG_HYPERV) */
static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
static inline void evmcs_write32(unsigned long field, u32 value) {}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index b4915d841357..dd905ad72637 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2689,11 +2689,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
vmcs_conf->vmexit_ctrl = _vmexit_control;
vmcs_conf->vmentry_ctrl = _vmentry_control;

-#if IS_ENABLED(CONFIG_HYPERV)
- if (enlightened_vmcs)
- evmcs_sanitize_exec_ctrls(vmcs_conf);
-#endif
-
return 0;
}

--
2.35.3

2022-07-08 15:20:46

by Vitaly Kuznetsov

Subject: [PATCH v3 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs

Using raw host MSR values for setting up nested VMX control MSRs is
incorrect as some features need to be disabled, e.g. when KVM runs as
a nested hypervisor on Hyper-V and uses Enlightened VMCS or when a
workaround for IA32_PERF_GLOBAL_CTRL is applied. For non-nested VMX, this
is done in setup_vmcs_config() and the result is stored in vmcs_config.
Use it for setting up allowed-1 bits in nested VMX MSRs too.

Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 30 ++++++++++++------------------
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 5 ++---
3 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 09654d5c2144..3d386c663fac 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6565,8 +6565,10 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
* bit in the high half is on if the corresponding bit in the control field
* may be on. See also vmx_control_verify().
*/
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
+void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
{
+ struct nested_vmx_msrs *msrs = &vmcs_conf->nested;
+
/*
* Note that as a general rule, the high half of the MSRs (bits in
* the control fields which may be 1) should be initialized by the
@@ -6583,11 +6585,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
*/

/* pin-based controls */
- rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
- msrs->pinbased_ctls_low,
- msrs->pinbased_ctls_high);
msrs->pinbased_ctls_low =
PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+
+ msrs->pinbased_ctls_high = vmcs_conf->pin_based_exec_ctrl;
msrs->pinbased_ctls_high &=
PIN_BASED_EXT_INTR_MASK |
PIN_BASED_NMI_EXITING |
@@ -6598,12 +6599,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
PIN_BASED_VMX_PREEMPTION_TIMER;

/* exit controls */
- rdmsr(MSR_IA32_VMX_EXIT_CTLS,
- msrs->exit_ctls_low,
- msrs->exit_ctls_high);
msrs->exit_ctls_low =
VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;

+ msrs->exit_ctls_high = vmcs_conf->vmexit_ctrl;
msrs->exit_ctls_high &=
#ifdef CONFIG_X86_64
VM_EXIT_HOST_ADDR_SPACE_SIZE |
@@ -6619,11 +6618,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
msrs->exit_ctls_low &= ~VM_EXIT_SAVE_DEBUG_CONTROLS;

/* entry controls */
- rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
- msrs->entry_ctls_low,
- msrs->entry_ctls_high);
msrs->entry_ctls_low =
VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
+
+ msrs->entry_ctls_high = vmcs_conf->vmentry_ctrl;
msrs->entry_ctls_high &=
#ifdef CONFIG_X86_64
VM_ENTRY_IA32E_MODE |
@@ -6637,11 +6635,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
msrs->entry_ctls_low &= ~VM_ENTRY_LOAD_DEBUG_CONTROLS;

/* cpu-based controls */
- rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
- msrs->procbased_ctls_low,
- msrs->procbased_ctls_high);
msrs->procbased_ctls_low =
CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+
+ msrs->procbased_ctls_high = vmcs_conf->cpu_based_exec_ctrl;
msrs->procbased_ctls_high &=
CPU_BASED_INTR_WINDOW_EXITING |
CPU_BASED_NMI_WINDOW_EXITING | CPU_BASED_USE_TSC_OFFSETTING |
@@ -6675,12 +6672,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
* depend on CPUID bits, they are added later by
* vmx_vcpu_after_set_cpuid.
*/
- if (msrs->procbased_ctls_high & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
- rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2,
- msrs->secondary_ctls_low,
- msrs->secondary_ctls_high);
-
msrs->secondary_ctls_low = 0;
+
+ msrs->secondary_ctls_high = vmcs_conf->cpu_based_2nd_exec_ctrl;
msrs->secondary_ctls_high &=
SECONDARY_EXEC_DESC |
SECONDARY_EXEC_ENABLE_RDTSCP |
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index c92cea0b8ccc..fae047c6204b 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -17,7 +17,7 @@ enum nvmx_vmentry_status {
};

void vmx_leave_nested(struct kvm_vcpu *vcpu);
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps);
+void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps);
void nested_vmx_hardware_unsetup(void);
__init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *));
void nested_vmx_set_vmcs_shadowing_bitmap(void);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e462e5b9c0a1..35285109856f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7306,7 +7306,7 @@ static int __init vmx_check_processor_compat(void)
if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
return -EIO;
if (nested)
- nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept);
+ nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
smp_processor_id());
@@ -8281,8 +8281,7 @@ static __init int hardware_setup(void)
setup_default_sgx_lepubkeyhash();

if (nested) {
- nested_vmx_setup_ctls_msrs(&vmcs_config.nested,
- vmx_capability.ept);
+ nested_vmx_setup_ctls_msrs(&vmcs_config, vmx_capability.ept);

r = nested_vmx_hardware_setup(kvm_vmx_exit_handlers);
if (r)
--
2.35.3

2022-07-08 15:22:02

by Vitaly Kuznetsov

Subject: [PATCH v3 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config

In preparation for reusing the result of setup_vmcs_config() in
nested VMX MSR setup, add the VMEXIT controls which KVM doesn't
use itself but supports for nVMX to KVM_OPT_VMX_VM_EXIT_CONTROLS and
filter them out in vmx_vmexit_ctrl().

No functional change intended.

Reviewed-by: Jim Mattson <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 4 ++++
arch/x86/kvm/vmx/vmx.h | 3 +++
2 files changed, 7 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d7170990f469..2fb89bdcbbd8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4196,6 +4196,10 @@ static u32 vmx_vmexit_ctrl(void)
{
u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;

+ /* Not used by KVM but supported for nesting. */
+ vmexit_ctrl &= ~(VM_EXIT_SAVE_IA32_PAT | VM_EXIT_SAVE_IA32_EFER |
+ VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
+
if (vmx_pt_mode_is_system())
vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
VM_EXIT_CLEAR_IA32_RTIT_CTL);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 89eaab3495a6..e9c392398f1b 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -498,8 +498,11 @@ static inline u8 vmx_get_rvi(void)
#endif
#define KVM_OPT_VMX_VM_EXIT_CONTROLS \
(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | \
+ VM_EXIT_SAVE_IA32_PAT | \
VM_EXIT_LOAD_IA32_PAT | \
+ VM_EXIT_SAVE_IA32_EFER | \
VM_EXIT_LOAD_IA32_EFER | \
+ VM_EXIT_SAVE_VMX_PREEMPTION_TIMER | \
VM_EXIT_CLEAR_BNDCFGS | \
VM_EXIT_PT_CONCEAL_PIP | \
VM_EXIT_CLEAR_IA32_RTIT_CTL)
--
2.35.3

2022-07-08 15:34:43

by Vitaly Kuznetsov

Subject: [PATCH v3 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config

Like other host VMX control MSRs, MSR_IA32_VMX_MISC can be cached in
vmcs_config to avoid the need to re-read it later, e.g. from
cpu_has_vmx_intel_pt() or cpu_has_vmx_shadow_vmcs().

No (real) functional change intended.

Reviewed-by: Jim Mattson <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/capabilities.h | 11 +++--------
arch/x86/kvm/vmx/vmx.c | 8 +++++---
2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 07e7492fe72a..07f7a9534211 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -65,6 +65,7 @@ struct vmcs_config {
u64 cpu_based_3rd_exec_ctrl;
u32 vmexit_ctrl;
u32 vmentry_ctrl;
+ u64 misc;
struct nested_vmx_msrs nested;
};
extern struct vmcs_config vmcs_config;
@@ -225,11 +226,8 @@ static inline bool cpu_has_vmx_vmfunc(void)

static inline bool cpu_has_vmx_shadow_vmcs(void)
{
- u64 vmx_msr;
-
/* check if the cpu supports writing r/o exit information fields */
- rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
- if (!(vmx_msr & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
+ if (!(vmcs_config.misc & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
return false;

return vmcs_config.cpu_based_2nd_exec_ctrl &
@@ -371,10 +369,7 @@ static inline bool cpu_has_vmx_invvpid_global(void)

static inline bool cpu_has_vmx_intel_pt(void)
{
- u64 vmx_msr;
-
- rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
- return (vmx_msr & MSR_IA32_VMX_MISC_INTEL_PT) &&
+ return (vmcs_config.misc & MSR_IA32_VMX_MISC_INTEL_PT) &&
(vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_PT_USE_GPA) &&
(vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_RTIT_CTL);
}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 35285109856f..ab091758c437 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2479,6 +2479,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
u64 _cpu_based_3rd_exec_control = 0;
u32 _vmexit_control = 0;
u32 _vmentry_control = 0;
+ u64 misc_msr;
int i;

/*
@@ -2613,6 +2614,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
if (((vmx_msr_high >> 18) & 15) != 6)
return -EIO;

+ rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
+
vmcs_conf->size = vmx_msr_high & 0x1fff;
vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;

@@ -2624,6 +2627,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
vmcs_conf->cpu_based_3rd_exec_ctrl = _cpu_based_3rd_exec_control;
vmcs_conf->vmexit_ctrl = _vmexit_control;
vmcs_conf->vmentry_ctrl = _vmentry_control;
+ vmcs_conf->misc = misc_msr;

return 0;
}
@@ -8241,11 +8245,9 @@ static __init int hardware_setup(void)

if (enable_preemption_timer) {
u64 use_timer_freq = 5000ULL * 1000 * 1000;
- u64 vmx_msr;

- rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
cpu_preemption_timer_multi =
- vmx_msr & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
+ vmcs_config.misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;

if (tsc_khz)
use_timer_freq = (u64)tsc_khz * 1000;
--
2.35.3

2022-07-08 15:34:45

by Vitaly Kuznetsov

Subject: [PATCH v3 19/25] KVM: VMX: Adjust CR3/INVLPG interception for EPT=y at runtime, not setup

From: Sean Christopherson <[email protected]>

Clear the CR3 and INVLPG interception controls at runtime based on
whether or not EPT is being _used_, as opposed to clearing the bits at
setup if EPT is _supported_ in hardware, and then restoring them when EPT
is not used. Not mucking with the base config will allow using the base
config as the starting point for emulating the VMX capability MSRs.

Signed-off-by: Sean Christopherson <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9771c771c8f5..eca6875d6732 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2501,13 +2501,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
rdmsr_safe(MSR_IA32_VMX_EPT_VPID_CAP,
&vmx_cap->ept, &vmx_cap->vpid);

- if (_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) {
- /* CR3 accesses and invlpg don't need to cause VM Exits when EPT
- enabled */
- _cpu_based_exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
- CPU_BASED_CR3_STORE_EXITING |
- CPU_BASED_INVLPG_EXITING);
- } else if (vmx_cap->ept) {
+ if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) &&
+ vmx_cap->ept) {
pr_warn_once("EPT CAP should not exist if not support "
"1-setting enable EPT VM-execution control\n");

@@ -4264,10 +4259,11 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
exec_control |= CPU_BASED_CR8_STORE_EXITING |
CPU_BASED_CR8_LOAD_EXITING;
#endif
- if (!enable_ept)
- exec_control |= CPU_BASED_CR3_STORE_EXITING |
- CPU_BASED_CR3_LOAD_EXITING |
- CPU_BASED_INVLPG_EXITING;
+ /* No need to intercept CR3 access or INVLPG when using EPT. */
+ if (enable_ept)
+ exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
+ CPU_BASED_CR3_STORE_EXITING |
+ CPU_BASED_INVLPG_EXITING);
if (kvm_mwait_in_guest(vmx->vcpu.kvm))
exec_control &= ~(CPU_BASED_MWAIT_EXITING |
CPU_BASED_MONITOR_EXITING);
--
2.35.3

2022-07-08 15:38:41

by Vitaly Kuznetsov

Subject: [PATCH v3 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf

KVM has to check the guest-visible HYPERV_CPUID_NESTED_FEATURES.EBX
CPUID leaf to know which Enlightened VMCS definition to use (original
or the 2022 update). Cache the leaf along with other Hyper-V CPUID
feature leaves to make the check quick.
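
The cached value is then checked on hot paths instead of walking the
vCPU's CPUID entries every time; roughly (a sketch of what a later
patch in this series does, where the enum value is introduced):

	if (hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_2022_UPDATE)
		evmcs_rev = EVMCSv1_2022;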

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/hyperv.c | 9 +++++++++
2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index de5a149d0971..077ec9cf3169 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -616,6 +616,8 @@ struct kvm_vcpu_hv {
u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
+ u32 nested_eax; /* HYPERV_CPUID_NESTED_FEATURES.EAX */
+ u32 nested_ebx; /* HYPERV_CPUID_NESTED_FEATURES.EBX */
} cpuid_cache;
};

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e08189211d9a..b666902da4d9 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2030,6 +2030,15 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
else
hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
+
+ entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES, 0);
+ if (entry) {
+ hv_vcpu->cpuid_cache.nested_eax = entry->eax;
+ hv_vcpu->cpuid_cache.nested_ebx = entry->ebx;
+ } else {
+ hv_vcpu->cpuid_cache.nested_eax = 0;
+ hv_vcpu->cpuid_cache.nested_ebx = 0;
+ }
}

int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
--
2.35.3

2022-07-08 15:43:16

by Vitaly Kuznetsov

Subject: [PATCH v3 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics

From: Jim Mattson <[email protected]>

Intel processor code names are more familiar to many readers than
their decimal model numbers.

Signed-off-by: Jim Mattson <[email protected]>
Reviewed-by: Sean Christopherson <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index eca6875d6732..2dff5b94c535 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2580,11 +2580,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
*/
if (boot_cpu_data.x86 == 0x6) {
switch (boot_cpu_data.x86_model) {
- case 26: /* AAK155 */
- case 30: /* AAP115 */
- case 37: /* AAT100 */
- case 44: /* BC86,AAY89,BD102 */
- case 46: /* BA97 */
+ case INTEL_FAM6_NEHALEM_EP: /* AAK155 */
+ case INTEL_FAM6_NEHALEM: /* AAP115 */
+ case INTEL_FAM6_WESTMERE: /* AAT100 */
+ case INTEL_FAM6_WESTMERE_EP: /* BC86,AAY89,BD102 */
+ case INTEL_FAM6_NEHALEM_EX: /* BA97 */
_vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
_vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
--
2.35.3

2022-07-08 15:45:36

by Vitaly Kuznetsov

Subject: [PATCH v3 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest

The updated Enlightened VMCS v1 definition enables TSC scaling; test
that SECONDARY_EXEC_TSC_SCALING can now be enabled.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
.../testing/selftests/kvm/x86_64/evmcs_test.c | 31 +++++++++++++++++--
1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/evmcs_test.c b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
index 8dda527cc080..80135b98dc3b 100644
--- a/tools/testing/selftests/kvm/x86_64/evmcs_test.c
+++ b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
@@ -18,6 +18,9 @@

#include "vmx.h"

+/* Test flags */
+#define HOST_HAS_TSC_SCALING BIT(0)
+
static int ud_count;

static void guest_ud_handler(struct ex_regs *regs)
@@ -64,11 +67,14 @@ void l2_guest_code(void)
vmcall();
rdmsr_gs_base(); /* intercepted */

+ /* TSC scaling */
+ vmcall();
+
/* Done, exit to L1 and never come back. */
vmcall();
}

-void guest_code(struct vmx_pages *vmx_pages)
+void guest_code(struct vmx_pages *vmx_pages, u64 test_flags)
{
#define L2_GUEST_STACK_SIZE 64
unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
@@ -150,6 +156,18 @@ void guest_code(struct vmx_pages *vmx_pages)
GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
GUEST_SYNC(11);

+ if (test_flags & HOST_HAS_TSC_SCALING) {
+ GUEST_ASSERT((rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
+ SECONDARY_EXEC_TSC_SCALING);
+ /* Try enabling TSC scaling */
+ vmwrite(SECONDARY_VM_EXEC_CONTROL, vmreadz(SECONDARY_VM_EXEC_CONTROL) |
+ SECONDARY_EXEC_TSC_SCALING);
+ vmwrite(TSC_MULTIPLIER, 1);
+ }
+ GUEST_ASSERT(!vmresume());
+ GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
+ GUEST_SYNC(12);
+
/* Try enlightened vmptrld with an incorrect GPA */
evmcs_vmptrld(0xdeadbeef, vmx_pages->enlightened_vmcs);
GUEST_ASSERT(vmlaunch());
@@ -204,6 +222,7 @@ int main(int argc, char *argv[])
struct kvm_vm *vm;
struct kvm_run *run;
struct ucall uc;
+ u64 test_flags = 0;
int stage;

vm = vm_create_with_one_vcpu(&vcpu, guest_code);
@@ -212,11 +231,19 @@ int main(int argc, char *argv[])
TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_ENLIGHTENED_VMCS));

+ if ((kvm_get_feature_msr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
+ SECONDARY_EXEC_TSC_SCALING) {
+ test_flags |= HOST_HAS_TSC_SCALING;
+ pr_info("TSC scaling is supported, adding to test\n");
+ } else {
+ pr_info("TSC scaling is not supported\n");
+ }
+
vcpu_set_hv_cpuid(vcpu);
vcpu_enable_evmcs(vcpu);

vcpu_alloc_vmx(vm, &vmx_pages_gva);
- vcpu_args_set(vcpu, 1, vmx_pages_gva);
+ vcpu_args_set(vcpu, 2, vmx_pages_gva, test_flags);

vm_init_descriptor_tables(vm);
vcpu_init_descriptor_tables(vcpu);
--
2.35.3

2022-07-08 15:45:50

by Vitaly Kuznetsov

Subject: [PATCH v3 08/25] KVM: selftests: Switch to updated eVMCSv1 definition

Update Enlightened VMCS definition in selftests from KVM.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
.../selftests/kvm/include/x86_64/evmcs.h | 45 +++++++++++++++++--
1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
index 3c9260f8e116..a777711d5474 100644
--- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
+++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
@@ -203,14 +203,25 @@ struct hv_enlightened_vmcs {
u32 reserved:30;
} hv_enlightenments_control;
u32 hv_vp_id;
-
+ u32 padding32_2;
u64 hv_vm_id;
u64 partition_assist_page;
u64 padding64_4[4];
u64 guest_bndcfgs;
- u64 padding64_5[7];
+ u64 guest_ia32_perf_global_ctrl;
+ u64 guest_ia32_s_cet;
+ u64 guest_ssp;
+ u64 guest_ia32_int_ssp_table_addr;
+ u64 guest_ia32_lbr_ctl;
+ u64 padding64_5[2];
u64 xss_exit_bitmap;
- u64 padding64_6[7];
+ u64 host_ia32_perf_global_ctrl;
+ u64 encls_exiting_bitmap;
+ u64 tsc_multiplier;
+ u64 host_ia32_s_cet;
+ u64 host_ssp;
+ u64 host_ia32_int_ssp_table_addr;
+ u64 padding64_6;
};

#define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE 0
@@ -656,6 +667,18 @@ static inline int evmcs_vmread(uint64_t encoding, uint64_t *value)
case VIRTUAL_PROCESSOR_ID:
*value = current_evmcs->virtual_processor_id;
break;
+ case HOST_IA32_PERF_GLOBAL_CTRL:
+ *value = current_evmcs->host_ia32_perf_global_ctrl;
+ break;
+ case GUEST_IA32_PERF_GLOBAL_CTRL:
+ *value = current_evmcs->guest_ia32_perf_global_ctrl;
+ break;
+ case ENCLS_EXITING_BITMAP:
+ *value = current_evmcs->encls_exiting_bitmap;
+ break;
+ case TSC_MULTIPLIER:
+ *value = current_evmcs->tsc_multiplier;
+ break;
default: return 1;
}

@@ -1169,6 +1192,22 @@ static inline int evmcs_vmwrite(uint64_t encoding, uint64_t value)
current_evmcs->virtual_processor_id = value;
current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_XLAT;
break;
+ case HOST_IA32_PERF_GLOBAL_CTRL:
+ current_evmcs->host_ia32_perf_global_ctrl = value;
+ current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1;
+ break;
+ case GUEST_IA32_PERF_GLOBAL_CTRL:
+ current_evmcs->guest_ia32_perf_global_ctrl = value;
+ current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1;
+ break;
+ case ENCLS_EXITING_BITMAP:
+ current_evmcs->encls_exiting_bitmap = value;
+ current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
+ break;
+ case TSC_MULTIPLIER:
+ current_evmcs->tsc_multiplier = value;
+ current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
+ break;
default: return 1;
}

--
2.35.3

2022-07-08 15:49:56

by Vitaly Kuznetsov

Subject: [PATCH v3 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()

In preparation for reusing the result of setup_vmcs_config() for setting
up nested VMX control MSRs, move the LOAD_IA32_PERF_GLOBAL_CTRL errata
handling to vmx_vmexit_ctrl()/vmx_vmentry_ctrl() and print the warning
from hardware_setup(). While it seems reasonable not to expose the
LOAD_IA32_PERF_GLOBAL_CTRL controls to an L1 hypervisor on buggy CPUs,
such a change would inevitably break live migration from older KVMs
where the controls are exposed. Keep the status quo for now; the L1
hypervisor itself is supposed to take care of the errata.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 62 ++++++++++++++++++++++++++----------------
1 file changed, 38 insertions(+), 24 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 2dff5b94c535..e462e5b9c0a1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2416,6 +2416,31 @@ static bool cpu_has_sgx(void)
return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
}

+/*
+ * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
+ * can't be used due to errata where VM Exit may incorrectly clear
+ * IA32_PERF_GLOBAL_CTRL[34:32]. Work around the errata by using the
+ * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
+ */
+static bool cpu_has_perf_global_ctrl_bug(void)
+{
+ if (boot_cpu_data.x86 == 0x6) {
+ switch (boot_cpu_data.x86_model) {
+ case INTEL_FAM6_NEHALEM_EP: /* AAK155 */
+ case INTEL_FAM6_NEHALEM: /* AAP115 */
+ case INTEL_FAM6_WESTMERE: /* AAT100 */
+ case INTEL_FAM6_WESTMERE_EP: /* BC86,AAY89,BD102 */
+ case INTEL_FAM6_NEHALEM_EX: /* BA97 */
+ return true;
+ default:
+ break;
+ }
+ }
+
+ return false;
+}
+
+
static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
u32 msr, u32 *result)
{
@@ -2572,30 +2597,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
_vmexit_control &= ~x_ctrl;
}

- /*
- * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
- * can't be used due to an errata where VM Exit may incorrectly clear
- * IA32_PERF_GLOBAL_CTRL[34:32]. Workaround the errata by using the
- * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
- */
- if (boot_cpu_data.x86 == 0x6) {
- switch (boot_cpu_data.x86_model) {
- case INTEL_FAM6_NEHALEM_EP: /* AAK155 */
- case INTEL_FAM6_NEHALEM: /* AAP115 */
- case INTEL_FAM6_WESTMERE: /* AAT100 */
- case INTEL_FAM6_WESTMERE_EP: /* BC86,AAY89,BD102 */
- case INTEL_FAM6_NEHALEM_EX: /* BA97 */
- _vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
- _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
- pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
- "does not work properly. Using workaround\n");
- break;
- default:
- break;
- }
- }
-
-
rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);

/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
@@ -4184,6 +4185,10 @@ static u32 vmx_vmentry_ctrl(void)
VM_ENTRY_LOAD_IA32_EFER |
VM_ENTRY_IA32E_MODE);

+
+ if (cpu_has_perf_global_ctrl_bug())
+ vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
+
return vmentry_ctrl;
}

@@ -4198,6 +4203,10 @@ static u32 vmx_vmexit_ctrl(void)
if (vmx_pt_mode_is_system())
vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
VM_EXIT_CLEAR_IA32_RTIT_CTL);
+
+ if (cpu_has_perf_global_ctrl_bug())
+ vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
+
/* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
return vmexit_ctrl &
~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
@@ -8113,6 +8122,11 @@ static __init int hardware_setup(void)
if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
return -EIO;

+ if (cpu_has_perf_global_ctrl_bug()) {
+ pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
+ "does not work properly. Using workaround\n");
+ }
+
if (boot_cpu_has(X86_FEATURE_NX))
kvm_enable_efer_bits(EFER_NX);

--
2.35.3

2022-07-08 15:51:39

by Vitaly Kuznetsov

Subject: [PATCH v3 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in setup_vmcs_config()

CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by
vmx_enable_{irq,nmi}_window(), handle_interrupt_window() and
handle_nmi_window(), but setup_vmcs_config() doesn't check for their
existence. Add the check and filter the controls out in vmx_exec_control().

Note: KVM explicitly supports CPUs without VIRTUAL_NMIS, and all these CPUs
supposedly lack NMI_WINDOW_EXITING too. Adjust cpu_has_virtual_nmis()
accordingly.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/capabilities.h | 3 ++-
arch/x86/kvm/vmx/vmx.c | 8 +++++++-
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 069d8d298e1d..07e7492fe72a 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -82,7 +82,8 @@ static inline bool cpu_has_vmx_basic_inout(void)

static inline bool cpu_has_virtual_nmis(void)
{
- return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS;
+ return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS &&
+ vmcs_config.cpu_based_exec_ctrl & CPU_BASED_NMI_WINDOW_EXITING;
}

static inline bool cpu_has_vmx_preemption_timer(void)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 1aaec4d19e1b..ce54f13d8da1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2487,10 +2487,12 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
CPU_BASED_MWAIT_EXITING |
CPU_BASED_MONITOR_EXITING |
CPU_BASED_INVLPG_EXITING |
- CPU_BASED_RDPMC_EXITING;
+ CPU_BASED_RDPMC_EXITING |
+ CPU_BASED_INTR_WINDOW_EXITING;

opt = CPU_BASED_TPR_SHADOW |
CPU_BASED_USE_MSR_BITMAPS |
+ CPU_BASED_NMI_WINDOW_EXITING |
CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
@@ -4299,6 +4301,10 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
{
u32 exec_control = vmcs_config.cpu_based_exec_ctrl;

+ /* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
+ exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
+ CPU_BASED_NMI_WINDOW_EXITING);
+
if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
exec_control &= ~CPU_BASED_MOV_DR_EXITING;

--
2.35.3

2022-07-08 15:52:53

by Vitaly Kuznetsov

Subject: [PATCH v3 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags

For some features, the Hyper-V spec defines two separate CPUID bits: one
indicating whether the feature is supported and another one showing
whether the guest partition was granted access to the feature ("partition
privilege mask"). 'Debug MSRs available' is one such feature. Add
the missing 'access' bit.

Fixes: f97f5a56f597 ("x86/kvm/hyper-v: Add support for synthetic debugger interface")
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/hyperv.c | 1 +
include/asm-generic/hyperv-tlfs.h | 2 ++
2 files changed, 3 insertions(+)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index e2e95a6fccfd..e08189211d9a 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -2496,6 +2496,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
ent->eax |= HV_MSR_RESET_AVAILABLE;
ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
+ ent->eax |= HV_ACCESS_DEBUG_MSRS;
ent->eax |= HV_ACCESS_REENLIGHTENMENT;

ent->ebx |= HV_POST_MESSAGES;
diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
index fdce7a4cfc6f..1d99dd296a76 100644
--- a/include/asm-generic/hyperv-tlfs.h
+++ b/include/asm-generic/hyperv-tlfs.h
@@ -70,6 +70,8 @@
#define HV_MSR_GUEST_IDLE_AVAILABLE BIT(10)
/* Partition local APIC and TSC frequency registers available */
#define HV_ACCESS_FREQUENCY_MSRS BIT(11)
+/* Debug MSRs available */
+#define HV_ACCESS_DEBUG_MSRS BIT(12)
/* AccessReenlightenmentControls privilege */
#define HV_ACCESS_REENLIGHTENMENT BIT(13)
/* AccessTscInvariantControls privilege */
--
2.35.3

2022-07-12 11:53:13

by Maxim Levitsky

Subject: Re: [PATCH v3 01/25] KVM: x86: hyper-v: Expose access to debug MSRs in the partition privilege flags

On Fri, 2022-07-08 at 16:41 +0200, Vitaly Kuznetsov wrote:
> For some features, Hyper-V spec defines two separate CPUID bits: one
> listing whether the feature is supported or not and another one showing
> whether guest partition was granted access to the feature ("partition
> privilege mask"). 'Debug MSRs available' is one of such features. Add
> the missing 'access' bit.
>
> Fixes: f97f5a56f597 ("x86/kvm/hyper-v: Add support for synthetic debugger interface")
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/kvm/hyperv.c | 1 +
> include/asm-generic/hyperv-tlfs.h | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index e2e95a6fccfd..e08189211d9a 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2496,6 +2496,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
> ent->eax |= HV_MSR_RESET_AVAILABLE;
> ent->eax |= HV_MSR_REFERENCE_TSC_AVAILABLE;
> ent->eax |= HV_ACCESS_FREQUENCY_MSRS;
> + ent->eax |= HV_ACCESS_DEBUG_MSRS;
> ent->eax |= HV_ACCESS_REENLIGHTENMENT;
>
> ent->ebx |= HV_POST_MESSAGES;
> diff --git a/include/asm-generic/hyperv-tlfs.h b/include/asm-generic/hyperv-tlfs.h
> index fdce7a4cfc6f..1d99dd296a76 100644
> --- a/include/asm-generic/hyperv-tlfs.h
> +++ b/include/asm-generic/hyperv-tlfs.h
> @@ -70,6 +70,8 @@
> #define HV_MSR_GUEST_IDLE_AVAILABLE BIT(10)
> /* Partition local APIC and TSC frequency registers available */
> #define HV_ACCESS_FREQUENCY_MSRS BIT(11)
> +/* Debug MSRs available */
> +#define HV_ACCESS_DEBUG_MSRS BIT(12)
> /* AccessReenlightenmentControls privilege */
> #define HV_ACCESS_REENLIGHTENMENT BIT(13)
> /* AccessTscInvariantControls privilege */


I guess you mean HV_FEATURE_DEBUG_MSRS_AVAILABLE and the new HV_ACCESS_DEBUG_MSRS

I checked the spec and the bits match, so I guess:

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-07-12 11:55:07

by Maxim Levitsky

Subject: Re: [PATCH v3 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> KVM has to check guest visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID
> leaf to know with Enlightened VMCS definition to use (original or 2022
> update). Cache the leaf along with other Hyper-V CPUID feature leaves
> to make the check quick.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 2 ++
> arch/x86/kvm/hyperv.c | 9 +++++++++
> 2 files changed, 11 insertions(+)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index de5a149d0971..077ec9cf3169 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -616,6 +616,8 @@ struct kvm_vcpu_hv {
> u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
> u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
> u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
> + u32 nested_eax; /* HYPERV_CPUID_NESTED_FEATURES.EAX */
> + u32 nested_ebx; /* HYPERV_CPUID_NESTED_FEATURES.EBX */
> } cpuid_cache;
> };
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index e08189211d9a..b666902da4d9 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2030,6 +2030,15 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
> hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
> else
> hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
> +
> + entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES, 0);
> + if (entry) {
> + hv_vcpu->cpuid_cache.nested_eax = entry->eax;
> + hv_vcpu->cpuid_cache.nested_ebx = entry->ebx;
> + } else {
> + hv_vcpu->cpuid_cache.nested_eax = 0;
> + hv_vcpu->cpuid_cache.nested_ebx = 0;
> + }
> }
>
> int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)


Small Nitpick:

If I understand correctly, kvm_find_cpuid_entry() can fail if userspace didn't provide the
CPUID entry.

Since the code that deals with the fallback is now repeated 3 times, how about some wrapper
function that will return all zeros for a non-present CPUID entry?
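
Something like this maybe (completely untested sketch just to illustrate the idea; the helper
name is made up):

static void kvm_hv_get_cpuid_regs(struct kvm_vcpu *vcpu, u32 function,
				  u32 *eax, u32 *ebx)
{
	struct kvm_cpuid_entry2 *entry;

	entry = kvm_find_cpuid_entry(vcpu, function, 0);

	/* Return all zeros when userspace didn't provide the entry */
	*eax = entry ? entry->eax : 0;
	*ebx = entry ? entry->ebx : 0;
}

and then the caller becomes:

	kvm_hv_get_cpuid_regs(vcpu, HYPERV_CPUID_NESTED_FEATURES,
			      &hv_vcpu->cpuid_cache.nested_eax,
			      &hv_vcpu->cpuid_cache.nested_ebx);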

That can be done later of course, so

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-07-12 12:12:43

by Maxim Levitsky

Subject: Re: [PATCH v3 21/25] KVM: VMX: Move LOAD_IA32_PERF_GLOBAL_CTRL errata handling out of setup_vmcs_config()

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> As a preparation to reusing the result of setup_vmcs_config() for setting
> up nested VMX control MSRs, move LOAD_IA32_PERF_GLOBAL_CTRL errata handling
> to vmx_vmexit_ctrl()/vmx_vmentry_ctrl() and print the warning from
> hardware_setup(). While it seems reasonable to not expose
> LOAD_IA32_PERF_GLOBAL_CTRL controls to L1 hypervisor on buggy CPUs,
> such change would inevitably break live migration from older KVMs
> where the controls are exposed. Keep the status quo for know, L1 hypervisor
> itself is supposed to take care of the errata.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
>  arch/x86/kvm/vmx/vmx.c | 62 ++++++++++++++++++++++++++----------------
>  1 file changed, 38 insertions(+), 24 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 2dff5b94c535..e462e5b9c0a1 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2416,6 +2416,31 @@ static bool cpu_has_sgx(void)
>         return cpuid_eax(0) >= 0x12 && (cpuid_eax(0x12) & BIT(0));
>  }
>  
> +/*
> + * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
> + * can't be used due to errata where VM Exit may incorrectly clear
> + * IA32_PERF_GLOBAL_CTRL[34:32]. Work around the errata by using the
> + * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
> + */
> +static bool cpu_has_perf_global_ctrl_bug(void)
> +{
> +       if (boot_cpu_data.x86 == 0x6) {
> +               switch (boot_cpu_data.x86_model) {
> +               case INTEL_FAM6_NEHALEM_EP:     /* AAK155 */
> +               case INTEL_FAM6_NEHALEM:        /* AAP115 */
> +               case INTEL_FAM6_WESTMERE:       /* AAT100 */
> +               case INTEL_FAM6_WESTMERE_EP:    /* BC86,AAY89,BD102 */
> +               case INTEL_FAM6_NEHALEM_EX:     /* BA97 */
> +                       return true;
> +               default:
> +                       break;
> +               }
> +       }
> +
> +       return false;
> +}
> +
> +
>  static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
>                                       u32 msr, u32 *result)
Nitpick: Indentation is incorrect
>  {
> @@ -2572,30 +2597,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>                 _vmexit_control &= ~x_ctrl;
>         }
>  
> -       /*
> -        * Some cpus support VM_{ENTRY,EXIT}_IA32_PERF_GLOBAL_CTRL but they
> -        * can't be used due to an errata where VM Exit may incorrectly clear
> -        * IA32_PERF_GLOBAL_CTRL[34:32].  Workaround the errata by using the
> -        * MSR load mechanism to switch IA32_PERF_GLOBAL_CTRL.
> -        */
> -       if (boot_cpu_data.x86 == 0x6) {
> -               switch (boot_cpu_data.x86_model) {
> -               case INTEL_FAM6_NEHALEM_EP:     /* AAK155 */
> -               case INTEL_FAM6_NEHALEM:        /* AAP115 */
> -               case INTEL_FAM6_WESTMERE:       /* AAT100 */
> -               case INTEL_FAM6_WESTMERE_EP:    /* BC86,AAY89,BD102 */
> -               case INTEL_FAM6_NEHALEM_EX:     /* BA97 */
> -                       _vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> -                       _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> -                       pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
> -                                       "does not work properly. Using workaround\n");
> -                       break;
> -               default:
> -                       break;
> -               }
> -       }
> -
> -
>         rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);
>  
>         /* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
> @@ -4184,6 +4185,10 @@ static u32 vmx_vmentry_ctrl(void)
>                           VM_ENTRY_LOAD_IA32_EFER |
>                           VM_ENTRY_IA32E_MODE);
>  
> +
> +       if (cpu_has_perf_global_ctrl_bug())
> +               vmentry_ctrl &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> +
>         return vmentry_ctrl;
>  }
>  
> @@ -4198,6 +4203,10 @@ static u32 vmx_vmexit_ctrl(void)
>         if (vmx_pt_mode_is_system())
>                 vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
>                                  VM_EXIT_CLEAR_IA32_RTIT_CTL);
> +
> +       if (cpu_has_perf_global_ctrl_bug())
> +               vmexit_ctrl &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> +
>         /* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
>         return vmexit_ctrl &
>                 ~(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | VM_EXIT_LOAD_IA32_EFER);
> @@ -8113,6 +8122,11 @@ static __init int hardware_setup(void)
>         if (setup_vmcs_config(&vmcs_config, &vmx_capability) < 0)
>                 return -EIO;
>  
> +       if (cpu_has_perf_global_ctrl_bug()) {
> +               pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "
> +                            "does not work properly. Using workaround\n");
> +       }
> +
>         if (boot_cpu_has(X86_FEATURE_NX))
>                 kvm_enable_efer_bits(EFER_NX);
>  

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-07-12 12:13:02

by Maxim Levitsky

Subject: Re: [PATCH v3 09/25] KVM: VMX: nVMX: Support TSC scaling and PERF_GLOBAL_CTRL with enlightened VMCS

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Enlightened VMCS v1 got updated and now includes the required fields
> for TSC scaling and loading PERF_GLOBAL_CTRL upon VMENTER/VMEXIT
> features. For Hyper-V on KVM enablement, KVM can just observe VMX control
> MSRs and use the features (with or without eVMCS) when
> possible. Hyper-V on KVM case is trickier because of live migration:
> the new features require explicit enablement from VMM to not break
> it. Luckily, the updated eVMCS revision comes with a feature bit in
> CPUID.0x4000000A.EBX.

Very tiny nitpick about the commit summary. It might make sense, instead
of specifying which new fields were added, to just say that this patch
updates the eVMCS to the 2022 revision, or something, because the patch
series as a whole adds some other fields (even if only as commented-out
entries), and a new CPUID bit to detect the new eVMCS revision.

Again, this is a very tiny nitpick, feel free to ignore.

>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/kvm/hyperv.c | 2 +-
> arch/x86/kvm/vmx/evmcs.c | 88 +++++++++++++++++++++++++++++++--------
> arch/x86/kvm/vmx/evmcs.h | 17 ++------
> arch/x86/kvm/vmx/nested.c | 2 +-
> arch/x86/kvm/vmx/vmx.c | 2 +-
> 5 files changed, 78 insertions(+), 33 deletions(-)
>
> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
> index b666902da4d9..406c5e273983 100644
> --- a/arch/x86/kvm/hyperv.c
> +++ b/arch/x86/kvm/hyperv.c
> @@ -2562,7 +2562,7 @@ int kvm_get_hv_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid2 *cpuid,
> case HYPERV_CPUID_NESTED_FEATURES:
> ent->eax = evmcs_ver;
> ent->eax |= HV_X64_NESTED_MSR_BITMAP;
> -
> + ent->ebx |= HV_X64_NESTED_EVMCS1_2022_UPDATE;
> break;
>
> case HYPERV_CPUID_SYNDBG_VENDOR_AND_MAX_FUNCTIONS:
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index 8bea5dea0341..52a53debd806 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -368,7 +368,60 @@ uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu)
> return 0;
> }
>
> -void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
> +enum evmcs_v1_revision {
> + EVMCSv1_2016,
> + EVMCSv1_2022,
> +};
> +
> +enum evmcs_unsupported_ctrl_type {
> + EVMCS_EXIT_CTLS,
> + EVMCS_ENTRY_CTLS,
> + EVMCS_2NDEXEC,
> + EVMCS_PINCTRL,
> + EVMCS_VMFUNC,
> +};
> +
> +static u32 evmcs_get_unsupported_ctls(struct kvm_vcpu *vcpu,
> + enum evmcs_unsupported_ctrl_type ctrl_type)
Tiny nitpick: line break not aligned in the above.

> +{
> + struct kvm_vcpu_hv *hv_vcpu = to_hv_vcpu(vcpu);
> + enum evmcs_v1_revision evmcs_rev = EVMCSv1_2016;
> +
> + if (!hv_vcpu)
> + return 0;
> +
> + if (hv_vcpu->cpuid_cache.nested_ebx & HV_X64_NESTED_EVMCS1_2022_UPDATE)
> + evmcs_rev = EVMCSv1_2022;
> +
> + switch (ctrl_type) {
> + case EVMCS_EXIT_CTLS:
> + if (evmcs_rev == EVMCSv1_2016)
> + return EVMCS1_UNSUPPORTED_VMEXIT_CTRL |
> + VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
> + else
> + return EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> + case EVMCS_ENTRY_CTLS:
> + if (evmcs_rev == EVMCSv1_2016)
> + return EVMCS1_UNSUPPORTED_VMENTRY_CTRL |
> + VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
> + else
> + return EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> + case EVMCS_2NDEXEC:
> + if (evmcs_rev == EVMCSv1_2016)
> + return EVMCS1_UNSUPPORTED_2NDEXEC |
> + SECONDARY_EXEC_TSC_SCALING;
> + else
> + return EVMCS1_UNSUPPORTED_2NDEXEC;
> + case EVMCS_PINCTRL:
> + return EVMCS1_UNSUPPORTED_PINCTRL;
> + case EVMCS_VMFUNC:
> + return EVMCS1_UNSUPPORTED_VMFUNC;
> + }
> +
> + return 0;
> +}
> +
> +void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
> {
> u32 ctl_low = (u32)*pdata;
> u32 ctl_high = (u32)(*pdata >> 32);
> @@ -380,72 +433,73 @@ void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata)
> switch (msr_index) {
> case MSR_IA32_VMX_EXIT_CTLS:
> case MSR_IA32_VMX_TRUE_EXIT_CTLS:
> - ctl_high &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> + ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_EXIT_CTLS);
> break;
> case MSR_IA32_VMX_ENTRY_CTLS:
> case MSR_IA32_VMX_TRUE_ENTRY_CTLS:
> - ctl_high &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> + ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_ENTRY_CTLS);
> break;
> case MSR_IA32_VMX_PROCBASED_CTLS2:
> - ctl_high &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
> + ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_2NDEXEC);
> break;
> case MSR_IA32_VMX_TRUE_PINBASED_CTLS:
> case MSR_IA32_VMX_PINBASED_CTLS:
> - ctl_high &= ~EVMCS1_UNSUPPORTED_PINCTRL;
> + ctl_high &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_PINCTRL);
> break;
> case MSR_IA32_VMX_VMFUNC:
> - ctl_low &= ~EVMCS1_UNSUPPORTED_VMFUNC;
> + ctl_low &= ~evmcs_get_unsupported_ctls(vcpu, EVMCS_VMFUNC);
> break;
> }
>
> *pdata = ctl_low | ((u64)ctl_high << 32);
> }
>
> -int nested_evmcs_check_controls(struct vmcs12 *vmcs12)
> +int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
> {
> int ret = 0;
> u32 unsupp_ctl;
>
> unsupp_ctl = vmcs12->pin_based_vm_exec_control &
> - EVMCS1_UNSUPPORTED_PINCTRL;
> + evmcs_get_unsupported_ctls(vcpu, EVMCS_PINCTRL);
> if (unsupp_ctl) {
> trace_kvm_nested_vmenter_failed(
> - "eVMCS: unsupported pin-based VM-execution controls",
> + "eVMCS: unsupported pin-based VM-execution controls: ",
> unsupp_ctl);
> ret = -EINVAL;
> }
>
> unsupp_ctl = vmcs12->secondary_vm_exec_control &
> - EVMCS1_UNSUPPORTED_2NDEXEC;
> + evmcs_get_unsupported_ctls(vcpu, EVMCS_2NDEXEC);
> if (unsupp_ctl) {
> trace_kvm_nested_vmenter_failed(
> - "eVMCS: unsupported secondary VM-execution controls",
> + "eVMCS: unsupported secondary VM-execution controls: ",
> unsupp_ctl);
> ret = -EINVAL;
> }
>
> unsupp_ctl = vmcs12->vm_exit_controls &
> - EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> + evmcs_get_unsupported_ctls(vcpu, EVMCS_EXIT_CTLS);
> if (unsupp_ctl) {
> trace_kvm_nested_vmenter_failed(
> - "eVMCS: unsupported VM-exit controls",
> + "eVMCS: unsupported VM-exit controls: ",
> unsupp_ctl);
> ret = -EINVAL;
> }
>
> unsupp_ctl = vmcs12->vm_entry_controls &
> - EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> + evmcs_get_unsupported_ctls(vcpu, EVMCS_ENTRY_CTLS);
> if (unsupp_ctl) {
> trace_kvm_nested_vmenter_failed(
> - "eVMCS: unsupported VM-entry controls",
> + "eVMCS: unsupported VM-entry controls: ",
> unsupp_ctl);
> ret = -EINVAL;
> }
>
> - unsupp_ctl = vmcs12->vm_function_control & EVMCS1_UNSUPPORTED_VMFUNC;
> + unsupp_ctl = vmcs12->vm_function_control &
> + evmcs_get_unsupported_ctls(vcpu, EVMCS_VMFUNC);
> if (unsupp_ctl) {
> trace_kvm_nested_vmenter_failed(
> - "eVMCS: unsupported VM-function controls",
> + "eVMCS: unsupported VM-function controls: ",
> unsupp_ctl);
> ret = -EINVAL;
> }
> diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
> index f886a8ff0342..4b809c79ae63 100644
> --- a/arch/x86/kvm/vmx/evmcs.h
> +++ b/arch/x86/kvm/vmx/evmcs.h
> @@ -37,16 +37,9 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
> * EPTP_LIST_ADDRESS = 0x00002024,
> * VMREAD_BITMAP = 0x00002026,
> * VMWRITE_BITMAP = 0x00002028,
> - *
> - * TSC_MULTIPLIER = 0x00002032,
> * PLE_GAP = 0x00004020,
> * PLE_WINDOW = 0x00004022,
> * VMX_PREEMPTION_TIMER_VALUE = 0x0000482E,
> - * GUEST_IA32_PERF_GLOBAL_CTRL = 0x00002808,
> - * HOST_IA32_PERF_GLOBAL_CTRL = 0x00002c04,
> - *
> - * Currently unsupported in KVM:
> - * GUEST_IA32_RTIT_CTL = 0x00002814,
> */
> #define EVMCS1_UNSUPPORTED_PINCTRL (PIN_BASED_POSTED_INTR | \
> PIN_BASED_VMX_PREEMPTION_TIMER)
> @@ -58,12 +51,10 @@ DECLARE_STATIC_KEY_FALSE(enable_evmcs);
> SECONDARY_EXEC_ENABLE_PML | \
> SECONDARY_EXEC_ENABLE_VMFUNC | \
> SECONDARY_EXEC_SHADOW_VMCS | \
> - SECONDARY_EXEC_TSC_SCALING | \
> SECONDARY_EXEC_PAUSE_LOOP_EXITING)
> #define EVMCS1_UNSUPPORTED_VMEXIT_CTRL \
> - (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | \
> - VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
> -#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL)
> + (VM_EXIT_SAVE_VMX_PREEMPTION_TIMER)
> +#define EVMCS1_UNSUPPORTED_VMENTRY_CTRL (0)
> #define EVMCS1_UNSUPPORTED_VMFUNC (VMX_VMFUNC_EPTP_SWITCHING)
>
> struct evmcs_field {
> @@ -243,7 +234,7 @@ bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa);
> uint16_t nested_get_evmcs_version(struct kvm_vcpu *vcpu);
> int nested_enable_evmcs(struct kvm_vcpu *vcpu,
> uint16_t *vmcs_version);
> -void nested_evmcs_filter_control_msr(u32 msr_index, u64 *pdata);
> -int nested_evmcs_check_controls(struct vmcs12 *vmcs12);
> +void nested_evmcs_filter_control_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata);
> +int nested_evmcs_check_controls(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12);
>
> #endif /* __KVM_X86_VMX_EVMCS_H */
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 4fc84f0f5875..dcf3ee645212 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2891,7 +2891,7 @@ static int nested_vmx_check_controls(struct kvm_vcpu *vcpu,
> return -EINVAL;
>
> if (to_vmx(vcpu)->nested.enlightened_vmcs_enabled)
> - return nested_evmcs_check_controls(vmcs12);
> + return nested_evmcs_check_controls(vcpu, vmcs12);
>
> return 0;
> }
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c30115b9cb33..b4915d841357 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1858,7 +1858,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> */
> if (!msr_info->host_initiated &&
> vmx->nested.enlightened_vmcs_enabled)
> - nested_evmcs_filter_control_msr(msr_info->index,
> + nested_evmcs_filter_control_msr(vcpu, msr_info->index,
> &msr_info->data);
> break;
> case MSR_IA32_RTIT_CTL:


Looks all good.

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky




2022-07-12 12:13:55

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v3 10/25] KVM: selftests: Enable TSC scaling in evmcs selftest

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> The updated Enlightened VMCS v1 definition enables TSC scaling, test
> that SECONDARY_EXEC_TSC_SCALING can now be enabled.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
>  .../testing/selftests/kvm/x86_64/evmcs_test.c | 31 +++++++++++++++++--
>  1 file changed, 29 insertions(+), 2 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/x86_64/evmcs_test.c b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
> index 8dda527cc080..80135b98dc3b 100644
> --- a/tools/testing/selftests/kvm/x86_64/evmcs_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/evmcs_test.c
> @@ -18,6 +18,9 @@
>  
>  #include "vmx.h"
>  
> +/* Test flags */
> +#define HOST_HAS_TSC_SCALING BIT(0)
> +
>  static int ud_count;
>  
>  static void guest_ud_handler(struct ex_regs *regs)
> @@ -64,11 +67,14 @@ void l2_guest_code(void)
>         vmcall();
>         rdmsr_gs_base(); /* intercepted */
>  
> +       /* TSC scaling */
> +       vmcall();
> +
>         /* Done, exit to L1 and never come back.  */
>         vmcall();
>  }
>  
> -void guest_code(struct vmx_pages *vmx_pages)
> +void guest_code(struct vmx_pages *vmx_pages, u64 test_flags)
>  {
>  #define L2_GUEST_STACK_SIZE 64
>         unsigned long l2_guest_stack[L2_GUEST_STACK_SIZE];
> @@ -150,6 +156,18 @@ void guest_code(struct vmx_pages *vmx_pages)
>         GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
>         GUEST_SYNC(11);
>  
> +       if (test_flags & HOST_HAS_TSC_SCALING) {
> +               GUEST_ASSERT((rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
> +                            SECONDARY_EXEC_TSC_SCALING);
> +               /* Try enabling TSC scaling */
> +               vmwrite(SECONDARY_VM_EXEC_CONTROL, vmreadz(SECONDARY_VM_EXEC_CONTROL) |
> +                       SECONDARY_EXEC_TSC_SCALING);
> +               vmwrite(TSC_MULTIPLIER, 1);
> +       }
> +       GUEST_ASSERT(!vmresume());
> +       GUEST_ASSERT(vmreadz(VM_EXIT_REASON) == EXIT_REASON_VMCALL);
> +       GUEST_SYNC(12);
> +
>         /* Try enlightened vmptrld with an incorrect GPA */
>         evmcs_vmptrld(0xdeadbeef, vmx_pages->enlightened_vmcs);
>         GUEST_ASSERT(vmlaunch());
> @@ -204,6 +222,7 @@ int main(int argc, char *argv[])
>         struct kvm_vm *vm;
>         struct kvm_run *run;
>         struct ucall uc;
> +       u64 test_flags = 0;
>         int stage;
>  
>         vm = vm_create_with_one_vcpu(&vcpu, guest_code);
> @@ -212,11 +231,19 @@ int main(int argc, char *argv[])
>         TEST_REQUIRE(kvm_has_cap(KVM_CAP_NESTED_STATE));
>         TEST_REQUIRE(kvm_has_cap(KVM_CAP_HYPERV_ENLIGHTENED_VMCS));
>  
> +       if ((kvm_get_feature_msr(MSR_IA32_VMX_PROCBASED_CTLS2) >> 32) &
> +           SECONDARY_EXEC_TSC_SCALING) {
> +               test_flags |= HOST_HAS_TSC_SCALING;
> +               pr_info("TSC scaling is supported, adding to test\n");
> +       } else {
> +               pr_info("TSC scaling is not supported\n");
> +       }
> +
>         vcpu_set_hv_cpuid(vcpu);
>         vcpu_enable_evmcs(vcpu);
>  
>         vcpu_alloc_vmx(vm, &vmx_pages_gva);
> -       vcpu_args_set(vcpu, 1, vmx_pages_gva);
> +       vcpu_args_set(vcpu, 2, vmx_pages_gva, test_flags);
>  
>         vm_init_descriptor_tables(vm);
>         vcpu_init_descriptor_tables(vcpu);

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky



2022-07-12 12:16:23

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> With the updated eVMCSv1 definition, there's no known 'problematic'
> controls which are exposed in VMX control MSRs but are not present in
> eVMCSv1. Get rid of VMX control MSRs filtering for KVM on Hyper-V.

If I understand correctly, we are talking about running KVM as a nested guest of Hyper-V here:

Don't we need to check the new CPUID bit and only then use the new fields of eVMCS,
aka check that the 'cpu' supports the updated eVMCS version?

Best regards,
Maxim Levitsky



>
> Note: VMX control MSRs filtering for Hyper-V on KVM
> (nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1
> definition doesn't have all the features implemented by KVM and some
> fields are still missing. Moreover, nested_evmcs_filter_control_msr()
> has to support the original eVMCSv1 version when VMM wishes so.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
>  arch/x86/kvm/vmx/evmcs.c | 13 -------------
>  arch/x86/kvm/vmx/evmcs.h |  1 -
>  arch/x86/kvm/vmx/vmx.c   |  5 -----
>  3 files changed, 19 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> index 52a53debd806..b5cfbf7d487b 100644
> --- a/arch/x86/kvm/vmx/evmcs.c
> +++ b/arch/x86/kvm/vmx/evmcs.c
> @@ -320,19 +320,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
>  };
>  const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
>  
> -#if IS_ENABLED(CONFIG_HYPERV)
> -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
> -{
> -       vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
> -       vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
> -       vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
> -       vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
> -
> -       vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> -       vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> -}
> -#endif
> -
>  bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
>  {
>         struct hv_vp_assist_page assist_page;
> diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
> index 4b809c79ae63..0feac101cce4 100644
> --- a/arch/x86/kvm/vmx/evmcs.h
> +++ b/arch/x86/kvm/vmx/evmcs.h
> @@ -203,7 +203,6 @@ static inline void evmcs_load(u64 phys_addr)
>         vp_ap->enlighten_vmentry = 1;
>  }
>  
> -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
>  #else /* !IS_ENABLED(CONFIG_HYPERV) */
>  static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
>  static inline void evmcs_write32(unsigned long field, u32 value) {}
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index b4915d841357..dd905ad72637 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2689,11 +2689,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         vmcs_conf->vmexit_ctrl         = _vmexit_control;
>         vmcs_conf->vmentry_ctrl        = _vmentry_control;
>  
> -#if IS_ENABLED(CONFIG_HYPERV)
> -       if (enlightened_vmcs)
> -               evmcs_sanitize_exec_ctrls(vmcs_conf);
> -#endif
> -
>         return 0;
>  }
>  


2022-07-12 12:17:36

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v3 17/25] KVM: VMX: Add missing VMEXIT controls to vmcs_config

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> As a preparation to reusing the result of setup_vmcs_config() in
> nested VMX MSR setup, add the VMEXIT controls which KVM doesn't
> use but supports for nVMX to KVM_OPT_VMX_VM_EXIT_CONTROLS and
> filter them out in vmx_vmexit_ctrl().
>
> No functional change intended.
>
> Reviewed-by: Jim Mattson <[email protected]>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
>  arch/x86/kvm/vmx/vmx.c | 4 ++++
>  arch/x86/kvm/vmx/vmx.h | 3 +++
>  2 files changed, 7 insertions(+)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index d7170990f469..2fb89bdcbbd8 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -4196,6 +4196,10 @@ static u32 vmx_vmexit_ctrl(void)
>  {
>         u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;
>  
> +       /* Not used by KVM but supported for nesting. */
> +       vmexit_ctrl &= ~(VM_EXIT_SAVE_IA32_PAT | VM_EXIT_SAVE_IA32_EFER |
> +                        VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
> +
>         if (vmx_pt_mode_is_system())
>                 vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
>                                  VM_EXIT_CLEAR_IA32_RTIT_CTL);
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index 89eaab3495a6..e9c392398f1b 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -498,8 +498,11 @@ static inline u8 vmx_get_rvi(void)
>  #endif
>  #define KVM_OPT_VMX_VM_EXIT_CONTROLS                           \
>               (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |             \
> +             VM_EXIT_SAVE_IA32_PAT |                           \
>               VM_EXIT_LOAD_IA32_PAT |                           \
> +             VM_EXIT_SAVE_IA32_EFER |                          \
>               VM_EXIT_LOAD_IA32_EFER |                          \
> +             VM_EXIT_SAVE_VMX_PREEMPTION_TIMER |               \
>               VM_EXIT_CLEAR_BNDCFGS |                           \
>               VM_EXIT_PT_CONCEAL_PIP |                          \
>               VM_EXIT_CLEAR_IA32_RTIT_CTL)


Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-07-12 12:19:06

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v3 24/25] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Like other host VMX control MSRs, MSR_IA32_VMX_MISC can be cached in
> vmcs_config to avoid the need to re-read it later, e.g. from
> cpu_has_vmx_intel_pt() or cpu_has_vmx_shadow_vmcs().
>
> No (real) functional change intended.
>
> Reviewed-by: Jim Mattson <[email protected]>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
>  arch/x86/kvm/vmx/capabilities.h | 11 +++--------
>  arch/x86/kvm/vmx/vmx.c          |  8 +++++---
>  2 files changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
> index 07e7492fe72a..07f7a9534211 100644
> --- a/arch/x86/kvm/vmx/capabilities.h
> +++ b/arch/x86/kvm/vmx/capabilities.h
> @@ -65,6 +65,7 @@ struct vmcs_config {
>         u64 cpu_based_3rd_exec_ctrl;
>         u32 vmexit_ctrl;
>         u32 vmentry_ctrl;
> +       u64 misc;
>         struct nested_vmx_msrs nested;
>  };
>  extern struct vmcs_config vmcs_config;
> @@ -225,11 +226,8 @@ static inline bool cpu_has_vmx_vmfunc(void)
>  
>  static inline bool cpu_has_vmx_shadow_vmcs(void)
>  {
> -       u64 vmx_msr;
> -
>         /* check if the cpu supports writing r/o exit information fields */
> -       rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
> -       if (!(vmx_msr & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
> +       if (!(vmcs_config.misc & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
>                 return false;
>  
>         return vmcs_config.cpu_based_2nd_exec_ctrl &
> @@ -371,10 +369,7 @@ static inline bool cpu_has_vmx_invvpid_global(void)
>  
>  static inline bool cpu_has_vmx_intel_pt(void)
>  {
> -       u64 vmx_msr;
> -
> -       rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
> -       return (vmx_msr & MSR_IA32_VMX_MISC_INTEL_PT) &&
> +       return (vmcs_config.misc & MSR_IA32_VMX_MISC_INTEL_PT) &&
>                 (vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_PT_USE_GPA) &&
>                 (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_RTIT_CTL);
>  }
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 35285109856f..ab091758c437 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2479,6 +2479,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         u64 _cpu_based_3rd_exec_control = 0;
>         u32 _vmexit_control = 0;
>         u32 _vmentry_control = 0;
> +       u64 misc_msr;
>         int i;
>  
>         /*
> @@ -2613,6 +2614,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         if (((vmx_msr_high >> 18) & 15) != 6)
>                 return -EIO;
>  
> +       rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
> +
>         vmcs_conf->size = vmx_msr_high & 0x1fff;
>         vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;
>  
> @@ -2624,6 +2627,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         vmcs_conf->cpu_based_3rd_exec_ctrl = _cpu_based_3rd_exec_control;
>         vmcs_conf->vmexit_ctrl         = _vmexit_control;
>         vmcs_conf->vmentry_ctrl        = _vmentry_control;
> +       vmcs_conf->misc = misc_msr;
>  
>         return 0;
>  }
> @@ -8241,11 +8245,9 @@ static __init int hardware_setup(void)
>  
>         if (enable_preemption_timer) {
>                 u64 use_timer_freq = 5000ULL * 1000 * 1000;
> -               u64 vmx_msr;
>  
> -               rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
>                 cpu_preemption_timer_multi =
> -                       vmx_msr & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
> +                       vmcs_config.misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
>  
>                 if (tsc_khz)
>                         use_timer_freq = (u64)tsc_khz * 1000;


Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-07-12 12:19:10

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v3 23/25] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Using raw host MSR values for setting up nested VMX control MSRs is
> incorrect as some features need to be disabled, e.g. when KVM runs as
> a nested hypervisor on Hyper-V and uses Enlightened VMCS or when a
> workaround for IA32_PERF_GLOBAL_CTRL is applied. For non-nested VMX, this
> is done in setup_vmcs_config() and the result is stored in vmcs_config.
> Use it for setting up allowed-1 bits in nested VMX MSRs too.
>
> Suggested-by: Sean Christopherson <[email protected]>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
>  arch/x86/kvm/vmx/nested.c | 30 ++++++++++++------------------
>  arch/x86/kvm/vmx/nested.h |  2 +-
>  arch/x86/kvm/vmx/vmx.c    |  5 ++---
>  3 files changed, 15 insertions(+), 22 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 09654d5c2144..3d386c663fac 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -6565,8 +6565,10 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
>   * bit in the high half is on if the corresponding bit in the control field
>   * may be on. See also vmx_control_verify().
>   */
> -void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
> +void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
>  {
> +       struct nested_vmx_msrs *msrs = &vmcs_conf->nested;
> +
>         /*
>          * Note that as a general rule, the high half of the MSRs (bits in
>          * the control fields which may be 1) should be initialized by the
> @@ -6583,11 +6585,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>          */
>  
>         /* pin-based controls */
> -       rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
> -               msrs->pinbased_ctls_low,
> -               msrs->pinbased_ctls_high);
>         msrs->pinbased_ctls_low =
>                 PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
> +
> +       msrs->pinbased_ctls_high = vmcs_conf->pin_based_exec_ctrl;
>         msrs->pinbased_ctls_high &=
>                 PIN_BASED_EXT_INTR_MASK |
>                 PIN_BASED_NMI_EXITING |
> @@ -6598,12 +6599,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>                 PIN_BASED_VMX_PREEMPTION_TIMER;
>  
>         /* exit controls */
> -       rdmsr(MSR_IA32_VMX_EXIT_CTLS,
> -               msrs->exit_ctls_low,
> -               msrs->exit_ctls_high);
>         msrs->exit_ctls_low =
>                 VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;
>  
> +       msrs->exit_ctls_high = vmcs_conf->vmexit_ctrl;
>         msrs->exit_ctls_high &=
>  #ifdef CONFIG_X86_64
>                 VM_EXIT_HOST_ADDR_SPACE_SIZE |
> @@ -6619,11 +6618,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>         msrs->exit_ctls_low &= ~VM_EXIT_SAVE_DEBUG_CONTROLS;
>  
>         /* entry controls */
> -       rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
> -               msrs->entry_ctls_low,
> -               msrs->entry_ctls_high);
>         msrs->entry_ctls_low =
>                 VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
> +
> +       msrs->entry_ctls_high = vmcs_conf->vmentry_ctrl;
>         msrs->entry_ctls_high &=
>  #ifdef CONFIG_X86_64
>                 VM_ENTRY_IA32E_MODE |
> @@ -6637,11 +6635,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>         msrs->entry_ctls_low &= ~VM_ENTRY_LOAD_DEBUG_CONTROLS;
>  
>         /* cpu-based controls */
> -       rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
> -               msrs->procbased_ctls_low,
> -               msrs->procbased_ctls_high);
>         msrs->procbased_ctls_low =
>                 CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
> +
> +       msrs->procbased_ctls_high = vmcs_conf->cpu_based_exec_ctrl;
>         msrs->procbased_ctls_high &=
>                 CPU_BASED_INTR_WINDOW_EXITING |
>                 CPU_BASED_NMI_WINDOW_EXITING | CPU_BASED_USE_TSC_OFFSETTING |
> @@ -6675,12 +6672,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
>          * depend on CPUID bits, they are added later by
>          * vmx_vcpu_after_set_cpuid.
>          */
> -       if (msrs->procbased_ctls_high & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
> -               rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2,
> -                     msrs->secondary_ctls_low,
> -                     msrs->secondary_ctls_high);
> -
>         msrs->secondary_ctls_low = 0;
> +
> +       msrs->secondary_ctls_high = vmcs_conf->cpu_based_2nd_exec_ctrl;
>         msrs->secondary_ctls_high &=
>                 SECONDARY_EXEC_DESC |
>                 SECONDARY_EXEC_ENABLE_RDTSCP |
> diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> index c92cea0b8ccc..fae047c6204b 100644
> --- a/arch/x86/kvm/vmx/nested.h
> +++ b/arch/x86/kvm/vmx/nested.h
> @@ -17,7 +17,7 @@ enum nvmx_vmentry_status {
>  };
>  
>  void vmx_leave_nested(struct kvm_vcpu *vcpu);
> -void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps);
> +void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps);
>  void nested_vmx_hardware_unsetup(void);
>  __init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *));
>  void nested_vmx_set_vmcs_shadowing_bitmap(void);
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index e462e5b9c0a1..35285109856f 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7306,7 +7306,7 @@ static int __init vmx_check_processor_compat(void)
>         if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
>                 return -EIO;
>         if (nested)
> -               nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept);
> +               nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
>         if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
>                 printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
>                                 smp_processor_id());
> @@ -8281,8 +8281,7 @@ static __init int hardware_setup(void)
>         setup_default_sgx_lepubkeyhash();
>  
>         if (nested) {
> -               nested_vmx_setup_ctls_msrs(&vmcs_config.nested,
> -                                          vmx_capability.ept);
> +               nested_vmx_setup_ctls_msrs(&vmcs_config, vmx_capability.ept);
>  
>                 r = nested_vmx_hardware_setup(kvm_vmx_exit_handlers);
>                 if (r)


Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-07-12 12:19:17

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization

Maxim Levitsky <[email protected]> writes:

> On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
>> With the updated eVMCSv1 definition, there's no known 'problematic'
>> controls which are exposed in VMX control MSRs but are not present in
>> eVMCSv1. Get rid of VMX control MSRs filtering for KVM on Hyper-V.
>
> If I understand correctly, we are talking about running KVM as a nested guest of Hyper-V here:
>
> Don't we need to check the new CPUID bit and only then use the new fields of eVMCS,
> aka check that the 'cpu' supports the updated eVMCS version?
>

I've checked various Hyper-V versions available around and it seems
there's no need for that: these new features are exposed in VMX control
MSRs only when the updated eVMCS is supported.

We can, in theory, preserve the filtering for the non-updated eVMCS version,
but I'd vote for putting a WARN_ON() or something around it: we can
eventually get rid of it if we don't get any reports.
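Not part of the series, just to illustrate the idea above: a rough fragment
of what keeping the old sanitization behind a warning might look like,
assuming evmcs_sanitize_exec_ctrls() were retained and some way to detect
the non-updated revision existed (the evmcs_2022_update flag below is
hypothetical):

	/* Hypothetical: only sanitize when the 2022 eVMCSv1 update is absent. */
	if (enlightened_vmcs && !evmcs_2022_update) {
		WARN_ON_ONCE(1);	/* not expected on any known Hyper-V version */
		evmcs_sanitize_exec_ctrls(vmcs_conf);
	}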

> Best regards,
> Maxim Levitsky
>
>
>
>>
>> Note: VMX control MSRs filtering for Hyper-V on KVM
>> (nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1
>> definition doesn't have all the features implemented by KVM and some
>> fields are still missing. Moreover, nested_evmcs_filter_control_msr()
>> has to support the original eVMCSv1 version when VMM wishes so.
>>
>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>> ---
>>  arch/x86/kvm/vmx/evmcs.c | 13 -------------
>>  arch/x86/kvm/vmx/evmcs.h |  1 -
>>  arch/x86/kvm/vmx/vmx.c   |  5 -----
>>  3 files changed, 19 deletions(-)
>>
>> diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
>> index 52a53debd806..b5cfbf7d487b 100644
>> --- a/arch/x86/kvm/vmx/evmcs.c
>> +++ b/arch/x86/kvm/vmx/evmcs.c
>> @@ -320,19 +320,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
>>  };
>>  const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
>>  
>> -#if IS_ENABLED(CONFIG_HYPERV)
>> -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
>> -{
>> -       vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
>> -       vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
>> -       vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
>> -       vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
>> -
>> -       vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
>> -       vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
>> -}
>> -#endif
>> -
>>  bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
>>  {
>>         struct hv_vp_assist_page assist_page;
>> diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
>> index 4b809c79ae63..0feac101cce4 100644
>> --- a/arch/x86/kvm/vmx/evmcs.h
>> +++ b/arch/x86/kvm/vmx/evmcs.h
>> @@ -203,7 +203,6 @@ static inline void evmcs_load(u64 phys_addr)
>>         vp_ap->enlighten_vmentry = 1;
>>  }
>>  
>> -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
>>  #else /* !IS_ENABLED(CONFIG_HYPERV) */
>>  static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
>>  static inline void evmcs_write32(unsigned long field, u32 value) {}
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index b4915d841357..dd905ad72637 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -2689,11 +2689,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>>         vmcs_conf->vmexit_ctrl         = _vmexit_control;
>>         vmcs_conf->vmentry_ctrl        = _vmentry_control;
>>  
>> -#if IS_ENABLED(CONFIG_HYPERV)
>> -       if (enlightened_vmcs)
>> -               evmcs_sanitize_exec_ctrls(vmcs_conf);
>> -#endif
>> -
>>         return 0;
>>  }
>>  
>
>

--
Vitaly

2022-07-12 12:21:25

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v3 11/25] KVM: VMX: Get rid of eVMCS specific VMX controls sanitization

On Tue, 2022-07-12 at 14:14 +0200, Vitaly Kuznetsov wrote:
> Maxim Levitsky <[email protected]> writes:
>
> > On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> > > With the updated eVMCSv1 definition, there's no known 'problematic'
> > > controls which are exposed in VMX control MSRs but are not present in
> > > eVMCSv1. Get rid of VMX control MSRs filtering for KVM on Hyper-V.
> >
> > If I understand correctly, we are talking about running KVM as a nested guest of Hyper-V here:
> >
> > Don't we need to check the new CPUID bit and only then use the new fields of eVMCS,
> > aka check that the 'cpu' supports the updated eVMCS version?
> >
>
> I've checked various Hyper-V versions available around and it seems
> there's no need for that: these new features are exposed in VMX control
> MSRs only when the updated eVMCS is supported.

Makes sense now. Might be worth a comment somewhere.
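For illustration, such a comment might read roughly like this (wording is
only a suggestion, not taken from the series):

	/*
	 * No eVMCS revision check is needed here: Hyper-V only exposes
	 * TSC scaling / PERF_GLOBAL_CTRL loading in the VMX control MSRs
	 * when the updated eVMCSv1 (2022) revision is also supported.
	 */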

Best regards,
Maxim Levitsky

>
> We can, in theory, preserve the filtering for the non-updated eVMCS version,
> but I'd vote for putting a WARN_ON() or something around it: we can
> eventually get rid of it if we don't get any reports.
>
> > Best regards,
> >         Maxim Levitsky
> >
> >
> >
> > >
> > > Note: VMX control MSRs filtering for Hyper-V on KVM
> > > (nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1
> > > definition doesn't have all the features implemented by KVM and some
> > > fields are still missing. Moreover, nested_evmcs_filter_control_msr()
> > > has to support the original eVMCSv1 version when VMM wishes so.
> > >
> > > Signed-off-by: Vitaly Kuznetsov <[email protected]>
> > > ---
> > >  arch/x86/kvm/vmx/evmcs.c | 13 -------------
> > >  arch/x86/kvm/vmx/evmcs.h |  1 -
> > >  arch/x86/kvm/vmx/vmx.c   |  5 -----
> > >  3 files changed, 19 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/vmx/evmcs.c b/arch/x86/kvm/vmx/evmcs.c
> > > index 52a53debd806..b5cfbf7d487b 100644
> > > --- a/arch/x86/kvm/vmx/evmcs.c
> > > +++ b/arch/x86/kvm/vmx/evmcs.c
> > > @@ -320,19 +320,6 @@ const struct evmcs_field vmcs_field_to_evmcs_1[] = {
> > >  };
> > >  const unsigned int nr_evmcs_1_fields = ARRAY_SIZE(vmcs_field_to_evmcs_1);
> > >  
> > > -#if IS_ENABLED(CONFIG_HYPERV)
> > > -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf)
> > > -{
> > > -       vmcs_conf->cpu_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_EXEC_CTRL;
> > > -       vmcs_conf->pin_based_exec_ctrl &= ~EVMCS1_UNSUPPORTED_PINCTRL;
> > > -       vmcs_conf->cpu_based_2nd_exec_ctrl &= ~EVMCS1_UNSUPPORTED_2NDEXEC;
> > > -       vmcs_conf->cpu_based_3rd_exec_ctrl = 0;
> > > -
> > > -       vmcs_conf->vmexit_ctrl &= ~EVMCS1_UNSUPPORTED_VMEXIT_CTRL;
> > > -       vmcs_conf->vmentry_ctrl &= ~EVMCS1_UNSUPPORTED_VMENTRY_CTRL;
> > > -}
> > > -#endif
> > > -
> > >  bool nested_enlightened_vmentry(struct kvm_vcpu *vcpu, u64 *evmcs_gpa)
> > >  {
> > >         struct hv_vp_assist_page assist_page;
> > > diff --git a/arch/x86/kvm/vmx/evmcs.h b/arch/x86/kvm/vmx/evmcs.h
> > > index 4b809c79ae63..0feac101cce4 100644
> > > --- a/arch/x86/kvm/vmx/evmcs.h
> > > +++ b/arch/x86/kvm/vmx/evmcs.h
> > > @@ -203,7 +203,6 @@ static inline void evmcs_load(u64 phys_addr)
> > >         vp_ap->enlighten_vmentry = 1;
> > >  }
> > >  
> > > -__init void evmcs_sanitize_exec_ctrls(struct vmcs_config *vmcs_conf);
> > >  #else /* !IS_ENABLED(CONFIG_HYPERV) */
> > >  static __always_inline void evmcs_write64(unsigned long field, u64 value) {}
> > >  static inline void evmcs_write32(unsigned long field, u32 value) {}
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index b4915d841357..dd905ad72637 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -2689,11 +2689,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
> > >         vmcs_conf->vmexit_ctrl         = _vmexit_control;
> > >         vmcs_conf->vmentry_ctrl        = _vmentry_control;
> > >  
> > > -#if IS_ENABLED(CONFIG_HYPERV)
> > > -       if (enlightened_vmcs)
> > > -               evmcs_sanitize_exec_ctrls(vmcs_conf);
> > > -#endif
> > > -
> > >         return 0;
> > >  }
> > >  
> >
> >
>


2022-07-12 12:31:43

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v3 13/25] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in setup_vmcs_config()

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by
> vmx_enable_{irq,nmi}_window, handle_interrupt_window(), handle_nmi_window()
> but setup_vmcs_config() doesn't check their existence. Add the check and
> filter the controls out in vmx_exec_control().
>
> Note: KVM explicitly supports CPUs without VIRTUAL_NMIS and all these CPUs
> are supposedly lacking NMI_WINDOW_EXITING too. Adjust cpu_has_virtual_nmis()
> accordingly.
>
> No functional change intended.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
>  arch/x86/kvm/vmx/capabilities.h | 3 ++-
>  arch/x86/kvm/vmx/vmx.c          | 8 +++++++-
>  2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
> index 069d8d298e1d..07e7492fe72a 100644
> --- a/arch/x86/kvm/vmx/capabilities.h
> +++ b/arch/x86/kvm/vmx/capabilities.h
> @@ -82,7 +82,8 @@ static inline bool cpu_has_vmx_basic_inout(void)
>  
>  static inline bool cpu_has_virtual_nmis(void)
>  {
> -       return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS;
> +       return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS &&
> +               vmcs_config.cpu_based_exec_ctrl & CPU_BASED_NMI_WINDOW_EXITING;
>  }

Oh, a bit off-topic: I see how KVM's VMX code handles the case of no vNMI
support. VMX has no IRET intercept, so it cheats by using the regular
interrupt window and assuming that the NMI handler will not enable normal
interrupts... Oh well...

>  
>  static inline bool cpu_has_vmx_preemption_timer(void)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 1aaec4d19e1b..ce54f13d8da1 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2487,10 +2487,12 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>               CPU_BASED_MWAIT_EXITING |
>               CPU_BASED_MONITOR_EXITING |
>               CPU_BASED_INVLPG_EXITING |
> -             CPU_BASED_RDPMC_EXITING;
> +             CPU_BASED_RDPMC_EXITING |
> +             CPU_BASED_INTR_WINDOW_EXITING;
>  
>         opt = CPU_BASED_TPR_SHADOW |
>               CPU_BASED_USE_MSR_BITMAPS |
> +             CPU_BASED_NMI_WINDOW_EXITING |
>               CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
>               CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
>         if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
> @@ -4299,6 +4301,10 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
>  {
>         u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
>  
> +       /* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
> +       exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
> +                         CPU_BASED_NMI_WINDOW_EXITING);
> +
>         if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
>                 exec_control &= ~CPU_BASED_MOV_DR_EXITING;
>  

Also makes sense.

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky


2022-07-12 12:36:06

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v3 20/25] KVM: x86: VMX: Replace some Intel model numbers with mnemonics

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> From: Jim Mattson <[email protected]>
>
> Intel processor code names are more familiar to many readers than
> their decimal model numbers.
>
> Signed-off-by: Jim Mattson <[email protected]>
> Reviewed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
>  arch/x86/kvm/vmx/vmx.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index eca6875d6732..2dff5b94c535 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2580,11 +2580,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>          */
>         if (boot_cpu_data.x86 == 0x6) {
>                 switch (boot_cpu_data.x86_model) {
> -               case 26: /* AAK155 */
> -               case 30: /* AAP115 */
> -               case 37: /* AAT100 */
> -               case 44: /* BC86,AAY89,BD102 */
> -               case 46: /* BA97 */
> +               case INTEL_FAM6_NEHALEM_EP:     /* AAK155 */
> +               case INTEL_FAM6_NEHALEM:        /* AAP115 */
> +               case INTEL_FAM6_WESTMERE:       /* AAT100 */
> +               case INTEL_FAM6_WESTMERE_EP:    /* BC86,AAY89,BD102 */
> +               case INTEL_FAM6_NEHALEM_EX:     /* BA97 */
>                         _vmentry_control &= ~VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL;
>                         _vmexit_control &= ~VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL;
>                         pr_warn_once("kvm: VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL "

I cross-checked the values; they seem correct.

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-07-12 12:36:48

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v3 19/25] KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> From: Sean Christopherson <[email protected]>
>
> Clear the CR3 and INVLPG interception controls at runtime based on
> whether or not EPT is being _used_, as opposed to clearing the bits at
> setup if EPT is _supported_ in hardware, and then restoring them when EPT
> is not used.  Not mucking with the base config will allow using the base
> config as the starting point for emulating the VMX capability MSRs.
>
> Signed-off-by: Sean Christopherson <[email protected]>
> Reviewed-by: Jim Mattson <[email protected]>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
>  arch/x86/kvm/vmx/vmx.c | 18 +++++++-----------
>  1 file changed, 7 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9771c771c8f5..eca6875d6732 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2501,13 +2501,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
>         rdmsr_safe(MSR_IA32_VMX_EPT_VPID_CAP,
>                 &vmx_cap->ept, &vmx_cap->vpid);
>  
> -       if (_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) {
> -               /* CR3 accesses and invlpg don't need to cause VM Exits when EPT
> -                  enabled */
> -               _cpu_based_exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
> -                                            CPU_BASED_CR3_STORE_EXITING |
> -                                            CPU_BASED_INVLPG_EXITING);
> -       } else if (vmx_cap->ept) {
> +       if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) &&
> +           vmx_cap->ept) {
>                 pr_warn_once("EPT CAP should not exist if not support "
>                                 "1-setting enable EPT VM-execution control\n");
>  
> @@ -4264,10 +4259,11 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
>                 exec_control |= CPU_BASED_CR8_STORE_EXITING |
>                                 CPU_BASED_CR8_LOAD_EXITING;
>  #endif
> -       if (!enable_ept)
> -               exec_control |= CPU_BASED_CR3_STORE_EXITING |
> -                               CPU_BASED_CR3_LOAD_EXITING  |
> -                               CPU_BASED_INVLPG_EXITING;
> +       /* No need to intercept CR3 access or INVPLG when using EPT. */
> +       if (enable_ept)
> +               exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
> +                                 CPU_BASED_CR3_STORE_EXITING |
> +                                 CPU_BASED_INVLPG_EXITING);
>         if (kvm_mwait_in_guest(vmx->vcpu.kvm))
>                 exec_control &= ~(CPU_BASED_MWAIT_EXITING |
>                                 CPU_BASED_MONITOR_EXITING);


Makes sense, although the word 'runtime' is a bit misleading, as we don't allow changing
'enable_ept' after kvm_intel is loaded, I think.


Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-07-12 12:39:22

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v3 08/25] KVM: selftests: Switch to updated eVMCSv1 definition

On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
> Update Enlightened VMCS definition in selftests from KVM.
>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> .../selftests/kvm/include/x86_64/evmcs.h | 45 +++++++++++++++++--
> 1 file changed, 42 insertions(+), 3 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/include/x86_64/evmcs.h b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
> index 3c9260f8e116..a777711d5474 100644
> --- a/tools/testing/selftests/kvm/include/x86_64/evmcs.h
> +++ b/tools/testing/selftests/kvm/include/x86_64/evmcs.h
> @@ -203,14 +203,25 @@ struct hv_enlightened_vmcs {
> u32 reserved:30;
> } hv_enlightenments_control;
> u32 hv_vp_id;
> -
> + u32 padding32_2;

This is not just an update, this is a fix - I do see that padding field in
the spec. It is possible that the compiler added that padding on its own to
pad everything to 8 bytes, though (see the sketch after this mail).

> u64 hv_vm_id;
> u64 partition_assist_page;
> u64 padding64_4[4];
> u64 guest_bndcfgs;
> - u64 padding64_5[7];
> + u64 guest_ia32_perf_global_ctrl;
> + u64 guest_ia32_s_cet;
> + u64 guest_ssp;
> + u64 guest_ia32_int_ssp_table_addr;
> + u64 guest_ia32_lbr_ctl;
> + u64 padding64_5[2];
Matches the spec.


> u64 xss_exit_bitmap;
> - u64 padding64_6[7];
> + u64 host_ia32_perf_global_ctrl;
> + u64 encls_exiting_bitmap;
This is swapped here as well, which explains why it was not caught in testing.

> + u64 tsc_multiplier;
> + u64 host_ia32_s_cet;
> + u64 host_ssp;
> + u64 host_ia32_int_ssp_table_addr;
> + u64 padding64_6;
> };
>
> #define HV_VMX_ENLIGHTENED_CLEAN_FIELD_NONE 0
> @@ -656,6 +667,18 @@ static inline int evmcs_vmread(uint64_t encoding, uint64_t *value)
> case VIRTUAL_PROCESSOR_ID:
> *value = current_evmcs->virtual_processor_id;
> break;
> + case HOST_IA32_PERF_GLOBAL_CTRL:
> + *value = current_evmcs->host_ia32_perf_global_ctrl;
> + break;
> + case GUEST_IA32_PERF_GLOBAL_CTRL:
> + *value = current_evmcs->guest_ia32_perf_global_ctrl;
> + break;
> + case ENCLS_EXITING_BITMAP:
> + *value = current_evmcs->encls_exiting_bitmap;
> + break;
> + case TSC_MULTIPLIER:
> + *value = current_evmcs->tsc_multiplier;
> + break;
> default: return 1;
> }
>
> @@ -1169,6 +1192,22 @@ static inline int evmcs_vmwrite(uint64_t encoding, uint64_t value)
> current_evmcs->virtual_processor_id = value;
> current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_XLAT;
> break;
> + case HOST_IA32_PERF_GLOBAL_CTRL:
> + current_evmcs->host_ia32_perf_global_ctrl = value;
> + current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_HOST_GRP1;
> + break;
> + case GUEST_IA32_PERF_GLOBAL_CTRL:
> + current_evmcs->guest_ia32_perf_global_ctrl = value;
> + current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_GUEST_GRP1;
> + break;
> + case ENCLS_EXITING_BITMAP:
> + current_evmcs->encls_exiting_bitmap = value;
> + current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
> + break;
> + case TSC_MULTIPLIER:
> + current_evmcs->tsc_multiplier = value;
> + current_evmcs->hv_clean_fields &= ~HV_VMX_ENLIGHTENED_CLEAN_FIELD_CONTROL_GRP2;
> + break;
> default: return 1;
> }
>

Other than the bug in the order of host_ia32_perf_global_ctrl/encls_exiting_bitmap, everything else looks fine to me.

So with this bug fixed:

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky
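Regarding the padding32_2 remark above: a tiny stand-alone C check
(illustrative only, with simplified field names rather than the actual
selftest header) showing when the 4-byte hole before a u64 appears
implicitly and when the explicit field is what keeps the offsets matching
the spec:

    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>

    struct natural     { uint32_t vp_id; uint64_t vm_id; };
    struct packed_miss { uint32_t vp_id; uint64_t vm_id; } __attribute__((packed));
    struct packed_pad  { uint32_t vp_id; uint32_t padding32_2; uint64_t vm_id; } __attribute__((packed));

    int main(void)
    {
            /* Non-packed: the compiler aligns the u64 and adds the hole itself. */
            assert(offsetof(struct natural, vm_id) == 8);
            /* Packed without the explicit field: no hole, layout diverges from the spec. */
            assert(offsetof(struct packed_miss, vm_id) == 4);
            /* Packed with the explicit padding32_2: offsets match again. */
            assert(offsetof(struct packed_pad, vm_id) == 8);
            return 0;
    }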


2022-07-13 16:19:43

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH v3 06/25] KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf

Maxim Levitsky <[email protected]> writes:

> On Fri, 2022-07-08 at 16:42 +0200, Vitaly Kuznetsov wrote:
>> KVM has to check the guest-visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID
>> leaf to know which Enlightened VMCS definition to use (original or 2022
>> update). Cache the leaf along with other Hyper-V CPUID feature leaves
>> to make the check quick.
>>
>> Signed-off-by: Vitaly Kuznetsov <[email protected]>
>> ---
>> arch/x86/include/asm/kvm_host.h | 2 ++
>> arch/x86/kvm/hyperv.c | 9 +++++++++
>> 2 files changed, 11 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
>> index de5a149d0971..077ec9cf3169 100644
>> --- a/arch/x86/include/asm/kvm_host.h
>> +++ b/arch/x86/include/asm/kvm_host.h
>> @@ -616,6 +616,8 @@ struct kvm_vcpu_hv {
>> u32 enlightenments_eax; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EAX */
>> u32 enlightenments_ebx; /* HYPERV_CPUID_ENLIGHTMENT_INFO.EBX */
>> u32 syndbg_cap_eax; /* HYPERV_CPUID_SYNDBG_PLATFORM_CAPABILITIES.EAX */
>> + u32 nested_eax; /* HYPERV_CPUID_NESTED_FEATURES.EAX */
>> + u32 nested_ebx; /* HYPERV_CPUID_NESTED_FEATURES.EBX */
>> } cpuid_cache;
>> };
>>
>> diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
>> index e08189211d9a..b666902da4d9 100644
>> --- a/arch/x86/kvm/hyperv.c
>> +++ b/arch/x86/kvm/hyperv.c
>> @@ -2030,6 +2030,15 @@ void kvm_hv_set_cpuid(struct kvm_vcpu *vcpu)
>> hv_vcpu->cpuid_cache.syndbg_cap_eax = entry->eax;
>> else
>> hv_vcpu->cpuid_cache.syndbg_cap_eax = 0;
>> +
>> + entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES, 0);
>> + if (entry) {
>> + hv_vcpu->cpuid_cache.nested_eax = entry->eax;
>> + hv_vcpu->cpuid_cache.nested_ebx = entry->ebx;
>> + } else {
>> + hv_vcpu->cpuid_cache.nested_eax = 0;
>> + hv_vcpu->cpuid_cache.nested_ebx = 0;
>> + }
>> }
>>
>> int kvm_hv_set_enforce_cpuid(struct kvm_vcpu *vcpu, bool enforce)
>
>
> Small Nitpick:
>
> If I understand correctly, kvm_find_cpuid_entry() can fail if userspace
> didn't provide the CPUID entry.
>
> Since the code that deals with the fallback is now repeated 3 times, how about
> a wrapper function that returns all zeros for a non-present CPUID entry?

I've opted for wiping the whole hv_vcpu->cpuid_cache with memset(); this
way we don't even need a new helper.
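
A rough sketch of that approach (the exact shape in kvm_hv_set_cpuid() is
assumed here, not quoted from the final patch): zero the whole cache up
front so a missing CPUID entry simply leaves zeroes behind and the else
branches go away:

	/* Wipe the cache first; absent leaves then read back as zero. */
	memset(&hv_vcpu->cpuid_cache, 0, sizeof(hv_vcpu->cpuid_cache));

	entry = kvm_find_cpuid_entry(vcpu, HYPERV_CPUID_NESTED_FEATURES, 0);
	if (entry) {
		hv_vcpu->cpuid_cache.nested_eax = entry->eax;
		hv_vcpu->cpuid_cache.nested_ebx = entry->ebx;
	}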

--
Vitaly