2022-06-27 16:06:48

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

Changes since RFC:
- "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
infrastructure is later used in other patches [Sean] PATCHes 1-3 added
to support the change.
- "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
added [Sean].
- Commit messages added.

vmcs_config is a sanitized version of host VMX MSRs where some controls are
filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are
discovered, some inconsistencies in controls are detected,...) but
nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
in exposing undesired controls to L1. Switch to using vmcs_config instead.

Sean Christopherson (1):
KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup

Vitaly Kuznetsov (13):
KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
setup_vmcs_config()
KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
in setup_vmcs_config()
KVM: VMX: Extend VMX controls macro shenanigans
KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
setup_vmcs_config()
KVM: VMX: Add missing VMEXIT controls to vmcs_config
KVM: VMX: Add missing VMENTRY controls to vmcs_config
KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
KVM: VMX: Store required-1 VMX controls in vmcs_config
KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
nested MSR

arch/x86/kvm/vmx/capabilities.h | 16 +--
arch/x86/kvm/vmx/nested.c | 37 +++---
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 198 ++++++++++++++------------------
arch/x86/kvm/vmx/vmx.h | 118 +++++++++++++++++++
5 files changed, 229 insertions(+), 142 deletions(-)

--
2.35.3


2022-06-27 16:06:48

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 02/14] KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in setup_vmcs_config()

CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by
vmx_enable_{irq,nmi}_window, handle_interrupt_window(), handle_nmi_window()
but setup_vmcs_config() doesn't check their existence. Add the check and
filter the controls out in vmx_exec_control().

No (real) functional change intended as all existing CPUs supporting
VMX are supposed to have these controls.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7d5c837e5a7c..ecd00fc69674 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2487,7 +2487,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
CPU_BASED_MWAIT_EXITING |
CPU_BASED_MONITOR_EXITING |
CPU_BASED_INVLPG_EXITING |
- CPU_BASED_RDPMC_EXITING;
+ CPU_BASED_RDPMC_EXITING |
+ CPU_BASED_INTR_WINDOW_EXITING |
+ CPU_BASED_NMI_WINDOW_EXITING;

opt = CPU_BASED_TPR_SHADOW |
CPU_BASED_USE_MSR_BITMAPS |
@@ -4305,6 +4307,10 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
{
u32 exec_control = vmcs_config.cpu_based_exec_ctrl;

+ /* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
+ exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
+ CPU_BASED_NMI_WINDOW_EXITING);
+
if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
exec_control &= ~CPU_BASED_MOV_DR_EXITING;

--
2.35.3

2022-06-27 16:06:48

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 04/14] KVM: VMX: Extend VMX controls macro shenanigans

When VMX controls macros are used to set or clear a control bit, make
sure that this bit was checked in setup_vmcs_config() and thus is properly
reflected in vmcs_config.

No functional change intended.

Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 99 +++++++------------------------------
arch/x86/kvm/vmx/vmx.h | 109 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 127 insertions(+), 81 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 5300f2ad6a25..7ef4bc69e2c6 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2448,7 +2448,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
struct vmx_capability *vmx_cap)
{
u32 vmx_msr_low, vmx_msr_high;
- u32 min, opt, min2, opt2;
u32 _pin_based_exec_control = 0;
u32 _cpu_based_exec_control = 0;
u32 _cpu_based_2nd_exec_control = 0;
@@ -2474,28 +2473,10 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
};

memset(vmcs_conf, 0, sizeof(*vmcs_conf));
- min = CPU_BASED_HLT_EXITING |
-#ifdef CONFIG_X86_64
- CPU_BASED_CR8_LOAD_EXITING |
- CPU_BASED_CR8_STORE_EXITING |
-#endif
- CPU_BASED_CR3_LOAD_EXITING |
- CPU_BASED_CR3_STORE_EXITING |
- CPU_BASED_UNCOND_IO_EXITING |
- CPU_BASED_MOV_DR_EXITING |
- CPU_BASED_USE_TSC_OFFSETTING |
- CPU_BASED_MWAIT_EXITING |
- CPU_BASED_MONITOR_EXITING |
- CPU_BASED_INVLPG_EXITING |
- CPU_BASED_RDPMC_EXITING |
- CPU_BASED_INTR_WINDOW_EXITING |
- CPU_BASED_NMI_WINDOW_EXITING;
-
- opt = CPU_BASED_TPR_SHADOW |
- CPU_BASED_USE_MSR_BITMAPS |
- CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
- CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
- if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PROCBASED_CTLS,
+
+ if (adjust_vmx_controls(KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL,
+ KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL,
+ MSR_IA32_VMX_PROCBASED_CTLS,
&_cpu_based_exec_control) < 0)
return -EIO;
#ifdef CONFIG_X86_64
@@ -2504,34 +2485,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
~CPU_BASED_CR8_STORE_EXITING;
#endif
if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
- min2 = 0;
- opt2 = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
- SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
- SECONDARY_EXEC_WBINVD_EXITING |
- SECONDARY_EXEC_ENABLE_VPID |
- SECONDARY_EXEC_ENABLE_EPT |
- SECONDARY_EXEC_UNRESTRICTED_GUEST |
- SECONDARY_EXEC_PAUSE_LOOP_EXITING |
- SECONDARY_EXEC_DESC |
- SECONDARY_EXEC_ENABLE_RDTSCP |
- SECONDARY_EXEC_ENABLE_INVPCID |
- SECONDARY_EXEC_APIC_REGISTER_VIRT |
- SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
- SECONDARY_EXEC_SHADOW_VMCS |
- SECONDARY_EXEC_XSAVES |
- SECONDARY_EXEC_RDSEED_EXITING |
- SECONDARY_EXEC_RDRAND_EXITING |
- SECONDARY_EXEC_ENABLE_PML |
- SECONDARY_EXEC_TSC_SCALING |
- SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
- SECONDARY_EXEC_PT_USE_GPA |
- SECONDARY_EXEC_PT_CONCEAL_VMX |
- SECONDARY_EXEC_ENABLE_VMFUNC |
- SECONDARY_EXEC_BUS_LOCK_DETECTION |
- SECONDARY_EXEC_NOTIFY_VM_EXITING |
- SECONDARY_EXEC_ENCLS_EXITING;
-
- if (adjust_vmx_controls(min2, opt2,
+ if (adjust_vmx_controls(KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL,
+ KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL,
MSR_IA32_VMX_PROCBASED_CTLS2,
&_cpu_based_2nd_exec_control) < 0)
return -EIO;
@@ -2581,30 +2536,20 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
_cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;

if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
- u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
-
- _cpu_based_3rd_exec_control = adjust_vmx_controls64(opt3,
- MSR_IA32_VMX_PROCBASED_CTLS3);
+ _cpu_based_3rd_exec_control =
+ adjust_vmx_controls64(KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL,
+ MSR_IA32_VMX_PROCBASED_CTLS3);
}

- min = VM_EXIT_SAVE_DEBUG_CONTROLS | VM_EXIT_ACK_INTR_ON_EXIT;
-#ifdef CONFIG_X86_64
- min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
-#endif
- opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
- VM_EXIT_LOAD_IA32_PAT |
- VM_EXIT_LOAD_IA32_EFER |
- VM_EXIT_CLEAR_BNDCFGS |
- VM_EXIT_PT_CONCEAL_PIP |
- VM_EXIT_CLEAR_IA32_RTIT_CTL;
- if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
+ if (adjust_vmx_controls(KVM_REQ_VMX_VM_EXIT_CONTROLS,
+ KVM_OPT_VMX_VM_EXIT_CONTROLS,
+ MSR_IA32_VMX_EXIT_CTLS,
&_vmexit_control) < 0)
return -EIO;

- min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
- opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR |
- PIN_BASED_VMX_PREEMPTION_TIMER;
- if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
+ if (adjust_vmx_controls(KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL,
+ KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL,
+ MSR_IA32_VMX_PINBASED_CTLS,
&_pin_based_exec_control) < 0)
return -EIO;

@@ -2614,17 +2559,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY))
_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;

- min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
-#ifdef CONFIG_X86_64
- min |= VM_ENTRY_IA32E_MODE;
-#endif
- opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
- VM_ENTRY_LOAD_IA32_PAT |
- VM_ENTRY_LOAD_IA32_EFER |
- VM_ENTRY_LOAD_BNDCFGS |
- VM_ENTRY_PT_CONCEAL_PIP |
- VM_ENTRY_LOAD_IA32_RTIT_CTL;
- if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
+ if (adjust_vmx_controls(KVM_REQ_VMX_VM_ENTRY_CONTROLS,
+ KVM_OPT_VMX_VM_ENTRY_CONTROLS,
+ MSR_IA32_VMX_ENTRY_CTLS,
&_vmentry_control) < 0)
return -EIO;

diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 286c88e285ea..540febecac92 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -467,6 +467,113 @@ static inline u8 vmx_get_rvi(void)
return vmcs_read16(GUEST_INTR_STATUS) & 0xff;
}

+#define __KVM_REQ_VMX_VM_ENTRY_CONTROLS \
+ (VM_ENTRY_LOAD_DEBUG_CONTROLS)
+#ifdef CONFIG_X86_64
+ #define KVM_REQ_VMX_VM_ENTRY_CONTROLS \
+ (__KVM_REQ_VMX_VM_ENTRY_CONTROLS | \
+ VM_ENTRY_IA32E_MODE)
+#else
+ #define KVM_REQ_VMX_VM_ENTRY_CONTROLS \
+ __KVM_REQ_VMX_VM_ENTRY_CONTROLS
+#endif
+#define KVM_OPT_VMX_VM_ENTRY_CONTROLS \
+ (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | \
+ VM_ENTRY_LOAD_IA32_PAT | \
+ VM_ENTRY_LOAD_IA32_EFER | \
+ VM_ENTRY_LOAD_BNDCFGS | \
+ VM_ENTRY_PT_CONCEAL_PIP | \
+ VM_ENTRY_LOAD_IA32_RTIT_CTL)
+
+#define __KVM_REQ_VMX_VM_EXIT_CONTROLS \
+ (VM_EXIT_SAVE_DEBUG_CONTROLS | \
+ VM_EXIT_ACK_INTR_ON_EXIT)
+#ifdef CONFIG_X86_64
+ #define KVM_REQ_VMX_VM_EXIT_CONTROLS \
+ (__KVM_REQ_VMX_VM_EXIT_CONTROLS | \
+ VM_EXIT_HOST_ADDR_SPACE_SIZE)
+#else
+ #define KVM_REQ_VMX_VM_EXIT_CONTROLS \
+ __KVM_REQ_VMX_VM_EXIT_CONTROLS
+#endif
+#define KVM_OPT_VMX_VM_EXIT_CONTROLS \
+ (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | \
+ VM_EXIT_LOAD_IA32_PAT | \
+ VM_EXIT_LOAD_IA32_EFER | \
+ VM_EXIT_CLEAR_BNDCFGS | \
+ VM_EXIT_PT_CONCEAL_PIP | \
+ VM_EXIT_CLEAR_IA32_RTIT_CTL)
+
+#define KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL \
+ (PIN_BASED_EXT_INTR_MASK | \
+ PIN_BASED_NMI_EXITING)
+#define KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL \
+ (PIN_BASED_VIRTUAL_NMIS | \
+ PIN_BASED_POSTED_INTR | \
+ PIN_BASED_VMX_PREEMPTION_TIMER)
+
+#define __KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL \
+ (CPU_BASED_HLT_EXITING | \
+ CPU_BASED_CR3_LOAD_EXITING | \
+ CPU_BASED_CR3_STORE_EXITING | \
+ CPU_BASED_UNCOND_IO_EXITING | \
+ CPU_BASED_MOV_DR_EXITING | \
+ CPU_BASED_USE_TSC_OFFSETTING | \
+ CPU_BASED_MWAIT_EXITING | \
+ CPU_BASED_MONITOR_EXITING | \
+ CPU_BASED_INVLPG_EXITING | \
+ CPU_BASED_RDPMC_EXITING | \
+ CPU_BASED_INTR_WINDOW_EXITING | \
+ CPU_BASED_NMI_WINDOW_EXITING)
+
+#ifdef CONFIG_X86_64
+ #define KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL \
+ (__KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL | \
+ CPU_BASED_CR8_LOAD_EXITING | \
+ CPU_BASED_CR8_STORE_EXITING)
+#else
+ #define KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL \
+ __KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL
+#endif
+
+#define KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL \
+ (CPU_BASED_TPR_SHADOW | \
+ CPU_BASED_USE_MSR_BITMAPS | \
+ CPU_BASED_ACTIVATE_SECONDARY_CONTROLS | \
+ CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
+
+#define KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL 0
+#define KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL \
+ (SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES | \
+ SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE | \
+ SECONDARY_EXEC_WBINVD_EXITING | \
+ SECONDARY_EXEC_ENABLE_VPID | \
+ SECONDARY_EXEC_ENABLE_EPT | \
+ SECONDARY_EXEC_UNRESTRICTED_GUEST | \
+ SECONDARY_EXEC_PAUSE_LOOP_EXITING | \
+ SECONDARY_EXEC_DESC | \
+ SECONDARY_EXEC_ENABLE_RDTSCP | \
+ SECONDARY_EXEC_ENABLE_INVPCID | \
+ SECONDARY_EXEC_APIC_REGISTER_VIRT | \
+ SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY | \
+ SECONDARY_EXEC_SHADOW_VMCS | \
+ SECONDARY_EXEC_XSAVES | \
+ SECONDARY_EXEC_RDSEED_EXITING | \
+ SECONDARY_EXEC_RDRAND_EXITING | \
+ SECONDARY_EXEC_ENABLE_PML | \
+ SECONDARY_EXEC_TSC_SCALING | \
+ SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE | \
+ SECONDARY_EXEC_PT_USE_GPA | \
+ SECONDARY_EXEC_PT_CONCEAL_VMX | \
+ SECONDARY_EXEC_ENABLE_VMFUNC | \
+ SECONDARY_EXEC_BUS_LOCK_DETECTION | \
+ SECONDARY_EXEC_NOTIFY_VM_EXITING | \
+ SECONDARY_EXEC_ENCLS_EXITING)
+
+#define KVM_REQ_VMX_TERTIARY_VM_EXEC_CONTROL 0
+#define KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL \
+ (TERTIARY_EXEC_IPI_VIRT)
+
#define BUILD_CONTROLS_SHADOW(lname, uname, bits) \
static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val) \
{ \
@@ -485,10 +592,12 @@ static inline u##bits lname##_controls_get(struct vcpu_vmx *vmx) \
} \
static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits val) \
{ \
+ BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname))); \
lname##_controls_set(vmx, lname##_controls_get(vmx) | val); \
} \
static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits val) \
{ \
+ BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname | KVM_OPT_VMX_##uname))); \
lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val); \
}
BUILD_CONTROLS_SHADOW(vm_entry, VM_ENTRY_CONTROLS, 32)
--
2.35.3

2022-06-27 16:07:31

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 06/14] KVM: VMX: Add missing VMEXIT controls to vmcs_config

As a preparation to reusing the result of setup_vmcs_config() in
nested VMX MSR setup, add the VMEXIT controls which KVM doesn't
use but supports for nVMX to KVM_OPT_VMX_VM_EXIT_CONTROLS and
filter them out in vmx_vmexit_ctrl().

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 4 ++++
arch/x86/kvm/vmx/vmx.h | 3 +++
2 files changed, 7 insertions(+)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d28f85801ade..15191b3e5538 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4202,6 +4202,10 @@ static u32 vmx_vmexit_ctrl(void)
{
u32 vmexit_ctrl = vmcs_config.vmexit_ctrl;

+ /* Not used by KVM but supported for nesting. */
+ vmexit_ctrl &= ~(VM_EXIT_SAVE_IA32_PAT | VM_EXIT_SAVE_IA32_EFER |
+ VM_EXIT_SAVE_VMX_PREEMPTION_TIMER);
+
if (vmx_pt_mode_is_system())
vmexit_ctrl &= ~(VM_EXIT_PT_CONCEAL_PIP |
VM_EXIT_CLEAR_IA32_RTIT_CTL);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 540febecac92..5e9127f39c19 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -498,8 +498,11 @@ static inline u8 vmx_get_rvi(void)
#endif
#define KVM_OPT_VMX_VM_EXIT_CONTROLS \
(VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | \
+ VM_EXIT_SAVE_IA32_PAT | \
VM_EXIT_LOAD_IA32_PAT | \
+ VM_EXIT_SAVE_IA32_EFER | \
VM_EXIT_LOAD_IA32_EFER | \
+ VM_EXIT_SAVE_VMX_PREEMPTION_TIMER | \
VM_EXIT_CLEAR_BNDCFGS | \
VM_EXIT_PT_CONCEAL_PIP | \
VM_EXIT_CLEAR_IA32_RTIT_CTL)
--
2.35.3

2022-06-27 16:07:55

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 09/14] KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup

From: Sean Christopherson <[email protected]>

Clear the CR3 and INVLPG interception controls at runtime based on
whether or not EPT is being _used_, as opposed to clearing the bits at
setup if EPT is _supported_ in hardware, and then restoring them when EPT
is not used. Not mucking with the base config will allow using the base
config as the starting point for emulating the VMX capability MSRs.

Signed-off-by: Sean Christopherson <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 18 +++++++-----------
1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index bad55d52aa28..aec6174686f2 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2501,13 +2501,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
rdmsr_safe(MSR_IA32_VMX_EPT_VPID_CAP,
&vmx_cap->ept, &vmx_cap->vpid);

- if (_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) {
- /* CR3 accesses and invlpg don't need to cause VM Exits when EPT
- enabled */
- _cpu_based_exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
- CPU_BASED_CR3_STORE_EXITING |
- CPU_BASED_INVLPG_EXITING);
- } else if (vmx_cap->ept) {
+ if (!(_cpu_based_2nd_exec_control & SECONDARY_EXEC_ENABLE_EPT) &&
+ vmx_cap->ept) {
pr_warn_once("EPT CAP should not exist if not support "
"1-setting enable EPT VM-execution control\n");

@@ -4273,10 +4268,11 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
exec_control |= CPU_BASED_CR8_STORE_EXITING |
CPU_BASED_CR8_LOAD_EXITING;
#endif
- if (!enable_ept)
- exec_control |= CPU_BASED_CR3_STORE_EXITING |
- CPU_BASED_CR3_LOAD_EXITING |
- CPU_BASED_INVLPG_EXITING;
+ /* No need to intercept CR3 access or INVPLG when using EPT. */
+ if (enable_ept)
+ exec_control &= ~(CPU_BASED_CR3_LOAD_EXITING |
+ CPU_BASED_CR3_STORE_EXITING |
+ CPU_BASED_INVLPG_EXITING);
if (kvm_mwait_in_guest(vmx->vcpu.kvm))
exec_control &= ~(CPU_BASED_MWAIT_EXITING |
CPU_BASED_MONITOR_EXITING);
--
2.35.3

2022-06-27 16:08:00

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 12/14] KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs

vmcs_config has the required information for setting up required-1 bits of
nested VMCS control MSRs, use it to avoid redundant rdmsr()s.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 41cac0390998..c88a9c6b4606 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6538,9 +6538,6 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
{
struct nested_vmx_msrs *msrs = &vmcs_conf->nested;

- /* Take the allowed-1 bits from KVM's sanitized VMCS configuration. */
- u32 ignore_high;
-
/*
* Note that as a general rule, the high half of the MSRs (bits in
* the control fields which may be 1) should be initialized by the
@@ -6557,8 +6554,7 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
*/

/* pin-based controls */
- rdmsr(MSR_IA32_VMX_PINBASED_CTLS, msrs->pinbased_ctls_low, ignore_high);
- msrs->pinbased_ctls_low |=
+ msrs->pinbased_ctls_low = vmcs_conf->pin_based_exec_ctrl_req1 |
PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;

msrs->pinbased_ctls_high = vmcs_conf->pin_based_exec_ctrl;
--
2.35.3

2022-06-27 16:08:07

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 08/14] KVM: VMX: Add missing CPU based VM execution controls to vmcs_config

As a preparation to reusing the result of setup_vmcs_config() in
nested VMX MSR setup, add the CPU based VM execution controls which KVM
doesn't use but supports for nVMX to KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL
and filter them out in vmx_exec_control().

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 6 ++++++
arch/x86/kvm/vmx/vmx.h | 6 +++++-
2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3846a8c7102a..bad55d52aa28 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4249,6 +4249,12 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
{
u32 exec_control = vmcs_config.cpu_based_exec_ctrl;

+ /* Not used by KVM but supported for nesting. */
+ exec_control &= ~(CPU_BASED_RDTSC_EXITING |
+ CPU_BASED_USE_IO_BITMAPS |
+ CPU_BASED_MONITOR_TRAP_FLAG |
+ CPU_BASED_PAUSE_EXITING);
+
/* INTR_WINDOW_EXITING and NMI_WINDOW_EXITING are toggled dynamically */
exec_control &= ~(CPU_BASED_INTR_WINDOW_EXITING |
CPU_BASED_NMI_WINDOW_EXITING);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 6b44f4c1d45f..2ba1f99a8671 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -542,8 +542,12 @@ static inline u8 vmx_get_rvi(void)
#endif

#define KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL \
- (CPU_BASED_TPR_SHADOW | \
+ (CPU_BASED_RDTSC_EXITING | \
+ CPU_BASED_TPR_SHADOW | \
+ CPU_BASED_USE_IO_BITMAPS | \
+ CPU_BASED_MONITOR_TRAP_FLAG | \
CPU_BASED_USE_MSR_BITMAPS | \
+ CPU_BASED_PAUSE_EXITING | \
CPU_BASED_ACTIVATE_SECONDARY_CONTROLS | \
CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)

--
2.35.3

2022-06-27 16:08:12

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 10/14] KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs

Using raw host MSR values for setting up nested VMX control MSRs is
incorrect as some features need to disabled, e.g. when KVM runs as
a nested hypervisor on Hyper-V and uses Enlightened VMCS or when a
workaround for IA32_PERF_GLOBAL_CTRL is applied. For non-nested VMX, this
is done in setup_vmcs_config() and the result is stored in vmcs_config.
Use it for setting up allowed-1 bits in nested VMX MSRs too.

Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 34 ++++++++++++++++------------------
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 5 ++---
3 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 778f82015f03..41cac0390998 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6534,8 +6534,13 @@ static u64 nested_vmx_calc_vmcs_enum_msr(void)
* bit in the high half is on if the corresponding bit in the control field
* may be on. See also vmx_control_verify().
*/
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
+void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
{
+ struct nested_vmx_msrs *msrs = &vmcs_conf->nested;
+
+ /* Take the allowed-1 bits from KVM's sanitized VMCS configuration. */
+ u32 ignore_high;
+
/*
* Note that as a general rule, the high half of the MSRs (bits in
* the control fields which may be 1) should be initialized by the
@@ -6552,11 +6557,11 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
*/

/* pin-based controls */
- rdmsr(MSR_IA32_VMX_PINBASED_CTLS,
- msrs->pinbased_ctls_low,
- msrs->pinbased_ctls_high);
+ rdmsr(MSR_IA32_VMX_PINBASED_CTLS, msrs->pinbased_ctls_low, ignore_high);
msrs->pinbased_ctls_low |=
PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+
+ msrs->pinbased_ctls_high = vmcs_conf->pin_based_exec_ctrl;
msrs->pinbased_ctls_high &=
PIN_BASED_EXT_INTR_MASK |
PIN_BASED_NMI_EXITING |
@@ -6567,12 +6572,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
PIN_BASED_VMX_PREEMPTION_TIMER;

/* exit controls */
- rdmsr(MSR_IA32_VMX_EXIT_CTLS,
- msrs->exit_ctls_low,
- msrs->exit_ctls_high);
msrs->exit_ctls_low =
VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR;

+ msrs->exit_ctls_high = vmcs_conf->vmexit_ctrl;
msrs->exit_ctls_high &=
#ifdef CONFIG_X86_64
VM_EXIT_HOST_ADDR_SPACE_SIZE |
@@ -6588,11 +6591,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
msrs->exit_ctls_low &= ~VM_EXIT_SAVE_DEBUG_CONTROLS;

/* entry controls */
- rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
- msrs->entry_ctls_low,
- msrs->entry_ctls_high);
msrs->entry_ctls_low =
VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR;
+
+ msrs->entry_ctls_high = vmcs_conf->vmentry_ctrl;
msrs->entry_ctls_high &=
#ifdef CONFIG_X86_64
VM_ENTRY_IA32E_MODE |
@@ -6606,11 +6608,10 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
msrs->entry_ctls_low &= ~VM_ENTRY_LOAD_DEBUG_CONTROLS;

/* cpu-based controls */
- rdmsr(MSR_IA32_VMX_PROCBASED_CTLS,
- msrs->procbased_ctls_low,
- msrs->procbased_ctls_high);
msrs->procbased_ctls_low =
CPU_BASED_ALWAYSON_WITHOUT_TRUE_MSR;
+
+ msrs->procbased_ctls_high = vmcs_conf->cpu_based_exec_ctrl;
msrs->procbased_ctls_high &=
CPU_BASED_INTR_WINDOW_EXITING |
CPU_BASED_NMI_WINDOW_EXITING | CPU_BASED_USE_TSC_OFFSETTING |
@@ -6644,12 +6645,9 @@ void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps)
* depend on CPUID bits, they are added later by
* vmx_vcpu_after_set_cpuid.
*/
- if (msrs->procbased_ctls_high & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS)
- rdmsr(MSR_IA32_VMX_PROCBASED_CTLS2,
- msrs->secondary_ctls_low,
- msrs->secondary_ctls_high);
-
msrs->secondary_ctls_low = 0;
+
+ msrs->secondary_ctls_high = vmcs_conf->cpu_based_2nd_exec_ctrl;
msrs->secondary_ctls_high &=
SECONDARY_EXEC_DESC |
SECONDARY_EXEC_ENABLE_RDTSCP |
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index c92cea0b8ccc..fae047c6204b 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -17,7 +17,7 @@ enum nvmx_vmentry_status {
};

void vmx_leave_nested(struct kvm_vcpu *vcpu);
-void nested_vmx_setup_ctls_msrs(struct nested_vmx_msrs *msrs, u32 ept_caps);
+void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps);
void nested_vmx_hardware_unsetup(void);
__init int nested_vmx_hardware_setup(int (*exit_handlers[])(struct kvm_vcpu *));
void nested_vmx_set_vmcs_shadowing_bitmap(void);
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index aec6174686f2..faac50f7578d 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7306,7 +7306,7 @@ static int __init vmx_check_processor_compat(void)
if (setup_vmcs_config(&vmcs_conf, &vmx_cap) < 0)
return -EIO;
if (nested)
- nested_vmx_setup_ctls_msrs(&vmcs_conf.nested, vmx_cap.ept);
+ nested_vmx_setup_ctls_msrs(&vmcs_conf, vmx_cap.ept);
if (memcmp(&vmcs_config, &vmcs_conf, sizeof(struct vmcs_config)) != 0) {
printk(KERN_ERR "kvm: CPU %d feature inconsistency!\n",
smp_processor_id());
@@ -8276,8 +8276,7 @@ static __init int hardware_setup(void)
setup_default_sgx_lepubkeyhash();

if (nested) {
- nested_vmx_setup_ctls_msrs(&vmcs_config.nested,
- vmx_capability.ept);
+ nested_vmx_setup_ctls_msrs(&vmcs_config, vmx_capability.ept);

r = nested_vmx_hardware_setup(kvm_vmx_exit_handlers);
if (r)
--
2.35.3

2022-06-27 16:08:31

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 11/14] KVM: VMX: Store required-1 VMX controls in vmcs_config

While constructing nested VMX MSRs values, nested_vmx_setup_ctls_msrs()
has to re-read host VMX control MSRs to get required-1 bits which are not
stored anywhre. Add this missing information to vmcs_config.

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/capabilities.h | 5 +++++
arch/x86/kvm/vmx/vmx.c | 28 +++++++++++++++++++++-------
2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 069d8d298e1d..2e223440e7ed 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -60,11 +60,16 @@ struct vmcs_config {
u32 basic_cap;
u32 revision_id;
u32 pin_based_exec_ctrl;
+ u32 pin_based_exec_ctrl_req1;
u32 cpu_based_exec_ctrl;
+ u32 cpu_based_exec_ctrl_req1;
u32 cpu_based_2nd_exec_ctrl;
+ u32 cpu_based_2nd_exec_ctrl_req1;
u64 cpu_based_3rd_exec_ctrl;
u32 vmexit_ctrl;
+ u32 vmexit_ctrl_req1;
u32 vmentry_ctrl;
+ u32 vmentry_ctrl_req1;
struct nested_vmx_msrs nested;
};
extern struct vmcs_config vmcs_config;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index faac50f7578d..c1bbbe1c6d9f 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2417,7 +2417,7 @@ static bool cpu_has_sgx(void)
}

static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
- u32 msr, u32 *result)
+ u32 msr, u32 *result_high, u32 *result_low)
{
u32 vmx_msr_low, vmx_msr_high;
u32 ctl = ctl_min | ctl_opt;
@@ -2431,7 +2431,8 @@ static __init int adjust_vmx_controls(u32 ctl_min, u32 ctl_opt,
if (ctl_min & ~ctl)
return -EIO;

- *result = ctl;
+ *result_high = ctl;
+ *result_low = vmx_msr_low;
return 0;
}

@@ -2454,6 +2455,11 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
u64 _cpu_based_3rd_exec_control = 0;
u32 _vmexit_control = 0;
u32 _vmentry_control = 0;
+ u32 _pin_based_exec_control_low = 0;
+ u32 _cpu_based_exec_control_low = 0;
+ u32 _cpu_based_2nd_exec_control_low = 0;
+ u32 _vmexit_control_low = 0;
+ u32 _vmentry_control_low = 0;
int i;

/*
@@ -2477,13 +2483,15 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
if (adjust_vmx_controls(KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL,
KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL,
MSR_IA32_VMX_PROCBASED_CTLS,
- &_cpu_based_exec_control) < 0)
+ &_cpu_based_exec_control,
+ &_cpu_based_exec_control_low) < 0)
return -EIO;
if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
if (adjust_vmx_controls(KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL,
KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL,
MSR_IA32_VMX_PROCBASED_CTLS2,
- &_cpu_based_2nd_exec_control) < 0)
+ &_cpu_based_2nd_exec_control,
+ &_cpu_based_2nd_exec_control_low) < 0)
return -EIO;
}
#ifndef CONFIG_X86_64
@@ -2534,13 +2542,14 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
if (adjust_vmx_controls(KVM_REQ_VMX_VM_EXIT_CONTROLS,
KVM_OPT_VMX_VM_EXIT_CONTROLS,
MSR_IA32_VMX_EXIT_CTLS,
- &_vmexit_control) < 0)
+ &_vmexit_control, &_vmexit_control_low) < 0)
return -EIO;

if (adjust_vmx_controls(KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL,
KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL,
MSR_IA32_VMX_PINBASED_CTLS,
- &_pin_based_exec_control) < 0)
+ &_pin_based_exec_control,
+ &_pin_based_exec_control_low) < 0)
return -EIO;

if (cpu_has_broken_vmx_preemption_timer())
@@ -2552,7 +2561,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
if (adjust_vmx_controls(KVM_REQ_VMX_VM_ENTRY_CONTROLS,
KVM_OPT_VMX_VM_ENTRY_CONTROLS,
MSR_IA32_VMX_ENTRY_CTLS,
- &_vmentry_control) < 0)
+ &_vmentry_control, &_vmentry_control_low) < 0)
return -EIO;

for (i = 0; i < ARRAY_SIZE(vmcs_entry_exit_pairs); i++) {
@@ -2618,11 +2627,16 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
vmcs_conf->revision_id = vmx_msr_low;

vmcs_conf->pin_based_exec_ctrl = _pin_based_exec_control;
+ vmcs_conf->pin_based_exec_ctrl_req1 = _pin_based_exec_control_low;
vmcs_conf->cpu_based_exec_ctrl = _cpu_based_exec_control;
+ vmcs_conf->cpu_based_exec_ctrl_req1 = _cpu_based_exec_control_low;
vmcs_conf->cpu_based_2nd_exec_ctrl = _cpu_based_2nd_exec_control;
+ vmcs_conf->cpu_based_2nd_exec_ctrl_req1 = _cpu_based_2nd_exec_control_low;
vmcs_conf->cpu_based_3rd_exec_ctrl = _cpu_based_3rd_exec_control;
vmcs_conf->vmexit_ctrl = _vmexit_control;
+ vmcs_conf->vmexit_ctrl_req1 = _vmexit_control_low;
vmcs_conf->vmentry_ctrl = _vmentry_control;
+ vmcs_conf->vmentry_ctrl_req1 = _vmentry_control_low;

#if IS_ENABLED(CONFIG_HYPERV)
if (enlightened_vmcs)
--
2.35.3

2022-06-27 16:08:31

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 13/14] KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config

Like other host VMX control MSRs, MSR_IA32_VMX_MISC can be cached in
vmcs_config to avoid the need to re-read it later, e.g. from
cpu_has_vmx_intel_pt() or cpu_has_vmx_shadow_vmcs().

No (real) functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/capabilities.h | 11 +++--------
arch/x86/kvm/vmx/vmx.c | 8 +++++---
2 files changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 2e223440e7ed..9a73087c8314 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -70,6 +70,7 @@ struct vmcs_config {
u32 vmexit_ctrl_req1;
u32 vmentry_ctrl;
u32 vmentry_ctrl_req1;
+ u64 misc;
struct nested_vmx_msrs nested;
};
extern struct vmcs_config vmcs_config;
@@ -229,11 +230,8 @@ static inline bool cpu_has_vmx_vmfunc(void)

static inline bool cpu_has_vmx_shadow_vmcs(void)
{
- u64 vmx_msr;
-
/* check if the cpu supports writing r/o exit information fields */
- rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
- if (!(vmx_msr & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
+ if (!(vmcs_config.misc & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
return false;

return vmcs_config.cpu_based_2nd_exec_ctrl &
@@ -375,10 +373,7 @@ static inline bool cpu_has_vmx_invvpid_global(void)

static inline bool cpu_has_vmx_intel_pt(void)
{
- u64 vmx_msr;
-
- rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
- return (vmx_msr & MSR_IA32_VMX_MISC_INTEL_PT) &&
+ return (vmcs_config.misc & MSR_IA32_VMX_MISC_INTEL_PT) &&
(vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_PT_USE_GPA) &&
(vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_RTIT_CTL);
}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c1bbbe1c6d9f..878da8aa775a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2460,6 +2460,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
u32 _cpu_based_2nd_exec_control_low = 0;
u32 _vmexit_control_low = 0;
u32 _vmentry_control_low = 0;
+ u64 misc_msr;
int i;

/*
@@ -2621,6 +2622,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
if (((vmx_msr_high >> 18) & 15) != 6)
return -EIO;

+ rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
+
vmcs_conf->size = vmx_msr_high & 0x1fff;
vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;

@@ -2637,6 +2640,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
vmcs_conf->vmexit_ctrl_req1 = _vmexit_control_low;
vmcs_conf->vmentry_ctrl = _vmentry_control;
vmcs_conf->vmentry_ctrl_req1 = _vmentry_control_low;
+ vmcs_conf->misc = misc_msr;

#if IS_ENABLED(CONFIG_HYPERV)
if (enlightened_vmcs)
@@ -8250,11 +8254,9 @@ static __init int hardware_setup(void)

if (enable_preemption_timer) {
u64 use_timer_freq = 5000ULL * 1000 * 1000;
- u64 vmx_msr;

- rdmsrl(MSR_IA32_VMX_MISC, vmx_msr);
cpu_preemption_timer_multi =
- vmx_msr & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
+ vmcs_config.misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;

if (tsc_khz)
use_timer_freq = (u64)tsc_khz * 1000;
--
2.35.3

2022-06-27 16:10:12

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 14/14] KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up nested MSR

vmcs_config has cased host MSR_IA32_VMX_MISC value, use it for setting
up nested MSR_IA32_VMX_MISC in nested_vmx_setup_ctls_msrs() and avoid the
redundant rdmsr().

No (real) functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/nested.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index c88a9c6b4606..a35b28261d31 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -6723,10 +6723,7 @@ void nested_vmx_setup_ctls_msrs(struct vmcs_config *vmcs_conf, u32 ept_caps)
msrs->secondary_ctls_high |= SECONDARY_EXEC_ENCLS_EXITING;

/* miscellaneous data */
- rdmsr(MSR_IA32_VMX_MISC,
- msrs->misc_low,
- msrs->misc_high);
- msrs->misc_low &= VMX_MISC_SAVE_EFER_LMA;
+ msrs->misc_low = (u32)vmcs_conf->misc & VMX_MISC_SAVE_EFER_LMA;
msrs->misc_low |=
MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS |
VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE |
--
2.35.3

2022-06-27 16:38:01

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 01/14] KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()

VM_ENTRY_IA32E_MODE control is toggled dynamically by vmx_set_efer()
and setup_vmcs_config() doesn't check its existence. On the contrary,
nested_vmx_setup_ctls_msrs() doesn set it on x86_64. Add the missing
check and filter the bit out in vmx_vmentry_ctrl().

No (real) functional change intended as all existing CPUs supporting
long mode and VMX are supposed to have it.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c30115b9cb33..7d5c837e5a7c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2610,6 +2610,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;

min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
+#ifdef CONFIG_X86_64
+ min |= VM_ENTRY_IA32E_MODE;
+#endif
opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
VM_ENTRY_LOAD_IA32_PAT |
VM_ENTRY_LOAD_IA32_EFER |
@@ -4247,9 +4250,15 @@ static u32 vmx_vmentry_ctrl(void)
if (vmx_pt_mode_is_system())
vmentry_ctrl &= ~(VM_ENTRY_PT_CONCEAL_PIP |
VM_ENTRY_LOAD_IA32_RTIT_CTL);
- /* Loading of EFER and PERF_GLOBAL_CTRL are toggled dynamically */
- return vmentry_ctrl &
- ~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | VM_ENTRY_LOAD_IA32_EFER);
+ /*
+ * Loading of EFER, VM_ENTRY_IA32E_MODE, and PERF_GLOBAL_CTRL
+ * are toggled dynamically.
+ */
+ vmentry_ctrl &= ~(VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
+ VM_ENTRY_LOAD_IA32_EFER |
+ VM_ENTRY_IA32E_MODE);
+
+ return vmentry_ctrl;
}

static u32 vmx_vmexit_ctrl(void)
--
2.35.3

2022-06-27 16:42:40

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 07/14] KVM: VMX: Add missing VMENTRY controls to vmcs_config

As a preparation to reusing the result of setup_vmcs_config() in
nested VMX MSR setup, add the VMENTRY controls which KVM doesn't
use but supports for nVMX to KVM_OPT_VMX_VM_ENTRY_CONTROLS and
filter them out in vmx_vmentry_ctrl().

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 3 +++
arch/x86/kvm/vmx/vmx.h | 4 +++-
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 15191b3e5538..3846a8c7102a 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -4184,6 +4184,9 @@ static u32 vmx_vmentry_ctrl(void)
{
u32 vmentry_ctrl = vmcs_config.vmentry_ctrl;

+ /* Not used by KVM but supported for nesting. */
+ vmentry_ctrl &= ~(VM_ENTRY_SMM | VM_ENTRY_DEACT_DUAL_MONITOR);
+
if (vmx_pt_mode_is_system())
vmentry_ctrl &= ~(VM_ENTRY_PT_CONCEAL_PIP |
VM_ENTRY_LOAD_IA32_RTIT_CTL);
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 5e9127f39c19..6b44f4c1d45f 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -478,7 +478,9 @@ static inline u8 vmx_get_rvi(void)
__KVM_REQ_VMX_VM_ENTRY_CONTROLS
#endif
#define KVM_OPT_VMX_VM_ENTRY_CONTROLS \
- (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | \
+ (VM_ENTRY_SMM | \
+ VM_ENTRY_DEACT_DUAL_MONITOR | \
+ VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | \
VM_ENTRY_LOAD_IA32_PAT | \
VM_ENTRY_LOAD_IA32_EFER | \
VM_ENTRY_LOAD_BNDCFGS | \
--
2.35.3

2022-06-27 16:48:32

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 05/14] KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config()

As a preparation to reusing the result of setup_vmcs_config() in
nested VMX MSR setup, move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering
to vmx_exec_control().

No functional change intended.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 7ef4bc69e2c6..d28f85801ade 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2479,11 +2479,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
MSR_IA32_VMX_PROCBASED_CTLS,
&_cpu_based_exec_control) < 0)
return -EIO;
-#ifdef CONFIG_X86_64
- if (_cpu_based_exec_control & CPU_BASED_TPR_SHADOW)
- _cpu_based_exec_control &= ~CPU_BASED_CR8_LOAD_EXITING &
- ~CPU_BASED_CR8_STORE_EXITING;
-#endif
if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
if (adjust_vmx_controls(KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL,
KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL,
@@ -4254,13 +4249,17 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
if (vmx->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)
exec_control &= ~CPU_BASED_MOV_DR_EXITING;

- if (!cpu_need_tpr_shadow(&vmx->vcpu)) {
+ if (!cpu_need_tpr_shadow(&vmx->vcpu))
exec_control &= ~CPU_BASED_TPR_SHADOW;
+
#ifdef CONFIG_X86_64
+ if (exec_control & CPU_BASED_TPR_SHADOW)
+ exec_control &= ~(CPU_BASED_CR8_LOAD_EXITING |
+ CPU_BASED_CR8_STORE_EXITING);
+ else
exec_control |= CPU_BASED_CR8_STORE_EXITING |
CPU_BASED_CR8_LOAD_EXITING;
#endif
- }
if (!enable_ept)
exec_control |= CPU_BASED_CR3_STORE_EXITING |
CPU_BASED_CR3_LOAD_EXITING |
--
2.35.3

2022-06-27 16:51:58

by Vitaly Kuznetsov

[permalink] [raw]
Subject: [PATCH 03/14] KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING in setup_vmcs_config()

SECONDARY_EXEC_ENCLS_EXITING is conditionally added to the 'optional'
checklist in setup_vmcs_config() but there's little value in doing so.
First, as the control is optional, we can always check for its
presence, no harm done. Second, the only real value cpu_has_sgx() check
gives is that on the CPUs which support SECONDARY_EXEC_ENCLS_EXITING but
don't support SGX, the control is not getting enabled. It's highly unlikely
such CPUs exist but it's possible that some hypervisors expose broken vCPU
models.

Preserve cpu_has_sgx() check but filter the result of adjust_vmx_controls()
instead of the input.

Signed-off-by: Vitaly Kuznetsov <[email protected]>
---
arch/x86/kvm/vmx/vmx.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index ecd00fc69674..5300f2ad6a25 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2528,9 +2528,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
SECONDARY_EXEC_PT_CONCEAL_VMX |
SECONDARY_EXEC_ENABLE_VMFUNC |
SECONDARY_EXEC_BUS_LOCK_DETECTION |
- SECONDARY_EXEC_NOTIFY_VM_EXITING;
- if (cpu_has_sgx())
- opt2 |= SECONDARY_EXEC_ENCLS_EXITING;
+ SECONDARY_EXEC_NOTIFY_VM_EXITING |
+ SECONDARY_EXEC_ENCLS_EXITING;
+
if (adjust_vmx_controls(min2, opt2,
MSR_IA32_VMX_PROCBASED_CTLS2,
&_cpu_based_2nd_exec_control) < 0)
@@ -2577,6 +2577,9 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
vmx_cap->vpid = 0;
}

+ if (!cpu_has_sgx())
+ _cpu_based_2nd_exec_control &= ~SECONDARY_EXEC_ENCLS_EXITING;
+
if (_cpu_based_exec_control & CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
u64 opt3 = TERTIARY_EXEC_IPI_VIRT;

--
2.35.3

2022-06-27 18:25:37

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

On Mon, Jun 27, 2022 at 9:04 AM Vitaly Kuznetsov <[email protected]> wrote:
>
> Changes since RFC:
> - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
> infrastructure is later used in other patches [Sean] PATCHes 1-3 added
> to support the change.
> - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
> added [Sean].
> - Commit messages added.
>
> vmcs_config is a sanitized version of host VMX MSRs where some controls are
> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are
> discovered, some inconsistencies in controls are detected,...) but
> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
> in exposing undesired controls to L1. Switch to using vmcs_config instead.
>
> Sean Christopherson (1):
> KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
>
> Vitaly Kuznetsov (13):
> KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
> KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
> setup_vmcs_config()
> KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
> in setup_vmcs_config()
> KVM: VMX: Extend VMX controls macro shenanigans
> KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
> setup_vmcs_config()
> KVM: VMX: Add missing VMEXIT controls to vmcs_config
> KVM: VMX: Add missing VMENTRY controls to vmcs_config
> KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
> KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
> KVM: VMX: Store required-1 VMX controls in vmcs_config
> KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
> KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
> KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
> nested MSR
>
> arch/x86/kvm/vmx/capabilities.h | 16 +--
> arch/x86/kvm/vmx/nested.c | 37 +++---
> arch/x86/kvm/vmx/nested.h | 2 +-
> arch/x86/kvm/vmx/vmx.c | 198 ++++++++++++++------------------
> arch/x86/kvm/vmx/vmx.h | 118 +++++++++++++++++++
> 5 files changed, 229 insertions(+), 142 deletions(-)
>
> --
> 2.35.3
>

Just checking that this doesn't introduce any backwards-compatibility
issues. That is, all features that were reported as being available in
the past should still be available moving forward.

Thanks,

--jim

2022-06-27 21:47:09

by Dong, Eddie

[permalink] [raw]
Subject: RE: [PATCH 04/14] KVM: VMX: Extend VMX controls macro shenanigans



> -----Original Message-----
> From: Vitaly Kuznetsov <[email protected]>
> Sent: Monday, June 27, 2022 9:05 AM
> To: [email protected]; Paolo Bonzini <[email protected]>;
> Christopherson,, Sean <[email protected]>
> Cc: Anirudh Rayabharam <[email protected]>; Wanpeng Li
> <[email protected]>; Jim Mattson <[email protected]>; Maxim
> Levitsky <[email protected]>; [email protected]; linux-
> [email protected]
> Subject: [PATCH 04/14] KVM: VMX: Extend VMX controls macro shenanigans
>
> When VMX controls macros are used to set or clear a control bit, make sure
> that this bit was checked in setup_vmcs_config() and thus is properly
> reflected in vmcs_config.
>
> No functional change intended.
>
> Suggested-by: Sean Christopherson <[email protected]>
> Signed-off-by: Vitaly Kuznetsov <[email protected]>
> ---
> arch/x86/kvm/vmx/vmx.c | 99 +++++++------------------------------
> arch/x86/kvm/vmx/vmx.h | 109
> +++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 127 insertions(+), 81 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index
> 5300f2ad6a25..7ef4bc69e2c6 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2448,7 +2448,6 @@ static __init int setup_vmcs_config(struct
> vmcs_config *vmcs_conf,
> struct vmx_capability *vmx_cap) {
> u32 vmx_msr_low, vmx_msr_high;
> - u32 min, opt, min2, opt2;
> u32 _pin_based_exec_control = 0;
> u32 _cpu_based_exec_control = 0;
> u32 _cpu_based_2nd_exec_control = 0;
> @@ -2474,28 +2473,10 @@ static __init int setup_vmcs_config(struct
> vmcs_config *vmcs_conf,
> };
>
> memset(vmcs_conf, 0, sizeof(*vmcs_conf));
> - min = CPU_BASED_HLT_EXITING |
> -#ifdef CONFIG_X86_64
> - CPU_BASED_CR8_LOAD_EXITING |
> - CPU_BASED_CR8_STORE_EXITING |
> -#endif
> - CPU_BASED_CR3_LOAD_EXITING |
> - CPU_BASED_CR3_STORE_EXITING |
> - CPU_BASED_UNCOND_IO_EXITING |
> - CPU_BASED_MOV_DR_EXITING |
> - CPU_BASED_USE_TSC_OFFSETTING |
> - CPU_BASED_MWAIT_EXITING |
> - CPU_BASED_MONITOR_EXITING |
> - CPU_BASED_INVLPG_EXITING |
> - CPU_BASED_RDPMC_EXITING |
> - CPU_BASED_INTR_WINDOW_EXITING |
> - CPU_BASED_NMI_WINDOW_EXITING;
> -
> - opt = CPU_BASED_TPR_SHADOW |
> - CPU_BASED_USE_MSR_BITMAPS |
> - CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
> - CPU_BASED_ACTIVATE_TERTIARY_CONTROLS;
> - if (adjust_vmx_controls(min, opt,
> MSR_IA32_VMX_PROCBASED_CTLS,
> +
> + if
> (adjust_vmx_controls(KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL,
> +
> KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL,
> + MSR_IA32_VMX_PROCBASED_CTLS,
> &_cpu_based_exec_control) < 0)
> return -EIO;
> #ifdef CONFIG_X86_64
> @@ -2504,34 +2485,8 @@ static __init int setup_vmcs_config(struct
> vmcs_config *vmcs_conf,
>
> ~CPU_BASED_CR8_STORE_EXITING;
> #endif
> if (_cpu_based_exec_control &
> CPU_BASED_ACTIVATE_SECONDARY_CONTROLS) {
> - min2 = 0;
> - opt2 = SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES |
> - SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
> - SECONDARY_EXEC_WBINVD_EXITING |
> - SECONDARY_EXEC_ENABLE_VPID |
> - SECONDARY_EXEC_ENABLE_EPT |
> - SECONDARY_EXEC_UNRESTRICTED_GUEST |
> - SECONDARY_EXEC_PAUSE_LOOP_EXITING |
> - SECONDARY_EXEC_DESC |
> - SECONDARY_EXEC_ENABLE_RDTSCP |
> - SECONDARY_EXEC_ENABLE_INVPCID |
> - SECONDARY_EXEC_APIC_REGISTER_VIRT |
> - SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY |
> - SECONDARY_EXEC_SHADOW_VMCS |
> - SECONDARY_EXEC_XSAVES |
> - SECONDARY_EXEC_RDSEED_EXITING |
> - SECONDARY_EXEC_RDRAND_EXITING |
> - SECONDARY_EXEC_ENABLE_PML |
> - SECONDARY_EXEC_TSC_SCALING |
> - SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
> - SECONDARY_EXEC_PT_USE_GPA |
> - SECONDARY_EXEC_PT_CONCEAL_VMX |
> - SECONDARY_EXEC_ENABLE_VMFUNC |
> - SECONDARY_EXEC_BUS_LOCK_DETECTION |
> - SECONDARY_EXEC_NOTIFY_VM_EXITING |
> - SECONDARY_EXEC_ENCLS_EXITING;
> -
> - if (adjust_vmx_controls(min2, opt2,
> + if
> (adjust_vmx_controls(KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL,
> +
> KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL,
>
> MSR_IA32_VMX_PROCBASED_CTLS2,
> &_cpu_based_2nd_exec_control) <
> 0)
> return -EIO;
> @@ -2581,30 +2536,20 @@ static __init int setup_vmcs_config(struct
> vmcs_config *vmcs_conf,
> _cpu_based_2nd_exec_control &=
> ~SECONDARY_EXEC_ENCLS_EXITING;
>
> if (_cpu_based_exec_control &
> CPU_BASED_ACTIVATE_TERTIARY_CONTROLS) {
> - u64 opt3 = TERTIARY_EXEC_IPI_VIRT;
> -
> - _cpu_based_3rd_exec_control =
> adjust_vmx_controls64(opt3,
> -
> MSR_IA32_VMX_PROCBASED_CTLS3);
> + _cpu_based_3rd_exec_control =
> +
> adjust_vmx_controls64(KVM_OPT_VMX_TERTIARY_VM_EXEC_CON
> TROL,
> + MSR_IA32_VMX_PROCBASED_CTLS3);
> }
>
> - min = VM_EXIT_SAVE_DEBUG_CONTROLS |
> VM_EXIT_ACK_INTR_ON_EXIT;
> -#ifdef CONFIG_X86_64
> - min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
> -#endif
> - opt = VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL |
> - VM_EXIT_LOAD_IA32_PAT |
> - VM_EXIT_LOAD_IA32_EFER |
> - VM_EXIT_CLEAR_BNDCFGS |
> - VM_EXIT_PT_CONCEAL_PIP |
> - VM_EXIT_CLEAR_IA32_RTIT_CTL;
> - if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
> + if (adjust_vmx_controls(KVM_REQ_VMX_VM_EXIT_CONTROLS,
> + KVM_OPT_VMX_VM_EXIT_CONTROLS,
> + MSR_IA32_VMX_EXIT_CTLS,
> &_vmexit_control) < 0)
> return -EIO;
>
> - min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
> - opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR |
> - PIN_BASED_VMX_PREEMPTION_TIMER;
> - if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
> + if
> (adjust_vmx_controls(KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL,
> +
> KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL,
> + MSR_IA32_VMX_PINBASED_CTLS,
> &_pin_based_exec_control) < 0)
> return -EIO;
>
> @@ -2614,17 +2559,9 @@ static __init int setup_vmcs_config(struct
> vmcs_config *vmcs_conf,
> SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY))
> _pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
>
> - min = VM_ENTRY_LOAD_DEBUG_CONTROLS;
> -#ifdef CONFIG_X86_64
> - min |= VM_ENTRY_IA32E_MODE;
> -#endif
> - opt = VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL |
> - VM_ENTRY_LOAD_IA32_PAT |
> - VM_ENTRY_LOAD_IA32_EFER |
> - VM_ENTRY_LOAD_BNDCFGS |
> - VM_ENTRY_PT_CONCEAL_PIP |
> - VM_ENTRY_LOAD_IA32_RTIT_CTL;
> - if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
> + if (adjust_vmx_controls(KVM_REQ_VMX_VM_ENTRY_CONTROLS,
> + KVM_OPT_VMX_VM_ENTRY_CONTROLS,
> + MSR_IA32_VMX_ENTRY_CTLS,
> &_vmentry_control) < 0)
> return -EIO;
>
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h index
> 286c88e285ea..540febecac92 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -467,6 +467,113 @@ static inline u8 vmx_get_rvi(void)
> return vmcs_read16(GUEST_INTR_STATUS) & 0xff; }
>
> +#define __KVM_REQ_VMX_VM_ENTRY_CONTROLS
> \
> + (VM_ENTRY_LOAD_DEBUG_CONTROLS)
> +#ifdef CONFIG_X86_64
> + #define KVM_REQ_VMX_VM_ENTRY_CONTROLS \
> + (__KVM_REQ_VMX_VM_ENTRY_CONTROLS | \
> + VM_ENTRY_IA32E_MODE)
> +#else
> + #define KVM_REQ_VMX_VM_ENTRY_CONTROLS \
> + __KVM_REQ_VMX_VM_ENTRY_CONTROLS
> +#endif
> +#define KVM_OPT_VMX_VM_ENTRY_CONTROLS
> \
> + (VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL | \
> + VM_ENTRY_LOAD_IA32_PAT | \
> + VM_ENTRY_LOAD_IA32_EFER | \
> + VM_ENTRY_LOAD_BNDCFGS | \
> + VM_ENTRY_PT_CONCEAL_PIP | \
> + VM_ENTRY_LOAD_IA32_RTIT_CTL)
> +
> +#define __KVM_REQ_VMX_VM_EXIT_CONTROLS
> \
> + (VM_EXIT_SAVE_DEBUG_CONTROLS | \
> + VM_EXIT_ACK_INTR_ON_EXIT)
> +#ifdef CONFIG_X86_64
> + #define KVM_REQ_VMX_VM_EXIT_CONTROLS \
> + (__KVM_REQ_VMX_VM_EXIT_CONTROLS | \
> + VM_EXIT_HOST_ADDR_SPACE_SIZE)
> +#else
> + #define KVM_REQ_VMX_VM_EXIT_CONTROLS \
> + __KVM_REQ_VMX_VM_EXIT_CONTROLS
> +#endif
> +#define KVM_OPT_VMX_VM_EXIT_CONTROLS \
> + (VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL | \
> + VM_EXIT_LOAD_IA32_PAT | \
> + VM_EXIT_LOAD_IA32_EFER | \
> + VM_EXIT_CLEAR_BNDCFGS | \
> + VM_EXIT_PT_CONCEAL_PIP | \
> + VM_EXIT_CLEAR_IA32_RTIT_CTL)
> +
> +#define KVM_REQ_VMX_PIN_BASED_VM_EXEC_CONTROL
> \
> + (PIN_BASED_EXT_INTR_MASK | \
> + PIN_BASED_NMI_EXITING)
> +#define KVM_OPT_VMX_PIN_BASED_VM_EXEC_CONTROL
> \
> + (PIN_BASED_VIRTUAL_NMIS | \
> + PIN_BASED_POSTED_INTR | \
> + PIN_BASED_VMX_PREEMPTION_TIMER)
> +
> +#define __KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL
> \
> + (CPU_BASED_HLT_EXITING | \
> + CPU_BASED_CR3_LOAD_EXITING | \
> + CPU_BASED_CR3_STORE_EXITING | \
> + CPU_BASED_UNCOND_IO_EXITING | \
> + CPU_BASED_MOV_DR_EXITING | \
> + CPU_BASED_USE_TSC_OFFSETTING | \
> + CPU_BASED_MWAIT_EXITING | \
> + CPU_BASED_MONITOR_EXITING | \
> + CPU_BASED_INVLPG_EXITING | \
> + CPU_BASED_RDPMC_EXITING | \
> + CPU_BASED_INTR_WINDOW_EXITING | \
> + CPU_BASED_NMI_WINDOW_EXITING)
> +
> +#ifdef CONFIG_X86_64
> + #define KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL
> \
> + (__KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL | \
> + CPU_BASED_CR8_LOAD_EXITING | \
> + CPU_BASED_CR8_STORE_EXITING)
> +#else
> + #define KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL
> \
> + __KVM_REQ_VMX_CPU_BASED_VM_EXEC_CONTROL
> +#endif
> +
> +#define KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL
> \
> + (CPU_BASED_TPR_SHADOW | \
> + CPU_BASED_USE_MSR_BITMAPS | \
> + CPU_BASED_ACTIVATE_SECONDARY_CONTROLS |
> \
> + CPU_BASED_ACTIVATE_TERTIARY_CONTROLS)
> +
> +#define KVM_REQ_VMX_SECONDARY_VM_EXEC_CONTROL 0
> +#define KVM_OPT_VMX_SECONDARY_VM_EXEC_CONTROL
> \
> + (SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES | \
> + SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE |
> \
> + SECONDARY_EXEC_WBINVD_EXITING | \
> + SECONDARY_EXEC_ENABLE_VPID | \
> + SECONDARY_EXEC_ENABLE_EPT | \
> + SECONDARY_EXEC_UNRESTRICTED_GUEST | \
> + SECONDARY_EXEC_PAUSE_LOOP_EXITING | \
> + SECONDARY_EXEC_DESC | \
> + SECONDARY_EXEC_ENABLE_RDTSCP | \
> + SECONDARY_EXEC_ENABLE_INVPCID | \
> + SECONDARY_EXEC_APIC_REGISTER_VIRT | \
> + SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY | \
> + SECONDARY_EXEC_SHADOW_VMCS | \
> + SECONDARY_EXEC_XSAVES | \
> + SECONDARY_EXEC_RDSEED_EXITING | \
> + SECONDARY_EXEC_RDRAND_EXITING | \
> + SECONDARY_EXEC_ENABLE_PML | \
> + SECONDARY_EXEC_TSC_SCALING | \
> + SECONDARY_EXEC_ENABLE_USR_WAIT_PAUSE |
> \
> + SECONDARY_EXEC_PT_USE_GPA | \
> + SECONDARY_EXEC_PT_CONCEAL_VMX |
> \
> + SECONDARY_EXEC_ENABLE_VMFUNC | \
> + SECONDARY_EXEC_BUS_LOCK_DETECTION | \
> + SECONDARY_EXEC_NOTIFY_VM_EXITING | \
> + SECONDARY_EXEC_ENCLS_EXITING)
> +
> +#define KVM_REQ_VMX_TERTIARY_VM_EXEC_CONTROL 0
> +#define KVM_OPT_VMX_TERTIARY_VM_EXEC_CONTROL
> \
> + (TERTIARY_EXEC_IPI_VIRT)
> +
> #define BUILD_CONTROLS_SHADOW(lname, uname, bits)
> \
> static inline void lname##_controls_set(struct vcpu_vmx *vmx, u##bits val)
> \
> {
> \
> @@ -485,10 +592,12 @@ static inline u##bits lname##_controls_get(struct
> vcpu_vmx *vmx) \
> }
> \
> static inline void lname##_controls_setbit(struct vcpu_vmx *vmx, u##bits
> val) \
> {
> \
> + BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname |
> KVM_OPT_VMX_##uname))); \
> lname##_controls_set(vmx, lname##_controls_get(vmx) | val);
> \
> }
> \
> static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits
> val) \
> {
> \
> + BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname |
> KVM_OPT_VMX_##uname))); \
> lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);
> \
> }

With this, will it be safer if we present L1 CTRL MSRs with the bits KVM really uses? Do I miss something?

> BUILD_CONTROLS_SHADOW(vm_entry, VM_ENTRY_CONTROLS, 32)
> --
> 2.35.3

2022-06-28 02:14:09

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 04/14] KVM: VMX: Extend VMX controls macro shenanigans

On Mon, Jun 27, 2022, Dong, Eddie wrote:
> > static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits
> > val) \
> > {
> > \
> > + BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname |
> > KVM_OPT_VMX_##uname))); \
> > lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);
> > \
> > }
>
> With this, will it be safer if we present L1 CTRL MSRs with the bits KVM
> really uses? Do I miss something?

KVM will still allow L1 to use features/controls that KVM itself doesn't use, but
exposing features/controls that KVM doesn't use will require a more explicit
"override" of sorts, e.g. to prevent advertising features that are supported in
hardware, known to KVM, but disabled for whatever reason, e.g. a CPU bug, eVMCS
incompatibility, module param, etc...

The intent of this BUILD_BUG_ON() is to detect KVM usage of bits that aren't enabled
by default, i.e. to lower the probability that a control gets used by KVM but isn't
exposed to L1 because it's a dynamically enabled control.

2022-06-28 09:47:51

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

On Mon, 2022-06-27 at 18:04 +0200, Vitaly Kuznetsov wrote:
> Changes since RFC:
> - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
> infrastructure is later used in other patches [Sean] PATCHes 1-3 added
> to support the change.
> - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
> added [Sean].
> - Commit messages added.
>
> vmcs_config is a sanitized version of host VMX MSRs where some controls are
> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are
> discovered, some inconsistencies in controls are detected,...) but
> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
> in exposing undesired controls to L1. Switch to using vmcs_config instead.
>
> Sean Christopherson (1):
> KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
>
> Vitaly Kuznetsov (13):
> KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
> KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
> setup_vmcs_config()
> KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
> in setup_vmcs_config()
> KVM: VMX: Extend VMX controls macro shenanigans
> KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
> setup_vmcs_config()
> KVM: VMX: Add missing VMEXIT controls to vmcs_config
> KVM: VMX: Add missing VMENTRY controls to vmcs_config
> KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
> KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
> KVM: VMX: Store required-1 VMX controls in vmcs_config
> KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
> KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
> KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
> nested MSR
>
> arch/x86/kvm/vmx/capabilities.h | 16 +--
> arch/x86/kvm/vmx/nested.c | 37 +++---
> arch/x86/kvm/vmx/nested.h | 2 +-
> arch/x86/kvm/vmx/vmx.c | 198 ++++++++++++++------------------
> arch/x86/kvm/vmx/vmx.h | 118 +++++++++++++++++++
> 5 files changed, 229 insertions(+), 142 deletions(-)
>
Sorry that I was a bit out of loop on this, so before I review it,
does this patch series solve the eVMCS issue we had alone,
or we still need the eVMCS version patch series as well?

Best regards,
Maxim Levitsky

2022-06-28 10:44:45

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

On Tue, 2022-06-28 at 12:04 +0200, Vitaly Kuznetsov wrote:
> Maxim Levitsky <[email protected]> writes:
>
> > On Mon, 2022-06-27 at 18:04 +0200, Vitaly Kuznetsov wrote:
> > > Changes since RFC:
> > > - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
> > > infrastructure is later used in other patches [Sean] PATCHes 1-3 added
> > > to support the change.
> > > - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
> > > added [Sean].
> > > - Commit messages added.
> > >
> > > vmcs_config is a sanitized version of host VMX MSRs where some controls are
> > > filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are
> > > discovered, some inconsistencies in controls are detected,...) but
> > > nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
> > > in exposing undesired controls to L1. Switch to using vmcs_config instead.
> > >
> > > Sean Christopherson (1):
> > > KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
> > >
> > > Vitaly Kuznetsov (13):
> > > KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
> > > KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
> > > setup_vmcs_config()
> > > KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
> > > in setup_vmcs_config()
> > > KVM: VMX: Extend VMX controls macro shenanigans
> > > KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
> > > setup_vmcs_config()
> > > KVM: VMX: Add missing VMEXIT controls to vmcs_config
> > > KVM: VMX: Add missing VMENTRY controls to vmcs_config
> > > KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
> > > KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
> > > KVM: VMX: Store required-1 VMX controls in vmcs_config
> > > KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
> > > KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
> > > KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
> > > nested MSR
> > >
> > > arch/x86/kvm/vmx/capabilities.h | 16 +--
> > > arch/x86/kvm/vmx/nested.c | 37 +++---
> > > arch/x86/kvm/vmx/nested.h | 2 +-
> > > arch/x86/kvm/vmx/vmx.c | 198 ++++++++++++++------------------
> > > arch/x86/kvm/vmx/vmx.h | 118 +++++++++++++++++++
> > > 5 files changed, 229 insertions(+), 142 deletions(-)
> > >
> > Sorry that I was a bit out of loop on this, so before I review it,
> > does this patch series solve the eVMCS issue we had alone,
> > or we still need the eVMCS version patch series as well?
>
> "[PATCH 00/11] KVM: VMX: Support TscScaling and EnclsExitingBitmap whith
> eVMCS" adds new features, namely TSC scaling for both KVM-on-Hyper-V and
> Hyper-V-on-KVM. This series doesn't add any features but avoids the
> problem reported by Anirudh by properly filtering values in L1 VMX MSRs.
>
> TL;DR: Both series are needed.
>

Roger that!

Best regards,
Maxim Levitsky

2022-06-28 10:48:21

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

Maxim Levitsky <[email protected]> writes:

> On Mon, 2022-06-27 at 18:04 +0200, Vitaly Kuznetsov wrote:
>> Changes since RFC:
>> - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
>> infrastructure is later used in other patches [Sean] PATCHes 1-3 added
>> to support the change.
>> - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
>> added [Sean].
>> - Commit messages added.
>>
>> vmcs_config is a sanitized version of host VMX MSRs where some controls are
>> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are
>> discovered, some inconsistencies in controls are detected,...) but
>> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
>> in exposing undesired controls to L1. Switch to using vmcs_config instead.
>>
>> Sean Christopherson (1):
>> KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
>>
>> Vitaly Kuznetsov (13):
>> KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
>> KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
>> setup_vmcs_config()
>> KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
>> in setup_vmcs_config()
>> KVM: VMX: Extend VMX controls macro shenanigans
>> KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
>> setup_vmcs_config()
>> KVM: VMX: Add missing VMEXIT controls to vmcs_config
>> KVM: VMX: Add missing VMENTRY controls to vmcs_config
>> KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
>> KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
>> KVM: VMX: Store required-1 VMX controls in vmcs_config
>> KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
>> KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
>> KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
>> nested MSR
>>
>> arch/x86/kvm/vmx/capabilities.h | 16 +--
>> arch/x86/kvm/vmx/nested.c | 37 +++---
>> arch/x86/kvm/vmx/nested.h | 2 +-
>> arch/x86/kvm/vmx/vmx.c | 198 ++++++++++++++------------------
>> arch/x86/kvm/vmx/vmx.h | 118 +++++++++++++++++++
>> 5 files changed, 229 insertions(+), 142 deletions(-)
>>
> Sorry that I was a bit out of loop on this, so before I review it,
> does this patch series solve the eVMCS issue we had alone,
> or we still need the eVMCS version patch series as well?

"[PATCH 00/11] KVM: VMX: Support TscScaling and EnclsExitingBitmap whith
eVMCS" adds new features, namely TSC scaling for both KVM-on-Hyper-V and
Hyper-V-on-KVM. This series doesn't add any features but avoids the
problem reported by Anirudh by properly filtering values in L1 VMX MSRs.

TL;DR: Both series are needed.

--
Vitaly

2022-06-28 12:09:40

by Vitaly Kuznetsov

[permalink] [raw]
Subject: RE: [PATCH 04/14] KVM: VMX: Extend VMX controls macro shenanigans

"Dong, Eddie" <[email protected]> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov <[email protected]>
>> Sent: Monday, June 27, 2022 9:05 AM
>> To: [email protected]; Paolo Bonzini <[email protected]>;
>> Christopherson,, Sean <[email protected]>
>> Cc: Anirudh Rayabharam <[email protected]>; Wanpeng Li
>> <[email protected]>; Jim Mattson <[email protected]>; Maxim
>> Levitsky <[email protected]>; [email protected]; linux-
>> [email protected]
>> Subject: [PATCH 04/14] KVM: VMX: Extend VMX controls macro shenanigans
>>
>> When VMX controls macros are used to set or clear a control bit, make sure
>> that this bit was checked in setup_vmcs_config() and thus is properly
>> reflected in vmcs_config.
>>

...

>
> With this, will it be safer if we present L1 CTRL MSRs with the bits
> KVM really uses? Do I miss something?

Sean has already answered but let me present my version. Currently,
vmcs_config has sanitized VMX control MSRs values filtering out three
groups of features:
- Features, which KVM doesn't know about.
- Features, which KVM can't enable (because of eVMCS, bugs,...)
- Features, which KVM doesn't want to enable for some reason.
L1 VMX control MSRs should have the first two groups filtered out but
not the third. E.g. when EPT is in use, KVM doesn't use
CPU_BASED_CR3_LOAD_EXITING/CPU_BASED_CR3_STORE_EXITING but this doesn't
mean that all possible L1 hypervisors are going to be happy if we filter
these out. Same goes to e.g. CPU_BASED_RDTSC_EXITING: KVM never sets
this for itself but nested hypervisor can.

--
Vitaly

2022-06-28 14:19:12

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

Jim Mattson <[email protected]> writes:

> On Mon, Jun 27, 2022 at 9:04 AM Vitaly Kuznetsov <[email protected]> wrote:
>>
>> Changes since RFC:
>> - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
>> infrastructure is later used in other patches [Sean] PATCHes 1-3 added
>> to support the change.
>> - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
>> added [Sean].
>> - Commit messages added.
>>
>> vmcs_config is a sanitized version of host VMX MSRs where some controls are
>> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are
>> discovered, some inconsistencies in controls are detected,...) but
>> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
>> in exposing undesired controls to L1. Switch to using vmcs_config instead.
>>
>> Sean Christopherson (1):
>> KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
>>
>> Vitaly Kuznetsov (13):
>> KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
>> KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
>> setup_vmcs_config()
>> KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
>> in setup_vmcs_config()
>> KVM: VMX: Extend VMX controls macro shenanigans
>> KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
>> setup_vmcs_config()
>> KVM: VMX: Add missing VMEXIT controls to vmcs_config
>> KVM: VMX: Add missing VMENTRY controls to vmcs_config
>> KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
>> KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
>> KVM: VMX: Store required-1 VMX controls in vmcs_config
>> KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
>> KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
>> KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
>> nested MSR
>>
>> arch/x86/kvm/vmx/capabilities.h | 16 +--
>> arch/x86/kvm/vmx/nested.c | 37 +++---
>> arch/x86/kvm/vmx/nested.h | 2 +-
>> arch/x86/kvm/vmx/vmx.c | 198 ++++++++++++++------------------
>> arch/x86/kvm/vmx/vmx.h | 118 +++++++++++++++++++
>> 5 files changed, 229 insertions(+), 142 deletions(-)
>>
>> --
>> 2.35.3
>>
>
> Just checking that this doesn't introduce any backwards-compatibility
> issues. That is, all features that were reported as being available in
> the past should still be available moving forward.
>

All the controls nested_vmx_setup_ctls_msrs() set are in the newly
introduced KVM_REQ_VMX_*/KVM_OPT_VMX_* sets so we should be good here
(unless I screwed up, of course).

There's going to be some changes though. E.g this series was started by
Anirudh's report when KVM was exposing SECONDARY_EXEC_TSC_SCALING while
running on KVM and using eVMCS which doesn't support the control. This
is a bug and I don't think we need and 'bug compatibility' here.

Another change is that VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL/
VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL will now be filtered out on the
"broken" CPUs (the list is in setup_vmcs_config()). I *think* this is
also OK but if not, we can move the filtering to vmx_vmentry_ctrl()/
vmx_vmexit_ctrl().

--
Vitaly

2022-06-28 16:00:22

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

On Tue, Jun 28, 2022 at 7:04 AM Vitaly Kuznetsov <[email protected]> wrote:
>
> Jim Mattson <[email protected]> writes:
>
> > On Mon, Jun 27, 2022 at 9:04 AM Vitaly Kuznetsov <[email protected]> wrote:
> >>
> >> Changes since RFC:
> >> - "KVM: VMX: Extend VMX controls macro shenanigans" PATCH added and the
> >> infrastructure is later used in other patches [Sean] PATCHes 1-3 added
> >> to support the change.
> >> - "KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup" PATCH
> >> added [Sean].
> >> - Commit messages added.
> >>
> >> vmcs_config is a sanitized version of host VMX MSRs where some controls are
> >> filtered out (e.g. when Enlightened VMCS is enabled, some know bugs are
> >> discovered, some inconsistencies in controls are detected,...) but
> >> nested_vmx_setup_ctls_msrs() uses raw host MSRs instead. This may end up
> >> in exposing undesired controls to L1. Switch to using vmcs_config instead.
> >>
> >> Sean Christopherson (1):
> >> KVM: VMX: Clear controls obsoleted by EPT at runtime, not setup
> >>
> >> Vitaly Kuznetsov (13):
> >> KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config()
> >> KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in
> >> setup_vmcs_config()
> >> KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING
> >> in setup_vmcs_config()
> >> KVM: VMX: Extend VMX controls macro shenanigans
> >> KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of
> >> setup_vmcs_config()
> >> KVM: VMX: Add missing VMEXIT controls to vmcs_config
> >> KVM: VMX: Add missing VMENTRY controls to vmcs_config
> >> KVM: VMX: Add missing CPU based VM execution controls to vmcs_config
> >> KVM: nVMX: Use sanitized allowed-1 bits for VMX control MSRs
> >> KVM: VMX: Store required-1 VMX controls in vmcs_config
> >> KVM: nVMX: Use sanitized required-1 bits for VMX control MSRs
> >> KVM: VMX: Cache MSR_IA32_VMX_MISC in vmcs_config
> >> KVM: nVMX: Use cached host MSR_IA32_VMX_MISC value for setting up
> >> nested MSR
> >>
> >> arch/x86/kvm/vmx/capabilities.h | 16 +--
> >> arch/x86/kvm/vmx/nested.c | 37 +++---
> >> arch/x86/kvm/vmx/nested.h | 2 +-
> >> arch/x86/kvm/vmx/vmx.c | 198 ++++++++++++++------------------
> >> arch/x86/kvm/vmx/vmx.h | 118 +++++++++++++++++++
> >> 5 files changed, 229 insertions(+), 142 deletions(-)
> >>
> >> --
> >> 2.35.3
> >>
> >
> > Just checking that this doesn't introduce any backwards-compatibility
> > issues. That is, all features that were reported as being available in
> > the past should still be available moving forward.
> >
>
> All the controls nested_vmx_setup_ctls_msrs() set are in the newly
> introduced KVM_REQ_VMX_*/KVM_OPT_VMX_* sets so we should be good here
> (unless I screwed up, of course).
>
> There's going to be some changes though. E.g this series was started by
> Anirudh's report when KVM was exposing SECONDARY_EXEC_TSC_SCALING while
> running on KVM and using eVMCS which doesn't support the control. This
> is a bug and I don't think we need and 'bug compatibility' here.

You cannot force VM termination on a kernel upgrade. On live migration
from an older kernel, the new kernel must be willing to accept the
suspended state of a VM that was running under the older kernel. In
particular, the new KVM_SET_MSRS must accept the values of the VMX
capability MSRS that userspace obtains from the older KVM_GET_MSRS. I
don't know if this is what you are referring to as "bug
compatibility," but if it is, then we absolutely do need it.

> Another change is that VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL/
> VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL will now be filtered out on the
> "broken" CPUs (the list is in setup_vmcs_config()). I *think* this is
> also OK but if not, we can move the filtering to vmx_vmentry_ctrl()/
> vmx_vmexit_ctrl().
>
> --
> Vitaly
>

2022-06-28 16:11:35

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

Jim Mattson <[email protected]> writes:

> On Tue, Jun 28, 2022 at 7:04 AM Vitaly Kuznetsov <[email protected]> wrote:
>>

...

>> Jim Mattson <[email protected]> writes:
>>
>> > Just checking that this doesn't introduce any backwards-compatibility
>> > issues. That is, all features that were reported as being available in
>> > the past should still be available moving forward.
>> >
>>
>> All the controls nested_vmx_setup_ctls_msrs() set are in the newly
>> introduced KVM_REQ_VMX_*/KVM_OPT_VMX_* sets so we should be good here
>> (unless I screwed up, of course).
>>
>> There's going to be some changes though. E.g this series was started by
>> Anirudh's report when KVM was exposing SECONDARY_EXEC_TSC_SCALING while
>> running on KVM and using eVMCS which doesn't support the control. This
>> is a bug and I don't think we need and 'bug compatibility' here.
>
> You cannot force VM termination on a kernel upgrade. On live migration
> from an older kernel, the new kernel must be willing to accept the
> suspended state of a VM that was running under the older kernel. In
> particular, the new KVM_SET_MSRS must accept the values of the VMX
> capability MSRS that userspace obtains from the older KVM_GET_MSRS. I
> don't know if this is what you are referring to as "bug
> compatibility," but if it is, then we absolutely do need it.
>

Oh, right you are, we do seem to have a problem. Even for eVMCS case,
the fact that we expose a feature which can't be used in VMX control
MSRs doesn't mean that the VM is broken. In particular, the VM may not
be using VMX features at all. Same goes to PERF_GLOBAL_CTRL errata.

vmx_restore_control_msr() currenly does strict checking of the supplied
data against what was initially set by nested_vmx_setup_ctls_msrs(),
this basically means we cannot drop feature bits, just add them. Out of
top of my head I don't see a solution other than relaxing the check by
introducing a "revoke list"... Another questions is whether we want
guest visible MSR value to remain like it was before migration or we can
be brave and clear 'broken' feature bits there (the features are
'broken' so they couldn't be in use, right?). I'm not sure.

Anirudh, the same concern applies to your 'intermediate' patch too.

Smart ideas on what can be done are more than welcome)

--
Vitaly

2022-06-28 17:16:32

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

On Tue, Jun 28, 2022 at 9:01 AM Vitaly Kuznetsov <[email protected]> wrote:
>
> Jim Mattson <[email protected]> writes:
>
> > On Tue, Jun 28, 2022 at 7:04 AM Vitaly Kuznetsov <[email protected]> wrote:
> >>
>
> ...
>
> >> Jim Mattson <[email protected]> writes:
> >>
> >> > Just checking that this doesn't introduce any backwards-compatibility
> >> > issues. That is, all features that were reported as being available in
> >> > the past should still be available moving forward.
> >> >
> >>
> >> All the controls nested_vmx_setup_ctls_msrs() set are in the newly
> >> introduced KVM_REQ_VMX_*/KVM_OPT_VMX_* sets so we should be good here
> >> (unless I screwed up, of course).
> >>
> >> There's going to be some changes though. E.g this series was started by
> >> Anirudh's report when KVM was exposing SECONDARY_EXEC_TSC_SCALING while
> >> running on KVM and using eVMCS which doesn't support the control. This
> >> is a bug and I don't think we need and 'bug compatibility' here.
> >
> > You cannot force VM termination on a kernel upgrade. On live migration
> > from an older kernel, the new kernel must be willing to accept the
> > suspended state of a VM that was running under the older kernel. In
> > particular, the new KVM_SET_MSRS must accept the values of the VMX
> > capability MSRS that userspace obtains from the older KVM_GET_MSRS. I
> > don't know if this is what you are referring to as "bug
> > compatibility," but if it is, then we absolutely do need it.
> >
>
> Oh, right you are, we do seem to have a problem. Even for eVMCS case,
> the fact that we expose a feature which can't be used in VMX control
> MSRs doesn't mean that the VM is broken. In particular, the VM may not
> be using VMX features at all. Same goes to PERF_GLOBAL_CTRL errata.
>
> vmx_restore_control_msr() currenly does strict checking of the supplied
> data against what was initially set by nested_vmx_setup_ctls_msrs(),
> this basically means we cannot drop feature bits, just add them. Out of
> top of my head I don't see a solution other than relaxing the check by
> introducing a "revoke list"... Another questions is whether we want
> guest visible MSR value to remain like it was before migration or we can
> be brave and clear 'broken' feature bits there (the features are
> 'broken' so they couldn't be in use, right?). I'm not sure.

Read-only MSRs cannot be changed after their values may have been
observed by the guest.

> Anirudh, the same concern applies to your 'intermediate' patch too.
>
> Smart ideas on what can be done are more than welcome)

You could define a bunch of "quirks," and userspace could use
KVM_CAP_DISABLE_QUIRKS2 to ask that the broken bits be cleared.

2022-06-29 09:11:29

by Vitaly Kuznetsov

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

Jim Mattson <[email protected]> writes:

> On Tue, Jun 28, 2022 at 9:01 AM Vitaly Kuznetsov <[email protected]> wrote:
>>

...

>
> Read-only MSRs cannot be changed after their values may have been
> observed by the guest.
>
>> Anirudh, the same concern applies to your 'intermediate' patch too.
>>
>> Smart ideas on what can be done are more than welcome)
>
> You could define a bunch of "quirks," and userspace could use
> KVM_CAP_DISABLE_QUIRKS2 to ask that the broken bits be cleared.

This sounds correct, but awful :-) I, however, think we can avoid this.

For the KVM-on-eVMCS case:
- When combined with "[PATCH 00/11] KVM: VMX: Support TscScaling and
EnclsExitingBitmap whith eVMCS" series
(https://lore.kernel.org/kvm/[email protected]/),
the filtering we do in setup_vmcs_config() is no longer needed. I need
to check various available Hyper-V versions but my initial investigation
shows that we were only filtering out TSC Scaling and 'Load
IA32_PERF_GLOBAL_CTRL' vmexit/vmentry, the rest were never present in
VMX control MSRs (as presented by Hyper-V) in the first place.

For PERF_GLOBAL_CTRL errata:
- We can move the filtering to vmx_vmexit_ctrl()/vmx_vmentry_ctrl()
preserving the status quo: KVM doesn't use the feature but it is exposed
to L1 hypervisor (and L1 hypervisor presumably has the same check and
doesn't use the feature. FWIW, the workaround was added in 2011 and the
erratas it references appeared in 2010, this means that the affected
CPUs are quite old, modern proprietary hypervisors won't likely boot
there).

If we do the above, there's going to be no changes to VMX control MSRs
generated by nested_vmx_setup_ctls_msrs(). I, however, need to work on
a combined series.

--
Vitaly

2022-06-29 21:02:31

by Dong, Eddie

[permalink] [raw]
Subject: RE: [PATCH 04/14] KVM: VMX: Extend VMX controls macro shenanigans



> -----Original Message-----
> From: Sean Christopherson <[email protected]>
> Sent: Monday, June 27, 2022 6:38 PM
> To: Dong, Eddie <[email protected]>
> Cc: Vitaly Kuznetsov <[email protected]>; [email protected]; Paolo
> Bonzini <[email protected]>; Anirudh Rayabharam
> <[email protected]>; Wanpeng Li <[email protected]>;
> Jim Mattson <[email protected]>; Maxim Levitsky
> <[email protected]>; [email protected]; linux-
> [email protected]
> Subject: Re: [PATCH 04/14] KVM: VMX: Extend VMX controls macro
> shenanigans
>
> On Mon, Jun 27, 2022, Dong, Eddie wrote:
> > > static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits
> > > val) \
> > > {
> > > \
> > > + BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname |
> > > KVM_OPT_VMX_##uname))); \
> > > lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);
> > > \
> > > }
> >
> > With this, will it be safer if we present L1 CTRL MSRs with the bits
> > KVM really uses? Do I miss something?
>
> KVM will still allow L1 to use features/controls that KVM itself doesn't use, but
> exposing features/controls that KVM doesn't use will require a more explicit
> "override" of sorts, e.g. to prevent advertising features that are supported in
> hardware, known to KVM, but disabled for whatever reason, e.g. a CPU bug,
> eVMCS incompatibility, module param, etc...
Mmm, that is fine too.
But, do we consider the potential need of migration for a L1 VMM ? Normally the VM can be configured to be as hardware neutral for better compatibility, or exposing as close to hardware feature as possible for performance.
For nested features, I thought we didn't support migration if L1 VMM yet, so exposing hardware capability by default is fine at moment. We may revisit one day in future if we need to support migration. This MACRO do help anyway ????

>
> The intent of this BUILD_BUG_ON() is to detect KVM usage of bits that aren't
> enabled by default, i.e. to lower the probability that a control gets used by KVM
> but isn't exposed to L1 because it's a dynamically enabled control.

2022-06-29 22:31:43

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

On Wed, Jun 29, 2022 at 2:06 AM Vitaly Kuznetsov <[email protected]> wrote:

> For PERF_GLOBAL_CTRL errata:
> - We can move the filtering to vmx_vmexit_ctrl()/vmx_vmentry_ctrl()
> preserving the status quo: KVM doesn't use the feature but it is exposed
> to L1 hypervisor (and L1 hypervisor presumably has the same check and
> doesn't use the feature. FWIW, the workaround was added in 2011 and the
> erratas it references appeared in 2010, this means that the affected
> CPUs are quite old, modern proprietary hypervisors won't likely boot
> there).
Sadly, Nehalem and Westmere are well-supported by KVM today, and we
will probably still continue to support them for at least another
decade. They both have EPT, unrestricted guest, and other VT-x2
features that KVM still considers optional.

2022-06-30 08:21:21

by Vitaly Kuznetsov

[permalink] [raw]
Subject: RE: [PATCH 04/14] KVM: VMX: Extend VMX controls macro shenanigans

"Dong, Eddie" <[email protected]> writes:

>> -----Original Message-----
>> From: Sean Christopherson <[email protected]>
>> Sent: Monday, June 27, 2022 6:38 PM
>> To: Dong, Eddie <[email protected]>
>> Cc: Vitaly Kuznetsov <[email protected]>; [email protected]; Paolo
>> Bonzini <[email protected]>; Anirudh Rayabharam
>> <[email protected]>; Wanpeng Li <[email protected]>;
>> Jim Mattson <[email protected]>; Maxim Levitsky
>> <[email protected]>; [email protected]; linux-
>> [email protected]
>> Subject: Re: [PATCH 04/14] KVM: VMX: Extend VMX controls macro
>> shenanigans
>>
>> On Mon, Jun 27, 2022, Dong, Eddie wrote:
>> > > static inline void lname##_controls_clearbit(struct vcpu_vmx *vmx, u##bits
>> > > val) \
>> > > {
>> > > \
>> > > + BUILD_BUG_ON(!(val & (KVM_REQ_VMX_##uname |
>> > > KVM_OPT_VMX_##uname))); \
>> > > lname##_controls_set(vmx, lname##_controls_get(vmx) & ~val);
>> > > \
>> > > }
>> >
>> > With this, will it be safer if we present L1 CTRL MSRs with the bits
>> > KVM really uses? Do I miss something?
>>
>> KVM will still allow L1 to use features/controls that KVM itself doesn't use, but
>> exposing features/controls that KVM doesn't use will require a more explicit
>> "override" of sorts, e.g. to prevent advertising features that are supported in
>> hardware, known to KVM, but disabled for whatever reason, e.g. a CPU bug,
>> eVMCS incompatibility, module param, etc...
> Mmm, that is fine too.
> But, do we consider the potential need of migration for a L1 VMM ?
> Normally the VM can be configured to be as hardware neutral for better
> compatibility, or exposing as close to hardware feature as possible
> for performance.
> For nested features, I thought we didn't support migration if L1 VMM
> yet, so exposing hardware capability by default is fine at moment. We
> may revisit one day in future if we need to support migration.

Not sure I got your point, nested state migration is fully supported in
KVM. When migrating a guest, KVM makes sure the list of features exposed
in VMX control MSRs remain the same. This may not be the case if you use
something like "-cpu host" in QEMU but the problems are not specific to
nesting.

> This MACRO do help anyway ????
>
>>
>> The intent of this BUILD_BUG_ON() is to detect KVM usage of bits that aren't
>> enabled by default, i.e. to lower the probability that a control gets used by KVM
>> but isn't exposed to L1 because it's a dynamically enabled control.

--
Vitaly

2022-07-06 22:55:22

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 00/14] KVM: nVMX: Use vmcs_config for setting up nested VMX MSRs

On Wed, Jun 29, 2022, Jim Mattson wrote:
> On Wed, Jun 29, 2022 at 2:06 AM Vitaly Kuznetsov <[email protected]> wrote:
>
> > For PERF_GLOBAL_CTRL errata:
> > - We can move the filtering to vmx_vmexit_ctrl()/vmx_vmentry_ctrl()
> > preserving the status quo: KVM doesn't use the feature but it is exposed
> > to L1 hypervisor (and L1 hypervisor presumably has the same check and
> > doesn't use the feature. FWIW, the workaround was added in 2011 and the
> > erratas it references appeared in 2010, this means that the affected
> > CPUs are quite old, modern proprietary hypervisors won't likely boot
> > there).
> Sadly, Nehalem and Westmere are well-supported by KVM today, and we
> will probably still continue to support them for at least another
> decade. They both have EPT, unrestricted guest, and other VT-x2
> features that KVM still considers optional.

Nehalem doesn't have unrestricted guest. Nehalem is the only generation with EPT
but not unrestricted guest.