2024-02-06 18:53:44

by Li, Xin3

[permalink] [raw]
Subject: [PATCH v5 1/2] KVM: VMX: Cleanup VMX basic information defines and usages

Define VMX basic information fields with BIT_ULL()/GENMASK_ULL(), and
replace hardcoded VMX basic numbers with these field macros.

Save the full/raw value of MSR_IA32_VMX_BASIC in the global vmcs_config
as type u64 to get rid of the hi/lo crud, and then use VMX_BASIC helpers
to extract info as needed.

VMX_EPTP_MT_{WB,UC} values 0x6 and 0x0 are generic x86 memory type
values, no need to prefix them with VMX_EPTP_.

Signed-off-by: Xin Li <[email protected]>
Tested-by: Shan Kang <[email protected]>
Acked-by: Kai Huang <[email protected]>
---

Changes since v4:
* Do not split VMX_BASIC bit definitions across multiple files (Kai
Huang).
* Put some words to the changelog to justify changes around memory
type macros (Kai Huang).
* Remove a leftover ';' (Kai Huang).

Changes since v3:
* Remove vmx_basic_vmcs_basic_cap() (Kai Huang).
* Add 2 macros VMX_BASIC_VMCS12_SIZE and VMX_BASIC_MEM_TYPE_WB to
avoid keeping 2 their bit shift macros (Kai Huang).

Changes since v2:
* Simply save the full/raw value of MSR_IA32_VMX_BASIC in the global
vmcs_config, and then use the helpers to extract info from it as
needed (Sean Christopherson).
* Move all VMX_MISC related changes to the second patch (Kai Huang).
* Commonize memory type definitions used in the VMX files, as memory
types are architectural.

Changes since v1:
* Don't add field shift macros unless it's really needed, extra layer
of indirect makes it harder to read (Sean Christopherson).
* Add a static_assert() to ensure that VMX_BASIC_FEATURES_MASK doesn't
overlap with VMX_BASIC_RESERVED_BITS (Sean Christopherson).
* read MSR_IA32_VMX_BASIC into an u64 rather than 2 u32 (Sean
Christopherson).
* Add 2 new functions for extracting fields from VMX basic (Sean
Christopherson).
* Drop the tools header update (Sean Christopherson).
* Move VMX basic field macros to arch/x86/include/asm/vmx.h.
---
arch/x86/include/asm/msr-index.h | 9 ---------
arch/x86/include/asm/vmx.h | 18 ++++++++++++++++--
arch/x86/kvm/vmx/capabilities.h | 6 ++----
arch/x86/kvm/vmx/nested.c | 31 ++++++++++++++++++++-----------
arch/x86/kvm/vmx/vmx.c | 24 ++++++++++--------------
5 files changed, 48 insertions(+), 40 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index f1bd7b91b3c6..63cd50bfdc6d 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1102,15 +1102,6 @@
#define MSR_IA32_VMX_VMFUNC 0x00000491
#define MSR_IA32_VMX_PROCBASED_CTLS3 0x00000492

-/* VMX_BASIC bits and bitmasks */
-#define VMX_BASIC_VMCS_SIZE_SHIFT 32
-#define VMX_BASIC_TRUE_CTLS (1ULL << 55)
-#define VMX_BASIC_64 0x0001000000000000LLU
-#define VMX_BASIC_MEM_TYPE_SHIFT 50
-#define VMX_BASIC_MEM_TYPE_MASK 0x003c000000000000LLU
-#define VMX_BASIC_MEM_TYPE_WB 6LLU
-#define VMX_BASIC_INOUT 0x0040000000000000LLU
-
/* Resctrl MSRs: */
/* - Intel: */
#define MSR_IA32_L3_QOS_CFG 0xc81
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 0e73616b82f3..4fa8012980f6 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -120,6 +120,17 @@

#define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff

+/* x86 memory types, explicitly used in VMX only */
+#define MEM_TYPE_WB 0x6ULL
+#define MEM_TYPE_UC 0x0ULL
+
+/* VMX_BASIC bits */
+#define VMX_BASIC_32BIT_PHYS_ADDR_ONLY BIT_ULL(48)
+#define VMX_BASIC_DUAL_MONITOR_TREATMENT BIT_ULL(49)
+#define VMX_BASIC_INOUT BIT_ULL(54)
+#define VMX_BASIC_TRUE_CTLS BIT_ULL(55)
+
+
#define VMX_MISC_PREEMPTION_TIMER_RATE_MASK 0x0000001f
#define VMX_MISC_SAVE_EFER_LMA 0x00000020
#define VMX_MISC_ACTIVITY_HLT 0x00000040
@@ -143,6 +154,11 @@ static inline u32 vmx_basic_vmcs_size(u64 vmx_basic)
return (vmx_basic & GENMASK_ULL(44, 32)) >> 32;
}

+static inline u32 vmx_basic_vmcs_mem_type(u64 vmx_basic)
+{
+ return (vmx_basic & GENMASK_ULL(53, 50)) >> 50;
+}
+
static inline int vmx_misc_preemption_timer_rate(u64 vmx_misc)
{
return vmx_misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
@@ -505,8 +521,6 @@ enum vmcs_field {
#define VMX_EPTP_PWL_5 0x20ull
#define VMX_EPTP_AD_ENABLE_BIT (1ull << 6)
#define VMX_EPTP_MT_MASK 0x7ull
-#define VMX_EPTP_MT_WB 0x6ull
-#define VMX_EPTP_MT_UC 0x0ull
#define VMX_EPT_READABLE_MASK 0x1ull
#define VMX_EPT_WRITABLE_MASK 0x2ull
#define VMX_EPT_EXECUTABLE_MASK 0x4ull
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 41a4533f9989..86ce8bb96bed 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -54,9 +54,7 @@ struct nested_vmx_msrs {
};

struct vmcs_config {
- int size;
- u32 basic_cap;
- u32 revision_id;
+ u64 basic;
u32 pin_based_exec_ctrl;
u32 cpu_based_exec_ctrl;
u32 cpu_based_2nd_exec_ctrl;
@@ -76,7 +74,7 @@ extern struct vmx_capability vmx_capability __ro_after_init;

static inline bool cpu_has_vmx_basic_inout(void)
{
- return (((u64)vmcs_config.basic_cap << 32) & VMX_BASIC_INOUT);
+ return vmcs_config.basic & VMX_BASIC_INOUT;
}

static inline bool cpu_has_virtual_nmis(void)
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 994e014f8a50..80fea1875948 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1226,23 +1226,29 @@ static bool is_bitwise_subset(u64 superset, u64 subset, u64 mask)
return (superset | subset) == superset;
}

+#define VMX_BASIC_FEATURES_MASK \
+ (VMX_BASIC_DUAL_MONITOR_TREATMENT | \
+ VMX_BASIC_INOUT | \
+ VMX_BASIC_TRUE_CTLS)
+
+#define VMX_BASIC_RESERVED_BITS \
+ (GENMASK_ULL(63, 56) | GENMASK_ULL(47, 45) | BIT_ULL(31))
+
static int vmx_restore_vmx_basic(struct vcpu_vmx *vmx, u64 data)
{
- const u64 feature_and_reserved =
- /* feature (except bit 48; see below) */
- BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) |
- /* reserved */
- BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 56);
u64 vmx_basic = vmcs_config.nested.basic;

- if (!is_bitwise_subset(vmx_basic, data, feature_and_reserved))
+ static_assert(!(VMX_BASIC_FEATURES_MASK & VMX_BASIC_RESERVED_BITS));
+
+ if (!is_bitwise_subset(vmx_basic, data,
+ VMX_BASIC_FEATURES_MASK | VMX_BASIC_RESERVED_BITS))
return -EINVAL;

/*
* KVM does not emulate a version of VMX that constrains physical
* addresses of VMX structures (e.g. VMCS) to 32-bits.
*/
- if (data & BIT_ULL(48))
+ if (data & VMX_BASIC_32BIT_PHYS_ADDR_ONLY)
return -EINVAL;

if (vmx_basic_vmcs_revision_id(vmx_basic) !=
@@ -2726,11 +2732,11 @@ static bool nested_vmx_check_eptp(struct kvm_vcpu *vcpu, u64 new_eptp)

/* Check for memory type validity */
switch (new_eptp & VMX_EPTP_MT_MASK) {
- case VMX_EPTP_MT_UC:
+ case MEM_TYPE_UC:
if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPTP_UC_BIT)))
return false;
break;
- case VMX_EPTP_MT_WB:
+ case MEM_TYPE_WB:
if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPTP_WB_BIT)))
return false;
break;
@@ -6994,6 +7000,9 @@ static void nested_vmx_setup_misc_data(struct vmcs_config *vmcs_conf,
msrs->misc_high = 0;
}

+#define VMX_BSAIC_VMCS12_SIZE ((u64)VMCS12_SIZE << 32)
+#define VMX_BASIC_MEM_TYPE_WB (MEM_TYPE_WB << 50)
+
static void nested_vmx_setup_basic(struct nested_vmx_msrs *msrs)
{
/*
@@ -7005,8 +7014,8 @@ static void nested_vmx_setup_basic(struct nested_vmx_msrs *msrs)
msrs->basic =
VMCS12_REVISION |
VMX_BASIC_TRUE_CTLS |
- ((u64)VMCS12_SIZE << VMX_BASIC_VMCS_SIZE_SHIFT) |
- (VMX_BASIC_MEM_TYPE_WB << VMX_BASIC_MEM_TYPE_SHIFT);
+ VMX_BSAIC_VMCS12_SIZE |
+ VMX_BASIC_MEM_TYPE_WB;

if (cpu_has_vmx_basic_inout())
msrs->basic |= VMX_BASIC_INOUT;
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e262bc2ba4e5..dc163a580f98 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2563,13 +2563,13 @@ static u64 adjust_vmx_controls64(u64 ctl_opt, u32 msr)
static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
struct vmx_capability *vmx_cap)
{
- u32 vmx_msr_low, vmx_msr_high;
u32 _pin_based_exec_control = 0;
u32 _cpu_based_exec_control = 0;
u32 _cpu_based_2nd_exec_control = 0;
u64 _cpu_based_3rd_exec_control = 0;
u32 _vmexit_control = 0;
u32 _vmentry_control = 0;
+ u64 basic_msr;
u64 misc_msr;
int i;

@@ -2688,29 +2688,25 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
_vmexit_control &= ~x_ctrl;
}

- rdmsr(MSR_IA32_VMX_BASIC, vmx_msr_low, vmx_msr_high);
+ rdmsrl(MSR_IA32_VMX_BASIC, basic_msr);

/* IA-32 SDM Vol 3B: VMCS size is never greater than 4kB. */
- if ((vmx_msr_high & 0x1fff) > PAGE_SIZE)
+ if ((vmx_basic_vmcs_size(basic_msr) > PAGE_SIZE))
return -EIO;

#ifdef CONFIG_X86_64
/* IA-32 SDM Vol 3B: 64-bit CPUs always have VMX_BASIC_MSR[48]==0. */
- if (vmx_msr_high & (1u<<16))
+ if (basic_msr & VMX_BASIC_32BIT_PHYS_ADDR_ONLY)
return -EIO;
#endif

/* Require Write-Back (WB) memory type for VMCS accesses. */
- if (((vmx_msr_high >> 18) & 15) != 6)
+ if (vmx_basic_vmcs_mem_type(basic_msr) != MEM_TYPE_WB)
return -EIO;

rdmsrl(MSR_IA32_VMX_MISC, misc_msr);

- vmcs_conf->size = vmx_msr_high & 0x1fff;
- vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;
-
- vmcs_conf->revision_id = vmx_msr_low;
-
+ vmcs_conf->basic = basic_msr;
vmcs_conf->pin_based_exec_ctrl = _pin_based_exec_control;
vmcs_conf->cpu_based_exec_ctrl = _cpu_based_exec_control;
vmcs_conf->cpu_based_2nd_exec_ctrl = _cpu_based_2nd_exec_control;
@@ -2860,13 +2856,13 @@ struct vmcs *alloc_vmcs_cpu(bool shadow, int cpu, gfp_t flags)
if (!pages)
return NULL;
vmcs = page_address(pages);
- memset(vmcs, 0, vmcs_config.size);
+ memset(vmcs, 0, vmx_basic_vmcs_size(vmcs_config.basic));

/* KVM supports Enlightened VMCS v1 only */
if (kvm_is_using_evmcs())
vmcs->hdr.revision_id = KVM_EVMCS_VERSION;
else
- vmcs->hdr.revision_id = vmcs_config.revision_id;
+ vmcs->hdr.revision_id = vmx_basic_vmcs_revision_id(vmcs_config.basic);

if (shadow)
vmcs->hdr.shadow_vmcs = 1;
@@ -2959,7 +2955,7 @@ static __init int alloc_kvm_area(void)
* physical CPU.
*/
if (kvm_is_using_evmcs())
- vmcs->hdr.revision_id = vmcs_config.revision_id;
+ vmcs->hdr.revision_id = vmx_basic_vmcs_revision_id(vmcs_config.basic);

per_cpu(vmxarea, cpu) = vmcs;
}
@@ -3361,7 +3357,7 @@ static int vmx_get_max_ept_level(void)

u64 construct_eptp(struct kvm_vcpu *vcpu, hpa_t root_hpa, int root_level)
{
- u64 eptp = VMX_EPTP_MT_WB;
+ u64 eptp = MEM_TYPE_WB;

eptp |= (root_level == 5) ? VMX_EPTP_PWL_5 : VMX_EPTP_PWL_4;


base-commit: 60eedcfceda9db46f1b333e5e1aa9359793f04fb
--
2.43.0



2024-02-06 19:06:22

by Li, Xin3

[permalink] [raw]
Subject: [PATCH v5 2/2] KVM: VMX: Cleanup VMX misc information defines and usages

Define VMX misc information fields with BIT_ULL()/GENMASK_ULL(), and move
VMX misc field macros to vmx.h if used in multiple files or where they are
used only once.

Signed-off-by: Xin Li <[email protected]>
---
arch/x86/include/asm/msr-index.h | 5 -----
arch/x86/include/asm/vmx.h | 12 +++++------
arch/x86/kvm/vmx/capabilities.h | 4 ++--
arch/x86/kvm/vmx/nested.c | 34 ++++++++++++++++++++++++--------
arch/x86/kvm/vmx/nested.h | 2 +-
arch/x86/kvm/vmx/vmx.c | 8 +++-----
6 files changed, 37 insertions(+), 28 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 63cd50bfdc6d..5b0c6d184b4d 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1118,11 +1118,6 @@
#define MSR_IA32_SMBA_BW_BASE 0xc0000280
#define MSR_IA32_EVT_CFG_BASE 0xc0000400

-/* MSR_IA32_VMX_MISC bits */
-#define MSR_IA32_VMX_MISC_INTEL_PT (1ULL << 14)
-#define MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS (1ULL << 29)
-#define MSR_IA32_VMX_MISC_PREEMPTION_TIMER_SCALE 0x1F
-
/* AMD-V MSRs */
#define MSR_VM_CR 0xc0010114
#define MSR_VM_IGNNE 0xc0010115
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 4fa8012980f6..0100731eaaab 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -131,12 +131,10 @@
#define VMX_BASIC_TRUE_CTLS BIT_ULL(55)


-#define VMX_MISC_PREEMPTION_TIMER_RATE_MASK 0x0000001f
-#define VMX_MISC_SAVE_EFER_LMA 0x00000020
-#define VMX_MISC_ACTIVITY_HLT 0x00000040
-#define VMX_MISC_ACTIVITY_WAIT_SIPI 0x00000100
-#define VMX_MISC_ZERO_LEN_INS 0x40000000
-#define VMX_MISC_MSR_LIST_MULTIPLIER 512
+/* VMX_MISC bits and bitmasks */
+#define VMX_MISC_INTEL_PT BIT_ULL(14)
+#define VMX_MISC_VMWRITE_SHADOW_RO_FIELDS BIT_ULL(29)
+#define VMX_MISC_ZERO_LEN_INS BIT_ULL(30)

/* VMFUNC functions */
#define VMFUNC_CONTROL_BIT(x) BIT((VMX_FEATURE_##x & 0x1f) - 28)
@@ -161,7 +159,7 @@ static inline u32 vmx_basic_vmcs_mem_type(u64 vmx_basic)

static inline int vmx_misc_preemption_timer_rate(u64 vmx_misc)
{
- return vmx_misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
+ return vmx_misc & GENMASK_ULL(4, 0);
}

static inline int vmx_misc_cr3_count(u64 vmx_misc)
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 86ce8bb96bed..cb6588238f46 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -223,7 +223,7 @@ static inline bool cpu_has_vmx_vmfunc(void)
static inline bool cpu_has_vmx_shadow_vmcs(void)
{
/* check if the cpu supports writing r/o exit information fields */
- if (!(vmcs_config.misc & MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
+ if (!(vmcs_config.misc & VMX_MISC_VMWRITE_SHADOW_RO_FIELDS))
return false;

return vmcs_config.cpu_based_2nd_exec_ctrl &
@@ -365,7 +365,7 @@ static inline bool cpu_has_vmx_invvpid_global(void)

static inline bool cpu_has_vmx_intel_pt(void)
{
- return (vmcs_config.misc & MSR_IA32_VMX_MISC_INTEL_PT) &&
+ return (vmcs_config.misc & VMX_MISC_INTEL_PT) &&
(vmcs_config.cpu_based_2nd_exec_ctrl & SECONDARY_EXEC_PT_USE_GPA) &&
(vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_RTIT_CTL);
}
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 80fea1875948..a9dfda2cbca3 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -917,6 +917,8 @@ static int nested_vmx_store_msr_check(struct kvm_vcpu *vcpu,
return 0;
}

+#define VMX_MISC_MSR_LIST_MULTIPLIER 512
+
static u32 nested_vmx_max_atomic_switch_msrs(struct kvm_vcpu *vcpu)
{
struct vcpu_vmx *vmx = to_vmx(vcpu);
@@ -1315,18 +1317,34 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
return 0;
}

+#define VMX_MISC_SAVE_EFER_LMA BIT_ULL(5)
+#define VMX_MISC_ACTIVITY_STATE_BITMAP GENMASK_ULL(8, 6)
+#define VMX_MISC_ACTIVITY_HLT BIT_ULL(6)
+#define VMX_MISC_ACTIVITY_WAIT_SIPI BIT_ULL(8)
+#define VMX_MISC_RDMSR_IN_SMM BIT_ULL(15)
+#define VMX_MISC_VMXOFF_BLOCK_SMI BIT_ULL(28)
+
+#define VMX_MISC_FEATURES_MASK \
+ (VMX_MISC_SAVE_EFER_LMA | \
+ VMX_MISC_ACTIVITY_STATE_BITMAP | \
+ VMX_MISC_INTEL_PT | \
+ VMX_MISC_RDMSR_IN_SMM | \
+ VMX_MISC_VMXOFF_BLOCK_SMI | \
+ VMX_MISC_VMWRITE_SHADOW_RO_FIELDS | \
+ VMX_MISC_ZERO_LEN_INS)
+
+#define VMX_MISC_RESERVED_BITS \
+ (BIT_ULL(31) | GENMASK_ULL(13, 9))
+
static int vmx_restore_vmx_misc(struct vcpu_vmx *vmx, u64 data)
{
- const u64 feature_and_reserved_bits =
- /* feature */
- BIT_ULL(5) | GENMASK_ULL(8, 6) | BIT_ULL(14) | BIT_ULL(15) |
- BIT_ULL(28) | BIT_ULL(29) | BIT_ULL(30) |
- /* reserved */
- GENMASK_ULL(13, 9) | BIT_ULL(31);
u64 vmx_misc = vmx_control_msr(vmcs_config.nested.misc_low,
vmcs_config.nested.misc_high);

- if (!is_bitwise_subset(vmx_misc, data, feature_and_reserved_bits))
+ static_assert(!(VMX_MISC_FEATURES_MASK & VMX_MISC_RESERVED_BITS));
+
+ if (!is_bitwise_subset(vmx_misc, data,
+ VMX_MISC_FEATURES_MASK | VMX_MISC_RESERVED_BITS))
return -EINVAL;

if ((vmx->nested.msrs.pinbased_ctls_high &
@@ -6993,7 +7011,7 @@ static void nested_vmx_setup_misc_data(struct vmcs_config *vmcs_conf,
{
msrs->misc_low = (u32)vmcs_conf->misc & VMX_MISC_SAVE_EFER_LMA;
msrs->misc_low |=
- MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS |
+ VMX_MISC_VMWRITE_SHADOW_RO_FIELDS |
VMX_MISC_EMULATED_PREEMPTION_TIMER_RATE |
VMX_MISC_ACTIVITY_HLT |
VMX_MISC_ACTIVITY_WAIT_SIPI;
diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
index cce4e2aa30fb..0782fe599757 100644
--- a/arch/x86/kvm/vmx/nested.h
+++ b/arch/x86/kvm/vmx/nested.h
@@ -109,7 +109,7 @@ static inline unsigned nested_cpu_vmx_misc_cr3_count(struct kvm_vcpu *vcpu)
static inline bool nested_cpu_has_vmwrite_any_field(struct kvm_vcpu *vcpu)
{
return to_vmx(vcpu)->nested.msrs.misc_low &
- MSR_IA32_VMX_MISC_VMWRITE_SHADOW_RO_FIELDS;
+ VMX_MISC_VMWRITE_SHADOW_RO_FIELDS;
}

static inline bool nested_cpu_has_zero_length_injection(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dc163a580f98..96f0d65dea45 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2570,7 +2570,6 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
u32 _vmexit_control = 0;
u32 _vmentry_control = 0;
u64 basic_msr;
- u64 misc_msr;
int i;

/*
@@ -2704,8 +2703,6 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
if (vmx_basic_vmcs_mem_type(basic_msr) != MEM_TYPE_WB)
return -EIO;

- rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
-
vmcs_conf->basic = basic_msr;
vmcs_conf->pin_based_exec_ctrl = _pin_based_exec_control;
vmcs_conf->cpu_based_exec_ctrl = _cpu_based_exec_control;
@@ -2713,7 +2710,8 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
vmcs_conf->cpu_based_3rd_exec_ctrl = _cpu_based_3rd_exec_control;
vmcs_conf->vmexit_ctrl = _vmexit_control;
vmcs_conf->vmentry_ctrl = _vmentry_control;
- vmcs_conf->misc = misc_msr;
+
+ rdmsrl(MSR_IA32_VMX_MISC, vmcs_conf->misc);

#if IS_ENABLED(CONFIG_HYPERV)
if (enlightened_vmcs)
@@ -8597,7 +8595,7 @@ static __init int hardware_setup(void)
u64 use_timer_freq = 5000ULL * 1000 * 1000;

cpu_preemption_timer_multi =
- vmcs_config.misc & VMX_MISC_PREEMPTION_TIMER_RATE_MASK;
+ vmx_misc_preemption_timer_rate(vmcs_config.misc);

if (tsc_khz)
use_timer_freq = (u64)tsc_khz * 1000;
--
2.43.0


2024-02-13 22:52:21

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] KVM: VMX: Cleanup VMX basic information defines and usages

Please send cover letters for series with more than one patch, even if there are
only two patches. At the very least, cover letters are a convenient location to
provide feedback/communication for the series as a whole. Instead, I need to
put it here:

I'll send a v6 with all of my suggestions incorporated. I like the cleanups, but
there are too many process issues to fixup when applying, a few things that I
straight up disagree with, and more aggressive memtype related changes that can
be done in the context of this series.

On Tue, Feb 06, 2024, Xin Li wrote:
> Define VMX basic information fields with BIT_ULL()/GENMASK_ULL(), and
> replace hardcoded VMX basic numbers with these field macros.
>
> Save the full/raw value of MSR_IA32_VMX_BASIC in the global vmcs_config
> as type u64 to get rid of the hi/lo crud, and then use VMX_BASIC helpers
> to extract info as needed.
>
> VMX_EPTP_MT_{WB,UC} values 0x6 and 0x0 are generic x86 memory type
> values, no need to prefix them with VMX_EPTP_.

*sigh*

This obviously, like super duper obviously, should be at least three distinct
patches. The changelog has three paragraphs that have *zero* relation to each
other, and the changelog doesn't even cover all of the opportunistic cleanups
that are being done.

> +/* x86 memory types, explicitly used in VMX only */
> +#define MEM_TYPE_WB 0x6ULL
> +#define MEM_TYPE_UC 0x0ULL

No, this is ridiculous. These values are architectural, there's no reason for
KVM to have yet another copy. The MTRRs #defines have goofy names, and are
incomplete, but it's trivial to move the enums from pat/memtype.c to msr-index.h.

> @@ -505,8 +521,6 @@ enum vmcs_field {
> #define VMX_EPTP_PWL_5 0x20ull
> #define VMX_EPTP_AD_ENABLE_BIT (1ull << 6)
> #define VMX_EPTP_MT_MASK 0x7ull
> -#define VMX_EPTP_MT_WB 0x6ull
> -#define VMX_EPTP_MT_UC 0x0ull

I would strongly prefer to keep the VMX_EPTP_MT_WB and VMX_EPTP_MT_UC defines,
at least so long as KVM is open coding reads and writes to the EPTP. E.g. if
someone wants to do a follow-up series that adds wrappers to decode/encode the
memtype (and other fiels) from/to EPTP values, then I'd be fine dropping these.

But this:


/* Check for memory type validity */
switch (new_eptp & VMX_EPTP_MT_MASK) {
case MEM_TYPE_UC:
if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPTP_UC_BIT)))
return false;
break;
case MEM_TYPE_WB:
if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPTP_WB_BIT)))
return false;
break;
default:
return false;
}

looks wrong and is actively confusing, especially when the code below it does:

/* Page-walk levels validity. */
switch (new_eptp & VMX_EPTP_PWL_MASK) {
case VMX_EPTP_PWL_5:
if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPT_PAGE_WALK_5_BIT)))
return false;
break;
case VMX_EPTP_PWL_4:
if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPT_PAGE_WALK_4_BIT)))
return false;
break;
default:
return false;
}

> static inline bool cpu_has_virtual_nmis(void)
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 994e014f8a50..80fea1875948 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -1226,23 +1226,29 @@ static bool is_bitwise_subset(u64 superset, u64 subset, u64 mask)
> return (superset | subset) == superset;
> }
>
> +#define VMX_BASIC_FEATURES_MASK \
> + (VMX_BASIC_DUAL_MONITOR_TREATMENT | \
> + VMX_BASIC_INOUT | \
> + VMX_BASIC_TRUE_CTLS)
> +
> +#define VMX_BASIC_RESERVED_BITS \
> + (GENMASK_ULL(63, 56) | GENMASK_ULL(47, 45) | BIT_ULL(31))

Looking at this with fresh eyes, I think #defines are overkill. There is zero
chance anything other than vmx_restore_vmx_basic() will use these, and the feature
bits mask is rather weird. It's not a mask of features that KVM supports, it's
a mask of feature *bits* that KVM knows about.

So rather than add #defines, I think we can keep "const u64" variables, but split
into feature_bits and reserved_bits (the latter will have open coded GENMASK_ULL()
usage, whereas the former will not).

BUILD_BUG_ON() is fancy enough that it can detect overlap.

> @@ -6994,6 +7000,9 @@ static void nested_vmx_setup_misc_data(struct vmcs_config *vmcs_conf,
> msrs->misc_high = 0;
> }
>
> +#define VMX_BSAIC_VMCS12_SIZE ((u64)VMCS12_SIZE << 32)

Typo.

> +#define VMX_BASIC_MEM_TYPE_WB (MEM_TYPE_WB << 50)

I don't see any value in either of these. In fact, I find them both to be far
more confusing, and much more likely to be incorrectly used.

Back in v1, when I said "don't bother with shift #defines", I was very specifically
talking about feature bits where defining the bit shift is an extra, pointless
layer. I even (tried) to clarify that.

2024-02-13 22:55:53

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 2/2] KVM: VMX: Cleanup VMX misc information defines and usages

On Tue, Feb 06, 2024, Xin Li wrote:
> Define VMX misc information fields with BIT_ULL()/GENMASK_ULL(), and move
> VMX misc field macros to vmx.h if used in multiple files or where they are
> used only once.

Yeah, no. This changelog doesn't even begin to cover what all is going on here,
and as with the first patch, this obviously needs to be split into multiple
patches.

> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 80fea1875948..a9dfda2cbca3 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -917,6 +917,8 @@ static int nested_vmx_store_msr_check(struct kvm_vcpu *vcpu,
> return 0;
> }
>
> +#define VMX_MISC_MSR_LIST_MULTIPLIER 512
> +
> static u32 nested_vmx_max_atomic_switch_msrs(struct kvm_vcpu *vcpu)
> {
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> @@ -1315,18 +1317,34 @@ vmx_restore_control_msr(struct vcpu_vmx *vmx, u32 msr_index, u64 data)
> return 0;
> }
>
> +#define VMX_MISC_SAVE_EFER_LMA BIT_ULL(5)
> +#define VMX_MISC_ACTIVITY_STATE_BITMAP GENMASK_ULL(8, 6)
> +#define VMX_MISC_ACTIVITY_HLT BIT_ULL(6)
> +#define VMX_MISC_ACTIVITY_WAIT_SIPI BIT_ULL(8)
> +#define VMX_MISC_RDMSR_IN_SMM BIT_ULL(15)
> +#define VMX_MISC_VMXOFF_BLOCK_SMI BIT_ULL(28)

Gah, my bad. I misread a comment in v1, and gave nonsensical feedback. I thought
the comment was saying that #defines for the *reserved* bits should be in vmx.h
but you were talking about moving existing defines from msr-index.h to vmx.h.
Defining feature bits in nested.c, and thus splitting the VMX_MISC feature bit
definitions across multiple locations, doesn't make any sense. Sorry for the
confusion.

: > Probably should also move VMX MSR field defs from msr-index.h to
: > a vmx header file.
:
: Why bother putting them in a header? As above, it's extremely unlikely anything
: besides vmx_restore_vmx_basic() will ever care about exactly which bits are
: reserved.

> +#define VMX_MISC_FEATURES_MASK \
> + (VMX_MISC_SAVE_EFER_LMA | \
> + VMX_MISC_ACTIVITY_STATE_BITMAP | \
> + VMX_MISC_INTEL_PT | \
> + VMX_MISC_RDMSR_IN_SMM | \
> + VMX_MISC_VMXOFF_BLOCK_SMI | \
> + VMX_MISC_VMWRITE_SHADOW_RO_FIELDS | \
> + VMX_MISC_ZERO_LEN_INS)
> +
> +#define VMX_MISC_RESERVED_BITS \
> + (BIT_ULL(31) | GENMASK_ULL(13, 9))
> +
> static inline bool nested_cpu_has_zero_length_injection(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index dc163a580f98..96f0d65dea45 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -2570,7 +2570,6 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
> u32 _vmexit_control = 0;
> u32 _vmentry_control = 0;
> u64 basic_msr;
> - u64 misc_msr;
> int i;
>
> /*
> @@ -2704,8 +2703,6 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
> if (vmx_basic_vmcs_mem_type(basic_msr) != MEM_TYPE_WB)
> return -EIO;
>
> - rdmsrl(MSR_IA32_VMX_MISC, misc_msr);
> -
> vmcs_conf->basic = basic_msr;
> vmcs_conf->pin_based_exec_ctrl = _pin_based_exec_control;
> vmcs_conf->cpu_based_exec_ctrl = _cpu_based_exec_control;
> @@ -2713,7 +2710,8 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
> vmcs_conf->cpu_based_3rd_exec_ctrl = _cpu_based_3rd_exec_control;
> vmcs_conf->vmexit_ctrl = _vmexit_control;
> vmcs_conf->vmentry_ctrl = _vmentry_control;
> - vmcs_conf->misc = misc_msr;
> +
> + rdmsrl(MSR_IA32_VMX_MISC, vmcs_conf->misc);

No, keep the local variable. It's unlikely KVM will require a feature that is
enumerated in VMX_MISC, but it's not impossible, at which point we'd have to revert
this change.

And more importantly, if we messed up and forgot to revert this change, it's
slightly more like that the compiler will fail to detect an "uninitialized" access,
e.g. if vmcs_conf->misc were read before it was filled by rdmsrl().

Uninitialized in quotes because the usage from hardware_setup() isn't truly
uninitialized, i.e. it will be zeros. If we had a bug as above, we'd be relying
on the compiler to see that vmx_check_processor_compat()'s vmx_cap can get used
uninitialized, whereas the current code should be easily flagged if misc_msr is
used before it's written.

2024-02-14 00:55:11

by Li, Xin3

[permalink] [raw]
Subject: RE: [PATCH v5 1/2] KVM: VMX: Cleanup VMX basic information defines and usages

> Please send cover letters for series with more than one patch, even if there are
> only two patches. At the very least, cover letters are a convenient location to
> provide feedback/communication for the series as a whole.

Kai also said so... I'll take it as a standard practice.

> Instead, I need to put it here:
>
> I'll send a v6 with all of my suggestions incorporated.

Perfect!

> I like the cleanups, but
> there are too many process issues to fixup when applying, a few things that I
> straight up disagree with, and more aggressive memtype related changes that can
> be done in the context of this series.
>
> > @@ -505,8 +521,6 @@ enum vmcs_field {
> > #define VMX_EPTP_PWL_5 0x20ull
> > #define VMX_EPTP_AD_ENABLE_BIT (1ull << 6)
> > #define VMX_EPTP_MT_MASK 0x7ull
> > -#define VMX_EPTP_MT_WB 0x6ull
> > -#define VMX_EPTP_MT_UC 0x0ull
>
> I would strongly prefer to keep the VMX_EPTP_MT_WB and VMX_EPTP_MT_UC
> defines,
> at least so long as KVM is open coding reads and writes to the EPTP. E.g if
> someone wants to do a follow-up series that adds wrappers to decode/encode
> the
> memtype (and other fiels) from/to EPTP values, then I'd be fine dropping these.
>
> But this:
>
>
> /* Check for memory type validity */
> switch (new_eptp & VMX_EPTP_MT_MASK) {
> case MEM_TYPE_UC:
> if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPTP_UC_BIT)))
> return false;
> break;
> case MEM_TYPE_WB:
> if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPTP_WB_BIT)))
> return false;
> break;
> default:
> return false;
> }
>
> looks wrong and is actively confusing, especially when the code below it does:
>
> /* Page-walk levels validity. */
> switch (new_eptp & VMX_EPTP_PWL_MASK) {
> case VMX_EPTP_PWL_5:
> if (CC(!(vmx->nested.msrs.ept_caps &
> VMX_EPT_PAGE_WALK_5_BIT)))
> return false;
> break;
> case VMX_EPTP_PWL_4:
> if (CC(!(vmx->nested.msrs.ept_caps &
> VMX_EPT_PAGE_WALK_4_BIT)))
> return false;
> break;
> default:
> return false;
> }
>

I see your point here. But "#define VMX_EPTP_MT_WB 0x6ull" seems to define
its own memory type 0x6. I think what we want is:

/* in a pat/mtrr header */
#define MEM_TYPE_WB 0x6

/* vmx.h */
#define VMX_EPTP_MT_WB MEM_TYPE_WB

if it's not regarded as another layer of indirect.

> > static inline bool cpu_has_virtual_nmis(void)
> > diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> > index 994e014f8a50..80fea1875948 100644
> > --- a/arch/x86/kvm/vmx/nested.c
> > +++ b/arch/x86/kvm/vmx/nested.c
> > @@ -1226,23 +1226,29 @@ static bool is_bitwise_subset(u64 superset, u64
> subset, u64 mask)
> > return (superset | subset) == superset;
> > }
> >
> > +#define VMX_BASIC_FEATURES_MASK \
> > + (VMX_BASIC_DUAL_MONITOR_TREATMENT | \
> > + VMX_BASIC_INOUT | \
> > + VMX_BASIC_TRUE_CTLS)
> > +
> > +#define VMX_BASIC_RESERVED_BITS \
> > + (GENMASK_ULL(63, 56) | GENMASK_ULL(47, 45) | BIT_ULL(31))
>
> Looking at this with fresh eyes, I think #defines are overkill. There is zero
> chance anything other than vmx_restore_vmx_basic() will use these, and the
> feature
> bits mask is rather weird. It's not a mask of features that KVM supports, it's
> a mask of feature *bits* that KVM knows about.
>
> So rather than add #defines, I think we can keep "const u64" variables, but split
> into feature_bits and reserved_bits (the latter will have open coded
> GENMASK_ULL()
> usage, whereas the former will not).
>
> BUILD_BUG_ON() is fancy enough that it can detect overlap.

Sounds reasonable to me.

>
> > +#define VMX_BSAIC_VMCS12_SIZE ((u64)VMCS12_SIZE << 32)
>
> Typo.

Sigh!

>
> > +#define VMX_BASIC_MEM_TYPE_WB (MEM_TYPE_WB << 50)
>
> I don't see any value in either of these. In fact, I find them both to be far
> more confusing, and much more likely to be incorrectly used.
>
> Back in v1, when I said "don't bother with shift #defines", I was very specifically
> talking about feature bits where defining the bit shift is an extra, pointless
> layer. I even (tried) to clarify that.

Another review comment got me lost here:
https://lore.kernel.org/kvm/[email protected]/

2024-02-14 01:28:11

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 1/2] KVM: VMX: Cleanup VMX basic information defines and usages

On Wed, Feb 14, 2024, Xin3 Li wrote:
> > VMX_EPT_PAGE_WALK_4_BIT)))
> > return false;
> > break;
> > default:
> > return false;
> > }
> >
>
> I see your point here. But "#define VMX_EPTP_MT_WB 0x6ull" seems to define
> its own memory type 0x6. I think what we want is:
>
> /* in a pat/mtrr header */
> #define MEM_TYPE_WB 0x6
>
> /* vmx.h */
> #define VMX_EPTP_MT_WB MEM_TYPE_WB
>
> if it's not regarded as another layer of indirect.

Heh, yep, I already had this:

/* The EPTP memtype is encoded in bits 2:0, i.e. doesn't need to be shifted. */
#define VMX_EPTP_MT_MASK 0x7ull
#define VMX_EPTP_MT_WB X86_MEMTYPE_WB
#define VMX_EPTP_MT_UC X86_MEMTYPE_UC

2024-02-14 02:25:41

by Li, Xin3

[permalink] [raw]
Subject: RE: [PATCH v5 1/2] KVM: VMX: Cleanup VMX basic information defines and usages

> > I see your point here. But "#define VMX_EPTP_MT_WB 0x6ull" seems to
> define
> > its own memory type 0x6. I think what we want is:
> >
> > /* in a pat/mtrr header */
> > #define MEM_TYPE_WB 0x6
> >
> > /* vmx.h */
> > #define VMX_EPTP_MT_WB MEM_TYPE_WB
> >
> > if it's not regarded as another layer of indirect.
>
> Heh, yep, I already had this:
>
> /* The EPTP memtype is encoded in bits 2:0, i.e. doesn't need to be shifted. */
> #define VMX_EPTP_MT_MASK 0x7ull
> #define VMX_EPTP_MT_WB X86_MEMTYPE_WB
> #define VMX_EPTP_MT_UC X86_MEMTYPE_UC

Perfect!