2023-02-10 00:31:59

by Sean Christopherson

Subject: [PATCH v2 00/21] KVM: x86: Disallow writes to feature MSRs post-KVM_RUN

Give feature MSRs the same treatment as CPUID and disallow changing said
MSRs after KVM_RUN. Fix a tangentially related bug in the vPMU where KVM
leaves the vLBRs enabled after userspace disables the guest's entire vPMU.
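
For context, here is a rough userspace-side sketch of the resulting
semantics. This is not part of the series; set_one_msr() and vcpu_fd are
illustrative names, and the struct layout mirrors the pattern the selftests
use for single-MSR accesses. After KVM_RUN, a write to an immutable feature
MSR succeeds only if the value is unchanged, i.e. KVM_SET_MSRS reports 0
MSRs written for any attempt to actually change the MSR.

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Write a single MSR; KVM_SET_MSRS returns the number of MSRs processed. */
static int set_one_msr(int vcpu_fd, uint32_t index, uint64_t value)
{
	struct {
		struct kvm_msrs hdr;
		struct kvm_msr_entry entry;
	} buf;

	memset(&buf, 0, sizeof(buf));
	buf.hdr.nmsrs = 1;
	buf.entry.index = index;
	buf.entry.data = value;

	/* 1 == the write was accepted, 0 == the write was rejected. */
	return ioctl(vcpu_fd, KVM_SET_MSRS, &buf);
}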

The bulk of this series is a rework of the vmx_pmu_caps_test, a.k.a.
the PERF_CAPABILITIES selftests, to expand its coverage. In addition to
verifying that KVM rejects changes after KVM_RUN, verify other bits beyond
full-width writes and the LBR format.

Note! There is a sneaky, small, but massive change buried halfway through
this series that will affect all x86 selftests. Patch

Verify KVM preserves userspace writes to "durable" MSRs

adds a KVM_GET_MSRS after every KVM_SET_MSRS that writes a single MSR and
expects to succeed. The intent is to opportunistically verify that KVM
provides "read what you wrote" for all "durable" MSRs. The PERF_CAPS test
was manually verifying this behavior, and while it seems kinda gratuitous,
the coverage is quite cheap in terms of both performance and maintenance cost,
i.e. I can't think of a reason _not_ to do it.
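
Roughly, the readback boils down to the following simplified sketch of the
wrapper that the patch named above adds to processor.h (TEST_ASSERT,
_vcpu_set_msr(), vcpu_get_msr() and is_durable_msr() are the existing/new
selftest helpers; is_durable_msr() filters out MSRs such as MSR_IA32_TSC
whose value legitimately changes between the write and the readback):

#define vcpu_set_msr(vcpu, msr, val)					\
do {									\
	uint64_t v = (val);						\
									\
	/* KVM_SET_MSRS must report exactly one MSR written. */	\
	TEST_ASSERT(_vcpu_set_msr(vcpu, msr, v) == 1,			\
		    "KVM_SET_MSRS failed on MSR 0x%x", (uint32_t)(msr)); \
	/* Opportunistically verify "read what you wrote". */		\
	if (is_durable_msr(msr))					\
		TEST_ASSERT(vcpu_get_msr(vcpu, msr) == v,		\
			    "MSR 0x%x readback mismatch", (uint32_t)(msr)); \
} while (0)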

Applies on https://github.com/kvm-x86/linux/tree/next.

Sean Christopherson (21):
KVM: x86: Rename kvm_init_msr_list() to clarify it inits multiple
lists
KVM: x86: Add a helper to query whether or not a vCPU has ever run
KVM: x86: Add macros to track first...last VMX feature MSRs
KVM: x86: Generate set of VMX feature MSRs using first/last
definitions
KVM: x86: Disallow writes to immutable feature MSRs after KVM_RUN
KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has
run
KVM: x86/pmu: Zero out LBR capabilities during PMU refresh
KVM: selftests: Split PMU caps sub-tests to avoid writing MSR after
KVM_RUN
KVM: selftests: Move 0/initial value PERF_CAPS checks to dedicated
sub-test
KVM: selftests: Assert that full-width PMC writes are supported if
PDCM=1
KVM: selftests: Print out failing MSR and value in vcpu_set_msr()
KVM: selftests: Verify KVM preserves userspace writes to "durable"
MSRs
KVM: selftests: Drop now-redundant checks on PERF_CAPABILITIES writes
KVM: selftests: Test all fungible features in PERF_CAPABILITIES
KVM: selftests: Test all immutable non-format bits in
PERF_CAPABILITIES
KVM: selftests: Expand negative testing of guest writes to
PERF_CAPABILITIES
KVM: selftests: Test post-KVM_RUN writes to PERF_CAPABILITIES
KVM: selftests: Drop "all done!" printf() from PERF_CAPABILITIES test
KVM: selftests: Refactor LBR_FMT test to avoid use of separate macro
KVM: selftests: Add negative testcase for PEBS format in
PERF_CAPABILITIES
KVM: selftests: Verify LBRs are disabled if vPMU is disabled

arch/x86/kvm/cpuid.c | 2 +-
arch/x86/kvm/mmu/mmu.c | 2 +-
arch/x86/kvm/pmu.c | 3 +
arch/x86/kvm/svm/svm.c | 2 +-
arch/x86/kvm/vmx/pmu_intel.c | 10 +
arch/x86/kvm/vmx/vmx.c | 8 +-
arch/x86/kvm/x86.c | 103 +++++---
arch/x86/kvm/x86.h | 13 +
.../selftests/kvm/include/x86_64/processor.h | 41 ++-
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 248 ++++++++++++++----
10 files changed, 342 insertions(+), 90 deletions(-)


base-commit: 62ef199250cd46fb66fe98267137b7f64e0b41b4
--
2.39.1.581.gbfd45094c4-goog



2023-02-10 00:32:03

by Sean Christopherson

Subject: [PATCH v2 01/21] KVM: x86: Rename kvm_init_msr_list() to clarify it inits multiple lists

Rename kvm_init_msr_list() to kvm_init_msr_lists() to clarify that it
initializes multiple lists: MSRs to save, emulated MSRs, and feature MSRs.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/x86.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f706621c35b8..7b91f73a837d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7071,7 +7071,7 @@ static void kvm_probe_msr_to_save(u32 msr_index)
msrs_to_save[num_msrs_to_save++] = msr_index;
}

-static void kvm_init_msr_list(void)
+static void kvm_init_msr_lists(void)
{
unsigned i;

@@ -9450,7 +9450,7 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
kvm_caps.max_guest_tsc_khz = max;
}
kvm_caps.default_tsc_scaling_ratio = 1ULL << kvm_caps.tsc_scaling_ratio_frac_bits;
- kvm_init_msr_list();
+ kvm_init_msr_lists();
return 0;

out_unwind_ops:
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:32:06

by Sean Christopherson

Subject: [PATCH v2 02/21] KVM: x86: Add a helper to query whether or not a vCPU has ever run

Add a helper to query if a vCPU has run so that KVM doesn't have to open
code the check on last_vmentry_cpu being set to a magic value.

No functional change intended.

Suggested-by: Xiaoyao Li <[email protected]>
Cc: Like Xu <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/cpuid.c | 2 +-
arch/x86/kvm/mmu/mmu.c | 2 +-
arch/x86/kvm/x86.h | 5 +++++
3 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 8f8edeaf8177..448d627ce891 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -420,7 +420,7 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
* KVM_SET_CPUID{,2} again. To support this legacy behavior, check
* whether the supplied CPUID data is equal to what's already set.
*/
- if (vcpu->arch.last_vmentry_cpu != -1) {
+ if (kvm_vcpu_has_run(vcpu)) {
r = kvm_cpuid_check_equal(vcpu, e2, nent);
if (r)
return r;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index c91ee2927dd7..b0693195273b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5393,7 +5393,7 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
* Changing guest CPUID after KVM_RUN is forbidden, see the comment in
* kvm_arch_vcpu_ioctl().
*/
- KVM_BUG_ON(vcpu->arch.last_vmentry_cpu != -1, vcpu->kvm);
+ KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm);
}

void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index a8167b47b8c8..754190af1791 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -83,6 +83,11 @@ static inline unsigned int __shrink_ple_window(unsigned int val,
void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu);
int kvm_check_nested_events(struct kvm_vcpu *vcpu);

+static inline bool kvm_vcpu_has_run(struct kvm_vcpu *vcpu)
+{
+ return vcpu->arch.last_vmentry_cpu != -1;
+}
+
static inline bool kvm_is_exception_pending(struct kvm_vcpu *vcpu)
{
return vcpu->arch.exception.pending ||
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:32:11

by Sean Christopherson

Subject: [PATCH v2 03/21] KVM: x86: Add macros to track first...last VMX feature MSRs

Add macros to track the range of VMX feature MSRs that are emulated by
KVM to reduce the maintenance cost of extending the set of emulated MSRs.

Note, KVM doesn't necessarily emulate all known/consumed VMX MSRs, e.g.
PROCBASED_CTLS3 is consumed by KVM to enable IPI virtualization, but is
not emulated as KVM doesn't emulate/virtualize IPI virtualization for
nested guests.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/svm.c | 2 +-
arch/x86/kvm/vmx/vmx.c | 8 ++++----
arch/x86/kvm/x86.h | 8 ++++++++
3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index b43775490074..a5b9ebd6f2c5 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4107,7 +4107,7 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index)
{
switch (index) {
case MSR_IA32_MCG_EXT_CTL:
- case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
+ case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
return false;
case MSR_IA32_SMBASE:
if (!IS_ENABLED(CONFIG_KVM_SMM))
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 47abd9101e68..ee86db130519 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -1945,7 +1945,7 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
{
switch (msr->index) {
- case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
+ case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
if (!nested)
return 1;
return vmx_get_vmx_msr(&vmcs_config.nested, msr->index, &msr->data);
@@ -2030,7 +2030,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
msr_info->data = to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash
[msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
break;
- case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
+ case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
if (!nested_vmx_allowed(vcpu))
return 1;
if (vmx_get_vmx_msr(&vmx->nested.msrs, msr_info->index,
@@ -2366,7 +2366,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
vmx->msr_ia32_sgxlepubkeyhash
[msr_index - MSR_IA32_SGXLEPUBKEYHASH0] = data;
break;
- case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
+ case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
if (!msr_info->host_initiated)
return 1; /* they are read-only */
if (!nested_vmx_allowed(vcpu))
@@ -6960,7 +6960,7 @@ static bool vmx_has_emulated_msr(struct kvm *kvm, u32 index)
* real mode.
*/
return enable_unrestricted_guest || emulate_invalid_guest_state;
- case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
+ case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
return nested;
case MSR_AMD64_VIRT_SPEC_CTRL:
case MSR_AMD64_TSC_RATIO:
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 754190af1791..4bc483d082ee 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -40,6 +40,14 @@ void kvm_spurious_fault(void);
failed; \
})

+/*
+ * The first...last VMX feature MSRs that are emulated by KVM. This may or may
+ * not cover all known VMX MSRs, as KVM doesn't emulate an MSR until there's an
+ * associated feature that KVM supports for nested virtualization.
+ */
+#define KVM_FIRST_EMULATED_VMX_MSR MSR_IA32_VMX_BASIC
+#define KVM_LAST_EMULATED_VMX_MSR MSR_IA32_VMX_VMFUNC
+
#define KVM_DEFAULT_PLE_GAP 128
#define KVM_VMX_DEFAULT_PLE_WINDOW 4096
#define KVM_DEFAULT_PLE_WINDOW_GROW 2
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:32:21

by Sean Christopherson

Subject: [PATCH v2 04/21] KVM: x86: Generate set of VMX feature MSRs using first/last definitions

Add VMX MSRs to the runtime list of feature MSRs by iterating over the
range of emulated MSRs instead of manually defining each MSR in the "all"
list. Using the range definition reduces the cost of emulating a new VMX
MSR, e.g. prevents forgetting to add an MSR to the list.

Extracting the VMX MSRs from the "all" list, which is a compile-time
constant, also shrinks the list to the point where the compiler can
heavily optimize code that iterates over the list.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/x86.c | 53 +++++++++++++++++++---------------------------
1 file changed, 22 insertions(+), 31 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7b91f73a837d..7b73a0b45041 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1539,36 +1539,19 @@ static u32 emulated_msrs[ARRAY_SIZE(emulated_msrs_all)];
static unsigned num_emulated_msrs;

/*
- * List of msr numbers which are used to expose MSR-based features that
- * can be used by a hypervisor to validate requested CPU features.
+ * List of MSRs that control the existence of MSR-based features, i.e. MSRs
+ * that are effectively CPUID leafs. VMX MSRs are also included in the set of
+ * feature MSRs, but are handled separately to allow expedited lookups.
*/
-static const u32 msr_based_features_all[] = {
- MSR_IA32_VMX_BASIC,
- MSR_IA32_VMX_TRUE_PINBASED_CTLS,
- MSR_IA32_VMX_PINBASED_CTLS,
- MSR_IA32_VMX_TRUE_PROCBASED_CTLS,
- MSR_IA32_VMX_PROCBASED_CTLS,
- MSR_IA32_VMX_TRUE_EXIT_CTLS,
- MSR_IA32_VMX_EXIT_CTLS,
- MSR_IA32_VMX_TRUE_ENTRY_CTLS,
- MSR_IA32_VMX_ENTRY_CTLS,
- MSR_IA32_VMX_MISC,
- MSR_IA32_VMX_CR0_FIXED0,
- MSR_IA32_VMX_CR0_FIXED1,
- MSR_IA32_VMX_CR4_FIXED0,
- MSR_IA32_VMX_CR4_FIXED1,
- MSR_IA32_VMX_VMCS_ENUM,
- MSR_IA32_VMX_PROCBASED_CTLS2,
- MSR_IA32_VMX_EPT_VPID_CAP,
- MSR_IA32_VMX_VMFUNC,
-
+static const u32 msr_based_features_all_except_vmx[] = {
MSR_AMD64_DE_CFG,
MSR_IA32_UCODE_REV,
MSR_IA32_ARCH_CAPABILITIES,
MSR_IA32_PERF_CAPABILITIES,
};

-static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all)];
+static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all_except_vmx) +
+ (KVM_LAST_EMULATED_VMX_MSR - KVM_FIRST_EMULATED_VMX_MSR + 1)];
static unsigned int num_msr_based_features;

/*
@@ -6996,6 +6979,18 @@ long kvm_arch_vm_ioctl(struct file *filp,
return r;
}

+static void kvm_probe_feature_msr(u32 msr_index)
+{
+ struct kvm_msr_entry msr = {
+ .index = msr_index,
+ };
+
+ if (kvm_get_msr_feature(&msr))
+ return;
+
+ msr_based_features[num_msr_based_features++] = msr_index;
+}
+
static void kvm_probe_msr_to_save(u32 msr_index)
{
u32 dummy[2];
@@ -7097,15 +7092,11 @@ static void kvm_init_msr_lists(void)
emulated_msrs[num_emulated_msrs++] = emulated_msrs_all[i];
}

- for (i = 0; i < ARRAY_SIZE(msr_based_features_all); i++) {
- struct kvm_msr_entry msr;
+ for (i = KVM_FIRST_EMULATED_VMX_MSR; i <= KVM_LAST_EMULATED_VMX_MSR; i++)
+ kvm_probe_feature_msr(i);

- msr.index = msr_based_features_all[i];
- if (kvm_get_msr_feature(&msr))
- continue;
-
- msr_based_features[num_msr_based_features++] = msr_based_features_all[i];
- }
+ for (i = 0; i < ARRAY_SIZE(msr_based_features_all_except_vmx); i++)
+ kvm_probe_feature_msr(msr_based_features_all_except_vmx[i]);
}

static int vcpu_mmio_write(struct kvm_vcpu *vcpu, gpa_t addr, int len,
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:32:28

by Sean Christopherson

Subject: [PATCH v2 05/21] KVM: x86: Disallow writes to immutable feature MSRs after KVM_RUN

Disallow writes to feature MSRs after KVM_RUN to prevent userspace from
changing the vCPU model after running the vCPU. Similar to guest CPUID,
KVM uses feature MSRs to configure intercepts, determine what operations
are/aren't allowed, etc. Changing the capabilities while the vCPU is
active will at best yield unpredictable guest behavior, and at worst
could be dangerous to KVM.

Allow writing the current value, e.g. so that userspace can blindly set
all MSRs when emulating RESET, and unconditionally allow writes to
MSR_IA32_UCODE_REV so that userspace can emulate patch loads.

Special case the VMX MSRs to keep the generic list small, i.e. so that
KVM can do a linear walk of the generic list without incurring meaningful
overhead.

Cc: Like Xu <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/x86.c | 36 ++++++++++++++++++++++++++++++++++++
1 file changed, 36 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7b73a0b45041..186cb6a81643 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1554,6 +1554,25 @@ static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all_except_vmx) +
(KVM_LAST_EMULATED_VMX_MSR - KVM_FIRST_EMULATED_VMX_MSR + 1)];
static unsigned int num_msr_based_features;

+/*
+ * All feature MSRs except uCode revID, which tracks the currently loaded uCode
+ * patch, are immutable once the vCPU model is defined.
+ */
+static bool kvm_is_immutable_feature_msr(u32 msr)
+{
+ int i;
+
+ if (msr >= KVM_FIRST_EMULATED_VMX_MSR && msr <= KVM_LAST_EMULATED_VMX_MSR)
+ return true;
+
+ for (i = 0; i < ARRAY_SIZE(msr_based_features_all_except_vmx); i++) {
+ if (msr == msr_based_features_all_except_vmx[i])
+ return msr != MSR_IA32_UCODE_REV;
+ }
+
+ return false;
+}
+
/*
* Some IA32_ARCH_CAPABILITIES bits have dependencies on MSRs that KVM
* does not yet virtualize. These include:
@@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)

static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
{
+ u64 val;
+
+ /*
+ * Disallow writes to immutable feature MSRs after KVM_RUN. KVM does
+ * not support modifying the guest vCPU model on the fly, e.g. changing
+ * the nVMX capabilities while L2 is running is nonsensical. Ignore
+ * writes of the same value, e.g. to allow userspace to blindly stuff
+ * all MSRs when emulating RESET.
+ */
+ if (vcpu->arch.last_vmentry_cpu != -1 &&
+ kvm_is_immutable_feature_msr(index)) {
+ if (do_get_msr(vcpu, index, &val) || *data != val)
+ return -EINVAL;
+
+ return 0;
+ }
+
return kvm_set_msr_ignored_check(vcpu, index, *data, true);
}

--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:32:32

by Sean Christopherson

Subject: [PATCH v2 06/21] KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run

Now that KVM disallows changing feature MSRs, i.e. PERF_CAPABILITIES,
after running a vCPU, WARN and bug the VM if the PMU is refreshed after
the vCPU has run.

Note, KVM has disallowed CPUID updates after running a vCPU since commit
feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN"), i.e.
PERF_CAPABILITIES was the only remaining way to trigger a PMU refresh
after KVM_RUN.

Cc: Like Xu <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/pmu.c | 3 +++
arch/x86/kvm/x86.c | 10 +++++++++-
2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 612e6c70ce2e..7e974c4e61b0 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -589,6 +589,9 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
*/
void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
{
+ if (KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm))
+ return;
+
static_call(kvm_x86_pmu_refresh)(vcpu);
}

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 186cb6a81643..1b14632a94a3 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3626,9 +3626,17 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (data & ~kvm_caps.supported_perf_cap)
return 1;

+ /*
+ * Note, this is not just a performance optimization! KVM
+ * disallows changing feature MSRs after the vCPU has run; PMU
+ * refresh will bug the VM if called after the vCPU has run.
+ */
+ if (vcpu->arch.perf_capabilities == data)
+ break;
+
vcpu->arch.perf_capabilities = data;
kvm_pmu_refresh(vcpu);
- return 0;
+ break;
case MSR_EFER:
return set_efer(vcpu, msr_info);
case MSR_K7_HWCR:
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:32:37

by Sean Christopherson

Subject: [PATCH v2 07/21] KVM: x86/pmu: Zero out LBR capabilities during PMU refresh

Zero out the LBR capabilities during PMU refresh to avoid exposing LBRs
to the guest against userspace's wishes. If userspace modifies the
guest's CPUID model or invokes KVM_CAP_PMU_CAPABILITY to disable vPMU
after an initial KVM_SET_CPUID2, but before the first KVM_RUN, KVM will
retain the previous LBR info due to bailing before refreshing the LBR
descriptor.

Note, this is a very theoretical bug; there is no known use case where a
VMM would deliberately enable the vPMU via KVM_SET_CPUID2 and then later
disable the vPMU.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index e8a3be0b9df9..d889bb2a1de5 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -531,6 +531,16 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_enable_mask = ~0ull;
pmu->pebs_data_cfg_mask = ~0ull;

+ memset(&lbr_desc->records, 0, sizeof(lbr_desc->records));
+
+ /*
+ * Setting passthrough of LBR MSRs is done only in the VM-Entry loop,
+ * and PMU refresh is disallowed after the vCPU has run, i.e. this code
+ * should never be reached while KVM is passing through MSRs.
+ */
+ if (KVM_BUG_ON(lbr_desc->msr_passthrough, vcpu->kvm))
+ return;
+
entry = kvm_find_cpuid_entry(vcpu, 0xa);
if (!entry || !vcpu->kvm->arch.enable_pmu)
return;
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:32:42

by Sean Christopherson

Subject: [PATCH v2 08/21] KVM: selftests: Split PMU caps sub-tests to avoid writing MSR after KVM_RUN

Split the PERF_CAPABILITIES subtests into two parts so that the LBR format
testcases don't execute after KVM_RUN. Now that KVM disallows changing
PERF_CAPABILITIES after KVM_RUN (same as guest CPUID), attempting to set
the MSR after KVM_RUN will yield false positives and/or false negatives
depending on what the test is trying to do.

Land the LBR format test in a more generic "immutable features" test in
anticipation of expanding its scope to other immutable features.

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 51 +++++++++++--------
1 file changed, 31 insertions(+), 20 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index c280ba1e6572..ac08c0fdd84d 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -41,24 +41,10 @@ static void guest_code(void)
wrmsr(MSR_IA32_PERF_CAPABILITIES, PMU_CAP_LBR_FMT);
}

-int main(int argc, char *argv[])
+static void test_fungible_perf_capabilities(union perf_capabilities host_cap)
{
- struct kvm_vm *vm;
struct kvm_vcpu *vcpu;
- int ret;
- union perf_capabilities host_cap;
- uint64_t val;
-
- host_cap.capabilities = kvm_get_feature_msr(MSR_IA32_PERF_CAPABILITIES);
- host_cap.capabilities &= (PMU_CAP_FW_WRITES | PMU_CAP_LBR_FMT);
-
- /* Create VM */
- vm = vm_create_with_one_vcpu(&vcpu, guest_code);
-
- TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
-
- TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
- TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
+ struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, guest_code);

/* testcase 1, set capabilities when we have PDCM bit */
vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, PMU_CAP_FW_WRITES);
@@ -70,7 +56,16 @@ int main(int argc, char *argv[])
vcpu_run(vcpu);
ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), PMU_CAP_FW_WRITES);

- /* testcase 2, check valid LBR formats are accepted */
+ kvm_vm_free(vm);
+}
+
+static void test_immutable_perf_capabilities(union perf_capabilities host_cap)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, NULL);
+ uint64_t val;
+ int ret;
+
vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, 0);
ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), 0);

@@ -78,8 +73,8 @@ int main(int argc, char *argv[])
ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), (u64)host_cap.lbr_format);

/*
- * Testcase 3, check that an "invalid" LBR format is rejected. Only an
- * exact match of the host's format (and 0/disabled) is allowed.
+ * KVM only supports the host's native LBR format, as well as '0' (to
+ * disable LBR support). Verify KVM rejects all other LBR formats.
*/
for (val = 1; val <= PMU_CAP_LBR_FMT; val++) {
if (val == (host_cap.capabilities & PMU_CAP_LBR_FMT))
@@ -88,7 +83,23 @@ int main(int argc, char *argv[])
ret = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, val);
TEST_ASSERT(!ret, "Bad LBR FMT = 0x%lx didn't fail", val);
}
+ kvm_vm_free(vm);
+}
+
+int main(int argc, char *argv[])
+{
+ union perf_capabilities host_cap;
+
+ TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
+
+ TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
+ TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
+
+ host_cap.capabilities = kvm_get_feature_msr(MSR_IA32_PERF_CAPABILITIES);
+ host_cap.capabilities &= (PMU_CAP_FW_WRITES | PMU_CAP_LBR_FMT);
+
+ test_fungible_perf_capabilities(host_cap);
+ test_immutable_perf_capabilities(host_cap);

printf("Completed perf capability tests.\n");
- kvm_vm_free(vm);
}
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:32:57

by Sean Christopherson

Subject: [PATCH v2 09/21] KVM: selftests: Move 0/initial value PERF_CAPS checks to dedicated sub-test

Use a separate sub-test to verify userspace can clear PERF_CAPABILITIES
and restore it to the KVM-supported value, as the testcase isn't unique
to the LBR format.

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 25 ++++++++++++++-----
1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index ac08c0fdd84d..c3b0738e361b 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -41,6 +41,24 @@ static void guest_code(void)
wrmsr(MSR_IA32_PERF_CAPABILITIES, PMU_CAP_LBR_FMT);
}

+/*
+ * Verify KVM allows writing PERF_CAPABILITIES with all KVM-supported features
+ * enabled, as well as '0' (to disable all features).
+ */
+static void test_basic_perf_capabilities(union perf_capabilities host_cap)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, NULL);
+
+ vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, 0);
+ ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), 0);
+
+ vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, host_cap.capabilities);
+ ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), host_cap.capabilities);
+
+ kvm_vm_free(vm);
+}
+
static void test_fungible_perf_capabilities(union perf_capabilities host_cap)
{
struct kvm_vcpu *vcpu;
@@ -66,12 +84,6 @@ static void test_immutable_perf_capabilities(union perf_capabilities host_cap)
uint64_t val;
int ret;

- vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, 0);
- ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), 0);
-
- vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, host_cap.lbr_format);
- ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), (u64)host_cap.lbr_format);
-
/*
* KVM only supports the host's native LBR format, as well as '0' (to
* disable LBR support). Verify KVM rejects all other LBR formats.
@@ -98,6 +110,7 @@ int main(int argc, char *argv[])
host_cap.capabilities = kvm_get_feature_msr(MSR_IA32_PERF_CAPABILITIES);
host_cap.capabilities &= (PMU_CAP_FW_WRITES | PMU_CAP_LBR_FMT);

+ test_basic_perf_capabilities(host_cap);
test_fungible_perf_capabilities(host_cap);
test_immutable_perf_capabilities(host_cap);

--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:33:08

by Sean Christopherson

Subject: [PATCH v2 10/21] KVM: selftests: Assert that full-width PMC writes are supported if PDCM=1

KVM emulates full-width PMC writes in software; assert that KVM reports
full-width writes as supported if PERF_CAPABILITIES is supported.

Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index c3b0738e361b..035470b38400 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -110,6 +110,9 @@ int main(int argc, char *argv[])
host_cap.capabilities = kvm_get_feature_msr(MSR_IA32_PERF_CAPABILITIES);
host_cap.capabilities &= (PMU_CAP_FW_WRITES | PMU_CAP_LBR_FMT);

+ TEST_ASSERT(host_cap.full_width_write,
+ "Full-width writes should always be supported");
+
test_basic_perf_capabilities(host_cap);
test_fungible_perf_capabilities(host_cap);
test_immutable_perf_capabilities(host_cap);
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:33:16

by Sean Christopherson

Subject: [PATCH v2 11/21] KVM: selftests: Print out failing MSR and value in vcpu_set_msr()

Reimplement vcpu_set_msr() as a macro and pretty print the failing MSR
(when possible) and the value if KVM_SET_MSRS fails, instead of using the
standard KVM_IOCTL_ERROR(). KVM_SET_MSRS is somewhat odd in
that it returns the index of the last successful write, i.e. will be
'0' on failure barring an entirely different KVM bug. And for writing
MSRs, the MSR being written and the value being written are almost always
relevant to the failure, i.e. just saying "failed!" doesn't help debug.

Place the string goo in a separate macro in anticipation of using it to
further expand MSR testing.

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/include/x86_64/processor.h | 30 ++++++++++++++-----
1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 53ffa43c90db..26c8e202a956 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -928,14 +928,30 @@ static inline void vcpu_clear_cpuid_feature(struct kvm_vcpu *vcpu,
uint64_t vcpu_get_msr(struct kvm_vcpu *vcpu, uint64_t msr_index);
int _vcpu_set_msr(struct kvm_vcpu *vcpu, uint64_t msr_index, uint64_t msr_value);

-static inline void vcpu_set_msr(struct kvm_vcpu *vcpu, uint64_t msr_index,
- uint64_t msr_value)
-{
- int r = _vcpu_set_msr(vcpu, msr_index, msr_value);
-
- TEST_ASSERT(r == 1, KVM_IOCTL_ERROR(KVM_SET_MSRS, r));
-}
+/*
+ * Assert on an MSR access(es) and pretty print the MSR name when possible.
+ * Note, the caller provides the stringified name so that the name of macro is
+ * printed, not the value the macro resolves to (due to macro expansion).
+ */
+#define TEST_ASSERT_MSR(cond, fmt, msr, str, args...) \
+do { \
+ if (__builtin_constant_p(msr)) { \
+ TEST_ASSERT(cond, fmt, str, args); \
+ } else if (!(cond)) { \
+ char buf[16]; \
+ \
+ snprintf(buf, sizeof(buf), "MSR 0x%x", msr); \
+ TEST_ASSERT(cond, fmt, buf, args); \
+ } \
+} while (0)

+#define vcpu_set_msr(vcpu, msr, val) \
+do { \
+ uint64_t v = val; \
+ \
+ TEST_ASSERT_MSR(_vcpu_set_msr(vcpu, msr, v) == 1, \
+ "KVM_SET_MSRS failed on %s, value = 0x%lx", msr, #msr, v); \
+} while (0)

void kvm_get_cpu_address_width(unsigned int *pa_bits, unsigned int *va_bits);
bool vm_is_unrestricted_guest(struct kvm_vm *vm);
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:33:23

by Sean Christopherson

Subject: [PATCH v2 12/21] KVM: selftests: Verify KVM preserves userspace writes to "durable" MSRs

Assert that KVM provides "read what you wrote" semantics for all "durable"
MSRs (for lack of a better name). The extra coverage is cheap from a
runtime performance perspective, and verifying the behavior in the common
helper avoids gratuitous copy+paste in individual tests.

Note, this affects all tests that set MSRs from userspace!

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/include/x86_64/processor.h | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 26c8e202a956..52260f6c2465 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -945,12 +945,27 @@ do { \
} \
} while (0)

+/*
+ * Returns true if KVM should return the last written value when reading an MSR
+ * from userspace, e.g. the MSR isn't a command MSR, doesn't emulate state that
+ * is changing, etc. This is NOT an exhaustive list! The intent is to filter
+ * out MSRs that are not durable _and_ that a selftest wants to write.
+ */
+static inline bool is_durable_msr(uint32_t msr)
+{
+ return msr != MSR_IA32_TSC;
+}
+
#define vcpu_set_msr(vcpu, msr, val) \
do { \
- uint64_t v = val; \
+ uint64_t r, v = val; \
\
TEST_ASSERT_MSR(_vcpu_set_msr(vcpu, msr, v) == 1, \
"KVM_SET_MSRS failed on %s, value = 0x%lx", msr, #msr, v); \
+ if (!is_durable_msr(msr)) \
+ break; \
+ r = vcpu_get_msr(vcpu, msr); \
+ TEST_ASSERT_MSR(r == v, "Set %s to '0x%lx', got back '0x%lx'", msr, #msr, v, r);\
} while (0)

void kvm_get_cpu_address_width(unsigned int *pa_bits, unsigned int *va_bits);
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:33:34

by Sean Christopherson

Subject: [PATCH v2 15/21] KVM: selftests: Test all immutable non-format bits in PERF_CAPABILITIES

Add negative testing of all immutable bits in PERF_CAPABILITIES, i.e.
single bits that are reserved-0 or are effectively reserved-1 by KVM.

Omit LBR and PEBS format bits from the test as it's easier to test them
manually than it is to add safeguards to the comment path, e.g. toggling
a single bit can yield a format of '0', which is legal as a "disable"
value.

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 30 +++++++++++++++++--
1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index 2647282ff380..d91bf44a2e39 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -49,6 +49,11 @@ static const union perf_capabilities immutable_caps = {
.pebs_baseline = 1,
};

+static const union perf_capabilities format_caps = {
+ .lbr_format = -1,
+ .pebs_format = -1,
+};
+
static void guest_code(void)
{
wrmsr(MSR_IA32_PERF_CAPABILITIES, PMU_CAP_LBR_FMT);
@@ -91,12 +96,30 @@ static void test_fungible_perf_capabilities(union perf_capabilities host_cap)
kvm_vm_free(vm);
}

+/*
+ * Verify KVM rejects attempts to set unsupported and/or immutable features in
+ * PERF_CAPABILITIES. Note, LBR format and PEBS format need to be validated
+ * separately as they are multi-bit values, e.g. toggling or setting a single
+ * bit can generate a false positive without dedicated safeguards.
+ */
static void test_immutable_perf_capabilities(union perf_capabilities host_cap)
{
+ const uint64_t reserved_caps = (~host_cap.capabilities |
+ immutable_caps.capabilities) &
+ ~format_caps.capabilities;
+
struct kvm_vcpu *vcpu;
struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, NULL);
uint64_t val;
- int ret;
+ int r, bit;
+
+ for_each_set_bit(bit, &reserved_caps, 64) {
+ r = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES,
+ host_cap.capabilities ^ BIT_ULL(bit));
+ TEST_ASSERT(!r, "%s immutable feature 0x%llx (bit %d) didn't fail",
+ host_cap.capabilities & BIT_ULL(bit) ? "Setting" : "Clearing",
+ BIT_ULL(bit), bit);
+ }

/*
* KVM only supports the host's native LBR format, as well as '0' (to
@@ -106,9 +129,10 @@ static void test_immutable_perf_capabilities(union perf_capabilities host_cap)
if (val == (host_cap.capabilities & PMU_CAP_LBR_FMT))
continue;

- ret = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, val);
- TEST_ASSERT(!ret, "Bad LBR FMT = 0x%lx didn't fail", val);
+ r = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, val);
+ TEST_ASSERT(!r, "Bad LBR FMT = 0x%lx didn't fail", val);
}
+
kvm_vm_free(vm);
}

--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:33:38

by Sean Christopherson

Subject: [PATCH v2 13/21] KVM: selftests: Drop now-redundant checks on PERF_CAPABILITIES writes

Now that vcpu_set_msr() verifies the expected "read what you wrote"
semantics of all durable MSRs, including PERF_CAPABILITIES, drop the
now-redundant manual checks in the VMX PMU caps test.

Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index 035470b38400..f7a27b5c949b 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -51,10 +51,7 @@ static void test_basic_perf_capabilities(union perf_capabilities host_cap)
struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, NULL);

vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, 0);
- ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), 0);
-
vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, host_cap.capabilities);
- ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), host_cap.capabilities);

kvm_vm_free(vm);
}
@@ -67,9 +64,6 @@ static void test_fungible_perf_capabilities(union perf_capabilities host_cap)
/* testcase 1, set capabilities when we have PDCM bit */
vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, PMU_CAP_FW_WRITES);

- /* check capabilities can be retrieved with KVM_GET_MSR */
- ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), PMU_CAP_FW_WRITES);
-
/* check whatever we write with KVM_SET_MSR is _not_ modified */
vcpu_run(vcpu);
ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), PMU_CAP_FW_WRITES);
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:33:40

by Sean Christopherson

Subject: [PATCH v2 14/21] KVM: selftests: Test all fungible features in PERF_CAPABILITIES

Verify that userspace can set all fungible features in PERF_CAPABILITIES.
Drop the now unused #define of the "full-width writes" flag.

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 29 +++++++++++++++----
1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index f7a27b5c949b..2647282ff380 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -14,10 +14,11 @@
#define _GNU_SOURCE /* for program_invocation_short_name */
#include <sys/ioctl.h>

+#include <linux/bitmap.h>
+
#include "kvm_util.h"
#include "vmx.h"

-#define PMU_CAP_FW_WRITES (1ULL << 13)
#define PMU_CAP_LBR_FMT 0x3f

union perf_capabilities {
@@ -36,6 +37,18 @@ union perf_capabilities {
u64 capabilities;
};

+/*
+ * The LBR format and most PEBS features are immutable, all other features are
+ * fungible (if supported by the host and KVM).
+ */
+static const union perf_capabilities immutable_caps = {
+ .lbr_format = -1,
+ .pebs_trap = 1,
+ .pebs_arch_reg = 1,
+ .pebs_format = -1,
+ .pebs_baseline = 1,
+};
+
static void guest_code(void)
{
wrmsr(MSR_IA32_PERF_CAPABILITIES, PMU_CAP_LBR_FMT);
@@ -58,15 +71,22 @@ static void test_basic_perf_capabilities(union perf_capabilities host_cap)

static void test_fungible_perf_capabilities(union perf_capabilities host_cap)
{
+ const uint64_t fungible_caps = host_cap.capabilities & ~immutable_caps.capabilities;
+
struct kvm_vcpu *vcpu;
struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, guest_code);
+ int bit;

- /* testcase 1, set capabilities when we have PDCM bit */
- vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, PMU_CAP_FW_WRITES);
+ for_each_set_bit(bit, &fungible_caps, 64) {
+ vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, BIT_ULL(bit));
+ vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES,
+ host_cap.capabilities & ~BIT_ULL(bit));
+ }
+ vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, host_cap.capabilities);

/* check whatever we write with KVM_SET_MSR is _not_ modified */
vcpu_run(vcpu);
- ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), PMU_CAP_FW_WRITES);
+ ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), host_cap.capabilities);

kvm_vm_free(vm);
}
@@ -102,7 +122,6 @@ int main(int argc, char *argv[])
TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);

host_cap.capabilities = kvm_get_feature_msr(MSR_IA32_PERF_CAPABILITIES);
- host_cap.capabilities &= (PMU_CAP_FW_WRITES | PMU_CAP_LBR_FMT);

TEST_ASSERT(host_cap.full_width_write,
"Full-width writes should always be supported");
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:33:57

by Sean Christopherson

Subject: [PATCH v2 16/21] KVM: selftests: Expand negative testing of guest writes to PERF_CAPABILITIES

Test that the guest can't write 0 to PERF_CAPABILITIES, can't write the
current value, and can't toggle _any_ bits. There is no reason to special
case the LBR format.

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 61 ++++++++++++++++---
1 file changed, 54 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index d91bf44a2e39..44fc6101a547 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -54,9 +54,59 @@ static const union perf_capabilities format_caps = {
.pebs_format = -1,
};

-static void guest_code(void)
+static void guest_code(uint64_t current_val)
{
- wrmsr(MSR_IA32_PERF_CAPABILITIES, PMU_CAP_LBR_FMT);
+ uint8_t vector;
+ int i;
+
+ vector = wrmsr_safe(MSR_IA32_PERF_CAPABILITIES, current_val);
+ GUEST_ASSERT_2(vector == GP_VECTOR, current_val, vector);
+
+ vector = wrmsr_safe(MSR_IA32_PERF_CAPABILITIES, 0);
+ GUEST_ASSERT_2(vector == GP_VECTOR, 0, vector);
+
+ for (i = 0; i < 64; i++) {
+ vector = wrmsr_safe(MSR_IA32_PERF_CAPABILITIES,
+ current_val ^ BIT_ULL(i));
+ GUEST_ASSERT_2(vector == GP_VECTOR,
+ current_val ^ BIT_ULL(i), vector);
+ }
+
+ GUEST_DONE();
+}
+
+/*
+ * Verify that guest WRMSRs to PERF_CAPABILITIES #GP regardless of the value
+ * written, that the guest always sees the userspace controlled value, and that
+ * PERF_CAPABILITIES is immutable after KVM_RUN.
+ */
+static void test_guest_wrmsr_perf_capabilities(union perf_capabilities host_cap)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, guest_code);
+ struct ucall uc;
+
+ vm_init_descriptor_tables(vm);
+ vcpu_init_descriptor_tables(vcpu);
+
+ vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, host_cap.capabilities);
+
+ vcpu_args_set(vcpu, 1, host_cap.capabilities);
+ vcpu_run(vcpu);
+
+ switch (get_ucall(vcpu, &uc)) {
+ case UCALL_ABORT:
+ REPORT_GUEST_ASSERT_2(uc, "val = 0x%lx, vector = %lu");
+ break;
+ case UCALL_DONE:
+ break;
+ default:
+ TEST_FAIL("Unexpected ucall: %lu", uc.cmd);
+ }
+
+ ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), host_cap.capabilities);
+
+ kvm_vm_free(vm);
}

/*
@@ -79,7 +129,7 @@ static void test_fungible_perf_capabilities(union perf_capabilities host_cap)
const uint64_t fungible_caps = host_cap.capabilities & ~immutable_caps.capabilities;

struct kvm_vcpu *vcpu;
- struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, guest_code);
+ struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, NULL);
int bit;

for_each_set_bit(bit, &fungible_caps, 64) {
@@ -89,10 +139,6 @@ static void test_fungible_perf_capabilities(union perf_capabilities host_cap)
}
vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, host_cap.capabilities);

- /* check whatever we write with KVM_SET_MSR is _not_ modified */
- vcpu_run(vcpu);
- ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), host_cap.capabilities);
-
kvm_vm_free(vm);
}

@@ -153,6 +199,7 @@ int main(int argc, char *argv[])
test_basic_perf_capabilities(host_cap);
test_fungible_perf_capabilities(host_cap);
test_immutable_perf_capabilities(host_cap);
+ test_guest_wrmsr_perf_capabilities(host_cap);

printf("Completed perf capability tests.\n");
}
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:34:01

by Sean Christopherson

Subject: [PATCH v2 17/21] KVM: selftests: Test post-KVM_RUN writes to PERF_CAPABILITIES

Now that KVM disallows changing PERF_CAPABILITIES after KVM_RUN, expand
the host side checks to verify KVM rejects any attempts to change bits
from userspace.

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index 44fc6101a547..6fc86f5eba0b 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -85,6 +85,7 @@ static void test_guest_wrmsr_perf_capabilities(union perf_capabilities host_cap)
struct kvm_vcpu *vcpu;
struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, guest_code);
struct ucall uc;
+ int r, i;

vm_init_descriptor_tables(vm);
vcpu_init_descriptor_tables(vcpu);
@@ -106,6 +107,18 @@ static void test_guest_wrmsr_perf_capabilities(union perf_capabilities host_cap)

ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), host_cap.capabilities);

+ vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, host_cap.capabilities);
+
+ r = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, 0);
+ TEST_ASSERT(!r, "Post-KVM_RUN write '0' didn't fail");
+
+ for (i = 0; i < 64; i++) {
+ r = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES,
+ host_cap.capabilities ^ BIT_ULL(i));
+ TEST_ASSERT(!r, "Post-KVM_RUN write '0x%llx' didn't fail",
+ host_cap.capabilities ^ BIT_ULL(i));
+ }
+
kvm_vm_free(vm);
}

--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:34:05

by Sean Christopherson

Subject: [PATCH v2 18/21] KVM: selftests: Drop "all done!" printf() from PERF_CAPABILITIES test

Drop the arbitrary "done" message from the VMX PMU caps test; it's pretty
obvious the test is done when the process exits.

Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index 6fc86f5eba0b..6733d879a00b 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -213,6 +213,4 @@ int main(int argc, char *argv[])
test_fungible_perf_capabilities(host_cap);
test_immutable_perf_capabilities(host_cap);
test_guest_wrmsr_perf_capabilities(host_cap);
-
- printf("Completed perf capability tests.\n");
}
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:34:09

by Sean Christopherson

Subject: [PATCH v2 19/21] KVM: selftests: Refactor LBR_FMT test to avoid use of separate macro

Rework the LBR format test to use the bitfield instead of a separate
mask macro, mainly so that adding a nearly-identical PEBS format test
doesn't have to copy-paste-tweak the macro too.

No functional change intended.

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index 6733d879a00b..38aec88d733b 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -19,8 +19,6 @@
#include "kvm_util.h"
#include "vmx.h"

-#define PMU_CAP_LBR_FMT 0x3f
-
union perf_capabilities {
struct {
u64 lbr_format:6;
@@ -169,7 +167,7 @@ static void test_immutable_perf_capabilities(union perf_capabilities host_cap)

struct kvm_vcpu *vcpu;
struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, NULL);
- uint64_t val;
+ union perf_capabilities val = host_cap;
int r, bit;

for_each_set_bit(bit, &reserved_caps, 64) {
@@ -184,12 +182,13 @@ static void test_immutable_perf_capabilities(union perf_capabilities host_cap)
* KVM only supports the host's native LBR format, as well as '0' (to
* disable LBR support). Verify KVM rejects all other LBR formats.
*/
- for (val = 1; val <= PMU_CAP_LBR_FMT; val++) {
- if (val == (host_cap.capabilities & PMU_CAP_LBR_FMT))
+ for (val.lbr_format = 1; val.lbr_format; val.lbr_format++) {
+ if (val.lbr_format == host_cap.lbr_format)
continue;

- r = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, val);
- TEST_ASSERT(!r, "Bad LBR FMT = 0x%lx didn't fail", val);
+ r = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, val.capabilities);
+ TEST_ASSERT(!r, "Bad LBR FMT = 0x%x didn't fail, host = 0x%x",
+ val.lbr_format, host_cap.lbr_format);
}

kvm_vm_free(vm);
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:34:20

by Sean Christopherson

Subject: [PATCH v2 20/21] KVM: selftests: Add negative testcase for PEBS format in PERF_CAPABILITIES

Expand the immutable features sub-test for PERF_CAPABILITIES to verify
KVM rejects any attempt to use a PEBS format other than the host's.

Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index 38aec88d733b..29aaa0419294 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -191,6 +191,16 @@ static void test_immutable_perf_capabilities(union perf_capabilities host_cap)
val.lbr_format, host_cap.lbr_format);
}

+ /* Ditto for the PEBS format. */
+ for (val.pebs_format = 1; val.pebs_format; val.pebs_format++) {
+ if (val.pebs_format == host_cap.pebs_format)
+ continue;
+
+ r = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, val.capabilities);
+ TEST_ASSERT(!r, "Bad PEBS FMT = 0x%x didn't fail, host = 0x%x",
+ val.pebs_format, host_cap.pebs_format);
+ }
+
kvm_vm_free(vm);
}

--
2.39.1.581.gbfd45094c4-goog


2023-02-10 00:34:27

by Sean Christopherson

Subject: [PATCH v2 21/21] KVM: selftests: Verify LBRs are disabled if vPMU is disabled

Verify that disabling the guest's vPMU via CPUID also disables LBRs.
KVM has had at least one bug where LBRs would remain enabled even though
the intent was to disable everything PMU related.

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 29 +++++++++++++++++++
1 file changed, 29 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index 29aaa0419294..3009b3e5254d 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -204,6 +204,34 @@ static void test_immutable_perf_capabilities(union perf_capabilities host_cap)
kvm_vm_free(vm);
}

+/*
+ * Test that LBR MSRs are writable when LBRs are enabled, and then verify that
+ * disabling the vPMU via CPUID also disables LBR support. Set bits 2:0 of
+ * LBR_TOS as those bits are writable across all uarch implementations (arch
+ * LBRs will need to poke a different MSR).
+ */
+static void test_lbr_perf_capabilities(union perf_capabilities host_cap)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+ int r;
+
+ if (!host_cap.lbr_format)
+ return;
+
+ vm = vm_create_with_one_vcpu(&vcpu, NULL);
+
+ vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, host_cap.capabilities);
+ vcpu_set_msr(vcpu, MSR_LBR_TOS, 7);
+
+ vcpu_clear_cpuid_entry(vcpu, X86_PROPERTY_PMU_VERSION.function);
+
+ r = _vcpu_set_msr(vcpu, MSR_LBR_TOS, 7);
+ TEST_ASSERT(!r, "Writing LBR_TOS should fail after disabling vPMU");
+
+ kvm_vm_free(vm);
+}
+
int main(int argc, char *argv[])
{
union perf_capabilities host_cap;
@@ -222,4 +250,5 @@ int main(int argc, char *argv[])
test_fungible_perf_capabilities(host_cap);
test_immutable_perf_capabilities(host_cap);
test_guest_wrmsr_perf_capabilities(host_cap);
+ test_lbr_perf_capabilities(host_cap);
}
--
2.39.1.581.gbfd45094c4-goog


2023-02-10 13:01:28

by Yu Zhang

Subject: Re: [PATCH v2 05/21] KVM: x86: Disallow writes to immutable feature MSRs after KVM_RUN

On Fri, Feb 10, 2023 at 12:31:32AM +0000, Sean Christopherson wrote:
> Disallow writes to feature MSRs after KVM_RUN to prevent userspace from
> changing the vCPU model after running the vCPU. Similar to guest CPUID,
> KVM uses feature MSRs to configure intercepts, determine what operations
> are/aren't allowed, etc. Changing the capabilities while the vCPU is
> active will at best yield unpredictable guest behavior, and at worst
> could be dangerous to KVM.
>
> Allow writing the current value, e.g. so that userspace can blindly set
> all MSRs when emulating RESET, and unconditionally allow writes to
> MSR_IA32_UCODE_REV so that userspace can emulate patch loads.
>
> Special case the VMX MSRs to keep the generic list small, i.e. so that
> KVM can do a linear walk of the generic list without incurring meaningful
> overhead.
>
> Cc: Like Xu <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/x86.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7b73a0b45041..186cb6a81643 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1554,6 +1554,25 @@ static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all_except_vmx) +
> (KVM_LAST_EMULATED_VMX_MSR - KVM_FIRST_EMULATED_VMX_MSR + 1)];
> static unsigned int num_msr_based_features;
>
> +/*
> + * All feature MSRs except uCode revID, which tracks the currently loaded uCode
> + * patch, are immutable once the vCPU model is defined.
> + */
> +static bool kvm_is_immutable_feature_msr(u32 msr)
> +{
> + int i;
> +
> + if (msr >= KVM_FIRST_EMULATED_VMX_MSR && msr <= KVM_LAST_EMULATED_VMX_MSR)
> + return true;
> +
> + for (i = 0; i < ARRAY_SIZE(msr_based_features_all_except_vmx); i++) {
> + if (msr == msr_based_features_all_except_vmx[i])
> + return msr != MSR_IA32_UCODE_REV;
> + }
> +
> + return false;
> +}
> +
> /*
> * Some IA32_ARCH_CAPABILITIES bits have dependencies on MSRs that KVM
> * does not yet virtualize. These include:
> @@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>
> static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> {
> + u64 val;
> +
> + /*
> + * Disallow writes to immutable feature MSRs after KVM_RUN. KVM does
> + * not support modifying the guest vCPU model on the fly, e.g. changing
> + * the nVMX capabilities while L2 is running is nonsensical. Ignore
> + * writes of the same value, e.g. to allow userspace to blindly stuff
> + * all MSRs when emulating RESET.
> + */
> + if (vcpu->arch.last_vmentry_cpu != -1 &&

Use kvm_vcpu_has_run(vcpu) here?

B.R.
Yu

2023-02-10 16:31:51

by Sean Christopherson

Subject: Re: [PATCH v2 05/21] KVM: x86: Disallow writes to immutable feature MSRs after KVM_RUN

On Fri, Feb 10, 2023, Yu Zhang wrote:
> On Fri, Feb 10, 2023 at 12:31:32AM +0000, Sean Christopherson wrote:
> > @@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> >
> > static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> > {
> > + u64 val;
> > +
> > + /*
> > + * Disallow writes to immutable feature MSRs after KVM_RUN. KVM does
> > + * not support modifying the guest vCPU model on the fly, e.g. changing
> > + * the nVMX capabilities while L2 is running is nonsensical. Ignore
> > + * writes of the same value, e.g. to allow userspace to blindly stuff
> > + * all MSRs when emulating RESET.
> > + */
> > + if (vcpu->arch.last_vmentry_cpu != -1 &&
>
> Use kvm_vcpu_has_run(vcpu) here?

/facepalm

Yes, that was the entire point of adding the helper.

Thanks!

2023-02-14 06:35:07

by Xiaoyao Li

Subject: Re: [PATCH v2 01/21] KVM: x86: Rename kvm_init_msr_list() to clarify it inits multiple lists

On 2/10/2023 8:31 AM, Sean Christopherson wrote:
> Rename kvm_init_msr_list() to kvm_init_msr_lists() to clarify that it
> initializes multiple lists: MSRs to save, emulated MSRs, and feature MSRs.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <[email protected]>

Reviewed-by: Xiaoyao Li <[email protected]>

> ---
> arch/x86/kvm/x86.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index f706621c35b8..7b91f73a837d 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -7071,7 +7071,7 @@ static void kvm_probe_msr_to_save(u32 msr_index)
> msrs_to_save[num_msrs_to_save++] = msr_index;
> }
>
> -static void kvm_init_msr_list(void)
> +static void kvm_init_msr_lists(void)
> {
> unsigned i;
>
> @@ -9450,7 +9450,7 @@ static int __kvm_x86_vendor_init(struct kvm_x86_init_ops *ops)
> kvm_caps.max_guest_tsc_khz = max;
> }
> kvm_caps.default_tsc_scaling_ratio = 1ULL << kvm_caps.tsc_scaling_ratio_frac_bits;
> - kvm_init_msr_list();
> + kvm_init_msr_lists();
> return 0;
>
> out_unwind_ops:


2023-02-14 06:35:56

by Xiaoyao Li

Subject: Re: [PATCH v2 02/21] KVM: x86: Add a helper to query whether or not a vCPU has ever run

On 2/10/2023 8:31 AM, Sean Christopherson wrote:
> Add a helper to query if a vCPU has run so that KVM doesn't have to open
> code the check on last_vmentry_cpu being set to a magic value.
>
> No functional change intended.
>
> Suggested-by: Xiaoyao Li <[email protected]>
> Cc: Like Xu <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>

Reviewed-by: Xiaoyao Li <[email protected]>

> ---
> arch/x86/kvm/cpuid.c | 2 +-
> arch/x86/kvm/mmu/mmu.c | 2 +-
> arch/x86/kvm/x86.h | 5 +++++
> 3 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 8f8edeaf8177..448d627ce891 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -420,7 +420,7 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
> * KVM_SET_CPUID{,2} again. To support this legacy behavior, check
> * whether the supplied CPUID data is equal to what's already set.
> */
> - if (vcpu->arch.last_vmentry_cpu != -1) {
> + if (kvm_vcpu_has_run(vcpu)) {
> r = kvm_cpuid_check_equal(vcpu, e2, nent);
> if (r)
> return r;
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index c91ee2927dd7..b0693195273b 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -5393,7 +5393,7 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
> * Changing guest CPUID after KVM_RUN is forbidden, see the comment in
> * kvm_arch_vcpu_ioctl().
> */
> - KVM_BUG_ON(vcpu->arch.last_vmentry_cpu != -1, vcpu->kvm);
> + KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm);
> }
>
> void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index a8167b47b8c8..754190af1791 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -83,6 +83,11 @@ static inline unsigned int __shrink_ple_window(unsigned int val,
> void kvm_service_local_tlb_flush_requests(struct kvm_vcpu *vcpu);
> int kvm_check_nested_events(struct kvm_vcpu *vcpu);
>
> +static inline bool kvm_vcpu_has_run(struct kvm_vcpu *vcpu)
> +{
> + return vcpu->arch.last_vmentry_cpu != -1;
> +}
> +
> static inline bool kvm_is_exception_pending(struct kvm_vcpu *vcpu)
> {
> return vcpu->arch.exception.pending ||


2023-02-14 06:37:04

by Xiaoyao Li

[permalink] [raw]
Subject: Re: [PATCH v2 04/21] KVM: x86: Generate set of VMX feature MSRs using first/last definitions

On 2/10/2023 8:31 AM, Sean Christopherson wrote:
> Add VMX MSRs to the runtime list of feature MSRs by iterating over the
> range of emulated MSRs instead of manually defining each MSR in the "all"
> list. Using the range definition reduces the cost of emulating a new VMX
> MSR, e.g. prevents forgetting to add an MSR to the list.
>
> Extracting the VMX MSRs from the "all" list, which is a compile-time
> constant, also shrinks the list to the point where the compiler can
> heavily optimize code that iterates over the list.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <[email protected]>

Reviewed-by: Xiaoyao Li <[email protected]>
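
For illustration, the range-based registration described above could look
roughly like the following inside kvm_init_msr_lists() (the helper name
kvm_probe_feature_msr() is an assumption for this sketch, not necessarily the
actual diff):

	/* Probe the emulated VMX MSRs by range instead of listing each one. */
	for (msr = KVM_FIRST_EMULATED_VMX_MSR; msr <= KVM_LAST_EMULATED_VMX_MSR; msr++)
		kvm_probe_feature_msr(msr);

	/* Non-VMX feature MSRs still come from the small compile-time array. */
	for (i = 0; i < ARRAY_SIZE(msr_based_features_all_except_vmx); i++)
		kvm_probe_feature_msr(msr_based_features_all_except_vmx[i]);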

2023-02-14 06:39:31

by Xiaoyao Li

[permalink] [raw]
Subject: Re: [PATCH v2 05/21] KVM: x86: Disallow writes to immutable feature MSRs after KVM_RUN

Maybe be clearer in the title to reflect what the patch really does:

KVM: x86: Disallow writes to immutable feature MSRs from user space
after KVM_RUN

On 2/10/2023 8:31 AM, Sean Christopherson wrote:
> Disallow writes to feature MSRs after KVM_RUN to prevent userspace from
> changing the vCPU model after running the vCPU. Similar to guest CPUID,
> KVM uses feature MSRs to configure intercepts, determine what operations
> are/aren't allowed, etc. Changing the capabilities while the vCPU is
> active will at best yield unpredictable guest behavior, and at worst
> could be dangerous to KVM.
>
> Allow writing the current value, e.g. so that userspace can blindly set
> all MSRs when emulating RESET, and unconditionally allow writes to
> MSR_IA32_UCODE_REV so that userspace can emulate patch loads.
>
> Special case the VMX MSRs to keep the generic list small, i.e. so that
> KVM can do a linear walk of the generic list without incurring meaningful
> overhead.
>
> Cc: Like Xu <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/x86.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7b73a0b45041..186cb6a81643 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1554,6 +1554,25 @@ static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all_except_vmx) +
> (KVM_LAST_EMULATED_VMX_MSR - KVM_FIRST_EMULATED_VMX_MSR + 1)];
> static unsigned int num_msr_based_features;
>
> +/*
> + * All feature MSRs except uCode revID, which tracks the currently loaded uCode
> + * patch, are immutable once the vCPU model is defined.
> + */
> +static bool kvm_is_immutable_feature_msr(u32 msr)
> +{
> + int i;
> +
> + if (msr >= KVM_FIRST_EMULATED_VMX_MSR && msr <= KVM_LAST_EMULATED_VMX_MSR)
> + return true;
> +
> + for (i = 0; i < ARRAY_SIZE(msr_based_features_all_except_vmx); i++) {
> + if (msr == msr_based_features_all_except_vmx[i])
> + return msr != MSR_IA32_UCODE_REV;
> + }
> +
> + return false;
> +}
> +
> /*
> * Some IA32_ARCH_CAPABILITIES bits have dependencies on MSRs that KVM
> * does not yet virtualize. These include:
> @@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>
> static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> {
> + u64 val;
> +
> + /*
> + * Disallow writes to immutable feature MSRs after KVM_RUN. KVM does
> + * not support modifying the guest vCPU model on the fly, e.g. changing
> + * the nVMX capabilities while L2 is running is nonsensical. Ignore
> + * writes of the same value, e.g. to allow userspace to blindly stuff
> + * all MSRs when emulating RESET.
> + */
> + if (vcpu->arch.last_vmentry_cpu != -1 &&

With the above replaced by the helper,

Reviewed-by: Xiaoyao Li <[email protected]>

> + kvm_is_immutable_feature_msr(index)) {
> + if (do_get_msr(vcpu, index, &val) || *data != val)
> + return -EINVAL;
> +
> + return 0;
> + }
> +
> return kvm_set_msr_ignored_check(vcpu, index, *data, true);
> }
>
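
For illustration, the userspace-visible contract after this patch, expressed
with the selftest helpers used elsewhere in the series (a sketch of the
behavior, not the actual test code; PERF_CAPABILITIES is just one example of
an immutable feature MSR):

	/* Before KVM_RUN, feature MSR writes are accepted (subject to the usual checks). */
	vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, host_cap.capabilities);

	vcpu_run(vcpu);

	/* After KVM_RUN, rewriting the current value is allowed (and is a no-op)... */
	vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, host_cap.capabilities);

	/* ...but any other value is rejected (_vcpu_set_msr() returns 0 on failure). */
	r = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES,
			  host_cap.capabilities ^ PMU_CAP_LBR_FMT);
	TEST_ASSERT(!r, "Changing PERF_CAPABILITIES after KVM_RUN should fail");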


2023-02-14 07:27:43

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v2 08/21] KVM: selftests: Split PMU caps sub-tests to avoid writing MSR after KVM_RUN

Nit: could this patch be applied before the relevant KVM modifications [1], so that
a CI bot doesn't generate an error report before this selftests patch is applied?

[1] KVM: x86: Disallow writes to immutable feature MSRs after KVM_RUN

On 10/2/2023 8:31 am, Sean Christopherson wrote:
> Split the PERF_CAPABILITIES subtests into two parts so that the LBR format
> testcases don't execute after KVM_RUN. Now that KVM disallows changing
> PERF_CAPABILITIES after KVM_RUN (same as guest CPUID), attempting to set
> the MSR after KVM_RUN will yield false positives and/or false negatives
> depending on what the test is trying to do.
>
> Land the LBR format test in a more generic "immutable features" test in
> anticipation of expanding its scope to other immutable features.
>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> .../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 51 +++++++++++--------
> 1 file changed, 31 insertions(+), 20 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
> index c280ba1e6572..ac08c0fdd84d 100644
> --- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
> @@ -41,24 +41,10 @@ static void guest_code(void)
> wrmsr(MSR_IA32_PERF_CAPABILITIES, PMU_CAP_LBR_FMT);
> }
>
> -int main(int argc, char *argv[])
> +static void test_fungible_perf_capabilities(union perf_capabilities host_cap)
> {
> - struct kvm_vm *vm;
> struct kvm_vcpu *vcpu;
> - int ret;
> - union perf_capabilities host_cap;
> - uint64_t val;
> -
> - host_cap.capabilities = kvm_get_feature_msr(MSR_IA32_PERF_CAPABILITIES);
> - host_cap.capabilities &= (PMU_CAP_FW_WRITES | PMU_CAP_LBR_FMT);
> -
> - /* Create VM */
> - vm = vm_create_with_one_vcpu(&vcpu, guest_code);
> -
> - TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
> -
> - TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
> - TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
> + struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, guest_code);
>
> /* testcase 1, set capabilities when we have PDCM bit */
> vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, PMU_CAP_FW_WRITES);
> @@ -70,7 +56,16 @@ int main(int argc, char *argv[])
> vcpu_run(vcpu);
> ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), PMU_CAP_FW_WRITES);
>
> - /* testcase 2, check valid LBR formats are accepted */
> + kvm_vm_free(vm);
> +}
> +
> +static void test_immutable_perf_capabilities(union perf_capabilities host_cap)
> +{
> + struct kvm_vcpu *vcpu;
> + struct kvm_vm *vm = vm_create_with_one_vcpu(&vcpu, NULL);
> + uint64_t val;
> + int ret;
> +
> vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, 0);
> ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), 0);
>
> @@ -78,8 +73,8 @@ int main(int argc, char *argv[])
> ASSERT_EQ(vcpu_get_msr(vcpu, MSR_IA32_PERF_CAPABILITIES), (u64)host_cap.lbr_format);
>
> /*
> - * Testcase 3, check that an "invalid" LBR format is rejected. Only an
> - * exact match of the host's format (and 0/disabled) is allowed.
> + * KVM only supports the host's native LBR format, as well as '0' (to
> + * disable LBR support). Verify KVM rejects all other LBR formats.
> */
> for (val = 1; val <= PMU_CAP_LBR_FMT; val++) {
> if (val == (host_cap.capabilities & PMU_CAP_LBR_FMT))
> @@ -88,7 +83,23 @@ int main(int argc, char *argv[])
> ret = _vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, val);
> TEST_ASSERT(!ret, "Bad LBR FMT = 0x%lx didn't fail", val);
> }
> + kvm_vm_free(vm);
> +}
> +
> +int main(int argc, char *argv[])
> +{
> + union perf_capabilities host_cap;
> +
> + TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
> +
> + TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
> + TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
> +
> + host_cap.capabilities = kvm_get_feature_msr(MSR_IA32_PERF_CAPABILITIES);
> + host_cap.capabilities &= (PMU_CAP_FW_WRITES | PMU_CAP_LBR_FMT);
> +
> + test_fungible_perf_capabilities(host_cap);
> + test_immutable_perf_capabilities(host_cap);
>
> printf("Completed perf capability tests.\n");
> - kvm_vm_free(vm);
> }

2023-02-14 10:50:11

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v2 03/21] KVM: x86: Add macros to track first...last VMX feature MSRs

On 10/2/2023 8:31 am, Sean Christopherson wrote:
> +/*
> + * The first...last VMX feature MSRs that are emulated by KVM. This may or may
> + * not cover all known VMX MSRs, as KVM doesn't emulate an MSR until there's an
> + * associated feature that KVM supports for nested virtualization.
> + */
> +#define KVM_FIRST_EMULATED_VMX_MSR MSR_IA32_VMX_BASIC
> +#define KVM_LAST_EMULATED_VMX_MSR MSR_IA32_VMX_VMFUNC

Off-topic: we now have "#define MSR_IA32_VMX_PROCBASED_CTLS3 0x00000492". Are any
further changes needed here if an L2 guest needs IPI virtualization, and if not, why not?

2023-02-14 11:39:25

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v2 05/21] KVM: x86: Disallow writes to immutable feature MSRs after KVM_RUN

On 10/2/2023 8:31 am, Sean Christopherson wrote:
> Disallow writes to feature MSRs after KVM_RUN to prevent userspace from
> changing the vCPU model after running the vCPU. Similar to guest CPUID,
> KVM uses feature MSRs to configure intercepts, determine what operations
> are/aren't allowed, etc. Changing the capabilities while the vCPU is
> active will at best yield unpredictable guest behavior, and at worst
> could be dangerous to KVM.
>
> Allow writing the current value, e.g. so that userspace can blindly set
> all MSRs when emulating RESET, and unconditionally allow writes to
> MSR_IA32_UCODE_REV so that userspace can emulate patch loads.
>
> Special case the VMX MSRs to keep the generic list small, i.e. so that
> KVM can do a linear walk of the generic list without incurring meaningful
> overhead.
>
> Cc: Like Xu <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/x86.c | 36 ++++++++++++++++++++++++++++++++++++
> 1 file changed, 36 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 7b73a0b45041..186cb6a81643 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -1554,6 +1554,25 @@ static u32 msr_based_features[ARRAY_SIZE(msr_based_features_all_except_vmx) +
> (KVM_LAST_EMULATED_VMX_MSR - KVM_FIRST_EMULATED_VMX_MSR + 1)];
> static unsigned int num_msr_based_features;
>
> +/*
> + * All feature MSRs except uCode revID, which tracks the currently loaded uCode
> + * patch, are immutable once the vCPU model is defined.
> + */
> +static bool kvm_is_immutable_feature_msr(u32 msr)
> +{
> + int i;
> +
> + if (msr >= KVM_FIRST_EMULATED_VMX_MSR && msr <= KVM_LAST_EMULATED_VMX_MSR)
> + return true;
> +
> + for (i = 0; i < ARRAY_SIZE(msr_based_features_all_except_vmx); i++) {
> + if (msr == msr_based_features_all_except_vmx[i])
> + return msr != MSR_IA32_UCODE_REV;
> + }
> +
> + return false;
> +}
> +
> /*
> * Some IA32_ARCH_CAPABILITIES bits have dependencies on MSRs that KVM
> * does not yet virtualize. These include:
> @@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
>
> static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> {
> + u64 val;
> +
> + /*
> + * Disallow writes to immutable feature MSRs after KVM_RUN. KVM does
> + * not support modifying the guest vCPU model on the fly, e.g. changing
> + * the nVMX capabilities while L2 is running is nonsensical. Ignore
> + * writes of the same value, e.g. to allow userspace to blindly stuff
> + * all MSRs when emulating RESET.
> + */
> + if (vcpu->arch.last_vmentry_cpu != -1 &&

Three concerns on my mind (to help you think this through, if any apply):

- why not use kvm->created_vcpus?

- what if different vCPU models of the same guest have different feature MSR
  values? (Although they cannot be altered after the first run, selftests may
  be needed to show that this is dangerous for KVM.)

- setting "vcpu->arch.last_vmentry_cpu = vcpu->cpu" is still too late: part of
  the guest code (an attack window) has already executed on the first pass
  through kvm_x86_vcpu_run(), which may run for a long time.

> + kvm_is_immutable_feature_msr(index)) {
> + if (do_get_msr(vcpu, index, &val) || *data != val)
> + return -EINVAL;
> +
> + return 0;
> + }
> +
> return kvm_set_msr_ignored_check(vcpu, index, *data, true);
> }
>

2023-02-14 17:42:58

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v2 08/21] KVM: selftests: Split PMU caps sub-tests to avoid writing MSR after KVM_RUN

On Tue, Feb 14, 2023, Like Xu wrote:
> Nit: could this patch be applied before the relevant KVM modifications [1], so that
> a CI bot doesn't generate an error report before this selftests patch is applied?

Yeah, good call. CI aside, this patch should definitely land before the KVM change.

> [1] KVM: x86: Disallow writes to immutable feature MSRs after KVM_RUN

2023-02-14 18:51:47

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v2 03/21] KVM: x86: Add macros to track first...last VMX feature MSRs

On Tue, Feb 14, 2023, Like Xu wrote:
> On 10/2/2023 8:31 am, Sean Christopherson wrote:
> > +/*
> > + * The first...last VMX feature MSRs that are emulated by KVM. This may or may
> > + * not cover all known VMX MSRs, as KVM doesn't emulate an MSR until there's an
> > + * associated feature that KVM supports for nested virtualization.
> > + */
> > +#define KVM_FIRST_EMULATED_VMX_MSR MSR_IA32_VMX_BASIC
> > +#define KVM_LAST_EMULATED_VMX_MSR MSR_IA32_VMX_VMFUNC
>
> Off-topic: we now have "#define MSR_IA32_VMX_PROCBASED_CTLS3 0x00000492". Are any
> further changes needed here if an L2 guest needs IPI virtualization, and if not, why not?

If KVM exposes IPI virtualization, or any other feature that depends on tertiary
controls, to L1, then yes this will need to be extended. But that will naturally
happen as part of any such enabling, otherwise userspace won't be able to do
anything with L1's tertiary controls.

As for supporting IPI virtualization for nested VMs, that's definitely future work.
I haven't thought about it too much, but I assume it's a similar story to AVIC where
KVM would need to shadow L1's PID table.

2023-02-15 03:20:47

by Xiaoyao Li

[permalink] [raw]
Subject: Re: [PATCH v2 03/21] KVM: x86: Add macros to track first...last VMX feature MSRs

On 2/10/2023 8:31 AM, Sean Christopherson wrote:
> Add macros to track the range of VMX feature MSRs that are emulated by
> KVM to reduce the maintenance cost of extending the set of emulated MSRs.
>
> Note, KVM doesn't necessarily emulate all known/consumed VMX MSRs, e.g.
> PROCBASED_CTLS3 is consumed by KVM to enable IPI virtualization, but is
> not emulated as KVM doesn't emulate/virtualize IPI virtualization for
> nested guests.
>
> No functional change intended.
>
> Signed-off-by: Sean Christopherson <[email protected]>

Reviewed-by: Xiaoyao Li <[email protected]>

> ---
> arch/x86/kvm/svm/svm.c | 2 +-
> arch/x86/kvm/vmx/vmx.c | 8 ++++----
> arch/x86/kvm/x86.h | 8 ++++++++
> 3 files changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index b43775490074..a5b9ebd6f2c5 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4107,7 +4107,7 @@ static bool svm_has_emulated_msr(struct kvm *kvm, u32 index)
> {
> switch (index) {
> case MSR_IA32_MCG_EXT_CTL:
> - case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
> + case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
> return false;
> case MSR_IA32_SMBASE:
> if (!IS_ENABLED(CONFIG_KVM_SMM))
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 47abd9101e68..ee86db130519 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -1945,7 +1945,7 @@ static inline bool is_vmx_feature_control_msr_valid(struct vcpu_vmx *vmx,
> static int vmx_get_msr_feature(struct kvm_msr_entry *msr)
> {
> switch (msr->index) {
> - case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
> + case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
> if (!nested)
> return 1;
> return vmx_get_vmx_msr(&vmcs_config.nested, msr->index, &msr->data);
> @@ -2030,7 +2030,7 @@ static int vmx_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> msr_info->data = to_vmx(vcpu)->msr_ia32_sgxlepubkeyhash
> [msr_info->index - MSR_IA32_SGXLEPUBKEYHASH0];
> break;
> - case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
> + case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
> if (!nested_vmx_allowed(vcpu))
> return 1;
> if (vmx_get_vmx_msr(&vmx->nested.msrs, msr_info->index,
> @@ -2366,7 +2366,7 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> vmx->msr_ia32_sgxlepubkeyhash
> [msr_index - MSR_IA32_SGXLEPUBKEYHASH0] = data;
> break;
> - case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
> + case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
> if (!msr_info->host_initiated)
> return 1; /* they are read-only */
> if (!nested_vmx_allowed(vcpu))
> @@ -6960,7 +6960,7 @@ static bool vmx_has_emulated_msr(struct kvm *kvm, u32 index)
> * real mode.
> */
> return enable_unrestricted_guest || emulate_invalid_guest_state;
> - case MSR_IA32_VMX_BASIC ... MSR_IA32_VMX_VMFUNC:
> + case KVM_FIRST_EMULATED_VMX_MSR ... KVM_LAST_EMULATED_VMX_MSR:
> return nested;
> case MSR_AMD64_VIRT_SPEC_CTRL:
> case MSR_AMD64_TSC_RATIO:
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 754190af1791..4bc483d082ee 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -40,6 +40,14 @@ void kvm_spurious_fault(void);
> failed; \
> })
>
> +/*
> + * The first...last VMX feature MSRs that are emulated by KVM. This may or may
> + * not cover all known VMX MSRs, as KVM doesn't emulate an MSR until there's an
> + * associated feature that KVM supports for nested virtualization.
> + */
> +#define KVM_FIRST_EMULATED_VMX_MSR MSR_IA32_VMX_BASIC
> +#define KVM_LAST_EMULATED_VMX_MSR MSR_IA32_VMX_VMFUNC
> +
> #define KVM_DEFAULT_PLE_GAP 128
> #define KVM_VMX_DEFAULT_PLE_WINDOW 4096
> #define KVM_DEFAULT_PLE_WINDOW_GROW 2


2023-02-15 11:53:42

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v2 06/21] KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run

On 10/2/2023 8:31 am, Sean Christopherson wrote:
> Now that KVM disallows changing feature MSRs, i.e. PERF_CAPABILITIES,
> after running a vCPU, WARN and bug the VM if the PMU is refreshed after
> the vCPU has run.
>
> Note, KVM has disallowed CPUID updates after running a vCPU since commit
> feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN"), i.e.
> PERF_CAPABILITIES was the only remaining way to trigger a PMU refresh
> after KVM_RUN.

A malicious userspace could save the vCPU state, delete the vCPU, and then
recreate a new vCPU with the previous state, giving itself a chance to re-set
the feature MSRs.

The key to this issue may lie with the KVM_CREATE_VM interface.

How about a contract that once the first vCPU is created and any vCPU has done
KVM_RUN, the feature MSR values of all vCPUs of the same guest cannot be
changed, even if the vCPU that (likely) ran first is deleted?

>
> Cc: Like Xu <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/pmu.c | 3 +++
> arch/x86/kvm/x86.c | 10 +++++++++-
> 2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 612e6c70ce2e..7e974c4e61b0 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -589,6 +589,9 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> */
> void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
> {
> + if (KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm))
> + return;
> +
> static_call(kvm_x86_pmu_refresh)(vcpu);
> }
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 186cb6a81643..1b14632a94a3 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -3626,9 +3626,17 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> if (data & ~kvm_caps.supported_perf_cap)
> return 1;
>
> + /*
> + * Note, this is not just a performance optimization! KVM
> + * disallows changing feature MSRs after the vCPU has run; PMU
> + * refresh will bug the VM if called after the vCPU has run.
> + */
> + if (vcpu->arch.perf_capabilities == data)
> + break;
> +
> vcpu->arch.perf_capabilities = data;
> kvm_pmu_refresh(vcpu);
> - return 0;
> + break;
> case MSR_EFER:
> return set_efer(vcpu, msr_info);
> case MSR_K7_HWCR:

2023-02-15 12:00:42

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v2 07/21] KVM: x86/pmu: Zero out LBR capabilities during PMU refresh

On 10/2/2023 8:31 am, Sean Christopherson wrote:
> Note, this is a very theoretical bug, there is no known use case where a
> VMM would deliberately enable the vPMU via KVM_SET_CPUID2, and then later
> disable the vPMU.

That's why we're getting more and more comfortable with selftests
and fuzz testing on KVM interfaces. So is there a test for this?

2023-02-15 15:17:20

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v2 07/21] KVM: x86/pmu: Zero out LBR capabilities during PMU refresh

On Wed, Feb 15, 2023, Like Xu wrote:
> On 10/2/2023 8:31 am, Sean Christopherson wrote:
> > Note, this is a very theoretical bug, there is no known use case where a
> > VMM would deliberately enable the vPMU via KVM_SET_CPUID2, and then later
> > disable the vPMU.
>
> That's why we're getting more and more comfortable with selftests
> and fuzz testing on KVM interfaces. So is there a test for this?

Yep, last patch in the series, "KVM: selftests: Verify LBRs are disabled if vPMU
is disabled".

2023-02-15 15:21:02

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v2 06/21] KVM: x86/pmu: WARN and bug the VM if PMU is refreshed after vCPU has run

On Wed, Feb 15, 2023, Like Xu wrote:
> On 10/2/2023 8:31 am, Sean Christopherson wrote:
> > Now that KVM disallows changing feature MSRs, i.e. PERF_CAPABILITIES,
> > after running a vCPU, WARN and bug the VM if the PMU is refreshed after
> > the vCPU has run.
> >
> > Note, KVM has disallowed CPUID updates after running a vCPU since commit
> > feb627e8d6f6 ("KVM: x86: Forbid KVM_SET_CPUID{,2} after KVM_RUN"), i.e.
> > PERF_CAPABILITIES was the only remaining way to trigger a PMU refresh
> > after KVM_RUN.
>
> A malicious userspace could save the vCPU state, delete the vCPU, and then
> recreate a new vCPU with the previous state, giving itself a chance to re-set
> the feature MSRs.

I don't follow. vcpu->arch.perf_capabilities and kvm_vcpu_has_run() are per-vCPU;
creating another vCPU will not let userspace trigger this WARN.

> The key to this issue may lie with the KVM_CREATE_VM interface.
>
> How about a contract that once the first vCPU is created and any vCPU has done
> KVM_RUN, the feature MSR values of all vCPUs of the same guest cannot be
> changed, even if the vCPU that (likely) ran first is deleted?

I don't think that's necessary, as above the "freeze" happens per-vCPU.

2023-02-15 22:36:46

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v2 05/21] KVM: x86: Disallow writes to immutable feature MSRs after KVM_RUN

On Tue, Feb 14, 2023, Like Xu wrote:
> On 10/2/2023 8:31 am, Sean Christopherson wrote:
> > @@ -2168,6 +2187,23 @@ static int do_get_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> > static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
> > {
> > + u64 val;
> > +
> > + /*
> > + * Disallow writes to immutable feature MSRs after KVM_RUN. KVM does
> > + * not support modifying the guest vCPU model on the fly, e.g. changing
> > + * the nVMX capabilities while L2 is running is nonsensical. Ignore
> > + * writes of the same value, e.g. to allow userspace to blindly stuff
> > + * all MSRs when emulating RESET.
> > + */
> > + if (vcpu->arch.last_vmentry_cpu != -1 &&
>
> Three concerns on my mind (to help you think this through, if any apply):
>
> - why not use kvm->created_vcpus?

Because this is a vCPU scoped ioctl().

> - what if different vCPU models of the same guest have different feature MSR
>   values?

KVM shouldn't care. If KVM does care, then that's a completely orthogonal bug
that needs to be fixed separately.

>   (Although they cannot be altered after the first run, selftests may be
>   needed to show that this is dangerous for KVM.)
>
> - setting "vcpu->arch.last_vmentry_cpu = vcpu->cpu" is still too late: part
>   of the guest code (an attack window) has already executed on the first
>   pass through kvm_x86_vcpu_run(), which may run for a long time.

Again, this is a vCPU scoped ioctl. The task doing KVM_RUN holds vcpu->mutex so
there is no race.
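
For reference, the serialization comes from the common vCPU ioctl path in
virt/kvm/kvm_main.c, roughly (heavily trimmed, from memory):

	static long kvm_vcpu_ioctl(struct file *filp,
				   unsigned int ioctl, unsigned long arg)
	{
		struct kvm_vcpu *vcpu = filp->private_data;
		...
		if (mutex_lock_killable(&vcpu->mutex))
			return -EINTR;
		/*
		 * KVM_RUN and the arch ioctls, including KVM_GET/SET_MSRS, are
		 * all dispatched with vcpu->mutex held.
		 */
		...
		mutex_unlock(&vcpu->mutex);
	}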