2024-01-09 23:03:04

by Sean Christopherson

Subject: [PATCH v10 00/29] KVM: x86/pmu: selftests: Fixes and new tests

Knock wood, _this_ is the final round of fixes and tests for PMU counters.
New in v10 is a small refactor to treat FIXED as a value, not a flag, when
emulating RDPMC. Everything else is the same as v9 (rebased, but there were
no conflicts).

v10:
- Collect review. [Dapeng]
- Treat the FIXED type in RDPMC's ECX as a value, not a flag. [Jim]

v9:
- https://lore.kernel.org/all/[email protected]
- Collect reviews. [Dapeng, Kan]
- Fix a 63:31 => 63:32 typo in a changelog. [Dapeng]
- Actually check that forced emulation is enabled before trying to force
emulation on RDPMC. [Jinrong]
- Fix the aforementioned priority inversion issue.
- Completely drop "support" for fast RDPMC, in quotes because KVM doesn't
actually support RDPMC for non-architectural PMUs. I had left the code
in v8 because I didn't fully grok what the early emulator check was
doing, i.e. wasn't 100% confident it was dead code.

v8:
- https://lore.kernel.org/all/[email protected]
- Collect reviews. [Jim, Dapeng, Kan]
- Tweak names for the RDPMC flags in the selftests #defines.
- Get the event selectors used to virtualize fixed counters straight from perf
instead of hardcoding the (wrong) selectors in KVM. [Kan]
- Rename an "eventsel" field to "event" for a patch that gets blasted
away in the end anyways. [Jim]
- Add patches to fix RDPMC emulation and to test the behavior on Intel.
I spot tested on AMD and spent ~30 minutes trying to squeeze in the
bare minimum AMD support, but the PMU implementations between Intel
and AMD are juuuust different enough to make adding AMD support non-
trivial, and this series is already way too big.

v7:
- https://lore.kernel.org/all/[email protected]
- Drop patches that unnecessarily sanitized supported CPUID. [Jim]
- Purge the array of architectural event encodings. [Jim, Dapeng]
- Clean up pmu.h to remove useless macros, and make it easier to use the
new macros. [Jim]
- Port more of pmu_event_filter_test.c to pmu.h macros. [Jim, Jinrong]
- Clean up test comments and error messages. [Jim]
- Sanity check the value provided to vcpu_set_cpuid_property(). [Jim]

v6:
- https://lore.kernel.org/all/[email protected]
- Test LLC references/misses with CLFLUSH{OPT}. [Jim]
- Make the tests play nice without PERF_CAPABILITIES. [Mingwei]
- Don't squash eventsels that happen to match an unsupported arch event. [Kan]
- Test PMC counters with forced emulation (don't ask how long it took me to
figure out how to read integer module params).

v5: https://lore.kernel.org/all/[email protected]
v4: https://lore.kernel.org/all/[email protected]
v3: https://lore.kernel.org/kvm/[email protected]

Jinrong Liang (7):
KVM: selftests: Add vcpu_set_cpuid_property() to set properties
KVM: selftests: Add pmu.h and lib/pmu.c for common PMU assets
KVM: selftests: Test Intel PMU architectural events on gp counters
KVM: selftests: Test Intel PMU architectural events on fixed counters
KVM: selftests: Test consistency of CPUID with num of gp counters
KVM: selftests: Test consistency of CPUID with num of fixed counters
KVM: selftests: Add functional test for Intel's fixed PMU counters

Sean Christopherson (22):
KVM: x86/pmu: Always treat Fixed counters as available when supported
KVM: x86/pmu: Allow programming events that match unsupported arch
events
KVM: x86/pmu: Remove KVM's enumeration of Intel's architectural
encodings
KVM: x86/pmu: Setup fixed counters' eventsel during PMU initialization
KVM: x86/pmu: Get eventsel for fixed counters from perf
KVM: x86/pmu: Don't ignore bits 31:30 for RDPMC index on AMD
KVM: x86/pmu: Prioritize VMX interception over #GP on RDPMC due to bad
index
KVM: x86/pmu: Apply "fast" RDPMC only to Intel PMUs
KVM: x86/pmu: Disallow "fast" RDPMC for architectural Intel PMUs
KVM: x86/pmu: Treat "fixed" PMU type in RDPMC as index as a value, not
flag
KVM: x86/pmu: Explicitly check for RDPMC of unsupported Intel PMC
types
KVM: selftests: Drop the "name" param from KVM_X86_PMU_FEATURE()
KVM: selftests: Extend {kvm,this}_pmu_has() to support fixed counters
KVM: selftests: Expand PMU counters test to verify LLC events
KVM: selftests: Add a helper to query if the PMU module param is
enabled
KVM: selftests: Add helpers to read integer module params
KVM: selftests: Query module param to detect FEP in MSR filtering test
KVM: selftests: Move KVM_FEP macro into common library header
KVM: selftests: Test PMC virtualization with forced emulation
KVM: selftests: Add a forced emulation variation of KVM_ASM_SAFE()
KVM: selftests: Add helpers for safe and safe+forced RDMSR, RDPMC, and
XGETBV
KVM: selftests: Extend PMU counters test to validate RDPMC after WRMSR

arch/x86/include/asm/kvm-x86-pmu-ops.h | 3 +-
arch/x86/kvm/emulate.c | 2 +-
arch/x86/kvm/kvm_emulate.h | 2 +-
arch/x86/kvm/pmu.c | 20 +-
arch/x86/kvm/pmu.h | 5 +-
arch/x86/kvm/svm/pmu.c | 17 +-
arch/x86/kvm/vmx/pmu_intel.c | 178 +++--
arch/x86/kvm/x86.c | 9 +-
tools/testing/selftests/kvm/Makefile | 2 +
.../selftests/kvm/include/kvm_util_base.h | 4 +
tools/testing/selftests/kvm/include/pmu.h | 97 +++
.../selftests/kvm/include/x86_64/processor.h | 148 ++++-
tools/testing/selftests/kvm/lib/kvm_util.c | 62 +-
tools/testing/selftests/kvm/lib/pmu.c | 31 +
.../selftests/kvm/lib/x86_64/processor.c | 15 +-
.../selftests/kvm/x86_64/pmu_counters_test.c | 617 ++++++++++++++++++
.../kvm/x86_64/pmu_event_filter_test.c | 143 ++--
.../smaller_maxphyaddr_emulation_test.c | 2 +-
.../kvm/x86_64/userspace_msr_exit_test.c | 29 +-
.../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 2 +-
20 files changed, 1097 insertions(+), 291 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/pmu.h
create mode 100644 tools/testing/selftests/kvm/lib/pmu.c
create mode 100644 tools/testing/selftests/kvm/x86_64/pmu_counters_test.c


base-commit: 1c6d984f523f67ecfad1083bb04c55d91977bb15
--
2.43.0.472.g3155946c3a-goog



2024-01-09 23:03:32

by Sean Christopherson

Subject: [PATCH v10 01/29] KVM: x86/pmu: Always treat Fixed counters as available when supported

Treat fixed counters as available when they are supported, i.e. don't
silently ignore an enabled fixed counter just because guest CPUID says the
associated general purpose architectural event is unavailable.

KVM originally treated fixed counters as always available, but that got
changed as part of a fix to avoid confusing REF_CPU_CYCLES, which does NOT
map to an architectural event, with the actual architectural event
associated with bit 7, TOPDOWN_SLOTS.

The commit justified the change with:

If the event is marked as unavailable in the Intel guest CPUID
0AH.EBX leaf, we need to avoid any perf_event creation, whether
it's a gp or fixed counter.

but that justification doesn't mesh with reality. The Intel SDM uses
"architectural events" to refer to both general purpose events (the ones
with the reverse polarity mask in CPUID.0xA.EBX) and the events for fixed
counters, e.g. the SDM makes statements like:

Each of the fixed-function PMC can count only one architectural
performance event.

but the fact that fixed counter 2 (TSC reference cycles) doesn't have an
associated general purpose architectural event makes trying to apply the mask
from CPUID.0xA.EBX impossible.

Furthermore, the lack of enumeration for an architectural event in CPUID
only means the CPU doesn't officially support the architectural encoding,
i.e. it doesn't mean using the architectural encoding _won't_ work, it
simply means there are no guarantees that it will work as expected. E.g.
if KVM is running in a VM that advertises a fixed counter but not the
corresponding architectural event encoding, and perf decides to use a
general purpose counter instead of a fixed counter, odds are very good
that the underlying hardware actually does support the architectural
encoding, and that programming the encoding will count the right thing.

In other words, asking perf to count the event will probably work, whereas
intentionally doing nothing is obviously guaranteed to fail.

Note, at the time of the change, KVM didn't enforce hardware support, i.e.
didn't prevent userspace from enumerating support in guest CPUID.0xA.EBX
for architectural events that aren't supported in hardware. I.e. silently
dropping the fixed counter didn't somehow protect against counting the
wrong event, it just enforced guest CPUID. And practically speaking, this
issue is almost certainly limited to running KVM on a funky virtual CPU
model. No known real hardware has an asymmetric PMU where a fixed counter
is supported but the associated architectural event is not.
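
For reference, a minimal user-space sketch of the CPUID.0xA enumeration in
question (the helper name is made up; the EAX[31:24] bit-vector length and the
reverse-polarity EBX bits are per the SDM):

  #include <cpuid.h>
  #include <stdbool.h>

  /* True if architectural event @i is enumerated as available. */
  static bool arch_event_is_available(unsigned int i)
  {
          unsigned int eax, ebx, ecx, edx;

          __cpuid_count(0xa, 0, eax, ebx, ecx, edx);

          /*
           * EAX[31:24] is the length of the EBX bit vector, and a *set* EBX
           * bit means the event is NOT available.  Fixed counter 2 (TSC
           * reference cycles) has no bit here at all.
           */
          return ((eax >> 24) & 0xff) > i && !(ebx & (1u << i));
  }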

Fixes: a21864486f7e ("KVM: x86/pmu: Fix available_event_types check for REF_CPU_CYCLES event")
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index a6216c874729..8207f8c03585 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -108,11 +108,24 @@ static bool intel_hw_event_available(struct kvm_pmc *pmc)
u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
int i;

+ /*
+ * Fixed counters are always available if KVM reaches this point. If a
+ * fixed counter is unsupported in hardware or guest CPUID, KVM doesn't
+ * allow the counter's corresponding MSR to be written. KVM does use
+ * architectural events to program fixed counters, as the interface to
+ * perf doesn't allow requesting a specific fixed counter, e.g. perf
+ * may (sadly) back a guest fixed PMC with a general purposed counter.
+ * But if _hardware_ doesn't support the associated event, KVM simply
+ * doesn't enumerate support for the fixed counter.
+ */
+ if (pmc_is_fixed(pmc))
+ return true;
+
BUILD_BUG_ON(ARRAY_SIZE(intel_arch_events) != NR_INTEL_ARCH_EVENTS);

/*
* Disallow events reported as unavailable in guest CPUID. Note, this
- * doesn't apply to pseudo-architectural events.
+ * doesn't apply to pseudo-architectural events (see above).
*/
for (i = 0; i < NR_REAL_INTEL_ARCH_EVENTS; i++) {
if (intel_arch_events[i].eventsel != event_select ||
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:03:42

by Sean Christopherson

Subject: [PATCH v10 02/29] KVM: x86/pmu: Allow programming events that match unsupported arch events

Remove KVM's bogus restriction that the guest can't program an event whose
encoding matches an unsupported architectural event. The enumeration of
an architectural event only says that if a CPU supports an architectural
event, then the event can be programmed using the architectural encoding.
The enumeration does NOT say anything about the encoding when the CPU
doesn't report support for the architectural event.

Preventing the guest from counting events whose encoding happens to match
an architectural event breaks existing functionality whenever Intel adds
an architectural encoding that was *ever* used for a CPU that doesn't
enumerate support for the architectural event, even if the encoding is for
the exact same event!

E.g. the architectural encoding for Top-Down Slots is 0x01a4. On Broadwell
CPUs, which do not support the Top-Down Slots architectural event, 0x01a4
is a valid, model-specific event. Denying guest usage of 0x01a4 if/when
KVM adds support for Top-Down slots would break any Broadwell-based guest.
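
As an illustration only (a guest-side fragment using helpers that later patches
in this series add to the selftests library; MSR_P6_EVNTSEL0 is IA32_PERFEVTSEL0):

  /*
   * Program the Top-Down Slots encoding (eventsel 0xa4, umask 0x01) on GP
   * counter 0; per the above, this should count even if guest CPUID doesn't
   * enumerate the Top-Down Slots architectural event.
   */
  wrmsr(MSR_P6_EVNTSEL0, ARCH_PERFMON_EVENTSEL_ENABLE |
                         ARCH_PERFMON_EVENTSEL_OS |
                         ARCH_PERFMON_EVENTSEL_USR |
                         RAW_EVENT(0xa4, 0x01));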

Reported-by: Kan Liang <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]
Fixes: a21864486f7e ("KVM: x86/pmu: Fix available_event_types check for REF_CPU_CYCLES event")
Reviewed-by: Dapeng Mi <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/include/asm/kvm-x86-pmu-ops.h | 1 -
arch/x86/kvm/pmu.c | 1 -
arch/x86/kvm/pmu.h | 1 -
arch/x86/kvm/svm/pmu.c | 6 ----
arch/x86/kvm/vmx/pmu_intel.c | 38 --------------------------
5 files changed, 47 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
index 058bc636356a..d7eebee4450c 100644
--- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -12,7 +12,6 @@ BUILD_BUG_ON(1)
* a NULL definition, for example if "static_call_cond()" will be used
* at the call sites.
*/
-KVM_X86_PMU_OP(hw_event_available)
KVM_X86_PMU_OP(pmc_idx_to_pmc)
KVM_X86_PMU_OP(rdpmc_ecx_to_pmc)
KVM_X86_PMU_OP(msr_idx_to_pmc)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 87cc6c8809ad..30945fea6988 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -441,7 +441,6 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
static bool pmc_event_is_allowed(struct kvm_pmc *pmc)
{
return pmc_is_globally_enabled(pmc) && pmc_speculative_in_use(pmc) &&
- static_call(kvm_x86_pmu_hw_event_available)(pmc) &&
check_pmu_event_filter(pmc);
}

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 7caeb3d8d4fd..87ecf22f5b25 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -19,7 +19,6 @@
#define VMWARE_BACKDOOR_PMC_APPARENT_TIME 0x10002

struct kvm_pmu_ops {
- bool (*hw_event_available)(struct kvm_pmc *pmc);
struct kvm_pmc *(*pmc_idx_to_pmc)(struct kvm_pmu *pmu, int pmc_idx);
struct kvm_pmc *(*rdpmc_ecx_to_pmc)(struct kvm_vcpu *vcpu,
unsigned int idx, u64 *mask);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index b6a7ad4d6914..1475d47c821c 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -73,11 +73,6 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
return amd_pmc_idx_to_pmc(pmu, idx);
}

-static bool amd_hw_event_available(struct kvm_pmc *pmc)
-{
- return true;
-}
-
static bool amd_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -233,7 +228,6 @@ static void amd_pmu_init(struct kvm_vcpu *vcpu)
}

struct kvm_pmu_ops amd_pmu_ops __initdata = {
- .hw_event_available = amd_hw_event_available,
.pmc_idx_to_pmc = amd_pmc_idx_to_pmc,
.rdpmc_ecx_to_pmc = amd_rdpmc_ecx_to_pmc,
.msr_idx_to_pmc = amd_msr_idx_to_pmc,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 8207f8c03585..1a7d021a6c7b 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -101,43 +101,6 @@ static struct kvm_pmc *intel_pmc_idx_to_pmc(struct kvm_pmu *pmu, int pmc_idx)
}
}

-static bool intel_hw_event_available(struct kvm_pmc *pmc)
-{
- struct kvm_pmu *pmu = pmc_to_pmu(pmc);
- u8 event_select = pmc->eventsel & ARCH_PERFMON_EVENTSEL_EVENT;
- u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
- int i;
-
- /*
- * Fixed counters are always available if KVM reaches this point. If a
- * fixed counter is unsupported in hardware or guest CPUID, KVM doesn't
- * allow the counter's corresponding MSR to be written. KVM does use
- * architectural events to program fixed counters, as the interface to
- * perf doesn't allow requesting a specific fixed counter, e.g. perf
- * may (sadly) back a guest fixed PMC with a general purposed counter.
- * But if _hardware_ doesn't support the associated event, KVM simply
- * doesn't enumerate support for the fixed counter.
- */
- if (pmc_is_fixed(pmc))
- return true;
-
- BUILD_BUG_ON(ARRAY_SIZE(intel_arch_events) != NR_INTEL_ARCH_EVENTS);
-
- /*
- * Disallow events reported as unavailable in guest CPUID. Note, this
- * doesn't apply to pseudo-architectural events (see above).
- */
- for (i = 0; i < NR_REAL_INTEL_ARCH_EVENTS; i++) {
- if (intel_arch_events[i].eventsel != event_select ||
- intel_arch_events[i].unit_mask != unit_mask)
- continue;
-
- return pmu->available_event_types & BIT(i);
- }
-
- return true;
-}
-
static bool intel_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -780,7 +743,6 @@ void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu)
}

struct kvm_pmu_ops intel_pmu_ops __initdata = {
- .hw_event_available = intel_hw_event_available,
.pmc_idx_to_pmc = intel_pmc_idx_to_pmc,
.rdpmc_ecx_to_pmc = intel_rdpmc_ecx_to_pmc,
.msr_idx_to_pmc = intel_msr_idx_to_pmc,
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:04:16

by Sean Christopherson

Subject: [PATCH v10 03/29] KVM: x86/pmu: Remove KVM's enumeration of Intel's architectural encodings

Drop KVM's enumeration of Intel's architectural event encodings, and
instead open code the three encodings (of which only two are real) that
KVM uses to emulate fixed counters. Now that KVM doesn't incorrectly
enforce the availability of architectural encodings, there is no reason
for KVM to ever care about the encodings themselves, at least not in the
current format of an array indexed by the encoding's position in CPUID.

Opportunistically add a comment to explain why KVM cares about eventsel
values for fixed counters.
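
For clarity, the composed eventsel values (unit_mask << 8 | event) that KVM
ends up programming for the three fixed counters are:

  [0]  (0x00 << 8) | 0xc0 = 0x00c0   /* instructions retired */
  [1]  (0x00 << 8) | 0x3c = 0x003c   /* CPU cycles */
  [2]  (0x03 << 8) | 0x00 = 0x0300   /* pseudo-encoding for reference cycles */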

Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 72 ++++++++++++------------------------
1 file changed, 23 insertions(+), 49 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 1a7d021a6c7b..f3c44ddc09f8 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -22,52 +22,6 @@

#define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)

-enum intel_pmu_architectural_events {
- /*
- * The order of the architectural events matters as support for each
- * event is enumerated via CPUID using the index of the event.
- */
- INTEL_ARCH_CPU_CYCLES,
- INTEL_ARCH_INSTRUCTIONS_RETIRED,
- INTEL_ARCH_REFERENCE_CYCLES,
- INTEL_ARCH_LLC_REFERENCES,
- INTEL_ARCH_LLC_MISSES,
- INTEL_ARCH_BRANCHES_RETIRED,
- INTEL_ARCH_BRANCHES_MISPREDICTED,
-
- NR_REAL_INTEL_ARCH_EVENTS,
-
- /*
- * Pseudo-architectural event used to implement IA32_FIXED_CTR2, a.k.a.
- * TSC reference cycles. The architectural reference cycles event may
- * or may not actually use the TSC as the reference, e.g. might use the
- * core crystal clock or the bus clock (yeah, "architectural").
- */
- PSEUDO_ARCH_REFERENCE_CYCLES = NR_REAL_INTEL_ARCH_EVENTS,
- NR_INTEL_ARCH_EVENTS,
-};
-
-static struct {
- u8 eventsel;
- u8 unit_mask;
-} const intel_arch_events[] = {
- [INTEL_ARCH_CPU_CYCLES] = { 0x3c, 0x00 },
- [INTEL_ARCH_INSTRUCTIONS_RETIRED] = { 0xc0, 0x00 },
- [INTEL_ARCH_REFERENCE_CYCLES] = { 0x3c, 0x01 },
- [INTEL_ARCH_LLC_REFERENCES] = { 0x2e, 0x4f },
- [INTEL_ARCH_LLC_MISSES] = { 0x2e, 0x41 },
- [INTEL_ARCH_BRANCHES_RETIRED] = { 0xc4, 0x00 },
- [INTEL_ARCH_BRANCHES_MISPREDICTED] = { 0xc5, 0x00 },
- [PSEUDO_ARCH_REFERENCE_CYCLES] = { 0x00, 0x03 },
-};
-
-/* mapping between fixed pmc index and intel_arch_events array */
-static int fixed_pmc_events[] = {
- [0] = INTEL_ARCH_INSTRUCTIONS_RETIRED,
- [1] = INTEL_ARCH_CPU_CYCLES,
- [2] = PSEUDO_ARCH_REFERENCE_CYCLES,
-};
-
static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
{
struct kvm_pmc *pmc;
@@ -440,8 +394,29 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 0;
}

+/*
+ * Map fixed counter events to architectural general purpose event encodings.
+ * Perf doesn't provide APIs to allow KVM to directly program a fixed counter,
+ * and so KVM instead programs the architectural event to effectively request
+ * the fixed counter. Perf isn't guaranteed to use a fixed counter and may
+ * instead program the encoding into a general purpose counter, e.g. if a
+ * different perf_event is already utilizing the requested counter, but the end
+ * result is the same (ignoring the fact that using a general purpose counter
+ * will likely exacerbate counter contention).
+ *
+ * Note, reference cycles is counted using a perf-defined "psuedo-encoding",
+ * as there is no architectural general purpose encoding for reference cycles.
+ */
static void setup_fixed_pmc_eventsel(struct kvm_pmu *pmu)
{
+ const struct {
+ u8 eventsel;
+ u8 unit_mask;
+ } fixed_pmc_events[] = {
+ [0] = { 0xc0, 0x00 }, /* Instruction Retired / PERF_COUNT_HW_INSTRUCTIONS. */
+ [1] = { 0x3c, 0x00 }, /* CPU Cycles/ PERF_COUNT_HW_CPU_CYCLES. */
+ [2] = { 0x00, 0x03 }, /* Reference Cycles / PERF_COUNT_HW_REF_CPU_CYCLES*/
+ };
int i;

BUILD_BUG_ON(ARRAY_SIZE(fixed_pmc_events) != KVM_PMC_MAX_FIXED);
@@ -449,10 +424,9 @@ static void setup_fixed_pmc_eventsel(struct kvm_pmu *pmu)
for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
int index = array_index_nospec(i, KVM_PMC_MAX_FIXED);
struct kvm_pmc *pmc = &pmu->fixed_counters[index];
- u32 event = fixed_pmc_events[index];

- pmc->eventsel = (intel_arch_events[event].unit_mask << 8) |
- intel_arch_events[event].eventsel;
+ pmc->eventsel = (fixed_pmc_events[index].unit_mask << 8) |
+ fixed_pmc_events[index].eventsel;
}
}

--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:04:34

by Sean Christopherson

Subject: [PATCH v10 04/29] KVM: x86/pmu: Setup fixed counters' eventsel during PMU initialization

Set the eventsel for all fixed counters during PMU initialization; the
eventsel is hardcoded and consumed if and only if the counter is supported,
i.e. there is no reason to redo the setup every time the PMU is refreshed.

Configuring all KVM-supported fixed counters also eliminates a potential
pitfall if/when KVM supports discontiguous fixed counters, in which case
configuring only nr_arch_fixed_counters will be insufficient (ignoring the
fact that KVM will need many other changes to support discontiguous fixed
counters).

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 16 +++++-----------
1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index f3c44ddc09f8..98e92b9ece09 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -407,27 +407,21 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
* Note, reference cycles is counted using a perf-defined "psuedo-encoding",
* as there is no architectural general purpose encoding for reference cycles.
*/
-static void setup_fixed_pmc_eventsel(struct kvm_pmu *pmu)
+static u64 intel_get_fixed_pmc_eventsel(int index)
{
const struct {
- u8 eventsel;
+ u8 event;
u8 unit_mask;
} fixed_pmc_events[] = {
[0] = { 0xc0, 0x00 }, /* Instruction Retired / PERF_COUNT_HW_INSTRUCTIONS. */
[1] = { 0x3c, 0x00 }, /* CPU Cycles/ PERF_COUNT_HW_CPU_CYCLES. */
[2] = { 0x00, 0x03 }, /* Reference Cycles / PERF_COUNT_HW_REF_CPU_CYCLES*/
};
- int i;

BUILD_BUG_ON(ARRAY_SIZE(fixed_pmc_events) != KVM_PMC_MAX_FIXED);

- for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
- int index = array_index_nospec(i, KVM_PMC_MAX_FIXED);
- struct kvm_pmc *pmc = &pmu->fixed_counters[index];
-
- pmc->eventsel = (fixed_pmc_events[index].unit_mask << 8) |
- fixed_pmc_events[index].eventsel;
- }
+ return (fixed_pmc_events[index].unit_mask << 8) |
+ fixed_pmc_events[index].event;
}

static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
@@ -493,7 +487,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
kvm_pmu_cap.bit_width_fixed);
pmu->counter_bitmask[KVM_PMC_FIXED] =
((u64)1 << edx.split.bit_width_fixed) - 1;
- setup_fixed_pmc_eventsel(pmu);
}

for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
@@ -571,6 +564,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
pmu->fixed_counters[i].vcpu = vcpu;
pmu->fixed_counters[i].idx = i + INTEL_PMC_IDX_FIXED;
pmu->fixed_counters[i].current_config = 0;
+ pmu->fixed_counters[i].eventsel = intel_get_fixed_pmc_eventsel(i);
}

lbr_desc->records.nr = 0;
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:05:25

by Sean Christopherson

Subject: [PATCH v10 06/29] KVM: x86/pmu: Don't ignore bits 31:30 for RDPMC index on AMD

Stop stripping bits 31:30 prior to validating/consuming the RDPMC index on
AMD. Per the APM's documentation of RDPMC, *values* greater than 27 are
reserved. The behavior of upper bits being flags is firmly Intel-only.

Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM")
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/svm/pmu.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 1475d47c821c..1fafc46f61c9 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -77,8 +77,6 @@ static bool amd_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);

- idx &= ~(3u << 30);
-
return idx < pmu->nr_arch_gp_counters;
}

@@ -86,7 +84,7 @@ static bool amd_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
static struct kvm_pmc *amd_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
unsigned int idx, u64 *mask)
{
- return amd_pmc_idx_to_pmc(vcpu_to_pmu(vcpu), idx & ~(3u << 30));
+ return amd_pmc_idx_to_pmc(vcpu_to_pmu(vcpu), idx);
}

static struct kvm_pmc *amd_msr_idx_to_pmc(struct kvm_vcpu *vcpu, u32 msr)
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:05:27

by Sean Christopherson

Subject: [PATCH v10 05/29] KVM: x86/pmu: Get eventsel for fixed counters from perf

Get the event selectors used to effectively request fixed counters for
perf events from perf itself instead of hardcoding them in KVM and hoping
that they match the underlying hardware. While fixed counters 0 and 1 use
architectural events, as of ffbe4ab0beda ("perf/x86/intel: Extend the
ref-cycles event to GP counters") fixed counter 2 (reference TSC cycles)
may use a software-defined pseudo-encoding or a real hardware-defined
encoding.

Reported-by: Kan Liang <[email protected]>
Closes: https://lkml.kernel.org/r/4281eee7-6423-4ec8-bb18-c6aeee1faf2c%40linux.intel.com
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 30 +++++++++++++++++-------------
1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 98e92b9ece09..ec4feaef3d55 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -404,24 +404,28 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
* result is the same (ignoring the fact that using a general purpose counter
* will likely exacerbate counter contention).
*
- * Note, reference cycles is counted using a perf-defined "psuedo-encoding",
- * as there is no architectural general purpose encoding for reference cycles.
+ * Forcibly inlined to allow asserting on @index at build time, and there should
+ * never be more than one user.
*/
-static u64 intel_get_fixed_pmc_eventsel(int index)
+static __always_inline u64 intel_get_fixed_pmc_eventsel(unsigned int index)
{
- const struct {
- u8 event;
- u8 unit_mask;
- } fixed_pmc_events[] = {
- [0] = { 0xc0, 0x00 }, /* Instruction Retired / PERF_COUNT_HW_INSTRUCTIONS. */
- [1] = { 0x3c, 0x00 }, /* CPU Cycles/ PERF_COUNT_HW_CPU_CYCLES. */
- [2] = { 0x00, 0x03 }, /* Reference Cycles / PERF_COUNT_HW_REF_CPU_CYCLES*/
+ const enum perf_hw_id fixed_pmc_perf_ids[] = {
+ [0] = PERF_COUNT_HW_INSTRUCTIONS,
+ [1] = PERF_COUNT_HW_CPU_CYCLES,
+ [2] = PERF_COUNT_HW_REF_CPU_CYCLES,
};
+ u64 eventsel;

- BUILD_BUG_ON(ARRAY_SIZE(fixed_pmc_events) != KVM_PMC_MAX_FIXED);
+ BUILD_BUG_ON(ARRAY_SIZE(fixed_pmc_perf_ids) != KVM_PMC_MAX_FIXED);
+ BUILD_BUG_ON(index >= KVM_PMC_MAX_FIXED);

- return (fixed_pmc_events[index].unit_mask << 8) |
- fixed_pmc_events[index].event;
+ /*
+ * Yell if perf reports support for a fixed counter but perf doesn't
+ * have a known encoding for the associated general purpose event.
+ */
+ eventsel = perf_get_hw_event_config(fixed_pmc_perf_ids[index]);
+ WARN_ON_ONCE(!eventsel && index < kvm_pmu_cap.num_counters_fixed);
+ return eventsel;
}

static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:05:47

by Sean Christopherson

Subject: [PATCH v10 07/29] KVM: x86/pmu: Prioritize VMX interception over #GP on RDPMC due to bad index

Apply the pre-intercepts RDPMC validity check only to AMD, and rename all
relevant functions to make it as clear as possible that the check is not a
standard PMC index check. On Intel, the basic rule is that only invalid
opcodes and privilege/permission/mode checks have priority over VM-Exit,
i.e. RDPMC with an invalid index should VM-Exit, not #GP. While the SDM
doesn't explicitly call out RDPMC, it _does_ explicitly use RDMSR of a
non-existent MSR as an example where VM-Exit has priority over #GP, and
RDPMC is effectively just a variation of RDMSR.

Manually testing on various Intel CPUs confirms this behavior, and the
inverted priority was introduced for SVM compatibility, i.e. was not an
intentional change for Intel PMUs. On AMD, *all* exceptions on RDPMC have
priority over VM-Exit.

Check for a NULL kvm_pmu_ops.check_rdpmc_early instead of using a RET0
static call so as to provide a convenient location to document the
difference between Intel and AMD, and to again try to make it as obvious
as possible that the early check is a one-off thing, not a generic "is
this PMC valid?" helper.

Fixes: 8061252ee0d2 ("KVM: SVM: Add intercept checks for remaining twobyte instructions")
Cc: Jim Mattson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/include/asm/kvm-x86-pmu-ops.h | 2 +-
arch/x86/kvm/emulate.c | 2 +-
arch/x86/kvm/kvm_emulate.h | 2 +-
arch/x86/kvm/pmu.c | 16 +++++++++++++---
arch/x86/kvm/pmu.h | 4 ++--
arch/x86/kvm/svm/pmu.c | 9 ++++++---
arch/x86/kvm/vmx/pmu_intel.c | 12 ------------
arch/x86/kvm/x86.c | 9 +++------
8 files changed, 27 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
index d7eebee4450c..f0cd48222133 100644
--- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -15,7 +15,7 @@ BUILD_BUG_ON(1)
KVM_X86_PMU_OP(pmc_idx_to_pmc)
KVM_X86_PMU_OP(rdpmc_ecx_to_pmc)
KVM_X86_PMU_OP(msr_idx_to_pmc)
-KVM_X86_PMU_OP(is_valid_rdpmc_ecx)
+KVM_X86_PMU_OP_OPTIONAL(check_rdpmc_early)
KVM_X86_PMU_OP(is_valid_msr)
KVM_X86_PMU_OP(get_msr)
KVM_X86_PMU_OP(set_msr)
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index e223043ef5b2..695ab5b6055c 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3962,7 +3962,7 @@ static int check_rdpmc(struct x86_emulate_ctxt *ctxt)
* protected mode.
*/
if ((!(cr4 & X86_CR4_PCE) && ctxt->ops->cpl(ctxt)) ||
- ctxt->ops->check_pmc(ctxt, rcx))
+ ctxt->ops->check_rdpmc_early(ctxt, rcx))
return emulate_gp(ctxt, 0);

return X86EMUL_CONTINUE;
diff --git a/arch/x86/kvm/kvm_emulate.h b/arch/x86/kvm/kvm_emulate.h
index e6d149825169..4351149484fb 100644
--- a/arch/x86/kvm/kvm_emulate.h
+++ b/arch/x86/kvm/kvm_emulate.h
@@ -208,7 +208,7 @@ struct x86_emulate_ops {
int (*set_msr_with_filter)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 data);
int (*get_msr_with_filter)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 *pdata);
int (*get_msr)(struct x86_emulate_ctxt *ctxt, u32 msr_index, u64 *pdata);
- int (*check_pmc)(struct x86_emulate_ctxt *ctxt, u32 pmc);
+ int (*check_rdpmc_early)(struct x86_emulate_ctxt *ctxt, u32 pmc);
int (*read_pmc)(struct x86_emulate_ctxt *ctxt, u32 pmc, u64 *pdata);
void (*halt)(struct x86_emulate_ctxt *ctxt);
void (*wbinvd)(struct x86_emulate_ctxt *ctxt);
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 30945fea6988..0b0d804ee239 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -524,10 +524,20 @@ void kvm_pmu_handle_event(struct kvm_vcpu *vcpu)
kvm_pmu_cleanup(vcpu);
}

-/* check if idx is a valid index to access PMU */
-bool kvm_pmu_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
+int kvm_pmu_check_rdpmc_early(struct kvm_vcpu *vcpu, unsigned int idx)
{
- return static_call(kvm_x86_pmu_is_valid_rdpmc_ecx)(vcpu, idx);
+ /*
+ * On Intel, VMX interception has priority over RDPMC exceptions that
+ * aren't already handled by the emulator, i.e. there are no additional
+ * check needed for Intel PMUs.
+ *
+ * On AMD, _all_ exceptions on RDPMC have priority over SVM intercepts,
+ * i.e. an invalid PMC results in a #GP, not #VMEXIT.
+ */
+ if (!kvm_pmu_ops.check_rdpmc_early)
+ return 0;
+
+ return static_call(kvm_x86_pmu_check_rdpmc_early)(vcpu, idx);
}

bool is_vmware_backdoor_pmc(u32 pmc_idx)
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 87ecf22f5b25..51bbb01b21c8 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -23,7 +23,7 @@ struct kvm_pmu_ops {
struct kvm_pmc *(*rdpmc_ecx_to_pmc)(struct kvm_vcpu *vcpu,
unsigned int idx, u64 *mask);
struct kvm_pmc *(*msr_idx_to_pmc)(struct kvm_vcpu *vcpu, u32 msr);
- bool (*is_valid_rdpmc_ecx)(struct kvm_vcpu *vcpu, unsigned int idx);
+ int (*check_rdpmc_early)(struct kvm_vcpu *vcpu, unsigned int idx);
bool (*is_valid_msr)(struct kvm_vcpu *vcpu, u32 msr);
int (*get_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
@@ -215,7 +215,7 @@ static inline bool pmc_is_globally_enabled(struct kvm_pmc *pmc)
void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu);
void kvm_pmu_handle_event(struct kvm_vcpu *vcpu);
int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data);
-bool kvm_pmu_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx);
+int kvm_pmu_check_rdpmc_early(struct kvm_vcpu *vcpu, unsigned int idx);
bool kvm_pmu_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr);
int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 1fafc46f61c9..e886300f0f97 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -73,11 +73,14 @@ static inline struct kvm_pmc *get_gp_pmc_amd(struct kvm_pmu *pmu, u32 msr,
return amd_pmc_idx_to_pmc(pmu, idx);
}

-static bool amd_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
+static int amd_check_rdpmc_early(struct kvm_vcpu *vcpu, unsigned int idx)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);

- return idx < pmu->nr_arch_gp_counters;
+ if (idx >= pmu->nr_arch_gp_counters)
+ return -EINVAL;
+
+ return 0;
}

/* idx is the ECX register of RDPMC instruction */
@@ -229,7 +232,7 @@ struct kvm_pmu_ops amd_pmu_ops __initdata = {
.pmc_idx_to_pmc = amd_pmc_idx_to_pmc,
.rdpmc_ecx_to_pmc = amd_rdpmc_ecx_to_pmc,
.msr_idx_to_pmc = amd_msr_idx_to_pmc,
- .is_valid_rdpmc_ecx = amd_is_valid_rdpmc_ecx,
+ .check_rdpmc_early = amd_check_rdpmc_early,
.is_valid_msr = amd_is_valid_msr,
.get_msr = amd_pmu_get_msr,
.set_msr = amd_pmu_set_msr,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index ec4feaef3d55..1b1f888ad32b 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -55,17 +55,6 @@ static struct kvm_pmc *intel_pmc_idx_to_pmc(struct kvm_pmu *pmu, int pmc_idx)
}
}

-static bool intel_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
-{
- struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
- bool fixed = idx & (1u << 30);
-
- idx &= ~(3u << 30);
-
- return fixed ? idx < pmu->nr_arch_fixed_counters
- : idx < pmu->nr_arch_gp_counters;
-}
-
static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
unsigned int idx, u64 *mask)
{
@@ -718,7 +707,6 @@ struct kvm_pmu_ops intel_pmu_ops __initdata = {
.pmc_idx_to_pmc = intel_pmc_idx_to_pmc,
.rdpmc_ecx_to_pmc = intel_rdpmc_ecx_to_pmc,
.msr_idx_to_pmc = intel_msr_idx_to_pmc,
- .is_valid_rdpmc_ecx = intel_is_valid_rdpmc_ecx,
.is_valid_msr = intel_is_valid_msr,
.get_msr = intel_pmu_get_msr,
.set_msr = intel_pmu_set_msr,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 27e23714e960..4d1191a944f1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8389,12 +8389,9 @@ static int emulator_get_msr(struct x86_emulate_ctxt *ctxt,
return kvm_get_msr(emul_to_vcpu(ctxt), msr_index, pdata);
}

-static int emulator_check_pmc(struct x86_emulate_ctxt *ctxt,
- u32 pmc)
+static int emulator_check_rdpmc_early(struct x86_emulate_ctxt *ctxt, u32 pmc)
{
- if (kvm_pmu_is_valid_rdpmc_ecx(emul_to_vcpu(ctxt), pmc))
- return 0;
- return -EINVAL;
+ return kvm_pmu_check_rdpmc_early(emul_to_vcpu(ctxt), pmc);
}

static int emulator_read_pmc(struct x86_emulate_ctxt *ctxt,
@@ -8526,7 +8523,7 @@ static const struct x86_emulate_ops emulate_ops = {
.set_msr_with_filter = emulator_set_msr_with_filter,
.get_msr_with_filter = emulator_get_msr_with_filter,
.get_msr = emulator_get_msr,
- .check_pmc = emulator_check_pmc,
+ .check_rdpmc_early = emulator_check_rdpmc_early,
.read_pmc = emulator_read_pmc,
.halt = emulator_halt,
.wbinvd = emulator_wbinvd,
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:06:03

by Sean Christopherson

Subject: [PATCH v10 08/29] KVM: x86/pmu: Apply "fast" RDPMC only to Intel PMUs

Move the handling of "fast" RDPMC instructions, which drop bits 63:32 of
the count, to Intel. The "fast" flag, and all modifiers for that matter,
are Intel-only and aren't supported by AMD.

Opportunistically replace open coded bit crud with proper #defines, and
add comments to try and disentangle the flags vs. values mess for
non-architectural vs. architectural PMUs.

Fixes: ca724305a2b0 ("KVM: x86/vPMU: Implement AMD vPMU code for KVM")
Reviewed-by: Dapeng Mi <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/pmu.c | 3 +--
arch/x86/kvm/vmx/pmu_intel.c | 16 ++++++++++++++--
2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 0b0d804ee239..09b0feb975c3 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -576,10 +576,9 @@ static int kvm_pmu_rdpmc_vmware(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)

int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned idx, u64 *data)
{
- bool fast_mode = idx & (1u << 31);
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
struct kvm_pmc *pmc;
- u64 mask = fast_mode ? ~0u : ~0ull;
+ u64 mask = ~0ull;

if (!pmu->version)
return 1;
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 1b1f888ad32b..03bd188b5754 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -20,6 +20,15 @@
#include "nested.h"
#include "pmu.h"

+/*
+ * Perf's "BASE" is wildly misleading, architectural PMUs use bits 31:16 of ECX
+ * to encode the "type" of counter to read, i.e. this is not a "base". And to
+ * further confuse things, non-architectural PMUs use bit 31 as a flag for
+ * "fast" reads, whereas the "type" is an explicit value.
+ */
+#define INTEL_RDPMC_FIXED INTEL_PMC_FIXED_RDPMC_BASE
+#define INTEL_RDPMC_FAST BIT(31)
+
#define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)

static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
@@ -59,11 +68,14 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
unsigned int idx, u64 *mask)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
- bool fixed = idx & (1u << 30);
+ bool fixed = idx & INTEL_RDPMC_FIXED;
struct kvm_pmc *counters;
unsigned int num_counters;

- idx &= ~(3u << 30);
+ if (idx & INTEL_RDPMC_FAST)
+ *mask &= GENMASK_ULL(31, 0);
+
+ idx &= ~(INTEL_RDPMC_FIXED | INTEL_RDPMC_FAST);
if (fixed) {
counters = pmu->fixed_counters;
num_counters = pmu->nr_arch_fixed_counters;
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:06:32

by Sean Christopherson

Subject: [PATCH v10 09/29] KVM: x86/pmu: Disallow "fast" RDPMC for architectural Intel PMUs

Inject #GP on RDPMC if the "fast" flag is set for architectural Intel
PMUs, i.e. if the PMU version is non-zero. Per Intel's SDM, and confirmed
on bare metal, the "fast" flag is supported only for non-architectural
PMUs, and is reserved for architectural PMUs.

If the processor does not support architectural performance monitoring
(CPUID.0AH:EAX[7:0]=0), ECX[30:0] specifies the index of the PMC to be
read. Setting ECX[31] selects “fast” read mode if supported. In this mode,
RDPMC returns bits 31:0 of the PMC in EAX while clearing EDX to zero.

If the processor does support architectural performance monitoring
(CPUID.0AH:EAX[7:0] ≠ 0), ECX[31:16] specifies type of PMC while ECX[15:0]
specifies the index of the PMC to be read within that type. The following
PMC types are currently defined:
— General-purpose counters use type 0. The index x (to read IA32_PMCx)
must be less than the value enumerated by CPUID.0AH.EAX[15:8] (thus
ECX[15:8] must be zero).
— Fixed-function counters use type 4000H. The index x (to read
IA32_FIXED_CTRx) can be used if either CPUID.0AH.EDX[4:0] > x or
CPUID.0AH.ECX[x] = 1 (thus ECX[15:5] must be 0).
— Performance metrics use type 2000H. This type can be used only if
IA32_PERF_CAPABILITIES.PERF_METRICS_AVAILABLE[bit 15]=1. For this type,
the index in ECX[15:0] is implementation specific.

Opportunistically WARN if KVM ever actually tries to complete RDPMC for a
non-architectural PMU, and drop the non-existent "support" for fast RDPMC,
as KVM doesn't support such PMUs, i.e. kvm_pmu_rdpmc() should reject the
RDPMC before getting to the Intel code.
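
A short guest-side sketch of the architectural ECX encodings described above
(fragment only; rdpmc() is the selftests helper, 'i' is the counter index):

  uint64_t gp    = rdpmc(i);                    /* type 0     => IA32_PMCi       */
  uint64_t fixed = rdpmc((0x4000u << 16) | i);  /* type 4000H => IA32_FIXED_CTRi */

  /*
   * Setting ECX[31] on an architectural PMU does NOT select a "fast" read;
   * with this patch such an access is rejected with #GP.
   */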

Fixes: f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")
Fixes: 67f4d4288c35 ("KVM: x86: rdpmc emulation checks the counter incorrectly")
Reviewed-by: Dapeng Mi <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 03bd188b5754..5a5dfae6055c 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -27,7 +27,6 @@
* "fast" reads, whereas the "type" is an explicit value.
*/
#define INTEL_RDPMC_FIXED INTEL_PMC_FIXED_RDPMC_BASE
-#define INTEL_RDPMC_FAST BIT(31)

#define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)

@@ -72,10 +71,25 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
struct kvm_pmc *counters;
unsigned int num_counters;

- if (idx & INTEL_RDPMC_FAST)
- *mask &= GENMASK_ULL(31, 0);
+ /*
+ * The encoding of ECX for RDPMC is different for architectural versus
+ * non-architecturals PMUs (PMUs with version '0'). For architectural
+ * PMUs, bits 31:16 specify the PMC type and bits 15:0 specify the PMC
+ * index. For non-architectural PMUs, bit 31 is a "fast" flag, and
+ * bits 30:0 specify the PMC index.
+ *
+ * Yell and reject attempts to read PMCs for a non-architectural PMU,
+ * as KVM doesn't support such PMUs.
+ */
+ if (WARN_ON_ONCE(!pmu->version))
+ return NULL;

- idx &= ~(INTEL_RDPMC_FIXED | INTEL_RDPMC_FAST);
+ /*
+ * Fixed PMCs are supported on all architectural PMUs. Note, KVM only
+ * emulates fixed PMCs for PMU v2+, but the flag itself is still valid,
+ * i.e. let RDPMC fail due to accessing a non-existent counter.
+ */
+ idx &= ~INTEL_RDPMC_FIXED;
if (fixed) {
counters = pmu->fixed_counters;
num_counters = pmu->nr_arch_fixed_counters;
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:06:56

by Sean Christopherson

Subject: [PATCH v10 10/29] KVM: x86/pmu: Treat "fixed" PMU type in RDPMC as index as a value, not flag

Refactor KVM's handling of ECX for RDPMC to treat the FIXED modifier as an
explicit value, not a flag (minus one wart). While non-architectural PMUs
do use bit 31 as a flag (for "fast" reads), architectural PMUs use the
upper half of ECX to encode the type. From the SDM:

ECX[31:16] specifies type of PMC while ECX[15:0] specifies the index of
the PMC to be read within that type

Note, the fact that the known supported types, 4000H and 2000H, look a lot
like flags doesn't contradict the above statement that ECX[31:16] holds
the type, at least not by any sane reading of the SDM.

Keep the explicit clearing of the FIXED "flag", as KVM subtly relies on
that behavior to disallow unsupported types while allowing the correct
indices for fixed counters. This wart will be cleaned up in short order.

Opportunistically grab the per-type bitmask in the if-else blocks to
eliminate the one-off usage of the local "fixed" bool.

Reported-by: Jim Mattson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 5a5dfae6055c..c37dd3aa056b 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -28,6 +28,9 @@
*/
#define INTEL_RDPMC_FIXED INTEL_PMC_FIXED_RDPMC_BASE

+#define INTEL_RDPMC_TYPE_MASK GENMASK(31, 16)
+#define INTEL_RDPMC_INDEX_MASK GENMASK(15, 0)
+
#define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)

static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
@@ -66,10 +69,11 @@ static struct kvm_pmc *intel_pmc_idx_to_pmc(struct kvm_pmu *pmu, int pmc_idx)
static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
unsigned int idx, u64 *mask)
{
+ unsigned int type = idx & INTEL_RDPMC_TYPE_MASK;
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
- bool fixed = idx & INTEL_RDPMC_FIXED;
struct kvm_pmc *counters;
unsigned int num_counters;
+ u64 bitmask;

/*
* The encoding of ECX for RDPMC is different for architectural versus
@@ -90,16 +94,20 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
* i.e. let RDPMC fail due to accessing a non-existent counter.
*/
idx &= ~INTEL_RDPMC_FIXED;
- if (fixed) {
+ if (type == INTEL_RDPMC_FIXED) {
counters = pmu->fixed_counters;
num_counters = pmu->nr_arch_fixed_counters;
+ bitmask = pmu->counter_bitmask[KVM_PMC_FIXED];
} else {
counters = pmu->gp_counters;
num_counters = pmu->nr_arch_gp_counters;
+ bitmask = pmu->counter_bitmask[KVM_PMC_GP];
}
+
if (idx >= num_counters)
return NULL;
- *mask &= pmu->counter_bitmask[fixed ? KVM_PMC_FIXED : KVM_PMC_GP];
+
+ *mask &= bitmask;
return &counters[array_index_nospec(idx, num_counters)];
}

--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:07:14

by Sean Christopherson

Subject: [PATCH v10 11/29] KVM: x86/pmu: Explicitly check for RDPMC of unsupported Intel PMC types

Explicitly check for attempts to read unsupported PMC types instead of
letting the bounds check fail. Functionally, letting the check fail is
ok, but it's unnecessarily subtle and does a poor job of documenting the
architectural behavior that KVM is emulating.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 21 +++++++++++++++------
1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index c37dd3aa056b..b41bdb0a0995 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -26,6 +26,7 @@
* further confuse things, non-architectural PMUs use bit 31 as a flag for
* "fast" reads, whereas the "type" is an explicit value.
*/
+#define INTEL_RDPMC_GP 0
#define INTEL_RDPMC_FIXED INTEL_PMC_FIXED_RDPMC_BASE

#define INTEL_RDPMC_TYPE_MASK GENMASK(31, 16)
@@ -89,21 +90,29 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
return NULL;

/*
- * Fixed PMCs are supported on all architectural PMUs. Note, KVM only
- * emulates fixed PMCs for PMU v2+, but the flag itself is still valid,
- * i.e. let RDPMC fail due to accessing a non-existent counter.
+ * General Purpose (GP) PMCs are supported on all PMUs, and fixed PMCs
+ * are supported on all architectural PMUs, i.e. on all virtual PMUs
+ * supported by KVM. Note, KVM only emulates fixed PMCs for PMU v2+,
+ * but the type itself is still valid, i.e. let RDPMC fail due to
+ * accessing a non-existent counter. Reject attempts to read all other
+ * types, which are unknown/unsupported.
*/
- idx &= ~INTEL_RDPMC_FIXED;
- if (type == INTEL_RDPMC_FIXED) {
+ switch (type) {
+ case INTEL_RDPMC_FIXED:
counters = pmu->fixed_counters;
num_counters = pmu->nr_arch_fixed_counters;
bitmask = pmu->counter_bitmask[KVM_PMC_FIXED];
- } else {
+ break;
+ case INTEL_RDPMC_GP:
counters = pmu->gp_counters;
num_counters = pmu->nr_arch_gp_counters;
bitmask = pmu->counter_bitmask[KVM_PMC_GP];
+ break;
+ default:
+ return NULL;
}

+ idx &= INTEL_RDPMC_INDEX_MASK;
if (idx >= num_counters)
return NULL;

--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:07:34

by Sean Christopherson

Subject: [PATCH v10 12/29] KVM: selftests: Add vcpu_set_cpuid_property() to set properties

From: Jinrong Liang <[email protected]>

Add vcpu_set_cpuid_property() helper function for setting properties, and
use it instead of open coding an equivalent for MAX_PHY_ADDR. Future vPMU
testcases will also need to stuff various CPUID properties.
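
For example (hypothetical usage; the PMU property shown is one the later
selftests patches in this series query):

  /* Advertise exactly two fixed counters in the guest's CPUID.0xA. */
  vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_NR_FIXED_COUNTERS, 2);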

Reviewed-by: Jim Mattson <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/include/x86_64/processor.h | 4 +++-
.../testing/selftests/kvm/lib/x86_64/processor.c | 15 ++++++++++++---
.../x86_64/smaller_maxphyaddr_emulation_test.c | 2 +-
3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index a84863503fcb..932944c4ea01 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -995,7 +995,9 @@ static inline void vcpu_set_cpuid(struct kvm_vcpu *vcpu)
vcpu_ioctl(vcpu, KVM_GET_CPUID2, vcpu->cpuid);
}

-void vcpu_set_cpuid_maxphyaddr(struct kvm_vcpu *vcpu, uint8_t maxphyaddr);
+void vcpu_set_cpuid_property(struct kvm_vcpu *vcpu,
+ struct kvm_x86_cpu_property property,
+ uint32_t value);

void vcpu_clear_cpuid_entry(struct kvm_vcpu *vcpu, uint32_t function);
void vcpu_set_or_clear_cpuid_feature(struct kvm_vcpu *vcpu,
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index d8288374078e..67eb82a6c754 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -752,12 +752,21 @@ void vcpu_init_cpuid(struct kvm_vcpu *vcpu, const struct kvm_cpuid2 *cpuid)
vcpu_set_cpuid(vcpu);
}

-void vcpu_set_cpuid_maxphyaddr(struct kvm_vcpu *vcpu, uint8_t maxphyaddr)
+void vcpu_set_cpuid_property(struct kvm_vcpu *vcpu,
+ struct kvm_x86_cpu_property property,
+ uint32_t value)
{
- struct kvm_cpuid_entry2 *entry = vcpu_get_cpuid_entry(vcpu, 0x80000008);
+ struct kvm_cpuid_entry2 *entry;
+
+ entry = __vcpu_get_cpuid_entry(vcpu, property.function, property.index);
+
+ (&entry->eax)[property.reg] &= ~GENMASK(property.hi_bit, property.lo_bit);
+ (&entry->eax)[property.reg] |= value << property.lo_bit;

- entry->eax = (entry->eax & ~0xff) | maxphyaddr;
vcpu_set_cpuid(vcpu);
+
+ /* Sanity check that @value doesn't exceed the bounds in any way. */
+ TEST_ASSERT_EQ(kvm_cpuid_property(vcpu->cpuid, property), value);
}

void vcpu_clear_cpuid_entry(struct kvm_vcpu *vcpu, uint32_t function)
diff --git a/tools/testing/selftests/kvm/x86_64/smaller_maxphyaddr_emulation_test.c b/tools/testing/selftests/kvm/x86_64/smaller_maxphyaddr_emulation_test.c
index 06edf00a97d6..9b89440dff19 100644
--- a/tools/testing/selftests/kvm/x86_64/smaller_maxphyaddr_emulation_test.c
+++ b/tools/testing/selftests/kvm/x86_64/smaller_maxphyaddr_emulation_test.c
@@ -63,7 +63,7 @@ int main(int argc, char *argv[])
vm_init_descriptor_tables(vm);
vcpu_init_descriptor_tables(vcpu);

- vcpu_set_cpuid_maxphyaddr(vcpu, MAXPHYADDR);
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_MAX_PHY_ADDR, MAXPHYADDR);

rc = kvm_check_cap(KVM_CAP_EXIT_ON_EMULATION_FAILURE);
TEST_ASSERT(rc, "KVM_CAP_EXIT_ON_EMULATION_FAILURE is unavailable");
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:08:23

by Sean Christopherson

Subject: [PATCH v10 14/29] KVM: selftests: Extend {kvm,this}_pmu_has() to support fixed counters

Extend the kvm_x86_pmu_feature framework to allow querying for fixed
counters via {kvm,this}_pmu_has(). Like architectural events, checking
for a fixed counter annoyingly requires checking multiple CPUID fields, as
a fixed counter exists if:

FxCtr[i]_is_supported := ECX[i] || (EDX[4:0] > i);

Note, KVM currently doesn't actually support exposing fixed counters via
the bitmask, but that will hopefully change sooner than later, and Intel's
SDM explicitly "recommends" checking both the number of counters and the
mask.

Rename the intermediate "anti_feature" field to simply 'f' since the fixed
counter bitmask (thankfully) doesn't have reversed polarity like the
architectural events bitmask.

Note, ideally the helpers would use BUILD_BUG_ON() to assert on the
incoming register, but the expected usage in PMU tests can't guarantee the
inputs are compile-time constants.

Opportunistically define macros for all of the known architectural events
and fixed counters.
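
Restated as a sketch (ecx/edx are the outputs of CPUID.0xA, 'i' is the fixed
counter index, mirroring what {kvm,this}_pmu_has() do below):

  /* FxCtr[i] is supported if its ECX bit is set OR EDX[4:0] > i. */
  bool fixed_ctr_supported = (ecx & BIT(i)) || ((edx & 0x1f) > i);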

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/include/x86_64/processor.h | 65 ++++++++++++++-----
1 file changed, 47 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 4f737d3b893c..92d4f8ecc730 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -282,24 +282,41 @@ struct kvm_x86_cpu_property {
* that indicates the feature is _not_ supported, and a property that states
* the length of the bit mask of unsupported features. A feature is supported
* if the size of the bit mask is larger than the "unavailable" bit, and said
- * bit is not set.
+ * bit is not set. Fixed counters also bizarre enumeration, but inverted from
+ * arch events for general purpose counters. Fixed counters are supported if a
+ * feature flag is set **OR** the total number of fixed counters is greater
+ * than index of the counter.
*
- * Wrap the "unavailable" feature to simplify checking whether or not a given
- * architectural event is supported.
+ * Wrap the events for general purpose and fixed counters to simplify checking
+ * whether or not a given architectural event is supported.
*/
struct kvm_x86_pmu_feature {
- struct kvm_x86_cpu_feature anti_feature;
+ struct kvm_x86_cpu_feature f;
};
-#define KVM_X86_PMU_FEATURE(__bit) \
-({ \
- struct kvm_x86_pmu_feature feature = { \
- .anti_feature = KVM_X86_CPU_FEATURE(0xa, 0, EBX, __bit), \
- }; \
- \
- feature; \
+#define KVM_X86_PMU_FEATURE(__reg, __bit) \
+({ \
+ struct kvm_x86_pmu_feature feature = { \
+ .f = KVM_X86_CPU_FEATURE(0xa, 0, __reg, __bit), \
+ }; \
+ \
+ kvm_static_assert(KVM_CPUID_##__reg == KVM_CPUID_EBX || \
+ KVM_CPUID_##__reg == KVM_CPUID_ECX); \
+ feature; \
})

-#define X86_PMU_FEATURE_BRANCH_INSNS_RETIRED KVM_X86_PMU_FEATURE(5)
+#define X86_PMU_FEATURE_CPU_CYCLES KVM_X86_PMU_FEATURE(EBX, 0)
+#define X86_PMU_FEATURE_INSNS_RETIRED KVM_X86_PMU_FEATURE(EBX, 1)
+#define X86_PMU_FEATURE_REFERENCE_CYCLES KVM_X86_PMU_FEATURE(EBX, 2)
+#define X86_PMU_FEATURE_LLC_REFERENCES KVM_X86_PMU_FEATURE(EBX, 3)
+#define X86_PMU_FEATURE_LLC_MISSES KVM_X86_PMU_FEATURE(EBX, 4)
+#define X86_PMU_FEATURE_BRANCH_INSNS_RETIRED KVM_X86_PMU_FEATURE(EBX, 5)
+#define X86_PMU_FEATURE_BRANCHES_MISPREDICTED KVM_X86_PMU_FEATURE(EBX, 6)
+#define X86_PMU_FEATURE_TOPDOWN_SLOTS KVM_X86_PMU_FEATURE(EBX, 7)
+
+#define X86_PMU_FEATURE_INSNS_RETIRED_FIXED KVM_X86_PMU_FEATURE(ECX, 0)
+#define X86_PMU_FEATURE_CPU_CYCLES_FIXED KVM_X86_PMU_FEATURE(ECX, 1)
+#define X86_PMU_FEATURE_REFERENCE_TSC_CYCLES_FIXED KVM_X86_PMU_FEATURE(ECX, 2)
+#define X86_PMU_FEATURE_TOPDOWN_SLOTS_FIXED KVM_X86_PMU_FEATURE(ECX, 3)

static inline unsigned int x86_family(unsigned int eax)
{
@@ -698,10 +715,16 @@ static __always_inline bool this_cpu_has_p(struct kvm_x86_cpu_property property)

static inline bool this_pmu_has(struct kvm_x86_pmu_feature feature)
{
- uint32_t nr_bits = this_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ uint32_t nr_bits;

- return nr_bits > feature.anti_feature.bit &&
- !this_cpu_has(feature.anti_feature);
+ if (feature.f.reg == KVM_CPUID_EBX) {
+ nr_bits = this_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ return nr_bits > feature.f.bit && !this_cpu_has(feature.f);
+ }
+
+ GUEST_ASSERT(feature.f.reg == KVM_CPUID_ECX);
+ nr_bits = this_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
+ return nr_bits > feature.f.bit || this_cpu_has(feature.f);
}

static __always_inline uint64_t this_cpu_supported_xcr0(void)
@@ -917,10 +940,16 @@ static __always_inline bool kvm_cpu_has_p(struct kvm_x86_cpu_property property)

static inline bool kvm_pmu_has(struct kvm_x86_pmu_feature feature)
{
- uint32_t nr_bits = kvm_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ uint32_t nr_bits;

- return nr_bits > feature.anti_feature.bit &&
- !kvm_cpu_has(feature.anti_feature);
+ if (feature.f.reg == KVM_CPUID_EBX) {
+ nr_bits = kvm_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ return nr_bits > feature.f.bit && !kvm_cpu_has(feature.f);
+ }
+
+ TEST_ASSERT_EQ(feature.f.reg, KVM_CPUID_ECX);
+ nr_bits = kvm_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
+ return nr_bits > feature.f.bit || kvm_cpu_has(feature.f);
}

static __always_inline uint64_t kvm_cpu_supported_xcr0(void)
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:08:54

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 15/29] KVM: selftests: Add pmu.h and lib/pmu.c for common PMU assets

From: Jinrong Liang <[email protected]>

Add a PMU library for x86 selftests to help eliminate open-coded event
encodings, and to reduce the amount of copy+paste between PMU selftests.

Use the new common macro definitions in the existing PMU event filter test.
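
For reference, a minimal standalone sketch of the encoding performed by the
new RAW_EVENT() macro (the raw_event() helper and the example values below
are purely illustrative, not part of the series):

    #include <stdint.h>
    #include <stdio.h>

    /* Illustrative mirror of RAW_EVENT(): umask in bits 15:8, eventsel split. */
    static uint64_t raw_event(uint64_t eventsel, uint64_t umask)
    {
        return ((eventsel & 0xf00) << 24) | (eventsel & 0xff) | ((umask & 0xff) << 8);
    }

    int main(void)
    {
        /* Intel's "LLC References" arch event: eventsel 0x2e, umask 0x4f. */
        printf("LLC references encoding: 0x%llx\n",
               (unsigned long long)raw_event(0x2e, 0x4f));  /* prints 0x4f2e */
        return 0;
    }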

Cc: Aaron Lewis <[email protected]>
Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/Makefile | 1 +
tools/testing/selftests/kvm/include/pmu.h | 97 ++++++++++++
tools/testing/selftests/kvm/lib/pmu.c | 31 ++++
.../kvm/x86_64/pmu_event_filter_test.c | 141 ++++++------------
4 files changed, 173 insertions(+), 97 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/pmu.h
create mode 100644 tools/testing/selftests/kvm/lib/pmu.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 492e937fab00..479bd85e1c56 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -23,6 +23,7 @@ LIBKVM += lib/guest_modes.c
LIBKVM += lib/io.c
LIBKVM += lib/kvm_util.c
LIBKVM += lib/memstress.c
+LIBKVM += lib/pmu.c
LIBKVM += lib/guest_sprintf.c
LIBKVM += lib/rbtree.c
LIBKVM += lib/sparsebit.c
diff --git a/tools/testing/selftests/kvm/include/pmu.h b/tools/testing/selftests/kvm/include/pmu.h
new file mode 100644
index 000000000000..3c10c4dc0ae8
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/pmu.h
@@ -0,0 +1,97 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023, Tencent, Inc.
+ */
+#ifndef SELFTEST_KVM_PMU_H
+#define SELFTEST_KVM_PMU_H
+
+#include <stdint.h>
+
+#define KVM_PMU_EVENT_FILTER_MAX_EVENTS 300
+
+/*
+ * Encode an eventsel+umask pair into event-select MSR format. Note, this is
+ * technically AMD's format, as Intel's format only supports 8 bits for the
+ * event selector, i.e. doesn't use bits 24:16 for the selector. But, OR-ing
+ * in '0' is a nop and won't clobber the CMASK.
+ */
+#define RAW_EVENT(eventsel, umask) ((((eventsel) & 0xf00UL) << 24) | \
+ ((eventsel) & 0xff) | \
+ (((umask) & 0xff) << 8))
+
+/*
+ * These are technically Intel's definitions, but except for CMASK (see above),
+ * AMD's layout is compatible with Intel's.
+ */
+#define ARCH_PERFMON_EVENTSEL_EVENT GENMASK_ULL(7, 0)
+#define ARCH_PERFMON_EVENTSEL_UMASK GENMASK_ULL(15, 8)
+#define ARCH_PERFMON_EVENTSEL_USR BIT_ULL(16)
+#define ARCH_PERFMON_EVENTSEL_OS BIT_ULL(17)
+#define ARCH_PERFMON_EVENTSEL_EDGE BIT_ULL(18)
+#define ARCH_PERFMON_EVENTSEL_PIN_CONTROL BIT_ULL(19)
+#define ARCH_PERFMON_EVENTSEL_INT BIT_ULL(20)
+#define ARCH_PERFMON_EVENTSEL_ANY BIT_ULL(21)
+#define ARCH_PERFMON_EVENTSEL_ENABLE BIT_ULL(22)
+#define ARCH_PERFMON_EVENTSEL_INV BIT_ULL(23)
+#define ARCH_PERFMON_EVENTSEL_CMASK GENMASK_ULL(31, 24)
+
+/* RDPMC control flags, Intel only. */
+#define INTEL_RDPMC_METRICS BIT_ULL(29)
+#define INTEL_RDPMC_FIXED BIT_ULL(30)
+#define INTEL_RDPMC_FAST BIT_ULL(31)
+
+/* Fixed PMC controls, Intel only. */
+#define FIXED_PMC_GLOBAL_CTRL_ENABLE(_idx) BIT_ULL((32 + (_idx)))
+
+#define FIXED_PMC_KERNEL BIT_ULL(0)
+#define FIXED_PMC_USER BIT_ULL(1)
+#define FIXED_PMC_ANYTHREAD BIT_ULL(2)
+#define FIXED_PMC_ENABLE_PMI BIT_ULL(3)
+#define FIXED_PMC_NR_BITS 4
+#define FIXED_PMC_CTRL(_idx, _val) ((_val) << ((_idx) * FIXED_PMC_NR_BITS))
+
+#define PMU_CAP_FW_WRITES BIT_ULL(13)
+#define PMU_CAP_LBR_FMT 0x3f
+
+#define INTEL_ARCH_CPU_CYCLES RAW_EVENT(0x3c, 0x00)
+#define INTEL_ARCH_INSTRUCTIONS_RETIRED RAW_EVENT(0xc0, 0x00)
+#define INTEL_ARCH_REFERENCE_CYCLES RAW_EVENT(0x3c, 0x01)
+#define INTEL_ARCH_LLC_REFERENCES RAW_EVENT(0x2e, 0x4f)
+#define INTEL_ARCH_LLC_MISSES RAW_EVENT(0x2e, 0x41)
+#define INTEL_ARCH_BRANCHES_RETIRED RAW_EVENT(0xc4, 0x00)
+#define INTEL_ARCH_BRANCHES_MISPREDICTED RAW_EVENT(0xc5, 0x00)
+#define INTEL_ARCH_TOPDOWN_SLOTS RAW_EVENT(0xa4, 0x01)
+
+#define AMD_ZEN_CORE_CYCLES RAW_EVENT(0x76, 0x00)
+#define AMD_ZEN_INSTRUCTIONS_RETIRED RAW_EVENT(0xc0, 0x00)
+#define AMD_ZEN_BRANCHES_RETIRED RAW_EVENT(0xc2, 0x00)
+#define AMD_ZEN_BRANCHES_MISPREDICTED RAW_EVENT(0xc3, 0x00)
+
+/*
+ * Note! The order and thus the index of the architectural events matter as
+ * support for each event is enumerated via CPUID using the index of the event.
+ */
+enum intel_pmu_architectural_events {
+ INTEL_ARCH_CPU_CYCLES_INDEX,
+ INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX,
+ INTEL_ARCH_REFERENCE_CYCLES_INDEX,
+ INTEL_ARCH_LLC_REFERENCES_INDEX,
+ INTEL_ARCH_LLC_MISSES_INDEX,
+ INTEL_ARCH_BRANCHES_RETIRED_INDEX,
+ INTEL_ARCH_BRANCHES_MISPREDICTED_INDEX,
+ INTEL_ARCH_TOPDOWN_SLOTS_INDEX,
+ NR_INTEL_ARCH_EVENTS,
+};
+
+enum amd_pmu_zen_events {
+ AMD_ZEN_CORE_CYCLES_INDEX,
+ AMD_ZEN_INSTRUCTIONS_INDEX,
+ AMD_ZEN_BRANCHES_INDEX,
+ AMD_ZEN_BRANCH_MISSES_INDEX,
+ NR_AMD_ZEN_EVENTS,
+};
+
+extern const uint64_t intel_pmu_arch_events[];
+extern const uint64_t amd_pmu_zen_events[];
+
+#endif /* SELFTEST_KVM_PMU_H */
diff --git a/tools/testing/selftests/kvm/lib/pmu.c b/tools/testing/selftests/kvm/lib/pmu.c
new file mode 100644
index 000000000000..f31f0427c17c
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/pmu.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023, Tencent, Inc.
+ */
+
+#include <stdint.h>
+
+#include <linux/kernel.h>
+
+#include "kvm_util.h"
+#include "pmu.h"
+
+const uint64_t intel_pmu_arch_events[] = {
+ INTEL_ARCH_CPU_CYCLES,
+ INTEL_ARCH_INSTRUCTIONS_RETIRED,
+ INTEL_ARCH_REFERENCE_CYCLES,
+ INTEL_ARCH_LLC_REFERENCES,
+ INTEL_ARCH_LLC_MISSES,
+ INTEL_ARCH_BRANCHES_RETIRED,
+ INTEL_ARCH_BRANCHES_MISPREDICTED,
+ INTEL_ARCH_TOPDOWN_SLOTS,
+};
+kvm_static_assert(ARRAY_SIZE(intel_pmu_arch_events) == NR_INTEL_ARCH_EVENTS);
+
+const uint64_t amd_pmu_zen_events[] = {
+ AMD_ZEN_CORE_CYCLES,
+ AMD_ZEN_INSTRUCTIONS_RETIRED,
+ AMD_ZEN_BRANCHES_RETIRED,
+ AMD_ZEN_BRANCHES_MISPREDICTED,
+};
+kvm_static_assert(ARRAY_SIZE(amd_pmu_zen_events) == NR_AMD_ZEN_EVENTS);
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
index 283cc55597a4..7ec9fbed92e0 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
@@ -11,72 +11,18 @@
*/

#define _GNU_SOURCE /* for program_invocation_short_name */
-#include "test_util.h"
+
#include "kvm_util.h"
+#include "pmu.h"
#include "processor.h"
-
-/*
- * In lieu of copying perf_event.h into tools...
- */
-#define ARCH_PERFMON_EVENTSEL_OS (1ULL << 17)
-#define ARCH_PERFMON_EVENTSEL_ENABLE (1ULL << 22)
-
-/* End of stuff taken from perf_event.h. */
-
-/* Oddly, this isn't in perf_event.h. */
-#define ARCH_PERFMON_BRANCHES_RETIRED 5
+#include "test_util.h"

#define NUM_BRANCHES 42
-#define INTEL_PMC_IDX_FIXED 32
-
-/* Matches KVM_PMU_EVENT_FILTER_MAX_EVENTS in pmu.c */
-#define MAX_FILTER_EVENTS 300
#define MAX_TEST_EVENTS 10

#define PMU_EVENT_FILTER_INVALID_ACTION (KVM_PMU_EVENT_DENY + 1)
#define PMU_EVENT_FILTER_INVALID_FLAGS (KVM_PMU_EVENT_FLAGS_VALID_MASK << 1)
-#define PMU_EVENT_FILTER_INVALID_NEVENTS (MAX_FILTER_EVENTS + 1)
-
-/*
- * This is how the event selector and unit mask are stored in an AMD
- * core performance event-select register. Intel's format is similar,
- * but the event selector is only 8 bits.
- */
-#define EVENT(select, umask) ((select & 0xf00UL) << 24 | (select & 0xff) | \
- (umask & 0xff) << 8)
-
-/*
- * "Branch instructions retired", from the Intel SDM, volume 3,
- * "Pre-defined Architectural Performance Events."
- */
-
-#define INTEL_BR_RETIRED EVENT(0xc4, 0)
-
-/*
- * "Retired branch instructions", from Processor Programming Reference
- * (PPR) for AMD Family 17h Model 01h, Revision B1 Processors,
- * Preliminary Processor Programming Reference (PPR) for AMD Family
- * 17h Model 31h, Revision B0 Processors, and Preliminary Processor
- * Programming Reference (PPR) for AMD Family 19h Model 01h, Revision
- * B1 Processors Volume 1 of 2.
- */
-
-#define AMD_ZEN_BR_RETIRED EVENT(0xc2, 0)
-
-
-/*
- * "Retired instructions", from Processor Programming Reference
- * (PPR) for AMD Family 17h Model 01h, Revision B1 Processors,
- * Preliminary Processor Programming Reference (PPR) for AMD Family
- * 17h Model 31h, Revision B0 Processors, and Preliminary Processor
- * Programming Reference (PPR) for AMD Family 19h Model 01h, Revision
- * B1 Processors Volume 1 of 2.
- * --- and ---
- * "Instructions retired", from the Intel SDM, volume 3,
- * "Pre-defined Architectural Performance Events."
- */
-
-#define INST_RETIRED EVENT(0xc0, 0)
+#define PMU_EVENT_FILTER_INVALID_NEVENTS (KVM_PMU_EVENT_FILTER_MAX_EVENTS + 1)

struct __kvm_pmu_event_filter {
__u32 action;
@@ -84,26 +30,28 @@ struct __kvm_pmu_event_filter {
__u32 fixed_counter_bitmap;
__u32 flags;
__u32 pad[4];
- __u64 events[MAX_FILTER_EVENTS];
+ __u64 events[KVM_PMU_EVENT_FILTER_MAX_EVENTS];
};

/*
- * This event list comprises Intel's eight architectural events plus
- * AMD's "retired branch instructions" for Zen[123] (and possibly
- * other AMD CPUs).
+ * This event list comprises Intel's known architectural events, plus AMD's
+ * "retired branch instructions" for Zen1-Zen3 (and* possibly other AMD CPUs).
+ * Note, AMD and Intel use the same encoding for instructions retired.
*/
+kvm_static_assert(INTEL_ARCH_INSTRUCTIONS_RETIRED == AMD_ZEN_INSTRUCTIONS_RETIRED);
+
static const struct __kvm_pmu_event_filter base_event_filter = {
.nevents = ARRAY_SIZE(base_event_filter.events),
.events = {
- EVENT(0x3c, 0),
- INST_RETIRED,
- EVENT(0x3c, 1),
- EVENT(0x2e, 0x4f),
- EVENT(0x2e, 0x41),
- EVENT(0xc4, 0),
- EVENT(0xc5, 0),
- EVENT(0xa4, 1),
- AMD_ZEN_BR_RETIRED,
+ INTEL_ARCH_CPU_CYCLES,
+ INTEL_ARCH_INSTRUCTIONS_RETIRED,
+ INTEL_ARCH_REFERENCE_CYCLES,
+ INTEL_ARCH_LLC_REFERENCES,
+ INTEL_ARCH_LLC_MISSES,
+ INTEL_ARCH_BRANCHES_RETIRED,
+ INTEL_ARCH_BRANCHES_MISPREDICTED,
+ INTEL_ARCH_TOPDOWN_SLOTS,
+ AMD_ZEN_BRANCHES_RETIRED,
},
};

@@ -165,9 +113,9 @@ static void intel_guest_code(void)
for (;;) {
wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
wrmsr(MSR_P6_EVNTSEL0, ARCH_PERFMON_EVENTSEL_ENABLE |
- ARCH_PERFMON_EVENTSEL_OS | INTEL_BR_RETIRED);
+ ARCH_PERFMON_EVENTSEL_OS | INTEL_ARCH_BRANCHES_RETIRED);
wrmsr(MSR_P6_EVNTSEL1, ARCH_PERFMON_EVENTSEL_ENABLE |
- ARCH_PERFMON_EVENTSEL_OS | INST_RETIRED);
+ ARCH_PERFMON_EVENTSEL_OS | INTEL_ARCH_INSTRUCTIONS_RETIRED);
wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0x3);

run_and_measure_loop(MSR_IA32_PMC0);
@@ -189,9 +137,9 @@ static void amd_guest_code(void)
for (;;) {
wrmsr(MSR_K7_EVNTSEL0, 0);
wrmsr(MSR_K7_EVNTSEL0, ARCH_PERFMON_EVENTSEL_ENABLE |
- ARCH_PERFMON_EVENTSEL_OS | AMD_ZEN_BR_RETIRED);
+ ARCH_PERFMON_EVENTSEL_OS | AMD_ZEN_BRANCHES_RETIRED);
wrmsr(MSR_K7_EVNTSEL1, ARCH_PERFMON_EVENTSEL_ENABLE |
- ARCH_PERFMON_EVENTSEL_OS | INST_RETIRED);
+ ARCH_PERFMON_EVENTSEL_OS | AMD_ZEN_INSTRUCTIONS_RETIRED);

run_and_measure_loop(MSR_K7_PERFCTR0);
GUEST_SYNC(0);
@@ -312,7 +260,7 @@ static void test_amd_deny_list(struct kvm_vcpu *vcpu)
.action = KVM_PMU_EVENT_DENY,
.nevents = 1,
.events = {
- EVENT(0x1C2, 0),
+ RAW_EVENT(0x1C2, 0),
},
};

@@ -347,9 +295,9 @@ static void test_not_member_deny_list(struct kvm_vcpu *vcpu)

f.action = KVM_PMU_EVENT_DENY;

- remove_event(&f, INST_RETIRED);
- remove_event(&f, INTEL_BR_RETIRED);
- remove_event(&f, AMD_ZEN_BR_RETIRED);
+ remove_event(&f, INTEL_ARCH_INSTRUCTIONS_RETIRED);
+ remove_event(&f, INTEL_ARCH_BRANCHES_RETIRED);
+ remove_event(&f, AMD_ZEN_BRANCHES_RETIRED);
test_with_filter(vcpu, &f);

ASSERT_PMC_COUNTING_INSTRUCTIONS();
@@ -361,9 +309,9 @@ static void test_not_member_allow_list(struct kvm_vcpu *vcpu)

f.action = KVM_PMU_EVENT_ALLOW;

- remove_event(&f, INST_RETIRED);
- remove_event(&f, INTEL_BR_RETIRED);
- remove_event(&f, AMD_ZEN_BR_RETIRED);
+ remove_event(&f, INTEL_ARCH_INSTRUCTIONS_RETIRED);
+ remove_event(&f, INTEL_ARCH_BRANCHES_RETIRED);
+ remove_event(&f, AMD_ZEN_BRANCHES_RETIRED);
test_with_filter(vcpu, &f);

ASSERT_PMC_NOT_COUNTING_INSTRUCTIONS();
@@ -452,9 +400,9 @@ static bool use_amd_pmu(void)
* - Sapphire Rapids, Ice Lake, Cascade Lake, Skylake.
*/
#define MEM_INST_RETIRED 0xD0
-#define MEM_INST_RETIRED_LOAD EVENT(MEM_INST_RETIRED, 0x81)
-#define MEM_INST_RETIRED_STORE EVENT(MEM_INST_RETIRED, 0x82)
-#define MEM_INST_RETIRED_LOAD_STORE EVENT(MEM_INST_RETIRED, 0x83)
+#define MEM_INST_RETIRED_LOAD RAW_EVENT(MEM_INST_RETIRED, 0x81)
+#define MEM_INST_RETIRED_STORE RAW_EVENT(MEM_INST_RETIRED, 0x82)
+#define MEM_INST_RETIRED_LOAD_STORE RAW_EVENT(MEM_INST_RETIRED, 0x83)

static bool supports_event_mem_inst_retired(void)
{
@@ -486,9 +434,9 @@ static bool supports_event_mem_inst_retired(void)
* B1 Processors Volume 1 of 2.
*/
#define LS_DISPATCH 0x29
-#define LS_DISPATCH_LOAD EVENT(LS_DISPATCH, BIT(0))
-#define LS_DISPATCH_STORE EVENT(LS_DISPATCH, BIT(1))
-#define LS_DISPATCH_LOAD_STORE EVENT(LS_DISPATCH, BIT(2))
+#define LS_DISPATCH_LOAD RAW_EVENT(LS_DISPATCH, BIT(0))
+#define LS_DISPATCH_STORE RAW_EVENT(LS_DISPATCH, BIT(1))
+#define LS_DISPATCH_LOAD_STORE RAW_EVENT(LS_DISPATCH, BIT(2))

#define INCLUDE_MASKED_ENTRY(event_select, mask, match) \
KVM_PMU_ENCODE_MASKED_ENTRY(event_select, mask, match, false)
@@ -729,14 +677,14 @@ static void add_dummy_events(uint64_t *events, int nevents)

static void test_masked_events(struct kvm_vcpu *vcpu)
{
- int nevents = MAX_FILTER_EVENTS - MAX_TEST_EVENTS;
- uint64_t events[MAX_FILTER_EVENTS];
+ int nevents = KVM_PMU_EVENT_FILTER_MAX_EVENTS - MAX_TEST_EVENTS;
+ uint64_t events[KVM_PMU_EVENT_FILTER_MAX_EVENTS];

/* Run the test cases against a sparse PMU event filter. */
run_masked_events_tests(vcpu, events, 0);

/* Run the test cases against a dense PMU event filter. */
- add_dummy_events(events, MAX_FILTER_EVENTS);
+ add_dummy_events(events, KVM_PMU_EVENT_FILTER_MAX_EVENTS);
run_masked_events_tests(vcpu, events, nevents);
}

@@ -809,20 +757,19 @@ static void test_filter_ioctl(struct kvm_vcpu *vcpu)
TEST_ASSERT(!r, "Masking non-existent fixed counters should be allowed");
}

-static void intel_run_fixed_counter_guest_code(uint8_t fixed_ctr_idx)
+static void intel_run_fixed_counter_guest_code(uint8_t idx)
{
for (;;) {
wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
- wrmsr(MSR_CORE_PERF_FIXED_CTR0 + fixed_ctr_idx, 0);
+ wrmsr(MSR_CORE_PERF_FIXED_CTR0 + idx, 0);

/* Only OS_EN bit is enabled for fixed counter[idx]. */
- wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, BIT_ULL(4 * fixed_ctr_idx));
- wrmsr(MSR_CORE_PERF_GLOBAL_CTRL,
- BIT_ULL(INTEL_PMC_IDX_FIXED + fixed_ctr_idx));
+ wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, FIXED_PMC_CTRL(idx, FIXED_PMC_KERNEL));
+ wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, FIXED_PMC_GLOBAL_CTRL_ENABLE(idx));
__asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);

- GUEST_SYNC(rdmsr(MSR_CORE_PERF_FIXED_CTR0 + fixed_ctr_idx));
+ GUEST_SYNC(rdmsr(MSR_CORE_PERF_FIXED_CTR0 + idx));
}
}

--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:09:03

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 16/29] KVM: selftests: Test Intel PMU architectural events on gp counters

From: Jinrong Liang <[email protected]>

Add test cases to verify that Intel's Architectural PMU events work as
expected when they are available according to guest CPUID. Iterate over a
range of sane PMU versions, with and without full-width writes enabled,
and over interesting combinations of lengths/masks for the bit vector that
enumerates unavailable events.

Test up to vPMU version 5, i.e. the current architectural max. KVM only
officially supports up to version 2, but the behavior of the counters is
backwards compatible, i.e. KVM shouldn't do something completely different
for a higher, architecturally-defined vPMU version. Verify KVM behavior
against the effective vPMU version, e.g. advertising vPMU 5 when KVM only
supports vPMU 2 shouldn't magically unlock vPMU 5 features.

According to Intel SDM, the number of architectural events is reported
through CPUID.0AH:EAX[31:24] and the architectural event x is supported
if EBX[x]=0 && EAX[31:24]>x.

Handcode the entirety of the measured section so that the test can
precisely assert on the number of instructions and branches retired.
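
For readers following along, a rough standalone illustration of the CPUID
rule above, using the compiler's <cpuid.h>; the helper name is made up for
this sketch and is not part of the selftest:

    #include <cpuid.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Arch event 'x' is supported if CPUID.0xA.EAX[31:24] > x && EBX[x] == 0. */
    static bool arch_event_supported(unsigned int x)
    {
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid_count(0xa, 0, &eax, &ebx, &ecx, &edx))
            return false;

        return ((eax >> 24) & 0xff) > x && !((ebx >> x) & 1);
    }

    int main(void)
    {
        /* Index 5 == branch instructions retired. */
        printf("branches retired supported: %d\n", arch_event_supported(5));
        return 0;
    }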

Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/x86_64/pmu_counters_test.c | 321 ++++++++++++++++++
2 files changed, 322 insertions(+)
create mode 100644 tools/testing/selftests/kvm/x86_64/pmu_counters_test.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 479bd85e1c56..ab96fc80bfbd 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -81,6 +81,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test
TEST_GEN_PROGS_x86_64 += x86_64/monitor_mwait_test
TEST_GEN_PROGS_x86_64 += x86_64/nested_exceptions_test
TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
+TEST_GEN_PROGS_x86_64 += x86_64/pmu_counters_test
TEST_GEN_PROGS_x86_64 += x86_64/pmu_event_filter_test
TEST_GEN_PROGS_x86_64 += x86_64/private_mem_conversions_test
TEST_GEN_PROGS_x86_64 += x86_64/private_mem_kvm_exits_test
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
new file mode 100644
index 000000000000..5b8687bb4639
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -0,0 +1,321 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023, Tencent, Inc.
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <x86intrin.h>
+
+#include "pmu.h"
+#include "processor.h"
+
+/* Number of LOOP instructions for the guest measurement payload. */
+#define NUM_BRANCHES 10
+/*
+ * Number of "extra" instructions that will be counted, i.e. the number of
+ * instructions that are needed to set up the loop and then disable the
+ * counter. 2 MOV, 2 XOR, 1 WRMSR.
+ */
+#define NUM_EXTRA_INSNS 5
+#define NUM_INSNS_RETIRED (NUM_BRANCHES + NUM_EXTRA_INSNS)
+
+static uint8_t kvm_pmu_version;
+static bool kvm_has_perf_caps;
+
+static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
+ void *guest_code,
+ uint8_t pmu_version,
+ uint64_t perf_capabilities)
+{
+ struct kvm_vm *vm;
+
+ vm = vm_create_with_one_vcpu(vcpu, guest_code);
+ vm_init_descriptor_tables(vm);
+ vcpu_init_descriptor_tables(*vcpu);
+
+ sync_global_to_guest(vm, kvm_pmu_version);
+
+ /*
+ * Set PERF_CAPABILITIES before PMU version as KVM disallows enabling
+ * features via PERF_CAPABILITIES if the guest doesn't have a vPMU.
+ */
+ if (kvm_has_perf_caps)
+ vcpu_set_msr(*vcpu, MSR_IA32_PERF_CAPABILITIES, perf_capabilities);
+
+ vcpu_set_cpuid_property(*vcpu, X86_PROPERTY_PMU_VERSION, pmu_version);
+ return vm;
+}
+
+static void run_vcpu(struct kvm_vcpu *vcpu)
+{
+ struct ucall uc;
+
+ do {
+ vcpu_run(vcpu);
+ switch (get_ucall(vcpu, &uc)) {
+ case UCALL_SYNC:
+ break;
+ case UCALL_ABORT:
+ REPORT_GUEST_ASSERT(uc);
+ break;
+ case UCALL_PRINTF:
+ pr_info("%s", uc.buffer);
+ break;
+ case UCALL_DONE:
+ break;
+ default:
+ TEST_FAIL("Unexpected ucall: %lu", uc.cmd);
+ }
+ } while (uc.cmd != UCALL_DONE);
+}
+
+static uint8_t guest_get_pmu_version(void)
+{
+ /*
+ * Return the effective PMU version, i.e. the minimum between what KVM
+ * supports and what is enumerated to the guest. The host deliberately
+ * advertises a PMU version to the guest beyond what is actually
+ * supported by KVM to verify KVM doesn't freak out and do something
+ * bizarre with an architecturally valid, but unsupported, version.
+ */
+ return min_t(uint8_t, kvm_pmu_version, this_cpu_property(X86_PROPERTY_PMU_VERSION));
+}
+
+/*
+ * If an architectural event is supported and guaranteed to generate at least
+ * one "hit, assert that its count is non-zero. If an event isn't supported or
+ * the test can't guarantee the associated action will occur, then all bets are
+ * off regarding the count, i.e. no checks can be done.
+ *
+ * Sanity check that in all cases, the event doesn't count when it's disabled,
+ * and that KVM correctly emulates the write of an arbitrary value.
+ */
+static void guest_assert_event_count(uint8_t idx,
+ struct kvm_x86_pmu_feature event,
+ uint32_t pmc, uint32_t pmc_msr)
+{
+ uint64_t count;
+
+ count = _rdpmc(pmc);
+ if (!this_pmu_has(event))
+ goto sanity_checks;
+
+ switch (idx) {
+ case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
+ GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
+ break;
+ case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
+ GUEST_ASSERT_EQ(count, NUM_BRANCHES);
+ break;
+ case INTEL_ARCH_CPU_CYCLES_INDEX:
+ case INTEL_ARCH_REFERENCE_CYCLES_INDEX:
+ GUEST_ASSERT_NE(count, 0);
+ break;
+ default:
+ break;
+ }
+
+sanity_checks:
+ __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
+ GUEST_ASSERT_EQ(_rdpmc(pmc), count);
+
+ wrmsr(pmc_msr, 0xdead);
+ GUEST_ASSERT_EQ(_rdpmc(pmc), 0xdead);
+}
+
+static void __guest_test_arch_event(uint8_t idx, struct kvm_x86_pmu_feature event,
+ uint32_t pmc, uint32_t pmc_msr,
+ uint32_t ctrl_msr, uint64_t ctrl_msr_value)
+{
+ wrmsr(pmc_msr, 0);
+
+ /*
+ * Enable and disable the PMC in a monolithic asm blob to ensure that
+ * the compiler can't insert _any_ code into the measured sequence.
+ * Note, ECX doesn't need to be clobbered as the input value, @ctrl_msr,
+ * is restored before the end of the sequence.
+ */
+ __asm__ __volatile__("wrmsr\n\t"
+ "mov $" __stringify(NUM_BRANCHES) ", %%ecx\n\t"
+ "loop .\n\t"
+ "mov %%edi, %%ecx\n\t"
+ "xor %%eax, %%eax\n\t"
+ "xor %%edx, %%edx\n\t"
+ "wrmsr\n\t"
+ :: "a"((uint32_t)ctrl_msr_value),
+ "d"(ctrl_msr_value >> 32),
+ "c"(ctrl_msr), "D"(ctrl_msr)
+ );
+
+ guest_assert_event_count(idx, event, pmc, pmc_msr);
+}
+
+static void guest_test_arch_event(uint8_t idx)
+{
+ const struct {
+ struct kvm_x86_pmu_feature gp_event;
+ } intel_event_to_feature[] = {
+ [INTEL_ARCH_CPU_CYCLES_INDEX] = { X86_PMU_FEATURE_CPU_CYCLES },
+ [INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX] = { X86_PMU_FEATURE_INSNS_RETIRED },
+ [INTEL_ARCH_REFERENCE_CYCLES_INDEX] = { X86_PMU_FEATURE_REFERENCE_CYCLES },
+ [INTEL_ARCH_LLC_REFERENCES_INDEX] = { X86_PMU_FEATURE_LLC_REFERENCES },
+ [INTEL_ARCH_LLC_MISSES_INDEX] = { X86_PMU_FEATURE_LLC_MISSES },
+ [INTEL_ARCH_BRANCHES_RETIRED_INDEX] = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED },
+ [INTEL_ARCH_BRANCHES_MISPREDICTED_INDEX] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED },
+ [INTEL_ARCH_TOPDOWN_SLOTS_INDEX] = { X86_PMU_FEATURE_TOPDOWN_SLOTS },
+ };
+
+ uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
+ uint32_t pmu_version = guest_get_pmu_version();
+ /* PERF_GLOBAL_CTRL exists only for Architectural PMU Version 2+. */
+ bool guest_has_perf_global_ctrl = pmu_version >= 2;
+ struct kvm_x86_pmu_feature gp_event;
+ uint32_t base_pmc_msr;
+ unsigned int i;
+
+ /* The host side shouldn't invoke this without a guest PMU. */
+ GUEST_ASSERT(pmu_version);
+
+ if (this_cpu_has(X86_FEATURE_PDCM) &&
+ rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
+ base_pmc_msr = MSR_IA32_PMC0;
+ else
+ base_pmc_msr = MSR_IA32_PERFCTR0;
+
+ gp_event = intel_event_to_feature[idx].gp_event;
+ GUEST_ASSERT_EQ(idx, gp_event.f.bit);
+
+ GUEST_ASSERT(nr_gp_counters);
+
+ for (i = 0; i < nr_gp_counters; i++) {
+ uint64_t eventsel = ARCH_PERFMON_EVENTSEL_OS |
+ ARCH_PERFMON_EVENTSEL_ENABLE |
+ intel_pmu_arch_events[idx];
+
+ wrmsr(MSR_P6_EVNTSEL0 + i, 0);
+ if (guest_has_perf_global_ctrl)
+ wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(i));
+
+ __guest_test_arch_event(idx, gp_event, i, base_pmc_msr + i,
+ MSR_P6_EVNTSEL0 + i, eventsel);
+ }
+}
+
+static void guest_test_arch_events(void)
+{
+ uint8_t i;
+
+ for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++)
+ guest_test_arch_event(i);
+
+ GUEST_DONE();
+}
+
+static void test_arch_events(uint8_t pmu_version, uint64_t perf_capabilities,
+ uint8_t length, uint8_t unavailable_mask)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+
+ /* Testing arch events requires a vPMU (there are no negative tests). */
+ if (!pmu_version)
+ return;
+
+ vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_arch_events,
+ pmu_version, perf_capabilities);
+
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH,
+ length);
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
+ unavailable_mask);
+
+ run_vcpu(vcpu);
+
+ kvm_vm_free(vm);
+}
+
+static void test_intel_counters(void)
+{
+ uint8_t nr_arch_events = kvm_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ uint8_t pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
+ unsigned int i;
+ uint8_t v, j;
+ uint32_t k;
+
+ const uint64_t perf_caps[] = {
+ 0,
+ PMU_CAP_FW_WRITES,
+ };
+
+ /*
+ * Test up to PMU v5, which is the current maximum version defined by
+ * Intel, i.e. is the last version that is guaranteed to be backwards
+ * compatible with KVM's existing behavior.
+ */
+ uint8_t max_pmu_version = max_t(typeof(pmu_version), pmu_version, 5);
+
+ /*
+ * Detect the existence of events that aren't supported by selftests.
+ * This will (obviously) fail any time the kernel adds support for a
+ * new event, but it's worth paying that price to keep the test fresh.
+ */
+ TEST_ASSERT(nr_arch_events <= NR_INTEL_ARCH_EVENTS,
+ "New architectural event(s) detected; please update this test (length = %u, mask = %x)",
+ nr_arch_events, kvm_cpu_property(X86_PROPERTY_PMU_EVENTS_MASK));
+
+ /*
+ * Force iterating over known arch events regardless of whether or not
+ * KVM/hardware supports a given event.
+ */
+ nr_arch_events = max_t(typeof(nr_arch_events), nr_arch_events, NR_INTEL_ARCH_EVENTS);
+
+ for (v = 0; v <= max_pmu_version; v++) {
+ for (i = 0; i < ARRAY_SIZE(perf_caps); i++) {
+ if (!kvm_has_perf_caps && perf_caps[i])
+ continue;
+
+ pr_info("Testing arch events, PMU version %u, perf_caps = %lx\n",
+ v, perf_caps[i]);
+ /*
+ * To keep the total runtime reasonable, test every
+ * possible non-zero, non-reserved bitmap combination
+ * only with the native PMU version and the full bit
+ * vector length.
+ */
+ if (v == pmu_version) {
+ for (k = 1; k < (BIT(nr_arch_events) - 1); k++)
+ test_arch_events(v, perf_caps[i], nr_arch_events, k);
+ }
+ /*
+ * Test single bits for all PMU versions and lengths up
+ * to the number of events + 1 (to verify KVM doesn't do
+ * weird things if the guest length is greater than the
+ * host length). Explicitly test a mask of '0' and all
+ * ones, i.e. all events being available and unavailable.
+ */
+ for (j = 0; j <= nr_arch_events + 1; j++) {
+ test_arch_events(v, perf_caps[i], j, 0);
+ test_arch_events(v, perf_caps[i], j, 0xff);
+
+ for (k = 0; k < nr_arch_events; k++)
+ test_arch_events(v, perf_caps[i], j, BIT(k));
+ }
+ }
+ }
+}
+
+int main(int argc, char *argv[])
+{
+ TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
+
+ TEST_REQUIRE(host_cpu_is_intel);
+ TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
+ TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
+
+ kvm_pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
+ kvm_has_perf_caps = kvm_cpu_has(X86_FEATURE_PDCM);
+
+ test_intel_counters();
+
+ return 0;
+}
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:09:21

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 17/29] KVM: selftests: Test Intel PMU architectural events on fixed counters

From: Jinrong Liang <[email protected]>

Extend the PMU counters test to validate architectural events using fixed
counters. The core logic is largely the same, the biggest difference
being that if a fixed counter exists, its associated event is available
(the SDM doesn't explicitly state this to be true, but it's KVM's ABI and
letting software program a fixed counter that doesn't actually count would
be quite bizarre).

Note, fixed counters rely on PERF_GLOBAL_CTRL.
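
For context, a minimal guest-side sketch of programming a fixed counter,
assuming the selftest helpers (wrmsr/rdmsr, GUEST_SYNC) and the pmu.h macros
added earlier in the series; the function name and loop count are made up:

    static void guest_count_insns_on_fixed_ctr0(void)
    {
        wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
        wrmsr(MSR_CORE_PERF_FIXED_CTR0, 0);

        /* Enable ring-0 counting for fixed counter 0, then the global enable bit. */
        wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, FIXED_PMC_CTRL(0, FIXED_PMC_KERNEL));
        wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, FIXED_PMC_GLOBAL_CTRL_ENABLE(0));

        __asm__ __volatile__("loop ." : "+c"((int){100}));

        wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
        GUEST_SYNC(rdmsr(MSR_CORE_PERF_FIXED_CTR0));
    }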

Reviewed-by: Jim Mattson <[email protected]>
Reviewed-by: Dapeng Mi <[email protected]>
Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 54 +++++++++++++++----
1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 5b8687bb4639..663e8fbe7ff8 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -150,26 +150,46 @@ static void __guest_test_arch_event(uint8_t idx, struct kvm_x86_pmu_feature even
guest_assert_event_count(idx, event, pmc, pmc_msr);
}

+#define X86_PMU_FEATURE_NULL \
+({ \
+ struct kvm_x86_pmu_feature feature = {}; \
+ \
+ feature; \
+})
+
+static bool pmu_is_null_feature(struct kvm_x86_pmu_feature event)
+{
+ return !(*(u64 *)&event);
+}
+
static void guest_test_arch_event(uint8_t idx)
{
const struct {
struct kvm_x86_pmu_feature gp_event;
+ struct kvm_x86_pmu_feature fixed_event;
} intel_event_to_feature[] = {
- [INTEL_ARCH_CPU_CYCLES_INDEX] = { X86_PMU_FEATURE_CPU_CYCLES },
- [INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX] = { X86_PMU_FEATURE_INSNS_RETIRED },
- [INTEL_ARCH_REFERENCE_CYCLES_INDEX] = { X86_PMU_FEATURE_REFERENCE_CYCLES },
- [INTEL_ARCH_LLC_REFERENCES_INDEX] = { X86_PMU_FEATURE_LLC_REFERENCES },
- [INTEL_ARCH_LLC_MISSES_INDEX] = { X86_PMU_FEATURE_LLC_MISSES },
- [INTEL_ARCH_BRANCHES_RETIRED_INDEX] = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED },
- [INTEL_ARCH_BRANCHES_MISPREDICTED_INDEX] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED },
- [INTEL_ARCH_TOPDOWN_SLOTS_INDEX] = { X86_PMU_FEATURE_TOPDOWN_SLOTS },
+ [INTEL_ARCH_CPU_CYCLES_INDEX] = { X86_PMU_FEATURE_CPU_CYCLES, X86_PMU_FEATURE_CPU_CYCLES_FIXED },
+ [INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX] = { X86_PMU_FEATURE_INSNS_RETIRED, X86_PMU_FEATURE_INSNS_RETIRED_FIXED },
+ /*
+ * Note, the fixed counter for reference cycles is NOT the same
+ * as the general purpose architectural event. The fixed counter
+ * explicitly counts at the same frequency as the TSC, whereas
+ * the GP event counts at a fixed, but uarch specific, frequency.
+ * Bundle them here for simplicity.
+ */
+ [INTEL_ARCH_REFERENCE_CYCLES_INDEX] = { X86_PMU_FEATURE_REFERENCE_CYCLES, X86_PMU_FEATURE_REFERENCE_TSC_CYCLES_FIXED },
+ [INTEL_ARCH_LLC_REFERENCES_INDEX] = { X86_PMU_FEATURE_LLC_REFERENCES, X86_PMU_FEATURE_NULL },
+ [INTEL_ARCH_LLC_MISSES_INDEX] = { X86_PMU_FEATURE_LLC_MISSES, X86_PMU_FEATURE_NULL },
+ [INTEL_ARCH_BRANCHES_RETIRED_INDEX] = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED, X86_PMU_FEATURE_NULL },
+ [INTEL_ARCH_BRANCHES_MISPREDICTED_INDEX] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED, X86_PMU_FEATURE_NULL },
+ [INTEL_ARCH_TOPDOWN_SLOTS_INDEX] = { X86_PMU_FEATURE_TOPDOWN_SLOTS, X86_PMU_FEATURE_TOPDOWN_SLOTS_FIXED },
};

uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
uint32_t pmu_version = guest_get_pmu_version();
/* PERF_GLOBAL_CTRL exists only for Architectural PMU Version 2+. */
bool guest_has_perf_global_ctrl = pmu_version >= 2;
- struct kvm_x86_pmu_feature gp_event;
+ struct kvm_x86_pmu_feature gp_event, fixed_event;
uint32_t base_pmc_msr;
unsigned int i;

@@ -199,6 +219,22 @@ static void guest_test_arch_event(uint8_t idx)
__guest_test_arch_event(idx, gp_event, i, base_pmc_msr + i,
MSR_P6_EVNTSEL0 + i, eventsel);
}
+
+ if (!guest_has_perf_global_ctrl)
+ return;
+
+ fixed_event = intel_event_to_feature[idx].fixed_event;
+ if (pmu_is_null_feature(fixed_event) || !this_pmu_has(fixed_event))
+ return;
+
+ i = fixed_event.f.bit;
+
+ wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, FIXED_PMC_CTRL(i, FIXED_PMC_KERNEL));
+
+ __guest_test_arch_event(idx, fixed_event, i | INTEL_RDPMC_FIXED,
+ MSR_CORE_PERF_FIXED_CTR0 + i,
+ MSR_CORE_PERF_GLOBAL_CTRL,
+ FIXED_PMC_GLOBAL_CTRL_ENABLE(i));
}

static void guest_test_arch_events(void)
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:10:11

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 19/29] KVM: selftests: Test consistency of CPUID with num of fixed counters

From: Jinrong Liang <[email protected]>

Extend the PMU counters test to verify KVM emulation of fixed counters in
addition to general purpose counters. Fixed counters add an extra wrinkle
in the form of an extra supported bitmask. Thus quoth the SDM:

fixed-function performance counter 'i' is supported if ECX[i] || (EDX[4:0] > i)

Test that KVM handles a counter being available through either method.
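
A standalone sketch of that rule, reading CPUID.0xA directly (illustrative
only; the selftest itself goes through the X86_PROPERTY_* accessors):

    #include <cpuid.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Fixed counter 'i' is supported if CPUID.0xA.ECX[i] || CPUID.0xA.EDX[4:0] > i. */
    static bool fixed_counter_supported(unsigned int i)
    {
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid_count(0xa, 0, &eax, &ebx, &ecx, &edx))
            return false;

        return ((ecx >> i) & 1) || (edx & 0x1f) > i;
    }

    int main(void)
    {
        for (unsigned int i = 0; i < 4; i++)
            printf("fixed counter %u supported: %d\n", i, fixed_counter_supported(i));
        return 0;
    }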

Reviewed-by: Dapeng Mi <[email protected]>
Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 60 ++++++++++++++++++-
1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 863418842ef8..b07294af71a3 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -290,7 +290,7 @@ __GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector, \
msr, expected_val, val);

static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters,
- uint8_t nr_counters)
+ uint8_t nr_counters, uint32_t or_mask)
{
uint8_t i;

@@ -301,7 +301,13 @@ static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters
*/
const uint64_t test_val = 0xffff;
const uint32_t msr = base_msr + i;
- const bool expect_success = i < nr_counters;
+
+ /*
+ * Fixed counters are supported if the counter is less than the
+ * number of enumerated contiguous counters *or* the counter is
+ * explicitly enumerated in the supported counters mask.
+ */
+ const bool expect_success = i < nr_counters || (or_mask & BIT(i));

/*
* KVM drops writes to MSR_P6_PERFCTR[0|1] if the counters are
@@ -343,7 +349,7 @@ static void guest_test_gp_counters(void)
else
base_msr = MSR_IA32_PERFCTR0;

- guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters);
+ guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters, 0);
}

static void test_gp_counters(uint8_t pmu_version, uint64_t perf_capabilities,
@@ -363,9 +369,50 @@ static void test_gp_counters(uint8_t pmu_version, uint64_t perf_capabilities,
kvm_vm_free(vm);
}

+static void guest_test_fixed_counters(void)
+{
+ uint64_t supported_bitmask = 0;
+ uint8_t nr_fixed_counters = 0;
+
+ /* Fixed counters require Architectural vPMU Version 2+. */
+ if (guest_get_pmu_version() >= 2)
+ nr_fixed_counters = this_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
+
+ /*
+ * The supported bitmask for fixed counters was introduced in PMU
+ * version 5.
+ */
+ if (guest_get_pmu_version() >= 5)
+ supported_bitmask = this_cpu_property(X86_PROPERTY_PMU_FIXED_COUNTERS_BITMASK);
+
+ guest_rd_wr_counters(MSR_CORE_PERF_FIXED_CTR0, MAX_NR_FIXED_COUNTERS,
+ nr_fixed_counters, supported_bitmask);
+}
+
+static void test_fixed_counters(uint8_t pmu_version, uint64_t perf_capabilities,
+ uint8_t nr_fixed_counters,
+ uint32_t supported_bitmask)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+
+ vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_fixed_counters,
+ pmu_version, perf_capabilities);
+
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_FIXED_COUNTERS_BITMASK,
+ supported_bitmask);
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_NR_FIXED_COUNTERS,
+ nr_fixed_counters);
+
+ run_vcpu(vcpu);
+
+ kvm_vm_free(vm);
+}
+
static void test_intel_counters(void)
{
uint8_t nr_arch_events = kvm_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ uint8_t nr_fixed_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
uint8_t nr_gp_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
uint8_t pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
unsigned int i;
@@ -435,6 +482,13 @@ static void test_intel_counters(void)
v, perf_caps[i]);
for (j = 0; j <= nr_gp_counters; j++)
test_gp_counters(v, perf_caps[i], j);
+
+ pr_info("Testing fixed counters, PMU version %u, perf_caps = %lx\n",
+ v, perf_caps[i]);
+ for (j = 0; j <= nr_fixed_counters; j++) {
+ for (k = 0; k <= (BIT(nr_fixed_counters) - 1); k++)
+ test_fixed_counters(v, perf_caps[i], j, k);
+ }
}
}
}
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:11:14

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 13/29] KVM: selftests: Drop the "name" param from KVM_X86_PMU_FEATURE()

Drop the "name" parameter from KVM_X86_PMU_FEATURE(), it's unused and
the name is redundant with the macro, i.e. it's truly useless.

Reviewed-by: Jim Mattson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/include/x86_64/processor.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 932944c4ea01..4f737d3b893c 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -290,7 +290,7 @@ struct kvm_x86_cpu_property {
struct kvm_x86_pmu_feature {
struct kvm_x86_cpu_feature anti_feature;
};
-#define KVM_X86_PMU_FEATURE(name, __bit) \
+#define KVM_X86_PMU_FEATURE(__bit) \
({ \
struct kvm_x86_pmu_feature feature = { \
.anti_feature = KVM_X86_CPU_FEATURE(0xa, 0, EBX, __bit), \
@@ -299,7 +299,7 @@ struct kvm_x86_pmu_feature {
feature; \
})

-#define X86_PMU_FEATURE_BRANCH_INSNS_RETIRED KVM_X86_PMU_FEATURE(BRANCH_INSNS_RETIRED, 5)
+#define X86_PMU_FEATURE_BRANCH_INSNS_RETIRED KVM_X86_PMU_FEATURE(5)

static inline unsigned int x86_family(unsigned int eax)
{
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:11:26

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 22/29] KVM: selftests: Add a helper to query if the PMU module param is enabled

Add a helper to probe KVM's "enable_pmu" param; open coding strings in
multiple places is just asking for false negatives and/or runtime errors
due to typos.

Reviewed-by: Dapeng Mi <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/include/x86_64/processor.h | 5 +++++
tools/testing/selftests/kvm/x86_64/pmu_counters_test.c | 2 +-
tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c | 2 +-
tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c | 2 +-
4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 92d4f8ecc730..ee082ae58f40 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -1217,6 +1217,11 @@ static inline uint8_t xsetbv_safe(uint32_t index, uint64_t value)

bool kvm_is_tdp_enabled(void);

+static inline bool kvm_is_pmu_enabled(void)
+{
+ return get_kvm_param_bool("enable_pmu");
+}
+
uint64_t *__vm_get_page_table_entry(struct kvm_vm *vm, uint64_t vaddr,
int *level);
uint64_t *vm_get_page_table_entry(struct kvm_vm *vm, uint64_t vaddr);
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 4c7133ddcda8..9e9dc4084c0d 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -545,7 +545,7 @@ static void test_intel_counters(void)

int main(int argc, char *argv[])
{
- TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
+ TEST_REQUIRE(kvm_is_pmu_enabled());

TEST_REQUIRE(host_cpu_is_intel);
TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
index 7ec9fbed92e0..fa407e2ccb2f 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
@@ -867,7 +867,7 @@ int main(int argc, char *argv[])
struct kvm_vcpu *vcpu, *vcpu2 = NULL;
struct kvm_vm *vm;

- TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
+ TEST_REQUIRE(kvm_is_pmu_enabled());
TEST_REQUIRE(kvm_has_cap(KVM_CAP_PMU_EVENT_FILTER));
TEST_REQUIRE(kvm_has_cap(KVM_CAP_PMU_EVENT_MASKED_EVENTS));

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
index 2a8d4ac2f020..8ded194c5a6d 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_pmu_caps_test.c
@@ -237,7 +237,7 @@ int main(int argc, char *argv[])
{
union perf_capabilities host_cap;

- TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
+ TEST_REQUIRE(kvm_is_pmu_enabled());
TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));

TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:11:51

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 23/29] KVM: selftests: Add helpers to read integer module params

Add helpers to read integer module params, which is painfully non-trivial
because the pain of dealing with strings in C is exacerbated by the kernel
inserting a newline.

Don't bother differentiating between int, uint, short, etc. They all fit
in an int, and KVM (thankfully) doesn't have any integer params larger
than an int.
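
A rough standalone equivalent of what the new helpers do; this sketch uses
stdio and plain atoi() instead of the selftests' open_path_or_exit() and
atoi_paranoid(), and "halt_poll_ns" is just an example parameter:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static int read_kvm_param_int(const char *param)
    {
        char path[128], buf[32];
        FILE *f;

        snprintf(path, sizeof(path), "/sys/module/kvm/parameters/%s", param);
        f = fopen(path, "r");
        if (!f || !fgets(buf, sizeof(buf), f)) {
            perror(path);
            exit(1);
        }
        fclose(f);

        /* Squash the kernel's trailing newline before converting. */
        buf[strcspn(buf, "\n")] = '\0';
        return atoi(buf);
    }

    int main(void)
    {
        printf("kvm.halt_poll_ns = %d\n", read_kvm_param_int("halt_poll_ns"));
        return 0;
    }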

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/include/kvm_util_base.h | 4 ++
tools/testing/selftests/kvm/lib/kvm_util.c | 62 +++++++++++++++++--
2 files changed, 60 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 9e5afc472c14..070f250036fc 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -259,6 +259,10 @@ bool get_kvm_param_bool(const char *param);
bool get_kvm_intel_param_bool(const char *param);
bool get_kvm_amd_param_bool(const char *param);

+int get_kvm_param_integer(const char *param);
+int get_kvm_intel_param_integer(const char *param);
+int get_kvm_amd_param_integer(const char *param);
+
unsigned int kvm_check_cap(long cap);

static inline bool kvm_has_cap(long cap)
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index e066d584c656..9bafe44cb978 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -51,13 +51,13 @@ int open_kvm_dev_path_or_exit(void)
return _open_kvm_dev_path_or_exit(O_RDONLY);
}

-static bool get_module_param_bool(const char *module_name, const char *param)
+static ssize_t get_module_param(const char *module_name, const char *param,
+ void *buffer, size_t buffer_size)
{
const int path_size = 128;
char path[path_size];
- char value;
- ssize_t r;
- int fd;
+ ssize_t bytes_read;
+ int fd, r;

r = snprintf(path, path_size, "/sys/module/%s/parameters/%s",
module_name, param);
@@ -66,11 +66,46 @@ static bool get_module_param_bool(const char *module_name, const char *param)

fd = open_path_or_exit(path, O_RDONLY);

- r = read(fd, &value, 1);
- TEST_ASSERT(r == 1, "read(%s) failed", path);
+ bytes_read = read(fd, buffer, buffer_size);
+ TEST_ASSERT(bytes_read > 0, "read(%s) returned %ld, wanted %ld bytes",
+ path, bytes_read, buffer_size);

r = close(fd);
TEST_ASSERT(!r, "close(%s) failed", path);
+ return bytes_read;
+}
+
+static int get_module_param_integer(const char *module_name, const char *param)
+{
+ /*
+ * 16 bytes to hold a 64-bit value (1 byte per char), 1 byte for the
+ * NUL char, and 1 byte because the kernel sucks and inserts a newline
+ * at the end.
+ */
+ char value[16 + 1 + 1];
+ ssize_t r;
+
+ memset(value, '\0', sizeof(value));
+
+ r = get_module_param(module_name, param, value, sizeof(value));
+ TEST_ASSERT(value[r - 1] == '\n',
+ "Expected trailing newline, got char '%c'", value[r - 1]);
+
+ /*
+ * Squash the newline, otherwise atoi_paranoid() will complain about
+ * trailing non-NUL characters in the string.
+ */
+ value[r - 1] = '\0';
+ return atoi_paranoid(value);
+}
+
+static bool get_module_param_bool(const char *module_name, const char *param)
+{
+ char value;
+ ssize_t r;
+
+ r = get_module_param(module_name, param, &value, sizeof(value));
+ TEST_ASSERT_EQ(r, 1);

if (value == 'Y')
return true;
@@ -95,6 +130,21 @@ bool get_kvm_amd_param_bool(const char *param)
return get_module_param_bool("kvm_amd", param);
}

+int get_kvm_param_integer(const char *param)
+{
+ return get_module_param_integer("kvm", param);
+}
+
+int get_kvm_intel_param_integer(const char *param)
+{
+ return get_module_param_integer("kvm_intel", param);
+}
+
+int get_kvm_amd_param_integer(const char *param)
+{
+ return get_module_param_integer("kvm_amd", param);
+}
+
/*
* Capability
*
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:13:02

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 26/29] KVM: selftests: Test PMC virtualization with forced emulation

Extend the PMC counters test to use forced emulation to verify that KVM
emulates counter events for instructions retired and branches retired.
Force emulation for only a subset of the measured code to test that KVM
does the right thing when mixing perf events with emulated events.
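
For readers unfamiliar with forced emulation: KVM_FEP is a magic prefix that,
when KVM's "force_emulation_prefix" module param is enabled, makes KVM emulate
the prefixed instruction instead of executing it natively. A trivial
guest-side sketch (function name made up, selftest headers assumed):

    static void guest_fep_demo(void)
    {
        /* Executed natively by hardware. */
        __asm__ __volatile__("nop");

        /* The same instruction, but forced through KVM's emulator. */
        __asm__ __volatile__(KVM_FEP "nop");
    }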

Reviewed-by: Dapeng Mi <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 44 +++++++++++++------
1 file changed, 30 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 9e9dc4084c0d..cb808ac827ba 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -21,6 +21,7 @@

static uint8_t kvm_pmu_version;
static bool kvm_has_perf_caps;
+static bool is_forced_emulation_enabled;

static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
void *guest_code,
@@ -34,6 +35,7 @@ static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
vcpu_init_descriptor_tables(*vcpu);

sync_global_to_guest(vm, kvm_pmu_version);
+ sync_global_to_guest(vm, is_forced_emulation_enabled);

/*
* Set PERF_CAPABILITIES before PMU version as KVM disallows enabling
@@ -138,37 +140,50 @@ static void guest_assert_event_count(uint8_t idx,
* If CLFUSH{,OPT} is supported, flush the cacheline containing (at least) the
* start of the loop to force LLC references and misses, i.e. to allow testing
* that those events actually count.
+ *
+ * If forced emulation is enabled (and specified), force emulation on a subset
+ * of the measured code to verify that KVM correctly emulates instructions and
+ * branches retired events in conjunction with hardware also counting said
+ * events.
*/
-#define GUEST_MEASURE_EVENT(_msr, _value, clflush) \
+#define GUEST_MEASURE_EVENT(_msr, _value, clflush, FEP) \
do { \
__asm__ __volatile__("wrmsr\n\t" \
clflush "\n\t" \
"mfence\n\t" \
"1: mov $" __stringify(NUM_BRANCHES) ", %%ecx\n\t" \
- "loop .\n\t" \
- "mov %%edi, %%ecx\n\t" \
- "xor %%eax, %%eax\n\t" \
- "xor %%edx, %%edx\n\t" \
+ FEP "loop .\n\t" \
+ FEP "mov %%edi, %%ecx\n\t" \
+ FEP "xor %%eax, %%eax\n\t" \
+ FEP "xor %%edx, %%edx\n\t" \
"wrmsr\n\t" \
:: "a"((uint32_t)_value), "d"(_value >> 32), \
"c"(_msr), "D"(_msr) \
); \
} while (0)

+#define GUEST_TEST_EVENT(_idx, _event, _pmc, _pmc_msr, _ctrl_msr, _value, FEP) \
+do { \
+ wrmsr(_pmc_msr, 0); \
+ \
+ if (this_cpu_has(X86_FEATURE_CLFLUSHOPT)) \
+ GUEST_MEASURE_EVENT(_ctrl_msr, _value, "clflushopt 1f", FEP); \
+ else if (this_cpu_has(X86_FEATURE_CLFLUSH)) \
+ GUEST_MEASURE_EVENT(_ctrl_msr, _value, "clflush 1f", FEP); \
+ else \
+ GUEST_MEASURE_EVENT(_ctrl_msr, _value, "nop", FEP); \
+ \
+ guest_assert_event_count(_idx, _event, _pmc, _pmc_msr); \
+} while (0)
+
static void __guest_test_arch_event(uint8_t idx, struct kvm_x86_pmu_feature event,
uint32_t pmc, uint32_t pmc_msr,
uint32_t ctrl_msr, uint64_t ctrl_msr_value)
{
- wrmsr(pmc_msr, 0);
+ GUEST_TEST_EVENT(idx, event, pmc, pmc_msr, ctrl_msr, ctrl_msr_value, "");

- if (this_cpu_has(X86_FEATURE_CLFLUSHOPT))
- GUEST_MEASURE_EVENT(ctrl_msr, ctrl_msr_value, "clflushopt 1f");
- else if (this_cpu_has(X86_FEATURE_CLFLUSH))
- GUEST_MEASURE_EVENT(ctrl_msr, ctrl_msr_value, "clflush 1f");
- else
- GUEST_MEASURE_EVENT(ctrl_msr, ctrl_msr_value, "nop");
-
- guest_assert_event_count(idx, event, pmc, pmc_msr);
+ if (is_forced_emulation_enabled)
+ GUEST_TEST_EVENT(idx, event, pmc, pmc_msr, ctrl_msr, ctrl_msr_value, KVM_FEP);
}

#define X86_PMU_FEATURE_NULL \
@@ -553,6 +568,7 @@ int main(int argc, char *argv[])

kvm_pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
kvm_has_perf_caps = kvm_cpu_has(X86_FEATURE_PDCM);
+ is_forced_emulation_enabled = kvm_is_forced_emulation_enabled();

test_intel_counters();

--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:13:12

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 27/29] KVM: selftests: Add a forced emulation variation of KVM_ASM_SAFE()

Add KVM_ASM_SAFE_FEP() to allow forcing emulation on an instruction that
might fault. Note, KVM skips RIP past the FEP prefix before injecting an
exception, i.e. the fixup needs to be on the instruction itself. Do not
check for FEP support; that is firmly the responsibility of whatever code
wants to use KVM_ASM_SAFE_FEP().

Sadly, chaining variadic arguments that contain commas doesn't work, thus
the unfortunate amount of copy+paste.
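
A hypothetical usage sketch (the MSR index is made up, and the snippet assumes
forced emulation is enabled and that KVM is not configured to ignore unknown
MSR accesses):

    uint8_t vector;

    /* Force-emulate a WRMSR to a bogus MSR and expect #GP. */
    vector = kvm_asm_safe_fep("wrmsr", "a"(0u), "d"(0u), "c"(0xdeadbeefu));
    GUEST_ASSERT_EQ(vector, GP_VECTOR);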

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/include/x86_64/processor.h | 30 +++++++++++++++++--
1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 6be365ac2a85..fe891424ff55 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -1154,16 +1154,19 @@ void vm_install_exception_handler(struct kvm_vm *vm, int vector,
* r9 = exception vector (non-zero)
* r10 = error code
*/
-#define KVM_ASM_SAFE(insn) \
+#define __KVM_ASM_SAFE(insn, fep) \
"mov $" __stringify(KVM_EXCEPTION_MAGIC) ", %%r9\n\t" \
"lea 1f(%%rip), %%r10\n\t" \
"lea 2f(%%rip), %%r11\n\t" \
- "1: " insn "\n\t" \
+ fep "1: " insn "\n\t" \
"xor %%r9, %%r9\n\t" \
"2:\n\t" \
"mov %%r9b, %[vector]\n\t" \
"mov %%r10, %[error_code]\n\t"

+#define KVM_ASM_SAFE(insn) __KVM_ASM_SAFE(insn, "")
+#define KVM_ASM_SAFE_FEP(insn) __KVM_ASM_SAFE(insn, KVM_FEP)
+
#define KVM_ASM_SAFE_OUTPUTS(v, ec) [vector] "=qm"(v), [error_code] "=rm"(ec)
#define KVM_ASM_SAFE_CLOBBERS "r9", "r10", "r11"

@@ -1190,6 +1193,29 @@ void vm_install_exception_handler(struct kvm_vm *vm, int vector,
vector; \
})

+#define kvm_asm_safe_fep(insn, inputs...) \
+({ \
+ uint64_t ign_error_code; \
+ uint8_t vector; \
+ \
+ asm volatile(KVM_ASM_SAFE_FEP(insn) \
+ : KVM_ASM_SAFE_OUTPUTS(vector, ign_error_code) \
+ : inputs \
+ : KVM_ASM_SAFE_CLOBBERS); \
+ vector; \
+})
+
+#define kvm_asm_safe_ec_fep(insn, error_code, inputs...) \
+({ \
+ uint8_t vector; \
+ \
+ asm volatile(KVM_ASM_SAFE_FEP(insn) \
+ : KVM_ASM_SAFE_OUTPUTS(vector, error_code) \
+ : inputs \
+ : KVM_ASM_SAFE_CLOBBERS); \
+ vector; \
+})
+
static inline uint8_t rdmsr_safe(uint32_t msr, uint64_t *val)
{
uint64_t error_code;
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:13:40

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 28/29] KVM: selftests: Add helpers for safe and safe+forced RDMSR, RDPMC, and XGETBV

Add helpers for safe and safe-with-forced-emulations versions of RDMSR,
RDPMC, and XGETBV. Use macro shenanigans to eliminate the rather large
amount of boilerplate needed to get values in and out of registers.
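
Usage is then uniform across the three instructions, e.g. a guest-code sketch
(assumes a vPMU with at least one GP counter, and forced emulation enabled for
the _fep variant):

    uint64_t hw_val, emul_val;
    uint8_t vector;

    /* Read GP counter 0 via hardware RDPMC... */
    vector = rdpmc_safe(0, &hw_val);
    GUEST_ASSERT(!vector);

    /* ...and again with the RDPMC forced through KVM's emulator. */
    vector = rdpmc_safe_fep(0, &emul_val);
    GUEST_ASSERT(!vector);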

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/include/x86_64/processor.h | 40 +++++++++++++------
1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index fe891424ff55..abac816f6594 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -1216,21 +1216,35 @@ void vm_install_exception_handler(struct kvm_vm *vm, int vector,
vector; \
})

-static inline uint8_t rdmsr_safe(uint32_t msr, uint64_t *val)
-{
- uint64_t error_code;
- uint8_t vector;
- uint32_t a, d;
-
- asm volatile(KVM_ASM_SAFE("rdmsr")
- : "=a"(a), "=d"(d), KVM_ASM_SAFE_OUTPUTS(vector, error_code)
- : "c"(msr)
- : KVM_ASM_SAFE_CLOBBERS);
-
- *val = (uint64_t)a | ((uint64_t)d << 32);
- return vector;
+#define BUILD_READ_U64_SAFE_HELPER(insn, _fep, _FEP) \
+static inline uint8_t insn##_safe ##_fep(uint32_t idx, uint64_t *val) \
+{ \
+ uint64_t error_code; \
+ uint8_t vector; \
+ uint32_t a, d; \
+ \
+ asm volatile(KVM_ASM_SAFE##_FEP(#insn) \
+ : "=a"(a), "=d"(d), \
+ KVM_ASM_SAFE_OUTPUTS(vector, error_code) \
+ : "c"(idx) \
+ : KVM_ASM_SAFE_CLOBBERS); \
+ \
+ *val = (uint64_t)a | ((uint64_t)d << 32); \
+ return vector; \
}

+/*
+ * Generate {insn}_safe() and {insn}_safe_fep() helpers for instructions that
+ * use ECX as in input index, and EDX:EAX as a 64-bit output.
+ */
+#define BUILD_READ_U64_SAFE_HELPERS(insn) \
+ BUILD_READ_U64_SAFE_HELPER(insn, , ) \
+ BUILD_READ_U64_SAFE_HELPER(insn, _fep, _FEP) \
+
+BUILD_READ_U64_SAFE_HELPERS(rdmsr)
+BUILD_READ_U64_SAFE_HELPERS(rdpmc)
+BUILD_READ_U64_SAFE_HELPERS(xgetbv)
+
static inline uint8_t wrmsr_safe(uint32_t msr, uint64_t val)
{
return kvm_asm_safe("wrmsr", "a"(val & -1u), "d"(val >> 32), "c"(msr));
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:14:05

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 29/29] KVM: selftests: Extend PMU counters test to validate RDPMC after WRMSR

Extend the read/write PMU counters subtest to verify that RDPMC also reads
back the written value. Opportunistically verify that attempting to use
the "fast" mode of RDPMC fails, as the "fast" flag is only supported by
non-architectural PMUs, which KVM doesn't virtualize.
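
For reference, a guest-code sketch of the RDPMC index encoding being exercised
(assumes the vCPU enumerates at least two fixed counters, so fixed counter 1
exists):

    uint32_t rdpmc_idx = 1 | INTEL_RDPMC_FIXED;
    uint64_t val;
    uint8_t vector;

    /* Reading fixed counter 1 should succeed... */
    vector = rdpmc_safe(rdpmc_idx, &val);
    GUEST_ASSERT(!vector);

    /* ...but setting the "fast" flag should #GP, as KVM doesn't support it. */
    vector = rdpmc_safe(rdpmc_idx | INTEL_RDPMC_FAST, &val);
    GUEST_ASSERT_EQ(vector, GP_VECTOR);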

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 41 +++++++++++++++++++
1 file changed, 41 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index cb808ac827ba..ae5f6042f1e8 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -325,9 +325,30 @@ __GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector, \
"Expected " #insn "(0x%x) to yield 0x%lx, got 0x%lx", \
msr, expected_val, val);

+static void guest_test_rdpmc(uint32_t rdpmc_idx, bool expect_success,
+ uint64_t expected_val)
+{
+ uint8_t vector;
+ uint64_t val;
+
+ vector = rdpmc_safe(rdpmc_idx, &val);
+ GUEST_ASSERT_PMC_MSR_ACCESS(RDPMC, rdpmc_idx, !expect_success, vector);
+ if (expect_success)
+ GUEST_ASSERT_PMC_VALUE(RDPMC, rdpmc_idx, val, expected_val);
+
+ if (!is_forced_emulation_enabled)
+ return;
+
+ vector = rdpmc_safe_fep(rdpmc_idx, &val);
+ GUEST_ASSERT_PMC_MSR_ACCESS(RDPMC, rdpmc_idx, !expect_success, vector);
+ if (expect_success)
+ GUEST_ASSERT_PMC_VALUE(RDPMC, rdpmc_idx, val, expected_val);
+}
+
static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters,
uint8_t nr_counters, uint32_t or_mask)
{
+ const bool pmu_has_fast_mode = !guest_get_pmu_version();
uint8_t i;

for (i = 0; i < nr_possible_counters; i++) {
@@ -352,6 +373,7 @@ static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters
const uint64_t expected_val = expect_success ? test_val : 0;
const bool expect_gp = !expect_success && msr != MSR_P6_PERFCTR0 &&
msr != MSR_P6_PERFCTR1;
+ uint32_t rdpmc_idx;
uint8_t vector;
uint64_t val;

@@ -365,6 +387,25 @@ static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters
if (!expect_gp)
GUEST_ASSERT_PMC_VALUE(RDMSR, msr, val, expected_val);

+ /*
+ * Redo the read tests with RDPMC, which has different indexing
+ * semantics and additional capabilities.
+ */
+ rdpmc_idx = i;
+ if (base_msr == MSR_CORE_PERF_FIXED_CTR0)
+ rdpmc_idx |= INTEL_RDPMC_FIXED;
+
+ guest_test_rdpmc(rdpmc_idx, expect_success, expected_val);
+
+ /*
+ * KVM doesn't support non-architectural PMUs, i.e. it should be
+ * impossible to have fast mode RDPMC. Verify that attempting
+ * to use fast RDPMC always #GPs.
+ */
+ GUEST_ASSERT(!expect_success || !pmu_has_fast_mode);
+ rdpmc_idx |= INTEL_RDPMC_FAST;
+ guest_test_rdpmc(rdpmc_idx, false, -1ull);
+
vector = wrmsr_safe(msr, 0);
GUEST_ASSERT_PMC_MSR_ACCESS(WRMSR, msr, expect_gp, vector);
}
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:15:14

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 18/29] KVM: selftests: Test consistency of CPUID with num of gp counters

From: Jinrong Liang <[email protected]>

Add a test to verify that KVM correctly emulates MSR-based accesses to
general purpose counters based on guest CPUID, e.g. that accesses to
non-existent counters #GP and accesses to existent counters succeed.

Note, for compatibility reasons, KVM does not emulate #GP when
MSR_P6_PERFCTR[0|1] is not present (writes should be dropped).
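
Concretely, for each possible GP counter MSR, the guest does roughly the
following (a sketch of guest_rd_wr_counters() below, where "counter_exists"
stands in for "i < nr_counters" and msr/val/vector are locals as in the diff):

  const bool expect_gp = !counter_exists && msr != MSR_P6_PERFCTR0 &&
                         msr != MSR_P6_PERFCTR1;

  vector = wrmsr_safe(msr, 0xffff);
  GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector);

  vector = rdmsr_safe(msr, &val);
  GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector);

  /* Dropped writes to MSR_P6_PERFCTR[0|1] must read back as '0'. */
  if (!vector)
    GUEST_ASSERT(val == (counter_exists ? 0xffff : 0));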

Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 99 +++++++++++++++++++
1 file changed, 99 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 663e8fbe7ff8..863418842ef8 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -270,9 +270,103 @@ static void test_arch_events(uint8_t pmu_version, uint64_t perf_capabilities,
kvm_vm_free(vm);
}

+/*
+ * Limit testing to MSRs that are actually defined by Intel (in the SDM). MSRs
+ * that aren't defined counter MSRs *probably* don't exist, but there's no
+ * guarantee that currently undefined MSR indices won't be used for something
+ * other than PMCs in the future.
+ */
+#define MAX_NR_GP_COUNTERS 8
+#define MAX_NR_FIXED_COUNTERS 3
+
+#define GUEST_ASSERT_PMC_MSR_ACCESS(insn, msr, expect_gp, vector) \
+__GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector, \
+ "Expected %s on " #insn "(0x%x), got vector %u", \
+ expect_gp ? "#GP" : "no fault", msr, vector) \
+
+#define GUEST_ASSERT_PMC_VALUE(insn, msr, val, expected) \
+ __GUEST_ASSERT(val == expected_val, \
+ "Expected " #insn "(0x%x) to yield 0x%lx, got 0x%lx", \
+ msr, expected_val, val);
+
+static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters,
+ uint8_t nr_counters)
+{
+ uint8_t i;
+
+ for (i = 0; i < nr_possible_counters; i++) {
+ /*
+ * TODO: Test a value that validates full-width writes and the
+ * width of the counters.
+ */
+ const uint64_t test_val = 0xffff;
+ const uint32_t msr = base_msr + i;
+ const bool expect_success = i < nr_counters;
+
+ /*
+ * KVM drops writes to MSR_P6_PERFCTR[0|1] if the counters are
+ * unsupported, i.e. doesn't #GP and reads back '0'.
+ */
+ const uint64_t expected_val = expect_success ? test_val : 0;
+ const bool expect_gp = !expect_success && msr != MSR_P6_PERFCTR0 &&
+ msr != MSR_P6_PERFCTR1;
+ uint8_t vector;
+ uint64_t val;
+
+ vector = wrmsr_safe(msr, test_val);
+ GUEST_ASSERT_PMC_MSR_ACCESS(WRMSR, msr, expect_gp, vector);
+
+ vector = rdmsr_safe(msr, &val);
+ GUEST_ASSERT_PMC_MSR_ACCESS(RDMSR, msr, expect_gp, vector);
+
+ /* On #GP, the result of RDMSR is undefined. */
+ if (!expect_gp)
+ GUEST_ASSERT_PMC_VALUE(RDMSR, msr, val, expected_val);
+
+ vector = wrmsr_safe(msr, 0);
+ GUEST_ASSERT_PMC_MSR_ACCESS(WRMSR, msr, expect_gp, vector);
+ }
+ GUEST_DONE();
+}
+
+static void guest_test_gp_counters(void)
+{
+ uint8_t nr_gp_counters = 0;
+ uint32_t base_msr;
+
+ if (guest_get_pmu_version())
+ nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
+
+ if (this_cpu_has(X86_FEATURE_PDCM) &&
+ rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
+ base_msr = MSR_IA32_PMC0;
+ else
+ base_msr = MSR_IA32_PERFCTR0;
+
+ guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters);
+}
+
+static void test_gp_counters(uint8_t pmu_version, uint64_t perf_capabilities,
+ uint8_t nr_gp_counters)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+
+ vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_gp_counters,
+ pmu_version, perf_capabilities);
+
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_NR_GP_COUNTERS,
+ nr_gp_counters);
+
+ run_vcpu(vcpu);
+
+ kvm_vm_free(vm);
+}
+
static void test_intel_counters(void)
{
uint8_t nr_arch_events = kvm_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ uint8_t nr_gp_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
uint8_t pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
unsigned int i;
uint8_t v, j;
@@ -336,6 +430,11 @@ static void test_intel_counters(void)
for (k = 0; k < nr_arch_events; k++)
test_arch_events(v, perf_caps[i], j, BIT(k));
}
+
+ pr_info("Testing GP counters, PMU version %u, perf_caps = %lx\n",
+ v, perf_caps[i]);
+ for (j = 0; j <= nr_gp_counters; j++)
+ test_gp_counters(v, perf_caps[i], j);
}
}
}
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:16:21

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 20/29] KVM: selftests: Add functional test for Intel's fixed PMU counters

From: Jinrong Liang <[email protected]>

Extend the fixed counters test to verify that supported counters can
actually be enabled in the control MSRs, that unsupported counters cannot,
and that enabled counters actually count.
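
The gist of the new per-counter coverage, for each fixed counter i, is roughly
the sketch below, where "fixed_counter_supported(i)" stands in for
"i < nr_fixed_counters || (supported_bitmask & BIT_ULL(i))" and the
FIXED_PMC_* macros come from the selftests' pmu.h:

  if (!fixed_counter_supported(i)) {
    /* Enabling an unsupported fixed counter must #GP. */
    GUEST_ASSERT(wrmsr_safe(MSR_CORE_PERF_FIXED_CTR_CTRL,
                            FIXED_PMC_CTRL(i, FIXED_PMC_KERNEL)) == GP_VECTOR);
  } else {
    /* A supported, enabled fixed counter must actually count. */
    wrmsr(MSR_CORE_PERF_FIXED_CTR0 + i, 0);
    wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, FIXED_PMC_CTRL(i, FIXED_PMC_KERNEL));
    wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, FIXED_PMC_GLOBAL_CTRL_ENABLE(i));
    __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
    wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
    GUEST_ASSERT_NE(rdmsr(MSR_CORE_PERF_FIXED_CTR0 + i), 0);
  }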

Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
[sean: fold into the rd/wr access test, massage changelog]
Reviewed-by: Dapeng Mi <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 31 ++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index b07294af71a3..f5dedd112471 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -332,7 +332,6 @@ static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters
vector = wrmsr_safe(msr, 0);
GUEST_ASSERT_PMC_MSR_ACCESS(WRMSR, msr, expect_gp, vector);
}
- GUEST_DONE();
}

static void guest_test_gp_counters(void)
@@ -350,6 +349,7 @@ static void guest_test_gp_counters(void)
base_msr = MSR_IA32_PERFCTR0;

guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters, 0);
+ GUEST_DONE();
}

static void test_gp_counters(uint8_t pmu_version, uint64_t perf_capabilities,
@@ -373,6 +373,7 @@ static void guest_test_fixed_counters(void)
{
uint64_t supported_bitmask = 0;
uint8_t nr_fixed_counters = 0;
+ uint8_t i;

/* Fixed counters require Architectural vPMU Version 2+. */
if (guest_get_pmu_version() >= 2)
@@ -387,6 +388,34 @@ static void guest_test_fixed_counters(void)

guest_rd_wr_counters(MSR_CORE_PERF_FIXED_CTR0, MAX_NR_FIXED_COUNTERS,
nr_fixed_counters, supported_bitmask);
+
+ for (i = 0; i < MAX_NR_FIXED_COUNTERS; i++) {
+ uint8_t vector;
+ uint64_t val;
+
+ if (i >= nr_fixed_counters && !(supported_bitmask & BIT_ULL(i))) {
+ vector = wrmsr_safe(MSR_CORE_PERF_FIXED_CTR_CTRL,
+ FIXED_PMC_CTRL(i, FIXED_PMC_KERNEL));
+ __GUEST_ASSERT(vector == GP_VECTOR,
+ "Expected #GP for counter %u in FIXED_CTR_CTRL", i);
+
+ vector = wrmsr_safe(MSR_CORE_PERF_GLOBAL_CTRL,
+ FIXED_PMC_GLOBAL_CTRL_ENABLE(i));
+ __GUEST_ASSERT(vector == GP_VECTOR,
+ "Expected #GP for counter %u in PERF_GLOBAL_CTRL", i);
+ continue;
+ }
+
+ wrmsr(MSR_CORE_PERF_FIXED_CTR0 + i, 0);
+ wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, FIXED_PMC_CTRL(i, FIXED_PMC_KERNEL));
+ wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, FIXED_PMC_GLOBAL_CTRL_ENABLE(i));
+ __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
+ wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+ val = rdmsr(MSR_CORE_PERF_FIXED_CTR0 + i);
+
+ GUEST_ASSERT_NE(val, 0);
+ }
+ GUEST_DONE();
}

static void test_fixed_counters(uint8_t pmu_version, uint64_t perf_capabilities,
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:17:23

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 21/29] KVM: selftests: Expand PMU counters test to verify LLC events

Expand the PMU counters test to verify that LLC references and misses have
non-zero counts when the code being executed while the LLC event(s) is
active is evicted via CLFLUSH{,OPT}. Note, CLFLUSH{,OPT} requires a fence
of some kind to ensure the cache lines are flushed before execution
continues. Use MFENCE for simplicity (performance is not a concern).

Suggested-by: Jim Mattson <[email protected]>
Reviewed-by: Dapeng Mi <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 59 +++++++++++++------
1 file changed, 40 insertions(+), 19 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index f5dedd112471..4c7133ddcda8 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -14,9 +14,9 @@
/*
* Number of "extra" instructions that will be counted, i.e. the number of
* instructions that are needed to set up the loop and then disable the
- * counter. 2 MOV, 2 XOR, 1 WRMSR.
+ * counter. 1 CLFLUSH/CLFLUSHOPT/NOP, 1 MFENCE, 2 MOV, 2 XOR, 1 WRMSR.
*/
-#define NUM_EXTRA_INSNS 5
+#define NUM_EXTRA_INSNS 7
#define NUM_INSNS_RETIRED (NUM_BRANCHES + NUM_EXTRA_INSNS)

static uint8_t kvm_pmu_version;
@@ -107,6 +107,12 @@ static void guest_assert_event_count(uint8_t idx,
case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
GUEST_ASSERT_EQ(count, NUM_BRANCHES);
break;
+ case INTEL_ARCH_LLC_REFERENCES_INDEX:
+ case INTEL_ARCH_LLC_MISSES_INDEX:
+ if (!this_cpu_has(X86_FEATURE_CLFLUSHOPT) &&
+ !this_cpu_has(X86_FEATURE_CLFLUSH))
+ break;
+ fallthrough;
case INTEL_ARCH_CPU_CYCLES_INDEX:
case INTEL_ARCH_REFERENCE_CYCLES_INDEX:
GUEST_ASSERT_NE(count, 0);
@@ -123,29 +129,44 @@ static void guest_assert_event_count(uint8_t idx,
GUEST_ASSERT_EQ(_rdpmc(pmc), 0xdead);
}

+/*
+ * Enable and disable the PMC in a monolithic asm blob to ensure that the
+ * compiler can't insert _any_ code into the measured sequence. Note, ECX
+ * doesn't need to be clobbered as the input value, @pmc_msr, is restored
+ * before the end of the sequence.
+ *
+ * If CLFLUSH{,OPT} is supported, flush the cacheline containing (at least) the
+ * start of the loop to force LLC references and misses, i.e. to allow testing
+ * that those events actually count.
+ */
+#define GUEST_MEASURE_EVENT(_msr, _value, clflush) \
+do { \
+ __asm__ __volatile__("wrmsr\n\t" \
+ clflush "\n\t" \
+ "mfence\n\t" \
+ "1: mov $" __stringify(NUM_BRANCHES) ", %%ecx\n\t" \
+ "loop .\n\t" \
+ "mov %%edi, %%ecx\n\t" \
+ "xor %%eax, %%eax\n\t" \
+ "xor %%edx, %%edx\n\t" \
+ "wrmsr\n\t" \
+ :: "a"((uint32_t)_value), "d"(_value >> 32), \
+ "c"(_msr), "D"(_msr) \
+ ); \
+} while (0)
+
static void __guest_test_arch_event(uint8_t idx, struct kvm_x86_pmu_feature event,
uint32_t pmc, uint32_t pmc_msr,
uint32_t ctrl_msr, uint64_t ctrl_msr_value)
{
wrmsr(pmc_msr, 0);

- /*
- * Enable and disable the PMC in a monolithic asm blob to ensure that
- * the compiler can't insert _any_ code into the measured sequence.
- * Note, ECX doesn't need to be clobbered as the input value, @pmc_msr,
- * is restored before the end of the sequence.
- */
- __asm__ __volatile__("wrmsr\n\t"
- "mov $" __stringify(NUM_BRANCHES) ", %%ecx\n\t"
- "loop .\n\t"
- "mov %%edi, %%ecx\n\t"
- "xor %%eax, %%eax\n\t"
- "xor %%edx, %%edx\n\t"
- "wrmsr\n\t"
- :: "a"((uint32_t)ctrl_msr_value),
- "d"(ctrl_msr_value >> 32),
- "c"(ctrl_msr), "D"(ctrl_msr)
- );
+ if (this_cpu_has(X86_FEATURE_CLFLUSHOPT))
+ GUEST_MEASURE_EVENT(ctrl_msr, ctrl_msr_value, "clflushopt 1f");
+ else if (this_cpu_has(X86_FEATURE_CLFLUSH))
+ GUEST_MEASURE_EVENT(ctrl_msr, ctrl_msr_value, "clflush 1f");
+ else
+ GUEST_MEASURE_EVENT(ctrl_msr, ctrl_msr_value, "nop");

guest_assert_event_count(idx, event, pmc, pmc_msr);
}
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:19:28

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 24/29] KVM: selftests: Query module param to detect FEP in MSR filtering test

Add a helper to detect KVM support for forced emulation by querying the
module param, and use the helper to detect support for the MSR filtering
test instead of throwing a noodle/NOP at KVM to see if it sticks.
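
For reference, "querying the module param" boils down to reading an integer
from kvm.ko's standard sysfs location; a standalone sketch of the equivalent
logic (the real helper simply wraps the selftests' get_kvm_param_integer()):

  #include <stdbool.h>
  #include <stdio.h>

  static bool is_forced_emulation_enabled(void)
  {
    /* kvm.force_emulation_prefix is exposed here when kvm.ko is loaded. */
    FILE *f = fopen("/sys/module/kvm/parameters/force_emulation_prefix", "r");
    int val = 0;

    if (f) {
      if (fscanf(f, "%d", &val) != 1)
        val = 0;
      fclose(f);
    }
    return val != 0;
  }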

Cc: Aaron Lewis <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/include/x86_64/processor.h | 5 ++++
.../kvm/x86_64/userspace_msr_exit_test.c | 27 +++++++------------
2 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index ee082ae58f40..d211cea188be 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -1222,6 +1222,11 @@ static inline bool kvm_is_pmu_enabled(void)
return get_kvm_param_bool("enable_pmu");
}

+static inline bool kvm_is_forced_emulation_enabled(void)
+{
+ return !!get_kvm_param_integer("force_emulation_prefix");
+}
+
uint64_t *__vm_get_page_table_entry(struct kvm_vm *vm, uint64_t vaddr,
int *level);
uint64_t *vm_get_page_table_entry(struct kvm_vm *vm, uint64_t vaddr);
diff --git a/tools/testing/selftests/kvm/x86_64/userspace_msr_exit_test.c b/tools/testing/selftests/kvm/x86_64/userspace_msr_exit_test.c
index 3533dc2fbfee..9e12dbc47a72 100644
--- a/tools/testing/selftests/kvm/x86_64/userspace_msr_exit_test.c
+++ b/tools/testing/selftests/kvm/x86_64/userspace_msr_exit_test.c
@@ -14,8 +14,7 @@

/* Forced emulation prefix, used to invoke the emulator unconditionally. */
#define KVM_FEP "ud2; .byte 'k', 'v', 'm';"
-#define KVM_FEP_LENGTH 5
-static int fep_available = 1;
+static bool fep_available;

#define MSR_NON_EXISTENT 0x474f4f00

@@ -260,13 +259,6 @@ static void guest_code_filter_allow(void)
GUEST_ASSERT(data == 2);
GUEST_ASSERT(guest_exception_count == 0);

- /*
- * Test to see if the instruction emulator is available (ie: the module
- * parameter 'kvm.force_emulation_prefix=1' is set). This instruction
- * will #UD if it isn't available.
- */
- __asm__ __volatile__(KVM_FEP "nop");
-
if (fep_available) {
/* Let userspace know we aren't done. */
GUEST_SYNC(0);
@@ -388,12 +380,6 @@ static void guest_fep_gp_handler(struct ex_regs *regs)
&em_wrmsr_start, &em_wrmsr_end);
}

-static void guest_ud_handler(struct ex_regs *regs)
-{
- fep_available = 0;
- regs->rip += KVM_FEP_LENGTH;
-}
-
static void check_for_guest_assert(struct kvm_vcpu *vcpu)
{
struct ucall uc;
@@ -531,9 +517,11 @@ static void test_msr_filter_allow(void)
{
struct kvm_vcpu *vcpu;
struct kvm_vm *vm;
+ uint64_t cmd;
int rc;

vm = vm_create_with_one_vcpu(&vcpu, guest_code_filter_allow);
+ sync_global_to_guest(vm, fep_available);

rc = kvm_check_cap(KVM_CAP_X86_USER_SPACE_MSR);
TEST_ASSERT(rc, "KVM_CAP_X86_USER_SPACE_MSR is available");
@@ -561,11 +549,11 @@ static void test_msr_filter_allow(void)
run_guest_then_process_wrmsr(vcpu, MSR_NON_EXISTENT);
run_guest_then_process_rdmsr(vcpu, MSR_NON_EXISTENT);

- vm_install_exception_handler(vm, UD_VECTOR, guest_ud_handler);
vcpu_run(vcpu);
- vm_install_exception_handler(vm, UD_VECTOR, NULL);
+ cmd = process_ucall(vcpu);

- if (process_ucall(vcpu) != UCALL_DONE) {
+ if (fep_available) {
+ TEST_ASSERT_EQ(cmd, UCALL_SYNC);
vm_install_exception_handler(vm, GP_VECTOR, guest_fep_gp_handler);

/* Process emulated rdmsr and wrmsr instructions. */
@@ -583,6 +571,7 @@ static void test_msr_filter_allow(void)
/* Confirm the guest completed without issues. */
run_guest_then_process_ucall_done(vcpu);
} else {
+ TEST_ASSERT_EQ(cmd, UCALL_DONE);
printf("To run the instruction emulated tests set the module parameter 'kvm.force_emulation_prefix=1'\n");
}

@@ -804,6 +793,8 @@ static void test_user_exit_msr_flags(void)

int main(int argc, char *argv[])
{
+ fep_available = kvm_is_forced_emulation_enabled();
+
test_msr_filter_allow();

test_msr_filter_deny();
--
2.43.0.472.g3155946c3a-goog


2024-01-09 23:19:48

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH v10 25/29] KVM: selftests: Move KVM_FEP macro into common library header

Move the KVM_FEP definition, a.k.a. the KVM force emulation prefix, into
processor.h so that it can be used for other tests besides the MSR filter
test.

Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/include/x86_64/processor.h | 3 +++
tools/testing/selftests/kvm/x86_64/userspace_msr_exit_test.c | 2 --
2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index d211cea188be..6be365ac2a85 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -23,6 +23,9 @@
extern bool host_cpu_is_intel;
extern bool host_cpu_is_amd;

+/* Forced emulation prefix, used to invoke the emulator unconditionally. */
+#define KVM_FEP "ud2; .byte 'k', 'v', 'm';"
+
#define NMI_VECTOR 0x02

#define X86_EFLAGS_FIXED (1u << 1)
diff --git a/tools/testing/selftests/kvm/x86_64/userspace_msr_exit_test.c b/tools/testing/selftests/kvm/x86_64/userspace_msr_exit_test.c
index 9e12dbc47a72..ab3a8c4f0b86 100644
--- a/tools/testing/selftests/kvm/x86_64/userspace_msr_exit_test.c
+++ b/tools/testing/selftests/kvm/x86_64/userspace_msr_exit_test.c
@@ -12,8 +12,6 @@
#include "kvm_util.h"
#include "vmx.h"

-/* Forced emulation prefix, used to invoke the emulator unconditionally. */
-#define KVM_FEP "ud2; .byte 'k', 'v', 'm';"
static bool fep_available;

#define MSR_NON_EXISTENT 0x474f4f00
--
2.43.0.472.g3155946c3a-goog


2024-01-10 09:21:46

by Andrew Jones

[permalink] [raw]
Subject: Re: [PATCH v10 15/29] KVM: selftests: Add pmu.h and lib/pmu.c for common PMU assets

On Tue, Jan 09, 2024 at 03:02:35PM -0800, Sean Christopherson wrote:
> From: Jinrong Liang <[email protected]>
>
> Add a PMU library for x86 selftests to help eliminate open-coded event
> encodings, and to reduce the amount of copy+paste between PMU selftests.
>
> Use the new common macro definitions in the existing PMU event filter test.
>
> Cc: Aaron Lewis <[email protected]>
> Suggested-by: Sean Christopherson <[email protected]>
> Signed-off-by: Jinrong Liang <[email protected]>
> Co-developed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> tools/testing/selftests/kvm/Makefile | 1 +
> tools/testing/selftests/kvm/include/pmu.h | 97 ++++++++++++
> tools/testing/selftests/kvm/lib/pmu.c | 31 ++++

Shouldn't these new files be

tools/testing/selftests/kvm/include/x86_64/pmu.h
tools/testing/selftests/kvm/lib/x86_64/pmu.c

Thanks,
drew

2024-01-10 13:58:59

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v10 15/29] KVM: selftests: Add pmu.h and lib/pmu.c for common PMU assets

On Wed, Jan 10, 2024, Andrew Jones wrote:
> On Tue, Jan 09, 2024 at 03:02:35PM -0800, Sean Christopherson wrote:
> > From: Jinrong Liang <[email protected]>
> >
> > Add a PMU library for x86 selftests to help eliminate open-coded event
> > encodings, and to reduce the amount of copy+paste between PMU selftests.
> >
> > Use the new common macro definitions in the existing PMU event filter test.
> >
> > Cc: Aaron Lewis <[email protected]>
> > Suggested-by: Sean Christopherson <[email protected]>
> > Signed-off-by: Jinrong Liang <[email protected]>
> > Co-developed-by: Sean Christopherson <[email protected]>
> > Signed-off-by: Sean Christopherson <[email protected]>
> > ---
> > tools/testing/selftests/kvm/Makefile | 1 +
> > tools/testing/selftests/kvm/include/pmu.h | 97 ++++++++++++
> > tools/testing/selftests/kvm/lib/pmu.c | 31 ++++
>
> Shouldn't these new files be
>
> tools/testing/selftests/kvm/include/x86_64/pmu.h
> tools/testing/selftests/kvm/lib/x86_64/pmu.c

/facepalm

I'm glad at least one of us is paying attention. If no one objects to not sending
yet another version, I'll squash the below when applying.

--
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index ab96fc80bfbd..ce58098d80fd 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -23,7 +23,6 @@ LIBKVM += lib/guest_modes.c
LIBKVM += lib/io.c
LIBKVM += lib/kvm_util.c
LIBKVM += lib/memstress.c
-LIBKVM += lib/pmu.c
LIBKVM += lib/guest_sprintf.c
LIBKVM += lib/rbtree.c
LIBKVM += lib/sparsebit.c
@@ -37,6 +36,7 @@ LIBKVM_x86_64 += lib/x86_64/apic.c
LIBKVM_x86_64 += lib/x86_64/handlers.S
LIBKVM_x86_64 += lib/x86_64/hyperv.c
LIBKVM_x86_64 += lib/x86_64/memstress.c
+LIBKVM_x86_64 += lib/x86_64/pmu.c
LIBKVM_x86_64 += lib/x86_64/processor.c
LIBKVM_x86_64 += lib/x86_64/svm.c
LIBKVM_x86_64 += lib/x86_64/ucall.c
diff --git a/tools/testing/selftests/kvm/include/pmu.h b/tools/testing/selftests/kvm/include/x86_64/pmu.h
similarity index 100%
rename from tools/testing/selftests/kvm/include/pmu.h
rename to tools/testing/selftests/kvm/include/x86_64/pmu.h
diff --git a/tools/testing/selftests/kvm/lib/pmu.c b/tools/testing/selftests/kvm/lib/x86_64/pmu.c
similarity index 100%
rename from tools/testing/selftests/kvm/lib/pmu.c
rename to tools/testing/selftests/kvm/lib/x86_64/pmu.c

2024-01-12 03:50:50

by Mi, Dapeng

[permalink] [raw]
Subject: Re: [PATCH v10 11/29] KVM: x86/pmu: Explicitly check for RDPMC of unsupported Intel PMC types


On 1/10/2024 7:02 AM, Sean Christopherson wrote:
> Explicitly check for attempts to read unsupported PMC types instead of
> letting the bounds check fail. Functionally, letting the check fail is
> ok, but it's unnecessarily subtle and does a poor job of documenting the
> architectural behavior that KVM is emulating.
>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/vmx/pmu_intel.c | 21 +++++++++++++++------
> 1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index c37dd3aa056b..b41bdb0a0995 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -26,6 +26,7 @@
> * further confuse things, non-architectural PMUs use bit 31 as a flag for
> * "fast" reads, whereas the "type" is an explicit value.
> */
> +#define INTEL_RDPMC_GP 0
> #define INTEL_RDPMC_FIXED INTEL_PMC_FIXED_RDPMC_BASE
>
> #define INTEL_RDPMC_TYPE_MASK GENMASK(31, 16)
> @@ -89,21 +90,29 @@ static struct kvm_pmc *intel_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
> return NULL;
>
> /*
> - * Fixed PMCs are supported on all architectural PMUs. Note, KVM only
> - * emulates fixed PMCs for PMU v2+, but the flag itself is still valid,
> - * i.e. let RDPMC fail due to accessing a non-existent counter.
> + * General Purpose (GP) PMCs are supported on all PMUs, and fixed PMCs
> + * are supported on all architectural PMUs, i.e. on all virtual PMUs
> + * supported by KVM. Note, KVM only emulates fixed PMCs for PMU v2+,
> + * but the type itself is still valid, i.e. let RDPMC fail due to
> + * accessing a non-existent counter. Reject attempts to read all other
> + * types, which are unknown/unsupported.
> */
> - idx &= ~INTEL_RDPMC_FIXED;
> - if (type == INTEL_RDPMC_FIXED) {
> + switch (type) {
> + case INTEL_RDPMC_FIXED:
> counters = pmu->fixed_counters;
> num_counters = pmu->nr_arch_fixed_counters;
> bitmask = pmu->counter_bitmask[KVM_PMC_FIXED];
> - } else {
> + break;
> + case INTEL_RDPMC_GP:
> counters = pmu->gp_counters;
> num_counters = pmu->nr_arch_gp_counters;
> bitmask = pmu->counter_bitmask[KVM_PMC_GP];
> + break;
> + default:
> + return NULL;
> }
>
> + idx &= INTEL_RDPMC_INDEX_MASK;
> if (idx >= num_counters)
> return NULL;
>
Reviewed-by: Dapeng Mi  <[email protected]>

2024-01-12 09:15:05

by Mi, Dapeng

[permalink] [raw]
Subject: Re: [PATCH v10 16/29] KVM: selftests: Test Intel PMU architectural events on gp counters


On 1/10/2024 7:02 AM, Sean Christopherson wrote:
> From: Jinrong Liang <[email protected]>
>
> Add test cases to verify that Intel's Architectural PMU events work as
> expected when they are available according to guest CPUID. Iterate over a
> range of sane PMU versions, with and without full-width writes enabled,
> and over interesting combinations of lengths/masks for the bit vector that
> enumerates unavailable events.
>
> Test up to vPMU version 5, i.e. the current architectural max. KVM only
> officially supports up to version 2, but the behavior of the counters is
> backwards compatible, i.e. KVM shouldn't do something completely different
> for a higher, architecturally-defined vPMU version. Verify KVM behavior
> against the effective vPMU version, e.g. advertising vPMU 5 when KVM only
> supports vPMU 2 shouldn't magically unlock vPMU 5 features.
>
> According to Intel SDM, the number of architectural events is reported
> through CPUID.0AH:EAX[31:24] and the architectural event x is supported
> if EBX[x]=0 && EAX[31:24]>x.
>
> Handcode the entirety of the measured section so that the test can
> precisely assert on the number of instructions and branches retired.
>
> Co-developed-by: Like Xu <[email protected]>
> Signed-off-by: Like Xu <[email protected]>
> Signed-off-by: Jinrong Liang <[email protected]>
> Co-developed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> tools/testing/selftests/kvm/Makefile | 1 +
> .../selftests/kvm/x86_64/pmu_counters_test.c | 321 ++++++++++++++++++
> 2 files changed, 322 insertions(+)
> create mode 100644 tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
>
> diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> index 479bd85e1c56..ab96fc80bfbd 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -81,6 +81,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test
> TEST_GEN_PROGS_x86_64 += x86_64/monitor_mwait_test
> TEST_GEN_PROGS_x86_64 += x86_64/nested_exceptions_test
> TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
> +TEST_GEN_PROGS_x86_64 += x86_64/pmu_counters_test
> TEST_GEN_PROGS_x86_64 += x86_64/pmu_event_filter_test
> TEST_GEN_PROGS_x86_64 += x86_64/private_mem_conversions_test
> TEST_GEN_PROGS_x86_64 += x86_64/private_mem_kvm_exits_test
> diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> new file mode 100644
> index 000000000000..5b8687bb4639
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> @@ -0,0 +1,321 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2023, Tencent, Inc.
> + */
> +
> +#define _GNU_SOURCE /* for program_invocation_short_name */
> +#include <x86intrin.h>
> +
> +#include "pmu.h"
> +#include "processor.h"
> +
> +/* Number of LOOP instructions for the guest measurement payload. */
> +#define NUM_BRANCHES 10
> +/*
> + * Number of "extra" instructions that will be counted, i.e. the number of
> + * instructions that are needed to set up the loop and then disabled the
> + * counter. 2 MOV, 2 XOR, 1 WRMSR.
> + */
> +#define NUM_EXTRA_INSNS 5
> +#define NUM_INSNS_RETIRED (NUM_BRANCHES + NUM_EXTRA_INSNS)
> +
> +static uint8_t kvm_pmu_version;
> +static bool kvm_has_perf_caps;
> +
> +static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
> + void *guest_code,
> + uint8_t pmu_version,
> + uint64_t perf_capabilities)
> +{
> + struct kvm_vm *vm;
> +
> + vm = vm_create_with_one_vcpu(vcpu, guest_code);
> + vm_init_descriptor_tables(vm);
> + vcpu_init_descriptor_tables(*vcpu);
> +
> + sync_global_to_guest(vm, kvm_pmu_version);
> +
> + /*
> + * Set PERF_CAPABILITIES before PMU version as KVM disallows enabling
> + * features via PERF_CAPABILITIES if the guest doesn't have a vPMU.
> + */
> + if (kvm_has_perf_caps)
> + vcpu_set_msr(*vcpu, MSR_IA32_PERF_CAPABILITIES, perf_capabilities);
> +
> + vcpu_set_cpuid_property(*vcpu, X86_PROPERTY_PMU_VERSION, pmu_version);
> + return vm;
> +}
> +
> +static void run_vcpu(struct kvm_vcpu *vcpu)
> +{
> + struct ucall uc;
> +
> + do {
> + vcpu_run(vcpu);
> + switch (get_ucall(vcpu, &uc)) {
> + case UCALL_SYNC:
> + break;
> + case UCALL_ABORT:
> + REPORT_GUEST_ASSERT(uc);
> + break;
> + case UCALL_PRINTF:
> + pr_info("%s", uc.buffer);
> + break;
> + case UCALL_DONE:
> + break;
> + default:
> + TEST_FAIL("Unexpected ucall: %lu", uc.cmd);
> + }
> + } while (uc.cmd != UCALL_DONE);
> +}
> +
> +static uint8_t guest_get_pmu_version(void)
> +{
> + /*
> + * Return the effective PMU version, i.e. the minimum between what KVM
> + * supports and what is enumerated to the guest. The host deliberately
> + * advertises a PMU version to the guest beyond what is actually
> + * supported by KVM to verify KVM doesn't freak out and do something
> + * bizarre with an architecturally valid, but unsupported, version.
> + */
> + return min_t(uint8_t, kvm_pmu_version, this_cpu_property(X86_PROPERTY_PMU_VERSION));
> +}
> +
> +/*
> + * If an architectural event is supported and guaranteed to generate at least
> + * one "hit, assert that its count is non-zero. If an event isn't supported or
> + * the test can't guarantee the associated action will occur, then all bets are
> + * off regarding the count, i.e. no checks can be done.
> + *
> + * Sanity check that in all cases, the event doesn't count when it's disabled,
> + * and that KVM correctly emulates the write of an arbitrary value.
> + */
> +static void guest_assert_event_count(uint8_t idx,
> + struct kvm_x86_pmu_feature event,
> + uint32_t pmc, uint32_t pmc_msr)
> +{
> + uint64_t count;
> +
> + count = _rdpmc(pmc);
> + if (!this_pmu_has(event))
> + goto sanity_checks;
> +
> + switch (idx) {
> + case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
> + GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
> + break;
> + case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
> + GUEST_ASSERT_EQ(count, NUM_BRANCHES);
> + break;
> + case INTEL_ARCH_CPU_CYCLES_INDEX:
> + case INTEL_ARCH_REFERENCE_CYCLES_INDEX:

Since we already support slots event in below guest_test_arch_event(),
we can add check for INTEL_ARCH_TOPDOWN_SLOTS_INDEX here.


> + GUEST_ASSERT_NE(count, 0);
> + break;
> + default:
> + break;
> + }
> +
> +sanity_checks:
> + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> + GUEST_ASSERT_EQ(_rdpmc(pmc), count);
> +
> + wrmsr(pmc_msr, 0xdead);
> + GUEST_ASSERT_EQ(_rdpmc(pmc), 0xdead);
> +}
> +
> +static void __guest_test_arch_event(uint8_t idx, struct kvm_x86_pmu_feature event,
> + uint32_t pmc, uint32_t pmc_msr,
> + uint32_t ctrl_msr, uint64_t ctrl_msr_value)
> +{
> + wrmsr(pmc_msr, 0);
> +
> + /*
> + * Enable and disable the PMC in a monolithic asm blob to ensure that
> + * the compiler can't insert _any_ code into the measured sequence.
> + * Note, ECX doesn't need to be clobbered as the input value, @pmc_msr,
> + * is restored before the end of the sequence.
> + */
> + __asm__ __volatile__("wrmsr\n\t"
> + "mov $" __stringify(NUM_BRANCHES) ", %%ecx\n\t"
> + "loop .\n\t"
> + "mov %%edi, %%ecx\n\t"
> + "xor %%eax, %%eax\n\t"
> + "xor %%edx, %%edx\n\t"
> + "wrmsr\n\t"
> + :: "a"((uint32_t)ctrl_msr_value),
> + "d"(ctrl_msr_value >> 32),
> + "c"(ctrl_msr), "D"(ctrl_msr)
> + );
> +
> + guest_assert_event_count(idx, event, pmc, pmc_msr);
> +}
> +
> +static void guest_test_arch_event(uint8_t idx)
> +{
> + const struct {
> + struct kvm_x86_pmu_feature gp_event;
> + } intel_event_to_feature[] = {
> + [INTEL_ARCH_CPU_CYCLES_INDEX] = { X86_PMU_FEATURE_CPU_CYCLES },
> + [INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX] = { X86_PMU_FEATURE_INSNS_RETIRED },
> + [INTEL_ARCH_REFERENCE_CYCLES_INDEX] = { X86_PMU_FEATURE_REFERENCE_CYCLES },
> + [INTEL_ARCH_LLC_REFERENCES_INDEX] = { X86_PMU_FEATURE_LLC_REFERENCES },
> + [INTEL_ARCH_LLC_MISSES_INDEX] = { X86_PMU_FEATURE_LLC_MISSES },
> + [INTEL_ARCH_BRANCHES_RETIRED_INDEX] = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED },
> + [INTEL_ARCH_BRANCHES_MISPREDICTED_INDEX] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED },
> + [INTEL_ARCH_TOPDOWN_SLOTS_INDEX] = { X86_PMU_FEATURE_TOPDOWN_SLOTS },
> + };
> +
> + uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
> + uint32_t pmu_version = guest_get_pmu_version();
> + /* PERF_GLOBAL_CTRL exists only for Architectural PMU Version 2+. */
> + bool guest_has_perf_global_ctrl = pmu_version >= 2;
> + struct kvm_x86_pmu_feature gp_event;
> + uint32_t base_pmc_msr;
> + unsigned int i;
> +
> + /* The host side shouldn't invoke this without a guest PMU. */
> + GUEST_ASSERT(pmu_version);
> +
> + if (this_cpu_has(X86_FEATURE_PDCM) &&
> + rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
> + base_pmc_msr = MSR_IA32_PMC0;
> + else
> + base_pmc_msr = MSR_IA32_PERFCTR0;
> +
> + gp_event = intel_event_to_feature[idx].gp_event;
> + GUEST_ASSERT_EQ(idx, gp_event.f.bit);
> +
> + GUEST_ASSERT(nr_gp_counters);
> +
> + for (i = 0; i < nr_gp_counters; i++) {
> + uint64_t eventsel = ARCH_PERFMON_EVENTSEL_OS |
> + ARCH_PERFMON_EVENTSEL_ENABLE |
> + intel_pmu_arch_events[idx];
> +
> + wrmsr(MSR_P6_EVNTSEL0 + i, 0);
> + if (guest_has_perf_global_ctrl)
> + wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(i));
> +
> + __guest_test_arch_event(idx, gp_event, i, base_pmc_msr + i,
> + MSR_P6_EVNTSEL0 + i, eventsel);
> + }
> +}
> +
> +static void guest_test_arch_events(void)
> +{
> + uint8_t i;
> +
> + for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++)
> + guest_test_arch_event(i);
> +
> + GUEST_DONE();
> +}
> +
> +static void test_arch_events(uint8_t pmu_version, uint64_t perf_capabilities,
> + uint8_t length, uint8_t unavailable_mask)
> +{
> + struct kvm_vcpu *vcpu;
> + struct kvm_vm *vm;
> +
> + /* Testing arch events requires a vPMU (there are no negative tests). */
> + if (!pmu_version)
> + return;
> +
> + vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_arch_events,
> + pmu_version, perf_capabilities);
> +
> + vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH,
> + length);
> + vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
> + unavailable_mask);
> +
> + run_vcpu(vcpu);
> +
> + kvm_vm_free(vm);
> +}
> +
> +static void test_intel_counters(void)
> +{
> + uint8_t nr_arch_events = kvm_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
> + uint8_t pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
> + unsigned int i;
> + uint8_t v, j;
> + uint32_t k;
> +
> + const uint64_t perf_caps[] = {
> + 0,
> + PMU_CAP_FW_WRITES,
> + };
> +
> + /*
> + * Test up to PMU v5, which is the current maximum version defined by
> + * Intel, i.e. is the last version that is guaranteed to be backwards
> + * compatible with KVM's existing behavior.
> + */
> + uint8_t max_pmu_version = max_t(typeof(pmu_version), pmu_version, 5);
> +
> + /*
> + * Detect the existence of events that aren't supported by selftests.
> + * This will (obviously) fail any time the kernel adds support for a
> + * new event, but it's worth paying that price to keep the test fresh.
> + */
> + TEST_ASSERT(nr_arch_events <= NR_INTEL_ARCH_EVENTS,
> + "New architectural event(s) detected; please update this test (length = %u, mask = %x)",
> + nr_arch_events, kvm_cpu_property(X86_PROPERTY_PMU_EVENTS_MASK));
> +
> + /*
> + * Force iterating over known arch events regardless of whether or not
> + * KVM/hardware supports a given event.
> + */
> + nr_arch_events = max_t(typeof(nr_arch_events), nr_arch_events, NR_INTEL_ARCH_EVENTS);
> +
> + for (v = 0; v <= max_pmu_version; v++) {
> + for (i = 0; i < ARRAY_SIZE(perf_caps); i++) {
> + if (!kvm_has_perf_caps && perf_caps[i])
> + continue;
> +
> + pr_info("Testing arch events, PMU version %u, perf_caps = %lx\n",
> + v, perf_caps[i]);
> + /*
> + * To keep the total runtime reasonable, test every
> + * possible non-zero, non-reserved bitmap combination
> + * only with the native PMU version and the full bit
> + * vector length.
> + */
> + if (v == pmu_version) {
> + for (k = 1; k < (BIT(nr_arch_events) - 1); k++)
> + test_arch_events(v, perf_caps[i], nr_arch_events, k);
> + }
> + /*
> + * Test single bits for all PMU version and lengths up
> + * the number of events +1 (to verify KVM doesn't do
> + * weird things if the guest length is greater than the
> + * host length). Explicitly test a mask of '0' and all
> + * ones i.e. all events being available and unavailable.
> + */
> + for (j = 0; j <= nr_arch_events + 1; j++) {
> + test_arch_events(v, perf_caps[i], j, 0);
> + test_arch_events(v, perf_caps[i], j, 0xff);
> +
> + for (k = 0; k < nr_arch_events; k++)
> + test_arch_events(v, perf_caps[i], j, BIT(k));
> + }
> + }
> + }
> +}
> +
> +int main(int argc, char *argv[])
> +{
> + TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
> +
> + TEST_REQUIRE(host_cpu_is_intel);
> + TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
> + TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
> +
> + kvm_pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
> + kvm_has_perf_caps = kvm_cpu_has(X86_FEATURE_PDCM);
> +
> + test_intel_counters();
> +
> + return 0;
> +}

2024-01-12 09:17:37

by Mi, Dapeng

[permalink] [raw]
Subject: Re: [PATCH v10 00/29] KVM: x86/pmu: selftests: Fixes and new tests


On 1/10/2024 7:02 AM, Sean Christopherson wrote:
> Knock wood, _this_ is the final of fixes and tests for PMU counters. New
> in v10 is a small refactor to treat FIXED as a value, not a flag, when
> emulating RDPMC. Everything else is the same as v9 (although rebased, but
> there were no conflicts).
>
> v10:
> - Collect review. [Dapeng]
> - Treat the FIXED type in RDPMC's ECX as a value, not a flag. [Jim]
>
> v9:
> - https://lore.kernel.org/all/[email protected]
> - Collect reviews. [Dapeng, Kan]
> - Fix a 63:31 => 63:32 typo in a changelog. [Dapeng]
> - Actually check that forced emulation is enabled before trying to force
> emulation on RDPMC. [Jinrong]
> - Fix the aformentioned priority inversion issue.
> - Completely drop "support" for fast RDPMC, in quotes because KVM doesn't
> actually support RDPMC for non-architectural PMUs. I had left the code
> in v8 because I didn't fully grok what the early emulator check was
> doing, i.e. wasn't 100% confident it was dead code.
>
> v8:
> - https://lore.kernel.org/all/[email protected]
> - Collect reviews. [Jim, Dapeng, Kan]
> - Tweak names for the RDPMC flags in the selftests #defines.
> - Get the event selectors used to virtualize fixed straight from perf
> instead of hardcoding the (wrong) selectors in KVM. [Kan]
> - Rename an "eventsel" field to "event" for a patch that gets blasted
> away in the end anyways. [Jim]
> - Add patches to fix RDPMC emulation and to test the behavior on Intel.
> I spot tested on AMD and spent ~30 minutes trying to squeeze in the
> bare minimum AMD support, but the PMU implementations between Intel
> and AMD are juuuust different enough to make adding AMD support non-
> trivial, and this series is already way too big.
>
> v7:
> - https://lore.kernel.org/all/[email protected]
> - Drop patches that unnecessarily sanitized supported CPUID. [Jim]
> - Purge the array of architectural event encodings. [Jim, Dapeng]
> - Clean up pmu.h to remove useless macros, and make it easier to use the
> new macros. [Jim]
> - Port more of pmu_event_filter_test.c to pmu.h macros. [Jim, Jinrong]
> - Clean up test comments and error messages. [Jim]
> - Sanity check the value provided to vcpu_set_cpuid_property(). [Jim]
>
> v6:
> - https://lore.kernel.org/all/[email protected]
> - Test LLC references/misses with CFLUSH{OPT}. [Jim]
> - Make the tests play nice without PERF_CAPABILITIES. [Mingwei]
> - Don't squash eventsels that happen to match an unsupported arch event. [Kan]
> - Test PMC counters with forced emulation (don't ask how long it took me to
> figure out how to read integer module params).
>
> v5: https://lore.kernel.org/all/[email protected]
> v4: https://lore.kernel.org/all/[email protected]
> v3: https://lore.kernel.org/kvm/[email protected]
>
> Jinrong Liang (7):
> KVM: selftests: Add vcpu_set_cpuid_property() to set properties
> KVM: selftests: Add pmu.h and lib/pmu.c for common PMU assets
> KVM: selftests: Test Intel PMU architectural events on gp counters
> KVM: selftests: Test Intel PMU architectural events on fixed counters
> KVM: selftests: Test consistency of CPUID with num of gp counters
> KVM: selftests: Test consistency of CPUID with num of fixed counters
> KVM: selftests: Add functional test for Intel's fixed PMU counters
>
> Sean Christopherson (22):
> KVM: x86/pmu: Always treat Fixed counters as available when supported
> KVM: x86/pmu: Allow programming events that match unsupported arch
> events
> KVM: x86/pmu: Remove KVM's enumeration of Intel's architectural
> encodings
> KVM: x86/pmu: Setup fixed counters' eventsel during PMU initialization
> KVM: x86/pmu: Get eventsel for fixed counters from perf
> KVM: x86/pmu: Don't ignore bits 31:30 for RDPMC index on AMD
> KVM: x86/pmu: Prioritize VMX interception over #GP on RDPMC due to bad
> index
> KVM: x86/pmu: Apply "fast" RDPMC only to Intel PMUs
> KVM: x86/pmu: Disallow "fast" RDPMC for architectural Intel PMUs
> KVM: x86/pmu: Treat "fixed" PMU type in RDPMC as index as a value, not
> flag
> KVM: x86/pmu: Explicitly check for RDPMC of unsupported Intel PMC
> types
> KVM: selftests: Drop the "name" param from KVM_X86_PMU_FEATURE()
> KVM: selftests: Extend {kvm,this}_pmu_has() to support fixed counters
> KVM: selftests: Expand PMU counters test to verify LLC events
> KVM: selftests: Add a helper to query if the PMU module param is
> enabled
> KVM: selftests: Add helpers to read integer module params
> KVM: selftests: Query module param to detect FEP in MSR filtering test
> KVM: selftests: Move KVM_FEP macro into common library header
> KVM: selftests: Test PMC virtualization with forced emulation
> KVM: selftests: Add a forced emulation variation of KVM_ASM_SAFE()
> KVM: selftests: Add helpers for safe and safe+forced RDMSR, RDPMC, and
> XGETBV
> KVM: selftests: Extend PMU counters test to validate RDPMC after WRMSR
>
> arch/x86/include/asm/kvm-x86-pmu-ops.h | 3 +-
> arch/x86/kvm/emulate.c | 2 +-
> arch/x86/kvm/kvm_emulate.h | 2 +-
> arch/x86/kvm/pmu.c | 20 +-
> arch/x86/kvm/pmu.h | 5 +-
> arch/x86/kvm/svm/pmu.c | 17 +-
> arch/x86/kvm/vmx/pmu_intel.c | 178 +++--
> arch/x86/kvm/x86.c | 9 +-
> tools/testing/selftests/kvm/Makefile | 2 +
> .../selftests/kvm/include/kvm_util_base.h | 4 +
> tools/testing/selftests/kvm/include/pmu.h | 97 +++
> .../selftests/kvm/include/x86_64/processor.h | 148 ++++-
> tools/testing/selftests/kvm/lib/kvm_util.c | 62 +-
> tools/testing/selftests/kvm/lib/pmu.c | 31 +
> .../selftests/kvm/lib/x86_64/processor.c | 15 +-
> .../selftests/kvm/x86_64/pmu_counters_test.c | 617 ++++++++++++++++++
> .../kvm/x86_64/pmu_event_filter_test.c | 143 ++--
> .../smaller_maxphyaddr_emulation_test.c | 2 +-
> .../kvm/x86_64/userspace_msr_exit_test.c | 29 +-
> .../selftests/kvm/x86_64/vmx_pmu_caps_test.c | 2 +-
> 20 files changed, 1097 insertions(+), 291 deletions(-)
> create mode 100644 tools/testing/selftests/kvm/include/pmu.h
> create mode 100644 tools/testing/selftests/kvm/lib/pmu.c
> create mode 100644 tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
>
>
> base-commit: 1c6d984f523f67ecfad1083bb04c55d91977bb15

pmu_counters_test passes on Intel Sapphire Rapids platform.

Tested-by:  Dapeng Mi <[email protected]>


2024-01-12 21:38:13

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v10 16/29] KVM: selftests: Test Intel PMU architectural events on gp counters

On Fri, Jan 12, 2024, Dapeng Mi wrote:
>
> On 1/10/2024 7:02 AM, Sean Christopherson wrote:
> > +/*
> > + * If an architectural event is supported and guaranteed to generate at least
> > + * one "hit, assert that its count is non-zero. If an event isn't supported or
> > + * the test can't guarantee the associated action will occur, then all bets are
> > + * off regarding the count, i.e. no checks can be done.
> > + *
> > + * Sanity check that in all cases, the event doesn't count when it's disabled,
> > + * and that KVM correctly emulates the write of an arbitrary value.
> > + */
> > +static void guest_assert_event_count(uint8_t idx,
> > + struct kvm_x86_pmu_feature event,
> > + uint32_t pmc, uint32_t pmc_msr)
> > +{
> > + uint64_t count;
> > +
> > + count = _rdpmc(pmc);
> > + if (!this_pmu_has(event))
> > + goto sanity_checks;
> > +
> > + switch (idx) {
> > + case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
> > + GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
> > + break;
> > + case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
> > + GUEST_ASSERT_EQ(count, NUM_BRANCHES);
> > + break;
> > + case INTEL_ARCH_CPU_CYCLES_INDEX:
> > + case INTEL_ARCH_REFERENCE_CYCLES_INDEX:
>
> Since we already support slots event in below guest_test_arch_event(), we
> can add check for INTEL_ARCH_TOPDOWN_SLOTS_INDEX here.

Can that actually be tested at this point, since KVM doesn't support
X86_PMU_FEATURE_TOPDOWN_SLOTS, i.e. this_pmu_has() above should always fail, no?

I'm hesitant to add an assertion of any kind without the ability to actually test
the code.

2024-01-15 02:03:26

by Mi, Dapeng

[permalink] [raw]
Subject: Re: [PATCH v10 16/29] KVM: selftests: Test Intel PMU architectural events on gp counters


On 1/13/2024 5:37 AM, Sean Christopherson wrote:
> On Fri, Jan 12, 2024, Dapeng Mi wrote:
>> On 1/10/2024 7:02 AM, Sean Christopherson wrote:
>>> +/*
>>> + * If an architectural event is supported and guaranteed to generate at least
>>> + * one "hit, assert that its count is non-zero. If an event isn't supported or
>>> + * the test can't guarantee the associated action will occur, then all bets are
>>> + * off regarding the count, i.e. no checks can be done.
>>> + *
>>> + * Sanity check that in all cases, the event doesn't count when it's disabled,
>>> + * and that KVM correctly emulates the write of an arbitrary value.
>>> + */
>>> +static void guest_assert_event_count(uint8_t idx,
>>> + struct kvm_x86_pmu_feature event,
>>> + uint32_t pmc, uint32_t pmc_msr)
>>> +{
>>> + uint64_t count;
>>> +
>>> + count = _rdpmc(pmc);
>>> + if (!this_pmu_has(event))
>>> + goto sanity_checks;
>>> +
>>> + switch (idx) {
>>> + case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
>>> + GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
>>> + break;
>>> + case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
>>> + GUEST_ASSERT_EQ(count, NUM_BRANCHES);
>>> + break;
>>> + case INTEL_ARCH_CPU_CYCLES_INDEX:
>>> + case INTEL_ARCH_REFERENCE_CYCLES_INDEX:
>> Since we already support slots event in below guest_test_arch_event(), we
>> can add check for INTEL_ARCH_TOPDOWN_SLOTS_INDEX here.
> Can that actually be tested at this point, since KVM doesn't support
> X86_PMU_FEATURE_TOPDOWN_SLOTS, i.e. this_pmu_has() above should always fail, no?

I suppose X86_PMU_FEATURE_TOPDOWN_SLOTS has been supported in KVM.  The
following output comes from a guest with latest kvm-x86 code on the
Sapphire Rapids platform.

sudo cpuid -l 0xa
CPU 0:
   Architecture Performance Monitoring Features (0xa):
      version ID                               = 0x2 (2)
      number of counters per logical processor = 0x8 (8)
      bit width of counter                     = 0x30 (48)
      length of EBX bit vector                 = 0x8 (8)
      core cycle event                         = available
      instruction retired event                = available
      reference cycles event                   = available
      last-level cache ref event               = available
      last-level cache miss event              = available
      branch inst retired event                = available
      branch mispred retired event             = available
      top-down slots event                     = available

Current KVM doesn't support fixed counter 3 and the pseudo slots event yet,
but the architectural slots event is supported and can be programmed on a
GP counter. The current test code can cover this case, so I think we'd
better add the check for the slots count.
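
Something along the lines of the below (hypothetical, not yet posted), i.e.
only asserting that the slots count is non-zero, since the exact count for
the measured loop isn't architecturally defined:

  case INTEL_ARCH_CPU_CYCLES_INDEX:
  case INTEL_ARCH_REFERENCE_CYCLES_INDEX:
  case INTEL_ARCH_TOPDOWN_SLOTS_INDEX:  /* proposed addition */
    GUEST_ASSERT_NE(count, 0);
    break;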


>
> I'm hesitant to add an assertion of any king without the ability to actually test
> the code.

2024-01-30 23:28:20

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v10 16/29] KVM: selftests: Test Intel PMU architectural events on gp counters

On Mon, Jan 15, 2024, Dapeng Mi wrote:
>
> On 1/13/2024 5:37 AM, Sean Christopherson wrote:
> > On Fri, Jan 12, 2024, Dapeng Mi wrote:
> > > On 1/10/2024 7:02 AM, Sean Christopherson wrote:
> > > > +/*
> > > > + * If an architectural event is supported and guaranteed to generate at least
> > > > + * one "hit, assert that its count is non-zero. If an event isn't supported or
> > > > + * the test can't guarantee the associated action will occur, then all bets are
> > > > + * off regarding the count, i.e. no checks can be done.
> > > > + *
> > > > + * Sanity check that in all cases, the event doesn't count when it's disabled,
> > > > + * and that KVM correctly emulates the write of an arbitrary value.
> > > > + */
> > > > +static void guest_assert_event_count(uint8_t idx,
> > > > + struct kvm_x86_pmu_feature event,
> > > > + uint32_t pmc, uint32_t pmc_msr)
> > > > +{
> > > > + uint64_t count;
> > > > +
> > > > + count = _rdpmc(pmc);
> > > > + if (!this_pmu_has(event))
> > > > + goto sanity_checks;
> > > > +
> > > > + switch (idx) {
> > > > + case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
> > > > + GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
> > > > + break;
> > > > + case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
> > > > + GUEST_ASSERT_EQ(count, NUM_BRANCHES);
> > > > + break;
> > > > + case INTEL_ARCH_CPU_CYCLES_INDEX:
> > > > + case INTEL_ARCH_REFERENCE_CYCLES_INDEX:
> > > Since we already support slots event in below guest_test_arch_event(), we
> > > can add check for INTEL_ARCH_TOPDOWN_SLOTS_INDEX here.
> > Can that actually be tested at this point, since KVM doesn't support
> > X86_PMU_FEATURE_TOPDOWN_SLOTS, i.e. this_pmu_has() above should always fail, no?
>
> I suppose X86_PMU_FEATURE_TOPDOWN_SLOTS has been supported in KVM.  The
> following output comes from a guest with latest kvm-x86 code on the Sapphire
> Rapids platform.
>
> sudo cpuid -l 0xa
> CPU 0:
>    Architecture Performance Monitoring Features (0xa):
>       version ID                               = 0x2 (2)
>       number of counters per logical processor = 0x8 (8)
>       bit width of counter                     = 0x30 (48)
>       length of EBX bit vector                 = 0x8 (8)
>       core cycle event                         = available
>       instruction retired event                = available
>       reference cycles event                   = available
>       last-level cache ref event               = available
>       last-level cache miss event              = available
>       branch inst retired event                = available
>       branch mispred retired event             = available
>       top-down slots event                     = available
>
> Current KVM doesn't support fixed counter 3 and pseudo slots event yet, but
> the architectural slots event is supported and can be programed on a GP
> counter. Current test code can cover this case, so I think we'd better add
> the check for the slots count.

Can you submit a patch on top, with a changelog that includes justification
that explains exactly what assertions can be made on the top-down slots event
given the "workload" being measured? I'm definitely not opposed to adding coverage
for top-down slots, but at this point, I don't want to respin this series, nor do
I want to make that change when applying on the fly.

2024-01-31 01:01:04

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v10 00/29] KVM: x86/pmu: selftests: Fixes and new tests

On Tue, 09 Jan 2024 15:02:20 -0800, Sean Christopherson wrote:
> Knock wood, _this_ is the final of fixes and tests for PMU counters. New
> in v10 is a small refactor to treat FIXED as a value, not a flag, when
> emulating RDPMC. Everything else is the same as v9 (although rebased, but
> there were no conflicts).
>
> v10:
> - Collect review. [Dapeng]
> - Treat the FIXED type in RDPMC's ECX as a value, not a flag. [Jim]
>
> [...]

Applied to kvm-x86 pmu, with the fix for the file goof Andrew pointed out.

[01/29] KVM: x86/pmu: Always treat Fixed counters as available when supported
https://github.com/kvm-x86/linux/commit/5eb7fcbdea63
[02/29] KVM: x86/pmu: Allow programming events that match unsupported arch events
https://github.com/kvm-x86/linux/commit/cbbd1aa89139
[03/29] KVM: x86/pmu: Remove KVM's enumeration of Intel's architectural encodings
https://github.com/kvm-x86/linux/commit/db9e008a0f37
[04/29] KVM: x86/pmu: Setup fixed counters' eventsel during PMU initialization
https://github.com/kvm-x86/linux/commit/61bb2ad795a7
[05/29] KVM: x86/pmu: Get eventsel for fixed counters from perf
https://github.com/kvm-x86/linux/commit/7a277c22412c
[06/29] KVM: x86/pmu: Don't ignore bits 31:30 for RDPMC index on AMD
https://github.com/kvm-x86/linux/commit/ecb490770ad4
[07/29] KVM: x86/pmu: Prioritize VMX interception over #GP on RDPMC due to bad index
https://github.com/kvm-x86/linux/commit/7bb7fce13601
[08/29] KVM: x86/pmu: Apply "fast" RDPMC only to Intel PMUs
https://github.com/kvm-x86/linux/commit/d652981db08f
[09/29] KVM: x86/pmu: Disallow "fast" RDPMC for architectural Intel PMUs
https://github.com/kvm-x86/linux/commit/5728a4a0ea79
[10/29] KVM: x86/pmu: Treat "fixed" PMU type in RDPMC as index as a value, not flag
https://github.com/kvm-x86/linux/commit/7a0fc734c20d
[11/29] KVM: x86/pmu: Explicitly check for RDPMC of unsupported Intel PMC types
https://github.com/kvm-x86/linux/commit/a634c76b2c1a
[12/29] KVM: selftests: Add vcpu_set_cpuid_property() to set properties
https://github.com/kvm-x86/linux/commit/d7e68738e1aa
[13/29] KVM: selftests: Drop the "name" param from KVM_X86_PMU_FEATURE()
https://github.com/kvm-x86/linux/commit/ff76d7712510
[14/29] KVM: selftests: Extend {kvm,this}_pmu_has() to support fixed counters
https://github.com/kvm-x86/linux/commit/370d53632289
[15/29] KVM: selftests: Add pmu.h and lib/pmu.c for common PMU assets
https://github.com/kvm-x86/linux/commit/e6faa0497057
[16/29] KVM: selftests: Test Intel PMU architectural events on gp counters
https://github.com/kvm-x86/linux/commit/4f1bd6b16074
[17/29] KVM: selftests: Test Intel PMU architectural events on fixed counters
https://github.com/kvm-x86/linux/commit/3e26b825f87d
[18/29] KVM: selftests: Test consistency of CPUID with num of gp counters
https://github.com/kvm-x86/linux/commit/7137cf751b9b
[19/29] KVM: selftests: Test consistency of CPUID with num of fixed counters
https://github.com/kvm-x86/linux/commit/c7d7c76ecf78
[20/29] KVM: selftests: Add functional test for Intel's fixed PMU counters
https://github.com/kvm-x86/linux/commit/787071fd0262
[21/29] KVM: selftests: Expand PMU counters test to verify LLC events
https://github.com/kvm-x86/linux/commit/b55e7adf633a
[22/29] KVM: selftests: Add a helper to query if the PMU module param is enabled
https://github.com/kvm-x86/linux/commit/c85e986716b0
[23/29] KVM: selftests: Add helpers to read integer module params
https://github.com/kvm-x86/linux/commit/45e4755c39fc
[24/29] KVM: selftests: Query module param to detect FEP in MSR filtering test
https://github.com/kvm-x86/linux/commit/0326cc6b02c8
[25/29] KVM: selftests: Move KVM_FEP macro into common library header
https://github.com/kvm-x86/linux/commit/00856e17da73
[26/29] KVM: selftests: Test PMC virtualization with forced emulation
https://github.com/kvm-x86/linux/commit/cd34fd8c758e
[27/29] KVM: selftests: Add a forced emulation variation of KVM_ASM_SAFE()
https://github.com/kvm-x86/linux/commit/ab3b6a7de8df
[28/29] KVM: selftests: Add helpers for safe and safe+forced RDMSR, RDPMC, and XGETBV
https://github.com/kvm-x86/linux/commit/b5e66df34cb0
[29/29] KVM: selftests: Extend PMU counters test to validate RDPMC after WRMSR
https://github.com/kvm-x86/linux/commit/a8a37f555684

--
https://github.com/kvm-x86/linux/tree/next

2024-01-31 05:50:13

by Mi, Dapeng

[permalink] [raw]
Subject: Re: [PATCH v10 16/29] KVM: selftests: Test Intel PMU architectural events on gp counters


On 1/31/2024 7:27 AM, Sean Christopherson wrote:
> On Mon, Jan 15, 2024, Dapeng Mi wrote:
>> On 1/13/2024 5:37 AM, Sean Christopherson wrote:
>>> On Fri, Jan 12, 2024, Dapeng Mi wrote:
>>>> On 1/10/2024 7:02 AM, Sean Christopherson wrote:
>>>>> +/*
>>>>> + * If an architectural event is supported and guaranteed to generate at least
>>>>> + * one "hit, assert that its count is non-zero. If an event isn't supported or
>>>>> + * the test can't guarantee the associated action will occur, then all bets are
>>>>> + * off regarding the count, i.e. no checks can be done.
>>>>> + *
>>>>> + * Sanity check that in all cases, the event doesn't count when it's disabled,
>>>>> + * and that KVM correctly emulates the write of an arbitrary value.
>>>>> + */
>>>>> +static void guest_assert_event_count(uint8_t idx,
>>>>> +				      struct kvm_x86_pmu_feature event,
>>>>> +				      uint32_t pmc, uint32_t pmc_msr)
>>>>> +{
>>>>> +	uint64_t count;
>>>>> +
>>>>> +	count = _rdpmc(pmc);
>>>>> +	if (!this_pmu_has(event))
>>>>> +		goto sanity_checks;
>>>>> +
>>>>> +	switch (idx) {
>>>>> +	case INTEL_ARCH_INSTRUCTIONS_RETIRED_INDEX:
>>>>> +		GUEST_ASSERT_EQ(count, NUM_INSNS_RETIRED);
>>>>> +		break;
>>>>> +	case INTEL_ARCH_BRANCHES_RETIRED_INDEX:
>>>>> +		GUEST_ASSERT_EQ(count, NUM_BRANCHES);
>>>>> +		break;
>>>>> +	case INTEL_ARCH_CPU_CYCLES_INDEX:
>>>>> +	case INTEL_ARCH_REFERENCE_CYCLES_INDEX:
>>>> Since the slots event is already supported in guest_test_arch_event() below,
>>>> we can add a check for INTEL_ARCH_TOPDOWN_SLOTS_INDEX here.
>>> Can that actually be tested at this point, since KVM doesn't support
>>> X86_PMU_FEATURE_TOPDOWN_SLOTS, i.e. this_pmu_has() above should always fail, no?
>> I believe X86_PMU_FEATURE_TOPDOWN_SLOTS is already supported in KVM.  The
>> following output comes from a guest running the latest kvm-x86 code on a
>> Sapphire Rapids platform.
>>
>> sudo cpuid -l 0xa
>> CPU 0:
>>    Architecture Performance Monitoring Features (0xa):
>>       version ID                               = 0x2 (2)
>>       number of counters per logical processor = 0x8 (8)
>>       bit width of counter                     = 0x30 (48)
>>       length of EBX bit vector                 = 0x8 (8)
>>       core cycle event                         = available
>>       instruction retired event                = available
>>       reference cycles event                   = available
>>       last-level cache ref event               = available
>>       last-level cache miss event              = available
>>       branch inst retired event                = available
>>       branch mispred retired event             = available
>>       top-down slots event                     = available
>>
>> Current KVM doesn't support fixed counter 3 and the pseudo slots event yet,
>> but the architectural slots event is supported and can be programmed on a GP
>> counter. The current test code already covers this case, so I think we should
>> add a check for the slots count.
> Can you submit a patch on top, with a changelog that justifies exactly what
> assertions can be made on the top-down slots event given the "workload" being
> measured? I'm definitely not opposed to adding coverage for top-down slots, but
> at this point, I don't want to respin this series, nor do I want to make that
> change on the fly when applying.

Yeah, I'd be glad to submit a patch for this. :)
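
For reference, a rough sketch of the kind of check I have in mind, as an extra
case in the switch in guest_assert_event_count() (the exact assertion is only
an assumption at this point and will need justification against the measured
workload; for an arbitrary workload, about the only guarantee for top-down
slots is a non-zero count, similar to what can be asserted for the cycles
events):

	case INTEL_ARCH_TOPDOWN_SLOTS_INDEX:
		/* Sketch only: the count can't be pinned to an exact value. */
		GUEST_ASSERT_NE(count, 0);
		break;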

BTW, I have a patch series with bug fixes and improvements for the
kvm-unit-tests/pmu test (some of the improvement ideas come from this patchset).

https://lore.kernel.org/kvm/[email protected]/

Could you please kindly review them? Thanks.


2024-01-31 15:31:55

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v10 16/29] KVM: selftests: Test Intel PMU architectural events on gp counters

On Wed, Jan 31, 2024, Dapeng Mi wrote:
> BTW, I have a patch series with bug fixes and improvements for the
> kvm-unit-tests/pmu test (some of the improvement ideas come from this patchset).
>
> https://lore.kernel.org/kvm/[email protected]/
>
> Could you please kindly review them? Thanks.

Unfortunately, that's probably not going to happen anytime soon. I am overloaded
with KVM/kernel reviews as it is, so I don't expect to have cycles for KUT reviews
in the near future.

And for PMU tests in particular, I really want to get selftests to the point where
the PMU selftests are a superset of the PMU KUT tests so that we can drop the KUT
versions. In short, reviewing PMU KUT changes is very far down my todo list.

2024-02-01 02:53:58

by Mi, Dapeng

[permalink] [raw]
Subject: Re: [PATCH v10 16/29] KVM: selftests: Test Intel PMU architectural events on gp counters


On 1/31/2024 11:31 PM, Sean Christopherson wrote:
> On Wed, Jan 31, 2024, Dapeng Mi wrote:
>> BTW, I have a patch series with bug fixes and improvements for the
>> kvm-unit-tests/pmu test (some of the improvement ideas come from this patchset).
>>
>> https://lore.kernel.org/kvm/[email protected]/
>>
>> Could you please kindly review them? Thanks.
> Unfortunately, that's probably not going to happen anytime soon. I am overloaded
> with KVM/kernel reviews as it is, so I don't expect to have cycles for KUT reviews
> in the near future.
>
> And for PMU tests in particular, I really want to get selftests to the point where
> the PMU selftests are a superset of the PMU KUT tests so that we can drop the KUT
> versions. In short, reviewing PMU KUT changes is very far down my todo list.

Yeah, it would be good and convenient to have a one-stop test suite for
verifying PMU functionality, so that we don't need to verify the PMU with
the KUT/PMU test and the selftests/pmu test independently each time.

That said, the KUT/PMU test is still broadly used by many users to verify
KVM PMU features, and the patchset mainly fixes several KUT/PMU bugs that
cause false alarms. If these bugs aren't fixed, they will mislead users.

Patch reviewing takes a lot of time and effort; thanks a lot for reviewing
our patches in the past, and I hope you can review this patchset when you
have time.