2023-10-24 00:26:55

by Sean Christopherson

Subject: [PATCH v5 00/13] KVM: x86/pmu: selftests: Fixes and new tests

This is effectively v5 of Jinrong's series to add more PMU selftests, with
a focus on architectural events, fixed counters, and CPUID configurations.
I reworked things quite a bit, but the core concepts and what's being tested
are more or less unchanged.

The first three patches are minor fixes for KVM's handling of fixed
counters. Patch 3 deals with an area in the PMU architecture that is
somewhat open to interpretation, i.e. could probably use a bit of discussion
to make sure we're all on the same page.

Jinrong and/or Like, please double check and rerun everything; my confidence
level with PMU stuff is still quite low relative to the rest of KVM.

v4: https://lore.kernel.org/all/[email protected]
v3: https://lore.kernel.org/kvm/[email protected]

Jinrong Liang (7):
KVM: selftests: Add vcpu_set_cpuid_property() to set properties
KVM: selftests: Add pmu.h and lib/pmu.c for common PMU assets
KVM: selftests: Test Intel PMU architectural events on gp counters
KVM: selftests: Test Intel PMU architectural events on fixed counters
KVM: selftests: Test consistency of CPUID with num of gp counters
KVM: selftests: Test consistency of CPUID with num of fixed counters
KVM: selftests: Add functional test for Intel's fixed PMU counters

Sean Christopherson (6):
KVM: x86/pmu: Don't allow exposing unsupported architectural events
KVM: x86/pmu: Don't enumerate support for fixed counters KVM can't
virtualize
KVM: x86/pmu: Always treat Fixed counters as available when supported
KVM: selftests: Drop the "name" param from KVM_X86_PMU_FEATURE()
KVM: selftests: Extend {kvm,this}_pmu_has() to support fixed counters
KVM: selftests: Extend PMU counters test to permute on vPMU version

arch/x86/kvm/pmu.h | 4 +
arch/x86/kvm/vmx/pmu_intel.c | 48 +-
tools/testing/selftests/kvm/Makefile | 2 +
tools/testing/selftests/kvm/include/pmu.h | 84 ++++
.../selftests/kvm/include/x86_64/processor.h | 67 ++-
tools/testing/selftests/kvm/lib/pmu.c | 28 ++
.../selftests/kvm/lib/x86_64/processor.c | 12 +-
.../selftests/kvm/x86_64/pmu_counters_test.c | 438 ++++++++++++++++++
.../kvm/x86_64/pmu_event_filter_test.c | 32 +-
.../smaller_maxphyaddr_emulation_test.c | 2 +-
10 files changed, 669 insertions(+), 48 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/pmu.h
create mode 100644 tools/testing/selftests/kvm/lib/pmu.c
create mode 100644 tools/testing/selftests/kvm/x86_64/pmu_counters_test.c


base-commit: c076acf10c78c0d7e1aa50670e9cc4c91e8d59b4
--
2.42.0.758.gaed0368e0e-goog


2023-10-24 00:27:04

by Sean Christopherson

Subject: [PATCH v5 05/13] KVM: selftests: Drop the "name" param from KVM_X86_PMU_FEATURE()

Drop the "name" parameter from KVM_X86_PMU_FEATURE(); it's unused, and
the name is redundant with the macro, i.e. it's truly useless.

Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/include/x86_64/processor.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index a01931f7d954..2d9771151dd9 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -289,7 +289,7 @@ struct kvm_x86_cpu_property {
struct kvm_x86_pmu_feature {
struct kvm_x86_cpu_feature anti_feature;
};
-#define KVM_X86_PMU_FEATURE(name, __bit) \
+#define KVM_X86_PMU_FEATURE(__bit) \
({ \
struct kvm_x86_pmu_feature feature = { \
.anti_feature = KVM_X86_CPU_FEATURE(0xa, 0, EBX, __bit), \
@@ -298,7 +298,7 @@ struct kvm_x86_pmu_feature {
feature; \
})

-#define X86_PMU_FEATURE_BRANCH_INSNS_RETIRED KVM_X86_PMU_FEATURE(BRANCH_INSNS_RETIRED, 5)
+#define X86_PMU_FEATURE_BRANCH_INSNS_RETIRED KVM_X86_PMU_FEATURE(5)

static inline unsigned int x86_family(unsigned int eax)
{
--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:27:13

by Sean Christopherson

Subject: [PATCH v5 06/13] KVM: selftests: Extend {kvm,this}_pmu_has() to support fixed counters

Extend the kvm_x86_pmu_feature framework to allow querying for fixed
counters via {kvm,this}_pmu_has(). Like architectural events, checking
for a fixed counter annoyingly requires checking multiple CPUID fields, as
a fixed counter exists if:

FxCtr[i]_is_supported := ECX[i] || (EDX[4:0] > i);
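
A minimal sketch of that check, assuming a hypothetical cpuid() helper that
returns the raw CPUID.0xA registers:

  /* Illustration only: fixed counter 'i' per CPUID.0xA ECX and EDX[4:0]. */
  static bool fixed_counter_is_supported(uint8_t i)
  {
          uint32_t eax, ebx, ecx, edx;

          cpuid(0xa, &eax, &ebx, &ecx, &edx);

          /* Supported if enumerated in the bitmask OR i is below the count. */
          return (ecx & (1u << i)) || ((edx & 0x1f) > i);
  }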

Note, KVM currently doesn't actually support exposing fixed counters via
the bitmask, but that will hopefully change sooner than later, and Intel's
SDM explicitly "recommends" checking both the number of counters and the
mask.

Rename the intermediate "anti_feature" field to simply 'f' since the fixed
counter bitmask (thankfully) doesn't have reversed polarity like the
architectural events bitmask.

Note, ideally the helpers would use BUILD_BUG_ON() to assert on the
incoming register, but the expected usage in PMU tests can't guarantee the
inputs are compile-time constants.

Opportunistically define macros for all of the architectural events and
fixed counters that KVM currently supports.

Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/include/x86_64/processor.h | 63 +++++++++++++------
1 file changed, 45 insertions(+), 18 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 2d9771151dd9..b103c462701b 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -281,24 +281,39 @@ struct kvm_x86_cpu_property {
* that indicates the feature is _not_ supported, and a property that states
* the length of the bit mask of unsupported features. A feature is supported
* if the size of the bit mask is larger than the "unavailable" bit, and said
- * bit is not set.
+ * bit is not set. Fixed counters also have bizarre enumeration, but inverted
+ * from arch events for general purpose counters. Fixed counters are supported
+ * if a feature flag is set **OR** the total number of fixed counters is
+ * greater than the index of the counter.
*
- * Wrap the "unavailable" feature to simplify checking whether or not a given
- * architectural event is supported.
+ * Wrap the events for general purpose and fixed counters to simplify checking
+ * whether or not a given architectural event is supported.
*/
struct kvm_x86_pmu_feature {
- struct kvm_x86_cpu_feature anti_feature;
+ struct kvm_x86_cpu_feature f;
};
-#define KVM_X86_PMU_FEATURE(__bit) \
-({ \
- struct kvm_x86_pmu_feature feature = { \
- .anti_feature = KVM_X86_CPU_FEATURE(0xa, 0, EBX, __bit), \
- }; \
- \
- feature; \
+#define KVM_X86_PMU_FEATURE(__reg, __bit) \
+({ \
+ struct kvm_x86_pmu_feature feature = { \
+ .f = KVM_X86_CPU_FEATURE(0xa, 0, __reg, __bit), \
+ }; \
+ \
+ kvm_static_assert(KVM_CPUID_##__reg == KVM_CPUID_EBX || \
+ KVM_CPUID_##__reg == KVM_CPUID_ECX); \
+ feature; \
})

-#define X86_PMU_FEATURE_BRANCH_INSNS_RETIRED KVM_X86_PMU_FEATURE(5)
+#define X86_PMU_FEATURE_CPU_CYCLES KVM_X86_PMU_FEATURE(EBX, 0)
+#define X86_PMU_FEATURE_INSNS_RETIRED KVM_X86_PMU_FEATURE(EBX, 1)
+#define X86_PMU_FEATURE_REFERENCE_CYCLES KVM_X86_PMU_FEATURE(EBX, 2)
+#define X86_PMU_FEATURE_LLC_REFERENCES KVM_X86_PMU_FEATURE(EBX, 3)
+#define X86_PMU_FEATURE_LLC_MISSES KVM_X86_PMU_FEATURE(EBX, 4)
+#define X86_PMU_FEATURE_BRANCH_INSNS_RETIRED KVM_X86_PMU_FEATURE(EBX, 5)
+#define X86_PMU_FEATURE_BRANCHES_MISPREDICTED KVM_X86_PMU_FEATURE(EBX, 6)
+
+#define X86_PMU_FEATURE_INSNS_RETIRED_FIXED KVM_X86_PMU_FEATURE(ECX, 0)
+#define X86_PMU_FEATURE_CPU_CYCLES_FIXED KVM_X86_PMU_FEATURE(ECX, 1)
+#define X86_PMU_FEATURE_REFERENCE_CYCLES_FIXED KVM_X86_PMU_FEATURE(ECX, 2)

static inline unsigned int x86_family(unsigned int eax)
{
@@ -697,10 +712,16 @@ static __always_inline bool this_cpu_has_p(struct kvm_x86_cpu_property property)

static inline bool this_pmu_has(struct kvm_x86_pmu_feature feature)
{
- uint32_t nr_bits = this_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ uint32_t nr_bits;

- return nr_bits > feature.anti_feature.bit &&
- !this_cpu_has(feature.anti_feature);
+ if (feature.f.reg == KVM_CPUID_EBX) {
+ nr_bits = this_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ return nr_bits > feature.f.bit && !this_cpu_has(feature.f);
+ }
+
+ GUEST_ASSERT(feature.f.reg == KVM_CPUID_ECX);
+ nr_bits = this_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
+ return nr_bits > feature.f.bit || this_cpu_has(feature.f);
}

static __always_inline uint64_t this_cpu_supported_xcr0(void)
@@ -916,10 +937,16 @@ static __always_inline bool kvm_cpu_has_p(struct kvm_x86_cpu_property property)

static inline bool kvm_pmu_has(struct kvm_x86_pmu_feature feature)
{
- uint32_t nr_bits = kvm_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ uint32_t nr_bits;

- return nr_bits > feature.anti_feature.bit &&
- !kvm_cpu_has(feature.anti_feature);
+ if (feature.f.reg == KVM_CPUID_EBX) {
+ nr_bits = kvm_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH);
+ return nr_bits > feature.f.bit && !kvm_cpu_has(feature.f);
+ }
+
+ TEST_ASSERT_EQ(feature.f.reg, KVM_CPUID_ECX);
+ nr_bits = kvm_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
+ return nr_bits > feature.f.bit || kvm_cpu_has(feature.f);
}

static __always_inline uint64_t kvm_cpu_supported_xcr0(void)
--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:27:17

by Sean Christopherson

Subject: [PATCH v5 04/13] KVM: selftests: Add vcpu_set_cpuid_property() to set properties

From: Jinrong Liang <[email protected]>

Add vcpu_set_cpuid_property() helper function for setting properties, and
use it instead of open coding an equivalent for MAX_PHY_ADDR. Future vPMU
testcases will also need to stuff various CPUID properties.
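
A rough usage sketch (the values here are purely illustrative):

  /* Shrink the guest's view of MAXPHYADDR and the PMU via CPUID. */
  vcpu_set_cpuid_property(vcpu, X86_PROPERTY_MAX_PHY_ADDR, 39);
  vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_NR_GP_COUNTERS, 2);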

Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../testing/selftests/kvm/include/x86_64/processor.h | 4 +++-
tools/testing/selftests/kvm/lib/x86_64/processor.c | 12 +++++++++---
.../kvm/x86_64/smaller_maxphyaddr_emulation_test.c | 2 +-
3 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h
index 25bc61dac5fb..a01931f7d954 100644
--- a/tools/testing/selftests/kvm/include/x86_64/processor.h
+++ b/tools/testing/selftests/kvm/include/x86_64/processor.h
@@ -994,7 +994,9 @@ static inline void vcpu_set_cpuid(struct kvm_vcpu *vcpu)
vcpu_ioctl(vcpu, KVM_GET_CPUID2, vcpu->cpuid);
}

-void vcpu_set_cpuid_maxphyaddr(struct kvm_vcpu *vcpu, uint8_t maxphyaddr);
+void vcpu_set_cpuid_property(struct kvm_vcpu *vcpu,
+ struct kvm_x86_cpu_property property,
+ uint32_t value);

void vcpu_clear_cpuid_entry(struct kvm_vcpu *vcpu, uint32_t function);
void vcpu_set_or_clear_cpuid_feature(struct kvm_vcpu *vcpu,
diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c
index d8288374078e..9e717bc6bd6d 100644
--- a/tools/testing/selftests/kvm/lib/x86_64/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c
@@ -752,11 +752,17 @@ void vcpu_init_cpuid(struct kvm_vcpu *vcpu, const struct kvm_cpuid2 *cpuid)
vcpu_set_cpuid(vcpu);
}

-void vcpu_set_cpuid_maxphyaddr(struct kvm_vcpu *vcpu, uint8_t maxphyaddr)
+void vcpu_set_cpuid_property(struct kvm_vcpu *vcpu,
+ struct kvm_x86_cpu_property property,
+ uint32_t value)
{
- struct kvm_cpuid_entry2 *entry = vcpu_get_cpuid_entry(vcpu, 0x80000008);
+ struct kvm_cpuid_entry2 *entry;
+
+ entry = __vcpu_get_cpuid_entry(vcpu, property.function, property.index);
+
+ (&entry->eax)[property.reg] &= ~GENMASK(property.hi_bit, property.lo_bit);
+ (&entry->eax)[property.reg] |= value << (property.lo_bit);

- entry->eax = (entry->eax & ~0xff) | maxphyaddr;
vcpu_set_cpuid(vcpu);
}

diff --git a/tools/testing/selftests/kvm/x86_64/smaller_maxphyaddr_emulation_test.c b/tools/testing/selftests/kvm/x86_64/smaller_maxphyaddr_emulation_test.c
index 06edf00a97d6..9b89440dff19 100644
--- a/tools/testing/selftests/kvm/x86_64/smaller_maxphyaddr_emulation_test.c
+++ b/tools/testing/selftests/kvm/x86_64/smaller_maxphyaddr_emulation_test.c
@@ -63,7 +63,7 @@ int main(int argc, char *argv[])
vm_init_descriptor_tables(vm);
vcpu_init_descriptor_tables(vcpu);

- vcpu_set_cpuid_maxphyaddr(vcpu, MAXPHYADDR);
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_MAX_PHY_ADDR, MAXPHYADDR);

rc = kvm_check_cap(KVM_CAP_EXIT_ON_EMULATION_FAILURE);
TEST_ASSERT(rc, "KVM_CAP_EXIT_ON_EMULATION_FAILURE is unavailable");
--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:27:17

by Sean Christopherson

Subject: [PATCH v5 01/13] KVM: x86/pmu: Don't allow exposing unsupported architectural events

Hide architectural events that are unsupported according to guest CPUID
*or* hardware, i.e. don't let userspace advertise and potentially program
unsupported architectural events.

Note, KVM already limits the length of the reverse polarity field; only
the mask itself is missing.

Fixes: f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")
Cc: [email protected]
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 820d3e1f6b4f..1b13a472e3f2 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -533,7 +533,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->counter_bitmask[KVM_PMC_GP] = ((u64)1 << eax.split.bit_width) - 1;
eax.split.mask_length = min_t(int, eax.split.mask_length,
kvm_pmu_cap.events_mask_len);
- pmu->available_event_types = ~entry->ebx &
+ pmu->available_event_types = ~(entry->ebx | kvm_pmu_cap.events_mask) &
((1ull << eax.split.mask_length) - 1);

if (pmu->version == 1) {
--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:27:27

by Sean Christopherson

Subject: [PATCH v5 02/13] KVM: x86/pmu: Don't enumerate support for fixed counters KVM can't virtualize

Hide fixed counters for which perf is incapable of creating the associated
architectural event. Except for the so-called pseudo-architectural event
for counting TSC reference cycles, KVM virtualizes fixed counters by
creating a perf event for the associated general purpose architectural
event. If the associated event isn't supported in hardware, KVM can't
actually virtualize the fixed counter because perf will likely not program
up the correct event.

Note, this issue is almost certainly limited to running KVM on a funky
virtual CPU model, no known real hardware has an asymmetric PMU where a
fixed counter is supported but the associated architectural event is not.

Fixes: f5132b01386b ("KVM: Expose a version 2 architectural PMU to a guests")
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/pmu.h | 4 ++++
arch/x86/kvm/vmx/pmu_intel.c | 31 +++++++++++++++++++++++++++++++
2 files changed, 35 insertions(+)

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 1d64113de488..5341e8f69a22 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -19,6 +19,7 @@
#define VMWARE_BACKDOOR_PMC_APPARENT_TIME 0x10002

struct kvm_pmu_ops {
+ void (*init_pmu_capability)(void);
bool (*hw_event_available)(struct kvm_pmc *pmc);
struct kvm_pmc *(*pmc_idx_to_pmc)(struct kvm_pmu *pmu, int pmc_idx);
struct kvm_pmc *(*rdpmc_ecx_to_pmc)(struct kvm_vcpu *vcpu,
@@ -218,6 +219,9 @@ static inline void kvm_init_pmu_capability(const struct kvm_pmu_ops *pmu_ops)
pmu_ops->MAX_NR_GP_COUNTERS);
kvm_pmu_cap.num_counters_fixed = min(kvm_pmu_cap.num_counters_fixed,
KVM_PMC_MAX_FIXED);
+
+ if (pmu_ops->init_pmu_capability)
+ pmu_ops->init_pmu_capability();
}

static inline void kvm_pmu_request_counter_reprogram(struct kvm_pmc *pmc)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 1b13a472e3f2..3316fdea212a 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -68,6 +68,36 @@ static int fixed_pmc_events[] = {
[2] = PSEUDO_ARCH_REFERENCE_CYCLES,
};

+static void intel_init_pmu_capability(void)
+{
+ int i;
+
+ /*
+ * Perf may (sadly) back a guest fixed counter with a general purpose
+ * counter, and so KVM must hide fixed counters whose associated
+ * architectural event are unsupported. On real hardware, this should
+ * never happen, but if KVM is running on a funky virtual CPU model...
+ *
+ * TODO: Drop this horror if/when KVM stops using perf events for
+ * guest fixed counters, or can explicitly request fixed counters.
+ */
+ for (i = 0; i < kvm_pmu_cap.num_counters_fixed; i++) {
+ int event = fixed_pmc_events[i];
+
+ /*
+ * Ignore pseudo-architectural events, they're a bizarre way of
+ * requesting events from perf that _can't_ be backed with a
+ * general purpose architectural event, i.e. they're guaranteed
+ * to be backed by the real fixed counter.
+ */
+ if (event < NR_REAL_INTEL_ARCH_EVENTS &&
+ (kvm_pmu_cap.events_mask & BIT(event)))
+ break;
+ }
+
+ kvm_pmu_cap.num_counters_fixed = i;
+}
+
static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
{
struct kvm_pmc *pmc;
@@ -789,6 +819,7 @@ void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu)
}

struct kvm_pmu_ops intel_pmu_ops __initdata = {
+ .init_pmu_capability = intel_init_pmu_capability,
.hw_event_available = intel_hw_event_available,
.pmc_idx_to_pmc = intel_pmc_idx_to_pmc,
.rdpmc_ecx_to_pmc = intel_rdpmc_ecx_to_pmc,
--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:27:28

by Sean Christopherson

Subject: [PATCH v5 03/13] KVM: x86/pmu: Always treat Fixed counters as available when supported

Now that KVM hides fixed counters that can't be virtualized, treat fixed
counters as available when they are supported, i.e. don't silently ignore
an enabled fixed counter just because guest CPUID says the associated
general purpose architectural event is unavailable.

KVM originally treated fixed counters as always available, but that got
changed as part of a fix to avoid confusing REF_CPU_CYCLES, which does NOT
map to an architectural event, with the actual architectural event used
associated with bit 7, TOPDOWN_SLOTS.

The commit justified the change with:

If the event is marked as unavailable in the Intel guest CPUID
0AH.EBX leaf, we need to avoid any perf_event creation, whether
it's a gp or fixed counter.

but that justification doesn't mesh with reality. The Intel SDM uses
"architectural events" to refer to both general purpose events (the ones
with the reverse polarity mask in CPUID.0xA.EBX) and the events for fixed
counters, e.g. the SDM makes statements like:

Each of the fixed-function PMC can count only one architectural
performance event.

but the fact that fixed counter 2 (TSC reference cycles) doesn't have an
associated general purpose architectural event makes trying to apply the mask
from CPUID.0xA.EBX impossible. Furthermore, the SDM never explicitly
says that an architectural event that's marked unavailable in EBX affects
the fixed counters.

Note, at the time of the change, KVM didn't enforce hardware support, i.e.
didn't prevent userspace from enumerating support in guest CPUID.0xA.EBX
for architectural events that aren't supported in hardware. I.e. silently
dropping the fixed counter didn't somehow protect against counting the
wrong event; it just enforced guest CPUID.

Arguably, userspace is creating a bogus vCPU model by advertising a fixed
counter but saying the associated general purpose architectural event is
unavailable. But regardless of the validity of the vCPU model, letting
the guest enable a fixed counter and then not actually having it count
anything is completely nonsensical. I.e. even if all of the above is
wrong and it's illegal for a fixed counter to exist when the architectural
event is unavailable, silently doing nothing is still the wrong behavior
and KVM should instead disallow enabling the fixed counter in the first
place.

Fixes: a21864486f7e ("KVM: x86/pmu: Fix available_event_types check for REF_CPU_CYCLES event")
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 3316fdea212a..1c0a17661781 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -138,11 +138,24 @@ static bool intel_hw_event_available(struct kvm_pmc *pmc)
u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
int i;

+ /*
+ * Fixed counters are always available if KVM reaches this point. If a
+ * fixed counter is unsupported in hardware or guest CPUID, KVM doesn't
+ * allow the counter's corresponding MSR to be written. KVM does use
+ * architectural events to program fixed counters, as the interface to
+ * perf doesn't allow requesting a specific fixed counter, e.g. perf
+ * may (sadly) back a guest fixed PMC with a general purposed counter.
+ * But if _hardware_ doesn't support the associated event, KVM simply
+ * doesn't enumerate support for the fixed counter.
+ */
+ if (pmc_is_fixed(pmc))
+ return true;
+
BUILD_BUG_ON(ARRAY_SIZE(intel_arch_events) != NR_INTEL_ARCH_EVENTS);

/*
* Disallow events reported as unavailable in guest CPUID. Note, this
- * doesn't apply to pseudo-architectural events.
+ * doesn't apply to pseudo-architectural events (see above).
*/
for (i = 0; i < NR_REAL_INTEL_ARCH_EVENTS; i++) {
if (intel_arch_events[i].eventsel != event_select ||
--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:27:50

by Sean Christopherson

Subject: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural events on gp counters

From: Jinrong Liang <[email protected]>

Add test cases to check whether different architectural events are available
after they are marked as unavailable via CPUID. This covers the vPMU event
filtering logic based on Intel CPUID, and is a complement to pmu_event_filter.

According to Intel SDM, the number of architectural events is reported
through CPUID.0AH:EAX[31:24] and the architectural event x is supported
if EBX[x]=0 && EAX[31:24]>x.
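
A minimal sketch of that check, given the raw CPUID.0AH EAX and EBX values:

  /* Illustration only: architectural event 'x' per CPUID.0AH. */
  static bool arch_event_is_supported(uint32_t eax, uint32_t ebx, uint8_t x)
  {
          uint8_t bit_vector_length = (eax >> 24) & 0xff;

          return x < bit_vector_length && !(ebx & (1u << x));
  }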

Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/x86_64/pmu_counters_test.c | 189 ++++++++++++++++++
2 files changed, 190 insertions(+)
create mode 100644 tools/testing/selftests/kvm/x86_64/pmu_counters_test.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index ed1c17cabc07..4c024fb845b4 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -82,6 +82,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
TEST_GEN_PROGS_x86_64 += x86_64/monitor_mwait_test
TEST_GEN_PROGS_x86_64 += x86_64/nested_exceptions_test
TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
+TEST_GEN_PROGS_x86_64 += x86_64/pmu_counters_test
TEST_GEN_PROGS_x86_64 += x86_64/pmu_event_filter_test
TEST_GEN_PROGS_x86_64 += x86_64/set_boot_cpu_id
TEST_GEN_PROGS_x86_64 += x86_64/set_sregs_test
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
new file mode 100644
index 000000000000..2a6336b994d5
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2023, Tencent, Inc.
+ */
+
+#define _GNU_SOURCE /* for program_invocation_short_name */
+#include <x86intrin.h>
+
+#include "pmu.h"
+#include "processor.h"
+
+/* Guest payload for any performance counter counting */
+#define NUM_BRANCHES 10
+
+static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
+ void *guest_code)
+{
+ struct kvm_vm *vm;
+
+ vm = vm_create_with_one_vcpu(vcpu, guest_code);
+ vm_init_descriptor_tables(vm);
+ vcpu_init_descriptor_tables(*vcpu);
+
+ return vm;
+}
+
+static void run_vcpu(struct kvm_vcpu *vcpu)
+{
+ struct ucall uc;
+
+ do {
+ vcpu_run(vcpu);
+ switch (get_ucall(vcpu, &uc)) {
+ case UCALL_SYNC:
+ break;
+ case UCALL_ABORT:
+ REPORT_GUEST_ASSERT(uc);
+ break;
+ case UCALL_DONE:
+ break;
+ default:
+ TEST_FAIL("Unexpected ucall: %lu", uc.cmd);
+ }
+ } while (uc.cmd != UCALL_DONE);
+}
+
+static bool pmu_is_intel_event_stable(uint8_t idx)
+{
+ switch (idx) {
+ case INTEL_ARCH_CPU_CYCLES:
+ case INTEL_ARCH_INSTRUCTIONS_RETIRED:
+ case INTEL_ARCH_REFERENCE_CYCLES:
+ case INTEL_ARCH_BRANCHES_RETIRED:
+ return true;
+ default:
+ return false;
+ }
+}
+
+static void guest_measure_pmu_v1(struct kvm_x86_pmu_feature event,
+ uint32_t counter_msr, uint32_t nr_gp_counters)
+{
+ uint8_t idx = event.f.bit;
+ unsigned int i;
+
+ for (i = 0; i < nr_gp_counters; i++) {
+ wrmsr(counter_msr + i, 0);
+ wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
+ ARCH_PERFMON_EVENTSEL_ENABLE | intel_pmu_arch_events[idx]);
+ __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
+
+ if (pmu_is_intel_event_stable(idx))
+ GUEST_ASSERT_EQ(this_pmu_has(event), !!_rdpmc(i));
+
+ wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
+ !ARCH_PERFMON_EVENTSEL_ENABLE |
+ intel_pmu_arch_events[idx]);
+ wrmsr(counter_msr + i, 0);
+ __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
+
+ if (pmu_is_intel_event_stable(idx))
+ GUEST_ASSERT(!_rdpmc(i));
+ }
+
+ GUEST_DONE();
+}
+
+static void guest_measure_loop(uint8_t idx)
+{
+ const struct {
+ struct kvm_x86_pmu_feature gp_event;
+ } intel_event_to_feature[] = {
+ [INTEL_ARCH_CPU_CYCLES] = { X86_PMU_FEATURE_CPU_CYCLES },
+ [INTEL_ARCH_INSTRUCTIONS_RETIRED] = { X86_PMU_FEATURE_INSNS_RETIRED },
+ [INTEL_ARCH_REFERENCE_CYCLES] = { X86_PMU_FEATURE_REFERENCE_CYCLES },
+ [INTEL_ARCH_LLC_REFERENCES] = { X86_PMU_FEATURE_LLC_REFERENCES },
+ [INTEL_ARCH_LLC_MISSES] = { X86_PMU_FEATURE_LLC_MISSES },
+ [INTEL_ARCH_BRANCHES_RETIRED] = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED },
+ [INTEL_ARCH_BRANCHES_MISPREDICTED] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED },
+ };
+
+ uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
+ uint32_t pmu_version = this_cpu_property(X86_PROPERTY_PMU_VERSION);
+ struct kvm_x86_pmu_feature gp_event;
+ uint32_t counter_msr;
+ unsigned int i;
+
+ if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
+ counter_msr = MSR_IA32_PMC0;
+ else
+ counter_msr = MSR_IA32_PERFCTR0;
+
+ gp_event = intel_event_to_feature[idx].gp_event;
+ TEST_ASSERT_EQ(idx, gp_event.f.bit);
+
+ if (pmu_version < 2) {
+ guest_measure_pmu_v1(gp_event, counter_msr, nr_gp_counters);
+ return;
+ }
+
+ for (i = 0; i < nr_gp_counters; i++) {
+ wrmsr(counter_msr + i, 0);
+ wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
+ ARCH_PERFMON_EVENTSEL_ENABLE |
+ intel_pmu_arch_events[idx]);
+
+ wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(i));
+ __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
+ wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+
+ if (pmu_is_intel_event_stable(idx))
+ GUEST_ASSERT_EQ(this_pmu_has(gp_event), !!_rdpmc(i));
+ }
+
+ GUEST_DONE();
+}
+
+static void test_arch_events_cpuid(uint8_t i, uint8_t j, uint8_t idx)
+{
+ uint8_t arch_events_unavailable_mask = BIT_ULL(j);
+ uint8_t arch_events_bitmap_size = BIT_ULL(i);
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+
+ vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_measure_loop);
+
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH,
+ arch_events_bitmap_size);
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
+ arch_events_unavailable_mask);
+
+ vcpu_args_set(vcpu, 1, idx);
+
+ run_vcpu(vcpu);
+
+ kvm_vm_free(vm);
+}
+
+static void test_intel_arch_events(void)
+{
+ uint8_t idx, i, j;
+
+ for (idx = 0; idx < NR_INTEL_ARCH_EVENTS; idx++) {
+ /*
+ * A brute force iteration of all combinations of values is
+ * likely to exhaust the file descriptor limit of a single-threaded
+ * process, so test by iterating through all valid
+ * single-bit values.
+ */
+ for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++) {
+ for (j = 0; j < NR_INTEL_ARCH_EVENTS; j++)
+ test_arch_events_cpuid(i, j, idx);
+ }
+ }
+}
+
+int main(int argc, char *argv[])
+{
+ TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
+
+ TEST_REQUIRE(host_cpu_is_intel);
+ TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
+ TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
+ TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
+
+ test_intel_arch_events();
+
+ return 0;
+}
--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:27:54

by Sean Christopherson

Subject: [PATCH v5 09/13] KVM: selftests: Test Intel PMU architectural events on fixed counters

From: Jinrong Liang <[email protected]>

Update the test to cover Intel PMU architectural events on fixed counters.
Per the Intel SDM, PMU users can also count architectural performance events
on fixed counters (specifically, FIXED_CTR0 for the retired instructions
event and FIXED_CTR1 for the CPU core cycles event). Therefore, if the
guest's CPUID indicates that an architectural event is not available, the
corresponding fixed counter will also not count that event.
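
Condensed, the guest-side flow added below boils down to the following sketch
(using the MSR and PMC_* definitions from the new pmu.h):

  uint64_t count;

  wrmsr(MSR_CORE_PERF_FIXED_CTR0 + i, 0);
  wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, BIT_ULL(4 * i));            /* OS-mode enable */
  wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(PMC_IDX_FIXED + i));   /* fixed PMCs start at bit 32 */
  /* ... execute the measured workload ... */
  wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
  count = _rdpmc(PMC_FIXED_RDPMC_BASE | i);                       /* bit 30 selects fixed PMCs */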

Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 54 ++++++++++++++++---
1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 2a6336b994d5..410d09f788ef 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -85,23 +85,44 @@ static void guest_measure_pmu_v1(struct kvm_x86_pmu_feature event,
GUEST_DONE();
}

+#define X86_PMU_FEATURE_NULL \
+({ \
+ struct kvm_x86_pmu_feature feature = {}; \
+ \
+ feature; \
+})
+
+static bool pmu_is_null_feature(struct kvm_x86_pmu_feature event)
+{
+ return !(*(u64 *)&event);
+}
+
static void guest_measure_loop(uint8_t idx)
{
const struct {
struct kvm_x86_pmu_feature gp_event;
+ struct kvm_x86_pmu_feature fixed_event;
} intel_event_to_feature[] = {
- [INTEL_ARCH_CPU_CYCLES] = { X86_PMU_FEATURE_CPU_CYCLES },
- [INTEL_ARCH_INSTRUCTIONS_RETIRED] = { X86_PMU_FEATURE_INSNS_RETIRED },
- [INTEL_ARCH_REFERENCE_CYCLES] = { X86_PMU_FEATURE_REFERENCE_CYCLES },
- [INTEL_ARCH_LLC_REFERENCES] = { X86_PMU_FEATURE_LLC_REFERENCES },
- [INTEL_ARCH_LLC_MISSES] = { X86_PMU_FEATURE_LLC_MISSES },
- [INTEL_ARCH_BRANCHES_RETIRED] = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED },
- [INTEL_ARCH_BRANCHES_MISPREDICTED] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED },
+ [INTEL_ARCH_CPU_CYCLES] = { X86_PMU_FEATURE_CPU_CYCLES, X86_PMU_FEATURE_CPU_CYCLES_FIXED },
+ [INTEL_ARCH_INSTRUCTIONS_RETIRED] = { X86_PMU_FEATURE_INSNS_RETIRED, X86_PMU_FEATURE_INSNS_RETIRED_FIXED },
+ /*
+ * Note, the fixed counter for reference cycles is NOT the same
+ * as the general purpose architectural event (because the GP
+ * event is garbage). The fixed counter explicitly counts at
+ * the same frequency as the TSC, whereas the GP event counts
+ * at a fixed, but uarch specific, frequency. Bundle them here
+ * for simplicity.
+ */
+ [INTEL_ARCH_REFERENCE_CYCLES] = { X86_PMU_FEATURE_REFERENCE_CYCLES, X86_PMU_FEATURE_REFERENCE_CYCLES_FIXED },
+ [INTEL_ARCH_LLC_REFERENCES] = { X86_PMU_FEATURE_LLC_REFERENCES, X86_PMU_FEATURE_NULL },
+ [INTEL_ARCH_LLC_MISSES] = { X86_PMU_FEATURE_LLC_MISSES, X86_PMU_FEATURE_NULL },
+ [INTEL_ARCH_BRANCHES_RETIRED] = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED, X86_PMU_FEATURE_NULL },
+ [INTEL_ARCH_BRANCHES_MISPREDICTED] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED, X86_PMU_FEATURE_NULL },
};

uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
uint32_t pmu_version = this_cpu_property(X86_PROPERTY_PMU_VERSION);
- struct kvm_x86_pmu_feature gp_event;
+ struct kvm_x86_pmu_feature gp_event, fixed_event;
uint32_t counter_msr;
unsigned int i;

@@ -132,6 +153,23 @@ static void guest_measure_loop(uint8_t idx)
GUEST_ASSERT_EQ(this_pmu_has(gp_event), !!_rdpmc(i));
}

+ fixed_event = intel_event_to_feature[idx].fixed_event;
+ if (pmu_is_null_feature(fixed_event) || !this_pmu_has(fixed_event))
+ goto done;
+
+ i = fixed_event.f.bit;
+
+ wrmsr(MSR_CORE_PERF_FIXED_CTR0 + i, 0);
+ wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, BIT_ULL(4 * i));
+
+ wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(PMC_IDX_FIXED + i));
+ __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
+ wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+
+ if (pmu_is_intel_event_stable(idx))
+ GUEST_ASSERT_NE(_rdpmc(PMC_FIXED_RDPMC_BASE | i), 0);
+
+done:
GUEST_DONE();
}

--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:28:04

by Sean Christopherson

Subject: [PATCH v5 10/13] KVM: selftests: Test consistency of CPUID with num of gp counters

From: Jinrong Liang <[email protected]>

Add a test to verify that KVM correctly emulates MSR-based accesses to
general purpose counters based on guest CPUID, e.g. that accesses to
non-existent counters #GP and accesses to existent counters succeed.

Note, for compatibility reasons, KVM does not emulate #GP when
MSR_P6_PERFCTR[0|1] is not present (writes should be dropped).
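
Concretely, assuming a vCPU configured with zero GP counters and full-width
writes advertised, the guest-visible behavior is roughly the following sketch
(wrmsr_safe()/rdmsr_safe() return the fault vector, 0 on success):

  uint64_t val;

  /* A non-existent full-width counter MSR #GPs on access... */
  GUEST_ASSERT(wrmsr_safe(MSR_IA32_PMC0, 0xffff) == GP_VECTOR);

  /* ...but writes to legacy MSR_P6_PERFCTR[0|1] are silently dropped. */
  GUEST_ASSERT(wrmsr_safe(MSR_P6_PERFCTR0, 0xffff) == 0);
  GUEST_ASSERT(rdmsr_safe(MSR_P6_PERFCTR0, &val) == 0 && val == 0);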

Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 98 +++++++++++++++++++
1 file changed, 98 insertions(+)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 410d09f788ef..274b7f4d4b53 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -212,6 +212,103 @@ static void test_intel_arch_events(void)
}
}

+/*
+ * Limit testing to MSRs that are actually defined by Intel (in the SDM). MSRs
+ * that aren't defined counter MSRs *probably* don't exist, but there's no
+ * guarantee that currently undefined MSR indices won't be used for something
+ * other than PMCs in the future.
+ */
+#define MAX_NR_GP_COUNTERS 8
+#define MAX_NR_FIXED_COUNTERS 3
+
+#define GUEST_ASSERT_PMC_MSR_ACCESS(insn, msr, expect_gp, vector) \
+__GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector, \
+ "Expected %s on " #insn "(0x%x), got vector %u", \
+ expect_gp ? "#GP" : "no fault", msr, vector) \
+
+static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters,
+ uint8_t nr_counters)
+{
+ uint8_t i;
+
+ for (i = 0; i < nr_possible_counters; i++) {
+ const uint32_t msr = base_msr + i;
+ const bool expect_success = i < nr_counters;
+
+ /*
+ * KVM drops writes to MSR_P6_PERFCTR[0|1] if the counters are
+ * unsupported, i.e. doesn't #GP and reads back '0'.
+ */
+ const uint64_t expected_val = expect_success ? 0xffff : 0;
+ const bool expect_gp = !expect_success && msr != MSR_P6_PERFCTR0 &&
+ msr != MSR_P6_PERFCTR1;
+ uint8_t vector;
+ uint64_t val;
+
+ vector = wrmsr_safe(msr, 0xffff);
+ GUEST_ASSERT_PMC_MSR_ACCESS(WRMSR, msr, expect_gp, vector);
+
+ vector = rdmsr_safe(msr, &val);
+ GUEST_ASSERT_PMC_MSR_ACCESS(RDMSR, msr, expect_gp, vector);
+
+ /* On #GP, the result of RDMSR is undefined. */
+ if (!expect_gp)
+ __GUEST_ASSERT(val == expected_val,
+ "Expected RDMSR(0x%x) to yield 0x%lx, got 0x%lx",
+ msr, expected_val, val);
+
+ vector = wrmsr_safe(msr, 0);
+ GUEST_ASSERT_PMC_MSR_ACCESS(WRMSR, msr, expect_gp, vector);
+ }
+ GUEST_DONE();
+}
+
+static void guest_test_gp_counters(void)
+{
+ uint8_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
+ uint32_t base_msr;
+
+ if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
+ base_msr = MSR_IA32_PMC0;
+ else
+ base_msr = MSR_IA32_PERFCTR0;
+
+ guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters);
+}
+
+static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+
+ vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_gp_counters);
+
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_NR_GP_COUNTERS,
+ nr_gp_counters);
+ vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, perf_cap);
+
+ run_vcpu(vcpu);
+
+ kvm_vm_free(vm);
+}
+
+static void test_intel_counters(void)
+{
+ uint8_t nr_gp_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
+ unsigned int i;
+ uint8_t j;
+
+ const uint64_t perf_caps[] = {
+ 0,
+ PMU_CAP_FW_WRITES,
+ };
+
+ for (i = 0; i < ARRAY_SIZE(perf_caps); i++) {
+ for (j = 0; j <= nr_gp_counters; j++)
+ test_gp_counters(j, perf_caps[i]);
+ }
+}
+
int main(int argc, char *argv[])
{
TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
@@ -222,6 +319,7 @@ int main(int argc, char *argv[])
TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));

test_intel_arch_events();
+ test_intel_counters();

return 0;
}
--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:28:08

by Sean Christopherson

Subject: [PATCH v5 11/13] KVM: selftests: Test consistency of CPUID with num of fixed counters

From: Jinrong Liang <[email protected]>

Extend the PMU counters test to verify KVM emulation of fixed counters in
addition to general purpose counters. Fixed counters add an extra wrinkle
in the form of an extra supported bitmask. Thus quoth the SDM:

fixed-function performance counter 'i' is supported if ECX[i] || (EDX[4:0] > i)

Test that KVM handles a counter being available through either method.

Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
Co-developed-by: Sean Christopherson <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 58 ++++++++++++++++++-
1 file changed, 55 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 274b7f4d4b53..f1d9cdd69a17 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -227,13 +227,19 @@ __GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector, \
expect_gp ? "#GP" : "no fault", msr, vector) \

static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters,
- uint8_t nr_counters)
+ uint8_t nr_counters, uint32_t or_mask)
{
uint8_t i;

for (i = 0; i < nr_possible_counters; i++) {
const uint32_t msr = base_msr + i;
- const bool expect_success = i < nr_counters;
+
+ /*
+ * Fixed counters are supported if the counter is less than the
+ * number of enumerated contiguous counters *or* the counter is
+ * explicitly enumerated in the supported counters mask.
+ */
+ const bool expect_success = i < nr_counters || (or_mask & BIT(i));

/*
* KVM drops writes to MSR_P6_PERFCTR[0|1] if the counters are
@@ -273,7 +279,7 @@ static void guest_test_gp_counters(void)
else
base_msr = MSR_IA32_PERFCTR0;

- guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters);
+ guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters, 0);
}

static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
@@ -292,10 +298,51 @@ static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
kvm_vm_free(vm);
}

+static void guest_test_fixed_counters(void)
+{
+ uint64_t supported_bitmask = 0;
+ uint8_t nr_fixed_counters = 0;
+
+ /* KVM provides fixed counters iff the vPMU version is 2+. */
+ if (this_cpu_property(X86_PROPERTY_PMU_VERSION) >= 2)
+ nr_fixed_counters = this_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
+
+ /*
+ * The supported bitmask for fixed counters was introduced in PMU
+ * version 5.
+ */
+ if (this_cpu_property(X86_PROPERTY_PMU_VERSION) >= 5)
+ supported_bitmask = this_cpu_property(X86_PROPERTY_PMU_FIXED_COUNTERS_BITMASK);
+
+ guest_rd_wr_counters(MSR_CORE_PERF_FIXED_CTR0, MAX_NR_FIXED_COUNTERS,
+ nr_fixed_counters, supported_bitmask);
+}
+
+static void test_fixed_counters(uint8_t nr_fixed_counters,
+ uint32_t supported_bitmask, uint64_t perf_cap)
+{
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+
+ vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_fixed_counters);
+
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_FIXED_COUNTERS_BITMASK,
+ supported_bitmask);
+ vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_NR_FIXED_COUNTERS,
+ nr_fixed_counters);
+ vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, perf_cap);
+
+ run_vcpu(vcpu);
+
+ kvm_vm_free(vm);
+}
+
static void test_intel_counters(void)
{
+ uint8_t nr_fixed_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
uint8_t nr_gp_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
unsigned int i;
+ uint32_t k;
uint8_t j;

const uint64_t perf_caps[] = {
@@ -306,6 +353,11 @@ static void test_intel_counters(void)
for (i = 0; i < ARRAY_SIZE(perf_caps); i++) {
for (j = 0; j <= nr_gp_counters; j++)
test_gp_counters(j, perf_caps[i]);
+
+ for (j = 0; j <= nr_fixed_counters; j++) {
+ for (k = 0; k <= (BIT(nr_fixed_counters) - 1); k++)
+ test_fixed_counters(j, k, perf_caps[i]);
+ }
}
}

--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:28:21

by Sean Christopherson

Subject: [PATCH v5 13/13] KVM: selftests: Extend PMU counters test to permute on vPMU version

Extend the PMU counters test to verify that KVM emulates the vPMU (or
not) according to the vPMU version exposed to the guest. KVM's ABI (which
does NOT reflect Intel's architectural behavior) is that GP counters are
available if the PMU version is >0, and that fixed counters and
PERF_GLOBAL_CTRL are available if the PMU version is >1.
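
Expressed as illustrative helpers (not KVM source), that ABI is simply:

  static bool guest_has_gp_counters(uint8_t pmu_version)
  {
          return pmu_version > 0;
  }

  static bool guest_has_fixed_counters_and_global_ctrl(uint8_t pmu_version)
  {
          return pmu_version > 1;
  }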

Test up to vPMU version 5, i.e. the current architectural max. KVM only
officially supports up to version 2, but the behavior of the counters is
backwards compatible, i.e. KVM shouldn't do something completely different
for a higher, architecturally-defined vPMU version.

Verify KVM behavior against the effective vPMU version, e.g. advertising
vPMU 5 when KVM only supports vPMU 2 shouldn't magically unlock vPMU 5
features.

Suggested-by: Like Xu <[email protected]>
Suggested-by: Jinrong Liang <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 60 +++++++++++++++----
1 file changed, 47 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index 1c392ad156f4..85b01dd5b2cd 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -12,6 +12,8 @@
/* Guest payload for any performance counter counting */
#define NUM_BRANCHES 10

+static uint8_t kvm_pmu_version;
+
static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
void *guest_code)
{
@@ -21,6 +23,8 @@ static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
vm_init_descriptor_tables(vm);
vcpu_init_descriptor_tables(*vcpu);

+ sync_global_to_guest(vm, kvm_pmu_version);
+
return vm;
}

@@ -97,6 +101,19 @@ static bool pmu_is_null_feature(struct kvm_x86_pmu_feature event)
return !(*(u64 *)&event);
}

+static uint8_t guest_get_pmu_version(void)
+{
+ /*
+ * Return the effective PMU version, i.e. the minimum between what KVM
+ * supports and what is enumerated to the guest. The counters test
+ * deliberately advertises a PMU version to the guest beyond what is
+ * actually supported by KVM to verify KVM doesn't freak out and do
+ * something bizarre with an architecturally valid, but unsupported,
+ * version.
+ */
+ return min_t(uint8_t, kvm_pmu_version, this_cpu_property(X86_PROPERTY_PMU_VERSION));
+}
+
static void guest_measure_loop(uint8_t idx)
{
const struct {
@@ -121,7 +138,7 @@ static void guest_measure_loop(uint8_t idx)
};

uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
- uint32_t pmu_version = this_cpu_property(X86_PROPERTY_PMU_VERSION);
+ uint32_t pmu_version = guest_get_pmu_version();
struct kvm_x86_pmu_feature gp_event, fixed_event;
uint32_t counter_msr;
unsigned int i;
@@ -270,9 +287,12 @@ static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters

static void guest_test_gp_counters(void)
{
- uint8_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
+ uint8_t nr_gp_counters = 0;
uint32_t base_msr;

+ if (guest_get_pmu_version())
+ nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
+
if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
base_msr = MSR_IA32_PMC0;
else
@@ -282,7 +302,8 @@ static void guest_test_gp_counters(void)
GUEST_DONE();
}

-static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
+static void test_gp_counters(uint8_t pmu_version, uint8_t nr_gp_counters,
+ uint64_t perf_cap)
{
struct kvm_vcpu *vcpu;
struct kvm_vm *vm;
@@ -305,16 +326,17 @@ static void guest_test_fixed_counters(void)
uint8_t i;

/* KVM provides fixed counters iff the vPMU version is 2+. */
- if (this_cpu_property(X86_PROPERTY_PMU_VERSION) >= 2)
+ if (guest_get_pmu_version() >= 2)
nr_fixed_counters = this_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);

/*
* The supported bitmask for fixed counters was introduced in PMU
* version 5.
*/
- if (this_cpu_property(X86_PROPERTY_PMU_VERSION) >= 5)
+ if (guest_get_pmu_version() >= 5)
supported_bitmask = this_cpu_property(X86_PROPERTY_PMU_FIXED_COUNTERS_BITMASK);

+
guest_rd_wr_counters(MSR_CORE_PERF_FIXED_CTR0, MAX_NR_FIXED_COUNTERS,
nr_fixed_counters, supported_bitmask);

@@ -345,7 +367,7 @@ static void guest_test_fixed_counters(void)
GUEST_DONE();
}

-static void test_fixed_counters(uint8_t nr_fixed_counters,
+static void test_fixed_counters(uint8_t pmu_version, uint8_t nr_fixed_counters,
uint32_t supported_bitmask, uint64_t perf_cap)
{
struct kvm_vcpu *vcpu;
@@ -368,22 +390,32 @@ static void test_intel_counters(void)
{
uint8_t nr_fixed_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
uint8_t nr_gp_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
+ uint8_t max_pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
unsigned int i;
+ uint8_t j, v;
uint32_t k;
- uint8_t j;

const uint64_t perf_caps[] = {
0,
PMU_CAP_FW_WRITES,
};

- for (i = 0; i < ARRAY_SIZE(perf_caps); i++) {
- for (j = 0; j <= nr_gp_counters; j++)
- test_gp_counters(j, perf_caps[i]);
+ /*
+ * Test up to PMU v5, which is the current maximum version defined by
+ * Intel, i.e. is the last version that is guaranteed to be backwards
+ * compatible with KVM's existing behavior.
+ */
+ max_pmu_version = max_t(typeof(max_pmu_version), max_pmu_version, 5);

- for (j = 0; j <= nr_fixed_counters; j++) {
- for (k = 0; k <= (BIT(nr_fixed_counters) - 1); k++)
- test_fixed_counters(j, k, perf_caps[i]);
+ for (v = 0; v <= max_pmu_version; v++) {
+ for (i = 0; i < ARRAY_SIZE(perf_caps) + 1; i++) {
+ for (j = 0; j <= nr_gp_counters; j++)
+ test_gp_counters(v, j, perf_caps[i]);
+
+ for (j = 0; j <= nr_fixed_counters; j++) {
+ for (k = 0; k <= (BIT(nr_fixed_counters) - 1); k++)
+ test_fixed_counters(v, j, k, perf_caps[i]);
+ }
}
}
}
@@ -397,6 +429,8 @@ int main(int argc, char *argv[])
TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));

+ kvm_pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
+
test_intel_arch_events();
test_intel_counters();

--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:28:49

by Sean Christopherson

Subject: [PATCH v5 07/13] KVM: selftests: Add pmu.h and lib/pmu.c for common PMU assets

From: Jinrong Liang <[email protected]>

By defining the PMU performance events and masks relevant for x86 in
the new pmu.h and pmu.c, it becomes easier to reference them, minimizing
potential errors in code that handles these values.

Clean up pmu_event_filter_test.c by including pmu.h and removing
unnecessary macros.
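
As a rough usage sketch, programming a GP counter to count LLC misses with the
new definitions looks like:

  /* Event select 0x2e, umask 0x41, per the table in lib/pmu.c below. */
  wrmsr(MSR_P6_EVNTSEL0, ARCH_PERFMON_EVENTSEL_OS |
                         ARCH_PERFMON_EVENTSEL_ENABLE |
                         intel_pmu_arch_events[INTEL_ARCH_LLC_MISSES]);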

Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
[sean: drop PSEUDO_ARCH_REFERENCE_CYCLES]
Signed-off-by: Sean Christopherson <[email protected]>
---
tools/testing/selftests/kvm/Makefile | 1 +
tools/testing/selftests/kvm/include/pmu.h | 84 +++++++++++++++++++
tools/testing/selftests/kvm/lib/pmu.c | 28 +++++++
.../kvm/x86_64/pmu_event_filter_test.c | 32 ++-----
4 files changed, 122 insertions(+), 23 deletions(-)
create mode 100644 tools/testing/selftests/kvm/include/pmu.h
create mode 100644 tools/testing/selftests/kvm/lib/pmu.c

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index fb01c3f8d3da..ed1c17cabc07 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -23,6 +23,7 @@ LIBKVM += lib/guest_modes.c
LIBKVM += lib/io.c
LIBKVM += lib/kvm_util.c
LIBKVM += lib/memstress.c
+LIBKVM += lib/pmu.c
LIBKVM += lib/guest_sprintf.c
LIBKVM += lib/rbtree.c
LIBKVM += lib/sparsebit.c
diff --git a/tools/testing/selftests/kvm/include/pmu.h b/tools/testing/selftests/kvm/include/pmu.h
new file mode 100644
index 000000000000..987602c62b51
--- /dev/null
+++ b/tools/testing/selftests/kvm/include/pmu.h
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023, Tencent, Inc.
+ */
+#ifndef SELFTEST_KVM_PMU_H
+#define SELFTEST_KVM_PMU_H
+
+#include <stdint.h>
+
+#define X86_PMC_IDX_MAX 64
+#define INTEL_PMC_MAX_GENERIC 32
+#define KVM_PMU_EVENT_FILTER_MAX_EVENTS 300
+
+#define GP_COUNTER_NR_OFS_BIT 8
+#define EVENT_LENGTH_OFS_BIT 24
+
+#define PMU_VERSION_MASK GENMASK_ULL(7, 0)
+#define EVENT_LENGTH_MASK GENMASK_ULL(31, EVENT_LENGTH_OFS_BIT)
+#define GP_COUNTER_NR_MASK GENMASK_ULL(15, GP_COUNTER_NR_OFS_BIT)
+#define FIXED_COUNTER_NR_MASK GENMASK_ULL(4, 0)
+
+#define ARCH_PERFMON_EVENTSEL_EVENT GENMASK_ULL(7, 0)
+#define ARCH_PERFMON_EVENTSEL_UMASK GENMASK_ULL(15, 8)
+#define ARCH_PERFMON_EVENTSEL_USR BIT_ULL(16)
+#define ARCH_PERFMON_EVENTSEL_OS BIT_ULL(17)
+#define ARCH_PERFMON_EVENTSEL_EDGE BIT_ULL(18)
+#define ARCH_PERFMON_EVENTSEL_PIN_CONTROL BIT_ULL(19)
+#define ARCH_PERFMON_EVENTSEL_INT BIT_ULL(20)
+#define ARCH_PERFMON_EVENTSEL_ANY BIT_ULL(21)
+#define ARCH_PERFMON_EVENTSEL_ENABLE BIT_ULL(22)
+#define ARCH_PERFMON_EVENTSEL_INV BIT_ULL(23)
+#define ARCH_PERFMON_EVENTSEL_CMASK GENMASK_ULL(31, 24)
+
+#define PMC_MAX_FIXED 16
+#define PMC_IDX_FIXED 32
+
+/* RDPMC offset for Fixed PMCs */
+#define PMC_FIXED_RDPMC_BASE BIT_ULL(30)
+#define PMC_FIXED_RDPMC_METRICS BIT_ULL(29)
+
+#define FIXED_BITS_MASK 0xFULL
+#define FIXED_BITS_STRIDE 4
+#define FIXED_0_KERNEL BIT_ULL(0)
+#define FIXED_0_USER BIT_ULL(1)
+#define FIXED_0_ANYTHREAD BIT_ULL(2)
+#define FIXED_0_ENABLE_PMI BIT_ULL(3)
+
+#define fixed_bits_by_idx(_idx, _bits) \
+ ((_bits) << ((_idx) * FIXED_BITS_STRIDE))
+
+#define AMD64_NR_COUNTERS 4
+#define AMD64_NR_COUNTERS_CORE 6
+
+#define PMU_CAP_FW_WRITES BIT_ULL(13)
+#define PMU_CAP_LBR_FMT 0x3f
+
+enum intel_pmu_architectural_events {
+ /*
+ * The order of the architectural events matters as support for each
+ * event is enumerated via CPUID using the index of the event.
+ */
+ INTEL_ARCH_CPU_CYCLES,
+ INTEL_ARCH_INSTRUCTIONS_RETIRED,
+ INTEL_ARCH_REFERENCE_CYCLES,
+ INTEL_ARCH_LLC_REFERENCES,
+ INTEL_ARCH_LLC_MISSES,
+ INTEL_ARCH_BRANCHES_RETIRED,
+ INTEL_ARCH_BRANCHES_MISPREDICTED,
+ NR_INTEL_ARCH_EVENTS,
+};
+
+enum amd_pmu_k7_events {
+ AMD_ZEN_CORE_CYCLES,
+ AMD_ZEN_INSTRUCTIONS,
+ AMD_ZEN_BRANCHES,
+ AMD_ZEN_BRANCH_MISSES,
+ NR_AMD_ARCH_EVENTS,
+};
+
+extern const uint64_t intel_pmu_arch_events[];
+extern const uint64_t amd_pmu_arch_events[];
+extern const int intel_pmu_fixed_pmc_events[];
+
+#endif /* SELFTEST_KVM_PMU_H */
diff --git a/tools/testing/selftests/kvm/lib/pmu.c b/tools/testing/selftests/kvm/lib/pmu.c
new file mode 100644
index 000000000000..27a6c35f98a1
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/pmu.c
@@ -0,0 +1,28 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2023, Tencent, Inc.
+ */
+
+#include <stdint.h>
+
+#include "pmu.h"
+
+/* Definitions for Architectural Performance Events */
+#define ARCH_EVENT(select, umask) (((select) & 0xff) | ((umask) & 0xff) << 8)
+
+const uint64_t intel_pmu_arch_events[] = {
+ [INTEL_ARCH_CPU_CYCLES] = ARCH_EVENT(0x3c, 0x0),
+ [INTEL_ARCH_INSTRUCTIONS_RETIRED] = ARCH_EVENT(0xc0, 0x0),
+ [INTEL_ARCH_REFERENCE_CYCLES] = ARCH_EVENT(0x3c, 0x1),
+ [INTEL_ARCH_LLC_REFERENCES] = ARCH_EVENT(0x2e, 0x4f),
+ [INTEL_ARCH_LLC_MISSES] = ARCH_EVENT(0x2e, 0x41),
+ [INTEL_ARCH_BRANCHES_RETIRED] = ARCH_EVENT(0xc4, 0x0),
+ [INTEL_ARCH_BRANCHES_MISPREDICTED] = ARCH_EVENT(0xc5, 0x0),
+};
+
+const uint64_t amd_pmu_arch_events[] = {
+ [AMD_ZEN_CORE_CYCLES] = ARCH_EVENT(0x76, 0x00),
+ [AMD_ZEN_INSTRUCTIONS] = ARCH_EVENT(0xc0, 0x00),
+ [AMD_ZEN_BRANCHES] = ARCH_EVENT(0xc2, 0x00),
+ [AMD_ZEN_BRANCH_MISSES] = ARCH_EVENT(0xc3, 0x00),
+};
diff --git a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
index 283cc55597a4..b6e4f57a8651 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_event_filter_test.c
@@ -11,31 +11,18 @@
*/

#define _GNU_SOURCE /* for program_invocation_short_name */
-#include "test_util.h"
+
#include "kvm_util.h"
+#include "pmu.h"
#include "processor.h"
-
-/*
- * In lieu of copying perf_event.h into tools...
- */
-#define ARCH_PERFMON_EVENTSEL_OS (1ULL << 17)
-#define ARCH_PERFMON_EVENTSEL_ENABLE (1ULL << 22)
-
-/* End of stuff taken from perf_event.h. */
-
-/* Oddly, this isn't in perf_event.h. */
-#define ARCH_PERFMON_BRANCHES_RETIRED 5
+#include "test_util.h"

#define NUM_BRANCHES 42
-#define INTEL_PMC_IDX_FIXED 32
-
-/* Matches KVM_PMU_EVENT_FILTER_MAX_EVENTS in pmu.c */
-#define MAX_FILTER_EVENTS 300
#define MAX_TEST_EVENTS 10

#define PMU_EVENT_FILTER_INVALID_ACTION (KVM_PMU_EVENT_DENY + 1)
#define PMU_EVENT_FILTER_INVALID_FLAGS (KVM_PMU_EVENT_FLAGS_VALID_MASK << 1)
-#define PMU_EVENT_FILTER_INVALID_NEVENTS (MAX_FILTER_EVENTS + 1)
+#define PMU_EVENT_FILTER_INVALID_NEVENTS (KVM_PMU_EVENT_FILTER_MAX_EVENTS + 1)

/*
* This is how the event selector and unit mask are stored in an AMD
@@ -63,7 +50,6 @@

#define AMD_ZEN_BR_RETIRED EVENT(0xc2, 0)

-
/*
* "Retired instructions", from Processor Programming Reference
* (PPR) for AMD Family 17h Model 01h, Revision B1 Processors,
@@ -84,7 +70,7 @@ struct __kvm_pmu_event_filter {
__u32 fixed_counter_bitmap;
__u32 flags;
__u32 pad[4];
- __u64 events[MAX_FILTER_EVENTS];
+ __u64 events[KVM_PMU_EVENT_FILTER_MAX_EVENTS];
};

/*
@@ -729,14 +715,14 @@ static void add_dummy_events(uint64_t *events, int nevents)

static void test_masked_events(struct kvm_vcpu *vcpu)
{
- int nevents = MAX_FILTER_EVENTS - MAX_TEST_EVENTS;
- uint64_t events[MAX_FILTER_EVENTS];
+ int nevents = KVM_PMU_EVENT_FILTER_MAX_EVENTS - MAX_TEST_EVENTS;
+ uint64_t events[KVM_PMU_EVENT_FILTER_MAX_EVENTS];

/* Run the test cases against a sparse PMU event filter. */
run_masked_events_tests(vcpu, events, 0);

/* Run the test cases against a dense PMU event filter. */
- add_dummy_events(events, MAX_FILTER_EVENTS);
+ add_dummy_events(events, KVM_PMU_EVENT_FILTER_MAX_EVENTS);
run_masked_events_tests(vcpu, events, nevents);
}

@@ -818,7 +804,7 @@ static void intel_run_fixed_counter_guest_code(uint8_t fixed_ctr_idx)
/* Only OS_EN bit is enabled for fixed counter[idx]. */
wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, BIT_ULL(4 * fixed_ctr_idx));
wrmsr(MSR_CORE_PERF_GLOBAL_CTRL,
- BIT_ULL(INTEL_PMC_IDX_FIXED + fixed_ctr_idx));
+ BIT_ULL(PMC_IDX_FIXED + fixed_ctr_idx));
__asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);

--
2.42.0.758.gaed0368e0e-goog

2023-10-24 00:29:01

by Sean Christopherson

Subject: [PATCH v5 12/13] KVM: selftests: Add functional test for Intel's fixed PMU counters

From: Jinrong Liang <[email protected]>

Extend the fixed counters test to verify that supported counters can
actually be enabled in the control MSRs, that unsupported counters cannot,
and that enabled counters actually count.

Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Signed-off-by: Jinrong Liang <[email protected]>
[sean: fold into the rd/wr access test, massage changelog]
Signed-off-by: Sean Christopherson <[email protected]>
---
.../selftests/kvm/x86_64/pmu_counters_test.c | 29 ++++++++++++++++++-
1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
index f1d9cdd69a17..1c392ad156f4 100644
--- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
+++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
@@ -266,7 +266,6 @@ static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters
vector = wrmsr_safe(msr, 0);
GUEST_ASSERT_PMC_MSR_ACCESS(WRMSR, msr, expect_gp, vector);
}
- GUEST_DONE();
}

static void guest_test_gp_counters(void)
@@ -280,6 +279,7 @@ static void guest_test_gp_counters(void)
base_msr = MSR_IA32_PERFCTR0;

guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters, 0);
+ GUEST_DONE();
}

static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
@@ -302,6 +302,7 @@ static void guest_test_fixed_counters(void)
{
uint64_t supported_bitmask = 0;
uint8_t nr_fixed_counters = 0;
+ uint8_t i;

/* KVM provides fixed counters iff the vPMU version is 2+. */
if (this_cpu_property(X86_PROPERTY_PMU_VERSION) >= 2)
@@ -316,6 +317,32 @@ static void guest_test_fixed_counters(void)

guest_rd_wr_counters(MSR_CORE_PERF_FIXED_CTR0, MAX_NR_FIXED_COUNTERS,
nr_fixed_counters, supported_bitmask);
+
+ for (i = 0; i < MAX_NR_FIXED_COUNTERS; i++) {
+ uint8_t vector;
+ uint64_t val;
+
+ if (i >= nr_fixed_counters && !(supported_bitmask & BIT_ULL(i))) {
+ vector = wrmsr_safe(MSR_CORE_PERF_FIXED_CTR_CTRL, BIT_ULL(4 * i));
+ __GUEST_ASSERT(vector == GP_VECTOR,
+ "Expected #GP for counter %u in FIXED_CTRL_CTRL", i);
+
+ vector = wrmsr_safe(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(PMC_IDX_FIXED + i));
+ __GUEST_ASSERT(vector == GP_VECTOR,
+ "Expected #GP for counter %u in PERF_GLOBAL_CTRL", i);
+ continue;
+ }
+
+ wrmsr(MSR_CORE_PERF_FIXED_CTR0 + i, 0);
+ wrmsr(MSR_CORE_PERF_FIXED_CTR_CTRL, BIT_ULL(4 * i));
+ wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(PMC_IDX_FIXED + i));
+ __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
+ wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+ val = rdmsr(MSR_CORE_PERF_FIXED_CTR0 + i);
+
+ GUEST_ASSERT_NE(val, 0);
+ }
+ GUEST_DONE();
}

static void test_fixed_counters(uint8_t nr_fixed_counters,
--
2.42.0.758.gaed0368e0e-goog

2023-10-24 11:49:42

by Jinrong Liang

[permalink] [raw]
Subject: Re: [PATCH v5 13/13] KVM: selftests: Extend PMU counters test to permute on vPMU version

在 2023/10/24 08:26, Sean Christopherson 写道:
> Extend the PMU counters test to verify that KVM emulates the vPMU (or
> not) according to the vPMU version exposed to the guest. KVM's ABI (which
> does NOT reflect Intel's architectural behavior) is that GP counters are
> available if the PMU version is >0, and that fixed counters and
> PERF_GLOBAL_CTRL are available if the PMU version is >1.
>
> Test up to vPMU version 5, i.e. the current architectural max. KVM only
> officially supports up to version 2, but the behavior of the counters is
> backwards compatible, i.e. KVM shouldn't do something completely different
> for a higher, architecturally-defined vPMU version.
>
> Verify KVM behavior against the effective vPMU version, e.g. advertising
> vPMU 5 when KVM only supports vPMU 2 shouldn't magically unlock vPMU 5
> features.
>
> Suggested-by: Like Xu <[email protected]>
> Suggested-by: Jinrong Liang <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> .../selftests/kvm/x86_64/pmu_counters_test.c | 60 +++++++++++++++----
> 1 file changed, 47 insertions(+), 13 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> index 1c392ad156f4..85b01dd5b2cd 100644
> --- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> @@ -12,6 +12,8 @@
> /* Guest payload for any performance counter counting */
> #define NUM_BRANCHES 10
>
> +static uint8_t kvm_pmu_version;
> +
> static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
> void *guest_code)
> {
> @@ -21,6 +23,8 @@ static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
> vm_init_descriptor_tables(vm);
> vcpu_init_descriptor_tables(*vcpu);
>
> + sync_global_to_guest(vm, kvm_pmu_version);
> +
> return vm;
> }
>
> @@ -97,6 +101,19 @@ static bool pmu_is_null_feature(struct kvm_x86_pmu_feature event)
> return !(*(u64 *)&event);
> }
>
> +static uint8_t guest_get_pmu_version(void)
> +{
> + /*
> + * Return the effective PMU version, i.e. the minimum between what KVM
> + * supports and what is enumerated to the guest. The counters test
> + * deliberately advertises a PMU version to the guest beyond what is
> + * actually supported by KVM to verify KVM doesn't freak out and do
> + * something bizarre with an architecturally valid, but unsupported,
> + * version.
> + */
> + return min_t(uint8_t, kvm_pmu_version, this_cpu_property(X86_PROPERTY_PMU_VERSION));
> +}
> +
> static void guest_measure_loop(uint8_t idx)
> {
> const struct {
> @@ -121,7 +138,7 @@ static void guest_measure_loop(uint8_t idx)
> };
>
> uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
> - uint32_t pmu_version = this_cpu_property(X86_PROPERTY_PMU_VERSION);
> + uint32_t pmu_version = guest_get_pmu_version();
> struct kvm_x86_pmu_feature gp_event, fixed_event;
> uint32_t counter_msr;
> unsigned int i;
> @@ -270,9 +287,12 @@ static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters
>
> static void guest_test_gp_counters(void)
> {
> - uint8_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
> + uint8_t nr_gp_counters = 0;
> uint32_t base_msr;
>
> + if (guest_get_pmu_version())
> + nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
> +
> if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
> base_msr = MSR_IA32_PMC0;
> else
> @@ -282,7 +302,8 @@ static void guest_test_gp_counters(void)
> GUEST_DONE();
> }
>
> -static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
> +static void test_gp_counters(uint8_t pmu_version, uint8_t nr_gp_counters,

The parameter pmu_version is not used in this function.

> + uint64_t perf_cap)
> {
> struct kvm_vcpu *vcpu;
> struct kvm_vm *vm;
> @@ -305,16 +326,17 @@ static void guest_test_fixed_counters(void)
> uint8_t i;
>
> /* KVM provides fixed counters iff the vPMU version is 2+. */
> - if (this_cpu_property(X86_PROPERTY_PMU_VERSION) >= 2)
> + if (guest_get_pmu_version() >= 2)
> nr_fixed_counters = this_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
>
> /*
> * The supported bitmask for fixed counters was introduced in PMU
> * version 5.
> */
> - if (this_cpu_property(X86_PROPERTY_PMU_VERSION) >= 5)
> + if (guest_get_pmu_version() >= 5)
> supported_bitmask = this_cpu_property(X86_PROPERTY_PMU_FIXED_COUNTERS_BITMASK);
>
> +
> guest_rd_wr_counters(MSR_CORE_PERF_FIXED_CTR0, MAX_NR_FIXED_COUNTERS,
> nr_fixed_counters, supported_bitmask);
>
> @@ -345,7 +367,7 @@ static void guest_test_fixed_counters(void)
> GUEST_DONE();
> }
>
> -static void test_fixed_counters(uint8_t nr_fixed_counters,
> +static void test_fixed_counters(uint8_t pmu_version, uint8_t nr_fixed_counters,

The parameter pmu_version is not used in this function.

> uint32_t supported_bitmask, uint64_t perf_cap)
> {
> struct kvm_vcpu *vcpu;
> @@ -368,22 +390,32 @@ static void test_intel_counters(void)
> {
> uint8_t nr_fixed_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
> uint8_t nr_gp_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
> + uint8_t max_pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
> unsigned int i;
> + uint8_t j, v;
> uint32_t k;
> - uint8_t j;
>
> const uint64_t perf_caps[] = {
> 0,
> PMU_CAP_FW_WRITES,
> };
>
> - for (i = 0; i < ARRAY_SIZE(perf_caps); i++) {
> - for (j = 0; j <= nr_gp_counters; j++)
> - test_gp_counters(j, perf_caps[i]);
> + /*
> + * Test up to PMU v5, which is the current maximum version defined by
> + * Intel, i.e. is the last version that is guaranteed to be backwards
> + * compatible with KVM's existing behavior.
> + */
> + max_pmu_version = max_t(typeof(max_pmu_version), max_pmu_version, 5);
>
> - for (j = 0; j <= nr_fixed_counters; j++) {
> - for (k = 0; k <= (BIT(nr_fixed_counters) - 1); k++)
> - test_fixed_counters(j, k, perf_caps[i]);
> + for (v = 0; v <= max_pmu_version; v++) {
> + for (i = 0; i < ARRAY_SIZE(perf_caps) + 1; i++) {
> + for (j = 0; j <= nr_gp_counters; j++)
> + test_gp_counters(v, j, perf_caps[i]);
> +
> + for (j = 0; j <= nr_fixed_counters; j++) {
> + for (k = 0; k <= (BIT(nr_fixed_counters) - 1); k++)
> + test_fixed_counters(v, j, k, perf_caps[i]);
> + }
> }
> }
> }
> @@ -397,6 +429,8 @@ int main(int argc, char *argv[])
> TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
> TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
>
> + kvm_pmu_version = kvm_cpu_property(X86_PROPERTY_PMU_VERSION);
> +
> test_intel_arch_events();
> test_intel_counters();
>

2023-10-24 12:16:16

by Jinrong Liang

[permalink] [raw]
Subject: Re: [PATCH v5 11/13] KVM: selftests: Test consistency of CPUID with num of fixed counters

在 2023/10/24 08:26, Sean Christopherson 写道:
> From: Jinrong Liang <[email protected]>
>
> Extend the PMU counters test to verify KVM emulation of fixed counters in
> addition to general purpose counters. Fixed counters add an extra wrinkle
> in the form of an extra supported bitmask. Thus quoth the SDM:
>
> fixed-function performance counter 'i' is supported if ECX[i] || (EDX[4:0] > i)
>
> Test that KVM handles a counter being available through either method.
>
> Co-developed-by: Like Xu <[email protected]>
> Signed-off-by: Like Xu <[email protected]>
> Signed-off-by: Jinrong Liang <[email protected]>
> Co-developed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> .../selftests/kvm/x86_64/pmu_counters_test.c | 58 ++++++++++++++++++-
> 1 file changed, 55 insertions(+), 3 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> index 274b7f4d4b53..f1d9cdd69a17 100644
> --- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> @@ -227,13 +227,19 @@ __GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector, \
> expect_gp ? "#GP" : "no fault", msr, vector) \
>
> static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters,
> - uint8_t nr_counters)
> + uint8_t nr_counters, uint32_t or_mask)
> {
> uint8_t i;
>
> for (i = 0; i < nr_possible_counters; i++) {
> const uint32_t msr = base_msr + i;
> - const bool expect_success = i < nr_counters;
> +
> + /*
> + * Fixed counters are supported if the counter is less than the
> + * number of enumerated contiguous counters *or* the counter is
> + * explicitly enumerated in the supported counters mask.
> + */
> + const bool expect_success = i < nr_counters || (or_mask & BIT(i));
>
> /*
> * KVM drops writes to MSR_P6_PERFCTR[0|1] if the counters are
> @@ -273,7 +279,7 @@ static void guest_test_gp_counters(void)
> else
> base_msr = MSR_IA32_PERFCTR0;
>
> - guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters);
> + guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters, 0);
> }
>
> static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
> @@ -292,10 +298,51 @@ static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
> kvm_vm_free(vm);
> }
>
> +static void guest_test_fixed_counters(void)
> +{
> + uint64_t supported_bitmask = 0;
> + uint8_t nr_fixed_counters = 0;
> +
> + /* KVM provides fixed counters iff the vPMU version is 2+. */

s/iff/if/

> + if (this_cpu_property(X86_PROPERTY_PMU_VERSION) >= 2)
> + nr_fixed_counters = this_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
> +
> + /*
> + * The supported bitmask for fixed counters was introduced in PMU
> + * version 5.
> + */
> + if (this_cpu_property(X86_PROPERTY_PMU_VERSION) >= 5)
> + supported_bitmask = this_cpu_property(X86_PROPERTY_PMU_FIXED_COUNTERS_BITMASK);
> +
> + guest_rd_wr_counters(MSR_CORE_PERF_FIXED_CTR0, MAX_NR_FIXED_COUNTERS,
> + nr_fixed_counters, supported_bitmask);
> +}
> +
> +static void test_fixed_counters(uint8_t nr_fixed_counters,
> + uint32_t supported_bitmask, uint64_t perf_cap)
> +{
> + struct kvm_vcpu *vcpu;
> + struct kvm_vm *vm;
> +
> + vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_fixed_counters);
> +
> + vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_FIXED_COUNTERS_BITMASK,
> + supported_bitmask);
> + vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_NR_FIXED_COUNTERS,
> + nr_fixed_counters);
> + vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, perf_cap);
> +
> + run_vcpu(vcpu);
> +
> + kvm_vm_free(vm);
> +}
> +
> static void test_intel_counters(void)
> {
> + uint8_t nr_fixed_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_FIXED_COUNTERS);
> uint8_t nr_gp_counters = kvm_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
> unsigned int i;
> + uint32_t k;
> uint8_t j;
>
> const uint64_t perf_caps[] = {
> @@ -306,6 +353,11 @@ static void test_intel_counters(void)
> for (i = 0; i < ARRAY_SIZE(perf_caps); i++) {
> for (j = 0; j <= nr_gp_counters; j++)
> test_gp_counters(j, perf_caps[i]);
> +
> + for (j = 0; j <= nr_fixed_counters; j++) {
> + for (k = 0; k <= (BIT(nr_fixed_counters) - 1); k++)
> + test_fixed_counters(j, k, perf_caps[i]);
> + }
> }
> }
>

2023-10-24 14:23:09

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 11/13] KVM: selftests: Test consistency of CPUID with num of fixed counters

On Tue, Oct 24, 2023, JinrongLiang wrote:
> 在 2023/10/24 08:26, Sean Christopherson 写道:
> > From: Jinrong Liang <[email protected]>
> >
> > Extend the PMU counters test to verify KVM emulation of fixed counters in
> > addition to general purpose counters. Fixed counters add an extra wrinkle
> > in the form of an extra supported bitmask. Thus quoth the SDM:
> >
> > fixed-function performance counter 'i' is supported if ECX[i] || (EDX[4:0] > i)
> >
> > Test that KVM handles a counter being available through either method.
> >
> > Co-developed-by: Like Xu <[email protected]>
> > Signed-off-by: Like Xu <[email protected]>
> > Signed-off-by: Jinrong Liang <[email protected]>
> > Co-developed-by: Sean Christopherson <[email protected]>
> > Signed-off-by: Sean Christopherson <[email protected]>
> > ---
> > .../selftests/kvm/x86_64/pmu_counters_test.c | 58 ++++++++++++++++++-
> > 1 file changed, 55 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> > index 274b7f4d4b53..f1d9cdd69a17 100644
> > --- a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> > +++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> > @@ -227,13 +227,19 @@ __GUEST_ASSERT(expect_gp ? vector == GP_VECTOR : !vector, \
> > expect_gp ? "#GP" : "no fault", msr, vector) \
> > static void guest_rd_wr_counters(uint32_t base_msr, uint8_t nr_possible_counters,
> > - uint8_t nr_counters)
> > + uint8_t nr_counters, uint32_t or_mask)
> > {
> > uint8_t i;
> > for (i = 0; i < nr_possible_counters; i++) {
> > const uint32_t msr = base_msr + i;
> > - const bool expect_success = i < nr_counters;
> > +
> > + /*
> > + * Fixed counters are supported if the counter is less than the
> > + * number of enumerated contiguous counters *or* the counter is
> > + * explicitly enumerated in the supported counters mask.
> > + */
> > + const bool expect_success = i < nr_counters || (or_mask & BIT(i));
> > /*
> > * KVM drops writes to MSR_P6_PERFCTR[0|1] if the counters are
> > @@ -273,7 +279,7 @@ static void guest_test_gp_counters(void)
> > else
> > base_msr = MSR_IA32_PERFCTR0;
> > - guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters);
> > + guest_rd_wr_counters(base_msr, MAX_NR_GP_COUNTERS, nr_gp_counters, 0);
> > }
> > static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
> > @@ -292,10 +298,51 @@ static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
> > kvm_vm_free(vm);
> > }
> > +static void guest_test_fixed_counters(void)
> > +{
> > + uint64_t supported_bitmask = 0;
> > + uint8_t nr_fixed_counters = 0;
> > +
> > + /* KVM provides fixed counters iff the vPMU version is 2+. */
>
> s/iff/if/

The "iff" is intentional, it's shorthand for "if and only if". Ha, it even has
a Wikipedia page: https://en.wikipedia.org/wiki/If_and_only_if.

2023-10-24 14:24:10

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 13/13] KVM: selftests: Extend PMU counters test to permute on vPMU version

On Tue, Oct 24, 2023, JinrongLiang wrote:
> > -static void test_gp_counters(uint8_t nr_gp_counters, uint64_t perf_cap)
> > +static void test_gp_counters(uint8_t pmu_version, uint8_t nr_gp_counters,
>
> The parameter pmu_version is not used in this function.

Argh, I forgot to actually set the PMU version for the guest. Good catch!
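
The missing piece is just plumbing the version into the vCPU's CPUID, e.g.
something along these lines (sketch only; the surrounding calls are
approximated from the fixed counters variant, not the final code):

	vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_test_gp_counters);

	/* Sketch of the forgotten step: set the guest's CPUID PMU version. */
	vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_VERSION, pmu_version);

	vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_NR_GP_COUNTERS,
				nr_gp_counters);
	vcpu_set_msr(vcpu, MSR_IA32_PERF_CAPABILITIES, perf_cap);

	run_vcpu(vcpu);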

2023-10-24 19:50:01

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural events on gp counters

On Mon, Oct 23, 2023, Sean Christopherson wrote:
> +static void guest_measure_pmu_v1(struct kvm_x86_pmu_feature event,
> + uint32_t counter_msr, uint32_t nr_gp_counters)
> +{
> + uint8_t idx = event.f.bit;
> + unsigned int i;
> +
> + for (i = 0; i < nr_gp_counters; i++) {
> + wrmsr(counter_msr + i, 0);
> + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> + ARCH_PERFMON_EVENTSEL_ENABLE | intel_pmu_arch_events[idx]);
> + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> +
> + if (pmu_is_intel_event_stable(idx))
> + GUEST_ASSERT_EQ(this_pmu_has(event), !!_rdpmc(i));
> +
> + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> + !ARCH_PERFMON_EVENTSEL_ENABLE |
> + intel_pmu_arch_events[idx]);
> + wrmsr(counter_msr + i, 0);
> + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> +
> + if (pmu_is_intel_event_stable(idx))
> + GUEST_ASSERT(!_rdpmc(i));
> + }
> +
> + GUEST_DONE();
> +}
> +
> +static void guest_measure_loop(uint8_t idx)
> +{
> + const struct {
> + struct kvm_x86_pmu_feature gp_event;
> + } intel_event_to_feature[] = {
> + [INTEL_ARCH_CPU_CYCLES] = { X86_PMU_FEATURE_CPU_CYCLES },
> + [INTEL_ARCH_INSTRUCTIONS_RETIRED] = { X86_PMU_FEATURE_INSNS_RETIRED },
> + [INTEL_ARCH_REFERENCE_CYCLES] = { X86_PMU_FEATURE_REFERENCE_CYCLES },
> + [INTEL_ARCH_LLC_REFERENCES] = { X86_PMU_FEATURE_LLC_REFERENCES },
> + [INTEL_ARCH_LLC_MISSES] = { X86_PMU_FEATURE_LLC_MISSES },
> + [INTEL_ARCH_BRANCHES_RETIRED] = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED },
> + [INTEL_ARCH_BRANCHES_MISPREDICTED] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED },
> + };
> +
> + uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
> + uint32_t pmu_version = this_cpu_property(X86_PROPERTY_PMU_VERSION);
> + struct kvm_x86_pmu_feature gp_event;
> + uint32_t counter_msr;
> + unsigned int i;
> +
> + if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
> + counter_msr = MSR_IA32_PMC0;
> + else
> + counter_msr = MSR_IA32_PERFCTR0;
> +
> + gp_event = intel_event_to_feature[idx].gp_event;
> + TEST_ASSERT_EQ(idx, gp_event.f.bit);
> +
> + if (pmu_version < 2) {
> + guest_measure_pmu_v1(gp_event, counter_msr, nr_gp_counters);

Looking at this again, testing guest PMU version 1 is practically impossible
because this testcase doesn't force the guest PMU version. I.e. unless I'm
missing something, this requires old hardware or running in a VM with its PMU
forced to '1'.

And if all subtests use similar inputs, the common configuration can be shoved
into pmu_vm_create_with_one_vcpu().
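
E.g. a rough sketch of what that could look like (illustrative only; the
extra parameters here are assumptions, not the actual v6 code):

static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
						  void *guest_code,
						  uint8_t pmu_version,
						  uint64_t perf_capabilities)
{
	struct kvm_vm *vm;

	vm = vm_create_with_one_vcpu(vcpu, guest_code);
	vm_init_descriptor_tables(vm);
	vcpu_init_descriptor_tables(*vcpu);

	/* Hoist the configuration that every subtest repeats into the helper. */
	vcpu_set_msr(*vcpu, MSR_IA32_PERF_CAPABILITIES, perf_capabilities);
	vcpu_set_cpuid_property(*vcpu, X86_PROPERTY_PMU_VERSION, pmu_version);

	return vm;
}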

It's easy enough to fold test_intel_arch_events() into test_intel_counters(),
which will also provide coverage for running with full-width writes enabled. The
only downside is that the total runtime will be longer.

> +static void test_arch_events_cpuid(uint8_t i, uint8_t j, uint8_t idx)
> +{
> + uint8_t arch_events_unavailable_mask = BIT_ULL(j);
> + uint8_t arch_events_bitmap_size = BIT_ULL(i);
> + struct kvm_vcpu *vcpu;
> + struct kvm_vm *vm;
> +
> + vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_measure_loop);
> +
> + vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH,
> + arch_events_bitmap_size);
> + vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
> + arch_events_unavailable_mask);
> +
> + vcpu_args_set(vcpu, 1, idx);
> +
> + run_vcpu(vcpu);
> +
> + kvm_vm_free(vm);
> +}
> +
> +static void test_intel_arch_events(void)
> +{
> + uint8_t idx, i, j;
> +
> + for (idx = 0; idx < NR_INTEL_ARCH_EVENTS; idx++) {

There's no need to iterate over each event in the host, we can simply add a wrapper
for guest_measure_loop() in the guest. That'll be slightly faster since it won't
require creating and destroying a VM for every event.
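
E.g. with a guest-side wrapper along these lines (sketch; hypothetical name,
and it assumes GUEST_DONE() is hoisted out of guest_measure_loop()):

static void guest_test_arch_events(void)
{
	uint8_t idx;

	/* Measure every architectural event from within a single VM. */
	for (idx = 0; idx < NR_INTEL_ARCH_EVENTS; idx++)
		guest_measure_loop(idx);

	GUEST_DONE();
}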

> + /*
> + * A brute force iteration of all combinations of values is
> + * likely to exhaust the limit of the single-threaded thread
> + * fd nums, so it's tested by iterating through all valid
> + * single-bit values.
> + */
> + for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++) {

This is flawed/odd. 'i' becomes arch_events_bitmap_size, i.e. it's a length,
but the length is computed by BIT(i). That's nonsensical and will eventually
result in undefined behavior. Oof, that'll actually happen sooner than later
because arch_events_bitmap_size is only a single byte, i.e. when the number of
events hits 9, this will try to shove 256 into an 8-bit variable.

The more correct approach would be to pass in 0..NR_INTEL_ARCH_EVENTS inclusive
as the size. But I think we should actually test 0..length+1, where "length" is
the max of the native length and NR_INTEL_ARCH_EVENTS, i.e. we should verify that
KVM handles a size larger than the native length.

> + for (j = 0; j < NR_INTEL_ARCH_EVENTS; j++)
> + test_arch_events_cpuid(i, j, idx);

And here, I think it makes sense to brute force all possible values for at least
one configuration. There aren't actually _that_ many values, e.g. currently it's
64 (I think). E.g. test the native PMU version with the "full" length, and then
test single bits with varying lengths.
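
Roughly (sketch only; this assumes test_arch_events_cpuid() is reworked to
take the vector length and unavailable-events mask directly, and that the
guest walks the events itself):

	uint8_t length, bit, nr_events = NR_INTEL_ARCH_EVENTS;
	uint32_t mask;

	/* "length" is the max of the native length and NR_INTEL_ARCH_EVENTS. */
	nr_events = max_t(uint8_t, nr_events,
			  kvm_cpu_property(X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH));

	/* Brute force every unavailable-events mask at the full length... */
	for (mask = 0; mask < BIT(nr_events); mask++)
		test_arch_events_cpuid(nr_events, mask);

	/* ...then walk single bits with lengths 0..length+1, inclusive. */
	for (length = 0; length <= nr_events + 1; length++)
		for (bit = 0; bit < nr_events; bit++)
			test_arch_events_cpuid(length, BIT(bit));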

I'll send a v6 later this week.

2023-10-24 21:00:26

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural events on gp counters

On Tue, Oct 24, 2023, Sean Christopherson wrote:
> On Mon, Oct 23, 2023, Sean Christopherson wrote:
> > +static void test_intel_arch_events(void)
> > +{
> > + uint8_t idx, i, j;
> > +
> > + for (idx = 0; idx < NR_INTEL_ARCH_EVENTS; idx++) {

*sigh*

Yet another KVM bug that this test _should_ catch, but doesn't because too many
things are hardcoded. KVM _still_ advertises Top Down Slots to userspace because
KVM doesn't sanitize the bit vector or its length that comes out of perf.

2023-10-25 03:17:55

by Jinrong Liang

[permalink] [raw]
Subject: Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural events on gp counters

在 2023/10/25 03:49, Sean Christopherson 写道:
> On Mon, Oct 23, 2023, Sean Christopherson wrote:
>> +static void guest_measure_pmu_v1(struct kvm_x86_pmu_feature event,
>> + uint32_t counter_msr, uint32_t nr_gp_counters)
>> +{
>> + uint8_t idx = event.f.bit;
>> + unsigned int i;
>> +
>> + for (i = 0; i < nr_gp_counters; i++) {
>> + wrmsr(counter_msr + i, 0);
>> + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
>> + ARCH_PERFMON_EVENTSEL_ENABLE | intel_pmu_arch_events[idx]);
>> + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
>> +
>> + if (pmu_is_intel_event_stable(idx))
>> + GUEST_ASSERT_EQ(this_pmu_has(event), !!_rdpmc(i));
>> +
>> + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
>> + !ARCH_PERFMON_EVENTSEL_ENABLE |
>> + intel_pmu_arch_events[idx]);
>> + wrmsr(counter_msr + i, 0);
>> + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
>> +
>> + if (pmu_is_intel_event_stable(idx))
>> + GUEST_ASSERT(!_rdpmc(i));
>> + }
>> +
>> + GUEST_DONE();
>> +}
>> +
>> +static void guest_measure_loop(uint8_t idx)
>> +{
>> + const struct {
>> + struct kvm_x86_pmu_feature gp_event;
>> + } intel_event_to_feature[] = {
>> + [INTEL_ARCH_CPU_CYCLES] = { X86_PMU_FEATURE_CPU_CYCLES },
>> + [INTEL_ARCH_INSTRUCTIONS_RETIRED] = { X86_PMU_FEATURE_INSNS_RETIRED },
>> + [INTEL_ARCH_REFERENCE_CYCLES] = { X86_PMU_FEATURE_REFERENCE_CYCLES },
>> + [INTEL_ARCH_LLC_REFERENCES] = { X86_PMU_FEATURE_LLC_REFERENCES },
>> + [INTEL_ARCH_LLC_MISSES] = { X86_PMU_FEATURE_LLC_MISSES },
>> + [INTEL_ARCH_BRANCHES_RETIRED] = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED },
>> + [INTEL_ARCH_BRANCHES_MISPREDICTED] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED },
>> + };
>> +
>> + uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
>> + uint32_t pmu_version = this_cpu_property(X86_PROPERTY_PMU_VERSION);
>> + struct kvm_x86_pmu_feature gp_event;
>> + uint32_t counter_msr;
>> + unsigned int i;
>> +
>> + if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
>> + counter_msr = MSR_IA32_PMC0;
>> + else
>> + counter_msr = MSR_IA32_PERFCTR0;
>> +
>> + gp_event = intel_event_to_feature[idx].gp_event;
>> + TEST_ASSERT_EQ(idx, gp_event.f.bit);
>> +
>> + if (pmu_version < 2) {
>> + guest_measure_pmu_v1(gp_event, counter_msr, nr_gp_counters);
>
> Looking at this again, testing guest PMU version 1 is practically impossible
> because this testcase doesn't force the guest PMU version. I.e. unless I'm
> missing something, this requires old hardware or running in a VM with its PMU
> forced to '1'.
>
> And if all subtests use similar inputs, the common configuration can be shoved
> into pmu_vm_create_with_one_vcpu().
>
> It's easy enough to fold test_intel_arch_events() into test_intel_counters(),
> which will also provide coverage for running with full-width writes enabled. The
> only downside is that the total runtime will be longer.
>
>> +static void test_arch_events_cpuid(uint8_t i, uint8_t j, uint8_t idx)
>> +{
>> + uint8_t arch_events_unavailable_mask = BIT_ULL(j);
>> + uint8_t arch_events_bitmap_size = BIT_ULL(i);
>> + struct kvm_vcpu *vcpu;
>> + struct kvm_vm *vm;
>> +
>> + vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_measure_loop);
>> +
>> + vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH,
>> + arch_events_bitmap_size);
>> + vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
>> + arch_events_unavailable_mask);
>> +
>> + vcpu_args_set(vcpu, 1, idx);
>> +
>> + run_vcpu(vcpu);
>> +
>> + kvm_vm_free(vm);
>> +}
>> +
>> +static void test_intel_arch_events(void)
>> +{
>> + uint8_t idx, i, j;
>> +
>> + for (idx = 0; idx < NR_INTEL_ARCH_EVENTS; idx++) {
>
> There's no need to iterate over each event in the host, we can simply add a wrapper
> for guest_measure_loop() in the guest. That'll be slightly faster since it won't
> require creating and destroying a VM for every event.
>
>> + /*
>> + * A brute force iteration of all combinations of values is
>> + * likely to exhaust the limit of the single-threaded thread
>> + * fd nums, so it's tested by iterating through all valid
>> + * single-bit values.
>> + */
>> + for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++) {
>
> This is flawed/odd. 'i' becomes arch_events_bitmap_size, i.e. it's a length,
> but the length is computed by BIT(i). That's nonsensical and will eventually
> result in undefined behavior. Oof, that'll actually happen sooner than later
> because arch_events_bitmap_size is only a single byte, i.e. when the number of
> events hits 9, this will try to shove 256 into an 8-bit variable.
>
> The more correct approach would be to pass in 0..NR_INTEL_ARCH_EVENTS inclusive
> as the size. But I think we should actually test 0..length+1, where "length" is
> the max of the native length and NR_INTEL_ARCH_EVENTS, i.e. we should verify that
> KVM handles a size larger than the native length.
>
>> + for (j = 0; j < NR_INTEL_ARCH_EVENTS; j++)
>> + test_arch_events_cpuid(i, j, idx);
>
> And here, I think it makes sense to brute force all possible values for at least
> one configuration. There aren't actually _that_ many values, e.g. currently it's
> 64 (I think). E.g. test the native PMU version with the "full" length, and then
> test single bits with varying lengths.
>
> I'll send a v6 later this week.

Got it, thanks.

Please feel free to let me know if there's anything you'd like me to do.

2023-10-26 20:39:39

by Mingwei Zhang

[permalink] [raw]
Subject: Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural events on gp counters

On Mon, Oct 23, 2023, Sean Christopherson wrote:
> From: Jinrong Liang <[email protected]>
>
> Add test cases to check if different Architectural events are available
> after it's marked as unavailable via CPUID. It covers vPMU event filtering
> logic based on Intel CPUID, which is a complement to pmu_event_filter.
>
> According to Intel SDM, the number of architectural events is reported
> through CPUID.0AH:EAX[31:24] and the architectural event x is supported
> if EBX[x]=0 && EAX[31:24]>x.
>
> Co-developed-by: Like Xu <[email protected]>
> Signed-off-by: Like Xu <[email protected]>
> Signed-off-by: Jinrong Liang <[email protected]>
> Co-developed-by: Sean Christopherson <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> tools/testing/selftests/kvm/Makefile | 1 +
> .../selftests/kvm/x86_64/pmu_counters_test.c | 189 ++++++++++++++++++
> 2 files changed, 190 insertions(+)
> create mode 100644 tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
>
> diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> index ed1c17cabc07..4c024fb845b4 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -82,6 +82,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
> TEST_GEN_PROGS_x86_64 += x86_64/monitor_mwait_test
> TEST_GEN_PROGS_x86_64 += x86_64/nested_exceptions_test
> TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
> +TEST_GEN_PROGS_x86_64 += x86_64/pmu_counters_test
> TEST_GEN_PROGS_x86_64 += x86_64/pmu_event_filter_test
> TEST_GEN_PROGS_x86_64 += x86_64/set_boot_cpu_id
> TEST_GEN_PROGS_x86_64 += x86_64/set_sregs_test
> diff --git a/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> new file mode 100644
> index 000000000000..2a6336b994d5
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/pmu_counters_test.c
> @@ -0,0 +1,189 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2023, Tencent, Inc.
> + */
> +
> +#define _GNU_SOURCE /* for program_invocation_short_name */
> +#include <x86intrin.h>
> +
> +#include "pmu.h"
> +#include "processor.h"
> +
> +/* Guest payload for any performance counter counting */
> +#define NUM_BRANCHES 10
> +
> +static struct kvm_vm *pmu_vm_create_with_one_vcpu(struct kvm_vcpu **vcpu,
> + void *guest_code)
> +{
> + struct kvm_vm *vm;
> +
> + vm = vm_create_with_one_vcpu(vcpu, guest_code);
> + vm_init_descriptor_tables(vm);
> + vcpu_init_descriptor_tables(*vcpu);
> +
> + return vm;
> +}
> +
> +static void run_vcpu(struct kvm_vcpu *vcpu)
> +{
> + struct ucall uc;
> +
> + do {
> + vcpu_run(vcpu);
> + switch (get_ucall(vcpu, &uc)) {
> + case UCALL_SYNC:
> + break;
> + case UCALL_ABORT:
> + REPORT_GUEST_ASSERT(uc);
> + break;
> + case UCALL_DONE:
> + break;
> + default:
> + TEST_FAIL("Unexpected ucall: %lu", uc.cmd);
> + }
> + } while (uc.cmd != UCALL_DONE);
> +}
> +
> +static bool pmu_is_intel_event_stable(uint8_t idx)
> +{
> + switch (idx) {
> + case INTEL_ARCH_CPU_CYCLES:
> + case INTEL_ARCH_INSTRUCTIONS_RETIRED:
> + case INTEL_ARCH_REFERENCE_CYCLES:
> + case INTEL_ARCH_BRANCHES_RETIRED:
> + return true;
> + default:
> + return false;
> + }
> +}

Please add a brief explanation of why the other events are not stable. Since
there are only a few architectural events, maybe listing all of them with
explanations in comments would work better. Let out-of-bounds indices return
false by default.
> +
> +static void guest_measure_pmu_v1(struct kvm_x86_pmu_feature event,
> + uint32_t counter_msr, uint32_t nr_gp_counters)
> +{
> + uint8_t idx = event.f.bit;
> + unsigned int i;
> +
> + for (i = 0; i < nr_gp_counters; i++) {
> + wrmsr(counter_msr + i, 0);
> + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> + ARCH_PERFMON_EVENTSEL_ENABLE | intel_pmu_arch_events[idx]);
> + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));

Some comment might be needed for readability. Abruptly inserting inline
assembly code in C destroys the readability.

I wonder, do we need to add a 'clobber' here for the above line, since it
takes away ecx?

Also, I wonder if we need to disable IRQ here? This code might be
intercepted and resumed. If so, then the test will get a different
number?
> +
> + if (pmu_is_intel_event_stable(idx))
> + GUEST_ASSERT_EQ(this_pmu_has(event), !!_rdpmc(i));

Okay, just the counter value is non-zero means we pass the test ?!

hmm, I wonder other than IRQ stuff, what else may affect the result? NMI
watchdog or what?
> +
> + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> + !ARCH_PERFMON_EVENTSEL_ENABLE |
> + intel_pmu_arch_events[idx]);
> + wrmsr(counter_msr + i, 0);
> + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
ditto for readability. Please consider using a macro to avoid repeated
explanation.

> +
> + if (pmu_is_intel_event_stable(idx))
> + GUEST_ASSERT(!_rdpmc(i));
> + }
> +
> + GUEST_DONE();
> +}
> +
> +static void guest_measure_loop(uint8_t idx)
> +{
> + const struct {
> + struct kvm_x86_pmu_feature gp_event;
> + } intel_event_to_feature[] = {
> + [INTEL_ARCH_CPU_CYCLES] = { X86_PMU_FEATURE_CPU_CYCLES },
> + [INTEL_ARCH_INSTRUCTIONS_RETIRED] = { X86_PMU_FEATURE_INSNS_RETIRED },
> + [INTEL_ARCH_REFERENCE_CYCLES] = { X86_PMU_FEATURE_REFERENCE_CYCLES },
> + [INTEL_ARCH_LLC_REFERENCES] = { X86_PMU_FEATURE_LLC_REFERENCES },
> + [INTEL_ARCH_LLC_MISSES] = { X86_PMU_FEATURE_LLC_MISSES },
> + [INTEL_ARCH_BRANCHES_RETIRED] = { X86_PMU_FEATURE_BRANCH_INSNS_RETIRED },
> + [INTEL_ARCH_BRANCHES_MISPREDICTED] = { X86_PMU_FEATURE_BRANCHES_MISPREDICTED },
> + };
> +
> + uint32_t nr_gp_counters = this_cpu_property(X86_PROPERTY_PMU_NR_GP_COUNTERS);
> + uint32_t pmu_version = this_cpu_property(X86_PROPERTY_PMU_VERSION);
> + struct kvm_x86_pmu_feature gp_event;
> + uint32_t counter_msr;
> + unsigned int i;
> +
> + if (rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
> + counter_msr = MSR_IA32_PMC0;
> + else
> + counter_msr = MSR_IA32_PERFCTR0;
> +
> + gp_event = intel_event_to_feature[idx].gp_event;
> + TEST_ASSERT_EQ(idx, gp_event.f.bit);
> +
> + if (pmu_version < 2) {
> + guest_measure_pmu_v1(gp_event, counter_msr, nr_gp_counters);
> + return;
> + }
> +
> + for (i = 0; i < nr_gp_counters; i++) {
> + wrmsr(counter_msr + i, 0);
> + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> + ARCH_PERFMON_EVENTSEL_ENABLE |
> + intel_pmu_arch_events[idx]);
> +
> + wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, BIT_ULL(i));
> + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> + wrmsr(MSR_CORE_PERF_GLOBAL_CTRL, 0);
> +
> + if (pmu_is_intel_event_stable(idx))
> + GUEST_ASSERT_EQ(this_pmu_has(gp_event), !!_rdpmc(i));
> + }
> +
> + GUEST_DONE();
> +}
> +
> +static void test_arch_events_cpuid(uint8_t i, uint8_t j, uint8_t idx)
> +{
> + uint8_t arch_events_unavailable_mask = BIT_ULL(j);
> + uint8_t arch_events_bitmap_size = BIT_ULL(i);
> + struct kvm_vcpu *vcpu;
> + struct kvm_vm *vm;
> +
> + vm = pmu_vm_create_with_one_vcpu(&vcpu, guest_measure_loop);
> +
> + vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EBX_BIT_VECTOR_LENGTH,
> + arch_events_bitmap_size);
> + vcpu_set_cpuid_property(vcpu, X86_PROPERTY_PMU_EVENTS_MASK,
> + arch_events_unavailable_mask);
> +
> + vcpu_args_set(vcpu, 1, idx);
> +
> + run_vcpu(vcpu);
> +
> + kvm_vm_free(vm);
> +}
> +
> +static void test_intel_arch_events(void)
> +{
> + uint8_t idx, i, j;
> +
> + for (idx = 0; idx < NR_INTEL_ARCH_EVENTS; idx++) {
> + /*
> + * A brute force iteration of all combinations of values is
> + * likely to exhaust the limit of the single-threaded thread
> + * fd nums, so it's tested by iterating through all valid
> + * single-bit values.
> + */
> + for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++) {
> + for (j = 0; j < NR_INTEL_ARCH_EVENTS; j++)
> + test_arch_events_cpuid(i, j, idx);
> + }
> + }
> +}
> +
> +int main(int argc, char *argv[])
> +{
> + TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
> +
> + TEST_REQUIRE(host_cpu_is_intel);
> + TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
> + TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
> + TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));

hmm, this means we cannot run this in nested if X86_FEATURE_PDCM is
missing. It only affects full-width counter, right?

> +
> + test_intel_arch_events();
> +
> + return 0;
> +}
> --
> 2.42.0.758.gaed0368e0e-goog
>

2023-10-26 20:54:13

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural events on gp counters

On Thu, Oct 26, 2023, Mingwei Zhang wrote:
> > +static bool pmu_is_intel_event_stable(uint8_t idx)
> > +{
> > + switch (idx) {
> > + case INTEL_ARCH_CPU_CYCLES:
> > + case INTEL_ARCH_INSTRUCTIONS_RETIRED:
> > + case INTEL_ARCH_REFERENCE_CYCLES:
> > + case INTEL_ARCH_BRANCHES_RETIRED:
> > + return true;
> > + default:
> > + return false;
> > + }
> > +}
>
> Please add a brief explanation of why the other events are not stable. Since
> there are only a few architectural events, maybe listing all of them with
> explanations in comments would work better.

Heh, I've already rewritten this logic to make


> > +
> > +static void guest_measure_pmu_v1(struct kvm_x86_pmu_feature event,
> > + uint32_t counter_msr, uint32_t nr_gp_counters)
> > +{
> > + uint8_t idx = event.f.bit;
> > + unsigned int i;
> > +
> > + for (i = 0; i < nr_gp_counters; i++) {
> > + wrmsr(counter_msr + i, 0);
> > + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> > + ARCH_PERFMON_EVENTSEL_ENABLE | intel_pmu_arch_events[idx]);
> > + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
>
> Some comment might be needed for readability. Abruptly inserting inline
> assembly code in C destroys the readability.
>
> I wonder, do we need to add a 'clobber' here for the above line, since it
> takes away ecx?

It's already there. You can't directly clobber a register that is used as an
input constraint. The workaround is to make the register both an input and an
output, hence the "+c" in the outputs section instead of just "c" in the inputs
section. The extra bit of cleverness is to use an intermediate anonymous variable
so that NUM_BRANCHES can effectively be passed in (#defines won't work as output
constraints).
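
A stripped down illustration of the trick (standalone example, not code
lifted from the test):

#define NUM_BRANCHES 42

static void count_branches(void)
{
	/*
	 * An "ecx" clobber can't be combined with a "c" input operand, so
	 * make ECX an in/out operand instead ("+c").  The anonymous compound
	 * literal provides the lvalue that an output constraint requires,
	 * which is what allows a #define to seed the initial ECX value.
	 */
	__asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
}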

> Also, I wonder if we need to disable IRQ here? This code might be
> intercepted and resumed. If so, then the test will get a different
> number?

This is guest code, disabling IRQs is pointless. There are no guest virtual IRQs,
guarding against host IRQs is impossible, unnecessary, and actually undesirable,
i.e. the guest vPMU shouldn't be counting host instructions and whatnot.

> > +
> > + if (pmu_is_intel_event_stable(idx))
> > + GUEST_ASSERT_EQ(this_pmu_has(event), !!_rdpmc(i));
>
> Okay, just the counter value is non-zero means we pass the test ?!

FWIW, I've updated

> hmm, I wonder other than IRQ stuff, what else may affect the result? NMI
> watchdog or what?

This is the beauty of selftests. They're _so_ simple that there are very few
surprises. E.g. there are no events of any kind unless the test explicitly
generates them. The downside is that doing anything complex in selftests requires
writing a fair bit of code.

> > +
> > + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> > + !ARCH_PERFMON_EVENTSEL_ENABLE |
> > + intel_pmu_arch_events[idx]);
> > + wrmsr(counter_msr + i, 0);
> > + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> ditto for readability. Please consider using a macro to avoid repeated
> explanation.

Heh, already did this too. Though I'm not entirely sure it's more readable. It's
definitely more precise and featured :-)

#define GUEST_MEASURE_EVENT(_msr, _value, clflush, FEP) \
do { \
__asm__ __volatile__("wrmsr\n\t" \
clflush "\n\t" \
"mfence\n\t" \
"1: mov $" __stringify(NUM_BRANCHES) ", %%ecx\n\t" \
FEP "loop .\n\t" \
FEP "mov %%edi, %%ecx\n\t" \
FEP "xor %%eax, %%eax\n\t" \
FEP "xor %%edx, %%edx\n\t" \
"wrmsr\n\t" \
: "+c"((int){_msr}) \
: "a"((uint32_t)_value), "d"(_value >> 32), \
"D"(_msr) \
); \
} while (0)


> > +int main(int argc, char *argv[])
> > +{
> > + TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
> > +
> > + TEST_REQUIRE(host_cpu_is_intel);
> > + TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
> > + TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
> > + TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
>
> hmm, this means we cannot run this in nested if X86_FEATURE_PDCM is
> missing. It only affects full-width counter, right?

Ah, yeah, good call. It won't be too much trouble to have the test play nice
with !PDCM.
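
Rough sketch of the idea (not the actual change): skip the full-width writes
variant and avoid PERF_CAPABILITIES entirely when PDCM isn't enumerated.

	/* Host: only iterate over the FW_WRITES variant if PDCM exists. */
	const int nr_perf_caps = kvm_cpu_has(X86_FEATURE_PDCM) ? 2 : 1;

	/* Guest: don't read PERF_CAPABILITIES without PDCM. */
	if (this_cpu_has(X86_FEATURE_PDCM) &&
	    rdmsr(MSR_IA32_PERF_CAPABILITIES) & PMU_CAP_FW_WRITES)
		counter_msr = MSR_IA32_PMC0;
	else
		counter_msr = MSR_IA32_PERFCTR0;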

2023-10-26 22:11:01

by Mingwei Zhang

[permalink] [raw]
Subject: Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural events on gp counters

On Thu, Oct 26, 2023, Sean Christopherson wrote:
> On Thu, Oct 26, 2023, Mingwei Zhang wrote:
> > > +static bool pmu_is_intel_event_stable(uint8_t idx)
> > > +{
> > > + switch (idx) {
> > > + case INTEL_ARCH_CPU_CYCLES:
> > > + case INTEL_ARCH_INSTRUCTIONS_RETIRED:
> > > + case INTEL_ARCH_REFERENCE_CYCLES:
> > > + case INTEL_ARCH_BRANCHES_RETIRED:
> > > + return true;
> > > + default:
> > > + return false;
> > > + }
> > > +}
> >
> > Please add a brief explanation of why the other events are not stable. Since
> > there are only a few architectural events, maybe listing all of them with
> > explanations in comments would work better.
>
> Heh, I've already rewritten this logic to make
>
>
> > > +
> > > +static void guest_measure_pmu_v1(struct kvm_x86_pmu_feature event,
> > > + uint32_t counter_msr, uint32_t nr_gp_counters)
> > > +{
> > > + uint8_t idx = event.f.bit;
> > > + unsigned int i;
> > > +
> > > + for (i = 0; i < nr_gp_counters; i++) {
> > > + wrmsr(counter_msr + i, 0);
> > > + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> > > + ARCH_PERFMON_EVENTSEL_ENABLE | intel_pmu_arch_events[idx]);
> > > + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> >
> > Some comment might be needed for readability. Abruptly inserting inline
> > assembly code in C destroys the readability.
> >
> > I wonder, do we need to add a 'clobber' here for the above line, since it
> > takes away ecx?
>
> It's already there. You can't directly clobber a register that is used as an
> input constraint. The workaround is to make the register both an input and an
> output, hence the "+c" in the outputs section instead of just "c" in the inputs
> section. The extra bit of cleverness is to use an intermediate anonymous variable
> so that NUM_BRANCHES can effectively be passed in (#defines won't work as output
> constraints).
>
> > Also, I wonder if we need to disable IRQ here? This code might be
> > intercepted and resumed. If so, then the test will get a different
> > number?
>
> This is guest code, disabling IRQs is pointless. There are no guest virtual IRQs,
> guarding against host IRQs is impossible, unnecessary, and actually undesirable,
> i.e. the guest vPMU shouldn't be counting host instructions and whatnot.
>
> > > +
> > > + if (pmu_is_intel_event_stable(idx))
> > > + GUEST_ASSERT_EQ(this_pmu_has(event), !!_rdpmc(i));
> >
> > Okay, just the counter value is non-zero means we pass the test ?!
>
> FWIW, I've updated
>
> > hmm, I wonder other than IRQ stuff, what else may affect the result? NMI
> > watchdog or what?
>
> This is the beauty of selftests. They're _so_ simple that there are very few
> surprises. E.g. there are no events of any kind unless the test explicitly
> generates them. The downside is that doing anything complex in selftests requires
> writing a fair bit of code.

Understood, so we could support precise matching.
>
> > > +
> > > + wrmsr(MSR_P6_EVNTSEL0 + i, ARCH_PERFMON_EVENTSEL_OS |
> > > + !ARCH_PERFMON_EVENTSEL_ENABLE |
> > > + intel_pmu_arch_events[idx]);
> > > + wrmsr(counter_msr + i, 0);
> > > + __asm__ __volatile__("loop ." : "+c"((int){NUM_BRANCHES}));
> > ditto for readability. Please consider using a macro to avoid repeated
> > explanation.
>
> Heh, already did this too. Though I'm not entirely sure it's more readable. It's
> definitely more precise and featured :-)
>
Oh dear, this is challenging to my rusty inline assembly skills :)

> #define GUEST_MEASURE_EVENT(_msr, _value, clflush, FEP) \
> do { \
> __asm__ __volatile__("wrmsr\n\t" \
> clflush "\n\t" \
> "mfence\n\t" \
> "1: mov $" __stringify(NUM_BRANCHES) ", %%ecx\n\t" \
> FEP "loop .\n\t" \
> FEP "mov %%edi, %%ecx\n\t" \
> FEP "xor %%eax, %%eax\n\t" \
> FEP "xor %%edx, %%edx\n\t" \
> "wrmsr\n\t" \
> : "+c"((int){_msr}) \
isn't it NUM_BRANCHES?
> : "a"((uint32_t)_value), "d"(_value >> 32), \
> "D"(_msr) \
> ); \
> } while (0)
>

do we need this label '1:' in the above code? It does not seem to be
used anywhere within the code.

why is clflush needed here?
>
> > > +int main(int argc, char *argv[])
> > > +{
> > > + TEST_REQUIRE(get_kvm_param_bool("enable_pmu"));
> > > +
> > > + TEST_REQUIRE(host_cpu_is_intel);
> > > + TEST_REQUIRE(kvm_cpu_has_p(X86_PROPERTY_PMU_VERSION));
> > > + TEST_REQUIRE(kvm_cpu_property(X86_PROPERTY_PMU_VERSION) > 0);
> > > + TEST_REQUIRE(kvm_cpu_has(X86_FEATURE_PDCM));
> >
> > hmm, this means we cannot run this in nested if X86_FEATURE_PDCM is
> > missing. It only affects full-width counter, right?
>
> Ah, yeah, good call. It won't be too much trouble to have the test play nice
> with !PDCM.

2023-10-26 22:55:25

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v5 08/13] KVM: selftests: Test Intel PMU architectural events on gp counters

On Thu, Oct 26, 2023, Mingwei Zhang wrote:
> On Thu, Oct 26, 2023, Sean Christopherson wrote:
> > Heh, already did this too. Though I'm not entirely sure it's more readable. It's
> > definitely more precise and featured :-)
> >
> Oh dear, this is challenging to my rusty inline assembly skills :)
>
> > #define GUEST_MEASURE_EVENT(_msr, _value, clflush, FEP) \
> > do { \
> > __asm__ __volatile__("wrmsr\n\t" \
> > clflush "\n\t" \
> > "mfence\n\t" \
> > "1: mov $" __stringify(NUM_BRANCHES) ", %%ecx\n\t" \
> > FEP "loop .\n\t" \
> > FEP "mov %%edi, %%ecx\n\t" \
> > FEP "xor %%eax, %%eax\n\t" \
> > FEP "xor %%edx, %%edx\n\t" \
> > "wrmsr\n\t" \
> > : "+c"((int){_msr}) \
> isn't it NUM_BRANCHES?

Nope. It's hard to see because this doesn't provide the usage, but @_msr is an
MSR index that is consumed by the first "wrmsr", i.e. this blob relies on the
compiler to preload ECX, EAX, and EDX for WRMSR. NUM_BRANCHES is manually loaded
into ECX after WRMSR (WRMSR and LOOP both hardcode consuming ECX).

Ha! I can actually drop the "+c" clobbering trick since ECX is restored to its
input value before the asm blob finishes. EDI is also loaded with @_msr so that
it can be quickly reloaded into ECX for the WRMSR to disable the event.

The purpose of doing both WRMSRs in assembly is to ensure the compiler doesn't
insert _any_ code into the measured sequence, e.g. so that a random Jcc doesn't
throw off instructions retired.

> > : "a"((uint32_t)_value), "d"(_value >> 32), \
> > "D"(_msr) \
> > ); \
> > } while (0)
> >
>
> do we need this label '1:' in the above code? It does not seem to be
> used anywhere within the code.

It's used by the caller as the target for CLFLUSH{,OPT}.

if (this_cpu_has(X86_FEATURE_CLFLUSHOPT)) \
GUEST_MEASURE_EVENT(_ctrl_msr, _value, "clflushopt 1f", FEP); \
else if (this_cpu_has(X86_FEATURE_CLFLUSH)) \
GUEST_MEASURE_EVENT(_ctrl_msr, _value, "clflushopt 1f", FEP); \
else \
GUEST_MEASURE_EVENT(_ctrl_msr, _value, "nop", FEP); \
>
> why is clflush needed here?

As suggested by Jim, it allows verifying LLC references and misses by forcing
the CPU to evict the cache line that holds the MOV at 1: (and likely holds most
of the asm blob).