2023-06-07 01:03:34

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 0/4] KVM: x86/pmu: Clean up arch/hw event handling

Clean up KVM's handling of arch/hw events, and the related fixed counter
usage. KVM has far too many open coded magic numbers, and kludgy code
that stems from the magic numbers.

Sean Christopherson (4):
KVM: x86/pmu: Use enums instead of hardcoded magic for arch event
indices
KVM: x86/pmu: Simplify intel_hw_event_available()
KVM: x86/pmu: Require nr fixed_pmc_events to match nr max fixed
counters
KVM: x86/pmu: Move .hw_event_available() check out of PMC filter
helper

arch/x86/kvm/pmu.c | 4 +-
arch/x86/kvm/vmx/pmu_intel.c | 81 ++++++++++++++++++++++++------------
2 files changed, 56 insertions(+), 29 deletions(-)


base-commit: 31b4fc3bc64aadd660c5bfa5178c86a7ba61e0f7
--
2.41.0.162.gfafddb0af9-goog



2023-06-07 01:14:23

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 3/4] KVM: x86/pmu: Require nr fixed_pmc_events to match nr max fixed counters

Assert that the number of known fixed_pmc_events matches the max number of
fixed counters supported by KVM, and clean up related code.

Opportunistically extend setup_fixed_pmc_eventsel()'s use of
array_index_nospec() to cover fixed_counters, as nr_arch_fixed_counters is
set based on userspace input (but capped using KVM-controlled values).

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index f281e634af3c..c0b0a721b97f 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -527,16 +527,17 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)

static void setup_fixed_pmc_eventsel(struct kvm_pmu *pmu)
{
- size_t size = ARRAY_SIZE(fixed_pmc_events);
- struct kvm_pmc *pmc;
- u32 event;
int i;

+ BUILD_BUG_ON(ARRAY_SIZE(fixed_pmc_events) != KVM_PMC_MAX_FIXED);
+
for (i = 0; i < pmu->nr_arch_fixed_counters; i++) {
- pmc = &pmu->fixed_counters[i];
- event = fixed_pmc_events[array_index_nospec(i, size)];
+ int index = array_index_nospec(i, KVM_PMC_MAX_FIXED);
+ struct kvm_pmc *pmc = &pmu->fixed_counters[index];
+ u32 event = fixed_pmc_events[index];
+
pmc->eventsel = (intel_arch_events[event].unit_mask << 8) |
- intel_arch_events[event].eventsel;
+ intel_arch_events[event].eventsel;
}
}

@@ -597,10 +598,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
if (pmu->version == 1) {
pmu->nr_arch_fixed_counters = 0;
} else {
- pmu->nr_arch_fixed_counters =
- min3(ARRAY_SIZE(fixed_pmc_events),
- (size_t) edx.split.num_counters_fixed,
- (size_t)kvm_pmu_cap.num_counters_fixed);
+ pmu->nr_arch_fixed_counters = min_t(int, edx.split.num_counters_fixed,
+ kvm_pmu_cap.num_counters_fixed);
edx.split.bit_width_fixed = min_t(int, edx.split.bit_width_fixed,
kvm_pmu_cap.bit_width_fixed);
pmu->counter_bitmask[KVM_PMC_FIXED] =
--
2.41.0.162.gfafddb0af9-goog


2023-06-07 01:14:34

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 2/4] KVM: x86/pmu: Simplify intel_hw_event_available()

Walk only the "real", i.e. non-pseudo, architectural events when checking
if a hardware event is available, i.e. isn't disabled by guest CPUID.
Skipping pseudo-arch events in the loop body is unnecessarily convoluted,
especially now that KVM has enums that delineate between real and pseudo
events.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 0050d71c9c01..f281e634af3c 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -122,17 +122,16 @@ static bool intel_hw_event_available(struct kvm_pmc *pmc)

BUILD_BUG_ON(ARRAY_SIZE(intel_arch_events) != NR_INTEL_ARCH_EVENTS);

- for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++) {
+ /*
+ * Disallow events reported as unavailable in guest CPUID. Note, this
+ * doesn't apply to pseudo-architectural events.
+ */
+ for (i = 0; i < NR_REAL_INTEL_ARCH_EVENTS; i++) {
if (intel_arch_events[i].eventsel != event_select ||
intel_arch_events[i].unit_mask != unit_mask)
continue;

- /* disable event that reported as not present by cpuid */
- if ((i < PSEUDO_ARCH_REFERENCE_CYCLES) &&
- !(pmu->available_event_types & (1 << i)))
- return false;
-
- break;
+ return pmu->available_event_types & BIT(i);
}

return true;
--
2.41.0.162.gfafddb0af9-goog


2023-06-07 01:44:03

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 1/4] KVM: x86/pmu: Use enums instead of hardcoded magic for arch event indices

Add "enum intel_pmu_architectural_events" to replace the magic numbers for
the (pseudo-)architectural events, and to give a meaningful name to each
event so that new readers don't need psychic powers to understand what the
code is doing.

Cc: Aaron Lewis <[email protected]>
Cc: Like Xu <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 55 ++++++++++++++++++++++++++++--------
1 file changed, 43 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 84be32d9f365..0050d71c9c01 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -22,23 +22,51 @@

#define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)

+enum intel_pmu_architectural_events {
+ /*
+ * The order of the architectural events matters as support for each
+ * event is enumerated via CPUID using the index of the event.
+ */
+ INTEL_ARCH_CPU_CYCLES,
+ INTEL_ARCH_INSTRUCTIONS_RETIRED,
+ INTEL_ARCH_REFERENCE_CYCLES,
+ INTEL_ARCH_LLC_REFERENCES,
+ INTEL_ARCH_LLC_MISSES,
+ INTEL_ARCH_BRANCHES_RETIRED,
+ INTEL_ARCH_BRANCHES_MISPREDICTED,
+
+ NR_REAL_INTEL_ARCH_EVENTS,
+
+ /*
+ * Pseudo-architectural event used to implement IA32_FIXED_CTR2, a.k.a.
+ * TSC reference cycles. The architectural reference cycles event may
+ * or may not actually use the TSC as the reference, e.g. might use the
+ * core crystal clock or the bus clock (yeah, "architectural").
+ */
+ PSEUDO_ARCH_REFERENCE_CYCLES = NR_REAL_INTEL_ARCH_EVENTS,
+ NR_INTEL_ARCH_EVENTS,
+};
+
static struct {
u8 eventsel;
u8 unit_mask;
} const intel_arch_events[] = {
- [0] = { 0x3c, 0x00 },
- [1] = { 0xc0, 0x00 },
- [2] = { 0x3c, 0x01 },
- [3] = { 0x2e, 0x4f },
- [4] = { 0x2e, 0x41 },
- [5] = { 0xc4, 0x00 },
- [6] = { 0xc5, 0x00 },
- /* The above index must match CPUID 0x0A.EBX bit vector */
- [7] = { 0x00, 0x03 },
+ [INTEL_ARCH_CPU_CYCLES] = { 0x3c, 0x00 },
+ [INTEL_ARCH_INSTRUCTIONS_RETIRED] = { 0xc0, 0x00 },
+ [INTEL_ARCH_REFERENCE_CYCLES] = { 0x3c, 0x01 },
+ [INTEL_ARCH_LLC_REFERENCES] = { 0x2e, 0x4f },
+ [INTEL_ARCH_LLC_MISSES] = { 0x2e, 0x41 },
+ [INTEL_ARCH_BRANCHES_RETIRED] = { 0xc4, 0x00 },
+ [INTEL_ARCH_BRANCHES_MISPREDICTED] = { 0xc5, 0x00 },
+ [PSEUDO_ARCH_REFERENCE_CYCLES] = { 0x00, 0x03 },
};

/* mapping between fixed pmc index and intel_arch_events array */
-static int fixed_pmc_events[] = {1, 0, 7};
+static int fixed_pmc_events[] = {
+ [0] = INTEL_ARCH_INSTRUCTIONS_RETIRED,
+ [1] = INTEL_ARCH_CPU_CYCLES,
+ [2] = PSEUDO_ARCH_REFERENCE_CYCLES,
+};

static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
{
@@ -92,13 +120,16 @@ static bool intel_hw_event_available(struct kvm_pmc *pmc)
u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
int i;

- for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++) {
+ BUILD_BUG_ON(ARRAY_SIZE(intel_arch_events) != NR_INTEL_ARCH_EVENTS);
+
+ for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++) {
if (intel_arch_events[i].eventsel != event_select ||
intel_arch_events[i].unit_mask != unit_mask)
continue;

/* disable event that reported as not present by cpuid */
- if ((i < 7) && !(pmu->available_event_types & (1 << i)))
+ if ((i < PSEUDO_ARCH_REFERENCE_CYCLES) &&
+ !(pmu->available_event_types & (1 << i)))
return false;

break;
--
2.41.0.162.gfafddb0af9-goog


2023-06-07 01:44:58

by Sean Christopherson

[permalink] [raw]
Subject: [PATCH 4/4] KVM: x86/pmu: Move .hw_event_available() check out of PMC filter helper

Move the call to kvm_x86_pmu.hw_event_available(), which has nothing to
with the userspace PMU filter, out of check_pmu_event_filter() and into
its sole caller pmc_event_is_allowed(). pmc_event_is_allowed() didn't
exist when commit 7aadaa988c5e ("KVM: x86/pmu: Drop amd_event_mapping[]
in the KVM context"), so presumably the motivation for invoking
.hw_event_available() from check_pmu_event_filter() was to avoid having
to add multiple call sites.

Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/kvm/pmu.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 1690d41c1830..2a32dc6aa3f7 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -387,9 +387,6 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
struct kvm_x86_pmu_event_filter *filter;
struct kvm *kvm = pmc->vcpu->kvm;

- if (!static_call(kvm_x86_pmu_hw_event_available)(pmc))
- return false;
-
filter = srcu_dereference(kvm->arch.pmu_event_filter, &kvm->srcu);
if (!filter)
return true;
@@ -403,6 +400,7 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
static bool pmc_event_is_allowed(struct kvm_pmc *pmc)
{
return pmc_is_globally_enabled(pmc) && pmc_speculative_in_use(pmc) &&
+ static_call(kvm_x86_pmu_hw_event_available)(pmc) &&
check_pmu_event_filter(pmc);
}

--
2.41.0.162.gfafddb0af9-goog


2023-07-10 11:04:17

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH 1/4] KVM: x86/pmu: Use enums instead of hardcoded magic for arch event indices

On 7/6/2023 9:02 am, Sean Christopherson wrote:
> Add "enum intel_pmu_architectural_events" to replace the magic numbers for
> the (pseudo-)architectural events, and to give a meaningful name to each
> event so that new readers don't need psychic powers to understand what the
> code is doing.
>
> Cc: Aaron Lewis <[email protected]>
> Cc: Like Xu <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>

Reviewed-by: Like Xu <[email protected]>
// I should have done it. Thanks.

> ---
> arch/x86/kvm/vmx/pmu_intel.c | 55 ++++++++++++++++++++++++++++--------
> 1 file changed, 43 insertions(+), 12 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
> index 84be32d9f365..0050d71c9c01 100644
> --- a/arch/x86/kvm/vmx/pmu_intel.c
> +++ b/arch/x86/kvm/vmx/pmu_intel.c
> @@ -22,23 +22,51 @@
>
> #define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)
>
> +enum intel_pmu_architectural_events {
> + /*
> + * The order of the architectural events matters as support for each
> + * event is enumerated via CPUID using the index of the event.
> + */
> + INTEL_ARCH_CPU_CYCLES,
> + INTEL_ARCH_INSTRUCTIONS_RETIRED,
> + INTEL_ARCH_REFERENCE_CYCLES,
> + INTEL_ARCH_LLC_REFERENCES,
> + INTEL_ARCH_LLC_MISSES,
> + INTEL_ARCH_BRANCHES_RETIRED,
> + INTEL_ARCH_BRANCHES_MISPREDICTED,
> +
> + NR_REAL_INTEL_ARCH_EVENTS,
> +
> + /*
> + * Pseudo-architectural event used to implement IA32_FIXED_CTR2, a.k.a.
> + * TSC reference cycles. The architectural reference cycles event may
> + * or may not actually use the TSC as the reference, e.g. might use the
> + * core crystal clock or the bus clock (yeah, "architectural").
> + */
> + PSEUDO_ARCH_REFERENCE_CYCLES = NR_REAL_INTEL_ARCH_EVENTS,
> + NR_INTEL_ARCH_EVENTS,
> +};
> +
> static struct {
> u8 eventsel;
> u8 unit_mask;
> } const intel_arch_events[] = {
> - [0] = { 0x3c, 0x00 },
> - [1] = { 0xc0, 0x00 },
> - [2] = { 0x3c, 0x01 },
> - [3] = { 0x2e, 0x4f },
> - [4] = { 0x2e, 0x41 },
> - [5] = { 0xc4, 0x00 },
> - [6] = { 0xc5, 0x00 },
> - /* The above index must match CPUID 0x0A.EBX bit vector */
> - [7] = { 0x00, 0x03 },
> + [INTEL_ARCH_CPU_CYCLES] = { 0x3c, 0x00 },
> + [INTEL_ARCH_INSTRUCTIONS_RETIRED] = { 0xc0, 0x00 },
> + [INTEL_ARCH_REFERENCE_CYCLES] = { 0x3c, 0x01 },
> + [INTEL_ARCH_LLC_REFERENCES] = { 0x2e, 0x4f },
> + [INTEL_ARCH_LLC_MISSES] = { 0x2e, 0x41 },
> + [INTEL_ARCH_BRANCHES_RETIRED] = { 0xc4, 0x00 },
> + [INTEL_ARCH_BRANCHES_MISPREDICTED] = { 0xc5, 0x00 },
> + [PSEUDO_ARCH_REFERENCE_CYCLES] = { 0x00, 0x03 },
> };
>
> /* mapping between fixed pmc index and intel_arch_events array */
> -static int fixed_pmc_events[] = {1, 0, 7};
> +static int fixed_pmc_events[] = {
> + [0] = INTEL_ARCH_INSTRUCTIONS_RETIRED,
> + [1] = INTEL_ARCH_CPU_CYCLES,
> + [2] = PSEUDO_ARCH_REFERENCE_CYCLES,
> +};
>
> static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
> {
> @@ -92,13 +120,16 @@ static bool intel_hw_event_available(struct kvm_pmc *pmc)
> u8 unit_mask = (pmc->eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;
> int i;
>
> - for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++) {
> + BUILD_BUG_ON(ARRAY_SIZE(intel_arch_events) != NR_INTEL_ARCH_EVENTS);
> +
> + for (i = 0; i < NR_INTEL_ARCH_EVENTS; i++) {
> if (intel_arch_events[i].eventsel != event_select ||
> intel_arch_events[i].unit_mask != unit_mask)
> continue;
>
> /* disable event that reported as not present by cpuid */
> - if ((i < 7) && !(pmu->available_event_types & (1 << i)))
> + if ((i < PSEUDO_ARCH_REFERENCE_CYCLES) &&
> + !(pmu->available_event_types & (1 << i)))

CHECK: Unnecessary parentheses around 'i < PSEUDO_ARCH_REFERENCE_CYCLES'
#164: FILE: arch/x86/kvm/vmx/pmu_intel.c:131:
+ if ((i < PSEUDO_ARCH_REFERENCE_CYCLES) &&
+ !(pmu->available_event_types & (1 << i)))

> return false;
>
> break;

2023-07-11 16:45:40

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH 4/4] KVM: x86/pmu: Move .hw_event_available() check out of PMC filter helper

On 2023/6/7 09:02, Sean Christopherson wrote:
> Move the call to kvm_x86_pmu.hw_event_available(), which has nothing to
> with the userspace PMU filter, out of check_pmu_event_filter() and into
> its sole caller pmc_event_is_allowed(). pmc_event_is_allowed() didn't
> exist when commit 7aadaa988c5e ("KVM: x86/pmu: Drop amd_event_mapping[]
> in the KVM context"), so presumably the motivation for invoking
> .hw_event_available() from check_pmu_event_filter() was to avoid having
> to add multiple call sites.

The event unavailability check based on intel cpuid is, in my opinion,
part of our pmu_event_filter mechanism. Unavailable events can be
defined either by KVM userspace or by architectural cpuid (if any).

The bigger issue here is what happens when the two rules conflict, and
the answer can be found more easily by putting the two parts in one
function (the architectural cpuid rule takes precedence).

>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/kvm/pmu.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 1690d41c1830..2a32dc6aa3f7 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -387,9 +387,6 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
> struct kvm_x86_pmu_event_filter *filter;
> struct kvm *kvm = pmc->vcpu->kvm;
>
> - if (!static_call(kvm_x86_pmu_hw_event_available)(pmc))
> - return false;
> -
> filter = srcu_dereference(kvm->arch.pmu_event_filter, &kvm->srcu);
> if (!filter)
> return true;
> @@ -403,6 +400,7 @@ static bool check_pmu_event_filter(struct kvm_pmc *pmc)
> static bool pmc_event_is_allowed(struct kvm_pmc *pmc)
> {
> return pmc_is_globally_enabled(pmc) && pmc_speculative_in_use(pmc) &&
> + static_call(kvm_x86_pmu_hw_event_available)(pmc) &&
> check_pmu_event_filter(pmc);
> }
>

2023-07-12 16:30:00

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 4/4] KVM: x86/pmu: Move .hw_event_available() check out of PMC filter helper

On Wed, Jul 12, 2023, Like Xu wrote:
> On 2023/6/7 09:02, Sean Christopherson wrote:
> > Move the call to kvm_x86_pmu.hw_event_available(), which has nothing to
> > with the userspace PMU filter, out of check_pmu_event_filter() and into
> > its sole caller pmc_event_is_allowed(). pmc_event_is_allowed() didn't
> > exist when commit 7aadaa988c5e ("KVM: x86/pmu: Drop amd_event_mapping[]
> > in the KVM context"), so presumably the motivation for invoking
> > .hw_event_available() from check_pmu_event_filter() was to avoid having
> > to add multiple call sites.
>
> The event unavailability check based on intel cpuid is, in my opinion,
> part of our pmu_event_filter mechanism. Unavailable events can be
> defined either by KVM userspace or by architectural cpuid (if any).
>
> The bigger issue here is what happens when the two rules conflict, and
> the answer can be found more easily by putting the two parts in one
> function (the architectural cpuid rule takes precedence).

I want to clearly differentiate between what KVM allows and what userspace allows,
and specifically I want to use "filter" only to describe userspace intervention as
much as possible.

Outside of kvm_get_filtered_xcr0(), which I would classify as userspace-defined
behavior (albeit rather indirectly), and a few architecturally defined "filter"
terms from Intel and AMD, we don't use "filter" in KVM to describe KVM behavior.

IMO, there's a lot of value in being able to associate "filter" with userspace
desires, e.g. just mentioning "filtering" immediately frames a conversation as
dealing with userspace's wants, not internal KVM behavior.

2023-08-03 00:53:57

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH 0/4] KVM: x86/pmu: Clean up arch/hw event handling

On Tue, 06 Jun 2023 18:02:02 -0700, Sean Christopherson wrote:
> Clean up KVM's handling of arch/hw events, and the related fixed counter
> usage. KVM has far too many open coded magic numbers, and kludgy code
> that stems from the magic numbers.
>
> Sean Christopherson (4):
> KVM: x86/pmu: Use enums instead of hardcoded magic for arch event
> indices
> KVM: x86/pmu: Simplify intel_hw_event_available()
> KVM: x86/pmu: Require nr fixed_pmc_events to match nr max fixed
> counters
> KVM: x86/pmu: Move .hw_event_available() check out of PMC filter
> helper
>
> [...]

Applied to kvm-x86 pmu, thanks!

[1/4] KVM: x86/pmu: Use enums instead of hardcoded magic for arch event indices
https://github.com/kvm-x86/linux/commit/0033fa354916
[2/4] KVM: x86/pmu: Simplify intel_hw_event_available()
https://github.com/kvm-x86/linux/commit/bc9658999b3e
[3/4] KVM: x86/pmu: Require nr fixed_pmc_events to match nr max fixed counters
https://github.com/kvm-x86/linux/commit/6d88d0ee5de1
[4/4] KVM: x86/pmu: Move .hw_event_available() check out of PMC filter helper
https://github.com/kvm-x86/linux/commit/6de2ccc16968

--
https://github.com/kvm-x86/linux/tree/next
https://github.com/kvm-x86/linux/tree/fixes