2022-11-11 11:10:30

by Like Xu

[permalink] [raw]
Subject: [PATCH v3 0/8] KVM: x86: Add AMD Guest PerfMonV2 PMU support

Starting with Zen4, the core PMU on AMD platforms such as Genoa and
Ryzen 7000 supports PerfMonV2, while remaining compatible with the
legacy PERFCTR_CORE behavior and MSR addresses.

If you don't have access to the hardware specification, the host perf
commits d6d0c7f681fd..7685665c390d provide a quick overview. The main
change is the addition of three MSRs equivalent to the Intel v2 ones,
namely global_ctrl, global_status and global_status_clear.

It is worth noting that this feature is very attractive for reducing PMU
virtualization overhead, since multiple MSR accesses to individual
counters are replaced by a single access to the global register, with an
additional accuracy gain when multiple guest counters are in use.
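
As a rough illustration (not code from this series) of where the savings
come from, a guest can gate all core PMCs with a single global MSR write
instead of touching each EVTSEL; the MSR names below are the ones added by
the host perf commits referenced above:

	/* Hypothetical guest-side sketch: quiesce and re-enable all PMCs.
	 * 'nr_counters' stands for the guest's enumerated counter count.
	 */
	wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_CTL, 0);	/* stop every counter */
	/* ... read or reprogram the individual PERF_CTRn MSRs ... */
	wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_CTL,
	       (1ULL << nr_counters) - 1);		/* re-enable them all */

Under virtualization, each wrmsrl() above replaces several per-counter
writes, which is where the VM-exit reduction comes from.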

All related testcases pass on a Genoa box.
Please feel free to run more tests, add more, or share comments.

Previous:
https://lore.kernel.org/kvm/[email protected]/

V2 -> V3 Changelog:
- Rename pmc_is_enabled(); (Jim)
- Move the reprogram_counters() changes as a separate patch; (Jim)
- Refactor to align with the other set_msr() helpers; (Sean)
- Fix the issue if userspace wants to expose v1 for whatever reason; (Sean)
- Add the feature flag X86_FEATURE_AMD_PMU_V2; (Sean)
- Check enable_pmu for Intel CPUID leaf 0xa as well; (Sean)
- Limit AMD pmu's KVM support to version 2 as well;
- Other nit changes around C code taste; (Sean)

Like Xu (7):
KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()
KVM: x86/pmu: Rewrite reprogram_counters() to improve performance
KVM: x86/pmu: Make part of the Intel v2 PMU MSRs handling x86 generic
KVM: x86/cpuid: Add X86_FEATURE_AMD_PMU_V2 as a KVM-only leaf entry
KVM: x86/svm/pmu: Add AMD PerfMonV2 support
KVM: x86/cpuid: Add AMD CPUID ExtPerfMonAndDbg leaf 0x80000022
KVM: x86/cpuid: Use fast return for cpuid "0xa" leaf when !enable_pmu

Sean Christopherson (1):
KVM: VMX: Refactor intel_pmu_set_msr() to align with other set_msr()
helpers

arch/x86/include/asm/kvm-x86-pmu-ops.h | 1 -
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/cpuid.c | 32 ++++++-
arch/x86/kvm/pmu.c | 65 +++++++++++++--
arch/x86/kvm/pmu.h | 28 ++++++-
arch/x86/kvm/reverse_cpuid.h | 10 +++
arch/x86/kvm/svm/pmu.c | 73 +++++++++++-----
arch/x86/kvm/svm/svm.c | 11 ++-
arch/x86/kvm/vmx/pmu_intel.c | 111 +++++++------------------
arch/x86/kvm/x86.c | 14 +++-
10 files changed, 228 insertions(+), 118 deletions(-)

--
2.38.1



2022-11-11 11:24:58

by Like Xu

[permalink] [raw]
Subject: [PATCH v3 1/8] KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()

From: Like Xu <[email protected]>

The name of the function pmc_is_enabled() is a bit misleading. A PMC can
be disabled either by PERF_GLOBAL_CTRL or by its corresponding EVTSEL.
Add the global semantics to its name.

Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: Like Xu <[email protected]>
---
arch/x86/include/asm/kvm-x86-pmu-ops.h | 2 +-
arch/x86/kvm/pmu.c | 6 +++---
arch/x86/kvm/pmu.h | 2 +-
arch/x86/kvm/svm/pmu.c | 2 +-
arch/x86/kvm/vmx/pmu_intel.c | 2 +-
5 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
index c17e3e96fc1d..86a3fb01e103 100644
--- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -13,7 +13,7 @@ BUILD_BUG_ON(1)
* at the call sites.
*/
KVM_X86_PMU_OP(hw_event_available)
-KVM_X86_PMU_OP(pmc_is_enabled)
+KVM_X86_PMU_OP(pmc_is_globally_enabled)
KVM_X86_PMU_OP(pmc_idx_to_pmc)
KVM_X86_PMU_OP(rdpmc_ecx_to_pmc)
KVM_X86_PMU_OP(msr_idx_to_pmc)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 684393c22105..e57f707fb940 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -83,7 +83,7 @@ void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
#undef __KVM_X86_PMU_OP
}

-static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
+static inline bool pmc_is_globally_enabled(struct kvm_pmc *pmc)
{
return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
}
@@ -306,7 +306,7 @@ static void reprogram_counter(struct kvm_pmc *pmc)

pmc_pause_counter(pmc);

- if (!pmc_speculative_in_use(pmc) || !pmc_is_enabled(pmc))
+ if (!pmc_speculative_in_use(pmc) || !pmc_is_globally_enabled(pmc))
goto reprogram_complete;

if (!check_pmu_event_filter(pmc))
@@ -581,7 +581,7 @@ void kvm_pmu_trigger_event(struct kvm_vcpu *vcpu, u64 perf_hw_id)
for_each_set_bit(i, pmu->all_valid_pmc_idx, X86_PMC_IDX_MAX) {
pmc = static_call(kvm_x86_pmu_pmc_idx_to_pmc)(pmu, i);

- if (!pmc || !pmc_is_enabled(pmc) || !pmc_speculative_in_use(pmc))
+ if (!pmc || !pmc_is_globally_enabled(pmc) || !pmc_speculative_in_use(pmc))
continue;

/* Ignore checks for edge detect, pin control, invert and CMASK bits */
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 85ff3c0588ba..2b5376ba66ea 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -26,7 +26,7 @@ struct kvm_event_hw_type_mapping {

struct kvm_pmu_ops {
bool (*hw_event_available)(struct kvm_pmc *pmc);
- bool (*pmc_is_enabled)(struct kvm_pmc *pmc);
+ bool (*pmc_is_globally_enabled)(struct kvm_pmc *pmc);
struct kvm_pmc *(*pmc_idx_to_pmc)(struct kvm_pmu *pmu, int pmc_idx);
struct kvm_pmc *(*rdpmc_ecx_to_pmc)(struct kvm_vcpu *vcpu,
unsigned int idx, u64 *mask);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 0e313fbae055..7958a983b760 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -218,7 +218,7 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)

struct kvm_pmu_ops amd_pmu_ops __initdata = {
.hw_event_available = amd_hw_event_available,
- .pmc_is_enabled = amd_pmc_is_enabled,
+ .pmc_is_globally_enabled = amd_pmc_is_enabled,
.pmc_idx_to_pmc = amd_pmc_idx_to_pmc,
.rdpmc_ecx_to_pmc = amd_rdpmc_ecx_to_pmc,
.msr_idx_to_pmc = amd_msr_idx_to_pmc,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index e5cec07ca8d9..f81cf54a245f 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -797,7 +797,7 @@ void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu)

struct kvm_pmu_ops intel_pmu_ops __initdata = {
.hw_event_available = intel_hw_event_available,
- .pmc_is_enabled = intel_pmc_is_enabled,
+ .pmc_is_globally_enabled = intel_pmc_is_enabled,
.pmc_idx_to_pmc = intel_pmc_idx_to_pmc,
.rdpmc_ecx_to_pmc = intel_rdpmc_ecx_to_pmc,
.msr_idx_to_pmc = intel_msr_idx_to_pmc,
--
2.38.1


2022-11-11 11:37:18

by Like Xu

[permalink] [raw]
Subject: [PATCH v3 6/8] KVM: x86/svm/pmu: Add AMD PerfMonV2 support

From: Like Xu <[email protected]>

If AMD Performance Monitoring Version 2 (PerfMonV2) is detected by
the guest, it can use a new scheme to manage the Core PMCs using the
new global control and status registers.

In addition to benefiting from the PerfMonV2 functionality in the same
way as the host (higher precision), the guest can also reduce the number
of VM-exits by lowering the total number of MSR accesses.

In terms of implementation details, amd_is_valid_msr() is resurrected
since the three newly added MSRs do not map to a single vPMC. Emulating
PerfMonV2 on hardware without the native capability has also been ruled
out, for accuracy reasons.
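
As a hedged illustration of the guest-visible flow (not code from this
patch), overflow handling under PerfMonV2 boils down to one read of the
global status register and one write to the status-clear register:

	/* Hypothetical guest PMI-handler sketch using the PerfMonV2 MSRs. */
	u64 status;

	rdmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS, status);	/* which PMCs overflowed */
	/* ... handle each counter whose bit is set in 'status' ... */
	wrmsrl(MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, status);	/* ack in one write */

On the KVM side, only these three global MSRs need to be emulated on top
of the existing PERFCTR_CORE counter MSRs.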

Co-developed-by: Sandipan Das <[email protected]>
Signed-off-by: Sandipan Das <[email protected]>
Signed-off-by: Like Xu <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/pmu.c | 6 ++++
arch/x86/kvm/svm/pmu.c | 64 +++++++++++++++++++++++++++------
arch/x86/kvm/x86.c | 14 ++++++--
4 files changed, 72 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 81114a376c4e..d02990fcd46f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -512,6 +512,7 @@ struct kvm_pmc {
#define MSR_ARCH_PERFMON_EVENTSEL_MAX (MSR_ARCH_PERFMON_EVENTSEL0 + KVM_INTEL_PMC_MAX_GENERIC - 1)
#define KVM_PMC_MAX_FIXED 3
#define KVM_AMD_PMC_MAX_GENERIC 6
+#define MSR_F15H_PERF_MSR_MAX (MSR_F15H_PERF_CTR0 + 2 * (KVM_AMD_PMC_MAX_GENERIC - 1))
struct kvm_pmu {
unsigned nr_arch_gp_counters;
unsigned nr_arch_fixed_counters;
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index a3726af5416d..c70ff57ee44c 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -471,12 +471,15 @@ int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)

switch (msr) {
case MSR_CORE_PERF_GLOBAL_STATUS:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
msr_info->data = pmu->global_status;
return 0;
case MSR_CORE_PERF_GLOBAL_CTRL:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
msr_info->data = pmu->global_ctrl;
return 0;
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
msr_info->data = 0;
return 0;
default:
@@ -495,12 +498,14 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)

switch (msr) {
case MSR_CORE_PERF_GLOBAL_STATUS:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
if (!msr_info->host_initiated)
return 1; /* RO MSR */

pmu->global_status = data;
return 0;
case MSR_CORE_PERF_GLOBAL_CTRL:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
if (!kvm_valid_perf_global_ctrl(pmu, data))
return 1;

@@ -511,6 +516,7 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
}
return 0;
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
if (data & pmu->global_ovf_ctrl_mask)
return 1;

diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 4e7d7e6cccec..e58f39f8f10b 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -92,12 +92,6 @@ static struct kvm_pmc *amd_rdpmc_ecx_to_pmc(struct kvm_vcpu *vcpu,
return amd_pmc_idx_to_pmc(vcpu_to_pmu(vcpu), idx & ~(3u << 30));
}

-static bool amd_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
-{
- /* All MSRs refer to exactly one PMC, so msr_idx_to_pmc is enough. */
- return false;
-}
-
static struct kvm_pmc *amd_msr_idx_to_pmc(struct kvm_vcpu *vcpu, u32 msr)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -109,6 +103,29 @@ static struct kvm_pmc *amd_msr_idx_to_pmc(struct kvm_vcpu *vcpu, u32 msr)
return pmc;
}

+static bool amd_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
+{
+ struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+ switch (msr) {
+ case MSR_K7_EVNTSEL0 ... MSR_K7_PERFCTR3:
+ return pmu->version > 0;
+ case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
+ return guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE);
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
+ return pmu->version > 1;
+ default:
+ if (msr > MSR_F15H_PERF_CTR5 &&
+ msr < MSR_F15H_PERF_CTL0 + 2 * pmu->nr_arch_gp_counters)
+ return pmu->version > 1;
+ break;
+ }
+
+ return amd_msr_idx_to_pmc(vcpu, msr);
+}
+
static int amd_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -162,20 +179,42 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+ struct kvm_cpuid_entry2 *entry;
+ union cpuid_0x80000022_ebx ebx;

- if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
- pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
+ pmu->version = 1;
+ if (kvm_cpu_cap_has(X86_FEATURE_AMD_PMU_V2) &&
+ guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2)) {
+ pmu->version = 2;
+ entry = kvm_find_cpuid_entry_index(vcpu, 0x80000022, 0);
+ ebx.full = entry->ebx;
+ pmu->nr_arch_gp_counters = min3((unsigned int)ebx.split.num_core_pmc,
+ (unsigned int)kvm_pmu_cap.num_counters_gp,
+ (unsigned int)KVM_AMD_PMC_MAX_GENERIC);
+ }
+
+ /* Commitment to minimal PMCs, regardless of CPUID.80000022 */
+ if (kvm_cpu_cap_has(X86_FEATURE_PERFCTR_CORE) &&
+ guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
+ pmu->nr_arch_gp_counters = max_t(unsigned int,
+ pmu->nr_arch_gp_counters,
+ AMD64_NUM_COUNTERS_CORE);
else
- pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
+ pmu->nr_arch_gp_counters = max_t(unsigned int,
+ pmu->nr_arch_gp_counters,
+ AMD64_NUM_COUNTERS);
+
+ if (pmu->version > 1) {
+ pmu->global_ctrl_mask = ~((1ull << pmu->nr_arch_gp_counters) - 1);
+ pmu->global_ovf_ctrl_mask = pmu->global_ctrl_mask;
+ }

pmu->counter_bitmask[KVM_PMC_GP] = ((u64)1 << 48) - 1;
pmu->reserved_bits = 0xfffffff000280000ull;
pmu->raw_event_mask = AMD64_RAW_EVENT_MASK;
- pmu->version = 1;
/* not applicable to AMD; but clean them to prevent any fall out */
pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
pmu->nr_arch_fixed_counters = 0;
- pmu->global_status = 0;
bitmap_set(pmu->all_valid_pmc_idx, 0, pmu->nr_arch_gp_counters);
}

@@ -186,6 +225,7 @@ static void amd_pmu_init(struct kvm_vcpu *vcpu)

BUILD_BUG_ON(KVM_AMD_PMC_MAX_GENERIC > AMD64_NUM_COUNTERS_CORE);
BUILD_BUG_ON(KVM_AMD_PMC_MAX_GENERIC > INTEL_PMC_MAX_GENERIC);
+ BUILD_BUG_ON(KVM_AMD_PMC_MAX_GENERIC < 1);

for (i = 0; i < KVM_AMD_PMC_MAX_GENERIC ; i++) {
pmu->gp_counters[i].type = KVM_PMC_GP;
@@ -206,6 +246,8 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
pmc_stop_counter(pmc);
pmc->counter = pmc->prev_counter = pmc->eventsel = 0;
}
+
+ pmu->global_ctrl = pmu->global_status = 0;
}

struct kvm_pmu_ops amd_pmu_ops __initdata = {
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index e46e458c5b08..99bc47f1a40e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1458,6 +1458,10 @@ static const u32 msrs_to_save_all[] = {
MSR_F15H_PERF_CTR0, MSR_F15H_PERF_CTR1, MSR_F15H_PERF_CTR2,
MSR_F15H_PERF_CTR3, MSR_F15H_PERF_CTR4, MSR_F15H_PERF_CTR5,

+ MSR_AMD64_PERF_CNTR_GLOBAL_CTL,
+ MSR_AMD64_PERF_CNTR_GLOBAL_STATUS,
+ MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR,
+
MSR_IA32_XFD, MSR_IA32_XFD_ERR,
};

@@ -3859,7 +3863,10 @@ int kvm_set_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_PEBS_ENABLE:
case MSR_IA32_DS_AREA:
case MSR_PEBS_DATA_CFG:
- case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
+ case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_MSR_MAX:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
if (kvm_pmu_is_valid_msr(vcpu, msr))
return kvm_pmu_set_msr(vcpu, msr_info);
/*
@@ -3962,7 +3969,10 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_PEBS_ENABLE:
case MSR_IA32_DS_AREA:
case MSR_PEBS_DATA_CFG:
- case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_CTR5:
+ case MSR_F15H_PERF_CTL0 ... MSR_F15H_PERF_MSR_MAX:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_CTL:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS:
+ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR:
if (kvm_pmu_is_valid_msr(vcpu, msr_info->index))
return kvm_pmu_get_msr(vcpu, msr_info);
/*
--
2.38.1


2022-11-11 11:37:27

by Like Xu

[permalink] [raw]
Subject: [PATCH v3 4/8] KVM: x86/pmu: Make part of the Intel v2 PMU MSRs handling x86 generic

From: Like Xu <[email protected]>

AMD PerfMonV2 defines three registers similar to part of the Intel v2
PMU registers, namely the GLOBAL_CTRL, GLOBAL_STATUS and GLOBAL_OVF_CTRL
MSRs. For better code reuse, extract this part of the handling and make
it generic x86 code, as a straight code movement.

Specifically, move the kvm_pmu_set/get_msr() handlers for the
Intel-defined GLOBAL_STATUS, GLOBAL_CTRL and GLOBAL_OVF_CTRL MSRs to the
generic pmu.c and remove the .pmc_is_globally_enabled callback, which
paves the way for introducing the AMD PerfMonV2 code later.

The new unprefixed pmc_is_globally_enabled() works correctly because the
legacy AMD vPMU reports version 1. Note that the vendor-specific
*_is_valid_msr() callbacks continue to be used to avoid cross-vendor MSR
access.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/include/asm/kvm-x86-pmu-ops.h | 1 -
arch/x86/kvm/pmu.c | 55 +++++++++++++++++++++++---
arch/x86/kvm/pmu.h | 17 +++++++-
arch/x86/kvm/svm/pmu.c | 9 -----
arch/x86/kvm/vmx/pmu_intel.c | 46 +--------------------
5 files changed, 67 insertions(+), 61 deletions(-)

diff --git a/arch/x86/include/asm/kvm-x86-pmu-ops.h b/arch/x86/include/asm/kvm-x86-pmu-ops.h
index 86a3fb01e103..6c98f4bb4228 100644
--- a/arch/x86/include/asm/kvm-x86-pmu-ops.h
+++ b/arch/x86/include/asm/kvm-x86-pmu-ops.h
@@ -13,7 +13,6 @@ BUILD_BUG_ON(1)
* at the call sites.
*/
KVM_X86_PMU_OP(hw_event_available)
-KVM_X86_PMU_OP(pmc_is_globally_enabled)
KVM_X86_PMU_OP(pmc_idx_to_pmc)
KVM_X86_PMU_OP(rdpmc_ecx_to_pmc)
KVM_X86_PMU_OP(msr_idx_to_pmc)
diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index e57f707fb940..a3726af5416d 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -83,11 +83,6 @@ void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
#undef __KVM_X86_PMU_OP
}

-static inline bool pmc_is_globally_enabled(struct kvm_pmc *pmc)
-{
- return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
-}
-
static void kvm_pmi_trigger_fn(struct irq_work *irq_work)
{
struct kvm_pmu *pmu = container_of(irq_work, struct kvm_pmu, irq_work);
@@ -471,11 +466,61 @@ static void kvm_pmu_mark_pmc_in_use(struct kvm_vcpu *vcpu, u32 msr)

int kvm_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
+ struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+ u32 msr = msr_info->index;
+
+ switch (msr) {
+ case MSR_CORE_PERF_GLOBAL_STATUS:
+ msr_info->data = pmu->global_status;
+ return 0;
+ case MSR_CORE_PERF_GLOBAL_CTRL:
+ msr_info->data = pmu->global_ctrl;
+ return 0;
+ case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+ msr_info->data = 0;
+ return 0;
+ default:
+ break;
+ }
+
return static_call(kvm_x86_pmu_get_msr)(vcpu, msr_info);
}

int kvm_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
+ struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+ u32 msr = msr_info->index;
+ u64 data = msr_info->data;
+ u64 diff;
+
+ switch (msr) {
+ case MSR_CORE_PERF_GLOBAL_STATUS:
+ if (!msr_info->host_initiated)
+ return 1; /* RO MSR */
+
+ pmu->global_status = data;
+ return 0;
+ case MSR_CORE_PERF_GLOBAL_CTRL:
+ if (!kvm_valid_perf_global_ctrl(pmu, data))
+ return 1;
+
+ if (pmu->global_ctrl != data) {
+ diff = pmu->global_ctrl ^ data;
+ pmu->global_ctrl = data;
+ reprogram_counters(pmu, diff);
+ }
+ return 0;
+ case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
+ if (data & pmu->global_ovf_ctrl_mask)
+ return 1;
+
+ if (!msr_info->host_initiated)
+ pmu->global_status &= ~data;
+ return 0;
+ default:
+ break;
+ }
+
kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index);
return static_call(kvm_x86_pmu_set_msr)(vcpu, msr_info);
}
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index be552c8217a0..8739e5ea2835 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -26,7 +26,6 @@ struct kvm_event_hw_type_mapping {

struct kvm_pmu_ops {
bool (*hw_event_available)(struct kvm_pmc *pmc);
- bool (*pmc_is_globally_enabled)(struct kvm_pmc *pmc);
struct kvm_pmc *(*pmc_idx_to_pmc)(struct kvm_pmu *pmu, int pmc_idx);
struct kvm_pmc *(*rdpmc_ecx_to_pmc)(struct kvm_vcpu *vcpu,
unsigned int idx, u64 *mask);
@@ -200,6 +199,22 @@ static inline void reprogram_counters(struct kvm_pmu *pmu, u64 diff)
}
}

+/*
+ * Check if a PMC is enabled by comparing it against global_ctrl bits.
+ *
+ * If the current version of vPMU doesn't have global_ctrl MSR,
+ * all vPMCs are enabled (return TRUE).
+ */
+static inline bool pmc_is_globally_enabled(struct kvm_pmc *pmc)
+{
+ struct kvm_pmu *pmu = pmc_to_pmu(pmc);
+
+ if (pmu->version < 2)
+ return true;
+
+ return test_bit(pmc->idx, (unsigned long *)&pmu->global_ctrl);
+}
+
void kvm_pmu_deliver_pmi(struct kvm_vcpu *vcpu);
void kvm_pmu_handle_event(struct kvm_vcpu *vcpu);
int kvm_pmu_rdpmc(struct kvm_vcpu *vcpu, unsigned pmc, u64 *data);
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 7958a983b760..4e7d7e6cccec 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -76,14 +76,6 @@ static bool amd_hw_event_available(struct kvm_pmc *pmc)
return true;
}

-/* check if a PMC is enabled by comparing it against global_ctrl bits. Because
- * AMD CPU doesn't have global_ctrl MSR, all PMCs are enabled (return TRUE).
- */
-static bool amd_pmc_is_enabled(struct kvm_pmc *pmc)
-{
- return true;
-}
-
static bool amd_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -218,7 +210,6 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)

struct kvm_pmu_ops amd_pmu_ops __initdata = {
.hw_event_available = amd_hw_event_available,
- .pmc_is_globally_enabled = amd_pmc_is_enabled,
.pmc_idx_to_pmc = amd_pmc_idx_to_pmc,
.rdpmc_ecx_to_pmc = amd_rdpmc_ecx_to_pmc,
.msr_idx_to_pmc = amd_msr_idx_to_pmc,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index db704eea2d7c..f95f8d1db2cf 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -90,17 +90,6 @@ static bool intel_hw_event_available(struct kvm_pmc *pmc)
return true;
}

-/* check if a PMC is enabled by comparing it with globl_ctrl bits. */
-static bool intel_pmc_is_enabled(struct kvm_pmc *pmc)
-{
- struct kvm_pmu *pmu = pmc_to_pmu(pmc);
-
- if (!intel_pmu_has_perf_global_ctrl(pmu))
- return true;
-
- return test_bit(pmc->idx, (unsigned long *)&pmu->global_ctrl);
-}
-
static bool intel_is_valid_rdpmc_ecx(struct kvm_vcpu *vcpu, unsigned int idx)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -335,15 +324,6 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_CORE_PERF_FIXED_CTR_CTRL:
msr_info->data = pmu->fixed_ctr_ctrl;
return 0;
- case MSR_CORE_PERF_GLOBAL_STATUS:
- msr_info->data = pmu->global_status;
- return 0;
- case MSR_CORE_PERF_GLOBAL_CTRL:
- msr_info->data = pmu->global_ctrl;
- return 0;
- case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
- msr_info->data = 0;
- return 0;
case MSR_IA32_PEBS_ENABLE:
msr_info->data = pmu->pebs_enable;
return 0;
@@ -391,29 +371,6 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (pmu->fixed_ctr_ctrl != data)
reprogram_fixed_counters(pmu, data);
break;
- case MSR_CORE_PERF_GLOBAL_STATUS:
- if (!msr_info->host_initiated)
- return 1; /* RO MSR */
-
- pmu->global_status = data;
- break;
- case MSR_CORE_PERF_GLOBAL_CTRL:
- if (!kvm_valid_perf_global_ctrl(pmu, data))
- return 1;
-
- if (pmu->global_ctrl != data) {
- diff = pmu->global_ctrl ^ data;
- pmu->global_ctrl = data;
- reprogram_counters(pmu, diff);
- }
- break;
- case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
- if (data & pmu->global_ovf_ctrl_mask)
- return 1;
-
- if (!msr_info->host_initiated)
- pmu->global_status &= ~data;
- break;
case MSR_IA32_PEBS_ENABLE:
if (data & pmu->pebs_enable_mask)
return 1;
@@ -773,7 +730,7 @@ void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu)
pmc = intel_pmc_idx_to_pmc(pmu, bit);

if (!pmc || !pmc_speculative_in_use(pmc) ||
- !intel_pmc_is_enabled(pmc) || !pmc->perf_event)
+ !pmc_is_globally_enabled(pmc) || !pmc->perf_event)
continue;

/*
@@ -788,7 +745,6 @@ void intel_pmu_cross_mapped_check(struct kvm_pmu *pmu)

struct kvm_pmu_ops intel_pmu_ops __initdata = {
.hw_event_available = intel_hw_event_available,
- .pmc_is_globally_enabled = intel_pmc_is_enabled,
.pmc_idx_to_pmc = intel_pmc_idx_to_pmc,
.rdpmc_ecx_to_pmc = intel_rdpmc_ecx_to_pmc,
.msr_idx_to_pmc = intel_msr_idx_to_pmc,
--
2.38.1


2022-12-06 09:45:48

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v3 0/8] KVM: x86: Add AMD Guest PerfMonV2 PMU support

Any comments to move this forward?

On 11/11/2022 6:26 pm, Like Xu wrote:
> Starting with Zen4, the core PMU on AMD platforms such as Genoa and
> Ryzen 7000 supports PerfMonV2, while remaining compatible with the
> legacy PERFCTR_CORE behavior and MSR addresses.
>
> If you don't have access to the hardware specification, the host perf
> commits d6d0c7f681fd..7685665c390d provide a quick overview. The main
> change is the addition of three MSRs equivalent to the Intel v2 ones,
> namely global_ctrl, global_status and global_status_clear.
>
> It is worth noting that this feature is very attractive for reducing PMU
> virtualization overhead, since multiple MSR accesses to individual
> counters are replaced by a single access to the global register, with an
> additional accuracy gain when multiple guest counters are in use.
>
> All related testcases pass on a Genoa box.
> Please feel free to run more tests, add more, or share comments.
>
> Previous:
> https://lore.kernel.org/kvm/[email protected]/
>
> V2 -> V3 Changelog:
> - Rename pmc_is_enabled(); (Jim)
> - Move the reprogram_counters() changes as a separate patch; (Jim)
> - Refactor to align with the other set_msr() helpers; (Sean)
> - Fix the issue if userspace wants to expose v1 for whatever reason; (Sean)
> - Add the feature flag X86_FEATURE_AMD_PMU_V2; (Sean)
> - Check enable_pmu for Intel CPUID leaf 0xa as well; (Sean)
> - Limit AMD pmu's KVM support to version 2 as well;
> - Other nit changes around C code taste; (Sean)
>
> Like Xu (7):
> KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()
> KVM: x86/pmu: Rewrite reprogram_counters() to improve performance
> KVM: x86/pmu: Make part of the Intel v2 PMU MSRs handling x86 generic
> KVM: x86/cpuid: Add X86_FEATURE_AMD_PMU_V2 as a KVM-only leaf entry
> KVM: x86/svm/pmu: Add AMD PerfMonV2 support
> KVM: x86/cpuid: Add AMD CPUID ExtPerfMonAndDbg leaf 0x80000022
> KVM: x86/cpuid: Use fast return for cpuid "0xa" leaf when !enable_pmu
>
> Sean Christopherson (1):
> KVM: VMX: Refactor intel_pmu_set_msr() to align with other set_msr()
> helpers
>
> arch/x86/include/asm/kvm-x86-pmu-ops.h | 1 -
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/kvm/cpuid.c | 32 ++++++-
> arch/x86/kvm/pmu.c | 65 +++++++++++++--
> arch/x86/kvm/pmu.h | 28 ++++++-
> arch/x86/kvm/reverse_cpuid.h | 10 +++
> arch/x86/kvm/svm/pmu.c | 73 +++++++++++-----
> arch/x86/kvm/svm/svm.c | 11 ++-
> arch/x86/kvm/vmx/pmu_intel.c | 111 +++++++------------------
> arch/x86/kvm/x86.c | 14 +++-
> 10 files changed, 228 insertions(+), 118 deletions(-)
>

2023-01-20 01:29:51

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 0/8] KVM: x86: Add AMD Guest PerfMonV2 PMU support

On Fri, Nov 11, 2022, Like Xu wrote:
> Starting with Zen4, the core PMU on AMD platforms such as Genoa and
> Ryzen 7000 supports PerfMonV2, while remaining compatible with the
> legacy PERFCTR_CORE behavior and MSR addresses.
>
> If you don't have access to the hardware specification, the host perf
> commits d6d0c7f681fd..7685665c390d provide a quick overview. The main
> change is the addition of three MSRs equivalent to the Intel v2 ones,
> namely global_ctrl, global_status and global_status_clear.
>
> It is worth noting that this feature is very attractive for reducing PMU
> virtualization overhead, since multiple MSR accesses to individual
> counters are replaced by a single access to the global register, with an
> additional accuracy gain when multiple guest counters are in use.

Some minor nits, though I haven't looked at the meat of the series yet. I'll
give this a thorough review early next week (unless I'm extra ambitious tomorrow).

2023-01-25 00:10:20

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 6/8] KVM: x86/svm/pmu: Add AMD PerfMonV2 support

On Fri, Nov 11, 2022, Like Xu wrote:
> @@ -162,20 +179,42 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
> static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
> {
> struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
> + struct kvm_cpuid_entry2 *entry;
> + union cpuid_0x80000022_ebx ebx;
>
> - if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
> - pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
> + pmu->version = 1;
> + if (kvm_cpu_cap_has(X86_FEATURE_AMD_PMU_V2) &&

Why check kvm_cpu_cap support? I.e. what will go wrong if userspace enumerates
PMU v2 to the guest without proper hardware/KVM support?

If this is _necessary_ to protect the host kernel, then we should probably have
a helper to query PMU features, e.g.

static __always_inline bool guest_pmu_has(struct kvm_vcpu *vcpu,
unsigned int x86_feature)
{
return kvm_cpu_cap_has(x86_feature) &&
guest_cpuid_has(vcpu, x86_feature);
}
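
For illustration, and assuming the helper above, the refresh path could
then collapse the double check into a single call, e.g.:

	if (guest_pmu_has(vcpu, X86_FEATURE_AMD_PMU_V2))
		pmu->version = 2;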



> + guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2)) {
> + pmu->version = 2;
> + entry = kvm_find_cpuid_entry_index(vcpu, 0x80000022, 0);
> + ebx.full = entry->ebx;
> + pmu->nr_arch_gp_counters = min3((unsigned int)ebx.split.num_core_pmc,
> + (unsigned int)kvm_pmu_cap.num_counters_gp,
> + (unsigned int)KVM_AMD_PMC_MAX_GENERIC);

Blech. This really shouldn't be necessary, KVM should tweak kvm_pmu_cap.num_counters_gp
as needed during initialization to ensure num_counters_gp doesn't exceed KVM's
internal limits.

Posted a patch[*], please take a look. As mentioned in that thread, I'll somewhat
speculatively apply that series sooner than later so that you can use it a base
for this series (assuming the patch isn't busted).

[*] https://lore.kernel.org/all/[email protected]
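
A rough sketch of the kind of init-time clamping being described
(hypothetical placement and limit name; the linked patch is authoritative):

	/* Hypothetical: cap the perf-reported count at KVM's internal limit. */
	kvm_pmu_cap.num_counters_gp = min(kvm_pmu_cap.num_counters_gp,
					  KVM_AMD_PMC_MAX_GENERIC);

so that vendor code can consume kvm_pmu_cap.num_counters_gp directly
without re-clamping in amd_pmu_refresh().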

> + }
> +
> + /* Commitment to minimal PMCs, regardless of CPUID.80000022 */

Please expand this comment. I'm still not entirely sure I've interpreted it correctly,
and I'm not sure that I agree with the code.

> + if (kvm_cpu_cap_has(X86_FEATURE_PERFCTR_CORE) &&

AFAICT, checking kvm_cpu_cap_has() is an unrelated change. Either it's a bug fix
and belongs in a separate patch, or it's unnecessary and should be dropped.

> + guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
> + pmu->nr_arch_gp_counters = max_t(unsigned int,
> + pmu->nr_arch_gp_counters,
> + AMD64_NUM_COUNTERS_CORE);
> else
> - pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
> + pmu->nr_arch_gp_counters = max_t(unsigned int,
> + pmu->nr_arch_gp_counters,
> + AMD64_NUM_COUNTERS);

Using max() doesn't look right. E.g. if KVM ends up running on some odd setup
where ebx.split.num_core_pmc/kvm_pmu_cap.num_counters_gp is less than
AMD64_NUM_COUNTERS_CORE or AMD64_NUM_COUNTERS.

Or more likely, if userspace says "only expose N counters to this guest".

Shouldn't this be something like?

if (guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2))
pmu->nr_arch_gp_counters = min(ebx.split.num_core_pmc,
kvm_pmu_cap.num_counters_gp);
else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
else
pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;

2023-01-27 02:06:06

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 1/8] KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()

On Fri, Nov 11, 2022, Like Xu wrote:
> From: Like Xu <[email protected]>
>
> The name of function pmc_is_enabled() is a bit misleading. A PMC can
> be disabled either by PERF_CLOBAL_CTRL or by its corresponding EVTSEL.
> Add the global semantic to its name.
>
> Suggested-by: Jim Mattson <[email protected]>
> Signed-off-by: Like Xu <[email protected]>
> ---

...

> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
> index 684393c22105..e57f707fb940 100644
> --- a/arch/x86/kvm/pmu.c
> +++ b/arch/x86/kvm/pmu.c
> @@ -83,7 +83,7 @@ void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
> #undef __KVM_X86_PMU_OP
> }
>
> -static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
> +static inline bool pmc_is_globally_enabled(struct kvm_pmc *pmc)
> {
> return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);

This doesn't compile. v3, and I'm getting pings, and the very first patch doesn't
compile.

2023-02-02 11:47:26

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v3 1/8] KVM: x86/pmu: Rename pmc_is_enabled() to pmc_is_globally_enabled()

On 27/1/2023 10:03 am, Sean Christopherson wrote:
> On Fri, Nov 11, 2022, Like Xu wrote:
>> From: Like Xu<[email protected]>
>>
>> The name of function pmc_is_enabled() is a bit misleading. A PMC can
>> be disabled either by PERF_CLOBAL_CTRL or by its corresponding EVTSEL.
>> Add the global semantic to its name.
>>
>> Suggested-by: Jim Mattson<[email protected]>
>> Signed-off-by: Like Xu<[email protected]>
>> ---
> ...
>
>> diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
>> index 684393c22105..e57f707fb940 100644
>> --- a/arch/x86/kvm/pmu.c
>> +++ b/arch/x86/kvm/pmu.c
>> @@ -83,7 +83,7 @@ void kvm_pmu_ops_update(const struct kvm_pmu_ops *pmu_ops)
>> #undef __KVM_X86_PMU_OP
>> }
>>
>> -static inline bool pmc_is_enabled(struct kvm_pmc *pmc)
>> +static inline bool pmc_is_globally_enabled(struct kvm_pmc *pmc)
>> {
>> return static_call(kvm_x86_pmu_pmc_is_enabled)(pmc);
> This doesn't compile. v3, and I'm getting pings, and the very first patch doesn't
> compile.
>

Oops, very sorry for breaking git-bisect; it's my fault for splitting the
code diff incorrectly (it only compiles once the 4th patch is applied).
I will tighten up the process before sending any more patches to you.
Thank you for taking the time to review the rest of the patches in detail
as you always do. The new version is under construction, apologies again.

2023-02-06 07:54:00

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v3 6/8] KVM: x86/svm/pmu: Add AMD PerfMonV2 support

On 25/1/2023 8:10 am, Sean Christopherson wrote:
> On Fri, Nov 11, 2022, Like Xu wrote:
>> @@ -162,20 +179,42 @@ static int amd_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
>> static void amd_pmu_refresh(struct kvm_vcpu *vcpu)
>> {
>> struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
>> + struct kvm_cpuid_entry2 *entry;
>> + union cpuid_0x80000022_ebx ebx;
>>
>> - if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
>> - pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
>> + pmu->version = 1;
>> + if (kvm_cpu_cap_has(X86_FEATURE_AMD_PMU_V2) &&
>
> Why check kvm_cpu_cap support? I.e. what will go wrong if userspace enumerates
> PMU v2 to the guest without proper hardware/KVM support?
>
> If this is _necessary_ to protect the host kernel, then we should probably have
> a helper to query PMU features, e.g.
>
> static __always_inline bool guest_pmu_has(struct kvm_vcpu *vcpu,
> unsigned int x86_feature)
> {
> return kvm_cpu_cap_has(x86_feature) &&
> guest_cpuid_has(vcpu, x86_feature);
> }
>
>
>
>> + guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2)) {
>> + pmu->version = 2;
>> + entry = kvm_find_cpuid_entry_index(vcpu, 0x80000022, 0);
>> + ebx.full = entry->ebx;
>> + pmu->nr_arch_gp_counters = min3((unsigned int)ebx.split.num_core_pmc,
>> + (unsigned int)kvm_pmu_cap.num_counters_gp,
>> + (unsigned int)KVM_AMD_PMC_MAX_GENERIC);
>
> Blech. This really shouldn't be necessary, KVM should tweak kvm_pmu_cap.num_counters_gp
> as needed during initialization to ensure num_counters_gp doesn't exceed KVM's
> internal limits.
>
> Posted a patch[*], please take a look. As mentioned in that thread, I'll somewhat
> speculatively apply that series sooner than later so that you can use it a base
> for this series (assuming the patch isn't busted).
>
> [*] https://lore.kernel.org/all/[email protected]
>
>> + }
>> +
>> + /* Commitment to minimal PMCs, regardless of CPUID.80000022 */
>
> Please expand this comment. I'm still not entirely sure I've interpreted it correctly,
> and I'm not sure that I agree with the code.

In the first version [1], I used almost the same if-elif-else sequence,
but this concern from Jim [2] changed my mind:

"Nonetheless, for compatibility with old software, Fn8000_0022_EBX can never
report less than four counters (or six, if Fn8000_0001_ECX[PerfCtrExtCore] is set)."

Both in amd_pmu_refresh() and in __do_cpuid_func(), KVM implements this
with an override approach: first apply the AMD_PMU_V2 semantics and then
enforce a minimum number of counters based on whether or not the guest
has PERFCTR_CORE (see the sketch after the references below); the
proposed if-elif-else does not fulfill this need.

[1] [email protected]/
[2] CALMp9eQObuiJGV=YrAU9Fw+KoXfJtZMJ-KUs-qCOVd+R9zGBpw@mail.gmail.com
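
A rough sketch of that override-then-floor ordering (details hedged; the
actual patch above is authoritative):

	/* 1) Start from the PerfMonV2 enumeration, if the guest has it. */
	if (guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2))
		pmu->nr_arch_gp_counters = min_t(unsigned int,
						 ebx.split.num_core_pmc,
						 kvm_pmu_cap.num_counters_gp);

	/* 2) Then enforce the legacy floor expected by old software. */
	if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
		pmu->nr_arch_gp_counters = max_t(unsigned int,
						 pmu->nr_arch_gp_counters,
						 AMD64_NUM_COUNTERS_CORE);
	else
		pmu->nr_arch_gp_counters = max_t(unsigned int,
						 pmu->nr_arch_gp_counters,
						 AMD64_NUM_COUNTERS);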

>
>> + if (kvm_cpu_cap_has(X86_FEATURE_PERFCTR_CORE) &&
>
> AFAICT, checking kvm_cpu_cap_has() is an unrelated change. Either it's a bug fix
> and belongs in a separate patch, or it's unnecessary and should be dropped.
>
>> + guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
>> + pmu->nr_arch_gp_counters = max_t(unsigned int,
>> + pmu->nr_arch_gp_counters,
>> + AMD64_NUM_COUNTERS_CORE);
>> else
>> - pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
>> + pmu->nr_arch_gp_counters = max_t(unsigned int,
>> + pmu->nr_arch_gp_counters,
>> + AMD64_NUM_COUNTERS);
>
> Using max() doesn't look right. E.g. if KVM ends up running on some odd setup
> where ebx.split.num_core_pmc/kvm_pmu_cap.num_counters_gp is less than
> AMD64_NUM_COUNTERS_CORE or AMD64_NUM_COUNTERS.
>
> Or more likely, if userspace says "only expose N counters to this guest".
>
> Shouldn't this be something like?
>
> if (guest_cpuid_has(vcpu, X86_FEATURE_AMD_PMU_V2))
> pmu->nr_arch_gp_counters = min(ebx.split.num_core_pmc,
> kvm_pmu_cap.num_counters_gp);
> else if (guest_cpuid_has(vcpu, X86_FEATURE_PERFCTR_CORE))
> pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS_CORE;
> else
> pmu->nr_arch_gp_counters = AMD64_NUM_COUNTERS;
>

2023-02-06 22:22:18

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 6/8] KVM: x86/svm/pmu: Add AMD PerfMonV2 support

On Mon, Feb 06, 2023, Like Xu wrote:
> On 25/1/2023 8:10 am, Sean Christopherson wrote:
> > > + }
> > > +
> > > + /* Commitment to minimal PMCs, regardless of CPUID.80000022 */
> >
> > Please expand this comment. I'm still not entirely sure I've interpreted it correctly,
> > and I'm not sure that I agree with the code.
>
> In the first version [1], I used almost the same if-elif-else sequence
> but the concerns from JimM[2] has changed my mind:
>
> "Nonetheless, for compatibility with old software, Fn8000_0022_EBX can never
> report less than four counters (or six, if Fn8000_0001_ECX[PerfCtrExtCore] is set)."
>
> Both in amd_pmu_refresh() and in __do_cpuid_func(), KVM implements
> this using the override approach of first applying the semantics of
> AMD_PMU_V2 and then implementing a minimum number of counters
> supported based on whether or not the guest has PERFCTR_CORE,
> the proposed if-elif-else does not fulfill this need.

Jim's comments were in the context of __do_cpuid_func(), i.e. KVM_GET_SUPPORTED_CPUID.
As far as guest CPUID is concerned, that's userspace's responsibility to get correct.

And for KVM_GET_SUPPORTED_CPUID, overriding kvm_pmu_cap.num_counters_gp is not
the correct approach. KVM should sanity check the number of counters enumerated
by perf and explicitly disable vPMU support if the min isn't met. E.g. if KVM
needs 6 counters and perf says there are 4, then something is wrong and enumerating
6 to the guest is only going to cause more problems.
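
Something along these lines, with the exact hook and minimum hypothetical:

	/* Hypothetical init-time sanity check: refuse vPMU rather than over-report. */
	if (enable_pmu &&
	    kvm_pmu_cap.num_counters_gp < AMD64_NUM_COUNTERS) {
		pr_warn("perf enumerates too few counters, disabling vPMU\n");
		enable_pmu = false;
	}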

> [1] [email protected]/
> [2] CALMp9eQObuiJGV=YrAU9Fw+KoXfJtZMJ-KUs-qCOVd+R9zGBpw@mail.gmail.com