2022-09-07 11:01:27

by Like Xu

[permalink] [raw]
Subject: [PATCH v2 1/3] KVM: x86/pmu: Stop adding speculative Intel GP PMCs that don't exist yet

From: Like Xu <[email protected]>

The Intel April 2022 SDM - Table 2-2. IA-32 Architectural MSRs adds
a new architectural IA32_OVERCLOCKING_STATUS msr (0x195), plus the
presence of IA32_CORE_CAPABILITIES (0xCF), the theoretical effective
maximum value of the Intel GP PMCs is 14 (0xCF - 0xC1) instead of 18.

But the conclusion of this speculation "14" is very fragile and can
easily be overturned once Intel declares another meaningful arch msr
in the above reserved range, and even worse, Intel probably put PMCs
8-15 in a completely different range of MSR indices.

A conservative proposal would be to stop at the maximum number of Intel
GP PMCs supported today. Also subsequent changes would limit both AMD
and Intel on the number of GP counter supported by KVM.

There are some boxes like Intel P4 may indeed have 18 counters, but
those counters are in a completely different msr address range and do
not strictly adhere to the Intel Arch PMU specification, and will not
be supported by KVM in the near future.

Cc: Vitaly Kuznetsov <[email protected]>
Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: Like Xu <[email protected]>
---
Previous:
https://lore.kernel.org/kvm/[email protected]/
V1 -> V2 Changelog:
- Stop at the maximum number of GP PMCs supported today; (Jim)

arch/x86/kvm/x86.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 43a6a7efc6ec..884f6de11a33 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1428,20 +1428,10 @@ static const u32 msrs_to_save_all[] = {
MSR_ARCH_PERFMON_PERFCTR0 + 2, MSR_ARCH_PERFMON_PERFCTR0 + 3,
MSR_ARCH_PERFMON_PERFCTR0 + 4, MSR_ARCH_PERFMON_PERFCTR0 + 5,
MSR_ARCH_PERFMON_PERFCTR0 + 6, MSR_ARCH_PERFMON_PERFCTR0 + 7,
- MSR_ARCH_PERFMON_PERFCTR0 + 8, MSR_ARCH_PERFMON_PERFCTR0 + 9,
- MSR_ARCH_PERFMON_PERFCTR0 + 10, MSR_ARCH_PERFMON_PERFCTR0 + 11,
- MSR_ARCH_PERFMON_PERFCTR0 + 12, MSR_ARCH_PERFMON_PERFCTR0 + 13,
- MSR_ARCH_PERFMON_PERFCTR0 + 14, MSR_ARCH_PERFMON_PERFCTR0 + 15,
- MSR_ARCH_PERFMON_PERFCTR0 + 16, MSR_ARCH_PERFMON_PERFCTR0 + 17,
MSR_ARCH_PERFMON_EVENTSEL0, MSR_ARCH_PERFMON_EVENTSEL1,
MSR_ARCH_PERFMON_EVENTSEL0 + 2, MSR_ARCH_PERFMON_EVENTSEL0 + 3,
MSR_ARCH_PERFMON_EVENTSEL0 + 4, MSR_ARCH_PERFMON_EVENTSEL0 + 5,
MSR_ARCH_PERFMON_EVENTSEL0 + 6, MSR_ARCH_PERFMON_EVENTSEL0 + 7,
- MSR_ARCH_PERFMON_EVENTSEL0 + 8, MSR_ARCH_PERFMON_EVENTSEL0 + 9,
- MSR_ARCH_PERFMON_EVENTSEL0 + 10, MSR_ARCH_PERFMON_EVENTSEL0 + 11,
- MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
- MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
- MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG,

MSR_K7_EVNTSEL0, MSR_K7_EVNTSEL1, MSR_K7_EVNTSEL2, MSR_K7_EVNTSEL3,
@@ -6943,12 +6933,12 @@ static void kvm_init_msr_list(void)
intel_pt_validate_hw_cap(PT_CAP_num_address_ranges) * 2)
continue;
break;
- case MSR_ARCH_PERFMON_PERFCTR0 ... MSR_ARCH_PERFMON_PERFCTR0 + 17:
+ case MSR_ARCH_PERFMON_PERFCTR0 ... MSR_ARCH_PERFMON_PERFCTR0 + 7:
if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_PERFCTR0 >=
min(INTEL_PMC_MAX_GENERIC, kvm_pmu_cap.num_counters_gp))
continue;
break;
- case MSR_ARCH_PERFMON_EVENTSEL0 ... MSR_ARCH_PERFMON_EVENTSEL0 + 17:
+ case MSR_ARCH_PERFMON_EVENTSEL0 ... MSR_ARCH_PERFMON_EVENTSEL0 + 7:
if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
min(INTEL_PMC_MAX_GENERIC, kvm_pmu_cap.num_counters_gp))
continue;
--
2.37.3


2022-09-07 11:46:19

by Like Xu

[permalink] [raw]
Subject: [PATCH v2 3/3] KVM: x86/pmu: Limit the maximum number of supported AMD GP counters

From: Like Xu <[email protected]>

The AMD PerfMonV2 specification allows for a maximum of 16 GP counters,
which is clearly not supported with zero code effort in the current KVM.

A local macro (named like INTEL_PMC_MAX_GENERIC) is introduced to
take back control of this virt capability, which also makes it easier to
statically partition all available counters between hosts and guests.

Signed-off-by: Like Xu <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm/pmu.c | 7 ++++---
arch/x86/kvm/x86.c | 3 +++
3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 70b8266b0474..5c941ace8f67 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -506,6 +506,7 @@ struct kvm_pmc {
#define MSR_ARCH_PERFMON_PERFCTR_MAX (MSR_ARCH_PERFMON_PERFCTR0 + KVM_INTEL_PMC_MAX_GENERIC - 1)
#define MSR_ARCH_PERFMON_EVENTSEL_MAX (MSR_ARCH_PERFMON_EVENTSEL0 + KVM_INTEL_PMC_MAX_GENERIC - 1)
#define KVM_PMC_MAX_FIXED 3
+#define KVM_AMD_PMC_MAX_GENERIC AMD64_NUM_COUNTERS_CORE
struct kvm_pmu {
unsigned nr_arch_gp_counters;
unsigned nr_arch_fixed_counters;
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index f24613a108c5..e696979ee395 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -271,9 +271,10 @@ static void amd_pmu_init(struct kvm_vcpu *vcpu)
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
int i;

- BUILD_BUG_ON(AMD64_NUM_COUNTERS_CORE > INTEL_PMC_MAX_GENERIC);
+ BUILD_BUG_ON(AMD64_NUM_COUNTERS_CORE > KVM_AMD_PMC_MAX_GENERIC);
+ BUILD_BUG_ON(KVM_AMD_PMC_MAX_GENERIC > INTEL_PMC_MAX_GENERIC);

- for (i = 0; i < AMD64_NUM_COUNTERS_CORE ; i++) {
+ for (i = 0; i < KVM_AMD_PMC_MAX_GENERIC ; i++) {
pmu->gp_counters[i].type = KVM_PMC_GP;
pmu->gp_counters[i].vcpu = vcpu;
pmu->gp_counters[i].idx = i;
@@ -286,7 +287,7 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
int i;

- for (i = 0; i < AMD64_NUM_COUNTERS_CORE; i++) {
+ for (i = 0; i < KVM_AMD_PMC_MAX_GENERIC; i++) {
struct kvm_pmc *pmc = &pmu->gp_counters[i];

pmc_stop_counter(pmc);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index fd64003ee0e0..1d28d147fc34 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1438,10 +1438,13 @@ static const u32 msrs_to_save_all[] = {

MSR_K7_EVNTSEL0, MSR_K7_EVNTSEL1, MSR_K7_EVNTSEL2, MSR_K7_EVNTSEL3,
MSR_K7_PERFCTR0, MSR_K7_PERFCTR1, MSR_K7_PERFCTR2, MSR_K7_PERFCTR3,
+
+ /* This part of MSRs should match KVM_AMD_PMC_MAX_GENERIC. */
MSR_F15H_PERF_CTL0, MSR_F15H_PERF_CTL1, MSR_F15H_PERF_CTL2,
MSR_F15H_PERF_CTL3, MSR_F15H_PERF_CTL4, MSR_F15H_PERF_CTL5,
MSR_F15H_PERF_CTR0, MSR_F15H_PERF_CTR1, MSR_F15H_PERF_CTR2,
MSR_F15H_PERF_CTR3, MSR_F15H_PERF_CTR4, MSR_F15H_PERF_CTR5,
+
MSR_IA32_XFD, MSR_IA32_XFD_ERR,
};

--
2.37.3

2022-09-07 16:48:37

by Jim Mattson

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] KVM: x86/pmu: Stop adding speculative Intel GP PMCs that don't exist yet

On Wed, Sep 7, 2022 at 3:48 AM Like Xu <[email protected]> wrote:
>
> From: Like Xu <[email protected]>
>
> The Intel April 2022 SDM - Table 2-2. IA-32 Architectural MSRs adds
> a new architectural IA32_OVERCLOCKING_STATUS msr (0x195), plus the
> presence of IA32_CORE_CAPABILITIES (0xCF), the theoretical effective
> maximum value of the Intel GP PMCs is 14 (0xCF - 0xC1) instead of 18.
>
> But the conclusion of this speculation "14" is very fragile and can
> easily be overturned once Intel declares another meaningful arch msr
> in the above reserved range, and even worse, Intel probably put PMCs
> 8-15 in a completely different range of MSR indices.

The last clause is just conjecture.

> A conservative proposal would be to stop at the maximum number of Intel
> GP PMCs supported today. Also subsequent changes would limit both AMD
> and Intel on the number of GP counter supported by KVM.
>
> There are some boxes like Intel P4 may indeed have 18 counters, but
> those counters are in a completely different msr address range and do
> not strictly adhere to the Intel Arch PMU specification, and will not
> be supported by KVM in the near future.

The P4 PMU isn't virtualized by KVM today, is it?

>
> Cc: Vitaly Kuznetsov <[email protected]>
> Suggested-by: Jim Mattson <[email protected]>
> Signed-off-by: Like Xu <[email protected]>

Please put the "Fixes" tag back. You convinced me that it should be there.

Reviewed-by: Jim Mattson <[email protected]>

2022-09-19 09:31:34

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v2 1/3] KVM: x86/pmu: Stop adding speculative Intel GP PMCs that don't exist yet

On 8/9/2022 12:33 am, Jim Mattson wrote:
> On Wed, Sep 7, 2022 at 3:48 AM Like Xu <[email protected]> wrote:
>>
>> From: Like Xu <[email protected]>
>>
>> The Intel April 2022 SDM - Table 2-2. IA-32 Architectural MSRs adds
>> a new architectural IA32_OVERCLOCKING_STATUS msr (0x195), plus the
>> presence of IA32_CORE_CAPABILITIES (0xCF), the theoretical effective
>> maximum value of the Intel GP PMCs is 14 (0xCF - 0xC1) instead of 18.
>>
>> But the conclusion of this speculation "14" is very fragile and can
>> easily be overturned once Intel declares another meaningful arch msr
>> in the above reserved range, and even worse, Intel probably put PMCs
>> 8-15 in a completely different range of MSR indices.
>
> The last clause is just conjecture.
>
>> A conservative proposal would be to stop at the maximum number of Intel
>> GP PMCs supported today. Also subsequent changes would limit both AMD
>> and Intel on the number of GP counter supported by KVM.
>>
>> There are some boxes like Intel P4 may indeed have 18 counters, but
>> those counters are in a completely different msr address range and do
>> not strictly adhere to the Intel Arch PMU specification, and will not
>> be supported by KVM in the near future.
>
> The P4 PMU isn't virtualized by KVM today, is it?

According to [1], P4 PMU has ZERO number of Intel Architectural Events, and
the KVM support for non Intel Arch PMUs has been dropped recently.

[1]
https://www.intel.com/content/dam/develop/external/us/en/documents/30320-nehalem-pmu-programming-guide-core.pdf

>
>>
>> Cc: Vitaly Kuznetsov <[email protected]>
>> Suggested-by: Jim Mattson <[email protected]>
>> Signed-off-by: Like Xu <[email protected]>
>
> Please put the "Fixes" tag back. You convinced me that it should be there.
>
> Reviewed-by: Jim Mattson <[email protected]>