2022-09-19 09:51:41

by Like Xu

[permalink] [raw]
Subject: [PATCH v3 1/3] KVM: x86/pmu: Stop adding speculative Intel GP PMCs that don't exist yet

From: Like Xu <[email protected]>

The Intel April 2022 SDM - Table 2-2. IA-32 Architectural MSRs adds
a new architectural IA32_OVERCLOCKING_STATUS msr (0x195), plus the
presence of IA32_CORE_CAPABILITIES (0xCF), the theoretical effective
maximum value of the Intel GP PMCs is 14 (0xCF - 0xC1) instead of 18.

But the conclusion of this speculation "14" is very fragile and can
easily be overturned once Intel declares another meaningful arch msr
in the above reserved range, and even worse, just conjecture, Intel
probably put PMCs 8-15 in a completely different range of MSR indices.

A conservative proposal would be to stop at the maximum number of Intel
GP PMCs supported today. Also subsequent changes would limit both AMD
and Intel on the number of GP counter supported by KVM.

There are some boxes like Intel P4 (non Architectural PMU) may indeed
have 18 counters , but those counters are in a completely different msr
address range and is not supported by KVM.

Cc: Vitaly Kuznetsov <[email protected]>
Fixes: cf05a67b68b8 ("KVM: x86: omit "impossible" pmu MSRs from MSR list")
Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: Like Xu <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
---
Previous:
https://lore.kernel.org/kvm/[email protected]/
V2 -> V3 Changelog:
- Append "Reviewed-by" from Jim;
- Refine commit message a little bit; (Jim)
- Put the "Fixes" tag back; (Jim)

arch/x86/kvm/x86.c | 14 ++------------
1 file changed, 2 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d7374d768296..9f74c3924377 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1428,20 +1428,10 @@ static const u32 msrs_to_save_all[] = {
MSR_ARCH_PERFMON_PERFCTR0 + 2, MSR_ARCH_PERFMON_PERFCTR0 + 3,
MSR_ARCH_PERFMON_PERFCTR0 + 4, MSR_ARCH_PERFMON_PERFCTR0 + 5,
MSR_ARCH_PERFMON_PERFCTR0 + 6, MSR_ARCH_PERFMON_PERFCTR0 + 7,
- MSR_ARCH_PERFMON_PERFCTR0 + 8, MSR_ARCH_PERFMON_PERFCTR0 + 9,
- MSR_ARCH_PERFMON_PERFCTR0 + 10, MSR_ARCH_PERFMON_PERFCTR0 + 11,
- MSR_ARCH_PERFMON_PERFCTR0 + 12, MSR_ARCH_PERFMON_PERFCTR0 + 13,
- MSR_ARCH_PERFMON_PERFCTR0 + 14, MSR_ARCH_PERFMON_PERFCTR0 + 15,
- MSR_ARCH_PERFMON_PERFCTR0 + 16, MSR_ARCH_PERFMON_PERFCTR0 + 17,
MSR_ARCH_PERFMON_EVENTSEL0, MSR_ARCH_PERFMON_EVENTSEL1,
MSR_ARCH_PERFMON_EVENTSEL0 + 2, MSR_ARCH_PERFMON_EVENTSEL0 + 3,
MSR_ARCH_PERFMON_EVENTSEL0 + 4, MSR_ARCH_PERFMON_EVENTSEL0 + 5,
MSR_ARCH_PERFMON_EVENTSEL0 + 6, MSR_ARCH_PERFMON_EVENTSEL0 + 7,
- MSR_ARCH_PERFMON_EVENTSEL0 + 8, MSR_ARCH_PERFMON_EVENTSEL0 + 9,
- MSR_ARCH_PERFMON_EVENTSEL0 + 10, MSR_ARCH_PERFMON_EVENTSEL0 + 11,
- MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
- MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
- MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA, MSR_PEBS_DATA_CFG,

MSR_K7_EVNTSEL0, MSR_K7_EVNTSEL1, MSR_K7_EVNTSEL2, MSR_K7_EVNTSEL3,
@@ -6926,12 +6916,12 @@ static void kvm_init_msr_list(void)
intel_pt_validate_hw_cap(PT_CAP_num_address_ranges) * 2)
continue;
break;
- case MSR_ARCH_PERFMON_PERFCTR0 ... MSR_ARCH_PERFMON_PERFCTR0 + 17:
+ case MSR_ARCH_PERFMON_PERFCTR0 ... MSR_ARCH_PERFMON_PERFCTR0 + 7:
if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_PERFCTR0 >=
min(INTEL_PMC_MAX_GENERIC, kvm_pmu_cap.num_counters_gp))
continue;
break;
- case MSR_ARCH_PERFMON_EVENTSEL0 ... MSR_ARCH_PERFMON_EVENTSEL0 + 17:
+ case MSR_ARCH_PERFMON_EVENTSEL0 ... MSR_ARCH_PERFMON_EVENTSEL0 + 7:
if (msrs_to_save_all[i] - MSR_ARCH_PERFMON_EVENTSEL0 >=
min(INTEL_PMC_MAX_GENERIC, kvm_pmu_cap.num_counters_gp))
continue;
--
2.37.3


2022-09-19 09:52:03

by Like Xu

[permalink] [raw]
Subject: [PATCH v3 3/3] KVM: x86/pmu: Limit the maximum number of supported AMD GP counters

From: Like Xu <[email protected]>

The AMD PerfMonV2 specification allows for a maximum of 16 GP counters,
which is clearly not supported with zero code effort in the current KVM.

A local macro (named like INTEL_PMC_MAX_GENERIC) is introduced to
take back control of this virt capability, which also makes it easier to
statically partition all available counters between hosts and guests.

Signed-off-by: Like Xu <[email protected]>
Reviewed-by: Jim Mattson <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/svm/pmu.c | 7 ++++---
arch/x86/kvm/x86.c | 3 +++
3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 17abcf5c496a..1b3094616d87 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -506,6 +506,7 @@ struct kvm_pmc {
#define MSR_ARCH_PERFMON_PERFCTR_MAX (MSR_ARCH_PERFMON_PERFCTR0 + KVM_INTEL_PMC_MAX_GENERIC - 1)
#define MSR_ARCH_PERFMON_EVENTSEL_MAX (MSR_ARCH_PERFMON_EVENTSEL0 + KVM_INTEL_PMC_MAX_GENERIC - 1)
#define KVM_PMC_MAX_FIXED 3
+#define KVM_AMD_PMC_MAX_GENERIC AMD64_NUM_COUNTERS_CORE
struct kvm_pmu {
unsigned nr_arch_gp_counters;
unsigned nr_arch_fixed_counters;
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index f24613a108c5..e696979ee395 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -271,9 +271,10 @@ static void amd_pmu_init(struct kvm_vcpu *vcpu)
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
int i;

- BUILD_BUG_ON(AMD64_NUM_COUNTERS_CORE > INTEL_PMC_MAX_GENERIC);
+ BUILD_BUG_ON(AMD64_NUM_COUNTERS_CORE > KVM_AMD_PMC_MAX_GENERIC);
+ BUILD_BUG_ON(KVM_AMD_PMC_MAX_GENERIC > INTEL_PMC_MAX_GENERIC);

- for (i = 0; i < AMD64_NUM_COUNTERS_CORE ; i++) {
+ for (i = 0; i < KVM_AMD_PMC_MAX_GENERIC ; i++) {
pmu->gp_counters[i].type = KVM_PMC_GP;
pmu->gp_counters[i].vcpu = vcpu;
pmu->gp_counters[i].idx = i;
@@ -286,7 +287,7 @@ static void amd_pmu_reset(struct kvm_vcpu *vcpu)
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
int i;

- for (i = 0; i < AMD64_NUM_COUNTERS_CORE; i++) {
+ for (i = 0; i < KVM_AMD_PMC_MAX_GENERIC; i++) {
struct kvm_pmc *pmc = &pmu->gp_counters[i];

pmc_stop_counter(pmc);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index bf7eafcbe5ec..368af134f913 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1438,10 +1438,13 @@ static const u32 msrs_to_save_all[] = {

MSR_K7_EVNTSEL0, MSR_K7_EVNTSEL1, MSR_K7_EVNTSEL2, MSR_K7_EVNTSEL3,
MSR_K7_PERFCTR0, MSR_K7_PERFCTR1, MSR_K7_PERFCTR2, MSR_K7_PERFCTR3,
+
+ /* This part of MSRs should match KVM_AMD_PMC_MAX_GENERIC. */
MSR_F15H_PERF_CTL0, MSR_F15H_PERF_CTL1, MSR_F15H_PERF_CTL2,
MSR_F15H_PERF_CTL3, MSR_F15H_PERF_CTL4, MSR_F15H_PERF_CTL5,
MSR_F15H_PERF_CTR0, MSR_F15H_PERF_CTR1, MSR_F15H_PERF_CTR2,
MSR_F15H_PERF_CTR3, MSR_F15H_PERF_CTR4, MSR_F15H_PERF_CTR5,
+
MSR_IA32_XFD, MSR_IA32_XFD_ERR,
};

--
2.37.3

2022-10-07 20:14:03

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 1/3] KVM: x86/pmu: Stop adding speculative Intel GP PMCs that don't exist yet

On Mon, Sep 19, 2022, Like Xu wrote:
> From: Like Xu <[email protected]>
>
> The Intel April 2022 SDM - Table 2-2. IA-32 Architectural MSRs adds
> a new architectural IA32_OVERCLOCKING_STATUS msr (0x195), plus the
> presence of IA32_CORE_CAPABILITIES (0xCF), the theoretical effective
> maximum value of the Intel GP PMCs is 14 (0xCF - 0xC1) instead of 18.
>
> But the conclusion of this speculation "14" is very fragile and can
> easily be overturned once Intel declares another meaningful arch msr

s/msr/MSR for consistency

> in the above reserved range, and even worse, just conjecture, Intel
> probably put PMCs 8-15 in a completely different range of MSR indices.
>
> A conservative proposal would be to stop at the maximum number of Intel
> GP PMCs supported today. Also subsequent changes would limit both AMD
> and Intel on the number of GP counter supported by KVM.
>
> There are some boxes like Intel P4 (non Architectural PMU) may indeed
> have 18 counters , but those counters are in a completely different msr

unnecessary whitespace before the comma. And s/msr/MSR again.

> address range and is not supported by KVM.
>
> Cc: Vitaly Kuznetsov <[email protected]>
> Fixes: cf05a67b68b8 ("KVM: x86: omit "impossible" pmu MSRs from MSR list")

Does this need Cc: [email protected]? Or is this benign enough that we don't
care?

No need for a v4, the above nits can be handled when applying.

> Suggested-by: Jim Mattson <[email protected]>
> Signed-off-by: Like Xu <[email protected]>
> Reviewed-by: Jim Mattson <[email protected]>
> ---

In the future, please provide a cover letter even for trivial series, it helps
(me at least) mentally organize patches.

Thanks!

2022-10-14 09:21:53

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v3 1/3] KVM: x86/pmu: Stop adding speculative Intel GP PMCs that don't exist yet

On 8/10/2022 3:38 am, Sean Christopherson wrote:
> Does this need Cc:[email protected]? Or is this benign enough that we don't
> care?

Considering stable kernel may access IA32_OVERCLOCKING_STATUS as well,
cc stable list helps to remove the illusion of pmu msr scope for stable tree
maintainers.

>
> No need for a v4, the above nits can be handled when applying.

Thanks, so kind of you.

>
>> Suggested-by: Jim Mattson<[email protected]>
>> Signed-off-by: Like Xu<[email protected]>
>> Reviewed-by: Jim Mattson<[email protected]>
>> ---
> In the future, please provide a cover letter even for trivial series, it helps
> (me at least) mentally organize patches.

Roger that.

>
> Thanks!
>

2022-10-14 16:59:03

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v3 1/3] KVM: x86/pmu: Stop adding speculative Intel GP PMCs that don't exist yet

On Fri, Oct 14, 2022, Like Xu wrote:
> On 8/10/2022 3:38 am, Sean Christopherson wrote:
> > Does this need Cc:[email protected]? Or is this benign enough that we don't
> > care?
>
> Considering stable kernel may access IA32_OVERCLOCKING_STATUS as well,
> cc stable list helps to remove the illusion of pmu msr scope for stable tree
> maintainers.

Is that a "yes, this should be Cc'd stable" or "no, don't bother"?

2022-11-07 07:31:58

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v3 1/3] KVM: x86/pmu: Stop adding speculative Intel GP PMCs that don't exist yet

On 15/10/2022 12:18 am, Sean Christopherson wrote:
> On Fri, Oct 14, 2022, Like Xu wrote:
>> On 8/10/2022 3:38 am, Sean Christopherson wrote:
>>> Does this need Cc:[email protected]? Or is this benign enough that we don't
>>> care?
>>
>> Considering stable kernel may access IA32_OVERCLOCKING_STATUS as well,
>> cc stable list helps to remove the illusion of pmu msr scope for stable tree
>> maintainers.
>
> Is that a "yes, this should be Cc'd stable" or "no, don't bother"?

Oops, I missed this one. "Yes, this should be Cc'd" as I have received a few
complaints on this.

Please let me know if I need to post a new version of this minor patch set,
considering the previous
comment "No need for a v4, the above nits can be handled when applying. "

Thanks,
Like Xu

2022-11-07 18:38:19

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v3 1/3] KVM: x86/pmu: Stop adding speculative Intel GP PMCs that don't exist yet

On 11/7/22 08:26, Like Xu wrote:
> On 15/10/2022 12:18 am, Sean Christopherson wrote:
>> On Fri, Oct 14, 2022, Like Xu wrote:
>>> On 8/10/2022 3:38 am, Sean Christopherson wrote:
>>>> Does this need Cc:[email protected] Or is this benign enough
>>>> that we don't
>>>> care?
>>>
>>> Considering stable kernel may access IA32_OVERCLOCKING_STATUS as well,
>>> cc stable list helps to remove the illusion of pmu msr scope for
>>> stable tree
>>> maintainers.
>>
>> Is that a "yes, this should be Cc'd stable" or "no, don't bother"?
>
> Oops, I missed this one. "Yes, this should be Cc'd" as I have received a
> few complaints on this.
>
> Please let me know if I need to post a new version of this minor patch
> set, considering the previous
> comment "No need for a v4, the above nits can be handled when applying. "

Queued, thanks.

Paolo


2022-11-07 18:50:19

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v3 3/3] KVM: x86/pmu: Limit the maximum number of supported AMD GP counters

On 9/19/22 11:10, Like Xu wrote:
> @@ -506,6 +506,7 @@ struct kvm_pmc {
> #define MSR_ARCH_PERFMON_PERFCTR_MAX (MSR_ARCH_PERFMON_PERFCTR0 + KVM_INTEL_PMC_MAX_GENERIC - 1)
> #define MSR_ARCH_PERFMON_EVENTSEL_MAX (MSR_ARCH_PERFMON_EVENTSEL0 + KVM_INTEL_PMC_MAX_GENERIC - 1)
> #define KVM_PMC_MAX_FIXED 3
> +#define KVM_AMD_PMC_MAX_GENERIC AMD64_NUM_COUNTERS_CORE
> struct kvm_pmu {

Even though the BUILD_BUG_ON prevents out-of-bounds accesses, this
should be hardcoded to 6 to avoid mismatches with msrs_to_save_all[].

> + BUILD_BUG_ON(AMD64_NUM_COUNTERS_CORE > KVM_AMD_PMC_MAX_GENERIC);

This should be KVM_AMD_PMC_MAX_GENERIC > AMD64_NUM_COUNTERS_CORE, not
the opposite.

Fixed up and changed the commit message to follow:

The AMD PerfMonV2 specification allows for a maximum of 16 GP
counters, but currently only 6 pairs of MSRs are accepted by KVM.

While AMD64_NUM_COUNTERS_CORE is already equal to 6, increasing
without adjusting msrs_to_save_all[] could result in out-of-bounds
accesses. Therefore introduce a macro (named
KVM_AMD_PMC_MAX_GENERIC) to refer to the number of counters
supported by KVM.

Paolo