2015-06-05 15:29:07

by Imre Palik

Subject: [PATCH v2] perf: honoring the architectural performance monitoring version

From: "Palik, Imre" <[email protected]>

Architectural performance monitoring version 1 doesn't support fixed
counters. Currently, even if a hypervisor advertises support for
architectural performance monitoring version 1, perf may still try to use
the fixed counters, as the constraints are set up based on the CPU model.

This patch ensures that perf honors the architectural performance
monitoring version returned by CPUID, and it only uses the fixed counters
for version two and above.

Some of the ideas in this patch come from Peter Zijlstra.

Signed-off-by: Imre Palik <[email protected]>
Cc: Anthony Liguori <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 3998131..bde66aa 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1870,7 +1870,7 @@ x86_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
for_each_event_constraint(c, x86_pmu.event_constraints) {
if ((event->hw.config & c->cmask) == c->code) {
event->hw.flags |= c->flags;
- return c;
+ return c->idxmsk64 ? c : NULL;
}
}
}
@@ -3341,9 +3341,12 @@ __init int intel_pmu_init(void)
for_each_event_constraint(c, x86_pmu.event_constraints) {
if (c->cmask != FIXED_EVENT_FLAGS
|| c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) {
+ c->idxmsk64 &=
+ ~(~0UL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
continue;
}
-
+ c->idxmsk64 &=
+ ~(~0UL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
c->weight += x86_pmu.num_counters;
}
--
1.7.9.5


2015-06-05 15:42:59

by Peter Zijlstra

Subject: Re: [PATCH v2] perf: honoring the architectural performance monitoring version

On Fri, Jun 05, 2015 at 05:28:45PM +0200, Imre Palik wrote:
> From: "Palik, Imre" <[email protected]>
>
> Architectural performance monitoring version 1 doesn't support fixed
> counters. Currently, even if a hypervisor advertises support for
> architectural performance monitoring version 1, perf may still try to use
> the fixed counters, as the constraints are set up based on the CPU model.
>
> This patch ensures that perf honors the architectural performance
> monitoring version returned by CPUID, and it only uses the fixed counters
> for version two and above.
>
> Some of the ideas in this patch come from Peter Zijlstra.
>
> Signed-off-by: Imre Palik <[email protected]>
> Cc: Anthony Liguori <[email protected]>
> ---
> arch/x86/kernel/cpu/perf_event_intel.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 3998131..bde66aa 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1870,7 +1870,7 @@ x86_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
> for_each_event_constraint(c, x86_pmu.event_constraints) {
> if ((event->hw.config & c->cmask) == c->code) {
> event->hw.flags |= c->flags;
> - return c;
> + return c->idxmsk64 ? c : NULL;

One too many spaces there :-) Returning c as found, even with an empty
idxmsk, is fine.

Also, I think this is broken; we hard-assume that
x86_get_event_constraints() returns a valid constraint, see for example:

x86_schedule_event():

c = x86_pmu.get_event_constraints()
= intel_get_event_constraints()
= __intel_get_event_constraints()
= x86_get_event_constraints();

cpuc->event_constraint[i] = c;

...

c = cpuc->event_constraint[i];

if (!test_bit(hwc->idx, c->idxmsk)) <-- *boom*


> @@ -3341,9 +3341,12 @@ __init int intel_pmu_init(void)
> for_each_event_constraint(c, x86_pmu.event_constraints) {
> if (c->cmask != FIXED_EVENT_FLAGS
> || c->idxmsk64 == INTEL_PMC_MSK_FIXED_REF_CYCLES) {
> + c->idxmsk64 &=
> + ~(~0UL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));

If you change idxmsk64 you also need to update weight.

> continue;
> }
> -
> + c->idxmsk64 &=
> + ~(~0UL << (INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed));
> c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
> c->weight += x86_pmu.num_counters;

And since we're now not unconditionally adding num_counters bits, that
weight update is broken.

For both sites, something like:

c->weight = hweight64(c->idxmsk64);

will recompute the weight.

Thanks!