2015-12-03 20:09:19

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH v2 0/2] perf/x86: fixes and improvement for Intel Atom PEBS support

This short series fixes the total breakage of Intel Atom PEBS support in recent
kernels. The problems were introduced by the changes in the PEBS logic to
handle a deeper buffer.

The first patch fixes PEBS and LBR problems, including NULL pointer dereferences,
incorrect pointer arithmetic, and a wrong assumption about the PEBS record layout.

The second patch adds an alias for cycles:pp to Intel Atom given that perf record/top
uses cycles:pp nowadays.

In V2, we removed the alias function specific to Atom and use the one from Core2
instead, because it is identical, as suggested by Kan Liang.

Stephane Eranian (2):
perf/x86: fix PEBS and LBR issues on Intel Atom
perf/x86: enable cycles:pp for Intel Atom

arch/x86/kernel/cpu/perf_event_intel.c | 1 +
arch/x86/kernel/cpu/perf_event_intel_ds.c | 11 ++++++++++-
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 11 +++++++----
3 files changed, 18 insertions(+), 5 deletions(-)

--
1.9.1


2015-12-03 20:03:21

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH v2 1/2] perf/x86: fix PEBS and LBR issues on Intel Atom

This patch fixes a number of problems in the PEBS
and LBR support of Intel Atom. Those bugs were introduced
by the recent changes to the PEBS code to handle multiple
entries.

The kernel was assuming that if the CPU supports the 64-bit LBR
format, then it also has an LBR_SELECT MSR. Atom uses the 64-bit
LBR format but does not have LBR_SELECT. That was causing NULL
pointer dereferences in a couple of places.

The kernel had a pointer arithmetic error in intel_pmu_drain_pebs_core()
when calculating the number of records present in the PEBS buffer.

get_next_pebs_record_by_bit() was called for PEBS fmt0, which does
not use the pebs_record_nhm layout.

This patch fixes all those problems and has PEBS and LBR working again.

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_ds.c | 9 ++++++++-
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 11 +++++++----
2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 5db1c77..dae5f93 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -1101,6 +1101,13 @@ get_next_pebs_record_by_bit(void *base, void *top, int bit)
void *at;
u64 pebs_status;

+ /*
+ * fmt0 does not have a status bitfield (does not use
+ * pebs_record_nhm format)
+ */
+ if (x86_pmu.intel_cap.pebs_format < 1)
+ return base;
+
if (base == NULL)
return NULL;

@@ -1186,7 +1193,7 @@ static void intel_pmu_drain_pebs_core(struct pt_regs *iregs)
if (!event->attr.precise_ip)
return;

- n = (top - at) / x86_pmu.pebs_record_size;
+ n = top - at;
if (n <= 0)
return;

diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index e2fad0c..1390148 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -161,7 +161,7 @@ static void __intel_pmu_lbr_enable(bool pmi)
*/
if (cpuc->lbr_sel)
lbr_select = cpuc->lbr_sel->config & x86_pmu.lbr_sel_mask;
- if (!pmi)
+ if (!pmi && cpuc->lbr_sel)
wrmsrl(MSR_LBR_SELECT, lbr_select);

rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
@@ -430,7 +430,7 @@ static void intel_pmu_lbr_read_32(struct cpu_hw_events *cpuc)
*/
static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
{
- bool need_info = !(cpuc->lbr_sel->config & LBR_NO_INFO);
+ bool need_info = false;
unsigned long mask = x86_pmu.lbr_nr - 1;
int lbr_format = x86_pmu.intel_cap.lbr_format;
u64 tos = intel_pmu_lbr_tos();
@@ -438,8 +438,11 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)
int out = 0;
int num = x86_pmu.lbr_nr;

- if (cpuc->lbr_sel->config & LBR_CALL_STACK)
- num = tos;
+ if (cpuc->lbr_sel) {
+ need_info = !(cpuc->lbr_sel->config & LBR_NO_INFO);
+ if (cpuc->lbr_sel->config & LBR_CALL_STACK)
+ num = tos;
+ }

for (i = 0; i < num; i++) {
unsigned long lbr_idx = (tos - i) & mask;
--
1.9.1

2015-12-03 20:03:25

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH v2 2/2] perf/x86: enable cycles:pp for Intel Atom

This patch updates the PEBS support for Intel Atom to provide
an alias for the cycles:pp event used by perf record/top by default
nowadays.

On Atom, only INST_RETIRED:ANY supports PEBS, so we use this event
instead with a large cmask to count cycles. Given that Core2 has
the same issue, we use the intel_pebs_aliases_core2() function for Atom
as well.

Signed-off-by: Stephane Eranian <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel.c | 1 +
arch/x86/kernel/cpu/perf_event_intel_ds.c | 2 ++
2 files changed, 3 insertions(+)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 61f2577..cef4d2f 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -3332,6 +3332,7 @@ __init int intel_pmu_init(void)

x86_pmu.event_constraints = intel_gen_event_constraints;
x86_pmu.pebs_constraints = intel_atom_pebs_event_constraints;
+ x86_pmu.pebs_aliases = intel_pebs_aliases_core2;
pr_cont("Atom events, ");
break;

diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index dae5f93..1b748ee 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -620,6 +620,8 @@ struct event_constraint intel_atom_pebs_event_constraints[] = {
INTEL_FLAGS_EVENT_CONSTRAINT(0xcb, 0x1), /* MEM_LOAD_RETIRED.* */
/* INST_RETIRED.ANY_P, inv=1, cmask=16 (cycles:p). */
INTEL_FLAGS_EVENT_CONSTRAINT(0x108000c0, 0x01),
+ /* Allow all events as PEBS with no flags */
+ INTEL_ALL_EVENT_CONSTRAINT(0, 0x1),
EVENT_CONSTRAINT_END
};

--
1.9.1

2015-12-03 20:15:11

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] perf/x86: enable cycles:pp for Intel Atom

> /* INST_RETIRED.ANY_P, inv=1, cmask=16 (cycles:p). */
> INTEL_FLAGS_EVENT_CONSTRAINT(0x108000c0, 0x01),
> + /* Allow all events as PEBS with no flags */
> + INTEL_ALL_EVENT_CONSTRAINT(0, 0x1),

I don't think this is really needed (no extra PEBS events), but ok it shouldn't
hurt either.

-Andi

--
[email protected] -- Speaking for myself only

2015-12-03 20:51:44

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] perf/x86: fix PEBS and LBR issues on Intel Atom

On Thu, Dec 03, 2015 at 09:03:09PM +0100, Stephane Eranian wrote:
> This patch fixes a number of problems in the PEBS
> and LBR support of Intel Atom. Those bugs were introduced
> by the recent changes to the PEBS code to handle multiple
> entries.

AFAICT the PEBS and LBR issues are separate, and this should therefore
be 2 patches.

Also, you seem to be lacking Fixes: tags, which are a big help in
backporting to -stable trees.

2015-12-03 21:20:07

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] perf/x86: fix PEBS and LBR issues on Intel Atom

On Thu, Dec 03, 2015 at 09:51:37PM +0100, Peter Zijlstra wrote:
> On Thu, Dec 03, 2015 at 09:03:09PM +0100, Stephane Eranian wrote:
> > This patch fixes a number of problems in the PEBS
> > and LBR support of Intel Atom. Those bugs were introduced
> > by the recent changes to the PEBS code to handle multiple
> > entries.
>
> AFAICT the PEBS and LBR issues are separate, and this should therefore
> be 2 patches.

The hunk to allow other PEBS events is also separate
(it's not needed for cycles:pp)

-Andi

--
[email protected] -- Speaking for myself only

2015-12-03 21:44:45

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] perf/x86: enable cycles:pp for Intel Atom

On Thu, Dec 3, 2015 at 12:14 PM, Andi Kleen <[email protected]> wrote:
>
> > /* INST_RETIRED.ANY_P, inv=1, cmask=16 (cycles:p). */
> > INTEL_FLAGS_EVENT_CONSTRAINT(0x108000c0, 0x01),
> > + /* Allow all events as PEBS with no flags */
> > + INTEL_ALL_EVENT_CONSTRAINT(0, 0x1),
>
> I don't think this is really needed (no extra PEBS events), but ok it shouldn't
> hurt either.
>
It has to do with consistent behavior: you do not get an error when trying
PEBS on a non-PEBS event, just like on the other CPUs.