2014-06-27 23:10:19

by Andi Kleen

Subject: Updated PEBS simplification/fixup patchkit

This patchkit is my take on how the PEBS event lists should
be revamped. Plus a fix for the ANY bit.

It is a superset of Stephane's patches and obsoletes them.

I think I discussed nearly everything in there already in some earlier
emails. Basic ideas/fixes:

- Don't list every PEBS event as that's not needed
- Check the flags as the SDM recommends
- Still allow cycles:pp of course
- Fix the counters for memory latency events
- Fix the DataLA handling on Haswell to support all events.
- Allow leaking events with ANY bit.

Also the patchkit removes more code than it adds, so it's a
simplification.

-Andi


2014-06-27 23:10:21

by Andi Kleen

Subject: [PATCH 1/2] perf, x86: Revamp PEBS event selection

From: Andi Kleen <[email protected]>

As already discussed earlier in email.

The basic idea is that it does not make sense to list all PEBS
events individually. The list is very long, sometimes outdated
and the hardware doesn't need it. If an event does not support
PEBS it will just not count, there is no security issue.

This vastly simplifies the PEBS event selection.

Bugs fixed:
- We do not allow setting forbidden flags with PEBS anymore
(SDM 18.9.4), except for the special cycle event.
This is done using a new constraint macro that also
matches on the event flags.
- We now allow DataLA on all Haswell events, not just
a small subset. In general all PEBS events that tag memory
accesses support DataLA on Haswell. Otherwise the reported
address is just zero. This allows address profiling
on vastly more events.
- We did not allow all PEBS events on Haswell.

This includes the changes proposed by Stephane earlier and obsoletes
his patchkit.

I only did Sandy Bridge and Silvermont and later so far, mostly because these
are the parts whose hardware behavior I could confirm directly with hardware
architects.

Cc: [email protected]
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/include/asm/perf_event.h | 8 +++
arch/x86/kernel/cpu/perf_event.h | 18 ++++--
arch/x86/kernel/cpu/perf_event_intel_ds.c | 96 +++++++------------------------
3 files changed, 43 insertions(+), 79 deletions(-)

diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 8249df4..8dfc9fd 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -51,6 +51,14 @@
ARCH_PERFMON_EVENTSEL_EDGE | \
ARCH_PERFMON_EVENTSEL_INV | \
ARCH_PERFMON_EVENTSEL_CMASK)
+#define X86_ALL_EVENT_FLAGS \
+ (ARCH_PERFMON_EVENTSEL_EDGE | \
+ ARCH_PERFMON_EVENTSEL_INV | \
+ ARCH_PERFMON_EVENTSEL_CMASK | \
+ ARCH_PERFMON_EVENTSEL_ANY | \
+ ARCH_PERFMON_EVENTSEL_PIN_CONTROL | \
+ HSW_IN_TX | \
+ HSW_IN_TX_CHECKPOINTED)
#define AMD64_RAW_EVENT_MASK \
(X86_RAW_EVENT_MASK | \
AMD64_EVENTSEL_EVENT)
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index 3b2f9bd..9907759 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -252,16 +252,24 @@ struct cpu_hw_events {
EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK)

#define INTEL_PLD_CONSTRAINT(c, n) \
- __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
+ __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_LDLAT)

#define INTEL_PST_CONSTRAINT(c, n) \
- __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
+ __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST)

-/* DataLA version of store sampling without extra enable bit. */
-#define INTEL_PST_HSW_CONSTRAINT(c, n) \
- __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
+/* Event constraint, but match on all event flags too. */
+#define INTEL_FLAGS_EVENT_CONSTRAINT(c, n) \
+ EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS)
+
+/* Check only flags, but allow all event/umask */
+#define INTEL_ALL_EVENT_CONSTRAINT(flags, n) \
+ EVENT_CONSTRAINT(flags, n, X86_ALL_EVENT_FLAGS)
+
+/* Same as above, but enable DataLA */
+#define INTEL_ALL_EVENT_CONSTRAINT_DATALA(flags, n) \
+ __EVENT_CONSTRAINT(flags, n, X86_ALL_EVENT_FLAGS, \
HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST_HSW)

/*
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index 980970c..d50142e 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -567,28 +567,10 @@ struct event_constraint intel_atom_pebs_event_constraints[] = {
};

struct event_constraint intel_slm_pebs_event_constraints[] = {
- INTEL_UEVENT_CONSTRAINT(0x0103, 0x1), /* REHABQ.LD_BLOCK_ST_FORWARD_PS */
- INTEL_UEVENT_CONSTRAINT(0x0803, 0x1), /* REHABQ.LD_SPLITS_PS */
- INTEL_UEVENT_CONSTRAINT(0x0204, 0x1), /* MEM_UOPS_RETIRED.L2_HIT_LOADS_PS */
- INTEL_UEVENT_CONSTRAINT(0x0404, 0x1), /* MEM_UOPS_RETIRED.L2_MISS_LOADS_PS */
- INTEL_UEVENT_CONSTRAINT(0x0804, 0x1), /* MEM_UOPS_RETIRED.DTLB_MISS_LOADS_PS */
- INTEL_UEVENT_CONSTRAINT(0x2004, 0x1), /* MEM_UOPS_RETIRED.HITM_PS */
- INTEL_UEVENT_CONSTRAINT(0x00c0, 0x1), /* INST_RETIRED.ANY_PS */
- INTEL_UEVENT_CONSTRAINT(0x00c4, 0x1), /* BR_INST_RETIRED.ALL_BRANCHES_PS */
- INTEL_UEVENT_CONSTRAINT(0x7ec4, 0x1), /* BR_INST_RETIRED.JCC_PS */
- INTEL_UEVENT_CONSTRAINT(0xbfc4, 0x1), /* BR_INST_RETIRED.FAR_BRANCH_PS */
- INTEL_UEVENT_CONSTRAINT(0xebc4, 0x1), /* BR_INST_RETIRED.NON_RETURN_IND_PS */
- INTEL_UEVENT_CONSTRAINT(0xf7c4, 0x1), /* BR_INST_RETIRED.RETURN_PS */
- INTEL_UEVENT_CONSTRAINT(0xf9c4, 0x1), /* BR_INST_RETIRED.CALL_PS */
- INTEL_UEVENT_CONSTRAINT(0xfbc4, 0x1), /* BR_INST_RETIRED.IND_CALL_PS */
- INTEL_UEVENT_CONSTRAINT(0xfdc4, 0x1), /* BR_INST_RETIRED.REL_CALL_PS */
- INTEL_UEVENT_CONSTRAINT(0xfec4, 0x1), /* BR_INST_RETIRED.TAKEN_JCC_PS */
- INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* BR_INST_MISP_RETIRED.ALL_BRANCHES_PS */
- INTEL_UEVENT_CONSTRAINT(0x7ec5, 0x1), /* BR_INST_MISP_RETIRED.JCC_PS */
- INTEL_UEVENT_CONSTRAINT(0xebc5, 0x1), /* BR_INST_MISP_RETIRED.NON_RETURN_IND_PS */
- INTEL_UEVENT_CONSTRAINT(0xf7c5, 0x1), /* BR_INST_MISP_RETIRED.RETURN_PS */
- INTEL_UEVENT_CONSTRAINT(0xfbc5, 0x1), /* BR_INST_MISP_RETIRED.IND_CALL_PS */
- INTEL_UEVENT_CONSTRAINT(0xfec5, 0x1), /* BR_INST_MISP_RETIRED.TAKEN_JCC_PS */
+ /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
+ INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
+ /* Allow all events as PEBS with no flags */
+ INTEL_ALL_EVENT_CONSTRAINT(0xffff, 0x1),
EVENT_CONSTRAINT_END
};

@@ -624,68 +606,34 @@ struct event_constraint intel_westmere_pebs_event_constraints[] = {

struct event_constraint intel_snb_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
- INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
- INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
- INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */
- INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
- INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */
- INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
- INTEL_UEVENT_CONSTRAINT(0x02d4, 0xf), /* MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS */
+ INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
+ INTEL_PST_CONSTRAINT(0x02cd, 0xf), /* MEM_TRANS_RETIRED.PRECISE_STORES */
+ /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
+ INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
+ /* Allow all events as PEBS with no flags */
+ INTEL_ALL_EVENT_CONSTRAINT(0xffff, 0xf),
EVENT_CONSTRAINT_END
};

struct event_constraint intel_ivb_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
- INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
- INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
- INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */
- INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
- INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */
- INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
+ INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
+ INTEL_PST_CONSTRAINT(0x02cd, 0xf), /* MEM_TRANS_RETIRED.PRECISE_STORES */
+ /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
+ INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
+ /* Allow all events as PEBS with no flags */
+ INTEL_ALL_EVENT_CONSTRAINT(0xffff, 0xf),
EVENT_CONSTRAINT_END
};

struct event_constraint intel_hsw_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
- INTEL_PST_HSW_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
- INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
- INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
- INTEL_UEVENT_CONSTRAINT(0x01c5, 0xf), /* BR_MISP_RETIRED.CONDITIONAL */
- INTEL_UEVENT_CONSTRAINT(0x04c5, 0xf), /* BR_MISP_RETIRED.ALL_BRANCHES */
- INTEL_UEVENT_CONSTRAINT(0x20c5, 0xf), /* BR_MISP_RETIRED.NEAR_TAKEN */
- INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.* */
- /* MEM_UOPS_RETIRED.STLB_MISS_LOADS */
- INTEL_UEVENT_CONSTRAINT(0x11d0, 0xf),
- /* MEM_UOPS_RETIRED.STLB_MISS_STORES */
- INTEL_UEVENT_CONSTRAINT(0x12d0, 0xf),
- INTEL_UEVENT_CONSTRAINT(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */
- INTEL_UEVENT_CONSTRAINT(0x41d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_LOADS */
- /* MEM_UOPS_RETIRED.SPLIT_STORES */
- INTEL_UEVENT_CONSTRAINT(0x42d0, 0xf),
- INTEL_UEVENT_CONSTRAINT(0x81d0, 0xf), /* MEM_UOPS_RETIRED.ALL_LOADS */
- INTEL_PST_HSW_CONSTRAINT(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */
- INTEL_UEVENT_CONSTRAINT(0x01d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L1_HIT */
- INTEL_UEVENT_CONSTRAINT(0x02d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L2_HIT */
- INTEL_UEVENT_CONSTRAINT(0x04d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L3_HIT */
- /* MEM_LOAD_UOPS_RETIRED.HIT_LFB */
- INTEL_UEVENT_CONSTRAINT(0x40d1, 0xf),
- /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS */
- INTEL_UEVENT_CONSTRAINT(0x01d2, 0xf),
- /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT */
- INTEL_UEVENT_CONSTRAINT(0x02d2, 0xf),
- /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM */
- INTEL_UEVENT_CONSTRAINT(0x01d3, 0xf),
- INTEL_UEVENT_CONSTRAINT(0x04c8, 0xf), /* HLE_RETIRED.Abort */
- INTEL_UEVENT_CONSTRAINT(0x04c9, 0xf), /* RTM_RETIRED.Abort */
-
+ INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.* */
+ /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
+ INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
+ /* Allow all events as PEBS with no flags */
+ /* We allow DATALA for all PEBS events, will be 0 if not supported */
+ INTEL_ALL_EVENT_CONSTRAINT_DATALA(0, 0xf),
EVENT_CONSTRAINT_END
};

--
1.9.3

2014-06-27 23:10:18

by Andi Kleen

Subject: [PATCH 2/2] perf, x86, ivb: Allow leaking events with ANY bit set

From: Andi Kleen <[email protected]>

Currently the leaking IVB events cannot be scheduled at all,
to avoid leaking information about other processes.
When the ANY bit is set this does not matter: the process
already has all the needed privileges and "leaking" is expected.
So allow these events with the ANY bit set.

Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index adb02aa..db5cec3 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -116,6 +116,8 @@ static struct event_constraint intel_snb_event_constraints[] __read_mostly =
EVENT_CONSTRAINT_END
};

+#define FLAGS_NOT_ANY (X86_ALL_EVENT_FLAGS & ~ARCH_PERFMON_EVENTSEL_ANY)
+
static struct event_constraint intel_ivb_event_constraints[] __read_mostly =
{
FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
@@ -135,11 +137,12 @@ static struct event_constraint intel_ivb_event_constraints[] __read_mostly =
* Errata BV98 -- MEM_*_RETIRED events can leak between counters of SMT
* siblings; disable these events because they can corrupt unrelated
* counters.
+ * But allow them with the ANY bit set.
*/
INTEL_EVENT_CONSTRAINT(0xd0, 0x0), /* MEM_UOPS_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xd1, 0x0), /* MEM_LOAD_UOPS_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xd2, 0x0), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
- INTEL_EVENT_CONSTRAINT(0xd3, 0x0), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
+ INTEL_FLAGS_EVENT_CONSTRAINT(FLAGS_NOT_ANY|0xd1, 0x0), /* MEM_LOAD_UOPS_RETIRED.* */
+ INTEL_FLAGS_EVENT_CONSTRAINT(FLAGS_NOT_ANY|0xd2, 0x0), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
+ INTEL_FLAGS_EVENT_CONSTRAINT(FLAGS_NOT_ANY|0xd3, 0x0), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
EVENT_CONSTRAINT_END
};

--
1.9.3

2014-07-02 12:29:12

by Peter Zijlstra

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

On Fri, Jun 27, 2014 at 04:10:11PM -0700, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> As already discussed earlier in email.

Is an entirely inappropriate start for a Changelog. Do not assume prior
knowledge. If it's relevant, include it here without reference.


2014-07-02 13:07:32

by Stephane Eranian

Subject: Re: [PATCH 2/2] perf, x86, ivb: Allow leaking events with ANY bit set

Andi,

On Sat, Jun 28, 2014 at 1:10 AM, Andi Kleen <[email protected]> wrote:
> From: Andi Kleen <[email protected]>
>
> Currently the leaking IVB events cannot be scheduled at all,
> to avoid leaking information about other processes.
> When the ANY bit is set this does not matter: the process
> already has all the needed privileges and "leaking" is expected.
> So allow these events with the ANY bit set.
>
Does not make any sense.
This is not the problem.

It is not about leaking information to another hyper-thread, i.e.,
leaking private info.

It is about corrupting the other thread's counter regardless of what
it measures.

The events black-listed here could as well be black-listed on SNB
and HSW. Yet, they are useful events. The patch series we posted
with Maria addresses the corruption aspect. I will post a V2 next week.


> Signed-off-by: Andi Kleen <[email protected]>
> ---
> arch/x86/kernel/cpu/perf_event_intel.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index adb02aa..db5cec3 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -116,6 +116,8 @@ static struct event_constraint intel_snb_event_constraints[] __read_mostly =
> EVENT_CONSTRAINT_END
> };
>
> +#define FLAGS_NOT_ANY (X86_ALL_EVENT_FLAGS & ~ARCH_PERFMON_EVENTSEL_ANY)
> +
> static struct event_constraint intel_ivb_event_constraints[] __read_mostly =
> {
> FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
> @@ -135,11 +137,12 @@ static struct event_constraint intel_ivb_event_constraints[] __read_mostly =
> * Errata BV98 -- MEM_*_RETIRED events can leak between counters of SMT
> * siblings; disable these events because they can corrupt unrelated
> * counters.
> + * But allow them with the ANY bit set.
> */
> INTEL_EVENT_CONSTRAINT(0xd0, 0x0), /* MEM_UOPS_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xd1, 0x0), /* MEM_LOAD_UOPS_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xd2, 0x0), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xd3, 0x0), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
> + INTEL_FLAGS_EVENT_CONSTRAINT(FLAGS_NOT_ANY|0xd1, 0x0), /* MEM_LOAD_UOPS_RETIRED.* */
> + INTEL_FLAGS_EVENT_CONSTRAINT(FLAGS_NOT_ANY|0xd2, 0x0), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
> + INTEL_FLAGS_EVENT_CONSTRAINT(FLAGS_NOT_ANY|0xd3, 0x0), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
> EVENT_CONSTRAINT_END
> };
>
> --
> 1.9.3
>

2014-07-02 15:14:36

by Stephane Eranian

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

Andi,


On Sat, Jun 28, 2014 at 1:10 AM, Andi Kleen <[email protected]> wrote:
> From: Andi Kleen <[email protected]>
>
> As already discussed earlier in email.
>
> The basic idea is that it does not make sense to list all PEBS
> events individually. The list is very long, sometimes outdated
> and the hardware doesn't need it. If an event does not support
> PEBS it will just not count, there is no security issue.
>
> This vastly simplifies the PEBS event selection.
>
> Bugs fixed:
> - We do not allow setting forbidden flags with PEBS anymore
> (SDM 18.9.4), except for the special cycle event.
> This is done using a new constraint macro that also
> matches on the event flags.
> - We now allow DataLA on all Haswell events, not just
> a small subset. In general all PEBS events that tag memory
> accesses support DataLA on Haswell. Otherwise the reported
> address is just zero. This allows address profiling
> on vastly more events.
> - We did not allow all PEBS events on Haswell.
>
> This includes the changes proposed by Stephane earlier and obsoletes
> his patchkit.
>
> I only did Sandy Bridge and Silvermont and later so far, mostly because these
> are the parts whose hardware behavior I could confirm directly with hardware
> architects.
>
This patch still does not work as expected on any platform. See below.

> Cc: [email protected]
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> arch/x86/include/asm/perf_event.h | 8 +++
> arch/x86/kernel/cpu/perf_event.h | 18 ++++--
> arch/x86/kernel/cpu/perf_event_intel_ds.c | 96 +++++++------------------------
> 3 files changed, 43 insertions(+), 79 deletions(-)
>
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index 8249df4..8dfc9fd 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -51,6 +51,14 @@
> ARCH_PERFMON_EVENTSEL_EDGE | \
> ARCH_PERFMON_EVENTSEL_INV | \
> ARCH_PERFMON_EVENTSEL_CMASK)
> +#define X86_ALL_EVENT_FLAGS \
> + (ARCH_PERFMON_EVENTSEL_EDGE | \
> + ARCH_PERFMON_EVENTSEL_INV | \
> + ARCH_PERFMON_EVENTSEL_CMASK | \
> + ARCH_PERFMON_EVENTSEL_ANY | \
> + ARCH_PERFMON_EVENTSEL_PIN_CONTROL | \
> + HSW_IN_TX | \
> + HSW_IN_TX_CHECKPOINTED)
> #define AMD64_RAW_EVENT_MASK \
> (X86_RAW_EVENT_MASK | \
> AMD64_EVENTSEL_EVENT)
> diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
> index 3b2f9bd..9907759 100644
> --- a/arch/x86/kernel/cpu/perf_event.h
> +++ b/arch/x86/kernel/cpu/perf_event.h
> @@ -252,16 +252,24 @@ struct cpu_hw_events {
> EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK)
>
> #define INTEL_PLD_CONSTRAINT(c, n) \
> - __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
> + __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \
> HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_LDLAT)
>
> #define INTEL_PST_CONSTRAINT(c, n) \
> - __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
> + __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS, \
> HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST)
>
> -/* DataLA version of store sampling without extra enable bit. */
> -#define INTEL_PST_HSW_CONSTRAINT(c, n) \
> - __EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK, \
> +/* Event constraint, but match on all event flags too. */
> +#define INTEL_FLAGS_EVENT_CONSTRAINT(c, n) \
> + EVENT_CONSTRAINT(c, n, INTEL_ARCH_EVENT_MASK|X86_ALL_EVENT_FLAGS)
> +
> +/* Check only flags, but allow all event/umask */
> +#define INTEL_ALL_EVENT_CONSTRAINT(flags, n) \
> + EVENT_CONSTRAINT(flags, n, X86_ALL_EVENT_FLAGS)
> +
> +/* Same as above, but enable DataLA */
> +#define INTEL_ALL_EVENT_CONSTRAINT_DATALA(flags, n) \
> + __EVENT_CONSTRAINT(flags, n, X86_ALL_EVENT_FLAGS, \
> HWEIGHT(n), 0, PERF_X86_EVENT_PEBS_ST_HSW)
>
> /*
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> index 980970c..d50142e 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> @@ -567,28 +567,10 @@ struct event_constraint intel_atom_pebs_event_constraints[] = {
> };
>
> struct event_constraint intel_slm_pebs_event_constraints[] = {
> - INTEL_UEVENT_CONSTRAINT(0x0103, 0x1), /* REHABQ.LD_BLOCK_ST_FORWARD_PS */
> - INTEL_UEVENT_CONSTRAINT(0x0803, 0x1), /* REHABQ.LD_SPLITS_PS */
> - INTEL_UEVENT_CONSTRAINT(0x0204, 0x1), /* MEM_UOPS_RETIRED.L2_HIT_LOADS_PS */
> - INTEL_UEVENT_CONSTRAINT(0x0404, 0x1), /* MEM_UOPS_RETIRED.L2_MISS_LOADS_PS */
> - INTEL_UEVENT_CONSTRAINT(0x0804, 0x1), /* MEM_UOPS_RETIRED.DTLB_MISS_LOADS_PS */
> - INTEL_UEVENT_CONSTRAINT(0x2004, 0x1), /* MEM_UOPS_RETIRED.HITM_PS */
> - INTEL_UEVENT_CONSTRAINT(0x00c0, 0x1), /* INST_RETIRED.ANY_PS */
> - INTEL_UEVENT_CONSTRAINT(0x00c4, 0x1), /* BR_INST_RETIRED.ALL_BRANCHES_PS */
> - INTEL_UEVENT_CONSTRAINT(0x7ec4, 0x1), /* BR_INST_RETIRED.JCC_PS */
> - INTEL_UEVENT_CONSTRAINT(0xbfc4, 0x1), /* BR_INST_RETIRED.FAR_BRANCH_PS */
> - INTEL_UEVENT_CONSTRAINT(0xebc4, 0x1), /* BR_INST_RETIRED.NON_RETURN_IND_PS */
> - INTEL_UEVENT_CONSTRAINT(0xf7c4, 0x1), /* BR_INST_RETIRED.RETURN_PS */
> - INTEL_UEVENT_CONSTRAINT(0xf9c4, 0x1), /* BR_INST_RETIRED.CALL_PS */
> - INTEL_UEVENT_CONSTRAINT(0xfbc4, 0x1), /* BR_INST_RETIRED.IND_CALL_PS */
> - INTEL_UEVENT_CONSTRAINT(0xfdc4, 0x1), /* BR_INST_RETIRED.REL_CALL_PS */
> - INTEL_UEVENT_CONSTRAINT(0xfec4, 0x1), /* BR_INST_RETIRED.TAKEN_JCC_PS */
> - INTEL_UEVENT_CONSTRAINT(0x00c5, 0x1), /* BR_INST_MISP_RETIRED.ALL_BRANCHES_PS */
> - INTEL_UEVENT_CONSTRAINT(0x7ec5, 0x1), /* BR_INST_MISP_RETIRED.JCC_PS */
> - INTEL_UEVENT_CONSTRAINT(0xebc5, 0x1), /* BR_INST_MISP_RETIRED.NON_RETURN_IND_PS */
> - INTEL_UEVENT_CONSTRAINT(0xf7c5, 0x1), /* BR_INST_MISP_RETIRED.RETURN_PS */
> - INTEL_UEVENT_CONSTRAINT(0xfbc5, 0x1), /* BR_INST_MISP_RETIRED.IND_CALL_PS */
> - INTEL_UEVENT_CONSTRAINT(0xfec5, 0x1), /* BR_INST_MISP_RETIRED.TAKEN_JCC_PS */
> + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
> + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
> + /* Allow all events as PEBS with no flags */
> + INTEL_ALL_EVENT_CONSTRAINT(0xffff, 0x1),

No, still needs to be INTEL_ALL_EVENT_CONSTRAINT(0x0, 0x1)
otherwise the get_event_constraint() test I mentioned previously will
fail, even with your ALL_FILTER mask.

> EVENT_CONSTRAINT_END
> };
>
> @@ -624,68 +606,34 @@ struct event_constraint intel_westmere_pebs_event_constraints[] = {
>
> struct event_constraint intel_snb_pebs_event_constraints[] = {
> INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
> - INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
> - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
> - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */
> - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
> - INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */
> - INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
> - INTEL_UEVENT_CONSTRAINT(0x02d4, 0xf), /* MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS */
> + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
> + INTEL_PST_CONSTRAINT(0x02cd, 0xf), /* MEM_TRANS_RETIRED.PRECISE_STORES */

No, precise stores only work on counter 3, keep 0x8 here

> + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
> + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
> + /* Allow all events as PEBS with no flags */
> + INTEL_ALL_EVENT_CONSTRAINT(0xffff, 0xf),
Ditto
> EVENT_CONSTRAINT_END
> };
>
> struct event_constraint intel_ivb_pebs_event_constraints[] = {
> INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
> - INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
> - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
> - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */
> - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
> - INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */
> - INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
> - INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
> + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
> + INTEL_PST_CONSTRAINT(0x02cd, 0xf), /* MEM_TRANS_RETIRED.PRECISE_STORES */
Must be 0x8
> + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
> + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
> + /* Allow all events as PEBS with no flags */
> + INTEL_ALL_EVENT_CONSTRAINT(0xffff, 0xf),
Ditto
> EVENT_CONSTRAINT_END
> };
>
> struct event_constraint intel_hsw_pebs_event_constraints[] = {
> INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
> - INTEL_PST_HSW_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
> - INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
> - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
> - INTEL_UEVENT_CONSTRAINT(0x01c5, 0xf), /* BR_MISP_RETIRED.CONDITIONAL */
> - INTEL_UEVENT_CONSTRAINT(0x04c5, 0xf), /* BR_MISP_RETIRED.ALL_BRANCHES */
> - INTEL_UEVENT_CONSTRAINT(0x20c5, 0xf), /* BR_MISP_RETIRED.NEAR_TAKEN */
> - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.* */
> - /* MEM_UOPS_RETIRED.STLB_MISS_LOADS */
> - INTEL_UEVENT_CONSTRAINT(0x11d0, 0xf),
> - /* MEM_UOPS_RETIRED.STLB_MISS_STORES */
> - INTEL_UEVENT_CONSTRAINT(0x12d0, 0xf),
> - INTEL_UEVENT_CONSTRAINT(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */
> - INTEL_UEVENT_CONSTRAINT(0x41d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_LOADS */
> - /* MEM_UOPS_RETIRED.SPLIT_STORES */
> - INTEL_UEVENT_CONSTRAINT(0x42d0, 0xf),
> - INTEL_UEVENT_CONSTRAINT(0x81d0, 0xf), /* MEM_UOPS_RETIRED.ALL_LOADS */
> - INTEL_PST_HSW_CONSTRAINT(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */
> - INTEL_UEVENT_CONSTRAINT(0x01d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L1_HIT */
> - INTEL_UEVENT_CONSTRAINT(0x02d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L2_HIT */
> - INTEL_UEVENT_CONSTRAINT(0x04d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L3_HIT */
> - /* MEM_LOAD_UOPS_RETIRED.HIT_LFB */
> - INTEL_UEVENT_CONSTRAINT(0x40d1, 0xf),
> - /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS */
> - INTEL_UEVENT_CONSTRAINT(0x01d2, 0xf),
> - /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT */
> - INTEL_UEVENT_CONSTRAINT(0x02d2, 0xf),
> - /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM */
> - INTEL_UEVENT_CONSTRAINT(0x01d3, 0xf),
> - INTEL_UEVENT_CONSTRAINT(0x04c8, 0xf), /* HLE_RETIRED.Abort */
> - INTEL_UEVENT_CONSTRAINT(0x04c9, 0xf), /* RTM_RETIRED.Abort */
> -
> + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.* */
> + /* UOPS_RETIRED.ALL, inv=1, cmask=16 (cycles:p). */
> + INTEL_FLAGS_EVENT_CONSTRAINT(0x108001c2, 0xf),
> + /* Allow all events as PEBS with no flags */
> + /* We allow DATALA for all PEBS events, will be 0 if not supported */
> + INTEL_ALL_EVENT_CONSTRAINT_DATALA(0, 0xf),

Missing the catch-all constraint here.

> EVENT_CONSTRAINT_END
> };
>
Also add NHM, WSM, Core2, and Atom.
For the last two, you only have the catch-all constraint with counter 0.

> --
> 1.9.3
>

2014-07-02 15:34:10

by Andi Kleen

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

On Wed, Jul 02, 2014 at 02:29:02PM +0200, Peter Zijlstra wrote:
> On Fri, Jun 27, 2014 at 04:10:11PM -0700, Andi Kleen wrote:
> > From: Andi Kleen <[email protected]>
> >
> > As already discussed earlier in email.
>
> Is an entirely inappropriate start for a Changelog. Do not assume prior
> knowledge. If it's relevant, include it here without reference.

Thanks. Do you have any other comments?

-Andi

--
[email protected] -- Speaking for myself only

2014-07-02 15:34:19

by Andi Kleen

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

> No, still needs to be INTEL_ALL_EVENT_CONSTRAINT(0x0, 0x1)
> otherwise the get_event_constraint() test I mentioned previously will
> fail, even with your ALL_FILTER mask.

What events should fail? I verified all PEBS events and they work as expected.

> > - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
> > - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */
> > - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
> > - INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */
> > - INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */
> > - INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
> > - INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
> > - INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
> > - INTEL_UEVENT_CONSTRAINT(0x02d4, 0xf), /* MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS */
> > + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
> > + INTEL_PST_CONSTRAINT(0x02cd, 0xf), /* MEM_TRANS_RETIRED.PRECISE_STORES */
>
> No, precise stores only work on counter 3, keep 0x8 here

Good point.



-Andi
--
[email protected] -- Speaking for myself only

2014-07-02 15:43:16

by Peter Zijlstra

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

On Wed, Jul 02, 2014 at 08:34:07AM -0700, Andi Kleen wrote:
> On Wed, Jul 02, 2014 at 02:29:02PM +0200, Peter Zijlstra wrote:
> > On Fri, Jun 27, 2014 at 04:10:11PM -0700, Andi Kleen wrote:
> > > From: Andi Kleen <[email protected]>
> > >
> > > As already discussed earlier in email.
> >
> > Is an entirely inappropriate start for a Changelog. Do not assume prior
> > > knowledge. If it's relevant, include it here without reference.
>
> Thanks. Do you have any other comments?

What Stephane said ;-)

But also, I think we should conditionally allow the filter bits;
possibly with a sysfs file like I had.

Back when we had to sort that SNB cycles thing it was tedious that Linus
could not just try things.


2014-07-02 15:44:09

by Stephane Eranian

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

On Wed, Jul 2, 2014 at 5:33 PM, Andi Kleen <[email protected]> wrote:
>> No, still needs to be INTEL_ALL_EVENT_CONSTRAINT(0x0, 0x1)
>> otherwise the get_event_constraint() test I mentioned previously will
>> fail, even with your ALL_FILTER mask.
>
> What events should fail? I verified all PEBS events and they work as expected.
>
Random events should not fail, they should go with precise and not generate
any samples. That's the whole point of the exercise.

perf record -a -e r6099:p sleep 1

>> > - INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
>> > - INTEL_EVENT_CONSTRAINT(0xc5, 0xf), /* BR_MISP_RETIRED.* */
>> > - INTEL_PLD_CONSTRAINT(0x01cd, 0x8), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
>> > - INTEL_PST_CONSTRAINT(0x02cd, 0x8), /* MEM_TRANS_RETIRED.PRECISE_STORES */
>> > - INTEL_EVENT_CONSTRAINT(0xd0, 0xf), /* MEM_UOP_RETIRED.* */
>> > - INTEL_EVENT_CONSTRAINT(0xd1, 0xf), /* MEM_LOAD_UOPS_RETIRED.* */
>> > - INTEL_EVENT_CONSTRAINT(0xd2, 0xf), /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.* */
>> > - INTEL_EVENT_CONSTRAINT(0xd3, 0xf), /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.* */
>> > - INTEL_UEVENT_CONSTRAINT(0x02d4, 0xf), /* MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS */
>> > + INTEL_PLD_CONSTRAINT(0x01cd, 0xf), /* MEM_TRANS_RETIRED.LAT_ABOVE_THR */
>> > + INTEL_PST_CONSTRAINT(0x02cd, 0xf), /* MEM_TRANS_RETIRED.PRECISE_STORES */
>>
>> No, precise stores only work on counter 3, keep 0x8 here
>
> Good point.
>
>
>
> -Andi
> --
> [email protected] -- Speaking for myself only
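
The second argument of the constraint macros quoted above is a bitmask of the counters an event may be scheduled on, which is why 0x8 pins precise stores to counter 3 while 0xf would allow counters 0-3. A small Python sketch of reading such a mask, where `counters_from_mask` is a hypothetical helper rather than anything from the patch:

```python
def counters_from_mask(mask: int) -> list:
    """Return the counter indices whose bits are set in mask."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


print(counters_from_mask(0x8))  # [3]
print(counters_from_mask(0xF))  # [0, 1, 2, 3]
```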

2014-07-02 15:48:34

by Andi Kleen

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

On Wed, Jul 02, 2014 at 05:44:05PM +0200, Stephane Eranian wrote:
> On Wed, Jul 2, 2014 at 5:33 PM, Andi Kleen <[email protected]> wrote:
> >> No, still needs to be INTEL_ALL_EVENT_CONSTRAINT(0x0, 0x1)
> >> otherwise the get_event_constraint() test I mentioned previously will
> >> fail, even with your ALL_FILTER mask.
> >
> > What events should fail? I verified all PEBS events and they work as expected.
> >
> Random events should not fail, they should go with precise and not generate
> any samples. That's the whole point of the exercise.
>
> perf record -a -e r6099:p sleep 1

Like I said I ran all PEBS events and they generated samples.

-Andi

2014-07-02 16:07:33

by Stephane Eranian

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

On Wed, Jul 2, 2014 at 5:48 PM, Andi Kleen <[email protected]> wrote:
> On Wed, Jul 02, 2014 at 05:44:05PM +0200, Stephane Eranian wrote:
>> On Wed, Jul 2, 2014 at 5:33 PM, Andi Kleen <[email protected]> wrote:
>> >> No, still needs to be INTEL_ALL_EVENT_CONSTRAINT(0x0, 0x1)
>> >> otherwise the get_event_constraint() test I mentioned previously will
>> >> fail, even with your ALL_FILTER mask.
>> >
>> > What events should fail? I verified all PEBS events and they work as expected.
>> >
>> Random events should not fail, they should go with precise and not generate
>> any samples. That's the whole point of the exercise.
>>
>> perf record -a -e r6099:p sleep 1
>
> Like I said I ran all PEBS events and they generated samples.
>
I understand. I ran some random events to make sure I was not
getting PEBS samples and the system was stable.

2014-07-02 18:14:18

by Andi Kleen

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

> But also, I think we should conditionally allow the filter bits;
> possibly with a sysfs file like I had.
>
> Back when we had to sort that SNB cycles thing it was tedious that Linus
> could not just try things.

Hmm, the code in your patch to handle it was quite nasty.
I don't really see the situation repeating.

-Andi

--
[email protected] -- Speaking for myself only

2014-07-02 18:14:28

by Andi Kleen

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

On Wed, Jul 02, 2014 at 06:07:31PM +0200, Stephane Eranian wrote:
> On Wed, Jul 2, 2014 at 5:48 PM, Andi Kleen <[email protected]> wrote:
> > On Wed, Jul 02, 2014 at 05:44:05PM +0200, Stephane Eranian wrote:
> >> On Wed, Jul 2, 2014 at 5:33 PM, Andi Kleen <[email protected]> wrote:
> >> >> No, still needs to be INTEL_ALL_EVENT_CONSTRAINT(0x0, 0x1)
> >> >> otherwise the get_event_constraint() test I mentioned previously will
> >> >> fail, even with your ALL_FILTER mask.
> >> >
> >> > What events should fail? I verified all PEBS events and they work as expected.
> >> >
> >> Random events should not fail, they should go with precise and not generate
> >> any samples. That's the whole point of the exercise.
> >>
> >> perf record -a -e r6099:p sleep 1
> >
> > Like I said I ran all PEBS events and they generated samples.
> >
> I understand. I ran some random events to make sure I was not
> getting PEBS samples and the system was stable.

Not sure we're talking about the same thing. You claimed my patch
wouldn't let any PEBS events through, but the test results
disagree with that.

I fixed the broken store events you pointed out.

INST_RETIRED.PREC_DIST
cpu/event=0xC0,umask=0x01,name=INST_RETIRED_PREC_DIST/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.179 MB perf.data (~7821 samples) ]
UOPS_RETIRED.ALL
cpu/event=0xC2,umask=0x01,name=UOPS_RETIRED_ALL/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.179 MB perf.data (~7824 samples) ]
UOPS_RETIRED.RETIRE_SLOTS
cpu/event=0xC2,umask=0x02,name=UOPS_RETIRED_RETIRE_SLOTS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.180 MB perf.data (~7869 samples) ]
BR_INST_RETIRED.CONDITIONAL
cpu/event=0xC4,umask=0x01,name=BR_INST_RETIRED_CONDITIONAL/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.177 MB perf.data (~7729 samples) ]
BR_INST_RETIRED.NEAR_CALL
cpu/event=0xC4,umask=0x02,name=BR_INST_RETIRED_NEAR_CALL/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.140 MB perf.data (~6112 samples) ]
BR_INST_RETIRED.NEAR_RETURN
cpu/event=0xC4,umask=0x08,name=BR_INST_RETIRED_NEAR_RETURN/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.140 MB perf.data (~6124 samples) ]
BR_INST_RETIRED.NEAR_TAKEN
cpu/event=0xC4,umask=0x20,name=BR_INST_RETIRED_NEAR_TAKEN/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.176 MB perf.data (~7709 samples) ]
BR_INST_RETIRED.ALL_BRANCHES_PEBS
cpu/event=0xC4,umask=0x04,name=BR_INST_RETIRED_ALL_BRANCHES_PEBS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.177 MB perf.data (~7747 samples) ]
BR_MISP_RETIRED.CONDITIONAL
cpu/event=0xC5,umask=0x01,name=BR_MISP_RETIRED_CONDITIONAL/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.132 MB perf.data (~5767 samples) ]
BR_MISP_RETIRED.ALL_BRANCHES_PEBS
cpu/event=0xC5,umask=0x04,name=BR_MISP_RETIRED_ALL_BRANCHES_PEBS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.132 MB perf.data (~5781 samples) ]
HLE_RETIRED.ABORTED
cpu/event=0xc8,umask=0x04,name=HLE_RETIRED_ABORTED/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (~86 samples) ]
RTM_RETIRED.ABORTED
cpu/event=0xc9,umask=0x04,name=RTM_RETIRED_ABORTED/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (~86 samples) ]
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_4
cpu/event=0xCD,umask=0x01,ldlat=0x4,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_4/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.179 MB perf.data (~7832 samples) ]
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_8
cpu/event=0xCD,umask=0x01,ldlat=0x8,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_8/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.126 MB perf.data (~5522 samples) ]
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_16
cpu/event=0xCD,umask=0x01,ldlat=0x10,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_16/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.090 MB perf.data (~3911 samples) ]
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_32
cpu/event=0xCD,umask=0x01,ldlat=0x20,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_32/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.056 MB perf.data (~2429 samples) ]
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_64
cpu/event=0xCD,umask=0x01,ldlat=0x40,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_64/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data (~516 samples) ]
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_128
cpu/event=0xCD,umask=0x01,ldlat=0x80,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_128/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.014 MB perf.data (~604 samples) ]
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_256
cpu/event=0xCD,umask=0x01,ldlat=0x100,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_256/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.004 MB perf.data (~172 samples) ]
MEM_TRANS_RETIRED.LOAD_LATENCY_GT_512
cpu/event=0xCD,umask=0x01,ldlat=0x200,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_512/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data (~129 samples) ]
MEM_UOPS_RETIRED.STLB_MISS_LOADS
cpu/event=0xD0,umask=0x11,name=MEM_UOPS_RETIRED_STLB_MISS_LOADS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.006 MB perf.data (~261 samples) ]
MEM_UOPS_RETIRED.STLB_MISS_STORES
cpu/event=0xD0,umask=0x12,name=MEM_UOPS_RETIRED_STLB_MISS_STORES/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.016 MB perf.data (~694 samples) ]
MEM_UOPS_RETIRED.LOCK_LOADS
cpu/event=0xD0,umask=0x21,name=MEM_UOPS_RETIRED_LOCK_LOADS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.036 MB perf.data (~1554 samples) ]
MEM_UOPS_RETIRED.SPLIT_LOADS
cpu/event=0xD0,umask=0x41,name=MEM_UOPS_RETIRED_SPLIT_LOADS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data (~121 samples) ]
MEM_UOPS_RETIRED.SPLIT_STORES
cpu/event=0xD0,umask=0x42,name=MEM_UOPS_RETIRED_SPLIT_STORES/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.039 MB perf.data (~1707 samples) ]
MEM_UOPS_RETIRED.ALL_LOADS
cpu/event=0xD0,umask=0x81,name=MEM_UOPS_RETIRED_ALL_LOADS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.179 MB perf.data (~7839 samples) ]
MEM_UOPS_RETIRED.ALL_STORES
cpu/event=0xD0,umask=0x82,name=MEM_UOPS_RETIRED_ALL_STORES/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.164 MB perf.data (~7144 samples) ]
MEM_LOAD_UOPS_RETIRED.L1_HIT
cpu/event=0xD1,umask=0x01,name=MEM_LOAD_UOPS_RETIRED_L1_HIT/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.179 MB perf.data (~7826 samples) ]
MEM_LOAD_UOPS_RETIRED.L2_HIT
cpu/event=0xD1,umask=0x02,name=MEM_LOAD_UOPS_RETIRED_L2_HIT/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.084 MB perf.data (~3689 samples) ]
MEM_LOAD_UOPS_RETIRED.L3_HIT
cpu/event=0xD1,umask=0x04,name=MEM_LOAD_UOPS_RETIRED_L3_HIT/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.041 MB perf.data (~1779 samples) ]
MEM_LOAD_UOPS_RETIRED.L1_MISS
cpu/event=0xD1,umask=0x08,name=MEM_LOAD_UOPS_RETIRED_L1_MISS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.088 MB perf.data (~3827 samples) ]
MEM_LOAD_UOPS_RETIRED.L2_MISS
cpu/event=0xD1,umask=0x10,name=MEM_LOAD_UOPS_RETIRED_L2_MISS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.056 MB perf.data (~2439 samples) ]
MEM_LOAD_UOPS_RETIRED.L3_MISS
cpu/event=0xD1,umask=0x20,name=MEM_LOAD_UOPS_RETIRED_L3_MISS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.028 MB perf.data (~1229 samples) ]
MEM_LOAD_UOPS_RETIRED.HIT_LFB
cpu/event=0xD1,umask=0x40,name=MEM_LOAD_UOPS_RETIRED_HIT_LFB/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.055 MB perf.data (~2402 samples) ]
MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS
cpu/event=0xD2,umask=0x01,name=MEM_LOAD_UOPS_L3_HIT_RETIRED_XSNP_MISS/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (~107 samples) ]
MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT
cpu/event=0xD2,umask=0x02,name=MEM_LOAD_UOPS_L3_HIT_RETIRED_XSNP_HIT/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data (~119 samples) ]
MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM
cpu/event=0xD2,umask=0x04,name=MEM_LOAD_UOPS_L3_HIT_RETIRED_XSNP_HITM/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (~107 samples) ]
MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_NONE
cpu/event=0xD2,umask=0x08,name=MEM_LOAD_UOPS_L3_HIT_RETIRED_XSNP_NONE/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.038 MB perf.data (~1649 samples) ]
MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM
cpu/event=0xD3,umask=0x01,name=MEM_LOAD_UOPS_L3_MISS_RETIRED_LOCAL_DRAM/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.028 MB perf.data (~1204 samples) ]
BR_MISP_RETIRED.NEAR_TAKEN
cpu/event=0xC5,umask=0x20,name=BR_MISP_RETIRED_NEAR_TAKEN/pp
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.132 MB perf.data (~5777 samples) ]

--
[email protected] -- Speaking for myself only

2014-07-07 05:51:53

by Stephane Eranian

Subject: Re: [PATCH 1/2] perf, x86: Revamp PEBS event selection

On Wed, Jul 2, 2014 at 8:10 PM, Andi Kleen <[email protected]> wrote:
> On Wed, Jul 02, 2014 at 06:07:31PM +0200, Stephane Eranian wrote:
>> On Wed, Jul 2, 2014 at 5:48 PM, Andi Kleen <[email protected]> wrote:
>> > On Wed, Jul 02, 2014 at 05:44:05PM +0200, Stephane Eranian wrote:
>> >> On Wed, Jul 2, 2014 at 5:33 PM, Andi Kleen <[email protected]> wrote:
>> >> >> No, still needs to be INTEL_ALL_EVENT_CONSTRAINT(0x0, 0x1)
>> >> >> otherwise the get_event_constraint() test I mentioned previously will
>> >> >> fail, even with your ALL_FILTER mask.
>> >> >
>> >> > What events should fail? I verified all PEBS events and they work as expected.
>> >> >
>> >> Random events should not fail, they should go with precise and not generate
>> >> any samples. That's the whole point of the exercise.
>> >>
>> >> perf record -a -e r6099:p sleep 1
>> >
>> > Like I said I ran all PEBS events and they generated samples.
>> >
>> I understand. I ran some random events to make sure I was not
>> getting PEBS samples and the system was stable.
>
> Not sure we're talking about the same thing. You claimed my patch
> wouldn't let any PEBS events through, but the test results
> disagree with that.
>
I did not say that. I said it does not let an arbitrary event code
use precise > 0, and that restriction is what we want to eliminate.
It is okay to allow precise > 0 on any event. The non-PEBS events
will not generate any PEBS records.
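
The behaviour described here can be sketched as a mask-and-compare check. The Python below is only an illustration under stated assumptions: it assumes a constraint matches when `(config & mask) == code`, it uses the architectural IA32_PERFEVTSELx bit positions for the edge, any-thread, invert and counter-mask fields as a stand-in for the filter flags, and the names `FILTER_FLAGS`, `Constraint` and `find_constraint` are invented. It is not the kernel code from the patch.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Architectural IA32_PERFEVTSELx bit positions.
EDGE = 1 << 18
ANY_THREAD = 1 << 21
INV = 1 << 23
CMASK = 0xFF << 24

# Assumption: treat these as the filter flags a PEBS event must not set.
FILTER_FLAGS = EDGE | ANY_THREAD | INV | CMASK


@dataclass(frozen=True)
class Constraint:
    code: int      # value the masked config must equal
    counters: int  # bitmask of counters the event may use
    mask: int      # config bits that take part in the comparison


def find_constraint(config: int,
                    table: Sequence[Constraint]) -> Optional[Constraint]:
    """Return the first constraint whose masked compare matches, else None."""
    for constraint in table:
        if (config & constraint.mask) == constraint.code:
            return constraint
    return None


# Catch-all entry: ignores event select and umask, checks only the flags.
# Code 0x0 and counter mask 0x1 mirror the values quoted in the thread;
# the flags-only comparison mask is an assumption.
table = [Constraint(code=0x0, counters=0x1, mask=FILTER_FLAGS)]

plain = 0x6099           # r6099: arbitrary event, no filter flags set
inverted = 0x6099 | INV  # same event with the invert flag set

print(find_constraint(plain, table) is not None)     # accepted
print(find_constraint(inverted, table) is not None)  # rejected
```

With a catch-all entry like this, an arbitrary event code is accepted for precise sampling and simply produces no PEBS records if the hardware does not support it; only configs that set a filter flag are refused.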


> I fixed the broken store events you pointed out.
>
> INST_RETIRED.PREC_DIST
> cpu/event=0xC0,umask=0x01,name=INST_RETIRED_PREC_DIST/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.179 MB perf.data (~7821 samples) ]
> UOPS_RETIRED.ALL
> cpu/event=0xC2,umask=0x01,name=UOPS_RETIRED_ALL/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.179 MB perf.data (~7824 samples) ]
> UOPS_RETIRED.RETIRE_SLOTS
> cpu/event=0xC2,umask=0x02,name=UOPS_RETIRED_RETIRE_SLOTS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.180 MB perf.data (~7869 samples) ]
> BR_INST_RETIRED.CONDITIONAL
> cpu/event=0xC4,umask=0x01,name=BR_INST_RETIRED_CONDITIONAL/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.177 MB perf.data (~7729 samples) ]
> BR_INST_RETIRED.NEAR_CALL
> cpu/event=0xC4,umask=0x02,name=BR_INST_RETIRED_NEAR_CALL/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.140 MB perf.data (~6112 samples) ]
> BR_INST_RETIRED.NEAR_RETURN
> cpu/event=0xC4,umask=0x08,name=BR_INST_RETIRED_NEAR_RETURN/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.140 MB perf.data (~6124 samples) ]
> BR_INST_RETIRED.NEAR_TAKEN
> cpu/event=0xC4,umask=0x20,name=BR_INST_RETIRED_NEAR_TAKEN/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.176 MB perf.data (~7709 samples) ]
> BR_INST_RETIRED.ALL_BRANCHES_PEBS
> cpu/event=0xC4,umask=0x04,name=BR_INST_RETIRED_ALL_BRANCHES_PEBS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.177 MB perf.data (~7747 samples) ]
> BR_MISP_RETIRED.CONDITIONAL
> cpu/event=0xC5,umask=0x01,name=BR_MISP_RETIRED_CONDITIONAL/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.132 MB perf.data (~5767 samples) ]
> BR_MISP_RETIRED.ALL_BRANCHES_PEBS
> cpu/event=0xC5,umask=0x04,name=BR_MISP_RETIRED_ALL_BRANCHES_PEBS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.132 MB perf.data (~5781 samples) ]
> HLE_RETIRED.ABORTED
> cpu/event=0xc8,umask=0x04,name=HLE_RETIRED_ABORTED/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.002 MB perf.data (~86 samples) ]
> RTM_RETIRED.ABORTED
> cpu/event=0xc9,umask=0x04,name=RTM_RETIRED_ABORTED/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.002 MB perf.data (~86 samples) ]
> MEM_TRANS_RETIRED.LOAD_LATENCY_GT_4
> cpu/event=0xCD,umask=0x01,ldlat=0x4,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_4/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.179 MB perf.data (~7832 samples) ]
> MEM_TRANS_RETIRED.LOAD_LATENCY_GT_8
> cpu/event=0xCD,umask=0x01,ldlat=0x8,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_8/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.126 MB perf.data (~5522 samples) ]
> MEM_TRANS_RETIRED.LOAD_LATENCY_GT_16
> cpu/event=0xCD,umask=0x01,ldlat=0x10,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_16/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.090 MB perf.data (~3911 samples) ]
> MEM_TRANS_RETIRED.LOAD_LATENCY_GT_32
> cpu/event=0xCD,umask=0x01,ldlat=0x20,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_32/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.056 MB perf.data (~2429 samples) ]
> MEM_TRANS_RETIRED.LOAD_LATENCY_GT_64
> cpu/event=0xCD,umask=0x01,ldlat=0x40,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_64/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.012 MB perf.data (~516 samples) ]
> MEM_TRANS_RETIRED.LOAD_LATENCY_GT_128
> cpu/event=0xCD,umask=0x01,ldlat=0x80,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_128/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.014 MB perf.data (~604 samples) ]
> MEM_TRANS_RETIRED.LOAD_LATENCY_GT_256
> cpu/event=0xCD,umask=0x01,ldlat=0x100,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_256/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.004 MB perf.data (~172 samples) ]
> MEM_TRANS_RETIRED.LOAD_LATENCY_GT_512
> cpu/event=0xCD,umask=0x01,ldlat=0x200,name=MEM_TRANS_RETIRED_LOAD_LATENCY_GT_512/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.003 MB perf.data (~129 samples) ]
> MEM_UOPS_RETIRED.STLB_MISS_LOADS
> cpu/event=0xD0,umask=0x11,name=MEM_UOPS_RETIRED_STLB_MISS_LOADS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.006 MB perf.data (~261 samples) ]
> MEM_UOPS_RETIRED.STLB_MISS_STORES
> cpu/event=0xD0,umask=0x12,name=MEM_UOPS_RETIRED_STLB_MISS_STORES/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.016 MB perf.data (~694 samples) ]
> MEM_UOPS_RETIRED.LOCK_LOADS
> cpu/event=0xD0,umask=0x21,name=MEM_UOPS_RETIRED_LOCK_LOADS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.036 MB perf.data (~1554 samples) ]
> MEM_UOPS_RETIRED.SPLIT_LOADS
> cpu/event=0xD0,umask=0x41,name=MEM_UOPS_RETIRED_SPLIT_LOADS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.003 MB perf.data (~121 samples) ]
> MEM_UOPS_RETIRED.SPLIT_STORES
> cpu/event=0xD0,umask=0x42,name=MEM_UOPS_RETIRED_SPLIT_STORES/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.039 MB perf.data (~1707 samples) ]
> MEM_UOPS_RETIRED.ALL_LOADS
> cpu/event=0xD0,umask=0x81,name=MEM_UOPS_RETIRED_ALL_LOADS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.179 MB perf.data (~7839 samples) ]
> MEM_UOPS_RETIRED.ALL_STORES
> cpu/event=0xD0,umask=0x82,name=MEM_UOPS_RETIRED_ALL_STORES/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.164 MB perf.data (~7144 samples) ]
> MEM_LOAD_UOPS_RETIRED.L1_HIT
> cpu/event=0xD1,umask=0x01,name=MEM_LOAD_UOPS_RETIRED_L1_HIT/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.179 MB perf.data (~7826 samples) ]
> MEM_LOAD_UOPS_RETIRED.L2_HIT
> cpu/event=0xD1,umask=0x02,name=MEM_LOAD_UOPS_RETIRED_L2_HIT/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.084 MB perf.data (~3689 samples) ]
> MEM_LOAD_UOPS_RETIRED.L3_HIT
> cpu/event=0xD1,umask=0x04,name=MEM_LOAD_UOPS_RETIRED_L3_HIT/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.041 MB perf.data (~1779 samples) ]
> MEM_LOAD_UOPS_RETIRED.L1_MISS
> cpu/event=0xD1,umask=0x08,name=MEM_LOAD_UOPS_RETIRED_L1_MISS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.088 MB perf.data (~3827 samples) ]
> MEM_LOAD_UOPS_RETIRED.L2_MISS
> cpu/event=0xD1,umask=0x10,name=MEM_LOAD_UOPS_RETIRED_L2_MISS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.056 MB perf.data (~2439 samples) ]
> MEM_LOAD_UOPS_RETIRED.L3_MISS
> cpu/event=0xD1,umask=0x20,name=MEM_LOAD_UOPS_RETIRED_L3_MISS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.028 MB perf.data (~1229 samples) ]
> MEM_LOAD_UOPS_RETIRED.HIT_LFB
> cpu/event=0xD1,umask=0x40,name=MEM_LOAD_UOPS_RETIRED_HIT_LFB/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.055 MB perf.data (~2402 samples) ]
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS
> cpu/event=0xD2,umask=0x01,name=MEM_LOAD_UOPS_L3_HIT_RETIRED_XSNP_MISS/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.002 MB perf.data (~107 samples) ]
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT
> cpu/event=0xD2,umask=0x02,name=MEM_LOAD_UOPS_L3_HIT_RETIRED_XSNP_HIT/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.003 MB perf.data (~119 samples) ]
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM
> cpu/event=0xD2,umask=0x04,name=MEM_LOAD_UOPS_L3_HIT_RETIRED_XSNP_HITM/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.002 MB perf.data (~107 samples) ]
> MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_NONE
> cpu/event=0xD2,umask=0x08,name=MEM_LOAD_UOPS_L3_HIT_RETIRED_XSNP_NONE/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.038 MB perf.data (~1649 samples) ]
> MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM
> cpu/event=0xD3,umask=0x01,name=MEM_LOAD_UOPS_L3_MISS_RETIRED_LOCAL_DRAM/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.028 MB perf.data (~1204 samples) ]
> BR_MISP_RETIRED.NEAR_TAKEN
> cpu/event=0xC5,umask=0x20,name=BR_MISP_RETIRED_NEAR_TAKEN/pp
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.132 MB perf.data (~5777 samples) ]
>
> --
> [email protected] -- Speaking for myself only