2013-04-20 19:06:32

by Andi Kleen

Subject: Basic perf PMU support for Haswell v11

This is based on v7 of the full Haswell PMU support,
rebased and stripped down to the bare bones.

The most interesting new features are not in this patchkit
(full version is git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git hsw/pmu5)

Contains support for:
- Basic Haswell PMU and PEBS support
- Late unmasking of the PMI
- Basic LBRv4 support

v2: Addressed Stephane's feedback. See individual patches for details.
v3: now even more bite-sized. Qualifier constraints merged earlier.
v4: Rename some variables, add some comments and other minor changes.
Add some Reviewed/Tested-bys.
v5: Address some minor review feedback. Port to latest perf/core
v6: Adjust some variable names, add comments, edit descriptions, some
more testing, rebased to latest perf/core
v7: Expand comment
v8: Rename structure field.
v9: No wide counters, but add basic LBRs. Add some more
constraints. Rebase to 3.9rc1
v10: Change some whitespace. Rebase to 3.9rc3
v11: Rebase to perf/core. Fix extra regs. Rename INTX.

-Andi


2013-04-20 19:06:33

by Andi Kleen

Subject: [PATCH 4/5] perf, x86: Move NMI clearing to end of PMI handler after the counter registers are reset

From: Andi Kleen <[email protected]>

This avoids some problems with spurious PMIs on Haswell.
Haswell seems to behave more like P4 in this regard. Do
the same thing as the P4 perf handler by unmasking
the NMI only at the end. Shouldn't make any difference
for earlier family 6 cores.

Tested on Haswell, IvyBridge, Westmere, Saltwell (Atom)

Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel.c | 16 ++++++----------
1 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 62b6872..4a78745 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1172,16 +1172,6 @@ static int intel_pmu_handle_irq(struct pt_regs *regs)

cpuc = &__get_cpu_var(cpu_hw_events);

- /*
- * Some chipsets need to unmask the LVTPC in a particular spot
- * inside the nmi handler. As a result, the unmasking was pushed
- * into all the nmi handlers.
- *
- * This handler doesn't seem to have any issues with the unmasking
- * so it was left at the top.
- */
- apic_write(APIC_LVTPC, APIC_DM_NMI);
-
intel_pmu_disable_all();
handled = intel_pmu_drain_bts_buffer();
status = intel_pmu_get_status();
@@ -1241,6 +1231,12 @@ again:

done:
intel_pmu_enable_all(0);
+ /*
+ * Only unmask the NMI after the overflow counters
+ * have been reset. This avoids spurious NMIs on
+ * Haswell CPUs.
+ */
+ apic_write(APIC_LVTPC, APIC_DM_NMI);
return handled;
}

--
1.7.7.6

2013-04-20 19:06:47

by Andi Kleen

Subject: [PATCH 2/5] perf, x86: Basic Haswell PMU support v8

From: Andi Kleen <[email protected]>

Add basic Haswell PMU support.

Similar to SandyBridge, but has a few new events and two
new counter bits.

There are some new counter flags that need to be prevented
from being set on fixed counters, and allowed to be set
for generic counters.

Also we add support for the counter 2 constraint to handle
all raw events.

Contains fixes from Stephane Eranian.

v2: Folded TSX bits into standard FIXED_EVENT_CONSTRAINTS
v3: Use SNB LBR init code. Comment fix (Stephane Eranian)
v4: Add the counter2 constraints. Fix comment in the right place.
v5: Expand comment
v6: Add CYCLE_ACTIVITY.* to counter constraints
v7: Follow Linux style, not perf style
v8: Add missing extra regs
Reviewed-by: Stephane Eranian <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/include/asm/perf_event.h | 3 +
arch/x86/kernel/cpu/perf_event.h | 5 ++-
arch/x86/kernel/cpu/perf_event_intel.c | 81 ++++++++++++++++++++++++++++++++
3 files changed, 88 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 57cb634..b79b6eb 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -29,6 +29,9 @@
#define ARCH_PERFMON_EVENTSEL_INV (1ULL << 23)
#define ARCH_PERFMON_EVENTSEL_CMASK 0xFF000000ULL

+#define HSW_INTX (1ULL << 32)
+#define HSW_INTX_CHECKPOINTED (1ULL << 33)
+
#define AMD64_EVENTSEL_INT_CORE_ENABLE (1ULL << 36)
#define AMD64_EVENTSEL_GUESTONLY (1ULL << 40)
#define AMD64_EVENTSEL_HOSTONLY (1ULL << 41)
diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index ba9aadf..a974fe4 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -227,11 +227,14 @@ struct cpu_hw_events {
* - inv
* - edge
* - cnt-mask
+ * - intx
+ * - intx_checkpointed
* The other filters are supported by fixed counters.
* The any-thread option is supported starting with v3.
*/
+#define FIXED_EVENT_FLAGS (X86_RAW_EVENT_MASK|HSW_INTX|HSW_INTX_CHECKPOINTED)
#define FIXED_EVENT_CONSTRAINT(c, n) \
- EVENT_CONSTRAINT(c, (1ULL << (32+n)), X86_RAW_EVENT_MASK)
+ EVENT_CONSTRAINT(c, (1ULL << (32+n)), FIXED_EVENT_FLAGS)

/*
* Constraint on the Event code + UMask
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 2ad2374..bee0308 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -13,6 +13,7 @@
#include <linux/slab.h>
#include <linux/export.h>

+#include <asm/cpufeature.h>
#include <asm/hardirq.h>
#include <asm/apic.h>

@@ -178,6 +179,22 @@ struct attribute *snb_events_attrs[] = {
NULL,
};

+static struct event_constraint intel_hsw_event_constraints[] = {
+ FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
+ FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */
+ FIXED_EVENT_CONSTRAINT(0x0300, 2), /* CPU_CLK_UNHALTED.REF */
+ INTEL_EVENT_CONSTRAINT(0x48, 0x4), /* L1D_PEND_MISS.* */
+ INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PREC_DIST */
+ INTEL_EVENT_CONSTRAINT(0xcd, 0x8), /* MEM_TRANS_RETIRED.LOAD_LATENCY */
+ /* CYCLE_ACTIVITY.CYCLES_L1D_PENDING */
+ INTEL_EVENT_CONSTRAINT(0x08a3, 0x4),
+ /* CYCLE_ACTIVITY.STALLS_L1D_PENDING */
+ INTEL_EVENT_CONSTRAINT(0x0ca3, 0x4),
+ /* CYCLE_ACTIVITY.CYCLES_NO_EXECUTE */
+ INTEL_EVENT_CONSTRAINT(0x04a3, 0xf),
+ EVENT_CONSTRAINT_END
+};
+
static u64 intel_pmu_event_map(int hw_event)
{
return intel_perfmon_event_map[hw_event];
@@ -1634,6 +1651,48 @@ static void core_pmu_enable_all(int added)
}
}

+static int hsw_hw_config(struct perf_event *event)
+{
+ int ret = intel_pmu_hw_config(event);
+
+ if (ret)
+ return ret;
+ if (!boot_cpu_has(X86_FEATURE_RTM) && !boot_cpu_has(X86_FEATURE_HLE))
+ return 0;
+ event->hw.config |= event->attr.config &
+ (HSW_INTX|HSW_INTX_CHECKPOINTED);
+
+ /*
+ * INTX/INTX-CP filters are not supported by the Haswell PMU with
+ * PEBS or in ANY thread mode. Since the results are non-sensical forbid
+ * this combination.
+ */
+ if ((event->hw.config & (HSW_INTX|HSW_INTX_CHECKPOINTED)) &&
+ ((event->hw.config & ARCH_PERFMON_EVENTSEL_ANY) ||
+ event->attr.precise_ip > 0))
+ return -EOPNOTSUPP;
+
+ return 0;
+}
+
+static struct event_constraint counter2_constraint =
+ EVENT_CONSTRAINT(0, 0x4, 0);
+
+static struct event_constraint *
+hsw_get_event_constraints(struct cpu_hw_events *cpuc, struct perf_event *event)
+{
+ struct event_constraint *c = intel_get_event_constraints(cpuc, event);
+
+ /* Handle special quirk on intx_checkpointed only in counter 2 */
+ if (event->hw.config & HSW_INTX_CHECKPOINTED) {
+ if (c->idxmsk64 & (1U << 2))
+ return &counter2_constraint;
+ return &emptyconstraint;
+ }
+
+ return c;
+}
+
PMU_FORMAT_ATTR(event, "config:0-7" );
PMU_FORMAT_ATTR(umask, "config:8-15" );
PMU_FORMAT_ATTR(edge, "config:18" );
@@ -2171,6 +2230,28 @@ __init int intel_pmu_init(void)
break;


+ case 60: /* Haswell Client */
+ case 70:
+ case 71:
+ memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
+ sizeof(hw_cache_event_ids));
+ memcpy(hw_cache_extra_regs, snb_hw_cache_extra_regs,
+ sizeof(hw_cache_extra_regs));
+
+ intel_pmu_lbr_init_snb();
+
+ x86_pmu.event_constraints = intel_hsw_event_constraints;
+
+ x86_pmu.extra_regs = intel_snb_extra_regs;
+ /* all extra regs are per-cpu when HT is on */
+ x86_pmu.er_flags |= ERF_HAS_RSP_1;
+ x86_pmu.er_flags |= ERF_NO_HT_SHARING;
+
+ x86_pmu.hw_config = hsw_hw_config;
+ x86_pmu.get_event_constraints = hsw_get_event_constraints;
+ pr_cont("Haswell events, ");
+ break;
+
default:
switch (x86_pmu.version) {
case 1:
--
1.7.7.6

2013-04-20 19:06:54

by Andi Kleen

Subject: [PATCH 3/5] perf, x86: Basic Haswell PEBS support v4

From: Andi Kleen <[email protected]>

Add basic PEBS support for Haswell.
The constraints are similar to SandyBridge with a few new events.

v2: Readd missing pebs_aliases
v3: Readd missing hunk. Fix some constraints.
v4: Fix typo in PEBS event table (Stephane Eranian)
Reviewed-by: Stephane Eranian <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/perf_event.h | 2 +
arch/x86/kernel/cpu/perf_event_intel.c | 6 +++-
arch/x86/kernel/cpu/perf_event_intel_ds.c | 34 +++++++++++++++++++++++++++++
3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.h b/arch/x86/kernel/cpu/perf_event.h
index a974fe4..d75d0ff 100644
--- a/arch/x86/kernel/cpu/perf_event.h
+++ b/arch/x86/kernel/cpu/perf_event.h
@@ -636,6 +636,8 @@ extern struct event_constraint intel_snb_pebs_event_constraints[];

extern struct event_constraint intel_ivb_pebs_event_constraints[];

+extern struct event_constraint intel_hsw_pebs_event_constraints[];
+
struct event_constraint *intel_pebs_constraints(struct perf_event *event);

void intel_pmu_pebs_enable(struct perf_event *event);
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index bee0308..62b6872 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -877,7 +877,8 @@ static inline bool intel_pmu_needs_lbr_smpl(struct perf_event *event)
return true;

/* implicit branch sampling to correct PEBS skid */
- if (x86_pmu.intel_cap.pebs_trap && event->attr.precise_ip > 1)
+ if (x86_pmu.intel_cap.pebs_trap && event->attr.precise_ip > 1 &&
+ x86_pmu.intel_cap.pebs_format < 2)
return true;

return false;
@@ -2241,8 +2242,9 @@ __init int intel_pmu_init(void)
intel_pmu_lbr_init_snb();

x86_pmu.event_constraints = intel_hsw_event_constraints;
-
+ x86_pmu.pebs_constraints = intel_hsw_pebs_event_constraints;
x86_pmu.extra_regs = intel_snb_extra_regs;
+ x86_pmu.pebs_aliases = intel_pebs_aliases_snb;
/* all extra regs are per-cpu when HT is on */
x86_pmu.er_flags |= ERF_HAS_RSP_1;
x86_pmu.er_flags |= ERF_NO_HT_SHARING;
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index e91d7fa..e0a66f80 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -563,6 +563,40 @@ struct event_constraint intel_ivb_pebs_event_constraints[] = {
EVENT_CONSTRAINT_END
};

+struct event_constraint intel_hsw_pebs_event_constraints[] = {
+ INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
+ INTEL_UEVENT_CONSTRAINT(0x01c2, 0xf), /* UOPS_RETIRED.ALL */
+ INTEL_UEVENT_CONSTRAINT(0x02c2, 0xf), /* UOPS_RETIRED.RETIRE_SLOTS */
+ INTEL_EVENT_CONSTRAINT(0xc4, 0xf), /* BR_INST_RETIRED.* */
+ INTEL_UEVENT_CONSTRAINT(0x01c5, 0xf), /* BR_MISP_RETIRED.CONDITIONAL */
+ INTEL_UEVENT_CONSTRAINT(0x04c5, 0xf), /* BR_MISP_RETIRED.ALL_BRANCHES */
+ INTEL_UEVENT_CONSTRAINT(0x20c5, 0xf), /* BR_MISP_RETIRED.NEAR_TAKEN */
+ INTEL_EVENT_CONSTRAINT(0xcd, 0x8), /* MEM_TRANS_RETIRED.* */
+ /* MEM_UOPS_RETIRED.STLB_MISS_LOADS */
+ INTEL_UEVENT_CONSTRAINT(0x11d0, 0xf),
+ /* MEM_UOPS_RETIRED.STLB_MISS_STORES */
+ INTEL_UEVENT_CONSTRAINT(0x12d0, 0xf),
+ INTEL_UEVENT_CONSTRAINT(0x21d0, 0xf), /* MEM_UOPS_RETIRED.LOCK_LOADS */
+ INTEL_UEVENT_CONSTRAINT(0x41d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_LOADS */
+ INTEL_UEVENT_CONSTRAINT(0x42d0, 0xf), /* MEM_UOPS_RETIRED.SPLIT_STORES */
+ INTEL_UEVENT_CONSTRAINT(0x81d0, 0xf), /* MEM_UOPS_RETIRED.ALL_LOADS */
+ INTEL_UEVENT_CONSTRAINT(0x82d0, 0xf), /* MEM_UOPS_RETIRED.ALL_STORES */
+ INTEL_UEVENT_CONSTRAINT(0x01d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L1_HIT */
+ INTEL_UEVENT_CONSTRAINT(0x02d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L2_HIT */
+ INTEL_UEVENT_CONSTRAINT(0x04d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.L3_HIT */
+ INTEL_UEVENT_CONSTRAINT(0x40d1, 0xf), /* MEM_LOAD_UOPS_RETIRED.HIT_LFB */
+ /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS */
+ INTEL_UEVENT_CONSTRAINT(0x01d2, 0xf),
+ /* MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT */
+ INTEL_UEVENT_CONSTRAINT(0x02d2, 0xf),
+ /* MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM */
+ INTEL_UEVENT_CONSTRAINT(0x01d3, 0xf),
+ INTEL_UEVENT_CONSTRAINT(0x04c8, 0xf), /* HLE_RETIRED.Abort */
+ INTEL_UEVENT_CONSTRAINT(0x04c9, 0xf), /* RTM_RETIRED.Abort */
+
+ EVENT_CONSTRAINT_END
+};
+
struct event_constraint *intel_pebs_constraints(struct perf_event *event)
{
struct event_constraint *c;
--
1.7.7.6

2013-04-20 19:06:53

by Andi Kleen

Subject: [PATCH 5/5] perf, x86: Support Haswell v4 LBR format v2

From: Andi Kleen <[email protected]>

Haswell has two additional LBR from-flags for TSX, intx and abort, implemented
as a new v4 version of the LBR format.

Handle those and adjust the sign extension code to still extend correctly.
The flags are exported in the LBR record similarly to the existing
misprediction flag.

v2: Add some _
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel_lbr.c | 56 +++++++++++++++++++++++++--
include/linux/perf_event.h | 7 +++-
include/uapi/linux/perf_event.h | 5 ++-
3 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_lbr.c b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
index da02e9c..6f9b794 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
@@ -12,6 +12,16 @@ enum {
LBR_FORMAT_LIP = 0x01,
LBR_FORMAT_EIP = 0x02,
LBR_FORMAT_EIP_FLAGS = 0x03,
+ LBR_FORMAT_EIP_FLAGS2 = 0x04,
+ LBR_FORMAT_MAX_KNOWN = LBR_FORMAT_EIP_FLAGS2,
+};
+
+static enum {
+ LBR_EIP_FLAGS = 1,
+ LBR_TSX = 2,
+} lbr_desc[LBR_FORMAT_MAX_KNOWN + 1] = {
+ [LBR_FORMAT_EIP_FLAGS] = LBR_EIP_FLAGS,
+ [LBR_FORMAT_EIP_FLAGS2] = LBR_EIP_FLAGS | LBR_TSX,
};

/*
@@ -56,6 +66,8 @@ enum {
LBR_FAR)

#define LBR_FROM_FLAG_MISPRED (1ULL << 63)
+#define LBR_FROM_FLAG_IN_TX (1ULL << 62)
+#define LBR_FROM_FLAG_ABORT (1ULL << 61)

#define for_each_branch_sample_type(x) \
for ((x) = PERF_SAMPLE_BRANCH_USER; \
@@ -81,9 +93,13 @@ enum {
X86_BR_JMP = 1 << 9, /* jump */
X86_BR_IRQ = 1 << 10,/* hw interrupt or trap or fault */
X86_BR_IND_CALL = 1 << 11,/* indirect calls */
+ X86_BR_ABORT = 1 << 12,/* transaction abort */
+ X86_BR_IN_TX = 1 << 13,/* in transaction */
+ X86_BR_NO_TX = 1 << 14,/* not in transaction */
};

#define X86_BR_PLM (X86_BR_USER | X86_BR_KERNEL)
+#define X86_BR_ANYTX (X86_BR_NO_TX | X86_BR_IN_TX)

#define X86_BR_ANY \
(X86_BR_CALL |\
@@ -95,6 +111,7 @@ enum {
X86_BR_JCC |\
X86_BR_JMP |\
X86_BR_IRQ |\
+ X86_BR_ABORT |\
X86_BR_IND_CALL)

#define X86_BR_ALL (X86_BR_PLM | X86_BR_ANY)
@@ -270,21 +287,31 @@ static void intel_pmu_lbr_read_64(struct cpu_hw_events *cpuc)

for (i = 0; i < x86_pmu.lbr_nr; i++) {
unsigned long lbr_idx = (tos - i) & mask;
- u64 from, to, mis = 0, pred = 0;
+ u64 from, to, mis = 0, pred = 0, in_tx = 0, abort = 0;
+ int skip = 0;
+ int lbr_flags = lbr_desc[lbr_format];

rdmsrl(x86_pmu.lbr_from + lbr_idx, from);
rdmsrl(x86_pmu.lbr_to + lbr_idx, to);

- if (lbr_format == LBR_FORMAT_EIP_FLAGS) {
+ if (lbr_flags & LBR_EIP_FLAGS) {
mis = !!(from & LBR_FROM_FLAG_MISPRED);
pred = !mis;
- from = (u64)((((s64)from) << 1) >> 1);
+ skip = 1;
+ }
+ if (lbr_flags & LBR_TSX) {
+ in_tx = !!(from & LBR_FROM_FLAG_IN_TX);
+ abort = !!(from & LBR_FROM_FLAG_ABORT);
+ skip = 3;
}
+ from = (u64)((((s64)from) << skip) >> skip);

cpuc->lbr_entries[i].from = from;
cpuc->lbr_entries[i].to = to;
cpuc->lbr_entries[i].mispred = mis;
cpuc->lbr_entries[i].predicted = pred;
+ cpuc->lbr_entries[i].in_tx = in_tx;
+ cpuc->lbr_entries[i].abort = abort;
cpuc->lbr_entries[i].reserved = 0;
}
cpuc->lbr_stack.nr = i;
@@ -334,6 +361,16 @@ static void intel_pmu_setup_sw_lbr_filter(struct perf_event *event)

if (br_type & PERF_SAMPLE_BRANCH_IND_CALL)
mask |= X86_BR_IND_CALL;
+
+ if (br_type & PERF_SAMPLE_BRANCH_ABORT_TX)
+ mask |= X86_BR_ABORT;
+
+ if (br_type & PERF_SAMPLE_BRANCH_IN_TX)
+ mask |= X86_BR_IN_TX;
+
+ if (br_type & PERF_SAMPLE_BRANCH_NO_TX)
+ mask |= X86_BR_NO_TX;
+
/*
* stash actual user request into reg, it may
* be used by fixup code for some CPU
@@ -408,7 +445,7 @@ int intel_pmu_setup_lbr_filter(struct perf_event *event)
* decoded (e.g., text page not present), then X86_BR_NONE is
* returned.
*/
-static int branch_type(unsigned long from, unsigned long to)
+static int branch_type(unsigned long from, unsigned long to, int abort)
{
struct insn insn;
void *addr;
@@ -428,6 +465,9 @@ static int branch_type(unsigned long from, unsigned long to)
if (from == 0 || to == 0)
return X86_BR_NONE;

+ if (abort)
+ return X86_BR_ABORT | to_plm;
+
if (from_plm == X86_BR_USER) {
/*
* can happen if measuring at the user level only
@@ -564,7 +604,13 @@ intel_pmu_lbr_filter(struct cpu_hw_events *cpuc)
from = cpuc->lbr_entries[i].from;
to = cpuc->lbr_entries[i].to;

- type = branch_type(from, to);
+ type = branch_type(from, to, cpuc->lbr_entries[i].abort);
+ if (type != X86_BR_NONE && (br_sel & X86_BR_ANYTX)) {
+ if (cpuc->lbr_entries[i].in_tx)
+ type |= X86_BR_IN_TX;
+ else
+ type |= X86_BR_NO_TX;
+ }

/* if type does not correspond, then discard */
if (type == X86_BR_NONE || (br_sel & type) != type) {
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index e0373d2..466e378 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -73,13 +73,18 @@ struct perf_raw_record {
*
* support for mispred, predicted is optional. In case it
* is not supported mispred = predicted = 0.
+ *
+ * in_tx: running in a hardware transaction
+ * abort: aborting a hardware transaction
*/
struct perf_branch_entry {
__u64 from;
__u64 to;
__u64 mispred:1, /* target mispredicted */
predicted:1,/* target predicted */
- reserved:62;
+ in_tx:1, /* in transaction */
+ abort:1, /* transaction abort */
+ reserved:60;
};

/*
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index fb104e5..0b1df41 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -157,8 +157,11 @@ enum perf_branch_sample_type {
PERF_SAMPLE_BRANCH_ANY_CALL = 1U << 4, /* any call branch */
PERF_SAMPLE_BRANCH_ANY_RETURN = 1U << 5, /* any return branch */
PERF_SAMPLE_BRANCH_IND_CALL = 1U << 6, /* indirect calls */
+ PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
+ PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
+ PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */

- PERF_SAMPLE_BRANCH_MAX = 1U << 7, /* non-ABI */
+ PERF_SAMPLE_BRANCH_MAX = 1U << 10, /* non-ABI */
};

#define PERF_SAMPLE_BRANCH_PLM_ALL \
--
1.7.7.6

2013-04-20 19:07:49

by Andi Kleen

Subject: [PATCH 1/5] perf, x86: Add Haswell PEBS record support v5

From: Andi Kleen <[email protected]>

Add support for the Haswell extended (fmt2) PEBS format.

It has a superset of the nhm (fmt1) PEBS fields, but has a longer record so
we need to adjust the code paths.

The main advantage is the new "EventingRip" support, which directly gives
the address of the sampled instruction rather than the off-by-one next
instruction. So with precise == 2 we use it directly and don't try to use
LBRs and walking basic blocks. This lowers the overhead of precise sampling
significantly.

Some other features are added in later patches.

Reviewed-by: Stephane Eranian <[email protected]>
v2: Rename various identifiers. Add more comments. Get rid of a cast.
v3: fmt2->hsw rename
v4: ip_of_the_event->real_ip rename
v5: use pr_cont. white space changes.
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/perf_event.c | 3 +-
arch/x86/kernel/cpu/perf_event_intel_ds.c | 111 +++++++++++++++++++++++------
2 files changed, 92 insertions(+), 22 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event.c b/arch/x86/kernel/cpu/perf_event.c
index 5ed7a4c..21cfc52 100644
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -397,7 +397,8 @@ int x86_pmu_hw_config(struct perf_event *event)
* check that PEBS LBR correction does not conflict with
* whatever the user is asking with attr->branch_sample_type
*/
- if (event->attr.precise_ip > 1) {
+ if (event->attr.precise_ip > 1 &&
+ x86_pmu.intel_cap.pebs_format < 2) {
u64 *br_type = &event->attr.branch_sample_type;

if (has_branch_stack(event)) {
diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
index d467561..e91d7fa 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
@@ -165,6 +165,22 @@ struct pebs_record_nhm {
u64 status, dla, dse, lat;
};

+/*
+ * Same as pebs_record_nhm, with two additional fields.
+ */
+struct pebs_record_hsw {
+ struct pebs_record_nhm nhm;
+ /*
+ * Real IP of the event. In the Intel documentation this
+ * is called eventingrip.
+ */
+ u64 real_ip;
+ /*
+ * TSX tuning information field: abort cycles and abort flags.
+ */
+ u64 tsx_tuning;
+};
+
void init_debug_store_on_cpu(int cpu)
{
struct debug_store *ds = per_cpu(cpu_hw_events, cpu).ds;
@@ -696,6 +712,7 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
*/
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct pebs_record_nhm *pebs = __pebs;
+ struct pebs_record_hsw *pebs_hsw = __pebs;
struct perf_sample_data data;
struct pt_regs regs;
u64 sample_type;
@@ -752,7 +769,10 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
regs.bp = pebs->bp;
regs.sp = pebs->sp;

- if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(&regs))
+ if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format >= 2) {
+ regs.ip = pebs_hsw->real_ip;
+ regs.flags |= PERF_EFLAGS_EXACT;
+ } else if (event->attr.precise_ip > 1 && intel_pmu_pebs_fixup_ip(&regs))
regs.flags |= PERF_EFLAGS_EXACT;
else
regs.flags &= ~PERF_EFLAGS_EXACT;
@@ -805,35 +825,22 @@ static void intel_pmu_drain_pebs_core(struct pt_regs *iregs)
__intel_pmu_pebs_event(event, iregs, at);
}

-static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
+static void __intel_pmu_drain_pebs_nhm(struct pt_regs *iregs, void *at,
+ void *top)
{
struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
struct debug_store *ds = cpuc->ds;
- struct pebs_record_nhm *at, *top;
struct perf_event *event = NULL;
u64 status = 0;
- int bit, n;
-
- if (!x86_pmu.pebs_active)
- return;
-
- at = (struct pebs_record_nhm *)(unsigned long)ds->pebs_buffer_base;
- top = (struct pebs_record_nhm *)(unsigned long)ds->pebs_index;
+ int bit;

ds->pebs_index = ds->pebs_buffer_base;

- n = top - at;
- if (n <= 0)
- return;
-
- /*
- * Should not happen, we program the threshold at 1 and do not
- * set a reset value.
- */
- WARN_ONCE(n > x86_pmu.max_pebs_events, "Unexpected number of pebs records %d\n", n);
+ for ( ; at < top; at += x86_pmu.pebs_record_size) {
+ struct pebs_record_nhm *p = at;

- for ( ; at < top; at++) {
- for_each_set_bit(bit, (unsigned long *)&at->status, x86_pmu.max_pebs_events) {
+ for_each_set_bit(bit, (unsigned long *)&p->status,
+ x86_pmu.max_pebs_events) {
event = cpuc->events[bit];
if (!test_bit(bit, cpuc->active_mask))
continue;
@@ -856,6 +863,61 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
}
}

+static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
+{
+ struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+ struct debug_store *ds = cpuc->ds;
+ struct pebs_record_nhm *at, *top;
+ int n;
+
+ if (!x86_pmu.pebs_active)
+ return;
+
+ at = (struct pebs_record_nhm *)(unsigned long)ds->pebs_buffer_base;
+ top = (struct pebs_record_nhm *)(unsigned long)ds->pebs_index;
+
+ ds->pebs_index = ds->pebs_buffer_base;
+
+ n = top - at;
+ if (n <= 0)
+ return;
+
+ /*
+ * Should not happen, we program the threshold at 1 and do not
+ * set a reset value.
+ */
+ WARN_ONCE(n > x86_pmu.max_pebs_events,
+ "Unexpected number of pebs records %d\n", n);
+
+ return __intel_pmu_drain_pebs_nhm(iregs, at, top);
+}
+
+static void intel_pmu_drain_pebs_hsw(struct pt_regs *iregs)
+{
+ struct cpu_hw_events *cpuc = &__get_cpu_var(cpu_hw_events);
+ struct debug_store *ds = cpuc->ds;
+ struct pebs_record_hsw *at, *top;
+ int n;
+
+ if (!x86_pmu.pebs_active)
+ return;
+
+ at = (struct pebs_record_hsw *)(unsigned long)ds->pebs_buffer_base;
+ top = (struct pebs_record_hsw *)(unsigned long)ds->pebs_index;
+
+ n = top - at;
+ if (n <= 0)
+ return;
+ /*
+ * Should not happen, we program the threshold at 1 and do not
+ * set a reset value.
+ */
+ WARN_ONCE(n > x86_pmu.max_pebs_events,
+ "Unexpected number of pebs records %d\n", n);
+
+ return __intel_pmu_drain_pebs_nhm(iregs, at, top);
+}
+
/*
* BTS, PEBS probe and setup
*/
@@ -887,6 +949,13 @@ void intel_ds_init(void)
x86_pmu.drain_pebs = intel_pmu_drain_pebs_nhm;
break;

+ case 2:
+ pr_cont("PEBS fmt2%c, ", pebs_type);
+ x86_pmu.pebs_record_size =
+ sizeof(struct pebs_record_hsw);
+ x86_pmu.drain_pebs = intel_pmu_drain_pebs_hsw;
+ break;
+
default:
printk(KERN_CONT "no PEBS fmt%d%c, ", format, pebs_type);
x86_pmu.pebs = 0;
--
1.7.7.6

2013-04-26 06:55:15

by Ingo Molnar

Subject: Re: Basic perf PMU support for Haswell v11


* Andi Kleen <[email protected]> wrote:

> This is based on v7 of the full Haswell PMU support,
> rebased, and stripped down to the bare bones

Ok, I found some time to still squeeze this into the v3.10 x86 PMU bits
merge window but ran into problems.

You say it's barebones, yet it does not work :-( How well was this
patch-set tested on non-Haswell hardware, which makes up 99.99% of our
installed base?

In particular, after applying your patches, 'perf top' stopped working on
an Intel testbox of mine:

processor : 15
vendor_id : GenuineIntel
cpu family : 6
model : 26
model name : Intel(R) Xeon(R) CPU X55600 @ 2.80GHz
stepping : 5

'perf top' just does not produce any profiling output - it says 0 events.

> Most interesting new features are not in this patchkit

Sigh, we don't want esoteric Haswell-only features that only a very small
subset of people will use.

As I mentioned before, we want _EXISTING_ tools to work and we want the
patchset to be well-tested. Your Haswell-only extensions might be
'interesting' to you but are completely uninteresting to most users who
just want bog standard profiling to work!

Until you understand that, you will keep running into problems like this:
putting your priorities not into making existing stuff work, but into
esoteric extensions and featuritis, missing the forest for the trees...

Your stubborn incompetence is the reason why we are at version 11 of the
patch-set (!) which is _still_ trivially unacceptable and there's _still_
no Haswell support in the upstream kernel...

Thanks,

Ingo

2013-04-26 06:59:31

by Ingo Molnar

Subject: Re: Basic perf PMU support for Haswell v11


* Ingo Molnar <[email protected]> wrote:

> You say it's barebones, yet it does not work :-( How well was this
> patch-set tested on non-Haswell hardware, which makes up 99.99% of our
> installed base?
>
> In particular, after applying your patches, 'perf top' stopped working
> on an Intel testbox of mine:

The other problem I noticed was stylistic: when I applied your patches for
testing even Git complained about their cleanliness ...

To quote from Documentation/SubmittingPatches:

4) Style check your changes.

Check your patch for basic style violations, details of which can be
found in Documentation/CodingStyle. Failure to do so simply wastes
the reviewers time and will get your patch rejected, probably
without even being read.

At a minimum you should check your patches with the patch style
checker prior to submission (scripts/checkpatch.pl). You should
be able to justify all violations that remain in your patch.

Please make your patches less sloppy!

Thanks,

Ingo

2013-04-26 22:52:38

by Andi Kleen

Subject: Re: Basic perf PMU support for Haswell v11

> How well was this
> patch-set tested on non-Haswell hardware, which makes up 99.99% of our
> installed base?

I tested on a couple of systems now and then: usually Haswell and IvyBridge,
sometimes also Westmere and Atom. I don't retest every iteration;
as you know, most of the changes you're requesting don't affect
the binary.

My test bed is likely smaller than yours, though, and, as you well know,
some part of kernel QA happens after release.

>
> In particular, after applying your patches, 'perf top' stopped working on
> an Intel testbox of mine:
>
> processor : 15
> vendor_id : GenuineIntel
> cpu family : 6
> model : 26
> model name : Intel(R) Xeon(R) CPU X55600 @ 2.80GHz

I assume the second 0 is a typo?

> stepping : 5

> 'perf top' just does not produce any profiling output - it says 0 events.

Thanks for testing.

I found a similar system (not same stepping, but same model) and tested
perf top works fine here. Also on a couple of other systems.

Since I cannot reproduce I would need your help debugging it.

I assume it worked before my patches. If you don't know
please double check. Also I assume there's no general
problem between the user land perf you used and the kernel.

The only patch I could think of which may affect other systems
is the moving of the APIC ack.

So does it work if you revert

perf, x86: Move NMI clearing to end of PMI handler after ...

If that is it we could white list it for Haswell.

If that's not it I may need a bisect, assuming the problem is stable.

-Andi

2013-05-01 10:10:51

by Andi Kleen

Subject: Your action on perf bug report is requested was Re: Basic perf PMU support for Haswell v11

> I found a similar system (not same stepping, but same model) and tested
> perf top works fine here. Also on a couple of other systems.
>
> Since I cannot reproduce I would need your help debugging it.

Ingo, I haven't heard back from you on this.

You reported an unreproducible bug. I gave you several steps to
diagnose the problem, so that we can make progress on this.

You've had several days now to do this, but I have not
heard from you.

If you don't report back by 5/3/2013 I'll assume it was some
other mistake on your side.

As a reminder:

I assume it worked before my patches. If you don't know
please double check. Also I assume there's no general
problem between the user land perf you used and the kernel.

The only patch I could think of which may affect other systems
is the moving of the APIC ack.

So does it work if you revert

perf, x86: Move NMI clearing to end of PMI handler after ...

If that is it we could white list it for Haswell.

If that's not it I may need a bisect, assuming the problem is stable.

Thanks for your cooperation.

-Andi

2013-05-01 10:33:29

by Ingo Molnar

Subject: Re: Your action on perf bug report is requested was Re: Basic perf PMU support for Haswell v11


* Andi Kleen <[email protected]> wrote:

> > I found a similar system (not same stepping, but same model) and tested
> > perf top works fine here. Also on a couple of other systems.
> >
> > Since I cannot reproduce I would need your help debugging it.
>
> Ingo, I haven't heard back from you on this.

FYI, the v3.10 merge window has started 3+ days ago.

The merge window is a very busy time period for Linus and maintainers
alike, and developers should generally not expect maintainers to deal with
new experimental patches near to or especially during the merge window!

As a special exception I tried and tested your patches on Friday and
reported back to you the bug, 2 days before the opening of the merge
window - but you should not expect out of order treatment of development
patches during the merge window.

I might have time to look at your patches in a few days. No promises - I
still haven't merged all trees to Linus. I think the 11 review cycles of
your Haswell patch-set are proof enough of my willingness to deal with
your patches.

> You reported an unreproducible bug. I gave you several steps to diagnose
> the problem, so that we can make progress on this.

It's entirely reproducible here on that testbox, and was caused by your
patches - I partially bisected it to your series. (but not to a specific
patch in your series - ran out of time.)

> You've had several days now to do this, but I have not heard from
> you.

FYI, you are entirely confused about how Linux maintenance works near and
during the merge window. Many maintainers don't take any patches but only
take fixes for already applied patches. (In -tip we generally try to
freeze a week before the merge window, so your patches missed the v3.10
merge window by a wide margin I'm afraid.)

Furthermore, frankly, the nasty, demanding tone of your mail, expecting
and demanding a reply to your mail within two work days (!) is ridiculous
and unacceptable and does not make it more likely for me to make special
exceptions for your patches.

Thanks,

Ingo

2013-05-01 10:48:30

by Ingo Molnar

Subject: Re: Basic perf PMU support for Haswell v11


* Ingo Molnar <[email protected]> wrote:

>
> * Ingo Molnar <[email protected]> wrote:
>
> > You say it's barebones, yet it does not work :-( How well was this
> > patch-set tested on non-Haswell hardware, which makes up 99.99% of our
> > installed base?
> >
> > In particular, after applying your patches, 'perf top' stopped working
> > on an Intel testbox of mine:
>
> The other problem I noticed was stylistic: when I applied your patches for
> testing even Git complained about their cleanliness ...
>
> To quote from Documentation/SubmittingPatches:
>
> 4) Style check your changes.
>
> Check your patch for basic style violations, details of which can be
> found in Documentation/CodingStyle. Failure to do so simply wastes
> the reviewers time and will get your patch rejected, probably
> without even being read.
>
> At a minimum you should check your patches with the patch style
> checker prior to submission (scripts/checkpatch.pl). You should
> be able to justify all violations that remain in your patch.
>
> Please make your patches less sloppy!

Andi, you have not replied to this mail of mine.

What new measures are you taking to prevent such annoying stylistic
problems from creeping into your patches?

These problems have occurred regularly in your patches for years -
causing maintenance overhead for many maintainers, not just me.

Apparently you are not using proper tooling (checkpatch.pl for example) to
check your patches. If you refuse to take action I will have to stop
dealing with your patches directly altogether - the overhead just does not
justify the effort. You'll need to get your patches reviewed by and signed
off by a more experienced kernel hacker who knows how to submit patches.

Thanks,

Ingo

2013-05-02 08:39:08

by Ingo Molnar

Subject: Re: Basic perf PMU support for Haswell v11


[ FYI, we are still in the merge window when maintainers are very busy, so
don't expect quick replies to mails that are not about merge window
related patches and commits. Those issues are typically handled after
-rc1 has been released, once most of the merge fallout in the upstream
kernel has been resolved. ]

* Andi Kleen <[email protected]> wrote:

> > How well was this
> > patch-set tested on non-Haswell hardware, which makes up 99.99% of our
> > installed base?
>
> I tested on a couple of systems now and then: usually Haswell, IvyBridge,
> sometimes also Westmere and Atom. I don't retest every iteration;
> as you know, most of the changes you're requesting don't affect
> the binary.
>
> My test bed is likely smaller than yours, though, and as you well know,
> some part of kernel QA happens after release.
>
> >
> > In particular, after applying your patches, 'perf top' stopped working on
> > an Intel testbox of mine:
> >
> > processor : 15
> > vendor_id : GenuineIntel
> > cpu family : 6
> > model : 26
> > model name : Intel(R) Xeon(R) CPU X55600 @ 2.80GHz
>
> I assume the second 0 is a typo?

Probably a typo in the BIOS.

> > stepping : 5
>
> > 'perf top' just does not produce any profiling output - it says 0 events.
>
> Thanks for testing.
>
> I found a similar system (not same stepping, but same model) and tested
> perf top works fine here. Also on a couple of other systems.
>
> Since I cannot reproduce I would need your help debugging it.
>
> I assume it worked before my patches.

Yes, obviously.

Here's another easy to test symptom of the bug:

$ perf record ./hackbench 10
Time: 0.097
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.043 MB perf.data (~1866 samples) ]

$ perf report --stdio
Error:
The perf.data file has no samples!

Expected result is a profile displayed by 'perf report'.

> [...] If you don't know please double check. Also I assume there's no
> general problem between the user land perf you used and the kernel.
>
> The only patch I could think of which may affect other systems
> is the moving of the APIC ack.

Btw., I warned you about the delicate placement of the APIC ACK in my
Haswell patches review feedback mail, months ago:

https://lkml.org/lkml/2013/2/13/78

which mail you never replied to and which warning you apparently ignored.

When modifying the PMU ack sequence, please find the relevant Intel SDM
that recommends a different ACK sequence from what is implemented
currently, and document this in the changelog.

I'm going to ignore your APIC ACK patch until you do it properly.

> So does it work if you revert
>
> perf, x86: Move NMI clearing to end of PMI handler after ...
>
> If that is it, we could whitelist it for Haswell.

No, reverting that patch did not fix the bug.

I have bisected it down to this patch of yours:

"perf/x86: Add Haswell PMU support"

Most of that patch has no effect on non-Haswell machines, so the scope of
problematic changes should be pretty small.

My quick guess is that your patch broke fixed counters.

If you find the bug or want me to test anything please send a delta patch,
relative to your last series - as I have parts of your patches applied
already locally with cleanups, etc.

Thanks,

Ingo

2013-05-02 08:49:38

by Ingo Molnar

Subject: Re: Basic perf PMU support for Haswell v11


* Andi Kleen <[email protected]> wrote:

> v11: Rebase to perf/core. Fix extra regs. Rename INTX.

Actually, you did not do what I asked you to do - rename INTX to IN_TX.
You still kept the 'INTX' pattern, which is confusingly similar to
interrupt-related names like 'INT3'.

I still see, even your latest patch, this:

+#define HSW_INTX (1ULL << 32)
+#define HSW_INTX_CHECKPOINTED (1ULL << 33)

Why are you ignoring maintainer requests, repeatedly?

It wouldn't be a big issue and I'd do the rename myself if this weren't a
repeat pattern of passive-aggressive obstruction from you spanning several
years, resisting or obstructing maintainer feedback whenever you can ...

Thanks,

Ingo

2013-09-03 19:24:28

by Vince Weaver

Subject: Re: [PATCH 5/5] perf, x86: Support Haswell v4 LBR format v2

On Sat, 20 Apr 2013, Andi Kleen wrote:

> From: Andi Kleen <[email protected]>
>
> Haswell has two additional flags in the LBR FROM field for TSX: intx and
> abort, implemented as a new v4 version of the LBR format.
>
> Handle those and adjust the sign extension code to still extend
> correctly. The flags are exported in the LBR record similarly to the
> existing misprediction flag.

I'm trying to update the perf_event_open() manpage for the new changes
that were in Linux 3.11 and am having trouble getting info on exactly
what these new fields mean.

> + PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
> + PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
> + PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */

so if you specify these flags in branch_sample_type, what information
appears in the branch record?

If you get an abort, what address appears in the record?

What does it mean in regards to a branch entry to be or not be in a
transaction?

If you set "in transaction" does that then only record branches that are
in transactions? What happens if you set both in transaction and not in?

Is there some sort of document from intel you can link to that describes
all of this?

Thanks,

Vince

2013-09-03 20:28:49

by Andi Kleen

Subject: Re: [PATCH 5/5] perf, x86: Support Haswell v4 LBR format v2

> > + PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
> > + PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
> > + PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */
>
> so if you specify these flags in branch_sample_type, what information
> appears in the branch record?

This is just a filter: when set, branches that do not satisfy
the filter are not reported.

The patches to export the new fields haven't been merged yet.

>
> If you get an abort, what address appears in the record?

An abort is recorded as a jump from the abort point to the abort handler
or to the HLE XACQUIRE lock instruction.

> What does it mean in regards to a branch entry to be or not be in a
> transaction?

When the branch is executed the CPU is in the transactional execution
state.

>
> If you set "in transaction" does that then only record branches that are
> in transactions?

Yes, like all the other branch filter flags.

> What happens if you set both in transaction and not in?

Then you get all branches.

>
> Is there some sort of document from intel you can link to that describes
> all of this?

http://download.intel.com/products/processor/manual/253669.pdf
Chapter 17.8

-Andi
--
[email protected] -- Speaking for myself only

2013-09-03 21:13:24

by Vince Weaver

Subject: Re: [PATCH 5/5] perf, x86: Support Haswell v4 LBR format v2

On Tue, 3 Sep 2013, Andi Kleen wrote:

> > > + PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
> > > + PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
> > > + PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */
> >
> > so if you specify these flags in branch_sample_type, what information
> > appears in the branch record?
>
> This is just a filter, so when set branches that do not satisfy
> the filter are not reported.

Is the implementation a direct mapping to the LBR documentation or has it
been generic so non-Intel architectures can use it?

> The patches to export the new fields haven't been merged yet.

What does this mean? The above values are exported as part of
include/uapi/linux/perf_event.h.
Do they not work yet?

> > What happens if you set both in transaction and not in?
>
> Then you get all branches.

so what happens if you set neither "PERF_SAMPLE_BRANCH_IN_TX" nor
"PERF_SAMPLE_BRANCH_NO_TX"? Logically you'd get no branches at all,
but that can't be true as all code prior to 3.11 didn't set those values.

Vince

2013-09-03 22:37:55

by Andi Kleen

Subject: Re: [PATCH 5/5] perf, x86: Support Haswell v4 LBR format v2

On Tue, Sep 03, 2013 at 05:14:51PM -0400, Vince Weaver wrote:
> On Tue, 3 Sep 2013, Andi Kleen wrote:
>
> > > > + PERF_SAMPLE_BRANCH_ABORT_TX = 1U << 7, /* transaction aborts */
> > > > + PERF_SAMPLE_BRANCH_IN_TX = 1U << 8, /* in transaction */
> > > > + PERF_SAMPLE_BRANCH_NO_TX = 1U << 9, /* not in transaction */
> > >
> > > so if you specify these flags in branch_sample_type, what information
> > > appears in the branch record?
> >
> > This is just a filter, so when set branches that do not satisfy
> > the filter are not reported.
>
> Is the implementation a direct mapping to the LBR documentation or has it
> been generic so non-Intel architectures can use it?

It's not a direct mapping (no_tx doesn't exist in the hardware).
If other architectures have similar capabilities they can likely use it.

>
> > The patches to export the new fields haven't been merged yet.
>
> What does this mean? The above values are exported as part of
> include/uapi/linux/perf_event.h
> Do they not work yet?

You can filter on the fields, but you can't see them outside
the kernel driver yet. The patch to see them is still pending.
>
> > > What happens if you set both in transaction and not in?
> >
> > Then you get all branches.
>
> so what happens if you set neither "PERF_SAMPLE_BRANCH_IN_TX" nor
> "PERF_SAMPLE_BRANCH_NO_TX"? Logically you'd get no branches at all,
> but that can't be true as all code prior to 3.11 didn't set those values.

Then you get all branches too.

(That's how all the other filters work too.)

-Andi

--
[email protected] -- Speaking for myself only.

2013-09-04 14:20:01

by Vince Weaver

Subject: Re: [PATCH 5/5] perf, x86: Support Haswell v4 LBR format v2

On Wed, 4 Sep 2013, Andi Kleen wrote:

> > What does this mean? The above values are exported as part of
> > include/uapi/linux/perf_event.h
> > Do they not work yet?
>
> You can filter on the fields, but you can't see them outside
> the kernel driver yet. The patch to see them is still pending.

so you can filter for aborts, but they'll never show up in the lbr[]
sample buffer?

> > > > What happens if you set both in transaction and not in?
> > >
> > > Then you get all branches.
> >
> > so what happens if you set neither "PERF_SAMPLE_BRANCH_IN_TX" nor
> > "PERF_SAMPLE_BRANCH_NO_TX"? Logically you'd get no branches at all,
> > but that can't be true as all code prior to 3.11 didn't set those values.
>
> Then you get all branches too
>
> (that's how all the other filters work too)

This is a really confusing API

so does setting "PERF_SAMPLE_BRANCH_ANY" also enable all of the TX types?

Is leaving branch_sample_type at 0 the same as setting it to all 1s?

Vince

2013-09-04 17:05:06

by Andi Kleen

Subject: Re: [PATCH 5/5] perf, x86: Support Haswell v4 LBR format v2

On Wed, Sep 04, 2013 at 10:21:27AM -0400, Vince Weaver wrote:
> On Wed, 4 Sep 2013, Andi Kleen wrote:
>
> > > What does this mean? The above values are exported as part of
> > > include/uapi/linux/perf_event.h
> > > Do they not work yet?
> >
> > You can filter on the fields, but you can't see them outside
> > the kernel driver yet. The patch to see them is still pending.
>
> so you can filter for aborts, but they'll never show up in the lbr[]
> sample buffer?

They will show up, you just don't know that they are aborts
because the two new status bits are not exported.

>
> > > > > What happens if you set both in transaction and not in?
> > > >
> > > > Then you get all branches.
> > >
> > > so what happens if you set neither "PERF_SAMPLE_BRANCH_IN_TX" nor
> > > "PERF_SAMPLE_BRANCH_NO_TX"? Logically you'd get no branches at all,
> > > but that can't be true as all code prior to 3.11 didn't set those values.
> >
> > Then you get all branches too
> >
> > (that's how all the other filters work too)
>
> This is a really confusing API
>
> so does setting "PERF_SAMPLE_BRANCH_ANY" also enable all of the TX types?
>
> Is leaving branch_sample_type at 0 the same as setting it to all 1s?

I believe so.

It may also be that the catch-all only works if everything is 0.

-Andi
--
[email protected] -- Speaking for myself only