From: Andi Kleen <[email protected]>
Very similar to Sandy Bridge, but there is no PEBS problem.
As Stephane pointed out, the .code=0xb1, .umask=0x01 event is gone, so don't
set up a generic backend stall event on IvyBridge.
v2: Remove stall event
v3: Fork init code from Sandy Bridge
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel.c | 23 +++++++++++++++++++++--
1 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 187c294..abb29c2 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1911,7 +1911,6 @@ __init int intel_pmu_init(void)
case 42: /* SandyBridge */
case 45: /* SandyBridge, "Romely-EP" */
x86_add_quirk(intel_sandybridge_quirk);
- case 58: /* IvyBridge */
memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
@@ -1928,11 +1927,31 @@ __init int intel_pmu_init(void)
/* UOPS_ISSUED.ANY,c=1,i=1 to count stall cycles */
intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =
X86_CONFIG(.event=0x0e, .umask=0x01, .inv=1, .cmask=1);
+ pr_cont("SandyBridge events, ");
+ break;
+
+ case 58: /* IvyBridge */
/* UOPS_DISPATCHED.THREAD,c=1,i=1 to count stall cycles*/
intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
X86_CONFIG(.event=0xb1, .umask=0x01, .inv=1, .cmask=1);
- pr_cont("SandyBridge events, ");
+ memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
+ sizeof(hw_cache_event_ids));
+
+ intel_pmu_lbr_init_snb();
+
+ x86_pmu.event_constraints = intel_snb_event_constraints;
+ x86_pmu.pebs_constraints = intel_snb_pebs_event_constraints;
+ x86_pmu.extra_regs = intel_snb_extra_regs;
+ /* all extra regs are per-cpu when HT is on */
+ x86_pmu.er_flags |= ERF_HAS_RSP_1;
+ x86_pmu.er_flags |= ERF_NO_HT_SHARING;
+
+ /* UOPS_ISSUED.ANY,c=1,i=1 to count stall cycles */
+ intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =
+ X86_CONFIG(.event=0x0e, .umask=0x01, .inv=1, .cmask=1);
+ /* No backend stall event */
+ pr_cont("IvyBridge events, ");
break;
default:
--
1.7.7.6
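For reference, the stall-cycle events above are programmed through X86_CONFIG();
the following minimal user-space sketch shows how those fields pack into a raw
PERFEVTSEL value, assuming the standard Intel layout (event select in bits 0-7,
umask in bits 8-15, INV in bit 23, CMASK in bits 24-31). raw_config() is a local
helper written for this sketch, not a kernel function.

#include <stdio.h>
#include <stdint.h>

/* Local helper, not a kernel function: pack the fields the way
 * X86_CONFIG() does, assuming the standard PERFEVTSEL bit layout. */
static uint64_t raw_config(uint8_t event, uint8_t umask, int inv, uint8_t cmask)
{
	return (uint64_t)event |
	       ((uint64_t)umask << 8) |
	       ((inv ? 1ULL : 0ULL) << 23) |
	       ((uint64_t)cmask << 24);
}

int main(void)
{
	/* UOPS_ISSUED.ANY with cmask=1,inv=1: cycles in which no uop was
	 * issued, i.e. the front-end stall cycles used above. */
	printf("frontend stalls: %#llx\n",
	       (unsigned long long)raw_config(0x0e, 0x01, 1, 1));
	/* The disputed backend stall encoding, event 0xb1, umask 0x01. */
	printf("backend stalls:  %#llx\n",
	       (unsigned long long)raw_config(0xb1, 0x01, 1, 1));
	return 0;
}

This only shows the encoding arithmetic; whether the 0xb1/0x01 event itself
still exists on IvyBridge is exactly what is debated later in the thread.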
From: Andi Kleen <[email protected]>
Even with precise profiling, Intel CPUs have a "skid": the sample
triggers a few cycles after the instruction, so in some cases there can
be systematic errors where expensive instructions never show up in the
profile log.
Sandy Bridge added a new PDIR (precise distribution of instructions
retired) event that randomizes the sampling slightly. This corrects for
systematic errors, so that in most cases you should see the correct
instruction getting the profile hits.
Unfortunately the SandyBridge version could only work with an otherwise
quiescent CPU and was difficult to use. On IvyBridge this restriction is
gone and the event can be used more widely.
This only works for retired instructions.
I enabled it -- somewhat arbitrarily -- for two 'p's or more.
To use it:
perf record -e instructions:pp ...
This provides a more precise alternative to the usual cycles:pp;
however, it will not account for expensive instructions.
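As an illustration of the remap this enables, here is a minimal user-space
sketch of the arithmetic. RAW_EVENT_MASK is a local stand-in assumed for the
kernel's X86_RAW_EVENT_MASK, and the bit positions are the standard PERFEVTSEL
layout; only the event-selection fields are swapped for the PDIR encoding,
while flag bits such as USR/OS are preserved.

#include <stdio.h>
#include <stdint.h>

/* Local stand-in for X86_RAW_EVENT_MASK: event select, umask, edge,
 * inv and cmask fields of PERFEVTSEL (an assumption for this sketch). */
#define RAW_EVENT_MASK (0xffULL | 0xff00ULL | (1ULL << 18) | \
			(1ULL << 23) | (0xffULL << 24))

int main(void)
{
	/* Generic "instructions" (INST_RETIRED.ANY, event 0xc0) with the
	 * USR bit (bit 16) set, as the perf core would do. */
	uint64_t config = 0xc0 | (1ULL << 16);
	uint64_t pdir = 0x01c0;		/* INST_RETIRED.PREC_DIST (PDIR) */

	if ((config & RAW_EVENT_MASK) == 0xc0)
		config = pdir | (config & ~RAW_EVENT_MASK);

	/* Prints 0x101c0: event 0xc0, umask 0x01, USR bit preserved. */
	printf("remapped config: %#llx\n", (unsigned long long)config);
	return 0;
}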
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel.c | 23 +++++++++++++++++++++++
1 files changed, 23 insertions(+), 0 deletions(-)
diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index abb29c2..886d124 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -1425,6 +1425,28 @@ static int intel_pmu_hw_config(struct perf_event *event)
return 0;
}
+static int pdir_hw_config(struct perf_event *event)
+{
+ int err = intel_pmu_hw_config(event);
+
+ if (err)
+ return err;
+
+ /*
+	 * Use the PDIR instruction retired counter for two or more 'p's.
+ * This will randomize samples slightly and avoid some systematic
+ * measurement errors.
+	 * Only works for retired instructions.
+ */
+ if (event->attr.precise_ip >= 2 &&
+ (event->hw.config & X86_RAW_EVENT_MASK) == 0xc0) {
+ u64 pdir_event = X86_CONFIG(.event=0xc0, .umask=1);
+ event->hw.config = pdir_event | (event->hw.config & ~X86_RAW_EVENT_MASK);
+ }
+
+ return 0;
+}
+
struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
{
if (x86_pmu.guest_get_msrs)
@@ -1943,6 +1965,7 @@ __init int intel_pmu_init(void)
x86_pmu.event_constraints = intel_snb_event_constraints;
x86_pmu.pebs_constraints = intel_snb_pebs_event_constraints;
x86_pmu.extra_regs = intel_snb_extra_regs;
+ x86_pmu.hw_config = pdir_hw_config;
/* all extra regs are per-cpu when HT is on */
x86_pmu.er_flags |= ERF_HAS_RSP_1;
x86_pmu.er_flags |= ERF_NO_HT_SHARING;
--
1.7.7.6
On Wed, 2012-06-13 at 12:20 -0700, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> Very similar to Sandy Bridge, but there is no PEBS problem.
>
> As Stephane pointed out, the .code=0xb1, .umask=0x01 event is gone, so don't
> set up a generic backend stall event on IvyBridge.
>
> v2: Remove stall event
> v3: Fork init code from Sandy Bridge
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> arch/x86/kernel/cpu/perf_event_intel.c | 23 +++++++++++++++++++++--
> 1 files changed, 21 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index 187c294..abb29c2 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -1911,7 +1911,6 @@ __init int intel_pmu_init(void)
> case 42: /* SandyBridge */
> case 45: /* SandyBridge, "Romely-EP" */
> x86_add_quirk(intel_sandybridge_quirk);
> - case 58: /* IvyBridge */
> memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
> sizeof(hw_cache_event_ids));
>
> @@ -1928,11 +1927,31 @@ __init int intel_pmu_init(void)
> /* UOPS_ISSUED.ANY,c=1,i=1 to count stall cycles */
> intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =
> X86_CONFIG(.event=0x0e, .umask=0x01, .inv=1, .cmask=1);
> + pr_cont("SandyBridge events, ");
> + break;
> +
> + case 58: /* IvyBridge */
> /* UOPS_DISPATCHED.THREAD,c=1,i=1 to count stall cycles*/
> intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
> X86_CONFIG(.event=0xb1, .umask=0x01, .inv=1, .cmask=1);
>
> - pr_cont("SandyBridge events, ");
> + memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
> + sizeof(hw_cache_event_ids));
> +
> + intel_pmu_lbr_init_snb();
> +
> + x86_pmu.event_constraints = intel_snb_event_constraints;
> + x86_pmu.pebs_constraints = intel_snb_pebs_event_constraints;
> + x86_pmu.extra_regs = intel_snb_extra_regs;
> + /* all extra regs are per-cpu when HT is on */
> + x86_pmu.er_flags |= ERF_HAS_RSP_1;
> + x86_pmu.er_flags |= ERF_NO_HT_SHARING;
> +
> + /* UOPS_ISSUED.ANY,c=1,i=1 to count stall cycles */
> + intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =
> + X86_CONFIG(.event=0x0e, .umask=0x01, .inv=1, .cmask=1);
> + /* No backend stall event */
> + pr_cont("IvyBridge events, ");
> break;
>
> default:
I really don't see the point of this patch... it is in fact using the
SandyBridge events. Also, you appear to have removed the backend stall event,
which per the SDM (May 2012) table 19-2 would be the very same.
On Wed, 2012-06-13 at 12:20 -0700, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> Even with precise profiling, Intel CPUs have a "skid": the sample
> triggers a few cycles after the instruction, so in some cases there can
> be systematic errors where expensive instructions never show up in the
> profile log.
>
> Sandy Bridge added a new PDIR (precise distribution of instructions
> retired) event that randomizes the sampling slightly. This corrects for
> systematic errors, so that in most cases you should see the correct
> instruction getting the profile hits.
>
> Unfortunately the SandyBridge version could only work with an otherwise
> quiescent CPU and was difficult to use. On IvyBridge this restriction is
> gone and the event can be used more widely.
>
> This only works for retired instructions.
>
> I enabled it -- somewhat arbitrarily -- for two 'p's or more.
>
> To use it:
>
> perf record -e instructions:pp ...
>
> This provides a more precise alternative to the usual cycles:pp;
> however, it will not account for expensive instructions.
This patch is just wrong on too many levels... where do you want me to
start? Let's go with the restriction you mention being lifted: the SDM
doesn't mention this, nor does the patch actually lift it.
> This patch is just wrong on too many levels... where do you want me to
> start? Let's go with the restriction you mention being lifted: the SDM
> doesn't mention this,
The SDM is wrong. The restriction is Sandy Bridge only.
>nor does the patch actually lift it.
Why not? It works for me on an Ivy Bridge.
-Andi
--
[email protected] -- Speaking for myself only
On Wed, 2012-06-13 at 14:36 -0700, Andi Kleen wrote:
> > This patch is just wrong on too many levels... where do you want me to
> > start? Let's go with the restriction you mention being lifted: the SDM
> > doesn't mention this,
>
> The SDM is wrong. The restriction is Sandy Bridge only.
It would be good to mention this -- at the very least until the SDM is
revised.
> >nor does the patch actually lift it.
>
> Why not? It works for me on an Ivy Bridge.
I'm very sure you didn't test it properly then. Clearly you need a hint:
struct event_constraint intel_snb_pebs_event_constraints[] = {
INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
Is still in effect, isn't it..
> > >nor does the patch actually lift it.
> >
> > Why not? It works for me on an Ivy Bridge.
>
> I'm very sure you didn't test it properly then. Clearly you need a hint:
>
> struct event_constraint intel_snb_pebs_event_constraints[] = {
> INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), /* INST_RETIRED.PRECDIST */
>
> Is still in effect, isn't it..
Yes it's in effect, and it forces the event to counter 1.
That is correct and that restriction is still there.
What is gone is just the restriction to quiesce the whole PMU.
Also, without that we would refuse to enable PEBS anyway, I believe.
-Andi
--
[email protected] -- Speaking for myself only
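To illustrate why that constraint forces the event onto counter 1: the second
argument of INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2) is a bitmask of allowed
general-purpose counters, and 0x2 has only bit 1 set. A rough sketch follows;
struct sketch_constraint and its field names are stand-ins for illustration,
not the kernel's struct event_constraint.

#include <stdio.h>

/* Stand-in for the kernel's event constraint (names are made up for
 * this sketch): 'code' matches event select | (umask << 8), 'idxmsk'
 * is a bitmask of counters the event may be scheduled on. */
struct sketch_constraint {
	unsigned int code;
	unsigned long idxmsk;
};

int main(void)
{
	/* Mirrors INTEL_UEVENT_CONSTRAINT(0x01c0, 0x2), INST_RETIRED.PRECDIST */
	struct sketch_constraint c = { 0x01c0, 0x2 };
	int i;

	for (i = 0; i < 4; i++)
		if (c.idxmsk & (1UL << i))
			printf("event %#x may use counter %d\n", c.code, i);
	/* Prints only "event 0x1c0 may use counter 1". */
	return 0;
}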
On Wed, 2012-06-13 at 14:54 -0700, Andi Kleen wrote:
> Yes it's in effect, and it forces the event to counter 1.
> That is correct and that restriction is still there.
OK, then we cannot magically replace 'instructions' with pdir.
On Wed, Jun 13, 2012 at 11:58:58PM +0200, Peter Zijlstra wrote:
> On Wed, 2012-06-13 at 14:54 -0700, Andi Kleen wrote:
> > Yes it's in effect, and it forces the event to counter 1.
> > That is correct and that restriction is still there.
>
> OK, then we cannot magically replace 'instructions' with pdir.
Why not? You want a new event?
It's still retired instructions, just sampled in a slightly different way.
Today instructions:pp just errors out BTW, so nothing should rely on it.
-Andi
--
[email protected] -- Speaking for myself only
> I really don't see the point of this patch... it is in fact using the
> SandyBridge events.
It needs a separate path for the PDIR change, for not doing the PEBS
workaround, and for printing Ivy instead of Sandy. In v1 I had it merged
with a goto, but Ingo thought that was obfuscated, so it became a separate
block. If you disagree with Ingo, please take it up with him.
> Also, you appear to have removed the backend stall event,
> which per the SDM (May 2012) table 19-2 would be the very same.
Stephane said that event doesn't exist anymore, and I agree with him.
You cannot just base it on the SDM; it's the SDM plus whatever is
currently known.
Besides, they are misleading and wrong in any case.
-Andi
--
[email protected] -- Speaking for myself only.