2013-07-18 23:04:09

by Andi Kleen

Subject: [PATCH] perf, x86: Enable PEBS mode automatically for mem-{loads,stores} v3

From: Andi Kleen <[email protected]>

[The patch to enable this in the user tools has been sent separately]

With the earlier patches to automatically try cpu// and add
a precise sys attribute, we can now enable PEBS for the mem-loads,
mem-stores events everywhere.

This allows to use

perf record -e mem-loads ...

instead of

perf record -e cpu/mem-loads/p ...

Always use precise=2 even though it is costly pre-Haswell

Cc: [email protected]
v2: Different white space
v3: Always use precise=2, as people seem to think overhead doesn't matter.
Signed-off-by: Andi Kleen <[email protected]>
---
arch/x86/kernel/cpu/perf_event_intel.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index fbc9210..ef9236b 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -176,9 +176,12 @@ static struct extra_reg intel_snbep_extra_regs[] __read_mostly = {
EVENT_EXTRA_END
};

-EVENT_ATTR_STR(mem-loads, mem_ld_nhm, "event=0x0b,umask=0x10,ldlat=3");
-EVENT_ATTR_STR(mem-loads, mem_ld_snb, "event=0xcd,umask=0x1,ldlat=3");
-EVENT_ATTR_STR(mem-stores, mem_st_snb, "event=0xcd,umask=0x2");
+EVENT_ATTR_STR(mem-loads, mem_ld_nhm,
+ "event=0x0b,umask=0x10,ldlat=3,precise=2");
+EVENT_ATTR_STR(mem-loads, mem_ld_snb,
+ "event=0xcd,umask=0x1,ldlat=3,precise=2");
+EVENT_ATTR_STR(mem-stores, mem_st_snb,
+ "event=0xcd,umask=0x2,precise=2");

struct attribute *nhm_events_attrs[] = {
EVENT_PTR(mem_ld_nhm),
@@ -2034,8 +2037,9 @@ static __init void intel_nehalem_quirk(void)
}
}

-EVENT_ATTR_STR(mem-loads, mem_ld_hsw, "event=0xcd,umask=0x1,ldlat=3");
-EVENT_ATTR_STR(mem-stores, mem_st_hsw, "event=0xd0,umask=0x82")
+EVENT_ATTR_STR(mem-loads, mem_ld_hsw,
+ "event=0xcd,umask=0x1,ldlat=3,precise=2");
+EVENT_ATTR_STR(mem-stores, mem_st_hsw, "event=0xd0,umask=0x82,precise=2")

static struct attribute *hsw_events_attrs[] = {
EVENT_PTR(mem_ld_hsw),
--
1.8.1.4


2013-07-19 07:46:49

by Ingo Molnar

Subject: Re: [PATCH] perf, x86: Enable PEBS mode automatically for mem-{loads,stores} v3


* Andi Kleen <[email protected]> wrote:

> From: Andi Kleen <[email protected]>
>
> [The patch to enable this in the user tools has been sent separately]
>
> With the earlier patches to automatically try cpu// and add
> a precise sys attribute, we can now enable PEBS for the mem-loads,
> mem-stores events everywhere.
>
> This allows to use
>
> perf record -e mem-loads ...
>
> instead of
>
> perf record -e cpu/mem-loads/p ...
>
> Always use precise=2 even though it is costly pre-Haswell
>
> Cc: [email protected]
> v2: Different white space
> v3: Always use precise=2, as people seem to think overhead doesn't matter.
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> arch/x86/kernel/cpu/perf_event_intel.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
> index fbc9210..ef9236b 100644
> --- a/arch/x86/kernel/cpu/perf_event_intel.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel.c
> @@ -176,9 +176,12 @@ static struct extra_reg intel_snbep_extra_regs[] __read_mostly = {
> EVENT_EXTRA_END
> };
>
> -EVENT_ATTR_STR(mem-loads, mem_ld_nhm, "event=0x0b,umask=0x10,ldlat=3");
> -EVENT_ATTR_STR(mem-loads, mem_ld_snb, "event=0xcd,umask=0x1,ldlat=3");
> -EVENT_ATTR_STR(mem-stores, mem_st_snb, "event=0xcd,umask=0x2");
> +EVENT_ATTR_STR(mem-loads, mem_ld_nhm,
> + "event=0x0b,umask=0x10,ldlat=3,precise=2");
> +EVENT_ATTR_STR(mem-loads, mem_ld_snb,
> + "event=0xcd,umask=0x1,ldlat=3,precise=2");
> +EVENT_ATTR_STR(mem-stores, mem_st_snb,
> + "event=0xcd,umask=0x2,precise=2");

Note that here while checkpatch.pl warns about an overlong line, it's
pointless to break the line because the result is not improved. Just keep
the line overlong in such cases.

checkpatch is a discretionary tool: if it warns then check the place,
improve it if checkpatch is right and if an improvement is possible - don't
make the code harder to read just to placate the checkpatch warning.

> @@ -2034,8 +2037,9 @@ static __init void intel_nehalem_quirk(void)
> }
> }
>
> -EVENT_ATTR_STR(mem-loads, mem_ld_hsw, "event=0xcd,umask=0x1,ldlat=3");
> -EVENT_ATTR_STR(mem-stores, mem_st_hsw, "event=0xd0,umask=0x82")
> +EVENT_ATTR_STR(mem-loads, mem_ld_hsw,
> + "event=0xcd,umask=0x1,ldlat=3,precise=2");

Ditto.

Thanks,

Ingo

2013-07-23 08:38:48

by Peter Zijlstra

Subject: Re: [PATCH] perf, x86: Enable PEBS mode automatically for mem-{loads,stores} v3

On Thu, Jul 18, 2013 at 04:03:39PM -0700, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> [The patch to enable this in the user tools has been sent separately]
>
> With the earlier patches to automatically try cpu// and add
> a precise sys attribute, we can now enable PEBS for the mem-loads,
> mem-stores events everywhere.
>
> This allows to use
>
> perf record -e mem-loads ...
>
> instead of
>
> perf record -e cpu/mem-loads/p ...
>
> Always use precise=2 even though it is costly pre-Haswell

This Changelog fails to give a reason _why_ we'd want to do this.

2013-07-23 16:13:37

by Andi Kleen

Subject: Re: [PATCH] perf, x86: Enable PEBS mode automatically for mem-{loads,stores} v3

On Tue, Jul 23, 2013 at 10:38:34AM +0200, Peter Zijlstra wrote:
> On Thu, Jul 18, 2013 at 04:03:39PM -0700, Andi Kleen wrote:
> > From: Andi Kleen <[email protected]>
> >
> > [The patch to enable this in the user tools has been sent separately]
> >
> > With the earlier patches to automatically try cpu// and add
> > a precise sys attribute, we can now enable PEBS for the mem-loads,
> > mem-stores events everywhere.
> >
> > This allows to use
> >
> > perf record -e mem-loads ...
> >
> > instead of
> >
> > perf record -e cpu/mem-loads/p ...
> >
> > Always use precise=2 even though it is costly pre-Haswell
>
> This Changelog fails to give a reason _why_ we'd want to do this.

The first is much nicer to type and understand? Just in the spirit of
making perf easier to use.

-andi
--
[email protected] -- Speaking for myself only.

2013-07-23 16:57:59

by Peter Zijlstra

Subject: Re: [PATCH] perf, x86: Enable PEBS mode automatically for mem-{loads,stores} v3

On Tue, Jul 23, 2013 at 06:13:34PM +0200, Andi Kleen wrote:
> On Tue, Jul 23, 2013 at 10:38:34AM +0200, Peter Zijlstra wrote:
> > On Thu, Jul 18, 2013 at 04:03:39PM -0700, Andi Kleen wrote:
> > > From: Andi Kleen <[email protected]>
> > >
> > > [The patch to enable this in the user tools has been sent separately]
> > >
> > > With the earlier patches to automatically try cpu// and add
> > > a precise sys attribute, we can now enable PEBS for the mem-loads,
> > > mem-stores events everywhere.
> > >
> > > This allows to use
> > >
> > > perf record -e mem-loads ...
> > >
> > > instead of
> > >
> > > perf record -e cpu/mem-loads/p ...
> > >
> > > Always use precise=2 even though it is costly pre-Haswell
> >
> > This Changelog fails to give a reason _why_ we'd want to do this.
>
> The first is much nicer to type and understand? Just in the spirit of
> making perf easier to use.

And here I was thinking that maybe these events don't make sense without
pebs or so. But no, rather than giving an actual useful reason you'd
have me look things up myself. *sigh*

2013-07-23 17:46:18

by Stephane Eranian

Subject: Re: [PATCH] perf, x86: Enable PEBS mode automatically for mem-{loads,stores} v3

On Tue, Jul 23, 2013 at 6:57 PM, Peter Zijlstra <[email protected]> wrote:
> On Tue, Jul 23, 2013 at 06:13:34PM +0200, Andi Kleen wrote:
>> On Tue, Jul 23, 2013 at 10:38:34AM +0200, Peter Zijlstra wrote:
>> > On Thu, Jul 18, 2013 at 04:03:39PM -0700, Andi Kleen wrote:
>> > > From: Andi Kleen <[email protected]>
>> > >
>> > > [The patch to enable this in the user tools has been sent separately]
>> > >
>> > > With the earlier patches to automatically try cpu// and add
>> > > a precise sys attribute, we can now enable PEBS for the mem-loads,
>> > > mem-stores events everywhere.
>> > >
>> > > This allows to use
>> > >
>> > > perf record -e mem-loads ...
>> > >
>> > > instead of
>> > >
>> > > perf record -e cpu/mem-loads/p ...
>> > >
>> > > Always use precise=2 even though it is costly pre-Haswell
>> >
>> > This Changelog fails to give a reason _why_ we'd want to do this.
>>
>> The first is much nicer to type and understand? Just in the spirit of
>> making perf easier to use.
>
> And here I was thinking that maybe these events don't make sense without
> pebs or so. But no, rather than giving an actual useful reason you'd
> have me look things up myself. *sigh*

The load events using LATENCY_ABOVE_THRESHOLD do not count anything
without PEBS (that's for all processors pre-Haswell).

As for forcing precise=2, I think that is what people would expect,
i.e., point me to the load/store instruction. Experts can still force
precise=1 because I think the parser uses the value of the last
precise= instance.
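
[Illustration of the "last precise= instance wins" behaviour described
above. This is only a minimal sketch under stated assumptions: the helper
last_term_value() is hypothetical and is not the perf tool's actual term
parser. It scans a comma-separated event string of the kind used in the
patch and lets a later key=value term overwrite an earlier one.]

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (NOT perf's real parser): scan a comma-separated
 * "key=value" event string and return the value of the LAST occurrence
 * of `key`, or `fallback` if the key never appears.
 */
static long last_term_value(const char *spec, const char *key, long fallback)
{
	size_t keylen = strlen(key);
	long value = fallback;
	const char *p = spec;

	while (*p) {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		/* Match "key=" exactly at the start of this term. */
		if (len > keylen && strncmp(p, key, keylen) == 0 &&
		    p[keylen] == '=')
			/* Base 0 accepts both decimal and 0x-prefixed values. */
			value = strtol(p + keylen + 1, NULL, 0);

		/* Advance past this term and its trailing comma, if any. */
		p += len;
		if (*p == ',')
			p++;
	}
	return value;
}
```

[With this sketch, the string "event=0xcd,umask=0x1,ldlat=3,precise=2"
yields precise=2, and appending ",precise=1" to the same string yields
precise=1, which is the override an expert would rely on if the real
parser behaves as assumed.]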

2013-07-23 19:14:16

by Andi Kleen

Subject: Re: [PATCH] perf, x86: Enable PEBS mode automatically for mem-{loads,stores} v3

On Tue, Jul 23, 2013 at 06:57:39PM +0200, Peter Zijlstra wrote:
> On Tue, Jul 23, 2013 at 06:13:34PM +0200, Andi Kleen wrote:
> > On Tue, Jul 23, 2013 at 10:38:34AM +0200, Peter Zijlstra wrote:
> > > On Thu, Jul 18, 2013 at 04:03:39PM -0700, Andi Kleen wrote:
> > > > From: Andi Kleen <[email protected]>
> > > >
> > > > [The patch to enable this in the user tools has been sent separately]
> > > >
> > > > With the earlier patches to automatically try cpu// and add
> > > > a precise sys attribute, we can now enable PEBS for the mem-loads,
> > > > mem-stores events everywhere.
> > > >
> > > > This allows to use
> > > >
> > > > perf record -e mem-loads ...
> > > >
> > > > instead of
> > > >
> > > > perf record -e cpu/mem-loads/p ...
> > > >
> > > > Always use precise=2 even though it is costly pre-Haswell
> > >
> > > This Changelog fails to give a reason _why_ we'd want to do this.
> >
> > The first is much nicer to type and understand? Just in the spirit of
> > making perf easier to use.
>
> And here I was thinking that maybe these events don't make sense without
> pebs or so. But no, rather than giving an actual useful reason you'd
> have me look things up myself. *sigh*

You can use them without PEBS, but they're much more useful with PEBS
(e.g. due to skid or if you want the address or data source, although
you can also use perf mem for the latter).

I think PEBS on is a better default here.

BTW I should add that I have some other events (for HSW TSX) that would
really like a PEBS default. That was the original motivation,
but then I realized it makes other things nicer too.

-Andi

--
[email protected] -- Speaking for myself only.