2009-06-23 12:29:33

by Jaswinder Singh Rajput

Subject: [PATCH -tip] perf_counter tools: shorten names for events


On AMD box:
$ ./perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses -- ls -lR /usr/include/ > /dev/null

Before :

Performance counter stats for 'ls -lR /usr/include/':

248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
1001433 L1-data-Cache-Load-Misses (scaled from 23.34%)
153691 L1-data-Cache-Store-Referencees (scaled from 23.34%)
423248 L1-data-Cache-Prefetch-Referencees (scaled from 23.33%)
302138 L1-data-Cache-Prefetch-Misses (scaled from 23.25%)
251217546 L1-instruction-Cache-Load-Referencees (scaled from 23.25%)
5757005 L1-instruction-Cache-Load-Misses (scaled from 23.23%)
93435 L1-instruction-Cache-Prefetch-Referencees (scaled from 23.24%)
6496073 L2-Cache-Load-Referencees (scaled from 23.32%)
609485 L2-Cache-Load-Misses (scaled from 23.45%)
6876991 L2-Cache-Store-Referencees (scaled from 23.71%)
248922840 Data-TLB-Cache-Load-Referencees (scaled from 23.94%)
5828386 Data-TLB-Cache-Load-Misses (scaled from 24.17%)
257613506 Instruction-TLB-Cache-Load-Referencees (scaled from 24.20%)
6833 Instruction-TLB-Cache-Load-Misses (scaled from 23.88%)
109043606 Branch-Cache-Load-Referencees (scaled from 23.64%)
5552296 Branch-Cache-Load-Misses (scaled from 23.42%)

0.413702461 seconds time elapsed.

After :

Performance counter stats for 'ls -lR /usr/include/':

259250339 L1-d-load-refs (scaled from 22.73%)
1187200 L1-d-load-miss (scaled from 23.01%)
150454 L1-d-store-refs (scaled from 23.01%)
494252 L1-d-prefetch-refs (scaled from 23.29%)
362661 L1-d-prefetch-miss (scaled from 23.73%)
247343449 L1-i-load-refs (scaled from 23.71%)
4804990 L1-i-load-miss (scaled from 23.85%)
108711 L1-i-prefetch-refs (scaled from 23.83%)
6260313 L2-load-refs (scaled from 23.82%)
605425 L2-load-miss (scaled from 23.82%)
6898075 L2-store-refs (scaled from 23.96%)
248334160 d-TLB-load-refs (scaled from 23.95%)
3812835 d-TLB-load-miss (scaled from 23.87%)
253208496 i-TLB-load-refs (scaled from 23.73%)
5873 i-TLB-load-miss (scaled from 23.46%)
110891027 Branch-load-refs (scaled from 23.21%)
5529622 Branch-load-miss (scaled from 23.02%)

0.374790195 seconds time elapsed.

Reported-by : Ingo Molnar <[email protected]>
Signed-off-by: Jaswinder Singh Rajput <[email protected]>
---
tools/perf/util/parse-events.c | 20 ++++++++++----------
1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 06af2fa..5c4b532 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -71,23 +71,23 @@ static char *sw_event_names[] = {
 #define MAX_ALIASES 8

 static char *hw_cache[][MAX_ALIASES] = {
-	{ "L1-data", "l1-d", "l1d" },
-	{ "L1-instruction", "l1-i", "l1i" },
+	{ "L1-d", "l1d" },
+	{ "L1-i", "l1i" },
 	{ "L2", "l2" },
-	{ "Data-TLB", "dtlb", "d-tlb" },
-	{ "Instruction-TLB", "itlb", "i-tlb" },
+	{ "d-TLB", "dtlb", },
+	{ "i-TLB", "itlb", },
 	{ "Branch", "bpu" , "btb", "bpc" },
 };

 static char *hw_cache_op[][MAX_ALIASES] = {
-	{ "Load", "read" },
-	{ "Store", "write" },
-	{ "Prefetch", "speculative-read", "speculative-load" },
+	{ "load", "read" },
+	{ "store", "write" },
+	{ "prefetch", "speculative-read", "speculative-load" },
 };

 static char *hw_cache_result[][MAX_ALIASES] = {
-	{ "Reference", "ops", "access" },
-	{ "Miss" },
+	{ "refs", "ops", "access" },
+	{ "miss" },
 };

 char *event_name(int counter)
@@ -123,7 +123,7 @@ char *event_name(int counter)
 		if (cache_result > PERF_COUNT_HW_CACHE_RESULT_MAX)
 			return "unknown-ext-hardware-cache-result";

-		sprintf(name, "%s-Cache-%s-%ses",
+		sprintf(name, "%s-%s-%s",
 			hw_cache[cache_type][0],
 			hw_cache_op[cache_op][0],
 			hw_cache_result[cache_result][0]);
--
1.6.0.6
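
The tables in the patch above serve two purposes: entry 0 of each row is the
name that gets printed, and the remaining entries are aliases accepted on the
command line. A minimal sketch of how such a lookup could work follows. The
helper find_alias() is hypothetical and not taken from parse-events.c, and the
whole-string, case-insensitive comparison is an assumption; the real parser may
well match differently. Under that assumption, dropping the "l1-d" and "d-tlb"
aliases loses nothing, because they differ from the new display names only in
case.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

#define MAX_ALIASES 8

/* Table from the patch: entry 0 is the display name, the rest are aliases. */
static const char *hw_cache[][MAX_ALIASES] = {
	{ "L1-d", "l1d" },
	{ "L1-i", "l1i" },
	{ "L2", "l2" },
	{ "d-TLB", "dtlb" },
	{ "i-TLB", "itlb" },
	{ "Branch", "bpu", "btb", "bpc" },
};

#define HW_CACHE_ROWS (sizeof(hw_cache) / sizeof(hw_cache[0]))

/*
 * Hypothetical helper, not part of the patch: return the index of the row
 * whose display name or alias equals str (ignoring case), or -1 if no row
 * matches.
 */
static int find_alias(const char *str, const char *names[][MAX_ALIASES],
		      size_t rows)
{
	assert(str != NULL);

	for (size_t i = 0; i < rows; i++) {
		/* Slots without an initializer are zero-filled, i.e. NULL. */
		for (size_t j = 0; j < MAX_ALIASES && names[i][j]; j++) {
			if (strcasecmp(str, names[i][j]) == 0)
				return (int)i;
		}
	}
	return -1;
}
```

Calling find_alias("l1-d", hw_cache, HW_CACHE_ROWS) resolves to row 0 even
though "l1-d" is no longer listed, because it matches "L1-d" once case is
ignored.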



2009-06-23 13:53:12

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

Hello Ingo,

On Tue, 2009-06-23 at 17:58 +0530, Jaswinder Singh Rajput wrote:
> On AMD box:
> $ ./perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses -- ls -lR /usr/include/ > /dev/null
>
> Before :
>
> Performance counter stats for 'ls -lR /usr/include/':
>
> 248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
> 1001433 L1-data-Cache-Load-Misses (scaled from 23.34%)
> 153691 L1-data-Cache-Store-Referencees (scaled from 23.34%)
> 423248 L1-data-Cache-Prefetch-Referencees (scaled from 23.33%)
> 302138 L1-data-Cache-Prefetch-Misses (scaled from 23.25%)
> 251217546 L1-instruction-Cache-Load-Referencees (scaled from 23.25%)
> 5757005 L1-instruction-Cache-Load-Misses (scaled from 23.23%)
> 93435 L1-instruction-Cache-Prefetch-Referencees (scaled from 23.24%)
> 6496073 L2-Cache-Load-Referencees (scaled from 23.32%)
> 609485 L2-Cache-Load-Misses (scaled from 23.45%)
> 6876991 L2-Cache-Store-Referencees (scaled from 23.71%)
> 248922840 Data-TLB-Cache-Load-Referencees (scaled from 23.94%)
> 5828386 Data-TLB-Cache-Load-Misses (scaled from 24.17%)
> 257613506 Instruction-TLB-Cache-Load-Referencees (scaled from 24.20%)
> 6833 Instruction-TLB-Cache-Load-Misses (scaled from 23.88%)
> 109043606 Branch-Cache-Load-Referencees (scaled from 23.64%)
> 5552296 Branch-Cache-Load-Misses (scaled from 23.42%)
>
> 0.413702461 seconds time elapsed.
>
> After :
>
> Performance counter stats for 'ls -lR /usr/include/':
>
> 259250339 L1-d-load-refs (scaled from 22.73%)
> 1187200 L1-d-load-miss (scaled from 23.01%)
> 150454 L1-d-store-refs (scaled from 23.01%)
> 494252 L1-d-prefetch-refs (scaled from 23.29%)
> 362661 L1-d-prefetch-miss (scaled from 23.73%)
> 247343449 L1-i-load-refs (scaled from 23.71%)
> 4804990 L1-i-load-miss (scaled from 23.85%)
> 108711 L1-i-prefetch-refs (scaled from 23.83%)
> 6260313 L2-load-refs (scaled from 23.82%)
> 605425 L2-load-miss (scaled from 23.82%)
> 6898075 L2-store-refs (scaled from 23.96%)
> 248334160 d-TLB-load-refs (scaled from 23.95%)
> 3812835 d-TLB-load-miss (scaled from 23.87%)
> 253208496 i-TLB-load-refs (scaled from 23.73%)
> 5873 i-TLB-load-miss (scaled from 23.46%)
> 110891027 Branch-load-refs (scaled from 23.21%)
> 5529622 Branch-load-miss (scaled from 23.02%)
>
> 0.374790195 seconds time elapsed.
>
> Reported-by : Ingo Molnar <[email protected]>
> Signed-off-by: Jaswinder Singh Rajput <[email protected]>
> ---

Does this look OK to you?

Thanks,
--
JSR

2009-06-23 19:57:24

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Jaswinder Singh Rajput <[email protected]> wrote:

> After :
>
> Performance counter stats for 'ls -lR /usr/include/':
>
> 259250339 L1-d-load-refs (scaled from 22.73%)
> 1187200 L1-d-load-miss (scaled from 23.01%)
> 150454 L1-d-store-refs (scaled from 23.01%)
> 494252 L1-d-prefetch-refs (scaled from 23.29%)
> 362661 L1-d-prefetch-miss (scaled from 23.73%)
> 247343449 L1-i-load-refs (scaled from 23.71%)
> 4804990 L1-i-load-miss (scaled from 23.85%)
> 108711 L1-i-prefetch-refs (scaled from 23.83%)
> 6260313 L2-load-refs (scaled from 23.82%)
> 605425 L2-load-miss (scaled from 23.82%)
> 6898075 L2-store-refs (scaled from 23.96%)
> 248334160 d-TLB-load-refs (scaled from 23.95%)
> 3812835 d-TLB-load-miss (scaled from 23.87%)
> 253208496 i-TLB-load-refs (scaled from 23.73%)
> 5873 i-TLB-load-miss (scaled from 23.46%)
> 110891027 Branch-load-refs (scaled from 23.21%)
> 5529622 Branch-load-miss (scaled from 23.02%)

here's an edited version of my suggestions:

> 259250339 dL1-loads (scaled from 22.73%)
> 1187200 dL1-load-misses (scaled from 23.01%)
> 150454 dL1-stores (scaled from 23.01%)
> 494252 dL1-prefetches (scaled from 23.29%)
> 362661 dL1-prefetch-misses (scaled from 23.73%)
> 247343449 iL1-loads (scaled from 23.71%)
> 4804990 iL1-load-misses (scaled from 23.85%)
> 108711 iL1-prefetches (scaled from 23.83%)
> 6260313 LLC-loads (scaled from 23.82%)
> 605425 LLC-load-misses (scaled from 23.82%)
> 6898075 LLC-stores (scaled from 23.96%)
> 248334160 dTLB-loads (scaled from 23.95%)
> 3812835 dTLB-load-misses (scaled from 23.87%)
> 253208496 iTLB-loads (scaled from 23.73%)
> 5873 iTLB-load-misses (scaled from 23.46%)
> 110891027 branches (scaled from 23.21%)
> 5529622 branch-misses (scaled from 23.02%)

We can leave out 'refs' I think - even without any qualification,
statements like '247343449 iL1-loads' are still unambiguous.

Plus we can abbreviate dL1/iL1/dTLB/iTLB. The capitalization
matters. Also, note that it's LLC (Last Level Cache), not L2.

( Sidenote: L2 can still be an alias for LLC, even though some CPUs
have a L3 too. )

Note, branches are special - we don't really have 'branch loads',
branches are executions. 'Branches' and 'Branch-misses' are the
right terms.

Do you agree?

Ingo

2009-06-23 22:13:37

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Tue, 2009-06-23 at 21:56 +0200, Ingo Molnar wrote:
> * Jaswinder Singh Rajput <[email protected]> wrote:
>
> > After :
> >
> > Performance counter stats for 'ls -lR /usr/include/':
> >
> > 259250339 L1-d-load-refs (scaled from 22.73%)
> > 1187200 L1-d-load-miss (scaled from 23.01%)
> > 150454 L1-d-store-refs (scaled from 23.01%)
> > 494252 L1-d-prefetch-refs (scaled from 23.29%)
> > 362661 L1-d-prefetch-miss (scaled from 23.73%)
> > 247343449 L1-i-load-refs (scaled from 23.71%)
> > 4804990 L1-i-load-miss (scaled from 23.85%)
> > 108711 L1-i-prefetch-refs (scaled from 23.83%)
> > 6260313 L2-load-refs (scaled from 23.82%)
> > 605425 L2-load-miss (scaled from 23.82%)
> > 6898075 L2-store-refs (scaled from 23.96%)
> > 248334160 d-TLB-load-refs (scaled from 23.95%)
> > 3812835 d-TLB-load-miss (scaled from 23.87%)
> > 253208496 i-TLB-load-refs (scaled from 23.73%)
> > 5873 i-TLB-load-miss (scaled from 23.46%)
> > 110891027 Branch-load-refs (scaled from 23.21%)
> > 5529622 Branch-load-miss (scaled from 23.02%)
>
> here's an edited version of my suggestions:
>
> > 259250339 dL1-loads (scaled from 22.73%)
> > 1187200 dL1-load-misses (scaled from 23.01%)
> > 150454 dL1-stores (scaled from 23.01%)
> > 494252 dL1-prefetches (scaled from 23.29%)
> > 362661 dL1-prefetch-misses (scaled from 23.73%)
> > 247343449 iL1-loads (scaled from 23.71%)
> > 4804990 iL1-load-misses (scaled from 23.85%)
> > 108711 iL1-prefetches (scaled from 23.83%)
> > 6260313 LLC-loads (scaled from 23.82%)
> > 605425 LLC-load-misses (scaled from 23.82%)
> > 6898075 LLC-stores (scaled from 23.96%)
> > 248334160 dTLB-loads (scaled from 23.95%)
> > 3812835 dTLB-load-misses (scaled from 23.87%)
> > 253208496 iTLB-loads (scaled from 23.73%)
> > 5873 iTLB-load-misses (scaled from 23.46%)
> > 110891027 branches (scaled from 23.21%)
> > 5529622 branch-misses (scaled from 23.02%)
>
> We can leave out 'refs' I think - even without any qualification,
> statements like '247343449 iL1-loads' are still unambiguous.
>

Looks good.

> Plus we can abbreviate dL1/iL1/dTLB/iTLB. The capitalization
> matters. Also, note that it's LLC (Last Level Cache), not L2.
>
> ( Sidenote: L2 can still be an alias for LLC, even though some CPUs
> have a L3 too. )
>

Ok, I will fix it and also set the alias.

> Note, branches are special - we don't really have 'branch loads',
> branches are executions. 'Branches' and 'Branch-misses' are the
> right terms.
>
> Do you agree?
>

The event we use for (BPU, READ, ACCESS) is 'branch instructions retired'.

So by 'branch loads' we mean 'branch instructions loaded and retired'.

I like all of them: 'branch loads', 'branch retired' or 'branches'.

Please let me know which one is the best option so that I can prepare the
patch.

Thanks,
--
JSR


2009-06-23 23:00:30

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Wed, 2009-06-24 at 03:42 +0530, Jaswinder Singh Rajput wrote:

> > Note, branches are special - we don't really have 'branch loads',
> > branches are executions. 'Branches' and 'Branch-misses' are the
> > right terms.
> >
> > Do you agree?
> >
>
> Event we used for (BPU, READ, ACCESS) is 'branch instructions retired'
>
> So 'branch loads' we mean 'branch instruction loaded and retired'
>
> I like all of them : 'branch loads', 'branch retired' or 'branches'
>
> Please let me know, which one is best option so that I can prepare the
> patch.
>

Or, if branches are special and the following combinations are always invalid:

(BPU, WRITE, ACCESS)
(BPU, WRITE, MISS)
(BPU, PREFETCH, ACCESS)
(BPU, PREFETCH, MISS)

then can we move BPU out of the hardware cache counters and into some
other category?

Thanks,
--
JSR
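
The proposal in the message above amounts to a validity rule on
(type, op) pairs: for the branch unit only the READ op would be meaningful,
whatever the result. A minimal sketch of that rule follows. The enum names are
hypothetical stand-ins for the PERF_COUNT_HW_CACHE_* constants, and the rule
itself is only what this message suggests, not something the kernel is known
to enforce.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the PERF_COUNT_HW_CACHE_* constants. */
enum cache_type {
	CACHE_L1D, CACHE_L1I, CACHE_LL, CACHE_DTLB, CACHE_ITLB, CACHE_BPU,
	CACHE_TYPE_MAX
};

enum cache_op {
	CACHE_OP_READ, CACHE_OP_WRITE, CACHE_OP_PREFETCH,
	CACHE_OP_MAX
};

/*
 * Return true if the (type, op) pair is worth showing to the user.
 * Assumption from the thread: BPU with WRITE or PREFETCH is always
 * invalid, for both the ACCESS and the MISS result.
 */
static bool cache_op_is_meaningful(enum cache_type type, enum cache_op op)
{
	assert(type < CACHE_TYPE_MAX);
	assert(op < CACHE_OP_MAX);

	if (type == CACHE_BPU)
		return op == CACHE_OP_READ;

	return true;
}
```

A caller that lists or tests every combination could skip the pairs for which
this returns false instead of moving BPU into a separate category.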


2009-06-24 08:41:02

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Jaswinder Singh Rajput <[email protected]> wrote:

> On Tue, 2009-06-23 at 21:56 +0200, Ingo Molnar wrote:
> > * Jaswinder Singh Rajput <[email protected]> wrote:
> >
> > > After :
> > >
> > > Performance counter stats for 'ls -lR /usr/include/':
> > >
> > > 259250339 L1-d-load-refs (scaled from 22.73%)
> > > 1187200 L1-d-load-miss (scaled from 23.01%)
> > > 150454 L1-d-store-refs (scaled from 23.01%)
> > > 494252 L1-d-prefetch-refs (scaled from 23.29%)
> > > 362661 L1-d-prefetch-miss (scaled from 23.73%)
> > > 247343449 L1-i-load-refs (scaled from 23.71%)
> > > 4804990 L1-i-load-miss (scaled from 23.85%)
> > > 108711 L1-i-prefetch-refs (scaled from 23.83%)
> > > 6260313 L2-load-refs (scaled from 23.82%)
> > > 605425 L2-load-miss (scaled from 23.82%)
> > > 6898075 L2-store-refs (scaled from 23.96%)
> > > 248334160 d-TLB-load-refs (scaled from 23.95%)
> > > 3812835 d-TLB-load-miss (scaled from 23.87%)
> > > 253208496 i-TLB-load-refs (scaled from 23.73%)
> > > 5873 i-TLB-load-miss (scaled from 23.46%)
> > > 110891027 Branch-load-refs (scaled from 23.21%)
> > > 5529622 Branch-load-miss (scaled from 23.02%)
> >
> > here's an edited version of my suggestions:
> >
> > > 259250339 dL1-loads (scaled from 22.73%)
> > > 1187200 dL1-load-misses (scaled from 23.01%)
> > > 150454 dL1-stores (scaled from 23.01%)
> > > 494252 dL1-prefetches (scaled from 23.29%)
> > > 362661 dL1-prefetch-misses (scaled from 23.73%)
> > > 247343449 iL1-loads (scaled from 23.71%)
> > > 4804990 iL1-load-misses (scaled from 23.85%)
> > > 108711 iL1-prefetches (scaled from 23.83%)
> > > 6260313 LLC-loads (scaled from 23.82%)
> > > 605425 LLC-load-misses (scaled from 23.82%)
> > > 6898075 LLC-stores (scaled from 23.96%)
> > > 248334160 dTLB-loads (scaled from 23.95%)
> > > 3812835 dTLB-load-misses (scaled from 23.87%)
> > > 253208496 iTLB-loads (scaled from 23.73%)
> > > 5873 iTLB-load-misses (scaled from 23.46%)
> > > 110891027 branches (scaled from 23.21%)
> > > 5529622 branch-misses (scaled from 23.02%)
> >
> > We can leave out 'refs' I think - even without any qualification,
> > statements like '247343449 iL1-loads' are still unambiguous.
> >
>
> Looks good.
>
> > Plus we can abbreviate dL1/iL1/dTLB/iTLB. The capitalization
> > matters. Also, note that it's LLC (Last Level Cache), not L2.
> >
> > ( Sidenote: L2 can still be an alias for LLC, even though some CPUs
> > have a L3 too. )
> >
>
> Ok, I will fix it and also set the alias.
>
> > Note, branches are special - we don't really have 'branch loads',
> > branches are executions. 'Branches' and 'Branch-misses' are the
> > right terms.
> >
> > Do you agree?
> >
>
> Event we used for (BPU, READ, ACCESS) is 'branch instructions
> retired'
>
> So 'branch loads' we mean 'branch instruction loaded and retired'
>
> I like all of them : 'branch loads', 'branch retired' or
> 'branches'

There are two things:

Firstly, "loads" are when data is loaded into the CPU. The term
has a very firm meaning.

Secondly, the "loading an instruction into the CPU" idiom you
mention is not really correct - what we generally say is to "fetch
an instruction".

In that sense using 'branch loads' is confusing, and that's why I
corrected it. 'branches' is a perfectly fine shortcut for 'branch
instructions executed'. (or branch instructions fetched and retired)

Ingo

2009-06-24 18:00:40

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Wed, 2009-06-24 at 10:40 +0200, Ingo Molnar wrote:
> * Jaswinder Singh Rajput <[email protected]> wrote:
>
> > On Tue, 2009-06-23 at 21:56 +0200, Ingo Molnar wrote:
> > > * Jaswinder Singh Rajput <[email protected]> wrote:
> > >
> > > here's an edited version of my suggestions:
> > >
> > > > 259250339 dL1-loads (scaled from 22.73%)
> > > > 1187200 dL1-load-misses (scaled from 23.01%)
> > > > 150454 dL1-stores (scaled from 23.01%)
> > > > 494252 dL1-prefetches (scaled from 23.29%)
> > > > 362661 dL1-prefetch-misses (scaled from 23.73%)
> > > > 247343449 iL1-loads (scaled from 23.71%)
> > > > 4804990 iL1-load-misses (scaled from 23.85%)
> > > > 108711 iL1-prefetches (scaled from 23.83%)
> > > > 6260313 LLC-loads (scaled from 23.82%)
> > > > 605425 LLC-load-misses (scaled from 23.82%)
> > > > 6898075 LLC-stores (scaled from 23.96%)
> > > > 248334160 dTLB-loads (scaled from 23.95%)
> > > > 3812835 dTLB-load-misses (scaled from 23.87%)
> > > > 253208496 iTLB-loads (scaled from 23.73%)
> > > > 5873 iTLB-load-misses (scaled from 23.46%)
> > > > 110891027 branches (scaled from 23.21%)
> > > > 5529622 branch-misses (scaled from 23.02%)
> > >
> > > We can leave out 'refs' I think - even without any qualification,
> > > statements like '247343449 iL1-loads' are still unambiguous.
> > >
> >
> > Looks good.
> >
> > > Plus we can abbreviate dL1/iL1/dTLB/iTLB. The capitalization
> > > matters. Also, note that it's LLC (Last Level Cache), not L2.
> > >
> > > ( Sidenote: L2 can still be an alias for LLC, even though some CPUs
> > > have a L3 too. )
> > >
> >
> > Ok, I will fix it and also set the alias.
> >
> > > Note, branches are special - we don't really have 'branch loads',
> > > branches are executions. 'Branches' and 'Branch-misses' are the
> > > right terms.
> > >
> > > Do you agree?
> > >
> >
> > Event we used for (BPU, READ, ACCESS) is 'branch instructions
> > retired'
> >
> > So 'branch loads' we mean 'branch instruction loaded and retired'
> >
> > I like all of them : 'branch loads', 'branch retired' or
> > 'branches'
>
> There are two things:
>
> Firstly, "loads" are when data is loaded into the CPU. The term
> has a very firm meaning.
>
> Secondly, the "loading an instruction into the CPU" idiom you
> mention is not really correct - what we generally say is to "fetch
> an instruction".
>
> In that sense using 'branch loads' is confusing, and that's why I
> corrected it. 'branches' is a perfectly fine shortcut for 'branch
> instructions executed'. (or branch instructions fetched and retired)
>


OK, we will show:
'branch loads' -> 'branches'
'branch load-misses' -> 'branch-misses'

Now the issue is how we can show:

'branch stores' -> ?
'branch store-misses' -> ?

'branch prefetches' -> ?
'branch prefetch-misses' -> ?

Thanks,
--
JSR

2009-06-24 18:08:25

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Jaswinder Singh Rajput <[email protected]> wrote:

> On Wed, 2009-06-24 at 10:40 +0200, Ingo Molnar wrote:
> > * Jaswinder Singh Rajput <[email protected]> wrote:
> >
> > > On Tue, 2009-06-23 at 21:56 +0200, Ingo Molnar wrote:
> > > > * Jaswinder Singh Rajput <[email protected]> wrote:
> > > >
> > > > here's an edited version of my suggestions:
> > > >
> > > > > 259250339 dL1-loads (scaled from 22.73%)
> > > > > 1187200 dL1-load-misses (scaled from 23.01%)
> > > > > 150454 dL1-stores (scaled from 23.01%)
> > > > > 494252 dL1-prefetches (scaled from 23.29%)
> > > > > 362661 dL1-prefetch-misses (scaled from 23.73%)
> > > > > 247343449 iL1-loads (scaled from 23.71%)
> > > > > 4804990 iL1-load-misses (scaled from 23.85%)
> > > > > 108711 iL1-prefetches (scaled from 23.83%)
> > > > > 6260313 LLC-loads (scaled from 23.82%)
> > > > > 605425 LLC-load-misses (scaled from 23.82%)
> > > > > 6898075 LLC-stores (scaled from 23.96%)
> > > > > 248334160 dTLB-loads (scaled from 23.95%)
> > > > > 3812835 dTLB-load-misses (scaled from 23.87%)
> > > > > 253208496 iTLB-loads (scaled from 23.73%)
> > > > > 5873 iTLB-load-misses (scaled from 23.46%)
> > > > > 110891027 branches (scaled from 23.21%)
> > > > > 5529622 branch-misses (scaled from 23.02%)
> > > >
> > > > We can leave out 'refs' I think - even without any qualification,
> > > > statements like '247343449 iL1-loads' are still unambiguous.
> > > >
> > >
> > > Looks good.
> > >
> > > > Plus we can abbreviate dL1/iL1/dTLB/iTLB. The capitalization
> > > > matters. Also, note that it's LLC (Last Level Cache), not L2.
> > > >
> > > > ( Sidenote: L2 can still be an alias for LLC, even though some CPUs
> > > > have a L3 too. )
> > > >
> > >
> > > Ok, I will fix it and also set the alias.
> > >
> > > > Note, branches are special - we don't really have 'branch loads',
> > > > branches are executions. 'Branches' and 'Branch-misses' are the
> > > > right terms.
> > > >
> > > > Do you agree?
> > > >
> > >
> > > Event we used for (BPU, READ, ACCESS) is 'branch instructions
> > > retired'
> > >
> > > So 'branch loads' we mean 'branch instruction loaded and retired'
> > >
> > > I like all of them : 'branch loads', 'branch retired' or
> > > 'branches'
> >
> > There are two things:
> >
> > Firstly, "loads" are when data is loaded into the CPU. The term
> > has a very firm meaning.
> >
> > Secondly, the "loading an instruction into the CPU" idiom you
> > mention is not really correct - what we generally say is to "fetch
> > an instruction".
> >
> > In that sense using 'branch loads' is confusing, and that's why I
> > corrected it. 'branches' is a perfectly fine shortcut for 'branch
> > instructions executed'. (or branch instructions fetched and retired)
> >
>
>
> OK, We will show :
> 'branch loads' -> 'branches'
> 'branch load-misses' -> 'branch-misses'
>
> now issue is how we can show :
>
> 'branch stores' -> ?
> 'branch store-misses' -> ?
>
> 'branch prefetches' -> ?
> 'branch prefetch-misses' -> ?

there's no such thing as a 'branch store'. Instructions are not
stored. We shouldn't display those.

They are prefetched sometimes speculatively ... not sure there are
events for them ... are there?

Ingo

2009-06-24 18:18:54

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Wed, 2009-06-24 at 20:07 +0200, Ingo Molnar wrote:
> * Jaswinder Singh Rajput <[email protected]> wrote:
>
> >
> > OK, We will show :
> > 'branch loads' -> 'branches'
> > 'branch load-misses' -> 'branch-misses'
> >
> > now issue is how we can show :
> >
> > 'branch stores' -> ?
> > 'branch store-misses' -> ?
> >
> > 'branch prefetches' -> ?
> > 'branch prefetch-misses' -> ?
>
> there's no such thing as a 'branch store'. Instructions are not
> stored. We shouldn't display those.
>
> They are prefetched sometimes speculatively ... not sure there are
> events for them ... are there?
>

Yes, you are right, there are no such events.

But I need to test and display all of them.

If I need to handle branches as a special case, then it is better to
make a separate array for branches and remove them from the hw_cache array.

Is that OK?

Thanks,

2009-06-24 20:41:48

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Tue, 2009-06-23 at 21:56 +0200, Ingo Molnar wrote:
> * Jaswinder Singh Rajput <[email protected]> wrote:
>

> here's an edited version of my suggestions:
>
> > 259250339 dL1-loads (scaled from 22.73%)
> > 1187200 dL1-load-misses (scaled from 23.01%)
> > 150454 dL1-stores (scaled from 23.01%)
> > 494252 dL1-prefetches (scaled from 23.29%)
> > 362661 dL1-prefetch-misses (scaled from 23.73%)
> > 247343449 iL1-loads (scaled from 23.71%)
> > 4804990 iL1-load-misses (scaled from 23.85%)
> > 108711 iL1-prefetches (scaled from 23.83%)
> > 6260313 LLC-loads (scaled from 23.82%)
> > 605425 LLC-load-misses (scaled from 23.82%)
> > 6898075 LLC-stores (scaled from 23.96%)
> > 248334160 dTLB-loads (scaled from 23.95%)
> > 3812835 dTLB-load-misses (scaled from 23.87%)
> > 253208496 iTLB-loads (scaled from 23.73%)
> > 5873 iTLB-load-misses (scaled from 23.46%)
> > 110891027 branches (scaled from 23.21%)
> > 5529622 branch-misses (scaled from 23.02%)
>
> We can leave out 'refs' I think - even without any qualification,
> statements like '247343449 iL1-loads' are still unambiguous.
>
> Plus we can abbreviate dL1/iL1/dTLB/iTLB. The capitalization
> matters. Also, note that it's LLC (Last Level Cache), not L2.
>
> ( Sidenote: L2 can still be an alias for LLC, even though some CPUs
> have a L3 too. )
>
> Note, branches are special - we don't really have 'branch loads',
> branches are executions. 'Branches' and 'Branch-misses' are the
> right terms.
>

[PATCH] perf_counter tools: shorten names for events

Special handling for branches, as branches are special:
we don't really have 'branch loads', branches are executions.
'Branches' and 'Branch-misses' are the right terms.

On AMD box:
$ ./perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses -- ls -lR /usr/include/ > /dev/null

Before :

Performance counter stats for 'ls -lR /usr/include/':

248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
1001433 L1-data-Cache-Load-Misses (scaled from 23.34%)
153691 L1-data-Cache-Store-Referencees (scaled from 23.34%)
423248 L1-data-Cache-Prefetch-Referencees (scaled from 23.33%)
302138 L1-data-Cache-Prefetch-Misses (scaled from 23.25%)
251217546 L1-instruction-Cache-Load-Referencees (scaled from 23.25%)
5757005 L1-instruction-Cache-Load-Misses (scaled from 23.23%)
93435 L1-instruction-Cache-Prefetch-Referencees (scaled from 23.24%)
6496073 L2-Cache-Load-Referencees (scaled from 23.32%)
609485 L2-Cache-Load-Misses (scaled from 23.45%)
6876991 L2-Cache-Store-Referencees (scaled from 23.71%)
248922840 Data-TLB-Cache-Load-Referencees (scaled from 23.94%)
5828386 Data-TLB-Cache-Load-Misses (scaled from 24.17%)
257613506 Instruction-TLB-Cache-Load-Referencees (scaled from 24.20%)
6833 Instruction-TLB-Cache-Load-Misses (scaled from 23.88%)
109043606 Branch-Cache-Load-Referencees (scaled from 23.64%)
5552296 Branch-Cache-Load-Misses (scaled from 23.42%)

0.413702461 seconds time elapsed.

After :

Performance counter stats for 'ls -lR /usr/include/':

283542921 dL1-loads (scaled from 23.28%)
1848314 dL1-load-misses (scaled from 22.94%)
168963 dL1-stores (scaled from 22.94%)
739249 dL1-prefetches (scaled from 22.45%)
501021 dL1-prefetch-misses (scaled from 22.25%)
275037259 iL1-loads (scaled from 23.40%)
6030825 iL1-load-misses (scaled from 23.26%)
166760 iL1-prefetches (scaled from 24.31%)
7224781 LLC-loads (scaled from 24.76%)
821097 LLC-load-misses (scaled from 24.07%)
7070549 LLC-stores (scaled from 24.45%)
251586242 dTLB-loads (scaled from 24.65%)
5127780 dTLB-load-misses (scaled from 23.99%)
276782014 iTLB-loads (scaled from 23.77%)
16787 iTLB-load-misses (scaled from 23.72%)
123408502 branches (scaled from 22.88%)
5843856 branch-misses (scaled from 22.87%)

1.417039891 seconds time elapsed.

Reported-by : Ingo Molnar <[email protected]>
Signed-off-by: Jaswinder Singh Rajput <[email protected]>
---
tools/perf/util/parse-events.c | 45 ++++++++++++++++++++++++++-------------
1 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 06af2fa..fa6e2e5 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -71,23 +71,23 @@ static char *sw_event_names[] = {
 #define MAX_ALIASES 8

 static char *hw_cache[][MAX_ALIASES] = {
-	{ "L1-data", "l1-d", "l1d" },
-	{ "L1-instruction", "l1-i", "l1i" },
-	{ "L2", "l2" },
-	{ "Data-TLB", "dtlb", "d-tlb" },
-	{ "Instruction-TLB", "itlb", "i-tlb" },
-	{ "Branch", "bpu" , "btb", "bpc" },
+	{ "dL1", "L1-d", "l1d", },
+	{ "iL1", "L1-i", "l1i", },
+	{ "LLC", "L2", },
+	{ "dTLB", "d-tlb", },
+	{ "iTLB", "i-tlb", },
+	{ "branch", "branches", "bpu", "btb", "bpc", },
 };

 static char *hw_cache_op[][MAX_ALIASES] = {
-	{ "Load", "read" },
-	{ "Store", "write" },
-	{ "Prefetch", "speculative-read", "speculative-load" },
+	{ "load", "loads", "read", },
+	{ "store", "stores", "write", },
+	{ "prefetch", "prefetches", "speculative-read", "speculative-load", },
 };

 static char *hw_cache_result[][MAX_ALIASES] = {
-	{ "Reference", "ops", "access" },
-	{ "Miss" },
+	{ "refs", "ops", "access", },
+	{ "misses", "miss", },
 };

 char *event_name(int counter)
@@ -123,10 +123,25 @@ char *event_name(int counter)
 		if (cache_result > PERF_COUNT_HW_CACHE_RESULT_MAX)
 			return "unknown-ext-hardware-cache-result";

-		sprintf(name, "%s-Cache-%s-%ses",
-			hw_cache[cache_type][0],
-			hw_cache_op[cache_op][0],
-			hw_cache_result[cache_result][0]);
+		/*
+		 * special handling for branches
+		 * we are only interested in BPU, READ
+		 */
+		if (cache_type == PERF_COUNT_HW_CACHE_BPU && cache_op)
+			return "unknown";
+		else if (cache_type == PERF_COUNT_HW_CACHE_BPU) {
+			if (cache_result)
+				sprintf(name, "%s-%s", hw_cache[cache_type][0],
+					hw_cache_result[cache_result][0]);
+			else
+				sprintf(name, "%s", hw_cache[cache_type][1]);
+		} else if (cache_result)
+			sprintf(name, "%s-%s-%s", hw_cache[cache_type][0],
+				hw_cache_op[cache_op][0],
+				hw_cache_result[cache_result][0]);
+		else
+			sprintf(name, "%s-%s", hw_cache[cache_type][0],
+				hw_cache_op[cache_op][1]);

 		return name;
 	}
--
1.6.0.6
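
To see what names the reworked event_name() logic produces, the naming rules
from the patch above can be pulled out into a standalone sketch. The function
cache_event_name() and the enum constants are hypothetical names introduced
for illustration only; the tables and the four branches of the if/else chain
follow the patch. It assumes READ and ACCESS are value 0 in their respective
enums, as the patch's bare "cache_op" and "cache_result" tests imply.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ALIASES 8

/* Hypothetical stand-ins for the PERF_COUNT_HW_CACHE_* constants. */
enum { CACHE_L1D, CACHE_L1I, CACHE_LL, CACHE_DTLB, CACHE_ITLB, CACHE_BPU, CACHE_MAX };
enum { OP_READ, OP_WRITE, OP_PREFETCH, OP_MAX };
enum { RESULT_ACCESS, RESULT_MISS, RESULT_MAX };

/* Tables as in the patch: entry 0 is singular, entry 1 the plural or alias. */
static const char *hw_cache[][MAX_ALIASES] = {
	{ "dL1", "L1-d", "l1d" },
	{ "iL1", "L1-i", "l1i" },
	{ "LLC", "L2" },
	{ "dTLB", "d-tlb" },
	{ "iTLB", "i-tlb" },
	{ "branch", "branches", "bpu", "btb", "bpc" },
};

static const char *hw_cache_op[][MAX_ALIASES] = {
	{ "load", "loads", "read" },
	{ "store", "stores", "write" },
	{ "prefetch", "prefetches", "speculative-read", "speculative-load" },
};

static const char *hw_cache_result[][MAX_ALIASES] = {
	{ "refs", "ops", "access" },
	{ "misses", "miss" },
};

/* Build the display name; the result lives in a static buffer. */
static const char *cache_event_name(int type, int op, int result)
{
	static char name[64];

	assert(type >= 0 && type < CACHE_MAX);
	assert(op >= 0 && op < OP_MAX);
	assert(result >= 0 && result < RESULT_MAX);

	if (type == CACHE_BPU) {
		/* Branches: only READ is named, and the op is never printed. */
		if (op != OP_READ)
			return "unknown";
		if (result == RESULT_MISS)
			snprintf(name, sizeof(name), "%s-%s",
				 hw_cache[type][0], hw_cache_result[result][0]);
		else
			snprintf(name, sizeof(name), "%s", hw_cache[type][1]);
	} else if (result == RESULT_MISS) {
		/* Misses: singular op plus "misses", e.g. "dL1-load-misses". */
		snprintf(name, sizeof(name), "%s-%s-%s", hw_cache[type][0],
			 hw_cache_op[op][0], hw_cache_result[result][0]);
	} else {
		/* Accesses: plural op and no "refs" suffix, e.g. "dL1-loads". */
		snprintf(name, sizeof(name), "%s-%s", hw_cache[type][0],
			 hw_cache_op[op][1]);
	}

	return name;
}
```

With these tables, (CACHE_L1D, OP_READ, RESULT_ACCESS) gives "dL1-loads" and
(CACHE_BPU, OP_READ, RESULT_MISS) gives "branch-misses", matching the "After"
output quoted in the patch description.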


2009-06-24 21:01:49

by Thomas Gleixner

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Thu, 25 Jun 2009, Jaswinder Singh Rajput wrote:
> On Tue, 2009-06-23 at 21:56 +0200, Ingo Molnar wrote:
>
> 248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
> 283542921 dL1-loads (scaled from 23.28%)

What is the point of this? dL1-loads is a completely non-intuitive,
artificial abbreviation.

Changing "L1-data-Cache-Load-Referencees" to "L1-dcache-loads" or
something similar provides a short but sufficiently self-explanatory
name for the counter.

I don't want to use an abbreviations dictionary to decode a perf
report.

Thanks,

tglx

2009-06-25 04:30:09

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Wed, 2009-06-24 at 23:00 +0200, Thomas Gleixner wrote:
> On Thu, 25 Jun 2009, Jaswinder Singh Rajput wrote:
> > On Tue, 2009-06-23 at 21:56 +0200, Ingo Molnar wrote:
> >
> > 248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
> > 283542921 dL1-loads (scaled from 23.28%)
>
> Where is the point of this? dL1-loads is a completely non intuitive
> artificial abbreviation.
>
> Changing "L1-data-Cache-Load-Referencees" to "L1-dcache-loads" or
> something similar provides a short but sufficiently self explaining
> explanation of the counter.
>
> I don't want to use an abbreviations dictionary to decode a perf
> report.

Of course we need to shorten the event names. Currently they are too
long and need at least 45 characters for the names alone, so how can we
show any other useful info on the screen?

Please suggest some better short names for:

L1-data-Cache-Load-Reference [Hardware cache event]
L1-data-Cache-Load-Miss [Hardware cache event]
L1-data-Cache-Store-Reference [Hardware cache event]
L1-data-Cache-Store-Miss [Hardware cache event]
L1-data-Cache-Prefetch-Reference [Hardware cache event]
L1-data-Cache-Prefetch-Miss [Hardware cache event]
L1-instruction-Cache-Load-Reference [Hardware cache event]
L1-instruction-Cache-Load-Miss [Hardware cache event]
L1-instruction-Cache-Store-Reference [Hardware cache event]
L1-instruction-Cache-Store-Miss [Hardware cache event]
L1-instruction-Cache-Prefetch-Reference [Hardware cache event]
L1-instruction-Cache-Prefetch-Miss [Hardware cache event]
L2-Cache-Load-Reference [Hardware cache event]
L2-Cache-Load-Miss [Hardware cache event]
L2-Cache-Store-Reference [Hardware cache event]
L2-Cache-Store-Miss [Hardware cache event]
L2-Cache-Prefetch-Reference [Hardware cache event]
L2-Cache-Prefetch-Miss [Hardware cache event]
Data-TLB-Cache-Load-Reference [Hardware cache event]
Data-TLB-Cache-Load-Miss [Hardware cache event]
Data-TLB-Cache-Store-Reference [Hardware cache event]
Data-TLB-Cache-Store-Miss [Hardware cache event]
Data-TLB-Cache-Prefetch-Reference [Hardware cache event]
Data-TLB-Cache-Prefetch-Miss [Hardware cache event]
Instruction-TLB-Cache-Load-Reference [Hardware cache event]
Instruction-TLB-Cache-Load-Miss [Hardware cache event]
Instruction-TLB-Cache-Store-Reference [Hardware cache event]
Instruction-TLB-Cache-Store-Miss [Hardware cache event]
Instruction-TLB-Cache-Prefetch-Reference [Hardware cache event]
Instruction-TLB-Cache-Prefetch-Miss [Hardware cache event]
Branch-Cache-Load-Reference [Hardware cache event]
Branch-Cache-Load-Miss [Hardware cache event]
Branch-Cache-Store-Reference [Hardware cache event]
Branch-Cache-Store-Miss [Hardware cache event]
Branch-Cache-Prefetch-Reference [Hardware cache event]
Branch-Cache-Prefetch-Miss [Hardware cache event]

Thanks,
--
JSR

2009-06-25 04:34:58

by Roland Dreier

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


> L1-data-Cache-Load-Reference [Hardware cache event]
> L1-instruction-Cache-Load-Reference [Hardware cache event]

The obvious place to start is to change "data-Cache" and "instruction-Cache"
to "dcache" and "icache" which are still easy to figure out without
needing to look anything up. And "Reference" can become "ref" with
minimal loss of clarity too. So we could get down to

L1-icache-load
L1-icache-load-miss
L1-icache-store-miss
L1-icache-prefetch
L1-icache-prefetch-miss

and so on.

> Data-TLB-Cache-Load-Reference [Hardware cache event]

Could become

dTLB-load
dTLB-load-miss
iTLB-load

etc.

2009-06-25 09:22:50

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Roland Dreier <[email protected]> wrote:

> > Data-TLB-Cache-Load-Reference [Hardware cache event]
>
> Could become
>
> dTLB-load
> dTLB-load-miss
> iTLB-load

I already went through this and suggested shorter names, that was
the motivation of this patch.

The new names i suggested two days ago can be found below.

Ingo

----- Forwarded message from Ingo Molnar <[email protected]> -----

Date: Tue, 23 Jun 2009 21:56:56 +0200
From: Ingo Molnar <[email protected]>
To: Jaswinder Singh Rajput <[email protected]>
Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events
Cc: Thomas Gleixner <[email protected]>,
Peter Zijlstra <[email protected]>,
LKML <[email protected]>


* Jaswinder Singh Rajput <[email protected]> wrote:

> After :
>
> Performance counter stats for 'ls -lR /usr/include/':
>
> 259250339 L1-d-load-refs (scaled from 22.73%)
> 1187200 L1-d-load-miss (scaled from 23.01%)
> 150454 L1-d-store-refs (scaled from 23.01%)
> 494252 L1-d-prefetch-refs (scaled from 23.29%)
> 362661 L1-d-prefetch-miss (scaled from 23.73%)
> 247343449 L1-i-load-refs (scaled from 23.71%)
> 4804990 L1-i-load-miss (scaled from 23.85%)
> 108711 L1-i-prefetch-refs (scaled from 23.83%)
> 6260313 L2-load-refs (scaled from 23.82%)
> 605425 L2-load-miss (scaled from 23.82%)
> 6898075 L2-store-refs (scaled from 23.96%)
> 248334160 d-TLB-load-refs (scaled from 23.95%)
> 3812835 d-TLB-load-miss (scaled from 23.87%)
> 253208496 i-TLB-load-refs (scaled from 23.73%)
> 5873 i-TLB-load-miss (scaled from 23.46%)
> 110891027 Branch-load-refs (scaled from 23.21%)
> 5529622 Branch-load-miss (scaled from 23.02%)

here's an edited version of my suggestions:

> 259250339 dL1-loads (scaled from 22.73%)
> 1187200 dL1-load-misses (scaled from 23.01%)
> 150454 dL1-stores (scaled from 23.01%)
> 494252 dL1-prefetches (scaled from 23.29%)
> 362661 dL1-prefetch-misses (scaled from 23.73%)
> 247343449 iL1-loads (scaled from 23.71%)
> 4804990 iL1-load-misses (scaled from 23.85%)
> 108711 iL1-prefetches (scaled from 23.83%)
> 6260313 LLC-loads (scaled from 23.82%)
> 605425 LLC-load-misses (scaled from 23.82%)
> 6898075 LLC-stores (scaled from 23.96%)
> 248334160 dTLB-loads (scaled from 23.95%)
> 3812835 dTLB-load-misses (scaled from 23.87%)
> 253208496 iTLB-loads (scaled from 23.73%)
> 5873 iTLB-load-misses (scaled from 23.46%)
> 110891027 branches (scaled from 23.21%)
> 5529622 branch-misses (scaled from 23.02%)

We can leave out 'refs' i think - without any qualification,
statements like '247343449 iL1-loads' are still unambiguous.

Plus we can abbreviate dL1/iL1/dTLB/iTLB. The capitalization
matters. Also, note that it's LLC (Last Level Cache), not L2.

( Sidenote: L2 can still be an alias for LLC, even though some CPUs
have a L3 too. )

Note, branches are special - we don't really have 'branch loads',
branches are executions. 'Branches' and 'Branch-misses' are the
right terms.

Do you agree?

Ingo

2009-06-25 09:24:48

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Thomas Gleixner <[email protected]> wrote:

> On Thu, 25 Jun 2009, Jaswinder Singh Rajput wrote:
> > On Tue, 2009-06-23 at 21:56 +0200, Ingo Molnar wrote:
> >
> > 248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
> > 283542921 dL1-loads (scaled from 23.28%)
>
> Where is the point of this? dL1-loads is a completely non
> intuitive artificial abbreviation.

blame me :)

I found L1-data-Cache-Load-References way too long and i asked for
suggestions and came up with my list of abbreviations.

I found 'dL1' intuitive because we use 'dTLB' and 'iTLB' as well.

How about L1-data-loads ?

Ingo

2009-06-25 09:34:18

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Jaswinder Singh Rajput <[email protected]> wrote:

> After :
>
> Performance counter stats for 'ls -lR /usr/include/':
>
> 283542921 dL1-loads (scaled from 23.28%)
> 1848314 dL1-load-misses (scaled from 22.94%)
> 168963 dL1-stores (scaled from 22.94%)
> 739249 dL1-prefetches (scaled from 22.45%)
> 501021 dL1-prefetch-misses (scaled from 22.25%)
> 275037259 iL1-loads (scaled from 23.40%)
> 6030825 iL1-load-misses (scaled from 23.26%)
> 166760 iL1-prefetches (scaled from 24.31%)
> 7224781 LLC-loads (scaled from 24.76%)
> 821097 LLC-load-misses (scaled from 24.07%)
> 7070549 LLC-stores (scaled from 24.45%)
> 251586242 dTLB-loads (scaled from 24.65%)
> 5127780 dTLB-load-misses (scaled from 23.99%)
> 276782014 iTLB-loads (scaled from 23.77%)
> 16787 iTLB-load-misses (scaled from 23.72%)
> 123408502 branches (scaled from 22.88%)
> 5843856 branch-misses (scaled from 22.87%)
>
> 1.417039891 seconds time elapsed.

ok, this output looks pretty good and intuitive to me (please
integrate suggestions from Thomas), but the patch itself needs
another iteration i think:

> static char *hw_cache[][MAX_ALIASES] = {
> - { "L1-data", "l1-d", "l1d" },
> - { "L1-instruction", "l1-i", "l1i" },
> - { "L2", "l2" },
> - { "Data-TLB", "dtlb", "d-tlb" },
> - { "Instruction-TLB", "itlb", "i-tlb" },
> - { "Branch", "bpu" , "btb", "bpc" },
> + { "dL1", "L1-d", "l1d", },
> + { "iL1", "L1-i", "l1i", },
> + { "LLC", "L2", },
> + { "dTLB", "d-tlb", },
> + { "iTLB", "i-tlb", },
> + { "branch", "branches", "bpu", "btb", "bpc", },
> };
>
> static char *hw_cache_op[][MAX_ALIASES] = {
> - { "Load", "read" },
> - { "Store", "write" },
> - { "Prefetch", "speculative-read", "speculative-load" },
> + { "load", "loads", "read", },
> + { "store", "stores", "write", },
> + { "prefetch", "prefetches", "speculative-read", "speculative-load", },
> };
>
> static char *hw_cache_result[][MAX_ALIASES] = {
> - { "Reference", "ops", "access" },
> - { "Miss" },
> + { "refs", "ops", "access", },
> + { "misses", "miss", },
> };
>
> char *event_name(int counter)
> @@ -123,10 +123,25 @@ char *event_name(int counter)
> if (cache_result > PERF_COUNT_HW_CACHE_RESULT_MAX)
> return "unknown-ext-hardware-cache-result";
>
> - sprintf(name, "%s-Cache-%s-%ses",
> - hw_cache[cache_type][0],
> - hw_cache_op[cache_op][0],
> - hw_cache_result[cache_result][0]);
> + /*
> + * special handling for branches
> + * we are only interested in BPU, READ
> + */
> + if (cache_type == PERF_COUNT_HW_CACHE_BPU && cache_op)
> + return "unknown";
> + else if (cache_type == PERF_COUNT_HW_CACHE_BPU) {
> + if (cache_result)
> + sprintf(name, "%s-%s", hw_cache[cache_type][0],
> + hw_cache_result[cache_result][0]);
> + else
> + sprintf(name, "%s", hw_cache[cache_type][1]);
> + } else if (cache_result)
> + sprintf(name, "%s-%s-%s", hw_cache[cache_type][0],
> + hw_cache_op[cache_op][0],
> + hw_cache_result[cache_result][0]);
> + else
> + sprintf(name, "%s-%s", hw_cache[cache_type][0],
> + hw_cache_op[cache_op][1]);
>
> return name;

Firstly, please run your patches through checkpatch - it will report
a real problem in your patch.

Secondly, this special-casing of the BPU isn't very clean in this
form. The BPU isn't 'special' because it deals with instructions -
it's special because it's for all practical purposes read-only.

So we should extend our table with a read-only flag, and the BPU and
the iTLB should be listed as read-only. (iTLB-store-miss is another
thing that makes no sense) For those we should skip the 'store'
bits.

That way the generic code does not have this special-case wart
dependent on PERF_COUNT_HW_CACHE_BPU.

Ingo
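
The read-only flag proposed above could be sketched roughly as follows. This is a minimal illustration, not the code that was eventually merged: the enum names, `struct cache_desc`, `cache_table` and `cache_op_allowed()` are all hypothetical stand-ins for the real PERF_COUNT_HW_CACHE_* indices and tables.

```c
#include <assert.h>

/* Hypothetical stand-ins for the PERF_COUNT_HW_CACHE_* indices. */
enum cache_type {
	CACHE_L1D, CACHE_L1I, CACHE_LL, CACHE_DTLB, CACHE_ITLB, CACHE_BPU,
	CACHE_MAX
};

enum cache_op { OP_READ, OP_WRITE, OP_PREFETCH, OP_MAX };

/* One row per cache: display name plus the proposed read-only flag. */
struct cache_desc {
	const char *name;
	int read_only;
};

/* Only the BPU and the iTLB are marked read-only, as suggested in the mail. */
static const struct cache_desc cache_table[CACHE_MAX] = {
	[CACHE_L1D]  = { "L1d",    0 },
	[CACHE_L1I]  = { "L1i",    0 },
	[CACHE_LL]   = { "LLC",    0 },
	[CACHE_DTLB] = { "dTLB",   0 },
	[CACHE_ITLB] = { "iTLB",   1 },
	[CACHE_BPU]  = { "branch", 1 },
};

/* Return 1 if the op makes sense for this cache, 0 otherwise. */
static int cache_op_allowed(int type, int op)
{
	assert(type >= 0 && type < CACHE_MAX);
	assert(op >= 0 && op < OP_MAX);

	/* Read-only units have no meaningful 'store' events. */
	if (cache_table[type].read_only && op == OP_WRITE)
		return 0;
	return 1;
}
```

With a table like this, generic code can consult the flag instead of comparing against PERF_COUNT_HW_CACHE_BPU directly, so adding another read-only unit only means adding a table row.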

2009-06-25 12:49:43

by Thomas Gleixner

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Thu, 25 Jun 2009, Ingo Molnar wrote:

>
> * Thomas Gleixner <[email protected]> wrote:
>
> > On Thu, 25 Jun 2009, Jaswinder Singh Rajput wrote:
> > > On Tue, 2009-06-23 at 21:56 +0200, Ingo Molnar wrote:
> > >
> > > 248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
> > > 283542921 dL1-loads (scaled from 23.28%)
> >
> > Where is the point of this? dL1-loads is a completely non
> > intuitive artificial abbreviation.
>
> blame me :)
>
> I found L1-data-Cache-Load-References way too long and i asked for
> suggestions and came up with my list of abbreviations.
>
> I found 'dL1' intuitive because we use 'dTLB' and 'iTLB' as well.

:)

> How about L1-data-loads ?

Yeah, something like that would be nice. The ones Roland
suggested are fine as well.

Thanks,

tglx

2009-06-25 12:56:18

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Thu, 2009-06-25 at 11:33 +0200, Ingo Molnar wrote:

> Firstly, please run your patches through checkpatch - it will report
> a real problem in your patch.
>
> Secondly, this special-casing of the BPU isnt very clean in this
> form. The BPU isnt 'special' because it deals with instructions -
> it's special because it's for all practical purposes read-only.
>
> So we should extend our table with a read-only flag, and the BPU and
> the iTLB should be listed as read-only. (iTLB-store-miss is another
> thing that makes no sense) For those we should skip the 'store'
> bits.
>
> That way the generic code does not have this special-case wart
> dependent on PERF_COUNT_HW_CACHE_BPU.
>

I am sorry, but BPU still needs special handling ;-)

[PATCH] perf_counter tools: shorten names for events

Added new alias for events.

special handling for BPU for :
'branch loads' -> 'branches'
'branch load-misses' -> 'branch-misses'

On AMD box:
$ ./perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses -- ls -lR /usr/include/ > /dev/null

Before :

Performance counter stats for 'ls -lR /usr/include/':

248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
1001433 L1-data-Cache-Load-Misses (scaled from 23.34%)
153691 L1-data-Cache-Store-Referencees (scaled from 23.34%)
423248 L1-data-Cache-Prefetch-Referencees (scaled from 23.33%)
302138 L1-data-Cache-Prefetch-Misses (scaled from 23.25%)
251217546 L1-instruction-Cache-Load-Referencees (scaled from 23.25%)
5757005 L1-instruction-Cache-Load-Misses (scaled from 23.23%)
93435 L1-instruction-Cache-Prefetch-Referencees (scaled from 23.24%)
6496073 L2-Cache-Load-Referencees (scaled from 23.32%)
609485 L2-Cache-Load-Misses (scaled from 23.45%)
6876991 L2-Cache-Store-Referencees (scaled from 23.71%)
248922840 Data-TLB-Cache-Load-Referencees (scaled from 23.94%)
5828386 Data-TLB-Cache-Load-Misses (scaled from 24.17%)
257613506 Instruction-TLB-Cache-Load-Referencees (scaled from 24.20%)
6833 Instruction-TLB-Cache-Load-Misses (scaled from 23.88%)
109043606 Branch-Cache-Load-Referencees (scaled from 23.64%)
5552296 Branch-Cache-Load-Misses (scaled from 23.42%)

0.413702461 seconds time elapsed.

After :

Performance counter stats for 'ls -lR /usr/include/':

266590464 L1d-loads (scaled from 23.03%)
1222273 L1d-load-misses (scaled from 23.58%)
146204 L1d-stores (scaled from 23.83%)
406344 L1d-prefetches (scaled from 24.09%)
283748 L1d-prefetch-misses (scaled from 24.10%)
249650965 L1i-loads (scaled from 23.80%)
3353961 L1i-load-misses (scaled from 23.82%)
104599 L1i-prefetches (scaled from 23.68%)
4836405 LLC-loads (scaled from 23.67%)
498214 LLC-load-misses (scaled from 23.66%)
4953994 LLC-stores (scaled from 23.64%)
243354097 dTLB-loads (scaled from 23.77%)
6468584 dTLB-load-misses (scaled from 23.74%)
249719549 iTLB-loads (scaled from 23.25%)
5060 iTLB-load-misses (scaled from 23.00%)
112343016 branches (scaled from 22.76%)
5528876 branch-misses (scaled from 22.54%)

0.427154051 seconds time elapsed.

Reported-by : Ingo Molnar <[email protected]>
Signed-off-by: Jaswinder Singh Rajput <[email protected]>
---
tools/perf/util/parse-events.c | 56 +++++++++++++++++++++++++++------------
1 files changed, 39 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 7939a21..993cee4 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -71,23 +71,23 @@ static char *sw_event_names[] = {
#define MAX_ALIASES 8

static char *hw_cache[][MAX_ALIASES] = {
- { "L1-data", "l1-d", "l1d" },
- { "L1-instruction", "l1-i", "l1i" },
- { "L2", "l2" },
- { "Data-TLB", "dtlb", "d-tlb" },
- { "Instruction-TLB", "itlb", "i-tlb" },
- { "Branch", "bpu" , "btb", "bpc" },
+ { "L1d", "l1-d", "L1-data", },
+ { "L1i", "l1-i", "L1-instruction", },
+ { "LLC", "L2" },
+ { "dTLB", "d-tlb", "Data-TLB", },
+ { "iTLB", "i-tlb", "Instruction-TLB", },
+ { "branch", "branches", "bpu", "btb", "bpc", },
};

static char *hw_cache_op[][MAX_ALIASES] = {
- { "Load", "read" },
- { "Store", "write" },
- { "Prefetch", "speculative-read", "speculative-load" },
+ { "load", "loads", "read", },
+ { "store", "stores", "write", },
+ { "prefetch", "prefetches", "speculative-read", "speculative-load", },
};

static char *hw_cache_result[][MAX_ALIASES] = {
- { "Reference", "ops", "access" },
- { "Miss" },
+ { "refs", "Reference", "ops", "access", },
+ { "misses", "miss", },
};

#define C(x) PERF_COUNT_HW_CACHE_##x
@@ -118,6 +118,33 @@ static int is_cache_op_valid(u8 cache_type, u8 cache_op)
return 0; /* invalid */
}

+static char *event_cache_name(u8 cache_type, u8 cache_op, u8 cache_result)
+{
+ static char name[50];
+
+ /*
+ * special handling for BPU for :
+ * 'branch loads' -> 'branches'
+ * 'branch load-misses' -> 'branch-misses'
+ */
+ if (cache_type == PERF_COUNT_HW_CACHE_BPU) {
+ if (cache_result)
+ sprintf(name, "%s-%s", hw_cache[cache_type][0],
+ hw_cache_result[cache_result][0]);
+ else
+ sprintf(name, "%s", hw_cache[cache_type][1]);
+
+ } else if (cache_result)
+ sprintf(name, "%s-%s-%s", hw_cache[cache_type][0],
+ hw_cache_op[cache_op][0],
+ hw_cache_result[cache_result][0]);
+ else
+ sprintf(name, "%s-%s", hw_cache[cache_type][0],
+ hw_cache_op[cache_op][1]);
+
+ return name;
+}
+
char *event_name(int counter)
{
u64 config = attrs[counter].config;
@@ -137,7 +164,6 @@ char *event_name(int counter)

case PERF_TYPE_HW_CACHE: {
u8 cache_type, cache_op, cache_result;
- static char name[100];

cache_type = (config >> 0) & 0xff;
if (cache_type > PERF_COUNT_HW_CACHE_MAX)
@@ -153,12 +179,8 @@ char *event_name(int counter)

if (!is_cache_op_valid(cache_type, cache_op))
return "invalid-cache";
- sprintf(name, "%s-Cache-%s-%ses",
- hw_cache[cache_type][0],
- hw_cache_op[cache_op][0],
- hw_cache_result[cache_result][0]);

- return name;
+ return event_cache_name(cache_type, cache_op, cache_result);
}

case PERF_TYPE_SOFTWARE:
--
1.6.0.6


2009-06-25 13:24:18

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Thu, 2009-06-25 at 14:48 +0200, Thomas Gleixner wrote:
> On Thu, 25 Jun 2009, Ingo Molnar wrote:
>
> >
> > * Thomas Gleixner <[email protected]> wrote:
> >
> > > On Thu, 25 Jun 2009, Jaswinder Singh Rajput wrote:
> > > > On Tue, 2009-06-23 at 21:56 +0200, Ingo Molnar wrote:
> > > >
> > > > 248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
> > > > 283542921 dL1-loads (scaled from 23.28%)
> > >
> > > Where is the point of this? dL1-loads is a completely non
> > > intuitive artificial abbreviation.
> >
> > blame me :)
> >
> > I found L1-data-Cache-Load-References way too long and i asked for
> > suggestions and came up with my list of abbreviations.
> >
> > I found 'dL1' intuitive because we use 'dTLB' and 'iTLB' as well.
>
> :)
>
> > How about L1-data-loads ?
>
> Yeah, something like that would be nice. The ones Roland
> suggested are fine as well.
>

This still looks ugly and the lines are long again (check
'L1-dcache-prefetch-misses'), so it does not serve the purpose :

Performance counter stats for 'ls -lR /usr/include/':

254259235 L1-dcache-loads (scaled from 22.69%)
1129360 L1-dcache-load-misses (scaled from 23.05%)
151929 L1-dcache-stores (scaled from 22.94%)
395089 L1-dcache-prefetches (scaled from 23.30%)
273699 L1-dcache-prefetch-misses (scaled from 23.19%)
253780608 L1-icache-loads (scaled from 23.07%)
4014781 L1-icache-load-misses (scaled from 23.16%)
94336 L1-icache-prefetches (scaled from 23.66%)
5553717 LLC-loads (scaled from 23.70%)
533195 LLC-load-misses (scaled from 23.68%)
5534185 LLC-stores (scaled from 23.92%)
252786406 dTLB-loads (scaled from 23.86%)
5058100 dTLB-load-misses (scaled from 24.17%)
248308183 iTLB-loads (scaled from 24.55%)
4627 iTLB-load-misses (scaled from 24.10%)
106942084 branches (scaled from 23.93%)
5280013 branch-misses (scaled from 23.06%)

Please check the patch which I sent a few minutes ago.

Thanks,
--
JSR

2009-06-25 15:04:08

by Roland Dreier

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


> > Where is the point of this? dL1-loads is a completely non
> > intuitive artificial abbreviation.
>
> blame me :)
>
> I found L1-data-Cache-Load-References way too long and i asked for
> suggestions and came up with my list of abbreviations.
>
> I found 'dL1' intuitive because we use 'dTLB' and 'iTLB' as well.
>
> How about L1-data-loads ?

I think L1-dcache is a good compromise between length and ease of
understanding (at least for me).

- R.

2009-06-25 15:06:01

by Roland Dreier

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


> This still looks ugly and lines are again long check for
> 'L1-dcache-prefetch-misses' and does not solve the purpose :
>
> Performance counter stats for 'ls -lR /usr/include/':
>
> 254259235 L1-dcache-loads (scaled from 22.69%)
> 1129360 L1-dcache-load-misses (scaled from 23.05%)
> 151929 L1-dcache-stores (scaled from 22.94%)
> 395089 L1-dcache-prefetches (scaled from 23.30%)
> 273699 L1-dcache-prefetch-misses (scaled from 23.19%)
> 253780608 L1-icache-loads (scaled from 23.07%)
> 4014781 L1-icache-load-misses (scaled from 23.16%)

But even the longest line there (dcache-prefetch-misses) is only 64
columns long, so you could just align the "(scaled" part a little
further to the right and keep the output under, say, 72 columns.

- R.
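
The alignment idea above can be sketched with a fixed-width name column. This is only an illustration: `format_stat_line()`, `NAME_WIDTH` and the chosen field widths are hypothetical and are not taken from the actual perf stat source.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical column width: wide enough for "L1-dcache-prefetch-misses". */
#define NAME_WIDTH 26

/*
 * Format one stat line with the event name left-aligned in a fixed-width
 * column, so that the "(scaled from ...)" part always starts in the same
 * column. Returns the line length as reported by snprintf().
 */
static int format_stat_line(char *buf, size_t len, unsigned long long count,
			    const char *name, double pct)
{
	assert(buf != NULL && name != NULL);
	return snprintf(buf, len, "%14llu  %-*s (scaled from %.2f%%)",
			count, NAME_WIDTH, name, pct);
}
```

With these assumed widths, a line for the longest name in the listing above stays below the 72 columns mentioned in the mail.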

2009-06-25 15:09:44

by Roland Dreier

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


> 266590464 L1d-loads (scaled from 23.03%)

IMHO "L1d" is too abbreviated for it to be obvious that it means L1 data
cache. If you want something really short maybe "L1-d$" might be a
little clearer, but I still like "L1-dcache" best.

The aligned formatting is definitely nice, but could we also
shorten lines by saying "scaled from" in fewer characters? (I
have to admit I don't know what "scaled from X%" means in this case so I
can't suggest something better, but the fact that I can't easily tell
what it means probably suggests that the wording could be improved.)

2009-06-25 15:12:34

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Thu, 2009-06-25 at 08:05 -0700, Roland Dreier wrote:
> > This still looks ugly and lines are again long check for
> > 'L1-dcache-prefetch-misses' and does not solve the purpose :
> >
> > Performance counter stats for 'ls -lR /usr/include/':
> >
> > 254259235 L1-dcache-loads (scaled from 22.69%)
> > 1129360 L1-dcache-load-misses (scaled from 23.05%)
> > 151929 L1-dcache-stores (scaled from 22.94%)
> > 395089 L1-dcache-prefetches (scaled from 23.30%)
> > 273699 L1-dcache-prefetch-misses (scaled from 23.19%)
> > 253780608 L1-icache-loads (scaled from 23.07%)
> > 4014781 L1-icache-load-misses (scaled from 23.16%)
>
> But even the longest line there (dcache-prefetch-misses) is only 64
> columns long, so you could just align the "(scaled" part a little
> further to the right and keep the output under, say, 72 columns.
>

BTW, this is a shortening patch, not a widening patch ;-)

L1d and L1i are self-explanatory.

If iTLB and dTLB are valid then why not L1d and L1i ?

Thanks,
--
JSR

2009-06-25 15:25:35

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Roland Dreier <[email protected]> wrote:

>
> > > Where is the point of this? dL1-loads is a completely non
> > > intuitive artificial abbreviation.
> >
> > blame me :)
> >
> > I found L1-data-Cache-Load-References way too long and i asked for
> > suggestions and came up with my list of abbreviations.
> >
> > I found 'dL1' intuitive because we use 'dTLB' and 'iTLB' as well.
> >
> > How about L1-data-loads ?
>
> I think L1-dcache is a good compromise between length and ease of
> understanding (at least for me).

all of them are caches and dcache is quite ugly as we already have a
'dcache' in the kernel.

We could do L1-data-loads or just go back to dL1-loads or
d-L1-loads.

Ingo

2009-06-25 15:26:49

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Roland Dreier <[email protected]> wrote:

>
> > 266590464 L1d-loads (scaled from 23.03%)
>
> IMHO "L1d" is too abbreviated for it to be obvious that it means
> L1 data cache. If you want something really short maybe "L1-d$"
> might be a little clearer, but I stil like "L1-dcache" best.

I find 'dcache' extremely confusing given that we have a dcache in
Linux already.

Ingo

2009-06-25 15:34:19

by Jaswinder Singh Rajput

Subject: [tip:perfcounters/urgent] perf_counter tools: Shorten names for events

Commit-ID: e5c59547791f171b280bc4c4b2c3ff171824c1a3
Gitweb: http://git.kernel.org/tip/e5c59547791f171b280bc4c4b2c3ff171824c1a3
Author: Jaswinder Singh Rajput <[email protected]>
AuthorDate: Thu, 25 Jun 2009 18:25:22 +0530
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 25 Jun 2009 17:30:23 +0200

perf_counter tools: Shorten names for events

Added new alias for events.

On AMD box:

$ ./perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses -- ls -lR /usr/include/ > /dev/null

Before :

Performance counter stats for 'ls -lR /usr/include/':

248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
1001433 L1-data-Cache-Load-Misses (scaled from 23.34%)
153691 L1-data-Cache-Store-Referencees (scaled from 23.34%)
423248 L1-data-Cache-Prefetch-Referencees (scaled from 23.33%)
302138 L1-data-Cache-Prefetch-Misses (scaled from 23.25%)
251217546 L1-instruction-Cache-Load-Referencees (scaled from 23.25%)
5757005 L1-instruction-Cache-Load-Misses (scaled from 23.23%)
93435 L1-instruction-Cache-Prefetch-Referencees (scaled from 23.24%)
6496073 L2-Cache-Load-Referencees (scaled from 23.32%)
609485 L2-Cache-Load-Misses (scaled from 23.45%)
6876991 L2-Cache-Store-Referencees (scaled from 23.71%)
248922840 Data-TLB-Cache-Load-Referencees (scaled from 23.94%)
5828386 Data-TLB-Cache-Load-Misses (scaled from 24.17%)
257613506 Instruction-TLB-Cache-Load-Referencees (scaled from 24.20%)
6833 Instruction-TLB-Cache-Load-Misses (scaled from 23.88%)
109043606 Branch-Cache-Load-Referencees (scaled from 23.64%)
5552296 Branch-Cache-Load-Misses (scaled from 23.42%)

0.413702461 seconds time elapsed.

After :

Performance counter stats for 'ls -lR /usr/include/':

266590464 L1-d$-loads (scaled from 23.03%)
1222273 L1-d$-load-misses (scaled from 23.58%)
146204 L1-d$-stores (scaled from 23.83%)
406344 L1-d$-prefetches (scaled from 24.09%)
283748 L1-d$-prefetch-misses (scaled from 24.10%)
249650965 L1-i$-loads (scaled from 23.80%)
3353961 L1-i$-load-misses (scaled from 23.82%)
104599 L1-i$-prefetches (scaled from 23.68%)
4836405 LLC-loads (scaled from 23.67%)
498214 LLC-load-misses (scaled from 23.66%)
4953994 LLC-stores (scaled from 23.64%)
243354097 dTLB-loads (scaled from 23.77%)
6468584 dTLB-load-misses (scaled from 23.74%)
249719549 iTLB-loads (scaled from 23.25%)
5060 iTLB-load-misses (scaled from 23.00%)
112343016 branch-loads (scaled from 22.76%)
5528876 branch-load-misses (scaled from 22.54%)

0.427154051 seconds time elapsed.

Reported-by : Ingo Molnar <[email protected]>
Signed-off-by: Jaswinder Singh Rajput <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
tools/perf/util/parse-events.c | 45 ++++++++++++++++++++++++---------------
1 files changed, 28 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 7939a21..430f060 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -71,23 +71,23 @@ static char *sw_event_names[] = {
#define MAX_ALIASES 8

static char *hw_cache[][MAX_ALIASES] = {
- { "L1-data", "l1-d", "l1d" },
- { "L1-instruction", "l1-i", "l1i" },
- { "L2", "l2" },
- { "Data-TLB", "dtlb", "d-tlb" },
- { "Instruction-TLB", "itlb", "i-tlb" },
- { "Branch", "bpu" , "btb", "bpc" },
+ { "L1-d$", "l1-d", "L1-data", },
+ { "L1-i$", "l1-i", "L1-instruction", },
+ { "LLC", "L2" },
+ { "dTLB", "d-tlb", "Data-TLB", },
+ { "iTLB", "i-tlb", "Instruction-TLB", },
+ { "branch", "branches", "bpu", "btb", "bpc", },
};

static char *hw_cache_op[][MAX_ALIASES] = {
- { "Load", "read" },
- { "Store", "write" },
- { "Prefetch", "speculative-read", "speculative-load" },
+ { "load", "loads", "read", },
+ { "store", "stores", "write", },
+ { "prefetch", "prefetches", "speculative-read", "speculative-load", },
};

static char *hw_cache_result[][MAX_ALIASES] = {
- { "Reference", "ops", "access" },
- { "Miss" },
+ { "refs", "Reference", "ops", "access", },
+ { "misses", "miss", },
};

#define C(x) PERF_COUNT_HW_CACHE_##x
@@ -118,6 +118,22 @@ static int is_cache_op_valid(u8 cache_type, u8 cache_op)
return 0; /* invalid */
}

+static char *event_cache_name(u8 cache_type, u8 cache_op, u8 cache_result)
+{
+ static char name[50];
+
+ if (cache_result) {
+ sprintf(name, "%s-%s-%s", hw_cache[cache_type][0],
+ hw_cache_op[cache_op][0],
+ hw_cache_result[cache_result][0]);
+ } else {
+ sprintf(name, "%s-%s", hw_cache[cache_type][0],
+ hw_cache_op[cache_op][1]);
+ }
+
+ return name;
+}
+
char *event_name(int counter)
{
u64 config = attrs[counter].config;
@@ -137,7 +153,6 @@ char *event_name(int counter)

case PERF_TYPE_HW_CACHE: {
u8 cache_type, cache_op, cache_result;
- static char name[100];

cache_type = (config >> 0) & 0xff;
if (cache_type > PERF_COUNT_HW_CACHE_MAX)
@@ -153,12 +168,8 @@ char *event_name(int counter)

if (!is_cache_op_valid(cache_type, cache_op))
return "invalid-cache";
- sprintf(name, "%s-Cache-%s-%ses",
- hw_cache[cache_type][0],
- hw_cache_op[cache_op][0],
- hw_cache_result[cache_result][0]);

- return name;
+ return event_cache_name(cache_type, cache_op, cache_result);
}

case PERF_TYPE_SOFTWARE:

2009-06-25 15:34:50

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Roland Dreier <[email protected]> wrote:

> > 266590464 L1d-loads (scaled from 23.03%)
>
> IMHO "L1d" is too abbreviated for it to be obvious that it means
> L1 data cache. If you want something really short maybe "L1-d$"
> might be a little clearer, but I stil like "L1-dcache" best.

i changed it to L1-d$ / L1-i$.

> The formatting of aligning things is nice defitely but we could
> also shorten lines by saying "scaled from" in fewer characters
> maybe? (I have to admit I don't know what "scaled from X%" means
> in this case so I can't suggest something better, but the fact
> that I can't easily tell what it means probably suggests that the
> wording could be improved)

Indeed, the '(scaled from 12.3%)' message should i guess be changed
to:

(9.43x scaled)

( and a multiplicative factor is more expressive anyway than the
percentage from which we scale up. )

Ingo
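
A rough sketch of the percentage-to-factor conversion discussed above. It assumes that "scaled from X%" means the counter was actually running for X% of the measured time, so the multiplicative factor is simply 100 / X. The helper names `scale_factor()` and `format_scaled()` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Assumption: pct_running is the percentage of time the counter was
 * really counting; the raw count was multiplied by 100 / pct_running.
 */
static double scale_factor(double pct_running)
{
	assert(pct_running > 0.0 && pct_running <= 100.0);
	return 100.0 / pct_running;
}

/* Render the factor in the shorter "(N.NNx scaled)" form. */
static int format_scaled(char *buf, size_t len, double pct_running)
{
	return snprintf(buf, len, "(%.2fx scaled)", scale_factor(pct_running));
}
```

Under this assumption a counter that ran for a quarter of the time would be reported as scaled by a factor of four, which is shorter than the "(scaled from 25.00%)" wording.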

2009-06-25 15:59:03

by Jaswinder Singh Rajput

Subject: Re: [tip:perfcounters/urgent] perf_counter tools: Shorten names for events

Hello Ingo,

On Thu, 2009-06-25 at 15:33 +0000, tip-bot for Jaswinder Singh Rajput
wrote:
> Commit-ID: e5c59547791f171b280bc4c4b2c3ff171824c1a3
> Gitweb: http://git.kernel.org/tip/e5c59547791f171b280bc4c4b2c3ff171824c1a3
> Author: Jaswinder Singh Rajput <[email protected]>
> AuthorDate: Thu, 25 Jun 2009 18:25:22 +0530
> Committer: Ingo Molnar <[email protected]>
> CommitDate: Thu, 25 Jun 2009 17:30:23 +0200
>
> perf_counter tools: Shorten names for events
>
> Added new alias for events.
>
> On AMD box:
>
> $ ./perf stat -e l1d -e l1d-misses -e l1d-write -e l1d-prefetch -e l1d-prefetch-miss -e l1i -e l1i-misses -e l1i-prefetch -e l2 -e l2-misses -e l2-write -e dtlb -e dtlb-misses -e itlb -e itlb-misses -e bpu -e bpu-misses -- ls -lR /usr/include/ > /dev/null
>
> Before :
>
> Performance counter stats for 'ls -lR /usr/include/':
>
> 248064467 L1-data-Cache-Load-Referencees (scaled from 23.27%)
> 1001433 L1-data-Cache-Load-Misses (scaled from 23.34%)
> 153691 L1-data-Cache-Store-Referencees (scaled from 23.34%)
> 423248 L1-data-Cache-Prefetch-Referencees (scaled from 23.33%)
> 302138 L1-data-Cache-Prefetch-Misses (scaled from 23.25%)
> 251217546 L1-instruction-Cache-Load-Referencees (scaled from 23.25%)
> 5757005 L1-instruction-Cache-Load-Misses (scaled from 23.23%)
> 93435 L1-instruction-Cache-Prefetch-Referencees (scaled from 23.24%)
> 6496073 L2-Cache-Load-Referencees (scaled from 23.32%)
> 609485 L2-Cache-Load-Misses (scaled from 23.45%)
> 6876991 L2-Cache-Store-Referencees (scaled from 23.71%)
> 248922840 Data-TLB-Cache-Load-Referencees (scaled from 23.94%)
> 5828386 Data-TLB-Cache-Load-Misses (scaled from 24.17%)
> 257613506 Instruction-TLB-Cache-Load-Referencees (scaled from 24.20%)
> 6833 Instruction-TLB-Cache-Load-Misses (scaled from 23.88%)
> 109043606 Branch-Cache-Load-Referencees (scaled from 23.64%)
> 5552296 Branch-Cache-Load-Misses (scaled from 23.42%)
>
> 0.413702461 seconds time elapsed.
>
> After :
>
> Performance counter stats for 'ls -lR /usr/include/':
>
> 266590464 L1-d$-loads (scaled from 23.03%)
> 1222273 L1-d$-load-misses (scaled from 23.58%)
> 146204 L1-d$-stores (scaled from 23.83%)
> 406344 L1-d$-prefetches (scaled from 24.09%)
> 283748 L1-d$-prefetch-misses (scaled from 24.10%)
> 249650965 L1-i$-loads (scaled from 23.80%)
> 3353961 L1-i$-load-misses (scaled from 23.82%)
> 104599 L1-i$-prefetches (scaled from 23.68%)
> 4836405 LLC-loads (scaled from 23.67%)
> 498214 LLC-load-misses (scaled from 23.66%)
> 4953994 LLC-stores (scaled from 23.64%)
> 243354097 dTLB-loads (scaled from 23.77%)
> 6468584 dTLB-load-misses (scaled from 23.74%)
> 249719549 iTLB-loads (scaled from 23.25%)
> 5060 iTLB-load-misses (scaled from 23.00%)
> 112343016 branch-loads (scaled from 22.76%)
> 5528876 branch-load-misses (scaled from 22.54%)
>
> 0.427154051 seconds time elapsed.
>
> Reported-by : Ingo Molnar <[email protected]>
> Signed-off-by: Jaswinder Singh Rajput <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> LKML-Reference: <[email protected]>
> Signed-off-by: Ingo Molnar <[email protected]>
>
>
> ---
> tools/perf/util/parse-events.c | 45 ++++++++++++++++++++++++---------------
> 1 files changed, 28 insertions(+), 17 deletions(-)
>
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 7939a21..430f060 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -71,23 +71,23 @@ static char *sw_event_names[] = {
> #define MAX_ALIASES 8
>
> static char *hw_cache[][MAX_ALIASES] = {
> - { "L1-data", "l1-d", "l1d" },
> - { "L1-instruction", "l1-i", "l1i" },
> - { "L2", "l2" },
> - { "Data-TLB", "dtlb", "d-tlb" },
> - { "Instruction-TLB", "itlb", "i-tlb" },
> - { "Branch", "bpu" , "btb", "bpc" },
> + { "L1-d$", "l1-d", "L1-data", },
> + { "L1-i$", "l1-i", "L1-instruction", },

You replaced the 'l1d' and 'l1i' aliases with 'L1-d$' and 'L1-i$', so the
above command now fails.

[PATCH] perf_counter tools: re-add aliases for l1d and l1i which were removed by mistake

Commit e5c59547791f171 renamed the preexisting aliases by mistake, which makes commands that use them fail.

Signed-off-by: Jaswinder Singh Rajput <[email protected]>
---
tools/perf/util/parse-events.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 430f060..4d042f1 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -71,8 +71,8 @@ static char *sw_event_names[] = {
#define MAX_ALIASES 8

static char *hw_cache[][MAX_ALIASES] = {
- { "L1-d$", "l1-d", "L1-data", },
- { "L1-i$", "l1-i", "L1-instruction", },
+ { "L1-d$", "l1-d", "l1d", "L1-data", },
+ { "L1-i$", "l1-i", "l1i", "L1-instruction", },
{ "LLC", "L2" },
{ "dTLB", "d-tlb", "Data-TLB", },
{ "iTLB", "i-tlb", "Instruction-TLB", },
--
1.6.0.6





2009-06-25 19:56:33

by Ingo Molnar

Subject: Re: [tip:perfcounters/urgent] perf_counter tools: Shorten names for events


* Jaswinder Singh Rajput <[email protected]> wrote:

> - { "L1-d$", "l1-d", "L1-data", },
> - { "L1-i$", "l1-i", "L1-instruction", },
> + { "L1-d$", "l1-d", "l1d", "L1-data", },
> + { "L1-i$", "l1-i", "l1i", "L1-instruction", },

Yes, those aliases still make sense indeed, as long as they are not
the primary alias (which is being displayed).
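
The rule above (the first entry of a row is what gets displayed, every entry is accepted as input) can be sketched roughly as follows. This is only an illustration: the rows are copied from the patch, but find_cache() and display_name() are hypothetical helpers, and the case-insensitive whole-string match is an assumption, not necessarily what parse-events.c does.

```c
#include <stddef.h>
#include <strings.h>

#define MAX_ALIASES 8

/* Rows as shown in the patch; column 0 is the primary (displayed) alias. */
static const char *hw_cache[][MAX_ALIASES] = {
	{ "L1-d$",	"l1-d",		"l1d",		"L1-data",		},
	{ "L1-i$",	"l1-i",		"l1i",		"L1-instruction",	},
	{ "LLC",	"L2"							},
	{ "dTLB",	"d-tlb",	"Data-TLB",				},
	{ "iTLB",	"i-tlb",	"Instruction-TLB",			},
};

#define NR_CACHES (sizeof(hw_cache) / sizeof(hw_cache[0]))

/* Hypothetical helper: return the row whose aliases contain name, or -1. */
static int find_cache(const char *name)
{
	for (size_t i = 0; i < NR_CACHES; i++) {
		/* Unused slots are NULL, so stop at the first empty one. */
		for (int j = 0; j < MAX_ALIASES && hw_cache[i][j]; j++) {
			if (!strcasecmp(name, hw_cache[i][j]))
				return (int)i;
		}
	}
	return -1;
}

/* Hypothetical helper: the name printed in the stats output for a row. */
static const char *display_name(int idx)
{
	if (idx < 0 || (size_t)idx >= NR_CACHES)
		return NULL;
	return hw_cache[idx][0];
}
```

With this layout, display_name(find_cache("l1d")) yields "L1-d$": the short alias is accepted on the command line, but only the primary alias is ever printed.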

Ingo

2009-06-25 19:58:20

by Jaswinder Singh Rajput

Subject: [tip:perfcounters/urgent] perf_counter tools: Add alias for 'l1d' and 'l1i'

Commit-ID: 4418351f06d9ce73acc846158c20186965f920f3
Gitweb: http://git.kernel.org/tip/4418351f06d9ce73acc846158c20186965f920f3
Author: Jaswinder Singh Rajput <[email protected]>
AuthorDate: Thu, 25 Jun 2009 21:27:42 +0530
Committer: Ingo Molnar <[email protected]>
CommitDate: Thu, 25 Jun 2009 21:54:53 +0200

perf_counter tools: Add alias for 'l1d' and 'l1i'

Add 'l1d' and 'l1i' aliases again as shortcuts - just don't make them
the primary display alias.

Signed-off-by: Jaswinder Singh Rajput <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>


---
tools/perf/util/parse-events.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 430f060..4d042f1 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -71,8 +71,8 @@ static char *sw_event_names[] = {
#define MAX_ALIASES 8

static char *hw_cache[][MAX_ALIASES] = {
- { "L1-d$", "l1-d", "L1-data", },
- { "L1-i$", "l1-i", "L1-instruction", },
+ { "L1-d$", "l1-d", "l1d", "L1-data", },
+ { "L1-i$", "l1-i", "l1i", "L1-instruction", },
{ "LLC", "L2" },
{ "dTLB", "d-tlb", "Data-TLB", },
{ "iTLB", "i-tlb", "Instruction-TLB", },

2009-06-26 16:07:21

by Jaswinder Singh Rajput

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events

On Thu, 2009-06-25 at 17:34 +0200, Ingo Molnar wrote:
> * Roland Dreier <[email protected]> wrote:
>
> > > 266590464 L1d-loads (scaled from 23.03%)
> >
> > IMHO "L1d" is too abbreviated for it to be obvious that it means
> > L1 data cache. If you want something really short maybe "L1-d$"
> > might be a little clearer, but I still like "L1-dcache" best.
>
> i changed it to L1-d$ / L1-i$.
>
> > The formatting of aligning things is definitely nice but we could
> > also shorten lines by saying "scaled from" in fewer characters
> > maybe? (I have to admit I don't know what "scaled from X%" means
> > in this case so I can't suggest something better, but the fact
> > that I can't easily tell what it means probably suggests that the
> > wording could be improved)
>
> Indeed, the '(scaled from 12.3%)' message should i guess be changed
> to:
>
> (9.43x scaled)
>
> ( and a multiplicative factor is more expressive anyway than the
> percentage from which we scale up. )
>

Can you please let me know how you get 9.43x from 12.3%?

Thanks,
--
JSR


2009-06-26 18:48:17

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Jaswinder Singh Rajput <[email protected]> wrote:

> On Thu, 2009-06-25 at 17:34 +0200, Ingo Molnar wrote:
> > * Roland Dreier <[email protected]> wrote:
> >
> > > > 266590464 L1d-loads (scaled from 23.03%)
> > >
> > > IMHO "L1d" is too abbreviated for it to be obvious that it means
> > > L1 data cache. If you want something really short maybe "L1-d$"
> > > might be a little clearer, but I still like "L1-dcache" best.
> >
> > i changed it to L1-d$ / L1-i$.
> >
> > > The formatting of aligning things is definitely nice but we could
> > > also shorten lines by saying "scaled from" in fewer characters
> > > maybe? (I have to admit I don't know what "scaled from X%" means
> > > in this case so I can't suggest something better, but the fact
> > > that I can't easily tell what it means probably suggests that the
> > > wording could be improved)
> >
> > Indeed, the '(scaled from 12.3%)' message should i guess be changed
> > to:
> >
> > (9.43x scaled)
> >
> > ( and a multiplicative factor is more expressive anyway than the
> > percentage from which we scale up. )
> >
>
> Can you please let me know how you get 9.43x from 12.3%?

It was a guesstimate - the real factor is 8.13x.

From a scaling percentage X you get the multiplier by doing:

100.0 / X
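
As a worked check, 100.0 / 12.3 = 8.13. A minimal sketch of the conversion and of the proposed "(N.NNx scaled)" wording follows. The helper names scale_multiplier() and format_scaled() are made up for illustration and are not part of the perf tool.

```c
#include <stdio.h>

/*
 * Hypothetical helper: turn the X in "scaled from X%" into the
 * multiplier 100.0 / X. Returns 0.0 for a non-positive percentage,
 * where no meaningful multiplier exists.
 */
static double scale_multiplier(double percent)
{
	if (percent <= 0.0)
		return 0.0;
	return 100.0 / percent;
}

/*
 * Hypothetical helper: render the multiplier in the proposed
 * "(8.13x scaled)" form. Returns snprintf()'s result.
 */
static int format_scaled(char *buf, size_t size, double percent)
{
	return snprintf(buf, size, "(%.2fx scaled)", scale_multiplier(percent));
}
```

For the 12.3% figure from the thread, format_scaled() produces "(8.13x scaled)".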

Ingo

2009-07-05 00:28:55

by Anton Blanchard

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


Hi,

> IMHO "L1d" is too abbreviated for it to be obvious that it means L1 data
> cache. If you want something really short maybe "L1-d$" might be a
> little clearer, but I still like "L1-dcache" best.

There's one problem with L1-d$-* for event names. It took me a few goes
before I realised I was being hit by bash variable expansion:


# perf stat -e L1-d$-loads ls

usage: perf stat [<options>] <command>

...


# perf stat -e 'L1-d$-loads' ls

Performance counter stats for 'ls':

1273291 L1-d$-loads

0.004434037 seconds time elapsed


I also prefer the more verbose L1-dcache-* names, and since we support
aliases it's mostly a matter of screen real estate when printing out the
statistics.
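
In the unquoted command above, the shell treats "$-" as a special parameter (its current option flags), so perf never sees the literal string L1-d$-loads. One way to guard against this class of problem is to check candidate display names for shell metacharacters. The sketch below is purely illustrative: event_name_is_shell_safe() is a hypothetical helper and the character set is an assumption, not an exhaustive list.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if name can be typed on a shell
 * command line without quoting, 0 otherwise. The set of rejected
 * characters is an illustrative assumption, not a complete list.
 */
static int event_name_is_shell_safe(const char *name)
{
	static const char unsafe[] = "$`\"'\\*?[]{}()<>|&;!~# \t\n";

	if (!name || !*name)
		return 0;
	/* strpbrk() finds the first character from the unsafe set, if any. */
	return strpbrk(name, unsafe) == NULL;
}
```

Here "L1-d$-loads" is rejected because of the '$', while "L1-dcache-loads" passes.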

Anton

2009-07-05 00:33:17

by Ingo Molnar

Subject: Re: [PATCH -tip] perf_counter tools: shorten names for events


* Anton Blanchard <[email protected]> wrote:

>
> Hi,
>
> > IMHO "L1d" is too abbreviated for it to be obvious that it means L1 data
> > cache. If you want something really short maybe "L1-d$" might be a
> > little clearer, but I still like "L1-dcache" best.
>
> There's one problem with L1-d$-* for event names. It took me a few goes
> before I realised I was being hit by bash variable expansion:
>
>
> # perf stat -e L1-d$-loads ls
>
> usage: perf stat [<options>] <command>
>
> ...
>
>
> # perf stat -e 'L1-d$-loads' ls
>
> Performance counter stats for 'ls':
>
> 1273291 L1-d$-loads
>
> 0.004434037 seconds time elapsed
>
>
> I also prefer the more verbose L1-dcache-* names, and since we
> support aliases it's mostly a matter of screen real estate when
> printing out the statistics.

ok - mind sending a patch for this?

Ingo

2009-07-06 12:03:48

by Anton Blanchard

Subject: [PATCH] perf_counter tools: Rename cache events to remove $

The cache event names contain '$', which triggers shell variable expansion.
To avoid confusion, change this to 'cache', i.e. L1-d$-loads becomes
L1-dcache-loads.

Signed-off-by: Anton Blanchard <[email protected]>
---

Index: linux.trees.git/tools/perf/util/parse-events.c
===================================================================
--- linux.trees.git.orig/tools/perf/util/parse-events.c 2009-07-06 21:50:53.000000000 +1000
+++ linux.trees.git/tools/perf/util/parse-events.c 2009-07-06 21:51:12.000000000 +1000
@@ -71,8 +71,8 @@
#define MAX_ALIASES 8

static char *hw_cache[][MAX_ALIASES] = {
- { "L1-d$", "l1-d", "l1d", "L1-data", },
- { "L1-i$", "l1-i", "l1i", "L1-instruction", },
+ { "L1-dcache", "l1-d", "l1d", "L1-data", },
+ { "L1-icache", "l1-i", "l1i", "L1-instruction", },
{ "LLC", "L2" },
{ "dTLB", "d-tlb", "Data-TLB", },
{ "iTLB", "i-tlb", "Instruction-TLB", },

2009-07-10 10:40:51

by Anton Blanchard

Subject: [tip:perfcounters/core] perf_counter tools: Rename cache events to remove $

Commit-ID: 9590b7ba3fefdfe0c7741f5e2f61faf2ffcea19c
Gitweb: http://git.kernel.org/tip/9590b7ba3fefdfe0c7741f5e2f61faf2ffcea19c
Author: Anton Blanchard <[email protected]>
AuthorDate: Mon, 6 Jul 2009 22:01:31 +1000
Committer: Ingo Molnar <[email protected]>
CommitDate: Fri, 10 Jul 2009 10:04:06 +0200

perf_counter tools: Rename cache events to remove $

The cache event names contain '$', which triggers shell variable
expansion. To avoid confusion, change this to 'cache', i.e.
L1-d$-loads becomes L1-dcache-loads.

Signed-off-by: Anton Blanchard <[email protected]>
Cc: Roland Dreier <[email protected]>
Cc: Jaswinder Singh Rajput <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <20090706120131.GB4391@kryten>
Signed-off-by: Ingo Molnar <[email protected]>


---
tools/perf/util/parse-events.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 5184959..518a33a 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -71,8 +71,8 @@ static char *sw_event_names[] = {
#define MAX_ALIASES 8

static char *hw_cache[][MAX_ALIASES] = {
- { "L1-d$", "l1-d", "l1d", "L1-data", },
- { "L1-i$", "l1-i", "l1i", "L1-instruction", },
+ { "L1-dcache", "l1-d", "l1d", "L1-data", },
+ { "L1-icache", "l1-i", "l1i", "L1-instruction", },
{ "LLC", "L2" },
{ "dTLB", "d-tlb", "Data-TLB", },
{ "iTLB", "i-tlb", "Instruction-TLB", },