2009-10-19 10:03:42

by Tim Blechmann

[permalink] [raw]
Subject: [PATCH] perf stat: add branch performance counters to default

not sure, whether there is any reason, why the branch performance
counters are not enabled by default.
personally, i would find it quite useful, though

--

this patch adds performance counter information about branches and
branch misses to the default output of perf stat.

Signed-off-by: Tim Blechmann <[email protected]>
---
tools/perf/builtin-stat.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)



Attachments:
0001-perf-stat-add-branch-performance-counters-to-default.patch (615.00 B)
signature.asc (197.00 B)
OpenPGP digital signature
Download all attachments

2009-10-19 11:23:09

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] perf stat: add branch performance counters to default


* Tim Blechmann <[email protected]> wrote:

> not sure, whether there is any reason, why the branch performance
> counters are not enabled by default. personally, i would find it quite
> useful, though

yeah, i find it useful too. The main reason is that most of the systems
out there have like 2 freely defined events. Only Nehalem and i think
Power7 has more.

OTOH ... maybe we should just do it, as most measurement and tuning is
done on modern hardware. I'll try your patch in the perf tree - if too
many people complain we will undo it.

Ingo

2009-10-19 11:43:17

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] perf stat: add branch performance counters to default


btw., your patch was somewhat whitespace damaged (see
Documentation/email-clients.txt).

I fixed it up manually - please double check latest -tip whether it's
working all fine for you:

http://people.redhat.com/mingo/tip.git/README

Thanks,

Ingo

2009-10-19 11:48:00

by Tim Blechmann

[permalink] [raw]
Subject: [tip:perf/core] perf stat: Add branch performance events to default output

Commit-ID: 12133afffcc7140eea915b1572189a2ea0cf7b0e
Gitweb: http://git.kernel.org/tip/12133afffcc7140eea915b1572189a2ea0cf7b0e
Author: Tim Blechmann <[email protected]>
AuthorDate: Mon, 19 Oct 2009 12:03:33 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 19 Oct 2009 13:26:42 +0200

perf stat: Add branch performance events to default output

Adds performance event information about branches
and branch misses to the default output of perf stat.

Signed-off-by: Tim Blechmann <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
tools/perf/builtin-stat.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c373683..95a55ea 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = {
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS },
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },

};

2009-10-19 11:47:51

by Ingo Molnar

[permalink] [raw]
Subject: [tip:perf/core] perf stat: Re-align the default_attrs[] array

Commit-ID: 56aab464ff6232bcc2f53b26576983dc83f75db7
Gitweb: http://git.kernel.org/tip/56aab464ff6232bcc2f53b26576983dc83f75db7
Author: Ingo Molnar <[email protected]>
AuthorDate: Mon, 19 Oct 2009 13:27:08 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 19 Oct 2009 13:27:08 +0200

perf stat: Re-align the default_attrs[] array

Clean up the array definition to be vertically aligned.

No functional effects.

Cc: Tim Blechmann <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
tools/perf/builtin-stat.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c373683..95a55ea 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = {
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS },
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },

};
---
tools/perf/builtin-stat.c | 20 ++++++++++----------
1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 95a55ea..90e0a26 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -50,17 +50,17 @@

static struct perf_event_attr default_attrs[] = {

- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES},
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
-
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES },
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS },
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES },
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
+
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },

};

2009-10-19 11:48:17

by Ingo Molnar

[permalink] [raw]
Subject: [tip:perf/core] perf stat: Count branches first

Commit-ID: dd86e72abdbc4b436471af5a97927c6145f5298c
Gitweb: http://git.kernel.org/tip/dd86e72abdbc4b436471af5a97927c6145f5298c
Author: Ingo Molnar <[email protected]>
AuthorDate: Mon, 19 Oct 2009 13:33:03 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 19 Oct 2009 13:36:32 +0200

perf stat: Count branches first

Count branches first, cache-misses second. The reason is that
on x86 branches are not counted by all counters on all CPUs.

Before:

Performance counter stats for 'ls':

0.756653 task-clock-msecs # 0.802 CPUs
0 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
250 page-faults # 0.330 M/sec
2375725 cycles # 3139.781 M/sec
1628129 instructions # 0.685 IPC
19643 cache-references # 25.960 M/sec
4608 cache-misses # 6.090 M/sec
342532 branches # 452.694 M/sec
<not counted> branch-misses

0.000943356 seconds time elapsed

After:

Performance counter stats for 'ls':

1.056734 task-clock-msecs # 0.859 CPUs
0 context-switches # 0.000 M/sec
0 CPU-migrations # 0.000 M/sec
259 page-faults # 0.245 M/sec
3345932 cycles # 3166.295 M/sec
3074090 instructions # 0.919 IPC
616928 branches # 583.806 M/sec
39279 branch-misses # 6.367 %
21312 cache-references # 20.168 M/sec
3661 cache-misses # 3.464 M/sec

0.001230551 seconds time elapsed

(also prettify the printout of branch misses, in case it's
getting scaled.)

Cc: Tim Blechmann <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
tools/perf/builtin-stat.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c373683..95a55ea 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = {
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS },
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },

};
---
tools/perf/builtin-stat.c | 20 ++++++++++----------
1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 95a55ea..90e0a26 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -50,17 +50,17 @@

static struct perf_event_attr default_attrs[] = {

- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES},
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
- { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
-
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES },
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS },
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES},
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES },
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS},
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS },
+ { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS },
+
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },

};
---
tools/perf/builtin-stat.c | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 90e0a26..c6df377 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -57,10 +57,10 @@ static struct perf_event_attr default_attrs[] = {

{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES },
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS },
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES },
- { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES },
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
{ .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES },
+ { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES },

};

@@ -363,7 +363,7 @@ static void abs_printout(int counter, double avg)
if (total)
ratio = avg * 100 / total;

- fprintf(stderr, " # %10.3f %% ", ratio);
+ fprintf(stderr, " # %10.3f %% ", ratio);

} else {
total = avg_stats(&runtime_nsecs_stats);