2013-09-30 11:19:53

by Ramkumar Ramachandra

[permalink] [raw]
Subject: [PATCH 0/2] Toying with the perf tool

Hi,

I was curiously poking around the perf tool this morning. The hacking
session resulted in two minor patches:

[1/2] is a minor non-functional cleanup of the Makefile.
[2/2] is more important: it omits printing bogus data in an edge case.

Thanks for reading.

Ramkumar Ramachandra (2):
perf tool: simplify ARCH code in Makefile
perf tool: don't print bogus data on -e cycles

tools/perf/builtin-stat.c | 9 ++++-----
tools/perf/config/Makefile | 47 ++++++++++++++++++++++------------------------
2 files changed, 26 insertions(+), 30 deletions(-)

--
1.8.4.477.g5d89aa9


2013-09-30 11:19:58

by Ramkumar Ramachandra

[permalink] [raw]
Subject: [PATCH 1/2] perf tool: simplify ARCH code in Makefile

The ARCH is first determined from `uname -m`, and then overridden to x86
anyway. This is unnecessarily ugly; follow the example set by
ffee0de (x86: Default to ARCH=x86 to avoid overriding CONFIG_64BIT,
2012-12-20) to simplify the code.

Cc: Jiri Olsa <[email protected]>
Cc: Michael Witten <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Ramkumar Ramachandra <[email protected]>
---
tools/perf/config/Makefile | 47 ++++++++++++++++++++++------------------------
1 file changed, 22 insertions(+), 25 deletions(-)

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 5f6f9b3..45a8515 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -1,33 +1,30 @@
-uname_M := $(shell uname -m 2>/dev/null || echo not)
-
-ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/i386/ -e s/sun4u/sparc64/ \
- -e s/arm.*/arm/ -e s/sa110/arm/ \
- -e s/s390x/s390/ -e s/parisc64/parisc/ \
- -e s/ppc.*/powerpc/ -e s/mips.*/mips/ \
- -e s/sh[234].*/sh/ -e s/aarch64.*/arm64/ )
+ARCH ?= $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \
+ -e s/sun4u/sparc64/ \
+ -e s/arm.*/arm/ -e s/sa110/arm/ \
+ -e s/s390x/s390/ -e s/parisc64/parisc/ \
+ -e s/ppc.*/powerpc/ -e s/mips.*/mips/ \
+ -e s/sh[234].*/sh/ -e s/aarch64.*/arm64/ )
NO_PERF_REGS := 1
CFLAGS := $(EXTRA_CFLAGS) $(EXTRA_WARNINGS)

# Additional ARCH settings for x86
-ifeq ($(ARCH),i386)
- override ARCH := x86
- NO_PERF_REGS := 0
- LIBUNWIND_LIBS = -lunwind -lunwind-x86
-endif
-
-ifeq ($(ARCH),x86_64)
- override ARCH := x86
- IS_X86_64 := 0
- ifeq (, $(findstring m32,$(CFLAGS)))
- IS_X86_64 := $(shell echo __x86_64__ | ${CC} -E -x c - | tail -n 1)
- endif
- ifeq (${IS_X86_64}, 1)
- RAW_ARCH := x86_64
- CFLAGS += -DARCH_X86_64
- ARCH_INCLUDE = ../../arch/x86/lib/memcpy_64.S ../../arch/x86/lib/memset_64.S
+ifeq ($(ARCH),x86)
+ ifeq ($(shell uname -m),x86_64)
+ IS_X86_64 := 0
+ ifeq (, $(findstring m32,$(CFLAGS)))
+ IS_X86_64 := $(shell echo __x86_64__ | ${CC} -E -x c - | tail -n 1)
+ endif
+ ifeq (${IS_X86_64}, 1)
+ RAW_ARCH := x86_64
+ CFLAGS += -DARCH_X86_64
+ ARCH_INCLUDE = ../../arch/x86/lib/memcpy_64.S ../../arch/x86/lib/memset_64.S
+ endif
+ NO_PERF_REGS := 0
+ LIBUNWIND_LIBS = -lunwind -lunwind-x86_64
+ else
+ NO_PERF_REGS := 0
+ LIBUNWIND_LIBS = -lunwind -lunwind-x86
endif
- NO_PERF_REGS := 0
- LIBUNWIND_LIBS = -lunwind -lunwind-x86_64
endif

ifeq ($(NO_PERF_REGS),0)
--
1.8.4.477.g5d89aa9

2013-09-30 11:20:13

by Ramkumar Ramachandra

[permalink] [raw]
Subject: [PATCH 2/2] perf tool: don't print bogus data on -e cycles

When only the cycles event is requested:

$ perf stat -e cycles dd if=/dev/zero of=/dev/null count=1000000
1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 0.26123 s, 2.0 GB/s

Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':

911,626,453 cycles # 0.000 GHz

0.262113350 seconds time elapsed

The 0.000 GHz comment in the output is totally bogus and misleading. It
happens because update_shadow_stats() doesn't touch runtime_nsecs_stats;
it is only written when a requested counter matches a SW_TASK_CLOCK. In
our case, since we have only requested HW_CPU_CYCLES,
runtime_nsecs_stats is unavailable. So, omit printing the comment
altogether.

Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ramkumar Ramachandra <[email protected]>
---
tools/perf/builtin-stat.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f686d5f..cc167ae 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -918,11 +918,10 @@ static void abs_printout(int cpu, int nr, struct perf_evsel *evsel, double avg)
print_stalled_cycles_backend(cpu, evsel, avg);
} else if (perf_evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) {
total = avg_stats(&runtime_nsecs_stats[cpu]);
-
- if (total)
- ratio = 1.0 * avg / total;
-
- fprintf(output, " # %8.3f GHz ", ratio);
+ if (total) {
+ ratio = avg / total;
+ fprintf(output, " # %8.3f GHz ", ratio);
+ }
} else if (runtime_nsecs_stats[cpu].n != 0) {
char unit = 'M';

--
1.8.4.477.g5d89aa9

2013-10-01 08:49:06

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf tool: don't print bogus data on -e cycles


* Ramkumar Ramachandra <[email protected]> wrote:

> When only the cycles event is requested:
>
> $ perf stat -e cycles dd if=/dev/zero of=/dev/null count=1000000
> 1000000+0 records in
> 1000000+0 records out
> 512000000 bytes (512 MB) copied, 0.26123 s, 2.0 GB/s
>
> Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':
>
> 911,626,453 cycles # 0.000 GHz
>
> 0.262113350 seconds time elapsed
>
> The 0.000 GHz comment in the output is totally bogus and misleading. It
> happens because update_shadow_stats() doesn't touch runtime_nsecs_stats;
> it is only written when a requested counter matches a SW_TASK_CLOCK. In
> our case, since we have only requested HW_CPU_CYCLES,
> runtime_nsecs_stats is unavailable. So, omit printing the comment
> altogether.
>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Signed-off-by: Ramkumar Ramachandra <[email protected]>
> ---
> tools/perf/builtin-stat.c | 9 ++++-----
> 1 file changed, 4 insertions(+), 5 deletions(-)

Acked-by: Ingo Molnar <[email protected]>

Thanks,

Ingo

Subject: [tip:perf/core] perf stat: Don't print bogus data on -e cycles

Commit-ID: c458fe62ca31496664c1211a7906d261220b18f9
Gitweb: http://git.kernel.org/tip/c458fe62ca31496664c1211a7906d261220b18f9
Author: Ramkumar Ramachandra <[email protected]>
AuthorDate: Mon, 30 Sep 2013 16:43:05 +0530
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Fri, 11 Oct 2013 12:17:33 -0300

perf stat: Don't print bogus data on -e cycles

When only the cycles event is requested:

$ perf stat -e cycles dd if=/dev/zero of=/dev/null count=1000000
1000000+0 records in
1000000+0 records out
512000000 bytes (512 MB) copied, 0.26123 s, 2.0 GB/s

Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':

911,626,453 cycles # 0.000 GHz

0.262113350 seconds time elapsed

The 0.000 GHz comment in the output is totally bogus and misleading. It
happens because update_shadow_stats() doesn't touch runtime_nsecs_stats;
it is only written when a requested counter matches a SW_TASK_CLOCK. In
our case, since we have only requested HW_CPU_CYCLES,
runtime_nsecs_stats is unavailable. So, omit printing the comment
altogether.

Signed-off-by: Ramkumar Ramachandra <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 700b478..ce2266c 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -997,10 +997,10 @@ static void abs_printout(int cpu, int nr, struct perf_evsel *evsel, double avg)
} else if (perf_evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) {
total = avg_stats(&runtime_nsecs_stats[cpu]);

- if (total)
- ratio = 1.0 * avg / total;
-
- fprintf(output, " # %8.3f GHz ", ratio);
+ if (total) {
+ ratio = avg / total;
+ fprintf(output, " # %8.3f GHz ", ratio);
+ }
} else if (transaction_run &&
perf_evsel__cmp(evsel, nth_evsel(T_CYCLES_IN_TX))) {
total = avg_stats(&runtime_cycles_stats[cpu]);