2012-10-26 14:33:25

by Arnaldo Carvalho de Melo

Subject: [GIT PULL 0/9] perf/core improvements and fixes

Hi Ingo,

Please consider pulling,

- Arnaldo

The following changes since commit 8f7c1d07ade50dcdea7ec779b277e891f5c8292a:

Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2012-10-26 10:30:49 +0200)

are available in the git repository at:


git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo

for you to fetch changes up to 1f16c5754d3a4008c29f3bf67b4f1271313ba385:

perf stat: Add --pre and --post command (2012-10-26 11:22:25 -0200)

----------------------------------------------------------------
perf/core improvements:

. perf inject changes to allow showing where a task sleeps, from Andrew Vagin.

. Makefile improvements from Namhyung Kim.

. Add --pre and --post command hooks in 'stat', from Peter Zijlstra.

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

----------------------------------------------------------------
Andrew Vagin (3):
perf inject: Work with files
perf inject: Merge sched_stat_* and sched_switch events
perf inject: Mark a dso if it's used

Namhyung Kim (5):
tools lib traceevent: Do not generate dependency for system header files
perf tools: Cleanup doc related targets
perf tools: Convert invocation of MAKE into SUBDIR
perf tools: Always show CHK message when doing try-cc
perf tools: Fix LIBELF_MMAP checking

Peter Zijlstra (1):
perf stat: Add --pre and --post command

tools/lib/traceevent/Makefile | 2 +-
tools/perf/Documentation/perf-inject.txt | 11 ++
tools/perf/Documentation/perf-stat.txt | 5 +
tools/perf/Makefile | 51 ++------
tools/perf/builtin-inject.c | 189 ++++++++++++++++++++++++++++--
tools/perf/builtin-stat.c | 42 ++++++-
tools/perf/config/utilities.mak | 3 +-
tools/perf/util/build-id.c | 10 +-
tools/perf/util/build-id.h | 4 +
9 files changed, 256 insertions(+), 61 deletions(-)


2012-10-26 14:32:19

by Arnaldo Carvalho de Melo

Subject: [PATCH 4/9] perf tools: Always show CHK message when doing try-cc

From: Namhyung Kim <[email protected]>

It might be useful to see what's happening behind the scenes rather than
just waiting a few seconds during the config checking.

Also align the CHK message with other ones.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/config/utilities.mak | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/config/utilities.mak b/tools/perf/config/utilities.mak
index ea853c2..e541312 100644
--- a/tools/perf/config/utilities.mak
+++ b/tools/perf/config/utilities.mak
@@ -183,9 +183,8 @@ _gea_err = $(if $(1),$(error Please set '$(1)' appropriately))
# Usage: option = $(call try-cc, source-to-build, cc-options, msg)
ifndef V
TRY_CC_OUTPUT= > /dev/null 2>&1
-else
-TRY_CC_MSG=echo "CHK $(3)" 1>&2;
endif
+TRY_CC_MSG=echo " CHK $(3)" 1>&2;

try-cc = $(shell sh -c \
'TMP="$(OUTPUT)$(TMPOUT).$$$$"; \
--
1.7.9.2.358.g22243

2012-10-26 14:32:32

by Arnaldo Carvalho de Melo

Subject: [PATCH 1/9] tools lib traceevent: Do not generate dependency for system header files

From: Namhyung Kim <[email protected]>

Ingo reported (again!) that 'make clean' on perf/traceevent does not
work because of a problem with a system header file. Quoting Ingo:

"Note that the old dependency related build failure thought to be
fixed in commit 860df5833e46 is back:

make[1]: *** No rule to make target
`/usr/lib/gcc/x86_64-redhat-linux/4.7.0/include/stddef.h', needed by `.trace-seq.d'. Stop.

'make clean' itself does not work in libtraceevent:

comet:~/tip/tools/lib/traceevent> make clean
make: *** No rule to make target `/usr/lib/gcc/x86_64-redhat-linux/4.7.0/include/stddef.h', needed by `.trace-seq.d'. Stop.

So I had to clean it out manually:

comet:~/tip/tools/lib/traceevent> git ls-files --others | xargs rm
comet:~/tip/tools/lib/traceevent>

and then things build fine."

Try to fix it by excluding system headers from dependency generation,
i.e. by passing -MM instead of -M to the compiler.

Signed-off-by: Namhyung Kim <[email protected]>
Reported-by: Ingo Molnar <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/lib/traceevent/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/lib/traceevent/Makefile b/tools/lib/traceevent/Makefile
index 04d959f..a20e320 100644
--- a/tools/lib/traceevent/Makefile
+++ b/tools/lib/traceevent/Makefile
@@ -253,7 +253,7 @@ all_deps := $(all_objs:%.o=.%.d)
# let .d file also depends on the source and header files
define check_deps
@set -e; $(RM) $@; \
- $(CC) -M $(CFLAGS) $< > $@.$$$$; \
+ $(CC) -MM $(CFLAGS) $< > $@.$$$$; \
sed 's,\($*\)\.o[ :]*,\1.o $@ : ,g' < $@.$$$$ > $@; \
$(RM) $@.$$$$
endef
--
1.7.9.2.358.g22243

2012-10-26 14:32:40

by Arnaldo Carvalho de Melo

Subject: [PATCH 5/9] perf tools: Fix LIBELF_MMAP checking

From: Namhyung Kim <[email protected]>

Currently, checking for mmap support in libelf fails due to wrong flags.

CHK libelf
CHK libdw
CHK libunwind
CHK -DLIBELF_MMAP
/tmp/ccYJwdR0.o: In function `main':
:(.text+0x18): undefined reference to `elf_begin'
collect2: error: ld returned 1 exit status

This should not happen, since we already checked elf_begin() when
checking libelf and it succeeded.

Fix it by using the same flags as the libelf check.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Makefile | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 2d0c09a..629fc6a 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -541,7 +541,8 @@ LIB_OBJS += $(OUTPUT)util/symbol-minimal.o
else # NO_LIBELF
BASIC_CFLAGS += -DLIBELF_SUPPORT

-ifeq ($(call try-cc,$(SOURCE_ELF_MMAP),$(FLAGS_COMMON),-DLIBELF_MMAP),y)
+FLAGS_LIBELF=$(ALL_CFLAGS) $(ALL_LDFLAGS) $(EXTLIBS)
+ifeq ($(call try-cc,$(SOURCE_ELF_MMAP),$(FLAGS_LIBELF),-DLIBELF_MMAP),y)
BASIC_CFLAGS += -DLIBELF_MMAP
endif

--
1.7.9.2.358.g22243

2012-10-26 14:33:23

by Arnaldo Carvalho de Melo

Subject: [PATCH 3/9] perf tools: Convert invocation of MAKE into SUBDIR

From: Namhyung Kim <[email protected]>

This will show directory change info in a consistent form. It can also
be converted later into David Howells' descend command.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Howells <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Makefile | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index 5cf40cb..2d0c09a 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -985,7 +985,7 @@ INSTALL_DOC_TARGETS += quick-install-doc quick-install-man quick-install-html

# 'make doc' should call 'make -C Documentation all'
$(DOC_TARGETS):
- $(MAKE) -C Documentation $(@:doc=all)
+ $(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) $(@:doc=all)

TAGS:
$(RM) TAGS
@@ -1058,7 +1058,7 @@ install-python_ext:

# 'make install-doc' should call 'make -C Documentation install'
$(INSTALL_DOC_TARGETS):
- $(MAKE) -C Documentation $(@:-doc=)
+ $(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) $(@:-doc=)

### Cleaning rules

@@ -1066,7 +1066,7 @@ clean: $(LIBTRACEEVENT)-clean
$(RM) $(LIB_OBJS) $(BUILTIN_OBJS) $(LIB_FILE) $(OUTPUT)perf-archive $(OUTPUT)perf.o $(LANG_BINDINGS)
$(RM) $(ALL_PROGRAMS) perf
$(RM) *.spec *.pyc *.pyo */*.pyc */*.pyo $(OUTPUT)common-cmds.h TAGS tags cscope*
- $(MAKE) -C Documentation/ clean
+ $(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) clean
$(RM) $(OUTPUT)PERF-VERSION-FILE $(OUTPUT)PERF-CFLAGS
$(RM) $(OUTPUT)util/*-bison*
$(RM) $(OUTPUT)util/*-flex*
--
1.7.9.2.358.g22243

2012-10-26 14:33:50

by Arnaldo Carvalho de Melo

Subject: [PATCH 7/9] perf inject: Merge sched_stat_* and sched_switch events

From: Andrew Vagin <[email protected]>

You may want to know where and how long a task is sleeping. A callchain
may be found in sched_switch and a time slice in stat_iowait, so I add
a handler in perf inject for merging these events.

My code saves the sched_switch event for each process and, when it meets
stat_iowait, it reports the sched_switch event, because this event
contains the correct callchain. In other words, it replaces all
stat_iowait events with the proper sched_switch events.

I use the following sequence of commands for testing:

perf record -e sched:sched_stat_sleep -e sched:sched_switch \
-e sched:sched_process_exit -g -o ~/perf.data.raw \
~/test-program
perf inject -v -s -i ~/perf.data.raw -o ~/perf.data
perf report --stdio -i ~/perf.data
100.00% foo [kernel.kallsyms] [k] __schedule
|
--- __schedule
schedule
|
|--79.75%-- schedule_hrtimeout_range_clock
| schedule_hrtimeout_range
| poll_schedule_timeout
| do_select
| core_sys_select
| sys_select
| system_call_fastpath
| __select
| __libc_start_main
|
--20.25%-- do_nanosleep
hrtimer_nanosleep
sys_nanosleep
system_call_fastpath
__GI___libc_nanosleep
__libc_start_main

And here is test-program.c:

#include <unistd.h>
#include <time.h>
#include <sys/select.h>

int main(void)
{
	struct timespec ts1;
	struct timeval tv1;
	int i;

	for (i = 0; i < 10; i++) {
		/* sleep 10ms in nanosleep() */
		ts1.tv_sec = 0;
		ts1.tv_nsec = 10000000;
		nanosleep(&ts1, NULL);

		/* sleep 40ms in select() */
		tv1.tv_sec = 0;
		tv1.tv_usec = 40000;
		select(0, NULL, NULL, NULL, &tv1);
	}
	return 1;
}
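
For readers who want the merge logic without the perf plumbing, here is
a minimal standalone sketch. It is not the code from this patch:
struct switch_rec, struct merge_state, merge_on_switch(), merge_on_stat()
and merge_forget() are hypothetical names, and callchain_id is only a
placeholder for the real callchain. It keeps the last sched_switch per
tid and, on a sched_stat_* event, hands back that saved record carrying
the stat event's time and period:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a saved sched_switch event. */
struct switch_rec {
	struct switch_rec *next;
	uint32_t tid;
	uint64_t time;		/* timestamp, overwritten on merge */
	uint64_t period;	/* sleep time, filled in on merge */
	int callchain_id;	/* placeholder for the real callchain */
};

/* Hypothetical container for all saved sched_switch records. */
struct merge_state {
	struct switch_rec *head;
};

/* Drop the saved sched_switch for tid, if any (e.g. on process exit). */
static void merge_forget(struct merge_state *st, uint32_t tid)
{
	struct switch_rec **pp = &st->head;

	while (*pp) {
		if ((*pp)->tid == tid) {
			struct switch_rec *dead = *pp;

			*pp = dead->next;
			free(dead);
			return;
		}
		pp = &(*pp)->next;
	}
}

/* Remember the latest sched_switch for tid, replacing any earlier one. */
static int merge_on_switch(struct merge_state *st, uint32_t tid,
			   uint64_t time, int callchain_id)
{
	struct switch_rec *rec;

	merge_forget(st, tid);

	rec = malloc(sizeof(*rec));
	if (rec == NULL)
		return -1;

	rec->tid = tid;
	rec->time = time;
	rec->period = 0;
	rec->callchain_id = callchain_id;
	rec->next = st->head;
	st->head = rec;
	return 0;
}

/*
 * On a sched_stat_* event for pid: copy the saved sched_switch into *out,
 * carrying the stat event's time and period.
 * Returns 1 if a record was merged, 0 if no sched_switch was saved.
 */
static int merge_on_stat(struct merge_state *st, uint32_t pid,
			 uint64_t time, uint64_t period,
			 struct switch_rec *out)
{
	struct switch_rec *rec;

	for (rec = st->head; rec; rec = rec->next) {
		if (rec->tid == pid) {
			rec->time = time;
			rec->period = period;
			*out = *rec;
			out->next = NULL;
			return 1;
		}
	}
	return 0;
}
```

A caller would feed events in timestamp order: merge_on_switch() for each
sched_switch, merge_forget() for sched_process_exit and merge_on_stat()
for each sched_stat_* event, emitting *out whenever it returns 1.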

Signed-off-by: Andrew Vagin <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ committer note: Made it use evsel->handler ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-inject.txt | 5 ++
tools/perf/builtin-inject.c | 142 +++++++++++++++++++++++++++++-
2 files changed, 144 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt
index 673ef97..a00a342 100644
--- a/tools/perf/Documentation/perf-inject.txt
+++ b/tools/perf/Documentation/perf-inject.txt
@@ -35,6 +35,11 @@ OPTIONS
-o::
--output=::
Output file name. (default: stdout)
+-s::
+--sched-stat::
+ Merge sched_stat and sched_switch for getting events where and how long
+ tasks slept. sched_switch contains a callchain where a task slept and
+ sched_stat contains a timeslice how long a task slept.

SEE ALSO
--------
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index a706ed5..a4a3072 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -8,19 +8,32 @@
#include "builtin.h"

#include "perf.h"
+#include "util/color.h"
+#include "util/evlist.h"
+#include "util/evsel.h"
#include "util/session.h"
#include "util/tool.h"
#include "util/debug.h"

#include "util/parse-options.h"

+#include <linux/list.h>
+
struct perf_inject {
struct perf_tool tool;
bool build_ids;
+ bool sched_stat;
const char *input_name;
int pipe_output,
output;
u64 bytes_written;
+ struct list_head samples;
+};
+
+struct event_entry {
+ struct list_head node;
+ u32 tid;
+ union perf_event event[0];
};

static int perf_event__repipe_synth(struct perf_tool *tool,
@@ -86,12 +99,23 @@ static int perf_event__repipe(struct perf_tool *tool,
return perf_event__repipe_synth(tool, event, machine);
}

+typedef int (*inject_handler)(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct machine *machine);
+
static int perf_event__repipe_sample(struct perf_tool *tool,
union perf_event *event,
- struct perf_sample *sample __maybe_unused,
- struct perf_evsel *evsel __maybe_unused,
- struct machine *machine)
+ struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct machine *machine)
{
+ if (evsel->handler.func) {
+ inject_handler f = evsel->handler.func;
+ return f(tool, event, sample, evsel, machine);
+ }
+
return perf_event__repipe_synth(tool, event, machine);
}

@@ -216,6 +240,79 @@ repipe:
return 0;
}

+static int perf_inject__sched_process_exit(struct perf_tool *tool,
+ union perf_event *event __maybe_unused,
+ struct perf_sample *sample,
+ struct perf_evsel *evsel __maybe_unused,
+ struct machine *machine __maybe_unused)
+{
+ struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
+ struct event_entry *ent;
+
+ list_for_each_entry(ent, &inject->samples, node) {
+ if (sample->tid == ent->tid) {
+ list_del_init(&ent->node);
+ free(ent);
+ break;
+ }
+ }
+
+ return 0;
+}
+
+static int perf_inject__sched_switch(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct machine *machine)
+{
+ struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
+ struct event_entry *ent;
+
+ perf_inject__sched_process_exit(tool, event, sample, evsel, machine);
+
+ ent = malloc(event->header.size + sizeof(struct event_entry));
+ if (ent == NULL) {
+ color_fprintf(stderr, PERF_COLOR_RED,
+ "Not enough memory to process sched switch event!");
+ return -1;
+ }
+
+ ent->tid = sample->tid;
+ memcpy(&ent->event, event, event->header.size);
+ list_add(&ent->node, &inject->samples);
+ return 0;
+}
+
+static int perf_inject__sched_stat(struct perf_tool *tool,
+ union perf_event *event __maybe_unused,
+ struct perf_sample *sample,
+ struct perf_evsel *evsel,
+ struct machine *machine)
+{
+ struct event_entry *ent;
+ union perf_event *event_sw;
+ struct perf_sample sample_sw;
+ struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
+ u32 pid = perf_evsel__intval(evsel, sample, "pid");
+
+ list_for_each_entry(ent, &inject->samples, node) {
+ if (pid == ent->tid)
+ goto found;
+ }
+
+ return 0;
+found:
+ event_sw = &ent->event[0];
+ perf_evsel__parse_sample(evsel, event_sw, &sample_sw);
+
+ sample_sw.period = sample->period;
+ sample_sw.time = sample->time;
+ perf_event__synthesize_sample(event_sw, evsel->attr.sample_type,
+ &sample_sw, false);
+ return perf_event__repipe(tool, event_sw, &sample_sw, machine);
+}
+
extern volatile int session_done;

static void sig_handler(int sig __maybe_unused)
@@ -223,6 +320,21 @@ static void sig_handler(int sig __maybe_unused)
session_done = 1;
}

+static int perf_evsel__check_stype(struct perf_evsel *evsel,
+ u64 sample_type, const char *sample_msg)
+{
+ struct perf_event_attr *attr = &evsel->attr;
+ const char *name = perf_evsel__name(evsel);
+
+ if (!(attr->sample_type & sample_type)) {
+ pr_err("Samples for %s event do not have %s attribute set.",
+ name, sample_msg);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static int __cmd_inject(struct perf_inject *inject)
{
struct perf_session *session;
@@ -241,6 +353,26 @@ static int __cmd_inject(struct perf_inject *inject)
if (session == NULL)
return -ENOMEM;

+ if (inject->sched_stat) {
+ struct perf_evsel *evsel;
+
+ inject->tool.ordered_samples = true;
+
+ list_for_each_entry(evsel, &session->evlist->entries, node) {
+ const char *name = perf_evsel__name(evsel);
+
+ if (!strcmp(name, "sched:sched_switch")) {
+ if (perf_evsel__check_stype(evsel, PERF_SAMPLE_TID, "TID"))
+ return -EINVAL;
+
+ evsel->handler.func = perf_inject__sched_switch;
+ } else if (!strcmp(name, "sched:sched_process_exit"))
+ evsel->handler.func = perf_inject__sched_process_exit;
+ else if (!strncmp(name, "sched:sched_stat_", 17))
+ evsel->handler.func = perf_inject__sched_stat;
+ }
+ }
+
if (!inject->pipe_output)
lseek(inject->output, session->header.data_offset, SEEK_SET);

@@ -275,6 +407,7 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
.build_id = perf_event__repipe_op2_synth,
},
.input_name = "-",
+ .samples = LIST_HEAD_INIT(inject.samples),
};
const char *output_name = "-";
const struct option options[] = {
@@ -284,6 +417,9 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
"input file name"),
OPT_STRING('o', "output", &output_name, "file",
"output file name"),
+ OPT_BOOLEAN('s', "sched-stat", &inject.sched_stat,
+ "Merge sched-stat and sched-switch for getting events "
+ "where and how long tasks slept"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show build ids, etc)"),
OPT_END()
--
1.7.9.2.358.g22243

2012-10-26 14:32:15

by Arnaldo Carvalho de Melo

Subject: [PATCH 2/9] perf tools: Cleanup doc related targets

From: Namhyung Kim <[email protected]>

The rules for handling documentation targets are duplicated. Consolidate
them with DOC_TARGETS and INSTALL_DOC_TARGETS.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Makefile | 46 +++++++++-------------------------------------
1 file changed, 9 insertions(+), 37 deletions(-)

diff --git a/tools/perf/Makefile b/tools/perf/Makefile
index b14eeb8..5cf40cb 100644
--- a/tools/perf/Makefile
+++ b/tools/perf/Makefile
@@ -977,20 +977,15 @@ help:
@echo 'Perf maintainer targets:'
@echo ' clean - clean all binary objects and build output'

-doc:
- $(MAKE) -C Documentation all

-man:
- $(MAKE) -C Documentation man
+DOC_TARGETS := doc man html info pdf

-html:
- $(MAKE) -C Documentation html
+INSTALL_DOC_TARGETS := $(patsubst %,install-%,$(DOC_TARGETS)) try-install-man
+INSTALL_DOC_TARGETS += quick-install-doc quick-install-man quick-install-html

-info:
- $(MAKE) -C Documentation info
-
-pdf:
- $(MAKE) -C Documentation pdf
+# 'make doc' should call 'make -C Documentation all'
+$(DOC_TARGETS):
+ $(MAKE) -C Documentation $(@:doc=all)

TAGS:
$(RM) TAGS
@@ -1061,32 +1056,9 @@ install: all try-install-man
install-python_ext:
$(PYTHON_WORD) util/setup.py --quiet install --root='/$(DESTDIR_SQ)'

-install-doc:
- $(MAKE) -C Documentation install
-
-install-man:
- $(MAKE) -C Documentation install-man
-
-try-install-man:
- $(MAKE) -C Documentation try-install-man
-
-install-html:
- $(MAKE) -C Documentation install-html
-
-install-info:
- $(MAKE) -C Documentation install-info
-
-install-pdf:
- $(MAKE) -C Documentation install-pdf
-
-quick-install-doc:
- $(MAKE) -C Documentation quick-install
-
-quick-install-man:
- $(MAKE) -C Documentation quick-install-man
-
-quick-install-html:
- $(MAKE) -C Documentation quick-install-html
+# 'make install-doc' should call 'make -C Documentation install'
+$(INSTALL_DOC_TARGETS):
+ $(MAKE) -C Documentation $(@:-doc=)

### Cleaning rules

--
1.7.9.2.358.g22243

2012-10-26 14:32:14

by Arnaldo Carvalho de Melo

Subject: [PATCH 9/9] perf stat: Add --pre and --post command

From: Peter Zijlstra <[email protected]>

In order to measure kernel builds, one has to do some pre/post cleanup
work in order to do the repeat build.

So provide --pre and --post command hooks to allow doing just that.

perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/ clean' \
-- make -s -j64 O=defconfig-build/ bzImage
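
The control flow of the hooks can be sketched outside of perf as below.
This is only an illustration, assuming a hypothetical run_with_hooks()
helper and a measure_fn callback standing in for the measured run;
count_run() is a dummy workload. As in the patch, a non-zero result from
the pre command or from the measured run stops the sequence, so the post
command is skipped in that case:

```c
#include <stdlib.h>

/* Hypothetical callback type standing in for the measured run. */
typedef int (*measure_fn)(void *arg);

/* Dummy workload for illustration: counts how often it was run. */
static int count_run(void *arg)
{
	int *runs = arg;

	(*runs)++;
	return 0;
}

/*
 * Run pre_cmd, then the measurement, then post_cmd.
 * Either command may be NULL, in which case it is skipped.
 * Returns the first non-zero status encountered, or 0 on success.
 */
static int run_with_hooks(const char *pre_cmd, const char *post_cmd,
			  measure_fn measure, void *arg)
{
	int ret;

	if (pre_cmd) {
		ret = system(pre_cmd);	/* executed via the shell */
		if (ret)
			return ret;
	}

	ret = measure(arg);
	if (ret)
		return ret;

	if (post_cmd)
		ret = system(post_cmd);

	return ret;
}
```

With --repeat, the whole sequence would run once per iteration, so the
pre command gets a chance to clean up before every measured build.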

Signed-off-by: Peter Zijlstra <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/1350992414.13456.5.camel@twins
[ committer note: Added respective entries in Documentation/perf-stat.txt ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 5 ++++
tools/perf/builtin-stat.c | 42 +++++++++++++++++++++++++++-----
2 files changed, 41 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 2fa173b..cf0c310 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -108,6 +108,11 @@ with it. --append may be used here. Examples:
3>results perf stat --log-fd 3 -- $cmd
3>>results perf stat --log-fd 3 --append -- $cmd

+--pre::
+--post::
+ Pre and post measurement hooks, e.g.:
+
+perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/ clean' -- make -s -j64 O=defconfig-build/ bzImage


EXAMPLES
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 93b9011..6888960 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -57,6 +57,7 @@
#include "util/thread.h"
#include "util/thread_map.h"

+#include <stdlib.h>
#include <sys/prctl.h>
#include <locale.h>

@@ -83,6 +84,9 @@ static const char *csv_sep = NULL;
static bool csv_output = false;
static bool group = false;
static FILE *output = NULL;
+static const char *pre_cmd = NULL;
+static const char *post_cmd = NULL;
+static bool sync_run = false;

static volatile int done = 0;

@@ -265,7 +269,7 @@ static int read_counter(struct perf_evsel *counter)
return 0;
}

-static int run_perf_stat(int argc __maybe_unused, const char **argv)
+static int __run_perf_stat(int argc __maybe_unused, const char **argv)
{
unsigned long long t0, t1;
struct perf_evsel *counter, *first;
@@ -405,6 +409,32 @@ static int run_perf_stat(int argc __maybe_unused, const char **argv)
return WEXITSTATUS(status);
}

+static int run_perf_stat(int argc __maybe_unused, const char **argv)
+{
+ int ret;
+
+ if (pre_cmd) {
+ ret = system(pre_cmd);
+ if (ret)
+ return ret;
+ }
+
+ if (sync_run)
+ sync();
+
+ ret = __run_perf_stat(argc, argv);
+ if (ret)
+ return ret;
+
+ if (post_cmd) {
+ ret = system(post_cmd);
+ if (ret)
+ return ret;
+ }
+
+ return ret;
+}
+
static void print_noise_pct(double total, double avg)
{
double pct = rel_stddev_stats(total, avg);
@@ -1069,8 +1099,7 @@ static int add_default_attributes(void)

int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
{
- bool append_file = false,
- sync_run = false;
+ bool append_file = false;
int output_fd = 0;
const char *output_name = NULL;
const struct option options[] = {
@@ -1114,6 +1143,10 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN(0, "append", &append_file, "append to the output file"),
OPT_INTEGER(0, "log-fd", &output_fd,
"log output to fd, instead of stderr"),
+ OPT_STRING(0, "pre", &pre_cmd, "command",
+ "command to run prior to the measured command"),
+ OPT_STRING(0, "post", &post_cmd, "command",
+ "command to run after to the measured command"),
OPT_END()
};
const char * const stat_usage[] = {
@@ -1238,9 +1271,6 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
fprintf(output, "[ perf stat: executing run #%d ... ]\n",
run_idx + 1);

- if (sync_run)
- sync();
-
status = run_perf_stat(argc, argv);
}

--
1.7.9.2.358.g22243

2012-10-26 14:34:50

by Arnaldo Carvalho de Melo

Subject: [PATCH 6/9] perf inject: Work with files

From: Andrew Vagin <[email protected]>

Before this patch, "perf inject" could only handle data from a pipe.

I want to use "perf inject" for reworking events. See my following patch.

v2: add information about new options in tools/perf/Documentation/
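
The output side of this can be sketched on its own as follows. This is
a simplified illustration rather than the patch itself: open_output()
and write_all() are hypothetical helpers. It shows the "-" convention
for stdout and a write loop that copes with short writes while counting
the bytes written, which is what allows a header with the final data
size to be written afterwards:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: "-" selects stdout (pipe mode), any other name
 * is created or truncated as a regular file readable by the owner.
 * Returns the file descriptor, or -1 on error.
 */
static int open_output(const char *name, int *pipe_output)
{
	if (!strcmp(name, "-")) {
		*pipe_output = 1;
		return STDOUT_FILENO;
	}

	*pipe_output = 0;
	return open(name, O_CREAT | O_WRONLY | O_TRUNC, S_IRUSR | S_IWUSR);
}

/*
 * Write the whole buffer, continuing after short writes, and add the
 * number of bytes written to *bytes_written.
 * Returns 0 on success or a negative errno value.
 */
static int write_all(int fd, const void *buf, size_t size,
		     uint64_t *bytes_written)
{
	const char *p = buf;

	while (size) {
		ssize_t ret = write(fd, p, size);

		if (ret < 0)
			return -errno;

		size -= ret;
		p += ret;
		*bytes_written += ret;
	}

	return 0;
}
```

In file mode the caller can seek past the header area first and rewrite
the header once the total from bytes_written is known; in pipe mode no
seeking is possible, so the stream is simply written in order.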

Signed-off-by: Andrew Vagin <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ committer note: fixed it up to cope with 5852a44, 5ded57a, 002439e & f62d3f0 ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-inject.txt | 6 +++++
tools/perf/builtin-inject.c | 38 +++++++++++++++++++++++++++---
2 files changed, 41 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt
index 025630d..673ef97 100644
--- a/tools/perf/Documentation/perf-inject.txt
+++ b/tools/perf/Documentation/perf-inject.txt
@@ -29,6 +29,12 @@ OPTIONS
-v::
--verbose::
Be more verbose.
+-i::
+--input=::
+ Input file name. (default: stdin)
+-o::
+--output=::
+ Output file name. (default: stdout)

SEE ALSO
--------
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 386a5c0..a706ed5 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -17,24 +17,30 @@
struct perf_inject {
struct perf_tool tool;
bool build_ids;
+ const char *input_name;
+ int pipe_output,
+ output;
+ u64 bytes_written;
};

-static int perf_event__repipe_synth(struct perf_tool *tool __maybe_unused,
+static int perf_event__repipe_synth(struct perf_tool *tool,
union perf_event *event,
struct machine *machine __maybe_unused)
{
+ struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
uint32_t size;
void *buf = event;

size = event->header.size;

while (size) {
- int ret = write(STDOUT_FILENO, buf, size);
+ int ret = write(inject->output, buf, size);
if (ret < 0)
return -errno;

size -= ret;
buf += ret;
+ inject->bytes_written += ret;
}

return 0;
@@ -231,12 +237,20 @@ static int __cmd_inject(struct perf_inject *inject)
inject->tool.tracing_data = perf_event__repipe_tracing_data;
}

- session = perf_session__new("-", O_RDONLY, false, true, &inject->tool);
+ session = perf_session__new(inject->input_name, O_RDONLY, false, true, &inject->tool);
if (session == NULL)
return -ENOMEM;

+ if (!inject->pipe_output)
+ lseek(inject->output, session->header.data_offset, SEEK_SET);
+
ret = perf_session__process_events(session, &inject->tool);

+ if (!inject->pipe_output) {
+ session->header.data_size = inject->bytes_written;
+ perf_session__write_header(session, session->evlist, inject->output, true);
+ }
+
perf_session__delete(session);

return ret;
@@ -260,10 +274,16 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
.tracing_data = perf_event__repipe_tracing_data_synth,
.build_id = perf_event__repipe_op2_synth,
},
+ .input_name = "-",
};
+ const char *output_name = "-";
const struct option options[] = {
OPT_BOOLEAN('b', "build-ids", &inject.build_ids,
"Inject build-ids into the output stream"),
+ OPT_STRING('i', "input", &inject.input_name, "file",
+ "input file name"),
+ OPT_STRING('o', "output", &output_name, "file",
+ "output file name"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show build ids, etc)"),
OPT_END()
@@ -281,6 +301,18 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
if (argc)
usage_with_options(inject_usage, options);

+ if (!strcmp(output_name, "-")) {
+ inject.pipe_output = 1;
+ inject.output = STDOUT_FILENO;
+ } else {
+ inject.output = open(output_name, O_CREAT | O_WRONLY | O_TRUNC,
+ S_IRUSR | S_IWUSR);
+ if (inject.output < 0) {
+ perror("failed to create output file");
+ return -1;
+ }
+ }
+
if (symbol__init() < 0)
return -1;

--
1.7.9.2.358.g22243

2012-10-26 14:34:48

by Arnaldo Carvalho de Melo

Subject: [PATCH 8/9] perf inject: Mark a dso if it's used

From: Andrew Vagin <[email protected]>

Otherwise they will not be written to the output file.

Signed-off-by: Andrew Vagin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ committer note: Fixed up wrt changes made in the immediate previous patches ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-inject.c | 11 ++++++++---
tools/perf/util/build-id.c | 10 +++++-----
tools/perf/util/build-id.h | 4 ++++
3 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index a4a3072..84ad6ab 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -14,6 +14,7 @@
#include "util/session.h"
#include "util/tool.h"
#include "util/debug.h"
+#include "util/build-id.h"

#include "util/parse-options.h"

@@ -116,6 +117,8 @@ static int perf_event__repipe_sample(struct perf_tool *tool,
return f(tool, event, sample, evsel, machine);
}

+ build_id__mark_dso_hit(tool, event, sample, evsel, machine);
+
return perf_event__repipe_synth(tool, event, machine);
}

@@ -310,6 +313,7 @@ found:
sample_sw.time = sample->time;
perf_event__synthesize_sample(event_sw, evsel->attr.sample_type,
&sample_sw, false);
+ build_id__mark_dso_hit(tool, event_sw, &sample_sw, evsel, machine);
return perf_event__repipe(tool, event_sw, &sample_sw, machine);
}

@@ -342,8 +346,7 @@ static int __cmd_inject(struct perf_inject *inject)

signal(SIGINT, sig_handler);

- if (inject->build_ids) {
- inject->tool.sample = perf_event__inject_buildid;
+ if (inject->build_ids || inject->sched_stat) {
inject->tool.mmap = perf_event__repipe_mmap;
inject->tool.fork = perf_event__repipe_fork;
inject->tool.tracing_data = perf_event__repipe_tracing_data;
@@ -353,7 +356,9 @@ static int __cmd_inject(struct perf_inject *inject)
if (session == NULL)
return -ENOMEM;

- if (inject->sched_stat) {
+ if (inject->build_ids) {
+ inject->tool.sample = perf_event__inject_buildid;
+ } else if (inject->sched_stat) {
struct perf_evsel *evsel;

inject->tool.ordered_samples = true;
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 6a63999..94ca117 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -16,11 +16,11 @@
#include "session.h"
#include "tool.h"

-static int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
- union perf_event *event,
- struct perf_sample *sample __maybe_unused,
- struct perf_evsel *evsel __maybe_unused,
- struct machine *machine)
+int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
+ struct perf_sample *sample __maybe_unused,
+ struct perf_evsel *evsel __maybe_unused,
+ struct machine *machine)
{
struct addr_location al;
u8 cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
diff --git a/tools/perf/util/build-id.h b/tools/perf/util/build-id.h
index a993ba8..45c500b 100644
--- a/tools/perf/util/build-id.h
+++ b/tools/perf/util/build-id.h
@@ -7,4 +7,8 @@ extern struct perf_tool build_id__mark_dso_hit_ops;

char *dso__build_id_filename(struct dso *self, char *bf, size_t size);

+int build_id__mark_dso_hit(struct perf_tool *tool, union perf_event *event,
+ struct perf_sample *sample, struct perf_evsel *evsel,
+ struct machine *machine);
+
#endif
--
1.7.9.2.358.g22243

2012-10-26 14:54:58

by Ingo Molnar

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> Hi Ingo,
>
> Please consider pulling,
>
> - Arnaldo
>
> The following changes since commit 8f7c1d07ade50dcdea7ec779b277e891f5c8292a:
>
> Merge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core (2012-10-26 10:30:49 +0200)
>
> are available in the git repository at:
>
>
> git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux tags/perf-core-for-mingo
>
> for you to fetch changes up to 1f16c5754d3a4008c29f3bf67b4f1271313ba385:
>
> perf stat: Add --pre and --post command (2012-10-26 11:22:25 -0200)
>
> ----------------------------------------------------------------
> perf/core improvements:
>
> . perf inject changes to allow showing where a task sleeps, from Andrew Vagin.
>
> . Makefile improvements from Namhyung Kim.

These are really useful: there used to be a couple of seconds of
wait time at the beginning of every perf build - these are now
nicely explained with the various CHK entries.

>
> . Add --pre and --post command hooks in 'stat', from Peter Zijlstra.
>
> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
>
> ----------------------------------------------------------------
> Andrew Vagin (3):
> perf inject: Work with files
> perf inject: Merge sched_stat_* and sched_switch events
> perf inject: Mark a dso if it's used
>
> Namhyung Kim (5):
> tools lib traceevent: Do not generate dependency for system header files
> perf tools: Cleanup doc related targets
> perf tools: Convert invocation of MAKE into SUBDIR
> perf tools: Always show CHK message when doing try-cc
> perf tools: Fix LIBELF_MMAP checking
>
> Peter Zijlstra (1):
> perf stat: Add --pre and --post command
>
> tools/lib/traceevent/Makefile | 2 +-
> tools/perf/Documentation/perf-inject.txt | 11 ++
> tools/perf/Documentation/perf-stat.txt | 5 +
> tools/perf/Makefile | 51 ++------
> tools/perf/builtin-inject.c | 189 ++++++++++++++++++++++++++++--
> tools/perf/builtin-stat.c | 42 ++++++-
> tools/perf/config/utilities.mak | 3 +-
> tools/perf/util/build-id.c | 10 +-
> tools/perf/util/build-id.h | 4 +
> 9 files changed, 256 insertions(+), 61 deletions(-)

Pulled, thanks Arnaldo!

Ingo

2012-10-26 15:07:00

by David Ahern

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes

On 10/26/12 8:54 AM, Ingo Molnar wrote:
>> perf/core improvements:
>>
>> . perf inject changes to allow showing where a task sleeps, from Andrew Vagin.
>>
>> . Makefile improvements from Namhyung Kim.
>
> These are really useful: there used to be a couple of seconds of
> wait time at the beginning of every perf build - these are now
> nicely explained with the various CHK entries.

PERF-VERSION-GEN and specifically the git commands are the cause of more
delay than the config checks, especially when doing the build in a VM
with the kernel source on an NFS mount.

David

2012-10-26 15:31:52

by Namhyung Kim

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes

2012-10-26 (Fri), 09:06 -0600, David Ahern:
> On 10/26/12 8:54 AM, Ingo Molnar wrote:
> >> perf/core improvements:
> >>
> >> . perf inject changes to allow showing where a task sleeps, from Andrew Vagin.
> >>
> >> . Makefile improvements from Namhyung Kim.
> >
> > These are really useful: there used to be a couple of seconds of
> > wait time at the beginning of every perf build - these are now
> > nicely explained with the various CHK entries.

Kudos to Jiri who did the real work!

>
> PERF-VERSION-GEN and specifically the git commands are the cause of more
> delay than the config checks, especially when doing the build in a VM
> with the kernel source on an NFS mount.

And I see a strange delay when compiling builtin-sched.o. After
building perf tools, I deleted builtin-{sched,test,script}.o to rebuild
only those, since they are the largest ones.

namhyung@leonhard:perf$ ls -lS *.c | head -3
-rw-r--r-- 1 namhyung namhyung 45522 2012-10-27 00:20 builtin-sched.c
-rw-r--r-- 1 namhyung namhyung 36372 2012-10-27 00:20 builtin-test.c
-rw-r--r-- 1 namhyung namhyung 35555 2012-10-27 00:20 builtin-script.c

namhyung@leonhard:perf$ rm builtin-{sched,test,script}.o


And then building each file with the time command shows this:

namhyung@leonhard:perf$ time make builtin-script.o &> /dev/null

real 0m4.577s
user 0m2.755s
sys 0m1.655s

namhyung@leonhard:perf$ time make builtin-test.o &> /dev/null

real 0m4.486s
user 0m2.707s
sys 0m1.658s

namhyung@leonhard:perf$ time make builtin-sched.o &> /dev/null

real 0m16.936s
user 0m15.157s
sys 0m1.635s

You can see it easily when building perf without the -j option. But I
have no idea why it takes so long.

Thanks,
Namhyung

2012-10-26 15:34:31

by Borislav Petkov

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes

On Sat, Oct 27, 2012 at 12:31:42AM +0900, Namhyung Kim wrote:
> And I see a strange delay when compiling builtin-sched.o. After
> building perf tools, I deleted builtin-{sched,test,script}.o to rebuild
> only those, since they are the largest ones.
>
> namhyung@leonhard:perf$ ls -lS *.c | head -3
> -rw-r--r-- 1 namhyung namhyung 45522 2012-10-27 00:20 builtin-sched.c
> -rw-r--r-- 1 namhyung namhyung 36372 2012-10-27 00:20 builtin-test.c
> -rw-r--r-- 1 namhyung namhyung 35555 2012-10-27 00:20 builtin-script.c
>
> namhyung@leonhard:perf$ rm builtin-{sched,test,script}.o
>
>
> And then building each file with the time command shows this:
>
> namhyung@leonhard:perf$ time make builtin-script.o &> /dev/null
>
> real 0m4.577s
> user 0m2.755s
> sys 0m1.655s
>
> namhyung@leonhard:perf$ time make builtin-test.o &> /dev/null
>
> real 0m4.486s
> user 0m2.707s
> sys 0m1.658s
>
> namhyung@leonhard:perf$ time make builtin-sched.o &> /dev/null
>
> real 0m16.936s
> user 0m15.157s
> sys 0m1.635s
>
> You can see it easily when building perf without the -j option. But I
> have no idea why it takes so long.

Well, you can trace that workload with perf itself, no, and see the
hotspots.

:-)
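
A minimal sketch of that suggestion, assuming perf is installed and the
commands are run from tools/perf. The profile_target helper and the PERF
and MAKE_CMD override variables are hypothetical names used only for
illustration; they are not part of the perf build:

```shell
# Hypothetical helper: record a call-graph profile of one make target.
# PERF and MAKE_CMD can be overridden; they default to the real tools.
PERF=${PERF:-perf}
MAKE_CMD=${MAKE_CMD:-make}

profile_target() {
    # $1 is the object file to rebuild, e.g. builtin-sched.o
    rm -f "$1"
    "$PERF" record -g -o "$1.perf.data" -- "$MAKE_CMD" "$1"
}

# Usage (not run here):
#   profile_target builtin-sched.o
#   perf report -i builtin-sched.o.perf.data
```

Since the time is presumably spent in the compiler rather than in make
itself, the report for the slow target is the one worth comparing against
a fast one such as builtin-test.o.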

--
Regards/Gruss,
Boris.

2012-10-26 16:32:28

by Arnaldo Carvalho de Melo

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes

On Fri, Oct 26, 2012 at 05:34:32PM +0200, Borislav Petkov wrote:
> On Sat, Oct 27, 2012 at 12:31:42AM +0900, Namhyung Kim wrote:
> > You can see it easily when building perf without the -j option. But I
> > have no idea why it takes so long.

> Well, you can trace that workload with perf itself, no, and see the
> hotspots.

Right, perf'ing perf is a favourite pastime, right?

- Arnaldo

2012-10-26 17:05:56

by Arnaldo Carvalho de Melo

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes

On Fri, Oct 26, 2012 at 04:54:51PM +0200, Ingo Molnar wrote:
> * Arnaldo Carvalho de Melo <[email protected]> wrote:
> > . Makefile improvements from Namhyung Kim.
>
> These are really useful: there used to be a couple of seconds of
> wait time at the beginning of every perf build - these are now
> nicely explained with the various CHK entries.

The optimal way, I guess, would be to have some cache file with the
results of such feature tests, that would be created and then used till
the build fails using its findings, which would trigger a new feature
check round, followed by an automatic rebuild.

That would be tricky because we would have to have an automated way of
discovering if the build failed due to missing packages or if it failed
due to some ordinary coding mistake.
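
To make the idea concrete, here is a rough shell sketch of such a cache.
It is only an illustration: the cached_check function, the FEATURE_CACHE
variable and the "name=y/n" file format are all hypothetical, and the
hard part described above (deciding when a build failure should
invalidate the cache) is left out:

```shell
# Hypothetical cache of feature-test results, one "name=y" or "name=n"
# line per feature. Not part of the perf build system.
FEATURE_CACHE=${FEATURE_CACHE:-.feature-cache}

cached_check() {
    name=$1; shift
    # Reuse a cached result when one exists.
    if [ -f "$FEATURE_CACHE" ]; then
        hit=$(sed -n "s/^$name=//p" "$FEATURE_CACHE" | tail -1)
        if [ -n "$hit" ]; then
            echo "$hit"
            return 0
        fi
    fi
    # Otherwise run the real test command and remember the outcome.
    if "$@" >/dev/null 2>&1; then result=y; else result=n; fi
    echo "$name=$result" >> "$FEATURE_CACHE"
    echo "$result"
}

# A failed build would have to drop the cache: rm -f "$FEATURE_CACHE"
```

The first call for a given name runs the test command; later calls only
read the file, which is where the time saving would come from.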

- Arnaldo

2012-10-26 17:20:23

by Borislav Petkov

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes

On Fri, Oct 26, 2012 at 09:31:15AM -0700, Arnaldo Carvalho de Melo wrote:
> Right, perf'ing perf is a favourite pastime, right?

Sure, can I get "perfing perf" on a T-shirt please?

--
Regards/Gruss,
Boris.

2012-10-27 09:16:40

by Namhyung Kim

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes

2012-10-26 (Fri), 19:20 +0200, Borislav Petkov:
> On Fri, Oct 26, 2012 at 09:31:15AM -0700, Arnaldo Carvalho de Melo wrote:
> > Right, perf'ing perf is a favourite pastime, right?
>
> Sure, can I get "perfing perf" on a T-shirt please?

Well, guys, this is not perfing perf. It's about perfing make and/or
gcc. Anyway I'd also like to get a "perfing perf" T-shirt. ;)

Thanks,
Namhyung

2012-10-27 13:19:41

by Ingo Molnar

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> On Fri, Oct 26, 2012 at 04:54:51PM +0200, Ingo Molnar wrote:
> > * Arnaldo Carvalho de Melo <[email protected]> wrote:
> > > . Makefile improvements from Namhyung Kim.
> >
> > These are really useful: there used to be a couple of
> > seconds of wait time at the beginning of every perf build -
> > these are now nicely explained with the various CHK entries.
>
> The optimal way, I guess, would be to have some cache file
> with the results of such feature tests, that would be created
> and then used till the build fails using its findings, which
> would trigger a new feature check round, followed by an
> automatic rebuild.
>
> That would be tricky because we would have to have an
> automated way of discovering if the build failed due to
> missing packages or if it failed due to some ordinary coding
> mistake.

The feature tests aren't a big problem right now - but making it
*visible* is really useful. It also tells us which feature test
fails, etc.

Thanks,

Ingo

2012-10-27 13:33:59

by Ingo Molnar

Subject: 'git describe' is very slow on development trees with lots of commits


(Cc:-ed the Git development list.)

* David Ahern <[email protected]> wrote:

> PERF-VERSION-GEN and specifically the git commands are the
> cause of more delay than the config checks, especially when
> doing the build in a VM with the kernel source on an NFS
> mount.

Yes, I have noticed that too.

So, the problem is that we use 'git describe' on the kernel tree
to generate the version string, which is very, very slow if we
are far away from any tagged release - which is the case for the
-tip tree:

comet:~/tip> perf stat --null --repeat 3 git describe
v3.7-rc2-2007-g83e8223
v3.7-rc2-2007-g83e8223
v3.7-rc2-2007-g83e8223

'git describe' is much faster if we are on or near a tag:

$ git checkout v3.6
$ perf stat --null --repeat 3 git describe
v3.6
v3.6
v3.6

Performance counter stats for 'git describe' (3 runs):

0.020171640 seconds time elapsed ( +- 3.64% )

$ git checkout b34e5f55a1e6

$ perf stat --null --repeat 3 git describe
v3.6-41-gb34e5f5
v3.6-41-gb34e5f5
v3.6-41-gb34e5f5

Performance counter stats for 'git describe' (3 runs):

0.155603676 seconds time elapsed ( +- 0.23% )

The cost on this pretty fast machine is about 1 msec per commit
- which adds up to about 2.5 seconds during much of the
development cycle.

So maybe we should be using a different version string, for
example, instead of:

v3.7-rc2-2007-g83e8223

this would be perfectly fine:

v3.7-rc2-g83e8223

the 'commit count' is informative but not essential - and
counting the number of off-tag commits is where much of the
overhead is:

#
# Overhead Command Shared Object Symbol
# ........ ....... .................. ..........................................
#
39.79% git libz.so.1.2.5 [.] 0x000000000000c1fe
26.39% git libz.so.1.2.5 [.] inflate
22.42% git git [.] 0x000000000009bd1e
2.99% git libz.so.1.2.5 [.] adler32
1.23% git libc-2.15.so [.] _int_malloc
0.72% git libc-2.15.so [.] __GI_____strtoull_l_internal
0.67% git libc-2.15.so [.] _int_free
0.62% git libc-2.15.so [.] malloc_consolidate
0.54% git [kernel.kallsyms] [k] clear_page_c
0.32% git [kernel.kallsyms] [k] page_fault

So by switching to the shorter version string that still embeds
the tag and the exact sha1 we'd be able to run this script a
*lot* faster.
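
For illustration only, the relationship between the two string formats
can be sketched with a small sed filter. The strip_commit_count name is
hypothetical, and this only shows the shape of the shorter string: it
post-processes existing 'git describe' output, so by itself it saves no
time.

```shell
# Hypothetical helper: drop the "-<count>" field from describe-style
# output, keeping the tag and the abbreviated sha1.
strip_commit_count() {
    printf '%s\n' "$1" | sed -e 's/-[0-9][0-9]*-g\([0-9a-f][0-9a-f]*\)$/-g\1/'
}

# Example: strip_commit_count v3.7-rc2-2007-g83e8223
```

A string that is exactly on a tag, such as v3.6, has no count field and
passes through unchanged.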

Thanks,

Ingo

2012-10-27 14:30:07

by Arnaldo Carvalho de Melo

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes

On Sat, Oct 27, 2012 at 06:16:31PM +0900, Namhyung Kim wrote:
> 2012-10-26 (Fri), 19:20 +0200, Borislav Petkov:
> > On Fri, Oct 26, 2012 at 09:31:15AM -0700, Arnaldo Carvalho de Melo wrote:
> > > Right, perf'ing perf is a favourite pastime, right?
> >
> > Sure, can I get "perfing perf" on a T-shirt please?
>
> Well, guys, this is not perfing perf. It's about perfing make and/or
> gcc. Anyway I'd also like to get a "perfing perf" T-shirt. ;)

Well, building perf faster will allow us to perf perf faster. ;-)

- Arnaldo

2012-10-27 17:12:10

by Stephane Eranian

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes

On Fri, Oct 26, 2012 at 5:31 PM, Namhyung Kim <[email protected]> wrote:
> 2012-10-26 (Fri), 09:06 -0600, David Ahern:
>> On 10/26/12 8:54 AM, Ingo Molnar wrote:
>> >> perf/core improvements:
>> >>
>> >> . perf inject changes to allow showing where a task sleeps, from Andrew Vagin.
>> >>
>> >> . Makefile improvements from Namhyung Kim.
>> >
>> > These are really useful: there used to be a couple of seconds of
>> > wait time at the beginning of every perf build - these are now
>> > nicely explained with the various CHK entries.
>
> Kudos to Jiri who did the real work!
>
>>
>> PERF-VERSION-GEN and specifically the git commands are the cause of more
>> delay than the config checks, especially when doing the build in a VM
>> with the kernel source on an NFS mount.
>
> And I see a strange delay when compiling builtin-sched.o. After
> building perf tools, I deleted builtin-{sched,test,script}.o to rebuild
> only those, since they are the largest ones.
>
Yes, I see that delay when compiling builtin-sched.c on my IVB system.
I don't know why it takes a significant number of seconds to compile
this file. It did not use to be like that a few revisions back. It takes
about 8 seconds on my OC'd IVB (> 4GHz). I don't see much code
in that file.

> namhyung@leonhard:perf$ ls -lS *.c | head -3
> -rw-r--r-- 1 namhyung namhyung 45522 2012-10-27 00:20 builtin-sched.c
> -rw-r--r-- 1 namhyung namhyung 36372 2012-10-27 00:20 builtin-test.c
> -rw-r--r-- 1 namhyung namhyung 35555 2012-10-27 00:20 builtin-script.c
>
> namhyung@leonhard:perf$ rm builtin-{sched,test,script}.o
>
>
> And then building each file with the time command shows this:
>
> namhyung@leonhard:perf$ time make builtin-script.o &> /dev/null
>
> real 0m4.577s
> user 0m2.755s
> sys 0m1.655s
>
> namhyung@leonhard:perf$ time make builtin-test.o &> /dev/null
>
> real 0m4.486s
> user 0m2.707s
> sys 0m1.658s
>
> namhyung@leonhard:perf$ time make builtin-sched.o &> /dev/null
>
> real 0m16.936s
> user 0m15.157s
> sys 0m1.635s
>
> You can see it easily when building perf without the -j option. But I
> have no idea why it takes so long.
>
> Thanks,
> Namhyung
>
>

2012-10-30 08:18:36

by Ingo Molnar

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes


* Ingo Molnar <[email protected]> wrote:

>
> * Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> > On Fri, Oct 26, 2012 at 04:54:51PM +0200, Ingo Molnar wrote:
> > > * Arnaldo Carvalho de Melo <[email protected]> wrote:
> > > > . Makefile improvements from Namhyung Kim.
> > >
> > > These are really useful: there used to be a couple of
> > > seconds of wait time at the beginning of every perf build -
> > > these are now nicely explained with the various CHK entries.
> >
> > The optimal way, I guess, would be to have some cache file
> > with the results of such feature tests, that would be created
> > and then used till the build fails using its findings, which
> > would trigger a new feature check round, followed by an
> > automatic rebuild.
> >
> > That would be tricky because we would have to have an
> > automated way of discovering if the build failed due to
> > missing packages or if it failed due to some ordinary coding
> > mistake.
>
> The feature tests aren't a big problem right now - but making
> it *visible* is really useful. It also tells us which feature
> test fails, etc.

Btw., there's another thing that would be nice in addition to
simplifying the PERF-VERSION-GEN script: to be able to run the
CHK tests in parallel, like the object file rules.

Right now the CHK tests are serialized and they take several
seconds to build and run. A parallel make rule would reduce
that to about a second I think.
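
As a rough illustration of what running independent checks in parallel
looks like, here is a plain shell sketch. It is not the Makefile change
itself: the run_checks_parallel function is hypothetical, and each check
is assumed to be a single command or shell function name (without
slashes) whose exit status says whether the feature is present.

```shell
# Hypothetical sketch: run each check as a background job, then collect
# the results in the original order.
run_checks_parallel() {
    dir=$(mktemp -d) || return 1
    for check in "$@"; do
        (
            # Each job writes "y" or "n" to its own result file.
            if "$check" >/dev/null 2>&1; then echo y; else echo n; fi > "$dir/$check"
        ) &
    done
    wait    # block until every background check has finished
    for check in "$@"; do
        echo "CHK $check $(cat "$dir/$check")"
    done
    rm -rf "$dir"
}
```

With make, the equivalent would be one rule per feature test so that -j
can schedule them like any other targets.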

Thanks,

Ingo

2012-10-30 08:22:06

by Peter Zijlstra

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes

On Tue, 2012-10-30 at 09:18 +0100, Ingo Molnar wrote:
> > > The optimal way, I guess, would be to have some cache file
> > > with the results of such feature tests, that would be created
> > > and then used till the build fails using its findings, which
> > > would trigger a new feature check round, followed by an
> > > automatic rebuild.

autoconf!! ;-)

/me runs

2012-10-30 08:46:11

by Ingo Molnar

Subject: [PATCH] perf tools: Speed up the perf build time by simplifying the perf --version string generation


* Ingo Molnar <[email protected]> wrote:

> Btw., there's another thing that would be nice in addition to
> simplifying the PERF-VERSION-GEN script: [...]

Here's a stab at that.

----------------->

Building perf is pretty slow on trees that have a lot of commits
relative to the nearest Git tag. This slowness manifests itself
during version string generation:

$ perf stat --null --repeat 3 --sync --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
PERF_VERSION = 3.7.rc3.1458.g5399b3b
PERF_VERSION = 3.7.rc3.1458.g5399b3b
PERF_VERSION = 3.7.rc3.1458.g5399b3b

Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):

2.857503976 seconds time elapsed ( +- 0.22% )

The build can be even slower than that when done over NFS
volumes.

The reason for the slowness is that util/PERF-VERSION-GEN uses
"git describe" to generate the string, which has to count the
"number of commits distance" from the nearest tag - the ".1458."
count in the output above. For that Git had to extract and
decompress 1458 Git objects, which takes time and bandwidth.

But this "number of commits" value is mostly irrelevant in
practice. We either want to know an approximate tag name, or we
want to know the precise sha1.

So this patch simplifies the version string to:

PERF_VERSION = 3.7.rc3.g5399b3b.dirty

which speeds up the version string generation script by an order
of magnitude:

$ perf stat --null --repeat 3 --sync --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
PERF_VERSION = 3.7.rc3.g5399b3b.dirty
PERF_VERSION = 3.7.rc3.g5399b3b.dirty
PERF_VERSION = 3.7.rc3.g5399b3b.dirty

Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):

0.307633559 seconds time elapsed ( +- 0.84% )

Signed-off-by: Ingo Molnar <[email protected]>
---
tools/perf/util/PERF-VERSION-GEN | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/PERF-VERSION-GEN b/tools/perf/util/PERF-VERSION-GEN
index 95264f3..c774b89 100755
--- a/tools/perf/util/PERF-VERSION-GEN
+++ b/tools/perf/util/PERF-VERSION-GEN
@@ -12,7 +12,7 @@ LF='
# First check if there is a .git to get the version from git describe
# otherwise try to get the version from the kernel makefile
if test -d ../../.git -o -f ../../.git &&
- VN=$(git describe --match 'v[0-9].[0-9]*' --abbrev=4 HEAD 2>/dev/null) &&
+ VN=$(echo $(git tag --list "v[0-9].[0-9]*" | tail -1)"-g"$(git log -1 --abbrev=4 --pretty=format:"%h" HEAD) 2>/dev/null) &&
case "$VN" in
*$LF*) (exit 1) ;;
v[0-9]*)

2012-10-30 08:54:49

by Ingo Molnar

Subject: [PATCH] perf tools: Further speed up the perf build

There's another source of overhead in the perf version string
generator:

git update-index -q --refresh

... which will iterate the whole checked out tree. This can be
pretty slow on NFS volumes, but takes some time even with local
SSD disks and a fully cached kernel tree:

$ perf stat --null --repeat 3 --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
PERF_VERSION = 3.7.rc3.g5399b3b.dirty
PERF_VERSION = 3.7.rc3.g5399b3b.dirty
PERF_VERSION = 3.7.rc3.g5399b3b.dirty

Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):

0.306999221 seconds time elapsed ( +- 0.56% )

So remove the .dirty differentiator as well - it adds little
information because locally patched git trees are common, but
seldom are the perf tools modified.

So a lot of version strings are reported as 'dirty' while in
fact they are pristine perf builds. For example 99% of my perf
builds are not patched but the kernel tree is slightly patched,
which adds the .dirty tag.

Eliminating that tag speeds up version generation by another
order of magnitude:

$ perf stat --null --repeat 3 --sync --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
PERF_VERSION = 3.7.rc3.g4b0bd3
PERF_VERSION = 3.7.rc3.g4b0bd3
PERF_VERSION = 3.7.rc3.g4b0bd3

Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):

0.021270923 seconds time elapsed ( +- 1.94% )

(Also clean up some of the comments around this code.)

Signed-off-by: Ingo Molnar <[email protected]>
---
tools/perf/util/PERF-VERSION-GEN | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/PERF-VERSION-GEN b/tools/perf/util/PERF-VERSION-GEN
index c774b89..6fb1cc8 100755
--- a/tools/perf/util/PERF-VERSION-GEN
+++ b/tools/perf/util/PERF-VERSION-GEN
@@ -9,17 +9,12 @@ GVF=${OUTPUT}PERF-VERSION-FILE
LF='
'

+#
# First check if there is a .git to get the version from git describe
-# otherwise try to get the version from the kernel makefile
+# otherwise try to get the version from the kernel Makefile
+#
if test -d ../../.git -o -f ../../.git &&
- VN=$(echo $(git tag --list "v[0-9].[0-9]*" | tail -1)"-g"$(git log -1 --abbrev=4 --pretty=format:"%h" HEAD) 2>/dev/null) &&
- case "$VN" in
- *$LF*) (exit 1) ;;
- v[0-9]*)
- git update-index -q --refresh
- test -z "$(git diff-index --name-only HEAD --)" ||
- VN="$VN-dirty" ;;
- esac
+ VN=$(echo $(git tag --list "v[0-9].[0-9]*" | tail -1)"-g"$(git log -1 --abbrev=4 --pretty=format:"%h" HEAD) 2>/dev/null)
then
VN=$(echo "$VN" | sed -e 's/-/./g');
else

2012-10-30 09:14:51

by Ingo Molnar

Subject: Re: [GIT PULL 0/9] perf/core improvements and fixes


* Peter Zijlstra <[email protected]> wrote:

> On Tue, 2012-10-30 at 09:18 +0100, Ingo Molnar wrote:
> > > > The optimal way, I guess, would be to have some cache file
> > > > with the results of such feature tests, that would be created
> > > > and then used till the build fails using its findings, which
> > > > would trigger a new feature check round, followed by an
> > > > automatic rebuild.

I did not write that.

I think making the feature tests parallel would be enough to
speed it all up - caching brings in a new set of problems. The
tests are mostly independent and the feature test makefile rules
could be parallelized like the object file rules.

> autoconf!! ;-)
>
> /me runs

hey, we build perf much faster than autoconf's 'configure'
script finishes running ;-)

Thanks,

Ingo

2012-10-30 09:35:42

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf tools: Speed up the perf build time by simplifying the perf --version string generation

On Tue, Oct 30, 2012 at 09:46:00AM +0100, Ingo Molnar wrote:
> +++ b/tools/perf/util/PERF-VERSION-GEN
> @@ -12,7 +12,7 @@ LF='
> # First check if there is a .git to get the version from git describe
> # otherwise try to get the version from the kernel makefile
> if test -d ../../.git -o -f ../../.git &&
> - VN=$(git describe --match 'v[0-9].[0-9]*' --abbrev=4 HEAD 2>/dev/null) &&
> + VN=$(echo $(git tag --list "v[0-9].[0-9]*" | tail -1)"-g"$(git log -1 --abbrev=4 --pretty=format:"%h" HEAD) 2>/dev/null) &&

[acme@sandy linux]$ make -j8 -C tools/perf/ O=/home/acme/git/build/perf install
make: Entering directory `/home/git/linux/tools/perf'
error: unknown option `list'
usage: git tag [-a|-s|-u <key-id>] [-f] [-m <msg>|-F <file>] <tagname>
[<head>]

<SNIP>

[acme@sandy linux]$ git --version
git version 1.7.1

- Arnaldo

2012-10-30 09:43:46

by Ingo Molnar

Subject: Re: [PATCH] perf tools: Speed up the perf build time by simplifying the perf --version string generation


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> On Tue, Oct 30, 2012 at 09:46:00AM +0100, Ingo Molnar wrote:
> > +++ b/tools/perf/util/PERF-VERSION-GEN
> > @@ -12,7 +12,7 @@ LF='
> > # First check if there is a .git to get the version from git describe
> > # otherwise try to get the version from the kernel makefile
> > if test -d ../../.git -o -f ../../.git &&
> > - VN=$(git describe --match 'v[0-9].[0-9]*' --abbrev=4 HEAD 2>/dev/null) &&
> > + VN=$(echo $(git tag --list "v[0-9].[0-9]*" | tail -1)"-g"$(git log -1 --abbrev=4 --pretty=format:"%h" HEAD) 2>/dev/null) &&
>
> [acme@sandy linux]$ make -j8 -C tools/perf/ O=/home/acme/git/build/perf install
> make: Entering directory `/home/git/linux/tools/perf'
> error: unknown option `list'
> usage: git tag [-a|-s|-u <key-id>] [-f] [-m <msg>|-F <file>] <tagname>
> [<head>]
>
> <SNIP>
>
> [acme@sandy linux]$ git --version
> git version 1.7.1

Does -l work?

Alternatively, please replace:

git tag --list "v[0-9].[0-9]*" | tail -1

with:

git tag | tail -1 | grep -E "v[0-9].[0-9]*"

which is just as fast.

Thanks,

Ingo

2012-10-30 09:48:59

by Ingo Molnar

Subject: Re: [PATCH] perf tools: Speed up the perf build time by simplifying the perf --version string generation


* Ingo Molnar <[email protected]> wrote:

> Does -l work?
>
> Alternatively, please replace:
>
> git tag --list "v[0-9].[0-9]*" | tail -1
>
> with:
>
> git tag | tail -1 | grep -E "v[0-9].[0-9]*"
>
> which is just as fast.

make that:

git tag 2>/dev/null | tail -1 | grep -E "v[0-9].[0-9]*"

this will work silently even if Git is not installed.
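
One thing to watch with this form: because grep runs after tail -1, a
tree whose last listed tag does not match the pattern yields an empty
string, whereas 'git tag -l <pattern> | tail -1' filters first. A minimal
sketch that keeps the filter-first behaviour, assuming the tag list is
supplied on stdin (the last_version_tag name is hypothetical):

```shell
# Hypothetical helper: read tag names on stdin, keep only version-like
# tags, and print the last one in the order received.
last_version_tag() {
    grep -E '^v[0-9]+\.[0-9]+' | tail -1
}

# Usage (not run here):
#   git tag 2>/dev/null | last_version_tag
```

Like the original pipeline, this relies on the order in which git tag
prints names, so it is only as good as that ordering.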

Thanks,

Ingo

2012-10-30 09:50:00

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf tools: Speed up the perf build time by simplifying the perf --version string generation

On Tue, Oct 30, 2012 at 10:43:38AM +0100, Ingo Molnar wrote:
>
> * Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> > On Tue, Oct 30, 2012 at 09:46:00AM +0100, Ingo Molnar wrote:
> > > +++ b/tools/perf/util/PERF-VERSION-GEN
> > > @@ -12,7 +12,7 @@ LF='
> > > # First check if there is a .git to get the version from git describe
> > > # otherwise try to get the version from the kernel makefile
> > > if test -d ../../.git -o -f ../../.git &&
> > > - VN=$(git describe --match 'v[0-9].[0-9]*' --abbrev=4 HEAD 2>/dev/null) &&
> > > + VN=$(echo $(git tag --list "v[0-9].[0-9]*" | tail -1)"-g"$(git log -1 --abbrev=4 --pretty=format:"%h" HEAD) 2>/dev/null) &&
> >
> > [acme@sandy linux]$ make -j8 -C tools/perf/ O=/home/acme/git/build/perf install
> > make: Entering directory `/home/git/linux/tools/perf'
> > error: unknown option `list'
> > usage: git tag [-a|-s|-u <key-id>] [-f] [-m <msg>|-F <file>] <tagname>
> > [<head>]
> >
> > <SNIP>
> >
> > [acme@sandy linux]$ git --version
> > git version 1.7.1
>
> Does -l work?

Yes, changed that and applied, thanks!

- Arnaldo

2012-10-30 09:58:11

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf tools: Speed up the perf build time by simplifying the perf --version string generation

On Tue, Oct 30, 2012 at 10:48:52AM +0100, Ingo Molnar wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > Does -l work?
> >
> > Alternatively, please replace:
> >
> > git tag --list "v[0-9].[0-9]*" | tail -1
> >
> > with:
> >
> > git tag | tail -1 | grep -E "v[0-9].[0-9]*"
> >
> > which is just as fast.
>
> make that:
>
> git tag 2>/dev/null | tail -1 | grep -E "v[0-9].[0-9]*"
>
> this will work silently even if Git is not installed.

But we first check if we have a .git; that doesn't guarantee that git
is installed, but makes it a lot more likely, no? Redirecting stderr to
null would need to be done in more places, so we would need to use
something like what we do for xmlto/asciidoc, $(call get-executable,$(GIT))

#
# First check if there is a .git to get the version from git describe
# otherwise try to get the version from the kernel Makefile
#
if test -d ../../.git -o -f ../../.git &&
VN=$(echo $(git tag -l "v[0-9].[0-9]*" | tail -1)"-g"$(git log -1 --abbrev=4 --pretty=format:"%h" HEAD) 2>/dev/null)
then
VN=$(echo "$VN" | sed -e 's/-/./g');
else
VN=$(MAKEFLAGS= make -sC ../.. kernelversion)
fi

2012-10-30 10:01:42

by Ingo Molnar

Subject: Re: [PATCH] perf tools: Speed up the perf build time by simplifying the perf --version string generation


* Arnaldo Carvalho de Melo <[email protected]> wrote:

> On Tue, Oct 30, 2012 at 10:48:52AM +0100, Ingo Molnar wrote:
> >
> > * Ingo Molnar <[email protected]> wrote:
> >
> > > Does -l work?
> > >
> > > Alternatively, please replace:
> > >
> > > git tag --list "v[0-9].[0-9]*" | tail -1
> > >
> > > with:
> > >
> > > git tag | tail -1 | grep -E "v[0-9].[0-9]*"
> > >
> > > which is just as fast.
> >
> > make that:
> >
> > git tag 2>/dev/null | tail -1 | grep -E "v[0-9].[0-9]*"
> >
> > this will work silently even if Git is not installed.
>
> But we first check if we have a .git; that doesn't guarantee
> that git is installed, but makes it a lot more likely, no?
> [...]

Not necessarily - say a home directory is NFS shared to multiple
test boxes, one of which does not have Git installed.

> Redirecting stderr to null would need to be done in more
> places, so we would need to use something like what we do for
> xmlto/asciidoc, $(call get-executable,$(GIT))

It at least solves it in this particular case. I tested it
with Git uninstalled: there's no extra message, just a proper
error code the script can use to fall back to the toplevel
Makefile for version info.
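
The fallback pattern being described can be sketched as follows. This is
an illustration under assumptions, not the PERF-VERSION-GEN code: the
version_or_fallback function is hypothetical, and both arguments are
assumed to be single command or function names.

```shell
# Hypothetical wrapper: try a git-based version command quietly and use
# the fallback when it fails or prints nothing.
version_or_fallback() {
    # $1: command printing a git-derived version
    # $2: fallback command, e.g. one that asks the toplevel Makefile
    if vn=$($1 2>/dev/null) && [ -n "$vn" ]; then
        # Same dash-to-dot conversion the script applies to git output.
        printf '%s\n' "$vn" | sed -e 's/-/./g'
    else
        $2
    fi
}
```

Because stderr of the first command is discarded, a missing git binary
shows up only as a non-zero exit status, which is what selects the
fallback branch.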

Thanks,

Ingo

2012-10-31 17:52:30

by Pavel Machek

Subject: Re: 'git describe' is very slow on development trees with lots of commits

Hi!

> (Cc:-ed the Git development list.)
>
> * David Ahern <[email protected]> wrote:
>
> > PERF-VERSION-GEN and specifically the git commands are the
> > cause of more delay than the config checks, especially when
> > doing the build in a VM with the kernel source on an NFS
> > mount.
>
> Yes, I have noticed that too.
....
> The cost on this pretty fast machine is about 1 msec per commit
> - which adds up to about 2.5 seconds during much of the
> development cycle.

Well... I noticed that my builds are very slow when little has
changed... and it was due to the computation of the version string. Ouch.

pavel@amd:~/mainline-altera/linux$ time git describe
fixes-for-linus-506-g71ca8691
0.68user 0.22system 27.82 (0m27.820s) elapsed 3.26%CPU
pavel@amd:~/mainline-altera/linux$

(Cached, it is a more reasonable 3 seconds, but it keeps going out of
cache all the time. An uncached clean build takes 3 minutes; a cached
one takes 9 seconds + the time to do git describe.)

ThinkPad X60.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

2012-11-14 06:33:46

by Ingo Molnar

Subject: [tip:perf/core] perf tools: Speed up the perf build time by simplifying the perf --version string generation

Commit-ID: acddedfba0df1e47fa99035a04661082b679ee9c
Gitweb: http://git.kernel.org/tip/acddedfba0df1e47fa99035a04661082b679ee9c
Author: Ingo Molnar <[email protected]>
AuthorDate: Tue, 30 Oct 2012 09:46:00 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 31 Oct 2012 12:17:49 -0200

perf tools: Speed up the perf build time by simplifying the perf --version string generation

Building perf is pretty slow on trees that have a lot of commits
relative to the nearest Git tag. This slowness manifests itself during
version string generation:

$ perf stat --null --repeat 3 --sync --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
PERF_VERSION = 3.7.rc3.1458.g5399b3b
PERF_VERSION = 3.7.rc3.1458.g5399b3b
PERF_VERSION = 3.7.rc3.1458.g5399b3b

Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):

2.857503976 seconds time elapsed ( +- 0.22% )

The build can be even slower than that when done over NFS volumes.

The reason for the slowness is that util/PERF-VERSION-GEN uses "git
describe" to generate the string, which has to count the "number of
commits distance" from the nearest tag - the ".1458." count in the
output above. For that Git had to extract and decompress 1458 Git
objects, which takes time and bandwidth.

But this "number of commits" value is mostly irrelevant in practice. We
either want to know an approximate tag name, or we want to know the
precise sha1.

So this patch simplifies the version string to:

PERF_VERSION = 3.7.rc3.g5399b3b.dirty

which speeds up the version string generation script by an order of
magnitude:

$ perf stat --null --repeat 3 --sync --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
PERF_VERSION = 3.7.rc3.g5399b3b.dirty
PERF_VERSION = 3.7.rc3.g5399b3b.dirty
PERF_VERSION = 3.7.rc3.g5399b3b.dirty

Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):

0.307633559 seconds time elapsed ( +- 0.84% )

Signed-off-by: Ingo Molnar <[email protected]>
Cc: Andrew Vagin <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Howells <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/PERF-VERSION-GEN | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/PERF-VERSION-GEN b/tools/perf/util/PERF-VERSION-GEN
index 95264f3..f6e8ee2 100755
--- a/tools/perf/util/PERF-VERSION-GEN
+++ b/tools/perf/util/PERF-VERSION-GEN
@@ -12,7 +12,7 @@ LF='
# First check if there is a .git to get the version from git describe
# otherwise try to get the version from the kernel makefile
if test -d ../../.git -o -f ../../.git &&
- VN=$(git describe --match 'v[0-9].[0-9]*' --abbrev=4 HEAD 2>/dev/null) &&
+ VN=$(echo $(git tag -l "v[0-9].[0-9]*" | tail -1)"-g"$(git log -1 --abbrev=4 --pretty=format:"%h" HEAD) 2>/dev/null) &&
case "$VN" in
*$LF*) (exit 1) ;;
v[0-9]*)
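
As a rough illustration of how the new version string is put together, the sketch below composes it from a tag name and an abbreviated commit hash passed in as arguments. The function name compose_perf_version is hypothetical, and the final step that drops the leading "v" is an assumption based on the PERF_VERSION output quoted in the commit message; it is not part of the hunk above. In a real tree the two inputs would come from the 'git tag -l' and 'git log -1' invocations in the patch. Note that 'tail -1' picks the last tag in listing order, which is not necessarily the tag nearest to HEAD.

```shell
#!/bin/sh
# Hypothetical helper (not part of PERF-VERSION-GEN): build a perf-style
# version string from a tag name and an abbreviated commit hash, without
# touching a Git repository.
compose_perf_version() {
	tag=$1
	short_hash=$2

	# Same shape as the patched VN assignment: "<tag>-g<hash>"
	vn="${tag}-g${short_hash}"

	# Replace dashes with dots, as the script's sed step does
	vn=$(printf '%s\n' "$vn" | sed -e 's/-/./g')

	# Assumed final step: drop the leading "v" so the result matches
	# the PERF_VERSION format shown in the commit message
	vn=${vn#v}

	printf '%s\n' "$vn"
}

# Example with the tag and hash from the commit message
compose_perf_version "v3.7-rc3" "5399b3b"   # prints 3.7.rc3.g5399b3b
```

Taking the tag and hash as parameters keeps the string handling separate from the Git queries, so the formatting can be checked without a repository.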

2012-11-14 06:33:45

by Ingo Molnar

Subject: [tip:perf/core] perf tools: Further speed up the perf build

Commit-ID: 0e2af956693a8797d658d076ff4c0da4147f0131
Gitweb: http://git.kernel.org/tip/0e2af956693a8797d658d076ff4c0da4147f0131
Author: Ingo Molnar <[email protected]>
AuthorDate: Tue, 30 Oct 2012 09:54:41 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 31 Oct 2012 12:17:49 -0200

perf tools: Further speed up the perf build

There's another source of overhead in the perf version string generator:

git update-index -q --refresh

... which will iterate the whole checked out tree. This can be pretty
slow on NFS volumes, but takes some time even with local SSD disks and a
fully cached kernel tree:

$ perf stat --null --repeat 3 --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
PERF_VERSION = 3.7.rc3.g5399b3b.dirty
PERF_VERSION = 3.7.rc3.g5399b3b.dirty
PERF_VERSION = 3.7.rc3.g5399b3b.dirty

Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):

0.306999221 seconds time elapsed ( +- 0.56% )

So remove the .dirty differentiator as well - it adds little information,
because locally patched Git trees are common but the perf tools
themselves are seldom modified.

So a lot of version strings are reported as 'dirty' while in fact they
are pristine perf builds. For example 99% of my perf builds are not
patched but the kernel tree is slightly patched, which adds the .dirty
tag.

Eliminating that tag speeds up version generation by another order of
magnitude:

$ perf stat --null --repeat 3 --sync --pre "rm -f PERF-VERSION-FILE" util/PERF-VERSION-GEN
PERF_VERSION = 3.7.rc3.g4b0bd3
PERF_VERSION = 3.7.rc3.g4b0bd3
PERF_VERSION = 3.7.rc3.g4b0bd3

Performance counter stats for 'util/PERF-VERSION-GEN' (3 runs):

0.021270923 seconds time elapsed ( +- 1.94% )

(Also clean up some of the comments around this code.)

Signed-off-by: Ingo Molnar <[email protected]>
Cc: Andrew Vagin <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Howells <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/PERF-VERSION-GEN | 13 ++++---------
1 files changed, 4 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/PERF-VERSION-GEN b/tools/perf/util/PERF-VERSION-GEN
index f6e8ee2..ac418a1 100755
--- a/tools/perf/util/PERF-VERSION-GEN
+++ b/tools/perf/util/PERF-VERSION-GEN
@@ -9,17 +9,12 @@ GVF=${OUTPUT}PERF-VERSION-FILE
LF='
'

+#
# First check if there is a .git to get the version from git describe
-# otherwise try to get the version from the kernel makefile
+# otherwise try to get the version from the kernel Makefile
+#
if test -d ../../.git -o -f ../../.git &&
- VN=$(echo $(git tag -l "v[0-9].[0-9]*" | tail -1)"-g"$(git log -1 --abbrev=4 --pretty=format:"%h" HEAD) 2>/dev/null) &&
- case "$VN" in
- *$LF*) (exit 1) ;;
- v[0-9]*)
- git update-index -q --refresh
- test -z "$(git diff-index --name-only HEAD --)" ||
- VN="$VN-dirty" ;;
- esac
+ VN=$(echo $(git tag -l "v[0-9].[0-9]*" | tail -1)"-g"$(git log -1 --abbrev=4 --pretty=format:"%h" HEAD) 2>/dev/null)
then
VN=$(echo "$VN" | sed -e 's/-/./g');
else