LinuxLists.cc - Misc perf stat improvements

2013-08-03 00:41:19

Subject: Misc perf stat improvements

Here are a couple of perf stat improvements/cleanups:

- output more information (ratios) in CSV mode
- add --initial-delay to skip startup phase of program
- handle pipes better in interval mode
- some cleanup

2013-08-03 00:41:27

by Andi Kleen

Subject: [PATCH 5/5] perf, tools: Output running time and run/enabled ratio in CSV mode

From: Andi Kleen <[email protected]>

Information about how long a counter ran in perf stat can be quite
useful for other tools to judge how trustworthy a measurement is.

Currently it is only output in non-CSV mode.

This patch makes perf stat always output the running time and the
running/enabled ratio in CSV mode.

This adds two new fields at the end of each line. I assume that existing
tools ignore new fields at the end, so it's on by default.

Only CSV mode is affected, no difference otherwise.

Signed-off-by: Andi Kleen <[email protected]>
---
tools/perf/builtin-stat.c | 57 +++++++++++++++++++++++++++++++++++++----------
1 file changed, 45 insertions(+), 12 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index f686d5f..940fcfd 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -982,6 +982,13 @@ static void print_aggr(char *prefix)
fprintf(output, "%s%s",
csv_sep, counter->cgrp->name);

+ if (csv_output)
+ fprintf(output, "%s%" PRIu64 "%s%.2f",
+ csv_sep,
+ run,
+ csv_sep,
+ 100.0 * run / ena);
+
fputc('\n', output);
continue;
}
@@ -997,6 +1004,12 @@ static void print_aggr(char *prefix)
if (run != ena)
fprintf(output, " (%.2f%%)",
100.0 * run / ena);
+ } else {
+ fprintf(output, "%s%" PRIu64 "%s%.2f",
+ csv_sep,
+ run,
+ csv_sep,
+ 100.0 * run / ena);
}
fputc('\n', output);
}
@@ -1012,6 +1025,10 @@ static void print_counter_aggr(struct perf_evsel *counter, char *prefix)
struct perf_stat *ps = counter->priv;
double avg = avg_stats(&ps->res_stats[0]);
int scaled = counter->counts->scaled;
+ double avg_enabled, avg_running;
+
+ avg_enabled = avg_stats(&ps->res_stats[1]);
+ avg_running = avg_stats(&ps->res_stats[2]);

if (prefix)
fprintf(output, "%s", prefix);
@@ -1027,6 +1044,13 @@ static void print_counter_aggr(struct perf_evsel *counter, char *prefix)
if (counter->cgrp)
fprintf(output, "%s%s", csv_sep, counter->cgrp->name);

+ if (csv_output)
+ fprintf(output, "%s%.0f%s%.2f",
+ csv_sep,
+ avg_running,
+ csv_sep,
+ 100.0 * avg_running / avg_enabled);
+
fputc('\n', output);
return;
}
@@ -1038,19 +1062,14 @@ static void print_counter_aggr(struct perf_evsel *counter, char *prefix)

print_noise(counter, avg);

- if (csv_output) {
- fputc('\n', output);
- return;
- }
-
- if (scaled) {
- double avg_enabled, avg_running;
-
- avg_enabled = avg_stats(&ps->res_stats[1]);
- avg_running = avg_stats(&ps->res_stats[2]);
-
+ if (csv_output)
+ fprintf(output, "%s%.0f%s%.2f",
+ csv_sep,
+ avg_running,
+ csv_sep,
+ 100.0 * avg_running / avg_enabled);
+ else
fprintf(output, " [%5.2f%%]", 100 * avg_running / avg_enabled);
- }
fprintf(output, "\n");
}

@@ -1085,6 +1104,13 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
fprintf(output, "%s%s",
csv_sep, counter->cgrp->name);

+ if (csv_output)
+ fprintf(output, "%s%" PRIu64 "%s%.2f",
+ csv_sep,
+ run,
+ csv_sep,
+ 100.0 * run / ena);
+
fputc('\n', output);
continue;
}
@@ -1100,7 +1126,14 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
if (run != ena)
fprintf(output, " (%.2f%%)",
100.0 * run / ena);
+ } else {
+ fprintf(output, "%s%" PRIu64 "%s%.2f",
+ csv_sep,
+ run,
+ csv_sep,
+ 100.0 * run / ena);
}
+
fputc('\n', output);
}
}
--
1.8.3.1
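
As an illustration of how another tool might consume the two trailing fields described in the patch above, here is a minimal sketch in C. The helper name `parse_run_fields` and the sample line layout are assumptions made for this example and are not part of perf; the only thing taken from the patch is that the running time and the running/enabled percentage are the last two fields on a CSV line.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical consumer-side helper (not part of perf): extract the last
 * two separator-delimited fields of one CSV line, assumed to be the
 * running time and the running/enabled percentage added by this patch.
 * Returns 0 on success, -1 if the line has fewer than two separators
 * or the fields are not numeric.
 */
int parse_run_fields(const char *line, char sep,
		     unsigned long long *run, double *pct)
{
	const char *last = strrchr(line, sep);
	const char *prev;
	char *end;

	if (!last || last == line)
		return -1;

	/* Walk back to the separator before the last one. */
	for (prev = last - 1; prev > line && *prev != sep; prev--)
		;
	if (*prev != sep)
		return -1;

	/* The running time must span exactly the second-to-last field. */
	*run = strtoull(prev + 1, &end, 10);
	if (end != last)
		return -1;

	/* The percentage is whatever number starts the last field. */
	*pct = strtod(last + 1, &end);
	if (end == last + 1)
		return -1;

	return 0;
}
```

A caller could then, for instance, ignore any measurement whose percentage falls below a threshold of its own choosing.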

2013-08-03 00:41:41

by Andi Kleen

Subject: [PATCH 3/5] perf, tools: Add support for --initial-delay option to perf stat

From: Andi Kleen <[email protected]>

When measuring workloads the startup phase -- doing page faults,
dynamic linking, opening files -- is often very different from
the rest of the workload. Especially with smaller kernels
and using counter multiplexing this can give significant
measurement errors.

Multiplexing assumes that the workload is mostly the same
over longer periods. But at startup there is typically
some spike of activity which is relatively short.
If many groups are multiplexed, the one group that sees
the spike, and is then scaled up over the time needed to run all
groups, may show a significant error.

Also in general it's often not useful to measure the startup,
because it is so different from the rest.

One way around this is to use interval mode and discard
the first sample, but this can be awkward because interval
mode doesn't support intervals of less than 100ms,
and also a useful interval is not necessarily the same
as a useful startup delay.

This patch adds a new --initial-delay / -D option to skip measuring
during the startup phase. The delay is specified in ms.

Here's a simple example:

perf stat -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
3,721 page-faults
...

If we just wait 20 ms, the number of page faults is about a quarter lower:

perf stat -D 20 -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
2,823 page-faults
...

So we filtered out most of the startup noise from bash.

Signed-off-by: Andi Kleen <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 5 +++++
tools/perf/builtin-stat.c | 22 +++++++++++++++++++++-
2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 2fe87fb..73c9759 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -132,6 +132,11 @@ is a useful mode to detect imbalance between physical cores. To enable this mod
use --per-core in addition to -a. (system-wide). The output includes the
core number and the number of online logical processors on that physical processor.

+-D msecs::
+--initial-delay msecs::
+After starting the program, wait msecs before measuring. This is useful to
+filter out the startup phase of the program, which is often very different.
+
EXAMPLES
--------

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 352fbd7..2e637e4 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -100,6 +100,7 @@ static const char *pre_cmd = NULL;
static const char *post_cmd = NULL;
static bool sync_run = false;
static unsigned int interval = 0;
+static unsigned int initial_delay = 0;
static bool forever = false;
static struct timespec ref_time;
static struct cpu_map *aggr_map;
@@ -254,7 +255,8 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
if (!perf_target__has_task(&target) &&
perf_evsel__is_group_leader(evsel)) {
attr->disabled = 1;
- attr->enable_on_exec = 1;
+ if (!initial_delay)
+ attr->enable_on_exec = 1;
}

return perf_evsel__open_per_thread(evsel, evsel_list->threads);
@@ -416,6 +418,20 @@ static void print_interval(void)
}
}

+static void handle_initial_delay(void)
+{
+ struct perf_evsel *counter;
+
+ if (initial_delay) {
+ const int ncpus = cpu_map__nr(evsel_list->cpus),
+ nthreads = thread_map__nr(evsel_list->threads);
+
+ usleep(initial_delay * 1000);
+ list_for_each_entry(counter, &evsel_list->entries, node)
+ perf_evsel__enable(counter, ncpus, nthreads);
+ }
+}
+
static int __run_perf_stat(int argc, const char **argv)
{
char msg[512];
@@ -486,6 +502,7 @@ static int __run_perf_stat(int argc, const char **argv)

if (forks) {
perf_evlist__start_workload(evsel_list);
+ handle_initial_delay();

if (interval) {
while (!waitpid(child_pid, &status, WNOHANG)) {
@@ -497,6 +514,7 @@ static int __run_perf_stat(int argc, const char **argv)
if (WIFSIGNALED(status))
psignal(WTERMSIG(status), argv[0]);
} else {
+ handle_initial_delay();
while (!done) {
nanosleep(&ts, NULL);
if (interval)
@@ -1419,6 +1437,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
"aggregate counts per processor socket", AGGR_SOCKET),
OPT_SET_UINT(0, "per-core", &aggr_mode,
"aggregate counts per physical processor core", AGGR_CORE),
+ OPT_UINTEGER('D', "delay", &initial_delay,
+ "ms to wait before starting measurement after program start"),
OPT_END()
};
const char * const stat_usage[] = {
--
1.8.3.1
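
The multiplexing error described in the commit message above can be made concrete with a little arithmetic. The sketch below assumes the usual scaling of a multiplexed count by time enabled over time running; the function name `scale_count` and all numbers in the comments are invented for illustration.

```c
#include <stdint.h>

/*
 * Estimate the full-run count of a multiplexed counter: the raw count
 * observed while the counter was scheduled is scaled by
 * time_enabled / time_running. Returns 0 if the counter never ran.
 *
 * Hypothetical workload: 900 events in a 10 ms startup spike, then
 * 90 events spread evenly over the remaining 90 ms (true total: 990).
 *   - a group scheduled only during the spike reports
 *     scale_count(900, 100, 10), a large overestimate;
 *   - a group scheduled for 10 ms of the steady phase reports
 *     scale_count(10, 100, 10), a large underestimate.
 */
double scale_count(uint64_t raw, uint64_t time_enabled, uint64_t time_running)
{
	if (time_running == 0)
		return 0.0;
	return (double)raw * (double)time_enabled / (double)time_running;
}
```

Skipping the spike with a start-up delay removes the phase for which the "workload is mostly the same" assumption fails.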

2013-08-03 00:41:40

by Andi Kleen

Subject: [PATCH 4/5] perf, tools: flush output after each line in stat interval mode

From: Andi Kleen <[email protected]>

When interval mode is outputting to a pipe, each measurement
should be flushed individually, so that the reader sees it
promptly.

With a terminal, stdio flushes each line automatically,
but that is disabled for non-terminal output.

Simply fflush output after each time interval.

Signed-off-by: Andi Kleen <[email protected]>
---
tools/perf/builtin-stat.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 2e637e4..f686d5f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -416,6 +416,8 @@ static void print_interval(void)
list_for_each_entry(counter, &evsel_list->entries, node)
print_counter_aggr(counter, prefix);
}
+
+ fflush(output);
}

static void handle_initial_delay(void)
--
1.8.3.1

2013-08-03 00:41:18

by Andi Kleen

Subject: [PATCH 2/5] tools, perf: Add support to evsel for enabling counters

From: Andi Kleen <[email protected]>

Add support for enabling already set up counters by using an
ioctl. I share some code with the filter setup.

Signed-off-by: Andi Kleen <[email protected]>
---
tools/perf/util/evsel.c | 21 ++++++++++++++++++---
tools/perf/util/evsel.h | 1 +
2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index c9c7494..60e0d84 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -605,16 +605,16 @@ int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
return evsel->fd != NULL ? 0 : -ENOMEM;
}

-int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
- const char *filter)
+static int perf_evsel__run_ioctl(struct perf_evsel *evsel, int ncpus, int nthreads,
+ int ioc, void *arg)
{
int cpu, thread;

for (cpu = 0; cpu < ncpus; cpu++) {
for (thread = 0; thread < nthreads; thread++) {
int fd = FD(evsel, cpu, thread),
- err = ioctl(fd, PERF_EVENT_IOC_SET_FILTER, filter);

+ err = ioctl(fd, ioc, arg);
if (err)
return err;
}
@@ -623,6 +623,21 @@ int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
return 0;
}

+int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
+ const char *filter)
+{
+ return perf_evsel__run_ioctl(evsel, ncpus, nthreads,
+ PERF_EVENT_IOC_SET_FILTER,
+ (void *)filter);
+}
+
+int perf_evsel__enable(struct perf_evsel *evsel, int ncpus, int nthreads)
+{
+ return perf_evsel__run_ioctl(evsel, ncpus, nthreads,
+ PERF_EVENT_IOC_ENABLE,
+ 0);
+}
+
int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
{
evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 3f156cc..b057e9c 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -142,6 +142,7 @@ void perf_evsel__set_sample_id(struct perf_evsel *evsel);

int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
const char *filter);
+int perf_evsel__enable(struct perf_evsel *evsel, int ncpus, int nthreads);

int perf_evsel__open_per_cpu(struct perf_evsel *evsel,
struct cpu_map *cpus);
--
1.8.3.1
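
The refactoring in this patch, one loop shared by every per-descriptor ioctl, can be sketched independently of perf's data structures. In the example below the flat row-major `fds` array stands in for perf's `FD(evsel, cpu, thread)` accessor, and the name `run_ioctl_all` is made up; both are assumptions of this sketch rather than the code that was merged.

```c
#include <sys/ioctl.h>

/*
 * Sketch of the shared loop: apply one ioctl request to every descriptor
 * in an ncpus x nthreads table and stop at the first failure. The flat
 * row-major `fds` array is an assumption of this example.
 */
int run_ioctl_all(const int *fds, int ncpus, int nthreads,
		  unsigned long request, void *arg)
{
	int cpu, thread;

	for (cpu = 0; cpu < ncpus; cpu++) {
		for (thread = 0; thread < nthreads; thread++) {
			int fd = fds[cpu * nthreads + thread];
			int err = ioctl(fd, request, arg);

			if (err)
				return err;
		}
	}

	return 0;
}
```

With such a helper, setting a filter and enabling a counter differ only in the request code and argument passed in, which is the point of the patch.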

2013-08-03 00:42:42

by Andi Kleen

Subject: [PATCH 1/5] perf, tools: Remove obsolete dummy execve

From: Andi Kleen <[email protected]>

Minor cleanup.

The dummy execve to pre-resolve the PLT is obsolete since
"enable_on_exec" was added. The counters only start
running after the execve anyway. So just remove it.

Signed-off-by: Andi Kleen <[email protected]>
---
tools/perf/util/evlist.c | 7 -------
1 file changed, 7 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 8065ce8..62efec9 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -783,13 +783,6 @@ int perf_evlist__prepare_workload(struct perf_evlist *evlist,
fcntl(go_pipe[0], F_SETFD, FD_CLOEXEC);

/*
- * Do a dummy execvp to get the PLT entry resolved,
- * so we avoid the resolver overhead on the real
- * execvp call.
- */
- execvp("", (char **)argv);
-
- /*
* Tell the parent we're ready to go
*/
close(child_ready_pipe[1]);
--
1.8.3.1
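
For readers unfamiliar with the attribute bits involved, here is a minimal sketch of how a counter can be set up so that it only starts counting once the child calls exec, which is why the dummy execvp no longer buys anything. The function name `init_stat_attr`, the `delay_ms` parameter and the choice of the page-faults software event are assumptions for this example; the conditional mirrors the one added to create_perf_stat_counter() in patch 3/5.

```c
#include <linux/perf_event.h>
#include <string.h>

/*
 * Sketch: prepare a counter that stays disabled in the forked child
 * until it calls exec, so nothing done before the exec is counted.
 * With a non-zero start-up delay the counter is left disabled and
 * must be enabled explicitly later (PERF_EVENT_IOC_ENABLE).
 * `init_stat_attr` and `delay_ms` are names made up for this example.
 */
void init_stat_attr(struct perf_event_attr *attr, unsigned int delay_ms)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_PAGE_FAULTS;
	attr->disabled = 1;
	if (!delay_ms)
		attr->enable_on_exec = 1;
}
```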

2013-08-05 08:28:17

by Namhyung Kim

Subject: Re: [PATCH 2/5] tools, perf: Add support to evsel for enabling counters

Hi Andi,

On Fri, 2 Aug 2013 17:41:10 -0700, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> Add support for enabling already set up counters by using an
> ioctl. I share some code with the filter setup.
>
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> tools/perf/util/evsel.c | 21 ++++++++++++++++++---
> tools/perf/util/evsel.h | 1 +
> 2 files changed, 19 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index c9c7494..60e0d84 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -605,16 +605,16 @@ int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
> return evsel->fd != NULL ? 0 : -ENOMEM;
> }
>
> -int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
> - const char *filter)
> +static int perf_evsel__run_ioctl(struct perf_evsel *evsel, int ncpus, int nthreads,
> + int ioc, void *arg)
> {
> int cpu, thread;
>
> for (cpu = 0; cpu < ncpus; cpu++) {
> for (thread = 0; thread < nthreads; thread++) {
> int fd = FD(evsel, cpu, thread),
> - err = ioctl(fd, PERF_EVENT_IOC_SET_FILTER, filter);
>
> + err = ioctl(fd, ioc, arg);

It looks very strange to have a blank line between variable declarations.
You'd be better off separating the declarations from the assignments, like:

int fd, err;

fd = FD(evsel, cpu, thread);
err = ioctl(fd, ioc, arg);

Thanks,
Namhyung

> if (err)
> return err;
> }

2013-08-05 09:46:40

by Jiri Olsa

Subject: Re: [PATCH 5/5] perf, tools: Output running time and run/enabled ratio in CSV mode

On Fri, Aug 02, 2013 at 05:41:13PM -0700, Andi Kleen wrote:
> From: Andi Kleen <[email protected]>
>
> The information how much a counter ran in perf stat can be quite
> interesting for other tools to judge how trustworthy a measurement is.
>
> Currently it is only output in non CSV mode.
>
> This patches make perf stat always output the running time and the
> enabled/running ratio in CSV mode.
>
> This adds two new fields at the end for each line. I assume that existing
> tools ignore new fields at the end, so it's on by default.
>
> Only CSV mode is affected, no difference otherwise.
>
> Signed-off-by: Andi Kleen <[email protected]>
> ---
> tools/perf/builtin-stat.c | 57 +++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 45 insertions(+), 12 deletions(-)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index f686d5f..940fcfd 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -982,6 +982,13 @@ static void print_aggr(char *prefix)
> fprintf(output, "%s%s",
> csv_sep, counter->cgrp->name);
>
> + if (csv_output)
> + fprintf(output, "%s%" PRIu64 "%s%.2f",
> + csv_sep,
> + run,
> + csv_sep,
> + 100.0 * run / ena);

looks like we could use a function/macro for this

jirka
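
One possible shape for such a helper is sketched below. The name `print_running_csv` is hypothetical, and reporting 0.00 when `ena` is zero is a choice made for this sketch; the posted patch divides unconditionally, including in the branch taken when a counter was not counted.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper for the repeated CSV suffix: append the running
 * time and the running/enabled percentage, each preceded by the
 * separator. The zero check on ena is an addition of this sketch.
 */
void print_running_csv(FILE *output, const char *csv_sep,
		       uint64_t run, uint64_t ena)
{
	double pct = ena ? 100.0 * run / ena : 0.0;

	fprintf(output, "%s%" PRIu64 "%s%.2f", csv_sep, run, csv_sep, pct);
}
```

Each of the four `fprintf(output, "%s%" PRIu64 "%s%.2f", ...)` call sites in print_aggr() and print_counter() could then collapse to a single call.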

2013-08-05 09:46:39

by Jiri Olsa

Subject: Re: Misc perf stat improvements

On Fri, Aug 02, 2013 at 05:41:08PM -0700, Andi Kleen wrote:
> Here are a couple of perf stat improvements/cleanups:
>
> - output more information (ratios) in CSV mode
> - add --initial-delay to skip startup phase of program
> - handle pipes better in interval mode
> - some cleanup
>

for the patchset:

Reviewed-by: Jiri Olsa <[email protected]>

2013-08-05 16:11:04

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 2/5] tools, perf: Add support to evsel for enabling counters

On Mon, Aug 05, 2013 at 05:28:14PM +0900, Namhyung Kim wrote:
> Hi Andi,
>
> On Fri, 2 Aug 2013 17:41:10 -0700, Andi Kleen wrote:
> > From: Andi Kleen <[email protected]>
> >
> > Add support for enabling already set up counters by using an
> > ioctl. I share some code with the filter setup.
> >
> > Signed-off-by: Andi Kleen <[email protected]>
> > ---
> > tools/perf/util/evsel.c | 21 ++++++++++++++++++---
> > tools/perf/util/evsel.h | 1 +
> > 2 files changed, 19 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> > index c9c7494..60e0d84 100644
> > --- a/tools/perf/util/evsel.c
> > +++ b/tools/perf/util/evsel.c
> > @@ -605,16 +605,16 @@ int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
> > return evsel->fd != NULL ? 0 : -ENOMEM;
> > }
> >
> > -int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
> > - const char *filter)
> > +static int perf_evsel__run_ioctl(struct perf_evsel *evsel, int ncpus, int nthreads,
> > + int ioc, void *arg)
> > {
> > int cpu, thread;
> >
> > for (cpu = 0; cpu < ncpus; cpu++) {
> > for (thread = 0; thread < nthreads; thread++) {
> > int fd = FD(evsel, cpu, thread),
> > - err = ioctl(fd, PERF_EVENT_IOC_SET_FILTER, filter);
> >
> > + err = ioctl(fd, ioc, arg);
>
> Looks very strange to have a blank line between variable declarations.
> You'd better separating declarations on the other lines like:
>
> int fd, err;
>
> fd = FD(evsel, cpu, thread);
> err = ioctl(fd, ioc, arg);

Preferences :-) I think the best way is:

int fd = FD(evsel, cpu, thread),
err = ioctl(fd, ioc, arg);

As it's all short, this uses 2 lines instead of 4. I'll fix up the
alignment.

>
> Thanks,
> Namhyung
>
>
> > if (err)
> > return err;
> > }

2013-08-12 10:20:08

by tip-bot for Vasyl Gomonovych

Subject: [tip:perf/core] perf evlist: Remove obsolete dummy execve

Commit-ID: 5c6974f49832a55edc9ca744323778947c104ca0
Gitweb: http://git.kernel.org/tip/5c6974f49832a55edc9ca744323778947c104ca0
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 2 Aug 2013 17:41:09 -0700
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 7 Aug 2013 17:35:28 -0300

perf evlist: Remove obsolete dummy execve

Minor cleanup.

The dummy execve to pre-resolve the PLT is obsolete since
"enable_on_exec" was added. The counters only start
running after the execve anyway. So just remove it.

Signed-off-by: Andi Kleen <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evlist.c | 7 -------
1 file changed, 7 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index da2dd92..c7d111f 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -837,13 +837,6 @@ int perf_evlist__prepare_workload(struct perf_evlist *evlist,
fcntl(go_pipe[0], F_SETFD, FD_CLOEXEC);

/*
- * Do a dummy execvp to get the PLT entry resolved,
- * so we avoid the resolver overhead on the real
- * execvp call.
- */
- execvp("", (char **)argv);
-
- /*
* Tell the parent we're ready to go
*/
close(child_ready_pipe[1]);

2013-08-12 10:20:22

by tip-bot for Vasyl Gomonovych

Subject: [tip:perf/core] perf evsel: Add support for enabling counters

Commit-ID: e2407bef968d64a28465561832686636d3380bf9
Gitweb: http://git.kernel.org/tip/e2407bef968d64a28465561832686636d3380bf9
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 2 Aug 2013 17:41:10 -0700
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 7 Aug 2013 17:35:28 -0300

perf evsel: Add support for enabling counters

Add support for enabling already set up counters by using an
ioctl. I share some code with the filter setup.

Signed-off-by: Andi Kleen <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Fixed up 'err' variable indentation ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/evsel.c | 21 ++++++++++++++++++---
tools/perf/util/evsel.h | 1 +
2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 8f10161..960394e 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -634,15 +634,15 @@ int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads)
return evsel->fd != NULL ? 0 : -ENOMEM;
}

-int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
- const char *filter)
+static int perf_evsel__run_ioctl(struct perf_evsel *evsel, int ncpus, int nthreads,
+ int ioc, void *arg)
{
int cpu, thread;

for (cpu = 0; cpu < ncpus; cpu++) {
for (thread = 0; thread < nthreads; thread++) {
int fd = FD(evsel, cpu, thread),
- err = ioctl(fd, PERF_EVENT_IOC_SET_FILTER, filter);
+ err = ioctl(fd, ioc, arg);

if (err)
return err;
@@ -652,6 +652,21 @@ int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
return 0;
}

+int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
+ const char *filter)
+{
+ return perf_evsel__run_ioctl(evsel, ncpus, nthreads,
+ PERF_EVENT_IOC_SET_FILTER,
+ (void *)filter);
+}
+
+int perf_evsel__enable(struct perf_evsel *evsel, int ncpus, int nthreads)
+{
+ return perf_evsel__run_ioctl(evsel, ncpus, nthreads,
+ PERF_EVENT_IOC_ENABLE,
+ 0);
+}
+
int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads)
{
evsel->sample_id = xyarray__new(ncpus, nthreads, sizeof(struct perf_sample_id));
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 5edc625..532a5f9 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -146,6 +146,7 @@ void perf_evsel__set_sample_id(struct perf_evsel *evsel);

int perf_evsel__set_filter(struct perf_evsel *evsel, int ncpus, int nthreads,
const char *filter);
+int perf_evsel__enable(struct perf_evsel *evsel, int ncpus, int nthreads);

int perf_evsel__open_per_cpu(struct perf_evsel *evsel,
struct cpu_map *cpus);

2013-08-12 10:20:32

by tip-bot for Vasyl Gomonovych

Subject: [tip:perf/core] perf stat: Add support for --initial-delay option

Commit-ID: 411916880ff4061ac0491a154f10af4d49a0c61a
Gitweb: http://git.kernel.org/tip/411916880ff4061ac0491a154f10af4d49a0c61a
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 2 Aug 2013 17:41:11 -0700
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 7 Aug 2013 17:35:29 -0300

perf stat: Add support for --initial-delay option

When measuring workloads the startup phase -- doing page faults, dynamic
linking, opening files -- is often very different from the rest of the
workload. Especially with smaller kernels and using counter
multiplexing this can give significant measurement errors.

Multiplexing assumes that the workload is mostly the same over longer
periods. But at startup there is typically some spike of activity which
is relatively short. If many groups are multiplexed, the one group that
sees the spike, and is then scaled up over the time needed to run all
groups, may show a significant error.

Also in general it's often not useful to measure the startup, because it
is so different from the rest.

One way around this is to use interval mode and discard the first
sample, but this can be awkward because interval mode doesn't support
intervals of less than 100ms, and also a useful interval is not
necessarily the same as a useful startup delay.

This patch adds a new --initial-delay / -D option to skip measuring during
the startup phase. The delay is specified in ms.

Here's a simple example:

perf stat -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
3,721 page-faults
...

If we just wait 20 ms, the number of page faults is about a quarter lower:

perf stat -D 20 -e page-faults bash -c 'for i in $(seq 100000) ; do true ; done'
...
2,823 page-faults
...

So we filtered out most of the startup noise from bash.

Signed-off-by: Andi Kleen <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-stat.txt | 5 +++++
tools/perf/builtin-stat.c | 22 +++++++++++++++++++++-
2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 2fe87fb..73c9759 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -132,6 +132,11 @@ is a useful mode to detect imbalance between physical cores. To enable this mod
use --per-core in addition to -a. (system-wide). The output includes the
core number and the number of online logical processors on that physical processor.

+-D msecs::
+--initial-delay msecs::
+After starting the program, wait msecs before measuring. This is useful to
+filter out the startup phase of the program, which is often very different.
+
EXAMPLES
--------

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 352fbd7..2e637e4 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -100,6 +100,7 @@ static const char *pre_cmd = NULL;
static const char *post_cmd = NULL;
static bool sync_run = false;
static unsigned int interval = 0;
+static unsigned int initial_delay = 0;
static bool forever = false;
static struct timespec ref_time;
static struct cpu_map *aggr_map;
@@ -254,7 +255,8 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
if (!perf_target__has_task(&target) &&
perf_evsel__is_group_leader(evsel)) {
attr->disabled = 1;
- attr->enable_on_exec = 1;
+ if (!initial_delay)
+ attr->enable_on_exec = 1;
}

return perf_evsel__open_per_thread(evsel, evsel_list->threads);
@@ -416,6 +418,20 @@ static void print_interval(void)
}
}

+static void handle_initial_delay(void)
+{
+ struct perf_evsel *counter;
+
+ if (initial_delay) {
+ const int ncpus = cpu_map__nr(evsel_list->cpus),
+ nthreads = thread_map__nr(evsel_list->threads);
+
+ usleep(initial_delay * 1000);
+ list_for_each_entry(counter, &evsel_list->entries, node)
+ perf_evsel__enable(counter, ncpus, nthreads);
+ }
+}
+
static int __run_perf_stat(int argc, const char **argv)
{
char msg[512];
@@ -486,6 +502,7 @@ static int __run_perf_stat(int argc, const char **argv)

if (forks) {
perf_evlist__start_workload(evsel_list);
+ handle_initial_delay();

if (interval) {
while (!waitpid(child_pid, &status, WNOHANG)) {
@@ -497,6 +514,7 @@ static int __run_perf_stat(int argc, const char **argv)
if (WIFSIGNALED(status))
psignal(WTERMSIG(status), argv[0]);
} else {
+ handle_initial_delay();
while (!done) {
nanosleep(&ts, NULL);
if (interval)
@@ -1419,6 +1437,8 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
"aggregate counts per processor socket", AGGR_SOCKET),
OPT_SET_UINT(0, "per-core", &aggr_mode,
"aggregate counts per physical processor core", AGGR_CORE),
+ OPT_UINTEGER('D', "delay", &initial_delay,
+ "ms to wait before starting measurement after program start"),
OPT_END()
};
const char * const stat_usage[] = {

2013-08-12 10:20:42

by tip-bot for Vasyl Gomonovych

Subject: [tip:perf/core] perf stat: Flush output after each line in interval mode

Commit-ID: 2bbf03f16a634f675c49c473b2b6528571990aea
Gitweb: http://git.kernel.org/tip/2bbf03f16a634f675c49c473b2b6528571990aea
Author: Andi Kleen <[email protected]>
AuthorDate: Fri, 2 Aug 2013 17:41:12 -0700
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 7 Aug 2013 17:35:29 -0300

perf stat: Flush output after each line in interval mode

When interval mode is outputting to a pipe, each measurement should be
flushed individually, so that the reader sees it promptly.

With a terminal, stdio flushes each line automatically, but that is
disabled for non-terminal output.

Simply fflush output after each time interval.

Signed-off-by: Andi Kleen <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 2e637e4..f686d5f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -416,6 +416,8 @@ static void print_interval(void)
list_for_each_entry(counter, &evsel_list->entries, node)
print_counter_aggr(counter, prefix);
}
+
+ fflush(output);
}

static void handle_initial_delay(void)