2012-02-20 01:46:58

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 1/2] perf stat: Fix event grouping on forked task

When event group is enabled for forked task (i.e. no target task was
specified) all events were disabled and marked ->enable_on_exec.
However they are not counted at all since only group leader will be
enabled on exec actually. So the result looked like below:

$ ./perf stat --group -- sleep 1

Performance counter stats for 'sleep 1':

0.554926 task-clock # 0.001 CPUs utilized
<not counted> context-switches
<not counted> CPU-migrations
<not counted> page-faults
<not counted> cycles
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
<not counted> instructions
<not counted> branches
<not counted> branch-misses

1.001228093 seconds time elapsed

Fix it by disabling group leader only.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-stat.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e54cbc9fc4fd..cda1e757ea87 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -296,7 +296,7 @@ static int create_perf_stat_counter(struct perf_evsel *evsel,
if (system_wide)
return perf_evsel__open_per_cpu(evsel, evsel_list->cpus,
group, group_fd);
- if (!target_pid && !target_tid) {
+ if (!target_pid && !target_tid && (!group || evsel == first)) {
attr->disabled = 1;
attr->enable_on_exec = 1;
}
--
1.7.9


2012-02-20 01:47:15

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 2/2] perf tools: Do not disable members of group event

When event group is enabled for forked task (i.e. no target task/cpu
was specified) all events were disabled and marked ->enable_on_exec.
However they wouldn't be counted at all since only group leader will
be enabled on exec actually.

In contrast to perf stat, perf record doesn't have a real problem
as it enables all the event before proceeding. But it needs to be
fixed anyway IMHO.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/evlist.c | 6 ++++--
tools/perf/util/evsel.c | 6 ++++--
tools/perf/util/evsel.h | 3 ++-
3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index f8da9fada002..ec4e338ee291 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -51,13 +51,15 @@ struct perf_evlist *perf_evlist__new(struct cpu_map *cpus,
void perf_evlist__config_attrs(struct perf_evlist *evlist,
struct perf_record_opts *opts)
{
- struct perf_evsel *evsel;
+ struct perf_evsel *evsel, *first;

if (evlist->cpus->map[0] < 0)
opts->no_inherit = true;

+ first = list_entry(evlist->entries.next, struct perf_evsel, node);
+
list_for_each_entry(evsel, &evlist->entries, node) {
- perf_evsel__config(evsel, opts);
+ perf_evsel__config(evsel, opts, first);

if (evlist->nr_entries > 1)
evsel->attr.sample_type |= PERF_SAMPLE_ID;
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 302d49a9f985..c92693c3d108 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -63,7 +63,8 @@ struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr, int idx)
return evsel;
}

-void perf_evsel__config(struct perf_evsel *evsel, struct perf_record_opts *opts)
+void perf_evsel__config(struct perf_evsel *evsel, struct perf_record_opts *opts,
+ struct perf_evsel *first)
{
struct perf_event_attr *attr = &evsel->attr;
int track = !evsel->idx; /* only the first counter needs these */
@@ -130,7 +131,8 @@ void perf_evsel__config(struct perf_evsel *evsel, struct perf_record_opts *opts)
attr->mmap = track;
attr->comm = track;

- if (!opts->target_pid && !opts->target_tid && !opts->system_wide) {
+ if (!opts->target_pid && !opts->target_tid && !opts->system_wide &&
+ (!opts->group || evsel == first)) {
attr->disabled = 1;
attr->enable_on_exec = 1;
}
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 326b8e4d5035..3158ca3d69a1 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -80,7 +80,8 @@ void perf_evsel__exit(struct perf_evsel *evsel);
void perf_evsel__delete(struct perf_evsel *evsel);

void perf_evsel__config(struct perf_evsel *evsel,
- struct perf_record_opts *opts);
+ struct perf_record_opts *opts,
+ struct perf_evsel *first);

int perf_evsel__alloc_fd(struct perf_evsel *evsel, int ncpus, int nthreads);
int perf_evsel__alloc_id(struct perf_evsel *evsel, int ncpus, int nthreads);
--
1.7.9