2013-06-27 07:49:16

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 00/15] perf tools: some fixes and tweaks

Hi

Here are some fixes and tweaks to perf tools (version 2).

Changes in V2:
perf tools: fix missing tool parameter
Fixed one extra occurrence
perf tools: fix parse_events_terms() freeing local variable on error path
Made "freeing" code into a new function
perf tools: validate perf event header size
Corrected byte-swapping
perf tools: allow non-matching sample types
Added comments
Fixed id_pos calculation
id_pos/is_pos updated whenever sample_type changes
Removed perf_evlist__sample_type()
Added __perf_evlist__combined_sample_type()
Added perf_evlist__combined_sample_type()
Added perf_evlist__make_sample_types_compatible()
Added ack's to patches acked by Jiri Olsa


Adrian Hunter (15):
perf tools: remove unused parameter
perf tools: fix missing tool parameter
perf tools: fix missing 'finished_round'
perf tools: fix parse_events_terms() segfault on error path
perf tools: fix new_term() missing free on error path
perf tools: fix parse_events_terms() freeing local variable on error path
perf tools: add const specifier to perf_pmu__find name parameter
perf tools: tidy duplicated munmap code
perf tools: validate perf event header size
perf tools: add debug prints
perf tools: fix symbol_conf.nr_events
perf tools: allow non-matching sample types
perf tools: struct thread has a tid not a pid
perf tools: add pid to struct thread
perf tools: fix ppid in thread__fork()

tools/perf/builtin-inject.c | 40 +++++-----
tools/perf/builtin-kmem.c | 2 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-sched.c | 12 +--
tools/perf/builtin-trace.c | 4 +-
tools/perf/ui/browsers/hists.c | 6 +-
tools/perf/util/event.c | 2 +-
tools/perf/util/event.h | 14 ++++
tools/perf/util/evlist.c | 170 +++++++++++++++++++++++++++++++++++------
tools/perf/util/evlist.h | 8 +-
tools/perf/util/evsel.c | 116 +++++++++++++++++++++++++++-
tools/perf/util/evsel.h | 10 +++
tools/perf/util/header.c | 8 +-
tools/perf/util/header.h | 6 +-
tools/perf/util/machine.c | 57 ++++++++++----
tools/perf/util/machine.h | 8 +-
tools/perf/util/parse-events.c | 11 ++-
tools/perf/util/parse-events.h | 1 +
tools/perf/util/pmu.c | 16 ++--
tools/perf/util/pmu.h | 2 +-
tools/perf/util/session.c | 33 +++++---
tools/perf/util/sort.c | 6 +-
tools/perf/util/thread.c | 11 +--
tools/perf/util/thread.h | 5 +-
tools/perf/util/tool.h | 9 +--
25 files changed, 441 insertions(+), 118 deletions(-)

Regards
Adrian


2013-06-27 07:49:28

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 03/15] perf tools: fix missing 'finished_round'

By default, perf inject should "repipe" all events
including 'finished_round'.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-inject.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index c943513..ad1296c 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -404,6 +404,7 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
.attr = perf_event__repipe_attr,
.event_type = perf_event__repipe_event_type_synth,
.tracing_data = perf_event__repipe_op2_synth,
+ .finished_round = perf_event__repipe_op2_synth,
.build_id = perf_event__repipe_op2_synth,
},
.input_name = "-",
--
1.7.11.7

2013-06-27 07:49:34

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 05/15] perf tools: fix new_term() missing free on error path

On the error path, newly allocated 'term' must be freed.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
---
tools/perf/util/parse-events.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index d8dcb8d..995fc25 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1184,6 +1184,7 @@ static int new_term(struct parse_events_term **_term, int type_val,
term->val.str = str;
break;
default:
+ free(term);
return -EINVAL;
}

--
1.7.11.7

2013-06-27 07:49:32

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 04/15] perf tools: fix parse_events_terms() segfault on error path

On the error path, 'data.terms' may not have been
initialised.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
---
tools/perf/util/parse-events.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 6c8bb0f..d8dcb8d 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -860,7 +860,8 @@ int parse_events_terms(struct list_head *terms, const char *str)
return 0;
}

- parse_events__free_terms(data.terms);
+ if (data.terms)
+ parse_events__free_terms(data.terms);
return ret;
}

--
1.7.11.7

2013-06-27 07:49:37

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 06/15] perf tools: fix parse_events_terms() freeing local variable on error path

The list_head is on the stack, so just free the rest of the list.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/parse-events.c | 7 ++++++-
tools/perf/util/parse-events.h | 1 +
tools/perf/util/pmu.c | 2 +-
3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 995fc25..d9cb055 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -1231,12 +1231,17 @@ int parse_events_term__clone(struct parse_events_term **new,
term->val.str, term->val.num);
}

-void parse_events__free_terms(struct list_head *terms)
+void parse_events__free_terms_only(struct list_head *terms)
{
struct parse_events_term *term, *h;

list_for_each_entry_safe(term, h, terms, list)
free(term);
+}
+
+void parse_events__free_terms(struct list_head *terms)
+{
+ parse_events__free_terms_only(terms);

free(terms);
}
diff --git a/tools/perf/util/parse-events.h b/tools/perf/util/parse-events.h
index 8a48593..6cb6f32 100644
--- a/tools/perf/util/parse-events.h
+++ b/tools/perf/util/parse-events.h
@@ -80,6 +80,7 @@ int parse_events_term__sym_hw(struct parse_events_term **term,
char *config, unsigned idx);
int parse_events_term__clone(struct parse_events_term **new,
struct parse_events_term *term);
+void parse_events__free_terms_only(struct list_head *terms);
void parse_events__free_terms(struct list_head *terms);
int parse_events__modifier_event(struct list_head *list, char *str, bool add);
int parse_events__modifier_group(struct list_head *list, char *event_mod);
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 4c6f9c4..5dd9e37 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -194,7 +194,7 @@ static int pmu_alias_terms(struct perf_pmu_alias *alias,
list_for_each_entry(term, &alias->terms, list) {
ret = parse_events_term__clone(&clone, term);
if (ret) {
- parse_events__free_terms(&list);
+ parse_events__free_terms_only(&list);
return ret;
}
list_add_tail(&clone->list, &list);
--
1.7.11.7

2013-06-27 07:49:42

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 07/15] perf tools: add const specifier to perf_pmu__find name parameter

The name parameter is constant, declare it so.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/pmu.c | 14 +++++++-------
tools/perf/util/pmu.h | 2 +-
2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 5dd9e37..7360f9b 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -73,7 +73,7 @@ int perf_pmu__format_parse(char *dir, struct list_head *head)
* located at:
* /sys/bus/event_source/devices/<dev>/format as sysfs group attributes.
*/
-static int pmu_format(char *name, struct list_head *format)
+static int pmu_format(const char *name, struct list_head *format)
{
struct stat st;
char path[PATH_MAX];
@@ -162,7 +162,7 @@ static int pmu_aliases_parse(char *dir, struct list_head *head)
* Reading the pmu event aliases definition, which should be located at:
* /sys/bus/event_source/devices/<dev>/events as sysfs group attributes.
*/
-static int pmu_aliases(char *name, struct list_head *head)
+static int pmu_aliases(const char *name, struct list_head *head)
{
struct stat st;
char path[PATH_MAX];
@@ -208,7 +208,7 @@ static int pmu_alias_terms(struct perf_pmu_alias *alias,
* located at:
* /sys/bus/event_source/devices/<dev>/type as sysfs attribute.
*/
-static int pmu_type(char *name, __u32 *type)
+static int pmu_type(const char *name, __u32 *type)
{
struct stat st;
char path[PATH_MAX];
@@ -266,7 +266,7 @@ static void pmu_read_sysfs(void)
closedir(dir);
}

-static struct cpu_map *pmu_cpumask(char *name)
+static struct cpu_map *pmu_cpumask(const char *name)
{
struct stat st;
char path[PATH_MAX];
@@ -293,7 +293,7 @@ static struct cpu_map *pmu_cpumask(char *name)
return cpus;
}

-static struct perf_pmu *pmu_lookup(char *name)
+static struct perf_pmu *pmu_lookup(const char *name)
{
struct perf_pmu *pmu;
LIST_HEAD(format);
@@ -330,7 +330,7 @@ static struct perf_pmu *pmu_lookup(char *name)
return pmu;
}

-static struct perf_pmu *pmu_find(char *name)
+static struct perf_pmu *pmu_find(const char *name)
{
struct perf_pmu *pmu;

@@ -356,7 +356,7 @@ struct perf_pmu *perf_pmu__scan(struct perf_pmu *pmu)
return NULL;
}

-struct perf_pmu *perf_pmu__find(char *name)
+struct perf_pmu *perf_pmu__find(const char *name)
{
struct perf_pmu *pmu;

diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 32fe55b..d17b565 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -21,7 +21,7 @@ struct perf_pmu {
struct list_head list;
};

-struct perf_pmu *perf_pmu__find(char *name);
+struct perf_pmu *perf_pmu__find(const char *name);
int perf_pmu__config(struct perf_pmu *pmu, struct perf_event_attr *attr,
struct list_head *head_terms);
int perf_pmu__config_terms(struct list_head *formats,
--
1.7.11.7

2013-06-27 07:49:47

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 09/15] perf tools: validate perf event header size

'size' includes the header so must be at least
'sizeof(struct perf_event_header)'. Error out
immediately if that is not the case. Also
don't byte-swap the header until it is actually
"fetched" from the mmap region.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/session.c | 14 ++++++++++----
1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 19ea888..7a12886 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1094,8 +1094,10 @@ more:
perf_event_header__bswap(&event->header);

size = event->header.size;
- if (size == 0)
- size = 8;
+ if (size < sizeof(struct perf_event_header)) {
+ pr_err("bad event header size\n");
+ goto out_err;
+ }

if (size > cur_size) {
void *new = realloc(buf, size);
@@ -1164,8 +1166,12 @@ fetch_mmaped_event(struct perf_session *session,
if (session->header.needs_swap)
perf_event_header__bswap(&event->header);

- if (head + event->header.size > mmap_size)
+ if (head + event->header.size > mmap_size) {
+ /* We're not fetching the event so swap back again */
+ if (session->header.needs_swap)
+ perf_event_header__bswap(&event->header);
return NULL;
+ }

return event;
}
@@ -1245,7 +1251,7 @@ more:

size = event->header.size;

- if (size == 0 ||
+ if (size < sizeof(struct perf_event_header) ||
perf_session__process_event(session, event, tool, file_pos) < 0) {
pr_err("%#" PRIx64 " [%#x]: failed to process type: %d\n",
file_offset + head, event->header.size,
--
1.7.11.7

2013-06-27 07:50:05

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 08/15] perf tools: tidy duplicated munmap code

The same lines of code are used in three places.
Make it a new function '__perf_evlist__munmap()'.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
---
tools/perf/util/evlist.c | 32 ++++++++++++++------------------
1 file changed, 14 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 99b43dd..3c8b569 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -403,16 +403,20 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
return event;
}

+static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
+{
+ if (evlist->mmap[idx].base != NULL) {
+ munmap(evlist->mmap[idx].base, evlist->mmap_len);
+ evlist->mmap[idx].base = NULL;
+ }
+}
+
void perf_evlist__munmap(struct perf_evlist *evlist)
{
int i;

- for (i = 0; i < evlist->nr_mmaps; i++) {
- if (evlist->mmap[i].base != NULL) {
- munmap(evlist->mmap[i].base, evlist->mmap_len);
- evlist->mmap[i].base = NULL;
- }
- }
+ for (i = 0; i < evlist->nr_mmaps; i++)
+ __perf_evlist__munmap(evlist, i);

free(evlist->mmap);
evlist->mmap = NULL;
@@ -477,12 +481,8 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist, int prot, int m
return 0;

out_unmap:
- for (cpu = 0; cpu < nr_cpus; cpu++) {
- if (evlist->mmap[cpu].base != NULL) {
- munmap(evlist->mmap[cpu].base, evlist->mmap_len);
- evlist->mmap[cpu].base = NULL;
- }
- }
+ for (cpu = 0; cpu < nr_cpus; cpu++)
+ __perf_evlist__munmap(evlist, cpu);
return -1;
}

@@ -517,12 +517,8 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist, int prot, in
return 0;

out_unmap:
- for (thread = 0; thread < nr_threads; thread++) {
- if (evlist->mmap[thread].base != NULL) {
- munmap(evlist->mmap[thread].base, evlist->mmap_len);
- evlist->mmap[thread].base = NULL;
- }
- }
+ for (thread = 0; thread < nr_threads; thread++)
+ __perf_evlist__munmap(evlist, thread);
return -1;
}

--
1.7.11.7

2013-06-27 07:50:24

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 13/15] perf tools: struct thread has a tid not a pid

As evident from 'machine__process_fork_event()' and
'machine__process_exit_event()' the 'pid' member of
struct thread is actually the tid.

Rename 'pid' to 'tid' in struct thread accordingly.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-kmem.c | 2 +-
tools/perf/builtin-sched.c | 12 ++++++------
tools/perf/builtin-trace.c | 4 ++--
tools/perf/ui/browsers/hists.c | 6 +++---
tools/perf/util/event.c | 2 +-
tools/perf/util/machine.c | 20 ++++++++++----------
tools/perf/util/machine.h | 4 ++--
tools/perf/util/sort.c | 6 +++---
tools/perf/util/thread.c | 10 +++++-----
tools/perf/util/thread.h | 4 ++--
10 files changed, 35 insertions(+), 35 deletions(-)

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 46878da..34647fa 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -313,7 +313,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
return -1;
}

- dump_printf(" ... thread: %s:%d\n", thread->comm, thread->pid);
+ dump_printf(" ... thread: %s:%d\n", thread->comm, thread->tid);

if (evsel->handler.func != NULL) {
tracepoint_handler f = evsel->handler.func;
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 2da2a6c..b1715f4 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -1075,7 +1075,7 @@ static int latency_migrate_task_event(struct perf_sched *sched,
if (!atoms) {
if (thread_atoms_insert(sched, migrant))
return -1;
- register_pid(sched, migrant->pid, migrant->comm);
+ register_pid(sched, migrant->tid, migrant->comm);
atoms = thread_atoms_search(&sched->atom_root, migrant, &sched->cmp_pid);
if (!atoms) {
pr_err("migration-event: Internal tree error");
@@ -1115,7 +1115,7 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_
sched->all_runtime += work_list->total_runtime;
sched->all_count += work_list->nb_atoms;

- ret = printf(" %s:%d ", work_list->thread->comm, work_list->thread->pid);
+ ret = printf(" %s:%d ", work_list->thread->comm, work_list->thread->tid);

for (i = 0; i < 24 - ret; i++)
printf(" ");
@@ -1131,9 +1131,9 @@ static void output_lat_thread(struct perf_sched *sched, struct work_atoms *work_

static int pid_cmp(struct work_atoms *l, struct work_atoms *r)
{
- if (l->thread->pid < r->thread->pid)
+ if (l->thread->tid < r->thread->tid)
return -1;
- if (l->thread->pid > r->thread->pid)
+ if (l->thread->tid > r->thread->tid)
return 1;

return 0;
@@ -1321,7 +1321,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel,
printf("*");

if (sched->curr_thread[cpu]) {
- if (sched->curr_thread[cpu]->pid)
+ if (sched->curr_thread[cpu]->tid)
printf("%2s ", sched->curr_thread[cpu]->shortname);
else
printf(". ");
@@ -1332,7 +1332,7 @@ static int map_switch_event(struct perf_sched *sched, struct perf_evsel *evsel,
printf(" %12.6f secs ", (double)timestamp/1e9);
if (new_shortname) {
printf("%s => %s:%d\n",
- sched_in->shortname, sched_in->comm, sched_in->pid);
+ sched_in->shortname, sched_in->comm, sched_in->tid);
} else {
printf("\n");
}
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index ab3ed4a..6a36a95 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -142,7 +142,7 @@ static size_t trace__fprintf_entry_head(struct trace *trace, struct thread *thre
printed += fprintf_duration(duration, fp);

if (trace->multiple_threads)
- printed += fprintf(fp, "%d ", thread->pid);
+ printed += fprintf(fp, "%d ", thread->tid);

return printed;
}
@@ -593,7 +593,7 @@ static size_t trace__fprintf_thread_summary(struct trace *trace, FILE *fp)
color = PERF_COLOR_YELLOW;

printed += color_fprintf(fp, color, "%20s", thread->comm);
- printed += fprintf(fp, " - %-5d :%11lu [", thread->pid, ttrace->nr_events);
+ printed += fprintf(fp, " - %-5d :%11lu [", thread->tid, ttrace->nr_events);
printed += color_fprintf(fp, color, "%5.1f%%", ratio);
printed += fprintf(fp, " ] %10.3f ms\n", ttrace->runtime_ms);
}
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index fc0bd38..06e892f 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1256,7 +1256,7 @@ static int hists__browser_title(struct hists *hists, char *bf, size_t size,
printed += scnprintf(bf + printed, size - printed,
", Thread: %s(%d)",
(thread->comm_set ? thread->comm : ""),
- thread->pid);
+ thread->tid);
if (dso)
printed += scnprintf(bf + printed, size - printed,
", DSO: %s", dso->short_name);
@@ -1579,7 +1579,7 @@ static int perf_evsel__hists_browse(struct perf_evsel *evsel, int nr_events,
asprintf(&options[nr_options], "Zoom %s %s(%d) thread",
(browser->hists->thread_filter ? "out of" : "into"),
(thread->comm_set ? thread->comm : ""),
- thread->pid) > 0)
+ thread->tid) > 0)
zoom_thread = nr_options++;

if (dso != NULL &&
@@ -1702,7 +1702,7 @@ zoom_out_thread:
} else {
ui_helpline__fpush("To zoom out press <- or -> + \"Zoom out of %s(%d) thread\"",
thread->comm_set ? thread->comm : "",
- thread->pid);
+ thread->tid);
browser->hists->thread_filter = thread;
sort_thread.elide = true;
pstack__push(fstack, &browser->hists->thread_filter);
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 5cd13d7..9541270 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -686,7 +686,7 @@ int perf_event__preprocess_sample(const union perf_event *event,
!strlist__has_entry(symbol_conf.comm_list, thread->comm))
goto out_filtered;

- dump_printf(" ... thread: %s:%d\n", thread->comm, thread->pid);
+ dump_printf(" ... thread: %s:%d\n", thread->comm, thread->tid);
/*
* Have we already created the kernel maps for this machine?
*
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index b2ecad6..ef0be97 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -233,7 +233,7 @@ void machines__set_id_hdr_size(struct machines *machines, u16 id_hdr_size)
return;
}

-static struct thread *__machine__findnew_thread(struct machine *machine, pid_t pid,
+static struct thread *__machine__findnew_thread(struct machine *machine, pid_t tid,
bool create)
{
struct rb_node **p = &machine->threads.rb_node;
@@ -241,23 +241,23 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t p
struct thread *th;

/*
- * Font-end cache - PID lookups come in blocks,
+ * Front-end cache - TID lookups come in blocks,
* so most of the time we dont have to look up
* the full rbtree:
*/
- if (machine->last_match && machine->last_match->pid == pid)
+ if (machine->last_match && machine->last_match->tid == tid)
return machine->last_match;

while (*p != NULL) {
parent = *p;
th = rb_entry(parent, struct thread, rb_node);

- if (th->pid == pid) {
+ if (th->tid == tid) {
machine->last_match = th;
return th;
}

- if (pid < th->pid)
+ if (tid < th->tid)
p = &(*p)->rb_left;
else
p = &(*p)->rb_right;
@@ -266,7 +266,7 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t p
if (!create)
return NULL;

- th = thread__new(pid);
+ th = thread__new(tid);
if (th != NULL) {
rb_link_node(&th->rb_node, parent, p);
rb_insert_color(&th->rb_node, &machine->threads);
@@ -276,14 +276,14 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t p
return th;
}

-struct thread *machine__findnew_thread(struct machine *machine, pid_t pid)
+struct thread *machine__findnew_thread(struct machine *machine, pid_t tid)
{
- return __machine__findnew_thread(machine, pid, true);
+ return __machine__findnew_thread(machine, tid, true);
}

-struct thread *machine__find_thread(struct machine *machine, pid_t pid)
+struct thread *machine__find_thread(struct machine *machine, pid_t tid)
{
- return __machine__findnew_thread(machine, pid, false);
+ return __machine__findnew_thread(machine, tid, false);
}

int machine__process_comm_event(struct machine *machine, union perf_event *event)
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 7794068..e49ba01 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -36,7 +36,7 @@ struct map *machine__kernel_map(struct machine *machine, enum map_type type)
return machine->vmlinux_maps[type];
}

-struct thread *machine__find_thread(struct machine *machine, pid_t pid);
+struct thread *machine__find_thread(struct machine *machine, pid_t tid);

int machine__process_comm_event(struct machine *machine, union perf_event *event);
int machine__process_exit_event(struct machine *machine, union perf_event *event);
@@ -99,7 +99,7 @@ static inline bool machine__is_host(struct machine *machine)
return machine ? machine->pid == HOST_KERNEL_ID : false;
}

-struct thread *machine__findnew_thread(struct machine *machine, pid_t pid);
+struct thread *machine__findnew_thread(struct machine *machine, pid_t tid);

size_t machine__fprintf(struct machine *machine, FILE *fp);

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 313a5a7..8deee19 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -55,14 +55,14 @@ static int64_t cmp_null(void *l, void *r)
static int64_t
sort__thread_cmp(struct hist_entry *left, struct hist_entry *right)
{
- return right->thread->pid - left->thread->pid;
+ return right->thread->tid - left->thread->tid;
}

static int hist_entry__thread_snprintf(struct hist_entry *self, char *bf,
size_t size, unsigned int width)
{
return repsep_snprintf(bf, size, "%*s:%5d", width - 6,
- self->thread->comm ?: "", self->thread->pid);
+ self->thread->comm ?: "", self->thread->tid);
}

struct sort_entry sort_thread = {
@@ -77,7 +77,7 @@ struct sort_entry sort_thread = {
static int64_t
sort__comm_cmp(struct hist_entry *left, struct hist_entry *right)
{
- return right->thread->pid - left->thread->pid;
+ return right->thread->tid - left->thread->tid;
}

static int64_t
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 40399cb..6feeb88 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -7,17 +7,17 @@
#include "util.h"
#include "debug.h"

-struct thread *thread__new(pid_t pid)
+struct thread *thread__new(pid_t tid)
{
struct thread *self = zalloc(sizeof(*self));

if (self != NULL) {
map_groups__init(&self->mg);
- self->pid = pid;
+ self->tid = tid;
self->ppid = -1;
self->comm = malloc(32);
if (self->comm)
- snprintf(self->comm, 32, ":%d", self->pid);
+ snprintf(self->comm, 32, ":%d", self->tid);
}

return self;
@@ -57,7 +57,7 @@ int thread__comm_len(struct thread *self)

size_t thread__fprintf(struct thread *thread, FILE *fp)
{
- return fprintf(fp, "Thread %d %s\n", thread->pid, thread->comm) +
+ return fprintf(fp, "Thread %d %s\n", thread->tid, thread->comm) +
map_groups__fprintf(&thread->mg, verbose, fp);
}

@@ -84,7 +84,7 @@ int thread__fork(struct thread *self, struct thread *parent)
if (map_groups__clone(&self->mg, &parent->mg, i) < 0)
return -ENOMEM;

- self->ppid = parent->pid;
+ self->ppid = parent->tid;

return 0;
}
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index eeb7ac6..37a86a3 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -12,7 +12,7 @@ struct thread {
struct list_head node;
};
struct map_groups mg;
- pid_t pid;
+ pid_t tid;
pid_t ppid;
char shortname[3];
bool comm_set;
@@ -24,7 +24,7 @@ struct thread {

struct machine;

-struct thread *thread__new(pid_t pid);
+struct thread *thread__new(pid_t tid);
void thread__delete(struct thread *self);

int thread__set_comm(struct thread *self, const char *comm);
--
1.7.11.7

2013-06-27 07:51:01

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 15/15] perf tools: fix ppid in thread__fork()

ppid should be assigned to the parents pid. Note
'thread__fork()'s only caller 'machine__process_fork_event()'
ensures that the parents pid is set.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/thread.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index e3d4a55..93f3eab 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -85,7 +85,7 @@ int thread__fork(struct thread *self, struct thread *parent)
if (map_groups__clone(&self->mg, &parent->mg, i) < 0)
return -ENOMEM;

- self->ppid = parent->tid;
+ self->ppid = parent->pid_;

return 0;
}
--
1.7.11.7

2013-06-27 07:50:20

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 12/15] perf tools: allow non-matching sample types

Sample types need not be identical to determine
the sample id from the event. Only the position
of the sample id needs to be the same.

Compatible sample types are ones in which the bits
defined by PERF_COMPAT_MASK are the same.
'perf_evlist__config()' forces sample types to be
compatible on that basis.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-report.c | 2 +-
tools/perf/util/event.h | 14 +++++
tools/perf/util/evlist.c | 135 ++++++++++++++++++++++++++++++++++++++++++--
tools/perf/util/evlist.h | 8 ++-
tools/perf/util/evsel.c | 64 ++++++++++++++++++++-
tools/perf/util/evsel.h | 10 ++++
tools/perf/util/session.c | 8 ++-
7 files changed, 230 insertions(+), 11 deletions(-)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ca98d34..58572ac 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -367,7 +367,7 @@ static int process_read_event(struct perf_tool *tool,
static int perf_report__setup_sample_type(struct perf_report *rep)
{
struct perf_session *self = rep->session;
- u64 sample_type = perf_evlist__sample_type(self->evlist);
+ u64 sample_type = perf_evlist__combined_sample_type(self->evlist);

if (!self->fd_pipe && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
if (sort__has_parent) {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 1813895..3aef78c 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -65,6 +65,20 @@ struct read_event {
PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)

+/*
+ * Events have compatible sample types if the following bits all have the same
+ * value. This is because the order of sample members is fixed. For sample
+ * events the order is: PERF_SAMPLE_IP, PERF_SAMPLE_TID, PERF_SAMPLE_TIME,
+ * PERF_SAMPLE_ADDR, PERF_SAMPLE_ID. For non-sample events the sample members
+ * are accessed in reverse order. The order is: PERF_SAMPLE_ID,
+ * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU.
+ */
+#define PERF_COMPAT_MASK \
+ (PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
+ PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
+ PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
+ PERF_SAMPLE_CPU)
+
struct sample_event {
struct perf_event_header header;
u64 array[];
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index a660f56..78331da 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -49,6 +49,45 @@ struct perf_evlist *perf_evlist__new(void)
return evlist;
}

+/**
+ * perf_evlist__set_id_pos - set the positions of event ids.
+ * @evlist: selected event list
+ *
+ * Events with compatible sample types all have the same id_pos
+ * and is_pos. For convenience, put a copy on evlist.
+ */
+static void perf_evlist__set_id_pos(struct perf_evlist *evlist)
+{
+ struct perf_evsel *first = perf_evlist__first(evlist);
+
+ evlist->id_pos = first->id_pos;
+ evlist->is_pos = first->is_pos;
+}
+
+/**
+ * perf_evlist__make_sample_types_compatible - make sample types compatible.
+ * @evlist: selected event list
+ *
+ * Events with compatible sample types all have the same id_pos and is_pos.
+ * This can be achieved by matching the bits of PERF_COMPAT_MASK.
+ */
+void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist)
+{
+ struct perf_evsel *evsel;
+ u64 compat = 0;
+
+ list_for_each_entry(evsel, &evlist->entries, node)
+ compat |= evsel->attr.sample_type & PERF_COMPAT_MASK;
+
+ list_for_each_entry(evsel, &evlist->entries, node) {
+ evsel->attr.sample_type |= compat;
+ evsel->sample_size = __perf_evsel__sample_size(evsel->attr.sample_type);
+ perf_evsel__calc_id_pos(evsel);
+ }
+
+ perf_evlist__set_id_pos(evlist);
+}
+
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts)
{
@@ -69,6 +108,8 @@ void perf_evlist__config(struct perf_evlist *evlist,
if (evlist->nr_entries > 1)
perf_evsel__set_sample_id(evsel);
}
+
+ perf_evlist__make_sample_types_compatible(evlist);
}

static void perf_evlist__purge(struct perf_evlist *evlist)
@@ -102,6 +143,7 @@ void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry)
{
list_add_tail(&entry->node, &evlist->entries);
++evlist->nr_entries;
+ perf_evlist__set_id_pos(evlist);
}

void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
@@ -110,6 +152,7 @@ void perf_evlist__splice_list_tail(struct perf_evlist *evlist,
{
list_splice_tail(list, &evlist->entries);
evlist->nr_entries += nr_entries;
+ perf_evlist__set_id_pos(evlist);
}

void __perf_evlist__set_leader(struct list_head *list)
@@ -339,6 +382,55 @@ struct perf_evsel *perf_evlist__id2evsel(struct perf_evlist *evlist, u64 id)
return NULL;
}

+static int perf_evlist__event2id(struct perf_evlist *evlist,
+ union perf_event *event, u64 *id)
+{
+ const u64 *array = event->sample.array;
+ ssize_t n;
+
+ n = (event->header.size - sizeof(event->header)) >> 3;
+
+ if (event->header.type == PERF_RECORD_SAMPLE) {
+ if (evlist->id_pos >= n)
+ return -1;
+ *id = array[evlist->id_pos];
+ } else {
+ if (evlist->is_pos >= n)
+ return -1;
+ n -= evlist->is_pos;
+ *id = array[n];
+ }
+ return 0;
+}
+
+static struct perf_evsel *perf_evlist__event2evsel(struct perf_evlist *evlist,
+ union perf_event *event)
+{
+ struct hlist_head *head;
+ struct perf_sample_id *sid;
+ int hash;
+ u64 id;
+
+ if (evlist->nr_entries == 1 || evlist->matching_sample_types)
+ return perf_evlist__first(evlist);
+
+ if (perf_evlist__event2id(evlist, event, &id))
+ return NULL;
+
+ /* Synthesized events have an id of zero */
+ if (!id)
+ return perf_evlist__first(evlist);
+
+ hash = hash_64(id, PERF_EVLIST__HLIST_BITS);
+ head = &evlist->heads[hash];
+
+ hlist_for_each_entry(sid, head, node) {
+ if (sid->id == id)
+ return sid->evsel;
+ }
+ return NULL;
+}
+
union perf_event *perf_evlist__mmap_read(struct perf_evlist *evlist, int idx)
{
struct perf_mmap *md = &evlist->mmap[idx];
@@ -650,19 +742,49 @@ int perf_evlist__set_filter(struct perf_evlist *evlist, const char *filter)
bool perf_evlist__valid_sample_type(struct perf_evlist *evlist)
{
struct perf_evsel *first = perf_evlist__first(evlist), *pos = first;
+ bool ok = true;

list_for_each_entry_continue(pos, &evlist->entries, node) {
- if (first->attr.sample_type != pos->attr.sample_type)
+ if (first->attr.sample_type != pos->attr.sample_type) {
+ ok = false;
+ break;
+ }
+ }
+
+ if (ok) {
+ evlist->matching_sample_types = true;
+ return true;
+ }
+
+ if (evlist->id_pos < 0 || evlist->is_pos < 0)
+ return false;
+
+ list_for_each_entry(pos, &evlist->entries, node) {
+ if (pos->id_pos != evlist->id_pos ||
+ pos->is_pos != evlist->is_pos)
return false;
}

return true;
}

-u64 perf_evlist__sample_type(struct perf_evlist *evlist)
+u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist)
{
- struct perf_evsel *first = perf_evlist__first(evlist);
- return first->attr.sample_type;
+ struct perf_evsel *evsel;
+
+ if (evlist->combined_sample_type)
+ return evlist->combined_sample_type;
+
+ list_for_each_entry(evsel, &evlist->entries, node)
+ evlist->combined_sample_type |= evsel->attr.sample_type;
+
+ return evlist->combined_sample_type;
+}
+
+u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist)
+{
+ evlist->combined_sample_type = 0;
+ return __perf_evlist__combined_sample_type(evlist);
}

u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist)
@@ -848,7 +970,10 @@ int perf_evlist__start_workload(struct perf_evlist *evlist)
int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
struct perf_sample *sample)
{
- struct perf_evsel *evsel = perf_evlist__first(evlist);
+ struct perf_evsel *evsel = perf_evlist__event2evsel(evlist, event);
+
+ if (!evsel)
+ return -EFAULT;
return perf_evsel__parse_sample(evsel, event, sample);
}

diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 0583d36..b1be475 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -32,11 +32,15 @@ struct perf_evlist {
int nr_fds;
int nr_mmaps;
int mmap_len;
+ int id_pos;
+ int is_pos;
+ u64 combined_sample_type;
struct {
int cork_fd;
pid_t pid;
} workload;
bool overwrite;
+ bool matching_sample_types;
struct perf_mmap *mmap;
struct pollfd *pollfd;
struct thread_map *threads;
@@ -83,6 +87,7 @@ union perf_event *perf_evlist__mmap_read(struct perf_evlist *self, int idx);
int perf_evlist__open(struct perf_evlist *evlist);
void perf_evlist__close(struct perf_evlist *evlist);

+void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist);
void perf_evlist__config(struct perf_evlist *evlist,
struct perf_record_opts *opts);

@@ -118,7 +123,8 @@ int perf_evlist__apply_filters(struct perf_evlist *evlist);
void __perf_evlist__set_leader(struct list_head *list);
void perf_evlist__set_leader(struct perf_evlist *evlist);

-u64 perf_evlist__sample_type(struct perf_evlist *evlist);
+u64 __perf_evlist__combined_sample_type(struct perf_evlist *evlist);
+u64 perf_evlist__combined_sample_type(struct perf_evlist *evlist);
bool perf_evlist__sample_id_all(struct perf_evlist *evlist);
u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist);

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d01d2cd..04a67ae 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -30,7 +30,7 @@ static struct {

#define FD(e, x, y) (*(int *)xyarray__entry(e->fd, x, y))

-static int __perf_evsel__sample_size(u64 sample_type)
+int __perf_evsel__sample_size(u64 sample_type)
{
u64 mask = sample_type & PERF_SAMPLE_MASK;
int size = 0;
@@ -46,6 +46,65 @@ static int __perf_evsel__sample_size(u64 sample_type)
return size;
}

+/**
+ * __perf_evsel__calc_id_pos - calculate id_pos.
+ * @sample_type: sample type
+ *
+ * This function returns the position of the event id (PERF_SAMPLE_ID) in a
+ * sample event i.e. in the array of struct sample_event.
+ */
+static int __perf_evsel__calc_id_pos(u64 sample_type)
+{
+ int idx = 0;
+
+ if (!(sample_type & PERF_SAMPLE_ID))
+ return -1;
+
+ if (sample_type & PERF_SAMPLE_IP)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_TID)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_TIME)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_ADDR)
+ idx += 1;
+
+ return idx;
+}
+
+/**
+ * __perf_evsel__calc_is_pos - calculate is_pos.
+ * @sample_type: sample type
+ *
+ * This function returns the position (counting backwards) of the event id
+ * (PERF_SAMPLE_ID) in a non-sample event i.e. if sample_id_all is used there is
+ * an id sample appended to non-sample events.
+ */
+static int __perf_evsel__calc_is_pos(u64 sample_type)
+{
+ int idx = 1;
+
+ if (!(sample_type & PERF_SAMPLE_ID))
+ return -1;
+
+ if (sample_type & PERF_SAMPLE_CPU)
+ idx += 1;
+
+ if (sample_type & PERF_SAMPLE_STREAM_ID)
+ idx += 1;
+
+ return idx;
+}
+
+void perf_evsel__calc_id_pos(struct perf_evsel *evsel)
+{
+ evsel->id_pos = __perf_evsel__calc_id_pos(evsel->attr.sample_type);
+ evsel->is_pos = __perf_evsel__calc_is_pos(evsel->attr.sample_type);
+}
+
void hists__init(struct hists *hists)
{
memset(hists, 0, sizeof(*hists));
@@ -62,6 +121,7 @@ void __perf_evsel__set_sample_bit(struct perf_evsel *evsel,
if (!(evsel->attr.sample_type & bit)) {
evsel->attr.sample_type |= bit;
evsel->sample_size += sizeof(u64);
+ perf_evsel__calc_id_pos(evsel);
}
}

@@ -71,6 +131,7 @@ void __perf_evsel__reset_sample_bit(struct perf_evsel *evsel,
if (evsel->attr.sample_type & bit) {
evsel->attr.sample_type &= ~bit;
evsel->sample_size -= sizeof(u64);
+ perf_evsel__calc_id_pos(evsel);
}
}

@@ -89,6 +150,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
INIT_LIST_HEAD(&evsel->node);
hists__init(&evsel->hists);
evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
+ perf_evsel__calc_id_pos(evsel);
}

struct perf_evsel *perf_evsel__new(struct perf_event_attr *attr, int idx)
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 3f156cc..c6d616c 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -45,6 +45,11 @@ struct perf_sample_id {
* @name - Can be set to retain the original event name passed by the user,
* so that when showing results in tools such as 'perf stat', we
* show the name used, not some alias.
+ * @id_pos: the position of the event id (PERF_SAMPLE_ID) in a sample event
+ * i.e. in the array of struct sample_event
+ * @is_pos: the position (counting backwards) of the event id (PERF_SAMPLE_ID)
+ * in a non-sample event i.e. if sample_id_all is used there is an id
+ * sample appended to non-sample events
*/
struct perf_evsel {
struct list_head node;
@@ -71,6 +76,8 @@ struct perf_evsel {
} handler;
struct cpu_map *cpus;
unsigned int sample_size;
+ int id_pos;
+ int is_pos;
bool supported;
bool needs_swap;
/* parse modifier helper */
@@ -100,6 +107,9 @@ void perf_evsel__delete(struct perf_evsel *evsel);
void perf_evsel__config(struct perf_evsel *evsel,
struct perf_record_opts *opts);

+int __perf_evsel__sample_size(u64 sample_type);
+void perf_evsel__calc_id_pos(struct perf_evsel *evsel);
+
bool perf_evsel__is_cache_op_valid(u8 type, u8 op);

#define PERF_EVSEL__MAX_ALIASES 8
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 7a12886..e0cc110 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -743,7 +743,7 @@ static void perf_session__print_tstamp(struct perf_session *session,
union perf_event *event,
struct perf_sample *sample)
{
- u64 sample_type = perf_evlist__sample_type(session->evlist);
+ u64 sample_type = __perf_evlist__combined_sample_type(session->evlist);

if (event->header.type != PERF_RECORD_SAMPLE &&
!perf_evlist__sample_id_all(session->evlist)) {
@@ -902,7 +902,7 @@ static int perf_session__preprocess_sample(struct perf_session *session,
union perf_event *event, struct perf_sample *sample)
{
if (event->header.type != PERF_RECORD_SAMPLE ||
- !(perf_evlist__sample_type(session->evlist) & PERF_SAMPLE_CALLCHAIN))
+ !sample->callchain)
return 0;

if (!ip_callchain__valid(sample->callchain, event)) {
@@ -1304,7 +1304,9 @@ int perf_session__process_events(struct perf_session *self,

bool perf_session__has_traces(struct perf_session *session, const char *msg)
{
- if (!(perf_evlist__sample_type(session->evlist) & PERF_SAMPLE_RAW)) {
+ u64 sample_type = perf_evlist__combined_sample_type(session->evlist);
+
+ if (!(sample_type & PERF_SAMPLE_RAW)) {
pr_err("No trace sample to read. Did you call 'perf %s'?\n", msg);
return false;
}
--
1.7.11.7

2013-06-27 07:51:21

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 14/15] perf tools: add pid to struct thread

Record pid on struct thread. The member is named 'pid_'
to avoid confusion with the 'tid' member which was previously
named 'pid'.

Note that while "machine" functions update 'pid_', most tools
do not.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/machine.c | 47 ++++++++++++++++++++++++++++++++++++++---------
tools/perf/util/machine.h | 4 ++++
tools/perf/util/thread.c | 3 ++-
tools/perf/util/thread.h | 3 ++-
4 files changed, 46 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index ef0be97..d11ca3b 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -233,7 +233,8 @@ void machines__set_id_hdr_size(struct machines *machines, u16 id_hdr_size)
return;
}

-static struct thread *__machine__findnew_thread(struct machine *machine, pid_t tid,
+static struct thread *__machine__findnew_thread(struct machine *machine,
+ pid_t pid, pid_t tid,
bool create)
{
struct rb_node **p = &machine->threads.rb_node;
@@ -245,8 +246,11 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
* so most of the time we dont have to look up
* the full rbtree:
*/
- if (machine->last_match && machine->last_match->tid == tid)
+ if (machine->last_match && machine->last_match->tid == tid) {
+ if (pid && !machine->last_match->pid_)
+ machine->last_match->pid_ = pid;
return machine->last_match;
+ }

while (*p != NULL) {
parent = *p;
@@ -254,6 +258,8 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t

if (th->tid == tid) {
machine->last_match = th;
+ if (pid && !th->pid_)
+ th->pid_ = pid;
return th;
}

@@ -266,7 +272,7 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
if (!create)
return NULL;

- th = thread__new(tid);
+ th = thread__new(pid, tid);
if (th != NULL) {
rb_link_node(&th->rb_node, parent, p);
rb_insert_color(&th->rb_node, &machine->threads);
@@ -278,17 +284,25 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t

struct thread *machine__findnew_thread(struct machine *machine, pid_t tid)
{
- return __machine__findnew_thread(machine, tid, true);
+ return __machine__findnew_thread(machine, 0, tid, true);
+}
+
+struct thread *machine__findnew_thread_ex(struct machine *machine, pid_t pid,
+ pid_t tid)
+{
+ return __machine__findnew_thread(machine, pid, tid, true);
}

struct thread *machine__find_thread(struct machine *machine, pid_t tid)
{
- return __machine__findnew_thread(machine, tid, false);
+ return __machine__findnew_thread(machine, 0, tid, false);
}

int machine__process_comm_event(struct machine *machine, union perf_event *event)
{
- struct thread *thread = machine__findnew_thread(machine, event->comm.tid);
+ struct thread *thread = machine__findnew_thread_ex(machine,
+ event->comm.pid,
+ event->comm.tid);

if (dump_trace)
perf_event__fprintf_comm(event, stdout);
@@ -969,7 +983,8 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
return 0;
}

- thread = machine__findnew_thread(machine, event->mmap.pid);
+ thread = machine__findnew_thread_ex(machine, event->mmap.pid,
+ event->mmap.pid);
if (thread == NULL)
goto out_problem;

@@ -996,8 +1011,12 @@ out_problem:

int machine__process_fork_event(struct machine *machine, union perf_event *event)
{
- struct thread *thread = machine__findnew_thread(machine, event->fork.tid);
- struct thread *parent = machine__findnew_thread(machine, event->fork.ptid);
+ struct thread *thread = machine__findnew_thread_ex(machine,
+ event->fork.pid,
+ event->fork.tid);
+ struct thread *parent = machine__findnew_thread_ex(machine,
+ event->fork.ppid,
+ event->fork.ptid);

if (dump_trace)
perf_event__fprintf_task(event, stdout);
@@ -1264,3 +1283,13 @@ int machine__resolve_callchain(struct machine *machine,
sample);

}
+
+pid_t machine__get_thread_pid(struct machine *machine, pid_t tid)
+{
+ struct thread *thread = machine__find_thread(machine, tid);
+
+ if (!thread)
+ return 0;
+
+ return thread->pid_;
+}
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index e49ba01..11a53d5 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -100,6 +100,8 @@ static inline bool machine__is_host(struct machine *machine)
}

struct thread *machine__findnew_thread(struct machine *machine, pid_t tid);
+struct thread *machine__findnew_thread_ex(struct machine *machine, pid_t pid,
+ pid_t tid);

size_t machine__fprintf(struct machine *machine, FILE *fp);

@@ -156,4 +158,6 @@ void machines__destroy_kernel_maps(struct machines *machines);

size_t machine__fprintf_vmlinux_path(struct machine *machine, FILE *fp);

+pid_t machine__get_thread_pid(struct machine *machine, pid_t tid);
+
#endif /* __PERF_MACHINE_H */
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 6feeb88..e3d4a55 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -7,12 +7,13 @@
#include "util.h"
#include "debug.h"

-struct thread *thread__new(pid_t tid)
+struct thread *thread__new(pid_t pid, pid_t tid)
{
struct thread *self = zalloc(sizeof(*self));

if (self != NULL) {
map_groups__init(&self->mg);
+ self->pid_ = pid;
self->tid = tid;
self->ppid = -1;
self->comm = malloc(32);
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 37a86a3..bfa68e7 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -12,6 +12,7 @@ struct thread {
struct list_head node;
};
struct map_groups mg;
+ pid_t pid_; /* Not all tools update this */
pid_t tid;
pid_t ppid;
char shortname[3];
@@ -24,7 +25,7 @@ struct thread {

struct machine;

-struct thread *thread__new(pid_t tid);
+struct thread *thread__new(pid_t pid, pid_t tid);
void thread__delete(struct thread *self);

int thread__set_comm(struct thread *self, const char *comm);
--
1.7.11.7

2013-06-27 07:50:16

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 10/15] perf tools: add debug prints

It is useful to see the arguments to perf_event_open
and whether the perf events ring buffer was mmapped
per-cpu or per-thread. That information will now be
displayed when verbose is 2 i.e option -vv

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
---
tools/perf/util/evlist.c | 3 +++
tools/perf/util/evsel.c | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 55 insertions(+)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 3c8b569..a660f56 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -14,6 +14,7 @@
#include "target.h"
#include "evlist.h"
#include "evsel.h"
+#include "debug.h"
#include <unistd.h>

#include "parse-events.h"
@@ -454,6 +455,7 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist, int prot, int m
int nr_cpus = cpu_map__nr(evlist->cpus);
int nr_threads = thread_map__nr(evlist->threads);

+ pr_debug2("perf event ring buffer mmapped per cpu\n");
for (cpu = 0; cpu < nr_cpus; cpu++) {
int output = -1;

@@ -492,6 +494,7 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist, int prot, in
int thread;
int nr_threads = thread_map__nr(evlist->threads);

+ pr_debug2("perf event ring buffer mmapped per thread\n");
for (thread = 0; thread < nr_threads; thread++) {
int output = -1;

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 63b6f8c..d01d2cd 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -21,6 +21,7 @@
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include "perf_regs.h"
+#include "debug.h"

static struct {
bool sample_id_all;
@@ -817,6 +818,54 @@ static int get_group_fd(struct perf_evsel *evsel, int cpu, int thread)
return fd;
}

+static void show_attr(struct perf_event_attr *attr)
+{
+ pr_debug2("------------------------------------------------------------\n");
+ pr_debug2("perf_event_attr:\n");
+ pr_debug2(" type %u\n", attr->type);
+ pr_debug2(" size %u\n", attr->size);
+ pr_debug2(" config %#"PRIx64"\n", (uint64_t)attr->config);
+ pr_debug2(" sample_period %"PRIu64"\n", (uint64_t)attr->sample_period);
+ pr_debug2(" sample_freq %"PRIu64"\n", (uint64_t)attr->sample_freq);
+ pr_debug2(" sample_type %#"PRIx64"\n", (uint64_t)attr->sample_type);
+ pr_debug2(" read_format %#"PRIx64"\n", (uint64_t)attr->read_format);
+
+ pr_debug2(" disabled %u ", attr->disabled);
+ pr_debug2("inherit %u\n", attr->inherit);
+ pr_debug2(" pinned %u ", attr->pinned);
+ pr_debug2("exclusive %u\n", attr->exclusive);
+ pr_debug2(" exclude_user %u ", attr->exclude_user);
+ pr_debug2("exclude_kernel %u\n", attr->exclude_kernel);
+ pr_debug2(" exclude_hv %u ", attr->exclude_hv);
+ pr_debug2("exclude_idle %u\n", attr->exclude_idle);
+ pr_debug2(" mmap %u ", attr->mmap);
+ pr_debug2("comm %u\n", attr->comm);
+ pr_debug2(" freq %u ", attr->freq);
+ pr_debug2("inherit_stat %u\n", attr->inherit_stat);
+ pr_debug2(" enable_on_exec %u ", attr->enable_on_exec);
+ pr_debug2("task %u\n", attr->task);
+ pr_debug2(" watermark %u ", attr->watermark);
+ pr_debug2("precise_ip %u\n", attr->precise_ip);
+ pr_debug2(" mmap_data %u ", attr->mmap_data);
+ pr_debug2("sample_id_all %u\n", attr->sample_id_all);
+ pr_debug2(" exclude_host %u ", attr->exclude_host);
+ pr_debug2("exclude_guest %u\n", attr->exclude_guest);
+ pr_debug2(" excl.callchain.kern %u ", attr->exclude_callchain_kernel);
+ pr_debug2("excl.callchain.user %u\n", attr->exclude_callchain_user);
+
+ pr_debug2(" wakeup_events %u\n", attr->wakeup_events);
+ pr_debug2(" wakeup_watermark %u\n", attr->wakeup_watermark);
+ pr_debug2(" bp_type %#x\n", attr->bp_type);
+ pr_debug2(" bp_addr %#"PRIx64"\n", (uint64_t)attr->bp_addr);
+ pr_debug2(" config1 %#"PRIx64"\n", (uint64_t)attr->config1);
+ pr_debug2(" bp_len %"PRIu64"\n", (uint64_t)attr->bp_len);
+ pr_debug2(" config2 %#"PRIx64"\n", (uint64_t)attr->config2);
+ pr_debug2(" branch_sample_type %#"PRIx64"\n", (uint64_t)attr->branch_sample_type);
+ pr_debug2(" sample_regs_user %#"PRIx64"\n", (uint64_t)attr->sample_regs_user);
+ pr_debug2(" sample_stack_user %u\n", attr->sample_stack_user);
+ pr_debug2("------------------------------------------------------------\n");
+}
+
static int __perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
struct thread_map *threads)
{
@@ -840,6 +889,7 @@ retry_sample_id:
if (perf_missing_features.sample_id_all)
evsel->attr.sample_id_all = 0;

+ show_attr(&evsel->attr);
for (cpu = 0; cpu < cpus->nr; cpu++) {

for (thread = 0; thread < threads->nr; thread++) {
@@ -850,6 +900,8 @@ retry_sample_id:

group_fd = get_group_fd(evsel, cpu, thread);

+ pr_debug2("perf_event_open: pid %d cpu %d group_fd %d flags %#lx\n",
+ pid, cpus->map[cpu], group_fd, flags);
FD(evsel, cpu, thread) = sys_perf_event_open(&evsel->attr,
pid,
cpus->map[cpu],
--
1.7.11.7

2013-06-27 07:51:58

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 11/15] perf tools: fix symbol_conf.nr_events

symbol_conf.nr_events must be updated when processing
attribute events.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/header.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 4037b0a..81a3a0d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2968,6 +2968,8 @@ int perf_event__process_attr(struct perf_tool *tool __maybe_unused,
perf_evlist__id_add(evlist, evsel, 0, i, event->attr.id[i]);
}

+ symbol_conf.nr_events = evlist->nr_entries;
+
return 0;
}

--
1.7.11.7

2013-06-27 07:49:25

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 01/15] perf tools: remove unused parameter

'machine' is unused in 'perf_event__repipe_synth()' and some
callers pass NULL anyway. So remove it.

Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-inject.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 84ad6ab..f299ddf 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -38,8 +38,7 @@ struct event_entry {
};

static int perf_event__repipe_synth(struct perf_tool *tool,
- union perf_event *event,
- struct machine *machine __maybe_unused)
+ union perf_event *event)
{
struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
uint32_t size;
@@ -65,20 +64,20 @@ static int perf_event__repipe_op2_synth(struct perf_tool *tool,
struct perf_session *session
__maybe_unused)
{
- return perf_event__repipe_synth(tool, event, NULL);
+ return perf_event__repipe_synth(tool, event);
}

static int perf_event__repipe_event_type_synth(struct perf_tool *tool,
union perf_event *event)
{
- return perf_event__repipe_synth(tool, event, NULL);
+ return perf_event__repipe_synth(tool, event);
}

static int perf_event__repipe_tracing_data_synth(union perf_event *event,
struct perf_session *session
__maybe_unused)
{
- return perf_event__repipe_synth(NULL, event, NULL);
+ return perf_event__repipe_synth(NULL, event);
}

static int perf_event__repipe_attr(union perf_event *event,
@@ -89,15 +88,15 @@ static int perf_event__repipe_attr(union perf_event *event,
if (ret)
return ret;

- return perf_event__repipe_synth(NULL, event, NULL);
+ return perf_event__repipe_synth(NULL, event);
}

static int perf_event__repipe(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample __maybe_unused,
- struct machine *machine)
+ struct machine *machine __maybe_unused)
{
- return perf_event__repipe_synth(tool, event, machine);
+ return perf_event__repipe_synth(tool, event);
}

typedef int (*inject_handler)(struct perf_tool *tool,
@@ -119,7 +118,7 @@ static int perf_event__repipe_sample(struct perf_tool *tool,

build_id__mark_dso_hit(tool, event, sample, evsel, machine);

- return perf_event__repipe_synth(tool, event, machine);
+ return perf_event__repipe_synth(tool, event);
}

static int perf_event__repipe_mmap(struct perf_tool *tool,
@@ -153,7 +152,7 @@ static int perf_event__repipe_tracing_data(union perf_event *event,
{
int err;

- perf_event__repipe_synth(NULL, event, NULL);
+ perf_event__repipe_synth(NULL, event);
err = perf_event__process_tracing_data(event, session);

return err;
--
1.7.11.7

2013-06-27 07:49:24

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH V2 02/15] perf tools: fix missing tool parameter

perf inject expects to get a reference to 'struct perf_inject'
from its 'tool' member. For that to work, 'tool' needs to be
a parameter of all tool callbacks. Make it so.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-inject.c | 26 +++++++++++---------------
tools/perf/util/header.c | 6 ++++--
tools/perf/util/header.h | 6 ++++--
tools/perf/util/session.c | 11 +++++++----
tools/perf/util/tool.h | 9 ++++-----
5 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index f299ddf..c943513 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -73,22 +73,17 @@ static int perf_event__repipe_event_type_synth(struct perf_tool *tool,
return perf_event__repipe_synth(tool, event);
}

-static int perf_event__repipe_tracing_data_synth(union perf_event *event,
- struct perf_session *session
- __maybe_unused)
-{
- return perf_event__repipe_synth(NULL, event);
-}
-
-static int perf_event__repipe_attr(union perf_event *event,
- struct perf_evlist **pevlist __maybe_unused)
+static int perf_event__repipe_attr(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_evlist **pevlist)
{
int ret;
- ret = perf_event__process_attr(event, pevlist);
+
+ ret = perf_event__process_attr(tool, event, pevlist);
if (ret)
return ret;

- return perf_event__repipe_synth(NULL, event);
+ return perf_event__repipe_synth(tool, event);
}

static int perf_event__repipe(struct perf_tool *tool,
@@ -147,13 +142,14 @@ static int perf_event__repipe_fork(struct perf_tool *tool,
return err;
}

-static int perf_event__repipe_tracing_data(union perf_event *event,
+static int perf_event__repipe_tracing_data(struct perf_tool *tool,
+ union perf_event *event,
struct perf_session *session)
{
int err;

- perf_event__repipe_synth(NULL, event);
- err = perf_event__process_tracing_data(event, session);
+ perf_event__repipe_synth(tool, event);
+ err = perf_event__process_tracing_data(tool, event, session);

return err;
}
@@ -407,7 +403,7 @@ int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused)
.unthrottle = perf_event__repipe,
.attr = perf_event__repipe_attr,
.event_type = perf_event__repipe_event_type_synth,
- .tracing_data = perf_event__repipe_tracing_data_synth,
+ .tracing_data = perf_event__repipe_op2_synth,
.build_id = perf_event__repipe_op2_synth,
},
.input_name = "-",
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 738d3b8..4037b0a 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2933,7 +2933,8 @@ int perf_event__synthesize_attrs(struct perf_tool *tool,
return err;
}

-int perf_event__process_attr(union perf_event *event,
+int perf_event__process_attr(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
struct perf_evlist **pevlist)
{
u32 i, ids, n_ids;
@@ -3074,7 +3075,8 @@ int perf_event__synthesize_tracing_data(struct perf_tool *tool, int fd,
return aligned_size;
}

-int perf_event__process_tracing_data(union perf_event *event,
+int perf_event__process_tracing_data(struct perf_tool *tool __maybe_unused,
+ union perf_event *event,
struct perf_session *session)
{
ssize_t size_read, padding, size = event->tracing_data.size;
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 16a3e83..2d1ca7d 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -130,7 +130,8 @@ int perf_event__synthesize_attr(struct perf_tool *tool,
int perf_event__synthesize_attrs(struct perf_tool *tool,
struct perf_session *session,
perf_event__handler_t process);
-int perf_event__process_attr(union perf_event *event, struct perf_evlist **pevlist);
+int perf_event__process_attr(struct perf_tool *tool, union perf_event *event,
+ struct perf_evlist **pevlist);

int perf_event__synthesize_event_type(struct perf_tool *tool,
u64 event_id, char *name,
@@ -145,7 +146,8 @@ int perf_event__process_event_type(struct perf_tool *tool,
int perf_event__synthesize_tracing_data(struct perf_tool *tool,
int fd, struct perf_evlist *evlist,
perf_event__handler_t process);
-int perf_event__process_tracing_data(union perf_event *event,
+int perf_event__process_tracing_data(struct perf_tool *tool,
+ union perf_event *event,
struct perf_session *session);

int perf_event__synthesize_build_id(struct perf_tool *tool,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index cf1fe01..19ea888 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -193,7 +193,9 @@ void perf_session__delete(struct perf_session *self)
vdso__exit();
}

-static int process_event_synth_tracing_data_stub(union perf_event *event
+static int process_event_synth_tracing_data_stub(struct perf_tool *tool
+ __maybe_unused,
+ union perf_event *event
__maybe_unused,
struct perf_session *session
__maybe_unused)
@@ -202,7 +204,8 @@ static int process_event_synth_tracing_data_stub(union perf_event *event
return 0;
}

-static int process_event_synth_attr_stub(union perf_event *event __maybe_unused,
+static int process_event_synth_attr_stub(struct perf_tool *tool __maybe_unused,
+ union perf_event *event __maybe_unused,
struct perf_evlist **pevlist
__maybe_unused)
{
@@ -921,7 +924,7 @@ static int perf_session__process_user_event(struct perf_session *session, union
/* These events are processed right away */
switch (event->header.type) {
case PERF_RECORD_HEADER_ATTR:
- err = tool->attr(event, &session->evlist);
+ err = tool->attr(tool, event, &session->evlist);
if (err == 0)
perf_session__set_id_hdr_size(session);
return err;
@@ -930,7 +933,7 @@ static int perf_session__process_user_event(struct perf_session *session, union
case PERF_RECORD_HEADER_TRACING_DATA:
/* setup for reading amidst mmap */
lseek(session->fd, file_offset, SEEK_SET);
- return tool->tracing_data(event, session);
+ return tool->tracing_data(tool, event, session);
case PERF_RECORD_HEADER_BUILD_ID:
return tool->build_id(tool, event, session);
case PERF_RECORD_FINISHED_ROUND:
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index b0e1aad..88f8cbd 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -18,12 +18,11 @@ typedef int (*event_sample)(struct perf_tool *tool, union perf_event *event,
typedef int (*event_op)(struct perf_tool *tool, union perf_event *event,
struct perf_sample *sample, struct machine *machine);

-typedef int (*event_attr_op)(union perf_event *event,
+typedef int (*event_attr_op)(struct perf_tool *tool,
+ union perf_event *event,
struct perf_evlist **pevlist);
-typedef int (*event_simple_op)(struct perf_tool *tool, union perf_event *event);

-typedef int (*event_synth_op)(union perf_event *event,
- struct perf_session *session);
+typedef int (*event_simple_op)(struct perf_tool *tool, union perf_event *event);

typedef int (*event_op2)(struct perf_tool *tool, union perf_event *event,
struct perf_session *session);
@@ -39,7 +38,7 @@ struct perf_tool {
throttle,
unthrottle;
event_attr_op attr;
- event_synth_op tracing_data;
+ event_op2 tracing_data;
event_simple_op event_type;
event_op2 finished_round,
build_id;
--
1.7.11.7

2013-06-27 16:13:32

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH V2 06/15] perf tools: fix parse_events_terms() freeing local variable on error path

On 6/27/13 1:54 AM, Adrian Hunter wrote:
> The list_head is on the stack, so just free the rest of the list.
>
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/util/parse-events.c | 7 ++++++-
> tools/perf/util/parse-events.h | 1 +
> tools/perf/util/pmu.c | 2 +-
> 3 files changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index 995fc25..d9cb055 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -1231,12 +1231,17 @@ int parse_events_term__clone(struct parse_events_term **new,
> term->val.str, term->val.num);
> }
>
> -void parse_events__free_terms(struct list_head *terms)
> +void parse_events__free_terms_only(struct list_head *terms)
> {
> struct parse_events_term *term, *h;
>
> list_for_each_entry_safe(term, h, terms, list)
> free(term);
> +}
> +
> +void parse_events__free_terms(struct list_head *terms)
> +{
> + parse_events__free_terms_only(terms);
>
> free(terms);
> }

Seems like you can just fix parse_events__free_terms to do the right
thing and not call free(terms) -- ie., no need for a separate "only"
function. From there only test_term() needs to do the free(terms) and
since it mallocs the memory there it is nicely symmetrical.

David

2013-06-27 16:39:17

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH V2 12/15] perf tools: allow non-matching sample types

On 6/27/13 1:55 AM, Adrian Hunter wrote:
> Sample types need not be identical to determine
> the sample id from the event. Only the position
> of the sample id needs to be the same.
>
> Compatible sample types are ones in which the bits
> defined by PERF_COMPAT_MASK are the same.
> 'perf_evlist__config()' forces sample types to be
> compatible on that basis.
>
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/builtin-report.c | 2 +-
> tools/perf/util/event.h | 14 +++++
> tools/perf/util/evlist.c | 135 ++++++++++++++++++++++++++++++++++++++++++--
> tools/perf/util/evlist.h | 8 ++-
> tools/perf/util/evsel.c | 64 ++++++++++++++++++++-
> tools/perf/util/evsel.h | 10 ++++
> tools/perf/util/session.c | 8 ++-
> 7 files changed, 230 insertions(+), 11 deletions(-)
>
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index ca98d34..58572ac 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -367,7 +367,7 @@ static int process_read_event(struct perf_tool *tool,
> static int perf_report__setup_sample_type(struct perf_report *rep)
> {
> struct perf_session *self = rep->session;
> - u64 sample_type = perf_evlist__sample_type(self->evlist);
> + u64 sample_type = perf_evlist__combined_sample_type(self->evlist);
>
> if (!self->fd_pipe && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
> if (sort__has_parent) {
> diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
> index 1813895..3aef78c 100644
> --- a/tools/perf/util/event.h
> +++ b/tools/perf/util/event.h
> @@ -65,6 +65,20 @@ struct read_event {
> PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
> PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD)
>
> +/*
> + * Events have compatible sample types if the following bits all have the same
> + * value. This is because the order of sample members is fixed. For sample
> + * events the order is: PERF_SAMPLE_IP, PERF_SAMPLE_TID, PERF_SAMPLE_TIME,
> + * PERF_SAMPLE_ADDR, PERF_SAMPLE_ID. For non-sample events the sample members
> + * are accessed in reverse order. The order is: PERF_SAMPLE_ID,
> + * PERF_SAMPLE_STREAM_ID, PERF_SAMPLE_CPU.
> + */
> +#define PERF_COMPAT_MASK \
> + (PERF_SAMPLE_IP | PERF_SAMPLE_TID | \
> + PERF_SAMPLE_TIME | PERF_SAMPLE_ADDR | \
> + PERF_SAMPLE_ID | PERF_SAMPLE_STREAM_ID | \
> + PERF_SAMPLE_CPU)
> +
> struct sample_event {
> struct perf_event_header header;
> u64 array[];
> diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> index a660f56..78331da 100644
> --- a/tools/perf/util/evlist.c
> +++ b/tools/perf/util/evlist.c
> @@ -49,6 +49,45 @@ struct perf_evlist *perf_evlist__new(void)
> return evlist;
> }
>
> +/**
> + * perf_evlist__set_id_pos - set the positions of event ids.
> + * @evlist: selected event list
> + *
> + * Events with compatible sample types all have the same id_pos
> + * and is_pos. For convenience, put a copy on evlist.
> + */
> +static void perf_evlist__set_id_pos(struct perf_evlist *evlist)
> +{
> + struct perf_evsel *first = perf_evlist__first(evlist);
> +
> + evlist->id_pos = first->id_pos;
> + evlist->is_pos = first->is_pos;
> +}
> +
> +/**
> + * perf_evlist__make_sample_types_compatible - make sample types compatible.
> + * @evlist: selected event list
> + *
> + * Events with compatible sample types all have the same id_pos and is_pos.
> + * This can be achieved by matching the bits of PERF_COMPAT_MASK.
> + */
> +void perf_evlist__make_sample_types_compatible(struct perf_evlist *evlist)
> +{
> + struct perf_evsel *evsel;
> + u64 compat = 0;
> +
> + list_for_each_entry(evsel, &evlist->entries, node)
> + compat |= evsel->attr.sample_type & PERF_COMPAT_MASK;
> +
> + list_for_each_entry(evsel, &evlist->entries, node) {
> + evsel->attr.sample_type |= compat;
> + evsel->sample_size = __perf_evsel__sample_size(evsel->attr.sample_type);
> + perf_evsel__calc_id_pos(evsel);
> + }
> +
> + perf_evlist__set_id_pos(evlist);
> +}
> +

While this works for a combined S/W and tracepoint events session, I do
not like promoting sample types to the minimum compatible level for all
events in the session. perf needs to allow each event to have its own
sample_type and not force a minimal compatibility.

David

2013-06-27 16:52:09

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH V2 14/15] perf tools: add pid to struct thread

On 6/27/13 1:55 AM, Adrian Hunter wrote:
> Record pid on struct thread. The member is named 'pid_'
> to avoid confusion with the 'tid' member which was previously
> named 'pid'.
>
> Note that while "machine" functions update 'pid_', most tools
> do not.

You are doing multiple things in this patch. Please make the
introduction and use of machine__findnew_thread_ex a standalone patch.

David


>
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/util/machine.c | 47 ++++++++++++++++++++++++++++++++++++++---------
> tools/perf/util/machine.h | 4 ++++
> tools/perf/util/thread.c | 3 ++-
> tools/perf/util/thread.h | 3 ++-
> 4 files changed, 46 insertions(+), 11 deletions(-)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index ef0be97..d11ca3b 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -233,7 +233,8 @@ void machines__set_id_hdr_size(struct machines *machines, u16 id_hdr_size)
> return;
> }
>
> -static struct thread *__machine__findnew_thread(struct machine *machine, pid_t tid,
> +static struct thread *__machine__findnew_thread(struct machine *machine,
> + pid_t pid, pid_t tid,
> bool create)
> {
> struct rb_node **p = &machine->threads.rb_node;
> @@ -245,8 +246,11 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
> * so most of the time we dont have to look up
> * the full rbtree:
> */
> - if (machine->last_match && machine->last_match->tid == tid)
> + if (machine->last_match && machine->last_match->tid == tid) {
> + if (pid && !machine->last_match->pid_)
> + machine->last_match->pid_ = pid;
> return machine->last_match;
> + }
>
> while (*p != NULL) {
> parent = *p;
> @@ -254,6 +258,8 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
>
> if (th->tid == tid) {
> machine->last_match = th;
> + if (pid && !th->pid_)
> + th->pid_ = pid;
> return th;
> }
>
> @@ -266,7 +272,7 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
> if (!create)
> return NULL;
>
> - th = thread__new(tid);
> + th = thread__new(pid, tid);
> if (th != NULL) {
> rb_link_node(&th->rb_node, parent, p);
> rb_insert_color(&th->rb_node, &machine->threads);
> @@ -278,17 +284,25 @@ static struct thread *__machine__findnew_thread(struct machine *machine, pid_t t
>
> struct thread *machine__findnew_thread(struct machine *machine, pid_t tid)
> {
> - return __machine__findnew_thread(machine, tid, true);
> + return __machine__findnew_thread(machine, 0, tid, true);
> +}
> +
> +struct thread *machine__findnew_thread_ex(struct machine *machine, pid_t pid,
> + pid_t tid)
> +{
> + return __machine__findnew_thread(machine, pid, tid, true);
> }
>
> struct thread *machine__find_thread(struct machine *machine, pid_t tid)
> {
> - return __machine__findnew_thread(machine, tid, false);
> + return __machine__findnew_thread(machine, 0, tid, false);
> }
>
> int machine__process_comm_event(struct machine *machine, union perf_event *event)
> {
> - struct thread *thread = machine__findnew_thread(machine, event->comm.tid);
> + struct thread *thread = machine__findnew_thread_ex(machine,
> + event->comm.pid,
> + event->comm.tid);
>
> if (dump_trace)
> perf_event__fprintf_comm(event, stdout);
> @@ -969,7 +983,8 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
> return 0;
> }
>
> - thread = machine__findnew_thread(machine, event->mmap.pid);
> + thread = machine__findnew_thread_ex(machine, event->mmap.pid,
> + event->mmap.pid);
> if (thread == NULL)
> goto out_problem;
>
> @@ -996,8 +1011,12 @@ out_problem:
>
> int machine__process_fork_event(struct machine *machine, union perf_event *event)
> {
> - struct thread *thread = machine__findnew_thread(machine, event->fork.tid);
> - struct thread *parent = machine__findnew_thread(machine, event->fork.ptid);
> + struct thread *thread = machine__findnew_thread_ex(machine,
> + event->fork.pid,
> + event->fork.tid);
> + struct thread *parent = machine__findnew_thread_ex(machine,
> + event->fork.ppid,
> + event->fork.ptid);
>
> if (dump_trace)
> perf_event__fprintf_task(event, stdout);
> @@ -1264,3 +1283,13 @@ int machine__resolve_callchain(struct machine *machine,
> sample);
>
> }
> +
> +pid_t machine__get_thread_pid(struct machine *machine, pid_t tid)
> +{
> + struct thread *thread = machine__find_thread(machine, tid);
> +
> + if (!thread)
> + return 0;
> +
> + return thread->pid_;
> +}
> diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
> index e49ba01..11a53d5 100644
> --- a/tools/perf/util/machine.h
> +++ b/tools/perf/util/machine.h
> @@ -100,6 +100,8 @@ static inline bool machine__is_host(struct machine *machine)
> }
>
> struct thread *machine__findnew_thread(struct machine *machine, pid_t tid);
> +struct thread *machine__findnew_thread_ex(struct machine *machine, pid_t pid,
> + pid_t tid);
>
> size_t machine__fprintf(struct machine *machine, FILE *fp);
>
> @@ -156,4 +158,6 @@ void machines__destroy_kernel_maps(struct machines *machines);
>
> size_t machine__fprintf_vmlinux_path(struct machine *machine, FILE *fp);
>
> +pid_t machine__get_thread_pid(struct machine *machine, pid_t tid);
> +
> #endif /* __PERF_MACHINE_H */
> diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
> index 6feeb88..e3d4a55 100644
> --- a/tools/perf/util/thread.c
> +++ b/tools/perf/util/thread.c
> @@ -7,12 +7,13 @@
> #include "util.h"
> #include "debug.h"
>
> -struct thread *thread__new(pid_t tid)
> +struct thread *thread__new(pid_t pid, pid_t tid)
> {
> struct thread *self = zalloc(sizeof(*self));
>
> if (self != NULL) {
> map_groups__init(&self->mg);
> + self->pid_ = pid;
> self->tid = tid;
> self->ppid = -1;
> self->comm = malloc(32);
> diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
> index 37a86a3..bfa68e7 100644
> --- a/tools/perf/util/thread.h
> +++ b/tools/perf/util/thread.h
> @@ -12,6 +12,7 @@ struct thread {
> struct list_head node;
> };
> struct map_groups mg;
> + pid_t pid_; /* Not all tools update this */
> pid_t tid;
> pid_t ppid;
> char shortname[3];
> @@ -24,7 +25,7 @@ struct thread {
>
> struct machine;
>
> -struct thread *thread__new(pid_t tid);
> +struct thread *thread__new(pid_t pid, pid_t tid);
> void thread__delete(struct thread *self);
>
> int thread__set_comm(struct thread *self, const char *comm);
>

2013-06-27 16:58:06

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH V2 15/15] perf tools: fix ppid in thread__fork()

On 6/27/13 1:55 AM, Adrian Hunter wrote:
> ppid should be assigned to the parents pid. Note
> 'thread__fork()'s only caller 'machine__process_fork_event()'
> ensures that the parents pid is set.
>
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/util/thread.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
> index e3d4a55..93f3eab 100644
> --- a/tools/perf/util/thread.c
> +++ b/tools/perf/util/thread.c
> @@ -85,7 +85,7 @@ int thread__fork(struct thread *self, struct thread *parent)
> if (map_groups__clone(&self->mg, &parent->mg, i) < 0)
> return -ENOMEM;
>
> - self->ppid = parent->tid;
> + self->ppid = parent->pid_;
>

knowledge of the specific thread within the parent that created the
process can be valuable -- that's what we get now. This change drops
that information. I'd prefer this stays at thread id -- or save the
parent thread id in addition to the pid.

David

2013-06-28 06:26:38

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V2 06/15] perf tools: fix parse_events_terms() freeing local variable on error path

On 27/06/13 19:13, David Ahern wrote:
> On 6/27/13 1:54 AM, Adrian Hunter wrote:
>> The list_head is on the stack, so just free the rest of the list.
>>
>> Signed-off-by: Adrian Hunter <[email protected]>
>> ---
>> tools/perf/util/parse-events.c | 7 ++++++-
>> tools/perf/util/parse-events.h | 1 +
>> tools/perf/util/pmu.c | 2 +-
>> 3 files changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
>> index 995fc25..d9cb055 100644
>> --- a/tools/perf/util/parse-events.c
>> +++ b/tools/perf/util/parse-events.c
>> @@ -1231,12 +1231,17 @@ int parse_events_term__clone(struct
>> parse_events_term **new,
>> term->val.str, term->val.num);
>> }
>>
>> -void parse_events__free_terms(struct list_head *terms)
>> +void parse_events__free_terms_only(struct list_head *terms)
>> {
>> struct parse_events_term *term, *h;
>>
>> list_for_each_entry_safe(term, h, terms, list)
>> free(term);
>> +}
>> +
>> +void parse_events__free_terms(struct list_head *terms)
>> +{
>> + parse_events__free_terms_only(terms);
>>
>> free(terms);
>> }
>
> Seems like you can just fix parse_events__free_terms to do the right thing
> and not call free(terms) -- ie., no need for a separate "only" function.
> From there only test_term() needs to do the free(terms) and since it mallocs
> the memory there it is nicely symmetrical.

There are 2 more calls to parse_events__free_terms in util/parse-events.y

2013-06-28 06:41:38

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V2 15/15] perf tools: fix ppid in thread__fork()

On 27/06/13 19:57, David Ahern wrote:
> On 6/27/13 1:55 AM, Adrian Hunter wrote:
>> ppid should be assigned to the parents pid. Note
>> 'thread__fork()'s only caller 'machine__process_fork_event()'
>> ensures that the parents pid is set.
>>
>> Signed-off-by: Adrian Hunter <[email protected]>
>> ---
>> tools/perf/util/thread.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
>> index e3d4a55..93f3eab 100644
>> --- a/tools/perf/util/thread.c
>> +++ b/tools/perf/util/thread.c
>> @@ -85,7 +85,7 @@ int thread__fork(struct thread *self, struct thread
>> *parent)
>> if (map_groups__clone(&self->mg, &parent->mg, i) < 0)
>> return -ENOMEM;
>>
>> - self->ppid = parent->tid;
>> + self->ppid = parent->pid_;
>>
>
> knowledge of the specific thread within the parent that created the process
> can be valuable -- that's what we get now. This change drops that
> information. I'd prefer this stays at thread id -- or save the parent thread
> id in addition to the pid.

That means renaming ppid to ptid. Do you want to do that?

Isn't it possible that the parent could exit and the pid or tid be re-used
for another process? In that case, to reliable identify the parent a
pointer to its struct thread would be needed. i.e.

diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -14,6 +14,7 @@ struct thread {
struct map_groups mg;
pid_t pid_; /* Not all tools update this */
pid_t tid;
- pid_t ppid;
+ struct thread *parent;
char shortname[3];
bool comm_set;

However that means tracking the lifetime of 'parent' to ensure that it is
not left dangling.

Do you want to do that?

2013-06-28 13:42:05

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH V2 06/15] perf tools: fix parse_events_terms() freeing local variable on error path

On 6/28/13 12:32 AM, Adrian Hunter wrote:

>> Seems like you can just fix parse_events__free_terms to do the right thing
>> and not call free(terms) -- ie., no need for a separate "only" function.
>> From there only test_term() needs to do the free(terms) and since it mallocs
>> the memory there it is nicely symmetrical.
>
> There are 2 more calls to parse_events__free_terms in util/parse-events.y
>

Right and both look like stack case uses. I don't see the need for
__free_terms to free the list so no need for an _only version.

David

2013-06-28 13:46:24

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH V2 15/15] perf tools: fix ppid in thread__fork()

On 6/28/13 12:47 AM, Adrian Hunter wrote:
>> knowledge of the specific thread within the parent that created the process
>> can be valuable -- that's what we get now. This change drops that
>> information. I'd prefer this stays at thread id -- or save the parent thread
>> id in addition to the pid.
>
> That means renaming ppid to ptid. Do you want to do that?

Don't lose information, so if you want the process id add ppid and ptid.
I like knowing the exact thread.

>
> Isn't it possible that the parent could exit and the pid or tid be re-used
> for another process? In that case, to reliable identify the parent a
> pointer to its struct thread would be needed. i.e.
>
> diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
> --- a/tools/perf/util/thread.h
> +++ b/tools/perf/util/thread.h
> @@ -14,6 +14,7 @@ struct thread {
> struct map_groups mg;
> pid_t pid_; /* Not all tools update this */
> pid_t tid;
> - pid_t ppid;
> + struct thread *parent;
> char shortname[3];
> bool comm_set;
>
> However that means tracking the lifetime of 'parent' to ensure that it is
> not left dangling.
>
> Do you want to do that?

No, I do not think we want to save pointers to other thread structs. We
actually need to be able to clean up the dead_threads list for long
running sessions and dangling pointers would add to the problem.

David

2013-06-28 14:03:12

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCH V2 06/15] perf tools: fix parse_events_terms() freeing local variable on error path

On Fri, Jun 28, 2013 at 07:41:55AM -0600, David Ahern wrote:
> On 6/28/13 12:32 AM, Adrian Hunter wrote:
>
> >>Seems like you can just fix parse_events__free_terms to do the right thing
> >>and not call free(terms) -- ie., no need for a separate "only" function.
> >> From there only test_term() needs to do the free(terms) and since it mallocs
> >>the memory there it is nicely symmetrical.
> >
> >There are 2 more calls to parse_events__free_terms in util/parse-events.y
> >
>
> Right and both look like stack case uses. I don't see the need for
> __free_terms to free the list so no need for an _only version.

there's also another one in pmu_alias_terms function
which uses on stack list_head

jirka

2013-07-01 09:26:42

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V2 12/15] perf tools: allow non-matching sample types

On 27/06/13 19:39, David Ahern wrote:
> On 6/27/13 1:55 AM, Adrian Hunter wrote:
>> Sample types need not be identical to determine
>> the sample id from the event. Only the position
>> of the sample id needs to be the same.
>>
>> Compatible sample types are ones in which the bits
>> defined by PERF_COMPAT_MASK are the same.
>> 'perf_evlist__config()' forces sample types to be
>> compatible on that basis.
>>
>> Signed-off-by: Adrian Hunter <[email protected]>
>> ---
>> tools/perf/builtin-report.c | 2 +-
>> tools/perf/util/event.h | 14 +++++
>> tools/perf/util/evlist.c | 135
>> ++++++++++++++++++++++++++++++++++++++++++--
>> tools/perf/util/evlist.h | 8 ++-
>> tools/perf/util/evsel.c | 64 ++++++++++++++++++++-
>> tools/perf/util/evsel.h | 10 ++++
>> tools/perf/util/session.c | 8 ++-
>> 7 files changed, 230 insertions(+), 11 deletions(-)
>>

Snip

>
> While this works for a combined S/W and tracepoint events session, I do not
> like promoting sample types to the minimum compatible level for all events
> in the session. perf needs to allow each event to have its own sample_type
> and not force a minimal compatibility.

Why? The impact is small. The kernel API is completely unchanged.

Consider the affected sample members:

PERF_SAMPLE_IP
PERF_SAMPLE_TID
PERF_SAMPLE_TIME
PERF_SAMPLE_ADDR
PERF_SAMPLE_ID
PERF_SAMPLE_CPU
PERF_SAMPLE_STREAM_ID

Of those, perf_evlist__config() always selects PERF_SAMPLE_IP,
PERF_SAMPLE_TID and PERF_SAMPLE_ID, and never selects PERF_SAMPLE_STREAM_ID.
That leaves just:

PERF_SAMPLE_TIME
PERF_SAMPLE_ADDR
PERF_SAMPLE_CPU

PERF_SAMPLE_TIME is always selected when inheriting.

PERF_SAMPLE_TIME and PERF_SAMPLE_CPU are always selected for per-cpu recording.

Under what circumstances would you want 'time' and/or 'cpu' on some events
but not all of them?

Under what circumstances would you want 'addr' but those events are not the
vast majority of the events you record?

2013-07-01 18:54:01

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH V2 12/15] perf tools: allow non-matching sample types

On 7/1/13 3:32 AM, Adrian Hunter wrote:
> Snip
>
>>
>> While this works for a combined S/W and tracepoint events session, I do not
>> like promoting sample types to the minimum compatible level for all events
>> in the session. perf needs to allow each event to have its own sample_type
>> and not force a minimal compatibility.
>
> Why? The impact is small. The kernel API is completely unchanged.

I'd like to see libperf become a stable, usable library - usable by more
than the perf binary and its builtin commands. I have already done this
once for a daemon, and it was a PITA to get the specific use functional
without memory leaks/growth in the libperf part.

With respect to this specific patch it means appropriate flexibility in
the data collected for events. ie., each event can have its own
sample_type. For example if the tracepoint already contains task
information TID is not needed - and IP may not be wanted either. The
code processing the samples should not require all events to have some
minimum data format - that just wastes buffer space.

David

2013-07-01 19:10:41

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH V2 12/15] perf tools: allow non-matching sample types

On Mon, Jul 1, 2013 at 8:53 PM, David Ahern <[email protected]> wrote:
>
> On 7/1/13 3:32 AM, Adrian Hunter wrote:
>>
>> Snip
>>
>>>
>>> While this works for a combined S/W and tracepoint events session, I do not
>>> like promoting sample types to the minimum compatible level for all events
>>> in the session. perf needs to allow each event to have its own sample_type
>>> and not force a minimal compatibility.
>>
>>
>> Why? The impact is small. The kernel API is completely unchanged.
>
>
> I'd like to see libperf become a stable, usable library - usable by more than the perf binary and its builtin commands. I have already done this once for a daemon, and it was a PITA to get the specific use functional without memory leaks/growth in the libperf part.
>
> With respect to this specific patch it means appropriate flexibility in the data collected for events. ie., each event can have its own sample_type. For example if the tracepoint already contains task information TID is not needed - and IP may not be wanted either. The code processing the samples should not require all events to have some minimum data format - that just wastes buffer space.
>
I agree. This kernel needs to allow for any bit combination on
sample_type and yet provide enough info
to parse the buffer in the case of multi-event sampling. This is
kernel bug. Tools should not have to handle
this. Because it'd have to be repeated for each tool.

Later this week, I'll post a patch that address the kernel limitation.

2013-07-02 06:57:18

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V2 12/15] perf tools: allow non-matching sample types

On 01/07/13 22:10, Stephane Eranian wrote:
> On Mon, Jul 1, 2013 at 8:53 PM, David Ahern <[email protected]> wrote:
>>
>> On 7/1/13 3:32 AM, Adrian Hunter wrote:
>>>
>>> Snip
>>>
>>>>
>>>> While this works for a combined S/W and tracepoint events session, I do not
>>>> like promoting sample types to the minimum compatible level for all events
>>>> in the session. perf needs to allow each event to have its own sample_type
>>>> and not force a minimal compatibility.
>>>
>>>
>>> Why? The impact is small. The kernel API is completely unchanged.
>>
>>
>> I'd like to see libperf become a stable, usable library - usable by more than the perf binary and its builtin commands. I have already done this once for a daemon, and it was a PITA to get the specific use functional without memory leaks/growth in the libperf part.
>>
>> With respect to this specific patch it means appropriate flexibility in the data collected for events. ie., each event can have its own sample_type. For example if the tracepoint already contains task information TID is not needed - and IP may not be wanted either. The code processing the samples should not require all events to have some minimum data format - that just wastes buffer space.
>>
> I agree. This kernel needs to allow for any bit combination on
> sample_type and yet provide enough info
> to parse the buffer in the case of multi-event sampling. This is
> kernel bug. Tools should not have to handle
> this. Because it'd have to be repeated for each tool.
>
> Later this week, I'll post a patch that address the kernel limitation.

But isn't it trivial. Just add a new sample type that puts the ID first.
Anyone using the new PERF_SAMPLE_IDENTIFIER gets the new ABI.



diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 0b1df41..6bb217e 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -134,8 +134,9 @@ enum perf_event_sample_format {
PERF_SAMPLE_STACK_USER = 1U << 13,
PERF_SAMPLE_WEIGHT = 1U << 14,
PERF_SAMPLE_DATA_SRC = 1U << 15,
+ PERF_SAMPLE_IDENTIFIER = 1U << 16,

- PERF_SAMPLE_MAX = 1U << 16, /* non-ABI */
+ PERF_SAMPLE_MAX = 1U << 17, /* non-ABI */
};

/*
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 1db3af9..a3707af 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1203,6 +1203,9 @@ static void perf_event__id_header_size(struct perf_event *event)
if (sample_type & PERF_SAMPLE_TIME)
size += sizeof(data->time);

+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ size += sizeof(data->id);
+
if (sample_type & PERF_SAMPLE_ID)
size += sizeof(data->id);

@@ -4229,7 +4232,7 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
if (sample_type & PERF_SAMPLE_TIME)
data->time = perf_clock();

- if (sample_type & PERF_SAMPLE_ID)
+ if (sample_type & (PERF_SAMPLE_ID | PERF_SAMPLE_IDENTIFIER))
data->id = primary_event_id(event);

if (sample_type & PERF_SAMPLE_STREAM_ID)
@@ -4268,6 +4271,9 @@ static void __perf_event__output_id_sample(struct perf_output_handle *handle,

if (sample_type & PERF_SAMPLE_CPU)
perf_output_put(handle, data->cpu_entry);
+
+ if (sample_type & PERF_SAMPLE_IDENTIFER)
+ perf_output_put(handle, data->id);
}

void perf_event__output_id_sample(struct perf_event *event,
@@ -4380,6 +4386,9 @@ void perf_output_sample(struct perf_output_handle *handle,

perf_output_put(handle, *header);

+ if (sample_type & PERF_SAMPLE_IDENTIFIER)
+ perf_output_put(handle, data->id);
+
if (sample_type & PERF_SAMPLE_IP)
perf_output_put(handle, data->ip);

2013-07-02 07:03:59

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V2 12/15] perf tools: allow non-matching sample types

On 01/07/13 21:53, David Ahern wrote:
> On 7/1/13 3:32 AM, Adrian Hunter wrote:
>> Snip
>>
>>>
>>> While this works for a combined S/W and tracepoint events session, I do not
>>> like promoting sample types to the minimum compatible level for all events
>>> in the session. perf needs to allow each event to have its own sample_type
>>> and not force a minimal compatibility.
>>
>> Why? The impact is small. The kernel API is completely unchanged.
>
> I'd like to see libperf become a stable, usable library - usable by more
> than the perf binary and its builtin commands. I have already done this once
> for a daemon, and it was a PITA to get the specific use functional without
> memory leaks/growth in the libperf part.
>
> With respect to this specific patch it means appropriate flexibility in the
> data collected for events. ie., each event can have its own sample_type. For
> example if the tracepoint already contains task information TID is not
> needed - and IP may not be wanted either. The code processing the samples
> should not require all events to have some minimum data format - that just
> wastes buffer space.

It would be more compelling to provide a use-case where that "waste"
actually makes any difference.

2013-07-03 06:40:33

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH V2 12/15] perf tools: allow non-matching sample types

On Tue, Jul 2, 2013 at 8:58 AM, Adrian Hunter <[email protected]> wrote:
> On 01/07/13 22:10, Stephane Eranian wrote:
>> On Mon, Jul 1, 2013 at 8:53 PM, David Ahern <[email protected]> wrote:
>>>
>>> On 7/1/13 3:32 AM, Adrian Hunter wrote:
>>>>
>>>> Snip
>>>>
>>>>>
>>>>> While this works for a combined S/W and tracepoint events session, I do not
>>>>> like promoting sample types to the minimum compatible level for all events
>>>>> in the session. perf needs to allow each event to have its own sample_type
>>>>> and not force a minimal compatibility.
>>>>
>>>>
>>>> Why? The impact is small. The kernel API is completely unchanged.
>>>
>>>
>>> I'd like to see libperf become a stable, usable library - usable by more than the perf binary and its builtin commands. I have already done this once for a daemon, and it was a PITA to get the specific use functional without memory leaks/growth in the libperf part.
>>>
>>> With respect to this specific patch it means appropriate flexibility in the data collected for events. ie., each event can have its own sample_type. For example if the tracepoint already contains task information TID is not needed - and IP may not be wanted either. The code processing the samples should not require all events to have some minimum data format - that just wastes buffer space.
>>>
>> I agree. This kernel needs to allow for any bit combination on
>> sample_type and yet provide enough info
>> to parse the buffer in the case of multi-event sampling. This is
>> kernel bug. Tools should not have to handle
>> this. Because it'd have to be repeated for each tool.
>>
>> Later this week, I'll post a patch that address the kernel limitation.
>
> But isn't it trivial. Just add a new sample type that puts the ID first.
> Anyone using the new PERF_SAMPLE_IDENTIFIER gets the new ABI.
>
>
You could do that but then how many aliases do we need for the event ID.

>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index 0b1df41..6bb217e 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -134,8 +134,9 @@ enum perf_event_sample_format {
> PERF_SAMPLE_STACK_USER = 1U << 13,
> PERF_SAMPLE_WEIGHT = 1U << 14,
> PERF_SAMPLE_DATA_SRC = 1U << 15,
> + PERF_SAMPLE_IDENTIFIER = 1U << 16,
>
> - PERF_SAMPLE_MAX = 1U << 16, /* non-ABI */
> + PERF_SAMPLE_MAX = 1U << 17, /* non-ABI */
> };
>
> /*
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 1db3af9..a3707af 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -1203,6 +1203,9 @@ static void perf_event__id_header_size(struct perf_event *event)
> if (sample_type & PERF_SAMPLE_TIME)
> size += sizeof(data->time);
>
> + if (sample_type & PERF_SAMPLE_IDENTIFIER)
> + size += sizeof(data->id);
> +
> if (sample_type & PERF_SAMPLE_ID)
> size += sizeof(data->id);
>
> @@ -4229,7 +4232,7 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
> if (sample_type & PERF_SAMPLE_TIME)
> data->time = perf_clock();
>
> - if (sample_type & PERF_SAMPLE_ID)
> + if (sample_type & (PERF_SAMPLE_ID | PERF_SAMPLE_IDENTIFIER))
> data->id = primary_event_id(event);
>
> if (sample_type & PERF_SAMPLE_STREAM_ID)
> @@ -4268,6 +4271,9 @@ static void __perf_event__output_id_sample(struct perf_output_handle *handle,
>
> if (sample_type & PERF_SAMPLE_CPU)
> perf_output_put(handle, data->cpu_entry);
> +
> + if (sample_type & PERF_SAMPLE_IDENTIFER)
> + perf_output_put(handle, data->id);
> }
>
> void perf_event__output_id_sample(struct perf_event *event,
> @@ -4380,6 +4386,9 @@ void perf_output_sample(struct perf_output_handle *handle,
>
> perf_output_put(handle, *header);
>
> + if (sample_type & PERF_SAMPLE_IDENTIFIER)
> + perf_output_put(handle, data->id);
> +
> if (sample_type & PERF_SAMPLE_IP)
> perf_output_put(handle, data->ip);
>

2013-07-03 06:44:27

by Stephane Eranian

[permalink] [raw]
Subject: Re: [PATCH V2 12/15] perf tools: allow non-matching sample types

On Tue, Jul 2, 2013 at 9:09 AM, Adrian Hunter <[email protected]> wrote:
> On 01/07/13 21:53, David Ahern wrote:
>> On 7/1/13 3:32 AM, Adrian Hunter wrote:
>>> Snip
>>>
>>>>
>>>> While this works for a combined S/W and tracepoint events session, I do not
>>>> like promoting sample types to the minimum compatible level for all events
>>>> in the session. perf needs to allow each event to have its own sample_type
>>>> and not force a minimal compatibility.
>>>
>>> Why? The impact is small. The kernel API is completely unchanged.
>>
>> I'd like to see libperf become a stable, usable library - usable by more
>> than the perf binary and its builtin commands. I have already done this once
>> for a daemon, and it was a PITA to get the specific use functional without
>> memory leaks/growth in the libperf part.
>>
>> With respect to this specific patch it means appropriate flexibility in the
>> data collected for events. ie., each event can have its own sample_type. For
>> example if the tracepoint already contains task information TID is not
>> needed - and IP may not be wanted either. The code processing the samples
>> should not require all events to have some minimum data format - that just
>> wastes buffer space.
>
> It would be more compelling to provide a use-case where that "waste"
> actually makes any difference.
>
In my opinion, it's not so much of the "wasted" space than on the ease of use
for tools. With your change, tools would have to know something about the order
in which sample_type is laid out. And it just happens that it is not
as trivial as
expected. It is NOT the bit position order in sample_type. So this is more error
prone.

I prefer your IDENTIFIER solution better, yet it also implies that this flag is
special. It provides the event ID at the first position in the sample's body.

2013-08-26 18:54:57

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH V2 12/15] perf tools: allow non-matching sample types

Em Wed, Jul 03, 2013 at 08:44:25AM +0200, Stephane Eranian escreveu:
> On Tue, Jul 2, 2013 at 9:09 AM, Adrian Hunter <[email protected]> wrote:
> > It would be more compelling to provide a use-case where that "waste"
> > actually makes any difference.

> In my opinion, it's not so much of the "wasted" space than on the ease of use
> for tools. With your change, tools would have to know something about the order
> in which sample_type is laid out. And it just happens that it is not
> as trivial as
> expected. It is NOT the bit position order in sample_type. So this is more error
> prone.

> I prefer your IDENTIFIER solution better, yet it also implies that this flag is
> special. It provides the event ID at the first position in the sample's body.

I'm trying to process Adrian's latest batch, but then I saw this
discussion and since this compat part is the first patch and is
controversial, i.e. Stephane seems to be against it as is David Ahern, I
would like to get this compat thing to be moved to the end of the
patchset, so that we could get the parts that everybody agrees should go
in first and then further discuss the merits of forcing the
compatibility up to where we can get the PERF_SAMPLE_ID info.

Sorry Adrian, this has been thru a lot of versions :-\

I tried doing it myself, but in the end you would have to take a look to
see if I hadn't messed up something, so if you can do a v13 with

[PATCH V12 01/13] perf tools: allow non-matching sample types

Moved to 13/13, I could then process the other patches and move on,

Allowing mixed sample types after this part that must be force made
compatible allows us to do things in older kernels that we can't now, so
it has merits, BTW.

Thanks,

- Arnaldo