2015-12-22 17:07:58

by Namhyung Kim

[permalink] [raw]
Subject: [PATCHSET 00/10] perf tools: Support dynamic sort keys for tracepoints (v4)

Hello,

This is an attempt to improve perf to deal with tracepoint events
better. The perf tools can handle tracepoint events but perf report
on them is less useful since they're always sampled in a fixed
location and not provide event specific info. We can use perf script
but I always wishes there's more convenient way to see the result.

* changes in v4)
- support '<event>.*' syntax for dynamic sort key
- skip unrelated dynamic fields in the output
- add 'trace_fields' sort key (Jiri)
- reuse get_trace_output function (Jiri)

* changes in v3)
- save trace_output for dynamic entries (Jiri)
- update field length for each entry

* changes in v2)
- add 'trace' sort key and make it default (Jiri)
- add '--raw-trace' option and '/raw' field modifier (Jiri)
- support event name shortcuts (David)


I suggest dynamic sort keys created for each event.field on demand.
Consider following example:

# Overhead Command Shared Object Symbol
# ........ ............... ................ ..............
#
47.22% swapper [kernel.vmlinux] [k] __schedule
21.67% transmission-gt [kernel.vmlinux] [k] __schedule
8.23% netctl-auto [kernel.vmlinux] [k] __schedule
5.53% kworker/0:1H [kernel.vmlinux] [k] __schedule
1.98% Xephyr [kernel.vmlinux] [k] __schedule
1.33% irq/33-iwlwifi [kernel.vmlinux] [k] __schedule
1.17% wpa_cli [kernel.vmlinux] [k] __schedule
1.13% rcu_preempt [kernel.vmlinux] [k] __schedule
0.85% ksoftirqd/0 [kernel.vmlinux] [k] __schedule
0.77% Timer [kernel.vmlinux] [k] __schedule
...

Currently perf report only shows this but important info is on the
event fields, that is:

# sudo cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 268
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;

field:char prev_comm[16]; offset:8; size:16; signed:1;
field:pid_t prev_pid; offset:24; size:4; signed:1;
field:int prev_prio; offset:28; size:4; signed:1;
field:long prev_state; offset:32; size:8; signed:1;
field:char next_comm[16]; offset:40; size:16; signed:1;
field:pid_t next_pid; offset:56; size:4; signed:1;
field:int next_prio; offset:60; size:4; signed:1;

print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==>
next_comm=%s next_pid=%d next_prio=%d",
REC->prev_comm, REC->prev_pid, REC->prev_prio,
REC->prev_state & (2048-1) ? __print_flags(REC->prev_state & (2048-1),
"|", { 1, "S"} , { 2, "D" }, { 4, "T" }, { 8, "t" }, { 16, "Z" }, { 32, "X" },
{ 64, "x" }, { 128, "K"}, { 256, "W" }, { 512, "P" }, { 1024, "N" }) : "R",
REC->prev_state & 2048 ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio

With dynamic sort keys, you can use <event.field> as a sort key. Those
dynamic keys are checked and created on demand. For instance, below is
to sort by next_pid field on the same data file.

$ perf report -s comm,sched:sched_switch.next_pid --stdio
...
# Overhead Command next_pid
# ........ ............... ..........
#
21.23% transmission-gt 0
20.86% swapper 17773
6.62% netctl-auto 0
5.25% swapper 109
5.21% kworker/0:1H 0
1.98% Xephyr 0
1.98% swapper 6524
1.98% swapper 27478
1.37% swapper 27476
1.17% swapper 233
...

Multiple dynamic sort keys are also supported and the event name can
be simplified (or even omitted in a couple of ways (see patch 10/10):

$ perf report -s comm,switch.next_pid,next_comm --stdio
...
# Overhead Command next_pid next_comm
# ........ ............... .......... ................
#
20.86% swapper 17773 transmission-gt
9.64% transmission-gt 0 swapper/0
9.16% transmission-gt 0 swapper/2
5.25% swapper 109 kworker/0:1H
5.21% kworker/0:1H 0 swapper/0
2.14% netctl-auto 0 swapper/2
1.98% netctl-auto 0 swapper/0
1.98% swapper 6524 Xephyr
1.98% swapper 27478 netctl-auto
1.78% transmission-gt 0 swapper/3
1.53% Xephyr 0 swapper/0
1.29% netctl-auto 0 swapper/1
1.29% swapper 27476 netctl-auto
1.21% netctl-auto 0 swapper/3
1.17% swapper 233 irq/33-iwlwifi
...

Note that pid 0 exists for each cpu so have comm of 'swapper/N'.

Also now it add a new 'trace' sort key to print whole trace output.
Below is the output using sched_switch plugin.

$ perf report -s trace --stdio
...
# Overhead Trace output
# ........ ...................................................
#
9.48% swapper/0:0 [120] R ==> transmission-gt:17773 [120]
9.48% transmission-gt:17773 [120] S ==> swapper/0:0 [120]
9.04% swapper/2:0 [120] R ==> transmission-gt:17773 [120]
8.92% transmission-gt:17773 [120] S ==> swapper/2:0 [120]
5.25% swapper/0:0 [120] R ==> kworker/0:1H:109 [100]
5.21% kworker/0:1H:109 [100] S ==> swapper/0:0 [120]
1.78% swapper/3:0 [120] R ==> transmission-gt:17773 [120]
1.78% transmission-gt:17773 [120] S ==> swapper/3:0 [120]
1.53% Xephyr:6524 [120] S ==> swapper/0:0 [120]
1.53% swapper/0:0 [120] R ==> Xephyr:6524 [120]
1.17% swapper/2:0 [120] R ==> irq/33-iwlwifi:233 [49]
1.13% irq/33-iwlwifi:233 [49] S ==> swapper/2:0 [120]
...

The 'trace_fields' sort key is to show each fields separately. This
is actually same as using -s 'sched_switch.*' in this case, but the
'trace_fields' sort key also works for more than one events:

$ perf report -s trace_fields --stdio
...
# Overhead prev_comm prev_pid prev_prio prev_state next_comm next_pid next_prio
# ........ ........... ......... ......... .......... ............ ........ .........
#
3.82% swapper/0 0 120 0 netctl-auto 18711 120
3.75% netctl-auto 18711 120 1 swapper/0 0 120
2.24% swapper/1 0 120 0 netctl-auto 18709 120
2.24% netctl-auto 18709 120 1 swapper/1 0 120
1.80% swapper/2 0 120 0 rcu_preempt 7 120
1.80% swapper/2 0 120 0 netctl-auto 18711 120
1.80% rcu_preempt 7 120 1 swapper/2 0 120
1.80% netctl-auto 18711 120 1 swapper/2 0 120
...

If a data file has only tracepoint events, the 'trace' sort key will
be used by default. However, if user gave --raw-trace option also,
it'll use 'trace_fields' instead.

This is available on 'perf/dynamic-sort-v4' branch in my tree

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome, thanks!
Namhyung


Namhyung Kim (13):
perf hist: Pass struct sample to __hists__add_entry()
perf hist: Save raw_data/size for tracepoint events
tools lib traceevent: Factor out and export print_event_field[s]
perf tools: Pass evlist to setup_sorting()
perf tools: Add dynamic sort key for tracepoint events
perf tools: Try to show pretty printed output for dynamic sort keys
perf tools: Add 'trace' sort key
perf tools: Add --raw-trace option
perf tools: Support shortcuts for events in dynamic sort keys
perf tools: Support '<event>.*' dynamic sort key
perf tools: Skip dynamic fields not defined for current event
perf tools: Add 'trace_fields' dynamic sort key
perf tools: Make 'trace' or 'trace_fields' sort key default for tracepoint events

tools/lib/traceevent/event-parse.c | 125 +++----
tools/lib/traceevent/event-parse.h | 4 +
tools/perf/Documentation/perf-report.txt | 3 +
tools/perf/Documentation/perf-top.txt | 3 +
tools/perf/builtin-annotate.c | 9 +-
tools/perf/builtin-diff.c | 13 +-
tools/perf/builtin-report.c | 4 +-
tools/perf/builtin-top.c | 16 +-
tools/perf/tests/hists_cumulate.c | 8 +-
tools/perf/tests/hists_filter.c | 2 +-
tools/perf/tests/hists_link.c | 8 +-
tools/perf/tests/hists_output.c | 10 +-
tools/perf/ui/browsers/hists.c | 7 +-
tools/perf/ui/gtk/hists.c | 4 +-
tools/perf/ui/hist.c | 2 +-
tools/perf/ui/stdio/hist.c | 6 +-
tools/perf/util/hist.c | 38 ++-
tools/perf/util/hist.h | 19 +-
tools/perf/util/sort.c | 539 ++++++++++++++++++++++++++++++-
tools/perf/util/sort.h | 10 +-
tools/perf/util/symbol.h | 3 +-
21 files changed, 700 insertions(+), 133 deletions(-)

--
2.6.4


2015-12-22 17:08:05

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 01/13] perf hist: Pass struct sample to __hists__add_entry()

This is a preparation to add more info into the hist_entry. Also it
already passes too many argument, so passing sample directly will reduce
the overhead of the function call.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-annotate.c | 7 +++++--
tools/perf/builtin-diff.c | 11 +++++------
tools/perf/tests/hists_link.c | 6 +++---
tools/perf/util/hist.c | 31 +++++++++++++++++--------------
tools/perf/util/hist.h | 4 ++--
5 files changed, 32 insertions(+), 27 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index e18f1b995ffd..b5b8db0b0338 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -47,7 +47,7 @@ struct perf_annotate {
};

static int perf_evsel__add_sample(struct perf_evsel *evsel,
- struct perf_sample *sample __maybe_unused,
+ struct perf_sample *sample,
struct addr_location *al,
struct perf_annotate *ann)
{
@@ -72,7 +72,10 @@ static int perf_evsel__add_sample(struct perf_evsel *evsel,
return 0;
}

- he = __hists__add_entry(hists, al, NULL, NULL, NULL, 1, 1, 0, true);
+ sample->period = 1;
+ sample->weight = 1;
+
+ he = __hists__add_entry(hists, al, NULL, NULL, NULL, sample, true);
if (he == NULL)
return -ENOMEM;

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 0b180a885ba3..69f5b1feff39 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -311,11 +311,11 @@ static int formula_fprintf(struct hist_entry *he, struct hist_entry *pair,
}

static int hists__add_entry(struct hists *hists,
- struct addr_location *al, u64 period,
- u64 weight, u64 transaction)
+ struct addr_location *al,
+ struct perf_sample *sample)
{
- if (__hists__add_entry(hists, al, NULL, NULL, NULL, period, weight,
- transaction, true) != NULL)
+ if (__hists__add_entry(hists, al, NULL, NULL, NULL,
+ sample, true) != NULL)
return 0;
return -ENOMEM;
}
@@ -336,8 +336,7 @@ static int diff__process_sample_event(struct perf_tool *tool __maybe_unused,
return -1;
}

- if (hists__add_entry(hists, &al, sample->period,
- sample->weight, sample->transaction)) {
+ if (hists__add_entry(hists, &al, sample)) {
pr_warning("problem incrementing symbol period, skipping event\n");
goto out_put;
}
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 6243e2b2a245..9eac98daecb8 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -64,7 +64,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
struct perf_evsel *evsel;
struct addr_location al;
struct hist_entry *he;
- struct perf_sample sample = { .period = 1, };
+ struct perf_sample sample = { .period = 1, .weight = 1, };
size_t i = 0, k;

/*
@@ -90,7 +90,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
goto out;

he = __hists__add_entry(hists, &al, NULL,
- NULL, NULL, 1, 1, 0, true);
+ NULL, NULL, &sample, true);
if (he == NULL) {
addr_location__put(&al);
goto out;
@@ -116,7 +116,7 @@ static int add_hist_entries(struct perf_evlist *evlist, struct machine *machine)
goto out;

he = __hists__add_entry(hists, &al, NULL,
- NULL, NULL, 1, 1, 0, true);
+ NULL, NULL, &sample, true);
if (he == NULL) {
addr_location__put(&al);
goto out;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 56e97f5af598..039bb91d0a92 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -461,7 +461,7 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
struct symbol *sym_parent,
struct branch_info *bi,
struct mem_info *mi,
- u64 period, u64 weight, u64 transaction,
+ struct perf_sample *sample,
bool sample_self)
{
struct hist_entry entry = {
@@ -478,15 +478,15 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
.level = al->level,
.stat = {
.nr_events = 1,
- .period = period,
- .weight = weight,
+ .period = sample->period,
+ .weight = sample->weight,
},
.parent = sym_parent,
.filtered = symbol__parent_filter(sym_parent) | al->filtered,
.hists = hists,
.branch_info = bi,
.mem_info = mi,
- .transaction = transaction,
+ .transaction = sample->transaction,
};

return hists__findnew_entry(hists, &entry, al, sample_self);
@@ -526,12 +526,13 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
u64 cost;
struct mem_info *mi = iter->priv;
struct hists *hists = evsel__hists(iter->evsel);
+ struct perf_sample *sample = iter->sample;
struct hist_entry *he;

if (mi == NULL)
return -EINVAL;

- cost = iter->sample->weight;
+ cost = sample->weight;
if (!cost)
cost = 1;

@@ -542,8 +543,10 @@ iter_add_single_mem_entry(struct hist_entry_iter *iter, struct addr_location *al
* and this is indirectly achieved by passing period=weight here
* and the he_stat__add_period() function.
*/
+ sample->period = cost;
+
he = __hists__add_entry(hists, al, iter->parent, NULL, mi,
- cost, cost, 0, true);
+ sample, true);
if (!he)
return -ENOMEM;

@@ -630,6 +633,7 @@ iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *a
struct branch_info *bi;
struct perf_evsel *evsel = iter->evsel;
struct hists *hists = evsel__hists(evsel);
+ struct perf_sample *sample = iter->sample;
struct hist_entry *he = NULL;
int i = iter->curr;
int err = 0;
@@ -643,9 +647,11 @@ iter_add_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *a
* The report shows the percentage of total branches captured
* and not events sampled. Thus we use a pseudo period of 1.
*/
+ sample->period = 1;
+ sample->weight = bi->flags.cycles ? bi->flags.cycles : 1;
+
he = __hists__add_entry(hists, al, iter->parent, &bi[i], NULL,
- 1, bi->flags.cycles ? bi->flags.cycles : 1,
- 0, true);
+ sample, true);
if (he == NULL)
return -ENOMEM;

@@ -682,8 +688,7 @@ iter_add_single_normal_entry(struct hist_entry_iter *iter, struct addr_location
struct hist_entry *he;

he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
- sample->period, sample->weight,
- sample->transaction, true);
+ sample, true);
if (he == NULL)
return -ENOMEM;

@@ -744,8 +749,7 @@ iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
int err = 0;

he = __hists__add_entry(hists, al, iter->parent, NULL, NULL,
- sample->period, sample->weight,
- sample->transaction, true);
+ sample, true);
if (he == NULL)
return -ENOMEM;

@@ -818,8 +822,7 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
}

he = __hists__add_entry(evsel__hists(evsel), al, iter->parent, NULL, NULL,
- sample->period, sample->weight,
- sample->transaction, false);
+ sample, false);
if (he == NULL)
return -ENOMEM;

diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index a48a2078d288..36439bfad059 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -114,8 +114,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
struct addr_location *al,
struct symbol *parent,
struct branch_info *bi,
- struct mem_info *mi, u64 period,
- u64 weight, u64 transaction,
+ struct mem_info *mi,
+ struct perf_sample *sample,
bool sample_self);
int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
int max_stack_depth, void *arg);
--
2.6.4

2015-12-22 17:11:57

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 02/13] perf hist: Save raw_data/size for tracepoint events

The raw_data and raw_size fields are to provide tracepoint specific
information. They will be used by dynamic sort keys later.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/hist.c | 4 ++++
tools/perf/util/sort.h | 2 ++
2 files changed, 6 insertions(+)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 039bb91d0a92..c0c92a3daa69 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -487,6 +487,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
.branch_info = bi,
.mem_info = mi,
.transaction = sample->transaction,
+ .raw_data = sample->raw_data,
+ .raw_size = sample->raw_size,
};

return hists__findnew_entry(hists, &entry, al, sample_self);
@@ -801,6 +803,8 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
.sym = al->sym,
},
.parent = iter->parent,
+ .raw_data = sample->raw_data,
+ .raw_size = sample->raw_size,
};
int i;
struct callchain_cursor cursor;
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 86f05e7a5566..d29898708dbd 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -122,6 +122,8 @@ struct hist_entry {
struct branch_info *branch_info;
struct hists *hists;
struct mem_info *mem_info;
+ void *raw_data;
+ u32 raw_size;
struct callchain_root callchain[0]; /* must be last member */
};

--
2.6.4

2015-12-22 17:08:10

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 03/13] tools lib traceevent: Factor out and export print_event_field[s]

The print_event_field() and print_event_field() are to print basic
information of a given field or event without the print format. They'll
be used by dynamic sort keys later.

Cc: Steven Rostedt <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/lib/traceevent/event-parse.c | 125 ++++++++++++++++++++-----------------
tools/lib/traceevent/event-parse.h | 4 ++
2 files changed, 70 insertions(+), 59 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 68276f35e323..1b43f2b9aebe 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -4735,73 +4735,80 @@ static int is_printable_array(char *p, unsigned int len)
return 1;
}

-static void print_event_fields(struct trace_seq *s, void *data,
- int size __maybe_unused,
- struct event_format *event)
+void print_event_field(struct trace_seq *s, void *data,
+ struct format_field *field)
{
- struct format_field *field;
unsigned long long val;
unsigned int offset, len, i;
-
- field = event->format.fields;
- while (field) {
- trace_seq_printf(s, " %s=", field->name);
- if (field->flags & FIELD_IS_ARRAY) {
- offset = field->offset;
- len = field->size;
- if (field->flags & FIELD_IS_DYNAMIC) {
- val = pevent_read_number(event->pevent, data + offset, len);
- offset = val;
- len = offset >> 16;
- offset &= 0xffff;
- }
- if (field->flags & FIELD_IS_STRING &&
- is_printable_array(data + offset, len)) {
- trace_seq_printf(s, "%s", (char *)data + offset);
- } else {
- trace_seq_puts(s, "ARRAY[");
- for (i = 0; i < len; i++) {
- if (i)
- trace_seq_puts(s, ", ");
- trace_seq_printf(s, "%02x",
- *((unsigned char *)data + offset + i));
- }
- trace_seq_putc(s, ']');
- field->flags &= ~FIELD_IS_STRING;
- }
+ struct pevent *pevent = field->event->pevent;
+
+ if (field->flags & FIELD_IS_ARRAY) {
+ offset = field->offset;
+ len = field->size;
+ if (field->flags & FIELD_IS_DYNAMIC) {
+ val = pevent_read_number(pevent, data + offset, len);
+ offset = val;
+ len = offset >> 16;
+ offset &= 0xffff;
+ }
+ if (field->flags & FIELD_IS_STRING &&
+ is_printable_array(data + offset, len)) {
+ trace_seq_printf(s, "%s", (char *)data + offset);
} else {
- val = pevent_read_number(event->pevent, data + field->offset,
- field->size);
- if (field->flags & FIELD_IS_POINTER) {
- trace_seq_printf(s, "0x%llx", val);
- } else if (field->flags & FIELD_IS_SIGNED) {
- switch (field->size) {
- case 4:
- /*
- * If field is long then print it in hex.
- * A long usually stores pointers.
- */
- if (field->flags & FIELD_IS_LONG)
- trace_seq_printf(s, "0x%x", (int)val);
- else
- trace_seq_printf(s, "%d", (int)val);
- break;
- case 2:
- trace_seq_printf(s, "%2d", (short)val);
- break;
- case 1:
- trace_seq_printf(s, "%1d", (char)val);
- break;
- default:
- trace_seq_printf(s, "%lld", val);
- }
- } else {
+ trace_seq_puts(s, "ARRAY[");
+ for (i = 0; i < len; i++) {
+ if (i)
+ trace_seq_puts(s, ", ");
+ trace_seq_printf(s, "%02x",
+ *((unsigned char *)data + offset + i));
+ }
+ trace_seq_putc(s, ']');
+ field->flags &= ~FIELD_IS_STRING;
+ }
+ } else {
+ val = pevent_read_number(pevent, data + field->offset,
+ field->size);
+ if (field->flags & FIELD_IS_POINTER) {
+ trace_seq_printf(s, "0x%llx", val);
+ } else if (field->flags & FIELD_IS_SIGNED) {
+ switch (field->size) {
+ case 4:
+ /*
+ * If field is long then print it in hex.
+ * A long usually stores pointers.
+ */
if (field->flags & FIELD_IS_LONG)
- trace_seq_printf(s, "0x%llx", val);
+ trace_seq_printf(s, "0x%x", (int)val);
else
- trace_seq_printf(s, "%llu", val);
+ trace_seq_printf(s, "%d", (int)val);
+ break;
+ case 2:
+ trace_seq_printf(s, "%2d", (short)val);
+ break;
+ case 1:
+ trace_seq_printf(s, "%1d", (char)val);
+ break;
+ default:
+ trace_seq_printf(s, "%lld", val);
}
+ } else {
+ if (field->flags & FIELD_IS_LONG)
+ trace_seq_printf(s, "0x%llx", val);
+ else
+ trace_seq_printf(s, "%llu", val);
}
+ }
+}
+
+void print_event_fields(struct trace_seq *s, void *data,
+ int size __maybe_unused, struct event_format *event)
+{
+ struct format_field *field;
+
+ field = event->format.fields;
+ while (field) {
+ trace_seq_printf(s, " %s=", field->name);
+ print_event_field(s, data, field);
field = field->next;
}
}
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 6fc83c7edbe9..600c73277a6f 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -705,6 +705,10 @@ struct cmdline *pevent_data_pid_from_comm(struct pevent *pevent, const char *com
struct cmdline *next);
int pevent_cmdline_pid(struct pevent *pevent, struct cmdline *cmdline);

+void print_event_field(struct trace_seq *s, void *data,
+ struct format_field *field);
+void print_event_fields(struct trace_seq *s, void *data,
+ int size __maybe_unused, struct event_format *event);
void pevent_event_info(struct trace_seq *s, struct event_format *event,
struct pevent_record *record);
int pevent_strerror(struct pevent *pevent, enum pevent_errno errnum,
--
2.6.4

2015-12-22 17:08:15

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 04/13] perf tools: Pass evlist to setup_sorting()

This is a preparation to support dynamic sort keys for tracepoint
events. Dynamic sort keys can be created for specific fields in trace
events so it needs the event information.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-annotate.c | 2 +-
tools/perf/builtin-diff.c | 2 +-
tools/perf/builtin-report.c | 2 +-
tools/perf/builtin-top.c | 14 +++++++-------
tools/perf/tests/hists_cumulate.c | 8 ++++----
tools/perf/tests/hists_filter.c | 2 +-
tools/perf/tests/hists_link.c | 2 +-
tools/perf/tests/hists_output.c | 10 +++++-----
tools/perf/util/sort.c | 15 +++++++++------
tools/perf/util/sort.h | 5 +++--
10 files changed, 33 insertions(+), 29 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index b5b8db0b0338..cc5c1267c738 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -370,7 +370,7 @@ int cmd_annotate(int argc, const char **argv, const char *prefix __maybe_unused)
if (ret < 0)
goto out_delete;

- if (setup_sorting() < 0)
+ if (setup_sorting(NULL) < 0)
usage_with_options(annotate_usage, options);

if (annotate.use_stdio)
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index 69f5b1feff39..87063835d741 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -1279,7 +1279,7 @@ int cmd_diff(int argc, const char **argv, const char *prefix __maybe_unused)

sort__mode = SORT_MODE__DIFF;

- if (setup_sorting() < 0)
+ if (setup_sorting(NULL) < 0)
usage_with_options(diff_usage, options);

setup_pager();
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2a7330b99b82..ea53c816640a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -897,7 +897,7 @@ repeat:
symbol_conf.cumulate_callchain = false;
}

- if (setup_sorting() < 0) {
+ if (setup_sorting(session->evlist) < 0) {
if (sort_order)
parse_options_usage(report_usage, options, "s", 1);
if (field_order)
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 9ebd67a42ede..005825942441 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1231,11 +1231,17 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
if (argc)
usage_with_options(top_usage, options);

+ if (!top.evlist->nr_entries &&
+ perf_evlist__add_default(top.evlist) < 0) {
+ pr_err("Not enough memory for event selector list\n");
+ goto out_delete_evlist;
+ }
+
sort__mode = SORT_MODE__TOP;
/* display thread wants entries to be collapsed in a different tree */
sort__need_collapse = 1;

- if (setup_sorting() < 0) {
+ if (setup_sorting(top.evlist) < 0) {
if (sort_order)
parse_options_usage(top_usage, options, "s", 1);
if (field_order)
@@ -1277,12 +1283,6 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
goto out_delete_evlist;
}

- if (!top.evlist->nr_entries &&
- perf_evlist__add_default(top.evlist) < 0) {
- ui__error("Not enough memory for event selector list\n");
- goto out_delete_evlist;
- }
-
symbol_conf.nr_events = top.evlist->nr_entries;

if (top.delay_secs < 1)
diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index 8292948bc5f9..e36089212061 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -281,7 +281,7 @@ static int test1(struct perf_evsel *evsel, struct machine *machine)
symbol_conf.cumulate_callchain = false;
perf_evsel__reset_sample_bit(evsel, CALLCHAIN);

- setup_sorting();
+ setup_sorting(NULL);
callchain_register_param(&callchain_param);

err = add_hist_entries(hists, machine);
@@ -428,7 +428,7 @@ static int test2(struct perf_evsel *evsel, struct machine *machine)
symbol_conf.cumulate_callchain = false;
perf_evsel__set_sample_bit(evsel, CALLCHAIN);

- setup_sorting();
+ setup_sorting(NULL);
callchain_register_param(&callchain_param);

err = add_hist_entries(hists, machine);
@@ -486,7 +486,7 @@ static int test3(struct perf_evsel *evsel, struct machine *machine)
symbol_conf.cumulate_callchain = true;
perf_evsel__reset_sample_bit(evsel, CALLCHAIN);

- setup_sorting();
+ setup_sorting(NULL);
callchain_register_param(&callchain_param);

err = add_hist_entries(hists, machine);
@@ -670,7 +670,7 @@ static int test4(struct perf_evsel *evsel, struct machine *machine)
symbol_conf.cumulate_callchain = true;
perf_evsel__set_sample_bit(evsel, CALLCHAIN);

- setup_sorting();
+ setup_sorting(NULL);
callchain_register_param(&callchain_param);

err = add_hist_entries(hists, machine);
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index ccb5b4921f25..2a784befd9ce 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -122,7 +122,7 @@ int test__hists_filter(int subtest __maybe_unused)
goto out;

/* default sort order (comm,dso,sym) will be used */
- if (setup_sorting() < 0)
+ if (setup_sorting(NULL) < 0)
goto out;

machines__init(&machines);
diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 9eac98daecb8..c764d69ac6ef 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -294,7 +294,7 @@ int test__hists_link(int subtest __maybe_unused)
goto out;

/* default sort order (comm,dso,sym) will be used */
- if (setup_sorting() < 0)
+ if (setup_sorting(NULL) < 0)
goto out;

machines__init(&machines);
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index 248beec1d917..ebe6cd485b5d 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -134,7 +134,7 @@ static int test1(struct perf_evsel *evsel, struct machine *machine)
field_order = NULL;
sort_order = NULL; /* equivalent to sort_order = "comm,dso,sym" */

- setup_sorting();
+ setup_sorting(NULL);

/*
* expected output:
@@ -236,7 +236,7 @@ static int test2(struct perf_evsel *evsel, struct machine *machine)
field_order = "overhead,cpu";
sort_order = "pid";

- setup_sorting();
+ setup_sorting(NULL);

/*
* expected output:
@@ -292,7 +292,7 @@ static int test3(struct perf_evsel *evsel, struct machine *machine)
field_order = "comm,overhead,dso";
sort_order = NULL;

- setup_sorting();
+ setup_sorting(NULL);

/*
* expected output:
@@ -366,7 +366,7 @@ static int test4(struct perf_evsel *evsel, struct machine *machine)
field_order = "dso,sym,comm,overhead,dso";
sort_order = "sym";

- setup_sorting();
+ setup_sorting(NULL);

/*
* expected output:
@@ -468,7 +468,7 @@ static int test5(struct perf_evsel *evsel, struct machine *machine)
field_order = "cpu,pid,comm,dso,sym";
sort_order = "dso,pid";

- setup_sorting();
+ setup_sorting(NULL);

/*
* expected output:
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 2d8ccd4d9e1b..0c038a27fe5c 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -4,6 +4,8 @@
#include "comm.h"
#include "symbol.h"
#include "evsel.h"
+#include "evlist.h"
+#include <traceevent/event-parse.h>

regex_t parent_regex;
const char default_parent_pattern[] = "^sys_|^do_page_fault";
@@ -1583,7 +1585,8 @@ int hpp_dimension__add_output(unsigned col)
return __hpp_dimension__add_output(&hpp_sort_dimensions[col]);
}

-int sort_dimension__add(const char *tok)
+static int sort_dimension__add(const char *tok,
+ struct perf_evlist *evlist __maybe_unused)
{
unsigned int i;

@@ -1712,7 +1715,7 @@ static int setup_sort_order(void)
return 0;
}

-static int __setup_sorting(void)
+static int __setup_sorting(struct perf_evlist *evlist)
{
char *tmp, *tok, *str;
const char *sort_keys;
@@ -1743,7 +1746,7 @@ static int __setup_sorting(void)

for (tok = strtok_r(str, ", ", &tmp);
tok; tok = strtok_r(NULL, ", ", &tmp)) {
- ret = sort_dimension__add(tok);
+ ret = sort_dimension__add(tok, evlist);
if (ret == -EINVAL) {
error("Invalid --sort key: `%s'", tok);
break;
@@ -1954,16 +1957,16 @@ out:
return ret;
}

-int setup_sorting(void)
+int setup_sorting(struct perf_evlist *evlist)
{
int err;

- err = __setup_sorting();
+ err = __setup_sorting(evlist);
if (err < 0)
return err;

if (parent_pattern != default_parent_pattern) {
- err = sort_dimension__add("parent");
+ err = sort_dimension__add("parent", evlist);
if (err < 0)
return err;
}
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index d29898708dbd..1a00f1eb9d21 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -226,10 +226,11 @@ struct sort_entry {
extern struct sort_entry sort_thread;
extern struct list_head hist_entry__sort_list;

-int setup_sorting(void);
+struct perf_evlist;
+struct pevent;
+int setup_sorting(struct perf_evlist *evlist);
int setup_output_field(void);
void reset_output_field(void);
-extern int sort_dimension__add(const char *);
void sort__setup_elide(FILE *fp);
void perf_hpp__set_elide(int idx, bool elide);

--
2.6.4

2015-12-22 17:10:45

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 05/13] perf tools: Add dynamic sort key for tracepoint events

The existing sort keys are less useful for tracepoint events in that
they are always sampled at a same location.

For example, report on sched:sched_switch event looks like following

# Overhead Command Shared Object Symbol
# ........ ............... ................ ..............
#
47.22% swapper [kernel.vmlinux] [k] __schedule
21.67% transmission-gt [kernel.vmlinux] [k] __schedule
8.23% netctl-auto [kernel.vmlinux] [k] __schedule
5.53% kworker/0:1H [kernel.vmlinux] [k] __schedule
1.98% Xephyr [kernel.vmlinux] [k] __schedule
1.33% irq/33-iwlwifi [kernel.vmlinux] [k] __schedule
1.17% wpa_cli [kernel.vmlinux] [k] __schedule
1.13% rcu_preempt [kernel.vmlinux] [k] __schedule
0.85% ksoftirqd/0 [kernel.vmlinux] [k] __schedule
0.77% Timer [kernel.vmlinux] [k] __schedule

In fact, tracepoints have meaningful information in their fields but
there's no way to use in the perf report currently. The dynamic sort
keys are to overcome this problem.

The sched:sched_switch events have following fields:

# sudo cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 268
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
field:unsigned char common_flags; offset:2; size:1; signed:0;
field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
field:int common_pid; offset:4; size:4; signed:1;

field:char prev_comm[16]; offset:8; size:16; signed:1;
field:pid_t prev_pid; offset:24; size:4; signed:1;
field:int prev_prio; offset:28; size:4; signed:1;
field:long prev_state; offset:32; size:8; signed:1;
field:char next_comm[16]; offset:40; size:16; signed:1;
field:pid_t next_pid; offset:56; size:4; signed:1;
field:int next_prio; offset:60; size:4; signed:1;

print fmt: "prev_comm=%s prev_pid=%d prev_prio=%d prev_state=%s%s ==>
next_comm=%s next_pid=%d next_prio=%d",
REC->prev_comm, REC->prev_pid, REC->prev_prio,
REC->prev_state & (2048-1) ? __print_flags(REC->prev_state & (2048-1),
"|", { 1, "S"} , { 2, "D" }, { 4, "T" }, { 8, "t" }, { 16, "Z" }, { 32, "X" },
{ 64, "x" }, { 128, "K"}, { 256, "W" }, { 512, "P" }, { 1024, "N" }) : "R",
REC->prev_state & 2048 ? "+" : "", REC->next_comm, REC->next_pid, REC->next_prio

With dynamic sort keys, you can use <event.field> as a sort key. Those
dynamic keys are checked and created on demand. For instance, below is
to sort by next_pid field on the same data file.

$ perf report -s comm,sched:sched_switch.next_pid --stdio
...
# Overhead Command next_pid
# ........ ............... ..........
#
21.23% transmission-gt 0
20.86% swapper 17773
6.62% netctl-auto 0
5.25% swapper 109
5.21% kworker/0:1H 0
1.98% Xephyr 0
1.98% swapper 6524
1.98% swapper 27478
1.37% swapper 27476
1.17% swapper 233

Multiple dynamic sort keys are also supported:

$ perf report -s comm,sched:sched_switch.next_pid,sched:sched_switch.next_comm --stdio
...
# Overhead Command next_pid next_comm
# ........ ............... .......... ................
#
20.86% swapper 17773 transmission-gt
9.64% transmission-gt 0 swapper/0
9.16% transmission-gt 0 swapper/2
5.25% swapper 109 kworker/0:1H
5.21% kworker/0:1H 0 swapper/0
2.14% netctl-auto 0 swapper/2
1.98% netctl-auto 0 swapper/0
1.98% swapper 6524 Xephyr
1.98% swapper 27478 netctl-auto
1.78% transmission-gt 0 swapper/3
1.53% Xephyr 0 swapper/0
1.29% netctl-auto 0 swapper/1
1.29% swapper 27476 netctl-auto
1.21% netctl-auto 0 swapper/3
1.17% swapper 233 irq/33-iwlwifi

Note that pid 0 exists for each cpu so have comm of 'swapper/N'.

Cc: Steven Rostedt <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/sort.c | 213 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 213 insertions(+)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 0c038a27fe5c..763d810cdb38 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1531,6 +1531,216 @@ static int __sort_dimension__add_hpp_output(struct sort_dimension *sd)
return 0;
}

+struct hpp_dynamic_entry {
+ struct perf_hpp_fmt hpp;
+ struct perf_evsel *evsel;
+ struct format_field *field;
+ unsigned dynamic_len;
+};
+
+static int hde_width(struct hpp_dynamic_entry *hde)
+{
+ if (!hde->hpp.len) {
+ int len = hde->dynamic_len;
+ int namelen = strlen(hde->field->name);
+ int fieldlen = hde->field->size;
+
+ if (namelen > len)
+ len = namelen;
+
+ if (!(hde->field->flags & FIELD_IS_STRING)) {
+ /* length for print hex numbers */
+ fieldlen = hde->field->size * 2 + 2;
+ }
+ if (fieldlen > len)
+ len = fieldlen;
+
+ hde->hpp.len = len;
+ }
+ return hde->hpp.len;
+}
+
+static int __sort__hde_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct perf_evsel *evsel __maybe_unused)
+{
+ struct hpp_dynamic_entry *hde;
+ size_t len = fmt->user_len;
+
+ hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
+
+ if (!len)
+ len = hde_width(hde);
+
+ return scnprintf(hpp->buf, hpp->size, "%*.*s", len, len, hde->field->name);
+}
+
+static int __sort__hde_width(struct perf_hpp_fmt *fmt,
+ struct perf_hpp *hpp __maybe_unused,
+ struct perf_evsel *evsel __maybe_unused)
+{
+ struct hpp_dynamic_entry *hde;
+ size_t len = fmt->user_len;
+
+ hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
+
+ if (!len)
+ len = hde_width(hde);
+
+ return len;
+}
+
+static int __sort__hde_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
+ struct hist_entry *he)
+{
+ struct hpp_dynamic_entry *hde;
+ size_t len = fmt->user_len;
+ struct trace_seq seq;
+ int ret;
+
+ hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
+
+ if (!len)
+ len = hde_width(hde);
+
+ if (hists_to_evsel(he->hists) != hde->evsel)
+ return scnprintf(hpp->buf, hpp->size, "%*.*s", len, len, "N/A");
+
+ trace_seq_init(&seq);
+ print_event_field(&seq, he->raw_data, hde->field);
+ ret = scnprintf(hpp->buf, hpp->size, "%*.*s", len, len, seq.buffer);
+ trace_seq_destroy(&seq);
+ return ret;
+}
+
+static int64_t __sort__hde_cmp(struct perf_hpp_fmt *fmt,
+ struct hist_entry *a, struct hist_entry *b)
+{
+ struct hpp_dynamic_entry *hde;
+ struct format_field *field;
+ unsigned offset, size;
+
+ hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
+
+ if (hists_to_evsel(a->hists) != hde->evsel)
+ return 0;
+
+ field = hde->field;
+ if (field->flags & FIELD_IS_DYNAMIC) {
+ unsigned long long dyn;
+
+ pevent_read_number_field(field, a->raw_data, &dyn);
+ offset = dyn & 0xffff;
+ size = (dyn >> 16) & 0xffff;
+
+ /* record max width for output */
+ if (size > hde->dynamic_len)
+ hde->dynamic_len = size;
+ } else {
+ offset = field->offset;
+ size = field->size;
+ }
+
+ return memcmp(a->raw_data + offset, b->raw_data + offset, size);
+}
+
+static struct hpp_dynamic_entry *
+__alloc_dynamic_entry(struct perf_evsel *evsel, struct format_field *field)
+{
+ struct hpp_dynamic_entry *hde;
+
+ hde = malloc(sizeof(*hde));
+ if (hde == NULL) {
+ pr_debug("Memory allocation failed\n");
+ return NULL;
+ }
+
+ hde->evsel = evsel;
+ hde->field = field;
+ hde->dynamic_len = 0;
+
+ hde->hpp.name = field->name;
+ hde->hpp.header = __sort__hde_header;
+ hde->hpp.width = __sort__hde_width;
+ hde->hpp.entry = __sort__hde_entry;
+ hde->hpp.color = NULL;
+
+ hde->hpp.cmp = __sort__hde_cmp;
+ hde->hpp.collapse = __sort__hde_cmp;
+ hde->hpp.sort = __sort__hde_cmp;
+
+ INIT_LIST_HEAD(&hde->hpp.list);
+ INIT_LIST_HEAD(&hde->hpp.sort_list);
+ hde->hpp.elide = false;
+ hde->hpp.len = 0;
+ hde->hpp.user_len = 0;
+
+ return hde;
+}
+
+static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
+{
+ char *str, *event_name, *field_name;
+ struct perf_evsel *evsel, *pos;
+ struct format_field *field;
+ struct hpp_dynamic_entry *hde;
+ int ret = 0;
+
+ if (evlist == NULL)
+ return -ENOENT;
+
+ str = strdup(tok);
+ if (str == NULL)
+ return -ENOMEM;
+
+ event_name = str;
+ field_name = strchr(str, '.');
+ if (field_name == NULL) {
+ ret = -EINVAL;
+ goto out;
+ }
+ *field_name++ = '\0';
+
+ evsel = NULL;
+ evlist__for_each(evlist, pos) {
+ if (!strcmp(pos->name, event_name)) {
+ evsel = pos;
+ break;
+ }
+ }
+
+ if (evsel == NULL) {
+ pr_debug("Cannot find event: %s\n", event_name);
+ ret = -ENOENT;
+ goto out;
+ }
+
+ if (evsel->attr.type != PERF_TYPE_TRACEPOINT) {
+ pr_debug("%s is not a tracepoint event\n", event_name);
+ ret = -EINVAL;
+ goto out;
+ }
+
+ field = pevent_find_any_field(evsel->tp_format, field_name);
+ if (field == NULL) {
+ pr_debug("Cannot find event field for %s.%s\n",
+ event_name, field_name);
+ ret = -ENOENT;
+ goto out;
+ }
+
+ hde = __alloc_dynamic_entry(evsel, field);
+ if (hde == NULL) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ perf_hpp__register_sort_field(&hde->hpp);
+
+out:
+ free(str);
+ return ret;
+}
+
static int __sort_dimension__add(struct sort_dimension *sd)
{
if (sd->taken)
@@ -1667,6 +1877,9 @@ static int sort_dimension__add(const char *tok,
return 0;
}

+ if (!add_dynamic_entry(evlist, tok))
+ return 0;
+
return -ESRCH;
}

--
2.6.4

2015-12-22 17:08:20

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 06/13] perf tools: Try to show pretty printed output for dynamic sort keys

Each tracepoint event has format string for print to improve
readability. Try to parse the output and match the field name. If it
finds one, use that for the result. If not, fallbacks to the original
output.

For example, sort on kmem:kmalloc.gfp_flags looks like below:
(Note: libtraceevent plugins are not installed on my system. They might
affect the output below)

Before:
# Overhead Command gfp_flags
# ........ ....... ..........
#
99.89% perf 32848
0.06% sleep 208
0.03% perf 32976
0.01% perf 208

After:
# Overhead Command gfp_flags
# ........ ....... ...................
#
99.89% perf GFP_NOFS|GFP_ZERO
0.06% sleep GFP_KERNEL
0.03% perf GFP_KERNEL|GFP_ZERO
0.01% perf GFP_KERNEL

Cc: Steven Rostedt <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/hist.c | 1 +
tools/perf/util/sort.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++---
tools/perf/util/sort.h | 1 +
3 files changed, 102 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index c0c92a3daa69..ef1a4a5cc610 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -978,6 +978,7 @@ void hist_entry__delete(struct hist_entry *he)
if (he->srcfile && he->srcfile[0])
free(he->srcfile);
free_callchain(he->callchain);
+ free(he->trace_output);
free(he);
}

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 763d810cdb38..91c48f7a7f02 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1560,6 +1560,62 @@ static int hde_width(struct hpp_dynamic_entry *hde)
return hde->hpp.len;
}

+static char *get_trace_output(struct hist_entry *he)
+{
+ struct trace_seq seq;
+ struct perf_evsel *evsel;
+ struct pevent_record rec = {
+ .data = he->raw_data,
+ .size = he->raw_size,
+ };
+
+ evsel = hists_to_evsel(he->hists);
+
+ trace_seq_init(&seq);
+ pevent_event_info(&seq, evsel->tp_format, &rec);
+ return seq.buffer;
+}
+
+static void update_dynamic_len(struct hpp_dynamic_entry *hde,
+ struct hist_entry *he)
+{
+ char *str, *pos;
+ struct format_field *field = hde->field;
+ size_t namelen;
+ bool last = false;
+
+ /* parse pretty print result and update max length */
+ if (!he->trace_output)
+ he->trace_output = get_trace_output(he);
+
+ namelen = strlen(field->name);
+ str = he->trace_output;
+
+ while (str) {
+ pos = strchr(str, ' ');
+ if (pos == NULL) {
+ last = true;
+ pos = str + strlen(str);
+ }
+
+ if (!strncmp(str, field->name, namelen)) {
+ size_t len;
+
+ str += namelen + 1;
+ len = pos - str;
+
+ if (len > hde->dynamic_len)
+ hde->dynamic_len = len;
+ break;
+ }
+
+ if (last)
+ str = NULL;
+ else
+ str = pos + 1;
+ }
+}
+
static int __sort__hde_header(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
struct perf_evsel *evsel __maybe_unused)
{
@@ -1594,7 +1650,10 @@ static int __sort__hde_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
{
struct hpp_dynamic_entry *hde;
size_t len = fmt->user_len;
- struct trace_seq seq;
+ char *str, *pos;
+ struct format_field *field;
+ size_t namelen;
+ bool last = false;
int ret;

hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
@@ -1605,10 +1664,43 @@ static int __sort__hde_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
if (hists_to_evsel(he->hists) != hde->evsel)
return scnprintf(hpp->buf, hpp->size, "%*.*s", len, len, "N/A");

- trace_seq_init(&seq);
- print_event_field(&seq, he->raw_data, hde->field);
- ret = scnprintf(hpp->buf, hpp->size, "%*.*s", len, len, seq.buffer);
- trace_seq_destroy(&seq);
+ field = hde->field;
+
+ namelen = strlen(field->name);
+ str = he->trace_output;
+
+ while (str) {
+ pos = strchr(str, ' ');
+ if (pos == NULL) {
+ last = true;
+ pos = str + strlen(str);
+ }
+
+ if (!strncmp(str, field->name, namelen)) {
+ str += namelen + 1;
+ str = strndup(str, pos - str);
+
+ if (str == NULL)
+ return scnprintf(hpp->buf, hpp->size,
+ "%*.*s", len, len, "ERROR");
+ break;
+ }
+
+ if (last)
+ str = NULL;
+ else
+ str = pos + 1;
+ }
+
+ if (str == NULL) {
+ struct trace_seq seq;
+ trace_seq_init(&seq);
+ print_event_field(&seq, he->raw_data, hde->field);
+ str = seq.buffer;
+ }
+
+ ret = scnprintf(hpp->buf, hpp->size, "%*.*s", len, len, str);
+ free(str);
return ret;
}

@@ -1638,6 +1730,9 @@ static int64_t __sort__hde_cmp(struct perf_hpp_fmt *fmt,
} else {
offset = field->offset;
size = field->size;
+
+ update_dynamic_len(hde, a);
+ update_dynamic_len(hde, b);
}

return memcmp(a->raw_data + offset, b->raw_data + offset, size);
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 1a00f1eb9d21..f6d2a7e3e7f2 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -124,6 +124,7 @@ struct hist_entry {
struct mem_info *mem_info;
void *raw_data;
u32 raw_size;
+ void *trace_output;
struct callchain_root callchain[0]; /* must be last member */
};

--
2.6.4

2015-12-22 17:08:26

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 07/13] perf tools: Add 'trace' sort key

The 'trace' sort key is to show tracepoint event output using either
print fmt or plugin. For example sched_switch event (using plugin) will
show output like below:

$ perf report -s trace --stdio
...
# Overhead Trace output
# ........ ...................................................
#
9.48% swapper/0:0 [120] R ==> transmission-gt:17773 [120]
9.48% transmission-gt:17773 [120] S ==> swapper/0:0 [120]
9.04% swapper/2:0 [120] R ==> transmission-gt:17773 [120]
8.92% transmission-gt:17773 [120] S ==> swapper/2:0 [120]
5.25% swapper/0:0 [120] R ==> kworker/0:1H:109 [100]
5.21% kworker/0:1H:109 [100] S ==> swapper/0:0 [120]
1.78% swapper/3:0 [120] R ==> transmission-gt:17773 [120]
1.78% transmission-gt:17773 [120] S ==> swapper/3:0 [120]
1.53% Xephyr:6524 [120] S ==> swapper/0:0 [120]
1.53% swapper/0:0 [120] R ==> Xephyr:6524 [120]
1.17% swapper/2:0 [120] R ==> irq/33-iwlwifi:233 [49]
1.13% irq/33-iwlwifi:233 [49] S ==> swapper/2:0 [120]

Note that the 'trace' sort key works only for tracepoint events. If
it's used to other type of events, just "N/A" will be printed.

Suggested-by: Jiri Olsa <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/hist.h | 1 +
tools/perf/util/sort.c | 76 +++++++++++++++++++++++++++++++++++++++-----------
tools/perf/util/sort.h | 1 +
3 files changed, 62 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 36439bfad059..15b22c563d30 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -52,6 +52,7 @@ enum hist_column {
HISTC_MEM_IADDR_SYMBOL,
HISTC_TRANSACTION,
HISTC_CYCLES,
+ HISTC_TRACE,
HISTC_NR_COLS, /* Last entry */
};

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 91c48f7a7f02..582f632353bc 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -445,6 +445,65 @@ struct sort_entry sort_socket = {
.se_width_idx = HISTC_SOCKET,
};

+/* --sort trace */
+
+static char *get_trace_output(struct hist_entry *he)
+{
+ struct trace_seq seq;
+ struct perf_evsel *evsel;
+ struct pevent_record rec = {
+ .data = he->raw_data,
+ .size = he->raw_size,
+ };
+
+ evsel = hists_to_evsel(he->hists);
+
+ trace_seq_init(&seq);
+ pevent_event_info(&seq, evsel->tp_format, &rec);
+ return seq.buffer;
+}
+
+static int64_t
+sort__trace_cmp(struct hist_entry *left, struct hist_entry *right)
+{
+ struct perf_evsel *evsel;
+
+ evsel = hists_to_evsel(left->hists);
+ if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
+ return 0;
+
+ if (left->trace_output == NULL)
+ left->trace_output = get_trace_output(left);
+ if (right->trace_output == NULL)
+ right->trace_output = get_trace_output(right);
+
+ hists__new_col_len(left->hists, HISTC_TRACE, strlen(left->trace_output));
+ hists__new_col_len(right->hists, HISTC_TRACE, strlen(right->trace_output));
+
+ return strcmp(right->trace_output, left->trace_output);
+}
+
+static int hist_entry__trace_snprintf(struct hist_entry *he, char *bf,
+ size_t size, unsigned int width)
+{
+ struct perf_evsel *evsel;
+
+ evsel = hists_to_evsel(he->hists);
+ if (evsel->attr.type != PERF_TYPE_TRACEPOINT)
+ return scnprintf(bf, size, "%-*.*s", width, width, "N/A");
+
+ if (he->trace_output == NULL)
+ he->trace_output = get_trace_output(he);
+ return repsep_snprintf(bf, size, "%-*.*s", width, width, he->trace_output);
+}
+
+struct sort_entry sort_trace = {
+ .se_header = "Trace output",
+ .se_cmp = sort__trace_cmp,
+ .se_snprintf = hist_entry__trace_snprintf,
+ .se_width_idx = HISTC_TRACE,
+};
+
/* sort keys for branch stacks */

static int64_t
@@ -1314,6 +1373,7 @@ static struct sort_dimension common_sort_dimensions[] = {
DIM(SORT_LOCAL_WEIGHT, "local_weight", sort_local_weight),
DIM(SORT_GLOBAL_WEIGHT, "weight", sort_global_weight),
DIM(SORT_TRANSACTION, "transaction", sort_transaction),
+ DIM(SORT_TRACE, "trace", sort_trace),
};

#undef DIM
@@ -1560,22 +1620,6 @@ static int hde_width(struct hpp_dynamic_entry *hde)
return hde->hpp.len;
}

-static char *get_trace_output(struct hist_entry *he)
-{
- struct trace_seq seq;
- struct perf_evsel *evsel;
- struct pevent_record rec = {
- .data = he->raw_data,
- .size = he->raw_size,
- };
-
- evsel = hists_to_evsel(he->hists);
-
- trace_seq_init(&seq);
- pevent_event_info(&seq, evsel->tp_format, &rec);
- return seq.buffer;
-}
-
static void update_dynamic_len(struct hpp_dynamic_entry *hde,
struct hist_entry *he)
{
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index f6d2a7e3e7f2..6b7590ade229 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -183,6 +183,7 @@ enum sort_type {
SORT_LOCAL_WEIGHT,
SORT_GLOBAL_WEIGHT,
SORT_TRANSACTION,
+ SORT_TRACE,

/* branch stack specific sort keys */
__SORT_BRANCH_STACK,
--
2.6.4

2015-12-22 17:10:03

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 08/13] perf tools: Add --raw-trace option

The --raw-trace option is to prevent pretty printing by event's
print_fmt or plugin. Besides that, each dynamic sort key now receives
'raw' suffix separated by '/' to apply the raw trace to a specific
field.

$ perf report -s comm,kmem:kmalloc.gfp_flags
...
# Overhead Command gfp_flags
# ........ ....... ...................
#
99.89% perf GFP_NOFS|GFP_ZERO
0.06% sleep GFP_KERNEL
0.03% perf GFP_KERNEL|GFP_ZERO
0.01% perf GFP_KERNEL

Now

$ perf report -s comm,kmem:kmalloc.gfp_flags --raw-trace
or
$ perf report -s comm,kmem:kmalloc.gfp_flags/raw
...
# Overhead Command gfp_flags
# ........ ....... ..........
#
99.89% perf 32848
0.06% sleep 208
0.03% perf 32976
0.01% perf 208

Suggested-by: Jiri Olsa <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 3 +++
tools/perf/Documentation/perf-top.txt | 3 +++
tools/perf/builtin-report.c | 2 ++
tools/perf/builtin-top.c | 2 ++
tools/perf/util/sort.c | 31 ++++++++++++++++++++++++++++---
tools/perf/util/symbol.h | 3 ++-
6 files changed, 40 insertions(+), 4 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index dab99ed2b339..ae7cd91727f6 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -371,6 +371,9 @@ include::itrace.txt[]
--socket-filter::
Only report the samples on the processor socket that match with this filter

+--raw-trace::
+ When displaying traceevent output, do not use print fmt or plugins.
+
include::callchain-overhead-calculation.txt[]

SEE ALSO
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 556cec09bf50..b0e60e17db38 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -230,6 +230,9 @@ Default is to monitor all CPUS.
The various filters must be specified as a comma separated list: --branch-filter any_ret,u,k
Note that this feature may not be available on all processors.

+--raw-trace::
+ When displaying traceevent output, do not use print fmt or plugins.
+
INTERACTIVE PROMPTING KEYS
--------------------------

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index ea53c816640a..f10c663af996 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -788,6 +788,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
"Show callgraph from reference event"),
OPT_INTEGER(0, "socket-filter", &report.socket_filter,
"only show processor socket that match with this filter"),
+ OPT_BOOLEAN(0, "raw-trace", &symbol_conf.raw_trace,
+ "Show raw trace event output (do not use print fmt or plugins)"),
OPT_END()
};
struct perf_data_file file = {
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 005825942441..bf01cbb0ef23 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -1210,6 +1210,8 @@ int cmd_top(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_CALLBACK('j', "branch-filter", &opts->branch_stack,
"branch filter mask", "branch stack filter modes",
parse_branch_stack),
+ OPT_BOOLEAN(0, "raw-trace", &symbol_conf.raw_trace,
+ "Show raw trace event output (do not use print fmt or plugins)"),
OPT_END()
};
const char * const top_usage[] = {
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 582f632353bc..df9e5989e209 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -459,7 +459,11 @@ static char *get_trace_output(struct hist_entry *he)
evsel = hists_to_evsel(he->hists);

trace_seq_init(&seq);
- pevent_event_info(&seq, evsel->tp_format, &rec);
+ if (symbol_conf.raw_trace)
+ print_event_fields(&seq, he->raw_data, he->raw_size,
+ evsel->tp_format);
+ else
+ pevent_event_info(&seq, evsel->tp_format, &rec);
return seq.buffer;
}

@@ -1596,6 +1600,7 @@ struct hpp_dynamic_entry {
struct perf_evsel *evsel;
struct format_field *field;
unsigned dynamic_len;
+ bool raw_trace;
};

static int hde_width(struct hpp_dynamic_entry *hde)
@@ -1628,6 +1633,9 @@ static void update_dynamic_len(struct hpp_dynamic_entry *hde,
size_t namelen;
bool last = false;

+ if (hde->raw_trace)
+ return;
+
/* parse pretty print result and update max length */
if (!he->trace_output)
he->trace_output = get_trace_output(he);
@@ -1708,8 +1716,10 @@ static int __sort__hde_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
if (hists_to_evsel(he->hists) != hde->evsel)
return scnprintf(hpp->buf, hpp->size, "%*.*s", len, len, "N/A");

- field = hde->field;
+ if (hde->raw_trace)
+ goto raw_field;

+ field = hde->field;
namelen = strlen(field->name);
str = he->trace_output;

@@ -1738,6 +1748,7 @@ static int __sort__hde_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,

if (str == NULL) {
struct trace_seq seq;
+raw_field:
trace_seq_init(&seq);
print_event_field(&seq, he->raw_data, hde->field);
str = seq.buffer;
@@ -1818,10 +1829,11 @@ __alloc_dynamic_entry(struct perf_evsel *evsel, struct format_field *field)

static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
{
- char *str, *event_name, *field_name;
+ char *str, *event_name, *field_name, *raw_opt;
struct perf_evsel *evsel, *pos;
struct format_field *field;
struct hpp_dynamic_entry *hde;
+ bool raw_trace = symbol_conf.raw_trace;
int ret = 0;

if (evlist == NULL)
@@ -1839,6 +1851,18 @@ static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
}
*field_name++ = '\0';

+ raw_opt = strchr(field_name, '/');
+ if (raw_opt) {
+ *raw_opt++ = '\0';
+
+ if (strcmp(raw_opt, "raw")) {
+ pr_err("Unsupported field option %s\n", raw_opt);
+ ret = -EINVAL;
+ goto out;
+ }
+ raw_trace = true;
+ }
+
evsel = NULL;
evlist__for_each(evlist, pos) {
if (!strcmp(pos->name, event_name)) {
@@ -1872,6 +1896,7 @@ static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
ret = -ENOMEM;
goto out;
}
+ hde->raw_trace = raw_trace;

perf_hpp__register_sort_field(&hde->hpp);

diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 857f707ac12b..ccd1caa40e11 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -109,7 +109,8 @@ struct symbol_conf {
branch_callstack,
has_filter,
show_ref_callgraph,
- hide_unresolved;
+ hide_unresolved,
+ raw_trace;
const char *vmlinux_name,
*kallsyms_name,
*source_prefix,
--
2.6.4

2015-12-22 17:08:31

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 09/13] perf tools: Support shortcuts for events in dynamic sort keys

The dynamic sort key requires event name but specifying full event name
is rather inconvenient. This patch adds more ways to identify the event
in a more compact way.

1. If session has just one event, event name can be omitted.
2. Events can be accessed by index preceded by a percent sign.
3. A part of the name can be used, if it's not ambiguous. The partial
name should not contain ':' in it.
4. Full system + event name is still used, it should contain ':'.

So in the below example all does same thing:

$ perf record -e sched:sched_switch -a sleep 1

$ perf report -s next_pid,next_comm
$ perf report -s %1.next_pid,%1.next_comm
$ perf report -s switch.next_pid,switch.next_comm
$ perf report -s sched:sched_switch.next_pid,sched:sched_switch_.next_comm

Cc: Steven Rostedt <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/sort.c | 107 ++++++++++++++++++++++++++++++++++++++++---------
1 file changed, 87 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index df9e5989e209..78496f0555ec 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1827,10 +1827,90 @@ __alloc_dynamic_entry(struct perf_evsel *evsel, struct format_field *field)
return hde;
}

+static int parse_field_name(char *str, char **event, char **field, char **opt)
+{
+ char *event_name, *field_name, *opt_name;
+
+ event_name = str;
+ field_name = strchr(str, '.');
+
+ if (field_name) {
+ *field_name++ = '\0';
+ } else {
+ event_name = NULL;
+ field_name = str;
+ }
+
+ opt_name = strchr(field_name, '/');
+ if (opt_name)
+ *opt_name++ = '\0';
+
+ *event = event_name;
+ *field = field_name;
+ *opt = opt_name;
+
+ return 0;
+}
+
+/* find match evsel using a given event name. The event name can be:
+ * 1. NULL - only valid for single event session
+ * 2. '%' + event index (e.g. '%1' for first event)
+ * 3. full event name (e.g. sched:sched_switch)
+ * 4. partial event name (should not contain ':')
+ */
+static struct perf_evsel *find_evsel(struct perf_evlist *evlist, char *event_name)
+{
+ struct perf_evsel *evsel = NULL;
+ struct perf_evsel *pos;
+ bool full_name;
+
+ /* case 1 */
+ if (event_name == NULL) {
+ if (evlist->nr_entries != 1) {
+ pr_debug("event name should be given\n");
+ return NULL;
+ }
+
+ return perf_evlist__first(evlist);
+ }
+
+ /* case 2 */
+ if (event_name[0] == '%') {
+ int nr = strtol(event_name+1, NULL, 0);
+
+ if (nr > evlist->nr_entries)
+ return NULL;
+
+ evsel = perf_evlist__first(evlist);
+ while (--nr > 0)
+ evsel = perf_evsel__next(evsel);
+
+ return evsel;
+ }
+
+ full_name = !!strchr(event_name, ':');
+ evlist__for_each(evlist, pos) {
+ /* case 3 */
+ if (full_name && !strcmp(pos->name, event_name))
+ return pos;
+ /* case 4 */
+ if (!full_name && strstr(pos->name, event_name)) {
+ if (evsel) {
+ pr_debug("'%s' event is ambiguous: it can be %s or %s\n",
+ event_name, evsel->name, pos->name);
+ return NULL;
+ }
+ evsel = pos;
+ }
+ }
+
+ return evsel;
+}
+
static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
{
- char *str, *event_name, *field_name, *raw_opt;
- struct perf_evsel *evsel, *pos;
+ char *str, *event_name, *field_name, *opt_name;
+ struct perf_evsel *evsel;
struct format_field *field;
struct hpp_dynamic_entry *hde;
bool raw_trace = symbol_conf.raw_trace;
@@ -1843,34 +1923,21 @@ static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
if (str == NULL)
return -ENOMEM;

- event_name = str;
- field_name = strchr(str, '.');
- if (field_name == NULL) {
+ if (parse_field_name(str, &event_name, &field_name, &opt_name) < 0) {
ret = -EINVAL;
goto out;
}
- *field_name++ = '\0';

- raw_opt = strchr(field_name, '/');
- if (raw_opt) {
- *raw_opt++ = '\0';
-
- if (strcmp(raw_opt, "raw")) {
- pr_err("Unsupported field option %s\n", raw_opt);
+ if (opt_name) {
+ if (strcmp(opt_name, "raw")) {
+ pr_debug("unsupported field option %s\n", opt_name);
ret = -EINVAL;
goto out;
}
raw_trace = true;
}

- evsel = NULL;
- evlist__for_each(evlist, pos) {
- if (!strcmp(pos->name, event_name)) {
- evsel = pos;
- break;
- }
- }
-
+ evsel = find_evsel(evlist, event_name);
if (evsel == NULL) {
pr_debug("Cannot find event: %s\n", event_name);
ret = -ENOENT;
--
2.6.4

2015-12-22 17:09:32

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 10/13] perf tools: Support '<event>.*' dynamic sort key

Support '*' character for field name to add all (non-common) fields as
sort keys easily.

$ perf report -s 'switch.*' --stdio
...
# Overhead prev_comm prev_pid prev_prio prev_state next_comm next_pid next_prio
# ........ ........... ......... ......... .......... ............ ........ .........
#
3.82% swapper/0 0 120 0 netctl-auto 18711 120
3.75% netctl-auto 18711 120 1 swapper/0 0 120
2.24% swapper/1 0 120 0 netctl-auto 18709 120
2.24% netctl-auto 18709 120 1 swapper/1 0 120
1.80% swapper/2 0 120 0 rcu_preempt 7 120
1.80% swapper/2 0 120 0 netctl-auto 18711 120
1.80% rcu_preempt 7 120 1 swapper/2 0 120
1.80% netctl-auto 18711 120 1 swapper/2 0 120
...

Suggested-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/sort.c | 49 ++++++++++++++++++++++++++++++++++---------------
1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 78496f0555ec..297383e0abc5 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1907,12 +1907,27 @@ static struct perf_evsel *find_evsel(struct perf_evlist *evlist, char *event_nam
return evsel;
}

+static int __dynamic_dimension__add(struct perf_evsel *evsel,
+ struct format_field *field,
+ bool raw_trace)
+{
+ struct hpp_dynamic_entry *hde;
+
+ hde = __alloc_dynamic_entry(evsel, field);
+ if (hde == NULL)
+ return -ENOMEM;
+
+ hde->raw_trace = raw_trace;
+
+ perf_hpp__register_sort_field(&hde->hpp);
+ return 0;
+}
+
static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
{
char *str, *event_name, *field_name, *opt_name;
struct perf_evsel *evsel;
struct format_field *field;
- struct hpp_dynamic_entry *hde;
bool raw_trace = symbol_conf.raw_trace;
int ret = 0;

@@ -1950,22 +1965,26 @@ static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
goto out;
}

- field = pevent_find_any_field(evsel->tp_format, field_name);
- if (field == NULL) {
- pr_debug("Cannot find event field for %s.%s\n",
- event_name, field_name);
- ret = -ENOENT;
- goto out;
- }
+ if (!strcmp(field_name, "*")) {
+ field = evsel->tp_format->format.fields;

- hde = __alloc_dynamic_entry(evsel, field);
- if (hde == NULL) {
- ret = -ENOMEM;
- goto out;
- }
- hde->raw_trace = raw_trace;
+ while (field) {
+ ret = __dynamic_dimension__add(evsel, field, raw_trace);
+ if (ret < 0)
+ goto out;

- perf_hpp__register_sort_field(&hde->hpp);
+ field = field->next;
+ }
+ } else {
+ field = pevent_find_any_field(evsel->tp_format, field_name);
+ if (field == NULL) {
+ pr_debug("Cannot find event field for %s.%s\n",
+ event_name, field_name);
+ return -ENOENT;
+ }
+
+ ret = __dynamic_dimension__add(evsel, field, raw_trace);
+ }

out:
free(str);
--
2.6.4

2015-12-22 17:08:39

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 11/13] perf tools: Skip dynamic fields not defined for current event

When there're multiple events, each dynamic sort key is defined just for
one event. In this case other events will always show "N/A" for those
fields. But they are meaningless and consume precious screen width.

Let's skip those undefined dynamic fields.

$ perf record -e kmem:kmalloc,kmem:kfree -a sleep 1

$ perf report -s 'comm,kmalloc.*' --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 20K of event 'kmem:kmalloc'
# Event count (approx.): 20533
#
# Overhead Command call_site ptr bytes_req bytes_alloc gfp_flags
# ........ ....... .................. .................. ......... ........... ...................
#
99.89% perf ffffffffa01d4396 0xffff8803ffb79720 96 96 GFP_NOFS|GFP_ZERO
0.06% sleep ffffffff8114e1cd 0xffff8803d228a000 4096 4096 GFP_KERNEL
0.03% perf ffffffff811d6ae6 0xffff8803f7678f00 240 256 GFP_KERNEL|GFP_ZERO
0.00% perf ffffffff812263c1 0xffff880406172380 128 128 GFP_KERNEL
0.00% perf ffffffff812264b9 0xffff8803ffac1600 504 512 GFP_KERNEL
0.00% perf ffffffff81226634 0xffff880401dc5280 28 32 GFP_KERNEL
0.00% sleep ffffffff81226da9 0xffff8803ffac3a00 392 512 GFP_KERNEL

# Samples: 20K of event 'kmem:kfree'
# Event count (approx.): 20597
#
# Overhead Command
# ........ ..............
#
99.63% perf
0.14% sleep
0.11% irq/36-iwlwifi
0.11% kworker/u16:0
0.01% Xorg
0.00% firefox

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/browsers/hists.c | 7 ++++---
tools/perf/ui/gtk/hists.c | 4 ++--
tools/perf/ui/hist.c | 2 +-
tools/perf/ui/stdio/hist.c | 6 +++---
tools/perf/util/hist.c | 2 +-
tools/perf/util/hist.h | 14 ++++++++++++--
tools/perf/util/sort.c | 20 ++++++++++++++------
7 files changed, 37 insertions(+), 18 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index ec331969b7d7..901d481e6cea 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -1041,7 +1041,8 @@ static int hist_browser__show_entry(struct hist_browser *browser,
hist_browser__gotorc(browser, row, 0);

perf_hpp__for_each_format(fmt) {
- if (perf_hpp__should_skip(fmt) || column++ < browser->b.horiz_scroll)
+ if (perf_hpp__should_skip(fmt, entry->hists) ||
+ column++ < browser->b.horiz_scroll)
continue;

if (current_entry && browser->b.navkeypressed) {
@@ -1144,7 +1145,7 @@ static int hists_browser__scnprintf_headers(struct hist_browser *browser, char *
}

perf_hpp__for_each_format(fmt) {
- if (perf_hpp__should_skip(fmt) || column++ < browser->b.horiz_scroll)
+ if (perf_hpp__should_skip(fmt, hists) || column++ < browser->b.horiz_scroll)
continue;

ret = fmt->header(fmt, &dummy_hpp, hists_to_evsel(hists));
@@ -1414,7 +1415,7 @@ static int hist_browser__fprintf_entry(struct hist_browser *browser,
printed += fprintf(fp, "%c ", folded_sign);

perf_hpp__for_each_format(fmt) {
- if (perf_hpp__should_skip(fmt))
+ if (perf_hpp__should_skip(fmt, he->hists))
continue;

if (!first) {
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 467717276ab6..0f8dcfdfb10f 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -318,7 +318,7 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
col_idx = 0;

perf_hpp__for_each_format(fmt) {
- if (perf_hpp__should_skip(fmt))
+ if (perf_hpp__should_skip(fmt, hists))
continue;

/*
@@ -368,7 +368,7 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
col_idx = 0;

perf_hpp__for_each_format(fmt) {
- if (perf_hpp__should_skip(fmt))
+ if (perf_hpp__should_skip(fmt, h->hists))
continue;

if (fmt->color)
diff --git a/tools/perf/ui/hist.c b/tools/perf/ui/hist.c
index 5029ba2b55af..8263c0eb9fb5 100644
--- a/tools/perf/ui/hist.c
+++ b/tools/perf/ui/hist.c
@@ -619,7 +619,7 @@ unsigned int hists__sort_list_width(struct hists *hists)
struct perf_hpp dummy_hpp;

perf_hpp__for_each_format(fmt) {
- if (perf_hpp__should_skip(fmt))
+ if (perf_hpp__should_skip(fmt, hists))
continue;

if (first)
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 7ebc661be267..387110d50b00 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -385,7 +385,7 @@ static int hist_entry__snprintf(struct hist_entry *he, struct perf_hpp *hpp)
return 0;

perf_hpp__for_each_format(fmt) {
- if (perf_hpp__should_skip(fmt))
+ if (perf_hpp__should_skip(fmt, he->hists))
continue;

/*
@@ -464,7 +464,7 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
fprintf(fp, "# ");

perf_hpp__for_each_format(fmt) {
- if (perf_hpp__should_skip(fmt))
+ if (perf_hpp__should_skip(fmt, hists))
continue;

if (!first)
@@ -490,7 +490,7 @@ size_t hists__fprintf(struct hists *hists, bool show_header, int max_rows,
perf_hpp__for_each_format(fmt) {
unsigned int i;

- if (perf_hpp__should_skip(fmt))
+ if (perf_hpp__should_skip(fmt, hists))
continue;

if (!first)
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index ef1a4a5cc610..767a1a4f048e 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1092,7 +1092,7 @@ static int hist_entry__sort(struct hist_entry *a, struct hist_entry *b)
int64_t cmp = 0;

perf_hpp__for_each_sort_list(fmt) {
- if (perf_hpp__should_skip(fmt))
+ if (perf_hpp__should_skip(fmt, a->hists))
continue;

cmp = fmt->sort(fmt, a, b);
diff --git a/tools/perf/util/hist.h b/tools/perf/util/hist.h
index 15b22c563d30..cb8f37349972 100644
--- a/tools/perf/util/hist.h
+++ b/tools/perf/util/hist.h
@@ -262,10 +262,20 @@ void perf_hpp__append_sort_keys(void);

bool perf_hpp__is_sort_entry(struct perf_hpp_fmt *format);
bool perf_hpp__same_sort_entry(struct perf_hpp_fmt *a, struct perf_hpp_fmt *b);
+bool perf_hpp__is_dynamic_entry(struct perf_hpp_fmt *format);
+bool perf_hpp__defined_dynamic_entry(struct perf_hpp_fmt *fmt, struct hists *hists);

-static inline bool perf_hpp__should_skip(struct perf_hpp_fmt *format)
+static inline bool perf_hpp__should_skip(struct perf_hpp_fmt *format,
+ struct hists *hists)
{
- return format->elide;
+ if (format->elide)
+ return true;
+
+ if (perf_hpp__is_dynamic_entry(format) &&
+ !perf_hpp__defined_dynamic_entry(format, hists))
+ return true;
+
+ return false;
}

void perf_hpp__reset_width(struct perf_hpp_fmt *fmt, struct hists *hists);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 297383e0abc5..58a84a1c889a 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1697,6 +1697,15 @@ static int __sort__hde_width(struct perf_hpp_fmt *fmt,
return len;
}

+bool perf_hpp__defined_dynamic_entry(struct perf_hpp_fmt *fmt, struct hists *hists)
+{
+ struct hpp_dynamic_entry *hde;
+
+ hde = container_of(fmt, struct hpp_dynamic_entry, hpp);
+
+ return hists_to_evsel(hists) == hde->evsel;
+}
+
static int __sort__hde_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
struct hist_entry *he)
{
@@ -1713,9 +1722,6 @@ static int __sort__hde_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
if (!len)
len = hde_width(hde);

- if (hists_to_evsel(he->hists) != hde->evsel)
- return scnprintf(hpp->buf, hpp->size, "%*.*s", len, len, "N/A");
-
if (hde->raw_trace)
goto raw_field;

@@ -1768,9 +1774,6 @@ static int64_t __sort__hde_cmp(struct perf_hpp_fmt *fmt,

hde = container_of(fmt, struct hpp_dynamic_entry, hpp);

- if (hists_to_evsel(a->hists) != hde->evsel)
- return 0;
-
field = hde->field;
if (field->flags & FIELD_IS_DYNAMIC) {
unsigned long long dyn;
@@ -1793,6 +1796,11 @@ static int64_t __sort__hde_cmp(struct perf_hpp_fmt *fmt,
return memcmp(a->raw_data + offset, b->raw_data + offset, size);
}

+bool perf_hpp__is_dynamic_entry(struct perf_hpp_fmt *fmt)
+{
+ return fmt->cmp == __sort__hde_cmp;
+}
+
static struct hpp_dynamic_entry *
__alloc_dynamic_entry(struct perf_evsel *evsel, struct format_field *field)
{
--
2.6.4

2015-12-22 17:09:10

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 12/13] perf tools: Add 'trace_fields' dynamic sort key

The 'trace_fields' sort key is similar as 'trace' sort key, but it shows
each fields separately. Each event will get different columns as their
fields.

$ perf report -s trace_fields --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 20K of event 'kmem:kmalloc'
# Event count (approx.): 20533
#
# Overhead Command call_site ptr bytes_req bytes_alloc gfp_flags
# ........ ....... .................. .................. ......... ........... ...................
#
99.89% perf ffffffffa01d4396 0xffff8803ffb79720 96 96 GFP_NOFS|GFP_ZERO
0.06% sleep ffffffff8114e1cd 0xffff8803d228a000 4096 4096 GFP_KERNEL
0.03% perf ffffffff811d6ae6 0xffff8803f7678f00 240 256 GFP_KERNEL|GFP_ZERO
0.00% perf ffffffff812263c1 0xffff880406172380 128 128 GFP_KERNEL
0.00% perf ffffffff812264b9 0xffff8803ffac1600 504 512 GFP_KERNEL
0.00% perf ffffffff81226634 0xffff880401dc5280 28 32 GFP_KERNEL
0.00% sleep ffffffff81226da9 0xffff8803ffac3a00 392 512 GFP_KERNEL

# Samples: 20K of event 'kmem:kfree'
# Event count (approx.): 20597
#
# Overhead call_site ptr
# ........ .................. ..................
#
99.58% ffffffffa01d85ad 0xffff8803ffb79720
0.07% ffffffff81443f5c 0xffff8803f7669400
0.02% ffffffff811d5753 0xffff8803f7678f00
0.01% ffffffff81443f5c 0xffff8803f766be00
0.01% ffffffff8114e359 0xffff8803d228a000
0.01% ffffffff81443f5c 0xffff8800d156dc00
0.01% ffffffff81443f5c 0xffff8803f7669400
0.01% ffffffff8114e359 0xffff8803d228a000
0.01% ffffffff8114e359 0xffff8803d228a000
0.01% ffffffff8114e359 0xffff8803d228a000

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/sort.c | 44 +++++++++++++++++++++++++++++++++++---------
1 file changed, 35 insertions(+), 9 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 58a84a1c889a..ae995f37e135 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1931,6 +1931,35 @@ static int __dynamic_dimension__add(struct perf_evsel *evsel,
return 0;
}

+static int add_evsel_fields(struct perf_evsel *evsel, bool raw_trace)
+{
+ int ret;
+ struct format_field *field;
+
+ field = evsel->tp_format->format.fields;
+ while (field) {
+ ret = __dynamic_dimension__add(evsel, field, raw_trace);
+ if (ret < 0)
+ return ret;
+
+ field = field->next;
+ }
+ return 0;
+}
+
+static int add_all_dynamic_fields(struct perf_evlist *evlist, bool raw_trace)
+{
+ int ret;
+ struct perf_evsel *evsel;
+
+ evlist__for_each(evlist, evsel) {
+ ret = add_evsel_fields(evsel, raw_trace);
+ if (ret < 0)
+ return ret;
+ }
+ return 0;
+}
+
static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
{
char *str, *event_name, *field_name, *opt_name;
@@ -1960,6 +1989,11 @@ static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
raw_trace = true;
}

+ if (!strcmp(field_name, "trace_fields")) {
+ ret = add_all_dynamic_fields(evlist ,raw_trace);
+ goto out;
+ }
+
evsel = find_evsel(evlist, event_name);
if (evsel == NULL) {
pr_debug("Cannot find event: %s\n", event_name);
@@ -1974,15 +2008,7 @@ static int add_dynamic_entry(struct perf_evlist *evlist, const char *tok)
}

if (!strcmp(field_name, "*")) {
- field = evsel->tp_format->format.fields;
-
- while (field) {
- ret = __dynamic_dimension__add(evsel, field, raw_trace);
- if (ret < 0)
- goto out;
-
- field = field->next;
- }
+ ret = add_evsel_fields(evsel, raw_trace);
} else {
field = pevent_find_any_field(evsel->tp_format, field_name);
if (field == NULL) {
--
2.6.4

2015-12-22 17:08:44

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 13/13] perf tools: Make 'trace' or 'trace_fields' sort key default for tracepoint events

When an evlist contains tracepoint events only, use 'trace' sort key as
default. If --raw-trace option was given, use 'trace_fields' instead.
This will make users more convenient to see trace result.

Suggested-by: Jiri Olsa <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/sort.c | 27 ++++++++++++++++++++++-----
tools/perf/util/sort.h | 1 +
2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index ae995f37e135..18927984fe8e 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -15,6 +15,7 @@ const char default_branch_sort_order[] = "comm,dso_from,symbol_from,symbol_to,cy
const char default_mem_sort_order[] = "local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked";
const char default_top_sort_order[] = "dso,symbol";
const char default_diff_sort_order[] = "dso,symbol";
+const char default_tracepoint_sort_order[] = "trace";
const char *sort_order;
const char *field_order;
regex_t ignore_callees_regex;
@@ -2167,7 +2168,7 @@ static int sort_dimension__add(const char *tok,
return -ESRCH;
}

-static const char *get_default_sort_order(void)
+static const char *get_default_sort_order(struct perf_evlist *evlist)
{
const char *default_sort_orders[] = {
default_sort_order,
@@ -2175,14 +2176,30 @@ static const char *get_default_sort_order(void)
default_mem_sort_order,
default_top_sort_order,
default_diff_sort_order,
+ default_tracepoint_sort_order,
};
+ bool use_trace = true;
+ struct perf_evsel *evsel;

BUG_ON(sort__mode >= ARRAY_SIZE(default_sort_orders));

+ evlist__for_each(evlist, evsel) {
+ if (evsel->attr.type != PERF_TYPE_TRACEPOINT) {
+ use_trace = false;
+ break;
+ }
+ }
+
+ if (use_trace) {
+ sort__mode = SORT_MODE__TRACEPOINT;
+ if (symbol_conf.raw_trace)
+ return "trace_fields";
+ }
+
return default_sort_orders[sort__mode];
}

-static int setup_sort_order(void)
+static int setup_sort_order(struct perf_evlist *evlist)
{
char *new_sort_order;

@@ -2203,7 +2220,7 @@ static int setup_sort_order(void)
* because it's checked over the rest of the code.
*/
if (asprintf(&new_sort_order, "%s,%s",
- get_default_sort_order(), sort_order + 1) < 0) {
+ get_default_sort_order(evlist), sort_order + 1) < 0) {
error("Not enough memory to set up --sort");
return -ENOMEM;
}
@@ -2218,7 +2235,7 @@ static int __setup_sorting(struct perf_evlist *evlist)
const char *sort_keys;
int ret = 0;

- ret = setup_sort_order();
+ ret = setup_sort_order(evlist);
if (ret)
return ret;

@@ -2232,7 +2249,7 @@ static int __setup_sorting(struct perf_evlist *evlist)
return 0;
}

- sort_keys = get_default_sort_order();
+ sort_keys = get_default_sort_order(evlist);
}

str = strdup(sort_keys);
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 6b7590ade229..dec536b6ab3d 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -167,6 +167,7 @@ enum sort_mode {
SORT_MODE__MEMORY,
SORT_MODE__TOP,
SORT_MODE__DIFF,
+ SORT_MODE__TRACEPOINT,
};

enum sort_type {
--
2.6.4

2015-12-23 08:46:31

by Jiri Olsa

[permalink] [raw]
Subject: Re: [PATCHSET 00/10] perf tools: Support dynamic sort keys for tracepoints (v4)

On Wed, Dec 23, 2015 at 02:06:57AM +0900, Namhyung Kim wrote:
> Hello,
>
> This is an attempt to improve perf to deal with tracepoint events
> better. The perf tools can handle tracepoint events but perf report
> on them is less useful since they're always sampled in a fixed
> location and not provide event specific info. We can use perf script
> but I always wishes there's more convenient way to see the result.
>
> * changes in v4)
> - support '<event>.*' syntax for dynamic sort key
> - skip unrelated dynamic fields in the output
> - add 'trace_fields' sort key (Jiri)
> - reuse get_trace_output function (Jiri)

real beauty now ;-)

Acked-by: Jiri Olsa <[email protected]>

thanks,
jirka

2015-12-23 08:52:49

by Wang Nan

[permalink] [raw]
Subject: Re: [PATCH 03/13] tools lib traceevent: Factor out and export print_event_field[s]



On 2015/12/23 1:07, Namhyung Kim wrote:
> The print_event_field() and print_event_field() are to print basic

Should be

The print_event_field() and print_event_fields() are to print basic ...

lost an 's'.

> information of a given field or event without the print format. They'll
> be used by dynamic sort keys later.
>
> Cc: Steven Rostedt <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>

Thank you.

2015-12-23 13:09:30

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH v4.1 03/13] tools lib traceevent: Factor out and export print_event_field[s]

The print_event_field() and print_event_fields() are to print basic
information of a given field or event without the print format. They'll
be used by dynamic sort keys later.

Cc: Steven Rostedt <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
Fix typo. The branch was also updated.

tools/lib/traceevent/event-parse.c | 125 ++++++++++++++++++++-----------------
tools/lib/traceevent/event-parse.h | 4 ++
2 files changed, 70 insertions(+), 59 deletions(-)

diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 68276f35e323..1b43f2b9aebe 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -4735,73 +4735,80 @@ static int is_printable_array(char *p, unsigned int len)
return 1;
}

-static void print_event_fields(struct trace_seq *s, void *data,
- int size __maybe_unused,
- struct event_format *event)
+void print_event_field(struct trace_seq *s, void *data,
+ struct format_field *field)
{
- struct format_field *field;
unsigned long long val;
unsigned int offset, len, i;
-
- field = event->format.fields;
- while (field) {
- trace_seq_printf(s, " %s=", field->name);
- if (field->flags & FIELD_IS_ARRAY) {
- offset = field->offset;
- len = field->size;
- if (field->flags & FIELD_IS_DYNAMIC) {
- val = pevent_read_number(event->pevent, data + offset, len);
- offset = val;
- len = offset >> 16;
- offset &= 0xffff;
- }
- if (field->flags & FIELD_IS_STRING &&
- is_printable_array(data + offset, len)) {
- trace_seq_printf(s, "%s", (char *)data + offset);
- } else {
- trace_seq_puts(s, "ARRAY[");
- for (i = 0; i < len; i++) {
- if (i)
- trace_seq_puts(s, ", ");
- trace_seq_printf(s, "%02x",
- *((unsigned char *)data + offset + i));
- }
- trace_seq_putc(s, ']');
- field->flags &= ~FIELD_IS_STRING;
- }
+ struct pevent *pevent = field->event->pevent;
+
+ if (field->flags & FIELD_IS_ARRAY) {
+ offset = field->offset;
+ len = field->size;
+ if (field->flags & FIELD_IS_DYNAMIC) {
+ val = pevent_read_number(pevent, data + offset, len);
+ offset = val;
+ len = offset >> 16;
+ offset &= 0xffff;
+ }
+ if (field->flags & FIELD_IS_STRING &&
+ is_printable_array(data + offset, len)) {
+ trace_seq_printf(s, "%s", (char *)data + offset);
} else {
- val = pevent_read_number(event->pevent, data + field->offset,
- field->size);
- if (field->flags & FIELD_IS_POINTER) {
- trace_seq_printf(s, "0x%llx", val);
- } else if (field->flags & FIELD_IS_SIGNED) {
- switch (field->size) {
- case 4:
- /*
- * If field is long then print it in hex.
- * A long usually stores pointers.
- */
- if (field->flags & FIELD_IS_LONG)
- trace_seq_printf(s, "0x%x", (int)val);
- else
- trace_seq_printf(s, "%d", (int)val);
- break;
- case 2:
- trace_seq_printf(s, "%2d", (short)val);
- break;
- case 1:
- trace_seq_printf(s, "%1d", (char)val);
- break;
- default:
- trace_seq_printf(s, "%lld", val);
- }
- } else {
+ trace_seq_puts(s, "ARRAY[");
+ for (i = 0; i < len; i++) {
+ if (i)
+ trace_seq_puts(s, ", ");
+ trace_seq_printf(s, "%02x",
+ *((unsigned char *)data + offset + i));
+ }
+ trace_seq_putc(s, ']');
+ field->flags &= ~FIELD_IS_STRING;
+ }
+ } else {
+ val = pevent_read_number(pevent, data + field->offset,
+ field->size);
+ if (field->flags & FIELD_IS_POINTER) {
+ trace_seq_printf(s, "0x%llx", val);
+ } else if (field->flags & FIELD_IS_SIGNED) {
+ switch (field->size) {
+ case 4:
+ /*
+ * If field is long then print it in hex.
+ * A long usually stores pointers.
+ */
if (field->flags & FIELD_IS_LONG)
- trace_seq_printf(s, "0x%llx", val);
+ trace_seq_printf(s, "0x%x", (int)val);
else
- trace_seq_printf(s, "%llu", val);
+ trace_seq_printf(s, "%d", (int)val);
+ break;
+ case 2:
+ trace_seq_printf(s, "%2d", (short)val);
+ break;
+ case 1:
+ trace_seq_printf(s, "%1d", (char)val);
+ break;
+ default:
+ trace_seq_printf(s, "%lld", val);
}
+ } else {
+ if (field->flags & FIELD_IS_LONG)
+ trace_seq_printf(s, "0x%llx", val);
+ else
+ trace_seq_printf(s, "%llu", val);
}
+ }
+}
+
+void print_event_fields(struct trace_seq *s, void *data,
+ int size __maybe_unused, struct event_format *event)
+{
+ struct format_field *field;
+
+ field = event->format.fields;
+ while (field) {
+ trace_seq_printf(s, " %s=", field->name);
+ print_event_field(s, data, field);
field = field->next;
}
}
diff --git a/tools/lib/traceevent/event-parse.h b/tools/lib/traceevent/event-parse.h
index 6fc83c7edbe9..600c73277a6f 100644
--- a/tools/lib/traceevent/event-parse.h
+++ b/tools/lib/traceevent/event-parse.h
@@ -705,6 +705,10 @@ struct cmdline *pevent_data_pid_from_comm(struct pevent *pevent, const char *com
struct cmdline *next);
int pevent_cmdline_pid(struct pevent *pevent, struct cmdline *cmdline);

+void print_event_field(struct trace_seq *s, void *data,
+ struct format_field *field);
+void print_event_fields(struct trace_seq *s, void *data,
+ int size __maybe_unused, struct event_format *event);
void pevent_event_info(struct trace_seq *s, struct event_format *event,
struct pevent_record *record);
int pevent_strerror(struct pevent *pevent, enum pevent_errno errnum,
--
2.6.4

2015-12-23 13:11:10

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCHSET 00/10] perf tools: Support dynamic sort keys for tracepoints (v4)

Hi Jiri,

On Wed, Dec 23, 2015 at 09:46:23AM +0100, Jiri Olsa wrote:
> On Wed, Dec 23, 2015 at 02:06:57AM +0900, Namhyung Kim wrote:
> > Hello,
> >
> > This is an attempt to improve perf to deal with tracepoint events
> > better. The perf tools can handle tracepoint events but perf report
> > on them is less useful since they're always sampled in a fixed
> > location and not provide event specific info. We can use perf script
> > but I always wishes there's more convenient way to see the result.
> >
> > * changes in v4)
> > - support '<event>.*' syntax for dynamic sort key
> > - skip unrelated dynamic fields in the output
> > - add 'trace_fields' sort key (Jiri)
> > - reuse get_trace_output function (Jiri)
>
> real beauty now ;-)
>
> Acked-by: Jiri Olsa <[email protected]>

Thanks for your review and feedback!
Namhyung

2015-12-23 21:43:41

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 02/13] perf hist: Save raw_data/size for tracepoint events

Em Wed, Dec 23, 2015 at 02:06:59AM +0900, Namhyung Kim escreveu:
> The raw_data and raw_size fields are to provide tracepoint specific
> information. They will be used by dynamic sort keys later.
>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/util/hist.c | 4 ++++
> tools/perf/util/sort.h | 2 ++
> 2 files changed, 6 insertions(+)
>
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 039bb91d0a92..c0c92a3daa69 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -487,6 +487,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
> .branch_info = bi,
> .mem_info = mi,
> .transaction = sample->transaction,
> + .raw_data = sample->raw_data,
> + .raw_size = sample->raw_size,

So, sample->raw_data is just a pointer to perf_event_sample->array, that
may be overwritten, no?

Looking at the other patches.

- Arnaldo

> };
>
> return hists__findnew_entry(hists, &entry, al, sample_self);
> @@ -801,6 +803,8 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
> .sym = al->sym,
> },
> .parent = iter->parent,
> + .raw_data = sample->raw_data,
> + .raw_size = sample->raw_size,
> };
> int i;
> struct callchain_cursor cursor;
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index 86f05e7a5566..d29898708dbd 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -122,6 +122,8 @@ struct hist_entry {
> struct branch_info *branch_info;
> struct hists *hists;
> struct mem_info *mem_info;
> + void *raw_data;
> + u32 raw_size;
> struct callchain_root callchain[0]; /* must be last member */
> };
>
> --
> 2.6.4

2015-12-24 00:46:53

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 02/13] perf hist: Save raw_data/size for tracepoint events

Hi Arnaldo,

On Wed, Dec 23, 2015 at 06:43:35PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Dec 23, 2015 at 02:06:59AM +0900, Namhyung Kim escreveu:
> > The raw_data and raw_size fields are to provide tracepoint specific
> > information. They will be used by dynamic sort keys later.
> >
> > Signed-off-by: Namhyung Kim <[email protected]>
> > ---
> > tools/perf/util/hist.c | 4 ++++
> > tools/perf/util/sort.h | 2 ++
> > 2 files changed, 6 insertions(+)
> >
> > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> > index 039bb91d0a92..c0c92a3daa69 100644
> > --- a/tools/perf/util/hist.c
> > +++ b/tools/perf/util/hist.c
> > @@ -487,6 +487,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
> > .branch_info = bi,
> > .mem_info = mi,
> > .transaction = sample->transaction,
> > + .raw_data = sample->raw_data,
> > + .raw_size = sample->raw_size,
>
> So, sample->raw_data is just a pointer to perf_event_sample->array, that
> may be overwritten, no?

I couldn't find where the array data is overwritten. The
__perf_session__process_events() mmaps with PROT_READ basically. But
the mmap can be munmapped on 32 bit systems. I'll keep a copy then.

>
> Looking at the other patches.

Thanks,
Namhyung

>
> - Arnaldo
>
> > };
> >
> > return hists__findnew_entry(hists, &entry, al, sample_self);
> > @@ -801,6 +803,8 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
> > .sym = al->sym,
> > },
> > .parent = iter->parent,
> > + .raw_data = sample->raw_data,
> > + .raw_size = sample->raw_size,
> > };
> > int i;
> > struct callchain_cursor cursor;
> > diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> > index 86f05e7a5566..d29898708dbd 100644
> > --- a/tools/perf/util/sort.h
> > +++ b/tools/perf/util/sort.h
> > @@ -122,6 +122,8 @@ struct hist_entry {
> > struct branch_info *branch_info;
> > struct hists *hists;
> > struct mem_info *mem_info;
> > + void *raw_data;
> > + u32 raw_size;
> > struct callchain_root callchain[0]; /* must be last member */
> > };
> >
> > --
> > 2.6.4

2015-12-24 01:06:22

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH v4.1 02/13] perf hist: Save raw_data/size for tracepoint events

The raw_data and raw_size fields are to provide tracepoint specific
information. They will be used by dynamic sort keys later.

Signed-off-by: Namhyung Kim <[email protected]>
---
make a copy of raw_data

tools/perf/util/hist.c | 27 +++++++++++++++++++++++++++
tools/perf/util/sort.h | 2 ++
2 files changed, 29 insertions(+)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 039bb91d0a92..df5567fff986 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -369,6 +369,28 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template,
if (symbol_conf.use_callchain)
callchain_init(he->callchain);

+ if (he->raw_data) {
+ void *data = malloc(he->raw_size);
+
+ if (data == NULL) {
+ map__put(he->ms.map);
+ if (he->branch_info) {
+ map__put(he->branch_info->from.map);
+ map__put(he->branch_info->to.map);
+ free(he->branch_info);
+ }
+ if (he->mem_info) {
+ map__put(he->mem_info->iaddr.map);
+ map__put(he->mem_info->daddr.map);
+ }
+ free(he->stat_acc);
+ free(he);
+ return NULL;
+ }
+
+ memcpy(data, he->raw_data, he->raw_size);
+ he->raw_data = data;
+ }
INIT_LIST_HEAD(&he->pairs.node);
thread__get(he->thread);
}
@@ -487,6 +509,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
.branch_info = bi,
.mem_info = mi,
.transaction = sample->transaction,
+ .raw_data = sample->raw_data,
+ .raw_size = sample->raw_size,
};

return hists__findnew_entry(hists, &entry, al, sample_self);
@@ -801,6 +825,8 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
.sym = al->sym,
},
.parent = iter->parent,
+ .raw_data = sample->raw_data,
+ .raw_size = sample->raw_size,
};
int i;
struct callchain_cursor cursor;
@@ -974,6 +1000,7 @@ void hist_entry__delete(struct hist_entry *he)
if (he->srcfile && he->srcfile[0])
free(he->srcfile);
free_callchain(he->callchain);
+ free(he->raw_data);
free(he);
}

diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 86f05e7a5566..d29898708dbd 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -122,6 +122,8 @@ struct hist_entry {
struct branch_info *branch_info;
struct hists *hists;
struct mem_info *mem_info;
+ void *raw_data;
+ u32 raw_size;
struct callchain_root callchain[0]; /* must be last member */
};

--
2.6.4

2015-12-24 01:19:24

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 02/13] perf hist: Save raw_data/size for tracepoint events

Em Thu, Dec 24, 2015 at 09:45:45AM +0900, Namhyung Kim escreveu:
> Hi Arnaldo,
>
> On Wed, Dec 23, 2015 at 06:43:35PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Dec 23, 2015 at 02:06:59AM +0900, Namhyung Kim escreveu:
> > > The raw_data and raw_size fields are to provide tracepoint specific
> > > information. They will be used by dynamic sort keys later.
> > >
> > > Signed-off-by: Namhyung Kim <[email protected]>
> > > ---
> > > tools/perf/util/hist.c | 4 ++++
> > > tools/perf/util/sort.h | 2 ++
> > > 2 files changed, 6 insertions(+)
> > >
> > > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> > > index 039bb91d0a92..c0c92a3daa69 100644
> > > --- a/tools/perf/util/hist.c
> > > +++ b/tools/perf/util/hist.c
> > > @@ -487,6 +487,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
> > > .branch_info = bi,
> > > .mem_info = mi,
> > > .transaction = sample->transaction,
> > > + .raw_data = sample->raw_data,
> > > + .raw_size = sample->raw_size,
> >
> > So, sample->raw_data is just a pointer to perf_event_sample->array, that
> > may be overwritten, no?
>
> I couldn't find where the array data is overwritten. The
> __perf_session__process_events() mmaps with PROT_READ basically. But
> the mmap can be munmapped on 32 bit systems. I'll keep a copy then.

perf top, aka mmap with overwrite mode?

I think we should always think first at how to make such new features to
work on 'perf top', where there is no such thing as a "file", but a
_ring_ buffer, where we reuse that buffer when we fill it up, so we
should not keep any pointers to past events, just use what in the
currently being processed.

Haven't looked, do we need to access it after we add the hist, or just
after the perf_sample is parsed?

> > Looking at the other patches.
>
> Thanks,
> Namhyung
>
> >
> > - Arnaldo
> >
> > > };
> > >
> > > return hists__findnew_entry(hists, &entry, al, sample_self);
> > > @@ -801,6 +803,8 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
> > > .sym = al->sym,
> > > },
> > > .parent = iter->parent,
> > > + .raw_data = sample->raw_data,
> > > + .raw_size = sample->raw_size,
> > > };
> > > int i;
> > > struct callchain_cursor cursor;
> > > diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> > > index 86f05e7a5566..d29898708dbd 100644
> > > --- a/tools/perf/util/sort.h
> > > +++ b/tools/perf/util/sort.h
> > > @@ -122,6 +122,8 @@ struct hist_entry {
> > > struct branch_info *branch_info;
> > > struct hists *hists;
> > > struct mem_info *mem_info;
> > > + void *raw_data;
> > > + u32 raw_size;
> > > struct callchain_root callchain[0]; /* must be last member */
> > > };
> > >
> > > --
> > > 2.6.4

2015-12-24 01:39:24

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v4.1 02/13] perf hist: Save raw_data/size for tracepoint events

Em Thu, Dec 24, 2015 at 10:05:34AM +0900, Namhyung Kim escreveu:
> The raw_data and raw_size fields are to provide tracepoint specific
> information. They will be used by dynamic sort keys later.
>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> make a copy of raw_data
>
> tools/perf/util/hist.c | 27 +++++++++++++++++++++++++++
> tools/perf/util/sort.h | 2 ++
> 2 files changed, 29 insertions(+)
>
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 039bb91d0a92..df5567fff986 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -369,6 +369,28 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template,
> if (symbol_conf.use_callchain)
> callchain_init(he->callchain);
>
> + if (he->raw_data) {
> + void *data = malloc(he->raw_size);

if this is really needed, please use memdup(), I.e.:

memdup(he->raw_data, he->raw_size);

> + if (data == NULL) {
> + map__put(he->ms.map);
> + if (he->branch_info) {
> + map__put(he->branch_info->from.map);
> + map__put(he->branch_info->to.map);
> + free(he->branch_info);
> + }
> + if (he->mem_info) {
> + map__put(he->mem_info->iaddr.map);
> + map__put(he->mem_info->daddr.map);
> + }
> + free(he->stat_acc);
> + free(he);
> + return NULL;
> + }
> +
> + memcpy(data, he->raw_data, he->raw_size);
> + he->raw_data = data;
> + }
> INIT_LIST_HEAD(&he->pairs.node);
> thread__get(he->thread);
> }
> @@ -487,6 +509,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
> .branch_info = bi,
> .mem_info = mi,
> .transaction = sample->transaction,
> + .raw_data = sample->raw_data,
> + .raw_size = sample->raw_size,
> };
>
> return hists__findnew_entry(hists, &entry, al, sample_self);
> @@ -801,6 +825,8 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
> .sym = al->sym,
> },
> .parent = iter->parent,
> + .raw_data = sample->raw_data,
> + .raw_size = sample->raw_size,
> };
> int i;
> struct callchain_cursor cursor;
> @@ -974,6 +1000,7 @@ void hist_entry__delete(struct hist_entry *he)
> if (he->srcfile && he->srcfile[0])
> free(he->srcfile);
> free_callchain(he->callchain);
> + free(he->raw_data);
> free(he);
> }
>
> diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
> index 86f05e7a5566..d29898708dbd 100644
> --- a/tools/perf/util/sort.h
> +++ b/tools/perf/util/sort.h
> @@ -122,6 +122,8 @@ struct hist_entry {
> struct branch_info *branch_info;
> struct hists *hists;
> struct mem_info *mem_info;
> + void *raw_data;
> + u32 raw_size;
> struct callchain_root callchain[0]; /* must be last member */
> };
>
> --
> 2.6.4

2015-12-24 02:17:09

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH v4.2 02/13] perf hist: Save raw_data/size for tracepoint events

The raw_data and raw_size fields are to provide tracepoint specific
information. They will be used by dynamic sort keys later.

Signed-off-by: Namhyung Kim <[email protected]>
---
Make a copy using memdup()

tools/perf/util/hist.c | 24 ++++++++++++++++++++++++
tools/perf/util/sort.h | 2 ++
2 files changed, 26 insertions(+)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 039bb91d0a92..2dcf38a78f08 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -369,6 +369,25 @@ static struct hist_entry *hist_entry__new(struct hist_entry *template,
if (symbol_conf.use_callchain)
callchain_init(he->callchain);

+ if (he->raw_data) {
+ he->raw_data = memdup(he->raw_data, he->raw_size);
+
+ if (he->raw_data == NULL) {
+ map__put(he->ms.map);
+ if (he->branch_info) {
+ map__put(he->branch_info->from.map);
+ map__put(he->branch_info->to.map);
+ free(he->branch_info);
+ }
+ if (he->mem_info) {
+ map__put(he->mem_info->iaddr.map);
+ map__put(he->mem_info->daddr.map);
+ }
+ free(he->stat_acc);
+ free(he);
+ return NULL;
+ }
+ }
INIT_LIST_HEAD(&he->pairs.node);
thread__get(he->thread);
}
@@ -487,6 +506,8 @@ struct hist_entry *__hists__add_entry(struct hists *hists,
.branch_info = bi,
.mem_info = mi,
.transaction = sample->transaction,
+ .raw_data = sample->raw_data,
+ .raw_size = sample->raw_size,
};

return hists__findnew_entry(hists, &entry, al, sample_self);
@@ -801,6 +822,8 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
.sym = al->sym,
},
.parent = iter->parent,
+ .raw_data = sample->raw_data,
+ .raw_size = sample->raw_size,
};
int i;
struct callchain_cursor cursor;
@@ -974,6 +997,7 @@ void hist_entry__delete(struct hist_entry *he)
if (he->srcfile && he->srcfile[0])
free(he->srcfile);
free_callchain(he->callchain);
+ free(he->raw_data);
free(he);
}

diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index 86f05e7a5566..d29898708dbd 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -122,6 +122,8 @@ struct hist_entry {
struct branch_info *branch_info;
struct hists *hists;
struct mem_info *mem_info;
+ void *raw_data;
+ u32 raw_size;
struct callchain_root callchain[0]; /* must be last member */
};

--
2.6.4