2015-11-09 05:45:52

by Namhyung Kim

Subject: [PATCHSET 0/9] perf report: Support folded callchain output (v5)

Hello,

This is what Brendan requested on the perf-users mailing list [1] to
support FlameGraphs [2] more efficiently. This patchset adds a few
more callchain options to adjust the output for it.

* changes in v5)
- honor field separator from -t option
- add support for TUI and GTK

* changes in v4)
- add missing doc update
- cleanup/fix callchain value print code
- add Acked-by from Brendan and Jiri

* changes in v3)
- put the value before callchains
- fix compile error


First, a 'folded' output mode is added. The folded output prints the
value, a space, and all callchain nodes separated by semicolons. For now
it only supports --stdio, as the other UIs provide some way of folding
and/or expanding callchains dynamically.
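
As a rough illustration of that line layout (not the code in this
patchset), the sketch below builds one folded line from a value string
and a list of frame names. The helper name fold_line() and its
signature are made up for this example; the separator is a parameter
because the series lets -t override the default ';'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of perf): build one folded line of the
 * form "<value> <frame><sep><frame>..." into buf.
 * Returns the length written, or -1 if buf is too small.
 */
static int fold_line(char *buf, size_t size, const char *value,
		     const char *const *frames, size_t nr_frames,
		     const char *sep)
{
	size_t len;
	int n;

	assert(buf && size > 0 && value && sep);

	/* the value comes first, followed by a single space */
	n = snprintf(buf, size, "%s ", value);
	if (n < 0 || (size_t)n >= size)
		return -1;
	len = (size_t)n;

	for (size_t i = 0; i < nr_frames; i++) {
		/* no separator before the first frame */
		n = snprintf(buf + len, size - len, "%s%s",
			     i ? sep : "", frames[i]);
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += (size_t)n;
	}
	return (int)len;
}
```

Called with the value "57" and the frames intel_idle,
cpuidle_enter_state and cpuidle_enter, this produces a line shaped like
the callchain lines in the folded example further down.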

The value can now be one of 'percent', 'period', or 'count'. The
percent is the current default output and the period is the raw number
of sample periods. The count is the number of samples for each callchain.

The proposed features of hiding hist lines with '-F none' and showing
hist info with callchains can be added as later work.

Here's an example:

$ perf report --no-children --show-nr-samples --stdio -g folded,count
...
39.93% 80 swapper [kernel.vmlinux] [k] intel_idle
57 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
23 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;...


$ perf report --no-children --stdio -g percent
...
39.93% swapper [kernel.vmlinux] [k] intel_idle
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--28.63%-- start_secondary
|
--11.30%-- rest_init


$ perf report --no-children --stdio --show-total-period -g period
...
39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idle
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--9334403-- start_secondary
|
--3684302-- rest_init


$ perf report --no-children --stdio --show-nr-samples -g count
...
39.93% 80 swapper [kernel.vmlinux] [k] intel_idle
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--57-- start_secondary
|
--23-- rest_init


You can get it from the 'perf/callchain-fold-v5' branch in my tree:

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Any comments are welcome, thanks
Namhyung


[1] http://www.spinics.net/lists/linux-perf-users/msg02498.html
[2] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html


Namhyung Kim (9):
perf report: Support folded callchain mode on --stdio
perf callchain: Abstract callchain print function
perf callchain: Add count fields to struct callchain_node
perf report: Add callchain value option
perf hists browser: Factor out hist_browser__show_callchain_list()
perf hists browser: Support flat callchains
perf hists browser: Support folded callchains
perf ui/gtk: Support flat callchains
perf ui/gtk: Support folded callchains

tools/perf/Documentation/perf-report.txt | 14 +-
tools/perf/builtin-report.c | 4 +-
tools/perf/ui/browsers/hists.c | 316 ++++++++++++++++++++++++++++---
tools/perf/ui/gtk/hists.c | 148 ++++++++++++++-
tools/perf/ui/stdio/hist.c | 94 +++++++--
tools/perf/util/callchain.c | 135 ++++++++++++-
tools/perf/util/callchain.h | 28 ++-
tools/perf/util/util.c | 3 +-
8 files changed, 679 insertions(+), 63 deletions(-)

--
2.6.2


2015-11-09 05:45:55

by Namhyung Kim

Subject: [PATCH v5 1/9] perf report: Support folded callchain mode on --stdio

Add a new call chain option (-g) 'folded' to print each callchain on a
single line. The callchain entries are separated by semicolons, and
preceded by an (absolute) percent value and a space.

For example, the following 20 lines can be printed in 3 lines with the
folded output mode:

$ perf report -g flat --no-children | grep -v ^# | head -20
60.48% swapper [kernel.vmlinux] [k] intel_idle
54.60%
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
start_secondary

5.88%
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel

$ perf report -g folded --no-children | grep -v ^# | head -3
60.48% swapper [kernel.vmlinux] [k] intel_idle
54.60% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
5.88% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel

This mode is supported only for --stdio for now and is intended to be
used by scripts such as those in FlameGraphs[1]. Support for other UIs
might be added later.

[1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
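
For a script-side view of the same format, here is a minimal sketch of
how a consumer might split one folded line back into its value and
frames. parse_folded() is a hypothetical helper, not part of perf or
FlameGraph, and it assumes the default ';' separator. It does not tell
hist entry lines apart from callchain lines; a real consumer would have
to filter those first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical consumer-side helper (not part of perf): split one folded
 * line "<value> <frame>;<frame>;..." in place.  The leading number is
 * returned through *value (a trailing '%' is skipped), and up to
 * max_frames frame names are stored in frames[].
 * Returns the number of frames found, or -1 on a malformed line.
 */
static int parse_folded(char *line, double *value,
			char **frames, int max_frames)
{
	char *end, *p;
	int nr = 0;

	assert(line && value && frames && max_frames > 0);

	*value = strtod(line, &end);
	if (end == line)
		return -1;		/* no leading number */
	if (*end == '%')
		end++;
	if (*end != ' ')
		return -1;		/* value and chain are split by a space */
	p = end + 1;
	p[strcspn(p, "\n")] = '\0';	/* drop a trailing newline, if any */

	while (*p && nr < max_frames) {
		frames[nr++] = p;
		p = strchr(p, ';');
		if (!p)
			break;
		*p++ = '\0';		/* terminate this frame, move to next */
	}
	return nr;
}
```

The line is modified in place, so the frame pointers stay valid only as
long as the caller keeps the line buffer around.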

Requested-by: Brendan Gregg <[email protected]>
Acked-by: Brendan Gregg <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 1 +
tools/perf/ui/stdio/hist.c | 55 ++++++++++++++++++++++++++++++++
tools/perf/util/callchain.c | 6 ++++
tools/perf/util/callchain.h | 5 +--
4 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 5ce8da1e1256..f7d81aac9188 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -181,6 +181,7 @@ OPTIONS
- graph: use a graph tree, displaying absolute overhead rates. (default)
- fractal: like graph, but displays relative rates. Each branch of
the tree is considered as a new profiled object.
+ - folded: call chains are displayed in a line, separated by semicolons
- none: disable call chain display.

threshold is a percentage value which specifies a minimum percent to be
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index dfcbc90146ef..ea7984932d9a 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -260,6 +260,58 @@ static size_t callchain__fprintf_flat(FILE *fp, struct rb_root *tree,
return ret;
}

+static size_t __callchain__fprintf_folded(FILE *fp, struct callchain_node *node)
+{
+ const char *sep = symbol_conf.field_sep ?: ";";
+ struct callchain_list *chain;
+ size_t ret = 0;
+ char bf[1024];
+ bool first;
+
+ if (!node)
+ return 0;
+
+ ret += __callchain__fprintf_folded(fp, node->parent);
+
+ first = (ret == 0);
+ list_for_each_entry(chain, &node->val, list) {
+ if (chain->ip >= PERF_CONTEXT_MAX)
+ continue;
+ ret += fprintf(fp, "%s%s", first ? "" : sep,
+ callchain_list__sym_name(chain,
+ bf, sizeof(bf), false));
+ first = false;
+ }
+
+ return ret;
+}
+
+static size_t callchain__fprintf_folded(FILE *fp, struct rb_root *tree,
+ u64 total_samples)
+{
+ size_t ret = 0;
+ u32 entries_printed = 0;
+ struct callchain_node *chain;
+ struct rb_node *rb_node = rb_first(tree);
+
+ while (rb_node) {
+ double percent;
+
+ chain = rb_entry(rb_node, struct callchain_node, rb_node);
+ percent = chain->hit * 100.0 / total_samples;
+
+ ret += fprintf(fp, "%.2f%% ", percent);
+ ret += __callchain__fprintf_folded(fp, chain);
+ ret += fprintf(fp, "\n");
+ if (++entries_printed == callchain_param.print_limit)
+ break;
+
+ rb_node = rb_next(rb_node);
+ }
+
+ return ret;
+}
+
static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
u64 total_samples, int left_margin,
FILE *fp)
@@ -278,6 +330,9 @@ static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
case CHAIN_FLAT:
return callchain__fprintf_flat(fp, &he->sorted_chain, total_samples);
break;
+ case CHAIN_FOLDED:
+ return callchain__fprintf_folded(fp, &he->sorted_chain, total_samples);
+ break;
case CHAIN_NONE:
break;
default:
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 735ad48e1858..08cb220ba5ea 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -44,6 +44,10 @@ static int parse_callchain_mode(const char *value)
callchain_param.mode = CHAIN_GRAPH_REL;
return 0;
}
+ if (!strncmp(value, "folded", strlen(value))) {
+ callchain_param.mode = CHAIN_FOLDED;
+ return 0;
+ }
return -1;
}

@@ -218,6 +222,7 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,

switch (mode) {
case CHAIN_FLAT:
+ case CHAIN_FOLDED:
if (rnode->hit < chain->hit)
p = &(*p)->rb_left;
else
@@ -338,6 +343,7 @@ int callchain_register_param(struct callchain_param *param)
param->sort = sort_chain_graph_rel;
break;
case CHAIN_FLAT:
+ case CHAIN_FOLDED:
param->sort = sort_chain_flat;
break;
case CHAIN_NONE:
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index fce8161e54db..544d99ac169c 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -24,7 +24,7 @@
#define CALLCHAIN_RECORD_HELP CALLCHAIN_HELP RECORD_MODE_HELP RECORD_SIZE_HELP

#define CALLCHAIN_REPORT_HELP \
- HELP_PAD "print_type:\tcall graph printing style (graph|flat|fractal|none)\n" \
+ HELP_PAD "print_type:\tcall graph printing style (graph|flat|fractal|folded|none)\n" \
HELP_PAD "threshold:\tminimum call graph inclusion threshold (<percent>)\n" \
HELP_PAD "print_limit:\tmaximum number of call graph entry (<number>)\n" \
HELP_PAD "order:\t\tcall graph order (caller|callee)\n" \
@@ -43,7 +43,8 @@ enum chain_mode {
CHAIN_NONE,
CHAIN_FLAT,
CHAIN_GRAPH_ABS,
- CHAIN_GRAPH_REL
+ CHAIN_GRAPH_REL,
+ CHAIN_FOLDED,
};

enum chain_order {
--
2.6.2

2015-11-09 05:45:58

by Namhyung Kim

Subject: [PATCH] perf report: [WIP] Support '-F none' option to hide hist lines

Sometimes users want to hide the hist lines and see only the callchains.
To do that, add a virtual 'none' field name that hides all hist lines.
It must be the only field given and is only meaningful on --stdio.

WIP on TUI

Cc: Brendan Gregg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 3 ++
tools/perf/ui/browsers/hists.c | 22 +++++++++--
tools/perf/ui/gtk/hists.c | 65 ++++++++++++++++++++++++--------
tools/perf/ui/stdio/hist.c | 5 +++
tools/perf/util/sort.c | 5 +++
5 files changed, 82 insertions(+), 18 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index dab99ed2b339..6cfc643c0806 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -127,6 +127,9 @@ OPTIONS
By default, every sort keys not specified in -F will be appended
automatically.

+ If "none" is specified, it hides all fields and --stdio output will show
+ callchains only.
+
If --mem-mode option is used, following sort keys are also available
(incompatible with --branch-stack):
symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline.
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 3efe7c74f47d..c2f586f0c729 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -78,6 +78,9 @@ static u32 hist_browser__nr_entries(struct hist_browser *hb)
nr_entries = hb->hists->nr_entries;

hb->nr_callchain_rows = hist_browser__get_folding(hb);
+
+ if (list_empty(&perf_hpp__list))
+ nr_entries = 1;
return nr_entries + hb->nr_callchain_rows;
}

@@ -255,7 +258,10 @@ static bool hist_entry__toggle_fold(struct hist_entry *he)
if (!he->has_children)
return false;

- he->unfolded = !he->unfolded;
+ if (list_empty(&perf_hpp__list))
+ he->unfolded = true;
+ else
+ he->unfolded = !he->unfolded;
return true;
}

@@ -329,6 +335,10 @@ static void hist_entry__init_have_children(struct hist_entry *he)
if (!he->init_have_children) {
he->has_children = !RB_EMPTY_ROOT(&he->sorted_chain);
callchain__init_have_children(&he->sorted_chain);
+ if (list_empty(&perf_hpp__list)) {
+ he->unfolded = true;
+ he->nr_rows = callchain__count_rows(&he->sorted_chain);
+ }
he->init_have_children = true;
}
}
@@ -1038,6 +1048,9 @@ static int hist_browser__show_entry(struct hist_browser *browser,

hist_browser__gotorc(browser, row, 0);

+ if (list_empty(&perf_hpp__list))
+ goto print_callchain;
+
perf_hpp__for_each_format(fmt) {
if (perf_hpp__should_skip(fmt) || column++ < browser->b.horiz_scroll)
continue;
@@ -1080,6 +1093,7 @@ static int hist_browser__show_entry(struct hist_browser *browser,
} else
--row_offset;

+print_callchain:
if (folded_sign == '-' && row != browser->b.rows) {
u64 total = hists__total_period(entry->hists);
struct callchain_print_arg arg = {
@@ -1313,7 +1327,8 @@ static void ui_browser__hists_seek(struct ui_browser *browser,
nd = hists__filter_entries(rb_next(nd), hb->min_pcnt);
if (nd == NULL)
break;
- --offset;
+ if (!list_empty(&perf_hpp__list))
+ --offset;
browser->top = nd;
} while (offset != 0);
} else if (offset < 0) {
@@ -1347,7 +1362,8 @@ static void ui_browser__hists_seek(struct ui_browser *browser,
hb->min_pcnt);
if (nd == NULL)
break;
- ++offset;
+ if (!list_empty(&perf_hpp__list))
+ ++offset;
browser->top = nd;
if (offset == 0) {
/*
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 6105b4921754..535f8c5e74dc 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -98,12 +98,12 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
for (nd = rb_first(root); nd; nd = rb_next(nd)) {
struct callchain_node *node;
struct callchain_list *chain;
- GtkTreeIter iter, new_parent;
+ GtkTreeIter iter, new_parent_iter, *new_parent;
bool need_new_parent;

node = rb_entry(nd, struct callchain_node, rb_node);

- new_parent = *parent;
+ new_parent = parent;
need_new_parent = !has_single_node;

callchain_node__make_parent_list(node);
@@ -111,7 +111,7 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
list_for_each_entry(chain, &node->parent_val, list) {
char buf[128];

- gtk_tree_store_append(store, &iter, &new_parent);
+ gtk_tree_store_append(store, &iter, new_parent);

callchain_node__sprintf_value(node, buf, sizeof(buf), total);
gtk_tree_store_set(store, &iter, 0, buf, -1);
@@ -124,7 +124,8 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
* Only show the top-most symbol in a callchain
* if it's not the only callchain.
*/
- new_parent = iter;
+ new_parent_iter = iter;
+ new_parent = &new_parent_iter;
need_new_parent = false;
}
}
@@ -132,7 +133,7 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
list_for_each_entry(chain, &node->val, list) {
char buf[128];

- gtk_tree_store_append(store, &iter, &new_parent);
+ gtk_tree_store_append(store, &iter, new_parent);

callchain_node__sprintf_value(node, buf, sizeof(buf), total);
gtk_tree_store_set(store, &iter, 0, buf, -1);
@@ -145,7 +146,8 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
* Only show the top-most symbol in a callchain
* if it's not the only callchain.
*/
- new_parent = iter;
+ new_parent_iter = iter;
+ new_parent = &new_parent_iter;
need_new_parent = false;
}
}
@@ -221,19 +223,19 @@ static void perf_gtk__add_callchain_graph(struct rb_root *root, GtkTreeStore *st
for (nd = rb_first(root); nd; nd = rb_next(nd)) {
struct callchain_node *node;
struct callchain_list *chain;
- GtkTreeIter iter, new_parent;
+ GtkTreeIter iter, new_parent_iter, *new_parent;
bool need_new_parent;
u64 child_total;

node = rb_entry(nd, struct callchain_node, rb_node);

- new_parent = *parent;
+ new_parent = parent;
need_new_parent = !has_single_node && (node->val_nr > 1);

list_for_each_entry(chain, &node->val, list) {
char buf[128];

- gtk_tree_store_append(store, &iter, &new_parent);
+ gtk_tree_store_append(store, &iter, new_parent);

callchain_node__sprintf_value(node, buf, sizeof(buf), total);
gtk_tree_store_set(store, &iter, 0, buf, -1);
@@ -246,7 +248,8 @@ static void perf_gtk__add_callchain_graph(struct rb_root *root, GtkTreeStore *st
* Only show the top-most symbol in a callchain
* if it's not the only callchain.
*/
- new_parent = iter;
+ new_parent_iter = iter;
+ new_parent = &new_parent_iter;
need_new_parent = false;
}
}
@@ -292,12 +295,14 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
GType col_types[MAX_COLUMNS];
GtkCellRenderer *renderer;
GtkTreeStore *store;
+ GtkTreeIter *iter;
struct rb_node *nd;
GtkWidget *view;
int col_idx;
int sym_col = -1;
int nr_cols;
char s[512];
+ bool no_hists = false;

struct perf_hpp hpp = {
.buf = s,
@@ -309,6 +314,18 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
perf_hpp__for_each_format(fmt)
col_types[nr_cols++] = G_TYPE_STRING;

+ if (nr_cols == 0) {
+ /*
+ * user specified '-F none' to ignore hist entries.
+ * Add two columns to print callchain value and symbols.
+ */
+ no_hists = true;
+
+ nr_cols = 2;
+ col_types[0] = G_TYPE_STRING;
+ col_types[1] = G_TYPE_STRING;
+ }
+
store = gtk_tree_store_newv(nr_cols, col_types);

view = gtk_tree_view_new();
@@ -334,6 +351,18 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
col_idx++, NULL);
}

+ if (no_hists) {
+ sym_col = 1;
+ gtk_tree_view_insert_column_with_attributes(GTK_TREE_VIEW(view),
+ -1, "Overhead",
+ renderer, "markup",
+ 0, NULL);
+ gtk_tree_view_insert_column_with_attributes(GTK_TREE_VIEW(view),
+ -1, "Callchains",
+ renderer, "markup",
+ 1, NULL);
+ }
+
for (col_idx = 0; col_idx < nr_cols; col_idx++) {
GtkTreeViewColumn *column;

@@ -352,7 +381,7 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,

for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) {
struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
- GtkTreeIter iter;
+ GtkTreeIter this_iter;
u64 total = hists__total_period(h->hists);
float percent;

@@ -363,7 +392,13 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
if (percent < min_pcnt)
continue;

- gtk_tree_store_append(store, &iter, NULL);
+ if (no_hists) {
+ /* NULL means that callchains are in top-level */
+ iter = NULL;
+ } else {
+ iter = &this_iter;
+ gtk_tree_store_append(store, iter, NULL);
+ }

col_idx = 0;

@@ -376,15 +411,15 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
else
fmt->entry(fmt, &hpp, h);

- gtk_tree_store_set(store, &iter, col_idx++, s, -1);
+ gtk_tree_store_set(store, iter, col_idx++, s, -1);
}

- if (symbol_conf.use_callchain && sort__has_sym) {
+ if (symbol_conf.use_callchain) {
if (callchain_param.mode == CHAIN_GRAPH_REL)
total = symbol_conf.cumulate_callchain ?
h->stat_acc->period : h->stat.period;

- perf_gtk__add_callchain(&h->sorted_chain, store, &iter,
+ perf_gtk__add_callchain(&h->sorted_chain, store, iter,
sym_col, total);
}
}
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 7ebc661be267..48ae34abf9c8 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -422,6 +422,11 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
if (size == 0 || size > bfsz)
size = hpp.size = bfsz;

+ /*
+ * In case of '-F none', the bf is not set at all.
+ */
+ bf[0] = '\0';
+
hist_entry__snprintf(he, &hpp);

ret = fprintf(fp, "%s\n", bf);
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 2d8ccd4d9e1b..8c731906d432 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -1923,6 +1923,11 @@ static int __setup_output_field(void)
if (field_order == NULL)
return 0;

+ if (!strcmp(field_order, "none")) {
+ symbol_conf.show_hist_headers = false;
+ return 0;
+ }
+
strp = str = strdup(field_order);
if (str == NULL) {
error("Not enough memory to setup output fields");
--
2.6.2

2015-11-09 05:47:10

by Namhyung Kim

Subject: [PATCH v5 2/9] perf callchain: Abstract callchain print function

This is preparation for printing other types of callchain value, such
as count or period.
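
The idea can be sketched outside of perf as follows. This is a
simplified, standalone illustration, not the code in the diff below:
struct node_sketch and node_sketch__sprintf_value() are made-up names,
and the folded mode is passed as a plain flag rather than read from
callchain_param.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for perf's struct callchain_node (illustration only). */
struct node_sketch {
	uint64_t hit;		/* period of samples ending at this node */
	uint64_t children_hit;	/* period of samples in descendants */
};

/*
 * Format a node's share of 'total' as "NN.NN%".  In folded mode each line
 * is a complete chain, so only the node's own hit is used; otherwise the
 * cumulative value (self + children) is used.
 */
static char *node_sketch__sprintf_value(const struct node_sketch *node,
					int folded, char *bf, size_t bfsize,
					uint64_t total)
{
	double percent = 0.0;
	uint64_t period;

	assert(node && bf && bfsize > 0);

	period = folded ? node->hit : node->hit + node->children_hit;
	if (total)			/* avoid dividing by zero */
		percent = period * 100.0 / total;

	snprintf(bf, bfsize, "%.2f%%", percent);
	return bf;
}
```

Keeping the formatting in one helper means every UI prints the same
string, so a later change to the value type only has to touch one place.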

Acked-by: Brendan Gregg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/browsers/hists.c | 8 +++++---
tools/perf/ui/gtk/hists.c | 8 ++------
tools/perf/ui/stdio/hist.c | 35 +++++++++++++++++------------------
tools/perf/util/callchain.c | 29 +++++++++++++++++++++++++++++
tools/perf/util/callchain.h | 4 ++++
5 files changed, 57 insertions(+), 27 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index e5afb8936040..a8897aab4c4a 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -592,7 +592,6 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
while (node) {
struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
struct rb_node *next = rb_next(node);
- u64 cumul = callchain_cumul_hits(child);
struct callchain_list *chain;
char folded_sign = ' ';
int first = true;
@@ -619,9 +618,12 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
browser->show_dso);

if (was_first && need_percent) {
- double percent = cumul * 100.0 / total;
+ char buf[64];

- if (asprintf(&alloc_str, "%2.2f%% %s", percent, str) < 0)
+ callchain_node__sprintf_value(child, buf, sizeof(buf),
+ total);
+
+ if (asprintf(&alloc_str, "%s %s", buf, str) < 0)
str = "Not enough memory!";
else
str = alloc_str;
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 4b3585eed1e8..d8037b7023e8 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -100,14 +100,10 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
struct callchain_list *chain;
GtkTreeIter iter, new_parent;
bool need_new_parent;
- double percent;
- u64 hits, child_total;
+ u64 child_total;

node = rb_entry(nd, struct callchain_node, rb_node);

- hits = callchain_cumul_hits(node);
- percent = 100.0 * hits / total;
-
new_parent = *parent;
need_new_parent = !has_single_node && (node->val_nr > 1);

@@ -116,7 +112,7 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,

gtk_tree_store_append(store, &iter, &new_parent);

- scnprintf(buf, sizeof(buf), "%5.2f%%", percent);
+ callchain_node__sprintf_value(node, buf, sizeof(buf), total);
gtk_tree_store_set(store, &iter, 0, buf, -1);

callchain_list__sym_name(chain, buf, sizeof(buf), false);
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index ea7984932d9a..f4de055cab9b 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -34,10 +34,10 @@ static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
return ret;
}

-static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
+static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
+ struct callchain_list *chain,
int depth, int depth_mask, int period,
- u64 total_samples, u64 hits,
- int left_margin)
+ u64 total_samples, int left_margin)
{
int i;
size_t ret = 0;
@@ -50,10 +50,9 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
else
ret += fprintf(fp, " ");
if (!period && i == depth - 1) {
- double percent;
-
- percent = hits * 100.0 / total_samples;
- ret += percent_color_fprintf(fp, "--%2.2f%%-- ", percent);
+ ret += fprintf(fp, "--");
+ ret += callchain_node__fprintf_value(node, fp, total_samples);
+ ret += fprintf(fp, "--");
} else
ret += fprintf(fp, "%s", " ");
}
@@ -120,10 +119,9 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
left_margin);
i = 0;
list_for_each_entry(chain, &child->val, list) {
- ret += ipchain__fprintf_graph(fp, chain, depth,
+ ret += ipchain__fprintf_graph(fp, child, chain, depth,
new_depth_mask, i++,
total_samples,
- cumul,
left_margin);
}

@@ -143,14 +141,17 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,

if (callchain_param.mode == CHAIN_GRAPH_REL &&
remaining && remaining != total_samples) {
+ struct callchain_node rem_node = {
+ .hit = remaining,
+ };

if (!rem_sq_bracket)
return ret;

new_depth_mask &= ~(1 << (depth - 1));
- ret += ipchain__fprintf_graph(fp, &rem_hits, depth,
+ ret += ipchain__fprintf_graph(fp, &rem_node, &rem_hits, depth,
new_depth_mask, 0, total_samples,
- remaining, left_margin);
+ left_margin);
}

return ret;
@@ -243,12 +244,11 @@ static size_t callchain__fprintf_flat(FILE *fp, struct rb_root *tree,
struct rb_node *rb_node = rb_first(tree);

while (rb_node) {
- double percent;
-
chain = rb_entry(rb_node, struct callchain_node, rb_node);
- percent = chain->hit * 100.0 / total_samples;

- ret = percent_color_fprintf(fp, " %6.2f%%\n", percent);
+ ret += fprintf(fp, " ");
+ ret += callchain_node__fprintf_value(chain, fp, total_samples);
+ ret += fprintf(fp, "\n");
ret += __callchain__fprintf_flat(fp, chain, total_samples);
ret += fprintf(fp, "\n");
if (++entries_printed == callchain_param.print_limit)
@@ -295,12 +295,11 @@ static size_t callchain__fprintf_folded(FILE *fp, struct rb_root *tree,
struct rb_node *rb_node = rb_first(tree);

while (rb_node) {
- double percent;

chain = rb_entry(rb_node, struct callchain_node, rb_node);
- percent = chain->hit * 100.0 / total_samples;

- ret += fprintf(fp, "%.2f%% ", percent);
+ ret += callchain_node__fprintf_value(chain, fp, total_samples);
+ ret += fprintf(fp, " ");
ret += __callchain__fprintf_folded(fp, chain);
ret += fprintf(fp, "\n");
if (++entries_printed == callchain_param.print_limit)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 08cb220ba5ea..e2ef9b38acb6 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -805,6 +805,35 @@ char *callchain_list__sym_name(struct callchain_list *cl,
return bf;
}

+char *callchain_node__sprintf_value(struct callchain_node *node,
+ char *bf, size_t bfsize, u64 total)
+{
+ double percent = 0.0;
+ u64 period = callchain_cumul_hits(node);
+
+ if (callchain_param.mode == CHAIN_FOLDED)
+ period = node->hit;
+ if (total)
+ percent = period * 100.0 / total;
+
+ scnprintf(bf, bfsize, "%.2f%%", percent);
+ return bf;
+}
+
+int callchain_node__fprintf_value(struct callchain_node *node,
+ FILE *fp, u64 total)
+{
+ double percent = 0.0;
+ u64 period = callchain_cumul_hits(node);
+
+ if (callchain_param.mode == CHAIN_FOLDED)
+ period = node->hit;
+ if (total)
+ percent = period * 100.0 / total;
+
+ return percent_color_fprintf(fp, "%.2f%%", percent);
+}
+
static void free_callchain_node(struct callchain_node *node)
{
struct callchain_list *list, *tmp;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 544d99ac169c..f9e00e3d1243 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -230,6 +230,10 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,

char *callchain_list__sym_name(struct callchain_list *cl,
char *bf, size_t bfsize, bool show_dso);
+char *callchain_node__sprintf_value(struct callchain_node *node,
+ char *bf, size_t bfsize, u64 total);
+int callchain_node__fprintf_value(struct callchain_node *node,
+ FILE *fp, u64 total);

void free_callchain(struct callchain_root *root);

--
2.6.2

2015-11-09 05:47:08

by Namhyung Kim

Subject: [PATCH v5 3/9] perf callchain: Add count fields to struct callchain_node

Track the number of occurrences of each callchain.
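
To make the two counters concrete, here is a toy model of the
bookkeeping. It is only a sketch under simplifying assumptions: struct
cnode holds one frame per node and never splits nodes, unlike the real
callchain tree, and all names are hypothetical. 'count' is bumped on
the node where a sample's chain ends, and 'children_count' on every
node the sample passes through.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHILDREN 8

/* Toy stand-in for struct callchain_node: one frame per node, no splitting. */
struct cnode {
	const char *sym;
	unsigned int count;		/* samples whose chain ends here */
	unsigned int children_count;	/* samples whose chain continues below */
	struct cnode *children[MAX_CHILDREN];
	int nr_children;
};

static unsigned int cnode__cumul_counts(const struct cnode *node)
{
	return node->count + node->children_count;
}

/* Find or create the child of 'parent' named 'sym'; NULL when out of room. */
static struct cnode *cnode__child(struct cnode *parent, const char *sym)
{
	struct cnode *child;

	for (int i = 0; i < parent->nr_children; i++)
		if (!strcmp(parent->children[i]->sym, sym))
			return parent->children[i];

	if (parent->nr_children == MAX_CHILDREN)
		return NULL;
	child = calloc(1, sizeof(*child));
	if (child) {
		child->sym = sym;
		parent->children[parent->nr_children++] = child;
	}
	return child;
}

/*
 * Record one sample whose callchain is frames[0..nr-1].
 * Returns 0 on success, -1 on allocation failure (counters on the partial
 * path stay incremented in that case, which is acceptable for a sketch).
 */
static int cnode__append(struct cnode *root, const char *const *frames, int nr)
{
	struct cnode *node = root;

	assert(root && frames && nr > 0);

	for (int i = 0; i < nr; i++) {
		struct cnode *child = cnode__child(node, frames[i]);

		if (!child)
			return -1;
		node->children_count++;	/* the sample passes through 'node' */
		node = child;
	}
	node->count++;			/* the chain ends at this node */
	return 0;
}
```

With this layout, the cumulative count of a node is the number of
samples whose chain reaches it, whether or not the chain goes deeper.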

Acked-by: Brendan Gregg <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/callchain.c | 10 ++++++++++
tools/perf/util/callchain.h | 7 +++++++
2 files changed, 17 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index e2ef9b38acb6..60754de700d4 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -437,6 +437,8 @@ add_child(struct callchain_node *parent,

new->children_hit = 0;
new->hit = period;
+ new->children_count = 0;
+ new->count = 1;
return new;
}

@@ -484,6 +486,9 @@ split_add_child(struct callchain_node *parent,
parent->children_hit = callchain_cumul_hits(new);
new->val_nr = parent->val_nr - idx_local;
parent->val_nr = idx_local;
+ new->count = parent->count;
+ new->children_count = parent->children_count;
+ parent->children_count = callchain_cumul_counts(new);

/* create a new child for the new branch if any */
if (idx_total < cursor->nr) {
@@ -494,6 +499,8 @@ split_add_child(struct callchain_node *parent,

parent->hit = 0;
parent->children_hit += period;
+ parent->count = 0;
+ parent->children_count += 1;

node = callchain_cursor_current(cursor);
new = add_child(parent, cursor, period);
@@ -516,6 +523,7 @@ split_add_child(struct callchain_node *parent,
rb_insert_color(&new->rb_node_in, &parent->rb_root_in);
} else {
parent->hit = period;
+ parent->count = 1;
}
}

@@ -562,6 +570,7 @@ append_chain_children(struct callchain_node *root,

inc_children_hit:
root->children_hit += period;
+ root->children_count++;
}

static int
@@ -614,6 +623,7 @@ append_chain(struct callchain_node *root,
/* we match 100% of the path, increment the hit */
if (matches == root->val_nr && cursor->pos == cursor->nr) {
root->hit += period;
+ root->count++;
return 0;
}

diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index f9e00e3d1243..0e6cc83f1a46 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -60,6 +60,8 @@ struct callchain_node {
struct rb_root rb_root_in; /* input tree of children */
struct rb_root rb_root; /* sorted output tree of children */
unsigned int val_nr;
+ unsigned int count;
+ unsigned int children_count;
u64 hit;
u64 children_hit;
};
@@ -145,6 +147,11 @@ static inline u64 callchain_cumul_hits(struct callchain_node *node)
return node->hit + node->children_hit;
}

+static inline unsigned callchain_cumul_counts(struct callchain_node *node)
+{
+ return node->count + node->children_count;
+}
+
int callchain_register_param(struct callchain_param *param);
int callchain_append(struct callchain_root *root,
struct callchain_cursor *cursor,
--
2.6.2

2015-11-09 05:47:01

by Namhyung Kim

Subject: [PATCH v5 4/9] perf report: Add callchain value option

The -g/--call-graph option now accepts a value type that controls how
callchain values are displayed. Possible values are 'percent', 'period'
and 'count'. The percent is the same as before and is the default
behavior. The period displays the raw period value rather than the
percentage. The count displays the number of occurrences.
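
The three value types boil down to a choice of what to print for each
callchain. The sketch below shows that choice in isolation; the enum,
struct and function names are hypothetical and may not match what the
patch itself uses.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names; the patch's own enum and helpers may differ. */
enum value_kind { VALUE_PERCENT, VALUE_PERIOD, VALUE_COUNT };

struct chain_stats {
	uint64_t period;	/* sum of sample periods for this callchain */
	unsigned int count;	/* number of samples for this callchain */
};

/* Print the selected value into bf; returns what snprintf() returns. */
static int chain_stats__snprintf(const struct chain_stats *st,
				 enum value_kind kind, uint64_t total_period,
				 char *bf, size_t size)
{
	assert(st && bf && size > 0);

	switch (kind) {
	case VALUE_PERIOD:
		return snprintf(bf, size, "%" PRIu64, st->period);
	case VALUE_COUNT:
		return snprintf(bf, size, "%u", st->count);
	case VALUE_PERCENT:
	default:
		/* percent of the total period; 0 when there is no total */
		return snprintf(bf, size, "%.2f%%",
				total_period ?
				st->period * 100.0 / total_period : 0.0);
	}
}
```

Only the percent needs the total period; the period and count are
printed as raw numbers, as in the examples that follow.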

$ perf report --no-children --stdio -g percent
...
39.93% swapper [kernel.vmlinux] [k] intel_idle
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--28.63%-- start_secondary
|
--11.30%-- rest_init

$ perf report --no-children --show-total-period --stdio -g period
...
39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idle
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--9334403-- start_secondary
|
--3684302-- rest_init

$ perf report --no-children --show-nr-samples --stdio -g count
...
39.93% 80 swapper [kernel.vmlinux] [k] intel_idle
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--57-- start_secondary
|
--23-- rest_init

Acked-by: Brendan Gregg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 13 ++++---
tools/perf/builtin-report.c | 4 +--
tools/perf/ui/stdio/hist.c | 10 +++++-
tools/perf/util/callchain.c | 62 +++++++++++++++++++++++++++-----
tools/perf/util/callchain.h | 10 +++++-
tools/perf/util/util.c | 3 +-
6 files changed, 84 insertions(+), 18 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index f7d81aac9188..dab99ed2b339 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -170,11 +170,11 @@ OPTIONS
Dump raw trace in ASCII.

-g::
---call-graph=<print_type,threshold[,print_limit],order,sort_key,branch>::
+--call-graph=<print_type,threshold[,print_limit],order,sort_key[,branch],value>::
Display call chains using type, min percent threshold, print limit,
- call order, sort key and branch. Note that ordering of parameters is not
- fixed so any parement can be given in an arbitraty order. One exception
- is the print_limit which should be preceded by threshold.
+ call order, sort key, optional branch and value. Note that ordering of
+ parameters is not fixed so any parement can be given in an arbitraty order.
+ One exception is the print_limit which should be preceded by threshold.

print_type can be either:
- flat: single column, linear exposure of call chains.
@@ -205,6 +205,11 @@ OPTIONS
- branch: include last branch information in callgraph when available.
Usually more convenient to use --branch-history for this.

+ value can be:
+ - percent: diplay overhead percent (default)
+ - period: display event period
+ - count: display event count
+
--children::
Accumulate callchain of children to parent entry so that then can
show up in the output. The output will have a new "Children" column
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2853ad2bd435..3dd4bb4ded1a 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -625,7 +625,7 @@ parse_percent_limit(const struct option *opt, const char *str,
return 0;
}

-#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function"
+#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"

const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
CALLCHAIN_REPORT_HELP
@@ -708,7 +708,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
"Only display entries with parent-match"),
OPT_CALLBACK_DEFAULT('g', "call-graph", &report,
- "print_type,threshold[,print_limit],order,sort_key[,branch]",
+ "print_type,threshold[,print_limit],order,sort_key[,branch],value",
report_callchain_help, &report_parse_callchain_opt,
callchain_default_opt),
OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index f4de055cab9b..7ebc661be267 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -81,13 +81,14 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
int depth_mask, int left_margin)
{
struct rb_node *node, *next;
- struct callchain_node *child;
+ struct callchain_node *child = NULL;
struct callchain_list *chain;
int new_depth_mask = depth_mask;
u64 remaining;
size_t ret = 0;
int i;
uint entries_printed = 0;
+ int cumul_count = 0;

remaining = total_samples;

@@ -99,6 +100,7 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
child = rb_entry(node, struct callchain_node, rb_node);
cumul = callchain_cumul_hits(child);
remaining -= cumul;
+ cumul_count += callchain_cumul_counts(child);

/*
* The depth mask manages the output of pipes that show
@@ -148,6 +150,12 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
if (!rem_sq_bracket)
return ret;

+ if (callchain_param.value == CCVAL_COUNT && child && child->parent) {
+ rem_node.count = child->parent->children_count - cumul_count;
+ if (rem_node.count <= 0)
+ return ret;
+ }
+
new_depth_mask &= ~(1 << (depth - 1));
ret += ipchain__fprintf_graph(fp, &rem_node, &rem_hits, depth,
new_depth_mask, 0, total_samples,
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 60754de700d4..f3f1b95b808e 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -83,6 +83,23 @@ static int parse_callchain_sort_key(const char *value)
return -1;
}

+static int parse_callchain_value(const char *value)
+{
+ if (!strncmp(value, "percent", strlen(value))) {
+ callchain_param.value = CCVAL_PERCENT;
+ return 0;
+ }
+ if (!strncmp(value, "period", strlen(value))) {
+ callchain_param.value = CCVAL_PERIOD;
+ return 0;
+ }
+ if (!strncmp(value, "count", strlen(value))) {
+ callchain_param.value = CCVAL_COUNT;
+ return 0;
+ }
+ return -1;
+}
+
static int
__parse_callchain_report_opt(const char *arg, bool allow_record_opt)
{
@@ -106,7 +123,8 @@ __parse_callchain_report_opt(const char *arg, bool allow_record_opt)

if (!parse_callchain_mode(tok) ||
!parse_callchain_order(tok) ||
- !parse_callchain_sort_key(tok)) {
+ !parse_callchain_sort_key(tok) ||
+ !parse_callchain_value(tok)) {
/* parsing ok - move on to the next */
try_stack_size = false;
goto next;
@@ -820,13 +838,27 @@ char *callchain_node__sprintf_value(struct callchain_node *node,
{
double percent = 0.0;
u64 period = callchain_cumul_hits(node);
+ unsigned count = callchain_cumul_counts(node);

- if (callchain_param.mode == CHAIN_FOLDED)
+ if (callchain_param.mode == CHAIN_FOLDED) {
period = node->hit;
- if (total)
- percent = period * 100.0 / total;
+ count = node->count;
+ }

- scnprintf(bf, bfsize, "%.2f%%", percent);
+ switch (callchain_param.value) {
+ case CCVAL_PERIOD:
+ scnprintf(bf, bfsize, "%"PRIu64, period);
+ break;
+ case CCVAL_COUNT:
+ scnprintf(bf, bfsize, "%u", count);
+ break;
+ case CCVAL_PERCENT:
+ default:
+ if (total)
+ percent = period * 100.0 / total;
+ scnprintf(bf, bfsize, "%.2f%%", percent);
+ break;
+ }
return bf;
}

@@ -835,13 +867,25 @@ int callchain_node__fprintf_value(struct callchain_node *node,
{
double percent = 0.0;
u64 period = callchain_cumul_hits(node);
+ unsigned count = callchain_cumul_counts(node);

- if (callchain_param.mode == CHAIN_FOLDED)
+ if (callchain_param.mode == CHAIN_FOLDED) {
period = node->hit;
- if (total)
- percent = period * 100.0 / total;
+ count = node->count;
+ }

- return percent_color_fprintf(fp, "%.2f%%", percent);
+ switch (callchain_param.value) {
+ case CCVAL_PERIOD:
+ return fprintf(fp, "%"PRIu64, period);
+ case CCVAL_COUNT:
+ return fprintf(fp, "%u", count);
+ case CCVAL_PERCENT:
+ default:
+ if (total)
+ percent = period * 100.0 / total;
+ return percent_color_fprintf(fp, "%.2f%%", percent);
+ }
+ return 0;
}

static void free_callchain_node(struct callchain_node *node)
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 0e6cc83f1a46..b14d760fc4e3 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -29,7 +29,8 @@
HELP_PAD "print_limit:\tmaximum number of call graph entry (<number>)\n" \
HELP_PAD "order:\t\tcall graph order (caller|callee)\n" \
HELP_PAD "sort_key:\tcall graph sort key (function|address)\n" \
- HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n"
+ HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n" \
+ HELP_PAD "value:\t\tcall graph value (percent|period|count)\n"

enum perf_call_graph_mode {
CALLCHAIN_NONE,
@@ -81,6 +82,12 @@ enum chain_key {
CCKEY_ADDRESS
};

+enum chain_value {
+ CCVAL_PERCENT,
+ CCVAL_PERIOD,
+ CCVAL_COUNT,
+};
+
struct callchain_param {
bool enabled;
enum perf_call_graph_mode record_mode;
@@ -93,6 +100,7 @@ struct callchain_param {
bool order_set;
enum chain_key key;
bool branch_callstack;
+ enum chain_value value;
};

extern struct callchain_param callchain_param;
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 47b1e36c7ea0..75759aebc7b8 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -21,7 +21,8 @@ struct callchain_param callchain_param = {
.mode = CHAIN_GRAPH_ABS,
.min_percent = 0.5,
.order = ORDER_CALLEE,
- .key = CCKEY_FUNCTION
+ .key = CCKEY_FUNCTION,
+ .value = CCVAL_PERCENT,
};

/*
--
2.6.2
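
For readers following the value option, here is a minimal, self-contained C sketch of the two ideas in the patch above: prefix matching of the keyword and formatting of the chosen value. The names value_kind, parse_value_kind and format_value are hypothetical and local to the sketch; they are not perf APIs. It only mirrors the strncmp()-with-strlen(token) comparison and the switch shown in the diff.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-ins for the CCVAL_* values; the names are local to this sketch. */
enum value_kind { VAL_PERCENT, VAL_PERIOD, VAL_COUNT, VAL_INVALID };

/*
 * Like the patch, compare only strlen(tok) bytes, so any prefix of a
 * keyword is accepted and the first keyword tested wins when a prefix
 * is ambiguous.
 */
static enum value_kind parse_value_kind(const char *tok)
{
	size_t len;

	assert(tok != NULL);
	len = strlen(tok);

	if (!strncmp(tok, "percent", len))
		return VAL_PERCENT;
	if (!strncmp(tok, "period", len))
		return VAL_PERIOD;
	if (!strncmp(tok, "count", len))
		return VAL_COUNT;
	return VAL_INVALID;
}

/* Format one callchain value; anything but period/count falls back to percent. */
static int format_value(char *buf, size_t size, enum value_kind kind,
			uint64_t period, unsigned int count, uint64_t total)
{
	switch (kind) {
	case VAL_PERIOD:
		return snprintf(buf, size, "%" PRIu64, period);
	case VAL_COUNT:
		return snprintf(buf, size, "%u", count);
	default:
		return snprintf(buf, size, "%.2f%%",
				total ? period * 100.0 / total : 0.0);
	}
}
```

In isolation, "per" selects percent and "peri" selects period. In the real option parser the mode, order and sort-key parsers are tried first, so a very short prefix may be claimed by one of them before the value parser sees it.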

2015-11-09 05:47:04

by Namhyung Kim

Subject: [PATCH v5 5/9] perf hists browser: Factor out hist_browser__show_callchain_list()

This function prints a single callchain list entry. Since other
functions will also need it, factor it out into a separate function.

Cc: Brendan Gregg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/browsers/hists.c | 72 ++++++++++++++++++++++++++----------------
1 file changed, 45 insertions(+), 27 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index a8897aab4c4a..b5c2d073c6b6 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -574,6 +574,44 @@ static bool hist_browser__check_dump_full(struct hist_browser *browser __maybe_u

#define LEVEL_OFFSET_STEP 3

+static int hist_browser__show_callchain_list(struct hist_browser *browser,
+ struct callchain_node *node,
+ struct callchain_list *chain,
+ unsigned short row, u64 total,
+ bool need_percent, int offset,
+ print_callchain_entry_fn print,
+ struct callchain_print_arg *arg)
+{
+ char bf[1024], *alloc_str;
+ const char *str;
+
+ if (arg->row_offset != 0) {
+ arg->row_offset--;
+ return 0;
+ }
+
+ alloc_str = NULL;
+ str = callchain_list__sym_name(chain, bf, sizeof(bf),
+ browser->show_dso);
+
+ if (need_percent) {
+ char buf[64];
+
+ callchain_node__sprintf_value(node, buf, sizeof(buf),
+ total);
+
+ if (asprintf(&alloc_str, "%s %s", buf, str) < 0)
+ str = "Not enough memory!";
+ else
+ str = alloc_str;
+ }
+
+ print(browser, chain, str, offset, row, arg);
+
+ free(alloc_str);
+ return 1;
+}
+
static int hist_browser__show_callchain(struct hist_browser *browser,
struct rb_root *root, int level,
unsigned short row, u64 total,
@@ -598,8 +636,6 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
int extra_offset = 0;

list_for_each_entry(chain, &child->val, list) {
- char bf[1024], *alloc_str;
- const char *str;
bool was_first = first;

if (first)
@@ -608,34 +644,16 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
extra_offset = LEVEL_OFFSET_STEP;

folded_sign = callchain_list__folded(chain);
- if (arg->row_offset != 0) {
- arg->row_offset--;
- goto do_next;
- }

- alloc_str = NULL;
- str = callchain_list__sym_name(chain, bf, sizeof(bf),
- browser->show_dso);
+ row += hist_browser__show_callchain_list(browser, child,
+ chain, row, total,
+ was_first && need_percent,
+ offset + extra_offset,
+ print, arg);

- if (was_first && need_percent) {
- char buf[64];
-
- callchain_node__sprintf_value(child, buf, sizeof(buf),
- total);
-
- if (asprintf(&alloc_str, "%s %s", buf, str) < 0)
- str = "Not enough memory!";
- else
- str = alloc_str;
- }
-
- print(browser, chain, str, offset + extra_offset, row, arg);
-
- free(alloc_str);
-
- if (is_output_full(browser, ++row))
+ if (is_output_full(browser, row))
goto out;
-do_next:
+
if (folded_sign == '+')
break;
}
--
2.6.2

2015-11-09 05:46:53

by Namhyung Kim

Subject: [PATCH v5 6/9] perf hists browser: Support flat callchains

The flat callchain mode prints all chains in a single, simple
hierarchy so that they are easy to read.

Currently perf report --tui doesn't show flat callchains properly. With
flat callchains, only leaf nodes are added to the final rbtree, so the
entries of their parent nodes must be shown as well. To do that, add a
parent_val list to struct callchain_node and show its entries along with
the (normal) val list.

For example, consider the following callchains with '-g graph'.

$ perf report -g graph
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
- cpu_startup_entry
28.63% start_secondary
- 11.30% rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel

Before:
$ perf report -g flat
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
28.63% start_secondary
- 11.30% rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel

After:
$ perf report -g flat
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
- 28.63% intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
start_secondary
- 11.30% intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
start_kernel
x86_64_start_reservations
x86_64_start_kernel

Cc: Brendan Gregg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/browsers/hists.c | 122 ++++++++++++++++++++++++++++++++++++++++-
tools/perf/util/callchain.c | 44 +++++++++++++++
tools/perf/util/callchain.h | 2 +
3 files changed, 166 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index b5c2d073c6b6..f4216d92282d 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -178,12 +178,44 @@ static int callchain_node__count_rows_rb_tree(struct callchain_node *node)
return n;
}

+static int callchain_node__count_flat_rows(struct callchain_node *node)
+{
+ struct callchain_list *chain;
+ char folded_sign = 0;
+ int n = 0;
+
+ list_for_each_entry(chain, &node->parent_val, list) {
+ if (!folded_sign) {
+ /* only check first chain list entry */
+ folded_sign = callchain_list__folded(chain);
+ if (folded_sign == '+')
+ return 1;
+ }
+ n++;
+ }
+
+ list_for_each_entry(chain, &node->val, list) {
+ if (!folded_sign) {
+ /* node->parent_val list might be empty */
+ folded_sign = callchain_list__folded(chain);
+ if (folded_sign == '+')
+ return 1;
+ }
+ n++;
+ }
+
+ return n;
+}
+
static int callchain_node__count_rows(struct callchain_node *node)
{
struct callchain_list *chain;
bool unfolded = false;
int n = 0;

+ if (callchain_param.mode == CHAIN_FLAT)
+ return callchain_node__count_flat_rows(node);
+
list_for_each_entry(chain, &node->val, list) {
++n;
unfolded = chain->unfolded;
@@ -263,7 +295,7 @@ static void callchain_node__init_have_children(struct callchain_node *node,
chain = list_entry(node->val.next, struct callchain_list, list);
chain->has_children = has_sibling;

- if (!list_empty(&node->val)) {
+ if (node->val.next != node->val.prev) {
chain = list_entry(node->val.prev, struct callchain_list, list);
chain->has_children = !RB_EMPTY_ROOT(&node->rb_root);
}
@@ -279,6 +311,8 @@ static void callchain__init_have_children(struct rb_root *root)
for (nd = rb_first(root); nd; nd = rb_next(nd)) {
struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node);
callchain_node__init_have_children(node, has_sibling);
+ if (callchain_param.mode == CHAIN_FLAT)
+ callchain_node__make_parent_list(node);
}
}

@@ -612,6 +646,83 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
return 1;
}

+static int hist_browser__show_callchain_flat(struct hist_browser *browser,
+ struct rb_root *root,
+ unsigned short row, u64 total,
+ print_callchain_entry_fn print,
+ struct callchain_print_arg *arg,
+ check_output_full_fn is_output_full)
+{
+ struct rb_node *node;
+ int first_row = row, offset = LEVEL_OFFSET_STEP;
+ bool need_percent;
+
+ node = rb_first(root);
+ need_percent = node && rb_next(node);
+
+ while (node) {
+ struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
+ struct rb_node *next = rb_next(node);
+ struct callchain_list *chain;
+ char folded_sign = ' ';
+ int first = true;
+ int extra_offset = 0;
+
+ list_for_each_entry(chain, &child->parent_val, list) {
+ bool was_first = first;
+
+ if (first)
+ first = false;
+ else if (need_percent)
+ extra_offset = LEVEL_OFFSET_STEP;
+
+ folded_sign = callchain_list__folded(chain);
+
+ row += hist_browser__show_callchain_list(browser, child,
+ chain, row, total,
+ was_first && need_percent,
+ offset + extra_offset,
+ print, arg);
+
+ if (is_output_full(browser, row))
+ goto out;
+
+ if (folded_sign == '+')
+ goto next;
+ }
+
+ list_for_each_entry(chain, &child->val, list) {
+ bool was_first = first;
+
+ if (first)
+ first = false;
+ else if (need_percent)
+ extra_offset = LEVEL_OFFSET_STEP;
+
+ folded_sign = callchain_list__folded(chain);
+
+ row += hist_browser__show_callchain_list(browser, child,
+ chain, row, total,
+ was_first && need_percent,
+ offset + extra_offset,
+ print, arg);
+
+ if (is_output_full(browser, row))
+ goto out;
+
+ if (folded_sign == '+')
+ break;
+ }
+
+next:
+ if (is_output_full(browser, row))
+ break;
+ node = next;
+ }
+out:
+ return row - first_row;
+}
+
static int hist_browser__show_callchain(struct hist_browser *browser,
struct rb_root *root, int level,
unsigned short row, u64 total,
@@ -864,10 +975,17 @@ static int hist_browser__show_entry(struct hist_browser *browser,
total = entry->stat.period;
}

- printed += hist_browser__show_callchain(browser,
+ if (callchain_param.mode == CHAIN_FLAT) {
+ printed += hist_browser__show_callchain_flat(browser,
+ &entry->sorted_chain, row, total,
+ hist_browser__show_callchain_entry, &arg,
+ hist_browser__check_output_full);
+ } else {
+ printed += hist_browser__show_callchain(browser,
&entry->sorted_chain, 1, row, total,
hist_browser__show_callchain_entry, &arg,
hist_browser__check_output_full);
+ }

if (arg.is_current_entry)
browser->he_selection = entry;
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index f3f1b95b808e..f4fe000cea34 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -387,6 +387,7 @@ create_child(struct callchain_node *parent, bool inherit_children)
}
new->parent = parent;
INIT_LIST_HEAD(&new->val);
+ INIT_LIST_HEAD(&new->parent_val);

if (inherit_children) {
struct rb_node *n;
@@ -894,6 +895,11 @@ static void free_callchain_node(struct callchain_node *node)
struct callchain_node *child;
struct rb_node *n;

+ list_for_each_entry_safe(list, tmp, &node->parent_val, list) {
+ list_del(&list->list);
+ free(list);
+ }
+
list_for_each_entry_safe(list, tmp, &node->val, list) {
list_del(&list->list);
free(list);
@@ -917,3 +923,41 @@ void free_callchain(struct callchain_root *root)

free_callchain_node(&root->node);
}
+
+int callchain_node__make_parent_list(struct callchain_node *node)
+{
+ struct callchain_node *parent = node->parent;
+ struct callchain_list *chain, *new;
+ LIST_HEAD(head);
+
+ while (parent) {
+ list_for_each_entry_reverse(chain, &parent->val, list) {
+ new = malloc(sizeof(*new));
+ if (new == NULL)
+ goto out;
+ *new = *chain;
+ new->has_children = false;
+ list_add_tail(&new->list, &head);
+ }
+ parent = parent->parent;
+ }
+
+ list_for_each_entry_safe_reverse(chain, new, &head, list)
+ list_move_tail(&chain->list, &node->parent_val);
+
+ if (!list_empty(&node->parent_val)) {
+ chain = list_first_entry(&node->parent_val, struct callchain_list, list);
+ chain->has_children = rb_prev(&node->rb_node) || rb_next(&node->rb_node);
+
+ chain = list_first_entry(&node->val, struct callchain_list, list);
+ chain->has_children = false;
+ }
+ return 0;
+
+out:
+ list_for_each_entry_safe(chain, new, &head, list) {
+ list_del(&chain->list);
+ free(chain);
+ }
+ return -ENOMEM;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index b14d760fc4e3..3607e7a0f8a8 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -56,6 +56,7 @@ enum chain_order {
struct callchain_node {
struct callchain_node *parent;
struct list_head val;
+ struct list_head parent_val;
struct rb_node rb_node_in; /* to insert nodes in an rbtree */
struct rb_node rb_node; /* to sort nodes in an output tree */
struct rb_root rb_root_in; /* input tree of children */
@@ -251,5 +252,6 @@ int callchain_node__fprintf_value(struct callchain_node *node,
FILE *fp, u64 total);

void free_callchain(struct callchain_root *root);
+int callchain_node__make_parent_list(struct callchain_node *node);

#endif /* __PERF_CALLCHAIN_H */
--
2.6.2
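
To illustrate what the parent_val list ends up containing, here is a minimal C sketch under simplifying assumptions: each node keeps its own entries in a plain array rather than a list_head, and the names chain_node and flatten_chain are hypothetical, not perf code. The patch builds the same ordering with a temporary list and two reversals; the sketch gets there by filling an output array from the end.

```c
#include <assert.h>
#include <stddef.h>

struct chain_node {
	const struct chain_node *parent; /* NULL for a top-level node */
	const char *const *syms;         /* this node's own entries, in display order */
	size_t nr_syms;
};

/*
 * Write the full chain for leaf into out: the entries of every ancestor,
 * outermost first, followed by the leaf's own entries. Returns the number
 * of entries in the full chain; nothing is written if that exceeds max.
 */
static size_t flatten_chain(const struct chain_node *leaf,
			    const char **out, size_t max)
{
	const struct chain_node *n;
	size_t total = 0, pos;

	assert(leaf != NULL);

	for (n = leaf; n; n = n->parent)
		total += n->nr_syms;
	if (total > max)
		return total;

	/* Fill from the end so the walk towards the root needs no reversal. */
	pos = total;
	for (n = leaf; n; n = n->parent) {
		size_t i = n->nr_syms;

		while (i > 0)
			out[--pos] = n->syms[--i];
	}
	return total;
}
```

Calling flatten_chain() on a leaf whose ancestors hold the shared frames yields one complete chain per leaf, which is what the "After" output in the commit message shows for each flat entry.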

2015-11-09 05:46:56

by Namhyung Kim

Subject: [PATCH v5 7/9] perf hists browser: Support folded callchains

The folded callchain mode prints each chain on a single line.
Currently perf report --tui doesn't support folded callchains. As with
flat callchains, only leaf nodes are added to the final rbtree, so the
entries of their parent nodes must be shown as well. To do that, use the
parent_val list of struct callchain_node and show its entries along with
the (normal) val list.

For example, a folded callchain looks like this:

$ perf report -g folded --tui
Samples: 234 of event 'cycles:pp', Event count (approx.): 32605268
Overhead Command Shared Object Symbol
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
+ 28.63% intel_idle; cpuidle_enter_state; cpuidle_enter; ...
+ 11.30% intel_idle; cpuidle_enter_state; cpuidle_enter; ...

Cc: Brendan Gregg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/browsers/hists.c | 126 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 125 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index f4216d92282d..3efe7c74f47d 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -207,6 +207,11 @@ static int callchain_node__count_flat_rows(struct callchain_node *node)
return n;
}

+static int callchain_node__count_folded_rows(struct callchain_node *node __maybe_unused)
+{
+ return 1;
+}
+
static int callchain_node__count_rows(struct callchain_node *node)
{
struct callchain_list *chain;
@@ -215,6 +220,8 @@ static int callchain_node__count_rows(struct callchain_node *node)

if (callchain_param.mode == CHAIN_FLAT)
return callchain_node__count_flat_rows(node);
+ else if (callchain_param.mode == CHAIN_FOLDED)
+ return callchain_node__count_folded_rows(node);

list_for_each_entry(chain, &node->val, list) {
++n;
@@ -311,7 +318,8 @@ static void callchain__init_have_children(struct rb_root *root)
for (nd = rb_first(root); nd; nd = rb_next(nd)) {
struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node);
callchain_node__init_have_children(node, has_sibling);
- if (callchain_param.mode == CHAIN_FLAT)
+ if (callchain_param.mode == CHAIN_FLAT ||
+ callchain_param.mode == CHAIN_FOLDED)
callchain_node__make_parent_list(node);
}
}
@@ -723,6 +731,117 @@ static int hist_browser__show_callchain_flat(struct hist_browser *browser,
return row - first_row;
}

+static char *hist_browser__folded_callchain_str(struct hist_browser *browser,
+ struct callchain_list *chain,
+ char *value_str, char *old_str)
+{
+ char bf[1024];
+ const char *str;
+ char *new;
+
+ str = callchain_list__sym_name(chain, bf, sizeof(bf),
+ browser->show_dso);
+ if (old_str) {
+ if (asprintf(&new, "%s%s%s", old_str,
+ symbol_conf.field_sep ?: ";", str) < 0)
+ new = NULL;
+ } else {
+ if (value_str) {
+ if (asprintf(&new, "%s %s", value_str, str) < 0)
+ new = NULL;
+ } else {
+ if (asprintf(&new, "%s", str) < 0)
+ new = NULL;
+ }
+ }
+ return new;
+}
+
+static int hist_browser__show_callchain_folded(struct hist_browser *browser,
+ struct rb_root *root,
+ unsigned short row, u64 total,
+ print_callchain_entry_fn print,
+ struct callchain_print_arg *arg,
+ check_output_full_fn is_output_full)
+{
+ struct rb_node *node;
+ int first_row = row, offset = LEVEL_OFFSET_STEP;
+ bool need_percent;
+
+ node = rb_first(root);
+ need_percent = node && rb_next(node);
+
+ while (node) {
+ struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
+ struct rb_node *next = rb_next(node);
+ struct callchain_list *chain, *first_chain = NULL;
+ int first = true;
+ char *value_str = NULL, *value_str_alloc = NULL;
+ char *chain_str = NULL, *chain_str_alloc = NULL;
+
+ if (arg->row_offset != 0) {
+ arg->row_offset--;
+ goto next;
+ }
+
+ if (need_percent) {
+ char buf[64];
+
+ callchain_node__sprintf_value(child, buf, sizeof(buf),
+ total);
+ if (asprintf(&value_str, "%s", buf) < 0) {
+ value_str = (char *)"<...>";
+ goto do_print;
+ }
+ value_str_alloc = value_str;
+ }
+
+ list_for_each_entry(chain, &child->parent_val, list) {
+ chain_str = hist_browser__folded_callchain_str(browser,
+ chain, value_str, chain_str);
+ if (first) {
+ first = false;
+ first_chain = chain;
+ }
+
+ if (chain_str == NULL) {
+ chain_str = (char *)"Not enough memory!";
+ goto do_print;
+ }
+
+ chain_str_alloc = chain_str;
+ }
+
+ list_for_each_entry(chain, &child->val, list) {
+ chain_str = hist_browser__folded_callchain_str(browser,
+ chain, value_str, chain_str);
+ if (first) {
+ first = false;
+ first_chain = chain;
+ }
+
+ if (chain_str == NULL) {
+ chain_str = (char *)"Not enough memory!";
+ goto do_print;
+ }
+
+ chain_str_alloc = chain_str;
+ }
+
+do_print:
+ print(browser, first_chain, chain_str, offset, row++, arg);
+ free(value_str_alloc);
+ free(chain_str_alloc);
+
+next:
+ if (is_output_full(browser, row))
+ break;
+ node = next;
+ }
+
+ return row - first_row;
+}
+
static int hist_browser__show_callchain(struct hist_browser *browser,
struct rb_root *root, int level,
unsigned short row, u64 total,
@@ -980,6 +1099,11 @@ static int hist_browser__show_entry(struct hist_browser *browser,
&entry->sorted_chain, row, total,
hist_browser__show_callchain_entry, &arg,
hist_browser__check_output_full);
+ } else if (callchain_param.mode == CHAIN_FOLDED) {
+ printed += hist_browser__show_callchain_folded(browser,
+ &entry->sorted_chain, row, total,
+ hist_browser__show_callchain_entry, &arg,
+ hist_browser__check_output_full);
} else {
printed += hist_browser__show_callchain(browser,
&entry->sorted_chain, 1, row, total,
--
2.6.2
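
As a rough illustration of how one folded line is assembled, here is a self-contained C sketch. The function name fold_chain is hypothetical. Unlike the patch, which grows the string with repeated asprintf() calls, the sketch measures the result first and allocates once; that is a substitution made to keep the example portable and short, not a description of the perf code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Join nr symbol names with sep and, if value is non-NULL, prefix the
 * result with "<value> ". Returns a malloc()ed string the caller must
 * free(), or NULL on allocation failure.
 */
static char *fold_chain(const char *value, const char *const *syms,
			size_t nr, const char *sep)
{
	size_t len = 1, i;	/* 1 for the terminating NUL */
	char *buf, *p;

	assert(syms != NULL && sep != NULL);

	if (value)
		len += strlen(value) + 1;
	for (i = 0; i < nr; i++)
		len += strlen(syms[i]) + (i ? strlen(sep) : 0);

	buf = malloc(len);
	if (!buf)
		return NULL;

	/* The buffer was sized above, so plain sprintf() cannot overflow it. */
	p = buf;
	if (value)
		p += sprintf(p, "%s ", value);
	for (i = 0; i < nr; i++)
		p += sprintf(p, "%s%s", i ? sep : "", syms[i]);
	*p = '\0';
	return buf;
}
```

A caller would pass the user's field separator when one was given with -t and ";" otherwise, matching the fallback used in the TUI code above.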

2015-11-09 05:46:48

by Namhyung Kim

Subject: [PATCH v5 8/9] perf ui/gtk: Support flat callchains

The flat callchain mode prints all chains in a simple flat
hierarchy so that they are easy to read.

Currently perf report --gtk doesn't show flat callchains properly. With
flat callchains, only leaf nodes are added to the final rbtree, so the
entries of their parent nodes must be shown as well. To do that, use the
parent_val list of struct callchain_node and show its entries along with
the (normal) val list.

See the previous commit on TUI support for more information.

Cc: Brendan Gregg <[email protected]>
Cc: Pekka Enberg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/gtk/hists.c | 80 ++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 76 insertions(+), 4 deletions(-)

diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index d8037b7023e8..62f0b7792381 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -89,8 +89,71 @@ void perf_gtk__init_hpp(void)
perf_gtk__hpp_color_overhead_acc;
}

-static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
- GtkTreeIter *parent, int col, u64 total)
+static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *store,
+ GtkTreeIter *parent, int col, u64 total)
+{
+ struct rb_node *nd;
+ bool has_single_node = (rb_first(root) == rb_last(root));
+
+ for (nd = rb_first(root); nd; nd = rb_next(nd)) {
+ struct callchain_node *node;
+ struct callchain_list *chain;
+ GtkTreeIter iter, new_parent;
+ bool need_new_parent;
+
+ node = rb_entry(nd, struct callchain_node, rb_node);
+
+ new_parent = *parent;
+ need_new_parent = !has_single_node;
+
+ callchain_node__make_parent_list(node);
+
+ list_for_each_entry(chain, &node->parent_val, list) {
+ char buf[128];
+
+ gtk_tree_store_append(store, &iter, &new_parent);
+
+ callchain_node__sprintf_value(node, buf, sizeof(buf), total);
+ gtk_tree_store_set(store, &iter, 0, buf, -1);
+
+ callchain_list__sym_name(chain, buf, sizeof(buf), false);
+ gtk_tree_store_set(store, &iter, col, buf, -1);
+
+ if (need_new_parent) {
+ /*
+ * Only show the top-most symbol in a callchain
+ * if it's not the only callchain.
+ */
+ new_parent = iter;
+ need_new_parent = false;
+ }
+ }
+
+ list_for_each_entry(chain, &node->val, list) {
+ char buf[128];
+
+ gtk_tree_store_append(store, &iter, &new_parent);
+
+ callchain_node__sprintf_value(node, buf, sizeof(buf), total);
+ gtk_tree_store_set(store, &iter, 0, buf, -1);
+
+ callchain_list__sym_name(chain, buf, sizeof(buf), false);
+ gtk_tree_store_set(store, &iter, col, buf, -1);
+
+ if (need_new_parent) {
+ /*
+ * Only show the top-most symbol in a callchain
+ * if it's not the only callchain.
+ */
+ new_parent = iter;
+ need_new_parent = false;
+ }
+ }
+ }
+}
+
+static void perf_gtk__add_callchain_graph(struct rb_root *root, GtkTreeStore *store,
+ GtkTreeIter *parent, int col, u64 total)
{
struct rb_node *nd;
bool has_single_node = (rb_first(root) == rb_last(root));
@@ -134,11 +197,20 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
child_total = total;

/* Now 'iter' contains info of the last callchain_list */
- perf_gtk__add_callchain(&node->rb_root, store, &iter, col,
- child_total);
+ perf_gtk__add_callchain_graph(&node->rb_root, store, &iter, col,
+ child_total);
}
}

+static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
+ GtkTreeIter *parent, int col, u64 total)
+{
+ if (callchain_param.mode == CHAIN_FLAT)
+ perf_gtk__add_callchain_flat(root, store, parent, col, total);
+ else
+ perf_gtk__add_callchain_graph(root, store, parent, col, total);
+}
+
static void on_row_activated(GtkTreeView *view, GtkTreePath *path,
GtkTreeViewColumn *col __maybe_unused,
gpointer user_data __maybe_unused)
--
2.6.2

2015-11-09 05:46:51

by Namhyung Kim

Subject: [PATCH v5 9/9] perf ui/gtk: Support folded callchains

The folded callchain mode prints each chain on a single line.
Currently perf report --gtk doesn't support folded callchains. As with
flat callchains, only leaf nodes are added to the final rbtree, so the
entries of their parent nodes must be shown as well.

Cc: Brendan Gregg <[email protected]>
Cc: Pekka Enberg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/gtk/hists.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)

diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 62f0b7792381..6105b4921754 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -152,6 +152,66 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
}
}

+static void perf_gtk__add_callchain_folded(struct rb_root *root, GtkTreeStore *store,
+ GtkTreeIter *parent, int col, u64 total)
+{
+ struct rb_node *nd;
+
+ for (nd = rb_first(root); nd; nd = rb_next(nd)) {
+ struct callchain_node *node;
+ struct callchain_list *chain;
+ GtkTreeIter iter;
+ char buf[64];
+ char *str, *str_alloc = NULL;
+ bool first = true;
+
+ node = rb_entry(nd, struct callchain_node, rb_node);
+
+ callchain_node__make_parent_list(node);
+
+ list_for_each_entry(chain, &node->parent_val, list) {
+ char name[1024];
+
+ callchain_list__sym_name(chain, name, sizeof(name), false);
+
+ if (asprintf(&str, "%s%s%s",
+ first ? "" : str_alloc,
+ first ? "" : symbol_conf.field_sep ?: "; ",
+ name) < 0)
+ return;
+
+ first = false;
+ free(str_alloc);
+ str_alloc = str;
+ }
+
+ list_for_each_entry(chain, &node->val, list) {
+ char name[1024];
+
+ callchain_list__sym_name(chain, name, sizeof(name), false);
+
+ if (asprintf(&str, "%s%s%s",
+ first ? "" : str_alloc,
+ first ? "" : symbol_conf.field_sep ?: "; ",
+ name) < 0)
+ return;
+
+ first = false;
+ free(str_alloc);
+ str_alloc = str;
+ }
+
+ gtk_tree_store_append(store, &iter, parent);
+
+ callchain_node__sprintf_value(node, buf, sizeof(buf), total);
+ gtk_tree_store_set(store, &iter, 0, buf, -1);
+
+ gtk_tree_store_set(store, &iter, col, str, -1);
+
+ free(str_alloc);
+ }
+}
+
static void perf_gtk__add_callchain_graph(struct rb_root *root, GtkTreeStore *store,
GtkTreeIter *parent, int col, u64 total)
{
@@ -207,6 +267,8 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
{
if (callchain_param.mode == CHAIN_FLAT)
perf_gtk__add_callchain_flat(root, store, parent, col, total);
+ else if (callchain_param.mode == CHAIN_FOLDED)
+ perf_gtk__add_callchain_folded(root, store, parent, col, total);
else
perf_gtk__add_callchain_graph(root, store, parent, col, total);
}
--
2.6.2

2015-11-12 17:50:53

by Brendan Gregg

Subject: Re: [PATCHSET 0/9] perf report: Support folded callchain output (v5)

On Sun, Nov 8, 2015 at 9:45 PM, Namhyung Kim <[email protected]> wrote:
> Hello,
>
> This is what Brendan requested on the perf-users mailing list [1] to
> support FlameGraphs [2] more efficiently. This patchset adds a few
> more callchain options to adjust the output for it.
>
> * changes in v5)
> - honor field separator from -t option
> - add support for TUI and GTK
>
> * changes in v4)
> - add missing doc update
> - cleanup/fix callchain value print code
> - add Acked-by from Brendan and Jiri
>
> * changes in v3)
> - put the value before callchains
> - fix compile error
>
>
> First, a 'folded' output mode was added. The folded output prints the
> value, a space and all callchain nodes separated by semicolons. For now
> it only supports --stdio, as the other UIs provide some way of folding
> and/or expanding callchains dynamically.
>
> The value can now be one of 'percent', 'period', or 'count'. The
> percent is the current default output and the period is the raw number
> of sample periods. The count is the number of samples for each callchain.
>
> The proposed features of hiding hist lines with '-F none' and showing
> hist info with callchains can be added as later work.
>
> Here's an example:
>
> $ perf report --no-children --show-nr-samples --stdio -g folded,count
> ...
> 39.93% 80 swapper [kernel.vmlinux] [k] intel_idel
> 57 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
> 23 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;...
>

Thanks, I tested it, it works!

It lets me do this:

# ./perf report --no-children -n --stdio -g folded,count -F pid
[...]
0:swapper
1032 xen_hypercall_sched_op;default_idle;arch_cpu_idle;default_idle_call;cpu_startup_entry;cpu_bringup_and_idle
134 xen_hypercall_sched_op;default_idle;arch_cpu_idle;default_idle_call;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;xen_start_kernel
1 xen_hypercall_xen_version;check_events;__schedule;schedule;schedule_preempt_disabled;cpu_startup_entry;cpu_bringup_and_idle
1 xen_hypercall_xen_version;check_events;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;xen_start_kernel
1248:bash
43 copy_page_range;copy_process;_do_fork;sys_clone;entry_SYSCALL_64_fastpath;__libc_fork;make_child
6 xen_hypercall_xen_version;check_events;xen_dup_mmap;copy_process;_do_fork;sys_clone;entry_SYSCALL_64_fastpath;__libc_fork;make_child
4 xen_hypercall_xen_version;check_events;copy_page_range;copy_process;_do_fork;sys_clone;entry_SYSCALL_64_fastpath;__libc_fork;make_child
[...]

This is a parsable call chain summary, which can be flamegraph'd
after a touch of awk. Later on we can add a "-F none" and "-g pid"
etc, but this patch solves the number one issue: it avoids the
expense of re-aggregating the call chain output (the output of
perf script), so I'd be pretty happy to use this instead.
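
As a rough illustration of that post-processing step, here is a minimal
sketch in C rather than awk. It assumes the layout shown above: a
"pid:comm" header line followed by "<count> <chain>" lines, and it emits
"comm;chain count" lines, the folded-stack form that flamegraph.pl
reads. The names fold_line() and convert() are hypothetical and are not
part of perf. The sketch keeps the frame order as printed, so the
callchain order (caller vs. callee) may still need adjusting.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: handle one line of "-g folded,count -F pid" output.
 * A header such as "1248:bash" is remembered in comm; a sample line such
 * as "43 a;b;c" is rewritten into out as "bash;a;b;c 43".
 * Returns 1 if out was filled, 0 for headers and unrecognised lines.
 */
int fold_line(const char *line, char *comm, size_t commsz,
	      char *out, size_t outsz)
{
	unsigned long count;
	size_t len;
	char *end;
	int n;

	while (isspace((unsigned char)*line))
		line++;
	if (!isdigit((unsigned char)*line))
		return 0;			/* e.g. "[...]" or blank */

	count = strtoul(line, &end, 10);
	if (*end == ':') {			/* "pid:comm" header */
		snprintf(comm, commsz, "%s", end + 1);
		len = strcspn(comm, "\r\n");
		while (len > 0 && comm[len - 1] == ' ')
			len--;
		comm[len] = '\0';
		return 0;
	}
	if (*end != ' ' || comm[0] == '\0')
		return 0;
	while (*end == ' ')
		end++;

	len = strcspn(end, "\r\n");		/* chain without newline */
	if (len == 0)
		return 0;
	n = snprintf(out, outsz, "%s;%.*s %lu", comm, (int)len, end, count);
	return n > 0 && (size_t)n < outsz;
}

/* Convert a whole report: one output line per callchain line. */
void convert(FILE *in, FILE *out)
{
	char line[4096], comm[64] = "", folded[4200];

	while (fgets(line, sizeof(line), in))
		if (fold_line(line, comm, sizeof(comm), folded, sizeof(folded)))
			fprintf(out, "%s\n", folded);
}
```

Calling convert(stdin, stdout) from a main() and piping the perf report
output through it would produce one "comm;frames count" line per
callchain, with no re-aggregation needed.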

Brendan

>
> $ perf report --no-children --stdio -g percent
> ...
> 39.93% swapper [kernel.vmlinux] [k] intel_idel
> |
> ---intel_idle
> cpuidle_enter_state
> cpuidle_enter
> call_cpuidle
> cpu_startup_entry
> |
> |--28.63%-- start_secondary
> |
> --11.30%-- rest_init
>
>
> $ perf report --no-children --stdio --show-total-period -g period
> ...
> 39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idel
> |
> ---intel_idle
> cpuidle_enter_state
> cpuidle_enter
> call_cpuidle
> cpu_startup_entry
> |
> |--9334403-- start_secondary
> |
> --3684302-- rest_init
>
>
> $ perf report --no-children --stdio --show-nr-samples -g count
> ...
> 39.93% 80 swapper [kernel.vmlinux] [k] intel_idel
> |
> ---intel_idle
> cpuidle_enter_state
> cpuidle_enter
> call_cpuidle
> cpu_startup_entry
> |
> |--57-- start_secondary
> |
> --23-- rest_init
>
>
> You can get it from 'perf/callchain-fold-v5' branch on my tree:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Any comments are welcome, thanks
> Namhyung
>
>
> [1] http://www.spinics.net/lists/linux-perf-users/msg02498.html
> [2] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
>
>
> Namhyung Kim (9):
> perf report: Support folded callchain mode on --stdio
> perf callchain: Abstract callchain print function
> perf callchain: Add count fields to struct callchain_node
> perf report: Add callchain value option
> perf hists browser: Factor out hist_browser__show_callchain_list()
> perf hists browser: Support flat callchains
> perf hists browser: Support folded callchains
> perf ui/gtk: Support flat callchains
> perf ui/gtk: Support folded callchains
>
> tools/perf/Documentation/perf-report.txt | 14 +-
> tools/perf/builtin-report.c | 4 +-
> tools/perf/ui/browsers/hists.c | 316 ++++++++++++++++++++++++++++---
> tools/perf/ui/gtk/hists.c | 148 ++++++++++++++-
> tools/perf/ui/stdio/hist.c | 94 +++++++--
> tools/perf/util/callchain.c | 135 ++++++++++++-
> tools/perf/util/callchain.h | 28 ++-
> tools/perf/util/util.c | 3 +-
> 8 files changed, 679 insertions(+), 63 deletions(-)
>
> --
> 2.6.2
>

2015-11-16 23:09:47

by Namhyung Kim

Subject: Re: [PATCHSET 0/9] perf report: Support folded callchain output (v5)

Hi Brendan and Arnaldo,

On Thu, Nov 12, 2015 at 09:50:21AM -0800, Brendan Gregg wrote:
> On Sun, Nov 8, 2015 at 9:45 PM, Namhyung Kim <[email protected]> wrote:
> > Hello,
> >
> > This is what Brendan requested on the perf-users mailing list [1] to
> > support FlameGraphs [2] more efficiently. This patchset adds a few
> > more callchain options to adjust the output for it.
> >
> > * changes in v5)
> > - honor field separator from -t option
> > - add support for TUI and GTK
> >
> > * changes in v4)
> > - add missing doc update
> > - cleanup/fix callchain value print code
> > - add Acked-by from Brendan and Jiri
> >
> > * changes in v3)
> > - put the value before callchains
> > - fix compile error
> >
> >
> > At first, 'folded' output mode was added. The folded output puts the
> > value, a space and all calchain nodes separated by semicolons. Now it
> > only supports --stdio as other UI provides some way of folding and/or
> > expanding callchains dynamically.
> >
> > The value is now can be one of 'percent', 'period', or 'count'. The
> > percent is current default output and the period is the raw number of
> > sample periods. The count is the number of samples for each callchain.
> >
> > The proposed features of hiding hist lines with '-F none' and showing
> > hist info with callchains can be added as later work.
> >
> > Here's an example:
> >
> > $ perf report --no-children --show-nr-samples --stdio -g folded,count
> > ...
> > 39.93% 80 swapper [kernel.vmlinux] [k] intel_idel
> > 57 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
> > 23 intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;...
> >
>
> Thanks, I tested it, it works!
>
> It lets me do this:
>
> # ./perf report --no-children -n --stdio -g folded,count -F pid
> [...]
> 0:swapper
> 1032 xen_hypercall_sched_op;default_idle;arch_cpu_idle;default_idle_call;cpu_startup_entry;cpu_bringup_and_idle
> 134 xen_hypercall_sched_op;default_idle;arch_cpu_idle;default_idle_call;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;xen_start_kernel
> 1 xen_hypercall_xen_version;check_events;__schedule;schedule;schedule_preempt_disabled;cpu_startup_entry;cpu_bringup_and_idle
> 1 xen_hypercall_xen_version;check_events;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;xen_start_kernel
> 1248:bash
> 43 copy_page_range;copy_process;_do_fork;sys_clone;entry_SYSCALL_64_fastpath;__libc_fork;make_child
> 6 xen_hypercall_xen_version;check_events;xen_dup_mmap;copy_process;_do_fork;sys_clone;entry_SYSCALL_64_fastpath;__libc_fork;make_child
> 4 xen_hypercall_xen_version;check_events;copy_page_range;copy_process;_do_fork;sys_clone;entry_SYSCALL_64_fastpath;__libc_fork;make_child
> [...]
>
> This is a parsable call chain summary, and which can be flamegraph'd
> after a touch of awk. Later on we can add a "-F none" and "-g pid"
> etc, but this patch solves the number one issue of avoiding the
> expense of needing to re-aggregate the call chain output (output of
> perf script), so I'd be pretty happy to use this instead.

Thank you for testing.

Arnaldo, could you please take a look at this?

Thanks,
Namhyung


>
> >
> > $ perf report --no-children --stdio -g percent
> > ...
> > 39.93% swapper [kernel.vmlinux] [k] intel_idel
> > |
> > ---intel_idle
> > cpuidle_enter_state
> > cpuidle_enter
> > call_cpuidle
> > cpu_startup_entry
> > |
> > |--28.63%-- start_secondary
> > |
> > --11.30%-- rest_init
> >
> >
> > $ perf report --no-children --stdio --show-total-period -g period
> > ...
> > 39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idel
> > |
> > ---intel_idle
> > cpuidle_enter_state
> > cpuidle_enter
> > call_cpuidle
> > cpu_startup_entry
> > |
> > |--9334403-- start_secondary
> > |
> > --3684302-- rest_init
> >
> >
> > $ perf report --no-children --stdio --show-nr-samples -g count
> > ...
> > 39.93% 80 swapper [kernel.vmlinux] [k] intel_idel
> > |
> > ---intel_idle
> > cpuidle_enter_state
> > cpuidle_enter
> > call_cpuidle
> > cpu_startup_entry
> > |
> > |--57-- start_secondary
> > |
> > --23-- rest_init
> >
> >
> > You can get it from 'perf/callchain-fold-v5' branch on my tree:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
> >
> > Any comments are welcome, thanks
> > Namhyung
> >
> >
> > [1] http://www.spinics.net/lists/linux-perf-users/msg02498.html
> > [2] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html
> >
> >
> > Namhyung Kim (9):
> > perf report: Support folded callchain mode on --stdio
> > perf callchain: Abstract callchain print function
> > perf callchain: Add count fields to struct callchain_node
> > perf report: Add callchain value option
> > perf hists browser: Factor out hist_browser__show_callchain_list()
> > perf hists browser: Support flat callchains
> > perf hists browser: Support folded callchains
> > perf ui/gtk: Support flat callchains
> > perf ui/gtk: Support folded callchains
> >
> > tools/perf/Documentation/perf-report.txt | 14 +-
> > tools/perf/builtin-report.c | 4 +-
> > tools/perf/ui/browsers/hists.c | 316 ++++++++++++++++++++++++++++---
> > tools/perf/ui/gtk/hists.c | 148 ++++++++++++++-
> > tools/perf/ui/stdio/hist.c | 94 +++++++--
> > tools/perf/util/callchain.c | 135 ++++++++++++-
> > tools/perf/util/callchain.h | 28 ++-
> > tools/perf/util/util.c | 3 +-
> > 8 files changed, 679 insertions(+), 63 deletions(-)
> >
> > --
> > 2.6.2
> >

2015-11-17 00:09:38

by Arnaldo Carvalho de Melo

Subject: Re: [PATCHSET 0/9] perf report: Support folded callchain output (v5)

Em Tue, Nov 17, 2015 at 08:09:42AM +0900, Namhyung Kim escreveu:
> On Thu, Nov 12, 2015 at 09:50:21AM -0800, Brendan Gregg wrote:
> > Thanks, I tested it, it works!

> Thank you for testing.

> Arnaldo, could you please take a look at this?

I will, haven't yet because I saw a WIP, thought that you would post
something newer.

- Arnaldo

2015-11-17 00:23:21

by Namhyung Kim

Subject: Re: [PATCHSET 0/9] perf report: Support folded callchain output (v5)

On Mon, Nov 16, 2015 at 09:09:33PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Nov 17, 2015 at 08:09:42AM +0900, Namhyung Kim escreveu:
> > On Thu, Nov 12, 2015 at 09:50:21AM -0800, Brendan Gregg wrote:
> > > Thanks, I tested it, it works!
>
> > Thank you for testing.
>
> > Arnaldo, could you please take a look at this?
>
> I will, haven't yet because I saw a WIP, thought that you would post
> something newer.

That was sent by mistake. The WIP patch is not a part of this series.

Thanks,
Namhyung

2015-11-17 01:32:16

by Arnaldo Carvalho de Melo

Subject: Re: [PATCHSET 0/9] perf report: Support folded callchain output (v5)

Em Tue, Nov 17, 2015 at 09:22:45AM +0900, Namhyung Kim escreveu:
> On Mon, Nov 16, 2015 at 09:09:33PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Tue, Nov 17, 2015 at 08:09:42AM +0900, Namhyung Kim escreveu:
> > > On Thu, Nov 12, 2015 at 09:50:21AM -0800, Brendan Gregg wrote:
> > > > Thanks, I tested it, it works!
> >
> > > Thank you for testing.
> >
> > > Arnaldo, could you please take a look at this?
> >
> > I will, haven't yet because I saw a WIP, thought that you would post
> > something newer.
>
> That was sent by mistake. The WIP patch is not a part of this series.

Ok, so I should consider that patchkit modulo that one, ok.

- Arnaldo

2015-11-19 13:41:42

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v5 2/9] perf callchain: Abstract callchain print function

Em Mon, Nov 09, 2015 at 02:45:39PM +0900, Namhyung Kim escreveu:
> This is a preparation to support for printing other type of callchain
> value like count or period.
>
> Acked-by: Brendan Gregg <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/ui/browsers/hists.c | 8 +++++---
> tools/perf/ui/gtk/hists.c | 8 ++------
> tools/perf/ui/stdio/hist.c | 35 +++++++++++++++++------------------
> tools/perf/util/callchain.c | 29 +++++++++++++++++++++++++++++
> tools/perf/util/callchain.h | 4 ++++
> 5 files changed, 57 insertions(+), 27 deletions(-)
>
> diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
> index e5afb8936040..a8897aab4c4a 100644
> --- a/tools/perf/ui/browsers/hists.c
> +++ b/tools/perf/ui/browsers/hists.c
> @@ -592,7 +592,6 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
> while (node) {
> struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
> struct rb_node *next = rb_next(node);
> - u64 cumul = callchain_cumul_hits(child);
> struct callchain_list *chain;
> char folded_sign = ' ';
> int first = true;
> @@ -619,9 +618,12 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
> browser->show_dso);
>
> if (was_first && need_percent) {
> - double percent = cumul * 100.0 / total;
> + char buf[64];
>
> - if (asprintf(&alloc_str, "%2.2f%% %s", percent, str) < 0)
> + callchain_node__sprintf_value(child, buf, sizeof(buf),
> + total);
> +
> + if (asprintf(&alloc_str, "%s %s", buf, str) < 0)
> str = "Not enough memory!";
> else
> str = alloc_str;
> diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
> index 4b3585eed1e8..d8037b7023e8 100644
> --- a/tools/perf/ui/gtk/hists.c
> +++ b/tools/perf/ui/gtk/hists.c
> @@ -100,14 +100,10 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
> struct callchain_list *chain;
> GtkTreeIter iter, new_parent;
> bool need_new_parent;
> - double percent;
> - u64 hits, child_total;
> + u64 child_total;
>
> node = rb_entry(nd, struct callchain_node, rb_node);
>
> - hits = callchain_cumul_hits(node);
> - percent = 100.0 * hits / total;
> -
> new_parent = *parent;
> need_new_parent = !has_single_node && (node->val_nr > 1);
>
> @@ -116,7 +112,7 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
>
> gtk_tree_store_append(store, &iter, &new_parent);
>
> - scnprintf(buf, sizeof(buf), "%5.2f%%", percent);
> + callchain_node__sprintf_value(node, buf, sizeof(buf), total);
> gtk_tree_store_set(store, &iter, 0, buf, -1);
>
> callchain_list__sym_name(chain, buf, sizeof(buf), false);
> diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
> index ea7984932d9a..f4de055cab9b 100644
> --- a/tools/perf/ui/stdio/hist.c
> +++ b/tools/perf/ui/stdio/hist.c
> @@ -34,10 +34,10 @@ static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
> return ret;
> }
>
> -static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
> +static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
> + struct callchain_list *chain,
> int depth, int depth_mask, int period,
> - u64 total_samples, u64 hits,
> - int left_margin)
> + u64 total_samples, int left_margin)
> {
> int i;
> size_t ret = 0;
> @@ -50,10 +50,9 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
> else
> ret += fprintf(fp, " ");
> if (!period && i == depth - 1) {
> - double percent;
> -
> - percent = hits * 100.0 / total_samples;
> - ret += percent_color_fprintf(fp, "--%2.2f%%-- ", percent);
> + ret += fprintf(fp, "--");
> + ret += callchain_node__fprintf_value(node, fp, total_samples);
> + ret += fprintf(fp, "--");
> } else
> ret += fprintf(fp, "%s", " ");
> }
> @@ -120,10 +119,9 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
> left_margin);
> i = 0;
> list_for_each_entry(chain, &child->val, list) {
> - ret += ipchain__fprintf_graph(fp, chain, depth,
> + ret += ipchain__fprintf_graph(fp, child, chain, depth,
> new_depth_mask, i++,
> total_samples,
> - cumul,
> left_margin);
> }
>
> @@ -143,14 +141,17 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
>
> if (callchain_param.mode == CHAIN_GRAPH_REL &&
> remaining && remaining != total_samples) {
> + struct callchain_node rem_node = {
> + .hit = remaining,
> + };
>
> if (!rem_sq_bracket)
> return ret;
>
> new_depth_mask &= ~(1 << (depth - 1));
> - ret += ipchain__fprintf_graph(fp, &rem_hits, depth,
> + ret += ipchain__fprintf_graph(fp, &rem_node, &rem_hits, depth,
> new_depth_mask, 0, total_samples,
> - remaining, left_margin);
> + left_margin);
> }
>
> return ret;
> @@ -243,12 +244,11 @@ static size_t callchain__fprintf_flat(FILE *fp, struct rb_root *tree,
> struct rb_node *rb_node = rb_first(tree);
>
> while (rb_node) {
> - double percent;
> -
> chain = rb_entry(rb_node, struct callchain_node, rb_node);
> - percent = chain->hit * 100.0 / total_samples;
>
> - ret = percent_color_fprintf(fp, " %6.2f%%\n", percent);
> + ret += fprintf(fp, " ");
> + ret += callchain_node__fprintf_value(chain, fp, total_samples);
> + ret += fprintf(fp, "\n");
> ret += __callchain__fprintf_flat(fp, chain, total_samples);
> ret += fprintf(fp, "\n");
> if (++entries_printed == callchain_param.print_limit)
> @@ -295,12 +295,11 @@ static size_t callchain__fprintf_folded(FILE *fp, struct rb_root *tree,
> struct rb_node *rb_node = rb_first(tree);
>
> while (rb_node) {
> - double percent;
>
> chain = rb_entry(rb_node, struct callchain_node, rb_node);
> - percent = chain->hit * 100.0 / total_samples;
>
> - ret += fprintf(fp, "%.2f%% ", percent);
> + ret += callchain_node__fprintf_value(chain, fp, total_samples);
> + ret += fprintf(fp, " ");
> ret += __callchain__fprintf_folded(fp, chain);
> ret += fprintf(fp, "\n");
> if (++entries_printed == callchain_param.print_limit)
> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
> index 08cb220ba5ea..e2ef9b38acb6 100644
> --- a/tools/perf/util/callchain.c
> +++ b/tools/perf/util/callchain.c
> @@ -805,6 +805,35 @@ char *callchain_list__sym_name(struct callchain_list *cl,
> return bf;
> }
>
> +char *callchain_node__sprintf_value(struct callchain_node *node,
> + char *bf, size_t bfsize, u64 total)

sprintf doesn't take a bfsize, snprintf does, but we don't use that
here: the body calls scnprintf(). So I'm renaming it to
callchain_node__scnprintf_value(), so that the name recalls the
semantics associated with this operation.
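
For readers unfamiliar with the distinction, a minimal userspace sketch
of the scnprintf() return-value semantics follows. sketch_scnprintf()
is a hypothetical stand-in written for illustration, not the helper
perf actually uses: snprintf() returns the length the full output would
have had, while the scnprintf() style returns what was really stored.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Format into buf like vsnprintf(), but return the number of characters
 * actually stored (excluding the trailing NUL) instead of the length the
 * untruncated output would have needed.
 */
int sketch_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (n < 0)
		return 0;			/* encoding error */
	if ((size_t)n < size)
		return n;			/* everything fit */
	return size ? (int)(size - 1) : 0;	/* output was truncated */
}
```

With an 8-byte buffer and a 10-character string, snprintf() would
report 10 whereas this sketch reports 7, which is the safer value to
use when advancing a write offset into the same buffer.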

> +{
> + double percent = 0.0;
> + u64 period = callchain_cumul_hits(node);
> +
> + if (callchain_param.mode == CHAIN_FOLDED)
> + period = node->hit;
> + if (total)
> + percent = period * 100.0 / total;
> +
> + scnprintf(bf, bfsize, "%.2f%%", percent);
> + return bf;
> +}
> +
> +int callchain_node__fprintf_value(struct callchain_node *node,
> + FILE *fp, u64 total)
> +{
> + double percent = 0.0;
> + u64 period = callchain_cumul_hits(node);
> +
> + if (callchain_param.mode == CHAIN_FOLDED)
> + period = node->hit;
> + if (total)
> + percent = period * 100.0 / total;
> +
> + return percent_color_fprintf(fp, "%.2f%%", percent);
> +}
> +
> static void free_callchain_node(struct callchain_node *node)
> {
> struct callchain_list *list, *tmp;
> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
> index 544d99ac169c..f9e00e3d1243 100644
> --- a/tools/perf/util/callchain.h
> +++ b/tools/perf/util/callchain.h
> @@ -230,6 +230,10 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,
>
> char *callchain_list__sym_name(struct callchain_list *cl,
> char *bf, size_t bfsize, bool show_dso);
> +char *callchain_node__sprintf_value(struct callchain_node *node,
> + char *bf, size_t bfsize, u64 total);
> +int callchain_node__fprintf_value(struct callchain_node *node,
> + FILE *fp, u64 total);
>
> void free_callchain(struct callchain_root *root);
>
> --
> 2.6.2

2015-11-19 13:59:26

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v5 4/9] perf report: Add callchain value option

Em Mon, Nov 09, 2015 at 02:45:41PM +0900, Namhyung Kim escreveu:
> Now -g/--call-graph option supports how to display callchain values.
> Possible values are 'percent', 'period' and 'count'. The percent is
> same as before and it's the default behavior. The period displays the
> raw period value rather than the percentage. The count displays the
> number of occurrences.
>
> $ perf report --no-children --stdio -g percent
> ...
> 39.93% swapper [kernel.vmlinux] [k] intel_idel
> |
> ---intel_idle
> cpuidle_enter_state
> cpuidle_enter
> call_cpuidle
> cpu_startup_entry
> |
> |--28.63%-- start_secondary
> |
> --11.30%-- rest_init

So, if I try to do:

perf report --no-children --stdio -g percent,count

It shows just 'count', i.e. the last of these options. Is this an
intended limitation?

I'm applying it as-is, but I can see no reason why we wouldn't want to
lift this limitation.
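
The behaviour appears to follow from how the patch stores the choice:
parse_callchain_value() writes into the single callchain_param.value
field, so each matching token overwrites the previous one. A minimal
sketch of that effect, where sketch_value, sketch_parse_value() and
sketch_parse_opts() are hypothetical stand-ins and not the perf code
itself:

```c
#include <string.h>

enum chain_value { CCVAL_PERCENT, CCVAL_PERIOD, CCVAL_COUNT };

/* Stand-in for callchain_param.value: there is only this one slot. */
static enum chain_value sketch_value = CCVAL_PERCENT;

/* Same prefix-match shape as parse_callchain_value() in the patch. */
int sketch_parse_value(const char *tok)
{
	size_t len = strlen(tok);

	if (!strncmp(tok, "percent", len)) {
		sketch_value = CCVAL_PERCENT;
		return 0;
	}
	if (!strncmp(tok, "period", len)) {
		sketch_value = CCVAL_PERIOD;
		return 0;
	}
	if (!strncmp(tok, "count", len)) {
		sketch_value = CCVAL_COUNT;
		return 0;
	}
	return -1;	/* not a value token; ignored by this sketch */
}

/* Walk a comma-separated option string; the last value token wins. */
enum chain_value sketch_parse_opts(const char *arg)
{
	char copy[128];
	char *tok;

	strncpy(copy, arg, sizeof(copy) - 1);
	copy[sizeof(copy) - 1] = '\0';

	for (tok = strtok(copy, ","); tok; tok = strtok(NULL, ","))
		sketch_parse_value(tok);

	return sketch_value;
}
```

Here sketch_parse_opts("percent,count") ends up with CCVAL_COUNT.
Showing several values at once would presumably need a bitmask or a
list in place of the single enum field.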

- Arnaldo

> $ perf report --no-children --show-total-period --stdio -g period
> ...
> 39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idel
> |
> ---intel_idle
> cpuidle_enter_state
> cpuidle_enter
> call_cpuidle
> cpu_startup_entry
> |
> |--9334403-- start_secondary
> |
> --3684302-- rest_init
>
> $ perf report --no-children --show-nr-samples --stdio -g count
> ...
> 39.93% 80 swapper [kernel.vmlinux] [k] intel_idel
> |
> ---intel_idle
> cpuidle_enter_state
> cpuidle_enter
> call_cpuidle
> cpu_startup_entry
> |
> |--57-- start_secondary
> |
> --23-- rest_init
>
> Acked-by: Brendan Gregg <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/Documentation/perf-report.txt | 13 ++++---
> tools/perf/builtin-report.c | 4 +--
> tools/perf/ui/stdio/hist.c | 10 +++++-
> tools/perf/util/callchain.c | 62 +++++++++++++++++++++++++++-----
> tools/perf/util/callchain.h | 10 +++++-
> tools/perf/util/util.c | 3 +-
> 6 files changed, 84 insertions(+), 18 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index f7d81aac9188..dab99ed2b339 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -170,11 +170,11 @@ OPTIONS
> Dump raw trace in ASCII.
>
> -g::
> ---call-graph=<print_type,threshold[,print_limit],order,sort_key,branch>::
> +--call-graph=<print_type,threshold[,print_limit],order,sort_key[,branch],value>::
> Display call chains using type, min percent threshold, print limit,
> - call order, sort key and branch. Note that ordering of parameters is not
> - fixed so any parement can be given in an arbitraty order. One exception
> - is the print_limit which should be preceded by threshold.
> + call order, sort key, optional branch and value. Note that ordering of
> + parameters is not fixed so any parement can be given in an arbitraty order.
> + One exception is the print_limit which should be preceded by threshold.
>
> print_type can be either:
> - flat: single column, linear exposure of call chains.
> @@ -205,6 +205,11 @@ OPTIONS
> - branch: include last branch information in callgraph when available.
> Usually more convenient to use --branch-history for this.
>
> + value can be:
> + - percent: display overhead percent (default)
> + - period: display event period
> + - count: display event count
> +
> --children::
> Accumulate callchain of children to parent entry so that then can
> show up in the output. The output will have a new "Children" column
> diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
> index 2853ad2bd435..3dd4bb4ded1a 100644
> --- a/tools/perf/builtin-report.c
> +++ b/tools/perf/builtin-report.c
> @@ -625,7 +625,7 @@ parse_percent_limit(const struct option *opt, const char *str,
> return 0;
> }
>
> -#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function"
> +#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"
>
> const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
> CALLCHAIN_REPORT_HELP
> @@ -708,7 +708,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
> OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
> "Only display entries with parent-match"),
> OPT_CALLBACK_DEFAULT('g', "call-graph", &report,
> - "print_type,threshold[,print_limit],order,sort_key[,branch]",
> + "print_type,threshold[,print_limit],order,sort_key[,branch],value",
> report_callchain_help, &report_parse_callchain_opt,
> callchain_default_opt),
> OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
> diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
> index f4de055cab9b..7ebc661be267 100644
> --- a/tools/perf/ui/stdio/hist.c
> +++ b/tools/perf/ui/stdio/hist.c
> @@ -81,13 +81,14 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
> int depth_mask, int left_margin)
> {
> struct rb_node *node, *next;
> - struct callchain_node *child;
> + struct callchain_node *child = NULL;
> struct callchain_list *chain;
> int new_depth_mask = depth_mask;
> u64 remaining;
> size_t ret = 0;
> int i;
> uint entries_printed = 0;
> + int cumul_count = 0;
>
> remaining = total_samples;
>
> @@ -99,6 +100,7 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
> child = rb_entry(node, struct callchain_node, rb_node);
> cumul = callchain_cumul_hits(child);
> remaining -= cumul;
> + cumul_count += callchain_cumul_counts(child);
>
> /*
> * The depth mask manages the output of pipes that show
> @@ -148,6 +150,12 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
> if (!rem_sq_bracket)
> return ret;
>
> + if (callchain_param.value == CCVAL_COUNT && child && child->parent) {
> + rem_node.count = child->parent->children_count - cumul_count;
> + if (rem_node.count <= 0)
> + return ret;
> + }
> +
> new_depth_mask &= ~(1 << (depth - 1));
> ret += ipchain__fprintf_graph(fp, &rem_node, &rem_hits, depth,
> new_depth_mask, 0, total_samples,
> diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
> index 60754de700d4..f3f1b95b808e 100644
> --- a/tools/perf/util/callchain.c
> +++ b/tools/perf/util/callchain.c
> @@ -83,6 +83,23 @@ static int parse_callchain_sort_key(const char *value)
> return -1;
> }
>
> +static int parse_callchain_value(const char *value)
> +{
> + if (!strncmp(value, "percent", strlen(value))) {
> + callchain_param.value = CCVAL_PERCENT;
> + return 0;
> + }
> + if (!strncmp(value, "period", strlen(value))) {
> + callchain_param.value = CCVAL_PERIOD;
> + return 0;
> + }
> + if (!strncmp(value, "count", strlen(value))) {
> + callchain_param.value = CCVAL_COUNT;
> + return 0;
> + }
> + return -1;
> +}
> +
> static int
> __parse_callchain_report_opt(const char *arg, bool allow_record_opt)
> {
> @@ -106,7 +123,8 @@ __parse_callchain_report_opt(const char *arg, bool allow_record_opt)
>
> if (!parse_callchain_mode(tok) ||
> !parse_callchain_order(tok) ||
> - !parse_callchain_sort_key(tok)) {
> + !parse_callchain_sort_key(tok) ||
> + !parse_callchain_value(tok)) {
> /* parsing ok - move on to the next */
> try_stack_size = false;
> goto next;
> @@ -820,13 +838,27 @@ char *callchain_node__sprintf_value(struct callchain_node *node,
> {
> double percent = 0.0;
> u64 period = callchain_cumul_hits(node);
> + unsigned count = callchain_cumul_counts(node);
>
> - if (callchain_param.mode == CHAIN_FOLDED)
> + if (callchain_param.mode == CHAIN_FOLDED) {
> period = node->hit;
> - if (total)
> - percent = period * 100.0 / total;
> + count = node->count;
> + }
>
> - scnprintf(bf, bfsize, "%.2f%%", percent);
> + switch (callchain_param.value) {
> + case CCVAL_PERIOD:
> + scnprintf(bf, bfsize, "%"PRIu64, period);
> + break;
> + case CCVAL_COUNT:
> + scnprintf(bf, bfsize, "%u", count);
> + break;
> + case CCVAL_PERCENT:
> + default:
> + if (total)
> + percent = period * 100.0 / total;
> + scnprintf(bf, bfsize, "%.2f%%", percent);
> + break;
> + }
> return bf;
> }
>
> @@ -835,13 +867,25 @@ int callchain_node__fprintf_value(struct callchain_node *node,
> {
> double percent = 0.0;
> u64 period = callchain_cumul_hits(node);
> + unsigned count = callchain_cumul_counts(node);
>
> - if (callchain_param.mode == CHAIN_FOLDED)
> + if (callchain_param.mode == CHAIN_FOLDED) {
> period = node->hit;
> - if (total)
> - percent = period * 100.0 / total;
> + count = node->count;
> + }
>
> - return percent_color_fprintf(fp, "%.2f%%", percent);
> + switch (callchain_param.value) {
> + case CCVAL_PERIOD:
> + return fprintf(fp, "%"PRIu64, period);
> + case CCVAL_COUNT:
> + return fprintf(fp, "%u", count);
> + case CCVAL_PERCENT:
> + default:
> + if (total)
> + percent = period * 100.0 / total;
> + return percent_color_fprintf(fp, "%.2f%%", percent);
> + }
> + return 0;
> }
>
> static void free_callchain_node(struct callchain_node *node)
> diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
> index 0e6cc83f1a46..b14d760fc4e3 100644
> --- a/tools/perf/util/callchain.h
> +++ b/tools/perf/util/callchain.h
> @@ -29,7 +29,8 @@
> HELP_PAD "print_limit:\tmaximum number of call graph entry (<number>)\n" \
> HELP_PAD "order:\t\tcall graph order (caller|callee)\n" \
> HELP_PAD "sort_key:\tcall graph sort key (function|address)\n" \
> - HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n"
> + HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n" \
> + HELP_PAD "value:\t\tcall graph value (percent|period|count)\n"
>
> enum perf_call_graph_mode {
> CALLCHAIN_NONE,
> @@ -81,6 +82,12 @@ enum chain_key {
> CCKEY_ADDRESS
> };
>
> +enum chain_value {
> + CCVAL_PERCENT,
> + CCVAL_PERIOD,
> + CCVAL_COUNT,
> +};
> +
> struct callchain_param {
> bool enabled;
> enum perf_call_graph_mode record_mode;
> @@ -93,6 +100,7 @@ struct callchain_param {
> bool order_set;
> enum chain_key key;
> bool branch_callstack;
> + enum chain_value value;
> };
>
> extern struct callchain_param callchain_param;
> diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
> index 47b1e36c7ea0..75759aebc7b8 100644
> --- a/tools/perf/util/util.c
> +++ b/tools/perf/util/util.c
> @@ -21,7 +21,8 @@ struct callchain_param callchain_param = {
> .mode = CHAIN_GRAPH_ABS,
> .min_percent = 0.5,
> .order = ORDER_CALLEE,
> - .key = CCKEY_FUNCTION
> + .key = CCKEY_FUNCTION,
> + .value = CCVAL_PERCENT,
> };
>
> /*
> --
> 2.6.2

2015-11-19 14:19:45

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v5 7/9] perf hists browser: Support folded callchains

Em Mon, Nov 09, 2015 at 02:45:44PM +0900, Namhyung Kim escreveu:
> The folded callchain mode is to print all chains in a single line.
> Currently perf report --tui doesn't support folded callchains. Like
> flat callchains, only leaf nodes are added to the final rbtree so it
> should show entries in parent nodes. To do that, add flat_val list to
> struct callchain_node and show them along with the (normal) val list.
>
> For example, folded callchain looks like below:
>
> $ perf report -g folded --tui
> Samples: 234 of event 'cycles:pp', Event count (approx.): 32605268
> Overhead Command Shared Object Symbol
> - 39.93% swapper [kernel.vmlinux] [k] intel_idle
> + 28.63% intel_idle; cpuidle_enter_state; cpuidle_enter; ...
> + 11.30% intel_idle; cpuidle_enter_state; cpuidle_enter; ...

The +/- before the folded callchains continues toggling, but with no
effect; a further polish would be to elide them altogether, since they
have no use.

Applied, that can be done on top.

- Arnaldo

> Cc: Brendan Gregg <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/ui/browsers/hists.c | 126 ++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 125 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
> index f4216d92282d..3efe7c74f47d 100644
> --- a/tools/perf/ui/browsers/hists.c
> +++ b/tools/perf/ui/browsers/hists.c
> @@ -207,6 +207,11 @@ static int callchain_node__count_flat_rows(struct callchain_node *node)
> return n;
> }
>
> +static int callchain_node__count_folded_rows(struct callchain_node *node __maybe_unused)
> +{
> + return 1;
> +}
> +
> static int callchain_node__count_rows(struct callchain_node *node)
> {
> struct callchain_list *chain;
> @@ -215,6 +220,8 @@ static int callchain_node__count_rows(struct callchain_node *node)
>
> if (callchain_param.mode == CHAIN_FLAT)
> return callchain_node__count_flat_rows(node);
> + else if (callchain_param.mode == CHAIN_FOLDED)
> + return callchain_node__count_folded_rows(node);
>
> list_for_each_entry(chain, &node->val, list) {
> ++n;
> @@ -311,7 +318,8 @@ static void callchain__init_have_children(struct rb_root *root)
> for (nd = rb_first(root); nd; nd = rb_next(nd)) {
> struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node);
> callchain_node__init_have_children(node, has_sibling);
> - if (callchain_param.mode == CHAIN_FLAT)
> + if (callchain_param.mode == CHAIN_FLAT ||
> + callchain_param.mode == CHAIN_FOLDED)
> callchain_node__make_parent_list(node);
> }
> }
> @@ -723,6 +731,117 @@ static int hist_browser__show_callchain_flat(struct hist_browser *browser,
> return row - first_row;
> }
>
> +static char *hist_browser__folded_callchain_str(struct hist_browser *browser,
> + struct callchain_list *chain,
> + char *value_str, char *old_str)
> +{
> + char bf[1024];
> + const char *str;
> + char *new;
> +
> + str = callchain_list__sym_name(chain, bf, sizeof(bf),
> + browser->show_dso);
> + if (old_str) {
> + if (asprintf(&new, "%s%s%s", old_str,
> + symbol_conf.field_sep ?: ";", str) < 0)
> + new = NULL;
> + } else {
> + if (value_str) {
> + if (asprintf(&new, "%s %s", value_str, str) < 0)
> + new = NULL;
> + } else {
> + if (asprintf(&new, "%s", str) < 0)
> + new = NULL;
> + }
> + }
> + return new;
> +}
> +
> +static int hist_browser__show_callchain_folded(struct hist_browser *browser,
> + struct rb_root *root,
> + unsigned short row, u64 total,
> + print_callchain_entry_fn print,
> + struct callchain_print_arg *arg,
> + check_output_full_fn is_output_full)
> +{
> + struct rb_node *node;
> + int first_row = row, offset = LEVEL_OFFSET_STEP;
> + bool need_percent;
> +
> + node = rb_first(root);
> + need_percent = node && rb_next(node);
> +
> + while (node) {
> + struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
> + struct rb_node *next = rb_next(node);
> + struct callchain_list *chain, *first_chain = NULL;
> + int first = true;
> + char *value_str = NULL, *value_str_alloc = NULL;
> + char *chain_str = NULL, *chain_str_alloc = NULL;
> +
> + if (arg->row_offset != 0) {
> + arg->row_offset--;
> + goto next;
> + }
> +
> + if (need_percent) {
> + char buf[64];
> +
> + callchain_node__sprintf_value(child, buf, sizeof(buf),
> + total);
> + if (asprintf(&value_str, "%s", buf) < 0) {
> + value_str = (char *)"<...>";
> + goto do_print;
> + }
> + value_str_alloc = value_str;
> + }
> +
> + list_for_each_entry(chain, &child->parent_val, list) {
> + chain_str = hist_browser__folded_callchain_str(browser,
> + chain, value_str, chain_str);
> + if (first) {
> + first = false;
> + first_chain = chain;
> + }
> +
> + if (chain_str == NULL) {
> + chain_str = (char *)"Not enough memory!";
> + goto do_print;
> + }
> +
> + chain_str_alloc = chain_str;
> + }
> +
> + list_for_each_entry(chain, &child->val, list) {
> + chain_str = hist_browser__folded_callchain_str(browser,
> + chain, value_str, chain_str);
> + if (first) {
> + first = false;
> + first_chain = chain;
> + }
> +
> + if (chain_str == NULL) {
> + chain_str = (char *)"Not enough memory!";
> + goto do_print;
> + }
> +
> + chain_str_alloc = chain_str;
> + }
> +
> +do_print:
> + print(browser, first_chain, chain_str, offset, row++, arg);
> + free(value_str_alloc);
> + free(chain_str_alloc);
> +
> +next:
> + if (is_output_full(browser, row))
> + break;
> + node = next;
> + }
> +
> + return row - first_row;
> +}
> +
> static int hist_browser__show_callchain(struct hist_browser *browser,
> struct rb_root *root, int level,
> unsigned short row, u64 total,
> @@ -980,6 +1099,11 @@ static int hist_browser__show_entry(struct hist_browser *browser,
> &entry->sorted_chain, row, total,
> hist_browser__show_callchain_entry, &arg,
> hist_browser__check_output_full);
> + } else if (callchain_param.mode == CHAIN_FOLDED) {
> + printed += hist_browser__show_callchain_folded(browser,
> + &entry->sorted_chain, row, total,
> + hist_browser__show_callchain_entry, &arg,
> + hist_browser__check_output_full);
> } else {
> printed += hist_browser__show_callchain(browser,
> &entry->sorted_chain, 1, row, total,
> --
> 2.6.2
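
For readers following along, the way the quoted hist_browser__folded_callchain_str() grows the folded line one symbol at a time can be sketched outside of perf roughly as below. This is only an illustration under stated assumptions: fold_chain() and its parameters are hypothetical names, the symbols come from a plain string array instead of struct callchain_list, and it uses malloc() plus snprintf() where the patch uses asprintf(). The sketch also frees each intermediate string as it goes, which is a choice made here and not a statement about the patch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the perf helper: join 'nr' symbol names into
 * one folded line. The first symbol is prefixed by "value_str " when a
 * value string is given; later symbols are joined with the field
 * separator, defaulting to ";" as in the patch.
 * Returns a malloc()ed string, or NULL on failure or when nr == 0.
 */
static char *fold_chain(const char *value_str, const char *const *syms,
			size_t nr, const char *field_sep)
{
	const char *sep = field_sep ? field_sep : ";";
	char *cur = NULL;

	for (size_t i = 0; i < nr; i++) {
		/* What goes in front of this symbol, and what glues them. */
		const char *head = cur ? cur : (value_str ? value_str : "");
		const char *glue = cur ? sep : (value_str ? " " : "");
		size_t len = strlen(head) + strlen(glue) + strlen(syms[i]) + 1;
		char *next = malloc(len);

		if (next)
			snprintf(next, len, "%s%s%s", head, glue, syms[i]);

		free(cur);	/* drop the previous partial string */
		if (!next)
			return NULL;
		cur = next;
	}
	return cur;
}
```

Passing NULL as field_sep gives the semicolon-separated form shown in the commit message; passing the string given to -t would give the separator-honoring form mentioned in the v5 changelog.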

2015-11-19 14:33:59

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf report: [WIP] Support '-F none' option to hide hist lines

Em Mon, Nov 09, 2015 at 02:45:38PM +0900, Namhyung Kim escreveu:
> Sometimes users want to hide the hist lines and see only the
> callchains. To do that, add a virtual 'none' field name that hides all
> hist lines. It must be used alone and is only meaningful with --stdio.
>
> WIP on TUI

So, in the TUI it's the navigation that doesn't work, i.e. it manages to
elide the hist_entry main lines and shows the folded callchains, but the
keys don't work...

I'll try it if you don't, just busy with processing tons of patches at
the moment :-\

Anyway, processed the other patches, pushing to Ingo for perf/core, to
avoid patchbombing him too much in just one go :-)

- Arnaldo

> Cc: Brendan Gregg <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/Documentation/perf-report.txt | 3 ++
> tools/perf/ui/browsers/hists.c | 22 +++++++++--
> tools/perf/ui/gtk/hists.c | 65 ++++++++++++++++++++++++--------
> tools/perf/ui/stdio/hist.c | 5 +++
> tools/perf/util/sort.c | 5 +++
> 5 files changed, 82 insertions(+), 18 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
> index dab99ed2b339..6cfc643c0806 100644
> --- a/tools/perf/Documentation/perf-report.txt
> +++ b/tools/perf/Documentation/perf-report.txt
> @@ -127,6 +127,9 @@ OPTIONS
> By default, every sort keys not specified in -F will be appended
> automatically.
>
> + If "none" is specified, it hides all fields and --stdio output will show
> + callchains only.
> +
> If --mem-mode option is used, following sort keys are also available
> (incompatible with --branch-stack):
> symbol_daddr, dso_daddr, locked, tlb, mem, snoop, dcacheline.
> diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
> index 3efe7c74f47d..c2f586f0c729 100644
> --- a/tools/perf/ui/browsers/hists.c
> +++ b/tools/perf/ui/browsers/hists.c
> @@ -78,6 +78,9 @@ static u32 hist_browser__nr_entries(struct hist_browser *hb)
> nr_entries = hb->hists->nr_entries;
>
> hb->nr_callchain_rows = hist_browser__get_folding(hb);
> +
> + if (list_empty(&perf_hpp__list))
> + nr_entries = 1;
> return nr_entries + hb->nr_callchain_rows;
> }
>
> @@ -255,7 +258,10 @@ static bool hist_entry__toggle_fold(struct hist_entry *he)
> if (!he->has_children)
> return false;
>
> - he->unfolded = !he->unfolded;
> + if (list_empty(&perf_hpp__list))
> + he->unfolded = true;
> + else
> + he->unfolded = !he->unfolded;
> return true;
> }
>
> @@ -329,6 +335,10 @@ static void hist_entry__init_have_children(struct hist_entry *he)
> if (!he->init_have_children) {
> he->has_children = !RB_EMPTY_ROOT(&he->sorted_chain);
> callchain__init_have_children(&he->sorted_chain);
> + if (list_empty(&perf_hpp__list)) {
> + he->unfolded = true;
> + he->nr_rows = callchain__count_rows(&he->sorted_chain);
> + }
> he->init_have_children = true;
> }
> }
> @@ -1038,6 +1048,9 @@ static int hist_browser__show_entry(struct hist_browser *browser,
>
> hist_browser__gotorc(browser, row, 0);
>
> + if (list_empty(&perf_hpp__list))
> + goto print_callchain;
> +
> perf_hpp__for_each_format(fmt) {
> if (perf_hpp__should_skip(fmt) || column++ < browser->b.horiz_scroll)
> continue;
> @@ -1080,6 +1093,7 @@ static int hist_browser__show_entry(struct hist_browser *browser,
> } else
> --row_offset;
>
> +print_callchain:
> if (folded_sign == '-' && row != browser->b.rows) {
> u64 total = hists__total_period(entry->hists);
> struct callchain_print_arg arg = {
> @@ -1313,7 +1327,8 @@ static void ui_browser__hists_seek(struct ui_browser *browser,
> nd = hists__filter_entries(rb_next(nd), hb->min_pcnt);
> if (nd == NULL)
> break;
> - --offset;
> + if (!list_empty(&perf_hpp__list))
> + --offset;
> browser->top = nd;
> } while (offset != 0);
> } else if (offset < 0) {
> @@ -1347,7 +1362,8 @@ static void ui_browser__hists_seek(struct ui_browser *browser,
> hb->min_pcnt);
> if (nd == NULL)
> break;
> - ++offset;
> + if (!list_empty(&perf_hpp__list))
> + ++offset;
> browser->top = nd;
> if (offset == 0) {
> /*
> diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
> index 6105b4921754..535f8c5e74dc 100644
> --- a/tools/perf/ui/gtk/hists.c
> +++ b/tools/perf/ui/gtk/hists.c
> @@ -98,12 +98,12 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
> for (nd = rb_first(root); nd; nd = rb_next(nd)) {
> struct callchain_node *node;
> struct callchain_list *chain;
> - GtkTreeIter iter, new_parent;
> + GtkTreeIter iter, new_parent_iter, *new_parent;
> bool need_new_parent;
>
> node = rb_entry(nd, struct callchain_node, rb_node);
>
> - new_parent = *parent;
> + new_parent = parent;
> need_new_parent = !has_single_node;
>
> callchain_node__make_parent_list(node);
> @@ -111,7 +111,7 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
> list_for_each_entry(chain, &node->parent_val, list) {
> char buf[128];
>
> - gtk_tree_store_append(store, &iter, &new_parent);
> + gtk_tree_store_append(store, &iter, new_parent);
>
> callchain_node__sprintf_value(node, buf, sizeof(buf), total);
> gtk_tree_store_set(store, &iter, 0, buf, -1);
> @@ -124,7 +124,8 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
> * Only show the top-most symbol in a callchain
> * if it's not the only callchain.
> */
> - new_parent = iter;
> + new_parent_iter = iter;
> + new_parent = &new_parent_iter;
> need_new_parent = false;
> }
> }
> @@ -132,7 +133,7 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
> list_for_each_entry(chain, &node->val, list) {
> char buf[128];
>
> - gtk_tree_store_append(store, &iter, &new_parent);
> + gtk_tree_store_append(store, &iter, new_parent);
>
> callchain_node__sprintf_value(node, buf, sizeof(buf), total);
> gtk_tree_store_set(store, &iter, 0, buf, -1);
> @@ -145,7 +146,8 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
> * Only show the top-most symbol in a callchain
> * if it's not the only callchain.
> */
> - new_parent = iter;
> + new_parent_iter = iter;
> + new_parent = &new_parent_iter;
> need_new_parent = false;
> }
> }
> @@ -221,19 +223,19 @@ static void perf_gtk__add_callchain_graph(struct rb_root *root, GtkTreeStore *st
> for (nd = rb_first(root); nd; nd = rb_next(nd)) {
> struct callchain_node *node;
> struct callchain_list *chain;
> - GtkTreeIter iter, new_parent;
> + GtkTreeIter iter, new_parent_iter, *new_parent;
> bool need_new_parent;
> u64 child_total;
>
> node = rb_entry(nd, struct callchain_node, rb_node);
>
> - new_parent = *parent;
> + new_parent = parent;
> need_new_parent = !has_single_node && (node->val_nr > 1);
>
> list_for_each_entry(chain, &node->val, list) {
> char buf[128];
>
> - gtk_tree_store_append(store, &iter, &new_parent);
> + gtk_tree_store_append(store, &iter, new_parent);
>
> callchain_node__sprintf_value(node, buf, sizeof(buf), total);
> gtk_tree_store_set(store, &iter, 0, buf, -1);
> @@ -246,7 +248,8 @@ static void perf_gtk__add_callchain_graph(struct rb_root *root, GtkTreeStore *st
> * Only show the top-most symbol in a callchain
> * if it's not the only callchain.
> */
> - new_parent = iter;
> + new_parent_iter = iter;
> + new_parent = &new_parent_iter;
> need_new_parent = false;
> }
> }
> @@ -292,12 +295,14 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
> GType col_types[MAX_COLUMNS];
> GtkCellRenderer *renderer;
> GtkTreeStore *store;
> + GtkTreeIter *iter;
> struct rb_node *nd;
> GtkWidget *view;
> int col_idx;
> int sym_col = -1;
> int nr_cols;
> char s[512];
> + bool no_hists = false;
>
> struct perf_hpp hpp = {
> .buf = s,
> @@ -309,6 +314,18 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
> perf_hpp__for_each_format(fmt)
> col_types[nr_cols++] = G_TYPE_STRING;
>
> + if (nr_cols == 0) {
> + /*
> + * user specified '-F none' to ignore hist entries.
> + * Add two columns to print callchain value and symbols.
> + */
> + no_hists = true;
> +
> + nr_cols = 2;
> + col_types[0] = G_TYPE_STRING;
> + col_types[1] = G_TYPE_STRING;
> + }
> +
> store = gtk_tree_store_newv(nr_cols, col_types);
>
> view = gtk_tree_view_new();
> @@ -334,6 +351,18 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
> col_idx++, NULL);
> }
>
> + if (no_hists) {
> + sym_col = 1;
> + gtk_tree_view_insert_column_with_attributes(GTK_TREE_VIEW(view),
> + -1, "Overhead",
> + renderer, "markup",
> + 0, NULL);
> + gtk_tree_view_insert_column_with_attributes(GTK_TREE_VIEW(view),
> + -1, "Callchains",
> + renderer, "markup",
> + 1, NULL);
> + }
> +
> for (col_idx = 0; col_idx < nr_cols; col_idx++) {
> GtkTreeViewColumn *column;
>
> @@ -352,7 +381,7 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
>
> for (nd = rb_first(&hists->entries); nd; nd = rb_next(nd)) {
> struct hist_entry *h = rb_entry(nd, struct hist_entry, rb_node);
> - GtkTreeIter iter;
> + GtkTreeIter this_iter;
> u64 total = hists__total_period(h->hists);
> float percent;
>
> @@ -363,7 +392,13 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
> if (percent < min_pcnt)
> continue;
>
> - gtk_tree_store_append(store, &iter, NULL);
> + if (no_hists) {
> + /* NULL means that callchains are in top-level */
> + iter = NULL;
> + } else {
> + iter = &this_iter;
> + gtk_tree_store_append(store, iter, NULL);
> + }
>
> col_idx = 0;
>
> @@ -376,15 +411,15 @@ static void perf_gtk__show_hists(GtkWidget *window, struct hists *hists,
> else
> fmt->entry(fmt, &hpp, h);
>
> - gtk_tree_store_set(store, &iter, col_idx++, s, -1);
> + gtk_tree_store_set(store, iter, col_idx++, s, -1);
> }
>
> - if (symbol_conf.use_callchain && sort__has_sym) {
> + if (symbol_conf.use_callchain) {
> if (callchain_param.mode == CHAIN_GRAPH_REL)
> total = symbol_conf.cumulate_callchain ?
> h->stat_acc->period : h->stat.period;
>
> - perf_gtk__add_callchain(&h->sorted_chain, store, &iter,
> + perf_gtk__add_callchain(&h->sorted_chain, store, iter,
> sym_col, total);
> }
> }
> diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
> index 7ebc661be267..48ae34abf9c8 100644
> --- a/tools/perf/ui/stdio/hist.c
> +++ b/tools/perf/ui/stdio/hist.c
> @@ -422,6 +422,11 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
> if (size == 0 || size > bfsz)
> size = hpp.size = bfsz;
>
> + /*
> + * In case of '-F none', the bf is not set at all.
> + */
> + bf[0] = '\0';
> +
> hist_entry__snprintf(he, &hpp);
>
> ret = fprintf(fp, "%s\n", bf);
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 2d8ccd4d9e1b..8c731906d432 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -1923,6 +1923,11 @@ static int __setup_output_field(void)
> if (field_order == NULL)
> return 0;
>
> + if (!strcmp(field_order, "none")) {
> + symbol_conf.show_hist_headers = false;
> + return 0;
> + }
> +
> strp = str = strdup(field_order);
> if (str == NULL) {
> error("Not enough memory to setup output fields");
> --
> 2.6.2

2015-11-20 01:33:56

by Namhyung Kim

Subject: Re: [PATCH v5 2/9] perf callchain: Abstract callchain print function

Hi Arnaldo,

On Thu, Nov 19, 2015 at 10:41:32AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Nov 09, 2015 at 02:45:39PM +0900, Namhyung Kim escreveu:
> > This is preparation for printing other types of callchain values,
> > such as count or period.
> >
> > Acked-by: Brendan Gregg <[email protected]>
> > Signed-off-by: Namhyung Kim <[email protected]>
> > ---

[SNIP]
> > +char *callchain_node__sprintf_value(struct callchain_node *node,
> > + char *bf, size_t bfsize, u64 total)
>
> sprintf doesn't require a bfsize, snprintf does, but we don't use that,
> so renaming it to callchain_node__scnprintf_value() so that we recall
> the semantic associated with this operation.

OK, and thank you for doing this!

Thanks,
Namhyung
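
The rename discussed above comes down to return-value semantics: snprintf() reports the length the output would have had with unlimited space, while the kernel-style scnprintf() convention reports the number of characters actually stored. A minimal userspace sketch of that convention follows. my_scnprintf() is a hypothetical name used only for illustration (perf's own scnprintf() is not shown in this thread), and returning 0 on an encoding error is an assumption of the sketch.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Like snprintf(), but return the number of characters actually written
 * into buf (excluding the trailing NUL), never more than size - 1.
 */
static int my_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int ret;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	ret = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (ret < 0) {
		/* Encoding error: leave an empty string behind. */
		buf[0] = '\0';
		return 0;
	}

	/* vsnprintf() returns the untruncated length; clamp it. */
	return (size_t)ret >= size ? (int)(size - 1) : ret;
}
```

With an 8-byte buffer and the string "cpuidle_enter", snprintf() returns 13 while this helper returns 7, so the result is safe to use as an offset when appending further output to the same buffer.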

2015-11-20 01:39:52

by Namhyung Kim

Subject: Re: [PATCH v5 4/9] perf report: Add callchain value option

On Thu, Nov 19, 2015 at 10:59:14AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Nov 09, 2015 at 02:45:41PM +0900, Namhyung Kim escreveu:
> > The -g/--call-graph option now accepts a value type that controls how
> > callchain values are displayed. Possible values are 'percent',
> > 'period' and 'count'. The percent is the same as before and remains
> > the default. The period displays the raw period value rather than the
> > percentage. The count displays the number of occurrences.
> >
> > $ perf report --no-children --stdio -g percent
> > ...
> > 39.93% swapper [kernel.vmlinux] [k] intel_idle
> > |
> > ---intel_idle
> > cpuidle_enter_state
> > cpuidle_enter
> > call_cpuidle
> > cpu_startup_entry
> > |
> > |--28.63%-- start_secondary
> > |
> > --11.30%-- rest_init
>
> So, if I try to do:
>
> perf report --no-children --stdio -g percent,count
>
> It shows just 'count', i.e. the last of these options, is this an
> intended limitation?
>
> I'm applying it as-is, but I can see no reason why we wouldn't want to
> lift this limitation.


Hmm.. I expected just one value type to be used, but yes, we might want
to support printing multiple values if needed.

Thanks,
Namhyung


>
> > $ perf report --no-children --show-total-period --stdio -g period
> > ...
> > 39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idle
> > |
> > ---intel_idle
> > cpuidle_enter_state
> > cpuidle_enter
> > call_cpuidle
> > cpu_startup_entry
> > |
> > |--9334403-- start_secondary
> > |
> > --3684302-- rest_init
> >
> > $ perf report --no-children --show-nr-samples --stdio -g count
> > ...
> > 39.93% 80 swapper [kernel.vmlinux] [k] intel_idle
> > |
> > ---intel_idle
> > cpuidle_enter_state
> > cpuidle_enter
> > call_cpuidle
> > cpu_startup_entry
> > |
> > |--57-- start_secondary
> > |
> > --23-- rest_init
> >
> > Acked-by: Brendan Gregg <[email protected]>
> > Signed-off-by: Namhyung Kim <[email protected]>

2015-11-20 01:40:17

by Namhyung Kim

Subject: Re: [PATCH v5 7/9] perf hists browser: Support folded callchains

On Thu, Nov 19, 2015 at 11:19:38AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Nov 09, 2015 at 02:45:44PM +0900, Namhyung Kim escreveu:
> > The folded callchain mode is to print all chains in a single line.
> > Currently perf report --tui doesn't support folded callchains. Like
> > flat callchains, only leaf nodes are added to the final rbtree so it
> > should show entries in parent nodes. To do that, add flat_val list to
> > struct callchain_node and show them along with the (normal) val list.
> >
> > For example, folded callchain looks like below:
> >
> > $ perf report -g folded --tui
> > Samples: 234 of event 'cycles:pp', Event count (approx.): 32605268
> > Overhead Command Shared Object Symbol
> > - 39.93% swapper [kernel.vmlinux] [k] intel_idle
> > + 28.63% intel_idle; cpuidle_enter_state; cpuidle_enter; ...
> > + 11.30% intel_idle; cpuidle_enter_state; cpuidle_enter; ...
>
> The +/- before the folded callchains continues toggling, but with no
> effect. A further polish would be to elide them altogether, since they
> have no use.
>
> Applied, that can be done on top.

Agreed.

Thanks,
Namhyung

2015-11-20 01:43:50

by Namhyung Kim

Subject: Re: [PATCH] perf report: [WIP] Support '-F none' option to hide hist lines

On Thu, Nov 19, 2015 at 11:33:52AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Nov 09, 2015 at 02:45:38PM +0900, Namhyung Kim escreveu:
> > Sometimes users want to hide the hist lines and see only the
> > callchains. To do that, add a virtual 'none' field name that hides all
> > hist lines. It must be used alone and is only meaningful with --stdio.
> >
> > WIP on TUI
>
> So, in the TUI it's the navigation that doesn't work, i.e. it manages to
> elide the hist_entry main lines and shows the folded callchains, but the
> keys don't work...

Yes, but it'd work once you press the ENTER key.. ;-)


>
> I'll try it if you don't, just busy with processing tons of patches at
> the moment :-\

It'd be great if you'd take a look at it.


>
> Anyway, processed the other patches, pushing to Ingo for perf/core, to
> avoid patchbombing him too much in just one go :-)

Sure, thanks for your great job! :)
Namhyung

2015-11-20 12:06:29

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH v5 4/9] perf report: Add callchain value option

Em Fri, Nov 20, 2015 at 10:39:47AM +0900, Namhyung Kim escreveu:
> On Thu, Nov 19, 2015 at 10:59:14AM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Nov 09, 2015 at 02:45:41PM +0900, Namhyung Kim escreveu:
> > > The -g/--call-graph option now accepts a value type that controls how
> > > callchain values are displayed. Possible values are 'percent',
> > > 'period' and 'count'. The percent is the same as before and remains
> > > the default. The period displays the raw period value rather than the
> > > percentage. The count displays the number of occurrences.
> > >
> > > $ perf report --no-children --stdio -g percent
> > > ...
> > > 39.93% swapper [kernel.vmlinux] [k] intel_idle
> > > |
> > > ---intel_idle
> > > cpuidle_enter_state
> > > cpuidle_enter
> > > call_cpuidle
> > > cpu_startup_entry
> > > |
> > > |--28.63%-- start_secondary
> > > |
> > > --11.30%-- rest_init
> >
> > So, if I try to do:
> >
> > perf report --no-children --stdio -g percent,count
> >
> > It shows just 'count', i.e. the last of these options, is this an
> > intended limitation?
> >
> > I'm applying it as-is, but I can see no reason why we wouldn't want to
> > lift this limitation.
>
>
> Hmm.. I expected just one value type to be used, but yes, we might want
> to support printing multiple values if needed.

And at least to warn the user when more than one out of those options is
asked for and just one of them ends up in the output.

- Arnaldo
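
One possible shape for the warning suggested above is to track, while scanning the comma-separated -g argument, whether a value type was already chosen and later replaced. The sketch below is standalone and hypothetical: the enum, parse_value_types() and pick_value_type() are illustrative names, not perf's parser, and it matches tokens exactly whereas the mode parser shown elsewhere in this thread uses a prefix-style strncmp().

```c
#include <stdio.h>
#include <string.h>

enum value_type { VAL_NONE = -1, VAL_PERCENT, VAL_PERIOD, VAL_COUNT };

/* Map one token of length 'len' to a value type, or VAL_NONE. */
static int value_type_from_token(const char *tok, size_t len)
{
	static const char *const names[] = { "percent", "period", "count" };

	for (int i = 0; i < 3; i++)
		if (strlen(names[i]) == len && !strncmp(tok, names[i], len))
			return i;
	return VAL_NONE;
}

/*
 * Scan a comma-separated option string. The last value type wins (the
 * behaviour reported in the thread); *overridden counts how many earlier,
 * different value types were discarded along the way.
 */
static int parse_value_types(const char *opts, int *overridden)
{
	int chosen = VAL_NONE;

	*overridden = 0;
	while (*opts) {
		size_t len = strcspn(opts, ",");
		int type = value_type_from_token(opts, len);

		if (type != VAL_NONE) {
			if (chosen != VAL_NONE && chosen != type)
				(*overridden)++;
			chosen = type;
		}
		opts += len;
		if (*opts == ',')
			opts++;
	}
	return chosen;
}

/* Pick the value type and tell the user if something was dropped. */
static int pick_value_type(const char *opts)
{
	int overridden;
	int type = parse_value_types(opts, &overridden);

	if (overridden)
		fprintf(stderr,
			"warning: multiple callchain value types given, using the last one\n");
	return type;
}
```

Counting the discarded types instead of printing inside the parser keeps the parsing step free of side effects, so the caller decides whether to warn or, if multiple values become supported later, to keep them all.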

Subject: [tip:perf/core] perf report: Support folded callchain mode on --stdio

Commit-ID: 26e779245dd6f5270c0696860438e5c03d0780fd
Gitweb: http://git.kernel.org/tip/26e779245dd6f5270c0696860438e5c03d0780fd
Author: Namhyung Kim <[email protected]>
AuthorDate: Mon, 9 Nov 2015 14:45:37 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:22 -0300

perf report: Support folded callchain mode on --stdio

Add new call chain option (-g) 'folded' to print callchains in a line.
The callchains are separated by semicolons, and preceded by (absolute)
percent values and a space.

For example, the following 20 lines can be printed in 3 lines with the
folded output mode:

$ perf report -g flat --no-children | grep -v ^# | head -20
60.48% swapper [kernel.vmlinux] [k] intel_idle
54.60%
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
start_secondary

5.88%
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel

$ perf report -g folded --no-children | grep -v ^# | head -3
60.48% swapper [kernel.vmlinux] [k] intel_idle
54.60% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;start_secondary
5.88% intel_idle;cpuidle_enter_state;cpuidle_enter;call_cpuidle;cpu_startup_entry;rest_init;start_kernel;x86_64_start_reservations;x86_64_start_kernel

This mode is supported only for --stdio now and intended to be used by
some scripts like in FlameGraphs[1]. Support for other UI might be
added later.

[1] http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html

Requested-and-Tested-by: Brendan Gregg <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 1 +
tools/perf/ui/stdio/hist.c | 55 ++++++++++++++++++++++++++++++++
tools/perf/util/callchain.c | 6 ++++
tools/perf/util/callchain.h | 5 +--
4 files changed, 65 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 5ce8da1..f7d81aa 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -181,6 +181,7 @@ OPTIONS
- graph: use a graph tree, displaying absolute overhead rates. (default)
- fractal: like graph, but displays relative rates. Each branch of
the tree is considered as a new profiled object.
+ - folded: call chains are displayed in a line, separated by semicolons
- none: disable call chain display.

threshold is a percentage value which specifies a minimum percent to be
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index dfcbc90..ea798493 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -260,6 +260,58 @@ static size_t callchain__fprintf_flat(FILE *fp, struct rb_root *tree,
return ret;
}

+static size_t __callchain__fprintf_folded(FILE *fp, struct callchain_node *node)
+{
+ const char *sep = symbol_conf.field_sep ?: ";";
+ struct callchain_list *chain;
+ size_t ret = 0;
+ char bf[1024];
+ bool first;
+
+ if (!node)
+ return 0;
+
+ ret += __callchain__fprintf_folded(fp, node->parent);
+
+ first = (ret == 0);
+ list_for_each_entry(chain, &node->val, list) {
+ if (chain->ip >= PERF_CONTEXT_MAX)
+ continue;
+ ret += fprintf(fp, "%s%s", first ? "" : sep,
+ callchain_list__sym_name(chain,
+ bf, sizeof(bf), false));
+ first = false;
+ }
+
+ return ret;
+}
+
+static size_t callchain__fprintf_folded(FILE *fp, struct rb_root *tree,
+ u64 total_samples)
+{
+ size_t ret = 0;
+ u32 entries_printed = 0;
+ struct callchain_node *chain;
+ struct rb_node *rb_node = rb_first(tree);
+
+ while (rb_node) {
+ double percent;
+
+ chain = rb_entry(rb_node, struct callchain_node, rb_node);
+ percent = chain->hit * 100.0 / total_samples;
+
+ ret += fprintf(fp, "%.2f%% ", percent);
+ ret += __callchain__fprintf_folded(fp, chain);
+ ret += fprintf(fp, "\n");
+ if (++entries_printed == callchain_param.print_limit)
+ break;
+
+ rb_node = rb_next(rb_node);
+ }
+
+ return ret;
+}
+
static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
u64 total_samples, int left_margin,
FILE *fp)
@@ -278,6 +330,9 @@ static size_t hist_entry_callchain__fprintf(struct hist_entry *he,
case CHAIN_FLAT:
return callchain__fprintf_flat(fp, &he->sorted_chain, total_samples);
break;
+ case CHAIN_FOLDED:
+ return callchain__fprintf_folded(fp, &he->sorted_chain, total_samples);
+ break;
case CHAIN_NONE:
break;
default:
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 735ad48..08cb220 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -44,6 +44,10 @@ static int parse_callchain_mode(const char *value)
callchain_param.mode = CHAIN_GRAPH_REL;
return 0;
}
+ if (!strncmp(value, "folded", strlen(value))) {
+ callchain_param.mode = CHAIN_FOLDED;
+ return 0;
+ }
return -1;
}

@@ -218,6 +222,7 @@ rb_insert_callchain(struct rb_root *root, struct callchain_node *chain,

switch (mode) {
case CHAIN_FLAT:
+ case CHAIN_FOLDED:
if (rnode->hit < chain->hit)
p = &(*p)->rb_left;
else
@@ -338,6 +343,7 @@ int callchain_register_param(struct callchain_param *param)
param->sort = sort_chain_graph_rel;
break;
case CHAIN_FLAT:
+ case CHAIN_FOLDED:
param->sort = sort_chain_flat;
break;
case CHAIN_NONE:
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index fce8161..544d99a 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -24,7 +24,7 @@
#define CALLCHAIN_RECORD_HELP CALLCHAIN_HELP RECORD_MODE_HELP RECORD_SIZE_HELP

#define CALLCHAIN_REPORT_HELP \
- HELP_PAD "print_type:\tcall graph printing style (graph|flat|fractal|none)\n" \
+ HELP_PAD "print_type:\tcall graph printing style (graph|flat|fractal|folded|none)\n" \
HELP_PAD "threshold:\tminimum call graph inclusion threshold (<percent>)\n" \
HELP_PAD "print_limit:\tmaximum number of call graph entry (<number>)\n" \
HELP_PAD "order:\t\tcall graph order (caller|callee)\n" \
@@ -43,7 +43,8 @@ enum chain_mode {
CHAIN_NONE,
CHAIN_FLAT,
CHAIN_GRAPH_ABS,
- CHAIN_GRAPH_REL
+ CHAIN_GRAPH_REL,
+ CHAIN_FOLDED,
};

enum chain_order {
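
The recursion in __callchain__fprintf_folded() above prints the root-most entries first by visiting node->parent before the node's own entries, and callchain__fprintf_folded() prefixes each line with hit * 100.0 / total_samples. A reduced, standalone sketch of the same idea follows. It is an illustration only: struct node is a hypothetical simplification carrying one symbol per node, and the output goes into a caller-supplied buffer instead of a FILE.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified callchain node: one symbol per node. */
struct node {
	const struct node *parent;
	const char *sym;
	unsigned long long hit;
};

/*
 * Append the chain ending at 'n' to buf, ancestors first, separated by
 * 'sep'. Returns the number of characters the chain needs so far; the
 * first entry is detected by that count still being zero.
 */
static size_t fold_node(char *buf, size_t size, const struct node *n,
			const char *sep)
{
	size_t ret;

	if (!n)
		return 0;

	ret = fold_node(buf, size, n->parent, sep);
	if (ret >= size)
		return ret;	/* buffer already full, stop appending */

	ret += snprintf(buf + ret, size - ret, "%s%s", ret ? sep : "", n->sym);
	return ret;
}

/* Write "<percent>% <folded chain>" for one leaf; 0 if buf is too small. */
static size_t fold_line(char *buf, size_t size, const struct node *leaf,
			unsigned long long total)
{
	int off = snprintf(buf, size, "%.2f%% ", leaf->hit * 100.0 / total);

	if (off < 0 || (size_t)off >= size)
		return 0;
	return off + fold_node(buf + off, size - off, leaf, ";");
}
```

For a leaf with 3 hits out of a total of 4 whose ancestors are intel_idle and cpuidle_enter_state, fold_line() produces a single line in the value-then-chain layout that the FlameGraph scripts mentioned in the commit message consume.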

Subject: [tip:perf/core] perf callchain: Abstract callchain print function

Commit-ID: 5ab250cafcd884a2638b102239870bddca42ff88
Gitweb: http://git.kernel.org/tip/5ab250cafcd884a2638b102239870bddca42ff88
Author: Namhyung Kim <[email protected]>
AuthorDate: Mon, 9 Nov 2015 14:45:39 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:22 -0300

perf callchain: Abstract callchain print function

This is preparation for printing other types of callchain values,
such as count or period.

Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Brendan Gregg <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ renamed new _sprintf_ operation to _scnprintf_ ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/ui/browsers/hists.c | 8 +++++---
tools/perf/ui/gtk/hists.c | 8 ++------
tools/perf/ui/stdio/hist.c | 35 +++++++++++++++++------------------
tools/perf/util/callchain.c | 29 +++++++++++++++++++++++++++++
tools/perf/util/callchain.h | 4 ++++
5 files changed, 57 insertions(+), 27 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index fa9eb92..0b18857 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -592,7 +592,6 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
while (node) {
struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
struct rb_node *next = rb_next(node);
- u64 cumul = callchain_cumul_hits(child);
struct callchain_list *chain;
char folded_sign = ' ';
int first = true;
@@ -619,9 +618,12 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
browser->show_dso);

if (was_first && need_percent) {
- double percent = cumul * 100.0 / total;
+ char buf[64];

- if (asprintf(&alloc_str, "%2.2f%% %s", percent, str) < 0)
+ callchain_node__scnprintf_value(child, buf, sizeof(buf),
+ total);
+
+ if (asprintf(&alloc_str, "%s %s", buf, str) < 0)
str = "Not enough memory!";
else
str = alloc_str;
diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 4b3585e..cff7bb9 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -100,14 +100,10 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
struct callchain_list *chain;
GtkTreeIter iter, new_parent;
bool need_new_parent;
- double percent;
- u64 hits, child_total;
+ u64 child_total;

node = rb_entry(nd, struct callchain_node, rb_node);

- hits = callchain_cumul_hits(node);
- percent = 100.0 * hits / total;
-
new_parent = *parent;
need_new_parent = !has_single_node && (node->val_nr > 1);

@@ -116,7 +112,7 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,

gtk_tree_store_append(store, &iter, &new_parent);

- scnprintf(buf, sizeof(buf), "%5.2f%%", percent);
+ callchain_node__scnprintf_value(node, buf, sizeof(buf), total);
gtk_tree_store_set(store, &iter, 0, buf, -1);

callchain_list__sym_name(chain, buf, sizeof(buf), false);
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index ea798493..f4de055 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -34,10 +34,10 @@ static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
return ret;
}

-static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
+static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
+ struct callchain_list *chain,
int depth, int depth_mask, int period,
- u64 total_samples, u64 hits,
- int left_margin)
+ u64 total_samples, int left_margin)
{
int i;
size_t ret = 0;
@@ -50,10 +50,9 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_list *chain,
else
ret += fprintf(fp, " ");
if (!period && i == depth - 1) {
- double percent;
-
- percent = hits * 100.0 / total_samples;
- ret += percent_color_fprintf(fp, "--%2.2f%%-- ", percent);
+ ret += fprintf(fp, "--");
+ ret += callchain_node__fprintf_value(node, fp, total_samples);
+ ret += fprintf(fp, "--");
} else
ret += fprintf(fp, "%s", " ");
}
@@ -120,10 +119,9 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
left_margin);
i = 0;
list_for_each_entry(chain, &child->val, list) {
- ret += ipchain__fprintf_graph(fp, chain, depth,
+ ret += ipchain__fprintf_graph(fp, child, chain, depth,
new_depth_mask, i++,
total_samples,
- cumul,
left_margin);
}

@@ -143,14 +141,17 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,

if (callchain_param.mode == CHAIN_GRAPH_REL &&
remaining && remaining != total_samples) {
+ struct callchain_node rem_node = {
+ .hit = remaining,
+ };

if (!rem_sq_bracket)
return ret;

new_depth_mask &= ~(1 << (depth - 1));
- ret += ipchain__fprintf_graph(fp, &rem_hits, depth,
+ ret += ipchain__fprintf_graph(fp, &rem_node, &rem_hits, depth,
new_depth_mask, 0, total_samples,
- remaining, left_margin);
+ left_margin);
}

return ret;
@@ -243,12 +244,11 @@ static size_t callchain__fprintf_flat(FILE *fp, struct rb_root *tree,
struct rb_node *rb_node = rb_first(tree);

while (rb_node) {
- double percent;
-
chain = rb_entry(rb_node, struct callchain_node, rb_node);
- percent = chain->hit * 100.0 / total_samples;

- ret = percent_color_fprintf(fp, " %6.2f%%\n", percent);
+ ret += fprintf(fp, " ");
+ ret += callchain_node__fprintf_value(chain, fp, total_samples);
+ ret += fprintf(fp, "\n");
ret += __callchain__fprintf_flat(fp, chain, total_samples);
ret += fprintf(fp, "\n");
if (++entries_printed == callchain_param.print_limit)
@@ -295,12 +295,11 @@ static size_t callchain__fprintf_folded(FILE *fp, struct rb_root *tree,
struct rb_node *rb_node = rb_first(tree);

while (rb_node) {
- double percent;

chain = rb_entry(rb_node, struct callchain_node, rb_node);
- percent = chain->hit * 100.0 / total_samples;

- ret += fprintf(fp, "%.2f%% ", percent);
+ ret += callchain_node__fprintf_value(chain, fp, total_samples);
+ ret += fprintf(fp, " ");
ret += __callchain__fprintf_folded(fp, chain);
ret += fprintf(fp, "\n");
if (++entries_printed == callchain_param.print_limit)
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 08cb220..b948bd0 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -805,6 +805,35 @@ char *callchain_list__sym_name(struct callchain_list *cl,
return bf;
}

+char *callchain_node__scnprintf_value(struct callchain_node *node,
+ char *bf, size_t bfsize, u64 total)
+{
+ double percent = 0.0;
+ u64 period = callchain_cumul_hits(node);
+
+ if (callchain_param.mode == CHAIN_FOLDED)
+ period = node->hit;
+ if (total)
+ percent = period * 100.0 / total;
+
+ scnprintf(bf, bfsize, "%.2f%%", percent);
+ return bf;
+}
+
+int callchain_node__fprintf_value(struct callchain_node *node,
+ FILE *fp, u64 total)
+{
+ double percent = 0.0;
+ u64 period = callchain_cumul_hits(node);
+
+ if (callchain_param.mode == CHAIN_FOLDED)
+ period = node->hit;
+ if (total)
+ percent = period * 100.0 / total;
+
+ return percent_color_fprintf(fp, "%.2f%%", percent);
+}
+
static void free_callchain_node(struct callchain_node *node)
{
struct callchain_list *list, *tmp;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 544d99a..060e636 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -230,6 +230,10 @@ static inline int arch_skip_callchain_idx(struct thread *thread __maybe_unused,

char *callchain_list__sym_name(struct callchain_list *cl,
char *bf, size_t bfsize, bool show_dso);
+char *callchain_node__scnprintf_value(struct callchain_node *node,
+ char *bf, size_t bfsize, u64 total);
+int callchain_node__fprintf_value(struct callchain_node *node,
+ FILE *fp, u64 total);

void free_callchain(struct callchain_root *root);

Subject: [tip:perf/core] perf callchain: Add count fields to struct callchain_node

Commit-ID: 5e47f8ff406296bd078716d71283796ca5c6544b
Gitweb: http://git.kernel.org/tip/5e47f8ff406296bd078716d71283796ca5c6544b
Author: Namhyung Kim <[email protected]>
AuthorDate: Mon, 9 Nov 2015 14:45:40 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:23 -0300

perf callchain: Add count fields to struct callchain_node

Add count and children_count fields to track the number of occurrences
of each callchain.
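
As a rough illustration of this bookkeeping (not the perf code itself),
the sketch below uses a hypothetical, simplified node type. sketch_node
and the sketch_* helpers are made-up names, and walking a parent pointer
to update the ancestors is an assumption made for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for perf's struct callchain_node. */
struct sketch_node {
	struct sketch_node *parent;
	unsigned int count;          /* samples ending exactly at this node */
	unsigned int children_count; /* samples ending somewhere below it */
	uint64_t hit;                /* period of samples ending here */
	uint64_t children_hit;       /* period of samples ending below */
};

/* Period of the node itself plus everything below it. */
static uint64_t sketch_cumul_hits(const struct sketch_node *node)
{
	return node->hit + node->children_hit;
}

/* Number of samples for the node itself plus everything below it. */
static unsigned int sketch_cumul_counts(const struct sketch_node *node)
{
	return node->count + node->children_count;
}

/* Account one sample whose callchain ends at 'leaf'. */
static void sketch_record_sample(struct sketch_node *leaf, uint64_t period)
{
	struct sketch_node *n;

	assert(leaf != NULL);

	leaf->hit += period;
	leaf->count++;

	/* Every ancestor sees the sample as belonging to its children. */
	for (n = leaf->parent; n != NULL; n = n->parent) {
		n->children_hit += period;
		n->children_count++;
	}
}
```

With two samples ending at one child and one sample ending at a sibling,
the common parent keeps count == 0 and children_count == 3, so its
cumulative count is 3.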

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Brendan Gregg <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/callchain.c | 10 ++++++++++
tools/perf/util/callchain.h | 7 +++++++
2 files changed, 17 insertions(+)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index b948bd0..e390edd 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -437,6 +437,8 @@ add_child(struct callchain_node *parent,

new->children_hit = 0;
new->hit = period;
+ new->children_count = 0;
+ new->count = 1;
return new;
}

@@ -484,6 +486,9 @@ split_add_child(struct callchain_node *parent,
parent->children_hit = callchain_cumul_hits(new);
new->val_nr = parent->val_nr - idx_local;
parent->val_nr = idx_local;
+ new->count = parent->count;
+ new->children_count = parent->children_count;
+ parent->children_count = callchain_cumul_counts(new);

/* create a new child for the new branch if any */
if (idx_total < cursor->nr) {
@@ -494,6 +499,8 @@ split_add_child(struct callchain_node *parent,

parent->hit = 0;
parent->children_hit += period;
+ parent->count = 0;
+ parent->children_count += 1;

node = callchain_cursor_current(cursor);
new = add_child(parent, cursor, period);
@@ -516,6 +523,7 @@ split_add_child(struct callchain_node *parent,
rb_insert_color(&new->rb_node_in, &parent->rb_root_in);
} else {
parent->hit = period;
+ parent->count = 1;
}
}

@@ -562,6 +570,7 @@ append_chain_children(struct callchain_node *root,

inc_children_hit:
root->children_hit += period;
+ root->children_count++;
}

static int
@@ -614,6 +623,7 @@ append_chain(struct callchain_node *root,
/* we match 100% of the path, increment the hit */
if (matches == root->val_nr && cursor->pos == cursor->nr) {
root->hit += period;
+ root->count++;
return 0;
}

diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 060e636..cdb386d 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -60,6 +60,8 @@ struct callchain_node {
struct rb_root rb_root_in; /* input tree of children */
struct rb_root rb_root; /* sorted output tree of children */
unsigned int val_nr;
+ unsigned int count;
+ unsigned int children_count;
u64 hit;
u64 children_hit;
};
@@ -145,6 +147,11 @@ static inline u64 callchain_cumul_hits(struct callchain_node *node)
return node->hit + node->children_hit;
}

+static inline unsigned callchain_cumul_counts(struct callchain_node *node)
+{
+ return node->count + node->children_count;
+}
+
int callchain_register_param(struct callchain_param *param);
int callchain_append(struct callchain_root *root,
struct callchain_cursor *cursor,

Subject: [tip:perf/core] perf report: Add callchain value option

Commit-ID: f2af008695e0b54a58b76caecd52af7e6c97fb29
Gitweb: http://git.kernel.org/tip/f2af008695e0b54a58b76caecd52af7e6c97fb29
Author: Namhyung Kim <[email protected]>
AuthorDate: Mon, 9 Nov 2015 14:45:41 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:23 -0300

perf report: Add callchain value option

The -g/--call-graph option now accepts a value that selects how
callchain values are displayed. Possible values are 'percent', 'period'
and 'count'. 'percent' is the same as before and remains the default.
'period' displays the raw period value rather than the percentage.
'count' displays the number of occurrences.

$ perf report --no-children --stdio -g percent
...
39.93% swapper [kernel.vmlinux] [k] intel_idle
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--28.63%-- start_secondary
|
--11.30%-- rest_init

$ perf report --no-children --show-total-period --stdio -g period
...
39.93% 13018705 swapper [kernel.vmlinux] [k] intel_idle
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--9334403-- start_secondary
|
--3684302-- rest_init

$ perf report --no-children --show-nr-samples --stdio -g count
...
39.93% 80 swapper [kernel.vmlinux] [k] intel_idle
|
---intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
|
|--57-- start_secondary
|
--23-- rest_init
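
The selection between the three values can be pictured with the small
sketch below. It is a standalone approximation, not the function in the
patch: sketch_value and sketch_format_value() are hypothetical names, and
plain snprintf() is used where perf has its own helpers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mirror of the percent/period/count choice. */
enum sketch_value {
	SKETCH_PERCENT,
	SKETCH_PERIOD,
	SKETCH_COUNT,
};

/*
 * Format one callchain value into 'buf' and return the snprintf()
 * result. 'total' is only used for the percent case; a zero total
 * yields 0.00% instead of dividing by zero.
 */
static int sketch_format_value(char *buf, size_t size,
			       enum sketch_value value,
			       uint64_t period, unsigned int count,
			       uint64_t total)
{
	double percent = 0.0;

	assert(buf != NULL && size > 0);

	switch (value) {
	case SKETCH_PERIOD:
		return snprintf(buf, size, "%" PRIu64, period);
	case SKETCH_COUNT:
		return snprintf(buf, size, "%u", count);
	case SKETCH_PERCENT:
	default:
		if (total)
			percent = period * 100.0 / total;
		return snprintf(buf, size, "%.2f%%", percent);
	}
}
```

Called with period 9334403 and count 57, the period case yields
"9334403" and the count case yields "57", matching the start_secondary
lines in the examples above.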

Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Brendan Gregg <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 13 ++++---
tools/perf/builtin-report.c | 4 +--
tools/perf/ui/stdio/hist.c | 10 +++++-
tools/perf/util/callchain.c | 62 +++++++++++++++++++++++++++-----
tools/perf/util/callchain.h | 10 +++++-
tools/perf/util/util.c | 3 +-
6 files changed, 84 insertions(+), 18 deletions(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index f7d81aa..dab99ed 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -170,11 +170,11 @@ OPTIONS
Dump raw trace in ASCII.

-g::
---call-graph=<print_type,threshold[,print_limit],order,sort_key,branch>::
+--call-graph=<print_type,threshold[,print_limit],order,sort_key[,branch],value>::
Display call chains using type, min percent threshold, print limit,
- call order, sort key and branch. Note that ordering of parameters is not
- fixed so any parement can be given in an arbitraty order. One exception
- is the print_limit which should be preceded by threshold.
+ call order, sort key, optional branch and value. Note that ordering of
+ parameters is not fixed so any parement can be given in an arbitraty order.
+ One exception is the print_limit which should be preceded by threshold.

print_type can be either:
- flat: single column, linear exposure of call chains.
@@ -205,6 +205,11 @@ OPTIONS
- branch: include last branch information in callgraph when available.
Usually more convenient to use --branch-history for this.

+ value can be:
+ - percent: diplay overhead percent (default)
+ - period: display event period
+ - count: display event count
+
--children::
Accumulate callchain of children to parent entry so that then can
show up in the output. The output will have a new "Children" column
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index f256fac..1442834 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -625,7 +625,7 @@ parse_percent_limit(const struct option *opt, const char *str,
return 0;
}

-#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function"
+#define CALLCHAIN_DEFAULT_OPT "graph,0.5,caller,function,percent"

const char report_callchain_help[] = "Display call graph (stack chain/backtrace):\n\n"
CALLCHAIN_REPORT_HELP
@@ -708,7 +708,7 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
OPT_BOOLEAN('x', "exclude-other", &symbol_conf.exclude_other,
"Only display entries with parent-match"),
OPT_CALLBACK_DEFAULT('g', "call-graph", &report,
- "print_type,threshold[,print_limit],order,sort_key[,branch]",
+ "print_type,threshold[,print_limit],order,sort_key[,branch],value",
report_callchain_help, &report_parse_callchain_opt,
callchain_default_opt),
OPT_BOOLEAN(0, "children", &symbol_conf.cumulate_callchain,
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index f4de055..7ebc661 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -81,13 +81,14 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
int depth_mask, int left_margin)
{
struct rb_node *node, *next;
- struct callchain_node *child;
+ struct callchain_node *child = NULL;
struct callchain_list *chain;
int new_depth_mask = depth_mask;
u64 remaining;
size_t ret = 0;
int i;
uint entries_printed = 0;
+ int cumul_count = 0;

remaining = total_samples;

@@ -99,6 +100,7 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
child = rb_entry(node, struct callchain_node, rb_node);
cumul = callchain_cumul_hits(child);
remaining -= cumul;
+ cumul_count += callchain_cumul_counts(child);

/*
* The depth mask manages the output of pipes that show
@@ -148,6 +150,12 @@ static size_t __callchain__fprintf_graph(FILE *fp, struct rb_root *root,
if (!rem_sq_bracket)
return ret;

+ if (callchain_param.value == CCVAL_COUNT && child && child->parent) {
+ rem_node.count = child->parent->children_count - cumul_count;
+ if (rem_node.count <= 0)
+ return ret;
+ }
+
new_depth_mask &= ~(1 << (depth - 1));
ret += ipchain__fprintf_graph(fp, &rem_node, &rem_hits, depth,
new_depth_mask, 0, total_samples,
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index e390edd..717c58c 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -83,6 +83,23 @@ static int parse_callchain_sort_key(const char *value)
return -1;
}

+static int parse_callchain_value(const char *value)
+{
+ if (!strncmp(value, "percent", strlen(value))) {
+ callchain_param.value = CCVAL_PERCENT;
+ return 0;
+ }
+ if (!strncmp(value, "period", strlen(value))) {
+ callchain_param.value = CCVAL_PERIOD;
+ return 0;
+ }
+ if (!strncmp(value, "count", strlen(value))) {
+ callchain_param.value = CCVAL_COUNT;
+ return 0;
+ }
+ return -1;
+}
+
static int
__parse_callchain_report_opt(const char *arg, bool allow_record_opt)
{
@@ -106,7 +123,8 @@ __parse_callchain_report_opt(const char *arg, bool allow_record_opt)

if (!parse_callchain_mode(tok) ||
!parse_callchain_order(tok) ||
- !parse_callchain_sort_key(tok)) {
+ !parse_callchain_sort_key(tok) ||
+ !parse_callchain_value(tok)) {
/* parsing ok - move on to the next */
try_stack_size = false;
goto next;
@@ -820,13 +838,27 @@ char *callchain_node__scnprintf_value(struct callchain_node *node,
{
double percent = 0.0;
u64 period = callchain_cumul_hits(node);
+ unsigned count = callchain_cumul_counts(node);

- if (callchain_param.mode == CHAIN_FOLDED)
+ if (callchain_param.mode == CHAIN_FOLDED) {
period = node->hit;
- if (total)
- percent = period * 100.0 / total;
+ count = node->count;
+ }

- scnprintf(bf, bfsize, "%.2f%%", percent);
+ switch (callchain_param.value) {
+ case CCVAL_PERIOD:
+ scnprintf(bf, bfsize, "%"PRIu64, period);
+ break;
+ case CCVAL_COUNT:
+ scnprintf(bf, bfsize, "%u", count);
+ break;
+ case CCVAL_PERCENT:
+ default:
+ if (total)
+ percent = period * 100.0 / total;
+ scnprintf(bf, bfsize, "%.2f%%", percent);
+ break;
+ }
return bf;
}

@@ -835,13 +867,25 @@ int callchain_node__fprintf_value(struct callchain_node *node,
{
double percent = 0.0;
u64 period = callchain_cumul_hits(node);
+ unsigned count = callchain_cumul_counts(node);

- if (callchain_param.mode == CHAIN_FOLDED)
+ if (callchain_param.mode == CHAIN_FOLDED) {
period = node->hit;
- if (total)
- percent = period * 100.0 / total;
+ count = node->count;
+ }

- return percent_color_fprintf(fp, "%.2f%%", percent);
+ switch (callchain_param.value) {
+ case CCVAL_PERIOD:
+ return fprintf(fp, "%"PRIu64, period);
+ case CCVAL_COUNT:
+ return fprintf(fp, "%u", count);
+ case CCVAL_PERCENT:
+ default:
+ if (total)
+ percent = period * 100.0 / total;
+ return percent_color_fprintf(fp, "%.2f%%", percent);
+ }
+ return 0;
}

static void free_callchain_node(struct callchain_node *node)
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index cdb386d..47bc0c5 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -29,7 +29,8 @@
HELP_PAD "print_limit:\tmaximum number of call graph entry (<number>)\n" \
HELP_PAD "order:\t\tcall graph order (caller|callee)\n" \
HELP_PAD "sort_key:\tcall graph sort key (function|address)\n" \
- HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n"
+ HELP_PAD "branch:\t\tinclude last branch info to call graph (branch)\n" \
+ HELP_PAD "value:\t\tcall graph value (percent|period|count)\n"

enum perf_call_graph_mode {
CALLCHAIN_NONE,
@@ -81,6 +82,12 @@ enum chain_key {
CCKEY_ADDRESS
};

+enum chain_value {
+ CCVAL_PERCENT,
+ CCVAL_PERIOD,
+ CCVAL_COUNT,
+};
+
struct callchain_param {
bool enabled;
enum perf_call_graph_mode record_mode;
@@ -93,6 +100,7 @@ struct callchain_param {
bool order_set;
enum chain_key key;
bool branch_callstack;
+ enum chain_value value;
};

extern struct callchain_param callchain_param;
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 47b1e36..75759ae 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -21,7 +21,8 @@ struct callchain_param callchain_param = {
.mode = CHAIN_GRAPH_ABS,
.min_percent = 0.5,
.order = ORDER_CALLEE,
- .key = CCKEY_FUNCTION
+ .key = CCKEY_FUNCTION,
+ .value = CCVAL_PERCENT,
};

/*

Subject: [tip:perf/core] perf hists browser: Factor out hist_browser__show_callchain_list()

Commit-ID: 18bb838129b08fb0009b1ba1dc2f748a9537ee89
Gitweb: http://git.kernel.org/tip/18bb838129b08fb0009b1ba1dc2f748a9537ee89
Author: Namhyung Kim <[email protected]>
AuthorDate: Mon, 9 Nov 2015 14:45:42 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:24 -0300

perf hists browser: Factor out hist_browser__show_callchain_list()

This function prints a single callchain list entry. Factor it out into
a separate function, as it will also be used by other functions.

Signed-off-by: Namhyung Kim <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/ui/browsers/hists.c | 72 ++++++++++++++++++++++++++----------------
1 file changed, 45 insertions(+), 27 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 0b18857..0746d41 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -574,6 +574,44 @@ static bool hist_browser__check_dump_full(struct hist_browser *browser __maybe_u

#define LEVEL_OFFSET_STEP 3

+static int hist_browser__show_callchain_list(struct hist_browser *browser,
+ struct callchain_node *node,
+ struct callchain_list *chain,
+ unsigned short row, u64 total,
+ bool need_percent, int offset,
+ print_callchain_entry_fn print,
+ struct callchain_print_arg *arg)
+{
+ char bf[1024], *alloc_str;
+ const char *str;
+
+ if (arg->row_offset != 0) {
+ arg->row_offset--;
+ return 0;
+ }
+
+ alloc_str = NULL;
+ str = callchain_list__sym_name(chain, bf, sizeof(bf),
+ browser->show_dso);
+
+ if (need_percent) {
+ char buf[64];
+
+ callchain_node__scnprintf_value(node, buf, sizeof(buf),
+ total);
+
+ if (asprintf(&alloc_str, "%s %s", buf, str) < 0)
+ str = "Not enough memory!";
+ else
+ str = alloc_str;
+ }
+
+ print(browser, chain, str, offset, row, arg);
+
+ free(alloc_str);
+ return 1;
+}
+
static int hist_browser__show_callchain(struct hist_browser *browser,
struct rb_root *root, int level,
unsigned short row, u64 total,
@@ -598,8 +636,6 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
int extra_offset = 0;

list_for_each_entry(chain, &child->val, list) {
- char bf[1024], *alloc_str;
- const char *str;
bool was_first = first;

if (first)
@@ -608,34 +644,16 @@ static int hist_browser__show_callchain(struct hist_browser *browser,
extra_offset = LEVEL_OFFSET_STEP;

folded_sign = callchain_list__folded(chain);
- if (arg->row_offset != 0) {
- arg->row_offset--;
- goto do_next;
- }

- alloc_str = NULL;
- str = callchain_list__sym_name(chain, bf, sizeof(bf),
- browser->show_dso);
+ row += hist_browser__show_callchain_list(browser, child,
+ chain, row, total,
+ was_first && need_percent,
+ offset + extra_offset,
+ print, arg);

- if (was_first && need_percent) {
- char buf[64];
-
- callchain_node__scnprintf_value(child, buf, sizeof(buf),
- total);
-
- if (asprintf(&alloc_str, "%s %s", buf, str) < 0)
- str = "Not enough memory!";
- else
- str = alloc_str;
- }
-
- print(browser, chain, str, offset + extra_offset, row, arg);
-
- free(alloc_str);
-
- if (is_output_full(browser, ++row))
+ if (is_output_full(browser, row))
goto out;
-do_next:
+
if (folded_sign == '+')
break;
}

Subject: [tip:perf/core] perf hists browser: Support flat callchains

Commit-ID: 4b3a3212233a042f48b7b8fedc64933e1ccd8643
Gitweb: http://git.kernel.org/tip/4b3a3212233a042f48b7b8fedc64933e1ccd8643
Author: Namhyung Kim <[email protected]>
AuthorDate: Mon, 9 Nov 2015 14:45:43 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:24 -0300

perf hists browser: Support flat callchains

The flat callchain mode prints every chain in a single, simple
hierarchy so that it is easy to read.

Currently perf report --tui doesn't show flat callchains properly. With
flat callchains, only leaf nodes are added to the final rbtree, so the
entries of their parent nodes have to be shown as well. To do that, add
a parent_val list to struct callchain_node and show it along with the
(normal) val list.

For example, consider the following callchains with '-g graph'.

$ perf report -g graph
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
- cpu_startup_entry
28.63% start_secondary
- 11.30% rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel

Before:
$ perf report -g flat
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
28.63% start_secondary
- 11.30% rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel

After:
$ perf report -g flat
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
- 28.63% intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
start_secondary
- 11.30% intel_idle
cpuidle_enter_state
cpuidle_enter
call_cpuidle
cpu_startup_entry
rest_init
start_kernel
x86_64_start_reservations
x86_64_start_kernel
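
The idea of collecting the parent entries for a leaf can be sketched as
follows. This is a deliberately reduced model, not the patch's
callchain_node__make_parent_list(): each hypothetical sketch_chain_node
holds one symbol instead of a list, and the result is written into a
caller-provided array rather than a linked list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node: one symbol per node, linked to its parent. */
struct sketch_chain_node {
	const struct sketch_chain_node *parent;
	const char *sym;
};

/*
 * Fill 'out' with the symbols from the root down to 'leaf', so a flat
 * view can print the whole chain even though only the leaf is in the
 * output tree. Returns the number of entries, or -1 if 'out' is too
 * small.
 */
static int sketch_full_chain(const struct sketch_chain_node *leaf,
			     const char **out, int max)
{
	const struct sketch_chain_node *n;
	int depth = 0;
	int i;

	assert(leaf != NULL && out != NULL);

	/* First pass: count the nodes on the path to the root. */
	for (n = leaf; n != NULL; n = n->parent)
		depth++;

	if (depth > max)
		return -1;

	/* Second pass: walk up again, filling the array back to front. */
	i = depth;
	for (n = leaf; n != NULL; n = n->parent)
		out[--i] = n->sym;

	return depth;
}
```

Walking up from the leaf and filling the array from the back gives the
root-first order without a separate reversal step.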

Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Brendan Gregg <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/ui/browsers/hists.c | 122 ++++++++++++++++++++++++++++++++++++++++-
tools/perf/util/callchain.c | 44 +++++++++++++++
tools/perf/util/callchain.h | 2 +
3 files changed, 166 insertions(+), 2 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 0746d41..c44af46 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -178,12 +178,44 @@ static int callchain_node__count_rows_rb_tree(struct callchain_node *node)
return n;
}

+static int callchain_node__count_flat_rows(struct callchain_node *node)
+{
+ struct callchain_list *chain;
+ char folded_sign = 0;
+ int n = 0;
+
+ list_for_each_entry(chain, &node->parent_val, list) {
+ if (!folded_sign) {
+ /* only check first chain list entry */
+ folded_sign = callchain_list__folded(chain);
+ if (folded_sign == '+')
+ return 1;
+ }
+ n++;
+ }
+
+ list_for_each_entry(chain, &node->val, list) {
+ if (!folded_sign) {
+ /* node->parent_val list might be empty */
+ folded_sign = callchain_list__folded(chain);
+ if (folded_sign == '+')
+ return 1;
+ }
+ n++;
+ }
+
+ return n;
+}
+
static int callchain_node__count_rows(struct callchain_node *node)
{
struct callchain_list *chain;
bool unfolded = false;
int n = 0;

+ if (callchain_param.mode == CHAIN_FLAT)
+ return callchain_node__count_flat_rows(node);
+
list_for_each_entry(chain, &node->val, list) {
++n;
unfolded = chain->unfolded;
@@ -263,7 +295,7 @@ static void callchain_node__init_have_children(struct callchain_node *node,
chain = list_entry(node->val.next, struct callchain_list, list);
chain->has_children = has_sibling;

- if (!list_empty(&node->val)) {
+ if (node->val.next != node->val.prev) {
chain = list_entry(node->val.prev, struct callchain_list, list);
chain->has_children = !RB_EMPTY_ROOT(&node->rb_root);
}
@@ -279,6 +311,8 @@ static void callchain__init_have_children(struct rb_root *root)
for (nd = rb_first(root); nd; nd = rb_next(nd)) {
struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node);
callchain_node__init_have_children(node, has_sibling);
+ if (callchain_param.mode == CHAIN_FLAT)
+ callchain_node__make_parent_list(node);
}
}

@@ -612,6 +646,83 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
return 1;
}

+static int hist_browser__show_callchain_flat(struct hist_browser *browser,
+ struct rb_root *root,
+ unsigned short row, u64 total,
+ print_callchain_entry_fn print,
+ struct callchain_print_arg *arg,
+ check_output_full_fn is_output_full)
+{
+ struct rb_node *node;
+ int first_row = row, offset = LEVEL_OFFSET_STEP;
+ bool need_percent;
+
+ node = rb_first(root);
+ need_percent = node && rb_next(node);
+
+ while (node) {
+ struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
+ struct rb_node *next = rb_next(node);
+ struct callchain_list *chain;
+ char folded_sign = ' ';
+ int first = true;
+ int extra_offset = 0;
+
+ list_for_each_entry(chain, &child->parent_val, list) {
+ bool was_first = first;
+
+ if (first)
+ first = false;
+ else if (need_percent)
+ extra_offset = LEVEL_OFFSET_STEP;
+
+ folded_sign = callchain_list__folded(chain);
+
+ row += hist_browser__show_callchain_list(browser, child,
+ chain, row, total,
+ was_first && need_percent,
+ offset + extra_offset,
+ print, arg);
+
+ if (is_output_full(browser, row))
+ goto out;
+
+ if (folded_sign == '+')
+ goto next;
+ }
+
+ list_for_each_entry(chain, &child->val, list) {
+ bool was_first = first;
+
+ if (first)
+ first = false;
+ else if (need_percent)
+ extra_offset = LEVEL_OFFSET_STEP;
+
+ folded_sign = callchain_list__folded(chain);
+
+ row += hist_browser__show_callchain_list(browser, child,
+ chain, row, total,
+ was_first && need_percent,
+ offset + extra_offset,
+ print, arg);
+
+ if (is_output_full(browser, row))
+ goto out;
+
+ if (folded_sign == '+')
+ break;
+ }
+
+next:
+ if (is_output_full(browser, row))
+ break;
+ node = next;
+ }
+out:
+ return row - first_row;
+}
+
static int hist_browser__show_callchain(struct hist_browser *browser,
struct rb_root *root, int level,
unsigned short row, u64 total,
@@ -864,10 +975,17 @@ static int hist_browser__show_entry(struct hist_browser *browser,
total = entry->stat.period;
}

- printed += hist_browser__show_callchain(browser,
+ if (callchain_param.mode == CHAIN_FLAT) {
+ printed += hist_browser__show_callchain_flat(browser,
+ &entry->sorted_chain, row, total,
+ hist_browser__show_callchain_entry, &arg,
+ hist_browser__check_output_full);
+ } else {
+ printed += hist_browser__show_callchain(browser,
&entry->sorted_chain, 1, row, total,
hist_browser__show_callchain_entry, &arg,
hist_browser__check_output_full);
+ }

if (arg.is_current_entry)
browser->he_selection = entry;
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 717c58c..fc3b1e0 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -387,6 +387,7 @@ create_child(struct callchain_node *parent, bool inherit_children)
}
new->parent = parent;
INIT_LIST_HEAD(&new->val);
+ INIT_LIST_HEAD(&new->parent_val);

if (inherit_children) {
struct rb_node *n;
@@ -894,6 +895,11 @@ static void free_callchain_node(struct callchain_node *node)
struct callchain_node *child;
struct rb_node *n;

+ list_for_each_entry_safe(list, tmp, &node->parent_val, list) {
+ list_del(&list->list);
+ free(list);
+ }
+
list_for_each_entry_safe(list, tmp, &node->val, list) {
list_del(&list->list);
free(list);
@@ -917,3 +923,41 @@ void free_callchain(struct callchain_root *root)

free_callchain_node(&root->node);
}
+
+int callchain_node__make_parent_list(struct callchain_node *node)
+{
+ struct callchain_node *parent = node->parent;
+ struct callchain_list *chain, *new;
+ LIST_HEAD(head);
+
+ while (parent) {
+ list_for_each_entry_reverse(chain, &parent->val, list) {
+ new = malloc(sizeof(*new));
+ if (new == NULL)
+ goto out;
+ *new = *chain;
+ new->has_children = false;
+ list_add_tail(&new->list, &head);
+ }
+ parent = parent->parent;
+ }
+
+ list_for_each_entry_safe_reverse(chain, new, &head, list)
+ list_move_tail(&chain->list, &node->parent_val);
+
+ if (!list_empty(&node->parent_val)) {
+ chain = list_first_entry(&node->parent_val, struct callchain_list, list);
+ chain->has_children = rb_prev(&node->rb_node) || rb_next(&node->rb_node);
+
+ chain = list_first_entry(&node->val, struct callchain_list, list);
+ chain->has_children = false;
+ }
+ return 0;
+
+out:
+ list_for_each_entry_safe(chain, new, &head, list) {
+ list_del(&chain->list);
+ free(chain);
+ }
+ return -ENOMEM;
+}
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index 47bc0c5..6e9b5f2 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -56,6 +56,7 @@ enum chain_order {
struct callchain_node {
struct callchain_node *parent;
struct list_head val;
+ struct list_head parent_val;
struct rb_node rb_node_in; /* to insert nodes in an rbtree */
struct rb_node rb_node; /* to sort nodes in an output tree */
struct rb_root rb_root_in; /* input tree of children */
@@ -251,5 +252,6 @@ int callchain_node__fprintf_value(struct callchain_node *node,
FILE *fp, u64 total);

void free_callchain(struct callchain_root *root);
+int callchain_node__make_parent_list(struct callchain_node *node);

#endif /* __PERF_CALLCHAIN_H */

Subject: [tip:perf/core] perf hists browser: Support folded callchains

Commit-ID: 8c430a34869946f1f5852f02d910ceef80040be5
Gitweb: http://git.kernel.org/tip/8c430a34869946f1f5852f02d910ceef80040be5
Author: Namhyung Kim <[email protected]>
AuthorDate: Mon, 9 Nov 2015 14:45:44 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:25 -0300

perf hists browser: Support folded callchains

The folded callchain mode prints each chain on a single line.

Currently perf report --tui doesn't support folded callchains. As with
flat callchains, only leaf nodes are added to the final rbtree, so the
entries of their parent nodes have to be shown as well. To do that, show
the parent_val list (added for flat callchains) along with the (normal)
val list.

For example, folded callchain looks like below:

$ perf report -g folded --tui
Samples: 234 of event 'cycles:pp', Event count (approx.): 32605268
Overhead Command Shared Object Symbol
- 39.93% swapper [kernel.vmlinux] [k] intel_idle
+ 28.63% intel_idle; cpuidle_enter_state; cpuidle_enter; ...
+ 11.30% intel_idle; cpuidle_enter_state; cpuidle_enter; ...
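
The folding step described above can be sketched outside of perf as
follows. This is only an illustration under stated assumptions:
fold_chain() is a hypothetical helper, not a perf function, and it takes
a plain array of symbol names instead of walking callchain_list entries.
The patch itself grows the string with asprintf() one entry at a time;
the sketch computes the length first and uses snprintf() so that it
needs nothing beyond standard C.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of perf): build one folded line of the
 * form "<value> sym1<sep>sym2<sep>...". 'value' may be NULL, in which
 * case the line starts with the first symbol. 'sep' defaults to ";".
 * Returns a malloc()ed string, or NULL if nr is 0 or allocation fails.
 */
static char *fold_chain(const char *value, const char *const *syms,
			size_t nr, const char *sep)
{
	size_t len = 0, pos = 0, i;
	char *line;

	if (nr == 0)
		return NULL;
	if (!sep)
		sep = ";";

	/* First pass: compute the exact length of the folded line. */
	if (value)
		len += strlen(value) + 1;	/* value plus one space */
	for (i = 0; i < nr; i++)
		len += strlen(syms[i]);
	len += (nr - 1) * strlen(sep);

	line = malloc(len + 1);
	if (!line)
		return NULL;

	/* Second pass: write the value, then the joined symbols. */
	if (value)
		pos += snprintf(line + pos, len + 1 - pos, "%s ", value);
	for (i = 0; i < nr; i++)
		pos += snprintf(line + pos, len + 1 - pos, "%s%s",
				i ? sep : "", syms[i]);
	return line;
}
```

The caller owns the returned string and must free() it. Passing the
separator as a parameter mirrors how the patch falls back to a default
when symbol_conf.field_sep is not set.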

Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Brendan Gregg <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/ui/browsers/hists.c | 125 ++++++++++++++++++++++++++++++++++++++++-
1 file changed, 124 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index c44af46..a211b7b 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -207,6 +207,11 @@ static int callchain_node__count_flat_rows(struct callchain_node *node)
return n;
}

+static int callchain_node__count_folded_rows(struct callchain_node *node __maybe_unused)
+{
+ return 1;
+}
+
static int callchain_node__count_rows(struct callchain_node *node)
{
struct callchain_list *chain;
@@ -215,6 +220,8 @@ static int callchain_node__count_rows(struct callchain_node *node)

if (callchain_param.mode == CHAIN_FLAT)
return callchain_node__count_flat_rows(node);
+ else if (callchain_param.mode == CHAIN_FOLDED)
+ return callchain_node__count_folded_rows(node);

list_for_each_entry(chain, &node->val, list) {
++n;
@@ -311,7 +318,8 @@ static void callchain__init_have_children(struct rb_root *root)
for (nd = rb_first(root); nd; nd = rb_next(nd)) {
struct callchain_node *node = rb_entry(nd, struct callchain_node, rb_node);
callchain_node__init_have_children(node, has_sibling);
- if (callchain_param.mode == CHAIN_FLAT)
+ if (callchain_param.mode == CHAIN_FLAT ||
+ callchain_param.mode == CHAIN_FOLDED)
callchain_node__make_parent_list(node);
}
}
@@ -723,6 +731,116 @@ out:
return row - first_row;
}

+static char *hist_browser__folded_callchain_str(struct hist_browser *browser,
+ struct callchain_list *chain,
+ char *value_str, char *old_str)
+{
+ char bf[1024];
+ const char *str;
+ char *new;
+
+ str = callchain_list__sym_name(chain, bf, sizeof(bf),
+ browser->show_dso);
+ if (old_str) {
+ if (asprintf(&new, "%s%s%s", old_str,
+ symbol_conf.field_sep ?: ";", str) < 0)
+ new = NULL;
+ } else {
+ if (value_str) {
+ if (asprintf(&new, "%s %s", value_str, str) < 0)
+ new = NULL;
+ } else {
+ if (asprintf(&new, "%s", str) < 0)
+ new = NULL;
+ }
+ }
+ return new;
+}
+
+static int hist_browser__show_callchain_folded(struct hist_browser *browser,
+ struct rb_root *root,
+ unsigned short row, u64 total,
+ print_callchain_entry_fn print,
+ struct callchain_print_arg *arg,
+ check_output_full_fn is_output_full)
+{
+ struct rb_node *node;
+ int first_row = row, offset = LEVEL_OFFSET_STEP;
+ bool need_percent;
+
+ node = rb_first(root);
+ need_percent = node && rb_next(node);
+
+ while (node) {
+ struct callchain_node *child = rb_entry(node, struct callchain_node, rb_node);
+ struct rb_node *next = rb_next(node);
+ struct callchain_list *chain, *first_chain = NULL;
+ int first = true;
+ char *value_str = NULL, *value_str_alloc = NULL;
+ char *chain_str = NULL, *chain_str_alloc = NULL;
+
+ if (arg->row_offset != 0) {
+ arg->row_offset--;
+ goto next;
+ }
+
+ if (need_percent) {
+ char buf[64];
+
+ callchain_node__scnprintf_value(child, buf, sizeof(buf), total);
+ if (asprintf(&value_str, "%s", buf) < 0) {
+ value_str = (char *)"<...>";
+ goto do_print;
+ }
+ value_str_alloc = value_str;
+ }
+
+ list_for_each_entry(chain, &child->parent_val, list) {
+ chain_str = hist_browser__folded_callchain_str(browser,
+ chain, value_str, chain_str);
+ if (first) {
+ first = false;
+ first_chain = chain;
+ }
+
+ if (chain_str == NULL) {
+ chain_str = (char *)"Not enough memory!";
+ goto do_print;
+ }
+
+ chain_str_alloc = chain_str;
+ }
+
+ list_for_each_entry(chain, &child->val, list) {
+ chain_str = hist_browser__folded_callchain_str(browser,
+ chain, value_str, chain_str);
+ if (first) {
+ first = false;
+ first_chain = chain;
+ }
+
+ if (chain_str == NULL) {
+ chain_str = (char *)"Not enough memory!";
+ goto do_print;
+ }
+
+ chain_str_alloc = chain_str;
+ }
+
+do_print:
+ print(browser, first_chain, chain_str, offset, row++, arg);
+ free(value_str_alloc);
+ free(chain_str_alloc);
+
+next:
+ if (is_output_full(browser, row))
+ break;
+ node = next;
+ }
+
+ return row - first_row;
+}
+
static int hist_browser__show_callchain(struct hist_browser *browser,
struct rb_root *root, int level,
unsigned short row, u64 total,
@@ -980,6 +1098,11 @@ static int hist_browser__show_entry(struct hist_browser *browser,
&entry->sorted_chain, row, total,
hist_browser__show_callchain_entry, &arg,
hist_browser__check_output_full);
+ } else if (callchain_param.mode == CHAIN_FOLDED) {
+ printed += hist_browser__show_callchain_folded(browser,
+ &entry->sorted_chain, row, total,
+ hist_browser__show_callchain_entry, &arg,
+ hist_browser__check_output_full);
} else {
printed += hist_browser__show_callchain(browser,
&entry->sorted_chain, 1, row, total,

Subject: [tip:perf/core] perf ui/gtk: Support flat callchains

Commit-ID: 3cd99dfd1c87067fb28a19fee76500aed56d7c8f
Gitweb: http://git.kernel.org/tip/3cd99dfd1c87067fb28a19fee76500aed56d7c8f
Author: Namhyung Kim <[email protected]>
AuthorDate: Mon, 9 Nov 2015 14:45:45 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:25 -0300

perf ui/gtk: Support flat callchains

The flat callchain mode prints all chains in a simple flat hierarchy,
which makes them easy to read.

Currently perf report --gtk doesn't show flat callchains properly. With
flat callchains, only leaf nodes are added to the final rbtree, so the
output must also show the entries held in their parent nodes. To do
that, use the parent_val list in struct callchain_node and show its
entries along with the (normal) val list.

See the previous commit on TUI support for more information.
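
The idea of showing parent entries ahead of a leaf's own entries can be
sketched with a simplified model. Everything here is an assumption made
for illustration: struct node and full_chain() are hypothetical
stand-ins, and the real code keeps the entries in list_head lists
(parent_val and val) rather than arrays.

```c
#include <stddef.h>

/*
 * Simplified stand-in for struct callchain_node: each node holds the
 * symbols of one chain segment and a pointer to its parent segment.
 */
struct node {
	struct node *parent;
	const char *const *syms;
	size_t nr;
};

/*
 * Store the full chain for 'leaf' in 'out' (at most 'max' entries),
 * root-most segment first and the leaf's own symbols last.
 * Returns the number of entries in the full chain, which may exceed
 * 'max' if the output array is too small.
 */
static size_t full_chain(const struct node *leaf, const char **out,
			 size_t max)
{
	size_t n, i;

	if (!leaf)
		return 0;

	/* Parents first: the role parent_val plays in the patch. */
	n = full_chain(leaf->parent, out, max);

	/* Then the node's own entries (the normal val list). */
	for (i = 0; i < leaf->nr; i++, n++)
		if (n < max)
			out[n] = leaf->syms[i];
	return n;
}
```

With a root segment holding the shared prefix and a leaf holding only
its diverging tail, full_chain() on the leaf yields the complete chain,
which is what a flat or folded view has to print for each leaf.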

Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Brendan Gregg <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/ui/gtk/hists.c | 80 ++++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 76 insertions(+), 4 deletions(-)

diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index cff7bb9..0b24cd6 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -89,8 +89,71 @@ void perf_gtk__init_hpp(void)
perf_gtk__hpp_color_overhead_acc;
}

-static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
- GtkTreeIter *parent, int col, u64 total)
+static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *store,
+ GtkTreeIter *parent, int col, u64 total)
+{
+ struct rb_node *nd;
+ bool has_single_node = (rb_first(root) == rb_last(root));
+
+ for (nd = rb_first(root); nd; nd = rb_next(nd)) {
+ struct callchain_node *node;
+ struct callchain_list *chain;
+ GtkTreeIter iter, new_parent;
+ bool need_new_parent;
+
+ node = rb_entry(nd, struct callchain_node, rb_node);
+
+ new_parent = *parent;
+ need_new_parent = !has_single_node;
+
+ callchain_node__make_parent_list(node);
+
+ list_for_each_entry(chain, &node->parent_val, list) {
+ char buf[128];
+
+ gtk_tree_store_append(store, &iter, &new_parent);
+
+ callchain_node__scnprintf_value(node, buf, sizeof(buf), total);
+ gtk_tree_store_set(store, &iter, 0, buf, -1);
+
+ callchain_list__sym_name(chain, buf, sizeof(buf), false);
+ gtk_tree_store_set(store, &iter, col, buf, -1);
+
+ if (need_new_parent) {
+ /*
+ * Only show the top-most symbol in a callchain
+ * if it's not the only callchain.
+ */
+ new_parent = iter;
+ need_new_parent = false;
+ }
+ }
+
+ list_for_each_entry(chain, &node->val, list) {
+ char buf[128];
+
+ gtk_tree_store_append(store, &iter, &new_parent);
+
+ callchain_node__scnprintf_value(node, buf, sizeof(buf), total);
+ gtk_tree_store_set(store, &iter, 0, buf, -1);
+
+ callchain_list__sym_name(chain, buf, sizeof(buf), false);
+ gtk_tree_store_set(store, &iter, col, buf, -1);
+
+ if (need_new_parent) {
+ /*
+ * Only show the top-most symbol in a callchain
+ * if it's not the only callchain.
+ */
+ new_parent = iter;
+ need_new_parent = false;
+ }
+ }
+ }
+}
+
+static void perf_gtk__add_callchain_graph(struct rb_root *root, GtkTreeStore *store,
+ GtkTreeIter *parent, int col, u64 total)
{
struct rb_node *nd;
bool has_single_node = (rb_first(root) == rb_last(root));
@@ -134,11 +197,20 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
child_total = total;

/* Now 'iter' contains info of the last callchain_list */
- perf_gtk__add_callchain(&node->rb_root, store, &iter, col,
- child_total);
+ perf_gtk__add_callchain_graph(&node->rb_root, store, &iter, col,
+ child_total);
}
}

+static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
+ GtkTreeIter *parent, int col, u64 total)
+{
+ if (callchain_param.mode == CHAIN_FLAT)
+ perf_gtk__add_callchain_flat(root, store, parent, col, total);
+ else
+ perf_gtk__add_callchain_graph(root, store, parent, col, total);
+}
+
static void on_row_activated(GtkTreeView *view, GtkTreePath *path,
GtkTreeViewColumn *col __maybe_unused,
gpointer user_data __maybe_unused)

Subject: [tip:perf/core] perf ui/gtk: Support folded callchains

Commit-ID: 2c6caff2b26fde8f3f87183f8c97f2cebfdbcb98
Gitweb: http://git.kernel.org/tip/2c6caff2b26fde8f3f87183f8c97f2cebfdbcb98
Author: Namhyung Kim <[email protected]>
AuthorDate: Mon, 9 Nov 2015 14:45:46 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:26 -0300

perf ui/gtk: Support folded callchains

The folded callchain mode prints each chain on a single line.
Currently perf report --gtk doesn't support folded callchains. As with
flat callchains, only leaf nodes are added to the final rbtree, so the
output must also show the entries held in their parent nodes.

Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Brendan Gregg <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/ui/gtk/hists.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 62 insertions(+)

diff --git a/tools/perf/ui/gtk/hists.c b/tools/perf/ui/gtk/hists.c
index 0b24cd6..4677172 100644
--- a/tools/perf/ui/gtk/hists.c
+++ b/tools/perf/ui/gtk/hists.c
@@ -152,6 +152,66 @@ static void perf_gtk__add_callchain_flat(struct rb_root *root, GtkTreeStore *sto
}
}

+static void perf_gtk__add_callchain_folded(struct rb_root *root, GtkTreeStore *store,
+ GtkTreeIter *parent, int col, u64 total)
+{
+ struct rb_node *nd;
+
+ for (nd = rb_first(root); nd; nd = rb_next(nd)) {
+ struct callchain_node *node;
+ struct callchain_list *chain;
+ GtkTreeIter iter;
+ char buf[64];
+ char *str, *str_alloc = NULL;
+ bool first = true;
+
+ node = rb_entry(nd, struct callchain_node, rb_node);
+
+ callchain_node__make_parent_list(node);
+
+ list_for_each_entry(chain, &node->parent_val, list) {
+ char name[1024];
+
+ callchain_list__sym_name(chain, name, sizeof(name), false);
+
+ if (asprintf(&str, "%s%s%s",
+ first ? "" : str_alloc,
+ first ? "" : symbol_conf.field_sep ?: "; ",
+ name) < 0)
+ return;
+
+ first = false;
+ free(str_alloc);
+ str_alloc = str;
+ }
+
+ list_for_each_entry(chain, &node->val, list) {
+ char name[1024];
+
+ callchain_list__sym_name(chain, name, sizeof(name), false);
+
+ if (asprintf(&str, "%s%s%s",
+ first ? "" : str_alloc,
+ first ? "" : symbol_conf.field_sep ?: "; ",
+ name) < 0)
+ return;
+
+ first = false;
+ free(str_alloc);
+ str_alloc = str;
+ }
+
+ gtk_tree_store_append(store, &iter, parent);
+
+ callchain_node__scnprintf_value(node, buf, sizeof(buf), total);
+ gtk_tree_store_set(store, &iter, 0, buf, -1);
+
+ gtk_tree_store_set(store, &iter, col, str, -1);
+
+ free(str_alloc);
+ }
+}
+
static void perf_gtk__add_callchain_graph(struct rb_root *root, GtkTreeStore *store,
GtkTreeIter *parent, int col, u64 total)
{
@@ -207,6 +267,8 @@ static void perf_gtk__add_callchain(struct rb_root *root, GtkTreeStore *store,
{
if (callchain_param.mode == CHAIN_FLAT)
perf_gtk__add_callchain_flat(root, store, parent, col, total);
+ else if (callchain_param.mode == CHAIN_FOLDED)
+ perf_gtk__add_callchain_folded(root, store, parent, col, total);
else
perf_gtk__add_callchain_graph(root, store, parent, col, total);
}