2017-03-03 02:46:05

by Jin Yao

Subject: [PATCH v4 0/5] perf report: Show inline stack

v4: Remove the options "--inline-line" and "--inline-name". Use a single
new option "--inline" to print the inline function information.
The policy: if the inline function name can be resolved, print the
name. If the name can't be resolved, print the source file and line
number instead.

For example:
perf report --stdio --inline

0.69% 0.00% inline ld-2.23.so [.] dl_main
|
---dl_main
|
--0.56%--_dl_relocate_object
|
---_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)
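
The name-first policy can be sketched roughly as below. This is only an
illustration: struct inline_entry and inline_entry__label() are
hypothetical names for this sketch, not the structures or functions used
by the patches.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-in for one resolved inline entry (not the perf struct). */
struct inline_entry {
	const char *funcname;	/* NULL if the name could not be resolved */
	const char *filename;	/* source file, used for the fallback label */
	unsigned int line_nr;	/* source line, used for the fallback label */
};

/*
 * Format the label for one inline entry: prefer the function name and
 * fall back to "file:line" when no name is available.
 * Returns what snprintf() returns.
 */
static int inline_entry__label(const struct inline_entry *e,
			       char *buf, size_t size)
{
	if (e->funcname != NULL)
		return snprintf(buf, size, "%s (inline)", e->funcname);

	return snprintf(buf, size, "%s:%u (inline)", e->filename, e->line_nr);
}
```

An entry with a resolved name yields "name (inline)", while an entry
without one yields "file:line (inline)", matching the two output styles
shown in this cover letter.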

The following 3 patches are updated according to this change:
perf report: Show inline stack in browser mode
perf report: Show inline stack in stdio mode
perf report: Create new inline option

The following patches are not changed:
perf report: Find the inline stack for a given address
perf report: Refactor common code in srcline.c

v3: Iterate over the RIPs of all callchain entries to check whether each
RIP is in an inline function.

Reverse the order of the inliner printout if necessary.

Provide new options "--inline-line" / "--inline-name" to print
either the inline function's source line or its name.

v2: Thanks so much for Arnaldo's comments!
The modifications are:

1. Divide v1 patch "perf report: Find the inline stack for a
given address" into 2 patches:
a. perf report: Refactor common code in srcline.c
b. perf report: Find the inline stack for a given address

Some function names are changed:
dso_name_get -> dso__name
ilist_apend -> inline_list__append
get_inline_node -> dso__parse_addr_inlines
free_inline_node -> inline_node__delete

2. Since the function names are changed, update the following patches
accordingly.
a. perf report: Show inline stack in stdio mode
b. perf report: Show inline stack in browser mode

3. Rebase to the latest perf/core branch. This patch is affected:
a. perf report: Create a new option "--inline"

v1: Initial post

It would be useful for perf to support a mode to query the
inline stack for callgraph addresses. This would simplify
finding the right code in code that does a lot of inlining.

For example, the C code:

static inline void f3(void)
{
	int i;
	for (i = 0; i < 1000;) {

		if(i%2)
			i++;
		else
			i++;
	}
	printf("hello f3\n"); /* D */
}

/* < CALLCHAIN: f2 <- f1 > */
static inline void f2(void)
{
	int i;
	for (i = 0; i < 100; i++) {
		f3(); /* C */
	}
}

/* < CALLCHAIN: f1 <- main > */
static inline void f1(void)
{
	int i;
	for (i = 0; i < 100; i++) {
		f2(); /* B */
	}
}

/* < CALLCHAIN: main <- TOP > */
int main()
{
	struct timeval tv;
	time_t start, end;

	gettimeofday(&tv, NULL);
	start = end = tv.tv_sec;
	while((end - start) < 5) {
		f1(); /* A */
		gettimeofday(&tv, NULL);
		end = tv.tv_sec;
	}
	return 0;
}

The printed inline stack is:

0.05% test2 test2 [.] main
|
---/home/perf-dev/lck-2867/test/test2.c:27 (inline)
/home/perf-dev/lck-2867/test/test2.c:35 (inline)
/home/perf-dev/lck-2867/test/test2.c:45 (inline)
/home/perf-dev/lck-2867/test/test2.c:61 (inline)

I tagged A/B/C/D in the above C code to mark the source lines, so
the inline stack is equivalent to:

0.05% test2 test2 [.] main
|
---D
C
B
A

Jin Yao (5):
perf report: Refactor common code in srcline.c
perf report: Find the inline stack for a given address
perf report: Create new inline option
perf report: Show inline stack in stdio mode
perf report: Show inline stack in browser mode

tools/perf/Documentation/perf-report.txt | 4 +
tools/perf/builtin-report.c | 2 +
tools/perf/ui/browsers/hists.c | 168 ++++++++++++++++++++--
tools/perf/ui/stdio/hist.c | 76 +++++++++-
tools/perf/util/hist.c | 5 +
tools/perf/util/sort.h | 1 +
tools/perf/util/srcline.c | 237 +++++++++++++++++++++++++++----
tools/perf/util/symbol-elf.c | 5 +
tools/perf/util/symbol.h | 5 +-
tools/perf/util/util.h | 16 +++
10 files changed, 481 insertions(+), 38 deletions(-)

--
2.7.4


2017-03-03 02:46:04

by Jin Yao

Subject: [PATCH v4 1/5] perf report: Refactor common code in srcline.c

Factor dso__name() and filename_split() out of the existing code,
because both will be used in several places by the next patch.

filename_split() may also fix a potential memory leak in the
existing code. In the existing addr2line(),

sep = strchr(filename, ':');
if (sep) {
*sep++ = '\0';
*file = filename;
*line_nr = strtoul(sep, NULL, 0);
ret = 1;
}

out:
pclose(fp);
return ret;

If sep is NULL, filename is neither freed nor returned via file, so it leaks.
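
The ownership rule behind the fix can be sketched in isolation as below.
This is a simplified, standalone illustration: split_file_line() mirrors
the parsing described above, and take_file_line() is a hypothetical
caller, not the addr2line() from the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Split "file:line\n" in place. Returns 1 and sets *line_nr on success,
 * 0 if the input is "??:0" or contains no ':' separator.
 */
static int split_file_line(char *s, unsigned int *line_nr)
{
	char *sep = strchr(s, '\n');

	if (sep)
		*sep = '\0';

	if (!strcmp(s, "??:0"))
		return 0;

	sep = strchr(s, ':');
	if (sep == NULL)
		return 0;

	*sep++ = '\0';
	*line_nr = (unsigned int)strtoul(sep, NULL, 0);
	return 1;
}

/*
 * Takes ownership of the heap buffer buf. On success ownership moves to
 * *file; on every failure path buf is freed, so nothing leaks.
 */
static int take_file_line(char *buf, char **file, unsigned int *line_nr)
{
	if (split_file_line(buf, line_nr) != 1) {
		free(buf);
		return 0;
	}

	*file = buf;
	return 1;
}
```

Because the split helper reports failure for both "??:0" and a missing
separator, the caller needs only one cleanup branch.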

Signed-off-by: Jin Yao <[email protected]>
Tested-by: Milian Wolff <[email protected]>
---
tools/perf/util/srcline.c | 68 +++++++++++++++++++++++++++++++----------------
1 file changed, 45 insertions(+), 23 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index b4db3f4..2953c9f 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -12,6 +12,24 @@

bool srcline_full_filename;

+static const char *dso__name(struct dso *dso)
+{
+ const char *dso_name;
+
+ if (dso->symsrc_filename)
+ dso_name = dso->symsrc_filename;
+ else
+ dso_name = dso->long_name;
+
+ if (dso_name[0] == '[')
+ return NULL;
+
+ if (!strncmp(dso_name, "/tmp/perf-", 10))
+ return NULL;
+
+ return dso_name;
+}
+
#ifdef HAVE_LIBBFD_SUPPORT

/*
@@ -207,6 +225,27 @@ void dso__free_a2l(struct dso *dso)

#else /* HAVE_LIBBFD_SUPPORT */

+static int filename_split(char *filename, unsigned int *line_nr)
+{
+ char *sep;
+
+ sep = strchr(filename, '\n');
+ if (sep)
+ *sep = '\0';
+
+ if (!strcmp(filename, "??:0"))
+ return 0;
+
+ sep = strchr(filename, ':');
+ if (sep) {
+ *sep++ = '\0';
+ *line_nr = strtoul(sep, NULL, 0);
+ return 1;
+ }
+
+ return 0;
+}
+
static int addr2line(const char *dso_name, u64 addr,
char **file, unsigned int *line_nr,
struct dso *dso __maybe_unused,
@@ -216,7 +255,6 @@ static int addr2line(const char *dso_name, u64 addr,
char cmd[PATH_MAX];
char *filename = NULL;
size_t len;
- char *sep;
int ret = 0;

scnprintf(cmd, sizeof(cmd), "addr2line -e %s %016"PRIx64,
@@ -233,23 +271,14 @@ static int addr2line(const char *dso_name, u64 addr,
goto out;
}

- sep = strchr(filename, '\n');
- if (sep)
- *sep = '\0';
-
- if (!strcmp(filename, "??:0")) {
- pr_debug("no debugging info in %s\n", dso_name);
+ ret = filename_split(filename, line_nr);
+ if (ret != 1) {
free(filename);
goto out;
}

- sep = strchr(filename, ':');
- if (sep) {
- *sep++ = '\0';
- *file = filename;
- *line_nr = strtoul(sep, NULL, 0);
- ret = 1;
- }
+ *file = filename;
+
out:
pclose(fp);
return ret;
@@ -278,15 +307,8 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
if (!dso->has_srcline)
goto out;

- if (dso->symsrc_filename)
- dso_name = dso->symsrc_filename;
- else
- dso_name = dso->long_name;
-
- if (dso_name[0] == '[')
- goto out;
-
- if (!strncmp(dso_name, "/tmp/perf-", 10))
+ dso_name = dso__name(dso);
+ if (dso_name == NULL)
goto out;

if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines))
--
2.7.4

2017-03-03 06:37:44

by Jin Yao

Subject: [PATCH v4 2/5] perf report: Find the inline stack for a given address

It would be useful for perf to support a mode to query the
inline stack for a given callgraph address. This would simplify
finding the right code in code that does a lot of inlining.

srcline.c already contains the code that translates an address to
filename:line_nr. This patch extends that code so that it can also
return the inline stack.

It introduces the inline_list, which stores the result for each
inline function (filename:line_nr and funcname).
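
The collect-then-reverse idea can be sketched on its own as below. In
the patch, entries are appended in the order the inliner info is walked
and the list is reversed when callchain_param.order is not ORDER_CALLEE.
The sketch uses a plain singly linked list and hypothetical names
(struct inline_stack, callee_first) rather than the kernel list helpers,
so it only illustrates the ordering logic.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for inline_list / inline_node. */
struct inline_frame {
	const char *label;		/* not owned by the list */
	struct inline_frame *next;
};

struct inline_stack {
	struct inline_frame *head;
	struct inline_frame *tail;
};

/* Append one frame at the tail. Returns 0 on success, -1 on allocation failure. */
static int inline_stack__append(struct inline_stack *s, const char *label)
{
	struct inline_frame *f = calloc(1, sizeof(*f));

	if (f == NULL)
		return -1;

	f->label = label;
	if (s->tail)
		s->tail->next = f;
	else
		s->head = f;
	s->tail = f;
	return 0;
}

/* Reverse the list in place. */
static void inline_stack__reverse(struct inline_stack *s)
{
	struct inline_frame *prev = NULL, *cur = s->head, *next;

	s->tail = s->head;
	while (cur) {
		next = cur->next;
		cur->next = prev;
		prev = cur;
		cur = next;
	}
	s->head = prev;
}

/* Keep the collected (callee-first) order, or flip it for caller-first output. */
static void inline_stack__order(struct inline_stack *s, int callee_first)
{
	if (!callee_first)
		inline_stack__reverse(s);
}

/* Free all frames and reset the stack to empty. */
static void inline_stack__delete(struct inline_stack *s)
{
	struct inline_frame *f = s->head, *next;

	while (f) {
		next = f->next;
		free(f);
		f = next;
	}
	s->head = s->tail = NULL;
}
```

Appending D, C, B, A and asking for caller-first order yields
A, B, C, D.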

Signed-off-by: Jin Yao <[email protected]>
Tested-by: Milian Wolff <[email protected]>
---
tools/perf/util/srcline.c | 169 +++++++++++++++++++++++++++++++++++++++++--
tools/perf/util/symbol-elf.c | 5 ++
tools/perf/util/symbol.h | 2 +
tools/perf/util/util.h | 16 ++++
4 files changed, 187 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 2953c9f..f9d4b47 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -7,6 +7,7 @@
#include "util/dso.h"
#include "util/util.h"
#include "util/debug.h"
+#include "util/callchain.h"

#include "symbol.h"

@@ -30,6 +31,41 @@ static const char *dso__name(struct dso *dso)
return dso_name;
}

+static int inline_list__append(char *filename, char *funcname, int line_nr,
+ struct inline_node *node, struct dso *dso)
+{
+ struct inline_list *ilist;
+ char *demangled;
+
+ ilist = zalloc(sizeof(*ilist));
+ if (ilist == NULL)
+ return -1;
+
+ ilist->filename = filename;
+ ilist->line_nr = line_nr;
+
+ demangled = dso__demangle_sym(dso, 0, funcname);
+ if (demangled == NULL) {
+ ilist->funcname = funcname;
+ } else {
+ ilist->funcname = demangled;
+ if (funcname != NULL)
+ free(funcname);
+ }
+
+ list_add_tail(&ilist->list, &node->val);
+
+ return 0;
+}
+
+static void inline_list__reverse(struct inline_node *node)
+{
+ struct inline_list *ilist, *n;
+
+ list_for_each_entry_safe_reverse(ilist, n, &node->val, list)
+ list_move_tail(&ilist->list, &node->val);
+}
+
#ifdef HAVE_LIBBFD_SUPPORT

/*
@@ -171,7 +207,7 @@ static void addr2line_cleanup(struct a2l_data *a2l)

static int addr2line(const char *dso_name, u64 addr,
char **file, unsigned int *line, struct dso *dso,
- bool unwind_inlines)
+ bool unwind_inlines, struct inline_node *node)
{
int ret = 0;
struct a2l_data *a2l = dso->a2l;
@@ -196,8 +232,21 @@ static int addr2line(const char *dso_name, u64 addr,

while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
&a2l->funcname, &a2l->line) &&
- cnt++ < MAX_INLINE_NEST)
- ;
+ cnt++ < MAX_INLINE_NEST) {
+
+ if (node != NULL) {
+ if (inline_list__append(strdup(a2l->filename),
+ strdup(a2l->funcname),
+ a2l->line, node,
+ dso) != 0)
+ return 0;
+ }
+ }
+
+ if ((node != NULL) &&
+ (callchain_param.order != ORDER_CALLEE)) {
+ inline_list__reverse(node);
+ }
}

if (a2l->found && a2l->filename) {
@@ -223,6 +272,35 @@ void dso__free_a2l(struct dso *dso)
dso->a2l = NULL;
}

+static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
+ struct dso *dso)
+{
+ char *file = NULL;
+ unsigned int line = 0;
+ struct inline_node *node;
+
+ node = zalloc(sizeof(*node));
+ if (node == NULL) {
+ perror("not enough memory for the inline node");
+ return NULL;
+ }
+
+ INIT_LIST_HEAD(&node->val);
+ node->addr = addr;
+
+ if (!addr2line(dso_name, addr, &file, &line, dso, TRUE, node))
+ goto out_free_inline_node;
+
+ if (list_empty(&node->val))
+ goto out_free_inline_node;
+
+ return node;
+
+out_free_inline_node:
+ inline_node__delete(node);
+ return NULL;
+}
+
#else /* HAVE_LIBBFD_SUPPORT */

static int filename_split(char *filename, unsigned int *line_nr)
@@ -249,7 +327,8 @@ static int filename_split(char *filename, unsigned int *line_nr)
static int addr2line(const char *dso_name, u64 addr,
char **file, unsigned int *line_nr,
struct dso *dso __maybe_unused,
- bool unwind_inlines __maybe_unused)
+ bool unwind_inlines __maybe_unused,
+ struct inline_node *node __maybe_unused)
{
FILE *fp;
char cmd[PATH_MAX];
@@ -288,6 +367,57 @@ void dso__free_a2l(struct dso *dso __maybe_unused)
{
}

+static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
+ struct dso *dso __maybe_unused)
+{
+ FILE *fp;
+ char cmd[PATH_MAX];
+ struct inline_node *node;
+ char *filename = NULL;
+ size_t len;
+ unsigned int line_nr = 0;
+
+ scnprintf(cmd, sizeof(cmd), "addr2line -e %s -i %016"PRIx64,
+ dso_name, addr);
+
+ fp = popen(cmd, "r");
+ if (fp == NULL) {
+ pr_err("popen failed for %s\n", dso_name);
+ return NULL;
+ }
+
+ node = zalloc(sizeof(*node));
+ if (node == NULL) {
+ perror("not enough memory for the inline node");
+ goto out;
+ }
+
+ INIT_LIST_HEAD(&node->val);
+ node->addr = addr;
+
+ while (getline(&filename, &len, fp) != -1) {
+ if (filename_split(filename, &line_nr) != 1) {
+ free(filename);
+ goto out;
+ }
+
+ if (inline_list__append(filename, NULL, line_nr, node) != 0)
+ goto out;
+
+ filename = NULL;
+ }
+
+out:
+ pclose(fp);
+
+ if (list_empty(&node->val)) {
+ inline_node__delete(node);
+ return NULL;
+ }
+
+ return node;
+}
+
#endif /* HAVE_LIBBFD_SUPPORT */

/*
@@ -311,7 +441,7 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
if (dso_name == NULL)
goto out;

- if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines))
+ if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines, NULL))
goto out;

if (asprintf(&srcline, "%s:%u",
@@ -351,3 +481,32 @@ char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
{
return __get_srcline(dso, addr, sym, show_sym, false);
}
+
+struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr)
+{
+ const char *dso_name;
+
+ dso_name = dso__name(dso);
+ if (dso_name == NULL)
+ return NULL;
+
+ return addr2inlines(dso_name, addr, dso);
+}
+
+void inline_node__delete(struct inline_node *node)
+{
+ struct inline_list *ilist, *tmp;
+
+ list_for_each_entry_safe(ilist, tmp, &node->val, list) {
+ list_del_init(&ilist->list);
+ if (ilist->filename != NULL)
+ free(ilist->filename);
+
+ if (ilist->funcname != NULL)
+ free(ilist->funcname);
+
+ free(ilist);
+ }
+
+ free(node);
+}
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4e59dde..3a1dda3 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -390,6 +390,11 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss, struct map *
return 0;
}

+char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name)
+{
+ return demangle_sym(dso, kmodule, elf_name);
+}
+
/*
* Align offset to 4 bytes as needed for note name and descriptor data.
*/
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 6c358b7..8adf045 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -305,6 +305,8 @@ int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss,
struct map *map);

+char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name);
+
void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool kernel);
void symbols__insert(struct rb_root *symbols, struct symbol *sym);
void symbols__fixup_duplicate(struct rb_root *symbols);
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index b2cfa47..cc0700d 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -364,4 +364,20 @@ int is_printable_array(char *p, unsigned int len);
int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);

int unit_number__scnprintf(char *buf, size_t size, u64 n);
+
+struct inline_list {
+ char *filename;
+ char *funcname;
+ unsigned int line_nr;
+ struct list_head list;
+};
+
+struct inline_node {
+ u64 addr;
+ struct list_head val;
+};
+
+struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr);
+void inline_node__delete(struct inline_node *node);
+
#endif /* GIT_COMPAT_UTIL_H */
--
2.7.4

2017-03-03 07:10:55

by Jin Yao

Subject: [PATCH v4 4/5] perf report: Show inline stack in stdio mode

If the address belongs to an inlined function, the source information
back to the first non-inlined function will be printed.

For example:
perf report --stdio --inline

0.69% 0.00% inline ld-2.23.so [.] dl_main
|
---dl_main
|
--0.56%--_dl_relocate_object
|
---_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)
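
The layout of this output can be sketched with a small standalone
program fragment: a "|" connector line, the first entry after "---", and
the remaining entries aligned under it. print_margin() and
print_inline_stack() are hypothetical helpers for illustration only,
not the functions added by this patch.

```c
#include <stdio.h>

/* Print `margin` spaces; returns the number of characters written. */
static int print_margin(FILE *fp, int margin)
{
	return fprintf(fp, "%*s", margin, "");
}

/*
 * Print labels[0..n) as an inline stack starting at column `margin`.
 * Returns the total number of characters written.
 */
static int print_inline_stack(FILE *fp, const char *const *labels,
			      int n, int margin)
{
	int ret = 0, i;

	if (n <= 0)
		return 0;

	ret += print_margin(fp, margin);
	ret += fprintf(fp, "|\n");
	ret += print_margin(fp, margin);
	ret += fprintf(fp, "---");

	for (i = 0; i < n; i++) {
		/* Entries after the first are indented past the "---". */
		if (i > 0)
			ret += print_margin(fp, margin + 3);
		ret += fprintf(fp, "%s (inline)\n", labels[i]);
	}

	return ret;
}
```

Every fprintf() result is accumulated with "+=", so the return value is
the full character count of the printed block.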

Signed-off-by: Jin Yao <[email protected]>
Tested-by: Milian Wolff <[email protected]>
---
tools/perf/ui/stdio/hist.c | 76 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 75 insertions(+), 1 deletion(-)

diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 668f4ae..3356bfb 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -17,6 +17,57 @@ static size_t callchain__fprintf_left_margin(FILE *fp, int left_margin)
return ret;
}

+static size_t inline__fprintf(struct map *map, u64 ip,
+ int left_margin, FILE *fp)
+{
+ struct dso *dso;
+ struct inline_node *node;
+ struct inline_list *ilist;
+ int ret = 0, i = 0;
+
+ if (map == NULL)
+ return 0;
+
+ dso = map->dso;
+ if (dso == NULL)
+ return 0;
+
+ if (dso->kernel != DSO_TYPE_USER)
+ return 0;
+
+ node = dso__parse_addr_inlines(dso,
+ map__rip_2objdump(map, ip));
+ if (node == NULL)
+ return 0;
+
+ ret += callchain__fprintf_left_margin(fp, left_margin);
+ ret += fprintf(fp, "|\n");
+ ret += callchain__fprintf_left_margin(fp, left_margin);
+ ret += fprintf(fp, "---");
+ left_margin += 3;
+
+ list_for_each_entry(ilist, &node->val, list) {
+ if (ilist->filename != NULL) {
+ if (i++ > 0)
+ ret += callchain__fprintf_left_margin(fp,
+ left_margin);
+
+ if (ilist->funcname)
+ ret += fprintf(fp, "%s (inline)",
+ ilist->funcname);
+ else
+ ret += fprintf(fp, "%s:%d (inline)",
+ ilist->filename,
+ ilist->line_nr);
+
+ ret += fprintf(fp, "\n");
+ }
+ }
+
+ inline_node__delete(node);
+ return ret;
+}
+
static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
int left_margin)
{
@@ -78,6 +129,10 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
fputs(str, fp);
fputc('\n', fp);
free(alloc_str);
+
+ if (symbol_conf.inline_name)
+ ret += inline__fprintf(chain->ms.map, chain->ip,
+ left_margin + 11, fp);
return ret;
}

@@ -229,6 +284,7 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root,
if (!i++ && field_order == NULL &&
sort_order && !prefixcmp(sort_order, "sym"))
continue;
+
if (!printed) {
ret += callchain__fprintf_left_margin(fp, left_margin);
ret += fprintf(fp, "|\n");
@@ -251,6 +307,12 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root,

if (++entries_printed == callchain_param.print_limit)
break;
+
+ if (symbol_conf.inline_name)
+ ret += inline__fprintf(chain->ms.map,
+ chain->ip,
+ left_margin,
+ fp);
}
root = &cnode->rb_root;
}
@@ -529,6 +591,8 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
bool use_callchain)
{
int ret;
+ int callchain_ret = 0;
+ int inline_ret = 0;
struct perf_hpp hpp = {
.buf = bf,
.size = size,
@@ -547,7 +611,17 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
ret = fprintf(fp, "%s\n", bf);

if (use_callchain)
- ret += hist_entry_callchain__fprintf(he, total_period, 0, fp);
+ callchain_ret = hist_entry_callchain__fprintf(he, total_period,
+ 0, fp);
+
+ if ((callchain_ret == 0) &&
+ (symbol_conf.inline_name)) {
+ inline_ret = inline__fprintf(he->ms.map, he->ip, 0, fp);
+ ret += inline_ret;
+ if (inline_ret > 0)
+ ret += fprintf(fp, "\n");
+ } else
+ ret += callchain_ret;

return ret;
}
--
2.7.4

2017-03-03 08:58:23

by Jin Yao

Subject: [PATCH v4 3/5] perf report: Create new inline option

Looking up the inline stack for callgraph addresses takes some time,
so provide a new option "--inline" that lets the user decide whether
to enable this feature.

--inline:
If a callgraph address belongs to an inlined function, the inline stack
will be printed. Each entry is the inline function name.

Signed-off-by: Jin Yao <[email protected]>
Tested-by: Milian Wolff <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 4 ++++
tools/perf/builtin-report.c | 2 ++
tools/perf/util/symbol.h | 3 ++-
3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index 33f9190..2bfd50b 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -425,6 +425,10 @@ include::itrace.txt[]
--hierarchy::
Enable hierarchical output.

+--inline::
+ If a callgraph address belongs to an inlined function, the inline stack
+ will be printed. Each entry is the inline function name.
+
include::callchain-overhead-calculation.txt[]

SEE ALSO
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 0a88670..900c020 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -845,6 +845,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
stdio__config_color, "always"),
OPT_STRING(0, "time", &report.time_str, "str",
"Time span of interest (start,stop)"),
+ OPT_BOOLEAN(0, "inline", &symbol_conf.inline_name,
+ "Show inline function"),
OPT_END()
};
struct perf_data_file file = {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 8adf045..7b4a399 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -118,7 +118,8 @@ struct symbol_conf {
show_ref_callgraph,
hide_unresolved,
raw_trace,
- report_hierarchy;
+ report_hierarchy,
+ inline_name;
const char *vmlinux_name,
*kallsyms_name,
*source_prefix,
--
2.7.4

2017-03-03 12:08:48

by Jin Yao

Subject: [PATCH v4 5/5] perf report: Show inline stack in browser mode

If the address belongs to an inlined function, the source information
back to the first non-inlined function will be printed.

For example:
perf report --inline

- 0.69% 0.00% inline ld-2.23.so [.] dl_main
- dl_main
0.56% _dl_relocate_object
_dl_relocate_object: (inline)
elf_dynamic_do_Rela: (inline)

Signed-off-by: Jin Yao <[email protected]>
Tested-by: Milian Wolff <[email protected]>
---
tools/perf/ui/browsers/hists.c | 168 +++++++++++++++++++++++++++++++++++++++--
tools/perf/util/hist.c | 5 ++
tools/perf/util/sort.h | 1 +
3 files changed, 166 insertions(+), 8 deletions(-)

diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index fc4fb66..51a2a68 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -144,9 +144,60 @@ static void callchain_list__set_folding(struct callchain_list *cl, bool unfold)
cl->unfolded = unfold ? cl->has_children : false;
}

+static struct inline_node *inline_node__create(struct map *map, u64 ip)
+{
+ struct dso *dso;
+ struct inline_node *node;
+
+ if (map == NULL)
+ return NULL;
+
+ dso = map->dso;
+ if (dso == NULL)
+ return NULL;
+
+ if (dso->kernel != DSO_TYPE_USER)
+ return NULL;
+
+ node = dso__parse_addr_inlines(dso,
+ map__rip_2objdump(map, ip));
+
+ return node;
+}
+
+static int inline__count_rows(struct inline_node *node)
+{
+ struct inline_list *ilist;
+ int i = 0;
+
+ if (node == NULL)
+ return 0;
+
+ list_for_each_entry(ilist, &node->val, list) {
+ if ((ilist->filename != NULL) || (ilist->funcname != NULL))
+ i++;
+ }
+
+ return i;
+}
+
+static int callchain_list__inline_rows(struct callchain_list *chain)
+{
+ struct inline_node *node;
+ int rows;
+
+ node = inline_node__create(chain->ms.map, chain->ip);
+ if (node == NULL)
+ return 0;
+
+ rows = inline__count_rows(node);
+ inline_node__delete(node);
+ return rows;
+}
+
static int callchain_node__count_rows_rb_tree(struct callchain_node *node)
{
- int n = 0;
+ int n = 0, inline_rows;
struct rb_node *nd;

for (nd = rb_first(&node->rb_root); nd; nd = rb_next(nd)) {
@@ -156,6 +207,13 @@ static int callchain_node__count_rows_rb_tree(struct callchain_node *node)

list_for_each_entry(chain, &child->val, list) {
++n;
+
+ if (symbol_conf.inline_name) {
+ inline_rows =
+ callchain_list__inline_rows(chain);
+ n += inline_rows;
+ }
+
/* We need this because we may not have children */
folded_sign = callchain_list__folded(chain);
if (folded_sign == '+')
@@ -207,7 +265,7 @@ static int callchain_node__count_rows(struct callchain_node *node)
{
struct callchain_list *chain;
bool unfolded = false;
- int n = 0;
+ int n = 0, inline_rows;

if (callchain_param.mode == CHAIN_FLAT)
return callchain_node__count_flat_rows(node);
@@ -216,6 +274,11 @@ static int callchain_node__count_rows(struct callchain_node *node)

list_for_each_entry(chain, &node->val, list) {
++n;
+ if (symbol_conf.inline_name) {
+ inline_rows = callchain_list__inline_rows(chain);
+ n += inline_rows;
+ }
+
unfolded = chain->unfolded;
}

@@ -362,6 +425,19 @@ static void hist_entry__init_have_children(struct hist_entry *he)
he->init_have_children = true;
}

+static void hist_entry_init_inline_node(struct hist_entry *he)
+{
+ if (he->inline_node)
+ return;
+
+ he->inline_node = inline_node__create(he->ms.map, he->ip);
+
+ if (he->inline_node == NULL)
+ return;
+
+ he->has_children = true;
+}
+
static bool hist_browser__toggle_fold(struct hist_browser *browser)
{
struct hist_entry *he = browser->he_selection;
@@ -393,7 +469,12 @@ static bool hist_browser__toggle_fold(struct hist_browser *browser)

if (he->unfolded) {
if (he->leaf)
- he->nr_rows = callchain__count_rows(&he->sorted_chain);
+ if (he->inline_node)
+ he->nr_rows = inline__count_rows(
+ he->inline_node);
+ else
+ he->nr_rows = callchain__count_rows(
+ &he->sorted_chain);
else
he->nr_rows = hierarchy_count_rows(browser, he, false);

@@ -753,6 +834,58 @@ static bool hist_browser__check_dump_full(struct hist_browser *browser __maybe_u

#define LEVEL_OFFSET_STEP 3

+static int hist_browser__show_inline(struct hist_browser *browser,
+ struct inline_node *node,
+ unsigned short row,
+ int offset)
+{
+ struct inline_list *ilist;
+ char buf[1024];
+ int color, width, first_row;
+
+ first_row = row;
+ width = browser->b.width - (LEVEL_OFFSET_STEP + 2);
+ list_for_each_entry(ilist, &node->val, list) {
+ if ((ilist->filename != NULL) || (ilist->funcname != NULL)) {
+ color = HE_COLORSET_NORMAL;
+ if (ui_browser__is_current_entry(&browser->b, row))
+ color = HE_COLORSET_SELECTED;
+
+ if (ilist->funcname != NULL)
+ scnprintf(buf, sizeof(buf), "%s: (inline)",
+ ilist->funcname);
+ else
+ scnprintf(buf, sizeof(buf), "%s:%d (inline)",
+ ilist->filename, ilist->line_nr);
+
+ ui_browser__set_color(&browser->b, color);
+ hist_browser__gotorc(browser, row, 0);
+ ui_browser__write_nstring(&browser->b, " ",
+ LEVEL_OFFSET_STEP + offset);
+ ui_browser__write_nstring(&browser->b, buf, width);
+ row++;
+ }
+ }
+
+ return row - first_row;
+}
+
+static size_t show_inline_list(struct hist_browser *browser, struct map *map,
+ u64 ip, int row, int offset)
+{
+ struct inline_node *node;
+ int ret;
+
+ node = inline_node__create(map, ip);
+ if (node == NULL)
+ return 0;
+
+ ret = hist_browser__show_inline(browser, node, row, offset);
+
+ inline_node__delete(node);
+ return ret;
+}
+
static int hist_browser__show_callchain_list(struct hist_browser *browser,
struct callchain_node *node,
struct callchain_list *chain,
@@ -764,6 +897,7 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
char bf[1024], *alloc_str;
char buf[64], *alloc_str2;
const char *str;
+ int inline_rows = 0, ret = 1;

if (arg->row_offset != 0) {
arg->row_offset--;
@@ -801,10 +935,15 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
}

print(browser, chain, str, offset, row, arg);
-
free(alloc_str);
free(alloc_str2);
- return 1;
+
+ if (symbol_conf.inline_name) {
+ inline_rows = show_inline_list(browser, chain->ms.map,
+ chain->ip, row + 1, offset);
+ }
+
+ return ret + inline_rows;
}

static bool check_percent_display(struct rb_node *node, u64 parent_total)
@@ -1228,6 +1367,12 @@ static int hist_browser__show_entry(struct hist_browser *browser,
folded_sign = hist_entry__folded(entry);
}

+ if (symbol_conf.inline_name &&
+ (!entry->has_children)) {
+ hist_entry_init_inline_node(entry);
+ folded_sign = hist_entry__folded(entry);
+ }
+
if (row_offset == 0) {
struct hpp_arg arg = {
.b = &browser->b,
@@ -1259,7 +1404,8 @@ static int hist_browser__show_entry(struct hist_browser *browser,
}

if (first) {
- if (symbol_conf.use_callchain) {
+ if (symbol_conf.use_callchain ||
+ symbol_conf.inline_name) {
ui_browser__printf(&browser->b, "%c ", folded_sign);
width -= 2;
}
@@ -1301,8 +1447,14 @@ static int hist_browser__show_entry(struct hist_browser *browser,
.is_current_entry = current_entry,
};

- printed += hist_browser__show_callchain(browser, entry, 1, row,
- hist_browser__show_callchain_entry, &arg,
+ if (entry->inline_node)
+ printed += hist_browser__show_inline(browser,
+ entry->inline_node, row, 0);
+ else
+ printed += hist_browser__show_callchain(browser,
+ entry, 1, row,
+ hist_browser__show_callchain_entry,
+ &arg,
hist_browser__check_output_full);
}

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index eaf72a9..ffa42fc 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1129,6 +1129,11 @@ void hist_entry__delete(struct hist_entry *he)
zfree(&he->mem_info);
}

+ if (he->inline_node) {
+ inline_node__delete(he->inline_node);
+ he->inline_node = NULL;
+ }
+
zfree(&he->stat_acc);
free_srcline(he->srcline);
if (he->srcfile && he->srcfile[0])
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index acb2c57..18b949e 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -122,6 +122,7 @@ struct hist_entry {
};
char *srcline;
char *srcfile;
+ struct inline_node *inline_node;
struct symbol *parent;
struct branch_info *branch_info;
struct hists *hists;
--
2.7.4

2017-03-14 01:16:09

by Jin Yao

Subject: Re: [PATCH v4 0/5] perf report: Show inline stack

Hi,

Any comments on this v4 patch series?

Thanks

Jin Yao



2017-03-15 10:17:20

by Milian Wolff

Subject: Re: [PATCH v4 0/5] perf report: Show inline stack

On Friday, March 3, 2017 11:43:00 AM CET Jin Yao wrote:
> v4: Remove the options "--inline-line" and "--inline-name". Just use
> a new option "--inline" to print the inline function information.
> The policy is if the inline function name can be resolved then
> print the name in priority. If the name can't be resolved, then
> print the source line number.

This is still wrong from a usability POV. I may want to see the file/line for
entries that have a name. And actually, there are afaik no situations where you
could have a file/line but not a symbol name.

Again, why don't you align this with the other non-inlined frames, and honor
the grouping setting? Check whether we group by address (then display file/
line), or by function (then display the function name).
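
A rough sketch of that suggestion, with hypothetical names throughout
(enum group_key and inline_frame__label() are not perf APIs, and the
fallback when the preferred field is missing is an assumption added for
the sketch):

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical grouping setting; perf's real configuration is not modeled here. */
enum group_key {
	GROUP_BY_FUNCTION,
	GROUP_BY_ADDRESS,
};

/*
 * Choose the label for an inlined frame from the grouping setting:
 * file:line when grouping by address, function name when grouping by
 * function. Falls back to whichever field is available.
 */
static int inline_frame__label(enum group_key key, const char *funcname,
			       const char *filename, unsigned int line_nr,
			       char *buf, size_t size)
{
	if (key == GROUP_BY_ADDRESS && filename != NULL)
		return snprintf(buf, size, "%s:%u (inline)", filename, line_nr);

	if (funcname != NULL)
		return snprintf(buf, size, "%s (inline)", funcname);

	if (filename != NULL)
		return snprintf(buf, size, "%s:%u (inline)", filename, line_nr);

	return snprintf(buf, size, "?? (inline)");
}
```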

Bye
--
Milian Wolff | [email protected] | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts

2017-03-16 13:47:19

by Jin Yao

Subject: Re: [PATCH v4 0/5] perf report: Show inline stack

Hi Wolff,

Thanks so much for your review comments!

I just sent out the v5 patch series. It has been updated
according to your comments.

Thanks

Jin Yao
