v5: Update according to Milian Wolff's comments. Inline entries are now
grouped by address (file/line is displayed) or by function (the function
name is displayed).
For example:
1. Show inlined function name
perf report --stdio -g function --inline
0.69% 0.00% inline ld-2.23.so [.] dl_main
|
---dl_main
|
--0.56%--_dl_relocate_object
_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)
2. Show the file/line information
perf report --stdio -g address --inline
0.69% 0.00% inline ld-2.23.so [.] _dl_start_user
|
---_dl_start_user .:0
_dl_start rtld.c:307
/build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
_dl_sysdep_start dl-sysdep.c:250
|
--0.56%--dl_main rtld.c:2076
2 patches are updated according to this change.
perf report: Show inline stack in browser mode
perf report: Show inline stack in stdio mode
3 patches are not changed.
perf report: Find the inline stack for a given address
perf report: Refactor common code in srcline.c
perf report: Create new inline option
v4: Remove the options "--inline-line" and "--inline-name". Just use
a new option "--inline" to print the inline function information.
The policy is: if the inline function name can be resolved, print
the name. If the name can't be resolved, print the source line
number instead.
For example:
perf report --stdio --inline
0.69% 0.00% inline ld-2.23.so [.] dl_main
|
---dl_main
|
--0.56%--_dl_relocate_object
|
---_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)
The following 3 patches are updated according to this change.
perf report: Show inline stack in browser mode
perf report: Show inline stack in stdio mode
perf report: Create new inline option
The following patches are not changed.
perf report: Find the inline stack for a given address
perf report: Refactor common code in srcline.c
v3: Iterate on RIPs of all callchain entries to check if the RIP is in
inline functions.
Reverse the order of the inliner printout if necessary.
Provide new options "--inline-line" / "--inline-name" to print
inline function name or print inline function source line.
v2: Thanks so much for Arnaldo's comments!
The modifications are:
1. Divide v1 patch "perf report: Find the inline stack for a
given address" into 2 patches:
a. perf report: Refactor common code in srcline.c
b. perf report: Find the inline stack for a given address
Some function names are changed:
dso_name_get -> dso__name
ilist_apend -> inline_list__append
get_inline_node -> dso__parse_addr_inlines
free_inline_node -> inline_node__delete
2. Since the function names are changed, update the following patches
accordingly.
a. perf report: Show inline stack in stdio mode
b. perf report: Show inline stack in browser mode
3. Rebase onto the latest perf/core branch. This patch is affected:
a. perf report: Create a new option "--inline"
v1: Initial post
It would be useful for perf to support a mode to query the
inline stack for callgraph addresses. This would simplify
finding the right code in code that does a lot of inlining.
For example, take the following C code:
static inline void f3(void)
{
int i;
for (i = 0; i < 1000;) {
if(i%2)
i++;
else
i++;
}
printf("hello f3\n"); /* D */
}
/* < CALLCHAIN: f2 <- f1 > */
static inline void f2(void)
{
int i;
for (i = 0; i < 100; i++) {
f3(); /* C */
}
}
/* < CALLCHAIN: f1 <- main > */
static inline void f1(void)
{
int i;
for (i = 0; i < 100; i++) {
f2(); /* B */
}
}
/* < CALLCHAIN: main <- TOP > */
int main()
{
struct timeval tv;
time_t start, end;
gettimeofday(&tv, NULL);
start = end = tv.tv_sec;
while((end - start) < 5) {
f1(); /* A */
gettimeofday(&tv, NULL);
end = tv.tv_sec;
}
return 0;
}
The printed inline stack is:
0.05% test2 test2 [.] main
|
---/home/perf-dev/lck-2867/test/test2.c:27 (inline)
/home/perf-dev/lck-2867/test/test2.c:35 (inline)
/home/perf-dev/lck-2867/test/test2.c:45 (inline)
/home/perf-dev/lck-2867/test/test2.c:61 (inline)
I tagged the relevant source lines with A/B/C/D in the C code above,
so the inline stack is equivalent to:
0.05% test2 test2 [.] main
|
---D
C
B
A
Jin Yao (5):
perf report: Refactor common code in srcline.c
perf report: Find the inline stack for a given address
perf report: Create new inline option
perf report: Show inline stack for stdio mode
perf report: Show inline stack for browser mode
tools/perf/Documentation/perf-report.txt | 4 +
tools/perf/builtin-report.c | 2 +
tools/perf/ui/browsers/hists.c | 171 ++++++++++++++++++++--
tools/perf/ui/stdio/hist.c | 79 ++++++++++-
tools/perf/util/hist.c | 5 +
tools/perf/util/sort.h | 1 +
tools/perf/util/srcline.c | 237 +++++++++++++++++++++++++++----
tools/perf/util/symbol-elf.c | 5 +
tools/perf/util/symbol.h | 5 +-
tools/perf/util/util.h | 16 +++
10 files changed, 487 insertions(+), 38 deletions(-)
--
2.7.4
If the address belongs to an inlined function, the source information
for each frame back to the first non-inlined function is printed.
For example:
1. Show inlined function name
perf report --stdio -g function --inline
0.69% 0.00% inline ld-2.23.so [.] dl_main
|
---dl_main
|
--0.56%--_dl_relocate_object
_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)
2. Show the file/line information
perf report --stdio -g address --inline
0.69% 0.00% inline ld-2.23.so [.] _dl_start_user
|
---_dl_start_user .:0
_dl_start rtld.c:307
/build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
_dl_sysdep_start dl-sysdep.c:250
|
--0.56%--dl_main rtld.c:2076
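How the callchain key selects what is printed for each inline entry can
be sketched in isolation. This is a minimal illustration, not the perf
code: enum chain_key stands in for callchain_param.key, and
format_inline_entry() is a hypothetical helper that only mirrors the
branch structure of inline__fprintf() in the diff below.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for callchain_param.key. */
enum chain_key { KEY_FUNCTION, KEY_ADDRESS };

/*
 * Format one inline entry into buf.
 *
 * With KEY_ADDRESS the entry is shown as "file:line (inline)"; with any
 * other key it is shown as "funcname (inline)". If the field selected by
 * the key is missing, buf is set to the empty string and 0 is returned.
 * Otherwise the snprintf() result is returned.
 */
int format_inline_entry(char *buf, size_t size, enum chain_key key,
			const char *filename, unsigned int line_nr,
			const char *funcname)
{
	assert(buf != NULL && size > 0);
	buf[0] = '\0';

	if (key == KEY_ADDRESS) {
		if (filename == NULL)
			return 0;
		return snprintf(buf, size, "%s:%u (inline)",
				filename, line_nr);
	}

	if (funcname == NULL)
		return 0;
	return snprintf(buf, size, "%s (inline)", funcname);
}
```

Note that an entry with a file name but no function name produces no
text when grouping by function, which matches the else-if in the patch.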
Signed-off-by: Jin Yao <[email protected]>
Tested-by: Milian Wolff <[email protected]>
---
tools/perf/ui/stdio/hist.c | 79 +++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 78 insertions(+), 1 deletion(-)
diff --git a/tools/perf/ui/stdio/hist.c b/tools/perf/ui/stdio/hist.c
index 668f4ae..183470f 100644
--- a/tools/perf/ui/stdio/hist.c
+++ b/tools/perf/ui/stdio/hist.c
@@ -17,6 +17,59 @@ static size_t callchain__fprintf_left_margin(FILE *fp, int left_margin)
return ret;
}
+static size_t inline__fprintf(struct map *map, u64 ip, int left_margin,
+ int depth, int depth_mask, FILE *fp)
+{
+ struct dso *dso;
+ struct inline_node *node;
+ struct inline_list *ilist;
+ int ret = 0, i;
+
+ if (map == NULL)
+ return 0;
+
+ dso = map->dso;
+ if (dso == NULL)
+ return 0;
+
+ if (dso->kernel != DSO_TYPE_USER)
+ return 0;
+
+ node = dso__parse_addr_inlines(dso,
+ map__rip_2objdump(map, ip));
+ if (node == NULL)
+ return 0;
+
+
+ list_for_each_entry(ilist, &node->val, list) {
+ if ((ilist->filename != NULL) || (ilist->funcname != NULL)) {
+ ret += callchain__fprintf_left_margin(fp, left_margin);
+
+ for (i = 0; i < depth; i++) {
+ if (depth_mask & (1 << i))
+ ret += fprintf(fp, "|");
+ else
+ ret += fprintf(fp, " ");
+ ret += fprintf(fp, " ");
+ }
+
+ if (callchain_param.key == CCKEY_ADDRESS) {
+ if (ilist->filename != NULL)
+ ret += fprintf(fp, "%s:%d (inline)",
+ ilist->filename,
+ ilist->line_nr);
+ } else if (ilist->funcname != NULL)
+ ret += fprintf(fp, "%s (inline)",
+ ilist->funcname);
+
+ ret += fprintf(fp, "\n");
+ }
+ }
+
+ inline_node__delete(node);
+ return ret;
+}
+
static size_t ipchain__fprintf_graph_line(FILE *fp, int depth, int depth_mask,
int left_margin)
{
@@ -78,6 +131,10 @@ static size_t ipchain__fprintf_graph(FILE *fp, struct callchain_node *node,
fputs(str, fp);
fputc('\n', fp);
free(alloc_str);
+
+ if (symbol_conf.inline_name)
+ ret += inline__fprintf(chain->ms.map, chain->ip,
+ left_margin, depth, depth_mask, fp);
return ret;
}
@@ -229,6 +286,7 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root,
if (!i++ && field_order == NULL &&
sort_order && !prefixcmp(sort_order, "sym"))
continue;
+
if (!printed) {
ret += callchain__fprintf_left_margin(fp, left_margin);
ret += fprintf(fp, "|\n");
@@ -251,6 +309,13 @@ static size_t callchain__fprintf_graph(FILE *fp, struct rb_root *root,
if (++entries_printed == callchain_param.print_limit)
break;
+
+ if (symbol_conf.inline_name)
+ ret += inline__fprintf(chain->ms.map,
+ chain->ip,
+ left_margin,
+ 0, 0,
+ fp);
}
root = &cnode->rb_root;
}
@@ -529,6 +594,8 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
bool use_callchain)
{
int ret;
+ int callchain_ret = 0;
+ int inline_ret = 0;
struct perf_hpp hpp = {
.buf = bf,
.size = size,
@@ -547,7 +614,17 @@ static int hist_entry__fprintf(struct hist_entry *he, size_t size,
ret = fprintf(fp, "%s\n", bf);
if (use_callchain)
- ret += hist_entry_callchain__fprintf(he, total_period, 0, fp);
+ callchain_ret = hist_entry_callchain__fprintf(he, total_period,
+ 0, fp);
+
+ if ((callchain_ret == 0) &&
+ (symbol_conf.inline_name)) {
+ inline_ret = inline__fprintf(he->ms.map, he->ip, 0, 0, 0, fp);
+ ret += inline_ret;
+ if (inline_ret > 0)
+ ret += fprintf(fp, "\n");
+ } else
+ ret += callchain_ret;
return ret;
}
--
2.7.4
Factor dso__name() and filename_split() out of existing code, because
both will be used in several places in the next patch.
filename_split() also fixes a potential memory leak in the existing
code. In the existing addr2line():
sep = strchr(filename, ':');
if (sep) {
*sep++ = '\0';
*file = filename;
*line_nr = strtoul(sep, NULL, 0);
ret = 1;
}
out:
pclose(fp);
return ret;
If sep is NULL, filename is neither freed nor returned via file, so it leaks.
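The ownership rule the refactoring aims for can be sketched as a
standalone example. This is an illustration under assumptions, not the
perf code: split_file_line() follows filename_split() from the diff
below, while parse_a2l_line() is a hypothetical caller that stands in
for addr2line() and works on a string instead of a pipe.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split "file:line\n" in place. Returns 1 on success, 0 if the text
 * holds no usable file:line pair (including addr2line's "??:0").
 */
int split_file_line(char *text, unsigned int *line_nr)
{
	char *sep;

	sep = strchr(text, '\n');
	if (sep)
		*sep = '\0';

	if (!strcmp(text, "??:0"))
		return 0;

	sep = strchr(text, ':');
	if (sep) {
		*sep++ = '\0';
		*line_nr = (unsigned int)strtoul(sep, NULL, 0);
		return 1;
	}

	return 0;
}

/*
 * Hypothetical caller: parse one line of addr2line-style output.
 * On success, returns 1 and hands a heap-allocated file name to the
 * caller through *file. On failure, returns 0, frees the working copy
 * here, and leaves *file untouched.
 */
int parse_a2l_line(const char *output, char **file, unsigned int *line_nr)
{
	size_t len;
	char *copy;

	assert(output != NULL);

	len = strlen(output) + 1;
	copy = malloc(len);
	if (copy == NULL)
		return 0;
	memcpy(copy, output, len);

	if (split_file_line(copy, line_nr) != 1) {
		free(copy);	/* every failure path releases the buffer */
		return 0;
	}

	*file = copy;
	return 1;
}
```

Because the split helper reports failure for both the "??:0" case and
the missing-colon case, the caller needs only one cleanup branch.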
Signed-off-by: Jin Yao <[email protected]>
Tested-by: Milian Wolff <[email protected]>
---
tools/perf/util/srcline.c | 68 +++++++++++++++++++++++++++++++----------------
1 file changed, 45 insertions(+), 23 deletions(-)
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index b4db3f4..2953c9f 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -12,6 +12,24 @@
bool srcline_full_filename;
+static const char *dso__name(struct dso *dso)
+{
+ const char *dso_name;
+
+ if (dso->symsrc_filename)
+ dso_name = dso->symsrc_filename;
+ else
+ dso_name = dso->long_name;
+
+ if (dso_name[0] == '[')
+ return NULL;
+
+ if (!strncmp(dso_name, "/tmp/perf-", 10))
+ return NULL;
+
+ return dso_name;
+}
+
#ifdef HAVE_LIBBFD_SUPPORT
/*
@@ -207,6 +225,27 @@ void dso__free_a2l(struct dso *dso)
#else /* HAVE_LIBBFD_SUPPORT */
+static int filename_split(char *filename, unsigned int *line_nr)
+{
+ char *sep;
+
+ sep = strchr(filename, '\n');
+ if (sep)
+ *sep = '\0';
+
+ if (!strcmp(filename, "??:0"))
+ return 0;
+
+ sep = strchr(filename, ':');
+ if (sep) {
+ *sep++ = '\0';
+ *line_nr = strtoul(sep, NULL, 0);
+ return 1;
+ }
+
+ return 0;
+}
+
static int addr2line(const char *dso_name, u64 addr,
char **file, unsigned int *line_nr,
struct dso *dso __maybe_unused,
@@ -216,7 +255,6 @@ static int addr2line(const char *dso_name, u64 addr,
char cmd[PATH_MAX];
char *filename = NULL;
size_t len;
- char *sep;
int ret = 0;
scnprintf(cmd, sizeof(cmd), "addr2line -e %s %016"PRIx64,
@@ -233,23 +271,14 @@ static int addr2line(const char *dso_name, u64 addr,
goto out;
}
- sep = strchr(filename, '\n');
- if (sep)
- *sep = '\0';
-
- if (!strcmp(filename, "??:0")) {
- pr_debug("no debugging info in %s\n", dso_name);
+ ret = filename_split(filename, line_nr);
+ if (ret != 1) {
free(filename);
goto out;
}
- sep = strchr(filename, ':');
- if (sep) {
- *sep++ = '\0';
- *file = filename;
- *line_nr = strtoul(sep, NULL, 0);
- ret = 1;
- }
+ *file = filename;
+
out:
pclose(fp);
return ret;
@@ -278,15 +307,8 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
if (!dso->has_srcline)
goto out;
- if (dso->symsrc_filename)
- dso_name = dso->symsrc_filename;
- else
- dso_name = dso->long_name;
-
- if (dso_name[0] == '[')
- goto out;
-
- if (!strncmp(dso_name, "/tmp/perf-", 10))
+ dso_name = dso__name(dso);
+ if (dso_name == NULL)
goto out;
if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines))
--
2.7.4
If the address belongs to an inlined function, the source information
for each frame back to the first non-inlined function is printed.
For example:
1. Show inlined function name
perf report -g function --inline
- 0.69% 0.00% inline ld-2.23.so [.] dl_main
- dl_main
0.56% _dl_relocate_object
_dl_relocate_object (inline)
elf_dynamic_do_Rela (inline)
2. Show the file/line information
perf report -g address --inline
- 0.69% 0.00% inline ld-2.23.so [.] _dl_start
_dl_start rtld.c:307
/build/glibc-GKVZIf/glibc-2.23/elf/rtld.c:413 (inline)
+ _dl_sysdep_start dl-sysdep.c:250
Signed-off-by: Jin Yao <[email protected]>
Tested-by: Milian Wolff <[email protected]>
---
tools/perf/ui/browsers/hists.c | 171 +++++++++++++++++++++++++++++++++++++++--
tools/perf/util/hist.c | 5 ++
tools/perf/util/sort.h | 1 +
3 files changed, 169 insertions(+), 8 deletions(-)
diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
index 2dc82be..757222b 100644
--- a/tools/perf/ui/browsers/hists.c
+++ b/tools/perf/ui/browsers/hists.c
@@ -144,9 +144,60 @@ static void callchain_list__set_folding(struct callchain_list *cl, bool unfold)
cl->unfolded = unfold ? cl->has_children : false;
}
+static struct inline_node *inline_node__create(struct map *map, u64 ip)
+{
+ struct dso *dso;
+ struct inline_node *node;
+
+ if (map == NULL)
+ return NULL;
+
+ dso = map->dso;
+ if (dso == NULL)
+ return NULL;
+
+ if (dso->kernel != DSO_TYPE_USER)
+ return NULL;
+
+ node = dso__parse_addr_inlines(dso,
+ map__rip_2objdump(map, ip));
+
+ return node;
+}
+
+static int inline__count_rows(struct inline_node *node)
+{
+ struct inline_list *ilist;
+ int i = 0;
+
+ if (node == NULL)
+ return 0;
+
+ list_for_each_entry(ilist, &node->val, list) {
+ if ((ilist->filename != NULL) || (ilist->funcname != NULL))
+ i++;
+ }
+
+ return i;
+}
+
+static int callchain_list__inline_rows(struct callchain_list *chain)
+{
+ struct inline_node *node;
+ int rows;
+
+ node = inline_node__create(chain->ms.map, chain->ip);
+ if (node == NULL)
+ return 0;
+
+ rows = inline__count_rows(node);
+ inline_node__delete(node);
+ return rows;
+}
+
static int callchain_node__count_rows_rb_tree(struct callchain_node *node)
{
- int n = 0;
+ int n = 0, inline_rows;
struct rb_node *nd;
for (nd = rb_first(&node->rb_root); nd; nd = rb_next(nd)) {
@@ -156,6 +207,13 @@ static int callchain_node__count_rows_rb_tree(struct callchain_node *node)
list_for_each_entry(chain, &child->val, list) {
++n;
+
+ if (symbol_conf.inline_name) {
+ inline_rows =
+ callchain_list__inline_rows(chain);
+ n += inline_rows;
+ }
+
/* We need this because we may not have children */
folded_sign = callchain_list__folded(chain);
if (folded_sign == '+')
@@ -207,7 +265,7 @@ static int callchain_node__count_rows(struct callchain_node *node)
{
struct callchain_list *chain;
bool unfolded = false;
- int n = 0;
+ int n = 0, inline_rows;
if (callchain_param.mode == CHAIN_FLAT)
return callchain_node__count_flat_rows(node);
@@ -216,6 +274,11 @@ static int callchain_node__count_rows(struct callchain_node *node)
list_for_each_entry(chain, &node->val, list) {
++n;
+ if (symbol_conf.inline_name) {
+ inline_rows = callchain_list__inline_rows(chain);
+ n += inline_rows;
+ }
+
unfolded = chain->unfolded;
}
@@ -362,6 +425,19 @@ static void hist_entry__init_have_children(struct hist_entry *he)
he->init_have_children = true;
}
+static void hist_entry_init_inline_node(struct hist_entry *he)
+{
+ if (he->inline_node)
+ return;
+
+ he->inline_node = inline_node__create(he->ms.map, he->ip);
+
+ if (he->inline_node == NULL)
+ return;
+
+ he->has_children = true;
+}
+
static bool hist_browser__toggle_fold(struct hist_browser *browser)
{
struct hist_entry *he = browser->he_selection;
@@ -393,7 +469,12 @@ static bool hist_browser__toggle_fold(struct hist_browser *browser)
if (he->unfolded) {
if (he->leaf)
- he->nr_rows = callchain__count_rows(&he->sorted_chain);
+ if (he->inline_node)
+ he->nr_rows = inline__count_rows(
+ he->inline_node);
+ else
+ he->nr_rows = callchain__count_rows(
+ &he->sorted_chain);
else
he->nr_rows = hierarchy_count_rows(browser, he, false);
@@ -753,6 +834,61 @@ static bool hist_browser__check_dump_full(struct hist_browser *browser __maybe_u
#define LEVEL_OFFSET_STEP 3
+static int hist_browser__show_inline(struct hist_browser *browser,
+ struct inline_node *node,
+ unsigned short row,
+ int offset)
+{
+ struct inline_list *ilist;
+ char buf[1024];
+ int color, width, first_row;
+
+ first_row = row;
+ width = browser->b.width - (LEVEL_OFFSET_STEP + 2);
+ list_for_each_entry(ilist, &node->val, list) {
+ if ((ilist->filename != NULL) || (ilist->funcname != NULL)) {
+ color = HE_COLORSET_NORMAL;
+ if (ui_browser__is_current_entry(&browser->b, row))
+ color = HE_COLORSET_SELECTED;
+
+ if (callchain_param.key == CCKEY_ADDRESS) {
+ if (ilist->filename != NULL)
+ scnprintf(buf, sizeof(buf),
+ "%s:%d (inline)",
+ ilist->filename,
+ ilist->line_nr);
+ } else if (ilist->funcname != NULL)
+ scnprintf(buf, sizeof(buf), "%s (inline)",
+ ilist->funcname);
+
+ ui_browser__set_color(&browser->b, color);
+ hist_browser__gotorc(browser, row, 0);
+ ui_browser__write_nstring(&browser->b, " ",
+ LEVEL_OFFSET_STEP + offset);
+ ui_browser__write_nstring(&browser->b, buf, width);
+ row++;
+ }
+ }
+
+ return row - first_row;
+}
+
+static size_t show_inline_list(struct hist_browser *browser, struct map *map,
+ u64 ip, int row, int offset)
+{
+ struct inline_node *node;
+ int ret;
+
+ node = inline_node__create(map, ip);
+ if (node == NULL)
+ return 0;
+
+ ret = hist_browser__show_inline(browser, node, row, offset);
+
+ inline_node__delete(node);
+ return ret;
+}
+
static int hist_browser__show_callchain_list(struct hist_browser *browser,
struct callchain_node *node,
struct callchain_list *chain,
@@ -764,6 +900,7 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
char bf[1024], *alloc_str;
char buf[64], *alloc_str2;
const char *str;
+ int inline_rows = 0, ret = 1;
if (arg->row_offset != 0) {
arg->row_offset--;
@@ -801,10 +938,15 @@ static int hist_browser__show_callchain_list(struct hist_browser *browser,
}
print(browser, chain, str, offset, row, arg);
-
free(alloc_str);
free(alloc_str2);
- return 1;
+
+ if (symbol_conf.inline_name) {
+ inline_rows = show_inline_list(browser, chain->ms.map,
+ chain->ip, row + 1, offset);
+ }
+
+ return ret + inline_rows;
}
static bool check_percent_display(struct rb_node *node, u64 parent_total)
@@ -1228,6 +1370,12 @@ static int hist_browser__show_entry(struct hist_browser *browser,
folded_sign = hist_entry__folded(entry);
}
+ if (symbol_conf.inline_name &&
+ (!entry->has_children)) {
+ hist_entry_init_inline_node(entry);
+ folded_sign = hist_entry__folded(entry);
+ }
+
if (row_offset == 0) {
struct hpp_arg arg = {
.b = &browser->b,
@@ -1259,7 +1407,8 @@ static int hist_browser__show_entry(struct hist_browser *browser,
}
if (first) {
- if (symbol_conf.use_callchain) {
+ if (symbol_conf.use_callchain ||
+ symbol_conf.inline_name) {
ui_browser__printf(&browser->b, "%c ", folded_sign);
width -= 2;
}
@@ -1301,8 +1450,14 @@ static int hist_browser__show_entry(struct hist_browser *browser,
.is_current_entry = current_entry,
};
- printed += hist_browser__show_callchain(browser, entry, 1, row,
- hist_browser__show_callchain_entry, &arg,
+ if (entry->inline_node)
+ printed += hist_browser__show_inline(browser,
+ entry->inline_node, row, 0);
+ else
+ printed += hist_browser__show_callchain(browser,
+ entry, 1, row,
+ hist_browser__show_callchain_entry,
+ &arg,
hist_browser__check_output_full);
}
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index e3b38f6..3c4d4d0 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1136,6 +1136,11 @@ void hist_entry__delete(struct hist_entry *he)
zfree(&he->mem_info);
}
+ if (he->inline_node) {
+ inline_node__delete(he->inline_node);
+ he->inline_node = NULL;
+ }
+
zfree(&he->stat_acc);
free_srcline(he->srcline);
if (he->srcfile && he->srcfile[0])
diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h
index baf20a3..e35fb18 100644
--- a/tools/perf/util/sort.h
+++ b/tools/perf/util/sort.h
@@ -128,6 +128,7 @@ struct hist_entry {
};
char *srcline;
char *srcfile;
+ struct inline_node *inline_node;
struct symbol *parent;
struct branch_info *branch_info;
struct hists *hists;
--
2.7.4
It would be useful for perf to support a mode to query the
inline stack for a given callgraph address. This would simplify
finding the right code in code that does a lot of inlining.
srcline.c already contains the code that translates an address to
filename:line_nr. This patch extends that code so that it can also
return the inline stack.
It introduces inline_list, which stores the result for each inline
function (filename:line_nr and funcname).
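The data structure can be sketched outside the kernel tree. This is a
simplified illustration, not the perf implementation: struct frame and
struct frame_list are hypothetical stand-ins for inline_list and
inline_node, and a plain singly linked list replaces list_head. The
reverse step corresponds to switching between callee-first and
caller-first order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One resolved inline frame; fields follow struct inline_list. */
struct frame {
	char *filename;
	char *funcname;
	unsigned int line_nr;
	struct frame *next;
};

/* All frames found for one address; stand-in for struct inline_node. */
struct frame_list {
	unsigned long long addr;
	struct frame *head;
	struct frame *tail;
};

/* Duplicate a string with malloc(); NULL in (or no memory) gives NULL. */
char *dup_str(const char *s)
{
	size_t n;
	char *p;

	if (s == NULL)
		return NULL;
	n = strlen(s) + 1;
	p = malloc(n);
	if (p != NULL)
		memcpy(p, s, n);
	return p;
}

/* Append a frame at the tail. Returns 0 on success, -1 if out of memory. */
int frame_list_append(struct frame_list *fl, const char *filename,
		      const char *funcname, unsigned int line_nr)
{
	struct frame *f;

	assert(fl != NULL);

	f = calloc(1, sizeof(*f));
	if (f == NULL)
		return -1;

	f->filename = dup_str(filename);
	f->funcname = dup_str(funcname);
	f->line_nr = line_nr;

	if (fl->tail != NULL)
		fl->tail->next = f;
	else
		fl->head = f;
	fl->tail = f;
	return 0;
}

/* Reverse the list in place. */
void frame_list_reverse(struct frame_list *fl)
{
	struct frame *prev = NULL, *cur = fl->head, *next;

	fl->tail = fl->head;
	while (cur != NULL) {
		next = cur->next;
		cur->next = prev;
		prev = cur;
		cur = next;
	}
	fl->head = prev;
}

/* Free every frame together with the strings it owns. */
void frame_list_free(struct frame_list *fl)
{
	struct frame *cur = fl->head, *next;

	while (cur != NULL) {
		next = cur->next;
		free(cur->filename);
		free(cur->funcname);
		free(cur);
		cur = next;
	}
	fl->head = fl->tail = NULL;
}
```

Each entry owns its strings, so one free routine can release the whole
result, which is the same contract inline_node__delete() provides.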
Signed-off-by: Jin Yao <[email protected]>
Tested-by: Milian Wolff <[email protected]>
---
tools/perf/util/srcline.c | 169 +++++++++++++++++++++++++++++++++++++++++--
tools/perf/util/symbol-elf.c | 5 ++
tools/perf/util/symbol.h | 2 +
tools/perf/util/util.h | 16 ++++
4 files changed, 187 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 2953c9f..f9d4b47 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -7,6 +7,7 @@
#include "util/dso.h"
#include "util/util.h"
#include "util/debug.h"
+#include "util/callchain.h"
#include "symbol.h"
@@ -30,6 +31,41 @@ static const char *dso__name(struct dso *dso)
return dso_name;
}
+static int inline_list__append(char *filename, char *funcname, int line_nr,
+ struct inline_node *node, struct dso *dso)
+{
+ struct inline_list *ilist;
+ char *demangled;
+
+ ilist = zalloc(sizeof(*ilist));
+ if (ilist == NULL)
+ return -1;
+
+ ilist->filename = filename;
+ ilist->line_nr = line_nr;
+
+ demangled = dso__demangle_sym(dso, 0, funcname);
+ if (demangled == NULL) {
+ ilist->funcname = funcname;
+ } else {
+ ilist->funcname = demangled;
+ if (funcname != NULL)
+ free(funcname);
+ }
+
+ list_add_tail(&ilist->list, &node->val);
+
+ return 0;
+}
+
+static void inline_list__reverse(struct inline_node *node)
+{
+ struct inline_list *ilist, *n;
+
+ list_for_each_entry_safe_reverse(ilist, n, &node->val, list)
+ list_move_tail(&ilist->list, &node->val);
+}
+
#ifdef HAVE_LIBBFD_SUPPORT
/*
@@ -171,7 +207,7 @@ static void addr2line_cleanup(struct a2l_data *a2l)
static int addr2line(const char *dso_name, u64 addr,
char **file, unsigned int *line, struct dso *dso,
- bool unwind_inlines)
+ bool unwind_inlines, struct inline_node *node)
{
int ret = 0;
struct a2l_data *a2l = dso->a2l;
@@ -196,8 +232,21 @@ static int addr2line(const char *dso_name, u64 addr,
while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
&a2l->funcname, &a2l->line) &&
- cnt++ < MAX_INLINE_NEST)
- ;
+ cnt++ < MAX_INLINE_NEST) {
+
+ if (node != NULL) {
+ if (inline_list__append(strdup(a2l->filename),
+ strdup(a2l->funcname),
+ a2l->line, node,
+ dso) != 0)
+ return 0;
+ }
+ }
+
+ if ((node != NULL) &&
+ (callchain_param.order != ORDER_CALLEE)) {
+ inline_list__reverse(node);
+ }
}
if (a2l->found && a2l->filename) {
@@ -223,6 +272,35 @@ void dso__free_a2l(struct dso *dso)
dso->a2l = NULL;
}
+static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
+ struct dso *dso)
+{
+ char *file = NULL;
+ unsigned int line = 0;
+ struct inline_node *node;
+
+ node = zalloc(sizeof(*node));
+ if (node == NULL) {
+ perror("not enough memory for the inline node");
+ return NULL;
+ }
+
+ INIT_LIST_HEAD(&node->val);
+ node->addr = addr;
+
+ if (!addr2line(dso_name, addr, &file, &line, dso, TRUE, node))
+ goto out_free_inline_node;
+
+ if (list_empty(&node->val))
+ goto out_free_inline_node;
+
+ return node;
+
+out_free_inline_node:
+ inline_node__delete(node);
+ return NULL;
+}
+
#else /* HAVE_LIBBFD_SUPPORT */
static int filename_split(char *filename, unsigned int *line_nr)
@@ -249,7 +327,8 @@ static int filename_split(char *filename, unsigned int *line_nr)
static int addr2line(const char *dso_name, u64 addr,
char **file, unsigned int *line_nr,
struct dso *dso __maybe_unused,
- bool unwind_inlines __maybe_unused)
+ bool unwind_inlines __maybe_unused,
+ struct inline_node *node __maybe_unused)
{
FILE *fp;
char cmd[PATH_MAX];
@@ -288,6 +367,57 @@ void dso__free_a2l(struct dso *dso __maybe_unused)
{
}
+static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
+ struct dso *dso __maybe_unused)
+{
+ FILE *fp;
+ char cmd[PATH_MAX];
+ struct inline_node *node;
+ char *filename = NULL;
+ size_t len;
+ unsigned int line_nr = 0;
+
+ scnprintf(cmd, sizeof(cmd), "addr2line -e %s -i %016"PRIx64,
+ dso_name, addr);
+
+ fp = popen(cmd, "r");
+ if (fp == NULL) {
+ pr_err("popen failed for %s\n", dso_name);
+ return NULL;
+ }
+
+ node = zalloc(sizeof(*node));
+ if (node == NULL) {
+ perror("not enough memory for the inline node");
+ goto out;
+ }
+
+ INIT_LIST_HEAD(&node->val);
+ node->addr = addr;
+
+ while (getline(&filename, &len, fp) != -1) {
+ if (filename_split(filename, &line_nr) != 1) {
+ free(filename);
+ goto out;
+ }
+
+ if (inline_list__append(filename, NULL, line_nr, node, dso) != 0)
+ goto out;
+
+ filename = NULL;
+ }
+
+out:
+ pclose(fp);
+
+ if (list_empty(&node->val)) {
+ inline_node__delete(node);
+ return NULL;
+ }
+
+ return node;
+}
+
#endif /* HAVE_LIBBFD_SUPPORT */
/*
@@ -311,7 +441,7 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
if (dso_name == NULL)
goto out;
- if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines))
+ if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines, NULL))
goto out;
if (asprintf(&srcline, "%s:%u",
@@ -351,3 +481,32 @@ char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
{
return __get_srcline(dso, addr, sym, show_sym, false);
}
+
+struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr)
+{
+ const char *dso_name;
+
+ dso_name = dso__name(dso);
+ if (dso_name == NULL)
+ return NULL;
+
+ return addr2inlines(dso_name, addr, dso);
+}
+
+void inline_node__delete(struct inline_node *node)
+{
+ struct inline_list *ilist, *tmp;
+
+ list_for_each_entry_safe(ilist, tmp, &node->val, list) {
+ list_del_init(&ilist->list);
+ if (ilist->filename != NULL)
+ free(ilist->filename);
+
+ if (ilist->funcname != NULL)
+ free(ilist->funcname);
+
+ free(ilist);
+ }
+
+ free(node);
+}
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 4e59dde..3a1dda3 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -390,6 +390,11 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss, struct map *
return 0;
}
+char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name)
+{
+ return demangle_sym(dso, kmodule, elf_name);
+}
+
/*
* Align offset to 4 bytes as needed for note name and descriptor data.
*/
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 6c358b7..8adf045 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -305,6 +305,8 @@ int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss,
struct map *map);
+char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name);
+
void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool kernel);
void symbols__insert(struct rb_root *symbols, struct symbol *sym);
void symbols__fixup_duplicate(struct rb_root *symbols);
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index b2cfa47..cc0700d 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -364,4 +364,20 @@ int is_printable_array(char *p, unsigned int len);
int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);
int unit_number__scnprintf(char *buf, size_t size, u64 n);
+
+struct inline_list {
+ char *filename;
+ char *funcname;
+ unsigned int line_nr;
+ struct list_head list;
+};
+
+struct inline_node {
+ u64 addr;
+ struct list_head val;
+};
+
+struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr);
+void inline_node__delete(struct inline_node *node);
+
#endif /* GIT_COMPAT_UTIL_H */
--
2.7.4
Looking up the inline stack for callgraph addresses takes some time,
so a new option "--inline" lets the user decide whether to enable
this feature.
--inline:
If a callgraph address belongs to an inlined function, the inline stack
will be printed. Each entry is the inline function name or file/line.
Signed-off-by: Jin Yao <[email protected]>
Tested-by: Milian Wolff <[email protected]>
---
tools/perf/Documentation/perf-report.txt | 4 ++++
tools/perf/builtin-report.c | 2 ++
tools/perf/util/symbol.h | 3 ++-
3 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index e9a61f5..248bba4 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -430,6 +430,10 @@ include::itrace.txt[]
--hierarchy::
Enable hierarchical output.
+--inline::
+ If a callgraph address belongs to an inlined function, the inline stack
+ will be printed. Each entry is function name or file/line.
+
include::callchain-overhead-calculation.txt[]
SEE ALSO
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 5ab8117..26bc169 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -845,6 +845,8 @@ int cmd_report(int argc, const char **argv, const char *prefix __maybe_unused)
stdio__config_color, "always"),
OPT_STRING(0, "time", &report.time_str, "str",
"Time span of interest (start,stop)"),
+ OPT_BOOLEAN(0, "inline", &symbol_conf.inline_name,
+ "Show inline function"),
OPT_END()
};
struct perf_data_file file = {
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 8adf045..7b4a399 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -118,7 +118,8 @@ struct symbol_conf {
show_ref_callgraph,
hide_unresolved,
raw_trace,
- report_hierarchy;
+ report_hierarchy,
+ inline_name;
const char *vmlinux_name,
*kallsyms_name,
*source_prefix,
--
2.7.4
On Thursday, 16 March 2017 22:42:22 CET Jin Yao wrote:
> v5: Update according to Milian Wolff's comments. It groups by address
> (then display file/ line), or by function (then display function name).
Thank you Jin, that is really good. I tested it and it works really well for
me.
Arnaldo, could you please consider merging this? It's an extremely useful
feature and direly missing from perf so far.
That said, Jin, here are some observations that could be improved in the
future (I don't think any of these should hold back merging this feature now):
For the following example code, built with "-O2 -g" and recorded with
"--call-graph dwarf", I observe some output combinations that could potentially
be improved in the future:
~~~~~~~~~~~~~~~~~~~~
#include <complex>
#include <cmath>
#include <random>
#include <iostream>
using namespace std;
int main()
{
uniform_real_distribution<double> uniform(-1E5, 1E5);
default_random_engine engine;
double s = 0;
for (int i = 0; i < 10000000; ++i) {
s += norm(complex<double>(uniform(engine), uniform(engine)));
}
cout << s << '\n';
return 0;
}
~~~~~~~~~~~~~~~~
#1 duplicated entries when grouping by function:
~~~~~~~~~~~~~~~~
perf report --inline --stdio
...
--35.34%--_start
__libc_start_main
main
main (inline)
std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned
long, 16807ul, 0ul, 2147483647ul> > (inline)
std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned
long, 16807ul, 0ul, 2147483647ul> > (inline)
std::__detail::_Adaptor<std::linear_congruential_engine<unsigned
long, 16807ul, 0ul, 2147483647ul>, double>::operator() (inline)
~~~~~~~~~~~~~~~~
Here, we see main twice, once for the "real" frame, and once for an inlined
one? Then we see the same function twice as inlined frame, which is also odd.
~~~~~~~~~~~~~~~~
perf report --inline --stdio --no-children
...
59.81% cpp-inlining libm-2.25.so [.] __hypot_finite
|
---__hypot_finite
hypot
main
std::norm<double> (inline)
main (inline)
__libc_start_main
_start
~~~~~~~~~~~~~~~~
Here we see a confusing output. The first "main" frame below "hypot" is
actually code from the C++ <complex> header which got inlined into main. That
associates the wrong function name with this frame, i.e. "main" instead of
"std::norm". When the inline stack is shown below we actually see what happens,
i.e. we eventually end up in main again, but of course this output is not the
best as-is.
But, again: I think these are minor issues, and the feature itself is already
extremely useful and I hope to see it finally merged.
Thanks again Jin for your good work!
Cheers
--
Milian Wolff | [email protected] | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts
On Fri, Mar 17, 2017 at 05:42:24AM +0800, Jin Yao wrote:
> It would be useful for perf to support a mode to query the
> inline stack for a given callgraph address. This would simplify
> finding the right code in code that does a lot of inlining.
>
> srcline.c already contains the code that translates an address to
> filename:line_nr. This patch just extends that function so that it
> can also return the inline stack.
>
> It introduces the inline_list which will store the inline
> function result (filename:line_nr and funcname).
>
> Signed-off-by: Jin Yao <[email protected]>
> Tested-by: Milian Wolff <[email protected]>
> ---
> tools/perf/util/srcline.c | 169 +++++++++++++++++++++++++++++++++++++++++--
> tools/perf/util/symbol-elf.c | 5 ++
> tools/perf/util/symbol.h | 2 +
> tools/perf/util/util.h | 16 ++++
> 4 files changed, 187 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
> index 2953c9f..f9d4b47 100644
> --- a/tools/perf/util/srcline.c
> +++ b/tools/perf/util/srcline.c
> @@ -7,6 +7,7 @@
> #include "util/dso.h"
> #include "util/util.h"
> #include "util/debug.h"
> +#include "util/callchain.h"
>
> #include "symbol.h"
>
> @@ -30,6 +31,41 @@ static const char *dso__name(struct dso *dso)
> return dso_name;
> }
>
> +static int inline_list__append(char *filename, char *funcname, int line_nr,
> + struct inline_node *node, struct dso *dso)
> +{
> + struct inline_list *ilist;
> + char *demangled;
> +
> + ilist = zalloc(sizeof(*ilist));
> + if (ilist == NULL)
> + return -1;
> +
> + ilist->filename = filename;
> + ilist->line_nr = line_nr;
> +
> + demangled = dso__demangle_sym(dso, 0, funcname);
> + if (demangled == NULL) {
> + ilist->funcname = funcname;
> + } else {
> + ilist->funcname = demangled;
> + if (funcname != NULL)
> + free(funcname);
free() can be fed NULL, so the NULL check is not needed; I'll simplify this to an unconditional free(funcname).
> + }
> +
> + list_add_tail(&ilist->list, &node->val);
> +
> + return 0;
> +}
> +
> +static void inline_list__reverse(struct inline_node *node)
> +{
> + struct inline_list *ilist, *n;
> +
> + list_for_each_entry_safe_reverse(ilist, n, &node->val, list)
> + list_move_tail(&ilist->list, &node->val);
> +}
> +
> #ifdef HAVE_LIBBFD_SUPPORT
>
> /*
> @@ -171,7 +207,7 @@ static void addr2line_cleanup(struct a2l_data *a2l)
>
> static int addr2line(const char *dso_name, u64 addr,
> char **file, unsigned int *line, struct dso *dso,
> - bool unwind_inlines)
> + bool unwind_inlines, struct inline_node *node)
> {
> int ret = 0;
> struct a2l_data *a2l = dso->a2l;
> @@ -196,8 +232,21 @@ static int addr2line(const char *dso_name, u64 addr,
>
> while (bfd_find_inliner_info(a2l->abfd, &a2l->filename,
> &a2l->funcname, &a2l->line) &&
> - cnt++ < MAX_INLINE_NEST)
> - ;
> + cnt++ < MAX_INLINE_NEST) {
> +
> + if (node != NULL) {
> + if (inline_list__append(strdup(a2l->filename),
> + strdup(a2l->funcname),
> + a2l->line, node,
> + dso) != 0)
> + return 0;
> + }
> + }
> +
> + if ((node != NULL) &&
> + (callchain_param.order != ORDER_CALLEE)) {
> + inline_list__reverse(node);
> + }
> }
>
> if (a2l->found && a2l->filename) {
> @@ -223,6 +272,35 @@ void dso__free_a2l(struct dso *dso)
> dso->a2l = NULL;
> }
>
> +static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
> + struct dso *dso)
> +{
> + char *file = NULL;
> + unsigned int line = 0;
> + struct inline_node *node;
> +
> + node = zalloc(sizeof(*node));
> + if (node == NULL) {
> + perror("not enough memory for the inline node");
> + return NULL;
> + }
> +
> + INIT_LIST_HEAD(&node->val);
> + node->addr = addr;
> +
> + if (!addr2line(dso_name, addr, &file, &line, dso, TRUE, node))
> + goto out_free_inline_node;
> +
> + if (list_empty(&node->val))
> + goto out_free_inline_node;
> +
> + return node;
> +
> +out_free_inline_node:
> + inline_node__delete(node);
> + return NULL;
> +}
> +
> #else /* HAVE_LIBBFD_SUPPORT */
>
> static int filename_split(char *filename, unsigned int *line_nr)
> @@ -249,7 +327,8 @@ static int filename_split(char *filename, unsigned int *line_nr)
> static int addr2line(const char *dso_name, u64 addr,
> char **file, unsigned int *line_nr,
> struct dso *dso __maybe_unused,
> - bool unwind_inlines __maybe_unused)
> + bool unwind_inlines __maybe_unused,
> + struct inline_node *node __maybe_unused)
> {
> FILE *fp;
> char cmd[PATH_MAX];
> @@ -288,6 +367,57 @@ void dso__free_a2l(struct dso *dso __maybe_unused)
> {
> }
>
> +static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
> + struct dso *dso __maybe_unused)
> +{
> + FILE *fp;
> + char cmd[PATH_MAX];
> + struct inline_node *node;
> + char *filename = NULL;
> + size_t len;
> + unsigned int line_nr = 0;
> +
> + scnprintf(cmd, sizeof(cmd), "addr2line -e %s -i %016"PRIx64,
> + dso_name, addr);
> +
> + fp = popen(cmd, "r");
> + if (fp == NULL) {
> + pr_err("popen failed for %s\n", dso_name);
> + return NULL;
> + }
> +
> + node = zalloc(sizeof(*node));
> + if (node == NULL) {
> + perror("not enough memory for the inline node");
> + goto out;
> + }
> +
> + INIT_LIST_HEAD(&node->val);
> + node->addr = addr;
> +
> + while (getline(&filename, &len, fp) != -1) {
> + if (filename_split(filename, &line_nr) != 1) {
> + free(filename);
> + goto out;
> + }
> +
> + if (inline_list__append(filename, NULL, line_nr, node) != 0)
> + goto out;
> +
> + filename = NULL;
> + }
> +
> +out:
> + pclose(fp);
> +
> + if (list_empty(&node->val)) {
> + inline_node__delete(node);
> + return NULL;
> + }
> +
> + return node;
> +}
> +
> #endif /* HAVE_LIBBFD_SUPPORT */
>
> /*
> @@ -311,7 +441,7 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
> if (dso_name == NULL)
> goto out;
>
> - if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines))
> + if (!addr2line(dso_name, addr, &file, &line, dso, unwind_inlines, NULL))
> goto out;
>
> if (asprintf(&srcline, "%s:%u",
> @@ -351,3 +481,32 @@ char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
> {
> return __get_srcline(dso, addr, sym, show_sym, false);
> }
> +
> +struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr)
> +{
> + const char *dso_name;
> +
> + dso_name = dso__name(dso);
> + if (dso_name == NULL)
> + return NULL;
> +
> + return addr2inlines(dso_name, addr, dso);
> +}
> +
> +void inline_node__delete(struct inline_node *node)
> +{
> + struct inline_list *ilist, *tmp;
> +
> + list_for_each_entry_safe(ilist, tmp, &node->val, list) {
> + list_del_init(&ilist->list);
> + if (ilist->filename != NULL)
> + free(ilist->filename);
> +
> + if (ilist->funcname != NULL)
> + free(ilist->funcname);
> +
> + free(ilist);
> + }
> +
> + free(node);
> +}
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 4e59dde..3a1dda3 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -390,6 +390,11 @@ int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss, struct map *
> return 0;
> }
>
> +char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name)
> +{
> + return demangle_sym(dso, kmodule, elf_name);
> +}
> +
> /*
> * Align offset to 4 bytes as needed for note name and descriptor data.
> */
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 6c358b7..8adf045 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -305,6 +305,8 @@ int dso__load_sym(struct dso *dso, struct map *map, struct symsrc *syms_ss,
> int dso__synthesize_plt_symbols(struct dso *dso, struct symsrc *ss,
> struct map *map);
>
> +char *dso__demangle_sym(struct dso *dso, int kmodule, char *elf_name);
> +
> void __symbols__insert(struct rb_root *symbols, struct symbol *sym, bool kernel);
> void symbols__insert(struct rb_root *symbols, struct symbol *sym);
> void symbols__fixup_duplicate(struct rb_root *symbols);
> diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
> index b2cfa47..cc0700d 100644
> --- a/tools/perf/util/util.h
> +++ b/tools/perf/util/util.h
> @@ -364,4 +364,20 @@ int is_printable_array(char *p, unsigned int len);
> int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);
>
> int unit_number__scnprintf(char *buf, size_t size, u64 n);
> +
> +struct inline_list {
> + char *filename;
> + char *funcname;
> + unsigned int line_nr;
> + struct list_head list;
> +};
> +
> +struct inline_node {
> + u64 addr;
> + struct list_head val;
> +};
> +
> +struct inline_node *dso__parse_addr_inlines(struct dso *dso, u64 addr);
> +void inline_node__delete(struct inline_node *node);
> +
> #endif /* GIT_COMPAT_UTIL_H */
> --
> 2.7.4
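For readers who want to try the list handling from the quoted patch outside the perf tree (append at the tail, reverse in place when the callchain is not in callee order, free everything on delete), here is a minimal, self-contained sketch. It is only an illustration under stated assumptions: `struct entry`, `struct node` and the `node_*` helpers are hypothetical stand-ins written for this example, not perf's `inline_list`/`inline_node` and not the kernel `list.h` API. It also calls free() without a preceding NULL check, since free(NULL) is defined to do nothing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's inline_list / inline_node. */
struct entry {
	char *funcname;			/* owned by the entry, may be NULL */
	unsigned int line_nr;
	struct entry *prev, *next;
};

struct node {
	struct entry head;		/* sentinel of a circular doubly linked list */
};

static void node_init(struct node *n)
{
	n->head.prev = n->head.next = &n->head;
}

static void unlink_entry(struct entry *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void link_tail(struct node *n, struct entry *e)
{
	e->prev = n->head.prev;
	e->next = &n->head;
	n->head.prev->next = e;
	n->head.prev = e;
}

static char *copy_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return copy;
}

/* Append a copy of funcname (NULL allowed). Returns 0, or -1 if out of memory. */
static int node_append(struct node *n, const char *funcname, unsigned int line_nr)
{
	struct entry *e;

	assert(n != NULL);
	e = calloc(1, sizeof(*e));
	if (e == NULL)
		return -1;
	if (funcname != NULL) {
		e->funcname = copy_string(funcname);
		if (e->funcname == NULL) {
			free(e);
			return -1;
		}
	}
	e->line_nr = line_nr;
	link_tail(n, e);
	return 0;
}

/*
 * Reverse in place: walk from the old tail towards the head and move each
 * entry to the end, saving the predecessor before the entry is moved.
 */
static void node_reverse(struct node *n)
{
	struct entry *e = n->head.prev, *prev;

	while (e != &n->head) {
		prev = e->prev;
		unlink_entry(e);
		link_tail(n, e);
		e = prev;
	}
}

/* Free every entry; free(NULL) is a no-op, so funcname needs no NULL check. */
static void node_clear(struct node *n)
{
	struct entry *e = n->head.next, *next;

	while (e != &n->head) {
		next = e->next;
		free(e->funcname);
		free(e);
		e = next;
	}
	node_init(n);
}
```

The reverse walk mirrors the shape of the patch's inline_list__reverse(): the predecessor is remembered first, so moving the current entry to the tail does not disturb the iteration.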
On Sat, Mar 18, 2017 at 05:41:09PM +0100, Milian Wolff wrote:
> On Thursday, 16 March 2017 22:42:22 CET Jin Yao wrote:
> > v5: Update according to Milian Wolff's comments. It groups by address
> > (then display file/ line), or by function (then display function name).
>
> Thank you Jin, that is really good. I tested it and it works really well for
> me.
>
> Arnaldo, could you please consider merging this? It's an extremely useful
> feature and direly missing from perf so far.
Thanks, applied.
On Fri, Mar 24, 2017 at 04:01:19PM -0300, Arnaldo Carvalho de Melo wrote:
> On Sat, Mar 18, 2017 at 05:41:09PM +0100, Milian Wolff wrote:
> > On Thursday, 16 March 2017 22:42:22 CET Jin Yao wrote:
> > > v5: Update according to Milian Wolff's comments. It groups by address
> > > (then display file/ line), or by function (then display function name).
> >
> > Thank you Jin, that is really good. I tested it and it works really well for
> > me.
> >
> > Arnaldo, could you please consider merging this? It's an extremely useful
> > feature and direly missing from perf so far.
>
> Thanks, applied.
But it fails the build test in some cases, see below. I will try to fix it
later, if nobody beats me to it. What I have is in acme/perf/core on git.kernel.org.
make -C tools/perf build-test
<SNIP>
make_minimal_O: cd . && make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP -j4 O=/tmp/tmp.dQoIXBCebw DESTDIR=/tmp/tmp.zgrFKdJikV
cd . && make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP -j4 O=/tmp/tmp.dQoIXBCebw DESTDIR=/tmp/tmp.zgrFKdJikV
BUILD: Doing 'make -j4' parallel build
HOSTCC /tmp/tmp.dQoIXBCebw/fixdep.o
HOSTLD /tmp/tmp.dQoIXBCebw/fixdep-in.o
LINK /tmp/tmp.dQoIXBCebw/fixdep
Makefile.config:458: Disabling post unwind, no support found.
Makefile.config:594: Python support disabled by user
GEN /tmp/tmp.dQoIXBCebw/common-cmds.h
Warning: x86_64's syscall_64.tbl differs from kernel
MKDIR /tmp/tmp.dQoIXBCebw/fd/
CC /tmp/tmp.dQoIXBCebw/fd/array.o
CC /tmp/tmp.dQoIXBCebw/event-parse.o
MKDIR /tmp/tmp.dQoIXBCebw/fs/
CC /tmp/tmp.dQoIXBCebw/fs/fs.o
LD /tmp/tmp.dQoIXBCebw/fd/libapi-in.o
CC /tmp/tmp.dQoIXBCebw/cpu.o
CC /tmp/tmp.dQoIXBCebw/debug.o
PERF_VERSION = 4.11.rc2.g8bc82f
CC /tmp/tmp.dQoIXBCebw/exec-cmd.o
CC /tmp/tmp.dQoIXBCebw/str_error_r.o
MKDIR /tmp/tmp.dQoIXBCebw/pmu-events/
MKDIR /tmp/tmp.dQoIXBCebw/fs/
HOSTCC /tmp/tmp.dQoIXBCebw/pmu-events/json.o
CC /tmp/tmp.dQoIXBCebw/fs/tracing_path.o
MKDIR /tmp/tmp.dQoIXBCebw/pmu-events/
HOSTCC /tmp/tmp.dQoIXBCebw/pmu-events/jsmn.o
CC /tmp/tmp.dQoIXBCebw/help.o
LD /tmp/tmp.dQoIXBCebw/fs/libapi-in.o
HOSTCC /tmp/tmp.dQoIXBCebw/pmu-events/jevents.o
LD /tmp/tmp.dQoIXBCebw/libapi-in.o
AR /tmp/tmp.dQoIXBCebw/libapi.a
CC /tmp/tmp.dQoIXBCebw/plugin_jbd2.o
CC /tmp/tmp.dQoIXBCebw/event-plugin.o
LD /tmp/tmp.dQoIXBCebw/plugin_jbd2-in.o
CC /tmp/tmp.dQoIXBCebw/plugin_hrtimer.o
HOSTLD /tmp/tmp.dQoIXBCebw/pmu-events/jevents-in.o
CC /tmp/tmp.dQoIXBCebw/perf-read-vdso32
LD /tmp/tmp.dQoIXBCebw/plugin_hrtimer-in.o
GEN perf-archive
CC /tmp/tmp.dQoIXBCebw/plugin_kmem.o
GEN perf-with-kcore
CC /tmp/tmp.dQoIXBCebw/trace-seq.o
MKDIR /tmp/tmp.dQoIXBCebw/util/
CC /tmp/tmp.dQoIXBCebw/util/alias.o
CC /tmp/tmp.dQoIXBCebw/pager.o
LD /tmp/tmp.dQoIXBCebw/plugin_kmem-in.o
CC /tmp/tmp.dQoIXBCebw/plugin_kvm.o
CC /tmp/tmp.dQoIXBCebw/parse-filter.o
LD /tmp/tmp.dQoIXBCebw/plugin_kvm-in.o
CC /tmp/tmp.dQoIXBCebw/parse-options.o
CC /tmp/tmp.dQoIXBCebw/plugin_mac80211.o
MKDIR /tmp/tmp.dQoIXBCebw/util/
CC /tmp/tmp.dQoIXBCebw/util/annotate.o
LD /tmp/tmp.dQoIXBCebw/plugin_mac80211-in.o
CC /tmp/tmp.dQoIXBCebw/plugin_sched_switch.o
CC /tmp/tmp.dQoIXBCebw/parse-utils.o
LD /tmp/tmp.dQoIXBCebw/plugin_sched_switch-in.o
CC /tmp/tmp.dQoIXBCebw/plugin_function.o
CC /tmp/tmp.dQoIXBCebw/kbuffer-parse.o
LD /tmp/tmp.dQoIXBCebw/plugin_function-in.o
CC /tmp/tmp.dQoIXBCebw/plugin_xen.o
LD /tmp/tmp.dQoIXBCebw/libtraceevent-in.o
LD /tmp/tmp.dQoIXBCebw/plugin_xen-in.o
LINK /tmp/tmp.dQoIXBCebw/libtraceevent.a
CC /tmp/tmp.dQoIXBCebw/plugin_scsi.o
CC /tmp/tmp.dQoIXBCebw/builtin-bench.o
LD /tmp/tmp.dQoIXBCebw/plugin_scsi-in.o
CC /tmp/tmp.dQoIXBCebw/plugin_cfg80211.o
LD /tmp/tmp.dQoIXBCebw/plugin_cfg80211-in.o
LINK /tmp/tmp.dQoIXBCebw/plugin_jbd2.so
LINK /tmp/tmp.dQoIXBCebw/plugin_hrtimer.so
LINK /tmp/tmp.dQoIXBCebw/plugin_kmem.so
LINK /tmp/tmp.dQoIXBCebw/plugin_kvm.so
LINK /tmp/tmp.dQoIXBCebw/plugin_mac80211.so
CC /tmp/tmp.dQoIXBCebw/run-command.o
LINK /tmp/tmp.dQoIXBCebw/plugin_sched_switch.so
LINK /tmp/tmp.dQoIXBCebw/plugin_function.so
CC /tmp/tmp.dQoIXBCebw/builtin-annotate.o
LINK /tmp/tmp.dQoIXBCebw/plugin_xen.so
LINK /tmp/tmp.dQoIXBCebw/plugin_scsi.so
LINK /tmp/tmp.dQoIXBCebw/plugin_cfg80211.so
LINK /tmp/tmp.dQoIXBCebw/pmu-events/jevents
CC /tmp/tmp.dQoIXBCebw/sigchain.o
GEN /tmp/tmp.dQoIXBCebw/libtraceevent-dynamic-list
GEN /tmp/tmp.dQoIXBCebw/pmu-events/pmu-events.c
CC /tmp/tmp.dQoIXBCebw/subcmd-config.o
LD /tmp/tmp.dQoIXBCebw/libsubcmd-in.o
CC /tmp/tmp.dQoIXBCebw/pmu-events/pmu-events.o
AR /tmp/tmp.dQoIXBCebw/libsubcmd.a
CC /tmp/tmp.dQoIXBCebw/util/block-range.o
CC /tmp/tmp.dQoIXBCebw/builtin-config.o
CC /tmp/tmp.dQoIXBCebw/util/build-id.o
CC /tmp/tmp.dQoIXBCebw/util/config.o
CC /tmp/tmp.dQoIXBCebw/builtin-diff.o
LD /tmp/tmp.dQoIXBCebw/pmu-events/pmu-events-in.o
CC /tmp/tmp.dQoIXBCebw/builtin-evlist.o
CC /tmp/tmp.dQoIXBCebw/builtin-ftrace.o
CC /tmp/tmp.dQoIXBCebw/util/ctype.o
CC /tmp/tmp.dQoIXBCebw/util/db-export.o
CC /tmp/tmp.dQoIXBCebw/util/env.o
CC /tmp/tmp.dQoIXBCebw/builtin-help.o
CC /tmp/tmp.dQoIXBCebw/builtin-sched.o
CC /tmp/tmp.dQoIXBCebw/util/event.o
CC /tmp/tmp.dQoIXBCebw/util/evlist.o
CC /tmp/tmp.dQoIXBCebw/builtin-buildid-list.o
CC /tmp/tmp.dQoIXBCebw/builtin-buildid-cache.o
CC /tmp/tmp.dQoIXBCebw/builtin-kallsyms.o
CC /tmp/tmp.dQoIXBCebw/util/evsel.o
CC /tmp/tmp.dQoIXBCebw/builtin-list.o
CC /tmp/tmp.dQoIXBCebw/util/evsel_fprintf.o
CC /tmp/tmp.dQoIXBCebw/arch/common.o
CC /tmp/tmp.dQoIXBCebw/builtin-record.o
CC /tmp/tmp.dQoIXBCebw/util/find_bit.o
CC /tmp/tmp.dQoIXBCebw/util/kallsyms.o
MKDIR /tmp/tmp.dQoIXBCebw/arch/x86/util/
CC /tmp/tmp.dQoIXBCebw/arch/x86/util/header.o
CC /tmp/tmp.dQoIXBCebw/util/levenshtein.o
MKDIR /tmp/tmp.dQoIXBCebw/arch/x86/util/
CC /tmp/tmp.dQoIXBCebw/arch/x86/util/tsc.o
CC /tmp/tmp.dQoIXBCebw/util/llvm-utils.o
CC /tmp/tmp.dQoIXBCebw/arch/x86/util/pmu.o
CC /tmp/tmp.dQoIXBCebw/builtin-report.o
CC /tmp/tmp.dQoIXBCebw/arch/x86/util/kvm-stat.o
BISON /tmp/tmp.dQoIXBCebw/util/parse-events-bison.c
CC /tmp/tmp.dQoIXBCebw/util/perf_regs.o
CC /tmp/tmp.dQoIXBCebw/util/path.o
CC /tmp/tmp.dQoIXBCebw/util/rbtree.o
CC /tmp/tmp.dQoIXBCebw/util/libstring.o
CC /tmp/tmp.dQoIXBCebw/arch/x86/util/perf_regs.o
CC /tmp/tmp.dQoIXBCebw/util/bitmap.o
CC /tmp/tmp.dQoIXBCebw/util/hweight.o
CC /tmp/tmp.dQoIXBCebw/arch/x86/util/group.o
CC /tmp/tmp.dQoIXBCebw/util/quote.o
CC /tmp/tmp.dQoIXBCebw/builtin-stat.o
LD /tmp/tmp.dQoIXBCebw/arch/x86/util/libperf-in.o
MKDIR /tmp/tmp.dQoIXBCebw/ui/
CC /tmp/tmp.dQoIXBCebw/ui/setup.o
MKDIR /tmp/tmp.dQoIXBCebw/arch/x86/tests/
CC /tmp/tmp.dQoIXBCebw/arch/x86/tests/arch-tests.o
MKDIR /tmp/tmp.dQoIXBCebw/arch/x86/tests/
CC /tmp/tmp.dQoIXBCebw/arch/x86/tests/rdpmc.o
CC /tmp/tmp.dQoIXBCebw/util/strbuf.o
MKDIR /tmp/tmp.dQoIXBCebw/ui/
CC /tmp/tmp.dQoIXBCebw/ui/helpline.o
CC /tmp/tmp.dQoIXBCebw/arch/x86/tests/perf-time-to-tsc.o
CC /tmp/tmp.dQoIXBCebw/util/string.o
CC /tmp/tmp.dQoIXBCebw/ui/progress.o
CC /tmp/tmp.dQoIXBCebw/ui/util.o
CC /tmp/tmp.dQoIXBCebw/arch/x86/tests/intel-cqm.o
CC /tmp/tmp.dQoIXBCebw/ui/hist.o
CC /tmp/tmp.dQoIXBCebw/util/strlist.o
LD /tmp/tmp.dQoIXBCebw/arch/x86/tests/libperf-in.o
LD /tmp/tmp.dQoIXBCebw/arch/x86/libperf-in.o
LD /tmp/tmp.dQoIXBCebw/arch/libperf-in.o
MKDIR /tmp/tmp.dQoIXBCebw/scripts/
LD /tmp/tmp.dQoIXBCebw/scripts/libperf-in.o
CC /tmp/tmp.dQoIXBCebw/builtin-timechart.o
CC /tmp/tmp.dQoIXBCebw/util/strfilter.o
CC /tmp/tmp.dQoIXBCebw/builtin-top.o
CC /tmp/tmp.dQoIXBCebw/util/top.o
CC /tmp/tmp.dQoIXBCebw/util/usage.o
CC /tmp/tmp.dQoIXBCebw/util/dso.o
MKDIR /tmp/tmp.dQoIXBCebw/ui/stdio/
CC /tmp/tmp.dQoIXBCebw/ui/stdio/hist.o
CC /tmp/tmp.dQoIXBCebw/builtin-script.o
CC /tmp/tmp.dQoIXBCebw/builtin-kmem.o
LD /tmp/tmp.dQoIXBCebw/ui/libperf-in.o
CC /tmp/tmp.dQoIXBCebw/util/symbol.o
CC /tmp/tmp.dQoIXBCebw/builtin-lock.o
CC /tmp/tmp.dQoIXBCebw/util/symbol_fprintf.o
CC /tmp/tmp.dQoIXBCebw/util/color.o
CC /tmp/tmp.dQoIXBCebw/builtin-kvm.o
CC /tmp/tmp.dQoIXBCebw/util/header.o
CC /tmp/tmp.dQoIXBCebw/util/callchain.o
CC /tmp/tmp.dQoIXBCebw/util/values.o
CC /tmp/tmp.dQoIXBCebw/util/debug.o
CC /tmp/tmp.dQoIXBCebw/builtin-inject.o
CC /tmp/tmp.dQoIXBCebw/util/machine.o
CC /tmp/tmp.dQoIXBCebw/builtin-mem.o
CC /tmp/tmp.dQoIXBCebw/util/map.o
CC /tmp/tmp.dQoIXBCebw/builtin-data.o
CC /tmp/tmp.dQoIXBCebw/util/pstack.o
CC /tmp/tmp.dQoIXBCebw/builtin-version.o
CC /tmp/tmp.dQoIXBCebw/util/session.o
CC /tmp/tmp.dQoIXBCebw/builtin-c2c.o
CC /tmp/tmp.dQoIXBCebw/util/ordered-events.o
CC /tmp/tmp.dQoIXBCebw/util/namespaces.o
MKDIR /tmp/tmp.dQoIXBCebw/bench/
CC /tmp/tmp.dQoIXBCebw/bench/sched-messaging.o
CC /tmp/tmp.dQoIXBCebw/util/comm.o
MKDIR /tmp/tmp.dQoIXBCebw/bench/
CC /tmp/tmp.dQoIXBCebw/bench/sched-pipe.o
CC /tmp/tmp.dQoIXBCebw/util/thread.o
CC /tmp/tmp.dQoIXBCebw/bench/mem-functions.o
CC /tmp/tmp.dQoIXBCebw/util/thread_map.o
CC /tmp/tmp.dQoIXBCebw/util/trace-event-parse.o
MKDIR /tmp/tmp.dQoIXBCebw/tests/
CC /tmp/tmp.dQoIXBCebw/tests/builtin-test.o
CC /tmp/tmp.dQoIXBCebw/bench/futex-hash.o
CC /tmp/tmp.dQoIXBCebw/util/parse-events-bison.o
CC /tmp/tmp.dQoIXBCebw/bench/futex-wake.o
BISON /tmp/tmp.dQoIXBCebw/util/pmu-bison.c
CC /tmp/tmp.dQoIXBCebw/util/trace-event-read.o
MKDIR /tmp/tmp.dQoIXBCebw/tests/
CC /tmp/tmp.dQoIXBCebw/tests/parse-events.o
CC /tmp/tmp.dQoIXBCebw/bench/futex-wake-parallel.o
CC /tmp/tmp.dQoIXBCebw/util/trace-event-info.o
CC /tmp/tmp.dQoIXBCebw/bench/futex-requeue.o
CC /tmp/tmp.dQoIXBCebw/util/trace-event-scripting.o
CC /tmp/tmp.dQoIXBCebw/bench/futex-lock-pi.o
CC /tmp/tmp.dQoIXBCebw/util/trace-event.o
CC /tmp/tmp.dQoIXBCebw/bench/mem-memcpy-x86-64-asm.o
CC /tmp/tmp.dQoIXBCebw/bench/mem-memset-x86-64-asm.o
CC /tmp/tmp.dQoIXBCebw/util/svghelper.o
LD /tmp/tmp.dQoIXBCebw/bench/perf-in.o
CC /tmp/tmp.dQoIXBCebw/perf.o
CC /tmp/tmp.dQoIXBCebw/util/sort.o
CC /tmp/tmp.dQoIXBCebw/tests/dso-data.o
CC /tmp/tmp.dQoIXBCebw/tests/attr.o
CC /tmp/tmp.dQoIXBCebw/util/hist.o
CC /tmp/tmp.dQoIXBCebw/tests/vmlinux-kallsyms.o
CC /tmp/tmp.dQoIXBCebw/tests/openat-syscall.o
CC /tmp/tmp.dQoIXBCebw/tests/openat-syscall-all-cpus.o
CC /tmp/tmp.dQoIXBCebw/tests/openat-syscall-tp-fields.o
CC /tmp/tmp.dQoIXBCebw/tests/mmap-basic.o
CC /tmp/tmp.dQoIXBCebw/tests/perf-record.o
CC /tmp/tmp.dQoIXBCebw/util/util.o
CC /tmp/tmp.dQoIXBCebw/tests/evsel-roundtrip-name.o
CC /tmp/tmp.dQoIXBCebw/tests/evsel-tp-sched.o
CC /tmp/tmp.dQoIXBCebw/tests/fdarray.o
CC /tmp/tmp.dQoIXBCebw/util/xyarray.o
CC /tmp/tmp.dQoIXBCebw/tests/pmu.o
CC /tmp/tmp.dQoIXBCebw/util/cpumap.o
CC /tmp/tmp.dQoIXBCebw/tests/hists_common.o
CC /tmp/tmp.dQoIXBCebw/util/cgroup.o
CC /tmp/tmp.dQoIXBCebw/tests/hists_link.o
CC /tmp/tmp.dQoIXBCebw/tests/hists_filter.o
CC /tmp/tmp.dQoIXBCebw/util/target.o
CC /tmp/tmp.dQoIXBCebw/tests/hists_output.o
CC /tmp/tmp.dQoIXBCebw/util/rblist.o
CC /tmp/tmp.dQoIXBCebw/util/intlist.o
CC /tmp/tmp.dQoIXBCebw/tests/hists_cumulate.o
CC /tmp/tmp.dQoIXBCebw/util/vdso.o
CC /tmp/tmp.dQoIXBCebw/util/counts.o
CC /tmp/tmp.dQoIXBCebw/util/stat.o
CC /tmp/tmp.dQoIXBCebw/util/stat-shadow.o
CC /tmp/tmp.dQoIXBCebw/tests/python-use.o
CC /tmp/tmp.dQoIXBCebw/tests/bp_signal.o
CC /tmp/tmp.dQoIXBCebw/tests/bp_signal_overflow.o
CC /tmp/tmp.dQoIXBCebw/util/record.o
CC /tmp/tmp.dQoIXBCebw/tests/task-exit.o
CC /tmp/tmp.dQoIXBCebw/tests/sw-clock.o
CC /tmp/tmp.dQoIXBCebw/util/srcline.o
CC /tmp/tmp.dQoIXBCebw/tests/mmap-thread-lookup.o
CC /tmp/tmp.dQoIXBCebw/util/data.o
CC /tmp/tmp.dQoIXBCebw/tests/thread-mg-share.o
CC /tmp/tmp.dQoIXBCebw/util/tsc.o
CC /tmp/tmp.dQoIXBCebw/tests/switch-tracking.o
CC /tmp/tmp.dQoIXBCebw/util/cloexec.o
CC /tmp/tmp.dQoIXBCebw/tests/keep-tracking.o
CC /tmp/tmp.dQoIXBCebw/util/call-path.o
CC /tmp/tmp.dQoIXBCebw/util/thread-stack.o
CC /tmp/tmp.dQoIXBCebw/util/parse-branch-options.o
CC /tmp/tmp.dQoIXBCebw/tests/code-reading.o
CC /tmp/tmp.dQoIXBCebw/tests/sample-parsing.o
CC /tmp/tmp.dQoIXBCebw/util/dump-insn.o
CC /tmp/tmp.dQoIXBCebw/util/parse-regs-options.o
CC /tmp/tmp.dQoIXBCebw/util/term.o
CC /tmp/tmp.dQoIXBCebw/tests/parse-no-sample-id-all.o
CC /tmp/tmp.dQoIXBCebw/util/help-unknown-cmd.o
CC /tmp/tmp.dQoIXBCebw/tests/kmod-path.o
CC /tmp/tmp.dQoIXBCebw/util/mem-events.o
CC /tmp/tmp.dQoIXBCebw/tests/thread-map.o
CC /tmp/tmp.dQoIXBCebw/util/vsprintf.o
CC /tmp/tmp.dQoIXBCebw/util/drv_configs.o
CC /tmp/tmp.dQoIXBCebw/tests/llvm.o
CC /tmp/tmp.dQoIXBCebw/util/time-utils.o
CC /tmp/tmp.dQoIXBCebw/tests/bpf.o
CC /tmp/tmp.dQoIXBCebw/tests/topology.o
BISON /tmp/tmp.dQoIXBCebw/util/expr-bison.c
CC /tmp/tmp.dQoIXBCebw/util/symbol-minimal.o
MKDIR /tmp/tmp.dQoIXBCebw/util/scripting-engines/
LD /tmp/tmp.dQoIXBCebw/util/scripting-engines/libperf-in.o
CC /tmp/tmp.dQoIXBCebw/util/zlib.o
CC /tmp/tmp.dQoIXBCebw/tests/cpumap.o
CC /tmp/tmp.dQoIXBCebw/util/lzma.o
CC /tmp/tmp.dQoIXBCebw/tests/stat.o
CC /tmp/tmp.dQoIXBCebw/tests/event_update.o
CC /tmp/tmp.dQoIXBCebw/util/demangle-java.o
CC /tmp/tmp.dQoIXBCebw/tests/event-times.o
CC /tmp/tmp.dQoIXBCebw/util/demangle-rust.o
CC /tmp/tmp.dQoIXBCebw/tests/expr.o
CC /tmp/tmp.dQoIXBCebw/util/perf-hooks.o
CC /tmp/tmp.dQoIXBCebw/tests/backward-ring-buffer.o
CC /tmp/tmp.dQoIXBCebw/tests/sdt.o
CC /tmp/tmp.dQoIXBCebw/tests/is_printable_array.o
FLEX /tmp/tmp.dQoIXBCebw/util/parse-events-flex.c
FLEX /tmp/tmp.dQoIXBCebw/util/pmu-flex.c
CC /tmp/tmp.dQoIXBCebw/util/pmu-bison.o
CC /tmp/tmp.dQoIXBCebw/tests/bitmap.o
CC /tmp/tmp.dQoIXBCebw/tests/perf-hooks.o
CC /tmp/tmp.dQoIXBCebw/tests/clang.o
CC /tmp/tmp.dQoIXBCebw/util/expr-bison.o
CC /tmp/tmp.dQoIXBCebw/tests/unit_number__scnprintf.o
CC /tmp/tmp.dQoIXBCebw/tests/llvm-src-base.o
CC /tmp/tmp.dQoIXBCebw/tests/llvm-src-kbuild.o
CC /tmp/tmp.dQoIXBCebw/tests/llvm-src-prologue.o
CC /tmp/tmp.dQoIXBCebw/tests/llvm-src-relocation.o
LD /tmp/tmp.dQoIXBCebw/tests/perf-in.o
CC /tmp/tmp.dQoIXBCebw/util/parse-events.o
CC /tmp/tmp.dQoIXBCebw/util/pmu.o
CC /tmp/tmp.dQoIXBCebw/util/parse-events-flex.o
LD /tmp/tmp.dQoIXBCebw/perf-in.o
CC /tmp/tmp.dQoIXBCebw/util/pmu-flex.o
LD /tmp/tmp.dQoIXBCebw/util/libperf-in.o
LD /tmp/tmp.dQoIXBCebw/libperf-in.o
AR /tmp/tmp.dQoIXBCebw/libperf.a
LINK /tmp/tmp.dQoIXBCebw/perf
/tmp/tmp.dQoIXBCebw/libperf.a(libperf-in.o): In function `inline_list__append':
/home/acme/git/linux/tools/perf/util/srcline.c:47: undefined reference to `dso__demangle_sym'
collect2: error: ld returned 1 exit status
Makefile.perf:420: recipe for target '/tmp/tmp.dQoIXBCebw/perf' failed
make[4]: *** [/tmp/tmp.dQoIXBCebw/perf] Error 1
Makefile.perf:204: recipe for target 'sub-make' failed
make[3]: *** [sub-make] Error 2
Makefile:68: recipe for target 'all' failed
make[2]: *** [all] Error 2
tests/make:296: recipe for target 'make_minimal_O' failed
make[1]: *** [make_minimal_O] Error 1
Makefile:102: recipe for target 'build-test' failed
make: *** [build-test] Error 2
make: Leaving directory '/home/acme/git/linux/tools/perf'
[acme@jouet linux]$
Hi Arnaldo,
I checked: dso__demangle_sym() needs to be added to symbol-minimal.c as well.
make -C tools/perf build-test
..........
iqhvUMsN DESTDIR=/tmp/tmp.9zqX9FtV0p
Makefile:203: Please install asciidoc xmlto to have the man pages installed
make_no_slang_O: cd . && make NO_SLANG=1
FEATURES_DUMP=/home/jinyao/skl-ws/perf-dev/tmp/acme/tools/perf/BUILD_TEST_FEATURE_DUMP
-j8 O=/tmp/tmp.eDBcxsrMuA DESTDIR=/tmp/tmp.0wwJYGzYTr
OK
I will send a patch "perf report: Fix build-test error for
make_minimal_O target" to fix that.
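As a rough illustration of the kind of fix described above, here is a small, self-contained sketch of the usual pattern: when a build configuration leaves out the file that defines a function, the replacement file provides a stub with the same signature so that callers still link. All names here are hypothetical and chosen for this example only: `HAVE_FULL_SYMBOL_SUPPORT` stands in for whatever selects symbol-elf.c over symbol-minimal.c, and `demangle_name()` / `display_name()` are not perf functions. The actual patch may look different.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef HAVE_FULL_SYMBOL_SUPPORT
/* Full build: the real demangler is assumed to be defined in another file. */
char *demangle_name(const char *mangled);
#else
/* Minimal build: no demangler is available, so always report "not demangled". */
char *demangle_name(const char *mangled)
{
	(void)mangled;
	return NULL;
}
#endif

/*
 * Caller that links against either variant: copy the demangled name into
 * buf if there is one, otherwise fall back to the raw symbol name.
 */
const char *display_name(const char *mangled, char *buf, size_t size)
{
	char *demangled;

	assert(mangled != NULL && buf != NULL && size > 0);
	demangled = demangle_name(mangled);
	snprintf(buf, size, "%s", demangled != NULL ? demangled : mangled);
	free(demangled);	/* free(NULL) is a no-op */
	return buf;
}
```

Because the stub returns NULL, a caller written like inline_list__append() in the quoted patch would simply keep the original function name in the minimal build.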
Thanks so much for your help!
Thanks
Jin Yao
On 3/25/2017 3:24 AM, Arnaldo Carvalho de Melo wrote:
> On Fri, Mar 24, 2017 at 04:01:19PM -0300, Arnaldo Carvalho de Melo wrote:
>> On Sat, Mar 18, 2017 at 05:41:09PM +0100, Milian Wolff wrote:
>>> On Thursday, 16 March 2017 22:42:22 CET Jin Yao wrote:
>>>> v5: Update according to Milian Wolff's comments. It groups by address
>>>> (then display file/ line), or by function (then display function name).
>>> Thank you Jin, that is really good. I tested it and it works really well for
>>> me.
>>>
>>> Arnaldo, could you please consider merging this? It's an extremely useful
>>> feature and direly missing from perf so far.
>> Thanks, applied.
> But it fails testing in some cases, see below, will try to fix later, if nobody
> beats me to it, what I have is in acme/perf/core, git.kernel.org
>
>
> make -C tools/perf build-test
>
> <SNIP>
> make_minimal_O: cd . && make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP -j4 O=/tmp/tmp.dQoIXBCebw DESTDIR=/tmp/tmp.zgrFKdJikV
> cd . && make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP -j4 O=/tmp/tmp.dQoIXBCebw DESTDIR=/tmp/tmp.zgrFKdJikV
> BUILD: Doing 'make -j4' parallel build
> HOSTCC /tmp/tmp.dQoIXBCebw/fixdep.o
> HOSTLD /tmp/tmp.dQoIXBCebw/fixdep-in.o
> LINK /tmp/tmp.dQoIXBCebw/fixdep
> Makefile.config:458: Disabling post unwind, no support found.
> Makefile.config:594: Python support disabled by user
> GEN /tmp/tmp.dQoIXBCebw/common-cmds.h
> Warning: x86_64's syscall_64.tbl differs from kernel
> MKDIR /tmp/tmp.dQoIXBCebw/fd/
> CC /tmp/tmp.dQoIXBCebw/fd/array.o
> CC /tmp/tmp.dQoIXBCebw/event-parse.o
> MKDIR /tmp/tmp.dQoIXBCebw/fs/
> CC /tmp/tmp.dQoIXBCebw/fs/fs.o
> LD /tmp/tmp.dQoIXBCebw/fd/libapi-in.o
> CC /tmp/tmp.dQoIXBCebw/cpu.o
> CC /tmp/tmp.dQoIXBCebw/debug.o
> PERF_VERSION = 4.11.rc2.g8bc82f
> CC /tmp/tmp.dQoIXBCebw/exec-cmd.o
> CC /tmp/tmp.dQoIXBCebw/str_error_r.o
> MKDIR /tmp/tmp.dQoIXBCebw/pmu-events/
> MKDIR /tmp/tmp.dQoIXBCebw/fs/
> HOSTCC /tmp/tmp.dQoIXBCebw/pmu-events/json.o
> CC /tmp/tmp.dQoIXBCebw/fs/tracing_path.o
> MKDIR /tmp/tmp.dQoIXBCebw/pmu-events/
> HOSTCC /tmp/tmp.dQoIXBCebw/pmu-events/jsmn.o
> CC /tmp/tmp.dQoIXBCebw/help.o
> LD /tmp/tmp.dQoIXBCebw/fs/libapi-in.o
> HOSTCC /tmp/tmp.dQoIXBCebw/pmu-events/jevents.o
> LD /tmp/tmp.dQoIXBCebw/libapi-in.o
> AR /tmp/tmp.dQoIXBCebw/libapi.a
> CC /tmp/tmp.dQoIXBCebw/plugin_jbd2.o
> CC /tmp/tmp.dQoIXBCebw/event-plugin.o
> LD /tmp/tmp.dQoIXBCebw/plugin_jbd2-in.o
> CC /tmp/tmp.dQoIXBCebw/plugin_hrtimer.o
> HOSTLD /tmp/tmp.dQoIXBCebw/pmu-events/jevents-in.o
> CC /tmp/tmp.dQoIXBCebw/perf-read-vdso32
> LD /tmp/tmp.dQoIXBCebw/plugin_hrtimer-in.o
> GEN perf-archive
> CC /tmp/tmp.dQoIXBCebw/plugin_kmem.o
> GEN perf-with-kcore
> CC /tmp/tmp.dQoIXBCebw/trace-seq.o
> MKDIR /tmp/tmp.dQoIXBCebw/util/
> CC /tmp/tmp.dQoIXBCebw/util/alias.o
> CC /tmp/tmp.dQoIXBCebw/pager.o
> LD /tmp/tmp.dQoIXBCebw/plugin_kmem-in.o
> CC /tmp/tmp.dQoIXBCebw/plugin_kvm.o
> CC /tmp/tmp.dQoIXBCebw/parse-filter.o
> LD /tmp/tmp.dQoIXBCebw/plugin_kvm-in.o
> CC /tmp/tmp.dQoIXBCebw/parse-options.o
> CC /tmp/tmp.dQoIXBCebw/plugin_mac80211.o
> MKDIR /tmp/tmp.dQoIXBCebw/util/
> CC /tmp/tmp.dQoIXBCebw/util/annotate.o
> LD /tmp/tmp.dQoIXBCebw/plugin_mac80211-in.o
> CC /tmp/tmp.dQoIXBCebw/plugin_sched_switch.o
> CC /tmp/tmp.dQoIXBCebw/parse-utils.o
> LD /tmp/tmp.dQoIXBCebw/plugin_sched_switch-in.o
> CC /tmp/tmp.dQoIXBCebw/plugin_function.o
> CC /tmp/tmp.dQoIXBCebw/kbuffer-parse.o
> LD /tmp/tmp.dQoIXBCebw/plugin_function-in.o
> CC /tmp/tmp.dQoIXBCebw/plugin_xen.o
> LD /tmp/tmp.dQoIXBCebw/libtraceevent-in.o
> LD /tmp/tmp.dQoIXBCebw/plugin_xen-in.o
> LINK /tmp/tmp.dQoIXBCebw/libtraceevent.a
> CC /tmp/tmp.dQoIXBCebw/plugin_scsi.o
> CC /tmp/tmp.dQoIXBCebw/builtin-bench.o
> LD /tmp/tmp.dQoIXBCebw/plugin_scsi-in.o
> CC /tmp/tmp.dQoIXBCebw/plugin_cfg80211.o
> LD /tmp/tmp.dQoIXBCebw/plugin_cfg80211-in.o
> LINK /tmp/tmp.dQoIXBCebw/plugin_jbd2.so
> LINK /tmp/tmp.dQoIXBCebw/plugin_hrtimer.so
> LINK /tmp/tmp.dQoIXBCebw/plugin_kmem.so
> LINK /tmp/tmp.dQoIXBCebw/plugin_kvm.so
> LINK /tmp/tmp.dQoIXBCebw/plugin_mac80211.so
> CC /tmp/tmp.dQoIXBCebw/run-command.o
> LINK /tmp/tmp.dQoIXBCebw/plugin_sched_switch.so
> LINK /tmp/tmp.dQoIXBCebw/plugin_function.so
> CC /tmp/tmp.dQoIXBCebw/builtin-annotate.o
> LINK /tmp/tmp.dQoIXBCebw/plugin_xen.so
> LINK /tmp/tmp.dQoIXBCebw/plugin_scsi.so
> LINK /tmp/tmp.dQoIXBCebw/plugin_cfg80211.so
> LINK /tmp/tmp.dQoIXBCebw/pmu-events/jevents
> CC /tmp/tmp.dQoIXBCebw/sigchain.o
> GEN /tmp/tmp.dQoIXBCebw/libtraceevent-dynamic-list
> GEN /tmp/tmp.dQoIXBCebw/pmu-events/pmu-events.c
> CC /tmp/tmp.dQoIXBCebw/subcmd-config.o
> LD /tmp/tmp.dQoIXBCebw/libsubcmd-in.o
> CC /tmp/tmp.dQoIXBCebw/pmu-events/pmu-events.o
> AR /tmp/tmp.dQoIXBCebw/libsubcmd.a
> CC /tmp/tmp.dQoIXBCebw/util/block-range.o
> CC /tmp/tmp.dQoIXBCebw/builtin-config.o
> CC /tmp/tmp.dQoIXBCebw/util/build-id.o
> CC /tmp/tmp.dQoIXBCebw/util/config.o
> CC /tmp/tmp.dQoIXBCebw/builtin-diff.o
> LD /tmp/tmp.dQoIXBCebw/pmu-events/pmu-events-in.o
> CC /tmp/tmp.dQoIXBCebw/builtin-evlist.o
> CC /tmp/tmp.dQoIXBCebw/builtin-ftrace.o
> CC /tmp/tmp.dQoIXBCebw/util/ctype.o
> CC /tmp/tmp.dQoIXBCebw/util/db-export.o
> CC /tmp/tmp.dQoIXBCebw/util/env.o
> CC /tmp/tmp.dQoIXBCebw/builtin-help.o
> CC /tmp/tmp.dQoIXBCebw/builtin-sched.o
> CC /tmp/tmp.dQoIXBCebw/util/event.o
> CC /tmp/tmp.dQoIXBCebw/util/evlist.o
> CC /tmp/tmp.dQoIXBCebw/builtin-buildid-list.o
> CC /tmp/tmp.dQoIXBCebw/builtin-buildid-cache.o
> CC /tmp/tmp.dQoIXBCebw/builtin-kallsyms.o
> CC /tmp/tmp.dQoIXBCebw/util/evsel.o
> CC /tmp/tmp.dQoIXBCebw/builtin-list.o
> CC /tmp/tmp.dQoIXBCebw/util/evsel_fprintf.o
> CC /tmp/tmp.dQoIXBCebw/arch/common.o
> CC /tmp/tmp.dQoIXBCebw/builtin-record.o
> CC /tmp/tmp.dQoIXBCebw/util/find_bit.o
> CC /tmp/tmp.dQoIXBCebw/util/kallsyms.o
> MKDIR /tmp/tmp.dQoIXBCebw/arch/x86/util/
> CC /tmp/tmp.dQoIXBCebw/arch/x86/util/header.o
> CC /tmp/tmp.dQoIXBCebw/util/levenshtein.o
> MKDIR /tmp/tmp.dQoIXBCebw/arch/x86/util/
> CC /tmp/tmp.dQoIXBCebw/arch/x86/util/tsc.o
> CC /tmp/tmp.dQoIXBCebw/util/llvm-utils.o
> CC /tmp/tmp.dQoIXBCebw/arch/x86/util/pmu.o
> CC /tmp/tmp.dQoIXBCebw/builtin-report.o
> CC /tmp/tmp.dQoIXBCebw/arch/x86/util/kvm-stat.o
> BISON /tmp/tmp.dQoIXBCebw/util/parse-events-bison.c
> CC /tmp/tmp.dQoIXBCebw/util/perf_regs.o
> CC /tmp/tmp.dQoIXBCebw/util/path.o
> CC /tmp/tmp.dQoIXBCebw/util/rbtree.o
> CC /tmp/tmp.dQoIXBCebw/util/libstring.o
> CC /tmp/tmp.dQoIXBCebw/arch/x86/util/perf_regs.o
> CC /tmp/tmp.dQoIXBCebw/util/bitmap.o
> CC /tmp/tmp.dQoIXBCebw/util/hweight.o
> CC /tmp/tmp.dQoIXBCebw/arch/x86/util/group.o
> CC /tmp/tmp.dQoIXBCebw/util/quote.o
> CC /tmp/tmp.dQoIXBCebw/builtin-stat.o
> LD /tmp/tmp.dQoIXBCebw/arch/x86/util/libperf-in.o
> MKDIR /tmp/tmp.dQoIXBCebw/ui/
> CC /tmp/tmp.dQoIXBCebw/ui/setup.o
> MKDIR /tmp/tmp.dQoIXBCebw/arch/x86/tests/
> CC /tmp/tmp.dQoIXBCebw/arch/x86/tests/arch-tests.o
> MKDIR /tmp/tmp.dQoIXBCebw/arch/x86/tests/
> CC /tmp/tmp.dQoIXBCebw/arch/x86/tests/rdpmc.o
> CC /tmp/tmp.dQoIXBCebw/util/strbuf.o
> MKDIR /tmp/tmp.dQoIXBCebw/ui/
> CC /tmp/tmp.dQoIXBCebw/ui/helpline.o
> CC /tmp/tmp.dQoIXBCebw/arch/x86/tests/perf-time-to-tsc.o
> CC /tmp/tmp.dQoIXBCebw/util/string.o
> CC /tmp/tmp.dQoIXBCebw/ui/progress.o
> CC /tmp/tmp.dQoIXBCebw/ui/util.o
> CC /tmp/tmp.dQoIXBCebw/arch/x86/tests/intel-cqm.o
> CC /tmp/tmp.dQoIXBCebw/ui/hist.o
> CC /tmp/tmp.dQoIXBCebw/util/strlist.o
> LD /tmp/tmp.dQoIXBCebw/arch/x86/tests/libperf-in.o
> LD /tmp/tmp.dQoIXBCebw/arch/x86/libperf-in.o
> LD /tmp/tmp.dQoIXBCebw/arch/libperf-in.o
> MKDIR /tmp/tmp.dQoIXBCebw/scripts/
> LD /tmp/tmp.dQoIXBCebw/scripts/libperf-in.o
> CC /tmp/tmp.dQoIXBCebw/builtin-timechart.o
> CC /tmp/tmp.dQoIXBCebw/util/strfilter.o
> CC /tmp/tmp.dQoIXBCebw/builtin-top.o
> CC /tmp/tmp.dQoIXBCebw/util/top.o
> CC /tmp/tmp.dQoIXBCebw/util/usage.o
> CC /tmp/tmp.dQoIXBCebw/util/dso.o
> MKDIR /tmp/tmp.dQoIXBCebw/ui/stdio/
> CC /tmp/tmp.dQoIXBCebw/ui/stdio/hist.o
> CC /tmp/tmp.dQoIXBCebw/builtin-script.o
> CC /tmp/tmp.dQoIXBCebw/builtin-kmem.o
> LD /tmp/tmp.dQoIXBCebw/ui/libperf-in.o
> CC /tmp/tmp.dQoIXBCebw/util/symbol.o
> CC /tmp/tmp.dQoIXBCebw/builtin-lock.o
> CC /tmp/tmp.dQoIXBCebw/util/symbol_fprintf.o
> CC /tmp/tmp.dQoIXBCebw/util/color.o
> CC /tmp/tmp.dQoIXBCebw/builtin-kvm.o
> CC /tmp/tmp.dQoIXBCebw/util/header.o
> CC /tmp/tmp.dQoIXBCebw/util/callchain.o
> CC /tmp/tmp.dQoIXBCebw/util/values.o
> CC /tmp/tmp.dQoIXBCebw/util/debug.o
> CC /tmp/tmp.dQoIXBCebw/builtin-inject.o
> CC /tmp/tmp.dQoIXBCebw/util/machine.o
> CC /tmp/tmp.dQoIXBCebw/builtin-mem.o
> CC /tmp/tmp.dQoIXBCebw/util/map.o
> CC /tmp/tmp.dQoIXBCebw/builtin-data.o
> CC /tmp/tmp.dQoIXBCebw/util/pstack.o
> CC /tmp/tmp.dQoIXBCebw/builtin-version.o
> CC /tmp/tmp.dQoIXBCebw/util/session.o
> CC /tmp/tmp.dQoIXBCebw/builtin-c2c.o
> CC /tmp/tmp.dQoIXBCebw/util/ordered-events.o
> CC /tmp/tmp.dQoIXBCebw/util/namespaces.o
> MKDIR /tmp/tmp.dQoIXBCebw/bench/
> CC /tmp/tmp.dQoIXBCebw/bench/sched-messaging.o
> CC /tmp/tmp.dQoIXBCebw/util/comm.o
> MKDIR /tmp/tmp.dQoIXBCebw/bench/
> CC /tmp/tmp.dQoIXBCebw/bench/sched-pipe.o
> CC /tmp/tmp.dQoIXBCebw/util/thread.o
> CC /tmp/tmp.dQoIXBCebw/bench/mem-functions.o
> CC /tmp/tmp.dQoIXBCebw/util/thread_map.o
> CC /tmp/tmp.dQoIXBCebw/util/trace-event-parse.o
> MKDIR /tmp/tmp.dQoIXBCebw/tests/
> CC /tmp/tmp.dQoIXBCebw/tests/builtin-test.o
> CC /tmp/tmp.dQoIXBCebw/bench/futex-hash.o
> CC /tmp/tmp.dQoIXBCebw/util/parse-events-bison.o
> CC /tmp/tmp.dQoIXBCebw/bench/futex-wake.o
> BISON /tmp/tmp.dQoIXBCebw/util/pmu-bison.c
> CC /tmp/tmp.dQoIXBCebw/util/trace-event-read.o
> MKDIR /tmp/tmp.dQoIXBCebw/tests/
> CC /tmp/tmp.dQoIXBCebw/tests/parse-events.o
> CC /tmp/tmp.dQoIXBCebw/bench/futex-wake-parallel.o
> CC /tmp/tmp.dQoIXBCebw/util/trace-event-info.o
> CC /tmp/tmp.dQoIXBCebw/bench/futex-requeue.o
> CC /tmp/tmp.dQoIXBCebw/util/trace-event-scripting.o
> CC /tmp/tmp.dQoIXBCebw/bench/futex-lock-pi.o
> CC /tmp/tmp.dQoIXBCebw/util/trace-event.o
> CC /tmp/tmp.dQoIXBCebw/bench/mem-memcpy-x86-64-asm.o
> CC /tmp/tmp.dQoIXBCebw/bench/mem-memset-x86-64-asm.o
> CC /tmp/tmp.dQoIXBCebw/util/svghelper.o
> LD /tmp/tmp.dQoIXBCebw/bench/perf-in.o
> CC /tmp/tmp.dQoIXBCebw/perf.o
> CC /tmp/tmp.dQoIXBCebw/util/sort.o
> CC /tmp/tmp.dQoIXBCebw/tests/dso-data.o
> CC /tmp/tmp.dQoIXBCebw/tests/attr.o
> CC /tmp/tmp.dQoIXBCebw/util/hist.o
> CC /tmp/tmp.dQoIXBCebw/tests/vmlinux-kallsyms.o
> CC /tmp/tmp.dQoIXBCebw/tests/openat-syscall.o
> CC /tmp/tmp.dQoIXBCebw/tests/openat-syscall-all-cpus.o
> CC /tmp/tmp.dQoIXBCebw/tests/openat-syscall-tp-fields.o
> CC /tmp/tmp.dQoIXBCebw/tests/mmap-basic.o
> CC /tmp/tmp.dQoIXBCebw/tests/perf-record.o
> CC /tmp/tmp.dQoIXBCebw/util/util.o
> CC /tmp/tmp.dQoIXBCebw/tests/evsel-roundtrip-name.o
> CC /tmp/tmp.dQoIXBCebw/tests/evsel-tp-sched.o
> CC /tmp/tmp.dQoIXBCebw/tests/fdarray.o
> CC /tmp/tmp.dQoIXBCebw/util/xyarray.o
> CC /tmp/tmp.dQoIXBCebw/tests/pmu.o
> CC /tmp/tmp.dQoIXBCebw/util/cpumap.o
> CC /tmp/tmp.dQoIXBCebw/tests/hists_common.o
> CC /tmp/tmp.dQoIXBCebw/util/cgroup.o
> CC /tmp/tmp.dQoIXBCebw/tests/hists_link.o
> CC /tmp/tmp.dQoIXBCebw/tests/hists_filter.o
> CC /tmp/tmp.dQoIXBCebw/util/target.o
> CC /tmp/tmp.dQoIXBCebw/tests/hists_output.o
> CC /tmp/tmp.dQoIXBCebw/util/rblist.o
> CC /tmp/tmp.dQoIXBCebw/util/intlist.o
> CC /tmp/tmp.dQoIXBCebw/tests/hists_cumulate.o
> CC /tmp/tmp.dQoIXBCebw/util/vdso.o
> CC /tmp/tmp.dQoIXBCebw/util/counts.o
> CC /tmp/tmp.dQoIXBCebw/util/stat.o
> CC /tmp/tmp.dQoIXBCebw/util/stat-shadow.o
> CC /tmp/tmp.dQoIXBCebw/tests/python-use.o
> CC /tmp/tmp.dQoIXBCebw/tests/bp_signal.o
> CC /tmp/tmp.dQoIXBCebw/tests/bp_signal_overflow.o
> CC /tmp/tmp.dQoIXBCebw/util/record.o
> CC /tmp/tmp.dQoIXBCebw/tests/task-exit.o
> CC /tmp/tmp.dQoIXBCebw/tests/sw-clock.o
> CC /tmp/tmp.dQoIXBCebw/util/srcline.o
> CC /tmp/tmp.dQoIXBCebw/tests/mmap-thread-lookup.o
> CC /tmp/tmp.dQoIXBCebw/util/data.o
> CC /tmp/tmp.dQoIXBCebw/tests/thread-mg-share.o
> CC /tmp/tmp.dQoIXBCebw/util/tsc.o
> CC /tmp/tmp.dQoIXBCebw/tests/switch-tracking.o
> CC /tmp/tmp.dQoIXBCebw/util/cloexec.o
> CC /tmp/tmp.dQoIXBCebw/tests/keep-tracking.o
> CC /tmp/tmp.dQoIXBCebw/util/call-path.o
> CC /tmp/tmp.dQoIXBCebw/util/thread-stack.o
> CC /tmp/tmp.dQoIXBCebw/util/parse-branch-options.o
> CC /tmp/tmp.dQoIXBCebw/tests/code-reading.o
> CC /tmp/tmp.dQoIXBCebw/tests/sample-parsing.o
> CC /tmp/tmp.dQoIXBCebw/util/dump-insn.o
> CC /tmp/tmp.dQoIXBCebw/util/parse-regs-options.o
> CC /tmp/tmp.dQoIXBCebw/util/term.o
> CC /tmp/tmp.dQoIXBCebw/tests/parse-no-sample-id-all.o
> CC /tmp/tmp.dQoIXBCebw/util/help-unknown-cmd.o
> CC /tmp/tmp.dQoIXBCebw/tests/kmod-path.o
> CC /tmp/tmp.dQoIXBCebw/util/mem-events.o
> CC /tmp/tmp.dQoIXBCebw/tests/thread-map.o
> CC /tmp/tmp.dQoIXBCebw/util/vsprintf.o
> CC /tmp/tmp.dQoIXBCebw/util/drv_configs.o
> CC /tmp/tmp.dQoIXBCebw/tests/llvm.o
> CC /tmp/tmp.dQoIXBCebw/util/time-utils.o
> CC /tmp/tmp.dQoIXBCebw/tests/bpf.o
> CC /tmp/tmp.dQoIXBCebw/tests/topology.o
> BISON /tmp/tmp.dQoIXBCebw/util/expr-bison.c
> CC /tmp/tmp.dQoIXBCebw/util/symbol-minimal.o
> MKDIR /tmp/tmp.dQoIXBCebw/util/scripting-engines/
> LD /tmp/tmp.dQoIXBCebw/util/scripting-engines/libperf-in.o
> CC /tmp/tmp.dQoIXBCebw/util/zlib.o
> CC /tmp/tmp.dQoIXBCebw/tests/cpumap.o
> CC /tmp/tmp.dQoIXBCebw/util/lzma.o
> CC /tmp/tmp.dQoIXBCebw/tests/stat.o
> CC /tmp/tmp.dQoIXBCebw/tests/event_update.o
> CC /tmp/tmp.dQoIXBCebw/util/demangle-java.o
> CC /tmp/tmp.dQoIXBCebw/tests/event-times.o
> CC /tmp/tmp.dQoIXBCebw/util/demangle-rust.o
> CC /tmp/tmp.dQoIXBCebw/tests/expr.o
> CC /tmp/tmp.dQoIXBCebw/util/perf-hooks.o
> CC /tmp/tmp.dQoIXBCebw/tests/backward-ring-buffer.o
> CC /tmp/tmp.dQoIXBCebw/tests/sdt.o
> CC /tmp/tmp.dQoIXBCebw/tests/is_printable_array.o
> FLEX /tmp/tmp.dQoIXBCebw/util/parse-events-flex.c
> FLEX /tmp/tmp.dQoIXBCebw/util/pmu-flex.c
> CC /tmp/tmp.dQoIXBCebw/util/pmu-bison.o
> CC /tmp/tmp.dQoIXBCebw/tests/bitmap.o
> CC /tmp/tmp.dQoIXBCebw/tests/perf-hooks.o
> CC /tmp/tmp.dQoIXBCebw/tests/clang.o
> CC /tmp/tmp.dQoIXBCebw/util/expr-bison.o
> CC /tmp/tmp.dQoIXBCebw/tests/unit_number__scnprintf.o
> CC /tmp/tmp.dQoIXBCebw/tests/llvm-src-base.o
> CC /tmp/tmp.dQoIXBCebw/tests/llvm-src-kbuild.o
> CC /tmp/tmp.dQoIXBCebw/tests/llvm-src-prologue.o
> CC /tmp/tmp.dQoIXBCebw/tests/llvm-src-relocation.o
> LD /tmp/tmp.dQoIXBCebw/tests/perf-in.o
> CC /tmp/tmp.dQoIXBCebw/util/parse-events.o
> CC /tmp/tmp.dQoIXBCebw/util/pmu.o
> CC /tmp/tmp.dQoIXBCebw/util/parse-events-flex.o
> LD /tmp/tmp.dQoIXBCebw/perf-in.o
> CC /tmp/tmp.dQoIXBCebw/util/pmu-flex.o
> LD /tmp/tmp.dQoIXBCebw/util/libperf-in.o
> LD /tmp/tmp.dQoIXBCebw/libperf-in.o
> AR /tmp/tmp.dQoIXBCebw/libperf.a
> LINK /tmp/tmp.dQoIXBCebw/perf
> /tmp/tmp.dQoIXBCebw/libperf.a(libperf-in.o): In function `inline_list__append':
> /home/acme/git/linux/tools/perf/util/srcline.c:47: undefined reference to `dso__demangle_sym'
> collect2: error: ld returned 1 exit status
> Makefile.perf:420: recipe for target '/tmp/tmp.dQoIXBCebw/perf' failed
> make[4]: *** [/tmp/tmp.dQoIXBCebw/perf] Error 1
> Makefile.perf:204: recipe for target 'sub-make' failed
> make[3]: *** [sub-make] Error 2
> Makefile:68: recipe for target 'all' failed
> make[2]: *** [all] Error 2
> tests/make:296: recipe for target 'make_minimal_O' failed
> make[1]: *** [make_minimal_O] Error 1
> Makefile:102: recipe for target 'build-test' failed
> make: *** [build-test] Error 2
> make: Leaving directory '/home/acme/git/linux/tools/perf'
> [acme@jouet linux]$
>
>
>>> That said, Jin, here are some observations that could be improved in the
>>> future (I don't think any of these should hold back merging this feature now):
>>>
>>> For the following example code, built with "-O2 -g" and recorded with
>>> "--call-graph dwarf", I observe some output combinations that could
>>> potentially be improved in the future:
>>>
>>> ~~~~~~~~~~~~~~~~~~~~
>>> #include <complex>
>>> #include <cmath>
>>> #include <random>
>>> #include <iostream>
>>>
>>> using namespace std;
>>>
>>> int main()
>>> {
>>> uniform_real_distribution<double> uniform(-1E5, 1E5);
>>> default_random_engine engine;
>>> double s = 0;
>>> for (int i = 0; i < 10000000; ++i) {
>>> s += norm(complex<double>(uniform(engine), uniform(engine)));
>>> }
>>> cout << s << '\n';
>>> return 0;
>>> }
>>> ~~~~~~~~~~~~~~~~
>>>
>>> #1 duplicated entries when grouping by function:
>>>
>>> ~~~~~~~~~~~~~~~~
>>> perf report --inline --stdio
>>> ...
>>> --35.34%--_start
>>> __libc_start_main
>>> main
>>> main (inline)
>>> std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned
>>> long, 16807ul, 0ul, 2147483647ul> > (inline)
>>> std::uniform_real_distribution<double>::operator()<std::linear_congruential_engine<unsigned
>>> long, 16807ul, 0ul, 2147483647ul> > (inline)
>>> std::__detail::_Adaptor<std::linear_congruential_engine<unsigned
>>> long, 16807ul, 0ul, 2147483647ul>, double>::operator() (inline)
>>> ~~~~~~~~~~~~~~~~
>>>
>>> Here, we see main twice, once for the "real" frame, and once for an inlined
>>> one? Then we see the same function twice as inlined frame, which is also odd.
>>>
>>> ~~~~~~~~~~~~~~~~
>>> perf report --inline --stdio --no-children
>>> ...
>>> 59.81% cpp-inlining libm-2.25.so [.] __hypot_finite
>>> |
>>> ---__hypot_finite
>>> hypot
>>> main
>>> std::norm<double> (inline)
>>> main (inline)
>>> __libc_start_main
>>> _start
>>> ~~~~~~~~~~~~~~~~
>>>
>>> Here we see a confusing output. The first "main" frame below "hypot" is
>>> actually code from the C++ complex header which got inlined into main. That
>>> associates the wrong function name with this frame, i.e. "main" instead of
>>> "std::norm". When the inline stack is shown below we actually see what happens,
>>> i.e. we eventually end up in main again, but of course this output is not the
>>> best as-is.
>>>
>>> But, again: I think these are minor issues, and the feature itself is already
>>> extremely useful and I hope to see it finally merged.
>>>
>>> Thanks again Jin for your good work!
>>>
>>> Cheers
>>>
>>> --
>>> Milian Wolff | [email protected] | Software Engineer
>>> KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
>>> Tel: +49-30-521325470
>>> KDAB - The Qt Experts
>>
Hi Jin / Arnaldo,
I see a build failure with this patch:
On Friday 17 March 2017 03:12 AM, Jin Yao wrote:
> It would be useful for perf to support a mode to query the
> inline stack for a given callgraph address. This would simplify
> finding the right code in code that does a lot of inlining.
>
> srcline.c already contains the code that translates an address to
> filename:line_nr. This patch extends that code so that it can also
> return the inline stack.
...
> + while (getline(&filename, &len, fp) != -1) {
> + if (filename_split(filename, &line_nr) != 1) {
> + free(filename);
> + goto out;
> + }
> +
> + if (inline_list__append(filename, NULL, line_nr, node) != 0)
util/srcline.c: In function ‘addr2inlines’:
util/srcline.c:403:7: error: too few arguments to function ‘inline_list__append’
if (inline_list__append(filename, NULL, line_nr, node) != 0)
^
util/srcline.c:34:12: note: declared here
static int inline_list__append(char *filename, char *funcname, int line_nr,
^
util/srcline.c: At top level:
util/srcline.c:60:13: error: ‘inline_list__reverse’ defined but not used [-Werror=unused-function]
static void inline_list__reverse(struct inline_node *node)
^
cc1: all warnings being treated as errors
mv: cannot stat ‘util/.srcline.o.tmp’: No such file or directory
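
The loop quoted above reads one "filename:line_nr" entry per line and stops unless the split helper returns 1. As a rough illustration of that parsing step, here is a minimal sketch; `demo_split_file_line` is a hypothetical helper written for this example and is not the `filename_split()` from the patch. It assumes the entry is split at the last ':' and that the line number is plain decimal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split "path:123\n" in place into "path" and 123.
 * Returns 1 on success and 0 if the entry is malformed, matching the
 * "!= 1 means stop" convention used by the quoted loop.
 */
int demo_split_file_line(char *entry, unsigned int *line_nr)
{
	char *sep, *end;
	unsigned long value;

	assert(entry != NULL && line_nr != NULL);

	/* getline() keeps the trailing newline; drop it. */
	sep = strchr(entry, '\n');
	if (sep)
		*sep = '\0';

	/* Split at the last ':' so that the path itself may contain colons. */
	sep = strrchr(entry, ':');
	if (!sep || sep == entry)
		return 0;

	/* Require at least one digit after the separator. */
	if (sep[1] < '0' || sep[1] > '9')
		return 0;

	/* Convert the line number and reject trailing garbage. */
	value = strtoul(sep + 1, &end, 10);
	if (*end != '\0')
		return 0;

	*sep = '\0';
	*line_nr = (unsigned int)value;
	return 1;
}
```

An entry such as "rtld.c:307" yields the file name "rtld.c" and line 307, while an entry without a ':' separator is rejected.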
Thanks,
Ravi
Hi Ravi, Arnaldo,
The build error happens when the BFD library is not available in the build
environment. The patch series should handle this case properly. To keep
patch management simple, I am sending a v6 patch series that fixes this
issue. Very sorry for the inconvenience.
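
Both reported errors come from code that is compiled only when one build option is absent: a call site that still uses the old argument list, and a static helper that has no caller in that configuration. The sketch below shows one way to keep two such branches in sync. It is an illustration under assumptions, not the v6 change: `HAVE_DEMO_BFD`, `struct demo_list` and the `demo_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX 8

struct demo_list {
	int lines[DEMO_MAX];
	size_t len;
};

/* Shared by both branches, so a signature change must reach both callers. */
static int demo_list_append(struct demo_list *l, int line_nr)
{
	assert(l != NULL);
	if (l->len >= DEMO_MAX)
		return -1;
	l->lines[l->len++] = line_nr;
	return 0;
}

#ifdef HAVE_DEMO_BFD
/*
 * Only this branch needs to reverse the list. Keeping the helper inside
 * the guard avoids -Werror=unused-function in builds without the macro.
 */
static void demo_list_reverse(struct demo_list *l)
{
	size_t i;
	int tmp;

	for (i = 0; i < l->len / 2; i++) {
		tmp = l->lines[i];
		l->lines[i] = l->lines[l->len - 1 - i];
		l->lines[l->len - 1 - i] = tmp;
	}
}

/* Pretend the library reports frames in the opposite order. */
int demo_collect(struct demo_list *l)
{
	static const int frames[] = { 3, 2, 1 };
	size_t i;

	for (i = 0; i < 3; i++)
		if (demo_list_append(l, frames[i]) != 0)
			return -1;
	demo_list_reverse(l);
	return 0;
}
#else
/* Fallback branch: frames already arrive in the wanted order. */
int demo_collect(struct demo_list *l)
{
	static const int frames[] = { 1, 2, 3 };
	size_t i;

	for (i = 0; i < 3; i++)
		if (demo_list_append(l, frames[i]) != 0)
			return -1;
	return 0;
}
#endif
```

Building such code once with and once without the feature macro, as build-test does for the real options, catches both kinds of error before posting.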
Thanks
Jin Yao
On 3/25/2017 3:18 PM, Ravi Bangoria wrote:
> Hi Jin / Arnaldo,
>
> I see a build failure with this patch:
>
> On Friday 17 March 2017 03:12 AM, Jin Yao wrote:
>> It would be useful for perf to support a mode to query the
>> inline stack for a given callgraph address. This would simplify
>> finding the right code in code that does a lot of inlining.
>>
>> srcline.c already contains the code that translates an address to
>> filename:line_nr. This patch extends that code so that it can also
>> return the inline stack.
> ...
>> + while (getline(&filename, &len, fp) != -1) {
>> + if (filename_split(filename, &line_nr) != 1) {
>> + free(filename);
>> + goto out;
>> + }
>> +
>> + if (inline_list__append(filename, NULL, line_nr, node) != 0)
> util/srcline.c: In function ‘addr2inlines’:
> util/srcline.c:403:7: error: too few arguments to function ‘inline_list__append’
> if (inline_list__append(filename, NULL, line_nr, node) != 0)
> ^
> util/srcline.c:34:12: note: declared here
> static int inline_list__append(char *filename, char *funcname, int line_nr,
> ^
> util/srcline.c: At top level:
> util/srcline.c:60:13: error: ‘inline_list__reverse’ defined but not used [-Werror=unused-function]
> static void inline_list__reverse(struct inline_node *node)
> ^
> cc1: all warnings being treated as errors
> mv: cannot stat ‘util/.srcline.o.tmp’: No such file or directory
>
> Thanks,
> Ravi
>