2018-10-02 07:40:57

by Milian Wolff

[permalink] [raw]
Subject: [PATCH] perf record: use unmapped IP for inline callchain cursors

Only use the mapped IP to find inline frames, but keep
using the unmapped IP for the callchain cursor. This
ensures we properly show the unmapped IP when displaying
a frame we received via the dso__parse_addr_inlines API
for a module which does not contain sufficient debug symbols
to show the srcline.

Before:
$ perf record -e cycles:u --call-graph ls
$ perf script
...
ls 12853 2735.563911: 43354 cycles:u:
17878 __GI___tunables_init+0xffff01d1d63a0118 (/usr/lib/ld-2.28.so)
19ee9 _dl_sysdep_start+0xffff01d1d63a02e9 (/usr/lib/ld-2.28.so)
3087 _dl_start+0xffff01d1d63a0287 (/usr/lib/ld-2.28.so)
2007 _start+0xffff01d1d63a0007 (/usr/lib/ld-2.28.so)

After:

$ perf script
...
ls 12853 2735.563911: 43354 cycles:u:
7f1714e46878 __GI___tunables_init+0x118 (/usr/lib/ld-2.28.so)
7f1714e48ee9 _dl_sysdep_start+0x2e9 (/usr/lib/ld-2.28.so)
7f1714e32087 _dl_start+0x287 (/usr/lib/ld-2.28.so)
7f1714e31007 _start+0x7 (/usr/lib/ld-2.28.so)

For frames with sufficient debug symbols, the behavior is
still sane and works as expected in my tests.

This patch series shows that we desperately need
an automated test for inline frame resolution. I'll try to
come up with something for the various regressions in the future.

Signed-off-by: Milian Wolff <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Reported-by: Ravi Bangoria <[email protected]>
# Tested-by:
# Reviewed-by:
# Suggested-b:
Fixes: bfe16b0653 ("perf report: Don't crash on invalid inline debug information")
---
tools/perf/util/machine.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 73a651f10a0f..111ae858cbcb 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2286,7 +2286,8 @@ static int append_inlines(struct callchain_cursor *cursor,
if (!symbol_conf.inline_name || !map || !sym)
return ret;

- addr = map__rip_2objdump(map, ip);
+ addr = map__map_ip(map, ip);
+ addr = map__rip_2objdump(map, addr);

inline_node = inlines__tree_find(&map->dso->inlined_nodes, addr);
if (!inline_node) {
@@ -2317,6 +2318,9 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
if (symbol_conf.hide_unresolved && entry->sym == NULL)
return 0;

+ if (append_inlines(cursor, entry->map, entry->sym, entry->ip) == 0)
+ return 0;
+
/*
* Convert entry->ip from a virtual address to an offset in
* its corresponding binary.
@@ -2324,9 +2328,6 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
if (entry->map)
addr = map__map_ip(entry->map, entry->ip);

- if (append_inlines(cursor, entry->map, entry->sym, addr) == 0)
- return 0;
-
srcline = callchain_srcline(entry->map, entry->sym, addr);
return callchain_cursor_append(cursor, entry->ip,
entry->map, entry->sym,
--
2.19.0


2018-10-02 15:34:26

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf record: use unmapped IP for inline callchain cursors

Em Tue, Oct 02, 2018 at 09:39:49AM +0200, Milian Wolff escreveu:
> Only use the mapped IP to find inline frames, but keep
> using the unmapped IP for the callchain cursor. This
> ensures we properly show the unmapped IP when displaying
> a frame we received via the dso__parse_addr_inlines API
> for a module which does not contain sufficient debug symbols
> to show the srcline.

So, the patch came mangled, I fixed it up, please check, I'm adding Ravi
to the CC list, so that he can check as well and retest, right Ravi?

- Arnaldo

From d9ddee9653d5676130471de9bc688fc7ec7b4fc4 Mon Sep 17 00:00:00 2001
From: Milian Wolff <[email protected]>
Date: Tue, 2 Oct 2018 09:39:49 +0200
Subject: [PATCH 1/1] perf record: use unmapped IP for inline callchain cursors

Only use the mapped IP to find inline frames, but keep using the unmapped IP
for the callchain cursor. This ensures we properly show the unmapped IP when
displaying a frame we received via the dso__parse_addr_inlines API for a module
which does not contain sufficient debug symbols to show the srcline.

Before:

$ perf record -e cycles:u --call-graph ls
$ perf script
...
ls 12853 2735.563911: 43354 cycles:u:
17878 __GI___tunables_init+0xffff01d1d63a0118 (/usr/lib/ld-2.28.so)
19ee9 _dl_sysdep_start+0xffff01d1d63a02e9 (/usr/lib/ld-2.28.so)
3087 _dl_start+0xffff01d1d63a0287 (/usr/lib/ld-2.28.so)
2007 _start+0xffff01d1d63a0007 (/usr/lib/ld-2.28.so)

After:

$ perf script
...
ls 12853 2735.563911: 43354 cycles:u:
7f1714e46878 __GI___tunables_init+0x118 (/usr/lib/ld-2.28.so)
7f1714e48ee9 _dl_sysdep_start+0x2e9 (/usr/lib/ld-2.28.so)
7f1714e32087 _dl_start+0x287 (/usr/lib/ld-2.28.so)
7f1714e31007 _start+0x7 (/usr/lib/ld-2.28.so)

For frames with sufficient debug symbols, the behavior is still sane and works
as expected in my tests.

This patch series shows that we desperately need an automated test for inline
frame resolution. I'll try to come up with something for the various
regressions in the future.

Reported-by: Ravi Bangoria <[email protected]>
Signed-off-by: Milian Wolff <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Fixes: bfe16b0653 ("perf report: Don't crash on invalid inline debug information")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/machine.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 73a651f10a0f..111ae858cbcb 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2286,7 +2286,8 @@ static int append_inlines(struct callchain_cursor *cursor,
if (!symbol_conf.inline_name || !map || !sym)
return ret;

- addr = map__rip_2objdump(map, ip);
+ addr = map__map_ip(map, ip);
+ addr = map__rip_2objdump(map, addr);

inline_node = inlines__tree_find(&map->dso->inlined_nodes, addr);
if (!inline_node) {
@@ -2317,6 +2318,9 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
if (symbol_conf.hide_unresolved && entry->sym == NULL)
return 0;

+ if (append_inlines(cursor, entry->map, entry->sym, entry->ip) == 0)
+ return 0;
+
/*
* Convert entry->ip from a virtual address to an offset in
* its corresponding binary.
@@ -2324,9 +2328,6 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
if (entry->map)
addr = map__map_ip(entry->map, entry->ip);

- if (append_inlines(cursor, entry->map, entry->sym, addr) == 0)
- return 0;
-
srcline = callchain_srcline(entry->map, entry->sym, addr);
return callchain_cursor_append(cursor, entry->ip,
entry->map, entry->sym,
--
2.14.4


2018-10-03 03:36:10

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [PATCH] perf record: use unmapped IP for inline callchain cursors



On 10/02/2018 09:02 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, Oct 02, 2018 at 09:39:49AM +0200, Milian Wolff escreveu:
>> Only use the mapped IP to find inline frames, but keep
>> using the unmapped IP for the callchain cursor. This
>> ensures we properly show the unmapped IP when displaying
>> a frame we received via the dso__parse_addr_inlines API
>> for a module which does not contain sufficient debug symbols
>> to show the srcline.
>
> So, the patch came mangled, I fixed it up, please check, I'm adding Ravi
> to the CC list, so that he can check as well and retest, right Ravi?
>
> - Arnaldo
>
> From d9ddee9653d5676130471de9bc688fc7ec7b4fc4 Mon Sep 17 00:00:00 2001
> From: Milian Wolff <[email protected]>
> Date: Tue, 2 Oct 2018 09:39:49 +0200
> Subject: [PATCH 1/1] perf record: use unmapped IP for inline callchain cursors
>
> Only use the mapped IP to find inline frames, but keep using the unmapped IP
> for the callchain cursor. This ensures we properly show the unmapped IP when
> displaying a frame we received via the dso__parse_addr_inlines API for a module
> which does not contain sufficient debug symbols to show the srcline.
>
> Before:
>
> $ perf record -e cycles:u --call-graph ls
> $ perf script
> ...
> ls 12853 2735.563911: 43354 cycles:u:
> 17878 __GI___tunables_init+0xffff01d1d63a0118 (/usr/lib/ld-2.28.so)
> 19ee9 _dl_sysdep_start+0xffff01d1d63a02e9 (/usr/lib/ld-2.28.so)
> 3087 _dl_start+0xffff01d1d63a0287 (/usr/lib/ld-2.28.so)
> 2007 _start+0xffff01d1d63a0007 (/usr/lib/ld-2.28.so)
>
> After:
>
> $ perf script
> ...
> ls 12853 2735.563911: 43354 cycles:u:
> 7f1714e46878 __GI___tunables_init+0x118 (/usr/lib/ld-2.28.so)
> 7f1714e48ee9 _dl_sysdep_start+0x2e9 (/usr/lib/ld-2.28.so)
> 7f1714e32087 _dl_start+0x287 (/usr/lib/ld-2.28.so)
> 7f1714e31007 _start+0x7 (/usr/lib/ld-2.28.so)

With current perf/urgent:

$ sudo ./perf record --call-graph=dwarf -e probe_libc:inet_pton -- ping -6 -c 1 ::1
$ sudo ./perf script
ping 4109 [011] 767269.031273: probe_libc:inet_pton: (7fff944cb458)
15b458 __inet_pton+0xffff0000d7920008 (inlined)
10feb3 gaih_inet.constprop.7+0xffff0000d7920f43 (/usr/lib64/libc-2.26.
110a13 __GI_getaddrinfo+0xffff0000d7920163 (inlined)
13c752d6f _init+0xbfb (/usr/bin/ping)
2371f generic_start_main.isra.0+0xffff0000d792013f (/usr/lib64/libc-2
2391b __libc_start_main+0xffff0000d79200bb (/usr/lib64/libc-2.26.so)

With this patch:

$ sudo ./perf script
ping 4109 [011] 767269.031273: probe_libc:inet_pton: (7fff944cb458)
7fff944cb458 __inet_pton+0x8 (inlined)
7fff9447feb3 gaih_inet.constprop.7+0xf43 (/usr/lib64/libc-2.26.so)
7fff94480a13 __GI_getaddrinfo+0x163 (inlined)
13c752d6f _init+0xbfb (/usr/bin/ping)
7fff9439371f generic_start_main.isra.0+0x13f (/usr/lib64/libc-2.26.so)
7fff9439391b __libc_start_main+0xbb (/usr/lib64/libc-2.26.so)

LGTM.

Tested-by: Ravi Bangoria <[email protected]>


2018-10-05 13:49:07

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf record: use unmapped IP for inline callchain cursors

Em Wed, Oct 03, 2018 at 09:05:37AM +0530, Ravi Bangoria escreveu:
> LGTM.

> Tested-by: Ravi Bangoria <[email protected]>

So, I've added this as a 'git rebase -i' 'fixup', i.e. kept the commit
log message for the patch this patch fixes, and combined the two into
just one patch so that we don't pollute the bisect history, since this
hasn't made it yet to tip, and I also added Ravi's Tested-by, since this
tests both.

- Arnaldo

2018-10-05 14:13:06

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf record: use unmapped IP for inline callchain cursors

Em Tue, Oct 02, 2018 at 09:39:49AM +0200, Milian Wolff escreveu:
> Signed-off-by: Milian Wolff <[email protected]>
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Reported-by: Ravi Bangoria <[email protected]>
> # Tested-by:
> # Reviewed-by:
> # Suggested-b:
> Fixes: bfe16b0653 ("perf report: Don't crash on invalid inline debug information")

No, the patch that Ravi pointed out as causing the regression wasn't the
one above, it was this one:

"perf report: Use the offset address to find inline frames"

I'm reworking this again...

- Arnaldo

> ---
> tools/perf/util/machine.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 73a651f10a0f..111ae858cbcb 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -2286,7 +2286,8 @@ static int append_inlines(struct callchain_cursor *cursor,
> if (!symbol_conf.inline_name || !map || !sym)
> return ret;
>
> - addr = map__rip_2objdump(map, ip);
> + addr = map__map_ip(map, ip);
> + addr = map__rip_2objdump(map, addr);
>
> inline_node = inlines__tree_find(&map->dso->inlined_nodes, addr);
> if (!inline_node) {
> @@ -2317,6 +2318,9 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
> if (symbol_conf.hide_unresolved && entry->sym == NULL)
> return 0;
>
> + if (append_inlines(cursor, entry->map, entry->sym, entry->ip) == 0)
> + return 0;
> +
> /*
> * Convert entry->ip from a virtual address to an offset in
> * its corresponding binary.
> @@ -2324,9 +2328,6 @@ static int unwind_entry(struct unwind_entry *entry, void *arg)
> if (entry->map)
> addr = map__map_ip(entry->map, entry->ip);
>
> - if (append_inlines(cursor, entry->map, entry->sym, addr) == 0)
> - return 0;
> -
> srcline = callchain_srcline(entry->map, entry->sym, addr);
> return callchain_cursor_append(cursor, entry->ip,
> entry->map, entry->sym,
> --
> 2.19.0

Subject: [tip:perf/urgent] perf record: Use unmapped IP for inline callchain cursors

Commit-ID: 7a8a8fcf7b860e4b2d4edc787c844d41cad9dfcf
Gitweb: https://git.kernel.org/tip/7a8a8fcf7b860e4b2d4edc787c844d41cad9dfcf
Author: Milian Wolff <[email protected]>
AuthorDate: Wed, 26 Sep 2018 15:52:06 +0200
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Fri, 5 Oct 2018 11:18:09 -0300

perf record: Use unmapped IP for inline callchain cursors

Only use the mapped IP to find inline frames, but keep using the
unmapped IP for the callchain cursor. This ensures we properly show the
unmapped IP when displaying a frame we received via the
dso__parse_addr_inlines API for a module which does not contain
sufficient debug symbols to show the srcline.

This is another follow-up to commit 19610184693c ("perf script: Show
virtual addresses instead of offsets").

Signed-off-by: Milian Wolff <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Ravi Bangoria <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Sandipan Das <[email protected]>
Fixes: 19610184693c ("perf script: Show virtual addresses instead of offsets")
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Squashed a fix from Milian for a problem reported by Ravi, fixed up space damage ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/machine.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0cb4f8bf3ca7..111ae858cbcb 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2286,7 +2286,8 @@ static int append_inlines(struct callchain_cursor *cursor,
if (!symbol_conf.inline_name || !map || !sym)
return ret;

- addr = map__rip_2objdump(map, ip);
+ addr = map__map_ip(map, ip);
+ addr = map__rip_2objdump(map, addr);

inline_node = inlines__tree_find(&map->dso->inlined_nodes, addr);
if (!inline_node) {

2018-10-08 18:50:02

by Milian Wolff

[permalink] [raw]
Subject: Re: [PATCH] perf record: use unmapped IP for inline callchain cursors

On Freitag, 5. Oktober 2018 15:48:31 CEST Arnaldo Carvalho de Melo wrote:
> Em Wed, Oct 03, 2018 at 09:05:37AM +0530, Ravi Bangoria escreveu:
> > LGTM.
> >
> > Tested-by: Ravi Bangoria <[email protected]>
>
> So, I've added this as a 'git rebase -i' 'fixup', i.e. kept the commit
> log message for the patch this patch fixes, and combined the two into
> just one patch so that we don't pollute the bisect history, since this
> hasn't made it yet to tip, and I also added Ravi's Tested-by, since this
> tests both.

Thanks a lot for the cleanup work Arnaldo.

Cheers

--
Milian Wolff | [email protected] | Senior Software Engineer
KDAB (Deutschland) GmbH, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt, C++ and OpenGL Experts


Attachments:
smime.p7s (3.74 kB)