2020-04-15 22:17:55

by Thomas Richter

[permalink] [raw]
Subject: [PATCH] perf symbol: Fix kernel symbol address display

Running commands

./perf record -e rb0000 -- find .
./perf report -v

reveals symbol names and its addresses. There is a mismatch between
kernel symbol and address. Here is an example for kernel symbol
check_chain_key:

3.55% find /lib/modules/.../build/vmlinux 0xf11ec v [k] check_chain_key

This address is off by 0xff000 as can be seen with:

[root@t35lp46 ~]# fgrep check_chain_key /proc/kallsyms
00000000001f00d0 t check_chain_key
[root@t35lp46 ~]# objdump -t ~/linux/vmlinux| fgrep check_chain_key
00000000001f00d0 l F .text 00000000000001e8 check_chain_key
[root@t35lp46 ~]#

This function is located in main memory 0x1f00d0 - 0x1f02b4. It has
several entries in the perf data file with the correct address:

[root@t35lp46 perf]# ./perf report -D -i perf.data.find-bad | \
fgrep SAMPLE| fgrep 0x1f01ec
PERF_RECORD_SAMPLE(IP, 0x1): 22228/22228: 0x1f01ec period: 1300000 addr: 0
PERF_RECORD_SAMPLE(IP, 0x1): 22228/22228: 0x1f01ec period: 1300000 addr: 0

The root cause happens when reading symbol tables during perf report.
A long gdb call chain leads to

machine__deliver_events
perf_evlist__deliver_event
perf_evlist__deliver_sample
build_id__mark_dso_hits
thread__find_map(1) Read correct address from sample entry
map__load
dso__load Some more functions to end up in
....
dso__load_sym.

Function dso__load_syms checks for kernel relocation and symbol
adjustment for the kernel and results in kernel map adjustment of
kernel .text segment address (0x100000 on s390)
kernel .text segment offset in file (0x1000 on s390).
This results in all kernel symbol addresses to be changed by subtracting
0xff000 (on s390). For the symbol check_chain_key we end up with

0x1f00d0 - 0x100000 + 0x1000 = 0xf11d0

and this address is saved in the perf symbol table. This calculation is
also applied by the mapping functions map__mapip() and map__unmapip()
to map IP addresses to dso mappings.

During perf report processing functions

process_sample_event (builtin-report.c)
machine__resolve
thread__find_map
hist_entry_iter_add

are called. Function thread__find_map(1)
takes the correct sample address and applies the mapping function
map__mapip() from the kernel dso and saves the modified address
in struct addr_location for further reference. From now on this address
is used.

Funktion process_sample_event() then calls hist_entry_iter_add() to save
the address in member ip of struct hist_entry.

When samples are displayed using

perf_evlist__tty_browse_hists
hists__fprintf
hist_entry__fprintf
hist_entry__snprintf
__hist_entry__snprintf
_hist_entry__sym_snprintf()

This simply displays the address of the symbol and ignores the dso <-> map
mappings done in function thread__find_map. This leads to the address
mismatch.

Output before:

ot@t35lp46 perf]# ./perf report -v | fgrep check_chain_key
3.55% find /lib/modules/5.6.0d-perf+/build/vmlinux
0xf11ec v [k] check_chain_key
[root@t35lp46 perf]#

Output after:

[root@t35lp46 perf]# ./perf report -v | fgrep check_chain_key
3.55% find /lib/modules/5.6.0d-perf+/build/vmlinux
0x1f01ec v [k] check_chain_key
[root@t35lp46 perf]#

Signed-off-by: Thomas Richter <[email protected]>
Acked-by: Sumanth Korikkar <[email protected]>
---
tools/perf/util/sort.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index ab0cfd790ad0..0695b3f6460f 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -297,8 +297,14 @@ static int _hist_entry__sym_snprintf(struct map_symbol *ms,

if (verbose > 0) {
char o = map ? dso__symtab_origin(map->dso) : '!';
+ u64 rip = ip;
+
+ if (map && map->dso && map->dso->kernel
+ && map->dso->adjust_symbols)
+ rip = map->unmap_ip(map, ip);
+
ret += repsep_snprintf(bf, size, "%-#*llx %c ",
- BITS_PER_LONG / 4 + 2, ip, o);
+ BITS_PER_LONG / 4 + 2, rip, o);
}

ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);
--
2.25.1


2020-04-20 20:47:46

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf symbol: Fix kernel symbol address display

Em Wed, Apr 15, 2020 at 09:07:44AM +0200, Thomas Richter escreveu:
> Running commands
>
> ./perf record -e rb0000 -- find .
> ./perf report -v

Or when pressing 'V' in the TUI.

> reveals symbol names and its addresses. There is a mismatch between

Yeah, an address that at some point was put there to help with debugging
the symbol resolution, IIRC how it looked like when looking at

readelf -sW vmlinux

Or any other DSO, for instance, for a glibc symbol here:

Using 'perf report -s pid,dso,sym' then pressing 'V':

1.55% 20325:perf /usr/lib64/libc-2.30.so 0x161825 B [.] __strlen_avx2

[acme@five perf]$ readelf -sW /usr/lib64/libc-2.30.so | grep strlen_avx2
24371: 0000000000161810 414 FUNC LOCAL DEFAULT 15 __strlen_avx2
[acme@five perf]$

Can you check if doing in /lib/modules/.../build/vmlinux produces what
appears when 'V' is in place?

And perhaps we can also show the DSO offset and the rip as it gets laid
out in memory in the end? So we have all the informations?

- Arnaldo

> kernel symbol and address. Here is an example for kernel symbol
> check_chain_key:
>
> 3.55% find /lib/modules/.../build/vmlinux 0xf11ec v [k] check_chain_key
>
> This address is off by 0xff000 as can be seen with:
>
> [root@t35lp46 ~]# fgrep check_chain_key /proc/kallsyms
> 00000000001f00d0 t check_chain_key
> [root@t35lp46 ~]# objdump -t ~/linux/vmlinux| fgrep check_chain_key
> 00000000001f00d0 l F .text 00000000000001e8 check_chain_key
> [root@t35lp46 ~]#
>
> This function is located in main memory 0x1f00d0 - 0x1f02b4. It has
> several entries in the perf data file with the correct address:
>
> [root@t35lp46 perf]# ./perf report -D -i perf.data.find-bad | \
> fgrep SAMPLE| fgrep 0x1f01ec
> PERF_RECORD_SAMPLE(IP, 0x1): 22228/22228: 0x1f01ec period: 1300000 addr: 0
> PERF_RECORD_SAMPLE(IP, 0x1): 22228/22228: 0x1f01ec period: 1300000 addr: 0
>
> The root cause happens when reading symbol tables during perf report.
> A long gdb call chain leads to
>
> machine__deliver_events
> perf_evlist__deliver_event
> perf_evlist__deliver_sample
> build_id__mark_dso_hits
> thread__find_map(1) Read correct address from sample entry
> map__load
> dso__load Some more functions to end up in
> ....
> dso__load_sym.
>
> Function dso__load_syms checks for kernel relocation and symbol
> adjustment for the kernel and results in kernel map adjustment of
> kernel .text segment address (0x100000 on s390)
> kernel .text segment offset in file (0x1000 on s390).
> This results in all kernel symbol addresses to be changed by subtracting
> 0xff000 (on s390). For the symbol check_chain_key we end up with
>
> 0x1f00d0 - 0x100000 + 0x1000 = 0xf11d0
>
> and this address is saved in the perf symbol table. This calculation is
> also applied by the mapping functions map__mapip() and map__unmapip()
> to map IP addresses to dso mappings.
>
> During perf report processing functions
>
> process_sample_event (builtin-report.c)
> machine__resolve
> thread__find_map
> hist_entry_iter_add
>
> are called. Function thread__find_map(1)
> takes the correct sample address and applies the mapping function
> map__mapip() from the kernel dso and saves the modified address
> in struct addr_location for further reference. From now on this address
> is used.
>
> Funktion process_sample_event() then calls hist_entry_iter_add() to save
> the address in member ip of struct hist_entry.
>
> When samples are displayed using
>
> perf_evlist__tty_browse_hists
> hists__fprintf
> hist_entry__fprintf
> hist_entry__snprintf
> __hist_entry__snprintf
> _hist_entry__sym_snprintf()
>
> This simply displays the address of the symbol and ignores the dso <-> map
> mappings done in function thread__find_map. This leads to the address
> mismatch.
>
> Output before:
>
> ot@t35lp46 perf]# ./perf report -v | fgrep check_chain_key
> 3.55% find /lib/modules/5.6.0d-perf+/build/vmlinux
> 0xf11ec v [k] check_chain_key
> [root@t35lp46 perf]#
>
> Output after:
>
> [root@t35lp46 perf]# ./perf report -v | fgrep check_chain_key
> 3.55% find /lib/modules/5.6.0d-perf+/build/vmlinux
> 0x1f01ec v [k] check_chain_key
> [root@t35lp46 perf]#
>
> Signed-off-by: Thomas Richter <[email protected]>
> Acked-by: Sumanth Korikkar <[email protected]>
> ---
> tools/perf/util/sort.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index ab0cfd790ad0..0695b3f6460f 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -297,8 +297,14 @@ static int _hist_entry__sym_snprintf(struct map_symbol *ms,
>
> if (verbose > 0) {
> char o = map ? dso__symtab_origin(map->dso) : '!';
> + u64 rip = ip;
> +
> + if (map && map->dso && map->dso->kernel
> + && map->dso->adjust_symbols)
> + rip = map->unmap_ip(map, ip);
> +
> ret += repsep_snprintf(bf, size, "%-#*llx %c ",
> - BITS_PER_LONG / 4 + 2, ip, o);
> + BITS_PER_LONG / 4 + 2, rip, o);
> }
>
> ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);
> --
> 2.25.1
>

--

- Arnaldo

2020-04-21 11:20:14

by Thomas Richter

[permalink] [raw]
Subject: Re: [PATCH] perf symbol: Fix kernel symbol address display

On 4/20/20 10:46 PM, Arnaldo Carvalho de Melo wrote:
> Em Wed, Apr 15, 2020 at 09:07:44AM +0200, Thomas Richter escreveu:
>> Running commands
>>
>> ./perf record -e rb0000 -- find .
>> ./perf report -v
>
> Or when pressing 'V' in the TUI.
>
>> reveals symbol names and its addresses. There is a mismatch between
>
> Yeah, an address that at some point was put there to help with debugging
> the symbol resolution, IIRC how it looked like when looking at
>
> readelf -sW vmlinux
>
> Or any other DSO, for instance, for a glibc symbol here:
>
> Using 'perf report -s pid,dso,sym' then pressing 'V':
>
> 1.55% 20325:perf /usr/lib64/libc-2.30.so 0x161825 B [.] __strlen_avx2
>
> [acme@five perf]$ readelf -sW /usr/lib64/libc-2.30.so | grep strlen_avx2
> 24371: 0000000000161810 414 FUNC LOCAL DEFAULT 15 __strlen_avx2
> [acme@five perf]$
>
> Can you check if doing in /lib/modules/.../build/vmlinux produces what
> appears when 'V' is in place?
>

Yes, I used latest 5.7.0rc2 and did
./perf record -e rb0000 -- find
./perf report -s pid,dso,sym
and pressed 'V' in the TUI.

This gives
6.06% 9986:find /lib/modules/5.7 0x1f0c86 v [k] check_chain_key
and
[root@m35lp76 ~]# readelf -sW /lib/modules/5.7.0-rc2d-perf+/build/vmlinux | fgrep check_chain_key
20698: 00000000001f0c70 486 FUNC LOCAL DEFAULT 1 check_chain_key
[root@m35lp76 ~]#

which is perfectly good and in the range.

> And perhaps we can also show the DSO offset and the rip as it gets laid
> out in memory in the end? So we have all the informations?
>
> - Arnaldo
>

I do not follow you here, do you mean something like DSO-name+offset
for example libc-2.30.so+0x123?

Thanks

--
Thomas Richter, Dept 3252, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

2020-04-24 12:39:05

by Thomas Richter

[permalink] [raw]
Subject: Re: [PATCH] perf symbol: Fix kernel symbol address display

On 4/20/20 10:46 PM, Arnaldo Carvalho de Melo wrote:
> Em Wed, Apr 15, 2020 at 09:07:44AM +0200, Thomas Richter escreveu:
>> Running commands
>>
>> ./perf record -e rb0000 -- find .
>> ./perf report -v
>
> Or when pressing 'V' in the TUI.
>
>> reveals symbol names and its addresses. There is a mismatch between
>
> Yeah, an address that at some point was put there to help with debugging
> the symbol resolution, IIRC how it looked like when looking at
>
> readelf -sW vmlinux
>
> Or any other DSO, for instance, for a glibc symbol here:
>
> Using 'perf report -s pid,dso,sym' then pressing 'V':
>
> 1.55% 20325:perf /usr/lib64/libc-2.30.so 0x161825 B [.] __strlen_avx2
>
> [acme@five perf]$ readelf -sW /usr/lib64/libc-2.30.so | grep strlen_avx2
> 24371: 0000000000161810 414 FUNC LOCAL DEFAULT 15 __strlen_avx2
> [acme@five perf]$
>
> Can you check if doing in /lib/modules/.../build/vmlinux produces what
> appears when 'V' is in place?
>
> And perhaps we can also show the DSO offset and the rip as it gets laid
> out in memory in the end? So we have all the informations?
>
> - Arnaldo
>

Arnaldo,

having verified this also works in the TUI mode using 'V', are you going to pick
this patch?

Thanks


--
Thomas Richter, Dept 3252, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

2020-04-24 15:11:52

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf symbol: Fix kernel symbol address display

Em Fri, Apr 24, 2020 at 02:37:01PM +0200, Thomas Richter escreveu:
> On 4/20/20 10:46 PM, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Apr 15, 2020 at 09:07:44AM +0200, Thomas Richter escreveu:
> >> Running commands
> >>
> >> ./perf record -e rb0000 -- find .
> >> ./perf report -v
> >
> > Or when pressing 'V' in the TUI.
> >
> >> reveals symbol names and its addresses. There is a mismatch between
> >
> > Yeah, an address that at some point was put there to help with debugging
> > the symbol resolution, IIRC how it looked like when looking at
> >
> > readelf -sW vmlinux
> >
> > Or any other DSO, for instance, for a glibc symbol here:
> >
> > Using 'perf report -s pid,dso,sym' then pressing 'V':
> >
> > 1.55% 20325:perf /usr/lib64/libc-2.30.so 0x161825 B [.] __strlen_avx2
> >
> > [acme@five perf]$ readelf -sW /usr/lib64/libc-2.30.so | grep strlen_avx2
> > 24371: 0000000000161810 414 FUNC LOCAL DEFAULT 15 __strlen_avx2
> > [acme@five perf]$
> >
> > Can you check if doing in /lib/modules/.../build/vmlinux produces what
> > appears when 'V' is in place?
> >
> > And perhaps we can also show the DSO offset and the rip as it gets laid
> > out in memory in the end? So we have all the informations?
> >
> > - Arnaldo
> >
>
> Arnaldo,
>
> having verified this also works in the TUI mode using 'V', are you going to pick
> this patch?

That would be a change in behaviour, the original intent was to show the
value one would get from the ELF symbol table, isn't that the case?

That is why I asked if you would think that providing the information
you want, which is matching to what is in kallsyms, would be
interesting, which I think it is, but how? showing the vmlinux offset
somehow?

> Thanks
>
>
> --
> Thomas Richter, Dept 3252, IBM s390 Linux Development, Boeblingen, Germany
> --
> Vorsitzender des Aufsichtsrats: Matthias Hartmann
> Gesch?ftsf?hrung: Dirk Wittkopp
> Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

--

- Arnaldo

2020-04-24 18:04:33

by Thomas Richter

[permalink] [raw]
Subject: Re: [PATCH] perf symbol: Fix kernel symbol address display

On 4/24/20 5:06 PM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Apr 24, 2020 at 02:37:01PM +0200, Thomas Richter escreveu:
>> On 4/20/20 10:46 PM, Arnaldo Carvalho de Melo wrote:
>>> Em Wed, Apr 15, 2020 at 09:07:44AM +0200, Thomas Richter escreveu:
>>>> Running commands
>>>>
>>>> ./perf record -e rb0000 -- find .
>>>> ./perf report -v
>>>
>>> Or when pressing 'V' in the TUI.
>>>
>>>> reveals symbol names and its addresses. There is a mismatch between
>>>
>>> Yeah, an address that at some point was put there to help with debugging
>>> the symbol resolution, IIRC how it looked like when looking at
>>>
>>> readelf -sW vmlinux
>>>
>>> Or any other DSO, for instance, for a glibc symbol here:
>>>
>>> Using 'perf report -s pid,dso,sym' then pressing 'V':
>>>
>>> 1.55% 20325:perf /usr/lib64/libc-2.30.so 0x161825 B [.] __strlen_avx2
>>>
>>> [acme@five perf]$ readelf -sW /usr/lib64/libc-2.30.so | grep strlen_avx2
>>> 24371: 0000000000161810 414 FUNC LOCAL DEFAULT 15 __strlen_avx2
>>> [acme@five perf]$
>>>
>>> Can you check if doing in /lib/modules/.../build/vmlinux produces what
>>> appears when 'V' is in place?
>>>
>>> And perhaps we can also show the DSO offset and the rip as it gets laid
>>> out in memory in the end? So we have all the informations?
>>>
>>> - Arnaldo
>>>
>>
>> Arnaldo,
>>
>> having verified this also works in the TUI mode using 'V', are you going to pick
>> this patch?
>
> That would be a change in behaviour, the original intent was to show the
> value one would get from the ELF symbol table, isn't that the case?
>

I think there is a misunderstanding.

Your example above refers to libc, which shows the addresses
correctly (on x86 and s390). I was refering to the kernel dso.
The issue only shows up on the kernel DSO (vmlinux and kallsyms)

When I use the -v option (or 'V' in TUI) I get this

3.55% find /lib/modules/.../build/vmlinux 0xf11ec v [k] check_chain_key
^^^^^^^
This address is not correct as compared to readelf on vmlinux file:

[root@m35lp76 linux]# readelf -sW vmlinux| fgrep check_chain_key
20698: 00000000001f0c70 486 FUNC LOCAL DEFAULT 1 check_chain_key
[root@m35lp76 linux]#

The problem are the mapping functions with are applied for the kernel DSO.
In dso__process_kernel_symbol() the mapping is adjusted (around line 903)
and function hist_entry_iter_add() saves the modified address in the
struct hist_entry->ip. This address is used from now on and printed.

Let me know and I will double check this and rework the commit message
as it was not good enough.

Thanks.


> That is why I asked if you would think that providing the information
> you want, which is matching to what is in kallsyms, would be
> interesting, which I think it is, but how? showing the vmlinux offset
> somehow?
>

--
Thomas Richter, Dept 3252, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

2020-04-24 18:26:17

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH] perf symbol: Fix kernel symbol address display

Em Fri, Apr 24, 2020 at 08:01:22PM +0200, Thomas Richter escreveu:
> On 4/24/20 5:06 PM, Arnaldo Carvalho de Melo wrote:
> > Em Fri, Apr 24, 2020 at 02:37:01PM +0200, Thomas Richter escreveu:
> >> On 4/20/20 10:46 PM, Arnaldo Carvalho de Melo wrote:
> >>> Em Wed, Apr 15, 2020 at 09:07:44AM +0200, Thomas Richter escreveu:
> >>>> Running commands
> >>>>
> >>>> ./perf record -e rb0000 -- find .
> >>>> ./perf report -v
> >>>
> >>> Or when pressing 'V' in the TUI.
> >>>
> >>>> reveals symbol names and its addresses. There is a mismatch between
> >>>
> >>> Yeah, an address that at some point was put there to help with debugging
> >>> the symbol resolution, IIRC how it looked like when looking at
> >>>
> >>> readelf -sW vmlinux
> >>>
> >>> Or any other DSO, for instance, for a glibc symbol here:
> >>>
> >>> Using 'perf report -s pid,dso,sym' then pressing 'V':
> >>>
> >>> 1.55% 20325:perf /usr/lib64/libc-2.30.so 0x161825 B [.] __strlen_avx2
> >>>
> >>> [acme@five perf]$ readelf -sW /usr/lib64/libc-2.30.so | grep strlen_avx2
> >>> 24371: 0000000000161810 414 FUNC LOCAL DEFAULT 15 __strlen_avx2
> >>> [acme@five perf]$
> >>>
> >>> Can you check if doing in /lib/modules/.../build/vmlinux produces what
> >>> appears when 'V' is in place?
> >>>
> >>> And perhaps we can also show the DSO offset and the rip as it gets laid
> >>> out in memory in the end? So we have all the informations?
> >>>
> >>> - Arnaldo
> >>>
> >>
> >> Arnaldo,
> >>
> >> having verified this also works in the TUI mode using 'V', are you going to pick
> >> this patch?
> >
> > That would be a change in behaviour, the original intent was to show the
> > value one would get from the ELF symbol table, isn't that the case?
> >
>
> I think there is a misunderstanding.
>
> Your example above refers to libc, which shows the addresses
> correctly (on x86 and s390). I was refering to the kernel dso.
> The issue only shows up on the kernel DSO (vmlinux and kallsyms)
>
> When I use the -v option (or 'V' in TUI) I get this
>
> 3.55% find /lib/modules/.../build/vmlinux 0xf11ec v [k] check_chain_key
> ^^^^^^^
> This address is not correct as compared to readelf on vmlinux file:
>
> [root@m35lp76 linux]# readelf -sW vmlinux| fgrep check_chain_key
> 20698: 00000000001f0c70 486 FUNC LOCAL DEFAULT 1 check_chain_key
> [root@m35lp76 linux]#
>
> The problem are the mapping functions with are applied for the kernel DSO.
> In dso__process_kernel_symbol() the mapping is adjusted (around line 903)
> and function hist_entry_iter_add() saves the modified address in the
> struct hist_entry->ip. This address is used from now on and printed.
>
> Let me know and I will double check this and rework the commit message
> as it was not good enough.

Yeah, this may be me not understanding something right, I saw this was a
corner case, that only happens in debug mode (ok, better provide good
info here...), so I haven't devoted more than cursory attention to this
and may be making a bad judgement, so I thought that since this doesn't
seems urgent, better leave it to when I manage to have solid time to
devote to this or someone else looks at this.

Sorry, but these have been difficult time for everybody, and I fear, for
me, I have to prioritise all the time.

- Arnaldo

> Thanks.
>
>
> > That is why I asked if you would think that providing the information
> > you want, which is matching to what is in kallsyms, would be
> > interesting, which I think it is, but how? showing the vmlinux offset
> > somehow?
> >
>
> --
> Thomas Richter, Dept 3252, IBM s390 Linux Development, Boeblingen, Germany
> --
> Vorsitzender des Aufsichtsrats: Matthias Hartmann
> Gesch?ftsf?hrung: Dirk Wittkopp
> Sitz der Gesellschaft: B?blingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

--

- Arnaldo

2020-04-27 12:27:38

by Thomas Richter

[permalink] [raw]
Subject: Re: [PATCH] perf symbol: Fix kernel symbol address display

On 4/24/20 8:23 PM, Arnaldo Carvalho de Melo wrote:
> Em Fri, Apr 24, 2020 at 08:01:22PM +0200, Thomas Richter escreveu:
>> On 4/24/20 5:06 PM, Arnaldo Carvalho de Melo wrote:
>>> Em Fri, Apr 24, 2020 at 02:37:01PM +0200, Thomas Richter escreveu:
>>>> On 4/20/20 10:46 PM, Arnaldo Carvalho de Melo wrote:
>>>>> Em Wed, Apr 15, 2020 at 09:07:44AM +0200, Thomas Richter escreveu:
>>>>>> Running commands
>>>>>>
>>>>>> ./perf record -e rb0000 -- find .
>>>>>> ./perf report -v
>>>>>
>>>>> Or when pressing 'V' in the TUI.
>>>>>
>>>>>> reveals symbol names and its addresses. There is a mismatch between
>>>>>
>>>>> Yeah, an address that at some point was put there to help with debugging
>>>>> the symbol resolution, IIRC how it looked like when looking at
>>>>>
>>>>> readelf -sW vmlinux
>>>>>
>>>>> Or any other DSO, for instance, for a glibc symbol here:
>>>>>
>>>>> Using 'perf report -s pid,dso,sym' then pressing 'V':
>>>>>
>>>>> 1.55% 20325:perf /usr/lib64/libc-2.30.so 0x161825 B [.] __strlen_avx2
>>>>>
>>>>> [acme@five perf]$ readelf -sW /usr/lib64/libc-2.30.so | grep strlen_avx2
>>>>> 24371: 0000000000161810 414 FUNC LOCAL DEFAULT 15 __strlen_avx2
>>>>> [acme@five perf]$
>>>>>
>>>>> Can you check if doing in /lib/modules/.../build/vmlinux produces what
>>>>> appears when 'V' is in place?
>>>>>
>>>>> And perhaps we can also show the DSO offset and the rip as it gets laid
>>>>> out in memory in the end? So we have all the informations?
>>>>>
>>>>> - Arnaldo
>>>>>
>>>>
>>>> Arnaldo,
>>>>
>>>> having verified this also works in the TUI mode using 'V', are you going to pick
>>>> this patch?
>>>
>>> That would be a change in behaviour, the original intent was to show the
>>> value one would get from the ELF symbol table, isn't that the case?
>>>
>>
>> I think there is a misunderstanding.
>>
>> Your example above refers to libc, which shows the addresses
>> correctly (on x86 and s390). I was refering to the kernel dso.
>> The issue only shows up on the kernel DSO (vmlinux and kallsyms)
>>
>> When I use the -v option (or 'V' in TUI) I get this
>>
>> 3.55% find /lib/modules/.../build/vmlinux 0xf11ec v [k] check_chain_key
>> ^^^^^^^
>> This address is not correct as compared to readelf on vmlinux file:
>>
>> [root@m35lp76 linux]# readelf -sW vmlinux| fgrep check_chain_key
>> 20698: 00000000001f0c70 486 FUNC LOCAL DEFAULT 1 check_chain_key
>> [root@m35lp76 linux]#
>>
>> The problem are the mapping functions with are applied for the kernel DSO.
>> In dso__process_kernel_symbol() the mapping is adjusted (around line 903)
>> and function hist_entry_iter_add() saves the modified address in the
>> struct hist_entry->ip. This address is used from now on and printed.
>>
>> Let me know and I will double check this and rework the commit message
>> as it was not good enough.
>
> Yeah, this may be me not understanding something right, I saw this was a
> corner case, that only happens in debug mode (ok, better provide good
> info here...), so I haven't devoted more than cursory attention to this
> and may be making a bad judgement, so I thought that since this doesn't
> seems urgent, better leave it to when I manage to have solid time to
> devote to this or someone else looks at this.
>
> Sorry, but these have been difficult time for everybody, and I fear, for
> me, I have to prioritise all the time.
>
> - Arnaldo
>

Yes, it is not easy these days.
I just wanted to make sure this patch is not lost. I already have been
asked when this will be fixed.

--
Thomas Richter, Dept 3252, IBM s390 Linux Development, Boeblingen, Germany
--
Vorsitzender des Aufsichtsrats: Matthias Hartmann
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen / Registergericht: Amtsgericht Stuttgart, HRB 243294

Subject: [tip: perf/core] perf symbol: Fix kernel symbol address display

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 51d9635582c55479bf7904fa47e39aef2b323a50
Gitweb: https://git.kernel.org/tip/51d9635582c55479bf7904fa47e39aef2b323a50
Author: Thomas Richter <[email protected]>
AuthorDate: Wed, 15 Apr 2020 09:07:44 +02:00
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitterDate: Tue, 05 May 2020 16:35:32 -03:00

perf symbol: Fix kernel symbol address display

Running commands

./perf record -e rb0000 -- find .
./perf report -v

reveals symbol names and its addresses. There is a mismatch between
kernel symbol and address. Here is an example for kernel symbol
check_chain_key:

3.55% find /lib/modules/.../build/vmlinux 0xf11ec v [k] check_chain_key

This address is off by 0xff000 as can be seen with:

[root@t35lp46 ~]# fgrep check_chain_key /proc/kallsyms
00000000001f00d0 t check_chain_key
[root@t35lp46 ~]# objdump -t ~/linux/vmlinux| fgrep check_chain_key
00000000001f00d0 l F .text 00000000000001e8 check_chain_key
[root@t35lp46 ~]#

This function is located in main memory 0x1f00d0 - 0x1f02b4. It has
several entries in the perf data file with the correct address:

[root@t35lp46 perf]# ./perf report -D -i perf.data.find-bad | \
fgrep SAMPLE| fgrep 0x1f01ec
PERF_RECORD_SAMPLE(IP, 0x1): 22228/22228: 0x1f01ec period: 1300000 addr: 0
PERF_RECORD_SAMPLE(IP, 0x1): 22228/22228: 0x1f01ec period: 1300000 addr: 0

The root cause happens when reading symbol tables during perf report.
A long gdb call chain leads to

machine__deliver_events
perf_evlist__deliver_event
perf_evlist__deliver_sample
build_id__mark_dso_hits
thread__find_map(1) Read correct address from sample entry
map__load
dso__load Some more functions to end up in
....
dso__load_sym.

Function dso__load_syms checks for kernel relocation and symbol
adjustment for the kernel and results in kernel map adjustment of
kernel .text segment address (0x100000 on s390)
kernel .text segment offset in file (0x1000 on s390).
This results in all kernel symbol addresses to be changed by subtracting
0xff000 (on s390). For the symbol check_chain_key we end up with

0x1f00d0 - 0x100000 + 0x1000 = 0xf11d0

and this address is saved in the perf symbol table. This calculation is
also applied by the mapping functions map__mapip() and map__unmapip()
to map IP addresses to dso mappings.

During perf report processing functions

process_sample_event (builtin-report.c)
machine__resolve
thread__find_map
hist_entry_iter_add

are called. Function thread__find_map(1)
takes the correct sample address and applies the mapping function
map__mapip() from the kernel dso and saves the modified address
in struct addr_location for further reference. From now on this address
is used.

Funktion process_sample_event() then calls hist_entry_iter_add() to save
the address in member ip of struct hist_entry.

When samples are displayed using

perf_evlist__tty_browse_hists
hists__fprintf
hist_entry__fprintf
hist_entry__snprintf
__hist_entry__snprintf
_hist_entry__sym_snprintf()

This simply displays the address of the symbol and ignores the dso <-> map
mappings done in function thread__find_map. This leads to the address
mismatch.

Output before:

ot@t35lp46 perf]# ./perf report -v | fgrep check_chain_key
3.55% find /lib/modules/5.6.0d-perf+/build/vmlinux
0xf11ec v [k] check_chain_key
[root@t35lp46 perf]#

Output after:

[root@t35lp46 perf]# ./perf report -v | fgrep check_chain_key
3.55% find /lib/modules/5.6.0d-perf+/build/vmlinux
0x1f01ec v [k] check_chain_key
[root@t35lp46 perf]#

Signed-off-by: Thomas Richter <[email protected]>
Acked-by: Sumanth Korikkar <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/sort.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index dc23b34..c1f8879 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -300,8 +300,14 @@ static int _hist_entry__sym_snprintf(struct map_symbol *ms,

if (verbose > 0) {
char o = map ? dso__symtab_origin(map->dso) : '!';
+ u64 rip = ip;
+
+ if (map && map->dso && map->dso->kernel
+ && map->dso->adjust_symbols)
+ rip = map->unmap_ip(map, ip);
+
ret += repsep_snprintf(bf, size, "%-#*llx %c ",
- BITS_PER_LONG / 4 + 2, ip, o);
+ BITS_PER_LONG / 4 + 2, rip, o);
}

ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);