The test case perf lock contention dumps core on s390. Run the following
commands:
# ./perf lock record -- ./perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 2.799 [sec]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ]
#
# ./perf lock contention
Segmentation fault (core dumped)
#
The function call stack is lengthy, here are the top 5 functions:
# gdb ./perf core.24048
GNU gdb (GDB) Fedora Linux 12.1-6.fc37
Copyright (C) 2022 Free Software Foundation, Inc.
Core was generated by `./perf lock contention'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28,
addr=1789230) at util/machine.c:3356
3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start);
(gdb) where
#0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28,\
addr=1789230) at util/machine.c:3356
#1 0x000000000109f244 in callchain_id (evsel=0x30313e0,\
sample=0x3ffea4f77d0) at builtin-lock.c:957
#2 0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290,\
addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) \
at builtin-lock.c:586
#3 0x000000000109f4d0 in report_lock_contention_begin_event \
(evsel=0x30313e0, sample=0x3ffea4f77d0)
at builtin-lock.c:1004
#4 0x00000000010a00ae in evsel__process_contention_begin \
(evsel=0x30313e0, sample=0x3ffea4f77d0)
at builtin-lock.c:1254
#5 0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, \
event=0x3ff85601ef8, sample=0x3ffea4f77d0,
evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464
sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) \
at util/session.c:1523
.....
The issue is in function machine__is_lock_function() in file
./util/machine.c lines 3355:
/* should not fail from here */
sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end",
&kmap);
machine->sched.text_end = kmap->unmap_ip(kmap, sym->start)
On s390 the symbol __sched_text_end is *NOT* in the symbol list and the
resulting pointer sym is set to NULL. The sym->start is then a NULL pointer
access and generates the core dump.
The reason why __sched_text_end is not in the symbol list on s390 is
simple:
When the symbol list is created at perf start up with function calls
dso__load
+--> dso__load_vmlinux_path
+--> dso__load_vmlinux
+--> dso__load_sym
+--> dso__load_sym_internal (reads kernel symbols)
+--> symbols__fixup_end
+--> symbols__fixup_duplicate
The issue is in function symbols__fixup_duplicate(). It deletes all
symbols with have the same address. On s390
# nm -g ~/linux/vmlinux| fgrep c68390
0000000000c68390 T __cpuidle_text_start
0000000000c68390 T __sched_text_end
#
two symbols have identical addresses and __sched_text_end is considered
duplicate (in ascending sort order) and removed from the symbol list.
Therefore it is missing and an invalid pointer reference occurs.
The code checks for symbol __sched_text_start and when it exists assumes
symbol __sched_text_end is also in the symbol table. However this is
not the case on s390.
Same situation exists for symbol __lock_text_start:
0000000000c68770 T __cpuidle_text_end
0000000000c68770 T __lock_text_start
This symbol is also removed from the symbol table but used in function
machine__is_lock_function().
To fix this and keep duplicate symbols in the symbol table, set
symbol_conf.allow_aliases to true. This prevents the removal of duplicate
symbols in function symbols__fixup_duplicate().
Output After:
# ./perf lock contention
contended total wait max wait avg wait type caller
48 124.39 ms 123.99 ms 2.59 ms rwsem:W unlink_anon_vmas+0x24a
47 83.68 ms 83.26 ms 1.78 ms rwsem:W free_pgtables+0x132
5 41.22 us 10.55 us 8.24 us rwsem:W free_pgtables+0x140
4 40.12 us 20.55 us 10.03 us rwsem:W copy_process+0x1ac8
#
Fixes: cc2367eebb0c ("machine: Adopt is_lock_function() from builtin-lock.c")
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Thomas Richter <[email protected]>
---
tools/perf/builtin-lock.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index 718b82bfcdff..506c2fe42d52 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -1670,6 +1670,7 @@ static int __cmd_report(bool display_info)
/* for lock function check */
symbol_conf.sort_by_name = true;
+ symbol_conf.allow_aliases = true;
symbol__init(&session->header.env);
if (!data.is_pipe) {
@@ -1757,6 +1758,7 @@ static int __cmd_contention(int argc, const char **argv)
/* for lock function check */
symbol_conf.sort_by_name = true;
+ symbol_conf.allow_aliases = true;
symbol__init(&session->header.env);
if (use_bpf) {
--
2.38.1
Em Fri, Dec 30, 2022 at 11:26:27AM +0100, Thomas Richter escreveu:
> The test case perf lock contention dumps core on s390. Run the following
> commands:
> # ./perf lock record -- ./perf bench sched messaging
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 2.799 [sec]
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ]
> #
> # ./perf lock contention
> Segmentation fault (core dumped)
> #
>
> The function call stack is lengthy, here are the top 5 functions:
> # gdb ./perf core.24048
> GNU gdb (GDB) Fedora Linux 12.1-6.fc37
> Copyright (C) 2022 Free Software Foundation, Inc.
> Core was generated by `./perf lock contention'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28,
> addr=1789230) at util/machine.c:3356
> 3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start);
>
> (gdb) where
> #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28,\
> addr=1789230) at util/machine.c:3356
> #1 0x000000000109f244 in callchain_id (evsel=0x30313e0,\
> sample=0x3ffea4f77d0) at builtin-lock.c:957
> #2 0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290,\
> addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) \
> at builtin-lock.c:586
> #3 0x000000000109f4d0 in report_lock_contention_begin_event \
> (evsel=0x30313e0, sample=0x3ffea4f77d0)
> at builtin-lock.c:1004
> #4 0x00000000010a00ae in evsel__process_contention_begin \
> (evsel=0x30313e0, sample=0x3ffea4f77d0)
> at builtin-lock.c:1254
> #5 0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, \
> event=0x3ff85601ef8, sample=0x3ffea4f77d0,
> evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464
> sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) \
> at util/session.c:1523
> .....
>
> The issue is in function machine__is_lock_function() in file
> ./util/machine.c lines 3355:
> /* should not fail from here */
> sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end",
> &kmap);
> machine->sched.text_end = kmap->unmap_ip(kmap, sym->start)
>
> On s390 the symbol __sched_text_end is *NOT* in the symbol list and the
> resulting pointer sym is set to NULL. The sym->start is then a NULL pointer
> access and generates the core dump.
>
> The reason why __sched_text_end is not in the symbol list on s390 is
> simple:
> When the symbol list is created at perf start up with function calls
> dso__load
> +--> dso__load_vmlinux_path
> +--> dso__load_vmlinux
> +--> dso__load_sym
> +--> dso__load_sym_internal (reads kernel symbols)
> +--> symbols__fixup_end
> +--> symbols__fixup_duplicate
>
> The issue is in function symbols__fixup_duplicate(). It deletes all
> symbols with have the same address. On s390
> # nm -g ~/linux/vmlinux| fgrep c68390
> 0000000000c68390 T __cpuidle_text_start
> 0000000000c68390 T __sched_text_end
> #
> two symbols have identical addresses and __sched_text_end is considered
> duplicate (in ascending sort order) and removed from the symbol list.
> Therefore it is missing and an invalid pointer reference occurs.
> The code checks for symbol __sched_text_start and when it exists assumes
> symbol __sched_text_end is also in the symbol table. However this is
> not the case on s390.
>
> Same situation exists for symbol __lock_text_start:
> 0000000000c68770 T __cpuidle_text_end
> 0000000000c68770 T __lock_text_start
> This symbol is also removed from the symbol table but used in function
> machine__is_lock_function().
>
> To fix this and keep duplicate symbols in the symbol table, set
> symbol_conf.allow_aliases to true. This prevents the removal of duplicate
> symbols in function symbols__fixup_duplicate().
>
> Output After:
> # ./perf lock contention
> contended total wait max wait avg wait type caller
>
> 48 124.39 ms 123.99 ms 2.59 ms rwsem:W unlink_anon_vmas+0x24a
> 47 83.68 ms 83.26 ms 1.78 ms rwsem:W free_pgtables+0x132
> 5 41.22 us 10.55 us 8.24 us rwsem:W free_pgtables+0x140
> 4 40.12 us 20.55 us 10.03 us rwsem:W copy_process+0x1ac8
> #
>
> Fixes: cc2367eebb0c ("machine: Adopt is_lock_function() from builtin-lock.c")
Humm, is that really the cset that introduces the problem? It just moves
things around, the cset that introduced the is_lock_function() function,
that assumed that __sched_text_end was always available was:
commit 0d2997f750d1de394231bc22768dab94a5b5db2f
Author: Namhyung Kim <[email protected]>
Date: Wed Jun 15 09:32:22 2022 -0700
perf lock: Look up callchain for the contended locks
---
Right? Namhyung? Can you spot any problem in enabling duplicates as a
fix?
- Arnaldo
> Cc: Arnaldo Carvalho de Melo <[email protected]>
> Signed-off-by: Thomas Richter <[email protected]>
> ---
> tools/perf/builtin-lock.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
> index 718b82bfcdff..506c2fe42d52 100644
> --- a/tools/perf/builtin-lock.c
> +++ b/tools/perf/builtin-lock.c
> @@ -1670,6 +1670,7 @@ static int __cmd_report(bool display_info)
>
> /* for lock function check */
> symbol_conf.sort_by_name = true;
> + symbol_conf.allow_aliases = true;
> symbol__init(&session->header.env);
>
> if (!data.is_pipe) {
> @@ -1757,6 +1758,7 @@ static int __cmd_contention(int argc, const char **argv)
>
> /* for lock function check */
> symbol_conf.sort_by_name = true;
> + symbol_conf.allow_aliases = true;
> symbol__init(&session->header.env);
>
> if (use_bpf) {
> --
> 2.38.1
--
- Arnaldo
Hi Arnaldo,
On Mon, Jan 2, 2023 at 7:33 AM Arnaldo Carvalho de Melo <[email protected]> wrote:
>
> Em Fri, Dec 30, 2022 at 11:26:27AM +0100, Thomas Richter escreveu:
> > The test case perf lock contention dumps core on s390. Run the following
> > commands:
> > # ./perf lock record -- ./perf bench sched messaging
> > # Running 'sched/messaging' benchmark:
> > # 20 sender and receiver processes per group
> > # 10 groups == 400 processes run
> >
> > Total time: 2.799 [sec]
> > [ perf record: Woken up 1 times to write data ]
> > [ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ]
> > #
> > # ./perf lock contention
> > Segmentation fault (core dumped)
> > #
> >
> > The function call stack is lengthy, here are the top 5 functions:
> > # gdb ./perf core.24048
> > GNU gdb (GDB) Fedora Linux 12.1-6.fc37
> > Copyright (C) 2022 Free Software Foundation, Inc.
> > Core was generated by `./perf lock contention'.
> > Program terminated with signal SIGSEGV, Segmentation fault.
> > #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28,
> > addr=1789230) at util/machine.c:3356
> > 3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start);
> >
> > (gdb) where
> > #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28,\
> > addr=1789230) at util/machine.c:3356
> > #1 0x000000000109f244 in callchain_id (evsel=0x30313e0,\
> > sample=0x3ffea4f77d0) at builtin-lock.c:957
> > #2 0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290,\
> > addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) \
> > at builtin-lock.c:586
> > #3 0x000000000109f4d0 in report_lock_contention_begin_event \
> > (evsel=0x30313e0, sample=0x3ffea4f77d0)
> > at builtin-lock.c:1004
> > #4 0x00000000010a00ae in evsel__process_contention_begin \
> > (evsel=0x30313e0, sample=0x3ffea4f77d0)
> > at builtin-lock.c:1254
> > #5 0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, \
> > event=0x3ff85601ef8, sample=0x3ffea4f77d0,
> > evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464
> > sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) \
> > at util/session.c:1523
> > .....
> >
> > The issue is in function machine__is_lock_function() in file
> > ./util/machine.c lines 3355:
> > /* should not fail from here */
> > sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end",
> > &kmap);
> > machine->sched.text_end = kmap->unmap_ip(kmap, sym->start)
> >
> > On s390 the symbol __sched_text_end is *NOT* in the symbol list and the
> > resulting pointer sym is set to NULL. The sym->start is then a NULL pointer
> > access and generates the core dump.
> >
> > The reason why __sched_text_end is not in the symbol list on s390 is
> > simple:
> > When the symbol list is created at perf start up with function calls
> > dso__load
> > +--> dso__load_vmlinux_path
> > +--> dso__load_vmlinux
> > +--> dso__load_sym
> > +--> dso__load_sym_internal (reads kernel symbols)
> > +--> symbols__fixup_end
> > +--> symbols__fixup_duplicate
> >
> > The issue is in function symbols__fixup_duplicate(). It deletes all
> > symbols with have the same address. On s390
> > # nm -g ~/linux/vmlinux| fgrep c68390
> > 0000000000c68390 T __cpuidle_text_start
> > 0000000000c68390 T __sched_text_end
> > #
> > two symbols have identical addresses and __sched_text_end is considered
> > duplicate (in ascending sort order) and removed from the symbol list.
> > Therefore it is missing and an invalid pointer reference occurs.
> > The code checks for symbol __sched_text_start and when it exists assumes
> > symbol __sched_text_end is also in the symbol table. However this is
> > not the case on s390.
> >
> > Same situation exists for symbol __lock_text_start:
> > 0000000000c68770 T __cpuidle_text_end
> > 0000000000c68770 T __lock_text_start
> > This symbol is also removed from the symbol table but used in function
> > machine__is_lock_function().
> >
> > To fix this and keep duplicate symbols in the symbol table, set
> > symbol_conf.allow_aliases to true. This prevents the removal of duplicate
> > symbols in function symbols__fixup_duplicate().
> >
> > Output After:
> > # ./perf lock contention
> > contended total wait max wait avg wait type caller
> >
> > 48 124.39 ms 123.99 ms 2.59 ms rwsem:W unlink_anon_vmas+0x24a
> > 47 83.68 ms 83.26 ms 1.78 ms rwsem:W free_pgtables+0x132
> > 5 41.22 us 10.55 us 8.24 us rwsem:W free_pgtables+0x140
> > 4 40.12 us 20.55 us 10.03 us rwsem:W copy_process+0x1ac8
> > #
> >
> > Fixes: cc2367eebb0c ("machine: Adopt is_lock_function() from builtin-lock.c")
>
> Humm, is that really the cset that introduces the problem? It just moves
> things around, the cset that introduced the is_lock_function() function,
> that assumed that __sched_text_end was always available was:
>
> commit 0d2997f750d1de394231bc22768dab94a5b5db2f
> Author: Namhyung Kim <[email protected]>
> Date: Wed Jun 15 09:32:22 2022 -0700
>
> perf lock: Look up callchain for the contended locks
>
> ---
>
> Right? Namhyung? Can you spot any problem in enabling duplicates as a
> fix?
Yep, I think that's the cset introduced the problem.
I'm fine with the fix.
Acked-by: Namhyung Kim <[email protected]>
Thanks,
Namhyung
Em Tue, Jan 03, 2023 at 08:45:34AM -0800, Namhyung Kim escreveu:
> Hi Arnaldo,
>
> On Mon, Jan 2, 2023 at 7:33 AM Arnaldo Carvalho de Melo <[email protected]> wrote:
> >
> > Em Fri, Dec 30, 2022 at 11:26:27AM +0100, Thomas Richter escreveu:
> > > The test case perf lock contention dumps core on s390. Run the following
> > > commands:
> > > # ./perf lock record -- ./perf bench sched messaging
> > > # Running 'sched/messaging' benchmark:
> > > # 20 sender and receiver processes per group
> > > # 10 groups == 400 processes run
> > >
> > > Total time: 2.799 [sec]
> > > [ perf record: Woken up 1 times to write data ]
> > > [ perf record: Captured and wrote 0.073 MB perf.data (100 samples) ]
> > > #
> > > # ./perf lock contention
> > > Segmentation fault (core dumped)
> > > #
> > >
> > > The function call stack is lengthy, here are the top 5 functions:
> > > # gdb ./perf core.24048
> > > GNU gdb (GDB) Fedora Linux 12.1-6.fc37
> > > Copyright (C) 2022 Free Software Foundation, Inc.
> > > Core was generated by `./perf lock contention'.
> > > Program terminated with signal SIGSEGV, Segmentation fault.
> > > #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28,
> > > addr=1789230) at util/machine.c:3356
> > > 3356 machine->sched.text_end = kmap->unmap_ip(kmap, sym->start);
> > >
> > > (gdb) where
> > > #0 0x00000000011dd25c in machine__is_lock_function (machine=0x3029e28,\
> > > addr=1789230) at util/machine.c:3356
> > > #1 0x000000000109f244 in callchain_id (evsel=0x30313e0,\
> > > sample=0x3ffea4f77d0) at builtin-lock.c:957
> > > #2 0x000000000109e094 in get_key_by_aggr_mode (key=0x3ffea4f7290,\
> > > addr=27758136, evsel=0x30313e0, sample=0x3ffea4f77d0) \
> > > at builtin-lock.c:586
> > > #3 0x000000000109f4d0 in report_lock_contention_begin_event \
> > > (evsel=0x30313e0, sample=0x3ffea4f77d0)
> > > at builtin-lock.c:1004
> > > #4 0x00000000010a00ae in evsel__process_contention_begin \
> > > (evsel=0x30313e0, sample=0x3ffea4f77d0)
> > > at builtin-lock.c:1254
> > > #5 0x00000000010a0e14 in process_sample_event (tool=0x3ffea4f8480, \
> > > event=0x3ff85601ef8, sample=0x3ffea4f77d0,
> > > evsel=0x30313e0, machine=0x3029e28) at builtin-lock.c:1464
> > > sample=0x3ffea4f77d0, evsel=0x30313e0, machine=0x3029e28) \
> > > at util/session.c:1523
> > > .....
> > >
> > > The issue is in function machine__is_lock_function() in file
> > > ./util/machine.c lines 3355:
> > > /* should not fail from here */
> > > sym = machine__find_kernel_symbol_by_name(machine, "__sched_text_end",
> > > &kmap);
> > > machine->sched.text_end = kmap->unmap_ip(kmap, sym->start)
> > >
> > > On s390 the symbol __sched_text_end is *NOT* in the symbol list and the
> > > resulting pointer sym is set to NULL. The sym->start is then a NULL pointer
> > > access and generates the core dump.
> > >
> > > The reason why __sched_text_end is not in the symbol list on s390 is
> > > simple:
> > > When the symbol list is created at perf start up with function calls
> > > dso__load
> > > +--> dso__load_vmlinux_path
> > > +--> dso__load_vmlinux
> > > +--> dso__load_sym
> > > +--> dso__load_sym_internal (reads kernel symbols)
> > > +--> symbols__fixup_end
> > > +--> symbols__fixup_duplicate
> > >
> > > The issue is in function symbols__fixup_duplicate(). It deletes all
> > > symbols with have the same address. On s390
> > > # nm -g ~/linux/vmlinux| fgrep c68390
> > > 0000000000c68390 T __cpuidle_text_start
> > > 0000000000c68390 T __sched_text_end
> > > #
> > > two symbols have identical addresses and __sched_text_end is considered
> > > duplicate (in ascending sort order) and removed from the symbol list.
> > > Therefore it is missing and an invalid pointer reference occurs.
> > > The code checks for symbol __sched_text_start and when it exists assumes
> > > symbol __sched_text_end is also in the symbol table. However this is
> > > not the case on s390.
> > >
> > > Same situation exists for symbol __lock_text_start:
> > > 0000000000c68770 T __cpuidle_text_end
> > > 0000000000c68770 T __lock_text_start
> > > This symbol is also removed from the symbol table but used in function
> > > machine__is_lock_function().
> > >
> > > To fix this and keep duplicate symbols in the symbol table, set
> > > symbol_conf.allow_aliases to true. This prevents the removal of duplicate
> > > symbols in function symbols__fixup_duplicate().
> > >
> > > Output After:
> > > # ./perf lock contention
> > > contended total wait max wait avg wait type caller
> > >
> > > 48 124.39 ms 123.99 ms 2.59 ms rwsem:W unlink_anon_vmas+0x24a
> > > 47 83.68 ms 83.26 ms 1.78 ms rwsem:W free_pgtables+0x132
> > > 5 41.22 us 10.55 us 8.24 us rwsem:W free_pgtables+0x140
> > > 4 40.12 us 20.55 us 10.03 us rwsem:W copy_process+0x1ac8
> > > #
> > >
> > > Fixes: cc2367eebb0c ("machine: Adopt is_lock_function() from builtin-lock.c")
> >
> > Humm, is that really the cset that introduces the problem? It just moves
> > things around, the cset that introduced the is_lock_function() function,
> > that assumed that __sched_text_end was always available was:
> >
> > commit 0d2997f750d1de394231bc22768dab94a5b5db2f
> > Author: Namhyung Kim <[email protected]>
> > Date: Wed Jun 15 09:32:22 2022 -0700
> >
> > perf lock: Look up callchain for the contended locks
> >
> > ---
> >
> > Right? Namhyung? Can you spot any problem in enabling duplicates as a
> > fix?
>
> Yep, I think that's the cset introduced the problem.
> I'm fine with the fix.
>
> Acked-by: Namhyung Kim <[email protected]>
Thanks, applied.
- Arnaldo