2024-04-23 07:43:43

by Huacai Chen

[permalink] [raw]
Subject: [PATCH] LoongArch: Fix callchain parse error with kernel tracepoint events

In order to fix perf's callchain parse error for LoongArch, we implement
perf_arch_fetch_caller_regs() which fills several necessary registers
used for callchain unwinding, including sp, fp, and era. This is similar
to the following commits.

commit b3eac0265bf6:
("arm: perf: Fix callchain parse error with kernel tracepoint events")

commit 5b09a094f2fb:
("arm64: perf: Fix callchain parse error with kernel tracepoint events")

commit 9a7e8ec0d4cc:
("riscv: perf: Fix callchain parse error with kernel tracepoint events")

Test with commands:

perf record -e sched:sched_switch -g --call-graph dwarf
perf report

Without this patch:

Children Self Command Shared Object Symbol
........ ........ ............. ................. ....................

43.41% 43.41% swapper [unknown] [k] 0000000000000000

10.94% 10.94% loong-container [unknown] [k] 0000000000000000
|
|--5.98%--0x12006ba38
|
|--2.56%--0x12006bb84
|
--2.40%--0x12006b6b8

With this patch, callchain can be parsed correctly:

Children Self Command Shared Object Symbol
........ ........ ............. ................. ....................

47.57% 47.57% swapper [kernel.vmlinux] [k] __schedule
|
---__schedule

26.76% 26.76% loong-container [kernel.vmlinux] [k] __schedule
|
|--13.78%--0x12006ba38
| |
| |--9.19%--__schedule
| |
| --4.59%--handle_syscall
| do_syscall
| sys_futex
| do_futex
| futex_wait
| futex_wait_queue_me
| hrtimer_start_range_ns
| __schedule
|
|--8.38%--0x12006bb84
| handle_syscall
| do_syscall
| sys_epoll_pwait
| do_epoll_wait
| schedule_hrtimeout_range_clock
| hrtimer_start_range_ns
| __schedule
|
--4.59%--0x12006b6b8
handle_syscall
do_syscall
sys_nanosleep
hrtimer_nanosleep
do_nanosleep
hrtimer_start_range_ns
__schedule

Reported-by: Youling Tang <[email protected]>
Suggested-by: Youling Tang <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
---
arch/loongarch/include/asm/perf_event.h | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/loongarch/include/asm/perf_event.h b/arch/loongarch/include/asm/perf_event.h
index 2a35a0bc2aaa..157c4ace69d0 100644
--- a/arch/loongarch/include/asm/perf_event.h
+++ b/arch/loongarch/include/asm/perf_event.h
@@ -9,4 +9,10 @@

#define perf_arch_bpf_user_pt_regs(regs) (struct user_pt_regs *)regs

+#define perf_arch_fetch_caller_regs(regs, __ip) { \
+ (regs)->csr_era = (__ip); \
+ (regs)->regs[3] = current_stack_pointer; \
+ (regs)->regs[22] = (unsigned long) __builtin_frame_address(0); \
+}
+
#endif /* __LOONGARCH_PERF_EVENT_H__ */
--
2.43.0