2020-12-17 12:52:43

by Tiezhu Yang

Subject: Re: [QUESTION] support perf record --call-graph dwarf for mips

On 12/16/2020 11:16 PM, Arnaldo Carvalho de Melo wrote:
> On Wed, Dec 16, 2020 at 11:30:47AM -0300, Arnaldo Carvalho de Melo wrote:
>> On Wed, Dec 16, 2020 at 07:14:02PM +0800, Jiaxun Yang wrote:
>>>
>>> On 2020/12/16 at 6:05 PM, Tiezhu Yang wrote:
>>>> Hi,
>>>>
>>>> In the current upstream mainline kernel, perf record --call-graph dwarf
>>>> is not supported for architecture mips64. I find the following related
>>>> patches about this feature by David Daney <[email protected]> and
>>>> Archer Yan <[email protected]> in Sep 2019.
>>
>>> AFAIK ddaney left Cavium in 2018 and Wave Computing Shanghai is defunct...
>>
>>> Feel free to take over if you like; there is no license issue, just
>>> remember to credit others properly.
>> Ralf, can you take a look at the kernel part? The user space part seems
>> ok.
> I take that back, but made some progress in getting that old patch
> closer to what we have now in tools/perf/, see below.
>
> Someone with a mips system should try to refresh the kernel bits and
> then see if the patch below works.
>
> - Arnaldo
>
>
> commit e59de40addb092d7167fa1dd7c6640d0fab41ede
> Author: David Daney <[email protected]>
> Date: Wed Sep 11 08:26:37 2019 +0000
>
> perf mips: Support mips unwinding and dwarf-regs.
>
> Map perf APIs(perf_reg_name/get_arch_regstr/unwind__arch_reg_id)
> with MIPS specific registers.
>
> [[email protected]: repick this patch for unwinding userstack
> backtrace by perf and libunwind on MIPS based CPU.]
>
> Committer notes:
>
> Some header fixups, replace CONFIG_LIBUNWIND with CONFIG_LOCAL_LIBUNWIND
> to cope with:
>
> 9d8e14d306ef2f5d ("perf unwind: Separate local/remote libunwind config")
>
> Signed-off-by: David Daney <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: [email protected]
> Cc: Peter Zijlstra <[email protected]>
> Cc: Paul Mackerras <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Link: https://lore.kernel.org/r/[email protected]
> Signed-off-by: Archer Yan <[email protected]>
> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
>
> diff --git a/tools/perf/arch/mips/Makefile b/tools/perf/arch/mips/Makefile
> new file mode 100644
> index 0000000000000000..6e1106fab26e4015
> --- /dev/null
> +++ b/tools/perf/arch/mips/Makefile
> @@ -0,0 +1,4 @@
> +# SPDX-License-Identifier: GPL-2.0
> +ifndef NO_DWARF
> +PERF_HAVE_DWARF_REGS := 1
> +endif
> diff --git a/tools/perf/arch/mips/include/perf_regs.h b/tools/perf/arch/mips/include/perf_regs.h
> new file mode 100644
> index 0000000000000000..36a28bc1734787ce
> --- /dev/null
> +++ b/tools/perf/arch/mips/include/perf_regs.h
> @@ -0,0 +1,83 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef ARCH_PERF_REGS_H
> +#define ARCH_PERF_REGS_H
> +
> +#include <stdlib.h>
> +#include <linux/types.h>
> +#include <asm/perf_regs.h>
> +
> +#define PERF_REG_IP PERF_REG_MIPS_PC
> +#define PERF_REG_SP PERF_REG_MIPS_R29
> +
> +#define PERF_REGS_MASK ((1ULL << PERF_REG_MIPS_MAX) - 1)
> +
> +static inline const char *perf_reg_name(int id)
> +{
> + switch (id) {
> + case PERF_REG_MIPS_PC:
> + return "PC";
> + case PERF_REG_MIPS_R1:
> + return "$1";
> + case PERF_REG_MIPS_R2:
> + return "$2";
> + case PERF_REG_MIPS_R3:
> + return "$3";
> + case PERF_REG_MIPS_R4:
> + return "$4";
> + case PERF_REG_MIPS_R5:
> + return "$5";
> + case PERF_REG_MIPS_R6:
> + return "$6";
> + case PERF_REG_MIPS_R7:
> + return "$7";
> + case PERF_REG_MIPS_R8:
> + return "$8";
> + case PERF_REG_MIPS_R9:
> + return "$9";
> + case PERF_REG_MIPS_R10:
> + return "$10";
> + case PERF_REG_MIPS_R11:
> + return "$11";
> + case PERF_REG_MIPS_R12:
> + return "$12";
> + case PERF_REG_MIPS_R13:
> + return "$13";
> + case PERF_REG_MIPS_R14:
> + return "$14";
> + case PERF_REG_MIPS_R15:
> + return "$15";
> + case PERF_REG_MIPS_R16:
> + return "$16";
> + case PERF_REG_MIPS_R17:
> + return "$17";
> + case PERF_REG_MIPS_R18:
> + return "$18";
> + case PERF_REG_MIPS_R19:
> + return "$19";
> + case PERF_REG_MIPS_R20:
> + return "$20";
> + case PERF_REG_MIPS_R21:
> + return "$21";
> + case PERF_REG_MIPS_R22:
> + return "$22";
> + case PERF_REG_MIPS_R23:
> + return "$23";
> + case PERF_REG_MIPS_R24:
> + return "$24";
> + case PERF_REG_MIPS_R25:
> + return "$25";
> + case PERF_REG_MIPS_R28:
> + return "$28";
> + case PERF_REG_MIPS_R29:
> + return "$29";
> + case PERF_REG_MIPS_R30:
> + return "$30";
> + case PERF_REG_MIPS_R31:
> + return "$31";
> + default:
> + break;
> + }
> + return NULL;
> +}
> +
> +#endif /* ARCH_PERF_REGS_H */
> diff --git a/tools/perf/arch/mips/util/Build b/tools/perf/arch/mips/util/Build
> new file mode 100644
> index 0000000000000000..7b0c0457154a22c5
> --- /dev/null
> +++ b/tools/perf/arch/mips/util/Build
> @@ -0,0 +1,2 @@
> +perf-$(CONFIG_DWARF) += dwarf-regs.o
> +perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
> diff --git a/tools/perf/arch/mips/util/dwarf-regs.c b/tools/perf/arch/mips/util/dwarf-regs.c
> new file mode 100644
> index 0000000000000000..165e0179ea11d9b2
> --- /dev/null
> +++ b/tools/perf/arch/mips/util/dwarf-regs.c
> @@ -0,0 +1,37 @@
> +/*
> + * dwarf-regs.c : Mapping of DWARF debug register numbers into register names.
> + *
> + * Copyright (C) 2013 Cavium, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + */
> +
> +#include <libio.h>
> +#include <dwarf-regs.h>
> +
> +static const char *mips_gpr_names[32] = {
> + "$0", "$1", "$2", "$3", "$4", "$5", "$6", "$7", "$8", "$9",
> + "$10", "$11", "$12", "$13", "$14", "$15", "$16", "$17", "$18", "$19",
> + "$20", "$21", "$22", "$23", "$24", "$25", "$26", "$27", "$28", "$29",
> + "$30", "$31"
> +};
> +
> +const char *get_arch_regstr(unsigned int n)
> +{
> + if (n < 32)
> + return mips_gpr_names[n];
> + if (n == 64)
> + return "hi";
> + if (n == 65)
> + return "lo";
> + return NULL;
> +}
> diff --git a/tools/perf/arch/mips/util/unwind-libunwind.c b/tools/perf/arch/mips/util/unwind-libunwind.c
> new file mode 100644
> index 0000000000000000..7af25427943f451a
> --- /dev/null
> +++ b/tools/perf/arch/mips/util/unwind-libunwind.c
> @@ -0,0 +1,21 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <errno.h>
> +#include <libunwind.h>
> +#include "perf_regs.h"
> +#include "../../util/unwind.h"
> +
> +int unwind__arch_reg_id(int regnum)
> +{
> + switch (regnum) {
> + case UNW_MIPS_R1 ... UNW_MIPS_R25:
> + return regnum - UNW_MIPS_R1 + PERF_REG_MIPS_R1;
> + case UNW_MIPS_R28 ... UNW_MIPS_R31:
> + return regnum - UNW_MIPS_R28 + PERF_REG_MIPS_R28;
> + case UNW_MIPS_PC:
> + return PERF_REG_MIPS_PC;
> + default:
> + pr_err("unwind: invalid reg id %d\n", regnum);
> + return -EINVAL;
> + }
> +}
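
For reference, the mapping that unwind__arch_reg_id() performs in the patch above can be sketched as a standalone function. This is only an illustration: the DEMO_* constants are stand-in values I picked for the example (assuming the perf register enum has no slots for $26/$27, matching the gap in the patch); real code takes them from <libunwind.h> and the kernel's <asm/perf_regs.h>.

```c
#include <errno.h>

/* Stand-in libunwind register numbers (illustrative assumption). */
enum {
	DEMO_UNW_R1  = 1,
	DEMO_UNW_R25 = 25,
	DEMO_UNW_R28 = 28,
	DEMO_UNW_R31 = 31,
	DEMO_UNW_PC  = 34,
};

/* Stand-in perf register indexes (illustrative assumption): PC first,
 * then $1..$25, then $28..$31 with no slots for $26/$27. */
enum {
	DEMO_PERF_PC  = 0,
	DEMO_PERF_R1  = 1,
	DEMO_PERF_R28 = 26,
};

/* Translate an unwinder register number into a perf register index.
 * Returns -EINVAL for registers that are not sampled ($0, $26, $27, ...). */
static int demo_unwind_reg_id(int regnum)
{
	if (regnum >= DEMO_UNW_R1 && regnum <= DEMO_UNW_R25)
		return regnum - DEMO_UNW_R1 + DEMO_PERF_R1;
	if (regnum >= DEMO_UNW_R28 && regnum <= DEMO_UNW_R31)
		return regnum - DEMO_UNW_R28 + DEMO_PERF_R28;
	if (regnum == DEMO_UNW_PC)
		return DEMO_PERF_PC;
	return -EINVAL;
}
```

With these stand-in values, demo_unwind_reg_id(28) yields 26 and demo_unwind_reg_id(26) yields -EINVAL, which is the same shape as the two case ranges in the patch, written without the GNU case-range extension.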

(1) The above patch fails to build with the following errors:

LINK perf
perf-in.o: In function `get_dwarf_regstr':
/home/loongson/linux-5.10-rc7/tools/perf/util/dwarf-regs.c:33: undefined reference to `get_arch_regstr'
/home/loongson/linux-5.10-rc7/tools/perf/util/dwarf-regs.c:35: undefined reference to `get_arch_regstr'
collect2: error: ld returned 1 exit status
Makefile.perf:659: recipe for target 'perf' failed
make[2]: *** [perf] Error 1
Makefile.perf:232: recipe for target 'sub-make' failed
make[1]: *** [sub-make] Error 2
Makefile:69: recipe for target 'all' failed
make: *** [all] Error 2

(2) With the following change to tools/perf/arch/mips/Build, the build succeeds:

diff --git a/tools/perf/arch/mips/Build b/tools/perf/arch/mips/Build
index 1bb8bf6d7fd4..54afe4a467e7 100644
--- a/tools/perf/arch/mips/Build
+++ b/tools/perf/arch/mips/Build
@@ -1 +1 @@
-# empty
+libperf-y += util/

(3) Running perf record then fails:

[loongson@linux perf]$ ./perf record --call-graph dwarf cd
Error:
The sys_perf_event_open() syscall returned with 89 (Function not implemented) for event (cycles:u).
/bin/dmesg | grep -i perf may provide additional information.

Call Trace:
record__open
evsel__open()
evsel__open_cpu()
perf_event_open()
evsel__open_strerror

Maybe we need tools/perf/arch/mips/entry/syscalls/syscall.tbl?
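
One way to narrow this down might be to open a simple event with and without user register/stack sampling and compare the results. The sketch below is only a guess at how to isolate the failure, not a confirmed diagnosis: probe_perf_open() is a hypothetical helper name, and the software clock event, sample period and 8192-byte stack size are arbitrary choices for the test.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to open a user-space-only software clock event
 * on the calling thread. If regs_mask is non-zero, also request user
 * register and user stack sampling (the data DWARF unwinding relies on).
 * Returns 0 on success, or a negative errno value on failure.
 */
static int probe_perf_open(uint64_t regs_mask)
{
	struct perf_event_attr attr;
	long fd;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_CPU_CLOCK;
	attr.sample_period = 100000;
	attr.sample_type = PERF_SAMPLE_IP;
	attr.exclude_kernel = 1;
	attr.exclude_hv = 1;
	attr.disabled = 1;

	if (regs_mask) {
		attr.sample_type |= PERF_SAMPLE_REGS_USER | PERF_SAMPLE_STACK_USER;
		attr.sample_regs_user = regs_mask;
		attr.sample_stack_user = 8192;	/* arbitrary size for this probe */
	}

	/* pid = 0 (this thread), cpu = -1 (any), no group, no flags */
	fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
	if (fd < 0)
		return -errno;

	close((int)fd);
	return 0;
}
```

If probe_perf_open(0) succeeds while probe_perf_open(1) returns -ENOSYS, that would suggest the syscall itself is reachable and the missing piece is kernel-side support for user register sampling rather than a syscall table in the tools directory. Other errors (for example -EACCES from perf_event_paranoid) would point elsewhere.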


2020-12-21 08:14:47

by Tiezhu Yang

Subject: Re: [QUESTION] support perf record --call-graph dwarf for mips

On 12/17/2020 08:48 PM, Tiezhu Yang wrote:
> On 12/16/2020 11:16 PM, Arnaldo Carvalho de Melo wrote:
>> On Wed, Dec 16, 2020 at 11:30:47AM -0300, Arnaldo Carvalho de Melo wrote:
>>> On Wed, Dec 16, 2020 at 07:14:02PM +0800, Jiaxun Yang wrote:
>>>>
>>>> On 2020/12/16 at 6:05 PM, Tiezhu Yang wrote:
>>>>> Hi,
>>>>>
>>>>> In the current upstream mainline kernel, perf record --call-graph
>>>>> dwarf
>>>>> is not supported for architecture mips64. I find the following
>>>>> related
>>>>> patches about this feature by David Daney <[email protected]>
>>>>> and
>>>>> Archer Yan <[email protected]> in Sep 2019.
>>>
...
> (3)[loongson@linux perf]$ ./perf record --call-graph dwarf cd
> Error:
> The sys_perf_event_open() syscall returned with 89 (Function not
> implemented) for event (cycles:u).
> /bin/dmesg | grep -i perf may provide additional information.
>
> Call Trace:
> record__open
> evsel__open()
> evsel__open_cpu()
> perf_event_open()
> evsel__open_strerror
>
> Maybe we need tools/perf/arch/mips/entry/syscalls/syscall.tbl?

The MIPS kernel and perf tool changes have now been debugged successfully
on the Loongson 3A4000 platform; the results are shown below. I will
prepare and submit patches based on 5.11-rc1 next week.

[root@linux perf]# uname -r
5.10.0-rc7
[root@linux perf]# ./perf record --call-graph dwarf -F 1000 lscpu
Architecture: mips64
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
L1d cache: 64K
L1i cache: 64K
L2 cache: 2048K
NUMA node0 CPU(s): 0-3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.078 MB perf.data (8 samples) ]
[root@linux perf]# ./perf report
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 8 of event 'cycles'
# Event count (approx.): 5682386
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... ................ ...........................
#
94.86% 94.86% lscpu [kernel.vmlinux] [k] get_page_from_freelist
|
---__GI_access (inlined)
syscall_common
do_faccessat
filename_lookup
path_lookupat
walk_component
__lookup_slow
d_alloc_parallel
d_alloc
__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu libc-2.20.so [.] __GI_access (inlined)
|
---__GI_access (inlined)
syscall_common
do_faccessat
filename_lookup
path_lookupat
walk_component
__lookup_slow
d_alloc_parallel
d_alloc
__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] syscall_common
|
---syscall_common
do_faccessat
filename_lookup
path_lookupat
walk_component
__lookup_slow
d_alloc_parallel
d_alloc
__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] do_faccessat
|
---do_faccessat
filename_lookup
path_lookupat
walk_component
__lookup_slow
d_alloc_parallel
d_alloc
__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] filename_lookup
|
---filename_lookup
path_lookupat
walk_component
__lookup_slow
d_alloc_parallel
d_alloc
__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] path_lookupat
|
---path_lookupat
walk_component
__lookup_slow
d_alloc_parallel
d_alloc
__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] walk_component
|
---walk_component
__lookup_slow
d_alloc_parallel
d_alloc
__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] __lookup_slow
|
---__lookup_slow
d_alloc_parallel
d_alloc
__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] d_alloc_parallel
|
---d_alloc_parallel
d_alloc
__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] d_alloc
|
---d_alloc
__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] __d_alloc
|
---__d_alloc
kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] kmem_cache_alloc
|
---kmem_cache_alloc
__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] __slab_alloc.isra.96
|
---__slab_alloc.isra.96
___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] ___slab_alloc
|
---___slab_alloc
allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] allocate_slab
|
---allocate_slab
__alloc_pages_nodemask
get_page_from_freelist

94.86% 0.00% lscpu [kernel.vmlinux] [k] __alloc_pages_nodemask
|
---__alloc_pages_nodemask
get_page_from_freelist

5.00% 5.00% lscpu ld-2.20.so [.] dl_main
|
---dl_main

0.13% 0.13% lscpu [kernel.vmlinux] [k] perf_event_comm_output
0.13% 0.00% lscpu [kernel.vmlinux] [k] merge_sched_in
0.13% 0.00% lscpu [kernel.vmlinux] [k] event_sched_in.isra.132
0.00% 0.00% perf [kernel.vmlinux] [k] arch_local_irq_restore

#
# (Tip: List events using substring match: perf list <keyword>)
#