2019-08-10 07:22:59

by Leo Yan

[permalink] [raw]
Subject: [PATCH v4 0/2] perf: arm/arm64: Improve completeness for kernel address space

This patch set is to improve completeness for kernel address space for
arm/arm64; it adds architecture specific tweaking for the kernel start
address, thus can include the memory regions which are prior to '_stext'
symbol. With this change, we can see the eBPF program can be parsed
properly on arm64.

This patch set is a following up version for the old patch "perf cs-etm:
Improve completeness for kernel address space" [1]; the old patch was
only to fix the issue for CoreSight ETM event; but the kernel address space
issue is not only limited to CoreSight event, it should be a common issue
for other events (e.g. PMU events), clock events for profiling eBPF
program. So this patch set tries to resolve it as a common issue for
arm/arm64 archs.

When implemented related code, I tried to use the API
machine__create_extra_kernel_maps(); but I found the 'perf script' tool
directly calls machine__get_kernel_start() instead of running into
the flow for machine__create_extra_kernel_maps(); this is the reason I
don't use machine__create_extra_kernel_maps() for tweaking kernel start
address and refactor machine__get_kernel_start() alternatively.

If there have better method to resolve this issue, any suggestions and
comments are very welcome!

[1] https://lkml.org/lkml/2019/6/19/1057


Leo Yan (2):
perf machine: Support arch's specific kernel start address
perf machine: arm/arm64: Improve completeness for kernel address space

tools/perf/Makefile.config | 22 ++++++++++++++++++++++
tools/perf/arch/arm/util/Build | 2 ++
tools/perf/arch/arm/util/machine.c | 17 +++++++++++++++++
tools/perf/arch/arm64/util/Build | 1 +
tools/perf/arch/arm64/util/machine.c | 17 +++++++++++++++++
tools/perf/arch/x86/util/machine.c | 10 ++++++++++
tools/perf/util/machine.c | 13 +++++++------
tools/perf/util/machine.h | 2 ++
8 files changed, 78 insertions(+), 6 deletions(-)
create mode 100644 tools/perf/arch/arm/util/machine.c
create mode 100644 tools/perf/arch/arm64/util/machine.c

--
2.17.1


2019-08-10 07:23:09

by Leo Yan

[permalink] [raw]
Subject: [PATCH v4 1/2] perf machine: Support arch's specific kernel start address

machine__get_kernel_start() gives out the kernel start address; some
architectures need to tweak the start address so that can reflect the
kernel start address correctly. This is not only for x86_64 arch, but
it is also required by other architectures, e.g. arm/arm64 needs to
tweak the kernel start address so can include the kernel memory regions
which are used before the '_stext' symbol.

This patch refactors machine__get_kernel_start() by adding a weak
arch__fix_kernel_text_start(), any architecture can implement it to
tweak its specific start address; this also allows the arch specific
code to be placed into 'arch' folder.

Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/arch/x86/util/machine.c | 10 ++++++++++
tools/perf/util/machine.c | 13 +++++++------
tools/perf/util/machine.h | 2 ++
3 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/tools/perf/arch/x86/util/machine.c b/tools/perf/arch/x86/util/machine.c
index 1e9ec783b9a1..9f012131534a 100644
--- a/tools/perf/arch/x86/util/machine.c
+++ b/tools/perf/arch/x86/util/machine.c
@@ -101,4 +101,14 @@ int machine__create_extra_kernel_maps(struct machine *machine,
return ret;
}

+void arch__fix_kernel_text_start(u64 *start)
+{
+ /*
+ * On x86_64, PTI entry trampolines are less than the
+ * start of kernel text, but still above 2^63. So leave
+ * kernel_start = 1ULL << 63 for x86_64.
+ */
+ *start = 1ULL << 63;
+}
+
#endif
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index f6ee7fbad3e4..603518835692 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2671,6 +2671,10 @@ int machine__nr_cpus_avail(struct machine *machine)
return machine ? perf_env__nr_cpus_avail(machine->env) : 0;
}

+void __weak arch__fix_kernel_text_start(u64 *start __maybe_unused)
+{
+}
+
int machine__get_kernel_start(struct machine *machine)
{
struct map *map = machine__kernel_map(machine);
@@ -2687,14 +2691,11 @@ int machine__get_kernel_start(struct machine *machine)
machine->kernel_start = 1ULL << 63;
if (map) {
err = map__load(map);
- /*
- * On x86_64, PTI entry trampolines are less than the
- * start of kernel text, but still above 2^63. So leave
- * kernel_start = 1ULL << 63 for x86_64.
- */
- if (!err && !machine__is(machine, "x86_64"))
+ if (!err)
machine->kernel_start = map->start;
}
+
+ arch__fix_kernel_text_start(&machine->kernel_start);
return err;
}

diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index ef803f08ae12..9cb459f4bfbc 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -278,6 +278,8 @@ void machine__get_kallsyms_filename(struct machine *machine, char *buf,
int machine__create_extra_kernel_maps(struct machine *machine,
struct dso *kernel);

+void arch__fix_kernel_text_start(u64 *start);
+
/* Kernel-space maps for symbols that are outside the main kernel map and module maps */
struct extra_kernel_map {
u64 start;
--
2.17.1

2019-08-10 07:23:41

by Leo Yan

[permalink] [raw]
Subject: [PATCH v4 2/2] perf machine: arm/arm64: Improve completeness for kernel address space

Arm and arm64 architecture reserve some memory regions prior to the
symbol '_stext' and these memory regions later will be used by device
module and BPF jit. The current code misses to consider these memory
regions thus any address in the regions will be taken as user space
mode, but perf cannot find the corresponding dso with the wrong CPU
mode so we misses to generate samples for device module and BPF
related trace data.

This patch parse the link scripts to get the memory size prior to start
address and reduce this size from 'machine>->kernel_start', then can
get a fixed up kernel start address which contain memory regions for
device module and BPF. Finally, machine__get_kernel_start() can reflect
more complete kernel memory regions and perf can successfully generate
samples.

The reason for parsing the link scripts is Arm architecture changes text
offset dependent on different platforms, which define multiple text
offsets in $kernel/arch/arm/Makefile. This offset is decided when build
kernel and the final value is extended in the link script, so we can
extract the used value from the link script. We use the same way to
parse arm64 link script as well. If fail to find the link script, the
pre start memory size is assumed as zero, in this case it has no any
change caused with this patch.

Below is detailed info for testing this patch:

- Install or build LLVM/Clang;

- Configure perf with ~/.perfconfig:

root@debian:~# cat ~/.perfconfig
# this file is auto-generated.
[llvm]
clang-path = /mnt/build/llvm-build/build/install/bin/clang
kbuild-dir = /mnt/linux-kernel/linux-cs-dev/
clang-opt = "-g"
dump-obj = true

[trace]
show_zeros = yes
show_duration = no
no_inherit = yes
show_timestamp = no
show_arg_names = no
args_alignment = 40
show_prefix = yes

- Run 'perf trace' command with eBPF event:

root@debian:~# perf trace -e string \
-e $kernel/tools/perf/examples/bpf/augmented_raw_syscalls.c

- Read eBPF program memory mapping in kernel:

root@debian:~# echo 1 > /proc/sys/net/core/bpf_jit_kallsyms
root@debian:~# cat /proc/kallsyms | grep -E "bpf_prog_.+_sys_[enter|exit]"
ffff00000008a0d0 t bpf_prog_e470211b846088d5_sys_enter [bpf]
ffff00000008c6a4 t bpf_prog_29c7ae234d79bd5c_sys_exit [bpf]

- Launch any program which accesses file system frequently so can hit
the system calls trace flow with eBPF event;

- Capture CoreSight trace data with filtering eBPF program:

root@debian:~# perf record -e cs_etm/@tmc_etr0/ \
--filter 'filter 0xffff00000008a0d0/0x800' -a sleep 5s

- Decode the eBPF program symbol 'bpf_prog_f173133dc38ccf87_sys_enter':

root@debian:~# perf script -F,ip,sym
Frame deformatter: Found 4 FSYNCS
0 [unknown]
ffff00000008a1ac bpf_prog_e470211b846088d5_sys_enter
ffff00000008a250 bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a124 bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a14c bpf_prog_e470211b846088d5_sys_enter
ffff00000008a13c bpf_prog_e470211b846088d5_sys_enter
ffff00000008a14c bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a180 bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a1ac bpf_prog_e470211b846088d5_sys_enter
ffff00000008a190 bpf_prog_e470211b846088d5_sys_enter
ffff00000008a1ac bpf_prog_e470211b846088d5_sys_enter
ffff00000008a250 bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a124 bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a14c bpf_prog_e470211b846088d5_sys_enter
0 [unknown]
ffff00000008a180 bpf_prog_e470211b846088d5_sys_enter
[...]

Cc: Mathieu Poirier <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Leo Yan <[email protected]>
---
tools/perf/Makefile.config | 22 ++++++++++++++++++++++
tools/perf/arch/arm/util/Build | 2 ++
tools/perf/arch/arm/util/machine.c | 17 +++++++++++++++++
tools/perf/arch/arm64/util/Build | 1 +
tools/perf/arch/arm64/util/machine.c | 17 +++++++++++++++++
5 files changed, 59 insertions(+)
create mode 100644 tools/perf/arch/arm/util/machine.c
create mode 100644 tools/perf/arch/arm64/util/machine.c

diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index e4988f49ea79..76e0ad0b4fd2 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -51,6 +51,17 @@ endif
ifeq ($(SRCARCH),arm)
NO_PERF_REGS := 0
LIBUNWIND_LIBS = -lunwind -lunwind-arm
+ PRE_START_SIZE := 0
+ ifneq ($(wildcard $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds),)
+ # Extract info from lds:
+ # . = ((0xC0000000)) + 0x00208000;
+ # PRE_START_SIZE := 0x00208000
+ PRE_START_SIZE := $(shell egrep ' \. \= \({2}0x[0-9a-fA-F]+\){2}' \
+ $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds | \
+ sed -e 's/[(|)|.|=|+|<|;|-]//g' -e 's/ \+/ /g' -e 's/^[ \t]*//' | \
+ awk -F' ' '{printf "0x%x", $$2}' 2>/dev/null)
+ endif
+ CFLAGS += -DARM_PRE_START_SIZE=$(PRE_START_SIZE)
endif

ifeq ($(SRCARCH),arm64)
@@ -58,6 +69,17 @@ ifeq ($(SRCARCH),arm64)
NO_SYSCALL_TABLE := 0
CFLAGS += -I$(OUTPUT)arch/arm64/include/generated
LIBUNWIND_LIBS = -lunwind -lunwind-aarch64
+ PRE_START_SIZE := 0
+ ifneq ($(wildcard $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds),)
+ # Extract info from lds:
+ # . = ((((((((0xffffffffffffffff)) - (((1)) << (48)) + 1) + (0)) + (0x08000000))) + (0x08000000))) + 0x00080000;
+ # PRE_START_SIZE := (0x08000000 + 0x08000000 + 0x00080000) = 0x10080000
+ PRE_START_SIZE := $(shell egrep ' \. \= \({8}0x[0-9a-fA-F]+\){2}' \
+ $(srctree)/arch/$(SRCARCH)/kernel/vmlinux.lds | \
+ sed -e 's/[(|)|.|=|+|<|;|-]//g' -e 's/ \+/ /g' -e 's/^[ \t]*//' | \
+ awk -F' ' '{printf "0x%x", $$6+$$7+$$8}' 2>/dev/null)
+ endif
+ CFLAGS += -DARM_PRE_START_SIZE=$(PRE_START_SIZE)
endif

ifeq ($(SRCARCH),csky)
diff --git a/tools/perf/arch/arm/util/Build b/tools/perf/arch/arm/util/Build
index 296f0eac5e18..efa6b768218a 100644
--- a/tools/perf/arch/arm/util/Build
+++ b/tools/perf/arch/arm/util/Build
@@ -1,3 +1,5 @@
+perf-y += machine.o
+
perf-$(CONFIG_DWARF) += dwarf-regs.o

perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
diff --git a/tools/perf/arch/arm/util/machine.c b/tools/perf/arch/arm/util/machine.c
new file mode 100644
index 000000000000..db172894e4ea
--- /dev/null
+++ b/tools/perf/arch/arm/util/machine.c
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/types.h>
+#include <linux/string.h>
+#include <stdlib.h>
+
+#include "../../util/machine.h"
+
+void arch__fix_kernel_text_start(u64 *start)
+{
+ /*
+ * On arm, the 16MB virtual memory space prior to 'kernel_start' is
+ * allocated to device modules, a PMD table if CONFIG_HIGHMEM is
+ * enabled and a PGD table. To reflect the complete kernel address
+ * space, compensate the pre-defined regions for kernel start address.
+ */
+ *start = *start - ARM_PRE_START_SIZE;
+}
diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index 3cde540d2fcf..8081fb8a7b3d 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -1,4 +1,5 @@
perf-y += header.o
+perf-y += machine.o
perf-y += sym-handling.o
perf-$(CONFIG_DWARF) += dwarf-regs.o
perf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
diff --git a/tools/perf/arch/arm64/util/machine.c b/tools/perf/arch/arm64/util/machine.c
new file mode 100644
index 000000000000..61058dca8c5a
--- /dev/null
+++ b/tools/perf/arch/arm64/util/machine.c
@@ -0,0 +1,17 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/types.h>
+#include <linux/string.h>
+#include <stdlib.h>
+
+#include "../../util/machine.h"
+
+void arch__fix_kernel_text_start(u64 *start)
+{
+ /*
+ * On arm64, the root PGD table, device module memory region and
+ * BPF jit region are prior to 'kernel_start'. To reflect the
+ * complete kernel address space, compensate these pre-defined
+ * regions for kernel start address.
+ */
+ *start = *start - ARM_PRE_START_SIZE;
+}
--
2.17.1

2019-08-12 06:40:19

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] perf machine: Support arch's specific kernel start address

On 10/08/19 10:21 AM, Leo Yan wrote:
> machine__get_kernel_start() gives out the kernel start address; some
> architectures need to tweak the start address so that can reflect the
> kernel start address correctly. This is not only for x86_64 arch, but
> it is also required by other architectures, e.g. arm/arm64 needs to
> tweak the kernel start address so can include the kernel memory regions
> which are used before the '_stext' symbol.
>
> This patch refactors machine__get_kernel_start() by adding a weak
> arch__fix_kernel_text_start(), any architecture can implement it to
> tweak its specific start address; this also allows the arch specific
> code to be placed into 'arch' folder.
>
> Signed-off-by: Leo Yan <[email protected]>
> ---
> tools/perf/arch/x86/util/machine.c | 10 ++++++++++
> tools/perf/util/machine.c | 13 +++++++------
> tools/perf/util/machine.h | 2 ++
> 3 files changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/arch/x86/util/machine.c b/tools/perf/arch/x86/util/machine.c
> index 1e9ec783b9a1..9f012131534a 100644
> --- a/tools/perf/arch/x86/util/machine.c
> +++ b/tools/perf/arch/x86/util/machine.c
> @@ -101,4 +101,14 @@ int machine__create_extra_kernel_maps(struct machine *machine,
> return ret;
> }
>
> +void arch__fix_kernel_text_start(u64 *start)
> +{
> + /*
> + * On x86_64, PTI entry trampolines are less than the
> + * start of kernel text, but still above 2^63. So leave
> + * kernel_start = 1ULL << 63 for x86_64.
> + */
> + *start = 1ULL << 63;
> +}

That is needed for reporting x86 data on any arch i.e. it is not specific to
the compile-time architecture, it is specific to the perf.data file
architecture, which is what machine__is() compares. So, this looks wrong.

> +
> #endif
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index f6ee7fbad3e4..603518835692 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -2671,6 +2671,10 @@ int machine__nr_cpus_avail(struct machine *machine)
> return machine ? perf_env__nr_cpus_avail(machine->env) : 0;
> }
>
> +void __weak arch__fix_kernel_text_start(u64 *start __maybe_unused)
> +{
> +}
> +
> int machine__get_kernel_start(struct machine *machine)
> {
> struct map *map = machine__kernel_map(machine);
> @@ -2687,14 +2691,11 @@ int machine__get_kernel_start(struct machine *machine)
> machine->kernel_start = 1ULL << 63;
> if (map) {
> err = map__load(map);
> - /*
> - * On x86_64, PTI entry trampolines are less than the
> - * start of kernel text, but still above 2^63. So leave
> - * kernel_start = 1ULL << 63 for x86_64.
> - */
> - if (!err && !machine__is(machine, "x86_64"))
> + if (!err)
> machine->kernel_start = map->start;
> }
> +
> + arch__fix_kernel_text_start(&machine->kernel_start);
> return err;
> }
>
> diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
> index ef803f08ae12..9cb459f4bfbc 100644
> --- a/tools/perf/util/machine.h
> +++ b/tools/perf/util/machine.h
> @@ -278,6 +278,8 @@ void machine__get_kallsyms_filename(struct machine *machine, char *buf,
> int machine__create_extra_kernel_maps(struct machine *machine,
> struct dso *kernel);
>
> +void arch__fix_kernel_text_start(u64 *start);
> +
> /* Kernel-space maps for symbols that are outside the main kernel map and module maps */
> struct extra_kernel_map {
> u64 start;
>

2019-08-12 07:03:35

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] perf machine: Support arch's specific kernel start address

On Mon, Aug 12, 2019 at 09:37:33AM +0300, Adrian Hunter wrote:
> On 10/08/19 10:21 AM, Leo Yan wrote:
> > machine__get_kernel_start() gives out the kernel start address; some
> > architectures need to tweak the start address so that can reflect the
> > kernel start address correctly. This is not only for x86_64 arch, but
> > it is also required by other architectures, e.g. arm/arm64 needs to
> > tweak the kernel start address so can include the kernel memory regions
> > which are used before the '_stext' symbol.
> >
> > This patch refactors machine__get_kernel_start() by adding a weak
> > arch__fix_kernel_text_start(), any architecture can implement it to
> > tweak its specific start address; this also allows the arch specific
> > code to be placed into 'arch' folder.
> >
> > Signed-off-by: Leo Yan <[email protected]>
> > ---
> > tools/perf/arch/x86/util/machine.c | 10 ++++++++++
> > tools/perf/util/machine.c | 13 +++++++------
> > tools/perf/util/machine.h | 2 ++
> > 3 files changed, 19 insertions(+), 6 deletions(-)
> >
> > diff --git a/tools/perf/arch/x86/util/machine.c b/tools/perf/arch/x86/util/machine.c
> > index 1e9ec783b9a1..9f012131534a 100644
> > --- a/tools/perf/arch/x86/util/machine.c
> > +++ b/tools/perf/arch/x86/util/machine.c
> > @@ -101,4 +101,14 @@ int machine__create_extra_kernel_maps(struct machine *machine,
> > return ret;
> > }
> >
> > +void arch__fix_kernel_text_start(u64 *start)
> > +{
> > + /*
> > + * On x86_64, PTI entry trampolines are less than the
> > + * start of kernel text, but still above 2^63. So leave
> > + * kernel_start = 1ULL << 63 for x86_64.
> > + */
> > + *start = 1ULL << 63;
> > +}
>
> That is needed for reporting x86 data on any arch i.e. it is not specific to
> the compile-time architecture, it is specific to the perf.data file
> architecture, which is what machine__is() compares. So, this looks wrong.

Thanks for reviewing, Adrian.

If so, I think we should extend the function machine__get_kernel_start()
as below; for building successfully, will always define the macro
ARM_PRE_START_SIZE in Makefile.config.

@Arnaldo, @Adrian, Please let me know if this works for you?

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index f6ee7fbad3e4..30a0ff627263 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2687,13 +2687,26 @@ int machine__get_kernel_start(struct machine *machine)
machine->kernel_start = 1ULL << 63;
if (map) {
err = map__load(map);
+ if (err)
+ return err;
+
/*
* On x86_64, PTI entry trampolines are less than the
* start of kernel text, but still above 2^63. So leave
* kernel_start = 1ULL << 63 for x86_64.
*/
- if (!err && !machine__is(machine, "x86_64"))
+ if (!machine__is(machine, "x86_64"))
machine->kernel_start = map->start;
+
+ /*
+ * On arm/arm64, some memory regions are prior to '_stext'
+ * symbol; to reflect the complete kernel address space,
+ * compensate these pre-defined regions for kernel start
+ * address.
+ */
+ if (machine__is(machine, "arm64") ||
+ machine__is(machine, "arm"))
+ machine->kernel_start -= ARM_PRE_START_SIZE;
}
return err;
}

Thanks,
Leo Yan

> > +
> > #endif
> > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > index f6ee7fbad3e4..603518835692 100644
> > --- a/tools/perf/util/machine.c
> > +++ b/tools/perf/util/machine.c
> > @@ -2671,6 +2671,10 @@ int machine__nr_cpus_avail(struct machine *machine)
> > return machine ? perf_env__nr_cpus_avail(machine->env) : 0;
> > }
> >
> > +void __weak arch__fix_kernel_text_start(u64 *start __maybe_unused)
> > +{
> > +}
> > +
> > int machine__get_kernel_start(struct machine *machine)
> > {
> > struct map *map = machine__kernel_map(machine);
> > @@ -2687,14 +2691,11 @@ int machine__get_kernel_start(struct machine *machine)
> > machine->kernel_start = 1ULL << 63;
> > if (map) {
> > err = map__load(map);
> > - /*
> > - * On x86_64, PTI entry trampolines are less than the
> > - * start of kernel text, but still above 2^63. So leave
> > - * kernel_start = 1ULL << 63 for x86_64.
> > - */
> > - if (!err && !machine__is(machine, "x86_64"))
> > + if (!err)
> > machine->kernel_start = map->start;
> > }
> > +
> > + arch__fix_kernel_text_start(&machine->kernel_start);
> > return err;
> > }
> >
> > diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
> > index ef803f08ae12..9cb459f4bfbc 100644
> > --- a/tools/perf/util/machine.h
> > +++ b/tools/perf/util/machine.h
> > @@ -278,6 +278,8 @@ void machine__get_kallsyms_filename(struct machine *machine, char *buf,
> > int machine__create_extra_kernel_maps(struct machine *machine,
> > struct dso *kernel);
> >
> > +void arch__fix_kernel_text_start(u64 *start);
> > +
> > /* Kernel-space maps for symbols that are outside the main kernel map and module maps */
> > struct extra_kernel_map {
> > u64 start;
> >
>

2019-08-12 07:25:26

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] perf machine: Support arch's specific kernel start address

On 12/08/19 10:02 AM, Leo Yan wrote:
> On Mon, Aug 12, 2019 at 09:37:33AM +0300, Adrian Hunter wrote:
>> On 10/08/19 10:21 AM, Leo Yan wrote:
>>> machine__get_kernel_start() gives out the kernel start address; some
>>> architectures need to tweak the start address so that can reflect the
>>> kernel start address correctly. This is not only for x86_64 arch, but
>>> it is also required by other architectures, e.g. arm/arm64 needs to
>>> tweak the kernel start address so can include the kernel memory regions
>>> which are used before the '_stext' symbol.
>>>
>>> This patch refactors machine__get_kernel_start() by adding a weak
>>> arch__fix_kernel_text_start(), any architecture can implement it to
>>> tweak its specific start address; this also allows the arch specific
>>> code to be placed into 'arch' folder.
>>>
>>> Signed-off-by: Leo Yan <[email protected]>
>>> ---
>>> tools/perf/arch/x86/util/machine.c | 10 ++++++++++
>>> tools/perf/util/machine.c | 13 +++++++------
>>> tools/perf/util/machine.h | 2 ++
>>> 3 files changed, 19 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/tools/perf/arch/x86/util/machine.c b/tools/perf/arch/x86/util/machine.c
>>> index 1e9ec783b9a1..9f012131534a 100644
>>> --- a/tools/perf/arch/x86/util/machine.c
>>> +++ b/tools/perf/arch/x86/util/machine.c
>>> @@ -101,4 +101,14 @@ int machine__create_extra_kernel_maps(struct machine *machine,
>>> return ret;
>>> }
>>>
>>> +void arch__fix_kernel_text_start(u64 *start)
>>> +{
>>> + /*
>>> + * On x86_64, PTI entry trampolines are less than the
>>> + * start of kernel text, but still above 2^63. So leave
>>> + * kernel_start = 1ULL << 63 for x86_64.
>>> + */
>>> + *start = 1ULL << 63;
>>> +}
>>
>> That is needed for reporting x86 data on any arch i.e. it is not specific to
>> the compile-time architecture, it is specific to the perf.data file
>> architecture, which is what machine__is() compares. So, this looks wrong.
>
> Thanks for reviewing, Adrian.
>
> If so, I think we should extend the function machine__get_kernel_start()
> as below; for building successfully, will always define the macro
> ARM_PRE_START_SIZE in Makefile.config.
>
> @Arnaldo, @Adrian, Please let me know if this works for you?

I don't know how you intend to calculate ARM_PRE_START_SIZE, but below is OK
for x86.

>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index f6ee7fbad3e4..30a0ff627263 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -2687,13 +2687,26 @@ int machine__get_kernel_start(struct machine *machine)
> machine->kernel_start = 1ULL << 63;
> if (map) {
> err = map__load(map);
> + if (err)
> + return err;
> +
> /*
> * On x86_64, PTI entry trampolines are less than the
> * start of kernel text, but still above 2^63. So leave
> * kernel_start = 1ULL << 63 for x86_64.
> */
> - if (!err && !machine__is(machine, "x86_64"))
> + if (!machine__is(machine, "x86_64"))
> machine->kernel_start = map->start;
> +
> + /*
> + * On arm/arm64, some memory regions are prior to '_stext'
> + * symbol; to reflect the complete kernel address space,
> + * compensate these pre-defined regions for kernel start
> + * address.
> + */
> + if (machine__is(machine, "arm64") ||
> + machine__is(machine, "arm"))

machine__is() does not normalize the architecture, so you may want to use
perf_env__arch() instead.

> + machine->kernel_start -= ARM_PRE_START_SIZE;
> }
> return err;
> }
>
> Thanks,
> Leo Yan
>
>>> +
>>> #endif
>>> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
>>> index f6ee7fbad3e4..603518835692 100644
>>> --- a/tools/perf/util/machine.c
>>> +++ b/tools/perf/util/machine.c
>>> @@ -2671,6 +2671,10 @@ int machine__nr_cpus_avail(struct machine *machine)
>>> return machine ? perf_env__nr_cpus_avail(machine->env) : 0;
>>> }
>>>
>>> +void __weak arch__fix_kernel_text_start(u64 *start __maybe_unused)
>>> +{
>>> +}
>>> +
>>> int machine__get_kernel_start(struct machine *machine)
>>> {
>>> struct map *map = machine__kernel_map(machine);
>>> @@ -2687,14 +2691,11 @@ int machine__get_kernel_start(struct machine *machine)
>>> machine->kernel_start = 1ULL << 63;
>>> if (map) {
>>> err = map__load(map);
>>> - /*
>>> - * On x86_64, PTI entry trampolines are less than the
>>> - * start of kernel text, but still above 2^63. So leave
>>> - * kernel_start = 1ULL << 63 for x86_64.
>>> - */
>>> - if (!err && !machine__is(machine, "x86_64"))
>>> + if (!err)
>>> machine->kernel_start = map->start;
>>> }
>>> +
>>> + arch__fix_kernel_text_start(&machine->kernel_start);
>>> return err;
>>> }
>>>
>>> diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
>>> index ef803f08ae12..9cb459f4bfbc 100644
>>> --- a/tools/perf/util/machine.h
>>> +++ b/tools/perf/util/machine.h
>>> @@ -278,6 +278,8 @@ void machine__get_kallsyms_filename(struct machine *machine, char *buf,
>>> int machine__create_extra_kernel_maps(struct machine *machine,
>>> struct dso *kernel);
>>>
>>> +void arch__fix_kernel_text_start(u64 *start);
>>> +
>>> /* Kernel-space maps for symbols that are outside the main kernel map and module maps */
>>> struct extra_kernel_map {
>>> u64 start;
>>>
>>
>

2019-08-12 07:42:04

by Leo Yan

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] perf machine: Support arch's specific kernel start address

On Mon, Aug 12, 2019 at 10:23:21AM +0300, Adrian Hunter wrote:
> On 12/08/19 10:02 AM, Leo Yan wrote:
> > On Mon, Aug 12, 2019 at 09:37:33AM +0300, Adrian Hunter wrote:
> >> On 10/08/19 10:21 AM, Leo Yan wrote:
> >>> machine__get_kernel_start() gives out the kernel start address; some
> >>> architectures need to tweak the start address so that can reflect the
> >>> kernel start address correctly. This is not only for x86_64 arch, but
> >>> it is also required by other architectures, e.g. arm/arm64 needs to
> >>> tweak the kernel start address so can include the kernel memory regions
> >>> which are used before the '_stext' symbol.
> >>>
> >>> This patch refactors machine__get_kernel_start() by adding a weak
> >>> arch__fix_kernel_text_start(), any architecture can implement it to
> >>> tweak its specific start address; this also allows the arch specific
> >>> code to be placed into 'arch' folder.
> >>>
> >>> Signed-off-by: Leo Yan <[email protected]>
> >>> ---
> >>> tools/perf/arch/x86/util/machine.c | 10 ++++++++++
> >>> tools/perf/util/machine.c | 13 +++++++------
> >>> tools/perf/util/machine.h | 2 ++
> >>> 3 files changed, 19 insertions(+), 6 deletions(-)
> >>>
> >>> diff --git a/tools/perf/arch/x86/util/machine.c b/tools/perf/arch/x86/util/machine.c
> >>> index 1e9ec783b9a1..9f012131534a 100644
> >>> --- a/tools/perf/arch/x86/util/machine.c
> >>> +++ b/tools/perf/arch/x86/util/machine.c
> >>> @@ -101,4 +101,14 @@ int machine__create_extra_kernel_maps(struct machine *machine,
> >>> return ret;
> >>> }
> >>>
> >>> +void arch__fix_kernel_text_start(u64 *start)
> >>> +{
> >>> + /*
> >>> + * On x86_64, PTI entry trampolines are less than the
> >>> + * start of kernel text, but still above 2^63. So leave
> >>> + * kernel_start = 1ULL << 63 for x86_64.
> >>> + */
> >>> + *start = 1ULL << 63;
> >>> +}
> >>
> >> That is needed for reporting x86 data on any arch i.e. it is not specific to
> >> the compile-time architecture, it is specific to the perf.data file
> >> architecture, which is what machine__is() compares. So, this looks wrong.
> >
> > Thanks for reviewing, Adrian.
> >
> > If so, I think we should extend the function machine__get_kernel_start()
> > as below; for building successfully, will always define the macro
> > ARM_PRE_START_SIZE in Makefile.config.
> >
> > @Arnaldo, @Adrian, Please let me know if this works for you?
>
> I don't know how you intend to calculate ARM_PRE_START_SIZE, but below is OK
> for x86.
>
> >
> > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> > index f6ee7fbad3e4..30a0ff627263 100644
> > --- a/tools/perf/util/machine.c
> > +++ b/tools/perf/util/machine.c
> > @@ -2687,13 +2687,26 @@ int machine__get_kernel_start(struct machine *machine)
> > machine->kernel_start = 1ULL << 63;
> > if (map) {
> > err = map__load(map);
> > + if (err)
> > + return err;
> > +
> > /*
> > * On x86_64, PTI entry trampolines are less than the
> > * start of kernel text, but still above 2^63. So leave
> > * kernel_start = 1ULL << 63 for x86_64.
> > */
> > - if (!err && !machine__is(machine, "x86_64"))
> > + if (!machine__is(machine, "x86_64"))
> > machine->kernel_start = map->start;
> > +
> > + /*
> > + * On arm/arm64, some memory regions are prior to '_stext'
> > + * symbol; to reflect the complete kernel address space,
> > + * compensate these pre-defined regions for kernel start
> > + * address.
> > + */
> > + if (machine__is(machine, "arm64") ||
> > + machine__is(machine, "arm"))
>
> machine__is() does not normalize the architecture, so you may want to use
> perf_env__arch() instead.

You are right, thanks for suggestion. Will use perf_env__arch() in
next spin.

Thanks,
Leo Yan

>
> > + machine->kernel_start -= ARM_PRE_START_SIZE;
> > }
> > return err;
> > }

[...]