2024-02-02 11:02:22

by Adrian Hunter

Subject: [PATCH 0/2] perf symbols: Slightly improve module file executable section mappings

Hi

Currently perf does not record module section addresses except for
the .text section. In general that means perf cannot get module section
mappings correct (except for .text) when loading symbols from a kernel
module file. (Note using --kcore does not have this issue)

Here are a couple of patches to help shed light upon and slightly improve
the situation.


Adrian Hunter (2):
perf script: Make it possible to see perf's kernel and module memory mappings
perf symbols: Slightly improve module file executable section mappings


Regards
Adrian


2024-02-02 11:02:36

by Adrian Hunter

Subject: [PATCH 1/2] perf script: Make it possible to see perf's kernel and module memory mappings

Dump kmaps to stderr if verbose > 2 (i.e. with -vvv).

Example:

$ perf script -vvv 2>&1 >/dev/null | grep kvm.intel
build id event received for /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko: 0691d75e10e72ebbbd45a44c59f6d00a5604badf [20]
Map: 0-3a3 4f5d8 [kvm_intel].modinfo
Map: 0-5240 5f280 [kvm_intel]__versions
Map: 0-30 64 [kvm_intel].note.Linux
Map: 0-14 644c0 [kvm_intel].orc_header
Map: 0-5297 43680 [kvm_intel].rodata
Map: 0-5bee 3b837 [kvm_intel].text.unlikely
Map: 0-7e0 41430 [kvm_intel].noinstr.text
Map: 0-2080 713c0 [kvm_intel].bss
Map: 0-26 705c8 [kvm_intel].data..read_mostly
Map: 0-5888 6a4c0 [kvm_intel].data
Map: 0-22 70220 [kvm_intel].data.once
Map: 0-40 705f0 [kvm_intel].data..percpu
Map: 0-1685 41d20 [kvm_intel].init.text
Map: 0-4b8 6fd60 [kvm_intel].init.data
Map: 0-380 70248 [kvm_intel]__dyndbg
Map: 0-8 70218 [kvm_intel].exit.data
Map: 0-438 4f980 [kvm_intel]__param
Map: 0-5f5 4ca0f [kvm_intel].rodata.str1.1
Map: 0-3657 493b8 [kvm_intel].rodata.str1.8
Map: 0-e0 70640 [kvm_intel].data..ro_after_init
Map: 0-500 70ec0 [kvm_intel].gnu.linkonce.this_module
Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko

The example above shows how the module section mappings are all wrong
except for the main .text mapping at 0xffffffffc13a7000.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/builtin-script.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index b1f57401ff23..e764b319ef59 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3806,6 +3806,16 @@ static int parse_callret_trace(const struct option *opt __maybe_unused,
 	return 0;
 }

+static void dump_kmaps(struct perf_session *session)
+{
+	int save_verbose = verbose;
+
+	pr_debug("Kernel and module maps:\n");
+	verbose = 0; /* Suppress verbose to print a summary only */
+	maps__fprintf(machine__kernel_maps(&session->machines.host), stderr);
+	verbose = save_verbose;
+}
+
 int cmd_script(int argc, const char **argv)
 {
 	bool show_full_info = false;
@@ -4366,6 +4376,9 @@ int cmd_script(int argc, const char **argv)

 	flush_scripting();

+	if (verbose > 2)
+		dump_kmaps(session);
+
 out_delete:
 	if (script.ptime_range) {
 		itrace_synth_opts__clear_time_range(&itrace_synth_opts);
--
2.34.1


2024-02-02 11:02:57

by Adrian Hunter

Subject: [PATCH 2/2] perf symbols: Slightly improve module file executable section mappings

Currently perf does not record module section addresses except for
the .text section. In general that means perf cannot get module section
mappings correct (except for .text) when loading symbols from a kernel
module file. (Note using --kcore does not have this issue)

Improve that situation slightly by identifying executable sections that
use the same mapping as the .text section. That happens when an
executable section comes directly after the .text section, both in memory
and in the file, something that can be determined by following the same
layout rules used by the kernel (refer to the kernel's layout_sections()).
Note that whether this happens is somewhat arbitrary, so this is not a
final solution.

Example from tracing a virtual machine process:

Before:

$ perf script | grep unknown
CPU 0/KVM 1718 203.511270: 318341 cpu-cycles:P: ffffffffc13e8a70 [unknown] (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
$ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
Map: 0-7e0 41430 [kvm_intel].noinstr.text
Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko

After:

$ perf script | grep 203.511270
CPU 0/KVM 1718 203.511270: 318341 cpu-cycles:P: ffffffffc13e8a70 vmx_vmexit+0x0 (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
$ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko

Reported-by: Like Xu <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/symbol-elf.c | 75 +++++++++++++++++++++++++++++++++++-
1 file changed, 73 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 9e7eeaf616b8..98bf0881aaf6 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -23,6 +23,7 @@
#include <linux/ctype.h>
#include <linux/kernel.h>
#include <linux/zalloc.h>
+#include <linux/string.h>
#include <symbol/kallsyms.h>
#include <internal/lib.h>

@@ -1329,6 +1330,58 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
 	return -1;
 }

+static bool is_exe_text(int flags)
+{
+	return (flags & (SHF_ALLOC | SHF_EXECINSTR)) == (SHF_ALLOC | SHF_EXECINSTR);
+}
+
+/*
+ * Some executable module sections like .noinstr.text might be laid out with
+ * .text so they can use the same mapping (memory address to file offset).
+ * Check if that is the case. Refer to kernel layout_sections(). Return the
+ * maximum offset.
+ */
+static u64 max_text_section(Elf *elf, GElf_Ehdr *ehdr)
+{
+	Elf_Scn *sec = NULL;
+	GElf_Shdr shdr;
+	u64 offs = 0;
+
+	/* Doesn't work for some arch */
+	if (ehdr->e_machine == EM_PARISC ||
+	    ehdr->e_machine == EM_ALPHA)
+		return 0;
+
+	/* ELF is corrupted/truncated, avoid calling elf_strptr. */
+	if (!elf_rawdata(elf_getscn(elf, ehdr->e_shstrndx), NULL))
+		return 0;
+
+	while ((sec = elf_nextscn(elf, sec)) != NULL) {
+		char *sec_name;
+
+		if (!gelf_getshdr(sec, &shdr))
+			break;
+
+		if (!is_exe_text(shdr.sh_flags))
+			continue;
+
+		/* .init and .exit sections are not placed with .text */
+		sec_name = elf_strptr(elf, ehdr->e_shstrndx, shdr.sh_name);
+		if (!sec_name ||
+		    strstarts(sec_name, ".init") ||
+		    strstarts(sec_name, ".exit"))
+			break;
+
+		/* Must be next to previous, assumes .text is first */
+		if (offs && PERF_ALIGN(offs, shdr.sh_addralign ?: 1) != shdr.sh_offset)
+			break;
+
+		offs = shdr.sh_offset + shdr.sh_size;
+	}
+
+	return offs;
+}
+
 /**
  * ref_reloc_sym_not_found - has kernel relocation symbol been found.
  * @kmap: kernel maps and relocation reference symbol
@@ -1368,7 +1421,8 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 				      struct maps *kmaps, struct kmap *kmap,
 				      struct dso **curr_dsop, struct map **curr_mapp,
 				      const char *section_name,
-				      bool adjust_kernel_syms, bool kmodule, bool *remap_kernel)
+				      bool adjust_kernel_syms, bool kmodule, bool *remap_kernel,
+				      u64 max_text_sh_offset)
 {
 	struct dso *curr_dso = *curr_dsop;
 	struct map *curr_map;
@@ -1425,6 +1479,17 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
 	if (!kmap)
 		return 0;

+	/*
+	 * perf does not record module section addresses except for .text, but
+	 * some sections can use the same mapping as .text.
+	 */
+	if (kmodule && adjust_kernel_syms && is_exe_text(shdr->sh_flags) &&
+	    shdr->sh_offset <= max_text_sh_offset) {
+		*curr_mapp = map;
+		*curr_dsop = dso;
+		return 0;
+	}
+
 	snprintf(dso_name, sizeof(dso_name), "%s%s", dso->short_name, section_name);

 	curr_map = maps__find_by_name(kmaps, dso_name);
@@ -1499,6 +1564,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 	Elf *elf;
 	int nr = 0;
 	bool remap_kernel = false, adjust_kernel_syms = false;
+	u64 max_text_sh_offset = 0;

 	if (kmap && !kmaps)
 		return -1;
@@ -1586,6 +1652,10 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
 		remap_kernel = true;
 		adjust_kernel_syms = dso->adjust_symbols;
 	}
+
+	if (kmodule && adjust_kernel_syms)
+		max_text_sh_offset = max_text_section(runtime_ss->elf, &runtime_ss->ehdr);
+
 	elf_symtab__for_each_symbol(syms, nr_syms, idx, sym) {
 		struct symbol *f;
 		const char *elf_name = elf_sym__name(&sym, symstrs);
@@ -1675,7 +1745,8 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,

 		if (dso->kernel) {
 			if (dso__process_kernel_symbol(dso, map, &sym, &shdr, kmaps, kmap, &curr_dso, &curr_map,
-						       section_name, adjust_kernel_syms, kmodule, &remap_kernel))
+						       section_name, adjust_kernel_syms, kmodule,
+						       &remap_kernel, max_text_sh_offset))
 				goto out_elf_end;
 		} else if ((used_opd && runtime_ss->adjust_symbols) ||
 			   (!used_opd && syms_ss->adjust_symbols)) {
--
2.34.1
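
For readers following the thread, the contiguity rule described in the commit message can be sketched without libelf. This is a simplification, not the patch itself: `struct sec`, `align_up()`, `starts_with()` and `max_text_offset()` are hypothetical names, the section table is passed in by the caller rather than read with elf_nextscn(), and the architecture and corrupted-ELF checks are left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the GElf_Shdr fields the patch looks at. */
struct sec {
	const char *name;
	uint64_t offset;	/* sh_offset */
	uint64_t size;		/* sh_size */
	uint64_t align;		/* sh_addralign, 0 is treated as 1 */
	bool exec;		/* SHF_ALLOC and SHF_EXECINSTR both set */
};

/* Round v up to a multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t v, uint64_t a)
{
	return (v + a - 1) & ~(a - 1);
}

static bool starts_with(const char *s, const char *prefix)
{
	return strncmp(s, prefix, strlen(prefix)) == 0;
}

/*
 * Walk executable sections in file order, assuming .text comes first.
 * Stop at the first .init or .exit section, or at the first section that
 * does not start where the previous one ended (after alignment).
 * Return the end offset of the last section accepted.
 */
static uint64_t max_text_offset(const struct sec *secs, size_t n)
{
	uint64_t offs = 0;

	for (size_t i = 0; i < n; i++) {
		const struct sec *s = &secs[i];

		if (!s->exec)
			continue;
		if (starts_with(s->name, ".init") || starts_with(s->name, ".exit"))
			break;
		if (offs && align_up(offs, s->align ? s->align : 1) != s->offset)
			break;
		offs = s->offset + s->size;
	}
	return offs;
}
```

A caller would fill an array of `struct sec` from the section headers and treat the returned offset as the upper bound of the run of sections that share the .text mapping.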


2024-02-03 01:45:04

by Namhyung Kim

Subject: Re: [PATCH 2/2] perf symbols: Slightly improve module file executable section mappings

Hi Adrian,

On Fri, Feb 02, 2024 at 01:01:30PM +0200, Adrian Hunter wrote:
> Currently perf does not record module section addresses except for
> the .text section. In general that means perf cannot get module section
> mappings correct (except for .text) when loading symbols from a kernel
> module file. (Note using --kcore does not have this issue)
>
> Improve that situation slightly by identifying executable sections that
> use the same mapping as the .text section. That happens when an
> executable section comes directly after the .text section, both in memory
> and on file, something that can be determined by following the same layout
> rules used by the kernel, refer kernel layout_sections(). Note whether
> that happens is somewhat arbitrary, so this is not a final solution.
>
> Example from tracing a virtual machine process:
>
> Before:
>
> $ perf script | grep unknown
> CPU 0/KVM 1718 203.511270: 318341 cpu-cycles:P: ffffffffc13e8a70 [unknown] (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
> $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
> Map: 0-7e0 41430 [kvm_intel].noinstr.text
> Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
>
> After:
>
> $ perf script | grep 203.511270
> CPU 0/KVM 1718 203.511270: 318341 cpu-cycles:P: ffffffffc13e8a70 vmx_vmexit+0x0 (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
> $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
> Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
>
> Reported-by: Like Xu <[email protected]>
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/util/symbol-elf.c | 75 +++++++++++++++++++++++++++++++++++-
> 1 file changed, 73 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 9e7eeaf616b8..98bf0881aaf6 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -23,6 +23,7 @@
> #include <linux/ctype.h>
> #include <linux/kernel.h>
> #include <linux/zalloc.h>
> +#include <linux/string.h>
> #include <symbol/kallsyms.h>
> #include <internal/lib.h>
>
> @@ -1329,6 +1330,58 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
> return -1;
> }
>
> +static bool is_exe_text(int flags)
> +{
> + return (flags & (SHF_ALLOC | SHF_EXECINSTR)) == (SHF_ALLOC | SHF_EXECINSTR);
> +}
> +
> +/*
> + * Some executable module sections like .noinstr.text might be laid out with
> + * .text so they can use the same mapping (memory address to file offset).
> + * Check if that is the case. Refer to kernel layout_sections(). Return the
> + * maximum offset.
> + */
> +static u64 max_text_section(Elf *elf, GElf_Ehdr *ehdr)
> +{
> + Elf_Scn *sec = NULL;
> + GElf_Shdr shdr;
> + u64 offs = 0;
> +
> + /* Doesn't work for some arch */
> + if (ehdr->e_machine == EM_PARISC ||
> + ehdr->e_machine == EM_ALPHA)
> + return 0;
> +
> + /* ELF is corrupted/truncated, avoid calling elf_strptr. */
> + if (!elf_rawdata(elf_getscn(elf, ehdr->e_shstrndx), NULL))
> + return 0;
> +
> + while ((sec = elf_nextscn(elf, sec)) != NULL) {
> + char *sec_name;
> +
> + if (!gelf_getshdr(sec, &shdr))
> + break;
> +
> + if (!is_exe_text(shdr.sh_flags))
> + continue;
> +
> + /* .init and .exit sections are not placed with .text */
> + sec_name = elf_strptr(elf, ehdr->e_shstrndx, shdr.sh_name);
> + if (!sec_name ||
> + strstarts(sec_name, ".init") ||
> + strstarts(sec_name, ".exit"))
> + break;

Do we really need this? It seems my module has .init.text section
next to .text.

$ readelf -SW /lib/modules/`uname -r`/kernel/fs/ext4/ext4.ko
There are 77 section headers, starting at offset 0x252e90:

Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 0] NULL 0000000000000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 0000000000000000 000040 079fa7 00 AX 0 0 16
[ 2] .rela.text RELA 0000000000000000 13c348 04f0c8 18 I 74 1 8
[ 3] .init.text PROGBITS 0000000000000000 079ff0 00060c 00 AX 0 0 16
...


ALIGN(0x40 + 0x79fa7, 16) = 0x79ff0, right?

Thanks,
Namhyung
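
The arithmetic above can be checked with a few lines of C. This is only a sketch: `align_up()` and `text_end()` are hypothetical helpers written for this note, with `align_up()` rounding up the way a power-of-two ALIGN() macro does, and the constants are taken from the readelf output quoted above.

```c
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static uint64_t align_up(uint64_t v, uint64_t a)
{
	return (v + a - 1) & ~(a - 1);
}

/* End of .text in the file: sh_offset + sh_size from the readelf output. */
static uint64_t text_end(void)
{
	return 0x40 + 0x79fa7;
}
```

`text_end()` is 0x79fe7, and `align_up(text_end(), 16)` gives 0x79ff0, the file offset readelf reports for .init.text.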

> +
> + /* Must be next to previous, assumes .text is first */
> + if (offs && PERF_ALIGN(offs, shdr.sh_addralign ?: 1) != shdr.sh_offset)
> + break;
> +
> + offs = shdr.sh_offset + shdr.sh_size;
> + }
> +
> + return offs;
> +}
> +
> /**
> * ref_reloc_sym_not_found - has kernel relocation symbol been found.
> * @kmap: kernel maps and relocation reference symbol
> @@ -1368,7 +1421,8 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
> struct maps *kmaps, struct kmap *kmap,
> struct dso **curr_dsop, struct map **curr_mapp,
> const char *section_name,
> - bool adjust_kernel_syms, bool kmodule, bool *remap_kernel)
> + bool adjust_kernel_syms, bool kmodule, bool *remap_kernel,
> + u64 max_text_sh_offset)
> {
> struct dso *curr_dso = *curr_dsop;
> struct map *curr_map;
> @@ -1425,6 +1479,17 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
> if (!kmap)
> return 0;
>
> + /*
> + * perf does not record module section addresses except for .text, but
> + * some sections can use the same mapping as .text.
> + */
> + if (kmodule && adjust_kernel_syms && is_exe_text(shdr->sh_flags) &&
> + shdr->sh_offset <= max_text_sh_offset) {
> + *curr_mapp = map;
> + *curr_dsop = dso;
> + return 0;
> + }
> +
> snprintf(dso_name, sizeof(dso_name), "%s%s", dso->short_name, section_name);
>
> curr_map = maps__find_by_name(kmaps, dso_name);
> @@ -1499,6 +1564,7 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
> Elf *elf;
> int nr = 0;
> bool remap_kernel = false, adjust_kernel_syms = false;
> + u64 max_text_sh_offset = 0;
>
> if (kmap && !kmaps)
> return -1;
> @@ -1586,6 +1652,10 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
> remap_kernel = true;
> adjust_kernel_syms = dso->adjust_symbols;
> }
> +
> + if (kmodule && adjust_kernel_syms)
> + max_text_sh_offset = max_text_section(runtime_ss->elf, &runtime_ss->ehdr);
> +
> elf_symtab__for_each_symbol(syms, nr_syms, idx, sym) {
> struct symbol *f;
> const char *elf_name = elf_sym__name(&sym, symstrs);
> @@ -1675,7 +1745,8 @@ dso__load_sym_internal(struct dso *dso, struct map *map, struct symsrc *syms_ss,
>
> if (dso->kernel) {
> if (dso__process_kernel_symbol(dso, map, &sym, &shdr, kmaps, kmap, &curr_dso, &curr_map,
> - section_name, adjust_kernel_syms, kmodule, &remap_kernel))
> + section_name, adjust_kernel_syms, kmodule,
> + &remap_kernel, max_text_sh_offset))
> goto out_elf_end;
> } else if ((used_opd && runtime_ss->adjust_symbols) ||
> (!used_opd && syms_ss->adjust_symbols)) {
> --
> 2.34.1
>

2024-02-03 01:53:20

by Namhyung Kim

Subject: Re: [PATCH 1/2] perf script: Make it possible to see perf's kernel and module memory mappings

On Fri, Feb 2, 2024 at 3:01 AM Adrian Hunter <[email protected]> wrote:
>
> Dump kmaps if verbose > 2.

Maybe we can add '--debug kmap' option rather than using an
arbitrary verbose level.

Thanks,
Namhyung

>
> Example:
>
> $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel
> build id event received for /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko: 0691d75e10e72ebbbd45a44c59f6d00a5604badf [20]
> Map: 0-3a3 4f5d8 [kvm_intel].modinfo
> Map: 0-5240 5f280 [kvm_intel]__versions
> Map: 0-30 64 [kvm_intel].note.Linux
> Map: 0-14 644c0 [kvm_intel].orc_header
> Map: 0-5297 43680 [kvm_intel].rodata
> Map: 0-5bee 3b837 [kvm_intel].text.unlikely
> Map: 0-7e0 41430 [kvm_intel].noinstr.text
> Map: 0-2080 713c0 [kvm_intel].bss
> Map: 0-26 705c8 [kvm_intel].data..read_mostly
> Map: 0-5888 6a4c0 [kvm_intel].data
> Map: 0-22 70220 [kvm_intel].data.once
> Map: 0-40 705f0 [kvm_intel].data..percpu
> Map: 0-1685 41d20 [kvm_intel].init.text
> Map: 0-4b8 6fd60 [kvm_intel].init.data
> Map: 0-380 70248 [kvm_intel]__dyndbg
> Map: 0-8 70218 [kvm_intel].exit.data
> Map: 0-438 4f980 [kvm_intel]__param
> Map: 0-5f5 4ca0f [kvm_intel].rodata.str1.1
> Map: 0-3657 493b8 [kvm_intel].rodata.str1.8
> Map: 0-e0 70640 [kvm_intel].data..ro_after_init
> Map: 0-500 70ec0 [kvm_intel].gnu.linkonce.this_module
> Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
>
> The example above shows how the module section mappings are all wrong
> except for the main .text mapping at 0xffffffffc13a7000.
>
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/builtin-script.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> index b1f57401ff23..e764b319ef59 100644
> --- a/tools/perf/builtin-script.c
> +++ b/tools/perf/builtin-script.c
> @@ -3806,6 +3806,16 @@ static int parse_callret_trace(const struct option *opt __maybe_unused,
> return 0;
> }
>
> +static void dump_kmaps(struct perf_session *session)
> +{
> + int save_verbose = verbose;
> +
> + pr_debug("Kernel and module maps:\n");
> + verbose = 0; /* Suppress verbose to print a summary only */
> + maps__fprintf(machine__kernel_maps(&session->machines.host), stderr);
> + verbose = save_verbose;
> +}
> +
> int cmd_script(int argc, const char **argv)
> {
> bool show_full_info = false;
> @@ -4366,6 +4376,9 @@ int cmd_script(int argc, const char **argv)
>
> flush_scripting();
>
> + if (verbose > 2)
> + dump_kmaps(session);
> +
> out_delete:
> if (script.ptime_range) {
> itrace_synth_opts__clear_time_range(&itrace_synth_opts);
> --
> 2.34.1
>

2024-02-05 06:58:28

by Adrian Hunter

Subject: Re: [PATCH 2/2] perf symbols: Slightly improve module file executable section mappings

On 3/02/24 03:44, Namhyung Kim wrote:
> Hi Adrian,
>
> On Fri, Feb 02, 2024 at 01:01:30PM +0200, Adrian Hunter wrote:
>> Currently perf does not record module section addresses except for
>> the .text section. In general that means perf cannot get module section
>> mappings correct (except for .text) when loading symbols from a kernel
>> module file. (Note using --kcore does not have this issue)
>>
>> Improve that situation slightly by identifying executable sections that
>> use the same mapping as the .text section. That happens when an
>> executable section comes directly after the .text section, both in memory
>> and on file, something that can be determined by following the same layout
>> rules used by the kernel, refer kernel layout_sections(). Note whether
>> that happens is somewhat arbitrary, so this is not a final solution.
>>
>> Example from tracing a virtual machine process:
>>
>> Before:
>>
>> $ perf script | grep unknown
>> CPU 0/KVM 1718 203.511270: 318341 cpu-cycles:P: ffffffffc13e8a70 [unknown] (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
>> $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
>> Map: 0-7e0 41430 [kvm_intel].noinstr.text
>> Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
>>
>> After:
>>
>> $ perf script | grep 203.511270
>> CPU 0/KVM 1718 203.511270: 318341 cpu-cycles:P: ffffffffc13e8a70 vmx_vmexit+0x0 (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
>> $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
>> Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
>>
>> Reported-by: Like Xu <[email protected]>
>> Signed-off-by: Adrian Hunter <[email protected]>
>> ---
>> tools/perf/util/symbol-elf.c | 75 +++++++++++++++++++++++++++++++++++-
>> 1 file changed, 73 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>> index 9e7eeaf616b8..98bf0881aaf6 100644
>> --- a/tools/perf/util/symbol-elf.c
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -23,6 +23,7 @@
>> #include <linux/ctype.h>
>> #include <linux/kernel.h>
>> #include <linux/zalloc.h>
>> +#include <linux/string.h>
>> #include <symbol/kallsyms.h>
>> #include <internal/lib.h>
>>
>> @@ -1329,6 +1330,58 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
>> return -1;
>> }
>>
>> +static bool is_exe_text(int flags)
>> +{
>> + return (flags & (SHF_ALLOC | SHF_EXECINSTR)) == (SHF_ALLOC | SHF_EXECINSTR);
>> +}
>> +
>> +/*
>> + * Some executable module sections like .noinstr.text might be laid out with
>> + * .text so they can use the same mapping (memory address to file offset).
>> + * Check if that is the case. Refer to kernel layout_sections(). Return the
>> + * maximum offset.
>> + */
>> +static u64 max_text_section(Elf *elf, GElf_Ehdr *ehdr)
>> +{
>> + Elf_Scn *sec = NULL;
>> + GElf_Shdr shdr;
>> + u64 offs = 0;
>> +
>> + /* Doesn't work for some arch */
>> + if (ehdr->e_machine == EM_PARISC ||
>> + ehdr->e_machine == EM_ALPHA)
>> + return 0;
>> +
>> + /* ELF is corrupted/truncated, avoid calling elf_strptr. */
>> + if (!elf_rawdata(elf_getscn(elf, ehdr->e_shstrndx), NULL))
>> + return 0;
>> +
>> + while ((sec = elf_nextscn(elf, sec)) != NULL) {
>> + char *sec_name;
>> +
>> + if (!gelf_getshdr(sec, &shdr))
>> + break;
>> +
>> + if (!is_exe_text(shdr.sh_flags))
>> + continue;
>> +
>> + /* .init and .exit sections are not placed with .text */
>> + sec_name = elf_strptr(elf, ehdr->e_shstrndx, shdr.sh_name);
>> + if (!sec_name ||
>> + strstarts(sec_name, ".init") ||
>> + strstarts(sec_name, ".exit"))
>> + break;
>
> Do we really need this? It seems my module has .init.text section
> next to .text.
>
> $ readelf -SW /lib/modules/`uname -r`/kernel/fs/ext4/ext4.ko
> There are 77 section headers, starting at offset 0x252e90:
>
> Section Headers:
> [Nr] Name Type Address Off Size ES Flg Lk Inf Al
> [ 0] NULL 0000000000000000 000000 000000 00 0 0 0
> [ 1] .text PROGBITS 0000000000000000 000040 079fa7 00 AX 0 0 16
> [ 2] .rela.text RELA 0000000000000000 13c348 04f0c8 18 I 74 1 8
> [ 3] .init.text PROGBITS 0000000000000000 079ff0 00060c 00 AX 0 0 16
> ...
>
>
> ALIGN(0x40 + 0x79fa7, 16) = 0x79ff0, right?

But not in memory, e.g.:

Section Headers:
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[ 3] .text PROGBITS 0000000000000000 0000a0 071719 00 AX 0 0 16
[ 5] .text.unlikely PROGBITS 0000000000000000 0717b9 000a59 00 AX 0 0 1
[ 7] .init.text PROGBITS 0000000000000000 072212 0004fe 00 AX 0 0 1
[ 9] .altinstr_replacement PROGBITS 0000000000000000 072710 000004 00 AX 0 0 1
[10] .static_call.text PROGBITS 0000000000000000 072714 000388 00 AX 0 0 4
[12] .exit.text PROGBITS 0000000000000000 072a9c 000078 00 AX 0 0 1


/sys/module/ext4/sections/.text: 0xffffffffc0453000
/sys/module/ext4/sections/.text.unlikely: 0xffffffffc04c4719
/sys/module/ext4/sections/.init.text: 0xffffffffc053e000
/sys/module/ext4/sections/.altinstr_replacement: 0xffffffffc04c5172
/sys/module/ext4/sections/.static_call.text: 0xffffffffc04c5178
/sys/module/ext4/sections/.exit.text: 0xffffffffc04c5500

Need to have:

section address - offset == .text address - .text offset

perf does not record the section address, but the kernel
layout_sections() lays out executable sections in order
starting with .text *until* it gets to .init* or .exit*.
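
The condition above can be written as a small check. This is a sketch only: `shares_text_mapping()` is a hypothetical helper, not something in perf, and it needs the section's load address, which is exactly what perf does not record.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A section can use the .text map only if the difference between its
 * load address and its file offset is the same as for .text.
 */
static bool shares_text_mapping(uint64_t addr, uint64_t offset,
				uint64_t text_addr, uint64_t text_offset)
{
	return addr - offset == text_addr - text_offset;
}
```

With the ext4 values listed above, .text and .text.unlikely both give 0xffffffffc0452f60, so they share a mapping. .init.text gives 0xffffffffc04cbdee and .altinstr_replacement gives 0xffffffffc0452a62, so neither can use the .text map.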


2024-02-05 07:09:18

by Adrian Hunter

Subject: Re: [PATCH 1/2] perf script: Make it possible to see perf's kernel and module memory mappings

On 3/02/24 03:56, Arnaldo Carvalho de Melo wrote:
>
>
> On Fri, Feb 2, 2024, 10:50 PM Namhyung Kim <[email protected]> wrote:
>
> On Fri, Feb 2, 2024 at 3:01 AM Adrian Hunter <[email protected]> wrote:
> >
> > Dump kmaps if verbose > 2.
>
> Maybe we can add '--debug kmap' option rather than using an
> arbitrary verbose level.

That is a global option, but it would only work for tools that are
explicitly programmed to do the dump. Could it be done just for
perf script and perf report?

>
>
> I think we have 'perf report --mmap', no?

Only shows user space maps. Could add 'perf report --kmaps'?

>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/perf-report.txt#n542
>
> - Arnaldo
>
> Sent from smartphone
>
>
> Thanks,
> Namhyung
>
> >
> > Example:
> >
> >   $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel
> >   build id event received for /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko: 0691d75e10e72ebbbd45a44c59f6d00a5604badf [20]
> >   Map: 0-3a3 4f5d8 [kvm_intel].modinfo
> >   Map: 0-5240 5f280 [kvm_intel]__versions
> >   Map: 0-30 64 [kvm_intel].note.Linux
> >   Map: 0-14 644c0 [kvm_intel].orc_header
> >   Map: 0-5297 43680 [kvm_intel].rodata
> >   Map: 0-5bee 3b837 [kvm_intel].text.unlikely
> >   Map: 0-7e0 41430 [kvm_intel].noinstr.text
> >   Map: 0-2080 713c0 [kvm_intel].bss
> >   Map: 0-26 705c8 [kvm_intel].data..read_mostly
> >   Map: 0-5888 6a4c0 [kvm_intel].data
> >   Map: 0-22 70220 [kvm_intel].data.once
> >   Map: 0-40 705f0 [kvm_intel].data..percpu
> >   Map: 0-1685 41d20 [kvm_intel].init.text
> >   Map: 0-4b8 6fd60 [kvm_intel].init.data
> >   Map: 0-380 70248 [kvm_intel]__dyndbg
> >   Map: 0-8 70218 [kvm_intel].exit.data
> >   Map: 0-438 4f980 [kvm_intel]__param
> >   Map: 0-5f5 4ca0f [kvm_intel].rodata.str1.1
> >   Map: 0-3657 493b8 [kvm_intel].rodata.str1.8
> >   Map: 0-e0 70640 [kvm_intel].data..ro_after_init
> >   Map: 0-500 70ec0 [kvm_intel].gnu.linkonce.this_module
> >   Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
> >
> > The example above shows how the module section mappings are all wrong
> > except for the main .text mapping at 0xffffffffc13a7000.
> >
> > Signed-off-by: Adrian Hunter <[email protected]>
> > ---
> >  tools/perf/builtin-script.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> > index b1f57401ff23..e764b319ef59 100644
> > --- a/tools/perf/builtin-script.c
> > +++ b/tools/perf/builtin-script.c
> > @@ -3806,6 +3806,16 @@ static int parse_callret_trace(const struct option *opt __maybe_unused,
> >         return 0;
> >  }
> >
> > +static void dump_kmaps(struct perf_session *session)
> > +{
> > +       int save_verbose = verbose;
> > +
> > +       pr_debug("Kernel and module maps:\n");
> > +       verbose = 0; /* Suppress verbose to print a summary only */
> > +       maps__fprintf(machine__kernel_maps(&session->machines.host), stderr);
> > +       verbose = save_verbose;
> > +}
> > +
> >  int cmd_script(int argc, const char **argv)
> >  {
> >         bool show_full_info = false;
> > @@ -4366,6 +4376,9 @@ int cmd_script(int argc, const char **argv)
> >
> >         flush_scripting();
> >
> > +       if (verbose > 2)
> > +               dump_kmaps(session);
> > +
> >  out_delete:
> >         if (script.ptime_range) {
> >                 itrace_synth_opts__clear_time_range(&itrace_synth_opts);
> > --
> > 2.34.1
> >
>


2024-02-06 02:21:44

by Namhyung Kim

Subject: Re: [PATCH 1/2] perf script: Make it possible to see perf's kernel and module memory mappings

On Sun, Feb 4, 2024 at 11:08 PM Adrian Hunter <[email protected]> wrote:
>
> On 3/02/24 03:56, Arnaldo Carvalho de Melo wrote:
> >
> >
> > On Fri, Feb 2, 2024, 10:50 PM Namhyung Kim <[email protected]> wrote:
> >
> > On Fri, Feb 2, 2024 at 3:01 AM Adrian Hunter <[email protected]> wrote:
> > >
> > > Dump kmaps if verbose > 2.
> >
> > Maybe we can add '--debug kmap' option rather than using an
> > arbitrary verbose level.
>
> That is a global option but would only work for tools that are
> explicitly programmed to do the dump. Could just do perf script
> and perf report?

I don't mind. Actually, `--debug perf-event-open` only works with
commands that call the syscall. But I'm fine either way.

>
> >
> >
> > I think we have 'perf report --mmap', no?
>
> Only shows user space maps. Could add 'perf report --kmaps'?

That'd work too. It's up to you.

Thanks,
Namhyung


>
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/perf-report.txt#n542
> >
> > - Arnaldo
> >
> > Sent from smartphone
> >
> >
> > Thanks,
> > Namhyung
> >
> > >
> > > Example:
> > >
> > > $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel
> > > build id event received for /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko: 0691d75e10e72ebbbd45a44c59f6d00a5604badf [20]
> > > Map: 0-3a3 4f5d8 [kvm_intel].modinfo
> > > Map: 0-5240 5f280 [kvm_intel]__versions
> > > Map: 0-30 64 [kvm_intel].note.Linux
> > > Map: 0-14 644c0 [kvm_intel].orc_header
> > > Map: 0-5297 43680 [kvm_intel].rodata
> > > Map: 0-5bee 3b837 [kvm_intel].text.unlikely
> > > Map: 0-7e0 41430 [kvm_intel].noinstr.text
> > > Map: 0-2080 713c0 [kvm_intel].bss
> > > Map: 0-26 705c8 [kvm_intel].data..read_mostly
> > > Map: 0-5888 6a4c0 [kvm_intel].data
> > > Map: 0-22 70220 [kvm_intel].data.once
> > > Map: 0-40 705f0 [kvm_intel].data..percpu
> > > Map: 0-1685 41d20 [kvm_intel].init.text
> > > Map: 0-4b8 6fd60 [kvm_intel].init.data
> > > Map: 0-380 70248 [kvm_intel]__dyndbg
> > > Map: 0-8 70218 [kvm_intel].exit.data
> > > Map: 0-438 4f980 [kvm_intel]__param
> > > Map: 0-5f5 4ca0f [kvm_intel].rodata.str1.1
> > > Map: 0-3657 493b8 [kvm_intel].rodata.str1.8
> > > Map: 0-e0 70640 [kvm_intel].data..ro_after_init
> > > Map: 0-500 70ec0 [kvm_intel].gnu.linkonce.this_module
> > > Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
> > >
> > > The example above shows how the module section mappings are all wrong
> > > except for the main .text mapping at 0xffffffffc13a7000.
> > >
> > > Signed-off-by: Adrian Hunter <[email protected]>
> > > ---
> > > tools/perf/builtin-script.c | 13 +++++++++++++
> > > 1 file changed, 13 insertions(+)
> > >
> > > diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
> > > index b1f57401ff23..e764b319ef59 100644
> > > --- a/tools/perf/builtin-script.c
> > > +++ b/tools/perf/builtin-script.c
> > > @@ -3806,6 +3806,16 @@ static int parse_callret_trace(const struct option *opt __maybe_unused,
> > > return 0;
> > > }
> > >
> > > +static void dump_kmaps(struct perf_session *session)
> > > +{
> > > + int save_verbose = verbose;
> > > +
> > > + pr_debug("Kernel and module maps:\n");
> > > + verbose = 0; /* Suppress verbose to print a summary only */
> > > + maps__fprintf(machine__kernel_maps(&session->machines.host), stderr);
> > > + verbose = save_verbose;
> > > +}
> > > +
> > > int cmd_script(int argc, const char **argv)
> > > {
> > > bool show_full_info = false;
> > > @@ -4366,6 +4376,9 @@ int cmd_script(int argc, const char **argv)
> > >
> > > flush_scripting();
> > >
> > > + if (verbose > 2)
> > > + dump_kmaps(session);
> > > +
> > > out_delete:
> > > if (script.ptime_range) {
> > > itrace_synth_opts__clear_time_range(&itrace_synth_opts);
> > > --
> > > 2.34.1
> > >
> >
>

2024-02-06 02:22:15

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 2/2] perf symbols: Slightly improve module file executable section mappings

On Sun, Feb 4, 2024 at 10:58 PM Adrian Hunter <[email protected]> wrote:
>
> On 3/02/24 03:44, Namhyung Kim wrote:
> > Hi Adrian,
> >
> > On Fri, Feb 02, 2024 at 01:01:30PM +0200, Adrian Hunter wrote:
> >> Currently perf does not record module section addresses except for
> >> the .text section. In general that means perf cannot get module section
> >> mappings correct (except for .text) when loading symbols from a kernel
> >> module file. (Note using --kcore does not have this issue)
> >>
> >> Improve that situation slightly by identifying executable sections that
> >> use the same mapping as the .text section. That happens when an
> >> executable section comes directly after the .text section, both in memory
> >> and in the file, something that can be determined by following the same layout
> >> rules used by the kernel (refer to kernel layout_sections()). Note whether
> >> that happens is somewhat arbitrary, so this is not a final solution.
> >>
> >> Example from tracing a virtual machine process:
> >>
> >> Before:
> >>
> >> $ perf script | grep unknown
> >> CPU 0/KVM 1718 203.511270: 318341 cpu-cycles:P: ffffffffc13e8a70 [unknown] (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
> >> $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
> >> Map: 0-7e0 41430 [kvm_intel].noinstr.text
> >> Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
> >>
> >> After:
> >>
> >> $ perf script | grep 203.511270
> >> CPU 0/KVM 1718 203.511270: 318341 cpu-cycles:P: ffffffffc13e8a70 vmx_vmexit+0x0 (/lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko)
> >> $ perf script -vvv 2>&1 >/dev/null | grep kvm.intel | grep 'noinstr.text\|ffff'
> >> Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko
> >>
> >> Reported-by: Like Xu <[email protected]>
> >> Signed-off-by: Adrian Hunter <[email protected]>
> >> ---
> >> tools/perf/util/symbol-elf.c | 75 +++++++++++++++++++++++++++++++++++-
> >> 1 file changed, 73 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> >> index 9e7eeaf616b8..98bf0881aaf6 100644
> >> --- a/tools/perf/util/symbol-elf.c
> >> +++ b/tools/perf/util/symbol-elf.c
> >> @@ -23,6 +23,7 @@
> >> #include <linux/ctype.h>
> >> #include <linux/kernel.h>
> >> #include <linux/zalloc.h>
> >> +#include <linux/string.h>
> >> #include <symbol/kallsyms.h>
> >> #include <internal/lib.h>
> >>
> >> @@ -1329,6 +1330,58 @@ int symsrc__init(struct symsrc *ss, struct dso *dso, const char *name,
> >> return -1;
> >> }
> >>
> >> +static bool is_exe_text(int flags)
> >> +{
> >> + return (flags & (SHF_ALLOC | SHF_EXECINSTR)) == (SHF_ALLOC | SHF_EXECINSTR);
> >> +}
> >> +
> >> +/*
> >> + * Some executable module sections like .noinstr.text might be laid out with
> >> + * .text so they can use the same mapping (memory address to file offset).
> >> + * Check if that is the case. Refer to kernel layout_sections(). Return the
> >> + * maximum offset.
> >> + */
> >> +static u64 max_text_section(Elf *elf, GElf_Ehdr *ehdr)
> >> +{
> >> + Elf_Scn *sec = NULL;
> >> + GElf_Shdr shdr;
> >> + u64 offs = 0;
> >> +
> >> + /* Doesn't work for some arch */
> >> + if (ehdr->e_machine == EM_PARISC ||
> >> + ehdr->e_machine == EM_ALPHA)
> >> + return 0;
> >> +
> >> + /* ELF is corrupted/truncated, avoid calling elf_strptr. */
> >> + if (!elf_rawdata(elf_getscn(elf, ehdr->e_shstrndx), NULL))
> >> + return 0;
> >> +
> >> + while ((sec = elf_nextscn(elf, sec)) != NULL) {
> >> + char *sec_name;
> >> +
> >> + if (!gelf_getshdr(sec, &shdr))
> >> + break;
> >> +
> >> + if (!is_exe_text(shdr.sh_flags))
> >> + continue;
> >> +
> >> + /* .init and .exit sections are not placed with .text */
> >> + sec_name = elf_strptr(elf, ehdr->e_shstrndx, shdr.sh_name);
> >> + if (!sec_name ||
> >> + strstarts(sec_name, ".init") ||
> >> + strstarts(sec_name, ".exit"))
> >> + break;
> >
> > Do we really need this? It seems my module has .init.text section
> > next to .text.
> >
> > $ readelf -SW /lib/modules/`uname -r`/kernel/fs/ext4/ext4.ko
> > There are 77 section headers, starting at offset 0x252e90:
> >
> > Section Headers:
> > [Nr] Name Type Address Off Size ES Flg Lk Inf Al
> > [ 0] NULL 0000000000000000 000000 000000 00 0 0 0
> > [ 1] .text PROGBITS 0000000000000000 000040 079fa7 00 AX 0 0 16
> > [ 2] .rela.text RELA 0000000000000000 13c348 04f0c8 18 I 74 1 8
> > [ 3] .init.text PROGBITS 0000000000000000 079ff0 00060c 00 AX 0 0 16
> > ...
> >
> >
> > ALIGN(0x40 + 0x79fa7, 16) = 0x79ff0, right?
>
> But not in memory e.g.
>
> Section Headers:
> [Nr] Name Type Address Off Size ES Flg Lk Inf Al
> [ 3] .text PROGBITS 0000000000000000 0000a0 071719 00 AX 0 0 16
> [ 5] .text.unlikely PROGBITS 0000000000000000 0717b9 000a59 00 AX 0 0 1
> [ 7] .init.text PROGBITS 0000000000000000 072212 0004fe 00 AX 0 0 1
> [ 9] .altinstr_replacement PROGBITS 0000000000000000 072710 000004 00 AX 0 0 1
> [10] .static_call.text PROGBITS 0000000000000000 072714 000388 00 AX 0 0 4
> [12] .exit.text PROGBITS 0000000000000000 072a9c 000078 00 AX 0 0 1
>
>
> /sys/module/ext4/sections/.text: 0xffffffffc0453000
> /sys/module/ext4/sections/.text.unlikely: 0xffffffffc04c4719
> /sys/module/ext4/sections/.init.text: 0xffffffffc053e000
> /sys/module/ext4/sections/.altinstr_replacement: 0xffffffffc04c5172
> /sys/module/ext4/sections/.static_call.text: 0xffffffffc04c5178
> /sys/module/ext4/sections/.exit.text: 0xffffffffc04c5500
>
> Need to have:
>
> section address - offset == .text address - .text offset
>
> perf does not record the section address, but the kernel
> layout_sections() lays out executable sections in order
> starting with .text *until* it gets to .init* or .exit*.

Ok, thanks for the explanation!

Namhyung
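
For anyone following along, the condition discussed above (section address - offset == .text address - .text offset) can be checked by hand against the ext4 numbers quoted in this thread. The following is only a minimal sketch, not code from the patch: struct mod_sec, align_up(), shares_text_mapping() and count_sharing_text_mapping() are hypothetical names made up for illustration, and the table simply pairs the readelf offsets with the /sys/module/ext4/sections addresses shown above. align_up() is assumed to behave like the kernel's ALIGN() for power-of-two alignments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one module section, as seen in readelf and sysfs. */
struct mod_sec {
	const char *name;
	uint64_t addr;   /* from /sys/module/ext4/sections/<name> */
	uint64_t offset; /* "Off" column of readelf -SW */
};

/* Round x up to a power-of-two alignment (assumed equivalent to ALIGN()). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* True if sec can use the same address-to-file-offset mapping as .text. */
static bool shares_text_mapping(const struct mod_sec *text,
				const struct mod_sec *sec)
{
	assert(text && sec);
	return sec->addr - sec->offset == text->addr - text->offset;
}

/* Values copied from the second ext4 listing quoted in this thread. */
static const struct mod_sec ext4_secs[] = {
	{ ".text",                 0xffffffffc0453000ULL, 0x0000a0 },
	{ ".text.unlikely",        0xffffffffc04c4719ULL, 0x0717b9 },
	{ ".init.text",            0xffffffffc053e000ULL, 0x072212 },
	{ ".altinstr_replacement", 0xffffffffc04c5172ULL, 0x072710 },
	{ ".static_call.text",     0xffffffffc04c5178ULL, 0x072714 },
	{ ".exit.text",            0xffffffffc04c5500ULL, 0x072a9c },
};

/* Count sections (including .text at index 0) that share .text's mapping. */
static size_t count_sharing_text_mapping(const struct mod_sec *secs, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (shares_text_mapping(&secs[0], &secs[i]))
			count++;
	return count;
}
```

With these numbers only .text.unlikely shares the .text mapping. .init.text is adjacent in the file but is loaded elsewhere in memory, and every section after it in the file is shifted relative to .text, which is consistent with stopping the scan at the first .init* or .exit* section. The file-adjacency arithmetic from the first listing also holds: align_up(0x40 + 0x79fa7, 16) gives 0x79ff0, yet that alone says nothing about the layout in memory.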