2020-12-14 11:09:39

by Jiri Olsa

Subject: [PATCHv5 00/15] perf: Add mmap2 build id support

hi,
this adds support for storing the build id in the mmap2 event,
so we can bypass the final pass in which perf record hunts for build ids.

This patchset allows perf to record build ID in mmap2 event,
and adds perf tooling to store/download binaries to .debug
cache based on these build IDs.

Note that the build id retrieval code is taken from the bpf
code, where it is used (together with file offsets)
to replace IPs in user space stack traces. It is now moved
under the lib directory.
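
For readers who have not seen that code: it walks the ELF notes of a mapped
file and picks the GNU build ID note. The walk can be sketched in user space
as below. This is only an illustration under stated assumptions, not the code
from the series: demo_parse_build_id(), struct demo_nhdr and the DEMO_*
constants are made-up names, the note data is assumed to be in a buffer
already, and unlike the kernel version the sketch also compares the note name
against "GNU".

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NT_GNU_BUILD_ID 3	/* ELF note type of a GNU build ID */
#define DEMO_BUILD_ID_MAX    20	/* mirrors BUILD_ID_SIZE_MAX in the series */

/* Note header layout, identical for 32-bit and 64-bit ELF. */
struct demo_nhdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

/* Round x up to a multiple of 4, the note padding unit. */
static size_t demo_align4(size_t x)
{
	return (x + 3) & ~(size_t)3;
}

/*
 * Walk the notes in buf[0..len) and copy the first GNU build ID into out,
 * zero padded to DEMO_BUILD_ID_MAX. Returns the ID size, or -1 if no
 * usable note is found.
 */
static int demo_parse_build_id(const unsigned char *buf, size_t len,
			       unsigned char out[DEMO_BUILD_ID_MAX])
{
	size_t offs = 0;

	while (offs + sizeof(struct demo_nhdr) <= len) {
		struct demo_nhdr nhdr;
		size_t name_offs, desc_offs, next;

		memcpy(&nhdr, buf + offs, sizeof(nhdr));

		/* reject sizes that cannot fit, this also avoids overflow */
		if (nhdr.n_namesz > len || nhdr.n_descsz > len)
			return -1;

		name_offs = offs + sizeof(nhdr);
		desc_offs = name_offs + demo_align4(nhdr.n_namesz);
		next = desc_offs + demo_align4(nhdr.n_descsz);
		if (next > len)
			return -1;

		if (nhdr.n_type == DEMO_NT_GNU_BUILD_ID &&
		    nhdr.n_namesz == sizeof("GNU") &&
		    memcmp(buf + name_offs, "GNU", sizeof("GNU")) == 0 &&
		    nhdr.n_descsz > 0 &&
		    nhdr.n_descsz <= DEMO_BUILD_ID_MAX) {
			memset(out, 0, DEMO_BUILD_ID_MAX);
			memcpy(out, buf + desc_offs, nhdr.n_descsz);
			return (int)nhdr.n_descsz;
		}
		offs = next;
	}
	return -1;
}
```

The return value is the build ID length, so a caller can tell a 20 byte
SHA-1 style ID from a shorter one.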

v5 changes:
- rebased on latest perf/core
- several patches already pulled in
- fixed trace+probe_vfs_getname.sh output redirection
- fixed changelogs [Arnaldo]
- renamed BUILD_ID_SIZE to BUILD_ID_SIZE_MAX [Song]

v4 changes:
- fixed typo in changelog [Namhyung]
- removed force_download bool from struct dso_store_data,
because it's not used [Namhyung]

v3 changes:
- added acks
- removed forgotten debug code [Arnaldo]
- fixed readlink termination [Ian]
- fixed doc for --debuginfod=URLs [Ian]
- adopted kernel's memchr_inv function and used
it in build_id__is_defined function [Arnaldo]
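
The memchr_inv based check mentioned in the last item can be sketched like
this. It is a user-space approximation with hypothetical names
(demo_memchr_inv, demo_build_id_is_defined), assuming that a build id counts
as defined when it has a size and at least one non-zero byte; the perf
implementation may differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return the first byte in s[0..n) that differs from c, or NULL if none. */
static const void *demo_memchr_inv(const void *s, int c, size_t n)
{
	const unsigned char *p = s;
	size_t i;

	for (i = 0; i < n; i++) {
		if (p[i] != (unsigned char)c)
			return p + i;
	}
	return NULL;
}

/* A build ID counts as defined when it has a size and a non-zero byte. */
static bool demo_build_id_is_defined(const unsigned char *data, size_t size)
{
	return data && size && demo_memchr_inv(data, 0, size) != NULL;
}
```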

On recording server:

- on the recording server we can run record with --buildid-mmap
option to store build ids in mmap2 events:

# perf record --buildid-mmap
^C[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.836 MB perf.data ]

- it stores nothing to ~/.debug cache:

# find ~/.debug
find: ‘/root/.debug’: No such file or directory

- and still reports properly:

# perf report --stdio
...
99.82% swapper [kernel.kallsyms] [k] native_safe_halt
0.03% swapper [kernel.kallsyms] [k] finish_task_switch
0.02% swapper [kernel.kallsyms] [k] __softirqentry_text_start
0.01% kcompactd0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
0.01% ksoftirqd/6 [kernel.kallsyms] [k] slab_free_freelist_hook
0.01% kworker/17:1H-x [kernel.kallsyms] [k] slab_free_freelist_hook

- display used/hit build ids:

# perf buildid-list | head -5
5dcec522abf136fcfd3128f47e131f2365834dd7 /proc/kcore
589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so

- store build id binaries into build id cache:

# perf buildid-cache -a perf.data
OK 5dcec522abf136fcfd3128f47e131f2365834dd7 /proc/kcore
OK 589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
OK 94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
OK 559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
OK 40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so
OK a674f7a47c78e35a088104647b9640710277b489 /usr/sbin/sshd
OK e5cb4ca25f46485bdbc691c3a92e7e111dac3ef2 /usr/bin/bash
OK 9bc8589108223c944b452f0819298a0c3cba6215 /usr/bin/find

# find ~/.debug | head -5
/root/.debug
/root/.debug/proc
/root/.debug/proc/kcore
/root/.debug/proc/kcore/5dcec522abf136fcfd3128f47e131f2365834dd7
/root/.debug/proc/kcore/5dcec522abf136fcfd3128f47e131f2365834dd7/kallsyms

- run debuginfod daemon to provide binaries to another server (below)
(the initialization could take some time)

# debuginfod -F /


On another server:

- copy perf.data from 'record' server and run:

$ find ~/.debug/
find: ‘/home/jolsa/.debug/’: No such file or directory

$ perf buildid-list | head -5
No kallsyms or vmlinux with build-id 5dcec522abf136fcfd3128f47e131f2365834dd7 was found
5dcec522abf136fcfd3128f47e131f2365834dd7 [kernel.kallsyms]
5784f813b727a50cfd3363234aef9fcbab685cc4 /lib/modules/5.10.0-rc2speed+/kernel/fs/xfs/xfs.ko
589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so

- report does not resolve any symbols (kernel build id does not match):

$ perf report --stdio
...
76.73% swapper [kernel.kallsyms] [k] 0xffffffff81aa8ebe
1.89% find [kernel.kallsyms] [k] 0xffffffff810f2167
0.93% sshd [kernel.kallsyms] [k] 0xffffffff8153380c
0.83% swapper [kernel.kallsyms] [k] 0xffffffff81104b0b
0.71% kworker/u40:2-e [kernel.kallsyms] [k] 0xffffffff810f3850
0.70% kworker/u40:0-e [kernel.kallsyms] [k] 0xffffffff810f3850
0.64% find [kernel.kallsyms] [k] 0xffffffff81a9ba0a
0.63% find [kernel.kallsyms] [k] 0xffffffff81aa93b0

- adding build ids does not work, because the existing binaries (on the other
server) have different build ids:

$ perf buildid-cache -a perf.data
No kallsyms or vmlinux with build-id 5dcec522abf136fcfd3128f47e131f2365834dd7 was found
FAIL 5dcec522abf136fcfd3128f47e131f2365834dd7 [kernel.kallsyms]
FAIL 5784f813b727a50cfd3363234aef9fcbab685cc4 /lib/modules/5.10.0-rc2speed+/kernel/fs/xfs/xfs.ko
FAIL 589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
FAIL 94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
FAIL 559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
FAIL 40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so
FAIL a674f7a47c78e35a088104647b9640710277b489 /usr/sbin/sshd
FAIL e5cb4ca25f46485bdbc691c3a92e7e111dac3ef2 /usr/bin/bash
FAIL 9bc8589108223c944b452f0819298a0c3cba6215 /usr/bin/find

- add build ids with debuginfod setup pointing to record server:

$ perf buildid-cache -a perf.data --debuginfod http://192.168.122.174:8002
No kallsyms or vmlinux with build-id 5dcec522abf136fcfd3128f47e131f2365834dd7 was found
OK 5dcec522abf136fcfd3128f47e131f2365834dd7 [kernel.kallsyms]
OK 5784f813b727a50cfd3363234aef9fcbab685cc4 /lib/modules/5.10.0-rc2speed+/kernel/fs/xfs/xfs.ko
OK 589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
OK 94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
OK 559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
OK 40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so
OK a674f7a47c78e35a088104647b9640710277b489 /usr/sbin/sshd
OK e5cb4ca25f46485bdbc691c3a92e7e111dac3ef2 /usr/bin/bash
OK 9bc8589108223c944b452f0819298a0c3cba6215 /usr/bin/find

- and report works:

$ perf report --stdio
...
76.73% swapper [kernel.kallsyms] [k] native_safe_halt
1.91% find [kernel.kallsyms] [k] queue_work_on
0.93% sshd [kernel.kallsyms] [k] iowrite16
0.83% swapper [kernel.kallsyms] [k] finish_task_switch
0.72% kworker/u40:2-e [kernel.kallsyms] [k] process_one_work
0.70% kworker/u40:0-e [kernel.kallsyms] [k] process_one_work
0.64% find [kernel.kallsyms] [k] syscall_enter_from_user_mode
0.63% find [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore

- because we have the data in build id cache:

$ find ~/.debug | head -10
.../.debug
.../.debug/home
.../.debug/home/jolsa
.../.debug/home/jolsa/.cache
.../.debug/home/jolsa/.cache/debuginfod_client
.../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7
.../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable
.../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable/5dcec522abf136fcfd3128f47e131f2365834dd7
.../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable/5dcec522abf136fcfd3128f47e131f2365834dd7/elf
.../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable/5dcec522abf136fcfd3128f47e131f2365834dd7/debug


Available also in:
git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
perf/build_id

thanks,
jirka


Cc: Frank Ch. Eigler <[email protected]>
Cc: Mark Wielaard <[email protected]>
---
include/linux/buildid.h | 12 +++++
include/uapi/linux/perf_event.h | 42 +++++++++++++++---
kernel/bpf/stackmap.c | 143 ++----------------------------------------------------------
kernel/events/core.c | 32 ++++++++++++--
lib/Makefile | 3 +-
lib/buildid.c | 149 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
tools/include/uapi/linux/perf_event.h | 42 +++++++++++++++---
tools/lib/perf/include/perf/event.h | 18 ++++++--
tools/perf/Documentation/perf-buildid-cache.txt | 18 +++++++-
tools/perf/Documentation/perf-config.txt | 10 ++++-
tools/perf/Documentation/perf-record.txt | 3 ++
tools/perf/builtin-buildid-cache.c | 241 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
tools/perf/builtin-buildid-list.c | 3 ++
tools/perf/builtin-record.c | 20 +++++++++
tools/perf/tests/shell/trace+probe_vfs_getname.sh | 2 +-
tools/perf/util/event.c | 41 +++++++++++++-----
tools/perf/util/evsel.c | 10 +++--
tools/perf/util/machine.c | 24 +++++++---
tools/perf/util/map.c | 8 +++-
tools/perf/util/map.h | 3 +-
tools/perf/util/perf_api_probe.c | 10 +++++
tools/perf/util/perf_api_probe.h | 1 +
tools/perf/util/perf_event_attr_fprintf.c | 2 +
tools/perf/util/probe-event.c | 6 +--
tools/perf/util/record.h | 1 +
tools/perf/util/session.c | 11 +++--
tools/perf/util/symbol-elf.c | 37 +++++++++++++++-
tools/perf/util/symbol_conf.h | 3 +-
tools/perf/util/synthetic-events.c | 121 ++++++++++++++++++++++++++++++++++++++-------------
29 files changed, 787 insertions(+), 229 deletions(-)


2020-12-14 11:09:58

by Jiri Olsa

Subject: [PATCH 10/15] perf tools: Synthesize build id for kernel/modules/tasks

Adding build id to synthesized mmap2 events for
the kernel, modules and tasks.

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/synthetic-events.c | 32 ++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)

diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index c209377106f5..a8d951a79f5e 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -347,6 +347,31 @@ static bool read_proc_maps_line(struct io *io, __u64 *start, __u64 *end,
}
}

+static void perf_record_mmap2__read_build_id(struct perf_record_mmap2 *event,
+ bool is_kernel)
+{
+ struct build_id bid;
+ int rc;
+
+ if (is_kernel)
+ rc = sysfs__read_build_id("/sys/kernel/notes", &bid);
+ else
+ rc = filename__read_build_id(event->filename, &bid) > 0 ? 0 : -1;
+
+ if (rc == 0) {
+ memcpy(event->build_id, bid.data, sizeof(bid.data));
+ event->build_id_size = (u8) bid.size;
+ event->header.misc |= PERF_RECORD_MISC_MMAP_BUILD_ID;
+ event->__reserved_1 = 0;
+ event->__reserved_2 = 0;
+ } else {
+ if (event->filename[0] == '/') {
+ pr_debug2("Failed to read build ID for %s\n",
+ event->filename);
+ }
+ }
+}
+
int perf_event__synthesize_mmap_events(struct perf_tool *tool,
union perf_event *event,
pid_t pid, pid_t tgid,
@@ -453,6 +478,9 @@ int perf_event__synthesize_mmap_events(struct perf_tool *tool,
event->mmap2.pid = tgid;
event->mmap2.tid = pid;

+ if (symbol_conf.buildid_mmap2)
+ perf_record_mmap2__read_build_id(&event->mmap2, false);
+
if (perf_tool__process_synth_event(tool, event, machine, process) != 0) {
rc = -1;
break;
@@ -633,6 +661,8 @@ int perf_event__synthesize_modules(struct perf_tool *tool, perf_event__handler_t

memcpy(event->mmap2.filename, pos->dso->long_name,
pos->dso->long_name_len + 1);
+
+ perf_record_mmap2__read_build_id(&event->mmap2, false);
} else {
size = PERF_ALIGN(pos->dso->long_name_len + 1, sizeof(u64));
event->mmap.header.type = PERF_RECORD_MMAP;
@@ -1053,6 +1083,8 @@ static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
event->mmap2.start = map->start;
event->mmap2.len = map->end - event->mmap.start;
event->mmap2.pid = machine->pid;
+
+ perf_record_mmap2__read_build_id(&event->mmap2, true);
} else {
size = snprintf(event->mmap.filename, sizeof(event->mmap.filename),
"%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
--
2.26.2

2020-12-14 11:10:05

by Jiri Olsa

Subject: [PATCH 13/15] perf buildid-cache: Add --debuginfod option

Adding the --debuginfod option to specify the debuginfod URL,
with support for setting it through the config file as well.

Use the following in the ~/.perfconfig file:

[buildid-cache]
debuginfod=http://192.168.122.174:8002
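
The resulting precedence can be sketched as below: in the patch the config
file is read before parse_options(), so a --debuginfod given on the command
line replaces the config value, and whatever wins is exported as
DEBUGINFOD_URLS. The helpers demo_pick_debuginfod() and
demo_export_debuginfod() are hypothetical and only illustrate that order.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L	/* for setenv() */
#endif
#include <stdlib.h>

/* Return the URL list that wins: command line first, then config file. */
static const char *demo_pick_debuginfod(const char *from_config,
					const char *from_cmdline)
{
	return from_cmdline ? from_cmdline : from_config;
}

/*
 * Export the chosen URL list so that a debuginfod client can pick it up.
 * Returns 0 on success or when there is nothing to export, -1 on error.
 */
static int demo_export_debuginfod(const char *urls)
{
	if (!urls)
		return 0;
	return setenv("DEBUGINFOD_URLS", urls, 1) ? -1 : 0;
}
```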

Acked-by: Ian Rogers <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
.../perf/Documentation/perf-buildid-cache.txt | 6 ++++
tools/perf/Documentation/perf-config.txt | 7 +++++
tools/perf/builtin-buildid-cache.c | 28 +++++++++++++++++--
3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/tools/perf/Documentation/perf-buildid-cache.txt b/tools/perf/Documentation/perf-buildid-cache.txt
index b77da5138bca..b9987d1399ca 100644
--- a/tools/perf/Documentation/perf-buildid-cache.txt
+++ b/tools/perf/Documentation/perf-buildid-cache.txt
@@ -84,6 +84,12 @@ OPTIONS
used when creating a uprobe for a process that resides in a
different mount namespace from the perf(1) utility.

+--debuginfod=URLs::
+ Specify debuginfod URL to be used when retrieving perf.data binaries,
+ it follows the same syntax as the DEBUGINFOD_URLS variable, like:
+
+ buildid-cache.debuginfod=http://192.168.122.174:8002
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-buildid-list[1]
diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index 31069d8a5304..e3672c5d801b 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -238,6 +238,13 @@ buildid.*::
cache location, or to disable it altogether. If you want to disable it,
set buildid.dir to /dev/null. The default is $HOME/.debug

+buildid-cache.*::
+ buildid-cache.debuginfod=URLs
+ Specify debuginfod URLs to be used when retrieving perf.data binaries,
+ it follows the same syntax as the DEBUGINFOD_URLS variable, like:
+
+ buildid-cache.debuginfod=http://192.168.122.174:8002
+
annotate.*::
These are in control of addresses, jump function, source code
in lines of assembly code from a specific program.
diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index f0afb2c89e03..864597fd9cf6 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -27,6 +27,7 @@
#include "util/time-utils.h"
#include "util/util.h"
#include "util/probe-file.h"
+#include "util/config.h"
#include <linux/string.h>
#include <linux/err.h>
#include <linux/zalloc.h>
@@ -550,12 +551,21 @@ build_id_cache__add_perf_data(const char *path, bool all)
return err;
}

+static int perf_buildid_cache_config(const char *var, const char *value, void *cb)
+{
+ const char **debuginfod = cb;
+
+ if (!strcmp(var, "buildid-cache.debuginfod"))
+ *debuginfod = strdup(value);
+
+ return 0;
+}
+
int cmd_buildid_cache(int argc, const char **argv)
{
struct strlist *list;
struct str_node *pos;
- int ret = 0;
- int ns_id = -1;
+ int ret, ns_id = -1;
bool force = false;
bool list_files = false;
bool opts_flag = false;
@@ -565,7 +575,8 @@ int cmd_buildid_cache(int argc, const char **argv)
*purge_name_list_str = NULL,
*missing_filename = NULL,
*update_name_list_str = NULL,
- *kcore_filename = NULL;
+ *kcore_filename = NULL,
+ *debuginfod = NULL;
char sbuf[STRERR_BUFSIZE];

struct perf_data data = {
@@ -590,6 +601,8 @@ int cmd_buildid_cache(int argc, const char **argv)
OPT_BOOLEAN('f', "force", &force, "don't complain, do it"),
OPT_STRING('u', "update", &update_name_list_str, "file list",
"file(s) to update"),
+ OPT_STRING(0, "debuginfod", &debuginfod, "debuginfod url",
+ "set debuginfod url"),
OPT_INCR('v', "verbose", &verbose, "be more verbose"),
OPT_INTEGER(0, "target-ns", &ns_id, "target pid for namespace context"),
OPT_END()
@@ -599,6 +612,10 @@ int cmd_buildid_cache(int argc, const char **argv)
NULL
};

+ ret = perf_config(perf_buildid_cache_config, &debuginfod);
+ if (ret)
+ return ret;
+
argc = parse_options(argc, argv, buildid_cache_options,
buildid_cache_usage, 0);

@@ -610,6 +627,11 @@ int cmd_buildid_cache(int argc, const char **argv)
if (argc || !(list_files || opts_flag))
usage_with_options(buildid_cache_usage, buildid_cache_options);

+ if (debuginfod) {
+ pr_debug("DEBUGINFOD_URLS=%s\n", debuginfod);
+ setenv("DEBUGINFOD_URLS", debuginfod, 1);
+ }
+
/* -l is exclusive. It can not be used with other options. */
if (list_files && opts_flag) {
usage_with_options_msg(buildid_cache_usage,
--
2.26.2

2020-12-14 11:10:11

by Jiri Olsa

Subject: [PATCH 07/15] perf tools: Store build id from mmap2 events

When processing an mmap2 event, check the build id
misc bit PERF_RECORD_MISC_MMAP_BUILD_ID and, if it
is set, store the build id in the map's dso object.

Also adding the build id data to the struct
perf_record_mmap2 event definition.
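
A stand-alone sketch of why the new layout is safe and how a reader of the
event would use it. The union below copies only the changed part of the
struct (with the reserved fields renamed), DEMO_MISC_MMAP_BUILD_ID is an
assumed value for the misc bit since its definition is not part of this
mail, and demo_mmap2_build_id() is a hypothetical helper, not a perf
function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed value of PERF_RECORD_MISC_MMAP_BUILD_ID, not shown in this mail. */
#define DEMO_MISC_MMAP_BUILD_ID (1 << 14)

/* The part of perf_record_mmap2 that the patch turns into a union. */
union demo_mmap2_id {
	struct {
		uint32_t maj;
		uint32_t min;
		uint64_t ino;
		uint64_t ino_generation;
	};
	struct {
		uint8_t  build_id_size;
		uint8_t  reserved_1;
		uint16_t reserved_2;
		uint8_t  build_id[20];
	};
};

/* Both views occupy 24 bytes, so the rest of the event does not move. */
static_assert(sizeof(union demo_mmap2_id) == 24, "union must stay 24 bytes");

/*
 * Copy the build ID out of the event if the misc bit says one is present.
 * Returns the ID size in bytes, or 0 when the event carries maj/min/ino.
 */
static size_t demo_mmap2_build_id(uint16_t misc, const union demo_mmap2_id *id,
				  uint8_t out[20])
{
	size_t size;

	if (!(misc & DEMO_MISC_MMAP_BUILD_ID))
		return 0;

	size = id->build_id_size;
	if (size > sizeof(id->build_id))
		size = sizeof(id->build_id);	/* clamp an oversized value */
	memcpy(out, id->build_id, size);
	return size;
}
```

The clamp is a choice made for this sketch; the patch itself passes
build_id_size straight to build_id__init().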

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/lib/perf/include/perf/event.h | 18 ++++++++++++++----
tools/perf/util/machine.c | 24 +++++++++++++++++++-----
tools/perf/util/map.c | 8 ++++++--
tools/perf/util/map.h | 3 ++-
4 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h
index 988c539bedb6..d82054225fcc 100644
--- a/tools/lib/perf/include/perf/event.h
+++ b/tools/lib/perf/include/perf/event.h
@@ -23,10 +23,20 @@ struct perf_record_mmap2 {
__u64 start;
__u64 len;
__u64 pgoff;
- __u32 maj;
- __u32 min;
- __u64 ino;
- __u64 ino_generation;
+ union {
+ struct {
+ __u32 maj;
+ __u32 min;
+ __u64 ino;
+ __u64 ino_generation;
+ };
+ struct {
+ __u8 build_id_size;
+ __u8 __reserved_1;
+ __u16 __reserved_2;
+ __u8 build_id[20];
+ };
+ };
__u32 prot;
__u32 flags;
char filename[PATH_MAX];
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 1ae32a81639c..1edb7d10b042 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1599,7 +1599,8 @@ static int machine__process_extra_kernel_map(struct machine *machine,
}

static int machine__process_kernel_mmap_event(struct machine *machine,
- struct extra_kernel_map *xm)
+ struct extra_kernel_map *xm,
+ struct build_id *bid)
{
struct map *map;
enum dso_space_type dso_space;
@@ -1624,6 +1625,10 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
goto out_problem;

map->end = map->start + xm->end - xm->start;
+
+ if (build_id__is_defined(bid))
+ dso__set_build_id(map->dso, bid);
+
} else if (is_kernel_mmap) {
const char *symbol_name = (xm->name + strlen(machine->mmap_name));
/*
@@ -1681,6 +1686,9 @@ static int machine__process_kernel_mmap_event(struct machine *machine,

machine__update_kernel_mmap(machine, xm->start, xm->end);

+ if (build_id__is_defined(bid))
+ dso__set_build_id(kernel, bid);
+
/*
* Avoid using a zero address (kptr_restrict) for the ref reloc
* symbol. Effectively having zero here means that at record
@@ -1718,11 +1726,17 @@ int machine__process_mmap2_event(struct machine *machine,
.ino = event->mmap2.ino,
.ino_generation = event->mmap2.ino_generation,
};
+ struct build_id __bid, *bid = NULL;
int ret = 0;

if (dump_trace)
perf_event__fprintf_mmap2(event, stdout);

+ if (event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID) {
+ bid = &__bid;
+ build_id__init(bid, event->mmap2.build_id, event->mmap2.build_id_size);
+ }
+
if (sample->cpumode == PERF_RECORD_MISC_GUEST_KERNEL ||
sample->cpumode == PERF_RECORD_MISC_KERNEL) {
struct extra_kernel_map xm = {
@@ -1732,7 +1746,7 @@ int machine__process_mmap2_event(struct machine *machine,
};

strlcpy(xm.name, event->mmap2.filename, KMAP_NAME_LEN);
- ret = machine__process_kernel_mmap_event(machine, &xm);
+ ret = machine__process_kernel_mmap_event(machine, &xm, bid);
if (ret < 0)
goto out_problem;
return 0;
@@ -1746,7 +1760,7 @@ int machine__process_mmap2_event(struct machine *machine,
map = map__new(machine, event->mmap2.start,
event->mmap2.len, event->mmap2.pgoff,
&dso_id, event->mmap2.prot,
- event->mmap2.flags,
+ event->mmap2.flags, bid,
event->mmap2.filename, thread);

if (map == NULL)
@@ -1789,7 +1803,7 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event
};

strlcpy(xm.name, event->mmap.filename, KMAP_NAME_LEN);
- ret = machine__process_kernel_mmap_event(machine, &xm);
+ ret = machine__process_kernel_mmap_event(machine, &xm, NULL);
if (ret < 0)
goto out_problem;
return 0;
@@ -1805,7 +1819,7 @@ int machine__process_mmap_event(struct machine *machine, union perf_event *event

map = map__new(machine, event->mmap.start,
event->mmap.len, event->mmap.pgoff,
- NULL, prot, 0, event->mmap.filename, thread);
+ NULL, prot, 0, NULL, event->mmap.filename, thread);

if (map == NULL)
goto out_problem_map;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index f44ede437dc7..692e56dc832e 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -130,8 +130,8 @@ void map__init(struct map *map, u64 start, u64 end, u64 pgoff, struct dso *dso)

struct map *map__new(struct machine *machine, u64 start, u64 len,
u64 pgoff, struct dso_id *id,
- u32 prot, u32 flags, char *filename,
- struct thread *thread)
+ u32 prot, u32 flags, struct build_id *bid,
+ char *filename, struct thread *thread)
{
struct map *map = malloc(sizeof(*map));
struct nsinfo *nsi = NULL;
@@ -194,6 +194,10 @@ struct map *map__new(struct machine *machine, u64 start, u64 len,
dso__set_loaded(dso);
}
dso->nsinfo = nsi;
+
+ if (build_id__is_defined(bid))
+ dso__set_build_id(dso, bid);
+
dso__put(dso);
}
return map;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index b1c0686db1b7..9f32825c98d8 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -104,10 +104,11 @@ void map__init(struct map *map,
u64 start, u64 end, u64 pgoff, struct dso *dso);

struct dso_id;
+struct build_id;

struct map *map__new(struct machine *machine, u64 start, u64 len,
u64 pgoff, struct dso_id *id, u32 prot, u32 flags,
- char *filename, struct thread *thread);
+ struct build_id *bid, char *filename, struct thread *thread);
struct map *map__new2(u64 start, struct dso *dso);
void map__delete(struct map *map);
struct map *map__clone(struct map *map);
--
2.26.2

2020-12-14 11:10:12

by Jiri Olsa

Subject: [PATCH 08/15] perf tools: Allow mmap2 event to synthesize kernel image

Allow the kernel image to be synthesized as an mmap2
event, so we can synthesize kernel build id data in
the following changes.

It's enabled by the new symbol_conf.buildid_mmap2
bool, which will be switched on in the following
changes.
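
Both branches compute header.size the same way: the event is allocated at
full size, but the recorded size covers only the used part of the filename
array, padded to 8 bytes, plus the sample id header. A minimal sketch of
that arithmetic, with hypothetical names (demo_align, demo_event_size) and
DEMO_PATH_MAX standing in for PATH_MAX:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PATH_MAX 4096	/* stands in for PATH_MAX */

/* Round x up to a multiple of a, where a is a power of two. */
static size_t demo_align(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Size to store in header.size for an event whose last member is a
 * filename[DEMO_PATH_MAX] array: drop the unused tail of the array, keep
 * the name (including its NUL) padded to 8 bytes, add the sample id header.
 */
static size_t demo_event_size(size_t sizeof_event, size_t name_len,
			      size_t id_hdr_size)
{
	size_t used = demo_align(name_len + 1, sizeof(uint64_t));

	return sizeof_event - (DEMO_PATH_MAX - used) + id_hdr_size;
}
```

Here name_len plays the role of the snprintf() return value in the patch,
which does not count the terminating NUL.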

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/symbol_conf.h | 3 ++-
tools/perf/util/synthetic-events.c | 40 ++++++++++++++++++++----------
2 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index b916afb95ec5..b18f9c8dbb75 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -42,7 +42,8 @@ struct symbol_conf {
report_block,
report_individual_block,
inline_name,
- disable_add2line_warn;
+ disable_add2line_warn,
+ buildid_mmap2;
const char *vmlinux_name,
*kallsyms_name,
*source_prefix,
diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index 297987c6960b..a5f271fa2ae4 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -991,11 +991,12 @@ static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
perf_event__handler_t process,
struct machine *machine)
{
- size_t size;
+ union perf_event *event;
+ size_t size = symbol_conf.buildid_mmap2 ?
+ sizeof(event->mmap2) : sizeof(event->mmap);
struct map *map = machine__kernel_map(machine);
struct kmap *kmap;
int err;
- union perf_event *event;

if (map == NULL)
return -1;
@@ -1009,7 +1010,7 @@ static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
* available use this, and after it is use this as a fallback for older
* kernels.
*/
- event = zalloc((sizeof(event->mmap) + machine->id_hdr_size));
+ event = zalloc(size + machine->id_hdr_size);
if (event == NULL) {
pr_debug("Not enough memory synthesizing mmap event "
"for kernel modules\n");
@@ -1026,16 +1027,29 @@ static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;
}

- size = snprintf(event->mmap.filename, sizeof(event->mmap.filename),
- "%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
- size = PERF_ALIGN(size, sizeof(u64));
- event->mmap.header.type = PERF_RECORD_MMAP;
- event->mmap.header.size = (sizeof(event->mmap) -
- (sizeof(event->mmap.filename) - size) + machine->id_hdr_size);
- event->mmap.pgoff = kmap->ref_reloc_sym->addr;
- event->mmap.start = map->start;
- event->mmap.len = map->end - event->mmap.start;
- event->mmap.pid = machine->pid;
+ if (symbol_conf.buildid_mmap2) {
+ size = snprintf(event->mmap2.filename, sizeof(event->mmap2.filename),
+ "%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
+ size = PERF_ALIGN(size, sizeof(u64));
+ event->mmap2.header.type = PERF_RECORD_MMAP2;
+ event->mmap2.header.size = (sizeof(event->mmap2) -
+ (sizeof(event->mmap2.filename) - size) + machine->id_hdr_size);
+ event->mmap2.pgoff = kmap->ref_reloc_sym->addr;
+ event->mmap2.start = map->start;
+ event->mmap2.len = map->end - event->mmap.start;
+ event->mmap2.pid = machine->pid;
+ } else {
+ size = snprintf(event->mmap.filename, sizeof(event->mmap.filename),
+ "%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
+ size = PERF_ALIGN(size, sizeof(u64));
+ event->mmap.header.type = PERF_RECORD_MMAP;
+ event->mmap.header.size = (sizeof(event->mmap) -
+ (sizeof(event->mmap.filename) - size) + machine->id_hdr_size);
+ event->mmap.pgoff = kmap->ref_reloc_sym->addr;
+ event->mmap.start = map->start;
+ event->mmap.len = map->end - event->mmap.start;
+ event->mmap.pid = machine->pid;
+ }

err = perf_tool__process_synth_event(tool, event, machine, process);
free(event);
--
2.26.2

2020-12-14 11:10:36

by Jiri Olsa

Subject: [PATCH 01/15] bpf: Move stack_map_get_build_id into lib

Moving stack_map_get_build_id into lib with
declaration in linux/buildid.h header:

int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id);

This function returns the build id for the given struct vm_area_struct.
There is no functional change compared to the stack_map_get_build_id function.

Cc: Alexei Starovoitov <[email protected]>
Acked-by: Song Liu <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
include/linux/buildid.h | 11 ++++
kernel/bpf/stackmap.c | 143 ++--------------------------------------
lib/Makefile | 3 +-
lib/buildid.c | 136 ++++++++++++++++++++++++++++++++++++++
4 files changed, 153 insertions(+), 140 deletions(-)
create mode 100644 include/linux/buildid.h
create mode 100644 lib/buildid.c

diff --git a/include/linux/buildid.h b/include/linux/buildid.h
new file mode 100644
index 000000000000..08028a212589
--- /dev/null
+++ b/include/linux/buildid.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_BUILDID_H
+#define _LINUX_BUILDID_H
+
+#include <linux/mm_types.h>
+
+#define BUILD_ID_SIZE_MAX 20
+
+int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id);
+
+#endif
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index 06065fa27124..d21512fbfa9a 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -7,10 +7,9 @@
#include <linux/kernel.h>
#include <linux/stacktrace.h>
#include <linux/perf_event.h>
-#include <linux/elf.h>
-#include <linux/pagemap.h>
#include <linux/irq_work.h>
#include <linux/btf_ids.h>
+#include <linux/buildid.h>
#include "percpu_freelist.h"

#define STACK_CREATE_FLAG_MASK \
@@ -153,140 +152,6 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
return ERR_PTR(err);
}

-#define BPF_BUILD_ID 3
-/*
- * Parse build id from the note segment. This logic can be shared between
- * 32-bit and 64-bit system, because Elf32_Nhdr and Elf64_Nhdr are
- * identical.
- */
-static inline int stack_map_parse_build_id(void *page_addr,
- unsigned char *build_id,
- void *note_start,
- Elf32_Word note_size)
-{
- Elf32_Word note_offs = 0, new_offs;
-
- /* check for overflow */
- if (note_start < page_addr || note_start + note_size < note_start)
- return -EINVAL;
-
- /* only supports note that fits in the first page */
- if (note_start + note_size > page_addr + PAGE_SIZE)
- return -EINVAL;
-
- while (note_offs + sizeof(Elf32_Nhdr) < note_size) {
- Elf32_Nhdr *nhdr = (Elf32_Nhdr *)(note_start + note_offs);
-
- if (nhdr->n_type == BPF_BUILD_ID &&
- nhdr->n_namesz == sizeof("GNU") &&
- nhdr->n_descsz > 0 &&
- nhdr->n_descsz <= BPF_BUILD_ID_SIZE) {
- memcpy(build_id,
- note_start + note_offs +
- ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr),
- nhdr->n_descsz);
- memset(build_id + nhdr->n_descsz, 0,
- BPF_BUILD_ID_SIZE - nhdr->n_descsz);
- return 0;
- }
- new_offs = note_offs + sizeof(Elf32_Nhdr) +
- ALIGN(nhdr->n_namesz, 4) + ALIGN(nhdr->n_descsz, 4);
- if (new_offs <= note_offs) /* overflow */
- break;
- note_offs = new_offs;
- }
- return -EINVAL;
-}
-
-/* Parse build ID from 32-bit ELF */
-static int stack_map_get_build_id_32(void *page_addr,
- unsigned char *build_id)
-{
- Elf32_Ehdr *ehdr = (Elf32_Ehdr *)page_addr;
- Elf32_Phdr *phdr;
- int i;
-
- /* only supports phdr that fits in one page */
- if (ehdr->e_phnum >
- (PAGE_SIZE - sizeof(Elf32_Ehdr)) / sizeof(Elf32_Phdr))
- return -EINVAL;
-
- phdr = (Elf32_Phdr *)(page_addr + sizeof(Elf32_Ehdr));
-
- for (i = 0; i < ehdr->e_phnum; ++i) {
- if (phdr[i].p_type == PT_NOTE &&
- !stack_map_parse_build_id(page_addr, build_id,
- page_addr + phdr[i].p_offset,
- phdr[i].p_filesz))
- return 0;
- }
- return -EINVAL;
-}
-
-/* Parse build ID from 64-bit ELF */
-static int stack_map_get_build_id_64(void *page_addr,
- unsigned char *build_id)
-{
- Elf64_Ehdr *ehdr = (Elf64_Ehdr *)page_addr;
- Elf64_Phdr *phdr;
- int i;
-
- /* only supports phdr that fits in one page */
- if (ehdr->e_phnum >
- (PAGE_SIZE - sizeof(Elf64_Ehdr)) / sizeof(Elf64_Phdr))
- return -EINVAL;
-
- phdr = (Elf64_Phdr *)(page_addr + sizeof(Elf64_Ehdr));
-
- for (i = 0; i < ehdr->e_phnum; ++i) {
- if (phdr[i].p_type == PT_NOTE &&
- !stack_map_parse_build_id(page_addr, build_id,
- page_addr + phdr[i].p_offset,
- phdr[i].p_filesz))
- return 0;
- }
- return -EINVAL;
-}
-
-/* Parse build ID of ELF file mapped to vma */
-static int stack_map_get_build_id(struct vm_area_struct *vma,
- unsigned char *build_id)
-{
- Elf32_Ehdr *ehdr;
- struct page *page;
- void *page_addr;
- int ret;
-
- /* only works for page backed storage */
- if (!vma->vm_file)
- return -EINVAL;
-
- page = find_get_page(vma->vm_file->f_mapping, 0);
- if (!page)
- return -EFAULT; /* page not mapped */
-
- ret = -EINVAL;
- page_addr = kmap_atomic(page);
- ehdr = (Elf32_Ehdr *)page_addr;
-
- /* compare magic x7f "ELF" */
- if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0)
- goto out;
-
- /* only support executable file and shared object file */
- if (ehdr->e_type != ET_EXEC && ehdr->e_type != ET_DYN)
- goto out;
-
- if (ehdr->e_ident[EI_CLASS] == ELFCLASS32)
- ret = stack_map_get_build_id_32(page_addr, build_id);
- else if (ehdr->e_ident[EI_CLASS] == ELFCLASS64)
- ret = stack_map_get_build_id_64(page_addr, build_id);
-out:
- kunmap_atomic(page_addr);
- put_page(page);
- return ret;
-}
-
static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
u64 *ips, u32 trace_nr, bool user)
{
@@ -327,18 +192,18 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
for (i = 0; i < trace_nr; i++) {
id_offs[i].status = BPF_STACK_BUILD_ID_IP;
id_offs[i].ip = ips[i];
- memset(id_offs[i].build_id, 0, BPF_BUILD_ID_SIZE);
+ memset(id_offs[i].build_id, 0, BUILD_ID_SIZE_MAX);
}
return;
}

for (i = 0; i < trace_nr; i++) {
vma = find_vma(current->mm, ips[i]);
- if (!vma || stack_map_get_build_id(vma, id_offs[i].build_id)) {
+ if (!vma || build_id_parse(vma, id_offs[i].build_id)) {
/* per entry fall back to ips */
id_offs[i].status = BPF_STACK_BUILD_ID_IP;
id_offs[i].ip = ips[i];
- memset(id_offs[i].build_id, 0, BPF_BUILD_ID_SIZE);
+ memset(id_offs[i].build_id, 0, BUILD_ID_SIZE_MAX);
continue;
}
id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i]
diff --git a/lib/Makefile b/lib/Makefile
index ce45af50983a..f4858f5e9215 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -36,7 +36,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
flex_proportions.o ratelimit.o show_mem.o \
is_single_threaded.o plist.o decompress.o kobject_uevent.o \
earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
- nmi_backtrace.o nodemask.o win_minmax.o memcat_p.o
+ nmi_backtrace.o nodemask.o win_minmax.o memcat_p.o \
+ buildid.o

lib-$(CONFIG_PRINTK) += dump_stack.o
lib-$(CONFIG_SMP) += cpumask.o
diff --git a/lib/buildid.c b/lib/buildid.c
new file mode 100644
index 000000000000..4a4f520c0e29
--- /dev/null
+++ b/lib/buildid.c
@@ -0,0 +1,136 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/buildid.h>
+#include <linux/elf.h>
+#include <linux/pagemap.h>
+
+#define BUILD_ID 3
+/*
+ * Parse build id from the note segment. This logic can be shared between
+ * 32-bit and 64-bit system, because Elf32_Nhdr and Elf64_Nhdr are
+ * identical.
+ */
+static inline int parse_build_id(void *page_addr,
+ unsigned char *build_id,
+ void *note_start,
+ Elf32_Word note_size)
+{
+ Elf32_Word note_offs = 0, new_offs;
+
+ /* check for overflow */
+ if (note_start < page_addr || note_start + note_size < note_start)
+ return -EINVAL;
+
+ /* only supports note that fits in the first page */
+ if (note_start + note_size > page_addr + PAGE_SIZE)
+ return -EINVAL;
+
+ while (note_offs + sizeof(Elf32_Nhdr) < note_size) {
+ Elf32_Nhdr *nhdr = (Elf32_Nhdr *)(note_start + note_offs);
+
+ if (nhdr->n_type == BUILD_ID &&
+ nhdr->n_namesz == sizeof("GNU") &&
+ nhdr->n_descsz > 0 &&
+ nhdr->n_descsz <= BUILD_ID_SIZE_MAX) {
+ memcpy(build_id,
+ note_start + note_offs +
+ ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr),
+ nhdr->n_descsz);
+ memset(build_id + nhdr->n_descsz, 0,
+ BUILD_ID_SIZE_MAX - nhdr->n_descsz);
+ return 0;
+ }
+ new_offs = note_offs + sizeof(Elf32_Nhdr) +
+ ALIGN(nhdr->n_namesz, 4) + ALIGN(nhdr->n_descsz, 4);
+ if (new_offs <= note_offs) /* overflow */
+ break;
+ note_offs = new_offs;
+ }
+ return -EINVAL;
+}
+
+/* Parse build ID from 32-bit ELF */
+static int get_build_id_32(void *page_addr, unsigned char *build_id)
+{
+ Elf32_Ehdr *ehdr = (Elf32_Ehdr *)page_addr;
+ Elf32_Phdr *phdr;
+ int i;
+
+ /* only supports phdr that fits in one page */
+ if (ehdr->e_phnum >
+ (PAGE_SIZE - sizeof(Elf32_Ehdr)) / sizeof(Elf32_Phdr))
+ return -EINVAL;
+
+ phdr = (Elf32_Phdr *)(page_addr + sizeof(Elf32_Ehdr));
+
+ for (i = 0; i < ehdr->e_phnum; ++i) {
+ if (phdr[i].p_type == PT_NOTE &&
+ !parse_build_id(page_addr, build_id,
+ page_addr + phdr[i].p_offset,
+ phdr[i].p_filesz))
+ return 0;
+ }
+ return -EINVAL;
+}
+
+/* Parse build ID from 64-bit ELF */
+static int get_build_id_64(void *page_addr, unsigned char *build_id)
+{
+ Elf64_Ehdr *ehdr = (Elf64_Ehdr *)page_addr;
+ Elf64_Phdr *phdr;
+ int i;
+
+ /* only supports phdr that fits in one page */
+ if (ehdr->e_phnum >
+ (PAGE_SIZE - sizeof(Elf64_Ehdr)) / sizeof(Elf64_Phdr))
+ return -EINVAL;
+
+ phdr = (Elf64_Phdr *)(page_addr + sizeof(Elf64_Ehdr));
+
+ for (i = 0; i < ehdr->e_phnum; ++i) {
+ if (phdr[i].p_type == PT_NOTE &&
+ !parse_build_id(page_addr, build_id,
+ page_addr + phdr[i].p_offset,
+ phdr[i].p_filesz))
+ return 0;
+ }
+ return -EINVAL;
+}
+
+/* Parse build ID of ELF file mapped to vma */
+int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id)
+{
+ Elf32_Ehdr *ehdr;
+ struct page *page;
+ void *page_addr;
+ int ret;
+
+ /* only works for page backed storage */
+ if (!vma->vm_file)
+ return -EINVAL;
+
+ page = find_get_page(vma->vm_file->f_mapping, 0);
+ if (!page)
+ return -EFAULT; /* page not mapped */
+
+ ret = -EINVAL;
+ page_addr = kmap_atomic(page);
+ ehdr = (Elf32_Ehdr *)page_addr;
+
+ /* compare magic x7f "ELF" */
+ if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0)
+ goto out;
+
+ /* only support executable file and shared object file */
+ if (ehdr->e_type != ET_EXEC && ehdr->e_type != ET_DYN)
+ goto out;
+
+ if (ehdr->e_ident[EI_CLASS] == ELFCLASS32)
+ ret = get_build_id_32(page_addr, build_id);
+ else if (ehdr->e_ident[EI_CLASS] == ELFCLASS64)
+ ret = get_build_id_64(page_addr, build_id);
+out:
+ kunmap_atomic(page_addr);
+ put_page(page);
+ return ret;
+}
--
2.26.2
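
For readers following the note walk in parse_build_id() above, here is a
minimal userspace sketch of the same idea. It is not the kernel code: the
names find_build_id, struct note_hdr, ALIGN4 and the *_SKETCH constants are
made up for illustration, the buffer is assumed to hold notes in host byte
order, and unlike the hunk shown above the sketch also compares the name
bytes against "GNU".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOTE_TYPE_BUILD_ID_SKETCH 3	/* same value as BUILD_ID in the patch */
#define BUILD_ID_MAX_SKETCH 20		/* same value as BUILD_ID_SIZE_MAX */

/* Same three 32-bit words as Elf32_Nhdr / Elf64_Nhdr. */
struct note_hdr {
	uint32_t n_namesz;
	uint32_t n_descsz;
	uint32_t n_type;
};

#define ALIGN4(x) (((x) + 3u) & ~(size_t)3u)

/*
 * Walk the notes in [buf, buf + len) and copy the first GNU build id
 * into out (zero padded). Returns the build id size, or -1 if none found.
 */
static int find_build_id(const unsigned char *buf, size_t len,
			 unsigned char out[BUILD_ID_MAX_SKETCH])
{
	size_t offs = 0;

	while (offs + sizeof(struct note_hdr) < len) {
		struct note_hdr nhdr;
		size_t name_off, desc_off;

		memcpy(&nhdr, buf + offs, sizeof(nhdr));

		/* reject sizes that cannot fit, this also avoids overflow */
		if (nhdr.n_namesz > len || nhdr.n_descsz > len)
			return -1;

		name_off = offs + sizeof(nhdr);
		desc_off = name_off + ALIGN4((size_t)nhdr.n_namesz);
		if (desc_off + nhdr.n_descsz > len)
			return -1;

		if (nhdr.n_type == NOTE_TYPE_BUILD_ID_SKETCH &&
		    nhdr.n_namesz == sizeof("GNU") &&
		    memcmp(buf + name_off, "GNU", sizeof("GNU")) == 0 &&
		    nhdr.n_descsz > 0 &&
		    nhdr.n_descsz <= BUILD_ID_MAX_SKETCH) {
			memset(out, 0, BUILD_ID_MAX_SKETCH);
			memcpy(out, buf + desc_off, nhdr.n_descsz);
			return (int)nhdr.n_descsz;
		}

		/* name and descriptor are both padded to 4 bytes */
		offs = desc_off + ALIGN4((size_t)nhdr.n_descsz);
	}
	return -1;
}
```

A caller would pass the bytes of one PT_NOTE segment and a 20 byte output
buffer. The return value doubles as the build id size, which is roughly
the information a later patch in this series returns through a separate
size argument.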

2020-12-14 11:10:48

by Jiri Olsa

Subject: [PATCH 14/15] perf buildid-list: Add support for mmap2's buildid events

Add buildid-list support for mmap2's build id data, so we can
display build ids of dso objects without the build id cache
update.

$ perf buildid-list
1805c738c8f3ec0f47b7ea09080c28f34d18a82b /usr/lib64/ld-2.31.so
d278249792061c6b74d1693ca59513be1def13f2 /usr/lib64/libc-2.31.so

By default only dso objects with hits are shown.

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/builtin-buildid-list.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/tools/perf/builtin-buildid-list.c b/tools/perf/builtin-buildid-list.c
index e3ef75583514..87f5b1a4a7fa 100644
--- a/tools/perf/builtin-buildid-list.c
+++ b/tools/perf/builtin-buildid-list.c
@@ -77,6 +77,9 @@ static int perf_session__list_build_ids(bool force, bool with_hits)
perf_header__has_feat(&session->header, HEADER_AUXTRACE))
with_hits = false;

+ if (!perf_header__has_feat(&session->header, HEADER_BUILD_ID))
+ with_hits = true;
+
/*
* in pipe-mode, the only way to get the buildids is to parse
* the record stream. Buildids are stored as RECORD_HEADER_BUILD_ID
--
2.26.2

2020-12-14 11:11:07

by Jiri Olsa

Subject: [PATCH 15/15] perf record: Add --buildid-mmap option to enable mmap's build id

Adding --buildid-mmap option to enable build id in mmap2 events.
It will only work if there's kernel support for that and it disables
build id cache (implies --no-buildid).

It's also possible to enable it permanently via config option
in the ~/.perfconfig file:

[record]
build-id=mmap

Also added build_id bit in the verbose output for perf_event_attr:

# perf record --buildid-mmap -vv
...
perf_event_attr:
type 1
size 120
...
build_id 1

Also adding the missing text_poke bit to that output.

Acked-by: Ian Rogers <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/Documentation/perf-config.txt | 3 ++-
tools/perf/Documentation/perf-record.txt | 3 +++
tools/perf/builtin-record.c | 20 ++++++++++++++++++++
tools/perf/util/evsel.c | 10 ++++++----
tools/perf/util/perf_api_probe.c | 10 ++++++++++
tools/perf/util/perf_api_probe.h | 1 +
tools/perf/util/perf_event_attr_fprintf.c | 2 ++
tools/perf/util/record.h | 1 +
8 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
index e3672c5d801b..8a1c6c16821a 100644
--- a/tools/perf/Documentation/perf-config.txt
+++ b/tools/perf/Documentation/perf-config.txt
@@ -559,11 +559,12 @@ kmem.*::

record.*::
record.build-id::
- This option can be 'cache', 'no-cache' or 'skip'.
+ This option can be 'cache', 'no-cache', 'skip' or 'mmap'.
'cache' is to post-process data and save/update the binaries into
the build-id cache (in ~/.debug). This is the default.
But if this option is 'no-cache', it will not update the build-id cache.
'skip' skips post-processing and does not update the cache.
+ 'mmap' skips post-processing and reads build-ids from MMAP events.

record.call-graph::
This is identical to 'call-graph.record-mode', except it is
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 768888b9326a..1bcf51e24979 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -482,6 +482,9 @@ Specify vmlinux path which has debuginfo.
--buildid-all::
Record build-id of all DSOs regardless whether it's actually hit or not.

+--buildid-mmap::
+Record build ids in mmap2 events, disables build id cache (implies --no-buildid).
+
--aio[=n]::
Use <n> control blocks in asynchronous (Posix AIO) trace writing mode (default: 1, max: 4).
Asynchronous mode is supported only when linking Perf tool with libc library
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index d832c108a1ca..f6bfad096756 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -102,6 +102,7 @@ struct record {
bool no_buildid_cache;
bool no_buildid_cache_set;
bool buildid_all;
+ bool buildid_mmap;
bool timestamp_filename;
bool timestamp_boundary;
struct switch_output switch_output;
@@ -2135,6 +2136,8 @@ static int perf_record_config(const char *var, const char *value, void *cb)
rec->no_buildid_cache = true;
else if (!strcmp(value, "skip"))
rec->no_buildid = true;
+ else if (!strcmp(value, "mmap"))
+ rec->buildid_mmap = true;
else
return -1;
return 0;
@@ -2550,6 +2553,8 @@ static struct option __record_options[] = {
"file", "vmlinux pathname"),
OPT_BOOLEAN(0, "buildid-all", &record.buildid_all,
"Record build-id of all DSOs regardless of hits"),
+ OPT_BOOLEAN(0, "buildid-mmap", &record.buildid_mmap,
+ "Record build-id in map events"),
OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
"append timestamp to output filename"),
OPT_BOOLEAN(0, "timestamp-boundary", &record.timestamp_boundary,
@@ -2653,6 +2658,21 @@ int cmd_record(int argc, const char **argv)

}

+ if (rec->buildid_mmap) {
+ if (!perf_can_record_build_id()) {
+ pr_err("Failed: no support to record build id in mmap events, update your kernel.\n");
+ err = -EINVAL;
+ goto out_opts;
+ }
+ pr_debug("Enabling build id in mmap2 events.\n");
+ /* Enable mmap build id synthesizing. */
+ symbol_conf.buildid_mmap2 = true;
+ /* Enable perf_event_attr::build_id bit. */
+ rec->opts.build_id = true;
+ /* Disable build id cache. */
+ rec->no_buildid = true;
+ }
+
if (rec->opts.kcore)
rec->data.is_dir = true;

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 3dd0eae9810d..191500c1f9f6 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1168,10 +1168,12 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
if (opts->sample_weight)
evsel__set_sample_bit(evsel, WEIGHT);

- attr->task = track;
- attr->mmap = track;
- attr->mmap2 = track && !perf_missing_features.mmap2;
- attr->comm = track;
+ attr->task = track;
+ attr->mmap = track;
+ attr->mmap2 = track && !perf_missing_features.mmap2;
+ attr->comm = track;
+ attr->build_id = track && opts->build_id;
+
/*
* ksymbol is tracked separately with text poke because it needs to be
* system wide and enabled immediately.
diff --git a/tools/perf/util/perf_api_probe.c b/tools/perf/util/perf_api_probe.c
index 3840d02f0f7b..829af17a0867 100644
--- a/tools/perf/util/perf_api_probe.c
+++ b/tools/perf/util/perf_api_probe.c
@@ -98,6 +98,11 @@ static void perf_probe_text_poke(struct evsel *evsel)
evsel->core.attr.text_poke = 1;
}

+static void perf_probe_build_id(struct evsel *evsel)
+{
+ evsel->core.attr.build_id = 1;
+}
+
bool perf_can_sample_identifier(void)
{
return perf_probe_api(perf_probe_sample_identifier);
@@ -172,3 +177,8 @@ bool perf_can_aux_sample(void)

return true;
}
+
+bool perf_can_record_build_id(void)
+{
+ return perf_probe_api(perf_probe_build_id);
+}
diff --git a/tools/perf/util/perf_api_probe.h b/tools/perf/util/perf_api_probe.h
index d5506a983a94..f12ca55f509a 100644
--- a/tools/perf/util/perf_api_probe.h
+++ b/tools/perf/util/perf_api_probe.h
@@ -11,5 +11,6 @@ bool perf_can_record_cpu_wide(void);
bool perf_can_record_switch_events(void);
bool perf_can_record_text_poke_events(void);
bool perf_can_sample_identifier(void);
+bool perf_can_record_build_id(void);

#endif // __PERF_API_PROBE_H
diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
index e67a227c0ce7..656a7fddfc26 100644
--- a/tools/perf/util/perf_event_attr_fprintf.c
+++ b/tools/perf/util/perf_event_attr_fprintf.c
@@ -134,6 +134,8 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
PRINT_ATTRf(bpf_event, p_unsigned);
PRINT_ATTRf(aux_output, p_unsigned);
PRINT_ATTRf(cgroup, p_unsigned);
+ PRINT_ATTRf(text_poke, p_unsigned);
+ PRINT_ATTRf(build_id, p_unsigned);

PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned);
PRINT_ATTRf(bp_type, p_unsigned);
diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
index 266760ac9143..609e706f4282 100644
--- a/tools/perf/util/record.h
+++ b/tools/perf/util/record.h
@@ -49,6 +49,7 @@ struct record_opts {
bool no_bpf_event;
bool kcore;
bool text_poke;
+ bool build_id;
unsigned int freq;
unsigned int mmap_pages;
unsigned int auxtrace_mmap_pages;
--
2.26.2

2020-12-14 11:12:26

by Jiri Olsa

Subject: [PATCH 06/15] perf tools: Add support to read build id from compressed elf

Adding support to decompress file before reading build id.

Adding filename__read_build_id and renaming its current
versions to read_build_id.

Shutting down stderr output of perf list in the shell test:
82: Check open filename arg using perf trace + vfs_getname : Ok

because with the decompression code in place the
filename__read_build_id function is more verbose in case
of error and the test did not account for that.

Signed-off-by: Jiri Olsa <[email protected]>
---
.../tests/shell/trace+probe_vfs_getname.sh | 2 +-
tools/perf/util/symbol-elf.c | 37 ++++++++++++++++++-
2 files changed, 36 insertions(+), 3 deletions(-)

diff --git a/tools/perf/tests/shell/trace+probe_vfs_getname.sh b/tools/perf/tests/shell/trace+probe_vfs_getname.sh
index 11cc2af13f2b..3d31c1d560d6 100755
--- a/tools/perf/tests/shell/trace+probe_vfs_getname.sh
+++ b/tools/perf/tests/shell/trace+probe_vfs_getname.sh
@@ -20,7 +20,7 @@ skip_if_no_perf_trace || exit 2
file=$(mktemp /tmp/temporary_file.XXXXX)

trace_open_vfs_getname() {
- evts=$(echo $(perf list syscalls:sys_enter_open* 2>&1 | egrep 'open(at)? ' | sed -r 's/.*sys_enter_([a-z]+) +\[.*$/\1/') | sed 's/ /,/')
+ evts=$(echo $(perf list syscalls:sys_enter_open* 2>/dev/null | egrep 'open(at)? ' | sed -r 's/.*sys_enter_([a-z]+) +\[.*$/\1/') | sed 's/ /,/')
perf trace -e $evts touch $file 2>&1 | \
egrep " +[0-9]+\.[0-9]+ +\( +[0-9]+\.[0-9]+ ms\): +touch\/[0-9]+ open(at)?\((dfd: +CWD, +)?filename: +${file}, +flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, +mode: +IRUGO\|IWUGO\) += +[0-9]+$"
}
diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 44dd86a4f25f..f3577f7d72fe 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -534,7 +534,7 @@ static int elf_read_build_id(Elf *elf, void *bf, size_t size)

#ifdef HAVE_LIBBFD_BUILDID_SUPPORT

-int filename__read_build_id(const char *filename, struct build_id *bid)
+static int read_build_id(const char *filename, struct build_id *bid)
{
size_t size = sizeof(bid->data);
int err = -1;
@@ -563,7 +563,7 @@ int filename__read_build_id(const char *filename, struct build_id *bid)

#else // HAVE_LIBBFD_BUILDID_SUPPORT

-int filename__read_build_id(const char *filename, struct build_id *bid)
+static int read_build_id(const char *filename, struct build_id *bid)
{
size_t size = sizeof(bid->data);
int fd, err = -1;
@@ -595,6 +595,39 @@ int filename__read_build_id(const char *filename, struct build_id *bid)

#endif // HAVE_LIBBFD_BUILDID_SUPPORT

+int filename__read_build_id(const char *filename, struct build_id *bid)
+{
+ struct kmod_path m = { .name = NULL, };
+ char path[PATH_MAX];
+ int err;
+
+ if (!filename)
+ return -EFAULT;
+
+ err = kmod_path__parse(&m, filename);
+ if (err)
+ return -1;
+
+ if (m.comp) {
+ int error = 0, fd;
+
+ fd = filename__decompress(filename, path, sizeof(path), m.comp, &error);
+ if (fd < 0) {
+ pr_debug("Failed to decompress (error %d) %s\n",
+ error, filename);
+ return -1;
+ }
+ close(fd);
+ filename = path;
+ }
+
+ err = read_build_id(filename, bid);
+
+ if (m.comp)
+ unlink(filename);
+ return err;
+}
+
int sysfs__read_build_id(const char *filename, struct build_id *bid)
{
size_t size = sizeof(bid->data);
--
2.26.2

2020-12-14 14:28:16

by Jiri Olsa

Subject: [PATCH 03/15] perf: Add build id data in mmap2 event

Adding support to carry build id data in mmap2 event.

The build id data replaces maj/min/ino/ino_generation
fields, which are also used to identify map's binary,
so it's ok to replace them with build id data:

union {
struct {
u32 maj;
u32 min;
u64 ino;
u64 ino_generation;
};
struct {
u8 build_id_size;
u8 __reserved_1;
u16 __reserved_2;
u8 build_id[20];
};
};

The replaced maj/min/ino/ino_generation fields take 24 bytes.
We use 20 bytes for build id data and 1 byte for its size;
the remaining 3 bytes are reserved.
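
The size arithmetic above can be checked at compile time. The following
is a small userspace sketch, not the uapi header: the union name
mmap2_id_sketch, the reserved_* field names and the helper
mmap2_id_spare_bytes are made up for illustration, and fixed-width
stdint types stand in for the kernel's u8/u16/u32/u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the union described in the changelog. */
union mmap2_id_sketch {
	struct {
		uint32_t maj;
		uint32_t min;
		uint64_t ino;
		uint64_t ino_generation;
	};
	struct {
		uint8_t  build_id_size;
		uint8_t  reserved_1;
		uint16_t reserved_2;
		uint8_t  build_id[20];
	};
};

/* Both views must cover the same 24 bytes. */
_Static_assert(sizeof(union mmap2_id_sketch) == 24,
	       "mmap2 id union must stay 24 bytes");
/* The build id bytes start right after the 4 byte size/reserved prefix. */
_Static_assert(offsetof(union mmap2_id_sketch, build_id) == 4,
	       "build_id must start at offset 4");

/* Bytes left over once the build id and its size byte are accounted for. */
static size_t mmap2_id_spare_bytes(void)
{
	return sizeof(union mmap2_id_sketch) -
	       sizeof(((union mmap2_id_sketch *)0)->build_id) -
	       sizeof(((union mmap2_id_sketch *)0)->build_id_size);
}
```

With this layout the helper returns 3, which matches the two reserved
fields (1 + 2 bytes) in the second struct.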

There's new misc bit for mmap2 to signal there's build
id data in it:

#define PERF_RECORD_MISC_MMAP_BUILD_ID (1 << 14)

Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
include/uapi/linux/perf_event.h | 42 +++++++++++++++++++++++++++++----
kernel/events/core.c | 32 +++++++++++++++++++++----
2 files changed, 65 insertions(+), 9 deletions(-)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b95d3c485d27..45a216bea048 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -384,7 +384,8 @@ struct perf_event_attr {
aux_output : 1, /* generate AUX records instead of events */
cgroup : 1, /* include cgroup events */
text_poke : 1, /* include text poke events */
- __reserved_1 : 30;
+ build_id : 1, /* use build id in mmap2 events */
+ __reserved_1 : 29;

union {
__u32 wakeup_events; /* wakeup every n events */
@@ -657,6 +658,22 @@ struct perf_event_mmap_page {
__u64 aux_size;
};

+/*
+ * The current state of perf_event_header::misc bits usage:
+ * ('|' used bit, '-' unused bit)
+ *
+ * 012 CDEF
+ * |||---------||||
+ *
+ * Where:
+ * 0-2 CPUMODE_MASK
+ *
+ * C PROC_MAP_PARSE_TIMEOUT
+ * D MMAP_DATA / COMM_EXEC / FORK_EXEC / SWITCH_OUT
+ * E MMAP_BUILD_ID / EXACT_IP / SCHED_OUT_PREEMPT
+ * F (reserved)
+ */
+
#define PERF_RECORD_MISC_CPUMODE_MASK (7 << 0)
#define PERF_RECORD_MISC_CPUMODE_UNKNOWN (0 << 0)
#define PERF_RECORD_MISC_KERNEL (1 << 0)
@@ -688,6 +705,7 @@ struct perf_event_mmap_page {
*
* PERF_RECORD_MISC_EXACT_IP - PERF_RECORD_SAMPLE of precise events
* PERF_RECORD_MISC_SWITCH_OUT_PREEMPT - PERF_RECORD_SWITCH* events
+ * PERF_RECORD_MISC_MMAP_BUILD_ID - PERF_RECORD_MMAP2 event
*
*
* PERF_RECORD_MISC_EXACT_IP:
@@ -697,9 +715,13 @@ struct perf_event_mmap_page {
*
* PERF_RECORD_MISC_SWITCH_OUT_PREEMPT:
* Indicates that thread was preempted in TASK_RUNNING state.
+ *
+ * PERF_RECORD_MISC_MMAP_BUILD_ID:
+ * Indicates that mmap2 event carries build id data.
*/
#define PERF_RECORD_MISC_EXACT_IP (1 << 14)
#define PERF_RECORD_MISC_SWITCH_OUT_PREEMPT (1 << 14)
+#define PERF_RECORD_MISC_MMAP_BUILD_ID (1 << 14)
/*
* Reserve the last bit to indicate some extended misc field
*/
@@ -911,10 +933,20 @@ enum perf_event_type {
* u64 addr;
* u64 len;
* u64 pgoff;
- * u32 maj;
- * u32 min;
- * u64 ino;
- * u64 ino_generation;
+ * union {
+ * struct {
+ * u32 maj;
+ * u32 min;
+ * u64 ino;
+ * u64 ino_generation;
+ * };
+ * struct {
+ * u8 build_id_size;
+ * u8 __reserved_1;
+ * u16 __reserved_2;
+ * u8 build_id[20];
+ * };
+ * };
* u32 prot, flags;
* char filename[];
* struct sample_id sample_id;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index dc568ca295bd..6cbd04a24d3a 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -51,6 +51,7 @@
#include <linux/proc_ns.h>
#include <linux/mount.h>
#include <linux/min_heap.h>
+#include <linux/buildid.h>

#include "internal.h"

@@ -395,6 +396,7 @@ static atomic_t nr_ksymbol_events __read_mostly;
static atomic_t nr_bpf_events __read_mostly;
static atomic_t nr_cgroup_events __read_mostly;
static atomic_t nr_text_poke_events __read_mostly;
+static atomic_t nr_build_id_events __read_mostly;

static LIST_HEAD(pmus);
static DEFINE_MUTEX(pmus_lock);
@@ -4665,6 +4667,8 @@ static void unaccount_event(struct perf_event *event)
dec = true;
if (event->attr.mmap || event->attr.mmap_data)
atomic_dec(&nr_mmap_events);
+ if (event->attr.build_id)
+ atomic_dec(&nr_build_id_events);
if (event->attr.comm)
atomic_dec(&nr_comm_events);
if (event->attr.namespaces)
@@ -7934,6 +7938,8 @@ struct perf_mmap_event {
u64 ino;
u64 ino_generation;
u32 prot, flags;
+ u8 build_id[BUILD_ID_SIZE_MAX];
+ u32 build_id_size;

struct {
struct perf_event_header header;
@@ -7965,6 +7971,7 @@ static void perf_event_mmap_output(struct perf_event *event,
struct perf_sample_data sample;
int size = mmap_event->event_id.header.size;
u32 type = mmap_event->event_id.header.type;
+ bool use_build_id;
int ret;

if (!perf_event_mmap_match(event, data))
@@ -7989,13 +7996,25 @@ static void perf_event_mmap_output(struct perf_event *event,
mmap_event->event_id.pid = perf_event_pid(event, current);
mmap_event->event_id.tid = perf_event_tid(event, current);

+ use_build_id = event->attr.build_id && mmap_event->build_id_size;
+
+ if (event->attr.mmap2 && use_build_id)
+ mmap_event->event_id.header.misc |= PERF_RECORD_MISC_MMAP_BUILD_ID;
+
perf_output_put(&handle, mmap_event->event_id);

if (event->attr.mmap2) {
- perf_output_put(&handle, mmap_event->maj);
- perf_output_put(&handle, mmap_event->min);
- perf_output_put(&handle, mmap_event->ino);
- perf_output_put(&handle, mmap_event->ino_generation);
+ if (use_build_id) {
+ u8 size[4] = { (u8) mmap_event->build_id_size, 0, 0, 0 };
+
+ __output_copy(&handle, size, 4);
+ __output_copy(&handle, mmap_event->build_id, BUILD_ID_SIZE_MAX);
+ } else {
+ perf_output_put(&handle, mmap_event->maj);
+ perf_output_put(&handle, mmap_event->min);
+ perf_output_put(&handle, mmap_event->ino);
+ perf_output_put(&handle, mmap_event->ino_generation);
+ }
perf_output_put(&handle, mmap_event->prot);
perf_output_put(&handle, mmap_event->flags);
}
@@ -8124,6 +8143,9 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)

mmap_event->event_id.header.size = sizeof(mmap_event->event_id) + size;

+ if (atomic_read(&nr_build_id_events))
+ build_id_parse(vma, mmap_event->build_id, &mmap_event->build_id_size);
+
perf_iterate_sb(perf_event_mmap_output,
mmap_event,
NULL);
@@ -11060,6 +11082,8 @@ static void account_event(struct perf_event *event)
inc = true;
if (event->attr.mmap || event->attr.mmap_data)
atomic_inc(&nr_mmap_events);
+ if (event->attr.build_id)
+ atomic_inc(&nr_build_id_events);
if (event->attr.comm)
atomic_inc(&nr_comm_events);
if (event->attr.namespaces)
--
2.26.2

2020-12-14 14:28:20

by Jiri Olsa

Subject: [PATCH 05/15] perf tools: Do not swap mmap2 fields in case it contains build id

If PERF_RECORD_MISC_MMAP_BUILD_ID misc bit is set,
mmap2 event carries build id, placed in following union:

union {
struct {
u32 maj;
u32 min;
u64 ino;
u64 ino_generation;
};
struct {
u8 build_id_size;
u8 __reserved_1;
u16 __reserved_2;
u8 build_id[20];
};
};

In this case we can't byte-swap the maj/min/ino/ino_generation
fields, because the same bytes hold the build id.
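
To illustrate why the swap has to be conditional, here is a standalone
sketch of the same decision. It is not the perf code: the union
mmap2_id_swap_sketch, the swap32/swap64 helpers, swap_mmap2_id and the
MISC_MMAP_BUILD_ID_SKETCH constant (same value as the misc bit, 1 << 14)
are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MISC_MMAP_BUILD_ID_SKETCH (1 << 14)

/* Hypothetical mirror of the mmap2 union from the changelog. */
union mmap2_id_swap_sketch {
	struct {
		uint32_t maj;
		uint32_t min;
		uint64_t ino;
		uint64_t ino_generation;
	};
	struct {
		uint8_t  build_id_size;
		uint8_t  reserved_1;
		uint16_t reserved_2;
		uint8_t  build_id[20];
	};
};

static uint32_t swap32(uint32_t v)
{
	return ((v & 0xffu) << 24) | ((v & 0xff00u) << 8) |
	       ((v >> 8) & 0xff00u) | (v >> 24);
}

static uint64_t swap64(uint64_t v)
{
	return ((uint64_t)swap32((uint32_t)v) << 32) |
	       swap32((uint32_t)(v >> 32));
}

/*
 * Swap the integer view only when the event does not carry a build id.
 * A build id is a plain byte array, so it has no endianness to fix.
 */
static void swap_mmap2_id(uint16_t misc, union mmap2_id_swap_sketch *id)
{
	if (misc & MISC_MMAP_BUILD_ID_SKETCH)
		return;

	id->maj = swap32(id->maj);
	id->min = swap32(id->min);
	id->ino = swap64(id->ino);
	id->ino_generation = swap64(id->ino_generation);
}
```

Swapping unconditionally would reorder the build id bytes in groups of
4 and 8 and move build_id_size to a different offset, so the reader
would see a corrupted id.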

Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/session.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 3b3c50b12791..29593494b67c 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -592,10 +592,13 @@ static void perf_event__mmap2_swap(union perf_event *event,
event->mmap2.start = bswap_64(event->mmap2.start);
event->mmap2.len = bswap_64(event->mmap2.len);
event->mmap2.pgoff = bswap_64(event->mmap2.pgoff);
- event->mmap2.maj = bswap_32(event->mmap2.maj);
- event->mmap2.min = bswap_32(event->mmap2.min);
- event->mmap2.ino = bswap_64(event->mmap2.ino);
- event->mmap2.ino_generation = bswap_64(event->mmap2.ino_generation);
+
+ if (!(event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID)) {
+ event->mmap2.maj = bswap_32(event->mmap2.maj);
+ event->mmap2.min = bswap_32(event->mmap2.min);
+ event->mmap2.ino = bswap_64(event->mmap2.ino);
+ event->mmap2.ino_generation = bswap_64(event->mmap2.ino_generation);
+ }

if (sample_id_all) {
void *data = &event->mmap2.filename;
--
2.26.2

2020-12-14 14:29:23

by Jiri Olsa

Subject: [PATCH 11/15] perf tools: Add support to display build id for mmap2 events

Adding support to display build id in mmap2 events:

$ perf script --show-mmap-events | head -4
swapper ... @ 0xffffffff81000000 <ff1969b3ba5e43911208bb46fa7d5b1eb809e422>]: ---p [kernel.kallsyms]_text
swapper ... @ 0 <5f62adb730272c9417883ae8b8a8ec224df8cddd>]: ---p /lib/modules/5.9.0-rc5buildid+/kernel/drivers/firmware/qemu_fw_cfg.ko
swapper ... @ 0 <c9ac6e1dafc1ebdadb048f967854e810706c8bab>]: ---p /lib/modules/5.9.0-rc5buildid+/kernel/drivers/char/virtio_console.ko
swapper ... @ 0 <86441a4c5b2c2ff5b440682f4c612bd4b426eb5c>]: ---p /lib/modules/5.9.0-rc5buildid+/kernel/lib/libcrc32c.ko

$ perf report -D | grep MMAP2 | head -4
0 0 ... @ 0xffffffff81000000 <ff1969b3ba5e43911208bb46fa7d5b1eb809e422>]: ---p [kernel.kallsyms]_text
0 0 ... @ 0 <5f62adb730272c9417883ae8b8a8ec224df8cddd>]: ---p /lib/modules/5.9.0-rc5buildid+/kernel/drivers/firmware/qemu_fw_cfg.ko
0 0 ... @ 0 <c9ac6e1dafc1ebdadb048f967854e810706c8bab>]: ---p /lib/modules/5.9.0-rc5buildid+/kernel/drivers/char/virtio_console.ko
0 0 ... @ 0 <86441a4c5b2c2ff5b440682f4c612bd4b426eb5c>]: ---p /lib/modules/5.9.0-rc5buildid+/kernel/lib/libcrc32c.ko

Adding build id data into <> brackets.
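
The hex string inside the <> brackets is just the build id bytes printed
two hex digits each. A minimal sketch of that conversion follows; the
helper build_id_to_hex is a made-up name for illustration and is not
perf's build_id__sprintf.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format size bytes of id as lowercase hex into out.
 * Returns the number of characters written (without the trailing NUL),
 * or -1 if out is too small.
 */
static int build_id_to_hex(const unsigned char *id, size_t size,
			   char *out, size_t out_len)
{
	size_t i;

	if (out_len < size * 2 + 1)
		return -1;

	for (i = 0; i < size; i++)
		snprintf(out + i * 2, 3, "%02x", id[i]);

	out[size * 2] = '\0';
	return (int)(size * 2);
}
```

For a 20 byte SHA1 build id the output buffer needs at least 41 bytes,
40 hex digits plus the terminating NUL.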

Acked-by: Ian Rogers <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/event.c | 41 ++++++++++++++++++++++++++++++-----------
1 file changed, 30 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 05616d4138a9..fbe8578e4c47 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -288,17 +288,36 @@ size_t perf_event__fprintf_mmap(union perf_event *event, FILE *fp)

size_t perf_event__fprintf_mmap2(union perf_event *event, FILE *fp)
{
- return fprintf(fp, " %d/%d: [%#" PRI_lx64 "(%#" PRI_lx64 ") @ %#" PRI_lx64
- " %02x:%02x %"PRI_lu64" %"PRI_lu64"]: %c%c%c%c %s\n",
- event->mmap2.pid, event->mmap2.tid, event->mmap2.start,
- event->mmap2.len, event->mmap2.pgoff, event->mmap2.maj,
- event->mmap2.min, event->mmap2.ino,
- event->mmap2.ino_generation,
- (event->mmap2.prot & PROT_READ) ? 'r' : '-',
- (event->mmap2.prot & PROT_WRITE) ? 'w' : '-',
- (event->mmap2.prot & PROT_EXEC) ? 'x' : '-',
- (event->mmap2.flags & MAP_SHARED) ? 's' : 'p',
- event->mmap2.filename);
+ if (event->header.misc & PERF_RECORD_MISC_MMAP_BUILD_ID) {
+ char sbuild_id[SBUILD_ID_SIZE];
+ struct build_id bid;
+
+ build_id__init(&bid, event->mmap2.build_id,
+ event->mmap2.build_id_size);
+ build_id__sprintf(&bid, sbuild_id);
+
+ return fprintf(fp, " %d/%d: [%#" PRI_lx64 "(%#" PRI_lx64 ") @ %#" PRI_lx64
+ " <%s>]: %c%c%c%c %s\n",
+ event->mmap2.pid, event->mmap2.tid, event->mmap2.start,
+ event->mmap2.len, event->mmap2.pgoff, sbuild_id,
+ (event->mmap2.prot & PROT_READ) ? 'r' : '-',
+ (event->mmap2.prot & PROT_WRITE) ? 'w' : '-',
+ (event->mmap2.prot & PROT_EXEC) ? 'x' : '-',
+ (event->mmap2.flags & MAP_SHARED) ? 's' : 'p',
+ event->mmap2.filename);
+ } else {
+ return fprintf(fp, " %d/%d: [%#" PRI_lx64 "(%#" PRI_lx64 ") @ %#" PRI_lx64
+ " %02x:%02x %"PRI_lu64" %"PRI_lu64"]: %c%c%c%c %s\n",
+ event->mmap2.pid, event->mmap2.tid, event->mmap2.start,
+ event->mmap2.len, event->mmap2.pgoff, event->mmap2.maj,
+ event->mmap2.min, event->mmap2.ino,
+ event->mmap2.ino_generation,
+ (event->mmap2.prot & PROT_READ) ? 'r' : '-',
+ (event->mmap2.prot & PROT_WRITE) ? 'w' : '-',
+ (event->mmap2.prot & PROT_EXEC) ? 'x' : '-',
+ (event->mmap2.flags & MAP_SHARED) ? 's' : 'p',
+ event->mmap2.filename);
+ }
}

size_t perf_event__fprintf_thread_map(union perf_event *event, FILE *fp)
--
2.26.2

2020-12-14 14:29:44

by Jiri Olsa

Subject: [PATCH 02/15] bpf: Add size arg to build_id_parse function

It's possible to have other build id types (other than default SHA1).
Currently there's also ld support for MD5 build id.

Adding size argument to build_id_parse function, that returns (if defined)
size of the parsed build id, so we can recognize the build id type.
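
As a rough sketch of how a consumer might use the returned size: the
function build_id_kind below is a made-up name and is not part of this
patch. It assumes the sizes produced by the common ld --build-id styles,
20 bytes for sha1 and 16 bytes for md5 or uuid, so a 16 byte id cannot
be told apart by size alone.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Guess the build id style from its size; purely illustrative. */
static const char *build_id_kind(uint32_t size)
{
	switch (size) {
	case 20:
		return "sha1";
	case 16:
		return "md5 or uuid";
	default:
		return "unknown";
	}
}
```

Callers that do not care about the size, like the stackmap code in this
patch, can pass NULL for the size argument of build_id_parse().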

Cc: Alexei Starovoitov <[email protected]>
Cc: Song Liu <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
include/linux/buildid.h | 3 ++-
kernel/bpf/stackmap.c | 2 +-
lib/buildid.c | 29 +++++++++++++++++++++--------
3 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/include/linux/buildid.h b/include/linux/buildid.h
index 08028a212589..40232f90db6e 100644
--- a/include/linux/buildid.h
+++ b/include/linux/buildid.h
@@ -6,6 +6,7 @@

#define BUILD_ID_SIZE_MAX 20

-int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id);
+int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
+ __u32 *size);

#endif
diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
index d21512fbfa9a..4fcf6018f35a 100644
--- a/kernel/bpf/stackmap.c
+++ b/kernel/bpf/stackmap.c
@@ -199,7 +199,7 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,

for (i = 0; i < trace_nr; i++) {
vma = find_vma(current->mm, ips[i]);
- if (!vma || build_id_parse(vma, id_offs[i].build_id)) {
+ if (!vma || build_id_parse(vma, id_offs[i].build_id, NULL)) {
/* per entry fall back to ips */
id_offs[i].status = BPF_STACK_BUILD_ID_IP;
id_offs[i].ip = ips[i];
diff --git a/lib/buildid.c b/lib/buildid.c
index 4a4f520c0e29..6156997c3895 100644
--- a/lib/buildid.c
+++ b/lib/buildid.c
@@ -12,6 +12,7 @@
*/
static inline int parse_build_id(void *page_addr,
unsigned char *build_id,
+ __u32 *size,
void *note_start,
Elf32_Word note_size)
{
@@ -38,6 +39,8 @@ static inline int parse_build_id(void *page_addr,
nhdr->n_descsz);
memset(build_id + nhdr->n_descsz, 0,
BUILD_ID_SIZE_MAX - nhdr->n_descsz);
+ if (size)
+ *size = nhdr->n_descsz;
return 0;
}
new_offs = note_offs + sizeof(Elf32_Nhdr) +
@@ -50,7 +53,8 @@ static inline int parse_build_id(void *page_addr,
}

/* Parse build ID from 32-bit ELF */
-static int get_build_id_32(void *page_addr, unsigned char *build_id)
+static int get_build_id_32(void *page_addr, unsigned char *build_id,
+ __u32 *size)
{
Elf32_Ehdr *ehdr = (Elf32_Ehdr *)page_addr;
Elf32_Phdr *phdr;
@@ -65,7 +69,7 @@ static int get_build_id_32(void *page_addr, unsigned char *build_id)

for (i = 0; i < ehdr->e_phnum; ++i) {
if (phdr[i].p_type == PT_NOTE &&
- !parse_build_id(page_addr, build_id,
+ !parse_build_id(page_addr, build_id, size,
page_addr + phdr[i].p_offset,
phdr[i].p_filesz))
return 0;
@@ -74,7 +78,8 @@ static int get_build_id_32(void *page_addr, unsigned char *build_id)
}

/* Parse build ID from 64-bit ELF */
-static int get_build_id_64(void *page_addr, unsigned char *build_id)
+static int get_build_id_64(void *page_addr, unsigned char *build_id,
+ __u32 *size)
{
Elf64_Ehdr *ehdr = (Elf64_Ehdr *)page_addr;
Elf64_Phdr *phdr;
@@ -89,7 +94,7 @@ static int get_build_id_64(void *page_addr, unsigned char *build_id)

for (i = 0; i < ehdr->e_phnum; ++i) {
if (phdr[i].p_type == PT_NOTE &&
- !parse_build_id(page_addr, build_id,
+ !parse_build_id(page_addr, build_id, size,
page_addr + phdr[i].p_offset,
phdr[i].p_filesz))
return 0;
@@ -97,8 +102,16 @@ static int get_build_id_64(void *page_addr, unsigned char *build_id)
return -EINVAL;
}

-/* Parse build ID of ELF file mapped to vma */
-int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id)
+/*
+ * Parse build ID of ELF file mapped to vma
+ * @vma: vma object
+ * @build_id: buffer to store build id, at least BUILD_ID_SIZE_MAX long
+ * @size: returns actual build id size in case of success
+ *
+ * Returns 0 on success, otherwise error (< 0).
+ */
+int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id,
+ __u32 *size)
{
Elf32_Ehdr *ehdr;
struct page *page;
@@ -126,9 +139,9 @@ int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id)
goto out;

if (ehdr->e_ident[EI_CLASS] == ELFCLASS32)
- ret = get_build_id_32(page_addr, build_id);
+ ret = get_build_id_32(page_addr, build_id, size);
else if (ehdr->e_ident[EI_CLASS] == ELFCLASS64)
- ret = get_build_id_64(page_addr, build_id);
+ ret = get_build_id_64(page_addr, build_id, size);
out:
kunmap_atomic(page_addr);
put_page(page);
--
2.26.2

2020-12-14 17:37:38

by Jiri Olsa

Subject: [PATCH 09/15] perf tools: Allow mmap2 event to synthesize modules

Allow the mmap2 event to be used when synthesizing
kernel modules, so that module build id data can be
synthesized in the following changes.

This is enabled by the new symbol_conf.buildid_mmap2
bool, which will be set in the following
changes.

Acked-by: Ian Rogers <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
---
tools/perf/util/synthetic-events.c | 49 +++++++++++++++++++-----------
1 file changed, 32 insertions(+), 17 deletions(-)

diff --git a/tools/perf/util/synthetic-events.c b/tools/perf/util/synthetic-events.c
index a5f271fa2ae4..c209377106f5 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -596,16 +596,17 @@ int perf_event__synthesize_modules(struct perf_tool *tool, perf_event__handler_t
int rc = 0;
struct map *pos;
struct maps *maps = machine__kernel_maps(machine);
- union perf_event *event = zalloc((sizeof(event->mmap) +
- machine->id_hdr_size));
+ union perf_event *event;
+ size_t size = symbol_conf.buildid_mmap2 ?
+ sizeof(event->mmap2) : sizeof(event->mmap);
+
+ event = zalloc(size + machine->id_hdr_size);
if (event == NULL) {
pr_debug("Not enough memory synthesizing mmap event "
"for kernel modules\n");
return -1;
}

- event->header.type = PERF_RECORD_MMAP;
-
/*
* kernel uses 0 for user space maps, see kernel/perf_event.c
* __perf_event_mmap
@@ -616,23 +617,37 @@ int perf_event__synthesize_modules(struct perf_tool *tool, perf_event__handler_t
event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;

maps__for_each_entry(maps, pos) {
- size_t size;
-
if (!__map__is_kmodule(pos))
continue;

- size = PERF_ALIGN(pos->dso->long_name_len + 1, sizeof(u64));
- event->mmap.header.type = PERF_RECORD_MMAP;
- event->mmap.header.size = (sizeof(event->mmap) -
- (sizeof(event->mmap.filename) - size));
- memset(event->mmap.filename + size, 0, machine->id_hdr_size);
- event->mmap.header.size += machine->id_hdr_size;
- event->mmap.start = pos->start;
- event->mmap.len = pos->end - pos->start;
- event->mmap.pid = machine->pid;
+ if (symbol_conf.buildid_mmap2) {
+ size = PERF_ALIGN(pos->dso->long_name_len + 1, sizeof(u64));
+ event->mmap2.header.type = PERF_RECORD_MMAP2;
+ event->mmap2.header.size = (sizeof(event->mmap2) -
+ (sizeof(event->mmap2.filename) - size));
+ memset(event->mmap2.filename + size, 0, machine->id_hdr_size);
+ event->mmap2.header.size += machine->id_hdr_size;
+ event->mmap2.start = pos->start;
+ event->mmap2.len = pos->end - pos->start;
+ event->mmap2.pid = machine->pid;
+
+ memcpy(event->mmap2.filename, pos->dso->long_name,
+ pos->dso->long_name_len + 1);
+ } else {
+ size = PERF_ALIGN(pos->dso->long_name_len + 1, sizeof(u64));
+ event->mmap.header.type = PERF_RECORD_MMAP;
+ event->mmap.header.size = (sizeof(event->mmap) -
+ (sizeof(event->mmap.filename) - size));
+ memset(event->mmap.filename + size, 0, machine->id_hdr_size);
+ event->mmap.header.size += machine->id_hdr_size;
+ event->mmap.start = pos->start;
+ event->mmap.len = pos->end - pos->start;
+ event->mmap.pid = machine->pid;
+
+ memcpy(event->mmap.filename, pos->dso->long_name,
+ pos->dso->long_name_len + 1);
+ }

- memcpy(event->mmap.filename, pos->dso->long_name,
- pos->dso->long_name_len + 1);
if (perf_tool__process_synth_event(tool, event, machine, process) != 0) {
rc = -1;
break;
--
2.26.2
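
Note for readers of this thread: both branches in the patch above compute header.size the same way. They take the full record size, subtract the unused tail of the fixed-size filename array, and add the sample id header. The sketch below reproduces that arithmetic with a made-up record layout. struct mmap_record_sketch, FILENAME_MAX_SKETCH and synth_record_size are hypothetical names, not perf definitions, and the field list is only an assumption chosen to make the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FILENAME_MAX_SKETCH 4096
#define ALIGN_UP(x, a) (((size_t)(x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Hypothetical stand-in for a perf mmap record with a trailing filename. */
struct mmap_record_sketch {
	uint64_t header;	/* stands in for the 8-byte event header */
	uint32_t pid, tid;
	uint64_t start, len, pgoff;
	char filename[FILENAME_MAX_SKETCH];
};

/*
 * Size of a synthesized record: fixed fields, the filename padded to a
 * multiple of 8 bytes (including its NUL), plus the sample id header.
 */
size_t synth_record_size(size_t name_len, size_t id_hdr_size)
{
	size_t padded = ALIGN_UP(name_len + 1, sizeof(uint64_t));

	assert(padded <= FILENAME_MAX_SKETCH);

	return sizeof(struct mmap_record_sketch) -
	       (FILENAME_MAX_SKETCH - padded) + id_hdr_size;
}
```

For a three-character module name the padded filename takes 8 bytes, so the record is the fixed fields plus 8 plus id_hdr_size. Since only the struct type differs between the mmap and mmap2 branches, the same computation applies to both.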

2020-12-15 13:46:53

by Namhyung Kim

Subject: Re: [PATCHv5 00/15] perf: Add mmap2 build id support

Hi Jiri,

On Mon, Dec 14, 2020 at 7:55 PM Jiri Olsa <[email protected]> wrote:
>
> hi,
> adding the support to have buildid stored in mmap2 event,
> so we can bypass the final perf record hunt on build ids.
>
> This patchset allows perf to record build ID in mmap2 event,
> and adds perf tooling to store/download binaries to .debug
> cache based on these build IDs.
>
> Note that the build id retrieval code is stolen from bpf
> code, where it's been used (together with file offsets)
> to replace IPs in user space stack traces. It's now added
> under lib directory.

Thanks for doing this!

Acked-by: Namhyung Kim <[email protected]>

Thanks,
Namhyung


>
> v5 changes:
> - rebased on latest perf/core
> - several patches already pulled in
> - fixed trace+probe_vfs_getname.sh output redirection
> - fixed changelogs [Arnaldo]
> - renamed BUILD_ID_SIZE to BUILD_ID_SIZE_MAX [Song]
>
> v4 changes:
> - fixed typo in changelog [Namhyung]
> - removed force_download bool from struct dso_store_data,
> because it's not used [Namhyung]
>
> v3 changes:
> - added acks
> - removed forgotten debug code [Arnaldo]
> - fixed readlink termination [Ian]
> - fixed doc for --debuginfod=URLs [Ian]
> - adopted kernel's memchr_inv function and used
> it in build_id__is_defined function [Arnaldo]
>
> On recording server:
>
> - on the recording server we can run record with --buildid-mmap
> option to store build ids in mmap2 events:
>
> # perf record --buildid-mmap
> ^C[ perf record: Woken up 2 times to write data ]
> [ perf record: Captured and wrote 0.836 MB perf.data ]
>
> - it stores nothing to ~/.debug cache:
>
> # find ~/.debug
> find: ‘/root/.debug’: No such file or directory
>
> - and still reports properly:
>
> # perf report --stdio
> ...
> 99.82% swapper [kernel.kallsyms] [k] native_safe_halt
> 0.03% swapper [kernel.kallsyms] [k] finish_task_switch
> 0.02% swapper [kernel.kallsyms] [k] __softirqentry_text_start
> 0.01% kcompactd0 [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
> 0.01% ksoftirqd/6 [kernel.kallsyms] [k] slab_free_freelist_hook
> 0.01% kworker/17:1H-x [kernel.kallsyms] [k] slab_free_freelist_hook
>
> - display used/hit build ids:
>
> # perf buildid-list | head -5
> 5dcec522abf136fcfd3128f47e131f2365834dd7 /proc/kcore
> 589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
> 94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
> 559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
> 40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so
>
> - store build id binaries into build id cache:
>
> # perf buildid-cache -a perf.data
> OK 5dcec522abf136fcfd3128f47e131f2365834dd7 /proc/kcore
> OK 589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
> OK 94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
> OK 559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
> OK 40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so
> OK a674f7a47c78e35a088104647b9640710277b489 /usr/sbin/sshd
> OK e5cb4ca25f46485bdbc691c3a92e7e111dac3ef2 /usr/bin/bash
> OK 9bc8589108223c944b452f0819298a0c3cba6215 /usr/bin/find
>
> # find ~/.debug | head -5
> /root/.debug
> /root/.debug/proc
> /root/.debug/proc/kcore
> /root/.debug/proc/kcore/5dcec522abf136fcfd3128f47e131f2365834dd7
> /root/.debug/proc/kcore/5dcec522abf136fcfd3128f47e131f2365834dd7/kallsyms
>
> - run debuginfod daemon to provide binaries to another server (below)
> (the initialization could take some time)
>
> # debuginfod -F /
>
>
> On another server:
>
> - copy perf.data from 'record' server and run:
>
> $ find ~/.debug/
> find: ‘/home/jolsa/.debug/’: No such file or directory
>
> $ perf buildid-list | head -5
> No kallsyms or vmlinux with build-id 5dcec522abf136fcfd3128f47e131f2365834dd7 was found
> 5dcec522abf136fcfd3128f47e131f2365834dd7 [kernel.kallsyms]
> 5784f813b727a50cfd3363234aef9fcbab685cc4 /lib/modules/5.10.0-rc2speed+/kernel/fs/xfs/xfs.ko
> 589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
> 94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
> 559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
>
> - report does not show anything (kernel build id does not match):
>
> $ perf report --stdio
> ...
> 76.73% swapper [kernel.kallsyms] [k] 0xffffffff81aa8ebe
> 1.89% find [kernel.kallsyms] [k] 0xffffffff810f2167
> 0.93% sshd [kernel.kallsyms] [k] 0xffffffff8153380c
> 0.83% swapper [kernel.kallsyms] [k] 0xffffffff81104b0b
> 0.71% kworker/u40:2-e [kernel.kallsyms] [k] 0xffffffff810f3850
> 0.70% kworker/u40:0-e [kernel.kallsyms] [k] 0xffffffff810f3850
> 0.64% find [kernel.kallsyms] [k] 0xffffffff81a9ba0a
> 0.63% find [kernel.kallsyms] [k] 0xffffffff81aa93b0
>
> - add build ids does not work, because existing binaries (on another server)
> have different build ids:
>
> $ perf buildid-cache -a perf.data
> No kallsyms or vmlinux with build-id 5dcec522abf136fcfd3128f47e131f2365834dd7 was found
> FAIL 5dcec522abf136fcfd3128f47e131f2365834dd7 [kernel.kallsyms]
> FAIL 5784f813b727a50cfd3363234aef9fcbab685cc4 /lib/modules/5.10.0-rc2speed+/kernel/fs/xfs/xfs.ko
> FAIL 589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
> FAIL 94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
> FAIL 559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
> FAIL 40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so
> FAIL a674f7a47c78e35a088104647b9640710277b489 /usr/sbin/sshd
> FAIL e5cb4ca25f46485bdbc691c3a92e7e111dac3ef2 /usr/bin/bash
> FAIL 9bc8589108223c944b452f0819298a0c3cba6215 /usr/bin/find
>
> - add build ids with debuginfod setup pointing to record server:
>
> $ perf buildid-cache -a perf.data --debuginfod http://192.168.122.174:8002
> No kallsyms or vmlinux with build-id 5dcec522abf136fcfd3128f47e131f2365834dd7 was found
> OK 5dcec522abf136fcfd3128f47e131f2365834dd7 [kernel.kallsyms]
> OK 5784f813b727a50cfd3363234aef9fcbab685cc4 /lib/modules/5.10.0-rc2speed+/kernel/fs/xfs/xfs.ko
> OK 589e403a34f55486bcac848a45e00bcdeedd1ca8 /usr/lib64/libcrypto.so.1.1.1g
> OK 94569566d4eac7e9c87ba029d43d4e2158f9527e /usr/lib64/libpthread-2.30.so
> OK 559b9702bebe31c6d132c8dc5cc887673d65d5b5 /usr/lib64/libc-2.30.so
> OK 40da7abe89f631f60538a17686a7d65c6a02ed31 /usr/lib64/ld-2.30.so
> OK a674f7a47c78e35a088104647b9640710277b489 /usr/sbin/sshd
> OK e5cb4ca25f46485bdbc691c3a92e7e111dac3ef2 /usr/bin/bash
> OK 9bc8589108223c944b452f0819298a0c3cba6215 /usr/bin/find
>
> - and report works:
>
> $ perf report --stdio
> ...
> 76.73% swapper [kernel.kallsyms] [k] native_safe_halt
> 1.91% find [kernel.kallsyms] [k] queue_work_on
> 0.93% sshd [kernel.kallsyms] [k] iowrite16
> 0.83% swapper [kernel.kallsyms] [k] finish_task_switch
> 0.72% kworker/u40:2-e [kernel.kallsyms] [k] process_one_work
> 0.70% kworker/u40:0-e [kernel.kallsyms] [k] process_one_work
> 0.64% find [kernel.kallsyms] [k] syscall_enter_from_user_mode
> 0.63% find [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
>
> - because we have the data in build id cache:
>
> $ find ~/.debug | head -10
> .../.debug
> .../.debug/home
> .../.debug/home/jolsa
> .../.debug/home/jolsa/.cache
> .../.debug/home/jolsa/.cache/debuginfod_client
> .../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7
> .../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable
> .../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable/5dcec522abf136fcfd3128f47e131f2365834dd7
> .../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable/5dcec522abf136fcfd3128f47e131f2365834dd7/elf
> .../.debug/home/jolsa/.cache/debuginfod_client/5dcec522abf136fcfd3128f47e131f2365834dd7/executable/5dcec522abf136fcfd3128f47e131f2365834dd7/debug
>
>
> Available also in:
> git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> perf/build_id
>
> thanks,
> jirka
>
>
> Cc: Frank Ch. Eigler <[email protected]>
> Cc: Mark Wielaard <[email protected]>
> ---
> include/linux/buildid.h | 12 +++++
> include/uapi/linux/perf_event.h | 42 +++++++++++++++---
> kernel/bpf/stackmap.c | 143 ++----------------------------------------------------------
> kernel/events/core.c | 32 ++++++++++++--
> lib/Makefile | 3 +-
> lib/buildid.c | 149 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> tools/include/uapi/linux/perf_event.h | 42 +++++++++++++++---
> tools/lib/perf/include/perf/event.h | 18 ++++++--
> tools/perf/Documentation/perf-buildid-cache.txt | 18 +++++++-
> tools/perf/Documentation/perf-config.txt | 10 ++++-
> tools/perf/Documentation/perf-record.txt | 3 ++
> tools/perf/builtin-buildid-cache.c | 241 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
> tools/perf/builtin-buildid-list.c | 3 ++
> tools/perf/builtin-record.c | 20 +++++++++
> tools/perf/tests/shell/trace+probe_vfs_getname.sh | 2 +-
> tools/perf/util/event.c | 41 +++++++++++++-----
> tools/perf/util/evsel.c | 10 +++--
> tools/perf/util/machine.c | 24 +++++++---
> tools/perf/util/map.c | 8 +++-
> tools/perf/util/map.h | 3 +-
> tools/perf/util/perf_api_probe.c | 10 +++++
> tools/perf/util/perf_api_probe.h | 1 +
> tools/perf/util/perf_event_attr_fprintf.c | 2 +
> tools/perf/util/probe-event.c | 6 +--
> tools/perf/util/record.h | 1 +
> tools/perf/util/session.c | 11 +++--
> tools/perf/util/symbol-elf.c | 37 +++++++++++++++-
> tools/perf/util/symbol_conf.h | 3 +-
> tools/perf/util/synthetic-events.c | 121 ++++++++++++++++++++++++++++++++++++++-------------
> 29 files changed, 787 insertions(+), 229 deletions(-)
>

2020-12-15 15:53:16

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 01/15] bpf: Move stack_map_get_build_id into lib

Em Mon, Dec 14, 2020 at 11:54:43AM +0100, Jiri Olsa escreveu:
> Moving stack_map_get_build_id into lib with
> declaration in linux/buildid.h header:
>
> int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id);
>
> This function returns build id for given struct vm_area_struct.
> There is no functional change to stack_map_get_build_id function.

Alexei, if you're ok with this, can you please process it? Linus will
find it strange if I send kernel bits, as we agreed that my tools pull
requests would be just for tooling.

- Arnaldo

> Cc: Alexei Starovoitov <[email protected]>
> Acked-by: Song Liu <[email protected]>
> Signed-off-by: Jiri Olsa <[email protected]>
> ---
> include/linux/buildid.h | 11 ++++
> kernel/bpf/stackmap.c | 143 ++--------------------------------------
> lib/Makefile | 3 +-
> lib/buildid.c | 136 ++++++++++++++++++++++++++++++++++++++
> 4 files changed, 153 insertions(+), 140 deletions(-)
> create mode 100644 include/linux/buildid.h
> create mode 100644 lib/buildid.c
>
> diff --git a/include/linux/buildid.h b/include/linux/buildid.h
> new file mode 100644
> index 000000000000..08028a212589
> --- /dev/null
> +++ b/include/linux/buildid.h
> @@ -0,0 +1,11 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_BUILDID_H
> +#define _LINUX_BUILDID_H
> +
> +#include <linux/mm_types.h>
> +
> +#define BUILD_ID_SIZE_MAX 20
> +
> +int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id);
> +
> +#endif
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index 06065fa27124..d21512fbfa9a 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -7,10 +7,9 @@
> #include <linux/kernel.h>
> #include <linux/stacktrace.h>
> #include <linux/perf_event.h>
> -#include <linux/elf.h>
> -#include <linux/pagemap.h>
> #include <linux/irq_work.h>
> #include <linux/btf_ids.h>
> +#include <linux/buildid.h>
> #include "percpu_freelist.h"
>
> #define STACK_CREATE_FLAG_MASK \
> @@ -153,140 +152,6 @@ static struct bpf_map *stack_map_alloc(union bpf_attr *attr)
> return ERR_PTR(err);
> }
>
> -#define BPF_BUILD_ID 3
> -/*
> - * Parse build id from the note segment. This logic can be shared between
> - * 32-bit and 64-bit system, because Elf32_Nhdr and Elf64_Nhdr are
> - * identical.
> - */
> -static inline int stack_map_parse_build_id(void *page_addr,
> - unsigned char *build_id,
> - void *note_start,
> - Elf32_Word note_size)
> -{
> - Elf32_Word note_offs = 0, new_offs;
> -
> - /* check for overflow */
> - if (note_start < page_addr || note_start + note_size < note_start)
> - return -EINVAL;
> -
> - /* only supports note that fits in the first page */
> - if (note_start + note_size > page_addr + PAGE_SIZE)
> - return -EINVAL;
> -
> - while (note_offs + sizeof(Elf32_Nhdr) < note_size) {
> - Elf32_Nhdr *nhdr = (Elf32_Nhdr *)(note_start + note_offs);
> -
> - if (nhdr->n_type == BPF_BUILD_ID &&
> - nhdr->n_namesz == sizeof("GNU") &&
> - nhdr->n_descsz > 0 &&
> - nhdr->n_descsz <= BPF_BUILD_ID_SIZE) {
> - memcpy(build_id,
> - note_start + note_offs +
> - ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr),
> - nhdr->n_descsz);
> - memset(build_id + nhdr->n_descsz, 0,
> - BPF_BUILD_ID_SIZE - nhdr->n_descsz);
> - return 0;
> - }
> - new_offs = note_offs + sizeof(Elf32_Nhdr) +
> - ALIGN(nhdr->n_namesz, 4) + ALIGN(nhdr->n_descsz, 4);
> - if (new_offs <= note_offs) /* overflow */
> - break;
> - note_offs = new_offs;
> - }
> - return -EINVAL;
> -}
> -
> -/* Parse build ID from 32-bit ELF */
> -static int stack_map_get_build_id_32(void *page_addr,
> - unsigned char *build_id)
> -{
> - Elf32_Ehdr *ehdr = (Elf32_Ehdr *)page_addr;
> - Elf32_Phdr *phdr;
> - int i;
> -
> - /* only supports phdr that fits in one page */
> - if (ehdr->e_phnum >
> - (PAGE_SIZE - sizeof(Elf32_Ehdr)) / sizeof(Elf32_Phdr))
> - return -EINVAL;
> -
> - phdr = (Elf32_Phdr *)(page_addr + sizeof(Elf32_Ehdr));
> -
> - for (i = 0; i < ehdr->e_phnum; ++i) {
> - if (phdr[i].p_type == PT_NOTE &&
> - !stack_map_parse_build_id(page_addr, build_id,
> - page_addr + phdr[i].p_offset,
> - phdr[i].p_filesz))
> - return 0;
> - }
> - return -EINVAL;
> -}
> -
> -/* Parse build ID from 64-bit ELF */
> -static int stack_map_get_build_id_64(void *page_addr,
> - unsigned char *build_id)
> -{
> - Elf64_Ehdr *ehdr = (Elf64_Ehdr *)page_addr;
> - Elf64_Phdr *phdr;
> - int i;
> -
> - /* only supports phdr that fits in one page */
> - if (ehdr->e_phnum >
> - (PAGE_SIZE - sizeof(Elf64_Ehdr)) / sizeof(Elf64_Phdr))
> - return -EINVAL;
> -
> - phdr = (Elf64_Phdr *)(page_addr + sizeof(Elf64_Ehdr));
> -
> - for (i = 0; i < ehdr->e_phnum; ++i) {
> - if (phdr[i].p_type == PT_NOTE &&
> - !stack_map_parse_build_id(page_addr, build_id,
> - page_addr + phdr[i].p_offset,
> - phdr[i].p_filesz))
> - return 0;
> - }
> - return -EINVAL;
> -}
> -
> -/* Parse build ID of ELF file mapped to vma */
> -static int stack_map_get_build_id(struct vm_area_struct *vma,
> - unsigned char *build_id)
> -{
> - Elf32_Ehdr *ehdr;
> - struct page *page;
> - void *page_addr;
> - int ret;
> -
> - /* only works for page backed storage */
> - if (!vma->vm_file)
> - return -EINVAL;
> -
> - page = find_get_page(vma->vm_file->f_mapping, 0);
> - if (!page)
> - return -EFAULT; /* page not mapped */
> -
> - ret = -EINVAL;
> - page_addr = kmap_atomic(page);
> - ehdr = (Elf32_Ehdr *)page_addr;
> -
> - /* compare magic x7f "ELF" */
> - if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0)
> - goto out;
> -
> - /* only support executable file and shared object file */
> - if (ehdr->e_type != ET_EXEC && ehdr->e_type != ET_DYN)
> - goto out;
> -
> - if (ehdr->e_ident[EI_CLASS] == ELFCLASS32)
> - ret = stack_map_get_build_id_32(page_addr, build_id);
> - else if (ehdr->e_ident[EI_CLASS] == ELFCLASS64)
> - ret = stack_map_get_build_id_64(page_addr, build_id);
> -out:
> - kunmap_atomic(page_addr);
> - put_page(page);
> - return ret;
> -}
> -
> static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
> u64 *ips, u32 trace_nr, bool user)
> {
> @@ -327,18 +192,18 @@ static void stack_map_get_build_id_offset(struct bpf_stack_build_id *id_offs,
> for (i = 0; i < trace_nr; i++) {
> id_offs[i].status = BPF_STACK_BUILD_ID_IP;
> id_offs[i].ip = ips[i];
> - memset(id_offs[i].build_id, 0, BPF_BUILD_ID_SIZE);
> + memset(id_offs[i].build_id, 0, BUILD_ID_SIZE_MAX);
> }
> return;
> }
>
> for (i = 0; i < trace_nr; i++) {
> vma = find_vma(current->mm, ips[i]);
> - if (!vma || stack_map_get_build_id(vma, id_offs[i].build_id)) {
> + if (!vma || build_id_parse(vma, id_offs[i].build_id)) {
> /* per entry fall back to ips */
> id_offs[i].status = BPF_STACK_BUILD_ID_IP;
> id_offs[i].ip = ips[i];
> - memset(id_offs[i].build_id, 0, BPF_BUILD_ID_SIZE);
> + memset(id_offs[i].build_id, 0, BUILD_ID_SIZE_MAX);
> continue;
> }
> id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i]
> diff --git a/lib/Makefile b/lib/Makefile
> index ce45af50983a..f4858f5e9215 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -36,7 +36,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
> flex_proportions.o ratelimit.o show_mem.o \
> is_single_threaded.o plist.o decompress.o kobject_uevent.o \
> earlycpio.o seq_buf.o siphash.o dec_and_lock.o \
> - nmi_backtrace.o nodemask.o win_minmax.o memcat_p.o
> + nmi_backtrace.o nodemask.o win_minmax.o memcat_p.o \
> + buildid.o
>
> lib-$(CONFIG_PRINTK) += dump_stack.o
> lib-$(CONFIG_SMP) += cpumask.o
> diff --git a/lib/buildid.c b/lib/buildid.c
> new file mode 100644
> index 000000000000..4a4f520c0e29
> --- /dev/null
> +++ b/lib/buildid.c
> @@ -0,0 +1,136 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include <linux/buildid.h>
> +#include <linux/elf.h>
> +#include <linux/pagemap.h>
> +
> +#define BUILD_ID 3
> +/*
> + * Parse build id from the note segment. This logic can be shared between
> + * 32-bit and 64-bit system, because Elf32_Nhdr and Elf64_Nhdr are
> + * identical.
> + */
> +static inline int parse_build_id(void *page_addr,
> + unsigned char *build_id,
> + void *note_start,
> + Elf32_Word note_size)
> +{
> + Elf32_Word note_offs = 0, new_offs;
> +
> + /* check for overflow */
> + if (note_start < page_addr || note_start + note_size < note_start)
> + return -EINVAL;
> +
> + /* only supports note that fits in the first page */
> + if (note_start + note_size > page_addr + PAGE_SIZE)
> + return -EINVAL;
> +
> + while (note_offs + sizeof(Elf32_Nhdr) < note_size) {
> + Elf32_Nhdr *nhdr = (Elf32_Nhdr *)(note_start + note_offs);
> +
> + if (nhdr->n_type == BUILD_ID &&
> + nhdr->n_namesz == sizeof("GNU") &&
> + nhdr->n_descsz > 0 &&
> + nhdr->n_descsz <= BUILD_ID_SIZE_MAX) {
> + memcpy(build_id,
> + note_start + note_offs +
> + ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr),
> + nhdr->n_descsz);
> + memset(build_id + nhdr->n_descsz, 0,
> + BUILD_ID_SIZE_MAX - nhdr->n_descsz);
> + return 0;
> + }
> + new_offs = note_offs + sizeof(Elf32_Nhdr) +
> + ALIGN(nhdr->n_namesz, 4) + ALIGN(nhdr->n_descsz, 4);
> + if (new_offs <= note_offs) /* overflow */
> + break;
> + note_offs = new_offs;
> + }
> + return -EINVAL;
> +}
> +
> +/* Parse build ID from 32-bit ELF */
> +static int get_build_id_32(void *page_addr, unsigned char *build_id)
> +{
> + Elf32_Ehdr *ehdr = (Elf32_Ehdr *)page_addr;
> + Elf32_Phdr *phdr;
> + int i;
> +
> + /* only supports phdr that fits in one page */
> + if (ehdr->e_phnum >
> + (PAGE_SIZE - sizeof(Elf32_Ehdr)) / sizeof(Elf32_Phdr))
> + return -EINVAL;
> +
> + phdr = (Elf32_Phdr *)(page_addr + sizeof(Elf32_Ehdr));
> +
> + for (i = 0; i < ehdr->e_phnum; ++i) {
> + if (phdr[i].p_type == PT_NOTE &&
> + !parse_build_id(page_addr, build_id,
> + page_addr + phdr[i].p_offset,
> + phdr[i].p_filesz))
> + return 0;
> + }
> + return -EINVAL;
> +}
> +
> +/* Parse build ID from 64-bit ELF */
> +static int get_build_id_64(void *page_addr, unsigned char *build_id)
> +{
> + Elf64_Ehdr *ehdr = (Elf64_Ehdr *)page_addr;
> + Elf64_Phdr *phdr;
> + int i;
> +
> + /* only supports phdr that fits in one page */
> + if (ehdr->e_phnum >
> + (PAGE_SIZE - sizeof(Elf64_Ehdr)) / sizeof(Elf64_Phdr))
> + return -EINVAL;
> +
> + phdr = (Elf64_Phdr *)(page_addr + sizeof(Elf64_Ehdr));
> +
> + for (i = 0; i < ehdr->e_phnum; ++i) {
> + if (phdr[i].p_type == PT_NOTE &&
> + !parse_build_id(page_addr, build_id,
> + page_addr + phdr[i].p_offset,
> + phdr[i].p_filesz))
> + return 0;
> + }
> + return -EINVAL;
> +}
> +
> +/* Parse build ID of ELF file mapped to vma */
> +int build_id_parse(struct vm_area_struct *vma, unsigned char *build_id)
> +{
> + Elf32_Ehdr *ehdr;
> + struct page *page;
> + void *page_addr;
> + int ret;
> +
> + /* only works for page backed storage */
> + if (!vma->vm_file)
> + return -EINVAL;
> +
> + page = find_get_page(vma->vm_file->f_mapping, 0);
> + if (!page)
> + return -EFAULT; /* page not mapped */
> +
> + ret = -EINVAL;
> + page_addr = kmap_atomic(page);
> + ehdr = (Elf32_Ehdr *)page_addr;
> +
> + /* compare magic x7f "ELF" */
> + if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0)
> + goto out;
> +
> + /* only support executable file and shared object file */
> + if (ehdr->e_type != ET_EXEC && ehdr->e_type != ET_DYN)
> + goto out;
> +
> + if (ehdr->e_ident[EI_CLASS] == ELFCLASS32)
> + ret = get_build_id_32(page_addr, build_id);
> + else if (ehdr->e_ident[EI_CLASS] == ELFCLASS64)
> + ret = get_build_id_64(page_addr, build_id);
> +out:
> + kunmap_atomic(page_addr);
> + put_page(page);
> + return ret;
> +}
> --
> 2.26.2
>

--

- Arnaldo
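
Note for readers of this thread: before it looks for notes, build_id_parse in the quoted patch checks the ELF magic, accepts only ET_EXEC and ET_DYN files, and then dispatches on EI_CLASS. A minimal user-space sketch of those checks follows, assuming a Linux system that provides <elf.h>. The function name elf_class_for_build_id is hypothetical, and the length check is an addition needed because this version takes a plain buffer rather than a mapped page.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns ELFCLASS32 or ELFCLASS64 if the buffer starts with the header of
 * an executable or shared object, and -1 otherwise.
 */
int elf_class_for_build_id(const unsigned char *page, size_t len)
{
	Elf32_Ehdr ehdr;

	assert(page != NULL);

	if (len < sizeof(ehdr))
		return -1;

	/*
	 * e_ident and e_type sit at the same offsets in 32-bit and 64-bit
	 * headers, so the 32-bit view is enough for these checks.
	 */
	memcpy(&ehdr, page, sizeof(ehdr));

	/* compare magic \x7f "ELF" */
	if (memcmp(ehdr.e_ident, ELFMAG, SELFMAG) != 0)
		return -1;

	/* only executables and shared objects are supported */
	if (ehdr.e_type != ET_EXEC && ehdr.e_type != ET_DYN)
		return -1;

	if (ehdr.e_ident[EI_CLASS] == ELFCLASS32)
		return ELFCLASS32;
	if (ehdr.e_ident[EI_CLASS] == ELFCLASS64)
		return ELFCLASS64;
	return -1;
}
```

The return value tells the caller which program header layout (Elf32_Phdr or Elf64_Phdr) to use when scanning for PT_NOTE segments.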

2020-12-15 15:56:49

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 03/15] perf: Add build id data in mmap2 event

Em Mon, Dec 14, 2020 at 11:54:45AM +0100, Jiri Olsa escreveu:
> Adding support to carry build id data in mmap2 event.
>
> The build id data replaces maj/min/ino/ino_generation
> fields, which are also used to identify map's binary,
> so it's ok to replace them with build id data:
>
> union {
> struct {
> u32 maj;
> u32 min;
> u64 ino;
> u64 ino_generation;
> };
> struct {
> u8 build_id_size;
> u8 __reserved_1;
> u16 __reserved_2;
> u8 build_id[20];
> };
> };

Alexei/Daniel, this one depends on BPF's build id routines being exported
for use by the perf kernel subsys. PeterZ already acked this, so can you
guys consider getting the first three patches in this series via the bpf
tree?

The BPF bits were acked by Song.

- Arnaldo

> Replaced maj/min/ino/ino_generation fields give us size
> of 24 bytes. We use 20 bytes for build id data, 1 byte
> for size and rest is unused.
>
> There's new misc bit for mmap2 to signal there's build
> id data in it:
>
> #define PERF_RECORD_MISC_MMAP_BUILD_ID (1 << 14)
>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Jiri Olsa <[email protected]>
> ---
> include/uapi/linux/perf_event.h | 42 +++++++++++++++++++++++++++++----
> kernel/events/core.c | 32 +++++++++++++++++++++----
> 2 files changed, 65 insertions(+), 9 deletions(-)
>
> diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
> index b95d3c485d27..45a216bea048 100644
> --- a/include/uapi/linux/perf_event.h
> +++ b/include/uapi/linux/perf_event.h
> @@ -384,7 +384,8 @@ struct perf_event_attr {
> aux_output : 1, /* generate AUX records instead of events */
> cgroup : 1, /* include cgroup events */
> text_poke : 1, /* include text poke events */
> - __reserved_1 : 30;
> + build_id : 1, /* use build id in mmap2 events */
> + __reserved_1 : 29;
>
> union {
> __u32 wakeup_events; /* wakeup every n events */
> @@ -657,6 +658,22 @@ struct perf_event_mmap_page {
> __u64 aux_size;
> };
>
> +/*
> + * The current state of perf_event_header::misc bits usage:
> + * ('|' used bit, '-' unused bit)
> + *
> + * 012 CDEF
> + * |||---------||||
> + *
> + * Where:
> + * 0-2 CPUMODE_MASK
> + *
> + * C PROC_MAP_PARSE_TIMEOUT
> + * D MMAP_DATA / COMM_EXEC / FORK_EXEC / SWITCH_OUT
> + * E MMAP_BUILD_ID / EXACT_IP / SCHED_OUT_PREEMPT
> + * F (reserved)
> + */
> +
> #define PERF_RECORD_MISC_CPUMODE_MASK (7 << 0)
> #define PERF_RECORD_MISC_CPUMODE_UNKNOWN (0 << 0)
> #define PERF_RECORD_MISC_KERNEL (1 << 0)
> @@ -688,6 +705,7 @@ struct perf_event_mmap_page {
> *
> * PERF_RECORD_MISC_EXACT_IP - PERF_RECORD_SAMPLE of precise events
> * PERF_RECORD_MISC_SWITCH_OUT_PREEMPT - PERF_RECORD_SWITCH* events
> + * PERF_RECORD_MISC_MMAP_BUILD_ID - PERF_RECORD_MMAP2 event
> *
> *
> * PERF_RECORD_MISC_EXACT_IP:
> @@ -697,9 +715,13 @@ struct perf_event_mmap_page {
> *
> * PERF_RECORD_MISC_SWITCH_OUT_PREEMPT:
> * Indicates that thread was preempted in TASK_RUNNING state.
> + *
> + * PERF_RECORD_MISC_MMAP_BUILD_ID:
> + * Indicates that mmap2 event carries build id data.
> */
> #define PERF_RECORD_MISC_EXACT_IP (1 << 14)
> #define PERF_RECORD_MISC_SWITCH_OUT_PREEMPT (1 << 14)
> +#define PERF_RECORD_MISC_MMAP_BUILD_ID (1 << 14)
> /*
> * Reserve the last bit to indicate some extended misc field
> */
> @@ -911,10 +933,20 @@ enum perf_event_type {
> * u64 addr;
> * u64 len;
> * u64 pgoff;
> - * u32 maj;
> - * u32 min;
> - * u64 ino;
> - * u64 ino_generation;
> + * union {
> + * struct {
> + * u32 maj;
> + * u32 min;
> + * u64 ino;
> + * u64 ino_generation;
> + * };
> + * struct {
> + * u8 build_id_size;
> + * u8 __reserved_1;
> + * u16 __reserved_2;
> + * u8 build_id[20];
> + * };
> + * };
> * u32 prot, flags;
> * char filename[];
> * struct sample_id sample_id;
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index dc568ca295bd..6cbd04a24d3a 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -51,6 +51,7 @@
> #include <linux/proc_ns.h>
> #include <linux/mount.h>
> #include <linux/min_heap.h>
> +#include <linux/buildid.h>
>
> #include "internal.h"
>
> @@ -395,6 +396,7 @@ static atomic_t nr_ksymbol_events __read_mostly;
> static atomic_t nr_bpf_events __read_mostly;
> static atomic_t nr_cgroup_events __read_mostly;
> static atomic_t nr_text_poke_events __read_mostly;
> +static atomic_t nr_build_id_events __read_mostly;
>
> static LIST_HEAD(pmus);
> static DEFINE_MUTEX(pmus_lock);
> @@ -4665,6 +4667,8 @@ static void unaccount_event(struct perf_event *event)
> dec = true;
> if (event->attr.mmap || event->attr.mmap_data)
> atomic_dec(&nr_mmap_events);
> + if (event->attr.build_id)
> + atomic_dec(&nr_build_id_events);
> if (event->attr.comm)
> atomic_dec(&nr_comm_events);
> if (event->attr.namespaces)
> @@ -7934,6 +7938,8 @@ struct perf_mmap_event {
> u64 ino;
> u64 ino_generation;
> u32 prot, flags;
> + u8 build_id[BUILD_ID_SIZE_MAX];
> + u32 build_id_size;
>
> struct {
> struct perf_event_header header;
> @@ -7965,6 +7971,7 @@ static void perf_event_mmap_output(struct perf_event *event,
> struct perf_sample_data sample;
> int size = mmap_event->event_id.header.size;
> u32 type = mmap_event->event_id.header.type;
> + bool use_build_id;
> int ret;
>
> if (!perf_event_mmap_match(event, data))
> @@ -7989,13 +7996,25 @@ static void perf_event_mmap_output(struct perf_event *event,
> mmap_event->event_id.pid = perf_event_pid(event, current);
> mmap_event->event_id.tid = perf_event_tid(event, current);
>
> + use_build_id = event->attr.build_id && mmap_event->build_id_size;
> +
> + if (event->attr.mmap2 && use_build_id)
> + mmap_event->event_id.header.misc |= PERF_RECORD_MISC_MMAP_BUILD_ID;
> +
> perf_output_put(&handle, mmap_event->event_id);
>
> if (event->attr.mmap2) {
> - perf_output_put(&handle, mmap_event->maj);
> - perf_output_put(&handle, mmap_event->min);
> - perf_output_put(&handle, mmap_event->ino);
> - perf_output_put(&handle, mmap_event->ino_generation);
> + if (use_build_id) {
> + u8 size[4] = { (u8) mmap_event->build_id_size, 0, 0, 0 };
> +
> + __output_copy(&handle, size, 4);
> + __output_copy(&handle, mmap_event->build_id, BUILD_ID_SIZE_MAX);
> + } else {
> + perf_output_put(&handle, mmap_event->maj);
> + perf_output_put(&handle, mmap_event->min);
> + perf_output_put(&handle, mmap_event->ino);
> + perf_output_put(&handle, mmap_event->ino_generation);
> + }
> perf_output_put(&handle, mmap_event->prot);
> perf_output_put(&handle, mmap_event->flags);
> }
> @@ -8124,6 +8143,9 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)
>
> mmap_event->event_id.header.size = sizeof(mmap_event->event_id) + size;
>
> + if (atomic_read(&nr_build_id_events))
> + build_id_parse(vma, mmap_event->build_id, &mmap_event->build_id_size);
> +
> perf_iterate_sb(perf_event_mmap_output,
> mmap_event,
> NULL);
> @@ -11060,6 +11082,8 @@ static void account_event(struct perf_event *event)
> inc = true;
> if (event->attr.mmap || event->attr.mmap_data)
> atomic_inc(&nr_mmap_events);
> + if (event->attr.build_id)
> + atomic_inc(&nr_build_id_events);
> if (event->attr.comm)
> atomic_inc(&nr_comm_events);
> if (event->attr.namespaces)
> --
> 2.26.2
>

--

- Arnaldo
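
Note for readers of this thread: the quoted changelog states that the build id fields fit in the 24 bytes previously used by maj/min/ino/ino_generation, and that a consumer tells the two layouts apart with the misc bit. The sketch below checks both points in user space. union mmap2_id_sketch, its member names and mmap2_build_id_len are hypothetical; only the field order and the (1 << 14) value are taken from the patch. Bit 14 is shared with EXACT_IP and SWITCH_OUT_PREEMPT, so the test is meaningful only for PERF_RECORD_MMAP2 records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of PERF_RECORD_MISC_MMAP_BUILD_ID in the patch. */
#define MISC_MMAP_BUILD_ID (1 << 14)

/* Hypothetical mirror of the union added to the mmap2 record layout. */
union mmap2_id_sketch {
	struct {
		uint32_t maj;
		uint32_t min;
		uint64_t ino;
		uint64_t ino_generation;
	} dev;
	struct {
		uint8_t  build_id_size;
		uint8_t  reserved_1;
		uint16_t reserved_2;
		uint8_t  build_id[20];
	} bid;
};

/*
 * Returns the build ID length carried by an mmap2 record, or 0 when the
 * misc bit is clear and the bytes hold maj/min/ino/ino_generation instead.
 */
unsigned int mmap2_build_id_len(uint16_t misc, const union mmap2_id_sketch *id)
{
	assert(id != NULL);

	if (!(misc & MISC_MMAP_BUILD_ID))
		return 0;

	/* treat an out-of-range size as "no build ID" */
	return id->bid.build_id_size <= 20 ? id->bid.build_id_size : 0;
}
```

Both arms of the union occupy 24 bytes (1 + 1 + 2 + 20 and 4 + 4 + 8 + 8), so the offsets of prot, flags and filename in the record do not change.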

2020-12-15 15:57:01

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 13/15] perf buildid-cache: Add --debuginfod option

Em Mon, Dec 14, 2020 at 11:54:55AM +0100, Jiri Olsa escreveu:
> Adding --debuginfod option to specify debuginfod url and
> support to do that through config file as well.
>
> Use following in ~/.perfconfig file:
>
> [buildid-cache]
> debuginfod=http://192.168.122.174:8002

I was going to try and cherry-pick this one, but it comes after other
changes that depend on bits that are not merged yet :-\

- Arnaldo

> Acked-by: Ian Rogers <[email protected]>
> Signed-off-by: Jiri Olsa <[email protected]>
> ---
> .../perf/Documentation/perf-buildid-cache.txt | 6 ++++
> tools/perf/Documentation/perf-config.txt | 7 +++++
> tools/perf/builtin-buildid-cache.c | 28 +++++++++++++++++--
> 3 files changed, 38 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-buildid-cache.txt b/tools/perf/Documentation/perf-buildid-cache.txt
> index b77da5138bca..b9987d1399ca 100644
> --- a/tools/perf/Documentation/perf-buildid-cache.txt
> +++ b/tools/perf/Documentation/perf-buildid-cache.txt
> @@ -84,6 +84,12 @@ OPTIONS
> used when creating a uprobe for a process that resides in a
> different mount namespace from the perf(1) utility.
>
> +--debuginfod=URLs::
> + Specify debuginfod URL to be used when retrieving perf.data binaries,
> + it follows the same syntax as the DEBUGINFOD_URLS variable, like:
> +
> + buildid-cache.debuginfod=http://192.168.122.174:8002
> +
> SEE ALSO
> --------
> linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-buildid-list[1]
> diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
> index 31069d8a5304..e3672c5d801b 100644
> --- a/tools/perf/Documentation/perf-config.txt
> +++ b/tools/perf/Documentation/perf-config.txt
> @@ -238,6 +238,13 @@ buildid.*::
> cache location, or to disable it altogether. If you want to disable it,
> set buildid.dir to /dev/null. The default is $HOME/.debug
>
> +buildid-cache.*::
> + buildid-cache.debuginfod=URLs
> + Specify debuginfod URLs to be used when retrieving perf.data binaries,
> + it follows the same syntax as the DEBUGINFOD_URLS variable, like:
> +
> + buildid-cache.debuginfod=http://192.168.122.174:8002
> +
> annotate.*::
> These are in control of addresses, jump function, source code
> in lines of assembly code from a specific program.
> diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
> index f0afb2c89e03..864597fd9cf6 100644
> --- a/tools/perf/builtin-buildid-cache.c
> +++ b/tools/perf/builtin-buildid-cache.c
> @@ -27,6 +27,7 @@
> #include "util/time-utils.h"
> #include "util/util.h"
> #include "util/probe-file.h"
> +#include "util/config.h"
> #include <linux/string.h>
> #include <linux/err.h>
> #include <linux/zalloc.h>
> @@ -550,12 +551,21 @@ build_id_cache__add_perf_data(const char *path, bool all)
> return err;
> }
>
> +static int perf_buildid_cache_config(const char *var, const char *value, void *cb)
> +{
> + const char **debuginfod = cb;
> +
> + if (!strcmp(var, "buildid-cache.debuginfod"))
> + *debuginfod = strdup(value);
> +
> + return 0;
> +}
> +
> int cmd_buildid_cache(int argc, const char **argv)
> {
> struct strlist *list;
> struct str_node *pos;
> - int ret = 0;
> - int ns_id = -1;
> + int ret, ns_id = -1;
> bool force = false;
> bool list_files = false;
> bool opts_flag = false;
> @@ -565,7 +575,8 @@ int cmd_buildid_cache(int argc, const char **argv)
> *purge_name_list_str = NULL,
> *missing_filename = NULL,
> *update_name_list_str = NULL,
> - *kcore_filename = NULL;
> + *kcore_filename = NULL,
> + *debuginfod = NULL;
> char sbuf[STRERR_BUFSIZE];
>
> struct perf_data data = {
> @@ -590,6 +601,8 @@ int cmd_buildid_cache(int argc, const char **argv)
> OPT_BOOLEAN('f', "force", &force, "don't complain, do it"),
> OPT_STRING('u', "update", &update_name_list_str, "file list",
> "file(s) to update"),
> + OPT_STRING(0, "debuginfod", &debuginfod, "debuginfod url",
> + "set debuginfod url"),
> OPT_INCR('v', "verbose", &verbose, "be more verbose"),
> OPT_INTEGER(0, "target-ns", &ns_id, "target pid for namespace context"),
> OPT_END()
> @@ -599,6 +612,10 @@ int cmd_buildid_cache(int argc, const char **argv)
> NULL
> };
>
> + ret = perf_config(perf_buildid_cache_config, &debuginfod);
> + if (ret)
> + return ret;
> +
> argc = parse_options(argc, argv, buildid_cache_options,
> buildid_cache_usage, 0);
>
> @@ -610,6 +627,11 @@ int cmd_buildid_cache(int argc, const char **argv)
> if (argc || !(list_files || opts_flag))
> usage_with_options(buildid_cache_usage, buildid_cache_options);
>
> + if (debuginfod) {
> + pr_debug("DEBUGINFOD_URLS=%s\n", debuginfod);
> + setenv("DEBUGINFOD_URLS", debuginfod, 1);
> + }
> +
> /* -l is exclusive. It can not be used with other options. */
> if (list_files && opts_flag) {
> usage_with_options_msg(buildid_cache_usage,
> --
> 2.26.2
>

--

- Arnaldo
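
In the patch above the config file is read first, parse_options() can then overwrite the same variable, and the result is exported with setenv(). A minimal sketch of that precedence, assuming nothing beyond what the diff shows; apply_debuginfod_url() is a hypothetical helper, not a perf function:

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Pick the debuginfod URL: a command-line value overrides the config value.
 * If neither is set, any DEBUGINFOD_URLS already in the environment is kept.
 * Returns 0 on success, -1 if setenv() fails.
 */
static int apply_debuginfod_url(const char *config_url, const char *cmdline_url)
{
	const char *url = cmdline_url ? cmdline_url : config_url;

	if (!url)
		return 0;	/* leave the environment untouched */

	/* Last argument 1: replace a value inherited from the caller. */
	return setenv("DEBUGINFOD_URLS", url, 1);
}
```

Because the final setenv() call overwrites, a URL from ~/.perfconfig or --debuginfod takes priority over a DEBUGINFOD_URLS value inherited from the shell.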

2020-12-15 15:58:27

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 06/15] perf tools: Add support to read build id from compressed elf

On Mon, Dec 14, 2020 at 11:54:48AM +0100, Jiri Olsa wrote:
> Adding support to decompress file before reading build id.
>
> Adding filename__read_build_id and change its current
> versions to read_build_id.
>
> Shutting down stderr output of perf list in the shell test:
> 82: Check open filename arg using perf trace + vfs_getname : Ok

Tentatively cherry picking this one.

- Arnaldo

> because with the decompression code in place, the
> filename__read_build_id function is more verbose in case
> of error and the test did not account for that.
>
> Signed-off-by: Jiri Olsa <[email protected]>
> ---
> .../tests/shell/trace+probe_vfs_getname.sh | 2 +-
> tools/perf/util/symbol-elf.c | 37 ++++++++++++++++++-
> 2 files changed, 36 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/tests/shell/trace+probe_vfs_getname.sh b/tools/perf/tests/shell/trace+probe_vfs_getname.sh
> index 11cc2af13f2b..3d31c1d560d6 100755
> --- a/tools/perf/tests/shell/trace+probe_vfs_getname.sh
> +++ b/tools/perf/tests/shell/trace+probe_vfs_getname.sh
> @@ -20,7 +20,7 @@ skip_if_no_perf_trace || exit 2
> file=$(mktemp /tmp/temporary_file.XXXXX)
>
> trace_open_vfs_getname() {
> - evts=$(echo $(perf list syscalls:sys_enter_open* 2>&1 | egrep 'open(at)? ' | sed -r 's/.*sys_enter_([a-z]+) +\[.*$/\1/') | sed 's/ /,/')
> + evts=$(echo $(perf list syscalls:sys_enter_open* 2>/dev/null | egrep 'open(at)? ' | sed -r 's/.*sys_enter_([a-z]+) +\[.*$/\1/') | sed 's/ /,/')
> perf trace -e $evts touch $file 2>&1 | \
> egrep " +[0-9]+\.[0-9]+ +\( +[0-9]+\.[0-9]+ ms\): +touch\/[0-9]+ open(at)?\((dfd: +CWD, +)?filename: +${file}, +flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, +mode: +IRUGO\|IWUGO\) += +[0-9]+$"
> }
> diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> index 44dd86a4f25f..f3577f7d72fe 100644
> --- a/tools/perf/util/symbol-elf.c
> +++ b/tools/perf/util/symbol-elf.c
> @@ -534,7 +534,7 @@ static int elf_read_build_id(Elf *elf, void *bf, size_t size)
>
> #ifdef HAVE_LIBBFD_BUILDID_SUPPORT
>
> -int filename__read_build_id(const char *filename, struct build_id *bid)
> +static int read_build_id(const char *filename, struct build_id *bid)
> {
> size_t size = sizeof(bid->data);
> int err = -1;
> @@ -563,7 +563,7 @@ int filename__read_build_id(const char *filename, struct build_id *bid)
>
> #else // HAVE_LIBBFD_BUILDID_SUPPORT
>
> -int filename__read_build_id(const char *filename, struct build_id *bid)
> +static int read_build_id(const char *filename, struct build_id *bid)
> {
> size_t size = sizeof(bid->data);
> int fd, err = -1;
> @@ -595,6 +595,39 @@ int filename__read_build_id(const char *filename, struct build_id *bid)
>
> #endif // HAVE_LIBBFD_BUILDID_SUPPORT
>
> +int filename__read_build_id(const char *filename, struct build_id *bid)
> +{
> + struct kmod_path m = { .name = NULL, };
> + char path[PATH_MAX];
> + int err;
> +
> + if (!filename)
> + return -EFAULT;
> +
> + err = kmod_path__parse(&m, filename);
> + if (err)
> + return -1;
> +
> + if (m.comp) {
> + int error = 0, fd;
> +
> + fd = filename__decompress(filename, path, sizeof(path), m.comp, &error);
> + if (fd < 0) {
> + pr_debug("Failed to decompress (error %d) %s\n",
> + error, filename);
> + return -1;
> + }
> + close(fd);
> + filename = path;
> + }
> +
> + err = read_build_id(filename, bid);
> +
> + if (m.comp)
> + unlink(filename);
> + return err;
> +}
> +
> int sysfs__read_build_id(const char *filename, struct build_id *bid)
> {
> size_t size = sizeof(bid->data);
> --
> 2.26.2
>

--

- Arnaldo
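
The new wrapper only decompresses when kmod_path__parse() reports a compression type in m.comp. A minimal sketch of that kind of check, assuming detection is driven by the file name suffix; compression_suffix() and the suffix list are hypothetical and only illustrate the idea:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list; the real set depends on how the tool was built. */
static const char *const comp_suffixes[] = { "gz", "xz", "zst" };

/*
 * Returns the matching suffix (e.g. "xz") when the name looks compressed,
 * or NULL when it does not (or when filename is NULL).
 */
static const char *compression_suffix(const char *filename)
{
	const char *ext;
	size_t i;

	if (!filename)
		return NULL;

	ext = strrchr(filename, '.');
	if (!ext || ext[1] == '\0')
		return NULL;

	for (i = 0; i < sizeof(comp_suffixes) / sizeof(comp_suffixes[0]); i++) {
		if (!strcmp(ext + 1, comp_suffixes[i]))
			return comp_suffixes[i];
	}
	return NULL;
}
```

A non-NULL result corresponds to the m.comp branch in the patch: decompress into a temporary path, read the build id from that path, then unlink it.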

2020-12-25 22:39:26

by Jiri Olsa

Subject: Re: [PATCH 03/15] perf: Add build id data in mmap2 event

On Tue, Dec 15, 2020 at 11:01:51PM +0100, Daniel Borkmann wrote:
> Hey Arnaldo,
>
> On 12/15/20 4:52 PM, Arnaldo Carvalho de Melo wrote:
> > On Mon, Dec 14, 2020 at 11:54:45AM +0100, Jiri Olsa wrote:
> > > Adding support to carry build id data in mmap2 event.
> > >
> > > The build id data replaces maj/min/ino/ino_generation
> > > fields, which are also used to identify map's binary,
> > > so it's ok to replace them with build id data:
> > >
> > > union {
> > > struct {
> > > u32 maj;
> > > u32 min;
> > > u64 ino;
> > > u64 ino_generation;
> > > };
> > > struct {
> > > u8 build_id_size;
> > > u8 __reserved_1;
> > > u16 __reserved_2;
> > > u8 build_id[20];
> > > };
> > > };
> >
> > Alexei/Daniel, this one depends on BPFs build id routines to be exported
> > for use by the perf kernel subsys, PeterZ already acked this, so can you
> > guys consider getting the first three patches in this series via the bpf
> > tree?
> >
> > The BPF bits were acked by Song.
>
> All the net-next and therefore also bpf-next bits for 5.11 were just merged
> by Linus into his tree. If you need the first 3 from [0] to land for this merge
> window, it's probably easiest if you take them in and send them via perf tree
> directly in case you didn't send out a pull-req yet.. (alternatively I'll ping
> David/Jakub if they plan to make a 2nd net-next pull-req end of this week).
>
> Thanks,
> Daniel
>
> [0] https://lore.kernel.org/lkml/[email protected]/
>

hi,
I don't see them (first 3 from [0]) in any tree so far ;-) please let
me know if there's anything I can do from my side to get this merged

thanks,
jirka


[0] https://lore.kernel.org/lkml/[email protected]/

2020-12-28 13:04:19

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 06/15] perf tools: Add support to read build id from compressed elf

On Tue, Dec 15, 2020 at 12:55:03PM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Dec 14, 2020 at 11:54:48AM +0100, Jiri Olsa wrote:
> > Adding support to decompress file before reading build id.
> >
> > Adding filename__read_build_id and change its current
> > versions to read_build_id.
> >
> > Shutting down stderr output of perf list in the shell test:
> > 82: Check open filename arg using perf trace + vfs_getname : Ok
>
> Tentatively cherry picking this one.

This one made it into v5.11. Processing the tooling bits in the other
patches now, into perf/core.

- Arnaldo
>
> > because with the decompression code in place, the
> > filename__read_build_id function is more verbose in case
> > of error and the test did not account for that.
> >
> > Signed-off-by: Jiri Olsa <[email protected]>
> > ---
> > .../tests/shell/trace+probe_vfs_getname.sh | 2 +-
> > tools/perf/util/symbol-elf.c | 37 ++++++++++++++++++-
> > 2 files changed, 36 insertions(+), 3 deletions(-)
> >
> > diff --git a/tools/perf/tests/shell/trace+probe_vfs_getname.sh b/tools/perf/tests/shell/trace+probe_vfs_getname.sh
> > index 11cc2af13f2b..3d31c1d560d6 100755
> > --- a/tools/perf/tests/shell/trace+probe_vfs_getname.sh
> > +++ b/tools/perf/tests/shell/trace+probe_vfs_getname.sh
> > @@ -20,7 +20,7 @@ skip_if_no_perf_trace || exit 2
> > file=$(mktemp /tmp/temporary_file.XXXXX)
> >
> > trace_open_vfs_getname() {
> > - evts=$(echo $(perf list syscalls:sys_enter_open* 2>&1 | egrep 'open(at)? ' | sed -r 's/.*sys_enter_([a-z]+) +\[.*$/\1/') | sed 's/ /,/')
> > + evts=$(echo $(perf list syscalls:sys_enter_open* 2>/dev/null | egrep 'open(at)? ' | sed -r 's/.*sys_enter_([a-z]+) +\[.*$/\1/') | sed 's/ /,/')
> > perf trace -e $evts touch $file 2>&1 | \
> > egrep " +[0-9]+\.[0-9]+ +\( +[0-9]+\.[0-9]+ ms\): +touch\/[0-9]+ open(at)?\((dfd: +CWD, +)?filename: +${file}, +flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, +mode: +IRUGO\|IWUGO\) += +[0-9]+$"
> > }
> > diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> > index 44dd86a4f25f..f3577f7d72fe 100644
> > --- a/tools/perf/util/symbol-elf.c
> > +++ b/tools/perf/util/symbol-elf.c
> > @@ -534,7 +534,7 @@ static int elf_read_build_id(Elf *elf, void *bf, size_t size)
> >
> > #ifdef HAVE_LIBBFD_BUILDID_SUPPORT
> >
> > -int filename__read_build_id(const char *filename, struct build_id *bid)
> > +static int read_build_id(const char *filename, struct build_id *bid)
> > {
> > size_t size = sizeof(bid->data);
> > int err = -1;
> > @@ -563,7 +563,7 @@ int filename__read_build_id(const char *filename, struct build_id *bid)
> >
> > #else // HAVE_LIBBFD_BUILDID_SUPPORT
> >
> > -int filename__read_build_id(const char *filename, struct build_id *bid)
> > +static int read_build_id(const char *filename, struct build_id *bid)
> > {
> > size_t size = sizeof(bid->data);
> > int fd, err = -1;
> > @@ -595,6 +595,39 @@ int filename__read_build_id(const char *filename, struct build_id *bid)
> >
> > #endif // HAVE_LIBBFD_BUILDID_SUPPORT
> >
> > +int filename__read_build_id(const char *filename, struct build_id *bid)
> > +{
> > + struct kmod_path m = { .name = NULL, };
> > + char path[PATH_MAX];
> > + int err;
> > +
> > + if (!filename)
> > + return -EFAULT;
> > +
> > + err = kmod_path__parse(&m, filename);
> > + if (err)
> > + return -1;
> > +
> > + if (m.comp) {
> > + int error = 0, fd;
> > +
> > + fd = filename__decompress(filename, path, sizeof(path), m.comp, &error);
> > + if (fd < 0) {
> > + pr_debug("Failed to decompress (error %d) %s\n",
> > + error, filename);
> > + return -1;
> > + }
> > + close(fd);
> > + filename = path;
> > + }
> > +
> > + err = read_build_id(filename, bid);
> > +
> > + if (m.comp)
> > + unlink(filename);
> > + return err;
> > +}
> > +
> > int sysfs__read_build_id(const char *filename, struct build_id *bid)
> > {
> > size_t size = sizeof(bid->data);
> > --
> > 2.26.2
> >
>
> --
>
> - Arnaldo

--

- Arnaldo

2020-12-28 13:47:42

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 15/15] perf record: Add --buildid-mmap option to enable mmap's build id

On Mon, Dec 14, 2020 at 11:54:57AM +0100, Jiri Olsa wrote:
> Adding --buildid-mmap option to enable build id in mmap2 events.
> It will only work if there's kernel support for that and it disables
> build id cache (implies --no-buildid).
>
> It's also possible to enable it permanently via config option
> in the ~/.perfconfig file:
>
> [record]
> build-id=mmap
>
> Also added build_id bit in the verbose output for perf_event_attr:
>
> # perf record --buildid-mmap -vv
> ...
> perf_event_attr:
> type 1
> size 120
> ...
> build_id 1
>
> Adding also missing text_poke bit.

I'm moving this to just before the:

perf tools: Add support to display build id when available in PERF_RECORD_MMAP2 events

So that I can actually print the build-ids, whether synthesized or
obtained from the kernel, i.e. this:

perf report -D | grep MMAP2 | head -4

Will work at that point.

> Acked-by: Ian Rogers <[email protected]>
> Signed-off-by: Jiri Olsa <[email protected]>
> ---
> tools/perf/Documentation/perf-config.txt | 3 ++-
> tools/perf/Documentation/perf-record.txt | 3 +++
> tools/perf/builtin-record.c | 20 ++++++++++++++++++++
> tools/perf/util/evsel.c | 10 ++++++----
> tools/perf/util/perf_api_probe.c | 10 ++++++++++
> tools/perf/util/perf_api_probe.h | 1 +
> tools/perf/util/perf_event_attr_fprintf.c | 2 ++
> tools/perf/util/record.h | 1 +
> 8 files changed, 45 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/Documentation/perf-config.txt b/tools/perf/Documentation/perf-config.txt
> index e3672c5d801b..8a1c6c16821a 100644
> --- a/tools/perf/Documentation/perf-config.txt
> +++ b/tools/perf/Documentation/perf-config.txt
> @@ -559,11 +559,12 @@ kmem.*::
>
> record.*::
> record.build-id::
> - This option can be 'cache', 'no-cache' or 'skip'.
> + This option can be 'cache', 'no-cache', 'skip' or 'mmap'.
> 'cache' is to post-process data and save/update the binaries into
> the build-id cache (in ~/.debug). This is the default.
> But if this option is 'no-cache', it will not update the build-id cache.
> 'skip' skips post-processing and does not update the cache.
> + 'mmap' skips post-processing and reads build-ids from MMAP events.
>
> record.call-graph::
> This is identical to 'call-graph.record-mode', except it is
> diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
> index 768888b9326a..1bcf51e24979 100644
> --- a/tools/perf/Documentation/perf-record.txt
> +++ b/tools/perf/Documentation/perf-record.txt
> @@ -482,6 +482,9 @@ Specify vmlinux path which has debuginfo.
> --buildid-all::
> Record build-id of all DSOs regardless whether it's actually hit or not.
>
> +--buildid-mmap::
> +Record build ids in mmap2 events, disables build id cache (implies --no-buildid).
> +
> --aio[=n]::
> Use <n> control blocks in asynchronous (Posix AIO) trace writing mode (default: 1, max: 4).
> Asynchronous mode is supported only when linking Perf tool with libc library
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index d832c108a1ca..f6bfad096756 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -102,6 +102,7 @@ struct record {
> bool no_buildid_cache;
> bool no_buildid_cache_set;
> bool buildid_all;
> + bool buildid_mmap;
> bool timestamp_filename;
> bool timestamp_boundary;
> struct switch_output switch_output;
> @@ -2135,6 +2136,8 @@ static int perf_record_config(const char *var, const char *value, void *cb)
> rec->no_buildid_cache = true;
> else if (!strcmp(value, "skip"))
> rec->no_buildid = true;
> + else if (!strcmp(value, "mmap"))
> + rec->buildid_mmap = true;
> else
> return -1;
> return 0;
> @@ -2550,6 +2553,8 @@ static struct option __record_options[] = {
> "file", "vmlinux pathname"),
> OPT_BOOLEAN(0, "buildid-all", &record.buildid_all,
> "Record build-id of all DSOs regardless of hits"),
> + OPT_BOOLEAN(0, "buildid-mmap", &record.buildid_mmap,
> + "Record build-id in map events"),
> OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
> "append timestamp to output filename"),
> OPT_BOOLEAN(0, "timestamp-boundary", &record.timestamp_boundary,
> @@ -2653,6 +2658,21 @@ int cmd_record(int argc, const char **argv)
>
> }
>
> + if (rec->buildid_mmap) {
> + if (!perf_can_record_build_id()) {
> + pr_err("Failed: no support to record build id in mmap events, update your kernel.\n");
> + err = -EINVAL;
> + goto out_opts;
> + }
> + pr_debug("Enabling build id in mmap2 events.\n");
> + /* Enable mmap build id synthesizing. */
> + symbol_conf.buildid_mmap2 = true;
> + /* Enable perf_event_attr::build_id bit. */
> + rec->opts.build_id = true;
> + /* Disable build id cache. */
> + rec->no_buildid = true;
> + }
> +
> if (rec->opts.kcore)
> rec->data.is_dir = true;
>
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index 3dd0eae9810d..191500c1f9f6 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -1168,10 +1168,12 @@ void evsel__config(struct evsel *evsel, struct record_opts *opts,
> if (opts->sample_weight)
> evsel__set_sample_bit(evsel, WEIGHT);
>
> - attr->task = track;
> - attr->mmap = track;
> - attr->mmap2 = track && !perf_missing_features.mmap2;
> - attr->comm = track;
> + attr->task = track;
> + attr->mmap = track;
> + attr->mmap2 = track && !perf_missing_features.mmap2;
> + attr->comm = track;
> + attr->build_id = track && opts->build_id;
> +
> /*
> * ksymbol is tracked separately with text poke because it needs to be
> * system wide and enabled immediately.
> diff --git a/tools/perf/util/perf_api_probe.c b/tools/perf/util/perf_api_probe.c
> index 3840d02f0f7b..829af17a0867 100644
> --- a/tools/perf/util/perf_api_probe.c
> +++ b/tools/perf/util/perf_api_probe.c
> @@ -98,6 +98,11 @@ static void perf_probe_text_poke(struct evsel *evsel)
> evsel->core.attr.text_poke = 1;
> }
>
> +static void perf_probe_build_id(struct evsel *evsel)
> +{
> + evsel->core.attr.build_id = 1;
> +}
> +
> bool perf_can_sample_identifier(void)
> {
> return perf_probe_api(perf_probe_sample_identifier);
> @@ -172,3 +177,8 @@ bool perf_can_aux_sample(void)
>
> return true;
> }
> +
> +bool perf_can_record_build_id(void)
> +{
> + return perf_probe_api(perf_probe_build_id);
> +}
> diff --git a/tools/perf/util/perf_api_probe.h b/tools/perf/util/perf_api_probe.h
> index d5506a983a94..f12ca55f509a 100644
> --- a/tools/perf/util/perf_api_probe.h
> +++ b/tools/perf/util/perf_api_probe.h
> @@ -11,5 +11,6 @@ bool perf_can_record_cpu_wide(void);
> bool perf_can_record_switch_events(void);
> bool perf_can_record_text_poke_events(void);
> bool perf_can_sample_identifier(void);
> +bool perf_can_record_build_id(void);
>
> #endif // __PERF_API_PROBE_H
> diff --git a/tools/perf/util/perf_event_attr_fprintf.c b/tools/perf/util/perf_event_attr_fprintf.c
> index e67a227c0ce7..656a7fddfc26 100644
> --- a/tools/perf/util/perf_event_attr_fprintf.c
> +++ b/tools/perf/util/perf_event_attr_fprintf.c
> @@ -134,6 +134,8 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
> PRINT_ATTRf(bpf_event, p_unsigned);
> PRINT_ATTRf(aux_output, p_unsigned);
> PRINT_ATTRf(cgroup, p_unsigned);
> + PRINT_ATTRf(text_poke, p_unsigned);
> + PRINT_ATTRf(build_id, p_unsigned);
>
> PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned);
> PRINT_ATTRf(bp_type, p_unsigned);
> diff --git a/tools/perf/util/record.h b/tools/perf/util/record.h
> index 266760ac9143..609e706f4282 100644
> --- a/tools/perf/util/record.h
> +++ b/tools/perf/util/record.h
> @@ -49,6 +49,7 @@ struct record_opts {
> bool no_bpf_event;
> bool kcore;
> bool text_poke;
> + bool build_id;
> unsigned int freq;
> unsigned int mmap_pages;
> unsigned int auxtrace_mmap_pages;
> --
> 2.26.2
>

--

- Arnaldo
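
The record.build-id handling in this patch boils down to mapping a config string to a few booleans, with 'mmap' later forcing --no-buildid. A minimal sketch of that mapping under stated assumptions: struct buildid_opts and both helpers are hypothetical, the 'cache' branch is inferred from the documentation text rather than from the visible diff, and the kernel capability probe (perf_can_record_build_id()) is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the relevant fields of struct record. */
struct buildid_opts {
	bool no_buildid;
	bool no_buildid_cache;
	bool buildid_mmap;
};

/* Map a record.build-id value to flags; returns -1 for unknown values. */
static int parse_build_id_value(const char *value, struct buildid_opts *o)
{
	assert(value && o);

	if (!strcmp(value, "cache"))
		o->no_buildid_cache = false;	/* assumed: the default mode */
	else if (!strcmp(value, "no-cache"))
		o->no_buildid_cache = true;
	else if (!strcmp(value, "skip"))
		o->no_buildid = true;
	else if (!strcmp(value, "mmap"))
		o->buildid_mmap = true;
	else
		return -1;
	return 0;
}

/* Build ids carried in mmap2 events make the build id cache pass redundant. */
static void finalize_buildid_opts(struct buildid_opts *o)
{
	if (o->buildid_mmap)
		o->no_buildid = true;
}
```

Keeping the "implies --no-buildid" step separate from parsing mirrors the patch, where the implication is applied in cmd_record() after option parsing, so it holds for both the config file and the --buildid-mmap switch.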

2020-12-28 15:31:01

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH 03/15] perf: Add build id data in mmap2 event

On Mon, Dec 14, 2020 at 11:54:45AM +0100, Jiri Olsa wrote:
> Adding support to carry build id data in mmap2 event.
>
> The build id data replaces maj/min/ino/ino_generation
> fields, which are also used to identify map's binary,
> so it's ok to replace them with build id data:
>
> union {
> struct {
> u32 maj;
> u32 min;
> u64 ino;
> u64 ino_generation;
> };
> struct {
> u8 build_id_size;
> u8 __reserved_1;
> u16 __reserved_2;
> u8 build_id[20];
> };
> };
>
> Replaced maj/min/ino/ino_generation fields give us size
> of 24 bytes. We use 20 bytes for build id data, 1 byte
> for size and rest is unused.
>
> There's new misc bit for mmap2 to signal there's build
> id data in it:
>
> #define PERF_RECORD_MISC_MMAP_BUILD_ID (1 << 14)
>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Jiri Olsa <[email protected]>

this one isn't applying cleanly, minor header addition problem in
kernel/events/core.c:

[acme@five linux]$ git am --show-current-patch=diff > a
[acme@five linux]$ patch -p1 < a
patching file include/uapi/linux/perf_event.h
Hunk #1 succeeded at 386 (offset 2 lines).
Hunk #2 succeeded at 660 (offset 2 lines).
Hunk #3 succeeded at 707 (offset 2 lines).
Hunk #4 succeeded at 717 (offset 2 lines).
Hunk #5 succeeded at 937 (offset 4 lines).
patching file kernel/events/core.c
Hunk #1 FAILED at 51.
Hunk #2 succeeded at 397 (offset 2 lines).
Hunk #3 succeeded at 4674 (offset 8 lines).
Hunk #4 succeeded at 8049 (offset 112 lines).
Hunk #5 succeeded at 8082 (offset 112 lines).
Hunk #6 succeeded at 8107 (offset 112 lines).
Hunk #7 succeeded at 8254 (offset 112 lines).
Hunk #8 succeeded at 11193 (offset 112 lines).
1 out of 8 hunks FAILED -- saving rejects to file kernel/events/core.c.rej
[acme@five linux]$ vim kernel/events/core.c.rej
[acme@five linux]$ cat kernel/events/core.c.rej
--- kernel/events/core.c
+++ kernel/events/core.c
@@ -51,6 +51,7 @@
#include <linux/proc_ns.h>
#include <linux/mount.h>
#include <linux/min_heap.h>
+#include <linux/buildid.h>

#include "internal.h"

[acme@five linux]$ vim kernel/events/core.c

I'm testing with and without the kernel bits; you can then refresh just the
kernel parts, since the tooling side will be in my perf/core branch.

- Arnaldo

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b15e3447cd9fead8..cb6f841035608e9b 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -386,7 +386,8 @@ struct perf_event_attr {
aux_output : 1, /* generate AUX records instead of events */
cgroup : 1, /* include cgroup events */
text_poke : 1, /* include text poke events */
- __reserved_1 : 30;
+ build_id : 1, /* use build id in mmap2 events */
+ __reserved_1 : 29;

union {
__u32 wakeup_events; /* wakeup every n events */
@@ -659,6 +660,22 @@ struct perf_event_mmap_page {
__u64 aux_size;
};

+/*
+ * The current state of perf_event_header::misc bits usage:
+ * ('|' used bit, '-' unused bit)
+ *
+ * 012 CDEF
+ * |||---------||||
+ *
+ * Where:
+ * 0-2 CPUMODE_MASK
+ *
+ * C PROC_MAP_PARSE_TIMEOUT
+ * D MMAP_DATA / COMM_EXEC / FORK_EXEC / SWITCH_OUT
+ * E MMAP_BUILD_ID / EXACT_IP / SCHED_OUT_PREEMPT
+ * F (reserved)
+ */
+
#define PERF_RECORD_MISC_CPUMODE_MASK (7 << 0)
#define PERF_RECORD_MISC_CPUMODE_UNKNOWN (0 << 0)
#define PERF_RECORD_MISC_KERNEL (1 << 0)
@@ -690,6 +707,7 @@ struct perf_event_mmap_page {
*
* PERF_RECORD_MISC_EXACT_IP - PERF_RECORD_SAMPLE of precise events
* PERF_RECORD_MISC_SWITCH_OUT_PREEMPT - PERF_RECORD_SWITCH* events
+ * PERF_RECORD_MISC_MMAP_BUILD_ID - PERF_RECORD_MMAP2 event
*
*
* PERF_RECORD_MISC_EXACT_IP:
@@ -699,9 +717,13 @@ struct perf_event_mmap_page {
*
* PERF_RECORD_MISC_SWITCH_OUT_PREEMPT:
* Indicates that thread was preempted in TASK_RUNNING state.
+ *
+ * PERF_RECORD_MISC_MMAP_BUILD_ID:
+ * Indicates that mmap2 event carries build id data.
*/
#define PERF_RECORD_MISC_EXACT_IP (1 << 14)
#define PERF_RECORD_MISC_SWITCH_OUT_PREEMPT (1 << 14)
+#define PERF_RECORD_MISC_MMAP_BUILD_ID (1 << 14)
/*
* Reserve the last bit to indicate some extended misc field
*/
@@ -915,10 +937,20 @@ enum perf_event_type {
* u64 addr;
* u64 len;
* u64 pgoff;
- * u32 maj;
- * u32 min;
- * u64 ino;
- * u64 ino_generation;
+ * union {
+ * struct {
+ * u32 maj;
+ * u32 min;
+ * u64 ino;
+ * u64 ino_generation;
+ * };
+ * struct {
+ * u8 build_id_size;
+ * u8 __reserved_1;
+ * u16 __reserved_2;
+ * u8 build_id[20];
+ * };
+ * };
* u32 prot, flags;
* char filename[];
* struct sample_id sample_id;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 55d18791a72de38b..c37401e3e5f7326b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -53,6 +53,7 @@
#include <linux/min_heap.h>
#include <linux/highmem.h>
#include <linux/pgtable.h>
+#include <linux/buildid.h>

#include "internal.h"

@@ -397,6 +398,7 @@ static atomic_t nr_ksymbol_events __read_mostly;
static atomic_t nr_bpf_events __read_mostly;
static atomic_t nr_cgroup_events __read_mostly;
static atomic_t nr_text_poke_events __read_mostly;
+static atomic_t nr_build_id_events __read_mostly;

static LIST_HEAD(pmus);
static DEFINE_MUTEX(pmus_lock);
@@ -4673,6 +4675,8 @@ static void unaccount_event(struct perf_event *event)
dec = true;
if (event->attr.mmap || event->attr.mmap_data)
atomic_dec(&nr_mmap_events);
+ if (event->attr.build_id)
+ atomic_dec(&nr_build_id_events);
if (event->attr.comm)
atomic_dec(&nr_comm_events);
if (event->attr.namespaces)
@@ -8046,6 +8050,8 @@ struct perf_mmap_event {
u64 ino;
u64 ino_generation;
u32 prot, flags;
+ u8 build_id[BUILD_ID_SIZE_MAX];
+ u32 build_id_size;

struct {
struct perf_event_header header;
@@ -8077,6 +8083,7 @@ static void perf_event_mmap_output(struct perf_event *event,
struct perf_sample_data sample;
int size = mmap_event->event_id.header.size;
u32 type = mmap_event->event_id.header.type;
+ bool use_build_id;
int ret;

if (!perf_event_mmap_match(event, data))
@@ -8101,13 +8108,25 @@ static void perf_event_mmap_output(struct perf_event *event,
mmap_event->event_id.pid = perf_event_pid(event, current);
mmap_event->event_id.tid = perf_event_tid(event, current);

+ use_build_id = event->attr.build_id && mmap_event->build_id_size;
+
+ if (event->attr.mmap2 && use_build_id)
+ mmap_event->event_id.header.misc |= PERF_RECORD_MISC_MMAP_BUILD_ID;
+
perf_output_put(&handle, mmap_event->event_id);

if (event->attr.mmap2) {
- perf_output_put(&handle, mmap_event->maj);
- perf_output_put(&handle, mmap_event->min);
- perf_output_put(&handle, mmap_event->ino);
- perf_output_put(&handle, mmap_event->ino_generation);
+ if (use_build_id) {
+ u8 size[4] = { (u8) mmap_event->build_id_size, 0, 0, 0 };
+
+ __output_copy(&handle, size, 4);
+ __output_copy(&handle, mmap_event->build_id, BUILD_ID_SIZE_MAX);
+ } else {
+ perf_output_put(&handle, mmap_event->maj);
+ perf_output_put(&handle, mmap_event->min);
+ perf_output_put(&handle, mmap_event->ino);
+ perf_output_put(&handle, mmap_event->ino_generation);
+ }
perf_output_put(&handle, mmap_event->prot);
perf_output_put(&handle, mmap_event->flags);
}
@@ -8236,6 +8255,9 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event)

mmap_event->event_id.header.size = sizeof(mmap_event->event_id) + size;

+ if (atomic_read(&nr_build_id_events))
+ build_id_parse(vma, mmap_event->build_id, &mmap_event->build_id_size);
+
perf_iterate_sb(perf_event_mmap_output,
mmap_event,
NULL);
@@ -11172,6 +11194,8 @@ static void account_event(struct perf_event *event)
inc = true;
if (event->attr.mmap || event->attr.mmap_data)
atomic_inc(&nr_mmap_events);
+ if (event->attr.build_id)
+ atomic_inc(&nr_build_id_events);
if (event->attr.comm)
atomic_inc(&nr_comm_events);
if (event->attr.namespaces)
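
On the consumer side, the 24 bytes that used to hold maj/min/ino/ino_generation (4 + 4 + 8 + 8) are reinterpreted as size, reserved bytes and build id (1 + 1 + 2 + 20) when the misc bit is set. A minimal sketch of decoding that area, assuming the layout from the commit message; union mmap2_id and mmap2_build_id() are hypothetical names, not perf's own types:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PERF_RECORD_MISC_MMAP_BUILD_ID (1 << 14)
#define BUILD_ID_LEN 20

/* The 24-byte area of an mmap2 record, per the layout in the patch. */
union mmap2_id {
	struct {
		uint32_t maj;
		uint32_t min;
		uint64_t ino;
		uint64_t ino_generation;
	} dev;
	struct {
		uint8_t  build_id_size;
		uint8_t  reserved_1;
		uint16_t reserved_2;
		uint8_t  build_id[BUILD_ID_LEN];
	} bid;
};

static_assert(sizeof(union mmap2_id) == 24, "both views must cover 24 bytes");

/*
 * Returns the number of build id bytes copied into 'out', or 0 when the
 * record carries maj/min/ino/ino_generation instead.
 */
static int mmap2_build_id(uint16_t misc, const union mmap2_id *id,
			  uint8_t out[BUILD_ID_LEN])
{
	uint8_t size;

	if (!(misc & PERF_RECORD_MISC_MMAP_BUILD_ID))
		return 0;

	size = id->bid.build_id_size;
	if (size > BUILD_ID_LEN)
		size = BUILD_ID_LEN;	/* clamp a malformed size */

	memcpy(out, id->bid.build_id, size);
	return size;
}
```

Bit 14 is shared with PERF_RECORD_MISC_EXACT_IP and PERF_RECORD_MISC_SWITCH_OUT_PREEMPT, so the check is only meaningful once the record type is known to be PERF_RECORD_MMAP2.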

2020-12-29 01:35:07

by Jiri Olsa

Subject: Re: [PATCH 15/15] perf record: Add --buildid-mmap option to enable mmap's build id

On Mon, Dec 28, 2020 at 10:44:37AM -0300, Arnaldo Carvalho de Melo wrote:
> On Mon, Dec 14, 2020 at 11:54:57AM +0100, Jiri Olsa wrote:
> > Adding --buildid-mmap option to enable build id in mmap2 events.
> > It will only work if there's kernel support for that and it disables
> > build id cache (implies --no-buildid).
> >
> > It's also possible to enable it permanently via config option
> > in the ~/.perfconfig file:
> >
> > [record]
> > build-id=mmap
> >
> > Also added build_id bit in the verbose output for perf_event_attr:
> >
> > # perf record --buildid-mmap -vv
> > ...
> > perf_event_attr:
> > type 1
> > size 120
> > ...
> > build_id 1
> >
> > Adding also missing text_poke bit.
>
> I'm moving this to just before the:
>
> perf tools: Add support to display build id when available in PERF_RECORD_MMAP2 events
>
> So that I can actually print the synthesized/obtained from the kernel
> build-ids, i.e. this:
>
> perf report -D | grep MMAP2 | head -4
>
> Will work at that point.

ok, thanks

jirka