2024-05-26 18:22:29

by Steinar H. Gunderson

[permalink] [raw]
Subject: [PATCH v7 1/4] perf report: Support LLVM for addr2line()

In addition to the existing support for libbfd and calling out to
an external addr2line command, add support for using libllvm directly.
This is both faster than libbfd, and can be enabled in distro builds
(the LLVM license has an explicit provision for GPLv2 compatibility).
Thus, it is set as the primary choice if available.

As an example, running perf report on a medium-size profile with
DWARF-based backtraces took 58 seconds with LLVM, 78 seconds with
libbfd, 153 seconds with external llvm-addr2line, and I got tired
and aborted the test after waiting for 55 minutes with external
bfd addr2line (which is the default for perf as compiled by distributions
today). Evidently, for this case, the bfd addr2line process needs
18 seconds (on a 5.2 GHz Zen 3) to load the .debug ELF in question,
hits the 1-second timeout and gets killed during initialization,
getting restarted anew every time. Having an in-process addr2line
makes this much more robust.

As future extensions, libllvm can be used in many other places where
we currently use libbfd or other libraries:

- Symbol enumeration (in particular, for PE binaries).
- Demangling (including non-Itanium demangling, e.g. Microsoft
or Rust).
- Disassembling (perf annotate).

However, these are much less pressing; most people don't profile
PE binaries, and perf has non-bfd paths for ELF. The same with
demangling; the default _cxa_demangle path works fine for most
users. Disassembling is coming in a later patch in the series;
however do note that while bfd objdump can be slow on large binaries,
it is possible to use --objdump=llvm-objdump to get the speed benefits.
(It appears LLVM-based demangling is very simple, should we want
that.)

Tested with LLVM 14, 15, 16, 18 and 19. For some reason, LLVM 12 was not
correctly detected using feature_check, and thus was not tested.

Signed-off-by: Steinar H. Gunderson <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/build/Makefile.feature | 1 +
tools/perf/Makefile.config | 15 ++++
tools/perf/builtin-version.c | 1 +
tools/perf/util/Build | 1 +
tools/perf/util/llvm-c-helpers.cpp | 134 +++++++++++++++++++++++++++++
tools/perf/util/llvm-c-helpers.h | 49 +++++++++++
tools/perf/util/srcline.c | 57 +++++++++++-
7 files changed, 257 insertions(+), 1 deletion(-)
create mode 100644 tools/perf/util/llvm-c-helpers.cpp
create mode 100644 tools/perf/util/llvm-c-helpers.h

diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index 1e2ab148d5db..278b26216254 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -136,6 +136,7 @@ FEATURE_DISPLAY ?= \
libunwind \
libdw-dwarf-unwind \
libcapstone \
+ llvm \
zlib \
lzma \
get_cpuid \
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 7f1e016a9253..42f5ad89c0ca 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -969,6 +969,21 @@ ifdef BUILD_NONDISTRO
endif
endif

+ifndef NO_LIBLLVM
+ $(call feature_check,llvm)
+ ifeq ($(feature-llvm), 1)
+ CFLAGS += -DHAVE_LIBLLVM_SUPPORT
+ CXXFLAGS += -DHAVE_LIBLLVM_SUPPORT
+ CXXFLAGS += $(shell $(LLVM_CONFIG) --cxxflags)
+ LIBLLVM = $(shell $(LLVM_CONFIG) --libs all) $(shell $(LLVM_CONFIG) --system-libs)
+ EXTLIBS += -L$(shell $(LLVM_CONFIG) --libdir) $(LIBLLVM)
+ $(call detected,CONFIG_LIBLLVM)
+ else
+ $(warning No libllvm found, slower source file resolution, please install llvm-devel/llvm-dev)
+ NO_LIBLLVM := 1
+ endif
+endif
+
ifndef NO_DEMANGLE
$(call feature_check,cxa-demangle)
ifeq ($(feature-cxa-demangle), 1)
diff --git a/tools/perf/builtin-version.c b/tools/perf/builtin-version.c
index 398aa53e9e2e..4b252196de12 100644
--- a/tools/perf/builtin-version.c
+++ b/tools/perf/builtin-version.c
@@ -65,6 +65,7 @@ static void library_status(void)
STATUS(HAVE_LIBBFD_SUPPORT, libbfd);
STATUS(HAVE_DEBUGINFOD_SUPPORT, debuginfod);
STATUS(HAVE_LIBELF_SUPPORT, libelf);
+ STATUS(HAVE_LIBLLVM_SUPPORT, libllvm);
STATUS(HAVE_LIBNUMA_SUPPORT, libnuma);
STATUS(HAVE_LIBNUMA_SUPPORT, numa_num_possible_cpus);
STATUS(HAVE_LIBPERL_SUPPORT, libperl);
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index da64efd8718f..6a1fd8f1a488 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -226,6 +226,7 @@ perf-$(CONFIG_CXX_DEMANGLE) += demangle-cxx.o
perf-y += demangle-ocaml.o
perf-y += demangle-java.o
perf-y += demangle-rust.o
+perf-$(CONFIG_LIBLLVM) += llvm-c-helpers.o

ifdef CONFIG_JITDUMP
perf-$(CONFIG_LIBELF) += jitdump.o
diff --git a/tools/perf/util/llvm-c-helpers.cpp b/tools/perf/util/llvm-c-helpers.cpp
new file mode 100644
index 000000000000..3cc967ec6f28
--- /dev/null
+++ b/tools/perf/util/llvm-c-helpers.cpp
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Must come before the linux/compiler.h include, which defines several
+ * macros (e.g. noinline) that conflict with compiler builtins used
+ * by LLVM.
+ */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wunused-parameter" /* Needed for LLVM <= 15 */
+#include <llvm/DebugInfo/Symbolize/Symbolize.h>
+#pragma GCC diagnostic pop
+
+#include <stdio.h>
+#include <sys/types.h>
+#include <linux/compiler.h>
+extern "C" {
+#include <linux/zalloc.h>
+}
+#include "symbol_conf.h"
+#include "llvm-c-helpers.h"
+
+using namespace llvm;
+using llvm::symbolize::LLVMSymbolizer;
+
+/*
+ * Allocate a static LLVMSymbolizer, which will live to the end of the program.
+ * Unlike the bfd paths, LLVMSymbolizer has its own cache, so we do not need
+ * to store anything in the dso struct.
+ */
+static LLVMSymbolizer *get_symbolizer()
+{
+ static LLVMSymbolizer *instance = nullptr;
+ if (instance == nullptr) {
+ LLVMSymbolizer::Options opts;
+ /*
+ * LLVM sometimes demangles slightly different from the rest
+ * of the code, and this mismatch can cause new_inline_sym()
+ * to get confused and mark non-inline symbol as inlined
+ * (since the name does not properly match up with base_sym).
+ * Thus, disable the demangling and let the rest of the code
+ * handle it.
+ */
+ opts.Demangle = false;
+ instance = new LLVMSymbolizer(opts);
+ }
+ return instance;
+}
+
+/* Returns 0 on error, 1 on success. */
+static int extract_file_and_line(const DILineInfo &line_info, char **file,
+ unsigned int *line)
+{
+ if (file) {
+ if (line_info.FileName == "<invalid>") {
+ /* Match the convention of libbfd. */
+ *file = nullptr;
+ } else {
+ /* The caller expects to get something it can free(). */
+ *file = strdup(line_info.FileName.c_str());
+ if (*file == nullptr)
+ return 0;
+ }
+ }
+ if (line)
+ *line = line_info.Line;
+ return 1;
+}
+
+extern "C"
+int llvm_addr2line(const char *dso_name, u64 addr,
+ char **file, unsigned int *line,
+ bool unwind_inlines,
+ llvm_a2l_frame **inline_frames)
+{
+ LLVMSymbolizer *symbolizer = get_symbolizer();
+ object::SectionedAddress sectioned_addr = {
+ addr,
+ object::SectionedAddress::UndefSection
+ };
+
+ if (unwind_inlines) {
+ Expected<DIInliningInfo> res_or_err =
+ symbolizer->symbolizeInlinedCode(dso_name,
+ sectioned_addr);
+ if (!res_or_err)
+ return 0;
+ unsigned num_frames = res_or_err->getNumberOfFrames();
+ if (num_frames == 0)
+ return 0;
+
+ if (extract_file_and_line(res_or_err->getFrame(0),
+ file, line) == 0)
+ return 0;
+
+ *inline_frames = (llvm_a2l_frame *)calloc(
+ num_frames, sizeof(**inline_frames));
+ if (*inline_frames == nullptr)
+ return 0;
+
+ for (unsigned i = 0; i < num_frames; ++i) {
+ const DILineInfo &src = res_or_err->getFrame(i);
+
+ llvm_a2l_frame &dst = (*inline_frames)[i];
+ if (src.FileName == "<invalid>")
+ /* Match the convention of libbfd. */
+ dst.filename = nullptr;
+ else
+ dst.filename = strdup(src.FileName.c_str());
+ dst.funcname = strdup(src.FunctionName.c_str());
+ dst.line = src.Line;
+
+ if (dst.filename == nullptr ||
+ dst.funcname == nullptr) {
+ for (unsigned j = 0; j <= i; ++j) {
+ zfree(&(*inline_frames)[j].filename);
+ zfree(&(*inline_frames)[j].funcname);
+ }
+ zfree(inline_frames);
+ return 0;
+ }
+ }
+
+ return num_frames;
+ } else {
+ if (inline_frames)
+ *inline_frames = nullptr;
+
+ Expected<DILineInfo> res_or_err =
+ symbolizer->symbolizeCode(dso_name, sectioned_addr);
+ if (!res_or_err)
+ return 0;
+ return extract_file_and_line(*res_or_err, file, line);
+ }
+}
diff --git a/tools/perf/util/llvm-c-helpers.h b/tools/perf/util/llvm-c-helpers.h
new file mode 100644
index 000000000000..19332dd98e14
--- /dev/null
+++ b/tools/perf/util/llvm-c-helpers.h
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_LLVM_C_HELPERS
+#define __PERF_LLVM_C_HELPERS 1
+
+/*
+ * Helpers to call into LLVM C++ code from C, for the parts that do not have
+ * C APIs.
+ */
+
+#include <linux/compiler.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+struct llvm_a2l_frame {
+ char* filename;
+ char* funcname;
+ unsigned int line;
+};
+
+/*
+ * Implement addr2line() using libLLVM. LLVM is a C++ API, and
+ * many of the linux/ headers cannot be included in a C++ compile unit,
+ * so we need to make a little bridge code here. llvm_addr2line() will
+ * convert the inline frame information from LLVM's internal structures
+ * and put them into a flat array given in inline_frames. The caller
+ * is then responsible for taking that array and convert it into perf's
+ * regular inline frame structures (which depend on e.g. struct list_head).
+ *
+ * If the address could not be resolved, or an error occurred (e.g. OOM),
+ * returns 0. Otherwise, returns the number of inline frames (which means 1
+ * if the address was not part of an inlined function). If unwind_inlines
+ * is set and the return code is nonzero, inline_frames will be set to
+ * a newly allocated array with that length. The caller is then responsible
+ * for freeing both the strings and the array itself.
+ */
+int llvm_addr2line(const char* dso_name,
+ u64 addr,
+ char** file,
+ unsigned int* line,
+ bool unwind_inlines,
+ struct llvm_a2l_frame** inline_frames);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* __PERF_LLVM_C_HELPERS */
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 9d670d8c1c08..74091ecc21df 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -16,6 +16,9 @@
#include "util/debug.h"
#include "util/callchain.h"
#include "util/symbol_conf.h"
+#ifdef HAVE_LIBLLVM_SUPPORT
+#include "util/llvm-c-helpers.h"
+#endif
#include "srcline.h"
#include "string2.h"
#include "symbol.h"
@@ -130,7 +133,59 @@ static struct symbol *new_inline_sym(struct dso *dso,

#define MAX_INLINE_NEST 1024

-#ifdef HAVE_LIBBFD_SUPPORT
+#ifdef HAVE_LIBLLVM_SUPPORT
+
+static void free_llvm_inline_frames(struct llvm_a2l_frame *inline_frames,
+ int num_frames)
+{
+ if (inline_frames != NULL) {
+ for (int i = 0; i < num_frames; ++i) {
+ zfree(&inline_frames[i].filename);
+ zfree(&inline_frames[i].funcname);
+ }
+ zfree(&inline_frames);
+ }
+}
+
+static int addr2line(const char *dso_name, u64 addr,
+ char **file, unsigned int *line, struct dso *dso,
+ bool unwind_inlines, struct inline_node *node,
+ struct symbol *sym)
+{
+ struct llvm_a2l_frame *inline_frames = NULL;
+ int num_frames = llvm_addr2line(dso_name, addr, file, line,
+ node && unwind_inlines, &inline_frames);
+
+ if (num_frames == 0 || !inline_frames) {
+ /* Error, or we didn't want inlines. */
+ return num_frames;
+ }
+
+ for (int i = 0; i < num_frames; ++i) {
+ struct symbol *inline_sym =
+ new_inline_sym(dso, sym, inline_frames[i].funcname);
+ char *srcline = NULL;
+
+ if (inline_frames[i].filename)
+ srcline = srcline_from_fileline(
+ inline_frames[i].filename,
+ inline_frames[i].line);
+ if (inline_list__append(inline_sym, srcline, node) != 0) {
+ free_llvm_inline_frames(inline_frames, num_frames);
+ return 0;
+ }
+ }
+ free_llvm_inline_frames(inline_frames, num_frames);
+
+ return num_frames;
+}
+
+void dso__free_a2l(struct dso *)
+{
+ /* Nothing to free. */
+}
+
+#elif defined(HAVE_LIBBFD_SUPPORT)

/*
* Implement addr2line using libbfd.
--
2.45.1



2024-05-26 18:22:35

by Steinar H. Gunderson

[permalink] [raw]
Subject: [PATCH v7 2/4] perf annotate: split out read_symbol()

The Capstone disassembler code has a useful code snippet to read
the bytes for a given code symbol into memory. Split it out into
its own function, so that the LLVM disassembler can use it in
the next patch.

Signed-off-by: Steinar H. Gunderson <[email protected]>
---
tools/perf/util/disasm.c | 90 +++++++++++++++++++++++++---------------
1 file changed, 56 insertions(+), 34 deletions(-)

diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 72aec8f61b94..c0dbb955e61a 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -1396,6 +1396,53 @@ static int find_file_offset(u64 start, u64 len, u64 pgoff, void *arg)
return 0;
}

+static u8 *
+read_symbol(const char *filename, struct map *map, struct symbol *sym,
+ u64 *len, bool *is_64bit)
+{
+ struct dso *dso = map__dso(map);
+ struct nscookie nsc;
+ u64 start = map__rip_2objdump(map, sym->start);
+ u64 end = map__rip_2objdump(map, sym->end);
+ int fd, count;
+ u8 *buf = NULL;
+ struct find_file_offset_data data = {
+ .ip = start,
+ };
+
+ *is_64bit = false;
+
+ nsinfo__mountns_enter(dso__nsinfo(dso), &nsc);
+ fd = open(filename, O_RDONLY);
+ nsinfo__mountns_exit(&nsc);
+ if (fd < 0)
+ return NULL;
+
+ if (file__read_maps(fd, /*exe=*/true, find_file_offset, &data,
+ is_64bit) == 0)
+ goto err;
+
+ *len = end - start;
+ buf = malloc(*len);
+ if (buf == NULL)
+ goto err;
+
+ count = pread(fd, buf, *len, data.offset);
+ close(fd);
+ fd = -1;
+
+ if ((u64)count != *len)
+ goto err;
+
+ return buf;
+
+err:
+ if (fd >= 0)
+ close(fd);
+ free(buf);
+ return NULL;
+}
+
static void print_capstone_detail(cs_insn *insn, char *buf, size_t len,
struct annotate_args *args, u64 addr)
{
@@ -1458,19 +1505,13 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
{
struct annotation *notes = symbol__annotation(sym);
struct map *map = args->ms.map;
- struct dso *dso = map__dso(map);
- struct nscookie nsc;
u64 start = map__rip_2objdump(map, sym->start);
- u64 end = map__rip_2objdump(map, sym->end);
- u64 len = end - start;
+ u64 len;
u64 offset;
- int i, fd, count;
+ int i, count;
bool is_64bit = false;
bool needs_cs_close = false;
u8 *buf = NULL;
- struct find_file_offset_data data = {
- .ip = start,
- };
csh handle;
cs_insn *insn;
char disasm_buf[512];
@@ -1479,31 +1520,9 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
if (args->options->objdump_path)
return -1;

- nsinfo__mountns_enter(dso__nsinfo(dso), &nsc);
- fd = open(filename, O_RDONLY);
- nsinfo__mountns_exit(&nsc);
- if (fd < 0)
- return -1;
-
- if (file__read_maps(fd, /*exe=*/true, find_file_offset, &data,
- &is_64bit) == 0)
- goto err;
-
- if (open_capstone_handle(args, is_64bit, &handle) < 0)
- goto err;
-
- needs_cs_close = true;
-
- buf = malloc(len);
+ buf = read_symbol(filename, map, sym, &len, &is_64bit);
if (buf == NULL)
- goto err;
-
- count = pread(fd, buf, len, data.offset);
- close(fd);
- fd = -1;
-
- if ((u64)count != len)
- goto err;
+ return -1;

/* add the function address and name */
scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
@@ -1521,6 +1540,11 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym,

annotation_line__add(&dl->al, &notes->src->source);

+ if (open_capstone_handle(args, is_64bit, &handle) < 0)
+ goto err;
+
+ needs_cs_close = true;
+
count = cs_disasm(handle, buf, len, start, len, &insn);
for (i = 0, offset = 0; i < count; i++) {
int printed;
@@ -1565,8 +1589,6 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
return count < 0 ? count : 0;

err:
- if (fd >= 0)
- close(fd);
if (needs_cs_close) {
struct disasm_line *tmp;

--
2.45.1


2024-05-26 18:22:56

by Steinar H. Gunderson

[permalink] [raw]
Subject: [PATCH v7 3/4] perf annotate: LLVM-based disassembler

Support using LLVM as a disassembler method, allowing helperless
annotation in non-distro builds. (It is also much faster than
using libbfd or bfd objdump on binaries with a lot of debug
information.)

This is nearly identical to the output of llvm-objdump; there are
some very rare whitespace differences, some minor changes to demangling
(since we use perf's regular demangling and not LLVM's own) and
the occasional case where llvm-objdump makes a different choice
when multiple symbols share the same address. It should work across
all of LLVM's supported architectures, although I've only tested 64-bit
x86, and finding the right triple from perf's idea of machine
architecture can sometimes be a bit tricky. Ideally, we should have
some way of finding the triplet just from the file itself.

Signed-off-by: Steinar H. Gunderson <[email protected]>
---
tools/perf/util/disasm.c | 195 +++++++++++++++++++++++++++++
tools/perf/util/llvm-c-helpers.cpp | 62 +++++++++
tools/perf/util/llvm-c-helpers.h | 11 ++
3 files changed, 268 insertions(+)

diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index c0dbb955e61a..ee7c2365d066 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -43,6 +43,7 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t size,

static void ins__sort(struct arch *arch);
static int disasm_line__parse(char *line, const char **namep, char **rawp);
+static char *expand_tabs(char *line, char **storage, size_t *storage_len);

static __attribute__((constructor)) void symbol__init_regexpr(void)
{
@@ -1378,7 +1379,9 @@ static int open_capstone_handle(struct annotate_args *args, bool is_64bit,

return 0;
}
+#endif

+#if defined(HAVE_LIBCAPSTONE_SUPPORT) || defined(HAVE_LIBLLVM_SUPPORT)
struct find_file_offset_data {
u64 ip;
u64 offset;
@@ -1442,7 +1445,9 @@ read_symbol(const char *filename, struct map *map, struct symbol *sym,
free(buf);
return NULL;
}
+#endif

+#ifdef HAVE_LIBCAPSTONE_SUPPORT
static void print_capstone_detail(cs_insn *insn, char *buf, size_t len,
struct annotate_args *args, u64 addr)
{
@@ -1606,6 +1611,191 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
}
#endif

+#ifdef HAVE_LIBLLVM_SUPPORT
+#include <llvm-c/Disassembler.h>
+#include <llvm-c/Target.h>
+#include "util/llvm-c-helpers.h"
+
+struct symbol_lookup_storage {
+ u64 branch_addr;
+ u64 pcrel_load_addr;
+};
+
+/*
+ * Whenever LLVM wants to resolve an address into a symbol, it calls this
+ * callback. We don't ever actually _return_ anything (in particular, because
+ * it puts quotation marks around what we return), but we use this as a hint
+ * that there is a branch or PC-relative address in the expression that we
+ * should add some textual annotation for after the instruction. The caller
+ * will use this information to add the actual annotation.
+ */
+static const char *
+symbol_lookup_callback(void *disinfo, uint64_t value,
+ uint64_t *ref_type,
+ uint64_t address __maybe_unused,
+ const char **ref __maybe_unused)
+{
+ struct symbol_lookup_storage *storage =
+ (struct symbol_lookup_storage *)disinfo;
+ if (*ref_type == LLVMDisassembler_ReferenceType_In_Branch)
+ storage->branch_addr = value;
+ else if (*ref_type == LLVMDisassembler_ReferenceType_In_PCrel_Load)
+ storage->pcrel_load_addr = value;
+ *ref_type = LLVMDisassembler_ReferenceType_InOut_None;
+ return NULL;
+}
+
+static int symbol__disassemble_llvm(char *filename, struct symbol *sym,
+ struct annotate_args *args)
+{
+ struct annotation *notes = symbol__annotation(sym);
+ struct map *map = args->ms.map;
+ struct dso *dso = map__dso(map);
+ u64 start = map__rip_2objdump(map, sym->start);
+ u8 *buf;
+ u64 len;
+ u64 pc;
+ bool is_64bit;
+ char triplet[64];
+ char disasm_buf[2048];
+ size_t disasm_len;
+ struct disasm_line *dl;
+ LLVMDisasmContextRef disasm = NULL;
+ struct symbol_lookup_storage storage;
+ char *line_storage = NULL;
+ size_t line_storage_len = 0;
+
+ if (args->options->objdump_path)
+ return -1;
+
+ LLVMInitializeAllTargetInfos();
+ LLVMInitializeAllTargetMCs();
+ LLVMInitializeAllDisassemblers();
+
+ buf = read_symbol(filename, map, sym, &len, &is_64bit);
+ if (buf == NULL)
+ return -1;
+
+ if (arch__is(args->arch, "x86")) {
+ if (is_64bit)
+ scnprintf(triplet, sizeof(triplet), "x86_64-pc-linux");
+ else
+ scnprintf(triplet, sizeof(triplet), "i686-pc-linux");
+ } else {
+ scnprintf(triplet, sizeof(triplet), "%s-linux-gnu",
+ args->arch->name);
+ }
+
+ disasm = LLVMCreateDisasm(
+ triplet, &storage, 0, NULL, symbol_lookup_callback);
+ if (disasm == NULL)
+ goto err;
+
+ if (args->options->disassembler_style &&
+ !strcmp(args->options->disassembler_style, "intel"))
+ LLVMSetDisasmOptions(
+ disasm, LLVMDisassembler_Option_AsmPrinterVariant);
+
+ /*
+ * This needs to be set after AsmPrinterVariant, due to a bug in LLVM;
+ * setting AsmPrinterVariant makes a new instruction printer, making it
+ * forget about the PrintImmHex flag (which is applied before if both
+ * are given to the same call).
+ */
+ LLVMSetDisasmOptions(disasm, LLVMDisassembler_Option_PrintImmHex);
+
+ /* add the function address and name */
+ scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
+ start, sym->name);
+
+ args->offset = -1;
+ args->line = disasm_buf;
+ args->line_nr = 0;
+ args->fileloc = NULL;
+ args->ms.sym = sym;
+
+ dl = disasm_line__new(args);
+ if (dl == NULL)
+ goto err;
+
+ annotation_line__add(&dl->al, &notes->src->source);
+
+ pc = start;
+ for (u64 offset = 0; offset < len; ) {
+ unsigned int ins_len;
+
+ storage.branch_addr = 0;
+ storage.pcrel_load_addr = 0;
+
+ ins_len = LLVMDisasmInstruction(
+ disasm, buf + offset, len - offset, pc,
+ disasm_buf, sizeof(disasm_buf));
+ if (ins_len == 0)
+ goto err;
+ disasm_len = strlen(disasm_buf);
+
+ if (storage.branch_addr != 0) {
+ char *name = llvm_name_for_code(
+ dso, filename, storage.branch_addr);
+ if (name != NULL) {
+ disasm_len += scnprintf(
+ disasm_buf + disasm_len,
+ sizeof(disasm_buf) - disasm_len,
+ " <%s>", name);
+ free(name);
+ }
+ }
+ if (storage.pcrel_load_addr != 0) {
+ char *name = llvm_name_for_data(
+ dso, filename, storage.pcrel_load_addr);
+ disasm_len += scnprintf(disasm_buf + disasm_len,
+ sizeof(disasm_buf) - disasm_len,
+ " # %#"PRIx64,
+ storage.pcrel_load_addr);
+ if (name) {
+ disasm_len += scnprintf(
+ disasm_buf + disasm_len,
+ sizeof(disasm_buf) - disasm_len,
+ " <%s>", name);
+ free(name);
+ }
+ }
+
+ args->offset = offset;
+ args->line = expand_tabs(
+ disasm_buf, &line_storage, &line_storage_len);
+ args->line_nr = 0;
+ args->fileloc = NULL;
+ args->ms.sym = sym;
+
+ llvm_addr2line(filename, pc, &args->fileloc,
+ (unsigned int *)&args->line_nr, false, NULL);
+
+ dl = disasm_line__new(args);
+ if (dl == NULL)
+ goto err;
+
+ annotation_line__add(&dl->al, &notes->src->source);
+
+ free(args->fileloc);
+ pc += ins_len;
+ offset += ins_len;
+ }
+
+ LLVMDisasmDispose(disasm);
+ free(buf);
+ free(line_storage);
+ return 0;
+
+err:
+ LLVMDisasmDispose(disasm);
+ free(buf);
+ free(line_storage);
+ return -1;
+}
+#endif
+
+
/*
* Possibly create a new version of line with tabs expanded. Returns the
* existing or new line, storage is updated if a new line is allocated. If
@@ -1730,6 +1920,11 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
strcpy(symfs_filename, tmp);
}

+#ifdef HAVE_LIBLLVM_SUPPORT
+ err = symbol__disassemble_llvm(symfs_filename, sym, args);
+ if (err == 0)
+ goto out_remove_tmp;
+#endif
#ifdef HAVE_LIBCAPSTONE_SUPPORT
err = symbol__disassemble_capstone(symfs_filename, sym, args);
if (err == 0)
diff --git a/tools/perf/util/llvm-c-helpers.cpp b/tools/perf/util/llvm-c-helpers.cpp
index 3cc967ec6f28..4070e2d5682f 100644
--- a/tools/perf/util/llvm-c-helpers.cpp
+++ b/tools/perf/util/llvm-c-helpers.cpp
@@ -8,6 +8,7 @@
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-parameter" /* Needed for LLVM <= 15 */
#include <llvm/DebugInfo/Symbolize/Symbolize.h>
+#include <llvm/Support/TargetSelect.h>
#pragma GCC diagnostic pop

#include <stdio.h>
@@ -19,6 +20,9 @@ extern "C" {
#include "symbol_conf.h"
#include "llvm-c-helpers.h"

+extern "C"
+char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
+
using namespace llvm;
using llvm::symbolize::LLVMSymbolizer;

@@ -132,3 +136,61 @@ int llvm_addr2line(const char *dso_name, u64 addr,
return extract_file_and_line(*res_or_err, file, line);
}
}
+
+static char *
+make_symbol_relative_string(struct dso *dso, const char *sym_name,
+ u64 addr, u64 base_addr)
+{
+ if (!strcmp(sym_name, "<invalid>"))
+ return NULL;
+
+ char *demangled = dso__demangle_sym(dso, 0, sym_name);
+ if (base_addr && base_addr != addr) {
+ char buf[256];
+ snprintf(buf, sizeof(buf), "%s+0x%lx",
+ demangled ? demangled : sym_name, addr - base_addr);
+ free(demangled);
+ return strdup(buf);
+ } else {
+ if (demangled)
+ return demangled;
+ else
+ return strdup(sym_name);
+ }
+}
+
+extern "C"
+char *llvm_name_for_code(struct dso *dso, const char *dso_name, u64 addr)
+{
+ LLVMSymbolizer *symbolizer = get_symbolizer();
+ object::SectionedAddress sectioned_addr = {
+ addr,
+ object::SectionedAddress::UndefSection
+ };
+ Expected<DILineInfo> res_or_err =
+ symbolizer->symbolizeCode(dso_name, sectioned_addr);
+ if (!res_or_err) {
+ return NULL;
+ }
+ return make_symbol_relative_string(
+ dso, res_or_err->FunctionName.c_str(),
+ addr, res_or_err->StartAddress ? *res_or_err->StartAddress : 0);
+}
+
+extern "C"
+char *llvm_name_for_data(struct dso *dso, const char *dso_name, u64 addr)
+{
+ LLVMSymbolizer *symbolizer = get_symbolizer();
+ object::SectionedAddress sectioned_addr = {
+ addr,
+ object::SectionedAddress::UndefSection
+ };
+ Expected<DIGlobal> res_or_err =
+ symbolizer->symbolizeData(dso_name, sectioned_addr);
+ if (!res_or_err) {
+ return NULL;
+ }
+ return make_symbol_relative_string(
+ dso, res_or_err->Name.c_str(),
+ addr, res_or_err->Start);
+}
diff --git a/tools/perf/util/llvm-c-helpers.h b/tools/perf/util/llvm-c-helpers.h
index 19332dd98e14..d2b99637a28a 100644
--- a/tools/perf/util/llvm-c-helpers.h
+++ b/tools/perf/util/llvm-c-helpers.h
@@ -13,6 +13,8 @@
extern "C" {
#endif

+struct dso;
+
struct llvm_a2l_frame {
char* filename;
char* funcname;
@@ -42,6 +44,15 @@ int llvm_addr2line(const char* dso_name,
bool unwind_inlines,
struct llvm_a2l_frame** inline_frames);

+/*
+ * Simple symbolizers for addresses; will convert something like
+ * 0x12345 to "func+0x123". Will return NULL if no symbol was found.
+ *
+ * The returned value must be freed by the caller, with free().
+ */
+char *llvm_name_for_code(struct dso *dso, const char *dso_name, u64 addr);
+char *llvm_name_for_data(struct dso *dso, const char *dso_name, u64 addr);
+
#ifdef __cplusplus
}
#endif
--
2.45.1


2024-05-26 18:23:21

by Steinar H. Gunderson

[permalink] [raw]
Subject: [PATCH v7 4/4] perf report: LLVM-based symbol listing

Similar to the previous LLVM patches, this allows symbol enumeration
using LLVM instead of libbfd or libelf. This allows distro builds to
reliably enumerate symbols in e.g. PE binaries as profiled with WINE.

A lot of the logic for e.g. length detection has been carried over
from the libbfd code.

Signed-off-by: Steinar H. Gunderson <[email protected]>
---
tools/perf/util/llvm-c-helpers.cpp | 124 +++++++++++++++++++++++++++++
tools/perf/util/llvm-c-helpers.h | 49 ++++++++++++
tools/perf/util/symbol.c | 54 ++++++++++++-
tools/perf/util/symbol.h | 3 +
4 files changed, 228 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/llvm-c-helpers.cpp b/tools/perf/util/llvm-c-helpers.cpp
index 4070e2d5682f..87caf98cc8f6 100644
--- a/tools/perf/util/llvm-c-helpers.cpp
+++ b/tools/perf/util/llvm-c-helpers.cpp
@@ -9,6 +9,7 @@
#pragma GCC diagnostic ignored "-Wunused-parameter" /* Needed for LLVM <= 15 */
#include <llvm/DebugInfo/Symbolize/Symbolize.h>
#include <llvm/Support/TargetSelect.h>
+#include <llvm/Object/Binary.h>
#pragma GCC diagnostic pop

#include <stdio.h>
@@ -17,6 +18,8 @@
extern "C" {
#include <linux/zalloc.h>
}
+#include <vector>
+#include <algorithm>
#include "symbol_conf.h"
#include "llvm-c-helpers.h"

@@ -24,6 +27,7 @@ extern "C"
char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);

using namespace llvm;
+using namespace llvm::object;
using llvm::symbolize::LLVMSymbolizer;

/*
@@ -194,3 +198,123 @@ char *llvm_name_for_data(struct dso *dso, const char *dso_name, u64 addr)
dso, res_or_err->Name.c_str(),
addr, res_or_err->Start);
}
+
+int llvm_load_symbols(const char *debugfile, struct llvm_symbol_list *symbols)
+{
+ /* NOTE: This nominally does an mmap, despite the scary name. */
+ ErrorOr<std::unique_ptr<MemoryBuffer>> mem_buf_or_err =
+ MemoryBuffer::getFile(debugfile);
+ if (mem_buf_or_err.getError())
+ return -1;
+
+ Expected<std::unique_ptr<Binary>> binary_or_err(
+ createBinary(mem_buf_or_err.get()->getMemBufferRef(), nullptr));
+ if (!binary_or_err)
+ return -1;
+
+ /* Find the .text section. */
+ SectionRef text_section;
+ uint64_t text_filepos, image_base;
+ for (SectionRef section :
+ cast<ObjectFile>(*binary_or_err.get()).sections()) {
+ Expected<StringRef> name = section.getName();
+ if (name && *name == ".text") {
+ text_section = section;
+
+ /*
+ * If we don't find an image base below, we infer the
+ * image base * from the address and file offset of the
+ * .text section.
+ */
+ text_filepos = reinterpret_cast<const char *>(
+ text_section.getContents()->bytes_begin()) -
+ mem_buf_or_err.get()->getBufferStart();
+ image_base = text_section.getAddress() - text_filepos;
+ break;
+ }
+ }
+ if (text_section == SectionRef())
+ /* No .text section, so no symbols (but also not a failure). */
+ return 0;
+
+ /*
+ * See if we can find an explicit image base pseudosymbol. If so, get
+ * the image base directly from it, then infer the file position of
+ * .text from that (i.e., the opposite inference of the fallback above).
+ */
+ for (SymbolRef symbol :
+ cast<ObjectFile>(*binary_or_err.get()).symbols())
+ if (symbol.getName() &&
+ symbol.getAddress() &&
+ (*symbol.getName() == "__ImageBase" ||
+ *symbol.getName() == "__image_base__")) {
+ image_base = *symbol.getAddress();
+ if (image_base < 0x100000000ULL)
+ /*
+ * PE symbols can only have 4 bytes, so use
+ * .text high bits (if any).
+ */
+ image_base |= text_section.getAddress() &
+ ~0xFFFFFFFFULL;
+ text_filepos = text_section.getAddress() - image_base;
+ break;
+ }
+
+ symbols->image_base = image_base;
+ symbols->text_end = text_filepos + text_section.getSize();
+
+ /* Collect all valid symbols. */
+ std::vector<SymbolRef> all_symbols;
+ for (SymbolRef symbol :
+ cast<ObjectFile>(*binary_or_err.get()).symbols())
+ if (symbol.getName() && symbol.getFlags() &&
+ symbol.getAddress() && symbol.getSection())
+ all_symbols.push_back(symbol);
+ symbols->num_symbols = all_symbols.size();
+ symbols->symbols = (struct llvm_symbol *)calloc(
+ all_symbols.size(), sizeof(struct llvm_symbol));
+ if (symbols->symbols == nullptr)
+ return -1;
+
+ /*
+ * Symbols don't normally come with lengths, so we'll infer them
+ * from what comes after the symbol address-wise. There is some
+ * extra logic around zero-length symbols and deduplication,
+ * which the caller will do for us (it's shared with other backends).
+ */
+ std::sort(all_symbols.begin(), all_symbols.end(),
+ [](const SymbolRef &a, const SymbolRef &b) {
+ if (*a.getAddress() != *b.getAddress())
+ return *a.getAddress() < *b.getAddress();
+ return *a.getName() < *b.getName();
+ });
+ for (size_t i = 0; i < all_symbols.size(); ++i) {
+ const SymbolRef &sym = all_symbols[i];
+ llvm_symbol &out_sym = symbols->symbols[i];
+ out_sym.start = *sym.getAddress() - image_base;
+ out_sym.name = (char *)calloc(1, sym.getName()->size() + 1);
+ if (out_sym.name == nullptr) {
+ for (size_t i = 0; i < all_symbols.size(); ++i) {
+ zfree(&symbols->symbols[i].name);
+ }
+ zfree(&symbols->symbols);
+ return -1;
+ }
+ memcpy(out_sym.name, sym.getName()->bytes_begin(),
+ sym.getName()->size());
+ out_sym.global = *sym.getFlags() & SymbolRef::SF_Global;
+ out_sym.weak = *sym.getFlags() & SymbolRef::SF_Weak;
+
+ SectionRef section = **sym.getSection();
+ uint64_t next_addr;
+ if (i + 1 < all_symbols.size() &&
+ section == **all_symbols[i + 1].getSection())
+ next_addr = *all_symbols[i + 1].getAddress();
+ else
+ next_addr = section.getAddress() + section.getSize();
+
+ out_sym.len = next_addr - *sym.getAddress();
+ }
+
+ return 1;
+}
diff --git a/tools/perf/util/llvm-c-helpers.h b/tools/perf/util/llvm-c-helpers.h
index d2b99637a28a..1abbc9ef3dab 100644
--- a/tools/perf/util/llvm-c-helpers.h
+++ b/tools/perf/util/llvm-c-helpers.h
@@ -8,6 +8,7 @@
*/

#include <linux/compiler.h>
+#include <stddef.h>

#ifdef __cplusplus
extern "C" {
@@ -53,6 +54,54 @@ int llvm_addr2line(const char* dso_name,
char *llvm_name_for_code(struct dso *dso, const char *dso_name, u64 addr);
char *llvm_name_for_data(struct dso *dso, const char *dso_name, u64 addr);

+struct llvm_symbol {
+ u64 start, len;
+ bool weak, global;
+
+ /* Must be freed by the caller. */
+ char *name;
+};
+
+struct llvm_symbol_list {
+ /*
+ * The address this dso is loaded at in memory; adding this to
+ * the file position of any symbol will give the address that
+ * symbol is loaded at.
+ */
+ u64 image_base;
+
+ /*
+ * The byte offset of the end of the .text segment (one-past-the-end)
+ * in the file.
+ */
+ u64 text_end;
+
+ /*
+ * The actual symbols. Must be freed by the caller, with free().
+ * The string inside each llvm_symbol must also be freed.
+ */
+ size_t num_symbols;
+ struct llvm_symbol *symbols;
+};
+
+/*
+ * Get a list of all symbols in the given file using LLVM. Similarly to
+ * addr2line(), we need to implement this using a helper structure;
+ * LLVM actually has a C API for this, but we can't use it (it doesn't
+ * have an equivalent of getFlags()). So we do the meat of the LLVM-facing
+ * logic here and let the caller make perf symbols out of it.
+ *
+ * Possible return values and their meaning:
+ *
+ * -1: An error occurred.
+ * 0: The file loaded fine, but there were no .text section.
+ * 1: The file loaded successfully.
+ *
+ * For -1 and 0, "symbols" does not contain any meaningful information,
+ * and nothing needs to be freed.
+ */
+int llvm_load_symbols(const char *debugfile, struct llvm_symbol_list *symbols);
+
#ifdef __cplusplus
}
#endif
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 9e5940b5bc59..69b8c0a280c7 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -22,6 +22,7 @@
#include "util.h" // lsdir()
#include "debug.h"
#include "event.h"
+#include "llvm-c-helpers.h"
#include "machine.h"
#include "map.h"
#include "symbol.h"
@@ -1566,6 +1567,50 @@ static int dso__load_perf_map(const char *map_path, struct dso *dso)
return -1;
}

+#ifdef HAVE_LIBLLVM_SUPPORT
+int dso__load_llvm_symbols(struct dso *dso, const char *debugfile)
+{
+ struct llvm_symbol_list symbols;
+ struct symbol *symbol;
+ int err = llvm_load_symbols(debugfile, &symbols);
+ int ret = -1;
+
+ if (err == -1)
+ return -1;
+ else if (err == 0)
+ return 0;
+
+ for (size_t i = 0; i < symbols.num_symbols; ++i) {
+ const struct llvm_symbol *sym = &symbols.symbols[i];
+ int elf_binding;
+
+ if (sym->weak)
+ elf_binding = STB_WEAK;
+ else if (sym->global)
+ elf_binding = STB_GLOBAL;
+ else
+ elf_binding = STB_LOCAL;
+ symbol = symbol__new(sym->start, sym->len, elf_binding, STT_FUNC,
+ sym->name);
+ if (!symbol)
+ goto out_free;
+
+ symbols__insert(dso__symbols(dso), symbol);
+ }
+
+ symbols__fixup_end(dso__symbols(dso), false);
+ symbols__fixup_duplicate(dso__symbols(dso));
+ dso__set_adjust_symbols(dso, true);
+ ret = 0;
+
+out_free:
+ for (size_t i = 0; i < symbols.num_symbols; ++i)
+ zfree(&symbols.symbols[i].name);
+ zfree(&symbols.symbols);
+ return ret;
+}
+#endif
+
#ifdef HAVE_LIBBFD_SUPPORT
#define PACKAGE 'perf'
#include <bfd.h>
@@ -1871,6 +1916,7 @@ int dso__load(struct dso *dso, struct map *map)
bool next_slot = false;
bool is_reg;
bool nsexit;
+ int llvmrc = -1;
int bfdrc = -1;
int sirc = -1;

@@ -1899,11 +1945,15 @@ int dso__load(struct dso *dso, struct map *map)
}
}

-#ifdef HAVE_LIBBFD_SUPPORT
+#ifdef HAVE_LIBLLVM_SUPPORT
if (is_reg)
+ llvmrc = dso__load_llvm_symbols(dso, name);
+#endif
+#ifdef HAVE_LIBBFD_SUPPORT
+ if (is_reg && llvmrc < 0)
bfdrc = dso__load_bfd_symbols(dso, name);
#endif
- if (is_reg && bfdrc < 0)
+ if (is_reg && llvmrc < 0 && bfdrc < 0)
sirc = symsrc__init(ss, dso, name, symtab_type);

if (nsexit)
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index 3fb5d146d9b1..01256c41467a 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -174,6 +174,9 @@ int symbol__config_symfs(const struct option *opt __maybe_unused,

struct symsrc;

+#ifdef HAVE_LIBLLVM_SUPPORT
+int dso__load_llvm_symbols(struct dso *dso, const char *debugfile);
+#endif
#ifdef HAVE_LIBBFD_SUPPORT
int dso__load_bfd_symbols(struct dso *dso, const char *debugfile);
#endif
--
2.45.1


2024-05-26 21:24:57

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: [PATCH v7 4/4] perf report: LLVM-based symbol listing

On Sun, May 26, 2024 at 08:22:12PM +0200, Steinar H. Gunderson wrote:
> +
> + symbols__fixup_end(dso__symbols(dso), false);
> + symbols__fixup_duplicate(dso__symbols(dso));
> + dso__set_adjust_symbols(dso, true);
> + ret = 0;

Urgh, evidently I forgot to copy over dso->text_offset and dso->text_end
here, which I guess needs to happen.

Even after doing that, I still cannot annotate Windows symbols, though;
I get

Couldn't annotate RtlLeaveCriticalSection:
Internal error: Invalid -1 error code

This is true even with libbfd in a non-distro build (without any of my
LLVM patches), so there must be something else at play here.

I _can_ run a Windows binary in WINE and get the right symbols out with
perf report, though. So this leaves only demangling as the final piece
of the puzzle to make distro builds essentially equal to nondistro builds.

/* Steinar */

2024-05-29 03:57:12

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v7 1/4] perf report: Support LLVM for addr2line()

Hello,

On Sun, May 26, 2024 at 11:22 AM Steinar H. Gunderson <sesse@googlecom> wrote:
>
> In addition to the existing support for libbfd and calling out to
> an external addr2line command, add support for using libllvm directly.
> This is both faster than libbfd, and can be enabled in distro builds
> (the LLVM license has an explicit provision for GPLv2 compatibility).
> Thus, it is set as the primary choice if available.
>
> As an example, running perf report on a medium-size profile with
> DWARF-based backtraces took 58 seconds with LLVM, 78 seconds with
> libbfd, 153 seconds with external llvm-addr2line, and I got tired
> and aborted the test after waiting for 55 minutes with external
> bfd addr2line (which is the default for perf as compiled by distributions
> today). Evidently, for this case, the bfd addr2line process needs
> 18 seconds (on a 5.2 GHz Zen 3) to load the .debug ELF in question,
> hits the 1-second timeout and gets killed during initialization,
> getting restarted anew every time. Having an in-process addr2line
> makes this much more robust.
>
> As future extensions, libllvm can be used in many other places where
> we currently use libbfd or other libraries:
>
> - Symbol enumeration (in particular, for PE binaries).
> - Demangling (including non-Itanium demangling, e.g. Microsoft
> or Rust).
> - Disassembling (perf annotate).
>
> However, these are much less pressing; most people don't profile
> PE binaries, and perf has non-bfd paths for ELF. The same with
> demangling; the default _cxa_demangle path works fine for most
> users. Disassembling is coming in a later patch in the series;
> however do note that while bfd objdump can be slow on large binaries,
> it is possible to use --objdump=llvm-objdump to get the speed benefits.
> (It appears LLVM-based demangling is very simple, should we want
> that.)
>
> Tested with LLVM 14, 15, 16, 18 and 19. For some reason, LLVM 12 was not
> correctly detected using feature_check, and thus was not tested.
>
> Signed-off-by: Steinar H. Gunderson <[email protected]>
> Tested-by: Arnaldo Carvalho de Melo <[email protected]>
> ---
> tools/build/Makefile.feature | 1 +
> tools/perf/Makefile.config | 15 ++++
> tools/perf/builtin-version.c | 1 +
> tools/perf/util/Build | 1 +
> tools/perf/util/llvm-c-helpers.cpp | 134 +++++++++++++++++++++++++++++
> tools/perf/util/llvm-c-helpers.h | 49 +++++++++++
> tools/perf/util/srcline.c | 57 +++++++++++-
> 7 files changed, 257 insertions(+), 1 deletion(-)
> create mode 100644 tools/perf/util/llvm-c-helpers.cpp
> create mode 100644 tools/perf/util/llvm-c-helpers.h
>
> diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
> index 1e2ab148d5db..278b26216254 100644
> --- a/tools/build/Makefile.feature
> +++ b/tools/build/Makefile.feature
> @@ -136,6 +136,7 @@ FEATURE_DISPLAY ?= \
> libunwind \
> libdw-dwarf-unwind \
> libcapstone \
> + llvm \
> zlib \
> lzma \
> get_cpuid \
> diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
> index 7f1e016a9253..42f5ad89c0ca 100644
> --- a/tools/perf/Makefile.config
> +++ b/tools/perf/Makefile.config
> @@ -969,6 +969,21 @@ ifdef BUILD_NONDISTRO
> endif
> endif
>
> +ifndef NO_LIBLLVM
> + $(call feature_check,llvm)
> + ifeq ($(feature-llvm), 1)
> + CFLAGS += -DHAVE_LIBLLVM_SUPPORT
> + CXXFLAGS += -DHAVE_LIBLLVM_SUPPORT
> + CXXFLAGS += $(shell $(LLVM_CONFIG) --cxxflags)
> + LIBLLVM = $(shell $(LLVM_CONFIG) --libs all) $(shell $(LLVM_CONFIG) --system-libs)
> + EXTLIBS += -L$(shell $(LLVM_CONFIG) --libdir) $(LIBLLVM)
> + $(call detected,CONFIG_LIBLLVM)
> + else
> + $(warning No libllvm found, slower source file resolution, please install llvm-devel/llvm-dev)
> + NO_LIBLLVM := 1
> + endif
> +endif
> +
> ifndef NO_DEMANGLE
> $(call feature_check,cxa-demangle)
> ifeq ($(feature-cxa-demangle), 1)
> diff --git a/tools/perf/builtin-version.c b/tools/perf/builtin-version.c
> index 398aa53e9e2e..4b252196de12 100644
> --- a/tools/perf/builtin-version.c
> +++ b/tools/perf/builtin-version.c
> @@ -65,6 +65,7 @@ static void library_status(void)
> STATUS(HAVE_LIBBFD_SUPPORT, libbfd);
> STATUS(HAVE_DEBUGINFOD_SUPPORT, debuginfod);
> STATUS(HAVE_LIBELF_SUPPORT, libelf);
> + STATUS(HAVE_LIBLLVM_SUPPORT, libllvm);
> STATUS(HAVE_LIBNUMA_SUPPORT, libnuma);
> STATUS(HAVE_LIBNUMA_SUPPORT, numa_num_possible_cpus);
> STATUS(HAVE_LIBPERL_SUPPORT, libperl);
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index da64efd8718f..6a1fd8f1a488 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -226,6 +226,7 @@ perf-$(CONFIG_CXX_DEMANGLE) += demangle-cxx.o
> perf-y += demangle-ocaml.o
> perf-y += demangle-java.o
> perf-y += demangle-rust.o
> +perf-$(CONFIG_LIBLLVM) += llvm-c-helpers.o
>
> ifdef CONFIG_JITDUMP
> perf-$(CONFIG_LIBELF) += jitdump.o
> diff --git a/tools/perf/util/llvm-c-helpers.cpp b/tools/perf/util/llvm-c-helpers.cpp
> new file mode 100644
> index 000000000000..3cc967ec6f28
> --- /dev/null
> +++ b/tools/perf/util/llvm-c-helpers.cpp
> @@ -0,0 +1,134 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Must come before the linux/compiler.h include, which defines several
> + * macros (e.g. noinline) that conflict with compiler builtins used
> + * by LLVM.
> + */
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wunused-parameter" /* Needed for LLVM <= 15 */
> +#include <llvm/DebugInfo/Symbolize/Symbolize.h>
> +#pragma GCC diagnostic pop
> +
> +#include <stdio.h>
> +#include <sys/types.h>
> +#include <linux/compiler.h>
> +extern "C" {
> +#include <linux/zalloc.h>
> +}
> +#include "symbol_conf.h"
> +#include "llvm-c-helpers.h"
> +
> +using namespace llvm;
> +using llvm::symbolize::LLVMSymbolizer;
> +
> +/*
> + * Allocate a static LLVMSymbolizer, which will live to the end of the program.
> + * Unlike the bfd paths, LLVMSymbolizer has its own cache, so we do not need
> + * to store anything in the dso struct.
> + */
> +static LLVMSymbolizer *get_symbolizer()
> +{
> + static LLVMSymbolizer *instance = nullptr;
> + if (instance == nullptr) {
> + LLVMSymbolizer::Options opts;
> + /*
> + * LLVM sometimes demangles slightly different from the rest
> + * of the code, and this mismatch can cause new_inline_sym()
> + * to get confused and mark non-inline symbol as inlined
> + * (since the name does not properly match up with base_sym).
> + * Thus, disable the demangling and let the rest of the code
> + * handle it.
> + */
> + opts.Demangle = false;
> + instance = new LLVMSymbolizer(opts);
> + }
> + return instance;
> +}
> +
> +/* Returns 0 on error, 1 on success. */
> +static int extract_file_and_line(const DILineInfo &line_info, char **file,
> + unsigned int *line)
> +{
> + if (file) {
> + if (line_info.FileName == "<invalid>") {
> + /* Match the convention of libbfd. */
> + *file = nullptr;
> + } else {
> + /* The caller expects to get something it can free(). */
> + *file = strdup(line_info.FileName.c_str());
> + if (*file == nullptr)
> + return 0;
> + }
> + }
> + if (line)
> + *line = line_info.Line;
> + return 1;
> +}
> +
> +extern "C"
> +int llvm_addr2line(const char *dso_name, u64 addr,
> + char **file, unsigned int *line,
> + bool unwind_inlines,
> + llvm_a2l_frame **inline_frames)
> +{
> + LLVMSymbolizer *symbolizer = get_symbolizer();
> + object::SectionedAddress sectioned_addr = {
> + addr,
> + object::SectionedAddress::UndefSection
> + };
> +
> + if (unwind_inlines) {
> + Expected<DIInliningInfo> res_or_err =
> + symbolizer->symbolizeInlinedCode(dso_name,
> + sectioned_addr);
> + if (!res_or_err)
> + return 0;
> + unsigned num_frames = res_or_err->getNumberOfFrames();
> + if (num_frames == 0)
> + return 0;
> +
> + if (extract_file_and_line(res_or_err->getFrame(0),
> + file, line) == 0)
> + return 0;
> +
> + *inline_frames = (llvm_a2l_frame *)calloc(
> + num_frames, sizeof(**inline_frames));
> + if (*inline_frames == nullptr)
> + return 0;
> +
> + for (unsigned i = 0; i < num_frames; ++i) {
> + const DILineInfo &src = res_or_err->getFrame(i);
> +
> + llvm_a2l_frame &dst = (*inline_frames)[i];
> + if (src.FileName == "<invalid>")
> + /* Match the convention of libbfd. */
> + dst.filename = nullptr;
> + else
> + dst.filename = strdup(src.FileName.c_str());
> + dst.funcname = strdup(src.FunctionName.c_str());
> + dst.line = src.Line;
> +
> + if (dst.filename == nullptr ||
> + dst.funcname == nullptr) {
> + for (unsigned j = 0; j <= i; ++j) {
> + zfree(&(*inline_frames)[j].filename);
> + zfree(&(*inline_frames)[j].funcname);
> + }
> + zfree(inline_frames);
> + return 0;
> + }
> + }
> +
> + return num_frames;
> + } else {
> + if (inline_frames)
> + *inline_frames = nullptr;
> +
> + Expected<DILineInfo> res_or_err =
> + symbolizer->symbolizeCode(dso_name, sectioned_addr);
> + if (!res_or_err)
> + return 0;
> + return extract_file_and_line(*res_or_err, file, line);
> + }
> +}
> diff --git a/tools/perf/util/llvm-c-helpers.h b/tools/perf/util/llvm-c-helpers.h
> new file mode 100644
> index 000000000000..19332dd98e14
> --- /dev/null
> +++ b/tools/perf/util/llvm-c-helpers.h
> @@ -0,0 +1,49 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __PERF_LLVM_C_HELPERS
> +#define __PERF_LLVM_C_HELPERS 1
> +
> +/*
> + * Helpers to call into LLVM C++ code from C, for the parts that do not have
> + * C APIs.
> + */
> +
> +#include <linux/compiler.h>
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +struct llvm_a2l_frame {
> + char* filename;
> + char* funcname;
> + unsigned int line;
> +};
> +
> +/*
> + * Implement addr2line() using libLLVM. LLVM is a C++ API, and
> + * many of the linux/ headers cannot be included in a C++ compile unit,
> + * so we need to make a little bridge code here. llvm_addr2line() will
> + * convert the inline frame information from LLVM's internal structures
> + * and put them into a flat array given in inline_frames. The caller
> + * is then responsible for taking that array and convert it into perf's
> + * regular inline frame structures (which depend on e.g. struct list_head).
> + *
> + * If the address could not be resolved, or an error occurred (e.g. OOM),
> + * returns 0. Otherwise, returns the number of inline frames (which means 1
> + * if the address was not part of an inlined function). If unwind_inlines
> + * is set and the return code is nonzero, inline_frames will be set to
> + * a newly allocated array with that length. The caller is then responsible
> + * for freeing both the strings and the array itself.
> + */
> +int llvm_addr2line(const char* dso_name,
> + u64 addr,
> + char** file,
> + unsigned int* line,
> + bool unwind_inlines,
> + struct llvm_a2l_frame** inline_frames);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* __PERF_LLVM_C_HELPERS */
> diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
> index 9d670d8c1c08..74091ecc21df 100644
> --- a/tools/perf/util/srcline.c
> +++ b/tools/perf/util/srcline.c
> @@ -16,6 +16,9 @@
> #include "util/debug.h"
> #include "util/callchain.h"
> #include "util/symbol_conf.h"
> +#ifdef HAVE_LIBLLVM_SUPPORT
> +#include "util/llvm-c-helpers.h"
> +#endif
> #include "srcline.h"
> #include "string2.h"
> #include "symbol.h"
> @@ -130,7 +133,59 @@ static struct symbol *new_inline_sym(struct dso *dso,
>
> #define MAX_INLINE_NEST 1024
>
> -#ifdef HAVE_LIBBFD_SUPPORT
> +#ifdef HAVE_LIBLLVM_SUPPORT
> +
> +static void free_llvm_inline_frames(struct llvm_a2l_frame *inline_frames,
> + int num_frames)
> +{
> + if (inline_frames != NULL) {
> + for (int i = 0; i < num_frames; ++i) {
> + zfree(&inline_frames[i].filename);
> + zfree(&inline_frames[i].funcname);
> + }
> + zfree(&inline_frames);
> + }
> +}
> +
> +static int addr2line(const char *dso_name, u64 addr,
> + char **file, unsigned int *line, struct dso *dso,
> + bool unwind_inlines, struct inline_node *node,
> + struct symbol *sym)
> +{
> + struct llvm_a2l_frame *inline_frames = NULL;
> + int num_frames = llvm_addr2line(dso_name, addr, file, line,
> + node && unwind_inlines, &inline_frames);
> +
> + if (num_frames == 0 || !inline_frames) {
> + /* Error, or we didn't want inlines. */
> + return num_frames;
> + }
> +
> + for (int i = 0; i < num_frames; ++i) {
> + struct symbol *inline_sym =
> + new_inline_sym(dso, sym, inline_frames[i].funcname);
> + char *srcline = NULL;
> +
> + if (inline_frames[i].filename)
> + srcline = srcline_from_fileline(
> + inline_frames[i].filename,
> + inline_frames[i].line);
> + if (inline_list__append(inline_sym, srcline, node) != 0) {
> + free_llvm_inline_frames(inline_frames, num_frames);
> + return 0;
> + }
> + }
> + free_llvm_inline_frames(inline_frames, num_frames);
> +
> + return num_frames;
> +}
> +
> +void dso__free_a2l(struct dso *)
> +{
> + /* Nothing to free. */
> +}
> +
> +#elif defined(HAVE_LIBBFD_SUPPORT)

Hmm.. it's unfortunate that we have only one addr2line
implementation at a time. Maybe we can do the same thing
like in annotate with objdump so that it can fallback to another
method when failing. But it'd require more changes beyond
this work and I'm not sure if it's really worth it.

Thanks,
Namhyung

>
> /*
> * Implement addr2line using libbfd.
> --
> 2.45.1
>
>

2024-05-29 04:08:24

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v7 3/4] perf annotate: LLVM-based disassembler

On Sun, May 26, 2024 at 11:22 AM Steinar H. Gunderson <sesse@googlecom> wrote:
>
> Support using LLVM as a disassembler method, allowing helperless
> annotation in non-distro builds. (It is also much faster than
> using libbfd or bfd objdump on binaries with a lot of debug
> information.)
>
> This is nearly identical to the output of llvm-objdump; there are
> some very rare whitespace differences, some minor changes to demangling
> (since we use perf's regular demangling and not LLVM's own) and
> the occasional case where llvm-objdump makes a different choice
> when multiple symbols share the same address. It should work across
> all of LLVM's supported architectures, although I've only tested 64-bit
> x86, and finding the right triple from perf's idea of machine
> architecture can sometimes be a bit tricky. Ideally, we should have
> some way of finding the triplet just from the file itself.
>
> Signed-off-by: Steinar H. Gunderson <[email protected]>
> ---
> tools/perf/util/disasm.c | 195 +++++++++++++++++++++++++++++
> tools/perf/util/llvm-c-helpers.cpp | 62 +++++++++
> tools/perf/util/llvm-c-helpers.h | 11 ++
> 3 files changed, 268 insertions(+)
>
> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
> index c0dbb955e61a..ee7c2365d066 100644
> --- a/tools/perf/util/disasm.c
> +++ b/tools/perf/util/disasm.c
> @@ -43,6 +43,7 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t size,
>
> static void ins__sort(struct arch *arch);
> static int disasm_line__parse(char *line, const char **namep, char **rawp);
> +static char *expand_tabs(char *line, char **storage, size_t *storage_len);
>
> static __attribute__((constructor)) void symbol__init_regexpr(void)
> {
> @@ -1378,7 +1379,9 @@ static int open_capstone_handle(struct annotate_args *args, bool is_64bit,
>
> return 0;
> }
> +#endif
>
> +#if defined(HAVE_LIBCAPSTONE_SUPPORT) || defined(HAVE_LIBLLVM_SUPPORT)
> struct find_file_offset_data {
> u64 ip;
> u64 offset;
> @@ -1442,7 +1445,9 @@ read_symbol(const char *filename, struct map *map, struct symbol *sym,
> free(buf);
> return NULL;
> }
> +#endif
>
> +#ifdef HAVE_LIBCAPSTONE_SUPPORT
> static void print_capstone_detail(cs_insn *insn, char *buf, size_t len,
> struct annotate_args *args, u64 addr)
> {
> @@ -1606,6 +1611,191 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
> }
> #endif
>
> +#ifdef HAVE_LIBLLVM_SUPPORT
> +#include <llvm-c/Disassembler.h>
> +#include <llvm-c/Target.h>
> +#include "util/llvm-c-helpers.h"
> +
> +struct symbol_lookup_storage {
> + u64 branch_addr;
> + u64 pcrel_load_addr;
> +};
> +
> +/*
> + * Whenever LLVM wants to resolve an address into a symbol, it calls this
> + * callback. We don't ever actually _return_ anything (in particular, because
> + * it puts quotation marks around what we return), but we use this as a hint
> + * that there is a branch or PC-relative address in the expression that we
> + * should add some textual annotation for after the instruction. The caller
> + * will use this information to add the actual annotation.
> + */
> +static const char *
> +symbol_lookup_callback(void *disinfo, uint64_t value,
> + uint64_t *ref_type,
> + uint64_t address __maybe_unused,
> + const char **ref __maybe_unused)
> +{
> + struct symbol_lookup_storage *storage =
> + (struct symbol_lookup_storage *)disinfo;

You don't need this in C. :) Also we usually put a blank line
after declaration (at least, at the beginning).


> + if (*ref_type == LLVMDisassembler_ReferenceType_In_Branch)
> + storage->branch_addr = value;
> + else if (*ref_type == LLVMDisassembler_ReferenceType_In_PCrel_Load)
> + storage->pcrel_load_addr = value;
> + *ref_type = LLVMDisassembler_ReferenceType_InOut_None;
> + return NULL;
> +}
> +
> +static int symbol__disassemble_llvm(char *filename, struct symbol *sym,
> + struct annotate_args *args)
> +{
> + struct annotation *notes = symbol__annotation(sym);
> + struct map *map = args->ms.map;
> + struct dso *dso = map__dso(map);
> + u64 start = map__rip_2objdump(map, sym->start);
> + u8 *buf;
> + u64 len;
> + u64 pc;
> + bool is_64bit;
> + char triplet[64];
> + char disasm_buf[2048];
> + size_t disasm_len;
> + struct disasm_line *dl;
> + LLVMDisasmContextRef disasm = NULL;
> + struct symbol_lookup_storage storage;
> + char *line_storage = NULL;
> + size_t line_storage_len = 0;
> +
> + if (args->options->objdump_path)
> + return -1;
> +
> + LLVMInitializeAllTargetInfos();
> + LLVMInitializeAllTargetMCs();
> + LLVMInitializeAllDisassemblers();
> +
> + buf = read_symbol(filename, map, sym, &len, &is_64bit);
> + if (buf == NULL)
> + return -1;
> +
> + if (arch__is(args->arch, "x86")) {
> + if (is_64bit)
> + scnprintf(triplet, sizeof(triplet), "x86_64-pc-linux");
> + else
> + scnprintf(triplet, sizeof(triplet), "i686-pc-linux");
> + } else {
> + scnprintf(triplet, sizeof(triplet), "%s-linux-gnu",
> + args->arch->name);
> + }
> +
> + disasm = LLVMCreateDisasm(
> + triplet, &storage, 0, NULL, symbol_lookup_callback);

We put the arguments at the same line and align them.
Please do the same in other places.


> + if (disasm == NULL)
> + goto err;
> +
> + if (args->options->disassembler_style &&
> + !strcmp(args->options->disassembler_style, "intel"))
> + LLVMSetDisasmOptions(
> + disasm, LLVMDisassembler_Option_AsmPrinterVariant);
> +
> + /*
> + * This needs to be set after AsmPrinterVariant, due to a bug in LLVM;
> + * setting AsmPrinterVariant makes a new instruction printer, making it
> + * forget about the PrintImmHex flag (which is applied before if both
> + * are given to the same call).
> + */
> + LLVMSetDisasmOptions(disasm, LLVMDisassembler_Option_PrintImmHex);
> +
> + /* add the function address and name */
> + scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
> + start, sym->name);
> +
> + args->offset = -1;
> + args->line = disasm_buf;
> + args->line_nr = 0;
> + args->fileloc = NULL;
> + args->ms.sym = sym;
> +
> + dl = disasm_line__new(args);
> + if (dl == NULL)
> + goto err;
> +
> + annotation_line__add(&dl->al, &notes->src->source);
> +
> + pc = start;
> + for (u64 offset = 0; offset < len; ) {
> + unsigned int ins_len;
> +
> + storage.branch_addr = 0;
> + storage.pcrel_load_addr = 0;
> +
> + ins_len = LLVMDisasmInstruction(
> + disasm, buf + offset, len - offset, pc,
> + disasm_buf, sizeof(disasm_buf));
> + if (ins_len == 0)
> + goto err;
> + disasm_len = strlen(disasm_buf);
> +
> + if (storage.branch_addr != 0) {
> + char *name = llvm_name_for_code(
> + dso, filename, storage.branch_addr);
> + if (name != NULL) {
> + disasm_len += scnprintf(
> + disasm_buf + disasm_len,
> + sizeof(disasm_buf) - disasm_len,
> + " <%s>", name);
> + free(name);
> + }
> + }
> + if (storage.pcrel_load_addr != 0) {
> + char *name = llvm_name_for_data(
> + dso, filename, storage.pcrel_load_addr);
> + disasm_len += scnprintf(disasm_buf + disasm_len,
> + sizeof(disasm_buf) - disasm_len,
> + " # %#"PRIx64,
> + storage.pcrel_load_addr);
> + if (name) {
> + disasm_len += scnprintf(
> + disasm_buf + disasm_len,
> + sizeof(disasm_buf) - disasm_len,
> + " <%s>", name);
> + free(name);
> + }
> + }
> +
> + args->offset = offset;
> + args->line = expand_tabs(
> + disasm_buf, &line_storage, &line_storage_len);
> + args->line_nr = 0;
> + args->fileloc = NULL;
> + args->ms.sym = sym;
> +
> + llvm_addr2line(filename, pc, &args->fileloc,
> + (unsigned int *)&args->line_nr, false, NULL);
> +
> + dl = disasm_line__new(args);
> + if (dl == NULL)
> + goto err;
> +
> + annotation_line__add(&dl->al, &notes->src->source);
> +
> + free(args->fileloc);
> + pc += ins_len;
> + offset += ins_len;
> + }
> +
> + LLVMDisasmDispose(disasm);
> + free(buf);
> + free(line_storage);
> + return 0;

Often we just set a different return value here and share the
error handling logic at the end.

Thanks,
Namhyung


> +
> +err:
> + LLVMDisasmDispose(disasm);
> + free(buf);
> + free(line_storage);
> + return -1;
> +}
> +#endif
> +
> +
> /*
> * Possibly create a new version of line with tabs expanded. Returns the
> * existing or new line, storage is updated if a new line is allocated. If
> @@ -1730,6 +1920,11 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
> strcpy(symfs_filename, tmp);
> }
>
> +#ifdef HAVE_LIBLLVM_SUPPORT
> + err = symbol__disassemble_llvm(symfs_filename, sym, args);
> + if (err == 0)
> + goto out_remove_tmp;
> +#endif
> #ifdef HAVE_LIBCAPSTONE_SUPPORT
> err = symbol__disassemble_capstone(symfs_filename, sym, args);
> if (err == 0)
> diff --git a/tools/perf/util/llvm-c-helpers.cpp b/tools/perf/util/llvm-c-helpers.cpp
> index 3cc967ec6f28..4070e2d5682f 100644
> --- a/tools/perf/util/llvm-c-helpers.cpp
> +++ b/tools/perf/util/llvm-c-helpers.cpp
> @@ -8,6 +8,7 @@
> #pragma GCC diagnostic push
> #pragma GCC diagnostic ignored "-Wunused-parameter" /* Needed for LLVM <= 15 */
> #include <llvm/DebugInfo/Symbolize/Symbolize.h>
> +#include <llvm/Support/TargetSelect.h>
> #pragma GCC diagnostic pop
>
> #include <stdio.h>
> @@ -19,6 +20,9 @@ extern "C" {
> #include "symbol_conf.h"
> #include "llvm-c-helpers.h"
>
> +extern "C"
> +char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
> +
> using namespace llvm;
> using llvm::symbolize::LLVMSymbolizer;
>
> @@ -132,3 +136,61 @@ int llvm_addr2line(const char *dso_name, u64 addr,
> return extract_file_and_line(*res_or_err, file, line);
> }
> }
> +
> +static char *
> +make_symbol_relative_string(struct dso *dso, const char *sym_name,
> + u64 addr, u64 base_addr)
> +{
> + if (!strcmp(sym_name, "<invalid>"))
> + return NULL;
> +
> + char *demangled = dso__demangle_sym(dso, 0, sym_name);
> + if (base_addr && base_addr != addr) {
> + char buf[256];
> + snprintf(buf, sizeof(buf), "%s+0x%lx",
> + demangled ? demangled : sym_name, addr - base_addr);
> + free(demangled);
> + return strdup(buf);
> + } else {
> + if (demangled)
> + return demangled;
> + else
> + return strdup(sym_name);
> + }
> +}
> +
> +extern "C"
> +char *llvm_name_for_code(struct dso *dso, const char *dso_name, u64 addr)
> +{
> + LLVMSymbolizer *symbolizer = get_symbolizer();
> + object::SectionedAddress sectioned_addr = {
> + addr,
> + object::SectionedAddress::UndefSection
> + };
> + Expected<DILineInfo> res_or_err =
> + symbolizer->symbolizeCode(dso_name, sectioned_addr);
> + if (!res_or_err) {
> + return NULL;
> + }
> + return make_symbol_relative_string(
> + dso, res_or_err->FunctionName.c_str(),
> + addr, res_or_err->StartAddress ? *res_or_err->StartAddress : 0);
> +}
> +
> +extern "C"
> +char *llvm_name_for_data(struct dso *dso, const char *dso_name, u64 addr)
> +{
> + LLVMSymbolizer *symbolizer = get_symbolizer();
> + object::SectionedAddress sectioned_addr = {
> + addr,
> + object::SectionedAddress::UndefSection
> + };
> + Expected<DIGlobal> res_or_err =
> + symbolizer->symbolizeData(dso_name, sectioned_addr);
> + if (!res_or_err) {
> + return NULL;
> + }
> + return make_symbol_relative_string(
> + dso, res_or_err->Name.c_str(),
> + addr, res_or_err->Start);
> +}
> diff --git a/tools/perf/util/llvm-c-helpers.h b/tools/perf/util/llvm-c-helpers.h
> index 19332dd98e14..d2b99637a28a 100644
> --- a/tools/perf/util/llvm-c-helpers.h
> +++ b/tools/perf/util/llvm-c-helpers.h
> @@ -13,6 +13,8 @@
> extern "C" {
> #endif
>
> +struct dso;
> +
> struct llvm_a2l_frame {
> char* filename;
> char* funcname;
> @@ -42,6 +44,15 @@ int llvm_addr2line(const char* dso_name,
> bool unwind_inlines,
> struct llvm_a2l_frame** inline_frames);
>
> +/*
> + * Simple symbolizers for addresses; will convert something like
> + * 0x12345 to "func+0x123". Will return NULL if no symbol was found.
> + *
> + * The returned value must be freed by the caller, with free().
> + */
> +char *llvm_name_for_code(struct dso *dso, const char *dso_name, u64 addr);
> +char *llvm_name_for_data(struct dso *dso, const char *dso_name, u64 addr);
> +
> #ifdef __cplusplus
> }
> #endif
> --
> 2.45.1
>
>

2024-05-29 04:11:59

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v7 4/4] perf report: LLVM-based symbol listing

On Sun, May 26, 2024 at 11:23 AM Steinar H. Gunderson <sesse@googlecom> wrote:
>
> Similar to the previous LLVM patches, this allows symbol enumeration
> using LLVM instead of libbfd or libelf. This allows distro builds to
> reliably enumerate symbols in e.g. PE binaries as profiled with WINE.
>
> A lot of the logic for e.g. length detection has been carried over
> from the libbfd code.
>
> Signed-off-by: Steinar H. Gunderson <[email protected]>
> ---
> tools/perf/util/llvm-c-helpers.cpp | 124 +++++++++++++++++++++++++++++
> tools/perf/util/llvm-c-helpers.h | 49 ++++++++++++
> tools/perf/util/symbol.c | 54 ++++++++++++-
> tools/perf/util/symbol.h | 3 +
> 4 files changed, 228 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/llvm-c-helpers.cpp b/tools/perf/util/llvm-c-helpers.cpp
> index 4070e2d5682f..87caf98cc8f6 100644
> --- a/tools/perf/util/llvm-c-helpers.cpp
> +++ b/tools/perf/util/llvm-c-helpers.cpp
> @@ -9,6 +9,7 @@
> #pragma GCC diagnostic ignored "-Wunused-parameter" /* Needed for LLVM <= 15 */
> #include <llvm/DebugInfo/Symbolize/Symbolize.h>
> #include <llvm/Support/TargetSelect.h>
> +#include <llvm/Object/Binary.h>
> #pragma GCC diagnostic pop
>
> #include <stdio.h>
> @@ -17,6 +18,8 @@
> extern "C" {
> #include <linux/zalloc.h>
> }
> +#include <vector>
> +#include <algorithm>
> #include "symbol_conf.h"
> #include "llvm-c-helpers.h"
>
> @@ -24,6 +27,7 @@ extern "C"
> char *dso__demangle_sym(struct dso *dso, int kmodule, const char *elf_name);
>
> using namespace llvm;
> +using namespace llvm::object;
> using llvm::symbolize::LLVMSymbolizer;
>
> /*
> @@ -194,3 +198,123 @@ char *llvm_name_for_data(struct dso *dso, const char *dso_name, u64 addr)
> dso, res_or_err->Name.c_str(),
> addr, res_or_err->Start);
> }
> +
> +int llvm_load_symbols(const char *debugfile, struct llvm_symbol_list *symbols)
> +{
> + /* NOTE: This nominally does an mmap, despite the scary name. */
> + ErrorOr<std::unique_ptr<MemoryBuffer>> mem_buf_or_err =
> + MemoryBuffer::getFile(debugfile);
> + if (mem_buf_or_err.getError())
> + return -1;
> +
> + Expected<std::unique_ptr<Binary>> binary_or_err(
> + createBinary(mem_buf_or_err.get()->getMemBufferRef(), nullptr));
> + if (!binary_or_err)
> + return -1;
> +
> + /* Find the .text section. */
> + SectionRef text_section;
> + uint64_t text_filepos, image_base;
> + for (SectionRef section :
> + cast<ObjectFile>(*binary_or_err.get()).sections()) {
> + Expected<StringRef> name = section.getName();
> + if (name && *name == ".text") {
> + text_section = section;
> +
> + /*
> + * If we don't find an image base below, we infer the
> + * image base * from the address and file offset of the
> + * .text section.
> + */
> + text_filepos = reinterpret_cast<const char *>(
> + text_section.getContents()->bytes_begin()) -
> + mem_buf_or_err.get()->getBufferStart();
> + image_base = text_section.getAddress() - text_filepos;
> + break;
> + }
> + }
> + if (text_section == SectionRef())
> + /* No .text section, so no symbols (but also not a failure). */
> + return 0;
> +
> + /*
> + * See if we can find an explicit image base pseudosymbol. If so, get
> + * the image base directly from it, then infer the file position of
> + * .text from that (i.e., the opposite inference of the fallback above).
> + */

If it's PE-specific, can we skip this based on the target type or
something?


> + for (SymbolRef symbol :
> + cast<ObjectFile>(*binary_or_err.get()).symbols())
> + if (symbol.getName() &&
> + symbol.getAddress() &&
> + (*symbol.getName() == "__ImageBase" ||
> + *symbol.getName() == "__image_base__")) {
> + image_base = *symbol.getAddress();
> + if (image_base < 0x100000000ULL)
> + /*
> + * PE symbols can only have 4 bytes, so use
> + * .text high bits (if any).
> + */
> + image_base |= text_section.getAddress() &
> + ~0xFFFFFFFFULL;
> + text_filepos = text_section.getAddress() - image_base;
> + break;
> + }
> +
> + symbols->image_base = image_base;
> + symbols->text_end = text_filepos + text_section.getSize();
> +
> + /* Collect all valid symbols. */
> + std::vector<SymbolRef> all_symbols;
> + for (SymbolRef symbol :
> + cast<ObjectFile>(*binary_or_err.get()).symbols())
> + if (symbol.getName() && symbol.getFlags() &&
> + symbol.getAddress() && symbol.getSection())
> + all_symbols.push_back(symbol);
> + symbols->num_symbols = all_symbols.size();
> + symbols->symbols = (struct llvm_symbol *)calloc(
> + all_symbols.size(), sizeof(struct llvm_symbol));
> + if (symbols->symbols == nullptr)
> + return -1;

I'm not sure if it's compatible with what we do in the
dso__load_sym_internal(). Probably ARM and PPC folks
should check if it works well.

Thanks,
Namhyung


> +
> + /*
> + * Symbols don't normally come with lengths, so we'll infer them
> + * from what comes after the symbol address-wise. There is some
> + * extra logic around zero-length symbols and deduplication,
> + * which the caller will do for us (it's shared with other backends).
> + */
> + std::sort(all_symbols.begin(), all_symbols.end(),
> + [](const SymbolRef &a, const SymbolRef &b) {
> + if (*a.getAddress() != *b.getAddress())
> + return *a.getAddress() < *b.getAddress();
> + return *a.getName() < *b.getName();
> + });
> + for (size_t i = 0; i < all_symbols.size(); ++i) {
> + const SymbolRef &sym = all_symbols[i];
> + llvm_symbol &out_sym = symbols->symbols[i];
> + out_sym.start = *sym.getAddress() - image_base;
> + out_sym.name = (char *)calloc(1, sym.getName()->size() + 1);
> + if (out_sym.name == nullptr) {
> + for (size_t i = 0; i < all_symbols.size(); ++i) {
> + zfree(&symbols->symbols[i].name);
> + }
> + zfree(&symbols->symbols);
> + return -1;
> + }
> + memcpy(out_sym.name, sym.getName()->bytes_begin(),
> + sym.getName()->size());
> + out_sym.global = *sym.getFlags() & SymbolRef::SF_Global;
> + out_sym.weak = *sym.getFlags() & SymbolRef::SF_Weak;
> +
> + SectionRef section = **sym.getSection();
> + uint64_t next_addr;
> + if (i + 1 < all_symbols.size() &&
> + section == **all_symbols[i + 1].getSection())
> + next_addr = *all_symbols[i + 1].getAddress();
> + else
> + next_addr = section.getAddress() + section.getSize();
> +
> + out_sym.len = next_addr - *sym.getAddress();
> + }
> +
> + return 1;
> +}
> diff --git a/tools/perf/util/llvm-c-helpers.h b/tools/perf/util/llvm-c-helpers.h
> index d2b99637a28a..1abbc9ef3dab 100644
> --- a/tools/perf/util/llvm-c-helpers.h
> +++ b/tools/perf/util/llvm-c-helpers.h
> @@ -8,6 +8,7 @@
> */
>
> #include <linux/compiler.h>
> +#include <stddef.h>
>
> #ifdef __cplusplus
> extern "C" {
> @@ -53,6 +54,54 @@ int llvm_addr2line(const char* dso_name,
> char *llvm_name_for_code(struct dso *dso, const char *dso_name, u64 addr);
> char *llvm_name_for_data(struct dso *dso, const char *dso_name, u64 addr);
>
> +struct llvm_symbol {
> + u64 start, len;
> + bool weak, global;
> +
> + /* Must be freed by the caller. */
> + char *name;
> +};
> +
> +struct llvm_symbol_list {
> + /*
> + * The address this dso is loaded at in memory; adding this to
> + * the file position of any symbol will give the address that
> + * symbol is loaded at.
> + */
> + u64 image_base;
> +
> + /*
> + * The byte offset of the end of the .text segment (one-past-the-end)
> + * in the file.
> + */
> + u64 text_end;
> +
> + /*
> + * The actual symbols. Must be freed by the caller, with free().
> + * The string inside each llvm_symbol must also be freed.
> + */
> + size_t num_symbols;
> + struct llvm_symbol *symbols;
> +};
> +
> +/*
> + * Get a list of all symbols in the given file using LLVM. Similarly to
> + * addr2line(), we need to implement this using a helper structure;
> + * LLVM actually has a C API for this, but we can't use it (it doesn't
> + * have an equivalent of getFlags()). So we do the meat of the LLVM-facing
> + * logic here and let the caller make perf symbols out of it.
> + *
> + * Possible return values and their meaning:
> + *
> + * -1: An error occurred.
> + * 0: The file loaded fine, but there were no .text section.
> + * 1: The file loaded successfully.
> + *
> + * For -1 and 0, "symbols" does not contain any meaningful information,
> + * and nothing needs to be freed.
> + */
> +int llvm_load_symbols(const char *debugfile, struct llvm_symbol_list *symbols);
> +
> #ifdef __cplusplus
> }
> #endif
> diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
> index 9e5940b5bc59..69b8c0a280c7 100644
> --- a/tools/perf/util/symbol.c
> +++ b/tools/perf/util/symbol.c
> @@ -22,6 +22,7 @@
> #include "util.h" // lsdir()
> #include "debug.h"
> #include "event.h"
> +#include "llvm-c-helpers.h"
> #include "machine.h"
> #include "map.h"
> #include "symbol.h"
> @@ -1566,6 +1567,50 @@ static int dso__load_perf_map(const char *map_path, struct dso *dso)
> return -1;
> }
>
> +#ifdef HAVE_LIBLLVM_SUPPORT
> +int dso__load_llvm_symbols(struct dso *dso, const char *debugfile)
> +{
> + struct llvm_symbol_list symbols;
> + struct symbol *symbol;
> + int err = llvm_load_symbols(debugfile, &symbols);
> + int ret = -1;
> +
> + if (err == -1)
> + return -1;
> + else if (err == 0)
> + return 0;
> +
> + for (size_t i = 0; i < symbols.num_symbols; ++i) {
> + const struct llvm_symbol *sym = &symbols.symbols[i];
> + int elf_binding;
> +
> + if (sym->weak)
> + elf_binding = STB_WEAK;
> + else if (sym->global)
> + elf_binding = STB_GLOBAL;
> + else
> + elf_binding = STB_LOCAL;
> + symbol = symbol__new(sym->start, sym->len, elf_binding, STT_FUNC,
> + sym->name);
> + if (!symbol)
> + goto out_free;
> +
> + symbols__insert(dso__symbols(dso), symbol);
> + }
> +
> + symbols__fixup_end(dso__symbols(dso), false);
> + symbols__fixup_duplicate(dso__symbols(dso));
> + dso__set_adjust_symbols(dso, true);
> + ret = 0;
> +
> +out_free:
> + for (size_t i = 0; i < symbols.num_symbols; ++i)
> + zfree(&symbols.symbols[i].name);
> + zfree(&symbols.symbols);
> + return ret;
> +}
> +#endif
> +
> #ifdef HAVE_LIBBFD_SUPPORT
> #define PACKAGE 'perf'
> #include <bfd.h>
> @@ -1871,6 +1916,7 @@ int dso__load(struct dso *dso, struct map *map)
> bool next_slot = false;
> bool is_reg;
> bool nsexit;
> + int llvmrc = -1;
> int bfdrc = -1;
> int sirc = -1;
>
> @@ -1899,11 +1945,15 @@ int dso__load(struct dso *dso, struct map *map)
> }
> }
>
> -#ifdef HAVE_LIBBFD_SUPPORT
> +#ifdef HAVE_LIBLLVM_SUPPORT
> if (is_reg)
> + llvmrc = dso__load_llvm_symbols(dso, name);
> +#endif
> +#ifdef HAVE_LIBBFD_SUPPORT
> + if (is_reg && llvmrc < 0)
> bfdrc = dso__load_bfd_symbols(dso, name);
> #endif
> - if (is_reg && bfdrc < 0)
> + if (is_reg && llvmrc < 0 && bfdrc < 0)
> sirc = symsrc__init(ss, dso, name, symtab_type);
>
> if (nsexit)
> diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
> index 3fb5d146d9b1..01256c41467a 100644
> --- a/tools/perf/util/symbol.h
> +++ b/tools/perf/util/symbol.h
> @@ -174,6 +174,9 @@ int symbol__config_symfs(const struct option *opt __maybe_unused,
>
> struct symsrc;
>
> +#ifdef HAVE_LIBLLVM_SUPPORT
> +int dso__load_llvm_symbols(struct dso *dso, const char *debugfile);
> +#endif
> #ifdef HAVE_LIBBFD_SUPPORT
> int dso__load_bfd_symbols(struct dso *dso, const char *debugfile);
> #endif
> --
> 2.45.1
>
>

2024-05-29 14:09:11

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v7 1/4] perf report: Support LLVM for addr2line()

On Tue, May 28, 2024 at 08:56:47PM -0700, Namhyung Kim wrote:
> On Sun, May 26, 2024 at 11:22 AM Steinar H. Gunderson <[email protected]> wrote:
> > +#elif defined(HAVE_LIBBFD_SUPPORT)

> Hmm.. it's unfortunate that we have only one addr2line
> implementation at a time. Maybe we can do the same thing
> like in annotate with objdump so that it can fallback to another
> method when failing. But it'd require more changes beyond
> this work and I'm not sure if it's really worth it.

Right, I think we shouldn't delay processing these patches because of
that, we can do it as follow up patches, both for fallbacks in case we
can detect problems like we did with capstone -> objdump disasm and also
to be able to compare outputs in 'perf test' shell scripts, discounting
known/expected minor differences.

- Arnaldo

2024-05-30 16:56:13

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v7 1/4] perf report: Support LLVM for addr2line()

On Wed, May 29, 2024 at 6:53 AM Arnaldo Carvalho de Melo
<[email protected]> wrote:
>
> On Tue, May 28, 2024 at 08:56:47PM -0700, Namhyung Kim wrote:
> > On Sun, May 26, 2024 at 11:22 AM Steinar H. Gunderson <[email protected]> wrote:
> > > +#elif defined(HAVE_LIBBFD_SUPPORT)
>
> > Hmm.. it's unfortunate that we have only one addr2line
> > implementation at a time. Maybe we can do the same thing
> > like in annotate with objdump so that it can fallback to another
> > method when failing. But it'd require more changes beyond
> > this work and I'm not sure if it's really worth it.
>
> Right, I think we shouldn't delay processing these patches because of
> that, we can do it as follow up patches, both for fallbacks in case we
> can detect problems like we did with capstone -> objdump disasm and also
> to be able to compare outputs in 'perf test' shell scripts, discounting
> known/expected minor differences.

Yep, I don't want to delay it unnecessarily. But I just want more
people to review and/or test the patches. :)

Thanks,
Namhyung

2024-05-30 16:58:29

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v7 1/4] perf report: Support LLVM for addr2line()

On Sun, May 26, 2024 at 11:22 AM Steinar H. Gunderson <sesse@googlecom> wrote:
>
> In addition to the existing support for libbfd and calling out to
> an external addr2line command, add support for using libllvm directly.
> This is both faster than libbfd, and can be enabled in distro builds
> (the LLVM license has an explicit provision for GPLv2 compatibility).
> Thus, it is set as the primary choice if available.
>
> As an example, running perf report on a medium-size profile with
> DWARF-based backtraces took 58 seconds with LLVM, 78 seconds with
> libbfd, 153 seconds with external llvm-addr2line, and I got tired
> and aborted the test after waiting for 55 minutes with external
> bfd addr2line (which is the default for perf as compiled by distributions
> today). Evidently, for this case, the bfd addr2line process needs
> 18 seconds (on a 5.2 GHz Zen 3) to load the .debug ELF in question,
> hits the 1-second timeout and gets killed during initialization,
> getting restarted anew every time. Having an in-process addr2line
> makes this much more robust.
>
> As future extensions, libllvm can be used in many other places where
> we currently use libbfd or other libraries:
>
> - Symbol enumeration (in particular, for PE binaries).
> - Demangling (including non-Itanium demangling, e.g. Microsoft
> or Rust).
> - Disassembling (perf annotate).
>
> However, these are much less pressing; most people don't profile
> PE binaries, and perf has non-bfd paths for ELF. The same with
> demangling; the default _cxa_demangle path works fine for most
> users. Disassembling is coming in a later patch in the series;
> however do note that while bfd objdump can be slow on large binaries,
> it is possible to use --objdump=llvm-objdump to get the speed benefits.
> (It appears LLVM-based demangling is very simple, should we want
> that.)
>
> Tested with LLVM 14, 15, 16, 18 and 19. For some reason, LLVM 12 was not
> correctly detected using feature_check, and thus was not tested.
>
> Signed-off-by: Steinar H. Gunderson <[email protected]>
> Tested-by: Arnaldo Carvalho de Melo <[email protected]>
> ---
> tools/build/Makefile.feature | 1 +
> tools/perf/Makefile.config | 15 ++++
> tools/perf/builtin-version.c | 1 +
> tools/perf/util/Build | 1 +
> tools/perf/util/llvm-c-helpers.cpp | 134 +++++++++++++++++++++++++++++
> tools/perf/util/llvm-c-helpers.h | 49 +++++++++++
> tools/perf/util/srcline.c | 57 +++++++++++-
> 7 files changed, 257 insertions(+), 1 deletion(-)
> create mode 100644 tools/perf/util/llvm-c-helpers.cpp
> create mode 100644 tools/perf/util/llvm-c-helpers.h

Can you please add something to tests/mask so that
we can test the build with and without libllvm easily?

Thanks,
Namhyung

2024-06-02 20:38:08

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: [PATCH v7 1/4] perf report: Support LLVM for addr2line()

On Thu, May 30, 2024 at 09:58:09AM -0700, Namhyung Kim wrote:
> Can you please add something to tests/mask so that
> we can test the build with and without libllvm easily?

I have no idea what this is (it doesn't seem to be a file or directory).
Do you have a pointer where I can get more information?

/* Steinar */

2024-06-02 20:40:03

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: [PATCH v7 4/4] perf report: LLVM-based symbol listing

On Sun, May 26, 2024 at 08:22:12PM +0200, Steinar H. Gunderson wrote:
> Similar to the previous LLVM patches, this allows symbol enumeration
> using LLVM instead of libbfd or libelf. This allows distro builds to
> reliably enumerate symbols in e.g. PE binaries as profiled with WINE.
>
> A lot of the logic for e.g. length detection has been carried over
> from the libbfd code.

I'm dropping this patch; I don't need it myself, it hits some bugs, and
I don't have the resources to see it through. I'll send v8 of the rest
of the series soon.

/* Steinar */

2024-06-02 20:53:11

by Steinar H. Gunderson

[permalink] [raw]
Subject: Re: [PATCH v7 3/4] perf annotate: LLVM-based disassembler

On Tue, May 28, 2024 at 09:06:24PM -0700, Namhyung Kim wrote:
>> + struct symbol_lookup_storage *storage =
>> + (struct symbol_lookup_storage *)disinfo;
> You don't need this in C. :) Also we usually put a blank line
> after declaration (at least, at the beginning).

Done.

> We put the arguments at the same line and align them.
> Please do the same in other places.

Done, I believe.

>> + LLVMDisasmDispose(disasm);
>> + free(buf);
>> + free(line_storage);
>> + return 0;
> Often we just set a different return value here and share the
> error handling logic at the end.

Done.

/* Steinar */

2024-06-05 01:14:12

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH v7 1/4] perf report: Support LLVM for addr2line()

On Sun, Jun 02, 2024 at 10:37:38PM +0200, Steinar H. Gunderson wrote:
> On Thu, May 30, 2024 at 09:58:09AM -0700, Namhyung Kim wrote:
> > Can you please add something to tests/mask so that
> > we can test the build with and without libllvm easily?
>
> I have no idea what this is (it doesn't seem to be a file or directory).
> Do you have a pointer where I can get more information?

Sorry, it was tools/perf/tests/make (not mask). We keep combinations
of build configurations and test it with `make build-test`.

Thanks,
Namhyung