2024-01-17 06:27:14

by Namhyung Kim

[permalink] [raw]
Subject: [PATCHSET 0/9] perf tools: More updates on data type profiling (v4)

Hello,

This is a continuation of the data type profiling series. Now the basic
part (v3) which uses pointer variables is merged to the perf-tools-next
tree. And this part is for memory accesses without pointers as well as
small updates to handle some corner cases. Still mores to come to
complete the original series.

There's no change from the previous version. For background and usages,
pleaes refer the posting of previous version [1] and a LWN article [2].

Basically most memory accesses happen with pointers, but there are cases
don't use pointers - direct accesses to global and local variables.

Global variables are located in a static memory at a specific address.
So the DWARF location expression for the global vairable would also have
the static address. And it's common to access them using PC-relative
addressing mode. Thus it needs a special handling for global variables.

On the other hand, local variables are located in the stack which varies
as program executes. So the local variables are accessed either by the
(stack) frame pointer or (current) stack pointer. But sometimes DWARF
location expression uses a frame base address (CFA) to specify location
of local variables. So it may need to convert or normalize the location
extracted from the instruction to match DWARF expression.

Lastly, there are some cases DWARF location expressions end up having
complex (or not straight-forward) location. In that case, it cannot
simply match just the first expression with the instruction location.
It'd be safer to reject them.

The code is available at 'perf/data-profile-update-v4' branch in the tree
below. The full version of the code is in 'perf/data-profile-v4' branch.

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Cc: Ben Woodard <[email protected]>
Cc: Joe Mario <[email protected]>
CC: Kees Cook <[email protected]>
Cc: David Blaikie <[email protected]>
Cc: Xu Liu <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Mark Wielaard <[email protected]>
Cc: Jason Merrill <[email protected]>
Cc: Jose E. Marchesi <[email protected]>
Cc: William Huang <[email protected]>

[1] https://lore.kernel.org/linux-perf-users/[email protected]/
[2] https://lwn.net/Articles/955709/


Namhyung Kim (9):
perf annotate-data: Parse 'lock' prefix from llvm-objdump
perf annotate-data: Handle macro fusion on x86
perf annotate-data: Handle array style accesses
perf annotate-data: Add stack operation pseudo type
perf annotate-data: Handle PC-relative addressing
perf annotate-data: Support global variables
perf dwarf-aux: Add die_get_cfa()
perf annotate-data: Support stack variables
perf dwarf-aux: Check allowed DWARF Ops

tools/perf/util/annotate-data.c | 119 ++++++++++++++++----
tools/perf/util/annotate-data.h | 8 +-
tools/perf/util/annotate.c | 153 ++++++++++++++++++++++++--
tools/perf/util/annotate.h | 12 +-
tools/perf/util/dwarf-aux.c | 187 ++++++++++++++++++++++++++++----
tools/perf/util/dwarf-aux.h | 18 +++
6 files changed, 439 insertions(+), 58 deletions(-)


base-commit: d988c9f511af71a3445b6a4f3a2c67208ff8e480
--
2.43.0.381.gb435a96ce8-goog



2024-01-17 06:27:26

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 1/9] perf annotate-data: Parse 'lock' prefix from llvm-objdump

For the performance reason, I prefer llvm-objdump over GNU's. But I
found that llvm-objdump puts x86 lock prefix in a separate line like
below.

ffffffff81000695: f0 lock
ffffffff81000696: ff 83 54 0b 00 00 incl 2900(%rbx)

This should be parsed properly, but I just changed to find the insn
with next offset for now.

This improves the statistics as it can process more instructions.

Annotate data type stats:
total 294, ok 144 (49.0%), bad 150 (51.0%)
-----------------------------------------------------------
30 : no_sym
35 : no_mem_ops
71 : no_var
6 : no_typeinfo
8 : bad_offset

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 9b70ab110ce7..8d761be1a102 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3660,8 +3660,17 @@ static struct disasm_line *find_disasm_line(struct symbol *sym, u64 ip)
notes = symbol__annotation(sym);

list_for_each_entry(dl, &notes->src->source, al.node) {
- if (sym->start + dl->al.offset == ip)
+ if (sym->start + dl->al.offset == ip) {
+ /*
+ * llvm-objdump places "lock" in a separate line and
+ * in that case, we want to get the next line.
+ */
+ if (!strcmp(dl->ins.name, "lock") && *dl->ops.raw == '\0') {
+ ip++;
+ continue;
+ }
return dl;
+ }
}
return NULL;
}
@@ -3758,6 +3767,9 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
if (!op_loc->mem_ref)
continue;

+ /* Recalculate IP since it can be changed due to LOCK prefix */
+ ip = ms->sym->start + dl->al.offset;
+
mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
if (mem_type)
istat->good++;
--
2.43.0.381.gb435a96ce8-goog


2024-01-17 06:27:31

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 2/9] perf annotate-data: Handle macro fusion on x86

When a sample was come from a conditional branch without a memory
operand, it could be due to a macro fusion with a previous instruction.
So it needs to check the memory operand in the previous one.

This improves the stat like below:

Annotate data type stats:
total 294, ok 147 (50.0%), bad 147 (50.0%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
71 : no_var
6 : no_typeinfo
8 : bad_offset

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate.c | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 8d761be1a102..0ec42e85ca5c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3751,6 +3751,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
return NULL;
}

+retry:
istat = annotate_data_stat(&ann_insn_stat, dl->ins.name);
if (istat == NULL) {
ann_data_stat.no_insn++;
@@ -3767,7 +3768,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
if (!op_loc->mem_ref)
continue;

- /* Recalculate IP since it can be changed due to LOCK prefix */
+ /* Recalculate IP because of LOCK prefix or insn fusion */
ip = ms->sym->start + dl->al.offset;

mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
@@ -3786,6 +3787,20 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
return mem_type;
}

+ /*
+ * Some instructions can be fused and the actual memory access came
+ * from the previous instruction.
+ */
+ if (dl->al.offset > 0) {
+ struct disasm_line *prev_dl;
+
+ prev_dl = list_prev_entry(dl, al.node);
+ if (ins__is_fused(arch, prev_dl->ins.name, dl->ins.name)) {
+ dl = prev_dl;
+ goto retry;
+ }
+ }
+
ann_data_stat.no_mem_ops++;
istat->bad++;
return NULL;
--
2.43.0.381.gb435a96ce8-goog


2024-01-17 06:27:59

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 3/9] perf annotate-data: Handle array style accesses

On x86, instructions for array access often looks like below.

mov 0x1234(%rax,%rbx,8), %rcx

Usually the first register holds the type information and the second one
has the index. And the current code only looks up a variable for the
first register. But it's possible to be in the other way around so it
needs to check the second register if the first one failed.

The stat changed like this.

Annotate data type stats:
total 294, ok 148 (50.3%), bad 146 (49.7%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
66 : no_var
10 : no_typeinfo
8 : bad_offset

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate-data.c | 24 +++++++++++++-----
tools/perf/util/annotate-data.h | 5 ++--
tools/perf/util/annotate.c | 43 ++++++++++++++++++++++++++-------
tools/perf/util/annotate.h | 8 ++++--
4 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index f22b4f18271c..d1ceac7e2441 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -9,6 +9,7 @@
#include <stdlib.h>
#include <inttypes.h>

+#include "annotate.h"
#include "annotate-data.h"
#include "debuginfo.h"
#include "debug.h"
@@ -207,7 +208,8 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
* It expects a pointer type for a memory access.
* Convert to a real type it points to.
*/
- if (dwarf_tag(type_die) != DW_TAG_pointer_type ||
+ if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
+ dwarf_tag(type_die) != DW_TAG_array_type) ||
die_get_real_type(type_die, type_die) == NULL) {
pr_debug("no pointer or no type\n");
ann_data_stat.no_typeinfo++;
@@ -233,10 +235,11 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)

/* The result will be saved in @type_die */
static int find_data_type_die(struct debuginfo *di, u64 pc,
- int reg, int offset, Dwarf_Die *type_die)
+ struct annotated_op_loc *loc, Dwarf_Die *type_die)
{
Dwarf_Die cu_die, var_die;
Dwarf_Die *scopes = NULL;
+ int reg, offset;
int ret = -1;
int i, nr_scopes;

@@ -250,6 +253,10 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
nr_scopes = die_get_scopes(&cu_die, pc, &scopes);

+ reg = loc->reg1;
+ offset = loc->offset;
+
+retry:
/* Search from the inner-most scope to the outer */
for (i = nr_scopes - 1; i >= 0; i--) {
/* Look up variables/parameters in this scope */
@@ -260,6 +267,12 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
ret = check_variable(&var_die, type_die, offset);
goto out;
}
+
+ if (loc->multi_regs && reg == loc->reg1 && loc->reg1 != loc->reg2) {
+ reg = loc->reg2;
+ goto retry;
+ }
+
if (ret < 0)
ann_data_stat.no_var++;

@@ -272,15 +285,14 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
* find_data_type - Return a data type at the location
* @ms: map and symbol at the location
* @ip: instruction address of the memory access
- * @reg: register that holds the base address
- * @offset: offset from the base address
+ * @loc: instruction operand location
*
* This functions searches the debug information of the binary to get the data
* type it accesses. The exact location is expressed by (ip, reg, offset).
* It return %NULL if not found.
*/
struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
- int reg, int offset)
+ struct annotated_op_loc *loc)
{
struct annotated_data_type *result = NULL;
struct dso *dso = map__dso(ms->map);
@@ -300,7 +312,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
* a file address for DWARF processing.
*/
pc = map__rip_2objdump(ms->map, ip);
- if (find_data_type_die(di, pc, reg, offset, &type_die) < 0)
+ if (find_data_type_die(di, pc, loc, &type_die) < 0)
goto out;

result = dso__findnew_data_type(dso, &type_die);
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 8e73096c01d1..65ddd839850f 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -7,6 +7,7 @@
#include <linux/rbtree.h>
#include <linux/types.h>

+struct annotated_op_loc;
struct evsel;
struct map_symbol;

@@ -105,7 +106,7 @@ extern struct annotated_data_stat ann_data_stat;

/* Returns data type at the location (ip, reg, offset) */
struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
- int reg, int offset);
+ struct annotated_op_loc *loc);

/* Update type access histogram at the given offset */
int annotated_data_type__update_samples(struct annotated_data_type *adt,
@@ -119,7 +120,7 @@ void annotated_data_type__tree_delete(struct rb_root *root);

static inline struct annotated_data_type *
find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
- int reg __maybe_unused, int offset __maybe_unused)
+ struct annotated_op_loc *loc __maybe_unused)
{
return NULL;
}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 0ec42e85ca5c..3cdcdd449c86 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -3563,8 +3563,22 @@ static int extract_reg_offset(struct arch *arch, const char *str,
if (regname == NULL)
return -1;

- op_loc->reg = get_dwarf_regnum(regname, 0);
+ op_loc->reg1 = get_dwarf_regnum(regname, 0);
free(regname);
+
+ /* Get the second register */
+ if (op_loc->multi_regs) {
+ p = strchr(p + 1, arch->objdump.register_char);
+ if (p == NULL)
+ return -1;
+
+ regname = strdup(p);
+ if (regname == NULL)
+ return -1;
+
+ op_loc->reg2 = get_dwarf_regnum(regname, 0);
+ free(regname);
+ }
return 0;
}

@@ -3577,14 +3591,20 @@ static int extract_reg_offset(struct arch *arch, const char *str,
* Get detailed location info (register and offset) in the instruction.
* It needs both source and target operand and whether it accesses a
* memory location. The offset field is meaningful only when the
- * corresponding mem flag is set.
+ * corresponding mem flag is set. The reg2 field is meaningful only
+ * when multi_regs flag is set.
*
* Some examples on x86:
*
- * mov (%rax), %rcx # src_reg = rax, src_mem = 1, src_offset = 0
- * # dst_reg = rcx, dst_mem = 0
+ * mov (%rax), %rcx # src_reg1 = rax, src_mem = 1, src_offset = 0
+ * # dst_reg1 = rcx, dst_mem = 0
*
- * mov 0x18, %r8 # src_reg = -1, dst_reg = r8
+ * mov 0x18, %r8 # src_reg1 = -1, src_mem = 0
+ * # dst_reg1 = r8, dst_mem = 0
+ *
+ * mov %rsi, 8(%rbx,%rcx,4) # src_reg1 = rsi, src_mem = 0, dst_multi_regs = 0
+ * # dst_reg1 = rbx, dst_reg2 = rcx, dst_mem = 1
+ * # dst_multi_regs = 1, dst_offset = 8
*/
int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
struct annotated_insn_loc *loc)
@@ -3605,24 +3625,29 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,

for_each_insn_op_loc(loc, i, op_loc) {
const char *insn_str = ops->source.raw;
+ bool multi_regs = ops->source.multi_regs;

- if (i == INSN_OP_TARGET)
+ if (i == INSN_OP_TARGET) {
insn_str = ops->target.raw;
+ multi_regs = ops->target.multi_regs;
+ }

/* Invalidate the register by default */
- op_loc->reg = -1;
+ op_loc->reg1 = -1;
+ op_loc->reg2 = -1;

if (insn_str == NULL)
continue;

if (strchr(insn_str, arch->objdump.memory_ref_char)) {
op_loc->mem_ref = true;
+ op_loc->multi_regs = multi_regs;
extract_reg_offset(arch, insn_str, op_loc);
} else {
char *s = strdup(insn_str);

if (s) {
- op_loc->reg = get_dwarf_regnum(s, 0);
+ op_loc->reg1 = get_dwarf_regnum(s, 0);
free(s);
}
}
@@ -3771,7 +3796,7 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
/* Recalculate IP because of LOCK prefix or insn fusion */
ip = ms->sym->start + dl->al.offset;

- mem_type = find_data_type(ms, ip, op_loc->reg, op_loc->offset);
+ mem_type = find_data_type(ms, ip, op_loc);
if (mem_type)
istat->good++;
else
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index dba50762c6e8..d0ff677b460c 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -442,14 +442,18 @@ int annotate_check_args(void);

/**
* struct annotated_op_loc - Location info of instruction operand
- * @reg: Register in the operand
+ * @reg1: First register in the operand
+ * @reg2: Second register in the operand
* @offset: Memory access offset in the operand
* @mem_ref: Whether the operand accesses memory
+ * @multi_regs: Whether the second register is used
*/
struct annotated_op_loc {
- int reg;
+ int reg1;
+ int reg2;
int offset;
bool mem_ref;
+ bool multi_regs;
};

enum annotated_insn_ops {
--
2.43.0.381.gb435a96ce8-goog


2024-01-17 06:28:06

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 4/9] perf annotate-data: Add stack operation pseudo type

A typical function prologue and epilogue include multiple stack
operations to save and restore the current value of registers.
On x86, it looks like below:

push r15
push r14
push r13
push r12

...

pop r12
pop r13
pop r14
pop r15
ret

As these all touches the stack memory region, chances are high that they
appear in a memory profile data. But these are not used for any real
purpose yet so it'd return no types.

One of my profile type shows that non neglible portion of data came from
the stack operations. It also seems GCC generates more stack operations
than clang.

Annotate Instruction stats
total 264, ok 169 (64.0%), bad 95 (36.0%)

Name : Good Bad
-----------------------------------------------------------
movq : 49 27
movl : 24 9
popq : 0 19 <-- here
cmpl : 17 2
addq : 14 1
cmpq : 12 2
cmpxchgl : 3 7

Instead of dealing them as unknown, let's create a seperate pseudo type
to represent those stack operations separately.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate-data.h | 1 +
tools/perf/util/annotate.c | 26 ++++++++++++++++++++++++++
2 files changed, 27 insertions(+)

diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 65ddd839850f..214c625e7bc9 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -70,6 +70,7 @@ struct annotated_data_type {
};

extern struct annotated_data_type unknown_type;
+extern struct annotated_data_type stackop_type;

/**
* struct annotated_data_stat - Debug statistics
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 3cdcdd449c86..655bd9443f5e 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -107,6 +107,14 @@ static struct ins_ops ret_ops;
struct annotated_data_stat ann_data_stat;
LIST_HEAD(ann_insn_stat);

+/* Pseudo data types */
+struct annotated_data_type stackop_type = {
+ .self = {
+ .type_name = (char *)"(stack operation)",
+ .children = LIST_HEAD_INIT(stackop_type.self.children),
+ },
+};
+
static int arch__grow_instructions(struct arch *arch)
{
struct ins *new_instructions;
@@ -3724,6 +3732,18 @@ static struct annotated_item_stat *annotate_data_stat(struct list_head *head,
return istat;
}

+static bool is_stack_operation(struct arch *arch, struct disasm_line *dl)
+{
+ if (arch__is(arch, "x86")) {
+ if (!strncmp(dl->ins.name, "push", 4) ||
+ !strncmp(dl->ins.name, "pop", 3) ||
+ !strncmp(dl->ins.name, "ret", 3))
+ return true;
+ }
+
+ return false;
+}
+
/**
* hist_entry__get_data_type - find data type for given hist entry
* @he: hist entry
@@ -3789,6 +3809,12 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
return NULL;
}

+ if (is_stack_operation(arch, dl)) {
+ istat->good++;
+ he->mem_type_off = 0;
+ return &stackop_type;
+ }
+
for_each_insn_op_loc(&loc, i, op_loc) {
if (!op_loc->mem_ref)
continue;
--
2.43.0.381.gb435a96ce8-goog


2024-01-17 06:28:22

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 5/9] perf annotate-data: Handle PC-relative addressing

Extend find_data_type_die() to find data type from PC-relative address
using die_find_variable_by_addr(). Users need to pass the address for
the (global) variable.

The offset for the variable should be updated after finding the type
because the offset in the instruction is just to calcuate the address
for the variable. So it changed to pass a pointer to offset and renamed
it to 'poffset'.

First it searches variables in the CU DIE as it's likely that the global
variables are defined in the file level. And then it iterates the scope
DIEs to find a local (static) variable.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate-data.c | 56 ++++++++++++++++++++++-----------
1 file changed, 38 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index d1ceac7e2441..58c0fac42e9d 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -14,6 +14,7 @@
#include "debuginfo.h"
#include "debug.h"
#include "dso.h"
+#include "dwarf-regs.h"
#include "evsel.h"
#include "evlist.h"
#include "map.h"
@@ -193,7 +194,8 @@ static bool find_cu_die(struct debuginfo *di, u64 pc, Dwarf_Die *cu_die)
}

/* The type info will be saved in @type_die */
-static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
+static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
+ bool is_pointer)
{
Dwarf_Word size;

@@ -205,15 +207,18 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
}

/*
- * It expects a pointer type for a memory access.
- * Convert to a real type it points to.
+ * Usually it expects a pointer type for a memory access.
+ * Convert to a real type it points to. But global variables
+ * are accessed directly without a pointer.
*/
- if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
- dwarf_tag(type_die) != DW_TAG_array_type) ||
- die_get_real_type(type_die, type_die) == NULL) {
- pr_debug("no pointer or no type\n");
- ann_data_stat.no_typeinfo++;
- return -1;
+ if (is_pointer) {
+ if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
+ dwarf_tag(type_die) != DW_TAG_array_type) ||
+ die_get_real_type(type_die, type_die) == NULL) {
+ pr_debug("no pointer or no type\n");
+ ann_data_stat.no_typeinfo++;
+ return -1;
+ }
}

/* Get the size of the actual type */
@@ -234,7 +239,7 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset)
}

/* The result will be saved in @type_die */
-static int find_data_type_die(struct debuginfo *di, u64 pc,
+static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
struct annotated_op_loc *loc, Dwarf_Die *type_die)
{
Dwarf_Die cu_die, var_die;
@@ -250,21 +255,36 @@ static int find_data_type_die(struct debuginfo *di, u64 pc,
return -1;
}

- /* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
- nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
-
reg = loc->reg1;
offset = loc->offset;

+ if (reg == DWARF_REG_PC &&
+ die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
+ ret = check_variable(&var_die, type_die, offset,
+ /*is_pointer=*/false);
+ goto out;
+ }
+
+ /* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
+ nr_scopes = die_get_scopes(&cu_die, pc, &scopes);
+
retry:
/* Search from the inner-most scope to the outer */
for (i = nr_scopes - 1; i >= 0; i--) {
- /* Look up variables/parameters in this scope */
- if (!die_find_variable_by_reg(&scopes[i], pc, reg, &var_die))
- continue;
+ if (reg == DWARF_REG_PC) {
+ if (!die_find_variable_by_addr(&scopes[i], pc, addr,
+ &var_die, &offset))
+ continue;
+ } else {
+ /* Look up variables/parameters in this scope */
+ if (!die_find_variable_by_reg(&scopes[i], pc, reg,
+ &var_die))
+ continue;
+ }

/* Found a variable, see if it's correct */
- ret = check_variable(&var_die, type_die, offset);
+ ret = check_variable(&var_die, type_die, offset,
+ reg != DWARF_REG_PC);
goto out;
}

@@ -312,7 +332,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
* a file address for DWARF processing.
*/
pc = map__rip_2objdump(ms->map, ip);
- if (find_data_type_die(di, pc, loc, &type_die) < 0)
+ if (find_data_type_die(di, pc, 0, loc, &type_die) < 0)
goto out;

result = dso__findnew_data_type(dso, &type_die);
--
2.43.0.381.gb435a96ce8-goog


2024-01-17 06:28:42

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 6/9] perf annotate-data: Support global variables

Global variables are accessed using PC-relative address so it needs to
be handled separately. The PC-rel addressing is detected by using
DWARF_REG_PC. On x86, %rip register would be used.

The address can be calculated using the ip and offset in the
instruction. But it should start from the next instruction so add
calculate_pcrel_addr() to do it properly.

But global variables defined in a different file would only have a
declaration which doesn't include a location list. So it first tries
to get the type info using the address, and then looks up the variable
declarations using name. The name of global variables should be get
from the symbol table. The declaration would have the type info.

So extend find_var_type() to take both address and name for global
variables.

The stat is now looks like:

Annotate data type stats:
total 294, ok 153 (52.0%), bad 141 (48.0%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
61 : no_var
10 : no_typeinfo
8 : bad_offset

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate-data.c | 38 ++++++++++++++++------
tools/perf/util/annotate-data.h | 6 ++--
tools/perf/util/annotate.c | 57 +++++++++++++++++++++++++++++++--
tools/perf/util/annotate.h | 4 +++
4 files changed, 92 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 58c0fac42e9d..e375dd288f67 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -240,7 +240,8 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,

/* The result will be saved in @type_die */
static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
- struct annotated_op_loc *loc, Dwarf_Die *type_die)
+ const char *var_name, struct annotated_op_loc *loc,
+ Dwarf_Die *type_die)
{
Dwarf_Die cu_die, var_die;
Dwarf_Die *scopes = NULL;
@@ -258,11 +259,21 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
reg = loc->reg1;
offset = loc->offset;

- if (reg == DWARF_REG_PC &&
- die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
- ret = check_variable(&var_die, type_die, offset,
- /*is_pointer=*/false);
- goto out;
+ if (reg == DWARF_REG_PC) {
+ if (die_find_variable_by_addr(&cu_die, pc, addr, &var_die, &offset)) {
+ ret = check_variable(&var_die, type_die, offset,
+ /*is_pointer=*/false);
+ loc->offset = offset;
+ goto out;
+ }
+
+ if (var_name && die_find_variable_at(&cu_die, var_name, pc,
+ &var_die)) {
+ ret = check_variable(&var_die, type_die, 0,
+ /*is_pointer=*/false);
+ /* loc->offset will be updated by the caller */
+ goto out;
+ }
}

/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
@@ -285,6 +296,7 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
/* Found a variable, see if it's correct */
ret = check_variable(&var_die, type_die, offset,
reg != DWARF_REG_PC);
+ loc->offset = offset;
goto out;
}

@@ -306,13 +318,21 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
* @ms: map and symbol at the location
* @ip: instruction address of the memory access
* @loc: instruction operand location
+ * @addr: data address of the memory access
+ * @var_name: global variable name
*
* This functions searches the debug information of the binary to get the data
- * type it accesses. The exact location is expressed by (ip, reg, offset).
+ * type it accesses. The exact location is expressed by (@ip, reg, offset)
+ * for pointer variables or (@ip, @addr) for global variables. Note that global
+ * variables might update the @loc->offset after finding the start of the variable.
+ * If it cannot find a global variable by address, it tried to fine a declaration
+ * of the variable using @var_name. In that case, @loc->offset won't be updated.
+ *
* It return %NULL if not found.
*/
struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
- struct annotated_op_loc *loc)
+ struct annotated_op_loc *loc, u64 addr,
+ const char *var_name)
{
struct annotated_data_type *result = NULL;
struct dso *dso = map__dso(ms->map);
@@ -332,7 +352,7 @@ struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
* a file address for DWARF processing.
*/
pc = map__rip_2objdump(ms->map, ip);
- if (find_data_type_die(di, pc, 0, loc, &type_die) < 0)
+ if (find_data_type_die(di, pc, addr, var_name, loc, &type_die) < 0)
goto out;

result = dso__findnew_data_type(dso, &type_die);
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index 214c625e7bc9..1b0db8e8c40e 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -107,7 +107,8 @@ extern struct annotated_data_stat ann_data_stat;

/* Returns data type at the location (ip, reg, offset) */
struct annotated_data_type *find_data_type(struct map_symbol *ms, u64 ip,
- struct annotated_op_loc *loc);
+ struct annotated_op_loc *loc, u64 addr,
+ const char *var_name);

/* Update type access histogram at the given offset */
int annotated_data_type__update_samples(struct annotated_data_type *adt,
@@ -121,7 +122,8 @@ void annotated_data_type__tree_delete(struct rb_root *root);

static inline struct annotated_data_type *
find_data_type(struct map_symbol *ms __maybe_unused, u64 ip __maybe_unused,
- struct annotated_op_loc *loc __maybe_unused)
+ struct annotated_op_loc *loc __maybe_unused,
+ u64 addr __maybe_unused, const char *var_name __maybe_unused)
{
return NULL;
}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 655bd9443f5e..107b264fa41e 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -37,6 +37,7 @@
#include "util/sharded_mutex.h"
#include "arch/common.h"
#include "namespaces.h"
+#include "thread.h"
#include <regex.h>
#include <linux/bitops.h>
#include <linux/kernel.h>
@@ -3744,6 +3745,30 @@ static bool is_stack_operation(struct arch *arch, struct disasm_line *dl)
return false;
}

+u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset,
+ struct disasm_line *dl)
+{
+ struct annotation *notes;
+ struct disasm_line *next;
+ u64 addr;
+
+ notes = symbol__annotation(ms->sym);
+ /*
+ * PC-relative addressing starts from the next instruction address
+ * But the IP is for the current instruction. Since disasm_line
+ * doesn't have the instruction size, calculate it using the next
+ * disasm_line. If it's the last one, we can use symbol's end
+ * address directly.
+ */
+ if (&dl->al.node == notes->src->source.prev)
+ addr = ms->sym->end + offset;
+ else {
+ next = list_next_entry(dl, al.node);
+ addr = ip + (next->al.offset - dl->al.offset) + offset;
+ }
+ return map__rip_2objdump(ms->map, addr);
+}
+
/**
* hist_entry__get_data_type - find data type for given hist entry
* @he: hist entry
@@ -3763,7 +3788,9 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
struct annotated_op_loc *op_loc;
struct annotated_data_type *mem_type;
struct annotated_item_stat *istat;
- u64 ip = he->ip;
+ u64 ip = he->ip, addr = 0;
+ const char *var_name = NULL;
+ int var_offset;
int i;

ann_data_stat.total++;
@@ -3822,12 +3849,38 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
/* Recalculate IP because of LOCK prefix or insn fusion */
ip = ms->sym->start + dl->al.offset;

- mem_type = find_data_type(ms, ip, op_loc);
+ var_offset = op_loc->offset;
+
+ /* PC-relative addressing */
+ if (op_loc->reg1 == DWARF_REG_PC) {
+ struct addr_location al;
+ struct symbol *var;
+ u64 map_addr;
+
+ addr = annotate_calc_pcrel(ms, ip, op_loc->offset, dl);
+ /* Kernel symbols might be relocated */
+ map_addr = addr + map__reloc(ms->map);
+
+ addr_location__init(&al);
+ var = thread__find_symbol_fb(he->thread, he->cpumode,
+ map_addr, &al);
+ if (var) {
+ var_name = var->name;
+ /* Calculate type offset from the start of variable */
+ var_offset = map_addr - map__unmap_ip(al.map, var->start);
+ }
+ addr_location__exit(&al);
+ }
+
+ mem_type = find_data_type(ms, ip, op_loc, addr, var_name);
if (mem_type)
istat->good++;
else
istat->bad++;

+ if (mem_type && var_name)
+ op_loc->offset = var_offset;
+
if (symbol_conf.annotate_data_sample) {
annotated_data_type__update_samples(mem_type, evsel,
op_loc->offset,
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index d0ff677b460c..94435607c958 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -491,4 +491,8 @@ struct annotated_item_stat {
};
extern struct list_head ann_insn_stat;

+/* Calculate PC-relative address */
+u64 annotate_calc_pcrel(struct map_symbol *ms, u64 ip, int offset,
+ struct disasm_line *dl);
+
#endif /* __PERF_ANNOTATE_H */
--
2.43.0.381.gb435a96ce8-goog


2024-01-17 06:28:45

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 7/9] perf dwarf-aux: Add die_get_cfa()

The die_get_cfa() is to get frame base register and offset at the given
instruction address (pc). This info will be used to locate stack
variables which have location expression using DW_OP_fbreg.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/dwarf-aux.c | 68 ++++++++++++++++++++++++++++++++++++-
tools/perf/util/dwarf-aux.h | 15 ++++++++
2 files changed, 82 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 7aa5fee0da19..3d42a8613869 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1407,7 +1407,73 @@ Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die, Dwarf_Addr pc,
*offset = data.offset;
return result;
}
-#endif
+#endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */
+
+#ifdef HAVE_DWARF_CFI_SUPPORT
+static int reg_from_dwarf_op(Dwarf_Op *op)
+{
+ switch (op->atom) {
+ case DW_OP_reg0 ... DW_OP_reg31:
+ return op->atom - DW_OP_reg0;
+ case DW_OP_breg0 ... DW_OP_breg31:
+ return op->atom - DW_OP_breg0;
+ case DW_OP_regx:
+ case DW_OP_bregx:
+ return op->number;
+ default:
+ break;
+ }
+ return -1;
+}
+
+static int offset_from_dwarf_op(Dwarf_Op *op)
+{
+ switch (op->atom) {
+ case DW_OP_reg0 ... DW_OP_reg31:
+ case DW_OP_regx:
+ return 0;
+ case DW_OP_breg0 ... DW_OP_breg31:
+ return op->number;
+ case DW_OP_bregx:
+ return op->number2;
+ default:
+ break;
+ }
+ return -1;
+}
+
+/**
+ * die_get_cfa - Get frame base information
+ * @dwarf: a Dwarf info
+ * @pc: program address
+ * @preg: pointer for saved register
+ * @poffset: pointer for saved offset
+ *
+ * This function gets register and offset for CFA (Canonical Frame Address)
+ * by searching the CIE/FDE info. The CFA usually points to the start address
+ * of the current stack frame and local variables can be located using an offset
+ * from the CFA. The @preg and @poffset will be updated if it returns 0.
+ */
+int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset)
+{
+ Dwarf_CFI *cfi;
+ Dwarf_Frame *frame = NULL;
+ Dwarf_Op *ops = NULL;
+ size_t nops;
+
+ cfi = dwarf_getcfi(dwarf);
+ if (cfi == NULL)
+ return -1;
+
+ if (!dwarf_cfi_addrframe(cfi, pc, &frame) &&
+ !dwarf_frame_cfa(frame, &ops, &nops) && nops == 1) {
+ *preg = reg_from_dwarf_op(ops);
+ *poffset = offset_from_dwarf_op(ops);
+ return 0;
+ }
+ return -1;
+}
+#endif /* HAVE_DWARF_CFI_SUPPORT */

/*
* die_has_loclist - Check if DW_AT_location of @vr_die is a location list
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index 4e64caac6df8..f209f9162908 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -177,4 +177,19 @@ static inline Dwarf_Die *die_find_variable_by_addr(Dwarf_Die *sc_die __maybe_unu

#endif /* HAVE_DWARF_GETLOCATIONS_SUPPORT */

+#ifdef HAVE_DWARF_CFI_SUPPORT
+
+/* Get the frame base information from CFA */
+int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset);
+
+#else /* HAVE_DWARF_CFI_SUPPORT */
+
+static inline int die_get_cfa(Dwarf *dwarf __maybe_unused, u64 pc __maybe_unused,
+ int *preg __maybe_unused, int *poffset __maybe_unused)
+{
+ return -1;
+}
+
+#endif /* HAVE_DWARF_CFI_SUPPORT */
+
#endif /* _DWARF_AUX_H */
--
2.43.0.381.gb435a96ce8-goog


2024-01-17 06:29:09

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 8/9] perf annotate-data: Support stack variables

Local variables are allocated in the stack and the location list
should look like base register(s) and an offset. Extend the
die_find_variable_by_reg() to handle the following expressions

* DW_OP_breg{0..31}
* DW_OP_bregx
* DW_OP_fbreg

Ususally DWARF subprogram entries have frame base information and
use it to locate stack variable like below:

<2><43d1575>: Abbrev Number: 62 (DW_TAG_variable)
<43d1576> DW_AT_location : 2 byte block: 91 7c (DW_OP_fbreg: -4) <--- here
<43d1579> DW_AT_name : (indirect string, offset: 0x2c00c9): i
<43d157d> DW_AT_decl_file : 1
<43d157e> DW_AT_decl_line : 78
<43d157f> DW_AT_type : <0x43d19d7>

I found some differences on saving the frame base between gcc and clang.
The gcc uses the CFA to get the base so it needs to check the current
frame's CFI info. In this case, stack offset needs to be adjusted from
the start of the CFA.

<1><1bb8d>: Abbrev Number: 102 (DW_TAG_subprogram)
<1bb8e> DW_AT_name : (indirect string, offset: 0x74d41): kernel_init
<1bb92> DW_AT_decl_file : 2
<1bb92> DW_AT_decl_line : 1440
<1bb94> DW_AT_decl_column : 18
<1bb95> DW_AT_prototyped : 1
<1bb95> DW_AT_type : <0xcc>
<1bb99> DW_AT_low_pc : 0xffffffff81bab9e0
<1bba1> DW_AT_high_pc : 0x1b2
<1bba9> DW_AT_frame_base : 1 byte block: 9c (DW_OP_call_frame_cfa) <------ here
<1bbab> DW_AT_call_all_calls: 1
<1bbab> DW_AT_sibling : <0x1bf5a>

While clang sets it to a register directly and it can check the register
and offset in the instruction directly.

<1><43d1542>: Abbrev Number: 60 (DW_TAG_subprogram)
<43d1543> DW_AT_low_pc : 0xffffffff816a7c60
<43d154b> DW_AT_high_pc : 0x98
<43d154f> DW_AT_frame_base : 1 byte block: 56 (DW_OP_reg6 (rbp)) <---------- here
<43d1551> DW_AT_GNU_all_call_sites: 1
<43d1551> DW_AT_name : (indirect string, offset: 0x3bce91): foo
<43d1555> DW_AT_decl_file : 1
<43d1556> DW_AT_decl_line : 75
<43d1557> DW_AT_prototyped : 1
<43d1557> DW_AT_type : <0x43c7332>
<43d155b> DW_AT_external : 1

Also it needs to update the offset after finding the type like global
variables since the offset was from the frame base. Factor out
match_var_offset() to check global and local variables in the same way.

The type stats are improved too:

Annotate data type stats:
total 294, ok 160 (54.4%), bad 134 (45.6%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
51 : no_var
14 : no_typeinfo
7 : bad_offset

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate-data.c | 35 +++++++++++++--
tools/perf/util/dwarf-aux.c | 79 ++++++++++++++++++++++++---------
tools/perf/util/dwarf-aux.h | 3 ++
3 files changed, 93 insertions(+), 24 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index e375dd288f67..30c4d19fcf11 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -209,7 +209,7 @@ static int check_variable(Dwarf_Die *var_die, Dwarf_Die *type_die, int offset,
/*
* Usually it expects a pointer type for a memory access.
* Convert to a real type it points to. But global variables
- * are accessed directly without a pointer.
+ * and local variables are accessed directly without a pointer.
*/
if (is_pointer) {
if ((dwarf_tag(type_die) != DW_TAG_pointer_type &&
@@ -248,6 +248,9 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
int reg, offset;
int ret = -1;
int i, nr_scopes;
+ int fbreg = -1;
+ bool is_fbreg = false;
+ int fb_offset = 0;

/* Get a compile_unit for this address */
if (!find_cu_die(di, pc, &cu_die)) {
@@ -279,7 +282,33 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
/* Get a list of nested scopes - i.e. (inlined) functions and blocks. */
nr_scopes = die_get_scopes(&cu_die, pc, &scopes);

+ if (reg != DWARF_REG_PC && dwarf_hasattr(&scopes[0], DW_AT_frame_base)) {
+ Dwarf_Attribute attr;
+ Dwarf_Block block;
+
+ /* Check if the 'reg' is assigned as frame base register */
+ if (dwarf_attr(&scopes[0], DW_AT_frame_base, &attr) != NULL &&
+ dwarf_formblock(&attr, &block) == 0 && block.length == 1) {
+ switch (*block.data) {
+ case DW_OP_reg0 ... DW_OP_reg31:
+ fbreg = *block.data - DW_OP_reg0;
+ break;
+ case DW_OP_call_frame_cfa:
+ if (die_get_cfa(di->dbg, pc, &fbreg,
+ &fb_offset) < 0)
+ fbreg = -1;
+ break;
+ default:
+ break;
+ }
+ }
+ }
+
retry:
+ is_fbreg = (reg == fbreg);
+ if (is_fbreg)
+ offset = loc->offset - fb_offset;
+
/* Search from the inner-most scope to the outer */
for (i = nr_scopes - 1; i >= 0; i--) {
if (reg == DWARF_REG_PC) {
@@ -289,13 +318,13 @@ static int find_data_type_die(struct debuginfo *di, u64 pc, u64 addr,
} else {
/* Look up variables/parameters in this scope */
if (!die_find_variable_by_reg(&scopes[i], pc, reg,
- &var_die))
+ &offset, is_fbreg, &var_die))
continue;
}

/* Found a variable, see if it's correct */
ret = check_variable(&var_die, type_die, offset,
- reg != DWARF_REG_PC);
+ reg != DWARF_REG_PC && !is_fbreg);
loc->offset = offset;
goto out;
}
diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 3d42a8613869..7caf52fdc255 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1272,11 +1272,39 @@ struct find_var_data {
unsigned reg;
/* Access offset, set for global data */
int offset;
+ /* True if the current register is the frame base */
+ bool is_fbreg;
};

/* Max number of registers DW_OP_regN supports */
#define DWARF_OP_DIRECT_REGS 32

+static bool match_var_offset(Dwarf_Die *die_mem, struct find_var_data *data,
+ u64 addr_offset, u64 addr_type)
+{
+ Dwarf_Die type_die;
+ Dwarf_Word size;
+
+ if (addr_offset == addr_type) {
+ /* Update offset relative to the start of the variable */
+ data->offset = 0;
+ return true;
+ }
+
+ if (die_get_real_type(die_mem, &type_die) == NULL)
+ return false;
+
+ if (dwarf_aggregate_size(&type_die, &size) < 0)
+ return false;
+
+ if (addr_offset >= addr_type + size)
+ return false;
+
+ /* Update offset relative to the start of the variable */
+ data->offset = addr_offset - addr_type;
+ return true;
+}
+
/* Only checks direct child DIEs in the given scope. */
static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
{
@@ -1301,14 +1329,30 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
if (start > data->pc)
break;

+ /* Local variables accessed using frame base register */
+ if (data->is_fbreg && ops->atom == DW_OP_fbreg &&
+ data->offset >= (int)ops->number &&
+ match_var_offset(die_mem, data, data->offset, ops->number))
+ return DIE_FIND_CB_END;
+
/* Only match with a simple case */
if (data->reg < DWARF_OP_DIRECT_REGS) {
if (ops->atom == (DW_OP_reg0 + data->reg) && nops == 1)
return DIE_FIND_CB_END;
+
+ /* Local variables accessed by a register + offset */
+ if (ops->atom == (DW_OP_breg0 + data->reg) &&
+ match_var_offset(die_mem, data, data->offset, ops->number))
+ return DIE_FIND_CB_END;
} else {
if (ops->atom == DW_OP_regx && ops->number == data->reg &&
nops == 1)
return DIE_FIND_CB_END;
+
+ /* Local variables accessed by a register + offset */
+ if (ops->atom == DW_OP_bregx && data->reg == ops->number &&
+ match_var_offset(die_mem, data, data->offset, ops->number2))
+ return DIE_FIND_CB_END;
}
}
return DIE_FIND_CB_SIBLING;
@@ -1319,18 +1363,29 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
* @sc_die: a scope DIE
* @pc: the program address to find
* @reg: the register number to find
+ * @poffset: pointer to offset, will be updated for fbreg case
+ * @is_fbreg: boolean value if the current register is the frame base
* @die_mem: a buffer to save the resulting DIE
*
- * Find the variable DIE accessed by the given register.
+ * Find the variable DIE accessed by the given register. It'll update the @offset
+ * when the variable is in the stack.
*/
Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
+ int *poffset, bool is_fbreg,
Dwarf_Die *die_mem)
{
struct find_var_data data = {
.pc = pc,
.reg = reg,
+ .offset = *poffset,
+ .is_fbreg = is_fbreg,
};
- return die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem);
+ Dwarf_Die *result;
+
+ result = die_find_child(sc_die, __die_find_var_reg_cb, &data, die_mem);
+ if (result)
+ *poffset = data.offset;
+ return result;
}

/* Only checks direct child DIEs in the given scope */
@@ -1341,8 +1396,6 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
ptrdiff_t off = 0;
Dwarf_Attribute attr;
Dwarf_Addr base, start, end;
- Dwarf_Word size;
- Dwarf_Die type_die;
Dwarf_Op *ops;
size_t nops;

@@ -1359,24 +1412,8 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
if (data->addr < ops->number)
continue;

- if (data->addr == ops->number) {
- /* Update offset relative to the start of the variable */
- data->offset = 0;
+ if (match_var_offset(die_mem, data, data->addr, ops->number))
return DIE_FIND_CB_END;
- }
-
- if (die_get_real_type(die_mem, &type_die) == NULL)
- continue;
-
- if (dwarf_aggregate_size(&type_die, &size) < 0)
- continue;
-
- if (data->addr >= ops->number + size)
- continue;
-
- /* Update offset relative to the start of the variable */
- data->offset = data->addr - ops->number;
- return DIE_FIND_CB_END;
}
return DIE_FIND_CB_SIBLING;
}
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index f209f9162908..85dd527ae1f7 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -142,6 +142,7 @@ int die_get_var_range(Dwarf_Die *sp_die, Dwarf_Die *vr_die, struct strbuf *buf);

/* Find a variable saved in the 'reg' at given address */
Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die, Dwarf_Addr pc, int reg,
+ int *poffset, bool is_fbreg,
Dwarf_Die *die_mem);

/* Find a (global) variable located in the 'addr' */
@@ -161,6 +162,8 @@ static inline int die_get_var_range(Dwarf_Die *sp_die __maybe_unused,
static inline Dwarf_Die *die_find_variable_by_reg(Dwarf_Die *sc_die __maybe_unused,
Dwarf_Addr pc __maybe_unused,
int reg __maybe_unused,
+ int *poffset __maybe_unused,
+ bool is_fbreg __maybe_unused,
Dwarf_Die *die_mem __maybe_unused)
{
return NULL;
--
2.43.0.381.gb435a96ce8-goog


2024-01-17 06:29:13

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 9/9] perf dwarf-aux: Check allowed DWARF Ops

The DWARF location expression can be fairly complex and it'd be hard
to match it with the condition correctly. So let's be conservative
and only allow simple expressions. For now it just checks the first
operation in the list. The following operations looks ok:

* DW_OP_stack_value
* DW_OP_deref_size
* DW_OP_deref
* DW_OP_piece

To refuse complex (and unsupported) location expressions, add
check_allowed_ops() to compare the rest of the list. It seems earlier
result contained those unsupported expressions. For example, I found
some local struct variable is placed like below.

<2><43d1517>: Abbrev Number: 62 (DW_TAG_variable)
<43d1518> DW_AT_location : 15 byte block: 91 50 93 8 91 78 93 4 93 84 8 91 68 93 4
(DW_OP_fbreg: -48; DW_OP_piece: 8;
DW_OP_fbreg: -8; DW_OP_piece: 4;
DW_OP_piece: 1028;
DW_OP_fbreg: -24; DW_OP_piece: 4)

Another example is something like this.

0057c8be ffffffffffffffff ffffffff812109f0 (base address)
0057c8ce ffffffff812112b5 ffffffff812112c8 (DW_OP_breg3 (rbx): 0;
DW_OP_constu: 18446744073709551612;
DW_OP_and;
DW_OP_stack_value)

It should refuse them. After the change, the stat shows:

Annotate data type stats:
total 294, ok 158 (53.7%), bad 136 (46.3%)
-----------------------------------------------------------
30 : no_sym
32 : no_mem_ops
53 : no_var
14 : no_typeinfo
7 : bad_offset

Acked-by: Masami Hiramatsu (Google) <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/dwarf-aux.c | 44 +++++++++++++++++++++++++++++++++----
1 file changed, 40 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 7caf52fdc255..2791126069b4 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1305,6 +1305,34 @@ static bool match_var_offset(Dwarf_Die *die_mem, struct find_var_data *data,
return true;
}

+static bool check_allowed_ops(Dwarf_Op *ops, size_t nops)
+{
+ /* The first op is checked separately */
+ ops++;
+ nops--;
+
+ /*
+ * It needs to make sure if the location expression matches to the given
+ * register and offset exactly. Thus it rejects any complex expressions
+ * and only allows a few of selected operators that doesn't change the
+ * location.
+ */
+ while (nops) {
+ switch (ops->atom) {
+ case DW_OP_stack_value:
+ case DW_OP_deref_size:
+ case DW_OP_deref:
+ case DW_OP_piece:
+ break;
+ default:
+ return false;
+ }
+ ops++;
+ nops--;
+ }
+ return true;
+}
+
/* Only checks direct child DIEs in the given scope. */
static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
{
@@ -1332,25 +1360,31 @@ static int __die_find_var_reg_cb(Dwarf_Die *die_mem, void *arg)
/* Local variables accessed using frame base register */
if (data->is_fbreg && ops->atom == DW_OP_fbreg &&
data->offset >= (int)ops->number &&
+ check_allowed_ops(ops, nops) &&
match_var_offset(die_mem, data, data->offset, ops->number))
return DIE_FIND_CB_END;

/* Only match with a simple case */
if (data->reg < DWARF_OP_DIRECT_REGS) {
- if (ops->atom == (DW_OP_reg0 + data->reg) && nops == 1)
+ /* pointer variables saved in a register 0 to 31 */
+ if (ops->atom == (DW_OP_reg0 + data->reg) &&
+ check_allowed_ops(ops, nops))
return DIE_FIND_CB_END;

/* Local variables accessed by a register + offset */
if (ops->atom == (DW_OP_breg0 + data->reg) &&
+ check_allowed_ops(ops, nops) &&
match_var_offset(die_mem, data, data->offset, ops->number))
return DIE_FIND_CB_END;
} else {
+ /* pointer variables saved in a register 32 or above */
if (ops->atom == DW_OP_regx && ops->number == data->reg &&
- nops == 1)
+ check_allowed_ops(ops, nops))
return DIE_FIND_CB_END;

/* Local variables accessed by a register + offset */
if (ops->atom == DW_OP_bregx && data->reg == ops->number &&
+ check_allowed_ops(ops, nops) &&
match_var_offset(die_mem, data, data->offset, ops->number2))
return DIE_FIND_CB_END;
}
@@ -1412,7 +1446,8 @@ static int __die_find_var_addr_cb(Dwarf_Die *die_mem, void *arg)
if (data->addr < ops->number)
continue;

- if (match_var_offset(die_mem, data, data->addr, ops->number))
+ if (check_allowed_ops(ops, nops) &&
+ match_var_offset(die_mem, data, data->addr, ops->number))
return DIE_FIND_CB_END;
}
return DIE_FIND_CB_SIBLING;
@@ -1503,7 +1538,8 @@ int die_get_cfa(Dwarf *dwarf, u64 pc, int *preg, int *poffset)
return -1;

if (!dwarf_cfi_addrframe(cfi, pc, &frame) &&
- !dwarf_frame_cfa(frame, &ops, &nops) && nops == 1) {
+ !dwarf_frame_cfa(frame, &ops, &nops) &&
+ check_allowed_ops(ops, nops)) {
*preg = reg_from_dwarf_op(ops);
*poffset = offset_from_dwarf_op(ops);
return 0;
--
2.43.0.381.gb435a96ce8-goog


2024-01-18 16:37:11

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCHSET 0/9] perf tools: More updates on data type profiling (v4)

On Tue, Jan 16, 2024 at 10:27 PM Namhyung Kim <[email protected]> wrote:
>
> Hello,
>
> This is a continuation of the data type profiling series. Now the basic
> part (v3) which uses pointer variables is merged to the perf-tools-next
> tree. And this part is for memory accesses without pointers as well as
> small updates to handle some corner cases. Still mores to come to
> complete the original series.
>
> There's no change from the previous version. For background and usages,
> pleaes refer the posting of previous version [1] and a LWN article [2].
>
> Basically most memory accesses happen with pointers, but there are cases
> don't use pointers - direct accesses to global and local variables.
>
> Global variables are located in a static memory at a specific address.
> So the DWARF location expression for the global vairable would also have
> the static address. And it's common to access them using PC-relative
> addressing mode. Thus it needs a special handling for global variables.
>
> On the other hand, local variables are located in the stack which varies
> as program executes. So the local variables are accessed either by the
> (stack) frame pointer or (current) stack pointer. But sometimes DWARF
> location expression uses a frame base address (CFA) to specify location
> of local variables. So it may need to convert or normalize the location
> extracted from the instruction to match DWARF expression.
>
> Lastly, there are some cases DWARF location expressions end up having
> complex (or not straight-forward) location. In that case, it cannot
> simply match just the first expression with the instruction location.
> It'd be safer to reject them.
>
> The code is available at 'perf/data-profile-update-v4' branch in the tree
> below. The full version of the code is in 'perf/data-profile-v4' branch.
>
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
>
> Cc: Ben Woodard <[email protected]>
> Cc: Joe Mario <[email protected]>
> CC: Kees Cook <[email protected]>
> Cc: David Blaikie <[email protected]>
> Cc: Xu Liu <[email protected]>
> Cc: Kan Liang <[email protected]>
> Cc: Ravi Bangoria <[email protected]>
> Cc: Mark Wielaard <[email protected]>
> Cc: Jason Merrill <[email protected]>
> Cc: Jose E. Marchesi <[email protected]>
> Cc: William Huang <[email protected]>
>
> [1] https://lore.kernel.org/linux-perf-users/[email protected]/
> [2] https://lwn.net/Articles/955709/
>
>
> Namhyung Kim (9):
> perf annotate-data: Parse 'lock' prefix from llvm-objdump
> perf annotate-data: Handle macro fusion on x86
> perf annotate-data: Handle array style accesses
> perf annotate-data: Add stack operation pseudo type
> perf annotate-data: Handle PC-relative addressing
> perf annotate-data: Support global variables
> perf dwarf-aux: Add die_get_cfa()
> perf annotate-data: Support stack variables
> perf dwarf-aux: Check allowed DWARF Ops

Series:
Reviewed-by: Ian Rogers <[email protected]>

Thanks,
Ian

> tools/perf/util/annotate-data.c | 119 ++++++++++++++++----
> tools/perf/util/annotate-data.h | 8 +-
> tools/perf/util/annotate.c | 153 ++++++++++++++++++++++++--
> tools/perf/util/annotate.h | 12 +-
> tools/perf/util/dwarf-aux.c | 187 ++++++++++++++++++++++++++++----
> tools/perf/util/dwarf-aux.h | 18 +++
> 6 files changed, 439 insertions(+), 58 deletions(-)
>
>
> base-commit: d988c9f511af71a3445b6a4f3a2c67208ff8e480
> --
> 2.43.0.381.gb435a96ce8-goog
>

2024-01-22 20:38:07

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCHSET 0/9] perf tools: More updates on data type profiling (v4)

On Tue, 16 Jan 2024 22:26:48 -0800, Namhyung Kim wrote:
> This is a continuation of the data type profiling series. Now the basic
> part (v3) which uses pointer variables is merged to the perf-tools-next
> tree. And this part is for memory accesses without pointers as well as
> small updates to handle some corner cases. Still mores to come to
> complete the original series.
>
> There's no change from the previous version. For background and usages,
> pleaes refer the posting of previous version [1] and a LWN article [2].
>
> [...]

Applied to perf-tools-next, thanks!

Thanks,
Namhyung