2024-06-01 06:10:42

by Athira Rajeev

Subject: [PATCH V3 00/14] Add data type profiling support for powerpc

The patchset from Namhyung added support for data type profiling
in the perf tool. This made it possible to associate PMU samples with the
data types they refer to, using DWARF debug information. With upstream
perf, it is currently possible to run perf report or perf annotate to
view the data type information on x86.

The initial patchset posted here had the changes needed to enable data type
profiling support for powerpc.

https://lore.kernel.org/all/[email protected]/T/

Main changes were:
1. A powerpc instruction mnemonic table to associate load/store
instructions with move_ops, which is used to identify whether an
instruction is a memory access.
2. To get the register number and access offset from a given
instruction, the code uses fields from "struct arch" -> objdump.
Added an entry for powerpc here.
3. A get_arch_regnum to return the register number from the
register name string.
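
As an illustration of item 3, here is a minimal sketch of mapping a register name string to a register number. The helper name ppc_gpr_from_name is hypothetical and this is not the patch's get_arch_regnum; it only assumes GPR names of the form "r0".."r31", optionally prefixed with "%".

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not the patch's get_arch_regnum): map a powerpc
 * GPR name such as "r10" or "%r10" to its register number.
 * Returns -1 when the string is not a GPR name in the range r0..r31.
 */
static int ppc_gpr_from_name(const char *name)
{
	char *end;
	long n;

	if (name == NULL)
		return -1;
	if (*name == '%')
		name++;
	if (*name != 'r')
		return -1;
	name++;
	/* Require at least one digit so that "r" alone is rejected */
	if (*name < '0' || *name > '9')
		return -1;

	n = strtol(name, &end, 10);
	/* Reject trailing characters such as the ")" in "r9)" */
	if (*end != '\0' || n < 0 || n > 31)
		return -1;

	return (int)n;
}
```

The strict check on trailing characters means the caller has to strip punctuation like "(" and ")" first, which is one of the costs of working from disassembled text.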

But the approach used in the initial patchset parsed the disassembled
code, as the current perf tool implementation does.

Example: lwz r10,0(r9)

This line "lwz r10,0(r9)" is parsed to extract the instruction name,
register names and offset. Also, to find whether there is a memory
reference in the operands, the "memory_ref_char" field of objdump is used.
For x86, "(" is used as memory_ref_char to tackle instructions of the
form "mov (%rax), %rcx".
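
The text-based check described above can be sketched roughly as follows. The function name operands_have_mem_ref is hypothetical and the real perf code is more involved; the sketch only shows the idea of looking for the memory_ref_char in the operand string.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Simplified stand-in for the text-based check: an operand string is
 * treated as a memory reference when it contains the arch's
 * memory_ref_char. The function name is hypothetical.
 */
static bool operands_have_mem_ref(const char *operands, char memory_ref_char)
{
	if (operands == NULL || memory_ref_char == '\0')
		return false;

	return strchr(operands, memory_ref_char) != NULL;
}
```

With "(" as the character, the operands "r10,0(r9)" are classified as a memory reference, while the operands "r10,0,r19" of lwzx are not, even though lwzx is also a load.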

In case of powerpc, instructions using "(" are not the only memory
instructions. For example, the above instruction can also be of extended
form (X form) "lwzx r10,0,r19". In order to easily identify the instruction
category and extract the source/target registers, the second patchset added
support to use the raw instruction. With the raw instruction, macros are
added to extract the opcode and register fields.
Link to second patchset:
https://lore.kernel.org/all/[email protected]/

Example representation using --show-raw-insn in objdump gives result:

38 01 81 e8 ld r4,312(r1)

Here "38 01 81 e8" is the raw instruction representation. In powerpc,
this translates to instruction form: "ld RT,DS(RA)" and binary code
as:
_________________________________________
| 58  |  RT  |  RA  |       DS       |  |
-----------------------------------------
0     6      11     16               30 31
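
The field layout above can be decoded with shifts and masks. The following is a minimal sketch, assuming the four bytes shown by objdump are stored in little-endian order; the macro and function names are illustrative and are not the macros added by the patchset.

```c
#include <stdint.h>

/* Illustrative field extractors; these names are not the patch's macros. */
#define PPC_OPCODE(insn)	(((insn) >> 26) & 0x3f)	/* bits 0-5 */
#define PPC_RT(insn)		(((insn) >> 21) & 0x1f)	/* bits 6-10 */
#define PPC_RA(insn)		(((insn) >> 16) & 0x1f)	/* bits 11-15 */

/*
 * DS-form displacement: bits 16-29 with two zero bits appended,
 * sign-extended from 16 bits.
 */
static inline int32_t ppc_ds_offset(uint32_t insn)
{
	int32_t ds = (int32_t)(insn & 0xfffc);

	if (ds & 0x8000)
		ds -= 0x10000;
	return ds;
}

/* Assemble a 32-bit word from four bytes stored in little-endian order */
static inline uint32_t ppc_insn_from_le_bytes(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```

For the bytes "38 01 81 e8" this yields the word 0xe8810138, with opcode 58, RT 4, RA 1 and displacement 312, which matches "ld r4,312(r1)".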

The second patchset used "objdump" again to read the raw instruction.
But since there is no need to disassemble and the binary code can be read
directly from the DSO, the third patchset (i.e. this patchset) uses the
approach below. The approach preferred in powerpc to parse samples for data
type profiling in the V3 patchset is:
- Read directly from DSO using dso__data_read_offset
- If that fails for any case, fallback to using libcapstone
- If libcapstone is not supported, approach will use objdump
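
The ordering above amounts to trying each backend in turn and stopping at the first success. A minimal sketch of that control flow, with a hypothetical backend signature and stub backends standing in for the DSO read, libcapstone and objdump paths:

```c
#include <stddef.h>

/* Hypothetical backend signature: returns 0 on success, negative on failure */
typedef int (*disasm_backend_fn)(void *ctx);

/*
 * Try each backend in order and stop at the first success.
 * Returns the index of the backend that succeeded, or -1 if all failed.
 * A NULL entry models a backend that is not compiled in.
 */
static int disassemble_with_fallback(const disasm_backend_fn *backends,
				     size_t nr, void *ctx)
{
	for (size_t i = 0; i < nr; i++) {
		if (backends[i] && backends[i](ctx) == 0)
			return (int)i;
	}
	return -1;
}

/* Stubs used only to exercise the ordering logic */
static int stub_fail(void *ctx) { (void)ctx; return -1; }
static int stub_ok(void *ctx)   { (void)ctx; return 0; }
```

In the patchset itself the ordering is expressed inline in symbol__disassemble rather than through a table; the table form here is only for illustration.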

The patchset adds support to pick the opcode and reg fields from this
raw/binary instruction code. This approach came from review comments
by Segher Boessenkool and Christophe on the initial patchset.

Apart from that, instruction tracking is enabled for powerpc and
a support function is added to find variables defined as registers.
For example, in powerpc, the two registers below are
defined to represent variables:
1. r13: represents local_paca
register struct paca_struct *local_paca asm("r13");

2. r1: represents stack_pointer
register void *__stack_pointer asm("r1");

These are handled in this patchset.
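
One way to picture this handling is a small table from register number to the variable bound to it. This is an illustrative sketch only: the struct and function names are hypothetical, the patchset's own data structures may differ, and the name used for r1 is the one that appears in the perf report output of this cover letter.

```c
#include <stddef.h>

/* Illustrative table only; the patch's own data structures may differ. */
struct ppc_global_reg_var {
	int reg;		/* GPR number */
	const char *name;	/* variable bound to that register */
};

static const struct ppc_global_reg_var ppc_global_reg_vars[] = {
	{ 13, "local_paca" },			/* asm("r13") */
	{ 1,  "current_stack_pointer" },	/* asm("r1") */
};

/* Return the variable name bound to a register, or NULL if there is none */
static const char *ppc_global_reg_var_name(int reg)
{
	size_t nr = sizeof(ppc_global_reg_vars) / sizeof(ppc_global_reg_vars[0]);

	for (size_t i = 0; i < nr; i++) {
		if (ppc_global_reg_vars[i].reg == reg)
			return ppc_global_reg_vars[i].name;
	}
	return NULL;
}
```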

- Patch 1 rearranges the register state type structures into a header file
so that they can be referred to from other arch specific files
- Patch 2 makes instruction tracking a callback in "struct arch"
so that it can be implemented by other archs easily and defined in arch
specific files
- Patch 3 adds support to capture and parse raw instruction in powerpc
using the dso__data_read_offset utility
- Patch 4 adds logic to use objdump when doing default "perf
report" or "perf annotate", since that needs the disassembled instruction
- Patch 5 adds disasm_line__parse to parse raw instruction for powerpc
- Patch 6 updates parameters for reg extract functions to use raw
instruction on powerpc
- Patch 7 adds support to identify memory instructions of opcode 31 in
powerpc
- Patch 8 adds more instructions to support instruction tracking in powerpc
- Patches 9 and 10 handle instruction tracking for powerpc
- Patch 11 adds support to use libcapstone in powerpc
- Patches 12 and 13 handle support to find global register variables
- Patch 14 handles the insn-stat option for perf annotate

Note:
- There are remaining unknowns (26.8% bad in the run below), as seen in
the annotate Instruction stats.
- This patchset is not tested on powerpc32. As the next step of enhancements,
along with handling the remaining unknowns, the plan is to cover powerpc32
changes based on how testing goes.

With the current patchset:

./perf record -a -e mem-loads sleep 1
./perf report -s type,typeoff --hierarchy --group --stdio
./perf annotate --data-type --insn-stat

perf annotate logs:
==================

Annotate Instruction stats
total 609, ok 446 (73.2%), bad 163 (26.8%)

Name/opcode:            Good   Bad
-----------------------------------------------------------
58                  :    323    80
32                  :     49    43
34                  :     33    11
OP_31_XOP_LDX       :      8    20
40                  :     23     0
OP_31_XOP_LWARX     :      5     1
OP_31_XOP_LWZX      :      2     3
OP_31_XOP_LDARX     :      3     0
33                  :      0     2
OP_31_XOP_LBZX      :      0     1
OP_31_XOP_LWAX      :      0     1
OP_31_XOP_LHZX      :      0     1
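
The OP_31_XOP_* rows above are X-form instructions that share primary opcode 31 and are told apart by the extended opcode in bits 21-30. A minimal sketch of that classification follows; the macro, enum and function names are illustrative (not the patch's), and the extended opcode values are the Power ISA encodings for the loads listed in the table.

```c
#include <stdbool.h>
#include <stdint.h>

#define PPC_OPCODE(insn)	(((insn) >> 26) & 0x3f)	/* bits 0-5 */
#define PPC_31_XOP(insn)	(((insn) >> 1) & 0x3ff)	/* bits 21-30 */

/* Extended opcodes of the opcode 31 loads listed in the stats */
enum {
	XOP_LWARX = 20,
	XOP_LDX   = 21,
	XOP_LWZX  = 23,
	XOP_LDARX = 84,
	XOP_LBZX  = 87,
	XOP_LHZX  = 279,
	XOP_LWAX  = 341,
};

/* True when insn has primary opcode 31 and is one of the loads above */
static bool ppc_op31_is_listed_load(uint32_t insn)
{
	if (PPC_OPCODE(insn) != 31)
		return false;

	switch (PPC_31_XOP(insn)) {
	case XOP_LWARX:
	case XOP_LDX:
	case XOP_LWZX:
	case XOP_LDARX:
	case XOP_LBZX:
	case XOP_LHZX:
	case XOP_LWAX:
		return true;
	default:
		return false;
	}
}
```

For example, "lwzx r10,0,r19" encodes as 0x7d40982e and is classified as a load, while "add r3,r4,r5" (0x7c642a14) also has primary opcode 31 but is arithmetic, so it is not.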

perf report logs:
=================

Total Lost Samples: 0

Samples: 1K of event 'mem-loads'
Event count (approx.): 937238

Overhead  Data Type                  Data Type Offset
........  .........                  ................

  48.60%  (unknown)                  (unknown) +0 (no field)
  12.85%  long unsigned int          long unsigned int +0 (current_stack_pointer)
   4.68%  struct paca_struct         struct paca_struct +2312 (__current)
   4.57%  struct paca_struct         struct paca_struct +2354 (irq_soft_mask)
   2.69%  struct paca_struct         struct paca_struct +2808 (canary)
   2.68%  struct paca_struct         struct paca_struct +8 (paca_index)
   2.24%  struct paca_struct         struct paca_struct +48 (data_offset)
   1.41%  struct vm_fault            struct vm_fault +0 (vma)
   1.29%  struct task_struct         struct task_struct +276 (flags)
   1.03%  struct pt_regs             struct pt_regs +264 (user_regs.msr)
   0.90%  struct security_hook_list  struct security_hook_list +0 (list.next)
   0.76%  struct irq_desc            struct irq_desc +304 (irq_data.chip)
   0.76%  struct rq                  struct rq +2856 (cpu)

Thanks
Athira Rajeev

Changelog:
From v2->v3:
- Addressed review comments from Christophe and Namhyung for V2
- Changed the approach in powerpc to parse samples for data
type profiling to:
Read directly from DSO using dso__data_read_offset
If that fails for any case, fallback to using libcapstone
If libcapstone is not supported, approach will use objdump
- Include instructions with opcode as 31 and correctly categorize
them as memory or arithmetic instructions.
- Include more instructions for instruction tracking in powerpc

From v1->v2:
- Addressed suggestion from Christophe Leroy and Segher Boessenkool
to use the binary code (raw insn) to fetch opcode, register and
offset fields.
- Added support for instruction tracking in powerpc
- Find the register defined variables (r13 and r1, which point to
local_paca and current_stack_pointer in powerpc)

Athira Rajeev (14):
tools/perf: Move the data structures related to register type to
header file
tools/perf: Add "update_insn_state" callback function to handle arch
specific instruction tracking
tools/perf: Add support to capture and parse raw instruction in
powerpc using dso__data_read_offset utility
tools/perf: Use sort keys to determine whether to pick objdump to
disassemble
tools/perf: Add disasm_line__parse to parse raw instruction for
powerpc
tools/perf: Update parameters for reg extract functions to use raw
instruction on powerpc
tools/perf: Add support to identify memory instructions of opcode 31
in powerpc
tools/perf: Add some of the arithmetic instructions to support
instruction tracking in powerpc
tools/perf: Add more instructions for instruction tracking
tools/perf: Update instruction tracking for powerpc
tools/perf: Add support to use libcapstone in powerpc
tools/perf: Add support to find global register variables using
find_data_type_global_reg
tools/perf: Add support for global_die to capture name of variable in
case of register defined variable
tools/perf: Set instruction name to be used with insn-stat when using
raw instruction

tools/include/linux/string.h | 2 +
tools/lib/string.c | 13 +
.../perf/arch/powerpc/annotate/instructions.c | 260 +++++++++
tools/perf/arch/powerpc/util/dwarf-regs.c | 53 ++
tools/perf/arch/x86/annotate/instructions.c | 383 +++++++++++++
tools/perf/builtin-annotate.c | 4 +-
tools/perf/util/annotate-data.c | 519 +++---------------
tools/perf/util/annotate-data.h | 78 +++
tools/perf/util/annotate.c | 35 +-
tools/perf/util/annotate.h | 1 +
tools/perf/util/disasm.c | 442 ++++++++++++++-
tools/perf/util/disasm.h | 18 +-
tools/perf/util/dwarf-aux.c | 1 +
tools/perf/util/dwarf-aux.h | 1 +
tools/perf/util/include/dwarf-regs.h | 4 +
tools/perf/util/sort.c | 7 +-
16 files changed, 1364 insertions(+), 457 deletions(-)

--
2.43.0



2024-06-01 06:11:08

by Athira Rajeev

Subject: [PATCH V3 03/14] tools/perf: Add support to capture and parse raw instruction in powerpc using dso__data_read_offset utility

Add support to capture and parse raw instruction in powerpc.
Currently, the perf tool infrastructure uses two ways to disassemble
and understand the instruction. One is objdump and the other is
libcapstone.

Currently, the perf tool infrastructure uses the "--no-show-raw-insn" option
with "objdump" while disassembling. An example from powerpc with this option
for an instruction address is:

Snippet from:
objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>

c0000000010224b4: lwz r10,0(r9)

This line "lwz r10,0(r9)" is parsed to extract the instruction name,
register names and offset. Also, to find whether there is a memory
reference in the operands, the "memory_ref_char" field of objdump is used.
For x86, "(" is used as memory_ref_char to tackle instructions of the
form "mov (%rax), %rcx".

In case of powerpc, instructions using "(" are not the only memory
instructions. For example, the above instruction can also be of extended
form (X form) "lwzx r10,0,r19". In order to easily identify the instruction
category and extract the source/target registers, this patch adds support to
use the raw instruction for powerpc. The approach used is to read the raw
instruction directly from the DSO file using the "dso__data_read_offset"
utility, which is already implemented in the perf infrastructure in
"util/dso.c".

Example:

38 01 81 e8 ld r4,312(r1)

Here "38 01 81 e8" is the raw instruction representation. In powerpc,
this translates to instruction form: "ld RT,DS(RA)" and binary code
as:
_________________________________________
| 58  |  RT  |  RA  |       DS       |  |
-----------------------------------------
0     6      11     16               30 31

Function "symbol__disassemble_dso" is updated to read raw instruction
directly from DSO using dso__data_read_offset utility. In case of
above example, this captures:
line: 38 01 81 e8
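
The per-instruction capture described above can be sketched as a loop over the buffer in 4-byte steps. This is a simplified, standalone illustration and not the patch's symbol__disassemble_dso: the function name and the fixed 16-byte line slots are assumptions, and the word is formatted in host byte order.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Walk a buffer of raw code 4 bytes at a time and format each
 * instruction word as a hex string, one per line slot.
 * A trailing partial word is ignored.
 * Returns the number of words formatted.
 */
static size_t format_raw_insns(const uint8_t *buf, size_t len,
			       char lines[][16], size_t max_lines)
{
	size_t n = 0;

	for (size_t off = 0; off + 4 <= len && n < max_lines; off += 4, n++) {
		uint32_t word;

		/* memcpy avoids unaligned access; the value is in host byte order */
		memcpy(&word, buf + off, sizeof(word));
		snprintf(lines[n], 16, "%x", (unsigned int)word);
	}
	return n;
}
```

Because the word is taken in host byte order, the printed string depends on whether the host and the DSO share the same endianness; a cross-endian setup would need an explicit byte swap.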

Signed-off-by: Athira Rajeev <[email protected]>
---
tools/perf/util/disasm.c | 98 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 98 insertions(+)

diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index b5fe3a7508bb..89a9e4136c09 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -1586,6 +1586,91 @@ static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
}
#endif

+static int symbol__disassemble_dso(char *filename, struct symbol *sym,
+ struct annotate_args *args)
+{
+ struct annotation *notes = symbol__annotation(sym);
+ struct map *map = args->ms.map;
+ struct dso *dso = map__dso(map);
+ u64 start = map__rip_2objdump(map, sym->start);
+ u64 end = map__rip_2objdump(map, sym->end);
+ u64 len = end - start;
+ u64 offset;
+ int i, count;
+ u8 *buf = NULL;
+ char disasm_buf[512];
+ struct disasm_line *dl;
+ u32 *line;
+
+ /* Return if objdump is specified explicitly */
+ if (args->options->objdump_path)
+ return -1;
+
+ pr_debug("Reading raw instruction from : %s using dso__data_read_offset\n", filename);
+
+ buf = malloc(len);
+ if (buf == NULL)
+ goto err;
+
+ count = dso__data_read_offset(dso, NULL, sym->start, buf, len);
+
+ line = (u32 *)buf;
+
+ if ((u64)count != len)
+ goto err;
+
+ /* add the function address and name */
+ scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
+ start, sym->name);
+
+ args->offset = -1;
+ args->line = disasm_buf;
+ args->line_nr = 0;
+ args->fileloc = NULL;
+ args->ms.sym = sym;
+
+ dl = disasm_line__new(args);
+ if (dl == NULL)
+ goto err;
+
+ annotation_line__add(&dl->al, &notes->src->source);
+
+ /* Each raw instruction is 4 byte */
+ count = len/4;
+
+ for (i = 0, offset = 0; i < count; i++) {
+ args->offset = offset;
+ sprintf(args->line, "%x", line[i]);
+ dl = disasm_line__new(args);
+ if (dl == NULL)
+ goto err;
+
+ annotation_line__add(&dl->al, &notes->src->source);
+ offset += 4;
+ }
+
+ /* It failed in the middle */
+ if (offset != len) {
+ struct list_head *list = &notes->src->source;
+
+ /* Discard all lines and fallback to objdump */
+ while (!list_empty(list)) {
+ dl = list_first_entry(list, struct disasm_line, al.node);
+
+ list_del_init(&dl->al.node);
+ disasm_line__free(dl);
+ }
+ count = -1;
+ }
+
+out:
+ free(buf);
+ return count < 0 ? count : 0;
+
+err:
+ count = -1;
+ goto out;
+}
/*
* Possibly create a new version of line with tabs expanded. Returns the
* existing or new line, storage is updated if a new line is allocated. If
@@ -1710,6 +1795,19 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
strcpy(symfs_filename, tmp);
}

+ /*
+ * For powerpc data type profiling, use the dso__data_read_offset
+ * to read raw instruction directly and interpret the binary code
+ * to understand instructions and register fields. For sort keys as
+ * type and typeoff, disassemble to mnemonic notation is
+ * not required in case of powerpc.
+ */
+ if (arch__is(args->arch, "powerpc")) {
+ err = symbol__disassemble_dso(symfs_filename, sym, args);
+ if (err == 0)
+ goto out_remove_tmp;
+ }
+
#ifdef HAVE_LIBCAPSTONE_SUPPORT
err = symbol__disassemble_capstone(symfs_filename, sym, args);
if (err == 0)
--
2.43.0


2024-06-01 06:11:26

by Athira Rajeev

Subject: [PATCH V3 04/14] tools/perf: Use sort keys to determine whether to pick objdump to disassemble

perf annotate can be done in different ways. One way is to directly use
the "perf annotate" command; another way, to annotate a specific symbol, is
to run "perf report" and press "a" on the sample in UI mode. The approach
preferred in powerpc to parse samples for data type profiling is:
- Read directly from DSO using dso__data_read_offset
- If that fails for any case, fallback to using libcapstone
- If libcapstone is not supported, approach will use objdump

The above works well when perf report is invoked with only the sort keys for
data type, i.e. type and typeoff, because no instruction level
annotation is needed if only data type information is requested. For
annotating a sample, the "sym" sort key is also needed along with the type
and typeoff sort keys. And by default, invoking just "perf report" uses the
sort key "sym", which displays the symbol information.

With the approach change in powerpc, which first reads the DSO for the raw
instruction, "perf annotate" and "perf report" + the "a" key break since
instruction level disassembly is not done.

Snippet of result from perf report:

Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238
do_work /usr/bin/pmlogger [Percent: local period]
Percent│ ea230010
│ 3a550010
│ 3a600000

│ 38f60001
│ 39490008
│ 42400438
51.44 │ 81290008
│ 7d485378

Here, the raw instruction is displayed in the output instead of the human
readable annotated form.

One way to get the appropriate data is to specify "--objdump path", by
which code annotation will be done. But that changes the default
behaviour. To fix this breakage, check if the "sym" sort key is set. If so,
fall back and use the libcapstone/objdump way of disassembling the sample.
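
The check can be sketched as a small predicate over the sort order string. The function name want_raw_dso_read is hypothetical; the condition mirrors the one used in the diff of this patch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Decide whether the raw DSO read path should be tried: only when a
 * sort order was given and it does not contain "sym".
 */
static bool want_raw_dso_read(const char *sort_order)
{
	return sort_order != NULL && strstr(sort_order, "sym") == NULL;
}
```

Two properties follow from this form: when no sort order is given the raw path is skipped, and since strstr does a substring match, any sort key whose name contains "sym" also disables the raw path.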

With the changes and "perf report":

Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238
do_work /usr/bin/pmlogger [Percent: local period]
Percent│ ld r17,16(r3)
│ addi r18,r21,16
│ li r19,0

│ 8b0: rldicl r10,r10,63,33
│ addi r10,r10,1
│ mtctr r10
│ ↓ b 8e4
│ 8c0: addi r7,r22,1
│ addi r10,r9,8
│ ↓ bdz d00
51.44 │ lwz r9,8(r9)
│ mr r8,r10
│ cmpw r20,r9

Signed-off-by: Athira Rajeev <[email protected]>
---
tools/perf/util/disasm.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 89a9e4136c09..3cd187f08193 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -25,6 +25,7 @@
#include "srcline.h"
#include "symbol.h"
#include "util.h"
+#include "sort.h"

static regex_t file_lineno;

@@ -1803,9 +1804,11 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
* not required in case of powerpc.
*/
if (arch__is(args->arch, "powerpc")) {
- err = symbol__disassemble_dso(symfs_filename, sym, args);
- if (err == 0)
- goto out_remove_tmp;
+ if (sort_order && !strstr(sort_order, "sym")) {
+ err = symbol__disassemble_dso(symfs_filename, sym, args);
+ if (err == 0)
+ goto out_remove_tmp;
+ }
}

#ifdef HAVE_LIBCAPSTONE_SUPPORT
--
2.43.0


2024-06-01 06:11:41

by Athira Rajeev

Subject: [PATCH V3 02/14] tools/perf: Add "update_insn_state" callback function to handle arch specific instruction tracking

Add "update_insn_state" callback to "struct arch" to handle instruction
tracking. Currently, updating instruction state is handled by the static
function "update_insn_state_x86", which is defined in "annotate-data.c".
Make this a callback for the specific arch and move it to the arch specific
file "arch/x86/annotate/instructions.c". This will help to add helper
functions for other platforms in the file
"arch/<platform>/annotate/instructions.c" and make changes/updates
easier.

Define callback "update_insn_state" as part of "struct arch", and also make
some of the debug functions non-static so that they can be referenced from
other places.
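
The callback arrangement can be sketched with simplified stand-in types. None of the struct or function names below are perf's; they only show a per-arch function pointer that the generic code calls when it is set.

```c
#include <stddef.h>

/* Simplified stand-ins for perf's type_state and disasm_line */
struct insn_state { int updates; };
struct insn { const char *name; };

/* Stand-in for "struct arch" carrying a per-arch tracking callback */
struct arch_sketch {
	const char *name;
	void (*update_insn_state)(struct insn_state *state,
				  const struct insn *insn);
};

/* Example arch specific implementation: just counts tracked instructions */
static void update_insn_state_demo(struct insn_state *state,
				   const struct insn *insn)
{
	(void)insn;
	state->updates++;
}

/* Generic code: dispatch to the arch callback only when one is provided */
static void track_insn(const struct arch_sketch *arch,
		       struct insn_state *state, const struct insn *insn)
{
	if (arch->update_insn_state)
		arch->update_insn_state(state, insn);
}
```

An arch without instruction tracking leaves the pointer NULL and the generic path skips it, so adding a new arch only means providing one more function in its own file.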

Signed-off-by: Athira Rajeev <[email protected]>
---
tools/perf/arch/x86/annotate/instructions.c | 383 +++++++++++++++++++
tools/perf/util/annotate-data.c | 391 +-------------------
tools/perf/util/annotate-data.h | 23 ++
tools/perf/util/disasm.c | 2 +
tools/perf/util/disasm.h | 7 +
5 files changed, 423 insertions(+), 383 deletions(-)

diff --git a/tools/perf/arch/x86/annotate/instructions.c b/tools/perf/arch/x86/annotate/instructions.c
index 5cdf457f5cbe..715d8ce65f7f 100644
--- a/tools/perf/arch/x86/annotate/instructions.c
+++ b/tools/perf/arch/x86/annotate/instructions.c
@@ -206,3 +206,386 @@ static int x86__annotate_init(struct arch *arch, char *cpuid)
arch->initialized = true;
return err;
}
+
+#ifdef HAVE_DWARF_SUPPORT
+static void update_insn_state_x86(struct type_state *state,
+ struct data_loc_info *dloc, Dwarf_Die *cu_die,
+ struct disasm_line *dl)
+{
+ struct annotated_insn_loc loc;
+ struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
+ struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
+ struct type_state_reg *tsr;
+ Dwarf_Die type_die;
+ u32 insn_offset = dl->al.offset;
+ int fbreg = dloc->fbreg;
+ int fboff = 0;
+
+ if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
+ return;
+
+ if (ins__is_call(&dl->ins)) {
+ struct symbol *func = dl->ops.target.sym;
+
+ if (func == NULL)
+ return;
+
+ /* __fentry__ will preserve all registers */
+ if (!strcmp(func->name, "__fentry__"))
+ return;
+
+ pr_debug_dtp("call [%x] %s\n", insn_offset, func->name);
+
+ /* Otherwise invalidate caller-saved registers after call */
+ for (unsigned i = 0; i < ARRAY_SIZE(state->regs); i++) {
+ if (state->regs[i].caller_saved)
+ state->regs[i].ok = false;
+ }
+
+ /* Update register with the return type (if any) */
+ if (die_find_func_rettype(cu_die, func->name, &type_die)) {
+ tsr = &state->regs[state->ret_reg];
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ pr_debug_dtp("call [%x] return -> reg%d",
+ insn_offset, state->ret_reg);
+ pr_debug_type_name(&type_die, tsr->kind);
+ }
+ return;
+ }
+
+ if (!strncmp(dl->ins.name, "add", 3)) {
+ u64 imm_value = -1ULL;
+ int offset;
+ const char *var_name = NULL;
+ struct map_symbol *ms = dloc->ms;
+ u64 ip = ms->sym->start + dl->al.offset;
+
+ if (!has_reg_type(state, dst->reg1))
+ return;
+
+ tsr = &state->regs[dst->reg1];
+
+ if (src->imm)
+ imm_value = src->offset;
+ else if (has_reg_type(state, src->reg1) &&
+ state->regs[src->reg1].kind == TSR_KIND_CONST)
+ imm_value = state->regs[src->reg1].imm_value;
+ else if (src->reg1 == DWARF_REG_PC) {
+ u64 var_addr = annotate_calc_pcrel(dloc->ms, ip,
+ src->offset, dl);
+
+ if (get_global_var_info(dloc, var_addr,
+ &var_name, &offset) &&
+ !strcmp(var_name, "this_cpu_off") &&
+ tsr->kind == TSR_KIND_CONST) {
+ tsr->kind = TSR_KIND_PERCPU_BASE;
+ imm_value = tsr->imm_value;
+ }
+ }
+ else
+ return;
+
+ if (tsr->kind != TSR_KIND_PERCPU_BASE)
+ return;
+
+ if (get_global_var_type(cu_die, dloc, ip, imm_value, &offset,
+ &type_die) && offset == 0) {
+ /*
+ * This is not a pointer type, but it should be treated
+ * as a pointer.
+ */
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_POINTER;
+ tsr->ok = true;
+
+ pr_debug_dtp("add [%x] percpu %#"PRIx64" -> reg%d",
+ insn_offset, imm_value, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ return;
+ }
+
+ if (strncmp(dl->ins.name, "mov", 3))
+ return;
+
+ if (dloc->fb_cfa) {
+ u64 ip = dloc->ms->sym->start + dl->al.offset;
+ u64 pc = map__rip_2objdump(dloc->ms->map, ip);
+
+ if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0)
+ fbreg = -1;
+ }
+
+ /* Case 1. register to register or segment:offset to register transfers */
+ if (!src->mem_ref && !dst->mem_ref) {
+ if (!has_reg_type(state, dst->reg1))
+ return;
+
+ tsr = &state->regs[dst->reg1];
+ if (dso__kernel(map__dso(dloc->ms->map)) &&
+ src->segment == INSN_SEG_X86_GS && src->imm) {
+ u64 ip = dloc->ms->sym->start + dl->al.offset;
+ u64 var_addr;
+ int offset;
+
+ /*
+ * In kernel, %gs points to a per-cpu region for the
+ * current CPU. Access with a constant offset should
+ * be treated as a global variable access.
+ */
+ var_addr = src->offset;
+
+ if (var_addr == 40) {
+ tsr->kind = TSR_KIND_CANARY;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] stack canary -> reg%d\n",
+ insn_offset, dst->reg1);
+ return;
+ }
+
+ if (!get_global_var_type(cu_die, dloc, ip, var_addr,
+ &offset, &type_die) ||
+ !die_get_member_type(&type_die, offset, &type_die)) {
+ tsr->ok = false;
+ return;
+ }
+
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] this-cpu addr=%#"PRIx64" -> reg%d",
+ insn_offset, var_addr, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ return;
+ }
+
+ if (src->imm) {
+ tsr->kind = TSR_KIND_CONST;
+ tsr->imm_value = src->offset;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] imm=%#x -> reg%d\n",
+ insn_offset, tsr->imm_value, dst->reg1);
+ return;
+ }
+
+ if (!has_reg_type(state, src->reg1) ||
+ !state->regs[src->reg1].ok) {
+ tsr->ok = false;
+ return;
+ }
+
+ tsr->type = state->regs[src->reg1].type;
+ tsr->kind = state->regs[src->reg1].kind;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] reg%d -> reg%d",
+ insn_offset, src->reg1, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ /* Case 2. memory to register transers */
+ if (src->mem_ref && !dst->mem_ref) {
+ int sreg = src->reg1;
+
+ if (!has_reg_type(state, dst->reg1))
+ return;
+
+ tsr = &state->regs[dst->reg1];
+
+retry:
+ /* Check stack variables with offset */
+ if (sreg == fbreg) {
+ struct type_state_stack *stack;
+ int offset = src->offset - fboff;
+
+ stack = find_stack_state(state, offset);
+ if (stack == NULL) {
+ tsr->ok = false;
+ return;
+ } else if (!stack->compound) {
+ tsr->type = stack->type;
+ tsr->kind = stack->kind;
+ tsr->ok = true;
+ } else if (die_get_member_type(&stack->type,
+ offset - stack->offset,
+ &type_die)) {
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+ } else {
+ tsr->ok = false;
+ return;
+ }
+
+ pr_debug_dtp("mov [%x] -%#x(stack) -> reg%d",
+ insn_offset, -offset, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ /* And then dereference the pointer if it has one */
+ else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
+ state->regs[sreg].kind == TSR_KIND_TYPE &&
+ die_deref_ptr_type(&state->regs[sreg].type,
+ src->offset, &type_die)) {
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] %#x(reg%d) -> reg%d",
+ insn_offset, src->offset, sreg, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ /* Or check if it's a global variable */
+ else if (sreg == DWARF_REG_PC) {
+ struct map_symbol *ms = dloc->ms;
+ u64 ip = ms->sym->start + dl->al.offset;
+ u64 addr;
+ int offset;
+
+ addr = annotate_calc_pcrel(ms, ip, src->offset, dl);
+
+ if (!get_global_var_type(cu_die, dloc, ip, addr, &offset,
+ &type_die) ||
+ !die_get_member_type(&type_die, offset, &type_die)) {
+ tsr->ok = false;
+ return;
+ }
+
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] global addr=%"PRIx64" -> reg%d",
+ insn_offset, addr, dst->reg1);
+ pr_debug_type_name(&type_die, tsr->kind);
+ }
+ /* And check percpu access with base register */
+ else if (has_reg_type(state, sreg) &&
+ state->regs[sreg].kind == TSR_KIND_PERCPU_BASE) {
+ u64 ip = dloc->ms->sym->start + dl->al.offset;
+ u64 var_addr = src->offset;
+ int offset;
+
+ if (src->multi_regs) {
+ int reg2 = (sreg == src->reg1) ? src->reg2 : src->reg1;
+
+ if (has_reg_type(state, reg2) && state->regs[reg2].ok &&
+ state->regs[reg2].kind == TSR_KIND_CONST)
+ var_addr += state->regs[reg2].imm_value;
+ }
+
+ /*
+ * In kernel, %gs points to a per-cpu region for the
+ * current CPU. Access with a constant offset should
+ * be treated as a global variable access.
+ */
+ if (get_global_var_type(cu_die, dloc, ip, var_addr,
+ &offset, &type_die) &&
+ die_get_member_type(&type_die, offset, &type_die)) {
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ if (src->multi_regs) {
+ pr_debug_dtp("mov [%x] percpu %#x(reg%d,reg%d) -> reg%d",
+ insn_offset, src->offset, src->reg1,
+ src->reg2, dst->reg1);
+ } else {
+ pr_debug_dtp("mov [%x] percpu %#x(reg%d) -> reg%d",
+ insn_offset, src->offset, sreg, dst->reg1);
+ }
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ } else {
+ tsr->ok = false;
+ }
+ }
+ /* And then dereference the calculated pointer if it has one */
+ else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
+ state->regs[sreg].kind == TSR_KIND_POINTER &&
+ die_get_member_type(&state->regs[sreg].type,
+ src->offset, &type_die)) {
+ tsr->type = type_die;
+ tsr->kind = TSR_KIND_TYPE;
+ tsr->ok = true;
+
+ pr_debug_dtp("mov [%x] pointer %#x(reg%d) -> reg%d",
+ insn_offset, src->offset, sreg, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ /* Or try another register if any */
+ else if (src->multi_regs && sreg == src->reg1 &&
+ src->reg1 != src->reg2) {
+ sreg = src->reg2;
+ goto retry;
+ }
+ else {
+ int offset;
+ const char *var_name = NULL;
+
+ /* it might be per-cpu variable (in kernel) access */
+ if (src->offset < 0) {
+ if (get_global_var_info(dloc, (s64)src->offset,
+ &var_name, &offset) &&
+ !strcmp(var_name, "__per_cpu_offset")) {
+ tsr->kind = TSR_KIND_PERCPU_BASE;
+
+ pr_debug_dtp("mov [%x] percpu base reg%d\n",
+ insn_offset, dst->reg1);
+ }
+ }
+
+ tsr->ok = false;
+ }
+ }
+ /* Case 3. register to memory transfers */
+ if (!src->mem_ref && dst->mem_ref) {
+ if (!has_reg_type(state, src->reg1) ||
+ !state->regs[src->reg1].ok)
+ return;
+
+ /* Check stack variables with offset */
+ if (dst->reg1 == fbreg) {
+ struct type_state_stack *stack;
+ int offset = dst->offset - fboff;
+
+ tsr = &state->regs[src->reg1];
+
+ stack = find_stack_state(state, offset);
+ if (stack) {
+ /*
+ * The source register is likely to hold a type
+ * of member if it's a compound type. Do not
+ * update the stack variable type since we can
+ * get the member type later by using the
+ * die_get_member_type().
+ */
+ if (!stack->compound)
+ set_stack_state(stack, offset, tsr->kind,
+ &tsr->type);
+ } else {
+ findnew_stack_state(state, offset, tsr->kind,
+ &tsr->type);
+ }
+
+ pr_debug_dtp("mov [%x] reg%d -> -%#x(stack)",
+ insn_offset, src->reg1, -offset);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+ }
+ /*
+ * Ignore other transfers since it'd set a value in a struct
+ * and won't change the type.
+ */
+ }
+ /* Case 4. memory to memory transfers (not handled for now) */
+}
+#else /* HAVE_DWARF_SUPPORT */
+static void update_insn_state_x86(struct type_state *state __maybe_unused, struct data_loc_info *dloc __maybe_unused,
+ Dwarf_Die * cu_die __maybe_unused, struct disasm_line *dl __maybe_unused)
+{
+ return;
+}
+#endif
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index a4c7f98a75e3..7a48c3d72b89 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -39,7 +39,7 @@ do { \
pr_debug3(fmt, ##__VA_ARGS__); \
} while (0)

-static void pr_debug_type_name(Dwarf_Die *die, enum type_state_kind kind)
+void pr_debug_type_name(Dwarf_Die *die, enum type_state_kind kind)
{
struct strbuf sb;
char *str;
@@ -390,7 +390,7 @@ static int check_variable(struct data_loc_info *dloc, Dwarf_Die *var_die,
return 0;
}

-static struct type_state_stack *find_stack_state(struct type_state *state,
+struct type_state_stack *find_stack_state(struct type_state *state,
int offset)
{
struct type_state_stack *stack;
@@ -406,7 +406,7 @@ static struct type_state_stack *find_stack_state(struct type_state *state,
return NULL;
}

-static void set_stack_state(struct type_state_stack *stack, int offset, u8 kind,
+void set_stack_state(struct type_state_stack *stack, int offset, u8 kind,
Dwarf_Die *type_die)
{
int tag;
@@ -433,7 +433,7 @@ static void set_stack_state(struct type_state_stack *stack, int offset, u8 kind,
}
}

-static struct type_state_stack *findnew_stack_state(struct type_state *state,
+struct type_state_stack *findnew_stack_state(struct type_state *state,
int offset, u8 kind,
Dwarf_Die *type_die)
{
@@ -537,7 +537,7 @@ void global_var_type__tree_delete(struct rb_root *root)
}
}

-static bool get_global_var_info(struct data_loc_info *dloc, u64 addr,
+bool get_global_var_info(struct data_loc_info *dloc, u64 addr,
const char **var_name, int *var_offset)
{
struct addr_location al;
@@ -611,7 +611,7 @@ static void global_var__collect(struct data_loc_info *dloc)
}
}

-static bool get_global_var_type(Dwarf_Die *cu_die, struct data_loc_info *dloc,
+bool get_global_var_type(Dwarf_Die *cu_die, struct data_loc_info *dloc,
u64 ip, u64 var_addr, int *var_offset,
Dwarf_Die *type_die)
{
@@ -722,381 +722,6 @@ static void update_var_state(struct type_state *state, struct data_loc_info *dlo
}
}

-static void update_insn_state_x86(struct type_state *state,
- struct data_loc_info *dloc, Dwarf_Die *cu_die,
- struct disasm_line *dl)
-{
- struct annotated_insn_loc loc;
- struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
- struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
- struct type_state_reg *tsr;
- Dwarf_Die type_die;
- u32 insn_offset = dl->al.offset;
- int fbreg = dloc->fbreg;
- int fboff = 0;
-
- if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
- return;
-
- if (ins__is_call(&dl->ins)) {
- struct symbol *func = dl->ops.target.sym;
-
- if (func == NULL)
- return;
-
- /* __fentry__ will preserve all registers */
- if (!strcmp(func->name, "__fentry__"))
- return;
-
- pr_debug_dtp("call [%x] %s\n", insn_offset, func->name);
-
- /* Otherwise invalidate caller-saved registers after call */
- for (unsigned i = 0; i < ARRAY_SIZE(state->regs); i++) {
- if (state->regs[i].caller_saved)
- state->regs[i].ok = false;
- }
-
- /* Update register with the return type (if any) */
- if (die_find_func_rettype(cu_die, func->name, &type_die)) {
- tsr = &state->regs[state->ret_reg];
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- pr_debug_dtp("call [%x] return -> reg%d",
- insn_offset, state->ret_reg);
- pr_debug_type_name(&type_die, tsr->kind);
- }
- return;
- }
-
- if (!strncmp(dl->ins.name, "add", 3)) {
- u64 imm_value = -1ULL;
- int offset;
- const char *var_name = NULL;
- struct map_symbol *ms = dloc->ms;
- u64 ip = ms->sym->start + dl->al.offset;
-
- if (!has_reg_type(state, dst->reg1))
- return;
-
- tsr = &state->regs[dst->reg1];
-
- if (src->imm)
- imm_value = src->offset;
- else if (has_reg_type(state, src->reg1) &&
- state->regs[src->reg1].kind == TSR_KIND_CONST)
- imm_value = state->regs[src->reg1].imm_value;
- else if (src->reg1 == DWARF_REG_PC) {
- u64 var_addr = annotate_calc_pcrel(dloc->ms, ip,
- src->offset, dl);
-
- if (get_global_var_info(dloc, var_addr,
- &var_name, &offset) &&
- !strcmp(var_name, "this_cpu_off") &&
- tsr->kind == TSR_KIND_CONST) {
- tsr->kind = TSR_KIND_PERCPU_BASE;
- imm_value = tsr->imm_value;
- }
- }
- else
- return;
-
- if (tsr->kind != TSR_KIND_PERCPU_BASE)
- return;
-
- if (get_global_var_type(cu_die, dloc, ip, imm_value, &offset,
- &type_die) && offset == 0) {
- /*
- * This is not a pointer type, but it should be treated
- * as a pointer.
- */
- tsr->type = type_die;
- tsr->kind = TSR_KIND_POINTER;
- tsr->ok = true;
-
- pr_debug_dtp("add [%x] percpu %#"PRIx64" -> reg%d",
- insn_offset, imm_value, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- return;
- }
-
- if (strncmp(dl->ins.name, "mov", 3))
- return;
-
- if (dloc->fb_cfa) {
- u64 ip = dloc->ms->sym->start + dl->al.offset;
- u64 pc = map__rip_2objdump(dloc->ms->map, ip);
-
- if (die_get_cfa(dloc->di->dbg, pc, &fbreg, &fboff) < 0)
- fbreg = -1;
- }
-
- /* Case 1. register to register or segment:offset to register transfers */
- if (!src->mem_ref && !dst->mem_ref) {
- if (!has_reg_type(state, dst->reg1))
- return;
-
- tsr = &state->regs[dst->reg1];
- if (dso__kernel(map__dso(dloc->ms->map)) &&
- src->segment == INSN_SEG_X86_GS && src->imm) {
- u64 ip = dloc->ms->sym->start + dl->al.offset;
- u64 var_addr;
- int offset;
-
- /*
- * In kernel, %gs points to a per-cpu region for the
- * current CPU. Access with a constant offset should
- * be treated as a global variable access.
- */
- var_addr = src->offset;
-
- if (var_addr == 40) {
- tsr->kind = TSR_KIND_CANARY;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] stack canary -> reg%d\n",
- insn_offset, dst->reg1);
- return;
- }
-
- if (!get_global_var_type(cu_die, dloc, ip, var_addr,
- &offset, &type_die) ||
- !die_get_member_type(&type_die, offset, &type_die)) {
- tsr->ok = false;
- return;
- }
-
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] this-cpu addr=%#"PRIx64" -> reg%d",
- insn_offset, var_addr, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- return;
- }
-
- if (src->imm) {
- tsr->kind = TSR_KIND_CONST;
- tsr->imm_value = src->offset;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] imm=%#x -> reg%d\n",
- insn_offset, tsr->imm_value, dst->reg1);
- return;
- }
-
- if (!has_reg_type(state, src->reg1) ||
- !state->regs[src->reg1].ok) {
- tsr->ok = false;
- return;
- }
-
- tsr->type = state->regs[src->reg1].type;
- tsr->kind = state->regs[src->reg1].kind;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] reg%d -> reg%d",
- insn_offset, src->reg1, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- /* Case 2. memory to register transers */
- if (src->mem_ref && !dst->mem_ref) {
- int sreg = src->reg1;
-
- if (!has_reg_type(state, dst->reg1))
- return;
-
- tsr = &state->regs[dst->reg1];
-
-retry:
- /* Check stack variables with offset */
- if (sreg == fbreg) {
- struct type_state_stack *stack;
- int offset = src->offset - fboff;
-
- stack = find_stack_state(state, offset);
- if (stack == NULL) {
- tsr->ok = false;
- return;
- } else if (!stack->compound) {
- tsr->type = stack->type;
- tsr->kind = stack->kind;
- tsr->ok = true;
- } else if (die_get_member_type(&stack->type,
- offset - stack->offset,
- &type_die)) {
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
- } else {
- tsr->ok = false;
- return;
- }
-
- pr_debug_dtp("mov [%x] -%#x(stack) -> reg%d",
- insn_offset, -offset, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- /* And then dereference the pointer if it has one */
- else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
- state->regs[sreg].kind == TSR_KIND_TYPE &&
- die_deref_ptr_type(&state->regs[sreg].type,
- src->offset, &type_die)) {
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] %#x(reg%d) -> reg%d",
- insn_offset, src->offset, sreg, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- /* Or check if it's a global variable */
- else if (sreg == DWARF_REG_PC) {
- struct map_symbol *ms = dloc->ms;
- u64 ip = ms->sym->start + dl->al.offset;
- u64 addr;
- int offset;
-
- addr = annotate_calc_pcrel(ms, ip, src->offset, dl);
-
- if (!get_global_var_type(cu_die, dloc, ip, addr, &offset,
- &type_die) ||
- !die_get_member_type(&type_die, offset, &type_die)) {
- tsr->ok = false;
- return;
- }
-
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] global addr=%"PRIx64" -> reg%d",
- insn_offset, addr, dst->reg1);
- pr_debug_type_name(&type_die, tsr->kind);
- }
- /* And check percpu access with base register */
- else if (has_reg_type(state, sreg) &&
- state->regs[sreg].kind == TSR_KIND_PERCPU_BASE) {
- u64 ip = dloc->ms->sym->start + dl->al.offset;
- u64 var_addr = src->offset;
- int offset;
-
- if (src->multi_regs) {
- int reg2 = (sreg == src->reg1) ? src->reg2 : src->reg1;
-
- if (has_reg_type(state, reg2) && state->regs[reg2].ok &&
- state->regs[reg2].kind == TSR_KIND_CONST)
- var_addr += state->regs[reg2].imm_value;
- }
-
- /*
- * In kernel, %gs points to a per-cpu region for the
- * current CPU. Access with a constant offset should
- * be treated as a global variable access.
- */
- if (get_global_var_type(cu_die, dloc, ip, var_addr,
- &offset, &type_die) &&
- die_get_member_type(&type_die, offset, &type_die)) {
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- if (src->multi_regs) {
- pr_debug_dtp("mov [%x] percpu %#x(reg%d,reg%d) -> reg%d",
- insn_offset, src->offset, src->reg1,
- src->reg2, dst->reg1);
- } else {
- pr_debug_dtp("mov [%x] percpu %#x(reg%d) -> reg%d",
- insn_offset, src->offset, sreg, dst->reg1);
- }
- pr_debug_type_name(&tsr->type, tsr->kind);
- } else {
- tsr->ok = false;
- }
- }
- /* And then dereference the calculated pointer if it has one */
- else if (has_reg_type(state, sreg) && state->regs[sreg].ok &&
- state->regs[sreg].kind == TSR_KIND_POINTER &&
- die_get_member_type(&state->regs[sreg].type,
- src->offset, &type_die)) {
- tsr->type = type_die;
- tsr->kind = TSR_KIND_TYPE;
- tsr->ok = true;
-
- pr_debug_dtp("mov [%x] pointer %#x(reg%d) -> reg%d",
- insn_offset, src->offset, sreg, dst->reg1);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- /* Or try another register if any */
- else if (src->multi_regs && sreg == src->reg1 &&
- src->reg1 != src->reg2) {
- sreg = src->reg2;
- goto retry;
- }
- else {
- int offset;
- const char *var_name = NULL;
-
- /* it might be per-cpu variable (in kernel) access */
- if (src->offset < 0) {
- if (get_global_var_info(dloc, (s64)src->offset,
- &var_name, &offset) &&
- !strcmp(var_name, "__per_cpu_offset")) {
- tsr->kind = TSR_KIND_PERCPU_BASE;
-
- pr_debug_dtp("mov [%x] percpu base reg%d\n",
- insn_offset, dst->reg1);
- }
- }
-
- tsr->ok = false;
- }
- }
- /* Case 3. register to memory transfers */
- if (!src->mem_ref && dst->mem_ref) {
- if (!has_reg_type(state, src->reg1) ||
- !state->regs[src->reg1].ok)
- return;
-
- /* Check stack variables with offset */
- if (dst->reg1 == fbreg) {
- struct type_state_stack *stack;
- int offset = dst->offset - fboff;
-
- tsr = &state->regs[src->reg1];
-
- stack = find_stack_state(state, offset);
- if (stack) {
- /*
- * The source register is likely to hold a type
- * of member if it's a compound type. Do not
- * update the stack variable type since we can
- * get the member type later by using the
- * die_get_member_type().
- */
- if (!stack->compound)
- set_stack_state(stack, offset, tsr->kind,
- &tsr->type);
- } else {
- findnew_stack_state(state, offset, tsr->kind,
- &tsr->type);
- }
-
- pr_debug_dtp("mov [%x] reg%d -> -%#x(stack)",
- insn_offset, src->reg1, -offset);
- pr_debug_type_name(&tsr->type, tsr->kind);
- }
- /*
- * Ignore other transfers since it'd set a value in a struct
- * and won't change the type.
- */
- }
- /* Case 4. memory to memory transfers (not handled for now) */
-}
-
/**
* update_insn_state - Update type state for an instruction
* @state: type state table
@@ -1115,8 +740,8 @@ static void update_insn_state_x86(struct type_state *state,
static void update_insn_state(struct type_state *state, struct data_loc_info *dloc,
Dwarf_Die *cu_die, struct disasm_line *dl)
{
- if (arch__is(dloc->arch, "x86"))
- update_insn_state_x86(state, dloc, cu_die, dl);
+ if (dloc->arch->update_insn_state)
+ dloc->arch->update_insn_state(state, dloc, cu_die, dl);
}

/*
diff --git a/tools/perf/util/annotate-data.h b/tools/perf/util/annotate-data.h
index ef235b1b15e1..2bc870e61c74 100644
--- a/tools/perf/util/annotate-data.h
+++ b/tools/perf/util/annotate-data.h
@@ -7,6 +7,7 @@
#include <linux/rbtree.h>
#include <linux/types.h>
#include "dwarf-aux.h"
+#include "dwarf-regs.h"
#include "annotate.h"
#include "debuginfo.h"

@@ -18,6 +19,14 @@ struct hist_entry;
struct map_symbol;
struct thread;

+#define pr_debug_dtp(fmt, ...) \
+do { \
+ if (debug_type_profile) \
+ pr_info(fmt, ##__VA_ARGS__); \
+ else \
+ pr_debug3(fmt, ##__VA_ARGS__); \
+} while (0)
+
enum type_state_kind {
TSR_KIND_INVALID = 0,
TSR_KIND_TYPE,
@@ -215,6 +224,20 @@ void global_var_type__tree_delete(struct rb_root *root);
int hist_entry__annotate_data_tty(struct hist_entry *he, struct evsel *evsel);

bool has_reg_type(struct type_state *state, int reg);
+struct type_state_stack *findnew_stack_state(struct type_state *state,
+ int offset, u8 kind,
+ Dwarf_Die *type_die);
+void set_stack_state(struct type_state_stack *stack, int offset, u8 kind,
+ Dwarf_Die *type_die);
+struct type_state_stack *find_stack_state(struct type_state *state,
+ int offset);
+bool get_global_var_type(Dwarf_Die *cu_die, struct data_loc_info *dloc,
+ u64 ip, u64 var_addr, int *var_offset,
+ Dwarf_Die *type_die);
+bool get_global_var_info(struct data_loc_info *dloc, u64 addr,
+ const char **var_name, int *var_offset);
+void pr_debug_type_name(Dwarf_Die *die, enum type_state_kind kind);
+
#else /* HAVE_DWARF_SUPPORT */

static inline struct annotated_data_type *
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 72aec8f61b94..b5fe3a7508bb 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -12,6 +12,7 @@
#include <subcmd/run-command.h>

#include "annotate.h"
+#include "annotate-data.h"
#include "build-id.h"
#include "debug.h"
#include "disasm.h"
@@ -145,6 +146,7 @@ static struct arch architectures[] = {
.memory_ref_char = '(',
.imm_char = '$',
},
+ .update_insn_state = update_insn_state_x86,
},
{
.name = "powerpc",
diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
index 3d381a043520..718177fa4775 100644
--- a/tools/perf/util/disasm.h
+++ b/tools/perf/util/disasm.h
@@ -3,12 +3,16 @@
#define __PERF_UTIL_DISASM_H

#include "map_symbol.h"
+#include "dwarf-aux.h"

struct annotation_options;
struct disasm_line;
struct ins;
struct evsel;
struct symbol;
+struct data_loc_info;
+struct type_state;
+struct disasm_line;

struct arch {
const char *name;
@@ -32,6 +36,9 @@ struct arch {
char memory_ref_char;
char imm_char;
} objdump;
+ void (*update_insn_state)(struct type_state *state,
+ struct data_loc_info *dloc, Dwarf_Die *cu_die,
+ struct disasm_line *dl);
};

struct ins {
--
2.43.0


2024-06-01 06:12:16

by Athira Rajeev

Subject: [PATCH V3 07/14] tools/perf: Add support to identify memory instructions of opcode 31 in powerpc

Some memory instructions in powerpc have 31 as their primary opcode.
Example: "ldx RT,RA,RB". Its X form is as below:

______________________________________
| 31 | RT | RA | RB | 21 |/|
--------------------------------------
0 6 11 16 21 30 31

The opcode for "ldx" is 31. Other memory instructions such as ldux,
stbx, lwzx and lhaux also use opcode 31. But not all instructions with
opcode 31 are memory instructions. An example is the add instruction:
"add RT,RA,RB"

The value in bits 21-30 [ 21 for ldx ] differs between these
instructions. The patch uses this value to assign instruction ops for
these cases. The naming convention and the values used to identify them
are picked from the defines in "arch/powerpc/include/asm/ppc-opcode.h"
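
As an illustration only (not part of the patch), the check described
above can be sketched as a standalone C fragment. PPC_OP and PPC_21_30
match the macros added by the patch; the shortened table mem_xop_31 and
the helper is_mem_insn are hypothetical names used just for this sketch:

```c
#include <stdint.h>
#include <stdlib.h>

/* Same field extractors as in the patch (IBM bit numbering, bit 0 = MSB). */
#define PPC_OP(op)	(((op) >> 26) & 0x3F)
#define PPC_21_30(R)	(((R) >> 1) & 0x3ff)

/* Hypothetical, shortened table of X-form loads/stores, sorted by value. */
static const int mem_xop_31[] = {
	21,	/* ldx */
	23,	/* lwzx */
	53,	/* ldux */
	149,	/* stdx */
	151,	/* stwx */
};

static int cmp_int(const void *a, const void *b)
{
	return *(const int *)a - *(const int *)b;
}

/* Returns 1 if raw_insn is a memory instruction by the rules above. */
static int is_mem_insn(uint32_t raw_insn)
{
	int opcode = PPC_OP(raw_insn);
	int xop = PPC_21_30(raw_insn);

	if (opcode & 0x20)	/* primary opcodes 32 to 63 */
		return 1;
	if (opcode == 31)	/* X form: decide on bits 21:30 */
		return bsearch(&xop, mem_xop_31,
			       sizeof(mem_xop_31) / sizeof(mem_xop_31[0]),
			       sizeof(mem_xop_31[0]), cmp_int) != NULL;
	return 0;
}
```

With this sketch, is_mem_insn(0x7c64282a) ("ldx r3,r4,r5") returns 1
while is_mem_insn(0x7c642a14) ("add r3,r4,r5") returns 0, since bits
21:30 hold 21 and 266 respectively.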

Signed-off-by: Athira Rajeev <[email protected]>
---
.../perf/arch/powerpc/annotate/instructions.c | 107 +++++++++++++++++-
tools/perf/util/disasm.c | 4 +-
2 files changed, 108 insertions(+), 3 deletions(-)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index 10fea5e5cf4c..4ee959a24738 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -49,18 +49,121 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
return ops;
}

-#define PPC_OP(op) (((op) >> 26) & 0x3F)
+#define PPC_OP(op) (((op) >> 26) & 0x3F)
+#define PPC_21_30(R) (((R) >> 1) & 0x3ff)
+
+struct insn_offset {
+ const char *name;
+ int value;
+};
+
+/*
+ * There are memory instructions with opcode 31 which are
+ * of X Form, Example:
+ * ldx RT,RA,RB
+ * ______________________________________
+ * | 31 | RT | RA | RB | 21 |/|
+ * --------------------------------------
+ * 0 6 11 16 21 30 31
+ *
+ * But not all instructions with opcode 31 are memory insns.
+ * Example: add RT,RA,RB
+ *
+ * Use bits 21 to 30 to check memory insns with 31 as opcode.
+ * In ins_array below, for ldx instruction:
+ * name => OP_31_XOP_LDX
+ * value => 21
+ */
+
+static struct insn_offset ins_array[] = {
+ { .name = "OP_31_XOP_LXSIWZX", .value = 12, },
+ { .name = "OP_31_XOP_LWARX", .value = 20, },
+ { .name = "OP_31_XOP_LDX", .value = 21, },
+ { .name = "OP_31_XOP_LWZX", .value = 23, },
+ { .name = "OP_31_XOP_LDUX", .value = 53, },
+ { .name = "OP_31_XOP_LWZUX", .value = 55, },
+ { .name = "OP_31_XOP_LXSIWAX", .value = 76, },
+ { .name = "OP_31_XOP_LDARX", .value = 84, },
+ { .name = "OP_31_XOP_LBZX", .value = 87, },
+ { .name = "OP_31_XOP_LVX", .value = 103, },
+ { .name = "OP_31_XOP_LBZUX", .value = 119, },
+ { .name = "OP_31_XOP_STXSIWX", .value = 140, },
+ { .name = "OP_31_XOP_STDX", .value = 149, },
+ { .name = "OP_31_XOP_STWX", .value = 151, },
+ { .name = "OP_31_XOP_STDUX", .value = 181, },
+ { .name = "OP_31_XOP_STWUX", .value = 183, },
+ { .name = "OP_31_XOP_STBX", .value = 215, },
+ { .name = "OP_31_XOP_STVX", .value = 231, },
+ { .name = "OP_31_XOP_STBUX", .value = 247, },
+ { .name = "OP_31_XOP_LHZX", .value = 279, },
+ { .name = "OP_31_XOP_LHZUX", .value = 311, },
+ { .name = "OP_31_XOP_LXVDSX", .value = 332, },
+ { .name = "OP_31_XOP_LWAX", .value = 341, },
+ { .name = "OP_31_XOP_LHAX", .value = 343, },
+ { .name = "OP_31_XOP_LWAUX", .value = 373, },
+ { .name = "OP_31_XOP_LHAUX", .value = 375, },
+ { .name = "OP_31_XOP_STHX", .value = 407, },
+ { .name = "OP_31_XOP_STHUX", .value = 439, },
+ { .name = "OP_31_XOP_LXSSPX", .value = 524, },
+ { .name = "OP_31_XOP_LDBRX", .value = 532, },
+ { .name = "OP_31_XOP_LSWX", .value = 533, },
+ { .name = "OP_31_XOP_LWBRX", .value = 534, },
+ { .name = "OP_31_XOP_LFSUX", .value = 567, },
+ { .name = "OP_31_XOP_LXSDX", .value = 588, },
+ { .name = "OP_31_XOP_LSWI", .value = 597, },
+ { .name = "OP_31_XOP_LFDX", .value = 599, },
+ { .name = "OP_31_XOP_LFDUX", .value = 631, },
+ { .name = "OP_31_XOP_STXSSPX", .value = 652, },
+ { .name = "OP_31_XOP_STDBRX", .value = 660, },
+ { .name = "OP_31_XOP_STXWX", .value = 661, },
+ { .name = "OP_31_XOP_STWBRX", .value = 662, },
+ { .name = "OP_31_XOP_STFSX", .value = 663, },
+ { .name = "OP_31_XOP_STFSUX", .value = 695, },
+ { .name = "OP_31_XOP_STXSDX", .value = 716, },
+ { .name = "OP_31_XOP_STSWI", .value = 725, },
+ { .name = "OP_31_XOP_STFDX", .value = 727, },
+ { .name = "OP_31_XOP_STFDUX", .value = 759, },
+ { .name = "OP_31_XOP_LXVW4X", .value = 780, },
+ { .name = "OP_31_XOP_LHBRX", .value = 790, },
+ { .name = "OP_31_XOP_LXVD2X", .value = 844, },
+ { .name = "OP_31_XOP_LFIWAX", .value = 855, },
+ { .name = "OP_31_XOP_LFIWZX", .value = 887, },
+ { .name = "OP_31_XOP_STXVW4X", .value = 908, },
+ { .name = "OP_31_XOP_STHBRX", .value = 918, },
+ { .name = "OP_31_XOP_STXVD2X", .value = 972, },
+ { .name = "OP_31_XOP_STFIWX", .value = 983, },
+};
+
+static int cmp_offset(const void *a, const void *b)
+{
+ const struct insn_offset *val1 = a;
+ const struct insn_offset *val2 = b;
+
+ return (val1->value - val2->value);
+}

static struct ins_ops *check_ppc_insn(int raw_insn)
{
int opcode = PPC_OP(raw_insn);
+ int mem_insn_31 = PPC_21_30(raw_insn);
+ struct insn_offset *ret;
+ struct insn_offset mem_insns_31_opcode = {
+ "OP_31_INSN",
+ mem_insn_31
+ };

/*
* Instructions with opcode 32 to 63 are memory
* instructions in powerpc
*/
- if ((opcode & 0x20))
+ if ((opcode & 0x20)) {
return &load_store_ops;
+ } else if (opcode == 31) {
+ /* Check for memory instructions with opcode 31 */
+ ret = bsearch(&mem_insns_31_opcode, ins_array, ARRAY_SIZE(ins_array), sizeof(ins_array[0]), cmp_offset);
+ if (ret != NULL)
+ return &load_store_ops;
+ }

return NULL;
}
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 252cb0d1f5d1..16d1e3eaaeb3 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -693,7 +693,9 @@ static int load_store__parse(struct arch *arch __maybe_unused, struct ins_operan
ops->source.raw_insn = ops->raw_insn;
ops->source.mem_ref = true;
ops->source.opcode = ops->opcode;
- ops->source.multi_regs = false;
+ /* opcode 31 is of X form */
+ if (ops->source.opcode == 31)
+ ops->source.multi_regs = true;

if (!ops->source.raw_insn)
return -1;
--
2.43.0


2024-06-01 06:12:34

by Athira Rajeev

Subject: [PATCH V3 08/14] tools/perf: Add some of the arithmetic instructions to support instruction tracking in powerpc

Data type profiling has a concept of instruction tracking.
Example sequence in powerpc:

ld r10,264(r3)
mr r31,r3
<<after some sequence>>
ld r9,312(r31)

or differently

lwz r10,264(r3)
add r31, r3, RB
lwz r9, 0(r31)

If a sample is hit at "lwz r9, 0(r31)", the data type of r31 depends
on the preceding instruction sequence. To track those instructions,
the patch adds changes to identify some of the arithmetic instructions
that have opcode 31. Since some memory instructions also have opcode
31, bits 22:30 are used to filter the arithmetic instructions here.
There are also instructions with just two operands, like addme and
addze. The patch adds new instruction ops, "arithmetic_ops", to handle
these.

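A rough sketch of this filtering, for illustration only and not taken
from the patch: PPC_OP, PPC_21_30 and PPC_22_30 are the macros from the
patch, while XO_ADD, X_OR, is_add and is_or are hypothetical names for
the values 266 and 444 and for two small helpers:

```c
#include <stdint.h>

#define PPC_OP(op)	(((op) >> 26) & 0x3F)
#define PPC_21_30(R)	(((R) >> 1) & 0x3ff)
#define PPC_22_30(R)	(((R) >> 1) & 0x1ff)

#define XO_ADD	266	/* ADD_XO_FORM in the patch's table */
#define X_OR	444	/* "mr" is an alias of "or" with RS == RB */

/* XO form: bit 21 is the OE bit, so only bits 22:30 are compared. */
static int is_add(uint32_t raw_insn)
{
	return PPC_OP(raw_insn) == 31 && PPC_22_30(raw_insn) == XO_ADD;
}

/* X form: the full 10-bit field in bits 21:30 is compared. */
static int is_or(uint32_t raw_insn)
{
	return PPC_OP(raw_insn) == 31 && PPC_21_30(raw_insn) == X_OR;
}
```

Both "add r3,r4,r5" (0x7c642a14) and "addo r3,r4,r5" (0x7c642e14)
satisfy is_add(), because the OE bit is masked out by PPC_22_30.
"mr r31,r3" (0x7c7f1b78) satisfies is_or(). Note that the 9-bit mask can
alias X-form instructions, which is presumably why the patch checks the
memory instruction table first.
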
Signed-off-by: Athira Rajeev <[email protected]>
---
.../perf/arch/powerpc/annotate/instructions.c | 49 +++++++++++++++
tools/perf/util/disasm.c | 61 +++++++++++++++++++
2 files changed, 110 insertions(+)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index 4ee959a24738..bec8ab0ee18d 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -51,6 +51,7 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con

#define PPC_OP(op) (((op) >> 26) & 0x3F)
#define PPC_21_30(R) (((R) >> 1) & 0x3ff)
+#define PPC_22_30(R) (((R) >> 1) & 0x1ff)

struct insn_offset {
const char *name;
@@ -134,6 +135,44 @@ static struct insn_offset ins_array[] = {
{ .name = "OP_31_XOP_STFIWX", .value = 983, },
};

+/*
+ * Arithmetic instructions which have 31 as their opcode.
+ * These instructions are tracked to save the register state
+ * changes. Example:
+ *
+ * lwz r10,264(r3)
+ * add r31, r3, r3
+ * lwz r9, 0(r31)
+ *
+ * Here instruction tracking needs to identify the "add"
+ * instruction and save data type of r3 to r31. If a sample
+ * is hit at next "lwz r9, 0(r31)", by this instruction tracking,
+ * data type of r31 can be resolved.
+ */
+static struct insn_offset arithmetic_ins_op_31[] = {
+ { .name = "SUB_CARRY_XO_FORM", .value = 8, },
+ { .name = "MUL_HDW_XO_FORM1", .value = 9, },
+ { .name = "ADD_CARRY_XO_FORM", .value = 10, },
+ { .name = "MUL_HW_XO_FORM1", .value = 11, },
+ { .name = "SUB_XO_FORM", .value = 40, },
+ { .name = "MUL_HDW_XO_FORM", .value = 73, },
+ { .name = "MUL_HW_XO_FORM", .value = 75, },
+ { .name = "SUB_EXT_XO_FORM", .value = 136, },
+ { .name = "ADD_EXT_XO_FORM", .value = 138, },
+ { .name = "SUB_ZERO_EXT_XO_FORM", .value = 200, },
+ { .name = "ADD_ZERO_EXT_XO_FORM", .value = 202, },
+ { .name = "SUB_EXT_XO_FORM2", .value = 232, },
+ { .name = "MUL_DW_XO_FORM", .value = 233, },
+ { .name = "ADD_EXT_XO_FORM2", .value = 234, },
+ { .name = "MUL_W_XO_FORM", .value = 235, },
+ { .name = "ADD_XO_FORM", .value = 266, },
+ { .name = "DIV_DW_XO_FORM1", .value = 457, },
+ { .name = "DIV_W_XO_FORM1", .value = 459, },
+ { .name = "DIV_DW_XO_FORM", .value = 489, },
+ { .name = "DIV_W_XO_FORM", .value = 491, },
+};
+
+
static int cmp_offset(const void *a, const void *b)
{
const struct insn_offset *val1 = a;
@@ -163,6 +202,16 @@ static struct ins_ops *check_ppc_insn(int raw_insn)
ret = bsearch(&mem_insns_31_opcode, ins_array, ARRAY_SIZE(ins_array), sizeof(ins_array[0]), cmp_offset);
if (ret != NULL)
return &load_store_ops;
+ else {
+ mem_insns_31_opcode.value = PPC_22_30(raw_insn);
+ ret = bsearch(&mem_insns_31_opcode, arithmetic_ins_op_31, ARRAY_SIZE(arithmetic_ins_op_31),
+ sizeof(arithmetic_ins_op_31[0]), cmp_offset);
+ if (ret != NULL)
+ return &arithmetic_ops;
+ /* Bits 21 to 30 have the value 444 for the "mr" insn, i.e. OR X form */
+ if (PPC_21_30(raw_insn) == 444)
+ return &arithmetic_ops;
+ }
}

return NULL;
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 16d1e3eaaeb3..57af4dc42a58 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -38,6 +38,7 @@ static struct ins_ops nop_ops;
static struct ins_ops lock_ops;
static struct ins_ops ret_ops;
static struct ins_ops load_store_ops;
+static struct ins_ops arithmetic_ops;

static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name);
@@ -673,6 +674,66 @@ static struct ins_ops mov_ops = {
.scnprintf = mov__scnprintf,
};

+#define PPC_22_30(R) (((R) >> 1) & 0x1ff)
+#define MINUS_EXT_XO_FORM 234
+#define SUB_EXT_XO_FORM 232
+#define ADD_ZERO_EXT_XO_FORM 202
+#define SUB_ZERO_EXT_XO_FORM 200
+
+static int arithmetic__scnprintf(struct ins *ins, char *bf, size_t size,
+ struct ins_operands *ops, int max_ins_name)
+{
+ return scnprintf(bf, size, "%-*s %s", max_ins_name, ins->name,
+ ops->raw);
+}
+
+/*
+ * Sets the fields: "raw_insn", "opcode", "multi_regs" and "mem_ref".
+ * "mem_ref" is set for ops->source which is later used to
+ * fill the objdump->memory_ref_char field. This ops is currently
+ * used by powerpc and since binary instruction code is used to
+ * extract opcode, regs and offset, no other parsing is needed here.
+ *
+ * Don't set multi_regs for 4 cases since they have only one
+ * source operand:
+ * - Add to Minus One Extended XO-form ( Ex: addme, addmeo )
+ * - Subtract From Minus One Extended XO-form ( Ex: subfme )
+ * - Add to Zero Extended XO-form ( Ex: addze, addzeo )
+ * - Subtract From Zero Extended XO-form ( Ex: subfze )
+ */
+static int arithmetic__parse(struct arch *arch __maybe_unused, struct ins_operands *ops,
+ struct map_symbol *ms __maybe_unused)
+{
+ ops->source.raw_insn = ops->raw_insn;
+ ops->source.mem_ref = false;
+ ops->source.opcode = ops->opcode;
+ if (ops->source.opcode == 31) {
+ int opcode = PPC_22_30(ops->raw_insn); /* XO field, bits 22:30 */
+
+ if ((opcode != MINUS_EXT_XO_FORM) && (opcode != SUB_EXT_XO_FORM) \
+ && (opcode != ADD_ZERO_EXT_XO_FORM) && (opcode != SUB_ZERO_EXT_XO_FORM))
+ ops->source.multi_regs = true;
+ }
+
+ if (!ops->source.raw_insn)
+ return -1;
+
+ ops->target.raw_insn = ops->raw_insn;
+ ops->target.mem_ref = false;
+ ops->target.opcode = ops->opcode;
+ ops->target.multi_regs = false;
+
+ if (!ops->target.raw_insn)
+ return -1;
+
+ return 0;
+}
+
+static struct ins_ops arithmetic_ops = {
+ .parse = arithmetic__parse,
+ .scnprintf = arithmetic__scnprintf,
+};
+
static int load_store__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name)
{
--
2.43.0


2024-06-01 06:13:45

by Athira Rajeev

Subject: [PATCH V3 11/14] tools/perf: Add support to use libcapstone in powerpc

Perf now uses the capstone library to disassemble instructions on
x86. When available, capstone is used to speed up perf annotate.
Currently only the x86 architecture is supported. This patch includes
changes to enable it on powerpc. For now, this method is used only for
the data type sort keys, and only the binary code (raw instruction) is
read. This is because the powerpc approach to understanding
instructions and register fields uses the raw instruction. "cs_disasm"
is currently not enabled: while attempting to use cs_disasm, some
instructions were not identified (ex: extswsli, maddld) and it had to
fall back to objdump. Hence enabling "cs_disasm" is left as a TODO
comment for powerpc.
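
To make the "only binary code is read" part concrete, here is a small
standalone sketch, not the patch code itself. raw_insn_line is a
hypothetical helper that formats the i-th 32-bit word of a buffer the
same way the patch does with "%x". Like the patch, it assumes the buffer
is in host byte order:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the i-th 32-bit instruction word of buf
 * into out as hex text. Returns the byte offset of that word within
 * buf, or -1 if i is past the last complete word.
 */
static long raw_insn_line(const uint8_t *buf, size_t len, size_t i,
			  char *out, size_t out_sz)
{
	uint32_t word;

	if (i >= len / 4)	/* each instruction is 4 bytes */
		return -1;

	/* memcpy avoids unaligned access; value is in host byte order */
	memcpy(&word, buf + i * 4, sizeof(word));
	snprintf(out, out_sz, "%x", (unsigned int)word);
	return (long)(i * 4);
}
```

A caller would loop i from 0 to len/4 - 1 and create one disassembly
line per word, using the returned offset as the line offset.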

Signed-off-by: Athira Rajeev <[email protected]>
---
tools/perf/util/disasm.c | 148 ++++++++++++++++++++++++++++++++++++++-
1 file changed, 146 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index d8b357055302..915508d2e197 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -1540,12 +1540,18 @@ static int open_capstone_handle(struct annotate_args *args, bool is_64bit,
{
struct annotation_options *opt = args->options;
cs_mode mode = is_64bit ? CS_MODE_64 : CS_MODE_32;
+ int ret;

/* TODO: support more architectures */
- if (!arch__is(args->arch, "x86"))
+ if ((!arch__is(args->arch, "x86")) && (!arch__is(args->arch, "powerpc")))
return -1;

- if (cs_open(CS_ARCH_X86, mode, handle) != CS_ERR_OK)
+ if (arch__is(args->arch, "x86"))
+ ret = cs_open(CS_ARCH_X86, mode, handle);
+ else
+ ret = cs_open(CS_ARCH_PPC, mode, handle);
+
+ if (ret != CS_ERR_OK)
return -1;

if (!opt->disassembler_style ||
@@ -1635,6 +1641,139 @@ static void print_capstone_detail(cs_insn *insn, char *buf, size_t len,
}
}

+static int symbol__disassemble_capstone_powerpc(char *filename, struct symbol *sym,
+ struct annotate_args *args)
+{
+ struct annotation *notes = symbol__annotation(sym);
+ struct map *map = args->ms.map;
+ struct dso *dso = map__dso(map);
+ struct nscookie nsc;
+ u64 start = map__rip_2objdump(map, sym->start);
+ u64 end = map__rip_2objdump(map, sym->end);
+ u64 len = end - start;
+ u64 offset;
+ int i, fd, count;
+ bool is_64bit = false;
+ bool needs_cs_close = false;
+ u8 *buf = NULL;
+ struct find_file_offset_data data = {
+ .ip = start,
+ };
+ csh handle;
+ char disasm_buf[512];
+ struct disasm_line *dl;
+ u32 *line;
+
+ if (args->options->objdump_path)
+ return -1;
+
+ nsinfo__mountns_enter(dso->nsinfo, &nsc);
+ fd = open(filename, O_RDONLY);
+ nsinfo__mountns_exit(&nsc);
+ if (fd < 0)
+ return -1;
+
+ if (file__read_maps(fd, /*exe=*/true, find_file_offset, &data,
+ &is_64bit) == 0)
+ goto err;
+
+ if (open_capstone_handle(args, is_64bit, &handle) < 0)
+ goto err;
+
+ needs_cs_close = true;
+
+ buf = malloc(len);
+ if (buf == NULL)
+ goto err;
+
+ count = pread(fd, buf, len, data.offset);
+ close(fd);
+ fd = -1;
+
+ if ((u64)count != len)
+ goto err;
+
+ line = (u32 *)buf;
+
+ /* add the function address and name */
+ scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
+ start, sym->name);
+
+ args->offset = -1;
+ args->line = disasm_buf;
+ args->line_nr = 0;
+ args->fileloc = NULL;
+ args->ms.sym = sym;
+
+ dl = disasm_line__new(args);
+ if (dl == NULL)
+ goto err;
+
+ annotation_line__add(&dl->al, &notes->src->source);
+
+ /*
+ * TODO: enable disasm for powerpc
+ * count = cs_disasm(handle, buf, len, start, len, &insn);
+ *
+ * For now, only binary code is saved in disassembled line
+ * to be used in "type" and "typeoff" sort keys. Each raw code
+ * is 32 bit instruction. So use "len/4" to get the number of
+ * entries.
+ */
+ count = len/4;
+
+ for (i = 0, offset = 0; i < count; i++) {
+ args->offset = offset;
+ sprintf(args->line, "%x", line[i]);
+
+ dl = disasm_line__new(args);
+ if (dl == NULL)
+ goto err;
+
+ annotation_line__add(&dl->al, &notes->src->source);
+
+ offset += 4;
+ }
+
+ /* It failed in the middle */
+ if (offset != len) {
+ struct list_head *list = &notes->src->source;
+
+ /* Discard all lines and fallback to objdump */
+ while (!list_empty(list)) {
+ dl = list_first_entry(list, struct disasm_line, al.node);
+
+ list_del_init(&dl->al.node);
+ disasm_line__free(dl);
+ }
+ count = -1;
+ }
+
+out:
+ if (needs_cs_close)
+ cs_close(&handle);
+ free(buf);
+ return count < 0 ? count : 0;
+
+err:
+ if (fd >= 0)
+ close(fd);
+ if (needs_cs_close) {
+ struct disasm_line *tmp;
+
+ /*
+ * It probably failed in the middle of the above loop.
+ * Release any resources it might add.
+ */
+ list_for_each_entry_safe(dl, tmp, &notes->src->source, al.node) {
+ list_del(&dl->al.node);
+ free(dl);
+ }
+ }
+ count = -1;
+ goto out;
+}
+
static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
struct annotate_args *args)
{
@@ -1987,6 +2126,11 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
err = symbol__disassemble_dso(symfs_filename, sym, args);
if (err == 0)
goto out_remove_tmp;
+#ifdef HAVE_LIBCAPSTONE_SUPPORT
+ err = symbol__disassemble_capstone_powerpc(symfs_filename, sym, args);
+ if (err == 0)
+ goto out_remove_tmp;
+#endif
}
}

--
2.43.0


2024-06-01 06:16:01

by Athira Rajeev

Subject: [PATCH V3 12/14] tools/perf: Add support to find global register variables using find_data_type_global_reg

There are cases where a global register variable is defined and
associated with a specified register. For example, in powerpc, two
registers are defined to represent variables:
1. r13: represents local_paca
register struct paca_struct *local_paca asm("r13");

2. r1: represents stack_pointer
register void *__stack_pointer asm("r1");

These regs are present in the DWARF debug info as DW_OP_reg, as part of
the variables in the cu_die (compile unit). They are not found by the
die search done over the list of nested scopes since they are global
register variables.

Example for local_paca represented by r13:

<<>>
<1><18dc6b4>: Abbrev Number: 128 (DW_TAG_variable)
<18dc6b6> DW_AT_name : (indirect string, offset: 0x3861): local_paca
<18dc6ba> DW_AT_decl_file : 48
<18dc6bb> DW_AT_decl_line : 36
<18dc6bc> DW_AT_decl_column : 30
<18dc6bd> DW_AT_type : <0x18dc6c3>
<18dc6c1> DW_AT_external : 1
<18dc6c1> DW_AT_location : 1 byte block: 5d (DW_OP_reg13 (r13))

<1><18dc6c3>: Abbrev Number: 3 (DW_TAG_pointer_type)
<18dc6c4> DW_AT_byte_size : 8
<18dc6c4> DW_AT_type : <0x18dc353>

Where DW_AT_type : <0x18dc6c3> further points to :

<1><18dc6c3>: Abbrev Number: 3 (DW_TAG_pointer_type)
<18dc6c4> DW_AT_byte_size : 8
<18dc6c4> DW_AT_type : <0x18dc353>

which belongs to:

<1><18dc353>: Abbrev Number: 67 (DW_TAG_structure_type)
<18dc354> DW_AT_name : (indirect string, offset: 0x56cd): paca_struct
<18dc358> DW_AT_byte_size : 2944
<18dc35a> DW_AT_alignment : 128
<18dc35b> DW_AT_decl_file : 48
<18dc35c> DW_AT_decl_line : 61
<18dc35d> DW_AT_decl_column : 8
<18dc35d> DW_AT_sibling : <0x18dc6b4>
<<>>

The case is similar for "r1".

<<>>
<1><18dd772>: Abbrev Number: 129 (DW_TAG_variable)
<18dd774> DW_AT_name : (indirect string, offset: 0x11ba): current_stack_pointer
<18dd778> DW_AT_decl_file : 51
<18dd779> DW_AT_decl_line : 1468
<18dd77b> DW_AT_decl_column : 24
<18dd77c> DW_AT_type : <0x18da5cd>
<18dd780> DW_AT_external : 1
<18dd780> DW_AT_location : 1 byte block: 51 (DW_OP_reg1 (r1))

where 18da5cd is:

<1><18da5cd>: Abbrev Number: 47 (DW_TAG_base_type)
<18da5ce> DW_AT_byte_size : 8
<18da5cf> DW_AT_encoding : 7 (unsigned)
<18da5d0> DW_AT_name : (indirect string, offset: 0x55c7): long unsigned int
<<>>

To identify the data type for these two special cases, iterate over the
variables in the CU DIE (compile unit) and match each one against the
register. If the variable is a base type, i.e. die_get_real_type
returns NULL, set the offset to zero. With these changes, the data type
is identified as "paca_struct" for r13 and "long unsigned int" for r1.
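The matching step can be sketched in isolation as below. This is only an
illustration: "struct var_entry" and "find_global_reg_var" are
hypothetical stand-ins for perf's "struct die_var_type" list and the
lookup loop, and no libdw calls are involved.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for one CU-level variable. */
struct var_entry {
	struct var_entry *next;
	int reg;            /* register number taken from DW_OP_regN */
	const char *name;   /* DW_AT_name of the variable */
};

/*
 * Walk the CU-level variable list and return the entry bound to @reg,
 * or NULL when no global register variable uses that register.
 */
static const struct var_entry *
find_global_reg_var(const struct var_entry *vars, int reg)
{
	for (; vars; vars = vars->next) {
		if (vars->reg == reg)
			return vars;
	}
	return NULL;
}
```

In the patch itself the matched entry is then resolved to a type DIE;
the sketch stops at the register match.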

Snippet from ./perf report -s type,type_off

12.85% long unsigned int long unsigned int +0 (no field)
4.68% struct paca_struct struct paca_struct +2312 (__current)
4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask)

Signed-off-by: Athira Rajeev <[email protected]>
---
tools/perf/util/annotate-data.c | 40 ++++++++++++++++++++++++++++
tools/perf/util/annotate.c | 8 ++++++
tools/perf/util/annotate.h | 1 +
tools/perf/util/include/dwarf-regs.h | 1 +
4 files changed, 50 insertions(+)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 734acdd8c4b7..82232f2d8e16 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -1170,6 +1170,40 @@ static int find_data_type_block(struct data_loc_info *dloc,
return ret;
}

+/*
+ * Handle cases where a global register variable is defined and
+ * associated with a specific register. These regs are
+ * present in dwarf debug as DW_OP_reg as part of variables
+ * in the cu_die (compile unit). Iterate over variables in the
+ * cu_die and match with reg to identify data type die.
+ */
+static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_Die *cu_die,
+ Dwarf_Die *type_die)
+{
+ Dwarf_Die vr_die;
+ int ret = -1;
+ struct die_var_type *var_types = NULL;
+
+ die_collect_vars(cu_die, &var_types);
+ while (var_types) {
+ if (var_types->reg == reg) {
+ if (dwarf_offdie(dloc->di->dbg, var_types->die_off, &vr_die)) {
+ if (die_get_real_type(&vr_die, type_die) == NULL) {
+ dloc->type_offset = 0;
+ dwarf_offdie(dloc->di->dbg, var_types->die_off, type_die);
+ }
+ pr_debug_type_name(type_die, TSR_KIND_TYPE);
+ ret = 0;
+ pr_debug_dtp("found by CU for %s (die:%#lx)\n",
+ dwarf_diename(type_die), (long)dwarf_dieoffset(type_die));
+ }
+ break;
+ }
+ var_types = var_types->next;
+ }
+ return ret;
+}
+
/* The result will be saved in @type_die */
static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
{
@@ -1217,6 +1251,12 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
pr_debug_dtp("CU for %s (die:%#lx)\n",
dwarf_diename(&cu_die), (long)dwarf_dieoffset(&cu_die));

+ if (loc->reg_type == DWARF_REG_GLOBAL) {
+ ret = find_data_type_global_reg(dloc, reg, &cu_die, type_die);
+ if (!ret)
+ goto out;
+ }
+
if (reg == DWARF_REG_PC) {
if (get_global_var_type(&cu_die, dloc, dloc->ip, dloc->var_addr,
&offset, type_die)) {
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 2b8cc759ae35..ddb36dddb8cc 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2431,6 +2431,14 @@ struct annotated_data_type *hist_entry__get_data_type(struct hist_entry *he)
op_loc->reg1 = DWARF_REG_PC;
}

+ /* Global reg variable 13 and 1
+ * assign to DWARF_REG_GLOBAL
+ */
+ if (arch__is(arch, "powerpc")) {
+ if ((op_loc->reg1 == 13) || (op_loc->reg1 == 1))
+ op_loc->reg_type = DWARF_REG_GLOBAL;
+ }
+
mem_type = find_data_type(&dloc);

if (mem_type == NULL && is_stack_canary(arch, op_loc)) {
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index d5c821c22f79..43ae75d8356b 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -472,6 +472,7 @@ struct annotated_op_loc {
bool mem_ref;
bool multi_regs;
bool imm;
+ int reg_type;
};

enum annotated_insn_ops {
diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
index 7ea39362ecaf..a873c906a86b 100644
--- a/tools/perf/util/include/dwarf-regs.h
+++ b/tools/perf/util/include/dwarf-regs.h
@@ -5,6 +5,7 @@

#define DWARF_REG_PC 0xd3af9c /* random number */
#define DWARF_REG_FB 0xd3affb /* random number */
+#define DWARF_REG_GLOBAL 0xd3affc /* random number */

#ifdef HAVE_DWARF_SUPPORT
const char *get_arch_regstr(unsigned int n);
--
2.43.0


2024-06-01 07:16:41

by Athira Rajeev

[permalink] [raw]
Subject: [PATCH V3 09/14] tools/perf: Add more instructions for instruction tracking

Add a few more instructions and use the opcode as the search key to
find whether an instruction is supported by the architecture. The added
instructions are: addi, addic, addic., addis, subfic and mulli.
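A minimal sketch of the opcode-keyed lookup, assuming a table sorted by
primary opcode. The names "struct opcode_entry", "cmp_opcode" and
"lookup_two_op_arith" are made up for this illustration and are not the
ones used in the patch.

```c
#include <stdlib.h>

/* Hypothetical name/value entry, keyed by primary opcode (bits 0:5). */
struct opcode_entry {
	const char *name;
	int value;
};

/* The table must stay sorted by value for bsearch() to work. */
static const struct opcode_entry two_op_arith[] = {
	{ "mulli", 7 },   { "subfic", 8 }, { "addic", 12 },
	{ "addic.", 13 }, { "addi", 14 },  { "addis", 15 },
};

static int cmp_opcode(const void *a, const void *b)
{
	const struct opcode_entry *x = a, *y = b;

	return x->value - y->value;
}

/* Return the mnemonic when the primary opcode is in the table, else NULL. */
static const char *lookup_two_op_arith(unsigned int raw_insn)
{
	struct opcode_entry key = { NULL, (int)((raw_insn >> 26) & 0x3f) };
	const struct opcode_entry *hit;

	hit = bsearch(&key, two_op_arith,
		      sizeof(two_op_arith) / sizeof(two_op_arith[0]),
		      sizeof(two_op_arith[0]), cmp_opcode);
	return hit ? hit->name : NULL;
}
```

For example, 0x38630001 (addi r3,r3,1) has primary opcode 14 and is
found, while 0xe8810138 (ld r4,312(r1)) has primary opcode 58 and is
not.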

Signed-off-by: Athira Rajeev <[email protected]>
---
tools/perf/arch/powerpc/annotate/instructions.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index bec8ab0ee18d..db72148eb857 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -172,6 +172,14 @@ static struct insn_offset arithmetic_ins_op_31[] = {
{ .name = "DIV_W_XO_FORM", .value = 491, },
};

+static struct insn_offset arithmetic_two_ops[] = {
+ { .name = "mulli", .value = 7, },
+ { .name = "subfic", .value = 8, },
+ { .name = "addic", .value = 12, },
+ { .name = "addic.", .value = 13, },
+ { .name = "addi", .value = 14, },
+ { .name = "addis", .value = 15, },
+};

static int cmp_offset(const void *a, const void *b)
{
@@ -212,6 +220,12 @@ static struct ins_ops *check_ppc_insn(int raw_insn)
if (PPC_21_30(raw_insn) == 444)
return &arithmetic_ops;
}
+ } else {
+ mem_insns_31_opcode.value = opcode;
+ ret = bsearch(&mem_insns_31_opcode, arithmetic_two_ops, ARRAY_SIZE(arithmetic_two_ops),
+ sizeof(arithmetic_two_ops[0]), cmp_offset);
+ if (ret != NULL)
+ return &arithmetic_ops;
}

return NULL;
--
2.43.0


2024-06-01 07:21:08

by Athira Rajeev

[permalink] [raw]
Subject: [PATCH V3 05/14] tools/perf: Add disasm_line__parse to parse raw instruction for powerpc

Currently, the perf tool infrastructure uses the disasm_line__parse
function to parse a disassembled line.

Example snippet from objdump:
objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>

c0000000010224b4: lwz r10,0(r9)

This line "lwz r10,0(r9)" is parsed to extract the instruction name,
register names and offset. On powerpc, the approach for data type
profiling uses the raw instruction instead of the objdump result to
identify the instruction category and extract the source/target
registers.

Example: 38 01 81 e8 ld r4,312(r1)

Here "38 01 81 e8" is the raw instruction representation. Add function
"disasm_line__parse_powerpc" to handle parsing of raw instruction. Also
update "struct ins" and "struct ins_operands" to save "opcode" and
binary code. With the change, function captures:

line -> "38 01 81 e8 ld r4,312(r1)"
opcode and raw instruction "38 01 81 e8"
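A rough sketch of how the byte dump maps to the instruction word and
opcode. It assumes the four bytes are printed in the memory order of a
little-endian binary, as in the example above. "parse_raw_insn" is a
hypothetical helper, not the function added by the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Primary opcode: the top six bits of the 32-bit instruction word. */
#define PPC_OP(insn)	(((insn) >> 26) & 0x3f)

/*
 * Parse "b0 b1 b2 b3 ..." into a 32-bit instruction word, treating b0
 * as the least significant byte. Trailing text (the mnemonic) is
 * ignored. Returns 0 on success, -1 when four hex bytes are not found.
 */
static int parse_raw_insn(const char *line, uint32_t *insn)
{
	unsigned int b[4];

	if (sscanf(line, "%2x %2x %2x %2x", &b[0], &b[1], &b[2], &b[3]) != 4)
		return -1;

	*insn = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
		((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
	return 0;
}
```

With the line "38 01 81 e8 ld r4,312(r1)" this yields 0xe8810138, whose
primary opcode is 58 (ld).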

The raw instruction is used later to extract the reg/offset fields.
Macros are added to extract the opcode and register fields. "struct
ins_operands" and "struct ins" are updated to carry the opcode and the
raw instruction binary code (raw_insn). The function
"disasm_line__parse_powerpc" fills the raw instruction hex value and
opcode into the newly added fields. There are no changes to the
existing code paths that parse the disassembled code, so architectures
that use the instruction name and the present approach are not
affected. Since this approach targets powerpc, the macro implementation
is added only for powerpc for now.

Since disasm_line__parse is also used in other cases (perf annotate)
and not only for data type profiling, the powerpc callback includes
changes to work with binary code as well as the mnemonic
representation. Also, if the DSO read fails and libcapstone is not
supported, the approach falls back to using objdump. Hence the patch
also has changes to ensure the objdump option works well.

Signed-off-by: Athira Rajeev <[email protected]>
---
tools/include/linux/string.h | 2 +
tools/lib/string.c | 13 ++++
.../perf/arch/powerpc/annotate/instructions.c | 1 +
tools/perf/arch/powerpc/util/dwarf-regs.c | 9 +++
tools/perf/util/disasm.c | 63 ++++++++++++++++++-
tools/perf/util/disasm.h | 7 +++
6 files changed, 94 insertions(+), 1 deletion(-)

diff --git a/tools/include/linux/string.h b/tools/include/linux/string.h
index db5c99318c79..0acb1fc14e19 100644
--- a/tools/include/linux/string.h
+++ b/tools/include/linux/string.h
@@ -46,5 +46,7 @@ extern char * __must_check skip_spaces(const char *);

extern char *strim(char *);

+extern void remove_spaces(char *s);
+
extern void *memchr_inv(const void *start, int c, size_t bytes);
#endif /* _TOOLS_LINUX_STRING_H_ */
diff --git a/tools/lib/string.c b/tools/lib/string.c
index 8b6892f959ab..3126d2cff716 100644
--- a/tools/lib/string.c
+++ b/tools/lib/string.c
@@ -153,6 +153,19 @@ char *strim(char *s)
return skip_spaces(s);
}

+/*
+ * remove_spaces - Removes whitespaces from @s
+ */
+void remove_spaces(char *s)
+{
+ char *d = s;
+
+ do {
+ while (*d == ' ')
+ ++d;
+ } while ((*s++ = *d++));
+}
+
/**
* strreplace - Replace all occurrences of character in string.
* @s: The string to operate on.
diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index a3f423c27cae..d57fd023ef9c 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -55,6 +55,7 @@ static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
arch->initialized = true;
arch->associate_instruction_ops = powerpc__associate_instruction_ops;
arch->objdump.comment_char = '#';
+ annotate_opts.show_asm_raw = true;
}

return 0;
diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
index 0c4f4caf53ac..430623ca5612 100644
--- a/tools/perf/arch/powerpc/util/dwarf-regs.c
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -98,3 +98,12 @@ int regs_query_register_offset(const char *name)
return roff->ptregs_offset;
return -EINVAL;
}
+
+#define PPC_OP(op) (((op) >> 26) & 0x3F)
+#define PPC_RA(a) (((a) >> 16) & 0x1f)
+#define PPC_RT(t) (((t) >> 21) & 0x1f)
+#define PPC_RB(b) (((b) >> 11) & 0x1f)
+#define PPC_D(D) ((D) & 0xfffe)
+#define PPC_DS(DS) ((DS) & 0xfffc)
+#define OP_LD 58
+#define OP_STD 62
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 3cd187f08193..61f0f1656f82 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -45,6 +45,7 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t size,

static void ins__sort(struct arch *arch);
static int disasm_line__parse(char *line, const char **namep, char **rawp);
+static int disasm_line__parse_powerpc(struct disasm_line *dl);

static __attribute__((constructor)) void symbol__init_regexpr(void)
{
@@ -844,6 +845,63 @@ static int disasm_line__parse(char *line, const char **namep, char **rawp)
return -1;
}

+/*
+ * Parses the result captured from symbol__disassemble_*
+ * Example, line read from DSO file in powerpc:
+ * line: 38 01 81 e8
+ * opcode: fetched from arch specific get_opcode_insn
+ * rawp_insn: e8810138
+ *
+ * rawp_insn is used later to extract the reg/offset fields
+ */
+#define PPC_OP(op) (((op) >> 26) & 0x3F)
+
+static int disasm_line__parse_powerpc(struct disasm_line *dl)
+{
+ char *line = dl->al.line;
+ const char **namep = &dl->ins.name;
+ char **rawp = &dl->ops.raw;
+ char tmp, *tmp_opcode, *name_opcode = skip_spaces(line);
+ char *name = skip_spaces(name_opcode + 11);
+ int objdump = 0;
+
+ if (strlen(line) > 11)
+ objdump = 1;
+
+ if (name_opcode[0] == '\0')
+ return -1;
+
+ if (objdump) {
+ *rawp = name + 1;
+ while ((*rawp)[0] != '\0' && !isspace((*rawp)[0]))
+ ++*rawp;
+ tmp = (*rawp)[0];
+ (*rawp)[0] = '\0';
+
+ *namep = strdup(name);
+ if (*namep == NULL)
+ return -1;
+
+ (*rawp)[0] = tmp;
+ *rawp = strim(*rawp);
+ } else
+ *namep = "";
+
+ tmp_opcode = strdup(name_opcode);
+ tmp_opcode[11] = '\0';
+ remove_spaces(tmp_opcode);
+
+ dl->ins.opcode = PPC_OP(strtol(tmp_opcode, NULL, 16));
+ if (objdump)
+ dl->ins.opcode = PPC_OP(be32_to_cpu(strtol(tmp_opcode, NULL, 16)));
+ dl->ops.opcode = dl->ins.opcode;
+
+ dl->ops.raw_insn = strtol(tmp_opcode, NULL, 16);
+ if (objdump)
+ dl->ops.raw_insn = be32_to_cpu(strtol(tmp_opcode, NULL, 16));
+ return 0;
+}
+
static void annotation_line__init(struct annotation_line *al,
struct annotate_args *args,
int nr)
@@ -897,7 +955,10 @@ struct disasm_line *disasm_line__new(struct annotate_args *args)
goto out_delete;

if (args->offset != -1) {
- if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
+ if (arch__is(args->arch, "powerpc")) {
+ if (disasm_line__parse_powerpc(dl) < 0)
+ goto out_free_line;
+ } else if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
goto out_free_line;

disasm_line__init_ins(dl, args->arch, &args->ms);
diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
index 718177fa4775..a391e1bb81f7 100644
--- a/tools/perf/util/disasm.h
+++ b/tools/perf/util/disasm.h
@@ -43,14 +43,19 @@ struct arch {

struct ins {
const char *name;
+ int opcode;
struct ins_ops *ops;
};

struct ins_operands {
char *raw;
+ int raw_insn;
+ int opcode;
struct {
char *raw;
char *name;
+ int opcode;
+ int raw_insn;
struct symbol *sym;
u64 addr;
s64 offset;
@@ -62,6 +67,8 @@ struct ins_operands {
struct {
char *raw;
char *name;
+ int opcode;
+ int raw_insn;
u64 addr;
bool multi_regs;
} source;
--
2.43.0


2024-06-01 08:15:00

by Athira Rajeev

[permalink] [raw]
Subject: [PATCH V3 10/14] tools/perf: Update instruction tracking for powerpc

Add instruction tracking function "update_insn_state_powerpc" for
powerpc. Example sequence in powerpc:

ld r10,264(r3)
mr r31,r3
<<after some sequence>>
ld r9,312(r31)

Consider that the sample points to "ld r9,312(r31)". Here the memory
reference is hit at "312(r31)", where 312 is the offset and r31 is the
source register. The previous instruction sequence shows that the
register state of r3 is moved to r31. So to identify the data type for
the r31 access, the previous instruction ("mr") needs to be tracked and
the state type entry has to be updated. The current instruction
tracking support in the perf tools infrastructure is specific to x86.
This patch adds the support for powerpc as well.
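The state propagation for "mr" can be sketched as below. This is a
simplified model, not the perf implementation: "struct reg_state" and
"track_mr" are hypothetical, and the extra check that RS equals RB (so
that a general "or" is not treated as a move) is an assumption of this
sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_GPRS 32

/* Hypothetical, simplified per-register type state. */
struct reg_state {
	bool ok;            /* true when the type is known */
	const char *type;   /* type name carried by the register */
};

/*
 * If @insn is "mr RA,RS" (encoded as "or RA,RS,RS": primary opcode 31,
 * extended opcode 444 in bits 21:30), copy the state of RS to RA.
 * Returns true when the instruction was a register move.
 */
static bool track_mr(struct reg_state regs[NR_GPRS], uint32_t insn)
{
	unsigned int rs = (insn >> 21) & 0x1f;
	unsigned int ra = (insn >> 16) & 0x1f;
	unsigned int rb = (insn >> 11) & 0x1f;

	if (((insn >> 26) & 0x3f) != 31 || ((insn >> 1) & 0x3ff) != 444)
		return false;
	if (rs != rb)           /* a general "or", not a plain move */
		return false;

	regs[ra] = regs[rs];    /* an unknown source makes the target unknown */
	return true;
}
```

For "mr r31,r3" (0x7c7f1b78), the state of r3 is copied to r31, so a
later "ld r9,312(r31)" can be resolved against the type that r3 held.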

Signed-off-by: Athira Rajeev <[email protected]>
---
.../perf/arch/powerpc/annotate/instructions.c | 65 +++++++++++++++++++
tools/perf/util/annotate-data.c | 9 ++-
tools/perf/util/disasm.c | 1 +
3 files changed, 74 insertions(+), 1 deletion(-)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index db72148eb857..3ecf5a986037 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -231,6 +231,71 @@ static struct ins_ops *check_ppc_insn(int raw_insn)
return NULL;
}

+/*
+ * Instruction tracking function to track register state moves.
+ * Example sequence:
+ * ld r10,264(r3)
+ * mr r31,r3
+ * <<after some sequence>>
+ * ld r9,312(r31)
+ *
+ * Previous instruction sequence shows that register state of r3
+ * is moved to r31. update_insn_state_powerpc tracks these state
+ * changes
+ */
+#ifdef HAVE_DWARF_SUPPORT
+static void update_insn_state_powerpc(struct type_state *state,
+ struct data_loc_info *dloc, Dwarf_Die * cu_die __maybe_unused,
+ struct disasm_line *dl)
+{
+ struct annotated_insn_loc loc;
+ struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
+ struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
+ struct type_state_reg *tsr;
+ u32 insn_offset = dl->al.offset;
+
+ if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
+ return;
+
+ /*
+ * Value 444 for bits 21:30 is for "mr"
+ * instruction. "mr" is extended OR. So set the
+ * source and destination reg correctly
+ */
+ if (PPC_21_30(dl->ops.raw_insn) == 444) {
+ int src_reg = src->reg1;
+
+ src->reg1 = dst->reg1;
+ dst->reg1 = src_reg;
+ }
+
+ if (!has_reg_type(state, dst->reg1))
+ return;
+
+ tsr = &state->regs[dst->reg1];
+
+ if (!has_reg_type(state, src->reg1) ||
+ !state->regs[src->reg1].ok) {
+ tsr->ok = false;
+ return;
+ }
+
+ tsr->type = state->regs[src->reg1].type;
+ tsr->kind = state->regs[src->reg1].kind;
+ tsr->ok = true;
+
+ pr_debug("mov [%x] reg%d -> reg%d",
+ insn_offset, src->reg1, dst->reg1);
+ pr_debug_type_name(&tsr->type, tsr->kind);
+}
+#else /* HAVE_DWARF_SUPPORT */
+static void update_insn_state_powerpc(struct type_state *state __maybe_unused, struct data_loc_info *dloc __maybe_unused,
+ Dwarf_Die * cu_die __maybe_unused, struct disasm_line *dl __maybe_unused)
+{
+ return;
+}
+#endif /* HAVE_DWARF_SUPPORT */
+
static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
{
if (!arch->initialized) {
diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 7a48c3d72b89..734acdd8c4b7 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -1080,6 +1080,13 @@ static int find_data_type_insn(struct data_loc_info *dloc,
return ret;
}

+static int arch_supports_insn_tracking(struct data_loc_info *dloc)
+{
+ if ((arch__is(dloc->arch, "x86")) || (arch__is(dloc->arch, "powerpc")))
+ return 1;
+ return 0;
+}
+
/*
* Construct a list of basic blocks for each scope with variables and try to find
* the data type by updating a type state table through instructions.
@@ -1094,7 +1101,7 @@ static int find_data_type_block(struct data_loc_info *dloc,
int ret = -1;

/* TODO: other architecture support */
- if (!arch__is(dloc->arch, "x86"))
+ if (!arch_supports_insn_tracking(dloc))
return -1;

prev_dst_ip = dst_ip = dloc->ip;
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 57af4dc42a58..d8b357055302 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -155,6 +155,7 @@ static struct arch architectures[] = {
{
.name = "powerpc",
.init = powerpc__annotate_init,
+ .update_insn_state = update_insn_state_powerpc,
},
{
.name = "riscv64",
--
2.43.0


2024-06-01 08:18:49

by Athira Rajeev

[permalink] [raw]
Subject: [PATCH V3 14/14] tools/perf: Set instruction name to be used with insn-stat when using raw instruction

Since the "ins.name" is not set while using raw instruction,
perf annotate with insn-stat gives wrong data:

Result from "./perf annotate --data-type --insn-stat":

Annotate Instruction stats
total 615, ok 419 (68.1%), bad 196 (31.9%)

Name : Good Bad
-----------------------------------------------------------
: 419 196

This patch sets "dl->ins.name" in the arch specific function
"check_ppc_insn" while initialising "struct disasm_line". It also
updates the "ins__find" function to take "struct disasm_line" as a
parameter so that its name field can be set in the arch specific call.

With the patch changes:

Annotate Instruction stats
total 609, ok 446 (73.2%), bad 163 (26.8%)

Name/opcode: Good Bad
-----------------------------------------------------------
58 : 323 80
32 : 49 43
34 : 33 11
OP_31_XOP_LDX : 8 20
40 : 23 0
OP_31_XOP_LWARX : 5 1
OP_31_XOP_LWZX : 2 3
OP_31_XOP_LDARX : 3 0
33 : 0 2
OP_31_XOP_LBZX : 0 1
OP_31_XOP_LWAX : 0 1
OP_31_XOP_LHZX : 0 1
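The naming fallback behind the table above can be sketched as follows.
"insn_stat_name" is a hypothetical helper used only for illustration:
when no mnemonic is available, the decimal opcode is used as the name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return @name when it is a non-empty mnemonic. Otherwise format the
 * primary opcode as a decimal string into @buf and return @buf.
 */
static const char *insn_stat_name(const char *name, int opcode,
				  char *buf, size_t len)
{
	if (name && name[0] != '\0')
		return name;

	snprintf(buf, len, "%d", opcode);
	return buf;
}
```

This is why rows such as "58" and "32" appear next to named extended
opcodes like OP_31_XOP_LDX.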

Signed-off-by: Athira Rajeev <[email protected]>
---
.../perf/arch/powerpc/annotate/instructions.c | 18 +++++++++++++++---
tools/perf/builtin-annotate.c | 4 ++--
tools/perf/util/annotate.c | 2 +-
tools/perf/util/disasm.c | 10 +++++-----
tools/perf/util/disasm.h | 2 +-
5 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index 3ecf5a986037..c4113b3ee71a 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -189,8 +189,9 @@ static int cmp_offset(const void *a, const void *b)
return (val1->value - val2->value);
}

-static struct ins_ops *check_ppc_insn(int raw_insn)
+static struct ins_ops *check_ppc_insn(struct disasm_line *dl)
{
+ int raw_insn = dl->ops.raw_insn;
int opcode = PPC_OP(raw_insn);
int mem_insn_31 = PPC_21_30(raw_insn);
struct insn_offset *ret;
@@ -198,19 +199,30 @@ static struct ins_ops *check_ppc_insn(int raw_insn)
"OP_31_INSN",
mem_insn_31
};
+ char name_insn[32];

/*
* Instructions with opcode 32 to 63 are memory
* instructions in powerpc
*/
if ((opcode & 0x20)) {
+ /*
+ * Set name in case of raw instruction to
+ * opcode to be used in insn-stat
+ */
+ if (!strlen(dl->ins.name)) {
+ sprintf(name_insn, "%d", opcode);
+ dl->ins.name = strdup(name_insn);
+ }
return &load_store_ops;
} else if (opcode == 31) {
/* Check for memory instructions with opcode 31 */
ret = bsearch(&mem_insns_31_opcode, ins_array, ARRAY_SIZE(ins_array), sizeof(ins_array[0]), cmp_offset);
- if (ret != NULL)
+ if (ret) {
+ if (!strlen(dl->ins.name))
+ dl->ins.name = strdup(ret->name);
return &load_store_ops;
- else {
+ } else {
mem_insns_31_opcode.value = PPC_22_30(raw_insn);
ret = bsearch(&mem_insns_31_opcode, arithmetic_ins_op_31, ARRAY_SIZE(arithmetic_ins_op_31),
sizeof(arithmetic_ins_op_31[0]), cmp_offset);
diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 50d2fb222d48..926467b9a023 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -396,10 +396,10 @@ static void print_annotate_item_stat(struct list_head *head, const char *title)
printf("total %d, ok %d (%.1f%%), bad %d (%.1f%%)\n\n", total,
total_good, 100.0 * total_good / (total ?: 1),
total_bad, 100.0 * total_bad / (total ?: 1));
- printf(" %-10s: %5s %5s\n", "Name", "Good", "Bad");
+ printf(" %-10s: %5s %5s\n", "Name/opcode", "Good", "Bad");
printf("-----------------------------------------------------------\n");
list_for_each_entry(istat, head, list)
- printf(" %-10s: %5d %5d\n", istat->name, istat->good, istat->bad);
+ printf(" %-20s: %5d %5d\n", istat->name, istat->good, istat->bad);
printf("\n");
}

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index ddb36dddb8cc..b8fc663dad4c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2235,7 +2235,7 @@ static struct annotated_item_stat *annotate_data_stat(struct list_head *head,
return NULL;

istat->name = strdup(name);
- if (istat->name == NULL) {
+ if ((istat->name == NULL) || (!strlen(istat->name))) {
free(istat);
return NULL;
}
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 915508d2e197..94ca0b8759fe 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -868,7 +868,7 @@ static void ins__sort(struct arch *arch)
qsort(arch->instructions, nmemb, sizeof(struct ins), ins__cmp);
}

-static struct ins_ops *__ins__find(struct arch *arch, const char *name, int raw_insn)
+static struct ins_ops *__ins__find(struct arch *arch, const char *name, struct disasm_line *dl)
{
struct ins *ins;
const int nmemb = arch->nr_instructions;
@@ -880,7 +880,7 @@ static struct ins_ops *__ins__find(struct arch *arch, const char *name, int raw_
*/
struct ins_ops *ops;

- ops = check_ppc_insn(raw_insn);
+ ops = check_ppc_insn(dl);
if (ops)
return ops;
}
@@ -914,9 +914,9 @@ static struct ins_ops *__ins__find(struct arch *arch, const char *name, int raw_
return ins ? ins->ops : NULL;
}

-struct ins_ops *ins__find(struct arch *arch, const char *name, int raw_insn)
+struct ins_ops *ins__find(struct arch *arch, const char *name, struct disasm_line *dl)
{
- struct ins_ops *ops = __ins__find(arch, name, raw_insn);
+ struct ins_ops *ops = __ins__find(arch, name, dl);

if (!ops && arch->associate_instruction_ops)
ops = arch->associate_instruction_ops(arch, name);
@@ -926,7 +926,7 @@ struct ins_ops *ins__find(struct arch *arch, const char *name, int raw_insn)

static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map_symbol *ms)
{
- dl->ins.ops = ins__find(arch, dl->ins.name, dl->ops.raw_insn);
+ dl->ins.ops = ins__find(arch, dl->ins.name, dl);

if (!dl->ins.ops)
return;
diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
index 831ebcc329cd..2788c3fe2157 100644
--- a/tools/perf/util/disasm.h
+++ b/tools/perf/util/disasm.h
@@ -106,7 +106,7 @@ struct annotate_args {
struct arch *arch__find(const char *name);
bool arch__is(struct arch *arch, const char *name);

-struct ins_ops *ins__find(struct arch *arch, const char *name, int raw_insn);
+struct ins_ops *ins__find(struct arch *arch, const char *name, struct disasm_line *dl);
int ins__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name);

--
2.43.0


2024-06-01 09:12:54

by Athira Rajeev

[permalink] [raw]
Subject: [PATCH V3 13/14] tools/perf: Add support for global_die to capture name of variable in case of register defined variable

In the case of a register defined variable (found using
find_data_type_global_reg), if the type of the variable happens to be a
base type (for example, long unsigned int), perf report captures it as:

12.85% long unsigned int long unsigned int +0 (no field)

The above data type actually refers to samples captured while accessing
"r1", which represents the current stack pointer on powerpc:
register void *__stack_pointer asm("r1");

The dwarf debug contains this as:

<<>>
<1><18dd772>: Abbrev Number: 129 (DW_TAG_variable)
<18dd774> DW_AT_name : (indirect string, offset: 0x11ba): current_stack_pointer
<18dd778> DW_AT_decl_file : 51
<18dd779> DW_AT_decl_line : 1468
<18dd77b> DW_AT_decl_column : 24
<18dd77c> DW_AT_type : <0x18da5cd>
<18dd780> DW_AT_external : 1
<18dd780> DW_AT_location : 1 byte block: 51 (DW_OP_reg1 (r1))

where 18da5cd is:

<1><18da5cd>: Abbrev Number: 47 (DW_TAG_base_type)
<18da5ce> DW_AT_byte_size : 8
<18da5cf> DW_AT_encoding : 7 (unsigned)
<18da5d0> DW_AT_name : (indirect string, offset: 0x55c7): long unsigned int
<<>>

To make this clearer to the user, capture the DW_AT_name of the
variable and save it as part of Dwarf_Global. Dwarf_Global is used so
that the name can be retrieved while presenting the result.

Update the "dso__findnew_data_type" function to set "var_name" if the
variable name is set as part of Dwarf_Global. Update
"hist_entry__typeoff_snprintf" to print var_name if it is set.
With these changes, along with "long unsigned int", the report also
shows the variable name as current_stack_pointer.

Snippet of result:

12.85% long unsigned int long unsigned int +0 (current_stack_pointer)
4.68% struct paca_struct struct paca_struct +2312 (__current)
4.57% struct paca_struct struct paca_struct +2354 (irq_soft_mask)

Signed-off-by: Athira Rajeev <[email protected]>
---
tools/perf/util/annotate-data.c | 30 ++++++++++++++++++++++++------
tools/perf/util/dwarf-aux.c | 1 +
tools/perf/util/dwarf-aux.h | 1 +
tools/perf/util/sort.c | 7 +++++--
4 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
index 82232f2d8e16..2bce522304f4 100644
--- a/tools/perf/util/annotate-data.c
+++ b/tools/perf/util/annotate-data.c
@@ -268,23 +268,32 @@ static void delete_members(struct annotated_member *member)
}

static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
- Dwarf_Die *type_die)
+ Dwarf_Die *type_die, Dwarf_Global *global_die)
{
struct annotated_data_type *result = NULL;
struct annotated_data_type key;
struct rb_node *node;
struct strbuf sb;
+ struct strbuf sb_var_name;
char *type_name;
+ char *var_name;
Dwarf_Word size;

strbuf_init(&sb, 32);
+ strbuf_init(&sb_var_name, 32);
if (die_get_typename_from_type(type_die, &sb) < 0)
strbuf_add(&sb, "(unknown type)", 14);
+ if (global_die->name) {
+ strbuf_addstr(&sb_var_name, global_die->name);
+ var_name = strbuf_detach(&sb_var_name, NULL);
+ }
type_name = strbuf_detach(&sb, NULL);
dwarf_aggregate_size(type_die, &size);

/* Check existing nodes in dso->data_types tree */
key.self.type_name = type_name;
+ if (global_die->name)
+ key.self.var_name = var_name;
key.self.size = size;
node = rb_find(&key, dso__data_types(dso), data_type_cmp);
if (node) {
@@ -301,6 +310,8 @@ static struct annotated_data_type *dso__findnew_data_type(struct dso *dso,
}

result->self.type_name = type_name;
+ if (global_die->name)
+ result->self.var_name = var_name;
result->self.size = size;
INIT_LIST_HEAD(&result->self.children);

@@ -1178,7 +1189,7 @@ static int find_data_type_block(struct data_loc_info *dloc,
* cu_die and match with reg to identify data type die.
*/
static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_Die *cu_die,
- Dwarf_Die *type_die)
+ Dwarf_Die *type_die, Dwarf_Global *global_die)
{
Dwarf_Die vr_die;
int ret = -1;
@@ -1190,8 +1201,11 @@ static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_
if (dwarf_offdie(dloc->di->dbg, var_types->die_off, &vr_die)) {
if (die_get_real_type(&vr_die, type_die) == NULL) {
dloc->type_offset = 0;
+ global_die->name = var_types->name;
dwarf_offdie(dloc->di->dbg, var_types->die_off, type_die);
}
+ global_die->die_offset = (long)dwarf_dieoffset(type_die);
+ global_die->cu_offset = (long)dwarf_dieoffset(cu_die);
pr_debug_type_name(type_die, TSR_KIND_TYPE);
ret = 0;
pr_debug_dtp("found by CU for %s (die:%#lx)\n",
@@ -1205,7 +1219,8 @@ static int find_data_type_global_reg(struct data_loc_info *dloc, int reg, Dwarf_
}

/* The result will be saved in @type_die */
-static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
+static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die,
+ Dwarf_Global *global_die)
{
struct annotated_op_loc *loc = dloc->op;
Dwarf_Die cu_die, var_die;
@@ -1219,6 +1234,8 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
u64 pc;
char buf[64];

+ memset(global_die, 0, sizeof(Dwarf_Global));
+
if (dloc->op->multi_regs)
snprintf(buf, sizeof(buf), "reg%d, reg%d", dloc->op->reg1, dloc->op->reg2);
else if (dloc->op->reg1 == DWARF_REG_PC)
@@ -1252,7 +1269,7 @@ static int find_data_type_die(struct data_loc_info *dloc, Dwarf_Die *type_die)
dwarf_diename(&cu_die), (long)dwarf_dieoffset(&cu_die));

if (loc->reg_type == DWARF_REG_GLOBAL) {
- ret = find_data_type_global_reg(dloc, reg, &cu_die, type_die);
+ ret = find_data_type_global_reg(dloc, reg, &cu_die, type_die, global_die);
if (!ret)
goto out;
}
@@ -1388,6 +1405,7 @@ struct annotated_data_type *find_data_type(struct data_loc_info *dloc)
struct annotated_data_type *result = NULL;
struct dso *dso = map__dso(dloc->ms->map);
Dwarf_Die type_die;
+ Dwarf_Global global_die;

dloc->di = debuginfo__new(dso__long_name(dso));
if (dloc->di == NULL) {
@@ -1403,10 +1421,10 @@ struct annotated_data_type *find_data_type(struct data_loc_info *dloc)

dloc->fbreg = -1;

- if (find_data_type_die(dloc, &type_die) < 0)
+ if (find_data_type_die(dloc, &type_die, &global_die) < 0)
goto out;

- result = dso__findnew_data_type(dso, &type_die);
+ result = dso__findnew_data_type(dso, &type_die, &global_die);

out:
debuginfo__delete(dloc->di);
diff --git a/tools/perf/util/dwarf-aux.c b/tools/perf/util/dwarf-aux.c
index 44ef968a7ad3..9e61ff326651 100644
--- a/tools/perf/util/dwarf-aux.c
+++ b/tools/perf/util/dwarf-aux.c
@@ -1610,6 +1610,7 @@ static int __die_collect_vars_cb(Dwarf_Die *die_mem, void *arg)
vt->reg = reg_from_dwarf_op(ops);
vt->offset = offset_from_dwarf_op(ops);
vt->next = *var_types;
+ vt->name = dwarf_diename(die_mem);
*var_types = vt;

return DIE_FIND_CB_SIBLING;
diff --git a/tools/perf/util/dwarf-aux.h b/tools/perf/util/dwarf-aux.h
index 24446412b869..406a5b1e269b 100644
--- a/tools/perf/util/dwarf-aux.h
+++ b/tools/perf/util/dwarf-aux.h
@@ -146,6 +146,7 @@ struct die_var_type {
u64 addr;
int reg;
int offset;
+ const char *name;
};

/* Return type info of a member at offset */
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index cd39ea972193..535ca19a23fd 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -2305,9 +2305,12 @@ static int hist_entry__typeoff_snprintf(struct hist_entry *he, char *bf,
char buf[4096];

buf[0] = '\0';
- if (list_empty(&he_type->self.children))
+ if (list_empty(&he_type->self.children)) {
snprintf(buf, sizeof(buf), "no field");
- else
+ if (he_type->self.var_name)
+ strcpy(buf, he_type->self.var_name);
+
+ } else
fill_member_name(buf, sizeof(buf), &he_type->self,
he->mem_type_off, true);
buf[4095] = '\0';
--
2.43.0


2024-06-01 10:02:05

by Athira Rajeev

Subject: [PATCH V3 06/14] tools/perf: Update parameters for reg extract functions to use raw instruction on powerpc

Use the raw instruction code and macros to identify memory instructions
and to extract the register fields and the offset. The implementation
handles D-form, X-form and DS-form instructions. Two main functions are
added. The first is a new parse function, "load_store__parse", which acts
as the instruction ops parser for memory instructions. Unlike other
parsers (such as mov__parse), this parser fills in only the "raw_insn"
field for source/target and the newly added "mem_ref" field, and also
sets multi_regs and opcode. No other fields are set because there is no
need to parse the disassembled code here: arch-specific macros extract
the offset and registers, which is easier and more precise.

In powerpc, all instructions with a primary opcode from 32 to 63
are memory instructions. Update the "ins__find" function to take
"raw_insn" as an additional parameter. Instead of "extract_reg_offset",
use the second new function, "get_arch_regs", which sets reg1, reg2 and
offset depending on whether the operand is the source or the target.

Signed-off-by: Athira Rajeev <[email protected]>
---
.../perf/arch/powerpc/annotate/instructions.c | 16 +++++
tools/perf/arch/powerpc/util/dwarf-regs.c | 44 +++++++++++++
tools/perf/util/annotate.c | 25 +++++++-
tools/perf/util/disasm.c | 64 +++++++++++++++++--
tools/perf/util/disasm.h | 4 +-
tools/perf/util/include/dwarf-regs.h | 3 +
6 files changed, 147 insertions(+), 9 deletions(-)

diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
index d57fd023ef9c..10fea5e5cf4c 100644
--- a/tools/perf/arch/powerpc/annotate/instructions.c
+++ b/tools/perf/arch/powerpc/annotate/instructions.c
@@ -49,6 +49,22 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
return ops;
}

+#define PPC_OP(op) (((op) >> 26) & 0x3F)
+
+static struct ins_ops *check_ppc_insn(int raw_insn)
+{
+ int opcode = PPC_OP(raw_insn);
+
+ /*
+ * Instructions with opcode 32 to 63 are memory
+ * instructions in powerpc
+ */
+ if ((opcode & 0x20))
+ return &load_store_ops;
+
+ return NULL;
+}
+
static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
{
if (!arch->initialized) {
diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
index 430623ca5612..38b74fa01d8b 100644
--- a/tools/perf/arch/powerpc/util/dwarf-regs.c
+++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
@@ -107,3 +107,47 @@ int regs_query_register_offset(const char *name)
#define PPC_DS(DS) ((DS) & 0xfffc)
#define OP_LD 58
#define OP_STD 62
+
+static int get_source_reg(unsigned int raw_insn)
+{
+ return PPC_RA(raw_insn);
+}
+
+static int get_target_reg(unsigned int raw_insn)
+{
+ return PPC_RT(raw_insn);
+}
+
+static int get_offset_opcode(int raw_insn __maybe_unused)
+{
+ int opcode = PPC_OP(raw_insn);
+
+ /* DS- form */
+ if ((opcode == OP_LD) || (opcode == OP_STD))
+ return PPC_DS(raw_insn);
+ else
+ return PPC_D(raw_insn);
+}
+
+/*
+ * Fills the required fields for op_loc depending on if it
+ * is a source or target.
+ * D form: ins RT,D(RA) -> src_reg1 = RA, offset = D, dst_reg1 = RT
+ * DS form: ins RT,DS(RA) -> src_reg1 = RA, offset = DS, dst_reg1 = RT
+ * X form: ins RT,RA,RB -> src_reg1 = RA, src_reg2 = RB, dst_reg1 = RT
+ */
+void get_arch_regs(int raw_insn __maybe_unused, int is_source __maybe_unused,
+ struct annotated_op_loc *op_loc __maybe_unused)
+{
+ if (is_source)
+ op_loc->reg1 = get_source_reg(raw_insn);
+ else
+ op_loc->reg1 = get_target_reg(raw_insn);
+
+ if (op_loc->multi_regs)
+ op_loc->reg2 = PPC_RB(raw_insn);
+
+ /* TODO: Implement offset handling for X Form */
+ if ((op_loc->mem_ref) && (PPC_OP(raw_insn) != 31))
+ op_loc->offset = get_offset_opcode(raw_insn);
+}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 1451caf25e77..2b8cc759ae35 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -2079,6 +2079,12 @@ static int extract_reg_offset(struct arch *arch, const char *str,
return 0;
}

+__weak void get_arch_regs(int raw_insn __maybe_unused, int is_source __maybe_unused,
+ struct annotated_op_loc *op_loc __maybe_unused)
+{
+ return;
+}
+
/**
* annotate_get_insn_location - Get location of instruction
* @arch: the architecture info
@@ -2123,20 +2129,33 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
for_each_insn_op_loc(loc, i, op_loc) {
const char *insn_str = ops->source.raw;
bool multi_regs = ops->source.multi_regs;
+ bool mem_ref = ops->source.mem_ref;

if (i == INSN_OP_TARGET) {
insn_str = ops->target.raw;
multi_regs = ops->target.multi_regs;
+ mem_ref = ops->target.mem_ref;
}

/* Invalidate the register by default */
op_loc->reg1 = -1;
op_loc->reg2 = -1;

- if (insn_str == NULL)
- continue;
+ if (insn_str == NULL) {
+ if (!arch__is(arch, "powerpc"))
+ continue;
+ }

- if (strchr(insn_str, arch->objdump.memory_ref_char)) {
+ /*
+ * For powerpc, call get_arch_regs function which extracts the
+ * required fields for op_loc, ie reg1, reg2, offset from the
+ * raw instruction.
+ */
+ if (arch__is(arch, "powerpc")) {
+ op_loc->mem_ref = mem_ref;
+ op_loc->multi_regs = multi_regs;
+ get_arch_regs(ops->raw_insn, !i, op_loc);
+ } else if (strchr(insn_str, arch->objdump.memory_ref_char)) {
op_loc->mem_ref = true;
op_loc->multi_regs = multi_regs;
extract_reg_offset(arch, insn_str, op_loc);
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
index 61f0f1656f82..252cb0d1f5d1 100644
--- a/tools/perf/util/disasm.c
+++ b/tools/perf/util/disasm.c
@@ -37,6 +37,7 @@ static struct ins_ops mov_ops;
static struct ins_ops nop_ops;
static struct ins_ops lock_ops;
static struct ins_ops ret_ops;
+static struct ins_ops load_store_ops;

static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name);
@@ -517,7 +518,7 @@ static int lock__parse(struct arch *arch, struct ins_operands *ops, struct map_s
if (disasm_line__parse(ops->raw, &ops->locked.ins.name, &ops->locked.ops->raw) < 0)
goto out_free_ops;

- ops->locked.ins.ops = ins__find(arch, ops->locked.ins.name);
+ ops->locked.ins.ops = ins__find(arch, ops->locked.ins.name, 0);

if (ops->locked.ins.ops == NULL)
goto out_free_ops;
@@ -672,6 +673,47 @@ static struct ins_ops mov_ops = {
.scnprintf = mov__scnprintf,
};

+static int load_store__scnprintf(struct ins *ins, char *bf, size_t size,
+ struct ins_operands *ops, int max_ins_name)
+{
+ return scnprintf(bf, size, "%-*s %s", max_ins_name, ins->name,
+ ops->raw);
+}
+
+/*
+ * Sets the fields: "raw_insn", opcode, multi_regs and "mem_ref".
+ * "mem_ref" is set for ops->source which is later used to
+ * fill the objdump->memory_ref-char field. This ops is currently
+ * used by powerpc and since binary instruction code is used to
+ * extract opcode, regs and offset, no other parsing is needed here
+ */
+static int load_store__parse(struct arch *arch __maybe_unused, struct ins_operands *ops,
+ struct map_symbol *ms __maybe_unused)
+{
+ ops->source.raw_insn = ops->raw_insn;
+ ops->source.mem_ref = true;
+ ops->source.opcode = ops->opcode;
+ ops->source.multi_regs = false;
+
+ if (!ops->source.raw_insn)
+ return -1;
+
+ ops->target.raw_insn = ops->raw_insn;
+ ops->target.mem_ref = false;
+ ops->target.multi_regs = false;
+ ops->target.opcode = ops->opcode;
+
+ if (!ops->target.raw_insn)
+ return -1;
+
+ return 0;
+}
+
+static struct ins_ops load_store_ops = {
+ .parse = load_store__parse,
+ .scnprintf = load_store__scnprintf,
+};
+
static int dec__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map_symbol *ms __maybe_unused)
{
char *target, *comment, *s, prev;
@@ -762,11 +804,23 @@ static void ins__sort(struct arch *arch)
qsort(arch->instructions, nmemb, sizeof(struct ins), ins__cmp);
}

-static struct ins_ops *__ins__find(struct arch *arch, const char *name)
+static struct ins_ops *__ins__find(struct arch *arch, const char *name, int raw_insn)
{
struct ins *ins;
const int nmemb = arch->nr_instructions;

+ if (arch__is(arch, "powerpc")) {
+ /*
+ * For powerpc, identify the instruction ops
+ * from the opcode using raw_insn.
+ */
+ struct ins_ops *ops;
+
+ ops = check_ppc_insn(raw_insn);
+ if (ops)
+ return ops;
+ }
+
if (!arch->sorted_instructions) {
ins__sort(arch);
arch->sorted_instructions = true;
@@ -796,9 +850,9 @@ static struct ins_ops *__ins__find(struct arch *arch, const char *name)
return ins ? ins->ops : NULL;
}

-struct ins_ops *ins__find(struct arch *arch, const char *name)
+struct ins_ops *ins__find(struct arch *arch, const char *name, int raw_insn)
{
- struct ins_ops *ops = __ins__find(arch, name);
+ struct ins_ops *ops = __ins__find(arch, name, raw_insn);

if (!ops && arch->associate_instruction_ops)
ops = arch->associate_instruction_ops(arch, name);
@@ -808,7 +862,7 @@ struct ins_ops *ins__find(struct arch *arch, const char *name)

static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map_symbol *ms)
{
- dl->ins.ops = ins__find(arch, dl->ins.name);
+ dl->ins.ops = ins__find(arch, dl->ins.name, dl->ops.raw_insn);

if (!dl->ins.ops)
return;
diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
index a391e1bb81f7..831ebcc329cd 100644
--- a/tools/perf/util/disasm.h
+++ b/tools/perf/util/disasm.h
@@ -62,6 +62,7 @@ struct ins_operands {
bool offset_avail;
bool outside;
bool multi_regs;
+ bool mem_ref;
} target;
union {
struct {
@@ -71,6 +72,7 @@ struct ins_operands {
int raw_insn;
u64 addr;
bool multi_regs;
+ bool mem_ref;
} source;
struct {
struct ins ins;
@@ -104,7 +106,7 @@ struct annotate_args {
struct arch *arch__find(const char *name);
bool arch__is(struct arch *arch, const char *name);

-struct ins_ops *ins__find(struct arch *arch, const char *name);
+struct ins_ops *ins__find(struct arch *arch, const char *name, int raw_insn);
int ins__scnprintf(struct ins *ins, char *bf, size_t size,
struct ins_operands *ops, int max_ins_name);

diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
index 01fb25a1150a..7ea39362ecaf 100644
--- a/tools/perf/util/include/dwarf-regs.h
+++ b/tools/perf/util/include/dwarf-regs.h
@@ -1,6 +1,7 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _PERF_DWARF_REGS_H_
#define _PERF_DWARF_REGS_H_
+#include "annotate.h"

#define DWARF_REG_PC 0xd3af9c /* random number */
#define DWARF_REG_FB 0xd3affb /* random number */
@@ -31,6 +32,8 @@ static inline int get_dwarf_regnum(const char *name __maybe_unused,
}
#endif

+void get_arch_regs(int raw_insn, int is_source, struct annotated_op_loc *op_loc);
+
#ifdef HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
/*
* Arch should support fetching the offset of a register in pt_regs
--
2.43.0
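
For reference, the field extraction described in the commit message of
patch 06/14 can be sketched in isolation. This is a minimal illustration
and not the patch's code: struct ppc_mem_insn, ppc_is_mem_opcode() and
ppc_decode_mem_insn() are hypothetical names, the "opcode 32 to 63" rule
is taken from the patch description as-is, and the displacement is
sign-extended here, which is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field extractors for a 32-bit powerpc instruction word. */
#define PPC_OP(insn)	(((insn) >> 26) & 0x3f)	/* primary opcode */
#define PPC_RT(insn)	(((insn) >> 21) & 0x1f)	/* target register */
#define PPC_RA(insn)	(((insn) >> 16) & 0x1f)	/* base register */
#define PPC_RB(insn)	(((insn) >> 11) & 0x1f)	/* index register (X-form) */

#define OP_LD	58
#define OP_STD	62

/* Hypothetical container for the decoded fields; not a perf structure. */
struct ppc_mem_insn {
	int opcode;
	int rt;
	int ra;
	int offset;
};

/* Primary opcodes 32..63 are exactly those with bit 5 (0x20) set. */
static bool ppc_is_mem_opcode(uint32_t insn)
{
	return (PPC_OP(insn) & 0x20) != 0;
}

/*
 * Decode a D-form or DS-form load/store.
 * Returns false when the primary opcode is outside 32..63.
 */
static bool ppc_decode_mem_insn(uint32_t insn, struct ppc_mem_insn *out)
{
	uint32_t disp = insn & 0xffff;

	if (!ppc_is_mem_opcode(insn))
		return false;

	out->opcode = PPC_OP(insn);
	out->rt = PPC_RT(insn);
	out->ra = PPC_RA(insn);

	/* DS-form keeps extended-opcode bits in the low two bits. */
	if (out->opcode == OP_LD || out->opcode == OP_STD)
		disp &= 0xfffc;

	/* Sign-extend the 16-bit displacement (assumption of this sketch). */
	out->offset = (disp & 0x8000) ? (int)disp - 0x10000 : (int)disp;
	return true;
}
```

With the "ld r4,312(r1)" word e8810138 from the cover letter, this
yields opcode 58, RT 4, RA 1 and offset 312.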


2024-06-03 16:31:05

by Ian Rogers

Subject: Re: [PATCH V3 11/14] tools/perf: Add support to use libcapstone in powerpc

On Fri, May 31, 2024 at 11:10 PM Athira Rajeev
<[email protected]> wrote:
>
> perf now uses the capstone library to disassemble instructions on
> x86. capstone is used (if available) to speed up perf annotate.
> Currently it supports only the x86 architecture. This patch includes
> changes to enable it on powerpc. For now, this method is used only for
> the data type sort keys, and only the binary code (raw instruction) is
> read, because the powerpc approach to understanding instructions and
> reg fields uses the raw instruction. "cs_disasm" is currently not
> enabled: while attempting to use cs_disasm, some instructions were not
> identified (e.g. extswsli, maddld) and it had to fall back to
> objdump. Hence enabling "cs_disasm" is left as a TODO for powerpc in a
> comment.
>
> Signed-off-by: Athira Rajeev <[email protected]>
> ---
> tools/perf/util/disasm.c | 148 ++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 146 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
> index d8b357055302..915508d2e197 100644
> --- a/tools/perf/util/disasm.c
> +++ b/tools/perf/util/disasm.c
> @@ -1540,12 +1540,18 @@ static int open_capstone_handle(struct annotate_args *args, bool is_64bit,
> {
> struct annotation_options *opt = args->options;
> cs_mode mode = is_64bit ? CS_MODE_64 : CS_MODE_32;
> + int ret;
>
> /* TODO: support more architectures */
> - if (!arch__is(args->arch, "x86"))
> + if ((!arch__is(args->arch, "x86")) && (!arch__is(args->arch, "powerpc")))
> return -1;
>
> - if (cs_open(CS_ARCH_X86, mode, handle) != CS_ERR_OK)
> + if (arch__is(args->arch, "x86"))
> + ret = cs_open(CS_ARCH_X86, mode, handle);
> + else
> + ret = cs_open(CS_ARCH_PPC, mode, handle);
> +
> + if (ret != CS_ERR_OK)
> return -1;

There looks to be a pretty/more robust capstone_init function in
print_insn.c, should we factor this code out and recycle:
https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print_insn.c?h=perf-tools-next#n40

Thanks,
Ian

> if (!opt->disassembler_style ||
> @@ -1635,6 +1641,139 @@ static void print_capstone_detail(cs_insn *insn, char *buf, size_t len,
> }
> }
>
> +static int symbol__disassemble_capstone_powerpc(char *filename, struct symbol *sym,
> + struct annotate_args *args)
> +{
> + struct annotation *notes = symbol__annotation(sym);
> + struct map *map = args->ms.map;
> + struct dso *dso = map__dso(map);
> + struct nscookie nsc;
> + u64 start = map__rip_2objdump(map, sym->start);
> + u64 end = map__rip_2objdump(map, sym->end);
> + u64 len = end - start;
> + u64 offset;
> + int i, fd, count;
> + bool is_64bit = false;
> + bool needs_cs_close = false;
> + u8 *buf = NULL;
> + struct find_file_offset_data data = {
> + .ip = start,
> + };
> + csh handle;
> + char disasm_buf[512];
> + struct disasm_line *dl;
> + u32 *line;
> +
> + if (args->options->objdump_path)
> + return -1;
> +
> + nsinfo__mountns_enter(dso->nsinfo, &nsc);
> + fd = open(filename, O_RDONLY);
> + nsinfo__mountns_exit(&nsc);
> + if (fd < 0)
> + return -1;
> +
> + if (file__read_maps(fd, /*exe=*/true, find_file_offset, &data,
> + &is_64bit) == 0)
> + goto err;
> +
> + if (open_capstone_handle(args, is_64bit, &handle) < 0)
> + goto err;
> +
> + needs_cs_close = true;
> +
> + buf = malloc(len);
> + if (buf == NULL)
> + goto err;
> +
> + count = pread(fd, buf, len, data.offset);
> + close(fd);
> + fd = -1;
> +
> + if ((u64)count != len)
> + goto err;
> +
> + line = (u32 *)buf;
> +
> + /* add the function address and name */
> + scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
> + start, sym->name);
> +
> + args->offset = -1;
> + args->line = disasm_buf;
> + args->line_nr = 0;
> + args->fileloc = NULL;
> + args->ms.sym = sym;
> +
> + dl = disasm_line__new(args);
> + if (dl == NULL)
> + goto err;
> +
> + annotation_line__add(&dl->al, &notes->src->source);
> +
> + /*
> + * TODO: enable disassm for powerpc
> + * count = cs_disasm(handle, buf, len, start, len, &insn);
> + *
> + * For now, only binary code is saved in disassembled line
> + * to be used in "type" and "typeoff" sort keys. Each raw code
> + * is 32 bit instruction. So use "len/4" to get the number of
> + * entries.
> + */
> + count = len/4;
> +
> + for (i = 0, offset = 0; i < count; i++) {
> + args->offset = offset;
> + sprintf(args->line, "%x", line[i]);
> +
> + dl = disasm_line__new(args);
> + if (dl == NULL)
> + goto err;
> +
> + annotation_line__add(&dl->al, &notes->src->source);
> +
> + offset += 4;
> + }
> +
> + /* It failed in the middle */
> + if (offset != len) {
> + struct list_head *list = &notes->src->source;
> +
> + /* Discard all lines and fallback to objdump */
> + while (!list_empty(list)) {
> + dl = list_first_entry(list, struct disasm_line, al.node);
> +
> + list_del_init(&dl->al.node);
> + disasm_line__free(dl);
> + }
> + count = -1;
> + }
> +
> +out:
> + if (needs_cs_close)
> + cs_close(&handle);
> + free(buf);
> + return count < 0 ? count : 0;
> +
> +err:
> + if (fd >= 0)
> + close(fd);
> + if (needs_cs_close) {
> + struct disasm_line *tmp;
> +
> + /*
> + * It probably failed in the middle of the above loop.
> + * Release any resources it might add.
> + */
> + list_for_each_entry_safe(dl, tmp, &notes->src->source, al.node) {
> + list_del(&dl->al.node);
> + free(dl);
> + }
> + }
> + count = -1;
> + goto out;
> +}
> +
> static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
> struct annotate_args *args)
> {
> @@ -1987,6 +2126,11 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
> err = symbol__disassemble_dso(symfs_filename, sym, args);
> if (err == 0)
> goto out_remove_tmp;
> +#ifdef HAVE_LIBCAPSTONE_SUPPORT
> + err = symbol__disassemble_capstone_powerpc(symfs_filename, sym, args);
> + if (err == 0)
> + goto out_remove_tmp;
> +#endif
> }
> }
>
> --
> 2.43.0
>
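
The raw-word loop in the quoted patch (count = len/4, one "%x" line per
instruction) can be sketched standalone as below. This is only an
illustration under stated assumptions: format_raw_words() and
PPC_INSN_SIZE are hypothetical names, the words are read in host byte
order as in the patch, and memcpy() is used instead of a u32 pointer
cast to sidestep alignment concerns.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PPC_INSN_SIZE	4	/* every powerpc instruction word is 32 bits */

/*
 * Format each 4-byte word of @buf as a hex string in @lines.
 * Returns the number of words written, or -1 if @len is not a multiple
 * of 4 or @lines has fewer than len / 4 entries.
 */
static int format_raw_words(const uint8_t *buf, size_t len,
			    char lines[][16], size_t max_lines)
{
	size_t count = len / PPC_INSN_SIZE;
	size_t i;

	if (len % PPC_INSN_SIZE || count > max_lines)
		return -1;

	for (i = 0; i < count; i++) {
		uint32_t word;

		/* Copy the word out in host byte order. */
		memcpy(&word, buf + i * PPC_INSN_SIZE, sizeof(word));
		snprintf(lines[i], 16, "%" PRIx32, word);
	}

	return (int)count;
}
```

Rejecting a length that is not a multiple of four up front replaces the
"offset != len" check after the loop in the quoted code.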

2024-06-03 16:58:51

by Adrian Hunter

Subject: Re: [PATCH V3 11/14] tools/perf: Add support to use libcapstone in powerpc

On 3/06/24 19:30, Ian Rogers wrote:
> On Fri, May 31, 2024 at 11:10 PM Athira Rajeev
> <[email protected]> wrote:
>>
>> perf now uses the capstone library to disassemble instructions on
>> x86. capstone is used (if available) to speed up perf annotate.
>> Currently it supports only the x86 architecture. This patch includes
>> changes to enable it on powerpc. For now, this method is used only for
>> the data type sort keys, and only the binary code (raw instruction) is
>> read, because the powerpc approach to understanding instructions and
>> reg fields uses the raw instruction. "cs_disasm" is currently not
>> enabled: while attempting to use cs_disasm, some instructions were not
>> identified (e.g. extswsli, maddld) and it had to fall back to
>> objdump. Hence enabling "cs_disasm" is left as a TODO for powerpc in a
>> comment.
>>
>> Signed-off-by: Athira Rajeev <[email protected]>
>> ---
>> tools/perf/util/disasm.c | 148 ++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 146 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
>> index d8b357055302..915508d2e197 100644
>> --- a/tools/perf/util/disasm.c
>> +++ b/tools/perf/util/disasm.c
>> @@ -1540,12 +1540,18 @@ static int open_capstone_handle(struct annotate_args *args, bool is_64bit,
>> {
>> struct annotation_options *opt = args->options;
>> cs_mode mode = is_64bit ? CS_MODE_64 : CS_MODE_32;
>> + int ret;
>>
>> /* TODO: support more architectures */
>> - if (!arch__is(args->arch, "x86"))
>> + if ((!arch__is(args->arch, "x86")) && (!arch__is(args->arch, "powerpc")))
>> return -1;
>>
>> - if (cs_open(CS_ARCH_X86, mode, handle) != CS_ERR_OK)
>> + if (arch__is(args->arch, "x86"))
>> + ret = cs_open(CS_ARCH_X86, mode, handle);
>> + else
>> + ret = cs_open(CS_ARCH_PPC, mode, handle);
>> +
>> + if (ret != CS_ERR_OK)
>> return -1;
>
> There looks to be a pretty/more robust capstone_init function in
> print_insn.c, should we factor this code out and recycle:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print_insn.c?h=perf-tools-next#n40

On a slightly related note, there is a compile error that has
been around for a while in util/disasm.c on Ubuntu 22.04:

In file included from /usr/include/capstone/capstone.h:279,
                 from util/disasm.c:1354:
/usr/include/capstone/bpf.h:94:14: error: ‘bpf_insn’ defined as wrong kind of tag
   94 | typedef enum bpf_insn {
      |              ^~~~~~~~

>
> Thanks,
> Ian
>
>> if (!opt->disassembler_style ||
>> @@ -1635,6 +1641,139 @@ static void print_capstone_detail(cs_insn *insn, char *buf, size_t len,
>> }
>> }
>>
>> +static int symbol__disassemble_capstone_powerpc(char *filename, struct symbol *sym,
>> + struct annotate_args *args)
>> +{
>> + struct annotation *notes = symbol__annotation(sym);
>> + struct map *map = args->ms.map;
>> + struct dso *dso = map__dso(map);
>> + struct nscookie nsc;
>> + u64 start = map__rip_2objdump(map, sym->start);
>> + u64 end = map__rip_2objdump(map, sym->end);
>> + u64 len = end - start;
>> + u64 offset;
>> + int i, fd, count;
>> + bool is_64bit = false;
>> + bool needs_cs_close = false;
>> + u8 *buf = NULL;
>> + struct find_file_offset_data data = {
>> + .ip = start,
>> + };
>> + csh handle;
>> + char disasm_buf[512];
>> + struct disasm_line *dl;
>> + u32 *line;
>> +
>> + if (args->options->objdump_path)
>> + return -1;
>> +
>> + nsinfo__mountns_enter(dso->nsinfo, &nsc);
>> + fd = open(filename, O_RDONLY);
>> + nsinfo__mountns_exit(&nsc);
>> + if (fd < 0)
>> + return -1;
>> +
>> + if (file__read_maps(fd, /*exe=*/true, find_file_offset, &data,
>> + &is_64bit) == 0)
>> + goto err;
>> +
>> + if (open_capstone_handle(args, is_64bit, &handle) < 0)
>> + goto err;
>> +
>> + needs_cs_close = true;
>> +
>> + buf = malloc(len);
>> + if (buf == NULL)
>> + goto err;
>> +
>> + count = pread(fd, buf, len, data.offset);
>> + close(fd);
>> + fd = -1;
>> +
>> + if ((u64)count != len)
>> + goto err;
>> +
>> + line = (u32 *)buf;
>> +
>> + /* add the function address and name */
>> + scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
>> + start, sym->name);
>> +
>> + args->offset = -1;
>> + args->line = disasm_buf;
>> + args->line_nr = 0;
>> + args->fileloc = NULL;
>> + args->ms.sym = sym;
>> +
>> + dl = disasm_line__new(args);
>> + if (dl == NULL)
>> + goto err;
>> +
>> + annotation_line__add(&dl->al, &notes->src->source);
>> +
>> + /*
>> + * TODO: enable disassm for powerpc
>> + * count = cs_disasm(handle, buf, len, start, len, &insn);
>> + *
>> + * For now, only binary code is saved in disassembled line
>> + * to be used in "type" and "typeoff" sort keys. Each raw code
>> + * is 32 bit instruction. So use "len/4" to get the number of
>> + * entries.
>> + */
>> + count = len/4;
>> +
>> + for (i = 0, offset = 0; i < count; i++) {
>> + args->offset = offset;
>> + sprintf(args->line, "%x", line[i]);
>> +
>> + dl = disasm_line__new(args);
>> + if (dl == NULL)
>> + goto err;
>> +
>> + annotation_line__add(&dl->al, &notes->src->source);
>> +
>> + offset += 4;
>> + }
>> +
>> + /* It failed in the middle */
>> + if (offset != len) {
>> + struct list_head *list = &notes->src->source;
>> +
>> + /* Discard all lines and fallback to objdump */
>> + while (!list_empty(list)) {
>> + dl = list_first_entry(list, struct disasm_line, al.node);
>> +
>> + list_del_init(&dl->al.node);
>> + disasm_line__free(dl);
>> + }
>> + count = -1;
>> + }
>> +
>> +out:
>> + if (needs_cs_close)
>> + cs_close(&handle);
>> + free(buf);
>> + return count < 0 ? count : 0;
>> +
>> +err:
>> + if (fd >= 0)
>> + close(fd);
>> + if (needs_cs_close) {
>> + struct disasm_line *tmp;
>> +
>> + /*
>> + * It probably failed in the middle of the above loop.
>> + * Release any resources it might add.
>> + */
>> + list_for_each_entry_safe(dl, tmp, &notes->src->source, al.node) {
>> + list_del(&dl->al.node);
>> + free(dl);
>> + }
>> + }
>> + count = -1;
>> + goto out;
>> +}
>> +
>> static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
>> struct annotate_args *args)
>> {
>> @@ -1987,6 +2126,11 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
>> err = symbol__disassemble_dso(symfs_filename, sym, args);
>> if (err == 0)
>> goto out_remove_tmp;
>> +#ifdef HAVE_LIBCAPSTONE_SUPPORT
>> + err = symbol__disassemble_capstone_powerpc(symfs_filename, sym, args);
>> + if (err == 0)
>> + goto out_remove_tmp;
>> +#endif
>> }
>> }
>>
>> --
>> 2.43.0
>>


2024-06-06 06:33:27

by Namhyung Kim

Subject: Re: [PATCH V3 05/14] tools/perf: Add disasm_line__parse to parse raw instruction for powerpc

Hello,

On Sat, Jun 01, 2024 at 11:39:32AM +0530, Athira Rajeev wrote:
> Currently, the perf tool infrastructure uses the disasm_line__parse
> function to parse a disassembled line.
>
> Example snippet from objdump:
> objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>
>
> c0000000010224b4: lwz r10,0(r9)
>
> This line "lwz r10,0(r9)" is parsed to extract instruction name,
> registers names and offset. In powerpc, the approach for data type
> profiling uses raw instruction instead of result from objdump to identify
> the instruction category and extract the source/target registers.
>
> Example: 38 01 81 e8 ld r4,312(r1)
>
> Here "38 01 81 e8" is the raw instruction representation. Add function
> "disasm_line__parse_powerpc" to handle parsing of raw instruction. Also
> update "struct ins" and "struct ins_operands" to save "opcode" and
> binary code. With the change, function captures:
>
> line -> "38 01 81 e8 ld r4,312(r1)"
> opcode and raw instruction "38 01 81 e8"
>
> The raw instruction is used later to extract the reg/offset fields. Macros
> are added to extract the opcode and register fields. "struct ins_operands"
> and "struct ins" are updated to carry the opcode and the raw instruction
> binary code (raw_insn). The function "disasm_line__parse_powerpc" fills the
> raw instruction hex value and opcode into the newly added fields. There are
> no changes to the existing code paths, which parse the disassembled code.
> Architectures that use the instruction name and the present approach are
> not altered. Since this approach targets powerpc, the macro
> implementation is added only for powerpc for now.
>
> Since disasm_line__parse is used in other cases (perf annotate) and
> not only in data type profiling, the powerpc callback includes changes to
> work with binary code as well as the mnemonic representation. Also, if
> the DSO read fails and libcapstone is not supported, the approach
> falls back to objdump. Hence the patch also includes changes to
> ensure the objdump path works well.
>
> Signed-off-by: Athira Rajeev <[email protected]>
> ---
> tools/include/linux/string.h | 2 +
> tools/lib/string.c | 13 ++++
> .../perf/arch/powerpc/annotate/instructions.c | 1 +
> tools/perf/arch/powerpc/util/dwarf-regs.c | 9 +++
> tools/perf/util/disasm.c | 63 ++++++++++++++++++-
> tools/perf/util/disasm.h | 7 +++
> 6 files changed, 94 insertions(+), 1 deletion(-)
>
> diff --git a/tools/include/linux/string.h b/tools/include/linux/string.h
> index db5c99318c79..0acb1fc14e19 100644
> --- a/tools/include/linux/string.h
> +++ b/tools/include/linux/string.h
> @@ -46,5 +46,7 @@ extern char * __must_check skip_spaces(const char *);
>
> extern char *strim(char *);
>
> +extern void remove_spaces(char *s);
> +
> extern void *memchr_inv(const void *start, int c, size_t bytes);
> #endif /* _TOOLS_LINUX_STRING_H_ */
> diff --git a/tools/lib/string.c b/tools/lib/string.c
> index 8b6892f959ab..3126d2cff716 100644
> --- a/tools/lib/string.c
> +++ b/tools/lib/string.c
> @@ -153,6 +153,19 @@ char *strim(char *s)
> return skip_spaces(s);
> }
>
> +/*
> + * remove_spaces - Removes whitespaces from @s
> + */
> +void remove_spaces(char *s)
> +{
> + char *d = s;
> +
> + do {
> + while (*d == ' ')
> + ++d;
> + } while ((*s++ = *d++));
> +}
> +
> /**
> * strreplace - Replace all occurrences of character in string.
> * @s: The string to operate on.
> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
> index a3f423c27cae..d57fd023ef9c 100644
> --- a/tools/perf/arch/powerpc/annotate/instructions.c
> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
> @@ -55,6 +55,7 @@ static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
> arch->initialized = true;
> arch->associate_instruction_ops = powerpc__associate_instruction_ops;
> arch->objdump.comment_char = '#';
> + annotate_opts.show_asm_raw = true;
> }
>
> return 0;
> diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
> index 0c4f4caf53ac..430623ca5612 100644
> --- a/tools/perf/arch/powerpc/util/dwarf-regs.c
> +++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
> @@ -98,3 +98,12 @@ int regs_query_register_offset(const char *name)
> return roff->ptregs_offset;
> return -EINVAL;
> }
> +
> +#define PPC_OP(op) (((op) >> 26) & 0x3F)
> +#define PPC_RA(a) (((a) >> 16) & 0x1f)
> +#define PPC_RT(t) (((t) >> 21) & 0x1f)
> +#define PPC_RB(b) (((b) >> 11) & 0x1f)
> +#define PPC_D(D) ((D) & 0xfffe)
> +#define PPC_DS(DS) ((DS) & 0xfffc)
> +#define OP_LD 58
> +#define OP_STD 62
> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
> index 3cd187f08193..61f0f1656f82 100644
> --- a/tools/perf/util/disasm.c
> +++ b/tools/perf/util/disasm.c
> @@ -45,6 +45,7 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t size,
>
> static void ins__sort(struct arch *arch);
> static int disasm_line__parse(char *line, const char **namep, char **rawp);
> +static int disasm_line__parse_powerpc(struct disasm_line *dl);
>
> static __attribute__((constructor)) void symbol__init_regexpr(void)
> {
> @@ -844,6 +845,63 @@ static int disasm_line__parse(char *line, const char **namep, char **rawp)
> return -1;
> }
>
> +/*
> + * Parses the result captured from symbol__disassemble_*
> + * Example, line read from DSO file in powerpc:
> + * line: 38 01 81 e8
> + * opcode: fetched from arch specific get_opcode_insn
> + * rawp_insn: e8810138
> + *
> + * rawp_insn is used later to extract the reg/offset fields
> + */
> +#define PPC_OP(op) (((op) >> 26) & 0x3F)
> +
> +static int disasm_line__parse_powerpc(struct disasm_line *dl)
> +{
> + char *line = dl->al.line;
> + const char **namep = &dl->ins.name;
> + char **rawp = &dl->ops.raw;
> + char tmp, *tmp_opcode, *name_opcode = skip_spaces(line);
> + char *name = skip_spaces(name_opcode + 11);
> + int objdump = 0;
> +
> + if (strlen(line) > 11)
> + objdump = 1;
> +
> + if (name_opcode[0] == '\0')
> + return -1;
> +
> + if (objdump) {
> + *rawp = name + 1;
> + while ((*rawp)[0] != '\0' && !isspace((*rawp)[0]))
> + ++*rawp;
> + tmp = (*rawp)[0];
> + (*rawp)[0] = '\0';
> +
> + *namep = strdup(name);
> + if (*namep == NULL)
> + return -1;
> +
> + (*rawp)[0] = tmp;
> + *rawp = strim(*rawp);
> + } else
> + *namep = "";
> +
> + tmp_opcode = strdup(name_opcode);
> + tmp_opcode[11] = '\0';
> + remove_spaces(tmp_opcode);
> +
> + dl->ins.opcode = PPC_OP(strtol(tmp_opcode, NULL, 16));
> + if (objdump)
> + dl->ins.opcode = PPC_OP(be32_to_cpu(strtol(tmp_opcode, NULL, 16)));
> + dl->ops.opcode = dl->ins.opcode;
> +
> + dl->ops.raw_insn = strtol(tmp_opcode, NULL, 16);
> + if (objdump)
> + dl->ops.raw_insn = be32_to_cpu(strtol(tmp_opcode, NULL, 16));
> + return 0;
> +}
> +
> static void annotation_line__init(struct annotation_line *al,
> struct annotate_args *args,
> int nr)
> @@ -897,7 +955,10 @@ struct disasm_line *disasm_line__new(struct annotate_args *args)
> goto out_delete;
>
> if (args->offset != -1) {
> - if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
> + if (arch__is(args->arch, "powerpc")) {
> + if (disasm_line__parse_powerpc(dl) < 0)
> + goto out_free_line;
> + } else if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
> goto out_free_line;
>
> disasm_line__init_ins(dl, args->arch, &args->ms);
> diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
> index 718177fa4775..a391e1bb81f7 100644
> --- a/tools/perf/util/disasm.h
> +++ b/tools/perf/util/disasm.h
> @@ -43,14 +43,19 @@ struct arch {
>
> struct ins {
> const char *name;
> + int opcode;

I don't think this is the right place as 'ins' can be shared for
different opcodes. IIUC it's like a class and disasm_line should
have a pointer instead of a copy of the arch instructions. So I'd
like to keep a single instance if they behave in the same way. But
this is a separate change.

I guess we can move it to struct disasm_line and use helper macros when
we need to access the opcode. This will be helpful for other arches.

struct disasm_line {
	struct ins		*ins;
	struct ins_operands	ops;
	union {
		u8	bytes[4];
		u32	opcode;
	} raw;
	struct annotation_line	al;
};

#define PPC_OP(dl) (((dl)->raw.bytes[0] >> 2) & 0x3F)

Thanks,
Namhyung
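
A minimal sketch of the layout suggested above, assuming the four raw bytes are stored in instruction (big-endian) order so that bytes[0] carries the primary opcode in its top six bits. The names raw_insn_demo, DEMO_PPC_OP, demo_set_raw and demo_ppc_op are hypothetical stand-ins, not part of perf; the sample word 0xe8810138 is the "ld r4,312(r1)" encoding from the cover letter.

```c
#include <stdint.h>

/* Hypothetical stand-in for the proposed 'raw' member of struct disasm_line. */
struct raw_insn_demo {
	union {
		uint8_t bytes[4];	/* assumed: instruction (big-endian) byte order */
		uint32_t opcode;
	} raw;
};

/* Same expression as the suggested PPC_OP(dl): top 6 bits of the first byte. */
#define DEMO_PPC_OP(dl) (((dl)->raw.bytes[0] >> 2) & 0x3F)

/* Store a 32-bit instruction word most significant byte first. */
static void demo_set_raw(struct raw_insn_demo *dl, uint32_t insn)
{
	dl->raw.bytes[0] = (insn >> 24) & 0xff;
	dl->raw.bytes[1] = (insn >> 16) & 0xff;
	dl->raw.bytes[2] = (insn >> 8) & 0xff;
	dl->raw.bytes[3] = insn & 0xff;
}

/* Return the primary opcode (0..63) of an instruction word. */
static int demo_ppc_op(uint32_t insn)
{
	struct raw_insn_demo dl;

	demo_set_raw(&dl, insn);
	return DEMO_PPC_OP(&dl);
}
```

Reading the opcode through the byte array keeps the macro independent of host endianness, provided the bytes are filled in instruction order as assumed here.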

>
> struct ins_ops *ops;
> };
>
> struct ins_operands {
> char *raw;
> + int raw_insn;
> + int opcode;
> struct {
> char *raw;
> char *name;
> + int opcode;
> + int raw_insn;
> struct symbol *sym;
> u64 addr;
> s64 offset;
> @@ -62,6 +67,8 @@ struct ins_operands {
> struct {
> char *raw;
> char *name;
> + int opcode;
> + int raw_insn;
> u64 addr;
> bool multi_regs;
> } source;
> --
> 2.43.0
>

2024-06-06 06:52:18

by Namhyung Kim

Subject: Re: [PATCH V3 06/14] tools/perf: Update parameters for reg extract functions to use raw instruction on powerpc

On Sat, Jun 01, 2024 at 11:39:33AM +0530, Athira Rajeev wrote:
> Use the raw instruction code and macros to identify memory instructions,
> extract register fields and also offset. The implementation addresses
> the D-form, X-form, DS-form instructions. Two main functions are added.
> A new parse function, "load_store__parse", serves as the instruction ops
> parser for memory instructions. Unlike other parsers (like mov__parse), this
> parser fills in the "raw_insn" field for source/target and the newly added
> "mem_ref" field. It also sets multi_regs and opcode. No other fields
> are set because there is no need to parse the disassembled
> code here: arch-specific macros take care of extracting the offset and
> regs, which is easier and more precise.
>
> In powerpc, all instructions with a primary opcode from 32 to 63
> are memory instructions. Update "ins__find" function to have "raw_insn"
> also as a parameter. Don't use "extract_reg_offset"; instead use the
> newly added function "get_arch_regs", which sets these fields: reg1,
> reg2, offset, depending on whether it is a source or target op.
>
> Signed-off-by: Athira Rajeev <[email protected]>
> ---
> .../perf/arch/powerpc/annotate/instructions.c | 16 +++++
> tools/perf/arch/powerpc/util/dwarf-regs.c | 44 +++++++++++++
> tools/perf/util/annotate.c | 25 +++++++-
> tools/perf/util/disasm.c | 64 +++++++++++++++++--
> tools/perf/util/disasm.h | 4 +-
> tools/perf/util/include/dwarf-regs.h | 3 +
> 6 files changed, 147 insertions(+), 9 deletions(-)
>
> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
> index d57fd023ef9c..10fea5e5cf4c 100644
> --- a/tools/perf/arch/powerpc/annotate/instructions.c
> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
> @@ -49,6 +49,22 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
> return ops;
> }
>
> +#define PPC_OP(op) (((op) >> 26) & 0x3F)
> +
> +static struct ins_ops *check_ppc_insn(int raw_insn)
> +{
> + int opcode = PPC_OP(raw_insn);
> +
> + /*
> + * Instructions with opcode 32 to 63 are memory
> + * instructions in powerpc
> + */
> + if ((opcode & 0x20))
> + return &load_store_ops;
> +
> + return NULL;
> +}
> +
> static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
> {
> if (!arch->initialized) {
> diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
> index 430623ca5612..38b74fa01d8b 100644
> --- a/tools/perf/arch/powerpc/util/dwarf-regs.c
> +++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
> @@ -107,3 +107,47 @@ int regs_query_register_offset(const char *name)
> #define PPC_DS(DS) ((DS) & 0xfffc)
> #define OP_LD 58
> #define OP_STD 62
> +
> +static int get_source_reg(unsigned int raw_insn)
> +{
> + return PPC_RA(raw_insn);
> +}
> +
> +static int get_target_reg(unsigned int raw_insn)
> +{
> + return PPC_RT(raw_insn);
> +}
> +
> +static int get_offset_opcode(int raw_insn __maybe_unused)

The argument is used below, no need for __maybe_unused.

> +{
> + int opcode = PPC_OP(raw_insn);
> +
> + /* DS- form */
> + if ((opcode == OP_LD) || (opcode == OP_STD))
> + return PPC_DS(raw_insn);
> + else
> + return PPC_D(raw_insn);
> +}
> +
> +/*
> + * Fills the required fields for op_loc depending on if it
> + * is a source or target.
> + * D form: ins RT,D(RA) -> src_reg1 = RA, offset = D, dst_reg1 = RT
> + * DS form: ins RT,DS(RA) -> src_reg1 = RA, offset = DS, dst_reg1 = RT
> + * X form: ins RT,RA,RB -> src_reg1 = RA, src_reg2 = RB, dst_reg1 = RT
> + */
> +void get_arch_regs(int raw_insn __maybe_unused, int is_source __maybe_unused,
> + struct annotated_op_loc *op_loc __maybe_unused)

Ditto.

Thanks,
Namhyung


> +{
> + if (is_source)
> + op_loc->reg1 = get_source_reg(raw_insn);
> + else
> + op_loc->reg1 = get_target_reg(raw_insn);
> +
> + if (op_loc->multi_regs)
> + op_loc->reg2 = PPC_RB(raw_insn);
> +
> + /* TODO: Implement offset handling for X Form */
> + if ((op_loc->mem_ref) && (PPC_OP(raw_insn) != 31))
> + op_loc->offset = get_offset_opcode(raw_insn);
> +}
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index 1451caf25e77..2b8cc759ae35 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -2079,6 +2079,12 @@ static int extract_reg_offset(struct arch *arch, const char *str,
> return 0;
> }
>
> +__weak void get_arch_regs(int raw_insn __maybe_unused, int is_source __maybe_unused,
> + struct annotated_op_loc *op_loc __maybe_unused)
> +{
> + return;
> +}
> +
> /**
> * annotate_get_insn_location - Get location of instruction
> * @arch: the architecture info
> @@ -2123,20 +2129,33 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
> for_each_insn_op_loc(loc, i, op_loc) {
> const char *insn_str = ops->source.raw;
> bool multi_regs = ops->source.multi_regs;
> + bool mem_ref = ops->source.mem_ref;
>
> if (i == INSN_OP_TARGET) {
> insn_str = ops->target.raw;
> multi_regs = ops->target.multi_regs;
> + mem_ref = ops->target.mem_ref;
> }
>
> /* Invalidate the register by default */
> op_loc->reg1 = -1;
> op_loc->reg2 = -1;
>
> - if (insn_str == NULL)
> - continue;
> + if (insn_str == NULL) {
> + if (!arch__is(arch, "powerpc"))
> + continue;
> + }
>
> - if (strchr(insn_str, arch->objdump.memory_ref_char)) {
> + /*
> + * For powerpc, call get_arch_regs function which extracts the
> + * required fields for op_loc, ie reg1, reg2, offset from the
> + * raw instruction.
> + */
> + if (arch__is(arch, "powerpc")) {
> + op_loc->mem_ref = mem_ref;
> + op_loc->multi_regs = multi_regs;
> + get_arch_regs(ops->raw_insn, !i, op_loc);
> + } else if (strchr(insn_str, arch->objdump.memory_ref_char)) {
> op_loc->mem_ref = true;
> op_loc->multi_regs = multi_regs;
> extract_reg_offset(arch, insn_str, op_loc);
> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
> index 61f0f1656f82..252cb0d1f5d1 100644
> --- a/tools/perf/util/disasm.c
> +++ b/tools/perf/util/disasm.c
> @@ -37,6 +37,7 @@ static struct ins_ops mov_ops;
> static struct ins_ops nop_ops;
> static struct ins_ops lock_ops;
> static struct ins_ops ret_ops;
> +static struct ins_ops load_store_ops;
>
> static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
> struct ins_operands *ops, int max_ins_name);
> @@ -517,7 +518,7 @@ static int lock__parse(struct arch *arch, struct ins_operands *ops, struct map_s
> if (disasm_line__parse(ops->raw, &ops->locked.ins.name, &ops->locked.ops->raw) < 0)
> goto out_free_ops;
>
> - ops->locked.ins.ops = ins__find(arch, ops->locked.ins.name);
> + ops->locked.ins.ops = ins__find(arch, ops->locked.ins.name, 0);
>
> if (ops->locked.ins.ops == NULL)
> goto out_free_ops;
> @@ -672,6 +673,47 @@ static struct ins_ops mov_ops = {
> .scnprintf = mov__scnprintf,
> };
>
> +static int load_store__scnprintf(struct ins *ins, char *bf, size_t size,
> + struct ins_operands *ops, int max_ins_name)
> +{
> + return scnprintf(bf, size, "%-*s %s", max_ins_name, ins->name,
> + ops->raw);
> +}
> +
> +/*
> + * Sets the fields: "raw_insn", opcode, multi_regs and "mem_ref".
> + * "mem_ref" is set for ops->source which is later used to
> + * fill the objdump->memory_ref-char field. This ops is currently
> + * used by powerpc and since binary instruction code is used to
> + * extract opcode, regs and offset, no other parsing is needed here
> + */
> +static int load_store__parse(struct arch *arch __maybe_unused, struct ins_operands *ops,
> + struct map_symbol *ms __maybe_unused)
> +{
> + ops->source.raw_insn = ops->raw_insn;
> + ops->source.mem_ref = true;
> + ops->source.opcode = ops->opcode;
> + ops->source.multi_regs = false;
> +
> + if (!ops->source.raw_insn)
> + return -1;
> +
> + ops->target.raw_insn = ops->raw_insn;
> + ops->target.mem_ref = false;
> + ops->target.multi_regs = false;
> + ops->target.opcode = ops->opcode;
> +
> + if (!ops->target.raw_insn)
> + return -1;
> +
> + return 0;
> +}
> +
> +static struct ins_ops load_store_ops = {
> + .parse = load_store__parse,
> + .scnprintf = load_store__scnprintf,
> +};
> +
> static int dec__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map_symbol *ms __maybe_unused)
> {
> char *target, *comment, *s, prev;
> @@ -762,11 +804,23 @@ static void ins__sort(struct arch *arch)
> qsort(arch->instructions, nmemb, sizeof(struct ins), ins__cmp);
> }
>
> -static struct ins_ops *__ins__find(struct arch *arch, const char *name)
> +static struct ins_ops *__ins__find(struct arch *arch, const char *name, int raw_insn)
> {
> struct ins *ins;
> const int nmemb = arch->nr_instructions;
>
> + if (arch__is(arch, "powerpc")) {
> + /*
> + * For powerpc, identify the instruction ops
> + * from the opcode using raw_insn.
> + */
> + struct ins_ops *ops;
> +
> + ops = check_ppc_insn(raw_insn);
> + if (ops)
> + return ops;
> + }
> +
> if (!arch->sorted_instructions) {
> ins__sort(arch);
> arch->sorted_instructions = true;
> @@ -796,9 +850,9 @@ static struct ins_ops *__ins__find(struct arch *arch, const char *name)
> return ins ? ins->ops : NULL;
> }
>
> -struct ins_ops *ins__find(struct arch *arch, const char *name)
> +struct ins_ops *ins__find(struct arch *arch, const char *name, int raw_insn)
> {
> - struct ins_ops *ops = __ins__find(arch, name);
> + struct ins_ops *ops = __ins__find(arch, name, raw_insn);
>
> if (!ops && arch->associate_instruction_ops)
> ops = arch->associate_instruction_ops(arch, name);
> @@ -808,7 +862,7 @@ struct ins_ops *ins__find(struct arch *arch, const char *name)
>
> static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map_symbol *ms)
> {
> - dl->ins.ops = ins__find(arch, dl->ins.name);
> + dl->ins.ops = ins__find(arch, dl->ins.name, dl->ops.raw_insn);
>
> if (!dl->ins.ops)
> return;
> diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
> index a391e1bb81f7..831ebcc329cd 100644
> --- a/tools/perf/util/disasm.h
> +++ b/tools/perf/util/disasm.h
> @@ -62,6 +62,7 @@ struct ins_operands {
> bool offset_avail;
> bool outside;
> bool multi_regs;
> + bool mem_ref;
> } target;
> union {
> struct {
> @@ -71,6 +72,7 @@ struct ins_operands {
> int raw_insn;
> u64 addr;
> bool multi_regs;
> + bool mem_ref;
> } source;
> struct {
> struct ins ins;
> @@ -104,7 +106,7 @@ struct annotate_args {
> struct arch *arch__find(const char *name);
> bool arch__is(struct arch *arch, const char *name);
>
> -struct ins_ops *ins__find(struct arch *arch, const char *name);
> +struct ins_ops *ins__find(struct arch *arch, const char *name, int raw_insn);
> int ins__scnprintf(struct ins *ins, char *bf, size_t size,
> struct ins_operands *ops, int max_ins_name);
>
> diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
> index 01fb25a1150a..7ea39362ecaf 100644
> --- a/tools/perf/util/include/dwarf-regs.h
> +++ b/tools/perf/util/include/dwarf-regs.h
> @@ -1,6 +1,7 @@
> /* SPDX-License-Identifier: GPL-2.0 */
> #ifndef _PERF_DWARF_REGS_H_
> #define _PERF_DWARF_REGS_H_
> +#include "annotate.h"
>
> #define DWARF_REG_PC 0xd3af9c /* random number */
> #define DWARF_REG_FB 0xd3affb /* random number */
> @@ -31,6 +32,8 @@ static inline int get_dwarf_regnum(const char *name __maybe_unused,
> }
> #endif
>
> +void get_arch_regs(int raw_insn, int is_source, struct annotated_op_loc *op_loc);
> +
> #ifdef HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
> /*
> * Arch should support fetching the offset of a register in pt_regs
> --
> 2.43.0
>

2024-06-06 07:04:31

by Namhyung Kim

Subject: Re: [PATCH V3 10/14] tools/perf: Update instruction tracking for powerpc

On Sat, Jun 01, 2024 at 11:39:37AM +0530, Athira Rajeev wrote:
> Add instruction tracking function "update_insn_state_powerpc" for
> powerpc. Example sequence in powerpc:
>
> ld r10,264(r3)
> mr r31,r3
> <<after some sequence>
> ld r9,312(r31)
>
> Consider a sample pointing to: "ld r9,312(r31)".
> Here the memory reference is hit at "312(r31)", where 312 is the offset
> and r31 is the source register. The previous instruction sequence shows that
> the register state of r3 is moved to r31. So to identify the data type for the
> r31 access, the previous instruction ("mr") needs to be tracked and the
> type state entry has to be updated. Current instruction tracking support
> in the perf tools infrastructure is specific to x86. This patch adds this
> support for powerpc as well.
>
> Signed-off-by: Athira Rajeev <[email protected]>
> ---
> .../perf/arch/powerpc/annotate/instructions.c | 65 +++++++++++++++++++
> tools/perf/util/annotate-data.c | 9 ++-
> tools/perf/util/disasm.c | 1 +
> 3 files changed, 74 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
> index db72148eb857..3ecf5a986037 100644
> --- a/tools/perf/arch/powerpc/annotate/instructions.c
> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
> @@ -231,6 +231,71 @@ static struct ins_ops *check_ppc_insn(int raw_insn)
> return NULL;
> }
>
> +/*
> + * Instruction tracking function to track register state moves.
> + * Example sequence:
> + * ld r10,264(r3)
> + * mr r31,r3
> + * <<after some sequence>
> + * ld r9,312(r31)
> + *
> + * Previous instruction sequence shows that register state of r3
> + * is moved to r31. update_insn_state_powerpc tracks these state
> + * changes
> + */
> +#ifdef HAVE_DWARF_SUPPORT
> +static void update_insn_state_powerpc(struct type_state *state,
> + struct data_loc_info *dloc, Dwarf_Die * cu_die __maybe_unused,
> + struct disasm_line *dl)
> +{
> + struct annotated_insn_loc loc;
> + struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
> + struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
> + struct type_state_reg *tsr;
> + u32 insn_offset = dl->al.offset;
> +
> + if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
> + return;
> +
> + /*
> + * Value 444 for bits 21:30 is for "mr"
> + * instruction. "mr" is extended OR. So set the
> + * source and destination reg correctly
> + */
> + if (PPC_21_30(dl->ops.raw_insn) == 444) {
> + int src_reg = src->reg1;
> +
> + src->reg1 = dst->reg1;
> + dst->reg1 = src_reg;
> + }
> +
> + if (!has_reg_type(state, dst->reg1))
> + return;
> +
> + tsr = &state->regs[dst->reg1];
> +
> + if (!has_reg_type(state, src->reg1) ||
> + !state->regs[src->reg1].ok) {
> + tsr->ok = false;
> + return;
> + }
> +
> + tsr->type = state->regs[src->reg1].type;
> + tsr->kind = state->regs[src->reg1].kind;
> + tsr->ok = true;
> +
> + pr_debug("mov [%x] reg%d -> reg%d",

pr_debug_dtp() ?

Thanks,
Namhyung
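
The register-state move described in the quoted commit message can be sketched roughly as below. This is a simplified illustration, not the perf implementation: demo_type_state, demo_reg_state and demo_track_mr are hypothetical names, the type is reduced to an integer id, and the field positions are assumed from the Power ISA X-form. The sketch is also stricter than the quoted code: it only acts when the primary opcode is 31, the extended opcode is 444 ("or") and RS equals RB, which is the "mr" form.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NR_REGS	32

/* Hypothetical field macros; bit positions assumed from the Power ISA. */
#define DEMO_OP(insn)		(((insn) >> 26) & 0x3F)
#define DEMO_RS(insn)		(((insn) >> 21) & 0x1F)	/* same bits as RT */
#define DEMO_RA(insn)		(((insn) >> 16) & 0x1F)
#define DEMO_RB(insn)		(((insn) >> 11) & 0x1F)
#define DEMO_21_30(insn)	(((insn) >> 1) & 0x3FF)	/* extended opcode */

/* Reduced stand-in for struct type_state_reg. */
struct demo_reg_state {
	int type_id;	/* placeholder for the DWARF type */
	bool ok;	/* true when the type is known */
};

struct demo_type_state {
	struct demo_reg_state regs[DEMO_NR_REGS];
};

/*
 * "mr RA,RS" is "or RA,RS,RS": the destination sits in the RA field and
 * the source in the RS field, the reverse of a D-form load. Returns true
 * when the instruction was recognised as a register move.
 */
static bool demo_track_mr(struct demo_type_state *st, uint32_t insn)
{
	unsigned int src, dst;

	if (DEMO_OP(insn) != 31 || DEMO_21_30(insn) != 444)
		return false;
	if (DEMO_RS(insn) != DEMO_RB(insn))
		return false;	/* a general "or", not a plain move */

	src = DEMO_RS(insn);
	dst = DEMO_RA(insn);

	if (!st->regs[src].ok) {
		/* Source type unknown: invalidate the destination. */
		st->regs[dst].ok = false;
		return true;
	}

	st->regs[dst] = st->regs[src];
	return true;
}
```

With "mr r31,r3" (0x7c7f1b78), a known type in r3 is copied to r31, so a later "ld r9,312(r31)" can be resolved against the same type.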


> + insn_offset, src->reg1, dst->reg1);
> + pr_debug_type_name(&tsr->type, tsr->kind);
> +}
> +#else /* HAVE_DWARF_SUPPORT */
> +static void update_insn_state_powerpc(struct type_state *state __maybe_unused, struct data_loc_info *dloc __maybe_unused,
> + Dwarf_Die * cu_die __maybe_unused, struct disasm_line *dl __maybe_unused)
> +{
> + return;
> +}
> +#endif /* HAVE_DWARF_SUPPORT */
> +
> static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
> {
> if (!arch->initialized) {
> diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
> index 7a48c3d72b89..734acdd8c4b7 100644
> --- a/tools/perf/util/annotate-data.c
> +++ b/tools/perf/util/annotate-data.c
> @@ -1080,6 +1080,13 @@ static int find_data_type_insn(struct data_loc_info *dloc,
> return ret;
> }
>
> +static int arch_supports_insn_tracking(struct data_loc_info *dloc)
> +{
> + if ((arch__is(dloc->arch, "x86")) || (arch__is(dloc->arch, "powerpc")))
> + return 1;
> + return 0;
> +}
> +
> /*
> * Construct a list of basic blocks for each scope with variables and try to find
> * the data type by updating a type state table through instructions.
> @@ -1094,7 +1101,7 @@ static int find_data_type_block(struct data_loc_info *dloc,
> int ret = -1;
>
> /* TODO: other architecture support */
> - if (!arch__is(dloc->arch, "x86"))
> + if (!arch_supports_insn_tracking(dloc))
> return -1;
>
> prev_dst_ip = dst_ip = dloc->ip;
> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
> index 57af4dc42a58..d8b357055302 100644
> --- a/tools/perf/util/disasm.c
> +++ b/tools/perf/util/disasm.c
> @@ -155,6 +155,7 @@ static struct arch architectures[] = {
> {
> .name = "powerpc",
> .init = powerpc__annotate_init,
> + .update_insn_state = update_insn_state_powerpc,
> },
> {
> .name = "riscv64",
> --
> 2.43.0
>

2024-06-08 07:08:51

by Athira Rajeev

Subject: Re: [PATCH V3 06/14] tools/perf: Update parameters for reg extract functions to use raw instruction on powerpc



> On 6 Jun 2024, at 12:22 PM, Namhyung Kim <[email protected]> wrote:
>
> On Sat, Jun 01, 2024 at 11:39:33AM +0530, Athira Rajeev wrote:
>> Use the raw instruction code and macros to identify memory instructions,
>> extract register fields and also offset. The implementation addresses
>> the D-form, X-form, DS-form instructions. Two main functions are added.
>> A new parse function, "load_store__parse", serves as the instruction ops
>> parser for memory instructions. Unlike other parsers (like mov__parse), this
>> parser fills in the "raw_insn" field for source/target and the newly added
>> "mem_ref" field. It also sets multi_regs and opcode. No other fields
>> are set because there is no need to parse the disassembled
>> code here: arch-specific macros take care of extracting the offset and
>> regs, which is easier and more precise.
>>
>> In powerpc, all instructions with a primary opcode from 32 to 63
>> are memory instructions. Update "ins__find" function to have "raw_insn"
>> also as a parameter. Don't use "extract_reg_offset"; instead use the
>> newly added function "get_arch_regs", which sets these fields: reg1,
>> reg2, offset, depending on whether it is a source or target op.
>>
>> Signed-off-by: Athira Rajeev <[email protected]>
>> ---
>> .../perf/arch/powerpc/annotate/instructions.c | 16 +++++
>> tools/perf/arch/powerpc/util/dwarf-regs.c | 44 +++++++++++++
>> tools/perf/util/annotate.c | 25 +++++++-
>> tools/perf/util/disasm.c | 64 +++++++++++++++++--
>> tools/perf/util/disasm.h | 4 +-
>> tools/perf/util/include/dwarf-regs.h | 3 +
>> 6 files changed, 147 insertions(+), 9 deletions(-)
>>
>> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
>> index d57fd023ef9c..10fea5e5cf4c 100644
>> --- a/tools/perf/arch/powerpc/annotate/instructions.c
>> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
>> @@ -49,6 +49,22 @@ static struct ins_ops *powerpc__associate_instruction_ops(struct arch *arch, con
>> return ops;
>> }
>>
>> +#define PPC_OP(op) (((op) >> 26) & 0x3F)
>> +
>> +static struct ins_ops *check_ppc_insn(int raw_insn)
>> +{
>> + int opcode = PPC_OP(raw_insn);
>> +
>> + /*
>> + * Instructions with opcode 32 to 63 are memory
>> + * instructions in powerpc
>> + */
>> + if ((opcode & 0x20))
>> + return &load_store_ops;
>> +
>> + return NULL;
>> +}
>> +
>> static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
>> {
>> if (!arch->initialized) {
>> diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
>> index 430623ca5612..38b74fa01d8b 100644
>> --- a/tools/perf/arch/powerpc/util/dwarf-regs.c
>> +++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
>> @@ -107,3 +107,47 @@ int regs_query_register_offset(const char *name)
>> #define PPC_DS(DS) ((DS) & 0xfffc)
>> #define OP_LD 58
>> #define OP_STD 62
>> +
>> +static int get_source_reg(unsigned int raw_insn)
>> +{
>> + return PPC_RA(raw_insn);
>> +}
>> +
>> +static int get_target_reg(unsigned int raw_insn)
>> +{
>> + return PPC_RT(raw_insn);
>> +}
>> +
>> +static int get_offset_opcode(int raw_insn __maybe_unused)
>
> The argument is used below, no need for __maybe_unused.
>
>> +{
>> + int opcode = PPC_OP(raw_insn);
>> +
>> + /* DS- form */
>> + if ((opcode == OP_LD) || (opcode == OP_STD))
>> + return PPC_DS(raw_insn);
>> + else
>> + return PPC_D(raw_insn);
>> +}
>> +
>> +/*
>> + * Fills the required fields for op_loc depending on if it
>> + * is a source or target.
>> + * D form: ins RT,D(RA) -> src_reg1 = RA, offset = D, dst_reg1 = RT
>> + * DS form: ins RT,DS(RA) -> src_reg1 = RA, offset = DS, dst_reg1 = RT
>> + * X form: ins RT,RA,RB -> src_reg1 = RA, src_reg2 = RB, dst_reg1 = RT
>> + */
>> +void get_arch_regs(int raw_insn __maybe_unused, int is_source __maybe_unused,
>> + struct annotated_op_loc *op_loc __maybe_unused)
>
> Ditto.

Yes, right. Will fix it in both places.

Thanks
Athira
>
> Thanks,
> Namhyung

>
>
>> +{
>> + if (is_source)
>> + op_loc->reg1 = get_source_reg(raw_insn);
>> + else
>> + op_loc->reg1 = get_target_reg(raw_insn);
>> +
>> + if (op_loc->multi_regs)
>> + op_loc->reg2 = PPC_RB(raw_insn);
>> +
>> + /* TODO: Implement offset handling for X Form */
>> + if ((op_loc->mem_ref) && (PPC_OP(raw_insn) != 31))
>> + op_loc->offset = get_offset_opcode(raw_insn);
>> +}
>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>> index 1451caf25e77..2b8cc759ae35 100644
>> --- a/tools/perf/util/annotate.c
>> +++ b/tools/perf/util/annotate.c
>> @@ -2079,6 +2079,12 @@ static int extract_reg_offset(struct arch *arch, const char *str,
>> return 0;
>> }
>>
>> +__weak void get_arch_regs(int raw_insn __maybe_unused, int is_source __maybe_unused,
>> + struct annotated_op_loc *op_loc __maybe_unused)
>> +{
>> + return;
>> +}
>> +
>> /**
>> * annotate_get_insn_location - Get location of instruction
>> * @arch: the architecture info
>> @@ -2123,20 +2129,33 @@ int annotate_get_insn_location(struct arch *arch, struct disasm_line *dl,
>> for_each_insn_op_loc(loc, i, op_loc) {
>> const char *insn_str = ops->source.raw;
>> bool multi_regs = ops->source.multi_regs;
>> + bool mem_ref = ops->source.mem_ref;
>>
>> if (i == INSN_OP_TARGET) {
>> insn_str = ops->target.raw;
>> multi_regs = ops->target.multi_regs;
>> + mem_ref = ops->target.mem_ref;
>> }
>>
>> /* Invalidate the register by default */
>> op_loc->reg1 = -1;
>> op_loc->reg2 = -1;
>>
>> - if (insn_str == NULL)
>> - continue;
>> + if (insn_str == NULL) {
>> + if (!arch__is(arch, "powerpc"))
>> + continue;
>> + }
>>
>> - if (strchr(insn_str, arch->objdump.memory_ref_char)) {
>> + /*
>> + * For powerpc, call get_arch_regs function which extracts the
>> + * required fields for op_loc, ie reg1, reg2, offset from the
>> + * raw instruction.
>> + */
>> + if (arch__is(arch, "powerpc")) {
>> + op_loc->mem_ref = mem_ref;
>> + op_loc->multi_regs = multi_regs;
>> + get_arch_regs(ops->raw_insn, !i, op_loc);
>> + } else if (strchr(insn_str, arch->objdump.memory_ref_char)) {
>> op_loc->mem_ref = true;
>> op_loc->multi_regs = multi_regs;
>> extract_reg_offset(arch, insn_str, op_loc);
>> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
>> index 61f0f1656f82..252cb0d1f5d1 100644
>> --- a/tools/perf/util/disasm.c
>> +++ b/tools/perf/util/disasm.c
>> @@ -37,6 +37,7 @@ static struct ins_ops mov_ops;
>> static struct ins_ops nop_ops;
>> static struct ins_ops lock_ops;
>> static struct ins_ops ret_ops;
>> +static struct ins_ops load_store_ops;
>>
>> static int jump__scnprintf(struct ins *ins, char *bf, size_t size,
>> struct ins_operands *ops, int max_ins_name);
>> @@ -517,7 +518,7 @@ static int lock__parse(struct arch *arch, struct ins_operands *ops, struct map_s
>> if (disasm_line__parse(ops->raw, &ops->locked.ins.name, &ops->locked.ops->raw) < 0)
>> goto out_free_ops;
>>
>> - ops->locked.ins.ops = ins__find(arch, ops->locked.ins.name);
>> + ops->locked.ins.ops = ins__find(arch, ops->locked.ins.name, 0);
>>
>> if (ops->locked.ins.ops == NULL)
>> goto out_free_ops;
>> @@ -672,6 +673,47 @@ static struct ins_ops mov_ops = {
>> .scnprintf = mov__scnprintf,
>> };
>>
>> +static int load_store__scnprintf(struct ins *ins, char *bf, size_t size,
>> + struct ins_operands *ops, int max_ins_name)
>> +{
>> + return scnprintf(bf, size, "%-*s %s", max_ins_name, ins->name,
>> + ops->raw);
>> +}
>> +
>> +/*
>> + * Sets the fields: "raw_insn", opcode, multi_regs and "mem_ref".
>> + * "mem_ref" is set for ops->source which is later used to
>> + * fill the objdump->memory_ref-char field. This ops is currently
>> + * used by powerpc and since binary instruction code is used to
>> + * extract opcode, regs and offset, no other parsing is needed here
>> + */
>> +static int load_store__parse(struct arch *arch __maybe_unused, struct ins_operands *ops,
>> + struct map_symbol *ms __maybe_unused)
>> +{
>> + ops->source.raw_insn = ops->raw_insn;
>> + ops->source.mem_ref = true;
>> + ops->source.opcode = ops->opcode;
>> + ops->source.multi_regs = false;
>> +
>> + if (!ops->source.raw_insn)
>> + return -1;
>> +
>> + ops->target.raw_insn = ops->raw_insn;
>> + ops->target.mem_ref = false;
>> + ops->target.multi_regs = false;
>> + ops->target.opcode = ops->opcode;
>> +
>> + if (!ops->target.raw_insn)
>> + return -1;
>> +
>> + return 0;
>> +}
>> +
>> +static struct ins_ops load_store_ops = {
>> + .parse = load_store__parse,
>> + .scnprintf = load_store__scnprintf,
>> +};
>> +
>> static int dec__parse(struct arch *arch __maybe_unused, struct ins_operands *ops, struct map_symbol *ms __maybe_unused)
>> {
>> char *target, *comment, *s, prev;
>> @@ -762,11 +804,23 @@ static void ins__sort(struct arch *arch)
>> qsort(arch->instructions, nmemb, sizeof(struct ins), ins__cmp);
>> }
>>
>> -static struct ins_ops *__ins__find(struct arch *arch, const char *name)
>> +static struct ins_ops *__ins__find(struct arch *arch, const char *name, int raw_insn)
>> {
>> struct ins *ins;
>> const int nmemb = arch->nr_instructions;
>>
>> + if (arch__is(arch, "powerpc")) {
>> + /*
>> + * For powerpc, identify the instruction ops
>> + * from the opcode using raw_insn.
>> + */
>> + struct ins_ops *ops;
>> +
>> + ops = check_ppc_insn(raw_insn);
>> + if (ops)
>> + return ops;
>> + }
>> +
>> if (!arch->sorted_instructions) {
>> ins__sort(arch);
>> arch->sorted_instructions = true;
>> @@ -796,9 +850,9 @@ static struct ins_ops *__ins__find(struct arch *arch, const char *name)
>> return ins ? ins->ops : NULL;
>> }
>>
>> -struct ins_ops *ins__find(struct arch *arch, const char *name)
>> +struct ins_ops *ins__find(struct arch *arch, const char *name, int raw_insn)
>> {
>> - struct ins_ops *ops = __ins__find(arch, name);
>> + struct ins_ops *ops = __ins__find(arch, name, raw_insn);
>>
>> if (!ops && arch->associate_instruction_ops)
>> ops = arch->associate_instruction_ops(arch, name);
>> @@ -808,7 +862,7 @@ struct ins_ops *ins__find(struct arch *arch, const char *name)
>>
>> static void disasm_line__init_ins(struct disasm_line *dl, struct arch *arch, struct map_symbol *ms)
>> {
>> - dl->ins.ops = ins__find(arch, dl->ins.name);
>> + dl->ins.ops = ins__find(arch, dl->ins.name, dl->ops.raw_insn);
>>
>> if (!dl->ins.ops)
>> return;
>> diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
>> index a391e1bb81f7..831ebcc329cd 100644
>> --- a/tools/perf/util/disasm.h
>> +++ b/tools/perf/util/disasm.h
>> @@ -62,6 +62,7 @@ struct ins_operands {
>> bool offset_avail;
>> bool outside;
>> bool multi_regs;
>> + bool mem_ref;
>> } target;
>> union {
>> struct {
>> @@ -71,6 +72,7 @@ struct ins_operands {
>> int raw_insn;
>> u64 addr;
>> bool multi_regs;
>> + bool mem_ref;
>> } source;
>> struct {
>> struct ins ins;
>> @@ -104,7 +106,7 @@ struct annotate_args {
>> struct arch *arch__find(const char *name);
>> bool arch__is(struct arch *arch, const char *name);
>>
>> -struct ins_ops *ins__find(struct arch *arch, const char *name);
>> +struct ins_ops *ins__find(struct arch *arch, const char *name, int raw_insn);
>> int ins__scnprintf(struct ins *ins, char *bf, size_t size,
>> struct ins_operands *ops, int max_ins_name);
>>
>> diff --git a/tools/perf/util/include/dwarf-regs.h b/tools/perf/util/include/dwarf-regs.h
>> index 01fb25a1150a..7ea39362ecaf 100644
>> --- a/tools/perf/util/include/dwarf-regs.h
>> +++ b/tools/perf/util/include/dwarf-regs.h
>> @@ -1,6 +1,7 @@
>> /* SPDX-License-Identifier: GPL-2.0 */
>> #ifndef _PERF_DWARF_REGS_H_
>> #define _PERF_DWARF_REGS_H_
>> +#include "annotate.h"
>>
>> #define DWARF_REG_PC 0xd3af9c /* random number */
>> #define DWARF_REG_FB 0xd3affb /* random number */
>> @@ -31,6 +32,8 @@ static inline int get_dwarf_regnum(const char *name __maybe_unused,
>> }
>> #endif
>>
>> +void get_arch_regs(int raw_insn, int is_source, struct annotated_op_loc *op_loc);
>> +
>> #ifdef HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
>> /*
>> * Arch should support fetching the offset of a register in pt_regs
>> --
>> 2.43.0



2024-06-08 08:12:10

by Athira Rajeev

Subject: Re: [PATCH V3 10/14] tools/perf: Update instruction tracking for powerpc



> On 6 Jun 2024, at 12:23 PM, Namhyung Kim <[email protected]> wrote:
>
> On Sat, Jun 01, 2024 at 11:39:37AM +0530, Athira Rajeev wrote:
>> Add instruction tracking function "update_insn_state_powerpc" for
>> powerpc. Example sequence in powerpc:
>>
>> ld r10,264(r3)
>> mr r31,r3
>> <<after some sequence>
>> ld r9,312(r31)
>>
>> Consider a sample pointing to: "ld r9,312(r31)".
>> Here the memory reference is hit at "312(r31)", where 312 is the offset
>> and r31 is the source register. The previous instruction sequence shows that
>> the register state of r3 is moved to r31. So to identify the data type for the
>> r31 access, the previous instruction ("mr") needs to be tracked and the
>> type state entry has to be updated. Current instruction tracking support
>> in the perf tools infrastructure is specific to x86. This patch adds this
>> support for powerpc as well.
>>
>> Signed-off-by: Athira Rajeev <[email protected]>
>> ---
>> .../perf/arch/powerpc/annotate/instructions.c | 65 +++++++++++++++++++
>> tools/perf/util/annotate-data.c | 9 ++-
>> tools/perf/util/disasm.c | 1 +
>> 3 files changed, 74 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
>> index db72148eb857..3ecf5a986037 100644
>> --- a/tools/perf/arch/powerpc/annotate/instructions.c
>> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
>> @@ -231,6 +231,71 @@ static struct ins_ops *check_ppc_insn(int raw_insn)
>> return NULL;
>> }
>>
>> +/*
>> + * Instruction tracking function to track register state moves.
>> + * Example sequence:
>> + * ld r10,264(r3)
>> + * mr r31,r3
>> + * <<after some sequence>
>> + * ld r9,312(r31)
>> + *
>> + * Previous instruction sequence shows that register state of r3
>> + * is moved to r31. update_insn_state_powerpc tracks these state
>> + * changes
>> + */
>> +#ifdef HAVE_DWARF_SUPPORT
>> +static void update_insn_state_powerpc(struct type_state *state,
>> + struct data_loc_info *dloc, Dwarf_Die * cu_die __maybe_unused,
>> + struct disasm_line *dl)
>> +{
>> + struct annotated_insn_loc loc;
>> + struct annotated_op_loc *src = &loc.ops[INSN_OP_SOURCE];
>> + struct annotated_op_loc *dst = &loc.ops[INSN_OP_TARGET];
>> + struct type_state_reg *tsr;
>> + u32 insn_offset = dl->al.offset;
>> +
>> + if (annotate_get_insn_location(dloc->arch, dl, &loc) < 0)
>> + return;
>> +
>> + /*
>> + * Value 444 for bits 21:30 is for "mr"
>> + * instruction. "mr" is extended OR. So set the
>> + * source and destination reg correctly
>> + */
>> + if (PPC_21_30(dl->ops.raw_insn) == 444) {
>> + int src_reg = src->reg1;
>> +
>> + src->reg1 = dst->reg1;
>> + dst->reg1 = src_reg;
>> + }
>> +
>> + if (!has_reg_type(state, dst->reg1))
>> + return;
>> +
>> + tsr = &state->regs[dst->reg1];
>> +
>> + if (!has_reg_type(state, src->reg1) ||
>> + !state->regs[src->reg1].ok) {
>> + tsr->ok = false;
>> + return;
>> + }
>> +
>> + tsr->type = state->regs[src->reg1].type;
>> + tsr->kind = state->regs[src->reg1].kind;
>> + tsr->ok = true;
>> +
>> + pr_debug("mov [%x] reg%d -> reg%d",
>
> pr_debug_dtp() ?

Sure, I will change this in V4.

Thanks
Athira
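
For readers following the thread, here is a minimal sketch of the register state move that the quoted update_insn_state_powerpc() comment describes. It is not the perf implementation: struct reg_state, track_mr() and the macro definitions below are hypothetical stand-ins (PPC_21_30 is used by the patch, but its definition is not shown in this message, so the one here is an assumption based on the "bits 21:30" wording). Unlike the quoted patch, the sketch also checks that both source fields of the "or" match, since only then is it a plain register move.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_REGS 32

/* Assumed definitions: field positions of an X-form instruction word. */
#define PPC_OP(insn)    (((insn) >> 26) & 0x3f)  /* primary opcode */
#define PPC_RS(insn)    (((insn) >> 21) & 0x1f)  /* source register */
#define PPC_RA(insn)    (((insn) >> 16) & 0x1f)  /* destination of "or" */
#define PPC_RB(insn)    (((insn) >> 11) & 0x1f)  /* second source */
#define PPC_21_30(insn) (((insn) >> 1) & 0x3ff)  /* extended opcode */

#define OP_31 31   /* primary opcode shared by X-form ALU instructions */
#define XO_OR 444  /* "mr rA,rS" is an alias for "or rA,rS,rS" */

/* Hypothetical, simplified stand-in for perf's per-register type state. */
struct reg_state {
	int  type_id;  /* placeholder for the DWARF type information */
	bool ok;       /* true when type_id is valid */
};

/* Propagate the type state from the source to the target of an "mr". */
static void track_mr(struct reg_state regs[NR_REGS], uint32_t insn)
{
	unsigned int src, dst;

	if (PPC_OP(insn) != OP_31 || PPC_21_30(insn) != XO_OR)
		return;
	/* An "or" with two different sources is not a register move. */
	if (PPC_RS(insn) != PPC_RB(insn))
		return;

	src = PPC_RS(insn);
	dst = PPC_RA(insn);
	assert(src < NR_REGS && dst < NR_REGS);

	if (!regs[src].ok) {
		/* Source type unknown: the target no longer has a known type. */
		regs[dst].ok = false;
		return;
	}
	regs[dst] = regs[src];
}
```

With this sketch, feeding 0x7c7f1b78 (the encoding of "mr r31,r3") copies whatever is recorded for r3 into r31, which is what a later "ld r9,312(r31)" needs in order to be resolved against the type of r3.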
>
> Thanks,
> Namhyung
>
>
>> + insn_offset, src->reg1, dst->reg1);
>> + pr_debug_type_name(&tsr->type, tsr->kind);
>> +}
>> +#else /* HAVE_DWARF_SUPPORT */
>> +static void update_insn_state_powerpc(struct type_state *state __maybe_unused, struct data_loc_info *dloc __maybe_unused,
>> + Dwarf_Die * cu_die __maybe_unused, struct disasm_line *dl __maybe_unused)
>> +{
>> + return;
>> +}
>> +#endif /* HAVE_DWARF_SUPPORT */
>> +
>> static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
>> {
>> if (!arch->initialized) {
>> diff --git a/tools/perf/util/annotate-data.c b/tools/perf/util/annotate-data.c
>> index 7a48c3d72b89..734acdd8c4b7 100644
>> --- a/tools/perf/util/annotate-data.c
>> +++ b/tools/perf/util/annotate-data.c
>> @@ -1080,6 +1080,13 @@ static int find_data_type_insn(struct data_loc_info *dloc,
>> return ret;
>> }
>>
>> +static int arch_supports_insn_tracking(struct data_loc_info *dloc)
>> +{
>> + if ((arch__is(dloc->arch, "x86")) || (arch__is(dloc->arch, "powerpc")))
>> + return 1;
>> + return 0;
>> +}
>> +
>> /*
>> * Construct a list of basic blocks for each scope with variables and try to find
>> * the data type by updating a type state table through instructions.
>> @@ -1094,7 +1101,7 @@ static int find_data_type_block(struct data_loc_info *dloc,
>> int ret = -1;
>>
>> /* TODO: other architecture support */
>> - if (!arch__is(dloc->arch, "x86"))
>> + if (!arch_supports_insn_tracking(dloc))
>> return -1;
>>
>> prev_dst_ip = dst_ip = dloc->ip;
>> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
>> index 57af4dc42a58..d8b357055302 100644
>> --- a/tools/perf/util/disasm.c
>> +++ b/tools/perf/util/disasm.c
>> @@ -155,6 +155,7 @@ static struct arch architectures[] = {
>> {
>> .name = "powerpc",
>> .init = powerpc__annotate_init,
>> + .update_insn_state = update_insn_state_powerpc,
>> },
>> {
>> .name = "riscv64",
>> --
>> 2.43.0



2024-06-08 08:12:35

by Athira Rajeev

[permalink] [raw]
Subject: Re: [PATCH V3 05/14] tools/perf: Add disasm_line__parse to parse raw instruction for powerpc



> On 6 Jun 2024, at 12:03 PM, Namhyung Kim <[email protected]> wrote:
>
> Hello,
>
> On Sat, Jun 01, 2024 at 11:39:32AM +0530, Athira Rajeev wrote:
>> Currently, the perf tool infrastructure disasm_line__parse function to
>> parse disassembled line.
>>
>> Example snippet from objdump:
>> objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>
>>
>> c0000000010224b4: lwz r10,0(r9)
>>
>> This line "lwz r10,0(r9)" is parsed to extract instruction name,
>> registers names and offset. In powerpc, the approach for data type
>> profiling uses raw instruction instead of result from objdump to identify
>> the instruction category and extract the source/target registers.
>>
>> Example: 38 01 81 e8 ld r4,312(r1)
>>
>> Here "38 01 81 e8" is the raw instruction representation. Add function
>> "disasm_line__parse_powerpc" to handle parsing of raw instruction. Also
>> update "struct ins" and "struct ins_operands" to save "opcode" and
>> binary code. With the change, function captures:
>>
>> line -> "38 01 81 e8 ld r4,312(r1)"
>> opcode and raw instruction "38 01 81 e8"
>>
>> Raw instruction is used later to extract the reg/offset fields. Macros
>> are added to extract opcode and register fields. "struct ins_operands"
>> and "struct ins" is updated to carry opcode and raw instruction binary
>> code (raw_insn). Function "disasm_line__parse_powerpc fills the raw
>> instruction hex value and opcode in newly added fields. There is no
>> changes in existing code paths, which parses the disassembled code.
>> The architecture using the instruction name and present approach is
>> not altered. Since this approach targets powerpc, the macro
>> implementation is added for powerpc as of now.
>>
>> Since the disasm_line__parse is used in other cases (perf annotate) and
>> not only data tye profiling, the powerpc callback includes changes to
>> work with binary code as well as mneumonic representation. Also in case
>> if the DSO read fails and libcapstone is not supported, the approach
>> fallback to use objdump as option. Hence as option, patch has changes to
>> ensure objdump option also works well.
>>
>> Signed-off-by: Athira Rajeev <[email protected]>
>> ---
>> tools/include/linux/string.h | 2 +
>> tools/lib/string.c | 13 ++++
>> .../perf/arch/powerpc/annotate/instructions.c | 1 +
>> tools/perf/arch/powerpc/util/dwarf-regs.c | 9 +++
>> tools/perf/util/disasm.c | 63 ++++++++++++++++++-
>> tools/perf/util/disasm.h | 7 +++
>> 6 files changed, 94 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/include/linux/string.h b/tools/include/linux/string.h
>> index db5c99318c79..0acb1fc14e19 100644
>> --- a/tools/include/linux/string.h
>> +++ b/tools/include/linux/string.h
>> @@ -46,5 +46,7 @@ extern char * __must_check skip_spaces(const char *);
>>
>> extern char *strim(char *);
>>
>> +extern void remove_spaces(char *s);
>> +
>> extern void *memchr_inv(const void *start, int c, size_t bytes);
>> #endif /* _TOOLS_LINUX_STRING_H_ */
>> diff --git a/tools/lib/string.c b/tools/lib/string.c
>> index 8b6892f959ab..3126d2cff716 100644
>> --- a/tools/lib/string.c
>> +++ b/tools/lib/string.c
>> @@ -153,6 +153,19 @@ char *strim(char *s)
>> return skip_spaces(s);
>> }
>>
>> +/*
>> + * remove_spaces - Removes whitespaces from @s
>> + */
>> +void remove_spaces(char *s)
>> +{
>> + char *d = s;
>> +
>> + do {
>> + while (*d == ' ')
>> + ++d;
>> + } while ((*s++ = *d++));
>> +}
>> +
>> /**
>> * strreplace - Replace all occurrences of character in string.
>> * @s: The string to operate on.
>> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
>> index a3f423c27cae..d57fd023ef9c 100644
>> --- a/tools/perf/arch/powerpc/annotate/instructions.c
>> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
>> @@ -55,6 +55,7 @@ static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
>> arch->initialized = true;
>> arch->associate_instruction_ops = powerpc__associate_instruction_ops;
>> arch->objdump.comment_char = '#';
>> + annotate_opts.show_asm_raw = true;
>> }
>>
>> return 0;
>> diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
>> index 0c4f4caf53ac..430623ca5612 100644
>> --- a/tools/perf/arch/powerpc/util/dwarf-regs.c
>> +++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
>> @@ -98,3 +98,12 @@ int regs_query_register_offset(const char *name)
>> return roff->ptregs_offset;
>> return -EINVAL;
>> }
>> +
>> +#define PPC_OP(op) (((op) >> 26) & 0x3F)
>> +#define PPC_RA(a) (((a) >> 16) & 0x1f)
>> +#define PPC_RT(t) (((t) >> 21) & 0x1f)
>> +#define PPC_RB(b) (((b) >> 11) & 0x1f)
>> +#define PPC_D(D) ((D) & 0xfffe)
>> +#define PPC_DS(DS) ((DS) & 0xfffc)
>> +#define OP_LD 58
>> +#define OP_STD 62
>> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
>> index 3cd187f08193..61f0f1656f82 100644
>> --- a/tools/perf/util/disasm.c
>> +++ b/tools/perf/util/disasm.c
>> @@ -45,6 +45,7 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t size,
>>
>> static void ins__sort(struct arch *arch);
>> static int disasm_line__parse(char *line, const char **namep, char **rawp);
>> +static int disasm_line__parse_powerpc(struct disasm_line *dl);
>>
>> static __attribute__((constructor)) void symbol__init_regexpr(void)
>> {
>> @@ -844,6 +845,63 @@ static int disasm_line__parse(char *line, const char **namep, char **rawp)
>> return -1;
>> }
>>
>> +/*
>> + * Parses the result captured from symbol__disassemble_*
>> + * Example, line read from DSO file in powerpc:
>> + * line: 38 01 81 e8
>> + * opcode: fetched from arch specific get_opcode_insn
>> + * rawp_insn: e8810138
>> + *
>> + * rawp_insn is used later to extract the reg/offset fields
>> + */
>> +#define PPC_OP(op) (((op) >> 26) & 0x3F)
>> +
>> +static int disasm_line__parse_powerpc(struct disasm_line *dl)
>> +{
>> + char *line = dl->al.line;
>> + const char **namep = &dl->ins.name;
>> + char **rawp = &dl->ops.raw;
>> + char tmp, *tmp_opcode, *name_opcode = skip_spaces(line);
>> + char *name = skip_spaces(name_opcode + 11);
>> + int objdump = 0;
>> +
>> + if (strlen(line) > 11)
>> + objdump = 1;
>> +
>> + if (name_opcode[0] == '\0')
>> + return -1;
>> +
>> + if (objdump) {
>> + *rawp = name + 1;
>> + while ((*rawp)[0] != '\0' && !isspace((*rawp)[0]))
>> + ++*rawp;
>> + tmp = (*rawp)[0];
>> + (*rawp)[0] = '\0';
>> +
>> + *namep = strdup(name);
>> + if (*namep == NULL)
>> + return -1;
>> +
>> + (*rawp)[0] = tmp;
>> + *rawp = strim(*rawp);
>> + } else
>> + *namep = "";
>> +
>> + tmp_opcode = strdup(name_opcode);
>> + tmp_opcode[11] = '\0';
>> + remove_spaces(tmp_opcode);
>> +
>> + dl->ins.opcode = PPC_OP(strtol(tmp_opcode, NULL, 16));
>> + if (objdump)
>> + dl->ins.opcode = PPC_OP(be32_to_cpu(strtol(tmp_opcode, NULL, 16)));
>> + dl->ops.opcode = dl->ins.opcode;
>> +
>> + dl->ops.raw_insn = strtol(tmp_opcode, NULL, 16);
>> + if (objdump)
>> + dl->ops.raw_insn = be32_to_cpu(strtol(tmp_opcode, NULL, 16));
>> + return 0;
>> +}
>> +
>> static void annotation_line__init(struct annotation_line *al,
>> struct annotate_args *args,
>> int nr)
>> @@ -897,7 +955,10 @@ struct disasm_line *disasm_line__new(struct annotate_args *args)
>> goto out_delete;
>>
>> if (args->offset != -1) {
>> - if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
>> + if (arch__is(args->arch, "powerpc")) {
>> + if (disasm_line__parse_powerpc(dl) < 0)
>> + goto out_free_line;
>> + } else if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
>> goto out_free_line;
>>
>> disasm_line__init_ins(dl, args->arch, &args->ms);
>> diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
>> index 718177fa4775..a391e1bb81f7 100644
>> --- a/tools/perf/util/disasm.h
>> +++ b/tools/perf/util/disasm.h
>> @@ -43,14 +43,19 @@ struct arch {
>>
>> struct ins {
>> const char *name;
>> + int opcode;
>
> I don't think this is the right place as 'ins' can be shared for
> different opcodes. IIUC it's like a class and disasm_line should
> have a pointer instead of a copy of the arch instructions. So I'd
> like to keep a single instance if they behave in the same way. But
> this is a separate change.
>
> I guess we can move it to struct disasm_line and use helper macros when
> we need to access the opcode. This will be helpful for other arches.
>
> struct disasm_line {
> struct ins *ins;
> struct ins_operands ops;
> union {
> u8 bytes[4];
> u32 opcode;
> } raw;
> struct annotation_line al;
> };
>
> #define PPC_OP(dl) (((dl)->raw.bytes[0] >> 2) & 0x3F)

Thanks for the suggestion. I will make these changes in V4.

Thanks
Athira
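
To make the example from the commit message concrete, here is a small sketch of how the raw bytes "38 01 81 e8" map to the fields of "ld r4,312(r1)". It is only an illustration: parse_raw_insn_le() is a hypothetical helper, not the function from the patch, and it assumes the line starts with four hex bytes printed in memory order by objdump on a little-endian target. The field macros mirror the ones quoted in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Field extraction on the 32-bit instruction word, as in the quoted patch. */
#define PPC_OP(insn) (((insn) >> 26) & 0x3f)
#define PPC_RT(insn) (((insn) >> 21) & 0x1f)
#define PPC_RA(insn) (((insn) >> 16) & 0x1f)
#define PPC_DS(insn) ((insn) & 0xfffc)
#define OP_LD 58

/*
 * Hypothetical helper: read four hex bytes from the start of an objdump
 * line and assemble them as a little-endian 32-bit word. Building the
 * word with shifts keeps the result independent of the host byte order.
 * Returns 0 on success, -1 if the line does not start with four bytes.
 */
static int parse_raw_insn_le(const char *line, uint32_t *insn)
{
	unsigned int b[4];

	assert(line && insn);
	if (sscanf(line, "%2x %2x %2x %2x", &b[0], &b[1], &b[2], &b[3]) != 4)
		return -1;

	*insn = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
		((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
	return 0;
}
```

For the line "38 01 81 e8 ld r4,312(r1)" the helper yields 0xe8810138, from which PPC_OP gives 58 (OP_LD), PPC_RT gives 4, PPC_RA gives 1 and PPC_DS gives 312.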
>
> Thanks,
> Namhyung
>
>>
>> struct ins_ops *ops;
>> };
>>
>> struct ins_operands {
>> char *raw;
>> + int raw_insn;
>> + int opcode;
>> struct {
>> char *raw;
>> char *name;
>> + int opcode;
>> + int raw_insn;
>> struct symbol *sym;
>> u64 addr;
>> s64 offset;
>> @@ -62,6 +67,8 @@ struct ins_operands {
>> struct {
>> char *raw;
>> char *name;
>> + int opcode;
>> + int raw_insn;
>> u64 addr;
>> bool multi_regs;
>> } source;
>> --
>> 2.43.0
>>


2024-06-08 08:58:36

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH V3 05/14] tools/perf: Add disasm_line__parse to parse raw instruction for powerpc



Le 06/06/2024 à 08:33, Namhyung Kim a écrit :
> Hello,
>
> On Sat, Jun 01, 2024 at 11:39:32AM +0530, Athira Rajeev wrote:
>> Currently, the perf tool infrastructure disasm_line__parse function to
>> parse disassembled line.
>>
>> Example snippet from objdump:
>> objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>
>>
>> c0000000010224b4: lwz r10,0(r9)
>>
>> This line "lwz r10,0(r9)" is parsed to extract instruction name,
>> registers names and offset. In powerpc, the approach for data type
>> profiling uses raw instruction instead of result from objdump to identify
>> the instruction category and extract the source/target registers.
>>
>> Example: 38 01 81 e8 ld r4,312(r1)
>>
>> Here "38 01 81 e8" is the raw instruction representation. Add function
>> "disasm_line__parse_powerpc" to handle parsing of raw instruction. Also
>> update "struct ins" and "struct ins_operands" to save "opcode" and
>> binary code. With the change, function captures:
>>
>> line -> "38 01 81 e8 ld r4,312(r1)"
>> opcode and raw instruction "38 01 81 e8"
>>
>> Raw instruction is used later to extract the reg/offset fields. Macros
>> are added to extract opcode and register fields. "struct ins_operands"
>> and "struct ins" is updated to carry opcode and raw instruction binary
>> code (raw_insn). Function "disasm_line__parse_powerpc fills the raw
>> instruction hex value and opcode in newly added fields. There is no
>> changes in existing code paths, which parses the disassembled code.
>> The architecture using the instruction name and present approach is
>> not altered. Since this approach targets powerpc, the macro
>> implementation is added for powerpc as of now.
>>
>> Since the disasm_line__parse is used in other cases (perf annotate) and
>> not only data tye profiling, the powerpc callback includes changes to
>> work with binary code as well as mneumonic representation. Also in case
>> if the DSO read fails and libcapstone is not supported, the approach
>> fallback to use objdump as option. Hence as option, patch has changes to
>> ensure objdump option also works well.
>>
>> Signed-off-by: Athira Rajeev <[email protected]>
>> ---
>> tools/include/linux/string.h | 2 +
>> tools/lib/string.c | 13 ++++
>> .../perf/arch/powerpc/annotate/instructions.c | 1 +
>> tools/perf/arch/powerpc/util/dwarf-regs.c | 9 +++
>> tools/perf/util/disasm.c | 63 ++++++++++++++++++-
>> tools/perf/util/disasm.h | 7 +++
>> 6 files changed, 94 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/include/linux/string.h b/tools/include/linux/string.h
>> index db5c99318c79..0acb1fc14e19 100644
>> --- a/tools/include/linux/string.h
>> +++ b/tools/include/linux/string.h
>> @@ -46,5 +46,7 @@ extern char * __must_check skip_spaces(const char *);
>>
>> extern char *strim(char *);
>>
>> +extern void remove_spaces(char *s);
>> +
>> extern void *memchr_inv(const void *start, int c, size_t bytes);
>> #endif /* _TOOLS_LINUX_STRING_H_ */
>> diff --git a/tools/lib/string.c b/tools/lib/string.c
>> index 8b6892f959ab..3126d2cff716 100644
>> --- a/tools/lib/string.c
>> +++ b/tools/lib/string.c
>> @@ -153,6 +153,19 @@ char *strim(char *s)
>> return skip_spaces(s);
>> }
>>
>> +/*
>> + * remove_spaces - Removes whitespaces from @s
>> + */
>> +void remove_spaces(char *s)
>> +{
>> + char *d = s;
>> +
>> + do {
>> + while (*d == ' ')
>> + ++d;
>> + } while ((*s++ = *d++));
>> +}
>> +
>> /**
>> * strreplace - Replace all occurrences of character in string.
>> * @s: The string to operate on.
>> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
>> index a3f423c27cae..d57fd023ef9c 100644
>> --- a/tools/perf/arch/powerpc/annotate/instructions.c
>> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
>> @@ -55,6 +55,7 @@ static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
>> arch->initialized = true;
>> arch->associate_instruction_ops = powerpc__associate_instruction_ops;
>> arch->objdump.comment_char = '#';
>> + annotate_opts.show_asm_raw = true;
>> }
>>
>> return 0;
>> diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
>> index 0c4f4caf53ac..430623ca5612 100644
>> --- a/tools/perf/arch/powerpc/util/dwarf-regs.c
>> +++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
>> @@ -98,3 +98,12 @@ int regs_query_register_offset(const char *name)
>> return roff->ptregs_offset;
>> return -EINVAL;
>> }
>> +
>> +#define PPC_OP(op) (((op) >> 26) & 0x3F)
>> +#define PPC_RA(a) (((a) >> 16) & 0x1f)
>> +#define PPC_RT(t) (((t) >> 21) & 0x1f)
>> +#define PPC_RB(b) (((b) >> 11) & 0x1f)
>> +#define PPC_D(D) ((D) & 0xfffe)
>> +#define PPC_DS(DS) ((DS) & 0xfffc)
>> +#define OP_LD 58
>> +#define OP_STD 62
>> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
>> index 3cd187f08193..61f0f1656f82 100644
>> --- a/tools/perf/util/disasm.c
>> +++ b/tools/perf/util/disasm.c
>> @@ -45,6 +45,7 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t size,
>>
>> static void ins__sort(struct arch *arch);
>> static int disasm_line__parse(char *line, const char **namep, char **rawp);
>> +static int disasm_line__parse_powerpc(struct disasm_line *dl);
>>
>> static __attribute__((constructor)) void symbol__init_regexpr(void)
>> {
>> @@ -844,6 +845,63 @@ static int disasm_line__parse(char *line, const char **namep, char **rawp)
>> return -1;
>> }
>>
>> +/*
>> + * Parses the result captured from symbol__disassemble_*
>> + * Example, line read from DSO file in powerpc:
>> + * line: 38 01 81 e8
>> + * opcode: fetched from arch specific get_opcode_insn
>> + * rawp_insn: e8810138
>> + *
>> + * rawp_insn is used later to extract the reg/offset fields
>> + */
>> +#define PPC_OP(op) (((op) >> 26) & 0x3F)
>> +
>> +static int disasm_line__parse_powerpc(struct disasm_line *dl)
>> +{
>> + char *line = dl->al.line;
>> + const char **namep = &dl->ins.name;
>> + char **rawp = &dl->ops.raw;
>> + char tmp, *tmp_opcode, *name_opcode = skip_spaces(line);
>> + char *name = skip_spaces(name_opcode + 11);
>> + int objdump = 0;
>> +
>> + if (strlen(line) > 11)
>> + objdump = 1;
>> +
>> + if (name_opcode[0] == '\0')
>> + return -1;
>> +
>> + if (objdump) {
>> + *rawp = name + 1;
>> + while ((*rawp)[0] != '\0' && !isspace((*rawp)[0]))
>> + ++*rawp;
>> + tmp = (*rawp)[0];
>> + (*rawp)[0] = '\0';
>> +
>> + *namep = strdup(name);
>> + if (*namep == NULL)
>> + return -1;
>> +
>> + (*rawp)[0] = tmp;
>> + *rawp = strim(*rawp);
>> + } else
>> + *namep = "";
>> +
>> + tmp_opcode = strdup(name_opcode);
>> + tmp_opcode[11] = '\0';
>> + remove_spaces(tmp_opcode);
>> +
>> + dl->ins.opcode = PPC_OP(strtol(tmp_opcode, NULL, 16));
>> + if (objdump)
>> + dl->ins.opcode = PPC_OP(be32_to_cpu(strtol(tmp_opcode, NULL, 16)));
>> + dl->ops.opcode = dl->ins.opcode;
>> +
>> + dl->ops.raw_insn = strtol(tmp_opcode, NULL, 16);
>> + if (objdump)
>> + dl->ops.raw_insn = be32_to_cpu(strtol(tmp_opcode, NULL, 16));
>> + return 0;
>> +}
>> +
>> static void annotation_line__init(struct annotation_line *al,
>> struct annotate_args *args,
>> int nr)
>> @@ -897,7 +955,10 @@ struct disasm_line *disasm_line__new(struct annotate_args *args)
>> goto out_delete;
>>
>> if (args->offset != -1) {
>> - if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
>> + if (arch__is(args->arch, "powerpc")) {
>> + if (disasm_line__parse_powerpc(dl) < 0)
>> + goto out_free_line;
>> + } else if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
>> goto out_free_line;
>>
>> disasm_line__init_ins(dl, args->arch, &args->ms);
>> diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
>> index 718177fa4775..a391e1bb81f7 100644
>> --- a/tools/perf/util/disasm.h
>> +++ b/tools/perf/util/disasm.h
>> @@ -43,14 +43,19 @@ struct arch {
>>
>> struct ins {
>> const char *name;
>> + int opcode;
>
> I don't think this is the right place as 'ins' can be shared for
> different opcodes. IIUC it's like a class and disasm_line should
> have a pointer instead of a copy of the arch instructions. So I'd
> like to keep a single instance if they behave in the same way. But
> this is a separate change.
>
> I guess we can move it to struct disasm_line and use helper macros when
> we need to access the opcode. This will be helpful for other arches.
>
> struct disasm_line {
> struct ins *ins;
> struct ins_operands ops;
> union {
> u8 bytes[4];
> u32 opcode;
> } raw;
> struct annotation_line al;
> };
>
> #define PPC_OP(dl) (((dl)->raw.bytes[0] >> 2) & 0x3F)

We already have a definition for PPC_OP(), see arch/powerpc/xmon/ppc.h:

/* A macro to extract the major opcode from an instruction. */
#define PPC_OP(i) (((i) >> 26) & 0x3f)

By the way, why do you want to split instructions into bytes? On
powerpc an instruction is one (sometimes two) u32, nothing else. If
you start breaking it into bytes you will likely increase complexity
unnecessarily when a field has bits in different bytes, and you may also
run into byte-order issues depending on whether you are running on a
little or big endian PPC.
See for instance arch_decode_instruction() in
tools/objtool/arch/powerpc/decode.c

By the way, why are we spreading different decoding functions across
different tools? Wouldn't it make sense to try to share decoding
functions between objtool and perf, for instance?

Christophe
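
A small sketch of the concern raised above, using the "ld r4,312(r1)" example from the commit message. The helper names are hypothetical and exist only for this illustration. It shows that the byte-based opcode extraction is correct only when bytes[0] holds the most significant byte, and that a field such as RT straddles two bytes, whereas the u32-based macro works once the word has been assembled in the right order.

```c
#include <assert.h>
#include <stdint.h>

/* Word-based extraction, same shape as the xmon definition quoted above. */
#define PPC_OP(i) (((i) >> 26) & 0x3f)
#define PPC_RT(i) (((i) >> 21) & 0x1f)

/* Hypothetical helper: assemble a word from bytes stored little-endian. */
static uint32_t insn_from_le_bytes(const uint8_t b[4])
{
	assert(b);
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Hypothetical helper: assemble a word from bytes stored big-endian. */
static uint32_t insn_from_be_bytes(const uint8_t b[4])
{
	assert(b);
	return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
	       ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Byte-based opcode: only valid if b[0] is the most significant byte. */
static unsigned int op_from_first_byte(const uint8_t b[4])
{
	assert(b);
	return (b[0] >> 2) & 0x3f;
}

/* Byte-based RT: the 5-bit field spans b[0] and b[1] in big-endian order. */
static unsigned int rt_from_be_bytes(const uint8_t b[4])
{
	assert(b);
	return ((b[0] & 0x3u) << 3) | (b[1] >> 5);
}
```

With the bytes in little-endian memory order (38 01 81 e8), op_from_first_byte() returns 14 instead of 58, while PPC_OP() on the assembled word returns 58 for either byte order.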



>
> Thanks,
> Namhyung
>
>>
>> struct ins_ops *ops;
>> };
>>
>> struct ins_operands {
>> char *raw;
>> + int raw_insn;
>> + int opcode;
>> struct {
>> char *raw;
>> char *name;
>> + int opcode;
>> + int raw_insn;
>> struct symbol *sym;
>> u64 addr;
>> s64 offset;
>> @@ -62,6 +67,8 @@ struct ins_operands {
>> struct {
>> char *raw;
>> char *name;
>> + int opcode;
>> + int raw_insn;
>> u64 addr;
>> bool multi_regs;
>> } source;
>> --
>> 2.43.0
>>

2024-06-08 09:32:06

by Athira Rajeev

[permalink] [raw]
Subject: Re: [PATCH V3 11/14] tools/perf: Add support to use libcapstone in powerpc



> On 3 Jun 2024, at 10:00 PM, Ian Rogers <[email protected]> wrote:
>
> On Fri, May 31, 2024 at 11:10 PM Athira Rajeev
> <[email protected]> wrote:
>>
>> Now perf uses the capstone library to disassemble the instructions in
>> x86. capstone is used (if available) for perf annotate to speed up.
>> Currently it only supports x86 architecture. Patch includes changes to
>> enable this in powerpc. For now, only for data type sort keys, this
>> method is used and only binary code (raw instruction) is read. This is
>> because powerpc approach to understand instructions and reg fields uses
>> raw instruction. The "cs_disasm" is currently not enabled. While
>> attempting to do cs_disasm, observation is that some of the instructions
>> were not identified (ex: extswsli, maddld) and it had to fallback to use
>> objdump. Hence enabling "cs_disasm" is added in comment section as a
>> TODO for powerpc.
>>
>> Signed-off-by: Athira Rajeev <[email protected]>
>> ---
>> tools/perf/util/disasm.c | 148 ++++++++++++++++++++++++++++++++++++++-
>> 1 file changed, 146 insertions(+), 2 deletions(-)
>>
>> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
>> index d8b357055302..915508d2e197 100644
>> --- a/tools/perf/util/disasm.c
>> +++ b/tools/perf/util/disasm.c
>> @@ -1540,12 +1540,18 @@ static int open_capstone_handle(struct annotate_args *args, bool is_64bit,
>> {
>> struct annotation_options *opt = args->options;
>> cs_mode mode = is_64bit ? CS_MODE_64 : CS_MODE_32;
>> + int ret;
>>
>> /* TODO: support more architectures */
>> - if (!arch__is(args->arch, "x86"))
>> + if ((!arch__is(args->arch, "x86")) && (!arch__is(args->arch, "powerpc")))
>> return -1;
>>
>> - if (cs_open(CS_ARCH_X86, mode, handle) != CS_ERR_OK)
>> + if (arch__is(args->arch, "x86"))
>> + ret = cs_open(CS_ARCH_X86, mode, handle);
>> + else
>> + ret = cs_open(CS_ARCH_PPC, mode, handle);
>> +
>> + if (ret != CS_ERR_OK)
>> return -1;
>
> There looks to be a pretty/more robust capstone_init function in
> print_insn.c, should we factor this code out and recycle:
> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print_insn.c?h=perf-tools-next#n40
>
> Thanks,
> Ian

Hi Ian,

Thanks for checking the patch.

Yes, that’s a good change to have. I will include it in V4.

Thanks
Athira
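
For context on the "only binary code (raw instruction) is read" part of the commit message quoted above, here is a rough sketch of walking a symbol's bytes as fixed-width 32-bit words. It does not use capstone and is not the code from the patch: walk_raw_insns(), insn_cb, struct walk_stats and record_insn() are hypothetical names for this illustration. The sketch rejects a length that is not a multiple of 4 up front and copies each word with memcpy() so the buffer does not need to be aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback invoked once per 32-bit instruction word. */
typedef void (*insn_cb)(uint64_t offset, uint32_t insn, void *priv);

/*
 * Walk a buffer of fixed-width 32-bit instructions. Words are passed to
 * the callback as they appear in memory (host byte order). Returns the
 * number of instructions visited, or -1 if len is not a multiple of 4.
 */
static int walk_raw_insns(const uint8_t *buf, size_t len, insn_cb cb, void *priv)
{
	size_t off;

	assert(buf && cb);
	if (len % 4)
		return -1;

	for (off = 0; off < len; off += 4) {
		uint32_t insn;

		/* memcpy avoids unaligned and type-punned loads. */
		memcpy(&insn, buf + off, sizeof(insn));
		cb(off, insn, priv);
	}
	return (int)(len / 4);
}

/* Example callback state: how many words were seen and the last offset. */
struct walk_stats {
	int      count;
	uint64_t last_offset;
};

static void record_insn(uint64_t offset, uint32_t insn, void *priv)
{
	struct walk_stats *st = priv;

	(void)insn;  /* a real consumer would decode or store the word */
	st->count++;
	st->last_offset = offset;
}
```

In the patch, the per-word step corresponds to creating a disasm_line for each offset; the callback here only stands in for that step.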

>
>> if (!opt->disassembler_style ||
>> @@ -1635,6 +1641,139 @@ static void print_capstone_detail(cs_insn *insn, char *buf, size_t len,
>> }
>> }
>>
>> +static int symbol__disassemble_capstone_powerpc(char *filename, struct symbol *sym,
>> + struct annotate_args *args)
>> +{
>> + struct annotation *notes = symbol__annotation(sym);
>> + struct map *map = args->ms.map;
>> + struct dso *dso = map__dso(map);
>> + struct nscookie nsc;
>> + u64 start = map__rip_2objdump(map, sym->start);
>> + u64 end = map__rip_2objdump(map, sym->end);
>> + u64 len = end - start;
>> + u64 offset;
>> + int i, fd, count;
>> + bool is_64bit = false;
>> + bool needs_cs_close = false;
>> + u8 *buf = NULL;
>> + struct find_file_offset_data data = {
>> + .ip = start,
>> + };
>> + csh handle;
>> + char disasm_buf[512];
>> + struct disasm_line *dl;
>> + u32 *line;
>> +
>> + if (args->options->objdump_path)
>> + return -1;
>> +
>> + nsinfo__mountns_enter(dso->nsinfo, &nsc);
>> + fd = open(filename, O_RDONLY);
>> + nsinfo__mountns_exit(&nsc);
>> + if (fd < 0)
>> + return -1;
>> +
>> + if (file__read_maps(fd, /*exe=*/true, find_file_offset, &data,
>> + &is_64bit) == 0)
>> + goto err;
>> +
>> + if (open_capstone_handle(args, is_64bit, &handle) < 0)
>> + goto err;
>> +
>> + needs_cs_close = true;
>> +
>> + buf = malloc(len);
>> + if (buf == NULL)
>> + goto err;
>> +
>> + count = pread(fd, buf, len, data.offset);
>> + close(fd);
>> + fd = -1;
>> +
>> + if ((u64)count != len)
>> + goto err;
>> +
>> + line = (u32 *)buf;
>> +
>> + /* add the function address and name */
>> + scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
>> + start, sym->name);
>> +
>> + args->offset = -1;
>> + args->line = disasm_buf;
>> + args->line_nr = 0;
>> + args->fileloc = NULL;
>> + args->ms.sym = sym;
>> +
>> + dl = disasm_line__new(args);
>> + if (dl == NULL)
>> + goto err;
>> +
>> + annotation_line__add(&dl->al, &notes->src->source);
>> +
>> + /*
>> + * TODO: enable disassm for powerpc
>> + * count = cs_disasm(handle, buf, len, start, len, &insn);
>> + *
>> + * For now, only binary code is saved in disassembled line
>> + * to be used in "type" and "typeoff" sort keys. Each raw code
>> + * is 32 bit instruction. So use "len/4" to get the number of
>> + * entries.
>> + */
>> + count = len/4;
>> +
>> + for (i = 0, offset = 0; i < count; i++) {
>> + args->offset = offset;
>> + sprintf(args->line, "%x", line[i]);
>> +
>> + dl = disasm_line__new(args);
>> + if (dl == NULL)
>> + goto err;
>> +
>> + annotation_line__add(&dl->al, &notes->src->source);
>> +
>> + offset += 4;
>> + }
>> +
>> + /* It failed in the middle */
>> + if (offset != len) {
>> + struct list_head *list = &notes->src->source;
>> +
>> + /* Discard all lines and fallback to objdump */
>> + while (!list_empty(list)) {
>> + dl = list_first_entry(list, struct disasm_line, al.node);
>> +
>> + list_del_init(&dl->al.node);
>> + disasm_line__free(dl);
>> + }
>> + count = -1;
>> + }
>> +
>> +out:
>> + if (needs_cs_close)
>> + cs_close(&handle);
>> + free(buf);
>> + return count < 0 ? count : 0;
>> +
>> +err:
>> + if (fd >= 0)
>> + close(fd);
>> + if (needs_cs_close) {
>> + struct disasm_line *tmp;
>> +
>> + /*
>> + * It probably failed in the middle of the above loop.
>> + * Release any resources it might add.
>> + */
>> + list_for_each_entry_safe(dl, tmp, &notes->src->source, al.node) {
>> + list_del(&dl->al.node);
>> + free(dl);
>> + }
>> + }
>> + count = -1;
>> + goto out;
>> +}
>> +
>> static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
>> struct annotate_args *args)
>> {
>> @@ -1987,6 +2126,11 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
>> err = symbol__disassemble_dso(symfs_filename, sym, args);
>> if (err == 0)
>> goto out_remove_tmp;
>> +#ifdef HAVE_LIBCAPSTONE_SUPPORT
>> + err = symbol__disassemble_capstone_powerpc(symfs_filename, sym, args);
>> + if (err == 0)
>> + goto out_remove_tmp;
>> +#endif
>> }
>> }
>>
>> --
>> 2.43.0



2024-06-10 15:35:02

by Athira Rajeev

[permalink] [raw]
Subject: Re: [PATCH V3 11/14] tools/perf: Add support to use libcapstone in powerpc



> On 3 Jun 2024, at 10:28 PM, Adrian Hunter <[email protected]> wrote:
>
> On 3/06/24 19:30, Ian Rogers wrote:
>> On Fri, May 31, 2024 at 11:10 PM Athira Rajeev
>> <[email protected]> wrote:
>>>
>>> Now perf uses the capstone library to disassemble the instructions in
>>> x86. capstone is used (if available) for perf annotate to speed up.
>>> Currently it only supports x86 architecture. Patch includes changes to
>>> enable this in powerpc. For now, only for data type sort keys, this
>>> method is used and only binary code (raw instruction) is read. This is
>>> because powerpc approach to understand instructions and reg fields uses
>>> raw instruction. The "cs_disasm" is currently not enabled. While
>>> attempting to do cs_disasm, observation is that some of the instructions
>>> were not identified (ex: extswsli, maddld) and it had to fallback to use
>>> objdump. Hence enabling "cs_disasm" is added in comment section as a
>>> TODO for powerpc.
>>>
>>> Signed-off-by: Athira Rajeev <[email protected]>
>>> ---
>>> tools/perf/util/disasm.c | 148 ++++++++++++++++++++++++++++++++++++++-
>>> 1 file changed, 146 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
>>> index d8b357055302..915508d2e197 100644
>>> --- a/tools/perf/util/disasm.c
>>> +++ b/tools/perf/util/disasm.c
>>> @@ -1540,12 +1540,18 @@ static int open_capstone_handle(struct annotate_args *args, bool is_64bit,
>>> {
>>> struct annotation_options *opt = args->options;
>>> cs_mode mode = is_64bit ? CS_MODE_64 : CS_MODE_32;
>>> + int ret;
>>>
>>> /* TODO: support more architectures */
>>> - if (!arch__is(args->arch, "x86"))
>>> + if ((!arch__is(args->arch, "x86")) && (!arch__is(args->arch, "powerpc")))
>>> return -1;
>>>
>>> - if (cs_open(CS_ARCH_X86, mode, handle) != CS_ERR_OK)
>>> + if (arch__is(args->arch, "x86"))
>>> + ret = cs_open(CS_ARCH_X86, mode, handle);
>>> + else
>>> + ret = cs_open(CS_ARCH_PPC, mode, handle);
>>> +
>>> + if (ret != CS_ERR_OK)
>>> return -1;
>>
>> There looks to be a pretty/more robust capstone_init function in
>> print_insn.c, should we factor this code out and recycle:
>> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print_insn.c?h=perf-tools-next#n40
>
> On a slightly related note, there is a compile error
> been around for a while in util/disasm.c on Ubuntu 22.04
>
> In file included from /usr/include/capstone/capstone.h:279,
> from util/disasm.c:1354:
> /usr/include/capstone/bpf.h:94:14: error: ‘bpf_insn’ defined as wrong
> kind of tag
> 94 | typedef enum bpf_insn {
> | ^~~~~~~~
>

Hi Adrian

I tried compiling on Ubuntu 22.04, but didn’t hit this issue.
The libcapstone version I have is libcapstone4, which doesn’t include “bpf.h”.
Which version of libcapstone is installed on the setup where you are seeing this issue?

Thanks
Athira
>>
>> Thanks,
>> Ian
>>
>>> if (!opt->disassembler_style ||
>>> @@ -1635,6 +1641,139 @@ static void print_capstone_detail(cs_insn *insn, char *buf, size_t len,
>>> }
>>> }
>>>
>>> +static int symbol__disassemble_capstone_powerpc(char *filename, struct symbol *sym,
>>> + struct annotate_args *args)
>>> +{
>>> + struct annotation *notes = symbol__annotation(sym);
>>> + struct map *map = args->ms.map;
>>> + struct dso *dso = map__dso(map);
>>> + struct nscookie nsc;
>>> + u64 start = map__rip_2objdump(map, sym->start);
>>> + u64 end = map__rip_2objdump(map, sym->end);
>>> + u64 len = end - start;
>>> + u64 offset;
>>> + int i, fd, count;
>>> + bool is_64bit = false;
>>> + bool needs_cs_close = false;
>>> + u8 *buf = NULL;
>>> + struct find_file_offset_data data = {
>>> + .ip = start,
>>> + };
>>> + csh handle;
>>> + char disasm_buf[512];
>>> + struct disasm_line *dl;
>>> + u32 *line;
>>> +
>>> + if (args->options->objdump_path)
>>> + return -1;
>>> +
>>> + nsinfo__mountns_enter(dso->nsinfo, &nsc);
>>> + fd = open(filename, O_RDONLY);
>>> + nsinfo__mountns_exit(&nsc);
>>> + if (fd < 0)
>>> + return -1;
>>> +
>>> + if (file__read_maps(fd, /*exe=*/true, find_file_offset, &data,
>>> + &is_64bit) == 0)
>>> + goto err;
>>> +
>>> + if (open_capstone_handle(args, is_64bit, &handle) < 0)
>>> + goto err;
>>> +
>>> + needs_cs_close = true;
>>> +
>>> + buf = malloc(len);
>>> + if (buf == NULL)
>>> + goto err;
>>> +
>>> + count = pread(fd, buf, len, data.offset);
>>> + close(fd);
>>> + fd = -1;
>>> +
>>> + if ((u64)count != len)
>>> + goto err;
>>> +
>>> + line = (u32 *)buf;
>>> +
>>> + /* add the function address and name */
>>> + scnprintf(disasm_buf, sizeof(disasm_buf), "%#"PRIx64" <%s>:",
>>> + start, sym->name);
>>> +
>>> + args->offset = -1;
>>> + args->line = disasm_buf;
>>> + args->line_nr = 0;
>>> + args->fileloc = NULL;
>>> + args->ms.sym = sym;
>>> +
>>> + dl = disasm_line__new(args);
>>> + if (dl == NULL)
>>> + goto err;
>>> +
>>> + annotation_line__add(&dl->al, &notes->src->source);
>>> +
>>> + /*
>>> + * TODO: enable disassm for powerpc
>>> + * count = cs_disasm(handle, buf, len, start, len, &insn);
>>> + *
>>> + * For now, only binary code is saved in disassembled line
>>> + * to be used in "type" and "typeoff" sort keys. Each raw code
>>> + * is 32 bit instruction. So use "len/4" to get the number of
>>> + * entries.
>>> + */
>>> + count = len/4;
>>> +
>>> + for (i = 0, offset = 0; i < count; i++) {
>>> + args->offset = offset;
>>> + sprintf(args->line, "%x", line[i]);
>>> +
>>> + dl = disasm_line__new(args);
>>> + if (dl == NULL)
>>> + goto err;
>>> +
>>> + annotation_line__add(&dl->al, &notes->src->source);
>>> +
>>> + offset += 4;
>>> + }
>>> +
>>> + /* It failed in the middle */
>>> + if (offset != len) {
>>> + struct list_head *list = &notes->src->source;
>>> +
>>> + /* Discard all lines and fallback to objdump */
>>> + while (!list_empty(list)) {
>>> + dl = list_first_entry(list, struct disasm_line, al.node);
>>> +
>>> + list_del_init(&dl->al.node);
>>> + disasm_line__free(dl);
>>> + }
>>> + count = -1;
>>> + }
>>> +
>>> +out:
>>> + if (needs_cs_close)
>>> + cs_close(&handle);
>>> + free(buf);
>>> + return count < 0 ? count : 0;
>>> +
>>> +err:
>>> + if (fd >= 0)
>>> + close(fd);
>>> + if (needs_cs_close) {
>>> + struct disasm_line *tmp;
>>> +
>>> + /*
>>> + * It probably failed in the middle of the above loop.
>>> + * Release any resources it might add.
>>> + */
>>> + list_for_each_entry_safe(dl, tmp, &notes->src->source, al.node) {
>>> + list_del(&dl->al.node);
>>> + free(dl);
>>> + }
>>> + }
>>> + count = -1;
>>> + goto out;
>>> +}
>>> +
>>> static int symbol__disassemble_capstone(char *filename, struct symbol *sym,
>>> struct annotate_args *args)
>>> {
>>> @@ -1987,6 +2126,11 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
>>> err = symbol__disassemble_dso(symfs_filename, sym, args);
>>> if (err == 0)
>>> goto out_remove_tmp;
>>> +#ifdef HAVE_LIBCAPSTONE_SUPPORT
>>> + err = symbol__disassemble_capstone_powerpc(symfs_filename, sym, args);
>>> + if (err == 0)
>>> + goto out_remove_tmp;
>>> +#endif
>>> }
>>> }
>>>
>>> --
>>> 2.43.0



2024-06-11 16:38:51

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V3 11/14] tools/perf: Add support to use libcapstone in powerpc

On 10/06/24 15:20, Athira Rajeev wrote:
>
>
>> On 3 Jun 2024, at 10:28 PM, Adrian Hunter <[email protected]> wrote:
>>
>> On 3/06/24 19:30, Ian Rogers wrote:
>>> On Fri, May 31, 2024 at 11:10 PM Athira Rajeev
>>> <[email protected]> wrote:
>>>>
>>>> Now perf uses the capstone library to disassemble the instructions in
>>>> x86. capstone is used (if available) for perf annotate to speed up.
>>>> Currently it only supports x86 architecture. Patch includes changes to
>>>> enable this in powerpc. For now, only for data type sort keys, this
>>>> method is used and only binary code (raw instruction) is read. This is
>>>> because powerpc approach to understand instructions and reg fields uses
>>>> raw instruction. The "cs_disasm" is currently not enabled. While
>>>> attempting to do cs_disasm, observation is that some of the instructions
>>>> were not identified (ex: extswsli, maddld) and it had to fallback to use
>>>> objdump. Hence enabling "cs_disasm" is added in comment section as a
>>>> TODO for powerpc.
>>>>
>>>> Signed-off-by: Athira Rajeev <[email protected]>
>>>> ---
>>>> tools/perf/util/disasm.c | 148 ++++++++++++++++++++++++++++++++++++++-
>>>> 1 file changed, 146 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
>>>> index d8b357055302..915508d2e197 100644
>>>> --- a/tools/perf/util/disasm.c
>>>> +++ b/tools/perf/util/disasm.c
>>>> @@ -1540,12 +1540,18 @@ static int open_capstone_handle(struct annotate_args *args, bool is_64bit,
>>>> {
>>>> struct annotation_options *opt = args->options;
>>>> cs_mode mode = is_64bit ? CS_MODE_64 : CS_MODE_32;
>>>> + int ret;
>>>>
>>>> /* TODO: support more architectures */
>>>> - if (!arch__is(args->arch, "x86"))
>>>> + if ((!arch__is(args->arch, "x86")) && (!arch__is(args->arch, "powerpc")))
>>>> return -1;
>>>>
>>>> - if (cs_open(CS_ARCH_X86, mode, handle) != CS_ERR_OK)
>>>> + if (arch__is(args->arch, "x86"))
>>>> + ret = cs_open(CS_ARCH_X86, mode, handle);
>>>> + else
>>>> + ret = cs_open(CS_ARCH_PPC, mode, handle);
>>>> +
>>>> + if (ret != CS_ERR_OK)
>>>> return -1;
>>>
>>> There looks to be a pretty/more robust capstone_init function in
>>> print_insn.c, should we factor this code out and recycle:
>>> https://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/print_insn.c?h=perf-tools-next#n40
>>
>> On a slightly related note, there is a compile error
>> been around for a while in util/disasm.c on Ubuntu 22.04
>>
>> In file included from /usr/include/capstone/capstone.h:279,
>> from util/disasm.c:1354:
>> /usr/include/capstone/bpf.h:94:14: error: ‘bpf_insn’ defined as wrong
>> kind of tag
>> 94 | typedef enum bpf_insn {
>> | ^~~~~~~~
>>
>
> Hi Adrian
>
> I tried compilation on Ubuntu 22.04, but didn’t face this issue.
> The libcapstone version I have is libcapstone4 which doesn’t have the include for “bpf.h”
> What is the version of libcapstone in the setup where you are seeing this issue ?

Yes, sorry. I got confused. Ubuntu was OK. The original issue
was with Fedora 40, but even then reproducing it requires
binutils-devel and BUILD_NONDISTRO=1.


2024-06-12 11:48:06

by Athira Rajeev

[permalink] [raw]
Subject: Re: [PATCH V3 05/14] tools/perf: Add disasm_line__parse to parse raw instruction for powerpc



> On 8 Jun 2024, at 2:28 PM, Christophe Leroy <[email protected]> wrote:
>
>
>
> Le 06/06/2024 à 08:33, Namhyung Kim a écrit :
>> Hello,
>>
>> On Sat, Jun 01, 2024 at 11:39:32AM +0530, Athira Rajeev wrote:
>>> Currently, the perf tool infrastructure uses the disasm_line__parse
>>> function to parse a disassembled line.
>>>
>>> Example snippet from objdump:
>>> objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>
>>>
>>> c0000000010224b4: lwz r10,0(r9)
>>>
>>> This line "lwz r10,0(r9)" is parsed to extract instruction name,
>>> registers names and offset. In powerpc, the approach for data type
>>> profiling uses raw instruction instead of result from objdump to identify
>>> the instruction category and extract the source/target registers.
>>>
>>> Example: 38 01 81 e8 ld r4,312(r1)
>>>
>>> Here "38 01 81 e8" is the raw instruction representation. Add function
>>> "disasm_line__parse_powerpc" to handle parsing of raw instruction. Also
>>> update "struct ins" and "struct ins_operands" to save "opcode" and
>>> binary code. With the change, function captures:
>>>
>>> line -> "38 01 81 e8 ld r4,312(r1)"
>>> opcode and raw instruction "38 01 81 e8"
>>>
>>> Raw instruction is used later to extract the reg/offset fields. Macros
>>> are added to extract opcode and register fields. "struct ins_operands"
>>> and "struct ins" are updated to carry opcode and raw instruction binary
>>> code (raw_insn). Function "disasm_line__parse_powerpc" fills the raw
>>> instruction hex value and opcode in the newly added fields. There are no
>>> changes in existing code paths, which parse the disassembled code.
>>> Architectures that use the instruction name and the present approach
>>> are not altered. Since this approach targets powerpc, the macro
>>> implementation is added for powerpc as of now.
>>>
>>> Since disasm_line__parse is used in other cases (perf annotate) and
>>> not only data type profiling, the powerpc callback includes changes to
>>> work with binary code as well as the mnemonic representation. Also, if
>>> the DSO read fails and libcapstone is not supported, the approach
>>> falls back to objdump. Hence the patch has changes to ensure the
>>> objdump option also works well.
>>>
>>> Signed-off-by: Athira Rajeev <[email protected]>
>>> ---
>>> tools/include/linux/string.h | 2 +
>>> tools/lib/string.c | 13 ++++
>>> .../perf/arch/powerpc/annotate/instructions.c | 1 +
>>> tools/perf/arch/powerpc/util/dwarf-regs.c | 9 +++
>>> tools/perf/util/disasm.c | 63 ++++++++++++++++++-
>>> tools/perf/util/disasm.h | 7 +++
>>> 6 files changed, 94 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tools/include/linux/string.h b/tools/include/linux/string.h
>>> index db5c99318c79..0acb1fc14e19 100644
>>> --- a/tools/include/linux/string.h
>>> +++ b/tools/include/linux/string.h
>>> @@ -46,5 +46,7 @@ extern char * __must_check skip_spaces(const char *);
>>>
>>> extern char *strim(char *);
>>>
>>> +extern void remove_spaces(char *s);
>>> +
>>> extern void *memchr_inv(const void *start, int c, size_t bytes);
>>> #endif /* _TOOLS_LINUX_STRING_H_ */
>>> diff --git a/tools/lib/string.c b/tools/lib/string.c
>>> index 8b6892f959ab..3126d2cff716 100644
>>> --- a/tools/lib/string.c
>>> +++ b/tools/lib/string.c
>>> @@ -153,6 +153,19 @@ char *strim(char *s)
>>> return skip_spaces(s);
>>> }
>>>
>>> +/*
>>> + * remove_spaces - Removes whitespaces from @s
>>> + */
>>> +void remove_spaces(char *s)
>>> +{
>>> + char *d = s;
>>> +
>>> + do {
>>> + while (*d == ' ')
>>> + ++d;
>>> + } while ((*s++ = *d++));
>>> +}
>>> +
>>> /**
>>> * strreplace - Replace all occurrences of character in string.
>>> * @s: The string to operate on.
>>> diff --git a/tools/perf/arch/powerpc/annotate/instructions.c b/tools/perf/arch/powerpc/annotate/instructions.c
>>> index a3f423c27cae..d57fd023ef9c 100644
>>> --- a/tools/perf/arch/powerpc/annotate/instructions.c
>>> +++ b/tools/perf/arch/powerpc/annotate/instructions.c
>>> @@ -55,6 +55,7 @@ static int powerpc__annotate_init(struct arch *arch, char *cpuid __maybe_unused)
>>> arch->initialized = true;
>>> arch->associate_instruction_ops = powerpc__associate_instruction_ops;
>>> arch->objdump.comment_char = '#';
>>> + annotate_opts.show_asm_raw = true;
>>> }
>>>
>>> return 0;
>>> diff --git a/tools/perf/arch/powerpc/util/dwarf-regs.c b/tools/perf/arch/powerpc/util/dwarf-regs.c
>>> index 0c4f4caf53ac..430623ca5612 100644
>>> --- a/tools/perf/arch/powerpc/util/dwarf-regs.c
>>> +++ b/tools/perf/arch/powerpc/util/dwarf-regs.c
>>> @@ -98,3 +98,12 @@ int regs_query_register_offset(const char *name)
>>> return roff->ptregs_offset;
>>> return -EINVAL;
>>> }
>>> +
>>> +#define PPC_OP(op) (((op) >> 26) & 0x3F)
>>> +#define PPC_RA(a) (((a) >> 16) & 0x1f)
>>> +#define PPC_RT(t) (((t) >> 21) & 0x1f)
>>> +#define PPC_RB(b) (((b) >> 11) & 0x1f)
>>> +#define PPC_D(D) ((D) & 0xfffe)
>>> +#define PPC_DS(DS) ((DS) & 0xfffc)
>>> +#define OP_LD 58
>>> +#define OP_STD 62
>>> diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c
>>> index 3cd187f08193..61f0f1656f82 100644
>>> --- a/tools/perf/util/disasm.c
>>> +++ b/tools/perf/util/disasm.c
>>> @@ -45,6 +45,7 @@ static int call__scnprintf(struct ins *ins, char *bf, size_t size,
>>>
>>> static void ins__sort(struct arch *arch);
>>> static int disasm_line__parse(char *line, const char **namep, char **rawp);
>>> +static int disasm_line__parse_powerpc(struct disasm_line *dl);
>>>
>>> static __attribute__((constructor)) void symbol__init_regexpr(void)
>>> {
>>> @@ -844,6 +845,63 @@ static int disasm_line__parse(char *line, const char **namep, char **rawp)
>>> return -1;
>>> }
>>>
>>> +/*
>>> + * Parses the result captured from symbol__disassemble_*
>>> + * Example, line read from DSO file in powerpc:
>>> + * line: 38 01 81 e8
>>> + * opcode: fetched from arch specific get_opcode_insn
>>> + * rawp_insn: e8810138
>>> + *
>>> + * rawp_insn is used later to extract the reg/offset fields
>>> + */
>>> +#define PPC_OP(op) (((op) >> 26) & 0x3F)
>>> +
>>> +static int disasm_line__parse_powerpc(struct disasm_line *dl)
>>> +{
>>> + char *line = dl->al.line;
>>> + const char **namep = &dl->ins.name;
>>> + char **rawp = &dl->ops.raw;
>>> + char tmp, *tmp_opcode, *name_opcode = skip_spaces(line);
>>> + char *name = skip_spaces(name_opcode + 11);
>>> + int objdump = 0;
>>> +
>>> + if (strlen(line) > 11)
>>> + objdump = 1;
>>> +
>>> + if (name_opcode[0] == '\0')
>>> + return -1;
>>> +
>>> + if (objdump) {
>>> + *rawp = name + 1;
>>> + while ((*rawp)[0] != '\0' && !isspace((*rawp)[0]))
>>> + ++*rawp;
>>> + tmp = (*rawp)[0];
>>> + (*rawp)[0] = '\0';
>>> +
>>> + *namep = strdup(name);
>>> + if (*namep == NULL)
>>> + return -1;
>>> +
>>> + (*rawp)[0] = tmp;
>>> + *rawp = strim(*rawp);
>>> + } else
>>> + *namep = "";
>>> +
>>> + tmp_opcode = strdup(name_opcode);
>>> + tmp_opcode[11] = '\0';
>>> + remove_spaces(tmp_opcode);
>>> +
>>> + dl->ins.opcode = PPC_OP(strtol(tmp_opcode, NULL, 16));
>>> + if (objdump)
>>> + dl->ins.opcode = PPC_OP(be32_to_cpu(strtol(tmp_opcode, NULL, 16)));
>>> + dl->ops.opcode = dl->ins.opcode;
>>> +
>>> + dl->ops.raw_insn = strtol(tmp_opcode, NULL, 16);
>>> + if (objdump)
>>> + dl->ops.raw_insn = be32_to_cpu(strtol(tmp_opcode, NULL, 16));
>>> + return 0;
>>> +}
>>> +
>>> static void annotation_line__init(struct annotation_line *al,
>>> struct annotate_args *args,
>>> int nr)
>>> @@ -897,7 +955,10 @@ struct disasm_line *disasm_line__new(struct annotate_args *args)
>>> goto out_delete;
>>>
>>> if (args->offset != -1) {
>>> - if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
>>> + if (arch__is(args->arch, "powerpc")) {
>>> + if (disasm_line__parse_powerpc(dl) < 0)
>>> + goto out_free_line;
>>> + } else if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
>>> goto out_free_line;
>>>
>>> disasm_line__init_ins(dl, args->arch, &args->ms);
>>> diff --git a/tools/perf/util/disasm.h b/tools/perf/util/disasm.h
>>> index 718177fa4775..a391e1bb81f7 100644
>>> --- a/tools/perf/util/disasm.h
>>> +++ b/tools/perf/util/disasm.h
>>> @@ -43,14 +43,19 @@ struct arch {
>>>
>>> struct ins {
>>> const char *name;
>>> + int opcode;
>>
>> I don't think this is the right place as 'ins' can be shared for
>> different opcodes. IIUC it's like a class and disasm_line should
>> have a pointer instead of a copy of the arch instructions. So I'd
>> like to keep a single instance if they behave in the same way. But
>> this is a separate change.
>>
>> I guess we can move it to struct disasm_line and use helper macros when
>> we need to access the opcode. This will be helpful for other arches.
>>
>> struct disasm_line {
>> struct ins *ins;
>> struct ins_operands ops;
>> union {
>> u8 bytes[4];
>> u32 opcode;
>> } raw;
>> struct annotation_line al;
>> };
>>
>> #define PPC_OP(dl) (((dl)->raw.bytes[0] >> 2) & 0x3F)
>
> We already have a definition for PPC_OP(), see arch/powerpc/xmon/ppc.h:
>
> /* A macro to extract the major opcode from an instruction. */
> #define PPC_OP(i) (((i) >> 26) & 0x3f)
>
> By the way why do you want to split off instructions in bytes ? On
> powerpc an instruction is one (sometimes two) u32, nothing else, and if
> you start breaking that into bytes you will likely unnecessarily
> increase complexity when a param has bits on different bytes, and maybe
> also with the byte order depending whether you are running on a little
> or big endian PPC.
> See for instance arch_decode_instruction() in
> tools/objtool/arch/powerpc/decode.c


Hi Namhyung, Christophe

Thanks for the feedback.

IIUC, Namhyung's main point is to move the "opcode" field out of "struct ins" and into "struct disasm_line".
In V3, I had both the opcode and the raw instruction as part of "struct ins" and "struct ins_operands".
For V4, I am moving the raw instruction field to "struct disasm_line". With this approach, "opcode"
won't be saved as a separate field. Instead, the raw instruction itself will be saved as part of disasm_line,
and helper macros will extract the opcode from it wherever needed. So the union will have "u8 bytes[4]" and
"u32 raw_insn". For powerpc, "raw_insn" will carry the raw instruction. Other archs could use "u8 bytes[4]",
depending on their implementation.

The layout below saves the raw instruction (u32) in the union "raw" as part of "struct disasm_line":

struct disasm_line {
struct ins *ins;
struct ins_operands ops;
union {
u8 bytes[4];
u32 raw_insn;
} raw;
struct annotation_line al;
};
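
As a rough illustration of why the u32 view is the convenient one for powerpc, here is a minimal standalone sketch. The names "raw_view" and "ppc_insn_from_le_bytes" are hypothetical and not part of the patch. It assumes the four bytes are in the order shown in the earlier example ("38 01 81 e8" for a little-endian binary) and builds the instruction word with explicit shifts, so the result does not depend on the byte order of the host running perf:

```c
#include <stdint.h>

/* Hypothetical stand-in for the union proposed for struct disasm_line. */
union raw_view {
	uint8_t  bytes[4];	/* byte-wise view, e.g. for other archs */
	uint32_t raw_insn;	/* whole 32-bit instruction word (powerpc) */
};

/*
 * Assemble a 32-bit instruction word from four bytes stored least
 * significant byte first. Using shifts instead of reading the union
 * through raw_insn keeps the result independent of host endianness.
 */
static uint32_t ppc_insn_from_le_bytes(const uint8_t b[4])
{
	return (uint32_t)b[0] |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}
```

With the bytes 38 01 81 e8 this gives e8810138, the same value the V3 patch comment shows for rawp_insn.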

And to access opcode, use helper macro for powerpc as below:

#define PPC_OP(op) (((op) >> 26) & 0x3F)


disasm_line__parse_powerpc, which parses the captured line, will initialise "dl->raw.raw_insn".
To access the opcode, use "PPC_OP(dl->raw.raw_insn)".
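
A small standalone sketch of how the helper macros would apply to a raw instruction word. The macros are copied from the V3 patch; the helper "ppc_is_major_op_ld" is a hypothetical name used only for illustration and is not in the patch:

```c
#include <stdint.h>

/* Field extractors as defined in the V3 patch. */
#define PPC_OP(op)	(((op) >> 26) & 0x3F)	/* major opcode, top 6 bits */
#define PPC_RA(a)	(((a) >> 16) & 0x1f)	/* base register */
#define PPC_RT(t)	(((t) >> 21) & 0x1f)	/* target register */
#define PPC_DS(DS)	((DS) & 0xfffc)		/* DS-form displacement */
#define OP_LD		58

/*
 * Hypothetical helper: returns 1 when the raw instruction has major
 * opcode 58 (OP_LD in the patch), 0 otherwise.
 *
 * Example: 0xe8810138 is "ld r4,312(r1)", so
 *   PPC_OP -> 58, PPC_RT -> 4, PPC_RA -> 1, PPC_DS -> 312
 */
static int ppc_is_major_op_ld(uint32_t raw_insn)
{
	return PPC_OP(raw_insn) == OP_LD;
}
```

Since the opcode is derived from raw_insn on demand, no separate "opcode" field needs to be kept in sync with it.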

Does this approach look good?

Athira
>
> By the way, why are we spreading different decoding functions in
> different tools ? Wouldn't it make sense to try and share decoding
> functions between objtool and perf for instance ?
>
> Christophe
>

Hi Christophe,

I think that's a good idea to check. At this point, I am not aware of the existing decoding functions in objtool or which of them could be shared with perf. After looking through them, I will be able to propose more on this. Can we go with the current approach of having helper macros in perf for now, and then explore the objtool side?

Thanks
Athira
>
>
>>
>> Thanks,
>> Namhyung
>>
>>>
>>> struct ins_ops *ops;
>>> };
>>>
>>> struct ins_operands {
>>> char *raw;
>>> + int raw_insn;
>>> + int opcode;
>>> struct {
>>> char *raw;
>>> char *name;
>>> + int opcode;
>>> + int raw_insn;
>>> struct symbol *sym;
>>> u64 addr;
>>> s64 offset;
>>> @@ -62,6 +67,8 @@ struct ins_operands {
>>> struct {
>>> char *raw;
>>> char *name;
>>> + int opcode;
>>> + int raw_insn;
>>> u64 addr;
>>> bool multi_regs;
>>> } source;
>>> --
>>> 2.43.0