2024-04-04 17:57:25

by Namhyung Kim

[permalink] [raw]
Subject: [PATCHSET 0/9] perf annotate: More memory footprint reduction

Hello,

This work is continuation of the previous work to reduce the memory
usage in symbol and annotation structures. Basically I moved some
fields in the annotation which consume spaces in the struct symbol
which is allocated regardless if the symbol has a sample or not when
annotation is enabled.

With this change applied, the struct annotation only has two members -
annotated_source and annotated_branch. The next step would be to
remove the struct annotation and to have a hash table from symbol to
each annotated struct directly.

No function changes intended.

The code is available at perf/annotate-diet-v3 branch in

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks,
Namhyung


Namhyung Kim (9):
perf annotate: Fix annotation_calc_lines()
perf annotate: Staticize some local functions
perf annotate: Introduce annotated_source__get_line()
perf annotate: Check annotation lines more efficiently
perf annotate: Get rid of offsets array
perf annotate: Move widths struct to annotated_source
perf annotate: Move max_jump_sources struct to annotated_source
perf annotate: Move nr_events struct to annotated_source
perf annotate: Move start field struct to annotated_source

tools/perf/ui/browsers/annotate.c | 15 ++-
tools/perf/util/annotate.c | 174 +++++++++++++++++-------------
tools/perf/util/annotate.h | 39 +++----
3 files changed, 123 insertions(+), 105 deletions(-)


base-commit: b6347cb5e04e9c1d17342ab46e2ace2d448de727
--
2.44.0.478.gd926399ef9-goog



2024-04-04 17:58:15

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 5/9] perf annotate: Get rid of offsets array

The struct annotated_source.offsets[] is to save pointers to
annotation_line at each offset. We can use annotated_source__get_line()
helper instead and let's get rid of the array.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/browsers/annotate.c | 5 +----
tools/perf/util/annotate.c | 29 ++++++-----------------------
tools/perf/util/annotate.h | 2 --
3 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index e72583f37972..c93da2ce727f 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -977,7 +977,7 @@ int symbol__tui_annotate(struct map_symbol *ms, struct evsel *evsel,
dso->annotate_warned = true;
symbol__strerror_disassemble(ms, err, msg, sizeof(msg));
ui__error("Couldn't annotate %s:\n%s", sym->name, msg);
- goto out_free_offsets;
+ return -1;
}
}

@@ -996,8 +996,5 @@ int symbol__tui_annotate(struct map_symbol *ms, struct evsel *evsel,
if(not_annotated)
annotated_source__purge(notes->src);

-out_free_offsets:
- if(not_annotated)
- zfree(&notes->src->offsets);
return ret;
}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index d98fc248ba5b..0e8319835986 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1378,7 +1378,7 @@ annotation__mark_jump_targets(struct annotation *notes, struct symbol *sym)
}
}

-static void annotation__set_offsets(struct annotation *notes, s64 size)
+static void annotation__set_index(struct annotation *notes)
{
struct annotation_line *al;
struct annotated_source *src = notes->src;
@@ -1393,18 +1393,9 @@ static void annotation__set_offsets(struct annotation *notes, s64 size)
if (src->max_line_len < line_len)
src->max_line_len = line_len;
al->idx = src->nr_entries++;
- if (al->offset != -1) {
+ if (al->offset != -1)
al->idx_asm = src->nr_asm_entries++;
- /*
- * FIXME: short term bandaid to cope with assembly
- * routines that comes with labels in the same column
- * as the address in objdump, sigh.
- *
- * E.g. copy_user_generic_unrolled
- */
- if (al->offset < size)
- notes->src->offsets[al->offset] = al;
- } else
+ else
al->idx_asm = -1;
}
}
@@ -1835,25 +1826,21 @@ int symbol__annotate2(struct map_symbol *ms, struct evsel *evsel,
size_t size = symbol__size(sym);
int nr_pcnt = 1, err;

- notes->src->offsets = zalloc(size * sizeof(struct annotation_line *));
- if (notes->src->offsets == NULL)
- return ENOMEM;
-
if (evsel__is_group_event(evsel))
nr_pcnt = evsel->core.nr_members;

err = symbol__annotate(ms, evsel, parch);
if (err)
- goto out_free_offsets;
+ return err;

symbol__calc_percent(sym, evsel);

- annotation__set_offsets(notes, size);
+ annotation__set_index(notes);
annotation__mark_jump_targets(notes, sym);

err = annotation__compute_ipc(notes, size);
if (err)
- goto out_free_offsets;
+ return err;

annotation__init_column_widths(notes, sym);
notes->nr_events = nr_pcnt;
@@ -1862,10 +1849,6 @@ int symbol__annotate2(struct map_symbol *ms, struct evsel *evsel,
sym->annotate2 = 1;

return 0;
-
-out_free_offsets:
- zfree(&notes->src->offsets);
- return err;
}

static int annotation__config(const char *var, const char *value, void *data)
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index aa3298c20300..d61184499bda 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -246,7 +246,6 @@ struct cyc_hist {
* we have more than a group in a evlist, where we will want
* to see each group separately, that is why symbol__annotate2()
* sets src->nr_histograms to evsel->nr_members.
- * @offsets: Array of annotation_line to be accessed by offset.
* @samples: Hash map of sym_hist_entry. Keyed by event index and offset in symbol.
* @nr_entries: Number of annotated_line in the source list.
* @nr_asm_entries: Number of annotated_line with actual asm instruction in the
@@ -262,7 +261,6 @@ struct cyc_hist {
struct annotated_source {
struct list_head source;
struct sym_hist *histograms;
- struct annotation_line **offsets;
struct hashmap *samples;
int nr_histograms;
int nr_entries;
--
2.44.0.478.gd926399ef9-goog


2024-04-04 17:58:26

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 4/9] perf annotate: Check annotation lines more efficiently

In some places, it checks annotated (disasm) lines for each byte. But
as it already has a list of disasm lines, it'd be better to traverse the
list entries instead of checking every offset with linear search (by
annotated_source__get_line() helper).

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate.c | 56 ++++++++++++++++++++++++--------------
1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 2409d7424c71..d98fc248ba5b 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -383,12 +383,19 @@ struct annotation_line *annotated_source__get_line(struct annotated_source *src,

static unsigned annotation__count_insn(struct annotation *notes, u64 start, u64 end)
{
+ struct annotation_line *al;
unsigned n_insn = 0;
- u64 offset;

- for (offset = start; offset <= end; offset++) {
- if (annotated_source__get_line(notes->src, offset))
- n_insn++;
+ al = annotated_source__get_line(notes->src, start);
+ if (al == NULL)
+ return 0;
+
+ list_for_each_entry_from(al, &notes->src->source, node) {
+ if (al->offset == -1)
+ continue;
+ if ((u64)al->offset > end)
+ break;
+ n_insn++;
}
return n_insn;
}
@@ -405,10 +412,10 @@ static void annotation__count_and_fill(struct annotation *notes, u64 start, u64
{
unsigned n_insn;
unsigned int cover_insn = 0;
- u64 offset;

n_insn = annotation__count_insn(notes, start, end);
if (n_insn && ch->num && ch->cycles) {
+ struct annotation_line *al;
struct annotated_branch *branch;
float ipc = n_insn / ((double)ch->cycles / (double)ch->num);

@@ -416,11 +423,16 @@ static void annotation__count_and_fill(struct annotation *notes, u64 start, u64
if (ch->reset >= 0x7fff)
return;

- for (offset = start; offset <= end; offset++) {
- struct annotation_line *al;
+ al = annotated_source__get_line(notes->src, start);
+ if (al == NULL)
+ return;

- al = annotated_source__get_line(notes->src, offset);
- if (al && al->cycles && al->cycles->ipc == 0.0) {
+ list_for_each_entry_from(al, &notes->src->source, node) {
+ if (al->offset == -1)
+ continue;
+ if ((u64)al->offset > end)
+ break;
+ if (al->cycles && al->cycles->ipc == 0.0) {
al->cycles->ipc = ipc;
cover_insn++;
}
@@ -1268,13 +1280,16 @@ void symbol__annotate_decay_histogram(struct symbol *sym, int evidx)
{
struct annotation *notes = symbol__annotation(sym);
struct sym_hist *h = annotation__histogram(notes, evidx);
- int len = symbol__size(sym), offset;
+ struct annotation_line *al;

h->nr_samples = 0;
- for (offset = 0; offset < len; ++offset) {
+ list_for_each_entry(al, &notes->src->source, node) {
struct sym_hist_entry *entry;

- entry = annotated_source__hist_entry(notes->src, evidx, offset);
+ if (al->offset == -1)
+ continue;
+
+ entry = annotated_source__hist_entry(notes->src, evidx, al->offset);
if (entry == NULL)
continue;

@@ -1334,33 +1349,32 @@ bool disasm_line__is_valid_local_jump(struct disasm_line *dl, struct symbol *sym
static void
annotation__mark_jump_targets(struct annotation *notes, struct symbol *sym)
{
- u64 offset, size = symbol__size(sym);
+ struct annotation_line *al;

/* PLT symbols contain external offsets */
if (strstr(sym->name, "@plt"))
return;

- for (offset = 0; offset < size; ++offset) {
- struct annotation_line *al;
+ list_for_each_entry(al, &notes->src->source, node) {
struct disasm_line *dl;
+ struct annotation_line *target;

- al = annotated_source__get_line(notes->src, offset);
dl = disasm_line(al);

if (!disasm_line__is_valid_local_jump(dl, sym))
continue;

- al = notes->src->offsets[dl->ops.target.offset];
-
+ target = annotated_source__get_line(notes->src,
+ dl->ops.target.offset);
/*
* FIXME: Oops, no jump target? Buggy disassembler? Or do we
* have to adjust to the previous offset?
*/
- if (al == NULL)
+ if (target == NULL)
continue;

- if (++al->jump_sources > notes->max_jump_sources)
- notes->max_jump_sources = al->jump_sources;
+ if (++target->jump_sources > notes->max_jump_sources)
+ notes->max_jump_sources = target->jump_sources;
}
}

--
2.44.0.478.gd926399ef9-goog


2024-04-04 17:58:39

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 7/9] perf annotate: Move max_jump_sources struct to annotated_source

It's only used in perf annotate output which means functions with actual
samples. No need to consume memory for every symbol (annotation).

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/browsers/annotate.c | 2 +-
tools/perf/util/annotate.c | 6 +++---
tools/perf/util/annotate.h | 4 +++-
3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 032642a0b4b6..0e16c268e329 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -49,7 +49,7 @@ static int ui_browser__jumps_percent_color(struct ui_browser *browser, int nr, b

if (current && (!browser->use_navkeypressed || browser->navkeypressed))
return HE_COLORSET_SELECTED;
- if (nr == notes->max_jump_sources)
+ if (nr == notes->src->max_jump_sources)
return HE_COLORSET_TOP;
if (nr > 1)
return HE_COLORSET_MEDIUM;
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 0be744bb529c..1fd51856d78f 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1373,8 +1373,8 @@ annotation__mark_jump_targets(struct annotation *notes, struct symbol *sym)
if (target == NULL)
continue;

- if (++target->jump_sources > notes->max_jump_sources)
- notes->max_jump_sources = target->jump_sources;
+ if (++target->jump_sources > notes->src->max_jump_sources)
+ notes->src->max_jump_sources = target->jump_sources;
}
}

@@ -1432,7 +1432,7 @@ annotation__init_column_widths(struct annotation *notes, struct symbol *sym)
notes->src->widths.addr = notes->src->widths.target =
notes->src->widths.min_addr = hex_width(symbol__size(sym));
notes->src->widths.max_addr = hex_width(sym->end);
- notes->src->widths.jumps = width_jumps(notes->max_jump_sources);
+ notes->src->widths.jumps = width_jumps(notes->src->max_jump_sources);
notes->src->widths.max_ins_name = annotation__max_ins_name(notes);
}

diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 402ae774426b..382705311d28 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -250,6 +250,8 @@ struct cyc_hist {
* @nr_entries: Number of annotated_line in the source list.
* @nr_asm_entries: Number of annotated_line with actual asm instruction in the
* source list.
+ * @max_jump_sources: Maximum number of jump instructions targeting to the same
+ * instruction.
* @widths: Precalculated width of each column in the TUI output.
*
* disasm_lines are allocated, percentages calculated and all sorted by percentage
@@ -265,6 +267,7 @@ struct annotated_source {
int nr_histograms;
int nr_entries;
int nr_asm_entries;
+ int max_jump_sources;
struct {
u8 addr;
u8 jumps;
@@ -309,7 +312,6 @@ struct annotated_branch {
struct LOCKABLE annotation {
u64 start;
int nr_events;
- int max_jump_sources;
struct annotated_source *src;
struct annotated_branch *branch;
};
--
2.44.0.478.gd926399ef9-goog


2024-04-04 17:58:48

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 6/9] perf annotate: Move widths struct to annotated_source

It's only used in perf annotate output which means functions with actual
samples. No need to consume memory for every symbol (annotation).

Also move max_line_len field into it as it's related.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/browsers/annotate.c | 6 ++---
tools/perf/util/annotate.c | 41 +++++++++++++++++--------------
tools/perf/util/annotate.h | 20 +++++++--------
3 files changed, 35 insertions(+), 32 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index c93da2ce727f..032642a0b4b6 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -205,13 +205,13 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)

ui_browser__set_color(browser, HE_COLORSET_JUMP_ARROWS);
__ui_browser__line_arrow(browser,
- pcnt_width + 2 + notes->widths.addr + width,
+ pcnt_width + 2 + notes->src->widths.addr + width,
from, to);

diff = is_fused(ab, cursor);
if (diff > 0) {
ui_browser__mark_fused(browser,
- pcnt_width + 3 + notes->widths.addr + width,
+ pcnt_width + 3 + notes->src->widths.addr + width,
from - diff, diff, to > from);
}
}
@@ -983,7 +983,7 @@ int symbol__tui_annotate(struct map_symbol *ms, struct evsel *evsel,

ui_helpline__push("Press ESC to exit");

- browser.b.width = notes->src->max_line_len;
+ browser.b.width = notes->src->widths.max_line_len;
browser.b.nr_entries = notes->src->nr_entries;
browser.b.entries = &notes->src->source,
browser.b.width += 18; /* Percentage */
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 0e8319835986..0be744bb529c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1383,15 +1383,15 @@ static void annotation__set_index(struct annotation *notes)
struct annotation_line *al;
struct annotated_source *src = notes->src;

- src->max_line_len = 0;
+ src->widths.max_line_len = 0;
src->nr_entries = 0;
src->nr_asm_entries = 0;

list_for_each_entry(al, &src->source, node) {
size_t line_len = strlen(al->line);

- if (src->max_line_len < line_len)
- src->max_line_len = line_len;
+ if (src->widths.max_line_len < line_len)
+ src->widths.max_line_len = line_len;
al->idx = src->nr_entries++;
if (al->offset != -1)
al->idx_asm = src->nr_asm_entries++;
@@ -1429,26 +1429,26 @@ static int annotation__max_ins_name(struct annotation *notes)
static void
annotation__init_column_widths(struct annotation *notes, struct symbol *sym)
{
- notes->widths.addr = notes->widths.target =
- notes->widths.min_addr = hex_width(symbol__size(sym));
- notes->widths.max_addr = hex_width(sym->end);
- notes->widths.jumps = width_jumps(notes->max_jump_sources);
- notes->widths.max_ins_name = annotation__max_ins_name(notes);
+ notes->src->widths.addr = notes->src->widths.target =
+ notes->src->widths.min_addr = hex_width(symbol__size(sym));
+ notes->src->widths.max_addr = hex_width(sym->end);
+ notes->src->widths.jumps = width_jumps(notes->max_jump_sources);
+ notes->src->widths.max_ins_name = annotation__max_ins_name(notes);
}

void annotation__update_column_widths(struct annotation *notes)
{
if (annotate_opts.use_offset)
- notes->widths.target = notes->widths.min_addr;
+ notes->src->widths.target = notes->src->widths.min_addr;
else if (annotate_opts.full_addr)
- notes->widths.target = BITS_PER_LONG / 4;
+ notes->src->widths.target = BITS_PER_LONG / 4;
else
- notes->widths.target = notes->widths.max_addr;
+ notes->src->widths.target = notes->src->widths.max_addr;

- notes->widths.addr = notes->widths.target;
+ notes->src->widths.addr = notes->src->widths.target;

if (annotate_opts.show_nr_jumps)
- notes->widths.addr += notes->widths.jumps + 1;
+ notes->src->widths.addr += notes->src->widths.jumps + 1;
}

void annotation__toggle_full_addr(struct annotation *notes, struct map_symbol *ms)
@@ -1625,7 +1625,8 @@ static void disasm_line__write(struct disasm_line *dl, struct annotation *notes,
obj__printf(obj, " ");
}

- disasm_line__scnprintf(dl, bf, size, !annotate_opts.use_offset, notes->widths.max_ins_name);
+ disasm_line__scnprintf(dl, bf, size, !annotate_opts.use_offset,
+ notes->src->widths.max_ins_name);
}

static void ipc_coverage_string(char *bf, int size, struct annotation *notes)
@@ -1753,9 +1754,11 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
obj__printf(obj, "%-*s", width - pcnt_width - cycles_width, " ");
else if (al->offset == -1) {
if (al->line_nr && annotate_opts.show_linenr)
- printed = scnprintf(bf, sizeof(bf), "%-*d ", notes->widths.addr + 1, al->line_nr);
+ printed = scnprintf(bf, sizeof(bf), "%-*d ",
+ notes->src->widths.addr + 1, al->line_nr);
else
- printed = scnprintf(bf, sizeof(bf), "%-*s ", notes->widths.addr, " ");
+ printed = scnprintf(bf, sizeof(bf), "%-*s ",
+ notes->src->widths.addr, " ");
obj__printf(obj, bf);
obj__printf(obj, "%-*s", width - printed - pcnt_width - cycles_width + 1, al->line);
} else {
@@ -1773,7 +1776,7 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
if (annotate_opts.show_nr_jumps) {
int prev;
printed = scnprintf(bf, sizeof(bf), "%*d ",
- notes->widths.jumps,
+ notes->src->widths.jumps,
al->jump_sources);
prev = obj__set_jumps_percent_color(obj, al->jump_sources,
current_entry);
@@ -1782,7 +1785,7 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
}
print_addr:
printed = scnprintf(bf, sizeof(bf), "%*" PRIx64 ": ",
- notes->widths.target, addr);
+ notes->src->widths.target, addr);
} else if (ins__is_call(&disasm_line(al)->ins) &&
annotate_opts.offset_level >= ANNOTATION__OFFSET_CALL) {
goto print_addr;
@@ -1790,7 +1793,7 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
goto print_addr;
} else {
printed = scnprintf(bf, sizeof(bf), "%-*s ",
- notes->widths.addr, " ");
+ notes->src->widths.addr, " ");
}
}

diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index d61184499bda..402ae774426b 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -250,7 +250,7 @@ struct cyc_hist {
* @nr_entries: Number of annotated_line in the source list.
* @nr_asm_entries: Number of annotated_line with actual asm instruction in the
* source list.
- * @max_line_len: Maximum length of objdump output in an annotated_line.
+ * @widths: Precalculated width of each column in the TUI output.
*
* disasm_lines are allocated, percentages calculated and all sorted by percentage
* when the annotation is about to be presented, so the percentages are for
@@ -265,7 +265,15 @@ struct annotated_source {
int nr_histograms;
int nr_entries;
int nr_asm_entries;
- u16 max_line_len;
+ struct {
+ u8 addr;
+ u8 jumps;
+ u8 target;
+ u8 min_addr;
+ u8 max_addr;
+ u8 max_ins_name;
+ u16 max_line_len;
+ } widths;
};

struct annotation_line *annotated_source__get_line(struct annotated_source *src,
@@ -302,14 +310,6 @@ struct LOCKABLE annotation {
u64 start;
int nr_events;
int max_jump_sources;
- struct {
- u8 addr;
- u8 jumps;
- u8 target;
- u8 min_addr;
- u8 max_addr;
- u8 max_ins_name;
- } widths;
struct annotated_source *src;
struct annotated_branch *branch;
};
--
2.44.0.478.gd926399ef9-goog


2024-04-04 17:59:10

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 3/9] perf annotate: Introduce annotated_source__get_line()

It's a helper function to get annotation_line at the given offset
without using the offsets array. The goal is to get rid of the
offsets array altogether. It just does the linear search but I
think it's better to save memory as it won't be called in a hot
path.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/ui/browsers/annotate.c | 2 +-
tools/perf/util/annotate.c | 26 +++++++++++++++++++++-----
tools/perf/util/annotate.h | 3 +++
3 files changed, 25 insertions(+), 6 deletions(-)

diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index ec5e21932876..e72583f37972 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -186,7 +186,7 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)
* name right after the '<' token and probably treating this like a
* 'call' instruction.
*/
- target = notes->src->offsets[cursor->ops.target.offset];
+ target = annotated_source__get_line(notes->src, cursor->ops.target.offset);
if (target == NULL) {
ui_helpline__printf("WARN: jump target inconsistency, press 'o', notes->offsets[%#x] = NULL\n",
cursor->ops.target.offset);
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index bbf4894b1309..2409d7424c71 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -369,13 +369,25 @@ int addr_map_symbol__account_cycles(struct addr_map_symbol *ams,
return err;
}

+struct annotation_line *annotated_source__get_line(struct annotated_source *src,
+ s64 offset)
+{
+ struct annotation_line *al;
+
+ list_for_each_entry(al, &src->source, node) {
+ if (al->offset == offset)
+ return al;
+ }
+ return NULL;
+}
+
static unsigned annotation__count_insn(struct annotation *notes, u64 start, u64 end)
{
unsigned n_insn = 0;
u64 offset;

for (offset = start; offset <= end; offset++) {
- if (notes->src->offsets[offset])
+ if (annotated_source__get_line(notes->src, offset))
n_insn++;
}
return n_insn;
@@ -405,8 +417,9 @@ static void annotation__count_and_fill(struct annotation *notes, u64 start, u64
return;

for (offset = start; offset <= end; offset++) {
- struct annotation_line *al = notes->src->offsets[offset];
+ struct annotation_line *al;

+ al = annotated_source__get_line(notes->src, offset);
if (al && al->cycles && al->cycles->ipc == 0.0) {
al->cycles->ipc = ipc;
cover_insn++;
@@ -443,7 +456,7 @@ static int annotation__compute_ipc(struct annotation *notes, size_t size)
if (ch && ch->cycles) {
struct annotation_line *al;

- al = notes->src->offsets[offset];
+ al = annotated_source__get_line(notes->src, offset);
if (al && al->cycles == NULL) {
al->cycles = zalloc(sizeof(*al->cycles));
if (al->cycles == NULL) {
@@ -466,7 +479,9 @@ static int annotation__compute_ipc(struct annotation *notes, size_t size)
struct cyc_hist *ch = &notes->branch->cycles_hist[offset];

if (ch && ch->cycles) {
- struct annotation_line *al = notes->src->offsets[offset];
+ struct annotation_line *al;
+
+ al = annotated_source__get_line(notes->src, offset);
if (al)
zfree(&al->cycles);
}
@@ -1326,9 +1341,10 @@ annotation__mark_jump_targets(struct annotation *notes, struct symbol *sym)
return;

for (offset = 0; offset < size; ++offset) {
- struct annotation_line *al = notes->src->offsets[offset];
+ struct annotation_line *al;
struct disasm_line *dl;

+ al = annotated_source__get_line(notes->src, offset);
dl = disasm_line(al);

if (!disasm_line__is_valid_local_jump(dl, sym))
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 3f383f38f65f..aa3298c20300 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -270,6 +270,9 @@ struct annotated_source {
u16 max_line_len;
};

+struct annotation_line *annotated_source__get_line(struct annotated_source *src,
+ s64 offset);
+
/**
* struct annotated_branch - basic block and IPC information for a symbol.
*
--
2.44.0.478.gd926399ef9-goog


2024-04-04 18:17:22

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 1/9] perf annotate: Fix annotation_calc_lines()

It should pass a proper address (i.e. suitable for objdump or addr2line)
to get_srcline() in order to work correctly. It used to pass an address
with map__rip_2objdump() as the second argument but later it's changed
to use notes->start. It's ok in normal cases but it can be changed when
annotate_opts.full_addr is set. So let's convert the address directly
instead of using the notes->start.

Also the last argument is an IP to print symbol offset if requested. So
it should pass symbol-relative address.

Fixes: 7d18a824b5e5 ("perf annotate: Toggle full address <-> offset display")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 35235147b111..a330e92c2552 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1440,7 +1440,7 @@ void annotation__toggle_full_addr(struct annotation *notes, struct map_symbol *m
annotation__update_column_widths(notes);
}

-static void annotation__calc_lines(struct annotation *notes, struct map *map,
+static void annotation__calc_lines(struct annotation *notes, struct map_symbol *ms,
struct rb_root *root)
{
struct annotation_line *al;
@@ -1448,6 +1448,7 @@ static void annotation__calc_lines(struct annotation *notes, struct map *map,

list_for_each_entry(al, &notes->src->source, node) {
double percent_max = 0.0;
+ u64 addr;
int i;

for (i = 0; i < al->data_nr; i++) {
@@ -1463,8 +1464,9 @@ static void annotation__calc_lines(struct annotation *notes, struct map *map,
if (percent_max <= 0.5)
continue;

- al->path = get_srcline(map__dso(map), notes->start + al->offset, NULL,
- false, true, notes->start + al->offset);
+ addr = map__rip_2objdump(ms->map, ms->sym->start);
+ al->path = get_srcline(map__dso(ms->map), addr + al->offset, NULL,
+ false, true, ms->sym->start + al->offset);
insert_source_line(&tmp_root, al);
}

@@ -1475,7 +1477,7 @@ static void symbol__calc_lines(struct map_symbol *ms, struct rb_root *root)
{
struct annotation *notes = symbol__annotation(ms->sym);

- annotation__calc_lines(notes, ms->map, root);
+ annotation__calc_lines(notes, ms, root);
}

int symbol__tty_annotate2(struct map_symbol *ms, struct evsel *evsel)
--
2.44.0.478.gd926399ef9-goog


2024-04-04 18:17:32

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 2/9] perf annotate: Staticize some local functions

I found annotation__mark_jump_targets(), annotation__set_offsets()
and annotation__init_column_widths() are only used in the same file.
Let's make them static.

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate.c | 8 +++++---
tools/perf/util/annotate.h | 3 ---
2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index a330e92c2552..bbf4894b1309 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1316,7 +1316,8 @@ bool disasm_line__is_valid_local_jump(struct disasm_line *dl, struct symbol *sym
return true;
}

-void annotation__mark_jump_targets(struct annotation *notes, struct symbol *sym)
+static void
+annotation__mark_jump_targets(struct annotation *notes, struct symbol *sym)
{
u64 offset, size = symbol__size(sym);

@@ -1347,7 +1348,7 @@ void annotation__mark_jump_targets(struct annotation *notes, struct symbol *sym)
}
}

-void annotation__set_offsets(struct annotation *notes, s64 size)
+static void annotation__set_offsets(struct annotation *notes, s64 size)
{
struct annotation_line *al;
struct annotated_source *src = notes->src;
@@ -1404,7 +1405,8 @@ static int annotation__max_ins_name(struct annotation *notes)
return max_name;
}

-void annotation__init_column_widths(struct annotation *notes, struct symbol *sym)
+static void
+annotation__init_column_widths(struct annotation *notes, struct symbol *sym)
{
notes->widths.addr = notes->widths.target =
notes->widths.min_addr = hex_width(symbol__size(sym));
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index b3007c9966fd..3f383f38f65f 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -340,10 +340,7 @@ static inline bool annotation_line__filter(struct annotation_line *al)
return annotate_opts.hide_src_code && al->offset == -1;
}

-void annotation__set_offsets(struct annotation *notes, s64 size);
-void annotation__mark_jump_targets(struct annotation *notes, struct symbol *sym);
void annotation__update_column_widths(struct annotation *notes);
-void annotation__init_column_widths(struct annotation *notes, struct symbol *sym);
void annotation__toggle_full_addr(struct annotation *notes, struct map_symbol *ms);

static inline struct sym_hist *annotated_source__histogram(struct annotated_source *src, int idx)
--
2.44.0.478.gd926399ef9-goog


2024-04-04 18:18:54

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 8/9] perf annotate: Move nr_events struct to annotated_source

It's only used in perf annotate output which means functions with actual
samples. No need to consume memory for every symbol (annotation).

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate.c | 6 +++---
tools/perf/util/annotate.h | 5 +++--
2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 1fd51856d78f..5f79ae0bccfd 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1584,7 +1584,7 @@ static double annotation_line__max_percent(struct annotation_line *al,
double percent_max = 0.0;
int i;

- for (i = 0; i < notes->nr_events; i++) {
+ for (i = 0; i < notes->src->nr_events; i++) {
double percent;

percent = annotation_data__percent(&al->data[i],
@@ -1674,7 +1674,7 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
if (al->offset != -1 && percent_max != 0.0) {
int i;

- for (i = 0; i < notes->nr_events; i++) {
+ for (i = 0; i < notes->src->nr_events; i++) {
double percent;

percent = annotation_data__percent(&al->data[i], percent_type);
@@ -1846,7 +1846,7 @@ int symbol__annotate2(struct map_symbol *ms, struct evsel *evsel,
return err;

annotation__init_column_widths(notes, sym);
- notes->nr_events = nr_pcnt;
+ notes->src->nr_events = nr_pcnt;

annotation__update_column_widths(notes);
sym->annotate2 = 1;
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index 382705311d28..d22b9e9a2fad 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -247,6 +247,7 @@ struct cyc_hist {
* to see each group separately, that is why symbol__annotate2()
* sets src->nr_histograms to evsel->nr_members.
* @samples: Hash map of sym_hist_entry. Keyed by event index and offset in symbol.
+ * @nr_events: Number of events in the current output.
* @nr_entries: Number of annotated_line in the source list.
* @nr_asm_entries: Number of annotated_line with actual asm instruction in the
* source list.
@@ -265,6 +266,7 @@ struct annotated_source {
struct sym_hist *histograms;
struct hashmap *samples;
int nr_histograms;
+ int nr_events;
int nr_entries;
int nr_asm_entries;
int max_jump_sources;
@@ -311,7 +313,6 @@ struct annotated_branch {

struct LOCKABLE annotation {
u64 start;
- int nr_events;
struct annotated_source *src;
struct annotated_branch *branch;
};
@@ -335,7 +336,7 @@ static inline int annotation__cycles_width(struct annotation *notes)

static inline int annotation__pcnt_width(struct annotation *notes)
{
- return (symbol_conf.show_total_period ? 12 : 7) * notes->nr_events;
+ return (symbol_conf.show_total_period ? 12 : 7) * notes->src->nr_events;
}

static inline bool annotation_line__filter(struct annotation_line *al)
--
2.44.0.478.gd926399ef9-goog


2024-04-04 18:19:04

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 9/9] perf annotate: Move start field struct to annotated_source

It's only used in perf annotate output which means functions with actual
samples. No need to consume memory for every symbol (annotation).

Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/annotate.c | 10 +++++-----
tools/perf/util/annotate.h | 2 +-
2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 5f79ae0bccfd..4db49611c386 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -909,9 +909,9 @@ int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
args.arch = arch;
args.ms = *ms;
if (annotate_opts.full_addr)
- notes->start = map__objdump_2mem(ms->map, ms->sym->start);
+ notes->src->start = map__objdump_2mem(ms->map, ms->sym->start);
else
- notes->start = map__rip_2objdump(ms->map, ms->sym->start);
+ notes->src->start = map__rip_2objdump(ms->map, ms->sym->start);

return symbol__disassemble(sym, &args);
}
@@ -1456,9 +1456,9 @@ void annotation__toggle_full_addr(struct annotation *notes, struct map_symbol *m
annotate_opts.full_addr = !annotate_opts.full_addr;

if (annotate_opts.full_addr)
- notes->start = map__objdump_2mem(ms->map, ms->sym->start);
+ notes->src->start = map__objdump_2mem(ms->map, ms->sym->start);
else
- notes->start = map__rip_2objdump(ms->map, ms->sym->start);
+ notes->src->start = map__rip_2objdump(ms->map, ms->sym->start);

annotation__update_column_widths(notes);
}
@@ -1766,7 +1766,7 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
int color = -1;

if (!annotate_opts.use_offset)
- addr += notes->start;
+ addr += notes->src->start;

if (!annotate_opts.use_offset) {
printed = scnprintf(bf, sizeof(bf), "%" PRIx64 ": ", addr);
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index d22b9e9a2fad..d5c821c22f79 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -270,6 +270,7 @@ struct annotated_source {
int nr_entries;
int nr_asm_entries;
int max_jump_sources;
+ u64 start;
struct {
u8 addr;
u8 jumps;
@@ -312,7 +313,6 @@ struct annotated_branch {
};

struct LOCKABLE annotation {
- u64 start;
struct annotated_source *src;
struct annotated_branch *branch;
};
--
2.44.0.478.gd926399ef9-goog


2024-04-05 23:42:06

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH 9/9] perf annotate: Move start field struct to annotated_source

On Thu, Apr 4, 2024 at 10:57 AM Namhyung Kim <[email protected]> wrote:
>
> It's only used in perf annotate output which means functions with actual
> samples. No need to consume memory for every symbol (annotation).
>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/util/annotate.c | 10 +++++-----
> tools/perf/util/annotate.h | 2 +-
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
> index 5f79ae0bccfd..4db49611c386 100644
> --- a/tools/perf/util/annotate.c
> +++ b/tools/perf/util/annotate.c
> @@ -909,9 +909,9 @@ int symbol__annotate(struct map_symbol *ms, struct evsel *evsel,
> args.arch = arch;
> args.ms = *ms;
> if (annotate_opts.full_addr)
> - notes->start = map__objdump_2mem(ms->map, ms->sym->start);
> + notes->src->start = map__objdump_2mem(ms->map, ms->sym->start);
> else
> - notes->start = map__rip_2objdump(ms->map, ms->sym->start);
> + notes->src->start = map__rip_2objdump(ms->map, ms->sym->start);
>
> return symbol__disassemble(sym, &args);
> }
> @@ -1456,9 +1456,9 @@ void annotation__toggle_full_addr(struct annotation *notes, struct map_symbol *m
> annotate_opts.full_addr = !annotate_opts.full_addr;
>
> if (annotate_opts.full_addr)
> - notes->start = map__objdump_2mem(ms->map, ms->sym->start);
> + notes->src->start = map__objdump_2mem(ms->map, ms->sym->start);
> else
> - notes->start = map__rip_2objdump(ms->map, ms->sym->start);
> + notes->src->start = map__rip_2objdump(ms->map, ms->sym->start);
>
> annotation__update_column_widths(notes);
> }
> @@ -1766,7 +1766,7 @@ static void __annotation_line__write(struct annotation_line *al, struct annotati
> int color = -1;
>
> if (!annotate_opts.use_offset)
> - addr += notes->start;
> + addr += notes->src->start;
>
> if (!annotate_opts.use_offset) {
> printed = scnprintf(bf, sizeof(bf), "%" PRIx64 ": ", addr);
> diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
> index d22b9e9a2fad..d5c821c22f79 100644
> --- a/tools/perf/util/annotate.h
> +++ b/tools/perf/util/annotate.h
> @@ -270,6 +270,7 @@ struct annotated_source {
> int nr_entries;
> int nr_asm_entries;
> int max_jump_sources;
> + u64 start;

This introduces a 4 byte hole:

```
struct annotated_source {
struct list_head source; /* 0 16 */
struct sym_hist * histograms; /* 16 8 */
struct hashmap * samples; /* 24 8 */
int nr_histograms; /* 32 4 */
int nr_events; /* 36 4 */
int nr_entries; /* 40 4 */
int nr_asm_entries; /* 44 4 */
int max_jump_sources; /* 48 4 */

/* XXX 4 bytes hole, try to pack */

u64 start; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
struct {
u8 addr; /* 64 1 */
u8 jumps; /* 65 1 */
u8 target; /* 66 1 */
u8 min_addr; /* 67 1 */
u8 max_addr; /* 68 1 */
u8 max_ins_name; /* 69 1 */
u16 max_line_len; /* 70 2 */
} widths; /* 64 8 */

/* size: 72, cachelines: 2, members: 10 */
/* sum members: 68, holes: 1, sum holes: 4 */
/* last cacheline: 8 bytes */
};
```

If you sort variables from largest to smallest then it generally
avoids holes (so max_line_len violates this, but it doesn't introduce
padding). Anyway, a simple fix is to just shuffle things a little.

Thanks,
Ian

> struct {
> u8 addr;
> u8 jumps;
> @@ -312,7 +313,6 @@ struct annotated_branch {
> };
>
> struct LOCKABLE annotation {
> - u64 start;
> struct annotated_source *src;
> struct annotated_branch *branch;
> };
> --
> 2.44.0.478.gd926399ef9-goog
>

2024-04-05 23:44:32

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCHSET 0/9] perf annotate: More memory footprint reduction

On Thu, Apr 4, 2024 at 10:57 AM Namhyung Kim <[email protected]> wrote:
>
> Hello,
>
> This work is continuation of the previous work to reduce the memory
> usage in symbol and annotation structures. Basically I moved some
> fields in the annotation which consume spaces in the struct symbol
> which is allocated regardless if the symbol has a sample or not when
> annotation is enabled.
>
> With this change applied, the struct annotation only has two members -
> annotated_source and annotated_branch. The next step would be to
> remove the struct annotation and to have a hash table from symbol to
> each annotated struct directly.
>
> No function changes intended.
>
> The code is available at perf/annotate-diet-v3 branch in
>
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
>
> Namhyung Kim (9):
> perf annotate: Fix annotation_calc_lines()
> perf annotate: Staticize some local functions
> perf annotate: Introduce annotated_source__get_line()
> perf annotate: Check annotation lines more efficiently
> perf annotate: Get rid of offsets array
> perf annotate: Move widths struct to annotated_source
> perf annotate: Move max_jump_sources struct to annotated_source
> perf annotate: Move nr_events struct to annotated_source
> perf annotate: Move start field struct to annotated_source

Series:

Tested-by: Ian Rogers <[email protected]>

using -fsanitize=address. Only nit was noticing 4 bytes of padding.

Thanks,
Ian

> tools/perf/ui/browsers/annotate.c | 15 ++-
> tools/perf/util/annotate.c | 174 +++++++++++++++++-------------
> tools/perf/util/annotate.h | 39 +++----
> 3 files changed, 123 insertions(+), 105 deletions(-)
>
>
> base-commit: b6347cb5e04e9c1d17342ab46e2ace2d448de727
> --
> 2.44.0.478.gd926399ef9-goog
>

2024-04-08 13:56:31

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCHSET 0/9] perf annotate: More memory footprint reduction

On Thu, Apr 04, 2024 at 10:57:07AM -0700, Namhyung Kim wrote:
> Hello,
>
> This work is continuation of the previous work to reduce the memory
> usage in symbol and annotation structures. Basically I moved some
> fields in the annotation which consume spaces in the struct symbol
> which is allocated regardless if the symbol has a sample or not when
> annotation is enabled.
>
> With this change applied, the struct annotation only has two members -
> annotated_source and annotated_branch. The next step would be to
> remove the struct annotation and to have a hash table from symbol to
> each annotated struct directly.
>
> No function changes intended.
>
> The code is available at perf/annotate-diet-v3 branch in
>
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks, applied to perf-tools-next,

- Arnaldo