2023-06-08 23:51:44

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 00/26] Fix memory leaks (was reference count checking for thread)

Use leak sanitizer and reference count checking to fix outstanding
memory leaks in "perf top" or those discovered in "perf test". Also
fix address sanitizer issues discovered.

Add reference count checking to thread after first refactoring bits of
the code, such as making the thread red-black tree non-invasive (so
the thread it references is easier to reference count, rather than
having 3 potential references). Part of this refactoring also removes
the dead thread list because if we held a reference here the threads
would never die and anything else has questionable
correctness.

addr_location is made into its own C/header file to capture the init,
exit and copy code.

Refactor and change callchain_cursor to come from a pthread key so
that a destructor can run on pthreads exiting.

Fix additional outstanding memory leak and reference count issues to
the point that "perf test" compiled with address sanitizer but without
libtraceevent passes all but one test - libtraceevent reports leaks
within its own code, most likely as it isn't compiled with
sanitizers. The remaining failing test is "68: Test dwarf unwind" and
that has address sanitizer issues as it uses memcpy to access the
stack within the process - we likely want to skip parts of the test
with sanitizers enabled.

v2. Include extra fixes for callchain cursor, addr2line and related
"perf top" fixes, as well as the 2 patches in:
https://lore.kernel.org/lkml/[email protected]/

Ian Rogers (26):
perf thread: Remove notion of dead threads
perf thread: Make threads rbtree non-invasive
perf thread: Add accessor functions for thread
perf maps: Make delete static, always use put
perf addr_location: Move to its own header
perf addr_location: Add init/exit/copy functions
perf thread: Add reference count checking
perf machine: Make delete_threads part of machine__exit
perf report: Avoid thread leak
perf header: Ensure bitmaps are freed
perf stat: Avoid evlist leak
perf intel-pt: Fix missed put and leak
perf evlist: Free stats in all evlist destruction
perf python: Avoid 2 leak sanitizer issues
perf jit: Fix two thread leaks
perf symbol-elf: Correct holding a reference
perf maps: Fix overlapping memory leak
perf machine: Fix leak of kernel dso
perf machine: Don't leak module maps
perf map/maps/thread: Changes to reference counting
perf annotate: Fix parse_objdump_line memory leak
perf top: Add exit routine for main thread
perf header: Avoid out-of-bounds read
perf callchain: Use pthread keys for tls callchain_cursor
perf srcline: Change free_srcline to zfree_srcline
perf hist: Fix srcline memory leak

tools/perf/arch/arm/tests/dwarf-unwind.c | 2 +-
tools/perf/arch/arm64/tests/dwarf-unwind.c | 2 +-
tools/perf/arch/powerpc/tests/dwarf-unwind.c | 2 +-
tools/perf/arch/x86/tests/dwarf-unwind.c | 2 +-
tools/perf/builtin-annotate.c | 28 +-
tools/perf/builtin-c2c.c | 22 +-
tools/perf/builtin-diff.c | 20 +-
tools/perf/builtin-inject.c | 4 +-
tools/perf/builtin-kmem.c | 13 +-
tools/perf/builtin-kwork.c | 15 +-
tools/perf/builtin-mem.c | 4 +-
tools/perf/builtin-report.c | 21 +-
tools/perf/builtin-sched.c | 80 ++---
tools/perf/builtin-script.c | 123 ++++----
tools/perf/builtin-stat.c | 1 +
tools/perf/builtin-timechart.c | 11 +-
tools/perf/builtin-top.c | 19 +-
tools/perf/builtin-trace.c | 38 ++-
.../scripts/python/Perf-Trace-Util/Context.c | 4 +-
tools/perf/tests/code-reading.c | 6 +-
tools/perf/tests/dwarf-unwind.c | 1 -
tools/perf/tests/hists_common.c | 2 +-
tools/perf/tests/hists_cumulate.c | 18 +-
tools/perf/tests/hists_filter.c | 11 +-
tools/perf/tests/hists_link.c | 20 +-
tools/perf/tests/hists_output.c | 12 +-
tools/perf/tests/maps.c | 2 +-
tools/perf/tests/mmap-thread-lookup.c | 5 +-
tools/perf/tests/perf-targz-src-pkg | 5 +-
tools/perf/tests/symbols.c | 1 -
tools/perf/tests/thread-maps-share.c | 13 +-
tools/perf/trace/beauty/pid.c | 4 +-
tools/perf/ui/browsers/hists.c | 19 +-
tools/perf/ui/hist.c | 5 +-
tools/perf/ui/stdio/hist.c | 2 +-
tools/perf/util/Build | 1 +
tools/perf/util/addr_location.c | 44 +++
tools/perf/util/addr_location.h | 31 ++
tools/perf/util/annotate.c | 5 +-
tools/perf/util/arm-spe.c | 4 +-
tools/perf/util/block-info.c | 4 +-
tools/perf/util/build-id.c | 2 +
tools/perf/util/callchain.c | 68 +++-
tools/perf/util/callchain.h | 4 +-
tools/perf/util/cs-etm.c | 28 +-
tools/perf/util/data-convert-json.c | 16 +-
tools/perf/util/db-export.c | 30 +-
tools/perf/util/dlfilter.c | 17 +-
tools/perf/util/event.c | 37 +--
tools/perf/util/evlist.c | 2 +
tools/perf/util/evsel_fprintf.c | 8 +-
tools/perf/util/header.c | 14 +-
tools/perf/util/hist.c | 59 ++--
tools/perf/util/intel-bts.c | 2 +-
tools/perf/util/intel-pt.c | 88 +++---
tools/perf/util/jitdump.c | 12 +-
tools/perf/util/machine.c | 292 +++++++++---------
tools/perf/util/map.c | 4 +-
tools/perf/util/maps.c | 5 +-
tools/perf/util/maps.h | 9 +-
tools/perf/util/python.c | 4 +
.../scripting-engines/trace-event-python.c | 40 ++-
tools/perf/util/session.c | 8 +-
tools/perf/util/sort.c | 12 +-
tools/perf/util/srcline.c | 15 +-
tools/perf/util/srcline.h | 2 +-
tools/perf/util/symbol-elf.c | 4 +-
tools/perf/util/symbol.h | 17 +-
tools/perf/util/thread-stack.c | 25 +-
tools/perf/util/thread.c | 222 +++++++------
tools/perf/util/thread.h | 210 ++++++++++++-
tools/perf/util/unwind-libdw.c | 27 +-
tools/perf/util/unwind-libunwind-local.c | 19 +-
tools/perf/util/unwind-libunwind.c | 2 +-
tools/perf/util/vdso.c | 2 +-
75 files changed, 1210 insertions(+), 722 deletions(-)
create mode 100644 tools/perf/util/addr_location.c
create mode 100644 tools/perf/util/addr_location.h

--
2.41.0.162.gfafddb0af9-goog



2023-06-08 23:53:24

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 09/26] perf report: Avoid thread leak

Caught with address sanitizer and reference count checking.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-report.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 0b091a8983a5..a31a23af5547 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -839,6 +839,7 @@ static struct task *tasks_list(struct task *task, struct machine *machine)
return ERR_PTR(-ENOENT);

parent_task = thread__priv(parent_thread);
+ thread__put(parent_thread);
list_add_tail(&task->list, &parent_task->children);
return tasks_list(parent_task, machine);
}
--
2.41.0.162.gfafddb0af9-goog


2023-06-08 23:53:55

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 14/26] perf python: Avoid 2 leak sanitizer issues

Leak sanitizer complains about the variable size bf allocation and
store to bf if sized 0.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/scripting-engines/trace-event-python.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index d7c99028c6e6..d96e5c0fef45 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -735,6 +735,9 @@ static void regs_map(struct regs_dump *regs, uint64_t mask, const char *arch, ch
unsigned int i = 0, r;
int printed = 0;

+ if (size <= 0)
+ return;
+
bf[0] = 0;

if (!regs || !regs->regs)
@@ -764,7 +767,7 @@ static void set_regs_in_dict(PyObject *dict,
* 10 chars is for register name.
*/
int size = __sw_hweight64(attr->sample_regs_intr) * 28;
- char bf[size];
+ char *bf = malloc(size);

regs_map(&sample->intr_regs, attr->sample_regs_intr, arch, bf, sizeof(bf));

@@ -775,6 +778,7 @@ static void set_regs_in_dict(PyObject *dict,

pydict_set_item_string_decref(dict, "uregs",
_PyUnicode_FromString(bf));
+ free(bf);
}

static void set_sym_in_dict(PyObject *dict, struct addr_location *al,
--
2.41.0.162.gfafddb0af9-goog


2023-06-08 23:55:13

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 08/26] perf machine: Make delete_threads part of machine__exit

The code required threads to be deleted before machine__exit was
called or the threads would be leaked. This was error prone so move
the delete_threads into machine__exit.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/tests/code-reading.c | 1 -
tools/perf/tests/dwarf-unwind.c | 1 -
tools/perf/tests/mmap-thread-lookup.c | 1 -
tools/perf/tests/symbols.c | 1 -
tools/perf/util/machine.c | 1 +
tools/perf/util/session.c | 6 ------
6 files changed, 1 insertion(+), 10 deletions(-)

diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index 2a7b2b6f5286..ed3815163d1b 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -721,7 +721,6 @@ static int do_test_code_reading(bool try_kcore)
evlist__delete(evlist);
perf_cpu_map__put(cpus);
perf_thread_map__put(threads);
- machine__delete_threads(machine);
machine__delete(machine);

return err;
diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index ee983b677a6a..d01aa931fe81 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -235,7 +235,6 @@ noinline int test__dwarf_unwind(struct test_suite *test __maybe_unused,
thread__put(thread);

out:
- machine__delete_threads(machine);
machine__delete(machine);
return err;
}
diff --git a/tools/perf/tests/mmap-thread-lookup.c b/tools/perf/tests/mmap-thread-lookup.c
index 3891a2a3b46f..ddd1da9a4ba9 100644
--- a/tools/perf/tests/mmap-thread-lookup.c
+++ b/tools/perf/tests/mmap-thread-lookup.c
@@ -208,7 +208,6 @@ static int mmap_events(synth_cb synth)
addr_location__exit(&al);
}

- machine__delete_threads(machine);
machine__delete(machine);
return err;
}
diff --git a/tools/perf/tests/symbols.c b/tools/perf/tests/symbols.c
index 2d1aa42d36a9..16e1c5502b09 100644
--- a/tools/perf/tests/symbols.c
+++ b/tools/perf/tests/symbols.c
@@ -38,7 +38,6 @@ static int init_test_info(struct test_info *ti)
static void exit_test_info(struct test_info *ti)
{
thread__put(ti->thread);
- machine__delete_threads(ti->machine);
machine__delete(ti->machine);
}

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 261188766307..46af5e9748c9 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -256,6 +256,7 @@ void machine__exit(struct machine *machine)
zfree(&machine->current_tid);
zfree(&machine->kallsyms_filename);

+ machine__delete_threads(machine);
for (i = 0; i < THREADS__TABLE_SIZE; i++) {
struct threads *threads = &machine->threads[i];

diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 65ac9f7fdf7e..00d18c74c090 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -278,11 +278,6 @@ struct perf_session *__perf_session__new(struct perf_data *data,
return ERR_PTR(ret);
}

-static void perf_session__delete_threads(struct perf_session *session)
-{
- machine__delete_threads(&session->machines.host);
-}
-
static void perf_decomp__release_events(struct decomp *next)
{
struct decomp *decomp;
@@ -305,7 +300,6 @@ void perf_session__delete(struct perf_session *session)
auxtrace__free(session);
auxtrace_index__free(&session->auxtrace_index);
perf_session__destroy_kernel_maps(session);
- perf_session__delete_threads(session);
perf_decomp__release_events(session->decomp_data.decomp);
perf_env__exit(&session->header.env);
machines__exit(&session->machines);
--
2.41.0.162.gfafddb0af9-goog


2023-06-08 23:55:38

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 20/26] perf map/maps/thread: Changes to reference counting

Fix missed reference count gets and puts as detected with leak
sanitizer and reference count checking.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/callchain.c | 28 ++++++++++++++++++++++------
tools/perf/util/event.c | 3 +++
tools/perf/util/hist.c | 6 ++++--
tools/perf/util/machine.c | 29 +++++++++++++++++------------
4 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index b0dafc758173..909f62b3b266 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -590,6 +590,7 @@ fill_node(struct callchain_node *node, struct callchain_cursor *cursor)
call->ip = cursor_node->ip;
call->ms = cursor_node->ms;
call->ms.map = map__get(call->ms.map);
+ call->ms.maps = maps__get(call->ms.maps);
call->srcline = cursor_node->srcline;

if (cursor_node->branch) {
@@ -649,6 +650,7 @@ add_child(struct callchain_node *parent,
list_for_each_entry_safe(call, tmp, &new->val, list) {
list_del_init(&call->list);
map__zput(call->ms.map);
+ maps__zput(call->ms.maps);
free(call);
}
free(new);
@@ -1010,10 +1012,16 @@ merge_chain_branch(struct callchain_cursor *cursor,
int err = 0;

list_for_each_entry_safe(list, next_list, &src->val, list) {
- callchain_cursor_append(cursor, list->ip, &list->ms,
- false, NULL, 0, 0, 0, list->srcline);
+ struct map_symbol ms = {
+ .maps = maps__get(list->ms.maps),
+ .map = map__get(list->ms.map),
+ };
+ callchain_cursor_append(cursor, list->ip, &ms, false, NULL, 0, 0, 0, list->srcline);
list_del_init(&list->list);
+ map__zput(ms.map);
+ maps__zput(ms.maps);
map__zput(list->ms.map);
+ maps__zput(list->ms.maps);
free(list);
}

@@ -1065,9 +1073,11 @@ int callchain_cursor_append(struct callchain_cursor *cursor,
}

node->ip = ip;
+ maps__zput(node->ms.maps);
map__zput(node->ms.map);
node->ms = *ms;
- node->ms.map = map__get(node->ms.map);
+ node->ms.maps = maps__get(ms->maps);
+ node->ms.map = map__get(ms->map);
node->branch = branch;
node->nr_loop_iter = nr_loop_iter;
node->iter_cycles = iter_cycles;
@@ -1114,7 +1124,8 @@ int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *
{
struct machine *machine = maps__machine(node->ms.maps);

- al->maps = node->ms.maps;
+ maps__put(al->maps);
+ al->maps = maps__get(node->ms.maps);
map__put(al->map);
al->map = map__get(node->ms.map);
al->sym = node->ms.sym;
@@ -1127,7 +1138,7 @@ int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *
if (al->map == NULL)
goto out;
}
- if (al->maps == machine__kernel_maps(machine)) {
+ if (RC_CHK_ACCESS(al->maps) == RC_CHK_ACCESS(machine__kernel_maps(machine))) {
if (machine__is_host(machine)) {
al->cpumode = PERF_RECORD_MISC_KERNEL;
al->level = 'k';
@@ -1460,12 +1471,14 @@ static void free_callchain_node(struct callchain_node *node)
list_for_each_entry_safe(list, tmp, &node->parent_val, list) {
list_del_init(&list->list);
map__zput(list->ms.map);
+ maps__zput(list->ms.maps);
free(list);
}

list_for_each_entry_safe(list, tmp, &node->val, list) {
list_del_init(&list->list);
map__zput(list->ms.map);
+ maps__zput(list->ms.maps);
free(list);
}

@@ -1551,6 +1564,7 @@ int callchain_node__make_parent_list(struct callchain_node *node)
list_for_each_entry_safe(chain, new, &head, list) {
list_del_init(&chain->list);
map__zput(chain->ms.map);
+ maps__zput(chain->ms.maps);
free(chain);
}
return -ENOMEM;
@@ -1596,8 +1610,10 @@ void callchain_cursor_reset(struct callchain_cursor *cursor)
cursor->nr = 0;
cursor->last = &cursor->first;

- for (node = cursor->first; node != NULL; node = node->next)
+ for (node = cursor->first; node != NULL; node = node->next) {
map__zput(node->ms.map);
+ maps__zput(node->ms.maps);
+ }
}

void callchain_param_setup(u64 sample_type, const char *arch)
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 2fcfba38fc48..3860b0c74829 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -711,6 +711,9 @@ int machine__resolve(struct machine *machine, struct addr_location *al,
if (thread__is_filtered(thread))
al->filtered |= (1 << HIST_FILTER__THREAD);

+ thread__put(thread);
+ thread = NULL;
+
al->sym = NULL;
al->cpu = sample->cpu;
al->socket = -1;
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index dfda52d348a3..fb218b3e8a7c 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -450,6 +450,7 @@ static int hist_entry__init(struct hist_entry *he,
memset(&he->stat, 0, sizeof(he->stat));
}

+ he->ms.maps = maps__get(he->ms.maps);
he->ms.map = map__get(he->ms.map);

if (he->branch_info) {
@@ -497,7 +498,7 @@ static int hist_entry__init(struct hist_entry *he,
}

INIT_LIST_HEAD(&he->pairs.node);
- thread__get(he->thread);
+ he->thread = thread__get(he->thread);
he->hroot_in = RB_ROOT_CACHED;
he->hroot_out = RB_ROOT_CACHED;

@@ -523,6 +524,7 @@ static int hist_entry__init(struct hist_entry *he,
map__put(he->mem_info->daddr.ms.map);
}
err:
+ maps__zput(he->ms.maps);
map__zput(he->ms.map);
zfree(&he->stat_acc);
return -ENOMEM;
@@ -611,7 +613,6 @@ static struct hist_entry *hists__findnew_entry(struct hists *hists,
* keys were used.
*/
cmp = hist_entry__cmp(he, entry);
-
if (!cmp) {
if (sample_self) {
he_stat__add_period(&he->stat, period);
@@ -1309,6 +1310,7 @@ void hist_entry__delete(struct hist_entry *he)
struct hist_entry_ops *ops = he->ops;

thread__zput(he->thread);
+ maps__zput(he->ms.maps);
map__zput(he->ms.map);

if (he->branch_info) {
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 359ef6b4e840..bdad4b8bf77d 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -539,7 +539,7 @@ static void machine__update_thread_pid(struct machine *machine,
goto out_err;

if (thread__maps(th) == thread__maps(leader))
- return;
+ goto out_put;

if (thread__maps(th)) {
/*
@@ -579,7 +579,7 @@ __threads__get_last_match(struct threads *threads, struct machine *machine,
machine__update_thread_pid(machine, th, pid);
return thread__get(th);
}
-
+ thread__put(threads->last_match);
threads->last_match = NULL;
}

@@ -601,7 +601,8 @@ threads__get_last_match(struct threads *threads, struct machine *machine,
static void
__threads__set_last_match(struct threads *threads, struct thread *th)
{
- threads->last_match = th;
+ thread__put(threads->last_match);
+ threads->last_match = thread__get(th);
}

static void
@@ -664,7 +665,6 @@ static struct thread *____machine__findnew_thread(struct machine *machine,

rb_link_node(&nd->rb_node, parent, p);
rb_insert_color_cached(&nd->rb_node, &threads->entries, leftmost);
-
/*
* We have to initialize maps separately after rb tree is updated.
*
@@ -673,6 +673,7 @@ static struct thread *____machine__findnew_thread(struct machine *machine,
* the rb tree.
*/
if (thread__init_maps(th, machine)) {
+ pr_err("Thread init failed thread %d\n", pid);
rb_erase_cached(&nd->rb_node, &threads->entries);
RB_CLEAR_NODE(&nd->rb_node);
free(nd);
@@ -682,11 +683,10 @@ static struct thread *____machine__findnew_thread(struct machine *machine,
/*
* It is now in the rbtree, get a ref
*/
- thread__get(th);
threads__set_last_match(threads, th);
++threads->nr;

- return th;
+ return thread__get(th);
}

struct thread *__machine__findnew_thread(struct machine *machine, pid_t pid, pid_t tid)
@@ -2321,7 +2321,7 @@ static int add_callchain_ip(struct thread *thread,
struct iterations *iter,
u64 branch_from)
{
- struct map_symbol ms;
+ struct map_symbol ms = {};
struct addr_location al;
int nr_loop_iter = 0, err = 0;
u64 iter_cycles = 0;
@@ -2395,6 +2395,8 @@ static int add_callchain_ip(struct thread *thread,
iter_cycles, branch_from, srcline);
out:
addr_location__exit(&al);
+ maps__put(ms.maps);
+ map__put(ms.map);
return err;
}

@@ -3089,6 +3091,7 @@ static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms
struct dso *dso;
u64 addr;
int ret = 1;
+ struct map_symbol ilist_ms;

if (!symbol_conf.inline_name || !map || !sym)
return ret;
@@ -3105,18 +3108,20 @@ static int append_inlines(struct callchain_cursor *cursor, struct map_symbol *ms
inlines__tree_insert(&dso->inlined_nodes, inline_node);
}

+ ilist_ms = (struct map_symbol) {
+ .maps = maps__get(ms->maps),
+ .map = map__get(map),
+ };
list_for_each_entry(ilist, &inline_node->val, list) {
- struct map_symbol ilist_ms = {
- .maps = ms->maps,
- .map = map,
- .sym = ilist->symbol,
- };
+ ilist_ms.sym = ilist->symbol;
ret = callchain_cursor_append(cursor, ip, &ilist_ms, false,
NULL, 0, 0, 0, ilist->srcline);

if (ret != 0)
return ret;
}
+ map__put(ilist_ms.map);
+ maps__put(ilist_ms.maps);

return ret;
}
--
2.41.0.162.gfafddb0af9-goog


2023-06-08 23:57:09

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 23/26] perf header: Avoid out-of-bounds read

intel-pt tests were failing:
```
...
--- Test virtual LBR ---
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.126 MB /tmp/perf-test-intel-pt-sh.FW57CXnCqQ/test-perf.data ]
Failed with virtual lbr
...
```

The root cause is an out-of-bounds read in header (where maxbrstack.py
is from test_intel_pt.sh):
```
$ perf --no-pager script --itrace=L -s maxbrstack.py
=================================================================
==3907930==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000095a8 at pc 0x563c26c840bb bp 0x7fff43582710 sp 0x7fff43582708
READ of size 4 at 0x6020000095a8 thread T0
#0 0x563c26c840ba in process_group_desc util/header.c:2847
#1 0x563c26c8bc78 in perf_file_section__process util/header.c:4037
#2 0x563c26c8aa9b in perf_header__process_sections util/header.c:3813
#3 0x563c26c8d028 in perf_session__read_header util/header.c:4286
#4 0x563c26cbab29 in perf_session__open util/session.c:113
#5 0x563c26cbb3d0 in __perf_session__new util/session.c:221
#6 0x563c26aacb14 in perf_session__new util/session.h:73
#7 0x563c26acf7f1 in cmd_script tools/perf/builtin-script.c:4212
#8 0x563c26bb58ff in run_builtin tools/perf/perf.c:323
#9 0x563c26bb5e70 in handle_internal_command tools/perf/perf.c:377
#10 0x563c26bb6238 in run_argv tools/perf/perf.c:421
#11 0x563c26bb67a0 in main tools/perf/perf.c:537
#12 0x7f34bde46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#13 0x7f34bde46244 in __libc_start_main_impl ../csu/libc-start.c:381
#14 0x563c26a33390 in _start (/tmp/perf/perf+0x1eb390)

0x6020000095a8 is located 8 bytes to the right of 16-byte region [0x602000009590,0x6020000095a0)
allocated by thread T0 here:
#0 0x7f34beeb83b7 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:77
#1 0x563c26c83df8 in process_group_desc util/header.c:2824
#2 0x563c26c8bc78 in perf_file_section__process util/header.c:4037
#3 0x563c26c8aa9b in perf_header__process_sections util/header.c:3813
#4 0x563c26c8d028 in perf_session__read_header util/header.c:4286
#5 0x563c26cbab29 in perf_session__open util/session.c:113
#6 0x563c26cbb3d0 in __perf_session__new util/session.c:221
#7 0x563c26aacb14 in perf_session__new util/session.h:73
#8 0x563c26acf7f1 in cmd_script tools/perf/builtin-script.c:4212
#9 0x563c26bb58ff in run_builtin tools/perf/perf.c:323
#10 0x563c26bb5e70 in handle_internal_command tools/perf/perf.c:377
#11 0x563c26bb6238 in run_argv tools/perf/perf.c:421
#12 0x563c26bb67a0 in main tools/perf/perf.c:537
#13 0x7f34bde46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
```

Avoid the out-of-bounds read checking for the leader. Leave the 'nr'
check intact as nr will be 0 or the counting down and evsel be a group
member.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/header.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 3db7c1fae71e..52fbf526fe74 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -2844,7 +2844,7 @@ static int process_group_desc(struct feat_fd *ff, void *data __maybe_unused)

i = nr = 0;
evlist__for_each_entry(session->evlist, evsel) {
- if (evsel->core.idx == (int) desc[i].leader_idx) {
+ if (i < nr_groups && evsel->core.idx == (int) desc[i].leader_idx) {
evsel__set_leader(evsel, evsel);
/* {anon_group} is a dummy name */
if (strcmp(desc[i].name, "{anon_group}")) {
--
2.41.0.162.gfafddb0af9-goog


2023-06-08 23:58:11

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 01/26] perf thread: Remove notion of dead threads

The dead thread list is best effort. Threads live on it until the
reference count hits zero and they are removed. With correct reference
counting this should never happen. It is, however, part of the 'perf
sched' output that is now removed. If this is an issue we should
implement tracking of dead threads in a robust not best-effort way.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-sched.c | 23 +----------------------
tools/perf/util/cs-etm.c | 6 ------
tools/perf/util/intel-pt.c | 8 --------
tools/perf/util/machine.c | 32 +-------------------------------
tools/perf/util/thread.c | 25 +------------------------
tools/perf/util/thread.h | 11 +----------
6 files changed, 4 insertions(+), 101 deletions(-)

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index cc4ba506e119..3a30c2ac5b47 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2760,7 +2760,7 @@ struct total_run_stats {
u64 total_run_time;
};

-static int __show_thread_runtime(struct thread *t, void *priv)
+static int show_thread_runtime(struct thread *t, void *priv)
{
struct total_run_stats *stats = priv;
struct thread_runtime *r;
@@ -2783,22 +2783,6 @@ static int __show_thread_runtime(struct thread *t, void *priv)
return 0;
}

-static int show_thread_runtime(struct thread *t, void *priv)
-{
- if (t->dead)
- return 0;
-
- return __show_thread_runtime(t, priv);
-}
-
-static int show_deadthread_runtime(struct thread *t, void *priv)
-{
- if (!t->dead)
- return 0;
-
- return __show_thread_runtime(t, priv);
-}
-
static size_t callchain__fprintf_folded(FILE *fp, struct callchain_node *node)
{
const char *sep = " <- ";
@@ -2890,11 +2874,6 @@ static void timehist_print_summary(struct perf_sched *sched,
if (!task_count)
printf("<no still running tasks>\n");

- printf("\nTerminated tasks:\n");
- machine__for_each_thread(m, show_deadthread_runtime, &totals);
- if (task_count == totals.task_count)
- printf("<no terminated tasks>\n");
-
/* CPU idle stats not tracked when samples were skipped */
if (sched->skipped_samples && !sched->idle_hist)
return;
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index 91299cc56bf7..0f5be4ad24ba 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -3292,12 +3292,6 @@ int cs_etm__process_auxtrace_info_full(union perf_event *event,
goto err_free_queues;
}

- /*
- * Initialize list node so that at thread__zput() we can avoid
- * segmentation fault at list_del_init().
- */
- INIT_LIST_HEAD(&etm->unknown_thread->node);
-
err = thread__set_comm(etm->unknown_thread, "unknown", 0);
if (err)
goto err_delete_thread;
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index fe893c9bab3f..dde2ca77a005 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -4311,14 +4311,6 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
goto err_free_queues;
}

- /*
- * Since this thread will not be kept in any rbtree not in a
- * list, initialize its list node so that at thread__put() the
- * current thread lifetime assumption is kept and we don't segfault
- * at list_del_init().
- */
- INIT_LIST_HEAD(&pt->unknown_thread->node);
-
err = thread__set_comm(pt->unknown_thread, "unknown", 0);
if (err)
goto err_delete_thread;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 9e02e19c1b7a..a1954ac85f59 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -241,17 +241,6 @@ void machine__exit(struct machine *machine)

for (i = 0; i < THREADS__TABLE_SIZE; i++) {
struct threads *threads = &machine->threads[i];
- struct thread *thread, *n;
- /*
- * Forget about the dead, at this point whatever threads were
- * left in the dead lists better have a reference count taken
- * by who is using them, and then, when they drop those references
- * and it finally hits zero, thread__put() will check and see that
- * its not in the dead threads list and will not try to remove it
- * from there, just calling thread__delete() straight away.
- */
- list_for_each_entry_safe(thread, n, &threads->dead, node)
- list_del_init(&thread->node);

exit_rwsem(&threads->lock);
}
@@ -2046,18 +2035,7 @@ static void __machine__remove_thread(struct machine *machine, struct thread *th,
rb_erase_cached(&th->rb_node, &threads->entries);
RB_CLEAR_NODE(&th->rb_node);
--threads->nr;
- /*
- * Move it first to the dead_threads list, then drop the reference,
- * if this is the last reference, then the thread__delete destructor
- * will be called and we will remove it from the dead_threads list.
- */
- list_add_tail(&th->node, &threads->dead);

- /*
- * We need to do the put here because if this is the last refcount,
- * then we will be touching the threads->dead head when removing the
- * thread.
- */
thread__put(th);

if (lock)
@@ -2145,10 +2123,8 @@ int machine__process_exit_event(struct machine *machine, union perf_event *event
if (dump_trace)
perf_event__fprintf_task(event, stdout);

- if (thread != NULL) {
- thread__exited(thread);
+ if (thread != NULL)
thread__put(thread);
- }

return 0;
}
@@ -3204,12 +3180,6 @@ int machine__for_each_thread(struct machine *machine,
if (rc != 0)
return rc;
}
-
- list_for_each_entry(thread, &threads->dead, node) {
- rc = fn(thread, priv);
- if (rc != 0)
- return rc;
- }
}
return rc;
}
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 4b5bdc277baa..d949bffc0ed6 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -125,31 +125,8 @@ struct thread *thread__get(struct thread *thread)

void thread__put(struct thread *thread)
{
- if (thread && refcount_dec_and_test(&thread->refcnt)) {
- /*
- * Remove it from the dead threads list, as last reference is
- * gone, if it is in a dead threads list.
- *
- * We may not be there anymore if say, the machine where it was
- * stored was already deleted, so we already removed it from
- * the dead threads and some other piece of code still keeps a
- * reference.
- *
- * This is what 'perf sched' does and finally drops it in
- * perf_sched__lat(), where it calls perf_sched__read_events(),
- * that processes the events by creating a session and deleting
- * it, which ends up destroying the list heads for the dead
- * threads, but before it does that it removes all threads from
- * it using list_del_init().
- *
- * So we need to check here if it is in a dead threads list and
- * if so, remove it before finally deleting the thread, to avoid
- * an use after free situation.
- */
- if (!list_empty(&thread->node))
- list_del_init(&thread->node);
+ if (thread && refcount_dec_and_test(&thread->refcnt))
thread__delete(thread);
- }
}

static struct namespaces *__thread__namespaces(const struct thread *thread)
diff --git a/tools/perf/util/thread.h b/tools/perf/util/thread.h
index 395c626699a9..86737812e06b 100644
--- a/tools/perf/util/thread.h
+++ b/tools/perf/util/thread.h
@@ -30,10 +30,7 @@ struct lbr_stitch {
};

struct thread {
- union {
- struct rb_node rb_node;
- struct list_head node;
- };
+ struct rb_node rb_node;
struct maps *maps;
pid_t pid_; /* Not all tools update this */
pid_t tid;
@@ -43,7 +40,6 @@ struct thread {
refcount_t refcnt;
bool comm_set;
int comm_len;
- bool dead; /* if set thread has exited */
struct list_head namespaces_list;
struct rw_semaphore namespaces_lock;
struct list_head comm_list;
@@ -81,11 +77,6 @@ static inline void __thread__zput(struct thread **thread)

#define thread__zput(thread) __thread__zput(&thread)

-static inline void thread__exited(struct thread *thread)
-{
- thread->dead = true;
-}
-
struct namespaces *thread__namespaces(struct thread *thread);
int thread__set_namespaces(struct thread *thread, u64 timestamp,
struct perf_record_namespaces *event);
--
2.41.0.162.gfafddb0af9-goog


2023-06-08 23:58:42

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 10/26] perf header: Ensure bitmaps are freed

memory_node bitmaps need a bitmap_free to avoid memory leaks. Caught
by leak sanitizer.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/header.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index d85b39079c31..3db7c1fae71e 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -1389,6 +1389,14 @@ static int memory_node__read(struct memory_node *n, unsigned long idx)
return 0;
}

+static void memory_node__delete_nodes(struct memory_node *nodesp, u64 cnt)
+{
+ for (u64 i = 0; i < cnt; i++)
+ bitmap_free(nodesp[i].set);
+
+ free(nodesp);
+}
+
static int memory_node__sort(const void *a, const void *b)
{
const struct memory_node *na = a;
@@ -1449,7 +1457,7 @@ static int build_mem_topology(struct memory_node **nodesp, u64 *cntp)
*nodesp = nodes;
qsort(nodes, cnt, sizeof(nodes[0]), memory_node__sort);
} else
- free(nodes);
+ memory_node__delete_nodes(nodes, cnt);

return ret;
}
@@ -1516,7 +1524,7 @@ static int write_mem_topology(struct feat_fd *ff __maybe_unused,
}

out:
- free(nodes);
+ memory_node__delete_nodes(nodes, nr);
return ret;
}

--
2.41.0.162.gfafddb0af9-goog


2023-06-08 23:59:22

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 06/26] perf addr_location: Add init/exit/copy functions

struct addr_location holds references to multiple reference counted
objects. Add init/exit functions to make maintenance of those more
consistent with the rest of the code and to try to avoid
leaks. Modification of thread reference counts isn't included in this
change.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-annotate.c | 28 ++++---
tools/perf/builtin-c2c.c | 12 ++-
tools/perf/builtin-diff.c | 16 ++--
tools/perf/builtin-inject.c | 2 +
tools/perf/builtin-kmem.c | 11 ++-
tools/perf/builtin-kwork.c | 15 +++-
tools/perf/builtin-mem.c | 4 +-
tools/perf/builtin-report.c | 6 +-
tools/perf/builtin-sched.c | 2 +
tools/perf/builtin-script.c | 77 +++++++++++--------
tools/perf/builtin-timechart.c | 11 ++-
tools/perf/builtin-top.c | 6 +-
tools/perf/builtin-trace.c | 10 ++-
tools/perf/tests/code-reading.c | 3 +-
tools/perf/tests/hists_cumulate.c | 17 ++--
tools/perf/tests/hists_filter.c | 11 ++-
tools/perf/tests/hists_link.c | 18 +++--
tools/perf/tests/hists_output.c | 10 ++-
tools/perf/tests/mmap-thread-lookup.c | 4 +-
tools/perf/util/addr_location.c | 30 +++++++-
tools/perf/util/addr_location.h | 5 +-
tools/perf/util/build-id.c | 2 +
tools/perf/util/cs-etm.c | 20 +++--
tools/perf/util/data-convert-json.c | 8 +-
tools/perf/util/db-export.c | 4 +-
tools/perf/util/dlfilter.c | 13 +++-
tools/perf/util/event.c | 16 ++--
tools/perf/util/evsel_fprintf.c | 8 +-
tools/perf/util/hist.c | 8 +-
tools/perf/util/intel-pt.c | 66 +++++++++++-----
tools/perf/util/machine.c | 35 +++++----
.../scripting-engines/trace-event-python.c | 10 ++-
tools/perf/util/thread.c | 13 +++-
tools/perf/util/unwind-libdw.c | 21 ++++-
tools/perf/util/unwind-libunwind-local.c | 13 +++-
35 files changed, 369 insertions(+), 166 deletions(-)

diff --git a/tools/perf/builtin-annotate.c b/tools/perf/builtin-annotate.c
index 425a7e2fd6fb..aeeb801f1ed7 100644
--- a/tools/perf/builtin-annotate.c
+++ b/tools/perf/builtin-annotate.c
@@ -184,7 +184,7 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,

static int process_branch_callback(struct evsel *evsel,
struct perf_sample *sample,
- struct addr_location *al __maybe_unused,
+ struct addr_location *al,
struct perf_annotate *ann,
struct machine *machine)
{
@@ -195,21 +195,29 @@ static int process_branch_callback(struct evsel *evsel,
.hide_unresolved = symbol_conf.hide_unresolved,
.ops = &hist_iter_branch,
};
-
struct addr_location a;
+ int ret;

- if (machine__resolve(machine, &a, sample) < 0)
- return -1;
+ addr_location__init(&a);
+ if (machine__resolve(machine, &a, sample) < 0) {
+ ret = -1;
+ goto out;
+ }

- if (a.sym == NULL)
- return 0;
+ if (a.sym == NULL) {
+ ret = 0;
+ goto out;
+ }

if (a.map != NULL)
map__dso(a.map)->hit = 1;

hist__account_cycles(sample->branch_stack, al, sample, false, NULL);

- return hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
+ ret = hist_entry_iter__add(&iter, &a, PERF_MAX_STACK_DEPTH, ann);
+out:
+ addr_location__exit(&a);
+ return ret;
}

static bool has_annotation(struct perf_annotate *ann)
@@ -272,10 +280,12 @@ static int process_sample_event(struct perf_tool *tool,
struct addr_location al;
int ret = 0;

+ addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0) {
pr_warning("problem processing %d event, skipping it.\n",
event->header.type);
- return -1;
+ ret = -1;
+ goto out_put;
}

if (ann->cpu_list && !test_bit(sample->cpu, ann->cpu_bitmap))
@@ -288,7 +298,7 @@ static int process_sample_event(struct perf_tool *tool,
ret = -1;
}
out_put:
- addr_location__put(&al);
+ addr_location__exit(&al);
return ret;
}

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index ee41a96f0c73..530a44a59f41 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -286,10 +286,12 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct mem_info *mi, *mi_dup;
int ret;

+ addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0) {
pr_debug("problem processing %d event, skipping it.\n",
event->header.type);
- return -1;
+ ret = -1;
+ goto out;
}

if (c2c.stitch_lbr)
@@ -301,8 +303,10 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
goto out;

mi = sample__resolve_mem(sample, &al);
- if (mi == NULL)
- return -ENOMEM;
+ if (mi == NULL) {
+ ret = -ENOMEM;
+ goto out;
+ }

/*
* The mi object is released in hists__add_entry_ops,
@@ -368,7 +372,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
}

out:
- addr_location__put(&al);
+ addr_location__exit(&al);
return ret;

free_mi:
diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index dbb0562d6a4f..ca39657ee407 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -409,15 +409,17 @@ static int diff__process_sample_event(struct perf_tool *tool,
return 0;
}

+ addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0) {
pr_warning("problem processing %d event, skipping it.\n",
event->header.type);
- return -1;
+ ret = -1;
+ goto out;
}

if (cpu_list && !test_bit(sample->cpu, cpu_bitmap)) {
ret = 0;
- goto out_put;
+ goto out;
}

switch (compute) {
@@ -426,7 +428,7 @@ static int diff__process_sample_event(struct perf_tool *tool,
NULL, NULL, NULL, sample, true)) {
pr_warning("problem incrementing symbol period, "
"skipping event\n");
- goto out_put;
+ goto out;
}

hist__account_cycles(sample->branch_stack, &al, sample, false,
@@ -437,7 +439,7 @@ static int diff__process_sample_event(struct perf_tool *tool,
if (hist_entry_iter__add(&iter, &al, PERF_MAX_STACK_DEPTH,
NULL)) {
pr_debug("problem adding hist entry, skipping event\n");
- goto out_put;
+ goto out;
}
break;

@@ -446,7 +448,7 @@ static int diff__process_sample_event(struct perf_tool *tool,
true)) {
pr_warning("problem incrementing symbol period, "
"skipping event\n");
- goto out_put;
+ goto out;
}
}

@@ -460,8 +462,8 @@ static int diff__process_sample_event(struct perf_tool *tool,
if (!al.filtered)
hists->stats.total_non_filtered_period += sample->period;
ret = 0;
-out_put:
- addr_location__put(&al);
+out:
+ addr_location__exit(&al);
return ret;
}

diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index d9e96d4624c6..d19a1b862306 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -743,6 +743,7 @@ int perf_event__inject_buildid(struct perf_tool *tool, union perf_event *event,
struct addr_location al;
struct thread *thread;

+ addr_location__init(&al);
thread = machine__findnew_thread(machine, sample->pid, sample->tid);
if (thread == NULL) {
pr_err("problem processing %d event, skipping it.\n",
@@ -763,6 +764,7 @@ int perf_event__inject_buildid(struct perf_tool *tool, union perf_event *event,
thread__put(thread);
repipe:
perf_event__repipe(tool, event, sample, machine);
+ addr_location__exit(&al);
return 0;
}

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index fe9439a4fd66..a11f280d20bd 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -399,7 +399,9 @@ static u64 find_callsite(struct evsel *evsel, struct perf_sample *sample)
struct addr_location al;
struct machine *machine = &kmem_session->machines.host;
struct callchain_cursor_node *node;
+ u64 result;

+ addr_location__init(&al);
if (alloc_func_list == NULL) {
if (build_alloc_func_list() < 0)
goto out;
@@ -427,16 +429,19 @@ static u64 find_callsite(struct evsel *evsel, struct perf_sample *sample)
else
addr = node->ip;

- return addr;
+ result = addr;
+ goto out;
} else
pr_debug3("skipping alloc function: %s\n", caller->name);

callchain_cursor_advance(&callchain_cursor);
}

-out:
pr_debug2("unknown callsite: %"PRIx64 "\n", sample->ip);
- return sample->ip;
+ result = sample->ip;
+out:
+ addr_location__exit(&al);
+ return result;
}

struct sort_dimension {
diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index a9395c52b23b..2d80aef4eccc 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -739,17 +739,22 @@ static int timehist_exit_event(struct perf_kwork *kwork,
struct kwork_atom *atom = NULL;
struct kwork_work *work = NULL;
struct addr_location al;
+ int ret = 0;

+ addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0) {
pr_debug("Problem processing event, skipping it\n");
- return -1;
+ ret = -1;
+ goto out;
}

atom = work_pop_atom(kwork, class, KWORK_TRACE_EXIT,
KWORK_TRACE_ENTRY, evsel, sample,
machine, &work);
- if (work == NULL)
- return -1;
+ if (work == NULL) {
+ ret = -1;
+ goto out;
+ }

if (atom != NULL) {
work->nr_atoms++;
@@ -757,7 +762,9 @@ static int timehist_exit_event(struct perf_kwork *kwork,
atom_del(atom);
}

- return 0;
+out:
+ addr_location__exit(&al);
+ return ret;
}

static struct kwork_class kwork_irq;
diff --git a/tools/perf/builtin-mem.c b/tools/perf/builtin-mem.c
index 960bfd4b732a..51499c20da01 100644
--- a/tools/perf/builtin-mem.c
+++ b/tools/perf/builtin-mem.c
@@ -199,9 +199,11 @@ dump_raw_samples(struct perf_tool *tool,
char str[PAGE_SIZE_NAME_LEN];
struct dso *dso = NULL;

+ addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0) {
fprintf(stderr, "problem processing %d event, skipping it.\n",
event->header.type);
+ addr_location__exit(&al);
return -1;
}

@@ -256,7 +258,7 @@ dump_raw_samples(struct perf_tool *tool,
dso ? dso->long_name : "???",
al.sym ? al.sym->name : "???");
out_put:
- addr_location__put(&al);
+ addr_location__exit(&al);
return 0;
}

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 8ea6ab18534a..0b091a8983a5 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -285,10 +285,12 @@ static int process_sample_event(struct perf_tool *tool,
if (evswitch__discard(&rep->evswitch, evsel))
return 0;

+ addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0) {
pr_debug("problem processing %d event, skipping it.\n",
event->header.type);
- return -1;
+ ret = -1;
+ goto out_put;
}

if (rep->stitch_lbr)
@@ -331,7 +333,7 @@ static int process_sample_event(struct perf_tool *tool,
if (ret < 0)
pr_debug("problem adding hist entry, skipping event\n");
out_put:
- addr_location__put(&al);
+ addr_location__exit(&al);
return ret;
}

diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index fd37468c4f62..c75ad82a6729 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2584,6 +2584,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
int rc = 0;
int state = evsel__intval(evsel, sample, "prev_state");

+ addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0) {
pr_err("problem processing %d event. skipping it\n",
event->header.type);
@@ -2692,6 +2693,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,

evsel__save_time(evsel, sample->time, sample->cpu);

+ addr_location__exit(&al);
return rc;
}

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index e756290de2ac..784d478c2e05 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -919,7 +919,6 @@ static int perf_sample__fprintf_brstack(struct perf_sample *sample,
{
struct branch_stack *br = sample->branch_stack;
struct branch_entry *entries = perf_sample__branch_entries(sample);
- struct addr_location alf, alt;
u64 i, from, to;
int printed = 0;

@@ -930,20 +929,22 @@ static int perf_sample__fprintf_brstack(struct perf_sample *sample,
from = entries[i].from;
to = entries[i].to;

+ printed += fprintf(fp, " 0x%"PRIx64, from);
if (PRINT_FIELD(DSO)) {
- memset(&alf, 0, sizeof(alf));
- memset(&alt, 0, sizeof(alt));
+ struct addr_location alf, alt;
+
+ addr_location__init(&alf);
+ addr_location__init(&alt);
thread__find_map_fb(thread, sample->cpumode, from, &alf);
thread__find_map_fb(thread, sample->cpumode, to, &alt);
- }

- printed += fprintf(fp, " 0x%"PRIx64, from);
- if (PRINT_FIELD(DSO))
printed += map__fprintf_dsoname_dsoff(alf.map, PRINT_FIELD(DSOFF), alf.addr, fp);
-
- printed += fprintf(fp, "/0x%"PRIx64, to);
- if (PRINT_FIELD(DSO))
+ printed += fprintf(fp, "/0x%"PRIx64, to);
printed += map__fprintf_dsoname_dsoff(alt.map, PRINT_FIELD(DSOFF), alt.addr, fp);
+ addr_location__exit(&alt);
+ addr_location__exit(&alf);
+ } else
+ printed += fprintf(fp, "/0x%"PRIx64, to);

printed += print_bstack_flags(fp, entries + i);
}
@@ -957,7 +958,6 @@ static int perf_sample__fprintf_brstacksym(struct perf_sample *sample,
{
struct branch_stack *br = sample->branch_stack;
struct branch_entry *entries = perf_sample__branch_entries(sample);
- struct addr_location alf, alt;
u64 i, from, to;
int printed = 0;

@@ -965,9 +965,10 @@ static int perf_sample__fprintf_brstacksym(struct perf_sample *sample,
return 0;

for (i = 0; i < br->nr; i++) {
+ struct addr_location alf, alt;

- memset(&alf, 0, sizeof(alf));
- memset(&alt, 0, sizeof(alt));
+ addr_location__init(&alf);
+ addr_location__init(&alt);
from = entries[i].from;
to = entries[i].to;

@@ -982,6 +983,8 @@ static int perf_sample__fprintf_brstacksym(struct perf_sample *sample,
if (PRINT_FIELD(DSO))
printed += map__fprintf_dsoname_dsoff(alt.map, PRINT_FIELD(DSOFF), alt.addr, fp);
printed += print_bstack_flags(fp, entries + i);
+ addr_location__exit(&alt);
+ addr_location__exit(&alf);
}

return printed;
@@ -993,7 +996,6 @@ static int perf_sample__fprintf_brstackoff(struct perf_sample *sample,
{
struct branch_stack *br = sample->branch_stack;
struct branch_entry *entries = perf_sample__branch_entries(sample);
- struct addr_location alf, alt;
u64 i, from, to;
int printed = 0;

@@ -1001,9 +1003,10 @@ static int perf_sample__fprintf_brstackoff(struct perf_sample *sample,
return 0;

for (i = 0; i < br->nr; i++) {
+ struct addr_location alf, alt;

- memset(&alf, 0, sizeof(alf));
- memset(&alt, 0, sizeof(alt));
+ addr_location__init(&alf);
+ addr_location__init(&alt);
from = entries[i].from;
to = entries[i].to;

@@ -1022,6 +1025,8 @@ static int perf_sample__fprintf_brstackoff(struct perf_sample *sample,
if (PRINT_FIELD(DSO))
printed += map__fprintf_dsoname_dsoff(alt.map, PRINT_FIELD(DSOFF), alt.addr, fp);
printed += print_bstack_flags(fp, entries + i);
+ addr_location__exit(&alt);
+ addr_location__exit(&alf);
}

return printed;
@@ -1036,6 +1041,7 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
struct addr_location al;
bool kernel;
struct dso *dso;
+ int ret = 0;

if (!start || !end)
return 0;
@@ -1057,7 +1063,6 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
return -ENXIO;
}

- memset(&al, 0, sizeof(al));
if (end - start > MAXBB - MAXINSN) {
if (last)
pr_debug("\tbrstack does not reach to final jump (%" PRIx64 "-%" PRIx64 ")\n", start, end);
@@ -1066,13 +1071,14 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
return 0;
}

+ addr_location__init(&al);
if (!thread__find_map(thread, *cpumode, start, &al) || (dso = map__dso(al.map)) == NULL) {
pr_debug("\tcannot resolve %" PRIx64 "-%" PRIx64 "\n", start, end);
- return 0;
+ goto out;
}
if (dso->data.status == DSO_DATA_STATUS_ERROR) {
pr_debug("\tcannot resolve %" PRIx64 "-%" PRIx64 "\n", start, end);
- return 0;
+ goto out;
}

/* Load maps to ensure dso->is_64_bit has been updated */
@@ -1086,7 +1092,10 @@ static int grab_bb(u8 *buffer, u64 start, u64 end,
if (len <= 0)
pr_debug("\tcannot fetch code for block at %" PRIx64 "-%" PRIx64 "\n",
start, end);
- return len;
+ ret = len;
+out:
+ addr_location__exit(&al);
+ return ret;
}

static int map__fprintf_srccode(struct map *map, u64 addr, FILE *fp, struct srccode_state *state)
@@ -1137,14 +1146,16 @@ static int print_srccode(struct thread *thread, u8 cpumode, uint64_t addr)
struct addr_location al;
int ret = 0;

- memset(&al, 0, sizeof(al));
+ addr_location__init(&al);
thread__find_map(thread, cpumode, addr, &al);
if (!al.map)
- return 0;
+ goto out;
ret = map__fprintf_srccode(al.map, al.addr, stdout,
thread__srccode_state(thread));
if (ret)
ret += printf("\n");
+out:
+ addr_location__exit(&al);
return ret;
}

@@ -1179,14 +1190,13 @@ static int ip__fprintf_sym(uint64_t addr, struct thread *thread,
struct perf_event_attr *attr, FILE *fp)
{
struct addr_location al;
- int off, printed = 0;
-
- memset(&al, 0, sizeof(al));
+ int off, printed = 0, ret = 0;

+ addr_location__init(&al);
thread__find_map(thread, cpumode, addr, &al);

if ((*lastsym) && al.addr >= (*lastsym)->start && al.addr < (*lastsym)->end)
- return 0;
+ goto out;

al.cpu = cpu;
al.sym = NULL;
@@ -1194,7 +1204,7 @@ static int ip__fprintf_sym(uint64_t addr, struct thread *thread,
al.sym = map__find_symbol(al.map, al.addr);

if (!al.sym)
- return 0;
+ goto out;

if (al.addr < al.sym->end)
off = al.addr - al.sym->start;
@@ -1209,7 +1219,10 @@ static int ip__fprintf_sym(uint64_t addr, struct thread *thread,
printed += fprintf(fp, "\n");
*lastsym = al.sym;

- return printed;
+ ret = printed;
+out:
+ addr_location__exit(&al);
+ return ret;
}

static int perf_sample__fprintf_brstackinsn(struct perf_sample *sample,
@@ -1371,6 +1384,7 @@ static int perf_sample__fprintf_addr(struct perf_sample *sample,
struct addr_location al;
int printed = fprintf(fp, "%16" PRIx64, sample->addr);

+ addr_location__init(&al);
if (!sample_addr_correlates_sym(attr))
goto out;

@@ -1387,6 +1401,7 @@ static int perf_sample__fprintf_addr(struct perf_sample *sample,
if (PRINT_FIELD(DSO))
printed += map__fprintf_dsoname_dsoff(al.map, PRINT_FIELD(DSOFF), al.addr, fp);
out:
+ addr_location__exit(&al);
return printed;
}

@@ -2338,8 +2353,8 @@ static int process_sample_event(struct perf_tool *tool,
int ret = 0;

/* Set thread to NULL to indicate addr_al and al are not initialized */
- addr_al.thread = NULL;
- al.thread = NULL;
+ addr_location__init(&al);
+ addr_location__init(&addr_al);

ret = dlfilter__filter_event_early(dlfilter, event, sample, evsel, machine, &al, &addr_al);
if (ret) {
@@ -2405,8 +2420,8 @@ static int process_sample_event(struct perf_tool *tool,
}

out_put:
- if (al.thread)
- addr_location__put(&al);
+ addr_location__exit(&addr_al);
+ addr_location__exit(&al);
return ret;
}

diff --git a/tools/perf/builtin-timechart.c b/tools/perf/builtin-timechart.c
index 829d99fecfd0..19d4542ea18a 100644
--- a/tools/perf/builtin-timechart.c
+++ b/tools/perf/builtin-timechart.c
@@ -498,7 +498,6 @@ static const char *cat_backtrace(union perf_event *event,
char *p = NULL;
size_t p_len;
u8 cpumode = PERF_RECORD_MISC_USER;
- struct addr_location tal;
struct ip_callchain *chain = sample->callchain;
FILE *f = open_memstream(&p, &p_len);

@@ -507,6 +506,7 @@ static const char *cat_backtrace(union perf_event *event,
return NULL;
}

+ addr_location__init(&al);
if (!chain)
goto exit;

@@ -518,6 +518,7 @@ static const char *cat_backtrace(union perf_event *event,

for (i = 0; i < chain->nr; i++) {
u64 ip;
+ struct addr_location tal;

if (callchain_param.order == ORDER_CALLEE)
ip = chain->ips[i];
@@ -544,20 +545,22 @@ static const char *cat_backtrace(union perf_event *event,
* Discard all.
*/
zfree(&p);
- goto exit_put;
+ goto exit;
}
continue;
}

+ addr_location__init(&tal);
tal.filtered = 0;
if (thread__find_symbol(al.thread, cpumode, ip, &tal))
fprintf(f, "..... %016" PRIx64 " %s\n", ip, tal.sym->name);
else
fprintf(f, "..... %016" PRIx64 "\n", ip);
+
+ addr_location__exit(&tal);
}
-exit_put:
- addr_location__put(&al);
exit:
+ addr_location__exit(&al);
fclose(f);

return p;
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 9d3cbebb9b79..99010dfa5760 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -773,8 +773,9 @@ static void perf_event__process_sample(struct perf_tool *tool,
if (event->header.misc & PERF_RECORD_MISC_EXACT_IP)
top->exact_samples++;

+ addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0)
- return;
+ goto out;

if (top->stitch_lbr)
thread__set_lbr_stitch_enable(al.thread, true);
@@ -848,7 +849,8 @@ static void perf_event__process_sample(struct perf_tool *tool,
mutex_unlock(&hists->lock);
}

- addr_location__put(&al);
+out:
+ addr_location__exit(&al);
}

static void
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 4c9bec39423b..6a1e75f06832 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2418,13 +2418,15 @@ static int trace__resolve_callchain(struct trace *trace, struct evsel *evsel,
int max_stack = evsel->core.attr.sample_max_stack ?
evsel->core.attr.sample_max_stack :
trace->max_stack;
- int err;
+ int err = -1;

+ addr_location__init(&al);
if (machine__resolve(trace->host, &al, sample) < 0)
- return -1;
+ goto out;

err = thread__resolve_callchain(al.thread, cursor, evsel, sample, NULL, NULL, max_stack);
- addr_location__put(&al);
+out:
+ addr_location__exit(&al);
return err;
}

@@ -2893,6 +2895,7 @@ static int trace__pgfault(struct trace *trace,
int err = -1;
int callchain_ret = 0;

+ addr_location__init(&al);
thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);

if (sample->callchain) {
@@ -2953,6 +2956,7 @@ static int trace__pgfault(struct trace *trace,
err = 0;
out_put:
thread__put(thread);
+ addr_location__exit(&al);
return err;
}

diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index 9d8eefbebd48..2a7b2b6f5286 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -241,6 +241,7 @@ static int read_object_code(u64 addr, size_t len, u8 cpumode,

pr_debug("Reading object code for memory address: %#"PRIx64"\n", addr);

+ addr_location__init(&al);
if (!thread__find_map(thread, cpumode, addr, &al) || !map__dso(al.map)) {
if (cpumode == PERF_RECORD_MISC_HYPERVISOR) {
pr_debug("Hypervisor address can not be resolved - skipping\n");
@@ -366,7 +367,7 @@ static int read_object_code(u64 addr, size_t len, u8 cpumode,
}
pr_debug("Bytes read match those read by objdump\n");
out:
- map__put(al.map);
+ addr_location__exit(&al);
return err;
}

diff --git a/tools/perf/tests/hists_cumulate.c b/tools/perf/tests/hists_cumulate.c
index 62b9c6461ea6..71dacb0fec4d 100644
--- a/tools/perf/tests/hists_cumulate.c
+++ b/tools/perf/tests/hists_cumulate.c
@@ -8,8 +8,8 @@
#include "util/evsel.h"
#include "util/evlist.h"
#include "util/machine.h"
-#include "util/thread.h"
#include "util/parse-events.h"
+#include "util/thread.h"
#include "tests/tests.h"
#include "tests/hists_common.h"
#include <linux/kernel.h>
@@ -84,6 +84,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
struct perf_sample sample = { .period = 1000, };
size_t i;

+ addr_location__init(&al);
for (i = 0; i < ARRAY_SIZE(fake_samples); i++) {
struct hist_entry_iter iter = {
.evsel = evsel,
@@ -107,20 +108,22 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)

if (hist_entry_iter__add(&iter, &al, sysctl_perf_event_max_stack,
NULL) < 0) {
- addr_location__put(&al);
goto out;
}

- fake_samples[i].thread = al.thread;
+ thread__put(fake_samples[i].thread);
+ fake_samples[i].thread = thread__get(al.thread);
map__put(fake_samples[i].map);
- fake_samples[i].map = al.map;
+ fake_samples[i].map = map__get(al.map);
fake_samples[i].sym = al.sym;
}

+ addr_location__exit(&al);
return TEST_OK;

out:
pr_debug("Not enough memory for adding a hist entry\n");
+ addr_location__exit(&al);
return TEST_FAIL;
}

@@ -152,8 +155,10 @@ static void put_fake_samples(void)
{
size_t i;

- for (i = 0; i < ARRAY_SIZE(fake_samples); i++)
- map__put(fake_samples[i].map);
+ for (i = 0; i < ARRAY_SIZE(fake_samples); i++) {
+ map__zput(fake_samples[i].map);
+ thread__zput(fake_samples[i].thread);
+ }
}

typedef int (*test_fn_t)(struct evsel *, struct machine *);
diff --git a/tools/perf/tests/hists_filter.c b/tools/perf/tests/hists_filter.c
index 98eff5935a1c..4b2e4f2fbe48 100644
--- a/tools/perf/tests/hists_filter.c
+++ b/tools/perf/tests/hists_filter.c
@@ -8,6 +8,7 @@
#include "util/evlist.h"
#include "util/machine.h"
#include "util/parse-events.h"
+#include "util/thread.h"
#include "tests/tests.h"
#include "tests/hists_common.h"
#include <linux/kernel.h>
@@ -53,6 +54,7 @@ static int add_hist_entries(struct evlist *evlist,
struct perf_sample sample = { .period = 100, };
size_t i;

+ addr_location__init(&al);
/*
* each evsel will have 10 samples but the 4th sample
* (perf [perf] main) will be collapsed to an existing entry
@@ -84,21 +86,22 @@ static int add_hist_entries(struct evlist *evlist,
al.socket = fake_samples[i].socket;
if (hist_entry_iter__add(&iter, &al,
sysctl_perf_event_max_stack, NULL) < 0) {
- addr_location__put(&al);
goto out;
}

- fake_samples[i].thread = al.thread;
+ thread__put(fake_samples[i].thread);
+ fake_samples[i].thread = thread__get(al.thread);
map__put(fake_samples[i].map);
- fake_samples[i].map = al.map;
+ fake_samples[i].map = map__get(al.map);
fake_samples[i].sym = al.sym;
}
}
-
+ addr_location__exit(&al);
return 0;

out:
pr_debug("Not enough memory for adding a hist entry\n");
+ addr_location__exit(&al);
return TEST_FAIL;
}

diff --git a/tools/perf/tests/hists_link.c b/tools/perf/tests/hists_link.c
index 141e2972e34f..12bad8840699 100644
--- a/tools/perf/tests/hists_link.c
+++ b/tools/perf/tests/hists_link.c
@@ -8,6 +8,7 @@
#include "machine.h"
#include "map.h"
#include "parse-events.h"
+#include "thread.h"
#include "hists_common.h"
#include "util/mmap.h"
#include <errno.h>
@@ -70,6 +71,7 @@ static int add_hist_entries(struct evlist *evlist, struct machine *machine)
struct perf_sample sample = { .period = 1, .weight = 1, };
size_t i = 0, k;

+ addr_location__init(&al);
/*
* each evsel will have 10 samples - 5 common and 5 distinct.
* However the second evsel also has a collapsed entry for
@@ -90,13 +92,13 @@ static int add_hist_entries(struct evlist *evlist, struct machine *machine)
he = hists__add_entry(hists, &al, NULL,
NULL, NULL, NULL, &sample, true);
if (he == NULL) {
- addr_location__put(&al);
goto out;
}

- fake_common_samples[k].thread = al.thread;
+ thread__put(fake_common_samples[k].thread);
+ fake_common_samples[k].thread = thread__get(al.thread);
map__put(fake_common_samples[k].map);
- fake_common_samples[k].map = al.map;
+ fake_common_samples[k].map = map__get(al.map);
fake_common_samples[k].sym = al.sym;
}

@@ -110,20 +112,22 @@ static int add_hist_entries(struct evlist *evlist, struct machine *machine)
he = hists__add_entry(hists, &al, NULL,
NULL, NULL, NULL, &sample, true);
if (he == NULL) {
- addr_location__put(&al);
goto out;
}

- fake_samples[i][k].thread = al.thread;
- fake_samples[i][k].map = al.map;
+ thread__put(fake_samples[i][k].thread);
+ fake_samples[i][k].thread = thread__get(al.thread);
+ map__put(fake_samples[i][k].map);
+ fake_samples[i][k].map = map__get(al.map);
fake_samples[i][k].sym = al.sym;
}
i++;
}

+ addr_location__exit(&al);
return 0;
-
out:
+ addr_location__exit(&al);
pr_debug("Not enough memory for adding a hist entry\n");
return -1;
}
diff --git a/tools/perf/tests/hists_output.c b/tools/perf/tests/hists_output.c
index cd2094c13e1e..ba1cccf57049 100644
--- a/tools/perf/tests/hists_output.c
+++ b/tools/perf/tests/hists_output.c
@@ -54,6 +54,7 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)
struct perf_sample sample = { .period = 100, };
size_t i;

+ addr_location__init(&al);
for (i = 0; i < ARRAY_SIZE(fake_samples); i++) {
struct hist_entry_iter iter = {
.evsel = evsel,
@@ -73,20 +74,21 @@ static int add_hist_entries(struct hists *hists, struct machine *machine)

if (hist_entry_iter__add(&iter, &al, sysctl_perf_event_max_stack,
NULL) < 0) {
- addr_location__put(&al);
goto out;
}

fake_samples[i].thread = al.thread;
map__put(fake_samples[i].map);
- fake_samples[i].map = al.map;
+ fake_samples[i].map = map__get(al.map);
fake_samples[i].sym = al.sym;
}

+ addr_location__exit(&al);
return TEST_OK;

out:
pr_debug("Not enough memory for adding a hist entry\n");
+ addr_location__exit(&al);
return TEST_FAIL;
}

@@ -118,8 +120,10 @@ static void put_fake_samples(void)
{
size_t i;

- for (i = 0; i < ARRAY_SIZE(fake_samples); i++)
+ for (i = 0; i < ARRAY_SIZE(fake_samples); i++) {
map__put(fake_samples[i].map);
+ fake_samples[i].map = NULL;
+ }
}

typedef int (*test_fn_t)(struct evsel *, struct machine *);
diff --git a/tools/perf/tests/mmap-thread-lookup.c b/tools/perf/tests/mmap-thread-lookup.c
index 898eda55b7a8..3891a2a3b46f 100644
--- a/tools/perf/tests/mmap-thread-lookup.c
+++ b/tools/perf/tests/mmap-thread-lookup.c
@@ -187,6 +187,7 @@ static int mmap_events(synth_cb synth)
struct addr_location al;
struct thread *thread;

+ addr_location__init(&al);
thread = machine__findnew_thread(machine, getpid(), td->tid);

pr_debug("looking for map %p\n", td->map);
@@ -199,11 +200,12 @@ static int mmap_events(synth_cb synth)
if (!al.map) {
pr_debug("failed, couldn't find map\n");
err = -1;
+ addr_location__exit(&al);
break;
}

pr_debug("map %p, addr %" PRIx64 "\n", al.map, map__start(al.map));
- map__put(al.map);
+ addr_location__exit(&al);
}

machine__delete_threads(machine);
diff --git a/tools/perf/util/addr_location.c b/tools/perf/util/addr_location.c
index c73fc2aa236c..51825ef8c0ab 100644
--- a/tools/perf/util/addr_location.c
+++ b/tools/perf/util/addr_location.c
@@ -1,16 +1,44 @@
// SPDX-License-Identifier: GPL-2.0
#include "addr_location.h"
#include "map.h"
+#include "maps.h"
#include "thread.h"

+void addr_location__init(struct addr_location *al)
+{
+ al->thread = NULL;
+ al->maps = NULL;
+ al->map = NULL;
+ al->sym = NULL;
+ al->srcline = NULL;
+ al->addr = 0;
+ al->level = 0;
+ al->filtered = 0;
+ al->cpumode = 0;
+ al->cpu = 0;
+ al->socket = 0;
+}
+
/*
* The preprocess_sample method will return with reference counts for the
* in it, when done using (and perhaps getting ref counts if needing to
* keep a pointer to one of those entries) it must be paired with
* addr_location__put(), so that the refcounts can be decremented.
*/
-void addr_location__put(struct addr_location *al)
+void addr_location__exit(struct addr_location *al)
{
map__zput(al->map);
thread__zput(al->thread);
+ maps__zput(al->maps);
+}
+
+void addr_location__copy(struct addr_location *dst, struct addr_location *src)
+{
+ thread__put(dst->thread);
+ maps__put(dst->maps);
+ map__put(dst->map);
+ *dst = *src;
+ dst->thread = thread__get(src->thread);
+ dst->maps = maps__get(src->maps);
+ dst->map = map__get(src->map);
}
diff --git a/tools/perf/util/addr_location.h b/tools/perf/util/addr_location.h
index 7dfa7417c0fe..d8ac0428dff2 100644
--- a/tools/perf/util/addr_location.h
+++ b/tools/perf/util/addr_location.h
@@ -23,6 +23,9 @@ struct addr_location {
s32 socket;
};

-void addr_location__put(struct addr_location *al);
+void addr_location__init(struct addr_location *al);
+void addr_location__exit(struct addr_location *al);
+
+void addr_location__copy(struct addr_location *dst, struct addr_location *src);

#endif /* __PERF_ADDR_LOCATION */
diff --git a/tools/perf/util/build-id.c b/tools/perf/util/build-id.c
index 06a8cd88cbef..36728222a5b4 100644
--- a/tools/perf/util/build-id.c
+++ b/tools/perf/util/build-id.c
@@ -58,9 +58,11 @@ int build_id__mark_dso_hit(struct perf_tool *tool __maybe_unused,
return -1;
}

+ addr_location__init(&al);
if (thread__find_map(thread, sample->cpumode, sample->ip, &al))
map__dso(al.map)->hit = 1;

+ addr_location__exit(&al);
thread__put(thread);
return 0;
}
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
index b550c7393155..416f2ddc3895 100644
--- a/tools/perf/util/cs-etm.c
+++ b/tools/perf/util/cs-etm.c
@@ -910,33 +910,35 @@ static u32 cs_etm__mem_access(struct cs_etm_queue *etmq, u8 trace_chan_id,
struct addr_location al;
struct dso *dso;
struct cs_etm_traceid_queue *tidq;
+ int ret = 0;

if (!etmq)
return 0;

+ addr_location__init(&al);
machine = etmq->etm->machine;
cpumode = cs_etm__cpu_mode(etmq, address);
tidq = cs_etm__etmq_get_traceid_queue(etmq, trace_chan_id);
if (!tidq)
- return 0;
+ goto out;

thread = tidq->thread;
if (!thread) {
if (cpumode != PERF_RECORD_MISC_KERNEL)
- return 0;
+ goto out;
thread = etmq->etm->unknown_thread;
}

if (!thread__find_map(thread, cpumode, address, &al))
- return 0;
+ goto out;

dso = map__dso(al.map);
if (!dso)
- return 0;
+ goto out;

if (dso->data.status == DSO_DATA_STATUS_ERROR &&
dso__data_status_seen(dso, DSO_DATA_STATUS_SEEN_ITRACE))
- return 0;
+ goto out;

offset = map__map_ip(al.map, address);

@@ -953,10 +955,12 @@ static u32 cs_etm__mem_access(struct cs_etm_queue *etmq, u8 trace_chan_id,
dso->long_name ? dso->long_name : "Unknown");
dso->auxtrace_warned = true;
}
- return 0;
+ goto out;
}
-
- return len;
+ ret = len;
+out:
+ addr_location__exit(&al);
+ return ret;
}

static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm,
diff --git a/tools/perf/util/data-convert-json.c b/tools/perf/util/data-convert-json.c
index 291591e303cd..5bb3c2ba95ca 100644
--- a/tools/perf/util/data-convert-json.c
+++ b/tools/perf/util/data-convert-json.c
@@ -154,12 +154,14 @@ static int process_sample_event(struct perf_tool *tool,
{
struct convert_json *c = container_of(tool, struct convert_json, tool);
FILE *out = c->out;
- struct addr_location al, tal;
+ struct addr_location al;
u64 sample_type = __evlist__combined_sample_type(evsel->evlist);
u8 cpumode = PERF_RECORD_MISC_USER;

+ addr_location__init(&al);
if (machine__resolve(machine, &al, sample) < 0) {
pr_err("Sample resolution failed!\n");
+ addr_location__exit(&al);
return -1;
}

@@ -190,6 +192,7 @@ static int process_sample_event(struct perf_tool *tool,

for (i = 0; i < sample->callchain->nr; ++i) {
u64 ip = sample->callchain->ips[i];
+ struct addr_location tal;

if (ip >= PERF_CONTEXT_MAX) {
switch (ip) {
@@ -215,8 +218,10 @@ static int process_sample_event(struct perf_tool *tool,
else
fputc(',', out);

+ addr_location__init(&tal);
ok = thread__find_symbol(al.thread, cpumode, ip, &tal);
output_sample_callchain_entry(tool, ip, ok ? &tal : NULL);
+ addr_location__exit(&tal);
}
} else {
output_sample_callchain_entry(tool, sample->ip, &al);
@@ -245,6 +250,7 @@ static int process_sample_event(struct perf_tool *tool,
}
#endif
output_json_format(out, false, 2, "}");
+ addr_location__exit(&al);
return 0;
}

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 751fd53bfd93..6184696dc266 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -239,16 +239,17 @@ static struct call_path *call_path_from_sample(struct db_export *dbe,
struct addr_location al;
u64 dso_db_id = 0, sym_db_id = 0, offset = 0;

- memset(&al, 0, sizeof(al));

node = callchain_cursor_current(&callchain_cursor);
if (!node)
break;
+
/*
* Handle export of symbol and dso for this node by
* constructing an addr_location struct and then passing it to
* db_ids_from_al() to perform the export.
*/
+ addr_location__init(&al);
al.sym = node->ms.sym;
al.map = node->ms.map;
al.maps = thread__maps(thread);
@@ -265,6 +266,7 @@ static struct call_path *call_path_from_sample(struct db_export *dbe,
kernel_start);

callchain_cursor_advance(&callchain_cursor);
+ addr_location__exit(&al);
}

/* Reset the callchain order to its prior value. */
diff --git a/tools/perf/util/dlfilter.c b/tools/perf/util/dlfilter.c
index 8016f21dc0b8..46f74b2344db 100644
--- a/tools/perf/util/dlfilter.c
+++ b/tools/perf/util/dlfilter.c
@@ -258,6 +258,7 @@ static __s32 dlfilter__object_code(void *ctx, __u64 ip, void *buf, __u32 len)
struct addr_location a;
struct map *map;
u64 offset;
+ __s32 ret;

if (!d->ctx_valid)
return -1;
@@ -272,16 +273,22 @@ static __s32 dlfilter__object_code(void *ctx, __u64 ip, void *buf, __u32 len)
machine__kernel_ip(d->machine, ip) == machine__kernel_ip(d->machine, d->sample->ip))
goto have_map;

+ addr_location__init(&a);
thread__find_map_fb(al->thread, d->sample->cpumode, ip, &a);
- if (!a.map)
- return -1;
+ if (!a.map) {
+ ret = -1;
+ goto out;
+ }

map = a.map;
have_map:
offset = map__map_ip(map, ip);
if (ip + len >= map__end(map))
len = map__end(map) - ip;
- return dso__data_read_offset(map__dso(map), d->machine, offset, buf, len);
+ ret = dso__data_read_offset(map__dso(map), d->machine, offset, buf, len);
+out:
+ addr_location__exit(&a);
+ return ret;
}

static const struct perf_dlfilter_fns perf_dlfilter_fns = {
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 6ee23145ee7e..2fcfba38fc48 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -486,6 +486,7 @@ size_t perf_event__fprintf_text_poke(union perf_event *event, struct machine *ma
if (machine) {
struct addr_location al;

+ addr_location__init(&al);
al.map = map__get(maps__find(machine__kernel_maps(machine), tp->addr));
if (al.map && map__load(al.map) >= 0) {
al.addr = map__map_ip(al.map, tp->addr);
@@ -493,7 +494,7 @@ size_t perf_event__fprintf_text_poke(union perf_event *event, struct machine *ma
if (al.sym)
ret += symbol__fprintf_symname_offs(al.sym, &al, fp);
}
- map__put(al.map);
+ addr_location__exit(&al);
}
ret += fprintf(fp, " old len %u new len %u\n", tp->old_len, tp->new_len);
old = true;
@@ -577,8 +578,10 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,
struct machine *machine = maps__machine(maps);
bool load_map = false;

- al->maps = maps;
- al->thread = thread;
+ maps__zput(al->maps);
+ map__zput(al->map);
+ thread__zput(al->thread);
+
al->addr = addr;
al->cpumode = cpumode;
al->filtered = 0;
@@ -590,13 +593,13 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,

if (cpumode == PERF_RECORD_MISC_KERNEL && perf_host) {
al->level = 'k';
- al->maps = maps = machine__kernel_maps(machine);
+ maps = machine__kernel_maps(machine);
load_map = true;
} else if (cpumode == PERF_RECORD_MISC_USER && perf_host) {
al->level = '.';
} else if (cpumode == PERF_RECORD_MISC_GUEST_KERNEL && perf_guest) {
al->level = 'g';
- al->maps = maps = machine__kernel_maps(machine);
+ maps = machine__kernel_maps(machine);
load_map = true;
} else if (cpumode == PERF_RECORD_MISC_GUEST_USER && perf_guest) {
al->level = 'u';
@@ -615,7 +618,8 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,

return NULL;
}
-
+ al->maps = maps__get(maps);
+ al->thread = thread__get(thread);
al->map = map__get(maps__find(maps, al->addr));
if (al->map != NULL) {
/*
diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index a1655fd7ed9b..cf45ca0e768f 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -128,8 +128,6 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
bool first = true;

if (sample->callchain) {
- struct addr_location node_al;
-
callchain_cursor_commit(cursor);

while (1) {
@@ -159,9 +157,12 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
printed += fprintf(fp, "%c%16" PRIx64, s, node->ip);

if (print_sym) {
+ struct addr_location node_al;
+
+ addr_location__init(&node_al);
printed += fprintf(fp, " ");
node_al.addr = addr;
- node_al.map = map;
+ node_al.map = map__get(map);

if (print_symoffset) {
printed += __symbol__fprintf_symname_offs(sym, &node_al,
@@ -171,6 +172,7 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
printed += __symbol__fprintf_symname(sym, &node_al,
print_unknown_as_addr, fp);
}
+ addr_location__exit(&node_al);
}

if (print_dso && (!sym || !sym->inlined))
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 4bc3affbe891..a4c1b617f6e4 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -588,7 +588,7 @@ static void hist_entry__add_callchain_period(struct hist_entry *he, u64 period)

static struct hist_entry *hists__findnew_entry(struct hists *hists,
struct hist_entry *entry,
- struct addr_location *al,
+ const struct addr_location *al,
bool sample_self)
{
struct rb_node **p;
@@ -927,8 +927,10 @@ iter_next_branch_entry(struct hist_entry_iter *iter, struct addr_location *al)
if (iter->curr >= iter->total)
return 0;

- al->maps = bi[i].to.ms.maps;
- al->map = bi[i].to.ms.map;
+ maps__put(al->maps);
+ al->maps = maps__get(bi[i].to.ms.maps);
+ map__put(al->map);
+ al->map = map__get(bi[i].to.ms.map);
al->sym = bi[i].to.ms.sym;
al->addr = bi[i].to.addr;
return 1;
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 45c7e7722916..783ce61c6d25 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -754,13 +754,15 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
struct addr_location al;
unsigned char buf[INTEL_PT_INSN_BUF_SZ];
ssize_t len;
- int x86_64;
+ int x86_64, ret = 0;
u8 cpumode;
u64 offset, start_offset, start_ip;
u64 insn_cnt = 0;
bool one_map = true;
bool nr;

+
+ addr_location__init(&al);
intel_pt_insn->length = 0;

if (to_ip && *ip == to_ip)
@@ -773,19 +775,22 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
if (ptq->pt->have_guest_sideband) {
if (!ptq->guest_machine || ptq->guest_machine_pid != ptq->pid) {
intel_pt_log("ERROR: guest sideband but no guest machine\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto out_ret;
}
} else if ((!symbol_conf.guest_code && cpumode != PERF_RECORD_MISC_GUEST_KERNEL) ||
intel_pt_get_guest(ptq)) {
intel_pt_log("ERROR: no guest machine\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto out_ret;
}
machine = ptq->guest_machine;
thread = ptq->guest_thread;
if (!thread) {
if (cpumode != PERF_RECORD_MISC_GUEST_KERNEL) {
intel_pt_log("ERROR: no guest thread\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto out_ret;
}
thread = ptq->unknown_guest_thread;
}
@@ -794,7 +799,8 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
if (!thread) {
if (cpumode != PERF_RECORD_MISC_KERNEL) {
intel_pt_log("ERROR: no thread\n");
- return -EINVAL;
+ ret = -EINVAL;
+ goto out_ret;
}
thread = ptq->pt->unknown_thread;
}
@@ -808,13 +814,17 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
intel_pt_log("ERROR: thread has no dso for %#" PRIx64 "\n", *ip);
else
intel_pt_log("ERROR: thread has no map for %#" PRIx64 "\n", *ip);
- return -EINVAL;
+ addr_location__exit(&al);
+ ret = -EINVAL;
+ goto out_ret;
}
dso = map__dso(al.map);

if (dso->data.status == DSO_DATA_STATUS_ERROR &&
- dso__data_status_seen(dso, DSO_DATA_STATUS_SEEN_ITRACE))
- return -ENOENT;
+ dso__data_status_seen(dso, DSO_DATA_STATUS_SEEN_ITRACE)) {
+ ret = -ENOENT;
+ goto out_ret;
+ }

offset = map__map_ip(al.map, *ip);

@@ -833,7 +843,8 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
intel_pt_insn->rel = e->rel;
memcpy(intel_pt_insn->buf, e->insn, INTEL_PT_INSN_BUF_SZ);
intel_pt_log_insn_no_data(intel_pt_insn, *ip);
- return 0;
+ ret = 0;
+ goto out_ret;
}
}

@@ -854,11 +865,14 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
offset);
if (intel_pt_enable_logging)
dso__fprintf(dso, intel_pt_log_fp());
- return -EINVAL;
+ ret = -EINVAL;
+ goto out_ret;
}

- if (intel_pt_get_insn(buf, len, x86_64, intel_pt_insn))
- return -EINVAL;
+ if (intel_pt_get_insn(buf, len, x86_64, intel_pt_insn)) {
+ ret = -EINVAL;
+ goto out_ret;
+ }

intel_pt_log_insn(intel_pt_insn, *ip);

@@ -909,17 +923,20 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,

e = intel_pt_cache_lookup(map__dso(al.map), machine, start_offset);
if (e)
- return 0;
+ goto out_ret;
}

/* Ignore cache errors */
intel_pt_cache_add(map__dso(al.map), machine, start_offset, insn_cnt,
*ip - start_ip, intel_pt_insn);

- return 0;
+out_ret:
+ addr_location__exit(&al);
+ return ret;

out_no_cache:
*insn_cnt_ptr = insn_cnt;
+ addr_location__exit(&al);
return 0;
}

@@ -968,6 +985,7 @@ static int __intel_pt_pgd_ip(uint64_t ip, void *data)
struct addr_location al;
u8 cpumode;
u64 offset;
+ int res;

if (ptq->state->to_nr) {
if (intel_pt_guest_kernel_ip(ip))
@@ -984,12 +1002,15 @@ static int __intel_pt_pgd_ip(uint64_t ip, void *data)
if (!thread)
return -EINVAL;

+ addr_location__init(&al);
if (!thread__find_map(thread, cpumode, ip, &al) || !map__dso(al.map))
return -EINVAL;

offset = map__map_ip(al.map, ip);

- return intel_pt_match_pgd_ip(ptq->pt, ip, offset, map__dso(al.map)->long_name);
+ res = intel_pt_match_pgd_ip(ptq->pt, ip, offset, map__dso(al.map)->long_name);
+ addr_location__exit(&al);
+ return res;
}

static bool intel_pt_pgd_ip(uint64_t ip, void *data)
@@ -3372,20 +3393,22 @@ static int intel_pt_text_poke(struct intel_pt *pt, union perf_event *event)
/* Assume text poke begins in a basic block no more than 4096 bytes */
int cnt = 4096 + event->text_poke.new_len;
struct thread *thread = pt->unknown_thread;
- struct addr_location al = { .map = NULL };
+ struct addr_location al;
struct machine *machine = pt->machine;
struct intel_pt_cache_entry *e;
u64 offset;
+ int ret = 0;

+ addr_location__init(&al);
if (!event->text_poke.new_len)
- return 0;
+ goto out;

for (; cnt; cnt--, addr--) {
struct dso *dso;

if (intel_pt_find_map(thread, cpumode, addr, &al)) {
if (addr < event->text_poke.addr)
- return 0;
+ goto out;
continue;
}

@@ -3406,15 +3429,16 @@ static int intel_pt_text_poke(struct intel_pt *pt, union perf_event *event)
* branch instruction before the text poke address.
*/
if (e->branch != INTEL_PT_BR_NO_BRANCH)
- return 0;
+ goto out;
} else {
intel_pt_cache_invalidate(dso, machine, offset);
intel_pt_log("Invalidated instruction cache for %s at %#"PRIx64"\n",
dso->long_name, addr);
}
}
-
- return 0;
+out:
+ addr_location__exit(&al);
+ return ret;
}

static int intel_pt_process_event(struct perf_session *session,
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 8972c852d3bd..9fcf357a4d53 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2221,7 +2221,7 @@ static void ip__resolve_ams(struct thread *thread,
{
struct addr_location al;

- memset(&al, 0, sizeof(al));
+ addr_location__init(&al);
/*
* We cannot use the header.misc hint to determine whether a
* branch stack address is user, kernel, guest, hypervisor.
@@ -2234,11 +2234,12 @@ static void ip__resolve_ams(struct thread *thread,
ams->addr = ip;
ams->al_addr = al.addr;
ams->al_level = al.level;
- ams->ms.maps = al.maps;
+ ams->ms.maps = maps__get(al.maps);
ams->ms.sym = al.sym;
- ams->ms.map = al.map;
+ ams->ms.map = map__get(al.map);
ams->phys_addr = 0;
ams->data_page_size = 0;
+ addr_location__exit(&al);
}

static void ip__resolve_data(struct thread *thread,
@@ -2247,18 +2248,19 @@ static void ip__resolve_data(struct thread *thread,
{
struct addr_location al;

- memset(&al, 0, sizeof(al));
+ addr_location__init(&al);

thread__find_symbol(thread, m, addr, &al);

ams->addr = addr;
ams->al_addr = al.addr;
ams->al_level = al.level;
- ams->ms.maps = al.maps;
+ ams->ms.maps = maps__get(al.maps);
ams->ms.sym = al.sym;
- ams->ms.map = al.map;
+ ams->ms.map = map__get(al.map);
ams->phys_addr = phys_addr;
ams->data_page_size = daddr_page_size;
+ addr_location__exit(&al);
}

struct mem_info *sample__resolve_mem(struct perf_sample *sample,
@@ -2319,10 +2321,11 @@ static int add_callchain_ip(struct thread *thread,
{
struct map_symbol ms;
struct addr_location al;
- int nr_loop_iter = 0, err;
+ int nr_loop_iter = 0, err = 0;
u64 iter_cycles = 0;
const char *srcline = NULL;

+ addr_location__init(&al);
al.filtered = 0;
al.sym = NULL;
al.srcline = NULL;
@@ -2348,9 +2351,10 @@ static int add_callchain_ip(struct thread *thread,
* Discard all.
*/
callchain_cursor_reset(cursor);
- return 1;
+ err = 1;
+ goto out;
}
- return 0;
+ goto out;
}
thread__find_symbol(thread, *cpumode, ip, &al);
}
@@ -2363,31 +2367,32 @@ static int add_callchain_ip(struct thread *thread,
symbol__match_regex(al.sym, &ignore_callees_regex)) {
/* Treat this symbol as the root,
forgetting its callees. */
- *root_al = al;
+ addr_location__copy(root_al, &al);
callchain_cursor_reset(cursor);
}
}

if (symbol_conf.hide_unresolved && al.sym == NULL)
- return 0;
+ goto out;

if (iter) {
nr_loop_iter = iter->nr_loop_iter;
iter_cycles = iter->cycles;
}

- ms.maps = al.maps;
- ms.map = al.map;
+ ms.maps = maps__get(al.maps);
+ ms.map = map__get(al.map);
ms.sym = al.sym;

if (!branch && append_inlines(cursor, &ms, ip) == 0)
- return 0;
+ goto out;

srcline = callchain_srcline(&ms, al.addr);
err = callchain_cursor_append(cursor, ip, &ms,
branch, flags, nr_loop_iter,
iter_cycles, branch_from, srcline);
- map__put(al.map);
+out:
+ addr_location__exit(&al);
return err;
}

diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index f3d262e871ac..d7c99028c6e6 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -469,9 +469,11 @@ static PyObject *python_process_callchain(struct perf_sample *sample,
struct addr_location node_al;
unsigned long offset;

+ addr_location__init(&node_al);
node_al.addr = map__map_ip(map, node->ip);
- node_al.map = map;
+ node_al.map = map__get(map);
offset = get_offset(node->ms.sym, &node_al);
+ addr_location__exit(&node_al);

pydict_set_item_string_decref(
pyelem, "sym_off",
@@ -539,6 +541,7 @@ static PyObject *python_process_brstack(struct perf_sample *sample,
pydict_set_item_string_decref(pyelem, "cycles",
PyLong_FromUnsignedLongLong(entries[i].flags.cycles));

+ addr_location__init(&al);
thread__find_map_fb(thread, sample->cpumode,
entries[i].from, &al);
dsoname = get_dsoname(al.map);
@@ -551,6 +554,7 @@ static PyObject *python_process_brstack(struct perf_sample *sample,
pydict_set_item_string_decref(pyelem, "to_dsoname",
_PyUnicode_FromString(dsoname));

+ addr_location__exit(&al);
PyList_Append(pylist, pyelem);
Py_DECREF(pyelem);
}
@@ -594,7 +598,6 @@ static PyObject *python_process_brstacksym(struct perf_sample *sample,
PyObject *pylist;
u64 i;
char bf[512];
- struct addr_location al;

pylist = PyList_New(0);
if (!pylist)
@@ -605,7 +608,9 @@ static PyObject *python_process_brstacksym(struct perf_sample *sample,

for (i = 0; i < br->nr; i++) {
PyObject *pyelem;
+ struct addr_location al;

+ addr_location__init(&al);
pyelem = PyDict_New();
if (!pyelem)
Py_FatalError("couldn't create Python dictionary");
@@ -644,6 +649,7 @@ static PyObject *python_process_brstacksym(struct perf_sample *sample,

PyList_Append(pylist, pyelem);
Py_DECREF(pyelem);
+ addr_location__exit(&al);
}

exit:
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 9a1db3be6436..bee4ac1051ee 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -432,18 +432,25 @@ int thread__memcpy(struct thread *thread, struct machine *machine,
if (machine__kernel_ip(machine, ip))
cpumode = PERF_RECORD_MISC_KERNEL;

- if (!thread__find_map(thread, cpumode, ip, &al))
- return -1;
+ addr_location__init(&al);
+ if (!thread__find_map(thread, cpumode, ip, &al)) {
+ addr_location__exit(&al);
+ return -1;
+ }

dso = map__dso(al.map);

- if( !dso || dso->data.status == DSO_DATA_STATUS_ERROR || map__load(al.map) < 0)
+ if (!dso || dso->data.status == DSO_DATA_STATUS_ERROR || map__load(al.map) < 0) {
+ addr_location__exit(&al);
return -1;
+ }

offset = map__map_ip(al.map, ip);
if (is64bit)
*is64bit = dso->is_64_bit;

+ addr_location__exit(&al);
+
return dso__data_read_offset(dso, machine, offset, buf, len);
}

diff --git a/tools/perf/util/unwind-libdw.c b/tools/perf/util/unwind-libdw.c
index 3723b5e31b2a..83eea968482e 100644
--- a/tools/perf/util/unwind-libdw.c
+++ b/tools/perf/util/unwind-libdw.c
@@ -90,8 +90,12 @@ static int __report_module(struct addr_location *al, u64 ip,
static int report_module(u64 ip, struct unwind_info *ui)
{
struct addr_location al;
+ int res;

- return __report_module(&al, ip, ui);
+ addr_location__init(&al);
+ res = __report_module(&al, ip, ui);
+ addr_location__exit(&al);
+ return res;
}

/*
@@ -104,8 +108,11 @@ static int entry(u64 ip, struct unwind_info *ui)
struct unwind_entry *e = &ui->entries[ui->idx++];
struct addr_location al;

- if (__report_module(&al, ip, ui))
+ addr_location__init(&al);
+ if (__report_module(&al, ip, ui)) {
+ addr_location__exit(&al);
return -1;
+ }

e->ip = ip;
e->ms.maps = al.maps;
@@ -116,6 +123,7 @@ static int entry(u64 ip, struct unwind_info *ui)
al.sym ? al.sym->name : "''",
ip,
al.map ? map__map_ip(al.map, ip) : (u64) 0);
+ addr_location__exit(&al);
return 0;
}

@@ -136,17 +144,22 @@ static int access_dso_mem(struct unwind_info *ui, Dwarf_Addr addr,
ssize_t size;
struct dso *dso;

+ addr_location__init(&al);
if (!thread__find_map(ui->thread, PERF_RECORD_MISC_USER, addr, &al)) {
pr_debug("unwind: no map for %lx\n", (unsigned long)addr);
- return -1;
+ goto out_fail;
}
dso = map__dso(al.map);
if (!dso)
- return -1;
+ goto out_fail;

size = dso__data_read_addr(dso, al.map, ui->machine, addr, (u8 *) data, sizeof(*data));

+ addr_location__exit(&al);
return !(size == sizeof(*data));
+out_fail:
+ addr_location__exit(&al);
+ return -1;
}

static bool memory_read(Dwfl *dwfl __maybe_unused, Dwarf_Addr addr, Dwarf_Word *result,
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index 11f3fc95aa11..36bf5100bad2 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -416,7 +416,12 @@ static int read_unwind_spec_debug_frame(struct dso *dso,
static struct map *find_map(unw_word_t ip, struct unwind_info *ui)
{
struct addr_location al;
- return thread__find_map(ui->thread, PERF_RECORD_MISC_USER, ip, &al);
+ struct map *ret;
+
+ addr_location__init(&al);
+ ret = thread__find_map(ui->thread, PERF_RECORD_MISC_USER, ip, &al);
+ addr_location__exit(&al);
+ return ret;
}

static int
@@ -631,7 +636,9 @@ static int entry(u64 ip, struct thread *thread,
{
struct unwind_entry e;
struct addr_location al;
+ int ret;

+ addr_location__init(&al);
e.ms.sym = thread__find_symbol(thread, PERF_RECORD_MISC_USER, ip, &al);
e.ip = ip;
e.ms.map = al.map;
@@ -642,7 +649,9 @@ static int entry(u64 ip, struct thread *thread,
ip,
al.map ? map__map_ip(al.map, ip) : (u64) 0);

- return cb(&e, arg);
+ ret = cb(&e, arg);
+ addr_location__exit(&al);
+ return ret;
}

static void display_error(int err)
--
2.41.0.162.gfafddb0af9-goog


2023-06-08 23:59:55

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 22/26] perf top: Add exit routine for main thread

Add exit_process_thread that reverses init_process_thread. This avoids
leak sanitizer reporting memory leaks.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-top.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 99010dfa5760..c363c04e16df 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -392,7 +392,7 @@ static void prompt_percent(int *target, const char *msg)

static void perf_top__prompt_symbol(struct perf_top *top, const char *msg)
{
- char *buf = malloc(0), *p;
+ char *buf = NULL, *p;
struct hist_entry *syme = top->sym_filter_entry, *n, *found = NULL;
struct hists *hists = evsel__hists(top->sym_evsel);
struct rb_node *next;
@@ -1227,6 +1227,14 @@ static void init_process_thread(struct perf_top *top)
cond_init(&top->qe.cond);
}

+static void exit_process_thread(struct perf_top *top)
+{
+ ordered_events__free(&top->qe.data[0]);
+ ordered_events__free(&top->qe.data[1]);
+ mutex_destroy(&top->qe.mutex);
+ cond_destroy(&top->qe.cond);
+}
+
static int __cmd_top(struct perf_top *top)
{
struct record_opts *opts = &top->record_opts;
@@ -1357,6 +1365,7 @@ static int __cmd_top(struct perf_top *top)
cond_signal(&top->qe.cond);
pthread_join(thread_process, NULL);
perf_set_singlethreaded();
+ exit_process_thread(top);
return ret;
}

--
2.41.0.162.gfafddb0af9-goog


2023-06-09 00:00:50

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 12/26] perf intel-pt: Fix missed put and leak

Add missing put and free, detected with leak sanitizer.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/intel-pt.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 783ce61c6d25..dbf0bc71a63b 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1280,6 +1280,7 @@ static void intel_pt_add_br_stack(struct intel_pt *pt,
pt->kernel_start);

sample->branch_stack = pt->br_stack;
+ thread__put(thread);
}

/* INTEL_PT_LBR_0, INTEL_PT_LBR_1 and INTEL_PT_LBR_2 */
@@ -3580,6 +3581,7 @@ static void intel_pt_free(struct perf_session *session)
zfree(&pt->chain);
zfree(&pt->filter);
zfree(&pt->time_ranges);
+ zfree(&pt->br_stack);
free(pt);
}

--
2.41.0.162.gfafddb0af9-goog


2023-06-09 00:02:50

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 26/26] perf hist: Fix srcline memory leak

srcline isn't freed if it is SRCLINE_UNKNOWN. Avoid strduping in this
case as such strdups are redundant and leak memory.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/hist.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 77cb2cc83bb9..cc6f7f51faa5 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -484,7 +484,7 @@ static int hist_entry__init(struct hist_entry *he,
goto err_infos;
}

- if (he->srcline) {
+ if (he->srcline && strcmp(he->srcline, SRCLINE_UNKNOWN)) {
he->srcline = strdup(he->srcline);
if (he->srcline == NULL)
goto err_rawdata;
--
2.41.0.162.gfafddb0af9-goog


2023-06-09 00:05:06

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 11/26] perf stat: Avoid evlist leak

Free evlist before overwriting in "perf stat report" mode. Detected
using leak sanitizer.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-stat.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index c87c6897edc9..fc615bdeed4f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -2427,6 +2427,7 @@ static int __cmd_report(int argc, const char **argv)

perf_stat.session = session;
stat_config.output = stderr;
+ evlist__delete(evsel_list);
evsel_list = session->evlist;

ret = perf_session__process_events(session);
--
2.41.0.162.gfafddb0af9-goog


2023-06-09 00:09:02

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 16/26] perf symbol-elf: Correct holding a reference

If a reference is held, don't put it as this will confuse reference
count checking.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/symbol-elf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 63882a4db5c7..e6493d1cc251 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1389,11 +1389,11 @@ static int dso__process_kernel_symbol(struct dso *dso, struct map *map,
/* Ensure maps are correctly ordered */
if (kmaps) {
int err;
+ struct map *tmp = map__get(map);

- map__get(map);
maps__remove(kmaps, map);
err = maps__insert(kmaps, map);
- map__put(map);
+ map__put(tmp);
if (err)
return err;
}
--
2.41.0.162.gfafddb0af9-goog


2023-06-09 00:09:32

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 18/26] perf machine: Fix leak of kernel dso

The kernel dso may be found by searching dsos or allocating if not
found. The allocation returns with a reference count of 2, once for
the dsos list and once for the returned value. The list search has a
reference count of 1, once for the dsos list. To make the reference
counts consistent, increase the dsos list search reference count to 2
with a dso__get, and do a put when the scope ends for either the
allocated or found dso.

This issue was found with leak sanitizer and reference count checking.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/machine.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 46af5e9748c9..f8e6c07f0048 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1868,7 +1868,7 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
continue;


- kernel = dso;
+ kernel = dso__get(dso);
break;
}

@@ -1913,6 +1913,7 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
*/
dso__load(kernel, machine__kernel_map(machine));
}
+ dso__put(kernel);
} else if (perf_event__is_extra_kernel_mmap(machine, xm)) {
return machine__process_extra_kernel_map(machine, xm);
}
--
2.41.0.162.gfafddb0af9-goog


2023-06-09 00:09:35

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 15/26] perf jit: Fix two thread leaks

As reported by leak sanitizer with reference count checking.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/jitdump.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/jitdump.c b/tools/perf/util/jitdump.c
index 2380b41a4caa..6b2b96c16ccd 100644
--- a/tools/perf/util/jitdump.c
+++ b/tools/perf/util/jitdump.c
@@ -800,6 +800,7 @@ static void jit_add_pid(struct machine *machine, pid_t pid)
}

thread__set_priv(thread, (void *)true);
+ thread__put(thread);
}

static bool jit_has_pid(struct machine *machine, pid_t pid)
@@ -811,6 +812,7 @@ static bool jit_has_pid(struct machine *machine, pid_t pid)
return false;

priv = thread__priv(thread);
+ thread__put(thread);
return (bool)priv;
}

--
2.41.0.162.gfafddb0af9-goog


2023-06-09 00:09:45

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 21/26] perf annotate: Fix parse_objdump_line memory leak

fileloc is used to hold a previous line, before overwriting it ensure
the previous contents is freed. Free the storage once done in
symbol__disassemble.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/util/annotate.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index b708bbc49c9e..fc5f44535ebe 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1524,6 +1524,7 @@ static int symbol__parse_objdump_line(struct symbol *sym,
/* /filename:linenr ? Save line number and ignore. */
if (regexec(&file_lineno, parsed_line, 2, match, 0) == 0) {
*line_nr = atoi(parsed_line + match[1].rm_so);
+ free(*fileloc);
*fileloc = strdup(parsed_line);
return 0;
}
@@ -1572,7 +1573,6 @@ static int symbol__parse_objdump_line(struct symbol *sym,
}

annotation_line__add(&dl->al, &notes->src->source);
-
return 0;
}

@@ -2114,6 +2114,7 @@ static int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
nline++;
}
free(line);
+ free(fileloc);

err = finish_command(&objdump_process);
if (err)
--
2.41.0.162.gfafddb0af9-goog


2023-06-09 00:25:28

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 25/26] perf srcline: Change free_srcline to zfree_srcline

Make use after free more unlikely.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-diff.c | 4 ++--
tools/perf/util/annotate.c | 2 +-
tools/perf/util/block-info.c | 4 ++--
tools/perf/util/hist.c | 6 +++---
tools/perf/util/map.c | 2 +-
tools/perf/util/srcline.c | 15 ++++++++++-----
tools/perf/util/srcline.h | 2 +-
7 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index ca39657ee407..eec89567ae48 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -1387,8 +1387,8 @@ static int cycles_printf(struct hist_entry *he, struct hist_entry *pair,
bi->start, bi->end, block_he->diff.cycles);
}

- free_srcline(start_line);
- free_srcline(end_line);
+ zfree_srcline(&start_line);
+ zfree_srcline(&end_line);

return scnprintf(hpp->buf, hpp->size, "%*s", width, buf);
}
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index fc5f44535ebe..58fc5fa00ecd 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -1196,7 +1196,7 @@ static void annotation_line__init(struct annotation_line *al,

static void annotation_line__exit(struct annotation_line *al)
{
- free_srcline(al->path);
+ zfree_srcline(&al->path);
zfree(&al->line);
}

diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 16a7b4adcf18..08279b1b65e5 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -305,8 +305,8 @@ static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
bi->start, bi->end);
}

- free_srcline(start_line);
- free_srcline(end_line);
+ zfree_srcline(&start_line);
+ zfree_srcline(&end_line);

return scnprintf(hpp->buf, hpp->size, "%*s", block_fmt->width, buf);
}
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 4004c0915e4f..77cb2cc83bb9 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1317,8 +1317,8 @@ void hist_entry__delete(struct hist_entry *he)
if (he->branch_info) {
map__zput(he->branch_info->from.ms.map);
map__zput(he->branch_info->to.ms.map);
- free_srcline(he->branch_info->srcline_from);
- free_srcline(he->branch_info->srcline_to);
+ zfree_srcline(&he->branch_info->srcline_from);
+ zfree_srcline(&he->branch_info->srcline_to);
zfree(&he->branch_info);
}

@@ -1336,7 +1336,7 @@ void hist_entry__delete(struct hist_entry *he)

zfree(&he->res_samples);
zfree(&he->stat_acc);
- free_srcline(he->srcline);
+ zfree_srcline(&he->srcline);
if (he->srcfile && he->srcfile[0])
zfree(&he->srcfile);
free_callchain(he->callchain);
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index ae1d54d4880a..c77e2fce6a37 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -498,7 +498,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
char *srcline = map__srcline(map, addr, NULL);
if (strncmp(srcline, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)
ret = fprintf(fp, "%s%s", prefix, srcline);
- free_srcline(srcline);
+ zfree_srcline(&srcline);
}
return ret;
}
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index cfca03abd6f8..b8e596528d7e 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -804,10 +804,15 @@ char *get_srcline_split(struct dso *dso, u64 addr, unsigned *line)
return NULL;
}

-void free_srcline(char *srcline)
+void zfree_srcline(char **srcline)
{
- if (srcline && strcmp(srcline, SRCLINE_UNKNOWN) != 0)
- free(srcline);
+ if (*srcline == NULL)
+ return;
+
+ if (strcmp(*srcline, SRCLINE_UNKNOWN))
+ free(*srcline);
+
+ *srcline = NULL;
}

char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
@@ -880,7 +885,7 @@ void srcline__tree_delete(struct rb_root_cached *tree)
pos = rb_entry(next, struct srcline_node, rb_node);
next = rb_next(&pos->rb_node);
rb_erase_cached(&pos->rb_node, tree);
- free_srcline(pos->srcline);
+ zfree_srcline(&pos->srcline);
zfree(&pos);
}
}
@@ -903,7 +908,7 @@ void inline_node__delete(struct inline_node *node)

list_for_each_entry_safe(ilist, tmp, &node->val, list) {
list_del_init(&ilist->list);
- free_srcline(ilist->srcline);
+ zfree_srcline(&ilist->srcline);
/* only the inlined symbols are owned by the list */
if (ilist->symbol && ilist->symbol->inlined)
symbol__delete(ilist->symbol);
diff --git a/tools/perf/util/srcline.h b/tools/perf/util/srcline.h
index b11a0aaaa676..a15c7db9058e 100644
--- a/tools/perf/util/srcline.h
+++ b/tools/perf/util/srcline.h
@@ -15,7 +15,7 @@ char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
bool show_sym, bool show_addr, bool unwind_inlines,
u64 ip);
-void free_srcline(char *srcline);
+void zfree_srcline(char **srcline);
char *get_srcline_split(struct dso *dso, u64 addr, unsigned *line);

/* insert the srcline into the DSO, which will take ownership */
--
2.41.0.162.gfafddb0af9-goog


2023-06-09 00:26:56

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 04/26] perf maps: Make delete static, always use put

Address/leak sanitizer with reference count checking can identify the
location of leaks, so use put rather than delete to avoid free-ing
memory when the reference count is >1. Add maps__zput to ensure the
variable is cleared.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/tests/maps.c | 2 +-
tools/perf/util/machine.c | 2 +-
tools/perf/util/maps.c | 2 +-
tools/perf/util/maps.h | 9 ++++++++-
4 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/tools/perf/tests/maps.c b/tools/perf/tests/maps.c
index 8c0eb5cf8bb5..5bb1123a91a7 100644
--- a/tools/perf/tests/maps.c
+++ b/tools/perf/tests/maps.c
@@ -140,7 +140,7 @@ static int test__maps__merge_in(struct test_suite *t __maybe_unused, int subtest
ret = check_maps(merged3, ARRAY_SIZE(merged3), maps);
TEST_ASSERT_VAL("merge check failed", !ret);

- maps__delete(maps);
+ maps__zput(maps);
return TEST_OK;
}

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 5d34d60a0045..8972c852d3bd 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -248,7 +248,7 @@ void machine__exit(struct machine *machine)
return;

machine__destroy_kernel_maps(machine);
- maps__delete(machine->kmaps);
+ maps__zput(machine->kmaps);
dsos__exit(&machine->dsos);
machine__exit_vdso(machine);
zfree(&machine->root_dir);
diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
index 5ae6379a1b42..5206a6433117 100644
--- a/tools/perf/util/maps.c
+++ b/tools/perf/util/maps.c
@@ -171,7 +171,7 @@ struct maps *maps__new(struct machine *machine)
return result;
}

-void maps__delete(struct maps *maps)
+static void maps__delete(struct maps *maps)
{
maps__exit(maps);
unwind__finish_access(maps);
diff --git a/tools/perf/util/maps.h b/tools/perf/util/maps.h
index d2963456cfbe..83144e0645ed 100644
--- a/tools/perf/util/maps.h
+++ b/tools/perf/util/maps.h
@@ -57,13 +57,20 @@ struct kmap {
};

struct maps *maps__new(struct machine *machine);
-void maps__delete(struct maps *maps);
bool maps__empty(struct maps *maps);
int maps__clone(struct thread *thread, struct maps *parent);

struct maps *maps__get(struct maps *maps);
void maps__put(struct maps *maps);

+static inline void __maps__zput(struct maps **map)
+{
+ maps__put(*map);
+ *map = NULL;
+}
+
+#define maps__zput(map) __maps__zput(&map)
+
static inline struct rb_root *maps__entries(struct maps *maps)
{
return &RC_CHK_ACCESS(maps)->entries;
--
2.41.0.162.gfafddb0af9-goog


2023-06-09 00:30:17

by Ian Rogers

[permalink] [raw]
Subject: [PATCH v2 24/26] perf callchain: Use pthread keys for tls callchain_cursor

Pthread keys are more portable than __thread and allow the association
of a destructor with the key. Use the destructor to clean up TLS
callchain cursors to aid understanding memory leaks.

Signed-off-by: Ian Rogers <[email protected]>
---
tools/perf/builtin-c2c.c | 4 +-
tools/perf/builtin-script.c | 24 ++++++-----
tools/perf/util/callchain.c | 40 ++++++++++++++++++-
tools/perf/util/callchain.h | 4 +-
tools/perf/util/db-export.c | 10 +++--
tools/perf/util/hist.c | 29 ++++++++------
.../scripting-engines/trace-event-python.c | 10 +++--
7 files changed, 86 insertions(+), 35 deletions(-)

diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 530a44a59f41..a4cf9de7a7b5 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -284,6 +284,7 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
struct hist_entry *he;
struct addr_location al;
struct mem_info *mi, *mi_dup;
+ struct callchain_cursor *cursor;
int ret;

addr_location__init(&al);
@@ -297,7 +298,8 @@ static int process_sample_event(struct perf_tool *tool __maybe_unused,
if (c2c.stitch_lbr)
thread__set_lbr_stitch_enable(al.thread, true);

- ret = sample__resolve_callchain(sample, &callchain_cursor, NULL,
+ cursor = get_tls_callchain_cursor();
+ ret = sample__resolve_callchain(sample, cursor, NULL,
evsel, &al, sysctl_perf_event_max_stack);
if (ret)
goto out;
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 784d478c2e05..e3f435e6a7d0 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1557,11 +1557,13 @@ static int perf_sample__fprintf_bts(struct perf_sample *sample,
unsigned int print_opts = output[type].print_ip_opts;
struct callchain_cursor *cursor = NULL;

- if (symbol_conf.use_callchain && sample->callchain &&
- thread__resolve_callchain(al->thread, &callchain_cursor, evsel,
- sample, NULL, NULL, scripting_max_stack) == 0)
- cursor = &callchain_cursor;
-
+ if (symbol_conf.use_callchain && sample->callchain) {
+ cursor = get_tls_callchain_cursor();
+ if (thread__resolve_callchain(al->thread, cursor, evsel,
+ sample, NULL, NULL,
+ scripting_max_stack))
+ cursor = NULL;
+ }
if (cursor == NULL) {
printed += fprintf(fp, " ");
if (print_opts & EVSEL__PRINT_SRCLINE) {
@@ -2203,11 +2205,13 @@ static void process_event(struct perf_script *script,
if (script->stitch_lbr)
thread__set_lbr_stitch_enable(al->thread, true);

- if (symbol_conf.use_callchain && sample->callchain &&
- thread__resolve_callchain(al->thread, &callchain_cursor, evsel,
- sample, NULL, NULL, scripting_max_stack) == 0)
- cursor = &callchain_cursor;
-
+ if (symbol_conf.use_callchain && sample->callchain) {
+ cursor = get_tls_callchain_cursor();
+ if (thread__resolve_callchain(al->thread, cursor, evsel,
+ sample, NULL, NULL,
+ scripting_max_stack))
+ cursor = NULL;
+ }
fputc(cursor ? '\n' : ' ', fp);
sample__fprintf_sym(sample, al, 0, output[type].print_ip_opts, cursor,
symbol_conf.bt_stop_list, fp);
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 909f62b3b266..7e9bd3b6be9f 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -58,7 +58,8 @@ struct callchain_param callchain_param_default = {
CALLCHAIN_PARAM_DEFAULT
};

-__thread struct callchain_cursor callchain_cursor;
+/* Used for thread-local struct callchain_cursor. */
+static pthread_key_t callchain_cursor;

int parse_callchain_record_opt(const char *arg, struct callchain_param *param)
{
@@ -1116,7 +1117,7 @@ int hist_entry__append_callchain(struct hist_entry *he, struct perf_sample *samp
if ((!symbol_conf.use_callchain || sample->callchain == NULL) &&
!symbol_conf.show_branchflag_count)
return 0;
- return callchain_append(he->callchain, &callchain_cursor, sample->period);
+ return callchain_append(he->callchain, get_tls_callchain_cursor(), sample->period);
}

int fill_callchain_info(struct addr_location *al, struct callchain_cursor_node *node,
@@ -1570,6 +1571,41 @@ int callchain_node__make_parent_list(struct callchain_node *node)
return -ENOMEM;
}

+static void callchain_cursor__delete(void *vcursor)
+{
+ struct callchain_cursor *cursor = vcursor;
+ struct callchain_cursor_node *node, *next;
+
+ callchain_cursor_reset(cursor);
+ for (node = cursor->first; node != NULL; node = next) {
+ next = node->next;
+ free(node);
+ }
+ free(cursor);
+}
+
+static void init_callchain_cursor_key(void)
+{
+ if (pthread_key_create(&callchain_cursor, callchain_cursor__delete)) {
+ pr_err("callchain cursor creation failed");
+ abort();
+ }
+}
+
+struct callchain_cursor *get_tls_callchain_cursor(void)
+{
+ static pthread_once_t once_control = PTHREAD_ONCE_INIT;
+ struct callchain_cursor *cursor;
+
+ pthread_once(&once_control, init_callchain_cursor_key);
+ cursor = pthread_getspecific(callchain_cursor);
+ if (!cursor) {
+ cursor = zalloc(sizeof(*cursor));
+ pthread_setspecific(callchain_cursor, cursor);
+ }
+ return cursor;
+}
+
int callchain_cursor__copy(struct callchain_cursor *dst,
struct callchain_cursor *src)
{
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index d95615daed73..ce9410018cf7 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -168,8 +168,6 @@ struct callchain_cursor {
struct callchain_cursor_node *curr;
};

-extern __thread struct callchain_cursor callchain_cursor;
-
static inline void callchain_init(struct callchain_root *root)
{
INIT_LIST_HEAD(&root->node.val);
@@ -231,6 +229,8 @@ static inline void callchain_cursor_advance(struct callchain_cursor *cursor)
cursor->pos++;
}

+struct callchain_cursor *get_tls_callchain_cursor(void);
+
int callchain_cursor__copy(struct callchain_cursor *dst,
struct callchain_cursor *src);

diff --git a/tools/perf/util/db-export.c b/tools/perf/util/db-export.c
index 6184696dc266..b9fb71ab7a73 100644
--- a/tools/perf/util/db-export.c
+++ b/tools/perf/util/db-export.c
@@ -215,6 +215,7 @@ static struct call_path *call_path_from_sample(struct db_export *dbe,
u64 kernel_start = machine__kernel_start(machine);
struct call_path *current = &dbe->cpr->call_path;
enum chain_order saved_order = callchain_param.order;
+ struct callchain_cursor *cursor;
int err;

if (!symbol_conf.use_callchain || !sample->callchain)
@@ -226,13 +227,14 @@ static struct call_path *call_path_from_sample(struct db_export *dbe,
* the callchain starting with the root node and ending with the leaf.
*/
callchain_param.order = ORDER_CALLER;
- err = thread__resolve_callchain(thread, &callchain_cursor, evsel,
+ cursor = get_tls_callchain_cursor();
+ err = thread__resolve_callchain(thread, cursor, evsel,
sample, NULL, NULL, PERF_MAX_STACK_DEPTH);
if (err) {
callchain_param.order = saved_order;
return NULL;
}
- callchain_cursor_commit(&callchain_cursor);
+ callchain_cursor_commit(cursor);

while (1) {
struct callchain_cursor_node *node;
@@ -240,7 +242,7 @@ static struct call_path *call_path_from_sample(struct db_export *dbe,
u64 dso_db_id = 0, sym_db_id = 0, offset = 0;


- node = callchain_cursor_current(&callchain_cursor);
+ node = callchain_cursor_current(cursor);
if (!node)
break;

@@ -265,7 +267,7 @@ static struct call_path *call_path_from_sample(struct db_export *dbe,
al.sym, node->ip,
kernel_start);

- callchain_cursor_advance(&callchain_cursor);
+ callchain_cursor_advance(cursor);
addr_location__exit(&al);
}

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index fb218b3e8a7c..4004c0915e4f 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1029,15 +1029,16 @@ iter_prepare_cumulative_entry(struct hist_entry_iter *iter,
struct addr_location *al __maybe_unused)
{
struct hist_entry **he_cache;
+ struct callchain_cursor *cursor = get_tls_callchain_cursor();

- callchain_cursor_commit(&callchain_cursor);
+ callchain_cursor_commit(cursor);

/*
* This is for detecting cycles or recursions so that they're
* cumulated only one time to prevent entries more than 100%
* overhead.
*/
- he_cache = malloc(sizeof(*he_cache) * (callchain_cursor.nr + 1));
+ he_cache = malloc(sizeof(*he_cache) * (cursor->nr + 1));
if (he_cache == NULL)
return -ENOMEM;

@@ -1072,7 +1073,7 @@ iter_add_single_cumulative_entry(struct hist_entry_iter *iter,
* We need to re-initialize the cursor since callchain_append()
* advanced the cursor to the end.
*/
- callchain_cursor_commit(&callchain_cursor);
+ callchain_cursor_commit(get_tls_callchain_cursor());

hists__inc_nr_samples(hists, he->filtered);

@@ -1085,7 +1086,7 @@ iter_next_cumulative_entry(struct hist_entry_iter *iter,
{
struct callchain_cursor_node *node;

- node = callchain_cursor_current(&callchain_cursor);
+ node = callchain_cursor_current(get_tls_callchain_cursor());
if (node == NULL)
return 0;

@@ -1131,12 +1132,12 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
.raw_size = sample->raw_size,
};
int i;
- struct callchain_cursor cursor;
+ struct callchain_cursor cursor, *tls_cursor = get_tls_callchain_cursor();
bool fast = hists__has(he_tmp.hists, sym);

- callchain_cursor_snapshot(&cursor, &callchain_cursor);
+ callchain_cursor_snapshot(&cursor, tls_cursor);

- callchain_cursor_advance(&callchain_cursor);
+ callchain_cursor_advance(tls_cursor);

/*
* Check if there's duplicate entries in the callchain.
@@ -1222,7 +1223,7 @@ int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al,
if (al)
alm = map__get(al->map);

- err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent,
+ err = sample__resolve_callchain(iter->sample, get_tls_callchain_cursor(), &iter->parent,
iter->evsel, al, max_stack_depth);
if (err) {
map__put(alm);
@@ -1568,8 +1569,10 @@ static int hists__hierarchy_insert_entry(struct hists *hists,

if (hist_entry__has_callchains(new_he) &&
symbol_conf.use_callchain) {
- callchain_cursor_reset(&callchain_cursor);
- if (callchain_merge(&callchain_cursor,
+ struct callchain_cursor *cursor = get_tls_callchain_cursor();
+
+ callchain_cursor_reset(cursor);
+ if (callchain_merge(cursor,
new_he->callchain,
he->callchain) < 0)
ret = -1;
@@ -1610,8 +1613,10 @@ static int hists__collapse_insert_entry(struct hists *hists,
he_stat__add_stat(iter->stat_acc, he->stat_acc);

if (hist_entry__has_callchains(he) && symbol_conf.use_callchain) {
- callchain_cursor_reset(&callchain_cursor);
- if (callchain_merge(&callchain_cursor,
+ struct callchain_cursor *cursor = get_tls_callchain_cursor();
+
+ callchain_cursor_reset(cursor);
+ if (callchain_merge(cursor,
iter->callchain,
he->callchain) < 0)
ret = -1;
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index d96e5c0fef45..59063ec98619 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -417,6 +417,7 @@ static PyObject *python_process_callchain(struct perf_sample *sample,
struct addr_location *al)
{
PyObject *pylist;
+ struct callchain_cursor *cursor;

pylist = PyList_New(0);
if (!pylist)
@@ -425,19 +426,20 @@ static PyObject *python_process_callchain(struct perf_sample *sample,
if (!symbol_conf.use_callchain || !sample->callchain)
goto exit;

- if (thread__resolve_callchain(al->thread, &callchain_cursor, evsel,
+ cursor = get_tls_callchain_cursor();
+ if (thread__resolve_callchain(al->thread, cursor, evsel,
sample, NULL, NULL,
scripting_max_stack) != 0) {
pr_err("Failed to resolve callchain. Skipping\n");
goto exit;
}
- callchain_cursor_commit(&callchain_cursor);
+ callchain_cursor_commit(cursor);


while (1) {
PyObject *pyelem;
struct callchain_cursor_node *node;
- node = callchain_cursor_current(&callchain_cursor);
+ node = callchain_cursor_current(cursor);
if (!node)
break;

@@ -493,7 +495,7 @@ static PyObject *python_process_callchain(struct perf_sample *sample,
_PyUnicode_FromString(dsoname));
}

- callchain_cursor_advance(&callchain_cursor);
+ callchain_cursor_advance(cursor);
PyList_Append(pylist, pyelem);
Py_DECREF(pyelem);
}
--
2.41.0.162.gfafddb0af9-goog


2023-06-09 14:41:33

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 04/26] perf maps: Make delete static, always use put

Em Thu, Jun 08, 2023 at 04:28:01PM -0700, Ian Rogers escreveu:
> Address/leak sanitizer with reference count checking can identify the
> location of leaks, so use put rather than delete to avoid free-ing
> memory when the reference count is >1. Add maps__zput to ensure the
> variable is cleared.

Applied.

- Arnaldo

> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/tests/maps.c | 2 +-
> tools/perf/util/machine.c | 2 +-
> tools/perf/util/maps.c | 2 +-
> tools/perf/util/maps.h | 9 ++++++++-
> 4 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/tools/perf/tests/maps.c b/tools/perf/tests/maps.c
> index 8c0eb5cf8bb5..5bb1123a91a7 100644
> --- a/tools/perf/tests/maps.c
> +++ b/tools/perf/tests/maps.c
> @@ -140,7 +140,7 @@ static int test__maps__merge_in(struct test_suite *t __maybe_unused, int subtest
> ret = check_maps(merged3, ARRAY_SIZE(merged3), maps);
> TEST_ASSERT_VAL("merge check failed", !ret);
>
> - maps__delete(maps);
> + maps__zput(maps);
> return TEST_OK;
> }
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 5d34d60a0045..8972c852d3bd 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -248,7 +248,7 @@ void machine__exit(struct machine *machine)
> return;
>
> machine__destroy_kernel_maps(machine);
> - maps__delete(machine->kmaps);
> + maps__zput(machine->kmaps);
> dsos__exit(&machine->dsos);
> machine__exit_vdso(machine);
> zfree(&machine->root_dir);
> diff --git a/tools/perf/util/maps.c b/tools/perf/util/maps.c
> index 5ae6379a1b42..5206a6433117 100644
> --- a/tools/perf/util/maps.c
> +++ b/tools/perf/util/maps.c
> @@ -171,7 +171,7 @@ struct maps *maps__new(struct machine *machine)
> return result;
> }
>
> -void maps__delete(struct maps *maps)
> +static void maps__delete(struct maps *maps)
> {
> maps__exit(maps);
> unwind__finish_access(maps);
> diff --git a/tools/perf/util/maps.h b/tools/perf/util/maps.h
> index d2963456cfbe..83144e0645ed 100644
> --- a/tools/perf/util/maps.h
> +++ b/tools/perf/util/maps.h
> @@ -57,13 +57,20 @@ struct kmap {
> };
>
> struct maps *maps__new(struct machine *machine);
> -void maps__delete(struct maps *maps);
> bool maps__empty(struct maps *maps);
> int maps__clone(struct thread *thread, struct maps *parent);
>
> struct maps *maps__get(struct maps *maps);
> void maps__put(struct maps *maps);
>
> +static inline void __maps__zput(struct maps **map)
> +{
> + maps__put(*map);
> + *map = NULL;
> +}
> +
> +#define maps__zput(map) __maps__zput(&map)
> +
> static inline struct rb_root *maps__entries(struct maps *maps)
> {
> return &RC_CHK_ACCESS(maps)->entries;
> --
> 2.41.0.162.gfafddb0af9-goog
>

--

- Arnaldo

2023-06-09 19:54:53

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 24/26] perf callchain: Use pthread keys for tls callchain_cursor

Em Thu, Jun 08, 2023 at 04:28:21PM -0700, Ian Rogers escreveu:
> Pthread keys are more portable than __thread and allow the association
> of a destructor with the key. Use the destructor to clean up TLS
> callchain cursors to aid understanding memory leaks.
>
> Signed-off-by: Ian Rogers <[email protected]>

Had to add this, and added this to the cset commit log:

Committer notes:

Had to fixup a series of unconverted places and also check for the
return of get_tls_callchain_cursor() as it may fail and return NULL.

In that unlikely case we now either print something to a file, if the
caller was expecting to print a callchain, or return an error code to
state that resolving the callchain isn't possible.

In some cases this was made easier because thread__resolve_callchain()
already can fail for other reasons, so this new one (cursor == NULL) can
be added and the callers don't have to explicitely check for this new
condition.

Now building it with the containers, will continue reviewing, more eyes
on this would be most welcome...

- Arnaldo

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index 96a6611e4e53f448..9714327fd0eadd1b 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -399,6 +399,7 @@ static u64 find_callsite(struct evsel *evsel, struct perf_sample *sample)
struct addr_location al;
struct machine *machine = &kmem_session->machines.host;
struct callchain_cursor_node *node;
+ struct callchain_cursor *cursor;
u64 result = sample->ip;

addr_location__init(&al);
@@ -408,14 +409,19 @@ static u64 find_callsite(struct evsel *evsel, struct perf_sample *sample)
}

al.thread = machine__findnew_thread(machine, sample->pid, sample->tid);
- sample__resolve_callchain(sample, &callchain_cursor, NULL, evsel, &al, 16);

- callchain_cursor_commit(&callchain_cursor);
+ cursor = get_tls_callchain_cursor();
+ if (cursor == NULL)
+ goto out;
+
+ sample__resolve_callchain(sample, cursor, NULL, evsel, &al, 16);
+
+ callchain_cursor_commit(cursor);
while (true) {
struct alloc_func key, *caller;
u64 addr;

- node = callchain_cursor_current(&callchain_cursor);
+ node = callchain_cursor_current(cursor);
if (node == NULL)
break;

@@ -434,7 +440,7 @@ static u64 find_callsite(struct evsel *evsel, struct perf_sample *sample)
} else
pr_debug3("skipping alloc function: %s\n", caller->name);

- callchain_cursor_advance(&callchain_cursor);
+ callchain_cursor_advance(cursor);
}

pr_debug2("unknown callsite: %"PRIx64 "\n", sample->ip);
diff --git a/tools/perf/builtin-kwork.c b/tools/perf/builtin-kwork.c
index 2d80aef4ecccece0..14bf7a8429e76f89 100644
--- a/tools/perf/builtin-kwork.c
+++ b/tools/perf/builtin-kwork.c
@@ -589,7 +589,7 @@ static void timehist_save_callchain(struct perf_kwork *kwork,
struct symbol *sym;
struct thread *thread;
struct callchain_cursor_node *node;
- struct callchain_cursor *cursor = &callchain_cursor;
+ struct callchain_cursor *cursor;

if (!kwork->show_callchain || sample->callchain == NULL)
return;
@@ -601,6 +601,8 @@ static void timehist_save_callchain(struct perf_kwork *kwork,
return;
}

+ cursor = get_tls_callchain_cursor();
+
if (thread__resolve_callchain(thread, cursor, evsel, sample,
NULL, NULL, kwork->max_stack + 2) != 0) {
pr_debug("Failed to resolve callchain, skipping\n");
@@ -686,12 +688,18 @@ static void timehist_print_event(struct perf_kwork *kwork,
* callchain
*/
if (kwork->show_callchain) {
+ struct callchain_cursor *cursor = get_tls_callchain_cursor();
+
+ if (cursor == NULL)
+ return;
+
printf(" ");
+
sample__fprintf_sym(sample, al, 0,
EVSEL__PRINT_SYM | EVSEL__PRINT_ONELINE |
EVSEL__PRINT_CALLCHAIN_ARROW |
EVSEL__PRINT_SKIP_IGNORED,
- &callchain_cursor, symbol_conf.bt_stop_list,
+ cursor, symbol_conf.bt_stop_list,
stdout);
}

diff --git a/tools/perf/builtin-lock.c b/tools/perf/builtin-lock.c
index fc8356bd6e3a1d99..8b505e1e5002a680 100644
--- a/tools/perf/builtin-lock.c
+++ b/tools/perf/builtin-lock.c
@@ -911,7 +911,7 @@ static int lock_contention_caller(struct evsel *evsel, struct perf_sample *sampl
char *buf, int size)
{
struct thread *thread;
- struct callchain_cursor *cursor = &callchain_cursor;
+ struct callchain_cursor *cursor;
struct machine *machine = &session->machines.host;
struct symbol *sym;
int skip = 0;
@@ -925,6 +925,8 @@ static int lock_contention_caller(struct evsel *evsel, struct perf_sample *sampl
if (thread == NULL)
return -1;

+ cursor = get_tls_callchain_cursor();
+
/* use caller function name from the callchain */
ret = thread__resolve_callchain(thread, cursor, evsel, sample,
NULL, NULL, max_stack_depth);
@@ -962,7 +964,7 @@ static int lock_contention_caller(struct evsel *evsel, struct perf_sample *sampl

static u64 callchain_id(struct evsel *evsel, struct perf_sample *sample)
{
- struct callchain_cursor *cursor = &callchain_cursor;
+ struct callchain_cursor *cursor;
struct machine *machine = &session->machines.host;
struct thread *thread;
u64 hash = 0;
@@ -973,6 +975,7 @@ static u64 callchain_id(struct evsel *evsel, struct perf_sample *sample)
if (thread == NULL)
return -1;

+ cursor = get_tls_callchain_cursor();
/* use caller function name from the callchain */
ret = thread__resolve_callchain(thread, cursor, evsel, sample,
NULL, NULL, max_stack_depth);
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index cd79068200e5653e..c9ddf73689cd6c07 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -2111,7 +2111,7 @@ static void timehist_print_sample(struct perf_sched *sched,
EVSEL__PRINT_SYM | EVSEL__PRINT_ONELINE |
EVSEL__PRINT_CALLCHAIN_ARROW |
EVSEL__PRINT_SKIP_IGNORED,
- &callchain_cursor, symbol_conf.bt_stop_list, stdout);
+ get_tls_callchain_cursor(), symbol_conf.bt_stop_list, stdout);

out:
printf("\n");
@@ -2196,7 +2196,7 @@ static void save_task_callchain(struct perf_sched *sched,
struct evsel *evsel,
struct machine *machine)
{
- struct callchain_cursor *cursor = &callchain_cursor;
+ struct callchain_cursor *cursor;
struct thread *thread;

/* want main thread for process - has maps */
@@ -2209,6 +2209,8 @@ static void save_task_callchain(struct perf_sched *sched,
if (!sched->show_callchain || sample->callchain == NULL)
return;

+ cursor = get_tls_callchain_cursor();
+
if (thread__resolve_callchain(thread, cursor, evsel, sample,
NULL, NULL, sched->max_stack + 2) != 0) {
if (verbose > 0)
@@ -2338,10 +2340,16 @@ static void save_idle_callchain(struct perf_sched *sched,
struct idle_thread_runtime *itr,
struct perf_sample *sample)
{
+ struct callchain_cursor *cursor;
+
if (!sched->show_callchain || sample->callchain == NULL)
return;

- callchain_cursor__copy(&itr->cursor, &callchain_cursor);
+ cursor = get_tls_callchain_cursor();
+ if (cursor == NULL)
+ return;
+
+ callchain_cursor__copy(&itr->cursor, cursor);
}

static struct thread *timehist_get_thread(struct perf_sched *sched,
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 6a1e75f06832bf35..66d4eb185ca7428a 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -2411,8 +2411,7 @@ static int trace__fprintf_sys_enter(struct trace *trace, struct evsel *evsel,
}

static int trace__resolve_callchain(struct trace *trace, struct evsel *evsel,
- struct perf_sample *sample,
- struct callchain_cursor *cursor)
+ struct perf_sample *sample, struct callchain_cursor *cursor)
{
struct addr_location al;
int max_stack = evsel->core.attr.sample_max_stack ?
@@ -2437,7 +2436,7 @@ static int trace__fprintf_callchain(struct trace *trace, struct perf_sample *sam
EVSEL__PRINT_DSO |
EVSEL__PRINT_UNKNOWN_AS_ADDR;

- return sample__fprintf_callchain(sample, 38, print_opts, &callchain_cursor, symbol_conf.bt_stop_list, trace->output);
+ return sample__fprintf_callchain(sample, 38, print_opts, get_tls_callchain_cursor(), symbol_conf.bt_stop_list, trace->output);
}

static const char *errno_to_name(struct evsel *evsel, int err)
@@ -2491,9 +2490,11 @@ static int trace__sys_exit(struct trace *trace, struct evsel *evsel,
goto out;

if (sample->callchain) {
- callchain_ret = trace__resolve_callchain(trace, evsel, sample, &callchain_cursor);
+ struct callchain_cursor *cursor = get_tls_callchain_cursor();
+
+ callchain_ret = trace__resolve_callchain(trace, evsel, sample, cursor);
if (callchain_ret == 0) {
- if (callchain_cursor.nr < trace->min_stack)
+ if (cursor->nr < trace->min_stack)
goto out;
callchain_ret = 1;
}
@@ -2795,9 +2796,11 @@ static int trace__event_handler(struct trace *trace, struct evsel *evsel,
thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);

if (sample->callchain) {
- callchain_ret = trace__resolve_callchain(trace, evsel, sample, &callchain_cursor);
+ struct callchain_cursor *cursor = get_tls_callchain_cursor();
+
+ callchain_ret = trace__resolve_callchain(trace, evsel, sample, cursor);
if (callchain_ret == 0) {
- if (callchain_cursor.nr < trace->min_stack)
+ if (cursor->nr < trace->min_stack)
goto out;
callchain_ret = 1;
}
@@ -2899,9 +2902,11 @@ static int trace__pgfault(struct trace *trace,
thread = machine__findnew_thread(trace->host, sample->pid, sample->tid);

if (sample->callchain) {
- callchain_ret = trace__resolve_callchain(trace, evsel, sample, &callchain_cursor);
+ struct callchain_cursor *cursor = get_tls_callchain_cursor();
+
+ callchain_ret = trace__resolve_callchain(trace, evsel, sample, cursor);
if (callchain_ret == 0) {
- if (callchain_cursor.nr < trace->min_stack)
+ if (cursor->nr < trace->min_stack)
goto out_put;
callchain_ret = 1;
}
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 7e9bd3b6be9f2655..aee937d14fbbf101 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -987,6 +987,9 @@ int callchain_append(struct callchain_root *root,
struct callchain_cursor *cursor,
u64 period)
{
+ if (cursor == NULL)
+ return -1;
+
if (!cursor->nr)
return 0;

@@ -1601,6 +1604,8 @@ struct callchain_cursor *get_tls_callchain_cursor(void)
cursor = pthread_getspecific(callchain_cursor);
if (!cursor) {
cursor = zalloc(sizeof(*cursor));
+ if (!cursor)
+ pr_debug3("%s: not enough memory\n", __func__);
pthread_setspecific(callchain_cursor, cursor);
}
return cursor;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index ce9410018cf7a1d5..d2618a47deca8f27 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -209,6 +209,8 @@ int callchain_cursor_append(struct callchain_cursor *cursor, u64 ip,
/* Close a cursor writing session. Initialize for the reader */
static inline void callchain_cursor_commit(struct callchain_cursor *cursor)
{
+ if (cursor == NULL)
+ return;
cursor->curr = cursor->first;
cursor->pos = 0;
}
@@ -217,7 +219,7 @@ static inline void callchain_cursor_commit(struct callchain_cursor *cursor)
static inline struct callchain_cursor_node *
callchain_cursor_current(struct callchain_cursor *cursor)
{
- if (cursor->pos == cursor->nr)
+ if (cursor == NULL || cursor->pos == cursor->nr)
return NULL;

return cursor->curr;
diff --git a/tools/perf/util/evsel_fprintf.c b/tools/perf/util/evsel_fprintf.c
index cf45ca0e768fb882..8719b3cb5646614d 100644
--- a/tools/perf/util/evsel_fprintf.c
+++ b/tools/perf/util/evsel_fprintf.c
@@ -127,6 +127,9 @@ int sample__fprintf_callchain(struct perf_sample *sample, int left_alignment,
char s = print_oneline ? ' ' : '\t';
bool first = true;

+ if (cursor == NULL)
+ return fprintf(fp, "<not enough memory for the callchain cursor>%s", print_oneline ? "" : "\n");
+
if (sample->callchain) {
callchain_cursor_commit(cursor);

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 4004c0915e4f471a..efaf7ac784fc8483 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1031,6 +1031,9 @@ iter_prepare_cumulative_entry(struct hist_entry_iter *iter,
struct hist_entry **he_cache;
struct callchain_cursor *cursor = get_tls_callchain_cursor();

+ if (cursor == NULL)
+ return -ENOMEM;
+
callchain_cursor_commit(cursor);

/*
@@ -1135,6 +1138,9 @@ iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
struct callchain_cursor cursor, *tls_cursor = get_tls_callchain_cursor();
bool fast = hists__has(he_tmp.hists, sym);

+ if (tls_cursor == NULL)
+ return -ENOMEM;
+
callchain_cursor_snapshot(&cursor, tls_cursor);

callchain_cursor_advance(tls_cursor);
@@ -1571,6 +1577,9 @@ static int hists__hierarchy_insert_entry(struct hists *hists,
symbol_conf.use_callchain) {
struct callchain_cursor *cursor = get_tls_callchain_cursor();

+ if (cursor == NULL)
+ return -1;
+
callchain_cursor_reset(cursor);
if (callchain_merge(cursor,
new_he->callchain,
@@ -1615,11 +1624,13 @@ static int hists__collapse_insert_entry(struct hists *hists,
if (hist_entry__has_callchains(he) && symbol_conf.use_callchain) {
struct callchain_cursor *cursor = get_tls_callchain_cursor();

- callchain_cursor_reset(cursor);
- if (callchain_merge(cursor,
- iter->callchain,
- he->callchain) < 0)
- ret = -1;
+ if (cursor != NULL) {
+ callchain_cursor_reset(cursor);
+ if (callchain_merge(cursor, iter->callchain, he->callchain) < 0)
+ ret = -1;
+ } else {
+ ret = 0;
+ }
}
hist_entry__delete(he);
return ret;
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index bdad4b8bf77deb4e..4e62843d51b7dbf9 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -3180,6 +3180,9 @@ int thread__resolve_callchain(struct thread *thread,
{
int ret = 0;

+ if (cursor == NULL)
+ return -ENOMEM;
+
callchain_cursor_reset(cursor);

if (callchain_param.order == ORDER_CALLEE) {
diff --git a/tools/perf/util/scripting-engines/trace-event-perl.c b/tools/perf/util/scripting-engines/trace-event-perl.c
index 65b761d83a1f8c3c..603091317bed9be4 100644
--- a/tools/perf/util/scripting-engines/trace-event-perl.c
+++ b/tools/perf/util/scripting-engines/trace-event-perl.c
@@ -260,6 +260,7 @@ static SV *perl_process_callchain(struct perf_sample *sample,
struct evsel *evsel,
struct addr_location *al)
{
+ struct callchain_cursor *cursor;
AV *list;

list = newAV();
@@ -269,18 +270,20 @@ static SV *perl_process_callchain(struct perf_sample *sample,
if (!symbol_conf.use_callchain || !sample->callchain)
goto exit;

- if (thread__resolve_callchain(al->thread, &callchain_cursor, evsel,
+ cursor = get_tls_callchain_cursor();
+
+ if (thread__resolve_callchain(al->thread, cursor, evsel,
sample, NULL, NULL, scripting_max_stack) != 0) {
pr_err("Failed to resolve callchain. Skipping\n");
goto exit;
}
- callchain_cursor_commit(&callchain_cursor);
+ callchain_cursor_commit(cursor);


while (1) {
HV *elem;
struct callchain_cursor_node *node;
- node = callchain_cursor_current(&callchain_cursor);
+ node = callchain_cursor_current(cursor);
if (!node)
break;

@@ -328,7 +331,7 @@ static SV *perl_process_callchain(struct perf_sample *sample,
}
}

- callchain_cursor_advance(&callchain_cursor);
+ callchain_cursor_advance(cursor);
av_push(list, newRV_noinc((SV*)elem));
}


2023-06-09 19:56:45

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 06/26] perf addr_location: Add init/exit/copy functions

Em Thu, Jun 08, 2023 at 04:28:03PM -0700, Ian Rogers escreveu:
> +++ b/tools/perf/builtin-kmem.c
> @@ -399,7 +399,9 @@ static u64 find_callsite(struct evsel *evsel, struct perf_sample *sample)
> struct addr_location al;
> struct machine *machine = &kmem_session->machines.host;
> struct callchain_cursor_node *node;
> + u64 result;
>
> + addr_location__init(&al);
> if (alloc_func_list == NULL) {
> if (build_alloc_func_list() < 0)
> goto out;
> @@ -427,16 +429,19 @@ static u64 find_callsite(struct evsel *evsel, struct perf_sample *sample)
> else
> addr = node->ip;
>
> - return addr;
> + result = addr;
> + goto out;
> } else
> pr_debug3("skipping alloc function: %s\n", caller->name);
>
> callchain_cursor_advance(&callchain_cursor);
> }
>
> -out:
> pr_debug2("unknown callsite: %"PRIx64 "\n", sample->ip);
> - return sample->ip;
> + result = sample->ip;
> +out:
> + addr_location__exit(&al);
> + return result;
> }

I needed this to make sure result is set to something, mostly keeping
the previous logic as build_alloc_func_list() already does
debugging/error prints about what went wrong if it takes the 'goto out'.

- Arnaldo

diff --git a/tools/perf/builtin-kmem.c b/tools/perf/builtin-kmem.c
index a11f280d20bd3d12..96a6611e4e53f448 100644
--- a/tools/perf/builtin-kmem.c
+++ b/tools/perf/builtin-kmem.c
@@ -399,7 +399,7 @@ static u64 find_callsite(struct evsel *evsel, struct perf_sample *sample)
struct addr_location al;
struct machine *machine = &kmem_session->machines.host;
struct callchain_cursor_node *node;
- u64 result;
+ u64 result = sample->ip;

addr_location__init(&al);
if (alloc_func_list == NULL) {
@@ -438,7 +438,6 @@ static u64 find_callsite(struct evsel *evsel, struct perf_sample *sample)
}

pr_debug2("unknown callsite: %"PRIx64 "\n", sample->ip);
- result = sample->ip;
out:
addr_location__exit(&al);
return result;

2023-06-12 14:39:02

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 26/26] perf hist: Fix srcline memory leak

Em Mon, Jun 12, 2023 at 11:13:59AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Thu, Jun 08, 2023 at 04:28:23PM -0700, Ian Rogers escreveu:
> > srcline isn't freed if it is SRCLINE_UNKNOWN. Avoid strduping in this
> > case as such strdups are redundant and leak memory.
>
> The patch is ok as its what the rest of the code is doing, i.e. strcmp()
> to check if a srcline is the unknown one, but how about the following
> patch on top of yours?

[acme@quaco perf-tools-next]$ strings ~/bin/perf | grep '??:0'
??:0
SRCLINE_UNKNOWN ((char *) "??:0")
[acme@quaco perf-tools-next]$

> From 5163e54c1ed3d476f6b4e7f938861039bd4eec7c Mon Sep 17 00:00:00 2001
> From: Arnaldo Carvalho de Melo <[email protected]>
> Date: Mon, 12 Jun 2023 11:10:46 -0300
> Subject: [PATCH 1/1] perf srcline: Optimize comparision against
> SRCLINE_UNKNOWN
>
> This is a string constant that gets returned and then strcmp() around,
> we can instead just do a pointer comparision.
>
> Cc: Adrian Hunter <[email protected]>
> Cc: Ian Rogers <[email protected]>
> Cc: Jiri Olsa <[email protected]>
> Cc: Namhyung Kim <[email protected]>
> Link: https://lore.kernel.org/lkml/
> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
> ---
> tools/perf/builtin-diff.c | 4 ++--
> tools/perf/util/block-info.c | 4 ++--
> tools/perf/util/hist.c | 2 +-
> tools/perf/util/map.c | 2 +-
> tools/perf/util/sort.c | 2 +-
> tools/perf/util/srcline.c | 2 +-
> 6 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
> index eec89567ae483604..e8a1b16aa5f83f4f 100644
> --- a/tools/perf/builtin-diff.c
> +++ b/tools/perf/builtin-diff.c
> @@ -1378,8 +1378,8 @@ static int cycles_printf(struct hist_entry *he, struct hist_entry *pair,
> end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
> he->ms.sym);
>
> - if ((strncmp(start_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0) &&
> - (strncmp(end_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)) {
> + if (start_line != SRCLINE_UNKNOWN &&
> + end_line != SRCLINE_UNKNOWN) {
> scnprintf(buf, sizeof(buf), "[%s -> %s] %4ld",
> start_line, end_line, block_he->diff.cycles);
> } else {
> diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
> index 08279b1b65e5a4b0..fe4c17248799f0a2 100644
> --- a/tools/perf/util/block-info.c
> +++ b/tools/perf/util/block-info.c
> @@ -296,8 +296,8 @@ static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
> end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
> he->ms.sym);
>
> - if ((strncmp(start_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0) &&
> - (strncmp(end_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)) {
> + if (start_line != SRCLINE_UNKNOWN) &&
> + end_line != SRCLINE_UNKNOWN) {
> scnprintf(buf, sizeof(buf), "[%s -> %s]",
> start_line, end_line);
> } else {
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 0a10bcc6ec95b5e0..3dc8a4968beb9c01 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -484,7 +484,7 @@ static int hist_entry__init(struct hist_entry *he,
> goto err_infos;
> }
>
> - if (he->srcline && strcmp(he->srcline, SRCLINE_UNKNOWN)) {
> + if (he->srcline && he->srcline != SRCLINE_UNKNOWN) {
> he->srcline = strdup(he->srcline);
> if (he->srcline == NULL)
> goto err_rawdata;
> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> index c77e2fce6a379e7f..f30d34903aa4eabe 100644
> --- a/tools/perf/util/map.c
> +++ b/tools/perf/util/map.c
> @@ -496,7 +496,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
>
> if (dso) {
> char *srcline = map__srcline(map, addr, NULL);
> - if (strncmp(srcline, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)
> + if (srcline != SRCLINE_UNKNOWN)
> ret = fprintf(fp, "%s%s", prefix, srcline);
> zfree_srcline(&srcline);
> }
> diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> index 047c3606802f5b7f..6aa1c7f2b4448b30 100644
> --- a/tools/perf/util/sort.c
> +++ b/tools/perf/util/sort.c
> @@ -643,7 +643,7 @@ static char *hist_entry__get_srcfile(struct hist_entry *e)
>
> sf = __get_srcline(map__dso(map), map__rip_2objdump(map, e->ip),
> e->ms.sym, false, true, true, e->ip);
> - if (!strcmp(sf, SRCLINE_UNKNOWN))
> + if (sf == SRCLINE_UNKNOWN)
> return no_srcfile;
> p = strchr(sf, ':');
> if (p && *sf) {
> diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
> index b8e596528d7e7e5e..48a04f42b308b080 100644
> --- a/tools/perf/util/srcline.c
> +++ b/tools/perf/util/srcline.c
> @@ -809,7 +809,7 @@ void zfree_srcline(char **srcline)
> if (*srcline == NULL)
> return;
>
> - if (strcmp(*srcline, SRCLINE_UNKNOWN))
> + if (*srcline != SRCLINE_UNKNOWN)
> free(*srcline);
>
> *srcline = NULL;
> --
> 2.39.2
>
>
> > Signed-off-by: Ian Rogers <[email protected]>
> > ---
> > tools/perf/util/hist.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> > index 77cb2cc83bb9..cc6f7f51faa5 100644
> > --- a/tools/perf/util/hist.c
> > +++ b/tools/perf/util/hist.c
> > @@ -484,7 +484,7 @@ static int hist_entry__init(struct hist_entry *he,
> > goto err_infos;
> > }
> >
> > - if (he->srcline) {
> > + if (he->srcline && strcmp(he->srcline, SRCLINE_UNKNOWN)) {
> > he->srcline = strdup(he->srcline);
> > if (he->srcline == NULL)
> > goto err_rawdata;
> > --
> > 2.41.0.162.gfafddb0af9-goog
> >
>
> --
>
> - Arnaldo

--

- Arnaldo

2023-06-12 14:41:00

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 26/26] perf hist: Fix srcline memory leak

Em Thu, Jun 08, 2023 at 04:28:23PM -0700, Ian Rogers escreveu:
> srcline isn't freed if it is SRCLINE_UNKNOWN. Avoid strduping in this
> case as such strdups are redundant and leak memory.

The patch is ok as its what the rest of the code is doing, i.e. strcmp()
to check if a srcline is the unknown one, but how about the following
patch on top of yours?

From 5163e54c1ed3d476f6b4e7f938861039bd4eec7c Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo <[email protected]>
Date: Mon, 12 Jun 2023 11:10:46 -0300
Subject: [PATCH 1/1] perf srcline: Optimize comparision against
SRCLINE_UNKNOWN

This is a string constant that gets returned and then strcmp() around,
we can instead just do a pointer comparision.

Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-diff.c | 4 ++--
tools/perf/util/block-info.c | 4 ++--
tools/perf/util/hist.c | 2 +-
tools/perf/util/map.c | 2 +-
tools/perf/util/sort.c | 2 +-
tools/perf/util/srcline.c | 2 +-
6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
index eec89567ae483604..e8a1b16aa5f83f4f 100644
--- a/tools/perf/builtin-diff.c
+++ b/tools/perf/builtin-diff.c
@@ -1378,8 +1378,8 @@ static int cycles_printf(struct hist_entry *he, struct hist_entry *pair,
end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
he->ms.sym);

- if ((strncmp(start_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0) &&
- (strncmp(end_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)) {
+ if (start_line != SRCLINE_UNKNOWN &&
+ end_line != SRCLINE_UNKNOWN) {
scnprintf(buf, sizeof(buf), "[%s -> %s] %4ld",
start_line, end_line, block_he->diff.cycles);
} else {
diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
index 08279b1b65e5a4b0..fe4c17248799f0a2 100644
--- a/tools/perf/util/block-info.c
+++ b/tools/perf/util/block-info.c
@@ -296,8 +296,8 @@ static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
he->ms.sym);

- if ((strncmp(start_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0) &&
- (strncmp(end_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)) {
+ if (start_line != SRCLINE_UNKNOWN) &&
+ end_line != SRCLINE_UNKNOWN) {
scnprintf(buf, sizeof(buf), "[%s -> %s]",
start_line, end_line);
} else {
diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 0a10bcc6ec95b5e0..3dc8a4968beb9c01 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -484,7 +484,7 @@ static int hist_entry__init(struct hist_entry *he,
goto err_infos;
}

- if (he->srcline && strcmp(he->srcline, SRCLINE_UNKNOWN)) {
+ if (he->srcline && he->srcline != SRCLINE_UNKNOWN) {
he->srcline = strdup(he->srcline);
if (he->srcline == NULL)
goto err_rawdata;
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index c77e2fce6a379e7f..f30d34903aa4eabe 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -496,7 +496,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,

if (dso) {
char *srcline = map__srcline(map, addr, NULL);
- if (strncmp(srcline, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)
+ if (srcline != SRCLINE_UNKNOWN)
ret = fprintf(fp, "%s%s", prefix, srcline);
zfree_srcline(&srcline);
}
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index 047c3606802f5b7f..6aa1c7f2b4448b30 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -643,7 +643,7 @@ static char *hist_entry__get_srcfile(struct hist_entry *e)

sf = __get_srcline(map__dso(map), map__rip_2objdump(map, e->ip),
e->ms.sym, false, true, true, e->ip);
- if (!strcmp(sf, SRCLINE_UNKNOWN))
+ if (sf == SRCLINE_UNKNOWN)
return no_srcfile;
p = strchr(sf, ':');
if (p && *sf) {
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index b8e596528d7e7e5e..48a04f42b308b080 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -809,7 +809,7 @@ void zfree_srcline(char **srcline)
if (*srcline == NULL)
return;

- if (strcmp(*srcline, SRCLINE_UNKNOWN))
+ if (*srcline != SRCLINE_UNKNOWN)
free(*srcline);

*srcline = NULL;
--
2.39.2


> Signed-off-by: Ian Rogers <[email protected]>
> ---
> tools/perf/util/hist.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 77cb2cc83bb9..cc6f7f51faa5 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -484,7 +484,7 @@ static int hist_entry__init(struct hist_entry *he,
> goto err_infos;
> }
>
> - if (he->srcline) {
> + if (he->srcline && strcmp(he->srcline, SRCLINE_UNKNOWN)) {
> he->srcline = strdup(he->srcline);
> if (he->srcline == NULL)
> goto err_rawdata;
> --
> 2.41.0.162.gfafddb0af9-goog
>

--

- Arnaldo

2023-06-12 15:20:08

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH v2 26/26] perf hist: Fix srcline memory leak

On Mon, Jun 12, 2023 at 7:16 AM Arnaldo Carvalho de Melo
<[email protected]> wrote:
>
> Em Mon, Jun 12, 2023 at 11:13:59AM -0300, Arnaldo Carvalho de Melo escreveu:
> > Em Thu, Jun 08, 2023 at 04:28:23PM -0700, Ian Rogers escreveu:
> > > srcline isn't freed if it is SRCLINE_UNKNOWN. Avoid strduping in this
> > > case as such strdups are redundant and leak memory.
> >
> > The patch is ok as its what the rest of the code is doing, i.e. strcmp()
> > to check if a srcline is the unknown one, but how about the following
> > patch on top of yours?
>
> [acme@quaco perf-tools-next]$ strings ~/bin/perf | grep '??:0'
> ??:0
> SRCLINE_UNKNOWN ((char *) "??:0")
> [acme@quaco perf-tools-next]$

Agreed, the strcmps make me nervous as they won't distinguish heap
from a global meaning we could end up with things like pointers to
freed memory. The comparison with the global is always going to be
same imo.

Acked-by: Ian Rogers <[email protected]>

Thanks,
Ian

> > From 5163e54c1ed3d476f6b4e7f938861039bd4eec7c Mon Sep 17 00:00:00 2001
> > From: Arnaldo Carvalho de Melo <[email protected]>
> > Date: Mon, 12 Jun 2023 11:10:46 -0300
> > Subject: [PATCH 1/1] perf srcline: Optimize comparision against
> > SRCLINE_UNKNOWN
> >
> > This is a string constant that gets returned and then strcmp() around,
> > we can instead just do a pointer comparision.
> >
> > Cc: Adrian Hunter <[email protected]>
> > Cc: Ian Rogers <[email protected]>
> > Cc: Jiri Olsa <[email protected]>
> > Cc: Namhyung Kim <[email protected]>
> > Link: https://lore.kernel.org/lkml/
> > Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
> > ---
> > tools/perf/builtin-diff.c | 4 ++--
> > tools/perf/util/block-info.c | 4 ++--
> > tools/perf/util/hist.c | 2 +-
> > tools/perf/util/map.c | 2 +-
> > tools/perf/util/sort.c | 2 +-
> > tools/perf/util/srcline.c | 2 +-
> > 6 files changed, 8 insertions(+), 8 deletions(-)
> >
> > diff --git a/tools/perf/builtin-diff.c b/tools/perf/builtin-diff.c
> > index eec89567ae483604..e8a1b16aa5f83f4f 100644
> > --- a/tools/perf/builtin-diff.c
> > +++ b/tools/perf/builtin-diff.c
> > @@ -1378,8 +1378,8 @@ static int cycles_printf(struct hist_entry *he, struct hist_entry *pair,
> > end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
> > he->ms.sym);
> >
> > - if ((strncmp(start_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0) &&
> > - (strncmp(end_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)) {
> > + if (start_line != SRCLINE_UNKNOWN &&
> > + end_line != SRCLINE_UNKNOWN) {
> > scnprintf(buf, sizeof(buf), "[%s -> %s] %4ld",
> > start_line, end_line, block_he->diff.cycles);
> > } else {
> > diff --git a/tools/perf/util/block-info.c b/tools/perf/util/block-info.c
> > index 08279b1b65e5a4b0..fe4c17248799f0a2 100644
> > --- a/tools/perf/util/block-info.c
> > +++ b/tools/perf/util/block-info.c
> > @@ -296,8 +296,8 @@ static int block_range_entry(struct perf_hpp_fmt *fmt, struct perf_hpp *hpp,
> > end_line = map__srcline(he->ms.map, bi->sym->start + bi->end,
> > he->ms.sym);
> >
> > - if ((strncmp(start_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0) &&
> > - (strncmp(end_line, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)) {
> > + if (start_line != SRCLINE_UNKNOWN) &&
> > + end_line != SRCLINE_UNKNOWN) {
> > scnprintf(buf, sizeof(buf), "[%s -> %s]",
> > start_line, end_line);
> > } else {
> > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> > index 0a10bcc6ec95b5e0..3dc8a4968beb9c01 100644
> > --- a/tools/perf/util/hist.c
> > +++ b/tools/perf/util/hist.c
> > @@ -484,7 +484,7 @@ static int hist_entry__init(struct hist_entry *he,
> > goto err_infos;
> > }
> >
> > - if (he->srcline && strcmp(he->srcline, SRCLINE_UNKNOWN)) {
> > + if (he->srcline && he->srcline != SRCLINE_UNKNOWN) {
> > he->srcline = strdup(he->srcline);
> > if (he->srcline == NULL)
> > goto err_rawdata;
> > diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> > index c77e2fce6a379e7f..f30d34903aa4eabe 100644
> > --- a/tools/perf/util/map.c
> > +++ b/tools/perf/util/map.c
> > @@ -496,7 +496,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
> >
> > if (dso) {
> > char *srcline = map__srcline(map, addr, NULL);
> > - if (strncmp(srcline, SRCLINE_UNKNOWN, strlen(SRCLINE_UNKNOWN)) != 0)
> > + if (srcline != SRCLINE_UNKNOWN)
> > ret = fprintf(fp, "%s%s", prefix, srcline);
> > zfree_srcline(&srcline);
> > }
> > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
> > index 047c3606802f5b7f..6aa1c7f2b4448b30 100644
> > --- a/tools/perf/util/sort.c
> > +++ b/tools/perf/util/sort.c
> > @@ -643,7 +643,7 @@ static char *hist_entry__get_srcfile(struct hist_entry *e)
> >
> > sf = __get_srcline(map__dso(map), map__rip_2objdump(map, e->ip),
> > e->ms.sym, false, true, true, e->ip);
> > - if (!strcmp(sf, SRCLINE_UNKNOWN))
> > + if (sf == SRCLINE_UNKNOWN)
> > return no_srcfile;
> > p = strchr(sf, ':');
> > if (p && *sf) {
> > diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
> > index b8e596528d7e7e5e..48a04f42b308b080 100644
> > --- a/tools/perf/util/srcline.c
> > +++ b/tools/perf/util/srcline.c
> > @@ -809,7 +809,7 @@ void zfree_srcline(char **srcline)
> > if (*srcline == NULL)
> > return;
> >
> > - if (strcmp(*srcline, SRCLINE_UNKNOWN))
> > + if (*srcline != SRCLINE_UNKNOWN)
> > free(*srcline);
> >
> > *srcline = NULL;
> > --
> > 2.39.2
> >
> >
> > > Signed-off-by: Ian Rogers <[email protected]>
> > > ---
> > > tools/perf/util/hist.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> > > index 77cb2cc83bb9..cc6f7f51faa5 100644
> > > --- a/tools/perf/util/hist.c
> > > +++ b/tools/perf/util/hist.c
> > > @@ -484,7 +484,7 @@ static int hist_entry__init(struct hist_entry *he,
> > > goto err_infos;
> > > }
> > >
> > > - if (he->srcline) {
> > > + if (he->srcline && strcmp(he->srcline, SRCLINE_UNKNOWN)) {
> > > he->srcline = strdup(he->srcline);
> > > if (he->srcline == NULL)
> > > goto err_rawdata;
> > > --
> > > 2.41.0.162.gfafddb0af9-goog
> > >
> >
> > --
> >
> > - Arnaldo
>
> --
>
> - Arnaldo

2023-06-12 17:30:59

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 26/26] perf hist: Fix srcline memory leak

Em Mon, Jun 12, 2023 at 07:46:14AM -0700, Ian Rogers escreveu:
> On Mon, Jun 12, 2023 at 7:16 AM Arnaldo Carvalho de Melo
> <[email protected]> wrote:
> >
> > Em Mon, Jun 12, 2023 at 11:13:59AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > Em Thu, Jun 08, 2023 at 04:28:23PM -0700, Ian Rogers escreveu:
> > > > srcline isn't freed if it is SRCLINE_UNKNOWN. Avoid strduping in this
> > > > case as such strdups are redundant and leak memory.
> > >
> > > The patch is ok as its what the rest of the code is doing, i.e. strcmp()
> > > to check if a srcline is the unknown one, but how about the following
> > > patch on top of yours?
> >
> > [acme@quaco perf-tools-next]$ strings ~/bin/perf | grep '??:0'
> > ??:0
> > SRCLINE_UNKNOWN ((char *) "??:0")
> > [acme@quaco perf-tools-next]$
>
> Agreed, the strcmps make me nervous as they won't distinguish heap
> from a global meaning we could end up with things like pointers to
> freed memory. The comparison with the global is always going to be
> same imo.
>
> Acked-by: Ian Rogers <[email protected]>

Thanks, applied and added your Acked-by.

- Arnaldo

2023-06-12 21:28:27

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH v2 26/26] perf hist: Fix srcline memory leak

On Mon, Jun 12, 2023 at 02:23:41PM -0300, Arnaldo Carvalho de Melo wrote:
> Em Mon, Jun 12, 2023 at 07:46:14AM -0700, Ian Rogers escreveu:
> > On Mon, Jun 12, 2023 at 7:16 AM Arnaldo Carvalho de Melo
> > <[email protected]> wrote:
> > >
> > > Em Mon, Jun 12, 2023 at 11:13:59AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > Em Thu, Jun 08, 2023 at 04:28:23PM -0700, Ian Rogers escreveu:
> > > > > srcline isn't freed if it is SRCLINE_UNKNOWN. Avoid strduping in this
> > > > > case as such strdups are redundant and leak memory.
> > > >
> > > > The patch is ok as its what the rest of the code is doing, i.e. strcmp()
> > > > to check if a srcline is the unknown one, but how about the following
> > > > patch on top of yours?
> > >
> > > [acme@quaco perf-tools-next]$ strings ~/bin/perf | grep '??:0'
> > > ??:0
> > > SRCLINE_UNKNOWN ((char *) "??:0")
> > > [acme@quaco perf-tools-next]$
> >
> > Agreed, the strcmps make me nervous as they won't distinguish heap
> > from a global meaning we could end up with things like pointers to
> > freed memory. The comparison with the global is always going to be
> > same imo.
> >
> > Acked-by: Ian Rogers <[email protected]>
>
> Thanks, applied and added your Acked-by.

Actually was there another patch that turned it into a explicit global?

At least in my tree it isn't:

util/srcline.h
28:#define SRCLINE_UNKNOWN ((char *) "??:0")

Without any explicit global it's a bit dangerous because you rely on the
linker doing string deduplication. While it normally does that there might be
odd corner cases where it doesn't.

-Andi

2023-06-12 21:52:05

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH v2 26/26] perf hist: Fix srcline memory leak

Em Mon, Jun 12, 2023 at 02:16:34PM -0700, Andi Kleen escreveu:
> On Mon, Jun 12, 2023 at 02:23:41PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Mon, Jun 12, 2023 at 07:46:14AM -0700, Ian Rogers escreveu:
> > > On Mon, Jun 12, 2023 at 7:16 AM Arnaldo Carvalho de Melo
> > > <[email protected]> wrote:
> > > >
> > > > Em Mon, Jun 12, 2023 at 11:13:59AM -0300, Arnaldo Carvalho de Melo escreveu:
> > > > > Em Thu, Jun 08, 2023 at 04:28:23PM -0700, Ian Rogers escreveu:
> > > > > > srcline isn't freed if it is SRCLINE_UNKNOWN. Avoid strduping in this
> > > > > > case as such strdups are redundant and leak memory.
> > > > >
> > > > > The patch is ok as its what the rest of the code is doing, i.e. strcmp()
> > > > > to check if a srcline is the unknown one, but how about the following
> > > > > patch on top of yours?
> > > >
> > > > [acme@quaco perf-tools-next]$ strings ~/bin/perf | grep '??:0'
> > > > ??:0
> > > > SRCLINE_UNKNOWN ((char *) "??:0")
> > > > [acme@quaco perf-tools-next]$
> > >
> > > Agreed, the strcmps make me nervous as they won't distinguish heap
> > > from a global meaning we could end up with things like pointers to
> > > freed memory. The comparison with the global is always going to be
> > > same imo.
> > >
> > > Acked-by: Ian Rogers <[email protected]>
> >
> > Thanks, applied and added your Acked-by.
>
> Actually was there another patch that turned it into a explicit global?
>
> At least in my tree it isn't:
>
> util/srcline.h
> 28:#define SRCLINE_UNKNOWN ((char *) "??:0")
>
> Without any explicit global it's a bit dangerous because you rely on the
> linker doing string deduplication. While it normally does that there might be
> odd corner cases where it doesn't.

Yeah, several gcc and clang versions complained with:

2 65.38 almalinux:9 : FAIL clang version 15.0.7 (Red Hat 15.0.7-2.el9)
builtin-diff.c:1381:17: error: result of comparison against a string literal is unspecified (use an explicit string comparison function instead) [-Werror,-Wstring-compare]
if (start_line != SRCLINE_UNKNOWN &&

So I added the following on top:

diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index 48a04f42b308b080..aec596a0b0bbec0f 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -23,6 +23,8 @@

bool srcline_full_filename;

+char *srcline__unknown = (char *)"??:0";
+
static const char *dso__name(struct dso *dso)
{
const char *dso_name;
diff --git a/tools/perf/util/srcline.h b/tools/perf/util/srcline.h
index a15c7db9058ece96..167645bcff0755e2 100644
--- a/tools/perf/util/srcline.h
+++ b/tools/perf/util/srcline.h
@@ -25,7 +25,8 @@ char *srcline__tree_find(struct rb_root_cached *tree, u64 addr);
/* delete all srclines within the tree */
void srcline__tree_delete(struct rb_root_cached *tree);

-#define SRCLINE_UNKNOWN ((char *) "??:0")
+extern char *srcline__unknown;
+#define SRCLINE_UNKNOWN srcline__unknown

struct inline_list {
struct symbol *symbol;