Subject: [PATCH perf/core 00/13] perf memory/refcnt leak fixes

Hi,

Here is a series to fix some memory leaks and refcount
leaks on map and dso. This also includes the refcnt APIs
with backtrace debugging feature.

The story has started from the posible memory leak report
reported by Wnag Nan.
I've tried to use valgrind to ensure the perf probe doesn't
have other memory leaks. The result is here:

----
# valgrind ./perf probe vfs_read
==17521== Memcheck, a memory error detector
==17521== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==17521== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==17521== Command: ./perf probe vfs_read
==17521==
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

==17521==
==17521== HEAP SUMMARY:
==17521== in use at exit: 3,512,761 bytes in 38,012 blocks
==17521== total heap usage: 74,723 allocs, 36,711 frees, 24,014,927
bytes allocated
==17521==
==17521== LEAK SUMMARY:
==17521== definitely lost: 6,857 bytes in 49 blocks
==17521== indirectly lost: 3,501,287 bytes in 37,891 blocks
==17521== possibly lost: 0 bytes in 0 blocks
==17521== still reachable: 4,617 bytes in 72 blocks
==17521== suppressed: 0 bytes in 0 blocks
==17521== Rerun with --leak-check=full to see details of leaked memory
==17521==
==17521== For counts of detected and suppressed errors, rerun with: -v
==17521== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
----

Oops! It leaked almost 4 MB memories. I've tried to find
the root causes, and what I've found is there are many
leaks in not only perf-probe specific code, but also maps
and dsos (and some other pieces).

The first 3 patches are just fixing 'easy' memory leaks. However,
most of the leaks are caused by refcnt. Since valgrind seems not
able to debug this kind of issues, I introduced a hand-made refcnt
backtrace APIs for debugging.
The rest of patches are for fixing refcnt leak bugs and replcing
refcnt apis.

After all, most of the issues are gone, except for just a few issues
in elfutils. I'll continue to investigate that.

----
valgrind ./perf probe vfs_read
==29521== Memcheck, a memory error detector
==29521== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==29521== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
==29521== Command: ./perf probe vfs_read
==29521==
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

==29521==
==29521== HEAP SUMMARY:
==29521== in use at exit: 5,137 bytes in 75 blocks
==29521== total heap usage: 74,723 allocs, 74,648 frees, 24,014,927
bytes allocated
==29521==
==29521== LEAK SUMMARY:
==29521== definitely lost: 520 bytes in 3 blocks
==29521== indirectly lost: 0 bytes in 0 blocks
==29521== possibly lost: 0 bytes in 0 blocks
==29521== still reachable: 4,617 bytes in 72 blocks
==29521== suppressed: 0 bytes in 0 blocks
==29521== Rerun with --leak-check=full to see details of leaked memory
==29521==
==29521== For counts of detected and suppressed errors, rerun with: -v
==29521== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
----

Anyway, I decided to release these fixes and the debugging feature
because at least this will improve perf quality.

Thank you,

---

Masami Hiramatsu (13):
perf probe: Fix to free temporal Dwarf_Frame
perf: Make perf_exec_path always returns malloc'd string
perf: Introduce generic refcount APIs with debug feature
perf: make map to use refcnt
perf: Fix machine__findnew_module_map to put registered map
perf: Fix machine__destroy_kernel_maps to put vmlinux_maps
perf: Fix to destroy kernel maps when machine exits
perf: Fix to put new map after inserting to map_groups in dso__load_sym
perf: Make dso to use refcnt for debug
perf: Fix __dsos__addnew to put dso after adding it to the list
perf: Fix machine__create_kernel_maps to put kernel dso
perf: Fix machine__findnew_module_map to put dso
perf: Fix dso__load_sym to put dso


tools/perf/config/Makefile | 5 ++
tools/perf/util/Build | 1
tools/perf/util/dso.c | 9 ++-
tools/perf/util/exec_cmd.c | 20 ++++--
tools/perf/util/exec_cmd.h | 5 +-
tools/perf/util/help.c | 6 +-
tools/perf/util/machine.c | 17 ++++-
tools/perf/util/map.c | 7 +-
tools/perf/util/map.h | 3 +
tools/perf/util/probe-finder.c | 9 ++-
tools/perf/util/refcnt.c | 125 ++++++++++++++++++++++++++++++++++++++++
tools/perf/util/refcnt.h | 65 +++++++++++++++++++++
tools/perf/util/symbol-elf.c | 4 +
13 files changed, 250 insertions(+), 26 deletions(-)
create mode 100644 tools/perf/util/refcnt.c
create mode 100644 tools/perf/util/refcnt.h


--
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: [email protected]


Subject: [PATCH perf/core 01/13] perf probe: Fix to free temporal Dwarf_Frame

Since dwarf_cfi_addrframe returns malloc'd Dwarf_Frame
object, it has to be freed after used.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/probe-finder.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 63993d7..4d7d4f4 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -683,21 +683,24 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
ret = dwarf_getlocation_addr(&fb_attr, pf->addr, &pf->fb_ops, &nops, 1);
if (ret <= 0 || nops == 0) {
pf->fb_ops = NULL;
+ ret = 0;
#if _ELFUTILS_PREREQ(0, 142)
} else if (nops == 1 && pf->fb_ops[0].atom == DW_OP_call_frame_cfa &&
pf->cfi != NULL) {
- Dwarf_Frame *frame;
+ Dwarf_Frame *frame = NULL;
if (dwarf_cfi_addrframe(pf->cfi, pf->addr, &frame) != 0 ||
dwarf_frame_cfa(frame, &pf->fb_ops, &nops) != 0) {
pr_warning("Failed to get call frame on 0x%jx\n",
(uintmax_t)pf->addr);
- return -ENOENT;
+ ret = -ENOENT;
}
+ free(frame);
#endif
}

/* Call finder's callback handler */
- ret = pf->callback(sc_die, pf);
+ if (ret >= 0)
+ ret = pf->callback(sc_die, pf);

/* *pf->fb_ops will be cached in libdw. Don't free it. */
pf->fb_ops = NULL;

Subject: [PATCH perf/core 02/13] perf: Make perf_exec_path always returns malloc'd string

Since system_path() returns malloc'd string if given path is
not an absolute path, perf_exec_path sometimes returns static
string and sometimes returns malloc'd string depends on the
environment variables or command options.

This causes a memory leak because caller can not free the
returned string.

This fixes perf_exec_path and system_path to always return
malloc'd string, so caller can always free it.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/exec_cmd.c | 20 ++++++++++++--------
tools/perf/util/exec_cmd.h | 5 +++--
tools/perf/util/help.c | 6 ++++--
3 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/tools/perf/util/exec_cmd.c b/tools/perf/util/exec_cmd.c
index 7adf4ad..7031ffc 100644
--- a/tools/perf/util/exec_cmd.c
+++ b/tools/perf/util/exec_cmd.c
@@ -9,17 +9,17 @@
static const char *argv_exec_path;
static const char *argv0_path;

-const char *system_path(const char *path)
+char *system_path(const char *path)
{
static const char *prefix = PREFIX;
struct strbuf d = STRBUF_INIT;

if (is_absolute_path(path))
- return path;
+ return strdup(path);

strbuf_addf(&d, "%s/%s", prefix, path);
path = strbuf_detach(&d, NULL);
- return path;
+ return (char *)path;
}

const char *perf_extract_argv0_path(const char *argv0)
@@ -52,16 +52,18 @@ void perf_set_argv_exec_path(const char *exec_path)


/* Returns the highest-priority, location to look for perf programs. */
-const char *perf_exec_path(void)
+char *perf_exec_path(void)
{
- const char *env;
+ char *env;

if (argv_exec_path)
- return argv_exec_path;
+ return strdup(argv_exec_path);

env = getenv(EXEC_PATH_ENVIRONMENT);
if (env && *env) {
- return env;
+ env = strdup(env);
+ if (env)
+ return env;
}

return system_path(PERF_EXEC_PATH);
@@ -83,9 +85,11 @@ void setup_path(void)
{
const char *old_path = getenv("PATH");
struct strbuf new_path = STRBUF_INIT;
+ char *tmp = perf_exec_path();

- add_path(&new_path, perf_exec_path());
+ add_path(&new_path, tmp);
add_path(&new_path, argv0_path);
+ free(tmp);

if (old_path)
strbuf_addstr(&new_path, old_path);
diff --git a/tools/perf/util/exec_cmd.h b/tools/perf/util/exec_cmd.h
index bc4b915..48b4175 100644
--- a/tools/perf/util/exec_cmd.h
+++ b/tools/perf/util/exec_cmd.h
@@ -3,10 +3,11 @@

extern void perf_set_argv_exec_path(const char *exec_path);
extern const char *perf_extract_argv0_path(const char *path);
-extern const char *perf_exec_path(void);
extern void setup_path(void);
extern int execv_perf_cmd(const char **argv); /* NULL terminated */
extern int execl_perf_cmd(const char *cmd, ...);
-extern const char *system_path(const char *path);
+/* perf_exec_path and system_path return malloc'd string, caller must free it */
+extern char *perf_exec_path(void);
+extern char *system_path(const char *path);

#endif /* __PERF_EXEC_CMD_H */
diff --git a/tools/perf/util/help.c b/tools/perf/util/help.c
index 86c37c4..fa1fc4a 100644
--- a/tools/perf/util/help.c
+++ b/tools/perf/util/help.c
@@ -159,7 +159,7 @@ void load_command_list(const char *prefix,
struct cmdnames *other_cmds)
{
const char *env_path = getenv("PATH");
- const char *exec_path = perf_exec_path();
+ char *exec_path = perf_exec_path();

if (exec_path) {
list_commands_in_dir(main_cmds, exec_path, prefix);
@@ -187,6 +187,7 @@ void load_command_list(const char *prefix,
sizeof(*other_cmds->names), cmdname_compare);
uniq(other_cmds);
}
+ free(exec_path);
exclude_cmds(other_cmds, main_cmds);
}

@@ -203,13 +204,14 @@ void list_commands(const char *title, struct cmdnames *main_cmds,
longest = other_cmds->names[i]->len;

if (main_cmds->cnt) {
- const char *exec_path = perf_exec_path();
+ char *exec_path = perf_exec_path();
printf("available %s in '%s'\n", title, exec_path);
printf("----------------");
mput_char('-', strlen(title) + strlen(exec_path));
putchar('\n');
pretty_print_string_list(main_cmds, longest);
putchar('\n');
+ free(exec_path);
}

if (other_cmds->cnt) {

Subject: [PATCH perf/core 03/13] perf: Introduce generic refcount APIs with debug feature

This is a kind of debugging feature for atomic reference counter.
The reference counters are widely used in perf tools but not well
debugged. It sometimes causes memory leaks but no one has noticed
the issue, since it is hard to debug such un-reclaimed objects.

This refcnt interface provides fully backtrace debug feature to
analyze such issue. User just replace atomic_t ops with refcnt
APIs and add refcnt__exit() where the object is released.

/* At object initializing */
refcnt__init(obj, refcnt); /* <- atomic_set(&obj->refcnt, 1); */

/* At object get method */
refcnt__get(obj, refcnt); /* <- atomic_inc(&obj->refcnt); */

/* At object put method */
if (obj && refcnt__put(obj, refcnt)) /* <-atmoic_dec_and_test(&obj->refcnt)*/

/* At object releasing */
refcnt__exit(obj, refcnt); /* <- Newly added */

The debugging feature is enabled when building perf with
REFCNT_DEBUG=1. Otherwides it is just translated as normal
atomic ops.

Debugging binary warns you if it finds leaks when the perf exits.
If you give -v (or --verbose) to the perf, it also shows backtrace
logs on all refcnt operations of leaked objects.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/config/Makefile | 5 ++
tools/perf/util/Build | 1
tools/perf/util/refcnt.c | 125 ++++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/refcnt.h | 65 +++++++++++++++++++++++
4 files changed, 196 insertions(+)
create mode 100644 tools/perf/util/refcnt.c
create mode 100644 tools/perf/util/refcnt.h

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index de89ec5..efc2e03 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -147,6 +147,11 @@ ifdef PARSER_DEBUG
$(call detected_var,PARSER_DEBUG_FLEX)
endif

+ifdef REFCNT_DEBUG
+ CFLAGS += -DREFCNT_DEBUG
+ $(call detected,CONFIG_REFCNT_DEBUG)
+endif
+
ifndef NO_LIBPYTHON
# Try different combinations to accommodate systems that only have
# python[2][-config] in weird combinations but always preferring
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 591b3fe..f64d2a4 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -86,6 +86,7 @@ libperf-$(CONFIG_AUXTRACE) += intel-pt.o
libperf-$(CONFIG_AUXTRACE) += intel-bts.o
libperf-y += parse-branch-options.o
libperf-y += parse-regs-options.o
+libperf-$(CONFIG_REFCNT_DEBUG) += refcnt.o

libperf-$(CONFIG_LIBBPF) += bpf-loader.o
libperf-$(CONFIG_LIBELF) += symbol-elf.o
diff --git a/tools/perf/util/refcnt.c b/tools/perf/util/refcnt.c
new file mode 100644
index 0000000..f828419
--- /dev/null
+++ b/tools/perf/util/refcnt.c
@@ -0,0 +1,125 @@
+/* Refcount backtrace for debugging leaks */
+#include "../perf.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <execinfo.h> /* For backtrace */
+
+#include "event.h"
+#include "debug.h"
+#include "util.h"
+#include "refcnt.h"
+
+/* A root of backtrace */
+static LIST_HEAD(refcnt_root); /* List head of refcnt object */
+
+static void __refcnt_object__delete(struct refcnt_object *ref)
+{
+ struct refcnt_buffer *buf;
+
+ while (!list_empty(&ref->head)) {
+ buf = list_entry(ref->head.next, struct refcnt_buffer, list);
+ list_del_init(&buf->list);
+ free(buf);
+ }
+ list_del_init(&ref->list);
+ free(ref);
+}
+
+static struct refcnt_object *refcnt__find_object(void *obj)
+{
+ struct refcnt_object *ref;
+
+ /* TODO: use hash list */
+ list_for_each_entry(ref, &refcnt_root, list)
+ if (ref->obj == obj)
+ return ref;
+
+ return NULL;
+}
+
+void refcnt_object__delete(void *addr)
+{
+ struct refcnt_object *ref = refcnt__find_object(addr);
+
+ if (!ref) {
+ pr_debug("REFCNT: Delete uninitialized refcnt: %p\n", addr);
+ return;
+ }
+ __refcnt_object__delete(ref);
+}
+
+void refcnt_object__record(void *obj, const char *name, int count)
+{
+ struct refcnt_object *ref = refcnt__find_object(obj);
+ struct refcnt_buffer *buf;
+
+ /* If no entry, allocate new one */
+ if (!ref) {
+ ref = malloc(sizeof(*ref));
+ if (!ref) {
+ pr_debug("REFCNT: Out of memory for %p (%s)\n",
+ obj, name);
+ return;
+ }
+ INIT_LIST_HEAD(&ref->list);
+ INIT_LIST_HEAD(&ref->head);
+ ref->name = name;
+ ref->obj = obj;
+ list_add_tail(&ref->list, &refcnt_root);
+ }
+
+ buf = malloc(sizeof(*buf));
+ if (!buf) {
+ pr_debug("REFCNT: Out of memory for %p (%s)\n", obj, ref->name);
+ return;
+ }
+
+ INIT_LIST_HEAD(&buf->list);
+ buf->count = count;
+ buf->nr = backtrace(buf->buf, BACKTRACE_SIZE);
+ list_add_tail(&buf->list, &ref->head);
+}
+
+static void pr_refcnt_buffer(struct refcnt_buffer *buf)
+{
+ char **symbuf;
+ int i;
+
+ if (!buf)
+ return;
+ symbuf = backtrace_symbols(buf->buf, buf->nr);
+ /* Skip the first one because it is always btrace__record */
+ for (i = 1; i < buf->nr; i++)
+ pr_debug(" %s\n", symbuf[i]);
+ free(symbuf);
+}
+
+void refcnt__dump_unreclaimed(void) __attribute__((destructor));
+void refcnt__dump_unreclaimed(void)
+{
+ struct refcnt_object *ref, *n;
+ struct refcnt_buffer *buf;
+ int i = 0;
+
+ if (list_empty(&refcnt_root))
+ return;
+
+ pr_warning("REFCNT: BUG: Unreclaimed objects found.\n");
+ list_for_each_entry_safe(ref, n, &refcnt_root, list) {
+ pr_debug("==== [%d] ====\nUnreclaimed %s: %p\n", i,
+ ref->name ? ref->name : "(object)", ref->obj);
+ list_for_each_entry(buf, &ref->head, list) {
+ pr_debug("Refcount %s => %d at\n",
+ buf->count > 0 ? "+1" : "-1",
+ buf->count > 0 ? buf->count : -buf->count - 1);
+ pr_refcnt_buffer(buf);
+ }
+ __refcnt_object__delete(ref);
+ i++;
+ }
+ pr_warning("REFCNT: Total %d objects are not reclaimed.\n", i);
+ if (!verbose)
+ pr_warning(" To see all backtraces, rerun with -v option\n");
+}
+
diff --git a/tools/perf/util/refcnt.h b/tools/perf/util/refcnt.h
new file mode 100644
index 0000000..a8e2d29
--- /dev/null
+++ b/tools/perf/util/refcnt.h
@@ -0,0 +1,65 @@
+/*
+ * Atomic reference counter API with debugging feature
+ */
+#ifndef __PERF_REFCNT_H
+#define __PERF_REFCNT_H
+
+#include <linux/atomic.h>
+
+#ifdef REFCNT_DEBUG
+
+struct refcnt_object {
+ struct list_head list; /* List of objects */
+ void *obj; /* Object address which has refcnt */
+ const char *name; /* Object class name */
+ struct list_head head; /* List head for buffers */
+};
+
+#define BACKTRACE_SIZE 32
+struct refcnt_buffer {
+ struct list_head list; /* List of buffers */
+ int count; /* Count number at recording point */
+ int nr; /* Number of recorded buffer entries */
+ void *buf[BACKTRACE_SIZE]; /* Backtrace buffer */
+};
+
+void refcnt_object__record(void *obj, const char *name, int count);
+void refcnt_object__delete(void *obj);
+
+static inline void __refcnt__init(atomic_t *refcnt, void *obj, const char *name)
+{
+ atomic_set(refcnt, 1);
+ refcnt_object__record(obj, name, 1);
+}
+
+static inline void __refcnt__get(atomic_t *refcnt, void *obj)
+{
+ atomic_inc(refcnt);
+ refcnt_object__record(obj, NULL, atomic_read(refcnt));
+}
+
+static inline int __refcnt__put(atomic_t *refcnt, void *obj)
+{
+ refcnt_object__record(obj, NULL, -atomic_read(refcnt));
+ return atomic_dec_and_test(refcnt);
+}
+
+#define refcnt__init(obj, member) \
+ __refcnt__init(&obj->member, obj, #obj)
+#define refcnt__exit(obj, member) \
+ refcnt_object__delete(obj)
+#define refcnt__get(obj, member) \
+ __refcnt__get(&obj->member, obj)
+#define refcnt__put(obj, member) \
+ __refcnt__put(&obj->member, obj)
+
+#else /* !REFCNT_DEBUG */
+
+#define refcnt__init(obj, member) atomic_set(&obj->member, 1)
+#define refcnt__exit(obj, member) do { } while (0)
+#define refcnt__get(obj, member) atomic_inc(&obj->member)
+#define refcnt__put(obj, member) atomic_dec_and_test(&obj->member)
+
+#endif /* !REFCNT_DEBUG */
+
+#endif /* __PERF_REFCNT_H */

Subject: [PATCH perf/core 04/13] perf: make map to use refcnt

Make 'map' object to use refcnt interface for debug.
This can find refcnt related memory leaks.
E.g.

----
./perf probe vfs_read
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 76 objects are not reclaimed.
To see all backtraces, rerun with -v option
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/map.c | 7 ++++---
tools/perf/util/map.h | 3 ++-
2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index afc6b56..29334c6 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -138,7 +138,7 @@ void map__init(struct map *map, enum map_type type,
RB_CLEAR_NODE(&map->rb_node);
map->groups = NULL;
map->erange_warned = false;
- atomic_set(&map->refcnt, 1);
+ refcnt__init(map, refcnt);
}

struct map *map__new(struct machine *machine, u64 start, u64 len,
@@ -241,6 +241,7 @@ bool __map__is_kernel(const struct map *map)
static void map__exit(struct map *map)
{
BUG_ON(!RB_EMPTY_NODE(&map->rb_node));
+ refcnt__exit(map, refcnt);
dso__zput(map->dso);
}

@@ -252,7 +253,7 @@ void map__delete(struct map *map)

void map__put(struct map *map)
{
- if (map && atomic_dec_and_test(&map->refcnt))
+ if (map && refcnt__put(map, refcnt))
map__delete(map);
}

@@ -353,7 +354,7 @@ struct map *map__clone(struct map *from)
struct map *map = memdup(from, sizeof(*map));

if (map != NULL) {
- atomic_set(&map->refcnt, 1);
+ refcnt__init(map, refcnt);
RB_CLEAR_NODE(&map->rb_node);
dso__get(map->dso);
map->groups = NULL;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 7309d64..fcdc7d6 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -9,6 +9,7 @@
#include <stdio.h>
#include <stdbool.h>
#include <linux/types.h>
+#include "refcnt.h"

enum map_type {
MAP__FUNCTION = 0,
@@ -153,7 +154,7 @@ struct map *map__clone(struct map *map);
static inline struct map *map__get(struct map *map)
{
if (map)
- atomic_inc(&map->refcnt);
+ refcnt__get(map, refcnt);
return map;
}

Subject: [PATCH perf/core 05/13] perf: Fix machine__findnew_module_map to put registered map

Fix machine object to put the map object which is already
insterted to machine->kmaps.

refcnt debugger shows what happened:
----
==== [2] ====
Unreclaimed map: 0x346f750
Refcount +1 => 1 at
./perf(map__new2+0xb5) [0x4bdea5]
./perf() [0x4b8aaf]
./perf(modules__parse+0xfc) [0x4a9cbc]
./perf() [0x4b83c0]
./perf(machine__create_kernel_maps+0x148) [0x4bb208]
./perf(machine__new_host+0xfa) [0x4bb3fa]
./perf(init_probe_symbol_maps+0x93) [0x5062b3]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(maps__insert+0x9a) [0x4bfd4a]
./perf() [0x4b8acb]
./perf(modules__parse+0xfc) [0x4a9cbc]
./perf() [0x4b83c0]
./perf(machine__create_kernel_maps+0x148) [0x4bb208]
./perf(machine__new_host+0xfa) [0x4bb3fa]
./perf(init_probe_symbol_maps+0x93) [0x5062b3]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(map_groups__exit+0x94) [0x4bea54]
./perf(machine__delete+0x3d) [0x4b91ed]
./perf(exit_probe_symbol_maps+0x28) [0x506358]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5]
./perf() [0x4220a9]
----
This pattern clearly shows that the refcnt of the map is acquired
twice ny map__new2 and maps__insert but released once at map_groups_exit.
Since maps__insert already got the dso, we have to put it right
after that which corresponding to the map__new2.

These are happened in machine__findnew_module_map, as below.
----
# eu-addr2line -e ./perf -f 0x4b8aaf
machine__findnew_module_map inlined at util/machine.c:1046
in machine__create_module
util/machine.c:582
# eu-addr2line -e ./perf -f 0x4b8acb
map_groups__insert inlined at util/machine.c:585
in machine__create_module
util/map.h:208
----
(note that both are at util/machine.c:58X which is
machine__findnew_module_map)
So, this patch fixes machine__findnew_module_map to put the map
right after map_groups__insert.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/machine.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 5ef90be..5ca4064 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -584,6 +584,8 @@ struct map *machine__findnew_module_map(struct machine *machine, u64 start,

map_groups__insert(&machine->kmaps, map);

+ /* Put the map here because map_groups__insert alread got it */
+ map__put(map);
out:
free(m.name);
return map;

Subject: [PATCH perf/core 06/13] perf: Fix machine__destroy_kernel_maps to put vmlinux_maps

Fix machine__destroy_kernel_maps() to put vmlinux_maps before
filling it with NULL.

Refcnt debugger shows
==== [1] ====
Unreclaimed map: 0x36b1070
Refcount +1 => 1 at
./perf(map__new2+0xb5) [0x4bdec5]
./perf(machine__create_kernel_maps+0x72) [0x4bb152]
./perf(machine__new_host+0xfa) [0x4bb41a]
./perf(init_probe_symbol_maps+0x93) [0x5062d3]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(maps__insert+0x9a) [0x4bfd6a]
./perf(machine__create_kernel_maps+0xc3) [0x4bb1a3]
./perf(machine__new_host+0xfa) [0x4bb41a]
./perf(init_probe_symbol_maps+0x93) [0x5062d3]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(map_groups__exit+0x94) [0x4bea74]
./perf(machine__delete+0x3d) [0x4b91fd]
./perf(exit_probe_symbol_maps+0x28) [0x506378]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5]
./perf() [0x4220a9]

map__new2() returns map with refcnt = 1, and also
map_groups__insert gets it again in__machine__create_kernel_maps().

machine__destroy_kernel_maps() calls map_groups__remove() to
decrement the refcnt, but before decrement it again (corresponding
to map__new2), it makes vmlinux_maps[type] = NULL. And this may
cause a refcnt leak.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/machine.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 5ca4064..7c647d3 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -789,6 +789,7 @@ void machine__destroy_kernel_maps(struct machine *machine)
kmap->ref_reloc_sym = NULL;
}

+ map__put(machine->vmlinux_maps[type]);
machine->vmlinux_maps[type] = NULL;
}
}

Subject: [PATCH perf/core 07/13] perf: Fix to destroy kernel maps when machine exits

Actually machine__exit forgot to call machine__destroy_kernel_maps.
So this fixes to call it. This fixes some memory leaks on map
as below.

Without this fix.
----
./perf probe vfs_read
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 4 objects are not reclaimed.
To see all backtraces, rerun with -v option
----
With this fix.
----
./perf probe vfs_read
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 2 objects are not reclaimed.
To see all backtraces, rerun with -v option
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/machine.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 7c647d3..15282df 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -121,6 +121,7 @@ void machine__delete_threads(struct machine *machine)

void machine__exit(struct machine *machine)
{
+ machine__destroy_kernel_maps(machine);
map_groups__exit(&machine->kmaps);
dsos__exit(&machine->dsos);
machine__exit_vdso(machine);

Subject: [PATCH perf/core 08/13] perf: Fix to put new map after inserting to map_groups in dso__load_sym

Fix dso__load_sym to put the map object which is already
insterted to kmaps.

Refcnt debugger shows
==== [0] ====
Unreclaimed map: 0x39113e0
Refcount +1 => 1 at
./perf(map__new2+0xb5) [0x4be155]
./perf(dso__load_sym+0xee1) [0x503461]
./perf(dso__load_vmlinux+0xbf) [0x4aa6df]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa83c]
./perf() [0x50528a]
./perf(convert_perf_probe_events+0xd79) [0x50ac29]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(maps__insert+0x9a) [0x4bfffa]
./perf(dso__load_sym+0xf89) [0x503509]
./perf(dso__load_vmlinux+0xbf) [0x4aa6df]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa83c]
./perf() [0x50528a]
./perf(convert_perf_probe_events+0xd79) [0x50ac29]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(map_groups__exit+0x94) [0x4bed04]
./perf(machine__delete+0xb0) [0x4b9300]
./perf(exit_probe_symbol_maps+0x28) [0x506608]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5]
./perf() [0x4220a9]

This means that the dso__load_sym calls map__new2 and maps_insert,
both of them get the map, but the map will be put at map_groups__exit
just once.
Thus, dso__load_sym has to put the map right after inserting it to
kmaps.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/symbol-elf.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 475d88d..53f1996 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1042,6 +1042,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
}
curr_dso->symtab_type = dso->symtab_type;
map_groups__insert(kmaps, curr_map);
+ /* kmaps already got it */
+ map__put(curr_map);
dsos__add(&map->groups->machine->dsos, curr_dso);
dso__set_loaded(curr_dso, map->type);
} else

Subject: [PATCH perf/core 09/13] perf: Make dso to use refcnt for debug

Make 'dso' object to use refcnt interface for debug.
This can find refcnt related memory leaks on dsos.

E.g.
----
# ./perf probe vfs_read
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 75 objects are not reclaimed.
To see all backtraces, rerun with -v option
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/dso.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 7c0c083..28ad06a 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1049,7 +1049,7 @@ struct dso *dso__new(const char *name)
INIT_LIST_HEAD(&dso->node);
INIT_LIST_HEAD(&dso->data.open_entry);
pthread_mutex_init(&dso->lock, NULL);
- atomic_set(&dso->refcnt, 1);
+ refcnt__init(dso, refcnt);
}

return dso;
@@ -1081,19 +1081,20 @@ void dso__delete(struct dso *dso)
dso__free_a2l(dso);
zfree(&dso->symsrc_filename);
pthread_mutex_destroy(&dso->lock);
+ refcnt__exit(dso, refcnt);
free(dso);
}

struct dso *dso__get(struct dso *dso)
{
if (dso)
- atomic_inc(&dso->refcnt);
+ refcnt__get(dso, refcnt);
return dso;
}

void dso__put(struct dso *dso)
{
- if (dso && atomic_dec_and_test(&dso->refcnt))
+ if (dso && refcnt__put(dso, refcnt))
dso__delete(dso);
}

Subject: [PATCH perf/core 10/13] perf: Fix __dsos__addnew to put dso after adding it to the list

Fix __dsos__addnew to put dso after adding it to the list, because
__dsos__add() gets the dso refcount.
This eases refcount leaks on dso.

Refcnt debugger shows:
==== [0] ====
Unreclaimed dso: 0x2fccab0
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(__dsos__addnew+0x29) [0x4a6e19]
./perf(dsos__findnew+0xd1) [0x4a7281]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8df2]
./perf(machine__create_kernel_maps+0x28) [0x4bb528]
./perf(machine__new_host+0xfa) [0x4bb84a]
./perf(init_probe_symbol_maps+0x93) [0x506713]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(__dsos__addnew+0xfb) [0x4a6eeb]
./perf(dsos__findnew+0xd1) [0x4a7281]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8df2]
./perf(machine__create_kernel_maps+0x28) [0x4bb528]
./perf(machine__new_host+0xfa) [0x4bb84a]
./perf(init_probe_symbol_maps+0x93) [0x506713]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
./perf() [0x4220a9]
Refcount +1 => 3 at
./perf(dsos__findnew+0x7e) [0x4a722e]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8df2]
./perf(machine__create_kernel_maps+0x28) [0x4bb528]
./perf(machine__new_host+0xfa) [0x4bb84a]
./perf(init_probe_symbol_maps+0x93) [0x506713]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
./perf() [0x4220a9]
[snip]

Here, __dsos__addnew() gets the dso twice, once for init,
once for adding the dso to its list. But after added, we
don't need the dso reference (since we already have).

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/dso.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 28ad06a..d21057c 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1227,6 +1227,8 @@ struct dso *__dsos__addnew(struct dsos *dsos, const char *name)
if (dso != NULL) {
__dsos__add(dsos, dso);
dso__set_basename(dso);
+ /* Put dso here because __dsos_add already got it */
+ dso__put(dso);
}
return dso;
}

Subject: [PATCH perf/core 11/13] perf: Fix machine__create_kernel_maps to put kernel dso

Fix machine__create_kernel_maps() to put kernel dso because the dso
has been gotten by __machine__create_kernel_maps().

Refcnt debugger shows:
==== [0] ====
Unreclaimed dso: 0x3036ab0
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(__dsos__addnew+0x29) [0x4a6e19]
./perf(dsos__findnew+0xd1) [0x4a7181]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8cf2]
./perf(machine__create_kernel_maps+0x28) [0x4bb428]
./perf(machine__new_host+0xfa) [0x4bb74a]
./perf(init_probe_symbol_maps+0x93) [0x506613]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
./perf() [0x4220a9]
[snip]
Refcount +1 => 2 at
./perf(dsos__findnew+0x7e) [0x4a712e]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8cf2]
./perf(machine__create_kernel_maps+0x28) [0x4bb428]
./perf(machine__new_host+0xfa) [0x4bb74a]
./perf(init_probe_symbol_maps+0x93) [0x506613]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
./perf() [0x4220a9]
[snip]
Refcount -1 => 1 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(machine__delete+0xfe) [0x4b93ee]
./perf(exit_probe_symbol_maps+0x28) [0x5066b8]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
./perf() [0x4220a9]

Actually, dsos__findnew gets the dso before returning it,
so the dso user (in this case machine__create_kernel_maps)
has to put the dso after used.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/machine.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 15282df..d38ecb5 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1087,11 +1087,14 @@ int machine__create_kernel_maps(struct machine *machine)
struct dso *kernel = machine__get_kernel(machine);
const char *name;
u64 addr = machine__get_running_kernel_start(machine, &name);
- if (!addr)
+ int ret;
+
+ if (!addr || kernel == NULL)
return -1;

- if (kernel == NULL ||
- __machine__create_kernel_maps(machine, kernel) < 0)
+ ret = __machine__create_kernel_maps(machine, kernel);
+ dso__put(kernel);
+ if (ret < 0)
return -1;

if (symbol_conf.use_modules && machine__create_modules(machine) < 0) {

Subject: [PATCH perf/core 12/13] perf: Fix machine__findnew_module_map to put dso

Fix machine__findnew_module_map to put dso because the dso is
already got by machine__findnew_module_dso() and map__new2().

Refcnt debugger shows:

==== [1] ====
Unreclaimed dso: 0x1ffd980
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(__dsos__addnew+0x29) [0x4a6e19]
./perf() [0x4b8b91]
./perf(modules__parse+0xfc) [0x4a9d5c]
./perf() [0x4b8460]
./perf(machine__create_kernel_maps+0x150) [0x4bb550]
./perf(machine__new_host+0xfa) [0x4bb75a]
./perf(init_probe_symbol_maps+0x93) [0x506623]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
./perf() [0x4220a9]

This map_groups__insert(0x4b8b91) already get the new dso,
----
eu-addr2line -e ./perf -f 0x4b8b91
map_groups__insert inlined at util/machine.c:586 in
machine__create_module
util/map.h:207
----
So this dso refcnt will be released when map_groups released.

[snip]
Refcount +1 => 2 at
./perf(dso__get+0x34) [0x4a65f4]
./perf() [0x4b8b35]
./perf(modules__parse+0xfc) [0x4a9d5c]
./perf() [0x4b8460]
./perf(machine__create_kernel_maps+0x150) [0x4bb550]
./perf(machine__new_host+0xfa) [0x4bb75a]
./perf(init_probe_symbol_maps+0x93) [0x506623]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
./perf() [0x4220a9]

Here, machine__findnew_module_dso(0x4b8b35) gets the dso.
(and stores the dso to local variable)
----
# eu-addr2line -e ./perf -f 0x4b8b35
machine__findnew_module_dso inlined at util/machine.c:578 in
machine__create_module
util/machine.c:514
----

Refcount +1 => 3 at
./perf(dso__get+0x34) [0x4a65f4]
./perf(map__new2+0x76) [0x4be1c6]
./perf() [0x4b8b4f]
./perf(modules__parse+0xfc) [0x4a9d5c]
./perf() [0x4b8460]
./perf(machine__create_kernel_maps+0x150) [0x4bb550]
./perf(machine__new_host+0xfa) [0x4bb75a]
./perf(init_probe_symbol_maps+0x93) [0x506623]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
./perf() [0x4220a9]

But also map__new2 gets the dso which will be put when
the map is released.

So, we have to put the dso corresponding to above
machine__findnew_module_dso.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/machine.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index d38ecb5..96e5942 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -564,7 +564,7 @@ struct map *machine__findnew_module_map(struct machine *machine, u64 start,
const char *filename)
{
struct map *map = NULL;
- struct dso *dso;
+ struct dso *dso = NULL;
struct kmod_path m;

if (kmod_path__parse_name(&m, filename))
@@ -588,6 +588,8 @@ struct map *machine__findnew_module_map(struct machine *machine, u64 start,
/* Put the map here because map_groups__insert alread got it */
map__put(map);
out:
+ /* put the dso here, corresponding to machine__findnew_module_dso */
+ dso__put(dso);
free(m.name);
return map;
}

Subject: [PATCH perf/core 13/13] perf: Fix dso__load_sym to put dso

Fix dso__load_sym to put dso because dsos__add already got it.

Refcnt debugger explain the problem:
----
==== [0] ====
Unreclaimed dso: 0x19dd200
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(dso__load_sym+0xe89) [0x503509]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(dso__get+0x34) [0x4a65f4]
./perf(map__new2+0x76) [0x4be216]
./perf(dso__load_sym+0xee1) [0x503561]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 3 at
./perf(dsos__add+0xf3) [0x4a6bc3]
./perf(dso__load_sym+0xfc1) [0x503641]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 2 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(map_groups__exit+0xb9) [0x4bee29]
./perf(machine__delete+0xb0) [0x4b93d0]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(machine__delete+0xfe) [0x4b941e]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
----
So, in the dso__load_sym, dso is gotten 3 times, by dso__new,
map__new2, and dsos__add. The last 2 is actually released by
map_groups and machine__delete correspondingly. However, the
first reference by dso__new, is never released.

This solves this issue by putting dso right after dsos__add.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/symbol-elf.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 53f1996..a5703ac 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1045,6 +1045,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
/* kmaps already got it */
map__put(curr_map);
dsos__add(&map->groups->machine->dsos, curr_dso);
+ /* curr_map and machine->dsos already got it */
+ dso__put(curr_dso);
dso__set_loaded(curr_dso, map->type);
} else
curr_dso = curr_map->dso;

2015-11-18 12:46:53

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/13] perf memory/refcnt leak fixes

Em Wed, Nov 18, 2015 at 03:40:09PM +0900, Masami Hiramatsu escreveu:
> Hi,
>
> Here is a series to fix some memory leaks and refcount
> leaks on map and dso. This also includes the refcnt APIs
> with backtrace debugging feature.

Cool, I wonder if this could be usable in the kernel proper... Is there
such a facility there? I'll check.

But thanks for doing this work, I'll go thru the fixes first, then look
at the debugging feature.

- Arnaldo

> The story has started from the posible memory leak report
> reported by Wnag Nan.
> I've tried to use valgrind to ensure the perf probe doesn't
> have other memory leaks. The result is here:
>
> ----
> # valgrind ./perf probe vfs_read
> ==17521== Memcheck, a memory error detector
> ==17521== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
> ==17521== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
> ==17521== Command: ./perf probe vfs_read
> ==17521==
> Added new event:
> probe:vfs_read (on vfs_read)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe:vfs_read -aR sleep 1
>
> ==17521==
> ==17521== HEAP SUMMARY:
> ==17521== in use at exit: 3,512,761 bytes in 38,012 blocks
> ==17521== total heap usage: 74,723 allocs, 36,711 frees, 24,014,927
> bytes allocated
> ==17521==
> ==17521== LEAK SUMMARY:
> ==17521== definitely lost: 6,857 bytes in 49 blocks
> ==17521== indirectly lost: 3,501,287 bytes in 37,891 blocks
> ==17521== possibly lost: 0 bytes in 0 blocks
> ==17521== still reachable: 4,617 bytes in 72 blocks
> ==17521== suppressed: 0 bytes in 0 blocks
> ==17521== Rerun with --leak-check=full to see details of leaked memory
> ==17521==
> ==17521== For counts of detected and suppressed errors, rerun with: -v
> ==17521== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
> ----
>
> Oops! It leaked almost 4 MB memories. I've tried to find
> the root causes, and what I've found is there are many
> leaks in not only perf-probe specific code, but also maps
> and dsos (and some other pieces).
>
> The first 3 patches are just fixing 'easy' memory leaks. However,
> most of the leaks are caused by refcnt. Since valgrind seems not
> able to debug this kind of issues, I introduced a hand-made refcnt
> backtrace APIs for debugging.
> The rest of patches are for fixing refcnt leak bugs and replcing
> refcnt apis.
>
> After all, most of the issues are gone, except for just a few issues
> in elfutils. I'll continue to investigate that.
>
> ----
> valgrind ./perf probe vfs_read
> ==29521== Memcheck, a memory error detector
> ==29521== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
> ==29521== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
> ==29521== Command: ./perf probe vfs_read
> ==29521==
> Added new event:
> probe:vfs_read (on vfs_read)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe:vfs_read -aR sleep 1
>
> ==29521==
> ==29521== HEAP SUMMARY:
> ==29521== in use at exit: 5,137 bytes in 75 blocks
> ==29521== total heap usage: 74,723 allocs, 74,648 frees, 24,014,927
> bytes allocated
> ==29521==
> ==29521== LEAK SUMMARY:
> ==29521== definitely lost: 520 bytes in 3 blocks
> ==29521== indirectly lost: 0 bytes in 0 blocks
> ==29521== possibly lost: 0 bytes in 0 blocks
> ==29521== still reachable: 4,617 bytes in 72 blocks
> ==29521== suppressed: 0 bytes in 0 blocks
> ==29521== Rerun with --leak-check=full to see details of leaked memory
> ==29521==
> ==29521== For counts of detected and suppressed errors, rerun with: -v
> ==29521== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
> ----
>
> Anyway, I decided to release these fixes and the debugging feature
> because at least this will improve perf quality.
>
> Thank you,
>
> ---
>
> Masami Hiramatsu (13):
> perf probe: Fix to free temporal Dwarf_Frame
> perf: Make perf_exec_path always returns malloc'd string
> perf: Introduce generic refcount APIs with debug feature
> perf: make map to use refcnt
> perf: Fix machine__findnew_module_map to put registered map
> perf: Fix machine__destroy_kernel_maps to put vmlinux_maps
> perf: Fix to destroy kernel maps when machine exits
> perf: Fix to put new map after inserting to map_groups in dso__load_sym
> perf: Make dso to use refcnt for debug
> perf: Fix __dsos__addnew to put dso after adding it to the list
> perf: Fix machine__create_kernel_maps to put kernel dso
> perf: Fix machine__findnew_module_map to put dso
> perf: Fix dso__load_sym to put dso
>
>
> tools/perf/config/Makefile | 5 ++
> tools/perf/util/Build | 1
> tools/perf/util/dso.c | 9 ++-
> tools/perf/util/exec_cmd.c | 20 ++++--
> tools/perf/util/exec_cmd.h | 5 +-
> tools/perf/util/help.c | 6 +-
> tools/perf/util/machine.c | 17 ++++-
> tools/perf/util/map.c | 7 +-
> tools/perf/util/map.h | 3 +
> tools/perf/util/probe-finder.c | 9 ++-
> tools/perf/util/refcnt.c | 125 ++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/refcnt.h | 65 +++++++++++++++++++++
> tools/perf/util/symbol-elf.c | 4 +
> 13 files changed, 250 insertions(+), 26 deletions(-)
> create mode 100644 tools/perf/util/refcnt.c
> create mode 100644 tools/perf/util/refcnt.h
>
>
> --
> Masami HIRAMATSU
> Linux Technology Research Center, System Productivity Research Dept.
> Center for Technology Innovation - Systems Engineering
> Hitachi, Ltd., Research & Development Group
> E-mail: [email protected]

2015-11-18 22:23:14

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 02/13] perf: Make perf_exec_path always returns malloc'd string

Em Wed, Nov 18, 2015 at 03:40:14PM +0900, Masami Hiramatsu escreveu:
> Since system_path() returns malloc'd string if given path is
> not an absolute path, perf_exec_path sometimes returns static
> string and sometimes returns malloc'd string depends on the
> environment variables or command options.
>
> This causes a memory leak because caller can not free the
> returned string.
>
> This fixes perf_exec_path and system_path to always return
> malloc'd string, so caller can always free it.

> /* Returns the highest-priority, location to look for perf programs. */
> -const char *perf_exec_path(void)
> +char *perf_exec_path(void)
> {
> - const char *env;
> + char *env;
>
> if (argv_exec_path)
> - return argv_exec_path;
> + return strdup(argv_exec_path);

So here you return strdup without checking if it fails

>
> env = getenv(EXEC_PATH_ENVIRONMENT);
> if (env && *env) {
> - return env;
> + env = strdup(env);
> + if (env)
> + return env;

But here you change the behaviour by, having a EXEC_PATH_ENVIRONMENT,
fallback to PERF_EXEC_PATH if strdup fails, that in turn can end up
returning NULL, since system_path() doesn't check the strdup() result?

I wonder if in those cases we couldn't check the address range for the
pointer being freed and realize it is in .bss, .data or that it is a
stack variable? Maybe there is some malloc() friend that, given a
pointer, tells that yeah, it was allocated?

- Arnaldo

> }
>
> return system_path(PERF_EXEC_PATH);
> @@ -83,9 +85,11 @@ void setup_path(void)
> {
> const char *old_path = getenv("PATH");
> struct strbuf new_path = STRBUF_INIT;
> + char *tmp = perf_exec_path();
>
> - add_path(&new_path, perf_exec_path());
> + add_path(&new_path, tmp);
> add_path(&new_path, argv0_path);
> + free(tmp);
>
> if (old_path)
> strbuf_addstr(&new_path, old_path);
> diff --git a/tools/perf/util/exec_cmd.h b/tools/perf/util/exec_cmd.h
> index bc4b915..48b4175 100644
> --- a/tools/perf/util/exec_cmd.h
> +++ b/tools/perf/util/exec_cmd.h
> @@ -3,10 +3,11 @@
>
> extern void perf_set_argv_exec_path(const char *exec_path);
> extern const char *perf_extract_argv0_path(const char *path);
> -extern const char *perf_exec_path(void);
> extern void setup_path(void);
> extern int execv_perf_cmd(const char **argv); /* NULL terminated */
> extern int execl_perf_cmd(const char *cmd, ...);
> -extern const char *system_path(const char *path);
> +/* perf_exec_path and system_path return malloc'd string, caller must free it */
> +extern char *perf_exec_path(void);
> +extern char *system_path(const char *path);
>
> #endif /* __PERF_EXEC_CMD_H */
> diff --git a/tools/perf/util/help.c b/tools/perf/util/help.c
> index 86c37c4..fa1fc4a 100644
> --- a/tools/perf/util/help.c
> +++ b/tools/perf/util/help.c
> @@ -159,7 +159,7 @@ void load_command_list(const char *prefix,
> struct cmdnames *other_cmds)
> {
> const char *env_path = getenv("PATH");
> - const char *exec_path = perf_exec_path();
> + char *exec_path = perf_exec_path();
>
> if (exec_path) {
> list_commands_in_dir(main_cmds, exec_path, prefix);
> @@ -187,6 +187,7 @@ void load_command_list(const char *prefix,
> sizeof(*other_cmds->names), cmdname_compare);
> uniq(other_cmds);
> }
> + free(exec_path);
> exclude_cmds(other_cmds, main_cmds);
> }
>
> @@ -203,13 +204,14 @@ void list_commands(const char *title, struct cmdnames *main_cmds,
> longest = other_cmds->names[i]->len;
>
> if (main_cmds->cnt) {
> - const char *exec_path = perf_exec_path();
> + char *exec_path = perf_exec_path();
> printf("available %s in '%s'\n", title, exec_path);
> printf("----------------");
> mput_char('-', strlen(title) + strlen(exec_path));
> putchar('\n');
> pretty_print_string_list(main_cmds, longest);
> putchar('\n');
> + free(exec_path);
> }
>
> if (other_cmds->cnt) {

----- End forwarded message -----

2015-11-18 22:36:52

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 05/13] perf: Fix machine__findnew_module_map to put registered map

Em Wed, Nov 18, 2015 at 03:40:20PM +0900, Masami Hiramatsu escreveu:
> Fix machine object to put the map object which is already
> insterted to machine->kmaps.
inserted :-)

Applied!

- Arnaldo

2015-11-18 22:36:46

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 01/13] perf probe: Fix to free temporal Dwarf_Frame

Em Wed, Nov 18, 2015 at 03:40:12PM +0900, Masami Hiramatsu escreveu:
> Since dwarf_cfi_addrframe returns malloc'd Dwarf_Frame
> object, it has to be freed after used.

Applied to perf/urgent

> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> tools/perf/util/probe-finder.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> index 63993d7..4d7d4f4 100644
> --- a/tools/perf/util/probe-finder.c
> +++ b/tools/perf/util/probe-finder.c
> @@ -683,21 +683,24 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
> ret = dwarf_getlocation_addr(&fb_attr, pf->addr, &pf->fb_ops, &nops, 1);
> if (ret <= 0 || nops == 0) {
> pf->fb_ops = NULL;
> + ret = 0;
> #if _ELFUTILS_PREREQ(0, 142)
> } else if (nops == 1 && pf->fb_ops[0].atom == DW_OP_call_frame_cfa &&
> pf->cfi != NULL) {
> - Dwarf_Frame *frame;
> + Dwarf_Frame *frame = NULL;
> if (dwarf_cfi_addrframe(pf->cfi, pf->addr, &frame) != 0 ||
> dwarf_frame_cfa(frame, &pf->fb_ops, &nops) != 0) {
> pr_warning("Failed to get call frame on 0x%jx\n",
> (uintmax_t)pf->addr);
> - return -ENOENT;
> + ret = -ENOENT;
> }
> + free(frame);
> #endif
> }
>
> /* Call finder's callback handler */
> - ret = pf->callback(sc_die, pf);
> + if (ret >= 0)
> + ret = pf->callback(sc_die, pf);
>
> /* *pf->fb_ops will be cached in libdw. Don't free it. */
> pf->fb_ops = NULL;

2015-11-18 22:38:56

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 06/13] perf: Fix machine__destroy_kernel_maps to put vmlinux_maps

Em Wed, Nov 18, 2015 at 03:40:22PM +0900, Masami Hiramatsu escreveu:
> Fix machine__destroy_kernel_maps() to put vmlinux_maps before
> filling it with NULL.

Applied to perf/urgent.

- Arnaldo

2015-11-18 23:38:23

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH perf/core 01/13] perf probe: Fix to free temporal Dwarf_Frame

On November 19, 2015 7:36:39 AM GMT+09:00, Arnaldo Carvalho de Melo <[email protected]> wrote:
>Em Wed, Nov 18, 2015 at 03:40:12PM +0900, Masami Hiramatsu escreveu:
>> Since dwarf_cfi_addrframe returns malloc'd Dwarf_Frame
>> object, it has to be freed after used.
>
>Applied to perf/urgent
>
>> Signed-off-by: Masami Hiramatsu <[email protected]>
>> ---
>> tools/perf/util/probe-finder.c | 9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/util/probe-finder.c
>b/tools/perf/util/probe-finder.c
>> index 63993d7..4d7d4f4 100644
>> --- a/tools/perf/util/probe-finder.c
>> +++ b/tools/perf/util/probe-finder.c
>> @@ -683,21 +683,24 @@ static int call_probe_finder(Dwarf_Die *sc_die,
>struct probe_finder *pf)
>> ret = dwarf_getlocation_addr(&fb_attr, pf->addr, &pf->fb_ops,
>&nops, 1);
>> if (ret <= 0 || nops == 0) {
>> pf->fb_ops = NULL;
>> + ret = 0;
>> #if _ELFUTILS_PREREQ(0, 142)
>> } else if (nops == 1 && pf->fb_ops[0].atom == DW_OP_call_frame_cfa
>&&
>> pf->cfi != NULL) {
>> - Dwarf_Frame *frame;
>> + Dwarf_Frame *frame = NULL;
>> if (dwarf_cfi_addrframe(pf->cfi, pf->addr, &frame) != 0 ||
>> dwarf_frame_cfa(frame, &pf->fb_ops, &nops) != 0) {

What if dwarf_cfi_addrframe() succeeded but
dwarf_frame_cfa() failed? It seems that the frame
still can be leaked..

Thanks
Namhyung


>> pr_warning("Failed to get call frame on 0x%jx\n",
>> (uintmax_t)pf->addr);
>> - return -ENOENT;
>> + ret = -ENOENT;
>> }
>> + free(frame);
>> #endif
>> }
>>
>> /* Call finder's callback handler */
>> - ret = pf->callback(sc_die, pf);
>> + if (ret >= 0)
>> + ret = pf->callback(sc_die, pf);
>>
>> /* *pf->fb_ops will be cached in libdw. Don't free it. */
>> pf->fb_ops = NULL;

Hi Arnaldo and Masami,
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Subject: RE: [PATCH perf/core 00/13] perf memory/refcnt leak fixes

From: Arnaldo Carvalho de Melo [mailto:[email protected]]
>
>Em Wed, Nov 18, 2015 at 03:40:09PM +0900, Masami Hiramatsu escreveu:
>> Hi,
>>
>> Here is a series to fix some memory leaks and refcount
>> leaks on map and dso. This also includes the refcnt APIs
>> with backtrace debugging feature.
>
>Cool, I wonder if this could be usable in the kernel proper... Is there
>such a facility there? I'll check.

As far as I know, there is no same feature, but I guess expanding
kmemleak is possible to provide similar feature.

>But thanks for doing this work, I'll go thru the fixes first, then look
>at the debugging feature.

Thanks!

>
>- Arnaldo
>
>> The story has started from the posible memory leak report
>> reported by Wnag Nan.
>> I've tried to use valgrind to ensure the perf probe doesn't
>> have other memory leaks. The result is here:
>>
>> ----
>> # valgrind ./perf probe vfs_read
>> ==17521== Memcheck, a memory error detector
>> ==17521== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
>> ==17521== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
>> ==17521== Command: ./perf probe vfs_read
>> ==17521==
>> Added new event:
>> probe:vfs_read (on vfs_read)
>>
>> You can now use it in all perf tools, such as:
>>
>> perf record -e probe:vfs_read -aR sleep 1
>>
>> ==17521==
>> ==17521== HEAP SUMMARY:
>> ==17521== in use at exit: 3,512,761 bytes in 38,012 blocks
>> ==17521== total heap usage: 74,723 allocs, 36,711 frees, 24,014,927
>> bytes allocated
>> ==17521==
>> ==17521== LEAK SUMMARY:
>> ==17521== definitely lost: 6,857 bytes in 49 blocks
>> ==17521== indirectly lost: 3,501,287 bytes in 37,891 blocks
>> ==17521== possibly lost: 0 bytes in 0 blocks
>> ==17521== still reachable: 4,617 bytes in 72 blocks
>> ==17521== suppressed: 0 bytes in 0 blocks
>> ==17521== Rerun with --leak-check=full to see details of leaked memory
>> ==17521==
>> ==17521== For counts of detected and suppressed errors, rerun with: -v
>> ==17521== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
>> ----
>>
>> Oops! It leaked almost 4 MB memories. I've tried to find
>> the root causes, and what I've found is there are many
>> leaks in not only perf-probe specific code, but also maps
>> and dsos (and some other pieces).
>>
>> The first 3 patches are just fixing 'easy' memory leaks. However,
>> most of the leaks are caused by refcnt. Since valgrind seems not
>> able to debug this kind of issues, I introduced a hand-made refcnt
>> backtrace APIs for debugging.
>> The rest of patches are for fixing refcnt leak bugs and replcing
>> refcnt apis.
>>
>> After all, most of the issues are gone, except for just a few issues
>> in elfutils. I'll continue to investigate that.
>>
>> ----
>> valgrind ./perf probe vfs_read
>> ==29521== Memcheck, a memory error detector
>> ==29521== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
>> ==29521== Using Valgrind-3.10.0 and LibVEX; rerun with -h for copyright info
>> ==29521== Command: ./perf probe vfs_read
>> ==29521==
>> Added new event:
>> probe:vfs_read (on vfs_read)
>>
>> You can now use it in all perf tools, such as:
>>
>> perf record -e probe:vfs_read -aR sleep 1
>>
>> ==29521==
>> ==29521== HEAP SUMMARY:
>> ==29521== in use at exit: 5,137 bytes in 75 blocks
>> ==29521== total heap usage: 74,723 allocs, 74,648 frees, 24,014,927
>> bytes allocated
>> ==29521==
>> ==29521== LEAK SUMMARY:
>> ==29521== definitely lost: 520 bytes in 3 blocks
>> ==29521== indirectly lost: 0 bytes in 0 blocks
>> ==29521== possibly lost: 0 bytes in 0 blocks
>> ==29521== still reachable: 4,617 bytes in 72 blocks
>> ==29521== suppressed: 0 bytes in 0 blocks
>> ==29521== Rerun with --leak-check=full to see details of leaked memory
>> ==29521==
>> ==29521== For counts of detected and suppressed errors, rerun with: -v
>> ==29521== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 2 from 2)
>> ----
>>
>> Anyway, I decided to release these fixes and the debugging feature
>> because at least this will improve perf quality.
>>
>> Thank you,
>>
>> ---
>>
>> Masami Hiramatsu (13):
>> perf probe: Fix to free temporal Dwarf_Frame
>> perf: Make perf_exec_path always returns malloc'd string
>> perf: Introduce generic refcount APIs with debug feature
>> perf: make map to use refcnt
>> perf: Fix machine__findnew_module_map to put registered map
>> perf: Fix machine__destroy_kernel_maps to put vmlinux_maps
>> perf: Fix to destroy kernel maps when machine exits
>> perf: Fix to put new map after inserting to map_groups in dso__load_sym
>> perf: Make dso to use refcnt for debug
>> perf: Fix __dsos__addnew to put dso after adding it to the list
>> perf: Fix machine__create_kernel_maps to put kernel dso
>> perf: Fix machine__findnew_module_map to put dso
>> perf: Fix dso__load_sym to put dso
>>
>>
>> tools/perf/config/Makefile | 5 ++
>> tools/perf/util/Build | 1
>> tools/perf/util/dso.c | 9 ++-
>> tools/perf/util/exec_cmd.c | 20 ++++--
>> tools/perf/util/exec_cmd.h | 5 +-
>> tools/perf/util/help.c | 6 +-
>> tools/perf/util/machine.c | 17 ++++-
>> tools/perf/util/map.c | 7 +-
>> tools/perf/util/map.h | 3 +
>> tools/perf/util/probe-finder.c | 9 ++-
>> tools/perf/util/refcnt.c | 125 ++++++++++++++++++++++++++++++++++++++++
>> tools/perf/util/refcnt.h | 65 +++++++++++++++++++++
>> tools/perf/util/symbol-elf.c | 4 +
>> 13 files changed, 250 insertions(+), 26 deletions(-)
>> create mode 100644 tools/perf/util/refcnt.c
>> create mode 100644 tools/perf/util/refcnt.h
>>
>>
>> --
>> Masami HIRAMATSU
>> Linux Technology Research Center, System Productivity Research Dept.
>> Center for Technology Innovation - Systems Engineering
>> Hitachi, Ltd., Research & Development Group
>> E-mail: [email protected]

Subject: RE: [PATCH perf/core 01/13] perf probe: Fix to free temporal Dwarf_Frame

From: Namhyung Kim [mailto:[email protected]]
>On November 19, 2015 7:36:39 AM GMT+09:00, Arnaldo Carvalho de Melo <[email protected]> wrote:
>>Em Wed, Nov 18, 2015 at 03:40:12PM +0900, Masami Hiramatsu escreveu:
>>> Since dwarf_cfi_addrframe returns malloc'd Dwarf_Frame
>>> object, it has to be freed after used.
>>
>>Applied to perf/urgent
>>
>>> Signed-off-by: Masami Hiramatsu <[email protected]>
>>> ---
>>> tools/perf/util/probe-finder.c | 9 ++++++---
>>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/tools/perf/util/probe-finder.c
>>b/tools/perf/util/probe-finder.c
>>> index 63993d7..4d7d4f4 100644
>>> --- a/tools/perf/util/probe-finder.c
>>> +++ b/tools/perf/util/probe-finder.c
>>> @@ -683,21 +683,24 @@ static int call_probe_finder(Dwarf_Die *sc_die,
>>struct probe_finder *pf)
>>> ret = dwarf_getlocation_addr(&fb_attr, pf->addr, &pf->fb_ops,
>>&nops, 1);
>>> if (ret <= 0 || nops == 0) {
>>> pf->fb_ops = NULL;
>>> + ret = 0;
>>> #if _ELFUTILS_PREREQ(0, 142)
>>> } else if (nops == 1 && pf->fb_ops[0].atom == DW_OP_call_frame_cfa
>>&&
>>> pf->cfi != NULL) {
>>> - Dwarf_Frame *frame;
>>> + Dwarf_Frame *frame = NULL;
>>> if (dwarf_cfi_addrframe(pf->cfi, pf->addr, &frame) != 0 ||
>>> dwarf_frame_cfa(frame, &pf->fb_ops, &nops) != 0) {
>
>What if dwarf_cfi_addrframe() succeeded but
>dwarf_frame_cfa() failed? It seems that the frame
>still can be leaked..

No, it is also caught by free(). Please see below,
>
>>> pr_warning("Failed to get call frame on 0x%jx\n",
>>> (uintmax_t)pf->addr);
>>> - return -ENOENT;
>>> + ret = -ENOENT;

I've replaced "return -ENOENT" with "ret = -ENOENT", so this fall down

>>> }
>>> + free(frame);

and free the frame :) (and if frame == NULL, it is just ignored)

Thank you!


>>> #endif
>>> }
>>>
>>> /* Call finder's callback handler */
>>> - ret = pf->callback(sc_die, pf);
>>> + if (ret >= 0)
>>> + ret = pf->callback(sc_die, pf);
>>>
>>> /* *pf->fb_ops will be cached in libdw. Don't free it. */
>>> pf->fb_ops = NULL;
>
>Hi Arnaldo and Masami,
>--
>Sent from my Android device with K-9 Mail. Please excuse my brevity.
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

Subject: RE: [PATCH perf/core 02/13] perf: Make perf_exec_path always returns malloc'd string

From: Arnaldo Carvalho de Melo [mailto:[email protected]]
>
>Em Wed, Nov 18, 2015 at 03:40:14PM +0900, Masami Hiramatsu escreveu:
>> Since system_path() returns malloc'd string if given path is
>> not an absolute path, perf_exec_path sometimes returns static
>> string and sometimes returns malloc'd string depends on the
>> environment variables or command options.
>>
>> This causes a memory leak because caller can not free the
>> returned string.
>>
>> This fixes perf_exec_path and system_path to always return
>> malloc'd string, so caller can always free it.
>
>> /* Returns the highest-priority, location to look for perf programs. */
>> -const char *perf_exec_path(void)
>> +char *perf_exec_path(void)
>> {
>> - const char *env;
>> + char *env;
>>
>> if (argv_exec_path)
>> - return argv_exec_path;
>> + return strdup(argv_exec_path);
>
>So here you return strdup without checking if it fails
>
>>
>> env = getenv(EXEC_PATH_ENVIRONMENT);
>> if (env && *env) {
>> - return env;
>> + env = strdup(env);
>> + if (env)
>> + return env;
>
>But here you change the behaviour by, having a EXEC_PATH_ENVIRONMENT,
>fallback to PERF_EXEC_PATH if strdup fails, that in turn can end up
>returning NULL, since system_path() doesn't check the strdup() result?

Ah, right. actually this is the first part where I fixed, and at that
point I thought this is better. But afterwards, I changed my mind to
return strdup directly. (If there is no memory caller must handle it)
So, I think the above should be

if (env && *env)
return strdup(env);

>
>I wonder if in those cases we couldn't check the address range for the
>pointer being freed and realize it is in .bss, .data or that it is a
>stack variable? Maybe there is some malloc() friend that, given a
>pointer, tells that yeah, it was allocated?

I wonder too. But as far as I took a look, I couldn't find such functions.

Thank you,

>
>- Arnaldo
>
>> }
>>
>> return system_path(PERF_EXEC_PATH);
>> @@ -83,9 +85,11 @@ void setup_path(void)
>> {
>> const char *old_path = getenv("PATH");
>> struct strbuf new_path = STRBUF_INIT;
>> + char *tmp = perf_exec_path();
>>
>> - add_path(&new_path, perf_exec_path());
>> + add_path(&new_path, tmp);
>> add_path(&new_path, argv0_path);
>> + free(tmp);
>>
>> if (old_path)
>> strbuf_addstr(&new_path, old_path);
>> diff --git a/tools/perf/util/exec_cmd.h b/tools/perf/util/exec_cmd.h
>> index bc4b915..48b4175 100644
>> --- a/tools/perf/util/exec_cmd.h
>> +++ b/tools/perf/util/exec_cmd.h
>> @@ -3,10 +3,11 @@
>>
>> extern void perf_set_argv_exec_path(const char *exec_path);
>> extern const char *perf_extract_argv0_path(const char *path);
>> -extern const char *perf_exec_path(void);
>> extern void setup_path(void);
>> extern int execv_perf_cmd(const char **argv); /* NULL terminated */
>> extern int execl_perf_cmd(const char *cmd, ...);
>> -extern const char *system_path(const char *path);
>> +/* perf_exec_path and system_path return malloc'd string, caller must free it */
>> +extern char *perf_exec_path(void);
>> +extern char *system_path(const char *path);
>>
>> #endif /* __PERF_EXEC_CMD_H */
>> diff --git a/tools/perf/util/help.c b/tools/perf/util/help.c
>> index 86c37c4..fa1fc4a 100644
>> --- a/tools/perf/util/help.c
>> +++ b/tools/perf/util/help.c
>> @@ -159,7 +159,7 @@ void load_command_list(const char *prefix,
>> struct cmdnames *other_cmds)
>> {
>> const char *env_path = getenv("PATH");
>> - const char *exec_path = perf_exec_path();
>> + char *exec_path = perf_exec_path();
>>
>> if (exec_path) {
>> list_commands_in_dir(main_cmds, exec_path, prefix);
>> @@ -187,6 +187,7 @@ void load_command_list(const char *prefix,
>> sizeof(*other_cmds->names), cmdname_compare);
>> uniq(other_cmds);
>> }
>> + free(exec_path);
>> exclude_cmds(other_cmds, main_cmds);
>> }
>>
>> @@ -203,13 +204,14 @@ void list_commands(const char *title, struct cmdnames *main_cmds,
>> longest = other_cmds->names[i]->len;
>>
>> if (main_cmds->cnt) {
>> - const char *exec_path = perf_exec_path();
>> + char *exec_path = perf_exec_path();
>> printf("available %s in '%s'\n", title, exec_path);
>> printf("----------------");
>> mput_char('-', strlen(title) + strlen(exec_path));
>> putchar('\n');
>> pretty_print_string_list(main_cmds, longest);
>> putchar('\n');
>> + free(exec_path);
>> }
>>
>> if (other_cmds->cnt) {
>
>----- End forwarded message -----

Subject: [PATCH perf/core v2] perf: Make perf_exec_path always returns malloc'd string

Since system_path() returns malloc'd string if given path is
not an absolute path, perf_exec_path sometimes returns static
string and sometimes returns malloc'd string depends on the
environment variables or command options.

This causes a memory leak because caller can not free the
returned string.

This fixes perf_exec_path and system_path to always return
malloc'd string, so caller can always free it.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes in v2:
- do not change the behavior if strdup is failed.
---
tools/perf/util/exec_cmd.c | 21 +++++++++++----------
tools/perf/util/exec_cmd.h | 5 +++--
tools/perf/util/help.c | 6 ++++--
3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/exec_cmd.c b/tools/perf/util/exec_cmd.c
index 7adf4ad..1099e92 100644
--- a/tools/perf/util/exec_cmd.c
+++ b/tools/perf/util/exec_cmd.c
@@ -9,17 +9,17 @@
static const char *argv_exec_path;
static const char *argv0_path;

-const char *system_path(const char *path)
+char *system_path(const char *path)
{
static const char *prefix = PREFIX;
struct strbuf d = STRBUF_INIT;

if (is_absolute_path(path))
- return path;
+ return strdup(path);

strbuf_addf(&d, "%s/%s", prefix, path);
path = strbuf_detach(&d, NULL);
- return path;
+ return (char *)path;
}

const char *perf_extract_argv0_path(const char *argv0)
@@ -52,17 +52,16 @@ void perf_set_argv_exec_path(const char *exec_path)


/* Returns the highest-priority, location to look for perf programs. */
-const char *perf_exec_path(void)
+char *perf_exec_path(void)
{
- const char *env;
+ char *env;

if (argv_exec_path)
- return argv_exec_path;
+ return strdup(argv_exec_path);

env = getenv(EXEC_PATH_ENVIRONMENT);
- if (env && *env) {
- return env;
- }
+ if (env && *env)
+ return strdup(env);

return system_path(PERF_EXEC_PATH);
}
@@ -83,9 +82,11 @@ void setup_path(void)
{
const char *old_path = getenv("PATH");
struct strbuf new_path = STRBUF_INIT;
+ char *tmp = perf_exec_path();

- add_path(&new_path, perf_exec_path());
+ add_path(&new_path, tmp);
add_path(&new_path, argv0_path);
+ free(tmp);

if (old_path)
strbuf_addstr(&new_path, old_path);
diff --git a/tools/perf/util/exec_cmd.h b/tools/perf/util/exec_cmd.h
index bc4b915..48b4175 100644
--- a/tools/perf/util/exec_cmd.h
+++ b/tools/perf/util/exec_cmd.h
@@ -3,10 +3,11 @@

extern void perf_set_argv_exec_path(const char *exec_path);
extern const char *perf_extract_argv0_path(const char *path);
-extern const char *perf_exec_path(void);
extern void setup_path(void);
extern int execv_perf_cmd(const char **argv); /* NULL terminated */
extern int execl_perf_cmd(const char *cmd, ...);
-extern const char *system_path(const char *path);
+/* perf_exec_path and system_path return malloc'd string, caller must free it */
+extern char *perf_exec_path(void);
+extern char *system_path(const char *path);

#endif /* __PERF_EXEC_CMD_H */
diff --git a/tools/perf/util/help.c b/tools/perf/util/help.c
index 86c37c4..fa1fc4a 100644
--- a/tools/perf/util/help.c
+++ b/tools/perf/util/help.c
@@ -159,7 +159,7 @@ void load_command_list(const char *prefix,
struct cmdnames *other_cmds)
{
const char *env_path = getenv("PATH");
- const char *exec_path = perf_exec_path();
+ char *exec_path = perf_exec_path();

if (exec_path) {
list_commands_in_dir(main_cmds, exec_path, prefix);
@@ -187,6 +187,7 @@ void load_command_list(const char *prefix,
sizeof(*other_cmds->names), cmdname_compare);
uniq(other_cmds);
}
+ free(exec_path);
exclude_cmds(other_cmds, main_cmds);
}

@@ -203,13 +204,14 @@ void list_commands(const char *title, struct cmdnames *main_cmds,
longest = other_cmds->names[i]->len;

if (main_cmds->cnt) {
- const char *exec_path = perf_exec_path();
+ char *exec_path = perf_exec_path();
printf("available %s in '%s'\n", title, exec_path);
printf("----------------");
mput_char('-', strlen(title) + strlen(exec_path));
putchar('\n');
pretty_print_string_list(main_cmds, longest);
putchar('\n');
+ free(exec_path);
}

if (other_cmds->cnt) {

2015-11-20 01:46:30

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH perf/core 01/13] perf probe: Fix to free temporal Dwarf_Frame

Hi Masami,

On Thu, Nov 19, 2015 at 03:12:37AM +0000, 平松雅巳 / HIRAMATU,MASAMI wrote:
> From: Namhyung Kim [mailto:[email protected]]
> >On November 19, 2015 7:36:39 AM GMT+09:00, Arnaldo Carvalho de Melo <[email protected]> wrote:
> >>Em Wed, Nov 18, 2015 at 03:40:12PM +0900, Masami Hiramatsu escreveu:
> >>> Since dwarf_cfi_addrframe returns malloc'd Dwarf_Frame
> >>> object, it has to be freed after used.
> >>
> >>Applied to perf/urgent
> >>
> >>> Signed-off-by: Masami Hiramatsu <[email protected]>
> >>> ---
> >>> tools/perf/util/probe-finder.c | 9 ++++++---
> >>> 1 file changed, 6 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/tools/perf/util/probe-finder.c
> >>b/tools/perf/util/probe-finder.c
> >>> index 63993d7..4d7d4f4 100644
> >>> --- a/tools/perf/util/probe-finder.c
> >>> +++ b/tools/perf/util/probe-finder.c
> >>> @@ -683,21 +683,24 @@ static int call_probe_finder(Dwarf_Die *sc_die,
> >>struct probe_finder *pf)
> >>> ret = dwarf_getlocation_addr(&fb_attr, pf->addr, &pf->fb_ops,
> >>&nops, 1);
> >>> if (ret <= 0 || nops == 0) {
> >>> pf->fb_ops = NULL;
> >>> + ret = 0;
> >>> #if _ELFUTILS_PREREQ(0, 142)
> >>> } else if (nops == 1 && pf->fb_ops[0].atom == DW_OP_call_frame_cfa
> >>&&
> >>> pf->cfi != NULL) {
> >>> - Dwarf_Frame *frame;
> >>> + Dwarf_Frame *frame = NULL;
> >>> if (dwarf_cfi_addrframe(pf->cfi, pf->addr, &frame) != 0 ||
> >>> dwarf_frame_cfa(frame, &pf->fb_ops, &nops) != 0) {
> >
> >What if dwarf_cfi_addrframe() succeeded but
> >dwarf_frame_cfa() failed? It seems that the frame
> >still can be leaked..
>
> No, it is also caught by free(). Please see below,
> >
> >>> pr_warning("Failed to get call frame on 0x%jx\n",
> >>> (uintmax_t)pf->addr);
> >>> - return -ENOENT;
> >>> + ret = -ENOENT;
>
> I've replaced "return -ENOENT" with "ret = -ENOENT", so this fall down

Ah, missed that. Thank you.
Namhyung


>
> >>> }
> >>> + free(frame);
>
> and free the frame :) (and if frame == NULL, it is just ignored)
>
> Thank you!
>
>
> >>> #endif
> >>> }
> >>>
> >>> /* Call finder's callback handler */
> >>> - ret = pf->callback(sc_die, pf);
> >>> + if (ret >= 0)
> >>> + ret = pf->callback(sc_die, pf);
> >>>
> >>> /* *pf->fb_ops will be cached in libdw. Don't free it. */
> >>> pf->fb_ops = NULL;
> >
> >Hi Arnaldo and Masami,
> >--
> >Sent from my Android device with K-9 Mail. Please excuse my brevity.

2015-11-20 02:52:36

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH perf/core 03/13] perf: Introduce generic refcount APIs with debug feature

On Wed, Nov 18, 2015 at 03:40:16PM +0900, Masami Hiramatsu wrote:
> This is a kind of debugging feature for atomic reference counter.
> The reference counters are widely used in perf tools but not well
> debugged. It sometimes causes memory leaks but no one has noticed
> the issue, since it is hard to debug such un-reclaimed objects.
>
> This refcnt interface provides fully backtrace debug feature to
> analyze such issue. User just replace atomic_t ops with refcnt
> APIs and add refcnt__exit() where the object is released.
>
> /* At object initializing */
> refcnt__init(obj, refcnt); /* <- atomic_set(&obj->refcnt, 1); */
>
> /* At object get method */
> refcnt__get(obj, refcnt); /* <- atomic_inc(&obj->refcnt); */
>
> /* At object put method */
> if (obj && refcnt__put(obj, refcnt)) /* <-atmoic_dec_and_test(&obj->refcnt)*/
>
> /* At object releasing */
> refcnt__exit(obj, refcnt); /* <- Newly added */
>
> The debugging feature is enabled when building perf with
> REFCNT_DEBUG=1. Otherwides it is just translated as normal
> atomic ops.
>
> Debugging binary warns you if it finds leaks when the perf exits.
> If you give -v (or --verbose) to the perf, it also shows backtrace
> logs on all refcnt operations of leaked objects.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>

Looks really useful!

Acked-by: Namhyung Kim <[email protected]>

Just a nitpick below..


> ---

[SNIP]
> +void refcnt_object__record(void *obj, const char *name, int count)
> +{
> + struct refcnt_object *ref = refcnt__find_object(obj);
> + struct refcnt_buffer *buf;
> +
> + /* If no entry, allocate new one */
> + if (!ref) {
> + ref = malloc(sizeof(*ref));
> + if (!ref) {
> + pr_debug("REFCNT: Out of memory for %p (%s)\n",
> + obj, name);
> + return;
> + }
> + INIT_LIST_HEAD(&ref->list);
> + INIT_LIST_HEAD(&ref->head);
> + ref->name = name;
> + ref->obj = obj;
> + list_add_tail(&ref->list, &refcnt_root);
> + }
> +
> + buf = malloc(sizeof(*buf));
> + if (!buf) {
> + pr_debug("REFCNT: Out of memory for %p (%s)\n", obj, ref->name);
> + return;
> + }
> +
> + INIT_LIST_HEAD(&buf->list);
> + buf->count = count;
> + buf->nr = backtrace(buf->buf, BACKTRACE_SIZE);
> + list_add_tail(&buf->list, &ref->head);
> +}
> +
> +static void pr_refcnt_buffer(struct refcnt_buffer *buf)
> +{
> + char **symbuf;
> + int i;
> +
> + if (!buf)
> + return;
> + symbuf = backtrace_symbols(buf->buf, buf->nr);

It seems you need to check the return value. Maybe we can use
backtrace_symbols_fd() instead, or just in case of an error.

Thanks,
Namhyung


> + /* Skip the first one because it is always btrace__record */
> + for (i = 1; i < buf->nr; i++)
> + pr_debug(" %s\n", symbuf[i]);
> + free(symbuf);
> +}
> +
> +void refcnt__dump_unreclaimed(void) __attribute__((destructor));
> +void refcnt__dump_unreclaimed(void)
> +{
> + struct refcnt_object *ref, *n;
> + struct refcnt_buffer *buf;
> + int i = 0;
> +
> + if (list_empty(&refcnt_root))
> + return;
> +
> + pr_warning("REFCNT: BUG: Unreclaimed objects found.\n");
> + list_for_each_entry_safe(ref, n, &refcnt_root, list) {
> + pr_debug("==== [%d] ====\nUnreclaimed %s: %p\n", i,
> + ref->name ? ref->name : "(object)", ref->obj);
> + list_for_each_entry(buf, &ref->head, list) {
> + pr_debug("Refcount %s => %d at\n",
> + buf->count > 0 ? "+1" : "-1",
> + buf->count > 0 ? buf->count : -buf->count - 1);
> + pr_refcnt_buffer(buf);
> + }
> + __refcnt_object__delete(ref);
> + i++;
> + }
> + pr_warning("REFCNT: Total %d objects are not reclaimed.\n", i);
> + if (!verbose)
> + pr_warning(" To see all backtraces, rerun with -v option\n");
> +}

Subject: RE: [PATCH perf/core 03/13] perf: Introduce generic refcount APIs with debug feature

From: Namhyung Kim [mailto:[email protected]]
>
>On Wed, Nov 18, 2015 at 03:40:16PM +0900, Masami Hiramatsu wrote:
>> This is a kind of debugging feature for atomic reference counter.
>> The reference counters are widely used in perf tools but not well
>> debugged. It sometimes causes memory leaks but no one has noticed
>> the issue, since it is hard to debug such un-reclaimed objects.
>>
>> This refcnt interface provides fully backtrace debug feature to
>> analyze such issue. User just replace atomic_t ops with refcnt
>> APIs and add refcnt__exit() where the object is released.
>>
>> /* At object initializing */
>> refcnt__init(obj, refcnt); /* <- atomic_set(&obj->refcnt, 1); */
>>
>> /* At object get method */
>> refcnt__get(obj, refcnt); /* <- atomic_inc(&obj->refcnt); */
>>
>> /* At object put method */
>> if (obj && refcnt__put(obj, refcnt)) /* <-atmoic_dec_and_test(&obj->refcnt)*/
>>
>> /* At object releasing */
>> refcnt__exit(obj, refcnt); /* <- Newly added */
>>
>> The debugging feature is enabled when building perf with
>> REFCNT_DEBUG=1. Otherwides it is just translated as normal
>> atomic ops.
>>
>> Debugging binary warns you if it finds leaks when the perf exits.
>> If you give -v (or --verbose) to the perf, it also shows backtrace
>> logs on all refcnt operations of leaked objects.
>>
>> Signed-off-by: Masami Hiramatsu <[email protected]>
>
>Looks really useful!
>
>Acked-by: Namhyung Kim <[email protected]>
>
>Just a nitpick below..

Thanks!

>> ---
>
>[SNIP]
>> +void refcnt_object__record(void *obj, const char *name, int count)
>> +{
>> + struct refcnt_object *ref = refcnt__find_object(obj);
>> + struct refcnt_buffer *buf;
>> +
>> + /* If no entry, allocate new one */
>> + if (!ref) {
>> + ref = malloc(sizeof(*ref));
>> + if (!ref) {
>> + pr_debug("REFCNT: Out of memory for %p (%s)\n",
>> + obj, name);
>> + return;
>> + }
>> + INIT_LIST_HEAD(&ref->list);
>> + INIT_LIST_HEAD(&ref->head);
>> + ref->name = name;
>> + ref->obj = obj;
>> + list_add_tail(&ref->list, &refcnt_root);
>> + }
>> +
>> + buf = malloc(sizeof(*buf));
>> + if (!buf) {
>> + pr_debug("REFCNT: Out of memory for %p (%s)\n", obj, ref->name);
>> + return;
>> + }
>> +
>> + INIT_LIST_HEAD(&buf->list);
>> + buf->count = count;
>> + buf->nr = backtrace(buf->buf, BACKTRACE_SIZE);
>> + list_add_tail(&buf->list, &ref->head);
>> +}
>> +
>> +static void pr_refcnt_buffer(struct refcnt_buffer *buf)
>> +{
>> + char **symbuf;
>> + int i;
>> +
>> + if (!buf)
>> + return;
>> + symbuf = backtrace_symbols(buf->buf, buf->nr);
>
>It seems you need to check the return value. Maybe we can use
>backtrace_symbols_fd() instead, or just in case of an error.

Yeah, it should be checked and in that case we can fall back to
backtrace_symbols_fd(as the last resort), but I don’t like
backtrace_symbols_fd replacing because it doesn't allow us to
indent the backtrace result.

Thank you,


>
>Thanks,
>Namhyung
>
>
>> + /* Skip the first one because it is always btrace__record */
>> + for (i = 1; i < buf->nr; i++)
>> + pr_debug(" %s\n", symbuf[i]);
>> + free(symbuf);
>> +}
>> +
>> +void refcnt__dump_unreclaimed(void) __attribute__((destructor));
>> +void refcnt__dump_unreclaimed(void)
>> +{
>> + struct refcnt_object *ref, *n;
>> + struct refcnt_buffer *buf;
>> + int i = 0;
>> +
>> + if (list_empty(&refcnt_root))
>> + return;
>> +
>> + pr_warning("REFCNT: BUG: Unreclaimed objects found.\n");
>> + list_for_each_entry_safe(ref, n, &refcnt_root, list) {
>> + pr_debug("==== [%d] ====\nUnreclaimed %s: %p\n", i,
>> + ref->name ? ref->name : "(object)", ref->obj);
>> + list_for_each_entry(buf, &ref->head, list) {
>> + pr_debug("Refcount %s => %d at\n",
>> + buf->count > 0 ? "+1" : "-1",
>> + buf->count > 0 ? buf->count : -buf->count - 1);
>> + pr_refcnt_buffer(buf);
>> + }
>> + __refcnt_object__delete(ref);
>> + i++;
>> + }
>> + pr_warning("REFCNT: Total %d objects are not reclaimed.\n", i);
>> + if (!verbose)
>> + pr_warning(" To see all backtraces, rerun with -v option\n");
>> +}
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2015-11-20 05:53:54

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH perf/core 03/13] perf: Introduce generic refcount APIs with debug feature

On Fri, Nov 20, 2015 at 04:12:06AM +0000, 平松雅巳 / HIRAMATU,MASAMI wrote:
> From: Namhyung Kim [mailto:[email protected]]
> >
> >On Wed, Nov 18, 2015 at 03:40:16PM +0900, Masami Hiramatsu wrote:
> >> This is a kind of debugging feature for atomic reference counter.
> >> The reference counters are widely used in perf tools but not well
> >> debugged. It sometimes causes memory leaks but no one has noticed
> >> the issue, since it is hard to debug such un-reclaimed objects.
> >>
> >> This refcnt interface provides fully backtrace debug feature to
> >> analyze such issue. User just replace atomic_t ops with refcnt
> >> APIs and add refcnt__exit() where the object is released.
> >>
> >> /* At object initializing */
> >> refcnt__init(obj, refcnt); /* <- atomic_set(&obj->refcnt, 1); */
> >>
> >> /* At object get method */
> >> refcnt__get(obj, refcnt); /* <- atomic_inc(&obj->refcnt); */
> >>
> >> /* At object put method */
> >> if (obj && refcnt__put(obj, refcnt)) /* <-atmoic_dec_and_test(&obj->refcnt)*/
> >>
> >> /* At object releasing */
> >> refcnt__exit(obj, refcnt); /* <- Newly added */
> >>
> >> The debugging feature is enabled when building perf with
> >> REFCNT_DEBUG=1. Otherwides it is just translated as normal
> >> atomic ops.
> >>
> >> Debugging binary warns you if it finds leaks when the perf exits.
> >> If you give -v (or --verbose) to the perf, it also shows backtrace
> >> logs on all refcnt operations of leaked objects.
> >>
> >> Signed-off-by: Masami Hiramatsu <[email protected]>
> >
> >Looks really useful!
> >
> >Acked-by: Namhyung Kim <[email protected]>
> >
> >Just a nitpick below..
>
> Thanks!
>
> >> ---
> >
> >[SNIP]
> >> +void refcnt_object__record(void *obj, const char *name, int count)
> >> +{
> >> + struct refcnt_object *ref = refcnt__find_object(obj);
> >> + struct refcnt_buffer *buf;
> >> +
> >> + /* If no entry, allocate new one */
> >> + if (!ref) {
> >> + ref = malloc(sizeof(*ref));
> >> + if (!ref) {
> >> + pr_debug("REFCNT: Out of memory for %p (%s)\n",
> >> + obj, name);
> >> + return;
> >> + }
> >> + INIT_LIST_HEAD(&ref->list);
> >> + INIT_LIST_HEAD(&ref->head);
> >> + ref->name = name;
> >> + ref->obj = obj;
> >> + list_add_tail(&ref->list, &refcnt_root);
> >> + }
> >> +
> >> + buf = malloc(sizeof(*buf));
> >> + if (!buf) {
> >> + pr_debug("REFCNT: Out of memory for %p (%s)\n", obj, ref->name);
> >> + return;
> >> + }
> >> +
> >> + INIT_LIST_HEAD(&buf->list);
> >> + buf->count = count;
> >> + buf->nr = backtrace(buf->buf, BACKTRACE_SIZE);
> >> + list_add_tail(&buf->list, &ref->head);
> >> +}
> >> +
> >> +static void pr_refcnt_buffer(struct refcnt_buffer *buf)
> >> +{
> >> + char **symbuf;
> >> + int i;
> >> +
> >> + if (!buf)
> >> + return;
> >> + symbuf = backtrace_symbols(buf->buf, buf->nr);
> >
> >It seems you need to check the return value. Maybe we can use
> >backtrace_symbols_fd() instead, or just in case of an error.
>
> Yeah, it should be checked and in that case we can fall back to
> backtrace_symbols_fd(as the last resort), but I don’t like
> backtrace_symbols_fd replacing because it doesn't allow us to
> indent the backtrace result.

OK, I think we need to improve the backtrace code in general. I'll
send a related patch soon.

Thanks,
Namhyung


> >
> >> + /* Skip the first one because it is always btrace__record */
> >> + for (i = 1; i < buf->nr; i++)
> >> + pr_debug(" %s\n", symbuf[i]);
> >> + free(symbuf);
> >> +}
> >> +
> >> +void refcnt__dump_unreclaimed(void) __attribute__((destructor));
> >> +void refcnt__dump_unreclaimed(void)
> >> +{
> >> + struct refcnt_object *ref, *n;
> >> + struct refcnt_buffer *buf;
> >> + int i = 0;
> >> +
> >> + if (list_empty(&refcnt_root))
> >> + return;
> >> +
> >> + pr_warning("REFCNT: BUG: Unreclaimed objects found.\n");
> >> + list_for_each_entry_safe(ref, n, &refcnt_root, list) {
> >> + pr_debug("==== [%d] ====\nUnreclaimed %s: %p\n", i,
> >> + ref->name ? ref->name : "(object)", ref->obj);
> >> + list_for_each_entry(buf, &ref->head, list) {
> >> + pr_debug("Refcount %s => %d at\n",
> >> + buf->count > 0 ? "+1" : "-1",
> >> + buf->count > 0 ? buf->count : -buf->count - 1);
> >> + pr_refcnt_buffer(buf);
> >> + }
> >> + __refcnt_object__delete(ref);
> >> + i++;
> >> + }
> >> + pr_warning("REFCNT: Total %d objects are not reclaimed.\n", i);
> >> + if (!verbose)
> >> + pr_warning(" To see all backtraces, rerun with -v option\n");
> >> +}

Subject: [tip:perf/core] perf probe: Fix to free temporal Dwarf_Frame

Commit-ID: 05c8d802fa52ef17dbcce21c38b72b4a313eb036
Gitweb: http://git.kernel.org/tip/05c8d802fa52ef17dbcce21c38b72b4a313eb036
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 18 Nov 2015 15:40:12 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:17 -0300

perf probe: Fix to free temporal Dwarf_Frame

Since dwarf_cfi_addrframe returns malloc'd Dwarf_Frame object, it has to
be freed after it is used.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/probe-finder.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index 05012bb..1cab05a 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -683,21 +683,24 @@ static int call_probe_finder(Dwarf_Die *sc_die, struct probe_finder *pf)
ret = dwarf_getlocation_addr(&fb_attr, pf->addr, &pf->fb_ops, &nops, 1);
if (ret <= 0 || nops == 0) {
pf->fb_ops = NULL;
+ ret = 0;
#if _ELFUTILS_PREREQ(0, 142)
} else if (nops == 1 && pf->fb_ops[0].atom == DW_OP_call_frame_cfa &&
pf->cfi != NULL) {
- Dwarf_Frame *frame;
+ Dwarf_Frame *frame = NULL;
if (dwarf_cfi_addrframe(pf->cfi, pf->addr, &frame) != 0 ||
dwarf_frame_cfa(frame, &pf->fb_ops, &nops) != 0) {
pr_warning("Failed to get call frame on 0x%jx\n",
(uintmax_t)pf->addr);
- return -ENOENT;
+ ret = -ENOENT;
}
+ free(frame);
#endif
}

/* Call finder's callback handler */
- ret = pf->callback(sc_die, pf);
+ if (ret >= 0)
+ ret = pf->callback(sc_die, pf);

/* *pf->fb_ops will be cached in libdw. Don't free it. */
pf->fb_ops = NULL;

Subject: [tip:perf/core] perf machine: Fix machine__findnew_module_map to put registered map

Commit-ID: 9afcb420d6cfeadf5d872f395061c611536615fb
Gitweb: http://git.kernel.org/tip/9afcb420d6cfeadf5d872f395061c611536615fb
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 18 Nov 2015 15:40:20 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:18 -0300

perf machine: Fix machine__findnew_module_map to put registered map

Fix machine object to drop the reference to the map object after it
inserted it into machine->kmaps.

refcnt debugger shows what happened:
----
==== [2] ====
Unreclaimed map: 0x346f750
Refcount +1 => 1 at
./perf(map__new2+0xb5) [0x4bdea5]
./perf() [0x4b8aaf]
./perf(modules__parse+0xfc) [0x4a9cbc]
./perf() [0x4b83c0]
./perf(machine__create_kernel_maps+0x148) [0x4bb208]
./perf(machine__new_host+0xfa) [0x4bb3fa]
./perf(init_probe_symbol_maps+0x93) [0x5062b3]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(maps__insert+0x9a) [0x4bfd4a]
./perf() [0x4b8acb]
./perf(modules__parse+0xfc) [0x4a9cbc]
./perf() [0x4b83c0]
./perf(machine__create_kernel_maps+0x148) [0x4bb208]
./perf(machine__new_host+0xfa) [0x4bb3fa]
./perf(init_probe_symbol_maps+0x93) [0x5062b3]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(map_groups__exit+0x94) [0x4bea54]
./perf(machine__delete+0x3d) [0x4b91ed]
./perf(exit_probe_symbol_maps+0x28) [0x506358]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5]
./perf() [0x4220a9]
----

This pattern clearly shows that the refcnt of the map is acquired twice
by map__new2 and maps__insert but released onlu once at
map_groups__exit, when we purge its maps rbtree.

Since maps__insert already reference counted the map, we have to drop
the constructor (map__new2) reference count right after inserting it.

These happened in machine__findnew_module_map, as below.

----
# eu-addr2line -e ./perf -f 0x4b8aaf
machine__findnew_module_map inlined at util/machine.c:1046
in machine__create_module
util/machine.c:582
# eu-addr2line -e ./perf -f 0x4b8acb
map_groups__insert inlined at util/machine.c:585
in machine__create_module
util/map.h:208
----

(note that both are at util/machine.c:58X which is
machine__findnew_module_map)

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/machine.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 8b303ff..0487d77 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -585,6 +585,8 @@ struct map *machine__findnew_module_map(struct machine *machine, u64 start,

map_groups__insert(&machine->kmaps, map);

+ /* Put the map here because map_groups__insert alread got it */
+ map__put(map);
out:
free(m.name);
return map;

Subject: [tip:perf/core] perf machine: Fix machine__destroy_kernel_maps to drop vmlinux_maps references

Commit-ID: e96e4078e9a5ea150b3ad9a296440a7976439e4a
Gitweb: http://git.kernel.org/tip/e96e4078e9a5ea150b3ad9a296440a7976439e4a
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 18 Nov 2015 15:40:22 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:18 -0300

perf machine: Fix machine__destroy_kernel_maps to drop vmlinux_maps references

Fix machine__destroy_kernel_maps() to drop vmlinux_maps references
before filling it with NULL.

Refcnt debugger shows
==== [1] ====
Unreclaimed map: 0x36b1070
Refcount +1 => 1 at
./perf(map__new2+0xb5) [0x4bdec5]
./perf(machine__create_kernel_maps+0x72) [0x4bb152]
./perf(machine__new_host+0xfa) [0x4bb41a]
./perf(init_probe_symbol_maps+0x93) [0x5062d3]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(maps__insert+0x9a) [0x4bfd6a]
./perf(machine__create_kernel_maps+0xc3) [0x4bb1a3]
./perf(machine__new_host+0xfa) [0x4bb41a]
./perf(init_probe_symbol_maps+0x93) [0x5062d3]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(map_groups__exit+0x94) [0x4bea74]
./perf(machine__delete+0x3d) [0x4b91fd]
./perf(exit_probe_symbol_maps+0x28) [0x506378]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5]
./perf() [0x4220a9]

map__new2() returns map with refcnt = 1, and also map_groups__insert
gets it again in__machine__create_kernel_maps().

machine__destroy_kernel_maps() calls map_groups__remove() to
decrement the refcnt, but before decrement it again (corresponding
to map__new2), it makes vmlinux_maps[type] = NULL. And this may
cause a refcnt leak.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/machine.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0487d77..e9e09be 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -790,6 +790,7 @@ void machine__destroy_kernel_maps(struct machine *machine)
kmap->ref_reloc_sym = NULL;
}

+ map__put(machine->vmlinux_maps[type]);
machine->vmlinux_maps[type] = NULL;
}
}

Subject: [tip:perf/core] perf machine: Fix to destroy kernel maps when machine exits

Commit-ID: ebe9729c8c3171aa46ad5d7af40acdc29806689d
Gitweb: http://git.kernel.org/tip/ebe9729c8c3171aa46ad5d7af40acdc29806689d
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 18 Nov 2015 15:40:24 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:19 -0300

perf machine: Fix to destroy kernel maps when machine exits

Actually machine__exit forgot to call machine__destroy_kernel_maps.

This fixes some memory leaks on map as below.

Without this fix.
----
./perf probe vfs_read
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 4 objects are not reclaimed.
To see all backtraces, rerun with -v option
----
With this fix.
----
./perf probe vfs_read
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 2 objects are not reclaimed.
To see all backtraces, rerun with -v option
----

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/machine.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e9e09be..a358771 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -122,6 +122,7 @@ void machine__delete_threads(struct machine *machine)

void machine__exit(struct machine *machine)
{
+ machine__destroy_kernel_maps(machine);
map_groups__exit(&machine->kmaps);
dsos__exit(&machine->dsos);
machine__exit_vdso(machine);

Subject: [tip:perf/core] perf tools: Make perf_exec_path() always return malloc'd string

Commit-ID: c4068f51d40df151a661a384ab1309b11d7f012e
Gitweb: http://git.kernel.org/tip/c4068f51d40df151a661a384ab1309b11d7f012e
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Thu, 19 Nov 2015 15:04:53 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:19 -0300

perf tools: Make perf_exec_path() always return malloc'd string

Since system_path() returns malloc'd string if given path is not an
absolute path, perf_exec_path() sometimes returns a static string and
sometimes returns a malloc'd string depending on the environment
variables or command options.

This may cause a memory leak because the caller can not unconditionally
free the returned string.

This fixes perf_exec_path() and system_path() to always return a
malloc'd string, so the caller can always free it.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/exec_cmd.c | 21 +++++++++++----------
tools/perf/util/exec_cmd.h | 5 +++--
tools/perf/util/help.c | 6 ++++--
3 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/tools/perf/util/exec_cmd.c b/tools/perf/util/exec_cmd.c
index 7adf4ad..1099e92 100644
--- a/tools/perf/util/exec_cmd.c
+++ b/tools/perf/util/exec_cmd.c
@@ -9,17 +9,17 @@
static const char *argv_exec_path;
static const char *argv0_path;

-const char *system_path(const char *path)
+char *system_path(const char *path)
{
static const char *prefix = PREFIX;
struct strbuf d = STRBUF_INIT;

if (is_absolute_path(path))
- return path;
+ return strdup(path);

strbuf_addf(&d, "%s/%s", prefix, path);
path = strbuf_detach(&d, NULL);
- return path;
+ return (char *)path;
}

const char *perf_extract_argv0_path(const char *argv0)
@@ -52,17 +52,16 @@ void perf_set_argv_exec_path(const char *exec_path)


/* Returns the highest-priority, location to look for perf programs. */
-const char *perf_exec_path(void)
+char *perf_exec_path(void)
{
- const char *env;
+ char *env;

if (argv_exec_path)
- return argv_exec_path;
+ return strdup(argv_exec_path);

env = getenv(EXEC_PATH_ENVIRONMENT);
- if (env && *env) {
- return env;
- }
+ if (env && *env)
+ return strdup(env);

return system_path(PERF_EXEC_PATH);
}
@@ -83,9 +82,11 @@ void setup_path(void)
{
const char *old_path = getenv("PATH");
struct strbuf new_path = STRBUF_INIT;
+ char *tmp = perf_exec_path();

- add_path(&new_path, perf_exec_path());
+ add_path(&new_path, tmp);
add_path(&new_path, argv0_path);
+ free(tmp);

if (old_path)
strbuf_addstr(&new_path, old_path);
diff --git a/tools/perf/util/exec_cmd.h b/tools/perf/util/exec_cmd.h
index bc4b915..48b4175 100644
--- a/tools/perf/util/exec_cmd.h
+++ b/tools/perf/util/exec_cmd.h
@@ -3,10 +3,11 @@

extern void perf_set_argv_exec_path(const char *exec_path);
extern const char *perf_extract_argv0_path(const char *path);
-extern const char *perf_exec_path(void);
extern void setup_path(void);
extern int execv_perf_cmd(const char **argv); /* NULL terminated */
extern int execl_perf_cmd(const char *cmd, ...);
-extern const char *system_path(const char *path);
+/* perf_exec_path and system_path return malloc'd string, caller must free it */
+extern char *perf_exec_path(void);
+extern char *system_path(const char *path);

#endif /* __PERF_EXEC_CMD_H */
diff --git a/tools/perf/util/help.c b/tools/perf/util/help.c
index 86c37c4..fa1fc4a 100644
--- a/tools/perf/util/help.c
+++ b/tools/perf/util/help.c
@@ -159,7 +159,7 @@ void load_command_list(const char *prefix,
struct cmdnames *other_cmds)
{
const char *env_path = getenv("PATH");
- const char *exec_path = perf_exec_path();
+ char *exec_path = perf_exec_path();

if (exec_path) {
list_commands_in_dir(main_cmds, exec_path, prefix);
@@ -187,6 +187,7 @@ void load_command_list(const char *prefix,
sizeof(*other_cmds->names), cmdname_compare);
uniq(other_cmds);
}
+ free(exec_path);
exclude_cmds(other_cmds, main_cmds);
}

@@ -203,13 +204,14 @@ void list_commands(const char *title, struct cmdnames *main_cmds,
longest = other_cmds->names[i]->len;

if (main_cmds->cnt) {
- const char *exec_path = perf_exec_path();
+ char *exec_path = perf_exec_path();
printf("available %s in '%s'\n", title, exec_path);
printf("----------------");
mput_char('-', strlen(title) + strlen(exec_path));
putchar('\n');
pretty_print_string_list(main_cmds, longest);
putchar('\n');
+ free(exec_path);
}

if (other_cmds->cnt) {

Subject: [tip:perf/core] perf tools: Fix to put new map after inserting to map_groups in dso__load_sym

Commit-ID: 8d5c340dfcd48751fdff301bb2a7e3f875652dcb
Gitweb: http://git.kernel.org/tip/8d5c340dfcd48751fdff301bb2a7e3f875652dcb
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 18 Nov 2015 15:40:27 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:20 -0300

perf tools: Fix to put new map after inserting to map_groups in dso__load_sym

Fix dso__load_sym to put the map object which is already
insterted to kmaps.

Refcnt debugger shows
==== [0] ====
Unreclaimed map: 0x39113e0
Refcount +1 => 1 at
./perf(map__new2+0xb5) [0x4be155]
./perf(dso__load_sym+0xee1) [0x503461]
./perf(dso__load_vmlinux+0xbf) [0x4aa6df]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa83c]
./perf() [0x50528a]
./perf(convert_perf_probe_events+0xd79) [0x50ac29]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(maps__insert+0x9a) [0x4bfffa]
./perf(dso__load_sym+0xf89) [0x503509]
./perf(dso__load_vmlinux+0xbf) [0x4aa6df]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa83c]
./perf() [0x50528a]
./perf(convert_perf_probe_events+0xd79) [0x50ac29]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(map_groups__exit+0x94) [0x4bed04]
./perf(machine__delete+0xb0) [0x4b9300]
./perf(exit_probe_symbol_maps+0x28) [0x506608]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f152368baf5]
./perf() [0x4220a9]

This means that the dso__load_sym calls map__new2 and maps_insert, both
of them bump the map refcount, but map_groups__exit will drop just one
reference.

Fix it by dropping the refcount after inserting it into kmaps.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/symbol-elf.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 475d88d..53f1996 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1042,6 +1042,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
}
curr_dso->symtab_type = dso->symtab_type;
map_groups__insert(kmaps, curr_map);
+ /* kmaps already got it */
+ map__put(curr_map);
dsos__add(&map->groups->machine->dsos, curr_dso);
dso__set_loaded(curr_dso, map->type);
} else

Subject: [tip:perf/core] perf tools: Fix __dsos__addnew to put dso after adding it to the list

Commit-ID: 82de26abdc127172fd7453a61d35a9b33bf4f871
Gitweb: http://git.kernel.org/tip/82de26abdc127172fd7453a61d35a9b33bf4f871
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 18 Nov 2015 15:40:31 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:20 -0300

perf tools: Fix __dsos__addnew to put dso after adding it to the list

__dsos__addnew should drop the constructor reference to dso after adding
it to the list, because __dsos__add() will get a reference that will be
kept while it is in the list.

This fixes DSO leaks when entries are removed to the list and the refcount
never gets to zero.

Refcnt debugger shows:
==== [0] ====
Unreclaimed dso: 0x2fccab0
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(__dsos__addnew+0x29) [0x4a6e19]
./perf(dsos__findnew+0xd1) [0x4a7281]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8df2]
./perf(machine__create_kernel_maps+0x28) [0x4bb528]
./perf(machine__new_host+0xfa) [0x4bb84a]
./perf(init_probe_symbol_maps+0x93) [0x506713]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(__dsos__addnew+0xfb) [0x4a6eeb]
./perf(dsos__findnew+0xd1) [0x4a7281]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8df2]
./perf(machine__create_kernel_maps+0x28) [0x4bb528]
./perf(machine__new_host+0xfa) [0x4bb84a]
./perf(init_probe_symbol_maps+0x93) [0x506713]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
./perf() [0x4220a9]
Refcount +1 => 3 at
./perf(dsos__findnew+0x7e) [0x4a722e]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8df2]
./perf(machine__create_kernel_maps+0x28) [0x4bb528]
./perf(machine__new_host+0xfa) [0x4bb84a]
./perf(init_probe_symbol_maps+0x93) [0x506713]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f46df132af5]
./perf() [0x4220a9]
[snip]

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/dso.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index 425df5c..e8e9a9d 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1243,6 +1243,8 @@ struct dso *__dsos__addnew(struct dsos *dsos, const char *name)
if (dso != NULL) {
__dsos__add(dsos, dso);
dso__set_basename(dso);
+ /* Put dso here because __dsos_add already got it */
+ dso__put(dso);
}
return dso;
}

Subject: [tip:perf/core] perf tools: Fix machine__create_kernel_maps to put kernel dso refcount

Commit-ID: 1154c957607afdf5936ae14e1be27d7ca4e7bd30
Gitweb: http://git.kernel.org/tip/1154c957607afdf5936ae14e1be27d7ca4e7bd30
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 18 Nov 2015 15:40:33 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:21 -0300

perf tools: Fix machine__create_kernel_maps to put kernel dso refcount

Fix machine__create_kernel_maps() to put kernel dso because the dso has
been gotten via __machine__create_kernel_maps().

Refcnt debugger shows:
==== [0] ====
Unreclaimed dso: 0x3036ab0
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(__dsos__addnew+0x29) [0x4a6e19]
./perf(dsos__findnew+0xd1) [0x4a7181]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8cf2]
./perf(machine__create_kernel_maps+0x28) [0x4bb428]
./perf(machine__new_host+0xfa) [0x4bb74a]
./perf(init_probe_symbol_maps+0x93) [0x506613]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
./perf() [0x4220a9]
[snip]
Refcount +1 => 2 at
./perf(dsos__findnew+0x7e) [0x4a712e]
./perf(machine__findnew_kernel+0x27) [0x4a5e17]
./perf() [0x4b8cf2]
./perf(machine__create_kernel_maps+0x28) [0x4bb428]
./perf(machine__new_host+0xfa) [0x4bb74a]
./perf(init_probe_symbol_maps+0x93) [0x506613]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
./perf() [0x4220a9]
[snip]
Refcount -1 => 1 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(machine__delete+0xfe) [0x4b93ee]
./perf(exit_probe_symbol_maps+0x28) [0x5066b8]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5]
./perf() [0x4220a9]

Actually, dsos__findnew gets the dso before returning it, so the dso
user (in this case machine__create_kernel_maps) has to put the dso after
used.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/machine.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index a358771..0b4a05c 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1088,11 +1088,14 @@ int machine__create_kernel_maps(struct machine *machine)
struct dso *kernel = machine__get_kernel(machine);
const char *name;
u64 addr = machine__get_running_kernel_start(machine, &name);
- if (!addr)
+ int ret;
+
+ if (!addr || kernel == NULL)
return -1;

- if (kernel == NULL ||
- __machine__create_kernel_maps(machine, kernel) < 0)
+ ret = __machine__create_kernel_maps(machine, kernel);
+ dso__put(kernel);
+ if (ret < 0)
return -1;

if (symbol_conf.use_modules && machine__create_modules(machine) < 0) {

Subject: [tip:perf/core] perf machine: Fix machine__findnew_module_map to put dso

Commit-ID: 566c69c36e6178774dd484ea4a02b76f6bd0ede4
Gitweb: http://git.kernel.org/tip/566c69c36e6178774dd484ea4a02b76f6bd0ede4
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 18 Nov 2015 15:40:35 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 19 Nov 2015 13:19:21 -0300

perf machine: Fix machine__findnew_module_map to put dso

Fix machine__findnew_module_map to drop the reference to the dso because
it is already referenced by both machine__findnew_module_dso() and
map__new2().

Refcnt debugger shows:

==== [1] ====
Unreclaimed dso: 0x1ffd980
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(__dsos__addnew+0x29) [0x4a6e19]
./perf() [0x4b8b91]
./perf(modules__parse+0xfc) [0x4a9d5c]
./perf() [0x4b8460]
./perf(machine__create_kernel_maps+0x150) [0x4bb550]
./perf(machine__new_host+0xfa) [0x4bb75a]
./perf(init_probe_symbol_maps+0x93) [0x506623]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
./perf() [0x4220a9]

This map_groups__insert(0x4b8b91) already gets a reference to the new
dso:

----
eu-addr2line -e ./perf -f 0x4b8b91
map_groups__insert inlined at util/machine.c:586 in
machine__create_module
util/map.h:207
----

So this dso refcnt will be released when map_groups gets released.

[snip]
Refcount +1 => 2 at
./perf(dso__get+0x34) [0x4a65f4]
./perf() [0x4b8b35]
./perf(modules__parse+0xfc) [0x4a9d5c]
./perf() [0x4b8460]
./perf(machine__create_kernel_maps+0x150) [0x4bb550]
./perf(machine__new_host+0xfa) [0x4bb75a]
./perf(init_probe_symbol_maps+0x93) [0x506623]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
./perf() [0x4220a9]

Here, machine__findnew_module_dso(0x4b8b35) gets the dso (and stores it
in a local variable):

----
# eu-addr2line -e ./perf -f 0x4b8b35
machine__findnew_module_dso inlined at util/machine.c:578 in
machine__create_module
util/machine.c:514
----

Refcount +1 => 3 at
./perf(dso__get+0x34) [0x4a65f4]
./perf(map__new2+0x76) [0x4be1c6]
./perf() [0x4b8b4f]
./perf(modules__parse+0xfc) [0x4a9d5c]
./perf() [0x4b8460]
./perf(machine__create_kernel_maps+0x150) [0x4bb550]
./perf(machine__new_host+0xfa) [0x4bb75a]
./perf(init_probe_symbol_maps+0x93) [0x506623]
./perf() [0x455ffa]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5]
./perf() [0x4220a9]

But also map__new2() gets the dso which will be put when the map is
released.

So, we have to drop the constructor reference obtained in
machine__findnew_module_dso().

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/machine.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 0b4a05c..7f5071a 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -565,7 +565,7 @@ struct map *machine__findnew_module_map(struct machine *machine, u64 start,
const char *filename)
{
struct map *map = NULL;
- struct dso *dso;
+ struct dso *dso = NULL;
struct kmod_path m;

if (kmod_path__parse_name(&m, filename))
@@ -589,6 +589,8 @@ struct map *machine__findnew_module_map(struct machine *machine, u64 start,
/* Put the map here because map_groups__insert alread got it */
map__put(map);
out:
+ /* put the dso here, corresponding to machine__findnew_module_dso */
+ dso__put(dso);
free(m.name);
return map;
}