Subject: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

Hi Arnaldo,

Here is a series of patches for perf refcnt debugger and
some fixes.

In this series I've replaced all atomic reference counters
with the refcnt interface, including dso, map, map_groups,
thread, cpu_map, comm_str, cgroup_sel, thread_map, and
perf_mmap.

refcnt debugger (or refcnt leak checker)
===============

At first, note that this change doesn't affect any compiled
code unless building with REFCNT_DEBUG=1 (see macros in
refcnt.h). So, this feature is only enabled in the debug binary.
But before releasing, we can ensure that all objects are safely
reclaimed before exit in -rc phase.

To use the refcnt debugger, you just build a perf binary with
REFCNT_DEBUG=1 as follows;
----
# make REFCNT_DEBUG=1
----
And run the perf command. If the refcnt debugger finds leaks,
it shows the summary of the bugs. Note that if the command does
not use stdio, it doesn't show anything because refcnt debugger
uses pr_debug. Please use --stdio option in such case.

E.g. with the first 13 patches, perf top shows that many objects
are leaked.
----
# ./perf top --stdio
q
exiting.
REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 3595 objects are not reclaimed.
"map" leaks 3334 objects
"dso" leaks 231 objects
"thread" leaks 9 objects
"comm_str" leaks 13 objects
"map_groups" leaks 8 objects
To see all backtraces, rerun with -v option
----
You can also dump all the backtrace data with -v option, but I
don't recommend you to do it on your console, because it will
be very very long (for example, above dumps 40MB text logs
on your console).

Instead, you can use PERF_REFCNT_DEBUG_FILTER env. var. to focus
on one object, and also use "2>" to redirect stderr output to file.
E.g.
----
# PERF_REFCNT_DEBUG_FILTER=map ./perf top --stdio -v 2> refcnt.log
q
exiting.
# less refcnt.log
mmap size 528384B
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.3.0-rc2+/build/vmlinux for symbols
REFCNT: BUG: Unreclaimed objects found.
==== [0] ====
Unreclaimed map@0x2157dc0
Refcount +1 => 1 at
...
./perf() [0x4226fd]
REFCNT: Total 3229 objects are not reclaimed.
"map" leaks 3229 objects
----

Bugfixes
========

In this series I've also tried to fix some object leaks in perf top
and perf stat.
After applying this series, this reduced (not vanished) to 1/5.
----
# ./perf top --stdio
q
exiting.
REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 866 objects are not reclaimed.
"dso" leaks 213 objects
"map" leaks 624 objects
"comm_str" leaks 12 objects
"thread" leaks 9 objects
"map_groups" leaks 8 objects
To see all backtraces, rerun with -v option
----

Actually, I'm still not able to fix all of the bugs. It seems that
hists has a bug that hists__delete_entries doesn't delete all the
entries because some entries are on hists->entries but others on
hists->entries_in. And I'm not so sure about hists.c.
Arnaldo, would you have any idea for this bug?


General refcnt miscodings
=========================

BTW, while applying this change, I've found that there are refcnt
coding mismatches in those code and most of the bugs come from those
mismatches.

- The reference counter can be initialized by 0 or 1.
- If 0 is chosen, caller have to get it and free it if failed to
register appropriately.
- If 1 is chosen, caller doesn't need to get it, but when exits the
caller, it has to put it. (except for returning the object itself)
- The application should choose either one as its policy, to avoid
confusion.

perf tools mixes it up (moreover, it initializes 2 in a case) and
caller usually forgets to put it (it is not surprising, because too
many "put" usually cause SEGV by accessing freed object.)

As far as I can see, cgroup_sel, perf_mmap(this is initialized 0 or 2...),
thread, and comm_str are initialized by 0. Others are initialized by 1.

So, I'd like to suggest that we choose one policy and cleanup the code.
I recommend to use init by 1 policy, because anyway caller has to get
the refcnt. Suppose the below code;

----
obj__new() {
obj = zalloc(sizeof(*obj));
refcnt__init(obj, refcnt, 0);
return obj;
}

caller() {
obj = obj__new();
if (parent__add_obj(parent, obj) != SUCCESS) {
free(obj);
}
}
----

At first glance, this looks good. However, if the parent__add_obj() once
gets the obj(refcnt => 1) and fails to add it to parent list by some reason,
it should put the obj(refcnt => 0). This means the obj is already freed at
that point.

Then, caller() shouldn't free obj in error case? No, because parent__add_obj()
can fail before getting the obj :(. Maybe we can handle it by checking return
code, but it is ugly.

If we choose "init by 1", caller always has to put it before returning. But
the coding rule becomes simpler.
----
caller() {
obj = obj__new();
if (parent__add_obj(parent, obj) != SUCCESS) {
ret = errorcode;
}
obj__put(obj);
return ret;
}
----

Thank you,

---

Masami Hiramatsu (22):
[v2] perf refcnt: Introduce generic refcount APIs with debug feature
perf refcnt: Use a hash for refcnt_root
perf refcnt: Add refcnt debug filter
perf refcnt: refcnt shows summary per object
perf: make map to use refcnt
perf: Make dso to use refcnt for debug
perf: Make map_groups to use refcnt
perf: Make thread uses refcnt for debug
perf: Make cpu_map to use refcnt for debug
perf: Make comm_str to use refcnt for debug
perf: Make cgroup_sel to use refcnt for debug
perf: Make thread_map to use refcnt for debug
perf: Make perf_mmap to use refcnt for debug
perf: Fix dso__load_sym to put dso
perf: Fix map_groups__clone to put cloned map
perf: Fix __cmd_top and perf_session__process_events to put the idle thread
perf: Fix __machine__addnew_vdso to put dso after add to dsos
perf stat: Fix cmd_stat to release cpu_map
perf: fix hists_evsel to release hists
perf: Fix maps__fixup_overlappings to put used maps
perf: Fix machine.vmlinux_maps to make sure to clear the old one
perf: Fix write_numa_topology to put cpu_map instead of free


tools/perf/builtin-stat.c | 11 ++
tools/perf/builtin-top.c | 6 +
tools/perf/config/Makefile | 5 +
tools/perf/util/Build | 1
tools/perf/util/cgroup.c | 11 +-
tools/perf/util/comm.c | 8 +
tools/perf/util/cpumap.c | 33 +++---
tools/perf/util/dso.c | 7 +
tools/perf/util/evlist.c | 8 +
tools/perf/util/header.c | 2
tools/perf/util/hist.c | 10 ++
tools/perf/util/machine.c | 5 +
tools/perf/util/map.c | 15 ++-
tools/perf/util/map.h | 5 +
tools/perf/util/refcnt.c | 229 ++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/refcnt.h | 70 +++++++++++++
tools/perf/util/session.c | 10 ++
tools/perf/util/symbol-elf.c | 2
tools/perf/util/thread.c | 7 +
tools/perf/util/thread_map.c | 28 +++--
tools/perf/util/vdso.c | 2
21 files changed, 420 insertions(+), 55 deletions(-)
create mode 100644 tools/perf/util/refcnt.c
create mode 100644 tools/perf/util/refcnt.h


--
Masami HIRAMATSU
Linux Technology Research Center, System Productivity Research Dept.
Center for Technology Innovation - Systems Engineering
Hitachi, Ltd., Research & Development Group
E-mail: [email protected]


Subject: [PATCH perf/core 01/22] [v2] perf refcnt: Introduce generic refcount APIs with debug feature

This is a kind of debugging feature for atomic reference counter.
The reference counters are widely used in perf tools but not well
debugged. It sometimes causes memory leaks but no one has noticed
the issue, since it is hard to debug such un-reclaimed objects.

This refcnt interface provides fully backtrace debug feature to
analyze such issue. User just replace atomic_t ops with refcnt
APIs and add refcnt__exit() where the object is released.

/* At object initializing */
refcnt__init(obj, refcnt, 1); /* <- atomic_set(&obj->refcnt, 1); */
/* or */
refcnt__init_as(obj, refcnt, 1, "class"); /* init as "class" */

/* At object get method */
refcnt__get(obj, refcnt); /* <- atomic_inc(&obj->refcnt); */

/* At object put method */
if (obj && refcnt__put(obj, refcnt)) /* <-atmoic_dec_and_test(&obj->refcnt)*/

/* At object releasing */
refcnt__exit(obj, refcnt); /* <- Newly added */

The debugging feature is enabled when building perf with
REFCNT_DEBUG=1. Otherwides it is just translated as normal
atomic ops.

Debugging binary warns you if it finds leaks when the perf exits.
If you give -v (or --verbose) to the perf, it also shows backtrace
logs on all refcnt operations of leaked objects.

Signed-off-by: Masami Hiramatsu <[email protected]>
Acked-by: Namhyung Kim <[email protected]>

---
Changes from v1:
- Change internal functions as it indicates the object more clear
- Minor macro changes to have braces.
- refcnt__init takes an initial number for some cases.
- Add refcnt__init_as() for the object which the object and class
have different name.
- Header shows "obj@addr" instead of "obj: addr".
- Shrink down the backtrace size to 16 (it is enough).
- Fix a bug in case of refcnt == 0.
- Show raw addresses if backtrace_symbol is failed.
---
tools/perf/config/Makefile | 5 +
tools/perf/util/Build | 1
tools/perf/util/refcnt.c | 151 ++++++++++++++++++++++++++++++++++++++++++++
tools/perf/util/refcnt.h | 70 ++++++++++++++++++++
4 files changed, 227 insertions(+)
create mode 100644 tools/perf/util/refcnt.c
create mode 100644 tools/perf/util/refcnt.h

diff --git a/tools/perf/config/Makefile b/tools/perf/config/Makefile
index 6eb9a95..62fc93f 100644
--- a/tools/perf/config/Makefile
+++ b/tools/perf/config/Makefile
@@ -147,6 +147,11 @@ ifdef PARSER_DEBUG
$(call detected_var,PARSER_DEBUG_FLEX)
endif

+ifdef REFCNT_DEBUG
+ CFLAGS += -DREFCNT_DEBUG
+ $(call detected,CONFIG_REFCNT_DEBUG)
+endif
+
ifndef NO_LIBPYTHON
# Try different combinations to accommodate systems that only have
# python[2][-config] in weird combinations but always preferring
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 0513dd5..646a4b7 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -87,6 +87,7 @@ libperf-$(CONFIG_AUXTRACE) += intel-pt.o
libperf-$(CONFIG_AUXTRACE) += intel-bts.o
libperf-y += parse-branch-options.o
libperf-y += parse-regs-options.o
+libperf-$(CONFIG_REFCNT_DEBUG) += refcnt.o

libperf-$(CONFIG_LIBBPF) += bpf-loader.o
libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
diff --git a/tools/perf/util/refcnt.c b/tools/perf/util/refcnt.c
new file mode 100644
index 0000000..9bd36b2b
--- /dev/null
+++ b/tools/perf/util/refcnt.c
@@ -0,0 +1,151 @@
+/* Refcount backtrace for debugging leaks */
+#include "../perf.h"
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <execinfo.h> /* For backtrace */
+
+#include "event.h"
+#include "debug.h"
+#include "util.h"
+#include "refcnt.h"
+
+/* A root of backtrace */
+static LIST_HEAD(refcnt_root); /* List head of refcnt object */
+
+static void refcnt_object__delete(struct refcnt_object *ref)
+{
+ struct refcnt_buffer *buf;
+
+ while (!list_empty(&ref->head)) {
+ buf = list_entry(ref->head.next, struct refcnt_buffer, list);
+ list_del_init(&buf->list);
+ free(buf);
+ }
+ list_del_init(&ref->list);
+ free(ref);
+}
+
+static struct refcnt_object *refcnt_object__find(void *obj)
+{
+ struct refcnt_object *ref;
+
+ /* TODO: use hash list */
+ list_for_each_entry(ref, &refcnt_root, list)
+ if (ref->obj == obj)
+ return ref;
+
+ return NULL;
+}
+
+void refcnt__delete(void *addr)
+{
+ struct refcnt_object *ref = refcnt_object__find(addr);
+
+ if (!ref) {
+ pr_debug("REFCNT: Delete uninitialized refcnt: %p\n", addr);
+ return;
+ }
+ refcnt_object__delete(ref);
+}
+
+static void refcnt_object__record(struct refcnt_object *ref, int count)
+{
+ struct refcnt_buffer *buf = malloc(sizeof(*buf));
+
+ if (!buf) {
+ pr_debug("REFCNT: Out of memory for %p (%s)\n",
+ ref->obj, ref->name);
+ return;
+ }
+ INIT_LIST_HEAD(&buf->list);
+ buf->count = count;
+ buf->nr = backtrace(buf->buf, BACKTRACE_SIZE);
+ list_add_tail(&buf->list, &ref->head);
+}
+
+static struct refcnt_object *refcnt_object__new(void *obj, const char *name)
+{
+ struct refcnt_object *ref = malloc(sizeof(*ref));
+
+ if (!ref) {
+ pr_debug("REFCNT: Out of memory for %p (%s)\n",
+ obj, name);
+ return NULL;
+ }
+ INIT_LIST_HEAD(&ref->list);
+ INIT_LIST_HEAD(&ref->head);
+ ref->name = name;
+ ref->obj = obj;
+ list_add_tail(&ref->list, &refcnt_root);
+
+ return ref;
+}
+
+/* This is called via refcnt__init */
+void refcnt__recordnew(void *obj, const char *name, int count)
+{
+ struct refcnt_object *ref = refcnt_object__new(obj, name);
+
+ if (ref)
+ refcnt_object__record(ref, count);
+}
+
+/* This is called via refcnt__get/put */
+void refcnt__record(void *obj, const char *name, int count)
+{
+ struct refcnt_object *ref = refcnt_object__find(obj);
+
+ /* If no entry, allocate new one */
+ if (!ref)
+ refcnt__recordnew(obj, name, count);
+ else
+ refcnt_object__record(ref, count);
+}
+
+static void pr_refcnt_buffer(struct refcnt_buffer *buf)
+{
+ char **symbuf;
+ int i;
+
+ if (!buf)
+ return;
+ symbuf = backtrace_symbols(buf->buf, buf->nr);
+ /* Skip the first one because it is always btrace__record */
+ for (i = 1; i < buf->nr; i++) {
+ if (symbuf)
+ pr_debug(" %s\n", symbuf[i]);
+ else
+ pr_debug(" [%p]\n", buf->buf[i]);
+ }
+ free(symbuf);
+}
+
+static void __attribute__((destructor)) refcnt__dump_unreclaimed(void)
+{
+ struct refcnt_object *ref, *n;
+ struct refcnt_buffer *buf;
+ int i = 0;
+
+ if (list_empty(&refcnt_root))
+ return;
+
+ pr_warning("REFCNT: BUG: Unreclaimed objects found.\n");
+ list_for_each_entry_safe(ref, n, &refcnt_root, list) {
+ pr_debug("==== [%d] ====\nUnreclaimed %s@%p\n", i,
+ ref->name ? ref->name : "(object)", ref->obj);
+ list_for_each_entry(buf, &ref->head, list) {
+ pr_debug("Refcount %s => %d at\n",
+ buf->count >= 0 ? "+1" : "-1",
+ buf->count >= 0 ? buf->count :
+ -buf->count - 1);
+ pr_refcnt_buffer(buf);
+ }
+ refcnt_object__delete(ref);
+ i++;
+ }
+ pr_warning("REFCNT: Total %d objects are not reclaimed.\n", i);
+ if (!verbose)
+ pr_warning(" To see all backtraces, rerun with -v option\n");
+}
+
diff --git a/tools/perf/util/refcnt.h b/tools/perf/util/refcnt.h
new file mode 100644
index 0000000..cd5e4e4
--- /dev/null
+++ b/tools/perf/util/refcnt.h
@@ -0,0 +1,70 @@
+/*
+ * Atomic reference counter API with debugging feature
+ */
+#ifndef __PERF_REFCNT_H
+#define __PERF_REFCNT_H
+
+#include <linux/atomic.h>
+
+#ifdef REFCNT_DEBUG
+
+struct refcnt_object {
+ struct list_head list; /* List of objects */
+ void *obj; /* Object address which has refcnt */
+ const char *name; /* Object class name */
+ struct list_head head; /* List head for buffers */
+};
+
+#define BACKTRACE_SIZE 16
+struct refcnt_buffer {
+ struct list_head list; /* List of buffers */
+ int count; /* Count number at recording point */
+ int nr; /* Number of recorded buffer entries */
+ void *buf[BACKTRACE_SIZE]; /* Backtrace buffer */
+};
+
+void refcnt__recordnew(void *obj, const char *name, int count);
+void refcnt__record(void *obj, const char *name, int count);
+void refcnt__delete(void *obj);
+
+static inline void __refcnt__init(atomic_t *refcnt, int n, void *obj,
+ const char *name)
+{
+ atomic_set(refcnt, n);
+ refcnt__recordnew(obj, name, n);
+}
+
+static inline void __refcnt__get(atomic_t *refcnt, void *obj, const char *name)
+{
+ atomic_inc(refcnt);
+ refcnt__record(obj, name, atomic_read(refcnt));
+}
+
+static inline int __refcnt__put(atomic_t *refcnt, void *obj, const char *name)
+{
+ refcnt__record(obj, name, -atomic_read(refcnt));
+ return atomic_dec_and_test(refcnt);
+}
+
+#define refcnt__init(obj, member, n) \
+ __refcnt__init(&(obj)->member, n, obj, #obj)
+#define refcnt__init_as(obj, member, n, name) \
+ __refcnt__init(&(obj)->member, n, obj, name)
+#define refcnt__exit(obj, member) \
+ refcnt__delete(obj)
+#define refcnt__get(obj, member) \
+ __refcnt__get(&(obj)->member, obj, #obj)
+#define refcnt__put(obj, member) \
+ __refcnt__put(&(obj)->member, obj, #obj)
+
+#else /* !REFCNT_DEBUG */
+
+#define refcnt__init(obj, member, n) atomic_set(&(obj)->member, n)
+#define refcnt__init_as(obj, member, n, name) refcnt__init(obj, member, n)
+#define refcnt__exit(obj, member) do { } while (0)
+#define refcnt__get(obj, member) atomic_inc(&(obj)->member)
+#define refcnt__put(obj, member) atomic_dec_and_test(&(obj)->member)
+
+#endif /* !REFCNT_DEBUG */
+
+#endif /* __PERF_REFCNT_H */

Subject: [PATCH perf/core 02/22] perf refcnt: Use a hash for refcnt_root

Make refcnt to use a hash table to reduce search overhead
on refcnt_root. This hash is based on passed object address.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/refcnt.c | 71 ++++++++++++++++++++++++++++++++--------------
1 file changed, 49 insertions(+), 22 deletions(-)

diff --git a/tools/perf/util/refcnt.c b/tools/perf/util/refcnt.c
index 9bd36b2b..9748dba 100644
--- a/tools/perf/util/refcnt.c
+++ b/tools/perf/util/refcnt.c
@@ -9,9 +9,21 @@
#include "debug.h"
#include "util.h"
#include "refcnt.h"
+#include "linux/hash.h"

-/* A root of backtrace */
-static LIST_HEAD(refcnt_root); /* List head of refcnt object */
+#define REFCNT_HASHBITS 7
+#define REFCNT_HASHSZ (1 << REFCNT_HASHBITS)
+
+/* A root of backtraces (a hash table of the list of refcnt_object)*/
+static struct list_head refcnt_root[REFCNT_HASHSZ];
+
+static void __attribute__((constructor)) refcnt__init_root(void)
+{
+ int h;
+
+ for (h = 0; h < REFCNT_HASHSZ; h++)
+ INIT_LIST_HEAD(&refcnt_root[h]);
+}

static void refcnt_object__delete(struct refcnt_object *ref)
{
@@ -29,9 +41,9 @@ static void refcnt_object__delete(struct refcnt_object *ref)
static struct refcnt_object *refcnt_object__find(void *obj)
{
struct refcnt_object *ref;
+ int h = (int)hash_ptr(obj, REFCNT_HASHBITS);

- /* TODO: use hash list */
- list_for_each_entry(ref, &refcnt_root, list)
+ list_for_each_entry(ref, &refcnt_root[h], list)
if (ref->obj == obj)
return ref;

@@ -67,6 +79,7 @@ static void refcnt_object__record(struct refcnt_object *ref, int count)
static struct refcnt_object *refcnt_object__new(void *obj, const char *name)
{
struct refcnt_object *ref = malloc(sizeof(*ref));
+ int h = (int)hash_ptr(obj, REFCNT_HASHBITS);

if (!ref) {
pr_debug("REFCNT: Out of memory for %p (%s)\n",
@@ -77,7 +90,7 @@ static struct refcnt_object *refcnt_object__new(void *obj, const char *name)
INIT_LIST_HEAD(&ref->head);
ref->name = name;
ref->obj = obj;
- list_add_tail(&ref->list, &refcnt_root);
+ list_add_tail(&ref->list, &refcnt_root[h]);

return ref;
}
@@ -110,6 +123,11 @@ static void pr_refcnt_buffer(struct refcnt_buffer *buf)

if (!buf)
return;
+
+ pr_debug("Refcount %s => %d at\n",
+ buf->count >= 0 ? "+1" : "-1",
+ buf->count >= 0 ? buf->count : -buf->count - 1);
+
symbuf = backtrace_symbols(buf->buf, buf->nr);
/* Skip the first one because it is always btrace__record */
for (i = 1; i < buf->nr; i++) {
@@ -121,29 +139,38 @@ static void pr_refcnt_buffer(struct refcnt_buffer *buf)
free(symbuf);
}

-static void __attribute__((destructor)) refcnt__dump_unreclaimed(void)
+static void pr_refcnt_object(struct refcnt_object *ref)
{
- struct refcnt_object *ref, *n;
struct refcnt_buffer *buf;
- int i = 0;

- if (list_empty(&refcnt_root))
- return;
+ pr_debug("Unreclaimed %s@%p\n",
+ ref->name ? ref->name : "(object)", ref->obj);
+
+ list_for_each_entry(buf, &ref->head, list)
+ pr_refcnt_buffer(buf);
+}

+static void __attribute__((destructor)) refcnt__dump_unreclaimed(void)
+{
+ struct refcnt_object *ref, *n;
+ int h, i = 0;
+
+ for (h = 0; h < REFCNT_HASHSZ; h++)
+ if (!list_empty(&refcnt_root[h]))
+ goto found;
+ return;
+found:
pr_warning("REFCNT: BUG: Unreclaimed objects found.\n");
- list_for_each_entry_safe(ref, n, &refcnt_root, list) {
- pr_debug("==== [%d] ====\nUnreclaimed %s@%p\n", i,
- ref->name ? ref->name : "(object)", ref->obj);
- list_for_each_entry(buf, &ref->head, list) {
- pr_debug("Refcount %s => %d at\n",
- buf->count >= 0 ? "+1" : "-1",
- buf->count >= 0 ? buf->count :
- -buf->count - 1);
- pr_refcnt_buffer(buf);
+ for ( ; h < REFCNT_HASHSZ; h++)
+ list_for_each_entry_safe(ref, n, &refcnt_root[h], list) {
+ if (verbose) {
+ pr_debug("==== [%d] ====\n", i);
+ pr_refcnt_object(ref);
+ }
+ refcnt_object__delete(ref);
+ i++;
}
- refcnt_object__delete(ref);
- i++;
- }
+
pr_warning("REFCNT: Total %d objects are not reclaimed.\n", i);
if (!verbose)
pr_warning(" To see all backtraces, rerun with -v option\n");

Subject: [PATCH perf/core 03/22] perf refcnt: Add refcnt debug filter

Add refcnt debug filter for filtering the debug objects.
Developers can set the target object name (or glob pattern)
to PERF_REFCNT_DEBUG_FILTER environment variable.

E.g.
# PERF_REFCNT_DEBUG_FILTER=dso ./perf ...

This traces only the dso object. This also removes
unused name arguments from refcnt__record API.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/refcnt.c | 29 ++++++++++++++++++-----------
tools/perf/util/refcnt.h | 14 +++++++-------
2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/tools/perf/util/refcnt.c b/tools/perf/util/refcnt.c
index 9748dba..e83e9a8 100644
--- a/tools/perf/util/refcnt.c
+++ b/tools/perf/util/refcnt.c
@@ -16,13 +16,20 @@

/* A root of backtraces (a hash table of the list of refcnt_object)*/
static struct list_head refcnt_root[REFCNT_HASHSZ];
+static const char *refcnt_filter;

static void __attribute__((constructor)) refcnt__init_root(void)
{
int h;
+ const char *filter;

for (h = 0; h < REFCNT_HASHSZ; h++)
INIT_LIST_HEAD(&refcnt_root[h]);
+
+ /* glob matching filter */
+ filter = getenv("PERF_REFCNT_DEBUG_FILTER");
+ if (filter && *filter)
+ refcnt_filter = filter;
}

static void refcnt_object__delete(struct refcnt_object *ref)
@@ -54,11 +61,9 @@ void refcnt__delete(void *addr)
{
struct refcnt_object *ref = refcnt_object__find(addr);

- if (!ref) {
- pr_debug("REFCNT: Delete uninitialized refcnt: %p\n", addr);
- return;
- }
- refcnt_object__delete(ref);
+ /* If !ref, it is already filtered out */
+ if (ref)
+ refcnt_object__delete(ref);
}

static void refcnt_object__record(struct refcnt_object *ref, int count)
@@ -98,21 +103,23 @@ static struct refcnt_object *refcnt_object__new(void *obj, const char *name)
/* This is called via refcnt__init */
void refcnt__recordnew(void *obj, const char *name, int count)
{
- struct refcnt_object *ref = refcnt_object__new(obj, name);
+ struct refcnt_object *ref;
+
+ if (refcnt_filter && !strglobmatch(name, refcnt_filter))
+ return;

+ ref = refcnt_object__new(obj, name);
if (ref)
refcnt_object__record(ref, count);
}

/* This is called via refcnt__get/put */
-void refcnt__record(void *obj, const char *name, int count)
+void refcnt__record(void *obj, int count)
{
struct refcnt_object *ref = refcnt_object__find(obj);

- /* If no entry, allocate new one */
- if (!ref)
- refcnt__recordnew(obj, name, count);
- else
+ /* If no entry, it should be filtered out */
+ if (ref)
refcnt_object__record(ref, count);
}

diff --git a/tools/perf/util/refcnt.h b/tools/perf/util/refcnt.h
index cd5e4e4..21569e1 100644
--- a/tools/perf/util/refcnt.h
+++ b/tools/perf/util/refcnt.h
@@ -24,7 +24,7 @@ struct refcnt_buffer {
};

void refcnt__recordnew(void *obj, const char *name, int count);
-void refcnt__record(void *obj, const char *name, int count);
+void refcnt__record(void *obj, int count);
void refcnt__delete(void *obj);

static inline void __refcnt__init(atomic_t *refcnt, int n, void *obj,
@@ -34,15 +34,15 @@ static inline void __refcnt__init(atomic_t *refcnt, int n, void *obj,
refcnt__recordnew(obj, name, n);
}

-static inline void __refcnt__get(atomic_t *refcnt, void *obj, const char *name)
+static inline void __refcnt__get(atomic_t *refcnt, void *obj)
{
atomic_inc(refcnt);
- refcnt__record(obj, name, atomic_read(refcnt));
+ refcnt__record(obj, atomic_read(refcnt));
}

-static inline int __refcnt__put(atomic_t *refcnt, void *obj, const char *name)
+static inline int __refcnt__put(atomic_t *refcnt, void *obj)
{
- refcnt__record(obj, name, -atomic_read(refcnt));
+ refcnt__record(obj, -atomic_read(refcnt));
return atomic_dec_and_test(refcnt);
}

@@ -53,9 +53,9 @@ static inline int __refcnt__put(atomic_t *refcnt, void *obj, const char *name)
#define refcnt__exit(obj, member) \
refcnt__delete(obj)
#define refcnt__get(obj, member) \
- __refcnt__get(&(obj)->member, obj, #obj)
+ __refcnt__get(&(obj)->member, obj)
#define refcnt__put(obj, member) \
- __refcnt__put(&(obj)->member, obj, #obj)
+ __refcnt__put(&(obj)->member, obj)

#else /* !REFCNT_DEBUG */

Subject: [PATCH perf/core 04/22] perf refcnt: refcnt shows summary per object

Report refcnt per object(class) summary numbers if
any leaks exist, not only the total number.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/refcnt.c | 44 ++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 44 insertions(+)

diff --git a/tools/perf/util/refcnt.c b/tools/perf/util/refcnt.c
index e83e9a8..5348406 100644
--- a/tools/perf/util/refcnt.c
+++ b/tools/perf/util/refcnt.c
@@ -157,8 +157,48 @@ static void pr_refcnt_object(struct refcnt_object *ref)
pr_refcnt_buffer(buf);
}

+#define REFCNT_STAT_NUM 16
+static struct refcnt_stat_entry {
+ int counter;
+ const char *name;
+} stat_entry[REFCNT_STAT_NUM] = { {.name = "(others)"}, };
+
+static struct refcnt_stat_entry *refcnt_stat__find_entry(const char *name)
+{
+ int i;
+
+ if (!name)
+ goto last;
+
+ for (i = 1; i < REFCNT_STAT_NUM; i++)
+ if (stat_entry[i].name) {
+ if (strcmp(stat_entry[i].name, name) == 0)
+ return &stat_entry[i];
+ } else {
+ stat_entry[i].name = name;
+ return &stat_entry[i];
+ }
+last:
+ /* Here, it shorts the slot, let it fold to others */
+ return &stat_entry[0];
+}
+
+static void pr_refcnt_stat(void)
+{
+ int i;
+
+ for (i = 1; i < REFCNT_STAT_NUM && stat_entry[i].name; i++)
+ pr_warning(" \"%s\" leaks %d objects\n",
+ stat_entry[i].name, stat_entry[i].counter);
+
+ if (stat_entry[0].counter)
+ pr_warning(" And others leak %d objects\n",
+ stat_entry[0].counter);
+}
+
static void __attribute__((destructor)) refcnt__dump_unreclaimed(void)
{
+ struct refcnt_stat_entry *ent;
struct refcnt_object *ref, *n;
int h, i = 0;

@@ -174,11 +214,15 @@ found:
pr_debug("==== [%d] ====\n", i);
pr_refcnt_object(ref);
}
+ ent = refcnt_stat__find_entry(ref->name);
+ if (ent)
+ ent->counter++;
refcnt_object__delete(ref);
i++;
}

pr_warning("REFCNT: Total %d objects are not reclaimed.\n", i);
+ pr_refcnt_stat();
if (!verbose)
pr_warning(" To see all backtraces, rerun with -v option\n");
}

Subject: [PATCH perf/core 05/22] perf: make map to use refcnt

Make 'map' object to use refcnt interface for debug.
This can find refcnt related memory leaks.
E.g.

----
./perf probe vfs_read
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 76 objects are not reclaimed.
To see all backtraces, rerun with -v option
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes from v1:
- Update refcnt__init() call to pass initial value.
---
tools/perf/util/map.c | 7 ++++---
tools/perf/util/map.h | 3 ++-
2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 93d9f1c..72fcc54 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -138,7 +138,7 @@ void map__init(struct map *map, enum map_type type,
RB_CLEAR_NODE(&map->rb_node);
map->groups = NULL;
map->erange_warned = false;
- atomic_set(&map->refcnt, 1);
+ refcnt__init(map, refcnt, 1);
}

struct map *map__new(struct machine *machine, u64 start, u64 len,
@@ -241,6 +241,7 @@ bool __map__is_kernel(const struct map *map)
static void map__exit(struct map *map)
{
BUG_ON(!RB_EMPTY_NODE(&map->rb_node));
+ refcnt__exit(map, refcnt);
dso__zput(map->dso);
}

@@ -252,7 +253,7 @@ void map__delete(struct map *map)

void map__put(struct map *map)
{
- if (map && atomic_dec_and_test(&map->refcnt))
+ if (map && refcnt__put(map, refcnt))
map__delete(map);
}

@@ -353,7 +354,7 @@ struct map *map__clone(struct map *from)
struct map *map = memdup(from, sizeof(*map));

if (map != NULL) {
- atomic_set(&map->refcnt, 1);
+ refcnt__init(map, refcnt, 1);
RB_CLEAR_NODE(&map->rb_node);
dso__get(map->dso);
map->groups = NULL;
diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index 7309d64..fcdc7d6 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -9,6 +9,7 @@
#include <stdio.h>
#include <stdbool.h>
#include <linux/types.h>
+#include "refcnt.h"

enum map_type {
MAP__FUNCTION = 0,
@@ -153,7 +154,7 @@ struct map *map__clone(struct map *map);
static inline struct map *map__get(struct map *map)
{
if (map)
- atomic_inc(&map->refcnt);
+ refcnt__get(map, refcnt);
return map;
}

Subject: [PATCH perf/core 06/22] perf: Make dso to use refcnt for debug

Make 'dso' object to use refcnt interface for debug.
This can find refcnt related memory leaks on dsos.

E.g.
----
# ./perf probe vfs_read
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 75 objects are not reclaimed.
To see all backtraces, rerun with -v option
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
Changes from v1:
- Update refcnt__init() call to pass initial value.
---
tools/perf/util/dso.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index e8e9a9d..d7a3f34 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1066,7 +1066,7 @@ struct dso *dso__new(const char *name)
INIT_LIST_HEAD(&dso->node);
INIT_LIST_HEAD(&dso->data.open_entry);
pthread_mutex_init(&dso->lock, NULL);
- atomic_set(&dso->refcnt, 1);
+ refcnt__init(dso, refcnt, 1);
}

return dso;
@@ -1098,19 +1098,20 @@ void dso__delete(struct dso *dso)
dso__free_a2l(dso);
zfree(&dso->symsrc_filename);
pthread_mutex_destroy(&dso->lock);
+ refcnt__exit(dso, refcnt);
free(dso);
}

struct dso *dso__get(struct dso *dso)
{
if (dso)
- atomic_inc(&dso->refcnt);
+ refcnt__get(dso, refcnt);
return dso;
}

void dso__put(struct dso *dso)
{
- if (dso && atomic_dec_and_test(&dso->refcnt))
+ if (dso && refcnt__put(dso, refcnt))
dso__delete(dso);
}

Subject: [PATCH perf/core 07/22] perf: Make map_groups to use refcnt

Make 'map_groups' object to use refcnt interface for debug.
This can find refcnt related memory leaks.
E.g.
----
# ./perf top --stdio
^q
REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 3615 objects are not reclaimed.
"map" leaks 3373 objects
"dso" leaks 232 objects
"map_groups" leaks 10 objects
To see all backtraces, rerun with -v option
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/map.c | 5 +++--
tools/perf/util/map.h | 2 +-
2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 72fcc54..7859216 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -471,7 +471,7 @@ void map_groups__init(struct map_groups *mg, struct machine *machine)
maps__init(&mg->maps[i]);
}
mg->machine = machine;
- atomic_set(&mg->refcnt, 1);
+ refcnt__init_as(mg, refcnt, 1, "map_groups");
}

static void __maps__purge(struct maps *maps)
@@ -501,6 +501,7 @@ void map_groups__exit(struct map_groups *mg)

for (i = 0; i < MAP__NR_TYPES; ++i)
maps__exit(&mg->maps[i]);
+ refcnt__exit(mg, refcnt);
}

bool map_groups__empty(struct map_groups *mg)
@@ -533,7 +534,7 @@ void map_groups__delete(struct map_groups *mg)

void map_groups__put(struct map_groups *mg)
{
- if (mg && atomic_dec_and_test(&mg->refcnt))
+ if (mg && refcnt__put(mg, refcnt))
map_groups__delete(mg);
}

diff --git a/tools/perf/util/map.h b/tools/perf/util/map.h
index fcdc7d6..4667d67 100644
--- a/tools/perf/util/map.h
+++ b/tools/perf/util/map.h
@@ -78,7 +78,7 @@ bool map_groups__empty(struct map_groups *mg);
static inline struct map_groups *map_groups__get(struct map_groups *mg)
{
if (mg)
- atomic_inc(&mg->refcnt);
+ refcnt__get(mg, refcnt);
return mg;
}

Subject: [PATCH perf/core 08/22] perf: Make thread uses refcnt for debug

Make 'thread' object to use refcnt interface for debug.
This can find refcnt related memory leaks.
E.g.

----
# ./perf top --stdio
^q
REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 3525 objects are not reclaimed.
"map" leaks 3285 objects
"dso" leaks 225 objects
"thread" leaks 8 objects
"map_groups" leaks 7 objects
To see all backtraces, rerun with -v option
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/thread.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 0a9ae80..392fc32 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -53,7 +53,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
goto err_thread;

list_add(&comm->list, &thread->comm_list);
- atomic_set(&thread->refcnt, 0);
+ refcnt__init(thread, refcnt, 0);
RB_CLEAR_NODE(&thread->rb_node);
}

@@ -81,6 +81,7 @@ void thread__delete(struct thread *thread)
comm__free(comm);
}
unwind__finish_access(thread);
+ refcnt__exit(thread, refcnt);

free(thread);
}
@@ -88,13 +89,13 @@ void thread__delete(struct thread *thread)
struct thread *thread__get(struct thread *thread)
{
if (thread)
- atomic_inc(&thread->refcnt);
+ refcnt__get(thread, refcnt);
return thread;
}

void thread__put(struct thread *thread)
{
- if (thread && atomic_dec_and_test(&thread->refcnt)) {
+ if (thread && refcnt__put(thread, refcnt)) {
list_del_init(&thread->node);
thread__delete(thread);
}

Subject: [PATCH perf/core 09/22] perf: Make cpu_map to use refcnt for debug

Make 'cpu_map' object to use refcnt interface for debug.
This can find refcnt related memory leaks.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/cpumap.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/tools/perf/util/cpumap.c b/tools/perf/util/cpumap.c
index 10af1e7..cca94d3 100644
--- a/tools/perf/util/cpumap.c
+++ b/tools/perf/util/cpumap.c
@@ -23,7 +23,7 @@ static struct cpu_map *cpu_map__default_new(void)
cpus->map[i] = i;

cpus->nr = nr_cpus;
- atomic_set(&cpus->refcnt, 1);
+ refcnt__init_as(cpus, refcnt, 1, "cpu_map");
}

return cpus;
@@ -37,7 +37,7 @@ static struct cpu_map *cpu_map__trim_new(int nr_cpus, int *tmp_cpus)
if (cpus != NULL) {
cpus->nr = nr_cpus;
memcpy(cpus->map, tmp_cpus, payload_size);
- atomic_set(&cpus->refcnt, 1);
+ refcnt__init_as(cpus, refcnt, 1, "cpu_map");
}

return cpus;
@@ -197,7 +197,7 @@ struct cpu_map *cpu_map__dummy_new(void)
if (cpus != NULL) {
cpus->nr = 1;
cpus->map[0] = -1;
- atomic_set(&cpus->refcnt, 1);
+ refcnt__init_as(cpus, refcnt, 1, "cpu_map");
}

return cpus;
@@ -214,32 +214,33 @@ struct cpu_map *cpu_map__empty_new(int nr)
for (i = 0; i < nr; i++)
cpus->map[i] = -1;

- atomic_set(&cpus->refcnt, 1);
+ refcnt__init_as(cpus, refcnt, 1, "cpu_map");
}

return cpus;
}

-static void cpu_map__delete(struct cpu_map *map)
+static void cpu_map__delete(struct cpu_map *cpus)
{
- if (map) {
- WARN_ONCE(atomic_read(&map->refcnt) != 0,
+ if (cpus) {
+ WARN_ONCE(atomic_read(&cpus->refcnt) != 0,
"cpu_map refcnt unbalanced\n");
- free(map);
+ refcnt__exit(cpus, refcnt);
+ free(cpus);
}
}

-struct cpu_map *cpu_map__get(struct cpu_map *map)
+struct cpu_map *cpu_map__get(struct cpu_map *cpus)
{
- if (map)
- atomic_inc(&map->refcnt);
- return map;
+ if (cpus)
+ refcnt__get(cpus, refcnt);
+ return cpus;
}

-void cpu_map__put(struct cpu_map *map)
+void cpu_map__put(struct cpu_map *cpus)
{
- if (map && atomic_dec_and_test(&map->refcnt))
- cpu_map__delete(map);
+ if (cpus && refcnt__put(cpus, refcnt))
+ cpu_map__delete(cpus);
}

static int cpu__get_topology_int(int cpu, const char *name, int *value)
@@ -302,7 +303,7 @@ int cpu_map__build_map(struct cpu_map *cpus, struct cpu_map **res,
/* ensure we process id in increasing order */
qsort(c->map, c->nr, sizeof(int), cmp_ids);

- atomic_set(&c->refcnt, 1);
+ refcnt__init_as(c, refcnt, 1, "cpu_map");
*res = c;
return 0;
}

Subject: [PATCH perf/core 10/22] perf: Make comm_str to use refcnt for debug

Make 'comm_str' object to use refcnt interface for debug.
This can find refcnt related memory leaks.

----
# ./perf top --stdio
^q
REFCNT: BUG: Unreclaimed objects found.
REFCNT: Total 3369 objects are not reclaimed.
"dso" leaks 193 objects
"map" leaks 3154 objects
"map_groups" leaks 6 objects
"comm_str" leaks 9 objects
"thread" leaks 7 objects
To see all backtraces, rerun with -v option
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/comm.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/comm.c b/tools/perf/util/comm.c
index 21b7ff3..77202ab 100644
--- a/tools/perf/util/comm.c
+++ b/tools/perf/util/comm.c
@@ -1,5 +1,6 @@
#include "comm.h"
#include "util.h"
+#include "refcnt.h"
#include <stdlib.h>
#include <stdio.h>
#include <linux/atomic.h>
@@ -16,15 +17,16 @@ static struct rb_root comm_str_root;
static struct comm_str *comm_str__get(struct comm_str *cs)
{
if (cs)
- atomic_inc(&cs->refcnt);
+ refcnt__get(cs, refcnt);
return cs;
}

static void comm_str__put(struct comm_str *cs)
{
- if (cs && atomic_dec_and_test(&cs->refcnt)) {
+ if (cs && refcnt__put(cs, refcnt)) {
rb_erase(&cs->rb_node, &comm_str_root);
zfree(&cs->str);
+ refcnt__exit(cs, refcnt);
free(cs);
}
}
@@ -43,7 +45,7 @@ static struct comm_str *comm_str__alloc(const char *str)
return NULL;
}

- atomic_set(&cs->refcnt, 0);
+ refcnt__init_as(cs, refcnt, 0, "comm_str");

return cs;
}

Subject: [PATCH perf/core 11/22] perf: Make cgroup_sel to use refcnt for debug

Make 'cgroup_sel' object to use refcnt interface for debug.
This can find refcnt related memory leaks.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/cgroup.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index 32e12ec..d925f86 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -4,6 +4,7 @@
#include "evsel.h"
#include "cgroup.h"
#include "evlist.h"
+#include "refcnt.h"

int nr_cgroups;

@@ -103,6 +104,7 @@ static int add_cgroup(struct perf_evlist *evlist, char *str)
free(cgrp);
return -1;
}
+ refcnt__init_as(cgrp, refcnt, 0, "cgroup_sel");
}

/*
@@ -115,21 +117,24 @@ static int add_cgroup(struct perf_evlist *evlist, char *str)
goto found;
n++;
}
- if (atomic_read(&cgrp->refcnt) == 0)
+ if (atomic_read(&cgrp->refcnt) == 0) {
+ refcnt__exit(cgrp, refcnt);
free(cgrp);
+ }

return -1;
found:
- atomic_inc(&cgrp->refcnt);
+ refcnt__get(cgrp, refcnt);
counter->cgrp = cgrp;
return 0;
}

void close_cgroup(struct cgroup_sel *cgrp)
{
- if (cgrp && atomic_dec_and_test(&cgrp->refcnt)) {
+ if (cgrp && refcnt__put(cgrp, refcnt)) {
close(cgrp->fd);
zfree(&cgrp->name);
+ refcnt__exit(cgrp, refcnt);
free(cgrp);
}
}

Subject: [PATCH perf/core 12/22] perf: Make thread_map to use refcnt for debug

Make 'thread_map' object to use refcnt interface for debug.
This can find refcnt related memory leaks.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/thread_map.c | 28 +++++++++++++++-------------
1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index 6ec3c5c..a87eae4 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -13,6 +13,7 @@
#include "thread_map.h"
#include "util.h"
#include "debug.h"
+#include "refcnt.h"

/* Skip "." and ".." directories */
static int filter(const struct dirent *dir)
@@ -65,7 +66,7 @@ struct thread_map *thread_map__new_by_pid(pid_t pid)
for (i = 0; i < items; i++)
thread_map__set_pid(threads, i, atoi(namelist[i]->d_name));
threads->nr = items;
- atomic_set(&threads->refcnt, 1);
+ refcnt__init_as(threads, refcnt, 1, "thread_map");
}

for (i=0; i<items; i++)
@@ -82,7 +83,7 @@ struct thread_map *thread_map__new_by_tid(pid_t tid)
if (threads != NULL) {
thread_map__set_pid(threads, 0, tid);
threads->nr = 1;
- atomic_set(&threads->refcnt, 1);
+ refcnt__init_as(threads, refcnt, 1, "thread_map");
}

return threads;
@@ -104,7 +105,7 @@ struct thread_map *thread_map__new_by_uid(uid_t uid)
goto out_free_threads;

threads->nr = 0;
- atomic_set(&threads->refcnt, 1);
+ refcnt__init_as(threads, refcnt, 1, "thread_map");

while (!readdir_r(proc, &dirent, &next) && next) {
char *end;
@@ -234,7 +235,7 @@ static struct thread_map *thread_map__new_by_pid_str(const char *pid_str)
out:
strlist__delete(slist);
if (threads)
- atomic_set(&threads->refcnt, 1);
+ refcnt__init_as(threads, refcnt, 1, "thread_map");
return threads;

out_free_namelist:
@@ -254,7 +255,7 @@ struct thread_map *thread_map__new_dummy(void)
if (threads != NULL) {
thread_map__set_pid(threads, 0, -1);
threads->nr = 1;
- atomic_set(&threads->refcnt, 1);
+ refcnt__init_as(threads, refcnt, 1, "thread_map");
}
return threads;
}
@@ -299,7 +300,7 @@ static struct thread_map *thread_map__new_by_tid_str(const char *tid_str)
}
out:
if (threads)
- atomic_set(&threads->refcnt, 1);
+ refcnt__init_as(threads, refcnt, 1, "thread_map");
return threads;

out_free_threads:
@@ -328,21 +329,22 @@ static void thread_map__delete(struct thread_map *threads)
"thread map refcnt unbalanced\n");
for (i = 0; i < threads->nr; i++)
free(thread_map__comm(threads, i));
+ refcnt__exit(threads, refcnt);
free(threads);
}
}

-struct thread_map *thread_map__get(struct thread_map *map)
+struct thread_map *thread_map__get(struct thread_map *threads)
{
- if (map)
- atomic_inc(&map->refcnt);
- return map;
+ if (threads)
+ refcnt__get(threads, refcnt);
+ return threads;
}

-void thread_map__put(struct thread_map *map)
+void thread_map__put(struct thread_map *threads)
{
- if (map && atomic_dec_and_test(&map->refcnt))
- thread_map__delete(map);
+ if (threads && refcnt__put(threads, refcnt))
+ thread_map__delete(threads);
}

size_t thread_map__fprintf(struct thread_map *threads, FILE *fp)

Subject: [PATCH perf/core 13/22] perf: Make perf_mmap to use refcnt for debug

Make 'perf_mmap' object to use refcnt interface for debug.
This can find refcnt related memory leaks.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/evlist.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index d1b6c20..0637603 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -15,6 +15,7 @@
#include "evlist.h"
#include "evsel.h"
#include "debug.h"
+#include "refcnt.h"
#include <unistd.h>

#include "parse-events.h"
@@ -769,14 +770,14 @@ static bool perf_mmap__empty(struct perf_mmap *md)

static void perf_evlist__mmap_get(struct perf_evlist *evlist, int idx)
{
- atomic_inc(&evlist->mmap[idx].refcnt);
+ refcnt__get(&evlist->mmap[idx], refcnt);
}

static void perf_evlist__mmap_put(struct perf_evlist *evlist, int idx)
{
BUG_ON(atomic_read(&evlist->mmap[idx].refcnt) == 0);

- if (atomic_dec_and_test(&evlist->mmap[idx].refcnt))
+ if (refcnt__put(&evlist->mmap[idx], refcnt))
__perf_evlist__munmap(evlist, idx);
}

@@ -827,6 +828,7 @@ static void __perf_evlist__munmap(struct perf_evlist *evlist, int idx)
if (evlist->mmap[idx].base != NULL) {
munmap(evlist->mmap[idx].base, evlist->mmap_len);
evlist->mmap[idx].base = NULL;
+ refcnt__exit(&evlist->mmap[idx], refcnt);
atomic_set(&evlist->mmap[idx].refcnt, 0);
}
auxtrace_mmap__munmap(&evlist->mmap[idx].auxtrace_mmap);
@@ -876,7 +878,7 @@ static int __perf_evlist__mmap(struct perf_evlist *evlist, int idx,
* evlist layer can't just drop it when filtering events in
* perf_evlist__filter_pollfd().
*/
- atomic_set(&evlist->mmap[idx].refcnt, 2);
+ refcnt__init_as(&evlist->mmap[idx], refcnt, 2, "perf_mmap");
evlist->mmap[idx].prev = 0;
evlist->mmap[idx].mask = mp->mask;
evlist->mmap[idx].base = mmap(NULL, evlist->mmap_len, mp->prot,

Subject: [PATCH perf/core 14/22] perf: Fix dso__load_sym to put dso

Fix dso__load_sym to put dso because dsos__add already got it.

Refcnt debugger explain the problem:
----
==== [0] ====
Unreclaimed dso: 0x19dd200
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(dso__load_sym+0xe89) [0x503509]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(dso__get+0x34) [0x4a65f4]
./perf(map__new2+0x76) [0x4be216]
./perf(dso__load_sym+0xee1) [0x503561]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 3 at
./perf(dsos__add+0xf3) [0x4a6bc3]
./perf(dso__load_sym+0xfc1) [0x503641]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 2 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(map_groups__exit+0xb9) [0x4bee29]
./perf(machine__delete+0xb0) [0x4b93d0]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(machine__delete+0xfe) [0x4b941e]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
----
So, in the dso__load_sym, dso is gotten 3 times, by dso__new,
map__new2, and dsos__add. The last 2 is actually released by
map_groups and machine__delete correspondingly. However, the
first reference by dso__new, is never released.

This solves this issue by putting dso right after dsos__add.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/symbol-elf.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 53f1996..a5703ac 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1045,6 +1045,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
/* kmaps already got it */
map__put(curr_map);
dsos__add(&map->groups->machine->dsos, curr_dso);
+ /* curr_map and machine->dsos already got it */
+ dso__put(curr_dso);
dso__set_loaded(curr_dso, map->type);
} else
curr_dso = curr_map->dso;

Subject: [PATCH perf/core 15/22] perf: Fix map_groups__clone to put cloned map

Fix map_groups__clone to put cloned map after
inserting it to the map_groups.

Refcnt debugger shows:
----
==== [0] ====
Unreclaimed map: 0x2a27ee0
Refcount +1 => 1 at
./perf(map_groups__clone+0x8d) [0x4bb7ed]
./perf(thread__fork+0xbe) [0x4c1f9e]
./perf(machine__process_fork_event+0x216) [0x4b79a6]
./perf(perf_event__synthesize_threads+0x38b) [0x48135b]
./perf(cmd_top+0xdc6) [0x43cb76]
./perf() [0x477223]
./perf(main+0x617) [0x422077]
/lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
./perf() [0x4221ed]
Refcount +1 => 2 at
./perf(map_groups__clone+0x128) [0x4bb888]
./perf(thread__fork+0xbe) [0x4c1f9e]
./perf(machine__process_fork_event+0x216) [0x4b79a6]
./perf(perf_event__synthesize_threads+0x38b) [0x48135b]
./perf(cmd_top+0xdc6) [0x43cb76]
./perf() [0x477223]
./perf(main+0x617) [0x422077]
/lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
./perf() [0x4221ed]
Refcount -1 => 1 at
./perf(map_groups__exit+0x87) [0x4ba757]
./perf(map_groups__put+0x68) [0x4ba9a8]
./perf(thread__put+0x8b) [0x4c1aeb]
./perf(machine__delete_threads+0x81) [0x4b48f1]
./perf(perf_session__delete+0x4f) [0x4be63f]
./perf(cmd_top+0x1094) [0x43ce44]
./perf() [0x477223]
./perf(main+0x617) [0x422077]
/lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
./perf() [0x4221ed]
----

This shows map_groups__clone get the map twice
and put it when map_groups__exit.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/map.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 7859216..03b7297 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -744,6 +744,7 @@ int map_groups__clone(struct map_groups *mg,
if (new == NULL)
goto out_unlock;
map_groups__insert(mg, new);
+ map__put(new);
}

err = 0;

Subject: [PATCH perf/core 16/22] perf: Fix __cmd_top and perf_session__process_events to put the idle thread

Since perf_session__register_idle_thread() got the idle thread,
caller functions have to put it afterwards.
Note that since the thread was already inserted to the session
list, it will be released when the session is released.
Also, in perf_session__register_idle_thread() failure path,
the thread should be put before returning.

Refcnt debugger shows that the perf_session__register_idle_thread
gets the returned thread, but the caller (__cmd_top) does not
put the returned idle thread.

----
==== [0] ====
Unreclaimed thread@0x24e6240
Refcount +1 => 0 at
./perf(thread__new+0xe5) [0x4c8a75]
./perf(machine__findnew_thread+0x9a) [0x4bbdba]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 1 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0xee) [0x4bbe0e]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 2 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0x112) [0x4bbe32]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/builtin-top.c | 6 +++++-
tools/perf/util/session.c | 10 +++++++++-
2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 7e2e72e..430177f 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -945,6 +945,7 @@ static int perf_top__setup_sample_type(struct perf_top *top __maybe_unused)
static int __cmd_top(struct perf_top *top)
{
struct record_opts *opts = &top->record_opts;
+ struct thread *idle;
pthread_t thread;
int ret;

@@ -964,8 +965,11 @@ static int __cmd_top(struct perf_top *top)
if (ret)
goto out_delete;

- if (perf_session__register_idle_thread(top->session) == NULL)
+ idle = perf_session__register_idle_thread(top->session);
+ if (idle == NULL)
goto out_delete;
+ /* perf_session__register_idle_thread() got the returned thread. */
+ thread__put(idle);

machine__synthesize_threads(&top->session->machines.host, &opts->target,
top->evlist->threads, false, opts->proc_map_timeout);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c35ffdd..ee16228 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1318,6 +1318,8 @@ struct thread *perf_session__register_idle_thread(struct perf_session *session)
thread = machine__findnew_thread(&session->machines.host, 0, 0);
if (thread == NULL || thread__set_comm(thread, "swapper", 0)) {
pr_err("problem inserting idle task.\n");
+ /* machine__findnew_thread() got the thread, so put it */
+ thread__put(thread);
thread = NULL;
}

@@ -1674,10 +1676,16 @@ out_err:
int perf_session__process_events(struct perf_session *session)
{
u64 size = perf_data_file__size(session->file);
+ struct thread *idle = perf_session__register_idle_thread(session);
int err;

- if (perf_session__register_idle_thread(session) == NULL)
+ if (idle == NULL)
return -ENOMEM;
+ /*
+ * Since perf_session__register_idle_thread() got the idle thread,
+ * we have to put it here.
+ */
+ thread__put(idle);

if (!perf_data_file__is_pipe(session->file))
err = __perf_session__process_events(session,

Subject: [PATCH perf/core 17/22] perf: Fix __machine__addnew_vdso to put dso after add to dsos

Fix __machine__addnew_vdso to put dso after add to dsos because
the dso is already gotten by the dsos via __dsos__add().

This function is called finally from machine__findnew_vdso()
which locks machine->dsos.lock. And before unlock it, the
function gets the dso's refcnt. Thus we can ensure that the
dso is not removed from the machine while this operation,
and we don't need to get the dso except for the machine->dsos.

refcnt debugger shows:
-----
$ ./perf top --stdio -v (note: run by non-root user)
[...]
==== [3] ====
Unreclaimed dso@0x27a0a30
Refcount +1 => 1 at
./perf(dso__new+0x2bc) [0x4a778c]
./perf(machine__findnew_vdso+0x272) [0x4e8792]
./perf(map__new+0x2db) [0x4bfb4b]
./perf(machine__process_mmap2_event+0xf3) [0x4bda33]
./perf(perf_event__synthesize_mmap_events+0x364) [0x484e74]
./perf(perf_event__synthesize_threads+0x3ee) [0x48583e]
./perf(cmd_top+0xdc2) [0x43cfb2]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2b01387af5]
./perf() [0x42272d]
Refcount +1 => 2 at
./perf(machine__findnew_vdso+0x289) [0x4e87a9]
./perf(map__new+0x2db) [0x4bfb4b]
./perf(machine__process_mmap2_event+0xf3) [0x4bda33]
./perf(perf_event__synthesize_mmap_events+0x364) [0x484e74]
./perf(perf_event__synthesize_threads+0x3ee) [0x48583e]
./perf(cmd_top+0xdc2) [0x43cfb2]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2b01387af5]
./perf() [0x42272d]
Refcount +1 => 3 at
./perf(dso__get+0x32) [0x4a7b52]
./perf(machine__findnew_vdso+0xc1) [0x4e85e1]
./perf(map__new+0x2db) [0x4bfb4b]
./perf(machine__process_mmap2_event+0xf3) [0x4bda33]
./perf(perf_event__synthesize_mmap_events+0x364) [0x484e74]
./perf(perf_event__synthesize_threads+0x3ee) [0x48583e]
./perf(cmd_top+0xdc2) [0x43cfb2]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2b01387af5]
./perf() [0x42272d]
[...]
-----

The log shows that the machine__findnew_vdso gets a dso
so many unnaturally. I've traced the code and found this
bug.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/vdso.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/vdso.c b/tools/perf/util/vdso.c
index 44d440d..fea0d18 100644
--- a/tools/perf/util/vdso.c
+++ b/tools/perf/util/vdso.c
@@ -130,6 +130,8 @@ static struct dso *__machine__addnew_vdso(struct machine *machine, const char *s
__dsos__add(&machine->dsos, dso);
dso__set_long_name(dso, long_name, false);
}
+ /* Put the dso here because it is already gotten by __dsos__add */
+ dso__put(dso);

return dso;
}

Subject: [PATCH perf/core 18/22] perf stat: Fix cmd_stat to release cpu_map

Fix cmd_stat() to release cpu_map objects (aggr_map and
cpus_aggr_map) afterwards.

refcnt debugger shows that the cmd_stat initializes cpu_map
but not puts it.
----
# ./perf stat -v ls
....
REFCNT: BUG: Unreclaimed objects found.
==== [0] ====
Unreclaimed cpu_map@0x29339c0
Refcount +1 => 1 at
./perf(cpu_map__empty_new+0x6d) [0x4e64bd]
./perf(cmd_stat+0x5fe) [0x43594e]
./perf() [0x47b785]
./perf(main+0x617) [0x422587]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2dff420af5]
./perf() [0x4226fd]
REFCNT: Total 1 objects are not reclaimed.
"cpu_map" leaks 1 objects
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/builtin-stat.c | 11 +++++++++++
1 file changed, 11 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e74712d..eb6da8f 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1094,6 +1094,16 @@ static int perf_stat_init_aggr_mode(void)
return cpus_aggr_map ? 0 : -ENOMEM;
}

+static void perf_stat_exit_aggr_mode(void)
+{
+ if (aggr_map)
+ cpu_map__put(aggr_map);
+ if (cpus_aggr_map)
+ cpu_map__put(cpus_aggr_map);
+ aggr_map = NULL;
+ cpus_aggr_map = NULL;
+}
+
/*
* Add default attributes, if there were no attributes specified or
* if -d/--detailed, -d -d or -d -d -d is used:
@@ -1442,6 +1452,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (!forever && status != -1 && !interval)
print_counters(NULL, argc, argv);

+ perf_stat_exit_aggr_mode();
perf_evlist__free_stats(evsel_list);
out:
perf_evlist__delete(evsel_list);

Subject: [PATCH perf/core 19/22] perf: fix hists_evsel to release hists

Since hists__init doesn't set the destructor of
hists_evsel (which is an extended evsel structure),
when hists_evsel is released, the extended part of
the hists_evsel is not deleted (note that the
hists_evsel object itself is freed).

This fixes it to add a destructor for hists__evsel
and to set it up.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/hist.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6e8e0ee..565ea35 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1567,6 +1567,13 @@ static int hists_evsel__init(struct perf_evsel *evsel)
return 0;
}

+static void hists_evsel__exit(struct perf_evsel *evsel)
+{
+ struct hists *hists = evsel__hists(evsel);
+
+ hists__delete_entries(hists);
+}
+
/*
* XXX We probably need a hists_evsel__exit() to free the hist_entries
* stored in the rbtree...
@@ -1575,7 +1582,8 @@ static int hists_evsel__init(struct perf_evsel *evsel)
int hists__init(void)
{
int err = perf_evsel__object_config(sizeof(struct hists_evsel),
- hists_evsel__init, NULL);
+ hists_evsel__init,
+ hists_evsel__exit);
if (err)
fputs("FATAL ERROR: Couldn't setup hists class\n", stderr);

Subject: [PATCH perf/core 20/22] perf: Fix maps__fixup_overlappings to put used maps

Since the __map_groups__insert got the given map, we don't
need to keep it. So put the maps.

Refcnt debugger shows that the map_groups__fixup_overlappings
got a map twice but the group released it once. This pattern
usually indicates the leak happens in caller site.
----
==== [0] ====
Unreclaimed map@0x39d3ae0
Refcount +1 => 1 at
./perf(map_groups__fixup_overlappings+0x335) [0x4c1865]
./perf(thread__insert_map+0x30) [0x4c8e00]
./perf(machine__process_mmap2_event+0x106) [0x4bd876]
./perf() [0x4c378e]
./perf() [0x4c4393]
./perf(perf_session__process_events+0x38a) [0x4c654a]
./perf(cmd_record+0xe24) [0x42fc94]
./perf() [0x47b745]
./perf(main+0x617) [0x422547]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
./perf() [0x4226bd]
Refcount +1 => 2 at
./perf(map_groups__fixup_overlappings+0x3c5) [0x4c18f5]
./perf(thread__insert_map+0x30) [0x4c8e00]
./perf(machine__process_mmap2_event+0x106) [0x4bd876]
./perf() [0x4c378e]
./perf() [0x4c4393]
./perf(perf_session__process_events+0x38a) [0x4c654a]
./perf(cmd_record+0xe24) [0x42fc94]
./perf() [0x47b745]
./perf(main+0x617) [0x422547]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
./perf() [0x4226bd]
Refcount -1 => 1 at
./perf(map_groups__exit+0x92) [0x4c0962]
./perf(map_groups__put+0x60) [0x4c0bc0]
./perf(thread__put+0x90) [0x4c8a40]
./perf(machine__delete_threads+0x7e) [0x4bad9e]
./perf(perf_session__delete+0x4f) [0x4c499f]
./perf(cmd_record+0xb6d) [0x42f9dd]
./perf() [0x47b745]
./perf(main+0x617) [0x422547]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
./perf() [0x4226bd]
----

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/map.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 03b7297..89be9c5 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -693,6 +693,7 @@ static int maps__fixup_overlappings(struct maps *maps, struct map *map, FILE *fp
__map_groups__insert(pos->groups, before);
if (verbose >= 2)
map__fprintf(before, fp);
+ map__put(before);
}

if (map->end < pos->end) {
@@ -707,6 +708,7 @@ static int maps__fixup_overlappings(struct maps *maps, struct map *map, FILE *fp
__map_groups__insert(pos->groups, after);
if (verbose >= 2)
map__fprintf(after, fp);
+ map__put(after);
}
put_map:
map__put(pos);

Subject: [PATCH perf/core 21/22] perf: Fix machine.vmlinux_maps to make sure to clear the old one

Fix machine.vmlinux_maps to make sure to clear the old one
if it is renewal. This can leak the previous maps on the
vmlinux_maps because those are just overwritten.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/machine.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index bfc289c..3172aa5 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -44,6 +44,8 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
machine->comm_exec = false;
machine->kernel_start = 0;

+ memset(machine->vmlinux_maps, 0, sizeof(struct map *) * MAP__NR_TYPES);
+
machine->root_dir = strdup(root_dir);
if (machine->root_dir == NULL)
return -ENOMEM;
@@ -770,6 +772,9 @@ int __machine__create_kernel_maps(struct machine *machine, struct dso *kernel)
enum map_type type;
u64 start = machine__get_running_kernel_start(machine, NULL);

+ /* In case of renewal the kernel map, destroy previous one */
+ machine__destroy_kernel_maps(machine);
+
for (type = 0; type < MAP__NR_TYPES; ++type) {
struct kmap *kmap;
struct map *map;

Subject: [PATCH perf/core 22/22] perf: Fix write_numa_topology to put cpu_map instead of free

Fix write_numa_topology to put cpu_map instead of free because
cpu_map is managed based on refcnt.

Signed-off-by: Masami Hiramatsu <[email protected]>
---
tools/perf/util/header.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 4383800..5ac7bdb 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -724,7 +724,7 @@ static int write_numa_topology(int fd, struct perf_header *h __maybe_unused,
done:
free(buf);
fclose(fp);
- free(node_map);
+ cpu_map__put(node_map);
return ret;
}

2015-12-09 13:41:47

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

Em Wed, Dec 09, 2015 at 11:10:48AM +0900, Masami Hiramatsu escreveu:
> Hi Arnaldo,
>
> Here is a series of patches for perf refcnt debugger and
> some fixes.
>
> In this series I've replaced all atomic reference counters
> with the refcnt interface, including dso, map, map_groups,
> thread, cpu_map, comm_str, cgroup_sel, thread_map, and
> perf_mmap.
>
> refcnt debugger (or refcnt leak checker)
> ===============
>
> At first, note that this change doesn't affect any compiled
> code unless building with REFCNT_DEBUG=1 (see macros in
> refcnt.h). So, this feature is only enabled in the debug binary.
> But before releasing, we can ensure that all objects are safely
> reclaimed before exit in -rc phase.

That helps and is finding bugs and is really great stuff, thank you!

But I wonder if we couldn't get the same results on an unmodified binary
by using things like 'perf probe', the BPF code we're introducing, have
you thought about this possibility?

I.e. trying to use 'perf probe' to do this would help in using the same
technique in other code bases where we can't change the sources, etc.

For perf we could perhaps use a 'noinline' on the __get/__put
operations, so that we could have the right places to hook using
uprobes, other codebases would have to rely in the 'perf probe'
infrastructure that knows where inlines were expanded, etc.

Such a toold could work like:

perf dbgrefcnt ~/bin/perf thread

And it would look up thread__get and thread__put(), create an eBPF map
where to store the needed tracking data structures, and use the same
techniques you used, asking for backtraces using the perf
infrastructure, etc.

We would be using this opportunity to improve the 'perf probe' and the
eBPF infrastructures we're putting in place, and having something that
could be used in other codebases, not just perf.

Comments about other issues are below:

> To use the refcnt debugger, you just build a perf binary with
> REFCNT_DEBUG=1 as follows;
> ----
> # make REFCNT_DEBUG=1
> ----
> And run the perf command. If the refcnt debugger finds leaks,
> it shows the summary of the bugs. Note that if the command does
> not use stdio, it doesn't show anything because refcnt debugger
> uses pr_debug. Please use --stdio option in such case.

It would be interesting to have this writing to a file instead, so that
we could use the same technique for the TUI and other UIs.

> E.g. with the first 13 patches, perf top shows that many objects
> are leaked.
> ----
> # ./perf top --stdio
> q
> exiting.
> REFCNT: BUG: Unreclaimed objects found.
> REFCNT: Total 3595 objects are not reclaimed.
> "map" leaks 3334 objects
> "dso" leaks 231 objects
> "thread" leaks 9 objects
> "comm_str" leaks 13 objects
> "map_groups" leaks 8 objects
> To see all backtraces, rerun with -v option
> ----
> You can also dump all the backtrace data with -v option, but I
> don't recommend you to do it on your console, because it will
> be very very long (for example, above dumps 40MB text logs
> on your console).
>
> Instead, you can use PERF_REFCNT_DEBUG_FILTER env. var. to focus
> on one object, and also use "2>" to redirect stderr output to file.
> E.g.
> ----
> # PERF_REFCNT_DEBUG_FILTER=map ./perf top --stdio -v 2> refcnt.log
> q
> exiting.
> # less refcnt.log
> mmap size 528384B
> Looking at the vmlinux_path (8 entries long)
> Using /lib/modules/4.3.0-rc2+/build/vmlinux for symbols
> REFCNT: BUG: Unreclaimed objects found.
> ==== [0] ====
> Unreclaimed map@0x2157dc0
> Refcount +1 => 1 at
> ...
> ./perf() [0x4226fd]
> REFCNT: Total 3229 objects are not reclaimed.
> "map" leaks 3229 objects
> ----
>
> Bugfixes
> ========
>
> In this series I've also tried to fix some object leaks in perf top
> and perf stat.
> After applying this series, this reduced (not vanished) to 1/5.
> ----
> # ./perf top --stdio
> q
> exiting.
> REFCNT: BUG: Unreclaimed objects found.
> REFCNT: Total 866 objects are not reclaimed.
> "dso" leaks 213 objects
> "map" leaks 624 objects
> "comm_str" leaks 12 objects
> "thread" leaks 9 objects
> "map_groups" leaks 8 objects
> To see all backtraces, rerun with -v option
> ----
>
> Actually, I'm still not able to fix all of the bugs. It seems that
> hists has a bug that hists__delete_entries doesn't delete all the
> entries because some entries are on hists->entries but others on
> hists->entries_in. And I'm not so sure about hists.c.
> Arnaldo, would you have any idea for this bug?

I'll will audit the hists code, its all about collecting new entries
into an rbtree to then move the last batch for merging with the
previous, decayed ones while continuing to process a new batch in
another thread.

>
> General refcnt miscodings
> =========================
>
> BTW, while applying this change, I've found that there are refcnt
> coding mismatches in those code and most of the bugs come from those
> mismatches.
>
> - The reference counter can be initialized by 0 or 1.
> - If 0 is chosen, caller have to get it and free it if failed to
> register appropriately.
> - If 1 is chosen, caller doesn't need to get it, but when exits the
> caller, it has to put it. (except for returning the object itself)
> - The application should choose either one as its policy, to avoid
> confusion.
>
> perf tools mixes it up (moreover, it initializes 2 in a case) and
> caller usually forgets to put it (it is not surprising, because too
> many "put" usually cause SEGV by accessing freed object.)

Well, we should fix the bugs and document why some initialization is
deemed better than a single initialization style.

For instance, if we know that we will keep multiple references straight
away, then we could init it with some value different that preferred,
that I aggee, is 1, i.e. the usual way for some constructor is:


struct foo *foo__new()
{
struct foo *f = malloc(sizeof (*f));

if (f) {
atomic_set(&f->refcnt, 1);
}

return f;
}

void *foo__delete(struct foo *f)
{
free(f);
}

void foo__put(struct foo *f)
{
if (f && atomic_dec_and_test(f->refcnt))
foo__delete(f);
}


Then, when using if, and before adding it to any other tree, list, i.e.
a container, we do:

struct foo *f = foo__new();

/*
* assume f was allocated and then the function allocating it
* failed, when it bails out it should just do:
*/
out_error:
foo__put(f);

And that will make it hit zero, which will call foo__delete(), case
closed.

> As far as I can see, cgroup_sel, perf_mmap(this is initialized 0 or 2...),
> thread, and comm_str are initialized by 0. Others are initialized by 1.
>
> So, I'd like to suggest that we choose one policy and cleanup the code.
> I recommend to use init by 1 policy, because anyway caller has to get

See above, if nothing else recommends using a different value, use 1.

> the refcnt. Suppose the below code;
>
> ----
> obj__new() {
> obj = zalloc(sizeof(*obj));
> refcnt__init(obj, refcnt, 0);
> return obj;
> }
>
> caller() {
> obj = obj__new();
> if (parent__add_obj(parent, obj) != SUCCESS) {
> free(obj);
> }
> }
> ----
>
> At first glance, this looks good. However, if the parent__add_obj() once

The caller never should call free or the object destructor (i.e.
foo__delete() in my example above), it should _always_ just drop the
referece it has to the object, i.e. call foo__put()

> gets the obj(refcnt => 1) and fails to add it to parent list by some reason,
> it should put the obj(refcnt => 0). This means the obj is already freed at
> that point.
>
> Then, caller() shouldn't free obj in error case? No, because parent__add_obj()
> can fail before getting the obj :(. Maybe we can handle it by checking return
> code, but it is ugly.
>
> If we choose "init by 1", caller always has to put it before returning. But

Right, if you don't keep a pointer to the object that at a later point
you will call obj__put() on it, then you should do it straight away
after adding it to some list, tree, array, i.e. a container, and the
container adding operation should grab a refcount, that will be dropped
when the object is removed from that data structure.

> the coding rule becomes simpler.
> ----
> caller() {
> obj = obj__new();
> if (parent__add_obj(parent, obj) != SUCCESS) {
> ret = errorcode;
> }
> obj__put(obj);
> return ret;
> }
> ----
>
> Thank you,
>
> ---
>
> Masami Hiramatsu (22):
> [v2] perf refcnt: Introduce generic refcount APIs with debug feature
> perf refcnt: Use a hash for refcnt_root
> perf refcnt: Add refcnt debug filter
> perf refcnt: refcnt shows summary per object
> perf: make map to use refcnt
> perf: Make dso to use refcnt for debug
> perf: Make map_groups to use refcnt
> perf: Make thread uses refcnt for debug
> perf: Make cpu_map to use refcnt for debug
> perf: Make comm_str to use refcnt for debug
> perf: Make cgroup_sel to use refcnt for debug
> perf: Make thread_map to use refcnt for debug
> perf: Make perf_mmap to use refcnt for debug
> perf: Fix dso__load_sym to put dso
> perf: Fix map_groups__clone to put cloned map
> perf: Fix __cmd_top and perf_session__process_events to put the idle thread
> perf: Fix __machine__addnew_vdso to put dso after add to dsos
> perf stat: Fix cmd_stat to release cpu_map
> perf: fix hists_evsel to release hists
> perf: Fix maps__fixup_overlappings to put used maps
> perf: Fix machine.vmlinux_maps to make sure to clear the old one
> perf: Fix write_numa_topology to put cpu_map instead of free
>
>
> tools/perf/builtin-stat.c | 11 ++
> tools/perf/builtin-top.c | 6 +
> tools/perf/config/Makefile | 5 +
> tools/perf/util/Build | 1
> tools/perf/util/cgroup.c | 11 +-
> tools/perf/util/comm.c | 8 +
> tools/perf/util/cpumap.c | 33 +++---
> tools/perf/util/dso.c | 7 +
> tools/perf/util/evlist.c | 8 +
> tools/perf/util/header.c | 2
> tools/perf/util/hist.c | 10 ++
> tools/perf/util/machine.c | 5 +
> tools/perf/util/map.c | 15 ++-
> tools/perf/util/map.h | 5 +
> tools/perf/util/refcnt.c | 229 ++++++++++++++++++++++++++++++++++++++++++
> tools/perf/util/refcnt.h | 70 +++++++++++++
> tools/perf/util/session.c | 10 ++
> tools/perf/util/symbol-elf.c | 2
> tools/perf/util/thread.c | 7 +
> tools/perf/util/thread_map.c | 28 +++--
> tools/perf/util/vdso.c | 2
> 21 files changed, 420 insertions(+), 55 deletions(-)
> create mode 100644 tools/perf/util/refcnt.c
> create mode 100644 tools/perf/util/refcnt.h
>
>
> --
> Masami HIRAMATSU
> Linux Technology Research Center, System Productivity Research Dept.
> Center for Technology Innovation - Systems Engineering
> Hitachi, Ltd., Research & Development Group
> E-mail: [email protected]

2015-12-09 14:16:39

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 14/22] perf: Fix dso__load_sym to put dso

Em Wed, Dec 09, 2015 at 11:11:18AM +0900, Masami Hiramatsu escreveu:
> +++ b/tools/perf/util/symbol-elf.c
> @@ -1045,6 +1045,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
> /* kmaps already got it */
> map__put(curr_map);
> dsos__add(&map->groups->machine->dsos, curr_dso);
> + /* curr_map and machine->dsos already got it */
> + dso__put(curr_dso);
> dso__set_loaded(curr_dso, map->type);
> } else
> curr_dso = curr_map->dso;

Right, to make the code smaller, how about doing it this way, i.e. drop
the reference once we have that curr_dso object with a ref held by
curr_map, if curr_map doesn't get it, then we don't need and will drop
it anyway:

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 53f19968bfa2..84d787074152 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1026,8 +1026,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
curr_dso->long_name_len = dso->long_name_len;
curr_map = map__new2(start, curr_dso,
map->type);
+ dso__put(curr_dso);
if (curr_map == NULL) {
- dso__put(curr_dso);
goto out_elf_end;
}
if (adjust_kernel_syms) {

2015-12-09 14:25:28

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 15/22] perf: Fix map_groups__clone to put cloned map

Em Wed, Dec 09, 2015 at 11:11:20AM +0900, Masami Hiramatsu escreveu:
> Fix map_groups__clone to put cloned map after
> inserting it to the map_groups.

Thanks, applied!

- Arnaldo

> Refcnt debugger shows:
> ----
> ==== [0] ====
> Unreclaimed map: 0x2a27ee0
> Refcount +1 => 1 at
> ./perf(map_groups__clone+0x8d) [0x4bb7ed]
> ./perf(thread__fork+0xbe) [0x4c1f9e]
> ./perf(machine__process_fork_event+0x216) [0x4b79a6]
> ./perf(perf_event__synthesize_threads+0x38b) [0x48135b]
> ./perf(cmd_top+0xdc6) [0x43cb76]
> ./perf() [0x477223]
> ./perf(main+0x617) [0x422077]
> /lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
> ./perf() [0x4221ed]
> Refcount +1 => 2 at
> ./perf(map_groups__clone+0x128) [0x4bb888]
> ./perf(thread__fork+0xbe) [0x4c1f9e]
> ./perf(machine__process_fork_event+0x216) [0x4b79a6]
> ./perf(perf_event__synthesize_threads+0x38b) [0x48135b]
> ./perf(cmd_top+0xdc6) [0x43cb76]
> ./perf() [0x477223]
> ./perf(main+0x617) [0x422077]
> /lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
> ./perf() [0x4221ed]
> Refcount -1 => 1 at
> ./perf(map_groups__exit+0x87) [0x4ba757]
> ./perf(map_groups__put+0x68) [0x4ba9a8]
> ./perf(thread__put+0x8b) [0x4c1aeb]
> ./perf(machine__delete_threads+0x81) [0x4b48f1]
> ./perf(perf_session__delete+0x4f) [0x4be63f]
> ./perf(cmd_top+0x1094) [0x43ce44]
> ./perf() [0x477223]
> ./perf(main+0x617) [0x422077]
> /lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
> ./perf() [0x4221ed]
> ----
>
> This shows map_groups__clone get the map twice
> and put it when map_groups__exit.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> tools/perf/util/map.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> index 7859216..03b7297 100644
> --- a/tools/perf/util/map.c
> +++ b/tools/perf/util/map.c
> @@ -744,6 +744,7 @@ int map_groups__clone(struct map_groups *mg,
> if (new == NULL)
> goto out_unlock;
> map_groups__insert(mg, new);
> + map__put(new);
> }
>
> err = 0;

2015-12-09 14:30:25

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 16/22] perf: Fix __cmd_top and perf_session__process_events to put the idle thread

Em Wed, Dec 09, 2015 at 11:11:23AM +0900, Masami Hiramatsu escreveu:
> Since perf_session__register_idle_thread() got the idle thread,
> caller functions have to put it afterwards.
> Note that since the thread was already inserted to the session
> list, it will be released when the session is released.
> Also, in perf_session__register_idle_thread() failure path,
> the thread should be put before returning.

Wouldn't this be better by making perf_session__register_idle_thread()
return -1 if it fails? I.e. that way its callers won't have to
immediately put the idle thread, as they are doing nothing with it.

In the future, if someone needs a handle for that thread, a lookup can
be done:

idle = machine__find_thread(&session->machines.host, 0, 0);

I.e. this would be the resulting patch, please let me know if you are ok
with this approach:

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 7e2e72e6d9d1..f26b08e72f74 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -964,7 +964,7 @@ static int __cmd_top(struct perf_top *top)
if (ret)
goto out_delete;

- if (perf_session__register_idle_thread(top->session) == NULL)
+ if (perf_session__register_idle_thread(top->session) < 0)
goto out_delete;

machine__synthesize_threads(&top->session->machines.host, &opts->target,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c35ffdd360fe..9774686525b4 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1311,17 +1311,20 @@ struct thread *perf_session__findnew(struct perf_session *session, pid_t pid)
return machine__findnew_thread(&session->machines.host, -1, pid);
}

-struct thread *perf_session__register_idle_thread(struct perf_session *session)
+int perf_session__register_idle_thread(struct perf_session *session)
{
struct thread *thread;
+ int err = 0;

thread = machine__findnew_thread(&session->machines.host, 0, 0);
if (thread == NULL || thread__set_comm(thread, "swapper", 0)) {
pr_err("problem inserting idle task.\n");
- thread = NULL;
+ err = -1;
}

- return thread;
+ /* machine__findnew_thread() got the thread, so put it */
+ thread__put(thread);
+ return err;
}

static void perf_session__warn_about_errors(const struct perf_session *session)
@@ -1676,7 +1679,7 @@ int perf_session__process_events(struct perf_session *session)
u64 size = perf_data_file__size(session->file);
int err;

- if (perf_session__register_idle_thread(session) == NULL)
+ if (perf_session__register_idle_thread(session) < 0)
return -ENOMEM;

if (!perf_data_file__is_pipe(session->file))
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 3e900c0efc73..5f792e35d4c1 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -89,7 +89,7 @@ struct machine *perf_session__findnew_machine(struct perf_session *session, pid_
}

struct thread *perf_session__findnew(struct perf_session *session, pid_t pid);
-struct thread *perf_session__register_idle_thread(struct perf_session *session);
+int perf_session__register_idle_thread(struct perf_session *session);

size_t perf_session__fprintf(struct perf_session *session, FILE *fp);

2015-12-09 14:38:51

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 17/22] perf: Fix __machine__addnew_vdso to put dso after add to dsos

Em Wed, Dec 09, 2015 at 11:11:25AM +0900, Masami Hiramatsu escreveu:
> Fix __machine__addnew_vdso to put dso after add to dsos because
> the dso is already gotten by the dsos via __dsos__add().
>
> This function is called finally from machine__findnew_vdso()
> which locks machine->dsos.lock. And before unlock it, the
> function gets the dso's refcnt. Thus we can ensure that the
> dso is not removed from the machine while this operation,
> and we don't need to get the dso except for the machine->dsos.
>
> refcnt debugger shows:
> -----
> $ ./perf top --stdio -v (note: run by non-root user)
> [...]
> ==== [3] ====
> Unreclaimed dso@0x27a0a30
> Refcount +1 => 1 at
> ./perf(dso__new+0x2bc) [0x4a778c]
> ./perf(machine__findnew_vdso+0x272) [0x4e8792]
> ./perf(map__new+0x2db) [0x4bfb4b]
> ./perf(machine__process_mmap2_event+0xf3) [0x4bda33]
> ./perf(perf_event__synthesize_mmap_events+0x364) [0x484e74]
> ./perf(perf_event__synthesize_threads+0x3ee) [0x48583e]
> ./perf(cmd_top+0xdc2) [0x43cfb2]
> ./perf() [0x47ba35]
> ./perf(main+0x617) [0x4225b7]
> /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2b01387af5]
> ./perf() [0x42272d]
> Refcount +1 => 2 at
> ./perf(machine__findnew_vdso+0x289) [0x4e87a9]
> ./perf(map__new+0x2db) [0x4bfb4b]
> ./perf(machine__process_mmap2_event+0xf3) [0x4bda33]
> ./perf(perf_event__synthesize_mmap_events+0x364) [0x484e74]
> ./perf(perf_event__synthesize_threads+0x3ee) [0x48583e]
> ./perf(cmd_top+0xdc2) [0x43cfb2]
> ./perf() [0x47ba35]
> ./perf(main+0x617) [0x4225b7]
> /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2b01387af5]
> ./perf() [0x42272d]
> Refcount +1 => 3 at
> ./perf(dso__get+0x32) [0x4a7b52]
> ./perf(machine__findnew_vdso+0xc1) [0x4e85e1]
> ./perf(map__new+0x2db) [0x4bfb4b]
> ./perf(machine__process_mmap2_event+0xf3) [0x4bda33]
> ./perf(perf_event__synthesize_mmap_events+0x364) [0x484e74]
> ./perf(perf_event__synthesize_threads+0x3ee) [0x48583e]
> ./perf(cmd_top+0xdc2) [0x43cfb2]
> ./perf() [0x47ba35]
> ./perf(main+0x617) [0x4225b7]
> /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2b01387af5]
> ./perf() [0x42272d]
> [...]
> -----
>
> The log shows that the machine__findnew_vdso gets a dso
> so many unnaturally. I've traced the code and found this
> bug.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> tools/perf/util/vdso.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/tools/perf/util/vdso.c b/tools/perf/util/vdso.c
> index 44d440d..fea0d18 100644
> --- a/tools/perf/util/vdso.c
> +++ b/tools/perf/util/vdso.c
> @@ -130,6 +130,8 @@ static struct dso *__machine__addnew_vdso(struct machine *machine, const char *s
> __dsos__add(&machine->dsos, dso);
> dso__set_long_name(dso, long_name, false);
> }
> + /* Put the dso here because it is already gotten by __dsos__add */
> + dso__put(dso);
>
> return dso;
> }

We cannot put it here, because we're returning a pointer to it, so,
whoever receives this pointer, receives a recfount with it, that it, in
turn, should put.

And indeed, this dso adding code is confusing, will have to look at it
harder :-\

- Arnaldo

2015-12-09 15:05:51

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 18/22] perf stat: Fix cmd_stat to release cpu_map

Em Wed, Dec 09, 2015 at 11:11:27AM +0900, Masami Hiramatsu escreveu:
> Fix cmd_stat() to release cpu_map objects (aggr_map and
> cpus_aggr_map) afterwards.
>
> refcnt debugger shows that the cmd_stat initializes cpu_map
> but not puts it.
> ----
> # ./perf stat -v ls
> ....
> REFCNT: BUG: Unreclaimed objects found.
> ==== [0] ====
> Unreclaimed cpu_map@0x29339c0
> Refcount +1 => 1 at
> ./perf(cpu_map__empty_new+0x6d) [0x4e64bd]
> ./perf(cmd_stat+0x5fe) [0x43594e]
> ./perf() [0x47b785]
> ./perf(main+0x617) [0x422587]
> /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2dff420af5]
> ./perf() [0x4226fd]
> REFCNT: Total 1 objects are not reclaimed.
> "cpu_map" leaks 1 objects
> ----
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> tools/perf/builtin-stat.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
> index e74712d..eb6da8f 100644
> --- a/tools/perf/builtin-stat.c
> +++ b/tools/perf/builtin-stat.c
> @@ -1094,6 +1094,16 @@ static int perf_stat_init_aggr_mode(void)
> return cpus_aggr_map ? 0 : -ENOMEM;
> }
>
> +static void perf_stat_exit_aggr_mode(void)
> +{
> + if (aggr_map)
> + cpu_map__put(aggr_map);
> + if (cpus_aggr_map)
> + cpu_map__put(cpus_aggr_map);

No need to check if it is NULL, as the other __put operations,
cpu_map__put() checks it:


void cpu_map__put(struct cpu_map *map)
{
if (map && atomic_dec_and_test(&map->refcnt))
cpu_map__delete(map);
}

I'm removing those checks and applying this patch, thanks!

> + aggr_map = NULL;
> + cpus_aggr_map = NULL;
> +}
> +
> /*
> * Add default attributes, if there were no attributes specified or
> * if -d/--detailed, -d -d or -d -d -d is used:
> @@ -1442,6 +1452,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
> if (!forever && status != -1 && !interval)
> print_counters(NULL, argc, argv);
>
> + perf_stat_exit_aggr_mode();
> perf_evlist__free_stats(evsel_list);
> out:
> perf_evlist__delete(evsel_list);

2015-12-09 15:12:40

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 19/22] perf: fix hists_evsel to release hists

Em Wed, Dec 09, 2015 at 11:11:29AM +0900, Masami Hiramatsu escreveu:
> Since hists__init doesn't set the destructor of
> hists_evsel (which is an extended evsel structure),
> when hists_evsel is released, the extended part of
> the hists_evsel is not deleted (note that the
> hists_evsel object itself is freed).

This will help with 'perf report', that doesn't use the
hists->entries_in, top will need to drain that one too, but that can be
done in a follow up patch, after some more thinking, applying this
patch, thanks,

- Arnaldo

> This fixes it to add a destructor for hists__evsel
> and to set it up.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> tools/perf/util/hist.c | 10 +++++++++-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
> index 6e8e0ee..565ea35 100644
> --- a/tools/perf/util/hist.c
> +++ b/tools/perf/util/hist.c
> @@ -1567,6 +1567,13 @@ static int hists_evsel__init(struct perf_evsel *evsel)
> return 0;
> }
>
> +static void hists_evsel__exit(struct perf_evsel *evsel)
> +{
> + struct hists *hists = evsel__hists(evsel);
> +
> + hists__delete_entries(hists);
> +}
> +
> /*
> * XXX We probably need a hists_evsel__exit() to free the hist_entries
> * stored in the rbtree...
> @@ -1575,7 +1582,8 @@ static int hists_evsel__init(struct perf_evsel *evsel)
> int hists__init(void)
> {
> int err = perf_evsel__object_config(sizeof(struct hists_evsel),
> - hists_evsel__init, NULL);
> + hists_evsel__init,
> + hists_evsel__exit);
> if (err)
> fputs("FATAL ERROR: Couldn't setup hists class\n", stderr);
>

2015-12-09 15:15:38

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 20/22] perf: Fix maps__fixup_overlappings to put used maps

Em Wed, Dec 09, 2015 at 11:11:31AM +0900, Masami Hiramatsu escreveu:
> Since the __map_groups__insert got the given map, we don't
> need to keep it. So put the maps.
>
> Refcnt debugger shows that the map_groups__fixup_overlappings
> got a map twice but the group released it once. This pattern
> usually indicates the leak happens in caller site.

Thanks, applied!

- Arnaldo

> ----
> ==== [0] ====
> Unreclaimed map@0x39d3ae0
> Refcount +1 => 1 at
> ./perf(map_groups__fixup_overlappings+0x335) [0x4c1865]
> ./perf(thread__insert_map+0x30) [0x4c8e00]
> ./perf(machine__process_mmap2_event+0x106) [0x4bd876]
> ./perf() [0x4c378e]
> ./perf() [0x4c4393]
> ./perf(perf_session__process_events+0x38a) [0x4c654a]
> ./perf(cmd_record+0xe24) [0x42fc94]
> ./perf() [0x47b745]
> ./perf(main+0x617) [0x422547]
> /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
> ./perf() [0x4226bd]
> Refcount +1 => 2 at
> ./perf(map_groups__fixup_overlappings+0x3c5) [0x4c18f5]
> ./perf(thread__insert_map+0x30) [0x4c8e00]
> ./perf(machine__process_mmap2_event+0x106) [0x4bd876]
> ./perf() [0x4c378e]
> ./perf() [0x4c4393]
> ./perf(perf_session__process_events+0x38a) [0x4c654a]
> ./perf(cmd_record+0xe24) [0x42fc94]
> ./perf() [0x47b745]
> ./perf(main+0x617) [0x422547]
> /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
> ./perf() [0x4226bd]
> Refcount -1 => 1 at
> ./perf(map_groups__exit+0x92) [0x4c0962]
> ./perf(map_groups__put+0x60) [0x4c0bc0]
> ./perf(thread__put+0x90) [0x4c8a40]
> ./perf(machine__delete_threads+0x7e) [0x4bad9e]
> ./perf(perf_session__delete+0x4f) [0x4c499f]
> ./perf(cmd_record+0xb6d) [0x42f9dd]
> ./perf() [0x47b745]
> ./perf(main+0x617) [0x422547]
> /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
> ./perf() [0x4226bd]
> ----
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> tools/perf/util/map.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
> index 03b7297..89be9c5 100644
> --- a/tools/perf/util/map.c
> +++ b/tools/perf/util/map.c
> @@ -693,6 +693,7 @@ static int maps__fixup_overlappings(struct maps *maps, struct map *map, FILE *fp
> __map_groups__insert(pos->groups, before);
> if (verbose >= 2)
> map__fprintf(before, fp);
> + map__put(before);
> }
>
> if (map->end < pos->end) {
> @@ -707,6 +708,7 @@ static int maps__fixup_overlappings(struct maps *maps, struct map *map, FILE *fp
> __map_groups__insert(pos->groups, after);
> if (verbose >= 2)
> map__fprintf(after, fp);
> + map__put(after);
> }
> put_map:
> map__put(pos);

2015-12-09 15:23:13

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 21/22] perf: Fix machine.vmlinux_maps to make sure to clear the old one

Em Wed, Dec 09, 2015 at 11:11:33AM +0900, Masami Hiramatsu escreveu:
> Fix machine.vmlinux_maps to make sure to clear the old one
> if it is renewal. This can leak the previous maps on the
> vmlinux_maps because those are just overwritten.
>
> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> tools/perf/util/machine.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index bfc289c..3172aa5 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -44,6 +44,8 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
> machine->comm_exec = false;
> machine->kernel_start = 0;
>
> + memset(machine->vmlinux_maps, 0, sizeof(struct map *) * MAP__NR_TYPES);


Simpler:

memset(machine->vmlinux_maps, 0, sizeof(machine->vmlinux_maps));

struct machine {
...
struct map *vmlinux_maps[MAP__NR_TYPES];
...
};

[acme@zoo c]$ cat pointer_array_sizeof.c
#include <stdio.h>

struct foo { int *array[2]; };
struct bar { int *array[4]; };

int main(void)
{
struct foo *foo; struct bar *bar;

printf("sizeof(foo->array)=%zd\n", sizeof(foo->array));
printf("sizeof(bar->array)=%zd\n", sizeof(bar->array));
return 0;
}
[acme@zoo c]$ make pointer_array_sizeof
cc pointer_array_sizeof.c -o pointer_array_sizeof
[acme@zoo c]$ ./pointer_array_sizeof
sizeof(foo->array)=16
sizeof(bar->array)=32
[acme@zoo c]$

> +
> machine->root_dir = strdup(root_dir);
> if (machine->root_dir == NULL)
> return -ENOMEM;
> @@ -770,6 +772,9 @@ int __machine__create_kernel_maps(struct machine *machine, struct dso *kernel)
> enum map_type type;
> u64 start = machine__get_running_kernel_start(machine, NULL);
>
> + /* In case of renewal the kernel map, destroy previous one */
> + machine__destroy_kernel_maps(machine);
> +
> for (type = 0; type < MAP__NR_TYPES; ++type) {
> struct kmap *kmap;
> struct map *map;

2015-12-09 15:26:11

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 22/22] perf: Fix write_numa_topology to put cpu_map instead of free

Em Wed, Dec 09, 2015 at 11:11:35AM +0900, Masami Hiramatsu escreveu:
> Fix write_numa_topology to put cpu_map instead of free because
> cpu_map is managed based on refcnt.

thanks, applied!

> Signed-off-by: Masami Hiramatsu <[email protected]>
> ---
> tools/perf/util/header.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
> index 4383800..5ac7bdb 100644
> --- a/tools/perf/util/header.c
> +++ b/tools/perf/util/header.c
> @@ -724,7 +724,7 @@ static int write_numa_topology(int fd, struct perf_header *h __maybe_unused,
> done:
> free(buf);
> fclose(fp);
> - free(node_map);
> + cpu_map__put(node_map);
> return ret;
> }
>

2015-12-10 02:53:23

by Wang Nan

[permalink] [raw]
Subject: Re: [PATCH perf/core 21/22] perf: Fix machine.vmlinux_maps to make sure to clear the old one



On 2015/12/9 23:22, Arnaldo Carvalho de Melo wrote:
> Em Wed, Dec 09, 2015 at 11:11:33AM +0900, Masami Hiramatsu escreveu:
>> Fix machine.vmlinux_maps to make sure to clear the old one
>> if it is renewal. This can leak the previous maps on the
>> vmlinux_maps because those are just overwritten.
>>
>> Signed-off-by: Masami Hiramatsu <[email protected]>
>> ---
>> tools/perf/util/machine.c | 5 +++++
>> 1 file changed, 5 insertions(+)

This patch solves one of my problem I met when 'perf test':

http://lkml.kernel.org/g/[email protected]

Thank you.

2015-12-10 03:31:46

by Alexei Starovoitov

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

On Wed, Dec 09, 2015 at 10:41:38AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Dec 09, 2015 at 11:10:48AM +0900, Masami Hiramatsu escreveu:
> > Hi Arnaldo,
> >
> > Here is a series of patches for perf refcnt debugger and
> > some fixes.
> >
> > In this series I've replaced all atomic reference counters
> > with the refcnt interface, including dso, map, map_groups,
> > thread, cpu_map, comm_str, cgroup_sel, thread_map, and
> > perf_mmap.
> >
> > refcnt debugger (or refcnt leak checker)
> > ===============
> >
> > At first, note that this change doesn't affect any compiled
> > code unless building with REFCNT_DEBUG=1 (see macros in
> > refcnt.h). So, this feature is only enabled in the debug binary.
> > But before releasing, we can ensure that all objects are safely
> > reclaimed before exit in -rc phase.
>
> That helps and is finding bugs and is really great stuff, thank you!
>
> But I wonder if we couldn't get the same results on an unmodified binary
> by using things like 'perf probe', the BPF code we're introducing, have
> you thought about this possibility?
>
> I.e. trying to use 'perf probe' to do this would help in using the same
> technique in other code bases where we can't change the sources, etc.
>
> For perf we could perhaps use a 'noinline' on the __get/__put
> operations, so that we could have the right places to hook using
> uprobes, other codebases would have to rely in the 'perf probe'
> infrastructure that knows where inlines were expanded, etc.
>
> Such a toold could work like:
>
> perf dbgrefcnt ~/bin/perf thread
>
> And it would look up thread__get and thread__put(), create an eBPF map
> where to store the needed tracking data structures, and use the same
> techniques you used, asking for backtraces using the perf
> infrastructure, etc.

I really like the idea. It's doable with minimal changes.
The only question is the speed of uprobes.
I just haven't benchmarked them. If uprobe+bpf is within 10% slower
than this native refcnt debugger then I think we can build some really
cool tools on top of it. Like generic debugging of std::shared_ptr
or dead lock detection of unmodified binaries.
We can uprobe into pthread_mutex_lock, etc.
Infinite possibilites.

2015-12-10 04:51:22

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

Hi Arnaldo and Masami,

On Wed, Dec 09, 2015 at 10:41:38AM -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, Dec 09, 2015 at 11:10:48AM +0900, Masami Hiramatsu escreveu:
> > In this series I've also tried to fix some object leaks in perf top
> > and perf stat.
> > After applying this series, this reduced (not vanished) to 1/5.
> > ----
> > # ./perf top --stdio
> > q
> > exiting.
> > REFCNT: BUG: Unreclaimed objects found.
> > REFCNT: Total 866 objects are not reclaimed.
> > "dso" leaks 213 objects
> > "map" leaks 624 objects
> > "comm_str" leaks 12 objects
> > "thread" leaks 9 objects
> > "map_groups" leaks 8 objects
> > To see all backtraces, rerun with -v option
> > ----
> >
> > Actually, I'm still not able to fix all of the bugs. It seems that
> > hists has a bug that hists__delete_entries doesn't delete all the
> > entries because some entries are on hists->entries but others on
> > hists->entries_in. And I'm not so sure about hists.c.
> > Arnaldo, would you have any idea for this bug?
>
> I'll will audit the hists code, its all about collecting new entries
> into an rbtree to then move the last batch for merging with the
> previous, decayed ones while continuing to process a new batch in
> another thread.

Right. After processing is done, hist entries are in both of
hists->entries and hists->entries_in (or hists->entries_collapsed).
So I guess perf report does not have leaks on hists.

But for perf top, it's possible to have half-processed entries which
are only in hists->entries_in. Eventually they will go to the
hists->entries and get freed but they cannot be deleted by current
hists__delete_entries(). If we really care deleting all of those
half-processed entries, how about this?

Thanks,
Namhyung



diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index fd179a068df7..08396a7fea23 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -270,6 +270,8 @@ static void hists__delete_entry(struct hists *hists, struct hist_entry *he)

if (sort__need_collapse)
rb_erase(&he->rb_node_in, &hists->entries_collapsed);
+ else
+ rb_erase(&he->rb_node_in, hists->entries_in);

--hists->nr_entries;
if (!he->filtered)
@@ -1589,11 +1591,33 @@ int __hists__init(struct hists *hists)
return 0;
}

+static void hists__delete_remaining_entries(struct rb_root *root)
+{
+ struct rb_node *node;
+ struct hist_entry *he;
+
+ while (!RB_EMPTY_ROOT(root)) {
+ node = rb_first(root);
+ rb_erase(node, root);
+
+ he = rb_entry(node, struct hist_entry, rb_node_in);
+ hist_entry__delete(he);
+ }
+}
+
+static void hists__delete_all_entries(struct hists *hists)
+{
+ hists__delete_entries(hists);
+ hists__delete_remaining_entries(&hists->entries_in_array[0]);
+ hists__delete_remaining_entries(&hists->entries_in_array[1]);
+ hists__delete_remaining_entries(&hists->entries_collapsed);
+}
+
static void hists_evsel__exit(struct perf_evsel *evsel)
{
struct hists *hists = evsel__hists(evsel);

- hists__delete_entries(hists);
+ hists__delete_all_entries(hists);
}

static int hists_evsel__init(struct perf_evsel *evsel)

Subject: [tip:perf/core] perf tools: Fix map_groups__clone to put cloned map

Commit-ID: bae32b50ea96ca0f8702ea55e62095e8cc4745e2
Gitweb: http://git.kernel.org/tip/bae32b50ea96ca0f8702ea55e62095e8cc4745e2
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 9 Dec 2015 11:11:20 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 9 Dec 2015 13:41:58 -0300

perf tools: Fix map_groups__clone to put cloned map

Fix map_groups__clone to put cloned map after inserting it to the
map_groups.

Refcnt debugger shows:
----
==== [0] ====
Unreclaimed map: 0x2a27ee0
Refcount +1 => 1 at
./perf(map_groups__clone+0x8d) [0x4bb7ed]
./perf(thread__fork+0xbe) [0x4c1f9e]
./perf(machine__process_fork_event+0x216) [0x4b79a6]
./perf(perf_event__synthesize_threads+0x38b) [0x48135b]
./perf(cmd_top+0xdc6) [0x43cb76]
./perf() [0x477223]
./perf(main+0x617) [0x422077]
/lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
./perf() [0x4221ed]
Refcount +1 => 2 at
./perf(map_groups__clone+0x128) [0x4bb888]
./perf(thread__fork+0xbe) [0x4c1f9e]
./perf(machine__process_fork_event+0x216) [0x4b79a6]
./perf(perf_event__synthesize_threads+0x38b) [0x48135b]
./perf(cmd_top+0xdc6) [0x43cb76]
./perf() [0x477223]
./perf(main+0x617) [0x422077]
/lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
./perf() [0x4221ed]
Refcount -1 => 1 at
./perf(map_groups__exit+0x87) [0x4ba757]
./perf(map_groups__put+0x68) [0x4ba9a8]
./perf(thread__put+0x8b) [0x4c1aeb]
./perf(machine__delete_threads+0x81) [0x4b48f1]
./perf(perf_session__delete+0x4f) [0x4be63f]
./perf(cmd_top+0x1094) [0x43ce44]
./perf() [0x477223]
./perf(main+0x617) [0x422077]
/lib64/libc.so.6(__libc_start_main+0xf0) [0x7ff806af8fe0]
./perf() [0x4221ed]
----

This shows map_groups__clone get the map twice and put it when
map_groups__exit.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/map.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 93d9f1c..7b1c720 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -742,6 +742,7 @@ int map_groups__clone(struct map_groups *mg,
if (new == NULL)
goto out_unlock;
map_groups__insert(mg, new);
+ map__put(new);
}

err = 0;

Subject: [tip:perf/core] perf stat: Fix cmd_stat to release cpu_map

Commit-ID: 544c2ae7b1a794ad0bc5ec24d832ab5658d5aef6
Gitweb: http://git.kernel.org/tip/544c2ae7b1a794ad0bc5ec24d832ab5658d5aef6
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 9 Dec 2015 11:11:27 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 9 Dec 2015 13:41:59 -0300

perf stat: Fix cmd_stat to release cpu_map

Fix cmd_stat() to release cpu_map objects (aggr_map and
cpus_aggr_map) afterwards.

refcnt debugger shows that the cmd_stat initializes cpu_map
but not puts it.
----
# ./perf stat -v ls
....
REFCNT: BUG: Unreclaimed objects found.
==== [0] ====
Unreclaimed cpu_map@0x29339c0
Refcount +1 => 1 at
./perf(cpu_map__empty_new+0x6d) [0x4e64bd]
./perf(cmd_stat+0x5fe) [0x43594e]
./perf() [0x47b785]
./perf(main+0x617) [0x422587]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2dff420af5]
./perf() [0x4226fd]
REFCNT: Total 1 objects are not reclaimed.
"cpu_map" leaks 1 objects
----

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Remove NULL checks before calling the put operation, it checks it already ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-stat.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index e74712d..25a95f4 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -1094,6 +1094,14 @@ static int perf_stat_init_aggr_mode(void)
return cpus_aggr_map ? 0 : -ENOMEM;
}

+static void perf_stat__exit_aggr_mode(void)
+{
+ cpu_map__put(aggr_map);
+ cpu_map__put(cpus_aggr_map);
+ aggr_map = NULL;
+ cpus_aggr_map = NULL;
+}
+
/*
* Add default attributes, if there were no attributes specified or
* if -d/--detailed, -d -d or -d -d -d is used:
@@ -1442,6 +1450,7 @@ int cmd_stat(int argc, const char **argv, const char *prefix __maybe_unused)
if (!forever && status != -1 && !interval)
print_counters(NULL, argc, argv);

+ perf_stat__exit_aggr_mode();
perf_evlist__free_stats(evsel_list);
out:
perf_evlist__delete(evsel_list);

Subject: [tip:perf/core] perf hists: Fix hists_evsel to release hists

Commit-ID: 17577decb2ddae28f5a449ddb79cf0ed3e2312c5
Gitweb: http://git.kernel.org/tip/17577decb2ddae28f5a449ddb79cf0ed3e2312c5
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 9 Dec 2015 11:11:29 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 9 Dec 2015 13:41:59 -0300

perf hists: Fix hists_evsel to release hists

Since hists__init doesn't set the destructor of hists_evsel (which is an
extended evsel structure), when hists_evsel is released, the extended
part of the hists_evsel is not deleted (note that the hists_evsel object
itself is freed).

This fixes it to add a destructor for hists__evsel and to set it up.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/hist.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
index 6e8e0ee..565ea35 100644
--- a/tools/perf/util/hist.c
+++ b/tools/perf/util/hist.c
@@ -1567,6 +1567,13 @@ static int hists_evsel__init(struct perf_evsel *evsel)
return 0;
}

+static void hists_evsel__exit(struct perf_evsel *evsel)
+{
+ struct hists *hists = evsel__hists(evsel);
+
+ hists__delete_entries(hists);
+}
+
/*
* XXX We probably need a hists_evsel__exit() to free the hist_entries
* stored in the rbtree...
@@ -1575,7 +1582,8 @@ static int hists_evsel__init(struct perf_evsel *evsel)
int hists__init(void)
{
int err = perf_evsel__object_config(sizeof(struct hists_evsel),
- hists_evsel__init, NULL);
+ hists_evsel__init,
+ hists_evsel__exit);
if (err)
fputs("FATAL ERROR: Couldn't setup hists class\n", stderr);

Subject: [tip:perf/core] perf tools: Fix maps__fixup_overlappings to put used maps

Commit-ID: d91130e90a005876b488b6d52b743149d95b4a59
Gitweb: http://git.kernel.org/tip/d91130e90a005876b488b6d52b743149d95b4a59
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 9 Dec 2015 11:11:31 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 9 Dec 2015 13:42:00 -0300

perf tools: Fix maps__fixup_overlappings to put used maps

Since the __map_groups__insert got the given map, we don't need to keep
it. So put the maps.

Refcnt debugger shows that map_groups__fixup_overlappings() got a map
twice but the group released it just once. This pattern usually
indicates the leak happens in caller site.

----
==== [0] ====
Unreclaimed map@0x39d3ae0
Refcount +1 => 1 at
./perf(map_groups__fixup_overlappings+0x335) [0x4c1865]
./perf(thread__insert_map+0x30) [0x4c8e00]
./perf(machine__process_mmap2_event+0x106) [0x4bd876]
./perf() [0x4c378e]
./perf() [0x4c4393]
./perf(perf_session__process_events+0x38a) [0x4c654a]
./perf(cmd_record+0xe24) [0x42fc94]
./perf() [0x47b745]
./perf(main+0x617) [0x422547]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
./perf() [0x4226bd]
Refcount +1 => 2 at
./perf(map_groups__fixup_overlappings+0x3c5) [0x4c18f5]
./perf(thread__insert_map+0x30) [0x4c8e00]
./perf(machine__process_mmap2_event+0x106) [0x4bd876]
./perf() [0x4c378e]
./perf() [0x4c4393]
./perf(perf_session__process_events+0x38a) [0x4c654a]
./perf(cmd_record+0xe24) [0x42fc94]
./perf() [0x47b745]
./perf(main+0x617) [0x422547]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
./perf() [0x4226bd]
Refcount -1 => 1 at
./perf(map_groups__exit+0x92) [0x4c0962]
./perf(map_groups__put+0x60) [0x4c0bc0]
./perf(thread__put+0x90) [0x4c8a40]
./perf(machine__delete_threads+0x7e) [0x4bad9e]
./perf(perf_session__delete+0x4f) [0x4c499f]
./perf(cmd_record+0xb6d) [0x42f9dd]
./perf() [0x47b745]
./perf(main+0x617) [0x422547]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f2eca2deaf5]
./perf() [0x4226bd]
----

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/map.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 7b1c720..171b6d1 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -691,6 +691,7 @@ static int maps__fixup_overlappings(struct maps *maps, struct map *map, FILE *fp
__map_groups__insert(pos->groups, before);
if (verbose >= 2)
map__fprintf(before, fp);
+ map__put(before);
}

if (map->end < pos->end) {
@@ -705,6 +706,7 @@ static int maps__fixup_overlappings(struct maps *maps, struct map *map, FILE *fp
__map_groups__insert(pos->groups, after);
if (verbose >= 2)
map__fprintf(after, fp);
+ map__put(after);
}
put_map:
map__put(pos);

Subject: [tip:perf/core] perf machine: Fix machine.vmlinux_maps to make sure to clear the old one

Commit-ID: cc1121ab9687d660cc02f50b1a4974112f87a8e6
Gitweb: http://git.kernel.org/tip/cc1121ab9687d660cc02f50b1a4974112f87a8e6
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 9 Dec 2015 11:11:33 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 9 Dec 2015 13:42:00 -0300

perf machine: Fix machine.vmlinux_maps to make sure to clear the old one

Fix machine.vmlinux_maps to make sure to clear the old one if it is
renewal. This can leak the previous maps on the vmlinux_maps because
those are just overwritten.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Simplified the memset, same end result ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/machine.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index bfc289c..f5882b8 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -44,6 +44,8 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
machine->comm_exec = false;
machine->kernel_start = 0;

+ memset(machine->vmlinux_maps, 0, sizeof(machine->vmlinux_maps));
+
machine->root_dir = strdup(root_dir);
if (machine->root_dir == NULL)
return -ENOMEM;
@@ -770,6 +772,9 @@ int __machine__create_kernel_maps(struct machine *machine, struct dso *kernel)
enum map_type type;
u64 start = machine__get_running_kernel_start(machine, NULL);

+ /* In case of renewal the kernel map, destroy previous one */
+ machine__destroy_kernel_maps(machine);
+
for (type = 0; type < MAP__NR_TYPES; ++type) {
struct kmap *kmap;
struct map *map;

Subject: [tip:perf/core] perf tools: Fix write_numa_topology to put cpu_map instead of free

Commit-ID: 5191d887681dd34ba3993a438d5746378952885a
Gitweb: http://git.kernel.org/tip/5191d887681dd34ba3993a438d5746378952885a
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 9 Dec 2015 11:11:35 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Wed, 9 Dec 2015 13:42:01 -0300

perf tools: Fix write_numa_topology to put cpu_map instead of free

Fix write_numa_topology to put cpu_map instead of free because cpu_map
is managed based on refcnt.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/header.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 4383800..5ac7bdb 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -724,7 +724,7 @@ static int write_numa_topology(int fd, struct perf_header *h __maybe_unused,
done:
free(buf);
fclose(fp);
- free(node_map);
+ cpu_map__put(node_map);
return ret;
}

Subject: RE: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

>From: Namhyung Kim [mailto:[email protected]]
>
>Hi Arnaldo and Masami,
>
>On Wed, Dec 09, 2015 at 10:41:38AM -0300, Arnaldo Carvalho de Melo wrote:
>> Em Wed, Dec 09, 2015 at 11:10:48AM +0900, Masami Hiramatsu escreveu:
>> > In this series I've also tried to fix some object leaks in perf top
>> > and perf stat.
>> > After applying this series, this reduced (not vanished) to 1/5.
>> > ----
>> > # ./perf top --stdio
>> > q
>> > exiting.
>> > REFCNT: BUG: Unreclaimed objects found.
>> > REFCNT: Total 866 objects are not reclaimed.
>> > "dso" leaks 213 objects
>> > "map" leaks 624 objects
>> > "comm_str" leaks 12 objects
>> > "thread" leaks 9 objects
>> > "map_groups" leaks 8 objects
>> > To see all backtraces, rerun with -v option
>> > ----
>> >
>> > Actually, I'm still not able to fix all of the bugs. It seems that
>> > hists has a bug that hists__delete_entries doesn't delete all the
>> > entries because some entries are on hists->entries but others on
>> > hists->entries_in. And I'm not so sure about hists.c.
>> > Arnaldo, would you have any idea for this bug?
>>
>> I'll will audit the hists code, its all about collecting new entries
>> into an rbtree to then move the last batch for merging with the
>> previous, decayed ones while continuing to process a new batch in
>> another thread.
>
>Right. After processing is done, hist entries are in both of
>hists->entries and hists->entries_in (or hists->entries_collapsed).
>So I guess perf report does not have leaks on hists.
>
>But for perf top, it's possible to have half-processed entries which
>are only in hists->entries_in. Eventually they will go to the
>hists->entries and get freed but they cannot be deleted by current
>hists__delete_entries(). If we really care deleting all of those
>half-processed entries, how about this?
>

Nice! I've applied below patch on the top of my series and
checked that no leak is detected. :)

You can add my acked-by and tested-by. :)

Acked-by: Masami Hiramatsu <[email protected]>
Tested-by: Masami Hiramatsu <[email protected]>

Thanks!

>
>
>diff --git a/tools/perf/util/hist.c b/tools/perf/util/hist.c
>index fd179a068df7..08396a7fea23 100644
>--- a/tools/perf/util/hist.c
>+++ b/tools/perf/util/hist.c
>@@ -270,6 +270,8 @@ static void hists__delete_entry(struct hists *hists, struct hist_entry *he)
>
> if (sort__need_collapse)
> rb_erase(&he->rb_node_in, &hists->entries_collapsed);
>+ else
>+ rb_erase(&he->rb_node_in, hists->entries_in);
>
> --hists->nr_entries;
> if (!he->filtered)
>@@ -1589,11 +1591,33 @@ int __hists__init(struct hists *hists)
> return 0;
> }
>
>+static void hists__delete_remaining_entries(struct rb_root *root)
>+{
>+ struct rb_node *node;
>+ struct hist_entry *he;
>+
>+ while (!RB_EMPTY_ROOT(root)) {
>+ node = rb_first(root);
>+ rb_erase(node, root);
>+
>+ he = rb_entry(node, struct hist_entry, rb_node_in);
>+ hist_entry__delete(he);
>+ }
>+}
>+
>+static void hists__delete_all_entries(struct hists *hists)
>+{
>+ hists__delete_entries(hists);
>+ hists__delete_remaining_entries(&hists->entries_in_array[0]);
>+ hists__delete_remaining_entries(&hists->entries_in_array[1]);
>+ hists__delete_remaining_entries(&hists->entries_collapsed);
>+}
>+
> static void hists_evsel__exit(struct perf_evsel *evsel)
> {
> struct hists *hists = evsel__hists(evsel);
>
>- hists__delete_entries(hists);
>+ hists__delete_all_entries(hists);
> }
>
> static int hists_evsel__init(struct perf_evsel *evsel)
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

Subject: RE: [PATCH perf/core 14/22] perf: Fix dso__load_sym to put dso

From: Arnaldo Carvalho de Melo [mailto:[email protected]]
>
>Em Wed, Dec 09, 2015 at 11:11:18AM +0900, Masami Hiramatsu escreveu:
>> +++ b/tools/perf/util/symbol-elf.c
>> @@ -1045,6 +1045,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
>> /* kmaps already got it */
>> map__put(curr_map);
>> dsos__add(&map->groups->machine->dsos, curr_dso);
>> + /* curr_map and machine->dsos already got it */
>> + dso__put(curr_dso);
>> dso__set_loaded(curr_dso, map->type);
>> } else
>> curr_dso = curr_map->dso;
>
>Right, to make the code smaller, how about doing it this way, i.e. drop
>the reference once we have that curr_dso object with a ref held by
>curr_map, if curr_map doesn't get it, then we don't need and will drop
>it anyway:

But as above code, curr_dso is passed to dsos__add after curr_map is put.
Even if the curr_map is hold by kmaps, isn't the kmaps controlled by other pthreads?
If no, I'm OK if we move above dsos__add() before map__put() for safety, because
curr_dso is held by curr_map at that point.

Thank you,

>
>diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
>index 53f19968bfa2..84d787074152 100644
>--- a/tools/perf/util/symbol-elf.c
>+++ b/tools/perf/util/symbol-elf.c
>@@ -1026,8 +1026,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
> curr_dso->long_name_len = dso->long_name_len;
> curr_map = map__new2(start, curr_dso,
> map->type);
>+ dso__put(curr_dso);
> if (curr_map == NULL) {
>- dso__put(curr_dso);
> goto out_elf_end;
> }
> if (adjust_kernel_syms) {

Subject: RE: [PATCH perf/core 16/22] perf: Fix __cmd_top and perf_session__process_events to put the idle thread

>From: Arnaldo Carvalho de Melo [mailto:[email protected]]
>
>Em Wed, Dec 09, 2015 at 11:11:23AM +0900, Masami Hiramatsu escreveu:
>> Since perf_session__register_idle_thread() got the idle thread,
>> caller functions have to put it afterwards.
>> Note that since the thread was already inserted to the session
>> list, it will be released when the session is released.
>> Also, in perf_session__register_idle_thread() failure path,
>> the thread should be put before returning.
>
>Wouldn't this be better by making perf_session__register_idle_thread()
>return -1 if it fails? I.e. that way its callers won't have to
>immediately put the idle thread, as they are doing nothing with it.
>

Ah, right. I thought that someone may use this return value, but no,
there is no code which use the returned thread.

>In the future, if someone needs a handle for that thread, a lookup can
>be done:
>
> idle = machine__find_thread(&session->machines.host, 0, 0);
>
>I.e. this would be the resulting patch, please let me know if you are ok
>with this approach:

I'm OK for your approach :)

Reviewed-by: Masami Hiramatsu <[email protected]>


Thanks!


>
>diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
>index 7e2e72e6d9d1..f26b08e72f74 100644
>--- a/tools/perf/builtin-top.c
>+++ b/tools/perf/builtin-top.c
>@@ -964,7 +964,7 @@ static int __cmd_top(struct perf_top *top)
> if (ret)
> goto out_delete;
>
>- if (perf_session__register_idle_thread(top->session) == NULL)
>+ if (perf_session__register_idle_thread(top->session) < 0)
> goto out_delete;
>
> machine__synthesize_threads(&top->session->machines.host, &opts->target,
>diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
>index c35ffdd360fe..9774686525b4 100644
>--- a/tools/perf/util/session.c
>+++ b/tools/perf/util/session.c
>@@ -1311,17 +1311,20 @@ struct thread *perf_session__findnew(struct perf_session *session, pid_t pid)
> return machine__findnew_thread(&session->machines.host, -1, pid);
> }
>
>-struct thread *perf_session__register_idle_thread(struct perf_session *session)
>+int perf_session__register_idle_thread(struct perf_session *session)
> {
> struct thread *thread;
>+ int err = 0;
>
> thread = machine__findnew_thread(&session->machines.host, 0, 0);
> if (thread == NULL || thread__set_comm(thread, "swapper", 0)) {
> pr_err("problem inserting idle task.\n");
>- thread = NULL;
>+ err = -1;
> }
>
>- return thread;
>+ /* machine__findnew_thread() got the thread, so put it */
>+ thread__put(thread);
>+ return err;
> }
>
> static void perf_session__warn_about_errors(const struct perf_session *session)
>@@ -1676,7 +1679,7 @@ int perf_session__process_events(struct perf_session *session)
> u64 size = perf_data_file__size(session->file);
> int err;
>
>- if (perf_session__register_idle_thread(session) == NULL)
>+ if (perf_session__register_idle_thread(session) < 0)
> return -ENOMEM;
>
> if (!perf_data_file__is_pipe(session->file))
>diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
>index 3e900c0efc73..5f792e35d4c1 100644
>--- a/tools/perf/util/session.h
>+++ b/tools/perf/util/session.h
>@@ -89,7 +89,7 @@ struct machine *perf_session__findnew_machine(struct perf_session *session, pid_
> }
>
> struct thread *perf_session__findnew(struct perf_session *session, pid_t pid);
>-struct thread *perf_session__register_idle_thread(struct perf_session *session);
>+int perf_session__register_idle_thread(struct perf_session *session);
>
> size_t perf_session__fprintf(struct perf_session *session, FILE *fp);
>

Subject: RE: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

>From: Arnaldo Carvalho de Melo [mailto:[email protected]]
>
>Em Wed, Dec 09, 2015 at 11:10:48AM +0900, Masami Hiramatsu escreveu:
>> Hi Arnaldo,
>>
>> Here is a series of patches for perf refcnt debugger and
>> some fixes.
>>
>> In this series I've replaced all atomic reference counters
>> with the refcnt interface, including dso, map, map_groups,
>> thread, cpu_map, comm_str, cgroup_sel, thread_map, and
>> perf_mmap.
>>
>> refcnt debugger (or refcnt leak checker)
>> ===============
>>
>> At first, note that this change doesn't affect any compiled
>> code unless building with REFCNT_DEBUG=1 (see macros in
>> refcnt.h). So, this feature is only enabled in the debug binary.
>> But before releasing, we can ensure that all objects are safely
>> reclaimed before exit in -rc phase.
>
>That helps and is finding bugs and is really great stuff, thank you!
>
>But I wonder if we couldn't get the same results on an unmodified binary
>by using things like 'perf probe', the BPF code we're introducing, have
>you thought about this possibility?

That's possible, but it will require pre-analysis of the binary, because
refcnt interface is not fixed API like a "systemcall" (moreover, it could
be just a non-atomic variable).

Thus we need a kind of "annotation" event by source code level.

>
>I.e. trying to use 'perf probe' to do this would help in using the same
>technique in other code bases where we can't change the sources, etc.
>
>For perf we could perhaps use a 'noinline' on the __get/__put
>operations, so that we could have the right places to hook using
>uprobes, other codebases would have to rely in the 'perf probe'
>infrastructure that knows where inlines were expanded, etc.
>
>Such a toold could work like:
>
> perf dbgrefcnt ~/bin/perf thread

This works only for the binary which is coded as you said.
I actually doubt that this is universal solution. We'd better introduce
librefcnt.so if you need more general solution, so that we can fix the
refcnt API and we can also hook the interface easily.

But with this way, we don't need ebpf/uprobes anymore, since we've already
have LD_PRELOAD (like valgrind does). :(

So, IMHO, using ebpf and perf probe for this issue sounds like using
a sledge$B!>(Bhammer...

Thanks,

>
>And it would look up thread__get and thread__put(), create an eBPF map
>where to store the needed tracking data structures, and use the same
>techniques you used, asking for backtraces using the perf
>infrastructure, etc.
>
>We would be using this opportunity to improve the 'perf probe' and the
>eBPF infrastructures we're putting in place, and having something that
>could be used in other codebases, not just perf.
>



2015-12-10 12:52:32

by Wang Nan

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/22] perf refcnt debugger API and fixes



On 2015/12/10 19:04, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> From: Arnaldo Carvalho de Melo [mailto:[email protected]]
>>
>> Em Wed, Dec 09, 2015 at 11:10:48AM +0900, Masami Hiramatsu escreveu:
>>> Hi Arnaldo,
>>>
>>> Here is a series of patches for perf refcnt debugger and
>>> some fixes.
>>>
>>> In this series I've replaced all atomic reference counters
>>> with the refcnt interface, including dso, map, map_groups,
>>> thread, cpu_map, comm_str, cgroup_sel, thread_map, and
>>> perf_mmap.
>>>
>>> refcnt debugger (or refcnt leak checker)
>>> ===============
>>>
>>> At first, note that this change doesn't affect any compiled
>>> code unless building with REFCNT_DEBUG=1 (see macros in
>>> refcnt.h). So, this feature is only enabled in the debug binary.
>>> But before releasing, we can ensure that all objects are safely
>>> reclaimed before exit in -rc phase.
>> That helps and is finding bugs and is really great stuff, thank you!
>>
>> But I wonder if we couldn't get the same results on an unmodified binary
>> by using things like 'perf probe', the BPF code we're introducing, have
>> you thought about this possibility?
> That's possible, but it will require pre-analysis of the binary, because
> refcnt interface is not fixed API like a "systemcall" (moreover, it could
> be just a non-atomic variable).
>
> Thus we need a kind of "annotation" event by source code level.
>
>> I.e. trying to use 'perf probe' to do this would help in using the same
>> technique in other code bases where we can't change the sources, etc.
>>
>> For perf we could perhaps use a 'noinline' on the __get/__put
>> operations, so that we could have the right places to hook using
>> uprobes, other codebases would have to rely in the 'perf probe'
>> infrastructure that knows where inlines were expanded, etc.
>>
>> Such a toold could work like:
>>
>> perf dbgrefcnt ~/bin/perf thread
> This works only for the binary which is coded as you said.
> I actually doubt that this is universal solution. We'd better introduce
> librefcnt.so if you need more general solution, so that we can fix the
> refcnt API and we can also hook the interface easily.
>
> But with this way, we don't need ebpf/uprobes anymore, since we've already
> have LD_PRELOAD (like valgrind does). :(
>
> So, IMHO, using ebpf and perf probe for this issue sounds like using
> a sledge‐hammer...

But this is an interesting problem and can inspire us the direction
for eBPF improvement. I guess if we can solve this problem with eBPF
we can also solve many similar problems with much lower cost than what
you have done in first 5 patches?

This is what we have done today:

With a much simpler patch which create 4 stub functions:

diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index 65fef59..2c45478 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -87,6 +87,7 @@ libperf-$(CONFIG_AUXTRACE) += intel-bts.o
libperf-y += parse-branch-options.o
libperf-y += parse-regs-options.o
libperf-y += term.o
+libperf-y += refcnt.o

libperf-$(CONFIG_LIBBPF) += bpf-loader.o
libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index e8e9a9d..de52ae8 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -1,6 +1,7 @@
#include <asm/bug.h>
#include <sys/time.h>
#include <sys/resource.h>
+#include "refcnt.h"
#include "symbol.h"
#include "dso.h"
#include "machine.h"
diff --git a/tools/perf/util/refcnt.c b/tools/perf/util/refcnt.c
new file mode 100644
index 0000000..f5a6659
--- /dev/null
+++ b/tools/perf/util/refcnt.c
@@ -0,0 +1,29 @@
+#include <linux/compiler.h>
+#include "util/refcnt.h"
+
+void __attribute__ ((noinline))
+__refcnt__init(atomic_t *refcnt, int n,
+ void *obj __maybe_unused,
+ const char *name __maybe_unused)
+{
+ atomic_set(refcnt, n);
+}
+
+void __attribute__ ((noinline))
+refcnt__delete(void *obj __maybe_unused)
+{
+}
+
+void __attribute__ ((noinline))
+__refcnt__get(atomic_t *refcnt, void *obj __maybe_unused,
+ const char *name __maybe_unused)
+{
+ atomic_inc(refcnt);
+}
+
+int __attribute__ ((noinline))
+__refcnt__put(atomic_t *refcnt, void *obj __maybe_unused,
+ const char *name __maybe_unused)
+{
+ return atomic_dec_and_test(refcnt);
+}
diff --git a/tools/perf/util/refcnt.h b/tools/perf/util/refcnt.h
new file mode 100644
index 0000000..04f5390
--- /dev/null
+++ b/tools/perf/util/refcnt.h
@@ -0,0 +1,21 @@
+#ifndef PERF_REFCNT_H
+#define PERF_REFCNT_H
+#include <linux/atomic.h>
+
+void __refcnt__init(atomic_t *refcnt, int n, void *obj, const char *name);
+void refcnt__delete(void *obj);
+void __refcnt__get(atomic_t *refcnt, void *obj, const char *name);
+int __refcnt__put(atomic_t *refcnt, void *obj, const char *name);
+
+#define refcnt__init(obj, member, n) \
+ __refcnt__init(&(obj)->member, n, obj, #obj)
+#define refcnt__init_as(obj, member, n, name) \
+ __refcnt__init(&(obj)->member, n, obj, name)
+#define refcnt__exit(obj, member) \
+ refcnt__delete(obj)
+#define refcnt__get(obj, member) \
+ __refcnt__get(&(obj)->member, obj, #obj)
+#define refcnt__put(obj, member) \
+ __refcnt__put(&(obj)->member, obj, #obj)
+
+#endif

And a relative complex eBPF script attached at the end of
this mail, with following cmdline:

# ./perf record -e ./refcnt.c ./perf probe vfs_read
# cat /sys/kernel/debug/tracing/trace
...
perf-18419 [004] d... 613572.513083: : Type 0 leak 2
perf-18419 [004] d... 613572.513084: : Type 1 leak 1

I know we have 2 dsos and 1 map get leak.

However I have to analysis full stack trace from 'perf script' to find
which one get leak, because currently my eBPF script is unable to report
which object is leak. I know I can use a hashtable with object address
as key, but currently I don't know how to enumerate keys in a hash table,
except maintaining a relationship between index and object address.
Now I'm waiting for Daniel's persistent map to be enforced for that. When
it ready we can create a tool with the following eBPF script embedded into
perf as a small subcommand, and report call stack of 'alloc' method of
leak object in 'perf report' style, so we can solve similar problem easier.
To make it genereic, we can even make it attach to '{m,c}alloc%return' and
'free', or 'mmap/munmap'.

Thank you.


-------------- eBPF script --------------

typedef int u32;
typedef unsigned long long u64;
#define NULL ((void *)(0))

#define BPF_ANY 0 /* create new element or update existing */
#define BPF_NOEXIST 1 /* create new element if it didn't exist */
#define BPF_EXIST 2 /* update existing element */

enum bpf_map_type {
BPF_MAP_TYPE_UNSPEC,
BPF_MAP_TYPE_HASH,
BPF_MAP_TYPE_ARRAY,
BPF_MAP_TYPE_PROG_ARRAY,
BPF_MAP_TYPE_PERF_EVENT_ARRAY,
};

struct bpf_map_def {
unsigned int type;
unsigned int key_size;
unsigned int value_size;
unsigned int max_entries;
};

#define SEC(NAME) __attribute__((section(NAME), used))
static int (*bpf_probe_read)(void *dst, int size, void *src) =
(void *)4;
static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
(void *)6;
static int (*bpf_get_smp_processor_id)(void) =
(void *)8;
static int (*map_update_elem)(struct bpf_map_def *, void *, void *,
unsigned long long flags) =
(void *)2;
static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
(void *)1;
static unsigned long long (*get_current_pid_tgid)(void) =
(void *)14;
static unsigned long long (*get_current_comm)(char *buf, int size_of_buf) =
(void *)16;

char _license[] SEC("license") = "GPL";
int _version SEC("version") = LINUX_VERSION_CODE;

enum global_var {
G_pid,
G_LEAK_START,
G_dso_leak = G_LEAK_START,
G_map_group_leak,
G_LEAK_END,
G_NR = G_LEAK_END,
};

struct bpf_map_def SEC("maps") global_vars = {
.type = BPF_MAP_TYPE_ARRAY,
.key_size = sizeof(int),
.value_size = sizeof(u64),
.max_entries = G_NR,
};

static inline int filter_pid(void)
{
int key_pid = G_pid;
unsigned long long *p_pid, pid;

char fmt[] = "%d vs %d\n";

p_pid = map_lookup_elem(&global_vars, &key_pid);
if (!p_pid)
return 0;

pid = get_current_pid_tgid() & 0xffffffff;

if (*p_pid != pid)
return 0;
return 1;
}

static inline void print_leak(int type)
{
unsigned long long *p_cnt;
char fmt[] = "Type %d leak %llu\n";

p_cnt = map_lookup_elem(&global_vars, &type);
if (!p_cnt)
return;
bpf_trace_printk(fmt, sizeof(fmt), type - G_LEAK_START, *p_cnt);
}

SEC("execve=sys_execve")
int execve(void *ctx)
{
char name[sizeof(u64)] = "";
char name_perf[sizeof(u64)] = "perf";
unsigned long long *p_pid, pid;
int key = G_pid;

p_pid = map_lookup_elem(&global_vars, &key);
if (!p_pid)
return 0;
pid = *p_pid;
if (pid)
return 0;
if (get_current_comm(name, sizeof(name)))
return 0;
if (*(u32*)name != *(u32*)name_perf)
return 0;

pid = get_current_pid_tgid() & 0xffffffff;
map_update_elem(&global_vars, &key, &pid, BPF_ANY);
return 0;
}

static inline int func_exit(void *ctx)
{
if (!filter_pid())
return 0;
print_leak(G_dso_leak);
print_leak(G_map_group_leak);
return 0;
}

SEC("exit_group=sys_exit_group")
int exit_group(void *ctx)
{
return func_exit(ctx);
}

SEC("exit_=sys_exit")
int exit_(void *ctx)
{
return func_exit(ctx);
}
static inline void inc_leak_from_type(int type, int n)
{
u64 *p_cnt, cnt;

type += G_LEAK_START;
if (type >= G_LEAK_END)
return;

p_cnt = map_lookup_elem(&global_vars, &type);
if (!p_cnt)
cnt = n;
else
cnt = *p_cnt + n;

map_update_elem(&global_vars, &type, &cnt, BPF_ANY);
return;
}

SEC("exec=/home/wangnan/perf;"
"refcnt_init=__refcnt__init n obj type")
int refcnt_init(void *ctx, int err, int n, void *obj, int type)
{
if (!filter_pid())
return 0;
inc_leak_from_type(type, n);
return 0;
}
SEC("exec=/home/wangnan/perf;"
"refcnt_del=refcnt__delete obj type")
int refcnt_del(void *ctx, int err, void *obj, int type)
{
if (!filter_pid())
return 0;
return 0;
}

SEC("exec=/home/wangnan/perf;"
"refcnt_get=__refcnt__get obj type")
int refcnt_get(void *ctx, int err, void *obj, int type)
{
if (!filter_pid())
return 0;
inc_leak_from_type(type, 1);
return 0;
}

SEC("exec=/home/wangnan/perf;"
"refcnt_put=__refcnt__put refcnt obj type")
int refcnt_put(void *ctx, int err, void *refcnt, void *obj, int type)
{
int old_cnt = -1;

if (!filter_pid())
return 0;
if (bpf_probe_read(&old_cnt, sizeof(int), refcnt))
return 0;
if (old_cnt)
inc_leak_from_type(type, -1);
return 0;
}

2015-12-10 15:12:52

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

Em Thu, Dec 10, 2015 at 08:52:02PM +0800, Wangnan (F) escreveu:
> On 2015/12/10 19:04, 平松雅巳 / HIRAMATU,MASAMI wrote:
> >>From: Arnaldo Carvalho de Melo [mailto:[email protected]]
> >>Em Wed, Dec 09, 2015 at 11:10:48AM +0900, Masami Hiramatsu escreveu:
> >>>Here is a series of patches for perf refcnt debugger and
> >>>some fixes.
> >>>
> >>>In this series I've replaced all atomic reference counters
> >>>with the refcnt interface, including dso, map, map_groups,
> >>>thread, cpu_map, comm_str, cgroup_sel, thread_map, and
> >>>perf_mmap.
> >>>
> >>> refcnt debugger (or refcnt leak checker)
> >>> ===============
> >>>
> >>>At first, note that this change doesn't affect any compiled
> >>>code unless building with REFCNT_DEBUG=1 (see macros in
> >>>refcnt.h). So, this feature is only enabled in the debug binary.
> >>>But before releasing, we can ensure that all objects are safely
> >>>reclaimed before exit in -rc phase.
> >>That helps and is finding bugs and is really great stuff, thank you!
> >>
> >>But I wonder if we couldn't get the same results on an unmodified binary
> >>by using things like 'perf probe', the BPF code we're introducing, have
> >>you thought about this possibility?
> >That's possible, but it will require pre-analysis of the binary, because
> >refcnt interface is not fixed API like a "systemcall" (moreover, it could
> >be just a non-atomic variable).
> >
> >Thus we need a kind of "annotation" event by source code level.
> >
> >>I.e. trying to use 'perf probe' to do this would help in using the same
> >>technique in other code bases where we can't change the sources, etc.
> >>
> >>For perf we could perhaps use a 'noinline' on the __get/__put
> >>operations, so that we could have the right places to hook using
> >>uprobes, other codebases would have to rely in the 'perf probe'
> >>infrastructure that knows where inlines were expanded, etc.
> >>
> >>Such a toold could work like:
> >>
> >> perf dbgrefcnt ~/bin/perf thread
> >This works only for the binary which is coded as you said.
> >I actually doubt that this is universal solution. We'd better introduce
> >librefcnt.so if you need more general solution, so that we can fix the
> >refcnt API and we can also hook the interface easily.
> >
> >But with this way, we don't need ebpf/uprobes anymore, since we've already
> >have LD_PRELOAD (like valgrind does). :(
> >
> >So, IMHO, using ebpf and perf probe for this issue sounds like using
> >a sledge‐hammer...
>
> But this is an interesting problem and can inspire us the direction
> for eBPF improvement. I guess if we can solve this problem with eBPF

That is the spirit! All those efforts in the kernel to have always
available tracing facilities, kprobes, uprobes, eBPF, jump labels, you
name it, leaves me thinking that doing things by having to rebuild from
sources using defines, and even having to restart a workload to use
LD_PRELOAD tricks looks backwards :-)

> we can also solve many similar problems with much lower cost than what
> you have done in first 5 patches?
>
> This is what we have done today:
>
> With a much simpler patch which create 4 stub functions:
>
> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> index 65fef59..2c45478 100644
> --- a/tools/perf/util/Build
> +++ b/tools/perf/util/Build
> @@ -87,6 +87,7 @@ libperf-$(CONFIG_AUXTRACE) += intel-bts.o
> libperf-y += parse-branch-options.o
> libperf-y += parse-regs-options.o
> libperf-y += term.o
> +libperf-y += refcnt.o
>
> libperf-$(CONFIG_LIBBPF) += bpf-loader.o
> libperf-$(CONFIG_BPF_PROLOGUE) += bpf-prologue.o
> diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> index e8e9a9d..de52ae8 100644
> --- a/tools/perf/util/dso.c
> +++ b/tools/perf/util/dso.c
> @@ -1,6 +1,7 @@
> #include <asm/bug.h>
> #include <sys/time.h>
> #include <sys/resource.h>
> +#include "refcnt.h"
> #include "symbol.h"
> #include "dso.h"
> #include "machine.h"
> diff --git a/tools/perf/util/refcnt.c b/tools/perf/util/refcnt.c
> new file mode 100644
> index 0000000..f5a6659
> --- /dev/null
> +++ b/tools/perf/util/refcnt.c
> @@ -0,0 +1,29 @@
> +#include <linux/compiler.h>
> +#include "util/refcnt.h"
> +
> +void __attribute__ ((noinline))
> +__refcnt__init(atomic_t *refcnt, int n,
> + void *obj __maybe_unused,
> + const char *name __maybe_unused)
> +{
> + atomic_set(refcnt, n);
> +}
> +
> +void __attribute__ ((noinline))
> +refcnt__delete(void *obj __maybe_unused)
> +{
> +}
> +
> +void __attribute__ ((noinline))
> +__refcnt__get(atomic_t *refcnt, void *obj __maybe_unused,
> + const char *name __maybe_unused)
> +{
> + atomic_inc(refcnt);
> +}
> +
> +int __attribute__ ((noinline))
> +__refcnt__put(atomic_t *refcnt, void *obj __maybe_unused,
> + const char *name __maybe_unused)
> +{
> + return atomic_dec_and_test(refcnt);
> +}
> diff --git a/tools/perf/util/refcnt.h b/tools/perf/util/refcnt.h
> new file mode 100644
> index 0000000..04f5390
> --- /dev/null
> +++ b/tools/perf/util/refcnt.h
> @@ -0,0 +1,21 @@
> +#ifndef PERF_REFCNT_H
> +#define PERF_REFCNT_H
> +#include <linux/atomic.h>
> +
> +void __refcnt__init(atomic_t *refcnt, int n, void *obj, const char *name);
> +void refcnt__delete(void *obj);
> +void __refcnt__get(atomic_t *refcnt, void *obj, const char *name);
> +int __refcnt__put(atomic_t *refcnt, void *obj, const char *name);
> +
> +#define refcnt__init(obj, member, n) \
> + __refcnt__init(&(obj)->member, n, obj, #obj)
> +#define refcnt__init_as(obj, member, n, name) \
> + __refcnt__init(&(obj)->member, n, obj, name)
> +#define refcnt__exit(obj, member) \
> + refcnt__delete(obj)
> +#define refcnt__get(obj, member) \
> + __refcnt__get(&(obj)->member, obj, #obj)
> +#define refcnt__put(obj, member) \
> + __refcnt__put(&(obj)->member, obj, #obj)
> +
> +#endif

But this requires having these special refcnt__ routines, that will make
tools/perf/ code patterns for reference counts look different that the
refcount patterns in the kernel :-\

And would be a requirement to change the observed workload :-\

Is this _strictly_ required? Can't we, for a project like perf, where we
know where some refcount (say, the one for 'struct thread') gets
initialized, use that (thread__new()) and then hook into thread__get and
thread__put and then use the destructor, thread__delete() as the place
to dump leaks?

> And a relative complex eBPF script attached at the end of
> this mail, with following cmdline:
>
> # ./perf record -e ./refcnt.c ./perf probe vfs_read
> # cat /sys/kernel/debug/tracing/trace
> ...
> perf-18419 [004] d... 613572.513083: : Type 0 leak 2
> perf-18419 [004] d... 613572.513084: : Type 1 leak 1
>
> I know we have 2 dsos and 1 map get leak.
>
> However I have to analysis full stack trace from 'perf script' to find
> which one get leak, because currently my eBPF script is unable to report
> which object is leak. I know I can use a hashtable with object address
> as key, but currently I don't know how to enumerate keys in a hash table,



> except maintaining a relationship between index and object address.
> Now I'm waiting for Daniel's persistent map to be enforced for that. When
> it ready we can create a tool with the following eBPF script embedded into
> perf as a small subcommand, and report call stack of 'alloc' method of
> leak object in 'perf report' style, so we can solve similar problem easier.
> To make it genereic, we can even make it attach to '{m,c}alloc%return' and
> 'free', or 'mmap/munmap'.
>
> Thank you.
>
>
> -------------- eBPF script --------------
>
> typedef int u32;
> typedef unsigned long long u64;
> #define NULL ((void *)(0))
>
> #define BPF_ANY 0 /* create new element or update existing */
> #define BPF_NOEXIST 1 /* create new element if it didn't exist */
> #define BPF_EXIST 2 /* update existing element */
>
> enum bpf_map_type {
> BPF_MAP_TYPE_UNSPEC,
> BPF_MAP_TYPE_HASH,
> BPF_MAP_TYPE_ARRAY,
> BPF_MAP_TYPE_PROG_ARRAY,
> BPF_MAP_TYPE_PERF_EVENT_ARRAY,
> };
>
> struct bpf_map_def {
> unsigned int type;
> unsigned int key_size;
> unsigned int value_size;
> unsigned int max_entries;
> };
>
> #define SEC(NAME) __attribute__((section(NAME), used))
> static int (*bpf_probe_read)(void *dst, int size, void *src) =
> (void *)4;
> static int (*bpf_trace_printk)(const char *fmt, int fmt_size, ...) =
> (void *)6;
> static int (*bpf_get_smp_processor_id)(void) =
> (void *)8;
> static int (*map_update_elem)(struct bpf_map_def *, void *, void *, unsigned
> long long flags) =
> (void *)2;
> static void *(*map_lookup_elem)(struct bpf_map_def *, void *) =
> (void *)1;
> static unsigned long long (*get_current_pid_tgid)(void) =
> (void *)14;
> static unsigned long long (*get_current_comm)(char *buf, int size_of_buf) =
> (void *)16;
>
> char _license[] SEC("license") = "GPL";
> int _version SEC("version") = LINUX_VERSION_CODE;
>
> enum global_var {
> G_pid,
> G_LEAK_START,
> G_dso_leak = G_LEAK_START,
> G_map_group_leak,
> G_LEAK_END,
> G_NR = G_LEAK_END,
> };
>
> struct bpf_map_def SEC("maps") global_vars = {
> .type = BPF_MAP_TYPE_ARRAY,
> .key_size = sizeof(int),
> .value_size = sizeof(u64),
> .max_entries = G_NR,
> };
>
> static inline int filter_pid(void)
> {
> int key_pid = G_pid;
> unsigned long long *p_pid, pid;
>
> char fmt[] = "%d vs %d\n";
>
> p_pid = map_lookup_elem(&global_vars, &key_pid);
> if (!p_pid)
> return 0;
>
> pid = get_current_pid_tgid() & 0xffffffff;
>
> if (*p_pid != pid)
> return 0;
> return 1;
> }
>
> static inline void print_leak(int type)
> {
> unsigned long long *p_cnt;
> char fmt[] = "Type %d leak %llu\n";
>
> p_cnt = map_lookup_elem(&global_vars, &type);
> if (!p_cnt)
> return;
> bpf_trace_printk(fmt, sizeof(fmt), type - G_LEAK_START, *p_cnt);
> }
>
> SEC("execve=sys_execve")
> int execve(void *ctx)
> {
> char name[sizeof(u64)] = "";
> char name_perf[sizeof(u64)] = "perf";
> unsigned long long *p_pid, pid;
> int key = G_pid;
>
> p_pid = map_lookup_elem(&global_vars, &key);
> if (!p_pid)
> return 0;
> pid = *p_pid;
> if (pid)
> return 0;
> if (get_current_comm(name, sizeof(name)))
> return 0;
> if (*(u32*)name != *(u32*)name_perf)
> return 0;
>
> pid = get_current_pid_tgid() & 0xffffffff;
> map_update_elem(&global_vars, &key, &pid, BPF_ANY);
> return 0;
> }
>
> static inline int func_exit(void *ctx)
> {
> if (!filter_pid())
> return 0;
> print_leak(G_dso_leak);
> print_leak(G_map_group_leak);
> return 0;
> }
>
> SEC("exit_group=sys_exit_group")
> int exit_group(void *ctx)
> {
> return func_exit(ctx);
> }
>
> SEC("exit_=sys_exit")
> int exit_(void *ctx)
> {
> return func_exit(ctx);
> }
> static inline void inc_leak_from_type(int type, int n)
> {
> u64 *p_cnt, cnt;
>
> type += G_LEAK_START;
> if (type >= G_LEAK_END)
> return;
>
> p_cnt = map_lookup_elem(&global_vars, &type);
> if (!p_cnt)
> cnt = n;
> else
> cnt = *p_cnt + n;
>
> map_update_elem(&global_vars, &type, &cnt, BPF_ANY);
> return;
> }
>
> SEC("exec=/home/wangnan/perf;"
> "refcnt_init=__refcnt__init n obj type")
> int refcnt_init(void *ctx, int err, int n, void *obj, int type)
> {
> if (!filter_pid())
> return 0;
> inc_leak_from_type(type, n);
> return 0;
> }
> SEC("exec=/home/wangnan/perf;"
> "refcnt_del=refcnt__delete obj type")
> int refcnt_del(void *ctx, int err, void *obj, int type)
> {
> if (!filter_pid())
> return 0;
> return 0;
> }
>
> SEC("exec=/home/wangnan/perf;"
> "refcnt_get=__refcnt__get obj type")
> int refcnt_get(void *ctx, int err, void *obj, int type)
> {
> if (!filter_pid())
> return 0;
> inc_leak_from_type(type, 1);
> return 0;
> }
>
> SEC("exec=/home/wangnan/perf;"
> "refcnt_put=__refcnt__put refcnt obj type")
> int refcnt_put(void *ctx, int err, void *refcnt, void *obj, int type)
> {
> int old_cnt = -1;
>
> if (!filter_pid())
> return 0;
> if (bpf_probe_read(&old_cnt, sizeof(int), refcnt))
> return 0;
> if (old_cnt)
> inc_leak_from_type(type, -1);
> return 0;
> }
>

2015-12-10 19:26:30

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 14/22] perf: Fix dso__load_sym to put dso

Em Thu, Dec 10, 2015 at 08:52:46AM +0000, 平松雅巳 / HIRAMATU,MASAMI escreveu:
> From: Arnaldo Carvalho de Melo [mailto:[email protected]]
> >
> >Em Wed, Dec 09, 2015 at 11:11:18AM +0900, Masami Hiramatsu escreveu:
> >> +++ b/tools/perf/util/symbol-elf.c
> >> @@ -1045,6 +1045,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
> >> /* kmaps already got it */
> >> map__put(curr_map);
> >> dsos__add(&map->groups->machine->dsos, curr_dso);
> >> + /* curr_map and machine->dsos already got it */
> >> + dso__put(curr_dso);
> >> dso__set_loaded(curr_dso, map->type);
> >> } else
> >> curr_dso = curr_map->dso;
> >
> >Right, to make the code smaller, how about doing it this way, i.e. drop
> >the reference once we have that curr_dso object with a ref held by
> >curr_map, if curr_map doesn't get it, then we don't need and will drop
> >it anyway:
>
> But as above code, curr_dso is passed to dsos__add after curr_map is put.
> Even if the curr_map is hold by kmaps, isn't the kmaps controlled by other pthreads?
> If no, I'm OK if we move above dsos__add() before map__put() for safety, because
> curr_dso is held by curr_map at that point.

Good catch, so I'll do as you suggest and move dsos__add() to before we
drop that map reference, as then we're sure that we hold it and it holds
the dso.

> Thank you,
>
> >
> >diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
> >index 53f19968bfa2..84d787074152 100644
> >--- a/tools/perf/util/symbol-elf.c
> >+++ b/tools/perf/util/symbol-elf.c
> >@@ -1026,8 +1026,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
> > curr_dso->long_name_len = dso->long_name_len;
> > curr_map = map__new2(start, curr_dso,
> > map->type);
> >+ dso__put(curr_dso);
> > if (curr_map == NULL) {
> >- dso__put(curr_dso);
> > goto out_elf_end;
> > }
> > if (adjust_kernel_syms) {

2015-12-11 01:54:17

by Wang Nan

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/22] perf refcnt debugger API and fixes



On 2015/12/10 23:12, 'Arnaldo Carvalho de Melo' wrote:

[SNIP]
> But this requires having these special refcnt__ routines, that will make
> tools/perf/ code patterns for reference counts look different that the
> refcount patterns in the kernel :-\
>
> And would be a requirement to change the observed workload :-\
>
> Is this _strictly_ required?

No. The requirement should be:

1. The create/get/put/delete functions are non-inline (because dwarf info
is not as reliable as symbol);
2. From their argument list, we can always get the variable we need (the
pointer of objects, the value of refcnt, etc.)

We don't have to use this refcnt things.

> Can't we, for a project like perf, where we
> know where some refcount (say, the one for 'struct thread') gets
> initialized, use that (thread__new()) and then hook into thread__get and
> thread__put and then use the destructor, thread__delete() as the place
> to dump leaks?
>

I think it is possible. If we can abstract a common pattern about it,
we can provide a perf subcommand which we can deal with generic alloc/free
pattern. I'll put it on my todo-list. Currently we are focusing on perf
daemonization.

Thank you.

Subject: RE: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

From: Wangnan (F) [mailto:[email protected]]
>On 2015/12/10 23:12, 'Arnaldo Carvalho de Melo' wrote:
>
>[SNIP]
>> But this requires having these special refcnt__ routines, that will make
>> tools/perf/ code patterns for reference counts look different that the
>> refcount patterns in the kernel :-\
>>
>> And would be a requirement to change the observed workload :-\
>>
>> Is this _strictly_ required?
>
>No. The requirement should be:
>
> 1. The create/get/put/delete functions are non-inline (because dwarf info
> is not as reliable as symbol);
> 2. From their argument list, we can always get the variable we need (the
> pointer of objects, the value of refcnt, etc.)

However, we have to customize it for each application. Perf itself might be OK
but others might have different implementation.

Thanks,

????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

Subject: RE: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

From: 'Arnaldo Carvalho de Melo' [mailto:[email protected]]
>
>But this requires having these special refcnt__ routines, that will make
>tools/perf/ code patterns for reference counts look different that the
>refcount patterns in the kernel :-\

BTW, I think even without the refcnt debugger, we'd better introduce this
kind API to unify the refcnt operation in perf code. As I said, we have many
miscodings on current implementation. Unifying the API can enforce developers
to avoid such miscodings.

Thank you,


????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2015-12-11 02:22:31

by Wang Nan

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/22] perf refcnt debugger API and fixes



On 2015/12/11 10:08, 平松雅巳 / HIRAMATU,MASAMI wrote:
> From: Wangnan (F) [mailto:[email protected]]
>> On 2015/12/10 23:12, 'Arnaldo Carvalho de Melo' wrote:
>>
>> [SNIP]
>>> But this requires having these special refcnt__ routines, that will make
>>> tools/perf/ code patterns for reference counts look different that the
>>> refcount patterns in the kernel :-\
>>>
>>> And would be a requirement to change the observed workload :-\
>>>
>>> Is this _strictly_ required?
>> No. The requirement should be:
>>
>> 1. The create/get/put/delete functions are non-inline (because dwarf info
>> is not as reliable as symbol);
>> 2. From their argument list, we can always get the variable we need (the
>> pointer of objects, the value of refcnt, etc.)
> However, we have to customize it for each application. Perf itself might be OK
> but others might have different implementation.

If limited to pairwise operations ({{m,c}alloc,strdup} vs free, open vs
close),
I think it is possible to abstract a uniformed pattern.

Thank you.

2015-12-11 02:43:15

by Wang Nan

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/22] perf refcnt debugger API and fixes



On 2015/12/11 10:15, 平松雅巳 / HIRAMATU,MASAMI wrote:
> From: 'Arnaldo Carvalho de Melo' [mailto:[email protected]]
>> But this requires having these special refcnt__ routines, that will make
>> tools/perf/ code patterns for reference counts look different that the
>> refcount patterns in the kernel :-\
> BTW, I think even without the refcnt debugger, we'd better introduce this
> kind API to unify the refcnt operation in perf code. As I said, we have many
> miscodings on current implementation. Unifying the API can enforce developers
> to avoid such miscodings.
>
> Thank you,
>

I tried this problem in another way, I'd like to share it here.

First: create two uprobes:

# ./perf probe --exec /home/wangnan/perf dso__new%return %ax
Added new event:
probe_perf:dso__new (on dso__new%return in /home/wangnan/perf with %ax)

You can now use it in all perf tools, such as:

perf record -e probe_perf:dso__new -aR sleep 1

# ./perf probe --exec /home/wangnan/perf dso__delete dso
Added new event:
probe_perf:dso__delete (on dso__delete in /home/wangnan/perf with dso)

You can now use it in all perf tools, such as:

perf record -e probe_perf:dso__delete -aR sleep 1


Then start test:

# ./perf record -g -e probe_perf:dso__new -e probe_perf:dso__delete
./perf probe vfs_read
Added new event:
probe:vfs_read (on vfs_read)

You can now use it in all perf tools, such as:

perf record -e probe:vfs_read -aR sleep 1

[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.048 MB perf.data (178 samples) ]


From the perf report result I know two dso objects are leak:

90 probe_perf:dso__new `
88 probe_perf:dso__delete


Then convert output to CTF:

$ ./perf data convert --to-ctf ./out.ctf


With a python script, try to find the exact leak objects (I'm not good
at python,
I believe we can do this with much shorter script):

$ cat refcnt.py
from babeltrace import TraceCollection

tc = TraceCollection();
tc.add_trace('./out.ctf', 'ctf')
objs = {}
for event in tc.events:
if event.name.startswith('probe_perf:dso__new'):
if event['arg1'] not in objs:
objs[event['arg1']] = 1
if event.name.startswith('probe_perf:dso__delete'):
if event['dso'] in objs:
objs[event['dso']] = 0

for x in objs:
if objs[x] is 0:
continue
print("0x%x" % x)

$ python3 ./refcnt.py
0x34cb350
0x34d4640

Then from perf script's result, search for the two address:

perf 23203 [004] 665244.170387: probe_perf:dso__new: (4aaee0 <- 50a5eb)
arg1=0x34cb350
10a5eb dso__load_sym (/home/w00229757/perf)
af42d dso__load_vmlinux (/home/w00229757/perf)
af58c dso__load_vmlinux_path (/home/w00229757/perf)
10c40a open_debuginfo (/home/w00229757/perf)
111d39 convert_perf_probe_events (/home/w00229757/perf)
603a7 __cmd_probe.isra.3 (/home/w00229757/perf)
60a44 cmd_probe (/home/w00229757/perf)
7f6d1 run_builtin (/home/w00229757/perf)
33056 main (/home/w00229757/perf)
21bd5 __libc_start_main
(/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)

perf 23203 [004] 665244.170679: probe_perf:dso__new: (4aaee0 <- 50a5eb)
arg1=0x34d4640
10a5eb dso__load_sym (/home/w00229757/perf)
af42d dso__load_vmlinux (/home/w00229757/perf)
af58c dso__load_vmlinux_path (/home/w00229757/perf)
10c40a open_debuginfo (/home/w00229757/perf)
111d39 convert_perf_probe_events (/home/w00229757/perf)
603a7 __cmd_probe.isra.3 (/home/w00229757/perf)
60a44 cmd_probe (/home/w00229757/perf)
7f6d1 run_builtin (/home/w00229757/perf)
33056 main (/home/w00229757/perf)
21bd5 __libc_start_main
(/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)

Thank you.

2015-12-11 02:58:32

by Wang Nan

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/22] perf refcnt debugger API and fixes



On 2015/12/11 10:42, Wangnan (F) wrote:
>
>
> On 2015/12/11 10:15, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> From: 'Arnaldo Carvalho de Melo' [mailto:[email protected]]
>>> But this requires having these special refcnt__ routines, that will
>>> make
>>> tools/perf/ code patterns for reference counts look different that the
>>> refcount patterns in the kernel :-\
>> BTW, I think even without the refcnt debugger, we'd better introduce
>> this
>> kind API to unify the refcnt operation in perf code. As I said, we
>> have many
>> miscodings on current implementation. Unifying the API can enforce
>> developers
>> to avoid such miscodings.
>>
>> Thank you,
>>
>
> I tried this problem in another way, I'd like to share it here.
>
> First: create two uprobes:
>
> # ./perf probe --exec /home/wangnan/perf dso__new%return %ax
> Added new event:
> probe_perf:dso__new (on dso__new%return in /home/wangnan/perf with
> %ax)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe_perf:dso__new -aR sleep 1
>
> # ./perf probe --exec /home/wangnan/perf dso__delete dso
> Added new event:
> probe_perf:dso__delete (on dso__delete in /home/wangnan/perf with dso)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe_perf:dso__delete -aR sleep 1
>
>
> Then start test:
>
> # ./perf record -g -e probe_perf:dso__new -e probe_perf:dso__delete
> ./perf probe vfs_read
> Added new event:
> probe:vfs_read (on vfs_read)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe:vfs_read -aR sleep 1
>
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.048 MB perf.data (178 samples) ]
>
>
> From the perf report result I know two dso objects are leak:
>
> 90 probe_perf:dso__new `
> 88 probe_perf:dso__delete
>

The above result is gotten from yesterday's perf/core. I also tried
today's perf/core and get:

90 probe_perf:dso__new `
90 probe_perf:dso__delete

So we fix these leak.

Thank you.

Subject: RE: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

From: Wangnan (F) [mailto:[email protected]]
>
>
>On 2015/12/11 10:15, 平松雅巳 / HIRAMATU,MASAMI wrote:
>> From: 'Arnaldo Carvalho de Melo' [mailto:[email protected]]
>>> But this requires having these special refcnt__ routines, that will make
>>> tools/perf/ code patterns for reference counts look different that the
>>> refcount patterns in the kernel :-\
>> BTW, I think even without the refcnt debugger, we'd better introduce this
>> kind API to unify the refcnt operation in perf code. As I said, we have many
>> miscodings on current implementation. Unifying the API can enforce developers
>> to avoid such miscodings.
>>
>> Thank you,
>>
>
>I tried this problem in another way, I'd like to share it here.

No, what I said here is the issue on the coding policy. If we have no special
reason that each object (class) has its own reference counter implementation,
we'd better unify it for binding them to one policy.


>
>Then from perf script's result, search for the two address:
>
>perf 23203 [004] 665244.170387: probe_perf:dso__new: (4aaee0 <- 50a5eb)
>arg1=0x34cb350
> 10a5eb dso__load_sym (/home/w00229757/perf)
> af42d dso__load_vmlinux (/home/w00229757/perf)
> af58c dso__load_vmlinux_path (/home/w00229757/perf)
> 10c40a open_debuginfo (/home/w00229757/perf)
> 111d39 convert_perf_probe_events (/home/w00229757/perf)
> 603a7 __cmd_probe.isra.3 (/home/w00229757/perf)
> 60a44 cmd_probe (/home/w00229757/perf)
> 7f6d1 run_builtin (/home/w00229757/perf)
> 33056 main (/home/w00229757/perf)
> 21bd5 __libc_start_main
>(/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)
>
>perf 23203 [004] 665244.170679: probe_perf:dso__new: (4aaee0 <- 50a5eb)
>arg1=0x34d4640
> 10a5eb dso__load_sym (/home/w00229757/perf)
> af42d dso__load_vmlinux (/home/w00229757/perf)
> af58c dso__load_vmlinux_path (/home/w00229757/perf)
> 10c40a open_debuginfo (/home/w00229757/perf)
> 111d39 convert_perf_probe_events (/home/w00229757/perf)
> 603a7 __cmd_probe.isra.3 (/home/w00229757/perf)
> 60a44 cmd_probe (/home/w00229757/perf)
> 7f6d1 run_builtin (/home/w00229757/perf)
> 33056 main (/home/w00229757/perf)
> 21bd5 __libc_start_main
>(/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)

BTW, actually the analysis of this level (object memory leaks) can be found
(and analyzed) by valgrind (please try it).
The problem is not happen only when the creating site, but the refcnt can
leak when processing the object afterwards. That is the reason why I made
refcnt debugger to support precise backtracing for each operation.

Thanks,

????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2015-12-11 22:26:29

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH perf/core 00/22] perf refcnt debugger API and fixes

Em Wed, Dec 09, 2015 at 10:41:38AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Wed, Dec 09, 2015 at 11:10:48AM +0900, Masami Hiramatsu escreveu:
> > General refcnt miscodings
> > =========================
> >
> > BTW, while applying this change, I've found that there are refcnt
> > coding mismatches in those code and most of the bugs come from those
> > mismatches.
> >
> > - The reference counter can be initialized by 0 or 1.
> > - If 0 is chosen, caller have to get it and free it if failed to
> > register appropriately.
> > - If 1 is chosen, caller doesn't need to get it, but when exits the
> > caller, it has to put it. (except for returning the object itself)
> > - The application should choose either one as its policy, to avoid
> > confusion.
> >
> > perf tools mixes it up (moreover, it initializes 2 in a case) and
> > caller usually forgets to put it (it is not surprising, because too
> > many "put" usually cause SEGV by accessing freed object.)
>
> Well, we should fix the bugs and document why some initialization is
> deemed better than a single initialization style.

> For instance, if we know that we will keep multiple references straight
> away, then we could init it with some value different that preferred,
> that I aggee, is 1, i.e. the usual way for some constructor is:
>
>
> struct foo *foo__new()
> {
> struct foo *f = malloc(sizeof (*f));
>
> if (f) {
> atomic_set(&f->refcnt, 1);
> }
>
> return f;
> }
>
> void *foo__delete(struct foo *f)
> {
> free(f);
> }
>
> void foo__put(struct foo *f)
> {
> if (f && atomic_dec_and_test(f->refcnt))
> foo__delete(f);
> }
>
>
> Then, when using if, and before adding it to any other tree, list, i.e.
> a container, we do:
>
> struct foo *f = foo__new();
>
> /*
> * assume f was allocated and then the function allocating it
> * failed, when it bails out it should just do:
> */
> out_error:
> foo__put(f);
>
> And that will make it hit zero, which will call foo__delete(), case
> closed.
>
> > As far as I can see, cgroup_sel, perf_mmap(this is initialized 0 or 2...),
> > thread, and comm_str are initialized by 0. Others are initialized by 1.
> >
> > So, I'd like to suggest that we choose one policy and cleanup the code.
> > I recommend to use init by 1 policy, because anyway caller has to get
>
> See above, if nothing else recommends using a different value, use 1.

So, I think I fixed the thread->refcnt case, please take a look at my
perf/core branch, more specifically this patch:

>From a9a8442c64a86599600ecb2ae0f296b5c93e2a98 Mon Sep 17 00:00:00 2001
From: Arnaldo Carvalho de Melo <[email protected]>
Date: Fri, 11 Dec 2015 19:11:23 -0300
Subject: [PATCH 1/1] perf thread: Fix reference count initial state

We should always return from thread__new(), the constructor, with the
object with a reference count of one, so that:

struct thread *thread = thread__new();
thread__put(thread);

Will call thread__delete().

If any reference is made to that 'thread' variable, it better use
thread__get(thread) to hold a reference.

We were returning with thread->refcnt set to zero, fix it and some cases
where thread__delete() was being called, which were not a problem
because just one reference was being used, now that we set it to 1, use
thread__put() instead.

Reported-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/intel-pt.c | 4 ++--
tools/perf/util/machine.c | 19 ++++++++++++-------
tools/perf/util/thread.c | 10 ++++++++--
3 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 97f963a3dcb9..81a2eb77ba7f 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -1744,7 +1744,7 @@ static void intel_pt_free(struct perf_session *session)
auxtrace_heap__free(&pt->heap);
intel_pt_free_events(session);
session->auxtrace = NULL;
- thread__delete(pt->unknown_thread);
+ thread__put(pt->unknown_thread);
free(pt);
}

@@ -2153,7 +2153,7 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
return 0;

err_delete_thread:
- thread__delete(pt->unknown_thread);
+ thread__zput(pt->unknown_thread);
err_free_queues:
intel_pt_log_disable();
auxtrace_queues__free(&pt->queues);
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 1407d5107480..ad79297c76c8 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -352,13 +352,18 @@ static void machine__update_thread_pid(struct machine *machine,
}

th->mg = map_groups__get(leader->mg);
-
+out_put:
+ thread__put(leader);
return;
-
out_err:
pr_err("Failed to join map groups for %d:%d\n", th->pid_, th->tid);
+ goto out_put;
}

+/*
+ * Caller must eventually drop thread->refcnt returned with a successfull
+ * lookup/new thread inserted.
+ */
static struct thread *____machine__findnew_thread(struct machine *machine,
pid_t pid, pid_t tid,
bool create)
@@ -376,7 +381,7 @@ static struct thread *____machine__findnew_thread(struct machine *machine,
if (th != NULL) {
if (th->tid == tid) {
machine__update_thread_pid(machine, th, pid);
- return th;
+ return thread__get(th);
}

machine->last_match = NULL;
@@ -389,7 +394,7 @@ static struct thread *____machine__findnew_thread(struct machine *machine,
if (th->tid == tid) {
machine->last_match = th;
machine__update_thread_pid(machine, th, pid);
- return th;
+ return thread__get(th);
}

if (tid < th->tid)
@@ -417,7 +422,7 @@ static struct thread *____machine__findnew_thread(struct machine *machine,
if (thread__init_map_groups(th, machine)) {
rb_erase_init(&th->rb_node, &machine->threads);
RB_CLEAR_NODE(&th->rb_node);
- thread__delete(th);
+ thread__put(th);
return NULL;
}
/*
@@ -441,7 +446,7 @@ struct thread *machine__findnew_thread(struct machine *machine, pid_t pid,
struct thread *th;

pthread_rwlock_wrlock(&machine->threads_lock);
- th = thread__get(__machine__findnew_thread(machine, pid, tid));
+ th = __machine__findnew_thread(machine, pid, tid);
pthread_rwlock_unlock(&machine->threads_lock);
return th;
}
@@ -451,7 +456,7 @@ struct thread *machine__find_thread(struct machine *machine, pid_t pid,
{
struct thread *th;
pthread_rwlock_rdlock(&machine->threads_lock);
- th = thread__get(____machine__findnew_thread(machine, pid, tid, false));
+ th = ____machine__findnew_thread(machine, pid, tid, false);
pthread_rwlock_unlock(&machine->threads_lock);
return th;
}
diff --git a/tools/perf/util/thread.c b/tools/perf/util/thread.c
index 0a9ae8014729..dfd00c6dad6e 100644
--- a/tools/perf/util/thread.c
+++ b/tools/perf/util/thread.c
@@ -19,8 +19,10 @@ int thread__init_map_groups(struct thread *thread, struct machine *machine)
thread->mg = map_groups__new(machine);
} else {
leader = __machine__findnew_thread(machine, pid, pid);
- if (leader)
+ if (leader) {
thread->mg = map_groups__get(leader->mg);
+ thread__put(leader);
+ }
}

return thread->mg ? 0 : -1;
@@ -53,7 +55,7 @@ struct thread *thread__new(pid_t pid, pid_t tid)
goto err_thread;

list_add(&comm->list, &thread->comm_list);
- atomic_set(&thread->refcnt, 0);
+ atomic_set(&thread->refcnt, 1);
RB_CLEAR_NODE(&thread->rb_node);
}

@@ -95,6 +97,10 @@ struct thread *thread__get(struct thread *thread)
void thread__put(struct thread *thread)
{
if (thread && atomic_dec_and_test(&thread->refcnt)) {
+ /*
+ * Remove it from the dead_threads list, as last reference
+ * is gone.
+ */
list_del_init(&thread->node);
thread__delete(thread);
}
--
2.1.0

Subject: [tip:perf/core] perf tools: Make perf_session__register_idle_thread drop the refcount

Commit-ID: 9d8b172f29ac0e5d1923d348e395e9643625ef7f
Gitweb: http://git.kernel.org/tip/9d8b172f29ac0e5d1923d348e395e9643625ef7f
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 9 Dec 2015 11:11:23 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 10 Dec 2015 16:28:58 -0300

perf tools: Make perf_session__register_idle_thread drop the refcount

Note that since the thread was already inserted to the session
list, it will be released when the session is released.
Also, in perf_session__register_idle_thread() failure path,
the thread should be put before returning.

Refcnt debugger shows that the perf_session__register_idle_thread
gets the returned thread, but the caller (__cmd_top) does not
put the returned idle thread.

----
==== [0] ====
Unreclaimed thread@0x24e6240
Refcount +1 => 0 at
./perf(thread__new+0xe5) [0x4c8a75]
./perf(machine__findnew_thread+0x9a) [0x4bbdba]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 1 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0xee) [0x4bbe0e]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
Refcount +1 => 2 at
./perf(thread__get+0x2c) [0x4c8bcc]
./perf(machine__findnew_thread+0x112) [0x4bbe32]
./perf(perf_session__register_idle_thread+0x28) [0x4c63c8]
./perf(cmd_top+0xd7d) [0x43cf6d]
./perf() [0x47ba35]
./perf(main+0x617) [0x4225b7]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f06027c5af5]
./perf() [0x42272d]
----

Signed-off-by: Masami Hiramatsu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Drop the refcount in perf_session__register_idle_thread() ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/builtin-top.c | 2 +-
tools/perf/util/session.c | 11 +++++++----
tools/perf/util/session.h | 2 +-
3 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 84fd636..785aa2d 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -964,7 +964,7 @@ static int __cmd_top(struct perf_top *top)
if (ret)
goto out_delete;

- if (perf_session__register_idle_thread(top->session) == NULL)
+ if (perf_session__register_idle_thread(top->session) < 0)
goto out_delete;

machine__synthesize_threads(&top->session->machines.host, &opts->target,
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index c35ffdd..9774686 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1311,17 +1311,20 @@ struct thread *perf_session__findnew(struct perf_session *session, pid_t pid)
return machine__findnew_thread(&session->machines.host, -1, pid);
}

-struct thread *perf_session__register_idle_thread(struct perf_session *session)
+int perf_session__register_idle_thread(struct perf_session *session)
{
struct thread *thread;
+ int err = 0;

thread = machine__findnew_thread(&session->machines.host, 0, 0);
if (thread == NULL || thread__set_comm(thread, "swapper", 0)) {
pr_err("problem inserting idle task.\n");
- thread = NULL;
+ err = -1;
}

- return thread;
+ /* machine__findnew_thread() got the thread, so put it */
+ thread__put(thread);
+ return err;
}

static void perf_session__warn_about_errors(const struct perf_session *session)
@@ -1676,7 +1679,7 @@ int perf_session__process_events(struct perf_session *session)
u64 size = perf_data_file__size(session->file);
int err;

- if (perf_session__register_idle_thread(session) == NULL)
+ if (perf_session__register_idle_thread(session) < 0)
return -ENOMEM;

if (!perf_data_file__is_pipe(session->file))
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index 3e900c0..5f792e3 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -89,7 +89,7 @@ struct machine *perf_session__findnew_machine(struct perf_session *session, pid_
}

struct thread *perf_session__findnew(struct perf_session *session, pid_t pid);
-struct thread *perf_session__register_idle_thread(struct perf_session *session);
+int perf_session__register_idle_thread(struct perf_session *session);

size_t perf_session__fprintf(struct perf_session *session, FILE *fp);

Subject: [tip:perf/core] perf symbols: Fix dso__load_sym to put dso

Commit-ID: e7a7865cc0da306542db0b9205cb0a467f59e33d
Gitweb: http://git.kernel.org/tip/e7a7865cc0da306542db0b9205cb0a467f59e33d
Author: Masami Hiramatsu <[email protected]>
AuthorDate: Wed, 9 Dec 2015 11:11:18 +0900
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Thu, 10 Dec 2015 16:29:32 -0300

perf symbols: Fix dso__load_sym to put dso

Fix dso__load_sym to put dso because dsos__add already got it.

Refcnt debugger explain the problem:
----
==== [0] ====
Unreclaimed dso: 0x19dd200
Refcount +1 => 1 at
./perf(dso__new+0x1ff) [0x4a62df]
./perf(dso__load_sym+0xe89) [0x503509]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 2 at
./perf(dso__get+0x34) [0x4a65f4]
./perf(map__new2+0x76) [0x4be216]
./perf(dso__load_sym+0xee1) [0x503561]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount +1 => 3 at
./perf(dsos__add+0xf3) [0x4a6bc3]
./perf(dso__load_sym+0xfc1) [0x503641]
./perf(dso__load_vmlinux+0xbf) [0x4aa77f]
./perf(dso__load_vmlinux_path+0x8c) [0x4aa8dc]
./perf() [0x50539a]
./perf(convert_perf_probe_events+0xd79) [0x50ad39]
./perf() [0x45600f]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 2 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(map_groups__exit+0xb9) [0x4bee29]
./perf(machine__delete+0xb0) [0x4b93d0]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
Refcount -1 => 1 at
./perf(dso__put+0x2f) [0x4a664f]
./perf(machine__delete+0xfe) [0x4b941e]
./perf(exit_probe_symbol_maps+0x28) [0x506718]
./perf() [0x45628a]
./perf(cmd_probe+0x6c) [0x4566bc]
./perf() [0x47abc5]
./perf(main+0x610) [0x421f90]
/lib64/libc.so.6(__libc_start_main+0xf5) [0x7f74dd0efaf5]
./perf() [0x4220a9]
----
So, in the dso__load_sym, dso is gotten 3 times, by dso__new,
map__new2, and dsos__add. The last 2 is actually released by
map_groups and machine__delete correspondingly. However, the
first reference by dso__new, is never released.

Committer note:

Changed the place where the reference count is dropped to:

Fix it by dropping it right after creating curr_map, since we know that
either that operation failed and we need to drop the dso refcount or
that it succeed and we have it referenced via curr_map->dso.

Then only drop the curr_map refcount after we call dsos__add() to make
sure we hold a reference to it via curr_map->dso.

Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/symbol-elf.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/symbol-elf.c b/tools/perf/util/symbol-elf.c
index 53f1996..562b8eb 100644
--- a/tools/perf/util/symbol-elf.c
+++ b/tools/perf/util/symbol-elf.c
@@ -1026,8 +1026,8 @@ int dso__load_sym(struct dso *dso, struct map *map,
curr_dso->long_name_len = dso->long_name_len;
curr_map = map__new2(start, curr_dso,
map->type);
+ dso__put(curr_dso);
if (curr_map == NULL) {
- dso__put(curr_dso);
goto out_elf_end;
}
if (adjust_kernel_syms) {
@@ -1042,9 +1042,14 @@ int dso__load_sym(struct dso *dso, struct map *map,
}
curr_dso->symtab_type = dso->symtab_type;
map_groups__insert(kmaps, curr_map);
+ /*
+ * Add it before we drop the referece to curr_map,
+ * i.e. while we still are sure to have a reference
+ * to this DSO via curr_map->dso.
+ */
+ dsos__add(&map->groups->machine->dsos, curr_dso);
/* kmaps already got it */
map__put(curr_map);
- dsos__add(&map->groups->machine->dsos, curr_dso);
dso__set_loaded(curr_dso, map->type);
} else
curr_dso = curr_map->dso;