2020-09-15 03:19:35

by Namhyung Kim

[permalink] [raw]
Subject: [PATCHSET v2 00/11] perf tools: Fix various memory leaks

Hello,

I've found and fixed a bunch of memory leaks during perf pmu and
metric tests with address sanitizer. Before this, the tests were
mostly failed due to the leaks since ASAN makes it return non-zero.

Now I'm seeing no error with ASAN like below:

$ ./perf test pmu metric
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Skip (some metrics failed)
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
67: Parse and process metrics : Ok

The failure in 10.3 seems due to parse errors like below:

Multiple errors dropping message: unknown term 'filter_opc' for pmu 'uncore_cbox_0'
(valid terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,freq,
branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,
nr,no-overwrite,overwrite,driver-config,percore,aux-output,aux-sample-size)


Parse event failed metric 'DRAM_Parallel_Reads' id 'arb/event=0x80,umask=0x2,thresh=1/'
expr 'arb@event\=0x80\,umask\=0x2@ / arb@event\=0x80\,umask\=0x2\,thresh\=1@'
Error string 'unknown term 'thresh' for pmu 'uncore_arb'' help
'valid terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,freq,
branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,
nr,no-overwrite,overwrite,driver-config,percore,aux-output,aux-sample-size'


* Changes from v1:
- Add 'Acked-by: Jiri'


The patches are also available at 'perf/metric-fix-v2' branch on

git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git

Thanks
Namhyung


Namhyung Kim (11):
perf metric: Fix some memory leaks
perf metric: Fix some memory leaks - part 2
perf evlist: Fix cpu/thread map leak
perf parse-event: Fix cpu map leaks
perf parse-event: Fix memory leak in evsel->unit
perf test: Fix memory leaks in parse-metric test
perf metric: Release expr_parse_ctx after testing
perf metric: Free metric when it failed to resolve
perf metric: Do not free metric when failed to resolve
perf test: Free aliases for PMU event map aliases test
perf test: Free formats for perf pmu parse test

tools/perf/tests/parse-metric.c | 14 ++++++++-----
tools/perf/tests/pmu-events.c | 5 +++++
tools/perf/tests/pmu.c | 1 +
tools/perf/util/evlist.c | 11 ++++++++---
tools/perf/util/metricgroup.c | 35 +++++++++++++++++++++++----------
tools/perf/util/parse-events.c | 9 +++++++--
tools/perf/util/pmu.c | 13 +++++++++++-
tools/perf/util/pmu.h | 2 ++
tools/perf/util/stat-shadow.c | 8 +++++---
9 files changed, 74 insertions(+), 24 deletions(-)

--
2.28.0.618.gf4bc123cb7-goog


2020-09-15 03:19:49

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 01/11] perf metric: Fix some memory leaks

I found some memory leaks while reading the metric code. Some are
real and others only occur in the error path. When it failed during
metric or event parsing, it should release all resources properly.

Cc: Kajol Jain <[email protected]>
Cc: John Garry <[email protected]>
Cc: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Fixes: b18f3e365019d ("perf stat: Support JSON metrics in perf stat")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/metricgroup.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index d03bac65a3c2..90d14c63babb 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -529,6 +529,9 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter,
continue;
strlist__add(me->metrics, s);
}
+
+ if (!raw)
+ free(s);
}
free(omg);
}
@@ -1041,7 +1044,7 @@ static int parse_groups(struct evlist *perf_evlist, const char *str,
ret = metricgroup__add_metric_list(str, metric_no_group,
&extra_events, &metric_list, map);
if (ret)
- return ret;
+ goto out;
pr_debug("adding %s\n", extra_events.buf);
bzero(&parse_error, sizeof(parse_error));
ret = __parse_events(perf_evlist, extra_events.buf, &parse_error, fake_pmu);
@@ -1049,11 +1052,11 @@ static int parse_groups(struct evlist *perf_evlist, const char *str,
parse_events_print_error(&parse_error, extra_events.buf);
goto out;
}
- strbuf_release(&extra_events);
ret = metricgroup__setup_events(&metric_list, metric_no_merge,
perf_evlist, metric_events);
out:
metricgroup__free_metrics(&metric_list);
+ strbuf_release(&extra_events);
return ret;
}

--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 03:20:16

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 02/11] perf metric: Fix some memory leaks - part 2

The metric_event_delete() missed to free expr->metric_events and it
should free an expr when metric_refs allocation failed.

Cc: Kajol Jain <[email protected]>
Cc: John Garry <[email protected]>
Cc: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Fixes: 4ea2896715e67 ("perf metric: Collect referenced metrics in struct metric_expr")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/metricgroup.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 90d14c63babb..53747df601fa 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -84,6 +84,7 @@ static void metric_event_delete(struct rblist *rblist __maybe_unused,

list_for_each_entry_safe(expr, tmp, &me->head, nd) {
free(expr->metric_refs);
+ free(expr->metric_events);
free(expr);
}

@@ -315,6 +316,7 @@ static int metricgroup__setup_events(struct list_head *groups,
if (!metric_refs) {
ret = -ENOMEM;
free(metric_events);
+ free(expr);
break;
}

--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 03:20:33

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 04/11] perf parse-event: Fix cpu map leaks

Like evlist cpu map, evsel's cpu map should have proper refcount by
releasing the original count after creation.

This fixes the following ASAN report:

Direct leak of 840 byte(s) in 70 object(s) allocated from:
#0 0x7fe36703f628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
#1 0x559fbbf611ca in cpu_map__trim_new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:79
#2 0x559fbbf6229c in perf_cpu_map__new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:237
#3 0x559fbbcc6c6d in __add_event util/parse-events.c:357
#4 0x559fbbcc6c6d in add_event_tool util/parse-events.c:408
#5 0x559fbbcc6c6d in parse_events_add_tool util/parse-events.c:1414
#6 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
#7 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
#8 0x559fbbcc95da in __parse_events util/parse-events.c:2141
#9 0x559fbbc2788b in check_parse_id tests/pmu-events.c:406
#10 0x559fbbc2788b in check_parse_id tests/pmu-events.c:393
#11 0x559fbbc2788b in check_parse_fake tests/pmu-events.c:436
#12 0x559fbbc2788b in metric_parse_fake tests/pmu-events.c:553
#13 0x559fbbc27e2d in test_parsing_fake tests/pmu-events.c:599
#14 0x559fbbc27e2d in test_parsing_fake tests/pmu-events.c:574
#15 0x559fbbc0109b in run_test tests/builtin-test.c:410
#16 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
#17 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
#18 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
#19 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#20 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#21 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#22 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#23 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308

And I've failed which commit introduced this bug as the code was
heavily changed since then. ;-/

Acked-by: Jiri Olsa <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/parse-events.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index c4d2394e2b2d..b35e4bb1cecb 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -360,8 +360,11 @@ __add_event(struct list_head *list, int *idx,
event_attr_init(attr);

evsel = evsel__new_idx(attr, *idx);
- if (!evsel)
+ if (!evsel) {
+ if (!pmu)
+ perf_cpu_map__put(cpus);
return NULL;
+ }

(*idx)++;
evsel->core.cpus = perf_cpu_map__get(cpus);
@@ -369,6 +372,8 @@ __add_event(struct list_head *list, int *idx,
evsel->core.system_wide = pmu ? pmu->is_uncore : false;
evsel->auto_merge_stats = auto_merge_stats;

+ if (!pmu)
+ perf_cpu_map__put(cpus);
if (name)
evsel->name = strdup(name);

--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 03:20:45

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 05/11] perf parse-event: Fix memory leak in evsel->unit

The evsel->unit borrows a pointer of pmu event or alias instead of
owns a string. But tool event (duration_time) passes a result of
strdup() caused a leak.

It was found by ASAN during metric test:

Direct leak of 210 byte(s) in 70 object(s) allocated from:
#0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
#1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
#2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
#3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
#4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
#5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
#6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
#7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
#8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
#9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
#10 0x559fbbc0109b in run_test tests/builtin-test.c:410
#11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
#12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
#13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
#14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308

Acked-by: Jiri Olsa <[email protected]>
Fixes: f0fbb114e3025 ("perf stat: Implement duration_time as a proper event")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/parse-events.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index b35e4bb1cecb..ece321ccf599 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -416,7 +416,7 @@ static int add_event_tool(struct list_head *list, int *idx,
return -ENOMEM;
evsel->tool_event = tool_event;
if (tool_event == PERF_TOOL_DURATION_TIME)
- evsel->unit = strdup("ns");
+ evsel->unit = "ns";
return 0;
}

--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 03:20:57

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 06/11] perf test: Fix memory leaks in parse-metric test

It didn't release resources when there's an error so the
test_recursion_fail() will leak some memory.

Acked-by: Jiri Olsa <[email protected]>
Fixes: 0a507af9c681a ("perf tests: Add parse metric test for ipc metric")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/tests/parse-metric.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/tools/perf/tests/parse-metric.c b/tools/perf/tests/parse-metric.c
index 23db8acc492d..cd7331aac3bd 100644
--- a/tools/perf/tests/parse-metric.c
+++ b/tools/perf/tests/parse-metric.c
@@ -153,8 +153,10 @@ static int __compute_metric(const char *name, struct value *vals,
return -ENOMEM;

cpus = perf_cpu_map__new("0");
- if (!cpus)
+ if (!cpus) {
+ evlist__delete(evlist);
return -ENOMEM;
+ }

perf_evlist__set_maps(&evlist->core, cpus, NULL);

@@ -163,10 +165,11 @@ static int __compute_metric(const char *name, struct value *vals,
false, false,
&metric_events);
if (err)
- return err;
+ goto out;

- if (perf_evlist__alloc_stats(evlist, false))
- return -1;
+ err = perf_evlist__alloc_stats(evlist, false);
+ if (err)
+ goto out;

/* Load the runtime stats with given numbers for events. */
runtime_stat__init(&st);
@@ -178,13 +181,14 @@ static int __compute_metric(const char *name, struct value *vals,
if (name2 && ratio2)
*ratio2 = compute_single(&metric_events, evlist, &st, name2);

+out:
/* ... clenup. */
metricgroup__rblist_exit(&metric_events);
runtime_stat__exit(&st);
perf_evlist__free_stats(evlist);
perf_cpu_map__put(cpus);
evlist__delete(evlist);
- return 0;
+ return err;
}

static int compute_metric(const char *name, struct value *vals, double *ratio)
--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 03:21:35

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 09/11] perf metric: Do not free metric when failed to resolve

It's dangerous to free the original metric when it's called from
resolve_metric() as it's already in the metric_list and might have
other resources too. Instead, it'd better let them bail out and be
released properly at the later stage.

So add a check when it's called from metricgroup__add_metric() and
release it. Also make sure that mp is set properly.

Acked-by: Jiri Olsa <[email protected]>
Fixes: 83de0b7d535de ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/metricgroup.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 35bcaa8d11e9..941702cb6a79 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -673,7 +673,6 @@ static int __add_metric(struct list_head *metric_list,
m->has_constraint = metric_no_group || metricgroup__has_constraint(pe);
INIT_LIST_HEAD(&m->metric_refs);
m->metric_refs_cnt = 0;
- *mp = m;

parent = expr_ids__alloc(ids);
if (!parent) {
@@ -686,6 +685,7 @@ static int __add_metric(struct list_head *metric_list,
free(m);
return -ENOMEM;
}
+ *mp = m;
} else {
/*
* We got here for the referenced metric, via the
@@ -720,8 +720,11 @@ static int __add_metric(struct list_head *metric_list,
* all the metric's IDs and add it to the parent context.
*/
if (expr__find_other(pe->metric_expr, NULL, &m->pctx, runtime) < 0) {
- expr__ctx_clear(&m->pctx);
- free(m);
+ if (m->metric_refs_cnt == 0) {
+ expr__ctx_clear(&m->pctx);
+ free(m);
+ *mp = NULL;
+ }
return -EINVAL;
}

--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 03:21:39

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 03/11] perf evlist: Fix cpu/thread map leak

Asan reported leak of cpu and thread maps as they have one more
refcount than released. I found that after setting evlist maps it
should release it's refcount.

It seems to be broken from the beginning so I chose the original
commit as the culprit. But not sure how it's applied to stable trees
since there are many changes in the code after that.

Acked-by: Jiri Olsa <[email protected]>
Fixes: 7e2ed097538c5 ("perf evlist: Store pointer to the cpu and thread maps")
Fixes: 4112eb1899c0e ("perf evlist: Default to syswide target when no thread/cpu maps set")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/evlist.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index ee7b576d3b12..e971daf946d0 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -946,6 +946,10 @@ int perf_evlist__create_maps(struct evlist *evlist, struct target *target)

perf_evlist__set_maps(&evlist->core, cpus, threads);

+ /* as evlist now has references, put count here */
+ perf_cpu_map__put(cpus);
+ perf_thread_map__put(threads);
+
return 0;

out_delete_threads:
@@ -1273,11 +1277,12 @@ static int perf_evlist__create_syswide_maps(struct evlist *evlist)
goto out_put;

perf_evlist__set_maps(&evlist->core, cpus, threads);
-out:
- return err;
+
+ perf_thread_map__put(threads);
out_put:
perf_cpu_map__put(cpus);
- goto out;
+out:
+ return err;
}

int evlist__open(struct evlist *evlist)
--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 03:21:47

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 08/11] perf metric: Free metric when it failed to resolve

The metricgroup__add_metric() can find multiple match for a metric
group and it's possible to fail. Also it can fail in the middle like
in resolve_metric() even for single metric.

In those cases, the intermediate list and ids will be leaked like:

Direct leak of 3 byte(s) in 1 object(s) allocated from:
#0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
#1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683
#2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906
#3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940
#4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993
#5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045
#6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087
#7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164
#8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196
#9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318
#10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356
#11 0x55f7e70be09b in run_test tests/builtin-test.c:410
#12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
#13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
#14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
#15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Acked-by: Jiri Olsa <[email protected]>
Fixes: 83de0b7d535de ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/metricgroup.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 53747df601fa..35bcaa8d11e9 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -940,7 +940,7 @@ static int metricgroup__add_metric(const char *metric, bool metric_no_group,

ret = add_metric(&list, pe, metric_no_group, &m, NULL, &ids);
if (ret)
- return ret;
+ goto out;

/*
* Process any possible referenced metrics
@@ -949,12 +949,14 @@ static int metricgroup__add_metric(const char *metric, bool metric_no_group,
ret = resolve_metric(metric_no_group,
&list, map, &ids);
if (ret)
- return ret;
+ goto out;
}

/* End of pmu events. */
- if (!has_match)
- return -EINVAL;
+ if (!has_match) {
+ ret = -EINVAL;
+ goto out;
+ }

list_for_each_entry(m, &list, nd) {
if (events->len > 0)
@@ -969,9 +971,14 @@ static int metricgroup__add_metric(const char *metric, bool metric_no_group,
}
}

+out:
+ /*
+ * add to metric_list so that they can be released
+ * even if it's failed
+ */
list_splice(&list, metric_list);
expr_ids__exit(&ids);
- return 0;
+ return ret;
}

static int metricgroup__add_metric_list(const char *list, bool metric_no_group,
--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 03:22:08

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 11/11] perf test: Free formats for perf pmu parse test

The following leaks were detected by ASAN:

Indirect leak of 360 byte(s) in 9 object(s) allocated from:
#0 0x7fecc305180e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
#1 0x560578f6dce5 in perf_pmu__new_format util/pmu.c:1333
#2 0x560578f752fc in perf_pmu_parse util/pmu.y:59
#3 0x560578f6a8b7 in perf_pmu__format_parse util/pmu.c:73
#4 0x560578e07045 in test__pmu tests/pmu.c:155
#5 0x560578de109b in run_test tests/builtin-test.c:410
#6 0x560578de109b in test_and_print tests/builtin-test.c:440
#7 0x560578de401a in __cmd_test tests/builtin-test.c:661
#8 0x560578de401a in cmd_test tests/builtin-test.c:807
#9 0x560578e49354 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#10 0x560578ce71a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#11 0x560578ce71a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#12 0x560578ce71a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#13 0x7fecc2b7acc9 in __libc_start_main ../csu/libc-start.c:308

Acked-by: Jiri Olsa <[email protected]>
Fixes: cff7f956ec4a1 ("perf tests: Move pmu tests into separate object")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/tests/pmu.c | 1 +
tools/perf/util/pmu.c | 11 +++++++++++
tools/perf/util/pmu.h | 1 +
3 files changed, 13 insertions(+)

diff --git a/tools/perf/tests/pmu.c b/tools/perf/tests/pmu.c
index 5c11fe2b3040..714e6830a758 100644
--- a/tools/perf/tests/pmu.c
+++ b/tools/perf/tests/pmu.c
@@ -173,6 +173,7 @@ int test__pmu(struct test *test __maybe_unused, int subtest __maybe_unused)
ret = 0;
} while (0);

+ perf_pmu__del_formats(&formats);
test_format_dir_put(format);
return ret;
}
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 555cb3524c25..d41caeb35cf6 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -1354,6 +1354,17 @@ void perf_pmu__set_format(unsigned long *bits, long from, long to)
set_bit(b, bits);
}

+void perf_pmu__del_formats(struct list_head *formats)
+{
+ struct perf_pmu_format *fmt, *tmp;
+
+ list_for_each_entry_safe(fmt, tmp, formats, list) {
+ list_del(&fmt->list);
+ free(fmt->name);
+ free(fmt);
+ }
+}
+
static int sub_non_neg(int a, int b)
{
if (b > a)
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index b63c4c5e335e..a64e9c9ce731 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -94,6 +94,7 @@ int perf_pmu__new_format(struct list_head *list, char *name,
int config, unsigned long *bits);
void perf_pmu__set_format(unsigned long *bits, long from, long to);
int perf_pmu__format_parse(char *dir, struct list_head *head);
+void perf_pmu__del_formats(struct list_head *formats);

struct perf_pmu *perf_pmu__scan(struct perf_pmu *pmu);

--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 03:23:38

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 10/11] perf test: Free aliases for PMU event map aliases test

The aliases were never released causing the following leaks:

Indirect leak of 1224 byte(s) in 9 object(s) allocated from:
#0 0x7feefb830628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
#1 0x56332c8f1b62 in __perf_pmu__new_alias util/pmu.c:322
#2 0x56332c8f401f in pmu_add_cpu_aliases_map util/pmu.c:778
#3 0x56332c792ce9 in __test__pmu_event_aliases tests/pmu-events.c:295
#4 0x56332c792ce9 in test_aliases tests/pmu-events.c:367
#5 0x56332c76a09b in run_test tests/builtin-test.c:410
#6 0x56332c76a09b in test_and_print tests/builtin-test.c:440
#7 0x56332c76ce69 in __cmd_test tests/builtin-test.c:695
#8 0x56332c76ce69 in cmd_test tests/builtin-test.c:807
#9 0x56332c7d2214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#10 0x56332c6701a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#11 0x56332c6701a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#12 0x56332c6701a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#13 0x7feefb359cc9 in __libc_start_main ../csu/libc-start.c:308

Cc: John Garry <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Fixes: 956a78356c24c ("perf test: Test pmu-events aliases")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/tests/pmu-events.c | 5 +++++
tools/perf/util/pmu.c | 2 +-
tools/perf/util/pmu.h | 1 +
3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/tests/pmu-events.c b/tools/perf/tests/pmu-events.c
index eb19f9a0bc15..d3517a74d95e 100644
--- a/tools/perf/tests/pmu-events.c
+++ b/tools/perf/tests/pmu-events.c
@@ -274,6 +274,7 @@ static int __test__pmu_event_aliases(char *pmu_name, int *count)
int res = 0;
bool use_uncore_table;
struct pmu_events_map *map = __test_pmu_get_events_map();
+ struct perf_pmu_alias *a, *tmp;

if (!map)
return -1;
@@ -347,6 +348,10 @@ static int __test__pmu_event_aliases(char *pmu_name, int *count)
pmu_name, alias->name);
}

+ list_for_each_entry_safe(a, tmp, &aliases, list) {
+ list_del(&a->list);
+ perf_pmu_free_alias(a);
+ }
free(pmu);
return res;
}
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index f1688e1f6ed7..555cb3524c25 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -274,7 +274,7 @@ static void perf_pmu_update_alias(struct perf_pmu_alias *old,
}

/* Delete an alias entry. */
-static void perf_pmu_free_alias(struct perf_pmu_alias *newalias)
+void perf_pmu_free_alias(struct perf_pmu_alias *newalias)
{
zfree(&newalias->name);
zfree(&newalias->desc);
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 44ccbdbb1c37..b63c4c5e335e 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -113,6 +113,7 @@ void pmu_add_cpu_aliases_map(struct list_head *head, struct perf_pmu *pmu,

struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu);
bool pmu_uncore_alias_match(const char *pmu_name, const char *name);
+void perf_pmu_free_alias(struct perf_pmu_alias *alias);

int perf_pmu__convert_scale(const char *scale, char **end, double *sval);

--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 03:23:57

by Namhyung Kim

[permalink] [raw]
Subject: [PATCH 07/11] perf metric: Release expr_parse_ctx after testing

The test_generic_metric() missed to release entries in the pctx.
Asan reported following leak (and more):

Direct leak of 128 byte(s) in 1 object(s) allocated from:
#0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
#1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14)
#2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497)
#3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111
#4 0x55f7e7341667 in expr__add_ref util/expr.c:120
#5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783
#6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858
#7 0x55f7e712390b in compute_single tests/parse-metric.c:128
#8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180
#9 0x55f7e712446d in compute_metric tests/parse-metric.c:196
#10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295
#11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355
#12 0x55f7e70be09b in run_test tests/builtin-test.c:410
#13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
#14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
#15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
#16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Acked-by: Jiri Olsa <[email protected]>
Fixes: 6d432c4c8aa56 ("perf tools: Add test_generic_metric function")
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/stat-shadow.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index e1ba6c1b916a..a5f42c22c484 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -853,14 +853,16 @@ static void generic_metric(struct perf_stat_config *config,
double test_generic_metric(struct metric_expr *mexp, int cpu, struct runtime_stat *st)
{
struct expr_parse_ctx pctx;
- double ratio;
+ double ratio = 0.0;

if (prepare_metric(mexp->metric_events, mexp->metric_refs, &pctx, cpu, st) < 0)
- return 0.;
+ goto out;

if (expr__parse(&ratio, &pctx, mexp->metric_expr, 1))
- return 0.;
+ ratio = 0.0;

+out:
+ expr__ctx_clear(&pctx);
return ratio;
}

--
2.28.0.618.gf4bc123cb7-goog

2020-09-15 05:18:49

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCHSET v2 00/11] perf tools: Fix various memory leaks

On Mon, Sep 14, 2020 at 8:18 PM Namhyung Kim <[email protected]> wrote:
>
> Hello,
>
> I've found and fixed a bunch of memory leaks during perf pmu and
> metric tests with address sanitizer. Before this, the tests were
> mostly failed due to the leaks since ASAN makes it return non-zero.
>
> Now I'm seeing no error with ASAN like below:
>
> $ ./perf test pmu metric
> 9: Parse perf pmu format : Ok
> 10: PMU events :
> 10.1: PMU event table sanity : Ok
> 10.2: PMU event map aliases : Ok
> 10.3: Parsing of PMU event table metrics : Skip (some metrics failed)
> 10.4: Parsing of PMU event table metrics with fake PMUs : Ok
> 67: Parse and process metrics : Ok
>
> The failure in 10.3 seems due to parse errors like below:
>
> Multiple errors dropping message: unknown term 'filter_opc' for pmu 'uncore_cbox_0'
> (valid terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,freq,
> branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,
> nr,no-overwrite,overwrite,driver-config,percore,aux-output,aux-sample-size)
>
>
> Parse event failed metric 'DRAM_Parallel_Reads' id 'arb/event=0x80,umask=0x2,thresh=1/'
> expr 'arb@event\=0x80\,umask\=0x2@ / arb@event\=0x80\,umask\=0x2\,thresh\=1@'
> Error string 'unknown term 'thresh' for pmu 'uncore_arb'' help
> 'valid terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,freq,
> branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,
> nr,no-overwrite,overwrite,driver-config,percore,aux-output,aux-sample-size'

The 10.3 failure seems to be a problem in the skl metric DRAM_Parallel_Reads:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json?h=perf/core#n319
arb@event\\=0x80\\,umask\\=0x2@ / arb@event\\=0x80\\,umask\\=0x2\\,thresh\\=1@

The test failure message is:
Parse event failed metric 'DRAM_Parallel_Reads' id
'arb/event=0x80,umask=0x2,thresh=1/' expr
'arb@event\=0x80\,umask\=0x2@ /
arb@event\=0x80\,umask\=0x2\,thresh\=1@'
Error string 'unknown term 'thresh' for pmu 'uncore_arb'' help 'valid
terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,nr,no-overwrite,overwrite,driver-config,percore,aux-output,aux-sample-size'

The 01.org version of this from:
https://download.01.org/perfmon/TMA_Metrics.xlsx
is:
UNC_ARB_TRK_OCCUPANCY.DATA_READ / UNC_ARB_TRK_OCCUPANCY.DATA_READ:c1

It seems that :c1 has been translated into thresh=1 but that thresh
doesn't exist as a format option (just "cmask edge event inv umask"
are present). I wonder if Andi or Jin you could look into this broken
metric?

Thanks,
Ian

> * Changes from v1:
> - Add 'Acked-by: Jiri'
>
>
> The patches are also available at 'perf/metric-fix-v2' branch on
>
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks
> Namhyung
>
>
> Namhyung Kim (11):
> perf metric: Fix some memory leaks
> perf metric: Fix some memory leaks - part 2
> perf evlist: Fix cpu/thread map leak
> perf parse-event: Fix cpu map leaks
> perf parse-event: Fix memory leak in evsel->unit
> perf test: Fix memory leaks in parse-metric test
> perf metric: Release expr_parse_ctx after testing
> perf metric: Free metric when it failed to resolve
> perf metric: Do not free metric when failed to resolve
> perf test: Free aliases for PMU event map aliases test
> perf test: Free formats for perf pmu parse test
>
> tools/perf/tests/parse-metric.c | 14 ++++++++-----
> tools/perf/tests/pmu-events.c | 5 +++++
> tools/perf/tests/pmu.c | 1 +
> tools/perf/util/evlist.c | 11 ++++++++---
> tools/perf/util/metricgroup.c | 35 +++++++++++++++++++++++----------
> tools/perf/util/parse-events.c | 9 +++++++--
> tools/perf/util/pmu.c | 13 +++++++++++-
> tools/perf/util/pmu.h | 2 ++
> tools/perf/util/stat-shadow.c | 8 +++++---
> 9 files changed, 74 insertions(+), 24 deletions(-)
>
> --
> 2.28.0.618.gf4bc123cb7-goog
>

2020-09-15 07:41:46

by John Garry

[permalink] [raw]
Subject: Re: [PATCH 10/11] perf test: Free aliases for PMU event map aliases test

On 15/09/2020 04:18, Namhyung Kim wrote:
> The aliases were never released causing the following leaks:
>
> Indirect leak of 1224 byte(s) in 9 object(s) allocated from:
> #0 0x7feefb830628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
> #1 0x56332c8f1b62 in __perf_pmu__new_alias util/pmu.c:322
> #2 0x56332c8f401f in pmu_add_cpu_aliases_map util/pmu.c:778
> #3 0x56332c792ce9 in __test__pmu_event_aliases tests/pmu-events.c:295
> #4 0x56332c792ce9 in test_aliases tests/pmu-events.c:367
> #5 0x56332c76a09b in run_test tests/builtin-test.c:410
> #6 0x56332c76a09b in test_and_print tests/builtin-test.c:440
> #7 0x56332c76ce69 in __cmd_test tests/builtin-test.c:695
> #8 0x56332c76ce69 in cmd_test tests/builtin-test.c:807
> #9 0x56332c7d2214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
> #10 0x56332c6701a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
> #11 0x56332c6701a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
> #12 0x56332c6701a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
> #13 0x7feefb359cc9 in __libc_start_main ../csu/libc-start.c:308
>
> Cc: John Garry <[email protected]>
> Acked-by: Jiri Olsa <[email protected]>

Just a minor comment below, either way:
Reviewed-by: John Garry <[email protected]>

Thanks

> Fixes: 956a78356c24c ("perf test: Test pmu-events aliases")
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/tests/pmu-events.c | 5 +++++
> tools/perf/util/pmu.c | 2 +-
> tools/perf/util/pmu.h | 1 +
> 3 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/tests/pmu-events.c b/tools/perf/tests/pmu-events.c
> index eb19f9a0bc15..d3517a74d95e 100644
> --- a/tools/perf/tests/pmu-events.c
> +++ b/tools/perf/tests/pmu-events.c
> @@ -274,6 +274,7 @@ static int __test__pmu_event_aliases(char *pmu_name, int *count)
> int res = 0;
> bool use_uncore_table;
> struct pmu_events_map *map = __test_pmu_get_events_map();
> + struct perf_pmu_alias *a, *tmp;
>
> if (!map)
> return -1;
> @@ -347,6 +348,10 @@ static int __test__pmu_event_aliases(char *pmu_name, int *count)
> pmu_name, alias->name);
> }
>
> + list_for_each_entry_safe(a, tmp, &aliases, list) {
> + list_del(&a->list);
> + perf_pmu_free_alias(a);
> + }

You could also consider putting this in a helper in pmu.c

> free(pmu);
> return res;
> }
> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> index f1688e1f6ed7..555cb3524c25 100644
> --- a/tools/perf/util/pmu.c
> +++ b/tools/perf/util/pmu.c
> @@ -274,7 +274,7 @@ static void perf_pmu_update_alias(struct perf_pmu_alias *old,
> }
>
> /* Delete an alias entry. */
> -static void perf_pmu_free_alias(struct perf_pmu_alias *newalias)
> +void perf_pmu_free_alias(struct perf_pmu_alias *newalias)
> {
> zfree(&newalias->name);
> zfree(&newalias->desc);
> diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
> index 44ccbdbb1c37..b63c4c5e335e 100644
> --- a/tools/perf/util/pmu.h
> +++ b/tools/perf/util/pmu.h
> @@ -113,6 +113,7 @@ void pmu_add_cpu_aliases_map(struct list_head *head, struct perf_pmu *pmu,
>
> struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu);
> bool pmu_uncore_alias_match(const char *pmu_name, const char *name);
> +void perf_pmu_free_alias(struct perf_pmu_alias *alias);
>
> int perf_pmu__convert_scale(const char *scale, char **end, double *sval);
>
>

2020-09-15 12:10:41

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 01/11] perf metric: Fix some memory leaks

Em Tue, Sep 15, 2020 at 12:18:09PM +0900, Namhyung Kim escreveu:
> I found some memory leaks while reading the metric code. Some are
> real and others only occur in the error path. When it failed during
> metric or event parsing, it should release all resources properly.

Thanks, applied.

- Arnaldo

> Cc: Kajol Jain <[email protected]>
> Cc: John Garry <[email protected]>
> Cc: Ian Rogers <[email protected]>
> Acked-by: Jiri Olsa <[email protected]>
> Fixes: b18f3e365019d ("perf stat: Support JSON metrics in perf stat")
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/util/metricgroup.c | 7 +++++--
> 1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index d03bac65a3c2..90d14c63babb 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -529,6 +529,9 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter,
> continue;
> strlist__add(me->metrics, s);
> }
> +
> + if (!raw)
> + free(s);
> }
> free(omg);
> }
> @@ -1041,7 +1044,7 @@ static int parse_groups(struct evlist *perf_evlist, const char *str,
> ret = metricgroup__add_metric_list(str, metric_no_group,
> &extra_events, &metric_list, map);
> if (ret)
> - return ret;
> + goto out;
> pr_debug("adding %s\n", extra_events.buf);
> bzero(&parse_error, sizeof(parse_error));
> ret = __parse_events(perf_evlist, extra_events.buf, &parse_error, fake_pmu);
> @@ -1049,11 +1052,11 @@ static int parse_groups(struct evlist *perf_evlist, const char *str,
> parse_events_print_error(&parse_error, extra_events.buf);
> goto out;
> }
> - strbuf_release(&extra_events);
> ret = metricgroup__setup_events(&metric_list, metric_no_merge,
> perf_evlist, metric_events);
> out:
> metricgroup__free_metrics(&metric_list);
> + strbuf_release(&extra_events);
> return ret;
> }
>
> --
> 2.28.0.618.gf4bc123cb7-goog
>

--

- Arnaldo

2020-09-15 12:12:57

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 02/11] perf metric: Fix some memory leaks - part 2

Em Tue, Sep 15, 2020 at 12:18:10PM +0900, Namhyung Kim escreveu:
> The metric_event_delete() missed to free expr->metric_events and it
> should free an expr when metric_refs allocation failed.


Thanks, applied.

- Arnaldo

> Cc: Kajol Jain <[email protected]>
> Cc: John Garry <[email protected]>
> Cc: Ian Rogers <[email protected]>
> Acked-by: Jiri Olsa <[email protected]>
> Fixes: 4ea2896715e67 ("perf metric: Collect referenced metrics in struct metric_expr")
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/util/metricgroup.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index 90d14c63babb..53747df601fa 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -84,6 +84,7 @@ static void metric_event_delete(struct rblist *rblist __maybe_unused,
>
> list_for_each_entry_safe(expr, tmp, &me->head, nd) {
> free(expr->metric_refs);
> + free(expr->metric_events);
> free(expr);
> }
>
> @@ -315,6 +316,7 @@ static int metricgroup__setup_events(struct list_head *groups,
> if (!metric_refs) {
> ret = -ENOMEM;
> free(metric_events);
> + free(expr);
> break;
> }
>
> --
> 2.28.0.618.gf4bc123cb7-goog
>

--

- Arnaldo

2020-09-15 12:14:59

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 10/11] perf test: Free aliases for PMU event map aliases test

Em Tue, Sep 15, 2020 at 08:37:15AM +0100, John Garry escreveu:
> On 15/09/2020 04:18, Namhyung Kim wrote:
> > The aliases were never released causing the following leaks:
> >
> > Indirect leak of 1224 byte(s) in 9 object(s) allocated from:
> > #0 0x7feefb830628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
> > #1 0x56332c8f1b62 in __perf_pmu__new_alias util/pmu.c:322
> > #2 0x56332c8f401f in pmu_add_cpu_aliases_map util/pmu.c:778
> > #3 0x56332c792ce9 in __test__pmu_event_aliases tests/pmu-events.c:295
> > #4 0x56332c792ce9 in test_aliases tests/pmu-events.c:367
> > #5 0x56332c76a09b in run_test tests/builtin-test.c:410
> > #6 0x56332c76a09b in test_and_print tests/builtin-test.c:440
> > #7 0x56332c76ce69 in __cmd_test tests/builtin-test.c:695
> > #8 0x56332c76ce69 in cmd_test tests/builtin-test.c:807
> > #9 0x56332c7d2214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
> > #10 0x56332c6701a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
> > #11 0x56332c6701a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
> > #12 0x56332c6701a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
> > #13 0x7feefb359cc9 in __libc_start_main ../csu/libc-start.c:308
> >
> > Cc: John Garry <[email protected]>
> > Acked-by: Jiri Olsa <[email protected]>
>
> Just a minor comment below, either way:
> Reviewed-by: John Garry <[email protected]>

Thanks, applied to perf/urgent.

- Arnaldo

> Thanks
>
> > Fixes: 956a78356c24c ("perf test: Test pmu-events aliases")
> > Signed-off-by: Namhyung Kim <[email protected]>
> > ---
> > tools/perf/tests/pmu-events.c | 5 +++++
> > tools/perf/util/pmu.c | 2 +-
> > tools/perf/util/pmu.h | 1 +
> > 3 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/tests/pmu-events.c b/tools/perf/tests/pmu-events.c
> > index eb19f9a0bc15..d3517a74d95e 100644
> > --- a/tools/perf/tests/pmu-events.c
> > +++ b/tools/perf/tests/pmu-events.c
> > @@ -274,6 +274,7 @@ static int __test__pmu_event_aliases(char *pmu_name, int *count)
> > int res = 0;
> > bool use_uncore_table;
> > struct pmu_events_map *map = __test_pmu_get_events_map();
> > + struct perf_pmu_alias *a, *tmp;
> > if (!map)
> > return -1;
> > @@ -347,6 +348,10 @@ static int __test__pmu_event_aliases(char *pmu_name, int *count)
> > pmu_name, alias->name);
> > }
> > + list_for_each_entry_safe(a, tmp, &aliases, list) {
> > + list_del(&a->list);
> > + perf_pmu_free_alias(a);
> > + }
>
> You could also consider putting this in a helper in pmu.c
>
> > free(pmu);
> > return res;
> > }
> > diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
> > index f1688e1f6ed7..555cb3524c25 100644
> > --- a/tools/perf/util/pmu.c
> > +++ b/tools/perf/util/pmu.c
> > @@ -274,7 +274,7 @@ static void perf_pmu_update_alias(struct perf_pmu_alias *old,
> > }
> > /* Delete an alias entry. */
> > -static void perf_pmu_free_alias(struct perf_pmu_alias *newalias)
> > +void perf_pmu_free_alias(struct perf_pmu_alias *newalias)
> > {
> > zfree(&newalias->name);
> > zfree(&newalias->desc);
> > diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
> > index 44ccbdbb1c37..b63c4c5e335e 100644
> > --- a/tools/perf/util/pmu.h
> > +++ b/tools/perf/util/pmu.h
> > @@ -113,6 +113,7 @@ void pmu_add_cpu_aliases_map(struct list_head *head, struct perf_pmu *pmu,
> > struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu);
> > bool pmu_uncore_alias_match(const char *pmu_name, const char *name);
> > +void perf_pmu_free_alias(struct perf_pmu_alias *alias);
> > int perf_pmu__convert_scale(const char *scale, char **end, double *sval);
> >
>

--

- Arnaldo

2020-09-15 12:26:58

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 04/11] perf parse-event: Fix cpu map leaks

Em Tue, Sep 15, 2020 at 12:18:12PM +0900, Namhyung Kim escreveu:
> Like evlist cpu map, evsel's cpu map should have proper refcount by
> releasing the original count after creation.

"releasing original count after creation"?

There are two fixes here, one its legit, i.e. when failing to create
the new evsel, if you created the perf_cpu_map, drop the refcount,
which, in this case, will free it since perf_cpu_map__new() sets it to
1.

But what about the other? Humm, I see, since a new refcount is being
obtained, then we need to drop the first.

This all got complicated, perhaps the following patch is simpler?

diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index c4d2394e2b2dc60f..3dceeacf8669bc5d 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -353,18 +353,20 @@ __add_event(struct list_head *list, int *idx,
const char *cpu_list)
{
struct evsel *evsel;
- struct perf_cpu_map *cpus = pmu ? pmu->cpus :
+ struct perf_cpu_map *cpus = pmu ? perf_cpu_map__get(pmu->cpus) :
cpu_list ? perf_cpu_map__new(cpu_list) : NULL;

if (init_attr)
event_attr_init(attr);

evsel = evsel__new_idx(attr, *idx);
- if (!evsel)
+ if (!evsel) {
+ perf_cpu_map__put(cpus);
return NULL;
+ }

(*idx)++;
- evsel->core.cpus = perf_cpu_map__get(cpus);
+ evsel->core.cpus = cpus;
evsel->core.own_cpus = perf_cpu_map__get(cpus);
evsel->core.system_wide = pmu ? pmu->is_uncore : false;
evsel->auto_merge_stats = auto_merge_stats;


---

I.e. if we're going to share pmu->cpus, grab the necessary refcount at
that point, if we're going to create one (pmu == NULL), then
perf_cpu_map__new() will have the refcount we need (will set it to 1).

Then, if we fail to create the new evsel, we just drop the refcount we
got either from perf_cpu_map__get(pmu->cpus) or from
perf_cpu_map__new(cpu_list) (NULL is ok for __put() operations, that
covers that last ': NULL').

And then, when we go make evsel->core.cpus share that cpu_map, we know
we have the necessary refcount already, right?

No need to later on drop the one obtained previously via:

evsel->core.cpus = perf_cpu_map__get(cpus);

And we don't need to drop it later when we want to drop the extra
refcount it gets when pmu == NULL.

- Arnaldo

> This fixes the following ASAN report:
>
> Direct leak of 840 byte(s) in 70 object(s) allocated from:
> #0 0x7fe36703f628 in malloc (/lib/x86_64-linux-gnu/libasan.so.5+0x107628)
> #1 0x559fbbf611ca in cpu_map__trim_new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:79
> #2 0x559fbbf6229c in perf_cpu_map__new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:237
> #3 0x559fbbcc6c6d in __add_event util/parse-events.c:357
> #4 0x559fbbcc6c6d in add_event_tool util/parse-events.c:408
> #5 0x559fbbcc6c6d in parse_events_add_tool util/parse-events.c:1414
> #6 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
> #7 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
> #8 0x559fbbcc95da in __parse_events util/parse-events.c:2141
> #9 0x559fbbc2788b in check_parse_id tests/pmu-events.c:406
> #10 0x559fbbc2788b in check_parse_id tests/pmu-events.c:393
> #11 0x559fbbc2788b in check_parse_fake tests/pmu-events.c:436
> #12 0x559fbbc2788b in metric_parse_fake tests/pmu-events.c:553
> #13 0x559fbbc27e2d in test_parsing_fake tests/pmu-events.c:599
> #14 0x559fbbc27e2d in test_parsing_fake tests/pmu-events.c:574
> #15 0x559fbbc0109b in run_test tests/builtin-test.c:410
> #16 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
> #17 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
> #18 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
> #19 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
> #20 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
> #21 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
> #22 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
> #23 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308
>
> And I've failed which commit introduced this bug as the code was
> heavily changed since then. ;-/
>
> Acked-by: Jiri Olsa <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/util/parse-events.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index c4d2394e2b2d..b35e4bb1cecb 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -360,8 +360,11 @@ __add_event(struct list_head *list, int *idx,
> event_attr_init(attr);
>
> evsel = evsel__new_idx(attr, *idx);
> - if (!evsel)
> + if (!evsel) {
> + if (!pmu)
> + perf_cpu_map__put(cpus);
> return NULL;
> + }
>
> (*idx)++;
> evsel->core.cpus = perf_cpu_map__get(cpus);
> @@ -369,6 +372,8 @@ __add_event(struct list_head *list, int *idx,
> evsel->core.system_wide = pmu ? pmu->is_uncore : false;
> evsel->auto_merge_stats = auto_merge_stats;
>
> + if (!pmu)
> + perf_cpu_map__put(cpus);
> if (name)
> evsel->name = strdup(name);
>
> --
> 2.28.0.618.gf4bc123cb7-goog
>

--

- Arnaldo

2020-09-15 12:27:49

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 06/11] perf test: Fix memory leaks in parse-metric test

Em Tue, Sep 15, 2020 at 12:18:14PM +0900, Namhyung Kim escreveu:
> It didn't release resources when there's an error so the
> test_recursion_fail() will leak some memory.

Thanks, applied.

- Arnaldo
>
> Acked-by: Jiri Olsa <[email protected]>
> Fixes: 0a507af9c681a ("perf tests: Add parse metric test for ipc metric")
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/tests/parse-metric.c | 14 +++++++++-----
> 1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/tests/parse-metric.c b/tools/perf/tests/parse-metric.c
> index 23db8acc492d..cd7331aac3bd 100644
> --- a/tools/perf/tests/parse-metric.c
> +++ b/tools/perf/tests/parse-metric.c
> @@ -153,8 +153,10 @@ static int __compute_metric(const char *name, struct value *vals,
> return -ENOMEM;
>
> cpus = perf_cpu_map__new("0");
> - if (!cpus)
> + if (!cpus) {
> + evlist__delete(evlist);
> return -ENOMEM;
> + }
>
> perf_evlist__set_maps(&evlist->core, cpus, NULL);
>
> @@ -163,10 +165,11 @@ static int __compute_metric(const char *name, struct value *vals,
> false, false,
> &metric_events);
> if (err)
> - return err;
> + goto out;
>
> - if (perf_evlist__alloc_stats(evlist, false))
> - return -1;
> + err = perf_evlist__alloc_stats(evlist, false);
> + if (err)
> + goto out;
>
> /* Load the runtime stats with given numbers for events. */
> runtime_stat__init(&st);
> @@ -178,13 +181,14 @@ static int __compute_metric(const char *name, struct value *vals,
> if (name2 && ratio2)
> *ratio2 = compute_single(&metric_events, evlist, &st, name2);
>
> +out:
> /* ... clenup. */
> metricgroup__rblist_exit(&metric_events);
> runtime_stat__exit(&st);
> perf_evlist__free_stats(evlist);
> perf_cpu_map__put(cpus);
> evlist__delete(evlist);
> - return 0;
> + return err;
> }
>
> static int compute_metric(const char *name, struct value *vals, double *ratio)
> --
> 2.28.0.618.gf4bc123cb7-goog
>

--

- Arnaldo

2020-09-15 12:27:49

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 05/11] perf parse-event: Fix memory leak in evsel->unit

Em Tue, Sep 15, 2020 at 12:18:13PM +0900, Namhyung Kim escreveu:
> The evsel->unit borrows a pointer of pmu event or alias instead of
> owns a string. But tool event (duration_time) passes a result of
> strdup() caused a leak.
>
> It was found by ASAN during metric test:

Thanks, applied.

> Direct leak of 210 byte(s) in 70 object(s) allocated from:
> #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
> #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
> #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
> #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
> #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
> #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
> #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
> #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
> #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
> #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
> #10 0x559fbbc0109b in run_test tests/builtin-test.c:410
> #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
> #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
> #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
> #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
> #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
> #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
> #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
> #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308
>
> Acked-by: Jiri Olsa <[email protected]>
> Fixes: f0fbb114e3025 ("perf stat: Implement duration_time as a proper event")
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/util/parse-events.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index b35e4bb1cecb..ece321ccf599 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -416,7 +416,7 @@ static int add_event_tool(struct list_head *list, int *idx,
> return -ENOMEM;
> evsel->tool_event = tool_event;
> if (tool_event == PERF_TOOL_DURATION_TIME)
> - evsel->unit = strdup("ns");
> + evsel->unit = "ns";
> return 0;
> }
>
> --
> 2.28.0.618.gf4bc123cb7-goog
>

--

- Arnaldo

2020-09-15 12:28:07

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 08/11] perf metric: Free metric when it failed to resolve

Em Tue, Sep 15, 2020 at 12:18:16PM +0900, Namhyung Kim escreveu:
> The metricgroup__add_metric() can find multiple match for a metric
> group and it's possible to fail. Also it can fail in the middle like
> in resolve_metric() even for single metric.

Thanks, applied.

- Arnaldo

>
> In those cases, the intermediate list and ids will be leaked like:
>
> Direct leak of 3 byte(s) in 1 object(s) allocated from:
> #0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
> #1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683
> #2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906
> #3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940
> #4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993
> #5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045
> #6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087
> #7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164
> #8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196
> #9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318
> #10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356
> #11 0x55f7e70be09b in run_test tests/builtin-test.c:410
> #12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
> #13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
> #14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
> #15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
> #16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
> #17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
> #18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
> #19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308
>
> Acked-by: Jiri Olsa <[email protected]>
> Fixes: 83de0b7d535de ("perf metric: Collect referenced metrics in struct metric_ref_node")
> Signed-off-by: Namhyung Kim <[email protected]>
> ---
> tools/perf/util/metricgroup.c | 17 ++++++++++++-----
> 1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index 53747df601fa..35bcaa8d11e9 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -940,7 +940,7 @@ static int metricgroup__add_metric(const char *metric, bool metric_no_group,
>
> ret = add_metric(&list, pe, metric_no_group, &m, NULL, &ids);
> if (ret)
> - return ret;
> + goto out;
>
> /*
> * Process any possible referenced metrics
> @@ -949,12 +949,14 @@ static int metricgroup__add_metric(const char *metric, bool metric_no_group,
> ret = resolve_metric(metric_no_group,
> &list, map, &ids);
> if (ret)
> - return ret;
> + goto out;
> }
>
> /* End of pmu events. */
> - if (!has_match)
> - return -EINVAL;
> + if (!has_match) {
> + ret = -EINVAL;
> + goto out;
> + }
>
> list_for_each_entry(m, &list, nd) {
> if (events->len > 0)
> @@ -969,9 +971,14 @@ static int metricgroup__add_metric(const char *metric, bool metric_no_group,
> }
> }
>
> +out:
> + /*
> + * add to metric_list so that they can be released
> + * even if it's failed
> + */
> list_splice(&list, metric_list);
> expr_ids__exit(&ids);
> - return 0;
> + return ret;
> }
>
> static int metricgroup__add_metric_list(const char *list, bool metric_no_group,
> --
> 2.28.0.618.gf4bc123cb7-goog
>

--

- Arnaldo

2020-09-15 15:25:08

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCHSET v2 00/11] perf tools: Fix various memory leaks

Hi Ian,

On Tue, Sep 15, 2020 at 2:15 PM Ian Rogers <[email protected]> wrote:
>
> On Mon, Sep 14, 2020 at 8:18 PM Namhyung Kim <[email protected]> wrote:
> >
> > Hello,
> >
> > I've found and fixed a bunch of memory leaks during perf pmu and
> > metric tests with address sanitizer. Before this, the tests were
> > mostly failed due to the leaks since ASAN makes it return non-zero.
> >
> > Now I'm seeing no error with ASAN like below:
> >
> > $ ./perf test pmu metric
> > 9: Parse perf pmu format : Ok
> > 10: PMU events :
> > 10.1: PMU event table sanity : Ok
> > 10.2: PMU event map aliases : Ok
> > 10.3: Parsing of PMU event table metrics : Skip (some metrics failed)
> > 10.4: Parsing of PMU event table metrics with fake PMUs : Ok
> > 67: Parse and process metrics : Ok
> >
> > The failure in 10.3 seems due to parse errors like below:
> >
> > Multiple errors dropping message: unknown term 'filter_opc' for pmu 'uncore_cbox_0'
> > (valid terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,freq,
> > branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,
> > nr,no-overwrite,overwrite,driver-config,percore,aux-output,aux-sample-size)
> >
> >
> > Parse event failed metric 'DRAM_Parallel_Reads' id 'arb/event=0x80,umask=0x2,thresh=1/'
> > expr 'arb@event\=0x80\,umask\=0x2@ / arb@event\=0x80\,umask\=0x2\,thresh\=1@'
> > Error string 'unknown term 'thresh' for pmu 'uncore_arb'' help
> > 'valid terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,freq,
> > branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,
> > nr,no-overwrite,overwrite,driver-config,percore,aux-output,aux-sample-size'
>
> The 10.3 failure seems to be a problem in the skl metric DRAM_Parallel_Reads:
> https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/pmu-events/arch/x86/skylake/skl-metrics.json?h=perf/core#n319
> arb@event\\=0x80\\,umask\\=0x2@ / arb@event\\=0x80\\,umask\\=0x2\\,thresh\\=1@
>
> The test failure message is:
> Parse event failed metric 'DRAM_Parallel_Reads' id
> 'arb/event=0x80,umask=0x2,thresh=1/' expr
> 'arb@event\=0x80\,umask\=0x2@ /
> arb@event\=0x80\,umask\=0x2\,thresh\=1@'
> Error string 'unknown term 'thresh' for pmu 'uncore_arb'' help 'valid
> terms: event,edge,inv,umask,cmask,config,config1,config2,name,period,freq,branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,nr,no-overwrite,overwrite,driver-config,percore,aux-output,aux-sample-size'
>
> The 01.org version of this from:
> https://download.01.org/perfmon/TMA_Metrics.xlsx
> is:
> UNC_ARB_TRK_OCCUPANCY.DATA_READ / UNC_ARB_TRK_OCCUPANCY.DATA_READ:c1
>
> It seems that :c1 has been translated into thresh=1 but that thresh
> doesn't exist as a format option (just "cmask edge event inv umask"
> are present). I wonder if Andi or Jin you could look into this broken
> metric?

Thanks for the explanation. It'd be nice if Intel folks can take a look..

Thanks
Namhyung

2020-09-15 17:42:38

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 04/11] perf parse-event: Fix cpu map leaks

Em Tue, Sep 15, 2020 at 11:39:56PM +0900, Namhyung Kim escreveu:
> Hi Arnaldo,
>
> On Tue, Sep 15, 2020 at 9:18 PM Arnaldo Carvalho de Melo
> <[email protected]> wrote:
> >
> > Em Tue, Sep 15, 2020 at 12:18:12PM +0900, Namhyung Kim escreveu:
> > > Like evlist cpu map, evsel's cpu map should have proper refcount by
> > > releasing the original count after creation.
> >
> > "releasing original count after creation"?
> >
> > There are two fixes here, one its legit, i.e. when failing to create
> > the new evsel, if you created the perf_cpu_map, drop the refcount,
> > which, in this case, will free it since perf_cpu_map__new() sets it to
> > 1.
> >
> > But what about the other? Humm, I see, since a new refcount is being
> > obtained, then we need to drop the first.
> >
> > This all got complicated, perhaps the following patch is simpler?
> >
> > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> > index c4d2394e2b2dc60f..3dceeacf8669bc5d 100644
> > --- a/tools/perf/util/parse-events.c
> > +++ b/tools/perf/util/parse-events.c
> > @@ -353,18 +353,20 @@ __add_event(struct list_head *list, int *idx,
> > const char *cpu_list)
> > {
> > struct evsel *evsel;
> > - struct perf_cpu_map *cpus = pmu ? pmu->cpus :
> > + struct perf_cpu_map *cpus = pmu ? perf_cpu_map__get(pmu->cpus) :
> > cpu_list ? perf_cpu_map__new(cpu_list) : NULL;
> >
> > if (init_attr)
> > event_attr_init(attr);
> >
> > evsel = evsel__new_idx(attr, *idx);
> > - if (!evsel)
> > + if (!evsel) {
> > + perf_cpu_map__put(cpus);
> > return NULL;
> > + }
> >
> > (*idx)++;
> > - evsel->core.cpus = perf_cpu_map__get(cpus);
> > + evsel->core.cpus = cpus;
> > evsel->core.own_cpus = perf_cpu_map__get(cpus);
> > evsel->core.system_wide = pmu ? pmu->is_uncore : false;
> > evsel->auto_merge_stats = auto_merge_stats;
> >
> >
> > ---
> >
> > I.e. if we're going to share pmu->cpus, grab the necessary refcount at
> > that point, if we're going to create one (pmu == NULL), then
> > perf_cpu_map__new() will have the refcount we need (will set it to 1).
> >
> > Then, if we fail to create the new evsel, we just drop the refcount we
> > got either from perf_cpu_map__get(pmu->cpus) or from
> > perf_cpu_map__new(cpu_list) (NULL is ok for __put() operations, that
> > covers that last ': NULL').
> >
> > And then, when we go make evsel->core.cpus share that cpu_map, we know
> > we have the necessary refcount already, right?
>
> Indeed! This looks a lot better. Do you want me to resend?

Please, feel free to use whatever snippets from my explanations.

But please consider breaking it into two patches, without thinking too
much about it at this time, it seems there are two fixes to be done in
this case.

- Arnaldo

2020-09-15 19:01:09

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH 05/11] perf parse-event: Fix memory leak in evsel->unit

On Tue, Sep 15, 2020 at 5:19 AM Arnaldo Carvalho de Melo
<[email protected]> wrote:
>
> Em Tue, Sep 15, 2020 at 12:18:13PM +0900, Namhyung Kim escreveu:
> > The evsel->unit borrows a pointer of pmu event or alias instead of
> > owns a string. But tool event (duration_time) passes a result of
> > strdup() caused a leak.
> >
> > It was found by ASAN during metric test:
>
> Thanks, applied.

Thanks Namhyung and Arnaldo, just to raise a meta point. A lot of the
parse-events asan failures were caused by a lack of strdup causing
frees of string literals. It seems we're now adding strdup defensively
but introducing memory leaks. Could we be doing this in a smarter way?
For C++ I'd likely use std::string and walk away. For perf code the
best source of "ownership" I've found is to look at the "delete"
functions and figure out ownership from what gets freed there - this
can be burdensome. For strings, the code is also using strbuf and
asprintf. One possible improvement could be to document ownership next
to the struct member variable declarations. Another idea would be to
declare a macro whose usage would look like:

struct evsel {
...
OWNER(char *name, "this");
...
UNOWNED(const char *unit);
...

Maybe then we could get a static analyzer to complain if a literal
were assigned to an owned struct variable. Perhaps if a strdup were
assigned to an UNOWNED struct variable perhaps it could warn too, as
presumably the memory allocation is a request to own the memory.

There was a talk about GCC's -fanalyzer option doing malloc/free
checking at Linux plumbers 2 weeks ago:
https://linuxplumbersconf.org/event/7/contributions/721/attachments/542/961/2020-LPC-analyzer-talk.pdf
I added David Malcolm, the LPC presenter, as he may have ideas on how
we could do this in a better way.

Thanks,
Ian


> > Direct leak of 210 byte(s) in 70 object(s) allocated from:
> > #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
> > #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
> > #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
> > #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
> > #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
> > #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
> > #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
> > #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
> > #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
> > #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
> > #10 0x559fbbc0109b in run_test tests/builtin-test.c:410
> > #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
> > #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
> > #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
> > #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
> > #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
> > #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
> > #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
> > #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308
> >
> > Acked-by: Jiri Olsa <[email protected]>
> > Fixes: f0fbb114e3025 ("perf stat: Implement duration_time as a proper event")
> > Signed-off-by: Namhyung Kim <[email protected]>
> > ---
> > tools/perf/util/parse-events.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> > index b35e4bb1cecb..ece321ccf599 100644
> > --- a/tools/perf/util/parse-events.c
> > +++ b/tools/perf/util/parse-events.c
> > @@ -416,7 +416,7 @@ static int add_event_tool(struct list_head *list, int *idx,
> > return -ENOMEM;
> > evsel->tool_event = tool_event;
> > if (tool_event == PERF_TOOL_DURATION_TIME)
> > - evsel->unit = strdup("ns");
> > + evsel->unit = "ns";
> > return 0;
> > }
> >
> > --
> > 2.28.0.618.gf4bc123cb7-goog
> >
>
> --
>
> - Arnaldo

2020-09-15 19:58:51

by David Malcolm

[permalink] [raw]
Subject: Re: [PATCH 05/11] perf parse-event: Fix memory leak in evsel->unit

On Tue, 2020-09-15 at 11:59 -0700, Ian Rogers wrote:
> On Tue, Sep 15, 2020 at 5:19 AM Arnaldo Carvalho de Melo
> <[email protected]> wrote:
> > Em Tue, Sep 15, 2020 at 12:18:13PM +0900, Namhyung Kim escreveu:
> > > The evsel->unit borrows a pointer of pmu event or alias instead
> > > of
> > > owns a string. But tool event (duration_time) passes a result of
> > > strdup() caused a leak.
> > >
> > > It was found by ASAN during metric test:
> >
> > Thanks, applied.
>
> Thanks Namhyung and Arnaldo, just to raise a meta point. A lot of the
> parse-events asan failures were caused by a lack of strdup causing
> frees of string literals. It seems we're now adding strdup
> defensively
> but introducing memory leaks. Could we be doing this in a smarter
> way?
> For C++ I'd likely use std::string and walk away. For perf code the
> best source of "ownership" I've found is to look at the "delete"
> functions and figure out ownership from what gets freed there - this
> can be burdensome. For strings, the code is also using strbuf and
> asprintf. One possible improvement could be to document ownership
> next
> to the struct member variable declarations. Another idea would be to
> declare a macro whose usage would look like:
>
> struct evsel {
> ...
> OWNER(char *name, "this");
> ...
> UNOWNED(const char *unit);
> ...
>
> Maybe then we could get a static analyzer to complain if a literal
> were assigned to an owned struct variable. Perhaps if a strdup were
> assigned to an UNOWNED struct variable perhaps it could warn too, as
> presumably the memory allocation is a request to own the memory.
>
> There was a talk about GCC's -fanalyzer option doing malloc/free
> checking at Linux plumbers 2 weeks ago:
> https://linuxplumbersconf.org/event/7/contributions/721/attachments/542/961/2020-LPC-analyzer-talk.pdf
> I added David Malcolm, the LPC presenter, as he may have ideas on how
> we could do this in a better way.

Hi Ian.

Some ideas (with the caveat that I'm a GCC developer, and not a regular
on LKML): can you capture the ownership status in the type system?
I'm brainstorming here but how about:
typedef char *owned_string_t;
typedef const char *borrowed_string_t;
This would at least capture the intent in human-readable form, and
*might* make things more amenable to checking by a machine. It's also
less macro cruft.
I take it that capturing the ownership status with a runtime flag next
to the pointer in a struct is too expensive for your code?


Some notes on -fanalyzer:

Caveat: The implementation of -fanalyzer in gcc 10 is an early
prototype and although it has found its first CVE I don't recommend it
for use "in anger" yet; I'm working on getting it more suitable for
general usage for C in gcc 11. (mostly scaling issues and other
bugfixing)

-fanalyzer associates state machines with APIs; one of these state
machines implements leak detection for malloc, along with e.g. double-
free detection. I'm generalizing this checker to other acquire/release
APIs: I have a semi-working patch under development (targeting GCC 11)
that exposes this via a fndecl attribute, currently named
"deallocated_by", so that fn decls can be labeled e.g.:

extern void foo_release (foo *);
extern foo *foo_acquire (void)
__attribute__((deallocated_by(foo_release));

and have -fanalyzer detect leaks, double-releases, use-after-release,
failure to check for NULL (alloc failure) etc.

Ultimately this attribute might land in the libc header for strdup (and
friends), but I can also special-case strdup so that the analyzer
"knows" that the result needs to be freed if non-NULL (and that it can
fail and return NULL).

Hope this is constructive
Dave

[...]


2020-09-15 20:10:32

by Arnaldo Carvalho de Melo

[permalink] [raw]
Subject: Re: [PATCH 05/11] perf parse-event: Fix memory leak in evsel->unit

Em Tue, Sep 15, 2020 at 11:59:13AM -0700, Ian Rogers escreveu:
> On Tue, Sep 15, 2020 at 5:19 AM Arnaldo Carvalho de Melo
> <[email protected]> wrote:
> >
> > Em Tue, Sep 15, 2020 at 12:18:13PM +0900, Namhyung Kim escreveu:
> > > The evsel->unit borrows a pointer of pmu event or alias instead of
> > > owns a string. But tool event (duration_time) passes a result of
> > > strdup() caused a leak.
> > >
> > > It was found by ASAN during metric test:
> >
> > Thanks, applied.
>
> Thanks Namhyung and Arnaldo, just to raise a meta point. A lot of the
> parse-events asan failures were caused by a lack of strdup causing
> frees of string literals. It seems we're now adding strdup defensively
> but introducing memory leaks. Could we be doing this in a smarter way?

I hope so, the parsing code was always something I thought about
auditing but never had time to work on, whatever you can do to help with
that will be great!

- Arnaldo

> For C++ I'd likely use std::string and walk away. For perf code the
> best source of "ownership" I've found is to look at the "delete"
> functions and figure out ownership from what gets freed there - this
> can be burdensome. For strings, the code is also using strbuf and
> asprintf. One possible improvement could be to document ownership next
> to the struct member variable declarations. Another idea would be to
> declare a macro whose usage would look like:
>
> struct evsel {
> ...
> OWNER(char *name, "this");
> ...
> UNOWNED(const char *unit);
> ...
>
> Maybe then we could get a static analyzer to complain if a literal
> were assigned to an owned struct variable. Perhaps if a strdup were
> assigned to an UNOWNED struct variable perhaps it could warn too, as
> presumably the memory allocation is a request to own the memory.
>
> There was a talk about GCC's -fanalyzer option doing malloc/free
> checking at Linux plumbers 2 weeks ago:
> https://linuxplumbersconf.org/event/7/contributions/721/attachments/542/961/2020-LPC-analyzer-talk.pdf
> I added David Malcolm, the LPC presenter, as he may have ideas on how
> we could do this in a better way.
>
> Thanks,
> Ian
>
>
> > > Direct leak of 210 byte(s) in 70 object(s) allocated from:
> > > #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
> > > #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
> > > #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
> > > #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
> > > #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
> > > #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
> > > #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
> > > #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
> > > #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
> > > #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
> > > #10 0x559fbbc0109b in run_test tests/builtin-test.c:410
> > > #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
> > > #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
> > > #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
> > > #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
> > > #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
> > > #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
> > > #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
> > > #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308
> > >
> > > Acked-by: Jiri Olsa <[email protected]>
> > > Fixes: f0fbb114e3025 ("perf stat: Implement duration_time as a proper event")
> > > Signed-off-by: Namhyung Kim <[email protected]>
> > > ---
> > > tools/perf/util/parse-events.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> > > index b35e4bb1cecb..ece321ccf599 100644
> > > --- a/tools/perf/util/parse-events.c
> > > +++ b/tools/perf/util/parse-events.c
> > > @@ -416,7 +416,7 @@ static int add_event_tool(struct list_head *list, int *idx,
> > > return -ENOMEM;
> > > evsel->tool_event = tool_event;
> > > if (tool_event == PERF_TOOL_DURATION_TIME)
> > > - evsel->unit = strdup("ns");
> > > + evsel->unit = "ns";
> > > return 0;
> > > }
> > >
> > > --
> > > 2.28.0.618.gf4bc123cb7-goog
> > >
> >
> > --
> >
> > - Arnaldo

--

- Arnaldo

2020-09-15 23:15:25

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 04/11] perf parse-event: Fix cpu map leaks

Hi Arnaldo,

On Tue, Sep 15, 2020 at 9:18 PM Arnaldo Carvalho de Melo
<[email protected]> wrote:
>
> Em Tue, Sep 15, 2020 at 12:18:12PM +0900, Namhyung Kim escreveu:
> > Like evlist cpu map, evsel's cpu map should have proper refcount by
> > releasing the original count after creation.
>
> "releasing original count after creation"?
>
> There are two fixes here, one its legit, i.e. when failing to create
> the new evsel, if you created the perf_cpu_map, drop the refcount,
> which, in this case, will free it since perf_cpu_map__new() sets it to
> 1.
>
> But what about the other? Humm, I see, since a new refcount is being
> obtained, then we need to drop the first.
>
> This all got complicated, perhaps the following patch is simpler?
>
> diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
> index c4d2394e2b2dc60f..3dceeacf8669bc5d 100644
> --- a/tools/perf/util/parse-events.c
> +++ b/tools/perf/util/parse-events.c
> @@ -353,18 +353,20 @@ __add_event(struct list_head *list, int *idx,
> const char *cpu_list)
> {
> struct evsel *evsel;
> - struct perf_cpu_map *cpus = pmu ? pmu->cpus :
> + struct perf_cpu_map *cpus = pmu ? perf_cpu_map__get(pmu->cpus) :
> cpu_list ? perf_cpu_map__new(cpu_list) : NULL;
>
> if (init_attr)
> event_attr_init(attr);
>
> evsel = evsel__new_idx(attr, *idx);
> - if (!evsel)
> + if (!evsel) {
> + perf_cpu_map__put(cpus);
> return NULL;
> + }
>
> (*idx)++;
> - evsel->core.cpus = perf_cpu_map__get(cpus);
> + evsel->core.cpus = cpus;
> evsel->core.own_cpus = perf_cpu_map__get(cpus);
> evsel->core.system_wide = pmu ? pmu->is_uncore : false;
> evsel->auto_merge_stats = auto_merge_stats;
>
>
> ---
>
> I.e. if we're going to share pmu->cpus, grab the necessary refcount at
> that point, if we're going to create one (pmu == NULL), then
> perf_cpu_map__new() will have the refcount we need (will set it to 1).
>
> Then, if we fail to create the new evsel, we just drop the refcount we
> got either from perf_cpu_map__get(pmu->cpus) or from
> perf_cpu_map__new(cpu_list) (NULL is ok for __put() operations, that
> covers that last ': NULL').
>
> And then, when we go make evsel->core.cpus share that cpu_map, we know
> we have the necessary refcount already, right?

Indeed! This looks a lot better. Do you want me to resend?

Thanks
Namhyung


>
> No need to later on drop the one obtained previously via:
>
> evsel->core.cpus = perf_cpu_map__get(cpus);
>
> And we don't need to drop it later when we want to drop the extra
> refcount it gets when pmu == NULL.
>
> - Arnaldo

2020-09-16 07:13:38

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 05/11] perf parse-event: Fix memory leak in evsel->unit

Hello Ian and David,

Thank you for the good suggestions!

On Wed, Sep 16, 2020 at 4:56 AM David Malcolm <[email protected]> wrote:
> Some ideas (with the caveat that I'm a GCC developer, and not a regular
> on LKML): can you capture the ownership status in the type system?
> I'm brainstorming here but how about:
> typedef char *owned_string_t;
> typedef const char *borrowed_string_t;
> This would at least capture the intent in human-readable form, and
> *might* make things more amenable to checking by a machine. It's also
> less macro cruft.
> I take it that capturing the ownership status with a runtime flag next
> to the pointer in a struct is too expensive for your code?

Adding more random thoughts..

I think we can make it more generic like __attribute__((owned))
so that it can be applied to any pointers. And we can use a
conventional macro like '__owned' in the declaration..

__owned char *name;
__owned char *strdup(const char *);
...

Thanks
Namhyung

2020-09-16 18:38:36

by Ian Rogers

[permalink] [raw]
Subject: Re: [PATCH 05/11] perf parse-event: Fix memory leak in evsel->unit

On Wed, Sep 16, 2020 at 12:12 AM Namhyung Kim <[email protected]> wrote:
>
> Hello Ian and David,
>
> Thank you for the good suggestions!
>
> On Wed, Sep 16, 2020 at 4:56 AM David Malcolm <[email protected]> wrote:
> > Some ideas (with the caveat that I'm a GCC developer, and not a regular
> > on LKML): can you capture the ownership status in the type system?
> > I'm brainstorming here but how about:
> > typedef char *owned_string_t;
> > typedef const char *borrowed_string_t;
> > This would at least capture the intent in human-readable form, and
> > *might* make things more amenable to checking by a machine. It's also
> > less macro cruft.
> > I take it that capturing the ownership status with a runtime flag next
> > to the pointer in a struct is too expensive for your code?
>
> Adding more random thoughts..
>
> I think we can make it more generic like __attribute__((owned))
> so that it can be applied to any pointers. And we can use a
> conventional macro like '__owned' in the declaration..
>
> __owned char *name;
> __owned char *strdup(const char *);
> ...
>
> Thanks
> Namhyung

I have to say I like the idea of a __owned like "modifier" before
these names more than introducing types. David, do you think a patch
with something like the following is reasonable? I'm also throwing
this out there to see if somebody on the linux code side screams and
thinks this is the worst idea ever in existence :-)

compiler.h:
/* In the future __owned and __unowned will be an attribute to allow
static analysis to perform certain correctness checks. For now they
are placeholders to provide documentation. */
#define __owned
#define __unowned
..
evsel.h:
..
struct evsel {
..
__owned char *name;
..
__unowned const char *unit;
..

Thanks,
Ian