Hi,
These patches fix a segfault seen in some cases when perf stat is exiting.
We don't balance opening/closing of events in all cases, and go out-of-bounds
when we close events. Full details in the patch 2 commit message.
Thanks,
Mark.
Mark Rutland (2):
perf evsel: add per{cpu,thread} close helpers
perf stat: balance opening/closing of events
tools/perf/builtin-stat.c | 24 +++++++++++++++++++++++-
tools/perf/util/evsel.c | 26 ++++++++++++++++++++------
tools/perf/util/evsel.h | 4 ++++
3 files changed, 47 insertions(+), 7 deletions(-)
--
1.9.1
While perf-stat has its own create_perf_stat_counter() helper to open
events, dependent on target configuration, it uses perf_evlist__close()
to close events.
The common perf_evlist__{open,close}() helpers don't consider the target
configuration, and always evsel->cpus even where
create_perf_stat_counter() would have used an empty_cpu_map.
On some systems, the CPU PMU describes cpus under sysfs, and evsel->cpus
may be set even when we open per-thread events. For per-thread events,
we don't use evsel->cpus in create_perf_stat_counter(), though
perf_evlist__close() will. Thus we try to close more events than we
allocated and opened, resulting in segfaults when we go out-of-bounds:
$ perf stat -e armv8_pmuv3/cpu_cycles/ true
*** Error in `./old-perf': free(): invalid next size (fast): 0x00000000263a55c0 ***
Aborted
This problem was introduced by commit:
3df33eff2ba96be4 ("perf stat: Avoid skew when reading events")
... prior to that commit, we closed events as we read them, using the
correct number of CPUs.
This patch fixes the problem by adding a new close_counters() routine
that mirrors create_perf_stat_counter(), ensuring that we always
correctly balance open/close.
Fixes: 3df33eff2ba96be4 ("perf stat: Avoid skew when reading events")
Signed-off-by: Mark Rutland <[email protected]>
Tested-by: Ganapatrao Kulkarni <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: [email protected]
---
tools/perf/builtin-stat.c | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index a935b50..ceedc0a 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -364,6 +364,28 @@ static void read_counters(void)
}
}
+/*
+ * Close all evnt FDs we open in __run_perf_stat() and
+ * create_perf_stat_counter(), taking care to match the number of threads and CPUs.
+ *
+ * Note that perf_evlist__close(evsel_list) is not equivalent, as it doesn't
+ * take the target into account.
+ */
+static void close_counters(void)
+{
+ bool per_cpu = target__has_cpu(&target);
+ struct perf_evsel *evsel;
+
+ evlist__for_each_entry(evsel_list, evsel) {
+ if (per_cpu)
+ perf_evsel__close_per_cpu(evsel,
+ perf_evsel__cpus(evsel));
+ else
+ perf_evsel__close_per_thread(evsel,
+ evsel_list->threads);
+ }
+}
+
static void process_interval(void)
{
struct timespec ts, rs;
@@ -704,7 +726,7 @@ static int __run_perf_stat(int argc, const char **argv)
* group leaders.
*/
read_counters();
- perf_evlist__close(evsel_list);
+ close_counters();
return WEXITSTATUS(status);
}
--
1.9.1
We have perf_evsel__open_per_{cpu,thread}() helpers for opening events,
but we have no corresponding helpers for closing events.
This results in callers having to duplicate logic to determine the
number of cpus and threads when closing an event, and makes it harder
than necessary to determine whether open/close are correctly balanced.
This patch adds new perf_evsel__close_per_{cpu,thread}() helpers, which
can be paired with their open counterpart. A subsequent patch will make
use of these.
For consistency, the functions are shuffled in evsel.c so that the
per-{cpu,thread} variants of open/close immediately follow their
respective common implementation.
Signed-off-by: Mark Rutland <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: [email protected]
---
tools/perf/util/evsel.c | 26 ++++++++++++++++++++------
tools/perf/util/evsel.h | 4 ++++
2 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 0e87909..27abed8 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -1682,6 +1682,18 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
return err;
}
+int perf_evsel__open_per_cpu(struct perf_evsel *evsel,
+ struct cpu_map *cpus)
+{
+ return perf_evsel__open(evsel, cpus, NULL);
+}
+
+int perf_evsel__open_per_thread(struct perf_evsel *evsel,
+ struct thread_map *threads)
+{
+ return perf_evsel__open(evsel, NULL, threads);
+}
+
void perf_evsel__close(struct perf_evsel *evsel, int ncpus, int nthreads)
{
if (evsel->fd == NULL)
@@ -1691,16 +1703,18 @@ void perf_evsel__close(struct perf_evsel *evsel, int ncpus, int nthreads)
perf_evsel__free_fd(evsel);
}
-int perf_evsel__open_per_cpu(struct perf_evsel *evsel,
- struct cpu_map *cpus)
+void perf_evsel__close_per_cpu(struct perf_evsel *evsel,
+ struct cpu_map *cpus)
{
- return perf_evsel__open(evsel, cpus, NULL);
+ int ncpus = cpus ? cpus->nr : 1;
+ perf_evsel__close(evsel, ncpus, 1);
}
-int perf_evsel__open_per_thread(struct perf_evsel *evsel,
- struct thread_map *threads)
+void perf_evsel__close_per_thread(struct perf_evsel *evsel,
+ struct thread_map *threads)
{
- return perf_evsel__open(evsel, NULL, threads);
+ int nthreads = threads ? threads->nr : 1;
+ perf_evsel__close(evsel, 1, nthreads);
}
static int perf_evsel__parse_id_sample(const struct perf_evsel *evsel,
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index d101695..6f61a49 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -257,6 +257,10 @@ int perf_evsel__open_per_thread(struct perf_evsel *evsel,
struct thread_map *threads);
int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
struct thread_map *threads);
+void perf_evsel__close_per_cpu(struct perf_evsel *evsel,
+ struct cpu_map *cpus);
+void perf_evsel__close_per_thread(struct perf_evsel *evsel,
+ struct thread_map *threads);
void perf_evsel__close(struct perf_evsel *evsel, int ncpus, int nthreads);
struct perf_sample;
--
1.9.1
ping?
On Tue, May 23, 2017 at 8:44 PM, Ganapatrao Kulkarni
<[email protected]> wrote:
> On Tue, May 23, 2017 at 3:34 PM, Mark Rutland <[email protected]> wrote:
>> Hi,
>>
>> Does anyone have any comments on these?
>>
>> I'm happy to rebase/resend if necessary.
>>
>> I'd very much like to see this fixed.
>
> arm64 platforms have dependency on this patch set to use latest perf tool.
> Earliest merge/review is much appreciated!!
>
>>
>> Thanks,
>> Mark.
>>
>> On Fri, Apr 28, 2017 at 12:47:08PM +0100, Mark Rutland wrote:
>>> Hi,
>>>
>>> These patches fix a segfault seen in some cases when perf stat is exiting.
>>>
>>> We don't balance opening/closing of events in all cases, and go out-of-bounds
>>> when we close events. Full details in the patch 2 commit message.
>>>
>>> Thanks,
>>> Mark.
>>>
>>> Mark Rutland (2):
>>> perf evsel: add per{cpu,thread} close helpers
>>> perf stat: balance opening/closing of events
>>>
>>> tools/perf/builtin-stat.c | 24 +++++++++++++++++++++++-
>>> tools/perf/util/evsel.c | 26 ++++++++++++++++++++------
>>> tools/perf/util/evsel.h | 4 ++++
>>> 3 files changed, 47 insertions(+), 7 deletions(-)
>>>
>>> --
>>> 1.9.1
>>>
>
> thanks
> Ganapat
The perf tool segmentation fault which we were observing and this
patch series intended to fix it, has been fixed due to change in
default behaviour of perf stat[1][2].
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v4.12&id=0d79f8b93187c771b6971acfaba67f4e2f1e0710
[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=linux-4.11.y&id=0d79f8b93187c771b6971acfaba67f4e2f1e0710
On Fri, Jun 9, 2017 at 10:03 AM, Ganapatrao Kulkarni
<[email protected]> wrote:
> ping?
>
> On Tue, May 23, 2017 at 8:44 PM, Ganapatrao Kulkarni
> <[email protected]> wrote:
>> On Tue, May 23, 2017 at 3:34 PM, Mark Rutland <[email protected]> wrote:
>>> Hi,
>>>
>>> Does anyone have any comments on these?
>>>
>>> I'm happy to rebase/resend if necessary.
>>>
>>> I'd very much like to see this fixed.
>>
>> arm64 platforms have dependency on this patch set to use latest perf tool.
>> Earliest merge/review is much appreciated!!
>>
>>>
>>> Thanks,
>>> Mark.
>>>
>>> On Fri, Apr 28, 2017 at 12:47:08PM +0100, Mark Rutland wrote:
>>>> Hi,
>>>>
>>>> These patches fix a segfault seen in some cases when perf stat is exiting.
>>>>
>>>> We don't balance opening/closing of events in all cases, and go out-of-bounds
>>>> when we close events. Full details in the patch 2 commit message.
>>>>
>>>> Thanks,
>>>> Mark.
>>>>
>>>> Mark Rutland (2):
>>>> perf evsel: add per{cpu,thread} close helpers
>>>> perf stat: balance opening/closing of events
>>>>
>>>> tools/perf/builtin-stat.c | 24 +++++++++++++++++++++++-
>>>> tools/perf/util/evsel.c | 26 ++++++++++++++++++++------
>>>> tools/perf/util/evsel.h | 4 ++++
>>>> 3 files changed, 47 insertions(+), 7 deletions(-)
>>>>
>>>> --
>>>> 1.9.1
>>>>
>>
>> thanks
>> Ganapat
thanks
Ganapat