2012-02-21 14:54:30

by Stephane Eranian

Subject: [PATCH] perf tools: fix broken perf record -a mode


The following commit:
b52956c perf tools: Allow multiple threads or processes in record, stat, top

introduced a bug in the thread_map code which caused
perf record -a to not set up system-wide monitoring properly.

$ taskset -c 1 noploop 1000 &
$ perf record -a -C 1 sleep 10
$ perf report -D | tail -20
cycles stats:
TOTAL events: 4413
MMAP events: 4025
COMM events: 340
SAMPLE events: 48

Here I was expecting about 10,000 samples and not 48.

In system-wide mode, the PID passed to perf_event_open()
must be -1 and it was 0. That caused the kernel to set up
a per-process event on PID:0. Consequently, the number
of samples captured does not correspond to the requested
measurement.
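
As a rough illustration of the pid/cpu rule described above, here is a minimal sketch of how the (pid, cpu) pair is interpreted according to the perf_event_open(2) man page. The enum and the function classify_target() are hypothetical names made up for this sketch; they are not part of perf or the kernel. Values of pid below -1 are simply treated as invalid here, which is an assumption of the sketch.

```c
#include <sys/types.h>

/* Hypothetical names for this sketch; not part of perf or the kernel. */
enum target_kind {
	TARGET_INVALID,		/* pid == -1 and cpu == -1 is rejected */
	TARGET_SELF_ANY_CPU,	/* pid == 0,  cpu == -1 */
	TARGET_SELF_ONE_CPU,	/* pid == 0,  cpu >= 0  */
	TARGET_TASK_ANY_CPU,	/* pid > 0,   cpu == -1 */
	TARGET_TASK_ONE_CPU,	/* pid > 0,   cpu >= 0  */
	TARGET_CPU_WIDE,	/* pid == -1, cpu >= 0: every task on that CPU */
};

/* Classify what a (pid, cpu) pair asks perf_event_open() to measure. */
enum target_kind classify_target(pid_t pid, int cpu)
{
	/* Assumption of this sketch: anything below -1 is invalid. */
	if (pid < -1 || cpu < -1)
		return TARGET_INVALID;
	if (pid == -1)
		return cpu >= 0 ? TARGET_CPU_WIDE : TARGET_INVALID;
	if (pid == 0)
		return cpu >= 0 ? TARGET_SELF_ONE_CPU : TARGET_SELF_ANY_CPU;
	return cpu >= 0 ? TARGET_TASK_ONE_CPU : TARGET_TASK_ANY_CPU;
}
```

With -a -C 1, the intended call corresponds to classify_target(-1, 1), i.e. TARGET_CPU_WIDE. With the bug, the call looked like classify_target(0, 1), i.e. TARGET_SELF_ONE_CPU, which would be consistent with the very low sample count.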

The following one-liner fixes the problem for me with or
without -C.

I would also suggest changing the malloc() to something
that matches the struct definition. thread_map->map[] is
declared as int map[] and not pid_t map[]. If map[] can
only contain pids, then change the struct definition.

Signed-off-by: Stephane Eranian <[email protected]>
---

diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index e15983c..84d9bd78 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -229,7 +229,7 @@ static struct thread_map *thread_map__new_by_tid_str(const char *tid_str)
 	if (!tid_str) {
 		threads = malloc(sizeof(*threads) + sizeof(pid_t));
 		if (threads != NULL) {
-			threads->map[1] = -1;
+			threads->map[0] = -1;
 			threads->nr = 1;
 		}
 		return threads;
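
For the malloc() suggestion above, one possible shape is to size the allocation from the array member itself, so it stays correct whether map[] is int or pid_t. This is only a sketch: struct thread_map_sketch is a simplified stand-in whose layout is assumed from the description in this mail, and both helper names are hypothetical.

```c
#include <stdlib.h>

/*
 * Simplified stand-in for perf's struct thread_map.
 * Layout assumed from the mail: a count plus a flexible int array.
 */
struct thread_map_sketch {
	int nr;
	int map[];
};

/* Allocate room for nr entries, sized from the element type of map[]. */
struct thread_map_sketch *thread_map_sketch__alloc(int nr)
{
	struct thread_map_sketch *threads;

	if (nr < 0)
		return NULL;

	/* sizeof does not evaluate its operand, so this is safe before malloc(). */
	threads = malloc(sizeof(*threads) +
			 (size_t)nr * sizeof(threads->map[0]));
	if (threads != NULL)
		threads->nr = nr;
	return threads;
}

/* Build the one-entry map used for system-wide mode: map[0] = -1. */
struct thread_map_sketch *thread_map_sketch__new_all(void)
{
	struct thread_map_sketch *threads = thread_map_sketch__alloc(1);

	if (threads != NULL)
		threads->map[0] = -1;	/* index 0 is the only valid slot when nr == 1 */
	return threads;
}
```

With a one-entry allocation, map[1] is already out of bounds, so the original store wrote past the buffer and left map[0] uninitialized.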


2012-02-21 15:32:04

by David Ahern

Subject: Re: [PATCH] perf tools: fix broken perf record -a mode

On 2/21/12 7:54 AM, Stephane Eranian wrote:
>
> The following commit:
> b52956c perf tools: Allow multiple threads or processes in record, stat, top
>
> introduced a bug in the thread_map code which caused
> perf record -a to not set up system-wide monitoring properly.
>
> $ taskset -c 1 noploop 1000&
> $ perf record -a -C 1 sleep 10
> $ perf report -D | tail -20
> cycles stats:
> TOTAL events: 4413
> MMAP events: 4025
> COMM events: 340
> SAMPLE events: 48
>
> Here I was expecting about 10,000 samples and not 48.
>
> In system-wide mode, the PID passed to perf_event_open()
> must be -1 and it was 0. That caused the kernel to set up
> a per-process event on PID:0. Consequently, the number
> of samples captured does not correspond to the requested
> measurement.
>
> The following one-liner fixes the problem for me with or
> without -C.
>
> I would also suggest changing the malloc() to something
> that matches the struct definition. thread_map->map[] is
> declared as int map[] and not pid_t map[]. If map[] can
> only contain pids, then change the struct definition.
>
> Signed-off-by: Stephane Eranian<[email protected]>
> ---
>
> diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
> index e15983c..84d9bd78 100644
> --- a/tools/perf/util/thread_map.c
> +++ b/tools/perf/util/thread_map.c
> @@ -229,7 +229,7 @@ static struct thread_map *thread_map__new_by_tid_str(const char *tid_str)
>  	if (!tid_str) {
>  		threads = malloc(sizeof(*threads) + sizeof(pid_t));
>  		if (threads != NULL) {
> -			threads->map[1] = -1;
> +			threads->map[0] = -1;
>  			threads->nr = 1;
>  		}
>  		return threads;

Damn. Hope you did not spend much time chasing it down.

Acked-by: David Ahern <[email protected]>

2012-02-21 17:08:04

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf tools: fix broken perf record -a mode

On Tue, Feb 21, 2012 at 08:31:56AM -0700, David Ahern wrote:
> On 2/21/12 7:54 AM, Stephane Eranian wrote:
>>
>> The following commit:
>> b52956c perf tools: Allow multiple threads or processes in record, stat, top
>>
>> introduced a bug in the thread_map code which caused
>> perf record -a to not set up system-wide monitoring properly.
>>
>> $ taskset -c 1 noploop 1000&
>> $ perf record -a -C 1 sleep 10
>> $ perf report -D | tail -20
>> cycles stats:
>> TOTAL events: 4413
>> MMAP events: 4025
>> COMM events: 340
>> SAMPLE events: 48
>>
>> Here I was expecting about 10,000 samples and not 48.
>>
>> In system-wide mode, the PID passed to perf_event_open()
>> must be -1 and it was 0. That caused the kernel to set up
>> a per-process event on PID:0. Consequently, the number
>> of samples captured does not correspond to the requested
>> measurement.
>>
>> The following one-liner fixes the problem for me with or
>> without -C.

>> I would also suggest changing the malloc() to something
>> that matches the struct definition. thread_map->map[] is
>> declared as int map[] and not pid_t map[]. If map[] can
>> only contain pids, then change the struct definition.

Stephane,

Feel free to submit a patch :-)

>> Signed-off-by: Stephane Eranian<[email protected]>
>>
>> diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
>> index e15983c..84d9bd78 100644
>> --- a/tools/perf/util/thread_map.c
>> +++ b/tools/perf/util/thread_map.c
>> @@ -229,7 +229,7 @@ static struct thread_map *thread_map__new_by_tid_str(const char *tid_str)
>>  	if (!tid_str) {
>>  		threads = malloc(sizeof(*threads) + sizeof(pid_t));
>>  		if (threads != NULL) {
>> -			threads->map[1] = -1;
>> +			threads->map[0] = -1;
>>  			threads->nr = 1;
>>  		}
>>  		return threads;
>
> Damn. Hope you did not spend much time chasing it down.

> Acked-by: David Ahern <[email protected]>

Yeah, this one slipped through my visual inspection :-\

Now I'll pay for this sin by adding an entry in 'perf test' to check
that :-)

- Arnaldo

2012-02-22 16:04:31

by Stephane Eranian

Subject: [tip:perf/core] perf tools: fix broken perf record -a mode

Commit-ID: 6b1bee9035d430c4b4f586df6df4b3f840e89b5b
Gitweb: http://git.kernel.org/tip/6b1bee9035d430c4b4f586df6df4b3f840e89b5b
Author: Stephane Eranian <[email protected]>
AuthorDate: Tue, 21 Feb 2012 15:54:25 +0100
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitDate: Tue, 21 Feb 2012 15:05:43 -0200

perf tools: fix broken perf record -a mode

The following commit:
b52956c perf tools: Allow multiple threads or processes in record, stat, top

introduced a bug in the thread_map code which caused perf record -a to
not set up system-wide monitoring properly.

$ taskset -c 1 noploop 1000 &
$ perf record -a -C 1 sleep 10
$ perf report -D | tail -20
cycles stats:
TOTAL events: 4413
MMAP events: 4025
COMM events: 340
SAMPLE events: 48

Here I was expecting about 10,000 samples and not 48.

In system-wide mode, the PID passed to perf_event_open() must be -1 and
it was 0. That caused the kernel to set up a per-process event on PID:0.
Consequently, the number of samples captured does not correspond to the
requested measurement.

The following one-liner fixes the problem for me with or without -C.

I would also suggest changing the malloc() to something that matches
the struct definition. thread_map->map[] is declared as int map[] and
not pid_t map[]. If map[] can only contain pids, then change the struct
definition.

Acked-by: David Ahern <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20120221145424.GA6757@quad
Signed-off-by: Stephane Eranian <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/util/thread_map.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index e15983c..84d9bd78 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -229,7 +229,7 @@ static struct thread_map *thread_map__new_by_tid_str(const char *tid_str)
 	if (!tid_str) {
 		threads = malloc(sizeof(*threads) + sizeof(pid_t));
 		if (threads != NULL) {
-			threads->map[1] = -1;
+			threads->map[0] = -1;
 			threads->nr = 1;
 		}
 		return threads;