2012-05-15 15:25:45

by Dmitry Antipov

Subject: Perf record format portability

Hello,

Are there any thoughts on how much of perf.data is portable, and how much of it should be?
I'm interested in recording scheduler activity on one machine and then replaying it on
another. As far as I can see, replaying x86 perf.data on ARM doesn't work. Should it at
least work with a small subset of recorded events (for example, sched:sched_switch,
sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup
and sched:sched_migrate_task) on the same architecture?

Thanks in advance,
Dmitry


2012-05-15 15:52:02

by Arnaldo Carvalho de Melo

Subject: Re: Perf record format portability

Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu:
> Hello,
>
> are there any thoughts on how much of the perf.data is portable and how much it should be?
> I'm interesting in recording scheduler activity on one machine and then replaying on
> another. As I can see, replaying x86 perf.data on ARM doesn't work. At least, should it
> work with a small subset of recorded events (for example, sched:sched_switch,
> sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup
> and sched:sched_migrate_task) on the same architecture?

Endianness issues? ARM EB? There are some patches by Jiri Olsa that may
help you if that is the case.

It should be portable. Are you using 'perf archive' too?

What exactly is the error experienced?

- Arnaldo

2012-05-16 10:48:27

by Dmitry Antipov

Subject: Re: Perf record format portability

On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote:
> Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu:
>> Hello,
>>
>> are there any thoughts on how much of the perf.data is portable and how much it should be?
>> I'm interesting in recording scheduler activity on one machine and then replaying on
>> another. As I can see, replaying x86 perf.data on ARM doesn't work. At least, should it
>> work with a small subset of recorded events (for example, sched:sched_switch,
>> sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup
>> and sched:sched_migrate_task) on the same architecture?
>
> Endianness issues? ARM EB? There are some patches by Jiri Olsa that may
> help you if that is the case.

Thanks, I will take a look.

> It should be portable, are you using 'perf archive' too?

It doesn't work, and fails with cryptic messages like:

tar: .build-id/17/d6ca02b2c31df54bf62a4142c47e3c99a9eedf: Cannot stat: No such file or directory

creating empty archive.

> What exactly is the error experienced?

Now I'm facing a simple problem with event IDs, which may differ from machine to
machine. For example, /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM
board and 279 on my PC host, so 'perf report' displays all event names as "unknown:unknown",
even with --kallsyms=XXX where XXX is 'cat /proc/kallsyms > XXX' from the PC host.

Dmitry

2012-05-16 14:59:35

by Arnaldo Carvalho de Melo

Subject: Re: Perf record format portability

Adding Jiri and Steven to the CC list.

Em Wed, May 16, 2012 at 02:50:31PM +0400, Dmitry Antipov escreveu:
> On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote:
> >Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu:
> >>are there any thoughts on how much of the perf.data is portable and how much it should be?
> >>I'm interesting in recording scheduler activity on one machine and then replaying on
> >>another. As I can see, replaying x86 perf.data on ARM doesn't work. At least, should it
> >>work with a small subset of recorded events (for example, sched:sched_switch,
> >>sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup
> >>and sched:sched_migrate_task) on the same architecture?
> >
> >Endianness issues? ARM EB? There are some patches by Jiri Olsa that may
> >help you if that is the case.
>
> Thanks, will look at.
>
> >It should be portable, are you using 'perf archive' too?
>
> It doesn't work with cryptic messages like:
>
> tar: .build-id/17/d6ca02b2c31df54bf62a4142c47e3c99a9eedf: Cannot stat: No such file or directory

It is basically a shell script. After you collect your events with
something like:

[acme@sandy ~]$ perf record -F 10000 sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.021 MB perf.data (~917 samples) ]

The resulting perf.data file will have samples taken on these DSOs,
with those respective hashes identifying each one:

[acme@sandy ~]$ perf buildid-list
4390a3d2dc84c37a8923ba4c910d6766abc42cbf [kernel.kallsyms]
ceb82e745b0ab8bb7ea28c068327be1fb068c923 /lib64/ld-2.12.so
e731c64000993d1fd1b443e6d5d6972d149440e8 /lib64/libc-2.12.so
[acme@sandy ~]$

In your case we can see that it is looking for build id
17d6ca02b2c31df54bf62a4142c47e3c99a9eedf on the build id cache.

You are probably either running 'perf archive' on a different machine
than the one where you ran 'perf record', or using a different user on
the same machine, or (unlikely) you removed ~/.debug/ after
'record'.
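
As a rough illustration of the lookup that fails here, the sketch below derives the cache entry for a build id and reports which ids are missing. It is written in Python for brevity and is not the 'perf archive' script itself. The function names are hypothetical, and the `.build-id/<first two hex digits>/<rest>` layout under `~/.debug/` is an assumption inferred from the tar message and the cache directory mentioned above.

```python
import os


def build_id_cache_path(build_id, cache_dir="~/.debug"):
    """Return the cache entry expected for a build id.

    Assumed layout: <cache_dir>/.build-id/<first 2 hex digits>/<remaining digits>.
    """
    build_id = build_id.strip().lower()
    if len(build_id) < 3 or any(c not in "0123456789abcdef" for c in build_id):
        raise ValueError("expected a hexadecimal build id")
    return os.path.join(
        os.path.expanduser(cache_dir), ".build-id", build_id[:2], build_id[2:]
    )


def missing_build_ids(build_ids, cache_dir="~/.debug"):
    """List the build ids that have no entry under the cache directory."""
    return [
        b for b in build_ids
        if not os.path.lexists(build_id_cache_path(b, cache_dir))
    ]
```

Feeding it the hashes printed by 'perf buildid-list' on the machine (and as the user) where 'perf archive' is run should show whether the cache is the problem.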

The 'perf archive' tool was done quickly, just as a proof of concept;
admittedly it needs to be improved to help diagnose these problems.

> creating empty archive.
>
> >What exactly is the error experienced?
>
> Now I'm facing the simple problem with event IDs, which may be different from machine to
> machine. For example, /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM
> board and 279 on my PC host, so 'perf report' displays all event names like "unknown:unknown",
> even with --kallsyms=XXX where XXX is 'cat /proc/kallsyms > XXX' from PC host.

With build-ids and 'perf archive' you shouldn't need to specify
kallsyms: it has a build-id and will be collected (record + archive) and
then transferred and expanded on the analysis machine (scp + tar xvf).

The tracing part even stashes a copy of kallsyms in perf.data (not
needed, but there for historical reasons). The problem is in translating
the perf_event_attr.config to the same name and format as on the machine
where you collected the events.

Steve,

Was the kernel trace events infrastructure designed with that in
mind? I.e. cross analysis? I must be missing something here, still
ENOCOFFEE :-\

When doing cross-arch event analysis I tested:

PERF_TYPE_HARDWARE = 0,
PERF_TYPE_SOFTWARE = 1,
PERF_TYPE_HW_CACHE = 3,

Not:

PERF_TYPE_TRACEPOINT = 2,
PERF_TYPE_RAW = 4,
PERF_TYPE_BREAKPOINT = 5,

- Arnaldo

2012-05-16 15:17:12

by Jiri Olsa

Subject: Re: Perf record format portability

On Wed, May 16, 2012 at 11:59:27AM -0300, Arnaldo Carvalho de Melo wrote:
> Adding Jiri and Steven to the CC list.
>
> Em Wed, May 16, 2012 at 02:50:31PM +0400, Dmitry Antipov escreveu:
> > On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote:
> > >Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu:
> > >>are there any thoughts on how much of the perf.data is portable and how much it should be?
> > >>I'm interesting in recording scheduler activity on one machine and then replaying on
> > >>another. As I can see, replaying x86 perf.data on ARM doesn't work. At least, should it
> > >>work with a small subset of recorded events (for example, sched:sched_switch,
> > >>sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup
> > >>and sched:sched_migrate_task) on the same architecture?
> > >
> > >Endianness issues? ARM EB? There are some patches by Jiri Olsa that may
> > >help you if that is the case.

The latest version was sent today; it includes a description of the tests I did:
http://marc.info/?l=linux-kernel&m=133715172512742&w=2

Each time I run a new sort of test, another endianness issue is hit.
So, tracepoints... I'll check ;)

jirka

2012-05-16 15:50:55

by Arnaldo Carvalho de Melo

Subject: Re: Perf record format portability

Em Wed, May 16, 2012 at 05:16:55PM +0200, Jiri Olsa escreveu:
> On Wed, May 16, 2012 at 11:59:27AM -0300, Arnaldo Carvalho de Melo wrote:
> > Adding Jiri and Steven to the CC list.
> >
> > Em Wed, May 16, 2012 at 02:50:31PM +0400, Dmitry Antipov escreveu:
> > > On 05/15/2012 07:51 PM, Arnaldo Carvalho de Melo wrote:
> > > >Em Tue, May 15, 2012 at 07:27:39PM +0400, Dmitry Antipov escreveu:
> > > >>are there any thoughts on how much of the perf.data is portable and how much it should be?
> > > >>I'm interesting in recording scheduler activity on one machine and then replaying on
> > > >>another. As I can see, replaying x86 perf.data on ARM doesn't work. At least, should it
> > > >>work with a small subset of recorded events (for example, sched:sched_switch,
> > > >>sched:sched_process_exit, sched:sched_process_fork, sched:sched_wakeup
> > > >>and sched:sched_migrate_task) on the same architecture?
> > > >
> > > >Endianness issues? ARM EB? There are some patches by Jiri Olsa that may
> > > >help you if that is the case.
>
> latest version sent today, there's description of tests I did:
> http://marc.info/?l=linux-kernel&m=133715172512742&w=2
>
> Each time I run new sort of test, another endianity issue is hit.
> so, tracepoints.. I'll check ;)

The tracepoints part is a different problem, I think, but take a look
anyway ;-)

- Arnaldo

2012-05-16 16:58:34

by Steven Rostedt

Subject: Re: Perf record format portability

On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:

> Steve,
>
> Was the kernel trace events infrastructure designed with that in
> mind? I.e. cross analysis? I must be missing something here, still
> ENOCOFFEE :-\

Yes, the libparsevents library was designed for this from day one. That's
why a trace-cmd data file can be recorded on ARM and read on x86, or PPC, or
whatever. I did all my development testing against 32-bit, 64-bit, and big
and little endian. This was the case from the beginning.

-- Steve

>
> When doing cross arch event analisys I tested:
>
> PERF_TYPE_HARDWARE = 0,
> PERF_TYPE_SOFTWARE = 1,
> PERF_TYPE_HW_CACHE = 3,
>
> Not:
>
> PERF_TYPE_TRACEPOINT = 2,
> PERF_TYPE_RAW = 4,
> PERF_TYPE_BREAKPOINT = 5,
>
> - Arnaldo

2012-05-16 18:09:28

by Arnaldo Carvalho de Melo

Subject: Re: Perf record format portability

Em Wed, May 16, 2012 at 12:58:23PM -0400, Steven Rostedt escreveu:
> On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:
> > Was the kernel trace events infrastructure designed with that in
> > mind? I.e. cross analysis? I must be missing something here, still
> > ENOCOFFEE :-\
>
> Yes, the libparsevents library was design for this from day one. That's
> why trace-cmd data file can be run on an ARM and read on x86, or PPC, or
> whatever. I did all my development testing against 32bit, 64bit and big
> and little endian. This was the case from the beginning.

I need to look at the code, but how does it do this? Copy the relevant
/sys/kernel/debug/events formats in the header and then instead of
looking at /sys/... look at those?

Does it still copy /proc/kallsyms?

- Arnaldo

2012-05-16 18:17:32

by Steven Rostedt

Subject: Re: Perf record format portability

On Wed, 2012-05-16 at 15:08 -0300, Arnaldo Carvalho de Melo wrote:
> Em Wed, May 16, 2012 at 12:58:23PM -0400, Steven Rostedt escreveu:
> > On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:
> > > Was the kernel trace events infrastructure designed with that in
> > > mind? I.e. cross analysis? I must be missing something here, still
> > > ENOCOFFEE :-\
> >
> > Yes, the libparsevents library was design for this from day one. That's
> > why trace-cmd data file can be run on an ARM and read on x86, or PPC, or
> > whatever. I did all my development testing against 32bit, 64bit and big
> > and little endian. This was the case from the beginning.
>
> I need to look at the code, but how does it do this? Copy the relevant
> /sys/kernel/debug/events formats in the header and then instead of
> looking at /sys/... look at those?

It does copy the events from .../debug/tracing/events. But it does cheat
about the bits. To determine the long size, it looks at
/sys/kernel/debug/tracing/events/header_page and the size of the "commit"
field. On 32-bit machines that's 4 bytes, and on 64-bit that's 8 bytes.

For endianness, that is determined on the machine that the recording is
running on and stored in the file.

The parse-events structure has a way to record the endianness and long
size, for later retrieval.
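
To make the "commit" trick concrete, here is a minimal Python sketch of reading the long size out of a saved copy of header_page. It is not the parse-events code. The sample text and the `field: <type> <name>; offset:N; size:N;` line layout are assumptions made for illustration (the real file varies by kernel and architecture), and the function name is hypothetical.

```python
import re

# Illustrative sample only; the real header_page text depends on the kernel.
SAMPLE_HEADER_PAGE_64 = (
    "\tfield: u64 timestamp;\toffset:0;\tsize:8;\tsigned:0;\n"
    "\tfield: local_t commit;\toffset:8;\tsize:8;\tsigned:1;\n"
    "\tfield: char data;\toffset:16;\tsize:4080;\tsigned:1;\n"
)

# One field description per line: type, name, offset and size.
FIELD_RE = re.compile(
    r"field:\s*(?P<type>.+?)\s+(?P<name>\w+);\s*"
    r"offset:(?P<offset>\d+);\s*size:(?P<size>\d+);"
)


def long_size_from_header_page(text):
    """Return the size in bytes of the 'commit' field (4 or 8 in practice)."""
    for match in FIELD_RE.finditer(text):
        if match.group("name") == "commit":
            return int(match.group("size"))
    raise ValueError("no 'commit' field found in header_page text")
```

A reader would store this value next to the recorded endianness and use both when decoding raw event payloads.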

>
> Does it still copy /proc/kallsyms?

Yes it does.

-- Steve

2012-05-16 18:36:39

by Jiri Olsa

Subject: Re: Perf record format portability

On Wed, May 16, 2012 at 12:58:23PM -0400, Steven Rostedt wrote:
> On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:
>
> > Steve,
> >
> > Was the kernel trace events infrastructure designed with that in
> > mind? I.e. cross analysis? I must be missing something here, still
> > ENOCOFFEE :-\
>
> Yes, the libparsevents library was design for this from day one. That's
> why trace-cmd data file can be run on an ARM and read on x86, or PPC, or
> whatever. I did all my development testing against 32bit, 64bit and big
> and little endian. This was the case from the beginning.

For ppc64 (record) vs x86_64 (report) I got the following report on the latest tip:

[jolsa@dhcp-26-214 test]$ ../perf report > report.target
Endianness of raw data not corrected!
Warning:
718 samples with id not present in the header
Warning:
The perf.data file has no samples!

for the following record:
perf record -a -e sched:sched_switch -e sched:sched_process_exit -e sched:sched_process_fork -e sched:sched_wakeup -- sleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.178 MB perf.data (~7781 samples) ]

I haven't tried trace-cmd, but I guess let's wait for libparsevents
perf integration then.. ;)

jirka

2012-05-16 19:32:27

by Steven Rostedt

Subject: Re: Perf record format portability

On Wed, 2012-05-16 at 19:48 +0200, Jiri Olsa wrote:

> for ppc64(record) vs x86_64(report) I got following report on latest tip:
>
> [jolsa@dhcp-26-214 test]$ ../perf report > report.target
> Endianness of raw data not corrected!
> Warning:
> 718 samples with id not present in the header
> Warning:
> The perf.data file has no samples!
>
> for following record:
> perf record -a -e sched:sched_switch -e sched:sched_process_exit -e sched:sched_process_fork -e sched:sched_wakeup -- sleep 10
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.178 MB perf.data (~7781 samples) ]
>
> I haven't tried trace-cmd, but I guess let's wait for libparsevents
> perf integration then.. ;)
>

It's in perf. It just needs to be set up.

Look at tools/perf/util/trace-event.h

There's a bigendian() function, a "file_bigendian" and a
"host_bigendian". If perf recorded which endianness was used on the target,
and saved that in the perf.data file, all it needs to do is update the
two variables.

file_bigendian = recorded_endian;
host_bigendian = bigendian();

1 for big endian, 0 for little endian.

Here, host is the machine that is running perf report or perf script.
After that, all reads of the data in events use one of the
__data2host() macros to convert if necessary.
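
The same idea can be sketched outside of perf. The Python below is only an analogue of the variables and macros described above, not the code in tools/perf/util/trace-event.h; `host_is_bigendian`, `needs_swap`, and `data2host` are hypothetical names chosen to mirror them.

```python
import sys


def host_is_bigendian():
    """True when the machine running the analysis is big endian."""
    return sys.byteorder == "big"


def needs_swap(file_bigendian):
    """True when the recorded byte order differs from the host's."""
    return bool(file_bigendian) != host_is_bigendian()


def data2host(raw, file_bigendian):
    """Decode an unsigned integer stored in the recording machine's byte order.

    The result is an ordinary host integer, so callers never need to know
    whether a byte swap actually took place.
    """
    return int.from_bytes(raw, "big" if file_bigendian else "little")
```

For example, the four bytes `00 00 00 22` decode to 0x22 when the file is marked big endian, and to 0x22000000 if they are wrongly treated as little endian, which is the kind of mismatch that makes ids and configs unrecognizable.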

Note, latest trace-cmd has put all these in a pevent struct descriptor,
so that different files can be read at the same time, and these files
can be from different endian (and bit size) machines. The global
variables no longer exist.

My patches, that I and Frederic posted previously, convert perf to use
this descriptor so that perf could benefit and read multiple files too.

-- Steve

2012-05-16 19:39:17

by Steven Rostedt

Subject: Re: Perf record format portability

On Wed, 2012-05-16 at 15:32 -0400, Steven Rostedt wrote:
> On Wed, 2012-05-16 at 19:48 +0200, Jiri Olsa wrote:
>
> > for ppc64(record) vs x86_64(report) I got following report on latest tip:
> >
> > [jolsa@dhcp-26-214 test]$ ../perf report > report.target
> > Endianness of raw data not corrected!
> > Warning:
> > 718 samples with id not present in the header
> > Warning:
> > The perf.data file has no samples!
> >

What does perf script give you? It looks like Frederic took my code for
this when he ported the original parse-events over to perf. I see the
setup of these variables in tools/perf/util/trace-event-read.c

If you run 'perf script' on x86 on a ppc perf.data file, do you still
get the same errors?

-- Steve

2012-05-17 05:08:51

by Dmitry Antipov

Subject: Re: Perf record format portability

On 05/16/2012 08:58 PM, Steven Rostedt wrote:

> On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:
>
>> Steve,
>>
>> Was the kernel trace events infrastructure designed with that in
>> mind? I.e. cross analysis? I must be missing something here, still
>> ENOCOFFEE :-\
>
> Yes, the libparsevents library was design for this from day one. That's
> why trace-cmd data file can be run on an ARM and read on x86, or PPC, or
> whatever. I did all my development testing against 32bit, 64bit and big
> and little endian. This was the case from the beginning.

I didn't run into big/little endian conversion issues; most probably both x86 and
my ARM board are little endian :-).

But the original question was about event IDs. For example,
/sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM board
and 279 on my PC host, so 'perf report' displays "unknown:unknown" instead
of the expected "sched:sched_switch" when attempting to do some cross-analysis.
I suppose that the original event IDs should be preserved, either within perf.data
or by providing a copy of the original /sys/kernel/debug/tracing/*, much like
it's done with --kallsyms to resolve kernel symbols.
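
A minimal sketch of the second option, assuming a copy of the recording machine's tracing events directory is available on the analysis machine: walk `<system>/<event>/id` and build an id-to-name table. The function names are hypothetical, and this is plain Python for illustration, not something perf provides.

```python
import os


def load_event_ids(events_dir):
    """Map tracepoint id -> 'system:event' from a copied events directory."""
    ids = {}
    for system in sorted(os.listdir(events_dir)):
        system_path = os.path.join(events_dir, system)
        if not os.path.isdir(system_path):
            continue  # skip plain files at the top level
        for event in sorted(os.listdir(system_path)):
            id_path = os.path.join(system_path, event, "id")
            if os.path.isfile(id_path):
                with open(id_path) as f:
                    ids[int(f.read().strip())] = system + ":" + event
    return ids


def event_name(event_id, ids):
    """Resolve an id, falling back to the placeholder perf report prints."""
    return ids.get(event_id, "unknown:unknown")
```

With a table built from the ARM board's copy, id 55 would resolve to sched:sched_switch; looked up in a table built from the PC host, the same id would not.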

Dmitry

2012-05-17 08:51:26

by Jiri Olsa

Subject: Re: Perf record format portability

On Wed, May 16, 2012 at 03:39:14PM -0400, Steven Rostedt wrote:
> On Wed, 2012-05-16 at 15:32 -0400, Steven Rostedt wrote:
> > On Wed, 2012-05-16 at 19:48 +0200, Jiri Olsa wrote:
> >
> > > for ppc64(record) vs x86_64(report) I got following report on latest tip:
> > >
> > > [jolsa@dhcp-26-214 test]$ ../perf report > report.target
> > > Endianness of raw data not corrected!
> > > Warning:
> > > 718 samples with id not present in the header
> > > Warning:
> > > The perf.data file has no samples!
> > >
>
> What does perf script give you. It looks like Frederic took my code for
> this when he ported the original parse-events over to perf. I see the
> setup of these variables in tools/perf/util/trace-event-read.c
>
> If you run 'perf script' on x86 from a ppc perf.dat file, do you still
> get the same errors?

yes

---
[jolsa@dhcp-26-214 test]$ ../perf script
Endianness of raw data not corrected!
Warning:
718 samples with id not present in the header
# ========
# captured on: Wed May 16 19:53:13 2012
# hostname : ibm-js22-vios-02-lp1.rhts.eng.bos.redhat.com
# os release : 2.6.32-270.el6.ppc64
# perf version : 2.6.32-270.el6.ppc64.debug
# arch : ppc64
# nrcpus online : 8
# nrcpus avail : 8
# cpudesc : POWER6 (architected), altivec supported
# cpuid : 62,769
# total memory : 6236992 kB
# cmdline : /usr/bin/perf record -a -e sched:sched_switch -e
# sched:sched_process_exit -e sched:sched_process_fork -e
# sched:sched_wakeup -- sleep 10
# event : name = sched:sched_switch, type = 2, config = 0x22, config1 =
# 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 97, 98, 99,
# 100, 101, 102, 103, 104 }
# event : name = sched:sched_process_exit, type = 2, config = 0x1b,
# config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 105,
# 106, 107, 108, 109, 110, 111, 112 }
# event : name = sched:sched_process_fork, type = 2, config = 0x1d,
# config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 113,
# 114, 115, 116, 117, 118, 119, 120 }
# event : name = sched:sched_wakeup, type = 2, config = 0x17, config1 =
# 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, id = { 121, 122, 123,
# 124, 125, 126, 127, 128 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# ========
#
---

jirka

2012-05-17 11:48:50

by Steven Rostedt

Subject: Re: Perf record format portability

On Thu, 2012-05-17 at 09:10 +0400, Dmitry Antipov wrote:
> On 05/16/2012 08:58 PM, Steven Rostedt wrote:
>
> > On Wed, 2012-05-16 at 11:59 -0300, Arnaldo Carvalho de Melo wrote:
> >
> >> Steve,
> >>
> >> Was the kernel trace events infrastructure designed with that in
> >> mind? I.e. cross analysis? I must be missing something here, still
> >> ENOCOFFEE :-\
> >
> > Yes, the libparsevents library was design for this from day one. That's
> > why trace-cmd data file can be run on an ARM and read on x86, or PPC, or
> > whatever. I did all my development testing against 32bit, 64bit and big
> > and little endian. This was the case from the beginning.
>
> I didn't face with big/little conversion issues, most probably both x86 and
> my ARM board are of the same (little) endian :-).
>
> But the original question was about event IDs. For example,
> /sys/kernel/debug/tracing/events/sched/sched_switch/id is 55 on my ARM board
> and 279 on my PC host, so 'perf report' displays "unknown:unknown" instead
> of expected "sched:sched_switch" when attempting to do some cross-analysis.
> I suppose that original event IDs should be preserved, either within perf.data
> or by providing the copy of original /sys/kernel/debug/tracing/*, much like
> it's done with --kallsyms to resolve kernel symbols.

trace-cmd copies the entire /sys/kernel/debug/tracing/events directory
into the data file (well, it copies only the events you specify). I
thought perf did the same. It should be using what's in the perf.data
file and not what's on the host.

Again, perf report is not what uses the events from trace-cmd. It's perf
script that does. If perf script works, then perf report needs to be
fixed, but only after it gets updated to use the latest libparse-events,
and I have no idea when that will happen.

-- Steve

2012-05-18 05:46:23

by Dmitry Antipov

Subject: Re: Perf record format portability

On 05/17/2012 03:48 PM, Steven Rostedt wrote:

> trace-cmd copies the entire /sys/kernel/debug/tracing/events directory
> into the data file (well it copies only the events you specify).
> I thought perf did the same. It should be using what's in the perf.dat
> file and not what's on the host.

I found that 'perf script' and 'perf report' work differently,
and I suppose 'perf script' is correct and 'perf report' isn't.

What I'm doing on the PC host is:

1) Collect data with:
perf record -a -R -f -m 8192 -c 1 -e sched:sched_switch \
-e sched:sched_process_exit -e sched:sched_process_fork \
-e sched:sched_wakeup -e sched:sched_migrate_task [task]
2) Collect the output of 'perf script' and 'perf report'; both look
great.
3) Copy perf.data and the contents of /proc/kallsyms to the ARM target.

4) Next, on the ARM target:
perf script --kallsyms=[kallsyms from PC host] -i [perf.data from PC host]
This looks good: all event names like 'sched_wakeup' or 'sched_switch' are shown.
5) Try:
perf report --kallsyms=[kallsyms from PC host] -i [perf.data from PC host] --stdio
All event names are shown as 'unknown:unknown'.

"Cross-replaying" (perf sched replay) looks broken too.
Host results are:

run measurement overhead: 260 nsecs
sleep measurement overhead: 56109 nsecs
the run test took 1000054 nsecs
the sleep test took 1076170 nsecs
nr_run_events: 246
nr_sleep_events: 257
nr_wakeup_events: 123
target-less wakeups: 27
task 0 ( <unknown>: 3440), nr_events: 33
task 1 ( kworker/0:0: 3227), nr_events: 15
task 2 ( <unknown>: 0), nr_events: 125
task 3 ( plugin-containe: 1769), nr_events: 13
task 4 ( ksoftirqd/0: 3), nr_events: 5
task 5 ( kworker/2:2: 2023), nr_events: 3
task 6 ( perf: 3441), nr_events: 200
task 7 ( migration/2: 3091), nr_events: 3
task 8 ( kworker/1:0: 3104), nr_events: 158
task 9 ( urxvt: 2952), nr_events: 95
task 10 ( ksoftirqd/2: 3093), nr_events: 3
------------------------------------------------------------
#1 : 70.193, ravg: 70.19, cpu: 116.57 / 116.57
#2 : 70.607, ravg: 70.23, cpu: 116.61 / 116.58
#3 : 70.411, ravg: 70.25, cpu: 116.69 / 116.59
#4 : 70.386, ravg: 70.27, cpu: 116.72 / 116.60
#5 : 70.222, ravg: 70.26, cpu: 116.39 / 116.58
#6 : 70.361, ravg: 70.27, cpu: 116.40 / 116.56
#7 : 70.409, ravg: 70.28, cpu: 116.43 / 116.55
#8 : 70.368, ravg: 70.29, cpu: 116.50 / 116.55
#9 : 70.604, ravg: 70.32, cpu: 116.75 / 116.57
#10 : 70.578, ravg: 70.35, cpu: 116.79 / 116.59

The cross-replaying attempt ('perf sched -i [perf.data from PC host] replay') gives:

run measurement overhead: 8099 nsecs
sleep measurement overhead: 159428 nsecs
the run test took 998913 nsecs
the sleep test took 1188048 nsecs
nr_run_events: 0
nr_sleep_events: 0
nr_wakeup_events: 0
------------------------------------------------------------
#1 : 0.058, ravg: 0.06, cpu: 0.00 / 0.00
#2 : 0.105, ravg: 0.06, cpu: 0.00 / 0.00
#3 : 0.027, ravg: 0.06, cpu: 0.00 / 0.00
#4 : 0.026, ravg: 0.06, cpu: 0.00 / 0.00
#5 : 0.035, ravg: 0.05, cpu: 0.00 / 0.00
#6 : 0.027, ravg: 0.05, cpu: 0.00 / 0.00
#7 : 0.027, ravg: 0.05, cpu: 0.00 / 0.00
#8 : 0.028, ravg: 0.05, cpu: 0.00 / 0.00
#9 : 0.029, ravg: 0.04, cpu: 0.00 / 0.00
#10 : 0.028, ravg: 0.04, cpu: 0.00 / 0.00

Dmitry

2012-05-29 15:10:30

by Arnaldo Carvalho de Melo

Subject: Re: Perf record format portability

Em Fri, May 18, 2012 at 09:48:26AM +0400, Dmitry Antipov escreveu:
> On 05/17/2012 03:48 PM, Steven Rostedt wrote:
>
> >trace-cmd copies the entire /sys/kernel/debug/tracing/events directory
> >into the data file (well it copies only the events you specify).
> >I thought perf did the same. It should be using what's in the perf.dat
> >file and not what's on the host.
>
> I found that 'perf script' and 'perf report' works differently,
> and I suppose 'perf script' is correct and 'perf report' isn't.
>
> What I'm doing on PC host is:

I haven't tested this, but libtraceevent is now in, perhaps it works for
you now? Can you check?

- Arnaldo

> 1) Collect data with:
> perf record -a -R -f -m 8192 -c 1 -e sched:sched_switch \
> -e sched:sched_process_exit -e sched:sched_process_fork \
> -e sched:sched_wakeup -e sched:sched_migrate_task [task]
> 2) Collect an output from 'perf script' and 'perf report', both looks
> great.
> 3) Copy perf.data and contents of /proc/kallsyms to ARM target.
>
> 4) Next, on ARM target:
> perf script --kallsyms=[kallsyms from PC host] -i [perf.data from PC host]
> Looks good, all event names like 'sched_wakeup' or 'sched_switch' are shown.
> 5) Try:
> perf report --kallsyms=[kallsyms from PC host] -i [perf.data from PC host] --stdio
> All event names are shown as 'unknown:unknown'.
>
> "Cross-replaying" (perf sched replay) looks broken too.
> Host results are:
>
> run measurement overhead: 260 nsecs
> sleep measurement overhead: 56109 nsecs
> the run test took 1000054 nsecs
> the sleep test took 1076170 nsecs
> nr_run_events: 246
> nr_sleep_events: 257
> nr_wakeup_events: 123
> target-less wakeups: 27
> task 0 ( <unknown>: 3440), nr_events: 33
> task 1 ( kworker/0:0: 3227), nr_events: 15
> task 2 ( <unknown>: 0), nr_events: 125
> task 3 ( plugin-containe: 1769), nr_events: 13
> task 4 ( ksoftirqd/0: 3), nr_events: 5
> task 5 ( kworker/2:2: 2023), nr_events: 3
> task 6 ( perf: 3441), nr_events: 200
> task 7 ( migration/2: 3091), nr_events: 3
> task 8 ( kworker/1:0: 3104), nr_events: 158
> task 9 ( urxvt: 2952), nr_events: 95
> task 10 ( ksoftirqd/2: 3093), nr_events: 3
> ------------------------------------------------------------
> #1 : 70.193, ravg: 70.19, cpu: 116.57 / 116.57
> #2 : 70.607, ravg: 70.23, cpu: 116.61 / 116.58
> #3 : 70.411, ravg: 70.25, cpu: 116.69 / 116.59
> #4 : 70.386, ravg: 70.27, cpu: 116.72 / 116.60
> #5 : 70.222, ravg: 70.26, cpu: 116.39 / 116.58
> #6 : 70.361, ravg: 70.27, cpu: 116.40 / 116.56
> #7 : 70.409, ravg: 70.28, cpu: 116.43 / 116.55
> #8 : 70.368, ravg: 70.29, cpu: 116.50 / 116.55
> #9 : 70.604, ravg: 70.32, cpu: 116.75 / 116.57
> #10 : 70.578, ravg: 70.35, cpu: 116.79 / 116.59
>
> Cross-replaying attempt is ('perf sched -i [perf.data from PC host] replay'):
>
> run measurement overhead: 8099 nsecs
> sleep measurement overhead: 159428 nsecs
> the run test took 998913 nsecs
> the sleep test took 1188048 nsecs
> nr_run_events: 0
> nr_sleep_events: 0
> nr_wakeup_events: 0
> ------------------------------------------------------------
> #1 : 0.058, ravg: 0.06, cpu: 0.00 / 0.00
> #2 : 0.105, ravg: 0.06, cpu: 0.00 / 0.00
> #3 : 0.027, ravg: 0.06, cpu: 0.00 / 0.00
> #4 : 0.026, ravg: 0.06, cpu: 0.00 / 0.00
> #5 : 0.035, ravg: 0.05, cpu: 0.00 / 0.00
> #6 : 0.027, ravg: 0.05, cpu: 0.00 / 0.00
> #7 : 0.027, ravg: 0.05, cpu: 0.00 / 0.00
> #8 : 0.028, ravg: 0.05, cpu: 0.00 / 0.00
> #9 : 0.029, ravg: 0.04, cpu: 0.00 / 0.00
> #10 : 0.028, ravg: 0.04, cpu: 0.00 / 0.00
>
> Dmitry

2012-05-31 08:26:14

by Dmitry Antipov

Subject: Re: Perf record format portability

On 05/29/2012 07:10 PM, Arnaldo Carvalho de Melo wrote:

> I haven't tested this, but libtraceevent is now in, perhaps it works for
> you now? Can you check?

It doesn't work. Running 'perf report' on ARM on data collected on
x86 shows 'unknown:unknown' for event names (see report_x86_on_ARM.txt),
and running 'perf report' on x86 on data collected on ARM shows invalid event names
(see report_ARM_on_x86.txt).

Dmitry


Attachments:
report_x86_on_ARM.txt (3.02 kB)
report_ARM_on_x86.txt (3.03 kB)