2015-08-17 04:31:35

by Andi Kleen

Subject: Announcing simple-pt -- a simple Processor Trace implementation for Linux


Modern Intel Core CPUs (5th and 6th generation) have an Intel Processor Trace (PT) feature
that traces branch execution with low overhead. This is useful for performance analysis and debugging.

simple-pt is a simple standalone driver and decoder tool to implement PT on Linux.

Starting with version 4.1, Linux has an integrated PT implementation in perf
(see https://lwn.net/Articles/648154/).
simple-pt is an alternative implementation. It has many disadvantages compared to the perf PT
implementation, such as:
- needs to run as root
- no long term tracing or sampling with interrupts
- no support for interactive debugging (use gdb 7.10 on perf for that)
- no support for histograms
- somewhat experimental
- not as well supported as perf

On the positive side, simple-pt:
- is simple
- is standalone: no kernel changes are needed, and it could be ported to older kernels or other operating systems
- is easy to modify and experiment with
- has a more ftrace-like decoding tool
- supports kprobes-based triggers
- has a modular “unix style” design with simple tools that do only one thing each
- is BSD licensed

Example output:


% sptcmd -c tcall taskset -c 0 ./tcall
cpu 0 offset 1027688, 1003 KB, writing to ptout.0
...
Wrote sideband to ptout.sideband
% sptdecode --sideband ptout.sideband --pt ptout.0 | less
TIME DELTA INSNs OPERATION
frequency 32
0 [+0] [+ 1] _dl_aux_init+436
[+ 6] __libc_start_main+455 -> _dl_discover_osversion
...
[+ 13] __libc_start_main+446 -> main
[+ 9] main+22 -> f1
[+ 4] f1+9 -> f2
[+ 2] f1+19 -> f2
[+ 5] main+22 -> f1
[+ 4] f1+9 -> f2
[+ 2] f1+19 -> f2
[+ 5] main+22 -> f1
...
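
The decoder output is plain text, so it can be post-processed with small scripts. As a rough sketch (not part of simple-pt), the Python below tallies caller -> callee edges from lines in the format shown above. The regular expression and the function name `count_call_edges` are assumptions based only on this sample output, and offsets are treated as plain digit strings.

```python
import re
import sys
from collections import Counter

# Matches lines such as "[+ 9] main+22 -> f1".
# The format is assumed from the sample output only.
EDGE_RE = re.compile(r"\[\+\s*\d+\]\s+(\S+?)\+(\d+)\s+->\s+(\S+)\s*$")


def count_call_edges(lines):
    """Count (caller, callee) pairs in sptdecode-style text output."""
    edges = Counter()
    for line in lines:
        m = EDGE_RE.search(line)
        if m:
            caller, _offset, callee = m.groups()
            edges[(caller, callee)] += 1
    return edges


if __name__ == "__main__":
    # Read decoder output from stdin and print the most frequent edges first.
    for (caller, callee), n in count_call_edges(sys.stdin).most_common():
        print(f"{n:8d} {caller} -> {callee}")
```

The script reads the decoder output on standard input, so it could be placed at the end of the sptdecode pipeline in place of `less`.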

Available from https://github.com/andikleen/simple-pt

--
[email protected] -- Speaking for myself only.


2015-08-17 13:09:35

by Frederic Weisbecker

Subject: Re: Announcing simple-pt -- a simple Processor Trace implementation for Linux

2015-08-17 6:31 GMT+02:00 Andi Kleen <[email protected]>:
> % sptcmd -c tcall taskset -c 0 ./tcall
> cpu 0 offset 1027688, 1003 KB, writing to ptout.0
> ...
> Wrote sideband to ptout.sideband
> % sptdecode --sideband ptout.sideband --pt ptout.0 | less
> TIME DELTA INSNs OPERATION
> frequency 32
> 0 [+0] [+ 1] _dl_aux_init+436
> [+ 6] __libc_start_main+455 -> _dl_discover_osversion
> ...
> [+ 13] __libc_start_main+446 -> main
> [+ 9] main+22 -> f1
> [+ 4] f1+9 -> f2
> [+ 2] f1+19 -> f2
> [+ 5] main+22 -> f1
> [+ 4] f1+9 -> f2
> [+ 2] f1+19 -> f2
> [+ 5] main+22 -> f1

Nice. So I guess +x is the address offset. How hard would it be to
translate to file lines?

Thanks.


2015-08-17 18:21:08

by Andi Kleen

Subject: Re: Announcing simple-pt -- a simple Processor Trace implementation for Linux

> > % sptdecode --sideband ptout.sideband --pt ptout.0 | less
> > TIME DELTA INSNs OPERATION
> > frequency 32
> > 0 [+0] [+ 1] _dl_aux_init+436
> > [+ 6] __libc_start_main+455 -> _dl_discover_osversion
> > ...
> > [+ 13] __libc_start_main+446 -> main
> > [+ 9] main+22 -> f1
> > [+ 4] f1+9 -> f2
> > [+ 2] f1+19 -> f2
> > [+ 5] main+22 -> f1
> > [+ 4] f1+9 -> f2
> > [+ 2] f1+19 -> f2
> > [+ 5] main+22 -> f1
>
> Nice. So I guess +x is the address offset. How hard would it be to
> translate to file lines?

Yes, it's the address offset. Translating to lines wouldn't be too hard;
it just needs to be implemented with a DWARF reader.
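
As an illustration of the idea (not simple-pt code), a symbol+offset pair can be turned into an absolute address using the binary's symbol table and then handed to an existing DWARF-aware tool. The sketch below assumes binutils `nm` and `addr2line` are installed, that the binary is a non-PIE executable built with debug info, and that offsets are decimal as they appear in the output above. All helper names are made up for this example.

```python
import subprocess


def parse_nm(text):
    """Parse `nm` output into a dict of symbol name -> address."""
    syms = {}
    for line in text.splitlines():
        parts = line.split()
        # Defined symbols look like "<hex address> <type> <name>";
        # undefined ones have no address and are skipped.
        if len(parts) == 3:
            syms[parts[2]] = int(parts[0], 16)
    return syms


def load_symbols(binary):
    """Run `nm` on the binary and return its symbol table."""
    out = subprocess.run(["nm", binary], check=True,
                         capture_output=True, text=True).stdout
    return parse_nm(out)


def resolve(location, syms):
    """Turn "f1+9" into an absolute address (offset assumed decimal)."""
    name, sep, off = location.rpartition("+")
    if not sep:
        # No "+offset" part: the location is the symbol itself.
        return syms[location]
    return syms[name] + int(off)


def source_line(binary, addr):
    """Ask addr2line for the "file:line" of an address."""
    out = subprocess.run(["addr2line", "-e", binary, hex(addr)],
                         check=True, capture_output=True, text=True).stdout
    return out.strip()
```

For example, `source_line("./tcall", resolve("main+22", load_symbols("./tcall")))` would return whatever file and line addr2line reports for that address, or `??:0` if the binary carries no debug info.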

BTW the PT trace has all branches, not just the calls, but it's more
difficult to display them all in a nice way.

-Andi