2020-04-29 15:10:55

by Adrian Hunter

[permalink] [raw]
Subject: [PATCH 8/9] perf intel-pt: Update documentation about itrace G and L options

Provide a little more information about the new G and L options,
particularly the issue with large PEBs.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/Documentation/itrace.txt | 4 +++
tools/perf/Documentation/perf-intel-pt.txt | 35 ++++++++++++++++++++++
2 files changed, 39 insertions(+)

diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index 0326050beebd..271484754fee 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -33,6 +33,10 @@
Also the number of last branch entries (default 64, max. 1024) for
instructions or transactions events can be specified.

+ Similar to options g and l, size may also be specified for options G and L.
+ On x86, note that G and L work poorly when data has been recorded with
+ large PEBS. Refer linkperf:perf-intel-pt[1] man page for details.
+
It is also possible to skip events generated (instructions, branches, transactions,
ptwrite, power) at the beginning. This is useful to ignore initialization code.

diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index 456fdcbf26ac..782eb8a65caf 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -821,7 +821,9 @@ The letters are:
e synthesize tracing error events
d create a debug log
g synthesize a call chain (use with i or x)
+ G synthesize a call chain on existing event records
l synthesize last branch entries (use with i or x)
+ L synthesize last branch entries on existing event records
s skip initial number of events

"Instructions" events look like they were recorded by "perf record -e
@@ -912,6 +914,39 @@ transactions events can be specified. e.g.
Note that last branch entries are cleared for each sample, so there is no overlap
from one sample to the next.

+The G and L options are designed in particular for sample mode, and work much
+like g and l but add call chain and branch stack to the other selected events
+instead of synthesized events. For example, to record branch-misses events for
+'ls' and then add a call chain derived from the Intel PT trace:
+
+ perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' -- ls
+ perf report --itrace=Ge
+
+Although in fact G is a default for perf report, so that is the same as just:
+
+ perf report
+
+One caveat with the G and L options is that they work poorly with "Large PEBS".
+Large PEBS means PEBS records will be accumulated by hardware and the written
+into the event buffer in one go. That reduces interrupts, but can give very
+late timestamps. Because the Intel PT trace is synchronized by timestamps,
+the PEBS events do not match the trace. Currently, Large PEBS is used only in
+certain circumstances:
+ - hardware supports it
+ - PEBS is used
+ - event period is specified, instead of frequency
+ - the sample type is limited to the following flags:
+ PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR |
+ PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID |
+ PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER |
+ PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR |
+ PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER |
+ PERF_SAMPLE_PERIOD (and sometimes) | PERF_SAMPLE_TIME
+Because Intel PT sample mode uses a different sample type to the list above,
+Large PEBS is not used with Intel PT sample mode. To avoid Large PEBS in other
+cases, avoid specifying the event period i.e. avoid the 'perf record' -c option,
+--count option, or 'period' config term.
+
To disable trace decoding entirely, use the option --no-itrace.

It is also possible to skip events generated (instructions, branches, transactions)
--
2.17.1


2020-04-29 23:07:54

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 8/9] perf intel-pt: Update documentation about itrace G and L options

> +One caveat with the G and L options is that they work poorly with "Large PEBS".
> +Large PEBS means PEBS records will be accumulated by hardware and the written
> +into the event buffer in one go. That reduces interrupts, but can give very
> +late timestamps. Because the Intel PT trace is synchronized by timestamps,

Are you refering to Broadwell here?

On Skylake/Goldmont the PEBS event contains the TSC and the time stamp reported by
perf should report the time the event was sampled based on that TSC.
Or is that not working for some reason?

-Andi

2020-04-30 05:40:13

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH 8/9] perf intel-pt: Update documentation about itrace G and L options

On 30/04/20 2:03 am, Andi Kleen wrote:
>> +One caveat with the G and L options is that they work poorly with "Large PEBS".
>> +Large PEBS means PEBS records will be accumulated by hardware and the written
>> +into the event buffer in one go. That reduces interrupts, but can give very
>> +late timestamps. Because the Intel PT trace is synchronized by timestamps,
>
> Are you refering to Broadwell here?

I was testing on Coffee Lake

>
> On Skylake/Goldmont the PEBS event contains the TSC and the time stamp reported by
> perf should report the time the event was sampled based on that TSC.
> Or is that not working for some reason?

I guess it is not working like that, but perf tools would probably need
special rules to sort the events because the they would break the rules of
PERF_RECORD_FINISHED_ROUND, wouldn't they?

2020-04-30 12:31:56

by Andi Kleen

[permalink] [raw]
Subject: Re: [PATCH 8/9] perf intel-pt: Update documentation about itrace G and L options

> >
> > On Skylake/Goldmont the PEBS event contains the TSC and the time stamp reported by
> > perf should report the time the event was sampled based on that TSC.
> > Or is that not working for some reason?
>
> I guess it is not working like that, but perf tools would probably need
> special rules to sort the events because the they would break the rules of
> PERF_RECORD_FINISHED_ROUND, wouldn't they?

Yes it may violate finished round.

It should not be delayed longer than the next context switch though.

The documentation warning would be still correct, but only on Broadwell.

-Andi

Subject: [tip: perf/core] perf intel-pt: Update documentation about itrace G and L options

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 43358d9dfb250842383cddad816a7538548a5070
Gitweb: https://git.kernel.org/tip/43358d9dfb250842383cddad816a7538548a5070
Author: Adrian Hunter <[email protected]>
AuthorDate: Wed, 29 Apr 2020 18:07:50 +03:00
Committer: Arnaldo Carvalho de Melo <[email protected]>
CommitterDate: Tue, 05 May 2020 16:35:30 -03:00

perf intel-pt: Update documentation about itrace G and L options

Provide a little more information about the new G and L options,
particularly the issue with large PEBs.

Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
---
tools/perf/Documentation/itrace.txt | 4 ++-
tools/perf/Documentation/perf-intel-pt.txt | 35 +++++++++++++++++++++-
2 files changed, 39 insertions(+)

diff --git a/tools/perf/Documentation/itrace.txt b/tools/perf/Documentation/itrace.txt
index 0326050..2714847 100644
--- a/tools/perf/Documentation/itrace.txt
+++ b/tools/perf/Documentation/itrace.txt
@@ -33,6 +33,10 @@
Also the number of last branch entries (default 64, max. 1024) for
instructions or transactions events can be specified.

+ Similar to options g and l, size may also be specified for options G and L.
+ On x86, note that G and L work poorly when data has been recorded with
+ large PEBS. Refer linkperf:perf-intel-pt[1] man page for details.
+
It is also possible to skip events generated (instructions, branches, transactions,
ptwrite, power) at the beginning. This is useful to ignore initialization code.

diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index 456fdcb..782eb8a 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -821,7 +821,9 @@ The letters are:
e synthesize tracing error events
d create a debug log
g synthesize a call chain (use with i or x)
+ G synthesize a call chain on existing event records
l synthesize last branch entries (use with i or x)
+ L synthesize last branch entries on existing event records
s skip initial number of events

"Instructions" events look like they were recorded by "perf record -e
@@ -912,6 +914,39 @@ transactions events can be specified. e.g.
Note that last branch entries are cleared for each sample, so there is no overlap
from one sample to the next.

+The G and L options are designed in particular for sample mode, and work much
+like g and l but add call chain and branch stack to the other selected events
+instead of synthesized events. For example, to record branch-misses events for
+'ls' and then add a call chain derived from the Intel PT trace:
+
+ perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' -- ls
+ perf report --itrace=Ge
+
+Although in fact G is a default for perf report, so that is the same as just:
+
+ perf report
+
+One caveat with the G and L options is that they work poorly with "Large PEBS".
+Large PEBS means PEBS records will be accumulated by hardware and the written
+into the event buffer in one go. That reduces interrupts, but can give very
+late timestamps. Because the Intel PT trace is synchronized by timestamps,
+the PEBS events do not match the trace. Currently, Large PEBS is used only in
+certain circumstances:
+ - hardware supports it
+ - PEBS is used
+ - event period is specified, instead of frequency
+ - the sample type is limited to the following flags:
+ PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR |
+ PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID |
+ PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER |
+ PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR |
+ PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER |
+ PERF_SAMPLE_PERIOD (and sometimes) | PERF_SAMPLE_TIME
+Because Intel PT sample mode uses a different sample type to the list above,
+Large PEBS is not used with Intel PT sample mode. To avoid Large PEBS in other
+cases, avoid specifying the event period i.e. avoid the 'perf record' -c option,
+--count option, or 'period' config term.
+
To disable trace decoding entirely, use the option --no-itrace.

It is also possible to skip events generated (instructions, branches, transactions)