Hello,
I found some problems in Intel-PT and auxtrace in general with pipe.
In the past it used to work with pipe, but recent code fails. As it
also touches the generic code, other auxtrace users like ARM SPE will
be affected too. I added a test case to verify it works with pipes.
Changes in v2)
* add a warning in intel_pt_process_auxtrace_info()
* add Reviewed-by from James
At last, I can run this command without a problem.
$ perf record -o- -e intel_pt// true | perf inject -b | perf report -i- --itrace=i1000
The code is available at 'perf/auxtrace-pipe-v2' branch in
git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
Thanks,
Namhyung
Namhyung Kim (4):
perf inject: Use perf_data__read() for auxtrace
perf intel-pt: Do not try to queue auxtrace data on pipe
perf session: Avoid calling lseek(2) for pipe
perf test: Add pipe mode test to the Intel PT test suite
tools/perf/Documentation/perf-intel-pt.txt | 30 ++++++++++++++++++++++
tools/perf/builtin-inject.c | 6 ++---
tools/perf/tests/shell/test_intel_pt.sh | 17 ++++++++++++
tools/perf/util/auxtrace.c | 3 +++
tools/perf/util/intel-pt.c | 6 +++++
tools/perf/util/session.c | 9 +++++--
6 files changed, 66 insertions(+), 5 deletions(-)
base-commit: 5670ebf54bd26482f57a094c53bdc562c106e0a9
--
2.39.1.456.gfc5497dd1b-goog
In copy_bytes(), it reads the data from the (input) fd and writes it to
the output file. But it does with the read(2) unconditionally which
caused a problem of mixing buffered vs unbuffered I/O together.
You can see the problem when using pipes.
$ perf record -e intel_pt// -o- true | perf inject -b > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.000 MB - ]
0x45c0 [0x30]: failed to process type: 71
It should use perf_data__read() to honor the 'use_stdio' setting.
Fixes: 601366678c93 ("perf data: Allow to use stdio functions for pipe mode")
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/builtin-inject.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 3f4e4dd5abf3..f8182417b734 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -215,14 +215,14 @@ static int perf_event__repipe_event_update(struct perf_tool *tool,
#ifdef HAVE_AUXTRACE_SUPPORT
-static int copy_bytes(struct perf_inject *inject, int fd, off_t size)
+static int copy_bytes(struct perf_inject *inject, struct perf_data *data, off_t size)
{
char buf[4096];
ssize_t ssz;
int ret;
while (size > 0) {
- ssz = read(fd, buf, min(size, (off_t)sizeof(buf)));
+ ssz = perf_data__read(data, buf, min(size, (off_t)sizeof(buf)));
if (ssz < 0)
return -errno;
ret = output_bytes(inject, buf, ssz);
@@ -260,7 +260,7 @@ static s64 perf_event__repipe_auxtrace(struct perf_session *session,
ret = output_bytes(inject, event, event->header.size);
if (ret < 0)
return ret;
- ret = copy_bytes(inject, perf_data__fd(session->data),
+ ret = copy_bytes(inject, session->data,
event->auxtrace.size);
} else {
ret = output_bytes(inject, event,
--
2.39.1.456.gfc5497dd1b-goog
When it processes AUXTRACE_INFO, it calls to auxtrace_queue_data() to
collect AUXTRACE data first. That won't work with pipe since it needs
lseek() to read the scattered aux data.
$ perf record -o- -e intel_pt// true | perf report -i- --itrace=i100
# To display the perf.data header info, please use --header/--header-only options.
#
0x4118 [0xa0]: failed to process type: 70
Error:
failed to process sample
For the pipe mode, it can handle the aux data as it gets. But there's
no guarantee it can get the aux data in time. So the following warning
will be shown at the beginning:
WARNING: Intel PT with pipe mode is not recommended.
The output cannot relied upon. In particular,
time stamps and the order of events may be incorrect.
Fixes: dbd134322e74 ("perf intel-pt: Add support for decoding AUX area samples")
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/Documentation/perf-intel-pt.txt | 30 ++++++++++++++++++++++
tools/perf/util/auxtrace.c | 3 +++
tools/perf/util/intel-pt.c | 6 +++++
3 files changed, 39 insertions(+)
diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index 7b6ccd2fa3bf..9d485a9cdb19 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -1821,6 +1821,36 @@ a trace that encodes the payload data into TNT packets. Here is an example
$
+Pipe mode
+---------
+Pipe mode is a problem for Intel PT and possibly other auxtrace users.
+It's not recommended to use a pipe as data output with Intel PT because
+of the following reason.
+
+Essentially the auxtrace buffers do not behave like the regular perf
+event buffers. That is because the head and tail are updated by
+software, but in the auxtrace case the data is written by hardware.
+So the head and tail do not get updated as data is written.
+
+In the Intel PT case, the head and tail are updated only when the trace
+is disabled by software, for example:
+ - full-trace, system wide : when buffer passes watermark
+ - full-trace, not system-wide : when buffer passes watermark or
+ context switches
+ - snapshot mode : as above but also when a snapshot is made
+ - sample mode : as above but also when a sample is made
+
+That means finished-round ordering doesn't work. An auxtrace buffer
+can turn up that has data that extends back in time, possibly to the
+very beginning of tracing.
+
+For a perf.data file, that problem is solved by going through the trace
+and queuing up the auxtrace buffers in advance.
+
+For pipe mode, the order of events and timestamps can presumably
+be messed up.
+
+
EXAMPLE
-------
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index c2e323cd7d49..d4b04fa07a11 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -1133,6 +1133,9 @@ int auxtrace_queue_data(struct perf_session *session, bool samples, bool events)
if (auxtrace__dont_decode(session))
return 0;
+ if (perf_data__is_pipe(session->data))
+ return 0;
+
if (!session->auxtrace || !session->auxtrace->queue_data)
return -EINVAL;
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 6d3921627e33..b8b29756fbf1 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -4379,6 +4379,12 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
intel_pt_setup_pebs_events(pt);
+ if (perf_data__is_pipe(session->data)) {
+ pr_warning("WARNING: Intel PT with pipe mode is not recommended.\n"
+ " The output cannot relied upon. In particular,\n"
+ " timestamps and the order of events may be incorrect.\n");
+ }
+
if (pt->sampling_mode || list_empty(&session->auxtrace_index))
err = auxtrace_queue_data(session, true, true);
else
--
2.39.1.456.gfc5497dd1b-goog
The test_pipe() function will check perf report and perf inject with
pipe input.
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/tests/shell/test_intel_pt.sh | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/tools/perf/tests/shell/test_intel_pt.sh b/tools/perf/tests/shell/test_intel_pt.sh
index f5ed7b1af419..4ddb17cb83c5 100755
--- a/tools/perf/tests/shell/test_intel_pt.sh
+++ b/tools/perf/tests/shell/test_intel_pt.sh
@@ -620,6 +620,22 @@ test_event_trace()
return 0
}
+test_pipe()
+{
+ echo "--- Test with pipe mode ---"
+ # Check if it works with pipe
+ if ! perf_record_no_bpf -o- -e intel_pt//u uname | perf report -q -i- --itrace=i10000 ; then
+ echo "perf record + report failed with pipe mode"
+ return 1
+ fi
+ if ! perf_record_no_bpf -o- -e intel_pt//u uname | perf inject -b > /dev/null ; then
+ echo "perf record + inject failed with pipe mode"
+ return 1
+ fi
+ echo OK
+ return 0
+}
+
count_result()
{
if [ "$1" -eq 2 ] ; then
@@ -647,6 +663,7 @@ test_virtual_lbr || ret=$? ; count_result $ret ; ret=0
test_power_event || ret=$? ; count_result $ret ; ret=0
test_no_tnt || ret=$? ; count_result $ret ; ret=0
test_event_trace || ret=$? ; count_result $ret ; ret=0
+test_pipe || ret=$? ; count_result $ret ; ret=0
cleanup
--
2.39.1.456.gfc5497dd1b-goog
We should not call lseek(2) for pipes as it won't work. And we already
in the proper place to read the data for AUXTRACE. Add the comment like
in the PERF_RECORD_HEADER_TRACING_DATA.
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
---
tools/perf/util/session.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 7c021c6cedb9..fdfe772f2699 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1699,8 +1699,13 @@ static s64 perf_session__process_user_event(struct perf_session *session,
case PERF_RECORD_AUXTRACE_INFO:
return tool->auxtrace_info(session, event);
case PERF_RECORD_AUXTRACE:
- /* setup for reading amidst mmap */
- lseek(fd, file_offset + event->header.size, SEEK_SET);
+ /*
+ * Setup for reading amidst mmap, but only when we
+ * are in 'file' mode. The 'pipe' fd is in proper
+ * place already.
+ */
+ if (!perf_data__is_pipe(session->data))
+ lseek(fd, file_offset + event->header.size, SEEK_SET);
return tool->auxtrace(session, event);
case PERF_RECORD_AUXTRACE_ERROR:
perf_session__auxtrace_error_inc(session, event);
--
2.39.1.456.gfc5497dd1b-goog
On 31/01/23 04:33, Namhyung Kim wrote:
> When it processes AUXTRACE_INFO, it calls to auxtrace_queue_data() to
> collect AUXTRACE data first. That won't work with pipe since it needs
> lseek() to read the scattered aux data.
>
> $ perf record -o- -e intel_pt// true | perf report -i- --itrace=i100
> # To display the perf.data header info, please use --header/--header-only options.
> #
> 0x4118 [0xa0]: failed to process type: 70
> Error:
> failed to process sample
>
> For the pipe mode, it can handle the aux data as it gets. But there's
> no guarantee it can get the aux data in time. So the following warning
> will be shown at the beginning:
>
> WARNING: Intel PT with pipe mode is not recommended.
> The output cannot relied upon. In particular,
> time stamps and the order of events may be incorrect.
>
> Fixes: dbd134322e74 ("perf intel-pt: Add support for decoding AUX area samples")
> Reviewed-by: James Clark <[email protected]>
> Signed-off-by: Namhyung Kim <[email protected]>
Minor changes to the documentation, see below, otherwise:
Reviewed-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/Documentation/perf-intel-pt.txt | 30 ++++++++++++++++++++++
> tools/perf/util/auxtrace.c | 3 +++
> tools/perf/util/intel-pt.c | 6 +++++
> 3 files changed, 39 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
> index 7b6ccd2fa3bf..9d485a9cdb19 100644
> --- a/tools/perf/Documentation/perf-intel-pt.txt
> +++ b/tools/perf/Documentation/perf-intel-pt.txt
> @@ -1821,6 +1821,36 @@ a trace that encodes the payload data into TNT packets. Here is an example
> $
>
>
> +Pipe mode
> +---------
> +Pipe mode is a problem for Intel PT and possibly other auxtrace users.
> +It's not recommended to use a pipe as data output with Intel PT because
> +of the following reason.
'following reason' -> 'reason explained below'
> +
> +Essentially the auxtrace buffers do not behave like the regular perf
> +event buffers. That is because the head and tail are updated by
> +software, but in the auxtrace case the data is written by hardware.
> +So the head and tail do not get updated as data is written.
> +
> +In the Intel PT case, the head and tail are updated only when the trace
> +is disabled by software, for example:
Needs a blank line here, otherwise all the text is merged into 1 paragraph.
> + - full-trace, system wide : when buffer passes watermark
> + - full-trace, not system-wide : when buffer passes watermark or
> + context switches
> + - snapshot mode : as above but also when a snapshot is made
> + - sample mode : as above but also when a sample is made
> +
> +That means finished-round ordering doesn't work. An auxtrace buffer
> +can turn up that has data that extends back in time, possibly to the
> +very beginning of tracing.
> +
> +For a perf.data file, that problem is solved by going through the trace
> +and queuing up the auxtrace buffers in advance.
> +
> +For pipe mode, the order of events and timestamps can presumably
> +be messed up.
'presumably be messed up' -> can be mixed up.
> +
> +
> EXAMPLE
> -------
>
> diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
> index c2e323cd7d49..d4b04fa07a11 100644
> --- a/tools/perf/util/auxtrace.c
> +++ b/tools/perf/util/auxtrace.c
> @@ -1133,6 +1133,9 @@ int auxtrace_queue_data(struct perf_session *session, bool samples, bool events)
> if (auxtrace__dont_decode(session))
> return 0;
>
> + if (perf_data__is_pipe(session->data))
> + return 0;
> +
> if (!session->auxtrace || !session->auxtrace->queue_data)
> return -EINVAL;
>
> diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
> index 6d3921627e33..b8b29756fbf1 100644
> --- a/tools/perf/util/intel-pt.c
> +++ b/tools/perf/util/intel-pt.c
> @@ -4379,6 +4379,12 @@ int intel_pt_process_auxtrace_info(union perf_event *event,
>
> intel_pt_setup_pebs_events(pt);
>
> + if (perf_data__is_pipe(session->data)) {
> + pr_warning("WARNING: Intel PT with pipe mode is not recommended.\n"
> + " The output cannot relied upon. In particular,\n"
> + " timestamps and the order of events may be incorrect.\n");
> + }
> +
> if (pt->sampling_mode || list_empty(&session->auxtrace_index))
> err = auxtrace_queue_data(session, true, true);
> else
On 31/01/23 04:33, Namhyung Kim wrote:
> Hello,
>
> I found some problems in Intel-PT and auxtrace in general with pipe.
> In the past it used to work with pipe, but recent code fails. As it
> also touches the generic code, other auxtrace users like ARM SPE will
> be affected too. I added a test case to verify it works with pipes.
>
> Changes in v2)
> * add a warning in intel_pt_process_auxtrace_info()
> * add Reviewed-by from James
Minor changes suggested for patch 2, otherwise:
Reviewed-by: Adrian Hunter <[email protected]>
>
> At last, I can run this command without a problem.
>
> $ perf record -o- -e intel_pt// true | perf inject -b | perf report -i- --itrace=i1000
>
> The code is available at 'perf/auxtrace-pipe-v2' branch in
>
> git://git.kernel.org/pub/scm/linux/kernel/git/namhyung/linux-perf.git
>
> Thanks,
> Namhyung
>
>
> Namhyung Kim (4):
> perf inject: Use perf_data__read() for auxtrace
> perf intel-pt: Do not try to queue auxtrace data on pipe
> perf session: Avoid calling lseek(2) for pipe
> perf test: Add pipe mode test to the Intel PT test suite
>
> tools/perf/Documentation/perf-intel-pt.txt | 30 ++++++++++++++++++++++
> tools/perf/builtin-inject.c | 6 ++---
> tools/perf/tests/shell/test_intel_pt.sh | 17 ++++++++++++
> tools/perf/util/auxtrace.c | 3 +++
> tools/perf/util/intel-pt.c | 6 +++++
> tools/perf/util/session.c | 9 +++++--
> 6 files changed, 66 insertions(+), 5 deletions(-)
>
>
> base-commit: 5670ebf54bd26482f57a094c53bdc562c106e0a9