2021-04-08 15:33:44

by Alexander Shishkin

Subject: [PATCH 2/2] perf intel-pt: Use aux_watermark

Turns out, the default setting of attr.aux_watermark to half of the total
buffer size is not very useful, especially with smaller buffers. The
problem is that, after half of the buffer is filled up, the kernel updates
->aux_head and sets up the next "transaction", while observing that
->aux_tail is still zero (as userspace hasn't had a chance to update
it), meaning that the trace will have to stop at the end of this second
"transaction". This means, for example, that the second PERF_RECORD_AUX in
every trace comes with the TRUNCATED flag set.

Setting attr.aux_watermark to a quarter of the buffer gives enough space for
the ->aux_tail update to be observed and prevents the data loss.

The obligatory before/after showcase:

> # perf_before record -e intel_pt//u -m,8 uname
> Linux
> [ perf record: Woken up 6 times to write data ]
> Warning:
> AUX data lost 4 times out of 10!
>
> [ perf record: Captured and wrote 0.099 MB perf.data ]
> # perf record -e intel_pt//u -m,8 uname
> Linux
> [ perf record: Woken up 4 times to write data ]
> [ perf record: Captured and wrote 0.039 MB perf.data ]

The effect is still visible with large workloads and large buffers,
although less pronounced.

Signed-off-by: Alexander Shishkin <[email protected]>
---
tools/perf/arch/x86/util/intel-pt.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
index a6420c647959..d00707faf547 100644
--- a/tools/perf/arch/x86/util/intel-pt.c
+++ b/tools/perf/arch/x86/util/intel-pt.c
@@ -776,6 +776,10 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
 		}
 	}
 
+	if (opts->full_auxtrace)
+		intel_pt_evsel->core.attr.aux_watermark =
+			opts->auxtrace_mmap_pages / 4 * page_size;
+
 	intel_pt_parse_terms(intel_pt_pmu->name, &intel_pt_pmu->format,
 			     "tsc", &tsc_bit);

--
2.30.2


2021-04-14 17:37:11

by Alexander Shishkin

Subject: Re: [PATCH 2/2] perf intel-pt: Use aux_watermark

Adrian Hunter <[email protected]> writes:

> On 8/04/21 6:31 pm, Alexander Shishkin wrote:
>> diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
>> index a6420c647959..d00707faf547 100644
>> --- a/tools/perf/arch/x86/util/intel-pt.c
>> +++ b/tools/perf/arch/x86/util/intel-pt.c
>> @@ -776,6 +776,10 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
>>  		}
>>  	}
>>
>> +	if (opts->full_auxtrace)
>> +		intel_pt_evsel->core.attr.aux_watermark =
>> +			opts->auxtrace_mmap_pages / 4 * page_size;
>> +
>
> I would be explicit about the mode and put "/ 4" at the end
> for the case where auxtrace_mmap_pages is not a multiple of 4 (e.g. 2),
> i.e.
>
> 	if (!opts->auxtrace_snapshot_mode && !opts->auxtrace_sample_mode) {
> 		u32 aux_watermark = opts->auxtrace_mmap_pages * page_size / 4;
>
> 		intel_pt_evsel->core.attr.aux_watermark = aux_watermark;
> 	}

Thank you! I'll do exactly that.

Regards,
--
Alex

2021-04-14 19:21:55

by Adrian Hunter

Subject: Re: [PATCH 2/2] perf intel-pt: Use aux_watermark

On 8/04/21 6:31 pm, Alexander Shishkin wrote:
> Turns out, the default setting of attr.aux_watermark to half of the total
> buffer size is not very useful, especially with smaller buffers. The
> problem is that, after half of the buffer is filled up, the kernel updates
> ->aux_head and sets up the next "transaction", while observing that
> ->aux_tail is still zero (as userspace hasn't had a chance to update
> it), meaning that the trace will have to stop at the end of this second
> "transaction". This means, for example, that the second PERF_RECORD_AUX in
> every trace comes with the TRUNCATED flag set.
>
> Setting attr.aux_watermark to a quarter of the buffer gives enough space for
> the ->aux_tail update to be observed and prevents the data loss.
>
> The obligatory before/after showcase:
>
>> # perf_before record -e intel_pt//u -m,8 uname
>> Linux
>> [ perf record: Woken up 6 times to write data ]
>> Warning:
>> AUX data lost 4 times out of 10!
>>
>> [ perf record: Captured and wrote 0.099 MB perf.data ]
>> # perf record -e intel_pt//u -m,8 uname
>> Linux
>> [ perf record: Woken up 4 times to write data ]
>> [ perf record: Captured and wrote 0.039 MB perf.data ]
>
> The effect is still visible with large workloads and large buffers,
> although less pronounced.
>
> Signed-off-by: Alexander Shishkin <[email protected]>
> ---
> tools/perf/arch/x86/util/intel-pt.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/tools/perf/arch/x86/util/intel-pt.c b/tools/perf/arch/x86/util/intel-pt.c
> index a6420c647959..d00707faf547 100644
> --- a/tools/perf/arch/x86/util/intel-pt.c
> +++ b/tools/perf/arch/x86/util/intel-pt.c
> @@ -776,6 +776,10 @@ static int intel_pt_recording_options(struct auxtrace_record *itr,
>  		}
>  	}
>
> +	if (opts->full_auxtrace)
> +		intel_pt_evsel->core.attr.aux_watermark =
> +			opts->auxtrace_mmap_pages / 4 * page_size;
> +

I would be explicit about the mode and put "/ 4" at the end
for the case where auxtrace_mmap_pages is not a multiple of 4 (e.g. 2),
i.e.

	if (!opts->auxtrace_snapshot_mode && !opts->auxtrace_sample_mode) {
		u32 aux_watermark = opts->auxtrace_mmap_pages * page_size / 4;

		intel_pt_evsel->core.attr.aux_watermark = aux_watermark;
	}
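
To illustrate why the position of "/ 4" matters, here is a minimal
standalone sketch (not perf code) comparing the two orderings. The helper
names are hypothetical and a 4096-byte page size is only an assumption for
the example values. With integer division, dividing the page count first
truncates to zero for buffers smaller than 4 pages, which as I understand
it would leave the kernel's default watermark in effect.

```c
#include <stdint.h>

/*
 * Illustration only: both helper names are hypothetical and do not
 * exist in tools/perf.
 */

/* Ordering from the posted patch: divide the page count, then scale. */
static uint32_t watermark_divide_first(uint32_t mmap_pages, uint32_t page_size)
{
	return mmap_pages / 4 * page_size;
}

/* Suggested ordering: scale to bytes first, then take a quarter. */
static uint32_t watermark_divide_last(uint32_t mmap_pages, uint32_t page_size)
{
	return mmap_pages * page_size / 4;
}
```

For 2 pages of 4096 bytes, the first form gives 0 and the second gives
2048. For 8 pages (the -m,8 case in the commit message) both give 8192,
so the difference only shows up with very small AUX buffers.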


>  	intel_pt_parse_terms(intel_pt_pmu->name, &intel_pt_pmu->format,
>  			     "tsc", &tsc_bit);
>
>