2023-06-23 08:51:34

by Ravi Bangoria

[permalink] [raw]
Subject: [RFC] perf record: Use PERF_RECORD_LOST for synthesizing samples from read_format->lost

Currently perf synthesizes PERF_RECORD_LOST_SAMPLES samples for values
returned by read_format->lost. IIUC, PERF_RECORD_LOST_SAMPLES is used
only when hw provides corrupted samples and thus kernel has to drop
those. OTOH, PERF_RECORD_LOST is used when kernel has valid samples but
it fails to push them to ring-buffer because ring-buffer is already full.

So I feel PERF_RECORD_LOST is more appropriate for synthesizing samples
from read_format->lost.

Signed-off-by: Ravi Bangoria <[email protected]>
---
Notes:
- Posting this as RFC to get feedback. I haven't tested it well.
- There is one more minor issue: Aggregated Stats shows count of LOST/
LOST_SAMPLES records instead of actual number of lost events/samples:
```
$ sudo ./perf report -D | tail -20
Warning:
Processed 78923 events and lost 1153 chunks!
Warning:
Processed 47427 samples and lost 28.30%!
...
Aggregated stats:
LOST events: 1153 ( 1.5%)
LOST_SAMPLES events: 132 ( 0.2%)
```
Not sure if this is intentional.

tools/perf/builtin-record.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index efa03e4ac2c9..5baed42437b2 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -1858,7 +1858,7 @@ record__switch_output(struct record *rec, bool at_exit)
}

static void __record__save_lost_samples(struct record *rec, struct evsel *evsel,
- struct perf_record_lost_samples *lost,
+ struct perf_record_lost *lost,
int cpu_idx, int thread_idx, u64 lost_count,
u16 misc_flag)
{
@@ -1871,6 +1871,7 @@ static void __record__save_lost_samples(struct record *rec, struct evsel *evsel,
sid = xyarray__entry(evsel->core.sample_id, cpu_idx, thread_idx);
sample.id = sid->id;
}
+ lost->id = sid->id;

id_hdr_size = perf_event__synthesize_id_sample((void *)(lost + 1),
evsel->core.attr.sample_type, &sample);
@@ -1882,7 +1883,7 @@ static void __record__save_lost_samples(struct record *rec, struct evsel *evsel,
static void record__read_lost_samples(struct record *rec)
{
struct perf_session *session = rec->session;
- struct perf_record_lost_samples *lost;
+ struct perf_record_lost *lost;
struct evsel *evsel;

/* there was an error during record__open */
@@ -1895,7 +1896,7 @@ static void record__read_lost_samples(struct record *rec)
return;
}

- lost->header.type = PERF_RECORD_LOST_SAMPLES;
+ lost->header.type = PERF_RECORD_LOST;

evlist__for_each_entry(session->evlist, evsel) {
struct xyarray *xy = evsel->core.sample_id;
--
2.41.0



2023-06-24 06:19:10

by Namhyung Kim

[permalink] [raw]
Subject: Re: [RFC] perf record: Use PERF_RECORD_LOST for synthesizing samples from read_format->lost

Hi Ravi,

On Fri, Jun 23, 2023 at 1:21 AM Ravi Bangoria <[email protected]> wrote:
>
> Currently perf synthesizes PERF_RECORD_LOST_SAMPLES samples for values
> returned by read_format->lost. IIUC, PERF_RECORD_LOST_SAMPLES is used
> only when hw provides corrupted samples and thus kernel has to drop
> those. OTOH, PERF_RECORD_LOST is used when kernel has valid samples but
> it fails to push them to ring-buffer because ring-buffer is already full.
>
> So I feel PERF_RECORD_LOST is more appropriate for synthesizing samples
> from read_format->lost.

The problem of PERF_RECORD_LOST is that it counts non-sample
records too. It just counts every lost records and put the number
when it can find a space in the ring buffer later. We don't know
which one is lost and how many of it.

Some users want to get the accurate number of samples even if it's
not recorded in the ring buffer. Using PERF_RECORD_LOST can be
confusing since the kernel will return inaccurate data in terms of the
number of lost samples. So I chose PERF_RECORD_LOST_SAMPLES
to return the accurate number for each event.

Thanks,
Namhyung


>
> Signed-off-by: Ravi Bangoria <[email protected]>
> ---
> Notes:
> - Posting this as RFC to get feedback. I haven't tested it well.
> - There is one more minor issue: Aggregated Stats shows count of LOST/
> LOST_SAMPLES records instead of actual number of lost events/samples:
> ```
> $ sudo ./perf report -D | tail -20
> Warning:
> Processed 78923 events and lost 1153 chunks!
> Warning:
> Processed 47427 samples and lost 28.30%!
> ...
> Aggregated stats:
> LOST events: 1153 ( 1.5%)
> LOST_SAMPLES events: 132 ( 0.2%)
> ```
> Not sure if this is intentional.
>
> tools/perf/builtin-record.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
> index efa03e4ac2c9..5baed42437b2 100644
> --- a/tools/perf/builtin-record.c
> +++ b/tools/perf/builtin-record.c
> @@ -1858,7 +1858,7 @@ record__switch_output(struct record *rec, bool at_exit)
> }
>
> static void __record__save_lost_samples(struct record *rec, struct evsel *evsel,
> - struct perf_record_lost_samples *lost,
> + struct perf_record_lost *lost,
> int cpu_idx, int thread_idx, u64 lost_count,
> u16 misc_flag)
> {
> @@ -1871,6 +1871,7 @@ static void __record__save_lost_samples(struct record *rec, struct evsel *evsel,
> sid = xyarray__entry(evsel->core.sample_id, cpu_idx, thread_idx);
> sample.id = sid->id;
> }
> + lost->id = sid->id;
>
> id_hdr_size = perf_event__synthesize_id_sample((void *)(lost + 1),
> evsel->core.attr.sample_type, &sample);
> @@ -1882,7 +1883,7 @@ static void __record__save_lost_samples(struct record *rec, struct evsel *evsel,
> static void record__read_lost_samples(struct record *rec)
> {
> struct perf_session *session = rec->session;
> - struct perf_record_lost_samples *lost;
> + struct perf_record_lost *lost;
> struct evsel *evsel;
>
> /* there was an error during record__open */
> @@ -1895,7 +1896,7 @@ static void record__read_lost_samples(struct record *rec)
> return;
> }
>
> - lost->header.type = PERF_RECORD_LOST_SAMPLES;
> + lost->header.type = PERF_RECORD_LOST;
>
> evlist__for_each_entry(session->evlist, evsel) {
> struct xyarray *xy = evsel->core.sample_id;
> --
> 2.41.0
>

2023-06-26 06:16:47

by Ravi Bangoria

[permalink] [raw]
Subject: Re: [RFC] perf record: Use PERF_RECORD_LOST for synthesizing samples from read_format->lost

On 24-Jun-23 11:25 AM, Namhyung Kim wrote:
> Hi Ravi,
>
> On Fri, Jun 23, 2023 at 1:21 AM Ravi Bangoria <[email protected]> wrote:
>>
>> Currently perf synthesizes PERF_RECORD_LOST_SAMPLES samples for values
>> returned by read_format->lost. IIUC, PERF_RECORD_LOST_SAMPLES is used
>> only when hw provides corrupted samples and thus kernel has to drop
>> those. OTOH, PERF_RECORD_LOST is used when kernel has valid samples but
>> it fails to push them to ring-buffer because ring-buffer is already full.
>>
>> So I feel PERF_RECORD_LOST is more appropriate for synthesizing samples
>> from read_format->lost.
>
> The problem of PERF_RECORD_LOST is that it counts non-sample
> records too. It just counts every lost records and put the number
> when it can find a space in the ring buffer later. We don't know
> which one is lost and how many of it.
>
> Some users want to get the accurate number of samples even if it's
> not recorded in the ring buffer. Using PERF_RECORD_LOST can be
> confusing since the kernel will return inaccurate data in terms of the
> number of lost samples. So I chose PERF_RECORD_LOST_SAMPLES
> to return the accurate number for each event.

Ok. Thanks for the clarification.

Ravi