2021-01-18 04:23:31

by Namhyung Kim

Subject: [PATCH] perf/core: Emit PERF_RECORD_LOST for pinned events

Currently, when a pinned event fails to be scheduled, we silently put
it in the error state and never try to schedule it again. That means
we won't get any samples for the event.

But there's no way for users to notice and respond to it. Let's
emit a lost event with a new misc bit to indicate this situation.

Signed-off-by: Namhyung Kim <[email protected]>
---
include/uapi/linux/perf_event.h | 2 ++
kernel/events/core.c | 36 +++++++++++++++++++++++++++++++++
2 files changed, 38 insertions(+)

diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b15e3447cd9f..3c0e115dd8b7 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -679,11 +679,13 @@ struct perf_event_mmap_page {
  *   PERF_RECORD_MISC_COMM_EXEC   - PERF_RECORD_COMM event
  *   PERF_RECORD_MISC_FORK_EXEC   - PERF_RECORD_FORK event (perf internal)
  *   PERF_RECORD_MISC_SWITCH_OUT  - PERF_RECORD_SWITCH* events
+ *   PERF_RECORD_MISC_LOST_PINNED - PERF_RECORD_LOST event
  */
 #define PERF_RECORD_MISC_MMAP_DATA	(1 << 13)
 #define PERF_RECORD_MISC_COMM_EXEC	(1 << 13)
 #define PERF_RECORD_MISC_FORK_EXEC	(1 << 13)
 #define PERF_RECORD_MISC_SWITCH_OUT	(1 << 13)
+#define PERF_RECORD_MISC_LOST_PINNED	(1 << 13)
 /*
  * These PERF_RECORD_MISC_* flags below are safely reused
  * for the following events:
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 55d18791a72d..523927575434 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3654,6 +3654,8 @@ static noinline int visit_groups_merge(struct perf_cpu_context *cpuctx,
 	return 0;
 }

+static void perf_log_lost_event(struct perf_event *event);
+
 static int merge_sched_in(struct perf_event *event, void *data)
 {
 	struct perf_event_context *ctx = event->ctx;
@@ -3675,6 +3677,7 @@ static int merge_sched_in(struct perf_event *event, void *data)
 		if (event->attr.pinned) {
 			perf_cgroup_event_disable(event, ctx);
 			perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
+			perf_log_lost_event(event);
 		}

 		*can_add_hw = 0;
@@ -8414,6 +8417,39 @@ void perf_event_aux_event(struct perf_event *event, unsigned long head,
 	perf_output_end(&handle);
 }

+/*
+ * failed/errored events logging
+ */
+static void perf_log_lost_event(struct perf_event *event)
+{
+	struct perf_output_handle handle;
+	struct perf_sample_data sample;
+	int ret;
+	struct {
+		struct perf_event_header header;
+		u64 id;
+		u64 lost;
+	} lost_event = {
+		.header = {
+			.type = PERF_RECORD_LOST,
+			.misc = PERF_RECORD_MISC_LOST_PINNED,
+			.size = sizeof(lost_event),
+		},
+		.id = event->id,
+	};
+
+	perf_event_header__init_id(&lost_event.header, &sample, event);
+
+	ret = perf_output_begin(&handle, &sample, event,
+				lost_event.header.size);
+	if (ret)
+		return;
+
+	perf_output_put(&handle, lost_event);
+	perf_event__output_id_sample(event, &handle, &sample);
+	perf_output_end(&handle);
+}
+
 /*
  * Lost/dropped samples logging
  */
--
2.30.0.284.gd98b1dd5eaa7-goog
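
For readers on the tool side, here is a minimal sketch of how a consumer
might recognize the record this patch emits. It is not taken from perf
itself. The sketch_* names are hypothetical local mirrors of the uapi
definitions, and the misc bit only exists if this patch is applied; the
value 2 for PERF_RECORD_LOST and the header layout follow perf_event.h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local mirrors so the sketch builds without kernel headers. */
#define SKETCH_PERF_RECORD_LOST	2		/* PERF_RECORD_LOST in perf_event.h */
#define SKETCH_MISC_LOST_PINNED	(1 << 13)	/* proposed by this patch only */

struct sketch_header {		/* same layout as struct perf_event_header */
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

struct sketch_lost {		/* header + id + lost, as built by the patch */
	struct sketch_header header;
	uint64_t id;
	uint64_t lost;
};

/*
 * Return 1 and store the event id if buf holds a LOST record carrying the
 * "pinned event failed to schedule" bit, 0 otherwise.
 */
static int is_lost_pinned(const void *buf, size_t len, uint64_t *id)
{
	struct sketch_lost rec;

	if (len < sizeof(rec))
		return 0;
	memcpy(&rec, buf, sizeof(rec));	/* avoid unaligned access */

	/* The record may be longer (sample id trailer), never shorter. */
	if (rec.header.size < sizeof(rec))
		return 0;
	if (rec.header.type != SKETCH_PERF_RECORD_LOST)
		return 0;
	if (!(rec.header.misc & SKETCH_MISC_LOST_PINNED))
		return 0;

	*id = rec.id;
	return 1;
}
```

Since bit 13 is shared with other PERF_RECORD_MISC_* flags, the type
check has to come before the misc check, as above.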


2021-01-18 21:23:01

by Peter Zijlstra

Subject: Re: [PATCH] perf/core: Emit PERF_RECORD_LOST for pinned events

On Mon, Jan 18, 2021 at 12:43:23PM +0900, Namhyung Kim wrote:
> As of now we silently ignore pinned events when it's failed to be
> scheduled and make it error state not try to schedule it again.
> That means we won't get any samples for the event.
>
> But there's no way for users to notice and respond to it. Let's
> emit a lost event with a new misc bit to indicate this situation.

Users should get a read(2) error IIRC, does that not work?

2021-01-19 04:17:41

by Peter Zijlstra

Subject: Re: [PATCH] perf/core: Emit PERF_RECORD_LOST for pinned events

On Mon, Jan 18, 2021 at 08:44:20PM +0900, Namhyung Kim wrote:
> Hi Peter,
>
> On Mon, Jan 18, 2021 at 7:11 PM Peter Zijlstra <[email protected]> wrote:
> >
> > On Mon, Jan 18, 2021 at 12:43:23PM +0900, Namhyung Kim wrote:
> > > As of now we silently ignore pinned events when it's failed to be
> > > scheduled and make it error state not try to schedule it again.
> > > That means we won't get any samples for the event.
> > >
> > > But there's no way for users to notice and respond to it. Let's
> > > emit a lost event with a new misc bit to indicate this situation.
> >
> > Users should get a read(2) error IIRC, does that not work?
>
> Ah, right. maybe I'm too specific to perf record's perspective.
>
> In perf record, it doesn't use read(2) so I thought it should
> have the information in the stream of sample data.

perf-record could of course do a read() at the end, to detect this.

I don't think I object to having an event in the stream, but your LOST
event is unfortunate in that it itself can get lost when there's no
space in the buffer (which arguably is unlikely, but still).

So from that point of view, I think overloading LOST is not so very nice
for this.
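
A rough sketch of the read() check suggested above, as a tool might do
it at the end of a session. It assumes the default read_format (a single
u64 count) and relies on the behaviour documented in perf_event_open(2),
where a pinned event in the error state makes read(2) return 0. The
helper name and the enum are made up for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

enum counter_status {
	COUNTER_OK,
	COUNTER_ERROR_STATE,	/* read() hit EOF: event is in error state */
	COUNTER_READ_FAILED,	/* short read or errno set */
};

/* Read one u64 count from fd and classify the outcome. */
static enum counter_status check_counter(int fd, uint64_t *count)
{
	ssize_t n;

	do {
		n = read(fd, count, sizeof(*count));
	} while (n < 0 && errno == EINTR);

	if (n == 0)
		return COUNTER_ERROR_STATE;
	if (n != (ssize_t)sizeof(*count))
		return COUNTER_READ_FAILED;
	return COUNTER_OK;
}
```

A tool would call this once per event fd before closing it and warn for
every fd that comes back as COUNTER_ERROR_STATE.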

2021-01-19 04:28:27

by Namhyung Kim

Subject: Re: [PATCH] perf/core: Emit PERF_RECORD_LOST for pinned events

Hi Peter,

On Mon, Jan 18, 2021 at 7:11 PM Peter Zijlstra <[email protected]> wrote:
>
> On Mon, Jan 18, 2021 at 12:43:23PM +0900, Namhyung Kim wrote:
> > As of now we silently ignore pinned events when it's failed to be
> > scheduled and make it error state not try to schedule it again.
> > That means we won't get any samples for the event.
> >
> > But there's no way for users to notice and respond to it. Let's
> > emit a lost event with a new misc bit to indicate this situation.
>
> Users should get a read(2) error IIRC, does that not work?

Ah, right. Maybe I'm too focused on perf record's perspective.

perf record doesn't use read(2), so I thought the information should
be in the stream of sample data.

Thanks,
Namhyung

2021-01-19 05:40:29

by Namhyung Kim

Subject: Re: [PATCH] perf/core: Emit PERF_RECORD_LOST for pinned events

On Mon, Jan 18, 2021 at 9:56 PM Peter Zijlstra <[email protected]> wrote:
>
> On Mon, Jan 18, 2021 at 08:44:20PM +0900, Namhyung Kim wrote:
> > Hi Peter,
> >
> > On Mon, Jan 18, 2021 at 7:11 PM Peter Zijlstra <[email protected]> wrote:
> > >
> > > On Mon, Jan 18, 2021 at 12:43:23PM +0900, Namhyung Kim wrote:
> > > > As of now we silently ignore pinned events when it's failed to be
> > > > scheduled and make it error state not try to schedule it again.
> > > > That means we won't get any samples for the event.
> > > >
> > > > But there's no way for users to notice and respond to it. Let's
> > > > emit a lost event with a new misc bit to indicate this situation.
> > >
> > > Users should get a read(2) error IIRC, does that not work?
> >
> > Ah, right. maybe I'm too specific to perf record's perspective.
> >
> > In perf record, it doesn't use read(2) so I thought it should
> > have the information in the stream of sample data.
>
> perf-record could of course do a read() at the end, to detect this.

OK, will add that.

>
> I don't think I object to having an event in the stream, but your LOST
> event is unfortunate in that it itself can get lost when there's no
> space in the buffer (which arguably is unlikely, but still).
>
> So from that point of view, I think overloading LOST is not so very nice
> for this.

But anything can get lost in case of no space.
Do you want to use something other than the LOST event?

Thanks,
Namhyung

2021-01-19 05:47:00

by Andi Kleen

Subject: Re: [PATCH] perf/core: Emit PERF_RECORD_LOST for pinned events

> > I don't think I object to having an event in the stream, but your LOST
> > event is unfortunate in that it itself can get lost when there's no
> > space in the buffer (which arguably is unlikely, but still).
> >
> > So from that point of view, I think overloading LOST is not so very nice
> > for this.
>
> But anything can get lost in case of no space.
> Do you want to use something other than the LOST event?

Could always reserve the last entry in the ring buffer for a LOST event,
that would guarantee you can always get one out.

-Andi

2021-01-19 05:47:36

by Namhyung Kim

Subject: Re: [PATCH] perf/core: Emit PERF_RECORD_LOST for pinned events

Hi Andi,

On Tue, Jan 19, 2021 at 11:47 AM Andi Kleen <[email protected]> wrote:
>
> > > I don't think I object to having an event in the stream, but your LOST
> > > event is unfortunate in that it itself can get lost when there's no
> > > space in the buffer (which arguably is unlikely, but still).
> > >
> > > So from that point of view, I think overloading LOST is not so very nice
> > > for this.
> >
> > But anything can get lost in case of no space.
> > Do you want to use something other than the LOST event?
>
> Could always reserve the last entry in the ring buffer for a LOST event,
> that would guarantee you can always get one out.

A problem is that we can have more than one event that failed.

In my understanding, we keep the lost count and add a LOST event
when there's space later. So we could probably keep a list of the
failed events and do the same for each of them. Or just use a single
event to notify that some number of events failed.

Thanks,
Namhyung
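
To make the "keep the lost count and add a LOST event when there's space
later" idea concrete, here is a toy userspace model. It is not the
kernel ring buffer code; the toy_* names, the fixed 8-slot array and the
record layout are all invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

enum { REC_SAMPLE = 1, REC_LOST = 2 };

struct toy_record {
	int type;
	uint64_t value;		/* sample payload, or number of drops */
};

struct toy_ring {
	struct toy_record slots[8];
	size_t used;		/* records currently queued */
	size_t capacity;	/* usable slots, at most 8 */
	uint64_t lost;		/* drops not yet reported to the reader */
};

/* The consumer reads everything, which frees all slots. */
static void toy_drain(struct toy_ring *rb)
{
	rb->used = 0;
}

/* Queue one sample. Returns 0 on success, -1 if it had to be dropped. */
static int toy_write(struct toy_ring *rb, uint64_t value)
{
	/* A pending LOST report needs its own slot ahead of the sample. */
	size_t need = rb->lost ? 2 : 1;

	if (rb->used + need > rb->capacity) {
		rb->lost++;	/* remember the drop, report it later */
		return -1;
	}

	if (rb->lost) {
		rb->slots[rb->used++] =
			(struct toy_record){ REC_LOST, rb->lost };
		rb->lost = 0;
	}
	rb->slots[rb->used++] = (struct toy_record){ REC_SAMPLE, value };
	return 0;
}
```

The model also shows the limitation raised above: a single counter can
say how many records were dropped, but not which events failed. That
would need a per-event list or a per-event flag.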

2021-01-20 12:20:21

by Namhyung Kim

Subject: Re: [PATCH] perf/core: Emit PERF_RECORD_LOST for pinned events

On Tue, Jan 19, 2021 at 12:11 PM Namhyung Kim <[email protected]> wrote:
>
> Hi Andi,
>
> On Tue, Jan 19, 2021 at 11:47 AM Andi Kleen <[email protected]> wrote:
> >
> > > > I don't think I object to having an event in the stream, but your LOST
> > > > event is unfortunate in that it itself can get lost when there's no
> > > > space in the buffer (which arguably is unlikely, but still).
> > > >
> > > > So from that point of view, I think overloading LOST is not so very nice
> > > > for this.
> > >
> > > But anything can get lost in case of no space.
> > > Do you want to use something other than the LOST event?
> >
> > Could always reserve the last entry in the ring buffer for a LOST event,
> > that would guarantee you can always get one out.
>
> A problem is that we can have more than one event that failed.
>
> In my understanding, we keep the lost count and add a LOST event
> when there's a space later. So probably we can keep a list of the
> failed events and do similar for each event. Or just use a single
> event to notify some number of events were failed.

Stephane suggested emitting an event for poll() like EPOLLERR or
EPOLLHUP. I'll take a look at that.

Thanks,
Namhyung

2021-01-20 13:09:02

by Arnaldo Carvalho de Melo

Subject: Re: [PATCH] perf/core: Emit PERF_RECORD_LOST for pinned events

Em Wed, Jan 20, 2021 at 08:53:48PM +0900, Namhyung Kim escreveu:
> On Tue, Jan 19, 2021 at 12:11 PM Namhyung Kim <[email protected]> wrote:
> >
> > Hi Andi,
> >
> > On Tue, Jan 19, 2021 at 11:47 AM Andi Kleen <[email protected]> wrote:
> > >
> > > > > I don't think I object to having an event in the stream, but your LOST
> > > > > event is unfortunate in that it itself can get lost when there's no
> > > > > space in the buffer (which arguably is unlikely, but still).
> > > > >
> > > > > So from that point of view, I think overloading LOST is not so very nice
> > > > > for this.
> > > >
> > > > But anything can get lost in case of no space.
> > > > Do you want to use something other than the LOST event?
> > >
> > > Could always reserve the last entry in the ring buffer for a LOST event,
> > > that would guarantee you can always get one out.
> >
> > A problem is that we can have more than one event that failed.
> >
> > In my understanding, we keep the lost count and add a LOST event
> > when there's a space later. So probably we can keep a list of the
> > failed events and do similar for each event. Or just use a single
> > event to notify some number of events were failed.
>
> Stephane suggested emitting an event for poll() like EPOLLERR or
> EPOLLHUP. I'll take a look at that.

Looks sane, that way the poll returns immediately when we start seeing
lost events, so tools can warn the user and then, if/when space becomes
available, tell how many events were lost.

- Arnaldo
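
A sketch of what the poll() approach would look like from the tool
side, assuming the kernel reports a failed pinned event through POLLERR
or POLLHUP on the event fd. Which of the two bits would be used is not
settled in this thread, so the sketch accepts either. wait_event_fd() is
a hypothetical helper name.

```c
#include <poll.h>

/*
 * Wait up to timeout_ms on fd.
 * Returns  1 if data is readable,
 *         -1 if the fd reports POLLERR or POLLHUP,
 *          0 on timeout or if poll() itself failed.
 */
static int wait_event_fd(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	int n = poll(&pfd, 1, timeout_ms);

	if (n <= 0)
		return 0;
	/* Error conditions take priority over pending data. */
	if (pfd.revents & (POLLERR | POLLHUP))
		return -1;
	return (pfd.revents & POLLIN) ? 1 : 0;
}
```

On a -1 return the tool can warn the user right away and stop polling
that fd, rather than waiting for samples that will never arrive.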