2010-08-26 14:54:42

by Stephane Eranian

[permalink] [raw]
Subject: [PATCH] perf_events: fixing time tracking for event with pid != -1 and cpu != -1

Per-thread events with a cpu filter, i.e., cpu != -1, were not reporting
correct timings when the thread never ran on the monitored cpu. The time
enabled was reported as a negative value.

This patch fixes the problem by updating tstamp_stopped, tstamp_running
in event_sched_out() for events with filters and which are marked as
INACTIVE.

The function group_sched_out() is modified to systematically call into
event_sched_out() to avoid duplicating the timing adjustment code twice.

With the patch, I now get:

$ task_cpu -i -e unhalted_core_cycles,unhalted_core_cycles noploop 2
noploop for 2 seconds
CPU0 0 unhalted_core_cycles (ena=1,991,136,594, run=0)
CPU0 0 unhalted_core_cycles (ena=1,991,136,594, run=0)

CPU1 0 unhalted_core_cycles (ena=1,991,136,594, run=0)
CPU1 0 unhalted_core_cycles (ena=1,991,136,594, run=0)

CPU2 0 unhalted_core_cycles (ena=1,991,136,594, run=0)
CPU2 0 unhalted_core_cycles (ena=1,991,136,594, run=0)

CPU3 4,747,990,931 unhalted_core_cycles (ena=1,991,136,594, run=1,991,136,594)
CPU3 4,747,990,931 unhalted_core_cycles (ena=1,991,136,594, run=1,991,136,594)

Signed-off-by: Stephane Eranian <[email protected]>

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 0d38f27..d196412 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -402,11 +402,31 @@ static void perf_group_detach(struct perf_event *event)
}
}

+static inline int
+event_filter_match(struct perf_event *event)
+{
+ return event->cpu == -1 || event->cpu == smp_processor_id();
+}
+
static void
event_sched_out(struct perf_event *event,
struct perf_cpu_context *cpuctx,
struct perf_event_context *ctx)
{
+ u64 delta;
+ /*
+ * an event which could not be activated because of
+ * filter mismatch still needs to have its timings
+ * maintained, otherwise bogus information is return
+ * via read() for time_enabled, time_running
+ */
+ if (event->state == PERF_EVENT_STATE_INACTIVE
+ && !event_filter_match(event)) {
+ delta = ctx->time - event->tstamp_stopped;
+ event->tstamp_running += delta;
+ event->tstamp_stopped = ctx->time;
+ }
+
if (event->state != PERF_EVENT_STATE_ACTIVE)
return;

@@ -432,9 +452,7 @@ group_sched_out(struct perf_event *group_event,
struct perf_event_context *ctx)
{
struct perf_event *event;
-
- if (group_event->state != PERF_EVENT_STATE_ACTIVE)
- return;
+ int state = group_event->state;

event_sched_out(group_event, cpuctx, ctx);

@@ -444,7 +462,7 @@ group_sched_out(struct perf_event *group_event,
list_for_each_entry(event, &group_event->sibling_list, group_entry)
event_sched_out(event, cpuctx, ctx);

- if (group_event->attr.exclusive)
+ if (state == PERF_EVENT_STATE_ACTIVE && group_event->attr.exclusive)
cpuctx->exclusive = 0;
}


2010-08-30 15:49:28

by Stephane Eranian

[permalink] [raw]
Subject: [tip:perf/urgent] perf_events: Fix time tracking for events with pid != -1 and cpu != -1

Commit-ID: fa66f07aa1f0950e1dc78b7ab39728b3f8aa77a1
Gitweb: http://git.kernel.org/tip/fa66f07aa1f0950e1dc78b7ab39728b3f8aa77a1
Author: Stephane Eranian <[email protected]>
AuthorDate: Thu, 26 Aug 2010 16:40:01 +0200
Committer: Ingo Molnar <[email protected]>
CommitDate: Mon, 30 Aug 2010 12:16:55 +0200

perf_events: Fix time tracking for events with pid != -1 and cpu != -1

Per-thread events with a cpu filter, i.e., cpu != -1, were not
reporting correct timings when the thread never ran on the
monitored cpu. The time enabled was reported as a negative
value.

This patch fixes the problem by updating tstamp_stopped,
tstamp_running in event_sched_out() for events with filters and
which are marked as INACTIVE.

The function group_sched_out() is modified to systematically
call into event_sched_out() to avoid duplicating the timing
adjustment code twice.

With the patch, I now get:

$ task_cpu -i -e unhalted_core_cycles,unhalted_core_cycles
noploop 2 noploop for 2 seconds
CPU0 0 unhalted_core_cycles (ena=1,991,136,594, run=0)
CPU0 0 unhalted_core_cycles (ena=1,991,136,594, run=0)

CPU1 0 unhalted_core_cycles (ena=1,991,136,594, run=0)
CPU1 0 unhalted_core_cycles (ena=1,991,136,594, run=0)

CPU2 0 unhalted_core_cycles (ena=1,991,136,594, run=0)
CPU2 0 unhalted_core_cycles (ena=1,991,136,594, run=0)

CPU3 4,747,990,931 unhalted_core_cycles (ena=1,991,136,594, run=1,991,136,594)
CPU3 4,747,990,931 unhalted_core_cycles (ena=1,991,136,594, run=1,991,136,594)

Signed-off-by: Stephane Eranian <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/perf_event.c | 26 ++++++++++++++++++++++----
1 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 403d180..657555a 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -402,11 +402,31 @@ static void perf_group_detach(struct perf_event *event)
}
}

+static inline int
+event_filter_match(struct perf_event *event)
+{
+ return event->cpu == -1 || event->cpu == smp_processor_id();
+}
+
static void
event_sched_out(struct perf_event *event,
struct perf_cpu_context *cpuctx,
struct perf_event_context *ctx)
{
+ u64 delta;
+ /*
+ * An event which could not be activated because of
+ * filter mismatch still needs to have its timings
+ * maintained, otherwise bogus information is return
+ * via read() for time_enabled, time_running:
+ */
+ if (event->state == PERF_EVENT_STATE_INACTIVE
+ && !event_filter_match(event)) {
+ delta = ctx->time - event->tstamp_stopped;
+ event->tstamp_running += delta;
+ event->tstamp_stopped = ctx->time;
+ }
+
if (event->state != PERF_EVENT_STATE_ACTIVE)
return;

@@ -432,9 +452,7 @@ group_sched_out(struct perf_event *group_event,
struct perf_event_context *ctx)
{
struct perf_event *event;
-
- if (group_event->state != PERF_EVENT_STATE_ACTIVE)
- return;
+ int state = group_event->state;

event_sched_out(group_event, cpuctx, ctx);

@@ -444,7 +462,7 @@ group_sched_out(struct perf_event *group_event,
list_for_each_entry(event, &group_event->sibling_list, group_entry)
event_sched_out(event, cpuctx, ctx);

- if (group_event->attr.exclusive)
+ if (state == PERF_EVENT_STATE_ACTIVE && group_event->attr.exclusive)
cpuctx->exclusive = 0;
}

2010-08-31 01:11:02

by Lin Ming

[permalink] [raw]
Subject: Re: [tip:perf/urgent] perf_events: Fix time tracking for events with pid != -1 and cpu != -1

>
> perf_events: Fix time tracking for events with pid != -1 and cpu != -1

Hi,

I thought pid != -1 and cpu != -1 is not valid.

# perf top -p 23268 -C 1
WARNING: PID switch overriding CPU

In which case this could be valid?

Thanks,
Lin Ming

>
> Per-thread events with a cpu filter, i.e., cpu != -1, were not
> reporting correct timings when the thread never ran on the
> monitored cpu. The time enabled was reported as a negative
> value.

2010-08-31 08:03:39

by Stephane Eranian

[permalink] [raw]
Subject: Re: [tip:perf/urgent] perf_events: Fix time tracking for events with pid != -1 and cpu != -1

On Tue, Aug 31, 2010 at 3:10 AM, Lin Ming <[email protected]> wrote:
>>
>> perf_events: Fix time tracking for events with pid != -1 and cpu != -1
>
> Hi,
>
> I thought pid != -1 and cpu != -1 is not valid.
>

This is in per-thread mode. This is what happens when you do
perf record foo for instance. This is sampling, but you can do
the same with counting.


> # perf top -p 23268 -C 1
> WARNING: PID switch overriding CPU
>
> In which case this could be valid?
>
> Thanks,
> Lin Ming
>
>>
>> Per-thread events with a cpu filter, i.e., cpu != -1, were not
>> reporting correct timings when the thread never ran on the
>> monitored cpu. The time enabled was reported as a negative
>> value.
>