2009-10-14 05:58:54

by Paul Mackerras

[permalink] [raw]
Subject: [PATCH] perf_event: Adjust frequency and unthrottle for non-group-leader events

The loop in perf_ctx_adjust_freq checks the frequency of sampling
event counters, and adjusts the event interval and unthrottles the
event if required, and resets the interrupt count for the event.
However, at present it only looks at group leaders.

This means that a sampling event that is not a group leader will
eventually get throttled, once its interrupt count reaches
sysctl_perf_event_sample_rate/HZ --- and that is guaranteed to
happen, if the event is active for long enough, since the interrupt
count never gets reset. Once it is throttled it never gets
unthrottled, so it basically just stops working at that point.

This fixes it by making perf_ctx_adjust_freq use ctx->event_list
rather than ctx->group_list. The existing spin_lock/spin_unlock
around the loop makes it unnecessary to put rcu_read_lock/
rcu_read_unlock around the list_for_each_entry_rcu().

Reported-by: Mark W. Krentel <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
---
Apparently this bug was only seen on powerpc, and not on x86. I have
no idea why.

Looks to me like we want a similar patch for 2.6.31 with some names
changed (event -> counter etc.). If people agree with this patch I'll
do one for stable.

kernel/perf_event.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 76ac4db..1d6b938 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -1368,7 +1368,7 @@ static void perf_ctx_adjust_freq(struct perf_event_context *ctx)
u64 interrupts, freq;

spin_lock(&ctx->lock);
- list_for_each_entry(event, &ctx->group_list, group_entry) {
+ list_for_each_entry_rcu(event, &ctx->event_list, event_entry) {
if (event->state != PERF_EVENT_STATE_ACTIVE)
continue;

--
1.6.0.4


2009-10-14 07:01:09

by Paul Mackerras

[permalink] [raw]
Subject: [tip:perf/urgent] perf_event: Adjust frequency and unthrottle for non-group-leader events

Commit-ID: 03541f8b69c058162e4cf9675ec9181e6a204d55
Gitweb: http://git.kernel.org/tip/03541f8b69c058162e4cf9675ec9181e6a204d55
Author: Paul Mackerras <[email protected]>
AuthorDate: Wed, 14 Oct 2009 16:58:03 +1100
Committer: Ingo Molnar <[email protected]>
CommitDate: Wed, 14 Oct 2009 08:39:32 +0200

perf_event: Adjust frequency and unthrottle for non-group-leader events

The loop in perf_ctx_adjust_freq checks the frequency of sampling
event counters, and adjusts the event interval and unthrottles the
event if required, and resets the interrupt count for the event.
However, at present it only looks at group leaders.

This means that a sampling event that is not a group leader will
eventually get throttled, once its interrupt count reaches
sysctl_perf_event_sample_rate/HZ --- and that is guaranteed to
happen, if the event is active for long enough, since the interrupt
count never gets reset. Once it is throttled it never gets
unthrottled, so it basically just stops working at that point.

This fixes it by making perf_ctx_adjust_freq use ctx->event_list
rather than ctx->group_list. The existing spin_lock/spin_unlock
around the loop makes it unnecessary to put rcu_read_lock/
rcu_read_unlock around the list_for_each_entry_rcu().

Reported-by: Mark W. Krentel <[email protected]>
Signed-off-by: Paul Mackerras <[email protected]>
Cc: Corey Ashford <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
---
kernel/perf_event.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 9d0b5c6..afb7ef3 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -1355,7 +1355,7 @@ static void perf_ctx_adjust_freq(struct perf_event_context *ctx)
u64 interrupts, freq;

spin_lock(&ctx->lock);
- list_for_each_entry(event, &ctx->group_list, group_entry) {
+ list_for_each_entry_rcu(event, &ctx->event_list, event_entry) {
if (event->state != PERF_EVENT_STATE_ACTIVE)
continue;

2009-10-14 10:55:10

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] perf_event: Adjust frequency and unthrottle for non-group-leader events

On Wed, 2009-10-14 at 16:58 +1100, Paul Mackerras wrote:
> The loop in perf_ctx_adjust_freq checks the frequency of sampling
> event counters, and adjusts the event interval and unthrottles the
> event if required, and resets the interrupt count for the event.
> However, at present it only looks at group leaders.
>
> This means that a sampling event that is not a group leader will
> eventually get throttled, once its interrupt count reaches
> sysctl_perf_event_sample_rate/HZ --- and that is guaranteed to
> happen, if the event is active for long enough, since the interrupt
> count never gets reset. Once it is throttled it never gets
> unthrottled, so it basically just stops working at that point.
>
> This fixes it by making perf_ctx_adjust_freq use ctx->event_list
> rather than ctx->group_list. The existing spin_lock/spin_unlock
> around the loop makes it unnecessary to put rcu_read_lock/
> rcu_read_unlock around the list_for_each_entry_rcu().
>
> Reported-by: Mark W. Krentel <[email protected]>
> Signed-off-by: Paul Mackerras <[email protected]>

Looks good, thanks Paul!

Acked-by: Peter Zijlstra <[email protected]>

> ---
> Apparently this bug was only seen on powerpc, and not on x86. I have
> no idea why.

Weird, should work the same way on both of them.