2008-10-30 02:18:23

by Theodore Ts'o

[permalink] [raw]
Subject: [REGRESSION 2.6.28-rc2-git3] lots of extra timer interrupts costing 2W

I've noticed that sometime between 2.6.27-git7 and 2.6.28-rc2-git3,
powertop is reporting a huge number of excess timer interrupts on my
x61s laptop:

Wakeups-from-idle per second : 617.8 interval: 3.1s
no ACPI power usage estimate available

Top causes for wakeups:
71.3% (543.7) <interrupt> : extra timer interrupt
13.5% (103.0) <kernel core> : hrtimer_start_expires (tick_sched_timer)
7.2% ( 55.0) <kernel IPI> : Rescheduling interrupts
1.7% ( 13.0) gnome-terminal : hrtimer_start_expires (hrtimer_wakeup)
1.4% ( 10.3) <interrupt> : PS/2 keyboard/mouse/touchpad
1.1% ( 8.3) <interrupt> : ahci
0.4% ( 3.3) multiload-apple : hrtimer_start_expires (hrtimer_wakeup)
0.3% ( 2.7) Xorg : hrtimer_start (it_real_fn)


This is costing me an extra 2 Watts or so. Does this ring a bell with
anyone, before I start trying to bisect this problem?

- Ted


2008-10-31 03:32:33

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [REGRESSION 2.6.28-rc2-git3] lots of extra timer interrupts costing 2W

On Wed, Oct 29, 2008 at 10:18:10PM -0400, Theodore Ts'o wrote:
> I've noticed that sometime between 2.6.27-git7 and 2.6.28-rc2-git3,
> powertop is reporting a huge number of excess timer interrupts on my
> x61s laptop:
>
> Wakeups-from-idle per second : 617.8 interval: 3.1s
> no ACPI power usage estimate available
>
> Top causes for wakeups:
> 71.3% (543.7) <interrupt> : extra timer interrupt
> 13.5% (103.0) <kernel core> : hrtimer_start_expires (tick_sched_timer)
> 7.2% ( 55.0) <kernel IPI> : Rescheduling interrupts
> 1.7% ( 13.0) gnome-terminal : hrtimer_start_expires (hrtimer_wakeup)
> 1.4% ( 10.3) <interrupt> : PS/2 keyboard/mouse/touchpad
> 1.1% ( 8.3) <interrupt> : ahci
> 0.4% ( 3.3) multiload-apple : hrtimer_start_expires (hrtimer_wakeup)
> 0.3% ( 2.7) Xorg : hrtimer_start (it_real_fn)
>
>
> This is costing me an extra 2 Watts or so. Does this ring a bell with
> anyone, before I start trying to bisect this problem?

I found the problem via bisection. It's commit fb02fbc1, "NOHZ:
restart tick device from irq_enter()".

After bisecting the problem, I compiled v2.6.28-rc2 (with the attached
.config file), verified that I saw the extra 500 or so timer
interrupts, and then reverted the commit, and the extra timer
interrupts went away. A more careful measurement of the power drain
difference makes it appear to be more like 0.8 watts.

This is on an Lenovo X61s laptop. I'm not sure that simply reverting
the commit is the right fix, but it does seem to make the extra timer
interrupts (and the resulting decrease in average C3 residency time)
go away.

- Ted

commit 6698aaa8ee4435c167b91926fe06d6e4257a020c
Author: Theodore Ts'o <[email protected]>
Date: Thu Oct 30 22:48:17 2008 -0400

Revert "NOHZ: restart tick device from irq_enter()"

This reverts commit fb02fbc14d17837b4b7b02dbb36142c16a7bf208.

Conflicts:

kernel/time/tick-sched.c

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index f98a1b7..cb01cd8 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -384,19 +384,6 @@ int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
}

/*
- * Called from irq_enter() when idle was interrupted to reenable the
- * per cpu device.
- */
-void tick_check_oneshot_broadcast(int cpu)
-{
- if (cpu_isset(cpu, tick_broadcast_oneshot_mask)) {
- struct tick_device *td = &per_cpu(tick_cpu_device, cpu);
-
- clockevents_set_mode(td->evtdev, CLOCK_EVT_MODE_ONESHOT);
- }
-}
-
-/*
* Handle oneshot mode broadcasting
*/
static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index b1c05bf..4692487 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -36,7 +36,6 @@ extern void tick_broadcast_switch_to_oneshot(void);
extern void tick_shutdown_broadcast_oneshot(unsigned int *cpup);
extern int tick_resume_broadcast_oneshot(struct clock_event_device *bc);
extern int tick_broadcast_oneshot_active(void);
-extern void tick_check_oneshot_broadcast(int cpu);
# else /* BROADCAST */
static inline void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
{
@@ -46,7 +45,6 @@ static inline void tick_broadcast_oneshot_control(unsigned long reason) { }
static inline void tick_broadcast_switch_to_oneshot(void) { }
static inline void tick_shutdown_broadcast_oneshot(unsigned int *cpup) { }
static inline int tick_broadcast_oneshot_active(void) { return 0; }
-static inline void tick_check_oneshot_broadcast(int cpu) { }
# endif /* !BROADCAST */

#else /* !ONESHOT */
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 5bbb104..c8b3d8d 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -510,6 +510,10 @@ static void tick_nohz_handler(struct clock_event_device *dev)
update_process_times(user_mode(regs));
profile_tick(CPU_PROFILING);

+ /* Do not restart, when we are in the idle loop */
+ if (ts->tick_stopped)
+ return;
+
while (tick_nohz_reprogram(ts, now)) {
now = ktime_get();
tick_do_update_jiffies64(now);
@@ -555,37 +559,6 @@ static void tick_nohz_switch_to_nohz(void)
smp_processor_id());
}

-/*
- * When NOHZ is enabled and the tick is stopped, we need to kick the
- * tick timer from irq_enter() so that the jiffies update is kept
- * alive during long running softirqs. That's ugly as hell, but
- * correctness is key even if we need to fix the offending softirq in
- * the first place.
- *
- * Note, this is different to tick_nohz_restart. We just kick the
- * timer and do not touch the other magic bits which need to be done
- * when idle is left.
- */
-static void tick_nohz_kick_tick(int cpu)
-{
- struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
- ktime_t delta, now;
-
- if (!ts->tick_stopped)
- return;
-
- /*
- * Do not touch the tick device, when the next expiry is either
- * already reached or less/equal than the tick period.
- */
- now = ktime_get();
- delta = ktime_sub(hrtimer_get_expires(&ts->sched_timer), now);
- if (delta.tv64 <= tick_period.tv64)
- return;
-
- tick_nohz_restart(ts, now);
-}
-
#else

static inline void tick_nohz_switch_to_nohz(void) { }
@@ -597,11 +570,9 @@ static inline void tick_nohz_switch_to_nohz(void) { }
*/
void tick_check_idle(int cpu)
{
- tick_check_oneshot_broadcast(cpu);
#ifdef CONFIG_NO_HZ
tick_nohz_stop_idle(cpu);
tick_nohz_update_jiffies();
- tick_nohz_kick_tick(cpu);
#endif
}

@@ -658,6 +629,10 @@ static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
profile_tick(CPU_PROFILING);
}

+ /* Do not restart, when we are in the idle loop */
+ if (ts->tick_stopped)
+ return HRTIMER_NORESTART;
+
hrtimer_forward(timer, now, tick_period);

return HRTIMER_RESTART;



Attachments:
(No filename) (5.76 kB)
.config (68.69 kB)
Download all attachments

2008-10-31 10:21:54

by Elias Oltmanns

[permalink] [raw]
Subject: Re: [REGRESSION 2.6.28-rc2-git3] lots of extra timer interrupts costing 2W

Theodore Tso <[email protected]> wrote:
> On Wed, Oct 29, 2008 at 10:18:10PM -0400, Theodore Ts'o wrote:
>> I've noticed that sometime between 2.6.27-git7 and 2.6.28-rc2-git3,
>
>> powertop is reporting a huge number of excess timer interrupts on my
>> x61s laptop:
>>
>> Wakeups-from-idle per second : 617.8 interval: 3.1s
>> no ACPI power usage estimate available
>>
>> Top causes for wakeups:
>> 71.3% (543.7) <interrupt> : extra timer interrupt
>> 13.5% (103.0) <kernel core> : hrtimer_start_expires (tick_sched_timer)
>> 7.2% ( 55.0) <kernel IPI> : Rescheduling interrupts
>> 1.7% ( 13.0) gnome-terminal : hrtimer_start_expires (hrtimer_wakeup)
>> 1.4% ( 10.3) <interrupt> : PS/2 keyboard/mouse/touchpad
>> 1.1% ( 8.3) <interrupt> : ahci
>> 0.4% ( 3.3) multiload-apple : hrtimer_start_expires (hrtimer_wakeup)
>> 0.3% ( 2.7) Xorg : hrtimer_start (it_real_fn)
>>
>>
>> This is costing me an extra 2 Watts or so. Does this ring a bell with
>> anyone, before I start trying to bisect this problem?
>
> I found the problem via bisection. It's commit fb02fbc1, "NOHZ:
> restart tick device from irq_enter()".
>
> After bisecting the problem, I compiled v2.6.28-rc2 (with the attached
> .config file), verified that I saw the extra 500 or so timer
> interrupts, and then reverted the commit, and the extra timer
> interrupts went away. A more careful measurement of the power drain
> difference makes it appear to be more like 0.8 watts.

Strange, I can't reproduce this problem here (tested next-20081029 and
2.6.27.4 + some commits including the one in question). However, I'm
wondering whether this is simply due to a certain config option since
extra timer interrupts doesn't even show up in my powertop output. This
is with NO_HZ=y and HIGH_RES_TIMERS=y/n. Could SMP possibly make any
difference here? It's disabled in my config.

>
> This is on an Lenovo X61s laptop. I'm not sure that simply reverting
> the commit is the right fix, but it does seem to make the extra timer
> interrupts (and the resulting decrease in average C3 residency time)
> go away.

Well, I'd be rather unhappy, for obvious reasons, if this commit was
simply reverted. But, of course, something has got to be done about it.

Regards,

Elias

2008-11-17 08:56:47

by Theodore Ts'o

[permalink] [raw]
Subject: [Bug 11905] REGRESSION: lots of extra timer interrupts costing power

Hi, what's going on with this regression? To refresh people's memory,
I tracked down the excess interrupts to a specific commit, fb02fbc1,
and I've been running quite happily with a patch to revert this
commit.

I'm not sure it's the right thing, but if no one has had any time to
figure out what's wrong with the original commit, can we please revert
it for 2.6.28? I reported it over two weeks ago, and I've not heard
anything from about alternative ways of fixing this regression.

Thanks!!

- Ted


commit c38a3882eeeb6dcb45c99f29c7b95595606eaf4d
Author: Theodore Ts'o <[email protected]>
Date: Fri Oct 31 12:37:31 2008 -0400

Revert "NOHZ: restart tick device from irq_enter()"

This reverts commit fb02fbc14d17837b4b7b02dbb36142c16a7bf208 to avoid
a large number of excess interrupts which burns a significant amount
on a X61s laptop (and probably many more); others have reported this
problem as well on the bugzilla entry:

http://bugzilla.kernel.org/show_bug.cgi?id=11905

Conflicts:

kernel/time/tick-sched.c

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index f98a1b7..cb01cd8 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -384,19 +384,6 @@ int tick_resume_broadcast_oneshot(struct clock_event_device *bc)
}

/*
- * Called from irq_enter() when idle was interrupted to reenable the
- * per cpu device.
- */
-void tick_check_oneshot_broadcast(int cpu)
-{
- if (cpu_isset(cpu, tick_broadcast_oneshot_mask)) {
- struct tick_device *td = &per_cpu(tick_cpu_device, cpu);
-
- clockevents_set_mode(td->evtdev, CLOCK_EVT_MODE_ONESHOT);
- }
-}
-
-/*
* Handle oneshot mode broadcasting
*/
static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index b1c05bf..4692487 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -36,7 +36,6 @@ extern void tick_broadcast_switch_to_oneshot(void);
extern void tick_shutdown_broadcast_oneshot(unsigned int *cpup);
extern int tick_resume_broadcast_oneshot(struct clock_event_device *bc);
extern int tick_broadcast_oneshot_active(void);
-extern void tick_check_oneshot_broadcast(int cpu);
# else /* BROADCAST */
static inline void tick_broadcast_setup_oneshot(struct clock_event_device *bc)
{
@@ -46,7 +45,6 @@ static inline void tick_broadcast_oneshot_control(unsigned long reason) { }
static inline void tick_broadcast_switch_to_oneshot(void) { }
static inline void tick_shutdown_broadcast_oneshot(unsigned int *cpup) { }
static inline int tick_broadcast_oneshot_active(void) { return 0; }
-static inline void tick_check_oneshot_broadcast(int cpu) { }
# endif /* !BROADCAST */

#else /* !ONESHOT */
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 5bbb104..c8b3d8d 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -510,6 +510,10 @@ static void tick_nohz_handler(struct clock_event_device *dev)
update_process_times(user_mode(regs));
profile_tick(CPU_PROFILING);

+ /* Do not restart, when we are in the idle loop */
+ if (ts->tick_stopped)
+ return;
+
while (tick_nohz_reprogram(ts, now)) {
now = ktime_get();
tick_do_update_jiffies64(now);
@@ -555,37 +559,6 @@ static void tick_nohz_switch_to_nohz(void)
smp_processor_id());
}

-/*
- * When NOHZ is enabled and the tick is stopped, we need to kick the
- * tick timer from irq_enter() so that the jiffies update is kept
- * alive during long running softirqs. That's ugly as hell, but
- * correctness is key even if we need to fix the offending softirq in
- * the first place.
- *
- * Note, this is different to tick_nohz_restart. We just kick the
- * timer and do not touch the other magic bits which need to be done
- * when idle is left.
- */
-static void tick_nohz_kick_tick(int cpu)
-{
- struct tick_sched *ts = &per_cpu(tick_cpu_sched, cpu);
- ktime_t delta, now;
-
- if (!ts->tick_stopped)
- return;
-
- /*
- * Do not touch the tick device, when the next expiry is either
- * already reached or less/equal than the tick period.
- */
- now = ktime_get();
- delta = ktime_sub(hrtimer_get_expires(&ts->sched_timer), now);
- if (delta.tv64 <= tick_period.tv64)
- return;
-
- tick_nohz_restart(ts, now);
-}
-
#else

static inline void tick_nohz_switch_to_nohz(void) { }
@@ -597,11 +570,9 @@ static inline void tick_nohz_switch_to_nohz(void) { }
*/
void tick_check_idle(int cpu)
{
- tick_check_oneshot_broadcast(cpu);
#ifdef CONFIG_NO_HZ
tick_nohz_stop_idle(cpu);
tick_nohz_update_jiffies();
- tick_nohz_kick_tick(cpu);
#endif
}

@@ -658,6 +629,10 @@ static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
profile_tick(CPU_PROFILING);
}

+ /* Do not restart, when we are in the idle loop */
+ if (ts->tick_stopped)
+ return HRTIMER_NORESTART;
+
hrtimer_forward(timer, now, tick_period);

return HRTIMER_RESTART;

2008-11-17 09:37:12

by Frans Pop

[permalink] [raw]
Subject: Re: [Bug 11905] REGRESSION: lots of extra timer interrupts costing power

Theodore Tso wrote:
> Hi, what's going on with this regression? To refresh people's memory,
> I tracked down the excess interrupts to a specific commit, fb02fbc1,
> and I've been running quite happily with a patch to revert this
> commit.

Isn't that solved by this commit (included in -rc5)?

commit ae99286b4f1be7788f2d6947c66a91dbd6351eec
Author: Thomas Gleixner <[email protected]>
Date: Mon Nov 10 13:20:23 2008 +0100

nohz: disable tick_nohz_kick_tick() for now

2008-11-17 19:57:05

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [Bug 11905] REGRESSION: lots of extra timer interrupts costing power

On Mon, Nov 17, 2008 at 10:36:48AM +0100, Frans Pop wrote:
> Theodore Tso wrote:
> > Hi, what's going on with this regression? To refresh people's memory,
> > I tracked down the excess interrupts to a specific commit, fb02fbc1,
> > and I've been running quite happily with a patch to revert this
> > commit.
>
> Isn't that solved by this commit (included in -rc5)?
>
> commit ae99286b4f1be7788f2d6947c66a91dbd6351eec
> Author: Thomas Gleixner <[email protected]>
> Date: Mon Nov 10 13:20:23 2008 +0100
>
> nohz: disable tick_nohz_kick_tick() for now

Yep, so it is. Sorry, I missed that; I haven't migrated to -rc5 yet.

I'll close the Bugzilla entry.

- Ted