2010-01-04 03:14:22

by Youquan Song

Subject: [PATCH] tickless: Fix tick nohz timer irq0 fail to increase

Tickless mode can be disabled with nohz=off, which is used by OSVs. But in the
current kernel, if tickless is disabled, the timer irq0 count does not
increase. The timer event handler should be tick_handle_periodic, but the
event handler actually remains tick_handle_oneshot_broadcast, which is used in
tickless mode. The root cause is that the high resolution timer is enabled by
default, which forces oneshot broadcast mode.

This patch adds a check that tickless is enabled before enabling the high
resolution timer.

On Nehalem-EX:

Before the patch:
linux-a25n:~ # cat /proc/interrupts | grep timer
0: 334 0 0 0 0 0 ....
LOC: 192248 193931 193851 184441 193803 193625 ....

After the patch:
cat /proc/interrupts | grep timer
0: 223788 0 0 0 0 0 ....
LOC: 13081 238407 238452 229405 238298 235688 ....

Signed-off-by: Youquan, Song <[email protected]>
---


diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f992762..a515bed 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -815,7 +815,7 @@ int tick_check_oneshot_change(int allow_nohz)
 	if (!timekeeping_valid_for_hres() || !tick_is_oneshot_available())
 		return 0;

-	if (!allow_nohz)
+	if (!allow_nohz && tick_nohz_enabled)
 		return 1;

 	tick_nohz_switch_to_nohz();
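
The effect of the one-line change above can be traced with a small user-space
model. This is only a sketch, not kernel code: model_check_oneshot_change and
its boolean parameters are hypothetical stand-ins, and the checks at the top of
the hunk are folded into a single oneshot_ok flag. It also assumes that a
return value of 1 tells the caller to switch to high resolution mode, and that
the function returns 0 after calling tick_nohz_switch_to_nohz().

```c
#include <stdbool.h>

/*
 * Hypothetical user-space model of the decision made in
 * tick_check_oneshot_change(); none of this is kernel code.
 *
 * patched      - selects the condition from the "+" line of the diff
 * oneshot_ok   - stands in for the timekeeping/oneshot availability checks
 * allow_nohz   - the function's argument
 * nohz_enabled - stands in for tick_nohz_enabled (false with nohz=off)
 *
 * Returns 1 when the caller is told to switch to high resolution mode
 * (an assumption about the caller), otherwise 0.
 */
int model_check_oneshot_change(bool patched, bool oneshot_ok,
			       bool allow_nohz, bool nohz_enabled)
{
	if (!oneshot_ok)
		return 0;

	if (patched) {
		if (!allow_nohz && nohz_enabled)
			return 1;
	} else {
		if (!allow_nohz)
			return 1;
	}

	/* The kernel calls tick_nohz_switch_to_nohz() at this point. */
	return 0;
}
```

In this model, with nohz=off (nohz_enabled false) and allow_nohz set to 0, the
unpatched condition returns 1 while the patched condition returns 0, so the
caller is no longer told to switch to high resolution mode.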


2010-01-04 12:34:37

by Arjan van de Ven

Subject: Re: [PATCH] tickless: Fix tick nohz timer irq0 fail to increase

On Mon, 4 Jan 2010 05:49:05 -0500
"Youquan,Song" <[email protected]> wrote:

> Tickless is disabled by nohz=off, which is used in OSVs.

it is? I doubt anyone wants to normally disable tickless, the power
cost is just too high....

> But in
> current kernel, if tickless is disabled, the timer irq0 will not
> increase.

why is this a problem?

> Because the timer event handler should be
> tick_handle_periodic, but actually event handler keep as
> tick_handle_oneshot_broadcast which is used in tickless. The root
> cause is that it is default to enable high resolution timer which
> will force to oneshot broadcast mode.

using local apic in one shot mode is not a problem, I'd in fact call it
a feature...


--
Arjan van de Ven Intel Open Source Technology Centre
For development, discussion and tips for power savings,
visit http://www.lesswatts.org

2010-01-04 14:01:43

by Peter Zijlstra

Subject: Re: [PATCH] tickless: Fix tick nohz timer irq0 fail to increase

On Mon, 2010-01-04 at 04:36 -0800, Arjan van de Ven wrote:
> > cause is that it is default to enable high resolution timer which
> > will force to oneshot broadcast mode.
>
> using local apic in one shot mode is not a problem, I'd in fact call it
> a feature...

lapic yes, broadcast not so much.

He mentioned he was using Nehalem-EX which is known broken in that it
stops the lapic in deeper C states and thus reverts to HPET broadcast
since there's not enough HPET timers to go around.

Something like processor.max_cstate=2 should fix that IIRC.

Jens ran into this on his shiny new box.

Disabling nohz is funny indeed, one wonders why he does that.

2010-01-04 14:06:06

by Youquan Song

Subject: Re: [PATCH] tickless: Fix tick nohz timer irq0 fail to increase

> > Tickless is disabled by nohz=off, which is used in OSVs.
>
> it is? I doubt anyone wants to normally disable tickless, the power
> cost is just too high....
Tickless is normally used, but some special users may want to disable it.
Tickless is disabled by the nohz=off kernel option, which is addressed in the
OSV's release notes.

> > But in
> > current kernel, if tickless is disabled, the timer irq0 will not
> > increase.
>
> why is this a problem?

Two reasons:
R1:
As I understand it, the tick is a periodic timer with a frequency of HZ, but
timer irq0 is actually running as a oneshot mode timer.
When actually watching it, the count does not increase for a long time, and
then it suddenly increases 5000~10000 times in one second.

R2:
With tickless disabled, the cpuidle driver changes to the ladder governor
(menu does not work at all).

Running powertop with my patch applied, the C-state residency shows a > 10%
improvement over the current kernel.
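
The irq0 behaviour described in R1 can be checked by summing the per-CPU
counts on the timer line of /proc/interrupts at two points in time and
comparing the totals. The helper below is a minimal sketch of the summing step
only. sum_irq_counts is a hypothetical name, and the line layout it assumes is
the one shown in the output quoted in this thread (a label ending in ':'
followed by decimal counts).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: sum the per-CPU counts on one /proc/interrupts
 * line such as "0: 334 0 0 0 0 0". Parsing stops at the first token
 * that is not a plain decimal number (for example a trailing
 * description). Returns 0 if the line has no ':' separator.
 */
unsigned long long sum_irq_counts(const char *line)
{
	const char *p = strchr(line, ':');
	unsigned long long total = 0;

	if (!p)
		return 0;
	p++;	/* skip the ':' after the IRQ label */

	for (;;) {
		char *end;
		unsigned long long n;

		while (isspace((unsigned char)*p))
			p++;
		if (!isdigit((unsigned char)*p))
			break;

		n = strtoull(p, &end, 10);
		/* Reject tokens that only start with digits, e.g. "2-edge". */
		if (*end != '\0' && !isspace((unsigned char)*end))
			break;

		total += n;
		p = end;
	}
	return total;
}
```

Called on the "Before" and "After" irq0 lines from the original report, this
returns 334 and 223788 respectively.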

>
> > Because the timer event handler should be
> > tick_handle_periodic, but actually event handler keep as
> > tick_handle_oneshot_broadcast which is used in tickless. The root
> > cause is that it is default to enable high resolution timer which
> > will force to oneshot broadcast mode.
>
> using local apic in one shot mode is not a problem, I'd in fact call it
> a feature...

But in the current kernel, the local apic timer is in periodic mode with a
frequency of HZ when tickless is disabled.

2010-01-04 14:27:29

by Youquan Song

Subject: Re: [PATCH] tickless: Fix tick nohz timer irq0 fail to increase

> Disabling nohz is funny indeed, one wonders why he does that.
I do not have a real use case for it. I looked at nohz=off because:
1. The OSV has release notes that explain how to disable tickless.
2. With some OSV config + the latest kernel, it results in a kernel
oops, and that config disables tickless. I have reported it to the acpi
mailing list before.

Thanks.

-Youquan