2014-07-31 16:46:13

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 0/2] nohz: Avoid double clock write

Hi Thomas,

The tick reschedules itself unconditionally. That's what we want as long
as the CPU is in periodic mode. It's not that relevant when the CPU
is in dynticks mode though as the clock write is likely to be overwritten
by the nohz code on irq exit.

In low-res mode, that even result in unexpected periodic behaviour when
no timer is scheduled in the future.

So here is a set that proposes the removal of these double writes.

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
origin/nohz/drop-double-write-v2

Thanks,
Frederic
---

Viresh Kumar (2):
nohz: Fix spurious periodic tick behaviour in low-res dynticks mode
nohz: Avoid tick's double reprogramming in highres mode


kernel/time/tick-sched.c | 8 ++++++++
1 file changed, 8 insertions(+)


2014-07-31 16:46:16

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 2/2] nohz: Avoid tick's double reprogramming in highres mode

From: Viresh Kumar <[email protected]>

In highres mode, the tick reschedules itself unconditionally to the
next jiffies.

However while this clock reprogramming is relevant when the tick is
in periodic mode, it's not that interesting when we run in dynticks mode
because irq exit is likely going to overwrite the next tick to some
randomly deferred future.

So lets just get rid of this tick self rescheduling in dynticks mode.
This way we can avoid some clockevents double write in favourable
scenarios like when we stop the tick completely in idle while no other
hrtimer is pending.

Suggested-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/time/tick-sched.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index b07ba8e..a7098f7 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1087,6 +1087,10 @@ static enum hrtimer_restart tick_sched_timer(struct hrtimer *timer)
if (regs)
tick_sched_handle(ts, regs);

+ /* Do not restart, when we are in idle or full dynticks mode */
+ if (unlikely(ts->tick_stopped))
+ return HRTIMER_NORESTART;
+
hrtimer_forward(timer, now, tick_period);

return HRTIMER_RESTART;
--
1.8.3.1

2014-07-31 16:46:36

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 1/2] nohz: Fix spurious periodic tick behaviour in low-res dynticks mode

From: Viresh Kumar <[email protected]>

When we reach the end of the tick handler, we unconditionally reschedule
the next tick to the next jiffy. Then on irq exit, the nohz code
overrides that setting if needed and defers the next tick as far away in
the future as possible.

Now in the best dynticks case, when we actually don't need any tick in
the future (ie: expires == KTIME_MAX), low-res and high-res behave
differently. What we want in this case is to cancel the next tick
programmed by the previous one. That's what we do in high-res mode. OTOH
we lack a low-res mode equivalent of hrtimer_cancel() so we simply don't
do anything in this case and the next tick remains scheduled to jiffies + 1.

As a result, in low-res mode, when the dynticks code determines that no
tick is needed in the future, we can recursively get a spurious tick
every jiffy because then the next tick is always reprogrammed from the
tick handler and is never cancelled. And this can happen indefinetly
until some subsystem actually needs a precise tick in the future and only
then we eventually overwrite the previous tick handler setting to defer
the next tick.

We are fixing this by introducing the ONESHOT_STOPPED mode which will
let us pause a clockevent when no further interrupt is needed. Meanwhile
we can't expect all drivers to support this new mode.

So lets reduce much of the symptoms by skipping the nohz-blind tick
rescheduling from the tick-handler when the CPU is in dynticks mode.
That tick rescheduling wrongly assumed periodicity and the low-res
dynticks code can't cancel such decision. This breaks the recursive (and
thus the worst) part of the problem. In the worst case now, we'll get
only one extra tick due to uncancelled tick scheduled before we entered
dynticks mode.

This also removes a needless clockevent write on idle ticks. Since those
clock write are usually considered to be slow, it's a general win.

Reviewed-by: Preeti U Murthy <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/time/tick-sched.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 6558b7a..b07ba8e 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -956,6 +956,10 @@ static void tick_nohz_handler(struct clock_event_device *dev)
tick_sched_do_timer(now);
tick_sched_handle(ts, regs);

+ /* Skip reprogramming next event if we are running tickless */
+ if (unlikely(ts->tick_stopped))
+ return;
+
while (tick_nohz_reprogram(ts, now)) {
now = ktime_get();
tick_do_update_jiffies64(now);
--
1.8.3.1