2014-06-12 19:34:09

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 0/2] timer: Fix missing dynticks kick

Thomas,

Viresh has spotted and fixed an interesting issue. When a timer list
is enqueued or updated on a (idle or full) dynticks target, we kick it
with an IPI.

At least that's what we do on add_timer_on() but the mod_timer*() family
doesn't handle that. I think that's because we had assumptions that
mod_timer() never enqueues to remote idle targets. But that's subject
to subtle races where a CPU can become dynticks-idle between a call to
get_nohz_timer_target() and the time we actually lock the new target.

Moreover we forgot to handle full dynticks targets selected by __mod_timer().

That patchset should fixes all those issues.

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
timers/missing-kick

Thanks,
Frederic
---

Viresh Kumar (2):
timer: Store cpu-number in 'struct tvec_base'
timer: Kick dynticks targets on mod_timer*() calls


kernel/timer.c | 34 ++++++++++++++++++----------------
1 file changed, 18 insertions(+), 16 deletions(-)


2014-06-12 19:34:11

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 1/2] timer: Store cpu-number in 'struct tvec_base'

From: Viresh Kumar <[email protected]>

Timers are serviced by the tick. But when a timer is enqueued on a
dynticks target, we need to kick it in order to make it reconsider the
next tick to schedule to correctly handle the timer's expiring time.

Now while this kick is correctly performed for add_timer_on(), the
mod_timer*() family has been a bit neglected.

To prepare for fixing this, we need internal_add_timer() to be able to
resolve the CPU target associated to a timer's object 'base' so that the
kick can be centralized there.

This can't be passed as an argument as not all the callers know the CPU
number of a timer's base. So lets store it in the struct tvec_base to
resolve the CPU without much overhead. It is set once for good at every
CPU's first boot.

Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/timer.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/timer.c b/kernel/timer.c
index 3bb01a3..9e5f4f2 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -82,6 +82,7 @@ struct tvec_base {
unsigned long next_timer;
unsigned long active_timers;
unsigned long all_timers;
+ int cpu;
struct tvec_root tv1;
struct tvec tv2;
struct tvec tv3;
@@ -1568,6 +1569,7 @@ static int init_timers_cpu(int cpu)
}
spin_lock_init(&base->lock);
tvec_base_done[cpu] = 1;
+ base->cpu = cpu;
} else {
base = per_cpu(tvec_bases, cpu);
}
--
1.8.3.1

2014-06-12 19:34:31

by Frederic Weisbecker

[permalink] [raw]
Subject: [PATCH 2/2] timer: Kick dynticks targets on mod_timer*() calls

From: Viresh Kumar <[email protected]>

When a timer is enqueued or modified on a dynticks target, that CPU
must re-evaluate the next tick to service that timer.

The tick re-evaluation is performed by an IPI kick on the target.
Now while we correctly call wake_up_nohz_cpu() from add_timer_on(), the
mod_timer*() API family doesn't support so well dynticks targets.

The reason for this is likely that __mod_timer() isn't supposed to
select an idle target for a timer, unless that target is the current
CPU, in which case a dynticks idle kick isn't actually needed.

But there is a small race window lurking behind that assumption: the
elected target has all the time to turn dynticks idle between the call
to get_nohz_timer_target() and the locking of its base. Hence a risk
that we enqueue a timer on a dynticks idle destination without kicking
it. As a result, the timer might be serviced too late in the future.

Also a target elected by __mod_timer() can be in full dynticks mode
and thus require to be kicked as well. And unlike idle dynticks, this
concern both local and remote targets.

To fix this whole issue, lets centralize the dynticks kick to
internal_add_timer() so that it is well handled for all sort of timer
enqueue. Even timer migration is concerned so that a full dynticks target
is correctly kicked as needed when timers are migrating to it.

Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/timer.c | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/kernel/timer.c b/kernel/timer.c
index 9e5f4f2..aca5dfe 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -410,6 +410,22 @@ static void internal_add_timer(struct tvec_base *base, struct timer_list *timer)
base->next_timer = timer->expires;
}
base->all_timers++;
+
+ /*
+ * Check whether the other CPU is in dynticks mode and needs
+ * to be triggered to reevaluate the timer wheel.
+ * We are protected against the other CPU fiddling
+ * with the timer by holding the timer base lock. This also
+ * makes sure that a CPU on the way to stop its tick can not
+ * evaluate the timer wheel.
+ *
+ * Spare the IPI for deferrable timers on idle targets though.
+ * The next busy ticks will take care of it. Except full dynticks
+ * require special care against races with idle_cpu(), lets deal
+ * with that later.
+ */
+ if (!tbase_get_deferrable(base) || tick_nohz_full_cpu(base->cpu))
+ wake_up_nohz_cpu(base->cpu);
}

#ifdef CONFIG_TIMER_STATS
@@ -949,22 +965,6 @@ void add_timer_on(struct timer_list *timer, int cpu)
timer_set_base(timer, base);
debug_activate(timer, timer->expires);
internal_add_timer(base, timer);
- /*
- * Check whether the other CPU is in dynticks mode and needs
- * to be triggered to reevaluate the timer wheel.
- * We are protected against the other CPU fiddling
- * with the timer by holding the timer base lock. This also
- * makes sure that a CPU on the way to stop its tick can not
- * evaluate the timer wheel.
- *
- * Spare the IPI for deferrable timers on idle targets though.
- * The next busy ticks will take care of it. Except full dynticks
- * require special care against races with idle_cpu(), lets deal
- * with that later.
- */
- if (!tbase_get_deferrable(timer->base) || tick_nohz_full_cpu(cpu))
- wake_up_nohz_cpu(cpu);
-
spin_unlock_irqrestore(&base->lock, flags);
}
EXPORT_SYMBOL_GPL(add_timer_on);
--
1.8.3.1