2005-03-21 13:13:46

by Oleg Nesterov

[permalink] [raw]
Subject: [PATCH 6/5] timers: enable irqs in __mod_timer()

On top of "[PATCH 5/5] timers: cleanup, kill __get_base()", see
http://marc.theaimsgroup.com/?l=linux-kernel&m=111125359121372

If the timer is currently running on another CPU, __mod_timer()
spins with interrupts disabled and timer->lock held. I think it
is better to spin_unlock_irqrestore(&timer->lock) in __mod_timer's
retry path.

This patch is unneccessary long. It is because it tries to cleanup
the code a bit. I do not like the fact that lock+test+unlock pattern
is duplicated in the code.

If you think that this patch uglifies the code or does not match
kernel's coding style - just say nack :)

Signed-off-by: Oleg Nesterov <[email protected]>

--- 2.6.12-rc1/kernel/timer.c~6_LOCK 2005-03-19 23:34:23.000000000 +0300
+++ 2.6.12-rc1/kernel/timer.c 2005-03-21 18:55:25.000000000 +0300
@@ -174,64 +174,60 @@ int __mod_timer(struct timer_list *timer
{
tvec_base_t *old_base, *new_base;
unsigned long flags;
- int ret = 0;
-
- BUG_ON(!timer->function);
-
- check_timer(timer);
-
- spin_lock_irqsave(&timer->lock, flags);
- new_base = &__get_cpu_var(tvec_bases);
-repeat:
- old_base = timer_base(timer);
-
- /*
- * Prevent deadlocks via ordering by old_base < new_base.
- */
- if (old_base && (new_base != old_base)) {
- if (old_base < new_base) {
- spin_lock(&new_base->lock);
- spin_lock(&old_base->lock);
- } else {
- spin_lock(&old_base->lock);
- spin_lock(&new_base->lock);
- }
- /*
- * The timer base might have been cancelled while we were
- * trying to take the lock(s), or can still be running on
- * old_base's CPU.
- */
- if (timer_base(timer) != old_base
- || old_base->running_timer == timer) {
- spin_unlock(&new_base->lock);
- spin_unlock(&old_base->lock);
- goto repeat;
- }
- } else {
- spin_lock(&new_base->lock);
- if (timer_base(timer) != old_base) {
- spin_unlock(&new_base->lock);
- goto repeat;
- }
- }
-
- /*
- * Delete the previous timeout (if there was any).
- * We hold timer->lock, no need to check old_base != 0.
- */
- if (timer_pending(timer)) {
- list_del(&timer->entry);
- ret = 1;
- }
-
- timer->expires = expires;
- internal_add_timer(new_base, timer);
- __set_base(timer, new_base, 1);
-
- if (old_base && (new_base != old_base))
- spin_unlock(&old_base->lock);
- spin_unlock(&new_base->lock);
- spin_unlock_irqrestore(&timer->lock, flags);
+ int new_lock, ret;
+
+ BUG_ON(!timer->function);
+ check_timer(timer);
+
+ for (ret = -1; ret < 0; ) {
+ spin_lock_irqsave(&timer->lock, flags);
+ new_base = &__get_cpu_var(tvec_bases);
+ old_base = timer_base(timer);
+
+ /* Prevent AB-BA deadlocks. */
+ new_lock = old_base < new_base;
+ if (new_lock)
+ spin_lock(&new_base->lock);
+
+ /* Note:
+ * (old_base == NULL) => (new_lock == 1)
+ * (old_base == new_base) => (new_lock == 0)
+ */
+ if (old_base) {
+ spin_lock(&old_base->lock);
+
+ if (timer_base(timer) != old_base)
+ goto unlock;
+
+ if (old_base != new_base) {
+ /* Ensure the timer is serialized. */
+ if (old_base->running_timer == timer)
+ goto unlock;
+
+ if (!new_lock) {
+ spin_lock(&new_base->lock);
+ new_lock = 1;
+ }
+ }
+ }
+
+ ret = 0;
+ /* We hold timer->lock, no need to check old_base != 0. */
+ if (timer_pending(timer)) {
+ list_del(&timer->entry);
+ ret = 1;
+ }
+
+ timer->expires = expires;
+ internal_add_timer(new_base, timer);
+ __set_base(timer, new_base, 1);
+unlock:
+ if (old_base)
+ spin_unlock(&old_base->lock);
+ if (new_lock)
+ spin_unlock(&new_base->lock);
+ spin_unlock_irqrestore(&timer->lock, flags);
+ }

return ret;
}


2005-03-22 05:49:58

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 6/5] timers: enable irqs in __mod_timer()

Oleg Nesterov <[email protected]> wrote:
>
> If the timer is currently running on another CPU, __mod_timer()
> spins with interrupts disabled and timer->lock held. I think it
> is better to spin_unlock_irqrestore(&timer->lock) in __mod_timer's
> retry path.
>
> This patch is unneccessary long. It is because it tries to cleanup
> the code a bit. I do not like the fact that lock+test+unlock pattern
> is duplicated in the code.
>
> If you think that this patch uglifies the code or does not match
> kernel's coding style - just say nack :)

I've seen worse ;)

I think this makes it a bit more kernel-like?

--- 25/kernel/timer.c~timers-enable-irqs-in-__mod_timer-tidy 2005-03-21 21:41:03.000000000 -0800
+++ 25-akpm/kernel/timer.c 2005-03-21 21:41:57.000000000 -0800
@@ -174,12 +174,13 @@ int __mod_timer(struct timer_list *timer
{
tvec_base_t *old_base, *new_base;
unsigned long flags;
- int new_lock, ret;
+ int new_lock;
+ int ret = -1;

BUG_ON(!timer->function);
check_timer(timer);

- for (ret = -1; ret < 0; ) {
+ do {
spin_lock_irqsave(&timer->lock, flags);
new_base = &__get_cpu_var(tvec_bases);
old_base = timer_base(timer);
@@ -227,7 +228,7 @@ unlock:
if (new_lock)
spin_unlock(&new_base->lock);
spin_unlock_irqrestore(&timer->lock, flags);
- }
+ } while (ret == -1);

return ret;
}
_

2005-03-24 21:57:36

by Chen, Kenneth W

[permalink] [raw]
Subject: RE: [PATCH 0/5] timers: description

Oleg Nesterov wrote on Monday, March 21, 2005 6:19 AM
> These patches are updated version of 'del_timer_sync: proof of concept'
> 2 patches.

Looks good performance wise. Took a quick micro benchmark measurement on
a 16-node numa box (32-way). Results are pretty nice (to the expectation).

Time to execute del_timer_sync, averaged over 10,000 call:
Vanilla 2.6.11 19,313 ns
2.6.12-rc1-mm1 81 ns
del_singleshot_timer_sync 74ns

Took another measurement on a 4-way smp box:
Vanilla 2.6.11 648 ns
2.6.12-rc1-mm1 55 ns
del_singleshot 41 ns

And Andrew always asks what are the impact on real-world bench, we did
not see performance regression on db transaction processing benchmark
with these set of timer patches versus using del_singleshot_timer_sync.
(del_singleshot_timer_sync is slightly faster, though 14 ns difference
falls well into noise range)

Note: I've only stress tested deleting an un-expired timer.


2005-03-26 19:52:51

by Chen, Kenneth W

[permalink] [raw]
Subject: RE: [PATCH 0/5] timers: description

Oleg Nesterov wrote on March 19, 2005 17:28:48
> These patches are updated version of 'del_timer_sync: proof of
> concept' 2 patches.

I changed schedule_timeout() to call the new del_timer_sync instead of
currently del_singleshot_timer_sync in attempt to stress these set of
patches a bit more and I just observed a kernel hang.

The symptom starts with lost network connectivity. It looks like the
entire ethernet connections were gone, followed by blank screen on the
console. I'm not sure whether it is a hard or soft hang, but system
is inaccessible (blank screen and no network connection). I'm forced
to do a reboot when that happens.


2005-03-28 19:12:30

by Christoph Lameter

[permalink] [raw]
Subject: RE: [PATCH 0/5] timers: description

On Sat, 26 Mar 2005, Chen, Kenneth W wrote:

> Oleg Nesterov wrote on March 19, 2005 17:28:48
> > These patches are updated version of 'del_timer_sync: proof of
> > concept' 2 patches.
>
> I changed schedule_timeout() to call the new del_timer_sync instead of
> currently del_singleshot_timer_sync in attempt to stress these set of
> patches a bit more and I just observed a kernel hang.
>
> The symptom starts with lost network connectivity. It looks like the
> entire ethernet connections were gone, followed by blank screen on the
> console. I'm not sure whether it is a hard or soft hang, but system
> is inaccessible (blank screen and no network connection). I'm forced
> to do a reboot when that happens.

Same problems here with occasional hangs w/o changes to schedule_timeout.