Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753217Ab1EYKSI (ORCPT ); Wed, 25 May 2011 06:18:08 -0400 Received: from www.linutronix.de ([62.245.132.108]:57514 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752051Ab1EYKSF (ORCPT ); Wed, 25 May 2011 06:18:05 -0400 Date: Wed, 25 May 2011 12:17:50 +0200 (CEST) From: Thomas Gleixner To: Yong Zhang cc: Sebastian Andrzej Siewior , LKML , Arjan van de Ven , Trond Myklebust , David Miller , netdev@vger.kernel.org Subject: Re: [PATCH] timers: consider slack value in mod_timer() In-Reply-To: Message-ID: References: <20110521105828.GA29442@Chamillionaire.breakpoint.cc> <20110524121343.GA17312@Chamillionaire.breakpoint.cc> User-Agent: Alpine 2.02 (LFD 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5591 Lines: 160 On Wed, 25 May 2011, Yong Zhang wrote: > On Tue, May 24, 2011 at 8:13 PM, Sebastian Andrzej Siewior > wrote: > > * Yong Zhang | 2011-05-24 15:54:17 [+0800]: > > > >>> diff --git a/kernel/timer.c b/kernel/timer.c > >>> index fd61986..bf09726 100644 > >>> --- a/kernel/timer.c > >>> +++ b/kernel/timer.c > >>> @@ -804,6 +804,8 @@ int mod_timer(struct timer_list *timer, unsigned long expires) > >>> ?? ?? ?? ?? ?? ?? ?? ??return 1; > >>> > >>> ?? ?? ?? ??expires = apply_slack(timer, expires); > >> > >>So, why not move above line up, then we can use the recalculated > >>expires? > > > > We leave often before apply_slack() kicks in. From printks() it looks > > like we leave more often in first "return 1" than in the second. Moving > > that line up would lead to more __mode_timer() calls. > > Hmmm, so the reason is for a timer whose timer->slack is not set > explicitly. when we recalculate expires, we will get different value > sometimes. No, that's not the problem. > Could you please try the attached patch(webmail will mangle it) Grrr. gmail allows usage of real mail clients, doesn't it ? > diff --git a/kernel/timer.c b/kernel/timer.c > index fd61986..73af53c 100644 > --- a/kernel/timer.c > +++ b/kernel/timer.c > @@ -749,6 +749,10 @@ unsigned long apply_slack(struct timer_list *timer, unsigned long expires) > unsigned long expires_limit, mask; > int bit; > > + /* no need to account slack again for a same-expire pending timer */ > + if (timer_pending(timer) && time_after_eq(timer->expires, expires)) > + return timer->expires; That's total crap. Assume some code sets the timer with 5 seconds for some purpose and after a second it wants it to fire in 50ms from now because some state change happened. The above will keep the original 5 seconds timeout no matter what, so the requested 50ms timeout will fire about 4 seconds late. > expires_limit = expires; > > if (timer->slack >= 0) { > @@ -795,6 +799,8 @@ unsigned long apply_slack(struct timer_list *timer, unsigned long expires) > */ > int mod_timer(struct timer_list *timer, unsigned long expires) > { > + expires = apply_slack(timer, expires); > + We need to analyse the problem thoroughly and not slap random changes into the code without knowing about the consequences. And the problem is mostly in the call sites because they are not aware of the slack effect. The sunrpc code is one of those which are affected by the slack magic simply because it makes the mod_timer() call basically unconditional even if the jiffies value is unchanged. diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index ce5eb68..cb0574f 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -1053,10 +1053,12 @@ void xprt_release(struct rpc_task *task) xprt->ops->release_request(task); if (!list_empty(&req->rq_list)) list_del(&req->rq_list); - xprt->last_used = jiffies; - if (list_empty(&xprt->recv) && xprt_has_timer(xprt)) - mod_timer(&xprt->timer, - xprt->last_used + xprt->idle_timeout); + if (xprt->last_used = jiffies) { + xprt->last_used = jiffies; + if (list_empty(&xprt->recv) && xprt_has_timer(xprt)) + mod_timer(&xprt->timer, + xprt->last_used + xprt->idle_timeout); + } spin_unlock_bh(&xprt->transport_lock); if (req->rq_buffer) xprt->ops->buf_free(req->rq_buffer); The above patch does not solve the problem when the resulting new timeout is rounded up to the same expiry value after the slack is applied, which is not unlikely when jiffies only advanced by a small amount. So we must check after apply_slack() and the reason why the first check before apply_slack triggers very often is that auto slack only changes the expiry value for timeouts >= 256 jiffies. And the main caller is the networking code via tcp_send_delayed_ack(). The standard delay we see from there is 40ms (10 jiffies for HZ=250) and that falls below the 256 jiffies treshold. The patch below is a reasonable compromise between overhead and correctness. Thanks, tglx diff --git a/kernel/timer.c b/kernel/timer.c index fd61986..458fd81 100644 --- a/kernel/timer.c +++ b/kernel/timer.c @@ -749,16 +749,15 @@ unsigned long apply_slack(struct timer_list *timer, unsigned long expires) unsigned long expires_limit, mask; int bit; - expires_limit = expires; - if (timer->slack >= 0) { expires_limit = expires + timer->slack; } else { - unsigned long now = jiffies; + long delta = expires - jiffies; + + if (delta < 256) + return expires; - /* No slack, if already expired else auto slack 0.4% */ - if (time_after(expires, now)) - expires_limit = expires + (expires - now)/256; + expires_limit = expires + (expires - now)/256; } mask = expires ^ expires_limit; if (mask == 0) @@ -795,6 +794,8 @@ unsigned long apply_slack(struct timer_list *timer, unsigned long expires) */ int mod_timer(struct timer_list *timer, unsigned long expires) { + expires = apply_slack(timer, expires); + /* * This is a common optimization triggered by the * networking code - if the timer is re-modified @@ -803,8 +804,6 @@ int mod_timer(struct timer_list *timer, unsigned long expires) if (timer_pending(timer) && timer->expires == expires) return 1; - expires = apply_slack(timer, expires); - return __mod_timer(timer, expires, false, TIMER_NOT_PINNED); } EXPORT_SYMBOL(mod_timer); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/