Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753403AbbDGVK3 (ORCPT ); Tue, 7 Apr 2015 17:10:29 -0400 Received: from www.linutronix.de ([62.245.132.108]:40122 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753200AbbDGVK1 (ORCPT ); Tue, 7 Apr 2015 17:10:27 -0400 Date: Tue, 7 Apr 2015 23:10:49 +0200 (CEST) From: Thomas Gleixner To: Marcelo Tosatti cc: Frederic Weisbecker , linux-kernel@vger.kernel.org, Rik van Riel Subject: Re: kernel/timer: avoid spurious ksoftirqd wakeups (v2) In-Reply-To: <20150406231557.GA5094@amt.cnet> Message-ID: References: <20150406231557.GA5094@amt.cnet> User-Agent: Alpine 2.11 (DEB 23 2013-08-11) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1,SHORTCIRCUIT=-0.0001 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2876 Lines: 89 On Mon, 6 Apr 2015, Marcelo Tosatti wrote: > It is only necessary to raise timer softirq > in case there are active timers. Depends. See below. > Limit the ksoftirqd wakeup to that case. > > Fixes a latency spike with isolated CPUs and > nohz full mode. This lacks a proper explanation of the observed issue. > --- a/kernel/time/tick-sched.c > +++ b/kernel/time/tick-sched.c > @@ -568,6 +568,7 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts, > unsigned long rcu_delta_jiffies; > struct clock_event_device *dev = __this_cpu_read(tick_cpu_device.evtdev); > u64 time_delta; > + bool raise_softirq = false; This shadows the function name raise_softirq(). Not pretty. > time_delta = timekeeping_max_deferment(); > > @@ -584,7 +585,8 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts, > delta_jiffies = 1; > } else { > /* Get the next timer wheel timer */ > - next_jiffies = get_next_timer_interrupt(last_jiffies); > + next_jiffies = get_next_timer_interrupt(last_jiffies, > + &raise_softirq); > delta_jiffies = next_jiffies - last_jiffies; > if (rcu_delta_jiffies < delta_jiffies) { > next_jiffies = last_jiffies + rcu_delta_jiffies; > @@ -703,7 +705,8 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts, > */ > tick_do_update_jiffies64(ktime_get()); > } > - raise_softirq_irqoff(TIMER_SOFTIRQ); > + if (raise_softirq) > + raise_softirq_irqoff(TIMER_SOFTIRQ); This breaks when high resolution timers are disabled (compile or runtime) because then the hrtimer queues are run from the timer softirq. Now assume the following situation: Tick is stopped completely with no timers and no hrtimers pending. Interrupt happens and schedules a hrtimer. nohz_stop_sched_tick() get_next_timer_interrupt(..., &raise_softirq); ---> base->active_timers = 0, so raise_softirq is false tick_program_event(expires) clockevents_program_event(expires) ---> Assume expires is already in the past if (expires <= ktime_get()) return -ETIME; if (raise_softirq) raise_softirq_irqoff(TIMER_SOFTIRQ); So because the tick device was not armed you wont get a tick interrupt up to the point where tick_nohz_stop_sched_tick() is called again which might be far off. I can see that the unconditional raise_softirq_irqoff() is suboptimal, but it was a rather simple solution to get stuff rolling again because it forces the cpu out of the inner idle loop which in turn restarts the tick. We certainly can do better, but that needs more thought than the proposed solution. Thanks, tglx -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/