Date: Wed, 12 Feb 2014 16:06:29 +0100
From: Frederic Weisbecker
To: Viresh Kumar
Cc: Lei Wen, Thomas Gleixner, LKML, Lists linaro-kernel, "linux-pm@vger.kernel.org", "Rafael J. Wysocki"
Subject: Re: Is it ok for deferrable timer wakeup the idle cpu?
Message-ID: <20140212150627.GB5496@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Viresh,

On Thu, Jan 23, 2014 at 11:22:32AM +0530, Viresh Kumar wrote:
>
> Hi Guys,
>
> So the first question is why cpufreq needs it and is it really stupid?
> Yes, it is stupid but that's how it has been implemented for a long time. It does
> so to get data about the load on CPUs, so that freq can be scaled up/down.
>
> Though there is a solution in discussion currently, which will take
> inputs from the scheduler and so these background timers would go away.
> But we need to wait until that time.
>
> Now, why do we need that for every cpu, while that for a single cpu might
> be enough? The answer is cpuidle here: What if the cpu responsible for
> running the timer goes to sleep? Who will evaluate the load then? And if we
> make this timer run on one cpu in non-deferrable mode then that cpu
> would be woken up again and again from idle. So, it was decided to have
> a per-cpu deferrable timer. Though to improve efficiency, once it is fired
> on any cpu, the timers for all other CPUs are rescheduled, so that they don't
> fire before 5ms (sampling time).
>
> I think the diff below might get this fixed for you, though I am not sure if it
> breaks something else. Probably Thomas/Frederic can answer here.
> If this looks fine I will send it formally again:
>
> diff --git a/kernel/timer.c b/kernel/timer.c
> index accfd24..3a2c7fa 100644
> --- a/kernel/timer.c
> +++ b/kernel/timer.c
> @@ -940,7 +940,8 @@ void add_timer_on(struct timer_list *timer, int cpu)
>          * makes sure that a CPU on the way to stop its tick can not
>          * evaluate the timer wheel.
>          */
> -       wake_up_nohz_cpu(cpu);
> +       if (!tbase_get_deferrable(timer->base))
> +               wake_up_nohz_cpu(cpu);

The change I'm applying is strongly inspired by the above. Can I use your
Signed-off-by?

Thanks.

>         spin_unlock_irqrestore(&base->lock, flags);
>  }
>  EXPORT_SYMBOL_GPL(add_timer_on);
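
For anyone following the diff, the rule it adds can be modelled in user space. This is only a sketch and not kernel code: `sim_timer`, `sim_cpu`, `sim_wake_up_nohz_cpu()` and `sim_add_timer_on()` are hypothetical stand-ins made up for illustration, and the actual enqueueing on the timer wheel is left out. It shows just the decision: queueing a deferrable timer on an idle CPU leaves that CPU asleep, while a normal timer still triggers the wakeup.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Hypothetical stand-in for struct timer_list; only the fields used here. */
struct sim_timer {
	unsigned long expires;
	bool deferrable;	/* models the deferrable flag checked in the diff */
};

/* Hypothetical per-CPU state: whether the CPU is idle, and wakeups seen. */
struct sim_cpu {
	bool idle;
	unsigned int wakeups;
};

static struct sim_cpu cpus[NR_CPUS];

/* Models the effect of wake_up_nohz_cpu(): kick an idle CPU out of idle. */
static void sim_wake_up_nohz_cpu(int cpu)
{
	if (cpus[cpu].idle) {
		cpus[cpu].idle = false;
		cpus[cpu].wakeups++;
	}
}

/*
 * Models the patched tail of add_timer_on(). Enqueueing is omitted.
 * Returns true if a wakeup was requested for the target CPU.
 */
static bool sim_add_timer_on(const struct sim_timer *timer, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	/* A deferrable timer can wait until the CPU wakes for another reason. */
	if (timer->deferrable)
		return false;

	sim_wake_up_nohz_cpu(cpu);
	return true;
}
```

With an idle CPU 1, calling `sim_add_timer_on()` with a deferrable timer leaves `cpus[1].idle` set and the wakeup count at zero. The same call with a non-deferrable timer clears the idle flag and counts one wakeup.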