Subject: Re: [RFC PATCH v1 00/11] Create fast idle path for short idle periods
To: Peter Zijlstra, Frederic Weisbecker
Cc: Christoph Lameter, Andi Kleen, Aubrey Li, tglx@linutronix.de,
 len.brown@intel.com, rjw@rjwysocki.net, tim.c.chen@linux.intel.com,
 arjan@linux.intel.com, paulmck@linux.vnet.ibm.com, yang.zhang.wz@gmail.com,
 x86@kernel.org, linux-kernel@vger.kernel.org
References: <1499650721-5928-1-git-send-email-aubrey.li@intel.com>
 <20170710084647.zs6wkl3fumszd33g@hirez.programming.kicks-ass.net>
 <20170710144609.GD31832@tassilo.jf.intel.com>
 <20170710164206.5aon5kelbisxqyxq@hirez.programming.kicks-ass.net>
 <20170710172705.GA3441@tassilo.jf.intel.com>
 <20170711094157.5xcwkloxnjehieqv@hirez.programming.kicks-ass.net>
 <20170711160926.GA18805@lerouge>
 <20170711163422.etydkhhtgfthpfi5@hirez.programming.kicks-ass.net>
From: "Li, Aubrey"
Message-ID: <496d4921-5768-cd1e-654b-38630b7d2e13@linux.intel.com>
Date: Wed, 12 Jul 2017 12:15:08 +0800
In-Reply-To: <20170711163422.etydkhhtgfthpfi5@hirez.programming.kicks-ass.net>

On 2017/7/12 0:34, Peter Zijlstra wrote:
> On Tue, Jul 11, 2017 at 06:09:27PM +0200, Frederic Weisbecker wrote:
>
>>>> - tick_nohz_idle_enter costs 7058ns - 10726ns
>>>> - tick_nohz_idle_exit costs 8372ns - 20850ns
>>>
>>> Right, those are horrible expensive, but skipping them isn't 'hard', the
>>> only tricky bit is finding a condition that makes sense.
>>
>> Note you can statically disable it with nohz=0 boot parameter.
>
> Yeah, but that's bad for power usage, nobody wants that.
>
>>> See Mike's patch: https://patchwork.kernel.org/patch/2839221/
>>>
>>> Combined with the above, and possibly a better condition, that should
>>> get rid of most of this.
>>
>> Such a patch could work well if the decision from the scheduler to not stop the tick
>> happens on idle entry.
>>
>> Now if sched_needs_cpu() first allows to stop the tick then refuses it later
>> in the end of an idle IRQ, this won't have the desired effect. As long as ts->tick_stopped=1,
>> it stays so until we really restart the tick. So the whole costly nohz machinery stays on.
>>
>> I guess it doesn't matter though, as we are talking about making fast idle entry so the
>> decision not to stop the tick is likely to be done once on idle entry, when ts->tick_stopped=0.
>>
>> One exception though: if the tick is already stopped when we enter idle (full nohz case). And
>> BTW stopping the tick outside idle shouldn't be concerned here.
>>
>> So I'd rather put that on can_stop_idle_tick().
>
> Mike's patch much predates the existence of that function I think ;-) But
> sure..
>

Okay, the difference is that Mike's patch uses a very simple algorithm to make the decision.

	/*
	 * delta is wakeup_timestamp - idle_timestamp
	 */
	update_avg(&rq->avg_idle, delta);
...

static void update_avg(u64 *avg, u64 sample)
{
	s64 diff = sample - *avg;
	*avg += diff >> 3;
}

My proposal, by contrast, tries to leverage the prediction functionality of the existing
idle menu governor, which has worked very well for a long time. I know the code change is
big and the running overhead is a bit higher than with rq->avg_idle, but should we make a
comparison for some typical workloads?

Thanks,
-Aubrey
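
For reference, the update_avg() quoted above is an exponentially weighted moving average in which each new sample contributes 1/8 of its distance from the current average. Below is a minimal user-space sketch of that arithmetic, not the kernel code itself: the u64/s64 typedefs stand in for the kernel types, and ewma_after() and ewma_selfcheck() are hypothetical helpers added only for illustration. Like the kernel, it assumes that right-shifting a negative signed value is an arithmetic shift, which holds for GCC and Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the kernel's fixed-width types (assumption: user-space build). */
typedef uint64_t u64;
typedef int64_t s64;

/* Same arithmetic as the quoted snippet: move avg 1/8 of the way to sample. */
static void update_avg(u64 *avg, u64 sample)
{
	s64 diff = (s64)(sample - *avg);

	/* Assumes an arithmetic right shift for negative diff (GCC/Clang). */
	*avg += (u64)(diff >> 3);
}

/* Hypothetical helper: feed the same sample n times, return the average. */
static u64 ewma_after(u64 avg, u64 sample, unsigned int n)
{
	while (n--)
		update_avg(&avg, sample);
	return avg;
}

/* Hypothetical self-check showing how the average reacts to samples. */
static void ewma_selfcheck(void)
{
	/* One long idle period only pulls the average up by 1/8 of the gap. */
	assert(ewma_after(0, 800, 1) == 100);
	/* One short idle period only pulls it down by 1/8 of the gap. */
	assert(ewma_after(800, 0, 1) == 700);
	/* A steady input leaves the average unchanged. */
	assert(ewma_after(1000, 1000, 5) == 1000);
}
```

With a weight of 1/8, a single unusually long or short idle period moves rq->avg_idle only slightly, so this predictor reacts to a change in idle pattern over several wakeups rather than immediately.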