Date: Thu, 02 Apr 2015 08:52:35 +0530
From: Preeti U Murthy
To: Morten Rasmussen
Cc: Jason Low, Peter Zijlstra, mingo@kernel.org, riel@redhat.com,
    daniel.lezcano@linaro.org, vincent.guittot@linaro.org,
    srikar@linux.vnet.ibm.com, pjt@google.com, benh@kernel.crashing.org,
    efault@gmx.de, linux-kernel@vger.kernel.org, iamjoonsoo.kim@lge.com,
    svaidy@linux.vnet.ibm.com, tim.c.chen@linux.intel.com
Subject: Re: [PATCH V2] sched: Improve load balancing in the presence of idle CPUs

Hi Morten,

On 04/01/2015 06:33 PM, Morten Rasmussen wrote:
>> Alright, I see. But it is one additional wake-up.
>> And the wake-up will be within the cluster. We will not wake up any
>> CPU in the neighbouring cluster unless there are tasks to be pulled.
>> So, in the problem described, we can wake up a core out of a deep
>> idle state, but never a cluster. In terms of energy efficiency, this
>> is not so bad a scenario, is it?
>
> After Peter pointed out that it shouldn't happen across clusters due to
> group_classify()/sg_capacity_factor(), it isn't as bad as I initially
> thought. It is still not an ideal solution, I think. Wake-ups aren't
> nice for battery-powered devices. Waking up a cpu in an already active
> cluster may still imply powering up the core and bringing the L1 cache
> into a usable state, but it isn't as bad as waking up a cluster. I
> would prefer to avoid it if we can.
>
> Thinking more about it, don't we also risk doing a lot of iterations in
> nohz_idle_balance() leading to nothing (pure overhead) in certain
> corner cases? Say the cpu picked by find_new_ilb() is the last cpu in
> the cluster, and we have one task for each cpu in the cluster, but one
> cpu currently has two. Don't we end up trying all nohz-idle cpus before
> giving up and balancing the balancer cpu itself? On big machines, going
> through everyone could take a while, I think. No?

The balancer CPU will iterate only as long as need_resched() is not set
on it. The iteration may take a while, but if the balancer CPU has
nothing else to do, the iteration does not come at a cost in
performance.

Besides this, load balancing itself is optimized in terms of who does
it and how often. The candidate CPUs for load balancing are the first
idle CPUs in a given group, so nohz idle load balancing may abort early
on some of the idle CPUs. If the CPUs to our left have not managed to
pull tasks, we abort load balancing too, and save time there.

Load balancing on bigger sched domains is also spaced out in time: the
min_interval is equal to sd_weight, and the balance_interval can grow
as large as 2*sd_weight.
This should ensure that load balancing across large scheduling domains
is not carried out too often. nohz idle load balancing may therefore
not go through the entire scheduling-domain hierarchy for each CPU,
which cuts down on the time as well.

Regards
Preeti U Murthy

>
> Morten