Date: Wed, 18 Apr 2018 20:15:47 +0800
From: Leo Yan
To: Dietmar Eggemann
Cc: linux-kernel@vger.kernel.org, Peter Zijlstra, Quentin Perret,
 Thara Gopinath, linux-pm@vger.kernel.org, Morten Rasmussen,
 Chris Redpath, Patrick Bellasi, Valentin Schneider,
 "Rafael J . Wysocki", Greg Kroah-Hartman, Vincent Guittot,
 Viresh Kumar, Todd Kjos, Joel Fernandes, Juri Lelli, Steve Muckle,
 Eduardo Valentin
Subject: Re: [RFC PATCH v2 4/6] sched/fair: Introduce an energy estimation helper function
Message-ID: <20180418121547.GC15682@leoy-ThinkPad-X240s>
References: <20180406153607.17815-1-dietmar.eggemann@arm.com> <20180406153607.17815-5-dietmar.eggemann@arm.com>
In-Reply-To: <20180406153607.17815-5-dietmar.eggemann@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Quentin,

On Fri, Apr 06, 2018 at 04:36:05PM +0100, Dietmar Eggemann wrote:
> From: Quentin Perret
>
> In preparation for the definition of an energy-aware wakeup path, a
> helper function is provided to estimate the consequence on system energy
> when a specific task wakes-up on a specific CPU.
> compute_energy() estimates the OPPs to be reached by all frequency
> domains and estimates the consumption of each online CPU according to
> its energy model and its percentage of busy time.
>
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Signed-off-by: Quentin Perret
> Signed-off-by: Dietmar Eggemann
> ---
>  include/linux/sched/energy.h | 20 +++++++++++++
>  kernel/sched/fair.c          | 68 ++++++++++++++++++++++++++++++++++++++++++++
>  kernel/sched/sched.h         |  2 +-
>  3 files changed, 89 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/sched/energy.h b/include/linux/sched/energy.h
> index 941071eec013..b4110b145228 100644
> --- a/include/linux/sched/energy.h
> +++ b/include/linux/sched/energy.h
> @@ -27,6 +27,24 @@ static inline bool sched_energy_enabled(void)
>  	return static_branch_unlikely(&sched_energy_present);
>  }
>
> +static inline
> +struct capacity_state *find_cap_state(int cpu, unsigned long util)
> +{
> +	struct sched_energy_model *em = *per_cpu_ptr(energy_model, cpu);
> +	struct capacity_state *cs = NULL;
> +	int i;
> +
> +	util += util >> 2;
> +
> +	for (i = 0; i < em->nr_cap_states; i++) {
> +		cs = &em->cap_states[i];
> +		if (cs->cap >= util)
> +			break;
> +	}
> +
> +	return cs;
> +}
> +
>  static inline struct cpumask *freq_domain_span(struct freq_domain *fd)
>  {
>  	return &fd->span;
> @@ -42,6 +60,8 @@ struct freq_domain;
>  static inline bool sched_energy_enabled(void) { return false; }
>  static inline struct cpumask
>  *freq_domain_span(struct freq_domain *fd) { return NULL; }
> +static inline struct capacity_state
> +*find_cap_state(int cpu, unsigned long util) { return NULL; }
>  static inline void init_sched_energy(void) { }
>  #define for_each_freq_domain(fdom) for (; fdom; fdom = NULL)
>  #endif
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 6960e5ef3c14..8cb9fb04fff2 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6633,6 +6633,74 @@ static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
>  }
>
>  /*
> + * Returns the util of "cpu" if "p" wakes up on "dst_cpu".
> + */
> +static unsigned long cpu_util_next(int cpu, struct task_struct *p, int dst_cpu)
> +{
> +	unsigned long util, util_est;
> +	struct cfs_rq *cfs_rq;
> +
> +	/* Task is where it should be, or has no impact on cpu */
> +	if ((task_cpu(p) == dst_cpu) || (cpu != task_cpu(p) && cpu != dst_cpu))
> +		return cpu_util(cpu);
> +
> +	cfs_rq = &cpu_rq(cpu)->cfs;
> +	util = READ_ONCE(cfs_rq->avg.util_avg);
> +
> +	if (dst_cpu == cpu)
> +		util += task_util(p);
> +	else
> +		util = max_t(long, util - task_util(p), 0);
> +
> +	if (sched_feat(UTIL_EST)) {
> +		util_est = READ_ONCE(cfs_rq->avg.util_est.enqueued);
> +		if (dst_cpu == cpu)
> +			util_est += _task_util_est(p);
> +		else
> +			util_est = max_t(long, util_est - _task_util_est(p), 0);
> +		util = max(util, util_est);
> +	}
> +
> +	return min_t(unsigned long, util, capacity_orig_of(cpu));
> +}
> +
> +/*
> + * Estimates the system level energy assuming that p wakes-up on dst_cpu.
> + *
> + * compute_energy() is safe to call only if an energy model is available for
> + * the platform, which is when sched_energy_enabled() is true.
> + */
> +static unsigned long compute_energy(struct task_struct *p, int dst_cpu)
> +{
> +	unsigned long util, max_util, sum_util;
> +	struct capacity_state *cs;
> +	unsigned long energy = 0;
> +	struct freq_domain *fd;
> +	int cpu;
> +
> +	for_each_freq_domain(fd) {
> +		max_util = sum_util = 0;
> +		for_each_cpu_and(cpu, freq_domain_span(fd), cpu_online_mask) {
> +			util = cpu_util_next(cpu, p, dst_cpu);
> +			util += cpu_util_dl(cpu_rq(cpu));
> +			max_util = max(util, max_util);
> +			sum_util += util;
> +		}
> +
> +		/*
> +		 * Here we assume that the capacity states of CPUs belonging to
> +		 * the same frequency domains are shared. Hence, we look at the
> +		 * capacity state of the first CPU and re-use it for all.
> +		 */
> +		cpu = cpumask_first(freq_domain_span(fd));
> +		cs = find_cap_state(cpu, max_util);
> +		energy += cs->power * sum_util / cs->cap;
> +	}

Sorry for the mess of spreading my questions over several replies; I
will try to ask them in a single reply from now on. Below are some more
questions that are worth bringing up:

The energy computation code is quite neat and simple, but I think it
mixes two concepts of CPU utilization. One is the estimated CPU util,
which schedutil uses to select the CPU OPP; the other is the raw CPU
util, which reflects the real running time of the CPU. For example,
cpu_util_next() predicts the CPU util, but this value might be much
higher than cpu_util(), especially after the UTIL_EST feature is enabled
(I only have a shallow understanding of UTIL_EST, so correct me as
needed). This patch, however, computes both the CPU capacity and the
energy from a single CPU utilization value (which will be an inflated
value once UTIL_EST is enabled). Is this intended to keep the
implementation simple?

IMHO, cpu_util_next() can be used to predict the CPU capacity; on the
other hand, should we use the CPU util without UTIL_EST capping for
'sum_util'? That would reflect the CPU utilization more reasonably.

Furthermore, if an RT thread is running on the CPU with the 'schedutil'
governor, the CPU will run at maximum frequency, but we cannot say the
CPU has 100% utilization. The RT thread case is not handled in this
patch.

> +
> +	return energy;
> +}
> +
> +/*
>   * select_task_rq_fair: Select target runqueue for the waking task in domains
>   * that have the 'sd_flag' flag set. In practice, this is SD_BALANCE_WAKE,
>   * SD_BALANCE_FORK, or SD_BALANCE_EXEC.
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 5d552c0d7109..6eb38f41d5d9 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -2156,7 +2156,7 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
>  # define arch_scale_freq_invariant()	false
>  #endif
>
> -#ifdef CONFIG_CPU_FREQ_GOV_SCHEDUTIL
> +#ifdef CONFIG_SMP
>  static inline unsigned long cpu_util_dl(struct rq *rq)
>  {
>  	return (rq->dl.running_bw * SCHED_CAPACITY_SCALE) >> BW_SHIFT;
> --
> 2.11.0
>
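
For reference, the cpu_util_dl() arithmetic quoted above can be checked
with a small user-space sketch. This is not kernel code: it assumes
BW_SHIFT is 20 and SCHED_CAPACITY_SCALE is 1024 (neither value is stated
in this mail), and dl_bw_ratio() and util_from_dl_bw() are hypothetical
helper names that only mirror the fixed-point math.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; they are not given anywhere in this thread. */
#define BW_SHIFT		20
#define SCHED_CAPACITY_SCALE	1024UL

/* Fixed-point bandwidth: runtime / period scaled by 2^BW_SHIFT. */
uint64_t dl_bw_ratio(uint64_t runtime_ns, uint64_t period_ns)
{
	assert(period_ns > 0);
	return (runtime_ns << BW_SHIFT) / period_ns;
}

/* Same arithmetic as the quoted cpu_util_dl(), applied to a plain value. */
unsigned long util_from_dl_bw(uint64_t running_bw)
{
	return (unsigned long)((running_bw * SCHED_CAPACITY_SCALE) >> BW_SHIFT);
}
```

Under those assumptions, a deadline reservation of 2.5 ms every 10 ms
has a bandwidth of 262144 (a quarter of 2^20) and contributes 256 to
the util on the 0..1024 capacity scale, which is what compute_energy()
adds on top of cpu_util_next().
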