In-Reply-To: <20180320094312.24081-6-dietmar.eggemann@arm.com>
References: <20180320094312.24081-1-dietmar.eggemann@arm.com> <20180320094312.24081-6-dietmar.eggemann@arm.com>
From: Joel Fernandes
Date: Thu, 22 Mar 2018 09:27:43 -0700
Subject: Re: [RFC PATCH 5/6] sched/fair: Select an energy-efficient CPU on task wake-up
To: Dietmar Eggemann
Cc: LKML, Peter Zijlstra, Quentin Perret, Thara Gopinath, Linux PM, Morten Rasmussen, Chris Redpath, Patrick Bellasi, Valentin Schneider, "Rafael J. Wysocki", Greg Kroah-Hartman, Vincent Guittot, Viresh Kumar, Todd Kjos
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On Tue, Mar 20, 2018 at 2:43 AM, Dietmar Eggemann wrote:
>
> From: Quentin Perret
>
> In case an energy model is available, waking tasks are re-routed into a
> new energy-aware placement algorithm. The eligible CPUs to be used in the
> energy-aware wakeup path are restricted to the highest non-overutilized
> sched_domain containing prev_cpu and this_cpu. If no such domain is found,
> the tasks go through the usual wake-up path, hence energy-aware placement
> happens only in lightly utilized scenarios.
>
> The selection of the most energy-efficient CPU for a task is achieved by
> estimating the impact on system-level active energy resulting from the
> placement of the task on each candidate CPU. The best CPU energy-wise is
> then selected if it saves a large enough amount of energy with respect to
> prev_cpu.
>
> Although it has already shown significant benefits on some existing
> targets, this brute force approach clearly cannot scale to platforms with
> numerous CPUs. This patch is an attempt to do something useful as writing
> a fast heuristic that performs reasonably well on a broad spectrum of
> architectures isn't an easy task. As a consequence, the scope of usability
> of the energy-aware wake-up path is restricted to systems with the
> SD_ASYM_CPUCAPACITY flag set. These systems not only show the most
> promising opportunities for saving energy but also typically feature a
> limited number of logical CPUs.
>
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Signed-off-by: Quentin Perret
> Signed-off-by: Dietmar Eggemann
> ---
>  kernel/sched/fair.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++++---
>  1 file changed, 71 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 76bd46502486..65a1bead0773 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6513,6 +6513,60 @@ static unsigned long compute_energy(struct task_struct *p, int dst_cpu)
>  	return energy;
>  }
>
> +static bool task_fits(struct task_struct *p, int cpu)
> +{
> +	unsigned long next_util = cpu_util_next(cpu, p, cpu);
> +
> +	return util_fits_capacity(next_util, capacity_orig_of(cpu));
> +}
> +
> +static int find_energy_efficient_cpu(struct sched_domain *sd,
> +				     struct task_struct *p, int prev_cpu)
> +{
> +	unsigned long cur_energy, prev_energy, best_energy;
> +	int cpu, best_cpu = prev_cpu;
> +
> +	if (!task_util(p))
> +		return prev_cpu;
> +
> +	/* Compute the energy impact of leaving the task on prev_cpu. */
> +	prev_energy = best_energy = compute_energy(p, prev_cpu);

Is it possible that before the wakeup, the task's affinity is changed
so that p->cpus_allowed no longer contains prev_cpu? In that case
prev_energy wouldn't matter, since the previous CPU is no longer an
option?

> +
> +	/* Look for the CPU that minimizes the energy. */
> +	for_each_cpu_and(cpu, &p->cpus_allowed, sched_domain_span(sd)) {
> +		if (!task_fits(p, cpu) || cpu == prev_cpu)
> +			continue;
> +		cur_energy = compute_energy(p, cpu);
> +		if (cur_energy < best_energy) {
> +			best_energy = cur_energy;
> +			best_cpu = cpu;
> +		}
> +	}
> +
> +	/*
> +	 * We pick the best CPU only if it saves at least 1.5% of the
> +	 * energy used by prev_cpu.
> +	 */
> +	if ((prev_energy - best_energy) > (prev_energy >> 6))
> +		return best_cpu;
> +
> +	return prev_cpu;
> +}
> +
> +static inline bool wake_energy(struct task_struct *p, int prev_cpu)
> +{
> +	struct sched_domain *sd;
> +
> +	if (!static_branch_unlikely(&sched_energy_present))
> +		return false;
> +
> +	sd = rcu_dereference_sched(cpu_rq(prev_cpu)->sd);
> +	if (!sd || sd_overutilized(sd))

Shouldn't you check for the SD_ASYM_CPUCAPACITY flag here?

> +		return false;
> +
> +	return true;
> +}
> +
>  /*
>   * select_task_rq_fair: Select target runqueue for the waking task in domains
>   * that have the 'sd_flag' flag set. In practice, this is SD_BALANCE_WAKE,
> @@ -6529,18 +6583,22 @@ static int
>  select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_flags)
>  {
>  	struct sched_domain *tmp, *affine_sd = NULL, *sd = NULL;
> +	struct sched_domain *energy_sd = NULL;
>  	int cpu = smp_processor_id();
>  	int new_cpu = prev_cpu;
> -	int want_affine = 0;
> +	int want_affine = 0, want_energy = 0;
>  	int sync = (wake_flags & WF_SYNC) && !(current->flags & PF_EXITING);
>
> +	rcu_read_lock();
> +
>  	if (sd_flag & SD_BALANCE_WAKE) {
>  		record_wakee(p);
> +		want_energy = wake_energy(p, prev_cpu);
>  		want_affine = !wake_wide(p) && !wake_cap(p, cpu, prev_cpu)
> -			      && cpumask_test_cpu(cpu, &p->cpus_allowed);
> +			      && cpumask_test_cpu(cpu, &p->cpus_allowed)
> +			      && !want_energy;
>  	}
>
> -	rcu_read_lock();
>  	for_each_domain(cpu, tmp) {
>  		if (!(tmp->flags & SD_LOAD_BALANCE))
>  			break;
> @@ -6555,6 +6613,14 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
>  			break;
>  		}
>
> +		/*
> +		 * Energy-aware task placement is performed on the highest
> +		 * non-overutilized domain spanning over cpu and prev_cpu.
> +		 */
> +		if (want_energy && !sd_overutilized(tmp) &&
> +		    cpumask_test_cpu(prev_cpu, sched_domain_span(tmp)))

Shouldn't you check for the SD_ASYM_CPUCAPACITY flag here for tmp level?

> +			energy_sd = tmp;
> +
>  		if (tmp->flags & sd_flag)
>  			sd = tmp;
>  		else if (!want_affine)
> @@ -6586,6 +6652,8 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
>  			if (want_affine)
>  				current->recent_used_cpu = cpu;
>  		}
> +	} else if (energy_sd) {
> +		new_cpu = find_energy_efficient_cpu(energy_sd, p, prev_cpu);

Even if want_affine = 0 (want_energy = 1), we can have sd = NULL if
sd_flag and tmp->flags don't match. In this case we won't enter the EAS
selection path because sd will be NULL. So should you be setting
sd = NULL explicitly if energy_sd != NULL, or rather do the
if (energy_sd) before doing the if (!sd)?
If you still want to keep the logic this way, then probably you should
also check if (tmp->flags & sd_flag) == true in the loop? That way
energy_sd won't be set at all (since we're basically saying we don't
want to do the wake-up across this sd, in energy-aware fashion in this
case, if the domain flags don't match the wake-up sd_flag).

thanks,

- Joel
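
To make the prev_cpu affinity question and the margin test discussed above
concrete, here is a minimal user-space sketch. It is an illustration under
stated assumptions, not the patch's code: energy_saving_is_enough() and
pick_cpu() are hypothetical names, the energy[] array stands in for
compute_energy(), the affinity mask is a plain unsigned long instead of a
cpumask, and falling back to the cheapest allowed CPU when prev_cpu is no
longer allowed is only one possible policy. Note that prev_energy >> 6 is
prev_energy / 64, i.e. about 1.56%, slightly above the 1.5% in the quoted
comment.

```c
#include <stdbool.h>
#include <limits.h>

/*
 * Same comparison as the quoted patch: migrate only if the saving exceeds
 * prev_energy / 64 (prev_energy >> 6), roughly 1.56% of prev_energy.
 * The caller must guarantee best_energy <= prev_energy, otherwise the
 * unsigned subtraction wraps around.
 */
static bool energy_saving_is_enough(unsigned long prev_energy,
				    unsigned long best_energy)
{
	return (prev_energy - best_energy) > (prev_energy >> 6);
}

/*
 * Hypothetical stand-in for find_energy_efficient_cpu(): energy[cpu]
 * replaces compute_energy(p, cpu) and 'allowed' is a bitmask standing in
 * for p->cpus_allowed.
 *
 * Assumption (not in the patch): if prev_cpu is not in 'allowed', it is
 * not used as the baseline and the cheapest allowed CPU is returned.
 * Returns -1 if no CPU is allowed at all.
 */
static int pick_cpu(const unsigned long *energy, int nr_cpus,
		    unsigned long allowed, int prev_cpu)
{
	bool prev_ok = (allowed >> prev_cpu) & 1UL;
	unsigned long best_energy = prev_ok ? energy[prev_cpu] : ULONG_MAX;
	int cpu, best_cpu = prev_ok ? prev_cpu : -1;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (!((allowed >> cpu) & 1UL) || cpu == prev_cpu)
			continue;
		if (energy[cpu] < best_energy) {
			best_energy = energy[cpu];
			best_cpu = cpu;
		}
	}

	/* prev_cpu is unusable: there is no baseline to apply a margin to. */
	if (!prev_ok)
		return best_cpu;

	if (energy_saving_is_enough(energy[prev_cpu], best_energy))
		return best_cpu;

	return prev_cpu;
}
```

With energies {1000, 990, 1000, 1200} and prev_cpu = 0, the 10-unit saving
on CPU 1 is below the margin of 15 (1000 >> 6), so the task stays on CPU 0;
if CPU 0 is dropped from the mask, the sketch returns CPU 1 instead.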