From: Quentin Perret
To: peterz@infradead.org, rjw@rjwysocki.net, gregkh@linuxfoundation.org,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Cc: mingo@redhat.com, dietmar.eggemann@arm.com, morten.rasmussen@arm.com,
	chris.redpath@arm.com, patrick.bellasi@arm.com,
	valentin.schneider@arm.com, vincent.guittot@linaro.org,
	thara.gopinath@linaro.org, viresh.kumar@linaro.org, tkjos@google.com,
	joelaf@google.com, smuckle@google.com, adharmap@quicinc.com,
	skannan@quicinc.com, pkondeti@codeaurora.org, juri.lelli@redhat.com,
	edubezval@gmail.com, srinivas.pandruvada@linux.intel.com,
	currojerez@riseup.net, javi.merino@kernel.org, quentin.perret@arm.com
Subject: [RFC PATCH v3 09/10] sched/fair: Select an energy-efficient CPU on task wake-up
Date: Mon, 21 May 2018 15:25:04 +0100
Message-Id: <20180521142505.6522-10-quentin.perret@arm.com>
X-Mailer: git-send-email 2.17.0
In-Reply-To: <20180521142505.6522-1-quentin.perret@arm.com>
References: <20180521142505.6522-1-quentin.perret@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

If an energy model is available, and if the system isn't overutilized,
waking tasks are re-routed into a new energy-aware placement algorithm.
The selection of an energy-efficient CPU for a task is achieved by
estimating the impact on system-level active energy resulting from the
placement of the task on the CPU with the highest spare capacity in each
frequency domain. This strategy spreads tasks in a frequency domain and
avoids overly aggressive task packing. The best CPU energy-wise is then
selected if it saves a large enough amount of energy with respect to
prev_cpu, that is, more than 1/64 (roughly 1.5%) of the energy estimated
for prev_cpu.

Although it has already shown significant benefits on some existing
targets, this approach cannot scale to platforms with numerous CPUs.
This patch is an attempt to do something useful, since writing a fast
heuristic that performs reasonably well on a broad spectrum of
architectures isn't an easy task. As such, the scope of usability of the
energy-aware wake-up path is restricted to systems with the
SD_ASYM_CPUCAPACITY flag set, and where the EM isn't too complex.

In addition, the energy-aware wake-up path is accessible only if
sched_energy_enabled() is true. For systems which don't meet all
dependencies for EAS (CONFIG_ENERGY_MODEL, for example) at compile time,
sched_energy_enabled() defaults to a constant "false" value, hence
letting the compiler remove the unused EAS code entirely.

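For readers who want to experiment with the placement policy outside the
kernel, here is a minimal user-space sketch of the selection described
above. It is an illustration under stated assumptions, not the patch
itself: every toy_* name is hypothetical, the energy estimates are passed
in as a precomputed table instead of being derived from an energy model,
affinity masks are ignored, and the margin value of 1280 is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* All toy_* names are hypothetical stand-ins, not kernel APIs. */
#define TOY_NR_CPUS		4
#define TOY_CAPACITY_MARGIN	1280UL	/* assumed: keeps util below 80% of capacity */

struct toy_cpu {
	unsigned long capacity;	/* plays the role of capacity_of(cpu) */
	unsigned long util;	/* utilization without the waking task */
};

struct toy_freq_domain {
	int first_cpu;		/* CPUs first_cpu..last_cpu share a frequency */
	int last_cpu;
};

/* Return the CPU with the most spare capacity in @fd, or -1 if none fits. */
static int toy_max_spare_cap_cpu(const struct toy_cpu *cpus,
				 const struct toy_freq_domain *fd,
				 unsigned long task_util, int prev_cpu)
{
	unsigned long max_spare_cap = 0;
	int cpu, best = -1;

	for (cpu = fd->first_cpu; cpu <= fd->last_cpu; cpu++) {
		unsigned long util = cpus[cpu].util + task_util;

		if (cpu == prev_cpu)
			continue;
		/* Skip CPUs that would be overutilized with the task. */
		if (cpus[cpu].capacity * 1024 < util * TOY_CAPACITY_MARGIN)
			continue;
		if (cpus[cpu].capacity - util > max_spare_cap) {
			max_spare_cap = cpus[cpu].capacity - util;
			best = cpu;
		}
	}
	return best;
}

/* @energy[cpu]: assumed estimate of system energy if the task runs on cpu. */
static int toy_find_energy_efficient_cpu(const struct toy_cpu *cpus,
					 const struct toy_freq_domain *fds,
					 size_t nr_fds,
					 const unsigned long *energy,
					 unsigned long task_util, int prev_cpu)
{
	unsigned long prev_energy, best_energy;
	int best_cpu = prev_cpu;
	size_t i;

	assert(prev_cpu >= 0 && prev_cpu < TOY_NR_CPUS);
	prev_energy = best_energy = energy[prev_cpu];

	/* One candidate per frequency domain: its max-spare-capacity CPU. */
	for (i = 0; i < nr_fds; i++) {
		int cpu = toy_max_spare_cap_cpu(cpus, &fds[i], task_util,
						prev_cpu);

		if (cpu >= 0 && energy[cpu] < best_energy) {
			best_energy = energy[cpu];
			best_cpu = cpu;
		}
	}

	/* Migrate only if the saving exceeds 1/64 of prev_cpu's energy. */
	if (prev_energy - best_energy > (prev_energy >> 6))
		return best_cpu;
	return prev_cpu;
}

/* Two little CPUs (0-1), two big CPUs (2-3); the task last ran on CPU 2. */
static int toy_example(unsigned long cpu1_energy)
{
	const struct toy_cpu cpus[TOY_NR_CPUS] = {
		{ 512, 300 }, { 512, 100 }, { 1024, 200 }, { 1024, 50 },
	};
	const struct toy_freq_domain fds[] = { { 0, 1 }, { 2, 3 } };
	const unsigned long energy[TOY_NR_CPUS] = {
		950, cpu1_energy, 1000, 995,
	};

	return toy_find_energy_efficient_cpu(cpus, fds, 2, energy, 100, 2);
}
```

With these made-up numbers, toy_example(900) picks CPU 1 because the
saving of 100 exceeds 1000 >> 6 = 15, whereas toy_example(990) stays on
CPU 2 because a saving of 10 is below that threshold.
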
Cc: Ingo Molnar
Cc: Peter Zijlstra
Signed-off-by: Quentin Perret
---
 kernel/sched/fair.c | 84 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 83 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1f7029258df2..eb44829be17f 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6683,6 +6683,80 @@ static long compute_energy(struct task_struct *p, int dst_cpu)
 	return energy;
 }
 
+static int find_energy_efficient_cpu(struct task_struct *p, int prev_cpu)
+{
+	unsigned long cur_energy, prev_energy, best_energy, cpu_cap, task_util;
+	int cpu, best_energy_cpu = prev_cpu;
+	struct sched_energy_fd *sfd;
+	struct sched_domain *sd;
+
+	sync_entity_load_avg(&p->se);
+
+	task_util = task_util_est(p);
+	if (!task_util)
+		return prev_cpu;
+
+	/*
+	 * Energy-aware wake-up happens on the lowest sched_domain starting
+	 * from sd_ea spanning over this_cpu and prev_cpu.
+	 */
+	sd = rcu_dereference(*this_cpu_ptr(&sd_ea));
+	while (sd && !cpumask_test_cpu(prev_cpu, sched_domain_span(sd)))
+		sd = sd->parent;
+	if (!sd)
+		return -1;
+
+	if (cpumask_test_cpu(prev_cpu, &p->cpus_allowed))
+		prev_energy = best_energy = compute_energy(p, prev_cpu);
+	else
+		prev_energy = best_energy = ULONG_MAX;
+
+	for_each_freq_domain(sfd) {
+		unsigned long spare_cap, max_spare_cap = 0;
+		int max_spare_cap_cpu = -1;
+		unsigned long util;
+
+		/* Find the CPU with the max spare cap in the freq. dom. */
+		for_each_cpu_and(cpu, freq_domain_span(sfd), sched_domain_span(sd)) {
+			if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
+				continue;
+
+			if (cpu == prev_cpu)
+				continue;
+
+			/* Skip CPUs that will be overutilized */
+			util = cpu_util_wake(cpu, p) + task_util;
+			cpu_cap = capacity_of(cpu);
+			if (cpu_cap * 1024 < util * capacity_margin)
+				continue;
+
+			spare_cap = cpu_cap - util;
+			if (spare_cap > max_spare_cap) {
+				max_spare_cap = spare_cap;
+				max_spare_cap_cpu = cpu;
+			}
+		}
+
+		/* Evaluate the energy impact of using this CPU. */
+		if (max_spare_cap_cpu >= 0) {
+			cur_energy = compute_energy(p, max_spare_cap_cpu);
+			if (cur_energy < best_energy) {
+				best_energy = cur_energy;
+				best_energy_cpu = max_spare_cap_cpu;
+			}
+		}
+	}
+
+	/*
+	 * We pick the best CPU only if it saves at least 1.5% of the
+	 * energy used by prev_cpu.
+	 */
+	if ((prev_energy - best_energy) > (prev_energy >> 6))
+		return best_energy_cpu;
+
+	return prev_cpu;
+}
+
 /*
  * select_task_rq_fair: Select target runqueue for the waking task in domains
  * that have the 'sd_flag' flag set. In practice, this is SD_BALANCE_WAKE,
@@ -6701,16 +6775,23 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
 	struct sched_domain *tmp, *sd = NULL;
 	int cpu = smp_processor_id();
 	int new_cpu = prev_cpu;
-	int want_affine = 0;
+	int want_affine = 0, want_energy = 0;
 	int sync = (wake_flags & WF_SYNC) && !(current->flags & PF_EXITING);
 
 	if (sd_flag & SD_BALANCE_WAKE) {
 		record_wakee(p);
+		want_energy = sched_energy_enabled() &&
+			      !READ_ONCE(cpu_rq(cpu)->rd->overutilized);
 		want_affine = !wake_wide(p) && !wake_cap(p, cpu, prev_cpu)
 			      && cpumask_test_cpu(cpu, &p->cpus_allowed);
 	}
 
 	rcu_read_lock();
+	if (want_energy) {
+		new_cpu = find_energy_efficient_cpu(p, prev_cpu);
+		goto unlock;
+	}
+
 	for_each_domain(cpu, tmp) {
 		if (!(tmp->flags & SD_LOAD_BALANCE))
 			break;
@@ -6745,6 +6826,7 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
 		if (want_affine)
 			current->recent_used_cpu = cpu;
 	}
+unlock:
 	rcu_read_unlock();
 
 	return new_cpu;
-- 
2.17.0
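
The commit message says that sched_energy_enabled() falls back to a
constant "false" when the build-time dependencies are missing, so the
compiler can drop the energy-aware branch. The sketch below shows that
general stub pattern in isolation. It is not the definition used by this
series: TOY_CONFIG_ENERGY_MODEL and all toy_* names are hypothetical, and
toy_energy_path() is only a placeholder for the energy-aware search.

```c
#include <assert.h>

/* Hypothetical names throughout; this is not the kernel's definition. */
#ifdef TOY_CONFIG_ENERGY_MODEL
static int toy_energy_model_ready = 1;

static inline int toy_sched_energy_enabled(void)
{
	return toy_energy_model_ready;
}
#else
/* Constant false: the guarded branch below becomes dead code. */
static inline int toy_sched_energy_enabled(void)
{
	return 0;
}
#endif

/* Placeholder for the energy-aware search; returns an arbitrary other CPU. */
static int toy_energy_path(int prev_cpu)
{
	return prev_cpu + 1;
}

static int toy_select_cpu(int prev_cpu, int overutilized)
{
	int want_energy = toy_sched_energy_enabled() && !overutilized;

	assert(prev_cpu >= 0);

	if (want_energy)
		return toy_energy_path(prev_cpu);

	/* Stands in for the regular wake-up path. */
	return prev_cpu;
}
```

Built without TOY_CONFIG_ENERGY_MODEL, want_energy is always 0, so an
optimizing compiler is free to discard the call to toy_energy_path()
without any #ifdef at the call site.
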