From: Patrick Bellasi <patrick.bellasi@arm.com>
To: linux-kernel@vger.kernel.org
Cc: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Steve Muckle, Leo Yan,
    Viresh Kumar, "Rafael J. Wysocki", Todd Kjos, Srinath Sridharan,
    Andres Oportus, Juri Lelli, Morten Rasmussen, Dietmar Eggemann,
    Chris Redpath, Robin Randhawa, Patrick Bellasi
Subject: [RFC v2 4/8] sched/fair: add boosted CPU usage
Date: Thu, 27 Oct 2016 18:41:04 +0100
Message-Id: <20161027174108.31139-5-patrick.bellasi@arm.com>
In-Reply-To: <20161027174108.31139-1-patrick.bellasi@arm.com>
References: <20161027174108.31139-1-patrick.bellasi@arm.com>

The CPU utilization signal (cpu_rq(cpu)->cfs.avg.util_avg) is used by
the scheduler as an estimate of the overall bandwidth currently
allocated on a CPU. When the schedutil CPUFreq governor is in use, this
signal drives the selection of the Operating Performance Point (OPP)
required to accommodate the workload allocated to that CPU.

A convenient, and also minimally intrusive, way to boost the
performance of tasks running on a CPU is to boost the CPU utilization
signal each time it is used to select an OPP.

This patch introduces a new function:
   boosted_cpu_util(cpu)
which returns a boosted value for the utilization of the specified CPU.
The margin added to the original utilization is:
 1. computed based on the "boosting strategy" in use
 2. proportional to the system-wide boost value defined via the
    provided user-space interface

The boosted signal is used by schedutil (transparently) each time it
needs an estimate of the capacity required by the CFS tasks which are
currently RUNNABLE on a CPU.

It's worth noticing that the RUNNABLE status is used to define _when_ a
CPU needs to be boosted, while _what_ we boost is the CPU utilization,
which also includes the blocked utilization.

Currently SchedTune is available only for CONFIG_SMP systems, thus we
have a single point of integration with schedutil, provided by the
cfs_rq_util_change(cfs_rq) function, which ultimately calls into:
   kernel/sched/cpufreq_schedutil.c::sugov_get_util(util, max)
Each time a CFS utilization update is required, if SchedTune is
compiled in, we use the global boost value to update the utilization
required by the CFS class.

Such a simple mechanism allows, for example, using schedutil to mimic
the behavior of other governors, e.g. performance (when boost=100% and
only while there are RUNNABLE tasks on that CPU).
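As a reference for reviewers, here is a minimal user-space sketch of
the boosting arithmetic assumed above, i.e. a Signal Proportional
Compensation (SPC) margin where the boost value grants that percentage
of the headroom left above the current signal; the values in main()
are illustrative only:

#include <stdio.h>

#define SCHED_CAPACITY_SCALE	1024UL

/* SPC margin: boost% of the headroom above the current signal */
static unsigned long spc_margin(unsigned long signal, unsigned int boost)
{
	/* Saturated signals get no extra margin */
	if (signal >= SCHED_CAPACITY_SCALE)
		return 0UL;

	return (SCHED_CAPACITY_SCALE - signal) * boost / 100;
}

int main(void)
{
	/* util=256 is 25% of capacity: boost=50% grants half the headroom */
	printf("boost= 50%%: margin=%lu\n", spc_margin(256, 50));  /* 384 */
	printf("boost=100%%: margin=%lu\n", spc_margin(256, 100)); /* 768 */
	return 0;
}

With boost=100% the margin always fills the signal up to full capacity
(256 + 768 = 1024), which is why, while RUNNABLE tasks are present,
schedutil ends up behaving like the performance governor.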
Cc: Ingo Molnar
Cc: Peter Zijlstra
Signed-off-by: Patrick Bellasi
---
 kernel/sched/cpufreq_schedutil.c |  4 ++--
 kernel/sched/fair.c              | 36 ++++++++++++++++++++++++++++++++++++
 kernel/sched/sched.h             |  2 ++
 3 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 69e0689..0382df7 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -148,12 +148,12 @@ static unsigned int get_next_freq(struct sugov_cpu *sg_cpu, unsigned long util,
 
 static void sugov_get_util(unsigned long *util, unsigned long *max)
 {
-	struct rq *rq = this_rq();
 	unsigned long cfs_max;
 
 	cfs_max = arch_scale_cpu_capacity(NULL, smp_processor_id());
 
-	*util = min(rq->cfs.avg.util_avg, cfs_max);
+	*util = boosted_cpu_util(smp_processor_id());
+	*util = min(*util, cfs_max);
 	*max = cfs_max;
 }
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index fdacc29..26c3911 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5578,6 +5578,25 @@ schedtune_margin(unsigned long signal, unsigned int boost)
 	return margin;
 }
 
+static inline unsigned long
+schedtune_cpu_margin(unsigned long util, int cpu)
+{
+	unsigned int boost = get_sysctl_sched_cfs_boost();
+
+	if (boost == 0)
+		return 0UL;
+
+	return schedtune_margin(util, boost);
+}
+
+#else /* CONFIG_SCHED_TUNE */
+
+static inline unsigned long
+schedtune_cpu_margin(unsigned long util, int cpu)
+{
+	return 0;
+}
+
 #endif /* CONFIG_SCHED_TUNE */
 
 /*
@@ -5614,6 +5633,23 @@ static int cpu_util(int cpu)
 	return (util >= capacity) ? capacity : util;
 }
 
+unsigned long boosted_cpu_util(int cpu)
+{
+	unsigned long util = cpu_rq(cpu)->cfs.avg.util_avg;
+	unsigned long capacity = capacity_orig_of(cpu);
+
+	/* Do not boost saturated utilizations */
+	if (util >= capacity)
+		return capacity;
+
+	/* Add margin to the current CPU's utilization */
+	util += schedtune_cpu_margin(util, cpu);
+	if (util >= capacity)
+		return capacity;
+
+	return util;
+}
+
 static inline int task_util(struct task_struct *p)
 {
 	return p->se.avg.util_avg;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 055f935..fd85818 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1764,6 +1764,8 @@ static inline u64 irq_time_read(int cpu)
 }
 #endif /* CONFIG_IRQ_TIME_ACCOUNTING */
 
+unsigned long boosted_cpu_util(int cpu);
+
 #ifdef CONFIG_CPU_FREQ
 DECLARE_PER_CPU(struct update_util_data *, cpufreq_update_util_data);
 
-- 
2.10.1
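To make the end-to-end effect concrete, here is a small stand-alone
model (not kernel code) of how the boosted utilization propagates to
frequency selection: the 1.25x headroom factor mirrors get_next_freq()
in kernel/sched/cpufreq_schedutil.c of this kernel generation, while
max_freq and the utilization samples are made up for illustration:

#include <stdio.h>

#define SCHED_CAPACITY_SCALE	1024UL

/* schedutil-style selection: next_freq = 1.25 * max_freq * util / max */
static unsigned long next_freq(unsigned long max_freq, unsigned long util)
{
	unsigned long freq = (max_freq + (max_freq >> 2)) * util
				/ SCHED_CAPACITY_SCALE;

	return freq > max_freq ? max_freq : freq;
}

int main(void)
{
	const unsigned long max_freq = 2000000; /* kHz, illustrative */
	/* util=300 after boosting by 0%, 50% and 100% of its headroom */
	const unsigned long utils[] = { 300, 662, 1024 };

	for (int i = 0; i < 3; i++)
		printf("util=%4lu -> freq=%lu kHz\n",
		       utils[i], next_freq(max_freq, utils[i]));

	return 0;
}

At boost=100% the boosted utilization saturates at full capacity, so
schedutil always requests max_freq while RUNNABLE tasks are present,
which is the performance-governor behavior described in the changelog.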