Date: Mon, 17 Jul 2017 13:34:41 +0530
From: Viresh Kumar
To: Joel Fernandes
Cc: linux-kernel@vger.kernel.org, Juri Lelli, Patrick Bellasi,
    Andres Oportus, Dietmar Eggemann, Srinivas Pandruvada, Len Brown,
    "Rafael J . Wysocki", Ingo Molnar, Peter Zijlstra
Subject: Re: [PATCH RFC v5] cpufreq: schedutil: Make iowait boost more energy efficient
Message-ID: <20170717080441.GM352@vireshk-i7>
In-Reply-To: <20170716080407.28492-1-joelaf@google.com>

On 16-07-17, 01:04, Joel Fernandes wrote:
> Currently the iowait_boost feature in schedutil makes the frequency go to max
> on iowait wakeups. This feature was added to handle a case that Peter
> described where the throughput of operations involving continuous I/O requests
> [1] is reduced due to running at a lower frequency, however the lower
> throughput itself causes utilization to be low and hence causing frequency to
> be low hence its "stuck".
>
> Instead of going to max, its also possible to achieve the same effect by
> ramping up to max if there are repeated in_iowait wakeups happening. This patch
> is an attempt to do that. We start from a lower frequency (policy->mind)

s/mind/min/

> and double the boost for every consecutive iowait update until we reach the
> maximum iowait boost frequency (iowait_boost_max).
>
> I ran a synthetic test (continuous O_DIRECT writes in a loop) on an x86 machine
> with intel_pstate in passive mode using schedutil. In this test the iowait_boost
> value ramped from 800MHz to 4GHz in 60ms. The patch achieves the desired improved
> throughput as the existing behavior.
>
> Also while at it, make iowait_boost and iowait_boost_max as unsigned int since
> its unit is kHz and this is consistent with struct cpufreq_policy.
>
> [1] https://patchwork.kernel.org/patch/9735885/
>
> Cc: Srinivas Pandruvada
> Cc: Len Brown
> Cc: Rafael J. Wysocki
> Cc: Viresh Kumar
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Suggested-by: Peter Zijlstra
> Signed-off-by: Joel Fernandes
> ---
> This version is based on some ideas from Viresh and Juri in v4. Viresh, one
> difference between the idea we just discussed is, I am scaling up/down the
> boost only after consuming it. This has the effect of slightly delaying the
> "deboost" but achieves the same boost ramp time. Its more cleaner in the code
> IMO to avoid the scaling up and then down on the initial boost. Note that I
> also dropped iowait_boost_min and now I'm just starting the initial boost from
> policy->min since as I mentioned in the commit above, the ramp of the
> iowait_boost value is very quick and for the usecase its intended for, it works
> fine. Hope this is acceptable. Thanks.
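To illustrate the ramp described above, here is a standalone sketch (plain
userspace C, not the kernel code) of the doubling, assuming policy->min =
800 MHz and iowait_boost_max = 4 GHz as in the quoted test; the boost hits
the cap after three consecutive iowait updates:

#include <stdio.h>

int main(void)
{
        unsigned int boost = 800000;       /* kHz, assumed policy->min */
        unsigned int boost_max = 4000000;  /* kHz, assumed iowait_boost_max */
        int updates = 0;

        /* Double on every consecutive iowait update, capped at boost_max. */
        while (boost < boost_max) {
                boost = (boost * 2 > boost_max) ? boost_max : boost * 2;
                printf("update %d: iowait_boost = %u kHz\n", ++updates, boost);
        }
        return 0;
}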
>
>  kernel/sched/cpufreq_schedutil.c | 31 +++++++++++++++++++++++--------
>  1 file changed, 23 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 622eed1b7658..4225bbada88d 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -53,8 +53,9 @@ struct sugov_cpu {
>          struct update_util_data update_util;
>          struct sugov_policy *sg_policy;
>
> -        unsigned long iowait_boost;
> -        unsigned long iowait_boost_max;
> +        bool iowait_boost_pending;
> +        unsigned int iowait_boost;
> +        unsigned int iowait_boost_max;
>          u64 last_update;
>
>          /* The fields below are only needed when sharing a policy. */
> @@ -172,30 +173,43 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
>                                     unsigned int flags)
> {
>          if (flags & SCHED_CPUFREQ_IOWAIT) {
> -                sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
> +                sg_cpu->iowait_boost_pending = true;
> +                sg_cpu->iowait_boost = max(sg_cpu->iowait_boost,
> +                                           sg_cpu->sg_policy->policy->min);
>          } else if (sg_cpu->iowait_boost) {
>                  s64 delta_ns = time - sg_cpu->last_update;
>
>                  /* Clear iowait_boost if the CPU apprears to have been idle. */
> -                if (delta_ns > TICK_NSEC)
> +                if (delta_ns > TICK_NSEC) {
>                          sg_cpu->iowait_boost = 0;
> +                        sg_cpu->iowait_boost_pending = false;
> +                }

We don't really need to clear this flag here, as we already set iowait_boost
to 0 and that is what we check when using the boost.

>          }
> }
>
> static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
>                                unsigned long *max)
> {
> -        unsigned long boost_util = sg_cpu->iowait_boost;
> -        unsigned long boost_max = sg_cpu->iowait_boost_max;
> +        unsigned long boost_util, boost_max;
>
> -        if (!boost_util)
> +        if (!sg_cpu->iowait_boost)
>                  return;
>
> +        boost_util = sg_cpu->iowait_boost;
> +        boost_max = sg_cpu->iowait_boost_max;
> +

The above changes aren't required anymore (they were only needed with my
patch).

>          if (*util * boost_max < *max * boost_util) {
>                  *util = boost_util;
>                  *max = boost_max;
>          }
> -        sg_cpu->iowait_boost >>= 1;
> +
> +        if (sg_cpu->iowait_boost_pending) {
> +                sg_cpu->iowait_boost_pending = false;
> +                sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1,
> +                                           sg_cpu->iowait_boost_max);

Now this has a problem: we will ramp up the boost only after waiting for
rate_limit_us as well. That's why I had proposed the tricky solution in the
first place. I thought we wanted to avoid the instant boost only for the
first iteration, but after that we wanted to ramp up as soon as possible.
Isn't that so?

Now that you are using policy->min instead of policy->cur, we can simplify
the solution I proposed and always do 2 * iowait_boost before comparing
against the current util/max in the if block above. That is, we would start
the iowait boost at min * 2 instead of min, and that should be fine (a rough
sketch of the two orderings follows at the end of this mail).

> +        } else {
> +                sg_cpu->iowait_boost >>= 1;
> +        }
> }
>
> #ifdef CONFIG_NO_HZ_COMMON
> @@ -267,6 +281,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 time)
>                  delta_ns = time - j_sg_cpu->last_update;
>                  if (delta_ns > TICK_NSEC) {
>                          j_sg_cpu->iowait_boost = 0;
> +                        j_sg_cpu->iowait_boost_pending = false;

Not required here either.

>                          continue;
>                  }
>                  if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
> --
> 2.13.2.932.g7449e964c-goog

-- 
viresh
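For reference, here is a standalone sketch (plain userspace C, not the
kernel code and not the actual patch) of the difference between the two
orderings discussed above: doubling the boost only after it has been
consumed (as in v5) versus doubling it before it is consumed (as
suggested). The 800 MHz / 4 GHz values and the one-governor-run-per-
rate_limit_us assumption are purely illustrative:

#include <stdio.h>

#define BOOST_MIN 800000U   /* kHz, assumed policy->min */
#define BOOST_MAX 4000000U  /* kHz, assumed iowait_boost_max */

static unsigned int cap(unsigned int boost)
{
        return boost > BOOST_MAX ? BOOST_MAX : boost;
}

int main(void)
{
        unsigned int after = BOOST_MIN;   /* v5: double after consuming */
        unsigned int before = BOOST_MIN;  /* suggestion: double before consuming */

        /* One loop iteration per governor run, i.e. per rate_limit_us. */
        for (int run = 1; run <= 3; run++) {
                unsigned int applied_after = after;     /* boost used this run */
                after = cap(after * 2);                 /* ramped only for the next run */

                before = cap(before * 2);               /* ramped right away */
                unsigned int applied_before = before;   /* boost used this run */

                printf("run %d: applied %u kHz (double after) vs %u kHz (double before)\n",
                       run, applied_after, applied_before);
        }
        return 0;
}

With the suggested ordering the very first run already applies min * 2, and
each subsequent ramp step lands one rate_limit_us earlier.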