Date: Wed, 11 Apr 2018 10:07:26 +0530
From: Viresh Kumar
To: Patrick Bellasi
Cc: linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ingo Molnar, Peter Zijlstra, "Rafael J. Wysocki", Joel Fernandes, Steve Muckle, Juri Lelli, Dietmar Eggemann
Subject: Re: [PATCH v2] cpufreq/schedutil: Cleanup, document and fix iowait boost
Message-ID: <20180411043726.GJ7671@vireshk-i7>
In-Reply-To: <20180410155931.31973-1-patrick.bellasi@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 10-04-18, 16:59, Patrick Bellasi wrote:
> The iowait boosting code has been recently updated to add a progressive
> boosting behavior which allows to be less aggressive in boosting tasks
> doing only sporadic IO operations, thus being more energy efficient for
> example on mobile platforms.
>
> The current code is now however a bit convoluted. Some functionalities
> (e.g. iowait boost reset) are replicated in different paths and their
> documentation is slightly misaligned.
>
> Moreover, from a functional standpoint, the iowait boosting is also not
> always reset in systems where cpufreq policies are not shared, each CPU
> has his own policy.
> Indeed, when a CPU is idle for a long time we keep
> doubling the boosting instead of resetting it to the minimum frequency,
> as expected by the TICK_NSEC logic, whenever a task wakes up from IO.
>
> Let's cleanup the code by consolidating all the IO wait boosting related
> functionality inside the already existing functions and better define
> their role:
>
> - sugov_set_iowait_boost: is now in charge only to set/increase the IO
>   wait boost, every time a task wakes up from an IO wait.
>
> - sugov_iowait_boost: is now in charge to reset/reduce the IO wait
>   boost, every time a sugov update is triggered, as well as
>   to (eventually) enforce the currently required IO boost value.
>
> This is possible since these two functions are already used one after
> the other, both in single and shared frequency domains, following the
> same template:
>
>    /* Configure IO boost, if required */
>    sugov_set_iowait_boost()
>
>    /* Return here if freq change is in progress or throttled */
>
>    /* Collect and aggregate utilization information */
>    sugov_get_util()
>    sugov_aggregate_util()
>
>    /* Add IO boost if currently enabled */
>    sugov_iowait_boost()
>
> As an extra bonus, let's also add the documentation for these two
> functions and better align the in-code documentation.
>
> Signed-off-by: Patrick Bellasi
> Reported-by: Viresh Kumar
> Cc: Ingo Molnar
> Cc: Peter Zijlstra
> Cc: Rafael J. Wysocki
> Cc: Viresh Kumar
> Cc: Joel Fernandes
> Cc: Steve Muckle
> Cc: Juri Lelli
> Cc: Dietmar Eggemann
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-pm@vger.kernel.org
>
> ---
> Changes in v2:
> - Fix return in sugov_iowait_boost()'s reset code (Viresh)
> - Add iowait boost reset for sugov_update_single() (Viresh)
> - Title changed to reflect the fix from previous point
>
> Based on today's tip/sched/core:
> b720342 sched/core: Update preempt_notifier_key to modern API
> ---
>  kernel/sched/cpufreq_schedutil.c | 120 ++++++++++++++++++++++++++-------------
>  1 file changed, 81 insertions(+), 39 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 2b124811947d..2a2ae3a0e41f 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -51,7 +51,7 @@ struct sugov_cpu {
>  	bool iowait_boost_pending;
>  	unsigned int iowait_boost;
>  	unsigned int iowait_boost_max;
> -	u64 last_update;
> +	u64 last_update;
>
>  	/* The fields below are only needed when sharing a policy: */
>  	unsigned long util_cfs;
> @@ -201,43 +201,97 @@ static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
>  	return min(util, sg_cpu->max);
>  }
>
> -static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, unsigned int flags)
> +/**
> + * sugov_set_iowait_boost updates the IO boost at each wakeup from IO.
> + * @sg_cpu: the sugov data for the CPU to boost
> + * @time: the update time from the caller
> + * @flags: SCHED_CPUFREQ_IOWAIT if the task is waking up after an IO wait
> + *
> + * Each time a task wakes up after an IO operation, the CPU utilization can be
> + * boosted to a certain utilization which is doubled at each wakeup
> + * from IO, starting from the utilization of the minimum OPP to that of the
> + * maximum one.
> + */
> +static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
> +				   unsigned int flags)
>  {
> -	if (flags & SCHED_CPUFREQ_IOWAIT) {
> -		if (sg_cpu->iowait_boost_pending)
> -			return;
> -
> -		sg_cpu->iowait_boost_pending = true;
> +	bool iowait = flags & SCHED_CPUFREQ_IOWAIT;
>
> -		if (sg_cpu->iowait_boost) {
> -			sg_cpu->iowait_boost <<= 1;
> -			if (sg_cpu->iowait_boost > sg_cpu->iowait_boost_max)
> -				sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
> -		} else {
> -			sg_cpu->iowait_boost = sg_cpu->sg_policy->policy->min;
> -		}
> -	} else if (sg_cpu->iowait_boost) {
> +	/* Reset boost if the CPU appears to have been idle enough */
> +	if (sg_cpu->iowait_boost) {
>  		s64 delta_ns = time - sg_cpu->last_update;
>
> -		/* Clear iowait_boost if the CPU apprears to have been idle. */
>  		if (delta_ns > TICK_NSEC) {
> -			sg_cpu->iowait_boost = 0;
> -			sg_cpu->iowait_boost_pending = false;
> +			sg_cpu->iowait_boost = iowait
> +				? sg_cpu->sg_policy->policy->min : 0;
> +			sg_cpu->iowait_boost_pending = iowait;
> +			return;
>  		}
>  	}
> +
> +	/* Boost only tasks waking up after IO */
> +	if (!iowait)
> +		return;
> +
> +	/* Ensure IO boost doubles only one time at each frequency increase */
> +	if (sg_cpu->iowait_boost_pending)
> +		return;
> +	sg_cpu->iowait_boost_pending = true;
> +
> +	/* Double the IO boost at each frequency increase */
> +	if (sg_cpu->iowait_boost) {
> +		sg_cpu->iowait_boost <<= 1;
> +		if (sg_cpu->iowait_boost > sg_cpu->iowait_boost_max)
> +			sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
> +		return;
> +	}
> +
> +	/* At first wakeup after IO, start with minimum boost */
> +	sg_cpu->iowait_boost = sg_cpu->sg_policy->policy->min;
>  }

The above part should be a different patch with this:

Fixes: a5a0809bc58e ("cpufreq: schedutil: Make iowait boost more energy efficient")

> -static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
> -			       unsigned long *max)
> +/**
> + * sugov_iowait_boost boosts a CPU after a wakeup from IO.
> + * @sg_cpu: the sugov data for the cpu to boost
> + * @time: the update time from the caller
> + * @util: the utilization to (eventually) boost
> + * @max: the maximum value the utilization can be boosted to
> + *
> + * A CPU running a task which woken up after an IO operation can have its
> + * utilization boosted to speed up the completion of those IO operations.
> + * The IO boost value is increased each time a task wakes up from IO, in
> + * sugov_set_iowait_boost(), and it's instead decreased by this function,
> + * each time an increase has not been requested (!iowait_boost_pending).
> + *
> + * A CPU which also appears to have been idle for at least one tick has also
> + * its IO boost utilization reset.
> + *
> + * This mechanism is designed to boost high frequently IO waiting tasks, while
> + * being more conservative on tasks which does sporadic IO operations.
> + */
> +static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
> +			       unsigned long *util, unsigned long *max)
>  {
>  	unsigned int boost_util, boost_max;
> +	s64 delta_ns;
>
> +	/* No IOWait boost active */
>  	if (!sg_cpu->iowait_boost)
>  		return;
>
> +	/* Clear boost if the CPU appears to have been idle enough */
> +	delta_ns = time - sg_cpu->last_update;
> +	if (delta_ns > TICK_NSEC) {
> +		sg_cpu->iowait_boost = 0;
> +		sg_cpu->iowait_boost_pending = false;
> +		return;
> +	}
> +
> +	/* An IO waiting task has just woken up, use the boost value */
>  	if (sg_cpu->iowait_boost_pending) {
>  		sg_cpu->iowait_boost_pending = false;
>  	} else {
> +		/* Reduce the boost value otherwise */
>  		sg_cpu->iowait_boost >>= 1;
>  		if (sg_cpu->iowait_boost < sg_cpu->sg_policy->policy->min) {
>  			sg_cpu->iowait_boost = 0;
> @@ -248,6 +302,10 @@ static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
>  	boost_util = sg_cpu->iowait_boost;
>  	boost_max = sg_cpu->iowait_boost_max;
>
> +	/*
> +	 * A CPU is boosted only if its current utilization is smaller then
> +	 * the current IO boost level.
> +	 */
>  	if (*util * boost_max < *max * boost_util) {
>  		*util = boost_util;
>  		*max = boost_max;
> @@ -299,7 +357,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
>  	sugov_get_util(sg_cpu);
>  	max = sg_cpu->max;
>  	util = sugov_aggregate_util(sg_cpu);
> -	sugov_iowait_boost(sg_cpu, &util, &max);
> +	sugov_iowait_boost(sg_cpu, time, &util, &max);
>  	next_f = get_next_freq(sg_policy, util, max);
>  	/*
>  	 * Do not reduce the frequency if the CPU has not been idle
> @@ -325,28 +383,12 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 time)
>  	for_each_cpu(j, policy->cpus) {
>  		struct sugov_cpu *j_sg_cpu = &per_cpu(sugov_cpu, j);
>  		unsigned long j_util, j_max;
> -		s64 delta_ns;
>
>  		sugov_get_util(j_sg_cpu);
> -
> -		/*
> -		 * If the CFS CPU utilization was last updated before the
> -		 * previous frequency update and the time elapsed between the
> -		 * last update of the CPU utilization and the last frequency
> -		 * update is long enough, reset iowait_boost and util_cfs, as
> -		 * they are now probably stale. However, still consider the
> -		 * CPU contribution if it has some DEADLINE utilization
> -		 * (util_dl).
> -		 */
> -		delta_ns = time - j_sg_cpu->last_update;
> -		if (delta_ns > TICK_NSEC) {
> -			j_sg_cpu->iowait_boost = 0;
> -			j_sg_cpu->iowait_boost_pending = false;
> -		}
> -
>  		j_max = j_sg_cpu->max;
>  		j_util = sugov_aggregate_util(j_sg_cpu);
> -		sugov_iowait_boost(j_sg_cpu, &j_util, &j_max);
> +		sugov_iowait_boost(j_sg_cpu, time, &j_util, &j_max);
> +
>  		if (j_util * max > j_max * util) {
>  			util = j_util;
>  			max = j_max;

And the rest is just code rearrangement.

And as Peter said, we'd better have a routine to clear the boost values when
delta > TICK_NSEC.

Diff LGTM otherwise. Thanks.

--
viresh
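
For illustration only, here is a rough userspace sketch of what a shared "clear the boost when delta > TICK_NSEC" routine might look like. Everything in it is an assumption and not part of the patch: the struct is a cut-down stand-in for the iowait fields of struct sugov_cpu, MODEL_TICK_NSEC assumes HZ=250, and the helper name boost_reset_if_stale() is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: stand-in for the kernel's TICK_NSEC with HZ=250 (4 ms tick). */
#define MODEL_HZ	250
#define MODEL_TICK_NSEC	(1000000000LL / MODEL_HZ)

/* Hypothetical model of the iowait-related fields of struct sugov_cpu. */
struct boost_state {
	bool iowait_boost_pending;
	unsigned int iowait_boost;
	uint64_t last_update;
};

/*
 * Hypothetical helper: if more than one tick has elapsed since the last
 * update, treat the boost as stale and clear it.
 *
 * Returns true when the boost was cleared, so a caller can decide whether
 * to re-arm it (e.g. to the minimum value on an IO wakeup) or bail out.
 */
static inline bool boost_reset_if_stale(struct boost_state *s, uint64_t time)
{
	int64_t delta_ns = (int64_t)(time - s->last_update);

	/* Same comparison as in the diff: strictly more than one tick. */
	if (delta_ns <= MODEL_TICK_NSEC)
		return false;

	s->iowait_boost = 0;
	s->iowait_boost_pending = false;
	return true;
}
```

With something of this shape, both the "set" path and the "apply" path could call the same routine first and only differ in what they do when it returns true. Whether the helper should also re-arm the boost on an IO wakeup is a design choice this sketch leaves to the caller.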