2020-06-25 06:47:25

by Wei Wang

[permalink] [raw]
Subject: [PATCH] cpufreq: schedutil: force frequency update when limits change

To avoid reducing the frequency of a CPU prematurely, we skip reducing
the frequency if the CPU had been busy recently.

This should not be done when the limits of the policy are changed, for
example due to thermal throttling. We should always get the frequency
within the new limits as soon as possible.

There was a fix in
commit 600f5badb78c ("cpufreq: schedutil: Don't skip freq update when
limits change") upstream which introduced another flag. However, the
fix didn't address the case when next_freq is the same as previously
voted, which is then checked in sugov_update_next_freq. As a result, the
frequency would be stuck at low until the high demanding workload quits.

test trace:
kworker/u19:0-1872 ( 1872) [002] .... 347.878871: cpu_frequency_limits: min=600000 max=2348000 cpu_id=6
dhry64-11525 (11525) [007] d.h2 347.880012: sugov_should_update_freq: thermal limit on policy6
dhry64-11525 (11525) [007] d.h2 347.880012: sugov_deferred_update: policy6 skipped update
dhry64-11525 (11525) [007] d.h2 347.884040: sugov_deferred_update: policy6 skipped update
...

This patch fixes this by skipping the check and forcing an update in
this case. The second flag was kept as the limits_change flag could be
updated in thermal kworker from another CPU.

Fixes: ecd288429126 ("cpufreq: schedutil: Don't set next_freq to UINT_MAX")
Signed-off-by: Wei Wang <[email protected]>
---
kernel/sched/cpufreq_schedutil.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 7fbaee24c824..dc2cd768022e 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -102,11 +102,12 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
unsigned int next_freq)
{
- if (sg_policy->next_freq == next_freq)
+ if (!sg_policy->need_freq_update && sg_policy->next_freq == next_freq)
return false;

sg_policy->next_freq = next_freq;
sg_policy->last_freq_update_time = time;
+ sg_policy->need_freq_update = false;

return true;
}
@@ -178,7 +179,6 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
return sg_policy->next_freq;

- sg_policy->need_freq_update = false;
sg_policy->cached_raw_freq = freq;
return cpufreq_driver_resolve_freq(policy, freq);
}
--
2.27.0.212.ge8ba1cc988-goog


2020-06-25 10:26:35

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: schedutil: force frequency update when limits change

On 24-06-20, 23:46, Wei Wang wrote:
> To avoid reducing the frequency of a CPU prematurely, we skip reducing
> the frequency if the CPU had been busy recently.
>
> This should not be done when the limits of the policy are changed, for
> example due to thermal throttling. We should always get the frequency
> within the new limits as soon as possible.
>
> There was a fix in
> commit 600f5badb78c ("cpufreq: schedutil: Don't skip freq update when
> limits change") upstream which introduced another flag. However, the
> fix didn't address the case when next_freq is the same as previously
> voted, which is then checked in sugov_update_next_freq. As a result, the
> frequency would be stuck at low until the high demanding workload quits.
>
> test trace:
> kworker/u19:0-1872 ( 1872) [002] .... 347.878871: cpu_frequency_limits: min=600000 max=2348000 cpu_id=6
> dhry64-11525 (11525) [007] d.h2 347.880012: sugov_should_update_freq: thermal limit on policy6
> dhry64-11525 (11525) [007] d.h2 347.880012: sugov_deferred_update: policy6 skipped update
> dhry64-11525 (11525) [007] d.h2 347.884040: sugov_deferred_update: policy6 skipped update

I am not sure these are helpful in the logs as the code which
generated them isn't there in the kernel.

> ...
>
> This patch fixes this by skipping the check and forcing an update in
> this case. The second flag was kept as the limits_change flag could be
> updated in thermal kworker from another CPU.

I am sorry but I am not fully sure of what the problem is. Can you
describe that by giving an example with some random frequency, and
tell the expected and actual behavior ?

> Fixes: ecd288429126 ("cpufreq: schedutil: Don't set next_freq to UINT_MAX")
> Signed-off-by: Wei Wang <[email protected]>
> ---
> kernel/sched/cpufreq_schedutil.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 7fbaee24c824..dc2cd768022e 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> unsigned int next_freq)
> {
> - if (sg_policy->next_freq == next_freq)
> + if (!sg_policy->need_freq_update && sg_policy->next_freq == next_freq)

AFAIU, if the next freq is same as currently programmed one, there is
no need to force update it.

> return false;
>
> sg_policy->next_freq = next_freq;
> sg_policy->last_freq_update_time = time;
> + sg_policy->need_freq_update = false;
>
> return true;
> }
> @@ -178,7 +179,6 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> return sg_policy->next_freq;
>
> - sg_policy->need_freq_update = false;
> sg_policy->cached_raw_freq = freq;
> return cpufreq_driver_resolve_freq(policy, freq);
> }
> --
> 2.27.0.212.ge8ba1cc988-goog

--
viresh

2020-06-25 23:35:16

by Wei Wang

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: schedutil: force frequency update when limits change

On Thu, Jun 25, 2020 at 3:23 AM Viresh Kumar <[email protected]> wrote:
>
> On 24-06-20, 23:46, Wei Wang wrote:
> > To avoid reducing the frequency of a CPU prematurely, we skip reducing
> > the frequency if the CPU had been busy recently.
> >
> > This should not be done when the limits of the policy are changed, for
> > example due to thermal throttling. We should always get the frequency
> > within the new limits as soon as possible.
> >
> > There was a fix in
> > commit 600f5badb78c ("cpufreq: schedutil: Don't skip freq update when
> > limits change") upstream which introduced another flag. However, the
> > fix didn't address the case when next_freq is the same as previously
> > voted, which is then checked in sugov_update_next_freq. As a result, the
> > frequency would be stuck at low until the high demanding workload quits.
> >
> > test trace:
> > kworker/u19:0-1872 ( 1872) [002] .... 347.878871: cpu_frequency_limits: min=600000 max=2348000 cpu_id=6
> > dhry64-11525 (11525) [007] d.h2 347.880012: sugov_should_update_freq: thermal limit on policy6
> > dhry64-11525 (11525) [007] d.h2 347.880012: sugov_deferred_update: policy6 skipped update
> > dhry64-11525 (11525) [007] d.h2 347.884040: sugov_deferred_update: policy6 skipped update
>
> I am not sure these are helpful in the logs as the code which
> generated them isn't there in the kernel.
>
Yes, those traceprintk were added to those particular functions to help debug.


> > ...
> >
> > This patch fixes this by skipping the check and forcing an update in
> > this case. The second flag was kept as the limits_change flag could be
> > updated in thermal kworker from another CPU.
>
> I am sorry but I am not fully sure of what the problem is. Can you
> describe that by giving an example with some random frequency, and
> tell the expected and actual behavior ?
>
The problem is sugov thought next_freq already updated (but actually
skipped by the rate limit thing) and all following updates will be
skipped.
Actually this is specifically for Android common kernel 4.19's issue
which has sugov_up_down_rate_limit in sugov_update_next_freq, let's
continue discussion there.

Thanks!
-Wei
> > Fixes: ecd288429126 ("cpufreq: schedutil: Don't set next_freq to UINT_MAX")
> > Signed-off-by: Wei Wang <[email protected]>
> > ---
> > kernel/sched/cpufreq_schedutil.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 7fbaee24c824..dc2cd768022e 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> > static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> > unsigned int next_freq)
> > {
> > - if (sg_policy->next_freq == next_freq)
> > + if (!sg_policy->need_freq_update && sg_policy->next_freq == next_freq)
>
> AFAIU, if the next freq is same as currently programmed one, there is
> no need to force update it.
>
> > return false;
> >
> > sg_policy->next_freq = next_freq;
> > sg_policy->last_freq_update_time = time;
> > + sg_policy->need_freq_update = false;
> >
> > return true;
> > }
> > @@ -178,7 +179,6 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
> > if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> > return sg_policy->next_freq;
> >
> > - sg_policy->need_freq_update = false;
> > sg_policy->cached_raw_freq = freq;
> > return cpufreq_driver_resolve_freq(policy, freq);
> > }
> > --
> > 2.27.0.212.ge8ba1cc988-goog
>
> --
> viresh

2020-06-26 04:45:54

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: schedutil: force frequency update when limits change

On 25-06-20, 13:47, Wei Wang wrote:
> On Thu, Jun 25, 2020 at 3:23 AM Viresh Kumar <[email protected]> wrote:
> > I am sorry but I am not fully sure of what the problem is. Can you
> > describe that by giving an example with some random frequency, and
> > tell the expected and actual behavior ?
> >
> The problem is sugov thought next_freq already updated (but actually
> skipped by the rate limit thing) and all following updates will be
> skipped.

I am sorry, can you please give a detailed example with existing
frequency and limits, then the limits changed to new values, then what
exactly happens ?

> Actually this is specifically for Android common kernel 4.19's issue
> which has sugov_up_down_rate_limit in sugov_update_next_freq, let's
> continue discussion there.

If it is a mainline problem, we will surely get it fixed here. Just
that I am not able to understand the problem yet. Sorry about that.

--
viresh

2020-06-26 04:46:20

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: schedutil: force frequency update when limits change

On 26-06-20, 07:44, Viresh Kumar wrote:
> On 25-06-20, 13:47, Wei Wang wrote:
> > On Thu, Jun 25, 2020 at 3:23 AM Viresh Kumar <[email protected]> wrote:
> > > I am sorry but I am not fully sure of what the problem is. Can you
> > > describe that by giving an example with some random frequency, and
> > > tell the expected and actual behavior ?
> > >
> > The problem is sugov thought next_freq already updated (but actually
> > skipped by the rate limit thing) and all following updates will be
> > skipped.

The rate-limiting thing is specific to android and not present in
mainline. Even in android I see next_freq getting updated only after
rate-limiting is verified.

I think you maybe trying to fix an android only problem in mainline,
which may not be required at all. And I am not sure if Android has a
problem as well :)

> I am sorry, can you please give a detailed example with existing
> frequency and limits, then the limits changed to new values, then what
> exactly happens ?
>
> > Actually this is specifically for Android common kernel 4.19's issue
> > which has sugov_up_down_rate_limit in sugov_update_next_freq, let's
> > continue discussion there.
>
> If it is a mainline problem, we will surely get it fixed here. Just
> that I am not able to understand the problem yet. Sorry about that.

--
viresh

2020-06-26 04:52:46

by Wei Wang

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: schedutil: force frequency update when limits change

On Thu, Jun 25, 2020 at 7:32 PM Viresh Kumar <[email protected]> wrote:
>
> On 26-06-20, 07:44, Viresh Kumar wrote:
> > On 25-06-20, 13:47, Wei Wang wrote:
> > > On Thu, Jun 25, 2020 at 3:23 AM Viresh Kumar <[email protected]> wrote:
> > > > I am sorry but I am not fully sure of what the problem is. Can you
> > > > describe that by giving an example with some random frequency, and
> > > > tell the expected and actual behavior ?
> > > >
> > > The problem is sugov thought next_freq already updated (but actually
> > > skipped by the rate limit thing) and all following updates will be
> > > skipped.
>
> The rate-limiting thing is specific to android and not present in
> mainline. Even in android I see next_freq getting updated only after
> rate-limiting is verified.
>
> I think you maybe trying to fix an android only problem in mainline,
> which may not be required at all. And I am not sure if Android has a
> problem as well :)
>
Yes, that is Android specific, I added you to the Gerrit already.

Thanks!
-Wei

> > I am sorry, can you please give a detailed example with existing
> > frequency and limits, then the limits changed to new values, then what
> > exactly happens ?
> >
> > > Actually this is specifically for Android common kernel 4.19's issue
> > > which has sugov_up_down_rate_limit in sugov_update_next_freq, let's
> > > continue discussion there.
> >
> > If it is a mainline problem, we will surely get it fixed here. Just
> > that I am not able to understand the problem yet. Sorry about that.
>
> --
> viresh