2018-05-07 14:45:02

by Claudio Scordino

Subject: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

At OSPM, the issue of urgent CPU frequency requests arriving while a
frequency switch is already in progress was raised.

Besides the various other issues (physical time for switching the
frequency, on-going kthread activity, etc.), one (minor) problem is the
kernel "forgetting" such a request, thus waiting until the next switch
time to recompute the needed frequency and behave accordingly.

This patch makes the kthread serve any urgent request that occurred
during the previous frequency switch. It introduces a specific flag,
set only when the SCHED_DEADLINE scheduling class increases the CPU
utilization, aiming at decreasing the likelihood of a deadline miss.

Indeed, some preliminary tests in critical conditions (i.e.
SCHED_DEADLINE tasks with short periods) have shown reductions of more
than 10% in the average number of deadline misses. On the other hand,
the increase in energy consumption when running SCHED_DEADLINE tasks
(not yet measured) is likely to be non-negligible (especially in
critical scenarios such as "ramp up" utilizations).

The patch is meant as a follow-up to the discussion at OSPM.

Signed-off-by: Claudio Scordino <[email protected]>
CC: Viresh Kumar <[email protected]>
CC: Rafael J. Wysocki <[email protected]>
CC: Peter Zijlstra <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Patrick Bellasi <[email protected]>
CC: Juri Lelli <[email protected]>
Cc: Luca Abeni <[email protected]>
CC: Joel Fernandes <[email protected]>
CC: [email protected]
---
kernel/sched/cpufreq_schedutil.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index d2c6083..4de06b0 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -41,6 +41,7 @@ struct sugov_policy {
bool work_in_progress;

bool need_freq_update;
+ bool urgent_freq_update;
};

struct sugov_cpu {
@@ -92,6 +93,14 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
!cpufreq_can_do_remote_dvfs(sg_policy->policy))
return false;

+ /*
+ * Continue computing the new frequency. In case of work_in_progress,
+ * the kthread will resched a change once the current transition is
+ * finished.
+ */
+ if (sg_policy->urgent_freq_update)
+ return true;
+
if (sg_policy->work_in_progress)
return false;

@@ -121,6 +130,9 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
sg_policy->next_freq = next_freq;
sg_policy->last_freq_update_time = time;

+ if (sg_policy->work_in_progress)
+ return;
+
if (policy->fast_switch_enabled) {
next_freq = cpufreq_driver_fast_switch(policy, next_freq);
if (!next_freq)
@@ -274,7 +286,7 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu, struct sugov_policy *sg_policy)
{
if (cpu_util_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->util_dl)
- sg_policy->need_freq_update = true;
+ sg_policy->urgent_freq_update = true;
}

static void sugov_update_single(struct update_util_data *hook, u64 time,
@@ -383,8 +395,11 @@ static void sugov_work(struct kthread_work *work)
struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);

mutex_lock(&sg_policy->work_lock);
- __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
+ do {
+ sg_policy->urgent_freq_update = false;
+ __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
CPUFREQ_RELATION_L);
+ } while (sg_policy->urgent_freq_update);
mutex_unlock(&sg_policy->work_lock);

sg_policy->work_in_progress = false;
@@ -673,6 +688,7 @@ static int sugov_start(struct cpufreq_policy *policy)
sg_policy->next_freq = UINT_MAX;
sg_policy->work_in_progress = false;
sg_policy->need_freq_update = false;
+ sg_policy->urgent_freq_update = false;
sg_policy->cached_raw_freq = 0;

for_each_cpu(cpu, policy->cpus) {
--
2.7.4



2018-05-08 06:56:14

by Viresh Kumar

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On 07-05-18, 16:43, Claudio Scordino wrote:
> At OSPM, it was mentioned the issue about urgent CPU frequency requests
> arriving when a frequency switch is already in progress.
>
> Besides the various issues (physical time for switching frequency,
> on-going kthread activity, etc.) one (minor) issue is the kernel
> "forgetting" such request, thus waiting the next switch time for
> recomputing the needed frequency and behaving accordingly.
>
> This patch makes the kthread serve any urgent request occurred during
> the previous frequency switch. It introduces a specific flag, only set
> when the SCHED_DEADLINE scheduling class increases the CPU utilization,
> aiming at decreasing the likelihood of a deadline miss.
>
> Indeed, some preliminary tests in critical conditions (i.e.
> SCHED_DEADLINE tasks with short periods) have shown reductions of more
> than 10% of the average number of deadline misses. On the other hand,
> the increase in terms of energy consumption when running SCHED_DEADLINE
> tasks (not yet measured) is likely to be not negligible (especially in
> case of critical scenarios like "ramp up" utilizations).
>
> The patch is meant as follow-up discussion after OSPM.
>
> Signed-off-by: Claudio Scordino <[email protected]>
> CC: Viresh Kumar <[email protected]>
> CC: Rafael J. Wysocki <[email protected]>
> CC: Peter Zijlstra <[email protected]>
> CC: Ingo Molnar <[email protected]>
> CC: Patrick Bellasi <[email protected]>
> CC: Juri Lelli <[email protected]>
> Cc: Luca Abeni <[email protected]>
> CC: Joel Fernandes <[email protected]>
> CC: [email protected]
> ---
> kernel/sched/cpufreq_schedutil.c | 20 ++++++++++++++++++--
> 1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index d2c6083..4de06b0 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -41,6 +41,7 @@ struct sugov_policy {
> bool work_in_progress;
>
> bool need_freq_update;
> + bool urgent_freq_update;
> };
>
> struct sugov_cpu {
> @@ -92,6 +93,14 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> return false;
>
> + /*
> + * Continue computing the new frequency. In case of work_in_progress,
> + * the kthread will resched a change once the current transition is
> + * finished.
> + */
> + if (sg_policy->urgent_freq_update)
> + return true;
> +
> if (sg_policy->work_in_progress)
> return false;
>
> @@ -121,6 +130,9 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> sg_policy->next_freq = next_freq;
> sg_policy->last_freq_update_time = time;
>
> + if (sg_policy->work_in_progress)
> + return;
> +
> if (policy->fast_switch_enabled) {
> next_freq = cpufreq_driver_fast_switch(policy, next_freq);
> if (!next_freq)
> @@ -274,7 +286,7 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
> static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu, struct sugov_policy *sg_policy)
> {
> if (cpu_util_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->util_dl)
> - sg_policy->need_freq_update = true;
> + sg_policy->urgent_freq_update = true;
> }
>
> static void sugov_update_single(struct update_util_data *hook, u64 time,
> @@ -383,8 +395,11 @@ static void sugov_work(struct kthread_work *work)
> struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
>
> mutex_lock(&sg_policy->work_lock);
> - __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
> + do {
> + sg_policy->urgent_freq_update = false;
> + __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
> CPUFREQ_RELATION_L);

If we are going to solve this problem, then maybe instead of the added
complexity and a new flag we can look at the need_freq_update flag at this
location and re-calculate the next frequency if required.
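
A minimal sketch of this idea (hypothetical and untested; it reuses the
existing need_freq_update flag and simply re-reads next_freq, relying on
the scheduler-side update path to keep recomputing it while the kthread
is busy) could look like this inside sugov_work():

	mutex_lock(&sg_policy->work_lock);
	do {
		sg_policy->need_freq_update = false;
		/* next_freq may have been recomputed while we were busy */
		__cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
					CPUFREQ_RELATION_L);
	} while (sg_policy->need_freq_update);
	mutex_unlock(&sg_policy->work_lock);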

> + } while (sg_policy->urgent_freq_update);
> mutex_unlock(&sg_policy->work_lock);
>
> sg_policy->work_in_progress = false;
> @@ -673,6 +688,7 @@ static int sugov_start(struct cpufreq_policy *policy)
> sg_policy->next_freq = UINT_MAX;
> sg_policy->work_in_progress = false;
> sg_policy->need_freq_update = false;
> + sg_policy->urgent_freq_update = false;
> sg_policy->cached_raw_freq = 0;
>
> for_each_cpu(cpu, policy->cpus) {
> --
> 2.7.4

--
viresh

2018-05-08 12:33:11

by Claudio Scordino

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests



Il 08/05/2018 08:54, Viresh Kumar ha scritto:
> On 07-05-18, 16:43, Claudio Scordino wrote:
>> At OSPM, it was mentioned the issue about urgent CPU frequency requests
>> arriving when a frequency switch is already in progress.
>>
>> Besides the various issues (physical time for switching frequency,
>> on-going kthread activity, etc.) one (minor) issue is the kernel
>> "forgetting" such request, thus waiting the next switch time for
>> recomputing the needed frequency and behaving accordingly.
>>
>> This patch makes the kthread serve any urgent request occurred during
>> the previous frequency switch. It introduces a specific flag, only set
>> when the SCHED_DEADLINE scheduling class increases the CPU utilization,
>> aiming at decreasing the likelihood of a deadline miss.
>>
>> Indeed, some preliminary tests in critical conditions (i.e.
>> SCHED_DEADLINE tasks with short periods) have shown reductions of more
>> than 10% of the average number of deadline misses. On the other hand,
>> the increase in terms of energy consumption when running SCHED_DEADLINE
>> tasks (not yet measured) is likely to be not negligible (especially in
>> case of critical scenarios like "ramp up" utilizations).
>>
>> The patch is meant as follow-up discussion after OSPM.
>>
>> Signed-off-by: Claudio Scordino <[email protected]>
>> CC: Viresh Kumar <[email protected]>
>> CC: Rafael J. Wysocki <[email protected]>
>> CC: Peter Zijlstra <[email protected]>
>> CC: Ingo Molnar <[email protected]>
>> CC: Patrick Bellasi <[email protected]>
>> CC: Juri Lelli <[email protected]>
>> Cc: Luca Abeni <[email protected]>
>> CC: Joel Fernandes <[email protected]>
>> CC: [email protected]
>> ---
>> kernel/sched/cpufreq_schedutil.c | 20 ++++++++++++++++++--
>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> index d2c6083..4de06b0 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -41,6 +41,7 @@ struct sugov_policy {
>> bool work_in_progress;
>>
>> bool need_freq_update;
>> + bool urgent_freq_update;
>> };
>>
>> struct sugov_cpu {
>> @@ -92,6 +93,14 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>> !cpufreq_can_do_remote_dvfs(sg_policy->policy))
>> return false;
>>
>> + /*
>> + * Continue computing the new frequency. In case of work_in_progress,
>> + * the kthread will resched a change once the current transition is
>> + * finished.
>> + */
>> + if (sg_policy->urgent_freq_update)
>> + return true;
>> +
>> if (sg_policy->work_in_progress)
>> return false;
>>
>> @@ -121,6 +130,9 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
>> sg_policy->next_freq = next_freq;
>> sg_policy->last_freq_update_time = time;
>>
>> + if (sg_policy->work_in_progress)
>> + return;
>> +
>> if (policy->fast_switch_enabled) {
>> next_freq = cpufreq_driver_fast_switch(policy, next_freq);
>> if (!next_freq)
>> @@ -274,7 +286,7 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
>> static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu, struct sugov_policy *sg_policy)
>> {
>> if (cpu_util_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->util_dl)
>> - sg_policy->need_freq_update = true;
>> + sg_policy->urgent_freq_update = true;
>> }
>>
>> static void sugov_update_single(struct update_util_data *hook, u64 time,
>> @@ -383,8 +395,11 @@ static void sugov_work(struct kthread_work *work)
>> struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
>>
>> mutex_lock(&sg_policy->work_lock);
>> - __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
>> + do {
>> + sg_policy->urgent_freq_update = false;
>> + __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
>> CPUFREQ_RELATION_L);
>
> If we are going to solve this problem, then maybe instead of the added
> complexity and a new flag we can look for need_freq_update flag at this location
> and re-calculate the next frequency if required.

I agree.
Indeed, I was in doubt whether to add a new flag or to rely on the existing need_freq_update flag (whose name, however, didn't seem to reflect any sense of urgency).
Maybe we could use need_freq_update but change its name to something more meaningful?

Thanks,

Claudio

2018-05-08 20:41:52

by Rafael J. Wysocki

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Tue, May 8, 2018 at 2:32 PM, Claudio Scordino
<[email protected]> wrote:
>
>
> Il 08/05/2018 08:54, Viresh Kumar ha scritto:
>>
>> On 07-05-18, 16:43, Claudio Scordino wrote:
>>>
>>> At OSPM, it was mentioned the issue about urgent CPU frequency requests
>>> arriving when a frequency switch is already in progress.
>>>
>>> Besides the various issues (physical time for switching frequency,
>>> on-going kthread activity, etc.) one (minor) issue is the kernel
>>> "forgetting" such request, thus waiting the next switch time for
>>> recomputing the needed frequency and behaving accordingly.
>>>
>>> This patch makes the kthread serve any urgent request occurred during
>>> the previous frequency switch. It introduces a specific flag, only set
>>> when the SCHED_DEADLINE scheduling class increases the CPU utilization,
>>> aiming at decreasing the likelihood of a deadline miss.
>>>
>>> Indeed, some preliminary tests in critical conditions (i.e.
>>> SCHED_DEADLINE tasks with short periods) have shown reductions of more
>>> than 10% of the average number of deadline misses. On the other hand,
>>> the increase in terms of energy consumption when running SCHED_DEADLINE
>>> tasks (not yet measured) is likely to be not negligible (especially in
>>> case of critical scenarios like "ramp up" utilizations).
>>>
>>> The patch is meant as follow-up discussion after OSPM.
>>>
>>> Signed-off-by: Claudio Scordino <[email protected]>
>>> CC: Viresh Kumar <[email protected]>
>>> CC: Rafael J. Wysocki <[email protected]>
>>> CC: Peter Zijlstra <[email protected]>
>>> CC: Ingo Molnar <[email protected]>
>>> CC: Patrick Bellasi <[email protected]>
>>> CC: Juri Lelli <[email protected]>
>>> Cc: Luca Abeni <[email protected]>
>>> CC: Joel Fernandes <[email protected]>
>>> CC: [email protected]
>>> ---
>>> kernel/sched/cpufreq_schedutil.c | 20 ++++++++++++++++++--
>>> 1 file changed, 18 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/kernel/sched/cpufreq_schedutil.c
>>> b/kernel/sched/cpufreq_schedutil.c
>>> index d2c6083..4de06b0 100644
>>> --- a/kernel/sched/cpufreq_schedutil.c
>>> +++ b/kernel/sched/cpufreq_schedutil.c
>>> @@ -41,6 +41,7 @@ struct sugov_policy {
>>> bool work_in_progress;
>>> bool need_freq_update;
>>> + bool urgent_freq_update;
>>> };
>>> struct sugov_cpu {
>>> @@ -92,6 +93,14 @@ static bool sugov_should_update_freq(struct
>>> sugov_policy *sg_policy, u64 time)
>>> !cpufreq_can_do_remote_dvfs(sg_policy->policy))
>>> return false;
>>> + /*
>>> + * Continue computing the new frequency. In case of
>>> work_in_progress,
>>> + * the kthread will resched a change once the current transition
>>> is
>>> + * finished.
>>> + */
>>> + if (sg_policy->urgent_freq_update)
>>> + return true;
>>> +
>>> if (sg_policy->work_in_progress)
>>> return false;
>>> @@ -121,6 +130,9 @@ static void sugov_update_commit(struct sugov_policy
>>> *sg_policy, u64 time,
>>> sg_policy->next_freq = next_freq;
>>> sg_policy->last_freq_update_time = time;
>>> + if (sg_policy->work_in_progress)
>>> + return;
>>> +
>>> if (policy->fast_switch_enabled) {
>>> next_freq = cpufreq_driver_fast_switch(policy,
>>> next_freq);
>>> if (!next_freq)
>>> @@ -274,7 +286,7 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu
>>> *sg_cpu) { return false; }
>>> static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu,
>>> struct sugov_policy *sg_policy)
>>> {
>>> if (cpu_util_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->util_dl)
>>> - sg_policy->need_freq_update = true;
>>> + sg_policy->urgent_freq_update = true;
>>> }
>>> static void sugov_update_single(struct update_util_data *hook, u64
>>> time,
>>> @@ -383,8 +395,11 @@ static void sugov_work(struct kthread_work *work)
>>> struct sugov_policy *sg_policy = container_of(work, struct
>>> sugov_policy, work);
>>> mutex_lock(&sg_policy->work_lock);
>>> - __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
>>> + do {
>>> + sg_policy->urgent_freq_update = false;
>>> + __cpufreq_driver_target(sg_policy->policy,
>>> sg_policy->next_freq,
>>> CPUFREQ_RELATION_L);
>>
>>
>> If we are going to solve this problem, then maybe instead of the added
>> complexity and a new flag we can look for need_freq_update flag at this
>> location
>> and re-calculate the next frequency if required.
>
>
> I agree.
> Indeed, I've been in doubt if adding a new flag or relying on the existing
> need_freq_update flag (whose name, however, didn't seem to reflect any sense
> of urgency).
> Maybe we can use need_freq_update but change its name to a more meaningful
> string ?

Implicitly, it means "as soon as reasonably possible".

2018-05-09 04:54:55

by Joel Fernandes

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Tue, May 08, 2018 at 12:24:35PM +0530, Viresh Kumar wrote:
> On 07-05-18, 16:43, Claudio Scordino wrote:
> > At OSPM, it was mentioned the issue about urgent CPU frequency requests
> > arriving when a frequency switch is already in progress.
> >
> > Besides the various issues (physical time for switching frequency,
> > on-going kthread activity, etc.) one (minor) issue is the kernel
> > "forgetting" such request, thus waiting the next switch time for
> > recomputing the needed frequency and behaving accordingly.
> >
> > This patch makes the kthread serve any urgent request occurred during
> > the previous frequency switch. It introduces a specific flag, only set
> > when the SCHED_DEADLINE scheduling class increases the CPU utilization,
> > aiming at decreasing the likelihood of a deadline miss.
> >
> > Indeed, some preliminary tests in critical conditions (i.e.
> > SCHED_DEADLINE tasks with short periods) have shown reductions of more
> > than 10% of the average number of deadline misses. On the other hand,
> > the increase in terms of energy consumption when running SCHED_DEADLINE
> > tasks (not yet measured) is likely to be not negligible (especially in
> > case of critical scenarios like "ramp up" utilizations).
> >
> > The patch is meant as follow-up discussion after OSPM.
> >
> > Signed-off-by: Claudio Scordino <[email protected]>
> > CC: Viresh Kumar <[email protected]>
> > CC: Rafael J. Wysocki <[email protected]>
> > CC: Peter Zijlstra <[email protected]>
> > CC: Ingo Molnar <[email protected]>
> > CC: Patrick Bellasi <[email protected]>
> > CC: Juri Lelli <[email protected]>
> > Cc: Luca Abeni <[email protected]>
> > CC: Joel Fernandes <[email protected]>
> > CC: [email protected]
> > ---
> > kernel/sched/cpufreq_schedutil.c | 20 ++++++++++++++++++--
> > 1 file changed, 18 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index d2c6083..4de06b0 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -41,6 +41,7 @@ struct sugov_policy {
> > bool work_in_progress;
> >
> > bool need_freq_update;
> > + bool urgent_freq_update;
> > };
> >
> > struct sugov_cpu {
> > @@ -92,6 +93,14 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> > return false;
> >
> > + /*
> > + * Continue computing the new frequency. In case of work_in_progress,
> > + * the kthread will resched a change once the current transition is
> > + * finished.
> > + */
> > + if (sg_policy->urgent_freq_update)
> > + return true;
> > +
> > if (sg_policy->work_in_progress)
> > return false;
> >
> > @@ -121,6 +130,9 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> > sg_policy->next_freq = next_freq;
> > sg_policy->last_freq_update_time = time;
> >
> > + if (sg_policy->work_in_progress)
> > + return;
> > +
> > if (policy->fast_switch_enabled) {
> > next_freq = cpufreq_driver_fast_switch(policy, next_freq);
> > if (!next_freq)
> > @@ -274,7 +286,7 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
> > static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu, struct sugov_policy *sg_policy)
> > {
> > if (cpu_util_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->util_dl)
> > - sg_policy->need_freq_update = true;
> > + sg_policy->urgent_freq_update = true;
> > }
> >
> > static void sugov_update_single(struct update_util_data *hook, u64 time,
> > @@ -383,8 +395,11 @@ static void sugov_work(struct kthread_work *work)
> > struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
> >
> > mutex_lock(&sg_policy->work_lock);
> > - __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
> > + do {
> > + sg_policy->urgent_freq_update = false;
> > + __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
> > CPUFREQ_RELATION_L);
>
> If we are going to solve this problem, then maybe instead of the added
> complexity and a new flag we can look for need_freq_update flag at this location
> and re-calculate the next frequency if required.

Just for discussion's sake, is there any need for work_in_progress? If
kthread_queue_work can handle multiple queued works, then just queuing works
whenever they are available should be OK, and the kthread loop can handle
them. __cpufreq_driver_target is also protected by the work lock if there is
any concern that it can race... the only remaining thing is rate-limiting of
the requests, but we are already doing rate limiting, just not for the "DL
increased utilization" type of requests (which I don't think we are doing at
the moment for urgent DL requests anyway).

Following is an untested diff to show the idea. What do you think?

thanks,

- Joel

----8<---
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index d2c6083304b4..862634ff4bf3 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -38,7 +38,6 @@ struct sugov_policy {
struct mutex work_lock;
struct kthread_worker worker;
struct task_struct *thread;
- bool work_in_progress;

bool need_freq_update;
};
@@ -92,16 +91,8 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
!cpufreq_can_do_remote_dvfs(sg_policy->policy))
return false;

- if (sg_policy->work_in_progress)
- return false;
-
if (unlikely(sg_policy->need_freq_update)) {
sg_policy->need_freq_update = false;
- /*
- * This happens when limits change, so forget the previous
- * next_freq value and force an update.
- */
- sg_policy->next_freq = UINT_MAX;
return true;
}

@@ -129,7 +120,6 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
policy->cur = next_freq;
trace_cpu_frequency(next_freq, smp_processor_id());
} else {
- sg_policy->work_in_progress = true;
irq_work_queue(&sg_policy->irq_work);
}
}
@@ -386,8 +376,6 @@ static void sugov_work(struct kthread_work *work)
__cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
CPUFREQ_RELATION_L);
mutex_unlock(&sg_policy->work_lock);
-
- sg_policy->work_in_progress = false;
}

static void sugov_irq_work(struct irq_work *irq_work)
@@ -671,7 +659,6 @@ static int sugov_start(struct cpufreq_policy *policy)
sg_policy->freq_update_delay_ns = sg_policy->tunables->rate_limit_us * NSEC_PER_USEC;
sg_policy->last_freq_update_time = 0;
sg_policy->next_freq = UINT_MAX;
- sg_policy->work_in_progress = false;
sg_policy->need_freq_update = false;
sg_policy->cached_raw_freq = 0;


2018-05-09 06:46:09

by Juri Lelli

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On 08/05/18 21:54, Joel Fernandes wrote:

[...]

> Just for discussion sake, is there any need for work_in_progress? If we can
> queue multiple work say kthread_queue_work can handle it, then just queuing
> works whenever they are available should be Ok and the kthread loop can
> handle them. __cpufreq_driver_target is also protected by the work lock if
> there is any concern that can have races... only thing is rate-limiting of
> the requests, but we are doing a rate limiting, just not for the "DL
> increased utilization" type requests (which I don't think we are doing at the
> moment for urgent DL requests anyway).
>
> Following is an untested diff to show the idea. What do you think?
>
> thanks,
>
> - Joel
>
> ----8<---
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index d2c6083304b4..862634ff4bf3 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -38,7 +38,6 @@ struct sugov_policy {
> struct mutex work_lock;
> struct kthread_worker worker;
> struct task_struct *thread;
> - bool work_in_progress;
>
> bool need_freq_update;
> };
> @@ -92,16 +91,8 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> return false;
>
> - if (sg_policy->work_in_progress)
> - return false;
> -
> if (unlikely(sg_policy->need_freq_update)) {
> sg_policy->need_freq_update = false;
> - /*
> - * This happens when limits change, so forget the previous
> - * next_freq value and force an update.
> - */
> - sg_policy->next_freq = UINT_MAX;
> return true;
> }
>
> @@ -129,7 +120,6 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> policy->cur = next_freq;
> trace_cpu_frequency(next_freq, smp_processor_id());
> } else {
> - sg_policy->work_in_progress = true;
> irq_work_queue(&sg_policy->irq_work);

Isn't this potentially introducing unneeded irq pressure (and doing the
whole wakeup the kthread thing), while the already active kthread could
simply handle multiple back-to-back requests before going to sleep?
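
For concreteness, a hypothetical (untested) sketch of such a
drain-the-latest-request loop in sugov_work(), assuming the update path
keeps writing the most recent request into sg_policy->next_freq:

	unsigned int freq;

	mutex_lock(&sg_policy->work_lock);
	do {
		/* take the latest request and apply it */
		freq = READ_ONCE(sg_policy->next_freq);
		__cpufreq_driver_target(sg_policy->policy, freq,
					CPUFREQ_RELATION_L);
		/* go around again if a newer request arrived meanwhile */
	} while (freq != READ_ONCE(sg_policy->next_freq));
	mutex_unlock(&sg_policy->work_lock);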

Best,

- Juri

2018-05-09 06:56:06

by Viresh Kumar

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On 09-05-18, 08:45, Juri Lelli wrote:
> On 08/05/18 21:54, Joel Fernandes wrote:
> Isn't this potentially introducing unneeded irq pressure (and doing the
> whole wakeup the kthread thing), while the already active kthread could
> simply handle multiple back-to-back requests before going to sleep?

And then we may need more instances of the work item, and need to store
a different value of next_freq with each work item, as we can't use the
common one anymore since there would be races around accessing it?
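
Purely to illustrate the point, a hypothetical structure for that (not
proposed code): each queued request would carry a private copy of the
target frequency instead of sharing sg_policy->next_freq:

	struct sugov_freq_req {
		struct kthread_work	work;		/* one work item per request */
		struct sugov_policy	*sg_policy;
		unsigned int		freq;		/* private copy, no shared access */
	};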

--
viresh

2018-05-09 06:56:42

by Joel Fernandes

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 09, 2018 at 08:45:30AM +0200, Juri Lelli wrote:
> On 08/05/18 21:54, Joel Fernandes wrote:
>
> [...]
>
> > Just for discussion sake, is there any need for work_in_progress? If we can
> > queue multiple work say kthread_queue_work can handle it, then just queuing
> > works whenever they are available should be Ok and the kthread loop can
> > handle them. __cpufreq_driver_target is also protected by the work lock if
> > there is any concern that can have races... only thing is rate-limiting of
> > the requests, but we are doing a rate limiting, just not for the "DL
> > increased utilization" type requests (which I don't think we are doing at the
> > moment for urgent DL requests anyway).
> >
> > Following is an untested diff to show the idea. What do you think?
> >
> > thanks,
> >
> > - Joel
> >
> > ----8<---
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index d2c6083304b4..862634ff4bf3 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -38,7 +38,6 @@ struct sugov_policy {
> > struct mutex work_lock;
> > struct kthread_worker worker;
> > struct task_struct *thread;
> > - bool work_in_progress;
> >
> > bool need_freq_update;
> > };
> > @@ -92,16 +91,8 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> > return false;
> >
> > - if (sg_policy->work_in_progress)
> > - return false;
> > -
> > if (unlikely(sg_policy->need_freq_update)) {
> > sg_policy->need_freq_update = false;
> > - /*
> > - * This happens when limits change, so forget the previous
> > - * next_freq value and force an update.
> > - */
> > - sg_policy->next_freq = UINT_MAX;
> > return true;
> > }
> >
> > @@ -129,7 +120,6 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> > policy->cur = next_freq;
> > trace_cpu_frequency(next_freq, smp_processor_id());
> > } else {
> > - sg_policy->work_in_progress = true;
> > irq_work_queue(&sg_policy->irq_work);
>
> Isn't this potentially introducing unneeded irq pressure (and doing the
> whole wakeup the kthread thing), while the already active kthread could
> simply handle multiple back-to-back requests before going to sleep?

Yes, I was also thinking the same. I think we can come up with a better
mechanism that still doesn't use work_in_progress. I am cooking up a patch
but it may take a bit longer since I'm traveling. I'll share it once I have
something :)

thanks,

- Joel


2018-05-09 07:02:09

by Joel Fernandes

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 09, 2018 at 12:24:49PM +0530, Viresh Kumar wrote:
> On 09-05-18, 08:45, Juri Lelli wrote:
> > On 08/05/18 21:54, Joel Fernandes wrote:
> > Isn't this potentially introducing unneeded irq pressure (and doing the
> > whole wakeup the kthread thing), while the already active kthread could
> > simply handle multiple back-to-back requests before going to sleep?
>
> And then we may need more instances of the work item and need to store
> a different value of next_freq with each work item, as we can't use
> the common one anymore as there would be races around accessing it ?

Exactly. I think it also doesn't make sense to overwrite an already
committed request either, so better to store them separately (?). After the
"commit", that previous request is done.

- Joel


2018-05-09 08:06:42

by Rafael J. Wysocki

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 9, 2018 at 9:01 AM, Joel Fernandes <[email protected]> wrote:
> On Wed, May 09, 2018 at 12:24:49PM +0530, Viresh Kumar wrote:
>> On 09-05-18, 08:45, Juri Lelli wrote:
>> > On 08/05/18 21:54, Joel Fernandes wrote:
>> > Isn't this potentially introducing unneeded irq pressure (and doing the
>> > whole wakeup the kthread thing), while the already active kthread could
>> > simply handle multiple back-to-back requests before going to sleep?
>>
>> And then we may need more instances of the work item and need to store
>> a different value of next_freq with each work item, as we can't use
>> the common one anymore as there would be races around accessing it ?
>
> Exactly. I think it also doesn't make sense to over write an already
> committed request either so better to store them separate (?). After the
> "commit", that previous request is done..

Why is it?

In the non-fast-switch case the "commit" only means queuing up an
irq_work. Which BTW is one of the reasons for having work_in_progress
even if your kthread can handle multiple work items in one go.

You may try to clear work_in_progress in sugov_irq_work() instead of
in sugov_work(), though.
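
A hypothetical (untested) sketch of that variant: the flag is cleared as
soon as the irq_work runs, so a new request can kick the machinery again
while the kthread is still processing the previous one:

	static void sugov_irq_work(struct irq_work *irq_work)
	{
		struct sugov_policy *sg_policy;

		sg_policy = container_of(irq_work, struct sugov_policy, irq_work);

		/* allow a new commit to queue another irq_work right away */
		sg_policy->work_in_progress = false;
		kthread_queue_work(&sg_policy->worker, &sg_policy->work);
	}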

BTW, I'm not sure if the comment in sugov_irq_work() still applies. Juri?

2018-05-09 08:07:22

by Joel Fernandes

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 09, 2018 at 08:45:30AM +0200, Juri Lelli wrote:
> On 08/05/18 21:54, Joel Fernandes wrote:
>
> [...]
>
> > Just for discussion sake, is there any need for work_in_progress? If we can
> > queue multiple work say kthread_queue_work can handle it, then just queuing
> > works whenever they are available should be Ok and the kthread loop can
> > handle them. __cpufreq_driver_target is also protected by the work lock if
> > there is any concern that can have races... only thing is rate-limiting of
> > the requests, but we are doing a rate limiting, just not for the "DL
> > increased utilization" type requests (which I don't think we are doing at the
> > moment for urgent DL requests anyway).
> >
> > Following is an untested diff to show the idea. What do you think?
> >
> > thanks,
> >
> > - Joel
> >
> > ----8<---
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index d2c6083304b4..862634ff4bf3 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -38,7 +38,6 @@ struct sugov_policy {
> > struct mutex work_lock;
> > struct kthread_worker worker;
> > struct task_struct *thread;
> > - bool work_in_progress;
> >
> > bool need_freq_update;
> > };
> > @@ -92,16 +91,8 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> > return false;
> >
> > - if (sg_policy->work_in_progress)
> > - return false;
> > -
> > if (unlikely(sg_policy->need_freq_update)) {
> > sg_policy->need_freq_update = false;
> > - /*
> > - * This happens when limits change, so forget the previous
> > - * next_freq value and force an update.
> > - */
> > - sg_policy->next_freq = UINT_MAX;
> > return true;
> > }
> >
> > @@ -129,7 +120,6 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> > policy->cur = next_freq;
> > trace_cpu_frequency(next_freq, smp_processor_id());
> > } else {
> > - sg_policy->work_in_progress = true;
> > irq_work_queue(&sg_policy->irq_work);
>
> Isn't this potentially introducing unneeded irq pressure (and doing the
> whole wakeup the kthread thing), while the already active kthread could
> simply handle multiple back-to-back requests before going to sleep?

How about this? It will use the latest request, and also doesn't do an
unnecessary irq_work_queue:

(untested)
-----8<--------
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index d2c6083304b4..6a3e42b01f52 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -38,7 +38,7 @@ struct sugov_policy {
struct mutex work_lock;
struct kthread_worker worker;
struct task_struct *thread;
- bool work_in_progress;
+ bool work_in_progress; /* Has kthread been kicked */

bool need_freq_update;
};
@@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
!cpufreq_can_do_remote_dvfs(sg_policy->policy))
return false;

- if (sg_policy->work_in_progress)
- return false;
-
if (unlikely(sg_policy->need_freq_update)) {
sg_policy->need_freq_update = false;
/*
@@ -129,8 +126,11 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
policy->cur = next_freq;
trace_cpu_frequency(next_freq, smp_processor_id());
} else {
- sg_policy->work_in_progress = true;
- irq_work_queue(&sg_policy->irq_work);
+ /* work_in_progress helps us not queue unnecessarily */
+ if (!sg_policy->work_in_progress) {
+ sg_policy->work_in_progress = true;
+ irq_work_queue(&sg_policy->irq_work);
+ }
}
}

@@ -381,13 +381,23 @@ sugov_update_shared(struct update_util_data *hook, u64 time, unsigned int flags)
static void sugov_work(struct kthread_work *work)
{
struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
+ unsigned int freq;
+
+ /*
+ * Hold sg_policy->update_lock just enough to handle the case where:
+ * if sg_policy->next_freq is updated before work_in_progress is set to
+ * false, we may miss queueing the new update request since
+ * work_in_progress would appear to be true.
+ */
+ raw_spin_lock(&sg_policy->update_lock);
+ freq = sg_policy->next_freq;
+ sg_policy->work_in_progress = false;
+ raw_spin_unlock(&sg_policy->update_lock);

mutex_lock(&sg_policy->work_lock);
- __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
+ __cpufreq_driver_target(sg_policy->policy, freq,
CPUFREQ_RELATION_L);
mutex_unlock(&sg_policy->work_lock);
-
- sg_policy->work_in_progress = false;
}

static void sugov_irq_work(struct irq_work *irq_work)

2018-05-09 08:24:54

by Joel Fernandes

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 09, 2018 at 10:05:09AM +0200, Rafael J. Wysocki wrote:
> On Wed, May 9, 2018 at 9:01 AM, Joel Fernandes <[email protected]> wrote:
> > On Wed, May 09, 2018 at 12:24:49PM +0530, Viresh Kumar wrote:
> >> On 09-05-18, 08:45, Juri Lelli wrote:
> >> > On 08/05/18 21:54, Joel Fernandes wrote:
> >> > Isn't this potentially introducing unneeded irq pressure (and doing the
> >> > whole wakeup the kthread thing), while the already active kthread could
> >> > simply handle multiple back-to-back requests before going to sleep?
> >>
> >> And then we may need more instances of the work item and need to store
> >> a different value of next_freq with each work item, as we can't use
> >> the common one anymore as there would be races around accessing it ?
> >
> > Exactly. I think it also doesn't make sense to over write an already
> > committed request either so better to store them separate (?). After the
> > "commit", that previous request is done..
>
> Why is it?
>
> In the non-fast-switch case the "commit" only means queuing up an
> irq_work. Which BTW is one of the reasons for having work_in_progress
> even if your kthread can handle multiple work items in one go.

OK, I agree. I just thought there was something funky with the meaning of
"commit" from a cpufreq perspective.

In the last diff I just sent out, I actually keep work_in_progress,
consider its meaning to be what you're saying (has the kthread been
kicked), and let such "overwriting" of the next frequency be possible.
Also, with that we would be servicing just the latest request even if
there were multiple ones made.

thanks,

- Joel


2018-05-09 08:25:37

by Juri Lelli

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On 09/05/18 10:05, Rafael J. Wysocki wrote:
> On Wed, May 9, 2018 at 9:01 AM, Joel Fernandes <[email protected]> wrote:
> > On Wed, May 09, 2018 at 12:24:49PM +0530, Viresh Kumar wrote:
> >> On 09-05-18, 08:45, Juri Lelli wrote:
> >> > On 08/05/18 21:54, Joel Fernandes wrote:
> >> > Isn't this potentially introducing unneeded irq pressure (and doing the
> >> > whole wakeup the kthread thing), while the already active kthread could
> >> > simply handle multiple back-to-back requests before going to sleep?
> >>
> >> And then we may need more instances of the work item and need to store
> >> a different value of next_freq with each work item, as we can't use
> >> the common one anymore as there would be races around accessing it ?
> >
> > Exactly. I think it also doesn't make sense to over write an already
> > committed request either so better to store them separate (?). After the
> > "commit", that previous request is done..
>
> Why is it?
>
> In the non-fast-switch case the "commit" only means queuing up an
> irq_work. Which BTW is one of the reasons for having work_in_progress
> even if your kthread can handle multiple work items in one go.
>
> You may try to clear work_in_progress in sugov_irq_work() instead of
> in sugov_work(), though.
>
> BTW, I'm not sure if the comment in sugov_irq_work() still applies. Juri?

It doesn't anymore. sugov kthreads are now being "ignored". I should have
removed it with the DL set of changes, sorry about that.

2018-05-09 08:26:22

by Rafael J. Wysocki

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 9, 2018 at 10:23 AM, Juri Lelli <[email protected]> wrote:
> On 09/05/18 10:05, Rafael J. Wysocki wrote:
>> On Wed, May 9, 2018 at 9:01 AM, Joel Fernandes <[email protected]> wrote:
>> > On Wed, May 09, 2018 at 12:24:49PM +0530, Viresh Kumar wrote:
>> >> On 09-05-18, 08:45, Juri Lelli wrote:
>> >> > On 08/05/18 21:54, Joel Fernandes wrote:
>> >> > Isn't this potentially introducing unneeded irq pressure (and doing the
>> >> > whole wakeup the kthread thing), while the already active kthread could
>> >> > simply handle multiple back-to-back requests before going to sleep?
>> >>
>> >> And then we may need more instances of the work item and need to store
>> >> a different value of next_freq with each work item, as we can't use
>> >> the common one anymore as there would be races around accessing it ?
>> >
>> > Exactly. I think it also doesn't make sense to over write an already
>> > committed request either so better to store them separate (?). After the
>> > "commit", that previous request is done..
>>
>> Why is it?
>>
>> In the non-fast-switch case the "commit" only means queuing up an
>> irq_work. Which BTW is one of the reasons for having work_in_progress
>> even if your kthread can handle multiple work items in one go.
>>
>> You may try to clear work_in_progress in sugov_irq_work() instead of
>> in sugov_work(), though.
>>
>> BTW, I'm not sure if the comment in sugov_irq_work() still applies. Juri?
>
> It doesn't anymore. sugov kthreads are now being "ignored". Should have
> remove it with the DL set of changes, sorry about that.

No worries, you can still do that. ;-)

2018-05-09 08:31:12

by Rafael J. Wysocki

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 9, 2018 at 10:06 AM, Joel Fernandes <[email protected]> wrote:
> On Wed, May 09, 2018 at 08:45:30AM +0200, Juri Lelli wrote:
>> On 08/05/18 21:54, Joel Fernandes wrote:
>>
>> [...]
>>
>> > Just for discussion sake, is there any need for work_in_progress? If we can
>> > queue multiple work say kthread_queue_work can handle it, then just queuing
>> > works whenever they are available should be Ok and the kthread loop can
>> > handle them. __cpufreq_driver_target is also protected by the work lock if
>> > there is any concern that can have races... only thing is rate-limiting of
>> > the requests, but we are doing a rate limiting, just not for the "DL
>> > increased utilization" type requests (which I don't think we are doing at the
>> > moment for urgent DL requests anyway).
>> >
>> > Following is an untested diff to show the idea. What do you think?
>> >
>> > thanks,
>> >
>> > - Joel
>> >
>> > ----8<---
>> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> > index d2c6083304b4..862634ff4bf3 100644
>> > --- a/kernel/sched/cpufreq_schedutil.c
>> > +++ b/kernel/sched/cpufreq_schedutil.c
>> > @@ -38,7 +38,6 @@ struct sugov_policy {
>> > struct mutex work_lock;
>> > struct kthread_worker worker;
>> > struct task_struct *thread;
>> > - bool work_in_progress;
>> >
>> > bool need_freq_update;
>> > };
>> > @@ -92,16 +91,8 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
>> > return false;
>> >
>> > - if (sg_policy->work_in_progress)
>> > - return false;
>> > -
>> > if (unlikely(sg_policy->need_freq_update)) {
>> > sg_policy->need_freq_update = false;
>> > - /*
>> > - * This happens when limits change, so forget the previous
>> > - * next_freq value and force an update.
>> > - */
>> > - sg_policy->next_freq = UINT_MAX;
>> > return true;
>> > }
>> >
>> > @@ -129,7 +120,6 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
>> > policy->cur = next_freq;
>> > trace_cpu_frequency(next_freq, smp_processor_id());
>> > } else {
>> > - sg_policy->work_in_progress = true;
>> > irq_work_queue(&sg_policy->irq_work);
>>
>> Isn't this potentially introducing unneeded irq pressure (and doing the
>> whole wakeup the kthread thing), while the already active kthread could
>> simply handle multiple back-to-back requests before going to sleep?
>
> How about this? Will use the latest request, and also doesn't do unnecessary
> irq_work_queue:
>
> (untested)
> -----8<--------
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index d2c6083304b4..6a3e42b01f52 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -38,7 +38,7 @@ struct sugov_policy {
> struct mutex work_lock;
> struct kthread_worker worker;
> struct task_struct *thread;
> - bool work_in_progress;
> + bool work_in_progress; /* Has kthread been kicked */
>
> bool need_freq_update;
> };
> @@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> return false;
>
> - if (sg_policy->work_in_progress)
> - return false;
> -

Why this change?

Doing the below is rather pointless if work_in_progress is set, isn't it?

You'll drop the results of it on the floor going forward anyway then AFAICS.

> if (unlikely(sg_policy->need_freq_update)) {
> sg_policy->need_freq_update = false;
> /*
> @@ -129,8 +126,11 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> policy->cur = next_freq;
> trace_cpu_frequency(next_freq, smp_processor_id());
> } else {
> - sg_policy->work_in_progress = true;
> - irq_work_queue(&sg_policy->irq_work);
> + /* work_in_progress helps us not queue unnecessarily */
> + if (!sg_policy->work_in_progress) {
> + sg_policy->work_in_progress = true;
> + irq_work_queue(&sg_policy->irq_work);
> + }
> }
> }
>
> @@ -381,13 +381,23 @@ sugov_update_shared(struct update_util_data *hook, u64 time, unsigned int flags)
> static void sugov_work(struct kthread_work *work)
> {
> struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
> + unsigned int freq;
> +
> + /*
> + * Hold sg_policy->update_lock just enough to handle the case where:
> + * if sg_policy->next_freq is updated before work_in_progress is set to
> + * false, we may miss queueing the new update request since
> + * work_in_progress would appear to be true.
> + */
> + raw_spin_lock(&sg_policy->update_lock);
> + freq = sg_policy->next_freq;
> + sg_policy->work_in_progress = false;
> + raw_spin_unlock(&sg_policy->update_lock);
>
> mutex_lock(&sg_policy->work_lock);
> - __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
> + __cpufreq_driver_target(sg_policy->policy, freq,
> CPUFREQ_RELATION_L);
> mutex_unlock(&sg_policy->work_lock);
> -
> - sg_policy->work_in_progress = false;
> }
>
> static void sugov_irq_work(struct irq_work *irq_work)

2018-05-09 08:42:38

by Rafael J. Wysocki

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 9, 2018 at 10:22 AM, Joel Fernandes <[email protected]> wrote:
> On Wed, May 09, 2018 at 10:05:09AM +0200, Rafael J. Wysocki wrote:
>> On Wed, May 9, 2018 at 9:01 AM, Joel Fernandes <[email protected]> wrote:
>> > On Wed, May 09, 2018 at 12:24:49PM +0530, Viresh Kumar wrote:
>> >> On 09-05-18, 08:45, Juri Lelli wrote:
>> >> > On 08/05/18 21:54, Joel Fernandes wrote:
>> >> > Isn't this potentially introducing unneeded irq pressure (and doing the
>> >> > whole wakeup the kthread thing), while the already active kthread could
>> >> > simply handle multiple back-to-back requests before going to sleep?
>> >>
>> >> And then we may need more instances of the work item and need to store
>> >> a different value of next_freq with each work item, as we can't use
>> >> the common one anymore as there would be races around accessing it ?
>> >
>> > Exactly. I think it also doesn't make sense to over write an already
>> > committed request either so better to store them separate (?). After the
>> > "commit", that previous request is done..
>>
>> Why is it?
>>
>> In the non-fast-switch case the "commit" only means queuing up an
>> irq_work. Which BTW is one of the reasons for having work_in_progress
>> even if your kthread can handle multiple work items in one go.
>
> Ok I agree. I just thought there was something funky with the meaning of
> commit from a cpufreq perspective.
>
> In the last diff I just sent out, I actually keep work_in_progress and
> consider its meaning to be what you're saying (has the kthread been kicked)
> and lets such "overwriting" of the next frequency to be made possible. Also
> with that we would be servicing just the latest request even if there were
> multiple ones made.

My understanding of this is that when the kthread actually starts
changing the frequency, it can't really roll back (at least not in
general), but there may be multiple following requests while the
frequency is being changed. In that case the most recent of the new
requests is the only one that matters. So IMO the kthread should make
a local copy of the most recent request to start with, process it and
let the irq_work update the most recent request data as new requests
come in. When done with the previous frequency change, the kthread
can make a local copy of the most recent request again and so on.

This should work, because there is only one irq_work updating the most
recent request data.
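
In code, the pattern being described might look roughly like the
following hypothetical (untested) sketch, which is essentially what
Joel's last diff does:

	static void sugov_work(struct kthread_work *work)
	{
		struct sugov_policy *sg_policy =
			container_of(work, struct sugov_policy, work);
		unsigned int freq;

		/* snapshot the most recent request; the irq_work side may
		 * keep updating sg_policy->next_freq while we process it */
		raw_spin_lock(&sg_policy->update_lock);
		freq = sg_policy->next_freq;
		sg_policy->work_in_progress = false;
		raw_spin_unlock(&sg_policy->update_lock);

		mutex_lock(&sg_policy->work_lock);
		__cpufreq_driver_target(sg_policy->policy, freq,
					CPUFREQ_RELATION_L);
		mutex_unlock(&sg_policy->work_lock);
	}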

2018-05-09 08:43:43

by Juri Lelli

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On 09/05/18 10:25, Rafael J. Wysocki wrote:
> On Wed, May 9, 2018 at 10:23 AM, Juri Lelli <[email protected]> wrote:
> > On 09/05/18 10:05, Rafael J. Wysocki wrote:
> >> On Wed, May 9, 2018 at 9:01 AM, Joel Fernandes <[email protected]> wrote:
> >> > On Wed, May 09, 2018 at 12:24:49PM +0530, Viresh Kumar wrote:
> >> >> On 09-05-18, 08:45, Juri Lelli wrote:
> >> >> > On 08/05/18 21:54, Joel Fernandes wrote:
> >> >> > Isn't this potentially introducing unneeded irq pressure (and doing the
> >> >> > whole wakeup the kthread thing), while the already active kthread could
> >> >> > simply handle multiple back-to-back requests before going to sleep?
> >> >>
> >> >> And then we may need more instances of the work item and need to store
> >> >> a different value of next_freq with each work item, as we can't use
> >> >> the common one anymore as there would be races around accessing it ?
> >> >
> >> > Exactly. I think it also doesn't make sense to over write an already
> >> > committed request either so better to store them separate (?). After the
> >> > "commit", that previous request is done..
> >>
> >> Why is it?
> >>
> >> In the non-fast-switch case the "commit" only means queuing up an
> >> irq_work. Which BTW is one of the reasons for having work_in_progress
> >> even if your kthread can handle multiple work items in one go.
> >>
> >> You may try to clear work_in_progress in sugov_irq_work() instead of
> >> in sugov_work(), though.
> >>
> >> BTW, I'm not sure if the comment in sugov_irq_work() still applies. Juri?
> >
> > It doesn't anymore. sugov kthreads are now being "ignored". Should have
> > remove it with the DL set of changes, sorry about that.
>
> No worries, you can still do that. ;-)

Indeed! Done. :)

2018-05-09 08:45:57

by Viresh Kumar

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On 09-05-18, 10:30, Rafael J. Wysocki wrote:
> On Wed, May 9, 2018 at 10:06 AM, Joel Fernandes <[email protected]> wrote:
> > How about this? Will use the latest request, and also doesn't do unnecessary
> > irq_work_queue:

I almost wrote the same stuff before I went for lunch :)

> > (untested)
> > -----8<--------
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index d2c6083304b4..6a3e42b01f52 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -38,7 +38,7 @@ struct sugov_policy {
> > struct mutex work_lock;
> > struct kthread_worker worker;
> > struct task_struct *thread;
> > - bool work_in_progress;
> > + bool work_in_progress; /* Has kthread been kicked */
> >
> > bool need_freq_update;
> > };
> > @@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> > return false;
> >
> > - if (sg_policy->work_in_progress)
> > - return false;
> > -
>
> Why this change?
>
> Doing the below is rather pointless if work_in_progress is set, isn't it?
>
> You'll drop the results of it on the floor going forward anyway then AFAICS.
>
> > if (unlikely(sg_policy->need_freq_update)) {
> > sg_policy->need_freq_update = false;
> > /*
> > @@ -129,8 +126,11 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> > policy->cur = next_freq;
> > trace_cpu_frequency(next_freq, smp_processor_id());
> > } else {
> > - sg_policy->work_in_progress = true;
> > - irq_work_queue(&sg_policy->irq_work);
> > + /* work_in_progress helps us not queue unnecessarily */
> > + if (!sg_policy->work_in_progress) {
> > + sg_policy->work_in_progress = true;
> > + irq_work_queue(&sg_policy->irq_work);
> > + }
> > }
> > }

Right, none of the above changes are required now.

> > @@ -381,13 +381,23 @@ sugov_update_shared(struct update_util_data *hook, u64 time, unsigned int flags)
> > static void sugov_work(struct kthread_work *work)
> > {
> > struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
> > + unsigned int freq;
> > +
> > + /*
> > + * Hold sg_policy->update_lock just enough to handle the case where:
> > + * if sg_policy->next_freq is updated before work_in_progress is set to
> > + * false, we may miss queueing the new update request since
> > + * work_in_progress would appear to be true.
> > + */
> > + raw_spin_lock(&sg_policy->update_lock);
> > + freq = sg_policy->next_freq;
> > + sg_policy->work_in_progress = false;
> > + raw_spin_unlock(&sg_policy->update_lock);

One problem we still have is that sg_policy->update_lock is only used
in the shared policy case and not in the single CPU per policy case,
so the race isn't solved there yet.

> > mutex_lock(&sg_policy->work_lock);
> > - __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
> > + __cpufreq_driver_target(sg_policy->policy, freq,
> > CPUFREQ_RELATION_L);
> > mutex_unlock(&sg_policy->work_lock);
> > -
> > - sg_policy->work_in_progress = false;
> > }
> >
> > static void sugov_irq_work(struct irq_work *irq_work)

--
viresh

2018-05-09 08:51:57

by Joel Fernandes

Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 09, 2018 at 10:30:37AM +0200, Rafael J. Wysocki wrote:
> On Wed, May 9, 2018 at 10:06 AM, Joel Fernandes <[email protected]> wrote:
> > On Wed, May 09, 2018 at 08:45:30AM +0200, Juri Lelli wrote:
> >> On 08/05/18 21:54, Joel Fernandes wrote:
> >>
> >> [...]
> >>
> >> > Just for discussion sake, is there any need for work_in_progress? If we can
> >> > queue multiple work say kthread_queue_work can handle it, then just queuing
> >> > works whenever they are available should be Ok and the kthread loop can
> >> > handle them. __cpufreq_driver_target is also protected by the work lock if
> >> > there is any concern that can have races... only thing is rate-limiting of
> >> > the requests, but we are doing a rate limiting, just not for the "DL
> >> > increased utilization" type requests (which I don't think we are doing at the
> >> > moment for urgent DL requests anyway).
> >> >
> >> > Following is an untested diff to show the idea. What do you think?
> >> >
> >> > thanks,
> >> >
> >> > - Joel
> >> >
> >> > ----8<---
> >> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> >> > index d2c6083304b4..862634ff4bf3 100644
> >> > --- a/kernel/sched/cpufreq_schedutil.c
> >> > +++ b/kernel/sched/cpufreq_schedutil.c
> >> > @@ -38,7 +38,6 @@ struct sugov_policy {
> >> > struct mutex work_lock;
> >> > struct kthread_worker worker;
> >> > struct task_struct *thread;
> >> > - bool work_in_progress;
> >> >
> >> > bool need_freq_update;
> >> > };
> >> > @@ -92,16 +91,8 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> >> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> >> > return false;
> >> >
> >> > - if (sg_policy->work_in_progress)
> >> > - return false;
> >> > -
> >> > if (unlikely(sg_policy->need_freq_update)) {
> >> > sg_policy->need_freq_update = false;
> >> > - /*
> >> > - * This happens when limits change, so forget the previous
> >> > - * next_freq value and force an update.
> >> > - */
> >> > - sg_policy->next_freq = UINT_MAX;
> >> > return true;
> >> > }
> >> >
> >> > @@ -129,7 +120,6 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> >> > policy->cur = next_freq;
> >> > trace_cpu_frequency(next_freq, smp_processor_id());
> >> > } else {
> >> > - sg_policy->work_in_progress = true;
> >> > irq_work_queue(&sg_policy->irq_work);
> >>
> >> Isn't this potentially introducing unneeded irq pressure (and doing the
> >> whole wakeup the kthread thing), while the already active kthread could
> >> simply handle multiple back-to-back requests before going to sleep?
> >
> > How about this? Will use the latest request, and also doesn't do unnecessary
> > irq_work_queue:
> >
> > (untested)
> > -----8<--------
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index d2c6083304b4..6a3e42b01f52 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -38,7 +38,7 @@ struct sugov_policy {
> > struct mutex work_lock;
> > struct kthread_worker worker;
> > struct task_struct *thread;
> > - bool work_in_progress;
> > + bool work_in_progress; /* Has kthread been kicked */
> >
> > bool need_freq_update;
> > };
> > @@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> > return false;
> >
> > - if (sg_policy->work_in_progress)
> > - return false;
> > -
>
> Why this change?
>
> Doing the below is rather pointless if work_in_progress is set, isn't it?

The issue being discussed is that if a work item is already in progress,
then new frequency updates will be dropped. So even if DL increased its
utilization, nothing will happen, because with work_in_progress = true and
need_freq_update = true we would still skip the update. In this diff, I am
allowing frequency requests to go through while work_in_progress is true.
In the end the latest update will be picked.
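
To be clear, the bail-out ordering in today's sugov_should_update_freq()
is (trimmed):

	if (sg_policy->work_in_progress)
		return false;

	if (unlikely(sg_policy->need_freq_update)) {
		...
		return true;
	}

so the work_in_progress check wins and the urgent update is lost.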

>
> You'll drop the results of it on the floor going forward anyway then AFAICS.

Why? If sg_policy->need_freq_update = true, sugov_should_update_freq() will
return true.

thanks,

- Joel


>
> > if (unlikely(sg_policy->need_freq_update)) {
> > sg_policy->need_freq_update = false;
> > /*
> > @@ -129,8 +126,11 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> > policy->cur = next_freq;
> > trace_cpu_frequency(next_freq, smp_processor_id());
> > } else {
> > - sg_policy->work_in_progress = true;
> > - irq_work_queue(&sg_policy->irq_work);
> > + /* work_in_progress helps us not queue unnecessarily */
> > + if (!sg_policy->work_in_progress) {
> > + sg_policy->work_in_progress = true;
> > + irq_work_queue(&sg_policy->irq_work);
> > + }
> > }
> > }
> >
> > @@ -381,13 +381,23 @@ sugov_update_shared(struct update_util_data *hook, u64 time, unsigned int flags)
> > static void sugov_work(struct kthread_work *work)
> > {
> > struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
> > + unsigned int freq;
> > +
> > + /*
> > + * Hold sg_policy->update_lock just enough to handle the case where:
> > + * if sg_policy->next_freq is updated before work_in_progress is set to
> > + * false, we may miss queueing the new update request since
> > + * work_in_progress would appear to be true.
> > + */
> > + raw_spin_lock(&sg_policy->update_lock);
> > + freq = sg_policy->next_freq;
> > + sg_policy->work_in_progress = false;
> > + raw_spin_unlock(&sg_policy->update_lock);
> >
> > mutex_lock(&sg_policy->work_lock);
> > - __cpufreq_driver_target(sg_policy->policy, sg_policy->next_freq,
> > + __cpufreq_driver_target(sg_policy->policy, freq,
> > CPUFREQ_RELATION_L);
> > mutex_unlock(&sg_policy->work_lock);
> > -
> > - sg_policy->work_in_progress = false;
> > }
> >
> > static void sugov_irq_work(struct irq_work *irq_work)

2018-05-09 09:04:13

by Joel Fernandes

[permalink] [raw]
Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 09, 2018 at 02:10:01PM +0530, Viresh Kumar wrote:
> On 09-05-18, 10:30, Rafael J. Wysocki wrote:
> > On Wed, May 9, 2018 at 10:06 AM, Joel Fernandes <[email protected]> wrote:
> > > How about this? Will use the latest request, and also doesn't do unnecessary
> > > irq_work_queue:
>
> I almost wrote the same stuff before I went for lunch :)

Oh :)

> > > (untested)
> > > -----8<--------
> > > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > > index d2c6083304b4..6a3e42b01f52 100644
> > > --- a/kernel/sched/cpufreq_schedutil.c
> > > +++ b/kernel/sched/cpufreq_schedutil.c
> > > @@ -38,7 +38,7 @@ struct sugov_policy {
> > > struct mutex work_lock;
> > > struct kthread_worker worker;
> > > struct task_struct *thread;
> > > - bool work_in_progress;
> > > + bool work_in_progress; /* Has kthread been kicked */
> > >
> > > bool need_freq_update;
> > > };
> > > @@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> > > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> > > return false;
> > >
> > > - if (sg_policy->work_in_progress)
> > > - return false;
> > > -
> >
> > Why this change?
> >
> > Doing the below is rather pointless if work_in_progress is set, isn't it?
> >
> > You'll drop the results of it on the floor going forward anyway then AFAICS.
> >
> > > if (unlikely(sg_policy->need_freq_update)) {
> > > sg_policy->need_freq_update = false;
> > > /*
> > > @@ -129,8 +126,11 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> > > policy->cur = next_freq;
> > > trace_cpu_frequency(next_freq, smp_processor_id());
> > > } else {
> > > - sg_policy->work_in_progress = true;
> > > - irq_work_queue(&sg_policy->irq_work);
> > > + /* work_in_progress helps us not queue unnecessarily */
> > > + if (!sg_policy->work_in_progress) {
> > > + sg_policy->work_in_progress = true;
> > > + irq_work_queue(&sg_policy->irq_work);
> > > + }
> > > }
> > > }
>
> Right, none of the above changes are required now.

I didn't follow what you mean by the changes not being required. I was
developing against Linus's mainline. Also, I replied to Rafael's comment
in the other thread.

>
> > > @@ -381,13 +381,23 @@ sugov_update_shared(struct update_util_data *hook, u64 time, unsigned int flags)
> > > static void sugov_work(struct kthread_work *work)
> > > {
> > > struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
> > > + unsigned int freq;
> > > +
> > > + /*
> > > + * Hold sg_policy->update_lock just enough to handle the case where:
> > > + * if sg_policy->next_freq is updated before work_in_progress is set to
> > > + * false, we may miss queueing the new update request since
> > > + * work_in_progress would appear to be true.
> > > + */
> > > + raw_spin_lock(&sg_policy->update_lock);
> > > + freq = sg_policy->next_freq;
> > > + sg_policy->work_in_progress = false;
> > > + raw_spin_unlock(&sg_policy->update_lock);
>
> One problem we still have is that sg_policy->update_lock is only used
> in the shared policy case and not in the single CPU per policy case,
> so the race isn't solved there yet.

True... I can make the single-CPU case acquire the update_lock very briefly
around the sugov_update_commit call in sugov_update_single.

Also, I think the lock acquisition in sugov_work, which runs in kthread
context, should use raw_spin_lock_irqsave...
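
Something like this perhaps (untested, same shape as the diff above):

static void sugov_work(struct kthread_work *work)
{
	struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
	unsigned int freq;
	unsigned long flags;

	/*
	 * update_lock is also taken from the scheduler's util update
	 * callbacks, which may run in interrupt context, so the kthread
	 * has to disable interrupts while holding it.
	 */
	raw_spin_lock_irqsave(&sg_policy->update_lock, flags);
	freq = sg_policy->next_freq;
	sg_policy->work_in_progress = false;
	raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);

	mutex_lock(&sg_policy->work_lock);
	__cpufreq_driver_target(sg_policy->policy, freq, CPUFREQ_RELATION_L);
	mutex_unlock(&sg_policy->work_lock);
}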

thanks,

- Joel


2018-05-09 09:08:35

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 9, 2018 at 10:51 AM, Joel Fernandes <[email protected]> wrote:
> On Wed, May 09, 2018 at 10:30:37AM +0200, Rafael J. Wysocki wrote:
>> On Wed, May 9, 2018 at 10:06 AM, Joel Fernandes <[email protected]> wrote:
>> > On Wed, May 09, 2018 at 08:45:30AM +0200, Juri Lelli wrote:
>> >> On 08/05/18 21:54, Joel Fernandes wrote:
>> >>
>> >> [...]
>> >>
>> >> > Just for discussion sake, is there any need for work_in_progress? If we can
>> >> > queue multiple work say kthread_queue_work can handle it, then just queuing
>> >> > works whenever they are available should be Ok and the kthread loop can
>> >> > handle them. __cpufreq_driver_target is also protected by the work lock if
>> >> > there is any concern that can have races... only thing is rate-limiting of
>> >> > the requests, but we are doing a rate limiting, just not for the "DL
>> >> > increased utilization" type requests (which I don't think we are doing at the
>> >> > moment for urgent DL requests anyway).
>> >> >
>> >> > Following is an untested diff to show the idea. What do you think?
>> >> >
>> >> > thanks,
>> >> >
>> >> > - Joel
>> >> >
>> >> > ----8<---
>> >> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> >> > index d2c6083304b4..862634ff4bf3 100644
>> >> > --- a/kernel/sched/cpufreq_schedutil.c
>> >> > +++ b/kernel/sched/cpufreq_schedutil.c
>> >> > @@ -38,7 +38,6 @@ struct sugov_policy {
>> >> > struct mutex work_lock;
>> >> > struct kthread_worker worker;
>> >> > struct task_struct *thread;
>> >> > - bool work_in_progress;
>> >> >
>> >> > bool need_freq_update;
>> >> > };
>> >> > @@ -92,16 +91,8 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>> >> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
>> >> > return false;
>> >> >
>> >> > - if (sg_policy->work_in_progress)
>> >> > - return false;
>> >> > -
>> >> > if (unlikely(sg_policy->need_freq_update)) {
>> >> > sg_policy->need_freq_update = false;
>> >> > - /*
>> >> > - * This happens when limits change, so forget the previous
>> >> > - * next_freq value and force an update.
>> >> > - */
>> >> > - sg_policy->next_freq = UINT_MAX;
>> >> > return true;
>> >> > }
>> >> >
>> >> > @@ -129,7 +120,6 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
>> >> > policy->cur = next_freq;
>> >> > trace_cpu_frequency(next_freq, smp_processor_id());
>> >> > } else {
>> >> > - sg_policy->work_in_progress = true;
>> >> > irq_work_queue(&sg_policy->irq_work);
>> >>
>> >> Isn't this potentially introducing unneeded irq pressure (and doing the
>> >> whole wakeup the kthread thing), while the already active kthread could
>> >> simply handle multiple back-to-back requests before going to sleep?
>> >
>> > How about this? Will use the latest request, and also doesn't do unnecessary
>> > irq_work_queue:
>> >
>> > (untested)
>> > -----8<--------
>> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> > index d2c6083304b4..6a3e42b01f52 100644
>> > --- a/kernel/sched/cpufreq_schedutil.c
>> > +++ b/kernel/sched/cpufreq_schedutil.c
>> > @@ -38,7 +38,7 @@ struct sugov_policy {
>> > struct mutex work_lock;
>> > struct kthread_worker worker;
>> > struct task_struct *thread;
>> > - bool work_in_progress;
>> > + bool work_in_progress; /* Has kthread been kicked */
>> >
>> > bool need_freq_update;
>> > };
>> > @@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
>> > return false;
>> >
>> > - if (sg_policy->work_in_progress)
>> > - return false;
>> > -
>>
>> Why this change?
>>
>> Doing the below is rather pointless if work_in_progress is set, isn't it?
>
> The issue being discussed is that if a work item is already in progress,
> then new frequency updates will be dropped. So even if DL increased its
> utilization, nothing will happen, because with work_in_progress = true and
> need_freq_update = true we would still skip the update. In this diff, I am
> allowing frequency requests to go through while work_in_progress is true.
> In the end the latest update will be picked.

I'm not sure if taking new requests with the irq_work in flight is a good idea.

>>
>> You'll drop the results of it on the floor going forward anyway then AFAICS.
>
> Why?

Because you cannot queue up a new irq_work before the previous one is complete?

2018-05-09 09:33:56

by Viresh Kumar

[permalink] [raw]
Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On 09-05-18, 02:02, Joel Fernandes wrote:
> On Wed, May 09, 2018 at 02:10:01PM +0530, Viresh Kumar wrote:
> > Right, none of the above changes are required now.
>
> I didn't follow what you mean by the changes not being required. I was
> developing against Linus's mainline. Also, I replied to Rafael's comment
> in the other thread.

At least for the shared policy case, the entire sequence of
sugov_should_update_freq() followed by sugov_update_commit() is
executed from within a spinlock-protected region, and you are using
the same lock below. So either the above two routines or the kthread
routine below will execute at any given point in time.

So if the kthread has started doing the update and has acquired the
lock, the util update handler will wait until work_in_progress is set
to false; that's not the problem we are trying to solve here.

And if the kthread hasn't acquired the lock yet and the util handler
has started executing sugov_should_update_freq() ....

And ^^^ this is where I realized that your earlier change is actually
required, so that we accumulate the latest updated next_freq value.

And with all that we wouldn't require a while loop in the kthread
code.
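
IOW, for the slow-switch path the worst-case interleaving would be
something like this (simplified, shared policy case):

    util update handler                kthread (sugov_work)
    -------------------                --------------------
    raw_spin_lock(update_lock)
    next_freq = X
    work_in_progress already true,
      so no new irq_work is queued
    raw_spin_unlock(update_lock)
                                       raw_spin_lock(update_lock)
                                       freq = next_freq   /* latest X */
                                       work_in_progress = false
                                       raw_spin_unlock(update_lock)
                                       __cpufreq_driver_target(freq, ...)

so a single pass of the kthread already picks up the latest value.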

> > > > @@ -381,13 +381,23 @@ sugov_update_shared(struct update_util_data *hook, u64 time, unsigned int flags)
> > > > static void sugov_work(struct kthread_work *work)
> > > > {
> > > > struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
> > > > + unsigned int freq;
> > > > +
> > > > + /*
> > > > + * Hold sg_policy->update_lock just enough to handle the case where:
> > > > + * if sg_policy->next_freq is updated before work_in_progress is set to
> > > > + * false, we may miss queueing the new update request since
> > > > + * work_in_progress would appear to be true.
> > > > + */
> > > > + raw_spin_lock(&sg_policy->update_lock);
> > > > + freq = sg_policy->next_freq;
> > > > + sg_policy->work_in_progress = false;
> > > > + raw_spin_unlock(&sg_policy->update_lock);
> >
> > One problem we still have is that sg_policy->update_lock is only used
> > in the shared policy case and not in the single CPU per policy case,
> > so the race isn't solved there yet.
>
> True... I can make the single-CPU case acquire the update_lock very briefly
> around the sugov_update_commit call in sugov_update_single.

Rafael was very clear from the beginning that he wouldn't allow a spin
lock in the un-shared policy case :)

--
viresh

2018-05-09 09:42:12

by Joel Fernandes

[permalink] [raw]
Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 09, 2018 at 11:06:24AM +0200, Rafael J. Wysocki wrote:
> On Wed, May 9, 2018 at 10:51 AM, Joel Fernandes <[email protected]> wrote:
> > On Wed, May 09, 2018 at 10:30:37AM +0200, Rafael J. Wysocki wrote:
> >> On Wed, May 9, 2018 at 10:06 AM, Joel Fernandes <[email protected]> wrote:
> >> > On Wed, May 09, 2018 at 08:45:30AM +0200, Juri Lelli wrote:
> >> >> On 08/05/18 21:54, Joel Fernandes wrote:
> >> >>
> >> >> [...]
> >> >>
> >> >> > Just for discussion sake, is there any need for work_in_progress? If we can
> >> >> > queue multiple work say kthread_queue_work can handle it, then just queuing
> >> >> > works whenever they are available should be Ok and the kthread loop can
> >> >> > handle them. __cpufreq_driver_target is also protected by the work lock if
> >> >> > there is any concern that can have races... only thing is rate-limiting of
> >> >> > the requests, but we are doing a rate limiting, just not for the "DL
> >> >> > increased utilization" type requests (which I don't think we are doing at the
> >> >> > moment for urgent DL requests anyway).
> >> >> >
> >> >> > Following is an untested diff to show the idea. What do you think?
> >> >> >
> >> >> > thanks,
> >> >> >
> >> >> > - Joel
> >> >> >
> >> >> > ----8<---
> >> >> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> >> >> > index d2c6083304b4..862634ff4bf3 100644
> >> >> > --- a/kernel/sched/cpufreq_schedutil.c
> >> >> > +++ b/kernel/sched/cpufreq_schedutil.c
> >> >> > @@ -38,7 +38,6 @@ struct sugov_policy {
> >> >> > struct mutex work_lock;
> >> >> > struct kthread_worker worker;
> >> >> > struct task_struct *thread;
> >> >> > - bool work_in_progress;
> >> >> >
> >> >> > bool need_freq_update;
> >> >> > };
> >> >> > @@ -92,16 +91,8 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> >> >> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> >> >> > return false;
> >> >> >
> >> >> > - if (sg_policy->work_in_progress)
> >> >> > - return false;
> >> >> > -
> >> >> > if (unlikely(sg_policy->need_freq_update)) {
> >> >> > sg_policy->need_freq_update = false;
> >> >> > - /*
> >> >> > - * This happens when limits change, so forget the previous
> >> >> > - * next_freq value and force an update.
> >> >> > - */
> >> >> > - sg_policy->next_freq = UINT_MAX;
> >> >> > return true;
> >> >> > }
> >> >> >
> >> >> > @@ -129,7 +120,6 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
> >> >> > policy->cur = next_freq;
> >> >> > trace_cpu_frequency(next_freq, smp_processor_id());
> >> >> > } else {
> >> >> > - sg_policy->work_in_progress = true;
> >> >> > irq_work_queue(&sg_policy->irq_work);
> >> >>
> >> >> Isn't this potentially introducing unneeded irq pressure (and doing the
> >> >> whole wakeup the kthread thing), while the already active kthread could
> >> >> simply handle multiple back-to-back requests before going to sleep?
> >> >
> >> > How about this? Will use the latest request, and also doesn't do unnecessary
> >> > irq_work_queue:
> >> >
> >> > (untested)
> >> > -----8<--------
> >> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> >> > index d2c6083304b4..6a3e42b01f52 100644
> >> > --- a/kernel/sched/cpufreq_schedutil.c
> >> > +++ b/kernel/sched/cpufreq_schedutil.c
> >> > @@ -38,7 +38,7 @@ struct sugov_policy {
> >> > struct mutex work_lock;
> >> > struct kthread_worker worker;
> >> > struct task_struct *thread;
> >> > - bool work_in_progress;
> >> > + bool work_in_progress; /* Has kthread been kicked */
> >> >
> >> > bool need_freq_update;
> >> > };
> >> > @@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
> >> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
> >> > return false;
> >> >
> >> > - if (sg_policy->work_in_progress)
> >> > - return false;
> >> > -
> >>
> >> Why this change?
> >>
> >> Doing the below is rather pointless if work_in_progress is set, isn't it?
> >
> > The issue being discussed is that if a work item is already in progress,
> > then new frequency updates will be dropped. So even if DL increased its
> > utilization, nothing will happen, because with work_in_progress = true and
> > need_freq_update = true we would still skip the update. In this diff, I am
> > allowing frequency requests to go through while work_in_progress is true.
> > In the end the latest update will be picked.
>
> I'm not sure if taking new requests with the irq_work in flight is a good idea.

That's the point of the original $SUBJECT patch posted by Claudio :) There
you can see that when there is an urgent request, work_in_progress isn't
checked.

Also, I don't see why we cannot do this with the small tweak in my diff.
It solves a real problem seen with frequency updates done via the
slow-switch path, as we discussed at OSPM.

But let me know if I missed your point or something ;)

>
> >>
> >> You'll drop the results of it on the floor going forward anyway then AFAICS.
> >
> > Why?
>
> Because you cannot queue up a new irq_work before the previous one is complete?

We are not doing that. As you can see in my diff, I am not queuing an
irq_work if one was already queued. What we're allowing is an update to
next_freq. We still use work_in_progress, but don't use it to ban all
incoming update requests as done previously. Instead, we use
work_in_progress to make sure that we don't unnecessarily increase the
irq pressure and cause excessive wakeups (as Juri suggested).

I can clean it up and post it as a patch next week after some testing, in
case that's less confusing.
This week I'm actually on vacation and the diff was pure vacation hacking ;-)

thanks,

- Joel


2018-05-09 09:48:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 9, 2018 at 11:39 AM, Joel Fernandes <[email protected]> wrote:
> On Wed, May 09, 2018 at 11:06:24AM +0200, Rafael J. Wysocki wrote:
>> On Wed, May 9, 2018 at 10:51 AM, Joel Fernandes <[email protected]> wrote:
>> > On Wed, May 09, 2018 at 10:30:37AM +0200, Rafael J. Wysocki wrote:
>> >> On Wed, May 9, 2018 at 10:06 AM, Joel Fernandes <[email protected]> wrote:
>> >> > On Wed, May 09, 2018 at 08:45:30AM +0200, Juri Lelli wrote:
>> >> >> On 08/05/18 21:54, Joel Fernandes wrote:
>> >> >>
>> >> >> [...]
>> >> >>
>> >> >> > Just for discussion sake, is there any need for work_in_progress? If we can
>> >> >> > queue multiple work say kthread_queue_work can handle it, then just queuing
>> >> >> > works whenever they are available should be Ok and the kthread loop can
>> >> >> > handle them. __cpufreq_driver_target is also protected by the work lock if
>> >> >> > there is any concern that can have races... only thing is rate-limiting of
>> >> >> > the requests, but we are doing a rate limiting, just not for the "DL
>> >> >> > increased utilization" type requests (which I don't think we are doing at the
>> >> >> > moment for urgent DL requests anyway).
>> >> >> >
>> >> >> > Following is an untested diff to show the idea. What do you think?
>> >> >> >
>> >> >> > thanks,
>> >> >> >
>> >> >> > - Joel
>> >> >> >
>> >> >> > ----8<---
>> >> >> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> >> >> > index d2c6083304b4..862634ff4bf3 100644
>> >> >> > --- a/kernel/sched/cpufreq_schedutil.c
>> >> >> > +++ b/kernel/sched/cpufreq_schedutil.c
>> >> >> > @@ -38,7 +38,6 @@ struct sugov_policy {
>> >> >> > struct mutex work_lock;
>> >> >> > struct kthread_worker worker;
>> >> >> > struct task_struct *thread;
>> >> >> > - bool work_in_progress;
>> >> >> >
>> >> >> > bool need_freq_update;
>> >> >> > };
>> >> >> > @@ -92,16 +91,8 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>> >> >> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
>> >> >> > return false;
>> >> >> >
>> >> >> > - if (sg_policy->work_in_progress)
>> >> >> > - return false;
>> >> >> > -
>> >> >> > if (unlikely(sg_policy->need_freq_update)) {
>> >> >> > sg_policy->need_freq_update = false;
>> >> >> > - /*
>> >> >> > - * This happens when limits change, so forget the previous
>> >> >> > - * next_freq value and force an update.
>> >> >> > - */
>> >> >> > - sg_policy->next_freq = UINT_MAX;
>> >> >> > return true;
>> >> >> > }
>> >> >> >
>> >> >> > @@ -129,7 +120,6 @@ static void sugov_update_commit(struct sugov_policy *sg_policy, u64 time,
>> >> >> > policy->cur = next_freq;
>> >> >> > trace_cpu_frequency(next_freq, smp_processor_id());
>> >> >> > } else {
>> >> >> > - sg_policy->work_in_progress = true;
>> >> >> > irq_work_queue(&sg_policy->irq_work);
>> >> >>
>> >> >> Isn't this potentially introducing unneeded irq pressure (and doing the
>> >> >> whole wakeup the kthread thing), while the already active kthread could
>> >> >> simply handle multiple back-to-back requests before going to sleep?
>> >> >
>> >> > How about this? Will use the latest request, and also doesn't do unnecessary
>> >> > irq_work_queue:
>> >> >
>> >> > (untested)
>> >> > -----8<--------
>> >> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> >> > index d2c6083304b4..6a3e42b01f52 100644
>> >> > --- a/kernel/sched/cpufreq_schedutil.c
>> >> > +++ b/kernel/sched/cpufreq_schedutil.c
>> >> > @@ -38,7 +38,7 @@ struct sugov_policy {
>> >> > struct mutex work_lock;
>> >> > struct kthread_worker worker;
>> >> > struct task_struct *thread;
>> >> > - bool work_in_progress;
>> >> > + bool work_in_progress; /* Has kthread been kicked */
>> >> >
>> >> > bool need_freq_update;
>> >> > };
>> >> > @@ -92,9 +92,6 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
>> >> > !cpufreq_can_do_remote_dvfs(sg_policy->policy))
>> >> > return false;
>> >> >
>> >> > - if (sg_policy->work_in_progress)
>> >> > - return false;
>> >> > -
>> >>
>> >> Why this change?
>> >>
>> >> Doing the below is rather pointless if work_in_progress is set, isn't it?
>> >
>> > The issue being discussed is that if a work item is already in progress,
>> > then new frequency updates will be dropped. So even if DL increased its
>> > utilization, nothing will happen, because with work_in_progress = true and
>> > need_freq_update = true we would still skip the update. In this diff, I am
>> > allowing frequency requests to go through while work_in_progress is true.
>> > In the end the latest update will be picked.
>>
>> I'm not sure if taking new requests with the irq_work in flight is a good idea.
>
> That's the point of the original $SUBJECT patch posted by Claudio :) There
> you can see that when there is an urgent request, work_in_progress isn't
> checked.
>
> Also, I don't see why we cannot do this with the small tweak in my diff.
> It solves a real problem seen with frequency updates done via the
> slow-switch path, as we discussed at OSPM.

OK

> But let me know if I missed your point or something ;)
>
>>
>> >>
>> >> You'll drop the results of it on the floor going forward anyway then AFAICS.
>> >
>> > Why?
>>
>> Because you cannot queue up a new irq_work before the previous one is complete?
>
> We are not doing that. As you can see in my diff, I am not queuing an
> irq_work if one was already queued. What we're allowing is an update to
> next_freq. We still use work_in_progress, but don't use it to ban all
> incoming update requests as done previously. Instead, we use
> work_in_progress to make sure that we don't unnecessarily increase the
> irq pressure and cause excessive wakeups (as Juri suggested).
>
> I can clean it up and post it as a patch next week after some testing, in
> case that's less confusing.

Yeah, that would help. :-)

> This week I'm actually on vacation and the diff was pure vacation hacking ;-)

No worries.

Thanks,
Rafael

2018-05-09 10:35:15

by Joel Fernandes

[permalink] [raw]
Subject: Re: [RFC PATCH] sched/cpufreq/schedutil: handling urgent frequency requests

On Wed, May 09, 2018 at 02:58:23PM +0530, Viresh Kumar wrote:
> On 09-05-18, 02:02, Joel Fernandes wrote:
> > On Wed, May 09, 2018 at 02:10:01PM +0530, Viresh Kumar wrote:
> > > Right, none of the above changes are required now.
> >
> > I didn't follow what you mean by the changes not being required. I was
> > developing against Linus's mainline. Also, I replied to Rafael's comment
> > in the other thread.
>
> At least for the shared policy case, the entire sequence of
> sugov_should_update_freq() followed by sugov_update_commit() is
> executed from within a spinlock-protected region, and you are using
> the same lock below. So either the above two routines or the kthread
> routine below will execute at any given point in time.
>
> So if the kthread has started doing the update and has acquired the
> lock, the util update handler will wait until work_in_progress is set
> to false; that's not the problem we are trying to solve here.
>
> And if the kthread hasn't acquired the lock yet and the util handler
> has started executing sugov_should_update_freq() ....
>
> And ^^^ this is where I realized that your earlier change is actually
> required, so that we accumulate the latest updated next_freq value.
>
> And with all that we wouldn't require a while loop in the kthread
> code.

Oh yeah, totally. So I think we are on the same page now about that.

> > > > > @@ -381,13 +381,23 @@ sugov_update_shared(struct update_util_data *hook, u64 time, unsigned int flags)
> > > > > static void sugov_work(struct kthread_work *work)
> > > > > {
> > > > > struct sugov_policy *sg_policy = container_of(work, struct sugov_policy, work);
> > > > > + unsigned int freq;
> > > > > +
> > > > > + /*
> > > > > + * Hold sg_policy->update_lock just enough to handle the case where:
> > > > > + * if sg_policy->next_freq is updated before work_in_progress is set to
> > > > > + * false, we may miss queueing the new update request since
> > > > > + * work_in_progress would appear to be true.
> > > > > + */
> > > > > + raw_spin_lock(&sg_policy->update_lock);
> > > > > + freq = sg_policy->next_freq;
> > > > > + sg_policy->work_in_progress = false;
> > > > > + raw_spin_unlock(&sg_policy->update_lock);
> > >
> > > One problem we still have is that sg_policy->update_lock is only used
> > > in the shared policy case and not in the single CPU per policy case,
> > > so the race isn't solved there yet.
> >
> > True... I can make the single-CPU case acquire the update_lock very briefly
> > around the sugov_update_commit call in sugov_update_single.
>
> Rafael was very clear from the beginning that he wouldn't allow a spin
> lock in the un-shared policy case :)

That's fair. For now, we can probably just skip this trickery altogether
for the single case when work_in_progress is set. That way we still get
the benefit for the shared case, and the single case isn't changed from
what it is today.
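
i.e. keep the old bail-out just for the single-CPU path, something like
this early in sugov_update_single() (untested):

	/*
	 * No update_lock on this path, so don't touch next_freq while
	 * the kthread is mid-update; slow-switch single-policy requests
	 * simply keep being dropped, as they are today.
	 */
	if (sg_policy->work_in_progress)
		return;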

thanks,

- Joel