2018-05-08 07:34:18

by Dietmar Eggemann

[permalink] [raw]
Subject: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

This reverts commit e2cabe48c20efb174ce0c01190f8b9c5f3ea1d13.

Lifting the restriction that the sugov kthread is bound to the
policy->related_cpus for a system with a slow switching cpufreq driver,
which is able to perform DVFS from any cpu (e.g. cpufreq-dt), is not
only not beneficial it also harms Enery-Aware Scheduling (EAS) on
systems with asymmetric cpu capacities (e.g. Arm big.LITTLE).

The sugov kthread which does the update for the little cpus could
potentially run on a big cpu. It could prevent that the big cluster goes
into deeper idle states although all the tasks are running on the little
cluster.

Example: hikey960 w/ 4.16.0-rc6-+
Arm big.LITTLE with per-cluster DVFS

root@h960:~# cat /proc/cpuinfo | grep "^CPU part"
CPU part : 0xd03 (Cortex-A53, little cpu)
CPU part : 0xd03
CPU part : 0xd03
CPU part : 0xd03
CPU part : 0xd09 (Cortex-A73, big cpu)
CPU part : 0xd09
CPU part : 0xd09
CPU part : 0xd09

root@h960:/sys/devices/system/cpu/cpufreq# ls
policy0 policy4 schedutil

root@h960:/sys/devices/system/cpu/cpufreq# cat policy*/related_cpus
0 1 2 3
4 5 6 7

(1) w/o the revert:

root@h960:~# ps -eo pid,class,rtprio,pri,psr,comm | awk 'NR == 1 ||
/sugov/'
PID CLS RTPRIO PRI PSR COMMAND
1489 #6 0 140 1 sugov:0
1490 #6 0 140 0 sugov:4

The sugov kthread sugov:4 responsible for policy4 runs on cpu0. (In this
case both sugov kthreads run on little cpus).

cross policy (cluster) remote callback example:
...
migration/1-14 [001] enqueue_task_fair: this_cpu=1 cpu_of(rq)=5
migration/1-14 [001] sugov_update_shared: this_cpu=1 sg_cpu->cpu=5
sg_cpu->sg_policy->policy->related_cpus=4-7
sugov:4-1490 [000] sugov_work: this_cpu=0
sg_cpu->sg_policy->policy->related_cpus=4-7
...

The remote callback (this_cpu=1, target_cpu=5) is executed on cpu=0.

(2) w/ the revert:

root@h960:~# ps -eo pid,class,rtprio,pri,psr,comm | awk 'NR == 1 ||
/sugov/'
PID CLS RTPRIO PRI PSR COMMAND
1491 #6 0 140 2 sugov:0
1492 #6 0 140 4 sugov:4

The sugov kthread sugov:4 responsible for policy4 runs on cpu4.

cross policy (cluster) remote callback example:
...
migration/1-14 [001] enqueue_task_fair: this_cpu=1 cpu_of(rq)=7
migration/1-14 [001] sugov_update_shared: this_cpu=1 sg_cpu->cpu=7
sg_cpu->sg_policy->policy->related_cpus=4-7
sugov:4-1492 [004] sugov_work: this_cpu=4
sg_cpu->sg_policy->policy->related_cpus=4-7
...

The remote callback (this_cpu=1, target_cpu=7) is executed on cpu=4.

Now the sugov kthread executes again on the policy (cluster) for which
the Operating Performance Point (OPP) should be changed.
It avoids the problem that an otherwise idle policy (cluster) is running
schedutil (the sugov kthread) for another one.

Cc: Viresh Kumar <[email protected]>
Cc: Rafael J. Wysocki <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Pavan Kondeti <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Patrick Bellasi <[email protected]>
Cc: Quentin Perret <[email protected]>
Signed-off-by: Dietmar Eggemann <[email protected]>
---
kernel/sched/cpufreq_schedutil.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index d2c6083304b4..63014cff76a5 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -523,11 +523,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
}

sg_policy->thread = thread;
-
- /* Kthread is bound to all CPUs by default */
- if (!policy->dvfs_possible_from_any_cpu)
- kthread_bind_mask(thread, policy->related_cpus);
-
+ kthread_bind_mask(thread, policy->related_cpus);
init_irq_work(&sg_policy->irq_work, sugov_irq_work);
mutex_init(&sg_policy->work_lock);

--
2.11.0



2018-05-08 08:23:59

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 08-05-18, 08:33, Dietmar Eggemann wrote:
> This reverts commit e2cabe48c20efb174ce0c01190f8b9c5f3ea1d13.
>
> Lifting the restriction that the sugov kthread is bound to the
> policy->related_cpus for a system with a slow switching cpufreq driver,
> which is able to perform DVFS from any cpu (e.g. cpufreq-dt), is not
> only not beneficial it also harms Enery-Aware Scheduling (EAS) on
> systems with asymmetric cpu capacities (e.g. Arm big.LITTLE).
>
> The sugov kthread which does the update for the little cpus could
> potentially run on a big cpu. It could prevent that the big cluster goes
> into deeper idle states although all the tasks are running on the little
> cluster.

I think the original patch did the right thing, but that doesn't suit
everybody as you explained.

I wouldn't really revert the patch but fix my platform's cpufreq
driver to set dvfs_possible_from_any_cpu = false, so that other
platforms can still benefit from the original commit.

--
viresh

2018-05-08 09:10:59

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 05/08/2018 10:22 AM, Viresh Kumar wrote:
> On 08-05-18, 08:33, Dietmar Eggemann wrote:
>> This reverts commit e2cabe48c20efb174ce0c01190f8b9c5f3ea1d13.
>>
>> Lifting the restriction that the sugov kthread is bound to the
>> policy->related_cpus for a system with a slow switching cpufreq driver,
>> which is able to perform DVFS from any cpu (e.g. cpufreq-dt), is not
>> only not beneficial it also harms Enery-Aware Scheduling (EAS) on
>> systems with asymmetric cpu capacities (e.g. Arm big.LITTLE).
>>
>> The sugov kthread which does the update for the little cpus could
>> potentially run on a big cpu. It could prevent that the big cluster goes
>> into deeper idle states although all the tasks are running on the little
>> cluster.
>
> I think the original patch did the right thing, but that doesn't suit
> everybody as you explained.
>
> I wouldn't really revert the patch but fix my platform's cpufreq
> driver to set dvfs_possible_from_any_cpu = false, so that other
> platforms can still benefit from the original commit.

This would make sure that the kthreads are bound to the correct set of
cpus for platforms with those cpufreq drivers (cpufreq-dt (h960),
scmi-cpufreq, scpi-cpufreq) but it will also change the logic (e.g.
sugov_should_update_freq() -> cpufreq_can_do_remote_dvfs()).

I'm still struggling to understand when a driver/platform should set
dvfs_possible_from_any_cpu to true and what the actual benefit would be.


2018-05-08 09:43:27

by Quentin Perret

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On Tuesday 08 May 2018 at 11:09:57 (+0200), Dietmar Eggemann wrote:
> On 05/08/2018 10:22 AM, Viresh Kumar wrote:
> > On 08-05-18, 08:33, Dietmar Eggemann wrote:
> > > This reverts commit e2cabe48c20efb174ce0c01190f8b9c5f3ea1d13.
> > >
> > > Lifting the restriction that the sugov kthread is bound to the
> > > policy->related_cpus for a system with a slow switching cpufreq driver,
> > > which is able to perform DVFS from any cpu (e.g. cpufreq-dt), is not
> > > only not beneficial it also harms Enery-Aware Scheduling (EAS) on
> > > systems with asymmetric cpu capacities (e.g. Arm big.LITTLE).
> > >
> > > The sugov kthread which does the update for the little cpus could
> > > potentially run on a big cpu. It could prevent that the big cluster goes
> > > into deeper idle states although all the tasks are running on the little
> > > cluster.
> >
> > I think the original patch did the right thing, but that doesn't suit
> > everybody as you explained.
> >
> > I wouldn't really revert the patch but fix my platform's cpufreq
> > driver to set dvfs_possible_from_any_cpu = false, so that other
> > platforms can still benefit from the original commit.
>
> This would make sure that the kthreads are bound to the correct set of cpus
> for platforms with those cpufreq drivers (cpufreq-dt (h960), scmi-cpufreq,
> scpi-cpufreq) but it will also change the logic (e.g.
> sugov_should_update_freq() -> cpufreq_can_do_remote_dvfs()).
>
> I'm still struggling to understand when a driver/platform should set
> dvfs_possible_from_any_cpu to true and what the actual benefit would be.

I assume it might be beneficial to have the kthread moving around freely
in some cases, but since it is a SCHED_DEADLINE task now it can't really
migrate anywhere anyway. So I'm not sure either if this commits still makes
sense now. Or is there another use case for this ?

Thanks,
Quentin

2018-05-08 09:46:09

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 08-05-18, 11:09, Dietmar Eggemann wrote:
> This would make sure that the kthreads are bound to the correct set of cpus
> for platforms with those cpufreq drivers (cpufreq-dt (h960), scmi-cpufreq,
> scpi-cpufreq) but it will also change the logic (e.g.
> sugov_should_update_freq() -> cpufreq_can_do_remote_dvfs()).

Yeah, I misunderstood your patch a bit. So you are not disabling
remote updates but only limiting the CPUs where the kthread runs.

That still looks to be a big little specific problem to me right now
and I am not sure why should we specially handle these kthreads ?
Isn't the same true for any other threads/tasks in the kernel which
may end up running on big CPUs ? And this problem still occurs with
the EAS patches applied ? As I thought we may end up keeping such
small tasks on little cores then.

> I'm still struggling to understand when a driver/platform should set
> dvfs_possible_from_any_cpu to true and what the actual benefit would be.

Ideally it should be set by default for all ARM platforms at least
which have more than one cpufreq policy, as there is no hardware
limitation for changing frequency from other CPUs. If you look at the
commit logs of patches which added remote updates, you will see
interesting cases where this can be very useful.

commit 674e75411fc2 ("sched: cpufreq: Allow remote cpufreq callbacks")

--
viresh

2018-05-08 10:07:32

by Quentin Perret

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On Tuesday 08 May 2018 at 15:15:26 (+0530), Viresh Kumar wrote:
> On 08-05-18, 11:09, Dietmar Eggemann wrote:
> > This would make sure that the kthreads are bound to the correct set of cpus
> > for platforms with those cpufreq drivers (cpufreq-dt (h960), scmi-cpufreq,
> > scpi-cpufreq) but it will also change the logic (e.g.
> > sugov_should_update_freq() -> cpufreq_can_do_remote_dvfs()).
>
> Yeah, I misunderstood your patch a bit. So you are not disabling
> remote updates but only limiting the CPUs where the kthread runs.
>
> That still looks to be a big little specific problem to me right now
> and I am not sure why should we specially handle these kthreads ?
> Isn't the same true for any other threads/tasks in the kernel which
> may end up running on big CPUs ? And this problem still occurs with
> the EAS patches applied ? As I thought we may end up keeping such
> small tasks on little cores then.

The sugov kthreads are DL tasks so they're not impacted by EAS. But even
if you take EAS out of the picture, those kthreads are assigned to a
"random" CPU at boot time and stay there forever (because that's how DL
works). Is this what we want ?

>
> > I'm still struggling to understand when a driver/platform should set
> > dvfs_possible_from_any_cpu to true and what the actual benefit would be.
>
> Ideally it should be set by default for all ARM platforms at least
> which have more than one cpufreq policy,

Looking at the commit you mention below it seems that you did the
testing on the old hikey which has only one cpufreq policy. Did you try
on other platforms as well ?

> as there is no hardware
> limitation for changing frequency from other CPUs. If you look at the
> commit logs of patches which added remote updates, you will see
> interesting cases where this can be very useful.
>
> commit 674e75411fc2 ("sched: cpufreq: Allow remote cpufreq callbacks")
>
> --
> viresh

Thanks,
Quentin

2018-05-08 10:35:05

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 08-05-18, 11:02, Quentin Perret wrote:
> The sugov kthreads are DL tasks so they're not impacted by EAS. But even
> if you take EAS out of the picture, those kthreads are assigned to a
> "random" CPU at boot time and stay there forever (because that's how DL
> works). Is this what we want ?

Okay, I didn't knew that DL threads don't migrate at all. I don't
think that's what we want then specially for big LITTLE platforms. But
for the rest, I don't know. Take example of Qcom krait. Each CPU has a
separate policy, why shouldn't we allow other CPUs to run the kthread?

> Looking at the commit you mention below it seems that you did the
> testing on the old hikey which has only one cpufreq policy. Did you try
> on other platforms as well ?

Yeah, the testing wasn't performance oriented but rather corner case
oriented and it made sense to allow other CPUs to go update the freq
for remote CPUs. And that's true for big LITTLE as well, the only
question here is which CPUs we want the thread to run on.

--
viresh

2018-05-08 10:37:27

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 05/08/2018 11:45 AM, Viresh Kumar wrote:
> On 08-05-18, 11:09, Dietmar Eggemann wrote:
>> This would make sure that the kthreads are bound to the correct set of cpus
>> for platforms with those cpufreq drivers (cpufreq-dt (h960), scmi-cpufreq,
>> scpi-cpufreq) but it will also change the logic (e.g.
>> sugov_should_update_freq() -> cpufreq_can_do_remote_dvfs()).
>
> Yeah, I misunderstood your patch a bit. So you are not disabling
> remote updates but only limiting the CPUs where the kthread runs.

Yes, remote updates are possible even if the sugov kthread is bound to
the cpus of the policy.

cross policy (cluster) remote callback example:
...
migration/1-14 [001] enqueue_task_fair: this_cpu=1 cpu_of(rq)=7
migration/1-14 [001] sugov_update_shared: this_cpu=1 sg_cpu->cpu=7
sg_cpu->sg_policy->policy->related_cpus=4-7
sugov:4-1492 [004] sugov_work: this_cpu=4
sg_cpu->sg_policy->policy->related_cpus=4-7
...

cpu=1 updates cpu=7 remotely. This is independent from where the actual
sugov kthread is running.

> That still looks to be a big little specific problem to me right now
> and I am not sure why should we specially handle these kthreads ?
> Isn't the same true for any other threads/tasks in the kernel which
> may end up running on big CPUs ? And this problem still occurs with
> the EAS patches applied ? As I thought we may end up keeping such
> small tasks on little cores then.

Binding it to the cpus of the policy makes sense since if the OPP should
be changed for these cpus it should be done on one of these cpus, making
sure we don't disturb the others. EAS and big.LITTLE will profit from
this, but potentially every system with multiple frequency domains.

Binding them on to little cpus is an overhead to me which will gain us
very little. Keeping the sugov threads to the cpus of the policy doesn't
cost much and helps a lot.

>> I'm still struggling to understand when a driver/platform should set
>> dvfs_possible_from_any_cpu to true and what the actual benefit would be.
>
> Ideally it should be set by default for all ARM platforms at least
> which have more than one cpufreq policy, as there is no hardware
> limitation for changing frequency from other CPUs. If you look at the
> commit logs of patches which added remote updates, you will see
> interesting cases where this can be very useful.

That's true but where is the benefit by doing so? (Multiple) per-cluster
or per-cpu frequency domains, why should the sugov kthread run on a
foreign cpu?

> commit 674e75411fc2 ("sched: cpufreq: Allow remote cpufreq callbacks")

IMHO, this describes the remote cpufreq callback functionality itself
which works perfectly fine even if we leave the sugov ktrhead bound to
the cpus of the policy.

2018-05-08 10:54:08

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 08-05-18, 12:36, Dietmar Eggemann wrote:
> That's true but where is the benefit by doing so? (Multiple) per-cluster or
> per-cpu frequency domains, why should the sugov kthread run on a foreign
> cpu?

I am not sure I know the answer, but I have a question which you can
answer :)

Is it possible for a CPU (which already has high priority deadline
activity going on) to be busy enough to be not able to run the
schedutil kthread for sometime ? That would be the only case I believe
where it would be better to let some other CPU go and change the
frequency for this one as we better run faster while we have the high
load going on.

--
viresh

2018-05-08 11:00:49

by Quentin Perret

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On Tuesday 08 May 2018 at 16:04:27 (+0530), Viresh Kumar wrote:
> On 08-05-18, 11:02, Quentin Perret wrote:
> > The sugov kthreads are DL tasks so they're not impacted by EAS. But even
> > if you take EAS out of the picture, those kthreads are assigned to a
> > "random" CPU at boot time and stay there forever (because that's how DL
> > works). Is this what we want ?
>
> Okay, I didn't knew that DL threads don't migrate at all. I don't
> think that's what we want then specially for big LITTLE platforms. But
> for the rest, I don't know. Take example of Qcom krait. Each CPU has a
> separate policy, why shouldn't we allow other CPUs to run the kthread?

Right, I see your point. Now, with the current implementation, why should
we randomly force a CPU to manage the kthread of another ? IIUC deadline
should assign the kthreads to CPUs depending on the state of the system
when the task is created. So, from one boot to another, you could
theoretically end up with varying configurations, and varying power/perf
numbers.

>
> > Looking at the commit you mention below it seems that you did the
> > testing on the old hikey which has only one cpufreq policy. Did you try
> > on other platforms as well ?
>
> Yeah, the testing wasn't performance oriented but rather corner case
> oriented and it made sense to allow other CPUs to go update the freq
> for remote CPUs. And that's true for big LITTLE as well, the only
> question here is which CPUs we want the thread to run on.
>
> --
> viresh

2018-05-08 11:15:44

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 08-05-18, 12:00, Quentin Perret wrote:
> Right, I see your point. Now, with the current implementation, why should
> we randomly force a CPU to manage the kthread of another ? IIUC deadline
> should assign the kthreads to CPUs depending on the state of the system
> when the task is created. So, from one boot to another, you could
> theoretically end up with varying configurations, and varying power/perf
> numbers.

Yeah, if it is fixed at boot then there is no good reason to push it
to any other CPU. I agree.

--
viresh

2018-05-08 11:25:17

by Quentin Perret

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On Tuesday 08 May 2018 at 16:44:51 (+0530), Viresh Kumar wrote:
> On 08-05-18, 12:00, Quentin Perret wrote:
> > Right, I see your point. Now, with the current implementation, why should
> > we randomly force a CPU to manage the kthread of another ? IIUC deadline
> > should assign the kthreads to CPUs depending on the state of the system
> > when the task is created. So, from one boot to another, you could
> > theoretically end up with varying configurations, and varying power/perf
> > numbers.
>
> Yeah, if it is fixed at boot then there is no good reason to push it
> to any other CPU. I agree.
>

To be fair, I think that DL tasks _can_ migrate, but only in special
conditions (hotplug, or if a DL task wakes up when another DL task is
running and things like that IIRC) but that probably doesn't matter much
for our discussion here.

Thanks,
Quentin

2018-05-08 12:19:52

by Juri Lelli

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 08/05/18 16:23, Viresh Kumar wrote:
> On 08-05-18, 12:36, Dietmar Eggemann wrote:
> > That's true but where is the benefit by doing so? (Multiple) per-cluster or
> > per-cpu frequency domains, why should the sugov kthread run on a foreign
> > cpu?
>
> I am not sure I know the answer, but I have a question which you can
> answer :)
>
> Is it possible for a CPU (which already has high priority deadline
> activity going on) to be busy enough to be not able to run the
> schedutil kthread for sometime ? That would be the only case I believe
> where it would be better to let some other CPU go and change the
> frequency for this one as we better run faster while we have the high
> load going on.

Shouldn't happen. This kthreads are "special" and will preempt any other
DL task (stop class win of course).

2018-05-08 12:22:20

by Juri Lelli

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 08/05/18 12:24, Quentin Perret wrote:
> On Tuesday 08 May 2018 at 16:44:51 (+0530), Viresh Kumar wrote:
> > On 08-05-18, 12:00, Quentin Perret wrote:
> > > Right, I see your point. Now, with the current implementation, why should
> > > we randomly force a CPU to manage the kthread of another ? IIUC deadline
> > > should assign the kthreads to CPUs depending on the state of the system
> > > when the task is created. So, from one boot to another, you could
> > > theoretically end up with varying configurations, and varying power/perf
> > > numbers.
> >
> > Yeah, if it is fixed at boot then there is no good reason to push it
> > to any other CPU. I agree.
> >
>
> To be fair, I think that DL tasks _can_ migrate, but only in special
> conditions (hotplug, or if a DL task wakes up when another DL task is
> running and things like that IIRC) but that probably doesn't matter much
> for our discussion here.

More than "special" I'd say "different" (w.r.t. CFS). DL looks at tasks
deadlines and use that info to migrate tasks around. So, it's correct
that they will currently stay on first CPU they run on if no other DL
tasks are around.

Luca's capacity(/energy) awareness stuff should change that in the
future. But that might take a while I guess. :/

2018-05-08 20:37:36

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On Tue, May 8, 2018 at 12:34 PM, Viresh Kumar <[email protected]> wrote:
> On 08-05-18, 11:02, Quentin Perret wrote:
>> The sugov kthreads are DL tasks so they're not impacted by EAS. But even
>> if you take EAS out of the picture, those kthreads are assigned to a
>> "random" CPU at boot time and stay there forever (because that's how DL
>> works). Is this what we want ?
>
> Okay, I didn't knew that DL threads don't migrate at all. I don't
> think that's what we want then specially for big LITTLE platforms. But
> for the rest, I don't know. Take example of Qcom krait. Each CPU has a
> separate policy, why shouldn't we allow other CPUs to run the kthread?

Because that makes things more complex and harder to debug in general.

What's the exact reason why non-policy CPUs should ever run the sugov
kthread for the given policy?

2018-05-09 04:56:45

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 08-05-18, 22:36, Rafael J. Wysocki wrote:
> Because that makes things more complex and harder to debug in general.
>
> What's the exact reason why non-policy CPUs should ever run the sugov
> kthread for the given policy?

The only benefit was to let the scheduler run the kthread on the best
CPU (according to the scheduler), which may help reducing the delay in
running the kthread. But given the way deadline scheduler works, I
don't see a reason why this should be done anymore.

--
viresh

2018-05-09 04:57:47

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 08-05-18, 08:33, Dietmar Eggemann wrote:
> This reverts commit e2cabe48c20efb174ce0c01190f8b9c5f3ea1d13.
>
> Lifting the restriction that the sugov kthread is bound to the
> policy->related_cpus for a system with a slow switching cpufreq driver,
> which is able to perform DVFS from any cpu (e.g. cpufreq-dt), is not
> only not beneficial it also harms Enery-Aware Scheduling (EAS) on
> systems with asymmetric cpu capacities (e.g. Arm big.LITTLE).
>
> The sugov kthread which does the update for the little cpus could
> potentially run on a big cpu. It could prevent that the big cluster goes
> into deeper idle states although all the tasks are running on the little
> cluster.
>
> Example: hikey960 w/ 4.16.0-rc6-+
> Arm big.LITTLE with per-cluster DVFS
>
> root@h960:~# cat /proc/cpuinfo | grep "^CPU part"
> CPU part : 0xd03 (Cortex-A53, little cpu)
> CPU part : 0xd03
> CPU part : 0xd03
> CPU part : 0xd03
> CPU part : 0xd09 (Cortex-A73, big cpu)
> CPU part : 0xd09
> CPU part : 0xd09
> CPU part : 0xd09
>
> root@h960:/sys/devices/system/cpu/cpufreq# ls
> policy0 policy4 schedutil
>
> root@h960:/sys/devices/system/cpu/cpufreq# cat policy*/related_cpus
> 0 1 2 3
> 4 5 6 7
>
> (1) w/o the revert:
>
> root@h960:~# ps -eo pid,class,rtprio,pri,psr,comm | awk 'NR == 1 ||
> /sugov/'
> PID CLS RTPRIO PRI PSR COMMAND
> 1489 #6 0 140 1 sugov:0
> 1490 #6 0 140 0 sugov:4
>
> The sugov kthread sugov:4 responsible for policy4 runs on cpu0. (In this
> case both sugov kthreads run on little cpus).
>
> cross policy (cluster) remote callback example:
> ...
> migration/1-14 [001] enqueue_task_fair: this_cpu=1 cpu_of(rq)=5
> migration/1-14 [001] sugov_update_shared: this_cpu=1 sg_cpu->cpu=5
> sg_cpu->sg_policy->policy->related_cpus=4-7
> sugov:4-1490 [000] sugov_work: this_cpu=0
> sg_cpu->sg_policy->policy->related_cpus=4-7
> ...
>
> The remote callback (this_cpu=1, target_cpu=5) is executed on cpu=0.
>
> (2) w/ the revert:
>
> root@h960:~# ps -eo pid,class,rtprio,pri,psr,comm | awk 'NR == 1 ||
> /sugov/'
> PID CLS RTPRIO PRI PSR COMMAND
> 1491 #6 0 140 2 sugov:0
> 1492 #6 0 140 4 sugov:4
>
> The sugov kthread sugov:4 responsible for policy4 runs on cpu4.
>
> cross policy (cluster) remote callback example:
> ...
> migration/1-14 [001] enqueue_task_fair: this_cpu=1 cpu_of(rq)=7
> migration/1-14 [001] sugov_update_shared: this_cpu=1 sg_cpu->cpu=7
> sg_cpu->sg_policy->policy->related_cpus=4-7
> sugov:4-1492 [004] sugov_work: this_cpu=4
> sg_cpu->sg_policy->policy->related_cpus=4-7
> ...
>
> The remote callback (this_cpu=1, target_cpu=7) is executed on cpu=4.
>
> Now the sugov kthread executes again on the policy (cluster) for which
> the Operating Performance Point (OPP) should be changed.
> It avoids the problem that an otherwise idle policy (cluster) is running
> schedutil (the sugov kthread) for another one.
>
> Cc: Viresh Kumar <[email protected]>
> Cc: Rafael J. Wysocki <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Pavan Kondeti <[email protected]>
> Cc: Juri Lelli <[email protected]>
> Cc: Joel Fernandes <[email protected]>
> Cc: Patrick Bellasi <[email protected]>
> Cc: Quentin Perret <[email protected]>
> Signed-off-by: Dietmar Eggemann <[email protected]>
> ---
> kernel/sched/cpufreq_schedutil.c | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index d2c6083304b4..63014cff76a5 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -523,11 +523,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
> }
>
> sg_policy->thread = thread;
> -
> - /* Kthread is bound to all CPUs by default */
> - if (!policy->dvfs_possible_from_any_cpu)
> - kthread_bind_mask(thread, policy->related_cpus);
> -
> + kthread_bind_mask(thread, policy->related_cpus);
> init_irq_work(&sg_policy->irq_work, sugov_irq_work);
> mutex_init(&sg_policy->work_lock);
>

Acked-by: Viresh Kumar <[email protected]>

--
viresh

2018-05-13 05:19:59

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On Tue, May 08, 2018 at 10:42:37AM +0100, Quentin Perret wrote:
> On Tuesday 08 May 2018 at 11:09:57 (+0200), Dietmar Eggemann wrote:
> > On 05/08/2018 10:22 AM, Viresh Kumar wrote:
> > > On 08-05-18, 08:33, Dietmar Eggemann wrote:
> > > > This reverts commit e2cabe48c20efb174ce0c01190f8b9c5f3ea1d13.
> > > >
> > > > Lifting the restriction that the sugov kthread is bound to the
> > > > policy->related_cpus for a system with a slow switching cpufreq driver,
> > > > which is able to perform DVFS from any cpu (e.g. cpufreq-dt), is not
> > > > only not beneficial it also harms Enery-Aware Scheduling (EAS) on
> > > > systems with asymmetric cpu capacities (e.g. Arm big.LITTLE).
> > > >
> > > > The sugov kthread which does the update for the little cpus could
> > > > potentially run on a big cpu. It could prevent that the big cluster goes
> > > > into deeper idle states although all the tasks are running on the little
> > > > cluster.
> > >
> > > I think the original patch did the right thing, but that doesn't suit
> > > everybody as you explained.
> > >
> > > I wouldn't really revert the patch but fix my platform's cpufreq
> > > driver to set dvfs_possible_from_any_cpu = false, so that other
> > > platforms can still benefit from the original commit.
> >
> > This would make sure that the kthreads are bound to the correct set of cpus
> > for platforms with those cpufreq drivers (cpufreq-dt (h960), scmi-cpufreq,
> > scpi-cpufreq) but it will also change the logic (e.g.
> > sugov_should_update_freq() -> cpufreq_can_do_remote_dvfs()).
> >
> > I'm still struggling to understand when a driver/platform should set
> > dvfs_possible_from_any_cpu to true and what the actual benefit would be.
>
> I assume it might be beneficial to have the kthread moving around freely
> in some cases, but since it is a SCHED_DEADLINE task now it can't really
> migrate anywhere anyway. So I'm not sure either if this commits still makes
> sense now. Or is there another use case for this ?

The usecase I guess is, as Dietmar was saying, that it makes sense for
kthread to update its own cluster and not disturb other clusters or random
CPUs. I agree with this point.

- Joel



2018-05-17 10:33:32

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On Wednesday, May 9, 2018 6:55:27 AM CEST Viresh Kumar wrote:
> On 08-05-18, 08:33, Dietmar Eggemann wrote:
> > This reverts commit e2cabe48c20efb174ce0c01190f8b9c5f3ea1d13.
> >
> > Lifting the restriction that the sugov kthread is bound to the
> > policy->related_cpus for a system with a slow switching cpufreq driver,
> > which is able to perform DVFS from any cpu (e.g. cpufreq-dt), is not
> > only not beneficial it also harms Enery-Aware Scheduling (EAS) on
> > systems with asymmetric cpu capacities (e.g. Arm big.LITTLE).
> >
> > The sugov kthread which does the update for the little cpus could
> > potentially run on a big cpu. It could prevent that the big cluster goes
> > into deeper idle states although all the tasks are running on the little
> > cluster.
> >
> > Example: hikey960 w/ 4.16.0-rc6-+
> > Arm big.LITTLE with per-cluster DVFS
> >
> > root@h960:~# cat /proc/cpuinfo | grep "^CPU part"
> > CPU part : 0xd03 (Cortex-A53, little cpu)
> > CPU part : 0xd03
> > CPU part : 0xd03
> > CPU part : 0xd03
> > CPU part : 0xd09 (Cortex-A73, big cpu)
> > CPU part : 0xd09
> > CPU part : 0xd09
> > CPU part : 0xd09
> >
> > root@h960:/sys/devices/system/cpu/cpufreq# ls
> > policy0 policy4 schedutil
> >
> > root@h960:/sys/devices/system/cpu/cpufreq# cat policy*/related_cpus
> > 0 1 2 3
> > 4 5 6 7
> >
> > (1) w/o the revert:
> >
> > root@h960:~# ps -eo pid,class,rtprio,pri,psr,comm | awk 'NR == 1 ||
> > /sugov/'
> > PID CLS RTPRIO PRI PSR COMMAND
> > 1489 #6 0 140 1 sugov:0
> > 1490 #6 0 140 0 sugov:4
> >
> > The sugov kthread sugov:4 responsible for policy4 runs on cpu0. (In this
> > case both sugov kthreads run on little cpus).
> >
> > cross policy (cluster) remote callback example:
> > ...
> > migration/1-14 [001] enqueue_task_fair: this_cpu=1 cpu_of(rq)=5
> > migration/1-14 [001] sugov_update_shared: this_cpu=1 sg_cpu->cpu=5
> > sg_cpu->sg_policy->policy->related_cpus=4-7
> > sugov:4-1490 [000] sugov_work: this_cpu=0
> > sg_cpu->sg_policy->policy->related_cpus=4-7
> > ...
> >
> > The remote callback (this_cpu=1, target_cpu=5) is executed on cpu=0.
> >
> > (2) w/ the revert:
> >
> > root@h960:~# ps -eo pid,class,rtprio,pri,psr,comm | awk 'NR == 1 ||
> > /sugov/'
> > PID CLS RTPRIO PRI PSR COMMAND
> > 1491 #6 0 140 2 sugov:0
> > 1492 #6 0 140 4 sugov:4
> >
> > The sugov kthread sugov:4 responsible for policy4 runs on cpu4.
> >
> > cross policy (cluster) remote callback example:
> > ...
> > migration/1-14 [001] enqueue_task_fair: this_cpu=1 cpu_of(rq)=7
> > migration/1-14 [001] sugov_update_shared: this_cpu=1 sg_cpu->cpu=7
> > sg_cpu->sg_policy->policy->related_cpus=4-7
> > sugov:4-1492 [004] sugov_work: this_cpu=4
> > sg_cpu->sg_policy->policy->related_cpus=4-7
> > ...
> >
> > The remote callback (this_cpu=1, target_cpu=7) is executed on cpu=4.
> >
> > Now the sugov kthread executes again on the policy (cluster) for which
> > the Operating Performance Point (OPP) should be changed.
> > It avoids the problem that an otherwise idle policy (cluster) is running
> > schedutil (the sugov kthread) for another one.
> >
> > Cc: Viresh Kumar <[email protected]>
> > Cc: Rafael J. Wysocki <[email protected]>
> > Cc: Peter Zijlstra <[email protected]>
> > Cc: Ingo Molnar <[email protected]>
> > Cc: Pavan Kondeti <[email protected]>
> > Cc: Juri Lelli <[email protected]>
> > Cc: Joel Fernandes <[email protected]>
> > Cc: Patrick Bellasi <[email protected]>
> > Cc: Quentin Perret <[email protected]>
> > Signed-off-by: Dietmar Eggemann <[email protected]>
> > ---
> > kernel/sched/cpufreq_schedutil.c | 6 +-----
> > 1 file changed, 1 insertion(+), 5 deletions(-)
> >
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index d2c6083304b4..63014cff76a5 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -523,11 +523,7 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
> > }
> >
> > sg_policy->thread = thread;
> > -
> > - /* Kthread is bound to all CPUs by default */
> > - if (!policy->dvfs_possible_from_any_cpu)
> > - kthread_bind_mask(thread, policy->related_cpus);
> > -
> > + kthread_bind_mask(thread, policy->related_cpus);
> > init_irq_work(&sg_policy->irq_work, sugov_irq_work);
> > mutex_init(&sg_policy->work_lock);
> >
>
> Acked-by: Viresh Kumar <[email protected]>

Applied, thanks!


2018-05-17 19:10:55

by Saravana Kannan

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On 05/12/2018 10:19 PM, Joel Fernandes wrote:
> On Tue, May 08, 2018 at 10:42:37AM +0100, Quentin Perret wrote:
>> On Tuesday 08 May 2018 at 11:09:57 (+0200), Dietmar Eggemann wrote:
>>> On 05/08/2018 10:22 AM, Viresh Kumar wrote:
>>>> On 08-05-18, 08:33, Dietmar Eggemann wrote:
>>>>> This reverts commit e2cabe48c20efb174ce0c01190f8b9c5f3ea1d13.
>>>>>
>>>>> Lifting the restriction that the sugov kthread is bound to the
>>>>> policy->related_cpus for a system with a slow switching cpufreq driver,
>>>>> which is able to perform DVFS from any cpu (e.g. cpufreq-dt), is not
>>>>> only not beneficial it also harms Enery-Aware Scheduling (EAS) on
>>>>> systems with asymmetric cpu capacities (e.g. Arm big.LITTLE).
>>>>>
>>>>> The sugov kthread which does the update for the little cpus could
>>>>> potentially run on a big cpu. It could prevent that the big cluster goes
>>>>> into deeper idle states although all the tasks are running on the little
>>>>> cluster.
>>>>
>>>> I think the original patch did the right thing, but that doesn't suit
>>>> everybody as you explained.
>>>>
>>>> I wouldn't really revert the patch but fix my platform's cpufreq
>>>> driver to set dvfs_possible_from_any_cpu = false, so that other
>>>> platforms can still benefit from the original commit.
>>>
>>> This would make sure that the kthreads are bound to the correct set of cpus
>>> for platforms with those cpufreq drivers (cpufreq-dt (h960), scmi-cpufreq,
>>> scpi-cpufreq) but it will also change the logic (e.g.
>>> sugov_should_update_freq() -> cpufreq_can_do_remote_dvfs()).
>>>
>>> I'm still struggling to understand when a driver/platform should set
>>> dvfs_possible_from_any_cpu to true and what the actual benefit would be.
>>
>> I assume it might be beneficial to have the kthread moving around freely
>> in some cases, but since it is a SCHED_DEADLINE task now it can't really
>> migrate anywhere anyway. So I'm not sure either if this commits still makes
>> sense now. Or is there another use case for this ?
>
> The usecase I guess is, as Dietmar was saying, that it makes sense for
> kthread to update its own cluster and not disturb other clusters or random
> CPUs. I agree with this point.

I agree with Viresh. Also, why exactly did we make it deadline instead
of RT? Was RT not getting scheduled quick enough? Is it because Android
creates a lot of RT threads?

-Saravana


--
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project

2018-05-17 19:14:34

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH] Revert "cpufreq: schedutil: Don't restrict kthread to related_cpus unnecessarily"

On Thu, May 17, 2018 at 12:10:22PM -0700, Saravana Kannan wrote:
> On 05/12/2018 10:19 PM, Joel Fernandes wrote:
> > On Tue, May 08, 2018 at 10:42:37AM +0100, Quentin Perret wrote:
> > > On Tuesday 08 May 2018 at 11:09:57 (+0200), Dietmar Eggemann wrote:
> > > > On 05/08/2018 10:22 AM, Viresh Kumar wrote:
> > > > > On 08-05-18, 08:33, Dietmar Eggemann wrote:
> > > > > > This reverts commit e2cabe48c20efb174ce0c01190f8b9c5f3ea1d13.
> > > > > >
> > > > > > Lifting the restriction that the sugov kthread is bound to the
> > > > > > policy->related_cpus for a system with a slow switching cpufreq driver,
> > > > > > which is able to perform DVFS from any cpu (e.g. cpufreq-dt), is not
> > > > > > only not beneficial it also harms Enery-Aware Scheduling (EAS) on
> > > > > > systems with asymmetric cpu capacities (e.g. Arm big.LITTLE).
> > > > > >
> > > > > > The sugov kthread which does the update for the little cpus could
> > > > > > potentially run on a big cpu. It could prevent that the big cluster goes
> > > > > > into deeper idle states although all the tasks are running on the little
> > > > > > cluster.
> > > > >
> > > > > I think the original patch did the right thing, but that doesn't suit
> > > > > everybody as you explained.
> > > > >
> > > > > I wouldn't really revert the patch but fix my platform's cpufreq
> > > > > driver to set dvfs_possible_from_any_cpu = false, so that other
> > > > > platforms can still benefit from the original commit.
> > > >
> > > > This would make sure that the kthreads are bound to the correct set of cpus
> > > > for platforms with those cpufreq drivers (cpufreq-dt (h960), scmi-cpufreq,
> > > > scpi-cpufreq) but it will also change the logic (e.g.
> > > > sugov_should_update_freq() -> cpufreq_can_do_remote_dvfs()).
> > > >
> > > > I'm still struggling to understand when a driver/platform should set
> > > > dvfs_possible_from_any_cpu to true and what the actual benefit would be.
> > >
> > > I assume it might be beneficial to have the kthread moving around freely
> > > in some cases, but since it is a SCHED_DEADLINE task now it can't really
> > > migrate anywhere anyway. So I'm not sure either if this commits still makes
> > > sense now. Or is there another use case for this ?
> >
> > The usecase I guess is, as Dietmar was saying, that it makes sense for
> > kthread to update its own cluster and not disturb other clusters or random
> > CPUs. I agree with this point.
>
> I agree with Viresh. Also, why exactly did we make it deadline instead of
> RT? Was RT not getting scheduled quick enough? Is it because Android creates
> a lot of RT threads?

Because deadline also needs to change frequency and depends on it ;)

- Joel