2021-08-10 08:42:01

by Viresh Kumar

Subject: [PATCH 0/8] cpufreq: Auto-register with energy model

Provide a cpufreq driver flag so drivers can ask the cpufreq core to register
with the EM core on their behalf. This allows us to get rid of duplicated code
in the drivers and fix the unregistration part as well, which none of the
drivers have done until now.

This would also make the registration with the EM core happen only after the
policy is fully initialized, and the EM core can do other work from there, like
marking frequencies as inefficient (WIP). This patchset is useful even without
that work being done, though, and should be merged nevertheless.

This doesn't update the scmi cpufreq driver for now, as it is a special case and
needs to be handled differently, though we can make it work with this if required.

This is build/boot tested by the bot for a couple of boards.

https://gitlab.com/vireshk/pmko/-/pipelines/350674298
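
As a rough illustration of the idea only (a user-space model, not the kernel
code in this series), the flag check done by the core can be sketched as
below. The names model_driver, model_policy, model_online() and
model_offline() are hypothetical; only the flag name and its BIT(7) value are
taken from patch 1/8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors CPUFREQ_REGISTER_WITH_EM, which is BIT(7) in this series. */
#define MODEL_REGISTER_WITH_EM (1u << 7)

/* Hypothetical stand-ins for struct cpufreq_driver / cpufreq_policy. */
struct model_driver {
    unsigned int flags;
};

struct model_policy {
    bool em_registered;
};

/* Core-side online path: register with the EM only if the driver asked. */
static void model_online(const struct model_driver *drv,
                         struct model_policy *policy)
{
    assert(drv != NULL && policy != NULL);

    if (drv->flags & MODEL_REGISTER_WITH_EM)
        policy->em_registered = true;
}

/* Core-side offline path: undo the registration for the same drivers. */
static void model_offline(const struct model_driver *drv,
                          struct model_policy *policy)
{
    assert(drv != NULL && policy != NULL);

    if (drv->flags & MODEL_REGISTER_WITH_EM)
        policy->em_registered = false;
}
```

A driver that leaves the flag clear is untouched by either path, so existing
drivers keep their current behaviour until they opt in.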

--
Viresh

Viresh Kumar (8):
cpufreq: Auto-register with energy model if asked
cpufreq: dt: Use auto-registration for energy model
cpufreq: imx6q: Use auto-registration for energy model
cpufreq: mediatek: Use auto-registration for energy model
cpufreq: omap: Use auto-registration for energy model
cpufreq: qcom-cpufreq-hw: Use auto-registration for energy model
cpufreq: scpi: Use auto-registration for energy model
cpufreq: vexpress: Use auto-registration for energy model

drivers/cpufreq/cpufreq-dt.c | 5 ++---
drivers/cpufreq/cpufreq.c | 9 +++++++++
drivers/cpufreq/imx6q-cpufreq.c | 4 ++--
drivers/cpufreq/mediatek-cpufreq.c | 5 ++---
drivers/cpufreq/omap-cpufreq.c | 4 ++--
drivers/cpufreq/qcom-cpufreq-hw.c | 5 ++---
drivers/cpufreq/scpi-cpufreq.c | 5 ++---
drivers/cpufreq/vexpress-spc-cpufreq.c | 5 ++---
include/linux/cpufreq.h | 6 ++++++
9 files changed, 29 insertions(+), 19 deletions(-)

--
2.31.1.272.g89b43f80a514


2021-08-10 08:42:41

by Viresh Kumar

Subject: [PATCH 4/8] cpufreq: mediatek: Use auto-registration for energy model

Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
automatically register with the energy model.

This allows removal of boilerplate code from the driver and fixes the
unregistration part as well.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/mediatek-cpufreq.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
index 87019d5a9547..4743f2e58b97 100644
--- a/drivers/cpufreq/mediatek-cpufreq.c
+++ b/drivers/cpufreq/mediatek-cpufreq.c
@@ -448,8 +448,6 @@ static int mtk_cpufreq_init(struct cpufreq_policy *policy)
policy->driver_data = info;
policy->clk = info->cpu_clk;

- dev_pm_opp_of_register_em(info->cpu_dev, policy->cpus);
-
return 0;
}

@@ -465,7 +463,8 @@ static int mtk_cpufreq_exit(struct cpufreq_policy *policy)
static struct cpufreq_driver mtk_cpufreq_driver = {
.flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
- CPUFREQ_IS_COOLING_DEV,
+ CPUFREQ_IS_COOLING_DEV |
+ CPUFREQ_REGISTER_WITH_EM,
.verify = cpufreq_generic_frequency_table_verify,
.target_index = mtk_cpufreq_set_target,
.get = cpufreq_generic_get,
--
2.31.1.272.g89b43f80a514

2021-08-10 08:43:03

by Viresh Kumar

Subject: [PATCH 2/8] cpufreq: dt: Use auto-registration for energy model

Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
automatically register with the energy model.

This allows removal of boilerplate code from the driver and fixes the
unregistration part as well.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/cpufreq-dt.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index ece52863ba62..b727006e85af 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -143,8 +143,6 @@ static int cpufreq_init(struct cpufreq_policy *policy)
cpufreq_dt_attr[1] = &cpufreq_freq_attr_scaling_boost_freqs;
}

- dev_pm_opp_of_register_em(cpu_dev, policy->cpus);
-
return 0;

out_clk_put:
@@ -176,7 +174,8 @@ static int cpufreq_exit(struct cpufreq_policy *policy)

static struct cpufreq_driver dt_cpufreq_driver = {
.flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
- CPUFREQ_IS_COOLING_DEV,
+ CPUFREQ_IS_COOLING_DEV |
+ CPUFREQ_REGISTER_WITH_EM,
.verify = cpufreq_generic_frequency_table_verify,
.target_index = set_target,
.get = cpufreq_generic_get,
--
2.31.1.272.g89b43f80a514

2021-08-10 08:43:59

by Viresh Kumar

Subject: [PATCH 7/8] cpufreq: scpi: Use auto-registration for energy model

Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
automatically register with the energy model.

This allows removal of boilerplate code from the driver and fixes the
unregistration part as well.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/scpi-cpufreq.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
index d6a698a1b5d1..bc8c62b1beb5 100644
--- a/drivers/cpufreq/scpi-cpufreq.c
+++ b/drivers/cpufreq/scpi-cpufreq.c
@@ -163,8 +163,6 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)

policy->fast_switch_possible = false;

- dev_pm_opp_of_register_em(cpu_dev, policy->cpus);
-
return 0;

out_free_cpufreq_table:
@@ -193,7 +191,8 @@ static struct cpufreq_driver scpi_cpufreq_driver = {
.name = "scpi-cpufreq",
.flags = CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
CPUFREQ_NEED_INITIAL_FREQ_CHECK |
- CPUFREQ_IS_COOLING_DEV,
+ CPUFREQ_IS_COOLING_DEV |
+ CPUFREQ_REGISTER_WITH_EM,
.verify = cpufreq_generic_frequency_table_verify,
.attr = cpufreq_generic_attr,
.get = scpi_cpufreq_get_rate,
--
2.31.1.272.g89b43f80a514

2021-08-10 08:44:39

by Viresh Kumar

Subject: [PATCH 8/8] cpufreq: vexpress: Use auto-registration for energy model

Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
automatically register with the energy model.

This allows removal of boilerplate code from the driver and fixes the
unregistration part as well.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/vexpress-spc-cpufreq.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/vexpress-spc-cpufreq.c b/drivers/cpufreq/vexpress-spc-cpufreq.c
index 51dfa9ae6cf5..28c4c3254337 100644
--- a/drivers/cpufreq/vexpress-spc-cpufreq.c
+++ b/drivers/cpufreq/vexpress-spc-cpufreq.c
@@ -442,8 +442,6 @@ static int ve_spc_cpufreq_init(struct cpufreq_policy *policy)
policy->freq_table = freq_table[cur_cluster];
policy->cpuinfo.transition_latency = 1000000; /* 1 ms */

- dev_pm_opp_of_register_em(cpu_dev, policy->cpus);
-
if (is_bL_switching_enabled())
per_cpu(cpu_last_req_freq, policy->cpu) =
clk_get_cpu_rate(policy->cpu);
@@ -487,7 +485,8 @@ static void ve_spc_cpufreq_ready(struct cpufreq_policy *policy)
static struct cpufreq_driver ve_spc_cpufreq_driver = {
.name = "vexpress-spc",
.flags = CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
- CPUFREQ_NEED_INITIAL_FREQ_CHECK,
+ CPUFREQ_NEED_INITIAL_FREQ_CHECK |
+ CPUFREQ_REGISTER_WITH_EM,
.verify = cpufreq_generic_frequency_table_verify,
.target_index = ve_spc_cpufreq_set_target,
.get = ve_spc_cpufreq_get_rate,
--
2.31.1.272.g89b43f80a514

2021-08-10 09:11:29

by Viresh Kumar

Subject: [PATCH 5/8] cpufreq: omap: Use auto-registration for energy model

Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
automatically register with the energy model.

This allows removal of boilerplate code from the driver and fixes the
unregistration part as well.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/omap-cpufreq.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c
index e035ee216b0f..303136f97773 100644
--- a/drivers/cpufreq/omap-cpufreq.c
+++ b/drivers/cpufreq/omap-cpufreq.c
@@ -131,7 +131,6 @@ static int omap_cpu_init(struct cpufreq_policy *policy)

/* FIXME: what's the actual transition time? */
cpufreq_generic_init(policy, freq_table, 300 * 1000);
- dev_pm_opp_of_register_em(mpu_dev, policy->cpus);

return 0;
}
@@ -144,7 +143,8 @@ static int omap_cpu_exit(struct cpufreq_policy *policy)
}

static struct cpufreq_driver omap_driver = {
- .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK,
+ .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
+ CPUFREQ_REGISTER_WITH_EM,
.verify = cpufreq_generic_frequency_table_verify,
.target_index = omap_target,
.get = cpufreq_generic_get,
--
2.31.1.272.g89b43f80a514

2021-08-10 09:30:16

by Viresh Kumar

Subject: [PATCH 1/8] cpufreq: Auto-register with energy model if asked

Many cpufreq drivers register with the energy model for each policy and
do exactly the same thing. Follow in the footsteps of thermal-cooling and
get it done from the cpufreq core itself.

Provide a cpufreq driver flag so drivers can ask the cpufreq core to
register with the EM core on their behalf. This allows us to get rid of
duplicated code in the drivers.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/cpufreq.c | 9 +++++++++
include/linux/cpufreq.h | 6 ++++++
2 files changed, 15 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 06c526d66dd3..a060dc2aa2f2 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -23,6 +23,7 @@
#include <linux/kernel_stat.h>
#include <linux/module.h>
#include <linux/mutex.h>
+#include <linux/pm_opp.h>
#include <linux/pm_qos.h>
#include <linux/slab.h>
#include <linux/suspend.h>
@@ -1511,6 +1512,11 @@ static int cpufreq_online(unsigned int cpu)
if (cpufreq_thermal_control_enabled(cpufreq_driver))
policy->cdev = of_cpufreq_cooling_register(policy);

+ if (cpufreq_driver->flags & CPUFREQ_REGISTER_WITH_EM) {
+ dev_pm_opp_of_register_em(get_cpu_device(policy->cpu),
+ policy->related_cpus);
+ }
+
pr_debug("initialization complete\n");

return 0;
@@ -1602,6 +1608,9 @@ static int cpufreq_offline(unsigned int cpu)
goto unlock;
}

+ if (cpufreq_driver->flags & CPUFREQ_REGISTER_WITH_EM)
+ dev_pm_opp_of_unregister_em(get_cpu_device(cpu));
+
if (cpufreq_thermal_control_enabled(cpufreq_driver)) {
cpufreq_cooling_unregister(policy->cdev);
policy->cdev = NULL;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 9fd719475fcd..f11723cd4cca 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -424,6 +424,12 @@ struct cpufreq_driver {
*/
#define CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING BIT(6)

+/*
+ * Set by drivers that want the core to automatically register the CPU device
+ * with Energy Model.
+ */
+#define CPUFREQ_REGISTER_WITH_EM BIT(7)
+
int cpufreq_register_driver(struct cpufreq_driver *driver_data);
int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);

--
2.31.1.272.g89b43f80a514

2021-08-10 09:30:23

by Viresh Kumar

Subject: [PATCH 3/8] cpufreq: imx6q: Use auto-registration for energy model

Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
automatically register with the energy model.

This allows removal of boilerplate code from the driver and fixes the
unregistration part as well.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/imx6q-cpufreq.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index 5bf5fc759881..aa8df5b468d7 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -192,14 +192,14 @@ static int imx6q_cpufreq_init(struct cpufreq_policy *policy)
policy->clk = clks[ARM].clk;
cpufreq_generic_init(policy, freq_table, transition_latency);
policy->suspend_freq = max_freq;
- dev_pm_opp_of_register_em(cpu_dev, policy->cpus);

return 0;
}

static struct cpufreq_driver imx6q_cpufreq_driver = {
.flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
- CPUFREQ_IS_COOLING_DEV,
+ CPUFREQ_IS_COOLING_DEV |
+ CPUFREQ_REGISTER_WITH_EM,
.verify = cpufreq_generic_frequency_table_verify,
.target_index = imx6q_set_target,
.get = cpufreq_generic_get,
--
2.31.1.272.g89b43f80a514

2021-08-10 09:30:48

by Viresh Kumar

Subject: [PATCH 6/8] cpufreq: qcom-cpufreq-hw: Use auto-registration for energy model

Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
automatically register with the energy model.

This allows removal of boilerplate code from the driver and fixes the
unregistration part as well.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/qcom-cpufreq-hw.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index f86859bf76f1..221433c6dcb0 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -362,8 +362,6 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
goto error;
}

- dev_pm_opp_of_register_em(cpu_dev, policy->cpus);
-
if (policy_has_boost_freq(policy)) {
ret = cpufreq_enable_boost_support();
if (ret)
@@ -406,7 +404,8 @@ static struct freq_attr *qcom_cpufreq_hw_attr[] = {
static struct cpufreq_driver cpufreq_qcom_hw_driver = {
.flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
- CPUFREQ_IS_COOLING_DEV,
+ CPUFREQ_IS_COOLING_DEV |
+ CPUFREQ_REGISTER_WITH_EM,
.verify = cpufreq_generic_frequency_table_verify,
.target_index = qcom_cpufreq_hw_target_index,
.get = qcom_cpufreq_hw_get,
--
2.31.1.272.g89b43f80a514

2021-08-10 10:28:22

by Lukasz Luba

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model



On 8/10/21 10:27 AM, Viresh Kumar wrote:
> On 10-08-21, 10:17, Lukasz Luba wrote:
>> Hi Viresh,
>>
>> I like the idea, only small comments here in the cover letter.
>>
>> On 8/10/21 8:36 AM, Viresh Kumar wrote:
>>> Provide a cpufreq driver flag so drivers can ask the cpufreq core to register
>>> with the EM core on their behalf. This allows us to get rid of duplicated code
>>> in the drivers and fix the unregistration part as well, which none of the
>>> drivers have done until now.
>>
>> The EM is never freed for CPUs by design. The unregister function was
>> introduced for devfreq devices.
>
> I see. So if a cpufreq driver unregisters and registers again, it will
> be required to use the entries created by the registration itself,
> right? Technically speaking, it is better to unregister and free any
> related resources and parse everything again.
>
> Let's say, just for fun, I want to test two copies of a cpufreq driver

It's good that it's just for fun ;)

> (providing different sets of freq-tables). I build both of them as
> modules, insert the first version, remove it, insert the second one.
> Ideally, this should just work as expected. But I don't think it will
> in this case as you never parse the EM stuff again.

The EM is used directly by the scheduler in the hot path; there are not
even checks that the EM is for CPUs. We are sure it is for CPUs and
is always there for all CPUs.

I'm currently working on an EM v2 which would have stronger mechanisms
and do a better job in this field. The patches are under internal review
and hopefully ready to post by the end of the month.

>
> Again, since the routine is there already, I think it is better/fine
> to just use it.

True, it does no harm, so I commented on patch 1/8 that it
could stay.

>
>>> This would also make the registration with the EM core happen only after the
>>> policy is fully initialized, and the EM core can do other work from there, like
>>> marking frequencies as inefficient (WIP). This patchset is useful even without
>>> that work being done, though, and should be merged nevertheless.
>>>
>>> This doesn't update the scmi cpufreq driver for now, as it is a special case and
>>> needs to be handled differently, though we can make it work with this if required.
>>
>> The scmi cpufreq driver uses direct EM API, which provides flexibility
>> and should stay as is.
>
> Right, so I left it as is for now.
>

2021-08-10 10:30:31

by Viresh Kumar

Subject: Re: [PATCH 1/8] cpufreq: Auto-register with energy model if asked

On 10-08-21, 10:36, Lukasz Luba wrote:
> The of_cpufreq_cooling_register() should be called after the EM
> is present for the CPU device. When you check that function,
> you will see that we call
> em_cpu_get(policy->cpu)
> to get the EM pointer. Otherwise IPA might fail.

Good point.

--
viresh

2021-08-10 12:41:08

by Lukasz Luba

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

Hi Viresh,

I like the idea, only small comments here in the cover letter.

On 8/10/21 8:36 AM, Viresh Kumar wrote:
> Provide a cpufreq driver flag so drivers can ask the cpufreq core to register
> with the EM core on their behalf. This allows us to get rid of duplicated code
> in the drivers and fix the unregistration part as well, which none of the
> drivers have done until now.

The EM is never freed for CPUs by design. The unregister function was
introduced for devfreq devices.

>
> This would also make the registration with the EM core happen only after the
> policy is fully initialized, and the EM core can do other work from there, like
> marking frequencies as inefficient (WIP). This patchset is useful even without
> that work being done, though, and should be merged nevertheless.
>
> This doesn't update the scmi cpufreq driver for now, as it is a special case and
> needs to be handled differently, though we can make it work with this if required.

The scmi cpufreq driver uses direct EM API, which provides flexibility
and should stay as is.

Let me review the patches.

Regards,
Lukasz

2021-08-10 12:42:50

by Viresh Kumar

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On 10-08-21, 10:17, Lukasz Luba wrote:
> Hi Viresh,
>
> I like the idea, only small comments here in the cover letter.
>
> On 8/10/21 8:36 AM, Viresh Kumar wrote:
> > Provide a cpufreq driver flag so drivers can ask the cpufreq core to register
> > with the EM core on their behalf. This allows us to get rid of duplicated code
> > in the drivers and fix the unregistration part as well, which none of the
> > drivers have done until now.
>
> The EM is never freed for CPUs by design. The unregister function was
> introduced for devfreq devices.

I see. So if a cpufreq driver unregisters and registers again, it will
be required to use the entries created by the registration itself,
right? Technically speaking, it is better to unregister and free any
related resources and parse everything again.

Let's say, just for fun, I want to test two copies of a cpufreq driver
(providing different sets of freq-tables). I build both of them as
modules, insert the first version, remove it, insert the second one.
Ideally, this should just work as expected. But I don't think it will
in this case as you never parse the EM stuff again.

Again, since the routine is there already, I think it is better/fine
to just use it.
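
The module-reload scenario can be modelled in user space as below. This is
only a sketch under two assumptions drawn from the discussion: that the EM of
a CPU is never freed, and that a later registration does not replace an EM
that is already present. struct model_cpu, model_em_register() and
model_em_unregister() are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-CPU state: the table the EM was built from, if any. */
struct model_cpu {
    const unsigned int *em_table;
};

/* Assumed behaviour: an existing EM is kept and registration is refused. */
static int model_em_register(struct model_cpu *cpu, const unsigned int *table)
{
    assert(cpu != NULL && table != NULL);

    if (cpu->em_table)
        return -EEXIST;

    cpu->em_table = table;
    return 0;
}

/* Assumed behaviour: unregistration leaves the EM of a CPU in place. */
static void model_em_unregister(struct model_cpu *cpu)
{
    assert(cpu != NULL);
    /* Intentionally empty: the EM is never freed for CPUs. */
}
```

Under these assumptions, inserting the first module, removing it and
inserting the second one leaves the EM built from the first frequency table
in place.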

> > This would also make the registration with the EM core happen only after the
> > policy is fully initialized, and the EM core can do other work from there, like
> > marking frequencies as inefficient (WIP). This patchset is useful even without
> > that work being done, though, and should be merged nevertheless.
> >
> > This doesn't update the scmi cpufreq driver for now, as it is a special case and
> > needs to be handled differently, though we can make it work with this if required.
>
> The scmi cpufreq driver uses direct EM API, which provides flexibility
> and should stay as is.

Right, so I left it as is for now.

--
viresh

2021-08-10 13:26:37

by Viresh Kumar

Subject: Re: [PATCH 8/8] cpufreq: vexpress: Use auto-registration for energy model

On 10-08-21, 11:05, Lukasz Luba wrote:
> I can see that this driver explicitly calls
> of_cpufreq_cooling_register().
> It does this in the cpufreq_driver->ready() callback
> implementation: ve_spc_cpufreq_ready().
>
> With that in mind, the new code in patch 1/8, which
> registers the EM, should be called even earlier, above:
> ---------------------8<---------------------------------
> /* Callback for handling stuff after policy is ready */
> if (cpufreq_driver->ready)
> cpufreq_driver->ready(policy);
> ------------------->8----------------------------------

Thanks. I will look at this sequencing issue again.

> This also raised a question:
> can this new flag be set in a cpufreq driver which hasn't set
> CPUFREQ_IS_COOLING_DEV?

Why not?

> I can only see one driver (the one in this patch) which has such a
> configuration.

--
viresh

2021-08-10 13:27:00

by Lukasz Luba

Subject: Re: [PATCH 8/8] cpufreq: vexpress: Use auto-registration for energy model



On 8/10/21 11:06 AM, Viresh Kumar wrote:
> On 10-08-21, 11:05, Lukasz Luba wrote:
>> I can see that this driver explicitly calls
>> of_cpufreq_cooling_register().
>> It does this in the cpufreq_driver->ready() callback
>> implementation: ve_spc_cpufreq_ready().
>>
>> With that in mind, the new code in patch 1/8, which
>> registers the EM, should be called even earlier, above:
>> ---------------------8<---------------------------------
>> /* Callback for handling stuff after policy is ready */
>> if (cpufreq_driver->ready)
>> cpufreq_driver->ready(policy);
>> ------------------->8----------------------------------
>
> Thanks. I will look at this sequencing issue again.
>
>> This also raised a question:
>> can this new flag be set in a cpufreq driver which hasn't set
>> CPUFREQ_IS_COOLING_DEV?
>
> Why not?

I thought someone could try to call cpufreq_cooling_register()
from the cpufreq driver init function, but it's not possible. I have
just checked that, so we should be good with these two flags being
independent and working fine.

>
>> I can only see one driver (the one in this patch) which has such a
>> configuration.
>

2021-08-10 13:27:00

by Viresh Kumar

Subject: Re: [PATCH 8/8] cpufreq: vexpress: Use auto-registration for energy model

On 10-08-21, 11:11, Lukasz Luba wrote:
>
>
> On 8/10/21 11:06 AM, Viresh Kumar wrote:
> > On 10-08-21, 11:05, Lukasz Luba wrote:
> > > I can see that this driver explicitly calls
> > > of_cpufreq_cooling_register().
> > > It does this in the cpufreq_driver->ready() callback
> > > implementation: ve_spc_cpufreq_ready().
> > >
> > > With that in mind, the new code in patch 1/8, which
> > > registers the EM, should be called even earlier, above:
> > > ---------------------8<---------------------------------
> > > /* Callback for handling stuff after policy is ready */
> > > if (cpufreq_driver->ready)
> > > cpufreq_driver->ready(policy);
> > > ------------------->8----------------------------------
> >
> > Thanks. I will look at this sequencing issue again.
> >
> > > This also raised a question:
> > > can this new flag be set in a cpufreq driver which hasn't set
> > > CPUFREQ_IS_COOLING_DEV?
> >
> > Why not?
>
> I thought someone could try to call cpufreq_cooling_register()
> from the cpufreq driver init function, but it's not possible. I have
> just checked that, so we should be good with these two flags being
> independent and working fine.

Ahh, I see. Great.

--
viresh

2021-08-10 13:28:47

by Lukasz Luba

Subject: Re: [PATCH 3/8] cpufreq: imx6q: Use auto-registration for energy model



On 8/10/21 8:36 AM, Viresh Kumar wrote:
> Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
> automatically register with the energy model.
>
> This allows removal of boilerplate code from the driver and fixes the
> unregistration part as well.
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/cpufreq/imx6q-cpufreq.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
> index 5bf5fc759881..aa8df5b468d7 100644
> --- a/drivers/cpufreq/imx6q-cpufreq.c
> +++ b/drivers/cpufreq/imx6q-cpufreq.c
> @@ -192,14 +192,14 @@ static int imx6q_cpufreq_init(struct cpufreq_policy *policy)
> policy->clk = clks[ARM].clk;
> cpufreq_generic_init(policy, freq_table, transition_latency);
> policy->suspend_freq = max_freq;
> - dev_pm_opp_of_register_em(cpu_dev, policy->cpus);
>
> return 0;
> }
>
> static struct cpufreq_driver imx6q_cpufreq_driver = {
> .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
> - CPUFREQ_IS_COOLING_DEV,
> + CPUFREQ_IS_COOLING_DEV |
> + CPUFREQ_REGISTER_WITH_EM,
> .verify = cpufreq_generic_frequency_table_verify,
> .target_index = imx6q_set_target,
> .get = cpufreq_generic_get,
>

Reviewed-by: Lukasz Luba <[email protected]>

2021-08-10 13:29:17

by Lukasz Luba

Subject: Re: [PATCH 5/8] cpufreq: omap: Use auto-registration for energy model



On 8/10/21 8:36 AM, Viresh Kumar wrote:
> Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
> automatically register with the energy model.
>
> This allows removal of boilerplate code from the driver and fixes the
> unregistration part as well.
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/cpufreq/omap-cpufreq.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cpufreq/omap-cpufreq.c b/drivers/cpufreq/omap-cpufreq.c
> index e035ee216b0f..303136f97773 100644
> --- a/drivers/cpufreq/omap-cpufreq.c
> +++ b/drivers/cpufreq/omap-cpufreq.c
> @@ -131,7 +131,6 @@ static int omap_cpu_init(struct cpufreq_policy *policy)
>
> /* FIXME: what's the actual transition time? */
> cpufreq_generic_init(policy, freq_table, 300 * 1000);
> - dev_pm_opp_of_register_em(mpu_dev, policy->cpus);
>
> return 0;
> }
> @@ -144,7 +143,8 @@ static int omap_cpu_exit(struct cpufreq_policy *policy)
> }
>
> static struct cpufreq_driver omap_driver = {
> - .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK,
> + .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
> + CPUFREQ_REGISTER_WITH_EM,
> .verify = cpufreq_generic_frequency_table_verify,
> .target_index = omap_target,
> .get = cpufreq_generic_get,
>

Reviewed-by: Lukasz Luba <[email protected]>

2021-08-10 13:29:49

by Lukasz Luba

Subject: Re: [PATCH 2/8] cpufreq: dt: Use auto-registration for energy model



On 8/10/21 8:36 AM, Viresh Kumar wrote:
> Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
> automatically register with the energy model.
>
> This allows removal of boilerplate code from the driver and fixes the
> unregistration part as well.
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/cpufreq/cpufreq-dt.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> index ece52863ba62..b727006e85af 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -143,8 +143,6 @@ static int cpufreq_init(struct cpufreq_policy *policy)
> cpufreq_dt_attr[1] = &cpufreq_freq_attr_scaling_boost_freqs;
> }
>
> - dev_pm_opp_of_register_em(cpu_dev, policy->cpus);
> -
> return 0;
>
> out_clk_put:
> @@ -176,7 +174,8 @@ static int cpufreq_exit(struct cpufreq_policy *policy)
>
> static struct cpufreq_driver dt_cpufreq_driver = {
> .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
> - CPUFREQ_IS_COOLING_DEV,
> + CPUFREQ_IS_COOLING_DEV |
> + CPUFREQ_REGISTER_WITH_EM,
> .verify = cpufreq_generic_frequency_table_verify,
> .target_index = set_target,
> .get = cpufreq_generic_get,
>

Reviewed-by: Lukasz Luba <[email protected]>

2021-08-10 14:23:00

by Quentin Perret

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On Tuesday 10 Aug 2021 at 14:25:15 (+0100), Lukasz Luba wrote:
> The way I see this is that the flag in cpufreq avoids
> mistakes potentially made by the driver developer. It will automatically
> register the *simple* EM model via dev_pm_opp_of_register_em() on behalf
> of drivers (which is already done manually by drivers). The developer
> would just set the flag similarly to CPUFREQ_IS_COOLING_DEV and be sure
> it will register at the right time. A well-tested flag approach should be
> safer and easier to understand and maintain.

I would agree with all that if calling dev_pm_opp_of_register_em() was
complicated, but that is not really the case. I don't think we ever call
PM_OPP directly from cpufreq core ATM, which makes a lot of sense if you
consider PM_OPP arch-specific. I could understand that we might accept a
little 'violation' of the abstraction with this series if there were
real benefits, but I just don't see them.

2021-08-10 15:58:02

by Lukasz Luba

Subject: Re: [PATCH 1/8] cpufreq: Auto-register with energy model if asked



On 8/10/21 8:36 AM, Viresh Kumar wrote:
> Many cpufreq drivers register with the energy model for each policy and
> do exactly the same thing. Follow in the footsteps of thermal-cooling and
> get it done from the cpufreq core itself.
>
> Provide a cpufreq driver flag so drivers can ask the cpufreq core to
> register with the EM core on their behalf. This allows us to get rid of
> duplicated code in the drivers.
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/cpufreq/cpufreq.c | 9 +++++++++
> include/linux/cpufreq.h | 6 ++++++
> 2 files changed, 15 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 06c526d66dd3..a060dc2aa2f2 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -23,6 +23,7 @@
> #include <linux/kernel_stat.h>
> #include <linux/module.h>
> #include <linux/mutex.h>
> +#include <linux/pm_opp.h>
> #include <linux/pm_qos.h>
> #include <linux/slab.h>
> #include <linux/suspend.h>
> @@ -1511,6 +1512,11 @@ static int cpufreq_online(unsigned int cpu)
> if (cpufreq_thermal_control_enabled(cpufreq_driver))
> policy->cdev = of_cpufreq_cooling_register(policy);

The of_cpufreq_cooling_register() should be called after the EM
is present for the CPU device. When you check that function,
you will see that we call
em_cpu_get(policy->cpu)
to get the EM pointer. Otherwise IPA might fail.

>
> + if (cpufreq_driver->flags & CPUFREQ_REGISTER_WITH_EM) {
> + dev_pm_opp_of_register_em(get_cpu_device(policy->cpu),
> + policy->related_cpus);
> + }
> +

So please move this new code above the thermal registration, a few lines
up.

> pr_debug("initialization complete\n");
>
> return 0;
> @@ -1602,6 +1608,9 @@ static int cpufreq_offline(unsigned int cpu)
> goto unlock;
> }
>
> + if (cpufreq_driver->flags & CPUFREQ_REGISTER_WITH_EM)
> + dev_pm_opp_of_unregister_em(get_cpu_device(cpu));
> +

Here is a similar situation. Move the EM unregistration after the thermal
unregistration is done. For consistency it's OK, though the real EM struct
won't be freed for CPUs (due to scheduler reasons).
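
The ordering asked for in the two comments above can be sketched with a small
user-space model. It assumes, as described above, that the cooling
registration looks up the EM at the time it runs. All model_* names are
hypothetical and this is not the kernel implementation; the model clears
em_present on unregistration purely to show the ordering, whereas the real EM
struct is not freed for CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy state for the model. */
struct model_policy {
    bool em_present;       /* EM registered for the CPU device */
    bool cdev_registered;  /* cooling device registered */
    bool cdev_has_em;      /* cooling device found an EM when it registered */
};

static void model_em_register(struct model_policy *p)
{
    p->em_present = true;
}

static void model_em_unregister(struct model_policy *p)
{
    p->em_present = false;
}

/* Stands in for of_cpufreq_cooling_register(): samples the EM once. */
static void model_cooling_register(struct model_policy *p)
{
    p->cdev_registered = true;
    p->cdev_has_em = p->em_present;
}

static void model_cooling_unregister(struct model_policy *p)
{
    p->cdev_registered = false;
    p->cdev_has_em = false;
}

/* Requested order: EM first on the way up, EM last on the way down. */
static void model_online(struct model_policy *p)
{
    assert(p != NULL);
    model_em_register(p);
    model_cooling_register(p);
}

static void model_offline(struct model_policy *p)
{
    assert(p != NULL);
    model_cooling_unregister(p);
    model_em_unregister(p);
}
```

Calling model_cooling_register() before model_em_register() in this model
leaves cdev_has_em false, which corresponds to the case where IPA might fail.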

> if (cpufreq_thermal_control_enabled(cpufreq_driver)) {
> cpufreq_cooling_unregister(policy->cdev);
> policy->cdev = NULL;
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 9fd719475fcd..f11723cd4cca 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -424,6 +424,12 @@ struct cpufreq_driver {
> */
> #define CPUFREQ_NO_AUTO_DYNAMIC_SWITCHING BIT(6)
>
> +/*
> + * Set by drivers that want the core to automatically register the CPU device
> + * with Energy Model.
> + */
> +#define CPUFREQ_REGISTER_WITH_EM BIT(7)
> +
> int cpufreq_register_driver(struct cpufreq_driver *driver_data);
> int cpufreq_unregister_driver(struct cpufreq_driver *driver_data);
>
>

2021-08-10 16:20:05

by Lukasz Luba

Subject: Re: [PATCH 8/8] cpufreq: vexpress: Use auto-registration for energy model



On 8/10/21 8:36 AM, Viresh Kumar wrote:
> Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
> automatically register with the energy model.
>
> This allows removal of boilerplate code from the driver and fixes the
> unregistration part as well.
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/cpufreq/vexpress-spc-cpufreq.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/vexpress-spc-cpufreq.c b/drivers/cpufreq/vexpress-spc-cpufreq.c
> index 51dfa9ae6cf5..28c4c3254337 100644
> --- a/drivers/cpufreq/vexpress-spc-cpufreq.c
> +++ b/drivers/cpufreq/vexpress-spc-cpufreq.c
> @@ -442,8 +442,6 @@ static int ve_spc_cpufreq_init(struct cpufreq_policy *policy)
> policy->freq_table = freq_table[cur_cluster];
> policy->cpuinfo.transition_latency = 1000000; /* 1 ms */
>
> - dev_pm_opp_of_register_em(cpu_dev, policy->cpus);
> -
> if (is_bL_switching_enabled())
> per_cpu(cpu_last_req_freq, policy->cpu) =
> clk_get_cpu_rate(policy->cpu);
> @@ -487,7 +485,8 @@ static void ve_spc_cpufreq_ready(struct cpufreq_policy *policy)
> static struct cpufreq_driver ve_spc_cpufreq_driver = {
> .name = "vexpress-spc",
> .flags = CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
> - CPUFREQ_NEED_INITIAL_FREQ_CHECK,
> + CPUFREQ_NEED_INITIAL_FREQ_CHECK |
> + CPUFREQ_REGISTER_WITH_EM,
> .verify = cpufreq_generic_frequency_table_verify,
> .target_index = ve_spc_cpufreq_set_target,
> .get = ve_spc_cpufreq_get_rate,
>

I can see that this driver explicitly calls
of_cpufreq_cooling_register().
It does this in its cpufreq_driver->ready() callback
implementation: ve_spc_cpufreq_ready().

With that in mind, the new code in patch 1/8, which
registers the EM, should be called even earlier, above:
---------------------8<---------------------------------
/* Callback for handling stuff after policy is ready */
if (cpufreq_driver->ready)
cpufreq_driver->ready(policy);
------------------->8----------------------------------

This also raises a question:
Can this new flag be set in a cpufreq driver which hasn't set
CPUFREQ_IS_COOLING_DEV?
I can only see one driver (the one in this patch) which has such a
configuration.

2021-08-10 16:22:00

by Lukasz Luba

Subject: Re: [PATCH 6/8] cpufreq: qcom-cpufreq-hw: Use auto-registration for energy model



On 8/10/21 8:36 AM, Viresh Kumar wrote:
> Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
> automatically register with the energy model.
>
> This allows removal of boiler plate code from the driver and fixes the
> unregistration part as well.
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/cpufreq/qcom-cpufreq-hw.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index f86859bf76f1..221433c6dcb0 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -362,8 +362,6 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
> goto error;
> }
>
> - dev_pm_opp_of_register_em(cpu_dev, policy->cpus);
> -
> if (policy_has_boost_freq(policy)) {
> ret = cpufreq_enable_boost_support();
> if (ret)
> @@ -406,7 +404,8 @@ static struct freq_attr *qcom_cpufreq_hw_attr[] = {
> static struct cpufreq_driver cpufreq_qcom_hw_driver = {
> .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
> CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
> - CPUFREQ_IS_COOLING_DEV,
> + CPUFREQ_IS_COOLING_DEV |
> + CPUFREQ_REGISTER_WITH_EM,
> .verify = cpufreq_generic_frequency_table_verify,
> .target_index = qcom_cpufreq_hw_target_index,
> .get = qcom_cpufreq_hw_get,
>

Reviewed-by: Lukasz Luba <[email protected]>

2021-08-10 16:30:35

by Quentin Perret

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On Tuesday 10 Aug 2021 at 13:06:47 (+0530), Viresh Kumar wrote:
> Provide a cpufreq driver flag so drivers can ask the cpufreq core to register
> with the EM core on their behalf.

Hmm, that's not quite what this does. This asks the cpufreq core to
use *PM_OPP* to register an EM, which I think is kinda wrong to do from
there IMO. The decision to use PM_OPP or another mechanism to register
an EM belongs to platform specific code (drivers), so it is odd for the
PM_OPP registration to have its own cpufreq flag but not the other ways.

As mentioned in another thread, the very reason to have PM_EM is to not
depend on PM_OPP, so I'm worried about the direction of travel with this
series TBH.

> This allows us to get rid of duplicated code
> in the drivers and fix the unregistration part as well, which none of the
> drivers have done until now.

This series adds more code than it removes, and the unregistration is
not a fix as we don't ever remove the EM tables by design, so not sure
either of these points are valid arguments.

> This would also make the registration with EM core to happen only after policy
> is fully initialized, and the EM core can do other stuff from in there, like
> marking frequencies as inefficient (WIP). Though this patchset is useful without
> that work being done and should be merged nevertheless.
>
> This doesn't update scmi cpufreq driver for now as it is a special case and need
> to be handled differently. Though we can make it work with this if required.

Note that we'll have more 'special cases' if other architectures start
using PM_EM, which is what we have been trying to allow since the
beginning, so that's worth keeping in mind.

Thanks,
Quentin

2021-08-10 16:33:06

by Lukasz Luba

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model



On 8/10/21 1:35 PM, Quentin Perret wrote:
> On Tuesday 10 Aug 2021 at 13:06:47 (+0530), Viresh Kumar wrote:
>> Provide a cpufreq driver flag so drivers can ask the cpufreq core to register
>> with the EM core on their behalf.
>
> Hmm, that's not quite what this does. This asks the cpufreq core to
> use *PM_OPP* to register an EM, which I think is kinda wrong to do from
> there IMO. The decision to use PM_OPP or another mechanism to register
> an EM belongs to platform specific code (drivers), so it is odd for the
> PM_OPP registration to have its own cpufreq flag but not the other ways.
>
> As mentioned in another thread, the very reason to have PM_EM is to not
> depend on PM_OPP, so I'm worried about the direction of travel with this
> series TBH.
>
>> This allows us to get rid of duplicated code
>> in the drivers and fix the unregistration part as well, which none of the
>> drivers have done until now.
>
> This series adds more code than it removes, and the unregistration is
> not a fix as we don't ever remove the EM tables by design, so not sure
> either of these points are valid arguments.
>
>> This would also make the registration with EM core to happen only after policy
>> is fully initialized, and the EM core can do other stuff from in there, like
>> marking frequencies as inefficient (WIP). Though this patchset is useful without
>> that work being done and should be merged nevertheless.
>>
>> This doesn't update scmi cpufreq driver for now as it is a special case and need
>> to be handled differently. Though we can make it work with this if required.
>
> Note that we'll have more 'special cases' if other architectures start
> using PM_EM, which is what we have been trying to allow since the
> beginning, so that's worth keeping in mind.
>

The way I see this is that the flag in cpufreq avoids mistakes
potentially made by a driver developer. It will automatically
register the *simple* EM model via dev_pm_opp_of_register_em() on behalf
of drivers (which drivers already do manually). The developer
would just set the flag, similarly to CPUFREQ_IS_COOLING_DEV, and be sure
that the EM is registered at the right time. A well-tested flag approach
should be safer and easier to understand and maintain.

If there is a need for an *advanced* EM model, the driver developer would
have to care about all these things (order, setup-ready-structures,
fw channels, freeing, etc.) while developing custom registration.
The developer won't set this flag in such a case, so the core won't
try to auto-register the EM for that driver.
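
The opt-in behaviour described above can be sketched in userspace C. This
is only an illustration: the mock_* names are hypothetical, bit 7 mirrors
the value proposed in patch 1/8, and the cooling bit value is arbitrary.

```c
#define MOCK_BIT(n)		(1UL << (n))

/* Bit 7 follows the patch; the cooling bit is an arbitrary placeholder. */
#define MOCK_IS_COOLING_DEV	MOCK_BIT(2)
#define MOCK_REGISTER_WITH_EM	MOCK_BIT(7)

struct mock_driver {
	unsigned long flags;
};

struct mock_result {
	int em_registered;
	int cooling_registered;
};

/* Stand-in for the core: acts only on what the driver opted into. */
struct mock_result mock_flags_online(const struct mock_driver *drv)
{
	struct mock_result res = { 0, 0 };

	/* EM first, so that a cooling device registered next can find it. */
	if (drv->flags & MOCK_REGISTER_WITH_EM)
		res.em_registered = 1;

	if (drv->flags & MOCK_IS_COOLING_DEV)
		res.cooling_registered = 1;

	return res;
}

/* Convenience wrapper: does the core auto-register an EM for these flags? */
int mock_em_registered_for(unsigned long flags)
{
	struct mock_driver drv = { flags };

	return mock_flags_online(&drv).em_registered;
}
```

A driver with a custom registration path simply leaves
MOCK_REGISTER_WITH_EM clear, and the mock core then leaves the EM alone.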

I don't see the dependency of PM_EM on PM_OPP in this series.

2021-08-10 16:42:43

by Lukasz Luba

Subject: Re: [PATCH 1/8] cpufreq: Auto-register with energy model if asked



On 8/10/21 10:38 AM, Viresh Kumar wrote:
> On 10-08-21, 10:36, Lukasz Luba wrote:
>> The of_cpufreq_cooling_register() should be called after the EM
>> is present for the CPU device. When you check that function,
>> you will see that we call
>> em_cpu_get(policy->cpu)
>> to get the EM pointer. Otherwise IPA might fail.
>
> Good point.
>

In another patch set I had a discussion with Quentin and I've checked
the Performance Domains setup code. There is code which triggers
rebuilding of the perf domains with the EM from the governor. We cannot
call EM registration so late in cpufreq_online(), i.e. not after the
cpufreq_init_policy() call.

So dev_pm_opp_of_register_em() must be called before
the policy is initialized. I'm not sure if you would still like
to push this patch set forward in this case.

2021-08-10 17:23:10

by Lukasz Luba

Subject: Re: [PATCH 4/8] cpufreq: mediatek: Use auto-registration for energy model



On 8/10/21 8:36 AM, Viresh Kumar wrote:
> Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
> automatically register with the energy model.
>
> This allows removal of boiler plate code from the driver and fixes the
> unregistration part as well.
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/cpufreq/mediatek-cpufreq.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
> index 87019d5a9547..4743f2e58b97 100644
> --- a/drivers/cpufreq/mediatek-cpufreq.c
> +++ b/drivers/cpufreq/mediatek-cpufreq.c
> @@ -448,8 +448,6 @@ static int mtk_cpufreq_init(struct cpufreq_policy *policy)
> policy->driver_data = info;
> policy->clk = info->cpu_clk;
>
> - dev_pm_opp_of_register_em(info->cpu_dev, policy->cpus);
> -
> return 0;
> }
>
> @@ -465,7 +463,8 @@ static int mtk_cpufreq_exit(struct cpufreq_policy *policy)
> static struct cpufreq_driver mtk_cpufreq_driver = {
> .flags = CPUFREQ_NEED_INITIAL_FREQ_CHECK |
> CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
> - CPUFREQ_IS_COOLING_DEV,
> + CPUFREQ_IS_COOLING_DEV |
> + CPUFREQ_REGISTER_WITH_EM,
> .verify = cpufreq_generic_frequency_table_verify,
> .target_index = mtk_cpufreq_set_target,
> .get = cpufreq_generic_get,
>

Reviewed-by: Lukasz Luba <[email protected]>

2021-08-10 17:23:19

by Lukasz Luba

Subject: Re: [PATCH 8/8] cpufreq: vexpress: Use auto-registration for energy model



On 8/10/21 8:36 AM, Viresh Kumar wrote:
> Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
> automatically register with the energy model.
>
> This allows removal of boiler plate code from the driver and fixes the
> unregistration part as well.
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/cpufreq/vexpress-spc-cpufreq.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/vexpress-spc-cpufreq.c b/drivers/cpufreq/vexpress-spc-cpufreq.c
> index 51dfa9ae6cf5..28c4c3254337 100644
> --- a/drivers/cpufreq/vexpress-spc-cpufreq.c
> +++ b/drivers/cpufreq/vexpress-spc-cpufreq.c
> @@ -442,8 +442,6 @@ static int ve_spc_cpufreq_init(struct cpufreq_policy *policy)
> policy->freq_table = freq_table[cur_cluster];
> policy->cpuinfo.transition_latency = 1000000; /* 1 ms */
>
> - dev_pm_opp_of_register_em(cpu_dev, policy->cpus);
> -
> if (is_bL_switching_enabled())
> per_cpu(cpu_last_req_freq, policy->cpu) =
> clk_get_cpu_rate(policy->cpu);
> @@ -487,7 +485,8 @@ static void ve_spc_cpufreq_ready(struct cpufreq_policy *policy)
> static struct cpufreq_driver ve_spc_cpufreq_driver = {
> .name = "vexpress-spc",
> .flags = CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
> - CPUFREQ_NEED_INITIAL_FREQ_CHECK,
> + CPUFREQ_NEED_INITIAL_FREQ_CHECK |
> + CPUFREQ_REGISTER_WITH_EM,
> .verify = cpufreq_generic_frequency_table_verify,
> .target_index = ve_spc_cpufreq_set_target,
> .get = ve_spc_cpufreq_get_rate,
>

With the change to patch 1/8 that we discussed under this patch 8/8,
it LGTM.

Reviewed-by: Lukasz Luba <[email protected]>

2021-08-10 17:23:20

by Lukasz Luba

Subject: Re: [PATCH 7/8] cpufreq: scpi: Use auto-registration for energy model



On 8/10/21 8:36 AM, Viresh Kumar wrote:
> Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
> automatically register with the energy model.
>
> This allows removal of boiler plate code from the driver and fixes the
> unregistration part as well.
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/cpufreq/scpi-cpufreq.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
> index d6a698a1b5d1..bc8c62b1beb5 100644
> --- a/drivers/cpufreq/scpi-cpufreq.c
> +++ b/drivers/cpufreq/scpi-cpufreq.c
> @@ -163,8 +163,6 @@ static int scpi_cpufreq_init(struct cpufreq_policy *policy)
>
> policy->fast_switch_possible = false;
>
> - dev_pm_opp_of_register_em(cpu_dev, policy->cpus);
> -
> return 0;
>
> out_free_cpufreq_table:
> @@ -193,7 +191,8 @@ static struct cpufreq_driver scpi_cpufreq_driver = {
> .name = "scpi-cpufreq",
> .flags = CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
> CPUFREQ_NEED_INITIAL_FREQ_CHECK |
> - CPUFREQ_IS_COOLING_DEV,
> + CPUFREQ_IS_COOLING_DEV |
> + CPUFREQ_REGISTER_WITH_EM,
> .verify = cpufreq_generic_frequency_table_verify,
> .attr = cpufreq_generic_attr,
> .get = scpi_cpufreq_get_rate,
>

Reviewed-by: Lukasz Luba <[email protected]>

2021-08-11 02:44:11

by Sudeep Holla

Subject: Re: [PATCH 8/8] cpufreq: vexpress: Use auto-registration for energy model

On Tue, Aug 10, 2021 at 01:06:55PM +0530, Viresh Kumar wrote:
> Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
> automatically register with the energy model.
>
> This allows removal of boiler plate code from the driver and fixes the
> unregistration part as well.
>

Acked-by: Sudeep Holla <[email protected]>

--
Regards,
Sudeep

2021-08-11 02:46:38

by Sudeep Holla

Subject: Re: [PATCH 7/8] cpufreq: scpi: Use auto-registration for energy model

On Tue, Aug 10, 2021 at 01:06:54PM +0530, Viresh Kumar wrote:
> Use the CPUFREQ_REGISTER_WITH_EM flag to allow cpufreq core to
> automatically register with the energy model.
>
> This allows removal of boiler plate code from the driver and fixes the
> unregistration part as well.
>

Acked-by: Sudeep Holla <[email protected]>

--
Regards,
Sudeep

2021-08-11 05:21:26

by Viresh Kumar

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On 10-08-21, 13:35, Quentin Perret wrote:
> On Tuesday 10 Aug 2021 at 13:06:47 (+0530), Viresh Kumar wrote:
> > Provide a cpufreq driver flag so drivers can ask the cpufreq core to register
> > with the EM core on their behalf.
>
> Hmm, that's not quite what this does. This asks the cpufreq core to
> use *PM_OPP* to register an EM, which I think is kinda wrong to do from
> there IMO. The decision to use PM_OPP or another mechanism to register
> an EM belongs to platform specific code (drivers), so it is odd for the
> PM_OPP registration to have its own cpufreq flag but not the other ways.
>
> As mentioned in another thread, the very reason to have PM_EM is to not
> depend on PM_OPP, so I'm worried about the direction of travel with this
> series TBH.

I had to use the pm-opp version, since almost everyone was using that.

On the other hand, there isn't a lot of OPP specific stuff in
dev_pm_opp_of_register_em(). It just uses dev_pm_opp_get_opp_count(),
that's all. This ended up in the OPP core, nothing else. Maybe we can
now move it back to the EM core and name it differently ?

> > This allows us to get rid of duplicated code
> > in the drivers and fix the unregistration part as well, which none of the
> > drivers have done until now.
>
> This series adds more code than it removes,

Sadly yes :(

> and the unregistration is
> not a fix as we don't ever remove the EM tables by design, so not sure
> either of these points are valid arguments.

I think that design needs to be looked over again, it looks broken to
me every time I land on this code. I wonder why we don't unregister
stuff.

Let's say I am working on a cpufreq driver and I want to test it
on my ARM machine. Rebooting a simpler board to test stuff out is
easy, but if I am working on an ARM server which is running lots of
other userspace stuff as well, I won't want to reboot the machine just
to test different versions of the driver. I would rather
build the driver as a module and insert/remove it again and again.

If the frequency table changes in between versions, this just breaks
as EM won't be updated again.

This breaks one of the most basic rules of Linux Kernel. Inserting a
module should have exactly the same final behavior every single time.
This model doesn't guarantee it. It simply looks broken.
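
The reload scenario can be modelled with a small userspace sketch. It
assumes, for illustration only, that a CPU's EM is registered once, that a
later registration attempt is refused, and that unregistration keeps the
table; all mock_* names and frequency values are hypothetical.

```c
#include <stddef.h>

#define MOCK_MAX_STATES 4

/* Hypothetical stand-in for a per-CPU EM that outlives the driver. */
struct mock_em {
	int registered;
	size_t nr_states;
	unsigned long freq_khz[MOCK_MAX_STATES];
};

/* Registers a table once; later attempts are refused and change nothing. */
int mock_em_register(struct mock_em *em, const unsigned long *freq_khz,
		     size_t n)
{
	size_t i;

	if (em->registered)
		return -1;
	if (n > MOCK_MAX_STATES)
		n = MOCK_MAX_STATES;
	for (i = 0; i < n; i++)
		em->freq_khz[i] = freq_khz[i];
	em->nr_states = n;
	em->registered = 1;
	return 0;
}

/* Unregistration is a no-op for CPUs in this model: the table is kept. */
void mock_em_unregister(struct mock_em *em)
{
	(void)em;
}

/* Loads driver v1, unloads it, loads v2; returns the EM's first frequency. */
unsigned long mock_reload_first_freq(void)
{
	static const unsigned long v1[] = { 500000, 1000000 };
	static const unsigned long v2[] = { 600000, 1200000 };
	struct mock_em em = { 0, 0, { 0 } };

	mock_em_register(&em, v1, 2);	/* insert driver v1 */
	mock_em_unregister(&em);	/* remove driver v1 */
	mock_em_register(&em, v2, 2);	/* insert driver v2: refused */
	return em.freq_khz[0];
}
```

Under these assumptions the model still reports v1's first frequency after
v2 has been loaded, which is the stale-table situation described above.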

> > This would also make the registration with EM core to happen only after policy
> > is fully initialized, and the EM core can do other stuff from in there, like
> > marking frequencies as inefficient (WIP). Though this patchset is useful without
> > that work being done and should be merged nevertheless.
> >
> > This doesn't update scmi cpufreq driver for now as it is a special case and need
> > to be handled differently. Though we can make it work with this if required.
>
> Note that we'll have more 'special cases' if other architectures start
> using PM_EM, which is what we have been trying to allow since the
> beginning, so that's worth keeping in mind.

Yes, we need to take care of all such special cases as well.

--
viresh

2021-08-11 05:38:07

by Viresh Kumar

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On 11-08-21, 10:48, Viresh Kumar wrote:
> On 10-08-21, 13:35, Quentin Perret wrote:
> > This series adds more code than it removes,
>
> Sadly yes :(
>
> > and the unregistration is
> > not a fix as we don't ever remove the EM tables by design, so not sure
> > either of these points are valid arguments.
>
> I think that design needs to be looked over again, it looks broken to
> me everytime I land onto this code. I wonder why we don't unregister
> stuff.

Coming back to this series. We have two options, based on what I
proposed here:

https://lore.kernel.org/linux-pm/20210811050327.3yxrk4kqxjjwaztx@vireshk-i7/

1. Let cpufreq core register with EM on behalf of cpufreq drivers.

2. Update drivers to use ->ready() callback to do this stuff.

I am fine with both :)

--
viresh

2021-08-11 08:41:38

by Quentin Perret

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On Wednesday 11 Aug 2021 at 10:48:59 (+0530), Viresh Kumar wrote:
> On 10-08-21, 13:35, Quentin Perret wrote:
> > On Tuesday 10 Aug 2021 at 13:06:47 (+0530), Viresh Kumar wrote:
> > > Provide a cpufreq driver flag so drivers can ask the cpufreq core to register
> > > with the EM core on their behalf.
> >
> > Hmm, that's not quite what this does. This asks the cpufreq core to
> > use *PM_OPP* to register an EM, which I think is kinda wrong to do from
> > there IMO. The decision to use PM_OPP or another mechanism to register
> > an EM belongs to platform specific code (drivers), so it is odd for the
> > PM_OPP registration to have its own cpufreq flag but not the other ways.
> >
> > As mentioned in another thread, the very reason to have PM_EM is to not
> > depend on PM_OPP, so I'm worried about the direction of travel with this
> > series TBH.
>
> I had to use the pm-opp version, since almost everyone was using that.
>
> On the other hand, there isn't a lot of OPP specific stuff in
> dev_pm_opp_of_register_em(). It just uses dev_pm_opp_get_opp_count(),
> that's all. This ended up in the OPP core, nothing else. Maybe we can
> now move it back to the EM core and name it differently ?

Well it also uses dev_pm_opp_find_freq_ceil() and
dev_pm_opp_get_voltage(), so not sure how easy it will be to move, but
if it is possible no objection from me.

> > > This allows us to get rid of duplicated code
> > > in the drivers and fix the unregistration part as well, which none of the
> > > drivers have done until now.
> >
> > This series adds more code than it removes,
>
> Sadly yes :(
>
> > and the unregistration is
> > not a fix as we don't ever remove the EM tables by design, so not sure
> > either of these points are valid arguments.
>
> I think that design needs to be looked over again, it looks broken to
> me everytime I land onto this code. I wonder why we don't unregister
> stuff.
>
> Lets say, I am working on the cpufreq driver and I want to test that
> on my ARM machine. Rebooting a simpler board to test stuff out is
> easy, but if I am working on an ARM server which is running lots of
> other userspace stuff as well, I won't want to reboot the machine just
> to test a different versions of the driver. I will rather want to
> build the driver as module and insert/remove it again and again.
>
> If the frequency table changes in between versions, this just breaks
> as EM won't be updated again.
>
> This breaks one of the most basic rules of Linux Kernel. Inserting a
> module should have exactly the same final behavior every single time.
> This model doesn't guarantee it. It simply looks broken.

Right but the EM is a description of the hardware, so it seemed fair
to assume this wouldn't change across the lifetime of the OS, similar
to the DT which we can't reload at run-time. Yes it can be a little odd
if you load/unload your driver module, but note that you generally can't
load two completely different drivers on a single system. You'll just
load the same one again and the hardware hasn't changed in the meantime,
so the previously loaded EM will still be correct. I hear your argument
about cpufreq driver development, but the locking involved to allow
'just' that is pretty involved, and nobody has complained about this
specific issue so far, so that didn't seem worth it. If we do have good
reasons to change the EM at runtime, then yes I think we should do it,
it just didn't seem like that was the case until now.

2021-08-11 09:14:25

by Viresh Kumar

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On 11-08-21, 09:37, Quentin Perret wrote:
> On Wednesday 11 Aug 2021 at 10:48:59 (+0530), Viresh Kumar wrote:
> > I had to use the pm-opp version, since almost everyone was using that.
> >
> > On the other hand, there isn't a lot of OPP specific stuff in
> > dev_pm_opp_of_register_em(). It just uses dev_pm_opp_get_opp_count(),
> > that's all. This ended up in the OPP core, nothing else. Maybe we can
> > now move it back to the EM core and name it differently ?
>
> Well it also uses dev_pm_opp_find_freq_ceil() and
> dev_pm_opp_get_voltage(), so not sure how easy it will be to move, but
> if it is possible no objection from me.

What uses these routines ? dev_pm_opp_of_register_em() ? I am not able
to see that at least :(

> Right but the EM is a description of the hardware, so it seemed fair
> to assume this wouldn't change across the lifetime of the OS, similar
> to the DT which we can't reload at run-time. Yes it can be a little odd
> if you load/unload your driver module, but note that you generally can't
> load two completely different drivers on a single system. You'll just
> load the same one again and the hardware hasn't changed in the meantime,
> so the previously loaded EM will still be correct.

Yeah, it will be the same driver but a different version of it, which
may have updated the freq table. For me the EM is attached to the
freq-table, and the freq-table is not available anymore after the
driver is gone.

Anyway, I will leave that for you guys to decide :)

> I hear your argument
> about cpufreq driver development, but the locking involved to allow
> 'just' that is pretty involved, and nobody has complained about this
> specific issue so far, so that didn't seem worth it. If we do have good
> reasons to change the EM at runtime, then yes I think we should do it,
> it just didn't seem like that was the case until now.

--
viresh

2021-08-11 09:36:27

by Quentin Perret

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On Wednesday 11 Aug 2021 at 14:43:21 (+0530), Viresh Kumar wrote:
> On 11-08-21, 09:37, Quentin Perret wrote:
> > On Wednesday 11 Aug 2021 at 10:48:59 (+0530), Viresh Kumar wrote:
> > > I had to use the pm-opp version, since almost everyone was using that.
> > >
> > > On the other hand, there isn't a lot of OPP specific stuff in
> > > dev_pm_opp_of_register_em(). It just uses dev_pm_opp_get_opp_count(),
> > > that's all. This ended up in the OPP core, nothing else. Maybe we can
> > > now move it back to the EM core and name it differently ?
> >
> > Well it also uses dev_pm_opp_find_freq_ceil() and
> > dev_pm_opp_get_voltage(), so not sure how easy it will be to move, but
> > if it is possible no objection from me.
>
> What uses these routines ? dev_pm_opp_of_register_em() ? I am not able
> to see that at least :(

Yep, it's not immediately obvious, but see how it sets the struct
em_data_callback to point at _get_power(), where the actual energy
calculation is done. So strictly speaking _get_power() is what uses
these routines, but it goes hand in hand with
dev_pm_opp_of_register_em(), so I guess the same reasoning applies.
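
A simplified userspace sketch of that callback relationship follows. All
mock_* names are hypothetical, and the power formula (coefficient * mV^2 *
MHz, scaled down to mW) is an assumption used for illustration; the
kernel's _get_power() may differ in detail.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for an OPP entry and a device with an OPP table. */
struct mock_opp {
	unsigned long freq_hz;
	unsigned long volt_uv;
};

struct mock_dev {
	const struct mock_opp *opps;	/* sorted by ascending frequency */
	size_t nr_opps;
	uint32_t cap;			/* assumed dynamic power coefficient */
};

/* Stand-in for struct em_data_callback: the EM side only sees this hook. */
struct mock_em_callback {
	int (*active_power)(const struct mock_dev *dev,
			    unsigned long *mw, unsigned long *khz);
};

/* Lowest OPP at or above hz, in the spirit of dev_pm_opp_find_freq_ceil(). */
static const struct mock_opp *mock_find_freq_ceil(const struct mock_dev *dev,
						  unsigned long hz)
{
	size_t i;

	for (i = 0; i < dev->nr_opps; i++)
		if (dev->opps[i].freq_hz >= hz)
			return &dev->opps[i];
	return NULL;
}

/* The OPP-backed callback: this is where the OPP lookups actually happen. */
int mock_get_power(const struct mock_dev *dev,
		   unsigned long *mw, unsigned long *khz)
{
	const struct mock_opp *opp = mock_find_freq_ceil(dev, *khz * 1000UL);
	uint64_t mv, tmp;

	if (!opp)
		return -1;

	mv = opp->volt_uv / 1000;
	tmp = (uint64_t)dev->cap * mv * mv * (opp->freq_hz / 1000000);
	*mw = (unsigned long)(tmp / 1000000000ULL);
	*khz = opp->freq_hz / 1000;	/* report the frequency actually used */
	return 0;
}

/* Stand-in for EM registration: fills the table purely via the callback. */
int mock_em_register(const struct mock_dev *dev,
		     const struct mock_em_callback *cb,
		     unsigned long *power_mw, size_t nr_states)
{
	unsigned long khz = 0;
	size_t i;

	/* khz++ asks for the next OPP above the one just reported. */
	for (i = 0; i < nr_states; i++, khz++)
		if (cb->active_power(dev, &power_mw[i], &khz))
			return -1;
	return 0;
}

/* Builds a two-OPP device and returns the power computed for state idx. */
unsigned long mock_demo_power(size_t idx)
{
	static const struct mock_opp opps[] = {
		{  500000000UL,  900000UL },
		{ 1000000000UL, 1000000UL },
	};
	const struct mock_dev dev = { opps, 2, 100 };
	const struct mock_em_callback cb = { mock_get_power };
	unsigned long power_mw[2] = { 0, 0 };

	if (idx >= 2 || mock_em_register(&dev, &cb, power_mw, 2))
		return 0;
	return power_mw[idx];
}
```

The registration function never touches the OPP table itself; only the
callback does, which is why moving the registration helper out of the OPP
code would also mean moving or replacing the callback.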

> > Right but the EM is a description of the hardware, so it seemed fair
> > to assume this wouldn't change across the lifetime of the OS, similar
> > to the DT which we can't reload at run-time. Yes it can be a little odd
> > if you load/unload your driver module, but note that you generally can't
> > load two completely different drivers on a single system. You'll just
> > load the same one again and the hardware hasn't changed in the meantime,
> > so the previously loaded EM will still be correct.
>
> Yeah, it will be the same driver but a different version of it, which
> may have updated the freq table. For me the EM is attached to the
> freq-table, and the freq-table is not available anymore after the
> driver is gone.
>
> Anyway, I will leave that for you guys to decide :)

IIUC Lukasz is working on something that should allow changing the EM at
run-time, so hopefully it'll enable this use-case as well, but we'll see :)

2021-08-11 09:38:41

by Viresh Kumar

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On 11-08-21, 10:34, Quentin Perret wrote:
> Yep, it's not immediately obvious, but see how it sets the struct
> em_data_callback to point at _get_power() where the actual energy
> calculation is done. So strictly speaking _get_power() is what uses
> these routines, but it goes in hand with dev_pm_opp_of_register_em() so
> I guess the same reasoning applies.

My bad.

--
viresh

2021-08-11 09:52:55

by Quentin Perret

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On Wednesday 11 Aug 2021 at 11:04:06 (+0530), Viresh Kumar wrote:
> On 11-08-21, 10:48, Viresh Kumar wrote:
> > On 10-08-21, 13:35, Quentin Perret wrote:
> > > This series adds more code than it removes,
> >
> > Sadly yes :(
> >
> > > and the unregistration is
> > > not a fix as we don't ever remove the EM tables by design, so not sure
> > > either of these points are valid arguments.
> >
> > I think that design needs to be looked over again, it looks broken to
> > me everytime I land onto this code. I wonder why we don't unregister
> > stuff.
>
> Coming back to this series. We have two options, based on what I
> proposed here:
>
> https://lore.kernel.org/linux-pm/20210811050327.3yxrk4kqxjjwaztx@vireshk-i7/
>
> 1. Let cpufreq core register with EM on behalf of cpufreq drivers.

If we're going that route, I think we should allow _all_ possible
EM registration methods (via PM_OPP or otherwise) to be done that way.
Otherwise we're creating an inconsistency in how the EM is registered
(e.g. from the ->init() cpufreq callback for some, or from cpufreq core
for others) which is problematic as we risk building features that
assume loading is done at a certain time, which won't work for some
platforms.

> 2. Update drivers to use ->ready() callback to do this stuff.

I think this should work, but perhaps will be a bit tricky for cpufreq
driver developers as they need to have a pretty good understanding of
the stack to know that they should do the registration from here and not
->init() for instance. Suggested alternative: we introduce a ->register_em()
callback to cpufreq_driver, and turn dev_pm_opp_of_register_em() into a
valid handler for this callback. This should 'document' things a bit
better, avoid some of the problems your other series tried to address, and
allow us to call the EM registration in exactly the right place from
cpufreq core. On the plus side, we could easily make this work for e.g.
the SCMI driver which would only need to provide its own version of
->register_em().

Thoughts?
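
To make the suggestion concrete, here is a userspace sketch of an optional
register_em hook. Everything in it is hypothetical (struct layout, handler
names, the em_source field); it only shows the core picking the call site
while each driver picks the registration method.

```c
#include <stddef.h>

enum {
	MOCK_EM_NONE = 0,
	MOCK_EM_FROM_OPP = 1,
	MOCK_EM_FROM_FIRMWARE = 2,
};

struct mock_policy {
	int cpu;
	int em_source;	/* records which handler registered the EM */
};

/* Hypothetical driver ops: register_em is optional and may be NULL. */
struct mock_cpufreq_driver {
	const char *name;
	int (*init)(struct mock_policy *policy);
	void (*register_em)(struct mock_policy *policy);
};

/* Generic handler, standing in for an OPP-based registration helper. */
void mock_register_em_with_opp(struct mock_policy *policy)
{
	policy->em_source = MOCK_EM_FROM_OPP;
}

/* Driver-specific handler, e.g. for power data supplied by firmware. */
void mock_fw_register_em(struct mock_policy *policy)
{
	policy->em_source = MOCK_EM_FROM_FIRMWARE;
}

/* Stand-in for the core: it alone decides when registration happens. */
int mock_cpufreq_online(const struct mock_cpufreq_driver *drv,
			struct mock_policy *policy)
{
	int ret = drv->init ? drv->init(policy) : 0;

	if (ret)
		return ret;

	if (drv->register_em)
		drv->register_em(policy);

	/* Governor start and cooling registration would follow here. */
	return 0;
}

/* Brings a policy online with the given hook; returns who registered the EM. */
int mock_em_source_for(void (*register_em)(struct mock_policy *policy))
{
	struct mock_cpufreq_driver drv = { "mock", NULL, register_em };
	struct mock_policy policy = { 0, MOCK_EM_NONE };

	mock_cpufreq_online(&drv, &policy);
	return policy.em_source;
}
```

A driver that does not want an EM leaves the hook NULL, and the mock core
then skips the step entirely.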

2021-08-11 09:55:40

by Viresh Kumar

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On 11-08-21, 10:48, Quentin Perret wrote:
> I think this should work, but perhaps will be a bit tricky for cpufreq
> driver developers as they need to have a pretty good understanding of
> the stack to know that they should do the registration from here and not
> ->init() for instance. Suggested alternative: we introduce a ->register_em()
> callback to cpufreq_driver, and turn dev_pm_opp_of_register_em() into a
> valid handler for this callback. This should 'document' things a bit
> better, avoid some of the problems your other series tried to achieve, and
> allow us to call the EM registration in exactly the right place from
> cpufreq core. On the plus side, we could easily make this work for e.g.
> the SCMI driver which would only need to provide its own version of
> ->register_em().
>
> Thoughts?

I had exactly the same thing in mind, but was thinking of two
callbacks, to register and unregister. But yeah, we aren't going to
unregister for now at least :)

I wasn't sure if that should be done or not, since we also have
ready() callback. So was reluctant to suggest it earlier. But that can
work well as well.

--
viresh

2021-08-11 10:17:08

by Viresh Kumar

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On 11-08-21, 11:12, Quentin Perret wrote:
> I think using the ready() callback can work just fine as long as we
> document clearly it is important to register the EM from there and not
> anywhere else. The dedicated em_register() callback makes that a bit
> clearer and should avoid a bit of boilerplate in the driver, but it's
> not a big deal really, so I'm happy either way ;)

Yeah, I think just the same. It is better to have register_em as a
separate call. I was just wondering if it is the right choice :)

Anyway, I think ready() will get removed pretty soon, so register_em()
will work well. I will redo this series and send it.

--
viresh

2021-08-11 10:23:25

by Quentin Perret

Subject: Re: [PATCH 0/8] cpufreq: Auto-register with energy model

On Wednesday 11 Aug 2021 at 15:23:11 (+0530), Viresh Kumar wrote:
> On 11-08-21, 10:48, Quentin Perret wrote:
> > I think this should work, but perhaps will be a bit tricky for cpufreq
> > driver developers as they need to have a pretty good understanding of
> > the stack to know that they should do the registration from here and not
> > ->init() for instance. Suggested alternative: we introduce a ->register_em()
> > callback to cpufreq_driver, and turn dev_pm_opp_of_register_em() into a
> > valid handler for this callback. This should 'document' things a bit
> > better, avoid some of the problems your other series tried to achieve, and
> > allow us to call the EM registration in exactly the right place from
> > cpufreq core. On the plus side, we could easily make this work for e.g.
> > the SCMI driver which would only need to provide its own version of
> > ->register_em().
> >
> > Thoughts?
>
> I had exactly the same thing in mind, but was thinking of two
> callbacks, to register and unregister. But yeah, we aren't going to
> register for now at least :)

Ack, we probably want both once we unregister things.

> I wasn't sure if that should be done or not, since we also have
> ready() callback. So was reluctant to suggest it earlier. But that can
> work well as well.

I think using the ready() callback can work just fine as long as we
document clearly it is important to register the EM from there and not
anywhere else. The dedicated em_register() callback makes that a bit
clearer and should avoid a bit of boilerplate in the driver, but it's
not a big deal really, so I'm happy either way ;)