2019-09-11 13:47:15

by Quentin Perret

[permalink] [raw]
Subject: [PATCH RESEND v8 0/4] Make IPA use PM_EM

Re-sending this from an email address I can access. For a cover letter,
see:
https://lore.kernel.org/lkml/[email protected]/

Changes in v8:
- Fixed checkpatch errors (Rui)

Changes in v7
- Added patch 02/04 to fix the build error reported by the kbuild bot

Changes in v6
- Added Daniel's and Viresh's Acked-by to all patches

Changes in v5:
- Changed patch 02 to guard IPA-specific code in cpu_cooling.c with
appropriate ifdefery (Daniel)
- Rebased on 5.2-rc2

Changes in v4:
- Added Viresh's Acked-by to all 3 patches
- Improved commit message of patch 3/3 to explain how it has no
functional impact on existing users (Eduardo)

Changes in v3:
- Changed warning message for unordered tables to something more
explicit (Viresh)
- Changed WARN() into a pr_err() for consistency

Changes in v2:
- Fixed patch 01/03 to actually enable CONFIG_ENERGY_MODEL
- Added "depends on ENERGY_MODEL" to IPA (Daniel)
- Added check to bail out if the freq table is unsorted (Viresh)


Quentin Perret (4):
arm64: defconfig: Enable CONFIG_ENERGY_MODEL
PM / EM: Declare EM data types unconditionally
thermal: cpu_cooling: Make the power-related code depend on IPA
thermal: cpu_cooling: Migrate to using the EM framework

arch/arm64/configs/defconfig | 1 +
drivers/thermal/Kconfig | 1 +
drivers/thermal/cpu_cooling.c | 427 ++++++++++++++--------------------
include/linux/energy_model.h | 3 +-
4 files changed, 178 insertions(+), 254 deletions(-)

--
2.22.1


2019-09-11 14:25:43

by Quentin Perret

[permalink] [raw]
Subject: [PATCH v8 3/4] thermal: cpu_cooling: Make the power-related code depend on IPA

From: Quentin Perret <[email protected]>

The core CPU cooling infrastructure has power-related functions
that have only one client: IPA. Since there can be no user of those
functions if IPA is not compiled in, make sure to guard them with
checks on CONFIG_THERMAL_GOV_POWER_ALLOCATOR to not waste space
unnecessarily.

Acked-by: Daniel Lezcano <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Suggested-by: Daniel Lezcano <[email protected]>
Signed-off-by: Quentin Perret <[email protected]>
---
drivers/thermal/cpu_cooling.c | 214 +++++++++++++++++-----------------
1 file changed, 104 insertions(+), 110 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 4c5db59a619b..498f59ab64b2 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -46,7 +46,9 @@
*/
struct freq_table {
u32 frequency;
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
u32 power;
+#endif
};

/**
@@ -96,28 +98,6 @@ static DEFINE_IDA(cpufreq_ida);
static DEFINE_MUTEX(cooling_list_lock);
static LIST_HEAD(cpufreq_cdev_list);

-/* Below code defines functions to be used for cpufreq as cooling device */
-
-/**
- * get_level: Find the level for a particular frequency
- * @cpufreq_cdev: cpufreq_cdev for which the property is required
- * @freq: Frequency
- *
- * Return: level corresponding to the frequency.
- */
-static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
- unsigned int freq)
-{
- struct freq_table *freq_table = cpufreq_cdev->freq_table;
- unsigned long level;
-
- for (level = 1; level <= cpufreq_cdev->max_level; level++)
- if (freq > freq_table[level].frequency)
- break;
-
- return level - 1;
-}
-
/**
* cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
* @nb: struct notifier_block * with callback info.
@@ -171,6 +151,27 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
return NOTIFY_OK;
}

+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
+/**
+ * get_level: Find the level for a particular frequency
+ * @cpufreq_cdev: cpufreq_cdev for which the property is required
+ * @freq: Frequency
+ *
+ * Return: level corresponding to the frequency.
+ */
+static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_cdev,
+ unsigned int freq)
+{
+ struct freq_table *freq_table = cpufreq_cdev->freq_table;
+ unsigned long level;
+
+ for (level = 1; level <= cpufreq_cdev->max_level; level++)
+ if (freq > freq_table[level].frequency)
+ break;
+
+ return level - 1;
+}
+
/**
* update_freq_table() - Update the freq table with power numbers
* @cpufreq_cdev: the cpufreq cooling device in which to update the table
@@ -319,80 +320,6 @@ static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_cdev,
return (raw_cpu_power * cpufreq_cdev->last_load) / 100;
}

-/* cpufreq cooling device callback functions are defined below */
-
-/**
- * cpufreq_get_max_state - callback function to get the max cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: fill this variable with the max cooling state.
- *
- * Callback for the thermal cooling device to return the cpufreq
- * max cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
- unsigned long *state)
-{
- struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-
- *state = cpufreq_cdev->max_level;
- return 0;
-}
-
-/**
- * cpufreq_get_cur_state - callback function to get the current cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: fill this variable with the current cooling state.
- *
- * Callback for the thermal cooling device to return the cpufreq
- * current cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
- unsigned long *state)
-{
- struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
-
- *state = cpufreq_cdev->cpufreq_state;
-
- return 0;
-}
-
-/**
- * cpufreq_set_cur_state - callback function to set the current cooling state.
- * @cdev: thermal cooling device pointer.
- * @state: set this variable to the current cooling state.
- *
- * Callback for the thermal cooling device to change the cpufreq
- * current cooling state.
- *
- * Return: 0 on success, an error code otherwise.
- */
-static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
- unsigned long state)
-{
- struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
- unsigned int clip_freq;
-
- /* Request state should be less than max_level */
- if (WARN_ON(state > cpufreq_cdev->max_level))
- return -EINVAL;
-
- /* Check if the old cooling action is same as new cooling action */
- if (cpufreq_cdev->cpufreq_state == state)
- return 0;
-
- clip_freq = cpufreq_cdev->freq_table[state].frequency;
- cpufreq_cdev->cpufreq_state = state;
- cpufreq_cdev->clipped_freq = clip_freq;
-
- cpufreq_update_policy(cpufreq_cdev->policy->cpu);
-
- return 0;
-}
-
/**
* cpufreq_get_requested_power() - get the current power
* @cdev: &thermal_cooling_device pointer
@@ -536,22 +463,88 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
power);
return 0;
}
+#endif /* CONFIG_THERMAL_GOV_POWER_ALLOCATOR */
+
+/* cpufreq cooling device callback functions are defined below */
+
+/**
+ * cpufreq_get_max_state - callback function to get the max cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: fill this variable with the max cooling state.
+ *
+ * Callback for the thermal cooling device to return the cpufreq
+ * max cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
+ unsigned long *state)
+{
+ struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+
+ *state = cpufreq_cdev->max_level;
+ return 0;
+}
+
+/**
+ * cpufreq_get_cur_state - callback function to get the current cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: fill this variable with the current cooling state.
+ *
+ * Callback for the thermal cooling device to return the cpufreq
+ * current cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
+ unsigned long *state)
+{
+ struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+
+ *state = cpufreq_cdev->cpufreq_state;
+
+ return 0;
+}
+
+/**
+ * cpufreq_set_cur_state - callback function to set the current cooling state.
+ * @cdev: thermal cooling device pointer.
+ * @state: set this variable to the current cooling state.
+ *
+ * Callback for the thermal cooling device to change the cpufreq
+ * current cooling state.
+ *
+ * Return: 0 on success, an error code otherwise.
+ */
+static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
+ unsigned long state)
+{
+ struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+ unsigned int clip_freq;
+
+ /* Request state should be less than max_level */
+ if (WARN_ON(state > cpufreq_cdev->max_level))
+ return -EINVAL;
+
+ /* Check if the old cooling action is same as new cooling action */
+ if (cpufreq_cdev->cpufreq_state == state)
+ return 0;
+
+ clip_freq = cpufreq_cdev->freq_table[state].frequency;
+ cpufreq_cdev->cpufreq_state = state;
+ cpufreq_cdev->clipped_freq = clip_freq;
+
+ cpufreq_update_policy(cpufreq_cdev->policy->cpu);
+
+ return 0;
+}

/* Bind cpufreq callbacks to thermal cooling device ops */

static struct thermal_cooling_device_ops cpufreq_cooling_ops = {
- .get_max_state = cpufreq_get_max_state,
- .get_cur_state = cpufreq_get_cur_state,
- .set_cur_state = cpufreq_set_cur_state,
-};
-
-static struct thermal_cooling_device_ops cpufreq_power_cooling_ops = {
.get_max_state = cpufreq_get_max_state,
.get_cur_state = cpufreq_get_cur_state,
.set_cur_state = cpufreq_set_cur_state,
- .get_requested_power = cpufreq_get_requested_power,
- .state2power = cpufreq_state2power,
- .power2state = cpufreq_power2state,
};

/* Notifier for cpufreq policy change */
@@ -659,18 +652,19 @@ __cpufreq_cooling_register(struct device_node *np,
pr_debug("%s: freq:%u KHz\n", __func__, freq);
}

+ cooling_ops = &cpufreq_cooling_ops;
+#ifdef CONFIG_THERMAL_GOV_POWER_ALLOCATOR
if (capacitance) {
ret = update_freq_table(cpufreq_cdev, capacitance);
if (ret) {
cdev = ERR_PTR(ret);
goto remove_ida;
}
-
- cooling_ops = &cpufreq_power_cooling_ops;
- } else {
- cooling_ops = &cpufreq_cooling_ops;
+ cooling_ops->get_requested_power = cpufreq_get_requested_power;
+ cooling_ops->state2power = cpufreq_state2power;
+ cooling_ops->power2state = cpufreq_power2state;
}
-
+#endif
cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_cdev,
cooling_ops);
if (IS_ERR(cdev))
--
2.22.1

2019-09-11 15:35:38

by Quentin Perret

[permalink] [raw]
Subject: [PATCH v8 2/4] PM / EM: Declare EM data types unconditionally

From: Quentin Perret <[email protected]>

The structs representing capacity states and performance domains of an
Energy Model are currently only defined for CONFIG_ENERGY_MODEL=y. That
makes it hard for code outside PM_EM to manipulate those structures
without a lot of ifdefery or stubbed accessors.

So, move the declaration of the two structs outside of the
CONFIG_ENERGY_MODEL ifdef. The client code (e.g. EAS or thermal) always
checks the return of em_cpu_get() before using it, so the exising code
is still safe to use as-is.

Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Quentin Perret <[email protected]>
---
include/linux/energy_model.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/linux/energy_model.h b/include/linux/energy_model.h
index 73f8c3cb9588..d249b88a4d5a 100644
--- a/include/linux/energy_model.h
+++ b/include/linux/energy_model.h
@@ -9,7 +9,6 @@
#include <linux/sched/topology.h>
#include <linux/types.h>

-#ifdef CONFIG_ENERGY_MODEL
/**
* em_cap_state - Capacity state of a performance domain
* @frequency: The CPU frequency in KHz, for consistency with CPUFreq
@@ -40,6 +39,7 @@ struct em_perf_domain {
unsigned long cpus[0];
};

+#ifdef CONFIG_ENERGY_MODEL
#define EM_CPU_MAX_POWER 0xFFFF

struct em_data_callback {
@@ -160,7 +160,6 @@ static inline int em_pd_nr_cap_states(struct em_perf_domain *pd)
}

#else
-struct em_perf_domain {};
struct em_data_callback {};
#define EM_DATA_CB(_active_power_cb) { }

--
2.22.1

2019-09-11 22:23:11

by Quentin Perret

[permalink] [raw]
Subject: [PATCH v8 1/4] arm64: defconfig: Enable CONFIG_ENERGY_MODEL

From: Quentin Perret <[email protected]>

The recently introduced Energy Model (EM) framework manages power cost
tables for the CPUs of the system. Its only user right now is the
scheduler, in the context of Energy Aware Scheduling (EAS).

However, the EM framework also offers a generic infrastructure that
could replace subsystem-specific implementations of the same concepts,
as this is the case in the thermal framework.

So, in order to prepare the migration of the thermal subsystem to use
the EM framework, enable it in the default arm64 defconfig, which is the
most commonly used architecture for IPA. This will also compile-in all
of the EAS code, although it won't be enabled by default -- EAS requires
to use the 'schedutil' CPUFreq governor while arm64 defaults to
'performance'.

Acked-by: Daniel Lezcano <[email protected]>
Acked-by: Viresh Kumar <[email protected]>
Signed-off-by: Quentin Perret <[email protected]>
---
arch/arm64/configs/defconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 0e58ef02880c..ad0e4944a71f 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -71,6 +71,7 @@ CONFIG_COMPAT=y
CONFIG_RANDOMIZE_BASE=y
CONFIG_HIBERNATION=y
CONFIG_WQ_POWER_EFFICIENT_DEFAULT=y
+CONFIG_ENERGY_MODEL=y
CONFIG_ARM_CPUIDLE=y
CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_STAT=y
--
2.22.1

2019-10-07 05:35:53

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH RESEND v8 0/4] Make IPA use PM_EM

Hi Quentin,

the series does no longer apply, do you think it is possible to give it
a respin?



On 11/09/2019 15:03, Quentin Perret wrote:
> Re-sending this from an email address I can access. For a cover letter,
> see:
> https://lore.kernel.org/lkml/[email protected]/
>
> Changes in v8:
> - Fixed checkpatch errors (Rui)
>
> Changes in v7
> - Added patch 02/04 to fix the build error reported by the kbuild bot
>
> Changes in v6
> - Added Daniel's and Viresh's Acked-by to all patches
>
> Changes in v5:
> - Changed patch 02 to guard IPA-specific code in cpu_cooling.c with
> appropriate ifdefery (Daniel)
> - Rebased on 5.2-rc2
>
> Changes in v4:
> - Added Viresh's Acked-by to all 3 patches
> - Improved commit message of patch 3/3 to explain how it has no
> functional impact on existing users (Eduardo)
>
> Changes in v3:
> - Changed warning message for unordered tables to something more
> explicit (Viresh)
> - Changed WARN() into a pr_err() for consistency
>
> Changes in v2:
> - Fixed patch 01/03 to actually enable CONFIG_ENERGY_MODEL
> - Added "depends on ENERGY_MODEL" to IPA (Daniel)
> - Added check to bail out if the freq table is unsorted (Viresh)
>
>
> Quentin Perret (4):
> arm64: defconfig: Enable CONFIG_ENERGY_MODEL
> PM / EM: Declare EM data types unconditionally
> thermal: cpu_cooling: Make the power-related code depend on IPA
> thermal: cpu_cooling: Migrate to using the EM framework
>
> arch/arm64/configs/defconfig | 1 +
> drivers/thermal/Kconfig | 1 +
> drivers/thermal/cpu_cooling.c | 427 ++++++++++++++--------------------
> include/linux/energy_model.h | 3 +-
> 4 files changed, 178 insertions(+), 254 deletions(-)
>


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2019-10-07 13:40:42

by Quentin Perret

[permalink] [raw]
Subject: Re: [PATCH RESEND v8 0/4] Make IPA use PM_EM

Hi Daniel,

On Mon, Oct 7, 2019 at 6:35 AM Daniel Lezcano <[email protected]> wrote:
> the series does no longer apply, do you think it is possible to give it
> a respin?

Right, I'll try to fix the conflicts and post a v9 shortly.

Thanks,
Quentin

2019-10-30 10:55:52

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH RESEND v8 0/4] Make IPA use PM_EM

Hi Quentin,

On 07/10/2019 15:37, Quentin Perret wrote:
> Hi Daniel,
>
> On Mon, Oct 7, 2019 at 6:35 AM Daniel Lezcano <[email protected]> wrote:
>> the series does no longer apply, do you think it is possible to give it
>> a respin?
>
> Right, I'll try to fix the conflicts and post a v9 shortly.

we are getting close to the merge window, did you have time to respin
the series?


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2019-10-30 11:47:14

by Quentin Perret

[permalink] [raw]
Subject: Re: [PATCH RESEND v8 0/4] Make IPA use PM_EM

On Wednesday 30 Oct 2019 at 11:54:30 (+0100), Daniel Lezcano wrote:
> Hi Quentin,
>
> On 07/10/2019 15:37, Quentin Perret wrote:
> > Hi Daniel,
> >
> > On Mon, Oct 7, 2019 at 6:35 AM Daniel Lezcano <[email protected]> wrote:
> >> the series does no longer apply, do you think it is possible to give it
> >> a respin?
> >
> > Right, I'll try to fix the conflicts and post a v9 shortly.
>
> we are getting close to the merge window, did you have time to respin
> the series?

Yes, sorry for the delay, but I have a 5.4-rc4-based branch that seems
to work. I'll push that ASAP.

Thanks,
Quentin