2019-11-05 18:53:04

by Thara Gopinath

[permalink] [raw]
Subject: [Patch v5 5/6] thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping

Thermal governors can request for a cpu's maximum supported frequency
to be capped in case of an overheat event. This in turn means that the
maximum capacity available for tasks to run on the particular cpu is
reduced. Delta between the original maximum capacity and capped
maximum capacity is known as thermal pressure. Enable cpufreq cooling
device to update the thermal pressure in event of a capped
maximum frequency.

Signed-off-by: Thara Gopinath <[email protected]>
---

v4->v5:
- fixed issues in update_sched_max_capacity comment header.
- Updated update_sched_max_capacity to calculate maximum available
capacity.

drivers/thermal/cpu_cooling.c | 36 ++++++++++++++++++++++++++++++++++--
1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 391f397..c65b7c4 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -218,6 +218,27 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
}

/**
+ * update_sched_max_capacity - update scheduler about change in cpu
+ * max frequency.
+ * @cpus: list of cpus whose max capacity needs udating in scheduler.
+ * @cur_max_freq: current maximum frequency.
+ * @max_freq: highest possible frequency.
+ */
+static void update_sched_max_capacity(struct cpumask *cpus,
+ unsigned int cur_max_freq,
+ unsigned int max_freq)
+{
+ int cpu;
+ unsigned long capacity;
+
+ for_each_cpu(cpu, cpus) {
+ capacity = cur_max_freq * arch_scale_cpu_capacity(cpu);
+ capacity /= max_freq;
+ update_thermal_pressure(cpu, capacity);
+ }
+}
+
+/**
* get_load() - get load for a cpu since last updated
* @cpufreq_cdev: &struct cpufreq_cooling_device for this cpu
* @cpu: cpu number
@@ -320,6 +341,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
unsigned long state)
{
struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
+ int ret;

/* Request state should be less than max_level */
if (WARN_ON(state > cpufreq_cdev->max_level))
@@ -331,8 +353,18 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,

cpufreq_cdev->cpufreq_state = state;

- return dev_pm_qos_update_request(&cpufreq_cdev->qos_req,
- cpufreq_cdev->freq_table[state].frequency);
+ ret = dev_pm_qos_update_request
+ (&cpufreq_cdev->qos_req,
+ cpufreq_cdev->freq_table[state].frequency);
+
+ if (ret > 0) {
+ update_sched_max_capacity
+ (cpufreq_cdev->policy->cpus,
+ cpufreq_cdev->freq_table[state].frequency,
+ cpufreq_cdev->policy->cpuinfo.max_freq);
+ }
+
+ return ret;
}

/**
--
2.1.4


2019-11-06 12:53:43

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [Patch v5 5/6] thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping

On 05/11/2019 19:49, Thara Gopinath wrote:

[...]

> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index 391f397..c65b7c4 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -218,6 +218,27 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
> }
>
> /**
> + * update_sched_max_capacity - update scheduler about change in cpu
> + * max frequency.
> + * @cpus: list of cpus whose max capacity needs udating in scheduler.
> + * @cur_max_freq: current maximum frequency.
> + * @max_freq: highest possible frequency.
> + */
> +static void update_sched_max_capacity(struct cpumask *cpus,
> + unsigned int cur_max_freq,
> + unsigned int max_freq)
> +{
> + int cpu;
> + unsigned long capacity;
> +
> + for_each_cpu(cpu, cpus) {
> + capacity = cur_max_freq * arch_scale_cpu_capacity(cpu);
> + capacity /= max_freq;
> + update_thermal_pressure(cpu, capacity);
> + }
> +}
> +
> +/**

Have you seen
https://lore.kernel.org/r/[email protected] ?

Also the naming 'update_thermal_pressure()' is not really suitable for
an external task scheduler interface. If I see this called in the
cooling device code, I wouldn't guess that this is a task scheduler
interface.

[...]

2019-11-06 17:29:32

by Thara Gopinath

[permalink] [raw]
Subject: Re: [Patch v5 5/6] thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping

On 11/06/2019 07:50 AM, Dietmar Eggemann wrote:
> On 05/11/2019 19:49, Thara Gopinath wrote:
>
> [...]
>
>> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
>> index 391f397..c65b7c4 100644
>> --- a/drivers/thermal/cpu_cooling.c
>> +++ b/drivers/thermal/cpu_cooling.c
>> @@ -218,6 +218,27 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
>> }
>>
>> /**
>> + * update_sched_max_capacity - update scheduler about change in cpu
>> + * max frequency.
>> + * @cpus: list of cpus whose max capacity needs udating in scheduler.
>> + * @cur_max_freq: current maximum frequency.
>> + * @max_freq: highest possible frequency.
>> + */
>> +static void update_sched_max_capacity(struct cpumask *cpus,
>> + unsigned int cur_max_freq,
>> + unsigned int max_freq)
>> +{
>> + int cpu;
>> + unsigned long capacity;
>> +
>> + for_each_cpu(cpu, cpus) {
>> + capacity = cur_max_freq * arch_scale_cpu_capacity(cpu);
>> + capacity /= max_freq;
>> + update_thermal_pressure(cpu, capacity);
>> + }
>> +}
>> +
>> +/**
>
> Have you seen
> https://lore.kernel.org/r/[email protected] ?
Yes and have you seen this
https://lore.kernel.org/lkml/CAKfTPtCpZq61gQVpATtTdg5hDA+tP4bF6xPMsvHYUMoY+H-6FQ@mail.gmail.com/

this
https://lore.kernel.org/lkml/[email protected]/

and this from Ionela (which is the basis for v5)
https://lore.kernel.org/lkml/20191101154612.GA4884@e108754-lin/

I stand by what I said earlier that I see no reason to take parsing of
the cpus into fair.c for this feature (this way this solution is more
generic and can be used from any other entity capping maximum cpu frequency)

>
> Also the naming 'update_thermal_pressure()' is not really suitable for
> an external task scheduler interface. If I see this called in the
> cooling device code, I wouldn't guess that this is a task scheduler
> interface.

Do you have a name suggestion? When you say you don't like a name do
suggest a name so that I have an idea of what is it you are expecting.

>
> [...]
>


--
Warm Regards
Thara

2019-11-07 13:02:10

by Dietmar Eggemann

[permalink] [raw]
Subject: Re: [Patch v5 5/6] thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping



On 06/11/2019 13:50, Dietmar Eggemann wrote:
> On 05/11/2019 19:49, Thara Gopinath wrote:
>
> [...]
>
>> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
>> index 391f397..c65b7c4 100644
>> --- a/drivers/thermal/cpu_cooling.c
>> +++ b/drivers/thermal/cpu_cooling.c
>> @@ -218,6 +218,27 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
>> }
>>
>> /**
>> + * update_sched_max_capacity - update scheduler about change in cpu
>> + * max frequency.
>> + * @cpus: list of cpus whose max capacity needs udating in scheduler.
>> + * @cur_max_freq: current maximum frequency.
>> + * @max_freq: highest possible frequency.
>> + */
>> +static void update_sched_max_capacity(struct cpumask *cpus,
>> + unsigned int cur_max_freq,
>> + unsigned int max_freq)
>> +{
>> + int cpu;
>> + unsigned long capacity;
>> +
>> + for_each_cpu(cpu, cpus) {
>> + capacity = cur_max_freq * arch_scale_cpu_capacity(cpu);
>> + capacity /= max_freq;
>> + update_thermal_pressure(cpu, capacity);
>> + }
>> +}
>> +
>> +/**
>
> Have you seen
> https://lore.kernel.org/r/[email protected] ?
>
> Also the naming 'update_thermal_pressure()' is not really suitable for
> an external task scheduler interface. If I see this called in the
> cooling device code, I wouldn't guess that this is a task scheduler
> interface.
>
> [...]

The current interface to bring in frequency and uarch (CPU) invariance
information into the task scheduler is arch_scale_[freq|cpu]_capacity.

So why not follow this approach for the thermal->sched interface?

+#ifndef arch_scale_thermal_capacity
+static __always_inline
+unsigned long arch_scale_thermal_capacity(int cpu)
+{
+ return 0;
+}
+#endif

Leave per_cpu(thermal_pressure, cpu) within cpu cooling and let cpu
cooling (or even platform) #define arch_scale_thermal_capacity.