2017-03-16 05:27:36

by Viresh Kumar

Subject: [PATCH 00/17] thermal: cpu_cooling: improve interaction with cpufreq core

Hi Guys,

The cpu_cooling driver is designed to use CPU frequency scaling to avoid
high thermal states for a platform, but it isn't integrated well with
the cpufreq core.

This series improves the interaction between the cpufreq core and the
cpu_cooling driver and includes some fixes and cleanups to the
cpu_cooling driver.

I am not sure which tree this series should go through, PM or thermal.

This series depends on a few other patches which are already merged in
the PM [1] and thermal [2] trees. As this is 4.12 material, all of it
should go through a single tree to avoid conflicts.

I assume that either Rafael or Rui will have to drop the existing
patch(es) from their tree and let the other one apply all of these. I
will let you guys decide on that. Sorry for the trouble.

I have tested the series on 32-bit (exynos) and 64-bit (hikey) ARM
boards and have pushed it for 0-day build bot and kernel CI testing as
well, so we should find out if anything is broken.

@Javi: It would be good if you could test these, especially because of
your work on the "power" specific bits in the driver.

Pushed here as well:

git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm.git thermal/cooling

--
viresh

[1] https://marc.info/?l=linux-kernel&m=148946890403271&w=2
[2] https://marc.info/?l=linux-kernel&m=148644060126593&w=2

Viresh Kumar (17):
thermal: cpu_cooling: Avoid accessing potentially freed structures
thermal: cpu_cooling: rearrange globals
thermal: cpu_cooling: Replace cpufreq_device with cpufreq_dev
thermal: cpu_cooling: replace cool_dev with cdev
thermal: cpu_cooling: remove cpufreq_cooling_get_level()
thermal: cpu_cooling: get rid of a variable in cpufreq_set_cur_state()
thermal: cpu_cooling: use cpufreq_policy to register cooling device
cpufreq: create cpufreq_table_count_valid_entries()
thermal: cpu_cooling: store cpufreq policy
thermal: cpu_cooling: OPPs are registered for all CPUs
thermal: cpu_cooling: get rid of 'allowed_cpus'
thermal: cpu_cooling: merge frequency and power tables
thermal: cpu_cooling: create structure for idle time stats
thermal: cpu_cooling: get_level() can't fail
thermal: cpu_cooling: don't store cpu_dev in cpufreq_dev
thermal: cpu_cooling: 'freq' can't be zero in cpufreq_state2power()
thermal: cpu_cooling: Rearrange struct cpufreq_cooling_device

drivers/cpufreq/arm_big_little.c | 2 +-
drivers/cpufreq/cpufreq-dt.c | 2 +-
drivers/cpufreq/cpufreq_stats.c | 13 +-
drivers/cpufreq/dbx500-cpufreq.c | 2 +-
drivers/cpufreq/mt8173-cpufreq.c | 4 +-
drivers/cpufreq/qoriq-cpufreq.c | 3 +-
drivers/thermal/cpu_cooling.c | 530 ++++++++-------------
drivers/thermal/imx_thermal.c | 22 +-
drivers/thermal/ti-soc-thermal/ti-thermal-common.c | 22 +-
include/linux/cpu_cooling.h | 32 +-
include/linux/cpufreq.h | 14 +
11 files changed, 272 insertions(+), 374 deletions(-)

--
2.7.1.410.g6faf27b


2017-03-16 05:30:01

by Viresh Kumar

Subject: [PATCH 01/17] thermal: cpu_cooling: Avoid accessing potentially freed structures

Once the lock is dropped, the cpufreq_dev may get freed before we call
get_level(), which can cause the kernel to crash.

Drop the lock only after we are done using the structure.

Cc: 4.2+ <[email protected]>
Fixes: 02373d7c69b4 ("thermal: cpu_cooling: fix lockdep problems in cpu_cooling")
Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index c2525b585487..6fd258d62e47 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -155,8 +155,10 @@ unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq)
mutex_lock(&cooling_list_lock);
list_for_each_entry(cpufreq_dev, &cpufreq_dev_list, node) {
if (cpumask_test_cpu(cpu, &cpufreq_dev->allowed_cpus)) {
+ unsigned long level = get_level(cpufreq_dev, freq);
+
mutex_unlock(&cooling_list_lock);
- return get_level(cpufreq_dev, freq);
+ return level;
}
}
mutex_unlock(&cooling_list_lock);
--
2.7.1.410.g6faf27b
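
For readers following along, the locking rule behind patch 01 can be
sketched in user space. This is only an illustration, not the driver
code: struct cooling_dev, add_dev(), lookup_level() and LEVEL_INVALID
are hypothetical names, and a pthread mutex stands in for
cooling_list_lock. The point is that every read of the list entry
happens before the unlock, so a concurrent unregister cannot free the
entry under us.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct cpufreq_cooling_device. */
struct cooling_dev {
	struct cooling_dev *next;
	unsigned int cpu;
	unsigned long level;
};

#define LEVEL_INVALID (~0UL)

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct cooling_dev *dev_list;

/* Add a device to the head of the list while holding the lock. */
static void add_dev(struct cooling_dev *dev)
{
	pthread_mutex_lock(&list_lock);
	dev->next = dev_list;
	dev_list = dev;
	pthread_mutex_unlock(&list_lock);
}

/*
 * Return the level of the device that owns @cpu, or LEVEL_INVALID.
 * The entry is only dereferenced while list_lock is held; after the
 * unlock we touch nothing but the local copy.
 */
static unsigned long lookup_level(unsigned int cpu)
{
	struct cooling_dev *dev;
	unsigned long level = LEVEL_INVALID;

	pthread_mutex_lock(&list_lock);
	for (dev = dev_list; dev; dev = dev->next) {
		if (dev->cpu == cpu) {
			level = dev->level; /* copy out under the lock */
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);

	return level;
}
```

The pre-patch code corresponds to unlocking first and reading
dev->level afterwards, which is safe only if nothing else can free the
entry in between.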

2017-03-16 05:30:10

by Viresh Kumar

Subject: [PATCH 03/17] thermal: cpu_cooling: Replace cpufreq_device with cpufreq_dev

Objects of "struct cpufreq_cooling_device" are named a bit
inconsistently. Let's use cpufreq_dev everywhere.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 133 +++++++++++++++++++++---------------------
1 file changed, 66 insertions(+), 67 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 7ce73eee866f..7a19033d7f79 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -218,11 +218,11 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,

/**
* build_dyn_power_table() - create a dynamic power to frequency table
- * @cpufreq_device: the cpufreq cooling device in which to store the table
+ * @cpufreq_dev: the cpufreq cooling device in which to store the table
* @capacitance: dynamic power coefficient for these cpus
*
* Build a dynamic power to frequency table for this cpu and store it
- * in @cpufreq_device. This table will be used in cpu_power_to_freq() and
+ * in @cpufreq_dev. This table will be used in cpu_power_to_freq() and
* cpu_freq_to_power() to convert between power and frequency
* efficiently. Power is stored in mW, frequency in KHz. The
* resulting table is in ascending order.
@@ -231,7 +231,7 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
* -ENOMEM if we run out of memory or -EAGAIN if an OPP was
* added/enabled while the function was executing.
*/
-static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
+static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_dev,
u32 capacitance)
{
struct power_table *power_table;
@@ -240,10 +240,10 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
int num_opps = 0, cpu, i, ret = 0;
unsigned long freq;

- for_each_cpu(cpu, &cpufreq_device->allowed_cpus) {
+ for_each_cpu(cpu, &cpufreq_dev->allowed_cpus) {
dev = get_cpu_device(cpu);
if (!dev) {
- dev_warn(&cpufreq_device->cool_dev->device,
+ dev_warn(&cpufreq_dev->cool_dev->device,
"No cpu device for cpu %d\n", cpu);
continue;
}
@@ -296,9 +296,9 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
goto free_power_table;
}

- cpufreq_device->cpu_dev = dev;
- cpufreq_device->dyn_power_table = power_table;
- cpufreq_device->dyn_power_table_entries = i;
+ cpufreq_dev->cpu_dev = dev;
+ cpufreq_dev->dyn_power_table = power_table;
+ cpufreq_dev->dyn_power_table_entries = i;

return 0;

@@ -308,26 +308,26 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_device,
return ret;
}

-static u32 cpu_freq_to_power(struct cpufreq_cooling_device *cpufreq_device,
+static u32 cpu_freq_to_power(struct cpufreq_cooling_device *cpufreq_dev,
u32 freq)
{
int i;
- struct power_table *pt = cpufreq_device->dyn_power_table;
+ struct power_table *pt = cpufreq_dev->dyn_power_table;

- for (i = 1; i < cpufreq_device->dyn_power_table_entries; i++)
+ for (i = 1; i < cpufreq_dev->dyn_power_table_entries; i++)
if (freq < pt[i].frequency)
break;

return pt[i - 1].power;
}

-static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_device,
+static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_dev,
u32 power)
{
int i;
- struct power_table *pt = cpufreq_device->dyn_power_table;
+ struct power_table *pt = cpufreq_dev->dyn_power_table;

- for (i = 1; i < cpufreq_device->dyn_power_table_entries; i++)
+ for (i = 1; i < cpufreq_dev->dyn_power_table_entries; i++)
if (power < pt[i].power)
break;

@@ -336,37 +336,37 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_device,

/**
* get_load() - get load for a cpu since last updated
- * @cpufreq_device: &struct cpufreq_cooling_device for this cpu
+ * @cpufreq_dev: &struct cpufreq_cooling_device for this cpu
* @cpu: cpu number
- * @cpu_idx: index of the cpu in cpufreq_device->allowed_cpus
+ * @cpu_idx: index of the cpu in cpufreq_dev->allowed_cpus
*
* Return: The average load of cpu @cpu in percentage since this
* function was last called.
*/
-static u32 get_load(struct cpufreq_cooling_device *cpufreq_device, int cpu,
+static u32 get_load(struct cpufreq_cooling_device *cpufreq_dev, int cpu,
int cpu_idx)
{
u32 load;
u64 now, now_idle, delta_time, delta_idle;

now_idle = get_cpu_idle_time(cpu, &now, 0);
- delta_idle = now_idle - cpufreq_device->time_in_idle[cpu_idx];
- delta_time = now - cpufreq_device->time_in_idle_timestamp[cpu_idx];
+ delta_idle = now_idle - cpufreq_dev->time_in_idle[cpu_idx];
+ delta_time = now - cpufreq_dev->time_in_idle_timestamp[cpu_idx];

if (delta_time <= delta_idle)
load = 0;
else
load = div64_u64(100 * (delta_time - delta_idle), delta_time);

- cpufreq_device->time_in_idle[cpu_idx] = now_idle;
- cpufreq_device->time_in_idle_timestamp[cpu_idx] = now;
+ cpufreq_dev->time_in_idle[cpu_idx] = now_idle;
+ cpufreq_dev->time_in_idle_timestamp[cpu_idx] = now;

return load;
}

/**
* get_static_power() - calculate the static power consumed by the cpus
- * @cpufreq_device: struct &cpufreq_cooling_device for this cpu cdev
+ * @cpufreq_dev: struct &cpufreq_cooling_device for this cpu cdev
* @tz: thermal zone device in which we're operating
* @freq: frequency in KHz
* @power: pointer in which to store the calculated static power
@@ -379,25 +379,24 @@ static u32 get_load(struct cpufreq_cooling_device *cpufreq_device, int cpu,
*
* Return: 0 on success, -E* on failure.
*/
-static int get_static_power(struct cpufreq_cooling_device *cpufreq_device,
+static int get_static_power(struct cpufreq_cooling_device *cpufreq_dev,
struct thermal_zone_device *tz, unsigned long freq,
u32 *power)
{
struct dev_pm_opp *opp;
unsigned long voltage;
- struct cpumask *cpumask = &cpufreq_device->allowed_cpus;
+ struct cpumask *cpumask = &cpufreq_dev->allowed_cpus;
unsigned long freq_hz = freq * 1000;

- if (!cpufreq_device->plat_get_static_power ||
- !cpufreq_device->cpu_dev) {
+ if (!cpufreq_dev->plat_get_static_power || !cpufreq_dev->cpu_dev) {
*power = 0;
return 0;
}

- opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz,
+ opp = dev_pm_opp_find_freq_exact(cpufreq_dev->cpu_dev, freq_hz,
true);
if (IS_ERR(opp)) {
- dev_warn_ratelimited(cpufreq_device->cpu_dev,
+ dev_warn_ratelimited(cpufreq_dev->cpu_dev,
"Failed to find OPP for frequency %lu: %ld\n",
freq_hz, PTR_ERR(opp));
return -EINVAL;
@@ -407,31 +406,31 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_device,
dev_pm_opp_put(opp);

if (voltage == 0) {
- dev_err_ratelimited(cpufreq_device->cpu_dev,
+ dev_err_ratelimited(cpufreq_dev->cpu_dev,
"Failed to get voltage for frequency %lu\n",
freq_hz);
return -EINVAL;
}

- return cpufreq_device->plat_get_static_power(cpumask, tz->passive_delay,
- voltage, power);
+ return cpufreq_dev->plat_get_static_power(cpumask, tz->passive_delay,
+ voltage, power);
}

/**
* get_dynamic_power() - calculate the dynamic power
- * @cpufreq_device: &cpufreq_cooling_device for this cdev
+ * @cpufreq_dev: &cpufreq_cooling_device for this cdev
* @freq: current frequency
*
* Return: the dynamic power consumed by the cpus described by
- * @cpufreq_device.
+ * @cpufreq_dev.
*/
-static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_device,
+static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_dev,
unsigned long freq)
{
u32 raw_cpu_power;

- raw_cpu_power = cpu_freq_to_power(cpufreq_device, freq);
- return (raw_cpu_power * cpufreq_device->last_load) / 100;
+ raw_cpu_power = cpu_freq_to_power(cpufreq_dev, freq);
+ return (raw_cpu_power * cpufreq_dev->last_load) / 100;
}

/* cpufreq cooling device callback functions are defined below */
@@ -449,9 +448,9 @@ static u32 get_dynamic_power(struct cpufreq_cooling_device *cpufreq_device,
static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
unsigned long *state)
{
- struct cpufreq_cooling_device *cpufreq_device = cdev->devdata;
+ struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;

- *state = cpufreq_device->max_level;
+ *state = cpufreq_dev->max_level;
return 0;
}

@@ -468,9 +467,9 @@ static int cpufreq_get_max_state(struct thermal_cooling_device *cdev,
static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
unsigned long *state)
{
- struct cpufreq_cooling_device *cpufreq_device = cdev->devdata;
+ struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;

- *state = cpufreq_device->cpufreq_state;
+ *state = cpufreq_dev->cpufreq_state;

return 0;
}
@@ -488,21 +487,21 @@ static int cpufreq_get_cur_state(struct thermal_cooling_device *cdev,
static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
unsigned long state)
{
- struct cpufreq_cooling_device *cpufreq_device = cdev->devdata;
- unsigned int cpu = cpumask_any(&cpufreq_device->allowed_cpus);
+ struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;
+ unsigned int cpu = cpumask_any(&cpufreq_dev->allowed_cpus);
unsigned int clip_freq;

/* Request state should be less than max_level */
- if (WARN_ON(state > cpufreq_device->max_level))
+ if (WARN_ON(state > cpufreq_dev->max_level))
return -EINVAL;

/* Check if the old cooling action is same as new cooling action */
- if (cpufreq_device->cpufreq_state == state)
+ if (cpufreq_dev->cpufreq_state == state)
return 0;

- clip_freq = cpufreq_device->freq_table[state];
- cpufreq_device->cpufreq_state = state;
- cpufreq_device->clipped_freq = clip_freq;
+ clip_freq = cpufreq_dev->freq_table[state];
+ cpufreq_dev->cpufreq_state = state;
+ cpufreq_dev->clipped_freq = clip_freq;

cpufreq_update_policy(cpu);

@@ -539,10 +538,10 @@ static int cpufreq_get_requested_power(struct thermal_cooling_device *cdev,
unsigned long freq;
int i = 0, cpu, ret;
u32 static_power, dynamic_power, total_load = 0;
- struct cpufreq_cooling_device *cpufreq_device = cdev->devdata;
+ struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;
u32 *load_cpu = NULL;

- cpu = cpumask_any_and(&cpufreq_device->allowed_cpus, cpu_online_mask);
+ cpu = cpumask_any_and(&cpufreq_dev->allowed_cpus, cpu_online_mask);

/*
* All the CPUs are offline, thus the requested power by
@@ -556,16 +555,16 @@ static int cpufreq_get_requested_power(struct thermal_cooling_device *cdev,
freq = cpufreq_quick_get(cpu);

if (trace_thermal_power_cpu_get_power_enabled()) {
- u32 ncpus = cpumask_weight(&cpufreq_device->allowed_cpus);
+ u32 ncpus = cpumask_weight(&cpufreq_dev->allowed_cpus);

load_cpu = kcalloc(ncpus, sizeof(*load_cpu), GFP_KERNEL);
}

- for_each_cpu(cpu, &cpufreq_device->allowed_cpus) {
+ for_each_cpu(cpu, &cpufreq_dev->allowed_cpus) {
u32 load;

if (cpu_online(cpu))
- load = get_load(cpufreq_device, cpu, i);
+ load = get_load(cpufreq_dev, cpu, i);
else
load = 0;

@@ -576,10 +575,10 @@ static int cpufreq_get_requested_power(struct thermal_cooling_device *cdev,
i++;
}

- cpufreq_device->last_load = total_load;
+ cpufreq_dev->last_load = total_load;

- dynamic_power = get_dynamic_power(cpufreq_device, freq);
- ret = get_static_power(cpufreq_device, tz, freq, &static_power);
+ dynamic_power = get_dynamic_power(cpufreq_dev, freq);
+ ret = get_static_power(cpufreq_dev, tz, freq, &static_power);
if (ret) {
kfree(load_cpu);
return ret;
@@ -587,7 +586,7 @@ static int cpufreq_get_requested_power(struct thermal_cooling_device *cdev,

if (load_cpu) {
trace_thermal_power_cpu_get_power(
- &cpufreq_device->allowed_cpus,
+ &cpufreq_dev->allowed_cpus,
freq, load_cpu, i, dynamic_power, static_power);

kfree(load_cpu);
@@ -620,12 +619,12 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
cpumask_var_t cpumask;
u32 static_power, dynamic_power;
int ret;
- struct cpufreq_cooling_device *cpufreq_device = cdev->devdata;
+ struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;

if (!alloc_cpumask_var(&cpumask, GFP_KERNEL))
return -ENOMEM;

- cpumask_and(cpumask, &cpufreq_device->allowed_cpus, cpu_online_mask);
+ cpumask_and(cpumask, &cpufreq_dev->allowed_cpus, cpu_online_mask);
num_cpus = cpumask_weight(cpumask);

/* None of our cpus are online, so no power */
@@ -635,14 +634,14 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
goto out;
}

- freq = cpufreq_device->freq_table[state];
+ freq = cpufreq_dev->freq_table[state];
if (!freq) {
ret = -EINVAL;
goto out;
}

- dynamic_power = cpu_freq_to_power(cpufreq_device, freq) * num_cpus;
- ret = get_static_power(cpufreq_device, tz, freq, &static_power);
+ dynamic_power = cpu_freq_to_power(cpufreq_dev, freq) * num_cpus;
+ ret = get_static_power(cpufreq_dev, tz, freq, &static_power);
if (ret)
goto out;

@@ -680,24 +679,24 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
int ret;
s32 dyn_power;
u32 last_load, normalised_power, static_power;
- struct cpufreq_cooling_device *cpufreq_device = cdev->devdata;
+ struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;

- cpu = cpumask_any_and(&cpufreq_device->allowed_cpus, cpu_online_mask);
+ cpu = cpumask_any_and(&cpufreq_dev->allowed_cpus, cpu_online_mask);

/* None of our cpus are online */
if (cpu >= nr_cpu_ids)
return -ENODEV;

cur_freq = cpufreq_quick_get(cpu);
- ret = get_static_power(cpufreq_device, tz, cur_freq, &static_power);
+ ret = get_static_power(cpufreq_dev, tz, cur_freq, &static_power);
if (ret)
return ret;

dyn_power = power - static_power;
dyn_power = dyn_power > 0 ? dyn_power : 0;
- last_load = cpufreq_device->last_load ?: 1;
+ last_load = cpufreq_dev->last_load ?: 1;
normalised_power = (dyn_power * 100) / last_load;
- target_freq = cpu_power_to_freq(cpufreq_device, normalised_power);
+ target_freq = cpu_power_to_freq(cpufreq_dev, normalised_power);

*state = cpufreq_cooling_get_level(cpu, target_freq);
if (*state == THERMAL_CSTATE_INVALID) {
@@ -707,7 +706,7 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
return -EINVAL;
}

- trace_thermal_power_cpu_limit(&cpufreq_device->allowed_cpus,
+ trace_thermal_power_cpu_limit(&cpufreq_dev->allowed_cpus,
target_freq, *state, power);
return 0;
}
--
2.7.1.410.g6faf27b

2017-03-16 05:30:06

by Viresh Kumar

Subject: [PATCH 02/17] thermal: cpu_cooling: rearrange globals

Just to make it look better.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 6fd258d62e47..7ce73eee866f 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -105,10 +105,9 @@ struct cpufreq_cooling_device {
struct device *cpu_dev;
get_static_t plat_get_static_power;
};
-static DEFINE_IDA(cpufreq_ida);

static unsigned int cpufreq_dev_count;
-
+static DEFINE_IDA(cpufreq_ida);
static DEFINE_MUTEX(cooling_list_lock);
static LIST_HEAD(cpufreq_dev_list);

--
2.7.1.410.g6faf27b

2017-03-16 05:30:15

by Viresh Kumar

Subject: [PATCH 05/17] thermal: cpu_cooling: remove cpufreq_cooling_get_level()

There is only one user of cpufreq_cooling_get_level() and it already
has a pointer to the cpufreq_dev structure. It can call get_level()
directly instead, and we can get rid of cpufreq_cooling_get_level().

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 33 +--------------------------------
include/linux/cpu_cooling.h | 6 ------
2 files changed, 1 insertion(+), 38 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index e2931c20c309..99dc6833de75 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -137,37 +137,6 @@ static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_dev,
}

/**
- * cpufreq_cooling_get_level - for a given cpu, return the cooling level.
- * @cpu: cpu for which the level is required
- * @freq: the frequency of interest
- *
- * This function will match the cooling level corresponding to the
- * requested @freq and return it.
- *
- * Return: The matched cooling level on success or THERMAL_CSTATE_INVALID
- * otherwise.
- */
-unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq)
-{
- struct cpufreq_cooling_device *cpufreq_dev;
-
- mutex_lock(&cooling_list_lock);
- list_for_each_entry(cpufreq_dev, &cpufreq_dev_list, node) {
- if (cpumask_test_cpu(cpu, &cpufreq_dev->allowed_cpus)) {
- unsigned long level = get_level(cpufreq_dev, freq);
-
- mutex_unlock(&cooling_list_lock);
- return level;
- }
- }
- mutex_unlock(&cooling_list_lock);
-
- pr_err("%s: cpu:%d not part of any cooling device\n", __func__, cpu);
- return THERMAL_CSTATE_INVALID;
-}
-EXPORT_SYMBOL_GPL(cpufreq_cooling_get_level);
-
-/**
* cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
* @nb: struct notifier_block * with callback info.
* @event: value showing cpufreq event for which this function invoked.
@@ -698,7 +667,7 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
normalised_power = (dyn_power * 100) / last_load;
target_freq = cpu_power_to_freq(cpufreq_dev, normalised_power);

- *state = cpufreq_cooling_get_level(cpu, target_freq);
+ *state = get_level(cpufreq_dev, target_freq);
if (*state == THERMAL_CSTATE_INVALID) {
dev_err_ratelimited(&cdev->device,
"Failed to convert %dKHz for cpu %d into a cdev state\n",
diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
index c156f5082758..96c5e4c2f9c8 100644
--- a/include/linux/cpu_cooling.h
+++ b/include/linux/cpu_cooling.h
@@ -82,7 +82,6 @@ of_cpufreq_power_cooling_register(struct device_node *np,
*/
void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev);

-unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq);
#else /* !CONFIG_CPU_THERMAL */
static inline struct thermal_cooling_device *
cpufreq_cooling_register(const struct cpumask *clip_cpus)
@@ -117,11 +116,6 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
{
return;
}
-static inline
-unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq)
-{
- return THERMAL_CSTATE_INVALID;
-}
#endif /* CONFIG_CPU_THERMAL */

#endif /* __CPU_COOLING_H__ */
--
2.7.1.410.g6faf27b

2017-03-16 05:30:19

by Viresh Kumar

Subject: [PATCH 06/17] thermal: cpu_cooling: get rid of a variable in cpufreq_set_cur_state()

'cpu' is used in only one place, so there is no need to keep a separate
variable for it.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 99dc6833de75..46e90122b746 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -457,7 +457,6 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
unsigned long state)
{
struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;
- unsigned int cpu = cpumask_any(&cpufreq_dev->allowed_cpus);
unsigned int clip_freq;

/* Request state should be less than max_level */
@@ -472,7 +471,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
cpufreq_dev->cpufreq_state = state;
cpufreq_dev->clipped_freq = clip_freq;

- cpufreq_update_policy(cpu);
+ cpufreq_update_policy(cpumask_any(&cpufreq_dev->allowed_cpus));

return 0;
}
--
2.7.1.410.g6faf27b

2017-03-16 05:30:13

by Viresh Kumar

Subject: [PATCH 04/17] thermal: cpu_cooling: replace cool_dev with cdev

Objects of "struct thermal_cooling_device" are named a bit
inconsistently. Let's use cdev everywhere.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 36 ++++++++++++++++++------------------
1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 7a19033d7f79..e2931c20c309 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -65,7 +65,7 @@ struct power_table {
* struct cpufreq_cooling_device - data for cooling device with cpufreq
* @id: unique integer value corresponding to each cpufreq_cooling_device
* registered.
- * @cool_dev: thermal_cooling_device pointer to keep track of the
+ * @cdev: thermal_cooling_device pointer to keep track of the
* registered cooling device.
* @cpufreq_state: integer value representing the current state of cpufreq
* cooling devices.
@@ -90,7 +90,7 @@ struct power_table {
*/
struct cpufreq_cooling_device {
int id;
- struct thermal_cooling_device *cool_dev;
+ struct thermal_cooling_device *cdev;
unsigned int cpufreq_state;
unsigned int clipped_freq;
unsigned int max_level;
@@ -243,7 +243,7 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_dev,
for_each_cpu(cpu, &cpufreq_dev->allowed_cpus) {
dev = get_cpu_device(cpu);
if (!dev) {
- dev_warn(&cpufreq_dev->cool_dev->device,
+ dev_warn(&cpufreq_dev->cdev->device,
"No cpu device for cpu %d\n", cpu);
continue;
}
@@ -770,7 +770,7 @@ __cpufreq_cooling_register(struct device_node *np,
get_static_t plat_static_func)
{
struct cpufreq_policy *policy;
- struct thermal_cooling_device *cool_dev;
+ struct thermal_cooling_device *cdev;
struct cpufreq_cooling_device *cpufreq_dev;
char dev_name[THERMAL_NAME_LENGTH];
struct cpufreq_frequency_table *pos, *table;
@@ -786,20 +786,20 @@ __cpufreq_cooling_register(struct device_node *np,
policy = cpufreq_cpu_get(cpumask_first(temp_mask));
if (!policy) {
pr_debug("%s: CPUFreq policy not found\n", __func__);
- cool_dev = ERR_PTR(-EPROBE_DEFER);
+ cdev = ERR_PTR(-EPROBE_DEFER);
goto free_cpumask;
}

table = policy->freq_table;
if (!table) {
pr_debug("%s: CPUFreq table not found\n", __func__);
- cool_dev = ERR_PTR(-ENODEV);
+ cdev = ERR_PTR(-ENODEV);
goto put_policy;
}

cpufreq_dev = kzalloc(sizeof(*cpufreq_dev), GFP_KERNEL);
if (!cpufreq_dev) {
- cool_dev = ERR_PTR(-ENOMEM);
+ cdev = ERR_PTR(-ENOMEM);
goto put_policy;
}

@@ -808,7 +808,7 @@ __cpufreq_cooling_register(struct device_node *np,
sizeof(*cpufreq_dev->time_in_idle),
GFP_KERNEL);
if (!cpufreq_dev->time_in_idle) {
- cool_dev = ERR_PTR(-ENOMEM);
+ cdev = ERR_PTR(-ENOMEM);
goto free_cdev;
}

@@ -816,7 +816,7 @@ __cpufreq_cooling_register(struct device_node *np,
kcalloc(num_cpus, sizeof(*cpufreq_dev->time_in_idle_timestamp),
GFP_KERNEL);
if (!cpufreq_dev->time_in_idle_timestamp) {
- cool_dev = ERR_PTR(-ENOMEM);
+ cdev = ERR_PTR(-ENOMEM);
goto free_time_in_idle;
}

@@ -827,7 +827,7 @@ __cpufreq_cooling_register(struct device_node *np,
cpufreq_dev->freq_table = kmalloc(sizeof(*cpufreq_dev->freq_table) *
cpufreq_dev->max_level, GFP_KERNEL);
if (!cpufreq_dev->freq_table) {
- cool_dev = ERR_PTR(-ENOMEM);
+ cdev = ERR_PTR(-ENOMEM);
goto free_time_in_idle_timestamp;
}

@@ -841,7 +841,7 @@ __cpufreq_cooling_register(struct device_node *np,

ret = build_dyn_power_table(cpufreq_dev, capacitance);
if (ret) {
- cool_dev = ERR_PTR(ret);
+ cdev = ERR_PTR(ret);
goto free_table;
}

@@ -852,7 +852,7 @@ __cpufreq_cooling_register(struct device_node *np,

ret = ida_simple_get(&cpufreq_ida, 0, 0, GFP_KERNEL);
if (ret < 0) {
- cool_dev = ERR_PTR(ret);
+ cdev = ERR_PTR(ret);
goto free_power_table;
}
cpufreq_dev->id = ret;
@@ -872,13 +872,13 @@ __cpufreq_cooling_register(struct device_node *np,
snprintf(dev_name, sizeof(dev_name), "thermal-cpufreq-%d",
cpufreq_dev->id);

- cool_dev = thermal_of_cooling_device_register(np, dev_name, cpufreq_dev,
- cooling_ops);
- if (IS_ERR(cool_dev))
+ cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_dev,
+ cooling_ops);
+ if (IS_ERR(cdev))
goto remove_ida;

cpufreq_dev->clipped_freq = cpufreq_dev->freq_table[0];
- cpufreq_dev->cool_dev = cool_dev;
+ cpufreq_dev->cdev = cdev;

mutex_lock(&cooling_list_lock);
list_add(&cpufreq_dev->node, &cpufreq_dev_list);
@@ -907,7 +907,7 @@ __cpufreq_cooling_register(struct device_node *np,
cpufreq_cpu_put(policy);
free_cpumask:
free_cpumask_var(temp_mask);
- return cool_dev;
+ return cdev;
}

/**
@@ -1043,7 +1043,7 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
list_del(&cpufreq_dev->node);
mutex_unlock(&cooling_list_lock);

- thermal_cooling_device_unregister(cpufreq_dev->cool_dev);
+ thermal_cooling_device_unregister(cpufreq_dev->cdev);
ida_simple_remove(&cpufreq_ida, cpufreq_dev->id);
kfree(cpufreq_dev->dyn_power_table);
kfree(cpufreq_dev->time_in_idle_timestamp);
--
2.7.1.410.g6faf27b

2017-03-16 05:30:35

by Viresh Kumar

Subject: [PATCH 09/17] thermal: cpu_cooling: store cpufreq policy

The cpufreq policy can be used by the cpu_cooling driver; let's store it
in the cpufreq_cooling_device structure.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 2c169fee693e..7590279bf1de 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -67,6 +67,7 @@ struct power_table {
* registered.
* @cdev: thermal_cooling_device pointer to keep track of the
* registered cooling device.
+ * @policy: cpufreq policy.
* @cpufreq_state: integer value representing the current state of cpufreq
* cooling devices.
* @clipped_freq: integer value representing the absolute value of the clipped
@@ -91,6 +92,7 @@ struct power_table {
struct cpufreq_cooling_device {
int id;
struct thermal_cooling_device *cdev;
+ struct cpufreq_policy *policy;
unsigned int cpufreq_state;
unsigned int clipped_freq;
unsigned int max_level;
@@ -827,6 +829,7 @@ __cpufreq_cooling_register(struct device_node *np,

cpufreq_dev->clipped_freq = cpufreq_dev->freq_table[0];
cpufreq_dev->cdev = cdev;
+ cpufreq_dev->policy = policy;

mutex_lock(&cooling_list_lock);
list_add(&cpufreq_dev->node, &cpufreq_dev_list);
--
2.7.1.410.g6faf27b

2017-03-16 05:30:45

by Viresh Kumar

Subject: [PATCH 12/17] thermal: cpu_cooling: merge frequency and power tables

The cpu_cooling driver keeps two tables:

- freq_table: table of frequencies in descending order, built from
policy->freq_table.

- power_table: table of frequencies and power in ascending order, built
from OPP table.

If OPPs are used for the CPU device, then both of these tables are
actually built using the OPP core and should have the same frequency
entries, so there is no need to keep separate tables.

Let's merge them.

Note that the new table is in descending order of frequency, so the
'for' loops had to be adjusted in a few places to make it work.
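
To illustrate the idea, here is a rough user-space sketch of a single
table in descending order serving both as the level table and as the
frequency/power conversion table. It is not the code from this patch:
the table contents are made-up numbers, LEVEL_INVALID is a hypothetical
stand-in for THERMAL_CSTATE_INVALID, and freq_to_power() and
power_to_freq() are simplified helpers written for this example.

```c
#include <stddef.h>
#include <stdint.h>

struct freq_table {
	uint32_t frequency;	/* KHz */
	uint32_t power;		/* mW */
};

/* Hypothetical data, highest frequency first (level 0). */
static const struct freq_table table[] = {
	{ 1200000, 900 },
	{  800000, 500 },
	{  400000, 200 },
};

#define MAX_LEVEL	(sizeof(table) / sizeof(table[0]) - 1)
#define LEVEL_INVALID	(~0UL)

/* The cooling level is the index of the exactly matching frequency. */
static unsigned long get_level(uint32_t freq)
{
	unsigned long level;

	for (level = 0; level <= MAX_LEVEL; level++) {
		if (freq == table[level].frequency)
			return level;
		/* Descending order: no lower entry can match either. */
		if (freq > table[level].frequency)
			break;
	}

	return LEVEL_INVALID;
}

/* Power of the highest entry not above @freq, else the lowest entry. */
static uint32_t freq_to_power(uint32_t freq)
{
	size_t i;

	for (i = 0; i < MAX_LEVEL; i++)
		if (freq >= table[i].frequency)
			break;

	return table[i].power;
}

/* Highest frequency whose power fits in @power, else the lowest entry. */
static uint32_t power_to_freq(uint32_t power)
{
	size_t i;

	for (i = 0; i < MAX_LEVEL; i++)
		if (power >= table[i].power)
			break;

	return table[i].frequency;
}
```

Because the table is walked from the highest frequency down, the
conversion loops stop at the first entry that fits instead of stepping
back one entry as the ascending-order loops did.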

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 123 ++++++++++++++++++------------------------
1 file changed, 52 insertions(+), 71 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index b7b193cb0e7a..960135d85a71 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -49,14 +49,14 @@
*/

/**
- * struct power_table - frequency to power conversion
+ * struct freq_table - frequency table along with power entries
* @frequency: frequency in KHz
* @power: power in mW
*
* This structure is built when the cooling device registers and helps
- * in translating frequency to power and viceversa.
+ * in translating frequency to power and vice versa.
*/
-struct power_table {
+struct freq_table {
u32 frequency;
u32 power;
};
@@ -79,9 +79,6 @@ struct power_table {
* @time_in_idle: previous reading of the absolute time that this cpu was idle
* @time_in_idle_timestamp: wall time of the last invocation of
* get_cpu_idle_time_us()
- * @dyn_power_table: array of struct power_table for frequency to power
- * conversion, sorted in ascending order.
- * @dyn_power_table_entries: number of entries in the @dyn_power_table array
* @cpu_dev: the cpu_device of policy->cpu.
* @plat_get_static_power: callback to calculate the static power
*
@@ -95,13 +92,11 @@ struct cpufreq_cooling_device {
unsigned int cpufreq_state;
unsigned int clipped_freq;
unsigned int max_level;
- unsigned int *freq_table; /* In descending order */
+ struct freq_table *freq_table; /* In descending order */
struct list_head node;
u32 last_load;
u64 *time_in_idle;
u64 *time_in_idle_timestamp;
- struct power_table *dyn_power_table;
- int dyn_power_table_entries;
struct device *cpu_dev;
get_static_t plat_get_static_power;
};
@@ -126,10 +121,10 @@ static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_dev,
unsigned long level;

for (level = 0; level <= cpufreq_dev->max_level; level++) {
- if (freq == cpufreq_dev->freq_table[level])
+ if (freq == cpufreq_dev->freq_table[level].frequency)
return level;

- if (freq > cpufreq_dev->freq_table[level])
+ if (freq > cpufreq_dev->freq_table[level].frequency)
break;
}

@@ -186,28 +181,25 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
}

/**
- * build_dyn_power_table() - create a dynamic power to frequency table
- * @cpufreq_dev: the cpufreq cooling device in which to store the table
+ * update_freq_table() - Update the freq table with power numbers
+ * @cpufreq_dev: the cpufreq cooling device in which to update the table
* @capacitance: dynamic power coefficient for these cpus
*
- * Build a dynamic power to frequency table for this cpu and store it
- * in @cpufreq_dev. This table will be used in cpu_power_to_freq() and
- * cpu_freq_to_power() to convert between power and frequency
- * efficiently. Power is stored in mW, frequency in KHz. The
- * resulting table is in ascending order.
+ * Update the freq table with power numbers. This table will be used in
+ * cpu_power_to_freq() and cpu_freq_to_power() to convert between power and
+ * frequency efficiently. Power is stored in mW, frequency in KHz. The
+ * resulting table is in descending order.
*
* Return: 0 on success, -EINVAL if there are no OPPs for any CPUs,
- * -ENOMEM if we run out of memory or -EAGAIN if an OPP was
- * added/enabled while the function was executing.
+ * or -ENOMEM if we run out of memory.
*/
-static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_dev,
- u32 capacitance)
+static int update_freq_table(struct cpufreq_cooling_device *cpufreq_dev,
+ u32 capacitance)
{
- struct power_table *power_table;
+ struct freq_table *freq_table = cpufreq_dev->freq_table;
struct dev_pm_opp *opp;
struct device *dev = NULL;
- int num_opps = 0, cpu = cpufreq_dev->policy->cpu, i, ret = 0;
- unsigned long freq;
+ int num_opps = 0, cpu = cpufreq_dev->policy->cpu, i;

dev = get_cpu_device(cpu);
if (unlikely(!dev)) {
@@ -220,25 +212,32 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_dev,
if (num_opps < 0)
return num_opps;

- if (num_opps == 0)
+ /*
+ * The cpufreq table is also built from the OPP table and so the count
+ * should match.
+ */
+ if (num_opps != cpufreq_dev->max_level + 1) {
+ dev_warn(dev, "Number of OPPs not matching with max_levels\n");
return -EINVAL;
+ }

- power_table = kcalloc(num_opps, sizeof(*power_table), GFP_KERNEL);
- if (!power_table)
- return -ENOMEM;
-
- for (freq = 0, i = 0;
- opp = dev_pm_opp_find_freq_ceil(dev, &freq), !IS_ERR(opp);
- freq++, i++) {
- u32 freq_mhz, voltage_mv;
+ for (i = 0; i < cpufreq_dev->max_level; i++) {
+ unsigned long freq = freq_table[i].frequency * 1000;
+ u32 freq_mhz = freq_table[i].frequency / 1000;
u64 power;
+ u32 voltage_mv;

- if (i >= num_opps) {
- ret = -EAGAIN;
- goto free_power_table;
+ /*
+ * Find ceil frequency as 'freq' may be slightly lower than OPP
+ * freq due to truncation while converting to kHz.
+ */
+ opp = dev_pm_opp_find_freq_ceil(dev, &freq);
+ if (IS_ERR(opp)) {
+ dev_err(dev, "failed to get opp for %lu frequency\n",
+ freq);
+ return -EINVAL;
}

- freq_mhz = freq / 1000000;
voltage_mv = dev_pm_opp_get_voltage(opp) / 1000;
dev_pm_opp_put(opp);

@@ -249,54 +248,39 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_dev,
power = (u64)capacitance * freq_mhz * voltage_mv * voltage_mv;
do_div(power, 1000000000);

- /* frequency is stored in power_table in KHz */
- power_table[i].frequency = freq / 1000;
-
/* power is stored in mW */
- power_table[i].power = power;
- }
-
- if (i != num_opps) {
- ret = PTR_ERR(opp);
- goto free_power_table;
+ freq_table[i].power = power;
}

cpufreq_dev->cpu_dev = dev;
- cpufreq_dev->dyn_power_table = power_table;
- cpufreq_dev->dyn_power_table_entries = i;

return 0;
-
-free_power_table:
- kfree(power_table);
-
- return ret;
}

static u32 cpu_freq_to_power(struct cpufreq_cooling_device *cpufreq_dev,
u32 freq)
{
int i;
- struct power_table *pt = cpufreq_dev->dyn_power_table;
+ struct freq_table *freq_table = cpufreq_dev->freq_table;

- for (i = 1; i < cpufreq_dev->dyn_power_table_entries; i++)
- if (freq < pt[i].frequency)
+ for (i = 1; i < cpufreq_dev->max_level; i++)
+ if (freq > freq_table[i].frequency)
break;

- return pt[i - 1].power;
+ return freq_table[i - 1].power;
}

static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_dev,
u32 power)
{
int i;
- struct power_table *pt = cpufreq_dev->dyn_power_table;
+ struct freq_table *freq_table = cpufreq_dev->freq_table;

- for (i = 1; i < cpufreq_dev->dyn_power_table_entries; i++)
- if (power < pt[i].power)
+ for (i = 1; i < cpufreq_dev->max_level; i++)
+ if (power > freq_table[i].power)
break;

- return pt[i - 1].frequency;
+ return freq_table[i - 1].frequency;
}

/**
@@ -463,7 +447,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
if (cpufreq_dev->cpufreq_state == state)
return 0;

- clip_freq = cpufreq_dev->freq_table[state];
+ clip_freq = cpufreq_dev->freq_table[state].frequency;
cpufreq_dev->cpufreq_state = state;
cpufreq_dev->clipped_freq = clip_freq;

@@ -576,7 +560,7 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,

num_cpus = cpumask_weight(cpufreq_dev->policy->cpus);

- freq = cpufreq_dev->freq_table[state];
+ freq = cpufreq_dev->freq_table[state].frequency;
if (!freq)
return -EINVAL;

@@ -750,7 +734,7 @@ __cpufreq_cooling_register(struct device_node *np,
if (capacitance) {
cpufreq_dev->plat_get_static_power = plat_static_func;

- ret = build_dyn_power_table(cpufreq_dev, capacitance);
+ ret = update_freq_table(cpufreq_dev, capacitance);
if (ret) {
cdev = ERR_PTR(ret);
goto free_table;
@@ -764,14 +748,14 @@ __cpufreq_cooling_register(struct device_node *np,
ret = ida_simple_get(&cpufreq_ida, 0, 0, GFP_KERNEL);
if (ret < 0) {
cdev = ERR_PTR(ret);
- goto free_power_table;
+ goto free_table;
}
cpufreq_dev->id = ret;

/* Fill freq-table in descending order of frequencies */
for (i = 0, freq = -1; i <= cpufreq_dev->max_level; i++) {
freq = find_next_max(policy->freq_table, freq);
- cpufreq_dev->freq_table[i] = freq;
+ cpufreq_dev->freq_table[i].frequency = freq;

/* Warn for duplicate entries */
if (!freq)
@@ -788,7 +772,7 @@ __cpufreq_cooling_register(struct device_node *np,
if (IS_ERR(cdev))
goto remove_ida;

- cpufreq_dev->clipped_freq = cpufreq_dev->freq_table[0];
+ cpufreq_dev->clipped_freq = cpufreq_dev->freq_table[0].frequency;
cpufreq_dev->cdev = cdev;
cpufreq_dev->policy = policy;

@@ -805,8 +789,6 @@ __cpufreq_cooling_register(struct device_node *np,

remove_ida:
ida_simple_remove(&cpufreq_ida, cpufreq_dev->id);
-free_power_table:
- kfree(cpufreq_dev->dyn_power_table);
free_table:
kfree(cpufreq_dev->freq_table);
free_time_in_idle_timestamp:
@@ -953,7 +935,6 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)

thermal_cooling_device_unregister(cpufreq_dev->cdev);
ida_simple_remove(&cpufreq_ida, cpufreq_dev->id);
- kfree(cpufreq_dev->dyn_power_table);
kfree(cpufreq_dev->time_in_idle_timestamp);
kfree(cpufreq_dev->time_in_idle);
kfree(cpufreq_dev->freq_table);
--
2.7.1.410.g6faf27b
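
The merged table from [PATCH 12/17] can be sketched in userspace as follows. This is only a minimal sketch under stated assumptions: fill_power(), freq_to_power() and the voltage_mv[] input are hypothetical stand-ins (the driver gets voltages from the OPP core), and the lookup loop here runs over every entry up to and including the last one, whereas the posted hunk bounds it with max_level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the merged entry; field names mirror the patch. */
struct freq_table {
	uint32_t frequency;	/* kHz */
	uint32_t power;		/* mW */
};

/*
 * Fill in the power for each entry using the same arithmetic as the patch:
 * capacitance * f(MHz) * V(mV)^2 / 10^9. voltage_mv[] is a hypothetical
 * input that replaces the OPP lookup.
 */
static void fill_power(struct freq_table *table, size_t count,
		       const uint32_t *voltage_mv, uint32_t capacitance)
{
	size_t i;

	for (i = 0; i < count; i++) {
		uint64_t freq_mhz = table[i].frequency / 1000;
		uint64_t mv = voltage_mv[i];
		uint64_t power = (uint64_t)capacitance * freq_mhz * mv * mv;

		table[i].power = (uint32_t)(power / 1000000000ULL);
	}
}

/*
 * Walk down the descending table and stop at the first entry whose
 * frequency is below freq; the entry just before it is the match.
 * Out-of-range requests are clamped to the first or last entry.
 */
static uint32_t freq_to_power(const struct freq_table *table, size_t count,
			      uint32_t freq)
{
	size_t i;

	assert(count > 0);

	for (i = 1; i < count; i++)
		if (freq > table[i].frequency)
			break;

	return table[i - 1].power;
}
```

With a single table there is one allocation and one ordering to keep in mind, at the cost of flipping the comparison direction compared with the old ascending power table.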

2017-03-16 05:30:33

by Viresh Kumar

Subject: [PATCH 08/17] cpufreq: create cpufreq_table_count_valid_entries()

We already need such a routine in two places, so let's create one.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/cpufreq_stats.c | 13 ++++---------
drivers/thermal/cpu_cooling.c | 22 +++++++++-------------
include/linux/cpufreq.h | 14 ++++++++++++++
3 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
index f570ead62454..9c3d319dc129 100644
--- a/drivers/cpufreq/cpufreq_stats.c
+++ b/drivers/cpufreq/cpufreq_stats.c
@@ -170,11 +170,10 @@ void cpufreq_stats_create_table(struct cpufreq_policy *policy)
unsigned int i = 0, count = 0, ret = -ENOMEM;
struct cpufreq_stats *stats;
unsigned int alloc_size;
- struct cpufreq_frequency_table *pos, *table;
+ struct cpufreq_frequency_table *pos;

- /* We need cpufreq table for creating stats table */
- table = policy->freq_table;
- if (unlikely(!table))
+ count = cpufreq_table_count_valid_entries(policy);
+ if (!count)
return;

/* stats already initialized */
@@ -185,10 +184,6 @@ void cpufreq_stats_create_table(struct cpufreq_policy *policy)
if (!stats)
return;

- /* Find total allocation size */
- cpufreq_for_each_valid_entry(pos, table)
- count++;
-
alloc_size = count * sizeof(int) + count * sizeof(u64);

alloc_size += count * count * sizeof(int);
@@ -205,7 +200,7 @@ void cpufreq_stats_create_table(struct cpufreq_policy *policy)
stats->max_state = count;

/* Find valid-unique entries */
- cpufreq_for_each_valid_entry(pos, table)
+ cpufreq_for_each_valid_entry(pos, policy->freq_table)
if (freq_table_get_index(stats, pos->frequency) == -1)
stats->freq_table[i++] = pos->frequency;

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index a97ebb7bf27f..2c169fee693e 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -740,14 +740,14 @@ __cpufreq_cooling_register(struct device_node *np,
struct thermal_cooling_device *cdev;
struct cpufreq_cooling_device *cpufreq_dev;
char dev_name[THERMAL_NAME_LENGTH];
- struct cpufreq_frequency_table *pos, *table;
unsigned int freq, i, num_cpus;
int ret;
struct thermal_cooling_device_ops *cooling_ops;

- table = policy->freq_table;
- if (!table) {
- pr_debug("%s: CPUFreq table not found\n", __func__);
+ i = cpufreq_table_count_valid_entries(policy);
+ if (!i) {
+ pr_debug("%s: CPUFreq table not found or has no valid entries\n",
+ __func__);
return ERR_PTR(-ENODEV);
}

@@ -772,20 +772,16 @@ __cpufreq_cooling_register(struct device_node *np,
goto free_time_in_idle;
}

- /* Find max levels */
- cpufreq_for_each_valid_entry(pos, table)
- cpufreq_dev->max_level++;
+ /* max_level is an index, not a counter */
+ cpufreq_dev->max_level = i - 1;

- cpufreq_dev->freq_table = kmalloc(sizeof(*cpufreq_dev->freq_table) *
- cpufreq_dev->max_level, GFP_KERNEL);
+ cpufreq_dev->freq_table = kmalloc(sizeof(*cpufreq_dev->freq_table) * i,
+ GFP_KERNEL);
if (!cpufreq_dev->freq_table) {
cdev = ERR_PTR(-ENOMEM);
goto free_time_in_idle_timestamp;
}

- /* max_level is an index, not a counter */
- cpufreq_dev->max_level--;
-
cpumask_copy(&cpufreq_dev->allowed_cpus, policy->related_cpus);

if (capacitance) {
@@ -811,7 +807,7 @@ __cpufreq_cooling_register(struct device_node *np,

/* Fill freq-table in descending order of frequencies */
for (i = 0, freq = -1; i <= cpufreq_dev->max_level; i++) {
- freq = find_next_max(table, freq);
+ freq = find_next_max(policy->freq_table, freq);
cpufreq_dev->freq_table[i] = freq;

/* Warn for duplicate entries */
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 87165f06a307..affc13568af6 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -855,6 +855,20 @@ static inline int cpufreq_frequency_table_target(struct cpufreq_policy *policy,
return -EINVAL;
}
}
+
+static inline int cpufreq_table_count_valid_entries(const struct cpufreq_policy *policy)
+{
+ struct cpufreq_frequency_table *pos;
+ int count = 0;
+
+ if (unlikely(!policy->freq_table))
+ return 0;
+
+ cpufreq_for_each_valid_entry(pos, policy->freq_table)
+ count++;
+
+ return count;
+}
#else
static inline int cpufreq_boost_trigger_state(int state)
{
--
2.7.1.410.g6faf27b
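
The counting helper from [PATCH 08/17] can be sketched outside the kernel like this. It is a rough sketch, not the kernel code: struct freq_entry, struct policy, ENTRY_INVALID and TABLE_END are local stand-ins assumed for illustration, and the open-coded loop takes the place of cpufreq_for_each_valid_entry().

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel values local to this sketch; assumed stand-ins only. */
#define ENTRY_INVALID	(~0u)
#define TABLE_END	(~1u)

struct freq_entry {
	unsigned int frequency;	/* kHz, or one of the sentinels above */
};

struct policy {
	const struct freq_entry *freq_table;	/* may be NULL */
};

/* Count entries that are neither invalid nor the end marker. */
static int count_valid_entries(const struct policy *policy)
{
	const struct freq_entry *pos;
	int count = 0;

	assert(policy != NULL);

	if (!policy->freq_table)
		return 0;

	for (pos = policy->freq_table; pos->frequency != TABLE_END; pos++)
		if (pos->frequency != ENTRY_INVALID)
			count++;

	return count;
}
```

A caller that wants max_level as an index, as the cooling driver does, would then store the returned count minus one and allocate count entries.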

2017-03-16 05:30:42

by Viresh Kumar

Subject: [PATCH 11/17] thermal: cpu_cooling: get rid of 'allowed_cpus'

'allowed_cpus' is a copy of policy->related_cpus and can be replaced by
it directly. In some places we only care about online CPUs, and
policy->cpus can be used there.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 77 ++++++++++++-------------------------------
1 file changed, 21 insertions(+), 56 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 1df6c9039e45..b7b193cb0e7a 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -74,7 +74,6 @@ struct power_table {
* frequency.
* @max_level: maximum cooling level. One less than total number of valid
* cpufreq frequencies.
- * @allowed_cpus: all the cpus involved for this cpufreq_cooling_device.
* @node: list_head to link all cpufreq_cooling_device together.
* @last_load: load measured by the latest call to cpufreq_get_requested_power()
* @time_in_idle: previous reading of the absolute time that this cpu was idle
@@ -97,7 +96,6 @@ struct cpufreq_cooling_device {
unsigned int clipped_freq;
unsigned int max_level;
unsigned int *freq_table; /* In descending order */
- struct cpumask allowed_cpus;
struct list_head node;
u32 last_load;
u64 *time_in_idle;
@@ -162,7 +160,7 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,

mutex_lock(&cooling_list_lock);
list_for_each_entry(cpufreq_dev, &cpufreq_dev_list, node) {
- if (!cpumask_test_cpu(policy->cpu, &cpufreq_dev->allowed_cpus))
+ if (policy != cpufreq_dev->policy)
continue;

/*
@@ -305,7 +303,7 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_dev,
* get_load() - get load for a cpu since last updated
* @cpufreq_dev: &struct cpufreq_cooling_device for this cpu
* @cpu: cpu number
- * @cpu_idx: index of the cpu in cpufreq_dev->allowed_cpus
+ * @cpu_idx: index of the cpu in time_in_idle*
*
* Return: The average load of cpu @cpu in percentage since this
* function was last called.
@@ -352,7 +350,7 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_dev,
{
struct dev_pm_opp *opp;
unsigned long voltage;
- struct cpumask *cpumask = &cpufreq_dev->allowed_cpus;
+ struct cpumask *cpumask = cpufreq_dev->policy->related_cpus;
unsigned long freq_hz = freq * 1000;

if (!cpufreq_dev->plat_get_static_power || !cpufreq_dev->cpu_dev) {
@@ -469,7 +467,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
cpufreq_dev->cpufreq_state = state;
cpufreq_dev->clipped_freq = clip_freq;

- cpufreq_update_policy(cpumask_any(&cpufreq_dev->allowed_cpus));
+ cpufreq_update_policy(cpufreq_dev->policy->cpu);

return 0;
}
@@ -505,28 +503,18 @@ static int cpufreq_get_requested_power(struct thermal_cooling_device *cdev,
int i = 0, cpu, ret;
u32 static_power, dynamic_power, total_load = 0;
struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;
+ struct cpufreq_policy *policy = cpufreq_dev->policy;
u32 *load_cpu = NULL;

- cpu = cpumask_any_and(&cpufreq_dev->allowed_cpus, cpu_online_mask);
-
- /*
- * All the CPUs are offline, thus the requested power by
- * the cdev is 0
- */
- if (cpu >= nr_cpu_ids) {
- *power = 0;
- return 0;
- }
-
- freq = cpufreq_quick_get(cpu);
+ freq = cpufreq_quick_get(policy->cpu);

if (trace_thermal_power_cpu_get_power_enabled()) {
- u32 ncpus = cpumask_weight(&cpufreq_dev->allowed_cpus);
+ u32 ncpus = cpumask_weight(policy->related_cpus);

load_cpu = kcalloc(ncpus, sizeof(*load_cpu), GFP_KERNEL);
}

- for_each_cpu(cpu, &cpufreq_dev->allowed_cpus) {
+ for_each_cpu(cpu, policy->related_cpus) {
u32 load;

if (cpu_online(cpu))
@@ -551,9 +539,9 @@ static int cpufreq_get_requested_power(struct thermal_cooling_device *cdev,
}

if (load_cpu) {
- trace_thermal_power_cpu_get_power(
- &cpufreq_dev->allowed_cpus,
- freq, load_cpu, i, dynamic_power, static_power);
+ trace_thermal_power_cpu_get_power(policy->related_cpus, freq,
+ load_cpu, i, dynamic_power,
+ static_power);

kfree(load_cpu);
}
@@ -582,38 +570,22 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
unsigned long state, u32 *power)
{
unsigned int freq, num_cpus;
- cpumask_var_t cpumask;
u32 static_power, dynamic_power;
int ret;
struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;

- if (!alloc_cpumask_var(&cpumask, GFP_KERNEL))
- return -ENOMEM;
-
- cpumask_and(cpumask, &cpufreq_dev->allowed_cpus, cpu_online_mask);
- num_cpus = cpumask_weight(cpumask);
-
- /* None of our cpus are online, so no power */
- if (num_cpus == 0) {
- *power = 0;
- ret = 0;
- goto out;
- }
+ num_cpus = cpumask_weight(cpufreq_dev->policy->cpus);

freq = cpufreq_dev->freq_table[state];
- if (!freq) {
- ret = -EINVAL;
- goto out;
- }
+ if (!freq)
+ return -EINVAL;

dynamic_power = cpu_freq_to_power(cpufreq_dev, freq) * num_cpus;
ret = get_static_power(cpufreq_dev, tz, freq, &static_power);
if (ret)
- goto out;
+ return ret;

*power = static_power + dynamic_power;
-out:
- free_cpumask_var(cpumask);
return ret;
}

@@ -641,19 +613,14 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
struct thermal_zone_device *tz, u32 power,
unsigned long *state)
{
- unsigned int cpu, cur_freq, target_freq;
+ unsigned int cur_freq, target_freq;
int ret;
s32 dyn_power;
u32 last_load, normalised_power, static_power;
struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;
+ struct cpufreq_policy *policy = cpufreq_dev->policy;

- cpu = cpumask_any_and(&cpufreq_dev->allowed_cpus, cpu_online_mask);
-
- /* None of our cpus are online */
- if (cpu >= nr_cpu_ids)
- return -ENODEV;
-
- cur_freq = cpufreq_quick_get(cpu);
+ cur_freq = cpufreq_quick_get(policy->cpu);
ret = get_static_power(cpufreq_dev, tz, cur_freq, &static_power);
if (ret)
return ret;
@@ -668,12 +635,12 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
if (*state == THERMAL_CSTATE_INVALID) {
dev_err_ratelimited(&cdev->device,
"Failed to convert %dKHz for cpu %d into a cdev state\n",
- target_freq, cpu);
+ target_freq, policy->cpu);
return -EINVAL;
}

- trace_thermal_power_cpu_limit(&cpufreq_dev->allowed_cpus,
- target_freq, *state, power);
+ trace_thermal_power_cpu_limit(policy->related_cpus, target_freq, *state,
+ power);
return 0;
}

@@ -780,8 +747,6 @@ __cpufreq_cooling_register(struct device_node *np,
goto free_time_in_idle_timestamp;
}

- cpumask_copy(&cpufreq_dev->allowed_cpus, policy->related_cpus);
-
if (capacitance) {
cpufreq_dev->plat_get_static_power = plat_static_func;

--
2.7.1.410.g6faf27b

2017-03-16 05:30:49

by Viresh Kumar

Subject: [PATCH 13/17] thermal: cpu_cooling: create structure for idle time stats

We keep two arrays for idle time stats and allocate memory for them
separately. It is much easier to follow if we create an array of
idle stats structures instead and allocate it once.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 53 ++++++++++++++++++++-----------------------
1 file changed, 25 insertions(+), 28 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 960135d85a71..69388a903706 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -62,6 +62,16 @@ struct freq_table {
};

/**
+ * struct time_in_idle - Idle time stats
+ * @time: previous reading of the absolute time that this cpu was idle
+ * @timestamp: wall time of the last invocation of get_cpu_idle_time_us()
+ */
+struct time_in_idle {
+ u64 time;
+ u64 timestamp;
+};
+
+/**
* struct cpufreq_cooling_device - data for cooling device with cpufreq
* @id: unique integer value corresponding to each cpufreq_cooling_device
* registered.
@@ -76,9 +86,7 @@ struct freq_table {
* cpufreq frequencies.
* @node: list_head to link all cpufreq_cooling_device together.
* @last_load: load measured by the latest call to cpufreq_get_requested_power()
- * @time_in_idle: previous reading of the absolute time that this cpu was idle
- * @time_in_idle_timestamp: wall time of the last invocation of
- * get_cpu_idle_time_us()
+ * @idle_time: idle time stats
* @cpu_dev: the cpu_device of policy->cpu.
* @plat_get_static_power: callback to calculate the static power
*
@@ -95,8 +103,7 @@ struct cpufreq_cooling_device {
struct freq_table *freq_table; /* In descending order */
struct list_head node;
u32 last_load;
- u64 *time_in_idle;
- u64 *time_in_idle_timestamp;
+ struct time_in_idle *idle_time;
struct device *cpu_dev;
get_static_t plat_get_static_power;
};
@@ -297,18 +304,19 @@ static u32 get_load(struct cpufreq_cooling_device *cpufreq_dev, int cpu,
{
u32 load;
u64 now, now_idle, delta_time, delta_idle;
+ struct time_in_idle *idle_time = &cpufreq_dev->idle_time[cpu_idx];

now_idle = get_cpu_idle_time(cpu, &now, 0);
- delta_idle = now_idle - cpufreq_dev->time_in_idle[cpu_idx];
- delta_time = now - cpufreq_dev->time_in_idle_timestamp[cpu_idx];
+ delta_idle = now_idle - idle_time->time;
+ delta_time = now - idle_time->timestamp;

if (delta_time <= delta_idle)
load = 0;
else
load = div64_u64(100 * (delta_time - delta_idle), delta_time);

- cpufreq_dev->time_in_idle[cpu_idx] = now_idle;
- cpufreq_dev->time_in_idle_timestamp[cpu_idx] = now;
+ idle_time->time = now_idle;
+ idle_time->timestamp = now;

return load;
}
@@ -705,22 +713,14 @@ __cpufreq_cooling_register(struct device_node *np,
return ERR_PTR(-ENOMEM);

num_cpus = cpumask_weight(policy->related_cpus);
- cpufreq_dev->time_in_idle = kcalloc(num_cpus,
- sizeof(*cpufreq_dev->time_in_idle),
- GFP_KERNEL);
- if (!cpufreq_dev->time_in_idle) {
+ cpufreq_dev->idle_time = kcalloc(num_cpus,
+ sizeof(*cpufreq_dev->idle_time),
+ GFP_KERNEL);
+ if (!cpufreq_dev->idle_time) {
cdev = ERR_PTR(-ENOMEM);
goto free_cdev;
}

- cpufreq_dev->time_in_idle_timestamp =
- kcalloc(num_cpus, sizeof(*cpufreq_dev->time_in_idle_timestamp),
- GFP_KERNEL);
- if (!cpufreq_dev->time_in_idle_timestamp) {
- cdev = ERR_PTR(-ENOMEM);
- goto free_time_in_idle;
- }
-
/* max_level is an index, not a counter */
cpufreq_dev->max_level = i - 1;

@@ -728,7 +728,7 @@ __cpufreq_cooling_register(struct device_node *np,
GFP_KERNEL);
if (!cpufreq_dev->freq_table) {
cdev = ERR_PTR(-ENOMEM);
- goto free_time_in_idle_timestamp;
+ goto free_idle_time;
}

if (capacitance) {
@@ -791,10 +791,8 @@ __cpufreq_cooling_register(struct device_node *np,
ida_simple_remove(&cpufreq_ida, cpufreq_dev->id);
free_table:
kfree(cpufreq_dev->freq_table);
-free_time_in_idle_timestamp:
- kfree(cpufreq_dev->time_in_idle_timestamp);
-free_time_in_idle:
- kfree(cpufreq_dev->time_in_idle);
+free_idle_time:
+ kfree(cpufreq_dev->idle_time);
free_cdev:
kfree(cpufreq_dev);
return cdev;
@@ -935,8 +933,7 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)

thermal_cooling_device_unregister(cpufreq_dev->cdev);
ida_simple_remove(&cpufreq_ida, cpufreq_dev->id);
- kfree(cpufreq_dev->time_in_idle_timestamp);
- kfree(cpufreq_dev->time_in_idle);
+ kfree(cpufreq_dev->idle_time);
kfree(cpufreq_dev->freq_table);
kfree(cpufreq_dev);
}
--
2.7.1.410.g6faf27b
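
The single-array layout from [PATCH 13/17] can be sketched in userspace as below. The assumptions are loud ones: alloc_idle_time() is a hypothetical helper using calloc() in place of kcalloc(), and get_load() here takes the current time and idle time as parameters instead of reading them with get_cpu_idle_time().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct time_in_idle {
	uint64_t time;		/* previous idle-time reading */
	uint64_t timestamp;	/* wall time of that reading */
};

/* One zeroed allocation holds both values for every CPU. */
static struct time_in_idle *alloc_idle_time(size_t num_cpus)
{
	return calloc(num_cpus, sizeof(struct time_in_idle));
}

/*
 * Load in percent since the previous call for this CPU's slot.
 * now and now_idle are supplied by the caller in this sketch.
 */
static uint32_t get_load(struct time_in_idle *idle_time,
			 uint64_t now, uint64_t now_idle)
{
	uint64_t delta_idle, delta_time;
	uint32_t load;

	assert(idle_time != NULL);

	delta_idle = now_idle - idle_time->time;
	delta_time = now - idle_time->timestamp;

	if (delta_time <= delta_idle)
		load = 0;
	else
		load = (uint32_t)(100 * (delta_time - delta_idle) / delta_time);

	/* Remember this reading for the next call. */
	idle_time->time = now_idle;
	idle_time->timestamp = now;

	return load;
}
```

Keeping both values in one structure means one allocation, one error path and one kfree()-style release, which is the simplification the patch is after.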

2017-03-16 05:30:54

by Viresh Kumar

Subject: [PATCH 16/17] thermal: cpu_cooling: 'freq' can't be zero in cpufreq_state2power()

The frequency table shouldn't have any zero-frequency entries, so such
a check isn't required. It is better to make sure that 'state' is
within limits instead.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index fb535fd5aa12..768a95bcc392 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -561,12 +561,13 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
int ret;
struct cpufreq_cooling_device *cpufreq_dev = cdev->devdata;

+ /* Request state should be less than max_level */
+ if (WARN_ON(state > cpufreq_dev->max_level))
+ return -EINVAL;
+
num_cpus = cpumask_weight(cpufreq_dev->policy->cpus);

freq = cpufreq_dev->freq_table[state].frequency;
- if (!freq)
- return -EINVAL;
-
dynamic_power = cpu_freq_to_power(cpufreq_dev, freq) * num_cpus;
ret = get_static_power(cpufreq_dev, tz, freq, &static_power);
if (ret)
--
2.7.1.410.g6faf27b
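
The bounds check from [PATCH 16/17] can be illustrated with a small userspace sketch. struct cooling_dev and state_to_freq() are hypothetical names made up for this example, and the sketch returns -EINVAL silently where the driver also warns with WARN_ON().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct freq_table {
	uint32_t frequency;	/* kHz */
	uint32_t power;		/* mW */
};

/* Hypothetical, trimmed-down stand-in for the cooling device. */
struct cooling_dev {
	unsigned int max_level;			/* index of the last entry */
	const struct freq_table *freq_table;	/* max_level + 1 entries */
};

/*
 * Validate the index before using it, instead of inspecting the value
 * read from the table afterwards.
 */
static int state_to_freq(const struct cooling_dev *dev, unsigned long state,
			 uint32_t *freq)
{
	assert(dev != NULL && freq != NULL);

	if (state > dev->max_level)
		return -EINVAL;

	*freq = dev->freq_table[state].frequency;
	return 0;
}
```

Checking the index first matters because an out-of-range state would otherwise read past the end of the table before any test on the frequency could run.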

2017-03-16 05:31:01

by Viresh Kumar

Subject: [PATCH 17/17] thermal: cpu_cooling: Rearrange struct cpufreq_cooling_device

This shrinks the structure by 8 bytes on arm64 by avoiding 4 bytes of
padding in two places.

Also add the missing doc comment for freq_table.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 15 ++++++++-------
1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 768a95bcc392..f31c753ac08f 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -75,17 +75,18 @@ struct time_in_idle {
* struct cpufreq_cooling_device - data for cooling device with cpufreq
* @id: unique integer value corresponding to each cpufreq_cooling_device
* registered.
- * @cdev: thermal_cooling_device pointer to keep track of the
- * registered cooling device.
- * @policy: cpufreq policy.
+ * @last_load: load measured by the latest call to cpufreq_get_requested_power()
* @cpufreq_state: integer value representing the current state of cpufreq
* cooling devices.
* @clipped_freq: integer value representing the absolute value of the clipped
* frequency.
* @max_level: maximum cooling level. One less than total number of valid
* cpufreq frequencies.
+ * @freq_table: Freq table in descending order of frequencies
+ * @cdev: thermal_cooling_device pointer to keep track of the
+ * registered cooling device.
+ * @policy: cpufreq policy.
* @node: list_head to link all cpufreq_cooling_device together.
- * @last_load: load measured by the latest call to cpufreq_get_requested_power()
* @idle_time: idle time stats
* @plat_get_static_power: callback to calculate the static power
*
@@ -94,14 +95,14 @@ struct time_in_idle {
*/
struct cpufreq_cooling_device {
int id;
- struct thermal_cooling_device *cdev;
- struct cpufreq_policy *policy;
+ u32 last_load;
unsigned int cpufreq_state;
unsigned int clipped_freq;
unsigned int max_level;
struct freq_table *freq_table; /* In descending order */
+ struct thermal_cooling_device *cdev;
+ struct cpufreq_policy *policy;
struct list_head node;
- u32 last_load;
struct time_in_idle *idle_time;
get_static_t plat_get_static_power;
};
--
2.7.1.410.g6faf27b
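
The padding effect described in [PATCH 17/17] can be checked with a userspace sketch of the two layouts. The struct names and bytes_saved() are hypothetical, and pointer members are reduced to void * because only their size and alignment matter here; on a typical 64-bit target the difference should come out as the 8 bytes the commit message mentions, while on 32-bit targets both layouts are expected to be the same size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Member order before the patch: 4-byte fields sit between pointers. */
struct layout_before {
	int id;
	void *cdev;
	void *policy;
	unsigned int cpufreq_state;
	unsigned int clipped_freq;
	unsigned int max_level;
	void *freq_table;
	struct list_head node;
	uint32_t last_load;
	void *idle_time;
	int (*plat_get_static_power)(void);
};

/* Member order after the patch: all 4-byte fields are grouped first. */
struct layout_after {
	int id;
	uint32_t last_load;
	unsigned int cpufreq_state;
	unsigned int clipped_freq;
	unsigned int max_level;
	void *freq_table;
	void *cdev;
	void *policy;
	struct list_head node;
	void *idle_time;
	int (*plat_get_static_power)(void);
};

/* Bytes saved by the reordering on the target this is compiled for. */
static size_t bytes_saved(void)
{
	assert(sizeof(struct layout_after) <= sizeof(struct layout_before));

	return sizeof(struct layout_before) - sizeof(struct layout_after);
}
```

Tools such as pahole report the same holes directly from debug info, which is usually the easier way to find this kind of saving in a real build.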

2017-03-16 05:31:04

by Viresh Kumar

Subject: [PATCH 15/17] thermal: cpu_cooling: don't store cpu_dev in cpufreq_dev

'cpu_dev' is used by only one function, get_static_power(), and it
isn't expensive to get the cpu device structure within it. This lets us
remove cpu_dev from struct cpufreq_cooling_device.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index fd84802d2e8e..fb535fd5aa12 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -87,7 +87,6 @@ struct time_in_idle {
* @node: list_head to link all cpufreq_cooling_device together.
* @last_load: load measured by the latest call to cpufreq_get_requested_power()
* @idle_time: idle time stats
- * @cpu_dev: the cpu_device of policy->cpu.
* @plat_get_static_power: callback to calculate the static power
*
* This structure is required for keeping information of each registered
@@ -104,7 +103,6 @@ struct cpufreq_cooling_device {
struct list_head node;
u32 last_load;
struct time_in_idle *idle_time;
- struct device *cpu_dev;
get_static_t plat_get_static_power;
};

@@ -256,8 +254,6 @@ static int update_freq_table(struct cpufreq_cooling_device *cpufreq_dev,
freq_table[i].power = power;
}

- cpufreq_dev->cpu_dev = dev;
-
return 0;
}

@@ -339,19 +335,22 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_dev,
{
struct dev_pm_opp *opp;
unsigned long voltage;
- struct cpumask *cpumask = cpufreq_dev->policy->related_cpus;
+ struct cpufreq_policy *policy = cpufreq_dev->policy;
+ struct cpumask *cpumask = policy->related_cpus;
unsigned long freq_hz = freq * 1000;
+ struct device *dev;

- if (!cpufreq_dev->plat_get_static_power || !cpufreq_dev->cpu_dev) {
+ if (!cpufreq_dev->plat_get_static_power) {
*power = 0;
return 0;
}

- opp = dev_pm_opp_find_freq_exact(cpufreq_dev->cpu_dev, freq_hz,
- true);
+ dev = get_cpu_device(policy->cpu);
+ WARN_ON(!dev);
+
+ opp = dev_pm_opp_find_freq_exact(dev, freq_hz, true);
if (IS_ERR(opp)) {
- dev_warn_ratelimited(cpufreq_dev->cpu_dev,
- "Failed to find OPP for frequency %lu: %ld\n",
+ dev_warn_ratelimited(dev, "Failed to find OPP for frequency %lu: %ld\n",
freq_hz, PTR_ERR(opp));
return -EINVAL;
}
@@ -360,8 +359,7 @@ static int get_static_power(struct cpufreq_cooling_device *cpufreq_dev,
dev_pm_opp_put(opp);

if (voltage == 0) {
- dev_err_ratelimited(cpufreq_dev->cpu_dev,
- "Failed to get voltage for frequency %lu\n",
+ dev_err_ratelimited(dev, "Failed to get voltage for frequency %lu\n",
freq_hz);
return -EINVAL;
}
--
2.7.1.410.g6faf27b

2017-03-16 05:31:38

by Viresh Kumar

Subject: [PATCH 14/17] thermal: cpu_cooling: get_level() can't fail

The frequency passed to get_level() is returned by cpu_power_to_freq(),
so it is guaranteed that get_level() can't fail.

Get rid of the error code.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 20 +++++---------------
1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 69388a903706..fd84802d2e8e 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -120,22 +120,19 @@ static LIST_HEAD(cpufreq_dev_list);
* @cpufreq_dev: cpufreq_dev for which the property is required
* @freq: Frequency
*
- * Return: level on success, THERMAL_CSTATE_INVALID on error.
+ * Return: level corresponding to the frequency.
*/
static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_dev,
unsigned int freq)
{
+ struct freq_table *freq_table = cpufreq_dev->freq_table;
unsigned long level;

- for (level = 0; level <= cpufreq_dev->max_level; level++) {
- if (freq == cpufreq_dev->freq_table[level].frequency)
- return level;
-
- if (freq > cpufreq_dev->freq_table[level].frequency)
+ for (level = 1; level < cpufreq_dev->max_level; level++)
+ if (freq > freq_table[level].frequency)
break;
- }

- return THERMAL_CSTATE_INVALID;
+ return level - 1;
}

/**
@@ -624,13 +621,6 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
target_freq = cpu_power_to_freq(cpufreq_dev, normalised_power);

*state = get_level(cpufreq_dev, target_freq);
- if (*state == THERMAL_CSTATE_INVALID) {
- dev_err_ratelimited(&cdev->device,
- "Failed to convert %dKHz for cpu %d into a cdev state\n",
- target_freq, policy->cpu);
- return -EINVAL;
- }
-
trace_thermal_power_cpu_limit(policy->related_cpus, target_freq, *state,
power);
return 0;
--
2.7.1.410.g6faf27b
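
The reason get_level() in [PATCH 14/17] no longer needs an error value can be seen in a userspace sketch of the search. This is a sketch under assumptions rather than the posted code: the function takes the table and max_level as plain parameters, and the loop bound is level <= max_level so that the last level is reachable, whereas the posted hunk uses level < max_level.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct freq_table {
	uint32_t frequency;	/* kHz, table sorted in descending order */
	uint32_t power;		/* mW */
};

/*
 * Return the cooling level for freq. The search starts at level 1 and
 * returns level - 1, so the result always lies in [0, max_level] and
 * there is no "not found" case to report.
 */
static unsigned long get_level(const struct freq_table *table,
			       unsigned int max_level, unsigned int freq)
{
	unsigned long level;

	assert(table != NULL);

	for (level = 1; level <= max_level; level++)
		if (freq > table[level].frequency)
			break;

	return level - 1;
}
```

Frequencies above the first entry map to level 0 and frequencies below the last entry map to max_level, so callers such as cpufreq_power2state() can use the result without checking it.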

2017-03-16 05:32:50

by Viresh Kumar

Subject: [PATCH 07/17] thermal: cpu_cooling: use cpufreq_policy to register cooling device

The CPU cooling driver uses the cpufreq policy to get clip_cpus, the
frequency table, etc. Most callers of the CPU cooling driver's
registration routines already have the cpufreq policy, but they only
pass the policy->related_cpus cpumask. The __cpufreq_cooling_register()
routine then gets the policy by itself and uses it.

It would be much better if the callers passed the policy directly
instead. This also fixes a basic design flaw, where the policy can be
freed while the CPU cooling driver is still active.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/arm_big_little.c | 2 +-
drivers/cpufreq/cpufreq-dt.c | 2 +-
drivers/cpufreq/dbx500-cpufreq.c | 2 +-
drivers/cpufreq/mt8173-cpufreq.c | 4 +-
drivers/cpufreq/qoriq-cpufreq.c | 3 +-
drivers/thermal/cpu_cooling.c | 60 ++++++++--------------
drivers/thermal/imx_thermal.c | 22 ++++++--
drivers/thermal/ti-soc-thermal/ti-thermal-common.c | 22 +++++---
include/linux/cpu_cooling.h | 26 +++++-----
9 files changed, 71 insertions(+), 72 deletions(-)

diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c
index 418042201e6d..ea6d62547b10 100644
--- a/drivers/cpufreq/arm_big_little.c
+++ b/drivers/cpufreq/arm_big_little.c
@@ -540,7 +540,7 @@ static void bL_cpufreq_ready(struct cpufreq_policy *policy)
&power_coefficient);

cdev[cur_cluster] = of_cpufreq_power_cooling_register(np,
- policy->related_cpus, power_coefficient, NULL);
+ policy, power_coefficient, NULL);
if (IS_ERR(cdev[cur_cluster])) {
dev_err(cpu_dev,
"running cpufreq without cooling device: %ld\n",
diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index c943787d761e..fef3c2160691 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -326,7 +326,7 @@ static void cpufreq_ready(struct cpufreq_policy *policy)
&power_coefficient);

priv->cdev = of_cpufreq_power_cooling_register(np,
- policy->related_cpus, power_coefficient, NULL);
+ policy, power_coefficient, NULL);
if (IS_ERR(priv->cdev)) {
dev_err(priv->cpu_dev,
"running cpufreq without cooling device: %ld\n",
diff --git a/drivers/cpufreq/dbx500-cpufreq.c b/drivers/cpufreq/dbx500-cpufreq.c
index 3575b82210ba..4ee0431579c1 100644
--- a/drivers/cpufreq/dbx500-cpufreq.c
+++ b/drivers/cpufreq/dbx500-cpufreq.c
@@ -43,7 +43,7 @@ static int dbx500_cpufreq_exit(struct cpufreq_policy *policy)

static void dbx500_cpufreq_ready(struct cpufreq_policy *policy)
{
- cdev = cpufreq_cooling_register(policy->cpus);
+ cdev = cpufreq_cooling_register(policy);
if (IS_ERR(cdev))
pr_err("Failed to register cooling device %ld\n", PTR_ERR(cdev));
else
diff --git a/drivers/cpufreq/mt8173-cpufreq.c b/drivers/cpufreq/mt8173-cpufreq.c
index fd1886faf33a..f9f00fb4bc3a 100644
--- a/drivers/cpufreq/mt8173-cpufreq.c
+++ b/drivers/cpufreq/mt8173-cpufreq.c
@@ -320,9 +320,7 @@ static void mtk_cpufreq_ready(struct cpufreq_policy *policy)
of_property_read_u32(np, DYNAMIC_POWER, &capacitance);

info->cdev = of_cpufreq_power_cooling_register(np,
- policy->related_cpus,
- capacitance,
- NULL);
+ policy, capacitance, NULL);

if (IS_ERR(info->cdev)) {
dev_err(info->cpu_dev,
diff --git a/drivers/cpufreq/qoriq-cpufreq.c b/drivers/cpufreq/qoriq-cpufreq.c
index e2ea433a5f9c..4ada55b8856e 100644
--- a/drivers/cpufreq/qoriq-cpufreq.c
+++ b/drivers/cpufreq/qoriq-cpufreq.c
@@ -278,8 +278,7 @@ static void qoriq_cpufreq_ready(struct cpufreq_policy *policy)
struct device_node *np = of_get_cpu_node(policy->cpu, NULL);

if (of_find_property(np, "#cooling-cells", NULL)) {
- cpud->cdev = of_cpufreq_cooling_register(np,
- policy->related_cpus);
+ cpud->cdev = of_cpufreq_cooling_register(np, policy);

if (IS_ERR(cpud->cdev) && PTR_ERR(cpud->cdev) != -ENOSYS) {
pr_err("cpu%d is not running as cooling device: %ld\n",
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 46e90122b746..a97ebb7bf27f 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -718,7 +718,7 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
/**
* __cpufreq_cooling_register - helper function to create cpufreq cooling device
* @np: a valid struct device_node to the cooling device device tree node
- * @clip_cpus: cpumask of cpus where the frequency constraints will happen.
+ * @policy: cpufreq policy
* Normally this should be same as cpufreq policy->related_cpus.
* @capacitance: dynamic power coefficient for these cpus
* @plat_static_func: function to calculate the static power consumed by these
@@ -734,44 +734,28 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
*/
static struct thermal_cooling_device *
__cpufreq_cooling_register(struct device_node *np,
- const struct cpumask *clip_cpus, u32 capacitance,
+ struct cpufreq_policy *policy, u32 capacitance,
get_static_t plat_static_func)
{
- struct cpufreq_policy *policy;
struct thermal_cooling_device *cdev;
struct cpufreq_cooling_device *cpufreq_dev;
char dev_name[THERMAL_NAME_LENGTH];
struct cpufreq_frequency_table *pos, *table;
- cpumask_var_t temp_mask;
unsigned int freq, i, num_cpus;
int ret;
struct thermal_cooling_device_ops *cooling_ops;

- if (!alloc_cpumask_var(&temp_mask, GFP_KERNEL))
- return ERR_PTR(-ENOMEM);
-
- cpumask_and(temp_mask, clip_cpus, cpu_online_mask);
- policy = cpufreq_cpu_get(cpumask_first(temp_mask));
- if (!policy) {
- pr_debug("%s: CPUFreq policy not found\n", __func__);
- cdev = ERR_PTR(-EPROBE_DEFER);
- goto free_cpumask;
- }
-
table = policy->freq_table;
if (!table) {
pr_debug("%s: CPUFreq table not found\n", __func__);
- cdev = ERR_PTR(-ENODEV);
- goto put_policy;
+ return ERR_PTR(-ENODEV);
}

cpufreq_dev = kzalloc(sizeof(*cpufreq_dev), GFP_KERNEL);
- if (!cpufreq_dev) {
- cdev = ERR_PTR(-ENOMEM);
- goto put_policy;
- }
+ if (!cpufreq_dev)
+ return ERR_PTR(-ENOMEM);

- num_cpus = cpumask_weight(clip_cpus);
+ num_cpus = cpumask_weight(policy->related_cpus);
cpufreq_dev->time_in_idle = kcalloc(num_cpus,
sizeof(*cpufreq_dev->time_in_idle),
GFP_KERNEL);
@@ -802,7 +786,7 @@ __cpufreq_cooling_register(struct device_node *np,
/* max_level is an index, not a counter */
cpufreq_dev->max_level--;

- cpumask_copy(&cpufreq_dev->allowed_cpus, clip_cpus);
+ cpumask_copy(&cpufreq_dev->allowed_cpus, policy->related_cpus);

if (capacitance) {
cpufreq_dev->plat_get_static_power = plat_static_func;
@@ -857,7 +841,7 @@ __cpufreq_cooling_register(struct device_node *np,
CPUFREQ_POLICY_NOTIFIER);
mutex_unlock(&cooling_list_lock);

- goto put_policy;
+ return cdev;

remove_ida:
ida_simple_remove(&cpufreq_ida, cpufreq_dev->id);
@@ -871,16 +855,12 @@ __cpufreq_cooling_register(struct device_node *np,
kfree(cpufreq_dev->time_in_idle);
free_cdev:
kfree(cpufreq_dev);
-put_policy:
- cpufreq_cpu_put(policy);
-free_cpumask:
- free_cpumask_var(temp_mask);
return cdev;
}

/**
* cpufreq_cooling_register - function to create cpufreq cooling device.
- * @clip_cpus: cpumask of cpus where the frequency constraints will happen.
+ * @policy: cpufreq policy
*
* This interface function registers the cpufreq cooling device with the name
* "thermal-cpufreq-%x". This api can support multiple instances of cpufreq
@@ -890,16 +870,16 @@ __cpufreq_cooling_register(struct device_node *np,
* on failure, it returns a corresponding ERR_PTR().
*/
struct thermal_cooling_device *
-cpufreq_cooling_register(const struct cpumask *clip_cpus)
+cpufreq_cooling_register(struct cpufreq_policy *policy)
{
- return __cpufreq_cooling_register(NULL, clip_cpus, 0, NULL);
+ return __cpufreq_cooling_register(NULL, policy, 0, NULL);
}
EXPORT_SYMBOL_GPL(cpufreq_cooling_register);

/**
* of_cpufreq_cooling_register - function to create cpufreq cooling device.
* @np: a valid struct device_node to the cooling device device tree node
- * @clip_cpus: cpumask of cpus where the frequency constraints will happen.
+ * @policy: cpufreq policy
*
* This interface function registers the cpufreq cooling device with the name
* "thermal-cpufreq-%x". This api can support multiple instances of cpufreq
@@ -911,18 +891,18 @@ EXPORT_SYMBOL_GPL(cpufreq_cooling_register);
*/
struct thermal_cooling_device *
of_cpufreq_cooling_register(struct device_node *np,
- const struct cpumask *clip_cpus)
+ struct cpufreq_policy *policy)
{
if (!np)
return ERR_PTR(-EINVAL);

- return __cpufreq_cooling_register(np, clip_cpus, 0, NULL);
+ return __cpufreq_cooling_register(np, policy, 0, NULL);
}
EXPORT_SYMBOL_GPL(of_cpufreq_cooling_register);

/**
* cpufreq_power_cooling_register() - create cpufreq cooling device with power extensions
- * @clip_cpus: cpumask of cpus where the frequency constraints will happen
+ * @policy: cpufreq policy
* @capacitance: dynamic power coefficient for these cpus
* @plat_static_func: function to calculate the static power consumed by these
* cpus (optional)
@@ -942,10 +922,10 @@ EXPORT_SYMBOL_GPL(of_cpufreq_cooling_register);
* on failure, it returns a corresponding ERR_PTR().
*/
struct thermal_cooling_device *
-cpufreq_power_cooling_register(const struct cpumask *clip_cpus, u32 capacitance,
+cpufreq_power_cooling_register(struct cpufreq_policy *policy, u32 capacitance,
get_static_t plat_static_func)
{
- return __cpufreq_cooling_register(NULL, clip_cpus, capacitance,
+ return __cpufreq_cooling_register(NULL, policy, capacitance,
plat_static_func);
}
EXPORT_SYMBOL(cpufreq_power_cooling_register);
@@ -953,7 +933,7 @@ EXPORT_SYMBOL(cpufreq_power_cooling_register);
/**
* of_cpufreq_power_cooling_register() - create cpufreq cooling device with power extensions
* @np: a valid struct device_node to the cooling device device tree node
- * @clip_cpus: cpumask of cpus where the frequency constraints will happen
+ * @policy: cpufreq policy
* @capacitance: dynamic power coefficient for these cpus
* @plat_static_func: function to calculate the static power consumed by these
* cpus (optional)
@@ -975,14 +955,14 @@ EXPORT_SYMBOL(cpufreq_power_cooling_register);
*/
struct thermal_cooling_device *
of_cpufreq_power_cooling_register(struct device_node *np,
- const struct cpumask *clip_cpus,
+ struct cpufreq_policy *policy,
u32 capacitance,
get_static_t plat_static_func)
{
if (!np)
return ERR_PTR(-EINVAL);

- return __cpufreq_cooling_register(np, clip_cpus, capacitance,
+ return __cpufreq_cooling_register(np, policy, capacitance,
plat_static_func);
}
EXPORT_SYMBOL(of_cpufreq_power_cooling_register);
diff --git a/drivers/thermal/imx_thermal.c b/drivers/thermal/imx_thermal.c
index fb648a45754e..f7ec39f46ee4 100644
--- a/drivers/thermal/imx_thermal.c
+++ b/drivers/thermal/imx_thermal.c
@@ -8,6 +8,7 @@
*/

#include <linux/clk.h>
+#include <linux/cpufreq.h>
#include <linux/cpu_cooling.h>
#include <linux/delay.h>
#include <linux/device.h>
@@ -88,6 +89,7 @@ static struct thermal_soc_data thermal_imx6sx_data = {
};

struct imx_thermal_data {
+ struct cpufreq_policy *policy;
struct thermal_zone_device *tz;
struct thermal_cooling_device *cdev;
enum thermal_device_mode mode;
@@ -525,13 +527,18 @@ static int imx_thermal_probe(struct platform_device *pdev)
regmap_write(map, MISC0 + REG_SET, MISC0_REFTOP_SELBIASOFF);
regmap_write(map, TEMPSENSE0 + REG_SET, TEMPSENSE0_POWER_DOWN);

- data->cdev = cpufreq_cooling_register(cpu_present_mask);
+ data->policy = cpufreq_cpu_get(0);
+ if (!data->policy) {
+ pr_debug("%s: CPUFreq policy not found\n", __func__);
+ return -EPROBE_DEFER;
+ }
+
+ data->cdev = cpufreq_cooling_register(data->policy);
if (IS_ERR(data->cdev)) {
ret = PTR_ERR(data->cdev);
- if (ret != -EPROBE_DEFER)
- dev_err(&pdev->dev,
- "failed to register cpufreq cooling device: %d\n",
- ret);
+ dev_err(&pdev->dev,
+ "failed to register cpufreq cooling device: %d\n", ret);
+ cpufreq_cpu_put(data->policy);
return ret;
}

@@ -542,6 +549,7 @@ static int imx_thermal_probe(struct platform_device *pdev)
dev_err(&pdev->dev,
"failed to get thermal clk: %d\n", ret);
cpufreq_cooling_unregister(data->cdev);
+ cpufreq_cpu_put(data->policy);
return ret;
}

@@ -556,6 +564,7 @@ static int imx_thermal_probe(struct platform_device *pdev)
if (ret) {
dev_err(&pdev->dev, "failed to enable thermal clk: %d\n", ret);
cpufreq_cooling_unregister(data->cdev);
+ cpufreq_cpu_put(data->policy);
return ret;
}

@@ -571,6 +580,7 @@ static int imx_thermal_probe(struct platform_device *pdev)
"failed to register thermal zone device %d\n", ret);
clk_disable_unprepare(data->thermal_clk);
cpufreq_cooling_unregister(data->cdev);
+ cpufreq_cpu_put(data->policy);
return ret;
}

@@ -599,6 +609,7 @@ static int imx_thermal_probe(struct platform_device *pdev)
clk_disable_unprepare(data->thermal_clk);
thermal_zone_device_unregister(data->tz);
cpufreq_cooling_unregister(data->cdev);
+ cpufreq_cpu_put(data->policy);
return ret;
}

@@ -620,6 +631,7 @@ static int imx_thermal_remove(struct platform_device *pdev)

thermal_zone_device_unregister(data->tz);
cpufreq_cooling_unregister(data->cdev);
+ cpufreq_cpu_put(data->policy);

return 0;
}
diff --git a/drivers/thermal/ti-soc-thermal/ti-thermal-common.c b/drivers/thermal/ti-soc-thermal/ti-thermal-common.c
index 0586bd0f2bab..cfc851d76d4b 100644
--- a/drivers/thermal/ti-soc-thermal/ti-thermal-common.c
+++ b/drivers/thermal/ti-soc-thermal/ti-thermal-common.c
@@ -28,6 +28,7 @@
#include <linux/kernel.h>
#include <linux/workqueue.h>
#include <linux/thermal.h>
+#include <linux/cpufreq.h>
#include <linux/cpumask.h>
#include <linux/cpu_cooling.h>
#include <linux/of.h>
@@ -37,6 +38,7 @@

/* common data structures */
struct ti_thermal_data {
+ struct cpufreq_policy *policy;
struct thermal_zone_device *ti_thermal;
struct thermal_zone_device *pcb_tz;
struct thermal_cooling_device *cool_dev;
@@ -395,15 +397,19 @@ int ti_thermal_register_cpu_cooling(struct ti_bandgap *bgp, int id)
if (!data)
return -EINVAL;

+ data->policy = cpufreq_cpu_get(0);
+ if (!data->policy) {
+ pr_debug("%s: CPUFreq policy not found\n", __func__);
+ return -EPROBE_DEFER;
+ }
+
/* Register cooling device */
- data->cool_dev = cpufreq_cooling_register(cpu_present_mask);
+ data->cool_dev = cpufreq_cooling_register(data->policy);
if (IS_ERR(data->cool_dev)) {
int ret = PTR_ERR(data->cool_dev);
-
- if (ret != -EPROBE_DEFER)
- dev_err(bgp->dev,
- "Failed to register cpu cooling device %d\n",
- ret);
+ dev_err(bgp->dev, "Failed to register cpu cooling device %d\n",
+ ret);
+ cpufreq_cpu_put(data->policy);

return ret;
}
@@ -418,8 +424,10 @@ int ti_thermal_unregister_cpu_cooling(struct ti_bandgap *bgp, int id)

data = ti_bandgap_get_sensor_data(bgp, id);

- if (data)
+ if (data) {
cpufreq_cooling_unregister(data->cool_dev);
+ cpufreq_cpu_put(data->policy);
+ }

return 0;
}
diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
index 96c5e4c2f9c8..d4292ebc5c8b 100644
--- a/include/linux/cpu_cooling.h
+++ b/include/linux/cpu_cooling.h
@@ -28,47 +28,49 @@
#include <linux/thermal.h>
#include <linux/cpumask.h>

+struct cpufreq_policy;
+
typedef int (*get_static_t)(cpumask_t *cpumask, int interval,
unsigned long voltage, u32 *power);

#ifdef CONFIG_CPU_THERMAL
/**
* cpufreq_cooling_register - function to create cpufreq cooling device.
- * @clip_cpus: cpumask of cpus where the frequency constraints will happen
+ * @policy: cpufreq policy.
*/
struct thermal_cooling_device *
-cpufreq_cooling_register(const struct cpumask *clip_cpus);
+cpufreq_cooling_register(struct cpufreq_policy *policy);

struct thermal_cooling_device *
-cpufreq_power_cooling_register(const struct cpumask *clip_cpus,
+cpufreq_power_cooling_register(struct cpufreq_policy *policy,
u32 capacitance, get_static_t plat_static_func);

/**
* of_cpufreq_cooling_register - create cpufreq cooling device based on DT.
* @np: a valid struct device_node to the cooling device device tree node.
- * @clip_cpus: cpumask of cpus where the frequency constraints will happen
+ * @policy: cpufreq policy.
*/
#ifdef CONFIG_THERMAL_OF
struct thermal_cooling_device *
of_cpufreq_cooling_register(struct device_node *np,
- const struct cpumask *clip_cpus);
+ struct cpufreq_policy *policy);

struct thermal_cooling_device *
of_cpufreq_power_cooling_register(struct device_node *np,
- const struct cpumask *clip_cpus,
+ struct cpufreq_policy *policy,
u32 capacitance,
get_static_t plat_static_func);
#else
static inline struct thermal_cooling_device *
of_cpufreq_cooling_register(struct device_node *np,
- const struct cpumask *clip_cpus)
+ struct cpufreq_policy *policy)
{
return ERR_PTR(-ENOSYS);
}

static inline struct thermal_cooling_device *
of_cpufreq_power_cooling_register(struct device_node *np,
- const struct cpumask *clip_cpus,
+ struct cpufreq_policy *policy,
u32 capacitance,
get_static_t plat_static_func)
{
@@ -84,12 +86,12 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev);

#else /* !CONFIG_CPU_THERMAL */
static inline struct thermal_cooling_device *
-cpufreq_cooling_register(const struct cpumask *clip_cpus)
+cpufreq_cooling_register(struct cpufreq_policy *policy)
{
return ERR_PTR(-ENOSYS);
}
static inline struct thermal_cooling_device *
-cpufreq_power_cooling_register(const struct cpumask *clip_cpus,
+cpufreq_power_cooling_register(struct cpufreq_policy *policy,
u32 capacitance, get_static_t plat_static_func)
{
return NULL;
@@ -97,14 +99,14 @@ cpufreq_power_cooling_register(const struct cpumask *clip_cpus,

static inline struct thermal_cooling_device *
of_cpufreq_cooling_register(struct device_node *np,
- const struct cpumask *clip_cpus)
+ struct cpufreq_policy *policy)
{
return ERR_PTR(-ENOSYS);
}

static inline struct thermal_cooling_device *
of_cpufreq_power_cooling_register(struct device_node *np,
- const struct cpumask *clip_cpus,
+ struct cpufreq_policy *policy,
u32 capacitance,
get_static_t plat_static_func)
{
--
2.7.1.410.g6faf27b
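
With this change the thermal-side callers (imx_thermal, ti-thermal-common) take the policy reference themselves and hold it until the cooling device is unregistered. The following is a minimal sketch of that ownership pattern, not the driver code: every type and function name is a hypothetical stand-in, with policy_get()/policy_put() modelling cpufreq_cpu_get()/cpufreq_cpu_put() and plain negative integers standing in for the kernel error codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
struct policy {
	int refcount;
	const unsigned int *freq_table;	/* NULL when no table is available */
};

struct cooling_dev {
	struct policy *policy;	/* borrowed; the caller holds the reference */
	int registered;
};

/* Models cpufreq_cpu_get(): take a reference, or return NULL if absent. */
static struct policy *policy_get(struct policy *p)
{
	if (p)
		p->refcount++;
	return p;
}

/* Models cpufreq_cpu_put(): drop the reference taken by policy_get(). */
static void policy_put(struct policy *p)
{
	assert(p != NULL && p->refcount > 0);
	p->refcount--;
}

/* Models registration that receives the policy directly. */
static int cooling_register(struct cooling_dev *cdev, struct policy *p)
{
	if (!p->freq_table)
		return -1;	/* stands in for -ENODEV */
	cdev->policy = p;
	cdev->registered = 1;
	return 0;
}

static void cooling_unregister(struct cooling_dev *cdev)
{
	cdev->registered = 0;
	cdev->policy = NULL;
}

/* Probe-style caller: keep the reference for as long as the device exists. */
static int probe(struct cooling_dev *cdev, struct policy *available)
{
	struct policy *p = policy_get(available);

	if (!p)
		return -2;	/* stands in for -EPROBE_DEFER */

	if (cooling_register(cdev, p)) {
		policy_put(p);	/* error path: drop the reference again */
		return -1;
	}
	return 0;
}

/* Remove-style caller: unregister first, then drop the reference. */
static void remove_dev(struct cooling_dev *cdev)
{
	struct policy *p = cdev->policy;

	cooling_unregister(cdev);
	policy_put(p);
}
```

Because the reference is only dropped after unregistration, the cooling device never outlives the policy it points to, which is the lifetime problem the commit message describes.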

2017-03-16 05:33:08

by Viresh Kumar

Subject: [PATCH 10/17] thermal: cpu_cooling: OPPs are registered for all CPUs

The OPPs are registered for all CPUs of a cpufreq policy now and we
don't need to run the loop in build_dyn_power_table(). Just check for
the policy->cpu and we should be fine.

Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/thermal/cpu_cooling.c | 26 +++++++++++---------------
1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 7590279bf1de..1df6c9039e45 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -83,7 +83,7 @@ struct power_table {
* @dyn_power_table: array of struct power_table for frequency to power
* conversion, sorted in ascending order.
* @dyn_power_table_entries: number of entries in the @dyn_power_table array
- * @cpu_dev: the first cpu_device from @allowed_cpus that has OPPs registered
+ * @cpu_dev: the cpu_device of policy->cpu.
* @plat_get_static_power: callback to calculate the static power
*
* This structure is required for keeping information of each registered
@@ -208,24 +208,20 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_dev,
struct power_table *power_table;
struct dev_pm_opp *opp;
struct device *dev = NULL;
- int num_opps = 0, cpu, i, ret = 0;
+ int num_opps = 0, cpu = cpufreq_dev->policy->cpu, i, ret = 0;
unsigned long freq;

- for_each_cpu(cpu, &cpufreq_dev->allowed_cpus) {
- dev = get_cpu_device(cpu);
- if (!dev) {
- dev_warn(&cpufreq_dev->cdev->device,
- "No cpu device for cpu %d\n", cpu);
- continue;
- }
-
- num_opps = dev_pm_opp_get_opp_count(dev);
- if (num_opps > 0)
- break;
- else if (num_opps < 0)
- return num_opps;
+ dev = get_cpu_device(cpu);
+ if (unlikely(!dev)) {
+ dev_warn(&cpufreq_dev->cdev->device,
+ "No cpu device for cpu %d\n", cpu);
+ return -ENODEV;
}

+ num_opps = dev_pm_opp_get_opp_count(dev);
+ if (num_opps < 0)
+ return num_opps;
+
if (num_opps == 0)
return -EINVAL;

--
2.7.1.410.g6faf27b
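
The simplification above relies on every CPU of a policy having OPPs registered. A small model of the two lookups, under that assumption, is sketched below; the opp_count array, struct policy_model and both function names are hypothetical and only illustrate the control flow of build_dyn_power_table() before and after the change.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical model: opp_count[cpu] is the number of OPPs registered
 * for that CPU, or a negative value on error. */
struct policy_model {
	unsigned int cpu;		/* CPU managing the policy (policy->cpu) */
	unsigned int related[NR_CPUS];	/* 1 if the CPU belongs to the policy */
};

/* Old approach: scan the policy's CPUs for the first one with OPPs. */
static int num_opps_scan(const struct policy_model *p, const int *opp_count)
{
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!p->related[cpu])
			continue;
		if (opp_count[cpu] != 0)
			return opp_count[cpu];	/* count, or propagated error */
	}
	return -1;	/* stands in for -EINVAL: no OPPs found */
}

/* New approach: query policy->cpu only. */
static int num_opps_direct(const struct policy_model *p, const int *opp_count)
{
	int n;

	assert(p != NULL && p->cpu < NR_CPUS);

	n = opp_count[p->cpu];
	if (n < 0)
		return n;	/* propagate the error */
	if (n == 0)
		return -1;	/* stands in for -EINVAL */
	return n;
}
```

When all CPUs of the policy carry the same OPP table, both functions return the same count. If only some other CPU in the policy had OPPs, the direct lookup would fail where the scan succeeded, so the precondition stated in the commit message matters.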

2017-04-11 06:02:37

by Viresh Kumar

Subject: Re: [PATCH 00/17] thermal: cpu_cooling: improve interaction with cpufreq core

On 16-03-17, 10:56, Viresh Kumar wrote:
> Hi Guys,
>
> The cpu_cooling driver is designed to use CPU frequency scaling to avoid
> high thermal states for a platform. But it wasn't glued really well with
> cpufreq core.
>
> This series tries to improve interactions between cpufreq core and
> cpu_cooling driver and does some fixes/cleanups to the cpu_cooling
> driver.

Thermal guys, Ping !!

--
viresh

2017-04-11 17:33:39

by Eduardo Valentin

Subject: Re: [PATCH 03/17] thermal: cpu_cooling: Replace cpufreq_device with cpufreq_dev

On Thu, Mar 16, 2017 at 10:59:38AM +0530, Viresh Kumar wrote:
> Objects of "struct cpufreq_cooling_device" are named a bit
> inconsistently. Let's use cpufreq_dev everywhere.
>

Naming is always a matter of taste, but I frankly cannot see how this
patch improves anything here. cpufreq_dev is still misleading.

Here are other options:
cpufreq_cooling_dev: not misleading, but maybe a bit too long.
cpufreq_cdev: better, but still, not a char dev

> Signed-off-by: Viresh Kumar <[email protected]>



2017-04-11 17:35:23

by Eduardo Valentin

Subject: Re: [PATCH 04/17] thermal: cpu_cooling: replace cool_dev with cdev

On Thu, Mar 16, 2017 at 10:59:39AM +0530, Viresh Kumar wrote:
> Objects of "struct thermal_cooling_device" are named a bit
> inconsistently. Let's use cdev everywhere.

In this case, cpufreq_cdev is the best option for patch 3. Too bad we lost
the "cool" dev here :-)

>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/thermal/cpu_cooling.c | 36 ++++++++++++++++++------------------
> 1 file changed, 18 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index 7a19033d7f79..e2931c20c309 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -65,7 +65,7 @@ struct power_table {
> * struct cpufreq_cooling_device - data for cooling device with cpufreq
> * @id: unique integer value corresponding to each cpufreq_cooling_device
> * registered.
> - * @cool_dev: thermal_cooling_device pointer to keep track of the
> + * @cdev: thermal_cooling_device pointer to keep track of the
> * registered cooling device.
> * @cpufreq_state: integer value representing the current state of cpufreq
> * cooling devices.
> @@ -90,7 +90,7 @@ struct power_table {
> */
> struct cpufreq_cooling_device {
> int id;
> - struct thermal_cooling_device *cool_dev;
> + struct thermal_cooling_device *cdev;
> unsigned int cpufreq_state;
> unsigned int clipped_freq;
> unsigned int max_level;
> @@ -243,7 +243,7 @@ static int build_dyn_power_table(struct cpufreq_cooling_device *cpufreq_dev,
> for_each_cpu(cpu, &cpufreq_dev->allowed_cpus) {
> dev = get_cpu_device(cpu);
> if (!dev) {
> - dev_warn(&cpufreq_dev->cool_dev->device,
> + dev_warn(&cpufreq_dev->cdev->device,
> "No cpu device for cpu %d\n", cpu);
> continue;
> }
> @@ -770,7 +770,7 @@ __cpufreq_cooling_register(struct device_node *np,
> get_static_t plat_static_func)
> {
> struct cpufreq_policy *policy;
> - struct thermal_cooling_device *cool_dev;
> + struct thermal_cooling_device *cdev;
> struct cpufreq_cooling_device *cpufreq_dev;
> char dev_name[THERMAL_NAME_LENGTH];
> struct cpufreq_frequency_table *pos, *table;
> @@ -786,20 +786,20 @@ __cpufreq_cooling_register(struct device_node *np,
> policy = cpufreq_cpu_get(cpumask_first(temp_mask));
> if (!policy) {
> pr_debug("%s: CPUFreq policy not found\n", __func__);
> - cool_dev = ERR_PTR(-EPROBE_DEFER);
> + cdev = ERR_PTR(-EPROBE_DEFER);
> goto free_cpumask;
> }
>
> table = policy->freq_table;
> if (!table) {
> pr_debug("%s: CPUFreq table not found\n", __func__);
> - cool_dev = ERR_PTR(-ENODEV);
> + cdev = ERR_PTR(-ENODEV);
> goto put_policy;
> }
>
> cpufreq_dev = kzalloc(sizeof(*cpufreq_dev), GFP_KERNEL);
> if (!cpufreq_dev) {
> - cool_dev = ERR_PTR(-ENOMEM);
> + cdev = ERR_PTR(-ENOMEM);
> goto put_policy;
> }
>
> @@ -808,7 +808,7 @@ __cpufreq_cooling_register(struct device_node *np,
> sizeof(*cpufreq_dev->time_in_idle),
> GFP_KERNEL);
> if (!cpufreq_dev->time_in_idle) {
> - cool_dev = ERR_PTR(-ENOMEM);
> + cdev = ERR_PTR(-ENOMEM);
> goto free_cdev;
> }
>
> @@ -816,7 +816,7 @@ __cpufreq_cooling_register(struct device_node *np,
> kcalloc(num_cpus, sizeof(*cpufreq_dev->time_in_idle_timestamp),
> GFP_KERNEL);
> if (!cpufreq_dev->time_in_idle_timestamp) {
> - cool_dev = ERR_PTR(-ENOMEM);
> + cdev = ERR_PTR(-ENOMEM);
> goto free_time_in_idle;
> }
>
> @@ -827,7 +827,7 @@ __cpufreq_cooling_register(struct device_node *np,
> cpufreq_dev->freq_table = kmalloc(sizeof(*cpufreq_dev->freq_table) *
> cpufreq_dev->max_level, GFP_KERNEL);
> if (!cpufreq_dev->freq_table) {
> - cool_dev = ERR_PTR(-ENOMEM);
> + cdev = ERR_PTR(-ENOMEM);
> goto free_time_in_idle_timestamp;
> }
>
> @@ -841,7 +841,7 @@ __cpufreq_cooling_register(struct device_node *np,
>
> ret = build_dyn_power_table(cpufreq_dev, capacitance);
> if (ret) {
> - cool_dev = ERR_PTR(ret);
> + cdev = ERR_PTR(ret);
> goto free_table;
> }
>
> @@ -852,7 +852,7 @@ __cpufreq_cooling_register(struct device_node *np,
>
> ret = ida_simple_get(&cpufreq_ida, 0, 0, GFP_KERNEL);
> if (ret < 0) {
> - cool_dev = ERR_PTR(ret);
> + cdev = ERR_PTR(ret);
> goto free_power_table;
> }
> cpufreq_dev->id = ret;
> @@ -872,13 +872,13 @@ __cpufreq_cooling_register(struct device_node *np,
> snprintf(dev_name, sizeof(dev_name), "thermal-cpufreq-%d",
> cpufreq_dev->id);
>
> - cool_dev = thermal_of_cooling_device_register(np, dev_name, cpufreq_dev,
> - cooling_ops);
> - if (IS_ERR(cool_dev))
> + cdev = thermal_of_cooling_device_register(np, dev_name, cpufreq_dev,
> + cooling_ops);
> + if (IS_ERR(cdev))
> goto remove_ida;
>
> cpufreq_dev->clipped_freq = cpufreq_dev->freq_table[0];
> - cpufreq_dev->cool_dev = cool_dev;
> + cpufreq_dev->cdev = cdev;
>
> mutex_lock(&cooling_list_lock);
> list_add(&cpufreq_dev->node, &cpufreq_dev_list);
> @@ -907,7 +907,7 @@ __cpufreq_cooling_register(struct device_node *np,
> cpufreq_cpu_put(policy);
> free_cpumask:
> free_cpumask_var(temp_mask);
> - return cool_dev;
> + return cdev;
> }
>
> /**
> @@ -1043,7 +1043,7 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
> list_del(&cpufreq_dev->node);
> mutex_unlock(&cooling_list_lock);
>
> - thermal_cooling_device_unregister(cpufreq_dev->cool_dev);
> + thermal_cooling_device_unregister(cpufreq_dev->cdev);
> ida_simple_remove(&cpufreq_ida, cpufreq_dev->id);
> kfree(cpufreq_dev->dyn_power_table);
> kfree(cpufreq_dev->time_in_idle_timestamp);
> --
> 2.7.1.410.g6faf27b
>



2017-04-11 17:43:52

by Eduardo Valentin

Subject: Re: [PATCH 05/17] thermal: cpu_cooling: remove cpufreq_cooling_get_level()

On Thu, Mar 16, 2017 at 10:59:40AM +0530, Viresh Kumar wrote:
> There is only one user of cpufreq_cooling_get_level() and that already
> has pointer to the cpufreq_dev structure. It can directly call
> get_level() instead and we can get rid of cpufreq_cooling_get_level().
>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/thermal/cpu_cooling.c | 33 +--------------------------------
> include/linux/cpu_cooling.h | 6 ------
> 2 files changed, 1 insertion(+), 38 deletions(-)
>
> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> index e2931c20c309..99dc6833de75 100644
> --- a/drivers/thermal/cpu_cooling.c
> +++ b/drivers/thermal/cpu_cooling.c
> @@ -137,37 +137,6 @@ static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_dev,
> }
>
> /**
> - * cpufreq_cooling_get_level - for a given cpu, return the cooling level.
> - * @cpu: cpu for which the level is required
> - * @freq: the frequency of interest
> - *
> - * This function will match the cooling level corresponding to the
> - * requested @freq and return it.
> - *
> - * Return: The matched cooling level on success or THERMAL_CSTATE_INVALID
> - * otherwise.
> - */
> -unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq)
> -{
> - struct cpufreq_cooling_device *cpufreq_dev;
> -
> - mutex_lock(&cooling_list_lock);
> - list_for_each_entry(cpufreq_dev, &cpufreq_dev_list, node) {
> - if (cpumask_test_cpu(cpu, &cpufreq_dev->allowed_cpus)) {
> - unsigned long level = get_level(cpufreq_dev, freq);
> -
> - mutex_unlock(&cooling_list_lock);
> - return level;
> - }
> - }
> - mutex_unlock(&cooling_list_lock);
> -
> - pr_err("%s: cpu:%d not part of any cooling device\n", __func__, cpu);
> - return THERMAL_CSTATE_INVALID;
> -}
> -EXPORT_SYMBOL_GPL(cpufreq_cooling_get_level);
> -
> -/**
> * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
> * @nb: struct notifier_block * with callback info.
> * @event: value showing cpufreq event for which this function invoked.
> @@ -698,7 +667,7 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
> normalised_power = (dyn_power * 100) / last_load;
> target_freq = cpu_power_to_freq(cpufreq_dev, normalised_power);
>
> - *state = cpufreq_cooling_get_level(cpu, target_freq);
> + *state = get_level(cpufreq_dev, target_freq);

Did I miss something, or are we losing semantics here?

I guess the idea at this point is to get the level corresponding to the
frequency on a specific cpu. Let's have a look at get_level().

I guess now we can rely on the freq table held in the
cpufreq_cooling_device.

> if (*state == THERMAL_CSTATE_INVALID) {
> dev_err_ratelimited(&cdev->device,
> "Failed to convert %dKHz for cpu %d into a cdev state\n",
> diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
> index c156f5082758..96c5e4c2f9c8 100644
> --- a/include/linux/cpu_cooling.h
> +++ b/include/linux/cpu_cooling.h
> @@ -82,7 +82,6 @@ of_cpufreq_power_cooling_register(struct device_node *np,
> */
> void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev);
>
> -unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq);
> #else /* !CONFIG_CPU_THERMAL */
> static inline struct thermal_cooling_device *
> cpufreq_cooling_register(const struct cpumask *clip_cpus)
> @@ -117,11 +116,6 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
> {
> return;
> }
> -static inline
> -unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq)
> -{
> - return THERMAL_CSTATE_INVALID;
> -}
> #endif /* CONFIG_CPU_THERMAL */
>
> #endif /* __CPU_COOLING_H__ */
> --
> 2.7.1.410.g6faf27b
>



2017-04-12 06:17:00

by Viresh Kumar

Subject: Re: [PATCH 03/17] thermal: cpu_cooling: Replace cpufreq_device with cpufreq_dev

On 11-04-17, 10:33, Eduardo Valentin wrote:
> On Thu, Mar 16, 2017 at 10:59:38AM +0530, Viresh Kumar wrote:
> > Objects of "struct cpufreq_cooling_device" are named a bit
> > inconsistently. Let's use cpufreq_dev everywhere.
> >
>
> Naming is always a matter of taste, but I frankly cannot see how this
> patch improves anything here. cpufreq_dev is still misleading.

I wasn't trying to get to a better name, but rather be consistent.

> Here are other options:
> cpufreq_cooling_dev: not misleading, but maybe a bit too long.
> cpufreq_cdev: better, but still, not a char dev

I will go for the second one.

--
viresh

2017-04-12 06:25:22

by Viresh Kumar

Subject: Re: [PATCH 05/17] thermal: cpu_cooling: remove cpufreq_cooling_get_level()

On 11-04-17, 10:43, Eduardo Valentin wrote:
> On Thu, Mar 16, 2017 at 10:59:40AM +0530, Viresh Kumar wrote:
> > There is only one user of cpufreq_cooling_get_level() and that already
> > has pointer to the cpufreq_dev structure. It can directly call
> > get_level() instead and we can get rid of cpufreq_cooling_get_level().
> >
> > Signed-off-by: Viresh Kumar <[email protected]>
> > ---
> > drivers/thermal/cpu_cooling.c | 33 +--------------------------------
> > include/linux/cpu_cooling.h | 6 ------
> > 2 files changed, 1 insertion(+), 38 deletions(-)
> >
> > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
> > index e2931c20c309..99dc6833de75 100644
> > --- a/drivers/thermal/cpu_cooling.c
> > +++ b/drivers/thermal/cpu_cooling.c
> > @@ -137,37 +137,6 @@ static unsigned long get_level(struct cpufreq_cooling_device *cpufreq_dev,
> > }
> >
> > /**
> > - * cpufreq_cooling_get_level - for a given cpu, return the cooling level.
> > - * @cpu: cpu for which the level is required
> > - * @freq: the frequency of interest
> > - *
> > - * This function will match the cooling level corresponding to the
> > - * requested @freq and return it.
> > - *
> > - * Return: The matched cooling level on success or THERMAL_CSTATE_INVALID
> > - * otherwise.
> > - */
> > -unsigned long cpufreq_cooling_get_level(unsigned int cpu, unsigned int freq)
> > -{
> > - struct cpufreq_cooling_device *cpufreq_dev;
> > -
> > - mutex_lock(&cooling_list_lock);
> > - list_for_each_entry(cpufreq_dev, &cpufreq_dev_list, node) {
> > - if (cpumask_test_cpu(cpu, &cpufreq_dev->allowed_cpus)) {
> > - unsigned long level = get_level(cpufreq_dev, freq);
> > -
> > - mutex_unlock(&cooling_list_lock);
> > - return level;
> > - }
> > - }
> > - mutex_unlock(&cooling_list_lock);
> > -
> > - pr_err("%s: cpu:%d not part of any cooling device\n", __func__, cpu);
> > - return THERMAL_CSTATE_INVALID;
> > -}
> > -EXPORT_SYMBOL_GPL(cpufreq_cooling_get_level);
> > -
> > -/**
> > * cpufreq_thermal_notifier - notifier callback for cpufreq policy change.
> > * @nb: struct notifier_block * with callback info.
> > * @event: value showing cpufreq event for which this function invoked.
> > @@ -698,7 +667,7 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
> > normalised_power = (dyn_power * 100) / last_load;
> > target_freq = cpu_power_to_freq(cpufreq_dev, normalised_power);
> >
> > - *state = cpufreq_cooling_get_level(cpu, target_freq);
> > + *state = get_level(cpufreq_dev, target_freq);
>
> Did I miss something, or are we losing semantics here?

I just got rid of an unnecessary wrapper routine. That's it. There shouldn't be
any functional change after this patch.

> I guess the idea at this point is to get the level corresponding to the
> frequency on a specific cpu. Let's have a look at get_level().
>
> I guess now we can rely on the freq table held in the
> cpufreq_cooling_device.

I am not sure I understood your concerns here :(

--
viresh
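
To illustrate the "no functional change" point for patch 05: the removed wrapper only searched for the cooling device that covers a CPU and then called get_level() on it, so a caller that already holds the device gets the same answer by calling get_level() directly. The sketch below models this under simplifying assumptions: the device list is a plain array, allowed_cpus is a bitmask, get_level_by_cpu() is a hypothetical stand-in for cpufreq_cooling_get_level(), and CSTATE_INVALID stands in for THERMAL_CSTATE_INVALID.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define CSTATE_INVALID ULONG_MAX	/* stand-in for THERMAL_CSTATE_INVALID */

/* Hypothetical, simplified cooling device. */
struct cooling_dev {
	unsigned long allowed_cpus;	/* bitmask of CPUs covered by the device */
	const unsigned int *freq_table;	/* sorted in descending order */
	unsigned long max_level;	/* index of the last entry */
};

/* Exact-match lookup, as get_level() behaved at this point in the series. */
static unsigned long get_level(const struct cooling_dev *dev, unsigned int freq)
{
	unsigned long level;

	assert(dev != NULL && dev->freq_table != NULL);

	for (level = 0; level <= dev->max_level; level++) {
		if (freq == dev->freq_table[level])
			return level;
		if (freq > dev->freq_table[level])
			break;	/* table is descending: no match possible */
	}
	return CSTATE_INVALID;
}

/* Models the removed wrapper: find the device for cpu, then delegate. */
static unsigned long get_level_by_cpu(const struct cooling_dev *devs, size_t n,
				      unsigned int cpu, unsigned int freq)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (devs[i].allowed_cpus & (1UL << cpu))
			return get_level(&devs[i], freq);

	return CSTATE_INVALID;	/* cpu not part of any cooling device */
}
```

The only extra outcome the wrapper could produce is the "cpu not part of any cooling device" error, which cannot occur when the caller starts from the device itself.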