This is targeted at 3.10-rc1 or linux-next just after the merge window.
All patches are pushed here for others to apply:
http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/heads/for-3.10
Currently, there can't be multiple instances of a single governor type. If we
have a multi-package system, with multiple instances of struct cpufreq_policy
(one per package), we can't have multiple instances of the same governor, i.e.
we can't run separate instances of the ondemand governor for different packages.
The governor's directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
<governor-name>/, which again reflects that there can be only one instance of a
governor type in the system.
This is a bottleneck for multi-cluster systems, where we want different packages
to use the same governor type, but with different tunables.
This patchset fixes that. The governor directory is now created at
cpu/cpu*/cpufreq/<gov>/ for platforms which have multiple instances of struct
cpufreq_policy alive at any moment. For others, the interface is kept the same:
cpu/cpufreq/<gov>/.
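To make the layout concrete, here is a condensed sketch (not part of any patch;
get_governor_parent_kobj() is added later in this series, and the wrapper
function below is only illustrative) of where a governor's attribute group ends
up once the series is applied:

/* Illustrative only: where a governor's sysfs group lands with this series. */
static int example_create_gov_sysfs(struct cpufreq_policy *policy,
				    struct attribute_group *attr_group)
{
	/*
	 * cpu/cpu<num>/cpufreq/<gov>/ when the platform has multiple
	 * policies, cpu/cpufreq/<gov>/ otherwise.
	 */
	return sysfs_create_group(get_governor_parent_kobj(policy), attr_group);
}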
@Rafael: Clearly, I don't want to have the following patch: "cpufreq: Add Kconfig
option to enable/disable have_multiple_policies". I added it only because of a
comment from Borislav to which nobody else replied :)
So please drop it if you agree with my comments on the earlier version.
V2->V3:
- Fixed value of CPUFREQ_GOV_POLICY_EXIT in the correct patch
- Dropped indentation fixes from intel_pstate.c
V1->V2:
- A few patches from V1 were already picked up by Rafael for 3.9-rc1
- Last two patches are new
- Added dbs_data->exit() routines to free up the memory used for the tuners structures.
Viresh Kumar (4):
cpufreq: Add per policy governor-init/exit infrastructure
cpufreq: governor: Implement per policy instances of governors
cpufreq: Get rid of "struct global_attr"
cpufreq: Add Kconfig option to enable/disable have_multiple_policies
drivers/cpufreq/Kconfig | 3 +
drivers/cpufreq/acpi-cpufreq.c | 9 +-
drivers/cpufreq/cpufreq.c | 27 +++--
drivers/cpufreq/cpufreq_conservative.c | 148 +++++++++++++---------
drivers/cpufreq/cpufreq_governor.c | 159 ++++++++++++++----------
drivers/cpufreq/cpufreq_governor.h | 43 +++++--
drivers/cpufreq/cpufreq_ondemand.c | 216 +++++++++++++++++++--------------
drivers/cpufreq/intel_pstate.c | 20 +--
include/linux/cpufreq.h | 44 ++++---
9 files changed, 397 insertions(+), 272 deletions(-)
--
1.7.12.rc2.18.g61b472e
Currently, there can't be multiple instances of a single governor type. If we
have a multi-package system, with multiple instances of struct cpufreq_policy
(one per package), we can't have multiple instances of the same governor, i.e.
we can't run separate instances of the ondemand governor for different packages.
The governor's directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
<governor-name>/, which again reflects that there can be only one instance of a
governor type in the system.
This is a bottleneck for multi-cluster systems, where we want different packages
to use the same governor type, but with different tunables.
This patch provides the infrastructure for that. Because the governor's resources
must now be allocated dynamically, they have to be allocated when a policy is
created and freed when it goes away, hence the new CPUFREQ_GOV_POLICY_INIT/EXIT
events.
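As a rough sketch of how a governor is expected to use the new events (this is
not from the patch; struct my_gov_data and my_governor() are made-up names, and
the real dbs governors do this inside cpufreq_governor_dbs() in the next patch):

/* Illustrative only: per-policy allocation keyed off the new events. */
struct my_gov_data {
	unsigned int sampling_rate;		/* example per-policy tunable */
};

static int my_governor(struct cpufreq_policy *policy, unsigned int event)
{
	switch (event) {
	case CPUFREQ_GOV_POLICY_INIT:
		policy->governor_data = kzalloc(sizeof(struct my_gov_data),
						GFP_KERNEL);
		return policy->governor_data ? 0 : -ENOMEM;
	case CPUFREQ_GOV_POLICY_EXIT:
		kfree(policy->governor_data);
		policy->governor_data = NULL;
		return 0;
	case CPUFREQ_GOV_START:
	case CPUFREQ_GOV_STOP:
	case CPUFREQ_GOV_LIMITS:
		/* existing behaviour, now operating on policy->governor_data */
		return 0;
	}

	return -EINVAL;
}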
Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/cpufreq.c | 21 ++++++++++++++++++---
include/linux/cpufreq.h | 9 ++++++---
2 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index b02824d..7d59688 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1070,6 +1070,8 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
/* If cpu is last user of policy, free policy */
if (cpus == 1) {
+ __cpufreq_governor(data, CPUFREQ_GOV_POLICY_EXIT);
+
lock_policy_rwsem_read(cpu);
kobj = &data->kobj;
cmp = &data->kobj_unregister;
@@ -1651,7 +1653,7 @@ EXPORT_SYMBOL(cpufreq_get_policy);
static int __cpufreq_set_policy(struct cpufreq_policy *data,
struct cpufreq_policy *policy)
{
- int ret = 0;
+ int ret = 0, failed = 1;
pr_debug("setting new policy for CPU %u: %u - %u kHz\n", policy->cpu,
policy->min, policy->max);
@@ -1705,18 +1707,31 @@ static int __cpufreq_set_policy(struct cpufreq_policy *data,
pr_debug("governor switch\n");
/* end old governor */
- if (data->governor)
+ if (data->governor) {
__cpufreq_governor(data, CPUFREQ_GOV_STOP);
+ __cpufreq_governor(data,
+ CPUFREQ_GOV_POLICY_EXIT);
+ }
/* start new governor */
data->governor = policy->governor;
- if (__cpufreq_governor(data, CPUFREQ_GOV_START)) {
+ if (!__cpufreq_governor(data, CPUFREQ_GOV_POLICY_INIT)) {
+ if (!__cpufreq_governor(data, CPUFREQ_GOV_START))
+ failed = 0;
+ else
+ __cpufreq_governor(data,
+ CPUFREQ_GOV_POLICY_EXIT);
+ }
+
+ if (failed) {
/* new governor failed, so re-start old one */
pr_debug("starting governor %s failed\n",
data->governor->name);
if (old_gov) {
data->governor = old_gov;
__cpufreq_governor(data,
+ CPUFREQ_GOV_POLICY_INIT);
+ __cpufreq_governor(data,
CPUFREQ_GOV_START);
}
ret = -EINVAL;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index a22944c..b7393b5 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -106,6 +106,7 @@ struct cpufreq_policy {
* governors are used */
unsigned int policy; /* see above */
struct cpufreq_governor *governor; /* see below */
+ void *governor_data;
struct work_struct update; /* if update_policy() needs to be
* called, but you're in IRQ context */
@@ -178,9 +179,11 @@ static inline unsigned long cpufreq_scale(unsigned long old, u_int div, u_int mu
* CPUFREQ GOVERNORS *
*********************************************************************/
-#define CPUFREQ_GOV_START 1
-#define CPUFREQ_GOV_STOP 2
-#define CPUFREQ_GOV_LIMITS 3
+#define CPUFREQ_GOV_START 1
+#define CPUFREQ_GOV_STOP 2
+#define CPUFREQ_GOV_LIMITS 3
+#define CPUFREQ_GOV_POLICY_INIT 4
+#define CPUFREQ_GOV_POLICY_EXIT 5
struct cpufreq_governor {
char name[CPUFREQ_NAME_LEN];
--
1.7.12.rc2.18.g61b472e
Currently, there can't be multiple instances of a single governor type. If we
have a multi-package system, with multiple instances of struct cpufreq_policy
(one per package), we can't have multiple instances of the same governor, i.e.
we can't run separate instances of the ondemand governor for different packages.
The governor's directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
<governor-name>/, which again reflects that there can be only one instance of a
governor type in the system.
This is a bottleneck for multi-cluster systems, where we want different packages
to use the same governor type, but with different tunables.
This patch uses the infrastructure provided by the earlier patch and implements
the init/exit routines for the ondemand and conservative governors.
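The recurring pattern after this patch is that governor code reaches its
per-policy tunables through policy->governor_data instead of a file-scope
tuners structure; a condensed sketch (the helper below is made up, but the
types and fields are the ones used in the patch):

/* Illustrative only: per-policy tunable lookup used throughout the governors. */
static unsigned int example_od_sampling_rate(struct cpufreq_policy *policy)
{
	struct dbs_data *dbs_data = policy->governor_data;
	struct od_dbs_tuners *od_tuners = dbs_data->tuners;

	return od_tuners->sampling_rate;
}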
Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/cpufreq.c | 6 -
drivers/cpufreq/cpufreq_conservative.c | 148 +++++++++++++---------
drivers/cpufreq/cpufreq_governor.c | 159 ++++++++++++++----------
drivers/cpufreq/cpufreq_governor.h | 43 +++++--
drivers/cpufreq/cpufreq_ondemand.c | 216 +++++++++++++++++++--------------
include/linux/cpufreq.h | 15 ++-
6 files changed, 352 insertions(+), 235 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 7d59688..75b64e2 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -1546,11 +1546,6 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
policy->cpu, event);
ret = policy->governor->governor(policy, event);
- if (event == CPUFREQ_GOV_START)
- policy->governor->initialized++;
- else if (event == CPUFREQ_GOV_STOP)
- policy->governor->initialized--;
-
/* we keep one module reference alive for
each CPU governed by this CPU */
if ((event != CPUFREQ_GOV_START) || ret)
@@ -1574,7 +1569,6 @@ int cpufreq_register_governor(struct cpufreq_governor *governor)
mutex_lock(&cpufreq_governor_mutex);
- governor->initialized = 0;
err = -EBUSY;
if (__find_governor(governor->name) == NULL) {
err = 0;
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 4fd0006..d86577f 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -20,6 +20,7 @@
#include <linux/mutex.h>
#include <linux/notifier.h>
#include <linux/percpu-defs.h>
+#include <linux/slab.h>
#include <linux/sysfs.h>
#include <linux/types.h>
@@ -31,17 +32,8 @@
#define DEF_SAMPLING_DOWN_FACTOR (1)
#define MAX_SAMPLING_DOWN_FACTOR (10)
-static struct dbs_data cs_dbs_data;
static DEFINE_PER_CPU(struct cs_cpu_dbs_info_s, cs_cpu_dbs_info);
-static struct cs_dbs_tuners cs_tuners = {
- .up_threshold = DEF_FREQUENCY_UP_THRESHOLD,
- .down_threshold = DEF_FREQUENCY_DOWN_THRESHOLD,
- .sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR,
- .ignore_nice = 0,
- .freq_step = 5,
-};
-
/*
* Every sampling_rate, we check, if current idle time is less than 20%
* (default), then we try to increase frequency Every sampling_rate *
@@ -55,24 +47,26 @@ static void cs_check_cpu(int cpu, unsigned int load)
{
struct cs_cpu_dbs_info_s *dbs_info = &per_cpu(cs_cpu_dbs_info, cpu);
struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
unsigned int freq_target;
/*
* break out if we 'cannot' reduce the speed as the user might
* want freq_step to be zero
*/
- if (cs_tuners.freq_step == 0)
+ if (cs_tuners->freq_step == 0)
return;
/* Check for frequency increase */
- if (load > cs_tuners.up_threshold) {
+ if (load > cs_tuners->up_threshold) {
dbs_info->down_skip = 0;
/* if we are already at full speed then break out early */
if (dbs_info->requested_freq == policy->max)
return;
- freq_target = (cs_tuners.freq_step * policy->max) / 100;
+ freq_target = (cs_tuners->freq_step * policy->max) / 100;
/* max freq cannot be less than 100. But who knows.... */
if (unlikely(freq_target == 0))
@@ -92,8 +86,8 @@ static void cs_check_cpu(int cpu, unsigned int load)
* support the current CPU usage without triggering the up policy. To be
* safe, we focus 10 points under the threshold.
*/
- if (load < (cs_tuners.down_threshold - 10)) {
- freq_target = (cs_tuners.freq_step * policy->max) / 100;
+ if (load < (cs_tuners->down_threshold - 10)) {
+ freq_target = (cs_tuners->freq_step * policy->max) / 100;
dbs_info->requested_freq -= freq_target;
if (dbs_info->requested_freq < policy->min)
@@ -119,11 +113,13 @@ static void cs_dbs_timer(struct work_struct *work)
unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
struct cs_cpu_dbs_info_s *core_dbs_info = &per_cpu(cs_cpu_dbs_info,
cpu);
- int delay = delay_for_sampling_rate(cs_tuners.sampling_rate);
+ struct dbs_data *dbs_data = dbs_info->cdbs.cur_policy->governor_data;
+ struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+ int delay = delay_for_sampling_rate(cs_tuners->sampling_rate);
mutex_lock(&core_dbs_info->cdbs.timer_mutex);
- if (need_load_eval(&core_dbs_info->cdbs, cs_tuners.sampling_rate))
- dbs_check_cpu(&cs_dbs_data, cpu);
+ if (need_load_eval(&core_dbs_info->cdbs, cs_tuners->sampling_rate))
+ dbs_check_cpu(dbs_data, cpu);
schedule_delayed_work_on(smp_processor_id(), dw, delay);
mutex_unlock(&core_dbs_info->cdbs.timer_mutex);
@@ -154,16 +150,11 @@ static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
}
/************************** sysfs interface ************************/
-static ssize_t show_sampling_rate_min(struct kobject *kobj,
- struct attribute *attr, char *buf)
-{
- return sprintf(buf, "%u\n", cs_dbs_data.min_sampling_rate);
-}
-
-static ssize_t store_sampling_down_factor(struct kobject *a,
- struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_sampling_down_factor(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -171,13 +162,15 @@ static ssize_t store_sampling_down_factor(struct kobject *a,
if (ret != 1 || input > MAX_SAMPLING_DOWN_FACTOR || input < 1)
return -EINVAL;
- cs_tuners.sampling_down_factor = input;
+ cs_tuners->sampling_down_factor = input;
return count;
}
-static ssize_t store_sampling_rate(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_sampling_rate(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -185,43 +178,49 @@ static ssize_t store_sampling_rate(struct kobject *a, struct attribute *b,
if (ret != 1)
return -EINVAL;
- cs_tuners.sampling_rate = max(input, cs_dbs_data.min_sampling_rate);
+ cs_tuners->sampling_rate = max(input, dbs_data->min_sampling_rate);
return count;
}
-static ssize_t store_up_threshold(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_up_threshold(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
- if (ret != 1 || input > 100 || input <= cs_tuners.down_threshold)
+ if (ret != 1 || input > 100 || input <= cs_tuners->down_threshold)
return -EINVAL;
- cs_tuners.up_threshold = input;
+ cs_tuners->up_threshold = input;
return count;
}
-static ssize_t store_down_threshold(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_down_threshold(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
/* cannot be lower than 11 otherwise freq will not fall */
if (ret != 1 || input < 11 || input > 100 ||
- input >= cs_tuners.up_threshold)
+ input >= cs_tuners->up_threshold)
return -EINVAL;
- cs_tuners.down_threshold = input;
+ cs_tuners->down_threshold = input;
return count;
}
-static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_ignore_nice(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
unsigned int input, j;
int ret;
@@ -232,10 +231,10 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
if (input > 1)
input = 1;
- if (input == cs_tuners.ignore_nice) /* nothing to do */
+ if (input == cs_tuners->ignore_nice) /* nothing to do */
return count;
- cs_tuners.ignore_nice = input;
+ cs_tuners->ignore_nice = input;
/* we need to re-evaluate prev_cpu_idle */
for_each_online_cpu(j) {
@@ -243,16 +242,18 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
dbs_info = &per_cpu(cs_cpu_dbs_info, j);
dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
&dbs_info->cdbs.prev_cpu_wall);
- if (cs_tuners.ignore_nice)
+ if (cs_tuners->ignore_nice)
dbs_info->cdbs.prev_cpu_nice =
kcpustat_cpu(j).cpustat[CPUTIME_NICE];
}
return count;
}
-static ssize_t store_freq_step(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_freq_step(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -267,7 +268,7 @@ static ssize_t store_freq_step(struct kobject *a, struct attribute *b,
* no need to test here if freq_step is zero as the user might actually
* want this, they would be crazy though :)
*/
- cs_tuners.freq_step = input;
+ cs_tuners->freq_step = input;
return count;
}
@@ -275,16 +276,17 @@ show_one(cs, sampling_rate, sampling_rate);
show_one(cs, sampling_down_factor, sampling_down_factor);
show_one(cs, up_threshold, up_threshold);
show_one(cs, down_threshold, down_threshold);
-show_one(cs, ignore_nice_load, ignore_nice);
+show_one(cs, ignore_nice, ignore_nice);
show_one(cs, freq_step, freq_step);
+declare_show_sampling_rate_min();
-define_one_global_rw(sampling_rate);
-define_one_global_rw(sampling_down_factor);
-define_one_global_rw(up_threshold);
-define_one_global_rw(down_threshold);
-define_one_global_rw(ignore_nice_load);
-define_one_global_rw(freq_step);
-define_one_global_ro(sampling_rate_min);
+cpufreq_freq_attr_rw(sampling_rate);
+cpufreq_freq_attr_rw(sampling_down_factor);
+cpufreq_freq_attr_rw(up_threshold);
+cpufreq_freq_attr_rw(down_threshold);
+cpufreq_freq_attr_rw(ignore_nice);
+cpufreq_freq_attr_rw(freq_step);
+cpufreq_freq_attr_ro(sampling_rate_min);
static struct attribute *dbs_attributes[] = {
&sampling_rate_min.attr,
@@ -292,7 +294,7 @@ static struct attribute *dbs_attributes[] = {
&sampling_down_factor.attr,
&up_threshold.attr,
&down_threshold.attr,
- &ignore_nice_load.attr,
+ &ignore_nice.attr,
&freq_step.attr,
NULL
};
@@ -304,6 +306,34 @@ static struct attribute_group cs_attr_group = {
/************************** sysfs end ************************/
+static int cs_init(struct dbs_data *dbs_data)
+{
+ struct cs_dbs_tuners *tuners;
+
+ tuners = kzalloc(sizeof(struct cs_dbs_tuners), GFP_KERNEL);
+ if (!tuners) {
+ pr_err("%s: kzalloc failed\n", __func__);
+ return -ENOMEM;
+ }
+
+ tuners->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
+ tuners->down_threshold = DEF_FREQUENCY_DOWN_THRESHOLD;
+ tuners->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
+ tuners->ignore_nice = 0;
+ tuners->freq_step = 5;
+
+ dbs_data->tuners = tuners;
+ dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
+ jiffies_to_usecs(10);
+ mutex_init(&dbs_data->mutex);
+ return 0;
+}
+
+static void cs_exit(struct dbs_data *dbs_data)
+{
+ kfree(dbs_data->tuners);
+}
+
define_get_cpu_dbs_routines(cs_cpu_dbs_info);
static struct notifier_block cs_cpufreq_notifier_block = {
@@ -314,21 +344,22 @@ static struct cs_ops cs_ops = {
.notifier_block = &cs_cpufreq_notifier_block,
};
-static struct dbs_data cs_dbs_data = {
+static struct common_dbs_data cs_dbs_cdata = {
.governor = GOV_CONSERVATIVE,
.attr_group = &cs_attr_group,
- .tuners = &cs_tuners,
.get_cpu_cdbs = get_cpu_cdbs,
.get_cpu_dbs_info_s = get_cpu_dbs_info_s,
.gov_dbs_timer = cs_dbs_timer,
.gov_check_cpu = cs_check_cpu,
.gov_ops = &cs_ops,
+ .init = cs_init,
+ .exit = cs_exit,
};
static int cs_cpufreq_governor_dbs(struct cpufreq_policy *policy,
unsigned int event)
{
- return cpufreq_governor_dbs(&cs_dbs_data, policy, event);
+ return cpufreq_governor_dbs(policy, &cs_dbs_cdata, event);
}
#ifndef CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE
@@ -343,7 +374,6 @@ struct cpufreq_governor cpufreq_gov_conservative = {
static int __init cpufreq_gov_dbs_init(void)
{
- mutex_init(&cs_dbs_data.mutex);
return cpufreq_register_governor(&cpufreq_gov_conservative);
}
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 5a76086..7722505 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -22,6 +22,7 @@
#include <linux/export.h>
#include <linux/kernel_stat.h>
#include <linux/mutex.h>
+#include <linux/slab.h>
#include <linux/tick.h>
#include <linux/types.h>
#include <linux/workqueue.h>
@@ -65,7 +66,7 @@ EXPORT_SYMBOL_GPL(get_cpu_idle_time);
void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
{
- struct cpu_dbs_common_info *cdbs = dbs_data->get_cpu_cdbs(cpu);
+ struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
struct od_dbs_tuners *od_tuners = dbs_data->tuners;
struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
struct cpufreq_policy *policy;
@@ -73,7 +74,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
unsigned int ignore_nice;
unsigned int j;
- if (dbs_data->governor == GOV_ONDEMAND)
+ if (dbs_data->cdata->governor == GOV_ONDEMAND)
ignore_nice = od_tuners->ignore_nice;
else
ignore_nice = cs_tuners->ignore_nice;
@@ -87,7 +88,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
unsigned int idle_time, wall_time, iowait_time;
unsigned int load;
- j_cdbs = dbs_data->get_cpu_cdbs(j);
+ j_cdbs = dbs_data->cdata->get_cpu_cdbs(j);
cur_idle_time = get_cpu_idle_time(j, &cur_wall_time);
@@ -117,9 +118,9 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
idle_time += jiffies_to_usecs(cur_nice_jiffies);
}
- if (dbs_data->governor == GOV_ONDEMAND) {
+ if (dbs_data->cdata->governor == GOV_ONDEMAND) {
struct od_cpu_dbs_info_s *od_j_dbs_info =
- dbs_data->get_cpu_dbs_info_s(cpu);
+ dbs_data->cdata->get_cpu_dbs_info_s(cpu);
cur_iowait_time = get_cpu_iowait_time_us(j,
&cur_wall_time);
@@ -145,7 +146,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
load = 100 * (wall_time - idle_time) / wall_time;
- if (dbs_data->governor == GOV_ONDEMAND) {
+ if (dbs_data->cdata->governor == GOV_ONDEMAND) {
int freq_avg = __cpufreq_driver_getavg(policy, j);
if (freq_avg <= 0)
freq_avg = policy->cur;
@@ -157,7 +158,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
max_load = load;
}
- dbs_data->gov_check_cpu(cpu, max_load);
+ dbs_data->cdata->gov_check_cpu(cpu, max_load);
}
EXPORT_SYMBOL_GPL(dbs_check_cpu);
@@ -165,14 +166,14 @@ static inline void dbs_timer_init(struct dbs_data *dbs_data, int cpu,
unsigned int sampling_rate)
{
int delay = delay_for_sampling_rate(sampling_rate);
- struct cpu_dbs_common_info *cdbs = dbs_data->get_cpu_cdbs(cpu);
+ struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
schedule_delayed_work_on(cpu, &cdbs->work, delay);
}
static inline void dbs_timer_exit(struct dbs_data *dbs_data, int cpu)
{
- struct cpu_dbs_common_info *cdbs = dbs_data->get_cpu_cdbs(cpu);
+ struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
cancel_delayed_work_sync(&cdbs->work);
}
@@ -196,31 +197,84 @@ bool need_load_eval(struct cpu_dbs_common_info *cdbs,
}
EXPORT_SYMBOL_GPL(need_load_eval);
-int cpufreq_governor_dbs(struct dbs_data *dbs_data,
- struct cpufreq_policy *policy, unsigned int event)
+static void set_sampling_rate(struct dbs_data *dbs_data,
+ unsigned int sampling_rate)
{
+ if (dbs_data->cdata->governor == GOV_CONSERVATIVE) {
+ struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+ cs_tuners->sampling_rate = sampling_rate;
+ } else {
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+ od_tuners->sampling_rate = sampling_rate;
+ }
+}
+
+int cpufreq_governor_dbs(struct cpufreq_policy *policy,
+ struct common_dbs_data *cdata, unsigned int event)
+{
+ struct dbs_data *dbs_data = policy->governor_data;
struct od_cpu_dbs_info_s *od_dbs_info = NULL;
struct cs_cpu_dbs_info_s *cs_dbs_info = NULL;
struct cs_ops *cs_ops = NULL;
struct od_ops *od_ops = NULL;
- struct od_dbs_tuners *od_tuners = dbs_data->tuners;
- struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+ struct od_dbs_tuners *od_tuners = NULL;
+ struct cs_dbs_tuners *cs_tuners = NULL;
struct cpu_dbs_common_info *cpu_cdbs;
- unsigned int *sampling_rate, latency, ignore_nice, j, cpu = policy->cpu;
+ unsigned int sampling_rate, latency, ignore_nice, j, cpu = policy->cpu;
int rc;
- cpu_cdbs = dbs_data->get_cpu_cdbs(cpu);
+ WARN_ON(!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT));
+
+ switch (event) {
+ case CPUFREQ_GOV_POLICY_INIT:
+ WARN_ON(dbs_data);
+ dbs_data = kzalloc(sizeof(*dbs_data), GFP_KERNEL);
+ if (!dbs_data) {
+ pr_err("%s: POLICY_INIT: kzalloc failed\n", __func__);
+ return -ENOMEM;
+ }
+
+ dbs_data->cdata = cdata;
+ rc = cdata->init(dbs_data);
+ if (rc) {
+ pr_err("%s: POLICY_INIT: init() failed\n", __func__);
+ kfree(dbs_data);
+ return rc;
+ }
+ policy->governor_data = dbs_data;
+
+ /* policy latency is in nS. Convert it to uS first */
+ latency = policy->cpuinfo.transition_latency / 1000;
+ if (latency == 0)
+ latency = 1;
+
+ /* Bring kernel and HW constraints together */
+ dbs_data->min_sampling_rate = max(dbs_data->min_sampling_rate,
+ MIN_LATENCY_MULTIPLIER * latency);
+ set_sampling_rate(dbs_data, max(dbs_data->min_sampling_rate,
+ latency * LATENCY_MULTIPLIER));
+ return 0;
+ case CPUFREQ_GOV_POLICY_EXIT:
+ cdata->exit(dbs_data);
+ kfree(dbs_data);
+ policy->governor_data = NULL;
+ return 0;
+ }
- if (dbs_data->governor == GOV_CONSERVATIVE) {
- cs_dbs_info = dbs_data->get_cpu_dbs_info_s(cpu);
- sampling_rate = &cs_tuners->sampling_rate;
+ cpu_cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
+
+ if (dbs_data->cdata->governor == GOV_CONSERVATIVE) {
+ cs_tuners = dbs_data->tuners;
+ cs_dbs_info = dbs_data->cdata->get_cpu_dbs_info_s(cpu);
+ sampling_rate = cs_tuners->sampling_rate;
ignore_nice = cs_tuners->ignore_nice;
- cs_ops = dbs_data->gov_ops;
+ cs_ops = dbs_data->cdata->gov_ops;
} else {
- od_dbs_info = dbs_data->get_cpu_dbs_info_s(cpu);
- sampling_rate = &od_tuners->sampling_rate;
+ od_tuners = dbs_data->tuners;
+ od_dbs_info = dbs_data->cdata->get_cpu_dbs_info_s(cpu);
+ sampling_rate = od_tuners->sampling_rate;
ignore_nice = od_tuners->ignore_nice;
- od_ops = dbs_data->gov_ops;
+ od_ops = dbs_data->cdata->gov_ops;
}
switch (event) {
@@ -232,7 +286,7 @@ int cpufreq_governor_dbs(struct dbs_data *dbs_data,
for_each_cpu(j, policy->cpus) {
struct cpu_dbs_common_info *j_cdbs =
- dbs_data->get_cpu_cdbs(j);
+ dbs_data->cdata->get_cpu_cdbs(j);
j_cdbs->cpu = j;
j_cdbs->cur_policy = policy;
@@ -244,69 +298,44 @@ int cpufreq_governor_dbs(struct dbs_data *dbs_data,
mutex_init(&j_cdbs->timer_mutex);
INIT_DEFERRABLE_WORK(&j_cdbs->work,
- dbs_data->gov_dbs_timer);
+ dbs_data->cdata->gov_dbs_timer);
}
- if (!policy->governor->initialized) {
- rc = sysfs_create_group(cpufreq_global_kobject,
- dbs_data->attr_group);
- if (rc) {
- mutex_unlock(&dbs_data->mutex);
- return rc;
- }
+ rc = sysfs_create_group(get_governor_parent_kobj(policy),
+ dbs_data->cdata->attr_group);
+ if (rc) {
+ mutex_unlock(&dbs_data->mutex);
+ return rc;
}
/*
* conservative does not implement micro like ondemand
* governor, thus we are bound to jiffes/HZ
*/
- if (dbs_data->governor == GOV_CONSERVATIVE) {
+ if (dbs_data->cdata->governor == GOV_CONSERVATIVE) {
cs_dbs_info->down_skip = 0;
cs_dbs_info->enable = 1;
cs_dbs_info->requested_freq = policy->cur;
- if (!policy->governor->initialized) {
- cpufreq_register_notifier(cs_ops->notifier_block,
- CPUFREQ_TRANSITION_NOTIFIER);
-
- dbs_data->min_sampling_rate =
- MIN_SAMPLING_RATE_RATIO *
- jiffies_to_usecs(10);
- }
+ cpufreq_register_notifier(cs_ops->notifier_block,
+ CPUFREQ_TRANSITION_NOTIFIER);
} else {
od_dbs_info->rate_mult = 1;
od_dbs_info->sample_type = OD_NORMAL_SAMPLE;
od_ops->powersave_bias_init_cpu(cpu);
-
- if (!policy->governor->initialized)
- od_tuners->io_is_busy = od_ops->io_busy();
}
- if (policy->governor->initialized)
- goto unlock;
-
- /* policy latency is in nS. Convert it to uS first */
- latency = policy->cpuinfo.transition_latency / 1000;
- if (latency == 0)
- latency = 1;
-
- /* Bring kernel and HW constraints together */
- dbs_data->min_sampling_rate = max(dbs_data->min_sampling_rate,
- MIN_LATENCY_MULTIPLIER * latency);
- *sampling_rate = max(dbs_data->min_sampling_rate, latency *
- LATENCY_MULTIPLIER);
-unlock:
mutex_unlock(&dbs_data->mutex);
/* Initiate timer time stamp */
cpu_cdbs->time_stamp = ktime_get();
for_each_cpu(j, policy->cpus)
- dbs_timer_init(dbs_data, j, *sampling_rate);
+ dbs_timer_init(dbs_data, j, sampling_rate);
break;
case CPUFREQ_GOV_STOP:
- if (dbs_data->governor == GOV_CONSERVATIVE)
+ if (dbs_data->cdata->governor == GOV_CONSERVATIVE)
cs_dbs_info->enable = 0;
for_each_cpu(j, policy->cpus)
@@ -315,13 +344,11 @@ unlock:
mutex_lock(&dbs_data->mutex);
mutex_destroy(&cpu_cdbs->timer_mutex);
- if (policy->governor->initialized == 1) {
- sysfs_remove_group(cpufreq_global_kobject,
- dbs_data->attr_group);
- if (dbs_data->governor == GOV_CONSERVATIVE)
- cpufreq_unregister_notifier(cs_ops->notifier_block,
- CPUFREQ_TRANSITION_NOTIFIER);
- }
+ sysfs_remove_group(get_governor_parent_kobj(policy),
+ dbs_data->cdata->attr_group);
+ if (dbs_data->cdata->governor == GOV_CONSERVATIVE)
+ cpufreq_unregister_notifier(cs_ops->notifier_block,
+ CPUFREQ_TRANSITION_NOTIFIER);
mutex_unlock(&dbs_data->mutex);
break;
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index d2ac911..6301790 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -43,9 +43,11 @@ enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE};
/* Macro creating sysfs show routines */
#define show_one(_gov, file_name, object) \
static ssize_t show_##file_name \
-(struct kobject *kobj, struct attribute *attr, char *buf) \
+(struct cpufreq_policy *policy, char *buf) \
{ \
- return sprintf(buf, "%u\n", _gov##_tuners.object); \
+ struct dbs_data *dbs_data = policy->governor_data; \
+ struct _gov##_dbs_tuners *tuners = dbs_data->tuners; \
+ return sprintf(buf, "%u\n", tuners->file_name); \
}
#define define_get_cpu_dbs_routines(_dbs_info) \
@@ -103,7 +105,7 @@ struct cs_cpu_dbs_info_s {
unsigned int enable:1;
};
-/* Governers sysfs tunables */
+/* Per policy governor sysfs tunables */
struct od_dbs_tuners {
unsigned int ignore_nice;
unsigned int sampling_rate;
@@ -123,31 +125,38 @@ struct cs_dbs_tuners {
unsigned int freq_step;
};
-/* Per Governer data */
-struct dbs_data {
+/* Common governor data across policies */
+struct dbs_data;
+struct common_dbs_data {
/* Common across governors */
#define GOV_ONDEMAND 0
#define GOV_CONSERVATIVE 1
int governor;
- unsigned int min_sampling_rate;
struct attribute_group *attr_group;
- void *tuners;
-
- /* dbs_mutex protects dbs_enable in governor start/stop */
- struct mutex mutex;
struct cpu_dbs_common_info *(*get_cpu_cdbs)(int cpu);
void *(*get_cpu_dbs_info_s)(int cpu);
void (*gov_dbs_timer)(struct work_struct *work);
void (*gov_check_cpu)(int cpu, unsigned int load);
+ int (*init)(struct dbs_data *dbs_data);
+ void (*exit)(struct dbs_data *dbs_data);
/* Governor specific ops, see below */
void *gov_ops;
};
+/* Per policy governor data */
+struct dbs_data {
+ struct common_dbs_data *cdata;
+ unsigned int min_sampling_rate;
+ void *tuners;
+
+ /* dbs_mutex protects dbs_enable in governor start/stop */
+ struct mutex mutex;
+};
+
/* Governor specific ops, will be passed to dbs_data->gov_ops */
struct od_ops {
- int (*io_busy)(void);
void (*powersave_bias_init_cpu)(int cpu);
unsigned int (*powersave_bias_target)(struct cpufreq_policy *policy,
unsigned int freq_next, unsigned int relation);
@@ -169,10 +178,18 @@ static inline int delay_for_sampling_rate(unsigned int sampling_rate)
return delay;
}
+#define declare_show_sampling_rate_min() \
+static ssize_t show_sampling_rate_min(struct cpufreq_policy *policy, \
+ char *buf) \
+{ \
+ struct dbs_data *dbs_data = policy->governor_data; \
+ return sprintf(buf, "%u\n", dbs_data->min_sampling_rate); \
+}
+
u64 get_cpu_idle_time(unsigned int cpu, u64 *wall);
void dbs_check_cpu(struct dbs_data *dbs_data, int cpu);
bool need_load_eval(struct cpu_dbs_common_info *cdbs,
unsigned int sampling_rate);
-int cpufreq_governor_dbs(struct dbs_data *dbs_data,
- struct cpufreq_policy *policy, unsigned int event);
+int cpufreq_governor_dbs(struct cpufreq_policy *policy,
+ struct common_dbs_data *cdata, unsigned int event);
#endif /* _CPUFREQ_GOVERNER_H */
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index f3eb26c..c5fd794 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -20,6 +20,7 @@
#include <linux/module.h>
#include <linux/mutex.h>
#include <linux/percpu-defs.h>
+#include <linux/slab.h>
#include <linux/sysfs.h>
#include <linux/tick.h>
#include <linux/types.h>
@@ -37,22 +38,12 @@
#define MIN_FREQUENCY_UP_THRESHOLD (11)
#define MAX_FREQUENCY_UP_THRESHOLD (100)
-static struct dbs_data od_dbs_data;
static DEFINE_PER_CPU(struct od_cpu_dbs_info_s, od_cpu_dbs_info);
#ifndef CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND
static struct cpufreq_governor cpufreq_gov_ondemand;
#endif
-static struct od_dbs_tuners od_tuners = {
- .up_threshold = DEF_FREQUENCY_UP_THRESHOLD,
- .sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR,
- .adj_up_threshold = DEF_FREQUENCY_UP_THRESHOLD -
- DEF_FREQUENCY_DOWN_DIFFERENTIAL,
- .ignore_nice = 0,
- .powersave_bias = 0,
-};
-
static void ondemand_powersave_bias_init_cpu(int cpu)
{
struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
@@ -98,6 +89,8 @@ static unsigned int powersave_bias_target(struct cpufreq_policy *policy,
unsigned int jiffies_total, jiffies_hi, jiffies_lo;
struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info,
policy->cpu);
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
if (!dbs_info->freq_table) {
dbs_info->freq_lo = 0;
@@ -108,7 +101,7 @@ static unsigned int powersave_bias_target(struct cpufreq_policy *policy,
cpufreq_frequency_table_target(policy, dbs_info->freq_table, freq_next,
relation, &index);
freq_req = dbs_info->freq_table[index].frequency;
- freq_reduc = freq_req * od_tuners.powersave_bias / 1000;
+ freq_reduc = freq_req * od_tuners->powersave_bias / 1000;
freq_avg = freq_req - freq_reduc;
/* Find freq bounds for freq_avg in freq_table */
@@ -127,7 +120,7 @@ static unsigned int powersave_bias_target(struct cpufreq_policy *policy,
dbs_info->freq_lo_jiffies = 0;
return freq_lo;
}
- jiffies_total = usecs_to_jiffies(od_tuners.sampling_rate);
+ jiffies_total = usecs_to_jiffies(od_tuners->sampling_rate);
jiffies_hi = (freq_avg - freq_lo) * jiffies_total;
jiffies_hi += ((freq_hi - freq_lo) / 2);
jiffies_hi /= (freq_hi - freq_lo);
@@ -148,12 +141,15 @@ static void ondemand_powersave_bias_init(void)
static void dbs_freq_increase(struct cpufreq_policy *p, unsigned int freq)
{
- if (od_tuners.powersave_bias)
+ struct dbs_data *dbs_data = p->governor_data;
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+
+ if (od_tuners->powersave_bias)
freq = powersave_bias_target(p, freq, CPUFREQ_RELATION_H);
else if (p->cur == p->max)
return;
- __cpufreq_driver_target(p, freq, od_tuners.powersave_bias ?
+ __cpufreq_driver_target(p, freq, od_tuners->powersave_bias ?
CPUFREQ_RELATION_L : CPUFREQ_RELATION_H);
}
@@ -170,15 +166,17 @@ static void od_check_cpu(int cpu, unsigned int load_freq)
{
struct od_cpu_dbs_info_s *dbs_info = &per_cpu(od_cpu_dbs_info, cpu);
struct cpufreq_policy *policy = dbs_info->cdbs.cur_policy;
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
dbs_info->freq_lo = 0;
/* Check for frequency increase */
- if (load_freq > od_tuners.up_threshold * policy->cur) {
+ if (load_freq > od_tuners->up_threshold * policy->cur) {
/* If switching to max speed, apply sampling_down_factor */
if (policy->cur < policy->max)
dbs_info->rate_mult =
- od_tuners.sampling_down_factor;
+ od_tuners->sampling_down_factor;
dbs_freq_increase(policy, policy->max);
return;
}
@@ -193,9 +191,10 @@ static void od_check_cpu(int cpu, unsigned int load_freq)
* support the current CPU usage without triggering the up policy. To be
* safe, we focus 10 points under the threshold.
*/
- if (load_freq < od_tuners.adj_up_threshold * policy->cur) {
+ if (load_freq < od_tuners->adj_up_threshold
+ * policy->cur) {
unsigned int freq_next;
- freq_next = load_freq / od_tuners.adj_up_threshold;
+ freq_next = load_freq / od_tuners->adj_up_threshold;
/* No longer fully busy, reset rate_mult */
dbs_info->rate_mult = 1;
@@ -203,7 +202,7 @@ static void od_check_cpu(int cpu, unsigned int load_freq)
if (freq_next < policy->min)
freq_next = policy->min;
- if (!od_tuners.powersave_bias) {
+ if (!od_tuners->powersave_bias) {
__cpufreq_driver_target(policy, freq_next,
CPUFREQ_RELATION_L);
} else {
@@ -223,12 +222,14 @@ static void od_dbs_timer(struct work_struct *work)
unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
struct od_cpu_dbs_info_s *core_dbs_info = &per_cpu(od_cpu_dbs_info,
cpu);
+ struct dbs_data *dbs_data = dbs_info->cdbs.cur_policy->governor_data;
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
int delay, sample_type = core_dbs_info->sample_type;
bool eval_load;
mutex_lock(&core_dbs_info->cdbs.timer_mutex);
eval_load = need_load_eval(&core_dbs_info->cdbs,
- od_tuners.sampling_rate);
+ od_tuners->sampling_rate);
/* Common NORMAL_SAMPLE setup */
core_dbs_info->sample_type = OD_NORMAL_SAMPLE;
@@ -240,13 +241,13 @@ static void od_dbs_timer(struct work_struct *work)
CPUFREQ_RELATION_H);
} else {
if (eval_load)
- dbs_check_cpu(&od_dbs_data, cpu);
+ dbs_check_cpu(dbs_data, cpu);
if (core_dbs_info->freq_lo) {
/* Setup timer for SUB_SAMPLE */
core_dbs_info->sample_type = OD_SUB_SAMPLE;
delay = core_dbs_info->freq_hi_jiffies;
} else {
- delay = delay_for_sampling_rate(od_tuners.sampling_rate
+ delay = delay_for_sampling_rate(od_tuners->sampling_rate
* core_dbs_info->rate_mult);
}
}
@@ -256,13 +257,6 @@ static void od_dbs_timer(struct work_struct *work)
}
/************************** sysfs interface ************************/
-
-static ssize_t show_sampling_rate_min(struct kobject *kobj,
- struct attribute *attr, char *buf)
-{
- return sprintf(buf, "%u\n", od_dbs_data.min_sampling_rate);
-}
-
/**
* update_sampling_rate - update sampling rate effective immediately if needed.
* @new_rate: new sampling rate
@@ -276,12 +270,14 @@ static ssize_t show_sampling_rate_min(struct kobject *kobj,
* reducing the sampling rate, we need to make the new value effective
* immediately.
*/
-static void update_sampling_rate(unsigned int new_rate)
+static void update_sampling_rate(struct dbs_data *dbs_data,
+ unsigned int new_rate)
{
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
int cpu;
- od_tuners.sampling_rate = new_rate = max(new_rate,
- od_dbs_data.min_sampling_rate);
+ od_tuners->sampling_rate = new_rate = max(new_rate,
+ dbs_data->min_sampling_rate);
for_each_online_cpu(cpu) {
struct cpufreq_policy *policy;
@@ -322,34 +318,39 @@ static void update_sampling_rate(unsigned int new_rate)
}
}
-static ssize_t store_sampling_rate(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_sampling_rate(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
if (ret != 1)
return -EINVAL;
- update_sampling_rate(input);
+ update_sampling_rate(dbs_data, input);
return count;
}
-static ssize_t store_io_is_busy(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_io_is_busy(struct cpufreq_policy *policy, const char *buf,
+ size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
if (ret != 1)
return -EINVAL;
- od_tuners.io_is_busy = !!input;
+ od_tuners->io_is_busy = !!input;
return count;
}
-static ssize_t store_up_threshold(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_up_threshold(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -359,23 +360,25 @@ static ssize_t store_up_threshold(struct kobject *a, struct attribute *b,
return -EINVAL;
}
/* Calculate the new adj_up_threshold */
- od_tuners.adj_up_threshold += input;
- od_tuners.adj_up_threshold -= od_tuners.up_threshold;
+ od_tuners->adj_up_threshold += input;
+ od_tuners->adj_up_threshold -= od_tuners->up_threshold;
- od_tuners.up_threshold = input;
+ od_tuners->up_threshold = input;
return count;
}
-static ssize_t store_sampling_down_factor(struct kobject *a,
- struct attribute *b, const char *buf, size_t count)
+static ssize_t store_sampling_down_factor(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
unsigned int input, j;
int ret;
ret = sscanf(buf, "%u", &input);
if (ret != 1 || input > MAX_SAMPLING_DOWN_FACTOR || input < 1)
return -EINVAL;
- od_tuners.sampling_down_factor = input;
+ od_tuners->sampling_down_factor = input;
/* Reset down sampling multiplier in case it was active */
for_each_online_cpu(j) {
@@ -386,9 +389,11 @@ static ssize_t store_sampling_down_factor(struct kobject *a,
return count;
}
-static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_ignore_nice(struct cpufreq_policy *policy, const char *buf,
+ size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
unsigned int input;
int ret;
@@ -401,10 +406,10 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
if (input > 1)
input = 1;
- if (input == od_tuners.ignore_nice) { /* nothing to do */
+ if (input == od_tuners->ignore_nice) { /* nothing to do */
return count;
}
- od_tuners.ignore_nice = input;
+ od_tuners->ignore_nice = input;
/* we need to re-evaluate prev_cpu_idle */
for_each_online_cpu(j) {
@@ -412,7 +417,7 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
dbs_info = &per_cpu(od_cpu_dbs_info, j);
dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
&dbs_info->cdbs.prev_cpu_wall);
- if (od_tuners.ignore_nice)
+ if (od_tuners->ignore_nice)
dbs_info->cdbs.prev_cpu_nice =
kcpustat_cpu(j).cpustat[CPUTIME_NICE];
@@ -420,9 +425,11 @@ static ssize_t store_ignore_nice_load(struct kobject *a, struct attribute *b,
return count;
}
-static ssize_t store_powersave_bias(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_powersave_bias(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
+ struct dbs_data *dbs_data = policy->governor_data;
+ struct od_dbs_tuners *od_tuners = dbs_data->tuners;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -433,7 +440,7 @@ static ssize_t store_powersave_bias(struct kobject *a, struct attribute *b,
if (input > 1000)
input = 1000;
- od_tuners.powersave_bias = input;
+ od_tuners->powersave_bias = input;
ondemand_powersave_bias_init();
return count;
}
@@ -442,23 +449,24 @@ show_one(od, sampling_rate, sampling_rate);
show_one(od, io_is_busy, io_is_busy);
show_one(od, up_threshold, up_threshold);
show_one(od, sampling_down_factor, sampling_down_factor);
-show_one(od, ignore_nice_load, ignore_nice);
+show_one(od, ignore_nice, ignore_nice);
show_one(od, powersave_bias, powersave_bias);
+declare_show_sampling_rate_min();
-define_one_global_rw(sampling_rate);
-define_one_global_rw(io_is_busy);
-define_one_global_rw(up_threshold);
-define_one_global_rw(sampling_down_factor);
-define_one_global_rw(ignore_nice_load);
-define_one_global_rw(powersave_bias);
-define_one_global_ro(sampling_rate_min);
+cpufreq_freq_attr_rw(sampling_rate);
+cpufreq_freq_attr_rw(io_is_busy);
+cpufreq_freq_attr_rw(up_threshold);
+cpufreq_freq_attr_rw(sampling_down_factor);
+cpufreq_freq_attr_rw(ignore_nice);
+cpufreq_freq_attr_rw(powersave_bias);
+cpufreq_freq_attr_ro(sampling_rate_min);
static struct attribute *dbs_attributes[] = {
&sampling_rate_min.attr,
&sampling_rate.attr,
&up_threshold.attr,
&sampling_down_factor.attr,
- &ignore_nice_load.attr,
+ &ignore_nice.attr,
&powersave_bias.attr,
&io_is_busy.attr,
NULL
@@ -471,30 +479,81 @@ static struct attribute_group od_attr_group = {
/************************** sysfs end ************************/
+static int od_init(struct dbs_data *dbs_data)
+{
+ struct od_dbs_tuners *tuners;
+ u64 idle_time;
+ int cpu;
+
+ tuners = kzalloc(sizeof(struct od_dbs_tuners), GFP_KERNEL);
+ if (!tuners) {
+ pr_err("%s: kzalloc failed\n", __func__);
+ return -ENOMEM;
+ }
+
+ cpu = get_cpu();
+ idle_time = get_cpu_idle_time_us(cpu, NULL);
+ put_cpu();
+ if (idle_time != -1ULL) {
+ /* Idle micro accounting is supported. Use finer thresholds */
+ tuners->up_threshold = MICRO_FREQUENCY_UP_THRESHOLD;
+ tuners->adj_up_threshold = MICRO_FREQUENCY_UP_THRESHOLD -
+ MICRO_FREQUENCY_DOWN_DIFFERENTIAL;
+ /*
+ * In nohz/micro accounting case we set the minimum frequency
+ * not depending on HZ, but fixed (very low). The deferred
+ * timer might skip some samples if idle/sleeping as needed.
+ */
+ dbs_data->min_sampling_rate = MICRO_FREQUENCY_MIN_SAMPLE_RATE;
+ } else {
+ tuners->up_threshold = DEF_FREQUENCY_UP_THRESHOLD;
+ tuners->adj_up_threshold = DEF_FREQUENCY_UP_THRESHOLD -
+ DEF_FREQUENCY_DOWN_DIFFERENTIAL;
+
+ /* For correct statistics, we need 10 ticks for each measure */
+ dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
+ jiffies_to_usecs(10);
+ }
+
+ tuners->sampling_down_factor = DEF_SAMPLING_DOWN_FACTOR;
+ tuners->ignore_nice = 0;
+ tuners->powersave_bias = 0;
+ tuners->io_is_busy = should_io_be_busy();
+
+ dbs_data->tuners = tuners;
+ mutex_init(&dbs_data->mutex);
+ return 0;
+}
+
+static void od_exit(struct dbs_data *dbs_data)
+{
+ kfree(dbs_data->tuners);
+}
+
define_get_cpu_dbs_routines(od_cpu_dbs_info);
static struct od_ops od_ops = {
- .io_busy = should_io_be_busy,
.powersave_bias_init_cpu = ondemand_powersave_bias_init_cpu,
.powersave_bias_target = powersave_bias_target,
.freq_increase = dbs_freq_increase,
};
-static struct dbs_data od_dbs_data = {
+static struct common_dbs_data od_dbs_cdata = {
.governor = GOV_ONDEMAND,
.attr_group = &od_attr_group,
- .tuners = &od_tuners,
.get_cpu_cdbs = get_cpu_cdbs,
.get_cpu_dbs_info_s = get_cpu_dbs_info_s,
.gov_dbs_timer = od_dbs_timer,
.gov_check_cpu = od_check_cpu,
.gov_ops = &od_ops,
+ .init = od_init,
+ .exit = od_exit,
};
static int od_cpufreq_governor_dbs(struct cpufreq_policy *policy,
unsigned int event)
{
- return cpufreq_governor_dbs(&od_dbs_data, policy, event);
+ return cpufreq_governor_dbs(policy, &od_dbs_cdata, event);
}
#ifndef CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND
@@ -509,29 +568,6 @@ struct cpufreq_governor cpufreq_gov_ondemand = {
static int __init cpufreq_gov_dbs_init(void)
{
- u64 idle_time;
- int cpu = get_cpu();
-
- mutex_init(&od_dbs_data.mutex);
- idle_time = get_cpu_idle_time_us(cpu, NULL);
- put_cpu();
- if (idle_time != -1ULL) {
- /* Idle micro accounting is supported. Use finer thresholds */
- od_tuners.up_threshold = MICRO_FREQUENCY_UP_THRESHOLD;
- od_tuners.adj_up_threshold = MICRO_FREQUENCY_UP_THRESHOLD -
- MICRO_FREQUENCY_DOWN_DIFFERENTIAL;
- /*
- * In nohz/micro accounting case we set the minimum frequency
- * not depending on HZ, but fixed (very low). The deferred
- * timer might skip some samples if idle/sleeping as needed.
- */
- od_dbs_data.min_sampling_rate = MICRO_FREQUENCY_MIN_SAMPLE_RATE;
- } else {
- /* For correct statistics, we need 10 ticks for each measure */
- od_dbs_data.min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
- jiffies_to_usecs(10);
- }
-
return cpufreq_register_governor(&cpufreq_gov_ondemand);
}
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index b7393b5..c5ac9a5 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -107,6 +107,11 @@ struct cpufreq_policy {
unsigned int policy; /* see above */
struct cpufreq_governor *governor; /* see below */
void *governor_data;
+ /* This should be set by init() of platforms having multiple
+ * clock-domains, i.e. supporting multiple policies. With this sysfs
+ * directories of governor would be created in cpu/cpu<num>/cpufreq/
+ * directory */
+ bool have_multiple_policies;
struct work_struct update; /* if update_policy() needs to be
* called, but you're in IRQ context */
@@ -134,6 +139,15 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
return cpumask_weight(policy->cpus) > 1;
}
+static inline struct kobject *
+get_governor_parent_kobj(struct cpufreq_policy *policy)
+{
+ if (policy->have_multiple_policies)
+ return &policy->kobj;
+ else
+ return cpufreq_global_kobject;
+}
+
/******************** cpufreq transition notifiers *******************/
#define CPUFREQ_PRECHANGE (0)
@@ -187,7 +201,6 @@ static inline unsigned long cpufreq_scale(unsigned long old, u_int div, u_int mu
struct cpufreq_governor {
char name[CPUFREQ_NAME_LEN];
- int initialized;
int (*governor) (struct cpufreq_policy *policy,
unsigned int event);
ssize_t (*show_setspeed) (struct cpufreq_policy *policy,
--
1.7.12.rc2.18.g61b472e
We don't need to keep two structures for file attributes, global_attr and
freq_attr. Let's use only freq_attr for the cpufreq core and drivers.
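For illustration, with this change a driver-private sysfs attribute is declared
with the freq_attr helpers and policy-based callbacks; the attribute below is a
made-up example, not something added by the patch:

/* Illustrative only: freq_attr-style callbacks take a cpufreq_policy. */
static unsigned int example_value;		/* made-up driver tunable */

static ssize_t show_example(struct cpufreq_policy *policy, char *buf)
{
	return sprintf(buf, "%u\n", example_value);
}

static ssize_t store_example(struct cpufreq_policy *policy, const char *buf,
			     size_t count)
{
	if (sscanf(buf, "%u", &example_value) != 1)
		return -EINVAL;

	return count;
}

cpufreq_freq_attr_rw(example);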
Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/acpi-cpufreq.c | 9 ++++-----
drivers/cpufreq/intel_pstate.c | 20 ++++++++++----------
include/linux/cpufreq.h | 16 ----------------
3 files changed, 14 insertions(+), 31 deletions(-)
diff --git a/drivers/cpufreq/acpi-cpufreq.c b/drivers/cpufreq/acpi-cpufreq.c
index 937bc28..14219c3 100644
--- a/drivers/cpufreq/acpi-cpufreq.c
+++ b/drivers/cpufreq/acpi-cpufreq.c
@@ -160,19 +160,18 @@ static ssize_t _store_boost(const char *buf, size_t count)
return count;
}
-static ssize_t store_global_boost(struct kobject *kobj, struct attribute *attr,
- const char *buf, size_t count)
+static ssize_t store_global_boost(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
return _store_boost(buf, count);
}
-static ssize_t show_global_boost(struct kobject *kobj,
- struct attribute *attr, char *buf)
+static ssize_t show_global_boost(struct cpufreq_policy *policy, char *buf)
{
return sprintf(buf, "%u\n", boost_enabled);
}
-static struct global_attr global_boost = __ATTR(boost, 0644,
+static struct freq_attr global_boost = __ATTR(boost, 0644,
show_global_boost,
store_global_boost);
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 096fde0..49846b9 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -275,13 +275,13 @@ static void intel_pstate_debug_expose_params(void)
/************************** sysfs begin ************************/
#define show_one(file_name, object) \
static ssize_t show_##file_name \
- (struct kobject *kobj, struct attribute *attr, char *buf) \
+ (struct cpufreq_policy *policy, char *buf) \
{ \
return sprintf(buf, "%u\n", limits.object); \
}
-static ssize_t store_no_turbo(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_no_turbo(struct cpufreq_policy *policy, const char *buf,
+ size_t count)
{
unsigned int input;
int ret;
@@ -293,8 +293,8 @@ static ssize_t store_no_turbo(struct kobject *a, struct attribute *b,
return count;
}
-static ssize_t store_max_perf_pct(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_max_perf_pct(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
unsigned int input;
int ret;
@@ -307,8 +307,8 @@ static ssize_t store_max_perf_pct(struct kobject *a, struct attribute *b,
return count;
}
-static ssize_t store_min_perf_pct(struct kobject *a, struct attribute *b,
- const char *buf, size_t count)
+static ssize_t store_min_perf_pct(struct cpufreq_policy *policy,
+ const char *buf, size_t count)
{
unsigned int input;
int ret;
@@ -325,9 +325,9 @@ show_one(no_turbo, no_turbo);
show_one(max_perf_pct, max_perf_pct);
show_one(min_perf_pct, min_perf_pct);
-define_one_global_rw(no_turbo);
-define_one_global_rw(max_perf_pct);
-define_one_global_rw(min_perf_pct);
+cpufreq_freq_attr_rw(no_turbo);
+cpufreq_freq_attr_rw(max_perf_pct);
+cpufreq_freq_attr_rw(min_perf_pct);
static struct attribute *intel_pstate_attributes[] = {
&no_turbo.attr,
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index c5ac9a5..6e1abd2 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -320,22 +320,6 @@ __ATTR(_name, _perm, show_##_name, NULL)
static struct freq_attr _name = \
__ATTR(_name, 0644, show_##_name, store_##_name)
-struct global_attr {
- struct attribute attr;
- ssize_t (*show)(struct kobject *kobj,
- struct attribute *attr, char *buf);
- ssize_t (*store)(struct kobject *a, struct attribute *b,
- const char *c, size_t count);
-};
-
-#define define_one_global_ro(_name) \
-static struct global_attr _name = \
-__ATTR(_name, 0444, show_##_name, NULL)
-
-#define define_one_global_rw(_name) \
-static struct global_attr _name = \
-__ATTR(_name, 0644, show_##_name, store_##_name)
-
struct cpufreq_policy *cpufreq_cpu_get(unsigned int cpu);
void cpufreq_cpu_put(struct cpufreq_policy *data);
const char *cpufreq_get_current_driver(void);
--
1.7.12.rc2.18.g61b472e
have_multiple_policies is required by platforms having multiple clock domains
for CPUs, i.e. supporting multiple policies for CPUs. This patch adds a Kconfig
option to enable the code that uses it.
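As a minimal sketch of the intended usage (the driver and its init routine are
hypothetical): a multi-cluster platform driver would select
CPU_FREQ_HAVE_MULTIPLE_POLICIES from its own Kconfig entry and then set the
flag from its ->init() callback:

/* Illustrative only: hypothetical multi-cluster driver opting in. */
static int foo_cpufreq_init(struct cpufreq_policy *policy)
{
	/* per-policy governor directories: cpu/cpu<num>/cpufreq/<gov>/ */
	policy->have_multiple_policies = true;

	/* ... per-cluster clock and frequency table setup ... */
	return 0;
}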
Reported-by: Borislav Petkov <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
---
drivers/cpufreq/Kconfig | 3 +++
include/linux/cpufreq.h | 4 ++++
2 files changed, 7 insertions(+)
diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
index cbcb21e..e6e6939 100644
--- a/drivers/cpufreq/Kconfig
+++ b/drivers/cpufreq/Kconfig
@@ -23,6 +23,9 @@ config CPU_FREQ_TABLE
config CPU_FREQ_GOV_COMMON
bool
+config CPU_FREQ_HAVE_MULTIPLE_POLICIES
+ bool
+
config CPU_FREQ_STAT
tristate "CPU frequency translation statistics"
select CPU_FREQ_TABLE
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 6e1abd2..a092fcb 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -107,11 +107,13 @@ struct cpufreq_policy {
unsigned int policy; /* see above */
struct cpufreq_governor *governor; /* see below */
void *governor_data;
+#ifdef CONFIG_CPU_FREQ_HAVE_MULTIPLE_POLICIES
/* This should be set by init() of platforms having multiple
* clock-domains, i.e. supporting multiple policies. With this sysfs
* directories of governor would be created in cpu/cpu<num>/cpufreq/
* directory */
bool have_multiple_policies;
+#endif
struct work_struct update; /* if update_policy() needs to be
* called, but you're in IRQ context */
@@ -142,9 +144,11 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
static inline struct kobject *
get_governor_parent_kobj(struct cpufreq_policy *policy)
{
+#ifdef CONFIG_CPU_FREQ_HAVE_MULTIPLE_POLICIES
if (policy->have_multiple_policies)
return &policy->kobj;
else
+#endif
return cpufreq_global_kobject;
}
--
1.7.12.rc2.18.g61b472e
On Monday, March 04, 2013 03:37:56 PM Viresh Kumar wrote:
> have_multiple_policies is required by platforms having multiple clock domains
> for CPUs, i.e. supporting multiple policies for CPUs. This patch adds a Kconfig
> option to enable the code that uses it.
>
> Reported-by: Borislav Petkov <[email protected]>
> Signed-off-by: Viresh Kumar <[email protected]>
> ---
> drivers/cpufreq/Kconfig | 3 +++
> include/linux/cpufreq.h | 4 ++++
> 2 files changed, 7 insertions(+)
>
> diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
> index cbcb21e..e6e6939 100644
> --- a/drivers/cpufreq/Kconfig
> +++ b/drivers/cpufreq/Kconfig
> @@ -23,6 +23,9 @@ config CPU_FREQ_TABLE
> config CPU_FREQ_GOV_COMMON
> bool
>
> +config CPU_FREQ_HAVE_MULTIPLE_POLICIES
> + bool
> +
> config CPU_FREQ_STAT
> tristate "CPU frequency translation statistics"
> select CPU_FREQ_TABLE
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 6e1abd2..a092fcb 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -107,11 +107,13 @@ struct cpufreq_policy {
> unsigned int policy; /* see above */
> struct cpufreq_governor *governor; /* see below */
> void *governor_data;
> +#ifdef CONFIG_CPU_FREQ_HAVE_MULTIPLE_POLICIES
> /* This should be set by init() of platforms having multiple
> * clock-domains, i.e. supporting multiple policies. With this sysfs
> * directories of governor would be created in cpu/cpu<num>/cpufreq/
> * directory */
> bool have_multiple_policies;
> +#endif
>
> struct work_struct update; /* if update_policy() needs to be
> * called, but you're in IRQ context */
> @@ -142,9 +144,11 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
> static inline struct kobject *
> get_governor_parent_kobj(struct cpufreq_policy *policy)
> {
> +#ifdef CONFIG_CPU_FREQ_HAVE_MULTIPLE_POLICIES
> if (policy->have_multiple_policies)
> return &policy->kobj;
> else
> +#endif
> return cpufreq_global_kobject;
> }
One more question before I apply it.
Is there any architecture/platform that will set
CONFIG_CPU_FREQ_HAVE_MULTIPLE_POLICIES and keep have_multiple_policies unset
at the same time?
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
On 12 March 2013 07:38, Rafael J. Wysocki <[email protected]> wrote:
> One more question before I apply it.
>
> Is there any architecture/platform that will set
> CONFIG_CPU_FREQ_HAVE_MULTIPLE_POLICIES and keep have_multiple_policies unset
> at the same time?
No, they are redundant. That's why I have been pushing to drop this patch.
On Tuesday, March 12, 2013 08:55:12 AM Viresh Kumar wrote:
> On 12 March 2013 07:38, Rafael J. Wysocki <[email protected]> wrote:
> > One more question before I apply it.
> >
> > Is there any architecture/platform that will set
> > CONFIG_CPU_FREQ_HAVE_MULTIPLE_POLICIES and keep have_multiple_policies unset
> > at the same time?
>
> No, they are redundant. That's why i have been forcing to drop this patch.
I see.
What about having the Kconfig option alone, however?
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
On 14 March 2013 03:11, Rafael J. Wysocki <[email protected]> wrote:
> On Tuesday, March 12, 2013 08:55:12 AM Viresh Kumar wrote:
>> On 12 March 2013 07:38, Rafael J. Wysocki <[email protected]> wrote:
>> > One more question before I apply it.
>> >
>> > Is there any architecture/platform that will set
>> > CONFIG_CPU_FREQ_HAVE_MULTIPLE_POLICIES and keep have_multiple_policies unset
>> > at the same time?
>>
>> No, they are redundant. That's why i have been forcing to drop this patch.
>
> I see.
>
> What about having the Kconfig option alone, however?
Even that is not enough. We build multiplatform kernels and so need a
variable to be set by platform.
On Thursday, March 14, 2013 08:39:55 AM Viresh Kumar wrote:
> On 14 March 2013 03:11, Rafael J. Wysocki <[email protected]> wrote:
> > On Tuesday, March 12, 2013 08:55:12 AM Viresh Kumar wrote:
> >> On 12 March 2013 07:38, Rafael J. Wysocki <[email protected]> wrote:
> >> > One more question before I apply it.
> >> >
> >> > Is there any architecture/platform that will set
> >> > CONFIG_CPU_FREQ_HAVE_MULTIPLE_POLICIES and keep have_multiple_policies unset
> >> > at the same time?
> >>
> >> No, they are redundant. That's why i have been forcing to drop this patch.
> >
> > I see.
> >
> > What about having the Kconfig option alone, however?
>
> Even that is not enough. We build multiplatform kernels and so need a
> variable to be set by platform.
Which means the Kconfig option and the field are not redundant in fact.
But do we need the field to reside in the policy structure? It looks like
it may just be a global bool variable (in which case the Kconfig option could
be dropped IMO). Is there any particular reason to put that thing into
struct cpufreq_policy?
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
On 20 March 2013 05:50, Rafael J. Wysocki <[email protected]> wrote:
> On Thursday, March 14, 2013 08:39:55 AM Viresh Kumar wrote:
>> On 14 March 2013 03:11, Rafael J. Wysocki <[email protected]> wrote:
>> > On Tuesday, March 12, 2013 08:55:12 AM Viresh Kumar wrote:
>> >> On 12 March 2013 07:38, Rafael J. Wysocki <[email protected]> wrote:
>> >> > One more question before I apply it.
>> >> >
>> >> > Is there any architecture/platform that will set
>> >> > CONFIG_CPU_FREQ_HAVE_MULTIPLE_POLICIES and keep have_multiple_policies unset
>> >> > at the same time?
>> >>
>> >> No, they are redundant. That's why i have been forcing to drop this patch.
>> >
>> > I see.
>> >
>> > What about having the Kconfig option alone, however?
>>
>> Even that is not enough. We build multiplatform kernels and so need a
>> variable to be set by platform.
>
> Which means the Kconfig option and the field are not redundant in fact.
Yes, "redundant" was the wrong word. Actually, the Kconfig option is just not
required, as we can work perfectly well without it.
> But do we need the field to reside in the policy structure? It looks like
> it may just be a global bool variable
Yes. It is not per policy but per cpufreq driver, and this can be done by
sharing a function from the cpufreq core with the driver. But when do you want
me to call this function (which will set this global variable)? If we do it
from init(), then we will end up calling it again and again. So it has to be
called before calling cpufreq_register_driver(), as init() gets called
internally.
> (in which case the Kconfig option could be dropped IMO).
We are aligned now :)
> Is there any particular reason to put that thing into
> struct cpufreq_policy?
Just the problem I mentioned above.
On 20 March 2013 09:53, Viresh Kumar <[email protected]> wrote:
> But when do you want me to call this function
Guess what, I got the answer to this question: struct cpufreq_driver :)
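In other words, the flag becomes a property of the registered driver rather
than of each policy, and the core can read it through cpufreq_driver directly.
A rough sketch under that assumption (hypothetical foo_* callbacks,
declarations only; the have_multiple_policies member of struct cpufreq_driver
is the one added by the fixup further down in this thread):

#include <linux/cpufreq.h>

static int foo_init(struct cpufreq_policy *policy);
static int foo_verify(struct cpufreq_policy *policy);
static int foo_target(struct cpufreq_policy *policy,
		      unsigned int target_freq, unsigned int relation);
static unsigned int foo_get(unsigned int cpu);

/* Hypothetical driver: the flag sits in the driver structure, so it is
 * already visible to the core when cpufreq_register_driver() starts
 * calling init() for each policy -- no separate setup call is needed. */
static struct cpufreq_driver foo_cpufreq_driver = {
	.name			= "foo-cpufreq",
	.have_multiple_policies	= true,
	.init			= foo_init,
	.verify			= foo_verify,
	.target			= foo_target,
	.get			= foo_get,
};

static int __init foo_cpufreq_register(void)
{
	return cpufreq_register_driver(&foo_cpufreq_driver);
}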
On 4 March 2013 13:07, Viresh Kumar <[email protected]> wrote:
> Currently, there can't be multiple instances of single governor_type. If we have
> a multi-package system, where we have multiple instances of struct policy (per
> package), we can't have multiple instances of same governor. i.e. We can't have
> multiple instances of ondemand governor for multiple packages.
>
> Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
> governor-name/. Which again reflects that there can be only one instance of a
> governor_type in the system.
>
> This is a bottleneck for multicluster system, where we want different packages
> to use same governor type, but with different tunables.
>
> This patch uses the infrastructure provided by earlier patch and implements
> init/exit routines for ondemand and conservative governors.
>
> Signed-off-by: Viresh Kumar <[email protected]>
As discussed in other patch thread, i dropped "cpufreq: Add Kconfig option
to enable/disable have_multiple_policies" patch and following is the fixup
to this patch:
I have queued all patches i had for 3.10 here:
http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/heads/for-3.10
commit f02fca9a2478088c4f7dadf82d998ae007a56285
Author: Viresh Kumar <[email protected]>
Date: Wed Mar 20 10:50:33 2013 +0530
fixup! cpufreq: governor: Implement per policy instances of governors
---
drivers/cpufreq/cpufreq.c | 8 ++++++++
include/linux/cpufreq.h | 22 ++++++++--------------
2 files changed, 16 insertions(+), 14 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index a843855..3d83b02 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -128,6 +128,14 @@ void disable_cpufreq(void)
static LIST_HEAD(cpufreq_governor_list);
static DEFINE_MUTEX(cpufreq_governor_mutex);
+struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
+{
+ if (cpufreq_driver->have_multiple_policies)
+ return &policy->kobj;
+ else
+ return cpufreq_global_kobject;
+}
+
static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
{
struct cpufreq_policy *data;
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 6e1abd2..805c4d3 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -107,11 +107,6 @@ struct cpufreq_policy {
unsigned int policy; /* see above */
struct cpufreq_governor *governor; /* see below */
void *governor_data;
- /* This should be set by init() of platforms having multiple
- * clock-domains, i.e. supporting multiple policies. With this sysfs
- * directories of governor would be created in cpu/cpu<num>/cpufreq/
- * directory */
- bool have_multiple_policies;
struct work_struct update; /* if update_policy() needs to be
* called, but you're in IRQ context */
@@ -139,15 +134,6 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
return cpumask_weight(policy->cpus) > 1;
}
-static inline struct kobject *
-get_governor_parent_kobj(struct cpufreq_policy *policy)
-{
- if (policy->have_multiple_policies)
- return &policy->kobj;
- else
- return cpufreq_global_kobject;
-}
-
/******************** cpufreq transition notifiers *******************/
#define CPUFREQ_PRECHANGE (0)
@@ -245,6 +231,13 @@ struct cpufreq_driver {
struct module *owner;
char name[CPUFREQ_NAME_LEN];
u8 flags;
+ /*
+ * This should be set by init() of platforms having multiple
+ * clock-domains, i.e. supporting multiple policies. With this sysfs
+ * directories of governor would be created in cpu/cpu<num>/cpufreq/
+ * directory
+ */
+ bool have_multiple_policies;
/* needed by all drivers */
int (*init) (struct cpufreq_policy *policy);
@@ -329,6 +322,7 @@ const char *cpufreq_get_current_driver(void);
*********************************************************************/
int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu);
int cpufreq_update_policy(unsigned int cpu);
+struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
#ifdef CONFIG_CPU_FREQ
/* query the current CPU frequency (in kHz). If zero, cpufreq
couldn't detect it */
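With the helper exported, a governor can create its tunables group relative to
whichever kobject applies: the global cpufreq kobject when there is one
governor directory for the whole system, or the policy's own kobject when each
policy gets its own. A hedged sketch of that call pattern (the wrapper
function is made up; the real callers are the dbs governors):

#include <linux/cpufreq.h>
#include <linux/sysfs.h>

/* Sketch only: parent kobject selection is hidden behind
 * get_governor_parent_kobj(), driven by the driver's flag. */
static int example_gov_add_sysfs(struct cpufreq_policy *policy,
				 struct attribute_group *attr_group)
{
	return sysfs_create_group(get_governor_parent_kobj(policy),
				  attr_group);
}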
On Wednesday, March 20, 2013 10:59:13 AM Viresh Kumar wrote:
> On 4 March 2013 13:07, Viresh Kumar <[email protected]> wrote:
> > Currently, there can't be multiple instances of single governor_type. If we have
> > a multi-package system, where we have multiple instances of struct policy (per
> > package), we can't have multiple instances of same governor. i.e. We can't have
> > multiple instances of ondemand governor for multiple packages.
> >
> > Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
> > governor-name/. Which again reflects that there can be only one instance of a
> > governor_type in the system.
> >
> > This is a bottleneck for multicluster system, where we want different packages
> > to use same governor type, but with different tunables.
> >
> > This patch uses the infrastructure provided by earlier patch and implements
> > init/exit routines for ondemand and conservative governors.
> >
> > Signed-off-by: Viresh Kumar <[email protected]>
>
> As discussed in other patch thread, i dropped "cpufreq: Add Kconfig option
> to enable/disable have_multiple_policies" patch and following is the fixup
> to this patch:
>
> I have queued all patches i had for 3.10 here:
>
> http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/heads/for-3.10
OK, applied these to linux-pm.git/bleeding-edge.
At the moment bleeding-edge and linux-next diverged slightly on cpufreq, but
I hope the bleeding-edge material won't cause build problems to occur, so I'll
be able to move it to linux-next shortly.
> commit f02fca9a2478088c4f7dadf82d998ae007a56285
> Author: Viresh Kumar <[email protected]>
> Date: Wed Mar 20 10:50:33 2013 +0530
>
> fixup! cpufreq: governor: Implement per policy instances of governors
I'd actually prefer you to post complete updated patches instead of these
fixups. They are real PITA for me and probably for everybody else trying
to follow the cpufreq development recently.
Thanks,
Rafael
> ---
> drivers/cpufreq/cpufreq.c | 8 ++++++++
> include/linux/cpufreq.h | 22 ++++++++--------------
> 2 files changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index a843855..3d83b02 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -128,6 +128,14 @@ void disable_cpufreq(void)
> static LIST_HEAD(cpufreq_governor_list);
> static DEFINE_MUTEX(cpufreq_governor_mutex);
>
> +struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
> +{
> + if (cpufreq_driver->have_multiple_policies)
> + return &policy->kobj;
> + else
> + return cpufreq_global_kobject;
> +}
> +
> static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
> {
> struct cpufreq_policy *data;
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 6e1abd2..805c4d3 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -107,11 +107,6 @@ struct cpufreq_policy {
> unsigned int policy; /* see above */
> struct cpufreq_governor *governor; /* see below */
> void *governor_data;
> - /* This should be set by init() of platforms having multiple
> - * clock-domains, i.e. supporting multiple policies. With this sysfs
> - * directories of governor would be created in cpu/cpu<num>/cpufreq/
> - * directory */
> - bool have_multiple_policies;
>
> struct work_struct update; /* if update_policy() needs to be
> * called, but you're in IRQ context */
> @@ -139,15 +134,6 @@ static inline bool policy_is_shared(struct
> cpufreq_policy *policy)
> return cpumask_weight(policy->cpus) > 1;
> }
>
> -static inline struct kobject *
> -get_governor_parent_kobj(struct cpufreq_policy *policy)
> -{
> - if (policy->have_multiple_policies)
> - return &policy->kobj;
> - else
> - return cpufreq_global_kobject;
> -}
> -
> /******************** cpufreq transition notifiers *******************/
>
> #define CPUFREQ_PRECHANGE (0)
> @@ -245,6 +231,13 @@ struct cpufreq_driver {
> struct module *owner;
> char name[CPUFREQ_NAME_LEN];
> u8 flags;
> + /*
> + * This should be set by init() of platforms having multiple
> + * clock-domains, i.e. supporting multiple policies. With this sysfs
> + * directories of governor would be created in cpu/cpu<num>/cpufreq/
> + * directory
> + */
> + bool have_multiple_policies;
>
> /* needed by all drivers */
> int (*init) (struct cpufreq_policy *policy);
> @@ -329,6 +322,7 @@ const char *cpufreq_get_current_driver(void);
> *********************************************************************/
> int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu);
> int cpufreq_update_policy(unsigned int cpu);
> +struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
>
> #ifdef CONFIG_CPU_FREQ
> /* query the current CPU frequency (in kHz). If zero, cpufreq
> couldn't detect it */
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
On 22 March 2013 05:14, Rafael J. Wysocki <[email protected]> wrote:
> On Wednesday, March 20, 2013 10:59:13 AM Viresh Kumar wrote:
>> I have queued all patches i had for 3.10 here:
>>
>> http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/heads/for-3.10
>
> OK, applied these to linux-pm.git/bleeding-edge.
Thanks.
> At the moment bleeding-edge and linux-next diverged slightly on cpufreq, but
> I hope the bleeding-edge material won't cause build problems to occur, so I'll
> be able to move it to linux-next shortly.
There shouldn't be any build problems, not because I have done all the build
testing properly BUT because my tree is under continuous surveillance by
Fengguang's bot, and any problem with my branches is reported very early :)
>> commit f02fca9a2478088c4f7dadf82d998ae007a56285
>> Author: Viresh Kumar <[email protected]>
>> Date: Wed Mar 20 10:50:33 2013 +0530
>>
>> fixup! cpufreq: governor: Implement per policy instances of governors
>
> I'd actually prefer you to post complete updated patches instead of these
> fixups. They are real PITA for me and probably for everybody else trying
> to follow the cpufreq development recently.
Hmm... I always thought fixups are way easier to review (and I still believe
that's true), as they just contain what got changed and so people don't have
to review the whole patch again. BUT people who are looking for complete
patches to apply would be annoyed by this, and hence I always show them the
path of my repo where they can find them. So, what I may do is post fixups and
then resend the patches, so that the reviewer knows what changed and others
can have complete patches too.
--
viresh
On Friday, March 22, 2013 07:50:54 AM Viresh Kumar wrote:
> On 22 March 2013 05:14, Rafael J. Wysocki <[email protected]> wrote:
> > On Wednesday, March 20, 2013 10:59:13 AM Viresh Kumar wrote:
>
> >> I have queued all patches i had for 3.10 here:
> >>
> >> http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/heads/for-3.10
> >
> > OK, applied these to linux-pm.git/bleeding-edge.
>
> Thanks.
>
> > At the moment bleeding-edge and linux-next diverged slightly on cpufreq, but
> > I hope the bleeding-edge material won't cause build problems to occur, so I'll
> > be able to move it to linux-next shortly.
>
> There shouldn't be any build problems not because i have done all build testing
> properly BUT because my tree is under continuously surveillance by Fengguang's
> bot. And any problem with my branches is reported very early :)
>
> >> commit f02fca9a2478088c4f7dadf82d998ae007a56285
> >> Author: Viresh Kumar <[email protected]>
> >> Date: Wed Mar 20 10:50:33 2013 +0530
> >>
> >> fixup! cpufreq: governor: Implement per policy instances of governors
> >
> > I'd actually prefer you to post complete updated patches instead of these
> > fixups. They are real PITA for me and probably for everybody else trying
> > to follow the cpufreq development recently.
>
> Hmm... I always thought fixups are way easy to review (and i still
> believe that's
> true) as they just contain what got changed and so people don't have to review
> whole patch again.
They won't have to if you write in the preamble what the differences from
previous versions are.
> BUT people who are looking for complete patches to apply
> would be annoyed by this and hence i always show them path of my repo
> where they can find it.
The problem with this approach is that the complete patches never make it to
the mailing lists and people have problems with connecting commits to
previously posted patches.
Moreover, it is *much* more convenient to me to take patches from kernel.org
patchwork than from your repos, with all due respect.
> So, what i may do is, post fixups and then resend
> patches. So that reviewer knows what changed and others can have complete
> patches too.
Sure, that will work too.
Thanks,
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
On Fri, Mar 22, 2013 at 5:25 PM, Rafael J. Wysocki <[email protected]> wrote:
> On Friday, March 22, 2013 07:50:54 AM Viresh Kumar wrote:
>> Hmm... I always thought fixups are way easy to review (and i still
>> believe that's
>> true) as they just contain what got changed and so people don't have to review
>> whole patch again.
>
> They won't have to if you write in the preamble what the differences from
> previous versions are.
You didn't get me. How will the reviewer check that the author has done what
he says in the preamble and that he hasn't broken anything new?
>> BUT people who are looking for complete patches to apply
>> would be annoyed by this and hence i always show them path of my repo
>> where they can find it.
>
> The problem with this approach is that the complete patches never make it to
> the mailing lists and people have problems with connecting commits to
> previously posted patches.
>
> Moreover, it is *much* more convenient to me to take patches from kernel.org
> patchwork than from your repos, with all due respect.
I understand.. I push them to repo only when they are broken, otherwise not.
On Friday, March 22, 2013 05:21:19 PM Viresh Kumar wrote:
> On Fri, Mar 22, 2013 at 5:25 PM, Rafael J. Wysocki <[email protected]> wrote:
> > On Friday, March 22, 2013 07:50:54 AM Viresh Kumar wrote:
>
> >> Hmm... I always thought fixups are way easy to review (and i still
> >> believe that's
> >> true) as they just contain what got changed and so people don't have to review
> >> whole patch again.
> >
> > They won't have to if you write in the preamble what the differences from
> > previous versions are.
>
> You didn't get me.. How will the reviewer check if author has done what he
> is saying in preamble and he hasn't broken anything new?
Well, if the submitter wants to cheat, she/he certainly can this way, but
what's the benefit, honestly? If the reviewer actually notices that there are
more differences than the submitter admits to, the consequences may be quite
unpleasant for the submitter (like the rejection of any future patches, for
example). And mistakes are possible anyway (and the more patches you deal
with, the greater the chances of making a mistake are).
Thanks.
Rafael
--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
On 22 March 2013 17:41, Rafael J. Wysocki <[email protected]> wrote:
> Well, if the submitter wants to cheat, she/he certainly can this way, but
> what's the benefit, honestly? If the reviewer actually notices that there are
> more differences than the submitter admits to, the consequences may be quite
> unpleasant for the submitter (like the rejection of any future patches, for
> example). And mistakes are possible anyway (and the more patches you deal
> with, the greater the chances of making a mistake are).
Agreed. I was focusing on the last part: mistakes. People learn over time and
can still make mistakes, so a diff is always better for seeing what changed.
On Fri, Mar 22, 2013 at 12:44:40AM +0100, Rafael J. Wysocki wrote:
> On Wednesday, March 20, 2013 10:59:13 AM Viresh Kumar wrote:
> > On 4 March 2013 13:07, Viresh Kumar <[email protected]> wrote:
> > > Currently, there can't be multiple instances of single governor_type. If we have
> > > a multi-package system, where we have multiple instances of struct policy (per
> > > package), we can't have multiple instances of same governor. i.e. We can't have
> > > multiple instances of ondemand governor for multiple packages.
> > >
> > > Governors directory in sysfs is created at /sys/devices/system/cpu/cpufreq/
> > > governor-name/. Which again reflects that there can be only one instance of a
> > > governor_type in the system.
> > >
> > > This is a bottleneck for multicluster system, where we want different packages
> > > to use same governor type, but with different tunables.
> > >
> > > This patch uses the infrastructure provided by earlier patch and implements
> > > init/exit routines for ondemand and conservative governors.
> > >
> > > Signed-off-by: Viresh Kumar <[email protected]>
> >
> > As discussed in other patch thread, i dropped "cpufreq: Add Kconfig option
> > to enable/disable have_multiple_policies" patch and following is the fixup
> > to this patch:
> >
> > I have queued all patches i had for 3.10 here:
> >
> > http://git.linaro.org/gitweb?p=people/vireshk/linux.git;a=shortlog;h=refs/heads/for-3.10
>
> OK, applied these to linux-pm.git/bleeding-edge.
Hi, latest bleeding-edge is spewing this out on boot:
[ 3.585157] ------------[ cut here ]------------
[ 3.592227] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0xc8/0x100()
[ 3.599521] Hardware name: Dinar
[ 3.606878] sysfs: cannot create duplicate filename '/devices/system/cpu/cpufreq/ondemand'
[ 3.614634] Modules linked in:
[ 3.622382] Pid: 1, comm: swapper/0 Not tainted 3.9.0-rc4+ #7
[ 3.630305] Call Trace:
[ 3.638251] [<ffffffff810589cf>] warn_slowpath_common+0x7f/0xc0
[ 3.646435] [<ffffffff81058ac6>] warn_slowpath_fmt+0x46/0x50
[ 3.654586] [<ffffffff8133e2f0>] ? strlcat+0x60/0x80
[ 3.662765] [<ffffffff811fe7d8>] sysfs_add_one+0xc8/0x100
[ 3.670977] [<ffffffff811fe9cc>] create_dir+0x7c/0xd0
[ 3.679239] [<ffffffff811fecaf>] sysfs_create_subdir+0x1f/0x30
[ 3.687601] [<ffffffff812006c4>] internal_create_group+0x64/0x210
[ 3.696098] [<ffffffff812008a3>] sysfs_create_group+0x13/0x20
[ 3.704700] [<ffffffff816bf800>] cpufreq_governor_dbs+0x400/0x590
[ 3.713401] [<ffffffff816bdc37>] od_cpufreq_governor_dbs+0x17/0x20
[ 3.722191] [<ffffffff816bb437>] __cpufreq_governor+0x47/0xc0
[ 3.731071] [<ffffffff816bb94d>] __cpufreq_set_policy+0x19d/0x1b0
[ 3.739968] [<ffffffff816bca89>] cpufreq_add_dev_interface+0x259/0x2b0
[ 3.748960] [<ffffffff813cdce6>] ? acpi_processor_get_performance_info+0x21c/0x452
[ 3.758099] [<ffffffff816bc210>] ? cpufreq_update_policy+0x130/0x130
[ 3.767366] [<ffffffff816bce90>] cpufreq_add_dev+0x3b0/0x4d0
[ 3.776659] [<ffffffff821579d4>] ? cpufreq_gov_dbs_init+0x12/0x12
[ 3.785985] [<ffffffff814e6a39>] subsys_interface_register+0x89/0xd0
[ 3.795452] [<ffffffff816baf5e>] cpufreq_register_driver+0x8e/0x180
[ 3.804919] [<ffffffff82157aca>] acpi_cpufreq_init+0xf6/0x1f8
[ 3.814360] [<ffffffff814f5030>] ? set_trace_device+0x80/0x80
[ 3.823558] [<ffffffff8100206f>] do_one_initcall+0x3f/0x170
[ 3.832476] [<ffffffff8211b00a>] kernel_init_freeable+0x13e/0x1cd
[ 3.841131] [<ffffffff8211a88e>] ? do_early_param+0x86/0x86
[ 3.849506] [<ffffffff817f4c20>] ? rest_init+0x80/0x80
[ 3.857557] [<ffffffff817f4c2e>] kernel_init+0xe/0xf0
[ 3.865260] [<ffffffff8181edec>] ret_from_fork+0x7c/0xb0
[ 3.872886] [<ffffffff817f4c20>] ? rest_init+0x80/0x80
[ 3.880456] ---[ end trace 1a5c6247c6d9b0ac ]---
[ 3.888201] ------------[ cut here ]------------
This warning is repeated (number of cpus - 1) times.
And when I do:
$ cat /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
[ 489.103388] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
[ 489.112064] IP: [<ffffffff816be02c>] show_up_threshold+0x1c/0x30
[ 489.120511] PGD a285e6067 PUD a27085067 PMD 0
[ 489.128690] Oops: 0000 [#1] SMP
[ 489.136521] Modules linked in:
[ 489.144134] CPU 15
[ 489.144229] Pid: 1565, comm: cat Tainted: G W 3.9.0-rc4+ #7 AMD Dinar/Dinar
[ 489.159654] RIP: 0010:[<ffffffff816be02c>] [<ffffffff816be02c>] show_up_threshold+0x1c/0x30
[ 489.167864] RSP: 0018:ffff880423859e88 EFLAGS: 00010246
[ 489.176043] RAX: 0000000000000000 RBX: ffff880a271188c0 RCX: ffffffff81a41810
[ 489.184372] RDX: 0000000000000000 RSI: ffffffff81e02dae RDI: ffffffff820d7860
[ 489.184373] RBP: ffff880423859e88 R08: ffffea0028b6df80 R09: 00000000001f05b8
[ 489.184374] R10: 0000000000001b97 R11: 0000000000000246 R12: ffff880423859f50
[ 489.184374] R13: 0000000000008000 R14: ffff880a271188a0 R15: ffff8804251aa070
[ 489.184377] FS: 00007f3278b31700(0000) GS:ffff880a2fcc0000(0000) knlGS:0000000000000000
[ 489.184378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 489.184378] CR2: 0000000000000010 CR3: 0000000a2db82000 CR4: 00000000000407e0
[ 489.184380] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 489.184381] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 489.184382] Process cat (pid: 1565, threadinfo ffff880423858000, task ffff880424805c00)
[ 489.184382] Stack:
[ 489.184389] ffff880423859e98 ffffffff8133814f ffff880423859ef8 ffffffff811fd62a
[ 489.184394] 000000002bf67be2 ffff8808260b7a80 ffffffff81a41810 00000000012c8000
[ 489.184399] ffff880423859ef8 0000000000008000 00000000012c8000 ffff880423859f50
[ 489.184400] Call Trace:
[ 489.184406] [<ffffffff8133814f>] kobj_attr_show+0xf/0x30
[ 489.184411] [<ffffffff811fd62a>] sysfs_read_file+0xaa/0x190
[ 489.184415] [<ffffffff81187e30>] vfs_read+0xb0/0x180
[ 489.184418] [<ffffffff81187f52>] sys_read+0x52/0xa0
[ 489.184422] [<ffffffff8181a7fe>] ? do_page_fault+0xe/0x10
[ 489.184426] [<ffffffff8181ee99>] system_call_fastpath+0x16/0x1b
[ 489.184441] Code: 52 08 e8 78 2c c8 ff 5d 48 98 c3 0f 1f 40 00 66 66 66 66 90 55 48 8b 57 70 48 89 f0 48 89 c7 48 c7 c6 ae 2d e0 81 31 c0 48 89 e5 <48> 8b 52 10 8b 52 0c e8 48 2c c8 ff 5d 48 98 c3 0f 1f 40 00 66
[ 489.184443] RIP [<ffffffff816be02c>] show_up_threshold+0x1c/0x30
[ 489.184443] RSP <ffff880423859e88>
[ 489.184444] CR2: 0000000000000010
[ 489.184507] ---[ end trace 1a5c6247c6d9b0c3 ]---
Any ideas?
Thanks,
On 26 March 2013 20:50, Jacob Shin <[email protected]> wrote:
> Hi, latest bleeding-edge is spewing this out on boot:
>
> [ 3.585157] ------------[ cut here ]------------
> [ 3.592227] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0xc8/0x100()
> [ 3.599521] Hardware name: Dinar
> [ 3.606878] sysfs: cannot create duplicate filename '/devices/system/cpu/cpufreq/ondemand'
> [ 3.614634] Modules linked in:
> [ 3.622382] Pid: 1, comm: swapper/0 Not tainted 3.9.0-rc4+ #7
> [ 3.630305] Call Trace:
> [ 3.638251] [<ffffffff810589cf>] warn_slowpath_common+0x7f/0xc0
> [ 3.646435] [<ffffffff81058ac6>] warn_slowpath_fmt+0x46/0x50
> [ 3.654586] [<ffffffff8133e2f0>] ? strlcat+0x60/0x80
> [ 3.662765] [<ffffffff811fe7d8>] sysfs_add_one+0xc8/0x100
> [ 3.670977] [<ffffffff811fe9cc>] create_dir+0x7c/0xd0
> [ 3.679239] [<ffffffff811fecaf>] sysfs_create_subdir+0x1f/0x30
> [ 3.687601] [<ffffffff812006c4>] internal_create_group+0x64/0x210
> [ 3.696098] [<ffffffff812008a3>] sysfs_create_group+0x13/0x20
> [ 3.704700] [<ffffffff816bf800>] cpufreq_governor_dbs+0x400/0x590
> [ 3.713401] [<ffffffff816bdc37>] od_cpufreq_governor_dbs+0x17/0x20
> [ 3.722191] [<ffffffff816bb437>] __cpufreq_governor+0x47/0xc0
> [ 3.731071] [<ffffffff816bb94d>] __cpufreq_set_policy+0x19d/0x1b0
> [ 3.739968] [<ffffffff816bca89>] cpufreq_add_dev_interface+0x259/0x2b0
> [ 3.748960] [<ffffffff813cdce6>] ? acpi_processor_get_performance_info+0x21c/0x452
> [ 3.758099] [<ffffffff816bc210>] ? cpufreq_update_policy+0x130/0x130
> [ 3.767366] [<ffffffff816bce90>] cpufreq_add_dev+0x3b0/0x4d0
> [ 3.776659] [<ffffffff821579d4>] ? cpufreq_gov_dbs_init+0x12/0x12
> [ 3.785985] [<ffffffff814e6a39>] subsys_interface_register+0x89/0xd0
> [ 3.795452] [<ffffffff816baf5e>] cpufreq_register_driver+0x8e/0x180
> [ 3.804919] [<ffffffff82157aca>] acpi_cpufreq_init+0xf6/0x1f8
> [ 3.814360] [<ffffffff814f5030>] ? set_trace_device+0x80/0x80
> [ 3.823558] [<ffffffff8100206f>] do_one_initcall+0x3f/0x170
> [ 3.832476] [<ffffffff8211b00a>] kernel_init_freeable+0x13e/0x1cd
> [ 3.841131] [<ffffffff8211a88e>] ? do_early_param+0x86/0x86
> [ 3.849506] [<ffffffff817f4c20>] ? rest_init+0x80/0x80
> [ 3.857557] [<ffffffff817f4c2e>] kernel_init+0xe/0xf0
> [ 3.865260] [<ffffffff8181edec>] ret_from_fork+0x7c/0xb0
> [ 3.872886] [<ffffffff817f4c20>] ? rest_init+0x80/0x80
> [ 3.880456] ---[ end trace 1a5c6247c6d9b0ac ]---
> [ 3.888201] ------------[ cut here ]------------
>
> This warning is repeated for number of cpus - 1 times.
>
> And when I do:
>
> $ cat /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
>
> [ 489.103388] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> [ 489.112064] IP: [<ffffffff816be02c>] show_up_threshold+0x1c/0x30
> [ 489.120511] PGD a285e6067 PUD a27085067 PMD 0
> [ 489.128690] Oops: 0000 [#1] SMP
> [ 489.136521] Modules linked in:
> [ 489.144134] CPU 15
> [ 489.144229] Pid: 1565, comm: cat Tainted: G W 3.9.0-rc4+ #7 AMD Dinar/Dinar
> [ 489.159654] RIP: 0010:[<ffffffff816be02c>] [<ffffffff816be02c>] show_up_threshold+0x1c/0x30
> [ 489.167864] RSP: 0018:ffff880423859e88 EFLAGS: 00010246
> [ 489.176043] RAX: 0000000000000000 RBX: ffff880a271188c0 RCX: ffffffff81a41810
> [ 489.184372] RDX: 0000000000000000 RSI: ffffffff81e02dae RDI: ffffffff820d7860
> [ 489.184373] RBP: ffff880423859e88 R08: ffffea0028b6df80 R09: 00000000001f05b8
> [ 489.184374] R10: 0000000000001b97 R11: 0000000000000246 R12: ffff880423859f50
> [ 489.184374] R13: 0000000000008000 R14: ffff880a271188a0 R15: ffff8804251aa070
> [ 489.184377] FS: 00007f3278b31700(0000) GS:ffff880a2fcc0000(0000) knlGS:0000000000000000
> [ 489.184378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 489.184378] CR2: 0000000000000010 CR3: 0000000a2db82000 CR4: 00000000000407e0
> [ 489.184380] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 489.184381] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 489.184382] Process cat (pid: 1565, threadinfo ffff880423858000, task ffff880424805c00)
> [ 489.184382] Stack:
> [ 489.184389] ffff880423859e98 ffffffff8133814f ffff880423859ef8 ffffffff811fd62a
> [ 489.184394] 000000002bf67be2 ffff8808260b7a80 ffffffff81a41810 00000000012c8000
> [ 489.184399] ffff880423859ef8 0000000000008000 00000000012c8000 ffff880423859f50
> [ 489.184400] Call Trace:
> [ 489.184406] [<ffffffff8133814f>] kobj_attr_show+0xf/0x30
> [ 489.184411] [<ffffffff811fd62a>] sysfs_read_file+0xaa/0x190
> [ 489.184415] [<ffffffff81187e30>] vfs_read+0xb0/0x180
> [ 489.184418] [<ffffffff81187f52>] sys_read+0x52/0xa0
> [ 489.184422] [<ffffffff8181a7fe>] ? do_page_fault+0xe/0x10
> [ 489.184426] [<ffffffff8181ee99>] system_call_fastpath+0x16/0x1b
> [ 489.184441] Code: 52 08 e8 78 2c c8 ff 5d 48 98 c3 0f 1f 40 00 66 66 66 66 90 55 48 8b 57 70 48 89 f0 48 89 c7 48 c7 c6 ae 2d e0 81 31 c0 48 89 e5 <48> 8b 52 10 8b 52 0c e8 48 2c c8 ff 5d 48 98 c3 0f 1f 40 00 66
> [ 489.184443] RIP [<ffffffff816be02c>] show_up_threshold+0x1c/0x30
> [ 489.184443] RSP <ffff880423859e88>
> [ 489.184444] CR2: 0000000000000010
> [ 489.184507] ---[ end trace 1a5c6247c6d9b0c3 ]---
>
> Any ideas?
Yes, I believe I have enough of an idea about it :)
There are two kinds of systems I know of:
1 - a single group of cpus controlled by a single clock line,
i.e. only one policy instance at any time
2 - multi-policy systems where we have more than one group of cpus
and every group has its own clock line.
For the second case there are again two sub-cases:
2.1 - supports have_multiple_policies (i.e. has a separate instance of the
governor for each policy struct)
2.2 - doesn't support have_multiple_policies
The last one (2.2) is broken with my patch and attached is the fix. I have
tested it on my Lenovo Thinkpad, which is more like the 2.2 case.
cat of cpufreq/ondemand/** is still broken and I am too tired of fixing it
now... It's already past midnight here, 01:01 AM.
--
viresh
On Wed, Mar 27, 2013 at 01:02:15AM +0530, Viresh Kumar wrote:
> On 26 March 2013 20:50, Jacob Shin <[email protected]> wrote:
> > Hi, latest bleeding-edge is spewing this out on boot:
> >
> > [ 3.585157] ------------[ cut here ]------------
> > [ 3.592227] WARNING: at fs/sysfs/dir.c:536 sysfs_add_one+0xc8/0x100()
> > [ 3.599521] Hardware name: Dinar
> > [ 3.606878] sysfs: cannot create duplicate filename '/devices/system/cpu/cpufreq/ondemand'
> > [ 3.614634] Modules linked in:
> > [ 3.622382] Pid: 1, comm: swapper/0 Not tainted 3.9.0-rc4+ #7
> > [ 3.630305] Call Trace:
> > [ 3.638251] [<ffffffff810589cf>] warn_slowpath_common+0x7f/0xc0
> > [ 3.646435] [<ffffffff81058ac6>] warn_slowpath_fmt+0x46/0x50
> > [ 3.654586] [<ffffffff8133e2f0>] ? strlcat+0x60/0x80
> > [ 3.662765] [<ffffffff811fe7d8>] sysfs_add_one+0xc8/0x100
> > [ 3.670977] [<ffffffff811fe9cc>] create_dir+0x7c/0xd0
> > [ 3.679239] [<ffffffff811fecaf>] sysfs_create_subdir+0x1f/0x30
> > [ 3.687601] [<ffffffff812006c4>] internal_create_group+0x64/0x210
> > [ 3.696098] [<ffffffff812008a3>] sysfs_create_group+0x13/0x20
> > [ 3.704700] [<ffffffff816bf800>] cpufreq_governor_dbs+0x400/0x590
> > [ 3.713401] [<ffffffff816bdc37>] od_cpufreq_governor_dbs+0x17/0x20
> > [ 3.722191] [<ffffffff816bb437>] __cpufreq_governor+0x47/0xc0
> > [ 3.731071] [<ffffffff816bb94d>] __cpufreq_set_policy+0x19d/0x1b0
> > [ 3.739968] [<ffffffff816bca89>] cpufreq_add_dev_interface+0x259/0x2b0
> > [ 3.748960] [<ffffffff813cdce6>] ? acpi_processor_get_performance_info+0x21c/0x452
> > [ 3.758099] [<ffffffff816bc210>] ? cpufreq_update_policy+0x130/0x130
> > [ 3.767366] [<ffffffff816bce90>] cpufreq_add_dev+0x3b0/0x4d0
> > [ 3.776659] [<ffffffff821579d4>] ? cpufreq_gov_dbs_init+0x12/0x12
> > [ 3.785985] [<ffffffff814e6a39>] subsys_interface_register+0x89/0xd0
> > [ 3.795452] [<ffffffff816baf5e>] cpufreq_register_driver+0x8e/0x180
> > [ 3.804919] [<ffffffff82157aca>] acpi_cpufreq_init+0xf6/0x1f8
> > [ 3.814360] [<ffffffff814f5030>] ? set_trace_device+0x80/0x80
> > [ 3.823558] [<ffffffff8100206f>] do_one_initcall+0x3f/0x170
> > [ 3.832476] [<ffffffff8211b00a>] kernel_init_freeable+0x13e/0x1cd
> > [ 3.841131] [<ffffffff8211a88e>] ? do_early_param+0x86/0x86
> > [ 3.849506] [<ffffffff817f4c20>] ? rest_init+0x80/0x80
> > [ 3.857557] [<ffffffff817f4c2e>] kernel_init+0xe/0xf0
> > [ 3.865260] [<ffffffff8181edec>] ret_from_fork+0x7c/0xb0
> > [ 3.872886] [<ffffffff817f4c20>] ? rest_init+0x80/0x80
> > [ 3.880456] ---[ end trace 1a5c6247c6d9b0ac ]---
> > [ 3.888201] ------------[ cut here ]------------
> >
> > This warning is repeated for number of cpus - 1 times.
> >
> > And when I do:
> >
> > $ cat /sys/devices/system/cpu/cpufreq/ondemand/up_threshold
> >
> > [ 489.103388] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> > [ 489.112064] IP: [<ffffffff816be02c>] show_up_threshold+0x1c/0x30
> > [ 489.120511] PGD a285e6067 PUD a27085067 PMD 0
> > [ 489.128690] Oops: 0000 [#1] SMP
> > [ 489.136521] Modules linked in:
> > [ 489.144134] CPU 15
> > [ 489.144229] Pid: 1565, comm: cat Tainted: G W 3.9.0-rc4+ #7 AMD Dinar/Dinar
> > [ 489.159654] RIP: 0010:[<ffffffff816be02c>] [<ffffffff816be02c>] show_up_threshold+0x1c/0x30
> > [ 489.167864] RSP: 0018:ffff880423859e88 EFLAGS: 00010246
> > [ 489.176043] RAX: 0000000000000000 RBX: ffff880a271188c0 RCX: ffffffff81a41810
> > [ 489.184372] RDX: 0000000000000000 RSI: ffffffff81e02dae RDI: ffffffff820d7860
> > [ 489.184373] RBP: ffff880423859e88 R08: ffffea0028b6df80 R09: 00000000001f05b8
> > [ 489.184374] R10: 0000000000001b97 R11: 0000000000000246 R12: ffff880423859f50
> > [ 489.184374] R13: 0000000000008000 R14: ffff880a271188a0 R15: ffff8804251aa070
> > [ 489.184377] FS: 00007f3278b31700(0000) GS:ffff880a2fcc0000(0000) knlGS:0000000000000000
> > [ 489.184378] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 489.184378] CR2: 0000000000000010 CR3: 0000000a2db82000 CR4: 00000000000407e0
> > [ 489.184380] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > [ 489.184381] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > [ 489.184382] Process cat (pid: 1565, threadinfo ffff880423858000, task ffff880424805c00)
> > [ 489.184382] Stack:
> > [ 489.184389] ffff880423859e98 ffffffff8133814f ffff880423859ef8 ffffffff811fd62a
> > [ 489.184394] 000000002bf67be2 ffff8808260b7a80 ffffffff81a41810 00000000012c8000
> > [ 489.184399] ffff880423859ef8 0000000000008000 00000000012c8000 ffff880423859f50
> > [ 489.184400] Call Trace:
> > [ 489.184406] [<ffffffff8133814f>] kobj_attr_show+0xf/0x30
> > [ 489.184411] [<ffffffff811fd62a>] sysfs_read_file+0xaa/0x190
> > [ 489.184415] [<ffffffff81187e30>] vfs_read+0xb0/0x180
> > [ 489.184418] [<ffffffff81187f52>] sys_read+0x52/0xa0
> > [ 489.184422] [<ffffffff8181a7fe>] ? do_page_fault+0xe/0x10
> > [ 489.184426] [<ffffffff8181ee99>] system_call_fastpath+0x16/0x1b
> > [ 489.184441] Code: 52 08 e8 78 2c c8 ff 5d 48 98 c3 0f 1f 40 00 66 66 66 66 90 55 48 8b 57 70 48 89 f0 48 89 c7 48 c7 c6 ae 2d e0 81 31 c0 48 89 e5 <48> 8b 52 10 8b 52 0c e8 48 2c c8 ff 5d 48 98 c3 0f 1f 40 00 66
> > [ 489.184443] RIP [<ffffffff816be02c>] show_up_threshold+0x1c/0x30
> > [ 489.184443] RSP <ffff880423859e88>
> > [ 489.184444] CR2: 0000000000000010
> > [ 489.184507] ---[ end trace 1a5c6247c6d9b0c3 ]---
> >
> > Any ideas?
>
> Yes, i believe i have enough idea about it :)
>
> There are two kind of systems i know:
> 1 - Single group of cpus controlled by a single clock line,
> i.e. only one policy instance at any time
> 2 - multipolicy systems where we have more than one group of cpus
> and every group have one clock line.
>
> For the second case also there are two cases:
> 2.1 - support have_multiple_policies (i.e. have separate instance of governor
> for each policy struct)
> 2.2 - doesn't support have_multiple_policies
>
> The last one (2.2) is broken with my patch and attached is the fix. I
> have tested
> it on my Lenovo Thinkpad which is more like 2.2 case.
>
> cat of cpufreq/ondemand/** is still broken and i am too tired of
> fixing it now...
> Its already midnight here 01:01 AM.
>
> --
> viresh
>
> diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
> index 41e5e56..f29feb4 100644
> --- a/drivers/cpufreq/cpufreq_governor.c
> +++ b/drivers/cpufreq/cpufreq_governor.c
> @@ -29,6 +29,9 @@
>
> #include "cpufreq_governor.h"
>
> +/* Common data for platforms that don't need governor instance per policy */
> +struct dbs_data *gdbs_data;
> +
Hmm.. I don't think this works for both ondemand and conservative governors
running at the same time.
On 27 March 2013 01:18, Jacob Shin <[email protected]> wrote:
> On Wed, Mar 27, 2013 at 01:02:15AM +0530, Viresh Kumar wrote:
>> +struct dbs_data *gdbs_data;
>> +
>
> Hmm .. I don't think this works for both ondemand and conservative
> governors running at the same time .
Yes, this should fix it (untested for now, I will provide a complete fix
today):
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index f29feb4..54ca5fc 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -29,9 +29,6 @@
#include "cpufreq_governor.h"
-/* Common data for platforms that don't need governor instance per policy */
-struct dbs_data *gdbs_data;
-
static inline u64 get_cpu_idle_time_jiffy(unsigned int cpu, u64 *wall)
{
u64 idle_time;
@@ -233,7 +230,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
if (have_multiple_policies())
dbs_data = policy->governor_data;
else
- dbs_data = gdbs_data;
+ dbs_data = cdata->gdbs_data;
WARN_ON(!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT));
@@ -289,7 +286,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
}
if (!have_multiple_policies())
- gdbs_data = dbs_data;
+ cdata->gdbs_data = dbs_data;
return 0;
case CPUFREQ_GOV_POLICY_EXIT:
@@ -307,7 +304,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
cdata->exit(dbs_data);
kfree(dbs_data);
- gdbs_data = NULL;
+ cdata->gdbs_data = NULL;
}
policy->governor_data = NULL;
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 1f7de13..cc4a189 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -133,6 +133,9 @@ struct common_dbs_data {
int governor;
struct attribute_group *attr_group;
+ /* Common data for platforms that don't set have_multiple_policies */
+ struct dbs_data *gdbs_data;
+
struct cpu_dbs_common_info *(*get_cpu_cdbs)(int cpu);
void *(*get_cpu_dbs_info_s)(int cpu);
void (*gov_dbs_timer)(struct work_struct *work);
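The essence of the fix is the lookup below, shown here as a stand-alone sketch
(dbs_data_for is a made-up helper name; in the patch the logic sits inline in
cpufreq_governor_dbs()): policies get their own dbs_data when the driver
supports multiple policies, otherwise each governor type falls back to the
single instance kept in its own common_dbs_data, so ondemand and conservative
no longer stomp on a shared global.

#include "cpufreq_governor.h"

/* Hypothetical helper mirroring the selection in cpufreq_governor_dbs():
 * per-policy tunables for case 2.1, one shared set per governor type for
 * cases 1 and 2.2. */
static struct dbs_data *dbs_data_for(struct cpufreq_policy *policy,
				     struct common_dbs_data *cdata)
{
	if (have_multiple_policies())
		return policy->governor_data;

	return cdata->gdbs_data;
}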
On 27 March 2013 09:59, Viresh Kumar <[email protected]> wrote:
> On 27 March 2013 01:18, Jacob Shin <[email protected]> wrote:
>> Hmm .. I don't think this works for both ondemand and conservative
>> governors running at the same time .
Hi Jacob,
This must be early morning for you and you must be looking for something
to start with.
I am still stuck on an issue which I am not able to fix:
- cat of */cpufreq/ondemand/** isn't showing anything on the console, but all
pointers are correctly set and I can see the right values with printk()
- I am able to see all prints with have_multiple_policies enabled.
- Due to this, I haven't tested any corner cases for now.
@Rafael: I will post all this as a separate patch later, which you can then
fold into the original commit. I am just waiting for a complete fix.
Following is the work I have done until now (attached too); ignore the earlier
fixups:
--------x-------------x------------------
From: Viresh Kumar <[email protected]>
Date: Tue, 26 Mar 2013 23:20:18 +0530
Subject: [PATCH] fixup! cpufreq: governor: Implement per policy instances of
governors
---
drivers/cpufreq/cpufreq.c | 13 +++++++
drivers/cpufreq/cpufreq_conservative.c | 24 ++++++------
drivers/cpufreq/cpufreq_governor.c | 71 ++++++++++++++++++++++++----------
drivers/cpufreq/cpufreq_governor.h | 28 ++++++++++++--
drivers/cpufreq/cpufreq_ondemand.c | 25 ++++++------
include/linux/cpufreq.h | 2 +
6 files changed, 113 insertions(+), 50 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 3d83b02..3d990b4 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -136,6 +136,11 @@ struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy)
return cpufreq_global_kobject;
}
+bool have_multiple_policies(void)
+{
+ return cpufreq_driver->have_multiple_policies;
+}
+
static struct cpufreq_policy *__cpufreq_cpu_get(unsigned int cpu, bool sysfs)
{
struct cpufreq_policy *data;
@@ -1554,6 +1559,13 @@ static int __cpufreq_governor(struct cpufreq_policy *policy,
policy->cpu, event);
ret = policy->governor->governor(policy, event);
+ if (!ret) {
+ if (event == CPUFREQ_GOV_POLICY_INIT)
+ policy->governor->initialized++;
+ else if (event == CPUFREQ_GOV_POLICY_EXIT)
+ policy->governor->initialized--;
+ }
+
/* we keep one module reference alive for
each CPU governed by this CPU */
if ((event != CPUFREQ_GOV_START) || ret)
@@ -1577,6 +1589,7 @@ int cpufreq_register_governor(struct cpufreq_governor *governor)
mutex_lock(&cpufreq_governor_mutex);
+ governor->initialized = 0;
err = -EBUSY;
if (__find_governor(governor->name) == NULL) {
err = 0;
diff --git a/drivers/cpufreq/cpufreq_conservative.c b/drivers/cpufreq/cpufreq_conservative.c
index 63499c8..c09d7c8 100644
--- a/drivers/cpufreq/cpufreq_conservative.c
+++ b/drivers/cpufreq/cpufreq_conservative.c
@@ -157,11 +157,13 @@ static int dbs_cpufreq_notifier(struct notifier_block *nb, unsigned long val,
}
/************************** sysfs interface ************************/
+static struct common_dbs_data cs_dbs_cdata;
+declare_get_tuners(cs);
+
static ssize_t store_sampling_down_factor(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+ struct cs_dbs_tuners *cs_tuners = cs_get_tuners(policy, NULL);
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -176,8 +178,8 @@ static ssize_t store_sampling_down_factor(struct cpufreq_policy *policy,
static ssize_t store_sampling_rate(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+ struct dbs_data *dbs_data = NULL;
+ struct cs_dbs_tuners *cs_tuners = cs_get_tuners(policy, &dbs_data);
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -192,8 +194,7 @@ static ssize_t store_sampling_rate(struct cpufreq_policy *policy,
static ssize_t store_up_threshold(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+ struct cs_dbs_tuners *cs_tuners = cs_get_tuners(policy, NULL);
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -208,8 +209,7 @@ static ssize_t store_up_threshold(struct cpufreq_policy *policy,
static ssize_t store_down_threshold(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+ struct cs_dbs_tuners *cs_tuners = cs_get_tuners(policy, NULL);
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -226,8 +226,7 @@ static ssize_t store_down_threshold(struct cpufreq_policy *policy,
static ssize_t store_ignore_nice(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+ struct cs_dbs_tuners *cs_tuners = cs_get_tuners(policy, NULL);
unsigned int input, j;
int ret;
@@ -259,8 +258,7 @@ static ssize_t store_ignore_nice(struct cpufreq_policy *policy,
static ssize_t store_freq_step(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+ struct cs_dbs_tuners *cs_tuners = cs_get_tuners(policy, NULL);
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -285,7 +283,7 @@ show_one(cs, up_threshold, up_threshold);
show_one(cs, down_threshold, down_threshold);
show_one(cs, ignore_nice, ignore_nice);
show_one(cs, freq_step, freq_step);
-declare_show_sampling_rate_min();
+declare_show_sampling_rate_min(cs);
cpufreq_freq_attr_rw(sampling_rate);
cpufreq_freq_attr_rw(sampling_down_factor);
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 41e5e56..54ca5fc 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -216,10 +216,9 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
int cpufreq_governor_dbs(struct cpufreq_policy *policy,
struct common_dbs_data *cdata, unsigned int event)
{
- struct dbs_data *dbs_data = policy->governor_data;
+ struct dbs_data *dbs_data;
struct od_cpu_dbs_info_s *od_dbs_info = NULL;
struct cs_cpu_dbs_info_s *cs_dbs_info = NULL;
- struct cs_ops *cs_ops = NULL;
struct od_ops *od_ops = NULL;
struct od_dbs_tuners *od_tuners = NULL;
struct cs_dbs_tuners *cs_tuners = NULL;
@@ -228,11 +227,22 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
int io_busy = 0;
int rc;
+ if (have_multiple_policies())
+ dbs_data = policy->governor_data;
+ else
+ dbs_data = cdata->gdbs_data;
+
WARN_ON(!dbs_data && (event != CPUFREQ_GOV_POLICY_INIT));
switch (event) {
case CPUFREQ_GOV_POLICY_INIT:
- WARN_ON(dbs_data);
+ if (have_multiple_policies()) {
+ WARN_ON(dbs_data);
+ } else if (dbs_data) {
+ policy->governor_data = dbs_data;
+ return 0;
+ }
+
dbs_data = kzalloc(sizeof(*dbs_data), GFP_KERNEL);
if (!dbs_data) {
pr_err("%s: POLICY_INIT: kzalloc failed\n", __func__);
@@ -246,6 +256,15 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
kfree(dbs_data);
return rc;
}
+
+ rc = sysfs_create_group(get_governor_parent_kobj(policy),
+ dbs_data->cdata->attr_group);
+ if (rc) {
+ cdata->exit(dbs_data);
+ kfree(dbs_data);
+ return rc;
+ }
+
policy->governor_data = dbs_data;
/* policy latency is in nS. Convert it to uS first */
@@ -258,10 +277,36 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
MIN_LATENCY_MULTIPLIER * latency);
set_sampling_rate(dbs_data, max(dbs_data->min_sampling_rate,
latency * LATENCY_MULTIPLIER));
+
+ if (dbs_data->cdata->governor == GOV_CONSERVATIVE) {
+ struct cs_ops *cs_ops = dbs_data->cdata->gov_ops;
+
+ cpufreq_register_notifier(cs_ops->notifier_block,
+ CPUFREQ_TRANSITION_NOTIFIER);
+ }
+
+ if (!have_multiple_policies())
+ cdata->gdbs_data = dbs_data;
+
return 0;
case CPUFREQ_GOV_POLICY_EXIT:
- cdata->exit(dbs_data);
- kfree(dbs_data);
+ if ((policy->governor->initialized == 1) ||
+ have_multiple_policies()) {
+ sysfs_remove_group(get_governor_parent_kobj(policy),
+ dbs_data->cdata->attr_group);
+
+ if (dbs_data->cdata->governor == GOV_CONSERVATIVE) {
+ struct cs_ops *cs_ops = dbs_data->cdata->gov_ops;
+
+ cpufreq_register_notifier(cs_ops->notifier_block,
+ CPUFREQ_TRANSITION_NOTIFIER);
+ }
+
+ cdata->exit(dbs_data);
+ kfree(dbs_data);
+ cdata->gdbs_data = NULL;
+ }
+
policy->governor_data = NULL;
return 0;
}
@@ -273,7 +318,6 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
cs_dbs_info = dbs_data->cdata->get_cpu_dbs_info_s(cpu);
sampling_rate = cs_tuners->sampling_rate;
ignore_nice = cs_tuners->ignore_nice;
- cs_ops = dbs_data->cdata->gov_ops;
} else {
od_tuners = dbs_data->tuners;
od_dbs_info = dbs_data->cdata->get_cpu_dbs_info_s(cpu);
@@ -307,13 +351,6 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
dbs_data->cdata->gov_dbs_timer);
}
- rc = sysfs_create_group(get_governor_parent_kobj(policy),
- dbs_data->cdata->attr_group);
- if (rc) {
- mutex_unlock(&dbs_data->mutex);
- return rc;
- }
-
/*
* conservative does not implement micro like ondemand
* governor, thus we are bound to jiffes/HZ
@@ -322,9 +359,6 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
cs_dbs_info->down_skip = 0;
cs_dbs_info->enable = 1;
cs_dbs_info->requested_freq = policy->cur;
-
- cpufreq_register_notifier(cs_ops->notifier_block,
- CPUFREQ_TRANSITION_NOTIFIER);
} else {
od_dbs_info->rate_mult = 1;
od_dbs_info->sample_type = OD_NORMAL_SAMPLE;
@@ -349,11 +383,6 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
mutex_lock(&dbs_data->mutex);
mutex_destroy(&cpu_cdbs->timer_mutex);
- sysfs_remove_group(get_governor_parent_kobj(policy),
- dbs_data->cdata->attr_group);
- if (dbs_data->cdata->governor == GOV_CONSERVATIVE)
- cpufreq_unregister_notifier(cs_ops->notifier_block,
- CPUFREQ_TRANSITION_NOTIFIER);
mutex_unlock(&dbs_data->mutex);
break;
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 8b18d25..fcf10d9 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -40,13 +40,29 @@
/* Ondemand Sampling types */
enum {OD_NORMAL_SAMPLE, OD_SUB_SAMPLE};
+
+#define declare_get_tuners(_gov) \
+static struct _gov##_dbs_tuners *_gov##_get_tuners \
+(struct cpufreq_policy *policy, struct dbs_data **ldbs_data) \
+{ \
+ if (have_multiple_policies()) { \
+ struct dbs_data *dbs_data = policy->governor_data; \
+ if (ldbs_data) \
+ *ldbs_data = dbs_data; \
+ return dbs_data->tuners; \
+ } else { \
+ if (ldbs_data) \
+ *ldbs_data = _gov##_dbs_cdata.gdbs_data; \
+ return _gov##_dbs_cdata.gdbs_data->tuners; \
+ } \
+}
+
/* Macro creating sysfs show routines */
#define show_one(_gov, file_name, object) \
static ssize_t show_##file_name \
(struct cpufreq_policy *policy, char *buf) \
{ \
- struct dbs_data *dbs_data = policy->governor_data; \
- struct _gov##_dbs_tuners *tuners = dbs_data->tuners; \
+ struct _gov##_dbs_tuners *tuners = _gov##_get_tuners(policy, NULL); \
return sprintf(buf, "%u\n", tuners->file_name); \
}
@@ -133,6 +149,9 @@ struct common_dbs_data {
int governor;
struct attribute_group *attr_group;
+ /* Common data for platforms that don't set have_multiple_policies */
+ struct dbs_data *gdbs_data;
+
struct cpu_dbs_common_info *(*get_cpu_cdbs)(int cpu);
void *(*get_cpu_dbs_info_s)(int cpu);
void (*gov_dbs_timer)(struct work_struct *work);
@@ -177,11 +196,12 @@ static inline int delay_for_sampling_rate(unsigned int sampling_rate)
return delay;
}
-#define declare_show_sampling_rate_min() \
+#define declare_show_sampling_rate_min(_gov) \
static ssize_t show_sampling_rate_min(struct cpufreq_policy *policy, \
char *buf) \
{ \
- struct dbs_data *dbs_data = policy->governor_data; \
+ struct dbs_data *dbs_data = NULL; \
+ _gov##_get_tuners(policy, &dbs_data); \
return sprintf(buf, "%u\n", dbs_data->min_sampling_rate); \
}
diff --git a/drivers/cpufreq/cpufreq_ondemand.c b/drivers/cpufreq/cpufreq_ondemand.c
index 29ed48a..4860137 100644
--- a/drivers/cpufreq/cpufreq_ondemand.c
+++ b/drivers/cpufreq/cpufreq_ondemand.c
@@ -257,6 +257,9 @@ max_delay:
}
/************************** sysfs interface ************************/
+static struct common_dbs_data od_dbs_cdata;
+declare_get_tuners(od);
+
/**
* update_sampling_rate - update sampling rate effective immediately if needed.
* @new_rate: new sampling rate
@@ -321,12 +324,14 @@ static void update_sampling_rate(struct dbs_data *dbs_data,
static ssize_t store_sampling_rate(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
+ struct dbs_data *dbs_data = NULL;
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
if (ret != 1)
return -EINVAL;
+
+ od_get_tuners(policy, &dbs_data);
update_sampling_rate(dbs_data, input);
return count;
}
@@ -334,8 +339,7 @@ static ssize_t store_sampling_rate(struct cpufreq_policy *policy,
static ssize_t store_io_is_busy(struct cpufreq_policy *policy, const char *buf,
size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+ struct od_dbs_tuners *od_tuners = od_get_tuners(policy, NULL);
unsigned int input;
int ret;
unsigned int j;
@@ -358,8 +362,7 @@ static ssize_t store_io_is_busy(struct cpufreq_policy *policy, const char *buf,
static ssize_t store_up_threshold(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+ struct od_dbs_tuners *od_tuners = od_get_tuners(policy, NULL);
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -379,8 +382,7 @@ static ssize_t store_up_threshold(struct cpufreq_policy *policy,
static ssize_t store_sampling_down_factor(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+ struct od_dbs_tuners *od_tuners = od_get_tuners(policy, NULL);
unsigned int input, j;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -401,8 +403,7 @@ static ssize_t store_sampling_down_factor(struct cpufreq_policy *policy,
static ssize_t store_ignore_nice(struct cpufreq_policy *policy, const char *buf,
size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+ struct od_dbs_tuners *od_tuners = od_get_tuners(policy, NULL);
unsigned int input;
int ret;
@@ -437,8 +438,7 @@ static ssize_t store_ignore_nice(struct cpufreq_policy *policy, const char *buf,
static ssize_t store_powersave_bias(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
- struct dbs_data *dbs_data = policy->governor_data;
- struct od_dbs_tuners *od_tuners = dbs_data->tuners;
+ struct od_dbs_tuners *od_tuners = od_get_tuners(policy, NULL);
unsigned int input;
int ret;
ret = sscanf(buf, "%u", &input);
@@ -460,7 +460,7 @@ show_one(od, up_threshold, up_threshold);
show_one(od, sampling_down_factor, sampling_down_factor);
show_one(od, ignore_nice, ignore_nice);
show_one(od, powersave_bias, powersave_bias);
-declare_show_sampling_rate_min();
+declare_show_sampling_rate_min(od);
cpufreq_freq_attr_rw(sampling_rate);
cpufreq_freq_attr_rw(io_is_busy);
@@ -530,6 +530,7 @@ static int od_init(struct dbs_data *dbs_data)
tuners->io_is_busy = should_io_be_busy();
dbs_data->tuners = tuners;
+ pr_info("%s: tuners %p\n", __func__, tuners);
mutex_init(&dbs_data->mutex);
return 0;
}
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index dd53cea..394ea0d 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -187,6 +187,7 @@ static inline unsigned long cpufreq_scale(unsigned long old, u_int div, u_int mu
struct cpufreq_governor {
char name[CPUFREQ_NAME_LEN];
+ int initialized;
int (*governor) (struct cpufreq_policy *policy,
unsigned int event);
ssize_t (*show_setspeed) (struct cpufreq_policy *policy,
@@ -323,6 +324,7 @@ const char *cpufreq_get_current_driver(void);
int cpufreq_get_policy(struct cpufreq_policy *policy, unsigned int cpu);
int cpufreq_update_policy(unsigned int cpu);
struct kobject *get_governor_parent_kobj(struct cpufreq_policy *policy);
+bool have_multiple_policies(void);
#ifdef CONFIG_CPU_FREQ
/* query the current CPU frequency (in kHz). If zero, cpufreq
couldn't detect it */
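To make the macro changes above easier to follow, here is a rough hand
expansion of declare_get_tuners(od) together with
show_one(od, up_threshold, up_threshold) as used in cpufreq_ondemand.c; it is
illustrative only, not a verbatim preprocessor dump:

/* declare_get_tuners(od): pick the per-policy tuners when the driver
 * supports multiple policies, else use the copy shared via od_dbs_cdata. */
static struct od_dbs_tuners *od_get_tuners(struct cpufreq_policy *policy,
					   struct dbs_data **ldbs_data)
{
	if (have_multiple_policies()) {
		struct dbs_data *dbs_data = policy->governor_data;

		if (ldbs_data)
			*ldbs_data = dbs_data;
		return dbs_data->tuners;
	}

	if (ldbs_data)
		*ldbs_data = od_dbs_cdata.gdbs_data;
	return od_dbs_cdata.gdbs_data->tuners;
}

/* show_one(od, up_threshold, up_threshold) then becomes: */
static ssize_t show_up_threshold(struct cpufreq_policy *policy, char *buf)
{
	struct od_dbs_tuners *tuners = od_get_tuners(policy, NULL);

	return sprintf(buf, "%u\n", tuners->up_threshold);
}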
On 27 March 2013 15:34, Viresh Kumar <[email protected]> wrote:
> I am still stuck at a issue, which i am not able to fix.
> - cat of */cpufreq/ondemand/** isn't showing anything on console, but all
> pointers are correctly set and i can see the right values with printk()
The culprit is the part where i changed prototypes of all show/store functions.
I am still trying to understand how that part works, and will fix it soon.
--
viresh
On 27 March 2013 17:05, Viresh Kumar <[email protected]> wrote:
> On 27 March 2013 15:34, Viresh Kumar <[email protected]> wrote:
>> I am still stuck at a issue, which i am not able to fix.
>> - cat of */cpufreq/ondemand/** isn't showing anything on console, but all
>> pointers are correctly set and i can see the right values with printk()
>
> The culprit is the part where i changed prototypes of all show/store functions.
> I am still trying to understand how that part works, and will fix it soon.
It seems I have broken some other stuff too with this patch:
commit 63173d01761a442ac5a7c73de996acbe9b0b3da7
Author: Viresh Kumar <[email protected]>
Date: Wed Feb 6 15:37:37 2013 +0530
cpufreq: Get rid of "struct global_attr"
We don't need to keep two structures for file attributes, global_attr and
freq_attr. Lets use freq_attr only for cpufreq core and drivers.
Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/cpufreq/acpi-cpufreq.c | 9 ++++-----
drivers/cpufreq/intel_pstate.c | 20 ++++++++++----------
include/linux/cpufreq.h | 16 ----------------
At least both acpi-cpufreq and intel_pstate are affected (and cpufreq as a
whole). :(
And I don't want to post any more fixups now; they are huge and will create
more confusion.
Rather, I will send the next version of my patchset and ask Rafael to drop the
earlier patches.
I will cc you; please try them at your end.
--
viresh