2013-07-05 08:46:53

by Chanwoo Choi

[permalink] [raw]
Subject: [PATCH 0/6] cpufreq: Add 'load_table' debugfs file to show colleced CPUs load

This patchset add 'load_table' debugfs file to provide collected CPUs data.
The load_table debugfs file gives below CPU datas.
- measured time
- old CPU frequency
- new CPU frequency
- each CPU load

These data will mean the change of CPU frequency according to CPUs load at
specific measured time. Also, the user can determine the storage size of colleced
CPUs data. The range is from 10 to 1000.

Second, previous performance/powersave governor haven't calculated CPUs load
becuase these governor didn't change CPU frequency according to CPUs load. But,
load_table debugfs file always should indicate the collected CPUs data regardless
of the kind of cpufreq governor. So, the patch3/4/5 implement that performance/
powersave governor will check periodically CPUs load by calling dbs_check_cpu()
with timer.

Finally, the patch 6 explain the detailed description of load_table debugfs file.

Thanks,
Chanwoo Choi

[Test Result]
- the kind of SoC : Samsung EXYNOS4412
- a range of frequency : 200 ~ 1400MHz

- Ondemand governor and the number of online CPU is 4
Time(ms) Old Freq(Hz) New Freq(Hz) CPU0 CPU1 CPU2 CPU3
23820 500000 500000 53 86 2 73
23920 500000 400000 66 40 0 42
24020 400000 400000 71 71 10 52
24120 400000 300000 33 27 45 65
24220 300000 300000 4 37 71 34
24320 300000 300000 1 85 38 16
24420 300000 200000 6 41 15 51
24520 200000 200000 12 62 1 51
24620 200000 200000 9 51 0 58
24720 200000 200000 32 32 11 27

- Performance governor and the number of online CPU is 4
Time(ms) Old Freq(Hz) New Freq(Hz) CPU0 CPU1 CPU2 CPU3
3425930 1400000 1400000 0 0 0 0
3425945 1400000 1400000 0 0 0 0
3425960 1400000 1400000 0 0 0 0
3427105 1400000 1400000 0 0 0 0
3427109 1400000 1400000 0 0 33 100
3428135 1400000 1400000 0 0 0 0
3428425 1400000 1400000 0 0 0 0
3429385 1400000 1400000 0 0 0 0
3429400 1400000 1400000 0 0 0 0
3429415 1400000 1400000 0 0 0 0

- Powersave governor and the number of online CPU is 4
Time(ms) Old Freq(Hz) New Freq(Hz) CPU0 CPU1 CPU2 CPU3
3451945 200000 200000 3 0 0 2
3453930 200000 200000 0 1 0 1
3453945 200000 200000 2 2 0 0
3453960 200000 200000 1 0 1 0
3455065 200000 200000 0 0 0 0
3455075 200000 200000 1 5 1 0
3455570 200000 200000 0 0 0 0
3455605 200000 200000 0 0 0 3
3456350 200000 200000 0 0 0 0
3456360 200000 200000 41 2 39 43

- Powersave governor and the number of online CPU is 2 (CPU[1-2] is offline)
Time(ms) Old Freq(Hz) New Freq(Hz) CPU0 CPU3
3501930 200000 200000 0 0
3502040 200000 200000 0 0
3502100 200000 200000 75 16
3502575 200000 200000 0 0
3503025 200000 200000 11 2
3503100 200000 200000 62 14
3503455 200000 200000 1 0
3503930 200000 200000 3 9
3504000 200000 200000 65 15
3504440 200000 200000 1 0

Chanwoo Choi (6):
cpufreq: Add debugfs directory for cpufreq
cpufreq: stats: Add 'load_table' debugfs file to show accumulated data of CPUs
cpufreq: Update governor core to support all governors
cpufreq: performance: Add support to collect CPUs load periodically
cpufreq: powersave: Add support to collect CPUs load periodically
Documentation: cpufreq: load_table: Update load_table debugfs file documentation

Documentation/cpu-freq/cpufreq-stats.txt | 39 ++++-
drivers/cpufreq/Kconfig | 6 +
drivers/cpufreq/cpufreq.c | 61 ++++++++
drivers/cpufreq/cpufreq_governor.c | 37 ++++-
drivers/cpufreq/cpufreq_governor.h | 11 ++
drivers/cpufreq/cpufreq_performance.c | 156 ++++++++++++++++++-
drivers/cpufreq/cpufreq_powersave.c | 158 ++++++++++++++++++-
drivers/cpufreq/cpufreq_stats.c | 256 ++++++++++++++++++++++++++++---
include/linux/cpufreq.h | 7 +
9 files changed, 692 insertions(+), 39 deletions(-)

--
1.8.0


2013-07-05 08:46:59

by Chanwoo Choi

[permalink] [raw]
Subject: [PATCH 4/6] cpufreq: performance: Add support to collect CPUs load periodically

This patch collect CPUs load when cpufreq governos is performance.
The collected CPUs load is used for providing data through debugfs file.
- /sys/kernel/debug/cpufreq/cpuX/load_table

And this patch create basic sysfs file as below:
- /sys/devices/system/cpu/cpufreq/performance/ignore_nice (rw)
- /sys/devices/system/cpu/cpufreq/performance/sampling_rate (rw)
- /sys/devices/system/cpu/cpufreq/performance/sampling_rate_min (r)

Signed-off-by: Chanwoo Choi <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
Signed-off-by: Myungjoo Ham <[email protected]>
---
drivers/cpufreq/cpufreq_governor.h | 1 +
drivers/cpufreq/cpufreq_performance.c | 156 +++++++++++++++++++++++++++++++++-
2 files changed, 154 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 55ef8c6..7ad4d3b 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -198,6 +198,7 @@ struct common_dbs_data {
/* Common across governors */
#define GOV_ONDEMAND 0
#define GOV_CONSERVATIVE 1
+ #define GOV_PERFORMANCE 2
int governor;
struct attribute_group *attr_group_gov_sys; /* one governor - system */
struct attribute_group *attr_group_gov_pol; /* one governor - policy */
diff --git a/drivers/cpufreq/cpufreq_performance.c b/drivers/cpufreq/cpufreq_performance.c
index ceee068..15195fc9 100644
--- a/drivers/cpufreq/cpufreq_performance.c
+++ b/drivers/cpufreq/cpufreq_performance.c
@@ -17,10 +17,158 @@
#include <linux/cpufreq.h>
#include <linux/init.h>

+#ifdef CONFIG_CPU_FREQ_STAT
+
+#include <linux/kernel_stat.h>
+#include <linux/slab.h>
+
+#include "cpufreq_governor.h"
+
+#define pf_cpu_dbs_info_s cpu_dbs_info_s
+#define pf_dbs_tuners dbs_tuners
+
+static DEFINE_PER_CPU(struct pf_cpu_dbs_info_s, pf_cpu_dbs_info);
+static struct common_dbs_data pf_dbs_cdata;
+
+static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
+ size_t count)
+{
+ struct pf_dbs_tuners *tuners = dbs_data->tuners;
+ unsigned int input;
+ int ret;
+ ret = sscanf(buf, "%u", &input);
+
+ if (ret != 1)
+ return -EINVAL;
+
+ tuners->sampling_rate = max(input, dbs_data->min_sampling_rate);
+ return count;
+}
+
+static ssize_t store_ignore_nice(struct dbs_data *dbs_data, const char *buf,
+ size_t count)
+{
+ struct pf_dbs_tuners *tuners = dbs_data->tuners;
+ unsigned int input, j;
+ int ret;
+
+ ret = sscanf(buf, "%u", &input);
+ if (ret != 1)
+ return -EINVAL;
+
+ if (input > 1)
+ input = 1;
+
+ if (input == tuners->ignore_nice) /* nothing to do */
+ return count;
+
+ tuners->ignore_nice = input;
+
+ /* we need to re-evaluate prev_cpu_idle */
+ for_each_online_cpu(j) {
+ struct pf_cpu_dbs_info_s *dbs_info;
+ dbs_info = &per_cpu(pf_cpu_dbs_info, j);
+ dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
+ &dbs_info->cdbs.prev_cpu_wall, 0);
+ if (tuners->ignore_nice)
+ dbs_info->cdbs.prev_cpu_nice =
+ kcpustat_cpu(j).cpustat[CPUTIME_NICE];
+ }
+ return count;
+}
+
+show_store_one(pf, sampling_rate);
+show_store_one(pf, ignore_nice);
+declare_show_sampling_rate_min(pf);
+
+gov_sys_pol_attr_rw(sampling_rate);
+gov_sys_pol_attr_rw(ignore_nice);
+gov_sys_pol_attr_ro(sampling_rate_min);
+
+static struct attribute *dbs_attributes_gov_sys[] = {
+ &sampling_rate_min_gov_sys.attr,
+ &sampling_rate_gov_sys.attr,
+ &ignore_nice_gov_sys.attr,
+ NULL
+};
+
+static struct attribute_group pf_attr_group_gov_sys = {
+ .attrs = dbs_attributes_gov_sys,
+ .name = "performance",
+};
+
+static struct attribute *dbs_attributes_gov_pol[] = {
+ &sampling_rate_min_gov_pol.attr,
+ &sampling_rate_gov_pol.attr,
+ &ignore_nice_gov_pol.attr,
+ NULL
+};
+
+static struct attribute_group pf_attr_group_gov_pol = {
+ .attrs = dbs_attributes_gov_pol,
+ .name = "performance",
+};
+
+static void pf_dbs_timer(struct work_struct *work)
+{
+ struct pf_cpu_dbs_info_s *dbs_info = container_of(work,
+ struct pf_cpu_dbs_info_s, cdbs.work.work);
+ unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
+ struct dbs_data *dbs_data = dbs_info->cdbs.cur_policy->governor_data;
+ unsigned int delay = 0;
+
+ delay = usecs_to_jiffies(dbs_data->min_sampling_rate);
+
+ mutex_lock(&dbs_info->cdbs.timer_mutex);
+ dbs_check_cpu(dbs_data, cpu);
+ gov_queue_work(dbs_data, dbs_info->cdbs.cur_policy, delay, false);
+ mutex_unlock(&dbs_info->cdbs.timer_mutex);
+}
+
+static int pf_init(struct dbs_data *dbs_data)
+{
+ struct pf_dbs_tuners *tuners;
+
+ tuners = kzalloc(sizeof(struct pf_dbs_tuners), GFP_KERNEL);
+ if (!tuners) {
+ pr_err("%s: kzalloc failed\n", __func__);
+ return -ENOMEM;
+ }
+
+ tuners->ignore_nice = 0;
+
+ dbs_data->tuners = tuners;
+ dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO *
+ jiffies_to_usecs(100);
+
+ mutex_init(&dbs_data->mutex);
+
+ return 0;
+}
+static void pf_exit(struct dbs_data *dbs_data)
+{
+ kfree(dbs_data->tuners);
+}
+
+define_get_cpu_dbs_routines(pf_cpu_dbs_info);
+
+static struct common_dbs_data pf_dbs_cdata = {
+ .governor = GOV_PERFORMANCE,
+ .attr_group_gov_sys = &pf_attr_group_gov_sys,
+ .attr_group_gov_pol = &pf_attr_group_gov_pol,
+ .get_cpu_cdbs = get_cpu_cdbs,
+ .get_cpu_dbs_info_s = get_cpu_dbs_info_s,
+ .gov_dbs_timer = pf_dbs_timer,
+ .init = pf_init,
+ .exit = pf_exit,
+};
+#endif /* CONFIG_CPU_FREQ_STAT */

static int cpufreq_governor_performance(struct cpufreq_policy *policy,
unsigned int event)
{
+ int ret = 0;
+
switch (event) {
case CPUFREQ_GOV_START:
case CPUFREQ_GOV_LIMITS:
@@ -29,10 +177,12 @@ static int cpufreq_governor_performance(struct cpufreq_policy *policy,
__cpufreq_driver_target(policy, policy->max,
CPUFREQ_RELATION_H);
break;
- default:
- break;
}
- return 0;
+
+#ifdef CONFIG_CPU_FREQ_STAT
+ ret = cpufreq_governor_dbs(policy, &pf_dbs_cdata, event);
+#endif
+ return ret;
}

#ifdef CONFIG_CPU_FREQ_GOV_PERFORMANCE_MODULE
--
1.8.0

2013-07-05 08:47:26

by Chanwoo Choi

[permalink] [raw]
Subject: [PATCH v5 2/6] cpufreq: stats: Add 'load_table' debugfs file to show accumulated data of CPUs

This patch add new 'load_table' debugfs file to show previous accumulated data
of CPUs load as following path and add CPUFREQ_LOADCHECK notification to
CPUFREQ_TRANSITION_NOTIFIER notifier chain.
- /sys/kernel/debug/cpufreq/cpuX/load_table

When governor calculates CPUs load on dbs_check_cpu(), governor send
CPUFREQ_LOADCHECK notification with CPUs load, so that cpufreq_stats
accumulates calculated CPUs load on 'load_table' storage.

This debugfs file is used to judge the correct system state or determine
suitable system resource according to current CPUs load on user-space.

This debugfs file include following data:
- Measurement point of time
- CPU frequency
- Per-CPU load

Signed-off-by: Chanwoo Choi <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
Signed-off-by: Myungjoo Ham <[email protected]>
---
Changes since v4:
- Reset the data of CPUs load when cpufreq governor is performance or powersave
- Move code about creating debugfs directory to below first patch
: [PATCH 1/2] cpufreq: Add debugfs directory for cpufreq

Changes since v3:
- Extend a range of accumulated data (10 ~ 1000)
- Add unit information of time/freq and align 'Time' field as left for readability
- Use CONFIG_CPU_FREQ_STAT depdendency instead of CONFIG_CPU_FREQ_STAT_DETATILS
- Initialize load of offline CPUx as zero(0)
- Create/remove debugfs root directory on cpufreq_stats_init/exit() because
debugfs root is used on all CPUs.

Changes since v2:
- Code clean according to Viresh Kumar's comment
- Show both old frequency and new frequency on 'load_table' debugfs file
- Change debufs file patch as below
old: /sys/kernel/debugfs/cpufreq/load_table
new: /sys/kernel/debugfs/cpufreq/cpuX/load_table

Changes since v1:
- Set maximum storage size to save CPUs load on Kconfig
- Use spinlock to synchronize read/write operation for CPUs load
- Use local variable instead of global variable(struct cpufreq_freqs *freqs)
- Use pointer of data structure to get correct size of data structure
in sizeof() macro instead of structure name
: sizeof(struct cpufreq_freqs) -> sizeof(*stat->load_table)
- Change time unit from nanosecond to microsecond
- Remove unnecessary memory copy

Following Test result :
- Cpufreq governor : ondemand governor
- Test application : MP3 play + Picture Audo-slide application
- NR_CPU_LOAD_STORAGE : 10
- command : cat /sys/kernel/debug/cpufreq/cpu0/load_table

drivers/cpufreq/Kconfig | 6 +
drivers/cpufreq/cpufreq.c | 4 +
drivers/cpufreq/cpufreq_governor.c | 13 ++
drivers/cpufreq/cpufreq_stats.c | 256 +++++++++++++++++++++++++++++++++----
include/linux/cpufreq.h | 6 +
5 files changed, 261 insertions(+), 24 deletions(-)

diff --git a/drivers/cpufreq/Kconfig b/drivers/cpufreq/Kconfig
index 534fcb8..5c3f406 100644
--- a/drivers/cpufreq/Kconfig
+++ b/drivers/cpufreq/Kconfig
@@ -36,6 +36,12 @@ config CPU_FREQ_STAT

If in doubt, say N.

+config NR_CPU_LOAD_STORAGE
+ int "Maximum storage size to save CPU load (10-1000)"
+ range 10 1000
+ depends on CPU_FREQ_STAT
+ default "10"
+
config CPU_FREQ_STAT_DETAILS
bool "CPU frequency translation statistics details"
depends on CPU_FREQ_STAT
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index bc01c8e..8a2c4c2 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -297,6 +297,10 @@ void __cpufreq_notify_transition(struct cpufreq_policy *policy,
if (likely(policy) && likely(policy->cpu == freqs->cpu))
policy->cur = freqs->new;
break;
+ case CPUFREQ_LOADCHECK:
+ srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
+ CPUFREQ_LOADCHECK, freqs);
+ break;
}
}
/**
diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index dc9b72e..3e73107 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -87,6 +87,9 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
struct od_dbs_tuners *od_tuners = dbs_data->tuners;
struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
struct cpufreq_policy *policy;
+#ifdef CONFIG_CPU_FREQ_STAT
+ struct cpufreq_freqs freq = {0};
+#endif
unsigned int max_load = 0;
unsigned int ignore_nice;
unsigned int j;
@@ -148,6 +151,9 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
continue;

load = 100 * (wall_time - idle_time) / wall_time;
+#ifdef CONFIG_CPU_FREQ_STAT
+ freq.load[j] = load;
+#endif

if (dbs_data->cdata->governor == GOV_ONDEMAND) {
int freq_avg = __cpufreq_driver_getavg(policy, j);
@@ -161,6 +167,13 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
max_load = load;
}

+#ifdef CONFIG_CPU_FREQ_STAT
+ freq.time = ktime_to_ms(ktime_get());
+ freq.old = policy->cur;
+
+ cpufreq_notify_transition(policy, &freq, CPUFREQ_LOADCHECK);
+#endif
+
dbs_data->cdata->gov_check_cpu(cpu, max_load);
}
EXPORT_SYMBOL_GPL(dbs_check_cpu);
diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
index fb65dec..3211f46 100644
--- a/drivers/cpufreq/cpufreq_stats.c
+++ b/drivers/cpufreq/cpufreq_stats.c
@@ -12,6 +12,7 @@
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/cpu.h>
+#include <linux/debugfs.h>
#include <linux/sysfs.h>
#include <linux/cpufreq.h>
#include <linux/module.h>
@@ -36,6 +37,12 @@ struct cpufreq_stats {
#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
unsigned int *trans_table;
#endif
+
+ /* Debugfs file for load_table */
+ struct cpufreq_freqs *load_table;
+ unsigned int load_last_index;
+ unsigned int load_max_index;
+ struct dentry *debugfs_load_table;
};

static DEFINE_PER_CPU(struct cpufreq_stats *, cpufreq_stats_table);
@@ -149,6 +156,179 @@ static struct attribute_group stats_attr_group = {
.name = "stats"
};

+#define MAX_LINE_SIZE 255
+static ssize_t load_table_read(struct file *file, char __user *user_buf,
+ size_t count, loff_t *ppos)
+{
+ struct cpufreq_policy *policy = file->private_data;
+ struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, policy->cpu);
+ struct cpufreq_freqs *load_table = stat->load_table;
+ ssize_t len = 0;
+ char *buf;
+ int i, cpu, ret;
+
+ buf = kzalloc(MAX_LINE_SIZE * stat->load_max_index, GFP_KERNEL);
+ if (!buf)
+ return 0;
+
+ spin_lock(&cpufreq_stats_lock);
+ len += sprintf(buf + len, "%-10s %-12s %-12s ", "Time(ms)",
+ "Old Freq(Hz)",
+ "New Freq(Hz)");
+ for_each_cpu(cpu, policy->cpus)
+ len += sprintf(buf + len, "%3s%d ", "CPU", cpu);
+ len += sprintf(buf + len, "\n");
+
+ i = stat->load_last_index;
+ do {
+ len += sprintf(buf + len, "%-10lld %-12d %-12d ",
+ load_table[i].time,
+ load_table[i].old,
+ load_table[i].new);
+
+ for_each_cpu(cpu, policy->cpus)
+ len += sprintf(buf + len, "%-4d ",
+ load_table[i].load[cpu]);
+ len += sprintf(buf + len, "\n");
+
+ if (++i == stat->load_max_index)
+ i = 0;
+ } while (i != stat->load_last_index);
+ spin_unlock(&cpufreq_stats_lock);
+
+ ret = simple_read_from_buffer(user_buf, count, ppos, buf, len);
+ kfree(buf);
+
+ return ret;
+}
+
+static const struct file_operations load_table_fops = {
+ .read = load_table_read,
+ .open = simple_open,
+ .llseek = no_llseek,
+};
+
+static int cpufreq_stats_reset_debugfs(struct cpufreq_policy *policy)
+{
+ struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, policy->cpu);
+ int size;
+
+ if (!stat)
+ return -EINVAL;
+
+ if (stat->load_table)
+ kfree(stat->load_table);
+ stat->load_last_index = 0;
+
+ size = sizeof(*stat->load_table) * stat->load_max_index;
+ stat->load_table = kzalloc(size, GFP_KERNEL);
+ if (!stat->load_table)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static int cpufreq_stats_create_debugfs(struct cpufreq_policy *policy)
+{
+ struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, policy->cpu);
+ int size, ret = 0;
+
+ if (!stat)
+ return -EINVAL;
+
+ if (!policy->cpu_debugfs)
+ return -EINVAL;
+
+ stat->load_last_index = 0;
+ stat->load_max_index = CONFIG_NR_CPU_LOAD_STORAGE;
+
+ /* Allocate memory for storage of CPUs load */
+ size = sizeof(*stat->load_table) * stat->load_max_index;
+ stat->load_table = kzalloc(size, GFP_KERNEL);
+ if (!stat->load_table)
+ return -ENOMEM;
+
+ /* Create debugfs directory and file for cpufreq */
+ stat->debugfs_load_table = debugfs_create_file("load_table", S_IWUSR,
+ policy->cpu_debugfs[policy->cpu],
+ policy, &load_table_fops);
+ if (!stat->debugfs_load_table) {
+ ret = -ENOMEM;
+ goto err;
+ }
+
+ pr_debug("Creating debugfs file for CPU%d \n", policy->cpu);
+
+ return 0;
+err:
+ kfree(stat->load_table);
+ return ret;
+}
+
+/* should be called late in the CPU removal sequence so that the stats
+ * memory is still available in case someone tries to use it.
+ */
+static void cpufreq_stats_free_load_table(unsigned int cpu)
+{
+ struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, cpu);
+
+ if (stat) {
+ pr_debug("Free memory of load_table\n");
+ kfree(stat->load_table);
+ }
+}
+
+/* must be called early in the CPU removal sequence (before
+ * cpufreq_remove_dev) so that policy is still valid.
+ */
+static void cpufreq_stats_free_debugfs(unsigned int cpu)
+{
+ struct cpufreq_stats *stat = per_cpu(cpufreq_stats_table, cpu);
+
+ if (stat) {
+ pr_debug("Remove load_table debugfs file\n");
+ debugfs_remove(stat->debugfs_load_table);
+ }
+}
+
+static void cpufreq_stats_store_load_table(struct cpufreq_freqs *freq,
+ unsigned long val)
+{
+ struct cpufreq_stats *stat;
+ int cpu, last_idx;
+
+ stat = per_cpu(cpufreq_stats_table, freq->cpu);
+ if (!stat)
+ return;
+
+ spin_lock(&cpufreq_stats_lock);
+
+ switch (val) {
+ case CPUFREQ_POSTCHANGE:
+ if (!stat->load_last_index)
+ last_idx = stat->load_max_index;
+ else
+ last_idx = stat->load_last_index - 1;
+
+ stat->load_table[last_idx].new = freq->new;
+ break;
+ case CPUFREQ_LOADCHECK:
+ last_idx = stat->load_last_index;
+
+ stat->load_table[last_idx].time = freq->time;
+ stat->load_table[last_idx].old = freq->old;
+ stat->load_table[last_idx].new = freq->old;
+ for_each_present_cpu(cpu)
+ stat->load_table[last_idx].load[cpu] = freq->load[cpu];
+
+ if (++stat->load_last_index == stat->load_max_index)
+ stat->load_last_index = 0;
+ break;
+ }
+
+ spin_unlock(&cpufreq_stats_lock);
+}
+
static int freq_table_get_index(struct cpufreq_stats *stat, unsigned int freq)
{
int index;
@@ -204,7 +384,7 @@ static int cpufreq_stats_create_table(struct cpufreq_policy *policy,
unsigned int alloc_size;
unsigned int cpu = policy->cpu;
if (per_cpu(cpufreq_stats_table, cpu))
- return -EBUSY;
+ return 0;
stat = kzalloc(sizeof(struct cpufreq_stats), GFP_KERNEL);
if ((stat) == NULL)
return -ENOMEM;
@@ -257,6 +437,14 @@ static int cpufreq_stats_create_table(struct cpufreq_policy *policy,
spin_lock(&cpufreq_stats_lock);
stat->last_time = get_jiffies_64();
stat->last_index = freq_table_get_index(stat, policy->cur);
+
+ ret = cpufreq_stats_create_debugfs(data);
+ if (ret < 0) {
+ spin_unlock(&cpufreq_stats_lock);
+ ret = -EINVAL;
+ goto error_out;
+ }
+
spin_unlock(&cpufreq_stats_lock);
cpufreq_cpu_put(data);
return 0;
@@ -297,11 +485,18 @@ static int cpufreq_stat_notifier_policy(struct notifier_block *nb,
if (val != CPUFREQ_NOTIFY)
return 0;
table = cpufreq_frequency_get_table(cpu);
- if (!table)
- return 0;
- ret = cpufreq_stats_create_table(policy, table);
- if (ret)
+ if (table) {
+ ret = cpufreq_stats_create_table(policy, table);
+ if (ret)
+ return ret;
+ }
+
+ ret = cpufreq_stats_reset_debugfs(policy);
+ if (ret < 0) {
+ pr_debug("Failed to reset CPUs data of debugfs\n");
return ret;
+ }
+
return 0;
}

@@ -312,32 +507,40 @@ static int cpufreq_stat_notifier_trans(struct notifier_block *nb,
struct cpufreq_stats *stat;
int old_index, new_index;

- if (val != CPUFREQ_POSTCHANGE)
- return 0;
+ switch (val) {
+ case CPUFREQ_POSTCHANGE:
+ stat = per_cpu(cpufreq_stats_table, freq->cpu);
+ if (!stat)
+ return 0;

- stat = per_cpu(cpufreq_stats_table, freq->cpu);
- if (!stat)
- return 0;
+ old_index = stat->last_index;
+ new_index = freq_table_get_index(stat, freq->new);

- old_index = stat->last_index;
- new_index = freq_table_get_index(stat, freq->new);
+ /* We can't do stat->time_in_state[-1]= .. */
+ if (old_index == -1 || new_index == -1)
+ return 0;

- /* We can't do stat->time_in_state[-1]= .. */
- if (old_index == -1 || new_index == -1)
- return 0;
+ cpufreq_stats_update(freq->cpu);

- cpufreq_stats_update(freq->cpu);
+ if (old_index == new_index)
+ return 0;

- if (old_index == new_index)
- return 0;
-
- spin_lock(&cpufreq_stats_lock);
- stat->last_index = new_index;
+ spin_lock(&cpufreq_stats_lock);
+ stat->last_index = new_index;
#ifdef CONFIG_CPU_FREQ_STAT_DETAILS
- stat->trans_table[old_index * stat->max_state + new_index]++;
+ stat->trans_table[old_index * stat->max_state + new_index]++;
#endif
- stat->total_trans++;
- spin_unlock(&cpufreq_stats_lock);
+ stat->total_trans++;
+ spin_unlock(&cpufreq_stats_lock);
+
+ cpufreq_stats_store_load_table(freq, CPUFREQ_POSTCHANGE);
+
+ break;
+ case CPUFREQ_LOADCHECK:
+ cpufreq_stats_store_load_table(freq, CPUFREQ_LOADCHECK);
+ break;
+ }
+
return 0;
}

@@ -352,12 +555,16 @@ static int __cpuinit cpufreq_stat_cpu_callback(struct notifier_block *nfb,
cpufreq_update_policy(cpu);
break;
case CPU_DOWN_PREPARE:
+ cpufreq_stats_free_debugfs(cpu);
cpufreq_stats_free_sysfs(cpu);
break;
case CPU_DEAD:
+ cpufreq_stats_free_load_table(cpu);
cpufreq_stats_free_table(cpu);
break;
case CPU_UP_CANCELED_FROZEN:
+ cpufreq_stats_free_debugfs(cpu);
+ cpufreq_stats_free_load_table(cpu);
cpufreq_stats_free_sysfs(cpu);
cpufreq_stats_free_table(cpu);
break;
@@ -418,6 +625,7 @@ static void __exit cpufreq_stats_exit(void)
unregister_hotcpu_notifier(&cpufreq_stat_cpu_notifier);
for_each_online_cpu(cpu) {
cpufreq_stats_free_table(cpu);
+ cpufreq_stats_free_debugfs(cpu);
cpufreq_stats_free_sysfs(cpu);
}
}
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 825f379..4d0a4fd 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -141,12 +141,18 @@ static inline bool policy_is_shared(struct cpufreq_policy *policy)
#define CPUFREQ_POSTCHANGE (1)
#define CPUFREQ_RESUMECHANGE (8)
#define CPUFREQ_SUSPENDCHANGE (9)
+#define CPUFREQ_LOADCHECK (10)

struct cpufreq_freqs {
unsigned int cpu; /* cpu nr */
unsigned int old;
unsigned int new;
u8 flags; /* flags of cpufreq_driver, see below. */
+
+#ifdef CONFIG_CPU_FREQ_STAT
+ int64_t time;
+ unsigned int load[NR_CPUS];
+#endif
};


--
1.8.0

2013-07-05 08:47:24

by Chanwoo Choi

[permalink] [raw]
Subject: [PATCH 3/6] cpufreq: Update governor core to support all governors

The cpufreq_governor.c only support ondemand and conservative governor.
So, this patch update governor core to support all governors.

Signed-off-by: Chanwoo Choi <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
Signed-off-by: Myungjoo Ham <[email protected]>
---
drivers/cpufreq/cpufreq_governor.c | 24 +++++++++++++++++++-----
drivers/cpufreq/cpufreq_governor.h | 9 +++++++++
2 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.c b/drivers/cpufreq/cpufreq_governor.c
index 3e73107..ea282f8 100644
--- a/drivers/cpufreq/cpufreq_governor.c
+++ b/drivers/cpufreq/cpufreq_governor.c
@@ -86,6 +86,7 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
struct cpu_dbs_common_info *cdbs = dbs_data->cdata->get_cpu_cdbs(cpu);
struct od_dbs_tuners *od_tuners = dbs_data->tuners;
struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
+ struct cs_dbs_tuners *tuners = dbs_data->tuners;
struct cpufreq_policy *policy;
#ifdef CONFIG_CPU_FREQ_STAT
struct cpufreq_freqs freq = {0};
@@ -96,8 +97,10 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)

if (dbs_data->cdata->governor == GOV_ONDEMAND)
ignore_nice = od_tuners->ignore_nice;
- else
+ else if (dbs_data->cdata->governor == GOV_CONSERVATIVE)
ignore_nice = cs_tuners->ignore_nice;
+ else
+ ignore_nice = tuners->ignore_nice;

policy = cdbs->cur_policy;

@@ -174,7 +177,8 @@ void dbs_check_cpu(struct dbs_data *dbs_data, int cpu)
cpufreq_notify_transition(policy, &freq, CPUFREQ_LOADCHECK);
#endif

- dbs_data->cdata->gov_check_cpu(cpu, max_load);
+ if (dbs_data->cdata->gov_check_cpu)
+ dbs_data->cdata->gov_check_cpu(cpu, max_load);
}
EXPORT_SYMBOL_GPL(dbs_check_cpu);

@@ -239,9 +243,12 @@ static void set_sampling_rate(struct dbs_data *dbs_data,
if (dbs_data->cdata->governor == GOV_CONSERVATIVE) {
struct cs_dbs_tuners *cs_tuners = dbs_data->tuners;
cs_tuners->sampling_rate = sampling_rate;
- } else {
+ } else if (dbs_data->cdata->governor == GOV_ONDEMAND) {
struct od_dbs_tuners *od_tuners = dbs_data->tuners;
od_tuners->sampling_rate = sampling_rate;
+ } else {
+ struct dbs_tuners *tuners = dbs_data->tuners;
+ tuners->sampling_rate = sampling_rate;
}
}

@@ -251,9 +258,11 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
struct dbs_data *dbs_data;
struct od_cpu_dbs_info_s *od_dbs_info = NULL;
struct cs_cpu_dbs_info_s *cs_dbs_info = NULL;
+ struct cpu_dbs_info_s *dbs_info = NULL;
struct od_ops *od_ops = NULL;
struct od_dbs_tuners *od_tuners = NULL;
struct cs_dbs_tuners *cs_tuners = NULL;
+ struct dbs_tuners *tuners = NULL;
struct cpu_dbs_common_info *cpu_cdbs;
unsigned int sampling_rate, latency, ignore_nice, j, cpu = policy->cpu;
int io_busy = 0;
@@ -353,13 +362,18 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
cs_dbs_info = dbs_data->cdata->get_cpu_dbs_info_s(cpu);
sampling_rate = cs_tuners->sampling_rate;
ignore_nice = cs_tuners->ignore_nice;
- } else {
+ } else if (dbs_data->cdata->governor == GOV_ONDEMAND) {
od_tuners = dbs_data->tuners;
od_dbs_info = dbs_data->cdata->get_cpu_dbs_info_s(cpu);
sampling_rate = od_tuners->sampling_rate;
ignore_nice = od_tuners->ignore_nice;
od_ops = dbs_data->cdata->gov_ops;
io_busy = od_tuners->io_is_busy;
+ } else {
+ tuners = dbs_data->tuners;
+ dbs_info = dbs_data->cdata->get_cpu_dbs_info_s(cpu);
+ sampling_rate = tuners->sampling_rate;
+ ignore_nice = tuners->ignore_nice;
}

switch (event) {
@@ -394,7 +408,7 @@ int cpufreq_governor_dbs(struct cpufreq_policy *policy,
cs_dbs_info->down_skip = 0;
cs_dbs_info->enable = 1;
cs_dbs_info->requested_freq = policy->cur;
- } else {
+ } else if (dbs_data->cdata->governor == GOV_ONDEMAND) {
od_dbs_info->rate_mult = 1;
od_dbs_info->sample_type = OD_NORMAL_SAMPLE;
od_ops->powersave_bias_init_cpu(cpu);
diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index e16a961..55ef8c6 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -163,6 +163,10 @@ struct cs_cpu_dbs_info_s {
unsigned int enable:1;
};

+struct cpu_dbs_info_s {
+ struct cpu_dbs_common_info cdbs;
+};
+
/* Per policy Governers sysfs tunables */
struct od_dbs_tuners {
unsigned int ignore_nice;
@@ -183,6 +187,11 @@ struct cs_dbs_tuners {
unsigned int freq_step;
};

+struct dbs_tuners {
+ unsigned int ignore_nice;
+ unsigned int sampling_rate;
+};
+
/* Common Governer data across policies */
struct dbs_data;
struct common_dbs_data {
--
1.8.0

2013-07-05 08:47:21

by Chanwoo Choi

[permalink] [raw]
Subject: [PATCH 6/6] Documentation: cpufreq: load_table: Update load_table debugfs file documentation

This patch add the detailed description of 'load_table' debugfs file to show
collected CPUs load and the change of CPU frequency.

Signed-off-by: Chanwoo Choi <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
Signed-off-by: Myungjoo Ham <[email protected]>
---
Documentation/cpu-freq/cpufreq-stats.txt | 39 ++++++++++++++++++++++++++++----
1 file changed, 35 insertions(+), 4 deletions(-)

diff --git a/Documentation/cpu-freq/cpufreq-stats.txt b/Documentation/cpu-freq/cpufreq-stats.txt
index fc64749..8c2e4f2 100644
--- a/Documentation/cpu-freq/cpufreq-stats.txt
+++ b/Documentation/cpu-freq/cpufreq-stats.txt
@@ -18,10 +18,12 @@ Contents
1. Introduction

cpufreq-stats is a driver that provides CPU frequency statistics for each CPU.
-These statistics are provided in /sysfs as a bunch of read_only interfaces. This
-interface (when configured) will appear in a separate directory under cpufreq
-in /sysfs (<sysfs root>/devices/system/cpu/cpuX/cpufreq/stats/) for each CPU.
-Various statistics will form read_only files under this directory.
+These statistics are provided in /sysfs and /debugfs as a bunch of read_only
+interfaces. This interface (when configured) will appear in a separate directory
+under cpufreq in below directory list for each CPU. Various statistics will form
+read_only files under this directory.
+- /sysfs (<sysfs root>/devices/system/cpu/cpuX/cpufreq/stats/)
+- /debugfs (<sysfs root>/kernel/debug/cpufreq/cpuX/)

This driver is designed to be independent of any particular cpufreq_driver
that may be running on your CPU. So, it will work with any cpufreq_driver.
@@ -33,6 +35,7 @@ cpufreq stats provides following statistics (explained in detail below).
- time_in_state
- total_trans
- trans_table
+- load_table

All the statistics will be from the time the stats driver has been inserted
to the time when a read of a particular statistic is done. Obviously, stats
@@ -95,6 +98,28 @@ contains the actual freq values for each row and column for better readability.
2800000: 0 0 0 2 0
--------------------------------------------------------------------------------

+- load_table
+This gives the collected CPUs data which include measured time, old CPU
+frequency, new CPU frequency and each CPU load. The cat output will have
+"<measured time> <old CPU frequency> <new CPU frequency> <CPUs load> ..." pair
+in each line, which will mean the change of CPU frequency according to CPUs load
+at <measured time>.
+
+--------------------------------------------------------------------------------
+<mysystem>:/sys/kernel/debug/cpufreq/cpu0 # cat load_table
+Time(ms) Old Freq(Hz) New Freq(Hz) CPU0 CPU1 CPU2 CPU3
+23820 500000 500000 53 86 2 73
+23920 500000 400000 66 40 0 42
+24020 400000 400000 71 71 10 52
+24120 400000 300000 33 27 45 65
+24220 300000 300000 4 37 71 34
+24320 300000 300000 1 85 38 16
+24420 300000 200000 6 41 15 51
+24520 200000 200000 12 62 1 51
+24620 200000 200000 9 51 0 58
+24720 200000 200000 32 32 11 27
+--------------------------------------------------------------------------------
+

3. Configuring cpufreq-stats

@@ -104,6 +129,7 @@ Config Main Menu
CPU Frequency scaling --->
[*] CPU Frequency scaling
<*> CPU frequency translation statistics
+ (10) Maximum storage size to save CPU load (10-1000)
[*] CPU frequency translation statistics details


@@ -113,6 +139,11 @@ cpufreq-stats.
"CPU frequency translation statistics" (CONFIG_CPU_FREQ_STAT) provides the
basic statistics which includes time_in_state and total_trans.

+"Maximum storage size to save CPU load (10-1000)" (depends on CONFIG_CPU_FREQ_
+STAT) indicates the total number of load_table data and provides collected data
+which include old cpu frequency, new cpu frequency and CPUs load. The user can
+determine the storage size of collected CPUs data. The range is from 10 to 1000.
+
"CPU frequency translation statistics details" (CONFIG_CPU_FREQ_STAT_DETAILS)
provides fine grained cpufreq stats by trans_table. The reason for having a
separate config option for trans_table is:
--
1.8.0

2013-07-05 08:47:19

by Chanwoo Choi

[permalink] [raw]
Subject: [PATCH 5/6] cpufreq: powersave: Add support to collect CPUs load periodically

This patch collect CPUs load and create create basic sysfs file as below:
- /sys/devices/system/cpu/cpufreq/powersave/ignore_nice (rw)
- /sys/devices/system/cpu/cpufreq/powersave/sampling_rate (rw)
- /sys/devices/system/cpu/cpufreq/powersave/sampling_rate_min (r)

Signed-off-by: Chanwoo Choi <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
Signed-off-by: Myungjoo Ham <[email protected]>
---
drivers/cpufreq/cpufreq_governor.h | 1 +
drivers/cpufreq/cpufreq_powersave.c | 158 +++++++++++++++++++++++++++++++++++-
2 files changed, 156 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq_governor.h b/drivers/cpufreq/cpufreq_governor.h
index 7ad4d3b..a926c74 100644
--- a/drivers/cpufreq/cpufreq_governor.h
+++ b/drivers/cpufreq/cpufreq_governor.h
@@ -199,6 +199,7 @@ struct common_dbs_data {
#define GOV_ONDEMAND 0
#define GOV_CONSERVATIVE 1
#define GOV_PERFORMANCE 2
+ #define GOV_POWERSAVE 3
int governor;
struct attribute_group *attr_group_gov_sys; /* one governor - system */
struct attribute_group *attr_group_gov_pol; /* one governor - policy */
diff --git a/drivers/cpufreq/cpufreq_powersave.c b/drivers/cpufreq/cpufreq_powersave.c
index 2d948a1..39c7857 100644
--- a/drivers/cpufreq/cpufreq_powersave.c
+++ b/drivers/cpufreq/cpufreq_powersave.c
@@ -17,9 +17,159 @@
#include <linux/cpufreq.h>
#include <linux/init.h>

+#ifdef CONFIG_CPU_FREQ_STAT
+
+#include <linux/kernel_stat.h>
+#include <linux/slab.h>
+
+#include "cpufreq_governor.h"
+
+#define ps_cpu_dbs_info_s cpu_dbs_info_s
+#define ps_dbs_tuners dbs_tuners
+
+static DEFINE_PER_CPU(struct cpu_dbs_info_s, ps_cpu_dbs_info);
+static struct common_dbs_data ps_dbs_cdata;
+
+static ssize_t store_sampling_rate(struct dbs_data *dbs_data, const char *buf,
+ size_t count)
+{
+ struct dbs_tuners *tuners = dbs_data->tuners;
+ unsigned int input;
+ int ret;
+ ret = sscanf(buf, "%u", &input);
+
+ if (ret != 1)
+ return -EINVAL;
+
+ tuners->sampling_rate = max(input, dbs_data->min_sampling_rate);
+ return count;
+}
+
+static ssize_t store_ignore_nice(struct dbs_data *dbs_data, const char *buf,
+ size_t count)
+{
+ struct dbs_tuners *tuners = dbs_data->tuners;
+ unsigned int input, j;
+ int ret;
+
+ ret = sscanf(buf, "%u", &input);
+ if (ret != 1)
+ return -EINVAL;
+
+ if (input > 1)
+ input = 1;
+
+ if (input == tuners->ignore_nice) /* nothing to do */
+ return count;
+
+ tuners->ignore_nice = input;
+
+ /* we need to re-evaluate prev_cpu_idle */
+ for_each_online_cpu(j) {
+ struct ps_cpu_dbs_info_s *dbs_info;
+ dbs_info = &per_cpu(ps_cpu_dbs_info, j);
+ dbs_info->cdbs.prev_cpu_idle = get_cpu_idle_time(j,
+ &dbs_info->cdbs.prev_cpu_wall, 0);
+ if (tuners->ignore_nice)
+ dbs_info->cdbs.prev_cpu_nice =
+ kcpustat_cpu(j).cpustat[CPUTIME_NICE];
+ }
+ return count;
+}
+
+show_store_one(ps, sampling_rate);
+show_store_one(ps, ignore_nice);
+declare_show_sampling_rate_min(ps);
+
+gov_sys_pol_attr_rw(sampling_rate);
+gov_sys_pol_attr_rw(ignore_nice);
+gov_sys_pol_attr_ro(sampling_rate_min);
+
+static struct attribute *dbs_attributes_gov_sys[] = {
+ &sampling_rate_min_gov_sys.attr,
+ &sampling_rate_gov_sys.attr,
+ &ignore_nice_gov_sys.attr,
+ NULL
+};
+
+static struct attribute_group ps_attr_group_gov_sys = {
+ .attrs = dbs_attributes_gov_sys,
+ .name = "powersave",
+};
+
+static struct attribute *dbs_attributes_gov_pol[] = {
+ &sampling_rate_min_gov_pol.attr,
+ &sampling_rate_gov_pol.attr,
+ &ignore_nice_gov_pol.attr,
+ NULL
+};
+
+static struct attribute_group ps_attr_group_gov_pol = {
+ .attrs = dbs_attributes_gov_pol,
+ .name = "powersave",
+};
+
+static void ps_dbs_timer(struct work_struct *work)
+{
+ struct cpu_dbs_info_s *dbs_info = container_of(work,
+ struct cpu_dbs_info_s, cdbs.work.work);
+ unsigned int cpu = dbs_info->cdbs.cur_policy->cpu;
+ struct dbs_data *dbs_data = dbs_info->cdbs.cur_policy->governor_data;
+ struct dbs_tuners *tuners = dbs_data->tuners;
+ unsigned int delay = 0;
+
+ delay = max(tuners->sampling_rate, dbs_data->min_sampling_rate);
+ delay = usecs_to_jiffies(delay);
+
+ mutex_lock(&dbs_info->cdbs.timer_mutex);
+ dbs_check_cpu(dbs_data, cpu);
+ gov_queue_work(dbs_data, dbs_info->cdbs.cur_policy, delay, false);
+ mutex_unlock(&dbs_info->cdbs.timer_mutex);
+}
+
+static int ps_init(struct dbs_data *dbs_data)
+{
+ struct dbs_tuners *tuners;
+
+ tuners = kzalloc(sizeof(struct dbs_tuners), GFP_KERNEL);
+ if (!tuners) {
+ pr_err("%s: kzalloc failed\n", __func__);
+ return -ENOMEM;
+ }
+
+ tuners->ignore_nice = 0;
+
+ dbs_data->tuners = tuners;
+ dbs_data->min_sampling_rate = MIN_SAMPLING_RATE_RATIO
+ * jiffies_to_usecs(100);
+ mutex_init(&dbs_data->mutex);
+
+ return 0;
+}
+static void ps_exit(struct dbs_data *dbs_data)
+{
+ kfree(dbs_data->tuners);
+}
+
+define_get_cpu_dbs_routines(ps_cpu_dbs_info);
+
+static struct common_dbs_data ps_dbs_cdata = {
+ .governor = GOV_POWERSAVE,
+ .attr_group_gov_sys = &ps_attr_group_gov_sys,
+ .attr_group_gov_pol = &ps_attr_group_gov_pol,
+ .get_cpu_cdbs = get_cpu_cdbs,
+ .get_cpu_dbs_info_s = get_cpu_dbs_info_s,
+ .gov_dbs_timer = ps_dbs_timer,
+ .init = ps_init,
+ .exit = ps_exit,
+};
+#endif /* CONFIG_CPU_FREQ_STAT */
+
static int cpufreq_governor_powersave(struct cpufreq_policy *policy,
unsigned int event)
{
+ int ret = 0;
+
switch (event) {
case CPUFREQ_GOV_START:
case CPUFREQ_GOV_LIMITS:
@@ -28,10 +178,12 @@ static int cpufreq_governor_powersave(struct cpufreq_policy *policy,
__cpufreq_driver_target(policy, policy->min,
CPUFREQ_RELATION_L);
break;
- default:
- break;
}
- return 0;
+
+#ifdef CONFIG_CPU_FREQ_STAT
+ ret = cpufreq_governor_dbs(policy, &ps_dbs_cdata, event);
+#endif
+ return ret;
}

#ifndef CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE
--
1.8.0

2013-07-05 08:48:33

by Chanwoo Choi

[permalink] [raw]
Subject: [PATCH 1/6] cpufreq: Add debugfs directory for cpufreq

This patch create debugfs root directory and child directory according to
the number of CPUs for CPUFreq as below debugfs directory path:
- /sys/kernel/debug/cpufreq/cpuX

If many CPUs share only one cpufreq policy, other CPU(except for CPU0) create
a link for debugfs directory of CPU0.
- /sys/kernel/debug/cpufreq/cpu0
- /sys/kernel/debug/cpufreq/cpu[1-(N-1)] -> /sys/kernel/debug/cpufreq/cpu0

And then cpufreq may need to create debugfs specific file below of debugfs
directory of cpufreq. (e.g., /sys/kernel/debug/cpufreq/cpu0/load_table)

Signed-off-by: Chanwoo Choi <[email protected]>
Signed-off-by: Kyungmin Park <[email protected]>
Signed-off-by: Myungjoo Ham <[email protected]>
---
drivers/cpufreq/cpufreq.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++
include/linux/cpufreq.h | 1 +
2 files changed, 58 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 2d53f47..bc01c8e 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -23,6 +23,7 @@
#include <linux/notifier.h>
#include <linux/cpufreq.h>
#include <linux/delay.h>
+#include <linux/debugfs.h>
#include <linux/interrupt.h>
#include <linux/spinlock.h>
#include <linux/device.h>
@@ -47,6 +48,10 @@ static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
#endif
static DEFINE_RWLOCK(cpufreq_driver_lock);

+/* The cpufreq_debugfs is used to create debugfs root directory for CPUFreq. */
+#define MAX_DEBUGFS_NAME_LEN CPUFREQ_NAME_LEN
+static struct dentry *cpufreq_debugfs;
+
/*
* cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
* all cpufreq/hotplug/workqueue/etc related lock issues.
@@ -726,6 +731,20 @@ static int cpufreq_add_dev_symlink(unsigned int cpu,
cpufreq_cpu_put(managed_policy);
return ret;
}
+
+ if (cpufreq_debugfs) {
+ char symlink_name[MAX_DEBUGFS_NAME_LEN];
+ char target_name[MAX_DEBUGFS_NAME_LEN];
+
+ sprintf(symlink_name, "cpu%d", j);
+ sprintf(target_name, "./cpu%d", cpu);
+ managed_policy->cpu_debugfs[j] = debugfs_create_symlink(
+ symlink_name,
+ cpufreq_debugfs,
+ target_name);
+ if (!managed_policy->cpu_debugfs[j])
+ pr_debug("creating debugfs symlink failed\n");
+ }
}
return ret;
}
@@ -746,6 +765,22 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
if (ret)
return ret;

+ /* prepare interface data for debugfs */
+ if (cpufreq_debugfs) {
+ char name[MAX_DEBUGFS_NAME_LEN];
+ int size, i;
+
+ sprintf(name, "cpu%d", policy->cpu);
+ size = sizeof(struct dentry*) * NR_CPUS;
+ i = cpu;
+
+ policy->cpu_debugfs = devm_kzalloc(dev, size, GFP_KERNEL);
+ policy->cpu_debugfs[i] = debugfs_create_dir(name,
+ cpufreq_debugfs);
+ if (!policy->cpu_debugfs[i])
+ pr_debug("creating debugfs directory failed\n");
+ }
+
/* set up files for this cpu device */
drv_attr = cpufreq_driver->attr;
while ((drv_attr) && (*drv_attr)) {
@@ -839,6 +874,20 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
return ret;
}

+ if (cpufreq_debugfs) {
+ char symlink_name[MAX_DEBUGFS_NAME_LEN];
+ char target_name[MAX_DEBUGFS_NAME_LEN];
+
+ sprintf(symlink_name, "cpu%d", cpu);
+ sprintf(target_name, "./cpu%d", sibling);
+ policy->cpu_debugfs[cpu] = debugfs_create_symlink(
+ symlink_name,
+ cpufreq_debugfs,
+ target_name);
+ if (!policy->cpu_debugfs[cpu])
+ pr_debug("creating debugfs symlink failed\n");
+ }
+
return 0;
}
#endif
@@ -1046,6 +1095,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif

if (cpu != data->cpu) {
sysfs_remove_link(&dev->kobj, "cpufreq");
+ debugfs_remove(data->cpu_debugfs[cpu]);
} else if (cpus > 1) {
/* first sibling now owns the new sysfs dir */
cpu_dev = get_cpu_device(cpumask_first(data->cpus));
@@ -1068,6 +1118,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
return -EINVAL;
}

+ debugfs_remove_recursive(data->cpu_debugfs[cpu]);
+ debugfs_remove(cpufreq_debugfs);
+
WARN_ON(lock_policy_rwsem_write(cpu));
update_policy_cpu(data, cpu_dev->id);
unlock_policy_rwsem_write(cpu);
@@ -1976,6 +2029,10 @@ static int __init cpufreq_core_init(void)
BUG_ON(!cpufreq_global_kobject);
register_syscore_ops(&cpufreq_syscore_ops);

+ cpufreq_debugfs = debugfs_create_dir("cpufreq", NULL);
+ if (!cpufreq_debugfs)
+ pr_debug("creating debugfs root failed\n");
+
return 0;
}
core_initcall(cpufreq_core_init);
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 037d36a..825f379 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -115,6 +115,7 @@ struct cpufreq_policy {

struct kobject kobj;
struct completion kobj_unregister;
+ struct dentry **cpu_debugfs;
};

#define CPUFREQ_ADJUST (0)
--
1.8.0

2013-07-07 18:54:46

by Pankaj Jangra

[permalink] [raw]
Subject: Re: [PATCH 1/6] cpufreq: Add debugfs directory for cpufreq

Hi Chanwoo,

On Fri, Jul 5, 2013 at 1:46 AM, Chanwoo Choi <[email protected]> wrote:
> This patch create debugfs root directory and child directory according to
> the number of CPUs for CPUFreq as below debugfs directory path:
> - /sys/kernel/debug/cpufreq/cpuX
>
> If many CPUs share only one cpufreq policy, other CPU(except for CPU0) create
> a link for debugfs directory of CPU0.
> - /sys/kernel/debug/cpufreq/cpu0
> - /sys/kernel/debug/cpufreq/cpu[1-(N-1)] -> /sys/kernel/debug/cpufreq/cpu0
>
> And then cpufreq may need to create debugfs specific file below of debugfs
> directory of cpufreq. (e.g., /sys/kernel/debug/cpufreq/cpu0/load_table)
>
> Signed-off-by: Chanwoo Choi <[email protected]>
> Signed-off-by: Kyungmin Park <[email protected]>
> Signed-off-by: Myungjoo Ham <[email protected]>
> ---
> drivers/cpufreq/cpufreq.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/cpufreq.h | 1 +
> 2 files changed, 58 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 2d53f47..bc01c8e 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -23,6 +23,7 @@
> #include <linux/notifier.h>
> #include <linux/cpufreq.h>
> #include <linux/delay.h>
> +#include <linux/debugfs.h>
> #include <linux/interrupt.h>
> #include <linux/spinlock.h>
> #include <linux/device.h>
> @@ -47,6 +48,10 @@ static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
> #endif
> static DEFINE_RWLOCK(cpufreq_driver_lock);
>
> +/* The cpufreq_debugfs is used to create debugfs root directory for CPUFreq. */
> +#define MAX_DEBUGFS_NAME_LEN CPUFREQ_NAME_LEN
> +static struct dentry *cpufreq_debugfs;
> +
> /*
> * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
> * all cpufreq/hotplug/workqueue/etc related lock issues.
> @@ -726,6 +731,20 @@ static int cpufreq_add_dev_symlink(unsigned int cpu,
> cpufreq_cpu_put(managed_policy);
> return ret;
> }
> +
> + if (cpufreq_debugfs) {
> + char symlink_name[MAX_DEBUGFS_NAME_LEN];
> + char target_name[MAX_DEBUGFS_NAME_LEN];
> +
> + sprintf(symlink_name, "cpu%d", j);
> + sprintf(target_name, "./cpu%d", cpu);
> + managed_policy->cpu_debugfs[j] = debugfs_create_symlink(
> + symlink_name,
> + cpufreq_debugfs,
> + target_name);
> + if (!managed_policy->cpu_debugfs[j])
> + pr_debug("creating debugfs symlink failed\n");
> + }
> }
> return ret;
> }
> @@ -746,6 +765,22 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
> if (ret)
> return ret;
>
> + /* prepare interface data for debugfs */
> + if (cpufreq_debugfs) {
> + char name[MAX_DEBUGFS_NAME_LEN];
> + int size, i;
> +
> + sprintf(name, "cpu%d", policy->cpu);
> + size = sizeof(struct dentry*) * NR_CPUS;
> + i = cpu;
> +
> + policy->cpu_debugfs = devm_kzalloc(dev, size, GFP_KERNEL);
> + policy->cpu_debugfs[i] = debugfs_create_dir(name,
> + cpufreq_debugfs);
> + if (!policy->cpu_debugfs[i])
> + pr_debug("creating debugfs directory failed\n");
> + }
> +
> /* set up files for this cpu device */
> drv_attr = cpufreq_driver->attr;
> while ((drv_attr) && (*drv_attr)) {
> @@ -839,6 +874,20 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
> return ret;
> }
>
> + if (cpufreq_debugfs) {
> + char symlink_name[MAX_DEBUGFS_NAME_LEN];
> + char target_name[MAX_DEBUGFS_NAME_LEN];
> +
> + sprintf(symlink_name, "cpu%d", cpu);
> + sprintf(target_name, "./cpu%d", sibling);
> + policy->cpu_debugfs[cpu] = debugfs_create_symlink(
> + symlink_name,
> + cpufreq_debugfs,
> + target_name);
> + if (!policy->cpu_debugfs[cpu])
> + pr_debug("creating debugfs symlink failed\n");
> + }
> +
> return 0;
> }
> #endif
> @@ -1046,6 +1095,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>
> if (cpu != data->cpu) {
> sysfs_remove_link(&dev->kobj, "cpufreq");
> + debugfs_remove(data->cpu_debugfs[cpu]);

I think you should add the call to remove the "cpufreq" also ???
"debugfs_remove(cpufreq_debugfs);"
> } else if (cpus > 1) {
> /* first sibling now owns the new sysfs dir */
> cpu_dev = get_cpu_device(cpumask_first(data->cpus));
> @@ -1068,6 +1118,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
> return -EINVAL;
> }
>
> + debugfs_remove_recursive(data->cpu_debugfs[cpu]);
> + debugfs_remove(cpufreq_debugfs);
> +
> WARN_ON(lock_policy_rwsem_write(cpu));
> update_policy_cpu(data, cpu_dev->id);
> unlock_policy_rwsem_write(cpu);
> @@ -1976,6 +2029,10 @@ static int __init cpufreq_core_init(void)
> BUG_ON(!cpufreq_global_kobject);
> register_syscore_ops(&cpufreq_syscore_ops);
>
> + cpufreq_debugfs = debugfs_create_dir("cpufreq", NULL);
> + if (!cpufreq_debugfs)
> + pr_debug("creating debugfs root failed\n");
> +
> return 0;
> }
> core_initcall(cpufreq_core_init);
> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
> index 037d36a..825f379 100644
> --- a/include/linux/cpufreq.h
> +++ b/include/linux/cpufreq.h
> @@ -115,6 +115,7 @@ struct cpufreq_policy {
>
> struct kobject kobj;
> struct completion kobj_unregister;
> + struct dentry **cpu_debugfs;
> };
>
> #define CPUFREQ_ADJUST (0)
> --
> 1.8.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

Regards,
Pankaj Jangra

2013-07-07 23:50:24

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 1/6] cpufreq: Add debugfs directory for cpufreq

Hi Pankaj,

On 07/08/2013 03:54 AM, Pankaj Jangra wrote:
> Hi Chanwoo,
>
> On Fri, Jul 5, 2013 at 1:46 AM, Chanwoo Choi <[email protected]> wrote:
>> This patch create debugfs root directory and child directory according to
>> the number of CPUs for CPUFreq as below debugfs directory path:
>> - /sys/kernel/debug/cpufreq/cpuX
>>
>> If many CPUs share only one cpufreq policy, other CPU(except for CPU0) create
>> a link for debugfs directory of CPU0.
>> - /sys/kernel/debug/cpufreq/cpu0
>> - /sys/kernel/debug/cpufreq/cpu[1-(N-1)] -> /sys/kernel/debug/cpufreq/cpu0
>>
>> And then cpufreq may need to create debugfs specific file below of debugfs
>> directory of cpufreq. (e.g., /sys/kernel/debug/cpufreq/cpu0/load_table)
>>
>> Signed-off-by: Chanwoo Choi <[email protected]>
>> Signed-off-by: Kyungmin Park <[email protected]>
>> Signed-off-by: Myungjoo Ham <[email protected]>
>> ---
>> drivers/cpufreq/cpufreq.c | 57 +++++++++++++++++++++++++++++++++++++++++++++++
>> include/linux/cpufreq.h | 1 +
>> 2 files changed, 58 insertions(+)
>>
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>> index 2d53f47..bc01c8e 100644
>> --- a/drivers/cpufreq/cpufreq.c
>> +++ b/drivers/cpufreq/cpufreq.c
>> @@ -23,6 +23,7 @@
>> #include <linux/notifier.h>
>> #include <linux/cpufreq.h>
>> #include <linux/delay.h>
>> +#include <linux/debugfs.h>
>> #include <linux/interrupt.h>
>> #include <linux/spinlock.h>
>> #include <linux/device.h>
>> @@ -47,6 +48,10 @@ static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
>> #endif
>> static DEFINE_RWLOCK(cpufreq_driver_lock);
>>
>> +/* The cpufreq_debugfs is used to create debugfs root directory for CPUFreq. */
>> +#define MAX_DEBUGFS_NAME_LEN CPUFREQ_NAME_LEN
>> +static struct dentry *cpufreq_debugfs;
>> +
>> /*
>> * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
>> * all cpufreq/hotplug/workqueue/etc related lock issues.
>> @@ -726,6 +731,20 @@ static int cpufreq_add_dev_symlink(unsigned int cpu,
>> cpufreq_cpu_put(managed_policy);
>> return ret;
>> }
>> +
>> + if (cpufreq_debugfs) {
>> + char symlink_name[MAX_DEBUGFS_NAME_LEN];
>> + char target_name[MAX_DEBUGFS_NAME_LEN];
>> +
>> + sprintf(symlink_name, "cpu%d", j);
>> + sprintf(target_name, "./cpu%d", cpu);
>> + managed_policy->cpu_debugfs[j] = debugfs_create_symlink(
>> + symlink_name,
>> + cpufreq_debugfs,
>> + target_name);
>> + if (!managed_policy->cpu_debugfs[j])
>> + pr_debug("creating debugfs symlink failed\n");
>> + }
>> }
>> return ret;
>> }
>> @@ -746,6 +765,22 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
>> if (ret)
>> return ret;
>>
>> + /* prepare interface data for debugfs */
>> + if (cpufreq_debugfs) {
>> + char name[MAX_DEBUGFS_NAME_LEN];
>> + int size, i;
>> +
>> + sprintf(name, "cpu%d", policy->cpu);
>> + size = sizeof(struct dentry*) * NR_CPUS;
>> + i = cpu;
>> +
>> + policy->cpu_debugfs = devm_kzalloc(dev, size, GFP_KERNEL);
>> + policy->cpu_debugfs[i] = debugfs_create_dir(name,
>> + cpufreq_debugfs);
>> + if (!policy->cpu_debugfs[i])
>> + pr_debug("creating debugfs directory failed\n");
>> + }
>> +
>> /* set up files for this cpu device */
>> drv_attr = cpufreq_driver->attr;
>> while ((drv_attr) && (*drv_attr)) {
>> @@ -839,6 +874,20 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
>> return ret;
>> }
>>
>> + if (cpufreq_debugfs) {
>> + char symlink_name[MAX_DEBUGFS_NAME_LEN];
>> + char target_name[MAX_DEBUGFS_NAME_LEN];
>> +
>> + sprintf(symlink_name, "cpu%d", cpu);
>> + sprintf(target_name, "./cpu%d", sibling);
>> + policy->cpu_debugfs[cpu] = debugfs_create_symlink(
>> + symlink_name,
>> + cpufreq_debugfs,
>> + target_name);
>> + if (!policy->cpu_debugfs[cpu])
>> + pr_debug("creating debugfs symlink failed\n");
>> + }
>> +
>> return 0;
>> }
>> #endif
>> @@ -1046,6 +1095,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>>
>> if (cpu != data->cpu) {
>> sysfs_remove_link(&dev->kobj, "cpufreq");
>> + debugfs_remove(data->cpu_debugfs[cpu]);
>
> I think you should add the call to remove the "cpufreq" also ???

No, we have to remove 'cpufreq' debugfs directory when the last online CPU is removed.
If condition("cpu != data->cpu") is true, it means that various CPUs share only one cpufreq policy.
When CPU[1-3] is removed, kernel execute first if statement to remove the link of CPU[1-3] debugfs directory.

> "debugfs_remove(cpufreq_debugfs);"
>> } else if (cpus > 1) {
>> /* first sibling now owns the new sysfs dir */
>> cpu_dev = get_cpu_device(cpumask_first(data->cpus));
>> @@ -1068,6 +1118,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>> return -EINVAL;
>> }
>>
>> + debugfs_remove_recursive(data->cpu_debugfs[cpu]);
>> + debugfs_remove(cpufreq_debugfs);
>> +
>> WARN_ON(lock_policy_rwsem_write(cpu));
>> update_policy_cpu(data, cpu_dev->id);
>> unlock_policy_rwsem_write(cpu);
>> @@ -1976,6 +2029,10 @@ static int __init cpufreq_core_init(void)
>> BUG_ON(!cpufreq_global_kobject);
>> register_syscore_ops(&cpufreq_syscore_ops);
>>
>> + cpufreq_debugfs = debugfs_create_dir("cpufreq", NULL);
>> + if (!cpufreq_debugfs)
>> + pr_debug("creating debugfs root failed\n");
>> +
>> return 0;
>> }
>> core_initcall(cpufreq_core_init);
>> diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
>> index 037d36a..825f379 100644
>> --- a/include/linux/cpufreq.h
>> +++ b/include/linux/cpufreq.h
>> @@ -115,6 +115,7 @@ struct cpufreq_policy {
>>
>> struct kobject kobj;
>> struct completion kobj_unregister;
>> + struct dentry **cpu_debugfs;
>> };
>>
>> #define CPUFREQ_ADJUST (0)
>> --
>> 1.8.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> Regards,
> Pankaj Jangra
> --
> To unsubscribe from this list: send the line "unsubscribe cpufreq" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

Thanks,
Chanwoo Choi

2013-07-09 06:50:05

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH 0/6] cpufreq: Add 'load_table' debugfs file to show colleced CPUs load

On 5 July 2013 14:16, Chanwoo Choi <[email protected]> wrote:
> Second, previous performance/powersave governor haven't calculated CPUs load
> becuase these governor didn't change CPU frequency according to CPUs load. But,
> load_table debugfs file always should indicate the collected CPUs data regardless
> of the kind of cpufreq governor. So, the patch3/4/5 implement that performance/
> powersave governor will check periodically CPUs load by calling dbs_check_cpu()
> with timer.

I raised a query on how can we call dbs_check_cpu() from
performance/powersave? Also, calling this routine will degrade
performance without any sense. So, I vote not for doing it.

2013-07-09 07:57:11

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 0/6] cpufreq: Add 'load_table' debugfs file to show colleced CPUs load

On 07/09/2013 03:50 PM, Viresh Kumar wrote:
> On 5 July 2013 14:16, Chanwoo Choi <[email protected]> wrote:
>> Second, previous performance/powersave governor haven't calculated CPUs load
>> becuase these governor didn't change CPU frequency according to CPUs load. But,
>> load_table debugfs file always should indicate the collected CPUs data regardless
>> of the kind of cpufreq governor. So, the patch3/4/5 implement that performance/
>> powersave governor will check periodically CPUs load by calling dbs_check_cpu()
>> with timer.
>
> I raised a query on how can we call dbs_check_cpu() from
> performance/powersave? Also, calling this routine will degrade
> performance without any sense. So, I vote not for doing it.

You're right. The performance/powersave don't usually need calling operation
of dbs_check_cpu(). Only, this patch aims at checking CPUs load on
load_table debugfs file.

I'm going to consider more efficient way than this patchset.
For example,

But, following patctes haven't the dependency about upper description about performance/powersave.
If user changes cpufreq governor from ondemand/conservative to performance/powersave,
patch2 did reset all of the data for load_table.

cpufreq: Add debugfs directory for cpufreq
cpufreq: stats: Add 'load_table' debugfs file to show accumulated data of CPUs
Documentation: cpufreq: load_table: Update load_table debugfs file documentation

So, I'd like you to review patch1,patch2, patch6. If you with that I resend patch1/2/6,
I will resend new patchset incluing in patch1/2/6.

Thanks,

Best Regards,
Chanwoo Choi

2013-07-09 08:00:10

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH 0/6] cpufreq: Add 'load_table' debugfs file to show colleced CPUs load

On 9 July 2013 13:27, Chanwoo Choi <[email protected]> wrote:
> So, I'd like you to review patch1,patch2, patch6. If you with that I resend patch1/2/6,
> I will resend new patchset incluing in patch1/2/6.

Yeah, I will do that only. Don't resend anything for now.

2013-07-09 09:23:05

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/6] cpufreq: Add debugfs directory for cpufreq

On 5 July 2013 14:16, Chanwoo Choi <[email protected]> wrote:
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c

> +/* The cpufreq_debugfs is used to create debugfs root directory for CPUFreq. */
> +#define MAX_DEBUGFS_NAME_LEN CPUFREQ_NAME_LEN

Why declare MAX_DEBUGFS_NAME_LEN if it is going to be equal to
CPUFREQ_NAME_LEN. Simply use CPUFREQ_NAME_LEN everywhere.

> +static struct dentry *cpufreq_debugfs;

Probably make this dependent on CONFIG_CPU_FREQ_STAT?

> /*
> * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
> * all cpufreq/hotplug/workqueue/etc related lock issues.
> @@ -726,6 +731,20 @@ static int cpufreq_add_dev_symlink(unsigned int cpu,
> cpufreq_cpu_put(managed_policy);
> return ret;
> }
> +
> + if (cpufreq_debugfs) {
> + char symlink_name[MAX_DEBUGFS_NAME_LEN];
> + char target_name[MAX_DEBUGFS_NAME_LEN];
> +
> + sprintf(symlink_name, "cpu%d", j);
> + sprintf(target_name, "./cpu%d", cpu);
> + managed_policy->cpu_debugfs[j] = debugfs_create_symlink(
> + symlink_name,
> + cpufreq_debugfs,
> + target_name);
> + if (!managed_policy->cpu_debugfs[j])
> + pr_debug("creating debugfs symlink failed\n");

pr_err?

> + }
> }
> return ret;
> }
> @@ -746,6 +765,22 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
> if (ret)
> return ret;
>
> + /* prepare interface data for debugfs */
> + if (cpufreq_debugfs) {
> + char name[MAX_DEBUGFS_NAME_LEN];
> + int size, i;
> +
> + sprintf(name, "cpu%d", policy->cpu);
> + size = sizeof(struct dentry*) * NR_CPUS;

NR_CPUS? You only need to take care of cpus that belong to this
policy, isn't it? policy->related_cpus should be good enough for you.

> + i = cpu;
> +
> + policy->cpu_debugfs = devm_kzalloc(dev, size, GFP_KERNEL);
> + policy->cpu_debugfs[i] = debugfs_create_dir(name,
> + cpufreq_debugfs);
> + if (!policy->cpu_debugfs[i])
> + pr_debug("creating debugfs directory failed\n");
> + }

pr_err?

And move this code just before the call to cpufreq_add_dev_symlink().

> /* set up files for this cpu device */
> drv_attr = cpufreq_driver->attr;
> while ((drv_attr) && (*drv_attr)) {
> @@ -839,6 +874,20 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
> return ret;
> }
>
> + if (cpufreq_debugfs) {
> + char symlink_name[MAX_DEBUGFS_NAME_LEN];
> + char target_name[MAX_DEBUGFS_NAME_LEN];
> +
> + sprintf(symlink_name, "cpu%d", cpu);
> + sprintf(target_name, "./cpu%d", sibling);
> + policy->cpu_debugfs[cpu] = debugfs_create_symlink(
> + symlink_name,
> + cpufreq_debugfs,
> + target_name);
> + if (!policy->cpu_debugfs[cpu])
> + pr_debug("creating debugfs symlink failed\n");
> + }

This is purely replication of same code. Create a routine to
hold these lines and call it from wherever it is required.

> return 0;
> }
> #endif
> @@ -1046,6 +1095,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>
> if (cpu != data->cpu) {
> sysfs_remove_link(&dev->kobj, "cpufreq");
> + debugfs_remove(data->cpu_debugfs[cpu]);
> } else if (cpus > 1) {
> /* first sibling now owns the new sysfs dir */
> cpu_dev = get_cpu_device(cpumask_first(data->cpus));
> @@ -1068,6 +1118,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
> return -EINVAL;
> }
>
> + debugfs_remove_recursive(data->cpu_debugfs[cpu]);

So you removed load_table here? What about other cpus that were
there in policy->cpus?

> + debugfs_remove(cpufreq_debugfs);

Who will create this again? Also, there might be multiple policy struct's
in a system and here we have reached to removal of all cpus of
a policy. Other policies are still alive.

> WARN_ON(lock_policy_rwsem_write(cpu));
> update_policy_cpu(data, cpu_dev->id);
> unlock_policy_rwsem_write(cpu);
> @@ -1976,6 +2029,10 @@ static int __init cpufreq_core_init(void)
> BUG_ON(!cpufreq_global_kobject);
> register_syscore_ops(&cpufreq_syscore_ops);
>
> + cpufreq_debugfs = debugfs_create_dir("cpufreq", NULL);
> + if (!cpufreq_debugfs)
> + pr_debug("creating debugfs root failed\n");

So, you just added this directory once.. So you must not
remove it.

> return 0;
> }
> core_initcall(cpufreq_core_init);

2013-07-10 08:30:53

by Chanwoo Choi

[permalink] [raw]
Subject: Re: [PATCH 1/6] cpufreq: Add debugfs directory for cpufreq

Hi Viresh,

On 07/09/2013 06:23 PM, Viresh Kumar wrote:
> On 5 July 2013 14:16, Chanwoo Choi <[email protected]> wrote:
>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>
>> +/* The cpufreq_debugfs is used to create debugfs root directory for CPUFreq. */
>> +#define MAX_DEBUGFS_NAME_LEN CPUFREQ_NAME_LEN
>
> Why declare MAX_DEBUGFS_NAME_LEN if it is going to be equal to
> CPUFREQ_NAME_LEN. Simply use CPUFREQ_NAME_LEN everywhere.

OK, I'll use CPUFREQ_NAME_LEN instead of defining CPUFREQ_NAME_LEN.

>
>> +static struct dentry *cpufreq_debugfs;
>
> Probably make this dependent on CONFIG_CPU_FREQ_STAT?

I thought that '/sys/kernel/debug/cpufreq' is always created in the same as sysfs file
when added cpufreq driver. Only the debugfs file(/sys/kernel/debug/cpufreq/load_table)
has the dependency on CONFIG_CPU_FREQ_STAT.

If *cpufreq_debugfs has the dependency on CONFIG_CPU_FREQ_STAT,
I should add checking statement with '#ifdef CONFIG_CPU_FREQ_STAT' keyword in cpufreq.c.

What is your opinion to add the dependency of CONFIG_CPU_FREQ_STAT
in cpufreq.c?

>
>> /*
>> * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
>> * all cpufreq/hotplug/workqueue/etc related lock issues.
>> @@ -726,6 +731,20 @@ static int cpufreq_add_dev_symlink(unsigned int cpu,
>> cpufreq_cpu_put(managed_policy);
>> return ret;
>> }
>> +
>> + if (cpufreq_debugfs) {
>> + char symlink_name[MAX_DEBUGFS_NAME_LEN];
>> + char target_name[MAX_DEBUGFS_NAME_LEN];
>> +
>> + sprintf(symlink_name, "cpu%d", j);
>> + sprintf(target_name, "./cpu%d", cpu);
>> + managed_policy->cpu_debugfs[j] = debugfs_create_symlink(
>> + symlink_name,
>> + cpufreq_debugfs,
>> + target_name);
>> + if (!managed_policy->cpu_debugfs[j])
>> + pr_debug("creating debugfs symlink failed\n");
>
> pr_err?

I'll fix it.

>
>> + }
>> }
>> return ret;
>> }
>> @@ -746,6 +765,22 @@ static int cpufreq_add_dev_interface(unsigned int cpu,
>> if (ret)
>> return ret;
>>
>> + /* prepare interface data for debugfs */
>> + if (cpufreq_debugfs) {
>> + char name[MAX_DEBUGFS_NAME_LEN];
>> + int size, i;
>> +
>> + sprintf(name, "cpu%d", policy->cpu);
>> + size = sizeof(struct dentry*) * NR_CPUS;
>
> NR_CPUS? You only need to take care of cpus that belong to this
> policy, isn't it? policy->related_cpus should be good enough for you.

You're right. I'll fix it.
I didn't think using policy->related_cpus instead of NR_CPUS.

>
>> + i = cpu;
>> +
>> + policy->cpu_debugfs = devm_kzalloc(dev, size, GFP_KERNEL);
>> + policy->cpu_debugfs[i] = debugfs_create_dir(name,
>> + cpufreq_debugfs);
>> + if (!policy->cpu_debugfs[i])
>> + pr_debug("creating debugfs directory failed\n");
>> + }
>
> pr_err?

I'll fix it.

>
> And move this code just before the call to cpufreq_add_dev_symlink().

OK, I'll move it.

>
>> /* set up files for this cpu device */
>> drv_attr = cpufreq_driver->attr;
>> while ((drv_attr) && (*drv_attr)) {
>> @@ -839,6 +874,20 @@ static int cpufreq_add_policy_cpu(unsigned int cpu, unsigned int sibling,
>> return ret;
>> }
>>
>> + if (cpufreq_debugfs) {
>> + char symlink_name[MAX_DEBUGFS_NAME_LEN];
>> + char target_name[MAX_DEBUGFS_NAME_LEN];
>> +
>> + sprintf(symlink_name, "cpu%d", cpu);
>> + sprintf(target_name, "./cpu%d", sibling);
>> + policy->cpu_debugfs[cpu] = debugfs_create_symlink(
>> + symlink_name,
>> + cpufreq_debugfs,
>> + target_name);
>> + if (!policy->cpu_debugfs[cpu])
>> + pr_debug("creating debugfs symlink failed\n");
>> + }
>
> This is purely replication of same code. Create a routine to
> hold these lines and call it from wherever it is required.

OK, I'll create a routine which create symbolic link of debugfs directory.

>
>> return 0;
>> }
>> #endif
>> @@ -1046,6 +1095,7 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>>
>> if (cpu != data->cpu) {
>> sysfs_remove_link(&dev->kobj, "cpufreq");
>> + debugfs_remove(data->cpu_debugfs[cpu]);
>> } else if (cpus > 1) {
>> /* first sibling now owns the new sysfs dir */
>> cpu_dev = get_cpu_device(cpumask_first(data->cpus));
>> @@ -1068,6 +1118,9 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
>> return -EINVAL;
>> }
>>
>> + debugfs_remove_recursive(data->cpu_debugfs[cpu]);
>
> So you removed load_table here? What about other cpus that were
> there in policy->cpus?

You're right. This code is wrong. I'll consider other way to resolve this case.

>
>> + debugfs_remove(cpufreq_debugfs);
>
> Who will create this again? Also, there might be multiple policy struct's
> in a system and here we have reached to removal of all cpus of
> a policy. Other policies are still alive.

OK, I'll control 'debugfs' directory in cpufreq_register/unregister_driver().

>
>> WARN_ON(lock_policy_rwsem_write(cpu));
>> update_policy_cpu(data, cpu_dev->id);
>> unlock_policy_rwsem_write(cpu);
>> @@ -1976,6 +2029,10 @@ static int __init cpufreq_core_init(void)
>> BUG_ON(!cpufreq_global_kobject);
>> register_syscore_ops(&cpufreq_syscore_ops);
>>
>> + cpufreq_debugfs = debugfs_create_dir("cpufreq", NULL);
>> + if (!cpufreq_debugfs)
>> + pr_debug("creating debugfs root failed\n");
>
> So, you just added this directory once.. So you must not
> remove it.

I'll add 'debugfs' directory in cpufreq_register_driver()
and remove it in cpufreq_unregister_driver().

>
>> return 0;
>> }
>> core_initcall(cpufreq_core_init);
>

Thanks for your comment.

Best Regards,
Chanwoo Choi

2013-07-15 10:02:11

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH 1/6] cpufreq: Add debugfs directory for cpufreq

On 10 July 2013 14:00, Chanwoo Choi <[email protected]> wrote:
> On 07/09/2013 06:23 PM, Viresh Kumar wrote:
>> On 5 July 2013 14:16, Chanwoo Choi <[email protected]> wrote:

>>> +static struct dentry *cpufreq_debugfs;
>>
>> Probably make this dependent on CONFIG_CPU_FREQ_STAT?
>
> I thought that '/sys/kernel/debug/cpufreq' is always created in the same as sysfs file
> when added cpufreq driver. Only the debugfs file(/sys/kernel/debug/cpufreq/load_table)
> has the dependency on CONFIG_CPU_FREQ_STAT.
>
> If *cpufreq_debugfs has the dependency on CONFIG_CPU_FREQ_STAT,
> I should add checking statement with '#ifdef CONFIG_CPU_FREQ_STAT' keyword in cpufreq.c.
>
> What is your opinion to add the dependency of CONFIG_CPU_FREQ_STAT
> in cpufreq.c?

I thought this is yet another piece of stats that we may or maynot use. And
it should be configurable if user wants it or not. So, probably you need
to have dependency on CONFIG_CPU_FREQ_STAT