2013-05-07 15:20:49

by Dirk Brandewie

[permalink] [raw]
Subject: [PATCH 0/6] Intel pstate driver update

From: dirk <[email protected]>

Collection of updates for cpufreq/intel_pstate.c

Patches 1-3 are bugfixes and marked for stable

Patches 4-6 are code cleanup with no functional changes.


Dirk Brandewie (6):
cpufreq/intel_pstate: remove idle time and duration from sample and
calculations
cpufreq/intel_pstate: use lowest requested max performance
cpufreq/intel_pstate: fix ffmpeg regression
cpufreq/intel_pstate: Remove idle mode PID
cpufreq/intel_pstate: Remove unused code
cpufreq/intel_pstate: remove #ifdef MODULE compile fence

drivers/cpufreq/intel_pstate.c | 122 +++++++----------------------------------
1 file changed, 20 insertions(+), 102 deletions(-)

--
1.7.11.7


2013-05-07 15:20:55

by Dirk Brandewie

[permalink] [raw]
Subject: [PATCH 3/6] cpufreq/intel_pstate: fix ffmpeg regression

From: Dirk Brandewie <[email protected]>

The ffmpeg benchmark in the phoronix test suite has threads on
multiple cores that rely on the progress on of threads on other cores
and ping pong back and forth fast enough to make the core appear less
busy than it "should" be. If the core has been at minimum p-state for
a while bump the pstate up to kick the core to see if it is in this
ping pong state. If the core is truly idle the p-state will be
reduced at the next sample time. If the core makes more progress it
will send more work to the thread bringing both threads out of the
ping pong scenario and the p-state will be selected normally.

This fixes a performance regression of approximately 30%

Cc: [email protected]
Signed-off-by: Dirk Brandewie <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 10 ++--------
1 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index c0c98c2..97fbac2 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -551,22 +551,16 @@ static void intel_pstate_timer_func(unsigned long __data)
struct cpudata *cpu = (struct cpudata *) __data;

intel_pstate_sample(cpu);
+ intel_pstate_adjust_busy_pstate(cpu);

- if (!cpu->idle_mode)
- intel_pstate_adjust_busy_pstate(cpu);
- else
- intel_pstate_adjust_idle_pstate(cpu);
-
-#if defined(XPERF_FIX)
if (cpu->pstate.current_pstate == cpu->pstate.min_pstate) {
cpu->min_pstate_count++;
if (!(cpu->min_pstate_count % 5)) {
intel_pstate_set_pstate(cpu, cpu->pstate.max_pstate);
- intel_pstate_idle_mode(cpu);
}
} else
cpu->min_pstate_count = 0;
-#endif
+
intel_pstate_set_sample_time(cpu);
}

--
1.7.7.6

2013-05-07 15:20:59

by Dirk Brandewie

[permalink] [raw]
Subject: [PATCH 5/6] cpufreq/intel_pstate: Remove unused code

From: Dirk Brandewie <[email protected]>

Remove call to intel_pstate_get_min_max() the values returned are not
used.

Signed-off-by: Dirk Brandewie <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index fc9b187..c7bd91f 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -579,15 +579,12 @@ static unsigned int intel_pstate_get(unsigned int cpu_num)
static int intel_pstate_set_policy(struct cpufreq_policy *policy)
{
struct cpudata *cpu;
- int min, max;

cpu = all_cpu_data[policy->cpu];

if (!policy->cpuinfo.max_freq)
return -ENODEV;

- intel_pstate_get_min_max(cpu, &min, &max);
-
limits.min_perf_pct = (policy->min * 100) / policy->cpuinfo.max_freq;
limits.min_perf_pct = clamp_t(int, limits.min_perf_pct, 0 , 100);
limits.min_perf = div_fp(int_tofp(limits.min_perf_pct), int_tofp(100));
--
1.7.7.6

2013-05-07 15:21:10

by Dirk Brandewie

[permalink] [raw]
Subject: [PATCH 6/6] cpufreq/intel_pstate: remove #ifdef MODULE compile fence

From: Dirk Brandewie <[email protected]>

The driver can no longer be built as a module remove the compile fence
around cpufreq tracing call.

Signed-off-by: Dirk Brandewie <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index c7bd91f..50a29df 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -391,9 +391,8 @@ static void intel_pstate_set_pstate(struct cpudata *cpu, int pstate)
if (pstate == cpu->pstate.current_pstate)
return;

-#ifndef MODULE
trace_cpu_frequency(pstate * 100000, cpu->cpu);
-#endif
+
cpu->pstate.current_pstate = pstate;
wrmsrl(MSR_IA32_PERF_CTL, pstate << 8);

--
1.7.7.6

2013-05-07 15:21:45

by Dirk Brandewie

[permalink] [raw]
Subject: [PATCH 4/6] cpufreq/intel_pstate: Remove idle mode PID

From: Dirk Brandewie <[email protected]>

Remove dead code from the driver.

Signed-off-by: Dirk Brandewie <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 49 ----------------------------------------
1 files changed, 0 insertions(+), 49 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 97fbac2..fc9b187 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -81,10 +81,8 @@ struct cpudata {
struct pstate_adjust_policy *pstate_policy;
struct pstate_data pstate;
struct _pid pid;
- struct _pid idle_pid;

int min_pstate_count;
- int idle_mode;

u64 prev_aperf;
u64 prev_mperf;
@@ -199,19 +197,6 @@ static inline void intel_pstate_busy_pid_reset(struct cpudata *cpu)
0);
}

-static inline void intel_pstate_idle_pid_reset(struct cpudata *cpu)
-{
- pid_p_gain_set(&cpu->idle_pid, cpu->pstate_policy->p_gain_pct);
- pid_d_gain_set(&cpu->idle_pid, cpu->pstate_policy->d_gain_pct);
- pid_i_gain_set(&cpu->idle_pid, cpu->pstate_policy->i_gain_pct);
-
- pid_reset(&cpu->idle_pid,
- 75,
- 50,
- cpu->pstate_policy->deadband,
- 0);
-}
-
static inline void intel_pstate_reset_all_pid(void)
{
unsigned int cpu;
@@ -481,16 +466,6 @@ static inline void intel_pstate_set_sample_time(struct cpudata *cpu)
mod_timer_pinned(&cpu->timer, jiffies + delay);
}

-static inline void intel_pstate_idle_mode(struct cpudata *cpu)
-{
- cpu->idle_mode = 1;
-}
-
-static inline void intel_pstate_normal_mode(struct cpudata *cpu)
-{
- cpu->idle_mode = 0;
-}
-
static inline int intel_pstate_get_scaled_busy(struct cpudata *cpu)
{
int32_t busy_scaled;
@@ -523,29 +498,6 @@ static inline void intel_pstate_adjust_busy_pstate(struct cpudata *cpu)
intel_pstate_pstate_decrease(cpu, steps);
}

-static inline void intel_pstate_adjust_idle_pstate(struct cpudata *cpu)
-{
- int busy_scaled;
- struct _pid *pid;
- int ctl = 0;
- int steps;
-
- pid = &cpu->idle_pid;
-
- busy_scaled = intel_pstate_get_scaled_busy(cpu);
-
- ctl = pid_calc(pid, 100 - busy_scaled);
-
- steps = abs(ctl);
- if (ctl < 0)
- intel_pstate_pstate_decrease(cpu, steps);
- else
- intel_pstate_pstate_increase(cpu, steps);
-
- if (cpu->pstate.current_pstate == cpu->pstate.min_pstate)
- intel_pstate_normal_mode(cpu);
-}
-
static void intel_pstate_timer_func(unsigned long __data)
{
struct cpudata *cpu = (struct cpudata *) __data;
@@ -602,7 +554,6 @@ static int intel_pstate_init_cpu(unsigned int cpunum)
(unsigned long)cpu;
cpu->timer.expires = jiffies + HZ/100;
intel_pstate_busy_pid_reset(cpu);
- intel_pstate_idle_pid_reset(cpu);
intel_pstate_sample(cpu);
intel_pstate_set_pstate(cpu, cpu->pstate.max_pstate);

--
1.7.7.6

2013-05-07 15:20:52

by Dirk Brandewie

[permalink] [raw]
Subject: [PATCH 1/6] cpufreq/intel_pstate: remove idle time and duration from sample and calculations

From: Dirk Brandewie <[email protected]>

Idle time is taken into account in the APERF/MPERF ratio calculation
there is no reason for the driver to track it seperately. This
reduces the work in the driver and makes the code more readable.

Removal of the tracking of sample duration removes the possibility of
the divide by zero exception when the duration is sub 1us

https://bugzilla.kernel.org/show_bug.cgi?id=56691

Reported-by: Mike Lothian <[email protected]>
Cc: [email protected]
Signed-off-by: Dirk Brandewie <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 45 +++++++--------------------------------
1 files changed, 8 insertions(+), 37 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 4d6b988..5370b8d 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -48,12 +48,7 @@ static inline int32_t div_fp(int32_t x, int32_t y)
}

struct sample {
- ktime_t start_time;
- ktime_t end_time;
int core_pct_busy;
- int pstate_pct_busy;
- u64 duration_us;
- u64 idletime_us;
u64 aperf;
u64 mperf;
int freq;
@@ -91,8 +86,6 @@ struct cpudata {
int min_pstate_count;
int idle_mode;

- ktime_t prev_sample;
- u64 prev_idle_time_us;
u64 prev_aperf;
u64 prev_mperf;
int sample_ptr;
@@ -450,48 +443,26 @@ static inline void intel_pstate_calc_busy(struct cpudata *cpu,
struct sample *sample)
{
u64 core_pct;
- sample->pstate_pct_busy = 100 - div64_u64(
- sample->idletime_us * 100,
- sample->duration_us);
core_pct = div64_u64(sample->aperf * 100, sample->mperf);
sample->freq = cpu->pstate.max_pstate * core_pct * 1000;

- sample->core_pct_busy = div_s64((sample->pstate_pct_busy * core_pct),
- 100);
+ sample->core_pct_busy = core_pct;
}

static inline void intel_pstate_sample(struct cpudata *cpu)
{
- ktime_t now;
- u64 idle_time_us;
u64 aperf, mperf;

- now = ktime_get();
- idle_time_us = get_cpu_idle_time_us(cpu->cpu, NULL);
-
rdmsrl(MSR_IA32_APERF, aperf);
rdmsrl(MSR_IA32_MPERF, mperf);
- /* for the first sample, don't actually record a sample, just
- * set the baseline */
- if (cpu->prev_idle_time_us > 0) {
- cpu->sample_ptr = (cpu->sample_ptr + 1) % SAMPLE_COUNT;
- cpu->samples[cpu->sample_ptr].start_time = cpu->prev_sample;
- cpu->samples[cpu->sample_ptr].end_time = now;
- cpu->samples[cpu->sample_ptr].duration_us =
- ktime_us_delta(now, cpu->prev_sample);
- cpu->samples[cpu->sample_ptr].idletime_us =
- idle_time_us - cpu->prev_idle_time_us;
-
- cpu->samples[cpu->sample_ptr].aperf = aperf;
- cpu->samples[cpu->sample_ptr].mperf = mperf;
- cpu->samples[cpu->sample_ptr].aperf -= cpu->prev_aperf;
- cpu->samples[cpu->sample_ptr].mperf -= cpu->prev_mperf;
-
- intel_pstate_calc_busy(cpu, &cpu->samples[cpu->sample_ptr]);
- }
+ cpu->sample_ptr = (cpu->sample_ptr + 1) % SAMPLE_COUNT;
+ cpu->samples[cpu->sample_ptr].aperf = aperf;
+ cpu->samples[cpu->sample_ptr].mperf = mperf;
+ cpu->samples[cpu->sample_ptr].aperf -= cpu->prev_aperf;
+ cpu->samples[cpu->sample_ptr].mperf -= cpu->prev_mperf;
+
+ intel_pstate_calc_busy(cpu, &cpu->samples[cpu->sample_ptr]);

- cpu->prev_sample = now;
- cpu->prev_idle_time_us = idle_time_us;
cpu->prev_aperf = aperf;
cpu->prev_mperf = mperf;
}
--
1.7.7.6

2013-05-07 15:22:14

by Dirk Brandewie

[permalink] [raw]
Subject: [PATCH 2/6] cpufreq/intel_pstate: use lowest requested max performance

From: Dirk Brandewie <[email protected]>

There are two ways that the maximum p-state can be clamped, via a
policy change and via the sysfs file.

The acpi-thermal driver adjusts the p-state policy in response to
thermal events. These changes override the users settings ATM.

Use the lowest of the two requested values this ensures that we will
not exceed the requested pstate from either mechanism.

Reported-by: Srinivas Pandruvada <[email protected]>
Cc: [email protected]
Signed-off-by: Dirk Brandewie <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 12 +++++++++---
1 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 5370b8d..c0c98c2 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -117,6 +117,8 @@ struct perf_limits {
int min_perf_pct;
int32_t max_perf;
int32_t min_perf;
+ int max_policy_pct;
+ int max_sysfs_pct;
};

static struct perf_limits limits = {
@@ -125,6 +127,8 @@ static struct perf_limits limits = {
.max_perf = int_tofp(1),
.min_perf_pct = 0,
.min_perf = 0,
+ .max_policy_pct = 100,
+ .max_sysfs_pct = 100,
};

static inline void pid_reset(struct _pid *pid, int setpoint, int busy,
@@ -295,7 +299,8 @@ static ssize_t store_max_perf_pct(struct kobject *a, struct attribute *b,
if (ret != 1)
return -EINVAL;

- limits.max_perf_pct = clamp_t(int, input, 0 , 100);
+ limits.max_sysfs_pct = clamp_t(int, input, 0 , 100);
+ limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct);
limits.max_perf = div_fp(int_tofp(limits.max_perf_pct), int_tofp(100));
return count;
}
@@ -642,8 +647,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
limits.min_perf_pct = clamp_t(int, limits.min_perf_pct, 0 , 100);
limits.min_perf = div_fp(int_tofp(limits.min_perf_pct), int_tofp(100));

- limits.max_perf_pct = policy->max * 100 / policy->cpuinfo.max_freq;
- limits.max_perf_pct = clamp_t(int, limits.max_perf_pct, 0 , 100);
+ limits.max_policy_pct = policy->max * 100 / policy->cpuinfo.max_freq;
+ limits.max_policy_pct = clamp_t(int, limits.max_policy_pct, 0 , 100);
+ limits.max_perf_pct = min(limits.max_policy_pct, limits.max_sysfs_pct);
limits.max_perf = div_fp(int_tofp(limits.max_perf_pct), int_tofp(100));

if (policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
--
1.7.7.6

2013-05-07 23:58:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 0/6] Intel pstate driver update

On Tuesday, May 07, 2013 08:20:24 AM [email protected] wrote:
> From: dirk <[email protected]>
>
> Collection of updates for cpufreq/intel_pstate.c
>
> Patches 1-3 are bugfixes and marked for stable
>
> Patches 4-6 are code cleanup with no functional changes.
>
>
> Dirk Brandewie (6):
> cpufreq/intel_pstate: remove idle time and duration from sample and
> calculations
> cpufreq/intel_pstate: use lowest requested max performance
> cpufreq/intel_pstate: fix ffmpeg regression
> cpufreq/intel_pstate: Remove idle mode PID
> cpufreq/intel_pstate: Remove unused code
> cpufreq/intel_pstate: remove #ifdef MODULE compile fence
>
> drivers/cpufreq/intel_pstate.c | 122 +++++++----------------------------------
> 1 file changed, 20 insertions(+), 102 deletions(-)

All queued up for a post-3.10-rc1 push as 3.10 material, but I have a couple
of comments.

First, the patches didn't apply for me cleanly. I needed to fix up one of
them manually to make it apply and patch [5/6] didn't appear to be necessary
at all (it made changes that had been made previously). Please check the
bleeding-edge branch of my tree to see if the code is what you wanted and
let me know (either way).

Second, can you please CC your cpufreq submissions to [email protected]?
That will allow me to use Patchwork for managing them, which is much more
convenient than plain email.

Thanks,
Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2013-05-08 00:22:05

by Dirk Brandewie

[permalink] [raw]
Subject: Re: [PATCH 0/6] Intel pstate driver update

On 05/07/2013 05:07 PM, Rafael J. Wysocki wrote:
> On Tuesday, May 07, 2013 08:20:24 AM [email protected] wrote:
>> From: dirk <[email protected]>
>>
> All queued up for a post-3.10-rc1 push as 3.10 material, but I have a couple
> of comments.
>
> First, the patches didn't apply for me cleanly. I needed to fix up one of
> them manually to make it apply and patch [5/6] didn't appear to be necessary
> at all (it made changes that had been made previously). Please check the
> bleeding-edge branch of my tree to see if the code is what you wanted and
> let me know (either way).
>
Srinivas's commit d1b6848 collided with my 5/6 patch. my patches were based off
of v3.9. Which branch of yours should I base my submissions on?

linux-pm/bleeding-edge is correct.

> Second, can you please CC your cpufreq submissions to [email protected]?
> That will allow me to use Patchwork for managing them, which is much more
> convenient than plain email.
>

No problem

--Dirk

> Thanks,
> Rafael
>
>

2013-05-08 11:47:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 0/6] Intel pstate driver update

On Tuesday, May 07, 2013 05:21:59 PM Dirk Brandewie wrote:
> On 05/07/2013 05:07 PM, Rafael J. Wysocki wrote:
> > On Tuesday, May 07, 2013 08:20:24 AM [email protected] wrote:
> >> From: dirk <[email protected]>
> >>
> > All queued up for a post-3.10-rc1 push as 3.10 material, but I have a couple
> > of comments.
> >
> > First, the patches didn't apply for me cleanly. I needed to fix up one of
> > them manually to make it apply and patch [5/6] didn't appear to be necessary
> > at all (it made changes that had been made previously). Please check the
> > bleeding-edge branch of my tree to see if the code is what you wanted and
> > let me know (either way).
> >
> Srinivas's commit d1b6848 collided with my 5/6 patch. my patches were based off
> of v3.9. Which branch of yours should I base my submissions on?

Usually, linux-next is a safe bet.

> linux-pm/bleeding-edge is correct.

Good.

> > Second, can you please CC your cpufreq submissions to [email protected]?
> > That will allow me to use Patchwork for managing them, which is much more
> > convenient than plain email.
> >
>
> No problem

Thanks!

Rafael


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.