2017-07-24 05:43:37

by Huaisheng HS1 Ye

[permalink] [raw]
Subject: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

After commit 82b4e03e01bc (intel_pstate: skip scheduler hook when
in "performance" mode) Software P-state control modes couldn't get
dynamic value during performance mode, and it still in last value from
powersave mode, so clear its value to get same behavior as Hardware
P-state to avoid confusion.

Signed-off-by: Huaisheng Ye <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 6cd5035..c675626 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -2050,6 +2050,7 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
*/
intel_pstate_clear_update_util_hook(policy->cpu);
intel_pstate_max_within_limits(cpu);
+ cpu->sample.core_avg_perf = 0;
} else {
intel_pstate_set_update_util_hook(policy->cpu);
}
--
1.8.3.1


2017-07-24 11:44:20

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

On Monday, July 24, 2017 05:43:14 AM Huaisheng HS1 Ye wrote:
> After commit 82b4e03e01bc (intel_pstate: skip scheduler hook when
> in "performance" mode) Software P-state control modes couldn't get
> dynamic value during performance mode,

Please explain what you mean here.

I guess you carried out some tests and the results were not as expected,
so what was the test?

> and it still in last value from powersave mode, so clear its value to get
> same behavior as Hardware P-state to avoid confusion.

And please explain why it should be fixed the way you've done that.

> Signed-off-by: Huaisheng Ye <[email protected]>
> ---
> drivers/cpufreq/intel_pstate.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 6cd5035..c675626 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -2050,6 +2050,7 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
> */
> intel_pstate_clear_update_util_hook(policy->cpu);
> intel_pstate_max_within_limits(cpu);
> + cpu->sample.core_avg_perf = 0;
> } else {
> intel_pstate_set_update_util_hook(policy->cpu);
> }
>

2017-07-24 15:33:09

by Huaisheng HS1 Ye

[permalink] [raw]
Subject: RE: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

Hi Rafael,
Thanks for your reply.

> On Monday, July 24, 2017 05:43:14 AM Huaisheng HS1 Ye wrote:
> > After commit 82b4e03e01bc (intel_pstate: skip scheduler hook when in
> > "performance" mode) Software P-state control modes couldn't get
> > dynamic value during performance mode,
>
> Please explain what you mean here.
>
commit 82b4e03e01bc (intel_pstate: skip scheduler hook when in
"performance" mode) disables intel_pstate_set_update_util_hook when current
policy is performance within function intel_pstate_set_policy. It leads to
Software P-states couldn't update sysfs interface cpuinfo_cur_freq's value
during performance mode, because of pstate_funcs.update_util couldn't set
for the given CPU.

> I guess you carried out some tests and the results were not as expected, so
> what was the test?
Exactly, we check the sysfs interface cpuinfo_cur_freq and the output of
cpupower frequency-info both with performance mode.

For example, we list logic CPU 63's sysfs interface and output of cpupower with performance
mode below,
analyzing CPU 63:
driver: intel_pstate
CPUs which run at the same hardware frequency: 63
CPUs which need to have their frequency coordinated by software: 63
maximum transition latency: Cannot determine or is not supported.
hardware limits: 1000 MHz - 3.70 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 1000 MHz and 3.70 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency: 1.07 GHz (asserted by call to hardware) <--------cpuinfo_cur_freq
boost state support:
Supported: yes
Active: yes

[root@localhost cpufreq]# pwd
/sys/devices/system/cpu/cpu63/cpufreq
[root@localhost cpufreq]# cat cpuinfo_cur_freq scaling_cur_freq
1070379
2796179

The value of cpuinfo_cur_freq is static always during performance mode, because of
"cpu->sample.core_avg_perf" doesn't have chance to be updated when driver leaves
from powersave mode. Its value equals to last sample result which comes from function
intel_pstate_calc_avg_perf.
commit f8475cef (x86: use common aperfmperf_khz_on_cpu() to calculate KHz using
APERF/MPERF) offers a great method to help scaling_cur_freq gets accurate frequency.
And for some tools like cpupower which would try to get cpuinfo_cur_freq at first to show
current frequency, if it fails then it would move to scaling_cur_freq which represents the
real value.

>
> > and it still in last value from powersave mode, so clear its value to
> > get same behavior as Hardware P-state to avoid confusion.
>
> And please explain why it should be fixed the way you've done that.
>

In order to make sure that cpuinfo_cur_freq couldn't get an effective value,
we clean up "sample.core_avg_perf" when cpu's policy set to performance mode.
so the reading of sysfs cpuinfo_cur_freq would get "<unknown>" always within
performance mode, which is same with the status when HWP has been enabled.

With this patch, logic CPU 63's sysfs interface and output of cpupower with performance
mode would be like this,

analyzing CPU 63:
driver: intel_pstate
CPUs which run at the same hardware frequency: 63
CPUs which need to have their frequency coordinated by software: 63
maximum transition latency: Cannot determine or is not supported.
hardware limits: 1000 MHz - 3.70 GHz
available cpufreq governors: performance powersave
current policy: frequency should be within 1000 MHz and 3.70 GHz.
The governor "performance" may decide which speed to use
within this range.
current CPU frequency: Unable to call hardware <--------cpuinfo_cur_freq
current CPU frequency: 2.80 GHz (asserted by call to kernel) <--------scaling_cur_freq
boost state support:
Supported: yes
Active: yes

[root@localhost cpufreq]# pwd
/sys/devices/system/cpu/cpu63/cpufreq
[root@localhost cpufreq]# cat cpuinfo_cur_freq scaling_cur_freq
<unknown>
2800000

> > Signed-off-by: Huaisheng Ye <[email protected]>
> > ---
> > drivers/cpufreq/intel_pstate.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/cpufreq/intel_pstate.c
> > b/drivers/cpufreq/intel_pstate.c index 6cd5035..c675626 100644
> > --- a/drivers/cpufreq/intel_pstate.c
> > +++ b/drivers/cpufreq/intel_pstate.c
> > @@ -2050,6 +2050,7 @@ static int intel_pstate_set_policy(struct
> cpufreq_policy *policy)
> > */
> > intel_pstate_clear_update_util_hook(policy->cpu);
> > intel_pstate_max_within_limits(cpu);
> > + cpu->sample.core_avg_perf = 0;
> > } else {
> > intel_pstate_set_update_util_hook(policy->cpu);
> > }
> >


2017-07-24 23:59:48

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

On Monday, July 24, 2017 03:32:47 PM Huaisheng HS1 Ye wrote:
> Hi Rafael,
> Thanks for your reply.
>
> > On Monday, July 24, 2017 05:43:14 AM Huaisheng HS1 Ye wrote:
> > > After commit 82b4e03e01bc (intel_pstate: skip scheduler hook when in
> > > "performance" mode) Software P-state control modes couldn't get
> > > dynamic value during performance mode,
> >
> > Please explain what you mean here.
> >
> commit 82b4e03e01bc (intel_pstate: skip scheduler hook when in
> "performance" mode) disables intel_pstate_set_update_util_hook when current
> policy is performance within function intel_pstate_set_policy. It leads to
> Software P-states couldn't update sysfs interface cpuinfo_cur_freq's value
> during performance mode, because of pstate_funcs.update_util couldn't set
> for the given CPU.
>
> > I guess you carried out some tests and the results were not as expected, so
> > what was the test?
> Exactly, we check the sysfs interface cpuinfo_cur_freq and the output of
> cpupower frequency-info both with performance mode.

OK, so what about the change below:

---
drivers/cpufreq/intel_pstate.c | 8 --------
1 file changed, 8 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -1674,13 +1674,6 @@ static int intel_pstate_init_cpu(unsigne
return 0;
}

-static unsigned int intel_pstate_get(unsigned int cpu_num)
-{
- struct cpudata *cpu = all_cpu_data[cpu_num];
-
- return cpu ? get_avg_frequency(cpu) : 0;
-}
-
static void intel_pstate_set_update_util_hook(unsigned int cpu_num)
{
struct cpudata *cpu = all_cpu_data[cpu_num];
@@ -1921,7 +1914,6 @@ static struct cpufreq_driver intel_pstat
.setpolicy = intel_pstate_set_policy,
.suspend = intel_pstate_hwp_save_state,
.resume = intel_pstate_resume,
- .get = intel_pstate_get,
.init = intel_pstate_cpu_init,
.exit = intel_pstate_cpu_exit,
.stop_cpu = intel_pstate_stop_cpu,

2017-07-25 01:46:48

by Huaisheng HS1 Ye

[permalink] [raw]
Subject: RE: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

Hi Rafael,

If you delete "get" function implement within intel_pstate, the
sysfs interface cpuinfo_cur_freq will display <unknown> all the time.
To be honest, at the beginning I have consider this way like you
patched, but based two reasons below, it is conservative for us to do that.

1. I am worried about whether it would lead to confusion for customers or
Linux OS venders who are accustomed to cpuinfo_cur_freq.
2. This is the first time for me to offer patch to intel_pstate, not sure
whether it could be accepted by you.

> On Monday, July 24, 2017 03:32:47 PM Huaisheng HS1 Ye wrote:
> > Hi Rafael,
> > Thanks for your reply.
> >
> > > On Monday, July 24, 2017 05:43:14 AM Huaisheng HS1 Ye wrote:
> > > > After commit 82b4e03e01bc (intel_pstate: skip scheduler hook when
> > > > in "performance" mode) Software P-state control modes couldn't get
> > > > dynamic value during performance mode,
> > >
> > > Please explain what you mean here.
> > >
> > commit 82b4e03e01bc (intel_pstate: skip scheduler hook when in
> > "performance" mode) disables intel_pstate_set_update_util_hook when
> > current policy is performance within function intel_pstate_set_policy.
> > It leads to Software P-states couldn't update sysfs interface
> > cpuinfo_cur_freq's value during performance mode, because of
> > pstate_funcs.update_util couldn't set for the given CPU.
> >
> > > I guess you carried out some tests and the results were not as
> > > expected, so what was the test?
> > Exactly, we check the sysfs interface cpuinfo_cur_freq and the output
> > of cpupower frequency-info both with performance mode.
>
> OK, so what about the change below:
>
> ---
> drivers/cpufreq/intel_pstate.c | 8 --------
> 1 file changed, 8 deletions(-)
>
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ==============================================================
> =====
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -1674,13 +1674,6 @@ static int intel_pstate_init_cpu(unsigne
> return 0;
> }
>
> -static unsigned int intel_pstate_get(unsigned int cpu_num) -{
> - struct cpudata *cpu = all_cpu_data[cpu_num];
> -
> - return cpu ? get_avg_frequency(cpu) : 0;
> -}
> -
> static void intel_pstate_set_update_util_hook(unsigned int cpu_num) {
> struct cpudata *cpu = all_cpu_data[cpu_num]; @@ -1921,7 +1914,6 @@
> static struct cpufreq_driver intel_pstat
> .setpolicy = intel_pstate_set_policy,
> .suspend = intel_pstate_hwp_save_state,
> .resume = intel_pstate_resume,
> - .get = intel_pstate_get,
> .init = intel_pstate_cpu_init,
> .exit = intel_pstate_cpu_exit,
> .stop_cpu = intel_pstate_stop_cpu,


2017-07-25 02:57:58

by srinivas pandruvada

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

On Tue, 2017-07-25 at 01:46 +0000, Huaisheng HS1 Ye wrote:
> Hi Rafael,
>
> If you delete "get" function implement within intel_pstate, the 
> sysfs interface cpuinfo_cur_freq will display <unknown> all the
> time. 
cpuinfo_cur_freq by definition should show actual frequency HW
frequency. Unless I missed something. So Len Brown's patch should also
take care of this to get from arch specific function is available.
So in addition to Rafael's change, what about this?


diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 9bf97a3..29ec687 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -689,8 +689,13 @@ store_one(scaling_max_freq, max);
 static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
                                        char *buf)
 {
-       unsigned int cur_freq = __cpufreq_get(policy);
+       unsigned int cur_freq;
 
+       cur_freq = arch_freq_get_on_cpu(policy->cpu);
+       if (cur_freq)
+               return sprintf(buf, "%u\n", cur_freq);
+
+       cur_freq = __cpufreq_get(policy);
        if (cur_freq)
                return sprintf(buf, "%u\n", cur_freq);



Thanks,
Srinivas

> To be honest, at the beginning I have consider this way like you 
> patched, but based two reasons below, it is conservative for us to do
> that.
>
> 1. I am worried about whether it would lead to confusion for
> customers or 
> Linux OS venders who are accustomed to cpuinfo_cur_freq.
> 2. This is the first time for me to offer patch to intel_pstate, not
> sure 
> whether it could be accepted by you.
>
> >
> > On Monday, July 24, 2017 03:32:47 PM Huaisheng HS1 Ye wrote:
> > >
> > > Hi Rafael,
> > > Thanks for your reply.
> > >
> > > >
> > > > On Monday, July 24, 2017 05:43:14 AM Huaisheng HS1 Ye wrote:
> > > > >
> > > > > After commit 82b4e03e01bc (intel_pstate: skip scheduler hook
> > > > > when
> > > > > in "performance" mode) Software P-state control modes
> > > > > couldn't get
> > > > > dynamic value during performance mode,
> > > > Please explain what you mean here.
> > > >
> > > commit 82b4e03e01bc (intel_pstate: skip scheduler hook when in
> > > "performance" mode) disables intel_pstate_set_update_util_hook
> > > when
> > > current policy is performance within function
> > > intel_pstate_set_policy.
> > > It leads to Software P-states couldn't update sysfs interface
> > > cpuinfo_cur_freq's value during performance mode, because of
> > > pstate_funcs.update_util couldn't set for the given CPU.
> > >
> > > >
> > > > I guess you carried out some tests and the results were not as
> > > > expected, so what was the test?
> > > Exactly, we check the sysfs interface cpuinfo_cur_freq and the
> > > output
> > > of cpupower frequency-info both with performance mode.
> > OK, so what about the change below:
> >
> > ---
> >  drivers/cpufreq/intel_pstate.c |    8 --------
> >  1 file changed, 8 deletions(-)
> >
> > Index: linux-pm/drivers/cpufreq/intel_pstate.c
> > ==============================================================
> > =====
> > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> > +++ linux-pm/drivers/cpufreq/intel_pstate.c
> > @@ -1674,13 +1674,6 @@ static int intel_pstate_init_cpu(unsigne
> >   return 0;
> >  }
> >
> > -static unsigned int intel_pstate_get(unsigned int cpu_num) -{
> > - struct cpudata *cpu = all_cpu_data[cpu_num];
> > -
> > - return cpu ? get_avg_frequency(cpu) : 0;
> > -}
> > -
> >  static void intel_pstate_set_update_util_hook(unsigned int
> > cpu_num)  {
> >   struct cpudata *cpu = all_cpu_data[cpu_num]; @@ -1921,7
> > +1914,6 @@
> > static struct cpufreq_driver intel_pstat
> >   .setpolicy = intel_pstate_set_policy,
> >   .suspend = intel_pstate_hwp_save_state,
> >   .resume = intel_pstate_resume,
> > - .get = intel_pstate_get,
> >   .init = intel_pstate_cpu_init,
> >   .exit = intel_pstate_cpu_exit,
> >   .stop_cpu = intel_pstate_stop_cpu,

2017-07-25 07:04:10

by Huaisheng HS1 Ye

[permalink] [raw]
Subject: RE: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

Hi Srinivas,
Your idea is great, but your patch at cpufreq.c will force all platforms to use scaling_cur_freq as first choice when userspace wants to access cpuinfo_cur_freq. It is ok for intel x86 platfrom but hard to say with other platforms.
I modified it like that, it looks more reasonable. How about that?

Hi Rafael,
Deleting "get" function pointer within intel_pstate would lead to sysfs interface cpuinfo_cur_freq disappearing, because of cpufreq_add_dev_interface will check "cpufreq_driver->get" for it.
Perhaps just return 0 with in intel_pstate_get would be a workaround for this issue, how about it?

I have tested this patch based on Purley platform, both Hardware and Software P-states works correct, we could get accurate and same frequency from cpuinfo_cur_freq and scaling_cur_freq.

drivers/cpufreq/cpufreq.c | 4 ++++
drivers/cpufreq/intel_pstate.c | 8 +++++---
2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 9bf97a3..922f9d9 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -694,6 +694,10 @@ static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
if (cur_freq)
return sprintf(buf, "%u\n", cur_freq);

+ cur_freq = arch_freq_get_on_cpu(policy->cpu);
+ if (cur_freq)
+ return sprintf(buf, "%u\n", cur_freq);
+
return sprintf(buf, "<unknown>\n");
}

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 6cd5035..33e6c10 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1924,9 +1924,11 @@ static int intel_pstate_init_cpu(unsigned int cpunum)

static unsigned int intel_pstate_get(unsigned int cpu_num)
{
- struct cpudata *cpu = all_cpu_data[cpu_num];
-
- return cpu ? get_avg_frequency(cpu) : 0;
+ /*
+ * Use frequency from scaling_cur_freq, reserve this function
+ * for existing of sysfs cpuinfo_cur_freq.
+ */
+ return 0;
}

static void intel_pstate_set_update_util_hook(unsigned int cpu_num)


> On Tue, 2017-07-25 at 01:46 +0000, Huaisheng HS1 Ye wrote:
> > Hi Rafael,
> >
> > If you delete "get" function implement within intel_pstate, the sysfs
> > interface cpuinfo_cur_freq will display <unknown> all the time.
> cpuinfo_cur_freq by definition should show actual frequency HW frequency.
> Unless I missed something. So Len Brown's patch should also take care of this
> to get from arch specific function is available.
> So in addition to Rafael's change, what about this?
>
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index
> 9bf97a3..29ec687 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -689,8 +689,13 @@ store_one(scaling_max_freq, max);
>  static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
>                                         char *buf)
>  {
> -       unsigned int cur_freq = __cpufreq_get(policy);
> +       unsigned int cur_freq;
>
> +       cur_freq = arch_freq_get_on_cpu(policy->cpu);
> +       if (cur_freq)
> +               return sprintf(buf, "%u\n", cur_freq);
> +
> +       cur_freq = __cpufreq_get(policy);
>         if (cur_freq)
>                 return sprintf(buf, "%u\n", cur_freq);
>
>
>
> Thanks,
> Srinivas
>
> > To be honest, at the beginning I have consider this way like you
> > patched, but based two reasons below, it is conservative for us to do
> > that.
> >
> > 1. I am worried about whether it would lead to confusion for customers
> > or Linux OS venders who are accustomed to cpuinfo_cur_freq.
> > 2. This is the first time for me to offer patch to intel_pstate, not
> > sure whether it could be accepted by you.
> >
> > >
> > > On Monday, July 24, 2017 03:32:47 PM Huaisheng HS1 Ye wrote:
> > > >
> > > > Hi Rafael,
> > > > Thanks for your reply.
> > > >
> > > > >
> > > > > On Monday, July 24, 2017 05:43:14 AM Huaisheng HS1 Ye wrote:
> > > > > >
> > > > > > After commit 82b4e03e01bc (intel_pstate: skip scheduler hook
> > > > > > when in "performance" mode) Software P-state control modes
> > > > > > couldn't get dynamic value during performance mode,
> > > > > Please explain what you mean here.
> > > > >
> > > > commit 82b4e03e01bc (intel_pstate: skip scheduler hook when in
> > > > "performance" mode) disables intel_pstate_set_update_util_hook
> > > > when current policy is performance within function
> > > > intel_pstate_set_policy.
> > > > It leads to Software P-states couldn't update sysfs interface
> > > > cpuinfo_cur_freq's value during performance mode, because of
> > > > pstate_funcs.update_util couldn't set for the given CPU.
> > > >
> > > > >
> > > > > I guess you carried out some tests and the results were not as
> > > > > expected, so what was the test?
> > > > Exactly, we check the sysfs interface cpuinfo_cur_freq and the
> > > > output of cpupower frequency-info both with performance mode.
> > > OK, so what about the change below:
> > >
> > > ---
> > >  drivers/cpufreq/intel_pstate.c |    8 --------
> > >  1 file changed, 8 deletions(-)
> > >
> > > Index: linux-pm/drivers/cpufreq/intel_pstate.c
> > >
> ==============================================================
> > > =====
> > > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> > > +++ linux-pm/drivers/cpufreq/intel_pstate.c
> > > @@ -1674,13 +1674,6 @@ static int intel_pstate_init_cpu(unsigne
> > >   return 0;
> > >  }
> > >
> > > -static unsigned int intel_pstate_get(unsigned int cpu_num) -{
> > > - struct cpudata *cpu = all_cpu_data[cpu_num];
> > > -
> > > - return cpu ? get_avg_frequency(cpu) : 0;
> > > -}
> > > -
> > >  static void intel_pstate_set_update_util_hook(unsigned int
> > > cpu_num)  {
> > >   struct cpudata *cpu = all_cpu_data[cpu_num]; @@ -1921,7
> > > +1914,6 @@
> > > static struct cpufreq_driver intel_pstat
> > >   .setpolicy = intel_pstate_set_policy,
> > >   .suspend = intel_pstate_hwp_save_state,
> > >   .resume = intel_pstate_resume,
> > > - .get = intel_pstate_get,
> > >   .init = intel_pstate_cpu_init,
> > >   .exit = intel_pstate_cpu_exit,
> > >   .stop_cpu = intel_pstate_stop_cpu,

2017-07-25 14:37:47

by srinivas pandruvada

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

Hi Huaisheng,

On Tue, 2017-07-25 at 07:03 +0000, Huaisheng HS1 Ye wrote:
> Hi Srinivas,
> Your idea is great, but your patch at cpufreq.c will force all
> platforms to use scaling_cur_freq as first choice when userspace
> wants to access cpuinfo_cur_freq. It is ok for intel x86 platfrom but
> hard to say with other platforms.
arch_freq_get_on_cpu is only implemented on x86, for other platforms it
will not change behavior. I didn't understand your comment about first
choice.

Thanks,
Srinivas


> I modified it like that, it looks more reasonable. How about that?
>
> Hi Rafael,
> Deleting "get" function pointer within intel_pstate would lead to
> sysfs interface cpuinfo_cur_freq disappearing, because of
> cpufreq_add_dev_interface will check "cpufreq_driver->get" for it.
> Perhaps just return 0 with in intel_pstate_get would be a workaround
> for this issue, how about it?
>
> I have tested this patch based on Purley platform, both Hardware and
> Software P-states works correct, we could get accurate and same
> frequency from cpuinfo_cur_freq and scaling_cur_freq.
>
>  drivers/cpufreq/cpufreq.c      | 4 ++++
>  drivers/cpufreq/intel_pstate.c | 8 +++++---
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 9bf97a3..922f9d9 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -694,6 +694,10 @@ static ssize_t show_cpuinfo_cur_freq(struct
> cpufreq_policy *policy,
>   if (cur_freq)
>   return sprintf(buf, "%u\n", cur_freq);
>  
> + cur_freq = arch_freq_get_on_cpu(policy->cpu);
> + if (cur_freq)
> + return sprintf(buf, "%u\n", cur_freq);
> +
>   return sprintf(buf, "<unknown>\n");
>  }
>  
> diff --git a/drivers/cpufreq/intel_pstate.c
> b/drivers/cpufreq/intel_pstate.c
> index 6cd5035..33e6c10 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -1924,9 +1924,11 @@ static int intel_pstate_init_cpu(unsigned int
> cpunum)
>  
>  static unsigned int intel_pstate_get(unsigned int cpu_num)
>  {
> - struct cpudata *cpu = all_cpu_data[cpu_num];
> -
> - return cpu ? get_avg_frequency(cpu) : 0;
> + /*
> +  * Use frequency from scaling_cur_freq, reserve this
> function
> +  * for existing of sysfs cpuinfo_cur_freq.
> +  */
> + return 0;
>  }
>  
>  static void intel_pstate_set_update_util_hook(unsigned int cpu_num)
>
>
> >
> > On Tue, 2017-07-25 at 01:46 +0000, Huaisheng HS1 Ye wrote:
> > >
> > > Hi Rafael,
> > >
> > > If you delete "get" function implement within intel_pstate, the
> > > sysfs
> > > interface cpuinfo_cur_freq will display <unknown> all the time.
> > cpuinfo_cur_freq by definition should show actual frequency HW
> > frequency.
> > Unless I missed something. So Len Brown's patch should also take
> > care of this
> > to get from arch specific function is available.
> > So in addition to Rafael's change, what about this?
> >
> >
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index
> > 9bf97a3..29ec687 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -689,8 +689,13 @@ store_one(scaling_max_freq, max);
> >  static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy
> > *policy,
> >                                         char *buf)
> >  {
> > -       unsigned int cur_freq = __cpufreq_get(policy);
> > +       unsigned int cur_freq;
> >
> > +       cur_freq = arch_freq_get_on_cpu(policy->cpu);
> > +       if (cur_freq)
> > +               return sprintf(buf, "%u\n", cur_freq);
> > +
> > +       cur_freq = __cpufreq_get(policy);
> >         if (cur_freq)
> >                 return sprintf(buf, "%u\n", cur_freq);
> >
> >
> >
> > Thanks,
> > Srinivas
> >
> > >
> > > To be honest, at the beginning I have consider this way like you
> > > patched, but based two reasons below, it is conservative for us
> > > to do
> > > that.
> > >
> > > 1. I am worried about whether it would lead to confusion for
> > > customers
> > > or Linux OS venders who are accustomed to cpuinfo_cur_freq.
> > > 2. This is the first time for me to offer patch to intel_pstate,
> > > not
> > > sure whether it could be accepted by you.
> > >
> > > >
> > > >
> > > > On Monday, July 24, 2017 03:32:47 PM Huaisheng HS1 Ye wrote:
> > > > >
> > > > >
> > > > > Hi Rafael,
> > > > > Thanks for your reply.
> > > > >
> > > > > >
> > > > > >
> > > > > > On Monday, July 24, 2017 05:43:14 AM Huaisheng HS1 Ye
> > > > > > wrote:
> > > > > > >
> > > > > > >
> > > > > > > After commit 82b4e03e01bc (intel_pstate: skip scheduler
> > > > > > > hook
> > > > > > > when in "performance" mode) Software P-state control
> > > > > > > modes
> > > > > > > couldn't get dynamic value during performance mode,
> > > > > > Please explain what you mean here.
> > > > > >
> > > > > commit 82b4e03e01bc (intel_pstate: skip scheduler hook when
> > > > > in
> > > > > "performance" mode) disables
> > > > > intel_pstate_set_update_util_hook
> > > > > when current policy is performance within function
> > > > > intel_pstate_set_policy.
> > > > > It leads to Software P-states couldn't update sysfs interface
> > > > > cpuinfo_cur_freq's value during performance mode, because of
> > > > > pstate_funcs.update_util couldn't set for the given CPU.
> > > > >
> > > > > >
> > > > > >
> > > > > > I guess you carried out some tests and the results were not
> > > > > > as
> > > > > > expected, so what was the test?
> > > > > Exactly, we check the sysfs interface cpuinfo_cur_freq and
> > > > > the
> > > > > output of cpupower frequency-info both with performance mode.
> > > > OK, so what about the change below:
> > > >
> > > > ---
> > > >  drivers/cpufreq/intel_pstate.c |    8 --------
> > > >  1 file changed, 8 deletions(-)
> > > >
> > > > Index: linux-pm/drivers/cpufreq/intel_pstate.c
> > > >
> > ==============================================================
> > >
> > > >
> > > > =====
> > > > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> > > > +++ linux-pm/drivers/cpufreq/intel_pstate.c
> > > > @@ -1674,13 +1674,6 @@ static int intel_pstate_init_cpu(unsigne
> > > >   return 0;
> > > >  }
> > > >
> > > > -static unsigned int intel_pstate_get(unsigned int cpu_num) -{
> > > > - struct cpudata *cpu = all_cpu_data[cpu_num];
> > > > -
> > > > - return cpu ? get_avg_frequency(cpu) : 0;
> > > > -}
> > > > -
> > > >  static void intel_pstate_set_update_util_hook(unsigned int
> > > > cpu_num)  {
> > > >   struct cpudata *cpu = all_cpu_data[cpu_num]; @@
> > > > -1921,7
> > > > +1914,6 @@
> > > > static struct cpufreq_driver intel_pstat
> > > >   .setpolicy = intel_pstate_set_policy,
> > > >   .suspend = intel_pstate_hwp_save_state,
> > > >   .resume = intel_pstate_resume,
> > > > - .get = intel_pstate_get,
> > > >   .init = intel_pstate_cpu_init,
> > > >   .exit = intel_pstate_cpu_exit,
> > > >   .stop_cpu = intel_pstate_stop_cpu,

2017-07-25 15:22:39

by Huaisheng HS1 Ye

[permalink] [raw]
Subject: RE: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

Hi Srinivas,

Oh, I see. Originally I thought this function "arch_freq_get_on_cpu" would have chance to expand to other platforms in the future. Because I found that it appears at cpufreq.c as __weak.
But if it is sure that this function only works for x86 all the time, I think it doesn't matter about its position within show_cpuinfo_cur_freq.

Thanks
Huaisheng

>
> Hi Huaisheng,
>
> On Tue, 2017-07-25 at 07:03 +0000, Huaisheng HS1 Ye wrote:
> > Hi Srinivas,
> > Your idea is great, but your patch at cpufreq.c will force all
> > platforms to use scaling_cur_freq as first choice when userspace wants
> > to access cpuinfo_cur_freq. It is ok for intel x86 platfrom but hard
> > to say with other platforms.
> arch_freq_get_on_cpu is only implemented on x86, for other platforms it will
> not change behavior. I didn't understand your comment about first choice.
>
> Thanks,
> Srinivas
>
>
> > I modified it like that, it looks more reasonable. How about that?
> >
> > Hi Rafael,
> > Deleting "get" function pointer within intel_pstate would lead to
> > sysfs interface cpuinfo_cur_freq disappearing, because of
> > cpufreq_add_dev_interface will check "cpufreq_driver->get" for it.
> > Perhaps just return 0 with in intel_pstate_get would be a workaround
> > for this issue, how about it?
> >
> > I have tested this patch based on Purley platform, both Hardware and
> > Software P-states works correct, we could get accurate and same
> > frequency from cpuinfo_cur_freq and scaling_cur_freq.
> >
> >  drivers/cpufreq/cpufreq.c      | 4 ++++
> >  drivers/cpufreq/intel_pstate.c | 8 +++++---
> >  2 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > index 9bf97a3..922f9d9 100644
> > --- a/drivers/cpufreq/cpufreq.c
> > +++ b/drivers/cpufreq/cpufreq.c
> > @@ -694,6 +694,10 @@ static ssize_t show_cpuinfo_cur_freq(struct
> > cpufreq_policy *policy,
> >   if (cur_freq)
> >   return sprintf(buf, "%u\n", cur_freq);
> >
> > + cur_freq = arch_freq_get_on_cpu(policy->cpu);
> > + if (cur_freq)
> > + return sprintf(buf, "%u\n", cur_freq);
> > +
> >   return sprintf(buf, "<unknown>\n");
> >  }
> >
> > diff --git a/drivers/cpufreq/intel_pstate.c
> > b/drivers/cpufreq/intel_pstate.c index 6cd5035..33e6c10 100644
> > --- a/drivers/cpufreq/intel_pstate.c
> > +++ b/drivers/cpufreq/intel_pstate.c
> > @@ -1924,9 +1924,11 @@ static int intel_pstate_init_cpu(unsigned int
> > cpunum)
> >
> >  static unsigned int intel_pstate_get(unsigned int cpu_num)
> >  {
> > - struct cpudata *cpu = all_cpu_data[cpu_num];
> > -
> > - return cpu ? get_avg_frequency(cpu) : 0;
> > + /*
> > +  * Use frequency from scaling_cur_freq, reserve this
> > function
> > +  * for existing of sysfs cpuinfo_cur_freq.
> > +  */
> > + return 0;
> >  }
> >
> >  static void intel_pstate_set_update_util_hook(unsigned int cpu_num)
> >
> >
> > >
> > > On Tue, 2017-07-25 at 01:46 +0000, Huaisheng HS1 Ye wrote:
> > > >
> > > > Hi Rafael,
> > > >
> > > > If you delete "get" function implement within intel_pstate, the
> > > > sysfs interface cpuinfo_cur_freq will display <unknown> all the
> > > > time.
> > > cpuinfo_cur_freq by definition should show actual frequency HW
> > > frequency.
> > > Unless I missed something. So Len Brown's patch should also take
> > > care of this to get from arch specific function is available.
> > > So in addition to Rafael's change, what about this?
> > >
> > >
> > > diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> > > index
> > > 9bf97a3..29ec687 100644
> > > --- a/drivers/cpufreq/cpufreq.c
> > > +++ b/drivers/cpufreq/cpufreq.c
> > > @@ -689,8 +689,13 @@ store_one(scaling_max_freq, max);
> > >  static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
> > >                                         char *buf)
> > >  {
> > > -       unsigned int cur_freq = __cpufreq_get(policy);
> > > +       unsigned int cur_freq;
> > >
> > > +       cur_freq = arch_freq_get_on_cpu(policy->cpu);
> > > +       if (cur_freq)
> > > +               return sprintf(buf, "%u\n", cur_freq);
> > > +
> > > +       cur_freq = __cpufreq_get(policy);
> > >         if (cur_freq)
> > >                 return sprintf(buf, "%u\n", cur_freq);
> > >
> > >
> > >
> > > Thanks,
> > > Srinivas
> > >
> > > >
> > > > To be honest, at the beginning I have consider this way like you
> > > > patched, but based two reasons below, it is conservative for us to
> > > > do that.
> > > >
> > > > 1. I am worried about whether it would lead to confusion for
> > > > customers or Linux OS venders who are accustomed to
> > > > cpuinfo_cur_freq.
> > > > 2. This is the first time for me to offer patch to intel_pstate,
> > > > not sure whether it could be accepted by you.
> > > >
> > > > >
> > > > >
> > > > > On Monday, July 24, 2017 03:32:47 PM Huaisheng HS1 Ye wrote:
> > > > > >
> > > > > >
> > > > > > Hi Rafael,
> > > > > > Thanks for your reply.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Monday, July 24, 2017 05:43:14 AM Huaisheng HS1 Ye
> > > > > > > wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > After commit 82b4e03e01bc (intel_pstate: skip scheduler
> > > > > > > > hook when in "performance" mode) Software P-state control
> > > > > > > > modes couldn't get dynamic value during performance mode,
> > > > > > > Please explain what you mean here.
> > > > > > >
> > > > > > commit 82b4e03e01bc (intel_pstate: skip scheduler hook when in
> > > > > > "performance" mode) disables intel_pstate_set_update_util_hook
> > > > > > when current policy is performance within function
> > > > > > intel_pstate_set_policy.
> > > > > > It leads to Software P-states couldn't update sysfs interface
> > > > > > cpuinfo_cur_freq's value during performance mode, because of
> > > > > > pstate_funcs.update_util couldn't set for the given CPU.
> > > > > >
> > > > > > >
> > > > > > >
> > > > > > > I guess you carried out some tests and the results were not
> > > > > > > as expected, so what was the test?
> > > > > > Exactly, we check the sysfs interface cpuinfo_cur_freq and the
> > > > > > output of cpupower frequency-info both with performance mode.
> > > > > OK, so what about the change below:
> > > > >
> > > > > ---
> > > > >  drivers/cpufreq/intel_pstate.c |    8 --------
> > > > >  1 file changed, 8 deletions(-)
> > > > >
> > > > > Index: linux-pm/drivers/cpufreq/intel_pstate.c
> > > > >
> > >
> ==============================================================
> > > >
> > > > >
> > > > > =====
> > > > > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> > > > > +++ linux-pm/drivers/cpufreq/intel_pstate.c
> > > > > @@ -1674,13 +1674,6 @@ static int intel_pstate_init_cpu(unsigne
> > > > >   return 0;
> > > > >  }
> > > > >
> > > > > -static unsigned int intel_pstate_get(unsigned int cpu_num) -{
> > > > > - struct cpudata *cpu = all_cpu_data[cpu_num];
> > > > > -
> > > > > - return cpu ? get_avg_frequency(cpu) : 0;
> > > > > -}
> > > > > -
> > > > >  static void intel_pstate_set_update_util_hook(unsigned int
> > > > > cpu_num)  {
> > > > >   struct cpudata *cpu = all_cpu_data[cpu_num]; @@
> > > > > -1921,7
> > > > > +1914,6 @@
> > > > > static struct cpufreq_driver intel_pstat
> > > > >   .setpolicy = intel_pstate_set_policy,
> > > > >   .suspend = intel_pstate_hwp_save_state,
> > > > >   .resume = intel_pstate_resume,
> > > > > - .get = intel_pstate_get,
> > > > >   .init = intel_pstate_cpu_init,
> > > > >   .exit = intel_pstate_cpu_exit,
> > > > >   .stop_cpu = intel_pstate_stop_cpu,

2017-07-25 15:43:35

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

On Monday, July 24, 2017 07:57:45 PM Srinivas Pandruvada wrote:
> On Tue, 2017-07-25 at 01:46 +0000, Huaisheng HS1 Ye wrote:
> > Hi Rafael,
> >
> > If you delete "get" function implement within intel_pstate, the
> > sysfs interface cpuinfo_cur_freq will display <unknown> all the
> > time.
> cpuinfo_cur_freq by definition should show actual frequency HW
> frequency.

Right.

> Unless I missed something. So Len Brown's patch should also
> take care of this to get from arch specific function is available.
> So in addition to Rafael's change, what about this?
>
>
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 9bf97a3..29ec687 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -689,8 +689,13 @@ store_one(scaling_max_freq, max);
> static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
> char *buf)
> {
> - unsigned int cur_freq = __cpufreq_get(policy);
> + unsigned int cur_freq;
>
> + cur_freq = arch_freq_get_on_cpu(policy->cpu);
> + if (cur_freq)
> + return sprintf(buf, "%u\n", cur_freq);
> +
> + cur_freq = __cpufreq_get(policy);
> if (cur_freq)
> return sprintf(buf, "%u\n", cur_freq);
>

So also by definition cpuinfo_cur_freq should not return the same thing as
scaling_cur_freq.

I actually would like cpuinfo_cur_freq to not be present at all, I need to
check why it still shows up if ->get is not present.

Thanks,
Rafael

2017-07-25 21:54:35

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: intel_pstate: Fix cpuinfo_cur_freq after performance governor changes

On Tuesday, July 25, 2017 07:03:36 AM Huaisheng HS1 Ye wrote:
> Hi Srinivas,
> Your idea is great, but your patch at cpufreq.c will force all platforms to use scaling_cur_freq as first choice when userspace wants to access cpuinfo_cur_freq. It is ok for intel x86 platfrom but hard to say with other platforms.
> I modified it like that, it looks more reasonable. How about that?
>
> Hi Rafael,
> Deleting "get" function pointer within intel_pstate would lead to sysfs
> interface cpuinfo_cur_freq disappearing, because of
> cpufreq_add_dev_interface will check "cpufreq_driver->get" for it.

Which is exactly what I want.

cpuinfo_cur_freq is bogus for intel_pstate and it should have never been
exported for this driver.

> Perhaps just return 0 with in intel_pstate_get would be a workaround for this
> issue, how about it?
>
> I have tested this patch based on Purley platform, both Hardware and Software
> P-states works correct, we could get accurate and same frequency from
> cpuinfo_cur_freq and scaling_cur_freq.

But this is not correct. These two attributes should not be expected to always
return the same value.

Thanks,
Rafael

2017-07-25 22:50:55

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH] cpufreq: intel_pstate: Drop ->get from intel_pstate structure

From: Rafael J. Wysocki <[email protected]>

The ->get callback in the intel_pstate structure was mostly there
for the scaling_cur_freq sysfs attribute to work, but after commit
f8475cef9008 (x86: use common aperfmperf_khz_on_cpu() to calculate
KHz using APERF/MPERF) that attribute uses arch_freq_get_on_cpu()
provided by the x86 arch code on all processors supported by
intel_pstate, so it doesn't need the ->get callback from the
driver any more.

Moreover, the very presence of the ->get callback in the intel_pstate
structure causes the cpuinfo_cur_freq attribute to be present when
intel_pstate operates in the active mode, which is bogus, because
the role of that attribute is to return the current CPU frequency
as seen by the hardware. For intel_pstate, though, this is just an
average frequency and not really current, but computed for the
previous sampling interval (the actual current frequency may be
way different at the point this value is obtained by reading from
cpuinfo_cur_freq), and after commit 82b4e03e01bc (intel_pstate: skip
scheduler hook when in "performance" mode) the value in
cpuinfo_cur_freq may be stale or just 0, depending on the driver's
operation mode. In fact, however, on the hardware supported by
intel_pstate there is no way to read the current CPU frequency
from it, so the cpuinfo_cur_freq attribute should not be present
at all when this driver is in use.

For this reason, drop intel_pstate_get() and clear the ->get
callback pointer pointing to it, so that the cpuinfo_cur_freq is
not present for intel_pstate in the active mode any more.

Fixes: 82b4e03e01bc (intel_pstate: skip scheduler hook when in "performance" mode)
Reported-by: Huaisheng Ye <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 8 --------
1 file changed, 8 deletions(-)

Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -1674,13 +1674,6 @@ static int intel_pstate_init_cpu(unsigne
return 0;
}

-static unsigned int intel_pstate_get(unsigned int cpu_num)
-{
- struct cpudata *cpu = all_cpu_data[cpu_num];
-
- return cpu ? get_avg_frequency(cpu) : 0;
-}
-
static void intel_pstate_set_update_util_hook(unsigned int cpu_num)
{
struct cpudata *cpu = all_cpu_data[cpu_num];
@@ -1921,7 +1914,6 @@ static struct cpufreq_driver intel_pstat
.setpolicy = intel_pstate_set_policy,
.suspend = intel_pstate_hwp_save_state,
.resume = intel_pstate_resume,
- .get = intel_pstate_get,
.init = intel_pstate_cpu_init,
.exit = intel_pstate_cpu_exit,
.stop_cpu = intel_pstate_stop_cpu,

2017-07-27 03:47:55

by Viresh Kumar

[permalink] [raw]
Subject: Re: [PATCH] cpufreq: intel_pstate: Drop ->get from intel_pstate structure

On 26-07-17, 00:42, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> The ->get callback in the intel_pstate structure was mostly there
> for the scaling_cur_freq sysfs attribute to work, but after commit
> f8475cef9008 (x86: use common aperfmperf_khz_on_cpu() to calculate
> KHz using APERF/MPERF) that attribute uses arch_freq_get_on_cpu()
> provided by the x86 arch code on all processors supported by
> intel_pstate, so it doesn't need the ->get callback from the
> driver any more.
>
> Moreover, the very presence of the ->get callback in the intel_pstate
> structure causes the cpuinfo_cur_freq attribute to be present when
> intel_pstate operates in the active mode, which is bogus, because
> the role of that attribute is to return the current CPU frequency
> as seen by the hardware. For intel_pstate, though, this is just an
> average frequency and not really current, but computed for the
> previous sampling interval (the actual current frequency may be
> way different at the point this value is obtained by reading from
> cpuinfo_cur_freq), and after commit 82b4e03e01bc (intel_pstate: skip
> scheduler hook when in "performance" mode) the value in
> cpuinfo_cur_freq may be stale or just 0, depending on the driver's
> operation mode. In fact, however, on the hardware supported by
> intel_pstate there is no way to read the current CPU frequency
> from it, so the cpuinfo_cur_freq attribute should not be present
> at all when this driver is in use.
>
> For this reason, drop intel_pstate_get() and clear the ->get
> callback pointer pointing to it, so that the cpuinfo_cur_freq is
> not present for intel_pstate in the active mode any more.
>
> Fixes: 82b4e03e01bc (intel_pstate: skip scheduler hook when in "performance" mode)
> Reported-by: Huaisheng Ye <[email protected]>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> drivers/cpufreq/intel_pstate.c | 8 --------
> 1 file changed, 8 deletions(-)
>
> Index: linux-pm/drivers/cpufreq/intel_pstate.c
> ===================================================================
> --- linux-pm.orig/drivers/cpufreq/intel_pstate.c
> +++ linux-pm/drivers/cpufreq/intel_pstate.c
> @@ -1674,13 +1674,6 @@ static int intel_pstate_init_cpu(unsigne
> return 0;
> }
>
> -static unsigned int intel_pstate_get(unsigned int cpu_num)
> -{
> - struct cpudata *cpu = all_cpu_data[cpu_num];
> -
> - return cpu ? get_avg_frequency(cpu) : 0;
> -}
> -
> static void intel_pstate_set_update_util_hook(unsigned int cpu_num)
> {
> struct cpudata *cpu = all_cpu_data[cpu_num];
> @@ -1921,7 +1914,6 @@ static struct cpufreq_driver intel_pstat
> .setpolicy = intel_pstate_set_policy,
> .suspend = intel_pstate_hwp_save_state,
> .resume = intel_pstate_resume,
> - .get = intel_pstate_get,
> .init = intel_pstate_cpu_init,
> .exit = intel_pstate_cpu_exit,
> .stop_cpu = intel_pstate_stop_cpu,

Acked-by: Viresh Kumar <[email protected]>

--
viresh