2015-05-07 16:59:42

by Joe Konno

[permalink] [raw]
Subject: [PATCH] intel_pstate: set BYT MSR with wrmsrl_on_cpu()

From: Joe Konno <[email protected]>

In instances where the default cpufreq governor is Performance, reading
from MSR 0x199 on an applicable multi-core Atom system saw boot-to-boot
variability in the P-State value set to each logical core. Sometimes
only one logical core would be set properly, other times two or three.
There was an assumption in the code that only a thread on the intended
logical core would be calling the wrmsrl() function. That was disproven
during debug, as cpufreq, at init, was not always calling from the same
as the logical core it targeted. Thus, use wrmsrl_on_cpu() instead, as
done in the core_set_pstate() function.

For: LCK-1822
Fixes: 007bea098b86 ("intel_pstate: Add setting voltage value for
baytrail P states.")
Signed-off-by: Joe Konno <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 6414661ac1c4..c45d274a75c8 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -535,7 +535,7 @@ static void byt_set_pstate(struct cpudata *cpudata, int pstate)

val |= vid;

- wrmsrl(MSR_IA32_PERF_CTL, val);
+ wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val);
}

#define BYT_BCLK_FREQS 5
--
2.4.0


2015-05-07 20:33:08

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] intel_pstate: set BYT MSR with wrmsrl_on_cpu()

On Thursday, May 07, 2015 09:59:39 AM Joe Konno wrote:
> From: Joe Konno <[email protected]>
>
> In instances where the default cpufreq governor is Performance, reading

I'm not really sure what this is about. You're talking about cpufreq governors
and this is an intel_pstate patch. What gives?

> from MSR 0x199 on an applicable multi-core Atom system saw boot-to-boot
> variability in the P-State value set to each logical core. Sometimes
> only one logical core would be set properly, other times two or three.
> There was an assumption in the code that only a thread on the intended
> logical core would be calling the wrmsrl() function. That was disproven
> during debug, as cpufreq, at init, was not always calling from the same
> as the logical core it targeted. Thus, use wrmsrl_on_cpu() instead, as
> done in the core_set_pstate() function.
>
> For: LCK-1822

This tag is meaningless upstream.

> Fixes: 007bea098b86 ("intel_pstate: Add setting voltage value for
> baytrail P states.")

So, you're fixing a function introduced by the above commit, right?

> Signed-off-by: Joe Konno <[email protected]>
> ---
> drivers/cpufreq/intel_pstate.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 6414661ac1c4..c45d274a75c8 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -535,7 +535,7 @@ static void byt_set_pstate(struct cpudata *cpudata, int pstate)
>
> val |= vid;
>
> - wrmsrl(MSR_IA32_PERF_CTL, val);
> + wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val);

So the bug is that this may run on a CPU which is not cpudata->cpu in which
case the write will not happen where it should. Is that correct?

> }
>
> #define BYT_BCLK_FREQS 5
>

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2015-05-07 23:22:36

by Joe Konno

[permalink] [raw]
Subject: Re: [PATCH] intel_pstate: set BYT MSR with wrmsrl_on_cpu()

On Thu, May 07, 2015 at 10:58:11PM +0200, Rafael J. Wysocki wrote:
> On Thursday, May 07, 2015 09:59:39 AM Joe Konno wrote:
> > From: Joe Konno <[email protected]>
> >
> > In instances where the default cpufreq governor is Performance, reading
>
> I'm not really sure what this is about. You're talking about cpufreq governors
> and this is an intel_pstate patch. What gives?

I'll reshuffle the paragraph to bring detail to the fix first, and the
"when/why" second.

In debug I have only seen the bug during boot when cpufreq calls
intel_pstate's init for each logical core-- often from one, sometimes
two logical cores.

The bug may occur after init as well, but not enough data to conclude
one way or the other. I personally have not seen it happen after init in
my local testing.

>
> > from MSR 0x199 on an applicable multi-core Atom system saw boot-to-boot
> > variability in the P-State value set to each logical core. Sometimes
> > only one logical core would be set properly, other times two or three.
> > There was an assumption in the code that only a thread on the intended
> > logical core would be calling the wrmsrl() function. That was disproven
> > during debug, as cpufreq, at init, was not always calling from the same
> > as the logical core it targeted. Thus, use wrmsrl_on_cpu() instead, as
> > done in the core_set_pstate() function.
> >
> > For: LCK-1822
>
> This tag is meaningless upstream.

Mimicked another subsystem's practice. I have no problem removing it.

>
> > Fixes: 007bea098b86 ("intel_pstate: Add setting voltage value for
> > baytrail P states.")
>
> So, you're fixing a function introduced by the above commit, right?

Correct. That commit introduced the byt_set_pstate() function with the
wrmsrl() call.

>
> > Signed-off-by: Joe Konno <[email protected]>
> > ---
> > drivers/cpufreq/intel_pstate.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> > index 6414661ac1c4..c45d274a75c8 100644
> > --- a/drivers/cpufreq/intel_pstate.c
> > +++ b/drivers/cpufreq/intel_pstate.c
> > @@ -535,7 +535,7 @@ static void byt_set_pstate(struct cpudata *cpudata, int pstate)
> >
> > val |= vid;
> >
> > - wrmsrl(MSR_IA32_PERF_CTL, val);
> > + wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val);
>
> So the bug is that this may run on a CPU which is not cpudata->cpu in which
> case the write will not happen where it should. Is that correct?

Yes-- I believe my first inline comment spoke to this.

>
> > }
> >
> > #define BYT_BCLK_FREQS 5
> >
>
> --
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2015-05-08 13:34:14

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] intel_pstate: set BYT MSR with wrmsrl_on_cpu()

On Thursday, May 07, 2015 04:22:32 PM Joe Konno wrote:
> On Thu, May 07, 2015 at 10:58:11PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, May 07, 2015 09:59:39 AM Joe Konno wrote:
> > > From: Joe Konno <[email protected]>
> > >
> > > In instances where the default cpufreq governor is Performance, reading
> >
> > I'm not really sure what this is about. You're talking about cpufreq governors
> > and this is an intel_pstate patch. What gives?
>
> I'll reshuffle the paragraph to bring detail to the fix first, and the
> "when/why" second.
>
> In debug I have only seen the bug during boot when cpufreq calls
> intel_pstate's init for each logical core-- often from one, sometimes
> two logical cores.
>
> The bug may occur after init as well, but not enough data to conclude
> one way or the other. I personally have not seen it happen after init in
> my local testing.
>
> >
> > > from MSR 0x199 on an applicable multi-core Atom system saw boot-to-boot
> > > variability in the P-State value set to each logical core. Sometimes
> > > only one logical core would be set properly, other times two or three.
> > > There was an assumption in the code that only a thread on the intended
> > > logical core would be calling the wrmsrl() function. That was disproven
> > > during debug, as cpufreq, at init, was not always calling from the same
> > > as the logical core it targeted. Thus, use wrmsrl_on_cpu() instead, as
> > > done in the core_set_pstate() function.
> > >
> > > For: LCK-1822
> >
> > This tag is meaningless upstream.
>
> Mimicked another subsystem's practice. I have no problem removing it.
>
> >
> > > Fixes: 007bea098b86 ("intel_pstate: Add setting voltage value for
> > > baytrail P states.")
> >
> > So, you're fixing a function introduced by the above commit, right?
>
> Correct. That commit introduced the byt_set_pstate() function with the
> wrmsrl() call.
>
> >
> > > Signed-off-by: Joe Konno <[email protected]>
> > > ---
> > > drivers/cpufreq/intel_pstate.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> > > index 6414661ac1c4..c45d274a75c8 100644
> > > --- a/drivers/cpufreq/intel_pstate.c
> > > +++ b/drivers/cpufreq/intel_pstate.c
> > > @@ -535,7 +535,7 @@ static void byt_set_pstate(struct cpudata *cpudata, int pstate)
> > >
> > > val |= vid;
> > >
> > > - wrmsrl(MSR_IA32_PERF_CTL, val);
> > > + wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val);
> >
> > So the bug is that this may run on a CPU which is not cpudata->cpu in which
> > case the write will not happen where it should. Is that correct?
>
> Yes-- I believe my first inline comment spoke to this.

So here's the changelog I'd use with this patch:

"Commit 007bea098b86 (intel_pstate: Add setting voltage value for baytrail
P states.) introduced byt_set_pstate() with the assumption that it would
always be run by the CPU whose MSR is to be written by it. It turns out,
however, that is not always the case in practice, so modify byt_set_pstate()
to enforce the MSR write done by it to always happen on the right CPU."

I don't think you need to say anything more in it. Mentioning governors in
particular is unnecessary and confusing.

Kristen, what do you think?


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

2015-05-11 18:39:36

by Kristen Carlson Accardi

[permalink] [raw]
Subject: Re: [PATCH] intel_pstate: set BYT MSR with wrmsrl_on_cpu()

On Fri, 08 May 2015 15:59:14 +0200
"Rafael J. Wysocki" <[email protected]> wrote:

> On Thursday, May 07, 2015 04:22:32 PM Joe Konno wrote:
> > On Thu, May 07, 2015 at 10:58:11PM +0200, Rafael J. Wysocki wrote:
> > > On Thursday, May 07, 2015 09:59:39 AM Joe Konno wrote:
> > > > From: Joe Konno <[email protected]>
> > > >
> > > > In instances where the default cpufreq governor is Performance, reading
> > >
> > > I'm not really sure what this is about. You're talking about cpufreq governors
> > > and this is an intel_pstate patch. What gives?
> >
> > I'll reshuffle the paragraph to bring detail to the fix first, and the
> > "when/why" second.
> >
> > In debug I have only seen the bug during boot when cpufreq calls
> > intel_pstate's init for each logical core-- often from one, sometimes
> > two logical cores.
> >
> > The bug may occur after init as well, but not enough data to conclude
> > one way or the other. I personally have not seen it happen after init in
> > my local testing.
> >
> > >
> > > > from MSR 0x199 on an applicable multi-core Atom system saw boot-to-boot
> > > > variability in the P-State value set to each logical core. Sometimes
> > > > only one logical core would be set properly, other times two or three.
> > > > There was an assumption in the code that only a thread on the intended
> > > > logical core would be calling the wrmsrl() function. That was disproven
> > > > during debug, as cpufreq, at init, was not always calling from the same
> > > > as the logical core it targeted. Thus, use wrmsrl_on_cpu() instead, as
> > > > done in the core_set_pstate() function.
> > > >
> > > > For: LCK-1822
> > >
> > > This tag is meaningless upstream.
> >
> > Mimicked another subsystem's practice. I have no problem removing it.
> >
> > >
> > > > Fixes: 007bea098b86 ("intel_pstate: Add setting voltage value for
> > > > baytrail P states.")
> > >
> > > So, you're fixing a function introduced by the above commit, right?
> >
> > Correct. That commit introduced the byt_set_pstate() function with the
> > wrmsrl() call.
> >
> > >
> > > > Signed-off-by: Joe Konno <[email protected]>
> > > > ---
> > > > drivers/cpufreq/intel_pstate.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> > > > index 6414661ac1c4..c45d274a75c8 100644
> > > > --- a/drivers/cpufreq/intel_pstate.c
> > > > +++ b/drivers/cpufreq/intel_pstate.c
> > > > @@ -535,7 +535,7 @@ static void byt_set_pstate(struct cpudata *cpudata, int pstate)
> > > >
> > > > val |= vid;
> > > >
> > > > - wrmsrl(MSR_IA32_PERF_CTL, val);
> > > > + wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val);
> > >
> > > So the bug is that this may run on a CPU which is not cpudata->cpu in which
> > > case the write will not happen where it should. Is that correct?
> >
> > Yes-- I believe my first inline comment spoke to this.
>
> So here's the changelog I'd use with this patch:
>
> "Commit 007bea098b86 (intel_pstate: Add setting voltage value for baytrail
> P states.) introduced byt_set_pstate() with the assumption that it would
> always be run by the CPU whose MSR is to be written by it. It turns out,
> however, that is not always the case in practice, so modify byt_set_pstate()
> to enforce the MSR write done by it to always happen on the right CPU."
>
> I don't think you need to say anything more in it. Mentioning governors in
> particular is unnecessary and confusing.
>
> Kristen, what do you think?
>
>

Looks good to me with the modified changelog.

Acked-by: Kristen Carlson Accardi <[email protected]>

2015-05-12 15:01:00

by Joe Konno

[permalink] [raw]
Subject: [PATCH] intel_pstate: set BYT MSR with wrmsrl_on_cpu()

From: Joe Konno <[email protected]>

Commit 007bea098b86 (intel_pstate: Add setting voltage value for
baytrail P states.) introduced byt_set_pstate() with the assumption that
it would always be run by the CPU whose MSR is to be written by it. It
turns out, however, that is not always the case in practice, so modify
byt_set_pstate() to enforce the MSR write done by it to always happen on
the right CPU.

v2: better commit message, remove For: tag

Fixes: 007bea098b86 ("intel_pstate: Add setting voltage value for
baytrail P states.")
Signed-off-by: Joe Konno <[email protected]>
---
drivers/cpufreq/intel_pstate.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 6414661ac1c4..c45d274a75c8 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -535,7 +535,7 @@ static void byt_set_pstate(struct cpudata *cpudata, int pstate)

val |= vid;

- wrmsrl(MSR_IA32_PERF_CTL, val);
+ wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val);
}

#define BYT_BCLK_FREQS 5
--
2.4.0

2015-05-14 23:48:43

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] intel_pstate: set BYT MSR with wrmsrl_on_cpu()

On Tuesday, May 12, 2015 07:59:42 AM Joe Konno wrote:
> From: Joe Konno <[email protected]>
>
> Commit 007bea098b86 (intel_pstate: Add setting voltage value for
> baytrail P states.) introduced byt_set_pstate() with the assumption that
> it would always be run by the CPU whose MSR is to be written by it. It
> turns out, however, that is not always the case in practice, so modify
> byt_set_pstate() to enforce the MSR write done by it to always happen on
> the right CPU.
>
> v2: better commit message, remove For: tag
>
> Fixes: 007bea098b86 ("intel_pstate: Add setting voltage value for
> baytrail P states.")
> Signed-off-by: Joe Konno <[email protected]>
> ---
> drivers/cpufreq/intel_pstate.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
> index 6414661ac1c4..c45d274a75c8 100644
> --- a/drivers/cpufreq/intel_pstate.c
> +++ b/drivers/cpufreq/intel_pstate.c
> @@ -535,7 +535,7 @@ static void byt_set_pstate(struct cpudata *cpudata, int pstate)
>
> val |= vid;
>
> - wrmsrl(MSR_IA32_PERF_CTL, val);
> + wrmsrl_on_cpu(cpudata->cpu, MSR_IA32_PERF_CTL, val);
> }
>
> #define BYT_BCLK_FREQS 5

Queued up for 4.2, thanks!


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.