2009-11-07 00:44:10

by Yong Wang

Subject: [PATCH] x86, therm: Only read the initial value of thermal LVT entry on BSP

Only read the initial value of the thermal LVT entry on the BSP. The
initial value of the thermal LVT entry on every AP always reads 0x10000,
because the BSP wakes the APs by issuing an INIT-SIPI-SIPI sequence to
them, and when an AP receives the INIT IPI its LVT registers are reset
to 0s except for the mask bits, which are set to 1s.

Also restore the value that the BIOS programmed on each AP, based on the
BSP's value that we saved, since the BIOS always sets the same value for
all threads/cores.

Signed-off-by: Yong Wang <[email protected]>
Cc: stable <[email protected]>
---
arch/x86/kernel/cpu/mcheck/therm_throt.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index b3a1dba..1fd42db 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -259,6 +259,7 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
 	unsigned int cpu = smp_processor_id();
 	int tm2 = 0;
 	u32 l, h;
+	static u32 lvtthmr;
 
 	/* Thermal monitoring depends on ACPI and clock modulation*/
 	if (!cpu_has(c, X86_FEATURE_ACPI) || !cpu_has(c, X86_FEATURE_ACC))
@@ -270,7 +271,24 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
 	 * since it might be delivered via SMI already:
 	 */
 	rdmsr(MSR_IA32_MISC_ENABLE, l, h);
-	h = apic_read(APIC_LVTTHMR);
+
+	/*
+	 * Only read the initial value of thermal LVT entry on BSP. The
+	 * initial value of thermal LVT entries on all APs always reads
+	 * 0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
+	 * sequence to them and LVT registers are reset to 0s except for
+	 * the mask bits which are set to 1s when APs receive INIT IPI.
+	 * Also restore the value that BIOS has programmed on AP based on
+	 * BSP's info we saved since BIOS is always setting the same value
+	 * for all threads/cores
+	 */
+	if (cpu == 0)
+		lvtthmr = apic_read(APIC_LVTTHMR);
+	else
+		apic_write(APIC_LVTTHMR, lvtthmr);
+
+	h = lvtthmr;
+
 	if ((l & MSR_IA32_MISC_ENABLE_TM1) && (h & APIC_DM_SMI)) {
 		printk(KERN_DEBUG
 		       "CPU%d: Thermal monitoring handled by SMI\n", cpu);
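
For readers unfamiliar with the 0x10000 reset value mentioned in the changelog, here is a minimal user-space sketch of how an LVT entry decodes. The bit positions (mask at bit 16, delivery mode in bits 10:8, SMI encoded as 010b) follow the local APIC LVT layout; the macro and function names are illustrative only and are not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names; bit layout follows the local APIC LVT format. */
#define LVT_MASKED   (1u << 16)  /* bit 16: interrupt is masked */
#define LVT_DM_MASK  (7u << 8)   /* bits 10:8: delivery mode field */
#define LVT_DM_SMI   (2u << 8)   /* delivery mode 010b = SMI (0x200) */

/* True if the LVT entry has its mask bit set. */
static bool lvt_is_masked(uint32_t lvt)
{
	return (lvt & LVT_MASKED) != 0;
}

/* True if the LVT entry is programmed to deliver an SMI. */
static bool lvt_delivers_smi(uint32_t lvt)
{
	return (lvt & LVT_DM_MASK) == LVT_DM_SMI;
}
```

With these helpers, 0x10000 decodes as "masked, delivery mode fixed", so a check for SMI delivery on a freshly woken AP fails even when the BIOS programmed SMI delivery before the INIT IPI.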


2009-11-07 02:57:30

by Suresh Siddha

Subject: Re: [PATCH] x86, therm: Only read the initial value of thermal LVT entry on BSP

On Fri, 2009-11-06 at 16:17 -0800, Yong Wang wrote:
> Only read the initial value of thermal LVT entry on BSP. The initial
> value of thermal LVT entries on all APs always reads 0x10000 because
> APs are woken up by BSP issuing INIT-SIPI-SIPI sequence to them and
> LVT registers are reset to 0s except for mask bits which are set to
> 1s when APs receive INIT IPI.
>
> Also restore the value that BIOS has programmed on AP based on BSP's
> info we saved since BIOS is always setting the same value for all
> threads/cores.

Yong, I have appended a new patch with an enhanced changelog and
subject. In the future, when you modify and post another version of a
patch, can you please update the patch version and explain what has
changed and why, so that it is easier for the reviewers?

Ingo/Peter, please review and queue this patch from Yong. Thanks.
---

From: Yong Wang <[email protected]>
Subject: x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value

On platforms where the BIOS handles the thermal monitor interrupt,
APIC_LVTTHMR on each logical CPU is programmed to generate an SMI and
the OS can't touch it.

Unfortunately, the AP bringup sequence using INIT-SIPI-SIPI clears all
the LVT entries except the mask bit. Essentially, this leaves all LVT
entries, including the thermal monitoring interrupt, set to masked
(clearing the BIOS-programmed value for APIC_LVTTHMR).

This leads to the kernel taking over the thermal monitoring interrupt
on the APs but not on the BSP (leaving the BIOS-programmed value only
on the BSP).

As a result, we have seen system hangs when the thermal monitoring
interrupt is generated.

Fix this by reading the initial value of the thermal LVT entry on the
BSP and, if the BIOS has taken over control, programming the same value
on all APs, leaving thermal monitoring interrupt control on all the
logical CPUs to the BIOS.

Signed-off-by: Yong Wang <[email protected]>
Reviewed-by: Suresh Siddha <[email protected]>
Cc: [email protected]
---
arch/x86/kernel/cpu/mcheck/therm_throt.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index b3a1dba..1fd42db 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -259,6 +259,7 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
 	unsigned int cpu = smp_processor_id();
 	int tm2 = 0;
 	u32 l, h;
+	static u32 lvtthmr;
 
 	/* Thermal monitoring depends on ACPI and clock modulation*/
 	if (!cpu_has(c, X86_FEATURE_ACPI) || !cpu_has(c, X86_FEATURE_ACC))
@@ -270,7 +271,24 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
 	 * since it might be delivered via SMI already:
 	 */
 	rdmsr(MSR_IA32_MISC_ENABLE, l, h);
-	h = apic_read(APIC_LVTTHMR);
+
+	/*
+	 * Only read the initial value of thermal LVT entry on BSP. The
+	 * initial value of thermal LVT entries on all APs always reads
+	 * 0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
+	 * sequence to them and LVT registers are reset to 0s except for
+	 * the mask bits which are set to 1s when APs receive INIT IPI.
+	 * Also restore the value that BIOS has programmed on AP based on
+	 * BSP's info we saved since BIOS is always setting the same value
+	 * for all threads/cores
+	 */
+	if (cpu == 0)
+		lvtthmr = apic_read(APIC_LVTTHMR);
+	else
+		apic_write(APIC_LVTTHMR, lvtthmr);
+
+	h = lvtthmr;
+
 	if ((l & MSR_IA32_MISC_ENABLE_TM1) && (h & APIC_DM_SMI)) {
 		printk(KERN_DEBUG
 		       "CPU%d: Thermal monitoring handled by SMI\n", cpu);
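
The BSP/AP mismatch described in this changelog can be modelled in user space. The sketch below is only a simulation under stated assumptions: the per-CPU register array, `bios_setup()`, `init_ipi()` and both `smi_owned_*()` functions are hypothetical stand-ins, and the MSR_IA32_MISC_ENABLE_TM1 half of the real check is left out for brevity.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS      4
#define LVT_MASKED   (1u << 16)  /* mask bit */
#define LVT_DM_SMI   (2u << 8)   /* SMI delivery mode (0x200) */

/* Simulated per-CPU thermal LVT registers (stand-in for APIC_LVTTHMR). */
static uint32_t lvtthmr_reg[NR_CPUS];

/* BIOS programs every logical CPU to deliver thermal events as SMI. */
static void bios_setup(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		lvtthmr_reg[cpu] = LVT_DM_SMI;
}

/* INIT IPI on an AP: LVT entry is cleared except for the mask bit. */
static void init_ipi(int cpu)
{
	lvtthmr_reg[cpu] = LVT_MASKED;
}

/* Pre-patch behaviour: every CPU inspects its own register. */
static bool smi_owned_old(int cpu)
{
	return (lvtthmr_reg[cpu] & LVT_DM_SMI) != 0;
}

/* Patched behaviour: cache the BSP value, restore it on each AP. */
static uint32_t saved_lvtthmr;

static bool smi_owned_new(int cpu)
{
	if (cpu == 0)
		saved_lvtthmr = lvtthmr_reg[cpu];
	else
		lvtthmr_reg[cpu] = saved_lvtthmr;

	return (saved_lvtthmr & LVT_DM_SMI) != 0;
}
```

After `bios_setup()` and `init_ipi(1)`, the old check reports BIOS ownership on CPU 0 but not on CPU 1, which is the split that the changelog ties to the hangs. The patched check gives the same answer on both and puts the BIOS value back into the AP's register.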

2009-11-07 10:19:08

by Yong Wang

Subject: Re: [PATCH] x86, therm: Only read the initial value of thermal LVT entry on BSP

On Fri, Nov 06, 2009 at 06:56:05PM -0800, Suresh Siddha wrote:
> On Fri, 2009-11-06 at 16:17 -0800, Yong Wang wrote:
> > Only read the initial value of thermal LVT entry on BSP. The initial
> > value of thermal LVT entries on all APs always reads 0x10000 because
> > APs are woken up by BSP issuing INIT-SIPI-SIPI sequence to them and
> > LVT registers are reset to 0s except for mask bits which are set to
> > 1s when APs receive INIT IPI.
> >
> > Also restore the value that BIOS has programmed on AP based on BSP's
> > info we saved since BIOS is always setting the same value for all
> > threads/cores.
>
> Yong, I have appended a new patch with an enhanced change log and
> subject. In future, when you modify and post another version of the
> patch, can you please update the patch version and elaborate what has
> changed, why etc, so that it will be easier for the reviewers.
>

Thanks a lot for your review and comments. Will follow the conventions
going forward.

2009-11-08 10:25:28

by Ingo Molnar

Subject: Re: [PATCH] x86, therm: Only read the initial value of thermal LVT entry on BSP


* Suresh Siddha <[email protected]> wrote:

> On Fri, 2009-11-06 at 16:17 -0800, Yong Wang wrote:
> > Only read the initial value of thermal LVT entry on BSP. The initial
> > value of thermal LVT entries on all APs always reads 0x10000 because
> > APs are woken up by BSP issuing INIT-SIPI-SIPI sequence to them and
> > LVT registers are reset to 0s except for mask bits which are set to
> > 1s when APs receive INIT IPI.
> >
> > Also restore the value that BIOS has programmed on AP based on BSP's
> > info we saved since BIOS is always setting the same value for all
> > threads/cores.
>
> Yong, I have appended a new patch with an enhanced change log and
> subject. In future, when you modify and post another version of the
> patch, can you please update the patch version and elaborate what has
> changed, why etc, so that it will be easier for the reviewers.
>
> Ingo/Peter, please review and queue this patch from Yong. Thanks.
> ---
>
> From: Yong Wang <[email protected]>
> Subject: x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value
>
> On platforms where bios handles the thermal monitor interrupt,
> APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS
> can't touch it.
>
> Unfortunately AP bringup sequence using INIT-SIPI-SIPI clear all
> the LVT entries except the mask bit. Essentially this results in
> all LVT entries including the thermal monitoring interrupt set to masked
> (clearing the bios programmed value for APIC_LVTTHMR).
>
> And this leads to kernel take over the thermal monitoring interrupt
> on AP's but not on BSP (leaving the bios programmed value only on BSP).
>
> As a result of this, we have seen system hangs when the thermal
> monitoring interrupt is generated.
>
> Fix this by reading the initial value of thermal LVT entry on BSP
> and if bios has taken over the control, then program the same value
> on all AP's and leave the thermal monitoring interrupt control
> on all the logical cpu's to the bios.
>
> Signed-off-by: Yong Wang <[email protected]>
> Reviewed-by: Suresh Siddha <[email protected]>
> Cc: [email protected]
> ---
> arch/x86/kernel/cpu/mcheck/therm_throt.c | 20 +++++++++++++++++++-
> 1 file changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> index b3a1dba..1fd42db 100644
> --- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
> +++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
> @@ -259,6 +259,7 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
> unsigned int cpu = smp_processor_id();
> int tm2 = 0;
> u32 l, h;
> + static u32 lvtthmr;
>
> /* Thermal monitoring depends on ACPI and clock modulation*/
> if (!cpu_has(c, X86_FEATURE_ACPI) || !cpu_has(c, X86_FEATURE_ACC))
> @@ -270,7 +271,24 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
> * since it might be delivered via SMI already:
> */
> rdmsr(MSR_IA32_MISC_ENABLE, l, h);
> - h = apic_read(APIC_LVTTHMR);
> +
> + /*
> + * Only read the initial value of thermal LVT entry on BSP. The
> + * initial value of thermal LVT entries on all APs always reads
> + * 0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
> + * sequence to them and LVT registers are reset to 0s except for
> + * the mask bits which are set to 1s when APs receive INIT IPI.
> + * Also restore the value that BIOS has programmed on AP based on
> + * BSP's info we saved since BIOS is always setting the same value
> + * for all threads/cores
> + */
> + if (cpu == 0)
> + lvtthmr = apic_read(APIC_LVTTHMR);
> + else
> + apic_write(APIC_LVTTHMR, lvtthmr);
> +
> + h = lvtthmr;
> +

I don't disagree with the fix, but could we please do it a bit more
cleanly, and initialize a proper file-scope lvtthrm_init value from a
different boot-CPU-only function? (not intel_init_thermal)

That makes it cleaner, and it will also work if we don't boot on cpu==0
(should that ever occur).

Thanks,

Ingo
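
One way to read this suggestion, as a user-space sketch rather than the actual kernel change: a file-scope variable is filled in once by a function that only the boot CPU calls, and the per-CPU init path only consumes it. The simulated register array and the names `therm_boot_cpu_init()` and `therm_cpu_init()` are hypothetical; in the kernel the variable would be a `static u32`, and the function would read the register with apic_read() instead of an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS      4
#define LVT_MASKED   (1u << 16)  /* mask bit */
#define LVT_DM_MASK  (7u << 8)   /* delivery mode field */
#define LVT_DM_SMI   (2u << 8)   /* SMI delivery mode */

/* Simulated per-CPU thermal LVT registers (stand-in for APIC_LVTTHMR). */
static uint32_t sim_lvtthmr[NR_CPUS];

/* File-scope copy of the value the BIOS left on the boot CPU. */
static uint32_t lvtthrm_init;

/* Called exactly once, on whichever CPU booted the system. */
void therm_boot_cpu_init(int boot_cpu)
{
	assert(boot_cpu >= 0 && boot_cpu < NR_CPUS);
	lvtthrm_init = sim_lvtthmr[boot_cpu];
}

/*
 * Per-CPU init. Returns true when the BIOS owns the thermal interrupt,
 * in which case the BIOS value is restored on this CPU and the caller
 * should leave the entry alone.
 */
bool therm_cpu_init(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if ((lvtthrm_init & LVT_DM_MASK) != LVT_DM_SMI)
		return false;

	sim_lvtthmr[cpu] = lvtthrm_init;
	return true;
}
```

Because the per-CPU path never tests the CPU number, the sketch behaves the same whether the boot CPU is logical CPU 0 or any other ID.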

2009-11-08 13:04:52

by Yong Wang

Subject: Re: [PATCH] x86, therm: Only read the initial value of thermal LVT entry on BSP

On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote:
>
> i dont disagree with the fix, but could we please do it a bit cleaner,
> and initialize a proper file-scope lvtthrm_init value from a different
> boot-CPU-only function? (not intel_init_thermal)
>

Thanks for your comments. Just want to make sure I understand correctly.
By 'file-scope', do you mean I should define lvtthrm_init as a static
variable outside of any function?

> that makes it cleaner, and also it will work if we dont boot on cpu==0.
> (should that ever occur)
>

May I know when this would happen?

Thanks
-Yong

2009-11-08 13:16:59

by Ingo Molnar

Subject: Re: [PATCH] x86, therm: Only read the initial value of thermal LVT entry on BSP


* Yong Wang <[email protected]> wrote:

> On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote:
> >
> > i dont disagree with the fix, but could we please do it a bit cleaner,
> > and initialize a proper file-scope lvtthrm_init value from a different
> > boot-CPU-only function? (not intel_init_thermal)
> >
>
> Thanks for your comments. Just want to make sure I understand correctly.
> By 'file-scope', do you want me to define lvtthrm_init as a static
> variable but not to define it in any function?

Correct - I'd suggest putting it next to the other file-scope variables
at the top of the .c file. Maybe make it __read_mostly as well.

> > that makes it cleaner, and also it will work if we dont boot on
> > cpu==0. (should that ever occur)
> >
>
> May I know when will this happen?

It's not really expected - we factorize the CPU IDs (which are logical)
so that the boot CPU is 0. But relying on cpu==0 being the boot CPU is
not clean and the resulting code is harder to read.

Ingo

2009-11-08 13:46:21

by Yong Wang

Subject: Re: [PATCH] x86, therm: Only read the initial value of thermal LVT entry on BSP

On Sun, Nov 08, 2009 at 02:16:55PM +0100, Ingo Molnar wrote:
>
> * Yong Wang <[email protected]> wrote:
>
> > On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote:
> > >
> > > i dont disagree with the fix, but could we please do it a bit cleaner,
> > > and initialize a proper file-scope lvtthrm_init value from a different
> > > boot-CPU-only function? (not intel_init_thermal)
> > >
> >
> > Thanks for your comments. Just want to make sure I understand correctly.
> > By 'file-scope', do you want me to define lvtthrm_init as a static
> > variable but not to define it in any function?
>
> Correct - i'd suggest to put it next to other file-scope variables at
> the top of the .c file. Maybe make it __read_mostly as well.
>

OK, will do.

> > > that makes it cleaner, and also it will work if we dont boot on
> > > cpu==0. (should that ever occur)
> > >
> >
> > May I know when will this happen?
>
> It's not really expected - we factorize the CPU IDs (which are logical)
> so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not
> clean and the resulting code is harder to read.
>

There does not seem to be a 'boot-CPU-only function' in that .c file.
What about changing cpu==0 to cpu==boot_cpu_id? Would that help?

Thanks
-Yong

2009-11-08 13:42:28

by Cyrill Gorcunov

Subject: Re: [PATCH] x86, therm: Only read the initial value of thermal LVT entry on BSP

[Ingo Molnar - Sun, Nov 08, 2009 at 02:16:55PM +0100]
|
...
|
| > > that makes it cleaner, and also it will work if we dont boot on
| > > cpu==0. (should that ever occur)
| > >
| >
| > May I know when will this happen?
|
| It's not really expected - we factorize the CPU IDs (which are logical)
| so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not
| clean and the resulting code is harder to read.

Perhaps we need some is_bsp_cpu() helper? Though covering all the x86
places would take some effort :/

|
| Ingo
|
-- Cyrill

2009-11-08 15:59:32

by Ingo Molnar

Subject: Re: [PATCH] x86, therm: Only read the initial value of thermal LVT entry on BSP


* Yong Wang <[email protected]> wrote:

> On Sun, Nov 08, 2009 at 02:16:55PM +0100, Ingo Molnar wrote:
> >
> > * Yong Wang <[email protected]> wrote:
> >
> > > On Sun, Nov 08, 2009 at 11:25:21AM +0100, Ingo Molnar wrote:
> > > >
> > > > i dont disagree with the fix, but could we please do it a bit cleaner,
> > > > and initialize a proper file-scope lvtthrm_init value from a different
> > > > boot-CPU-only function? (not intel_init_thermal)
> > > >
> > >
> > > Thanks for your comments. Just want to make sure I understand correctly.
> > > By 'file-scope', do you want me to define lvtthrm_init as a static
> > > variable but not to define it in any function?
> >
> > Correct - i'd suggest to put it next to other file-scope variables at
> > the top of the .c file. Maybe make it __read_mostly as well.
> >
>
> OK, will do.
>
> > > > that makes it cleaner, and also it will work if we dont boot on
> > > > cpu==0. (should that ever occur)
> > > >
> > >
> > > May I know when will this happen?
> >
> > It's not really expected - we factorize the CPU IDs (which are logical)
> > so that the boot CPU is 0. But relying on cpu==0 is the boot cpu is not
> > clean and the resulting code is harder to read.
> >
>
> There does not seem to be 'boot-CPU-only function' in that .c file. What
> about changing cpu==0 to cpu==boot_cpu_id? Does that help?

Then create one and call it - it's still cleaner that way.

Ingo