2008-06-19 20:44:50

by Max Asbock

[permalink] [raw]
Subject: mce_64.c: mce_cpu_quirks being ignored

A recent change to mce_init in -mm does the following:

@@ -462,7 +463,7 @@ static void mce_init(void *dummy)
wrmsr(MSR_IA32_MCG_CTL, 0xffffffff, 0xffffffff);

for (i = 0; i < banks; i++) {
- wrmsrl(MSR_IA32_MC0_CTL+4*i, bank[i]);
+ wrmsrl(MSR_IA32_MC0_CTL+4*i, ~0UL);
wrmsrl(MSR_IA32_MC0_STATUS+4*i, 0);
}
....

This change appears to ignore the fact that mce_cpu_quirks() clears a
bit in bank[4] for certain CPUs, as in:

static void __cpuinit mce_cpu_quirks(struct cpuinfo_x86 *c)
{
/* This should be disabled by the BIOS, but isn't always */
if (c->x86_vendor == X86_VENDOR_AMD) {
if(c->x86 == 15)
/* disable GART TBL walk error reporting, which trips off
incorrectly with the IOMMU & 3ware & Cerberus. */
clear_bit(10, &bank[4]);
....

Is turning off that bit still needed?

Max


2008-06-19 21:16:35

by Andi Kleen

[permalink] [raw]
Subject: Re: mce_64.c: mce_cpu_quirks being ignored

Max Asbock wrote:



> static void __cpuinit mce_cpu_quirks(struct cpuinfo_x86 *c)
> {
> /* This should be disabled by the BIOS, but isn't always */
> if (c->x86_vendor == X86_VENDOR_AMD) {
> if(c->x86 == 15)
> /* disable GART TBL walk error reporting, which trips off
> incorrectly with the IOMMU & 3ware & Cerberus. */
> clear_bit(10, &bank[4]);
> ....
>
> Is turning off that bit still needed?

Yes it is. Also referencing the bank is needed for other reasons anyways,
otherwise it would ignore the user sysfs choice on reconfiguration.

Venki, I did a patch for dynamic banks anyways, but haven't submitted it yet.

-Andi

2008-06-25 00:13:15

by Pallipadi, Venkatesh

[permalink] [raw]
Subject: Re: mce_64.c: mce_cpu_quirks being ignored

On Thu, Jun 19, 2008 at 02:16:17PM -0700, Andi Kleen wrote:
> Max Asbock wrote:
>
>
>
> > static void __cpuinit mce_cpu_quirks(struct cpuinfo_x86 *c)
> > {
> > /* This should be disabled by the BIOS, but isn't always */
> > if (c->x86_vendor == X86_VENDOR_AMD) {
> > if(c->x86 == 15)
> > /* disable GART TBL walk error reporting, which trips off
> > incorrectly with the IOMMU & 3ware & Cerberus. */
> > clear_bit(10, &bank[4]);
> > ....
> >
> > Is turning off that bit still needed?
>
> Yes it is. Also referencing the bank is needed for other reasons anyways,
> otherwise it would ignore the user sysfs choice on reconfiguration.
>
> Venki, I did a patch for dynamic banks anyways, but haven't submitted it yet.
>

Yes. That particular quirk getting ignored was a bug. Below patch fixes
the bug, until we have the dynamic banks support.

sysfs choice configuration should not have any issues with the earlier patch
as we look for NR_SYSFS_BANKS in do_machine_check().

Signed-off-by: Venkatesh Pallipadi <[email protected]>

---
arch/x86/kernel/cpu/mcheck/mce_64.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/kernel/cpu/mcheck/mce_64.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/cpu/mcheck/mce_64.c 2008-06-24 16:01:23.000000000 -0700
+++ linux-2.6/arch/x86/kernel/cpu/mcheck/mce_64.c 2008-06-24 16:48:01.000000000 -0700
@@ -463,7 +463,11 @@ static void mce_init(void *dummy)
wrmsr(MSR_IA32_MCG_CTL, 0xffffffff, 0xffffffff);

for (i = 0; i < banks; i++) {
- wrmsrl(MSR_IA32_MC0_CTL+4*i, ~0UL);
+ if (i < NR_SYSFS_BANKS)
+ wrmsrl(MSR_IA32_MC0_CTL+4*i, bank[i]);
+ else
+ wrmsrl(MSR_IA32_MC0_CTL+4*i, ~0UL);
+
wrmsrl(MSR_IA32_MC0_STATUS+4*i, 0);
}
}

2008-07-03 13:05:55

by Ingo Molnar

[permalink] [raw]
Subject: Re: mce_64.c: mce_cpu_quirks being ignored


* Venki Pallipadi <[email protected]> wrote:

> > Venki, I did a patch for dynamic banks anyways, but haven't
> > submitted it yet.
>
> Yes. That particular quirk getting ignored was a bug. Below patch
> fixes the bug, until we have the dynamic banks support.
>
> sysfs choice configuration should not have any issues with the earlier
> patch as we look for NR_SYSFS_BANKS in do_machine_check().
>
> Signed-off-by: Venkatesh Pallipadi <[email protected]>

applied to tip/x86/mce - thanks Venki.

Ingo

2008-07-03 13:22:26

by Andi Kleen

[permalink] [raw]
Subject: Re: mce_64.c: mce_cpu_quirks being ignored

Ingo Molnar wrote:
> * Venki Pallipadi <[email protected]> wrote:
>
>>> Venki, I did a patch for dynamic banks anyways, but haven't
>>> submitted it yet.
>> Yes. That particular quirk getting ignored was a bug. Below patch
>> fixes the bug, until we have the dynamic banks support.
>>
>> sysfs choice configuration should not have any issues with the earlier
>> patch as we look for NR_SYSFS_BANKS in do_machine_check().
>>
>> Signed-off-by: Venkatesh Pallipadi <[email protected]>
>
> applied to tip/x86/mce - thanks Venki.

The dynamic banks patch is posted
http://marc.info/?l=linux-kernel&m=121493114604815&w=2
, but it turned out it requires
a sysfs infrastructure change for the auto-generated attributes.
I did that, but the patch is rather large
(http://marc.info/?l=linux-kernel&m=121493094104469&w=2)

Greg said he would queue the change, so dynamic banks could
be added (replacing Venki's two patches), but it would need
to be submitted after the infrastructure change.

-Andi