2002-01-29 18:26:12

by Pete Wyckoff

Subject: [patch] typo in i386 machine check code

Our old PIII Xeons are dying, as shown by more frequent panics
due to bad hardware:

kernel: CPU 3: Machine Check Exception: 0000000000000007
kernel: Bank 0: b678600022000800 at 3678600022000800

The part after the "at" is supposed to be the memory address which
was being accessed when the fault was detected. Instead the code
prints out the status field again (with the high bit removed for
no apparent reason).

Patch is against 2.5.2.

-- Pete

--- linux/arch/i386/kernel/bluesmoke.c.orig	Tue Jan 29 12:04:46 2002
+++ linux/arch/i386/kernel/bluesmoke.c	Tue Jan 29 12:04:48 2002
@@ -40,21 +40,21 @@ static void intel_machine_check(struct p
 			high&=~(1<<31);
 			if(high&(1<<27))
 			{
 				rdmsr(MSR_IA32_MC0_MISC+i*4, alow, ahigh);
 				printk("[%08x%08x]", alow, ahigh);
 			}
 			if(high&(1<<26))
 			{
 				rdmsr(MSR_IA32_MC0_ADDR+i*4, alow, ahigh);
 				printk(" at %08x%08x",
-					high, low);
+					ahigh, alow);
 			}
 			printk("\n");
 			/* Clear it */
 			wrmsr(MSR_IA32_MC0_STATUS+i*4, 0UL, 0UL);
 			/* Serialize */
 			wmb();
 		}
 	}

 	if(recover&2)


2002-01-29 18:39:02

by Dave Jones

Subject: Re: [patch] typo in i386 machine check code

On Tue, 29 Jan 2002, Pete Wyckoff wrote:

> kernel: CPU 3: Machine Check Exception: 0000000000000007
> kernel: Bank 0: b678600022000800 at 3678600022000800
>
> The part after the "at" is supposed to be the memory address which
> was being accessed when the fault was detected. Instead the code
> prints out the status field again (with the high bit removed for
> no apparent reason).
> Patch is against 2.5.2.

Patch is correct. I pushed the same fix to Marcelo & Linus
about a month back. Alan, 2.2 also needs this. (Can't remember if
I told you, or you told me 8-)

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-01-29 19:24:18

by Dave Jones

Subject: Re: [patch] typo in i386 machine check code

On Tue, 29 Jan 2002, Alan Cox wrote:

> > Patch is correct. I pushed the same fix to Marcelo & Linus
> > about a month back. Alan, 2.2 also needs this. (Can't remember if
> > I told you, or you told me 8-)
> You know where to send the diff

It was the previous message in this thread 8-)
Bounced it your way again just in case..

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-01-29 19:59:29

by Alan Cox

Subject: Re: [patch] typo in i386 machine check code

> Patch is correct. I pushed the same fix to Marcelo & Linus
> about a month back. Alan, 2.2 also needs this. (Can't remember if
> I told you, or you told me 8-)

You know where to send the diff