[ Please CC me on answers since I'm not on the list ]
Hi,
I've been getting MCEs repeatedly today when trying to compile
2.5.75-bk1 on 2.5.75-bk1 (I didn't get any yesterday when I
built my first 2.5.75-bk1 kernel on a 2.4 kernel).
The MCE is always the same (I think) and reads like this:
CPU 0: Machine Check Exception: 0000000000000004
Bank 0: b600000000000135 at 000000000b99b9f0
Kernel panic: CPU context corrupt
Decoded with parsemce, this gives:
[nim@rousalka parse]$ ./parse -i < mce
CPU 0
Status: (4) Machine Check in progress.
Restart IP invalid.
parsebank(0): b600000000000135 @ b99b9f0
External tag parity error
CPU state corrupt. Restart not possible
Address in addr register valid
Error enabled in control register
Error not corrected.
Memory heirarchy error
Request: Generic error
Transaction type : Data
Memory/IO : Reserved
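For anyone who wants to cross-check the flag lines above by hand, here is a minimal sketch in Python. It assumes the two values follow the architectural IA32_MCG_STATUS / IA32_MCi_STATUS bit layout (RIPV/EIPV/MCIP in bits 0-2 of the global status; VAL, OVER, UC, EN, MISCV, ADDRV, PCC in bits 63-57 of the bank status). That layout is my assumption, not something taken from the parsemce source, and the helper names are made up for illustration. Model-specific fields are only split out, not interpreted.

```python
# Assumed bit positions (architectural machine-check layout); the helper
# names below are hypothetical and not part of parsemce.
MCG_BITS = {0: "RIPV", 1: "EIPV", 2: "MCIP"}
MCI_BITS = {63: "VAL", 62: "OVER", 61: "UC", 60: "EN",
            59: "MISCV", 58: "ADDRV", 57: "PCC"}


def set_flags(value, names):
    """Return the names of the bits set in value, highest bit first."""
    return [name for bit, name in sorted(names.items(), reverse=True)
            if (value >> bit) & 1]


def decode_bank_status(status):
    """Split a 64-bit bank status into flags and the two 16-bit code fields."""
    return {
        "flags": set_flags(status, MCI_BITS),
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


# Values copied from the panic message above.
mcg = set_flags(0x0000000000000004, MCG_BITS)
bank0 = decode_bank_status(0xb600000000000135)

print("global status flags:", mcg)
print("bank 0 flags:", bank0["flags"])
print("bank 0 MCA error code:", hex(bank0["mca_error_code"]))
```

Under that assumption, MCIP set with RIPV clear corresponds to "Machine Check in progress" and "Restart IP invalid", and UC, EN, ADDRV and PCC correspond to the "not corrected", "enabled", "address valid" and "CPU state corrupt" lines.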
I'd like some advice on what to do next. Is this a 2.5 bug? A
hardware problem only triggered in 2.5 because it exercises the hardware
in a different way? Should I change something in the system? If so,
should I change memory, CPU, PSU, something else?
I don't usually build 2.5 on 2.5, but then again yesterday was very hot and
the hardware might have suffered (even the best case cooling cannot do much
when the room temperature is 30+ °C).
Any hint will be welcome - this is my first MCE encounter.
Regards,
--
Nicolas Mailhot
On Sat, 12/07/2003 at 12:09, Nicolas Mailhot wrote:
> [ Please CC me on answers since I'm not on the list ]
>
> Hi,
>
> I've been getting MCEs repeatedly today when trying to compile
> 2.5.75-bk1 on 2.5.75-bk1 (I didn't get any yesterday when I
> built my first 2.5.75-bk1 kernel on a 2.4 kernel).
>
> The MCE is always the same (I think) and reads like this:
>
> CPU 0: Machine Check Exception: 0000000000000004
> Bank 0: b600000000000135 at 000000000b99b9f0
Well, looking at the logs, the MCE type is always the same but the actual
address changes:
/var/log/messages:4371:Jul 12 11:14:08 rousalka kernel: Bank 0: b67e800000000135 at 0000000004fc8678
/var/log/messages:4692:Jul 12 11:22:52 rousalka kernel: Bank 0: b607000000000135 at 0000000011b6e7f0
/var/log/messages:4982:Jul 12 11:29:49 rousalka kernel: Bank 0: b674000000000135 at 0000000017c029f0
/var/log/messages:5265:Jul 12 11:45:15 rousalka kernel: Bank 0: b600000000000135 at 000000000b99b9f0
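To make the comparison between these four entries explicit, here is a small Python sketch that pulls the status and address out of each log line and checks which parts stay constant. The field boundaries (flags in bits 63-57, error code in bits 15-0, everything in bits 56-32 treated as an opaque middle field) are an assumption based on the architectural bank status layout, and the names used are purely illustrative.

```python
import re

# The four entries from /var/log/messages, copied from this message.
LOG = """\
/var/log/messages:4371:Jul 12 11:14:08 rousalka kernel: Bank 0:
b67e800000000135 at 0000000004fc8678
/var/log/messages:4692:Jul 12 11:22:52 rousalka kernel: Bank 0:
b607000000000135 at 0000000011b6e7f0
/var/log/messages:4982:Jul 12 11:29:49 rousalka kernel: Bank 0:
b674000000000135 at 0000000017c029f0
/var/log/messages:5265:Jul 12 11:45:15 rousalka kernel: Bank 0:
b600000000000135 at 000000000b99b9f0
"""

# \s+ lets the pattern match whether or not the mailer wrapped the line.
PATTERN = re.compile(r"Bank (\d+):\s+([0-9a-f]{16}) at ([0-9a-f]{16})")


def parse(log):
    """Return (bank, status, address) tuples as integers."""
    return [(int(bank), int(status, 16), int(addr, 16))
            for bank, status, addr in PATTERN.findall(log)]


events = parse(LOG)

# Assumed field split: flags in bits 63..57, error code in bits 15..0.
flag_bits = {status >> 57 for _, status, _ in events}
error_codes = {status & 0xFFFF for _, status, _ in events}
middle_bits = [hex((status >> 32) & 0x1FFFFFF) for _, status, _ in events]
addresses = [hex(addr) for _, _, addr in events]

print("distinct flag patterns:", [bin(f) for f in flag_bits])
print("distinct error codes:", [hex(c) for c in error_codes])
print("bits 56..32 per event:", middle_bits)
print("addresses:", addresses)
```

With that split, the flag bits and the low 16-bit error code are identical in all four events, while the address and the middle bits of the status differ from one event to the next.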
What's the best course of action now?
--
Nicolas Mailhot