2002-11-25 19:13:24

by Rasmus Andersen

Subject: Linux 2.2.23-rc2 & an MCE

Hi,

I just had an MCE on my aging PPro 200. Before I go out to
buy a replacement I would like to hear if it could be
caused by anything other than the CPU. Googling a bit
gave some indications that sometimes other HW might report
failure through this method.

The MCE (hand copied):

Machine Check Exception: 000000000000004
Bank 4: b200000000040151
Kernel panic: CPU context corrupt

Regards,
Rasmus



2002-11-28 10:58:04

by Pavel Machek

Subject: Re: Linux 2.2.23-rc2 & an MCE

Hi!

> I just had an MCE on my aging PPro 200. Before I go out to
> buy a replacement I would like to hear if it could be
> caused by anything other than the CPU. Googling a bit
> gave some indications that sometimes other HW might report
> failure through this method.
>
> The MCE (hand copied):
>
> Machine Check Exception: 000000000000004
> Bank 4: b200000000040151
> Kernel panic: CPU context corrupt

Isn't it trying to tell you about bad RAM?



--
Worst form of spam? Adding advertisement signatures à la sourceforge.net.
What goes next? Inserting advertisements *into* email?

2002-11-29 00:59:13

by Felipe W Damasio

Subject: Re: Linux 2.2.23-rc2 & an MCE

Pavel Machek wrote:
>>The MCE (hand copied):
>>
>>Machine Check Exception: 000000000000004
>>Bank 4: b200000000040151
>>Kernel panic: CPU context corrupt
>
> Isn't it trying to tell you about bad RAM?

Could be, though this looks like an instruction fetch error from the
level 1 cache, doesn't it? If so, it could be caused by a faulty processor.

Is this the first time it happened? Could you please check your logs
and send any more MCE error codes?

Thanks,

Felipe
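
For reference, the reading above can be checked by hand. What follows is a minimal Python sketch, not the code from parsemce.c. It assumes the "Bank 4" value is an IA32_MCi_STATUS register laid out as in Intel's machine-check architecture documentation, and it only handles the cache hierarchy error form (0000 0001 RRRR TTLL). The function and table names are made up for illustration.

```python
# Sketch: decode an MCi_STATUS value as printed in a "Bank N: ..." line.
# All names here are illustrative; they do not come from parsemce.c or the kernel.

# Request type (bits 7:4 of the MCA error code)
RRRR = {0: "ERR (generic)", 1: "RD (generic read)", 2: "WR (generic write)",
        3: "DRD (data read)", 4: "DWR (data write)",
        5: "IRD (instruction fetch)", 6: "PREFETCH", 7: "EVICT", 8: "SNOOP"}
# Transaction type (bits 3:2) and cache level (bits 1:0)
TT = {0: "instruction", 1: "data", 2: "generic"}
LL = {0: "level 0", 1: "level 1", 2: "level 2", 3: "generic"}

# Architectural flag bits in the upper part of the status register
FLAGS = [(63, "VAL"), (62, "OVER"), (61, "UC"), (60, "EN"),
         (59, "MISCV"), (58, "ADDRV"), (57, "PCC")]


def decode_bank_status(status):
    """Split a 64-bit bank status value into flags and error-code fields."""
    result = {
        "flags": [name for bit, name in FLAGS if (status >> bit) & 1],
        "mca_code": status & 0xFFFF,                 # bits 15:0
        "model_specific": (status >> 16) & 0xFFFF,   # bits 31:16
    }
    code = result["mca_code"]
    if (code >> 8) == 0x01:  # matches 0000 0001 RRRR TTLL
        result["kind"] = "cache hierarchy"
        result["request"] = RRRR.get((code >> 4) & 0xF, "reserved")
        result["transaction"] = TT.get((code >> 2) & 0x3, "reserved")
        result["level"] = LL[code & 0x3]
    else:
        result["kind"] = "not decoded by this sketch"
    return result


if __name__ == "__main__":
    for key, value in decode_bank_status(0xB200000000040151).items():
        print(key, "=", value)
```

For the value in this thread the sketch reports the flags VAL, UC, EN and PCC, and an MCA error code of 0x0151: an instruction fetch from the level 1 instruction cache. The PCC flag (processor context corrupt) lines up with the "CPU context corrupt" panic message.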

2002-11-29 06:10:47

by Rasmus Andersen

Subject: Re: Linux 2.2.23-rc2 & an MCE

On Thu, Nov 28, 2002 at 11:03:04PM +0000, Felipe W Damasio wrote:
> Pavel Machek wrote:
> >>The MCE (hand copied):
> >>
> >>Machine Check Exception: 000000000000004
> >>Bank 4: b200000000040151
> >>Kernel panic: CPU context corrupt
> >
> > Isn't it trying to tell you about bad RAM?
>
> Could be, though this looks like an instruction fetch error from the
> level 1 cache, doesn't it? If so, it could be caused by a faulty processor.
>
> Is this the first time it happened? Could you please check your logs
> and send any more MCE error codes?

Hi,

I have nothing in my logs but have had three more crashes since
my first report. Two of them I couldn't inspect since I was at
work (the machine is at home) and had my girlfriend boot the box,
but the last one was identical to the reported one.

I am getting a new processor now and hope that'll do it.

Thanks for your comments,
Rasmus



2002-11-29 09:14:52

by Felipe W Damasio

Subject: Re: Linux 2.2.23-rc2 & an MCE

Rasmus Andersen wrote:
> I have nothing in my logs but have had three more crashes since
> my first report. Two of them I couldn't inspect since I was at
> work (the machine is at home) and had my girlfriend boot the box,
> but the last one was identical to the reported one.
>
> I am getting a new processor now and hope that'll do it.

Since the MCE code is reporting an instruction fetch error from the
level 1 cache, it could be a bad RAM problem...

Could you try running memtest86 on your memory module(s) first (maybe
in a different machine)?

Kind Regards,

Felipe

2002-12-02 02:35:19

by Randy.Dunlap

Subject: Re: Linux 2.2.23-rc2 & an MCE

| > I just had an MCE on my aging PPro 200. Before I go out to
| > buy a replacement I would like to hear if it could be
| > caused by anything other than the CPU. Googling a bit
| > gave some indications that sometimes other HW might report
| > failure through this method.
| >
| > The MCE (hand copied):
| >
| > Machine Check Exception: 000000000000004
| > Bank 4: b200000000040151
| > Kernel panic: CPU context corrupt

Rasmus,

If you haven't already done so, you should check out the
MCE decoder from Dave Jones at
http://www.codemonkey.org.uk/cruft/parsemce.c/

--
~Randy

2002-12-02 09:19:24

by Rasmus Andersen

Subject: Re: Linux 2.2.23-rc2 & an MCE

On Sun, Dec 01, 2002 at 06:39:41PM -0800, Randy.Dunlap wrote:
> | > Machine Check Exception: 000000000000004
> | > Bank 4: b200000000040151
> | > Kernel panic: CPU context corrupt
>
> Rasmus,
>
> If you haven't already done so, you should check out the
> MCE decoder from Dave Jones at
> http://www.codemonkey.org.uk/cruft/parsemce.c/
>

That gave me:

Status: (4) Machine Check in progress.
Restart IP invalid.

Not sure what to make of that, though. Further comments
always welcome. And thanks for the pointer.

Regards,
Rasmus
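
On what to make of that output: it appears to describe only the first line of the panic, the global status value 4, not the Bank 4 value. A minimal Python sketch of that decoding follows. It assumes the value is the IA32_MCG_STATUS register with its three architectural low bits (RIPV, EIPV, MCIP). The function name is hypothetical and this is not the parsemce.c code.

```python
# Sketch: decode the global machine-check status printed after
# "Machine Check Exception:". The function name is illustrative only.

def decode_mcg_status(value):
    """Return the three architectural low bits of an MCG_STATUS value."""
    return {
        "RIPV": bool(value & 0x1),  # restart IP valid
        "EIPV": bool(value & 0x2),  # error IP valid
        "MCIP": bool(value & 0x4),  # machine check in progress
    }


def describe_mcg_status(value):
    """Build human-readable lines in the spirit of the decoder output."""
    bits = decode_mcg_status(value)
    lines = []
    if bits["MCIP"]:
        lines.append("Machine check in progress.")
    lines.append("Restart IP valid." if bits["RIPV"] else "Restart IP invalid.")
    if bits["EIPV"]:
        lines.append("Error IP valid.")
    return lines


if __name__ == "__main__":
    for line in describe_mcg_status(0x4):
        print(line)
```

With the value 4, only MCIP is set. RIPV is clear, so the processor is saying it cannot reliably resume at the interrupted instruction, which fits a panic rather than a logged and ignored event. It does not say which component failed; that information is in the bank status value.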

