2008-02-01 15:47:56

by Gene Heskett

[permalink] [raw]
Subject: log spamming

Greetings;

I just rebooted to a new config of 2.6.24, basically trying to strip out the
building of modules I don't use. And I enabled a couple of checks that
weren't checked in the kernel-hacking menu. .config posted on request.

Now the messages log is being spammed at 2-5 second intervals by these:
Feb 1 10:41:08 coyote kernel: [ 3085.501037] MCE: The hardware reports a non
fatal, correctable incident occurred on CPU 0.
Feb 1 10:41:08 coyote kernel: [ 3085.501042] Bank 1: d400400000000152
Feb 1 10:41:08 coyote kernel: [ 3085.501045] MCE: The hardware reports a non
fatal, correctable incident occurred on CPU 0.
Feb 1 10:41:08 coyote kernel: [ 3085.501048] Bank 2: d40040000000017a

Always the same 2 addresses. Is this telling me I should be running memtest86
for a couple of cycles?

Thanks.

--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
"It is the creationists who blasphemously are claiming that God is cheating
us in a stupid way."
-- J. W. Nienhuys


2008-02-01 20:47:56

by Chris Snook

[permalink] [raw]
Subject: Re: log spamming

Gene Heskett wrote:
> Greetings;
>
> I just rebooted to a new config of 2.6.24, basically trying to strip out the
> building of modules I don't use. And I enabled a couple of checks that
> weren't checked in the kernel-hacking menu. .config posted on request.
>
> Now the messages log is being spammed at 2-5 second intervals by these:
> Feb 1 10:41:08 coyote kernel: [ 3085.501037] MCE: The hardware reports a non
> fatal, correctable incident occurred on CPU 0.
> Feb 1 10:41:08 coyote kernel: [ 3085.501042] Bank 1: d400400000000152
> Feb 1 10:41:08 coyote kernel: [ 3085.501045] MCE: The hardware reports a non
> fatal, correctable incident occurred on CPU 0.
> Feb 1 10:41:08 coyote kernel: [ 3085.501048] Bank 2: d40040000000017a
>
> Always the same 2 addresses. Is this telling me I should be running memtest86
> for a couple of cycles?

Those two addresses are in the same cache line, but they are *not* in the same
128-bit ECC block. This is probably a northbridge problem, not a RAM problem.
It's not necessarily a hardware problem. I wouldn't be surprised if you swapped
CPUs and still got the same result, due to BIOS misconfiguration.

-- Chris