2007-12-16 01:40:13

by Matthias Schniedermeyer

[permalink] [raw]
Subject: Strange Memory Corruption Problem with Core2Duo E6700, P965 Chipset MB and >=4GB RAM

Hi



Yesterday i upgraded an 1 year old System from 4x1GB (32Bit, No
Memory-Remap) to 4x2GB (64Bit, Memory-Remap)

Today i due to a lucky coincidence i discovered that i have a memory
corruption problem.

This problem happens only with at least 4GB RAM and Memory-Remap. It
happens with any 2 of the 4x2GB RAM-Modules and regardless of slot or
dual or single channel configuration.

After some trying i found a way to "catch" the error. I wrote a small
perl script to fill a tmpfs with 1MB zero-files. Then i MD5 all the
files and discard all with the MD5-sum of a 1MB zero-file.

Here is what hexdump says about the file-content of the "bad"-file when
it is called several times.

:/misc/badram> hexdump bad
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
00d50a0 0000 84cf 1fff fd0f 0000 0000 0000 0000
00d50b0 0000 0000 0000 0000 0000 0000 0000 0000
*
00d90a0 0000 161f 1fff 43ff 0000 0000 0000 0000
00d90b0 0000 0000 0000 0000 0000 0000 0000 0000
00d90c0 0001 0000 0000 0000 0000 0000 0000 0000
00d90d0 0000 0000 0000 0000 0000 0000 0000 0000
*
0100000
:/misc/badram> hexdump bad
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
00d50a0 0000 84cf 1fff fd0f 0000 0000 0000 0000
00d50b0 0000 0000 0000 0000 0000 0000 0000 0000
*
00d90a0 0000 e59d 1fff 43bf 0000 0000 0000 0000
00d90b0 0000 0000 0000 0000 0000 0000 0000 0000
00d90c0 0001 0000 0000 0000 0000 0000 0000 0000
00d90d0 0000 0000 0000 0000 0000 0000 0000 0000
*
0100000
:/misc/badram> hexdump bad
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
00d50a0 0000 84cf 1fff fd0f 0000 0000 0000 0000
00d50b0 0000 0000 0000 0000 0000 0000 0000 0000
*
00d90a0 0000 f7fb 1fff 43ff 0000 0000 0000 0000
00d90b0 0000 0000 0000 0000 0000 0000 0000 0000
00d90c0 0001 0000 0000 0000 0000 0000 0000 0000
00d90d0 0000 0000 0000 0000 0000 0000 0000 0000
*
0100000
:/misc/badram> hexdump bad
0000000 0000 0000 0000 0000 0000 0000 0000 0000
*
00d50a0 0000 84cf 1fff ff03 0000 0000 0000 0000
00d50b0 0000 0000 0000 0000 0000 0000 0000 0000
*
00d90a0 0000 df3b 1fff 43ff 0000 0000 0000 0000
00d90b0 0000 0000 0000 0000 0000 0000 0000 0000
00d90c0 0001 0000 0000 0000 0000 0000 0000 0000
00d90d0 0000 0000 0000 0000 0000 0000 0000 0000
*
0100000

Strange is that some parts are static and other parts change.

Kernel is 2.6.23.11.
Kernel is 64Bit/x86-64, Userspace is 32bit. I only need the RAM for a
huge ramdisk, so i don't need 64bit userspace.

Can anybody help me with this problem?





Bis denn

--
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.


2007-12-17 01:16:04

by Matthias Schniedermeyer

[permalink] [raw]
Subject: Re: Strange Memory Corruption Problem with Core2Duo E6700, P965 Chipset MB and >=4GB RAM

On 16.12.2007 02:39, Matthias Schniedermeyer wrote:
> Hi

It appears i found my culprit. <Knocking on wood> :-)

I have a ISDN-controller which is driven by the HFC-PCI-driver and it
appears to not be 64bit-safe.

I will test more thorough after i have relocated the card to another
computer.





Bis denn

--
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.