2005-10-19 16:54:25

by Ken Moffat

Subject: Re: 2.6.13.4 After increasing RAM, I'm getting Bad page state at prep_new_page

On Thu, 20 Oct 2005, Steve Youngs wrote:

>
> It's not restricted to any one process, I've seen it in a number of
> different processes: Mozilla, sendmail, postmaster (pgsql). Of
> course, the first thing I thought of was that I'd been sold some dodgy
> RAM. But I've run memtest86 (version 3.2) over the RAM and no errors
> were found.
>

Steve,

This is almost certainly a hardware problem. I'm not saying that the
RAM is actually defective; it could be that the motherboard doesn't
reliably support that much memory, or even that the power supply is weak.

I prefer to use memtest86+ for recent hardware, but I'm sure
memtest86 can find errors if you give it long enough (on a 1.8GHz
athlon64 with a mere 2GB of memory, several hours were needed - the
memory was good, but the mobo couldn't drive that much at full speed).
I think some of the tests in memtest86 are marked as 'optional'; you
really want to run all of the tests if in doubt, and probably overnight.

3GB sounds an awful lot for an athlon - 2x1GB and 2x512MB, I suppose.
I would not be surprised to hear that a consumer-grade mobo has
difficulties. Bitter experience has taught me that it isn't a good idea
to fill a mobo with more memory than was reasonably envisaged when it
was designed - sure, the manual probably says it can take it, but linux
works it hard. Remember that the windows world thought 1GB was a lot of
memory until recently.

Of course, if it's a PSU problem related to excessive power to memory +
disk(s) + graphics card, memtest86 is unlikely to trigger it.

Ken
--
das eine Mal als Tragödie, das andere Mal als Farce

2005-10-19 17:00:24

by Hugh Dickins

Subject: Re: 2.6.13.4 After increasing RAM, I'm getting Bad page state at prep_new_page

On Thu, 20 Oct 2005, Steve Youngs wrote:
>
> A few days ago I increased my RAM from 0.5Gb to 3Gb and since then
> I've been getting `Bad page state at prep_new_page' errors at odd
> times. Here is a typical backtrace from my logs:
>
> Bad page state at prep_new_page (in process 'X', page c1f7bde0)
> flags:0x80000004 mapping:00000000 mapcount:-262144 count:0

It does look like bad memory, the single bit 0x40000 has got cleared
from the 0xffffffff which represents the expected mapcount 0 (for
reasons I won't go into, physical -1 represents logical 0 there).

If it were 0x800 which was cleared, I'd get excited, because that
would fit with a report from a few months back, which really did
not seem to be bad memory. But 0x40000 isn't so interesting, sorry!
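
The single-bit reading of the logged mapcount can be checked with a few
lines of Python. This is only a sketch: it assumes a 32-bit counter and
that the stored value is the reported value minus one (the "physical -1
represents logical 0" convention mentioned above). The helper names are
made up for illustration.

```python
# Sketch: recover the raw 32-bit counter from the reported mapcount and
# see which bits differ from the expected all-ones pattern.
# Assumptions: 32-bit counter; stored value = reported value - 1.

MASK32 = 0xFFFFFFFF

def stored_mapcount(reported: int) -> int:
    """Raw 32-bit value behind a reported (logical) mapcount."""
    return (reported - 1) & MASK32

def cleared_bits(raw: int, expected: int = MASK32) -> int:
    """Bits that are set in the expected value but clear in the raw one."""
    return expected & ~raw & MASK32

raw = stored_mapcount(-262144)   # value from the "Bad page state" line
diff = cleared_bits(raw)

print(hex(raw))                  # 0xfffbffff
print(hex(diff))                 # 0x40000
print(bin(diff).count("1"))      # 1, i.e. exactly one bit was cleared
```

The same helper could be run over the mapcount from each "Bad page
state" message to see whether the same bit keeps turning up.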

The bad memory in question (the struct page at 0xc1f7bde0) is quite
low down, just below 32MB. Would I be right to guess that you
inserted the new cards in such a way that the low memory is new RAM?
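
The "just below 32MB" figure follows from the struct page address,
assuming a 32-bit x86 kernel with the usual 3G/1G split, where lowmem
virtual addresses are the physical address plus 0xC0000000. That offset
is an assumption here, not something stated in the log. A rough sketch:

```python
# Sketch: translate the lowmem virtual address of the struct page into a
# physical address. PAGE_OFFSET is an assumption (default 3G/1G split on
# 32-bit x86); a kernel built with a different split needs another value.

PAGE_OFFSET = 0xC0000000

def lowmem_phys(virt: int) -> int:
    """Physical address for a directly mapped (lowmem) virtual address."""
    return virt - PAGE_OFFSET

phys = lowmem_phys(0xC1F7BDE0)   # struct page address from the log
print(hex(phys))                 # 0x1f7bde0
print(phys / 2**20)              # a little under 32 (MB)
```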

I suggest you try taking out that lowest card, and see what happens
then. Sometimes the kernel these days seems to find memory problems
that memtest86 does not (how long did you run it? overnight?).

You could try sending me all your "Bad page state" messages,
to check for correlations.

Hugh

2005-10-21 21:44:26

by Ken Moffat

Subject: Re: 2.6.13.4 After increasing RAM, I'm getting Bad page state at prep_new_page

On Sat, 22 Oct 2005, Steve Youngs wrote:

>
> I gave memtest86+ a shot, and after about 18 hours it came up with...
>
> Test: 8
> Pass: 7
> Failing Address: 00008072bf0 - 128.1MB
> Good: 00000000
> Bad: 00000100
> Err-Bits: 00000100
> Count: 1
>
> > 3GB sounds an awful lot for an athlon - 2x1GB and 2x512MB, I suppose.
>
> 3x1GB
>

At least the problem showed up, so the load on the power supply is not
a prime concern.
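
For what it's worth, the quoted report looks like a single flipped bit.
Assuming the Err-Bits field is just Good XOR Bad (which matches the
values quoted above), a quick sketch to list the failing bit positions:

```python
# Sketch: find which bits differ between the expected and observed
# values in the quoted memtest86+ report.
# Assumption: Err-Bits is the XOR of the Good and Bad values.

good = 0x00000000
bad = 0x00000100

err_bits = good ^ bad
positions = [i for i in range(32) if (err_bits >> i) & 1]

print(hex(err_bits))   # 0x100
print(positions)       # [8]
```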

If you have the patience, first take out one of the 'good' sticks and
repeat with 2x1GB. If that works, it's probable the mobo can't drive
3x1GB, at least with the chip arrangement on those particular sticks.
OTOH, if it still fails at that address, perhaps that one stick is
suspect - in that case try swapping it and retesting.

Of course, if your manual is unclear about which slot maps where, you
might have to try permutations of 2 sticks with this approach. And that's
before messing with obscure and poorly-explained BIOS options to control
the memory timing and drive. Good Luck!

Ken
--
das eine Mal als Tragödie, das andere Mal als Farce