2002-01-08 23:06:48

by Harald Holzer

Subject: RE: i686 SMP systems with more then 12 GB ram with 2.4.x kernel, cache buffer bug ?

low memory problem:

A server with 32 GB RAM, RH 7.2 and kernel 2.4.17rc2aa2.
After a lot of disk access the system slows down and then dies,
because it runs out of low memory.

The last kernel log line is:
"kernel: __alloc_pages: 0-order allocation failed (gfp=0x70/0)"

On kernels other than 2.4.17rc2aa2 the OOM killer kicks in, or the
system simply stops responding without any messages.

It looks as if the buffer_heads fill up the low memory, whether or
not there is sufficient low memory available, as long as there is
sufficient high memory for caching.
The kernel seems to do a good job of releasing dcache and icache,
but the buffer_heads then fill up the released memory.

(With 16 GB RAM the system runs. :-)
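
A rough sanity check of why 16 GB survives and 32 GB does not, using the
104-byte buffer_head and 4k cache block sizes quoted further down in this
mail. This is only a back-of-envelope sketch: it assumes one buffer_head
per 4k block and that all of RAM ends up as cache, and the function name
is made up for illustration.

```python
# Back-of-envelope estimate, not a measurement.
# Assumption: one buffer_head per 4k cache block.
BUFFER_HEAD_BYTES = 104    # buffer_head size quoted in this mail
CACHE_BLOCK_BYTES = 4096   # 4k cache blocks, as quoted in this mail
MIB = 1024 * 1024


def buffer_head_mib(cached_mib):
    """Low memory (MiB) used by the buffer_heads describing cached_mib of cache."""
    blocks = cached_mib * MIB // CACHE_BLOCK_BYTES
    return blocks * BUFFER_HEAD_BYTES / MIB


for ram_gib in (16, 32):
    heads = buffer_head_mib(ram_gib * 1024)
    print(f"{ram_gib} GB cached -> about {heads:.0f} MiB of buffer_heads")
```

Under these assumptions 16 GB of cache needs about 416 MiB of
buffer_heads and 32 GB about 832 MiB, which is close to all of the
856 MB of low memory measured on the 4 GB test system below.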

This problem is not specific to such big systems, but it seems that
only there does it hit you immediately.

I rechecked this on a system with 4 GB. After allocating 784 MB of the
856 MB of low memory with kmalloc, I started copying some files,
and the system died with the same problem.


Can someone help me fix this problem?


If someone wants to check this problem, I have attached a file, wastemem.c.
With this module you can allocate low memory.
Compile it with "gcc -c wastemem.c" and load it with
"insmod -f wastemem.o count=49000".
count is the number of 16k blocks to allocate (49000 = 784 MB).
(Be warned: don't kill your working system ;-)

After a reboot, make sure that only a few buffers are in use.
Allocate enough low memory so that the 104-byte buffer_head blocks
fill up your low memory before the 4k cache blocks fill up the high memory.
Then do some disk access to run the system out of memory.
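
To pick a count for your own machine, the condition above can be checked
with a small calculation. This is a hypothetical helper, not part of
wastemem.c. It assumes one buffer_head per 4k cache block and counts
1 MB as 1000 kB, to match the "49000 = 784 MB" figure. The 3200 MB of
high memory in the example is an assumed value, not one measured on the
4 GB test system.

```python
# Hypothetical sizing helper for the reproduction; not part of wastemem.c.
BUFFER_HEAD_BYTES = 104    # buffer_head size quoted in this mail
CACHE_BLOCK_BYTES = 4096   # 4k cache blocks
WASTE_BLOCK_KB = 16        # wastemem allocates 16k blocks


def wasted_mb(count):
    """Low memory (MB) pinned by wastemem; 1 MB = 1000 kB as in the mail."""
    return count * WASTE_BLOCK_KB / 1000


def heads_mb(high_mb):
    """buffer_head memory (MB) needed to fill high_mb of high memory with cache."""
    return high_mb * BUFFER_HEAD_BYTES / CACHE_BLOCK_BYTES


def low_mem_fills_first(low_mb, high_mb, count):
    """True if the remaining low memory runs out before high memory is full."""
    return low_mb - wasted_mb(count) < heads_mb(high_mb)


# 856 MB of low memory as on the 4 GB test system; 3200 MB high memory is assumed.
print(wasted_mb(49000))
print(low_mem_fills_first(856, 3200, 49000))
```

With count=49000 about 72 MB of low memory is left, while filling the
assumed 3200 MB of high memory would take about 81 MB of buffer_heads,
so low memory should run out first.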


Harald Holzer


Attachments:
wastemem.c (1.89 kB)

2002-01-08 23:29:59

by M. Edward Borasky

Subject: RE: i686 SMP systems with more then 12 GB ram with 2.4.x kernel, cache buffer bug ?

On 9 Jan 2002, Harald Holzer wrote:

> low memory problem:
>
> A server with 32 GB RAM, RH 7.2 and kernel 2.4.17rc2aa2.
> After a lot of disk access the system slows down and then dies,
> because it runs out of low memory.
>
> The last kernel log line is:
> "kernel: __alloc_pages: 0-order allocation failed (gfp=0x70/0)"
>
> On kernels other than 2.4.17rc2aa2 the OOM killer kicks in, or the
> system simply stops responding without any messages.
>
> It looks as if the buffer_heads fill up the low memory, whether or
> not there is sufficient low memory available, as long as there is
> sufficient high memory for caching.
> The kernel seems to do a good job of releasing dcache and icache,
> but the buffer_heads then fill up the released memory.

In terms of "control knobs", would a limit on page cache size imply a
limit on "buffer_heads", or do we really need the control knob on
"buffer_heads" and not on the page cache? Or would we need both?

--
M. Edward "buffer head" Borasky

http://www.borasky-research.net

What phrase will you *never* hear Candice Bergen use?
"My daddy didn't raise no dummies!"

2002-01-09 01:25:46

by Rik van Riel

Subject: RE: i686 SMP systems with more then 12 GB ram with 2.4.x kernel, cache buffer bug ?

On Tue, 8 Jan 2002, M. Edward (Ed) Borasky wrote:

> > The kernel seems to do a good job of releasing dcache and icache,
> > but the buffer_heads then fill up the released memory.
>
> In terms of "control knobs", would a limit on page cache size imply a
> limit on "buffer_heads", or do we really need the control knob on
> "buffer_heads" and not on the page cache? Or would we need both?

We can just remove the buffer heads from the page cache
pages without any problem (except that on writeback we
have to look up where exactly the page should be written
to on disk).

regards,

Rik
--
"Linux holds advantages over the single-vendor commercial OS"
-- Microsoft's "Competing with Linux" document

http://www.surriel.com/ http://distro.conectiva.com/