2001-04-06 14:07:57

by majer

Subject: memory allocation problems



Hi. I'm hoping someone on here can help me out. I posted something
similar to this back in June 2000 when I was on the 2.2.X line and
was waiting to see if the 2.4 kernel would provide a fix.

Essentially, the problem can be summarized to be that on a machine
with ample ram (2G, 4G, etc), I am unable to malloc a gig if I ask
for the memory in small ( <= 128k) chunks. I've enclosed some results
and a little program which was put together to demonstrate the problems
we're having. All of the failures seem to occur around 930MB.

I'm more than happy to try any tunings, patches, etc and my
time is at your disposal.

Thanks,

Karl

----
Karl Majer [email protected]
Sr Systems Architect 617 577 7999 xt 251

"Think for yourselves and let others enjoy the privilege to do so, too."
--Voltaire


Attachments:
results (1.40 kB)
memgrab.c (836.00 B)

2001-04-06 15:23:04

by Andi Kleen

Subject: Re: memory allocation problems

On Fri, Apr 06, 2001 at 10:06:47AM -0400, [email protected] wrote:
> Essentially, the problem can be summarized to be that on a machine
> with ample ram (2G, 4G, etc), I am unable to malloc a gig if I ask
> for the memory in small ( <= 128k) chunks. I've enclosed some results
> and a little program which was put together to demonstrate the problems
> we're having. All of the failures seem to occur around 930MB.

It's bumping against some mapping (just do system("cat /proc/self/maps")
on allocation failure to see which). Usual suspects are shared libraries.
One possible solution is to upgrade to a newer glibc, the 2.2 glibc malloc
should handle this case better.
A way to get mappings like shared libraries out of the way is to
increase the value of TASK_UNMAPPED_BASE in include/asm-i386/processor.h.
For that the kernel needs to be recompiled, and the new value should be
smaller than TASK_SIZE minus enough space for your shared libraries. With
that even the older malloc will probably work.


-Andi

2001-04-06 16:58:55

by Wayne Whitney

Subject: Re: memory allocation problems

In mailing-lists.linux-kernel, you wrote:

> Essentially, the problem can be summarized to be that on a machine
> with ample ram (2G, 4G, etc), I am unable to malloc a gig if I ask
> for the memory in small ( <= 128k) chunks.

Take a look at this message by Szabolcs Szakacsits:

http://marc.theaimsgroup.com/?l=linux-kernel&m=97898653909227&w=2

There are other messages that may be of interest to you in that
thread, although they are spread out in a large thread.

Briefly, malloc in glibc will use brk() for "small" chunks and mmap()
for "large" chunks. On a usual i386 linux kernel, the 4GB address
space is set up so that brk() can get at most 870MB or so and mmap()
can get at most 2GB. Newer glibc's allow you to tune the definition
of "small" via an environment variable.

Cheers,
Wayne

2001-04-06 19:23:21

by Mark Hahn

Subject: Re: memory allocation problems

> can get at most 2GB. Newer glibc's allow you to tune the definition
> of "small" via an environment variable.

eventually, perhaps libc will be smart enough to create
more arenas in mmaped space once sbrk fails. note, though,
that you *CAN* actually malloc a lot more than 1G: you just
have to avoid causing mmaps that chop your VM at TASK_UNMAPPED_BASE:

#include <unistd.h>
#include <stdlib.h>

/* Print n in decimal using write(2) only, so stdio never mmap()s a buffer. */
void printnumber(unsigned n) {
    char number[20];
    int i = sizeof(number);

    do {    /* do/while so that 0 is printed as "0" */
        number[--i] = '0' + (n % 10);
        n /= 10;
    } while (n && i > 0);
    write(1, number + i, sizeof(number) - i);
}

int main(void) {
    unsigned total = 0;
    const unsigned size = 32*1024;

    /* Deliberately leak 32k chunks until malloc() fails. */
    while (malloc(size)) {
        total += size;
        printnumber(total >> 20);    /* running total in MB */
        write(1, "\n", 1);
    }
    return 0;
}

compile -static, of course; printnumber is to avoid stdio, which seems
to use mmap for a small scratch buffer. I allocated 2942 MB on my 128M
machine (had to add a swapfile temporarily, since so many tiny mallocs
do touch nontrivial numbers of pages for arena bookkeeping.)

2001-04-06 19:42:05

by Wayne Whitney

Subject: Re: memory allocation problems

On Fri, 6 Apr 2001, Mark Hahn wrote:

> note, though, that you *CAN* actually malloc a lot more than 1G: you
> just have to avoid causing mmaps that chop your VM at
> TASK_UNMAPPED_BASE:

Neat trick. I didn't realize that you could avoid allocating the mmap()
buffers for stdin and stdout.

As was pointed out to me in January, another solution for i386 would be to
fix a maximum stack size and have the mmap() allocations grow downward
from the "top" of the stack (3GB - max stack size). I'm not sure why that
is not currently done.

I once wrote a tiny patch to do this, and ran it successfully for a couple
days, but knowing so little about the kernel I probably did it in a
completely wrong, inefficient way. For example, some of the vma
structures are sorted in increasing address order, and so perhaps to do
this properly one should change them to decreasing address order.

Cheers,
Wayne


2001-04-06 20:09:20

by Mark Hahn

Subject: Re: memory allocation problems

> > note, though, that you *CAN* actually malloc a lot more than 1G: you
> > just have to avoid causing mmaps that chop your VM at
> > TASK_UNMAPPED_BASE:
>
> Neat trick. I didn't realize that you could avoid allocating the mmap()
> buffers for stdin and stdout.

no one ever said you had to use stdio. or even use libc, for that matter!

> As was pointed out to me in January, another solution for i386 would be to
> fix a maximum stack size and have the mmap() allocations grow downward
> from the "top" of the stack (3GB - max stack size). I'm not sure why that
> is not currently done.

problems get fixed when there's some pain involved: people bumping
into a limit, or painfully bad code, etc. not enough people are
feeling any pain about the current design.

this (and the "move TASK_UNMAPPED_BASE" workaround) have been known
for years; I think someone even coded up a "grow vmareas down" patch
the last time we all discussed this.

> I once wrote a tiny patch to do this, and ran it successfully for a couple
> days, but knowing so little about the kernel I probably did it in a
> completely wrong, inefficient way. For example, some of the vma
> structures are sorted in increasing address order, and so perhaps to do
> this properly one should change them to decreasing address order.

oh, I guess you did the patch ;)
seriously, resubmit it when 2.5 opens up. the fact is that we currently
have two things that grow up, and one that grows down. so obviously,
one up-grower must have an arbitrary limit. switching vma's to down-growing
is a good solution, since it's actually *good* to limit stack growth.
I wonder whether fortraners still put all their data on the stack;
they wouldn't be happy ;)

a simple workaround would be to turn TASK_UNMAPPED_BASE into a variable,
either system-wide or thread-specific (like ia64 already has!). that's
compatible with the improved vmas-down approach, too.

regards, mark hahn.

2001-04-06 21:21:23

by Hugh Dickins

Subject: Re: memory allocation problems

On Fri, 6 Apr 2001, Wayne Whitney wrote:
>
> As was pointed out to me in January, another solution for i386 would be to
> fix a maximum stack size and have the mmap() allocations grow downward
> from the "top" of the stack (3GB - max stack size). I'm not sure why that
> is not currently done.

I'd be interested in the answer to that too. Typically, the memory
layout has ELF text at the lowest address, starting at 0x08048000 -
which is a curious place to put it, until you realize that if you
place the stack below it, you can use (in a typical small program)
just one page table for stack + text + data (then another for mmaps
and shared libs from 3GB down): two page tables instead of the present three.

Hugh