2001-11-27 16:45:41

by Peter Zaitsev

Subject: MMAP issues

Hello,

I'm trying to write a program which uses mmap aggressively to map
files (really it's used as a fail-safe memory allocator, to store data
so it survives if the application crashes).
I'm using the latest kernel, 2.4.16.
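
(Roughly, the per-file pattern is something like the sketch below - the
file name and the 4096-byte size are just illustrative placeholders.)

/* Minimal sketch of the fail-safe idea: back a small buffer with a file
 * via MAP_SHARED, so whatever is written there survives a crash.
 * "backing.dat" and the 4096-byte size are placeholders. */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = open("backing.dat", O_RDWR | O_CREAT, 0600);
    if (fd < 0 || ftruncate(fd, 4096) < 0)
        return 1;

    char *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (buf == MAP_FAILED)
        return 1;

    /* Anything written here ends up in backing.dat even if the
     * process dies later without cleaning up. */
    strcpy(buf, "state to recover after a failure");
    return 0;
}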

I've found a couple of problems.

1) I can mmap only about 64K files (4K-sized ones) and I can't find
where the limit is triggered. I can open a huge number of files (tried
500K), and I can map an even larger number of anonymous pages. Maybe
this limit is compiled into the kernel somehow and can be changed?
The error code returned is 12 - Cannot allocate memory (ENOMEM).

2) I see the speed degrade dramatically as the number of mapped
segments grows:
zetta:/home/pz/mmap # ./a.out
10000 Time: 7
20000 Time: 22
30000 Time: 38
40000 Time: 61
50000 Time: 78
60000 Time: 90

The first 10000 mmaps took only 7 seconds, whereas mapping 10000 files
after the first 50000 took 90 seconds. I used the same file associated
with many file descriptors to avoid disk-related speed issues. Anonymous
mappings in the same test run much faster, with almost no speed penalty,
as do the open calls. So the question is whether this can be tuned
somehow - a hash size increase or something.


3) It also looks like the speed degrades across successive runs of the
program:

1st
10000 Time: 7
20000 Time: 22
30000 Time: 38
40000 Time: 61
50000 Time: 78
60000 Time: 90

2nd
10000 Time: 7
20000 Time: 28
30000 Time: 49
40000 Time: 71
50000 Time: 92
60000 Time: 104

3rd
10000 Time: 9
20000 Time: 31
30000 Time: 52
40000 Time: 68
50000 Time: 87
60000 Time: 107





--
Best regards,
Peter mailto:[email protected]


2001-11-27 22:37:32

by Jeff Epler

[permalink] [raw]
Subject: Re: MMAP issues

The difference in runtime between successive runs of your program
doesn't look terribly significant.

You open 'fd' each time, and never close it. I die about 1000 mmap()s
into the process (-EMFILE returned by sys_open). You may be testing
Linux' performance with huge fd sets in your test as well.

Moving the open() outside the loop, and running on a 512M, kernel 2.2
machine that's also running a full gnome desktop I get really intense
kernel CPU usage, and the following output:
10000 Time: 12
20000 Time: 45
30000 Time: 79
40000 Time: 113
[and I got too bored to watch it go on]

Unmapping the page after each map yields much better results:

10000 Time: 4
20000 Time: 4
30000 Time: 4
40000 Time: 5
50000 Time: 4
60000 Time: 4
70000 Time: 5
80000 Time: 5
90000 Time: 5
100000 Time: 4
[etc]
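
(The change is roughly just the following inside the main loop of the
program at the end of this mail - sketched as a fragment of that loop:)

        /* map as before, then immediately unmap so the number of live
         * mappings in the process stays constant */
        p = mmap(addr, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            printf("Failed %p %d\n", (void *) addr, errno);
            return 1;
        }
        munmap(p, 4096);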

Interestingly, forcing the test to allocate at successively lower
addresses gives fast results until mmap fails (collided with a shared
library?):
10000 Time: 4
20000 Time: 4
30000 Time: 4
40000 Time: 4
50000 Time: 4
60000 Time: 4
Failed 0x60007000 12

So in kernel 2.2, it looks like some sort of linked list ordered by user
address is being traversed in order to complete the mmap() operation.
If so, then the O(N^2)-like behavior you saw in your original report is
explained as the expected performance of linux' mmap for a given # of
mappings.

Jeff

#include <stdio.h>
#include <unistd.h>
#include <time.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>

int main(void)
{
    int i = 0;
    void *p;
    time_t t;
    int fd;
    char *addr = (char *) 0x70000000;   /* start hint, stepped downward */

    fd = open("test.dat", O_RDWR);
    if (fd < 0) {
        puts("Unable to open file !");
        return 1;
    }
    t = time(NULL);
    while (1) {
        /* hint the kernel at successively lower addresses */
        p = mmap(addr, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            printf("Failed %p %d\n", (void *) addr, errno);
            return 1;
        }
        addr -= 4096;
        i++;
        if (i % 10000 == 0) {
            printf(" %d Time: %d\n", i, (int) (time(NULL) - t));
            t = time(NULL);
        }
    }
}

2001-11-28 09:44:18

by Peter Zaitsev

Subject: Re[2]: MMAP issues

Hello Jeff,

Wednesday, November 28, 2001, 1:36:51 AM, you wrote:

JE> The difference in runtime between successive runs of your program
JE> doesn't look terribly significant.

Yes. The interesting thing is that it's quite stable - I've repeated it
several times across reboots, and the time always grows over successive
runs. But when I waited a couple of hours and then repeated the test, it
showed good results again (the machine was completely idle all of this
time), so it looks like the VM does clean itself up over time.


JE> You open 'fd' each time, and never close it. I die about 1000 mmap()s
JE> into the process (-EMFILE returned by sys_open). You may be testing
JE> Linux' performance with huge fd sets in your test as well.
Yes. I'm running the same test my application does - many mmapped fds;
the only difference is that it uses different files. I've set ulimit and
/proc/sys/fs/file-max to be able to open 500,000 files, and this is
quite fast - it takes only a couple of seconds to open them all. So
large fd sets are not the issue for Linux.
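
(For reference, the per-process part of that limit can also be raised
from inside the test itself; a rough sketch, with 500000 just being the
value I happen to use:)

/* Sketch: raise the per-process open-file limit before the test runs.
 * The hard limit and /proc/sys/fs/file-max still cap what is allowed. */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl = { 500000, 500000 };

    if (setrlimit(RLIMIT_NOFILE, &rl) < 0)
        perror("setrlimit");
    return 0;
}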


JE> Moving the open() outside the loop, and running on a 512M, kernel 2.2
JE> machine that's also running a full gnome desktop I get really intense
JE> kernel CPU usage, and the following output:
JE> 10000 Time: 12
JE> 20000 Time: 45
JE> 30000 Time: 79
JE> 40000 Time: 113
JE> [and I got too bored to watch it go on]

Yes. This is quite expected, but it does not explain anything and
therefore becomes even more interesting - since MAP_ANONYMOUS is quite
fast even with a large number of mmaps, finding the hole in the address
space to map into does not take much time, so what does?
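
(By the anonymous comparison I mean roughly the loop below - same timing,
no file and no fd involved:)

/* Rough sketch of the MAP_ANONYMOUS version of the same timing loop. */
#include <errno.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>

int main(void)
{
    int i = 0;
    time_t t = time(NULL);

    while (1) {
        void *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            printf("Failed %d\n", errno);
            return 1;
        }
        i++;
        if (i % 10000 == 0) {
            printf(" %d Time: %d\n", i, (int) (time(NULL) - t));
            t = time(NULL);
        }
    }
}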


JE> Unmapping the page after each map yields much better results:

JE> 10000 Time: 4
JE> 20000 Time: 4
JE> 30000 Time: 4
JE> 40000 Time: 5
JE> 50000 Time: 4
JE> 60000 Time: 4
JE> 70000 Time: 5
JE> 80000 Time: 5
JE> 90000 Time: 5
JE> 100000 Time: 4
JE> [etc]

Yes. This shows that the speed degrades with the increasing number of
mapped segments, not with the number of mmap calls.

JE> Interestingly, forcing the test to allocate at successively lower
JE> addresses gives fast results until mmap fails (collided with a shared
JE> library?):
JE> 10000 Time: 4
JE> 20000 Time: 4
JE> 30000 Time: 4
JE> 40000 Time: 4
JE> 50000 Time: 4
JE> 60000 Time: 4
JE> Failed 0x60007000 12

You mean you calculated the free address yourself and called mmap with
it? If so, this is quite strange, as with anonymous mappings mmap is
able to find a free block quite fast.
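
(I.e. the difference between letting the kernel search and supplying the
address, roughly as in this illustration - "test.dat" and the 0x60000000
hint are just placeholders:)

/* Rough illustration of the two ways of picking the address:
 * the kernel searches for a free region, or the caller hints one. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    int fd = open("test.dat", O_RDWR);
    if (fd < 0)
        return 1;

    /* kernel chooses the address (as in the anonymous comparison) */
    void *a = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);

    /* caller supplies a hint (as in the downward-stepping test) */
    void *b = mmap((void *) 0x60000000, 4096, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, 0);

    printf("%p %p\n", a, b);
    return 0;
}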


JE> So in kernel 2.2, it looks like some sort of linked list ordered by user
JE> address is being traversed in order to complete the mmap() operation.
JE> If so, then the O(N^2)-like behavior you saw in your original report is
JE> explained as the expected performance of linux' mmap for a given # of
JE> mappings.

Well, but why is it different for mmapping a file versus anonymous mmap?

Also, I can show the same-looking distribution for Solaris 8/x86 (the
CPU is different, so I can't compare the raw numbers):

testserv:~ # ./a.out
10000 Time: 16
20000 Time: 53
30000 Time: 87
40000 Time: 120
50000 Time: 155
Failed 12

It's even able to allocate a smaller number of chunks :)

The bad thing is - no one has explained the 64K limit yet...


--
Best regards,
Peter mailto:[email protected]