2001-02-24 00:27:25

by Reto Baettig

Subject: RFC: vmalloc improvements

Hi

We have an application that makes extensive use of vmalloc (we need
lots of large virtual contiguous buffers. The buffers don't have to be
physically contiguous).

vmalloc/vfree is very slow when the vmlist gets long.

I don't know if this problem is already on a todo list or if we are the
first ones who want to use vmalloc extensively. Maybe we're also missing
something.

We would volunteer to improve vmalloc if there is any chance of
getting it into the main kernel tree. We also have an idea how we
could do that (quite similar to the process address space management):

1. Create a generic avl-tree header file (similar to list.h)

2. We change the vm_struct to something like:

struct vm_struct {
        unsigned long flags;
        void *addr;
        unsigned long size;
        struct avl_entry avl;
        struct list_head empty_list;
        struct list_head vm_list;
};

with struct avl_entry:

struct avl_entry {
        unsigned long key;
        short height;
        struct avl_entry *avl_left;
        struct avl_entry *avl_right;
};

3. We have an avl-tree (vm_avl_used) for the used memory areas (sorted
by the address), a hashtable for the unused memory areas (vm_hash_unused,
hashed by the size) and a sorted linear list (vm_list) of all the memory
areas (used and unused). The vm_hash_unused hashtable is initially empty
and only gets filled when previously used areas are freed and the memory
space gets fragmented.

4. When we free an area, we first find it in the avl tree. Once we
have the vm_struct, we can check the vm_list for direct neighbours.
If a neighbour is also free, the areas get merged.

5. When we have to allocate a new area (get_free_area)
and the hash table can not satisfy the request, we allocate a new area
starting after the end of the used memory areas.

Is this something that makes sense to do and that could make it
into the 2.4 or the 2.5 kernel?

Reto


2001-02-24 00:33:25

by Ingo Molnar

Subject: Re: RFC: vmalloc improvements


On Fri, 23 Feb 2001, Reto Baettig wrote:

> We have an application that makes extensive use of vmalloc (we need
> lots of large virtual contiguous buffers. The buffers don't have to be
> physically contiguous).

question: what is this application, and why does it need so much virtual
memory? vmalloc()-able memory is maximized to 128 MB right now, and
increasing it conflicts with directly mapping RAM, so generally it's a
good idea to avoid vmalloc() as much as possible.

Ingo

2001-02-24 01:02:30

by Linus Torvalds

Subject: Re: RFC: vmalloc improvements

In article <[email protected]>,
Reto Baettig <[email protected]> wrote:
>
>We would volunteer to improve vmalloc if there is any chance of
>getting it into the main kernel tree. We also have an idea how we
>could do that (quite similar to the process address space management):
>
>1. Create a generic avl-tree header file (similar to list.h)
....

No thanks.

Just use the process address space management as-is, and make the
vmalloc address list be the same as any other address list: it would just
be the "native" address list for "init_mm".

You could probably even use "insert_vm_struct()" directly, and have that
do the AVL tree stuff for you, no changes needed.

>Is this something that makes sense to do and that could make it
>into the 2.4 or the 2.5 kernel?

It's definitely not a 2.4.x thing.

Linus

2001-02-24 01:07:01

by Alan

Subject: Re: RFC: vmalloc improvements

> We have an application that makes extensive use of vmalloc (we need
> lots of large virtual contiguous buffers. The buffers don't have to be
> physically contiguous).

So you could actually code around that. If you need them virtually contiguous
for mmap, for example, then you can actually mmap arbitrary page arrays.

> We would volunteer to improve vmalloc if there is any chance of
> getting it into the main kernel tree. We also have an idea how we
> could do that (quite similar to the process address space management):

I'm not the one to call the shots, but it seems if you need an AVL for the
vmalloc tables then vmalloc is possibly being overused, or people are not
allocating buffers just occasionally as anticipated.

2001-02-27 00:51:56

by Reto Baettig

Subject: Re: RFC: vmalloc improvements

Ingo Molnar wrote:
> question: what is this application, and why does it need so much virtual
> memory? vmalloc()-able memory is maximized to 128 MB right now, and
> increasing it conflicts with directly mapping RAM, so generally it's a
> good idea to avoid vmalloc() as much as possible.

We implemented an RPC mechanism over a fast network in the kernel. The
end application is a distributed filesystem. The RPC server needs lots
of 2MB receive buffers which are allocated using vmalloc because the NIC
has its own pagetables.
The buffers then get handed to the consumers (lots of threads), which
eventually free them. This way, we achieve 200MBytes/s on the RPC
layer.

The 128MB limit is probably an Intel limitation since we don't see it on
our Alpha machines (Linux 2.2.18 Alpha SMP).

Reto

2001-02-27 01:00:49

by David Miller

Subject: Re: RFC: vmalloc improvements


Reto Baettig writes:
> The RPC server needs lots of 2MB receive buffers which are
> allocated using vmalloc because the NIC has its own pagetables.

Why not just allocate the pages separately and keep track of
where they are? Since the NIC has all the page tabling facilities
on its end, the cpu side is just a software issue. You can keep
an array of pages however large you need to keep track of that.

vmalloc() was never meant to be used on this level and doing
so is asking for trouble (it's also deadly expensive on SMP due
to the cross-cpu tlb invalidates that vmalloc() causes).

Later,
David S. Miller
[email protected]