2005-10-03 14:39:10

Subject: [RFC] mempool_alloc() pre-allocated object usage

Currently mempool_create() will pre-allocate min_nr objects in the pool
for later usage. However, the current semantics of mempool_alloc() are to
first attempt the ->alloc() path and then fall back to using a
pre-allocated object that already exists in the pool.

This is somewhat of a problem if we want to build up a pool of relatively
high-order allocations (backed by a slab cache, for example) to
guarantee contiguity early on, as sometimes we are able to satisfy the
->alloc() path and end up growing the pool larger than we would like.

The easy way around this would be to first fetch objects out of the pool
and then try ->alloc() in the case where we have no free objects left in
the pool. ie:

diff --git a/mm/mempool.c b/mm/mempool.c
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -216,11 +216,6 @@ void * mempool_alloc(mempool_t *pool, un
 	gfp_temp = gfp_mask & ~(__GFP_WAIT|__GFP_IO);

 repeat_alloc:
-
-	element = pool->alloc(gfp_temp, pool->pool_data);
-	if (likely(element != NULL))
-		return element;
-
 	spin_lock_irqsave(&pool->lock, flags);
 	if (likely(pool->curr_nr)) {
 		element = remove_element(pool);
@@ -229,6 +224,10 @@ repeat_alloc:
 	}
 	spin_unlock_irqrestore(&pool->lock, flags);

+	element = pool->alloc(gfp_temp, pool->pool_data);
+	if (likely(element != NULL))
+		return element;
+
 	/* We must not sleep in the GFP_ATOMIC case */
 	if (!(gfp_mask & __GFP_WAIT))
 		return NULL;

The downside to this is that some people may be expecting that
pre-allocated elements are used as reserve space for when regular
allocations aren't possible. In which case, this would break that
behaviour.
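
To make the difference between the two orderings concrete, here is a minimal
userspace sketch. It is only a toy model, not kernel code: toy_pool, toy_demo()
and the other toy_* names are hypothetical, malloc() stands in for ->alloc(),
and the backing allocator is assumed to succeed (no memory pressure). The free
path is modelled on mempool_free(), which refills the reserve up to min_nr
before releasing anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model. None of these names are kernel API. */
enum { TOY_MIN_NR = 4, TOY_MAX_HELD = 16 };

struct toy_pool {
	void *reserve[TOY_MIN_NR];	/* stands in for the pre-allocated elements */
	int curr_nr;			/* elements currently sitting in the reserve */
	int backing_allocs;		/* elements handed out by the backing allocator */
};

/* Stands in for pool->alloc(); assumed to succeed (no memory pressure). */
static void *toy_backing_alloc(struct toy_pool *p)
{
	void *e = malloc(64);

	if (e)
		p->backing_allocs++;
	return e;
}

static void *toy_take_reserve(struct toy_pool *p)
{
	return p->curr_nr ? p->reserve[--p->curr_nr] : NULL;
}

/* Current ordering: backing allocator first, reserve as the fallback. */
static void *toy_alloc_backing_first(struct toy_pool *p)
{
	void *e = toy_backing_alloc(p);

	return e ? e : toy_take_reserve(p);
}

/* Proposed ordering: drain the reserve first, then try the backing allocator. */
static void *toy_alloc_reserve_first(struct toy_pool *p)
{
	void *e = toy_take_reserve(p);

	return e ? e : toy_backing_alloc(p);
}

/* Refill the reserve up to TOY_MIN_NR, release anything beyond that. */
static void toy_free(struct toy_pool *p, void *e)
{
	if (p->curr_nr < TOY_MIN_NR)
		p->reserve[p->curr_nr++] = e;
	else
		free(e);
}

/*
 * Allocate n elements with the chosen ordering. Returns how many came from
 * the backing allocator and stores the remaining reserve in *reserve_left.
 */
int toy_demo(bool reserve_first, int n, int *reserve_left)
{
	struct toy_pool p = { .curr_nr = 0, .backing_allocs = 0 };
	void *held[TOY_MAX_HELD];
	int i, fresh;

	assert(n >= 0 && n <= TOY_MAX_HELD);
	while (p.curr_nr < TOY_MIN_NR) {
		void *e = malloc(64);

		assert(e != NULL);
		p.reserve[p.curr_nr++] = e;
	}
	for (i = 0; i < n; i++) {
		held[i] = reserve_first ? toy_alloc_reserve_first(&p)
					: toy_alloc_backing_first(&p);
		assert(held[i] != NULL);
	}
	fresh = p.backing_allocs;
	*reserve_left = p.curr_nr;

	for (i = 0; i < n; i++)
		toy_free(&p, held[i]);
	while (p.curr_nr)
		free(p.reserve[--p.curr_nr]);
	return fresh;
}
```

In this model, three allocations with the current ordering all come from the
backing allocator and leave the reserve untouched, so seven objects exist at
once. With the reserve-first ordering the same three allocations create no new
objects, but only one element is left in reserve for a later emergency.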

Both usage patterns seem valid from my point of view, would you be open
to something that would accommodate both? (ie, possibly adding in a flag
to determine pre-allocated object usage?) Or should I not be using
mempool for contiguity purposes?

2005-10-03 14:49:23

by Arjan van de Ven

Subject: Re: [RFC] mempool_alloc() pre-allocated object usage

On Mon, 2005-10-03 at 17:36 +0300, Paul Mundt wrote:

> Both usage patterns seem valid from my point of view, would you be open
> to something that would accommodate both? (ie, possibly adding in a flag
> to determine pre-allocated object usage?) Or should I not be using
> mempool for contiguity purposes?

a similar dilemma was in the highmem bounce code in 2.4; what worked
really well back then was to do it both; eg use half the pool for
"immediate" use, then try a VM alloc, and use the second half of the
pool for the really emergency cases.

Technically a mempool is there ONLY for the fallback, but I can see some
value in making it also a fastpath by means of a small scratch pool
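
The split described above could be sketched roughly as follows. This is a toy
userspace model under stated assumptions, not the 2.4 bounce code and not the
mempool API: split_pool, split_alloc(), split_demo() and the half/half ratio
constant are all hypothetical, malloc() stands in for the VM allocation, and
the backing_ok flag simulates memory pressure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model of the split-pool idea. All names are hypothetical. */
enum {
	SPLIT_MIN_NR = 4,			/* total pre-allocated elements */
	SPLIT_EMERGENCY_NR = SPLIT_MIN_NR / 2,	/* half kept back for emergencies */
	SPLIT_MAX_HELD = 16
};

enum split_source { SPLIT_NONE, SPLIT_FAST, SPLIT_BACKING, SPLIT_EMERGENCY };

struct split_pool {
	void *reserve[SPLIT_MIN_NR];
	int curr_nr;
	bool backing_ok;	/* false simulates memory pressure */
};

static void *split_alloc(struct split_pool *p, enum split_source *src)
{
	void *e;

	/* 1. "Immediate" half: use the pool while more than the emergency half remains. */
	if (p->curr_nr > SPLIT_EMERGENCY_NR) {
		*src = SPLIT_FAST;
		return p->reserve[--p->curr_nr];
	}

	/* 2. Regular allocation (malloc() stands in for the VM alloc). */
	e = p->backing_ok ? malloc(64) : NULL;
	if (e) {
		*src = SPLIT_BACKING;
		return e;
	}

	/* 3. Emergency half: only touched once the regular allocation has failed. */
	if (p->curr_nr > 0) {
		*src = SPLIT_EMERGENCY;
		return p->reserve[--p->curr_nr];
	}

	*src = SPLIT_NONE;
	return NULL;
}

/*
 * Attempt n allocations, recording where each one came from in src[0..n-1].
 * Returns the number of allocations that succeeded.
 */
int split_demo(bool backing_ok, int n, enum split_source *src)
{
	struct split_pool p = { .curr_nr = 0, .backing_ok = backing_ok };
	void *held[SPLIT_MAX_HELD];
	int i, ok = 0;

	assert(n >= 0 && n <= SPLIT_MAX_HELD);
	while (p.curr_nr < SPLIT_MIN_NR) {
		void *e = malloc(64);

		assert(e != NULL);
		p.reserve[p.curr_nr++] = e;
	}
	for (i = 0; i < n; i++) {
		held[i] = split_alloc(&p, &src[i]);
		if (held[i])
			ok++;
	}
	for (i = 0; i < n; i++)
		free(held[i]);		/* free(NULL) is a no-op */
	while (p.curr_nr)
		free(p.reserve[--p.curr_nr]);
	return ok;
}
```

With the backing allocator failing, five attempts in this model yield two
fast-path elements, two emergency elements and one failure. The cost is that
the pool has to be sized for both halves.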

2005-10-03 14:58:36

by Brian Gerst

Subject: Re: [RFC] mempool_alloc() pre-allocated object usage

Paul Mundt wrote:
> The downside to this is that some people may be expecting that
> pre-allocated elements are used as reserve space for when regular
> allocations aren't possible. In which case, this would break that
> behaviour.

This is the original intent of the mempool. There must be objects in
reserve so that the machine doesn't deadlock on critical allocations
(ie. disk writes) under memory pressure.

--
Brian Gerst

2005-10-03 16:24:01

by Paul Mundt

Subject: Re: [RFC] mempool_alloc() pre-allocated object usage

On Mon, Oct 03, 2005 at 04:49:13PM +0200, Arjan van de Ven wrote:
> On Mon, 2005-10-03 at 17:36 +0300, Paul Mundt wrote:
> > Both usage patterns seem valid from my point of view, would you be open
> > to something that would accommodate both? (ie, possibly adding in a flag
> > to determine pre-allocated object usage?) Or should I not be using
> > mempool for contiguity purposes?
>
> a similar dilemma was in the highmem bounce code in 2.4; what worked
> really well back then was to do it both; eg use half the pool for
> "immediate" use, then try a VM alloc, and use the second half of the
> pool for the really emergency cases.
>
Unfortunately this won't work very well in our case since it's
specifically high order allocations that we are after, and we don't have
the extra RAM to allow for this.

> Technically a mempool is there ONLY for the fallback, but I can see some
> value in making it also a fastpath by means of a small scratch pool

I haven't been able to think of any really good way to implement this, so
here's my current half-assed solution..

This adds a mempool_alloc_from_pool() to do the allocation directly from
the pool first if there are elements available, otherwise it defaults to
the mempool_alloc() behaviour (and no, I haven't commented it yet, since
it would be futile if no one likes this approach). It's at least fairly
minimalistic, and saves us from doing stupid things with the gfp_mask in
mempool_alloc().

--

include/linux/mempool.h | 2 ++
mm/mempool.c | 16 ++++++++++++++++
2 files changed, 18 insertions(+)

diff --git a/include/linux/mempool.h b/include/linux/mempool.h
--- a/include/linux/mempool.h
+++ b/include/linux/mempool.h
@@ -30,6 +30,8 @@ extern int mempool_resize(mempool_t *poo
 			unsigned int __nocast gfp_mask);
 extern void mempool_destroy(mempool_t *pool);
 extern void * mempool_alloc(mempool_t *pool, unsigned int __nocast gfp_mask);
+extern void * mempool_alloc_from_pool(mempool_t *pool,
+				      unsigned int __nocast gfp_mask);
 extern void mempool_free(void *element, mempool_t *pool);

 /*
diff --git a/mm/mempool.c b/mm/mempool.c
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -246,6 +246,22 @@ repeat_alloc:
 }
 EXPORT_SYMBOL(mempool_alloc);

+void *mempool_alloc_from_pool(mempool_t *pool, unsigned int __nocast gfp_mask)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&pool->lock, flags);
+	if (likely(pool->curr_nr)) {
+		void *element = remove_element(pool);
+		spin_unlock_irqrestore(&pool->lock, flags);
+		return element;
+	}
+	spin_unlock_irqrestore(&pool->lock, flags);
+
+	return mempool_alloc(pool, gfp_mask);
+}
+EXPORT_SYMBOL(mempool_alloc_from_pool);
+
 /**
  * mempool_free - return an element to the pool.
  * @element: pool element pointer.

2005-10-04 01:06:51

by Nick Piggin

Subject: Re: [RFC] mempool_alloc() pre-allocated object usage

On Mon, 2005-10-03 at 10:59 -0400, Brian Gerst wrote:
> Paul Mundt wrote:
> > The downside to this is that some people may be expecting that
> > pre-allocated elements are used as reserve space for when regular
> > allocations aren't possible. In which case, this would break that
> > behaviour.
>
> This is the original intent of the mempool. There must be objects in
> reserve so that the machine doesn't deadlock on critical allocations
> (ie. disk writes) under memory pressure.
>

No, the semantics are that at least 'min' objects must be able to
be allocated at one time. The user must be able to proceed far enough
to release its objects in this case, and that ensures no deadlock.
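
As a rough illustration of that guarantee, the following toy userspace sketch
assumes the worst case, where the backing allocator always fails and only the
reserve is left. The rsv_* names are hypothetical and this is not the kernel
implementation: a real mempool_alloc() caller that may sleep would wait for an
element to be freed, whereas this model simply gives up on the work item.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model. None of these names are kernel API. */
enum { RSV_MIN_NR = 2, RSV_MAX_NEED = 8 };

struct rsv_pool {
	void *reserve[RSV_MIN_NR];
	int curr_nr;
};

/* The backing allocator is assumed to fail always, so only the reserve is left. */
static void *rsv_alloc(struct rsv_pool *p)
{
	return p->curr_nr ? p->reserve[--p->curr_nr] : NULL;
}

/* Return an element to the reserve; anything beyond RSV_MIN_NR is released. */
static void rsv_free(struct rsv_pool *p, void *e)
{
	if (p->curr_nr < RSV_MIN_NR)
		p->reserve[p->curr_nr++] = e;
	else
		free(e);
}

/*
 * Process `items` work items one after another. Each item needs `need`
 * elements at the same time and releases them when it is done.
 * Returns how many items completed.
 */
int rsv_process(int items, int need)
{
	struct rsv_pool p = { .curr_nr = 0 };
	void *held[RSV_MAX_NEED];
	int done = 0, got, i;

	assert(need >= 1 && need <= RSV_MAX_NEED);
	while (p.curr_nr < RSV_MIN_NR) {
		void *e = malloc(64);

		assert(e != NULL);
		p.reserve[p.curr_nr++] = e;
	}
	for (; items > 0; items--) {
		for (got = 0; got < need; got++) {
			held[got] = rsv_alloc(&p);
			if (!held[got])
				break;
		}
		if (got == need)
			done++;
		/* Releasing the elements is what lets the next item proceed. */
		for (i = 0; i < got; i++)
			rsv_free(&p, held[i]);
	}
	while (p.curr_nr)
		free(p.reserve[--p.curr_nr]);
	return done;
}
```

Any number of work items completes as long as each needs no more than
RSV_MIN_NR elements at a time and releases them afterwards. An item that needs
more than the reserve holds never completes in this model.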

The problem with using the pool first is that it requires the lock
to be taken and is also not NUMA aware. So from a scalability point of
view, I don't think it is a good idea.

Perhaps you could introduce a new mempool allocation interface to do
it your way?

Nick

--
SUSE Labs, Novell Inc.

2005-10-04 06:59:15

by Jens Axboe

Subject: Re: [RFC] mempool_alloc() pre-allocated object usage

On Mon, Oct 03 2005, Arjan van de Ven wrote:
> On Mon, 2005-10-03 at 17:36 +0300, Paul Mundt wrote:
>
> > Both usage patterns seem valid from my point of view, would you be open
> > to something that would accommodate both? (ie, possibly adding in a flag
> > to determine pre-allocated object usage?) Or should I not be using
> > mempool for contiguity purposes?
>
> a similar dilemma was in the highmem bounce code in 2.4; what worked
> really well back then was to do it both; eg use half the pool for
> "immediate" use, then try a VM alloc, and use the second half of the
> pool for the really emergency cases.
>
> Technically a mempool is there ONLY for the fallback, but I can see some
> value in making it also a fastpath by means of a small scratch pool

The reason it works the way it does is performance: you don't want to
touch the pool lock until you have to. If the page allocations that
happen before falling back to the mempool are the problem, I would
suggest looking at that specific issue first. I think Nick recently made
some changes in that area; there might be more low-hanging fruit.

--
Jens Axboe

2005-10-06 12:58:00

by Marcelo Tosatti

Subject: Re: [RFC] mempool_alloc() pre-allocated object usage

Hi Paul,

On Mon, Oct 03, 2005 at 07:21:22PM +0300, Paul Mundt wrote:
> On Mon, Oct 03, 2005 at 04:49:13PM +0200, Arjan van de Ven wrote:
> > On Mon, 2005-10-03 at 17:36 +0300, Paul Mundt wrote:
> > > Both usage patterns seem valid from my point of view, would you be open
> > > to something that would accommodate both? (ie, possibly adding in a flag
> > > to determine pre-allocated object usage?) Or should I not be using
> > > mempool for contiguity purposes?
> >
> > a similar dilemma was in the highmem bounce code in 2.4; what worked
> > really well back then was to do it both; eg use half the pool for
> > "immediate" use, then try a VM alloc, and use the second half of the
> > pool for the really emergency cases.
> >
> Unfortunately this won't work very well in our case since it's
> specifically high order allocations that we are after, and we don't have
> the extra RAM to allow for this.

Out of curiosity, what is the requirement for higher order pages?