2007-02-22 12:39:16

by Pekka Enberg

Subject: [RFC/PATCH] slab: free pages in a batch in drain_freelist

From: Pekka Enberg <[email protected]>

As suggested by William, free the actual pages in a batch so that we
don't keep pounding on l3->list_lock.

Cc: William Lee Irwin III <[email protected]>
Cc: Christoph Lameter <[email protected]>
Signed-off-by: Pekka Enberg <[email protected]>
---
mm/slab.c | 32 +++++++++++++++-----------------
1 file changed, 15 insertions(+), 17 deletions(-)

Index: uml-2.6/mm/slab.c
===================================================================
--- uml-2.6.orig/mm/slab.c 2007-02-22 14:19:46.000000000 +0200
+++ uml-2.6/mm/slab.c 2007-02-22 14:28:01.000000000 +0200
@@ -2437,38 +2437,36 @@
  *
  * Returns the actual number of slabs released.
  */
-static int drain_freelist(struct kmem_cache *cache,
-			struct kmem_list3 *l3, int tofree)
+static int drain_freelist(struct kmem_cache *cache, struct kmem_list3 *l3,
+			int tofree)
 {
+	struct slab *slabp, *this, *next;
+	struct list_head to_free_list;
 	struct list_head *p;
 	int nr_freed;
-	struct slab *slabp;
 
+	INIT_LIST_HEAD(&to_free_list);
 	nr_freed = 0;
-	while (nr_freed < tofree && !list_empty(&l3->slabs_free)) {
 
-		spin_lock_irq(&l3->list_lock);
+	spin_lock_irq(&l3->list_lock);
+	while (nr_freed < tofree && !list_empty(&l3->slabs_free)) {
 		p = l3->slabs_free.prev;
-		if (p == &l3->slabs_free) {
-			spin_unlock_irq(&l3->list_lock);
-			goto out;
-		}
+		if (p == &l3->slabs_free)
+			break;
 
 		slabp = list_entry(p, struct slab, list);
 #if DEBUG
 		BUG_ON(slabp->inuse);
 #endif
-		list_del(&slabp->list);
-		/*
-		 * Safe to drop the lock. The slab is no longer linked
-		 * to the cache.
-		 */
+		list_move(&slabp->list, &to_free_list);
 		l3->free_objects -= cache->num;
-		spin_unlock_irq(&l3->list_lock);
-		slab_destroy(cache, slabp);
 		nr_freed++;
 	}
-out:
+	spin_unlock_irq(&l3->list_lock);
+
+	list_for_each_entry_safe(this, next, &to_free_list, list)
+		slab_destroy(cache, this);
+
 	return nr_freed;
 }
 


2007-02-22 21:57:26

by Christoph Lameter

Subject: Re: [RFC/PATCH] slab: free pages in a batch in drain_freelist

On Thu, 22 Feb 2007, Pekka J Enberg wrote:

> As suggested by William, free the actual pages in a batch so that we
> don't keep pounding on l3->list_lock.

This means holding the l3->list_lock for a prolonged time period. The
existing code was done this way in order to make sure that the interrupt
holdoffs are minimal.

There is no pounding. The cacheline with the list_lock is typically held
until the draining is complete. While we drain the freelist we need to be
able to respond to interrupts.

2007-02-24 04:30:55

by William Lee Irwin III

Subject: Re: [RFC/PATCH] slab: free pages in a batch in drain_freelist

On Thu, 22 Feb 2007, Pekka J Enberg wrote:
>> As suggested by William, free the actual pages in a batch so that we
>> don't keep pounding on l3->list_lock.

On Thu, Feb 22, 2007 at 03:01:30PM -0800, Christoph Lameter wrote:
> This means holding the l3->list_lock for a prolonged time period. The
> existing code was done this way in order to make sure that the interrupt
> holdoffs are minimal.
> There is no pounding. The cacheline with the list_lock is typically held
> until the draining is complete. While we drain the freelist we need to be
> able to respond to interrupts.

I had in mind something more like a list_splice_init() operation under
the lock, since it empties the entire list except in the case of
cache_reap(). For cache_reap(), not much could be done unless they were
organized into batches of (l3->free_limit+5*searchp->num-1)/(5*searchp->num)
such as a list of lists of that length, which would need to be
reorganized when tuning ->batchcount occurs.
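
For the whole-list case, a minimal sketch of what such a list_splice_init()
based drain might look like (untested; drain_all_free is a hypothetical helper
name, not anything in mm/slab.c):

/*
 * Hypothetical sketch: splice the whole free list out in O(1) under the
 * lock, then destroy the slabs with the lock dropped.  Accounting for
 * l3->free_objects is done afterwards, once the count is known.
 */
static int drain_all_free(struct kmem_cache *cache, struct kmem_list3 *l3)
{
	struct slab *slabp, *next;
	LIST_HEAD(to_free);
	int nr_freed = 0;

	spin_lock_irq(&l3->list_lock);
	list_splice_init(&l3->slabs_free, &to_free);
	spin_unlock_irq(&l3->list_lock);

	list_for_each_entry_safe(slabp, next, &to_free, list) {
		nr_freed++;
		slab_destroy(cache, slabp);
	}

	/* free_objects is briefly stale between the splice and this point. */
	spin_lock_irq(&l3->list_lock);
	l3->free_objects -= nr_freed * cache->num;
	spin_unlock_irq(&l3->list_lock);

	return nr_freed;
}

The cache_reap() caller, which only wants a bounded number of slabs per pass,
would still need the batch bookkeeping described above.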

It's not terribly meaningful since only grand reorganizations that are
presumed to stop the world actually get "sped up" without the additional
effort required to improve cache_reap(). My commentary was more about
the data structures being incapable of bulk movement operations for
batching like or analogous to list_splice() than trying to say that
drain_freelist() in particular should be optimized. Allowing movement of
larger batches without increased hold time in transfer_objects() is
clearly a more meaningful goal, for example.
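
For reference, a from-memory sketch of the kind of bulk move transfer_objects()
does between array_caches (field names as in 2.6.20-era mm/slab.c; not a
verbatim copy of the source):

/*
 * Sketch (not verbatim kernel source): move up to 'max' object pointers
 * from one array_cache to another in a single memcpy, under whatever
 * lock the caller already holds.
 */
static int transfer_objects(struct array_cache *to,
			    struct array_cache *from, unsigned int max)
{
	/* How many pointers to move: limited by supply, demand and space. */
	unsigned int nr = min(min(from->avail, max), to->limit - to->avail);

	if (!nr)
		return 0;

	memcpy(to->entry + to->avail, from->entry + from->avail - nr,
	       sizeof(void *) * nr);

	from->avail -= nr;
	to->avail += nr;
	to->touched = 1;
	return nr;
}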

Furthermore, the patch as written merely increases hold time in
exchange for decreased arrival rate, resulting in no net improvement.


-- wli