2007-01-02 13:52:18

by Pekka Enberg

Subject: [PATCH] slab: cache alloc cleanups

[Andrew, I have been unable to find a NUMA-capable tester for this patch,
so can we please put this into -mm for some exposure?]

From: Pekka Enberg <[email protected]>

This patch cleans up the __cache_alloc and __cache_alloc_node functions. We
no longer need to do NUMA_BUILD tricks, and the UMA allocation path is much
simpler. Note: we now do alternate_node_alloc() for kmem_cache_alloc_node()
as well.

Cc: Andy Whitcroft <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Paul Jackson <[email protected]>
Signed-off-by: Pekka Enberg <[email protected]>
---

mm/slab.c | 165 +++++++++++++++++++++++++++++++-------------------------------
1 file changed, 84 insertions(+), 81 deletions(-)

Index: 2.6/mm/slab.c
===================================================================
--- 2.6.orig/mm/slab.c
+++ 2.6/mm/slab.c
@@ -3197,35 +3197,6 @@ static inline void *____cache_alloc(stru
return objp;
}

-static __always_inline void *__cache_alloc(struct kmem_cache *cachep,
- gfp_t flags, void *caller)
-{
- unsigned long save_flags;
- void *objp = NULL;
-
- cache_alloc_debugcheck_before(cachep, flags);
-
- local_irq_save(save_flags);
-
- if (unlikely(NUMA_BUILD &&
- current->flags & (PF_SPREAD_SLAB | PF_MEMPOLICY)))
- objp = alternate_node_alloc(cachep, flags);
-
- if (!objp)
- objp = ____cache_alloc(cachep, flags);
- /*
- * We may just have run out of memory on the local node.
- * ____cache_alloc_node() knows how to locate memory on other nodes
- */
- if (NUMA_BUILD && !objp)
- objp = ____cache_alloc_node(cachep, flags, numa_node_id());
- local_irq_restore(save_flags);
- objp = cache_alloc_debugcheck_after(cachep, flags, objp,
- caller);
- prefetchw(objp);
- return objp;
-}
-
#ifdef CONFIG_NUMA
/*
* Try allocating on another node if PF_SPREAD_SLAB|PF_MEMPOLICY.
@@ -3383,7 +3354,90 @@ must_grow:
done:
return obj;
}
-#endif
+
+/**
+ * __do_cache_alloc_node - Allocate an object on the specified node
+ * @cache: The cache to allocate from.
+ * @flags: See kmalloc().
+ * @nodeid: node number of the target node.
+ *
+ * Fallback to other node is possible if __GFP_THISNODE is not set.
+ */
+static __always_inline void *
+__do_cache_alloc_node(struct kmem_cache *cache, gfp_t flags, int nodeid)
+{
+ void *obj;
+
+ if (nodeid == -1 || nodeid == numa_node_id()) {
+ if (unlikely(current->flags & (PF_SPREAD_SLAB | PF_MEMPOLICY))) {
+ obj = alternate_node_alloc(cache, flags);
+ if (obj)
+ goto out;
+ }
+
+ /*
+ * Use the locally cached objects if possible. However,
+ * ____cache_alloc does not allow fallback to other nodes.
+ * It may fail while we still have objects on other nodes
+ * available.
+ */
+ obj = ____cache_alloc(cache, flags);
+ if (obj)
+ goto out;
+
+ /* Fall back to other nodes. */
+ obj = ____cache_alloc_node(cache, flags, numa_node_id());
+ } else {
+ if (likely(cache->nodelists[nodeid]))
+ obj = ____cache_alloc_node(cache, flags, nodeid);
+ else {
+ /* Node is not bootstrapped yet. */
+ if (!(flags & __GFP_THISNODE))
+ obj = fallback_alloc(cache, flags);
+ else
+ obj = NULL;
+ }
+ }
+ out:
+ return obj;
+}
+
+#else
+
+static __always_inline void *
+__do_cache_alloc_node(struct kmem_cache *cache, gfp_t flags, int nodeid)
+{
+ /*
+ * For UMA, we always allocate from the local cache.
+ */
+ return ____cache_alloc(cache, flags);
+}
+#endif /* CONFIG_NUMA */
+
+static __always_inline void *
+__cache_alloc_node(struct kmem_cache *cache, gfp_t flags, int nodeid,
+ void *caller)
+{
+ unsigned long save_flags;
+ void *obj;
+
+ cache_alloc_debugcheck_before(cache, flags);
+ local_irq_save(save_flags);
+ obj = __do_cache_alloc_node(cache, flags, nodeid);
+ local_irq_restore(save_flags);
+ obj = cache_alloc_debugcheck_after(cache, flags, obj, caller);
+ return obj;
+}
+
+static __always_inline void *
+__cache_alloc(struct kmem_cache *cache, gfp_t flags, void *caller)
+{
+ void *obj;
+
+ obj = __cache_alloc_node(cache, flags, -1, caller);
+ prefetchw(obj);
+ return obj;
+}

/*
* Caller needs to acquire correct kmem_list's list_lock
@@ -3582,57 +3636,6 @@ out:
}

#ifdef CONFIG_NUMA
-/**
- * kmem_cache_alloc_node - Allocate an object on the specified node
- * @cachep: The cache to allocate from.
- * @flags: See kmalloc().
- * @nodeid: node number of the target node.
- * @caller: return address of caller, used for debug information
- *
- * Identical to kmem_cache_alloc but it will allocate memory on the given
- * node, which can improve the performance for cpu bound structures.
- *
- * Fallback to other node is possible if __GFP_THISNODE is not set.
- */
-static __always_inline void *
-__cache_alloc_node(struct kmem_cache *cachep, gfp_t flags,
- int nodeid, void *caller)
-{
- unsigned long save_flags;
- void *ptr = NULL;
-
- cache_alloc_debugcheck_before(cachep, flags);
- local_irq_save(save_flags);
-
- if (unlikely(nodeid == -1))
- nodeid = numa_node_id();
-
- if (likely(cachep->nodelists[nodeid])) {
- if (nodeid == numa_node_id()) {
- /*
- * Use the locally cached objects if possible.
- * However ____cache_alloc does not allow fallback
- * to other nodes. It may fail while we still have
- * objects on other nodes available.
- */
- ptr = ____cache_alloc(cachep, flags);
- }
- if (!ptr) {
- /* ___cache_alloc_node can fall back to other nodes */
- ptr = ____cache_alloc_node(cachep, flags, nodeid);
- }
- } else {
- /* Node not bootstrapped yet */
- if (!(flags & __GFP_THISNODE))
- ptr = fallback_alloc(cachep, flags);
- }
-
- local_irq_restore(save_flags);
- ptr = cache_alloc_debugcheck_after(cachep, flags, ptr, caller);
-
- return ptr;
-}
-
void *kmem_cache_alloc_node(struct kmem_cache *cachep, gfp_t flags, int nodeid)
{
return __cache_alloc_node(cachep, flags, nodeid,


2007-01-02 14:29:56

by Andy Whitcroft

Subject: Re: [PATCH] slab: cache alloc cleanups

Pekka J Enberg wrote:
> [Andrew, I have been unable to find a NUMA-capable tester for this patch,
> so can we please put this into -mm for some exposure?]
>
> From: Pekka Enberg <[email protected]>
>
> This patch cleans up the __cache_alloc and __cache_alloc_node functions. We
> no longer need to do NUMA_BUILD tricks, and the UMA allocation path is much
> simpler. Note: we now do alternate_node_alloc() for kmem_cache_alloc_node()
> as well.

I'll push this through our tests here if that helps. I need to rerun
the -rc2-mm1 tests by the looks of it ... looks like the test lab had
some time off over Christmas.

-apw

2007-01-02 16:27:24

by Christoph Lameter

Subject: Re: [PATCH] slab: cache alloc cleanups

On Tue, 2 Jan 2007, Pekka J Enberg wrote:

> +
> + if (nodeid == -1 || nodeid == numa_node_id()) {
> + if (unlikely(current->flags & (PF_SPREAD_SLAB | PF_MEMPOLICY))) {
> + obj = alternate_node_alloc(cache, flags);
> + if (obj)
> + goto out;
> + }

This reintroduces a bug that was fixed a while ago.

kmalloc_node() must never obey memory policies, and alternate_node_alloc()
is what implements memory policies.

With this patch, kmalloc_node(...., numa_node_id()) would again get
redirected to other nodes if a memory policy is in effect.
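
To make the control flow concrete, here is a minimal userspace model of the
patched branch (plain C, not kernel code; node numbers and flag values are
made up for illustration):

#include <stdio.h>

#define PF_MEMPOLICY 0x1

static int current_flags = PF_MEMPOLICY;	/* task has a mempolicy set */

static int numa_node_id(void)
{
	return 0;				/* the local node */
}

/* Stand-in for alternate_node_alloc(): the policy picks node 1. */
static int alternate_node_alloc(void)
{
	return 1;
}

/* Mirrors the patched path: a local-node request consults the policy. */
static int patched_alloc_node(int nodeid)
{
	if (nodeid == -1 || nodeid == numa_node_id()) {
		if (current_flags & PF_MEMPOLICY)
			return alternate_node_alloc();	/* nodeid is ignored */
	}
	return nodeid == -1 ? numa_node_id() : nodeid;
}

int main(void)
{
	/* kmalloc_node(..., numa_node_id()) explicitly asked for node 0. */
	printf("requested node %d, got node %d\n",
	       numa_node_id(), patched_alloc_node(numa_node_id()));
	return 0;
}

The explicit node 0 request comes back as node 1, which is exactly the
redirection kmalloc_node() must not do.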

2007-01-02 20:24:04

by Andrew Morton

Subject: Re: [PATCH] slab: cache alloc cleanups

On Tue, 2 Jan 2007 15:47:06 +0200 (EET)
Pekka J Enberg <[email protected]> wrote:

> I have been unable to find a NUMA-capable tester for this patch,

Any x86_64 box can be used to test NUMA code via the numa=fake=N boot option.

fake-numa is somewhat sick in mainline and you might find that it doesn't
work right on some machines, or that it fails with high values of N, but
works OK with N=2. There are fixes to address this problem in -mm.
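
For example, booting with something like this on the kernel command line
(the image name and root device below are only placeholders):

	kernel /boot/vmlinuz-2.6.20-rc2-mm1 root=/dev/sda1 numa=fake=2

should give you two fake nodes; if it took effect, /sys/devices/system/node/
will list node0 and node1.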

2007-01-02 20:27:22

by Pekka Enberg

Subject: Re: [PATCH] slab: cache alloc cleanups

On Tue, 2 Jan 2007, Pekka J Enberg wrote:
> > +
> > + if (nodeid == -1 || nodeid == numa_node_id()) {
> > + if (unlikely(current->flags & (PF_SPREAD_SLAB | PF_MEMPOLICY))) {
> > + obj = alternate_node_alloc(cache, flags);
> > + if (obj)
> > + goto out;
> > + }

On 1/2/07, Christoph Lameter <[email protected]> wrote:
> This reintroduces a bug that was fixed a while ago.

Aah, well, we could add a can_mempolicy parameter, but then I'm not sure it
would still be an improvement over the current version...
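
Something along these lines, perhaps (untested sketch, and can_mempolicy is
just a placeholder name):

static __always_inline void *
__do_cache_alloc_node(struct kmem_cache *cache, gfp_t flags, int nodeid,
		      bool can_mempolicy)
{
	void *obj;

	if (nodeid == -1 || nodeid == numa_node_id()) {
		/* Only consult mempolicies when the caller allows it. */
		if (can_mempolicy &&
		    unlikely(current->flags & (PF_SPREAD_SLAB | PF_MEMPOLICY))) {
			obj = alternate_node_alloc(cache, flags);
			if (obj)
				return obj;
		}

		obj = ____cache_alloc(cache, flags);
		if (!obj)
			obj = ____cache_alloc_node(cache, flags, numa_node_id());
		return obj;
	}

	if (likely(cache->nodelists[nodeid]))
		return ____cache_alloc_node(cache, flags, nodeid);

	/* Node is not bootstrapped yet. */
	if (!(flags & __GFP_THISNODE))
		return fallback_alloc(cache, flags);

	return NULL;
}

__cache_alloc() would then pass true and kmem_cache_alloc_node() false, so
an explicit nodeid would always be honoured.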

2007-01-04 21:16:17

by Christoph Hellwig

Subject: Re: [PATCH] slab: cache alloc cleanups

On Tue, Jan 02, 2007 at 03:47:06PM +0200, Pekka J Enberg wrote:
> [Andrew, I have been unable to find a NUMA-capable tester for this patch,
> so can we please put this into -mm for some exposure?]
>
> From: Pekka Enberg <[email protected]>
>
> This patch cleans up the __cache_alloc and __cache_alloc_node functions. We
> no longer need to do NUMA_BUILD tricks, and the UMA allocation path is much
> simpler. Note: we now do alternate_node_alloc() for kmem_cache_alloc_node()
> as well.

Seems to work nicely on my 2node cell blade.

2007-01-04 21:23:38

by Pekka Enberg

Subject: Re: [PATCH] slab: cache alloc cleanups

On 1/4/07, Christoph Hellwig <[email protected]> wrote:
> Seems to work nicely on my 2node cell blade.

Thanks for testing. Unfortunately, as the other Christoph pointed out, my
patch reintroduces a bug that was fixed a while ago: kmalloc_node() should
not be using mempolicies...