On Wed, Apr 08, 2015 at 03:53:13PM -0700, [email protected] wrote:
>
> The patch titled
> Subject: slub bulk alloc: extract objects from the per cpu slab
> has been added to the -mm tree. Its filename is
> slub-bulk-alloc-extract-objects-from-the-per-cpu-slab.patch
>
> This patch should soon appear at
> http://ozlabs.org/~akpm/mmots/broken-out/slub-bulk-alloc-extract-objects-from-the-per-cpu-slab.patch
> and later at
> http://ozlabs.org/~akpm/mmotm/broken-out/slub-bulk-alloc-extract-objects-from-the-per-cpu-slab.patch
>
> Before you just go and hit "reply", please:
> a) Consider who else should be cc'ed
> b) Prefer to cc a suitable mailing list as well
> c) Ideally: find the original patch on the mailing list and do a
> reply-to-all to that, adding suitable additional cc's
>
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
>
> ------------------------------------------------------
> From: Christoph Lameter <[email protected]>
> Subject: slub bulk alloc: extract objects from the per cpu slab
>
> First piece: acceleration of retrieval of per cpu objects
>
> If we are allocating lots of objects then it is advantageous to disable
> interrupts and avoid the this_cpu_cmpxchg() operation, which lets us get
> at these objects faster. Note that we cannot take the fast path if
> debugging is enabled. Note also that, since interrupts are already
> disabled, we avoid having to do processor flag operations.
>
> Allocate as many objects as possible in the fast way and then fall back to
> the generic implementation for the rest of the objects.
>
> Signed-off-by: Christoph Lameter <[email protected]>
> Cc: Jesper Dangaard Brouer <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Pekka Enberg <[email protected]>
> Cc: David Rientjes <[email protected]>
> Cc: Joonsoo Kim <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> ---
>
> mm/slub.c | 27 ++++++++++++++++++++++++++-
> 1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff -puN mm/slub.c~slub-bulk-alloc-extract-objects-from-the-per-cpu-slab mm/slub.c
> --- a/mm/slub.c~slub-bulk-alloc-extract-objects-from-the-per-cpu-slab
> +++ a/mm/slub.c
> @@ -2759,7 +2759,32 @@ EXPORT_SYMBOL(kmem_cache_free_bulk);
> bool kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
> void **p)
> {
> - return kmem_cache_alloc_bulk(s, flags, size, p);
> + if (!kmem_cache_debug(s)) {
> + struct kmem_cache_cpu *c;
> +
> + /* Drain objects in the per cpu slab */
> + local_irq_disable();
> + c = this_cpu_ptr(s->cpu_slab);
> +
> + while (size) {
> + void *object = c->freelist;
> +
> + if (!object)
> + break;
> +
> + c->freelist = get_freepointer(s, object);
> + *p++ = object;
> + size--;
> +
> + if (unlikely(flags & __GFP_ZERO))
> + memset(object, 0, s->object_size);
Ahh... and how about doing the memset after irqs are re-enabled? Zeroing
the objects doesn't need interrupts disabled, so it only lengthens the
critical section.

Thanks.
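To make the suggestion concrete, here is a minimal userspace sketch of the
alternative shape: pop objects off the freelist inside the (simulated)
irq-disabled section, but defer the __GFP_ZERO memset until after it ends.
The names (struct node, bulk_alloc_deferred_zero) are illustrative only,
not kernel code, and the irq calls are stand-in comments:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical freelist node; the embedded next pointer mimics how
 * SLUB's get_freepointer() chains free objects together. */
struct node {
	struct node *next;
	char payload[16];
};

/* Drain up to size objects from *freelist into p[].  The memset for
 * zeroed allocations runs only after the critical section would have
 * ended.  Returns the number of objects actually taken. */
static size_t bulk_alloc_deferred_zero(struct node **freelist,
				       size_t size, void **p, int zero)
{
	size_t taken = 0;

	/* local_irq_disable() would go here in the kernel */
	while (taken < size && *freelist) {
		struct node *object = *freelist;

		*freelist = object->next;	/* get_freepointer() analogue */
		p[taken++] = object;
	}
	/* local_irq_enable() would go here */

	if (zero)				/* deferred __GFP_ZERO work */
		for (size_t i = 0; i < taken; i++)
			memset(p[i], 0, sizeof(struct node));

	return taken;
}
```

The trade-off is one extra pass over p[], in exchange for keeping the
irq-off window as short as possible.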