2015-06-09 00:25:22

by Joonsoo Kim

Subject: Re: + slub-bulk-alloc-extract-objects-from-the-per-cpu-slab.patch added to -mm tree

On Wed, Apr 08, 2015 at 03:53:13PM -0700, [email protected] wrote:
>
> The patch titled
> Subject: slub bulk alloc: extract objects from the per cpu slab
> has been added to the -mm tree. Its filename is
> slub-bulk-alloc-extract-objects-from-the-per-cpu-slab.patch
>
> This patch should soon appear at
> http://ozlabs.org/~akpm/mmots/broken-out/slub-bulk-alloc-extract-objects-from-the-per-cpu-slab.patch
> and later at
> http://ozlabs.org/~akpm/mmotm/broken-out/slub-bulk-alloc-extract-objects-from-the-per-cpu-slab.patch
>
> Before you just go and hit "reply", please:
> a) Consider who else should be cc'ed
> b) Prefer to cc a suitable mailing list as well
> c) Ideally: find the original patch on the mailing list and do a
> reply-to-all to that, adding suitable additional cc's
>
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
>
> ------------------------------------------------------
> From: Christoph Lameter <[email protected]>
> Subject: slub bulk alloc: extract objects from the per cpu slab
>
> First piece: acceleration of retrieval of per cpu objects
>
> If we are allocating lots of objects then it is advantageous to disable
> interrupts and avoid the this_cpu_cmpxchg() operation to get these objects
> faster. Note that we cannot do the fast operation if debugging is
> enabled. Note also that the requirement of having interrupts disabled
> avoids having to do processor flag operations.
>
> Allocate as many objects as possible in the fast way and then fall back to
> the generic implementation for the rest of the objects.
>
> Signed-off-by: Christoph Lameter <[email protected]>
> Cc: Jesper Dangaard Brouer <[email protected]>
> Cc: Christoph Lameter <[email protected]>
> Cc: Pekka Enberg <[email protected]>
> Cc: David Rientjes <[email protected]>
> Cc: Joonsoo Kim <[email protected]>
> Signed-off-by: Andrew Morton <[email protected]>
> ---
>
> mm/slub.c | 27 ++++++++++++++++++++++++++-
> 1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff -puN mm/slub.c~slub-bulk-alloc-extract-objects-from-the-per-cpu-slab mm/slub.c
> --- a/mm/slub.c~slub-bulk-alloc-extract-objects-from-the-per-cpu-slab
> +++ a/mm/slub.c
> @@ -2759,7 +2759,32 @@ EXPORT_SYMBOL(kmem_cache_free_bulk);
>  bool kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags, size_t size,
>  								void **p)
>  {
> -	return kmem_cache_alloc_bulk(s, flags, size, p);
> +	if (!kmem_cache_debug(s)) {
> +		struct kmem_cache_cpu *c;
> +
> +		/* Drain objects in the per cpu slab */
> +		local_irq_disable();
> +		c = this_cpu_ptr(s->cpu_slab);
> +
> +		while (size) {
> +			void *object = c->freelist;
> +
> +			if (!object)
> +				break;
> +
> +			c->freelist = get_freepointer(s, object);
> +			*p++ = object;
> +			size--;
> +
> +			if (unlikely(flags & __GFP_ZERO))
> +				memset(object, 0, s->object_size);
> +		}
> +		c->tid = next_tid(c->tid);
> +
> +		local_irq_enable();
> +	}
> +
> +	return __kmem_cache_alloc_bulk(s, flags, size, p);

Hello,

If __kmem_cache_alloc_bulk() fails, all objects already placed in the
array should be freed. However, __kmem_cache_alloc_bulk() is called with
the advanced 'p' and the reduced 'size', so it cannot know about the
objects allocated by this SLUB-specific kmem_cache_alloc_bulk()
function, and they would leak. Please fix it.

Thanks.


2015-06-09 07:10:09

by Jesper Dangaard Brouer

Subject: Re: + slub-bulk-alloc-extract-objects-from-the-per-cpu-slab.patch added to -mm tree

On Tue, 9 Jun 2015 09:26:39 +0900
Joonsoo Kim <[email protected]> wrote:

> [...]
>
> Hello,
>
> So, if __kmem_cache_alloc_bulk() fails, all allocated objects in array
> should be freed, but, __kmem_cache_alloc_bulk() can't know
> about objects allocated by this slub specific kmem_cache_alloc_bulk()
> function. Please fix it.

Check. I had already noticed this and have fixed it in my local git
tree.

How do I submit the fix to AKPM? Do I replace the commit/patch, or do I
send a patch on top?

(As you also noticed, I have moved the memset out of the loop, to after
local_irq_enable().)

--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Sr. Network Kernel Developer at Red Hat
Author of http://www.iptv-analyzer.org
LinkedIn: http://www.linkedin.com/in/brouer