Date: Wed, 12 Jul 2017 09:54:54 -0500 (CDT)
From: Christopher Lameter
To: Laura Abbott
cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, Kees Cook
Subject: Re: [RFC][PATCH] slub: Introduce 'alternate' per cpu partial lists
In-Reply-To: <1496965984-21962-1-git-send-email-labbott@redhat.com>
References: <1496965984-21962-1-git-send-email-labbott@redhat.com>

On Thu, 8 Jun 2017, Laura Abbott wrote:

> - Some of this code is redundant and can probably be combined.
> - The fast path is very sensitive and it was suggested I leave it alone. The
> approach I took means the fastpath cmpxchg always fails before trying the
> alternate cmpxchg. From some of my profiling, the cmpxchg seemed to be fairly
> expensive.

I think it's better to change the fast path. Just make sure that the hot
path is as unencumbered as possible. There are already slow pieces in the
hot path; if your modifications are similar in cost then it would work.

> diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
> index 07ef550..d582101 100644
> --- a/include/linux/slub_def.h
> +++ b/include/linux/slub_def.h
> @@ -42,6 +44,12 @@ struct kmem_cache_cpu {
>  	unsigned long tid;	/* Globally unique transaction id */
>  	struct page *page;	/* The slab from which we are allocating */
>  	struct page *partial;	/* Partially allocated frozen slabs */
> +	/*
> +	 * The following fields have identical uses to those above */
> +	void **alt_freelist;
> +	unsigned long alt_tid;
> +	struct page *alt_partial;
> +	struct page *alt_page;
>  #ifdef CONFIG_SLUB_STATS
>  	unsigned stat[NR_SLUB_STAT_ITEMS];
>  #endif

I would rather avoid duplication here. Use the regular entries and modify
the flow depending on a flag.
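
Something along these lines is what I have in mind (completely untested
sketch; the SLAB_FAST_DEBUG mask and the helper name are invented for
illustration, they are not in your patch):

#define SLAB_FAST_DEBUG		(SLAB_RED_ZONE | SLAB_POISON)

static inline bool kmem_cache_fast_debug(struct kmem_cache *s)
{
	/* Cheap debug options that can coexist with the cmpxchg fast path */
	return s->flags & SLAB_FAST_DEBUG;
}

struct kmem_cache_cpu then stays exactly as it is, and the places where
the patch currently switches over to the alt_* fields just branch on
kmem_cache_fast_debug() instead.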
> diff --git a/mm/slub.c b/mm/slub.c
> index 7449593..b1fc4c6 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -132,10 +132,24 @@ void *fixup_red_left(struct kmem_cache *s, void *p)
>  	return p;
>  }
>
> +#define SLAB_NO_PARTIAL	(SLAB_CONSISTENCY_CHECKS | SLAB_STORE_USER | \
> +			 SLAB_TRACE)
> +
> +
> +static inline bool kmem_cache_use_alt_partial(struct kmem_cache *s)
> +{
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
> +	return s->flags & (SLAB_RED_ZONE | SLAB_POISON) &&
> +		!(s->flags & SLAB_NO_PARTIAL);
> +#else
> +	return false;
> +#endif
> +}
> +
>  static inline bool kmem_cache_has_cpu_partial(struct kmem_cache *s)
>  {
>  #ifdef CONFIG_SLUB_CPU_PARTIAL
> -	return !kmem_cache_debug(s);
> +	return !(s->flags & SLAB_NO_PARTIAL);
>  #else
>  	return false;
>  #endif
> @@ -1786,6 +1800,7 @@ static inline void *acquire_slab(struct kmem_cache *s,
>  }

Hmmm... Looks like the inversion would be better: SLAB_PARTIAL?

...

Lots of duplication. I think that can be avoided by rearranging the fast
path depending on a flag. Maybe make the fast poisoning path the default?
If you can keep the performance of the fast path for regular use then
this may be best. You can then avoid adding the additional flag as well
as the additional debug counters.
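
To make "rearranging the fast path depending on a flag" a bit more
concrete, roughly this kind of flow (untested sketch, the hook name is
made up): keep the single freelist/tid cmpxchg and do the cheap debug
work right after it.

static __always_inline void slab_alloc_fast_debug(struct kmem_cache *s,
						  void *object)
{
	if (!object)
		return;

	/*
	 * Only redzoning/poisoning runs here. Consistency checks,
	 * SLAB_STORE_USER and SLAB_TRACE keep forcing the slow path
	 * and keep the cpu partial lists disabled as before.
	 */
	if (unlikely(s->flags & (SLAB_RED_ZONE | SLAB_POISON)))
		init_object(s, object, SLUB_RED_ACTIVE);
}

Called right after the this_cpu_cmpxchg_double() in slab_alloc_node()
(with an SLUB_RED_INACTIVE counterpart on the free side), the regular
fast path never has to fail its cmpxchg first, and no second set of
freelist/tid fields or stat counters is needed.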