Message-ID: <5159B06B.30007@windriver.com>
Date: Mon, 1 Apr 2013 12:06:03 -0400
From: Paul Gortmaker
To: Christoph Lameter
CC: Joonsoo Kim, Steven Rostedt, LKML, RT, Thomas Gleixner, Clark Williams, Pekka Enberg
Subject: Re: [RT LATENCY] 249 microsecond latency caused by slub's unfreeze_partials() code.
In-Reply-To: <0000013dc63a9086-7d10c4a8-748c-4e19-829a-856d8d42c8eb-000000@email.amazonses.com>

On 13-04-01 11:32 AM, Christoph Lameter wrote:
> On Thu, 28 Mar 2013, Paul Gortmaker wrote:
>
>>> Index: linux/init/Kconfig
>>> ===================================================================
>>> --- linux.orig/init/Kconfig	2013-03-28 12:14:26.958358688 -0500
>>> +++ linux/init/Kconfig	2013-03-28 12:19:46.275866639 -0500
>>> @@ -1514,6 +1514,14 @@ config SLOB
>>>
>>>  endchoice
>>>
>>> +config SLUB_CPU_PARTIAL
>>> +	depends on SLUB
>>> +	bool "SLUB per cpu partial cache"
>>> +	help
>>> +	  Per cpu partial caches accellerate freeing of objects at the
>>> +	  price of more indeterminism in the latency of the free.
>>> +	  Typically one would choose no for a realtime system.
>>
>> Is "batch" a better description than "accelerate" ?  Something like
>
> Its not a batching but a cache that is going to be mainly used for new
> allocations on the same processor.

OK.  In that case, a minor nit: since it is user-facing text, we should
probably drop one of the "l"s so that it reads "accelerate".

>
>> Per cpu partial caches allows batch freeing of objects to maximize
>> throughput.  However, this can increase the length of time spent
>> holding key locks, which can increase latency spikes with respect
>> to responsiveness.  Select yes unless you are tuning for a realtime
>> oriented system.
>>
>> Also, I believe this will cause a behaviour change for people who
>> just run "make oldconfig" -- since there is no default line.  Meaning
>> that it used to be unconditionally on, but now I think it will be off
>> by default, if people just mindlessly hold down Enter key.
>
> Ok.
>>
>> For RT, we'll want default N if RT_FULL (RT_BASE?) but for mainline,
>> I expect you'll want default Y in order to be consistent with previous
>> behaviour?
>
> I was not sure exactly how to handle that one yet for realtime. So I need
> two different patches?

I don't think you need to worry about realtime -- meaning that I would
guess once the patch exists in mainline, Steve will probably cherry-pick
it onto 3.6.11.x-stable, and then he'd likely add a one-line follow-on
RT-specific patch to set it to off/disabled for RT.  Similar for 3.4.x etc.

>
>> I've not built/booted yet, but I'll follow up if I see anything else in doing
>> that.
>
> Here is an updated patch. I will also send an updated fixup patch.

I'll give these some local testing today.  I've also appended, at the very
bottom of this mail, a rough sketch of how the Kconfig entry might end up
reading with the comments above folded in, plus what that RT follow-on
could look like -- just sketches for discussion, not patches.

Thanks,
Paul.
--

>
>
> Subject: slub: Make cpu partial slab support configurable V2
>
> cpu partial support can introduce level of indeterminism that is not wanted
> in certain context (like a realtime kernel). Make it configurable.
>
> Signed-off-by: Christoph Lameter
>
> Index: linux/include/linux/slub_def.h
> ===================================================================
> --- linux.orig/include/linux/slub_def.h	2013-04-01 10:27:05.908964674 -0500
> +++ linux/include/linux/slub_def.h	2013-04-01 10:27:19.905178531 -0500
> @@ -47,7 +47,9 @@ struct kmem_cache_cpu {
>  	void **freelist;	/* Pointer to next available object */
>  	unsigned long tid;	/* Globally unique transaction id */
>  	struct page *page;	/* The slab from which we are allocating */
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  	struct page *partial;	/* Partially allocated frozen slabs */
> +#endif
>  #ifdef CONFIG_SLUB_STATS
>  	unsigned stat[NR_SLUB_STAT_ITEMS];
>  #endif
> @@ -84,7 +86,9 @@ struct kmem_cache {
>  	int size;		/* The size of an object including meta data */
>  	int object_size;	/* The size of an object without meta data */
>  	int offset;		/* Free pointer offset. */
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  	int cpu_partial;	/* Number of per cpu partial objects to keep around */
> +#endif
>  	struct kmem_cache_order_objects oo;
>
>  	/* Allocation and freeing of slabs */
> Index: linux/mm/slub.c
> ===================================================================
> --- linux.orig/mm/slub.c	2013-04-01 10:27:05.908964674 -0500
> +++ linux/mm/slub.c	2013-04-01 10:27:19.905178531 -0500
> @@ -1531,7 +1531,9 @@ static inline void *acquire_slab(struct
>  	return freelist;
>  }
>
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  static int put_cpu_partial(struct kmem_cache *s, struct page *page, int drain);
> +#endif
>  static inline bool pfmemalloc_match(struct page *page, gfp_t gfpflags);
>
>  /*
> @@ -1570,10 +1572,20 @@ static void *get_partial_node(struct kme
>  			object = t;
>  			available = page->objects - (unsigned long)page->lru.next;
>  		} else {
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  			available = put_cpu_partial(s, page, 0);
>  			stat(s, CPU_PARTIAL_NODE);
> +#else
> +			BUG();
> +#endif
>  		}
> -		if (kmem_cache_debug(s) || available > s->cpu_partial / 2)
> +		if (kmem_cache_debug(s) ||
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
> +				available > s->cpu_partial / 2
> +#else
> +				available > 0
> +#endif
> +			)
>  			break;
>
>  	}
> @@ -1874,6 +1886,7 @@ redo:
>  	}
>  }
>
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  /*
>   * Unfreeze all the cpu partial slabs.
>   *
> @@ -1989,6 +2002,7 @@ static int put_cpu_partial(struct kmem_c
>  	} while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
>  	return pobjects;
>  }
> +#endif
>
>  static inline void flush_slab(struct kmem_cache *s, struct kmem_cache_cpu *c)
>  {
> @@ -2013,7 +2027,9 @@ static inline void __flush_cpu_slab(stru
>  		if (c->page)
>  			flush_slab(s, c);
>
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  		unfreeze_partials(s, c);
> +#endif
>  	}
>  }
>
> @@ -2029,7 +2045,11 @@ static bool has_cpu_slab(int cpu, void *
>  	struct kmem_cache *s = info;
>  	struct kmem_cache_cpu *c = per_cpu_ptr(s->cpu_slab, cpu);
>
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  	return c->page || c->partial;
> +#else
> +	return c->page;
> +#endif
>  }
>
>  static void flush_all(struct kmem_cache *s)
> @@ -2225,7 +2245,10 @@ static void *__slab_alloc(struct kmem_ca
>  	page = c->page;
>  	if (!page)
>  		goto new_slab;
> +
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  redo:
> +#endif
>
>  	if (unlikely(!node_match(page, node))) {
>  		stat(s, ALLOC_NODE_MISMATCH);
> @@ -2278,6 +2301,7 @@ load_freelist:
>
>  new_slab:
>
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  	if (c->partial) {
>  		page = c->page = c->partial;
>  		c->partial = page->next;
> @@ -2285,6 +2309,7 @@ new_slab:
>  		c->freelist = NULL;
>  		goto redo;
>  	}
> +#endif
>
>  	freelist = new_slab_objects(s, gfpflags, node, &c);
>
> @@ -2491,6 +2516,7 @@ static void __slab_free(struct kmem_cach
>  		new.inuse--;
>  		if ((!new.inuse || !prior) && !was_frozen) {
>
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  			if (!kmem_cache_debug(s) && !prior)
>
>  				/*
> @@ -2499,7 +2525,9 @@ static void __slab_free(struct kmem_cach
>  				 */
>  				new.frozen = 1;
>
> -			else { /* Needs to be taken off a list */
> +			else
> +#endif
> +			{ /* Needs to be taken off a list */
>
>  				n = get_node(s, page_to_nid(page));
>  				/*
> @@ -2521,6 +2549,7 @@ static void __slab_free(struct kmem_cach
>  		"__slab_free"));
>
>  	if (likely(!n)) {
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>
>  		/*
>  		 * If we just froze the page then put it onto the
> @@ -2530,6 +2559,7 @@ static void __slab_free(struct kmem_cach
>  			put_cpu_partial(s, page, 1);
>  			stat(s, CPU_PARTIAL_FREE);
>  		}
> +#endif
>  		/*
>  		 * The list lock was not taken therefore no list
>  		 * activity can be necessary.
> @@ -3036,7 +3066,7 @@ static int kmem_cache_open(struct kmem_c
>  	 * list to avoid pounding the page allocator excessively.
>  	 */
>  	set_min_partial(s, ilog2(s->size) / 2);
> -
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  	/*
>  	 * cpu_partial determined the maximum number of objects kept in the
>  	 * per cpu partial lists of a processor.
> @@ -3064,6 +3094,7 @@ static int kmem_cache_open(struct kmem_c
>  		s->cpu_partial = 13;
>  	else
>  		s->cpu_partial = 30;
> +#endif
>
>  #ifdef CONFIG_NUMA
>  	s->remote_node_defrag_ratio = 1000;
> @@ -4424,13 +4455,14 @@ static ssize_t show_slab_objects(struct
>  			total += x;
>  			nodes[node] += x;
>
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  			page = ACCESS_ONCE(c->partial);
>  			if (page) {
>  				x = page->pobjects;
>  				total += x;
>  				nodes[node] += x;
>  			}
> -
> +#endif
>  			per_cpu[node]++;
>  		}
>  	}
> @@ -4583,6 +4615,7 @@ static ssize_t min_partial_store(struct
>  }
>  SLAB_ATTR(min_partial);
>
> +#ifdef CONFIG_CPU_PARTIAL
>  static ssize_t cpu_partial_show(struct kmem_cache *s, char *buf)
>  {
>  	return sprintf(buf, "%u\n", s->cpu_partial);
> @@ -4605,6 +4638,7 @@ static ssize_t cpu_partial_store(struct
>  	return length;
>  }
>  SLAB_ATTR(cpu_partial);
> +#endif
>
>  static ssize_t ctor_show(struct kmem_cache *s, char *buf)
>  {
> @@ -4644,6 +4678,7 @@ static ssize_t objects_partial_show(stru
>  }
>  SLAB_ATTR_RO(objects_partial);
>
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  static ssize_t slabs_cpu_partial_show(struct kmem_cache *s, char *buf)
>  {
>  	int objects = 0;
> @@ -4674,6 +4709,7 @@ static ssize_t slabs_cpu_partial_show(st
>  	return len + sprintf(buf + len, "\n");
>  }
>  SLAB_ATTR_RO(slabs_cpu_partial);
> +#endif
>
>  static ssize_t reclaim_account_show(struct kmem_cache *s, char *buf)
>  {
> @@ -4997,11 +5033,13 @@ STAT_ATTR(DEACTIVATE_BYPASS, deactivate_
>  STAT_ATTR(ORDER_FALLBACK, order_fallback);
>  STAT_ATTR(CMPXCHG_DOUBLE_CPU_FAIL, cmpxchg_double_cpu_fail);
>  STAT_ATTR(CMPXCHG_DOUBLE_FAIL, cmpxchg_double_fail);
> +#ifdef CONFIG_CPU_PARTIAL
>  STAT_ATTR(CPU_PARTIAL_ALLOC, cpu_partial_alloc);
>  STAT_ATTR(CPU_PARTIAL_FREE, cpu_partial_free);
>  STAT_ATTR(CPU_PARTIAL_NODE, cpu_partial_node);
>  STAT_ATTR(CPU_PARTIAL_DRAIN, cpu_partial_drain);
>  #endif
> +#endif
>
>  static struct attribute *slab_attrs[] = {
>  	&slab_size_attr.attr,
> @@ -5009,7 +5047,9 @@ static struct attribute *slab_attrs[] =
>  	&objs_per_slab_attr.attr,
>  	&order_attr.attr,
>  	&min_partial_attr.attr,
> +#ifdef CONFIG_CPU_PARTIAL
>  	&cpu_partial_attr.attr,
> +#endif
>  	&objects_attr.attr,
>  	&objects_partial_attr.attr,
>  	&partial_attr.attr,
> @@ -5022,7 +5062,9 @@ static struct attribute *slab_attrs[] =
>  	&destroy_by_rcu_attr.attr,
>  	&shrink_attr.attr,
>  	&reserved_attr.attr,
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  	&slabs_cpu_partial_attr.attr,
> +#endif
>  #ifdef CONFIG_SLUB_DEBUG
>  	&total_objects_attr.attr,
>  	&slabs_attr.attr,
> @@ -5064,11 +5106,13 @@ static struct attribute *slab_attrs[] =
>  	&order_fallback_attr.attr,
>  	&cmpxchg_double_fail_attr.attr,
>  	&cmpxchg_double_cpu_fail_attr.attr,
> +#ifdef CONFIG_SLUB_CPU_PARTIAL
>  	&cpu_partial_alloc_attr.attr,
>  	&cpu_partial_free_attr.attr,
>  	&cpu_partial_node_attr.attr,
>  	&cpu_partial_drain_attr.attr,
>  #endif
> +#endif
>  #ifdef CONFIG_FAILSLAB
>  	&failslab_attr.attr,
>  #endif
> Index: linux/init/Kconfig
> ===================================================================
> --- linux.orig/init/Kconfig	2013-04-01 10:27:05.908964674 -0500
> +++ linux/init/Kconfig	2013-04-01 10:31:46.497863625 -0500
> @@ -1514,6 +1514,17 @@ config SLOB
>
>  endchoice
>
> +config SLUB_CPU_PARTIAL
> +	default y
> +	depends on SLUB
> +	bool "SLUB per cpu partial cache"
> +	help
> +	  Per cpu partial caches accellerate objects allocation and freeing
> +	  that is local to a processor at the price of more indeterminism
> +	  in the latency of the free. On overflow these caches will be cleared
> +	  which requires the taking of locks that may cause latency spikes.
> +	  Typically one would choose no for a realtime system.
> +
>  config MMAP_ALLOW_UNINITIALIZED
>  	bool "Allow mmapped anonymous memory to be uninitialized"
>  	depends on EXPERT && !MMU
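
Just so we are looking at the same end state, here is roughly how I'd
expect the Kconfig entry to read once the spelling and the "make oldconfig"
default comments above are folded in.  The help text wording is only a
sketch for discussion, not the final text:

config SLUB_CPU_PARTIAL
	default y
	depends on SLUB
	bool "SLUB per cpu partial cache"
	help
	  Per cpu partial caches accelerate allocation and freeing of
	  objects that are local to a processor, at the price of more
	  indeterminism in the latency of the free.  On overflow these
	  caches are flushed, which requires taking locks that may cause
	  latency spikes.  Typically one would choose Y here unless the
	  system is being tuned for realtime response.

With an explicit "default y", people holding down Enter through
"make oldconfig" keep today's behaviour on mainline.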
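
And the RT side could then be a follow-on as small as one line against
that entry, e.g. something like the below.  Note that PREEMPT_RT_FULL is
just my guess at the symbol the RT tree would key off of -- an assumption
on my part, not something settled in this thread:

 config SLUB_CPU_PARTIAL
-	default y
+	# PREEMPT_RT_FULL: assumed RT patch set symbol (sketch only)
+	default y if !PREEMPT_RT_FULL
 	depends on SLUB
 	bool "SLUB per cpu partial cache"

That would leave mainline behaviour unchanged while giving RT the
off-by-default it wants.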