Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752264AbcJCMGx (ORCPT ); Mon, 3 Oct 2016 08:06:53 -0400
Received: from mail-wm0-f67.google.com ([74.125.82.67]:36260 "EHLO
	mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751111AbcJCMGo (ORCPT );
	Mon, 3 Oct 2016 08:06:44 -0400
Date: Mon, 3 Oct 2016 14:06:42 +0200
From: Michal Hocko
To: Vladimir Davydov
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Christoph Lameter, David Rientjes, Johannes Weiner, Joonsoo Kim,
	Pekka Enberg
Subject: Re: [PATCH 1/2] mm: memcontrol: use special workqueue for creating
	per-memcg caches
Message-ID: <20161003120641.GC26768@dhcp22.suse.cz>
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2939
Lines: 80

On Sat 01-10-16 16:56:47, Vladimir Davydov wrote:
> Creating a lot of cgroups at the same time might stall all worker
> threads with kmem cache creation works, because kmem cache creation is
> done with the slab_mutex held. To prevent that from happening, let's use
> a special workqueue for kmem cache creation with max in-flight work
> items equal to 1.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=172981

This looks like a regression but I am not really sure I understand what
has caused it. We have had the WQ based cache creation more or less
since kmem was introduced. So is it 801faf0db894 ("mm/slab: lockless
decision to grow cache"), which was pointed to by bisection, that changed
the timing or relaxed the cache creation to the point that would allow
this runaway? This would be really useful for the stable backport
consideration.

Also, if I understand the fix correctly, we now limit the number of
workers to 1 thread. Is this really what we want?
Wouldn't it be possible that a few memcgs could starve others from having
their cache created? What would be the result, missed charges?

> Signed-off-by: Vladimir Davydov
> Reported-by: Doug Smythies
> Cc: Christoph Lameter
> Cc: David Rientjes
> Cc: Johannes Weiner
> Cc: Joonsoo Kim
> Cc: Michal Hocko
> Cc: Pekka Enberg
> ---
>  mm/memcontrol.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 4be518d4e68a..c1efe59e3a20 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2175,6 +2175,8 @@ struct memcg_kmem_cache_create_work {
>  	struct work_struct work;
>  };
>
> +static struct workqueue_struct *memcg_kmem_cache_create_wq;
> +
>  static void memcg_kmem_cache_create_func(struct work_struct *w)
>  {
>  	struct memcg_kmem_cache_create_work *cw =
> @@ -2206,7 +2208,7 @@ static void __memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
>  	cw->cachep = cachep;
>  	INIT_WORK(&cw->work, memcg_kmem_cache_create_func);
>
> -	schedule_work(&cw->work);
> +	queue_work(memcg_kmem_cache_create_wq, &cw->work);
>  }
>
>  static void memcg_schedule_kmem_cache_create(struct mem_cgroup *memcg,
> @@ -5794,6 +5796,17 @@ static int __init mem_cgroup_init(void)
>  {
>  	int cpu, node;
>
> +#ifndef CONFIG_SLOB
> +	/*
> +	 * Kmem cache creation is mostly done with the slab_mutex held,
> +	 * so use a special workqueue to avoid stalling all worker
> +	 * threads in case lots of cgroups are created simultaneously.
> +	 */
> +	memcg_kmem_cache_create_wq =
> +		alloc_workqueue("memcg_kmem_cache_create", 0, 1);
> +	BUG_ON(!memcg_kmem_cache_create_wq);
> +#endif
> +
>  	hotcpu_notifier(memcg_cpu_hotplug_callback, 0);
>
>  	for_each_possible_cpu(cpu)
> --
> 2.1.4

-- 
Michal Hocko
SUSE Labs