Date: Fri, 1 Apr 2016 13:41:29 +0200
From: Peter Zijlstra
To: Vladimir Davydov
Cc: Andrew Morton, Christoph Lameter, Joonsoo Kim, Pekka Enberg,
	David Rientjes, Johannes Weiner, Michal Hocko,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH -mm v2 3/3] slub: make dead caches discard free slabs immediately
Message-ID: <20160401114129.GR3430@twins.programming.kicks-ass.net>
In-Reply-To: <20160401105539.GA6610@esperanza>

On Fri, Apr 01, 2016 at 01:55:40PM +0300, Vladimir Davydov wrote:
> > > +	if (deactivate) {
> > > +		/*
> > > +		 * Disable empty slabs caching. Used to avoid pinning offline
> > > +		 * memory cgroups by kmem pages that can be freed.
> > > +		 */
> > > +		s->cpu_partial = 0;
> > > +		s->min_partial = 0;
> > > +
> > > +		/*
> > > +		 * s->cpu_partial is checked locklessly (see put_cpu_partial),
> > > +		 * so we have to make sure the change is visible.
> > > +		 */
> > > +		kick_all_cpus_sync();
> > > +	}
> >
> > Argh! what the heck! and without a single mention in the changelog.
>
> This function is only called when a memory cgroup is removed, which is
> rather a rare event. I didn't think it would cause any pain. Sorry.
Suppose you have a bunch of CPUs running HPC/RT code and someone causes
the admin CPUs to create/destroy a few cgroups.

> > Why are you spraying IPIs across the entire machine? Why isn't
> > synchronize_sched() good enough, that would allow you to get rid of
> > the local_irq_save/restore as well.
>
> synchronize_sched() is slower. Calling it for every per memcg kmem
> cache would slow down cleanup on cgroup removal.

Right, but who cares? cgroup removal isn't a fast path by any standard.

> Regarding local_irq_save/restore - synchronize_sched() wouldn't allow
> us to get rid of them, because unfreeze_partials() must be called with
> irqs disabled.

OK, I figured it was because it needed to be serialized against this
kick_all_cpus_sync() IPI.

> Come to think of it, kick_all_cpus_sync() is used as a memory barrier
> here, so as to make sure that after it's finished all cpus will use
> the new ->cpu_partial value, which makes me wonder if we could replace
> it with a simple smp_mb(). I mean, this_cpu_cmpxchg(), which is used
> by put_cpu_partial() to add a page to the per-cpu partial list, must
> issue a full memory barrier (am I correct?), so we have two
> possibilities here:

Nope, this_cpu_cmpxchg() does not imply a memory barrier.