From: Vladimir Davydov
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim
Subject: [PATCH -mm 0/8] memcg: reuse per cgroup kmem caches
Date: Mon, 3 Nov 2014 23:59:38 +0300

Hi,

Currently, each kmem-active memory cgroup has its own set of kmem caches. The caches are used only by the memory cgroup they were created for, so when the cgroup is taken offline they must be destroyed. However, we can't easily destroy all the caches on css offline, because they may still contain objects accounted to the cgroup. In fact, we don't bother destroying busy caches on css offline at all, effectively leaking them.

To make this scheme work as intended, we would have to introduce some kind of asynchronous cache destruction, which would be quite complex, because we would have to handle a lot of race conditions. And even if we managed to solve them all, kmem caches created for memory cgroups that are now dead would dangle indefinitely, wasting memory.

In this patch set I implement a different approach, which can be described by the following statements:

 1. Never destroy per memcg kmem caches (except when the root cache is
    destroyed, of course).
 2. Reuse the kmemcg_id, and therefore the set of per memcg kmem caches,
    left behind by a dead memory cgroup.
 3. After allocating a kmem object, check whether the slab it came from is
    accounted to the proper (i.e. current) memory cgroup. If it is not,
    recharge it (a rough sketch of this check follows below).

The benefits are:

 - It's much simpler than what we have now, even though the current
   implementation is incomplete.

 - The number of per cgroup caches of the same kind cannot be greater than
   the maximal number of kmem-active memory cgroups that have ever been
   online simultaneously. Currently it is unlimited, which is really bad.

 - Once a new memory cgroup starts using a cache previously used by a dead
   cgroup, it will recharge the slabs accounted to the dead cgroup as it
   allocates objects from the cache. Therefore all references to the old
   cgroup will be dropped sooner or later, and it will be freed. Currently,
   cgroups that still have kmem objects accounted to them on css offline
   leak for good.
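To make point 3 above a bit more concrete, here is a minimal userspace model of the check-and-recharge step. All of the types and names in it (struct memcg, struct slab_page, recharge_slab_page, and so on) are made up for illustration only and do not correspond to the kernel interfaces the patches actually modify; in particular, the real code also has to handle charging failures and races that this sketch ignores.

/*
 * Standalone model of the "recharge on allocation" idea (point 3).
 * Hypothetical types and names, for illustration only.
 */
#include <stdio.h>

struct memcg {
	const char *name;
	long kmem_charged;	/* bytes of kmem accounted to this cgroup */
};

struct slab_page {
	struct memcg *owner;	/* cgroup the page is currently charged to */
	long size;		/* page size, in bytes */
};

/* Move the charge for @page from its current owner to @curr. */
static void recharge_slab_page(struct slab_page *page, struct memcg *curr)
{
	page->owner->kmem_charged -= page->size;
	curr->kmem_charged += page->size;
	page->owner = curr;
}

/*
 * Allocation path: after taking an object from @page, check whether the
 * page is charged to the allocating cgroup and recharge it if it is not.
 */
static void alloc_from_page(struct slab_page *page, struct memcg *curr)
{
	if (page->owner != curr)
		recharge_slab_page(page, curr);
}

int main(void)
{
	struct memcg dead = { "dead-cgroup", 4096 };
	struct memcg live = { "live-cgroup", 0 };
	/* slab page left over from the dead cgroup's cache */
	struct slab_page page = { &dead, 4096 };

	alloc_from_page(&page, &live);

	printf("%s: %ld bytes, %s: %ld bytes\n",
	       dead.name, dead.kmem_charged, live.name, live.kmem_charged);
	return 0;
}

The point is simply that the charge follows the allocating cgroup: once a live cgroup starts allocating from a cache inherited from a dead one, every slab page it touches is uncharged from the dead cgroup and charged to the live one, so the dead cgroup's remaining references eventually go away.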
This patch set is based on v3.18-rc2-mmotm-2014-10-29-14-19 with the following patches by Johannes applied on top:

 [patch] mm: memcontrol: remove stale page_cgroup_lock comment
 [patch 1/3] mm: embed the memcg pointer directly into struct page
 [patch 2/3] mm: page_cgroup: rename file to mm/swap_cgroup.c
 [patch 3/3] mm: move page->mem_cgroup bad page handling into generic code

Thanks,

Vladimir Davydov (8):
  memcg: do not destroy kmem caches on css offline
  slab: charge slab pages to the current memory cgroup
  memcg: decouple per memcg kmem cache from the owner memcg
  memcg: zap memcg_{un}register_cache
  memcg: free kmem cache id on css offline
  memcg: introduce memcg_kmem_should_charge helper
  slab: introduce slab_free helper
  slab: recharge slab pages to the allocating memory cgroup

 include/linux/memcontrol.h |   63 ++++++-----
 include/linux/slab.h       |   12 +-
 mm/memcontrol.c            |  260 ++++++++++++++------------------------
 mm/slab.c                  |   62 +++++++----
 mm/slab.h                  |   28 -----
 mm/slab_common.c           |   66 ++++++++---
 mm/slub.c                  |   26 +++--
 7 files changed, 228 insertions(+), 289 deletions(-)

--
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/