Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757203Ab2HTPc5 (ORCPT ); Mon, 20 Aug 2012 11:32:57 -0400 Received: from mx2.parallels.com ([64.131.90.16]:35007 "EHLO mx2.parallels.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757160Ab2HTPcw (ORCPT ); Mon, 20 Aug 2012 11:32:52 -0400 Message-ID: <503257DD.50709@parallels.com> Date: Mon, 20 Aug 2012 19:29:33 +0400 From: Glauber Costa User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120717 Thunderbird/14.0 MIME-Version: 1.0 To: Kamezawa Hiroyuki CC: Greg Thelen , , , , , Michal Hocko , Johannes Weiner , Andrew Morton , Christoph Lameter , David Rientjes , Pekka Enberg , Pekka Enberg Subject: Re: [PATCH v2 06/11] memcg: kmem controller infrastructure References: <1344517279-30646-1-git-send-email-glommer@parallels.com> <1344517279-30646-7-git-send-email-glommer@parallels.com> <50254475.4000201@jp.fujitsu.com> <5028BA9E.7000302@parallels.com> <502B6956.5030508@parallels.com> <502BD5AF.301@parallels.com> <50323D50.8070307@jp.fujitsu.com> In-Reply-To: <50323D50.8070307@jp.fujitsu.com> Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3497 Lines: 81 On 08/20/2012 05:36 PM, Kamezawa Hiroyuki wrote: > (2012/08/16 2:00), Glauber Costa wrote: >> On 08/15/2012 08:38 PM, Greg Thelen wrote: >>> On Wed, Aug 15 2012, Glauber Costa wrote: >>> >>>> On 08/14/2012 10:58 PM, Greg Thelen wrote: >>>>> On Mon, Aug 13 2012, Glauber Costa wrote: >>>>> >>>>>>>>> + WARN_ON(mem_cgroup_is_root(memcg)); >>>>>>>>> + size = (1 << order) << PAGE_SHIFT; >>>>>>>>> + memcg_uncharge_kmem(memcg, size); >>>>>>>>> + mem_cgroup_put(memcg); >>>>>>> Why do we need ref-counting here ? kmem res_counter cannot work as >>>>>>> reference ? >>>>>> This is of course the pair of the mem_cgroup_get() you commented on >>>>>> earlier. If we need one, we need the other. If we don't need one, we >>>>>> don't need the other =) >>>>>> >>>>>> The guarantee we're trying to give here is that the memcg >>>>>> structure will >>>>>> stay around while there are dangling charges to kmem, that we decided >>>>>> not to move (remember: moving it for the stack is simple, for the >>>>>> slab >>>>>> is very complicated and ill-defined, and I believe it is better to >>>>>> treat >>>>>> all kmem equally here) >>>>> >>>>> By keeping memcg structures hanging around until the last referring >>>>> kmem >>>>> page is uncharged do such zombie memcg each consume a css_id and thus >>>>> put pressure on the 64k css_id space? I imagine in pathological cases >>>>> this would prevent creation of new cgroups until these zombies are >>>>> dereferenced. >>>> >>>> Yes, but although this patch makes it more likely, it doesn't introduce >>>> that. If the tasks, for instance, grab a reference to the cgroup dentry >>>> in the filesystem (like their CWD, etc), they will also keep the cgroup >>>> around. >>> >>> Fair point. But this doesn't seems like a feature. It's probably not >>> needed initially, but what do you think about creating a >>> memcg_kernel_context structure which is allocated when memcg is >>> allocated? Kernel pages charged to a memcg would have >>> page_cgroup->mem_cgroup=memcg_kernel_context rather than memcg. This >>> would allow the mem_cgroup and its css_id to be deleted when the cgroup >>> is unlinked from cgroupfs while allowing for the active kernel pages to >>> continue pointing to a valid memcg_kernel_context. This would be a >>> reference counted structure much like you are doing with memcg. When a >>> memcg is deleted the memcg_kernel_context would be linked into its >>> surviving parent memcg. This would avoid needing to visit each kernel >>> page. >> >> You need more, you need at the res_counters to stay around as well. And >> probably other fields. >> >> So my fear here is that as you add fields to that structure, you can >> defeat a bit the goal of reducing memory consumption. Still leaves the >> css space, yes. But by doing this we can introduce some subtle bugs by >> having a field in the wrong structure. >> > > Hm, can't we free css_id and delete css structure from the css_id idr tree > when a memcg goes zombie ? > > Thanks, > -Kame Kame, I wrote a patch that does exactly that. Can you take a look? (I posted it already) I actually need to go back to it, because greg seems to be right saying that that will break things for memsw. But a simplified version may work. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/