Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754298Ab2HORDs (ORCPT ); Wed, 15 Aug 2012 13:03:48 -0400 Received: from mx2.parallels.com ([64.131.90.16]:49152 "EHLO mx2.parallels.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752085Ab2HORDq (ORCPT ); Wed, 15 Aug 2012 13:03:46 -0400 Message-ID: <502BD5AF.301@parallels.com> Date: Wed, 15 Aug 2012 21:00:31 +0400 From: Glauber Costa User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120717 Thunderbird/14.0 MIME-Version: 1.0 To: Greg Thelen CC: Kamezawa Hiroyuki , , , , , Michal Hocko , Johannes Weiner , Andrew Morton , Christoph Lameter , David Rientjes , Pekka Enberg , Pekka Enberg Subject: Re: [PATCH v2 06/11] memcg: kmem controller infrastructure References: <1344517279-30646-1-git-send-email-glommer@parallels.com> <1344517279-30646-7-git-send-email-glommer@parallels.com> <50254475.4000201@jp.fujitsu.com> <5028BA9E.7000302@parallels.com> <502B6956.5030508@parallels.com> In-Reply-To: Content-Type: text/plain; charset="ISO-8859-1" Content-Transfer-Encoding: 7bit X-Originating-IP: [109.173.1.99] Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2943 Lines: 61 On 08/15/2012 08:38 PM, Greg Thelen wrote: > On Wed, Aug 15 2012, Glauber Costa wrote: > >> On 08/14/2012 10:58 PM, Greg Thelen wrote: >>> On Mon, Aug 13 2012, Glauber Costa wrote: >>> >>>>>>> + WARN_ON(mem_cgroup_is_root(memcg)); >>>>>>> + size = (1 << order) << PAGE_SHIFT; >>>>>>> + memcg_uncharge_kmem(memcg, size); >>>>>>> + mem_cgroup_put(memcg); >>>>> Why do we need ref-counting here ? kmem res_counter cannot work as >>>>> reference ? >>>> This is of course the pair of the mem_cgroup_get() you commented on >>>> earlier. If we need one, we need the other. If we don't need one, we >>>> don't need the other =) >>>> >>>> The guarantee we're trying to give here is that the memcg structure will >>>> stay around while there are dangling charges to kmem, that we decided >>>> not to move (remember: moving it for the stack is simple, for the slab >>>> is very complicated and ill-defined, and I believe it is better to treat >>>> all kmem equally here) >>> >>> By keeping memcg structures hanging around until the last referring kmem >>> page is uncharged do such zombie memcg each consume a css_id and thus >>> put pressure on the 64k css_id space? I imagine in pathological cases >>> this would prevent creation of new cgroups until these zombies are >>> dereferenced. >> >> Yes, but although this patch makes it more likely, it doesn't introduce >> that. If the tasks, for instance, grab a reference to the cgroup dentry >> in the filesystem (like their CWD, etc), they will also keep the cgroup >> around. > > Fair point. But this doesn't seems like a feature. It's probably not > needed initially, but what do you think about creating a > memcg_kernel_context structure which is allocated when memcg is > allocated? Kernel pages charged to a memcg would have > page_cgroup->mem_cgroup=memcg_kernel_context rather than memcg. This > would allow the mem_cgroup and its css_id to be deleted when the cgroup > is unlinked from cgroupfs while allowing for the active kernel pages to > continue pointing to a valid memcg_kernel_context. This would be a > reference counted structure much like you are doing with memcg. When a > memcg is deleted the memcg_kernel_context would be linked into its > surviving parent memcg. This would avoid needing to visit each kernel > page. You need more, you need at the res_counters to stay around as well. And probably other fields. So my fear here is that as you add fields to that structure, you can defeat a bit the goal of reducing memory consumption. Still leaves the css space, yes. But by doing this we can introduce some subtle bugs by having a field in the wrong structure. Did you observe that to be a big problem in your systems? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/