Date: Fri, 12 Oct 2012 10:39:45 +0200
From: Michal Hocko
To: Glauber Costa
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton,
	Mel Gorman, Suleiman Souhlal, Tejun Heo, cgroups@vger.kernel.org,
	kamezawa.hiroyu@jp.fujitsu.com, Johannes Weiner, Greg Thelen,
	devel@openvz.org, Frederic Weisbecker, Christoph Lameter, Pekka Enberg
Subject: Re: [PATCH v4 06/14] memcg: kmem controller infrastructure
Message-ID: <20121012083944.GD10110@dhcp22.suse.cz>
References: <1349690780-15988-1-git-send-email-glommer@parallels.com>
	<1349690780-15988-7-git-send-email-glommer@parallels.com>
	<20121011124212.GC29295@dhcp22.suse.cz>
	<5077CAAA.3090709@parallels.com>
In-Reply-To: <5077CAAA.3090709@parallels.com>

On Fri 12-10-12 11:45:46, Glauber Costa wrote:
> On 10/11/2012 04:42 PM, Michal Hocko wrote:
> > On Mon 08-10-12 14:06:12, Glauber Costa wrote:
[...]
> >> +	/*
> >> +	 * Conditions under which we can wait for the oom_killer.
> >> +	 * __GFP_NORETRY should be masked by __mem_cgroup_try_charge,
> >> +	 * but there is no harm in being explicit here
> >> +	 */
> >> +	may_oom = (gfp & __GFP_WAIT) && !(gfp & __GFP_NORETRY);
> >
> > Well we _have to_ check __GFP_NORETRY here because if we don't then we
> > can end up in OOM.
> > mem_cgroup_do_charge returns CHARGE_NOMEM for
> > __GFP_NORETRY (without doing any reclaim) and if oom==true we decrement
> > oom retries counter and eventually hit OOM killer. So the comment is
> > misleading.
>
> I will update. What I understood from your last message is that we don't
> really need to, because try_charge will do it.

IIRC I just said it couldn't happen before because migration doesn't go
through charge and thp disables oom by default.

> >> +
> >> +	_memcg = memcg;
> >> +	ret = __mem_cgroup_try_charge(NULL, gfp, size >> PAGE_SHIFT,
> >> +				      &_memcg, may_oom);
> >> +
> >> +	if (!ret) {
> >> +		ret = res_counter_charge(&memcg->kmem, size, &fail_res);
> >
> > Now that I'm thinking about the charging ordering we should charge the
> > kmem first because we would like to hit kmem limit before we hit u+k
> > limit, don't we?
> > Say that you have kmem limit 10M and the total limit 50M. Current `u'
> > would be 40M and this charge would cause kmem to hit the `k' limit. I
> > think we should fail to charge kmem before we go to u+k and potentially
> > reclaim/oom.
> > Or has this been already discussed and I just do not remember?
> >
> This has never been discussed as far as I remember. We charged u first
> since day0, and you are so far the first one to raise it...
>
> One of the things in favor of charging 'u' first is that
> mem_cgroup_try_charge is already equipped to make a lot of decisions,
> like when to allow reclaim, when to bypass charges, and it would be good
> if we can reuse all that.

Hmm, I think that we should avoid those decisions if the kmem
charge would fail anyway (especially now when we do not have targeted
slab reclaim).

> Your oom-based argument makes some sense, if all other scenarios are
> unchanged by this, I can change it. I will give this some more
> consideration.
>
[...]
> > 	/*
> > 	 * Keep reference on memcg while the page is charged to prevent
> > 	 * group from vanishing because allocation can outlive their
> > 	 * tasks.
> > 	 * The reference is dropped in __memcg_kmem_uncharge_page
> > 	 */
> >
> > please
>
> I can do that, but keep in mind this piece of code is going away soon =)

Yes I have noticed that and replied to myself that it is not necessary.

-- 
Michal Hocko
SUSE Labs
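
For readers following the charge-ordering argument above, here is a minimal
userspace sketch of the 10M/50M example. It is not kernel code: `struct counter`,
`struct group`, `make_group()` and the two `charge_*_first()` helpers are
hypothetical stand-ins for the res_counter pair, and reclaim is modelled (an
assumption for illustration only) as one pass that frees exactly the requested
amount of user memory.

```c
#include <assert.h>
#include <errno.h>

#define MB (1024UL * 1024UL)

/* Hypothetical stand-in for a res_counter: only usage and limit. */
struct counter {
	unsigned long usage;
	unsigned long limit;
};

struct group {
	struct counter total;		/* u+k */
	struct counter kmem;		/* k   */
	unsigned int reclaim_attempts;	/* how often "reclaim" ran */
};

int counter_charge(struct counter *c, unsigned long size)
{
	if (c->usage + size > c->limit)
		return -ENOMEM;
	c->usage += size;
	return 0;
}

void counter_uncharge(struct counter *c, unsigned long size)
{
	assert(c->usage >= size);
	c->usage -= size;
}

/* Charge u+k; on failure run one modelled reclaim pass and retry once. */
int charge_total(struct group *g, unsigned long size)
{
	if (!counter_charge(&g->total, size))
		return 0;
	/* Assumption: reclaim frees exactly `size` bytes of user memory. */
	g->reclaim_attempts++;
	counter_uncharge(&g->total, size);
	return counter_charge(&g->total, size);
}

/* Ordering in the patch under review: u+k first, then kmem. */
int charge_total_first(struct group *g, unsigned long size)
{
	int ret = charge_total(g, size);

	if (ret)
		return ret;
	ret = counter_charge(&g->kmem, size);
	if (ret)
		counter_uncharge(&g->total, size);	/* roll back */
	return ret;
}

/* Ordering suggested in the reply: kmem first, then u+k. */
int charge_kmem_first(struct group *g, unsigned long size)
{
	int ret = counter_charge(&g->kmem, size);

	if (ret)
		return ret;	/* fail before any reclaim is attempted */
	ret = charge_total(g, size);
	if (ret)
		counter_uncharge(&g->kmem, size);	/* roll back */
	return ret;
}

/* The example from the thread: k at its 10M limit, u = 40M, u+k limit 50M. */
struct group make_group(void)
{
	struct group g = {
		.total = { .usage = 50 * MB, .limit = 50 * MB },
		.kmem  = { .usage = 10 * MB, .limit = 10 * MB },
		.reclaim_attempts = 0,
	};
	return g;
}
```

Calling both helpers on a fresh `make_group()` with a one-page charge fails
with -ENOMEM either way. The difference in this model is that the u+k-first
ordering has already run a reclaim pass against user memory by the time the
kmem limit rejects the charge, while the kmem-first ordering fails without
touching reclaim at all.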