Date: Fri, 9 Jan 2009 10:44:16 +0900
From: Daisuke Nishimura
To: "KAMEZAWA Hiroyuki"
Cc: nishimura@mxp.nes.nec.co.jp, linux-mm@kvack.org, linux-kernel@vger.kernel.org, balbir@linux.vnet.ibm.com, lizf@cn.fujitsu.com, menage@google.com
Subject: Re: [RFC][PATCH 4/4] memcg: make oom less frequently
Message-Id: <20090109104416.9bf4aab7.nishimura@mxp.nes.nec.co.jp>
In-Reply-To: <44480.10.75.179.62.1231413588.squirrel@webmail-b.css.fujitsu.com>
References: <20090108190818.b663ce20.nishimura@mxp.nes.nec.co.jp> <20090108191520.df9c1d92.nishimura@mxp.nes.nec.co.jp> <44480.10.75.179.62.1231413588.squirrel@webmail-b.css.fujitsu.com>
Organization: NEC Soft, Ltd.

On Thu, 8 Jan 2009 20:19:48 +0900 (JST), "KAMEZAWA Hiroyuki" wrote:
> Daisuke Nishimura said:
> > In previous implementation, mem_cgroup_try_charge checked the return
> > value of mem_cgroup_try_to_free_pages, and just retried if some pages
> > had been reclaimed.
> > But now, try_charge(and mem_cgroup_hierarchical_reclaim called from it)
> > only checks whether the usage is less than the limit.
> >
> > This patch tries to change the behavior as before to cause oom less
> > frequently.
> >
> > To prevent try_charge from getting stuck in infinite loop,
> > MEM_CGROUP_RECLAIM_RETRIES_MAX is defined.
> >
> >
> > Signed-off-by: Daisuke Nishimura
>
> I think this is necessary change.
> My version of hierarchy reclaim will do this.
>
> But RETRIES_MAX is not clear ;) please use one counter.
>
> And why MAX=32 ?
I inserted a printk and counted the loop iterations on oom (tested with 4 children).
It seemed 32 would be enough.

> > + if (ret)
> > + continue;
> seems to do enough work.
>
> While memory can be reclaimed, it's not dead lock.
I see.
I introduced this max count because mmap_sem might be held for a long time
at page fault, but this is not a "dead" lock, as you say.

> To handle live-lock situation as "reclaimed memory is stolen very soon",
> should we check signal_pending(current) or some flags ?
>
> IMHO, using jiffies to detect how long we should retry is easy to understand
> ....like
> "if memory charging cannot make progress for XXXX minutes,
> trigger some notifier or show some flag to user via cgroupfs interface.
> to show we're tooooooo busy."
>
Good idea.

But I think it would be enough for now to check signal_pending(current) and
return -ENOMEM.

How about this one?
===
From: Daisuke Nishimura

In the previous implementation, mem_cgroup_try_charge checked the return
value of mem_cgroup_try_to_free_pages, and just retried if some pages
had been reclaimed.
But now, try_charge (and mem_cgroup_hierarchical_reclaim called from it)
only checks whether the usage is less than the limit.

This patch restores the previous behavior so that oom is triggered less
frequently.

Signed-off-by: Daisuke Nishimura
---
 mm/memcontrol.c |   14 ++++++++++----
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index dc38a0e..2ab0a5c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -770,10 +770,10 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
 	 * but there might be left over accounting, even after children
 	 * have left.
 	 */
-	ret = try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap,
+	ret += try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap,
 					   get_swappiness(root_mem));
 	if (mem_cgroup_check_under_limit(root_mem))
-		return 0;
+		return 1;	/* indicate reclaim has succeeded */
 	if (!root_mem->use_hierarchy)
 		return ret;

@@ -784,10 +784,10 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem,
 			next_mem = mem_cgroup_get_next_node(root_mem);
 			continue;
 		}
-		ret = try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap,
+		ret += try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap,
 						   get_swappiness(next_mem));
 		if (mem_cgroup_check_under_limit(root_mem))
-			return 0;
+			return 1;	/* indicate reclaim has succeeded */
 		next_mem = mem_cgroup_get_next_node(root_mem);
 	}
 	return ret;
@@ -870,8 +870,13 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 		if (!(gfp_mask & __GFP_WAIT))
 			goto nomem;

+		if (signal_pending(current))
+			goto oom;
+
 		ret = mem_cgroup_hierarchical_reclaim(mem_over_limit, gfp_mask,
 							noswap);
+		if (ret)
+			continue;

 		/*
 		 * try_to_free_mem_cgroup_pages() might not give us a full
@@ -885,6 +890,7 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm,
 			continue;

 		if (!nr_retries--) {
+oom:
 			if (oom) {
 				mutex_lock(&memcg_tasklist);
 				mem_cgroup_out_of_memory(mem_over_limit, gfp_mask);
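
For reference, the retry policy the patch is aiming at can be modelled in
user space. This is only a rough sketch and not kernel code: struct
model_cgroup, model_reclaim(), model_try_charge() and the RECLAIM_RETRIES
value of 5 are all hypothetical names and numbers made up for illustration.
It shows three things: a reclaim pass that frees something causes an
immediate retry without consuming the retry budget, a pass that frees
nothing consumes one retry, and a pending signal short-circuits to the
failure path.

```c
#include <assert.h>
#include <stdbool.h>

#define RECLAIM_RETRIES 5	/* assumed budget, chosen only for this sketch */

/* Hypothetical stand-in for a memory cgroup; all counts are in pages. */
struct model_cgroup {
	long usage;		/* pages currently charged */
	long limit;		/* maximum pages allowed */
	long reclaimable;	/* pages a reclaim pass is still able to free */
};

/* Free up to 'batch' pages; return the number freed (0 means no progress). */
static long model_reclaim(struct model_cgroup *cg, long batch)
{
	long freed;

	assert(batch > 0);
	freed = cg->reclaimable < batch ? cg->reclaimable : batch;
	cg->reclaimable -= freed;
	cg->usage -= freed;
	return freed;
}

/* Return 0 if the charge fits, -1 where the model would take the oom path. */
static int model_try_charge(struct model_cgroup *cg, long pages, long batch,
			    bool signal_pending)
{
	int nr_retries = RECLAIM_RETRIES;

	while (cg->usage + pages > cg->limit) {
		if (signal_pending)
			return -1;	/* give up early, like "goto oom" */

		if (model_reclaim(cg, batch) > 0)
			continue;	/* progress: retry, budget untouched */

		if (!nr_retries--)
			return -1;	/* no progress and budget exhausted */
	}

	cg->usage += pages;
	return 0;
}
```

With usage == limit == 100, 50 reclaimable pages and a batch of 1, charging
10 pages takes ten reclaim passes, which is more than RECLAIM_RETRIES, yet
it still succeeds because every pass makes progress. With nothing
reclaimable, the same charge fails once the budget runs out.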