Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755219AbZAICcW (ORCPT ); Thu, 8 Jan 2009 21:32:22 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751315AbZAICcO (ORCPT ); Thu, 8 Jan 2009 21:32:14 -0500 Received: from TYO202.gate.nec.co.jp ([202.32.8.206]:48579 "EHLO tyo202.gate.nec.co.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752048AbZAICcN (ORCPT ); Thu, 8 Jan 2009 21:32:13 -0500 Date: Fri, 9 Jan 2009 11:29:22 +0900 From: Daisuke Nishimura To: KAMEZAWA Hiroyuki Cc: nishimura@mxp.nes.nec.co.jp, linux-mm@kvack.org, linux-kernel@vger.kernel.org, balbir@linux.vnet.ibm.com, lizf@cn.fujitsu.com, menage@google.com Subject: Re: [RFC][PATCH 4/4] memcg: make oom less frequently Message-Id: <20090109112922.68881c05.nishimura@mxp.nes.nec.co.jp> In-Reply-To: <20090109110358.8a0d991a.kamezawa.hiroyu@jp.fujitsu.com> References: <20090108190818.b663ce20.nishimura@mxp.nes.nec.co.jp> <20090108191520.df9c1d92.nishimura@mxp.nes.nec.co.jp> <44480.10.75.179.62.1231413588.squirrel@webmail-b.css.fujitsu.com> <20090109104416.9bf4aab7.nishimura@mxp.nes.nec.co.jp> <20090109110358.8a0d991a.kamezawa.hiroyu@jp.fujitsu.com> Organization: NEC Soft, Ltd. X-Mailer: Sylpheed 2.4.8 (GTK+ 2.10.14; i686-pc-mingw32) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4177 Lines: 118 On Fri, 9 Jan 2009 11:03:58 +0900, KAMEZAWA Hiroyuki wrote: > On Fri, 9 Jan 2009 10:44:16 +0900 > Daisuke Nishimura wrote: > > > > To handle live-lock situation as "reclaimed memory is stolen very soon", > > > should we check signal_pending(current) or some flags ? > > > > > > IMHO, using jiffies to detect how long we should retry is easy to understand > > > ....like > > > "if memory charging cannot make progress for XXXX minutes, > > > trigger some notifier or show some flag to user via cgroupfs interface. > > > to show we're tooooooo busy." > > > > > Good Idea. > > > > But I think it would be enough for now to check signal_pending(curren) and > > return -ENOMEM. > > > > How about this one? > > Hmm, looks much simpler. > > > === > > From: Daisuke Nishimura > > > > In previous implementation, mem_cgroup_try_charge checked the return > > value of mem_cgroup_try_to_free_pages, and just retried if some pages > > had been reclaimed. > > But now, try_charge(and mem_cgroup_hierarchical_reclaim called from it) > > only checks whether the usage is less than the limit. > > > > This patch tries to change the behavior as before to cause oom less frequently. > > > > > > Signed-off-by: Daisuke Nishimura > > --- > > mm/memcontrol.c | 14 ++++++++++---- > > 1 files changed, 10 insertions(+), 4 deletions(-) > > > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > > index dc38a0e..2ab0a5c 100644 > > --- a/mm/memcontrol.c > > +++ b/mm/memcontrol.c > > @@ -770,10 +770,10 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem, > > * but there might be left over accounting, even after children > > * have left. > > */ > > - ret = try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap, > > + ret += try_to_free_mem_cgroup_pages(root_mem, gfp_mask, noswap, > > get_swappiness(root_mem)); > > if (mem_cgroup_check_under_limit(root_mem)) > > - return 0; > > + return 1; /* indicate reclaim has succeeded */ > > if (!root_mem->use_hierarchy) > > return ret; > > > > @@ -784,10 +784,10 @@ static int mem_cgroup_hierarchical_reclaim(struct mem_cgroup *root_mem, > > next_mem = mem_cgroup_get_next_node(root_mem); > > continue; > > } > > - ret = try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap, > > + ret += try_to_free_mem_cgroup_pages(next_mem, gfp_mask, noswap, > > get_swappiness(next_mem)); > > if (mem_cgroup_check_under_limit(root_mem)) > > - return 0; > > + return 1; /* indicate reclaim has succeeded */ > > next_mem = mem_cgroup_get_next_node(root_mem); > > } > > return ret; > > @@ -870,8 +870,13 @@ static int __mem_cgroup_try_charge(struct mm_struct *mm, > > if (!(gfp_mask & __GFP_WAIT)) > > goto nomem; > > > > + if (signal_pending(current)) > > + goto oom; > > + > > I think it's better to avoid to add this check *now*. and "signal is pending" > doesn't mean oom situation. > hmm.. charge is assumed to return 0 or -ENOMEM, what should we return on signal_pending case ? In case of shmem for example, if charge at shmem_getpage fails by -ENOMEM, shmem_fault returns VM_FAULT_OOM, so pagefault_out_of_memory would be called. If memcg had not invoked oom-killer, system wide oom would be invoked. > Hmm..Maybe we can tell "please retry page fault again, it's too long latency in > memory reclaim and you received signal." in future. > OK. > IMHO, only quick path which we can add here now is > == > if (test_thread_flag(TIG_MEMDIE)) { /* This thread is killed by OOM */ > *memcg = NULL; > return 0; > } > == > like this. > > Anyway, please discuss this "quick exit path" in other patch and just remove > siginal check. > > Other part looks ok to me. > Thanks :) I'll update this one by removing the signal_pendign check. Thanks, Daisuke Nishimura. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/