Date: Wed, 17 Feb 2010 17:51:03 +0900
From: KAMEZAWA Hiroyuki
To: Nick Piggin
Cc: "linux-mm@kvack.org", "linux-kernel@vger.kernel.org",
    "balbir@linux.vnet.ibm.com", "nishimura@mxp.nes.nec.co.jp",
    rientjes@google.com, "akpm@linux-foundation.org"
Subject: Re: [PATCH] memcg: handle panic_on_oom=always case
Message-Id: <20100217175103.51ce01b5.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20100217084526.GP5723@laptop>
References: <20100217150445.1a40201d.kamezawa.hiroyu@jp.fujitsu.com>
    <20100217084526.GP5723@laptop>
Organization: FUJITSU Co. LTD.

On Wed, 17 Feb 2010 19:45:26 +1100
Nick Piggin wrote:

> On Wed, Feb 17, 2010 at 03:04:45PM +0900, KAMEZAWA Hiroyuki wrote:
> > tested on mmotm-Feb11.
> >
> > Balbir-san, Nishimura-san, I want review from both of you.
> >
> > ==
> >
> > From: KAMEZAWA Hiroyuki
> >
> > Now, if panic_on_oom=2, the whole system panics even if the oom happened
> > in some special situation (as cpuset, mempolicy....).
> > Then, panic_on_oom=2 means panic_on_oom_always.
> >
> > Now, memcg doesn't check panic_on_oom flag. This patch adds a check.
> >
> > Maybe someone doubts how it's useful. kdump+panic_on_oom=2 is the
> > last tool to investigate what happens in oom-ed system. If a task is killed,
> > the system recovers and used memory were freed, there will be few hint
> > to know what happens. In mission critical system, oom should never happen.
> > Then, investigation after OOM is very important.
> > Then, panic_on_oom=2+kdump is useful to avoid next OOM by knowing
> > precise information via snapshot.
>
> No I don't doubt it is useful, and I think this probably is the simplest
> and most useful semantic. So thanks for doing this.
>
Thank you for the review.

> I hate to pick nits in a trivial patch but I will anyway:
>
> >
> > TODO:
> >  - For memcg, it's for isolate system's memory usage, oom-notifier and
> >    freeze_at_oom (or rest_at_oom) should be implemented. Then, management
> >    daemon can do similar jobs (as kdump) in safer way or taking snapshot
> >    per cgroup.
> >
> > CC: Balbir Singh
> > CC: Daisuke Nishimura
> > CC: David Rientjes
> > Signed-off-by: KAMEZAWA Hiroyuki
> > ---
> >  Documentation/cgroups/memory.txt |    2 ++
> >  Documentation/sysctl/vm.txt      |    5 ++++-
> >  mm/oom_kill.c                    |    2 ++
> >  3 files changed, 8 insertions(+), 1 deletion(-)
> >
> > Index: mmotm-2.6.33-Feb11/Documentation/cgroups/memory.txt
> > ===================================================================
> > --- mmotm-2.6.33-Feb11.orig/Documentation/cgroups/memory.txt
> > +++ mmotm-2.6.33-Feb11/Documentation/cgroups/memory.txt
> > @@ -182,6 +182,8 @@ list.
> >  NOTE: Reclaim does not work for the root cgroup, since we cannot set any
> >  limits on the root cgroup.
> >
> > +Note2: When panic_on_oom is set to "2", the whole system will panic.
> > +
>
> Maybe:
>
> NOTE2: When panic_on_oom is set to "2", the whole system will panic in
> case of an oom event in any cgroup.
>
ok.

> >  2. Locking
> >
> >  The memory controller uses the following hierarchy
> > Index: mmotm-2.6.33-Feb11/Documentation/sysctl/vm.txt
> > ===================================================================
> > --- mmotm-2.6.33-Feb11.orig/Documentation/sysctl/vm.txt
> > +++ mmotm-2.6.33-Feb11/Documentation/sysctl/vm.txt
> > @@ -573,11 +573,14 @@ Because other nodes' memory may be free.
> >  may be not fatal yet.
> >
> >  If this is set to 2, the kernel panics compulsorily even on the
> > -above-mentioned.
> > +above-mentioned. Even oom happens under memoyr cgroup, the whole
> > +system panics.
>                                            memory
> >
> >  The default value is 0.
> >  1 and 2 are for failover of clustering. Please select either
> >  according to your policy of failover.
> > +2 seems too strong but panic_on_oom=2+kdump gives you very strong
> > +tool to investigate a system which should never cause OOM.
>
> I don't think you need say 2 seems too strong because as you rightfully
> say, it has real uses. The hint about using it to investigate OOM
> conditions is good though.
>
ok. I'll update this patch.

Thanks,
-Kame

> >
> > =============================================================
> >
> > Index: mmotm-2.6.33-Feb11/mm/oom_kill.c
> > ===================================================================
> > --- mmotm-2.6.33-Feb11.orig/mm/oom_kill.c
> > +++ mmotm-2.6.33-Feb11/mm/oom_kill.c
> > @@ -471,6 +471,8 @@ void mem_cgroup_out_of_memory(struct mem
> >  	unsigned long points = 0;
> >  	struct task_struct *p;
> >
> > +	if (sysctl_panic_on_oom == 2)
> > +		panic("out of memory(memcg). panic_on_oom is selected.\n");
> >  	read_lock(&tasklist_lock);
> >  retry:
> >  	p = select_bad_process(&points, mem);
>
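
As a reading aid for the thread above, the panic_on_oom policy under discussion can be sketched in user-space C. This is not the kernel code. The enum, its values, and oom_should_panic() are hypothetical names made up for illustration. The sketch assumes the semantics described in the quoted vm.txt text: with 1, an OOM confined to a cpuset or mempolicy does not panic, and with 2 the kernel always panics. With the proposed patch, a memcg OOM would panic only when the value is 2.

```c
#include <stdbool.h>

/* Hypothetical scope labels for illustration; not kernel identifiers. */
enum oom_scope {
	OOM_SCOPE_SYSTEM,	/* the whole system is out of memory */
	OOM_SCOPE_CPUSET,	/* OOM confined to a cpuset's nodes */
	OOM_SCOPE_MEMPOLICY,	/* OOM confined to a mempolicy's nodes */
	OOM_SCOPE_MEMCG,	/* OOM inside a memory cgroup limit */
};

/*
 * Return true when the policy calls for a panic instead of killing a task.
 *   0: never panic, let the OOM killer pick a task
 *   1: panic only for a system-wide OOM
 *   2: always panic, including constrained and memcg OOMs
 */
bool oom_should_panic(int panic_on_oom, enum oom_scope scope)
{
	if (panic_on_oom == 2)
		return true;
	if (panic_on_oom == 1)
		return scope == OOM_SCOPE_SYSTEM;
	return false;
}
```

Under these assumptions, oom_should_panic(2, OOM_SCOPE_MEMCG) is true while oom_should_panic(1, OOM_SCOPE_MEMCG) is false, which corresponds to the single `sysctl_panic_on_oom == 2` test the patch adds to mem_cgroup_out_of_memory().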