Date: Thu, 17 Dec 2009 14:21:49 -0800 (PST)
From: David Rientjes
To: KOSAKI Motohiro
Cc: KAMEZAWA Hiroyuki, Andrew Morton, Daisuke Nishimura, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Christoph Lameter
Subject: Re: [BUGFIX][PATCH] oom-kill: fix NUMA consraint check with nodemask v4.2
In-Reply-To: <20091215135902.CDD6.A69D9226@jp.fujitsu.com>
References: <20091215133546.6872fc4f.kamezawa.hiroyu@jp.fujitsu.com> <20091215135902.CDD6.A69D9226@jp.fujitsu.com>

On Tue, 15 Dec 2009, KOSAKI Motohiro wrote:

> > A few requirements that I have:
>
> Um, good analysis! really.
>
> >
> >  - we must be able to define when a task is a memory hogger; this is
> >    currently done by /proc/pid/oom_adj relying on the overall total_vm
> >    size of the task as a baseline.
> >    Most users should have a good sense
> >    of when their task is using more memory than expected and killing a
> >    memory leaker should always be the optimal oom killer result.  A better
> >    set of units other than a shift on total_vm would be helpful, though.
>
> nit: What's mean "Most users"? desktop user(one of most majority users)
> don't have any expection of memory usage.
>
> but, if admin have memory expection, they should be able to tune
> optimal oom result.
>
> I think you pointed right thing.
>

This is mostly referring to production server users, where memory
consumption by particular applications can be estimated.  That allows the
kernel to determine when a task is using a wildly unexpected amount of
memory, egregious enough to force the oom killer into killing a task.
That is in contrast to using rss as a baseline, where we prefer killing
the application with the most resident RAM.  It is not always ideal to
kill a task with 8GB of rss when we fail to allocate a single page for a
low priority task.

> >  - we must prefer tasks that run on a cpuset or mempolicy's nodes if the
> >    oom condition is constrained by that cpuset or mempolicy and its not a
> >    system-wide issue.
>
> agreed. (who disagree it?)
>

It's possible to nullify the penalization that the badness heuristic
currently applies (an order 3 reduction) when a candidate task does not
share nodes with current's allowed set, either by way of cpusets or
mempolicies.  For example, an oom caused by an application with an
MPOL_BIND on a single node can easily kill a task that has no memory
resident on that node if its usage (or rss) is 3 orders higher than any
candidate that is allowed on the bound node.

> >  - we must be able to polarize the badness heuristic to always select a
> >    particular task is if its very low priority or disable oom killing for
> >    a task if its must-run.
>
> Probably I haven't catch your point. What's mean "polarize"? Can you
> please describe more?
>

We need to be able to polarize tasks so they are always killed regardless
of any kernel heuristic (/proc/pid/oom_adj of +15, currently) or always
chosen last (-16, currently).  We also need a way of completely disabling
oom killing for certain tasks such as with OOM_DISABLE.
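
For illustration only, the adjustments discussed in this message can be
modeled with a few lines of C.  This is a simplified sketch, not the
kernel's badness() function: adjust_points() and its parameters are
hypothetical names, total_vm is taken directly as the baseline score, the
order 3 reduction is modeled as a right shift by 3, and OOM_DISABLE is
assumed to be -17 as in include/linux/oom.h of this era.

```c
#include <stdio.h>

/* Assumed value of OOM_DISABLE for kernels of this era. */
#define OOM_DISABLE (-17)

/*
 * Hypothetical, simplified model of the score adjustments described above.
 *   total_vm     - baseline score (pages of total_vm)
 *   oom_adj      - /proc/pid/oom_adj value, -17..+15
 *   shares_nodes - nonzero if the candidate shares nodes with current's
 *                  allowed set (cpuset or mempolicy)
 */
static unsigned long adjust_points(unsigned long total_vm, int oom_adj,
                                   int shares_nodes)
{
    unsigned long points = total_vm;

    /* Tasks marked OOM_DISABLE are never selected. */
    if (oom_adj == OOM_DISABLE)
        return 0;

    /* Order 3 reduction for candidates that share no nodes. */
    if (!shares_nodes)
        points >>= 3;

    /* oom_adj acts as a bit shift on the score. */
    if (oom_adj > 0) {
        if (!points)
            points = 1;
        points <<= oom_adj;
    } else if (oom_adj < 0) {
        points >>= -oom_adj;
    }

    return points;
}
```

Under these assumptions, a task with 9000 pages that shares no nodes with
the bound node still scores 1125 and outranks an on-node candidate with
1000 pages, which is the MPOL_BIND case described above.  An oom_adj of
+15 multiplies the score by 32768, while -16 shifts a score of this size
down to zero.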