Date: Thu, 7 Sep 2017 11:27:30 -0500 (CDT)
From: Christopher Lameter
To: Michal Hocko
cc: Johannes Weiner, Roman Gushchin, linux-mm@kvack.org, Vladimir Davydov, Tetsuo Handa, David Rientjes, Andrew Morton, Tejun Heo, kernel-team@fb.com, cgroups@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer
In-Reply-To: <20170906082859.qlqenftxuib64j35@dhcp22.suse.cz>
References: <20170904142108.7165-1-guro@fb.com>
 <20170904142108.7165-6-guro@fb.com>
 <20170905134412.qdvqcfhvbdzmarna@dhcp22.suse.cz>
 <20170905215344.GA27427@cmpxchg.org>
 <20170906082859.qlqenftxuib64j35@dhcp22.suse.cz>

On Wed, 6 Sep 2017, Michal Hocko wrote:

> I am not sure this is how things
> evolved actually. This is way before
> my time so my git log interpretation might be imprecise. We do have
> oom_badness heuristic since out_of_memory has been introduced and
> oom_kill_allocating_task has been introduced much later because of large
> boxes with zillions of tasks (SGI I suspect) which took too long to
> select a victim so David has added this heuristic.

Nope. The logic was required for tasks that run out of memory when the
restriction on the allocation did not allow the use of all of memory.
cpuset restrictions and memory policy restrictions were the prime
considerations at the time.

It has *nothing* to do with zillions of tasks. It's amusing that the SGI
ghost is still haunting the discussion here. The company died a couple of
years ago finally (ok somehow HP has an "SGI" brand now I believe). But
there are multiple companies that have large NUMA configurations and they
all have configurations where they want to restrict allocations of a
process to a subset of system memory. This is even more important now that
we get new forms of memory (NVDIMM, PCI-E device memory etc). You need to
figure out what to do with allocations that fail because the *allowed*
memory pools are empty.
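
To make the restricted-allocation case concrete, here is a toy model. It is
not kernel code: the names Node, Constraint, allocate and classify_failure
are hypothetical and exist only for illustration. It assumes a two-node
machine in which a task may allocate only from an allowed set of nodes, as
it could under a cpuset or memory policy restriction.

```python
from dataclasses import dataclass
from enum import Enum


class Constraint(Enum):
    NONE = "none"              # allocation may use any node
    RESTRICTED = "restricted"  # cpuset- or mempolicy-style node restriction


@dataclass
class Node:
    free_pages: int


def allocate(nodes, allowed, pages):
    """Take `pages` from the first allowed node with enough free pages.

    Returns the node id used, or None if no allowed node can satisfy
    the request.
    """
    for nid in sorted(allowed):
        if nodes[nid].free_pages >= pages:
            nodes[nid].free_pages -= pages
            return nid
    return None


def classify_failure(nodes, allowed):
    """Say whether a failed allocation was limited by its node restriction."""
    if set(allowed) == set(nodes):
        return Constraint.NONE
    return Constraint.RESTRICTED


# Hypothetical two-node machine: node 0 is exhausted, node 1 has plenty free.
nodes = {0: Node(free_pages=0), 1: Node(free_pages=1024)}

# A task confined to node 0 fails although the system as a whole has memory.
result = allocate(nodes, allowed={0}, pages=16)
failure = classify_failure(nodes, allowed={0})
```

In this sketch the failure is local to the allowed set: the machine is not
out of memory, only the pool this task may use is empty. A task chosen by a
system-wide badness score may hold no memory on the allowed nodes at all,
which is the kind of situation the reply above has in mind.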