Date: Thu, 21 Sep 2017 01:27:29 -0700 (PDT)
From: David Rientjes
To: Roman Gushchin
Cc: Michal Hocko, linux-mm@kvack.org, Vladimir Davydov, Johannes Weiner,
    Tetsuo Handa, Andrew Morton, Tejun Heo, kernel-team@fb.com,
    cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: [v8 0/4] cgroup-aware OOM killer
In-Reply-To: <20170920222403.GA4729@castle>

On Wed, 20 Sep 2017, Roman Gushchin wrote:

> > It's actually much more complex because in our environment we'd need an
> > "activity manager" with CAP_SYS_RESOURCE to control oom priorities of user
> > subcontainers when today it need only be concerned with top-level memory
> > cgroups.  Users can create their own hierarchies with their own oom
> > priorities at will; doing so doesn't alter the selection heuristic for any
> > other user running on the same system and gives them full control over the
> > selection in their own subtree.  We shouldn't need a system-wide daemon
> > with CAP_SYS_RESOURCE to manage subcontainers when nothing else requires
> > it.  I believe it's also much easier to document: oom_priority is
> > considered for all sibling cgroups at each level of the hierarchy and the
> > cgroup with the lowest priority value gets iterated.
>
> I do agree actually. System-wide OOM priorities make no sense.
>
> Always comparing sibling cgroups, either by priority or size, seems to be
> simple, clear and powerful enough for all reasonable use cases. Am I right
> that it's exactly what you've used internally? This is a perfect
> confirmation, I believe.
>

We've used it for at least four years, and I added my Tested-by to your
patch.  We would convert to your implementation if it is merged upstream,
and I would enthusiastically support your patch if you would integrate it
back into your series.
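
For readers following the thread, the sibling-comparison rule described above can be sketched in a few lines of Python. This is only an illustration, not the patch under discussion: the Cgroup class, its fields, and select_victim() are hypothetical names, and breaking priority ties by the larger memory footprint is an assumption drawn from the "either by priority or size" remark.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cgroup:
    """Hypothetical stand-in for a memory cgroup (not a kernel structure)."""
    name: str
    oom_priority: int = 0          # lower value is preferred at its level
    usage: int = 0                 # memory charged directly to this cgroup
    children: List["Cgroup"] = field(default_factory=list)

    def total_usage(self) -> int:
        # Hierarchical usage: own charge plus everything below.
        return self.usage + sum(c.total_usage() for c in self.children)


def select_victim(root: Cgroup) -> Cgroup:
    """Walk down from root, comparing only sibling cgroups at each level.

    The sibling with the lowest oom_priority is descended into; ties are
    broken by the largest total usage (an assumption for this sketch).
    """
    node = root
    while node.children:
        node = min(node.children,
                   key=lambda c: (c.oom_priority, -c.total_usage()))
    return node


# Two users, each managing priorities inside their own subtree.
root = Cgroup("root", children=[
    Cgroup("A", oom_priority=10, children=[
        Cgroup("a1", oom_priority=0, usage=100),
        Cgroup("a2", oom_priority=5, usage=900),
    ]),
    Cgroup("B", oom_priority=5, children=[
        Cgroup("b1", oom_priority=50, usage=10),
        Cgroup("b2", oom_priority=50, usage=20),
    ]),
])

victim = select_victim(root)
print(victim.name)
```

In this example a1 carries the lowest priority value anywhere in the tree, yet it is never considered, because its parent A loses the comparison against its sibling B at the top level. The values set inside A's subtree have no effect on the selection within B's subtree, which is the property argued for above.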