Date: Tue, 26 Sep 2017 15:30:40 +0200
From: Michal Hocko
To: Roman Gushchin
Cc: Johannes Weiner, Tejun Heo, kernel-team@fb.com, David Rientjes, linux-mm@kvack.org, Vladimir Davydov, Tetsuo Handa, Andrew Morton, cgroups@vger.kernel.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [v8 0/4] cgroup-aware OOM killer
Message-ID: <20170926133040.uupv3ibkt3jtbotf@dhcp22.suse.cz>
In-Reply-To: <20170926121300.GB23139@castle.dhcp.TheFacebook.com>

On Tue 26-09-17 13:13:00, Roman Gushchin wrote:
> On Tue, Sep 26, 2017 at 01:21:34PM +0200, Michal Hocko wrote:
> > On Tue 26-09-17 11:59:25, Roman Gushchin wrote:
> > > On Mon, Sep 25, 2017 at 10:25:21PM +0200, Michal Hocko wrote:
> > > > On Mon 25-09-17 19:15:33, Roman Gushchin wrote:
> > > > [...]
> > > > > I'm not against this model, as I've said before. It feels logical,
> > > > > and will work fine in most cases.
> > > > >
> > > > > In this case we can drop any mount/boot options, because it preserves
> > > > > the existing behavior in the default configuration. A big advantage.
> > > >
> > > > I am not sure about this. We still need an opt-in, regardless, because
> > > > selecting the largest process from the largest memcg != selecting the
> > > > largest task (just consider the example of memcgs with many processes).
> > >
> > > As I understand Johannes, he suggested comparing individual processes with
> > > group_oom mem cgroups. In other words, always select a killable entity with
> > > the biggest memory footprint.
> > >
> > > This is slightly different from my v8 approach, where I treat leaf memcgs
> > > as indivisible memory consumers independent of the group_oom setting, so
> > > by default I'm selecting the biggest task in the biggest memcg.
> >
> > My reading is that he is actually proposing the same thing I've been
> > mentioning. Simply select the biggest killable entity (leaf memcg or
> > group_oom hierarchy) and either kill the largest task in that entity
> > (for !group_oom) or the whole memcg/hierarchy otherwise.
>
> He wrote the following:
> "So I'm leaning toward the second model: compare all oomgroups and
> standalone tasks in the system with each other, independent of the
> failed hierarchical control structure. Then kill the biggest of them."

I will let Johannes comment, but I believe this is just a
misunderstanding. If we compared only the biggest task from each memcg
then we would basically be losing our fairness objective, wouldn't we?
-- 
Michal Hocko
SUSE Labs
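
To make the difference between the two readings discussed above concrete,
here is a minimal Python sketch. It is not kernel code and not taken from the
patch set: the Task and Memcg classes and both selection functions are
hypothetical names invented for illustration, memory is a plain integer in
MB, and memcgs are treated as a flat list of non-empty leaf groups, so the
hierarchy is not modelled at all.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    name: str
    rss_mb: int  # illustrative footprint; real accounting is more involved


@dataclass
class Memcg:
    name: str
    group_oom: bool
    tasks: List[Task] = field(default_factory=list)  # assumed non-empty

    def usage(self) -> int:
        # Footprint of the memcg taken as the sum of its tasks (assumption).
        return sum(t.rss_mb for t in self.tasks)


def select_entity_first(memcgs: List[Memcg]) -> List[str]:
    """Pick the biggest memcg first, then kill the whole group if it is
    group_oom, or only its largest task otherwise."""
    victim = max(memcgs, key=lambda cg: cg.usage())
    if victim.group_oom:
        return [t.name for t in victim.tasks]
    return [max(victim.tasks, key=lambda t: t.rss_mb).name]


def select_flat(memcgs: List[Memcg]) -> List[str]:
    """Compare group_oom memcgs (as a whole) and standalone tasks directly
    with each other, and kill the biggest of them."""
    candidates = []
    for cg in memcgs:
        if cg.group_oom:
            candidates.append((cg.usage(), [t.name for t in cg.tasks]))
        else:
            candidates.extend((t.rss_mb, [t.name]) for t in cg.tasks)
    return max(candidates, key=lambda c: c[0])[1]


# One memcg with many small processes, one memcg with a single large one.
many = Memcg("many", False, [Task(f"small{i}", 10) for i in range(100)])
single = Memcg("single", False, [Task("big", 500)])
memcgs = [many, single]

print(select_entity_first(memcgs))
print(select_flat(memcgs))
```

With these made-up numbers the entity-first selection kills one 10 MB task
from the 1000 MB memcg, while the flat comparison kills the single 500 MB
task. That is the "memcgs with many processes" case in which the two models
pick different victims.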