Date: Thu, 25 Jan 2018 15:27:29 -0800 (PST)
From: David Rientjes
To: Michal Hocko
cc: Tejun Heo, Andrew Morton, Roman Gushchin, Vladimir Davydov,
    Johannes Weiner, Tetsuo Handa, kernel-team@fb.com,
    cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [patch -mm 3/4] mm, memcg: replace memory.oom_group with policy tunable
In-Reply-To: <20180125080542.GK28465@dhcp22.suse.cz>
References: <20180117154155.GU3460072@devbig577.frc2.facebook.com>
    <20180120123251.GB1096857@devbig577.frc2.facebook.com>
    <20180123155301.GS1526@dhcp22.suse.cz>
    <20180124082041.GD1526@dhcp22.suse.cz>
    <20180125080542.GK28465@dhcp22.suse.cz>

On Thu, 25 Jan 2018, Michal Hocko wrote:

> > As a result, this would remove patch 3/4 from the series.  Do you have any
> > other feedback regarding the remainder of this patch series before I
> > rebase it?
>
> Yes, and I have provided it already. What you are proposing is
> incomplete at best and needs much better consideration and much more
> time to settle.
>

Could you elaborate on why specifying the oom policy for the entire
hierarchy as part of the root mem cgroup and also for individual subtrees
is incomplete?  It allows admins to specify and delegate policy decisions
to subtree owners as appropriate.  It addresses your concern in the
/admins and /students example.  It addresses my concern about evading the
selection criteria simply by creating child cgroups.  It appears to be a
win-win.  What is incomplete, or what are you concerned about?

> > I will address the unfair root mem cgroup vs leaf mem cgroup comparison in
> > a separate patchset to fix an issue where any user of oom_score_adj on a
> > system that is not fully containerized gets very unusual, unexpected, and
> > undocumented results.
>
> I will not oppose but as it has been mentioned several times, this is by
> no means a blocker issue. It can be added on top.
>

The current implementation is only useful for fully containerized systems
where no processes are attached to the root mem cgroup.  Anything in the
root mem cgroup is judged by different criteria, and if those processes
use /proc/pid/oom_score_adj the entire heuristic breaks down.  That's
because per-process usage and oom_score_adj are only relevant for the root
mem cgroup and irrelevant for processes attached to a leaf.  Because of
that, users are affected by the design decision and will organize their
hierarchies as appropriate to avoid it.  Users who only want to use
cgroups for a subset of processes, but still treat those processes as
indivisible logical units when attached to cgroups, find that it is simply
not possible.

I'm focused solely on fixing the three main issues that this
implementation causes.  One of them, userspace influence to protect
important cgroups, can be added on top.
The other two, evading the selection criteria and unfair comparison of
root vs leaf, are shortcomings in the design that I believe should be
addressed before it's merged to avoid changing the API later.

I'm in no rush to ask for the cgroup-aware oom killer to be merged if it's
incomplete and must be changed for use cases that are not highly
specialized (fully containerized and no use of oom_score_adj for any
process).  I am actively engaged in fixing it, however, so that it becomes
a candidate for merge.  Your feedback is useful with regard to those
fixes, but daily emails on how we must merge the current implementation
now are not providing value, at least to me.
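
As a rough illustration of the evasion concern above, here is a toy
Python model.  It is not kernel code: the Cgroup class, the two pick_*
functions, the hierarchy, and the usage numbers are all hypothetical, and
the selection logic is a simplification that only shows how comparing
leaves individually can differ from comparing subtrees by total usage.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Cgroup:
    """Hypothetical stand-in for a mem cgroup; usage is in MB."""
    name: str
    usage: int = 0  # memory charged directly to this cgroup
    children: list[Cgroup] = field(default_factory=list)

    def total(self) -> int:
        # Usage of this cgroup plus everything below it.
        return self.usage + sum(c.total() for c in self.children)

    def leaves(self):
        # Yield every cgroup in this subtree that has no children.
        if not self.children:
            yield self
        else:
            for child in self.children:
                yield from child.leaves()


def pick_by_leaf(root: Cgroup) -> Cgroup:
    # Compare every leaf on its own usage, ignoring its ancestors.
    return max(root.leaves(), key=lambda c: c.usage)


def pick_by_subtree(root: Cgroup) -> Cgroup:
    # At each level descend into the child whose whole subtree is largest.
    node = root
    while node.children:
        node = max(node.children, key=Cgroup.total)
    return node


# Made-up hierarchy: /A is one cgroup, /B is split into four children.
root = Cgroup("/", children=[
    Cgroup("/A", usage=6000),
    Cgroup("/B", children=[Cgroup(f"/B/{i}", usage=4000) for i in range(1, 5)]),
])

print("per-leaf comparison picks:   ", pick_by_leaf(root).name)
print("per-subtree comparison picks:", pick_by_subtree(root).name)
```

In this made-up hierarchy the per-leaf comparison selects /A even though
/B charges more memory in total, because /B's usage is spread over four
children.  The subtree comparison descends into /B instead.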