Date: Thu, 31 Jan 2019 09:58:08 +0100
From: Michal Hocko
To: Johannes Weiner
Cc: Tejun Heo, Chris Down, Andrew Morton, Roman Gushchin, Dennis Zhou, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: Re: [PATCH 2/2] mm: Consider subtrees in memory.events
Message-ID: <20190131085808.GO18811@dhcp22.suse.cz>
In-Reply-To: <20190130213131.GA13142@cmpxchg.org>

On Wed 30-01-19 16:31:31, Johannes Weiner wrote:
> On Wed, Jan 30, 2019 at 09:05:59PM +0100, Michal Hocko wrote:
[...]
> > I thought I have already mentioned an example. Say you have an observer
> > on the top of a delegated cgroup hierarchy and you setup limits (e.g. hard
> > limit) on the root of it. If you get an OOM event then you know that the
> > whole hierarchy might be underprovisioned and perform some rebalancing.
> > Now you really do not care that somewhere down the delegated tree there
> > was an oom. Such a spurious event would just confuse the monitoring and
> > lead to wrong decisions.
>
> You can construct a usecase like this, as per above with OOM, but it's
> incredibly unlikely for something like this to exist. There is plenty
> of evidence on adoption rate that supports this: we know where the big
> names in containerization are; we see the things we run into that have
> not been reported yet etc.
>
> Compare this to real problems this has already caused for
> us. Multi-level control and monitoring is a fundamental concept of the
> cgroup design, so naturally our infrastructure doesn't monitor and log
> at the individual job level (too much data, and also kind of pointless
> when the jobs are identical) but at aggregate parental levels.
>
> Because of this wart, we have missed problematic configurations when
> the low, high, max events were not propagated as expected (we log oom
> separately, so we still noticed those). Even once we knew about it, we
> had trouble tracking these configurations down for the same reason -
> the data isn't logged, and won't be logged, at this level.

Yes, I do understand that you might be interested in the hierarchical
accounting.

> Adding a separate, hierarchical file would solve this one particular
> problem for us, but it wouldn't fix this pitfall for all future users
> of cgroup2 (which by all available evidence is still most of them) and
> would be a wart on the interface that we'd carry forever.

I understand even this reasoning, but if I have to choose between a risk
of user breakage that would require reimplementing the monitoring and an
API inconsistency, I vote for accepting the inconsistency. It is
unfortunate, but this is the way we deal with APIs and compatibility.

> Adding a note in cgroup-v2.txt doesn't make up for the fact that this
> behavior flies in the face of basic UX concepts that underly the
> hierarchical monitoring and control idea of the cgroup2fs.
>
> The fact that the current behavior MIGHT HAVE a valid application does
> not mean that THIS FILE should be providing it. It IS NOT an argument
> against this patch here, just an argument for a separate patch that
> adds this functionality in a way that is consistent with the rest of
> the interface (e.g. systematically adding .local files).
>
> The current semantics have real costs to real users. You cannot
> dismiss them or handwave them away with a hypothetical regression.
>
> I would really ask you to consider the real world usage and adoption
> data we have on cgroup2, rather than insist on a black and white
> answer to this situation.

Those users requiring the hierarchical behavior can use the new file
without any risk of breakage, so I really do not see why we should
take that risk and do it the other way around.
--
Michal Hocko
SUSE Labs