Date: Mon, 28 Jan 2019 13:51:51 +0100
From: Michal Hocko
To: Tejun Heo
Cc: Johannes Weiner, Chris Down, Andrew Morton, Roman Gushchin, Dennis Zhou,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org,
    kernel-team@fb.com
Subject: Re: [PATCH 2/2] mm: Consider subtrees in memory.events
Message-ID: <20190128125151.GI18811@dhcp22.suse.cz>
References: <20190123223144.GA10798@chrisdown.name>
 <20190124082252.GD4087@dhcp22.suse.cz>
 <20190124160009.GA12436@cmpxchg.org>
 <20190124170117.GS4087@dhcp22.suse.cz>
 <20190124182328.GA10820@cmpxchg.org>
 <20190125074824.GD3560@dhcp22.suse.cz>
 <20190125165152.GK50184@devbig004.ftw2.facebook.com>
 <20190125173713.GD20411@dhcp22.suse.cz>
 <20190125182808.GL50184@devbig004.ftw2.facebook.com>
In-Reply-To: <20190125182808.GL50184@devbig004.ftw2.facebook.com>

On Fri 25-01-19 10:28:08, Tejun Heo wrote:
> Hello, Michal.
>
> On Fri, Jan 25, 2019 at 06:37:13PM +0100, Michal Hocko wrote:
> > > What if a user wants to monitor any ooms in the subtree tho, which is
> > > a valid use case?
> >
> > How is that information useful without knowing which memcg the oom
> > applies to?
>
> For example, a workload manager watching over a subtree for a job with
> nested memory limits set by the job itself. It wants to take action
> (reporting and possibly other remediative actions) when something goes
> wrong in the delegated subtree but isn't involved in how the subtree
> is configured inside.

Yes, I understand this part, but it is not clear to me _how_ to report
anything sensible without knowing _what_ has caused the event. You can
walk the cgroup hierarchy and compare cached results with new ones, but
that is a) racy and b) clumsy.

> > > If local event monitoring is useful, and it can be,
> > > let's add separate events which are clearly identifiable to be local.
> > > Right now, it's confusing like hell.
> >
> > From a backward compatible POV it should be a new interface added.
>
> That sure is an option for use cases like the above, but it has the
> downside of carrying over the confusing interface into the indefinite
> future.

I actually believe that this is not such a big deal. For one thing, the
current events are genuinely helpful for watching the reclaim/setup
behavior.

> Again, I'd like to point back at how we changed the write and trim
> accounting because the benefits outweighed the risks.
>
> > Please note that I understand that this might be confusing with the rest
> > of the cgroup APIs, but considering that this is the first time somebody
> > is actually complaining and the interface has been "production ready" for
> > more than three years, I am not really sure the situation is all that bad.
>
> cgroup2 uptake hasn't progressed that fast. None of the major distros
> or container frameworks are currently shipping with it, although many
> are evaluating switching. I don't think I'm too mistaken in that we
> (FB) are at the bleeding edge in terms of adopting cgroup2 and its
> various new features, and we are hitting these corner cases and
> oversights in the process. If there are noticeable breakages arising
> from this change, we sure can backpedal, but I think the better course
> of action is fixing them up while we can.

I do not really think you can go back. You cannot simply change the
semantics back and forth, because you would just break new users.
Really, I do not see the semantics changing after more than three years
of a production-ready interface.

If you really believe we need a hierarchical notification mechanism for
the reclaim activity, then add a new one.
-- 
Michal Hocko
SUSE Labs
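
As an illustration of the polling workaround discussed in the thread above
(walking a delegated cgroup v2 subtree and diffing cached memory.events
counters), a minimal userspace sketch could look like the following. The
subtree path and the one-second interval are assumptions made up for the
example, and the races called out above, counters changing between reads and
cgroups disappearing mid-walk, are inherent to this approach rather than
solved by it.

import os
import time

# Assumed path of the delegated subtree being watched; purely illustrative.
CGROUP_ROOT = "/sys/fs/cgroup/workload"

def read_events(cgroup_path):
    """Parse memory.events in one cgroup directory into {event: count}."""
    counts = {}
    try:
        with open(os.path.join(cgroup_path, "memory.events")) as f:
            for line in f:
                name, value = line.split()
                counts[name] = int(value)
    except OSError:
        # The cgroup may have been removed while we were walking the tree.
        pass
    return counts

def scan(root):
    """Snapshot memory.events for every cgroup under root."""
    return {path: read_events(path) for path, _dirs, _files in os.walk(root)}

def diff(old, new):
    """Yield (cgroup, event, delta) for counters that grew since last scan."""
    for path, counts in new.items():
        before = old.get(path, {})
        for event, count in counts.items():
            delta = count - before.get(event, 0)
            if delta > 0:
                yield path, event, delta

if __name__ == "__main__":
    previous = scan(CGROUP_ROOT)
    while True:
        time.sleep(1)  # arbitrary polling interval
        current = scan(CGROUP_ROOT)
        for cgroup, event, delta in diff(previous, current):
            print(f"{cgroup}: {event} +{delta}")
        previous = current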