Date: Fri, 25 Jan 2019 09:42:13 +0100
From: Michal Hocko
To: Johannes Weiner
Cc: Chris Down, Andrew Morton, Tejun Heo, Roman Gushchin, Dennis Zhou,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
    linux-mm@kvack.org, kernel-team@fb.com
Subject: Re: [PATCH 2/2] mm: Consider subtrees in memory.events
Message-ID: <20190125074824.GD3560@dhcp22.suse.cz>
In-Reply-To: <20190124182328.GA10820@cmpxchg.org>
References: <20190123223144.GA10798@chrisdown.name>
    <20190124082252.GD4087@dhcp22.suse.cz>
    <20190124160009.GA12436@cmpxchg.org>
    <20190124170117.GS4087@dhcp22.suse.cz>
    <20190124182328.GA10820@cmpxchg.org>

On Thu 24-01-19 13:23:28,
Johannes Weiner wrote:
> On Thu, Jan 24, 2019 at 06:01:17PM +0100, Michal Hocko wrote:
> > On Thu 24-01-19 11:00:10, Johannes Weiner wrote:
> > [...]
> > > We cannot fully eliminate a risk for regression, but it strikes me as
> > > highly unlikely, given the extremely young age of cgroup2-based system
> > > management and surrounding tooling.
> >
> > I am not really sure what you consider young but this interface is 4.0+
> > IIRC and the cgroup v2 is considered stable since 4.5 unless I
> > missrememeber and that is not a short time period in my book.
>
> If you read my sentence again, I'm not talking about the kernel but
> the surrounding infrastructure that consumes this data. The risk is
> not dependent on the age of the interface age, but on its adoption.

You really have to assume that a user-visible interface is consumed
shortly after it is exposed (or, in this case, after it is considered
stable, since cgroup v2 was explicitly called unstable for a considerable
period of time). This is the general policy regarding user APIs in the
kernel. I could see the argument for a change in the release right after
the introduction, or in similar cases, but this was 3 years ago. We
already have distribution kernels based on 4.12, and that is old compared
to 5.0.

> > Changing interfaces now represents a non-trivial risk and so far I
> > haven't heard any actual usecase where the current semantic is
> > actually wrong. Inconsistency on its own is not a sufficient
> > justification IMO.
>
> It can be seen either way, and in isolation it wouldn't be wrong to
> count events on the local level. But we made that decision for the
> entire interface, and this file is the odd one out now. From that
> comprehensive perspective, yes, the behavior is wrong.

I do see your point about consistency. But it is also important to
consider the usability of this interface.
As already mentioned, catching an oom event at a level where the oom
doesn't happen, and then having a hard time identifying that place
without races, does not make for a straightforward API to use. So it
might really be the case that the API as it stands is actually usable
for its purpose.

> It really
> confuses people who are trying to use it, because they *do* expect it
> to behave recursively.

Then we should improve the documentation. But seriously, these are not
strong reasons to change long-standing semantics people might rely on.

> I'm really having a hard time believing there are existing cgroup2
> users with specific expectations for the non-recursive behavior...

I can certainly imagine monitoring tools that hook in at the levels where
limits are set and report events as they happen. It would be more than
confusing to receive events for reclaim/ooms that haven't happened at
that level just because a delegated memcg down the hierarchy has decided
to set a more restrictive limit. Really, this is a very unexpected
behavior change for anybody using that interface right now on anything
but leaf memcgs.

-- 
Michal Hocko
SUSE Labs
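
To make the difference between the two semantics concrete, here is a
minimal userspace sketch of the monitoring scenario described above. It
is a toy model only, not kernel code: the Memcg class and its record(),
local_count() and recursive_count() methods are hypothetical names made
up for this illustration, and the group names are assumptions as well.

```python
from typing import Dict, List, Optional


class Memcg:
    """Toy stand-in for a memory cgroup (hypothetical, not the kernel's struct)."""

    def __init__(self, name: str, parent: Optional["Memcg"] = None) -> None:
        self.name = name
        self.parent = parent
        self.children: List["Memcg"] = []
        self.local_events: Dict[str, int] = {}
        if parent is not None:
            parent.children.append(self)

    def record(self, event: str) -> None:
        """Count an event at the level where it actually happened."""
        self.local_events[event] = self.local_events.get(event, 0) + 1

    def local_count(self, event: str) -> int:
        """Non-recursive view: only events raised at this level."""
        return self.local_events.get(event, 0)

    def recursive_count(self, event: str) -> int:
        """Recursive view: this level plus every descendant."""
        return self.local_count(event) + sum(
            child.recursive_count(event) for child in self.children
        )


# A monitor watches "workload", the level where the administrator set a limit.
workload = Memcg("workload")
# A delegated group further down sets a more restrictive limit of its own.
delegated = Memcg("delegated", parent=workload)

# The oom happens because of the delegated group's own limit.
delegated.record("oom")

print(workload.local_count("oom"))      # non-recursive view at the watched level
print(workload.recursive_count("oom"))  # recursive view at the watched level
```

With the non-recursive view the monitor at "workload" sees nothing,
because no oom happened at that level. With the recursive view the same
monitor is notified of an event caused by a limit it knows nothing about.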