From: Roman Gushchin
To: Chris Down
CC: Andrew Morton, Johannes Weiner, Michal Hocko, Tejun Heo, Dennis Zhou, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org, Kernel Team
Subject: Re: [PATCH 2/2] mm: Consider subtrees in memory.events
Date: Thu, 24 Jan 2019 00:24:05 +0000
Message-ID: <20190124002359.GB21563@castle.DHCP.thefacebook.com>
In-Reply-To: <20190123223144.GA10798@chrisdown.name>

On Wed, Jan 23, 2019 at 05:31:44PM -0500, Chris Down wrote:
> memory.stat and other files already consider subtrees in their output,
> and we should too in order to not present an inconsistent interface.
>
> The current situation is fairly confusing, because people interacting
> with cgroups expect hierarchical behaviour in the vein of memory.stat,
> cgroup.events, and other files.
> For example, this causes confusion when
> debugging reclaim events under low, as currently these always read "0"
> at non-leaf memcg nodes, which frequently causes people to misdiagnose
> breach behaviour. The same confusion applies to other counters in this
> file when debugging issues.
>
> Aggregation is done at write time instead of at read-time since these
> counters aren't hot (unlike memory.stat which is per-page, so it does it
> at read time), and it makes sense to bundle this with the file
> notifications.

I agree with the consistency argument (matching cgroup.events, ...), and it
definitely looks better for oom* events, but at the same time it feels
like an API break.

Just for example, let's say you have a delegated sub-tree with memory.max
set. Earlier, getting a memory.high/max event meant that the whole sub-tree
was tight on memory, and, for example, led to shutdown of some parts of the
tree. After your change, it might mean that some sub-cgroup has reached its
limit, which probably doesn't matter at the top level.

Maybe it's still ok, but we definitely need to document it better. It feels
bad that different versions of the kernel will handle it differently, so
userspace has to work around it to actually use these events.

Also, please make sure that it doesn't break memcg kselftests.
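To make the concern about changed semantics concrete, here is a minimal userspace sketch, not kernel code. The `toy_cgroup` struct and all `toy_*` functions are hypothetical names invented for illustration; they only model one counter of memory.events and a parent pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a memcg; NOT the kernel's struct mem_cgroup. */
struct toy_cgroup {
	struct toy_cgroup *parent;	/* NULL for the top of the toy tree */
	long max_events;		/* models the "max" line of memory.events */
};

/* Old semantics: only the cgroup that hit its limit counts the event. */
static void toy_event_local(struct toy_cgroup *cg)
{
	assert(cg != NULL);
	cg->max_events++;
}

/* New semantics: the event is also counted in every ancestor. */
static void toy_event_hierarchical(struct toy_cgroup *cg)
{
	assert(cg != NULL);
	for (; cg != NULL; cg = cg->parent)
		cg->max_events++;
}

/*
 * Fire one "max" event in a child only, then return the counter seen
 * at the delegated sub-tree root (the child's parent).
 */
long toy_parent_count_after_child_event(int hierarchical)
{
	struct toy_cgroup delegated = { NULL, 0 };
	struct toy_cgroup child = { &delegated, 0 };

	if (hierarchical)
		toy_event_hierarchical(&child);
	else
		toy_event_local(&child);
	return delegated.max_events;
}
```

In this model, toy_parent_count_after_child_event(0) returns 0 and toy_parent_count_after_child_event(1) returns 1: the same child-only event becomes visible at the delegated root, so a watcher there can no longer assume the whole sub-tree is under pressure.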
>
> After this patch, events are propagated up the hierarchy:
>
> [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events
> low 0
> high 0
> max 0
> oom 0
> oom_kill 0
> [root@ktst ~]# systemd-run -p MemoryMax=1 true
> Running as unit: run-r251162a189fb4562b9dabfdc9b0422f5.service
> [root@ktst ~]# cat /sys/fs/cgroup/system.slice/memory.events
> low 0
> high 0
> max 7
> oom 1
> oom_kill 1
>
> Signed-off-by: Chris Down
> Acked-by: Johannes Weiner
> To: Andrew Morton

s/To/CC

> Cc: Michal Hocko
> Cc: Tejun Heo
> Cc: Roman Gushchin
> Cc: Dennis Zhou
> Cc: linux-kernel@vger.kernel.org
> Cc: cgroups@vger.kernel.org
> Cc: linux-mm@kvack.org
> Cc: kernel-team@fb.com
> ---
>  include/linux/memcontrol.h | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 380a212a8c52..5428b372def4 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -769,8 +769,10 @@ static inline void count_memcg_event_mm(struct mm_struct *mm,
>  static inline void memcg_memory_event(struct mem_cgroup *memcg,
>  					enum memcg_memory_event event)
>  {
> -	atomic_long_inc(&memcg->memory_events[event]);
> -	cgroup_file_notify(&memcg->events_file);
> +	do {
> +		atomic_long_inc(&memcg->memory_events[event]);
> +		cgroup_file_notify(&memcg->events_file);
> +	} while ((memcg = parent_mem_cgroup(memcg)));

We don't have memory.events file for the root cgroup, so we can stop earlier.

Thanks!
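One possible shape for "stopping earlier", sketched in userspace C rather than as a kernel patch. The `toy_memcg` struct and the `toy_*` helpers are hypothetical; the assumption here is that the root is the only node without a parent, and that the kernel loop would use an equivalent root check in the `while` condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memcg; NOT the kernel's struct mem_cgroup. */
struct toy_memcg {
	struct toy_memcg *parent;	/* NULL only for the root (assumption) */
	long events;			/* one memory.events counter */
};

static bool toy_is_root(const struct toy_memcg *cg)
{
	return cg->parent == NULL;
}

/* Count the event in cg and in each ancestor, but stop before the root. */
void toy_memory_event(struct toy_memcg *cg)
{
	assert(cg != NULL);
	do {
		cg->events++;
	} while ((cg = cg->parent) && !toy_is_root(cg));
}

/*
 * Build root -> slice -> leaf, fire one event at the leaf, and return
 * root.events * 100 + slice.events * 10 + leaf.events.
 */
int toy_demo(void)
{
	struct toy_memcg root = { NULL, 0 };
	struct toy_memcg slice = { &root, 0 };
	struct toy_memcg leaf = { &slice, 0 };

	toy_memory_event(&leaf);
	return (int)(root.events * 100 + slice.events * 10 + leaf.events);
}
```

Here toy_demo() returns 11: the leaf and the intermediate slice each count the event once, and the root counter stays at 0 because the walk ends before reaching it.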