Date: Tue, 5 Sep 2017 20:16:09 +0100
From: Roman Gushchin
To: Michal Hocko
CC: Vladimir Davydov, Johannes Weiner, Tetsuo Handa, David Rientjes, Andrew Morton, Tejun Heo
Subject: Re: [v7 5/5] mm, oom: cgroup v2 mount option to disable cgroup-aware OOM killer
Message-ID: <20170905191609.GA19687@castle.dhcp.TheFacebook.com>
References: <20170904142108.7165-1-guro@fb.com> <20170904142108.7165-6-guro@fb.com> <20170905134412.qdvqcfhvbdzmarna@dhcp22.suse.cz> <20170905143021.GA28599@castle.dhcp.TheFacebook.com> <20170905151251.luh4wogjd3msfqgf@dhcp22.suse.cz>
In-Reply-To: <20170905151251.luh4wogjd3msfqgf@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Sep 05, 2017 at 05:12:51PM +0200, Michal Hocko wrote:
> On Tue 05-09-17 15:30:21, Roman Gushchin wrote:
> > On Tue, Sep 05, 2017 at 03:44:12PM +0200, Michal Hocko wrote:
> [...]
> > > Why is this an opt out rather than opt-in?
> > > IMHO the original oom logic
> > > should be preserved by default and specific workloads should opt in for
> > > the cgroup aware logic. Changing the global behavior depending on
> > > whether cgroup v2 interface is in use is more than unexpected and IMHO
> > > wrong approach to take. I think we should instead go with
> > > oom_strategy=[alloc_task,biggest_task,cgroup]
> > >
> > > we currently have alloc_task (via sysctl_oom_kill_allocating_task) and
> > > biggest_task which is the default. You are adding cgroup and the more I
> > > think about the more I agree that it doesn't really make sense to try to
> > > fit the new semantic into the existing one (compare tasks to kill-all
> > > memcgs). Just introduce a new strategy and define a new semantic from
> > > scratch. Memcg priority and kill-all are a natural extension of this new
> > > strategy. This will make the life easier and easier to understand by
> > > users.
> > >
> > > Does that make sense to you?
> >
> > Absolutely.
> >
> > The only thing: I'm not sure that we have to preserve the existing logic
> > as default option. For most users (except few very specific usecases),
> > it should be at least as good, as the existing one.
>
> But this is really an unexpected change. Users even might not know that
> they are using cgroup v2 and memcg is in use.
>
> > Making it opt-in means that corresponding code will be executed only
> > by few users, who cares.
>
> Yeah, which is the way we should introduce new features no?
>
> > Then we should probably hide corresponding
> > cgroup interface (oom_group and oom_priority knobs) by default,
> > and it feels as unnecessary complication and is overall against
> > cgroup v2 interface design.
>
> Why. If we care enough, we could simply return EINVAL when those knobs
> are written while the corresponding strategy is not used.

That doesn't look like a nice default interface.
>
> > > I think we should instead go with
> > > oom_strategy=[alloc_task,biggest_task,cgroup]
> >
> > It would be a really nice interface; although I've no idea how to implement it:
> > "alloc_task" is an existing sysctl, which we have to preserve;
>
> I would argue that we should simply deprecate and later drop the sysctl.
> I _strongly_ suspect nobody is using this. If yes it is not that hard
> to change the kernel command line rather than select the sysctl.

I agree. And if so, why do we need a new interface for a useless feature?

> > while "cgroup" depends on cgroup v2.
>
> Which is not a big deal either. Simply fall back to default if there are
> no cgroup v2. The implementation would have essentially the same effect
> because there won't be any kill-all cgroups and so we will select the
> largest task.

I'd agree with you if there were use cases (excluding pure legacy) where
the per-process algorithm is preferable to the cgroup-aware OOM killer.
I really doubt there are, and I hope that with oom_priorities the
suggested algorithm will cover almost all reasonable use cases.

Thanks!
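
Editorial illustration (not part of the original message): the interface debated above can be pictured with a small user-space sketch. It is not kernel code and not taken from any patch in this series. It only models the three behaviours mentioned in the thread: selecting one of `alloc_task`, `biggest_task`, or `cgroup`; falling back to the default when cgroup v2 is absent; and returning EINVAL when the memcg knobs are written while the cgroup strategy is inactive. All function and enum names below are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the three strategies named in the thread. */
enum oom_strategy {
	OOM_BIGGEST_TASK,	/* current default: kill the largest task */
	OOM_ALLOC_TASK,		/* today: sysctl_oom_kill_allocating_task */
	OOM_CGROUP,		/* proposed cgroup-aware selection */
};

/* Parse the value of an assumed "oom_strategy=" option. */
int parse_oom_strategy(const char *s, enum oom_strategy *out)
{
	assert(s && out);

	if (!strcmp(s, "biggest_task"))
		*out = OOM_BIGGEST_TASK;
	else if (!strcmp(s, "alloc_task"))
		*out = OOM_ALLOC_TASK;
	else if (!strcmp(s, "cgroup"))
		*out = OOM_CGROUP;
	else
		return -EINVAL;
	return 0;
}

/* "Simply fall back to default if there are no cgroup v2." */
enum oom_strategy effective_strategy(enum oom_strategy requested,
				     bool have_cgroup_v2)
{
	if (requested == OOM_CGROUP && !have_cgroup_v2)
		return OOM_BIGGEST_TASK;
	return requested;
}

/*
 * Model of a write to a memcg knob such as oom_group or oom_priority:
 * rejected with EINVAL unless the cgroup strategy is active.
 */
int write_oom_knob(enum oom_strategy active, int *knob, int value)
{
	assert(knob);

	if (active != OOM_CGROUP)
		return -EINVAL;
	*knob = value;
	return 0;
}
```

In this model, the objection raised in the reply is visible in `write_oom_knob()`: the knobs exist for every cgroup v2 user, but writes to them fail unless the cgroup strategy has been selected.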