Date: Mon, 21 Aug 2017 10:46:56 +0100
From: Roman Gushchin
To: David Rientjes
CC: Michal Hocko, Vladimir Davydov, Johannes Weiner, Tetsuo Handa, Tejun Heo
Subject: Re: [v5 2/4] mm, oom: cgroup-aware OOM killer
Message-ID: <20170821094656.GA13899@castle.dhcp.TheFacebook.com>
References: <20170814183213.12319-1-guro@fb.com> <20170814183213.12319-3-guro@fb.com> <20170815121558.GA15892@castle.dhcp.TheFacebook.com> <20170816154325.GB29131@castle.DHCP.thefacebook.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Aug 20, 2017 at 05:50:27PM -0700, David Rientjes wrote:
> On Wed, 16 Aug 2017, Roman Gushchin wrote:
>
> > It's natural to expect that inside a container there are their own sshd,
> > "activity manager" or some other stuff, which can play with oom_score_adj.
> > If it can override the upper cgroup-level settings, the whole delegation model
> > is broken.
> >
>
> I don't think any delegation model related to core cgroups or memory
> cgroup is broken, I think it's based on how memory.oom_kill_all_tasks is
> defined. It could very well behave as memory.oom_kill_all_eligible_tasks
> when enacted upon.
>
> > You can think about the oom_kill_all_tasks like the panic_on_oom,
> > but on a cgroup level. It should _guarantee_, that in case of oom
> > the whole cgroup will be destroyed completely, and will not remain
> > in a non-consistent state.
> >
>
> Only CAP_SYS_ADMIN has this ability to set /proc/pid/oom_score_adj to
CAP_SYS_RESOURCE
> OOM_SCORE_ADJ_MIN, so it preserves the ability to change that setting, if
> needed, when it sets memory.oom_kill_all_tasks. If a user gains
> permissions to change memory.oom_kill_all_tasks, I disagree it should
> override the CAP_SYS_ADMIN setting of /proc/pid/oom_score_adj.
>
> I would prefer not to exclude oom disabled processes to their own sibling
> cgroups because they would require their own reservation with cgroup v2
> and it makes the single hierarchy model much more difficult to arrange
> alongside cpusets, for example.
>
> > The model you're describing is based on a trust given to these oom-unkillable
> > processes on system level. But we can't really trust some unknown processes
> > inside a cgroup that they will be able to do some useful work and finish
> > in a reasonable time; especially in case of a global memory shortage.
>
> Yes, we prefer to panic instead of sshd, for example, being oom killed.
> We trust that sshd, as well as our own activity manager and security
> daemons are trusted to do useful work and that we never want the kernel to
> do this. I'm not sure why you are describing processes that CAP_SYS_ADMIN
> has set to be oom disabled as unknown processes.
>
> I'd be interested in hearing the opinions of others related to a per-memcg
> knob being allowed to override the setting of the sysadmin.

Sure, me too.

Thanks!