Date: Wed, 23 Aug 2017 17:20:31 +0100
From: Roman Gushchin
To: Johannes Weiner
CC: Michal Hocko, Vladimir Davydov, Tetsuo Handa, David Rientjes, Tejun Heo
Subject: Re: [v5 2/4] mm, oom: cgroup-aware OOM killer
Message-ID: <20170823162031.GA13578@castle.dhcp.TheFacebook.com>
References: <20170814183213.12319-1-guro@fb.com>
 <20170814183213.12319-3-guro@fb.com>
 <20170822170344.GA13547@cmpxchg.org>
In-Reply-To: <20170822170344.GA13547@cmpxchg.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Johannes!

Thank you for the review!

I agree with most of the comments and will address them in v6,
which I'll post soon.

Please find some comments below.

On Tue, Aug 22, 2017 at 01:03:44PM -0400, Johannes Weiner wrote:
> Hi Roman,
>
> great work! This looks mostly good to me now.
> Below are some nitpicks concerning naming and code layout, but nothing
> major.
>
> > +
> > +	css_task_iter_start(&memcg->css, 0, &it);
> > +	while ((task = css_task_iter_next(&it))) {
> > +		/*
> > +		 * If there are no tasks, or all tasks have oom_score_adj set
> > +		 * to OOM_SCORE_ADJ_MIN and oom_kill_all_tasks is not set,
> > +		 * don't select this memory cgroup.
> > +		 */
> > +		if (!elegible &&
> > +		    (memcg->oom_kill_all_tasks ||
> > +		     task->signal->oom_score_adj != OOM_SCORE_ADJ_MIN))
> > +			elegible = 1;
>
> This is a little awkward to read. How about something like this:
>
> 	/*
> 	 * When killing individual tasks, we respect OOM score adjustments:
> 	 * at least one task in the group needs to be killable for the group
> 	 * to be oomable.
> 	 *
> 	 * Also check that previous OOM kills have finished, and abort if
> 	 * there are any pending OOM victims.
> 	 */
> 	oomable = memcg->oom_kill_all_tasks;
> 	while ((task = css_task_iter_next(&it))) {
> 		if (!oomable && task->signal->oom_score_adj != OOM_SCORE_ADJ_MIN)
> 			oomable = 1;
>
> > +		if (tsk_is_oom_victim(task) &&
> > +		    !test_bit(MMF_OOM_SKIP, &task->signal->oom_mm->flags)) {
> > +			elegible = -1;
> > +			break;
> > +		}
> > +	}
> > +	css_task_iter_end(&it);

We ignore oom_score_adj if oom_kill_all_tasks is set; that's not
reflected in your version. Anyway, I've moved the comment block out of
the loop and rephrased it to make it clearer.

>
> etc.
>
> > +
> > +	return elegible > 0 ?
> > +		memcg_oom_badness(memcg, nodemask) : elegible;
>
> I find these much easier to read if broken up, even if it's more LOC:
>
> 	if (eligible <= 0)
> 		return eligible;
>
> 	return memcg_oom_badness(memcg, nodemask);
>
> > +static void select_victim_memcg(struct mem_cgroup *root, struct oom_control *oc)
> > +{
> > +	struct mem_cgroup *iter, *parent;
> > +
> > +	for_each_mem_cgroup_tree(iter, root) {
> > +		if (memcg_has_children(iter)) {
> > +			iter->oom_score = 0;
> > +			continue;
> > +		}
> > +
> > +		iter->oom_score = oom_evaluate_memcg(iter, oc->nodemask);
> > +		if (iter->oom_score == -1) {
>
> Please add comments to document the special returns. Maybe #defines
> would be clearer, too.
>
> > +			oc->chosen_memcg = (void *)-1UL;
> > +			mem_cgroup_iter_break(root, iter);
> > +			return;
> > +		}
> > +
> > +		if (!iter->oom_score)
> > +			continue;
>
> Same here.
>
> Maybe a switch would be suitable to handle the abort/no-score cases.

Not sure about switch/defines, but I've added several comment blocks
to describe the possible return values, as well as their handling.
I hope that will be enough.

> >  static int memory_events_show(struct seq_file *m, void *v)
> >  {
> >  	struct mem_cgroup *memcg = mem_cgroup_from_css(seq_css(m));
> > @@ -5310,6 +5512,12 @@ static struct cftype memory_files[] = {
> >  		.write = memory_max_write,
> >  	},
> >  	{
> > +		.name = "oom_kill_all_tasks",
> > +		.flags = CFTYPE_NOT_ON_ROOT,
> > +		.seq_show = memory_oom_kill_all_tasks_show,
> > +		.write = memory_oom_kill_all_tasks_write,
> > +	},
>
> This name is quite a mouthful and reminiscent of the awkward v1
> interface names. It doesn't really go well with the v2 names.
>
> How about memory.oom_group?

I'd prefer to have something more obvious. I've renamed
memory.oom_kill_all_tasks to memory.oom_kill_all, which was
suggested earlier by Vladimir. Are you OK with it?

Thanks!
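
For illustration only, here is a minimal userspace sketch of the
eligibility check discussed above, with the special return values
spelled out as named constants. It models the control flow of the
quoted v5 hunk and is not the v6 code: struct fake_task,
evaluate_group(), OOM_EVAL_ABORT and OOM_EVAL_INELIGIBLE are
hypothetical names, and the badness value is passed in rather than
computed by memcg_oom_badness().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OOM_SCORE_ADJ_MIN	(-1000)	/* value from the kernel uapi oom.h */

/* Hypothetical names for the special return values. */
#define OOM_EVAL_ABORT		(-1)	/* a previous OOM kill is still pending */
#define OOM_EVAL_INELIGIBLE	0	/* no killable task in this group */

/* Hypothetical stand-in for the task fields the check reads. */
struct fake_task {
	int oom_score_adj;
	bool is_oom_victim;	/* models tsk_is_oom_victim() */
	bool mm_oom_skip;	/* models MMF_OOM_SKIP on the victim's mm */
};

/*
 * Returns OOM_EVAL_ABORT if any task is a pending OOM victim,
 * OOM_EVAL_INELIGIBLE if the group is empty or has only unkillable
 * tasks (unless kill_all_tasks is set), and the badness otherwise.
 */
static long evaluate_group(const struct fake_task *tasks, size_t ntasks,
			   bool kill_all_tasks, long badness)
{
	bool eligible = false;
	size_t i;

	assert(tasks != NULL || ntasks == 0);

	for (i = 0; i < ntasks; i++) {
		/* oom_score_adj is ignored when the whole group is killed. */
		if (kill_all_tasks ||
		    tasks[i].oom_score_adj != OOM_SCORE_ADJ_MIN)
			eligible = true;

		if (tasks[i].is_oom_victim && !tasks[i].mm_oom_skip)
			return OOM_EVAL_ABORT;
	}

	if (!eligible)
		return OOM_EVAL_INELIGIBLE;

	return badness;
}
```

As in the v5 hunk, a group with no tasks is never selected, even with
kill_all_tasks set, because eligibility is only established inside
the loop.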