Date: Fri, 1 Feb 2019 08:17:57 +0100
From: Michal Hocko
To: Chris Down
Cc: Andrew Morton, Johannes Weiner, Tejun Heo, Roman Gushchin, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: Re: [PATCH] mm: Throttle allocators when failing reclaim over memory.high
Message-ID: <20190201071757.GE11599@dhcp22.suse.cz>
References: <20190201011352.GA14370@chrisdown.name>
In-Reply-To: <20190201011352.GA14370@chrisdown.name>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 31-01-19 20:13:52, Chris Down wrote:
[...]
> The current situation goes against both the expectations of users of
> memory.high, and our intentions as cgroup v2 developers. In
> cgroup-v2.txt, we claim that we will throttle and only under "extreme
> conditions" will memory.high protection be breached. Likewise, cgroup v2
> users generally also expect that memory.high should throttle workloads
> as they exceed their high threshold. However, as seen above, this isn't
> always how it works in practice -- even on banal setups like those with
> no swap, or where swap has become exhausted, we can end up with
> memory.high being breached and us having no weapons left in our arsenal
> to combat runaway growth with, since reclaim is futile.
>
> It's also hard for system monitoring software or users to tell how bad
> the situation is, as "high" events for the memcg may in some cases be
> benign, and in others be catastrophic. The current status quo is that we
> fail containment in a way that doesn't provide any advance warning that
> things are about to go horribly wrong (for example, we are about to
> invoke the kernel OOM killer).
>
> This patch introduces explicit throttling when reclaim is failing to
> keep memcg size contained at the memory.high setting. It does so by
> applying an exponential delay curve derived from the memcg's overage
> compared to memory.high. In the normal case where the memcg is either
> below or only marginally over its memory.high setting, no throttling
> will be performed.

How does this play with the actual OOM when the user expects OOM to
resolve the situation because the reclaim is futile and there is nothing
reclaimable except for killing a process?
-- 
Michal Hocko
SUSE Labs