Date: Tue, 15 Dec 2015 09:50:11 -0500
From: Johannes Weiner
To: Kamezawa Hiroyuki
Cc: Vladimir Davydov, Michal Hocko, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/7] mm: memcontrol: charge swap to cgroup2
Message-ID: <20151215145011.GA20355@cmpxchg.org>
References: <265d8fe623ed2773d69a26d302eb31e335377c77.1449742560.git.vdavydov@virtuozzo.com> <20151214153037.GB4339@dhcp22.suse.cz> <20151214194258.GH28521@esperanza> <566F8781.80108@jp.fujitsu.com>
In-Reply-To: <566F8781.80108@jp.fujitsu.com>

On Tue, Dec 15, 2015 at 12:22:41PM +0900, Kamezawa Hiroyuki wrote:
> On 2015/12/15 4:42, Vladimir Davydov wrote:
> >Anyway, if you don't trust a container you'd better set the hard memory
> >limit so that it can't hurt others no matter what it runs and how it
> >tweaks its sub-tree knobs.
>
> Limiting swap can easily cause "OOM-Killer even while there are available swap"
> with easy mistake. Can't you add "swap excess" switch to sysctl to allow global
> memory reclaim can ignore swap limitation ?

That never worked with a combined memory+swap limit, either. How could
it? The parent might swap you out under pressure, but simply touching
a few of your anon pages causes them to get swapped back in, thrashing
with whatever the parent was trying to do. Your ability to swap it out
is simply no protection against a group touching its pages.

Allowing the parent to exceed swap with separate counters makes even
less sense, because every page swapped out frees up a page of memory
that the child can reuse. For every swap page that exceeds the limit,
the child gets a free memory page! The child doesn't even have to
cause swapin, it can just steal whatever the parent tried to free up,
and meanwhile its combined memory & swap footprint explodes.

The answer is and always should have been: don't overcommit untrusted
cgroups. Think of swap as a resource you distribute, not as breathing
room for the parents to rely on. Because it can't and could never.
And the new separate swap counter makes this explicit.
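
To put rough numbers on the separate-counter argument, here is a toy
model in Python. It is only a sketch and not kernel code: the function
simulate() and the flag enforce_swap_limit are made up for illustration,
pages are counted rather than tracked, and the child is assumed to
refill every page of memory that reclaim frees.

```python
def simulate(mem_limit, swap_limit, reclaim_rounds, enforce_swap_limit):
    """Toy model of one child cgroup with separate memory and swap counters.

    Hypothetical illustration only; none of these names exist in the kernel.
    Returns (resident pages, swapped pages, combined footprint).
    """
    mem = mem_limit  # the child starts with its memory limit fully used
    swap = 0
    for _ in range(reclaim_rounds):
        # Parent-level reclaim tries to swap out one page of the child.
        if enforce_swap_limit and swap >= swap_limit:
            continue  # swap limit reached: the page stays resident
        mem -= 1
        swap += 1
        # The child immediately reuses the page of memory that was freed.
        if mem < mem_limit:
            mem += 1
    return mem, swap, mem + swap


if __name__ == "__main__":
    # Swap limit enforced against parent reclaim as well.
    print(simulate(100, 50, 200, enforce_swap_limit=True))
    # Parent reclaim allowed to exceed the child's swap limit.
    print(simulate(100, 50, 200, enforce_swap_limit=False))
```

With the limit enforced, the combined footprint stops at mem_limit +
swap_limit. With the excess allowed, it grows by one page per reclaim
round, which is the explosion described above.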