Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S967207AbcCPUOY (ORCPT ); Wed, 16 Mar 2016 16:14:24 -0400
Received: from gum.cmpxchg.org ([85.214.110.215]:46666 "EHLO gum.cmpxchg.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S934390AbcCPUOU (ORCPT ); Wed, 16 Mar 2016 16:14:20 -0400
Date: Wed, 16 Mar 2016 13:13:29 -0700
From: Johannes Weiner
To: Vladimir Davydov
Cc: Michal Hocko, Andrew Morton, linux-mm@kvack.org,
        cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
        kernel-team@fb.com
Subject: Re: [PATCH] mm: memcontrol: reclaim and OOM kill when shrinking memory.max below usage
Message-ID: <20160316201329.GA15498@cmpxchg.org>
References: <1457643015-8828-2-git-send-email-hannes@cmpxchg.org>
        <20160311081825.GC27701@dhcp22.suse.cz>
        <20160311091931.GK1946@esperanza>
        <20160316051848.GA11006@cmpxchg.org>
        <20160316151509.GC18142@esperanza>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160316151509.GC18142@esperanza>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1985
Lines: 37

On Wed, Mar 16, 2016 at 06:15:09PM +0300, Vladimir Davydov wrote:
> On Tue, Mar 15, 2016 at 10:18:48PM -0700, Johannes Weiner wrote:
> > On Fri, Mar 11, 2016 at 12:19:31PM +0300, Vladimir Davydov wrote:
> ...
> > > Come to think of it, shouldn't we restore the old limit and return EBUSY
> > > if we failed to reclaim enough memory?
> >
> > I suspect it's very rare that it would fail. But even in that case
> > it's probably better to at least not allow new charges past what the
> > user requested, even if we can't push the level back far enough.
>
> It's of course good to set the limit before trying to reclaim memory,
> but isn't it strange that even if the cgroup's memory can't be reclaimed
> to meet the new limit (tmpfs files or tasks protected from oom), the
> write will still succeed? It's a rare use case, but still.

It's not optimal, but there is nothing we can do about it, is there? I
don't want to go back to the racy semantics that allow the application
to balloon up again after the limit restriction fails.

> I've one more concern regarding this patch. It's about calling OOM while
> reclaiming cgroup memory. AFAIU OOM killer can be quite disruptive for a
> workload, so is it really good to call it when normal reclaim fails?
>
> W/o OOM killer you can optimistically try to adjust memory.max and if it
> fails you can manually kill some processes in the container or restart
> it or cancel the limit update. With your patch adjusting memory.max
> never fails, but OOM might kill vital processes rendering the whole
> container useless. Wouldn't it be better to let the user decide if
> processes should be killed or not rather than calling OOM forcefully?

Those are the memory.max semantics, though. Why should there be a
difference between the container growing beyond the limit and the
limit cutting into the container?

If you don't want OOM kills, set memory.high instead. This way you get
the memory pressure *and* the chance to do your own killing.
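
For illustration only, the memory.high route could look roughly like this
from userspace. This is a minimal sketch, not part of the patch: the helper
name set_high_and_check and its cgroup_dir argument are hypothetical, and
only the memory.high and memory.current file names come from the cgroup v2
interface. It assumes the caller has write access to the group's directory.

```python
from pathlib import Path


def set_high_and_check(cgroup_dir, target_bytes):
    """Write target_bytes to memory.high and report whether usage fits.

    Hypothetical helper. Returns True if memory.current is at or below
    target_bytes right after the write, False if the group is still above
    it. In the False case nothing is killed here: the caller decides
    whether to wait, kill tasks itself, or raise the value again.
    """
    cg = Path(cgroup_dir)

    # Going over memory.high causes reclaim pressure and throttling,
    # but does not invoke the OOM killer.
    (cg / "memory.high").write_text(f"{target_bytes}\n")

    # memory.current holds the group's current usage in bytes.
    current = int((cg / "memory.current").read_text().strip())
    return current <= target_bytes
```

A caller would pass the group's directory under the cgroup2 mount and poll
until the function returns True, or give up and do its own killing. Usage
can still be above the target right after the write, so a single False
result is not a failure by itself.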