Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754707AbaFCOpB (ORCPT ); Tue, 3 Jun 2014 10:45:01 -0400
Received: from cantor2.suse.de ([195.135.220.15]:42898 "EHLO mx2.suse.de"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752096AbaFCOo7 (ORCPT ); Tue, 3 Jun 2014 10:44:59 -0400
Date: Tue, 3 Jun 2014 16:44:55 +0200
From: Michal Hocko
To: Greg Thelen, Johannes Weiner
Cc: Roman Gushchin, KAMEZAWA Hiroyuki, Tejun Heo, linux-mm@kvack.org,
        Hugh Dickins, KOSAKI Motohiro, Rik van Riel, LKML, Andrew Morton,
        Michel Lespinasse
Subject: Re: [PATCH v2 0/4] memcg: Low-limit reclaim
Message-ID: <20140603144455.GL1321@dhcp22.suse.cz>
References: <1398688005-26207-1-git-send-email-mhocko@suse.cz>
        <20140528121023.GA10735@dhcp22.suse.cz>
        <20140528134905.GF2878@cmpxchg.org>
        <20140528142144.GL9895@dhcp22.suse.cz>
        <20140528152854.GG2878@cmpxchg.org>
        <20140603110959.GE1321@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue 03-06-14 07:01:20, Greg Thelen wrote:
> On Jun 3, 2014 4:10 AM, "Michal Hocko" wrote:
> >
> > On Wed 28-05-14 09:17:13, Greg Thelen wrote:
> > [...]
> > > My 2c... The following works for my use cases:
> > > 1) introduce memory.low_limit_in_bytes (default=0 thus no default change
> > > from older kernels)
> > > 2) interested users will set low_limit_in_bytes to non-zero value.
> > > Memory protected by low limit should be as migratable/reclaimable as
> > > mlock memory. If a zone full of mlock memory causes oom kills, then
> > > so should the low limit.
> >
> > Would fallback mode in overcommit or the corner case situation break
> > your usecase?
>
> Yes. Fallback mode would break my use cases. What is the corner case
> situation? NUMA conflicts?

Described here http://marc.info/?l=linux-mm&m=139940101124396&w=2

> Low limit is a substitute for users mlocking memory. So if mlocked
> memory has the same NUMA conflicts, then I see no problem with low
> limit having the same behavior.

In principle they are similar - at least from the reclaim POV. The usage
will however be quite different IMO. mlock is the explicit way to keep
memory resident. The application writer knows_what_he_is_doing, right?
Lowlimit is an administrative tool. The administrator of a potentially
complex application is tuning said application to get the best
performance out of it.
Now both of them know that the thing might blow up if they overcommit on
the locked memory. So the application writer can check the system state
before he asks for mlock and he knows about previous mlocks. The admin
doesn't have that possibility because the memory distribution of the
memcg is not easy to find out.

> From a user API perspective, I'm not clear on the difference between
> non-ooming (fallback) low limit and the existing soft limit interface. If
> low limit is a "soft" (non ooming) limit then why not rework the existing
> soft limit interface and save the low limit for strict (ooming) behavior?

No, not that path again. Pretty please! We've been there and it didn't
work out. We've been told not to flip defaults and potentially break
userspace. Softlimit with its weird semantics should die and stay as a
colorful example of a bad design decision.

> Of course, Google can continue to tweak the soft limit or new low
> limit to provide an ooming guarantee rather than violating the limit.

If you have a use case for the hard guarantee then we can add a knob
as I've said repeatedly. I just wanted to hear the use case. If you have
one, great. I just wanted to start with something which is more usable in
general. Your setup is quite specific and known to love OOM killers so
you are very well prepared for that.
On the other hand my users would be surprised if they saw an OOM
while the setup was seemingly correct because lowlimit was not
overcommitted.

I can come up with a patch on top of what is in the mm tree now. It would
add a knob (with a configurable default of either fallback or OOM).
What do you think about this? Would that work for you and Johannes?

> PS: I currently have very limited connectivity so my responses will be
> delayed.

--
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/