Date: Mon, 13 Apr 2015 14:49:24 +0200
From: Michal Hocko
To: Tetsuo Handa
Cc: hannes@cmpxchg.org, david@fromorbit.com, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	torvalds@linux-foundation.org, akpm@linux-foundation.org,
	ying.huang@intel.com, aarcange@redhat.com, tytso@mit.edu
Subject: Re: [patch 00/12] mm: page_alloc: improve OOM mechanism and policy
Message-ID: <20150413124924.GB21790@dhcp22.suse.cz>
References: <20150326195822.GB28129@dastard>
	<20150327150509.GA21119@cmpxchg.org>
	<20150330003240.GB28621@dastard>
	<20150401151920.GB23824@dhcp22.suse.cz>
	<20150407141822.GA3262@cmpxchg.org>
	<201504111629.FIB81218.QStJFFVFOLOMHO@I-love.SAKURA.ne.jp>
In-Reply-To: <201504111629.FIB81218.QStJFFVFOLOMHO@I-love.SAKURA.ne.jp>

On Sat 11-04-15 16:29:26, Tetsuo Handa wrote:
> Johannes Weiner wrote:
> > The argument here was always that NOFS allocations are very limited in
> > their reclaim powers and will trigger OOM prematurely.  However, the
> > way we limit dirty memory these days forces most cache to be clean at
> > all times, and direct reclaim in general hasn't been allowed to issue
> > page writeback for quite some time.  So these days, NOFS reclaim isn't
> > really weaker than regular direct reclaim.  The only exception is that
> > it might block writeback, so we'd go OOM if the only reclaimables left
> > were dirty pages against that filesystem.  That should be acceptable.
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 47981c5e54c3..fe3cb2b0b85b 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2367,16 +2367,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order, int alloc_flags,
> >  		/* The OOM killer does not needlessly kill tasks for lowmem */
> >  		if (ac->high_zoneidx < ZONE_NORMAL)
> >  			goto out;
> > -		/* The OOM killer does not compensate for IO-less reclaim */
> > -		if (!(gfp_mask & __GFP_FS)) {
> > -			/*
> > -			 * XXX: Page reclaim didn't yield anything,
> > -			 * and the OOM killer can't be invoked, but
> > -			 * keep looping as per tradition.
> > -			 */
> > -			*did_some_progress = 1;
> > -			goto out;
> > -		}
> >  		if (pm_suspended_storage())
> >  			goto out;
> >  		/* The OOM killer may not free memory on a specific node */
> >
>
> I think this change will allow calling out_of_memory() which results in
> "oom_kill_process() is trivially called via pagefault_out_of_memory()"
> problem described in https://lkml.org/lkml/2015/3/18/219 .
>
> I myself think that we should trigger OOM killer for !__GFP_FS allocation
> in order to make forward progress in case the OOM victim is blocked.
> So, my question about this change is whether we can accept involving OOM
> killer from page fault, no matter how trivially OOM killer will kill some
> process?

We have been triggering the OOM killer from the page fault path for ages.
In fact, memcg triggers the memcg OOM killer _only_ from the page fault
path, because that context is safe: we do not hold any locks at that
point.

-- 
Michal Hocko
SUSE Labs
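
The effect of the quoted hunk on the OOM decision can be sketched as a
small user-space model. This is not kernel code: MODEL_GFP_FS, struct
oom_decision, model_may_oom() and model_self_check() are hypothetical
names made up for illustration, and only the __GFP_FS check touched by
the hunk is modelled. The other checks in __alloc_pages_may_oom()
(lowmem, pm_suspended_storage(), node constraints) are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag value for illustration; not the kernel's definition. */
#define MODEL_GFP_FS 0x1u

/* Outcome of the simplified decision (hypothetical type). */
struct oom_decision {
	bool invoke_oom_killer;	/* would the OOM killer be reached? */
	bool fake_progress;	/* progress reported although nothing was freed */
};

/*
 * Model of the check removed by the hunk.
 *
 * patched == false: an allocation without the FS flag skips the OOM
 * killer but still reports progress, so the caller keeps looping.
 * patched == true: the check is gone, so the same allocation falls
 * through to the OOM killer.
 */
struct oom_decision model_may_oom(unsigned int gfp_mask, bool patched)
{
	struct oom_decision d = { false, false };

	if (!patched && !(gfp_mask & MODEL_GFP_FS)) {
		d.fake_progress = true;
		return d;
	}

	d.invoke_oom_killer = true;
	return d;
}

/* Exercises the model for a NOFS-style request (FS flag clear). */
void model_self_check(void)
{
	struct oom_decision before = model_may_oom(0, false);
	struct oom_decision after = model_may_oom(0, true);

	assert(!before.invoke_oom_killer && before.fake_progress);
	assert(after.invoke_oom_killer && !after.fake_progress);
}
```

In this model a request with the FS flag set reaches the OOM killer
either way; only requests without it change behaviour, which is the
case the question about the page fault path is concerned with.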