Date: Wed, 25 Feb 2015 15:08:26 +0100
From: Michal Hocko
To: David Rientjes
Cc: Johannes Weiner, Andrew Morton, "Rafael J. Wysocki", Tetsuo Handa,
    linux-mm@kvack.org, LKML
Subject: [PATCH -v2] mm, oom: do not fail __GFP_NOFAIL allocation if oom killer is disbaled
Message-ID: <20150225140826.GD26680@dhcp22.suse.cz>
References: <1424801964-1602-1-git-send-email-mhocko@suse.cz>
    <20150224191127.GA14718@phnom.home.cmpxchg.org>

On Tue 24-02-15 12:23:55, David Rientjes wrote:
> On Tue, 24 Feb 2015, Johannes Weiner wrote:
[...]
> > I'm fine with keeping the allocation looping, but is that message
> > helpful?  It seems completely useless to the user encountering it.  Is
> > it going to help kernel developers when we get a bug report with it?
> >
> > WARN_ON_ONCE()?
> >
>
> Yeah, I'm not sure that the warning is helpful (and it needs
> s/disbaled/disabled/ if it is to be kept).  I also think this check should
> be moved out of out_of_memory() since gfp/retry logic should be in the
> page allocator itself and not in the oom killer: just make
> __alloc_pages_may_oom() also set *did_some_progress = 1 for __GFP_NOFAIL.

OK, this is a good point.
Updated patch is below:
---
From 364fdbdaa175daa4b7353f71c2d8f8707b6bda31 Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Mon, 23 Feb 2015 10:33:30 +0100
Subject: [PATCH] mm, oom: do not fail __GFP_NOFAIL allocation if oom killer is
 disabled

Tetsuo Handa has pointed out that __GFP_NOFAIL allocations might fail
after the OOM killer is disabled if the allocation is performed by a
kernel thread. This behavior has been there from the very beginning,
since 7f33d49a2ed5 (mm, PM/Freezer: Disable OOM killer when tasks are
frozen). This means that the basic contract for the allocation request
is broken and the context requesting such an allocation might blow up
unexpectedly.

There are basically two ways forward.
1) move oom_killer_disable after kernel threads are frozen. This has a
   risk that the OOM victim wouldn't be able to finish because it would
   depend on an already frozen kernel thread. This would be really
   tricky to debug.
2) do not fail __GFP_NOFAIL allocations no matter what, and accept the
   risk that freezable kernel threads will loop and fail the suspend.
   Incidental allocations after kernel threads are frozen will at least
   dump a warning - if we are lucky and the serial console is still
   active of course...

This patch implements the latter option because it is safer. We would
see a warning rather than allocation failures for the kernel threads
which would blow up otherwise, and we have a better chance of
identifying __GFP_NOFAIL users from deeper pm code.

Changes since v1
- move the __GFP_NOFAIL check to __alloc_pages_may_oom per David Rientjes
- replace WARN by WARN_ON_ONCE as per Johannes Weiner

Signed-off-by: Michal Hocko
---
 mm/page_alloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2d224bbdf8e8..c2ff40a30003 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2363,7 +2363,8 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 		goto out;
 	}
 	/* Exhausted what can be done so it's blamo time */
-	if (out_of_memory(ac->zonelist, gfp_mask, order, ac->nodemask, false))
+	if (out_of_memory(ac->zonelist, gfp_mask, order, ac->nodemask, false)
+			|| WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
 		*did_some_progress = 1;
 out:
 	oom_zonelist_unlock(ac->zonelist, gfp_mask);
-- 
2.1.4

-- 
Michal Hocko
SUSE Labs
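
For anyone who wants to poke at the retry decision outside the kernel, here is a rough userspace model of the patched condition. It is only a sketch: every model_* name and MODEL_GFP_NOFAIL are made-up stand-ins and not kernel symbols, and model_out_of_memory() assumes that a disabled OOM killer is the only reason out_of_memory() returns false. It relies on WARN_ON_ONCE() evaluating to its condition, and on || short-circuiting so the warning is only reached when the OOM killer did nothing.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the __GFP_NOFAIL bit; the value is arbitrary. */
#define MODEL_GFP_NOFAIL 0x1u

/* Counts how often the modelled warning condition was hit. */
static int model_warnings;

/*
 * Stand-in for out_of_memory(). Assumption: it returns false only
 * because the OOM killer has been disabled.
 */
static bool model_out_of_memory(bool oom_killer_disabled)
{
	return !oom_killer_disabled;
}

/*
 * Stand-in for WARN_ON_ONCE(): evaluates to its condition. The real
 * macro also prints a backtrace, at most once; here we just count hits.
 */
static bool model_warn_on(bool cond)
{
	if (cond)
		model_warnings++;
	return cond;
}

/*
 * Mirrors the patched condition: report progress when the OOM killer
 * ran, or when the request carries the NOFAIL bit. Thanks to the
 * short-circuit, the warning is evaluated only if the OOM killer did
 * not run.
 */
int model_may_oom_progress(unsigned int gfp_mask, bool oom_killer_disabled)
{
	int did_some_progress = 0;

	if (model_out_of_memory(oom_killer_disabled)
			|| model_warn_on(gfp_mask & MODEL_GFP_NOFAIL))
		did_some_progress = 1;
	return did_some_progress;
}
```

With the OOM killer disabled, model_may_oom_progress(MODEL_GFP_NOFAIL, true) reports progress and bumps model_warnings, so the caller would keep retrying, while model_may_oom_progress(0, true) reports no progress and the allocation is allowed to fail as before.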