Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753858AbbKWJnp (ORCPT );
	Mon, 23 Nov 2015 04:43:45 -0500
Received: from mx2.suse.de ([195.135.220.15]:40091 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752695AbbKWJno (ORCPT );
	Mon, 23 Nov 2015 04:43:44 -0500
Subject: Re: [PATCH] mm, oom: Give __GFP_NOFAIL allocations access to memory reserves
To: Michal Hocko
References: <1447249697-13380-1-git-send-email-mhocko@kernel.org> <5651BB43.8030102@suse.cz> <20151123092925.GB21050@dhcp22.suse.cz>
Cc: Andrew Morton, Johannes Weiner, Andrea Arcangeli, Mel Gorman, David Rientjes, linux-mm@kvack.org, LKML
From: Vlastimil Babka
Message-ID: <5652DFCE.3010201@suse.cz>
Date: Mon, 23 Nov 2015 10:43:42 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
MIME-Version: 1.0
In-Reply-To: <20151123092925.GB21050@dhcp22.suse.cz>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3475
Lines: 82

On 11/23/2015 10:29 AM, Michal Hocko wrote:
> On Sun 22-11-15 13:55:31, Vlastimil Babka wrote:
>> On 11.11.2015 14:48, mhocko@kernel.org wrote:
>>>  mm/page_alloc.c | 10 +++++++++-
>>>  1 file changed, 9 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 8034909faad2..d30bce9d7ac8 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -2766,8 +2766,16 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
>>>  			goto out;
>>>  	}
>>>  	/* Exhausted what can be done so it's blamo time */
>>> -	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL))
>>> +	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
>>>  		*did_some_progress = 1;
>>> +
>>> +		if (gfp_mask & __GFP_NOFAIL) {
>>> +			page = get_page_from_freelist(gfp_mask, order,
>>> +					ALLOC_NO_WATERMARKS|ALLOC_CPUSET, ac);
>>> +			WARN_ONCE(!page, "Unable to fullfil gfp_nofail allocation."
>>> +				    " Consider increasing min_free_kbytes.\n");
>>
>> It seems redundant to me to keep the WARN_ON_ONCE also above in the if () part?
>
> They are warning about two different things. The first one catches a
> buggy code which uses __GFP_NOFAIL from oom disabled context while the

Ah, I see, I misinterpreted what the return values of out_of_memory() mean.
But now that I look at its code, it seems to only return false when
oom_killer_disabled is set to true. Which is a global thing and nothing to do
with the context of the __GFP_NOFAIL allocation?

> second one tries to help the administrator with a hint that memory
> reserves are too small.
>
>> Also s/gfp_nofail/GFP_NOFAIL/ for consistency?
>
> Fair enough, changed.
>
>> Hm and probably out of scope of your patch, but I understand the WARN_ONCE
>> (WARN_ON_ONCE) to be _ONCE just to prevent a flood from a single task looping
>> here. But for distinct tasks and potentially far away in time, wouldn't we want
>> to see all the warnings? Would that be feasible to implement?
>
> I was thinking about that as well some time ago but it was quite
> hard to find a good enough API to tell when to warn again. The first
> WARN_ON_ONCE should trigger for all different _code paths_ no matter
> how frequently they appear to catch all the buggy callers. The second
> one would benefit from a new warning after min_free_kbytes was updated
> because it would tell the administrator that the last update was not
> sufficient for the workload.

Hm, what about adding a flag to the struct alloc_context, so that when the
particular allocation attempt emits the warning, it sets a flag in the
alloc_context so that it won't emit them again as long as it keeps looping and
attempting oom. Other allocations will warn independently.
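
To make the per-allocation flag idea more concrete, here is a rough user-space
sketch, not kernel code and not tested against any tree. Everything in it is
made up for illustration: struct alloc_context_model stands in for the real
struct alloc_context, and the nofail_warned field, the warnings_emitted counter
and the helper warn_nofail_once_per_alloc() are hypothetical names that do not
exist upstream.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * User-space stand-in for the kernel's struct alloc_context.
 * The nofail_warned field is hypothetical; the real struct has no such member.
 */
struct alloc_context_model {
	bool nofail_warned;	/* set once this allocation attempt has warned */
};

/* Counts emitted warnings so the behaviour can be observed (illustration only). */
static int warnings_emitted;

/*
 * Warn at most once per allocation attempt, instead of once globally as
 * WARN_ONCE() would. Returns true if this call emitted a warning.
 */
static bool warn_nofail_once_per_alloc(struct alloc_context_model *ac,
				       bool page_allocated)
{
	/* Nothing to report if we got a page or already warned for this attempt. */
	if (page_allocated || ac->nofail_warned)
		return false;

	ac->nofail_warned = true;
	warnings_emitted++;
	fprintf(stderr, "Unable to fulfil __GFP_NOFAIL allocation."
			" Consider increasing min_free_kbytes.\n");
	return true;
}
```

An allocation that keeps looping would pass the same context each time and so
warn only on its first failed attempt, while a different allocation starts with
a zeroed context and gets its own warning.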

We could also print the same info as the "allocation failed" warnings do, since
it's very similar, except we can't fail - but the admin/bug reporter should be
interested in the same details as for an allocation failure that is allowed to
fail. But it's also true that we have probably just printed the info during
out_of_memory()... except when we skipped that for some reason?

>>
>>> +		}
>>> +	}
>>>  out:
>>>  	mutex_unlock(&oom_lock);
>>>  	return page;
>>>
>
> Thanks!
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/