Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1750831AbbD3Joa (ORCPT ); Thu, 30 Apr 2015 05:44:30 -0400 Received: from www262.sakura.ne.jp ([202.181.97.72]:45297 "EHLO www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750729AbbD3Jo1 (ORCPT ); Thu, 30 Apr 2015 05:44:27 -0400 To: mhocko@suse.cz, david@fromorbit.com Cc: hannes@cmpxchg.org, akpm@linux-foundation.org, aarcange@redhat.com, rientjes@google.com, vbabka@suse.cz, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 0/9] mm: improve OOM mechanism v2 From: Tetsuo Handa References: <201504290050.FDE18274.SOJVtFLOMOQFFH@I-love.SAKURA.ne.jp> <20150429125506.GB7148@cmpxchg.org> <20150429144031.GB31341@dhcp22.suse.cz> <201504300227.JCJ81217.FHOLSQVOFFJtMO@I-love.SAKURA.ne.jp> <20150429183135.GH31341@dhcp22.suse.cz> In-Reply-To: <20150429183135.GH31341@dhcp22.suse.cz> Message-Id: <201504301844.CFE13027.FOMtJHQOFSOFVL@I-love.SAKURA.ne.jp> X-Mailer: Winbiff [Version 2.51 PL2] X-Accept-Language: ja,en,zh Date: Thu, 30 Apr 2015 18:44:25 +0900 Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2464 Lines: 44 Michal Hocko wrote: > I mean we should eventually fail all the allocation types but GFP_NOFS > is coming from _carefully_ handled code paths which is an easier starting > point than a random code path in the kernel/drivers. So can we finally > move at least in this direction? I agree that all the allocation types can fail unless GFP_NOFAIL is given. But I also expect that all the allocation types should not fail unless order > PAGE_ALLOC_COSTLY_ORDER or GFP_NORETRY is given or chosen as an OOM victim. We already experienced at Linux 3.19 what happens if !__GFP_FS allocations fails. out_of_memory() is called by pagefault_out_of_memory() when 0x2015a (!__GFP_FS) allocation failed. This looks to me that !__GFP_FS allocations are effectively OOM killer context. It is not fair to kill the thread which triggered a page fault, for that thread may not be using so much memory (unfair from memory usage point of view) or that thread may be global init (unfair because killing the entire system than survive by killing somebody). Also, failing the GFP_NOFS/GFP_NOIO allocations which are not triggered by a page fault generally causes more damage (e.g. taking filesystem error action) than survive by killing somebody. Therefore, I think we should not hesitate invoking the OOM killer for !__GFP_FS allocation. > > Likewise, there is possibility that such memory reserve is used by threads > > which the OOM victim is not waiting for, for malloc() + memset() causes > > __GFP_FS allocations. > > We cannot be certain without complete dependency tracking. This is > just a heuristic. Yes, we cannot be certain without complete dependency tracking. And doing complete dependency tracking is too expensive to implement. Dave is recommending that we should focus on not to trigger the OOM killer than how to handle corner cases in OOM conditions, isn't he? I still believe that choosing more OOM victims upon timeout (which is a heuristic after all) and invoking the OOM killer for !__GFP_FS allocations are the cheapest and least surprising. This is something like automatically and periodically pressing SysRq-f on behalf of the system administrator when memory allocator cannot recover from low memory situation. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/