Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1754614AbcK2QZ1 (ORCPT ); Tue, 29 Nov 2016 11:25:27 -0500
Received: from mail-wm0-f67.google.com ([74.125.82.67]:36253
        "EHLO mail-wm0-f67.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1752196AbcK2QZS (ORCPT );
        Tue, 29 Nov 2016 11:25:18 -0500
Date: Tue, 29 Nov 2016 17:25:15 +0100
From: Michal Hocko
To: Greg Kroah-Hartman, Stable tree
Cc: Vlastimil Babka, Marc MERLIN, linux-mm@kvack.org, Linus Torvalds,
        LKML, Joonsoo Kim, Tejun Heo
Subject: Re: 4.8.8 kernel trigger OOM killer repeatedly when I have lots of
        RAM that should be free
Message-ID: <20161129162515.GD9796@dhcp22.suse.cz>
References: <20161121154336.GD19750@merlins.org>
        <0d4939f3-869d-6fb8-0914-5f74172f8519@suse.cz>
        <20161121215639.GF13371@merlins.org>
        <20161122160629.uzt2u6m75ash4ved@merlins.org>
        <48061a22-0203-de54-5a44-89773bff1e63@suse.cz>
        <20161122163801.GA2919@kroah.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161122163801.GA2919@kroah.com>
User-Agent: Mutt/1.6.0 (2016-04-01)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4004
Lines: 92

On Tue 22-11-16 17:38:01, Greg KH wrote:
> On Tue, Nov 22, 2016 at 05:14:02PM +0100, Vlastimil Babka wrote:
> > On 11/22/2016 05:06 PM, Marc MERLIN wrote:
> > > On Mon, Nov 21, 2016 at 01:56:39PM -0800, Marc MERLIN wrote:
> > >> On Mon, Nov 21, 2016 at 10:50:20PM +0100, Vlastimil Babka wrote:
> > >>>> 4.9rc5 however seems to be doing better, and is still running after 18
> > >>>> hours. However, I got a few page allocation failures as per below, but the
> > >>>> system seems to recover.
> > >>>> Vlastimil, do you want me to continue the copy on 4.9 (may take 3-5 days)
> > >>>> or is that good enough, and i should go back to 4.8.8 with that patch applied?
> > >>>> https://marc.info/?l=linux-mm&m=147423605024993
> > >>>
> > >>> Hi, I think it's enough for 4.9 for now and I would appreciate trying
> > >>> 4.8 with that patch, yeah.
> > >>
> > >> So the good news is that it's been running for almost 5H and so far so good.
> > >
> > > And the better news is that the copy is still going strong, 4.4TB and
> > > going. So 4.8.8 is fixed with that one single patch as far as I'm
> > > concerned.
> > >
> > > So thanks for that, looks good to me to merge.
> >
> > Thanks a lot for the testing. So what do we do now about 4.8? (4.7 is
> > already EOL AFAICS).
> >
> > - send the patch [1] as 4.8-only stable. Greg won't like that, I expect.
> > - alternatively a simpler (again, 4.8-only) patch that just outright
> >   prevents OOM for 0 < order < costly, as Michal already suggested.
> > - backport 10+ compaction patches to 4.8 stable
> > - something else?
>
> Just wait for 4.8-stable to go end-of-life in a few weeks after 4.9 is
> released? :)

OK, so can we push this through to 4.8 before EOL and make sure there
won't be any additional pre-mature high order OOM reports? The patch
should be simple enough and safe for the stable tree. There is no
upstream commit because 4.9 is fixed in a different way which would be
way too intrusive for the stable backport.

---
From 02306e8d593fa8a48d620e0c9d63a934ca8366d8 Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Wed, 23 Nov 2016 07:26:30 +0100
Subject: [PATCH] mm, oom: stop pre-mature high-order OOM killer invocations

31e49bfda184 ("mm, oom: protect !costly allocations some more for
!CONFIG_COMPACTION") was an attempt to reduce chances of pre-mature OOM
killer invocation for high order requests. It seemed to work for most
users just fine but it is far from bullet proof and obviously not
sufficient for Marc who has reported pre-mature OOM killer invocations
with 4.8 based kernels.
4.9 with all the compaction improvements seems to be behaving much
better but that would be too intrusive to backport to 4.8 stable
kernels. Instead this patch simply never declares OOM for !costly high
order requests. We rely on order-0 requests to do that in case we are
really out of memory. Order-0 requests are much more common and so a
risk of a livelock without any way forward is highly unlikely.

Reported-by: Marc MERLIN
Tested-by: Marc MERLIN
Signed-off-by: Michal Hocko
---
 mm/page_alloc.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a2214c64ed3c..7401e996009a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3161,6 +3161,16 @@ should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_fla
 	if (!order || order > PAGE_ALLOC_COSTLY_ORDER)
 		return false;
 
+#ifdef CONFIG_COMPACTION
+	/*
+	 * This is a gross workaround to compensate a lack of reliable compaction
+	 * operation. We cannot simply go OOM with the current state of the compaction
+	 * code because this can lead to pre mature OOM declaration.
+	 */
+	if (order <= PAGE_ALLOC_COSTLY_ORDER)
+		return true;
+#endif
+
 	/*
 	 * There are setups with compaction disabled which would prefer to loop
 	 * inside the allocator rather than hit the oom killer prematurely.
-- 
2.10.2

-- 
Michal Hocko
SUSE Labs
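
For readers who want to see the effect of the hunk without building a
kernel, the decision it changes can be sketched in user space. This is
only a rough model under stated assumptions, not the kernel function:
should_retry_sketch(), compaction_built_in and fallback_retry are
hypothetical names introduced here. compaction_built_in stands in for
CONFIG_COMPACTION, and fallback_retry stands in for whatever the rest of
should_compact_retry() would decide on the path the patch leaves
untouched. COSTLY_ORDER_SKETCH mirrors PAGE_ALLOC_COSTLY_ORDER, which is
3 in mainline.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors PAGE_ALLOC_COSTLY_ORDER (3 in mainline); the name is hypothetical. */
#define COSTLY_ORDER_SKETCH 3

/*
 * Hypothetical user-space model of the patched check, not kernel code.
 * Returns true when the allocator would keep retrying instead of
 * falling through towards the OOM killer.
 */
bool should_retry_sketch(unsigned int order, bool compaction_built_in,
			 bool fallback_retry)
{
	/* Order-0 and costly requests are rejected first, as in the context lines. */
	if (order == 0 || order > COSTLY_ORDER_SKETCH)
		return false;

	/* The workaround: with compaction built in, !costly high orders always retry. */
	if (compaction_built_in)
		return true;

	/* Otherwise defer to the pre-existing logic, modelled as a plain flag. */
	return fallback_retry;
}

/* Small self-check covering the four interesting cases. */
void self_check_sketch(void)
{
	assert(!should_retry_sketch(0, true, true));	/* order-0 may still go OOM */
	assert(should_retry_sketch(2, true, false));	/* !costly high order retries */
	assert(!should_retry_sketch(4, true, true));	/* costly order is unaffected */
	assert(!should_retry_sketch(2, false, false));	/* no compaction: old behaviour */
}
```

In this model, orders 1 to 3 never reach the OOM path when compaction is
built in, which matches the changelog's point that only order-0
requests are left to declare OOM.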