Date: Tue, 8 Mar 2016 10:08:18 +0100
From: Michal Hocko
To: Sergey Senozhatsky
Cc: Hugh Dickins, Andrew Morton, Linus Torvalds, Johannes Weiner,
    Mel Gorman, David Rientjes, Tetsuo Handa, Hillf Danton,
    KAMEZAWA Hiroyuki, linux-mm@kvack.org, LKML, Joonsoo Kim,
    Vlastimil Babka
Subject: Re: [PATCH] mm, oom: protect !costly allocations some more
    (was: Re: [PATCH 0/3] OOM detection rework v4)
Message-ID: <20160308090818.GA13542@dhcp22.suse.cz>
References: <1450203586-10959-1-git-send-email-mhocko@kernel.org>
    <20160203132718.GI6757@dhcp22.suse.cz>
    <20160225092315.GD17573@dhcp22.suse.cz>
    <20160229210213.GX16930@dhcp22.suse.cz>
    <20160307160838.GB5028@dhcp22.suse.cz>
    <20160308035104.GA447@swordfish>
In-Reply-To: <20160308035104.GA447@swordfish>

On Tue 08-03-16 12:51:04, Sergey Senozhatsky wrote:
> Hello Michal,
>
> On (03/07/16 17:08), Michal Hocko wrote:
> > On Mon 29-02-16 22:02:13, Michal Hocko wrote:
> > > Andrew,
> > > could you queue this one as well, please? This is more a band aid than a
> > > real solution which I will be working on as soon as I am able to
> > > reproduce the issue but the patch should help to some degree at least.
> >
> > Joonsoo wasn't very happy about this approach so let me try a different
> > way. What do you think about the following? Hugh, Sergey does it help
> > for your load? I have tested it with the Hugh's load and there was no
> > major difference from the previous testing so at least nothing has blown
> > up as I am not able to reproduce the issue here.
>
> (next-20160307 + "[PATCH] mm, oom: protect !costly allocations some more")
>
> seems it's significantly less likely to oom-kill now, but I still can see
> something like this

Thanks for the testing. This is highly appreciated. If you are able to
reproduce this then collecting compaction related tracepoints might be
really helpful.

> [ 501.942745] coretemp-sensor invoked oom-killer: gfp_mask=0x27000c0(GFP_KERNEL_ACCOUNT|__GFP_NOTRACK), order=2, oom_score_adj=0
[...]
> [ 501.942853] active_anon:151312 inactive_anon:54791 isolated_anon:0
>  active_file:31213 inactive_file:302048 isolated_file:0
>  unevictable:0 dirty:44 writeback:221 unstable:0
>  slab_reclaimable:43570 slab_unreclaimable:5651
>  mapped:16660 shmem:29495 pagetables:2542 bounce:0
>  free:10884 free_pcp:214 free_cma:0
[...]
> [ 501.942867] DMA32 free:23664kB min:6232kB low:9332kB high:12432kB active_anon:516228kB inactive_anon:129136kB active_file:96508kB inactive_file:954780kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3194880kB managed:3107512kB mlocked:0kB dirty:136kB writeback:440kB mapped:51816kB shmem:91488kB slab_reclaimable:129856kB slab_unreclaimable:13876kB kernel_stack:2160kB pagetables:7888kB unstable:0kB bounce:0kB free_pcp:724kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:128 all_unreclaimable? no
> [ 501.942870] lowmem_reserve[]: 0 0 824 824
> [ 501.942876] Normal free:4784kB min:1696kB low:2540kB high:3384kB active_anon:89020kB inactive_anon:90028kB active_file:28248kB inactive_file:253308kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:917504kB managed:844512kB mlocked:0kB dirty:40kB writeback:444kB mapped:14700kB shmem:26492kB slab_reclaimable:44396kB slab_unreclaimable:8620kB kernel_stack:1328kB pagetables:2280kB unstable:0kB bounce:0kB free_pcp:244kB local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:60 all_unreclaimable? no

Both DMA32 and Normal zones are over high watermarks so this OOM is due
to the memory fragmentation.

> [ 501.942912] DMA32: 564*4kB (UME) 2700*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23856kB
> [ 501.942921] Normal: 959*4kB (ME) 128*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4860kB

There are no order-2+ pages usable even after we know that the
compaction was active and didn't back out early. I might be missing
something of course and the patch might still be tweaked to be more
conservative. Tracepoints should tell us more though.

Thanks!
-- 
Michal Hocko
SUSE Labs
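
The fragmentation reading of the report can be checked mechanically. The sketch below is not kernel code and not part of the patch. It is a small Python illustration that parses the two buddy-list lines quoted above, recomputes the free total per zone, and counts the free blocks large enough for an order-2 request (16kB contiguous). It assumes 4kB base pages, as the report's smallest block size suggests. The helper names (`parse_buddy_line`, `total_free_kb`, `blocks_at_or_above`) are made up for this example.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages, matching the smallest block in the report

# Buddy-list lines copied from the OOM report (timestamps stripped).
BUDDY_LINES = [
    "DMA32: 564*4kB (UME) 2700*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB "
    "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 23856kB",
    "Normal: 959*4kB (ME) 128*8kB (UM) 0*16kB 0*32kB 0*64kB 0*128kB "
    "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4860kB",
]


def parse_buddy_line(line):
    """Return (zone, {order: free_block_count}) for one buddy-list line."""
    zone, rest = line.split(":", 1)
    counts = {}
    for count, size_kb in re.findall(r"(\d+)\*(\d+)kB", rest):
        # A block of size_kb holds 2**order pages, so order = log2(size_kb / PAGE_KB).
        order = (int(size_kb) // PAGE_KB).bit_length() - 1
        counts[order] = int(count)
    return zone.strip(), counts


def total_free_kb(counts):
    """Sum the free memory across all orders, in kB."""
    return sum(n * (PAGE_KB << order) for order, n in counts.items())


def blocks_at_or_above(counts, order):
    """Count free blocks that could satisfy a request of the given order."""
    return sum(n for o, n in counts.items() if o >= order)


if __name__ == "__main__":
    for line in BUDDY_LINES:
        zone, counts = parse_buddy_line(line)
        print(zone, "free:", total_free_kb(counts), "kB,",
              "blocks of order >= 2:", blocks_at_or_above(counts, 2))
```

The recomputed totals agree with the sums printed in the report (23856kB and 4860kB), yet neither zone has a single free block of order 2 or higher. All of the free memory sits in order-0 and order-1 blocks, which is the fragmentation the reply points to.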