Subject: Re: [patch] mm, oom: stop reclaiming if GFP_ATOMIC will start failing soon
From: Peter Enderborg
To: Andrew Morton, David Rientjes
CC: Vlastimil Babka
Date: Mon, 27 Apr 2020 10:20:32 +0200
Message-ID: <7726e8a8-8390-cee8-3480-4e68bf26f08a@sony.com>
In-Reply-To: <20200425172706.26b5011293e8dc77b1dccaf3@linux-foundation.org>

On 4/26/20 2:27 AM, Andrew Morton wrote:
> On Fri, 24 Apr 2020 13:48:06 -0700 (PDT) David Rientjes wrote:
>
>> If GFP_ATOMIC allocations will start failing soon because the amount of
>> free memory is substantially under per-zone min watermarks, it is better
>> to oom kill a process rather than continue to reclaim.
>>
>> This intends to significantly reduce the number of page allocation
>> failures that are encountered when the demands of user and atomic
>> allocations overwhelm the ability of reclaim to keep up. We can see this
>> with a high ingress of networking traffic where memory allocated in irq
>> context can overwhelm the ability to reclaim fast enough such that user
>> memory consistently loops. In that case, we have reclaimable memory, and
>
> "user memory allocation", I assume? Or maybe "blockable memory
> allocations".
>
>> reclaiming is successful, but we've fully depleted memory reserves that
>> are allowed for non-blockable allocations.
>>
>> Commit 400e22499dd9 ("mm: don't warn about allocations which stall for
>> too long") removed evidence of user allocations stalling because of this,
>> but the situation can apply anytime we get "page allocation failures"
>> where reclaim is happening but per-zone min watermarks are starved:
>>
>> Node 0 Normal free:87356kB min:221984kB low:416984kB high:611984kB active_anon:123009936kB inactive_anon:67647652kB active_file:429612kB inactive_file:209980kB unevictable:112348kB writepending:260kB present:198180864kB managed:195027624kB mlocked:81756kB kernel_stack:24040kB pagetables:11460kB bounce:0kB free_pcp:940kB local_pcp:96kB free_cma:0kB
>> lowmem_reserve[]: 0 0 0 0
>> Node 1 Normal free:105616kB min:225568kB low:423716kB high:621864kB active_anon:122124196kB inactive_anon:74112696kB active_file:39172kB inactive_file:103696kB unevictable:204480kB writepending:180kB present:201326592kB managed:198174372kB mlocked:204480kB kernel_stack:11328kB pagetables:3680kB bounce:0kB free_pcp:1140kB local_pcp:0kB free_cma:0kB
>> lowmem_reserve[]: 0 0 0 0
>>
>> Without this patch, there is no guarantee that user memory allocations
>> will ever be successful when non-blockable allocations overwhelm the
>> ability to get above per-zone min watermarks.
>>
>> This doesn't solve page allocation failures entirely since it's a
>> preemptive measure based on watermarks that requires concurrent blockable
>> allocations to trigger the oom kill. To completely solve page allocation
>> failures, it would be possible to do the same watermark check for non-
>> blockable allocations and then queue a worker to asynchronously oom kill
>> if it finds watermarks to be sufficiently low as well.
>>
> Well, what's really going on here?
>
> Is networking potentially consuming an unbounded amount of memory? If
> so, then killing a process will just cause networking to consume more
> memory and then hit against the same thing. So presumably the answer is
> "no, the watermarks are inappropriately set for this workload".
>
> So would it not be sensible to dynamically adjust the watermarks in
> response to this condition? Maintain a larger pool of memory for these
> allocations? Or possibly push back on networking and tell it to reduce
> its queue sizes? So that stuff doesn't keep on getting oom-killed?
>
I think I have seen similar issues when dma-buf allocates a lot, but that was
on older kernels and out of tree.
So networking is maybe not the only cause; dma-buf is used a lot for camera
buffers in Android.
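
To make the discussion concrete, here is a small standalone sketch of how I
read the proposed heuristic: once a zone's free memory has fallen to some
fraction of its min watermark while reclaim is still "succeeding", give up on
reclaim and oom kill instead. This is not the actual patch; the min/2
threshold and the helper name are invented purely for illustration, and the
numbers are taken from the Node 0 Normal line quoted above.

/*
 * Toy model of the proposed check, not the actual patch: keep reclaiming
 * only while a zone's free memory is above some fraction of its min
 * watermark, otherwise give up and oom kill.  The min/2 threshold and the
 * helper name are made up here for illustration.
 */
#include <stdbool.h>
#include <stdio.h>

static bool should_keep_reclaiming(unsigned long free_kb, unsigned long min_kb)
{
	return free_kb >= min_kb / 2;
}

int main(void)
{
	/* Values from the "Node 0 Normal" line in the report quoted above. */
	unsigned long free_kb = 87356, min_kb = 221984;

	if (should_keep_reclaiming(free_kb, min_kb))
		printf("keep reclaiming: free %lukB >= min/2 = %lukB\n",
		       free_kb, min_kb / 2);
	else
		printf("stop reclaim, oom kill: free %lukB < min/2 = %lukB\n",
		       free_kb, min_kb / 2);
	return 0;
}

With those numbers free memory is already well below half of min (87356kB vs
110992kB), so a check along these lines would abandon reclaim in favour of the
oom killer, which matches the failure mode David describes.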