Subject: Re: [patch] mm, oom: stop reclaiming if GFP_ATOMIC will start failing soon
To: Andrew Morton, David Rientjes
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Mel Gorman
References: <20200425172706.26b5011293e8dc77b1dccaf3@linux-foundation.org> <20200427133051.b71f961c1bc53a8e72c4f003@linux-foundation.org>
From: Vlastimil Babka
Message-ID: <28e35a8b-400e-9320-5a97-accfccf4b9a8@suse.cz>
Date: Tue, 28 Apr 2020 11:38:19 +0200
In-Reply-To: <20200427133051.b71f961c1bc53a8e72c4f003@linux-foundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 4/27/20 10:30 PM, Andrew Morton wrote:
> On Sun, 26 Apr 2020 20:12:58 -0700 (PDT) David Rientjes wrote:
>
>> GFP_ATOMIC allocators can access
>> below these per-zone watermarks. So the
>> issue is that per-zone free pages stay between ALLOC_HIGH watermarks
>> (the watermark that GFP_ATOMIC allocators can allocate to) and min
>> watermarks. We never reclaim enough memory to get back to min watermarks
>> because reclaim cannot keep up with the amount of GFP_ATOMIC allocations.
>
> But there should be an upper bound upon the total amount of in-flight
> GFP_ATOMIC memory at any point in time? These aren't like pagecache

If it's a network receive path, then this is effectively bounded by link speed
vs ability to deal with the packets quickly and free the buffers. And the bursts
of incoming packets might be out of control of the admin. With my "enterprise
kernel support" hat on, it's annoying enough to explain GFP_ATOMIC failures
(usually high-order) in dmesg every once in a while (the usual suggestion is to
bump min_free_kbytes and stress that unless they are frequent, there's no actual
harm as networking can defer the allocation to non-atomic context). If there was
an OOM kill as a result, one that could not be disabled, I can well imagine we
would have to revert such a patch in our kernel due to the DoS potential
(intentional or not).

> which will take more if we give it more. Setting the various
> thresholds appropriately should ensure that blockable allocations don't
> get their memory stolen by GFP_ATOMIC allocations?

I agree with the view that GFP_ATOMIC is only a (perhaps more visible) part of
the problem that there's no fairness guarantee in reclaim, and allocators can
steal from each other. GFP_ATOMIC allocations just have it easier thanks to
lower thresholds.

> I took a look at doing a quick-fix for the
> direct-reclaimers-get-their-stuff-stolen issue about a million years
> ago. I don't recall where it ended up. It's pretty trivial for the
> direct reclaimer to free pages into current->reclaimed_pages and to
> take a look in there on the allocation path, etc.
> But it's only practical for order-0 pages.

FWIW there's already such an approach added to compaction by Mel some time ago,
so order>0 allocations are covered to some extent. But in this case I imagine
that compaction won't even start because order-0 watermarks are too low.

The order-0 reclaim capture might work though - as a result the GFP_ATOMIC
allocations would more likely fail and defer to their fallback context.
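
To make the "lower thresholds" point concrete, here is a minimal userspace
sketch of an order-0 watermark check. It is not the kernel's code: the names
sketch_effective_min(), sketch_watermark_ok() and SKETCH_ALLOC_HIGH are
hypothetical, and the halving of the min watermark for ALLOC_HIGH requests is
an assumption about kernels of this era that should be checked against
mm/page_alloc.c for the kernel in question. lowmem reserves, high-atomic
reserves and higher orders are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag for this sketch, standing in for the kernel's ALLOC_HIGH. */
#define SKETCH_ALLOC_HIGH (1u << 0)

/*
 * Floor, in pages, that a zone's free page count must stay above for a
 * request to pass the check. Assumption: an ALLOC_HIGH request may dig
 * down to half of the min watermark.
 */
static long sketch_effective_min(long min_wmark, unsigned int flags)
{
	long min = min_wmark;

	assert(min_wmark >= 0);
	if (flags & SKETCH_ALLOC_HIGH)
		min -= min / 2;
	return min;
}

/* True if an order-0 request with these flags would pass the watermark check. */
static bool sketch_watermark_ok(long free_pages, long min_wmark,
				unsigned int flags)
{
	return free_pages > sketch_effective_min(min_wmark, flags);
}
```

With a min watermark of 1000 pages and 700 pages free,
sketch_watermark_ok(700, 1000, 0) is false while
sketch_watermark_ok(700, 1000, SKETCH_ALLOC_HIGH) is true. That is the window
described in the thread: blockable allocations have to reclaim, and whatever
they free can be taken by atomic allocations before the zone gets back above
the min watermark.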