Date: Thu, 12 Jan 2023 09:11:06 +0100
From: Michal Hocko
To: Mel Gorman
Cc: Linux-MM, Andrew Morton, NeilBrown, Thierry Reding, Matthew Wilcox, Vlastimil Babka, LKML
Subject: Re: [PATCH 6/7] mm/page_alloc: Give GFP_ATOMIC and non-blocking allocations access to reserves
References: <20230109151631.24923-1-mgorman@techsingularity.net> <20230109151631.24923-7-mgorman@techsingularity.net> <20230111170552.5b7z5hetc2lcdwmb@techsingularity.net>
In-Reply-To: <20230111170552.5b7z5hetc2lcdwmb@techsingularity.net>

On Wed 11-01-23 17:05:52, Mel Gorman wrote:
> On Wed, Jan 11, 2023 at 04:58:02PM +0100, Michal Hocko wrote:
> > On Mon 09-01-23 15:16:30, Mel Gorman wrote:
> > > Explicit GFP_ATOMIC allocations get flagged ALLOC_HARDER which is a bit
> > > vague. In preparation for removing __GFP_ATOMIC, give GFP_ATOMIC and
> > > other non-blocking allocation requests equal access to reserve.
> > > Rename ALLOC_HARDER to ALLOC_NON_BLOCK to make it more clear what the flag
> > > means.
> >
> > GFP_NOWAIT can be also used for opportunistic allocations which can and
> > should fail quickly if the memory is tight and more elaborate path
> > should be taken (e.g. try higher order allocation first but fall back to
> > smaller request if the memory is fragmented). Do we really want to give
> > those access to memory reserves as well?
>
> Good question. Without __GFP_ATOMIC, GFP_NOWAIT only differs from GFP_ATOMIC
> by __GFP_HIGH but that is not enough to distinguish between a caller that
> cannot sleep versus one that is speculatively attempting an allocation but
> has other options. That changelog is misleading, it's not equal access
> as GFP_NOWAIT ends up with 25% of the reserves which is less than what
> GFP_ATOMIC gets.
>
> Because it becomes impossible to distinguish between non-blocking and
> atomic without __GFP_ATOMIC, there is some justification for allowing
> access to reserves for GFP_NOWAIT. bio for example attempts an allocation
> (clears __GFP_DIRECT_RECLAIM) before falling back to mempool but delays
> in IO can also lead to further allocation pressure. mmu gather failing
> GFP_WAIT slows the rate memory can be freed. NFS failing GFP_NOWAIT will
> have to retry IOs multiple times. The examples were picked at random but
> the point is that there are cases where failing GFP_NOWAIT can degrade
> the system, particularly delay the cleaning of pages before reclaim.

Fair points.

> A lot of the truly speculative users appear to use GFP_NOWAIT | __GFP_NOWARN
> so one compromise would be to avoid using reserves if __GFP_NOWARN is
> also specified.
>
> Something like this as a separate patch?

I cannot say I would be happy about adding more side effects to
__GFP_NOWARN. You are right that it should be used for those optimistic
allocation requests, but historically many of these subtle side effects
have backfired at some point.
Wouldn't it make sense to explicitly mark those places which
really benefit from reserves instead? This is more work but it should pay
off long term. Your examples above would use GFP_ATOMIC instead of
GFP_NOWAIT.

The semantics would be easier to explain as well:
GFP_ATOMIC - non-sleeping allocations which are important, so they have
	     access to memory reserves.
GFP_NOWAIT - non-sleeping allocations.

> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 7244ab522028..0a7a0ac1b46d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4860,9 +4860,11 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
>  	if (!(gfp_mask & __GFP_DIRECT_RECLAIM)) {
>  		/*
>  		 * Not worth trying to allocate harder for __GFP_NOMEMALLOC even
> -		 * if it can't schedule.
> +		 * if it can't schedule. Similarly, a caller specifying
> +		 * __GFP_NOWARN is likely a speculative allocation with a
> +		 * graceful recovery path.
>  		 */
> -		if (!(gfp_mask & __GFP_NOMEMALLOC)) {
> +		if (!(gfp_mask & (__GFP_NOMEMALLOC|__GFP_NOWARN))) {
>  			alloc_flags |= ALLOC_NON_BLOCK;
>
>  			if (order > 0)

-- 
Michal Hocko
SUSE Labs
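
To make the GFP_ATOMIC/GFP_NOWAIT split described above concrete, here is a
minimal userspace sketch. It is not kernel code and not a proposed patch: the
MODEL_* bit values and the helper model_nonblock_gets_reserves() are made up
for illustration, and only the flag names come from this thread. It assumes
the state after __GFP_ATOMIC is removed, where the two masks differ only by
__GFP_HIGH.

```c
#include <stdbool.h>

/*
 * Userspace model only. The bit values below are hypothetical and do NOT
 * match the kernel's gfp definitions; only the flag names come from the
 * thread.
 */
#define MODEL_GFP_HIGH            0x01u
#define MODEL_GFP_KSWAPD_RECLAIM  0x02u
#define MODEL_GFP_DIRECT_RECLAIM  0x04u
#define MODEL_GFP_NOMEMALLOC      0x08u

/* Per the thread: without __GFP_ATOMIC the two differ only by __GFP_HIGH. */
#define MODEL_GFP_NOWAIT  (MODEL_GFP_KSWAPD_RECLAIM)
#define MODEL_GFP_ATOMIC  (MODEL_GFP_HIGH | MODEL_GFP_KSWAPD_RECLAIM)

/*
 * Hypothetical helper: returns true when a request, under the semantics
 * suggested above, is a non-sleeping allocation that was explicitly marked
 * as important and so should get the extra reserve access.
 */
static bool model_nonblock_gets_reserves(unsigned int gfp_mask)
{
	/* A caller that may enter direct reclaim can sleep; not covered here. */
	if (gfp_mask & MODEL_GFP_DIRECT_RECLAIM)
		return false;

	/* The caller asked to stay away from reserves. */
	if (gfp_mask & MODEL_GFP_NOMEMALLOC)
		return false;

	/* Only explicitly marked requests qualify; plain GFP_NOWAIT does not. */
	return (gfp_mask & MODEL_GFP_HIGH) != 0;
}
```

With this model the decision depends only on what the caller explicitly asked
for, so __GFP_NOWARN plays no part in it.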