Date: Mon, 5 Dec 2022 10:27:21 +0000
From: Mel Gorman
To: NeilBrown
Cc: Linux-MM, Andrew Morton, Michal Hocko, Thierry Reding,
    Matthew Wilcox, Vlastimil Babka, LKML
Subject: Re: [PATCH 3/6] mm/page_alloc: Explicitly record high-order atomic allocations in alloc_flags
Message-ID: <20221205102721.zitekvbhvylogtnv@techsingularity.net>
References: <20221129151701.23261-1-mgorman@techsingularity.net>
    <20221129151701.23261-4-mgorman@techsingularity.net>
    <167021743246.8267.14900064704332224542@noble.neil.brown.name>
In-Reply-To: <167021743246.8267.14900064704332224542@noble.neil.brown.name>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Dec 05, 2022 at 04:17:12PM +1100, NeilBrown wrote:
>
> Hi Mel,
>  thanks a lot for doing this!

My pleasure.

> I tried reviewing it but "HIGHATOMIC" is new to me and I quickly got
> lost :-(

That's ok, HIGHATOMIC reserves are obscure and internal to the
allocator. They're almost as obscure as granting access to reserves for
RT tasks. The only difference is that there is no data on what sort of
RT tasks needed extra privileges back in the early 2000s, whereas I
happen to remember why highatomic reserves were introduced. They were
introduced when fragmentation avoidance triggered failures of high-order
atomic allocations that had "happened to work" by accident before
fragmentation avoidance (details in 0aaa29a56e4f). IIRC, there was a
storm of bugs, mostly on small embedded platforms where devices without
an IOMMU relied on high-order atomics to work, so a small reserve was
created. I was concerned your patch would trigger this class of bug
again, even though it might take a few years to show up, as embedded
platforms tend to stay on old kernels for ages.

The main point of this patch is identifying these high-order atomic
allocations without relying on __GFP_ATOMIC, but it should not be
necessary to understand how high-order atomic reserves work. They still
work the same way; access is just granted differently.

> Maybe one day I'll work it out - now that several names are more
> meaningful, it will likely be easier.
>
> > @@ -4818,7 +4820,7 @@ static void wake_all_kswapds(unsigned int order, gfp_t gfp_mask,
> >  }
> >
> >  static inline unsigned int
> > -gfp_to_alloc_flags(gfp_t gfp_mask)
> > +gfp_to_alloc_flags(gfp_t gfp_mask, unsigned int order)
> >  {
> >  	unsigned int alloc_flags = ALLOC_WMARK_MIN | ALLOC_CPUSET;
> >
> > @@ -4844,8 +4846,13 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
> >  		 * Not worth trying to allocate harder for __GFP_NOMEMALLOC even
> >  		 * if it can't schedule.
> >  		 */
> > -		if (!(gfp_mask & __GFP_NOMEMALLOC))
> > +		if (!(gfp_mask & __GFP_NOMEMALLOC)) {
> >  			alloc_flags |= ALLOC_HARDER;
> > +
> > +			if (order > 0)
> > +				alloc_flags |= ALLOC_HIGHATOMIC;
> > +		}
> > +
> >  		/*
> >  		 * Ignore cpuset mems for GFP_ATOMIC rather than fail, see the
> >  		 * comment for __cpuset_node_allowed().

This is the most crucial hunk of the patch. If the series is merged and
we start seeing high-order atomic allocation failures, the key will be
to look at the gfp_flags and determine whether this hunk is enough to
accurately detect high-order atomic allocations. If the GFP flags look
ok, then a tracing/debugging patch will be created to dump gfp_flags
every time access is granted to the high-atomic reserves, to determine
whether access is given incorrectly and under what circumstances.

The main concern I had with your original patch was that it was too easy
to grant access to high-atomic reserves for requests that were not
high-order atomic requests, which might exhaust the reserves.

The rest of the series tries to improve the naming of the ALLOC flags
and what they mean. The actual changes to your patch are minimal. I
could have started with your patch and fixed it up, but I preferred this
ordering to reduce use of __GFP_ATOMIC and then delete it. It should be
bisection safe.

--
Mel Gorman
SUSE Labs
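
For readers following the quoted hunk, here is a rough userspace sketch of the decision it makes, plus the kind of sanity check a debugging patch could apply when access to the reserves is granted. It is not kernel code. Every SKETCH_* name and bit value is hypothetical, and the outer "caller cannot sleep" test is an assumption, because the enclosing condition is not visible in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the kernel flags. The names and bit values
 * are assumptions for illustration, not the kernel's definitions.
 */
#define SKETCH_GFP_NOSLEEP	0x1u	/* assumed: caller cannot sleep */
#define SKETCH_GFP_NOMEMALLOC	0x2u	/* stands in for __GFP_NOMEMALLOC */

#define SKETCH_ALLOC_WMARK_MIN	0x01u
#define SKETCH_ALLOC_CPUSET	0x02u
#define SKETCH_ALLOC_HARDER	0x04u
#define SKETCH_ALLOC_HIGHATOMIC	0x08u

/* Models the shape of the patched gfp_to_alloc_flags(gfp_mask, order). */
static unsigned int sketch_gfp_to_alloc_flags(unsigned int gfp_mask,
					      unsigned int order)
{
	unsigned int alloc_flags = SKETCH_ALLOC_WMARK_MIN | SKETCH_ALLOC_CPUSET;

	/* Assumed outer condition: only non-sleeping callers get here. */
	if (gfp_mask & SKETCH_GFP_NOSLEEP) {
		if (!(gfp_mask & SKETCH_GFP_NOMEMALLOC)) {
			alloc_flags |= SKETCH_ALLOC_HARDER;

			/* Only high-order requests may use the reserve. */
			if (order > 0)
				alloc_flags |= SKETCH_ALLOC_HIGHATOMIC;
		}
	}

	/* Invariant of this sketch: HIGHATOMIC implies order > 0. */
	assert(!(alloc_flags & SKETCH_ALLOC_HIGHATOMIC) || order > 0);
	return alloc_flags;
}

/*
 * Hypothetical debugging check: given the flags seen when reserve access
 * was granted, report whether the request really looks like a high-order
 * atomic allocation.
 */
static bool sketch_highatomic_grant_is_expected(unsigned int gfp_mask,
						unsigned int order,
						unsigned int alloc_flags)
{
	if (!(alloc_flags & SKETCH_ALLOC_HIGHATOMIC))
		return true;	/* no reserve access was granted */

	return order > 0 &&
	       (gfp_mask & SKETCH_GFP_NOSLEEP) &&
	       !(gfp_mask & SKETCH_GFP_NOMEMALLOC);
}
```

In this model an order-0 atomic request gets only the ALLOC_HARDER stand-in, while an order-2 atomic request without the NOMEMALLOC stand-in also gets the HIGHATOMIC stand-in. A grant that fails the second function's check is the "access given incorrectly" case the debugging patch would be looking for.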