Date: Wed, 24 Feb 2021 11:55:08 +0000
From: Mel Gorman
To: Jesper Dangaard Brouer
Cc: Chuck Lever, LKML, Linux-Net, Linux-MM, Linux-NFS
Subject: Re: [RFC PATCH 0/3] Introduce a bulk order-0 page allocator for sunrpc
Message-ID: <20210224115508.GL3697@techsingularity.net>
References: <20210224102603.19524-1-mgorman@techsingularity.net> <20210224122723.15943e95@carbon>
In-Reply-To: <20210224122723.15943e95@carbon>
X-Mailing-List: linux-nfs@vger.kernel.org

On Wed, Feb 24, 2021 at 12:27:23PM +0100, Jesper Dangaard Brouer wrote:
> On Wed, 24 Feb 2021 10:26:00 +0000
> Mel Gorman wrote:
>
> > This is a prototype series that introduces a bulk order-0 page allocator
> > with sunrpc being the first user. The implementation is not particularly
> > efficient and the intention is to iron out what the semantics of the API
> > should be. That said, sunrpc was reported to have reduced allocation
> > latency when refilling a pool.
>
> I also have a use-case in page_pool, and I've been testing with the
> earlier patches, results are here[1]
>
> [1] https://github.com/xdp-project/xdp-project/blob/master/areas/mem/page_pool06_alloc_pages_bulk.org
>
> Awesome to see this newer patchset! thanks a lot for working on this!
> I'll run some new tests based on this.
>

Thanks and if they get finalised, a patch on top for review would be nice
with the results included in the changelog. Obviously any change that would
need to be made to the allocator would happen first.

> > As a side-note, while the implementation could be more efficient, it
> > would require fairly deep surgery in numerous places. The lock scope would
> > need to be significantly reduced, particularly as vmstat, per-cpu and the
> > buddy allocator have different locking protocols that overlap -- e.g. all
> > partially depend on irqs being disabled at various points. Secondly,
> > the core of the allocator deals with single pages whereas both the bulk
> > allocator and per-cpu allocator operate in batches. All of that has to
> > be reconciled with all the existing users and their constraints (memory
> > offline, CMA and cpusets being the trickiest).
>
> As you can see in[1], I'm getting a significant speedup from this. I
> guess that the cost of finding the "zone" is higher than I expected, as
> this is basically what we/you amortize for the bulk.
>

The obvious goal would be that if a refactoring did happen that the
performance would be at least neutral but hopefully improved.
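
To illustrate the amortisation point above, here is a minimal user-space
sketch. It is not kernel code and not the implementation from this series.
find_zone(), alloc_one(), alloc_bulk() and zone_lookups are hypothetical
names: find_zone() stands in for whatever fixed per-call work an allocator
has to do before it can hand out a page, and malloc() stands in for the
page source.

```c
#include <stdlib.h>

#define PAGE_SZ 4096

/* Hypothetical model: counts how often the fixed per-call cost is paid. */
unsigned long zone_lookups;

struct zone_model { int id; };
struct zone_model the_zone = { 0 };

/* Stand-in for the per-call setup work (zone selection and similar). */
struct zone_model *find_zone(void)
{
	zone_lookups++;
	return &the_zone;
}

/* One page per call: the setup cost is paid on every call. */
void *alloc_one(void)
{
	struct zone_model *z = find_zone();

	(void)z;	/* the model has a single zone, so it is unused */
	return malloc(PAGE_SZ);
}

/*
 * Up to nr pages per call: the setup cost is paid once and shared by
 * every page in the batch. Returns the number of pages stored in pages[].
 */
size_t alloc_bulk(size_t nr, void **pages)
{
	struct zone_model *z = find_zone();
	size_t i;

	(void)z;
	for (i = 0; i < nr; i++) {
		pages[i] = malloc(PAGE_SZ);
		if (!pages[i])
			break;	/* partial success: report what we got */
	}
	return i;
}
```

In this model, eight calls to alloc_one() increment zone_lookups eight
times, while a single alloc_bulk(8, pages) increments it once. How large
the fixed cost is in the real allocator is a separate question that only
measurements such as those in [1] can answer.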
> > In terms of semantics required by new users, my preference is that a pair
> > of patches be applied -- the first which adds the required semantic to
> > the bulk allocator and the second which adds the new user.
> >
> > Patch 1 of this series is a cleanup to sunrpc, it could be merged
> >         separately but is included here for convenience.
> >
> > Patch 2 is the prototype bulk allocator
> >
> > Patch 3 is the sunrpc user. Chuck also has a patch which further caches
> >         pages but is not included in this series. It's not directly
> >         related to the bulk allocator and as it caches pages, it might
> >         have other concerns (e.g. does it need a shrinker?)
> >
> > This has only been lightly tested on a low-end NFS server. It did not break
> > but would benefit from an evaluation to see how much, if any, the headline
> > performance changes. The biggest concern is that a light test case showed
> > that there are a *lot* of bulk requests for 1 page which gets delegated to
> > the normal allocator. The same criteria should apply to any other users.
>
> If you change local_irq_save(flags) to local_irq_disable() then you can
> likely get better performance for 1 page requests via this API. This
> limits the API to be used in cases where IRQs are enabled (which is
> most cases). (For my use-case I will not do 1 page requests).
>

I do not want to constrain the API to being IRQ-only prematurely. An
obvious alternative use case is the SLUB allocation path when a high-order
allocation fails. It's known that if the SLUB order is reduced that it has
an impact on hackbench communicating over sockets. It would be interesting
to see what happens if order-0 pages are bulk allocated when s->min == 0
and that can be called from a blocking context. Tricky to test but could
be fudged by forcing all high-order allocations to fail when s->min == 0
to evaluate the worst case scenario.
In addition, it would constrain any potential refactoring if the lower
levels have to choose between local_irq_disable() vs local_irq_save()
depending on the caller context.

-- 
Mel Gorman
SUSE Labs