Date: Fri, 12 Mar 2021 16:03:50 +0000
From: Mel Gorman <mgorman@techsingularity.net>
To: Matthew Wilcox
Cc: Jesper Dangaard Brouer, Andrew Morton, Chuck Lever,
	Christoph Hellwig, LKML, Linux-Net, Linux-MM, Linux-NFS
Subject: Re: [PATCH 2/5] mm/page_alloc: Add a bulk page allocator
Message-ID: <20210312160350.GW3697@techsingularity.net>
References: <20210310104618.22750-1-mgorman@techsingularity.net>
	<20210310104618.22750-3-mgorman@techsingularity.net>
	<20210310154650.ad9760cd7cb9ac4acccf77ee@linux-foundation.org>
	<20210311084200.GR3697@techsingularity.net>
	<20210312124609.33d4d4ba@carbon>
	<20210312145814.GA2577561@casper.infradead.org>
In-Reply-To: <20210312145814.GA2577561@casper.infradead.org>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-nfs@vger.kernel.org

On Fri, Mar 12, 2021 at 02:58:14PM +0000, Matthew Wilcox wrote:
> On Fri, Mar 12, 2021 at 12:46:09PM +0100, Jesper Dangaard Brouer wrote:
> > In my page_pool patch I'm bulk allocating 64 pages. I wanted to ask if
> > this is too much? (PP_ALLOC_CACHE_REFILL=64).
> >
> > The mlx5 driver has a while loop allocating 64 pages, which it uses in
> > this case; that is why 64 was chosen. If we choose a lower bulk
> > number, then the bulk-alloc will just be called more times.
>
> The thing about batching is that smaller batches are often better.
> Let's suppose you need to allocate 100 pages for something, and the page
> allocator takes up 90% of your latency budget. Batching just ten pages
> at a time is going to reduce the overhead to 9%. Going to 64 pages
> reduces the overhead from 9% to 2% -- maybe that's important, but
> possibly not.
>

I do not think that something like that can be properly assessed in
advance. It heavily depends on whether the caller is willing to amortise
the cost of the batch allocation or whether the timing of the bulk
request is critical every single time.

> > The result of the API is to deliver pages as a doubly-linked list via
> > the LRU (page->lru member). If you are planning to use llist, then how
> > would we handle that API change later?
> >
> > Have you noticed that the two users store the struct-page pointers in
> > an array? We could have the caller provide the array to store the
> > struct-page pointers, as we do with the kmem_cache_alloc_bulk API.
>
> My preference would be for a pagevec. That does limit you to 15 pages
> per call [1], but I do think that might be enough. And the overhead of
> manipulating a linked list isn't free.
>

I'm opposed to a pagevec because it unnecessarily limits the caller.
The sunrpc user, for example, knows how many pages it needs at the time
the bulk allocator is called, but it is not the same value every time.
When tracing, I found it sometimes requested 1 page (the most common
request, in fact) and at other times requested 200+ pages. Forcing it to
call the batch allocator in chunks of 15 means the caller incurs the
cost of multiple allocation requests, which is almost as bad as calling
__alloc_pages in a loop.

I think the first version should have an easy API to start with.
Optimise the implementation if it proves to be a bottleneck. Only make
the API harder to use if the callers are really willing to always
allocate and size the array in advance, and it is shown that doing so
makes a big difference performance-wise.

--
Mel Gorman
SUSE Labs