Date: Tue, 6 Jul 2021 11:18:13 +0300
From: Ilias Apalodimas
To: Yunsheng Lin
Cc: Jesper Dangaard Brouer, davem@davemloft.net, kuba@kernel.org,
    linuxarm@openeuler.org, yisen.zhuang@huawei.com, salil.mehta@huawei.com,
    thomas.petazzoni@bootlin.com, mw@semihalf.com, linux@armlinux.org.uk,
    hawk@kernel.org, ast@kernel.org, daniel@iogearbox.net,
    john.fastabend@gmail.com, akpm@linux-foundation.org, peterz@infradead.org,
    will@kernel.org, willy@infradead.org, vbabka@suse.cz, fenghua.yu@intel.com,
    guro@fb.com, peterx@redhat.com, feng.tang@intel.com, jgg@ziepe.ca,
    mcroce@microsoft.com, hughd@google.com, jonathan.lemon@gmail.com,
    alobakin@pm.me, willemb@google.com, wenxu@ucloud.cn,
    cong.wang@bytedance.com, haokexin@gmail.com, nogikh@google.com,
    elver@google.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    bpf@vger.kernel.org, Alexander Duyck
Subject: Re: [PATCH net-next RFC 1/2] page_pool: add page recycling support based on elevated refcnt
References: <1625044676-12441-1-git-send-email-linyunsheng@huawei.com>
 <1625044676-12441-2-git-send-email-linyunsheng@huawei.com>
 <6c2d76e2-30ce-5c0f-9d71-f6b71f9ad34f@redhat.com>
 <33aee58e-b1d5-ce7b-1576-556d0da28560@huawei.com>
In-Reply-To: <33aee58e-b1d5-ce7b-1576-556d0da28560@huawei.com>

> >> [...]
> >>>
> >>>
> >>>> So add elevated refcnt support in page pool, and support
> >>>> allocating page frag to enable multi-frames-per-page based
> >>>> on the elevated refcnt support.
> >>>>
> >>>> As the elevated refcnt is per page, and there is no space
> >>>> for that in "struct page" now, add a dynamically allocated
> >>>> "struct page_pool_info" to record the page pool ptr and refcnt
> >>>> corresponding to a page for now. Later, we can recycle the
> >>>> "struct page_pool_info" too, or use part of the page memory to
> >>>> record pp_info.
> >>>
> >>> I'm not happy with allocating a memory (slab) object "struct page_pool_info" per page.
> >>>
> >>> This also gives us an extra level of indirection.
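(For readers following along, a rough sketch of the per-page bookkeeping being
discussed; the field names and layout below are my own shorthand for
illustration, not taken from the RFC:)

	struct page_pool_info {
		struct page_pool *pp;		/* owning pool, so the last put can recycle */
		struct page *page;		/* the page this info describes */
		atomic_t frag_ref;		/* elevated refcnt: frags still outstanding */
		unsigned int frag_offset;	/* next free offset when splitting the page */
	};

One such slab object per pool page is what draws the objection above: every
lookup has to go page -> page_pool_info -> page_pool, which is the extra
level of indirection, and the commit message itself floats using part of the
page memory for this state instead.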
> >>
> >> I'm not happy with that either. If there is a better way to
> >> avoid that, I will be happy to change it:)
> >
> > I think what we have to answer here is, do we want, and does it make
> > sense, for page_pool to do the housekeeping of the buffer splitting, or
> > are we better off having each driver do that. IIRC your previous patch on
> > top of the original recycling patchset was just 'atomic' refcnts on top
> > of page pool.
>
> You are right that the driver was doing the buffer splitting in the
> previous patch.
>
> The reason why I abandoned that is:
> 1. Currently the meta-data of a page in the driver is per desc, which means
>    it might not be able to use the first half of a page for one desc and
>    the second half of the same page for another desc; this ping-pong way of
>    reusing the whole page for only one desc in the driver seems unnecessary
>    and wastes a lot of memory when there is already reusing in the page
>    pool.
>
> 2. Easier use of the API for the driver too, which means the driver uses
>    page_pool_dev_alloc_frag() and page_pool_put_full_page() for the
>    elevated refcnt case, corresponding to page_pool_dev_alloc_pages() and
>    page_pool_put_full_page() for the non-elevated refcnt case; the driver
>    does not need to worry about the meta-data of a page.
>

Ok, that makes sense. We'll need the complexity anyway and, as I said, I
don't have any strong opinions yet, so we might as well make page_pool
responsible for it.

What we need to keep in mind is that page_pool was primarily used for XDP
packets. We need to make sure we have no performance regressions there.
However I don't have access to >10gbit NICs with XDP support. Can anyone
apply the patchset and check the performance?

> >
> >> [...]
> >> Aside from the performance improvement, there is a memory usage
> >> decrease for 64K page size kernels, which means a 64K page can
> >> be used by 32 descriptors with a 2K buffer size, and that is a
> >> lot of memory saving for a 64K page size kernel compared to the
> >> current split-page reusing implemented in the driver.
> >
> > Whether the driver or page_pool itself keeps the meta-data, the outcome
> > here won't change. We'll still be able to use page frags.
>
> As above, it is the ping-pong way of reusing when the driver keeps the
> meta-data, and it is the page-frag way of reusing when the page pool keeps
> the meta-data.
>
> I am not sure if the page-frag way of reusing is possible when we still
> keep the meta-data in the driver, which seems very complex at first
> thought.
>

Fair enough. It's complex in both scenarios, so if people think it's useful
I am not against adding it in the API.

Thanks
/Ilias

> >
> >
> > Cheers
> > /Ilias
> >>
> >>>
> >>> __page_frag_cache_refill() + __page_frag_cache_drain() + page_frag_alloc_align()
> >>>
> >>>
> >>
> >> [...]
> >
> .
> >
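As a footnote to the API point above, this is roughly how the two
driver-side paths compare. Treat it as a sketch only: the argument list of
page_pool_dev_alloc_frag() is my guess from the names used in this thread,
not the final signature from the RFC, and the helper below is hypothetical.

	#include <net/page_pool.h>

	/* Sketch of the two driver-side allocation paths (illustrative only). */
	static int rx_refill_example(struct page_pool *pool)
	{
		unsigned int offset;
		struct page *page, *frag;

		/* Non-elevated refcnt: one descriptor consumes one whole page. */
		page = page_pool_dev_alloc_pages(pool);
		if (!page)
			return -ENOMEM;
		/* ... post the whole page to a single rx descriptor ... */
		page_pool_put_full_page(pool, page, true);

		/*
		 * Elevated refcnt: several descriptors share one page.  With a
		 * 64K page and 2K buffers, 64K / 2K = 32 descriptors fit in one
		 * page, which is where the memory saving quoted above comes from.
		 */
		frag = page_pool_dev_alloc_frag(pool, &offset, 2048);
		if (!frag)
			return -ENOMEM;
		/* ... post frag + offset to one rx descriptor ... */
		page_pool_put_full_page(pool, frag, true);

		return 0;
	}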