Message-ID: <15625bac-dfec-4c4e-a828-d11424f7aced@davidwei.uk>
Date: Thu, 7 Mar 2024 20:57:49 -0800
Subject: Re: [RFC PATCH net-next v6 02/15] net: page_pool: create hooks for custom page providers
To: Mina Almasry, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-alpha@vger.kernel.org,
    linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
    sparclinux@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
    linux-arch@vger.kernel.org, bpf@vger.kernel.org,
    linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org,
    dri-devel@lists.freedesktop.org
Cc: "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    Jonathan Corbet, Richard Henderson, Ivan Kokshaysky, Matt Turner,
    Thomas Bogendoerfer,
Bottomley" , Helge Deller , Andreas Larsson , Jesper Dangaard Brouer , Ilias Apalodimas , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Arnd Bergmann , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , David Ahern , Willem de Bruijn , Shuah Khan , Sumit Semwal , =?UTF-8?Q?Christian_K=C3=B6nig?= , Pavel Begunkov , Jason Gunthorpe , Yunsheng Lin , Shailend Chand , Harshitha Ramamurthy , Jeroen de Borst , Praveen Kaligineedi References: <20240305020153.2787423-1-almasrymina@google.com> <20240305020153.2787423-3-almasrymina@google.com> Content-Language: en-GB From: David Wei In-Reply-To: <20240305020153.2787423-3-almasrymina@google.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit On 2024-03-04 18:01, Mina Almasry wrote: > From: Jakub Kicinski > > The page providers which try to reuse the same pages will > need to hold onto the ref, even if page gets released from > the pool - as in releasing the page from the pp just transfers > the "ownership" reference from pp to the provider, and provider > will wait for other references to be gone before feeding this > page back into the pool. > > Signed-off-by: Jakub Kicinski > Signed-off-by: Mina Almasry > > --- > > This is implemented by Jakub in his RFC: > https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168d79@redhat.com/T/ > > I take no credit for the idea or implementation; I only added minor > edits to make this workable with device memory TCP, and removed some > hacky test code. This is a critical dependency of device memory TCP > and thus I'm pulling it into this series to make it revewable and > mergeable. > > RFC v3 -> v1 > - Removed unusued mem_provider. (Yunsheng). > - Replaced memory_provider & mp_priv with netdev_rx_queue (Jakub). > > --- > include/net/page_pool/types.h | 12 ++++++++++ > net/core/page_pool.c | 43 +++++++++++++++++++++++++++++++---- > 2 files changed, 50 insertions(+), 5 deletions(-) > > diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h > index 5e43a08d3231..ffe5f31fb0da 100644 > --- a/include/net/page_pool/types.h > +++ b/include/net/page_pool/types.h > @@ -52,6 +52,7 @@ struct pp_alloc_cache { > * @dev: device, for DMA pre-mapping purposes > * @netdev: netdev this pool will serve (leave as NULL if none or multiple) > * @napi: NAPI which is the sole consumer of pages, otherwise NULL > + * @queue: struct netdev_rx_queue this page_pool is being created for. > * @dma_dir: DMA mapping direction > * @max_len: max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV > * @offset: DMA sync address offset for PP_FLAG_DMA_SYNC_DEV > @@ -64,6 +65,7 @@ struct page_pool_params { > int nid; > struct device *dev; > struct napi_struct *napi; > + struct netdev_rx_queue *queue; > enum dma_data_direction dma_dir; > unsigned int max_len; > unsigned int offset; > @@ -126,6 +128,13 @@ struct page_pool_stats { > }; > #endif > > +struct memory_provider_ops { > + int (*init)(struct page_pool *pool); > + void (*destroy)(struct page_pool *pool); > + struct page *(*alloc_pages)(struct page_pool *pool, gfp_t gfp); > + bool (*release_page)(struct page_pool *pool, struct page *page); > +}; Separate question as I try to adapt bnxt to this and your queue configuration API. How does GVE handle the need to allocate kernel pages for headers and dmabuf for payloads? Reading the code, struct gve_rx_ring is the main per-ring object with a page pool. 
gve_queue_page_lists are filled with netmem allocations from the page pool
in gve_alloc_queue_page_list(). Are these strictly used for payloads only?

I found a struct gve_header_buf in both gve_rx_ring and struct
gve_per_rx_queue_mem_dpo. This is allocated in gve_rx_queue_mem_alloc()
using dma_alloc_coherent(). Is this where GVE stores headers?

IOW, GVE only uses the page pool to allocate memory for QPLs, and QPLs are
used by the device for split payloads. Is my understanding correct?

> +
>  struct page_pool {
>  	struct page_pool_params_fast p;
>
> @@ -176,6 +185,9 @@ struct page_pool {
>  	 */
>  	struct ptr_ring ring;
>
> +	void *mp_priv;
> +	const struct memory_provider_ops *mp_ops;
> +
>  #ifdef CONFIG_PAGE_POOL_STATS
>  	/* recycle stats are per-cpu to avoid locking */
>  	struct page_pool_recycle_stats __percpu *recycle_stats;
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index d706fe5548df..8776fcad064a 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -25,6 +25,8 @@
>
>  #include "page_pool_priv.h"
>
> +static DEFINE_STATIC_KEY_FALSE(page_pool_mem_providers);
> +
>  #define DEFER_TIME (msecs_to_jiffies(1000))
>  #define DEFER_WARN_INTERVAL (60 * HZ)
>
> @@ -177,6 +179,7 @@ static int page_pool_init(struct page_pool *pool,
>  			  int cpuid)
>  {
>  	unsigned int ring_qsize = 1024; /* Default */
> +	int err;
>
>  	memcpy(&pool->p, &params->fast, sizeof(pool->p));
>  	memcpy(&pool->slow, &params->slow, sizeof(pool->slow));
> @@ -248,10 +251,25 @@ static int page_pool_init(struct page_pool *pool,
>  	/* Driver calling page_pool_create() also call page_pool_destroy() */
>  	refcount_set(&pool->user_cnt, 1);
>
> +	if (pool->mp_ops) {
> +		err = pool->mp_ops->init(pool);
> +		if (err) {
> +			pr_warn("%s() mem-provider init failed %d\n",
> +				__func__, err);
> +			goto free_ptr_ring;
> +		}
> +
> +		static_branch_inc(&page_pool_mem_providers);
> +	}
> +
>  	if (pool->p.flags & PP_FLAG_DMA_MAP)
>  		get_device(pool->p.dev);
>
>  	return 0;
> +
> +free_ptr_ring:
> +	ptr_ring_cleanup(&pool->ring, NULL);
> +	return err;
>  }
>
>  static void page_pool_uninit(struct page_pool *pool)
> @@ -546,7 +564,10 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
>  		return page;
>
>  	/* Slow-path: cache empty, do real allocation */
> -	page = __page_pool_alloc_pages_slow(pool, gfp);
> +	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
> +		page = pool->mp_ops->alloc_pages(pool, gfp);
> +	else
> +		page = __page_pool_alloc_pages_slow(pool, gfp);
>  	return page;
>  }
>  EXPORT_SYMBOL(page_pool_alloc_pages);
> @@ -603,10 +624,13 @@ void __page_pool_release_page_dma(struct page_pool *pool, struct page *page)
>  void page_pool_return_page(struct page_pool *pool, struct page *page)
>  {
>  	int count;
> +	bool put;
>
> -	__page_pool_release_page_dma(pool, page);
> -
> -	page_pool_clear_pp_info(page);
> +	put = true;
> +	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
> +		put = pool->mp_ops->release_page(pool, page);
> +	else
> +		__page_pool_release_page_dma(pool, page);
>
>  	/* This may be the last page returned, releasing the pool, so
>  	 * it is not safe to reference pool afterwards.
> @@ -614,7 +638,10 @@ void page_pool_return_page(struct page_pool *pool, struct page *page)
>  	count = atomic_inc_return_relaxed(&pool->pages_state_release_cnt);
>  	trace_page_pool_state_release(pool, page, count);
>
> -	put_page(page);
> +	if (put) {
> +		page_pool_clear_pp_info(page);
> +		put_page(page);
> +	}
>  	/* An optimization would be to call __free_pages(page, pool->p.order)
>  	 * knowing page is not part of page-cache (thus avoiding a
>  	 * __page_cache_release() call).
> @@ -884,6 +911,12 @@ static void __page_pool_destroy(struct page_pool *pool)
>
>  	page_pool_unlist(pool);
>  	page_pool_uninit(pool);
> +
> +	if (pool->mp_ops) {
> +		pool->mp_ops->destroy(pool);
> +		static_branch_dec(&page_pool_mem_providers);
> +	}
> +
>  	kfree(pool);
>  }
>
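To check that I'm reading the hook contract right before wiring bnxt up to
it, below is the rough shape I imagine a provider taking against the four
ops above. This is only a sketch, not anything from this series: the
bnxt_dummy_* names are invented, and the real allocation/refcounting logic
is elided.

	#include <net/page_pool/types.h>	/* struct memory_provider_ops, added by this patch */

	/* Hypothetical provider skeleton; bnxt_dummy_* names are made up. */
	static int bnxt_dummy_mp_init(struct page_pool *pool)
	{
		/* Provider state (e.g. a dmabuf binding) is expected to have been
		 * stashed in pool->mp_priv before page_pool_create(); refuse to
		 * init without it.
		 */
		return pool->mp_priv ? 0 : -EINVAL;
	}

	static void bnxt_dummy_mp_destroy(struct page_pool *pool)
	{
		/* Drop whatever reference init() took on pool->mp_priv. */
	}

	static struct page *bnxt_dummy_mp_alloc_pages(struct page_pool *pool,
						      gfp_t gfp)
	{
		/* A real provider hands back a page whose "ownership" reference
		 * it already holds; NULL here just means allocation failure.
		 */
		return NULL;
	}

	static bool bnxt_dummy_mp_release_page(struct page_pool *pool,
					       struct page *page)
	{
		/* Keep the reference with the provider and tell
		 * page_pool_return_page() not to clear pp info or put_page().
		 */
		return false;
	}

	static const struct memory_provider_ops bnxt_dummy_mp_ops = {
		.init		= bnxt_dummy_mp_init,
		.destroy	= bnxt_dummy_mp_destroy,
		.alloc_pages	= bnxt_dummy_mp_alloc_pages,
		.release_page	= bnxt_dummy_mp_release_page,
	};

If that matches the intent, then returning false from release_page() is
what keeps the "ownership" reference with the provider while the pp stops
tracking the page, per the commit message.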