Date: Sun, 17 Dec 2023 18:40:08 -0800
In-Reply-To: <20231218024024.3516870-1-almasrymina@google.com>
References: <20231218024024.3516870-1-almasrymina@google.com>
Message-ID: <20231218024024.3516870-2-almasrymina@google.com>
Subject: [RFC PATCH net-next v5 01/14] net: page_pool: create hooks for custom page providers
From: Mina Almasry
To: Mathieu Desnoyers, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-alpha@vger.kernel.org,
	linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
	sparclinux@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org, bpf@vger.kernel.org,
	linux-kselftest@vger.kernel.org, linux-media@vger.kernel.org,
	dri-devel@lists.freedesktop.org
Cc: Mina Almasry, "David S. Miller", Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Jonathan Corbet, Richard Henderson, Ivan Kokshaysky,
	Matt Turner, Thomas Bogendoerfer, "James E.J. Bottomley",
	Helge Deller, Jesper Dangaard Brouer, Ilias Apalodimas,
	Steven Rostedt, Masami Hiramatsu, Arnd Bergmann,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, David Ahern,
	Willem de Bruijn, Shuah Khan, Sumit Semwal, Christian König,
	Pavel Begunkov, David Wei, Jason Gunthorpe, Yunsheng Lin,
	Shailend Chand, Harshitha Ramamurthy, Shakeel Butt,
	Jeroen de Borst, Praveen Kaligineedi

From: Jakub Kicinski

Page providers which try to reuse the same pages will need to hold
onto the ref even if the page gets released from the pool. Releasing
the page from the pp just transfers the "ownership" reference from the
pp to the provider, and the provider will wait for other references to
be gone before feeding this page back into the pool.

Signed-off-by: Jakub Kicinski
Signed-off-by: Mina Almasry

---

This is implemented by Jakub in his RFC:
https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168d79@redhat.com/T/

I take no credit for the idea or implementation; I only added minor
edits to make this workable with device memory TCP, and removed some
hacky test code. This is a critical dependency of device memory TCP
and thus I'm pulling it into this series to make it reviewable and
mergeable.

RFC v3 -> v1
- Removed unused mem_provider. (Yunsheng).
- Replaced memory_provider & mp_priv with netdev_rx_queue (Jakub).

---
 include/net/page_pool/types.h | 12 ++++++++++
 net/core/page_pool.c          | 43 +++++++++++++++++++++++++++++++----
 2 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
index 76481c465375..4c090e86e550 100644
--- a/include/net/page_pool/types.h
+++ b/include/net/page_pool/types.h
@@ -51,6 +51,7 @@ struct pp_alloc_cache {
  * @dev:	device, for DMA pre-mapping purposes
  * @netdev:	netdev this pool will serve (leave as NULL if none or multiple)
  * @napi:	NAPI which is the sole consumer of pages, otherwise NULL
+ * @queue:	struct netdev_rx_queue this page_pool is being created for.
  * @dma_dir:	DMA mapping direction
  * @max_len:	max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
  * @offset:	DMA sync address offset for PP_FLAG_DMA_SYNC_DEV
@@ -63,6 +64,7 @@ struct page_pool_params {
 		int		nid;
 		struct device	*dev;
 		struct napi_struct *napi;
+		struct netdev_rx_queue *queue;
 		enum dma_data_direction dma_dir;
 		unsigned int	max_len;
 		unsigned int	offset;
@@ -125,6 +127,13 @@ struct page_pool_stats {
 };
 #endif
 
+struct memory_provider_ops {
+	int (*init)(struct page_pool *pool);
+	void (*destroy)(struct page_pool *pool);
+	struct page *(*alloc_pages)(struct page_pool *pool, gfp_t gfp);
+	bool (*release_page)(struct page_pool *pool, struct page *page);
+};
+
 struct page_pool {
 	struct page_pool_params_fast p;
 
@@ -174,6 +183,9 @@ struct page_pool {
 	 */
 	struct ptr_ring ring;
 
+	void *mp_priv;
+	const struct memory_provider_ops *mp_ops;
+
 #ifdef CONFIG_PAGE_POOL_STATS
 	/* recycle stats are per-cpu to avoid locking */
 	struct page_pool_recycle_stats __percpu *recycle_stats;
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index dd5a72533f2b..45bb4210412d 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -25,6 +25,8 @@
 
 #include "page_pool_priv.h"
 
+static DEFINE_STATIC_KEY_FALSE(page_pool_mem_providers);
+
 #define DEFER_TIME (msecs_to_jiffies(1000))
 #define DEFER_WARN_INTERVAL (60 * HZ)
 
@@ -174,6 +176,7 @@ static int page_pool_init(struct page_pool *pool,
 			  const struct page_pool_params *params)
 {
 	unsigned int ring_qsize = 1024; /* Default */
+	int err;
 
 	memcpy(&pool->p, &params->fast, sizeof(pool->p));
 	memcpy(&pool->slow, &params->slow, sizeof(pool->slow));
@@ -234,10 +237,25 @@ static int page_pool_init(struct page_pool *pool,
 	/* Driver calling page_pool_create() also call page_pool_destroy() */
 	refcount_set(&pool->user_cnt, 1);
 
+	if (pool->mp_ops) {
+		err = pool->mp_ops->init(pool);
+		if (err) {
+			pr_warn("%s() mem-provider init failed %d\n",
+				__func__, err);
+			goto free_ptr_ring;
+		}
+
+		static_branch_inc(&page_pool_mem_providers);
+	}
+
 	if (pool->p.flags & PP_FLAG_DMA_MAP)
 		get_device(pool->p.dev);
 
 	return 0;
+
+free_ptr_ring:
+	ptr_ring_cleanup(&pool->ring, NULL);
+	return err;
 }
 
 static void page_pool_uninit(struct page_pool *pool)
@@ -519,7 +537,10 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
 		return page;
 
 	/* Slow-path: cache empty, do real allocation */
-	page = __page_pool_alloc_pages_slow(pool, gfp);
+	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
+		page = pool->mp_ops->alloc_pages(pool, gfp);
+	else
+		page = __page_pool_alloc_pages_slow(pool, gfp);
 	return page;
 }
 EXPORT_SYMBOL(page_pool_alloc_pages);
@@ -576,10 +597,13 @@ void __page_pool_release_page_dma(struct page_pool *pool, struct page *page)
 void page_pool_return_page(struct page_pool *pool, struct page *page)
 {
 	int count;
+	bool put;
 
-	__page_pool_release_page_dma(pool, page);
-
-	page_pool_clear_pp_info(page);
+	put = true;
+	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)
+		put = pool->mp_ops->release_page(pool, page);
+	else
+		__page_pool_release_page_dma(pool, page);
 
 	/* This may be the last page returned, releasing the pool, so
 	 * it is not safe to reference pool afterwards.
@@ -587,7 +611,10 @@ void page_pool_return_page(struct page_pool *pool, struct page *page)
 	count = atomic_inc_return_relaxed(&pool->pages_state_release_cnt);
 	trace_page_pool_state_release(pool, page, count);
 
-	put_page(page);
+	if (put) {
+		page_pool_clear_pp_info(page);
+		put_page(page);
+	}
 	/* An optimization would be to call __free_pages(page, pool->p.order)
 	 * knowing page is not part of page-cache (thus avoiding a
 	 * __page_cache_release() call).
@@ -857,6 +884,12 @@ static void __page_pool_destroy(struct page_pool *pool)
 
 	page_pool_unlist(pool);
 	page_pool_uninit(pool);
+
+	if (pool->mp_ops) {
+		pool->mp_ops->destroy(pool);
+		static_branch_dec(&page_pool_mem_providers);
+	}
+
 	kfree(pool);
 }
 
-- 
2.43.0.472.g3155946c3a-goog
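
For illustration only, here is a rough userspace sketch of the ownership
hand-off the commit message describes. It is not kernel code and not
part of the patch. struct page and struct page_pool are stubs,
gfp_t is dropped, the static key is left out, and demo_provider,
demo_ops, sketch_alloc_page() and sketch_return_page() are hypothetical
names invented for this example. The assumption it makes is that a
provider signals "I am keeping the ownership reference" by returning
false from release_page(), which is how the put/no-put branch in
page_pool_return_page() reads.

```c
/*
 * Userspace sketch, NOT kernel code. struct page and struct page_pool are
 * stand-in stubs; demo_provider, demo_ops, sketch_alloc_page() and
 * sketch_return_page() are hypothetical names used only for illustration.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page {
	int refcount;		/* stands in for the kernel page refcount */
};

struct page_pool;

/* Same shape as the ops table added by the patch; gfp_t is omitted here. */
struct memory_provider_ops {
	int (*init)(struct page_pool *pool);
	void (*destroy)(struct page_pool *pool);
	struct page *(*alloc_pages)(struct page_pool *pool);
	bool (*release_page)(struct page_pool *pool, struct page *page);
};

struct page_pool {
	void *mp_priv;
	const struct memory_provider_ops *mp_ops;
	int release_cnt;	/* stands in for pages_state_release_cnt */
};

/* Hypothetical provider that owns exactly one reusable page. */
struct demo_provider {
	struct page page;
	bool lent;		/* true while the pool holds the ownership ref */
};

static int demo_init(struct page_pool *pool)
{
	struct demo_provider *prov = pool->mp_priv;

	prov->page.refcount = 1;	/* the single "ownership" reference */
	prov->lent = false;
	return 0;
}

static void demo_destroy(struct page_pool *pool)
{
	(void)pool;			/* nothing to free in this sketch */
}

static struct page *demo_alloc_pages(struct page_pool *pool)
{
	struct demo_provider *prov = pool->mp_priv;

	/* Only hand the page out again once all other references are gone. */
	if (prov->lent || prov->page.refcount != 1)
		return NULL;
	prov->lent = true;
	return &prov->page;
}

static bool demo_release_page(struct page_pool *pool, struct page *page)
{
	struct demo_provider *prov = pool->mp_priv;

	(void)page;
	prov->lent = false;
	return false;	/* provider keeps the ref: pool must not put the page */
}

const struct memory_provider_ops demo_ops = {
	.init = demo_init,
	.destroy = demo_destroy,
	.alloc_pages = demo_alloc_pages,
	.release_page = demo_release_page,
};

/* Mirrors the dispatch in page_pool_alloc_pages(), minus the static key. */
struct page *sketch_alloc_page(struct page_pool *pool)
{
	if (pool->mp_ops)
		return pool->mp_ops->alloc_pages(pool);
	return NULL;	/* the real slow path would allocate from the system */
}

/* Mirrors the put/no-put decision in page_pool_return_page(). */
void sketch_return_page(struct page_pool *pool, struct page *page)
{
	bool put = true;

	if (pool->mp_ops)
		put = pool->mp_ops->release_page(pool, page);
	pool->release_cnt++;
	if (put) {
		assert(page->refcount > 0);
		page->refcount--;	/* stands in for put_page() */
	}
}
```

To try it, call demo_ops.init() on a pool whose mp_priv points at a
demo_provider, allocate the page, take an extra reference on it and
return it to the pool. The refcount stays untouched by the return, and
sketch_alloc_page() yields NULL until the extra reference is dropped.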