From: Ilias Apalodimas
Date: Tue, 12 Dec 2023 10:07:18 +0200
Subject: Re: [net-next v1 02/16] net: page_pool: create hooks for custom page providers
To: Mina Almasry
Cc: Shailend Chand, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-arch@vger.kernel.org,
    linux-kselftest@vger.kernel.org, bpf@vger.kernel.org,
    linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
    "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    Jonathan Corbet, Jeroen de Borst, Praveen Kaligineedi,
    Jesper Dangaard Brouer, Arnd Bergmann, David Ahern,
    Willem de Bruijn, Shuah Khan, Sumit Semwal, Christian König,
    Yunsheng Lin, Harshitha Ramamurthy, Shakeel Butt
In-Reply-To: <20231208005250.2910004-3-almasrymina@google.com>
References: <20231208005250.2910004-1-almasrymina@google.com>
 <20231208005250.2910004-3-almasrymina@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Mina,

Apologies for not participating in the party earlier.

On Fri, 8 Dec 2023 at 02:52, Mina Almasry wrote:
>
> From: Jakub Kicinski
>
> The page providers which try to reuse the same pages will
> need to hold onto the ref, even if the page gets released from
> the pool - as releasing the page from the pp just transfers
> the "ownership" reference from the pp to the provider, and the
> provider will wait for other references to be gone before feeding
> this page back into the pool.
>
> Signed-off-by: Jakub Kicinski
> Signed-off-by: Mina Almasry
>
> ---
>
> This is implemented by Jakub in his RFC:
> https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168d79@redhat.com/T/
>
> I take no credit for the idea or implementation; I only added minor
> edits to make this workable with device memory TCP, and removed some
> hacky test code. This is a critical dependency of device memory TCP
> and thus I'm pulling it into this series to make it reviewable and
> mergeable.
>
> RFC v3 -> v1
> - Removed unused mem_provider. (Yunsheng).
> - Replaced memory_provider & mp_priv with netdev_rx_queue (Jakub).
>
> ---
>  include/net/page_pool/types.h | 12 ++++++++++
>  net/core/page_pool.c          | 43 +++++++++++++++++++++++++++++++----
>  2 files changed, 50 insertions(+), 5 deletions(-)
>
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index ac286ea8ce2d..0e9fa79a5ef1 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -51,6 +51,7 @@ struct pp_alloc_cache {
>   * @dev: device, for DMA pre-mapping purposes
>   * @netdev: netdev this pool will serve (leave as NULL if none or multiple)
>   * @napi: NAPI which is the sole consumer of pages, otherwise NULL
> + * @queue: struct netdev_rx_queue this page_pool is being created for.
>   * @dma_dir: DMA mapping direction
>   * @max_len: max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
>   * @offset: DMA sync address offset for PP_FLAG_DMA_SYNC_DEV
> @@ -63,6 +64,7 @@ struct page_pool_params {
>  	int nid;
>  	struct device *dev;
>  	struct napi_struct *napi;
> +	struct netdev_rx_queue *queue;
>  	enum dma_data_direction dma_dir;
>  	unsigned int max_len;
>  	unsigned int offset;
> @@ -125,6 +127,13 @@ struct page_pool_stats {
>  };
>  #endif
>
> +struct memory_provider_ops {
> +	int (*init)(struct page_pool *pool);
> +	void (*destroy)(struct page_pool *pool);
> +	struct page *(*alloc_pages)(struct page_pool *pool, gfp_t gfp);
> +	bool (*release_page)(struct page_pool *pool, struct page *page);
> +};
> +
>  struct page_pool {
>  	struct page_pool_params_fast p;
>
> @@ -174,6 +183,9 @@ struct page_pool {
>  	 */
>  	struct ptr_ring ring;
>
> +	void *mp_priv;
> +	const struct memory_provider_ops *mp_ops;
> +
>  #ifdef CONFIG_PAGE_POOL_STATS
>  	/* recycle stats are per-cpu to avoid locking */
>  	struct page_pool_recycle_stats __percpu *recycle_stats;
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index ca1b3b65c9b5..f5c84d2a4510 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -25,6 +25,8 @@
>
>  #include "page_pool_priv.h"
>
> +static DEFINE_STATIC_KEY_FALSE(page_pool_mem_providers);

We could add the existing page pool mechanisms as another 'provider',
but I assume this is coded like this for performance reasons (IOW skip
the expensive ptr call for the default case?)

> +
>  #define DEFER_TIME (msecs_to_jiffies(1000))
>  #define DEFER_WARN_INTERVAL (60 * HZ)
>
> @@ -174,6 +176,7 @@ static int page_pool_init(struct page_pool *pool,
>  			  const struct page_pool_params *params)
>  {
>  	unsigned int ring_qsize = 1024; /* Default */
> +	int err;
>
>  	memcpy(&pool->p, &params->fast, sizeof(pool->p));
>  	memcpy(&pool->slow, &params->slow, sizeof(pool->slow));
> @@ -234,10 +237,25 @@ static int page_pool_init(struct page_pool *pool,
>  	/* Driver calling page_pool_create() also call page_pool_destroy() */
>  	refcount_set(&pool->user_cnt, 1);
>
> +	if (pool->mp_ops) {
> +		err = pool->mp_ops->init(pool);
> +		if (err) {
> +			pr_warn("%s() mem-provider init failed %d\n",
> +				__func__, err);
> +			goto free_ptr_ring;
> +		}
> +
> +		static_branch_inc(&page_pool_mem_providers);
> +	}
> +
>  	if (pool->p.flags & PP_FLAG_DMA_MAP)
>  		get_device(pool->p.dev);
>
>  	return 0;
> +
> +free_ptr_ring:
> +	ptr_ring_cleanup(&pool->ring, NULL);
> +	return err;
>  }
>
>  static void page_pool_uninit(struct page_pool *pool)
> @@ -519,7 +537,10 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
>  		return page;
>
>  	/* Slow-path: cache empty, do real allocation */
> -	page = __page_pool_alloc_pages_slow(pool, gfp);
> +	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)

Why do we need && pool->mp_ops?
On the init function, we only bump page_pool_mem_providers if the ops
are there.

> +		page = pool->mp_ops->alloc_pages(pool, gfp);
> +	else
> +		page = __page_pool_alloc_pages_slow(pool, gfp);
>  	return page;
>  }
>  EXPORT_SYMBOL(page_pool_alloc_pages);
> @@ -576,10 +597,13 @@ void __page_pool_release_page_dma(struct page_pool *pool, struct page *page)
>  void page_pool_return_page(struct page_pool *pool, struct page *page)
>  {
>  	int count;
> +	bool put;
>
> -	__page_pool_release_page_dma(pool, page);
> -
> -	page_pool_clear_pp_info(page);
> +	put = true;
> +	if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)

ditto

> +		put = pool->mp_ops->release_page(pool, page);
> +	else
> +		__page_pool_release_page_dma(pool, page);
>
>  	/* This may be the last page returned, releasing the pool, so
>  	 * it is not safe to reference pool afterwards.
> @@ -587,7 +611,10 @@ void page_pool_return_page(struct page_pool *pool, struct page *page)
>  	count = atomic_inc_return_relaxed(&pool->pages_state_release_cnt);
>  	trace_page_pool_state_release(pool, page, count);
>
> -	put_page(page);
> +	if (put) {
> +		page_pool_clear_pp_info(page);
> +		put_page(page);
> +	}
>  	/* An optimization would be to call __free_pages(page, pool->p.order)
>  	 * knowing page is not part of page-cache (thus avoiding a
>  	 * __page_cache_release() call).
> @@ -857,6 +884,12 @@ static void __page_pool_destroy(struct page_pool *pool)
>
>  	page_pool_unlist(pool);
>  	page_pool_uninit(pool);
> +
> +	if (pool->mp_ops) {

Same here. Using a mix of pool->mp_ops and page_pool_mem_providers
will work, but since we always check the ptr on init, can't we simply
rely on page_pool_mem_providers for the rest of the code?

Thanks
/Ilias

> +		pool->mp_ops->destroy(pool);
> +		static_branch_dec(&page_pool_mem_providers);
> +	}
> +
>  	kfree(pool);
>  }
>
> --
> 2.43.0.472.g3155946c3a-goog
>