Subject: Re: [PATCH net-next 2/7] page_pool: support non-split page with PP_FLAG_PAGE_FRAG
To: Jesper Dangaard Brouer
From: Yunsheng Lin
Message-ID: <2f5a4b07-7acf-1838-cb42-fd8b5d4ba4c6@huawei.com>
Date: Fri, 24 Sep 2021 15:23:16 +0800
References: <20210922094131.15625-1-linyunsheng@huawei.com> <20210922094131.15625-3-linyunsheng@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021/9/23 20:08, Jesper Dangaard Brouer wrote:
>
>
> On 22/09/2021 11.41, Yunsheng Lin wrote:
>> Currently when PP_FLAG_PAGE_FRAG is set, the caller is not
>> expected to call page_pool_alloc_pages() directly because of
>> the PP_FLAG_PAGE_FRAG checking in __page_pool_put_page().
>>
>> The patch removes the above checking to enable non-split page
>> support when PP_FLAG_PAGE_FRAG is set.
>>
>> Reviewed-by: Alexander Duyck
>> Signed-off-by: Yunsheng Lin
>> ---
>>  net/core/page_pool.c | 12 +++++++-----
>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>> index a65bd7972e37..f7e71dcb6a2e 100644
>> --- a/net/core/page_pool.c
>> +++ b/net/core/page_pool.c
>> @@ -315,11 +315,14 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
>>  	/* Fast-path: Get a page from cache */
>>  	page = __page_pool_get_cached(pool);
>> -	if (page)
>> -		return page;
>>
>>  	/* Slow-path: cache empty, do real allocation */
>> -	page = __page_pool_alloc_pages_slow(pool, gfp);
>> +	if (!page)
>> +		page = __page_pool_alloc_pages_slow(pool, gfp);
>> +
>> +	if (likely(page))
>> +		page_pool_set_frag_count(page, 1);
>> +
>
> I really don't like that you add one atomic_long_set operation per page alloc call.
> This is a fast-path for XDP use-cases, which you are ignoring as your drivers don't implement XDP.
>
> As I cannot ask you to run XDP benchmarks, I fortunately have some page_pool-specific microbenchmarks you can run instead.
>
> I will ask you to provide before and after results from running these benchmarks [1] and [2].
>
> [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_simple.c
>
> [2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_cross_cpu.c
>
> How to use these modules is documented here [3]:
> [3] https://prototype-kernel.readthedocs.io/en/latest/prototype-kernel/build-process.html

I will run these benchmarks to see if there is any noticeable performance overhead here, thanks for the benchmarks.

>
>>  	return page;
>>  }
>>  EXPORT_SYMBOL(page_pool_alloc_pages);
>> @@ -428,8 +431,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
>>  		     unsigned int dma_sync_size, bool allow_direct)
>>  {
>>  	/* It is not the last user for the page frag case */
>> -	if (pool->p.flags & PP_FLAG_PAGE_FRAG &&
>> -	    page_pool_atomic_sub_frag_count_return(page, 1))
>> +	if (page_pool_atomic_sub_frag_count_return(page, 1))
>>  		return NULL;
>
> This adds an atomic_long_read, even when PP_FLAG_PAGE_FRAG is not set.

The point here is to have consistent handling for both the PP_FLAG_PAGE_FRAG and non-PP_FLAG_PAGE_FRAG cases in the following patch.

As page->_refcount is already accessed by the "page_ref_count(page) == 1" check in __page_pool_put_page(), and page->pp_frag_count is most likely in the same cache line as page->_refcount, I am not expecting a noticeable overhead here. Anyway, I will use the above benchmarks to verify it.

>
>>  	/* This allocator is optimized for the XDP mode that uses
>>
>
> .
>