Subject: Re: [Linuxarm] Re: [PATCH net-next 2/7] page_pool: support non-split page with PP_FLAG_PAGE_FRAG
From: Yunsheng Lin <linyunsheng@huawei.com>
To: Jesper Dangaard Brouer
Date: Thu, 30 Sep 2021 15:28:39 +0800
In-Reply-To: <2f5a4b07-7acf-1838-cb42-fd8b5d4ba4c6@huawei.com>
References: <20210922094131.15625-1-linyunsheng@huawei.com>
 <20210922094131.15625-3-linyunsheng@huawei.com>
 <2f5a4b07-7acf-1838-cb42-fd8b5d4ba4c6@huawei.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2021/9/24 15:23, Yunsheng Lin wrote:
> On 2021/9/23 20:08, Jesper Dangaard Brouer wrote:
>>
>> On 22/09/2021 11.41, Yunsheng Lin wrote:
>>> Currently when PP_FLAG_PAGE_FRAG is set, the caller is not
>>> expected to call page_pool_alloc_pages() directly because of
>>> the PP_FLAG_PAGE_FRAG checking in __page_pool_put_page().
>>>
>>> The patch removes the above checking to enable non-split page
>>> support when PP_FLAG_PAGE_FRAG is set.
>>>
>>> Reviewed-by: Alexander Duyck
>>> Signed-off-by: Yunsheng Lin
>>> ---
>>>  net/core/page_pool.c | 12 +++++++-----
>>>  1 file changed, 7 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>>> index a65bd7972e37..f7e71dcb6a2e 100644
>>> --- a/net/core/page_pool.c
>>> +++ b/net/core/page_pool.c
>>> @@ -315,11 +315,14 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
>>>
>>>  	/* Fast-path: Get a page from cache */
>>>  	page = __page_pool_get_cached(pool);
>>> -	if (page)
>>> -		return page;
>>>
>>>  	/* Slow-path: cache empty, do real allocation */
>>> -	page = __page_pool_alloc_pages_slow(pool, gfp);
>>> +	if (!page)
>>> +		page = __page_pool_alloc_pages_slow(pool, gfp);
>>> +
>>> +	if (likely(page))
>>> +		page_pool_set_frag_count(page, 1);
>>> +
>>
>> I really don't like that you add one atomic_long_set operation per page
>> alloc call. This is a fast path for XDP use cases, which you are ignoring
>> as your drivers don't implement XDP.
>>
>> As I cannot ask you to run XDP benchmarks, I fortunately have some
>> page_pool specific microbenchmarks you can run instead.
>>
>> I will ask you to provide before and after results from running these
>> benchmarks [1] and [2].
>>
>> [1] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_simple.c
>> [2] https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/bench_page_pool_cross_cpu.c
>>
>> How to use these modules is documented here [3]:
>> [3] https://prototype-kernel.readthedocs.io/en/latest/prototype-kernel/build-process.html
>
> Will run these benchmarks to see if there is any noticeable performance
> overhead here, thanks for the benchmarks.

You are right, there is noticeable overhead for the bench_page_pool_cross_cpu
test case above, possibly due to the cache bouncing caused by
page_pool_set_frag_count().

As mentioned by Ilias, mlx5 uses page pool and also does the recycling
internally, so handling the page frag tracking consistently for both the
PP_FLAG_PAGE_FRAG and non-PP_FLAG_PAGE_FRAG cases would break the mlx5
driver.

So I will drop this patch for now.
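For reference, the helper blamed for the bouncing is a single atomic store.
A from-memory sketch along the lines of the 5.15-era include/net/page_pool.h
(illustrative, not a verbatim quote):

	/* Initialize the frag reference count of a freshly allocated page.
	 * The atomic_long_set() is a cheap store on the allocating CPU, but
	 * it dirties the cache line holding pp_frag_count, which is what the
	 * cross-CPU benchmark sees as cache-line bouncing when the page is
	 * later freed on a remote CPU.
	 */
	static inline void page_pool_set_frag_count(struct page *page, long nr)
	{
		atomic_long_set(&page->pp_frag_count, nr);
	}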
>
>>
>>>  	return page;
>>>  }
>>>  EXPORT_SYMBOL(page_pool_alloc_pages);
>>> @@ -428,8 +431,7 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
>>>  			     unsigned int dma_sync_size, bool allow_direct)
>>>  {
>>>  	/* It is not the last user for the page frag case */
>>> -	if (pool->p.flags & PP_FLAG_PAGE_FRAG &&
>>> -	    page_pool_atomic_sub_frag_count_return(page, 1))
>>> +	if (page_pool_atomic_sub_frag_count_return(page, 1))
>>>  		return NULL;
>>
>> This adds an atomic_long_read, even when PP_FLAG_PAGE_FRAG is not set.
>
> The point here is to have consistent handling for both the
> PP_FLAG_PAGE_FRAG and non-PP_FLAG_PAGE_FRAG cases in the following patch.
>
> As page->_refcount is accessed in the "page_ref_count(page) == 1" check in
> __page_pool_put_page(), and page->pp_frag_count is most likely in the same
> cache line as page->_refcount, I am not expecting a noticeable overhead
> here.
>
> Anyway, I will use the above benchmarks to verify it.
>
>>>  	/* This allocator is optimized for the XDP mode that uses
>>
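For context, a from-memory sketch of the put-path helper being discussed,
again along the lines of the 5.15-era include/net/page_pool.h rather than a
verbatim quote:

	static inline long page_pool_atomic_sub_frag_count_return(struct page *page,
								  long nr)
	{
		long ret;

		/* If the caller holds the only outstanding reference, nothing
		 * else can be updating pp_frag_count concurrently, so a plain
		 * atomic_long_read() can stand in for the more expensive
		 * atomic_long_sub_return().
		 */
		if (atomic_long_read(&page->pp_frag_count) == nr)
			return 0;

		ret = atomic_long_sub_return(nr, &page->pp_frag_count);
		WARN_ON(ret < 0);
		return ret;
	}

The atomic_long_read() above is the read Jesper is pointing at. The cache
line argument holds in the common case: struct page is 64 bytes and
cache-line aligned on typical 64-bit configs, so pp_frag_count and
_refcount normally sit in the same line that __page_pool_put_page() touches
anyway via page_ref_count().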