Subject: Re: [Linuxarm] Re: [PATCH net-next v2 3/3] skbuff: keep track of pp page when __skb_frag_ref() is called
To: Eric Dumazet
CC: Jesper Dangaard Brouer, Ilias Apalodimas, Alexander Duyck, David Miller, Jakub Kicinski, netdev, LKML, Jonathan Lemon, Alexander Lobakin, Willem de Bruijn, Cong Wang, Paolo Abeni, Kevin Hao, Aleksandr Nogikh, Marco Elver, David Ahern
References: <20210914121114.28559-1-linyunsheng@huawei.com> <20210914121114.28559-4-linyunsheng@huawei.com> <9467ec14-af34-bba4-1ece-6f5ea199ec97@huawei.com> <0337e2f6-5428-2c75-71a5-6db31c60650a@redhat.com>
From: Yunsheng Lin
Message-ID: <0c59b17a-4bc7-9082-1362-77256bec9abe@huawei.com>
Date: Sat, 18 Sep 2021 10:42:00 +0800

On 2021/9/18 1:15, Eric Dumazet wrote:
> On Wed, Sep 15, 2021 at 7:05 PM Yunsheng Lin wrote:
>
>> As mentioned before, Tx recycling is based on a page_pool instance per
>> socket. It shares the page_pool instance with rx.
>>
>> Anyway, based on feedback from edumazet and dsahern, I am still trying
>> to see if the page pool is meaningful for tx.
>
> It is not for generic linux TCP stack, but perhaps for benchmarks.

I am not sure I understand what the above means. Do you mean that tx
recycling only benefits benchmark tools such as iperf/netperf, but not
real use cases?

> Unless you dedicate one TX/RX pair per TCP socket ?

Do you mean a TX/RX pair of netdev queues, or a TX/RX pair of recycling
pools? As for the netdev queues, I am not dedicating one TX/RX queue
pair per TCP socket.
As for the TX/RX pair of recycling pools: my initial thinking is that
each NAPI/socket context has a 'struct pp_alloc_cache', which provides
a lockless, last-in-first-out mini pool specific to that context, plus
a central locked 'struct ptr_ring' pool shared by all the NAPI/socket
mini pools. When a context's mini pool is empty or full, it can refill
some pages from the central pool or flush some pages to it.

I am not sure yet whether the locked central pool is needed at all, or
whether page_pool's 'struct ptr_ring' is the right thing to use as that
central pool.

> Most high performance TCP flows are using zerocopy, I am really not
> sure why we would need to 'optimize' the path that is wasting cpu
> cycles doing user->kernel copies anyway, at the cost of insane
> complexity.

As I understand it, zerocopy is mostly about the big-packet and
non-IOMMU cases.

As for complexity, I am not yet convinced it is that complex, since tx
recycling mostly reuses the existing infrastructure. The point is that
most skbs are freed in NAPI or socket context, so it seems we may
utilize that to batch the allocating/freeing of skbs/page_frags, or to
reuse skbs/page_frags/dma mappings, avoiding (IO/CPU) TLB misses, cache
misses, spinlock overhead and dma mapping overhead.