From: Liang Chen
Date: Sat, 27 May 2023 20:36:09 +0800
Subject: Re: [PATCH net-next 3/5] virtio_net: Add page pool fragmentation support
To: Horatiu Vultur
Cc: jasowang@redhat.com, mst@redhat.com,
    virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, xuanzhuo@linux.alibaba.com, kuba@kernel.org,
    edumazet@google.com, davem@davemloft.net, pabeni@redhat.com,
    alexander.duyck@gmail.com
References: <20230526054621.18371-1-liangchen.linux@gmail.com>
    <20230526054621.18371-3-liangchen.linux@gmail.com>
    <20230526082914.owofnszwdjgcjwhi@soft-dev3-1>
In-Reply-To: <20230526082914.owofnszwdjgcjwhi@soft-dev3-1>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, May 26, 2023 at 4:29 PM Horatiu Vultur wrote:
>
> The 05/26/2023 13:46, Liang Chen wrote:
>
> Hi Liang,
>
> >
> > To further enhance performance, implement page pool fragmentation
> > support and introduce a module parameter to enable or disable it.
> >
> > In single-core vm testing environments, there is an additional performance
> > gain observed in the normal path compared to the one packet per page
> > approach.
> >   Upstream codebase: 47.5 Gbits/sec
> >   Upstream codebase with page pool: 50.2 Gbits/sec
> >   Upstream codebase with page pool fragmentation support: 52.3 Gbits/sec
> >
> > There is also some performance gain for XDP cpumap.
> >   Upstream codebase: 1.38 Gbits/sec
> >   Upstream codebase with page pool: 9.74 Gbits/sec
> >   Upstream codebase with page pool fragmentation: 10.3 Gbits/sec
> >
> > Signed-off-by: Liang Chen
> > ---
> >  drivers/net/virtio_net.c | 72 ++++++++++++++++++++++++++++++----------
> >  1 file changed, 55 insertions(+), 17 deletions(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index 99c0ca0c1781..ac40b8c66c59 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -32,7 +32,9 @@ module_param(gso, bool, 0444);
> >  module_param(napi_tx, bool, 0644);
> >
> >  static bool page_pool_enabled;
> > +static bool page_pool_frag;
> >  module_param(page_pool_enabled, bool, 0400);
> > +module_param(page_pool_frag, bool, 0400);
> >
> >  /* FIXME: MTU in config. */
> >  #define GOOD_PACKET_LEN (ETH_HLEN + VLAN_HLEN + ETH_DATA_LEN)
> > @@ -909,23 +911,32 @@ static struct page *xdp_linearize_page(struct receive_queue *rq,
> >                                         struct page *p,
> >                                         int offset,
> >                                         int page_off,
> > -                                       unsigned int *len)
> > +                                       unsigned int *len,
> > +                                       unsigned int *pp_frag_offset)
>
> The 'unsigned int *pp_frag_offset' seems to be unaligned.
>

Sure, Thanks!

> >  {
> >          int tailroom = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> >          struct page *page;
> > +        unsigned int pp_frag_offset_val;
>
> Please use reverse christmas tree notation here. The pp_frag_offset_val
> needs to be declared before page;
>

Sure. Will do on v2.

> >
> >          if (page_off + *len + tailroom > PAGE_SIZE)
> >                  return NULL;
> >
> >          if (rq->page_pool)
> > -                page = page_pool_dev_alloc_pages(rq->page_pool);
> > +                if (rq->page_pool->p.flags & PP_FLAG_PAGE_FRAG)
> > +                        page = page_pool_dev_alloc_frag(rq->page_pool, pp_frag_offset,
> > +                                                        PAGE_SIZE);
>
> Don't you need to check if pp_frag_offset is null? As you call once with
> NULL.
>

At the moment, page_pool is enabled only for mergeable mode, and the
path leading to a call with NULL pp_frag_offset is from small mode.
But I will evaluate again whether it is beneficial to support
page_pool for small mode on v2. Thanks.

> > +                else
> > +                        page = page_pool_dev_alloc_pages(rq->page_pool);
> >          else
> >                  page = alloc_page(GFP_ATOMIC);
> >
> >          if (!page)
> >                  return NULL;
> >
> > -        memcpy(page_address(page) + page_off, page_address(p) + offset, *len);
> > +        pp_frag_offset_val = pp_frag_offset ? *pp_frag_offset : 0;
> > +
> > +        memcpy(page_address(page) + page_off + pp_frag_offset_val,
> > +               page_address(p) + offset, *len);
> >          page_off += *len;
> >
> >          while (--*num_buf) {
> > @@ -948,7 +959,7 @@ static struct page *xdp_linearize_page(struct receive_queue *rq,
> >                          goto err_buf;
> >                  }
> >
> > -                memcpy(page_address(page) + page_off,
> > +                memcpy(page_address(page) + page_off + pp_frag_offset_val,
> >                         page_address(p) + off, buflen);
> >                  page_off += buflen;
> >                  virtnet_put_page(rq, p);
> > @@ -1029,7 +1040,7 @@ static struct sk_buff *receive_small_xdp(struct net_device *dev,
> >                          SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> >                  xdp_page = xdp_linearize_page(rq, &num_buf, page,
> >                                                offset, header_offset,
> > -                                              &tlen);
> > +                                              &tlen, NULL);
> >                  if (!xdp_page)
> >                          goto err_xdp;
> >
> > @@ -1323,6 +1334,7 @@ static void *mergeable_xdp_get_buf(struct virtnet_info *vi,
> >          unsigned int headroom = mergeable_ctx_to_headroom(ctx);
> >          struct page *xdp_page;
> >          unsigned int xdp_room;
> > +        unsigned int page_frag_offset = 0;
>
> Please use reverse x-mas tree notation.
>

Sure. Will do on v2.

> >
> >          /* Transient failure which in theory could occur if
> >           * in-flight packets from before XDP was enabled reach
> > @@ -1356,7 +1368,8 @@ static void *mergeable_xdp_get_buf(struct virtnet_info *vi,
> >                  xdp_page = xdp_linearize_page(rq, num_buf,
> >                                                *page, offset,
> >                                                VIRTIO_XDP_HEADROOM,
> > -                                              len);
> > +                                              len,
> > +                                              &page_frag_offset);
>
> You have also here some misalignment with regards to page_frag_offset.
>

Sure, Thanks!

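On the NULL pp_frag_offset question for xdp_linearize_page() above: one
possible way to make the helper safe regardless of which mode calls it is
to allocate into a local offset and write the caller's pointer only when
it is non-NULL. Below is a minimal userspace sketch of that pattern, not
the code from this patch. model_alloc_frag() and model_linearize() are
hypothetical stand-ins and do not use the page_pool API; the 4096-byte
buffer is likewise just an assumption for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical backing store standing in for one pool page. */
static unsigned char pool_page[4096];

/* Hypothetical fragment allocator: hands out consecutive slices of
 * pool_page and reports the start of the slice through *offset.
 * It always writes *offset, so callers must pass a valid pointer. */
static unsigned char *model_alloc_frag(unsigned int *offset, unsigned int size)
{
	static unsigned int next;

	if (size > sizeof(pool_page) - next)
		return NULL;
	*offset = next;
	next += size;
	return pool_page;
}

/* Copy len bytes from src into a freshly allocated fragment.
 * out_offset may be NULL when the caller does not need the offset. */
static unsigned char *model_linearize(const unsigned char *src,
				      unsigned int len,
				      unsigned int *out_offset)
{
	unsigned int frag_offset = 0;	/* local, so the allocator never sees NULL */
	unsigned char *page;

	page = model_alloc_frag(&frag_offset, len);
	if (!page)
		return NULL;

	memcpy(page + frag_offset, src, len);
	if (out_offset)
		*out_offset = frag_offset;	/* report only if requested */
	return page + frag_offset;
}
```

With this shape a caller that passes NULL (as the small-mode path does in
the hunk above) never causes the allocator to dereference NULL, and the
copy still lands at the right offset.
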
> >                  if (!xdp_page)
> >                          return NULL;
> >          } else {
> > @@ -1366,14 +1379,19 @@ static void *mergeable_xdp_get_buf(struct virtnet_info *vi,
> >                          return NULL;
> >
> >                  if (rq->page_pool)
> > -                        xdp_page = page_pool_dev_alloc_pages(rq->page_pool);
> > +                        if (rq->page_pool->p.flags & PP_FLAG_PAGE_FRAG)
> > +                                xdp_page = page_pool_dev_alloc_frag(rq->page_pool,
> > +                                                                    &page_frag_offset, PAGE_SIZE);
> > +                        else
> > +                                xdp_page = page_pool_dev_alloc_pages(rq->page_pool);
> >                  else
> >                          xdp_page = alloc_page(GFP_ATOMIC);
> > +
> >                  if (!xdp_page)
> >                          return NULL;
> >
> > -                memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM,
> > -                       page_address(*page) + offset, *len);
> > +                memcpy(page_address(xdp_page) + VIRTIO_XDP_HEADROOM +
> > +                       page_frag_offset, page_address(*page) + offset, *len);
> >          }
> >
> >          *frame_sz = PAGE_SIZE;
> > @@ -1382,7 +1400,7 @@ static void *mergeable_xdp_get_buf(struct virtnet_info *vi,
> >
> >          *page = xdp_page;
> >
> > -        return page_address(*page) + VIRTIO_XDP_HEADROOM;
> > +        return page_address(*page) + VIRTIO_XDP_HEADROOM + page_frag_offset;
> >  }
> >
> >  static struct sk_buff *receive_mergeable_xdp(struct net_device *dev,
> > @@ -1762,6 +1780,7 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
> >          void *ctx;
> >          int err;
> >          unsigned int len, hole;
> > +        unsigned int pp_frag_offset;
>
> There same here.
>

Sure, Thanks!

> >
> >          /* Extra tailroom is needed to satisfy XDP's assumption. This
> >           * means rx frags coalescing won't work, but consider we've
> > @@ -1769,13 +1788,29 @@ static int add_recvbuf_mergeable(struct virtnet_info *vi,
> >           */
> >          len = get_mergeable_buf_len(rq, &rq->mrg_avg_pkt_len, room);
> >          if (rq->page_pool) {
> > -                struct page *page;
> > +                if (rq->page_pool->p.flags & PP_FLAG_PAGE_FRAG) {
> > +                        if (unlikely(!page_pool_dev_alloc_frag(rq->page_pool,
> > +                                                               &pp_frag_offset, len + room)))
> > +                                return -ENOMEM;
> > +                        buf = (char *)page_address(rq->page_pool->frag_page) +
> > +                                pp_frag_offset;
> > +                        buf += headroom; /* advance address leaving hole at front of pkt */
> > +                        hole = (PAGE_SIZE << rq->page_pool->p.order)
> > +                                - rq->page_pool->frag_offset;
> > +                        if (hole < len + room) {
> > +                                if (!headroom)
> > +                                        len += hole;
> > +                                rq->page_pool->frag_offset += hole;
> > +                        }
> > +                } else {
> > +                        struct page *page;
> >
> > -                page = page_pool_dev_alloc_pages(rq->page_pool);
> > -                if (unlikely(!page))
> > -                        return -ENOMEM;
> > -                buf = (char *)page_address(page);
> > -                buf += headroom; /* advance address leaving hole at front of pkt */
> > +                        page = page_pool_dev_alloc_pages(rq->page_pool);
> > +                        if (unlikely(!page))
> > +                                return -ENOMEM;
> > +                        buf = (char *)page_address(page);
> > +                        buf += headroom; /* advance address leaving hole at front of pkt */
> > +                }
> >          } else {
> >                  if (unlikely(!skb_page_frag_refill(len + room, alloc_frag, gfp)))
> >                          return -ENOMEM;
> > @@ -3800,13 +3835,16 @@ static void virtnet_alloc_page_pool(struct receive_queue *rq)
> >          struct virtio_device *vdev = rq->vq->vdev;
> >
> >          struct page_pool_params pp_params = {
> > -                .order = 0,
> > +                .order = page_pool_frag ? SKB_FRAG_PAGE_ORDER : 0,
> >                  .pool_size = rq->vq->num_max,
> >                  .nid = dev_to_node(vdev->dev.parent),
> >                  .dev = vdev->dev.parent,
> >                  .offset = 0,
> >          };
> >
> > +        if (page_pool_frag)
> > +                pp_params.flags |= PP_FLAG_PAGE_FRAG;
> > +
> >          rq->page_pool = page_pool_create(&pp_params);
> >          if (IS_ERR(rq->page_pool)) {
> >                  dev_warn(&vdev->dev, "page pool creation failed: %ld\n",
> > --
> > 2.31.1
> >
> >
>
> --
> /Horatiu
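
For reference, the hole accounting in the PP_FLAG_PAGE_FRAG branch of
add_recvbuf_mergeable() quoted above can be modelled in userspace roughly
as below. This is only a sketch of the arithmetic in that hunk: struct
frag_region and absorb_tail_hole() are hypothetical names, and "offset" is
assumed to already point past the buffer that was just handed out.

```c
/* Hypothetical model of a fragmented region: total size in bytes and
 * the offset of the first byte that is still free. */
struct frag_region {
	unsigned int size;
	unsigned int offset;
};

/* If the free tail of the region is too small to hold another buffer
 * of len + room bytes, consume it now. The tail is added to the
 * current buffer's length only when no headroom is reserved, as in
 * the quoted hunk. Returns the length to advertise for the buffer. */
static unsigned int absorb_tail_hole(struct frag_region *r, unsigned int len,
				     unsigned int room, unsigned int headroom)
{
	unsigned int hole = r->size - r->offset;

	if (hole < len + room) {
		if (!headroom)
			len += hole;
		r->offset += hole;	/* region is now fully consumed */
	}
	return len;
}
```

As a worked case, assuming a 32768-byte region (4 KiB pages at order 3,
which is an assumption here): with offset 32000, len 1536, room 0 and no
headroom, the hole is 768 bytes, which is less than 1536, so the buffer
length grows to 2304 and the offset moves to 32768. With offset 16384 the
hole is 16384 bytes and nothing changes.
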