From: Jason Wang
Date: Mon, 28 Feb 2022 15:46:56 +0800
Subject: Re: [PATCH net-next v3] tun: support NAPI for packets received from batched XDP buffs
To: Harold Huang
Cc: netdev, Paolo Abeni, "David S. Miller", Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer, John Fastabend, open list, "open list:XDP (eXpress Data Path)"
References: <20220224103852.311369-1-baymaxhuang@gmail.com> <20220228033805.1579435-1-baymaxhuang@gmail.com>
In-Reply-To: <20220228033805.1579435-1-baymaxhuang@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Feb 28, 2022 at 11:38 AM Harold Huang wrote:
>
> In tun, NAPI is supported and we can also use NAPI in the path of
> batched XDP buffs to accelerate packet processing. What is more, after
> we use NAPI, GRO is also supported. The iperf shows that the throughput of
> single stream could be improved from 4.5Gbps to 9.2Gbps. Additionally, 9.2
> Gbps nearly reaches the line speed of the phy nic and there is still about
> 15% idle cpu core remaining on the vhost thread.
>
> Test topology:
> [iperf server]<--->tap<--->dpdk testpmd<--->phy nic<--->[iperf client]
>
> Iperf stream:
> iperf3 -c 10.0.0.2 -i 1 -t 10
>
> Before:
> ...
> [  5]   5.00-6.00   sec   558 MBytes  4.68 Gbits/sec    0   1.50 MBytes
> [  5]   6.00-7.00   sec   556 MBytes  4.67 Gbits/sec    1   1.35 MBytes
> [  5]   7.00-8.00   sec   556 MBytes  4.67 Gbits/sec    2   1.18 MBytes
> [  5]   8.00-9.00   sec   559 MBytes  4.69 Gbits/sec    0   1.48 MBytes
> [  5]   9.00-10.00  sec   556 MBytes  4.67 Gbits/sec    1   1.33 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  5.39 GBytes  4.63 Gbits/sec   72             sender
> [  5]   0.00-10.04  sec  5.39 GBytes  4.61 Gbits/sec                  receiver
>
> After:
> ...
> [  5]   5.00-6.00   sec  1.07 GBytes  9.19 Gbits/sec    0   1.55 MBytes
> [  5]   6.00-7.00   sec  1.08 GBytes  9.30 Gbits/sec    0   1.63 MBytes
> [  5]   7.00-8.00   sec  1.08 GBytes  9.25 Gbits/sec    0   1.72 MBytes
> [  5]   8.00-9.00   sec  1.08 GBytes  9.25 Gbits/sec   77   1.31 MBytes
> [  5]   9.00-10.00  sec  1.08 GBytes  9.24 Gbits/sec    0   1.48 MBytes
> - - - - - - - - - - - - - - - - - - - - - - - - -
> [ ID] Interval           Transfer     Bitrate         Retr
> [  5]   0.00-10.00  sec  10.8 GBytes  9.28 Gbits/sec  166             sender
> [  5]   0.00-10.04  sec  10.8 GBytes  9.24 Gbits/sec                  receiver
>
> Reported-at: https://lore.kernel.org/all/CACGkMEvTLG0Ayg+TtbN4q4pPW-ycgCCs3sC3-TF8cuRTf7Pp1A@mail.gmail.com
> Signed-off-by: Harold Huang

Acked-by: Jason Wang

> ---
> v2 -> v3
>  - return the queued NAPI packet from tun_xdp_one
>
>  drivers/net/tun.c | 43 ++++++++++++++++++++++++++++++-------------
>  1 file changed, 30 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index fed85447701a..969ea69fd29d 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -2388,9 +2388,10 @@ static int tun_xdp_one(struct tun_struct *tun,
>  	struct virtio_net_hdr *gso = &hdr->gso;
>  	struct bpf_prog *xdp_prog;
>  	struct sk_buff *skb = NULL;
> +	struct sk_buff_head *queue;
>  	u32 rxhash = 0, act;
>  	int buflen = hdr->buflen;
> -	int err = 0;
> +	int ret = 0;
>  	bool skb_xdp = false;
>  	struct page *page;
>
> @@ -2405,13 +2406,13 @@ static int tun_xdp_one(struct tun_struct *tun,
>  		xdp_set_data_meta_invalid(xdp);
>
>  		act = bpf_prog_run_xdp(xdp_prog, xdp);
> -		err = tun_xdp_act(tun, xdp_prog, xdp, act);
> -		if (err < 0) {
> +		ret = tun_xdp_act(tun, xdp_prog, xdp, act);
> +		if (ret < 0) {
>  			put_page(virt_to_head_page(xdp->data));
> -			return err;
> +			return ret;
>  		}
>
> -		switch (err) {
> +		switch (ret) {
>  		case XDP_REDIRECT:
>  			*flush = true;
>  			fallthrough;
> @@ -2435,7 +2436,7 @@ static int tun_xdp_one(struct tun_struct *tun,
>  build:
>  	skb = build_skb(xdp->data_hard_start, buflen);
>  	if (!skb) {
> -		err = -ENOMEM;
> +		ret = -ENOMEM;
>  		goto out;
>  	}
>
> @@ -2445,7 +2446,7 @@ static int tun_xdp_one(struct tun_struct *tun,
>  	if (virtio_net_hdr_to_skb(skb, gso, tun_is_little_endian(tun))) {
>  		atomic_long_inc(&tun->rx_frame_errors);
>  		kfree_skb(skb);
> -		err = -EINVAL;
> +		ret = -EINVAL;
>  		goto out;
>  	}
>
> @@ -2455,16 +2456,27 @@ static int tun_xdp_one(struct tun_struct *tun,
>  	skb_record_rx_queue(skb, tfile->queue_index);
>
>  	if (skb_xdp) {
> -		err = do_xdp_generic(xdp_prog, skb);
> -		if (err != XDP_PASS)
> +		ret = do_xdp_generic(xdp_prog, skb);
> +		if (ret != XDP_PASS) {
> +			ret = 0;
>  			goto out;
> +		}
>  	}
>
>  	if (!rcu_dereference(tun->steering_prog) && tun->numqueues > 1 &&
>  	    !tfile->detached)
>  		rxhash = __skb_get_hash_symmetric(skb);
>
> -	netif_receive_skb(skb);
> +	if (tfile->napi_enabled) {
> +		queue = &tfile->sk.sk_write_queue;
> +		spin_lock(&queue->lock);
> +		__skb_queue_tail(queue, skb);
> +		spin_unlock(&queue->lock);
> +		ret = 1;
> +	} else {
> +		netif_receive_skb(skb);
> +		ret = 0;
> +	}
>
>  	/* No need to disable preemption here since this function is
>  	 * always called with bh disabled
> @@ -2475,7 +2487,7 @@ static int tun_xdp_one(struct tun_struct *tun,
>  		tun_flow_update(tun, rxhash, tfile);
>
>  out:
> -	return err;
> +	return ret;
>  }
>
>  static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
> @@ -2492,7 +2504,7 @@ static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
>  	if (ctl && (ctl->type == TUN_MSG_PTR)) {
>  		struct tun_page tpage;
>  		int n = ctl->num;
> -		int flush = 0;
> +		int flush = 0, queued = 0;
>
>  		memset(&tpage, 0, sizeof(tpage));
>
> @@ -2501,12 +2513,17 @@ static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
>
>  		for (i = 0; i < n; i++) {
>  			xdp = &((struct xdp_buff *)ctl->ptr)[i];
> -			tun_xdp_one(tun, tfile, xdp, &flush, &tpage);
> +			ret = tun_xdp_one(tun, tfile, xdp, &flush, &tpage);
> +			if (ret > 0)
> +				queued += ret;
>  		}
>
>  		if (flush)
>  			xdp_do_flush();
>
> +		if (tfile->napi_enabled && queued > 0)
> +			napi_schedule(&tfile->napi);
> +
>  		rcu_read_unlock();
>  		local_bh_enable();
>
> --
> 2.27.0
>
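
For readers following the thread, the return-value convention the patch gives tun_xdp_one() (1 when the skb was queued for NAPI, 0 when it was delivered or consumed, negative on error) and the "schedule once per batch" step in tun_sendmsg() can be modelled in plain userspace C. This is only a rough sketch of the control flow, not kernel code: struct sketch_queue, sketch_xdp_one() and sketch_sendmsg_batch() are hypothetical names, a counter stands in for napi_schedule(), and a non-positive length is an assumed stand-in for a buffer that fails validation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-queue state; not a kernel type. */
struct sketch_queue {
	bool napi_enabled;     /* mirrors the role of tfile->napi_enabled */
	int pending;           /* packets parked for the poll routine */
	int delivered_inline;  /* packets handed up immediately (non-NAPI path) */
	int schedule_calls;    /* how often the stand-in for napi_schedule() ran */
};

/*
 * Models the patched return convention of tun_xdp_one():
 *   1  packet was queued for NAPI,
 *   0  packet was delivered directly,
 *  <0  error (assumed here: non-positive length means a bad buffer).
 */
static int sketch_xdp_one(struct sketch_queue *q, int pkt_len)
{
	assert(q != NULL);

	if (pkt_len <= 0)
		return -EINVAL;

	if (q->napi_enabled) {
		q->pending++;          /* stands in for __skb_queue_tail() */
		return 1;
	}

	q->delivered_inline++;         /* stands in for netif_receive_skb() */
	return 0;
}

/*
 * Models the batch loop in tun_sendmsg(): sum the positive return values,
 * then schedule polling at most once for the whole batch.
 */
static int sketch_sendmsg_batch(struct sketch_queue *q, const int *lens, size_t n)
{
	int queued = 0;

	for (size_t i = 0; i < n; i++) {
		int ret = sketch_xdp_one(q, lens[i]);

		if (ret > 0)
			queued += ret; /* errors and inline deliveries are not counted */
	}

	if (q->napi_enabled && queued > 0)
		q->schedule_calls++;   /* stands in for napi_schedule() */

	return queued;
}
```

Calling sketch_sendmsg_batch() on a batch that mixes valid and invalid lengths shows the point of the v3 change: only buffers that were actually queued are counted, and the scheduling step runs once per batch rather than once per packet.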