From: Jason Wang
To: mst@redhat.com, jasowang@redhat.com
Cc: kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
    netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH net-next 10/12] vhost_net: build xdp buff
Date: Mon, 21 May 2018 17:04:31 +0800
Message-Id: <1526893473-20128-11-git-send-email-jasowang@redhat.com>
In-Reply-To: <1526893473-20128-1-git-send-email-jasowang@redhat.com>
References: <1526893473-20128-1-git-send-email-jasowang@redhat.com>

This patch implements building XDP buffers in vhost_net. The idea is
to do the userspace copy in vhost_net and build the XDP buff on top of
the page. vhost_net can then submit one XDP buff, or an array of them,
to the underlying socket (e.g. TUN). TUN can choose to run XDP or to
call build_skb() to build an skb. To support building the skb, the vnet
header is also stored in the headroom of the XDP buff.

Doing the userspace copy and building the XDP buffs here is key to
achieving XDP batching in TUN: since TUN no longer needs to care about
the userspace copy, it can disable preemption across several XDP buffs
and process them as a batch.

TODO: reserve headroom based on the TUN XDP.

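To make the buffer layout concrete, here is a rough userspace sketch of
how one buffer is carved up: buflen at offset 0, the vnet header at
offset sizeof(int), and the packet data at offset pad. It is an
illustration only, not the kernel code. The SKETCH_* constants (64-byte
cache line, 66-byte RX pad, 320-byte shared info, 4096-byte page) are
assumed values, and sketch_layout_for() and struct sketch_layout are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; the kernel derives these per arch/config. */
#define SKETCH_CACHE_BYTES 64        /* stand-in for SMP_CACHE_BYTES */
#define SKETCH_RX_PAD      (2 + 64)  /* stand-in for NET_IP_ALIGN + NET_SKB_PAD */
#define SKETCH_HEADROOM    256       /* VHOST_NET_HEADROOM from the patch */
#define SKETCH_SHINFO_SIZE 320       /* stand-in for sizeof(struct skb_shared_info) */
#define SKETCH_PAGE_SIZE   4096      /* stand-in for PAGE_SIZE */

/* Round x up to a multiple of the cache line, like SKB_DATA_ALIGN(). */
#define SKETCH_ALIGN(x) \
	(((size_t)(x) + SKETCH_CACHE_BYTES - 1) & ~((size_t)SKETCH_CACHE_BYTES - 1))

/* Hypothetical description of where things land inside one buffer. */
struct sketch_layout {
	size_t buflen_off;   /* always 0: the stored buflen */
	size_t vnet_hdr_off; /* sizeof(int): the vnet header */
	size_t data_off;     /* pad: start of packet data (xdp->data) */
	size_t data_len;     /* packet bytes after the vnet header */
	size_t buflen;       /* total bytes consumed from the page frag */
};

/*
 * len is the full iov length (vnet header + packet), as in the patch.
 * Returns 0 on success, -1 if len is shorter than the vnet header,
 * -2 if the buffer would not fit in one page.
 */
static int sketch_layout_for(size_t len, size_t sock_hlen,
			     struct sketch_layout *out)
{
	size_t pad = SKETCH_ALIGN(SKETCH_RX_PAD + SKETCH_HEADROOM + sock_hlen);
	size_t shinfo = SKETCH_ALIGN(SKETCH_SHINFO_SIZE);

	if (len < sock_hlen)
		return -1;
	if (SKETCH_ALIGN(len + pad) + shinfo > SKETCH_PAGE_SIZE)
		return -2;

	out->buflen_off = 0;
	out->vnet_hdr_off = sizeof(int);
	out->data_off = pad;
	out->data_len = len - sock_hlen;
	out->buflen = shinfo + SKETCH_ALIGN(len + pad);

	/* The metadata must sit entirely inside the headroom. */
	assert(out->vnet_hdr_off + sock_hlen <= out->data_off);
	return 0;
}
```

With these assumed constants, a 1526-byte iov (12-byte vnet header plus
a 1514-byte frame) gets pad = 384 and buflen = 2240, so it fits in one
page.
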
Signed-off-by: Jason Wang
---
 drivers/vhost/net.c | 74 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index f0639d7..1209e84 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -492,6 +492,80 @@ static bool vhost_has_more_pkts(struct vhost_net *net,
 	       likely(!vhost_exceeds_maxpend(net));
 }
 
+#define VHOST_NET_HEADROOM 256
+#define VHOST_NET_RX_PAD (NET_IP_ALIGN + NET_SKB_PAD)
+
+static int vhost_net_build_xdp(struct vhost_net_virtqueue *nvq,
+			       struct iov_iter *from,
+			       struct xdp_buff *xdp)
+{
+	struct vhost_virtqueue *vq = &nvq->vq;
+	struct page_frag *alloc_frag = &current->task_frag;
+	struct virtio_net_hdr *gso;
+	size_t len = iov_iter_count(from);
+	int buflen = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	int pad = SKB_DATA_ALIGN(VHOST_NET_RX_PAD + VHOST_NET_HEADROOM +
+				 nvq->sock_hlen);
+	int sock_hlen = nvq->sock_hlen;
+	void *buf;
+	int copied;
+
+	if (len < nvq->sock_hlen)
+		return -EFAULT;
+
+	if (SKB_DATA_ALIGN(len + pad) +
+	    SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) > PAGE_SIZE)
+		return -ENOSPC;
+
+	buflen += SKB_DATA_ALIGN(len + pad);
+	alloc_frag->offset = ALIGN((u64)alloc_frag->offset, SMP_CACHE_BYTES);
+	if (unlikely(!skb_page_frag_refill(buflen, alloc_frag, GFP_KERNEL)))
+		return -ENOMEM;
+
+	buf = (char *)page_address(alloc_frag->page) + alloc_frag->offset;
+
+	/* We store two kinds of metadata in the header which will be
+	 * used for XDP_PASS to do build_skb():
+	 * offset 0: buflen
+	 * offset sizeof(int): vnet header
+	 */
+	copied = copy_page_from_iter(alloc_frag->page,
+		alloc_frag->offset + sizeof(int), sock_hlen, from);
+	if (copied != sock_hlen)
+		return -EFAULT;
+
+	gso = (struct virtio_net_hdr *)(buf + sizeof(int));
+
+	if ((gso->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
+	    vhost16_to_cpu(vq, gso->csum_start) +
+	    vhost16_to_cpu(vq, gso->csum_offset) + 2 >
+	    vhost16_to_cpu(vq, gso->hdr_len)) {
+		gso->hdr_len = cpu_to_vhost16(vq,
+			       vhost16_to_cpu(vq, gso->csum_start) +
+			       vhost16_to_cpu(vq, gso->csum_offset) + 2);
+
+		if (vhost16_to_cpu(vq, gso->hdr_len) > len)
+			return -EINVAL;
+	}
+
+	len -= sock_hlen;
+	copied = copy_page_from_iter(alloc_frag->page,
+				     alloc_frag->offset + pad,
+				     len, from);
+	if (copied != len)
+		return -EFAULT;
+
+	xdp->data_hard_start = buf;
+	xdp->data = buf + pad;
+	xdp->data_end = xdp->data + len;
+	*(int *)(xdp->data_hard_start)= buflen;
+
+	get_page(alloc_frag->page);
+	alloc_frag->offset += buflen;
+
+	return 0;
+}
+
 static void handle_tx_copy(struct vhost_net *net)
 {
 	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
-- 
2.7.4
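
For readers following the hdr_len handling in vhost_net_build_xdp(),
the fix-up can be sketched in userspace roughly as below. This is a
simplified illustration under stated assumptions, not the patch itself:
struct sketch_vnet_hdr and sketch_fixup_hdr_len() are hypothetical
names, all fields are taken to be host-endian already (the patch
converts with vhost16_to_cpu()), and unlike the patch the sketch checks
the new value against len before storing it rather than after.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_F_NEEDS_CSUM 1 /* same value as VIRTIO_NET_HDR_F_NEEDS_CSUM */

/* Hypothetical, host-endian subset of struct virtio_net_hdr. */
struct sketch_vnet_hdr {
	uint8_t  flags;
	uint16_t hdr_len;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/*
 * If checksum offload is requested, the header must cover at least the
 * 2-byte checksum field at csum_start + csum_offset. Grow hdr_len when
 * it is too small, and reject the packet when the required header is
 * longer than the data available (len).
 * Returns 0 on success, -1 on an invalid header.
 */
static int sketch_fixup_hdr_len(struct sketch_vnet_hdr *h, size_t len)
{
	unsigned int need = (unsigned int)h->csum_start + h->csum_offset + 2;

	if ((h->flags & SKETCH_F_NEEDS_CSUM) && need > h->hdr_len) {
		if (need > len || need > UINT16_MAX)
			return -1;
		h->hdr_len = (uint16_t)need;
	}
	return 0;
}
```

For example, with csum_start = 34, csum_offset = 16 and hdr_len = 10,
the sketch raises hdr_len to 52 when at least 52 bytes are available,
and rejects the packet otherwise.
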