From: Jason Wang <jasowang@redhat.com>
To: mst@redhat.com, jasowang@redhat.com
Cc: kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC PATCH net-next 04/12] vhost_net: split out datacopy logic
Date: Mon, 21 May 2018 17:04:25 +0800
Message-Id: <1526893473-20128-5-git-send-email-jasowang@redhat.com>
In-Reply-To: <1526893473-20128-1-git-send-email-jasowang@redhat.com>
References: <1526893473-20128-1-git-send-email-jasowang@redhat.com>
Instead of mixing zerocopy and datacopy logic, this patch splits the
datacopy logic out into its own TX handler. This results in more
compact code, and specific optimizations can then be done on top of
each path more easily.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/vhost/net.c | 111 +++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 102 insertions(+), 9 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 4ebac76..4682fcc 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -492,9 +492,95 @@ static bool vhost_has_more_pkts(struct vhost_net *net,
 	       likely(!vhost_exceeds_maxpend(net));
 }
 
+static void handle_tx_copy(struct vhost_net *net)
+{
+	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
+	struct vhost_virtqueue *vq = &nvq->vq;
+	unsigned out, in;
+	int head;
+	struct msghdr msg = {
+		.msg_name = NULL,
+		.msg_namelen = 0,
+		.msg_control = NULL,
+		.msg_controllen = 0,
+		.msg_flags = MSG_DONTWAIT,
+	};
+	size_t len, total_len = 0;
+	int err;
+	size_t hdr_size;
+	struct socket *sock;
+	struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
+	int sent_pkts = 0;
+
+	mutex_lock(&vq->mutex);
+	sock = vq->private_data;
+	if (!sock)
+		goto out;
+
+	if (!vq_iotlb_prefetch(vq))
+		goto out;
+
+	vhost_disable_notify(&net->dev, vq);
+	vhost_net_disable_vq(net, vq);
+
+	hdr_size = nvq->vhost_hlen;
+
+	for (;;) {
+		head = vhost_net_tx_get_vq_desc(net, vq, vq->iov,
+						ARRAY_SIZE(vq->iov),
+						&out, &in);
+		/* On error, stop handling until the next kick. */
+		if (unlikely(head < 0))
+			break;
+		/* Nothing new?  Wait for eventfd to tell us they refilled. */
+		if (head == vq->num) {
+			if (unlikely(vhost_enable_notify(&net->dev, vq))) {
+				vhost_disable_notify(&net->dev, vq);
+				continue;
+			}
+			break;
+		}
+		if (in) {
+			vq_err(vq, "Unexpected descriptor format for TX: "
+			       "out %d, int %d\n", out, in);
+			break;
+		}
+
+		len = init_iov_iter(vq, &msg.msg_iter, hdr_size, out);
+		if (len < 0)
+			break;
+
+		total_len += len;
+		if (total_len < VHOST_NET_WEIGHT &&
+		    vhost_has_more_pkts(net, vq)) {
+			msg.msg_flags |= MSG_MORE;
+		} else {
+			msg.msg_flags &= ~MSG_MORE;
+		}
+
+		/* TODO: Check specific error and bomb out unless ENOBUFS? */
+		err = sock->ops->sendmsg(sock, &msg, len);
+		if (unlikely(err < 0)) {
+			vhost_discard_vq_desc(vq, 1);
+			vhost_net_enable_vq(net, vq);
+			break;
+		}
+		if (err != len)
+			pr_debug("Truncated TX packet: "
+				 " len %d != %zd\n", err, len);
+		vhost_add_used_and_signal(&net->dev, vq, head, 0);
+		if (vhost_exceeds_weight(++sent_pkts, total_len)) {
+			vhost_poll_queue(&vq->poll);
+			break;
+		}
+	}
+out:
+	mutex_unlock(&vq->mutex);
+}
+
 /* Expects to be always run from workqueue - which acts as
  * read-size critical section for our kind of RCU. */
-static void handle_tx(struct vhost_net *net)
+static void handle_tx_zerocopy(struct vhost_net *net)
 {
 	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
 	struct vhost_virtqueue *vq = &nvq->vq;
@@ -512,7 +598,7 @@ static void handle_tx(struct vhost_net *net)
 	size_t hdr_size;
 	struct socket *sock;
 	struct vhost_net_ubuf_ref *uninitialized_var(ubufs);
-	bool zcopy, zcopy_used;
+	bool zcopy_used;
 	int sent_pkts = 0;
 
 	mutex_lock(&vq->mutex);
@@ -527,13 +613,10 @@ static void handle_tx(struct vhost_net *net)
 	vhost_net_disable_vq(net, vq);
 
 	hdr_size = nvq->vhost_hlen;
-	zcopy = nvq->ubufs;
 
 	for (;;) {
 		/* Release DMAs done buffers first */
-		if (zcopy)
-			vhost_zerocopy_signal_used(net, vq);
-
+		vhost_zerocopy_signal_used(net, vq);
 		head = vhost_net_tx_get_vq_desc(net, vq, vq->iov,
 						ARRAY_SIZE(vq->iov),
@@ -559,9 +642,9 @@ static void handle_tx(struct vhost_net *net)
 		if (len < 0)
 			break;
 
-		zcopy_used = zcopy && len >= VHOST_GOODCOPY_LEN
-			     && !vhost_exceeds_maxpend(net)
-			     && vhost_net_tx_select_zcopy(net);
+		zcopy_used = len >= VHOST_GOODCOPY_LEN
+			     && !vhost_exceeds_maxpend(net)
+			     && vhost_net_tx_select_zcopy(net);
 
 		/* use msg_control to pass vhost zerocopy ubuf info to skb */
 		if (zcopy_used) {
@@ -620,6 +703,16 @@ static void handle_tx(struct vhost_net *net)
 	mutex_unlock(&vq->mutex);
 }
 
+static void handle_tx(struct vhost_net *net)
+{
+	struct vhost_net_virtqueue *nvq = &net->vqs[VHOST_NET_VQ_TX];
+
+	if (nvq->ubufs)
+		handle_tx_zerocopy(net);
+	else
+		handle_tx_copy(net);
+}
+
 static int peek_head_len(struct vhost_net_virtqueue *rvq, struct sock *sk)
 {
 	struct sk_buff *head;
-- 
2.7.4