Date: Thu, 8 Nov 2018 09:38:00 +0800
From: Tiwei Bie
To: "Michael S. Tsirkin"
Cc: jasowang@redhat.com, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, virtio-dev@lists.oasis-open.org, wexu@redhat.com, jfreimann@redhat.com
Subject: Re: [PATCH net-next v2 3/5] virtio_ring: add packed ring support
Message-ID: <20181108013759.GA20591@debian>
References: <20180711022711.7090-1-tiwei.bie@intel.com> <20180711022711.7090-4-tiwei.bie@intel.com> <20181107123933-mutt-send-email-mst@kernel.org>
In-Reply-To: <20181107123933-mutt-send-email-mst@kernel.org>

On Wed, Nov 07, 2018 at 12:48:46PM -0500, Michael S. Tsirkin wrote:
> On Wed, Jul 11, 2018 at 10:27:09AM +0800, Tiwei Bie wrote:
> > This commit introduces the support (without EVENT_IDX) for
> > packed ring.
> >
> > Signed-off-by: Tiwei Bie
> > ---
> >  drivers/virtio/virtio_ring.c | 495 ++++++++++++++++++++++++++++++++++-
> >  1 file changed, 487 insertions(+), 8 deletions(-)
[...]
> >
> > +static void vring_unmap_state_packed(const struct vring_virtqueue *vq,
> > +				     struct vring_desc_state_packed *state)
> > +{
> > +	u16 flags;
> > +
> > +	if (!vring_use_dma_api(vq->vq.vdev))
> > +		return;
> > +
> > +	flags = state->flags;
> > +
> > +	if (flags & VRING_DESC_F_INDIRECT) {
> > +		dma_unmap_single(vring_dma_dev(vq),
> > +				 state->addr, state->len,
> > +				 (flags & VRING_DESC_F_WRITE) ?
> > +				 DMA_FROM_DEVICE : DMA_TO_DEVICE);
> > +	} else {
> > +		dma_unmap_page(vring_dma_dev(vq),
> > +			       state->addr, state->len,
> > +			       (flags & VRING_DESC_F_WRITE) ?
> > +			       DMA_FROM_DEVICE : DMA_TO_DEVICE);
> > +	}
> > +}
> > +
> > +static void vring_unmap_desc_packed(const struct vring_virtqueue *vq,
> > +				    struct vring_packed_desc *desc)
> > +{
> > +	u16 flags;
> > +
> > +	if (!vring_use_dma_api(vq->vq.vdev))
> > +		return;
> > +
> > +	flags = virtio16_to_cpu(vq->vq.vdev, desc->flags);
>
> BTW this stuff is only used on error etc. Is there a way to
> reuse vring_unmap_state_packed?

It's also used by the INDIRECT path. We don't allocate desc
state for INDIRECT descriptors to save DMA addr/len etc.

>
> > +
> > +	if (flags & VRING_DESC_F_INDIRECT) {
> > +		dma_unmap_single(vring_dma_dev(vq),
> > +				 virtio64_to_cpu(vq->vq.vdev, desc->addr),
> > +				 virtio32_to_cpu(vq->vq.vdev, desc->len),
> > +				 (flags & VRING_DESC_F_WRITE) ?
> > +				 DMA_FROM_DEVICE : DMA_TO_DEVICE);
> > +	} else {
> > +		dma_unmap_page(vring_dma_dev(vq),
> > +			       virtio64_to_cpu(vq->vq.vdev, desc->addr),
> > +			       virtio32_to_cpu(vq->vq.vdev, desc->len),
> > +			       (flags & VRING_DESC_F_WRITE) ?
> > +			       DMA_FROM_DEVICE : DMA_TO_DEVICE);
> > +	}
> > +}
[...]
> > @@ -766,47 +840,449 @@ static inline int virtqueue_add_packed(struct virtqueue *_vq,
> >  				       void *ctx,
> >  				       gfp_t gfp)
> >  {
> > +	struct vring_virtqueue *vq = to_vvq(_vq);
> > +	struct vring_packed_desc *desc;
> > +	struct scatterlist *sg;
> > +	unsigned int i, n, descs_used, uninitialized_var(prev), err_idx;
> > +	__virtio16 uninitialized_var(head_flags), flags;
> > +	u16 head, avail_wrap_counter, id, curr;
> > +	bool indirect;
> > +
> > +	START_USE(vq);
> > +
> > +	BUG_ON(data == NULL);
> > +	BUG_ON(ctx && vq->indirect);
> > +
> > +	if (unlikely(vq->broken)) {
> > +		END_USE(vq);
> > +		return -EIO;
> > +	}
> > +
> > +#ifdef DEBUG
> > +	{
> > +		ktime_t now = ktime_get();
> > +
> > +		/* No kick or get, with .1 second between? Warn. */
> > +		if (vq->last_add_time_valid)
> > +			WARN_ON(ktime_to_ms(ktime_sub(now, vq->last_add_time))
> > +					    > 100);
> > +		vq->last_add_time = now;
> > +		vq->last_add_time_valid = true;
> > +	}
> > +#endif
> > +
> > +	BUG_ON(total_sg == 0);
> > +
> > +	head = vq->next_avail_idx;
> > +	avail_wrap_counter = vq->avail_wrap_counter;
> > +
> > +	if (virtqueue_use_indirect(_vq, total_sg))
> > +		desc = alloc_indirect_packed(_vq, total_sg, gfp);
> > +	else {
> > +		desc = NULL;
> > +		WARN_ON_ONCE(total_sg > vq->vring_packed.num && !vq->indirect);
> > +	}
> > +
> > +	if (desc) {
> > +		/* Use a single buffer which doesn't continue */
> > +		indirect = true;
> > +		/* Set up rest to use this indirect table. */
> > +		i = 0;
> > +		descs_used = 1;
> > +	} else {
> > +		indirect = false;
> > +		desc = vq->vring_packed.desc;
> > +		i = head;
> > +		descs_used = total_sg;
> > +	}
> > +
> > +	if (vq->vq.num_free < descs_used) {
> > +		pr_debug("Can't add buf len %i - avail = %i\n",
> > +			 descs_used, vq->vq.num_free);
> > +		/* FIXME: for historical reasons, we force a notify here if
> > +		 * there are outgoing parts to the buffer. Presumably the
> > +		 * host should service the ring ASAP. */
>
> I don't think we have a reason to do this for packed ring.
> No historical baggage there, right?

Based on the original commit log, it seems that the notify here
is just an "optimization". But I don't quite understand what
"the heuristics which KVM uses" refers to. If it's safe to drop
this in packed ring, I'd like to do it.

commit 44653eae1407f79dff6f52fcf594ae84cb165ec4
Author: Rusty Russell
Date:   Fri Jul 25 12:06:04 2008 -0500

    virtio: don't always force a notification when ring is full

    We force notification when the ring is full, even if the host has
    indicated it doesn't want to know. This seemed like a good idea at
    the time: if we fill the transmit ring, we should tell the host
    immediately.

    Unfortunately this logic also applies to the receiving ring, which is
    refilled constantly.
    We should introduce real notification thresholds to replace this
    logic. Meanwhile, removing the logic altogether breaks the
    heuristics which KVM uses, so we use a hack: only notify if there are
    outgoing parts of the new buffer.

    Here are the number of exits with lguest's crappy network implementation:
    Before:
            network xmit 7859051 recv 236420
    After:
            network xmit 7858610 recv 118136

    Signed-off-by: Rusty Russell

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 72bf8bc09014..21d9a62767af 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -87,8 +87,11 @@ static int vring_add_buf(struct virtqueue *_vq,
 	if (vq->num_free < out + in) {
 		pr_debug("Can't add buf len %i - avail = %i\n",
 			 out + in, vq->num_free);
-		/* We notify *even if* VRING_USED_F_NO_NOTIFY is set here. */
-		vq->notify(&vq->vq);
+		/* FIXME: for historical reasons, we force a notify here if
+		 * there are outgoing parts to the buffer. Presumably the
+		 * host should service the ring ASAP. */
+		if (out)
+			vq->notify(&vq->vq);
 		END_USE(vq);
 		return -ENOSPC;
 	}

>
> > +		if (out_sgs)
> > +			vq->notify(&vq->vq);
> > +		if (indirect)
> > +			kfree(desc);
> > +		END_USE(vq);
> > +		return -ENOSPC;
> > +	}
> > +
[...]
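
To make the ring-full question concrete, here is a minimal userspace sketch of that path. It is not kernel code: struct toy_vq, toy_notify(), toy_add() and the force_notify switch are hypothetical names made up for illustration, and free() stands in for kfree(). Passing force_notify = true models the behaviour in the hunk above (notify only when there are outgoing parts); passing false models what dropping the notify for packed ring might look like.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the virtqueue; not the kernel's struct. */
struct toy_vq {
	unsigned int num_free;   /* free descriptors left in the ring */
	unsigned int notify_cnt; /* number of times the "host" was notified */
};

/* Stand-in for vq->notify(): just counts the calls. */
static void toy_notify(struct toy_vq *vq)
{
	vq->notify_cnt++;
}

/*
 * Models only the "ring is full" check of the add path.
 * indirect_table is the (possibly NULL) indirect descriptor table that
 * was allocated for this request; it is released when the add fails.
 */
static int toy_add(struct toy_vq *vq, unsigned int descs_used,
		   unsigned int out_sgs, void *indirect_table,
		   bool force_notify)
{
	assert(vq != NULL);

	if (vq->num_free < descs_used) {
		/* Legacy hack: notify only if there are outgoing parts. */
		if (force_notify && out_sgs)
			toy_notify(vq);
		free(indirect_table); /* free(NULL) is a no-op */
		return -ENOSPC;
	}

	vq->num_free -= descs_used;
	return 0;
}
```

With force_notify set, a full ring triggers a notify for a transmit-style request (out_sgs != 0) and none for a receive-only refill (out_sgs == 0), which is the distinction the 2008 commit introduced. With it cleared, the caller simply gets -ENOSPC in both cases.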