Date: Fri, 16 Mar 2018 15:40:29 +0800
From: Tiwei Bie
To: Jason Wang
Cc: mst@redhat.com, virtualization@lists.linux-foundation.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
	wexu@redhat.com, jfreimann@redhat.com
Subject: Re: [PATCH RFC 2/2] virtio_ring: support packed ring
Message-ID: <20180316074028.lun2jy45qqnfeymw@debian>
References: <20180223111801.15240-1-tiwei.bie@intel.com>
	<20180223111801.15240-3-tiwei.bie@intel.com>
	<8f73267a-e20e-de64-d8e0-3fd608dbf368@redhat.com>
	<20180316061047.o2xdyuqhak3mzjyw@debian>
	<0a0ecf42-8f7f-9387-8ca4-cb65d0835b56@redhat.com>
In-Reply-To: <0a0ecf42-8f7f-9387-8ca4-cb65d0835b56@redhat.com>
User-Agent: NeoMutt/20170113 (1.7.2)

On Fri, Mar 16, 2018 at 02:44:12PM +0800, Jason Wang wrote:
> On 2018/03/16 14:10, Tiwei Bie wrote:
> > On Fri, Mar 16, 2018 at 12:03:25PM +0800, Jason Wang wrote:
> > > On 2018/02/23 19:18, Tiwei Bie wrote:
> > > > Signed-off-by: Tiwei Bie
> > > > ---
> > > >  drivers/virtio/virtio_ring.c | 699 +++++++++++++++++++++++++++++++++++++------
> > > >  include/linux/virtio_ring.h  |   8 +-
> > > >  2 files changed, 618 insertions(+), 89 deletions(-)
[...]
> > > >                               cpu_addr, size, direction);
> > > >  }
> > > > -static void vring_unmap_one(const struct vring_virtqueue *vq,
> > > > -                            struct vring_desc *desc)
> > > > +static void vring_unmap_one(const struct vring_virtqueue *vq, void *_desc)
> > > >  {
> > > Let's split the helpers to packed/split version like other helpers?
> > > (Consider the caller has already known the type of vq.)
> > Okay.
> >
> > [...]
> >
> > > > +        desc[i].flags = flags;
> > > > +
> > > > +        desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
> > > > +        desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
> > > > +        desc[i].id = cpu_to_virtio32(_vq->vdev, head);
> > > If it's a part of chain, we only need to do this for last buffer I think.
> > I'm not sure I've got your point about the "last buffer".
> > But, yes, id just needs to be set for the last desc.
>
> Right, I think I meant "last descriptor" :)
>
> > > > +        prev = i;
> > > > +        i++;
> > > It looks to me prev is always i - 1?
> > No. prev will be (vq->vring_packed.num - 1) when i becomes 0.
>
> Right, so prev = i ? i - 1 : vq->vring_packed.num - 1.

Yes, i wraps together with vq->wrap_counter in the following code:

> > > > +            if (!indirect && i >= vq->vring_packed.num) {
> > > > +                i = 0;
> > > > +                vq->wrap_counter ^= 1;
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +    for (; n < (out_sgs + in_sgs); n++) {
> > > > +        for (sg = sgs[n]; sg; sg = sg_next(sg)) {
> > > > +            dma_addr_t addr = vring_map_one_sg(vq, sg, DMA_FROM_DEVICE);
> > > > +            if (vring_mapping_error(vq, addr))
> > > > +                goto unmap_release;
> > > > +
> > > > +            flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT |
> > > > +                                    VRING_DESC_F_WRITE |
> > > > +                                    VRING_DESC_F_AVAIL(vq->wrap_counter) |
> > > > +                                    VRING_DESC_F_USED(!vq->wrap_counter));
> > > > +            if (!indirect && i == head)
> > > > +                head_flags = flags;
> > > > +            else
> > > > +                desc[i].flags = flags;
> > > > +
> > > > +            desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
> > > > +            desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
> > > > +            desc[i].id = cpu_to_virtio32(_vq->vdev, head);
> > > > +            prev = i;
> > > > +            i++;
> > > > +            if (!indirect && i >= vq->vring_packed.num) {
> > > > +                i = 0;
> > > > +                vq->wrap_counter ^= 1;
> > > > +            }
> > > > +        }
> > > > +    }
> > > > +    /* Last one doesn't continue. */
> > > > +    if (!indirect && (head + 1) % vq->vring_packed.num == i)
> > > > +        head_flags &= cpu_to_virtio16(_vq->vdev, ~VRING_DESC_F_NEXT);
> > > I can't get why we need this here.
> > If only one desc is used, we will need to clear the
> > VRING_DESC_F_NEXT flag from the head_flags.
>
> Yes, I meant why the following desc[prev].flags won't work for this?

Because the update of desc[head].flags (in the above case, prev == head)
has been delayed. The flags are saved in head_flags.

> > > > +    else
> > > > +        desc[prev].flags &= cpu_to_virtio16(_vq->vdev, ~VRING_DESC_F_NEXT);
> > > > +
> > > > +    if (indirect) {
> > > > +        /* FIXME: to be implemented */
> > > > +
> > > > +        /* Now that the indirect table is filled in, map it. */
> > > > +        dma_addr_t addr = vring_map_single(
> > > > +            vq, desc, total_sg * sizeof(struct vring_packed_desc),
> > > > +            DMA_TO_DEVICE);
> > > > +        if (vring_mapping_error(vq, addr))
> > > > +            goto unmap_release;
> > > > +
> > > > +        head_flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_INDIRECT |
> > > > +                                     VRING_DESC_F_AVAIL(wrap_counter) |
> > > > +                                     VRING_DESC_F_USED(!wrap_counter));
> > > > +        vq->vring_packed.desc[head].addr = cpu_to_virtio64(_vq->vdev, addr);
> > > > +        vq->vring_packed.desc[head].len = cpu_to_virtio32(_vq->vdev,
> > > > +                total_sg * sizeof(struct vring_packed_desc));
> > > > +        vq->vring_packed.desc[head].id = cpu_to_virtio32(_vq->vdev, head);
> > > > +    }
> > > > +
> > > > +    /* We're using some buffers from the free list. */
> > > > +    vq->vq.num_free -= descs_used;
> > > > +
> > > > +    /* Update free pointer */
> > > > +    if (indirect) {
> > > > +        n = head + 1;
> > > > +        if (n >= vq->vring_packed.num) {
> > > > +            n = 0;
> > > > +            vq->wrap_counter ^= 1;
> > > > +        }
> > > > +        vq->free_head = n;
> > > detach_buf_packed() does not even touch free_head here, so we need to
> > > explain its meaning for packed ring.
> > Above code is for indirect support which isn't really
> > implemented in this patch yet.
> >
> > For your question, free_head stores the index of the
> > next avail desc. I'll add a comment for it or move it
> > to union and give it a better name in next version.
>
> Yes, something like avail_idx might be better.
>
> > > > +    } else
> > > > +        vq->free_head = i;
> > > ID is only valid in the last descriptor in the list, so head + 1 should be
> > > ok too?
> > I don't really get your point. The vq->free_head stores
> > the index of the next avail desc.
>
> I think I get your idea now, free_head has two meanings:
>
> - next avail index
> - buffer id

In my design, free_head is just the index of the next avail desc. The
driver can put anything in the buffer ID, and in my design I save the
desc index in the buffer ID. I'll add comments for them.

> If I'm correct, let's better add a comment for this.
>
> > > > +
> > > > +    /* Store token and indirect buffer state. */
> > > > +    vq->desc_state[head].num = descs_used;
> > > > +    vq->desc_state[head].data = data;
> > > > +    if (indirect)
> > > > +        vq->desc_state[head].indir_desc = desc;
> > > > +    else
> > > > +        vq->desc_state[head].indir_desc = ctx;
> > > > +
> > > > +    virtio_wmb(vq->weak_barriers);
> > > Let's add a comment to explain the barrier here.
> > Okay.
> >
> > > > +    vq->vring_packed.desc[head].flags = head_flags;
> > > > +    vq->num_added++;
> > > > +
> > > > +    pr_debug("Added buffer head %i to %p\n", head, vq);
> > > > +    END_USE(vq);
> > > > +
> > > > +    return 0;
> > > > +
> > > > +unmap_release:
> > > > +    err_idx = i;
> > > > +    i = head;
> > > > +
> > > > +    for (n = 0; n < total_sg; n++) {
> > > > +        if (i == err_idx)
> > > > +            break;
> > > > +        vring_unmap_one(vq, &desc[i]);
> > > > +        i++;
> > > > +        if (!indirect && i >= vq->vring_packed.num)
> > > > +            i = 0;
> > > > +    }
> > > > +
> > > > +    vq->wrap_counter = wrap_counter;
> > > > +
> > > > +    if (indirect)
> > > > +        kfree(desc);
> > > > +
> > > > +    END_USE(vq);
> > > > +    return -EIO;
> > > > +}
[...]
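
To make the ordering being discussed above easier to follow, here is a
minimal, self-contained sketch of the scheme (the names demo_vq,
demo_add_chain and the DEMO_F_* bit positions are invented for
illustration and are not taken from the patch): non-head descriptors get
their flags as they are filled, i wraps together with the wrap counter,
and the head descriptor's flags are held back in head_flags until
everything else is written and a write barrier has been issued.

#include <stdint.h>

/* Invented stand-ins for the patch's VRING_DESC_F_* macros. */
#define DEMO_F_NEXT        ((uint16_t)1 << 0)
#define DEMO_F_AVAIL(w)    ((uint16_t)(w) << 7)
#define DEMO_F_USED(w)     ((uint16_t)(w) << 15)

struct demo_desc {
        uint64_t addr;
        uint32_t len;
        uint16_t id;
        uint16_t flags;
};

struct demo_vq {
        struct demo_desc *desc;
        uint16_t num;          /* ring size */
        uint16_t next_avail;   /* what the patch calls free_head */
        int wrap_counter;      /* flips whenever next_avail wraps to 0 */
};

/* Add a chain of count buffers; returns the head index (used as buffer id). */
static uint16_t demo_add_chain(struct demo_vq *vq, const uint64_t *addrs,
                               const uint32_t *lens, int count)
{
        uint16_t head = vq->next_avail;
        uint16_t i = head, prev = head;
        uint16_t head_flags = 0, flags;
        int n;

        for (n = 0; n < count; n++) {
                flags = DEMO_F_NEXT |
                        DEMO_F_AVAIL(vq->wrap_counter) |
                        DEMO_F_USED(!vq->wrap_counter);

                vq->desc[i].addr = addrs[n];
                vq->desc[i].len = lens[n];
                /* The discussion notes only the last descriptor's id matters. */
                vq->desc[i].id = head;

                if (i == head)
                        head_flags = flags;     /* head flags are delayed */
                else
                        vq->desc[i].flags = flags;

                prev = i;
                if (++i == vq->num) {           /* i wraps together with ... */
                        i = 0;
                        vq->wrap_counter ^= 1;  /* ... the wrap counter */
                }
        }

        /* Last descriptor doesn't continue the chain; for a single-descriptor
         * chain, prev == head and the NEXT bit still lives in head_flags. */
        if (prev == head)
                head_flags &= (uint16_t)~DEMO_F_NEXT;
        else
                vq->desc[prev].flags &= (uint16_t)~DEMO_F_NEXT;

        vq->next_avail = i;

        /* Make every store above visible before the head descriptor is
         * exposed (this fence stands in for virtio_wmb() in the real code). */
        __atomic_thread_fence(__ATOMIC_RELEASE);
        vq->desc[head].flags = head_flags;

        return head;
}

Until that final store, the device still sees the head descriptor as
unavailable, so the barrier plus the delayed write publish the whole
chain at once; that is also why writing desc[prev].flags alone cannot
work when prev == head.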
> > > > @@ -1096,17 +1599,21 @@ struct virtqueue *vring_create_virtqueue(
> > > >      if (!queue) {
> > > >          /* Try to get a single page. You are my only hope! */
> > > > -        queue = vring_alloc_queue(vdev, vring_size(num, vring_align),
> > > > +        queue = vring_alloc_queue(vdev, __vring_size(num, vring_align,
> > > > +                                                     packed),
> > > >                                    &dma_addr, GFP_KERNEL|__GFP_ZERO);
> > > >      }
> > > >      if (!queue)
> > > >          return NULL;
> > > > -    queue_size_in_bytes = vring_size(num, vring_align);
> > > > -    vring_init(&vring, num, queue, vring_align);
> > > > +    queue_size_in_bytes = __vring_size(num, vring_align, packed);
> > > > +    if (packed)
> > > > +        vring_packed_init(&vring.vring_packed, num, queue, vring_align);
> > > > +    else
> > > > +        vring_init(&vring.vring_split, num, queue, vring_align);
> > > Let's rename vring_init to vring_init_split() like other helpers?
> > The vring_init() is a public API in include/uapi/linux/virtio_ring.h.
> > I don't think we can rename it.
>
> I see, then this needs more thoughts to unify the API.

My thought is to keep the old API as is, and introduce new types and
helpers for packed ring. More details can be found in this patch:

https://lkml.org/lkml/2018/2/23/243

(PS. The type which has bit fields is just for reference and will be
changed in the next version.)

Do you have any other suggestions?

Best regards,
Tiwei Bie

> > > > -    vq = __vring_new_virtqueue(index, vring, vdev, weak_barriers, context,
> > > > -                               notify, callback, name);
> > > > +    vq = __vring_new_virtqueue(index, vring, packed, vdev, weak_barriers,
> > > > +                               context, notify, callback, name);
> > > >      if (!vq) {
> > > >          vring_free_queue(vdev, queue_size_in_bytes, queue,
> > > >                           dma_addr);
[...]
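
As a rough illustration of the "keep the old API, add new types alongside
it" idea above (every name below is invented; the actual proposal is in
the lkml.org link), the layout selection could look something like this:

#include <stdint.h>

/* Invented stand-in for the existing split-ring layout; the real public
 * struct vring and vring_init() in include/uapi/linux/virtio_ring.h are
 * left untouched. */
struct demo_vring_split {
        unsigned int num;
        void *desc, *avail, *used;
};

/* Invented stand-in for a packed-ring layout added alongside it. */
struct demo_vring_packed {
        unsigned int num;
        struct demo_packed_desc {
                uint64_t addr;
                uint32_t len;
                uint16_t id;
                uint16_t flags;
        } *desc;
        void *driver_event, *device_event;   /* event areas, layout assumed */
};

/* One queue allocation, two possible layouts on top of it. */
union demo_vring {
        struct demo_vring_split split;
        struct demo_vring_packed packed;
};

static void demo_vring_setup(union demo_vring *vr, int packed,
                             unsigned int num, void *queue)
{
        if (packed) {
                /* packed ring: descriptor array at the start of the queue */
                vr->packed.num = num;
                vr->packed.desc = queue;
        } else {
                /* split ring: unchanged legacy layout */
                vr->split.num = num;
                vr->split.desc = queue;
        }
}

This mirrors the quoted hunk, where vring_create_virtqueue() initializes
the packed member of the union via vring_packed_init() when packed is
set, and otherwise keeps calling the unchanged public vring_init().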