Subject: Re: [PATCH RFC 2/2] virtio_ring: support packed ring
From: Jason Wang
To: Tiwei Bie
Cc: mst@redhat.com, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, wexu@redhat.com, jfreimann@redhat.com
Date: Fri, 16 Mar 2018 19:36:47 +0800
Message-ID: <094ca28b-d8af-bf7a-ea7e-0d0bf7518bda@redhat.com>
In-Reply-To: <20180316100413.vtqwatregzrmhvt3@debian>
References: <20180223111801.15240-1-tiwei.bie@intel.com>
 <20180223111801.15240-3-tiwei.bie@intel.com>
 <8f73267a-e20e-de64-d8e0-3fd608dbf368@redhat.com>
 <20180316061047.o2xdyuqhak3mzjyw@debian>
 <0a0ecf42-8f7f-9387-8ca4-cb65d0835b56@redhat.com>
 <20180316074028.lun2jy45qqnfeymw@debian>
 <02a3da02-8226-ba4e-1b47-d25755b2c429@redhat.com>
 <20180316100413.vtqwatregzrmhvt3@debian>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2018-03-16 18:04, Tiwei Bie wrote:
> On Fri, Mar 16, 2018 at 04:34:28PM +0800, Jason Wang wrote:
>> On 2018-03-16 15:40, Tiwei Bie wrote:
>>> On Fri, Mar 16, 2018 at 02:44:12PM +0800, Jason Wang wrote:
>>>> On 2018-03-16 14:10, Tiwei Bie wrote:
>>>>> On Fri, Mar 16, 2018 at 12:03:25PM +0800, Jason Wang wrote:
>>>>>> On 2018-02-23 19:18, Tiwei Bie wrote:
>>>>>>> Signed-off-by: Tiwei Bie
>>>>>>> ---
>>>>>>> drivers/virtio/virtio_ring.c | 699 +++++++++++++++++++++++++++++++++++++------
>>>>>>> include/linux/virtio_ring.h | 8 +-
>>>>>>> 2 files changed, 618 insertions(+), 89 deletions(-)
>>> [...]
>>>>>>> +		}
>>>>>>> +	}
>>>>>>> +	for (; n < (out_sgs + in_sgs); n++) {
>>>>>>> +		for (sg = sgs[n]; sg; sg = sg_next(sg)) {
>>>>>>> +			dma_addr_t addr = vring_map_one_sg(vq, sg, DMA_FROM_DEVICE);
>>>>>>> +			if (vring_mapping_error(vq, addr))
>>>>>>> +				goto unmap_release;
>>>>>>> +
>>>>>>> +			flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT |
>>>>>>> +					VRING_DESC_F_WRITE |
>>>>>>> +					VRING_DESC_F_AVAIL(vq->wrap_counter) |
>>>>>>> +					VRING_DESC_F_USED(!vq->wrap_counter));
>>>>>>> +			if (!indirect && i == head)
>>>>>>> +				head_flags = flags;
>>>>>>> +			else
>>>>>>> +				desc[i].flags = flags;
>>>>>>> +
>>>>>>> +			desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
>>>>>>> +			desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
>>>>>>> +			desc[i].id = cpu_to_virtio32(_vq->vdev, head);
>>>>>>> +			prev = i;
>>>>>>> +			i++;
>>>>>>> +			if (!indirect && i >= vq->vring_packed.num) {
>>>>>>> +				i = 0;
>>>>>>> +				vq->wrap_counter ^= 1;
>>>>>>> +			}
>>>>>>> +		}
>>>>>>> +	}
>>>>>>> +	/* Last one doesn't continue. */
>>>>>>> +	if (!indirect && (head + 1) % vq->vring_packed.num == i)
>>>>>>> +		head_flags &= cpu_to_virtio16(_vq->vdev, ~VRING_DESC_F_NEXT);
>>>>>> I can't get the why we need this here.
>>>>> If only one desc is used, we will need to clear the
>>>>> VRING_DESC_F_NEXT flag from the head_flags.
>>>> Yes, I meant why following desc[prev].flags won't work for this?
>>> Because the update of desc[head].flags (in above case,
>>> prev == head) has been delayed. The flags is saved in
>>> head_flags.
>> Ok, but let's try to avoid modular here e.g tracking the number of sgs in a
>> counter.
>>
>> And I see lots of duplication in the above two loops, I believe we can unify
>> them with a a single loop. the only difference is dma direction and write
>> flag.
> The above implementation for packed ring is basically
> an mirror of the existing implementation in split ring
> as I want to keep the coding style consistent.
> Below is the corresponding code in split ring:
>
> static inline int virtqueue_add(struct virtqueue *_vq,
> 				struct scatterlist *sgs[],
> 				unsigned int total_sg,
> 				unsigned int out_sgs,
> 				unsigned int in_sgs,
> 				void *data,
> 				void *ctx,
> 				gfp_t gfp)
> {
> 	......
>
> 	for (n = 0; n < out_sgs; n++) {
> 		for (sg = sgs[n]; sg; sg = sg_next(sg)) {
> 			dma_addr_t addr = vring_map_one_sg(vq, sg, DMA_TO_DEVICE);
> 			if (vring_mapping_error(vq, addr))
> 				goto unmap_release;
>
> 			desc[i].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT);
> 			desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
> 			desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
> 			prev = i;
> 			i = virtio16_to_cpu(_vq->vdev, desc[i].next);
> 		}
> 	}
> 	for (; n < (out_sgs + in_sgs); n++) {
> 		for (sg = sgs[n]; sg; sg = sg_next(sg)) {
> 			dma_addr_t addr = vring_map_one_sg(vq, sg, DMA_FROM_DEVICE);
> 			if (vring_mapping_error(vq, addr))
> 				goto unmap_release;
>
> 			desc[i].flags = cpu_to_virtio16(_vq->vdev, VRING_DESC_F_NEXT | VRING_DESC_F_WRITE);
> 			desc[i].addr = cpu_to_virtio64(_vq->vdev, addr);
> 			desc[i].len = cpu_to_virtio32(_vq->vdev, sg->length);
> 			prev = i;
> 			i = virtio16_to_cpu(_vq->vdev, desc[i].next);
> 		}
> 	}
>
> 	......
> }

There's no need for such consistency, especially considering it's a new kind of ring. Anyway, you can stick to it.

[...]

>>>>>>> +	} else
>>>>>>> +		vq->free_head = i;
>>>>>> ID is only valid in the last descriptor in the list, so head + 1 should be
>>>>>> ok too?
>>>>> I don't really get your point. The vq->free_head stores
>>>>> the index of the next avail desc.
>>>> I think I get your idea now, free_head has two meanings:
>>>>
>>>> - next avail index
>>>> - buffer id
>>> In my design, free_head is just the index of the next
>>> avail desc.
>>>
>>> Driver can set anything to buffer ID.
>> Then you need another method to track id to context e.g hashing.
> I keep the context in desc_state[desc_idx]. So there is
> no extra method needed to track the context.

Well, it works for this patch, but my reply was about "set anything to buffer ID". The size of desc_state is limited, so in fact you can't use a value of vq.num or greater.

> [...]
>>>>>>> @@ -1096,17 +1599,21 @@ struct virtqueue *vring_create_virtqueue(
>>>>>>>  	if (!queue) {
>>>>>>>  		/* Try to get a single page. You are my only hope! */
>>>>>>> -		queue = vring_alloc_queue(vdev, vring_size(num, vring_align),
>>>>>>> +		queue = vring_alloc_queue(vdev, __vring_size(num, vring_align,
>>>>>>> +					  packed),
>>>>>>>  					  &dma_addr, GFP_KERNEL|__GFP_ZERO);
>>>>>>>  	}
>>>>>>>  	if (!queue)
>>>>>>>  		return NULL;
>>>>>>> -	queue_size_in_bytes = vring_size(num, vring_align);
>>>>>>> -	vring_init(&vring, num, queue, vring_align);
>>>>>>> +	queue_size_in_bytes = __vring_size(num, vring_align, packed);
>>>>>>> +	if (packed)
>>>>>>> +		vring_packed_init(&vring.vring_packed, num, queue, vring_align);
>>>>>>> +	else
>>>>>>> +		vring_init(&vring.vring_split, num, queue, vring_align);
>>>>>> Let's rename vring_init to vring_init_split() like other helpers?
>>>>> The vring_init() is a public API in include/uapi/linux/virtio_ring.h.
>>>>> I don't think we can rename it.
>>>> I see, then this need more thoughts to unify the API.
>>> My thought is to keep the old API as is, and introduce
>>> new types and helpers for packed ring.
>> I admit it's not a fault of this patch. But we'd better think of this in the
>> future, consider we may have new kinds of ring.
>>
>>> More details can be found in this patch:
>>> https://lkml.org/lkml/2018/2/23/243
>>> (PS. The type which has bit fields is just for reference,
>>> and will be changed in next version.)
>>>
>>> Do you have any other suggestions?
>> No.
> Hmm.. Sorry, I didn't describe my question well.
> I mean do you have any suggestions about the API
> design for packed ring in uapi header?
> Currently I introduced below two new helpers:
>
> static inline void vring_packed_init(struct vring_packed *vr, unsigned int num,
> 				     void *p, unsigned long align);
> static inline unsigned vring_packed_size(unsigned int num, unsigned long align);
>
> When new rings are introduced in the future, above
> helpers can't be reused. Maybe we should make the
> helpers be able to determine the ring type?

Let's wait for Michael's comment here. Generally, I fail to understand why vring_init() became part of the uapi. Git grep shows the only users are virtio_test/vringh_test.

Thanks

>
> Best regards,
> Tiwei Bie
>
>> Thanks
>>
>>> Best regards,
>>> Tiwei Bie
>>>
>>>>>>> -	vq = __vring_new_virtqueue(index, vring, vdev, weak_barriers, context,
>>>>>>> -				   notify, callback, name);
>>>>>>> +	vq = __vring_new_virtqueue(index, vring, packed, vdev, weak_barriers,
>>>>>>> +				   context, notify, callback, name);
>>>>>>>  	if (!vq) {
>>>>>>>  		vring_free_queue(vdev, queue_size_in_bytes, queue,
>>>>>>>  				 dma_addr);
>>> [...]
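
For illustration only, the single-loop suggestion made earlier in this thread (one loop over all buffers, with a counter deciding the NEXT bit instead of the modulo check) could be sketched roughly as below. This is a standalone userspace sketch, not code from the patch: the sketch_* names, the flat addrs/lens arrays standing in for the scatterlists, and the flag bit positions are all assumptions, and DMA mapping, endian conversion and the indirect case are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag layout assumed for this sketch; not taken from the patch. */
#define SKETCH_F_NEXT      (1u << 0)
#define SKETCH_F_WRITE     (1u << 1)
#define SKETCH_F_AVAIL(b)  ((uint16_t)((b) << 7))
#define SKETCH_F_USED(b)   ((uint16_t)((b) << 15))

/* Hypothetical stand-in for a packed ring descriptor. */
struct sketch_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t id;
	uint16_t flags;
};

/* Hypothetical stand-in for the driver-side ring state. */
struct sketch_ring {
	struct sketch_desc *desc;
	unsigned int num;	/* ring size */
	unsigned int next;	/* index of the next free slot */
	bool wrap_counter;
};

/*
 * Fill out_cnt device-readable buffers followed by in_cnt device-writable
 * buffers using one loop. The head descriptor's flags are not stored; they
 * are returned so the caller can write them last, mirroring the delayed
 * head update discussed in the thread.
 */
uint16_t sketch_add(struct sketch_ring *r, const uint64_t *addrs,
		    const uint32_t *lens, unsigned int out_cnt,
		    unsigned int in_cnt)
{
	unsigned int total = out_cnt + in_cnt;
	unsigned int head = r->next, i = head, n;
	uint16_t head_flags = 0;

	assert(total > 0 && total <= r->num);

	for (n = 0; n < total; n++) {
		uint16_t flags = SKETCH_F_AVAIL(r->wrap_counter) |
				 SKETCH_F_USED(!r->wrap_counter);

		/* The only per-direction difference is the write flag. */
		if (n >= out_cnt)
			flags |= SKETCH_F_WRITE;
		/* The counter tells us whether another descriptor follows. */
		if (n + 1 < total)
			flags |= SKETCH_F_NEXT;

		if (n == 0)
			head_flags = flags;	/* delayed: caller stores it */
		else
			r->desc[i].flags = flags;

		r->desc[i].addr = addrs[n];
		r->desc[i].len = lens[n];
		r->desc[i].id = (uint16_t)head;

		if (++i >= r->num) {
			i = 0;
			r->wrap_counter = !r->wrap_counter;
		}
	}

	r->next = i;
	return head_flags;
}
```

With a single buffer (out_cnt + in_cnt == 1) the NEXT bit is never set in this sketch, so the head flags need no fixup after the loop.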