Date: Fri, 26 Aug 2022 11:14:15 +0800 (CST)
From: Guo Zhi
To: jasowang
Cc: eperezma, sgarzare, Michael Tsirkin, netdev, linux-kernel, kvm list, virtualization
Message-ID:
 <384558036.9092726.1661483655362.JavaMail.zimbra@sjtu.edu.cn>
References: <20220817135718.2553-1-qtxuning1999@sjtu.edu.cn> <20220817135718.2553-7-qtxuning1999@sjtu.edu.cn>
Subject: Re: [RFC v2 6/7] virtio: in order support for virtio_ring
X-Mailing-List: linux-kernel@vger.kernel.org

----- Original Message -----
> From: "jasowang"
> To: "Guo Zhi", "eperezma", "sgarzare", "Michael Tsirkin"
> Cc: "netdev", "linux-kernel", "kvm list", "virtualization"
> Sent: Thursday, August 25, 2022 3:44:41 PM
> Subject: Re: [RFC v2 6/7] virtio: in order support for virtio_ring

> On 2022/8/17 21:57, Guo Zhi wrote:
>> If in order feature negotiated, we can skip the used ring to get
>> buffer's desc id sequentially.
>>
>> Signed-off-by: Guo Zhi
>> ---
>>  drivers/virtio/virtio_ring.c | 53 ++++++++++++++++++++++++++++++------
>>  1 file changed, 45 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
>> index 1c1b3fa376a2..143184ebb5a1 100644
>> --- a/drivers/virtio/virtio_ring.c
>> +++ b/drivers/virtio/virtio_ring.c
>> @@ -144,6 +144,9 @@ struct vring_virtqueue {
>>  		/* DMA address and size information */
>>  		dma_addr_t queue_dma_addr;
>>  		size_t queue_size_in_bytes;
>> +
>> +		/* In order feature batch begin here */
>
>
> We need tweak the comment, it's not easy for me to understand the
> meaning here.
>
How about this: if the in_order feature is negotiated, this is the next head to consume.
>
>> +		u16 next_desc_begin;
>>  	} split;
>>
>>  	/* Available for packed ring */
>> @@ -702,8 +705,13 @@ static void detach_buf_split(struct vring_virtqueue *vq, unsigned int head,
>>  	}
>>
>>  	vring_unmap_one_split(vq, i);
>> -	vq->split.desc_extra[i].next = vq->free_head;
>> -	vq->free_head = head;
>> +	/* In order feature use desc in order,
>> +	 * that means, the next desc will always be free
>> +	 */
>
>
> Maybe we should add something like "The descriptors are prepared in order".
>
I will change it to this: The descriptors are made available in order if the in order feature is used. Since free_head already forms a circular list, the driver must consume it sequentially.
>
>> +	if (!virtio_has_feature(vq->vq.vdev, VIRTIO_F_IN_ORDER)) {
>> +		vq->split.desc_extra[i].next = vq->free_head;
>> +		vq->free_head = head;
>> +	}
>>
>>  	/* Plus final descriptor */
>>  	vq->vq.num_free++;
>> @@ -745,7 +753,7 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
>>  {
>>  	struct vring_virtqueue *vq = to_vvq(_vq);
>>  	void *ret;
>> -	unsigned int i;
>> +	unsigned int i, j;
>>  	u16 last_used;
>>
>>  	START_USE(vq);
>> @@ -764,11 +772,38 @@ static void *virtqueue_get_buf_ctx_split(struct virtqueue *_vq,
>>  	/* Only get used array entries after they have been exposed by host.
>>  	 */
>>  	virtio_rmb(vq->weak_barriers);
>>
>> -	last_used = (vq->last_used_idx & (vq->split.vring.num - 1));
>> -	i = virtio32_to_cpu(_vq->vdev,
>> -			vq->split.vring.used->ring[last_used].id);
>> -	*len = virtio32_to_cpu(_vq->vdev,
>> -			vq->split.vring.used->ring[last_used].len);
>> +	if (virtio_has_feature(_vq->vdev, VIRTIO_F_IN_ORDER)) {
>> +		/* Skip used ring and get used desc in order*/
>> +		i = vq->split.next_desc_begin;
>> +		j = i;
>> +		/* Indirect only takes one descriptor in descriptor table */
>> +		while (!vq->indirect && (vq->split.desc_extra[j].flags & VRING_DESC_F_NEXT))
>> +			j = (j + 1) % vq->split.vring.num;
>
>
> Let's move the expensive mod outside the loop. Or it's split so we can
> use and here actually since the size is guaranteed to be power of the
> two? Another question, is it better to store the next_desc in e.g
> desc_extra?

Thanks, I will use a bitwise AND instead of the modulo.

We only use one next_desc_begin at a time, so there is no need to store it in desc_extra.

>
> And this seems very expensive if the device doesn't do the batching
> (which is not mandatory).
>
>

We will check whether the device batched the buffers, and use this path only for batched buffers.

The path without in order is more expensive anyway, because it has to follow a linked list.
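
A minimal userspace sketch of the mask-based wraparound mentioned above, not the actual patch. It relies on the split ring size being a power of two, as the existing last_used computation in the quoted code already does. ring_next is a hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): advance a ring index by one
 * slot, wrapping at num. Only valid when num is a power of two, in which
 * case (idx + 1) & (num - 1) equals (idx + 1) % num.
 */
static inline uint16_t ring_next(uint16_t idx, uint16_t num)
{
	/* Guard the assumption the mask trick depends on. */
	assert(num != 0 && (num & (num - 1)) == 0);
	return (uint16_t)((idx + 1) & (num - 1));
}
```

With this shape, both the loop body and the final "move to next" step could share the same helper, so no division is left on the path.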
>> +		/* move to next */
>> +		j = (j + 1) % vq->split.vring.num;
>> +		/* Next buffer will use this descriptor in order */
>> +		vq->split.next_desc_begin = j;
>> +		if (!vq->indirect) {
>> +			*len = vq->split.desc_extra[i].len;
>> +		} else {
>> +			struct vring_desc *indir_desc =
>> +				vq->split.desc_state[i].indir_desc;
>> +			u32 indir_num = vq->split.desc_extra[i].len, buffer_len = 0;
>> +
>> +			if (indir_desc) {
>> +				for (j = 0; j < indir_num / sizeof(struct vring_desc); j++)
>> +					buffer_len += indir_desc[j].len;
>
>
> So I think we need to finalize this, then we can have much more stress
> on the cache:
>
> https://lkml.org/lkml/2021/10/26/1300
>
> It was reverted since it's too aggressive, we should instead:
>
> 1) do the validation only for morden device
>
> 2) fail only when we enable the validation via (e.g a module parameter).
>
> Thanks
>
>
>> +			}
>> +
>> +			*len = buffer_len;
>> +		}
>> +	} else {
>> +		last_used = (vq->last_used_idx & (vq->split.vring.num - 1));
>> +		i = virtio32_to_cpu(_vq->vdev,
>> +				vq->split.vring.used->ring[last_used].id);
>> +		*len = virtio32_to_cpu(_vq->vdev,
>> +				vq->split.vring.used->ring[last_used].len);
>> +	}
>>
>>  	if (unlikely(i >= vq->split.vring.num)) {
>>  		BAD_RING(vq, "id %u out of range\n", i);
>> @@ -2236,6 +2271,8 @@ struct virtqueue *__vring_new_virtqueue(unsigned int index,
>>  	vq->split.avail_flags_shadow = 0;
>>  	vq->split.avail_idx_shadow = 0;
>>
>> +	vq->split.next_desc_begin = 0;
>> +
>>  	/* No callback? Tell other side not to bother us. */
>>  	if (!callback) {
>>  		vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
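
For readers following the thread, the in-order head tracking in the quoted hunk can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: toy_desc_extra and toy_next_head are hypothetical names, the structure mirrors just the flags field of desc_extra, the indirect case is left out, and the ring size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_DESC_F_NEXT 1u	/* same bit value as VRING_DESC_F_NEXT */

/* Hypothetical stand-in for the per-descriptor driver state. */
struct toy_desc_extra {
	uint16_t flags;
};

/*
 * Given the head of a buffer whose descriptors were made available in
 * order, skip over its chain and return the head of the buffer after it.
 * This plays the role of updating next_desc_begin in the quoted patch.
 */
static uint16_t toy_next_head(const struct toy_desc_extra *extra,
			      uint16_t num, uint16_t head)
{
	uint16_t mask = (uint16_t)(num - 1);
	uint16_t j = head;

	assert(num != 0 && (num & mask) == 0);

	/* Follow the chain: consecutive slots linked by the NEXT flag. */
	while (extra[j].flags & TOY_DESC_F_NEXT)
		j = (uint16_t)((j + 1) & mask);

	/* One past the last descriptor of this buffer, with wraparound. */
	return (uint16_t)((j + 1) & mask);
}
```

The cost of the walk is proportional to the chain length, which is the expense raised in the review comments above.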