Received: by 10.213.65.68 with SMTP id h4csp2165371imn; Thu, 29 Mar 2018 19:47:56 -0700 (PDT) X-Google-Smtp-Source: AIpwx48QcUp96cxKuqZLBVMDmd55OPSZ5IV7gE/CclmrTPbPrTc0OSs89hfs8paI+JswYSAkWgcx X-Received: by 10.98.10.156 with SMTP id 28mr8415485pfk.33.1522378076417; Thu, 29 Mar 2018 19:47:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1522378076; cv=none; d=google.com; s=arc-20160816; b=iI9BmwgGU4QC9hAGbQkbpQUrAAfZ76wqjabjnKNhPRvIlmf2j0oIsQMSdOjvb/lnKp 73vm3yLfSPSMpOPfR/SkOp5uGej/B+tfjRv9LzCCH5CvdkzZxlqu7DJfUw7T4NkjIWlD rK3S1KzjFGEj3x+AyHEZuBgbCNMVQ+2wupTpmB2x3HIsFk73EykBl3lHyWfXY9xetcEP H8CAqnj9xCtHgij/riSp9+Spt92PCGWahamlKQXmOCbhej4PdkJM5yb6ARZ6t701vmOd 9VTciIL75zNoDOjgD8XwRV19f+LB4CK1dVyLk9U5I3cD/2023bXj2bIfAGw49EikRR/b lMsw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-language :content-transfer-encoding:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=NH0Kd4K2Yxj6eiyr3ZkmFNLhNUvAXMCyCEAzoU+eRhk=; b=D4KwgkhWpHuXQfB72ueu+ervcv1ZoVwKm0J7Q5VWkCHj4fkaBC7cj+NiwjgCS3mAzc AfrZ+lDi4jYWVqZnwA34bCEwLrfsYhNT1HukkyuTYMqJtjPTGZjfMkqf1OTIlUQBsQJ2 0WZGL+gkbjHJFnvgk1Ksbh+EE+psVFCmu5S57xwKOZUH/fo/18KcRTSpFB+n3W9XXhv4 rfmMTVo8pbxdWwTbR9belG6y/M9OrtRccx2PIP+VJx3UIextAEPEk174oSv8ZTGxUBFg cz/weRArtAJdgp2kvFnzwLrpw2DUs3qg8HCwWnRO+M87HUFt5k9+++7ddqhPA9t5nQcq CQuw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 3-v6si7139392plq.402.2018.03.29.19.47.42; Thu, 29 Mar 2018 19:47:56 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752626AbeC3Cqh (ORCPT + 99 others); Thu, 29 Mar 2018 22:46:37 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:44390 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751215AbeC3Cqf (ORCPT ); Thu, 29 Mar 2018 22:46:35 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 16839404084A; Fri, 30 Mar 2018 02:46:35 +0000 (UTC) Received: from [10.72.12.76] (ovpn-12-76.pek2.redhat.com [10.72.12.76]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 74FC510B00A2; Fri, 30 Mar 2018 02:46:27 +0000 (UTC) Subject: Re: [RFC PATCH V2 8/8] vhost: event suppression for packed ring To: Tiwei Bie Cc: mst@redhat.com, virtualization@lists.linux-foundation.org, kvm@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org References: <1522035533-11786-1-git-send-email-jasowang@redhat.com> <1522035533-11786-9-git-send-email-jasowang@redhat.com> <20180330020518.tja3lxkfz7mdwasj@debian> From: Jason Wang Message-ID: <013655c0-6950-8937-7489-95f9b0583642@redhat.com> Date: Fri, 30 Mar 2018 10:46:24 +0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <20180330020518.tja3lxkfz7mdwasj@debian> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Content-Language: en-US X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Fri, 30 Mar 2018 02:46:35 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.5]); Fri, 30 Mar 2018 02:46:35 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'jasowang@redhat.com' RCPT:'' Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2018年03月30日 10:05, Tiwei Bie wrote: > On Mon, Mar 26, 2018 at 11:38:53AM +0800, Jason Wang wrote: >> This patch introduces basic support for event suppression aka driver >> and device area. Compile tested only. >> >> Signed-off-by: Jason Wang >> --- > [...] >> + >> +static bool vhost_notify_packed(struct vhost_dev *dev, >> + struct vhost_virtqueue *vq) >> +{ >> + __virtio16 event_off_wrap, event_flags; >> + __u16 old, new; >> + bool v, wrap; >> + int off; >> + >> + /* Flush out used descriptors updates. This is paired >> + * with the barrier that the Guest executes when enabling >> + * interrupts. >> + */ >> + smp_mb(); >> + >> + if (vhost_get_avail(vq, event_flags, >> + &vq->driver_event->desc_event_flags) < 0) { >> + vq_err(vq, "Failed to get driver desc_event_flags"); >> + return true; >> + } >> + >> + if (!(event_flags & cpu_to_vhost16(vq, RING_EVENT_FLAGS_DESC))) >> + return event_flags == >> + cpu_to_vhost16(vq, RING_EVENT_FLAGS_ENABLE); > Maybe it would be better to not use '&' here. Because these flags > are not defined as bits which can be ORed or ANDed. Instead, they > are defined as values: > > 0x0 enable > 0x1 disable > 0x2 desc > 0x3 reserved Yes the code seems tricky. Let me fix it in next version. > >> + >> + /* Read desc event flags before event_off and event_wrap */ >> + smp_rmb(); >> + >> + if (vhost_get_avail(vq, event_off_wrap, >> + &vq->driver_event->desc_event_off_warp) < 0) { >> + vq_err(vq, "Failed to get driver desc_event_off/wrap"); >> + return true; >> + } >> + >> + off = vhost16_to_cpu(vq, event_off_wrap); >> + >> + wrap = off & 0x1; >> + off >>= 1; > Based on the below definitions in spec, wrap counter is > the most significant bit. > > struct pvirtq_event_suppress { > le16 { > desc_event_off : 15; /* Descriptor Ring Change Event Offset */ > desc_event_wrap : 1; /* Descriptor Ring Change Event Wrap Counter */ > } desc; /* If desc_event_flags set to RING_EVENT_FLAGS_DESC */ > le16 { > desc_event_flags : 2, /* Descriptor Ring Change Event Flags */ > reserved : 14; /* Reserved, set to 0 */ > } flags; > }; Will fix this in next version. > >> + >> + >> + old = vq->signalled_used; >> + v = vq->signalled_used_valid; >> + new = vq->signalled_used = vq->last_used_idx; >> + vq->signalled_used_valid = true; >> + >> + if (unlikely(!v)) >> + return true; >> + >> + return vhost_vring_packed_need_event(vq, new, old, off) && >> + wrap == vq->used_wrap_counter; >> +} >> + >> +static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq) >> +{ >> + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) >> + return vhost_notify_packed(dev, vq); >> + else >> + return vhost_notify_split(dev, vq); >> +} >> + >> /* This actually signals the guest, using eventfd. */ >> void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq) >> { >> @@ -2789,7 +2911,17 @@ static bool vhost_enable_notify_packed(struct vhost_dev *dev, >> __virtio16 flags; >> int ret; >> >> - /* FIXME: disable notification through device area */ >> + if (!(vq->used_flags & VRING_USED_F_NO_NOTIFY)) >> + return false; >> + vq->used_flags &= ~VRING_USED_F_NO_NOTIFY; >> + >> + flags = cpu_to_vhost16(vq, RING_EVENT_FLAGS_ENABLE); >> + ret = vhost_update_device_flags(vq, flags); >> + if (ret) { >> + vq_err(vq, "Failed to enable notification at %p: %d\n", >> + &vq->device_event->desc_event_flags, ret); >> + return false; >> + } >> >> /* They could have slipped one in as we were doing that: make >> * sure it's written, then check again. */ >> @@ -2855,7 +2987,18 @@ EXPORT_SYMBOL_GPL(vhost_enable_notify); >> static void vhost_disable_notify_packed(struct vhost_dev *dev, >> struct vhost_virtqueue *vq) >> { >> - /* FIXME: disable notification through device area */ >> + __virtio16 flags; >> + int r; >> + >> + if (vq->used_flags & VRING_USED_F_NO_NOTIFY) >> + return; >> + vq->used_flags |= VRING_USED_F_NO_NOTIFY; >> + >> + flags = cpu_to_vhost16(vq, RING_EVENT_FLAGS_DISABLE); >> + r = vhost_update_device_flags(vq, flags); >> + if (r) >> + vq_err(vq, "Failed to enable notification at %p: %d\n", >> + &vq->device_event->desc_event_flags, r); >> } >> >> static void vhost_disable_notify_split(struct vhost_dev *dev, >> diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h >> index 8a9df4f..02d7a36 100644 >> --- a/drivers/vhost/vhost.h >> +++ b/drivers/vhost/vhost.h >> @@ -96,8 +96,14 @@ struct vhost_virtqueue { >> struct vring_desc __user *desc; >> struct vring_desc_packed __user *desc_packed; > Do you think it'd be better to name the desc type as > struct vring_packed_desc? Ok. Will do. > And it will be consistent > with other names, like: > > struct vring_packed; > struct vring_packed_desc_event; > >> }; >> - struct vring_avail __user *avail; >> - struct vring_used __user *used; >> + union { >> + struct vring_avail __user *avail; >> + struct vring_packed_desc_event __user *driver_event; >> + }; >> + union { >> + struct vring_used __user *used; >> + struct vring_packed_desc_event __user *device_event; >> + }; >> const struct vhost_umem_node *meta_iotlb[VHOST_NUM_ADDRS]; >> struct file *kick; >> struct eventfd_ctx *call_ctx; >> diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h >> index e297580..7cdbf06 100644 >> --- a/include/uapi/linux/virtio_ring.h >> +++ b/include/uapi/linux/virtio_ring.h >> @@ -75,6 +75,25 @@ struct vring_desc_packed { >> __virtio16 flags; >> }; >> >> +/* Enable events */ >> +#define RING_EVENT_FLAGS_ENABLE 0x0 >> +/* Disable events */ >> +#define RING_EVENT_FLAGS_DISABLE 0x1 >> +/* >> + * Enable events for a specific descriptor >> + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter). >> + * Only valid if VIRTIO_F_RING_EVENT_IDX has been negotiated. >> + */ >> +#define RING_EVENT_FLAGS_DESC 0x2 >> +/* The value 0x3 is reserved */ >> + >> +struct vring_packed_desc_event { >> + /* Descriptor Ring Change Event Offset and Wrap Counter */ >> + __virtio16 desc_event_off_warp; >> + /* Descriptor Ring Change Event Flags */ >> + __virtio16 desc_event_flags; > Do you think it'd be better to remove the prefix (desc_event_) for > the fields. And it will be consistent with other definitions, e.g.: > > struct vring_packed_desc { > /* Buffer Address. */ > __virtio64 addr; > /* Buffer Length. */ > __virtio32 len; > /* Buffer ID. */ > __virtio16 id; > /* The flags depending on descriptor type. */ > __virtio16 flags; > }; Yes. Let me do it in next version. Thanks for the review! >> +}; >> + >> /* Virtio ring descriptors: 16 bytes. These can chain together via "next". */ >> struct vring_desc { >> /* Address (guest-physical). */ >> -- >> 2.7.4 >>