Received: by 2002:ac0:a5b6:0:0:0:0:0 with SMTP id m51-v6csp2964460imm; Mon, 28 May 2018 21:06:55 -0700 (PDT) X-Google-Smtp-Source: AB8JxZqdoea+OmjUDbCtxufYLmF1Qe1NRaUonDQHiGYIU3X60ddEi+p6uuxb9GzUa0YGpyShhowU X-Received: by 2002:a62:1211:: with SMTP id a17-v6mr15953991pfj.104.1527566815324; Mon, 28 May 2018 21:06:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1527566815; cv=none; d=google.com; s=arc-20160816; b=J2z0aH6i27zfC+5Qi9LzR6npRD+BI1HhHm3YzLKCdkb6mcY83J9+cVFZRLCGtZFy40 hXF71f/HzZYNesJboxCLKVrLKnd3bewLUwzRGIwd8iINTd6G5RcmfGpdLIhfbjwGpxrE nvu68r/xHCGXn3K+Tp7oYRJdUA2fI136t0mBwN9pgh6eld7+Vsw+jWyXPTYSBvxx39Cf FLeCbmd4aWfxrLbhMs0QyylzqZ1mqz5cGSSI7iSE6YScmoIZGpCd4ifqGNg+EqLKgpNK sy+Bsp55qO39UZGrSgxIisXV0DnXB7Qz2P0FtbXl112AzEk43FRn90QKBl8bI7yKy5EE WTHQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:arc-authentication-results; bh=Lz0BBPxb8F2GJzulYVPzmpLsgJAdBnmOkh1BVWUUYHc=; b=AnM8q6tsysr5EAUfNW4aTgu8LU2CxsLbtBo5quL2VeJ0s9kpvAj2V4XcqoFb+mf05+ z++MEP4VaWZ89ShQpgkn9eW55o+SKJnxvQlbix7DHZUAnS59W7wcr/DCkmnVhTSPBuFa hy0ZT/+avCkKa/+RFxIKzSBqYiFPRgk89crra/nKA/ff8IoRuRwMPVpQuwBPbc40QX2/ EARuaG1skbO8RY+SfEnNUXUpha9UWYhGKX1s0XpXd6eMXq23qFpiM4TPcTIu9ThNQdVI iS0rtD0ARbdsghYUhaKMzKHyZfYacSTRqx8ItuI3NcejhCthJwGVtbDnE5GChhaXZmu+ tdeA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id z190-v6si24598726pgb.108.2018.05.28.21.06.10; Mon, 28 May 2018 21:06:55 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755955AbeE2CMQ (ORCPT + 99 others); Mon, 28 May 2018 22:12:16 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:37014 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S933991AbeE2CMA (ORCPT ); Mon, 28 May 2018 22:12:00 -0400 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id B56E27C6C4; Tue, 29 May 2018 02:11:59 +0000 (UTC) Received: from jason-ThinkPad-T450s.redhat.com (ovpn-12-44.pek2.redhat.com [10.72.12.44]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3B9BE10F1C0A; Tue, 29 May 2018 02:11:49 +0000 (UTC) From: Jason Wang To: mst@redhat.com, jasowang@redhat.com Cc: kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, jfreimann@redhat.com, wexu@redhat.com, tiwei.bie@intel.com Subject: [RFC V5 PATCH 8/8] vhost: event suppression for packed ring Date: Tue, 29 May 2018 10:10:30 +0800 Message-Id: <1527559830-8133-9-git-send-email-jasowang@redhat.com> In-Reply-To: <1527559830-8133-1-git-send-email-jasowang@redhat.com> References: <1527559830-8133-1-git-send-email-jasowang@redhat.com> X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.2]); Tue, 29 May 2018 02:11:59 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.2]); Tue, 29 May 2018 02:11:59 +0000 (UTC) for IP:'10.11.54.3' DOMAIN:'int-mx03.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'jasowang@redhat.com' RCPT:'' Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org This patch introduces basic support for event suppression aka driver and device area. Signed-off-by: Jason Wang --- drivers/vhost/vhost.c | 191 ++++++++++++++++++++++++++++++++++++--- drivers/vhost/vhost.h | 10 +- include/uapi/linux/virtio_ring.h | 19 ++++ 3 files changed, 204 insertions(+), 16 deletions(-) diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c index a36e5ad2..112f680 100644 --- a/drivers/vhost/vhost.c +++ b/drivers/vhost/vhost.c @@ -1112,10 +1112,15 @@ static int vq_access_ok_packed(struct vhost_virtqueue *vq, unsigned int num, struct vring_used __user *used) { struct vring_desc_packed *packed = (struct vring_desc_packed *)desc; + struct vring_packed_desc_event *driver_event = + (struct vring_packed_desc_event *)avail; + struct vring_packed_desc_event *device_event = + (struct vring_packed_desc_event *)used; - /* FIXME: check device area and driver area */ return access_ok(VERIFY_READ, packed, num * sizeof(*packed)) && - access_ok(VERIFY_WRITE, packed, num * sizeof(*packed)); + access_ok(VERIFY_WRITE, packed, num * sizeof(*packed)) && + access_ok(VERIFY_READ, driver_event, sizeof(*driver_event)) && + access_ok(VERIFY_WRITE, device_event, sizeof(*device_event)); } static int vq_access_ok_split(struct vhost_virtqueue *vq, unsigned int num, @@ -1190,14 +1195,27 @@ static bool iotlb_access_ok(struct vhost_virtqueue *vq, return true; } -int vq_iotlb_prefetch(struct vhost_virtqueue *vq) +int vq_iotlb_prefetch_packed(struct vhost_virtqueue *vq) +{ + int num = vq->num; + + return iotlb_access_ok(vq, VHOST_ACCESS_RO, (u64)(uintptr_t)vq->desc, + num * sizeof(*vq->desc), VHOST_ADDR_DESC) && + iotlb_access_ok(vq, VHOST_ACCESS_WO, (u64)(uintptr_t)vq->desc, + num * sizeof(*vq->desc), VHOST_ADDR_DESC) && + iotlb_access_ok(vq, VHOST_ACCESS_RO, + (u64)(uintptr_t)vq->driver_event, + sizeof(*vq->driver_event), VHOST_ADDR_AVAIL) && + iotlb_access_ok(vq, VHOST_ACCESS_WO, + (u64)(uintptr_t)vq->device_event, + sizeof(*vq->device_event), VHOST_ADDR_USED); +} + +int vq_iotlb_prefetch_split(struct vhost_virtqueue *vq) { size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0; unsigned int num = vq->num; - if (!vq->iotlb) - return 1; - return iotlb_access_ok(vq, VHOST_ACCESS_RO, (u64)(uintptr_t)vq->desc, num * sizeof(*vq->desc), VHOST_ADDR_DESC) && iotlb_access_ok(vq, VHOST_ACCESS_RO, (u64)(uintptr_t)vq->avail, @@ -1209,6 +1227,17 @@ int vq_iotlb_prefetch(struct vhost_virtqueue *vq) num * sizeof(*vq->used->ring) + s, VHOST_ADDR_USED); } + +int vq_iotlb_prefetch(struct vhost_virtqueue *vq) +{ + if (!vq->iotlb) + return 1; + + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) + return vq_iotlb_prefetch_packed(vq); + else + return vq_iotlb_prefetch_split(vq); +} EXPORT_SYMBOL_GPL(vq_iotlb_prefetch); /* Can we log writes? */ @@ -1730,6 +1759,50 @@ static int vhost_update_used_flags(struct vhost_virtqueue *vq) return 0; } +static int vhost_update_device_flags(struct vhost_virtqueue *vq, + __virtio16 device_flags) +{ + void __user *flags; + + if (vhost_put_user(vq, device_flags, &vq->device_event->flags, + VHOST_ADDR_USED) < 0) + return -EFAULT; + if (unlikely(vq->log_used)) { + /* Make sure the flag is seen before log. */ + smp_wmb(); + /* Log used flag write. */ + flags = &vq->device_event->flags; + log_write(vq->log_base, vq->log_addr + + (flags - (void __user *)vq->device_event), + sizeof(vq->device_event->flags)); + if (vq->log_ctx) + eventfd_signal(vq->log_ctx, 1); + } + return 0; +} + +static int vhost_update_device_off_wrap(struct vhost_virtqueue *vq, + __virtio16 device_off_wrap) +{ + void __user *off_wrap; + + if (vhost_put_user(vq, device_off_wrap, &vq->device_event->off_wrap, + VHOST_ADDR_USED) < 0) + return -EFAULT; + if (unlikely(vq->log_used)) { + /* Make sure the flag is seen before log. */ + smp_wmb(); + /* Log used flag write. */ + off_wrap = &vq->device_event->off_wrap; + log_write(vq->log_base, vq->log_addr + + (off_wrap - (void __user *)vq->device_event), + sizeof(vq->device_event->off_wrap)); + if (vq->log_ctx) + eventfd_signal(vq->log_ctx, 1); + } + return 0; +} + static int vhost_update_avail_event(struct vhost_virtqueue *vq, u16 avail_event) { if (vhost_put_user(vq, cpu_to_vhost16(vq, vq->avail_idx), @@ -2683,16 +2756,13 @@ int vhost_add_used_n(struct vhost_virtqueue *vq, } EXPORT_SYMBOL_GPL(vhost_add_used_n); -static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq) +static bool vhost_notify_split(struct vhost_dev *dev, + struct vhost_virtqueue *vq) { __u16 old, new; __virtio16 event; bool v; - /* FIXME: check driver area */ - if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) - return true; - /* Flush out used index updates. This is paired * with the barrier that the Guest executes when enabling * interrupts. */ @@ -2725,6 +2795,64 @@ static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq) return vring_need_event(vhost16_to_cpu(vq, event), new, old); } +static bool vhost_notify_packed(struct vhost_dev *dev, + struct vhost_virtqueue *vq) +{ + __virtio16 event_off_wrap, event_flags; + __u16 old, new, off_wrap; + bool v; + + /* Flush out used descriptors updates. This is paired + * with the barrier that the Guest executes when enabling + * interrupts. + */ + smp_mb(); + + if (vhost_get_avail(vq, event_flags, + &vq->driver_event->flags) < 0) { + vq_err(vq, "Failed to get driver desc_event_flags"); + return true; + } + + if (!vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX)) + return event_flags != + cpu_to_vhost16(vq, RING_EVENT_FLAGS_DISABLE); + + old = vq->signalled_used; + v = vq->signalled_used_valid; + new = vq->signalled_used = vq->last_used_idx; + vq->signalled_used_valid = true; + + if (event_flags != cpu_to_vhost16(vq, RING_EVENT_FLAGS_DESC)) + return event_flags != + cpu_to_vhost16(vq, RING_EVENT_FLAGS_DISABLE); + + /* Read desc event flags before event_off and event_wrap */ + smp_rmb(); + + if (vhost_get_avail(vq, event_off_wrap, + &vq->driver_event->off_wrap) < 0) { + vq_err(vq, "Failed to get driver desc_event_off/wrap"); + return true; + } + + off_wrap = vhost16_to_cpu(vq, event_off_wrap); + + if (unlikely(!v)) + return true; + + return vhost_vring_packed_need_event(vq, vq->used_wrap_counter, + off_wrap, new, old); +} + +static bool vhost_notify(struct vhost_dev *dev, struct vhost_virtqueue *vq) +{ + if (vhost_has_feature(vq, VIRTIO_F_RING_PACKED)) + return vhost_notify_packed(dev, vq); + else + return vhost_notify_split(dev, vq); +} + /* This actually signals the guest, using eventfd. */ void vhost_signal(struct vhost_dev *dev, struct vhost_virtqueue *vq) { @@ -2802,10 +2930,34 @@ static bool vhost_enable_notify_packed(struct vhost_dev *dev, struct vhost_virtqueue *vq) { struct vring_desc_packed *d = vq->desc_packed + vq->avail_idx; - __virtio16 flags; + __virtio16 flags = RING_EVENT_FLAGS_ENABLE; int ret; - /* FIXME: disable notification through device area */ + if (!(vq->used_flags & VRING_USED_F_NO_NOTIFY)) + return false; + vq->used_flags &= ~VRING_USED_F_NO_NOTIFY; + + if (vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX)) { + __virtio16 off_wrap = cpu_to_vhost16(vq, vq->avail_idx | + vq->avail_wrap_counter << 15); + + ret = vhost_update_device_off_wrap(vq, off_wrap); + if (ret) { + vq_err(vq, "Failed to write to off warp at %p: %d\n", + &vq->device_event->off_wrap, ret); + return false; + } + /* Make sure off_wrap is wrote before flags */ + smp_wmb(); + flags = RING_EVENT_FLAGS_DESC; + } + + ret = vhost_update_device_flags(vq, flags); + if (ret) { + vq_err(vq, "Failed to enable notification at %p: %d\n", + &vq->device_event->flags, ret); + return false; + } /* They could have slipped one in as we were doing that: make * sure it's written, then check again. */ @@ -2871,7 +3023,18 @@ EXPORT_SYMBOL_GPL(vhost_enable_notify); static void vhost_disable_notify_packed(struct vhost_dev *dev, struct vhost_virtqueue *vq) { - /* FIXME: disable notification through device area */ + __virtio16 flags; + int r; + + if (vq->used_flags & VRING_USED_F_NO_NOTIFY) + return; + vq->used_flags |= VRING_USED_F_NO_NOTIFY; + + flags = cpu_to_vhost16(vq, RING_EVENT_FLAGS_DISABLE); + r = vhost_update_device_flags(vq, flags); + if (r) + vq_err(vq, "Failed to enable notification at %p: %d\n", + &vq->device_event->flags, r); } static void vhost_disable_notify_split(struct vhost_dev *dev, diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h index 7543a46..b920582 100644 --- a/drivers/vhost/vhost.h +++ b/drivers/vhost/vhost.h @@ -96,8 +96,14 @@ struct vhost_virtqueue { struct vring_desc __user *desc; struct vring_desc_packed __user *desc_packed; }; - struct vring_avail __user *avail; - struct vring_used __user *used; + union { + struct vring_avail __user *avail; + struct vring_packed_desc_event __user *driver_event; + }; + union { + struct vring_used __user *used; + struct vring_packed_desc_event __user *device_event; + }; const struct vhost_umem_node *meta_iotlb[VHOST_NUM_ADDRS]; struct file *kick; struct eventfd_ctx *call_ctx; diff --git a/include/uapi/linux/virtio_ring.h b/include/uapi/linux/virtio_ring.h index e297580..71c7a46 100644 --- a/include/uapi/linux/virtio_ring.h +++ b/include/uapi/linux/virtio_ring.h @@ -75,6 +75,25 @@ struct vring_desc_packed { __virtio16 flags; }; +/* Enable events */ +#define RING_EVENT_FLAGS_ENABLE 0x0 +/* Disable events */ +#define RING_EVENT_FLAGS_DISABLE 0x1 +/* + * Enable events for a specific descriptor + * (as specified by Descriptor Ring Change Event Offset/Wrap Counter). + * Only valid if VIRTIO_F_RING_EVENT_IDX has been negotiated. + */ +#define RING_EVENT_FLAGS_DESC 0x2 +/* The value 0x3 is reserved */ + +struct vring_packed_desc_event { + /* Descriptor Ring Change Event Offset and Wrap Counter */ + __virtio16 off_wrap; + /* Descriptor Ring Change Event Flags */ + __virtio16 flags; +}; + /* Virtio ring descriptors: 16 bytes. These can chain together via "next". */ struct vring_desc { /* Address (guest-physical). */ -- 2.7.4