From: Stefano Garzarella
To: virtualization@lists.linux-foundation.org
Cc: netdev@vger.kernel.org, Jason Wang, "Michael S. Tsirkin",
    Stefano Garzarella, linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
    Stefan Hajnoczi
Subject: [PATCH v2] vhost/vsock: add IOTLB API support
Date: Wed, 23 Dec 2020 15:36:38 +0100
Message-Id: <20201223143638.123417-1-sgarzare@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This patch enables the IOTLB API support for vhost-vsock devices,
allowing the userspace to emulate an IOMMU for the guest.

These changes were made following vhost-net; in detail, this patch:
- exposes the VIRTIO_F_ACCESS_PLATFORM feature and inits the iotlb
  device if the feature is acked
- implements the VHOST_GET_BACKEND_FEATURES and
  VHOST_SET_BACKEND_FEATURES ioctls
- calls vq_meta_prefetch() before vq processing to prefetch the vq
  metadata addresses in the IOTLB
- provides .read_iter, .write_iter, and .poll callbacks for the
  chardev; they are used by the userspace to exchange IOTLB messages

This patch was tested specifying "intel_iommu=strict" in the guest
kernel command line.
I used QEMU with a patch applied [1] to fix a simple issue (that
patch was merged in QEMU v5.2.0):
    $ qemu -M q35,accel=kvm,kernel-irqchip=split \
           -drive file=fedora.qcow2,format=qcow2,if=virtio \
           -device intel-iommu,intremap=on,device-iotlb=on \
           -device vhost-vsock-pci,guest-cid=3,iommu_platform=on,ats=on

[1] https://lists.gnu.org/archive/html/qemu-devel/2020-10/msg09077.html

Reviewed-by: Stefan Hajnoczi
Signed-off-by: Stefano Garzarella
---

The patch is the same as v1, but I re-tested it with:
- QEMU v5.2.0-551-ga05f8ecd88
- Linux 5.9.15 (host)
- Linux 5.9.15 and 5.10.0 (guest)

Now it works well with 'ats' enabled; there are just a few simple changes.

v1: https://www.spinics.net/lists/kernel/msg3716022.html

v2:
- updated commit message about QEMU version and string used to test
- rebased on mst/vhost branch

Thanks,
Stefano
---
 drivers/vhost/vsock.c | 68 +++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 65 insertions(+), 3 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index a483cec31d5c..5e78fb719602 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -30,7 +30,12 @@
 #define VHOST_VSOCK_PKT_WEIGHT 256
 
 enum {
-	VHOST_VSOCK_FEATURES = VHOST_FEATURES,
+	VHOST_VSOCK_FEATURES = VHOST_FEATURES |
+			       (1ULL << VIRTIO_F_ACCESS_PLATFORM)
+};
+
+enum {
+	VHOST_VSOCK_BACKEND_FEATURES = (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2)
 };
 
 /* Used to track all the vhost_vsock instances on the system. */
@@ -94,6 +99,9 @@ vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
 	if (!vhost_vq_get_backend(vq))
 		goto out;
 
+	if (!vq_meta_prefetch(vq))
+		goto out;
+
 	/* Avoid further vmexits, we're already processing the virtqueue */
 	vhost_disable_notify(&vsock->dev, vq);
 
@@ -449,6 +457,9 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
 	if (!vhost_vq_get_backend(vq))
 		goto out;
 
+	if (!vq_meta_prefetch(vq))
+		goto out;
+
 	vhost_disable_notify(&vsock->dev, vq);
 	do {
 		u32 len;
@@ -766,8 +777,12 @@ static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
 	mutex_lock(&vsock->dev.mutex);
 	if ((features & (1 << VHOST_F_LOG_ALL)) &&
 	    !vhost_log_access_ok(&vsock->dev)) {
-		mutex_unlock(&vsock->dev.mutex);
-		return -EFAULT;
+		goto err;
+	}
+
+	if ((features & (1ULL << VIRTIO_F_ACCESS_PLATFORM))) {
+		if (vhost_init_device_iotlb(&vsock->dev, true))
+			goto err;
 	}
 
 	for (i = 0; i < ARRAY_SIZE(vsock->vqs); i++) {
@@ -778,6 +793,10 @@ static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
 	}
 	mutex_unlock(&vsock->dev.mutex);
 	return 0;
+
+err:
+	mutex_unlock(&vsock->dev.mutex);
+	return -EFAULT;
 }
 
 static long vhost_vsock_dev_ioctl(struct file *f, unsigned int ioctl,
@@ -811,6 +830,18 @@ static long vhost_vsock_dev_ioctl(struct file *f, unsigned int ioctl,
 		if (copy_from_user(&features, argp, sizeof(features)))
 			return -EFAULT;
 		return vhost_vsock_set_features(vsock, features);
+	case VHOST_GET_BACKEND_FEATURES:
+		features = VHOST_VSOCK_BACKEND_FEATURES;
+		if (copy_to_user(argp, &features, sizeof(features)))
+			return -EFAULT;
+		return 0;
+	case VHOST_SET_BACKEND_FEATURES:
+		if (copy_from_user(&features, argp, sizeof(features)))
+			return -EFAULT;
+		if (features & ~VHOST_VSOCK_BACKEND_FEATURES)
+			return -EOPNOTSUPP;
+		vhost_set_backend_features(&vsock->dev, features);
+		return 0;
 	default:
 		mutex_lock(&vsock->dev.mutex);
 		r = vhost_dev_ioctl(&vsock->dev, ioctl, argp);
@@ -823,6 +854,34 @@ static long vhost_vsock_dev_ioctl(struct file *f, unsigned int ioctl,
 	}
 }
 
+static ssize_t vhost_vsock_chr_read_iter(struct kiocb *iocb, struct iov_iter *to)
+{
+	struct file *file = iocb->ki_filp;
+	struct vhost_vsock *vsock = file->private_data;
+	struct vhost_dev *dev = &vsock->dev;
+	int noblock = file->f_flags & O_NONBLOCK;
+
+	return vhost_chr_read_iter(dev, to, noblock);
+}
+
+static ssize_t vhost_vsock_chr_write_iter(struct kiocb *iocb,
+					struct iov_iter *from)
+{
+	struct file *file = iocb->ki_filp;
+	struct vhost_vsock *vsock = file->private_data;
+	struct vhost_dev *dev = &vsock->dev;
+
+	return vhost_chr_write_iter(dev, from);
+}
+
+static __poll_t vhost_vsock_chr_poll(struct file *file, poll_table *wait)
+{
+	struct vhost_vsock *vsock = file->private_data;
+	struct vhost_dev *dev = &vsock->dev;
+
+	return vhost_chr_poll(file, dev, wait);
+}
+
 static const struct file_operations vhost_vsock_fops = {
 	.owner          = THIS_MODULE,
 	.open           = vhost_vsock_dev_open,
@@ -830,6 +889,9 @@ static const struct file_operations vhost_vsock_fops = {
 	.llseek		= noop_llseek,
 	.unlocked_ioctl = vhost_vsock_dev_ioctl,
 	.compat_ioctl   = compat_ptr_ioctl,
+	.read_iter      = vhost_vsock_chr_read_iter,
+	.write_iter     = vhost_vsock_chr_write_iter,
+	.poll           = vhost_vsock_chr_poll,
 };
 
 static struct miscdevice vhost_vsock_misc = {
-- 
2.26.2
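
For reviewers who want to check the backend-feature negotiation in
isolation, here is a minimal userspace model of the mask check done by
the two new ioctl cases. It is only a sketch, not part of the patch:
every sketch_* name and struct sketch_dev are hypothetical stand-ins,
and the bit number is copied by hand (assumed to match
VHOST_BACKEND_F_IOTLB_MSG_V2 in the uapi headers) so that it builds
without kernel headers.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed bit number, copied so the sketch builds without kernel headers;
 * real code takes VHOST_BACKEND_F_IOTLB_MSG_V2 from the uapi headers. */
#define SKETCH_IOTLB_MSG_V2_BIT 1

/* Hypothetical stand-in for VHOST_VSOCK_BACKEND_FEATURES. */
#define SKETCH_BACKEND_FEATURES (1ULL << SKETCH_IOTLB_MSG_V2_BIT)

/* Hypothetical stand-in for the per-device state the ioctl updates. */
struct sketch_dev {
	uint64_t backend_features;
};

/* Models the VHOST_GET_BACKEND_FEATURES case: report the supported bits. */
uint64_t sketch_get_backend_features(void)
{
	return SKETCH_BACKEND_FEATURES;
}

/* Models the VHOST_SET_BACKEND_FEATURES case: any bit outside the
 * supported mask is rejected, otherwise the value is stored as-is. */
int sketch_set_backend_features(struct sketch_dev *dev, uint64_t features)
{
	if (features & ~SKETCH_BACKEND_FEATURES)
		return -EOPNOTSUPP;
	dev->backend_features = features;
	return 0;
}
```

In this model, acking only the IOTLB v2 bit (or no bits at all)
succeeds, while any other bit returns -EOPNOTSUPP and leaves the
stored value untouched, as in the ioctl handler.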