Message-ID: <6fff9f5d67361653e6072570a857cf0d1009a123.camel@kernel.crashing.org>
Subject: Re: [RFC V2] virtio: Add platform specific DMA API translation for virito devices
From: Benjamin Herrenschmidt
To: "Michael S. Tsirkin"
Cc: Anshuman Khandual, virtualization@lists.linux-foundation.org,
    linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    aik@ozlabs.ru, robh@kernel.org, joe@perches.com,
    elfring@users.sourceforge.net, david@gibson.dropbear.id.au,
    jasowang@redhat.com, mpe@ellerman.id.au, hch@infradead.org
Date: Tue, 29 May 2018 09:48:17 +1000
In-Reply-To: <20180525202300-mutt-send-email-mst@kernel.org>
References: <20180522063317.20956-1-khandual@linux.vnet.ibm.com>
    <20180523213703-mutt-send-email-mst@kernel.org>
    <20180525202300-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 2018-05-25 at 20:45 +0300, Michael S. Tsirkin wrote:
> On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt wrote:
> > On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> >
> > > I re-read that discussion and I'm still unclear on the
> > > original question, since I got several apparently
> > > conflicting answers.
> > >
> > > I asked:
> > >
> > >     Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > >     hypervisor side sufficient?
> >
> > I thought I had replied to this...
> >
> > There are a couple of reasons:
> >
> > - First qemu doesn't know that the guest will switch to "secure mode"
> > in advance. There is no difference between a normal and a secure
> > partition until the partition does the magic UV call to "enter secure
> > mode" and qemu doesn't see any of it. So who can set the flag here ?
>
> Not sure I understand. Just set the flag e.g. on qemu command line.
> I might be wrong, but these secure mode things usually
> a. require hypervisor side tricks anyway

The way our secure mode architecture is designed, there doesn't need at
this point to be any knowledge at qemu level whatsoever.
Well, at least until we do migration, but that's a different kettle of
fish.

In any case, the guest starts normally, which means as a non-secure
guest that expects normal virtio (our FW today doesn't handle
VIRTIO_F_IOMMU_PLATFORM, though granted, we can fix this). Later that
guest issues a special Ultravisor call that turns it into a secure
guest.

There is some involvement of the hypervisor, but not qemu at this stage.
We would very much like to avoid involving qemu, as it would be a hassle
for users to have to use different libvirt options etc. just because the
guest might turn itself into a secure VM.

> > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > vhost) go through the emulated MMIO for every access to the guest,
> > which adds additional overhead.
> >
> > Cheers,
> > Ben.
>
> Well it's not supposed to be much slower for the static case.
>
> vhost has a cache so should be fine.
>
> A while ago Paolo implemented a translation cache which should be
> perfect for this case - most of the code got merged but
> never enabled because of stability issues.
>
> If all else fails, we could teach QEMU to handle the no-iommu case
> as if VIRTIO_F_IOMMU_PLATFORM was off.

Is there any serious reason not to just take that 2-line patch allowing
our arch code to force virtio to use the DMA API? It's not particularly
invasive and solves our problem rather nicely without adding overhead,
or knowledge that qemu/libvirt/management tools don't need.

The guest knows it's going secure, so the guest arch code can do the
right thing rather trivially.

Long term we should probably make virtio always use the DMA API anyway,
and interpose "1:1" dma_ops for the traditional virtio case; that would
reduce code clutter significantly. In that case, it would become just a
matter of having a platform hook to override the dma_ops used.

Cheers,
Ben.
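
To make the long-term idea above a bit more concrete, here is a rough
userspace model of it, not kernel code. Every name in it
(model_dma_ops, platform_virtio_dma_ops, the shared-window offset) is
made up for illustration and assumes nothing about the real dma_ops
layout. It only shows the shape: virtio always maps through an ops
table, the default table is 1:1, and a platform hook may substitute
its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
typedef uint64_t model_dma_addr_t;

struct model_dma_ops {
	model_dma_addr_t (*map)(uint64_t phys, size_t len);
};

/* Traditional virtio: the device sees guest physical addresses as-is. */
static model_dma_addr_t map_identity(uint64_t phys, size_t len)
{
	(void)len;
	return phys;
}

/* Made-up secure-guest mapping: shift buffers into a fixed window. */
#define MODEL_SHARED_WINDOW_BASE 0x80000000ULL

static model_dma_addr_t map_shared_window(uint64_t phys, size_t len)
{
	(void)len;
	return MODEL_SHARED_WINDOW_BASE + phys;
}

static const struct model_dma_ops identity_ops = { .map = map_identity };
static const struct model_dma_ops secure_ops = { .map = map_shared_window };

/* Hypothetical platform hook: NULL means "no override on this platform". */
static const struct model_dma_ops *platform_virtio_dma_ops(bool guest_is_secure)
{
	return guest_is_secure ? &secure_ops : NULL;
}

/* Picked once at device setup; the data path is then one indirect call. */
static const struct model_dma_ops *virtio_pick_dma_ops(bool guest_is_secure)
{
	const struct model_dma_ops *ops = platform_virtio_dma_ops(guest_is_secure);

	return ops ? ops : &identity_ops;
}

static model_dma_addr_t virtio_map(const struct model_dma_ops *ops,
				   uint64_t phys, size_t len)
{
	assert(ops && ops->map);
	return ops->map(phys, len);
}
```

With this shape the virtio core has a single mapping path, and the
"is this guest secure" decision lives entirely in the platform hook.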
> > > >
> > > >
> > > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++++++
> > > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++++++++++
> > > >  drivers/virtio/virtio_ring.c           | 10 ++++++++++
> > > >  3 files changed, 27 insertions(+)
> > > >
> > > > diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
> > > > index 8fa3945..056e578 100644
> > > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > > @@ -115,4 +115,10 @@ extern u64 __dma_get_required_mask(struct device *dev);
> > > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > > >
> > > >  #endif /* __KERNEL__ */
> > > > +
> > > > +#define platform_forces_virtio_dma platform_forces_virtio_dma
> > > > +
> > > > +struct virtio_device;
> > > > +
> > > > +extern bool platform_forces_virtio_dma(struct virtio_device *vdev);
> > > >  #endif /* _ASM_DMA_MAPPING_H */
> > > > diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> > > > index 06f0296..a2ec15a 100644
> > > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > > @@ -38,6 +38,7 @@
> > > >  #include
> > > >  #include
> > > >  #include
> > > > +#include
> > > >  #include
> > > >  #include
> > > >  #include
> > > > @@ -1396,3 +1397,13 @@ static int __init disable_multitce(char *str)
> > > >  __setup("multitce=", disable_multitce);
> > > >
> > > >  machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init);
> > > > +
> > > > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > +{
> > > > +	/*
> > > > +	 * On protected guest platforms, force virtio core to use DMA
> > > > +	 * MAP API for all virtio devices. But there can also be some
> > > > +	 * exceptions for individual devices like virtio balloon.
> > > > +	 */
> > > > +	return (of_find_compatible_node(NULL, NULL, "ibm,ultravisor") != NULL);
> > > > +}
> > >
> > > Isn't this kind of slow? vring_use_dma_api is on
> > > data path and supposed to be very fast.
> > >
> > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > index 21d464a..47ea6c3 100644
> > > > --- a/drivers/virtio/virtio_ring.c
> > > > +++ b/drivers/virtio/virtio_ring.c
> > > > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> > > >   * unconditionally on data path.
> > > >   */
> > > >
> > > > +#ifndef platform_forces_virtio_dma
> > > > +static inline bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > +{
> > > > +	return false;
> > > > +}
> > > > +#endif
> > > > +
> > > >  static bool vring_use_dma_api(struct virtio_device *vdev)
> > > >  {
> > > > +	if (platform_forces_virtio_dma(vdev))
> > > > +		return true;
> > > > +
> > > >  	if (!virtio_has_iommu_quirk(vdev))
> > > >  		return true;
> > > >
> > > > --
> > > > 2.9.3
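
On the "isn't this kind of slow?" question quoted above: one possible
direction, sketched here as a userspace model and not as a proposed
patch, is to do the expensive lookup once and answer from a cached flag
afterwards. The model_* names and the string table standing in for the
device tree are assumptions made for the example; only the
"ibm,ultravisor" compatible string comes from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Counts how often the expensive lookup runs; exists only for this model. */
static int slow_lookup_calls;

/* Stand-in for a device tree walk such as of_find_compatible_node(). */
static bool model_find_compatible(const char *const *nodes, int n,
				  const char *compat)
{
	slow_lookup_calls++;
	for (int i = 0; i < n; i++)
		if (strcmp(nodes[i], compat) == 0)
			return true;
	return false;
}

struct model_platform {
	const char *const *nodes;	/* pretend device tree */
	int n_nodes;
	bool probed;			/* has the lookup run yet? */
	bool forces_virtio_dma;		/* cached answer */
};

/*
 * Walk the tree on the first call only, then return the cached flag.
 * Not thread-safe as written; a real version would probe at init time.
 */
static bool model_platform_forces_virtio_dma(struct model_platform *p)
{
	if (!p->probed) {
		p->forces_virtio_dma = model_find_compatible(p->nodes,
							     p->n_nodes,
							     "ibm,ultravisor");
		p->probed = true;
	}
	return p->forces_virtio_dma;
}
```

After the first call the check costs one load and one branch, however
many times the data path asks.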