Date: Mon, 4 Jun 2018 19:21:23 +0300
From: "Michael S.
Tsirkin"
To: Benjamin Herrenschmidt
Cc: Anshuman Khandual, virtualization@lists.linux-foundation.org,
 linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, aik@ozlabs.ru,
 robh@kernel.org, joe@perches.com, elfring@users.sourceforge.net,
 david@gibson.dropbear.id.au, jasowang@redhat.com, mpe@ellerman.id.au,
 hch@infradead.org
Subject: Re: [RFC V2] virtio: Add platform specific DMA API translation for virito devices
Message-ID: <20180604184035-mutt-send-email-mst@kernel.org>
References: <20180522063317.20956-1-khandual@linux.vnet.ibm.com>
 <20180523213703-mutt-send-email-mst@kernel.org>
 <20180604153558-mutt-send-email-mst@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
 (mx1.redhat.com [10.11.55.6]); Mon, 04 Jun 2018 16:21:25 +0000 (UTC)
X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.6]);
 Mon, 04 Jun 2018 16:21:25 +0000 (UTC) for IP:'10.11.54.5'
 DOMAIN:'int-mx05.intmail.prod.int.rdu2.redhat.com'
 HELO:'smtp.corp.redhat.com' FROM:'mst@redhat.com' RCPT:''
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 04, 2018 at 11:11:52PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2018-06-04 at 15:43 +0300, Michael S. Tsirkin wrote:
> > On Thu, May 24, 2018 at 08:27:04AM +1000, Benjamin Herrenschmidt wrote:
> > > On Wed, 2018-05-23 at 21:50 +0300, Michael S. Tsirkin wrote:
> > >
> > > > I re-read that discussion and I'm still unclear on the
> > > > original question, since I got several apparently
> > > > conflicting answers.
> > > >
> > > > I asked:
> > > >
> > > > 	Why isn't setting VIRTIO_F_IOMMU_PLATFORM on the
> > > > 	hypervisor side sufficient?
> > >
> > > I thought I had replied to this...
> > >
> > > There are a couple of reasons:
> > >
> > > - First qemu doesn't know that the guest will switch to "secure mode"
> > > in advance. There is no difference between a normal and a secure
> > > partition until the partition does the magic UV call to "enter secure
> > > mode" and qemu doesn't see any of it. So who can set the flag here ?
> >
> > The user should set it. You just tell user "to be able to use with
> > feature X, enable IOMMU".
>
> That's completely backwards. The user has no idea what that stuff is.
> And it would have to percolate all the way up the management stack,
> libvirt, kimchi, whatever else ... that's just nonsense.
>
> Especially since, as I explained in my other email, this is *not* a
> qemu problem and thus the solution shouldn't be messing around with
> qemu.

virtio is implemented in qemu though. If you prefer to stick
all your code in either guest or the UV that's your decision
but it looks like qemu could be helpful here.

For example what if you have a guest that passes physical addresses
to qemu bypassing swiotlb? Don't you want to detect
that and fail gracefully rather than crash the guest?
That's what VIRTIO_F_IOMMU_PLATFORM will do for you.

Still that's hypervisor's decision. What isn't up to the hypervisor is
the way we structure code. We made an early decision to merge a hack
with xen, among discussion about how with time DMA API will learn to
support per-device quirks and we'll be able to switch to that.
So let's do that now?

> >
> > > - Second, when using VIRTIO_F_IOMMU_PLATFORM, we also make qemu (or
> > > vhost) go through the emulated MMIO for every access to the guest,
> > > which adds additional overhead.
> > >
> > > Cheers,
> > > Ben.
> >
> > There are several answers to this. One is that we are working hard to
> > make overhead small when the mappings are static (which they would be if
> > there's no actual IOMMU).
> > So maybe especially given you are using
> > a bounce buffer on top it's not so bad - did you try to
> > benchmark?
> >
> > Another is that given the basic functionality is in there, optimizations
> > can possibly wait until per-device quirks in DMA API are supported.
>
> The point is that requiring specific qemu command line arguments isn't
> going to fly. We have additional problems due to the fact that our
> firmware (SLOF) inside qemu doesn't currently deal with iommu's etc...
> though those can be fixed.
>
> Overall, however, this seems to be the most convoluted way of achieving
> things, require user interventions where none should be needed etc...
>
> Again, what's wrong with a 2 lines hook instead that solves it all and
> completely avoids involving qemu ?
>
> Ben.

That each platform wants to add hacks in this data path function.

> > > > >
> > > > >
> > > > >  arch/powerpc/include/asm/dma-mapping.h |  6 ++++++
> > > > >  arch/powerpc/platforms/pseries/iommu.c | 11 +++++++++++
> > > > >  drivers/virtio/virtio_ring.c           | 10 ++++++++++
> > > > >  3 files changed, 27 insertions(+)
> > > > >
> > > > > diff --git a/arch/powerpc/include/asm/dma-mapping.h b/arch/powerpc/include/asm/dma-mapping.h
> > > > > index 8fa3945..056e578 100644
> > > > > --- a/arch/powerpc/include/asm/dma-mapping.h
> > > > > +++ b/arch/powerpc/include/asm/dma-mapping.h
> > > > > @@ -115,4 +115,10 @@ extern u64 __dma_get_required_mask(struct device *dev);
> > > > >  #define ARCH_HAS_DMA_MMAP_COHERENT
> > > > >
> > > > >  #endif /* __KERNEL__ */
> > > > > +
> > > > > +#define platform_forces_virtio_dma platform_forces_virtio_dma
> > > > > +
> > > > > +struct virtio_device;
> > > > > +
> > > > > +extern bool platform_forces_virtio_dma(struct virtio_device *vdev);
> > > > >  #endif /* _ASM_DMA_MAPPING_H */
> > > > > diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
> > > > > index 06f0296..a2ec15a 100644
> > > > > --- a/arch/powerpc/platforms/pseries/iommu.c
> > > > > +++ b/arch/powerpc/platforms/pseries/iommu.c
> > > > > @@ -38,6 +38,7 @@
> > > > >  #include
> > > > >  #include
> > > > >  #include
> > > > > +#include
> > > > >  #include
> > > > >  #include
> > > > >  #include
> > > > > @@ -1396,3 +1397,13 @@ static int __init disable_multitce(char *str)
> > > > >  __setup("multitce=", disable_multitce);
> > > > >
> > > > >  machine_subsys_initcall_sync(pseries, tce_iommu_bus_notifier_init);
> > > > > +
> > > > > +bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > > +{
> > > > > +	/*
> > > > > +	 * On protected guest platforms, force virtio core to use DMA
> > > > > +	 * MAP API for all virtio devices. But there can also be some
> > > > > +	 * exceptions for individual devices like virtio balloon.
> > > > > +	 */
> > > > > +	return (of_find_compatible_node(NULL, NULL, "ibm,ultravisor") != NULL);
> > > > > +}
> > > >
> > > > Isn't this kind of slow? vring_use_dma_api is on
> > > > data path and supposed to be very fast.
> > > >
> > > > > diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
> > > > > index 21d464a..47ea6c3 100644
> > > > > --- a/drivers/virtio/virtio_ring.c
> > > > > +++ b/drivers/virtio/virtio_ring.c
> > > > > @@ -141,8 +141,18 @@ struct vring_virtqueue {
> > > > >   * unconditionally on data path.
> > > > >   */
> > > > >
> > > > > +#ifndef platform_forces_virtio_dma
> > > > > +static inline bool platform_forces_virtio_dma(struct virtio_device *vdev)
> > > > > +{
> > > > > +	return false;
> > > > > +}
> > > > > +#endif
> > > > > +
> > > > >  static bool vring_use_dma_api(struct virtio_device *vdev)
> > > > >  {
> > > > > +	if (platform_forces_virtio_dma(vdev))
> > > > > +		return true;
> > > > > +
> > > > >  	if (!virtio_has_iommu_quirk(vdev))
> > > > >  		return true;
> > > > >
> > > > > --
> > > > > 2.9.3
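
To illustrate the data-path concern raised in the review ("Isn't this kind of slow?"), one possible direction is to run the platform probe once and have the hot path read a cached boolean. What follows is only a userspace sketch under stated assumptions, not kernel code and not what the patch does: slow_platform_probe() is a hypothetical stand-in for the of_find_compatible_node(NULL, NULL, "ibm,ultravisor") lookup, struct virtio_device is mocked down to a single flag, and platform_virtio_dma_init() is an invented name. It also assumes the answer cannot change after the probe, which may not hold if the guest enters secure mode later. As a side note, of_find_compatible_node() returns a node with its refcount raised, so a real version would also need a matching of_node_put().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the device-tree lookup used in the quoted
 * patch. It counts its calls so the cost of the slow path is visible. */
int probe_calls;
bool fake_ultravisor_present = true;

bool slow_platform_probe(void)
{
	probe_calls++;
	return fake_ultravisor_present;
}

/* Cached probe result. Assumption: it is filled in once, before any
 * virtqueue is used, and does not change afterwards. */
bool platform_dma_forced;

/* Hypothetical init hook: the only place that runs the slow probe. */
void platform_virtio_dma_init(void)
{
	platform_dma_forced = slow_platform_probe();
}

/* Mock of the virtio device, reduced to the one bit this sketch needs.
 * has_iommu_quirk == true models "VIRTIO_F_IOMMU_PLATFORM not set". */
struct virtio_device {
	bool has_iommu_quirk;
};

/* Hot-path check: a plain load of the cached flag, no lookup. */
static inline bool platform_forces_virtio_dma(const struct virtio_device *vdev)
{
	(void)vdev;
	return platform_dma_forced;
}

/* Same decision order as the quoted hunk in virtio_ring.c; the rest of
 * the real function is not shown in the quoted diff and is omitted. */
bool vring_use_dma_api(const struct virtio_device *vdev)
{
	if (platform_forces_virtio_dma(vdev))
		return true;

	if (!vdev->has_iommu_quirk)
		return true;

	return false;
}
```

In this sketch a caller would invoke platform_virtio_dma_init() once during setup and then call vring_use_dma_api() as often as needed; probe_calls stays at one however many times the data-path function runs. Whether a cached flag is acceptable at all depends on the objection above about per-platform hacks in this function, which caching does not address.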