Date: Wed, 27 Apr 2016 16:37:04 +0300
From: "Michael S. Tsirkin"
To: David Woodhouse
Cc: Wei Liu, qemu-devel@nongnu.org, linux-kernel@vger.kernel.org,
    pbonzini@redhat.com, peterx@redhat.com, cornelia.huck@de.ibm.com,
    Stefan Hajnoczi, Kevin Wolf, Amit Shah, qemu-block@nongnu.org,
    Jason Wang, Alex Williamson, Andy Lutomirski, Christian Borntraeger,
    virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
    Stefano Stabellini, Anthony PERARD, iommu@lists.linux-foundation.org
Subject: Re: [PATCH V2 RFC] fixup! virtio: convert to use DMA api
Message-ID: <20160427153345-mutt-send-email-mst@redhat.com>
In-Reply-To: <1461759501.118304.149.camel@infradead.org>

On Wed, Apr 27, 2016 at 01:18:21PM +0100, David Woodhouse wrote:
>
> > > On some systems, including Xen and any system with a physical device
> > > that speaks virtio behind a physical IOMMU, we must use the DMA API
> > > for virtio DMA to work at all.
> > >
> > > Add a feature bit to detect that: VIRTIO_F_IOMMU_PLATFORM.
> > >
> > > If not there, we preserve historic behavior and bypass the DMA
> > > API unless within Xen guest.
> > > This is actually required for
> > > systems, including SPARC and PPC64, where virtio-pci devices are
> > > enumerated as though they are behind an IOMMU, but the virtio host
> > > ignores the IOMMU, so we must either pretend that the IOMMU isn't
> > > there or somehow map everything as the identity.
> > >
> > > Re: non-virtio devices.
> > >
> > > It turns out that on old QEMU hosts, only emulated devices which were
> > > part of QEMU use the IOMMU. Should we want to bypass the IOMMU for such
> > > devices *only*, it would be rather easy to detect them by looking at
> > > subsystem vendor and device ID. Thus, no new interfaces are required
> > > except for virtio which always uses the same subsystem vendor and device ID.
>
> Apologies for dropping this thread; I've been travelling.
>
> But seriously, NO!
>
> I understand why you want to see this as a virtio-specific issue, but
> it isn't. And we don't *want* it to be.
>
> In the guest, drivers SHALL use the DMA API. And the DMA API SHALL do
> the right thing for each device according to its needs.
>
> So any information passed from qemu to the guest should be directed at
> the platform IOMMU code (or handled by qemu-detection quirks in the
> guest, if we must).
>
> It is *not* acceptable for the virtio drivers in the guest to just
> eschew the DMA API completely, triggered by some device-specific flag.
>
> The qemu implementation is, of course, monolithic. In qemu the fact
> that virtio doesn't get translated by the emulated IOMMU *is* actually
> down to code in the virtio implementation. I get that.
>
> But then again, it's not just virtio. *Any* device which we emulate for
> the guest could have that same issue, and appear as untranslated. (And
> assigned PCI devices currently do).
>
> Let's think about the parallel with a system-on-chip. Let's say we have
> a peripheral which got included, but which was wired up such that it
> bypasses the IOMMU and gets to do direct physical DMA.
> Is that a
> feature of that specific peripheral? Do we hack its drivers to make the
> distinction between this incarnation, and a normal discrete version of
> the same device? No! It's a feature of the *system*

One correction: it's a feature of the device in the system.
There could be a mix of devices bypassing and not
bypassing the IOMMU.

> and needs to be
> conveyed to the OS IOMMU code to do the right thing. Not to the driver.
>
> In my opinion, adding the VIRTIO_F_IOMMU_PLATFORM feature bit is
> absolutely the wrong thing to do.
>
> What we *should* do is a patchset in the guest which both fixes virtio
> drivers to *always* use the DMA API, and fixes the DMA API to DTRT at
> the same time — by detecting qemu and installing no-op DMA ops for the
> appropriate devices, perhaps.

Sounds good. And a way to detect appropriate devices could
be by looking at the feature flag, perhaps?

> Then we can look at giving qemu a way to properly indicate which
> devices it actually does DMA mapping for, so we can remove those
> heuristic assumptions.
>
> But that flag does *not* live in the virtio host←→guest ABI.
>
> --
> David Woodhouse                            Open Source Technology Centre
> David.Woodhouse@intel.com                              Intel Corporation
>
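
For illustration only, the "drivers always call the DMA API, the platform picks per-device ops" idea discussed above could be modelled roughly as below. This is a user-space toy, not kernel code: every name (toy_device, toy_dma_ops, toy_platform_setup_dma, toy_dma_map, the bypasses_iommu field) is made up for this sketch, and the "IOMMU translation" is just a fixed offset standing in for a real mapping. How the platform learns that a device bypasses the IOMMU (a feature flag, a quirk, something else) is deliberately left out, since that is the open question in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy user-space model; none of these names are real kernel interfaces. */
typedef uint64_t toy_dma_addr_t;

struct toy_device;

struct toy_dma_ops {
    toy_dma_addr_t (*map)(const struct toy_device *dev, uint64_t phys);
};

struct toy_device {
    const char *name;
    bool bypasses_iommu;            /* platform knowledge about this device */
    uint64_t iommu_offset;          /* stand-in for a real IOMMU translation */
    const struct toy_dma_ops *ops;  /* chosen by platform code, not the driver */
};

/* No-op ops: the bus address is the physical address. */
static toy_dma_addr_t toy_map_identity(const struct toy_device *dev, uint64_t phys)
{
    (void)dev;
    return phys;
}

/* Translating ops: pretend the IOMMU remaps by a fixed offset. */
static toy_dma_addr_t toy_map_translated(const struct toy_device *dev, uint64_t phys)
{
    return phys + dev->iommu_offset;
}

static const struct toy_dma_ops toy_noop_ops = { .map = toy_map_identity };
static const struct toy_dma_ops toy_iommu_ops = { .map = toy_map_translated };

/* Platform/IOMMU code decides per device; a system may contain a mix. */
void toy_platform_setup_dma(struct toy_device *dev)
{
    assert(dev != NULL);
    dev->ops = dev->bypasses_iommu ? &toy_noop_ops : &toy_iommu_ops;
}

/* Driver-facing entry point: the same call for every device. */
toy_dma_addr_t toy_dma_map(const struct toy_device *dev, uint64_t phys)
{
    assert(dev != NULL && dev->ops != NULL);
    return dev->ops->map(dev, phys);
}
```

In this model a driver only ever calls toy_dma_map(); whether the result is an identity mapping or a translated address depends solely on what the platform code installed for that particular device.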