MIME-Version: 1.0
In-Reply-To: <20160419231437-mutt-send-email-mst@redhat.com>
References: <1461004173.3765.73.camel@infradead.org>
 <20160419130732-mutt-send-email-mst@redhat.com>
 <20160419190520-mutt-send-email-mst@redhat.com>
 <20160419191914-mutt-send-email-mst@redhat.com>
 <1461083204.20056.8.camel@infradead.org>
 <20160419204907-mutt-send-email-mst@redhat.com>
 <20160419231437-mutt-send-email-mst@redhat.com>
From: Andy Lutomirski
Date: Tue, 19 Apr 2016 13:27:29 -0700
Subject: Re: [PATCH RFC] fixup! virtio: convert to use DMA api
To: "Michael S. Tsirkin"
Cc: David Woodhouse, "qemu-devel@nongnu.org Developers",
 "linux-kernel@vger.kernel.org", Paolo Bonzini, peterx@redhat.com,
 Cornelia Huck, Stefan Hajnoczi, Kevin Wolf, Amit Shah,
 qemu-block@nongnu.org, Jason Wang, Alex Williamson, Andy Lutomirski,
 Christian Borntraeger, Wei Liu, Linux Virtualization, kvm list
Content-Type: text/plain; charset=UTF-8
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 19, 2016 at 1:16 PM, Michael S. Tsirkin wrote:
> On Tue, Apr 19, 2016 at 11:01:38AM -0700, Andy Lutomirski wrote:
>> On Tue, Apr 19, 2016 at 10:49 AM, Michael S. Tsirkin wrote:
>> > On Tue, Apr 19, 2016 at 12:26:44PM -0400, David Woodhouse wrote:
>> >> On Tue, 2016-04-19 at 19:20 +0300, Michael S. Tsirkin wrote:
>> >> >
>> >> > > I thought that PLATFORM served that purpose.  Wouldn't the host
>> >> > > advertise PLATFORM support and, if the guest doesn't ack it, the host
>> >> > > device would skip translation?  Or is that problematic for vfio?
>> >> >
>> >> > Exactly that's problematic for security.
>> >> > You can't allow guest driver to decide whether device skips security.
>> >>
>> >> Right. Because fundamentally, this *isn't* a property of the endpoint
>> >> device, and doesn't live in virtio itself.
>> >>
>> >> It's a property of the platform IOMMU, and lives there.
>> >
>> > It's a property of the hypervisor virtio implementation, and lives there.
>>
>> It is now, but QEMU could, in principle, change the way it thinks
>> about it so that virtio devices would use the QEMU DMA API but ask
>> QEMU to pass everything through 1:1.  This would be entirely invisible
>> to guests but would make it be a property of the IOMMU implementation.
>> At that point, maybe QEMU could find a (platform dependent) way to
>> tell the guest what's going on.
>>
>> FWIW, as far as I can tell, PPC and SPARC really could, in principle,
>> set up 1:1 mappings in the guest so that the virtio devices would work
>> regardless of whether QEMU is ignoring the IOMMU or not -- I think the
>> only obstacle is that the PPC and SPARC 1:1 mappings are currently set
>> up with an offset.  I don't know too much about those platforms, but
>> presumably the layout could be changed so that 1:1 really was 1:1.
>>
>> --Andy
>
> Sure. Do you see any reason why the decision to do this can't be
> keyed off the virtio feature bit?

I can think of three types of virtio host:

a) virtio always bypasses the IOMMU.

b) virtio never bypasses the IOMMU (unless DMAR tables or similar say
it does) -- i.e. virtio works like any other device.

c) virtio may bypass the IOMMU depending on what the guest asks it to do.

If this is keyed off a virtio feature bit and anyone tries to
implement (c), then vfio is going to have a problem.  And, if it's
keyed off a virtio feature bit, then (a) won't work on Xen or similar
setups unless the Xen hypervisor adds a giant and probably unreliable
kludge to support it.
Meanwhile, 4.6-rc works fine under Xen on a default x86 QEMU
configuration, and I'd really like to keep it that way.

What could plausibly work using a virtio feature bit is for a device
to say "hey, I'm a new device and I support the platform-defined IOMMU
mechanism".  This bit would be *set* on default IOMMU-less QEMU
configurations and on physical virtio PCI cards.  The guest could
operate accordingly.

I'm not sure I see a good way for feature negotiation to work the
other direction, though.  PPC and SPARC could only set this bit on
emulated devices if they know that new guest kernels are in use.

--Andy
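
For illustration only, here is a rough sketch of how a "device advertises
that it follows the platform-defined IOMMU mechanism" bit might behave.
Everything in it is an assumption made for the example: the name
F_PLATFORM_IOMMU, its bit position, the Device fields, and in particular
the guest's fallback to guest-physical addresses when the bit is absent,
which the message above does not specify.  It is a toy model, not QEMU or
kernel code.

```python
from dataclasses import dataclass

# Hypothetical feature bit: the name and bit position are illustrative only.
F_PLATFORM_IOMMU = 1 << 0


@dataclass(frozen=True)
class Device:
    """Toy model of a virtio device as seen from the host side."""
    platform_has_iommu: bool  # does the platform expose an IOMMU at all?
    bypasses_iommu: bool      # does this device ignore that IOMMU?


def offered_features(dev: Device) -> int:
    """Return the feature bits the device would offer in this sketch.

    The bit is set whenever the device behaves as the platform says a
    device should.  With no IOMMU present there is nothing to bypass,
    so an IOMMU-less configuration also sets the bit.
    """
    follows_platform = not (dev.platform_has_iommu and dev.bypasses_iommu)
    return F_PLATFORM_IOMMU if follows_platform else 0


def guest_addressing(features: int) -> str:
    """Pick how the guest driver would address buffers.

    Assumption: without the bit, the guest falls back to handing the
    device guest-physical addresses directly.
    """
    if features & F_PLATFORM_IOMMU:
        return "dma_api"
    return "guest_physical"


if __name__ == "__main__":
    for dev in (Device(False, False), Device(True, False), Device(True, True)):
        print(dev, "->", guest_addressing(offered_features(dev)))
```

Note that the bit only flows from device to guest here.  The guest never
asks the device to change its behaviour, which keeps case (c) out of the
picture.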