Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754104AbcKIULY (ORCPT );
	Wed, 9 Nov 2016 15:11:24 -0500
Received: from foss.arm.com ([217.140.101.70]:34766 "EHLO foss.arm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751420AbcKIULW (ORCPT );
	Wed, 9 Nov 2016 15:11:22 -0500
Subject: Re: Summary of LPC guest MSI discussion in Santa Fe
To: Don Dutile, Will Deacon
References: <1478209178-3009-1-git-send-email-eric.auger@redhat.com>
	<20161103220205.37715b49@t450s.home>
	<20161108024559.GA20591@arm.com>
	<20161108202922.GC15676@cbox>
	<20161108163508.1bcae0c2@t450s.home>
	<58228F71.6020108@redhat.com>
	<20161109170326.GG17771@arm.com>
	<582371FB.2040808@redhat.com>
Cc: Alex Williamson, Christoffer Dall, Eric Auger,
	eric.auger.pro@gmail.com, marc.zyngier@arm.com, joro@8bytes.org,
	tglx@linutronix.de, jason@lakedaemon.net,
	linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org,
	drjones@redhat.com, linux-kernel@vger.kernel.org,
	pranav.sawargaonkar@gmail.com, iommu@lists.linux-foundation.org,
	punit.agrawal@arm.com, diana.craciun@nxp.com,
	benh@kernel.crashing.org, arnd@arndb.de, jcm@redhat.com,
	dwmw@amazon.co.uk
From: Robin Murphy
Message-ID:
Date: Wed, 9 Nov 2016 20:11:16 +0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
	Thunderbird/45.4.0
MIME-Version: 1.0
In-Reply-To: <582371FB.2040808@redhat.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3185
Lines: 65

On 09/11/16 18:59, Don Dutile wrote:
> On 11/09/2016 12:03 PM, Will Deacon wrote:
>> On Tue, Nov 08, 2016 at 09:52:33PM -0500, Don Dutile wrote:
>>> On 11/08/2016 06:35 PM, Alex Williamson wrote:
>>>> On Tue, 8 Nov 2016 21:29:22 +0100
>>>> Christoffer Dall wrote:
>>>>> Is my understanding correct, that you need to tell userspace about the
>>>>> location of the doorbell (in the IOVA space) in case (2),
>>>>> because even though the configuration of the device is handled by
>>>>> the (host) kernel through trapping of the BARs, we have to avoid the
>>>>> VFIO user programming the device to create other DMA transactions to
>>>>> this particular address, since that will obviously conflict and
>>>>> either not produce the desired DMA transactions or result in
>>>>> unintended weird interrupts?
>>
>> Yes, that's the crux of the issue.
>>
>>>> Correct, if the MSI doorbell IOVA range overlaps RAM in the VM, then
>>>> it's potentially a DMA target and we'll get bogus data on DMA read from
>>>> the device, and lose data and potentially trigger spurious interrupts
>>>> on DMA write from the device. Thanks,
>>>>
>>> That's b/c the MSI doorbells are not positioned *above* the SMMU, i.e.,
>>> they address match before the SMMU checks are done. if
>>> all DMA addrs had to go through SMMU first, then the DMA access could
>>> be ignored/rejected.
>>
>> That's actually not true :( The SMMU can't generally distinguish between
>> MSI writes and DMA writes, so it would just see a write transaction to
>> the doorbell address, regardless of how it was generated by the endpoint.
>>
>> Will
>>
> So, we have real systems where MSI doorbells are placed at the same IOVA
> that could have memory for a guest, but not at the same IOVA as memory
> on real hw ?

MSI doorbells integral to PCIe root complexes (and thus untranslatable)
typically have a programmable address, so could be anywhere. In the more
general category of "special hardware addresses", QEMU's default ARM
guest memory map puts RAM starting at 0x40000000; on the ARM Juno
platform, that happens to be where PCI config space starts; as Juno's
PCIe doesn't support ACS, peer-to-peer or anything clever, if you assign
the PCI bus to a guest (all of it, given the lack of ACS), the root
complex just sees the guest's attempts to DMA to "memory" as the device
attempting to access config space and aborts them.

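To make the Juno case concrete, here is a rough sketch of the overlap
test involved. Only the 0x40000000 base address comes from the
discussion above; the struct, the helper name, and both window sizes
are assumptions made up for illustration, not kernel or QEMU code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical half-open address range [start, start + size). */
struct addr_range {
	uint64_t start;
	uint64_t size;
};

/*
 * True if the two ranges share at least one address.
 * Assumes start + size does not wrap around for either range.
 */
static bool ranges_overlap(struct addr_range a, struct addr_range b)
{
	if (a.size == 0 || b.size == 0)
		return false;
	return a.start < b.start + b.size && b.start < a.start + a.size;
}

/* Guest RAM base in QEMU's default ARM map; the 1 GiB size is illustrative. */
static const struct addr_range guest_ram = { 0x40000000ULL, 0x40000000ULL };

/* Juno PCI config space base; the 256 MiB size is an assumption. */
static const struct addr_range host_cfg_space = { 0x40000000ULL, 0x10000000ULL };
```

With these values ranges_overlap(guest_ram, host_cfg_space) is true,
which is the situation where guest DMA to "memory" never reaches RAM.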
> How are memory holes passed to SMMU so it doesn't have this issue for
> bare-metal
> (assign an IOVA that overlaps an MSI doorbell address)?

When we *are* in full control of the IOVA space, we just carve out what
we can find as best we can - see iova_reserve_pci_windows() in
dma-iommu.c, which isn't really all that different to what x86 does
(e.g. init_reserved_iova_ranges() in amd-iommu.c). Note that we don't
actually have any way currently to discover upstream MSI doorbells
(ponder dw_pcie_msi_init() in pcie-designware.c for an example of the
problem) - the specific MSI support we have in DMA ops at the moment
only covers GICv2m or GICv3 ITS downstream of translation, but
fortunately that's the typical relevant use-case on current platforms.

Robin.
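
The "carve out what we can find" idea can be sketched as an allocator
that steps over a list of reserved ranges. This is not the kernel's
IOVA allocator and not what iova_reserve_pci_windows() does internally;
the type, the function name, and the lowest-fit policy are all
assumptions chosen to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

#define IOVA_NONE UINT64_MAX

/* Hypothetical reserved range, half-open: [start, end). */
struct resv_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Return the lowest address in [base, limit) where 'size' bytes fit
 * without touching any reserved range, or IOVA_NONE if nothing fits.
 * Assumes 'resv' is sorted by start and its entries do not overlap.
 */
static uint64_t iova_alloc_lowest(const struct resv_range *resv, size_t n,
				  uint64_t base, uint64_t limit, uint64_t size)
{
	uint64_t addr = base;

	for (size_t i = 0; i < n; i++) {
		if (resv[i].end <= addr)
			continue;	/* entirely below the candidate */
		if (resv[i].start >= addr && resv[i].start - addr >= size)
			break;		/* the gap before this range is big enough */
		addr = resv[i].end;	/* step over the reserved range */
	}

	if (size == 0 || addr > limit || limit - addr < size)
		return IOVA_NONE;
	return addr;
}
```

Any window that cannot be discovered, such as an upstream MSI doorbell,
never makes it into the reserved list, so a scheme like this cannot
protect against it.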