From: Yongji Xie
Date: Mon, 25 Apr 2016 18:05:53 +0800
Subject: Re: [RFC v6 00/10] vfio-pci: Allow to mmap sub-page MMIO BARs and MSI-X table
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, iommu@lists.linux-foundation.org, linux-doc@vger.kernel.org, alex.williamson@redhat.com
Cc: bhelgaas@google.com, aik@ozlabs.ru, benh@kernel.crashing.org, paulus@samba.org, mpe@ellerman.id.au, joro@8bytes.org, corbet@lwn.net, warrier@linux.vnet.ibm.com, zhong@linux.vnet.ibm.com, nikunj@linux.vnet.ibm.com, eric.auger@linaro.org, will.deacon@arm.com, gwshan@linux.vnet.ibm.com, alistair@popple.id.au, ruscur@russell.cc
Message-ID: <571DEC01.5040802@linux.vnet.ibm.com>
In-Reply-To: <1460976816-29294-1-git-send-email-xyjxie@linux.vnet.ibm.com>
References: <1460976816-29294-1-git-send-email-xyjxie@linux.vnet.ibm.com>

Hi Alex,

Any comment?

Thanks,
Yongji

On 2016/4/18 18:53, Yongji Xie wrote:
> The current vfio-pci implementation does not allow mmapping
> sub-page (size < PAGE_SIZE) MMIO BARs or the MSI-X table. This is
> because a sub-page BAR's MMIO page may be shared with other BARs, and
> the MSI-X table should not be accessed directly from the guest for
> security reasons.
>
> However, this easily causes performance problems for MMIO accesses in
> the guest when vfio passes through sub-page BARs or BARs containing
> the MSI-X table on the PPC64 platform. PAGE_SIZE is 64KB by default on
> PPC64, so sub-page MMIO BARs are easily left unmapped, and the whole
> MMIO page that contains the MSI-X table is left unmapped as well,
> which leads to MMIO emulation in the host.
>
> To handle the unmapped sub-page MMIO BARs, this patchset modifies the
> resource_alignment kernel parameter to enforce an alignment of at
> least PAGE_SIZE for all MMIO BARs, so that a sub-page BAR's MMIO page
> is not shared with other BARs. We also add shadow resources to the
> vfio device and put them into the holes of the MMIO pages, in case a
> hot-added device's BARs would otherwise be assigned into those holes.
> Then we can mmap sub-page MMIO BARs safely.
>
> For the unmapped MSI-X table, we think the MSI-X table is safe to
> access directly from userspace if the hardware supports interrupt
> remapping, which ensures that a given PCI device can only trigger the
> MSIs assigned to it. But the implementation of this capability is
> arch-dependent. To have a universal way to test for this capability on
> the PCI side across architectures, we introduce a new bus_flags bit,
> PCI_BUS_FLAGS_MSI_REMAP.
>
> With this patchset applied, we get almost a 100% performance
> improvement for small-block (4k) random reads when passing through an
> FC HBA containing sub-page BARs and MSI-X BARs to a guest on PPC64 in
> our test.
>
> Patch 8 is based on the proposed patchset[2].
>
> Changelog v6:
> - Rebase on vfio/next with patchset[2] applied
> - Fix some bugs of v5
> - Add three patches to make PCI_BUS_FLAGS_MSI_REMAP a
>   universal flag to test IRQ remapping
>
> Changelog v5:
> - Rebase on vfio/next
> - Change the order of patch 1,2,3
> - Move the warning "resource_alignment will not work with
>   PCI_PROBE_ONLY set" from documentation to kernel log
> - Remove IORESOURCE_WINDOW
> - Add description for parameter "resize"
> - Add PCIBIOS_MIN_ALIGNMENT to force all MMIO BARs to
>   get minimum alignment
> - Add shadow resources to make sure sub-page BAR's mmio
>   page will not be shared with hot-add BARs.
> - Add a new bit to pci_bus_flags to indicate the capability
>   of interrupt remapping on PPC64
> - Remove IOMMU_CAP_INTR_REMAP on PPC64
> - Add a property msi_remap to vfio_pci_device to cache the
>   capability of interrupt remapping
>
> Changelog v4:
> - Rebase on v4.5-rc6 with patchset[1] applied.
> - Remove resource_page_aligned kernel parameter
> - Fix some problems with resource_alignment kernel parameter
> - Modify resource_alignment kernel parameter to support multiple
>   devices.
> - Remove host bridge attribute: msi_filtered
> - Use IOMMU_CAP_INTR_REMAP to check if MSI-X table can be mmapped
> - Add IOMMU_CAP_INTR_REMAP for IODA host bridge on PPC64 platform
>
> Changelog v3:
> - Rebase on new linux kernel mainline with the patchset[1] applied.
> - Add a function to check whether a PCI BAR's mmio page is shared
>   with other BARs.
> - Add a host bridge attribute to indicate that the PCI host bridge
>   supports filtering of MSIs.
> - Use the new host bridge attribute to check if MSI-X table can
>   be mmapped instead of CONFIG_EEH.
> - Remove Kconfig option VFIO_PCI_MMAP_MSIX
>
> Changelog v2:
> - Rebase on v4.4-rc6 with the patchset[1] applied.
> - Use kernel parameter to enforce all MMIO BARs to be page aligned
>   on PCI core code instead of doing it on PPC64 arch code.
> - Remove flags: VFIO_DEVICE_FLAGS_PCI_PAGE_ALIGNED
>
> [1] http://www.spinics.net/lists/kvm/msg127812.html
> [2] http://www.spinics.net/lists/kvm/msg130256.html
>
> Yongji Xie (10):
>   PCI: Ignore resource_alignment if PCI_PROBE_ONLY was set
>   PCI: Do not Use IORESOURCE_STARTALIGN to identify bridge resources
>   PCI: Add a new option for resource_alignment to reassign alignment
>   PCI: Add support for enforcing all MMIO BARs to be page aligned
>   vfio-pci: Allow to mmap sub-page MMIO BARs if the mmio page is exclusive
>   PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag
>   iommu: Set PCI_BUS_FLAGS_MSI_REMAP if IOMMU have capability of IRQ remapping
>   PCI: Set PCI_BUS_FLAGS_MSI_REMAP if MSI controller supports IRQ remapping
>   pci-ioda: Set PCI_BUS_FLAGS_MSI_REMAP for IODA host bridge
>   vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported
>
>  Documentation/kernel-parameters.txt       |   7 +-
>  arch/powerpc/include/asm/pci.h            |   2 +
>  arch/powerpc/platforms/powernv/pci-ioda.c |   8 +++
>  drivers/iommu/iommu.c                     |  15 +++++
>  drivers/pci/msi.c                         |  12 ++++
>  drivers/pci/pci.c                         | 105 +++++++++++++++++++++++------
>  drivers/pci/probe.c                       |   3 +
>  drivers/pci/setup-bus.c                   |   9 ++-
>  drivers/vfio/pci/vfio_pci.c               |  65 +++++++++++++++---
>  drivers/vfio/pci/vfio_pci_private.h       |   8 +++
>  drivers/vfio/pci/vfio_pci_rdwr.c          |   3 +-
>  include/linux/msi.h                       |   6 +-
>  include/linux/pci.h                       |   1 +
>  13 files changed, 208 insertions(+), 36 deletions(-)
>
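
As a rough illustration of the page-sharing problem the cover letter
describes, here is a minimal userspace sketch in C. It is not code from
the patchset: struct bar, page_floor(), page_ceil() and
bar_page_is_shared() are hypothetical names made up for this example,
and the check only assumes that a BAR is unsafe to mmap when another
BAR has bytes inside the pages it occupies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a PCI BAR; not a kernel structure. */
struct bar {
	uint64_t start;	/* first byte of the BAR */
	uint64_t size;	/* length in bytes, must be > 0 */
};

/* Round addr down to a multiple of page_size (a power of two). */
static uint64_t page_floor(uint64_t addr, uint64_t page_size)
{
	return addr & ~(page_size - 1);
}

/* Round addr up to a multiple of page_size (a power of two). */
static uint64_t page_ceil(uint64_t addr, uint64_t page_size)
{
	return (addr + page_size - 1) & ~(page_size - 1);
}

/*
 * Return true if any other BAR in the array has bytes inside the pages
 * that bars[idx] occupies, i.e. mmapping those pages for bars[idx]
 * would also expose part of another BAR.
 */
static bool bar_page_is_shared(const struct bar *bars, size_t n, size_t idx,
			       uint64_t page_size)
{
	uint64_t lo, hi;
	size_t i;

	/* The mask arithmetic above only works for power-of-two sizes. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	lo = page_floor(bars[idx].start, page_size);
	hi = page_ceil(bars[idx].start + bars[idx].size, page_size);

	for (i = 0; i < n; i++) {
		if (i == idx)
			continue;
		/* Half-open interval overlap test against [lo, hi). */
		if (bars[i].start < hi && bars[i].start + bars[i].size > lo)
			return true;
	}
	return false;
}
```

For two adjacent 4KB BARs at 0x10000 and 0x11000 (made-up addresses),
bar_page_is_shared() returns false for the first BAR with a 4KB page
size and true with a 64KB page size, which is the PPC64 situation the
cover letter is concerned with. Aligning every MMIO BAR to at least
PAGE_SIZE makes the check return false in the 64KB case as well.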