2016-04-27 12:46:38

by Yongji Xie

[permalink] [raw]
Subject: [PATCH 0/5] vfio-pci: Add support for mmapping MSI-X table

Current vfio-pci implementation disallows to mmap the page
containing MSI-X table in case that users can write to MSI-X
table and generate an incorrect MSIs.

However, this will cause some performance issue when there
are some critical device registers in the same page as the
MSI-X table. We have to handle the mmio access to these
registers in QEMU emulation rather than in guest.

To solve this issue, this series allows to mmap MSI-X table
when hardware supports the capability of interrupt remapping
which can ensure that a given pci device can only shoot the
MSIs assigned for it. And we introduce a new bus_flags
PCI_BUS_FLAGS_MSI_REMAP to test this capability on PCI side
for different archs.

The patch 3 are based on the proposed patchset[1].

[1] http://www.spinics.net/lists/kvm/msg130256.html

Yongji Xie (5):
PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag
iommu: Set PCI_BUS_FLAGS_MSI_REMAP if IOMMU have capability of IRQ remapping
PCI: Set PCI_BUS_FLAGS_MSI_REMAP if MSI controller supports IRQ remapping
pci-ioda: Set PCI_BUS_FLAGS_MSI_REMAP for IODA host bridge
vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

arch/powerpc/platforms/powernv/pci-ioda.c | 8 ++++++++
drivers/iommu/iommu.c | 15 +++++++++++++++
drivers/pci/msi.c | 12 ++++++++++++
drivers/pci/probe.c | 3 +++
drivers/vfio/pci/vfio_pci.c | 7 +++++--
drivers/vfio/pci/vfio_pci_rdwr.c | 3 ++-
include/linux/msi.h | 6 +++++-
include/linux/pci.h | 1 +
8 files changed, 51 insertions(+), 4 deletions(-)

--
1.7.9.5


2016-04-27 12:46:23

by Yongji Xie

[permalink] [raw]
Subject: [PATCH 2/5] iommu: Set PCI_BUS_FLAGS_MSI_REMAP if IOMMU have capability of IRQ remapping

The capability of IRQ remapping is abstracted on IOMMU side on
some archs. There is a existing flag IOMMU_CAP_INTR_REMAP for this.

To have a universal flag to test this capability for different
archs on PCI side, we set PCI_BUS_FLAGS_MSI_REMAP for PCI buses
when IOMMU_CAP_INTR_REMAP is set.

Signed-off-by: Yongji Xie <[email protected]>
---
drivers/iommu/iommu.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0e3b009..5d2b6f6 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -813,6 +813,16 @@ struct iommu_group *pci_device_group(struct device *dev)
return group;
}

+static void pci_check_msi_remapping(struct pci_dev *pdev,
+ const struct iommu_ops *ops)
+{
+ struct pci_bus *bus = pdev->bus;
+
+ if (ops->capable(IOMMU_CAP_INTR_REMAP) &&
+ !(bus->bus_flags & PCI_BUS_FLAGS_MSI_REMAP))
+ bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
+}
+
/**
* iommu_group_get_for_dev - Find or create the IOMMU group for a device
* @dev: target device
@@ -871,6 +881,9 @@ static int add_iommu_group(struct device *dev, void *data)
const struct iommu_ops *ops = cb->ops;
int ret;

+ if (dev_is_pci(dev) && ops->capable)
+ pci_check_msi_remapping(to_pci_dev(dev), ops);
+
if (!ops->add_device)
return 0;

@@ -913,6 +926,8 @@ static int iommu_bus_notifier(struct notifier_block *nb,
* result in ADD/DEL notifiers to group->notifier
*/
if (action == BUS_NOTIFY_ADD_DEVICE) {
+ if (dev_is_pci(dev) && ops->capable)
+ pci_check_msi_remapping(to_pci_dev(dev), ops);
if (ops->add_device)
return ops->add_device(dev);
} else if (action == BUS_NOTIFY_REMOVED_DEVICE) {
--
1.7.9.5

2016-04-27 12:46:30

by Yongji Xie

[permalink] [raw]
Subject: [PATCH 3/5] PCI: Set PCI_BUS_FLAGS_MSI_REMAP if MSI controller supports IRQ remapping

On ARM HW the capability of IRQ remapping is abstracted on
MSI controller side. MSI_FLAG_IRQ_REMAPPING is used to advertise
this [1].

To have a universal flag to test this capability for different
archs on PCI side, we set PCI_BUS_FLAGS_MSI_REMAP for PCI buses
when MSI_FLAG_IRQ_REMAPPING is set.

[1] http://www.spinics.net/lists/kvm/msg130256.html

Signed-off-by: Yongji Xie <[email protected]>
---
drivers/pci/msi.c | 12 ++++++++++++
drivers/pci/probe.c | 3 +++
include/linux/msi.h | 6 +++++-
3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index a080f44..1661cdf 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1134,6 +1134,18 @@ void *msi_desc_to_pci_sysdata(struct msi_desc *desc)
}
EXPORT_SYMBOL_GPL(msi_desc_to_pci_sysdata);

+void pci_bus_check_msi_remapping(struct pci_bus *bus,
+ struct irq_domain *domain)
+{
+#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
+ struct msi_domain_info *info;
+
+ info = msi_get_domain_info(domain);
+ if (info->flags & MSI_FLAG_IRQ_REMAPPING)
+ bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
+#endif
+}
+
#ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
/**
* pci_msi_domain_write_msg - Helper to write MSI message to PCI config space
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 6d7ab9b..25cf1b1 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -696,6 +696,9 @@ static void pci_set_bus_msi_domain(struct pci_bus *bus)
if (!d)
d = pci_host_bridge_msi_domain(b);

+ if (d && b == bus)
+ pci_bus_check_msi_remapping(bus, d);
+
dev_set_msi_domain(&bus->dev, d);
}

diff --git a/include/linux/msi.h b/include/linux/msi.h
index 03eda72..b4c649e 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -15,6 +15,8 @@ extern int pci_msi_ignore_mask;
struct irq_data;
struct msi_desc;
struct pci_dev;
+struct pci_bus;
+struct irq_domain;
struct platform_msi_priv_data;
void __get_cached_msi_msg(struct msi_desc *entry, struct msi_msg *msg);
void get_cached_msi_msg(unsigned int irq, struct msi_msg *msg);
@@ -155,6 +157,9 @@ void arch_restore_msi_irqs(struct pci_dev *dev);
void default_teardown_msi_irqs(struct pci_dev *dev);
void default_restore_msi_irqs(struct pci_dev *dev);

+void pci_bus_check_msi_remapping(struct pci_bus *bus,
+ struct irq_domain *domain);
+
struct msi_controller {
struct module *owner;
struct device *dev;
@@ -173,7 +178,6 @@ struct msi_controller {
#include <linux/irqhandler.h>
#include <asm/msi.h>

-struct irq_domain;
struct irq_domain_ops;
struct irq_chip;
struct device_node;
--
1.7.9.5

2016-04-27 12:46:41

by Yongji Xie

[permalink] [raw]
Subject: [PATCH 1/5] PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag

We introduce a new pci_bus_flags, PCI_BUS_FLAGS_MSI_REMAP
which indicates all devices on the bus are protected by the
hardware which supports IRQ remapping(intel naming).

This flag will be used to know whether it's safe to expose
MSI-X tables of PCI BARs to userspace. Because the capability
of IRQ remapping can guarantee the PCI device cannot trigger
MSIs that correspond to interrupt IDs of other devices.

There is a existing flag for this in the IOMMU space:

enum iommu_cap {
IOMMU_CAP_CACHE_COHERENCY,
---> IOMMU_CAP_INTR_REMAP,
IOMMU_CAP_NOEXEC,
};

and Eric also posted a patchset [1] to abstract this
capability on MSI controller side for ARM. But it would
make sense to have a more common flag like
PCI_BUS_FLAGS_MSI_REMAP in this patch so that we can use
a universal flag to test this capability on PCI side for
different archs.

[1] http://www.spinics.net/lists/kvm/msg130256.html

Signed-off-by: Yongji Xie <[email protected]>
---
include/linux/pci.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 27df4a6..d619228 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -193,6 +193,7 @@ typedef unsigned short __bitwise pci_bus_flags_t;
enum pci_bus_flags {
PCI_BUS_FLAGS_NO_MSI = (__force pci_bus_flags_t) 1,
PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2,
+ PCI_BUS_FLAGS_MSI_REMAP = (__force pci_bus_flags_t) 4,
};

/* These values come from the PCI Express Spec */
--
1.7.9.5

2016-04-27 12:47:03

by Yongji Xie

[permalink] [raw]
Subject: [PATCH 4/5] pci-ioda: Set PCI_BUS_FLAGS_MSI_REMAP for IODA host bridge

Any IODA host bridge have the capability of IRQ remapping.
So we set PCI_BUS_FLAGS_MSI_REMAP when this kind of host birdge
is detected.

Signed-off-by: Yongji Xie <[email protected]>
---
arch/powerpc/platforms/powernv/pci-ioda.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index f90dc04..9557638 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -3080,6 +3080,12 @@ static void pnv_pci_ioda_fixup(void)
pnv_npu_ioda_fixup();
}

+int pnv_pci_ioda_root_bridge_prepare(struct pci_host_bridge *bridge)
+{
+ bridge->bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
+ return 0;
+}
+
/*
* Returns the alignment for I/O or memory windows for P2P
* bridges. That actually depends on how PEs are segmented.
@@ -3364,6 +3370,8 @@ static void __init pnv_pci_init_ioda_phb(struct device_node *np,
*/
ppc_md.pcibios_fixup = pnv_pci_ioda_fixup;

+ ppc_md.pcibios_root_bridge_prepare = pnv_pci_ioda_root_bridge_prepare;
+
if (phb->type == PNV_PHB_NPU)
hose->controller_ops = pnv_npu_ioda_controller_ops;
else
--
1.7.9.5

2016-04-27 12:47:34

by Yongji Xie

[permalink] [raw]
Subject: [PATCH 5/5] vfio-pci: Allow to mmap MSI-X table if interrupt remapping is supported

This patch enables mmapping MSI-X tables if hardware supports
interrupt remapping which can ensure that a given pci device
can only shoot the MSIs assigned for it.

With MSI-X table mmapped, we also need to expose the
read/write interface which will be used to access MSI-X table.

Signed-off-by: Yongji Xie <[email protected]>
---
drivers/vfio/pci/vfio_pci.c | 7 +++++--
drivers/vfio/pci/vfio_pci_rdwr.c | 3 ++-
2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 98059df..7eba77d 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -591,7 +591,9 @@ static long vfio_pci_ioctl(void *device_data,
pci_resource_flags(pdev, info.index) &
IORESOURCE_MEM && info.size >= PAGE_SIZE) {
info.flags |= VFIO_REGION_INFO_FLAG_MMAP;
- if (info.index == vdev->msix_bar) {
+ if (info.index == vdev->msix_bar &&
+ !(pdev->bus->bus_flags &
+ PCI_BUS_FLAGS_MSI_REMAP)) {
ret = msix_sparse_mmap_cap(vdev, &caps);
if (ret)
return ret;
@@ -1023,7 +1025,8 @@ static int vfio_pci_mmap(void *device_data, struct vm_area_struct *vma)
if (phys_len < PAGE_SIZE || req_start + req_len > phys_len)
return -EINVAL;

- if (index == vdev->msix_bar) {
+ if (index == vdev->msix_bar &&
+ !(pdev->bus->bus_flags & PCI_BUS_FLAGS_MSI_REMAP)) {
/*
* Disallow mmaps overlapping the MSI-X table; users don't
* get to touch this directly. We could find somewhere
diff --git a/drivers/vfio/pci/vfio_pci_rdwr.c b/drivers/vfio/pci/vfio_pci_rdwr.c
index 5ffd1d9..dbf9cd0 100644
--- a/drivers/vfio/pci/vfio_pci_rdwr.c
+++ b/drivers/vfio/pci/vfio_pci_rdwr.c
@@ -164,7 +164,8 @@ ssize_t vfio_pci_bar_rw(struct vfio_pci_device *vdev, char __user *buf,
} else
io = vdev->barmap[bar];

- if (bar == vdev->msix_bar) {
+ if (bar == vdev->msix_bar &&
+ !(pdev->bus->bus_flags & PCI_BUS_FLAGS_MSI_REMAP)) {
x_start = vdev->msix_offset;
x_end = vdev->msix_offset + vdev->msix_size;
}
--
1.7.9.5