2016-04-18 11:00:25

by Yongji Xie

Subject: [RFC v6 06/10] PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag

Introduce a new pci_bus_flags value, PCI_BUS_FLAGS_MSI_REMAP,
which indicates that all devices on the bus are protected by
hardware that supports IRQ remapping (Intel naming).

This flag will be used to determine whether it is safe to expose
the MSI-X tables in PCI BARs to userspace, because IRQ remapping
guarantees that a PCI device cannot trigger MSIs that correspond
to the interrupt IDs of other devices.
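
As an illustration only (this is not part of the patch), a consumer of
the flag might gate the MSI-X table mapping roughly as follows. The
struct model_bus type and the helper msix_table_mmap_is_safe() are
hypothetical userspace stand-ins; only the flag values mirror the enum
in include/linux/pci.h:

```c
#include <stdbool.h>

/* Simplified stand-in for pci_bus_flags_t (assumption, not kernel code). */
typedef unsigned short model_bus_flags_t;

/* Values mirror enum pci_bus_flags after this patch. */
enum {
	MODEL_BUS_FLAGS_NO_MSI    = 1,
	MODEL_BUS_FLAGS_NO_MMRBC  = 2,
	MODEL_BUS_FLAGS_MSI_REMAP = 4,
};

/* Hypothetical, minimal model of a PCI bus: only the flags matter here. */
struct model_bus {
	model_bus_flags_t bus_flags;
};

/*
 * Return true when every device on the bus sits behind IRQ remapping
 * hardware, i.e. when exposing the MSI-X table cannot let a device
 * raise interrupts that belong to another device.
 */
bool msix_table_mmap_is_safe(const struct model_bus *bus)
{
	return (bus->bus_flags & MODEL_BUS_FLAGS_MSI_REMAP) != 0;
}
```

A caller would refuse to map the page containing the MSI-X table
whenever this check returns false.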

There is an existing flag for this in the IOMMU space:

enum iommu_cap {
        IOMMU_CAP_CACHE_COHERENCY,
--->    IOMMU_CAP_INTR_REMAP,
        IOMMU_CAP_NOEXEC,
};

and Eric has also posted a patchset [1] that abstracts this
capability on the MSI controller side for ARM. Still, it makes
sense to have a more general flag such as
PCI_BUS_FLAGS_MSI_REMAP so that the PCI side can test this
capability with a single flag across architectures.

[1] http://www.spinics.net/lists/kvm/msg130256.html

Signed-off-by: Yongji Xie <[email protected]>
---
include/linux/pci.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 27df4a6..d619228 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -193,6 +193,7 @@ typedef unsigned short __bitwise pci_bus_flags_t;
 enum pci_bus_flags {
 	PCI_BUS_FLAGS_NO_MSI = (__force pci_bus_flags_t) 1,
 	PCI_BUS_FLAGS_NO_MMRBC = (__force pci_bus_flags_t) 2,
+	PCI_BUS_FLAGS_MSI_REMAP = (__force pci_bus_flags_t) 4,
 };

 /* These values come from the PCI Express Spec */
--
1.7.9.5


2016-04-18 11:00:47

by Yongji Xie

Subject: [RFC v6 07/10] iommu: Set PCI_BUS_FLAGS_MSI_REMAP if IOMMU has capability of IRQ remapping

On some architectures the IRQ remapping capability is abstracted
on the IOMMU side, where the existing IOMMU_CAP_INTR_REMAP flag
reports it.

To provide a single flag that the PCI side can test across
architectures, set PCI_BUS_FLAGS_MSI_REMAP on a PCI bus when the
IOMMU reports IOMMU_CAP_INTR_REMAP.

Signed-off-by: Yongji Xie <[email protected]>
---
drivers/iommu/iommu.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 0e3b009..5d2b6f6 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -813,6 +813,16 @@ struct iommu_group *pci_device_group(struct device *dev)
 	return group;
 }

+static void pci_check_msi_remapping(struct pci_dev *pdev,
+				    const struct iommu_ops *ops)
+{
+	struct pci_bus *bus = pdev->bus;
+
+	if (ops->capable(IOMMU_CAP_INTR_REMAP) &&
+	    !(bus->bus_flags & PCI_BUS_FLAGS_MSI_REMAP))
+		bus->bus_flags |= PCI_BUS_FLAGS_MSI_REMAP;
+}
+
 /**
  * iommu_group_get_for_dev - Find or create the IOMMU group for a device
  * @dev: target device
@@ -871,6 +881,9 @@ static int add_iommu_group(struct device *dev, void *data)
 	const struct iommu_ops *ops = cb->ops;
 	int ret;

+	if (dev_is_pci(dev) && ops->capable)
+		pci_check_msi_remapping(to_pci_dev(dev), ops);
+
 	if (!ops->add_device)
 		return 0;

@@ -913,6 +926,8 @@ static int iommu_bus_notifier(struct notifier_block *nb,
 	 * result in ADD/DEL notifiers to group->notifier
 	 */
 	if (action == BUS_NOTIFY_ADD_DEVICE) {
+		if (dev_is_pci(dev) && ops->capable)
+			pci_check_msi_remapping(to_pci_dev(dev), ops);
 		if (ops->add_device)
 			return ops->add_device(dev);
 	} else if (action == BUS_NOTIFY_REMOVED_DEVICE) {
--
1.7.9.5

2016-04-18 11:31:25

by David Laight

Subject: RE: [RFC v6 06/10] PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag

From: Yongji Xie
> Sent: 18 April 2016 11:59
> We introduce a new pci_bus_flags, PCI_BUS_FLAGS_MSI_REMAP
> which indicates all devices on the bus are protected by the
> hardware which supports IRQ remapping(intel naming).
>
> This flag will be used to know whether it's safe to expose
> MSI-X tables of PCI BARs to userspace. Because the capability
> of IRQ remapping can guarantee the PCI device cannot trigger
> MSIs that correspond to interrupt IDs of other devices.

I'm worried that this entire series is going to break drivers
for existing hardware.

I understand some of the reasoning for 'vm pass through' configurations,
but there will be PCIe devices out there that have the MSI-X tables
in the same BAR as other device registers.
If you are lucky nothing else is in the same 4k area, but I wouldn't
assume it.

In any case, if the hardware can't police the card's master transfers
there is nothing to stop a different bus master block on the card
from raising MSI-X interrupts - they are just a PCIe write.
So all you are doing is raising the bar slightly and giving a very false
sense of security.

David


2016-04-19 11:15:24

by Yongji Xie

Subject: Re: [RFC v6 06/10] PCI: Add a new PCI_BUS_FLAGS_MSI_REMAP flag

On 2016/4/18 19:30, David Laight wrote:
> From: Yongji Xie
>> Sent: 18 April 2016 11:59
>> We introduce a new pci_bus_flags, PCI_BUS_FLAGS_MSI_REMAP
>> which indicates all devices on the bus are protected by the
>> hardware which supports IRQ remapping(intel naming).
>>
>> This flag will be used to know whether it's safe to expose
>> MSI-X tables of PCI BARs to userspace. Because the capability
>> of IRQ remapping can guarantee the PCI device cannot trigger
>> MSIs that correspond to interrupt IDs of other devices.
> I'm worried that this entire series is going to break drivers
> for existing hardware.
>
> I understand some of the reasoning for 'vm pass through' configurations,
> but there will be PCIe devices out there that have the MSI-X tables
> in the same BAR as other device registers.
> If you are lucky nothing else is in the same 4k area, but I wouldn't
> assume it.

Thanks for your comments, but I don't quite get your point here.
Why would exposing the MSI-X table to userspace break drivers
for hardware that has the MSI-X table in the same BAR as
other device registers? Could you give me more details?

The reason we want to mmap the MSI-X table is that there
may be other critical device registers in the same page
as the MSI-X table. We prefer to handle MMIO accesses to
these registers in the guest rather than in QEMU. So having
something else in the same 4k/64k area is exactly the case
we want to support.
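
To make the page-sharing case concrete, here is a rough userspace
sketch (not code from this series) that checks whether the pages
covering an MSI-X table also cover other bytes of the BAR. The
struct msix_layout type and both helper names are hypothetical, and
the sketch ignores the PBA; the 16-byte entry size comes from the
PCI spec:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MSIX_ENTRY_SIZE 16u	/* bytes per MSI-X table entry */

/* Hypothetical description of where the MSI-X table lives in a BAR. */
struct msix_layout {
	uint64_t bar_size;	/* size of the BAR in bytes */
	uint64_t table_offset;	/* offset of the MSI-X table in the BAR */
	uint32_t nr_vectors;	/* number of table entries */
};

/* Offset one past the last byte of the MSI-X table. */
uint64_t msix_table_end(const struct msix_layout *l)
{
	return l->table_offset + (uint64_t)l->nr_vectors * MSIX_ENTRY_SIZE;
}

/*
 * Return true if the page-aligned range that covers the table also
 * covers BAR bytes that are not part of the table, i.e. if other
 * registers may share a page with the table.
 */
bool msix_table_shares_page(const struct msix_layout *l, uint64_t page_size)
{
	/* page_size must be a non-zero power of two (4k, 64k, ...) */
	assert(page_size && (page_size & (page_size - 1)) == 0);

	uint64_t mask = page_size - 1;
	uint64_t tbl_end = msix_table_end(l);
	uint64_t first = l->table_offset & ~mask;	/* round down */
	uint64_t last = (tbl_end + mask) & ~mask;	/* round up */

	if (last > l->bar_size)
		last = l->bar_size;	/* pages cannot extend past the BAR */

	return first < l->table_offset || last > tbl_end;
}
```

With an 8-vector table at offset 0x2000, a 4k page holds 128 bytes of
table and 3968 bytes of something else; with 64k pages even a
256-vector table shares its page with the rest of a 64k BAR.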

> In any case, if the hardware can't police the card's master transfers
> there is nothing to stop a different bus master block on the card
> from raising MSI-X interrupts - they are just a PCIe write.
> So all you are doing is raising the bar slightly and giving a very false
> sense of security.

Do you mean the device could issue a DMA write to the target
address range that raises MSI-X interrupts? On PPC64 with an IODA
bridge, such an invalid PCIe write is blocked at the PHB before
any MSI-X interrupt is raised. I think interrupt remapping and
the ITS can do the same thing. If the hardware does not
support this, my patch does not expose the MSI-X table.
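
The kind of check I have in mind can be pictured with a toy model
like the one below. This is only a sketch under simplifying
assumptions, not how the PHB, VT-d or the ITS are actually
implemented: struct toy_irte, struct toy_remap_table and
toy_msi_write() are made-up names, and a real remapping table is
laid out differently. The point is only that the hardware validates
the requester ID of the MSI write against the owner of the targeted
interrupt:

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_ENTRIES 8

/* Toy remapping entry: which requester may raise which vector. */
struct toy_irte {
	bool     valid;
	uint16_t owner_rid;	/* requester ID (bus/dev/fn) that owns it */
	unsigned vector;	/* interrupt delivered on a valid write */
};

struct toy_remap_table {
	struct toy_irte entry[TOY_NR_ENTRIES];
};

/*
 * Model an MSI/MSI-X message arriving as a PCIe write from source_rid
 * that targets remapping entry 'index'. The write is turned into an
 * interrupt only if the entry is valid and owned by the sender;
 * otherwise it is dropped and *vector is left untouched.
 */
bool toy_msi_write(const struct toy_remap_table *t, uint16_t source_rid,
		   uint32_t index, unsigned *vector)
{
	if (index >= TOY_NR_ENTRIES)
		return false;

	const struct toy_irte *e = &t->entry[index];

	if (!e->valid || e->owner_rid != source_rid)
		return false;	/* spoofed or stale write: blocked */

	*vector = e->vector;
	return true;
}
```

In this model a device with requester ID 0x0200 that writes to an
entry owned by 0x0100 simply gets its write dropped, no matter what
it programs into its own MSI-X table.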

Thanks,
Yongji