2016-12-13 20:41:03

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 00/16] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions

Following LPC discussions, we now report reserved regions through
iommu-group sysfs reserved_regions attribute file.

Reserved regions are populated through the IOMMU get_resv_region
callback (former get_dm_regions), now implemented by amd-iommu,
intel-iommu and arm-smmu:
- the amd-iommu reports device direct mapped regions.
- the intel-iommu reports the [0xfee00000 - 0xfeefffff] MSI window
as an IOMMU_RESV_NOMAP reserved region.
- the arm-smmu reports the MSI window (arbitrarily located at
0x8000000 and 1MB large).

Unsafe interrupt assignment is tested by enumerating all MSI irq
domains and checking they support MSI remapping. This check is done
in case we detect the iommu translates MSI (an IOMMU_RESV_MSI
window exists). Otherwise the IRQ remapping capability is checked
at IOMMU level. Obviously this is a defensive IRQ safety assessment.
Assuming there are several MSI controllers in the system and at
least one does not implement IRQ remapping, the assignment will be
considered as unsafe (even if this controller is not acessible from
the assigned devices).

The series integrates a not officially posted patch from Robin:
"iommu/dma: Allow MSI-only cookies".

Best Regards

Eric

Git: complete series available at
https://github.com/eauger/linux/tree/v4.9-reserved-v4

History:

RFCv3 -> RFCv4:
- arm-smmu driver does not register PCI host bridge windows as
reserved regions anymore
- Implement reserved region get/put callbacks also in arm-smmuv3
- take the iommu_group lock on iommu_get_group_resv_regions
- add a type field in iommu_resv_region instead of using prot
- init the region list_head in iommu_alloc_resv_region, also
add type parameter
- iommu_insert_resv_region manage overlaps and sort reserved
windows
- address IRQ safety assessment by enumerating all the MSI irq
domains and checking the MSI_REMAP flag
- update Documentation/ABI/testing/sysfs-kernel-iommu_groups
- Did not add T-b since the code has significantly changed

RFCv2 -> RFCv3:
- switch to an iommu-group sysfs API
- use new dummy allocator provided by Robin
- dummy allocator initialized by vfio-iommu-type1 after enumerating
the reserved regions
- at the moment ARM MSI base address/size is left unchanged compared
to v2
- we currently report reserved regions and not usable IOVA regions as
requested by Alex

RFC v1 -> v2:
- fix intel_add_reserved_regions
- add mutex lock/unlock in vfio_iommu_type1


Eric Auger (16):
iommu/dma: Allow MSI-only cookies
iommu: Rename iommu_dm_regions into iommu_resv_regions
iommu: Add a new type field in iommu_resv_region
iommu: iommu_alloc_resv_region
iommu: Only map direct mapped regions
iommu: iommu_get_group_resv_regions
iommu: Implement reserved_regions iommu-group sysfs file
iommu/vt-d: Implement reserved region get/put callbacks
iommu/arm-smmu: Implement reserved region get/put callbacks
iommu/arm-smmu-v3: Implement reserved region get/put callbacks
irqdomain: Add IRQ_DOMAIN_FLAG_MSI_REMAP value
irqdomain: irq_domain_check_msi_remap
irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP
vfio/type1: Allow transparent MSI IOVA allocation
vfio/type1: Check MSI remapping at irq domain level
iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore

.../ABI/testing/sysfs-kernel-iommu_groups | 9 ++
drivers/iommu/amd_iommu.c | 21 +--
drivers/iommu/arm-smmu-v3.c | 30 +++-
drivers/iommu/arm-smmu.c | 30 +++-
drivers/iommu/dma-iommu.c | 116 +++++++++++++---
drivers/iommu/intel-iommu.c | 50 +++++--
drivers/iommu/iommu.c | 152 +++++++++++++++++++--
drivers/irqchip/irq-gic-v3-its.c | 1 +
drivers/vfio/vfio_iommu_type1.c | 37 ++++-
include/linux/dma-iommu.h | 7 +
include/linux/iommu.h | 46 +++++--
include/linux/irqdomain.h | 8 ++
kernel/irq/irqdomain.c | 24 ++++
13 files changed, 455 insertions(+), 76 deletions(-)

--
1.9.1


2016-12-13 20:32:29

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 05/16] iommu: Only map direct mapped regions

As we introduced new reserved region types which do not require
mapping, let's make sure we only map direct mapped regions.

Signed-off-by: Eric Auger <[email protected]>

---

v3 -> v4:
- use region's type and reword commit message and title
---
drivers/iommu/iommu.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index f83d762..ae3d97b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -343,6 +343,9 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
start = ALIGN(entry->start, pg_size);
end = ALIGN(entry->start + entry->length, pg_size);

+ if (entry->type != IOMMU_RESV_DIRECT)
+ continue;
+
for (addr = start; addr < end; addr += pg_size) {
phys_addr_t phys_addr;

--
1.9.1

2016-12-13 20:32:45

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 16/16] iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore

IOMMU_CAP_INTR_REMAP has been advertised in arm-smmu(-v3) although
on ARM this property is not attached to the IOMMU but rather is
implemented in the MSI controller (GICv3 ITS).

Now vfio_iommu_type1 takes into account the MSI domain MSI remapping
capability, let's correct this.

Signed-off-by: Eric Auger <[email protected]>
---
drivers/iommu/arm-smmu-v3.c | 2 --
drivers/iommu/arm-smmu.c | 2 --
2 files changed, 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index c9ee2f5..0d472c8 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1373,8 +1373,6 @@ static bool arm_smmu_capable(enum iommu_cap cap)
switch (cap) {
case IOMMU_CAP_CACHE_COHERENCY:
return true;
- case IOMMU_CAP_INTR_REMAP:
- return true; /* MSIs are just memory writes */
case IOMMU_CAP_NOEXEC:
return true;
default:
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index d24a646..27a851e 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1371,8 +1371,6 @@ static bool arm_smmu_capable(enum iommu_cap cap)
* requests.
*/
return true;
- case IOMMU_CAP_INTR_REMAP:
- return true; /* MSIs are just memory writes */
case IOMMU_CAP_NOEXEC:
return true;
default:
--
1.9.1

2016-12-13 20:32:25

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 04/16] iommu: iommu_alloc_resv_region

Introduce a new helper serving the purpose to allocate a reserved
region. This will be used in iommu driver implementing reserved
region callbacks.

Signed-off-by: Eric Auger <[email protected]>

---

v3 -> v4:
- add INIT_LIST_HEAD(&region->list)
- use int for prot param and add int type param
- remove implementation outside of CONFIG_IOMMU_API
---
drivers/iommu/iommu.c | 18 ++++++++++++++++++
include/linux/iommu.h | 2 ++
2 files changed, 20 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index c7ed334..f83d762 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1562,6 +1562,24 @@ void iommu_put_resv_regions(struct device *dev, struct list_head *list)
ops->put_resv_regions(dev, list);
}

+struct iommu_resv_region *iommu_alloc_resv_region(phys_addr_t start,
+ size_t length,
+ int prot, int type)
+{
+ struct iommu_resv_region *region;
+
+ region = kzalloc(sizeof(*region), GFP_KERNEL);
+ if (!region)
+ return NULL;
+
+ INIT_LIST_HEAD(&region->list);
+ region->start = start;
+ region->length = length;
+ region->prot = prot;
+ region->type = type;
+ return region;
+}
+
/* Request that a device is direct mapped by the IOMMU */
int iommu_request_dm_for_dev(struct device *dev)
{
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ec87bac..66b0c1c 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -244,6 +244,8 @@ extern void iommu_set_fault_handler(struct iommu_domain *domain,
extern void iommu_get_resv_regions(struct device *dev, struct list_head *list);
extern void iommu_put_resv_regions(struct device *dev, struct list_head *list);
extern int iommu_request_dm_for_dev(struct device *dev);
+extern struct iommu_resv_region *
+iommu_alloc_resv_region(phys_addr_t start, size_t length, int prot, int type);

extern int iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group);
--
1.9.1

2016-12-13 20:32:20

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 03/16] iommu: Add a new type field in iommu_resv_region

We introduce a new field to differentiate the reserved region
types and specialize the apply_resv_region implementation.

Legacy direct mapped regions have IOMMU_RESV_DIRECT type.
We introduce 2 new reserved memory types:
- IOMMU_RESV_MSI will characterize MSI regions
- IOMMU_RESV_NOMAP characterize regions that cannot by mapped.

Signed-off-by: Eric Auger <[email protected]>
---
drivers/iommu/amd_iommu.c | 1 +
include/linux/iommu.h | 7 +++++++
2 files changed, 8 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index a6c351d..3186c05 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3184,6 +3184,7 @@ static void amd_iommu_get_resv_regions(struct device *dev,

region->start = entry->address_start;
region->length = entry->address_end - entry->address_start;
+ region->type = IOMMU_RESV_DIRECT;
if (entry->prot & IOMMU_PROT_IR)
region->prot |= IOMMU_READ;
if (entry->prot & IOMMU_PROT_IW)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 7f6ebd0..ec87bac 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -117,18 +117,25 @@ enum iommu_attr {
DOMAIN_ATTR_MAX,
};

+/* These are the possible reserved region types */
+#define IOMMU_RESV_DIRECT (1 << 0)
+#define IOMMU_RESV_NOMAP (1 << 1)
+#define IOMMU_RESV_MSI (1 << 2)
+
/**
* struct iommu_resv_region - descriptor for a reserved memory region
* @list: Linked list pointers
* @start: System physical start address of the region
* @length: Length of the region in bytes
* @prot: IOMMU Protection flags (READ/WRITE/...)
+ * @type: Type of the reserved region
*/
struct iommu_resv_region {
struct list_head list;
phys_addr_t start;
size_t length;
int prot;
+ int type;
};

#ifdef CONFIG_IOMMU_API
--
1.9.1

2016-12-13 20:38:36

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 01/16] iommu/dma: Allow MSI-only cookies

IOMMU domain users such as VFIO face a similar problem to DMA API ops
with regard to mapping MSI messages in systems where the MSI write is
subject to IOMMU translation. With the relevant infrastructure now in
place for managed DMA domains, it's actually really simple for other
users to piggyback off that and reap the benefits without giving up
their own IOVA management, and without having to reinvent their own
wheel in the MSI layer.

Allow such users to opt into automatic MSI remapping by dedicating a
region of their IOVA space to a managed cookie, and extend the mapping
routine to implement a trivial linear allocator in such cases, to avoid
the needless overhead of a full-blown IOVA domain.

Signed-off-by: Robin Murphy <[email protected]>
Signed-off-by: Eric Auger <[email protected]>

---
[Eric] fixed 2 issues on MSI path
---
drivers/iommu/dma-iommu.c | 116 +++++++++++++++++++++++++++++++++++++---------
include/linux/dma-iommu.h | 7 +++
2 files changed, 101 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index c5ab866..793103f 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -37,10 +37,19 @@ struct iommu_dma_msi_page {
phys_addr_t phys;
};

+enum iommu_dma_cookie_type {
+ IOMMU_DMA_IOVA_COOKIE,
+ IOMMU_DMA_MSI_COOKIE,
+};
+
struct iommu_dma_cookie {
- struct iova_domain iovad;
- struct list_head msi_page_list;
- spinlock_t msi_lock;
+ union {
+ struct iova_domain iovad;
+ dma_addr_t msi_iova;
+ };
+ struct list_head msi_page_list;
+ spinlock_t msi_lock;
+ enum iommu_dma_cookie_type type;
};

static inline struct iova_domain *cookie_iovad(struct iommu_domain *domain)
@@ -53,6 +62,19 @@ int iommu_dma_init(void)
return iova_cache_get();
}

+static struct iommu_dma_cookie *__cookie_alloc(enum iommu_dma_cookie_type type)
+{
+ struct iommu_dma_cookie *cookie;
+
+ cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+ if (cookie) {
+ spin_lock_init(&cookie->msi_lock);
+ INIT_LIST_HEAD(&cookie->msi_page_list);
+ cookie->type = type;
+ }
+ return cookie;
+}
+
/**
* iommu_get_dma_cookie - Acquire DMA-API resources for a domain
* @domain: IOMMU domain to prepare for DMA-API usage
@@ -62,25 +84,53 @@ int iommu_dma_init(void)
*/
int iommu_get_dma_cookie(struct iommu_domain *domain)
{
+ if (domain->iova_cookie)
+ return -EEXIST;
+
+ domain->iova_cookie = __cookie_alloc(IOMMU_DMA_IOVA_COOKIE);
+ if (!domain->iova_cookie)
+ return -ENOMEM;
+
+ return 0;
+}
+EXPORT_SYMBOL(iommu_get_dma_cookie);
+
+/**
+ * iommu_get_msi_cookie - Acquire just MSI remapping resources
+ * @domain: IOMMU domain to prepare
+ * @base: Start address of IOVA region for MSI mappings
+ *
+ * Users who manage their own IOVA allocation and do not want DMA API support,
+ * but would still like to take advantage of automatic MSI remapping, can use
+ * this to initialise their own domain appropriately. Users should reserve a
+ * contiguous IOVA region, starting at @base, large enough to accommodate the
+ * number of PAGE_SIZE mappings necessary to cover every MSI doorbell address
+ * used by the devices attached to @domain.
+ */
+int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base)
+{
struct iommu_dma_cookie *cookie;

+ if (domain->type != IOMMU_DOMAIN_UNMANAGED)
+ return -EINVAL;
+
if (domain->iova_cookie)
return -EEXIST;

- cookie = kzalloc(sizeof(*cookie), GFP_KERNEL);
+ cookie = __cookie_alloc(IOMMU_DMA_MSI_COOKIE);
if (!cookie)
return -ENOMEM;

- spin_lock_init(&cookie->msi_lock);
- INIT_LIST_HEAD(&cookie->msi_page_list);
+ cookie->msi_iova = base;
domain->iova_cookie = cookie;
return 0;
}
-EXPORT_SYMBOL(iommu_get_dma_cookie);
+EXPORT_SYMBOL(iommu_get_msi_cookie);

/**
* iommu_put_dma_cookie - Release a domain's DMA mapping resources
- * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
+ * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie() or
+ * iommu_get_msi_cookie()
*
* IOMMU drivers should normally call this from their domain_free callback.
*/
@@ -92,7 +142,7 @@ void iommu_put_dma_cookie(struct iommu_domain *domain)
if (!cookie)
return;

- if (cookie->iovad.granule)
+ if (cookie->type == IOMMU_DMA_IOVA_COOKIE && cookie->iovad.granule)
put_iova_domain(&cookie->iovad);

list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list) {
@@ -137,11 +187,12 @@ static void iova_reserve_pci_windows(struct pci_dev *dev,
int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
u64 size, struct device *dev)
{
- struct iova_domain *iovad = cookie_iovad(domain);
+ struct iommu_dma_cookie *cookie = domain->iova_cookie;
+ struct iova_domain *iovad = &cookie->iovad;
unsigned long order, base_pfn, end_pfn;

- if (!iovad)
- return -ENODEV;
+ if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE)
+ return -EINVAL;

/* Use the smallest supported page size for IOVA granularity */
order = __ffs(domain->pgsize_bitmap);
@@ -644,11 +695,21 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
{
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iommu_dma_msi_page *msi_page;
- struct iova_domain *iovad = &cookie->iovad;
+ struct iova_domain *iovad;
struct iova *iova;
int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+ size_t size;
+
+ if (cookie->type == IOMMU_DMA_IOVA_COOKIE) {
+ iovad = &cookie->iovad;
+ size = iovad->granule;
+ } else {
+ iovad = NULL;
+ size = PAGE_SIZE;
+ }
+
+ msi_addr &= ~(phys_addr_t)(size - 1);

- msi_addr &= ~(phys_addr_t)iova_mask(iovad);
list_for_each_entry(msi_page, &cookie->msi_page_list, list)
if (msi_page->phys == msi_addr)
return msi_page;
@@ -657,13 +718,18 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
if (!msi_page)
return NULL;

- iova = __alloc_iova(domain, iovad->granule, dma_get_mask(dev));
- if (!iova)
- goto out_free_page;
-
msi_page->phys = msi_addr;
- msi_page->iova = iova_dma_addr(iovad, iova);
- if (iommu_map(domain, msi_page->iova, msi_addr, iovad->granule, prot))
+ if (iovad) {
+ iova = __alloc_iova(domain, size, dma_get_mask(dev));
+ if (!iova)
+ goto out_free_page;
+ msi_page->iova = iova_dma_addr(iovad, iova);
+ } else {
+ msi_page->iova = cookie->msi_iova;
+ cookie->msi_iova += size;
+ }
+
+ if (iommu_map(domain, msi_page->iova, msi_addr, size, prot))
goto out_free_iova;

INIT_LIST_HEAD(&msi_page->list);
@@ -671,7 +737,10 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
return msi_page;

out_free_iova:
- __free_iova(iovad, iova);
+ if (iovad)
+ __free_iova(iovad, iova);
+ else
+ cookie->msi_iova -= size;
out_free_page:
kfree(msi_page);
return NULL;
@@ -712,7 +781,10 @@ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg)
msg->data = ~0U;
} else {
msg->address_hi = upper_32_bits(msi_page->iova);
- msg->address_lo &= iova_mask(&cookie->iovad);
+ if (cookie->type == IOMMU_DMA_IOVA_COOKIE)
+ msg->address_lo &= iova_mask(&cookie->iovad);
+ else
+ msg->address_lo &= (PAGE_SIZE - 1);
msg->address_lo += lower_32_bits(msi_page->iova);
}
}
diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
index 32c5890..e27f2997 100644
--- a/include/linux/dma-iommu.h
+++ b/include/linux/dma-iommu.h
@@ -27,6 +27,7 @@

/* Domain management interface for IOMMU drivers */
int iommu_get_dma_cookie(struct iommu_domain *domain);
+int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base);
void iommu_put_dma_cookie(struct iommu_domain *domain);

/* Setup call for arch DMA mapping code */
@@ -82,6 +83,12 @@ static inline int iommu_get_dma_cookie(struct iommu_domain *domain)
return -ENODEV;
}

+static inline int iommu_get_msi_cookie(struct iommu_domain *domain,
+ dma_addr_t base)
+{
+ return -ENODEV;
+}
+
static inline void iommu_put_dma_cookie(struct iommu_domain *domain)
{
}
--
1.9.1

2016-12-13 20:42:44

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 02/16] iommu: Rename iommu_dm_regions into iommu_resv_regions

We want to extend the callbacks used for dm regions and
use them for reserved regions. Reserved regions can be
- directly mapped regions
- regions that cannot be iommu mapped (PCI host bridge windows, ...)
- MSI regions (because they belong to another address space or because
they are not translated by the IOMMU and need special handling)

So let's rename the struct and also the callbacks.

Signed-off-by: Eric Auger <[email protected]>
Acked-by: Robin Murphy <[email protected]>

---

v3 -> v4:
- add Robin's A-b
---
drivers/iommu/amd_iommu.c | 20 ++++++++++----------
drivers/iommu/iommu.c | 22 +++++++++++-----------
include/linux/iommu.h | 29 +++++++++++++++--------------
3 files changed, 36 insertions(+), 35 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 754595e..a6c351d 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3159,8 +3159,8 @@ static bool amd_iommu_capable(enum iommu_cap cap)
return false;
}

-static void amd_iommu_get_dm_regions(struct device *dev,
- struct list_head *head)
+static void amd_iommu_get_resv_regions(struct device *dev,
+ struct list_head *head)
{
struct unity_map_entry *entry;
int devid;
@@ -3170,7 +3170,7 @@ static void amd_iommu_get_dm_regions(struct device *dev,
return;

list_for_each_entry(entry, &amd_iommu_unity_map, list) {
- struct iommu_dm_region *region;
+ struct iommu_resv_region *region;

if (devid < entry->devid_start || devid > entry->devid_end)
continue;
@@ -3193,18 +3193,18 @@ static void amd_iommu_get_dm_regions(struct device *dev,
}
}

-static void amd_iommu_put_dm_regions(struct device *dev,
+static void amd_iommu_put_resv_regions(struct device *dev,
struct list_head *head)
{
- struct iommu_dm_region *entry, *next;
+ struct iommu_resv_region *entry, *next;

list_for_each_entry_safe(entry, next, head, list)
kfree(entry);
}

-static void amd_iommu_apply_dm_region(struct device *dev,
+static void amd_iommu_apply_resv_region(struct device *dev,
struct iommu_domain *domain,
- struct iommu_dm_region *region)
+ struct iommu_resv_region *region)
{
struct dma_ops_domain *dma_dom = to_dma_ops_domain(to_pdomain(domain));
unsigned long start, end;
@@ -3228,9 +3228,9 @@ static void amd_iommu_apply_dm_region(struct device *dev,
.add_device = amd_iommu_add_device,
.remove_device = amd_iommu_remove_device,
.device_group = amd_iommu_device_group,
- .get_dm_regions = amd_iommu_get_dm_regions,
- .put_dm_regions = amd_iommu_put_dm_regions,
- .apply_dm_region = amd_iommu_apply_dm_region,
+ .get_resv_regions = amd_iommu_get_resv_regions,
+ .put_resv_regions = amd_iommu_put_resv_regions,
+ .apply_resv_region = amd_iommu_apply_resv_region,
.pgsize_bitmap = AMD_IOMMU_PGSIZES,
};

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 9a2f196..c7ed334 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -318,7 +318,7 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
struct device *dev)
{
struct iommu_domain *domain = group->default_domain;
- struct iommu_dm_region *entry;
+ struct iommu_resv_region *entry;
struct list_head mappings;
unsigned long pg_size;
int ret = 0;
@@ -331,14 +331,14 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
pg_size = 1UL << __ffs(domain->pgsize_bitmap);
INIT_LIST_HEAD(&mappings);

- iommu_get_dm_regions(dev, &mappings);
+ iommu_get_resv_regions(dev, &mappings);

/* We need to consider overlapping regions for different devices */
list_for_each_entry(entry, &mappings, list) {
dma_addr_t start, end, addr;

- if (domain->ops->apply_dm_region)
- domain->ops->apply_dm_region(dev, domain, entry);
+ if (domain->ops->apply_resv_region)
+ domain->ops->apply_resv_region(dev, domain, entry);

start = ALIGN(entry->start, pg_size);
end = ALIGN(entry->start + entry->length, pg_size);
@@ -358,7 +358,7 @@ static int iommu_group_create_direct_mappings(struct iommu_group *group,
}

out:
- iommu_put_dm_regions(dev, &mappings);
+ iommu_put_resv_regions(dev, &mappings);

return ret;
}
@@ -1546,20 +1546,20 @@ int iommu_domain_set_attr(struct iommu_domain *domain,
}
EXPORT_SYMBOL_GPL(iommu_domain_set_attr);

-void iommu_get_dm_regions(struct device *dev, struct list_head *list)
+void iommu_get_resv_regions(struct device *dev, struct list_head *list)
{
const struct iommu_ops *ops = dev->bus->iommu_ops;

- if (ops && ops->get_dm_regions)
- ops->get_dm_regions(dev, list);
+ if (ops && ops->get_resv_regions)
+ ops->get_resv_regions(dev, list);
}

-void iommu_put_dm_regions(struct device *dev, struct list_head *list)
+void iommu_put_resv_regions(struct device *dev, struct list_head *list)
{
const struct iommu_ops *ops = dev->bus->iommu_ops;

- if (ops && ops->put_dm_regions)
- ops->put_dm_regions(dev, list);
+ if (ops && ops->put_resv_regions)
+ ops->put_resv_regions(dev, list);
}

/* Request that a device is direct mapped by the IOMMU */
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 436dc21..7f6ebd0 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -118,13 +118,13 @@ enum iommu_attr {
};

/**
- * struct iommu_dm_region - descriptor for a direct mapped memory region
+ * struct iommu_resv_region - descriptor for a reserved memory region
* @list: Linked list pointers
* @start: System physical start address of the region
* @length: Length of the region in bytes
* @prot: IOMMU Protection flags (READ/WRITE/...)
*/
-struct iommu_dm_region {
+struct iommu_resv_region {
struct list_head list;
phys_addr_t start;
size_t length;
@@ -150,9 +150,9 @@ struct iommu_dm_region {
* @device_group: find iommu group for a particular device
* @domain_get_attr: Query domain attributes
* @domain_set_attr: Change domain attributes
- * @get_dm_regions: Request list of direct mapping requirements for a device
- * @put_dm_regions: Free list of direct mapping requirements for a device
- * @apply_dm_region: Temporary helper call-back for iova reserved ranges
+ * @get_resv_regions: Request list of reserved regions for a device
+ * @put_resv_regions: Free list of reserved regions for a device
+ * @apply_resv_region: Temporary helper call-back for iova reserved ranges
* @domain_window_enable: Configure and enable a particular window for a domain
* @domain_window_disable: Disable a particular window for a domain
* @domain_set_windows: Set the number of windows for a domain
@@ -184,11 +184,12 @@ struct iommu_ops {
int (*domain_set_attr)(struct iommu_domain *domain,
enum iommu_attr attr, void *data);

- /* Request/Free a list of direct mapping requirements for a device */
- void (*get_dm_regions)(struct device *dev, struct list_head *list);
- void (*put_dm_regions)(struct device *dev, struct list_head *list);
- void (*apply_dm_region)(struct device *dev, struct iommu_domain *domain,
- struct iommu_dm_region *region);
+ /* Request/Free a list of reserved regions for a device */
+ void (*get_resv_regions)(struct device *dev, struct list_head *list);
+ void (*put_resv_regions)(struct device *dev, struct list_head *list);
+ void (*apply_resv_region)(struct device *dev,
+ struct iommu_domain *domain,
+ struct iommu_resv_region *region);

/* Window handling functions */
int (*domain_window_enable)(struct iommu_domain *domain, u32 wnd_nr,
@@ -233,8 +234,8 @@ extern size_t default_iommu_map_sg(struct iommu_domain *domain, unsigned long io
extern void iommu_set_fault_handler(struct iommu_domain *domain,
iommu_fault_handler_t handler, void *token);

-extern void iommu_get_dm_regions(struct device *dev, struct list_head *list);
-extern void iommu_put_dm_regions(struct device *dev, struct list_head *list);
+extern void iommu_get_resv_regions(struct device *dev, struct list_head *list);
+extern void iommu_put_resv_regions(struct device *dev, struct list_head *list);
extern int iommu_request_dm_for_dev(struct device *dev);

extern int iommu_attach_group(struct iommu_domain *domain,
@@ -439,12 +440,12 @@ static inline void iommu_set_fault_handler(struct iommu_domain *domain,
{
}

-static inline void iommu_get_dm_regions(struct device *dev,
+static inline void iommu_get_resv_regions(struct device *dev,
struct list_head *list)
{
}

-static inline void iommu_put_dm_regions(struct device *dev,
+static inline void iommu_put_resv_regions(struct device *dev,
struct list_head *list)
{
}
--
1.9.1

2016-12-13 20:45:30

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 15/16] vfio/type1: Check MSI remapping at irq domain level

In case the IOMMU does not bypass MSI transactions (typical
case on ARM), we check all MSI controllers are IRQ remapping
capable. If not the IRQ assignment may be unsafe.

At this stage the arm-smmu-(v3) still advertise the
IOMMU_CAP_INTR_REMAP capability at IOMMU level. This will be
removed in subsequent patches.

Signed-off-by: Eric Auger <[email protected]>
---
drivers/vfio/vfio_iommu_type1.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d07fe73..a05648b 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -37,6 +37,7 @@
#include <linux/vfio.h>
#include <linux/workqueue.h>
#include <linux/dma-iommu.h>
+#include <linux/irqdomain.h>

#define DRIVER_VERSION "0.2"
#define DRIVER_AUTHOR "Alex Williamson <[email protected]>"
@@ -765,7 +766,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
struct vfio_domain *domain, *d;
struct bus_type *bus = NULL;
int ret;
- bool resv_msi;
+ bool resv_msi, msi_remap;
phys_addr_t resv_msi_base;

mutex_lock(&iommu->lock);
@@ -818,8 +819,10 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
INIT_LIST_HEAD(&domain->group_list);
list_add(&group->next, &domain->group_list);

- if (!allow_unsafe_interrupts &&
- !iommu_capable(bus, IOMMU_CAP_INTR_REMAP)) {
+ msi_remap = resv_msi ? irq_domain_check_msi_remap() :
+ iommu_capable(bus, IOMMU_CAP_INTR_REMAP);
+
+ if (!allow_unsafe_interrupts && !msi_remap) {
pr_warn("%s: No interrupt remapping support. Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
__func__);
ret = -EPERM;
--
1.9.1

2016-12-13 20:46:05

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 13/16] irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP

The GICv3 ITS is MSI remapping capable. Let's advertise
this property so that VFIO passthrough can assess IRQ safety.

Signed-off-by: Eric Auger <[email protected]>
---
drivers/irqchip/irq-gic-v3-its.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index c5dee30..39926e1 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1643,6 +1643,7 @@ static int its_init_domain(struct fwnode_handle *handle, struct its_node *its)

inner_domain->parent = its_parent;
inner_domain->bus_token = DOMAIN_BUS_NEXUS;
+ inner_domain->flags |= IRQ_DOMAIN_FLAG_MSI_REMAP;
info->ops = &its_msi_domain_ops;
info->data = its;
inner_domain->host_data = info;
--
1.9.1

2016-12-13 20:46:03

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 14/16] vfio/type1: Allow transparent MSI IOVA allocation

When attaching a group to the container, check the group's
reserved regions and test whether the IOMMU translates MSI
transactions. If yes, we initialize an IOVA allocator through
the iommu_get_msi_cookie API. This will allow the MSI IOVAs
to be transparently allocated on MSI controller's compose().

Signed-off-by: Eric Auger <[email protected]>

---

v3 -> v4:
- test region's type: IOMMU_RESV_MSI
- restructure the code to prepare for safety assessment
- reword title
---
drivers/vfio/vfio_iommu_type1.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 2ba1942..d07fe73 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -36,6 +36,7 @@
#include <linux/uaccess.h>
#include <linux/vfio.h>
#include <linux/workqueue.h>
+#include <linux/dma-iommu.h>

#define DRIVER_VERSION "0.2"
#define DRIVER_AUTHOR "Alex Williamson <[email protected]>"
@@ -734,6 +735,28 @@ static void vfio_test_domain_fgsp(struct vfio_domain *domain)
__free_pages(pages, order);
}

+static bool vfio_iommu_has_resv_msi(struct iommu_group *group,
+ phys_addr_t *base)
+{
+ struct list_head group_resv_regions;
+ struct iommu_resv_region *region, *next;
+ bool ret = false;
+
+ INIT_LIST_HEAD(&group_resv_regions);
+ iommu_get_group_resv_regions(group, &group_resv_regions);
+ list_for_each_entry(region, &group_resv_regions, list) {
+ if (region->type & IOMMU_RESV_MSI) {
+ *base = region->start;
+ ret = true;
+ goto out;
+ }
+ }
+out:
+ list_for_each_entry_safe(region, next, &group_resv_regions, list)
+ kfree(region);
+ return ret;
+}
+
static int vfio_iommu_type1_attach_group(void *iommu_data,
struct iommu_group *iommu_group)
{
@@ -742,6 +765,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
struct vfio_domain *domain, *d;
struct bus_type *bus = NULL;
int ret;
+ bool resv_msi;
+ phys_addr_t resv_msi_base;

mutex_lock(&iommu->lock);

@@ -788,6 +813,8 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
if (ret)
goto out_domain;

+ resv_msi = vfio_iommu_has_resv_msi(iommu_group, &resv_msi_base);
+
INIT_LIST_HEAD(&domain->group_list);
list_add(&group->next, &domain->group_list);

@@ -834,6 +861,9 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
if (ret)
goto out_detach;

+ if (resv_msi && iommu_get_msi_cookie(domain->domain, resv_msi_base))
+ goto out_detach;
+
list_add(&domain->next, &iommu->domain_list);

mutex_unlock(&iommu->lock);
--
1.9.1

2016-12-13 20:46:35

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 12/16] irqdomain: irq_domain_check_msi_remap

This new function checks whether all platform and PCI
MSI domains implement IRQ remapping. This is useful to
understand whether VFIO passthrough is safe with respect
to interrupts.

On ARM typically an MSI controller can sit downstream
to the IOMMU without preventing VFIO passthrough.
As such any assigned device can write into the MSI doorbell.
In case the MSI controller implements IRQ remapping, assigned
devices will not be able to trigger interrupts towards the
host. On the contrary, the assignment must be emphasized as
unsafe with respect to interrupts.

Signed-off-by: Eric Auger <[email protected]>
---
include/linux/irqdomain.h | 5 +++++
kernel/irq/irqdomain.c | 24 ++++++++++++++++++++++++
2 files changed, 29 insertions(+)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index ab017b2..b279dd1 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -219,6 +219,7 @@ struct irq_domain *irq_domain_add_legacy(struct device_node *of_node,
void *host_data);
extern struct irq_domain *irq_find_matching_fwspec(struct irq_fwspec *fwspec,
enum irq_domain_bus_token bus_token);
+extern bool irq_domain_check_msi_remap(void);
extern void irq_set_default_host(struct irq_domain *host);
extern int irq_domain_alloc_descs(int virq, unsigned int nr_irqs,
irq_hw_number_t hwirq, int node,
@@ -491,6 +492,10 @@ static inline struct irq_domain *irq_find_matching_fwnode(
{
return NULL;
}
+static inline bool irq_domain_check_msi_remap(void)
+{
+ return false;
+}
#endif /* !CONFIG_IRQ_DOMAIN */

#endif /* _LINUX_IRQDOMAIN_H */
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 8c0a0ae..d9a6fc5 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -278,6 +278,30 @@ struct irq_domain *irq_find_matching_fwspec(struct irq_fwspec *fwspec,
EXPORT_SYMBOL_GPL(irq_find_matching_fwspec);

/**
+ * irq_domain_check_msi_remap() - Checks whether all PCI and
+ * platform MSI irq domains implement IRQ remapping
+ */
+bool irq_domain_check_msi_remap(void)
+{
+ struct irq_domain *h;
+ bool ret = true;
+
+ mutex_lock(&irq_domain_mutex);
+ list_for_each_entry(h, &irq_domain_list, link) {
+ if (((h->bus_token & DOMAIN_BUS_PCI_MSI) ||
+ (h->bus_token & DOMAIN_BUS_PLATFORM_MSI)) &&
+ !(h->flags & IRQ_DOMAIN_FLAG_MSI_REMAP)) {
+ ret = false;
+ goto out;
+ }
+ }
+out:
+ mutex_unlock(&irq_domain_mutex);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(irq_domain_check_msi_remap);
+
+/**
* irq_set_default_host() - Set a "default" irq domain
* @domain: default domain pointer
*
--
1.9.1

2016-12-13 20:47:10

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 10/16] iommu/arm-smmu-v3: Implement reserved region get/put callbacks

iommu/arm-smmu: Implement reserved region get/put callbacks

The get() populates the list with the MSI IOVA reserved window.

At the moment an arbitray MSI IOVA window is set at 0x8000000
of size 1MB. This will allow to report those info in iommu-group
sysfs.

Signed-off-by: Eric Auger <[email protected]>

---

v4: creation
---
drivers/iommu/arm-smmu-v3.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index e6f9b2d..c9ee2f5 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -410,6 +410,9 @@
/* High-level queue structures */
#define ARM_SMMU_POLL_TIMEOUT_US 100

+#define MSI_IOVA_BASE 0x8000000
+#define MSI_IOVA_LENGTH 0x100000
+
static bool disable_bypass;
module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
MODULE_PARM_DESC(disable_bypass,
@@ -1880,6 +1883,29 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
return iommu_fwspec_add_ids(dev, args->args, 1);
}

+static void arm_smmu_get_resv_regions(struct device *dev,
+ struct list_head *head)
+{
+ struct iommu_resv_region *region;
+ int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+
+ region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+ prot, IOMMU_RESV_MSI);
+ if (!region)
+ return;
+
+ list_add_tail(&region->list, head);
+}
+
+static void arm_smmu_put_resv_regions(struct device *dev,
+ struct list_head *head)
+{
+ struct iommu_resv_region *entry, *next;
+
+ list_for_each_entry_safe(entry, next, head, list)
+ kfree(entry);
+}
+
static struct iommu_ops arm_smmu_ops = {
.capable = arm_smmu_capable,
.domain_alloc = arm_smmu_domain_alloc,
@@ -1895,6 +1921,8 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
.domain_get_attr = arm_smmu_domain_get_attr,
.domain_set_attr = arm_smmu_domain_set_attr,
.of_xlate = arm_smmu_of_xlate,
+ .get_resv_regions = arm_smmu_get_resv_regions,
+ .put_resv_regions = arm_smmu_put_resv_regions,
.pgsize_bitmap = -1UL, /* Restricted during device attach */
};

--
1.9.1

2016-12-13 20:47:08

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 08/16] iommu/vt-d: Implement reserved region get/put callbacks

This patch registers the [0xfee00000 - 0xfeefffff] 1MB
MSI range as a reserved region. This will allow to report
that range in the iommu-group sysfs.

Signed-off-by: Eric Auger <[email protected]>

---

RFCv2 -> RFCv3:
- use get/put_resv_region callbacks.

RFC v1 -> RFC v2:
- fix intel_iommu_add_reserved_regions name
- use IOAPIC_RANGE_START and IOAPIC_RANGE_END defines
- return if the MSI region is already registered;
---
drivers/iommu/intel-iommu.c | 50 +++++++++++++++++++++++++++++++++------------
1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index d8376c2..06c447a 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5196,6 +5196,28 @@ static void intel_iommu_remove_device(struct device *dev)
iommu_device_unlink(iommu->iommu_dev, dev);
}

+static void intel_iommu_get_resv_regions(struct device *device,
+ struct list_head *head)
+{
+ struct iommu_resv_region *reg;
+
+ reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
+ IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
+ 0, IOMMU_RESV_NOMAP);
+ if (!reg)
+ return;
+ list_add_tail(&reg->list, head);
+}
+
+static void intel_iommu_put_resv_regions(struct device *dev,
+ struct list_head *head)
+{
+ struct iommu_resv_region *entry, *next;
+
+ list_for_each_entry_safe(entry, next, head, list)
+ kfree(entry);
+}
+
#ifdef CONFIG_INTEL_IOMMU_SVM
int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
{
@@ -5305,19 +5327,21 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev)
#endif /* CONFIG_INTEL_IOMMU_SVM */

static const struct iommu_ops intel_iommu_ops = {
- .capable = intel_iommu_capable,
- .domain_alloc = intel_iommu_domain_alloc,
- .domain_free = intel_iommu_domain_free,
- .attach_dev = intel_iommu_attach_device,
- .detach_dev = intel_iommu_detach_device,
- .map = intel_iommu_map,
- .unmap = intel_iommu_unmap,
- .map_sg = default_iommu_map_sg,
- .iova_to_phys = intel_iommu_iova_to_phys,
- .add_device = intel_iommu_add_device,
- .remove_device = intel_iommu_remove_device,
- .device_group = pci_device_group,
- .pgsize_bitmap = INTEL_IOMMU_PGSIZES,
+ .capable = intel_iommu_capable,
+ .domain_alloc = intel_iommu_domain_alloc,
+ .domain_free = intel_iommu_domain_free,
+ .attach_dev = intel_iommu_attach_device,
+ .detach_dev = intel_iommu_detach_device,
+ .map = intel_iommu_map,
+ .unmap = intel_iommu_unmap,
+ .map_sg = default_iommu_map_sg,
+ .iova_to_phys = intel_iommu_iova_to_phys,
+ .add_device = intel_iommu_add_device,
+ .remove_device = intel_iommu_remove_device,
+ .get_resv_regions = intel_iommu_get_resv_regions,
+ .put_resv_regions = intel_iommu_put_resv_regions,
+ .device_group = pci_device_group,
+ .pgsize_bitmap = INTEL_IOMMU_PGSIZES,
};

static void quirk_iommu_g4x_gfx(struct pci_dev *dev)
--
1.9.1

2016-12-13 20:47:06

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 11/16] irqdomain: Add IRQ_DOMAIN_FLAG_MSI_REMAP value

This new enum value aims at indicating whether the irq domain
implements MSI remapping. This property is useful to assess
the IRQ assignment safety at VFIO level.

Signed-off-by: Eric Auger <[email protected]>
---
include/linux/irqdomain.h | 3 +++
1 file changed, 3 insertions(+)

diff --git a/include/linux/irqdomain.h b/include/linux/irqdomain.h
index ffb8460..ab017b2 100644
--- a/include/linux/irqdomain.h
+++ b/include/linux/irqdomain.h
@@ -183,6 +183,9 @@ enum {
/* Irq domain is an IPI domain with single virq */
IRQ_DOMAIN_FLAG_IPI_SINGLE = (1 << 3),

+ /* Irq domain is MSI remapping capable */
+ IRQ_DOMAIN_FLAG_MSI_REMAP = (1 << 4),
+
/*
* Flags starting from IRQ_DOMAIN_FLAG_NONCORE are reserved
* for implementation specific purposes and ignored by the
--
1.9.1

2016-12-13 20:48:12

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 06/16] iommu: iommu_get_group_resv_regions

Introduce iommu_get_group_resv_regions whose role consists in
enumerating all devices from the group and collecting their
reserved regions. The list is sorted and overlaps are checked.

Signed-off-by: Eric Auger <[email protected]>

---

v3 -> v4:
- take the iommu_group lock in iommu_get_group_resv_regions
- the list now is sorted and overlaps are checked

NOTE:
- we do not move list elements from device to group list since
the iommu_put_resv_regions() could not be called.
- at the moment I did not introduce any iommu_put_group_resv_regions
since it simply consists in voiding/freeing the list
---
drivers/iommu/iommu.c | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/iommu.h | 8 ++++++
2 files changed, 86 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index ae3d97b..1565df4 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -133,6 +133,84 @@ static ssize_t iommu_group_show_name(struct iommu_group *group, char *buf)
return sprintf(buf, "%s\n", group->name);
}

+static int iommu_insert_resv_region(struct iommu_resv_region *new,
+ struct list_head *regions)
+{
+ struct iommu_resv_region *region;
+ phys_addr_t start = new->start;
+ phys_addr_t end = new->start + new->length - 1;
+ struct list_head *pos = regions->next;
+
+ while (pos != regions) {
+ struct iommu_resv_region *entry =
+ list_entry(pos, struct iommu_resv_region, list);
+ phys_addr_t a = entry->start;
+ phys_addr_t b = entry->start + entry->length - 1;
+
+ if (end < a) {
+ goto insert;
+ } else if (start > b) {
+ pos = pos->next;
+ } else if ((start >= a) && (end <= b)) {
+ goto done;
+ } else {
+ phys_addr_t new_start = min(a, start);
+ phys_addr_t new_end = max(b, end);
+
+ list_del(&entry->list);
+ entry->start = new_start;
+ entry->length = new_end - new_start + 1;
+ iommu_insert_resv_region(entry, regions);
+ }
+ }
+insert:
+ region = iommu_alloc_resv_region(new->start, new->length,
+ new->prot, new->type);
+ if (!region)
+ return -ENOMEM;
+
+ list_add_tail(&region->list, pos);
+done:
+ return 0;
+}
+
+static int
+iommu_insert_device_resv_regions(struct list_head *dev_resv_regions,
+ struct list_head *group_resv_regions)
+{
+ struct iommu_resv_region *entry;
+ int ret;
+
+ list_for_each_entry(entry, dev_resv_regions, list) {
+ ret = iommu_insert_resv_region(entry, group_resv_regions);
+ if (ret)
+ break;
+ }
+ return ret;
+}
+
+int iommu_get_group_resv_regions(struct iommu_group *group,
+ struct list_head *head)
+{
+ struct iommu_device *device;
+ int ret = 0;
+
+ mutex_lock(&group->mutex);
+ list_for_each_entry(device, &group->devices, list) {
+ struct list_head dev_resv_regions;
+
+ INIT_LIST_HEAD(&dev_resv_regions);
+ iommu_get_resv_regions(device->dev, &dev_resv_regions);
+ ret = iommu_insert_device_resv_regions(&dev_resv_regions, head);
+ iommu_put_resv_regions(device->dev, &dev_resv_regions);
+ if (ret)
+ break;
+ }
+ mutex_unlock(&group->mutex);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_get_group_resv_regions);
+
static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);

static void iommu_group_release(struct kobject *kobj)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 66b0c1c..f7c0606 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -246,6 +246,8 @@ extern void iommu_set_fault_handler(struct iommu_domain *domain,
extern int iommu_request_dm_for_dev(struct device *dev);
extern struct iommu_resv_region *
iommu_alloc_resv_region(phys_addr_t start, size_t length, int prot, int type);
+extern int iommu_get_group_resv_regions(struct iommu_group *group,
+ struct list_head *head);

extern int iommu_attach_group(struct iommu_domain *domain,
struct iommu_group *group);
@@ -459,6 +461,12 @@ static inline void iommu_put_resv_regions(struct device *dev,
{
}

+static inline int iommu_get_group_resv_regions(struct iommu_group *group,
+ struct list_head *head)
+{
+ return -ENODEV;
+}
+
static inline int iommu_request_dm_for_dev(struct device *dev)
{
return -ENODEV;
--
1.9.1

2016-12-13 20:48:10

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 07/16] iommu: Implement reserved_regions iommu-group sysfs file

A new iommu-group sysfs attribute file is introduced. It contains
the list of reserved regions for the iommu-group. Each reserved
region is described on a separate line:
- first field is the start IOVA address,
- second is the end IOVA address,

Signed-off-by: Eric Auger <[email protected]>

---

v3 -> v4:
- add cast to long long int when printing to avoid warning on
i386
- change S_IRUGO into 0444
- remove sort. The list is natively sorted now.
- update Documentation/ABI/testing/sysfs-kernel-iommu_groups

The file layout is inspired of /sys/bus/pci/devices/BDF/resource.
I also read Documentation/filesystems/sysfs.txt so I expect this
to be frowned upon.
---
.../ABI/testing/sysfs-kernel-iommu_groups | 9 +++++++
drivers/iommu/iommu.c | 31 ++++++++++++++++++++++
2 files changed, 40 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-iommu_groups b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
index 9b31556..b0cb1fa 100644
--- a/Documentation/ABI/testing/sysfs-kernel-iommu_groups
+++ b/Documentation/ABI/testing/sysfs-kernel-iommu_groups
@@ -12,3 +12,12 @@ Description: /sys/kernel/iommu_groups/ contains a number of sub-
file if the IOMMU driver has chosen to register a more
common name for the group.
Users:
+
+What: /sys/kernel/iommu_groups/reserved_regions
+Date: December 2016
+KernelVersion: v4.11
+Contact: Eric Auger <[email protected]>
+Description: /sys/kernel/iommu_groups/reserved_regions list IOVA
+ regions that cannot be used. Not necessarily all
+ unusable regions are listed. This is typically used to
+ output reserved MSI IOVA regions.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1565df4..ab80dde 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -211,8 +211,32 @@ int iommu_get_group_resv_regions(struct iommu_group *group,
}
EXPORT_SYMBOL_GPL(iommu_get_group_resv_regions);

+static ssize_t iommu_group_show_resv_regions(struct iommu_group *group,
+ char *buf)
+{
+ struct iommu_resv_region *region, *next;
+ struct list_head group_resv_regions;
+ char *str = buf;
+
+ INIT_LIST_HEAD(&group_resv_regions);
+ iommu_get_group_resv_regions(group, &group_resv_regions);
+
+ list_for_each_entry_safe(region, next, &group_resv_regions, list) {
+ str += sprintf(str, "0x%016llx 0x%016llx\n",
+ (long long int)region->start,
+ (long long int)(region->start +
+ region->length - 1));
+ kfree(region);
+ }
+
+ return (str - buf);
+}
+
static IOMMU_GROUP_ATTR(name, S_IRUGO, iommu_group_show_name, NULL);

+static IOMMU_GROUP_ATTR(reserved_regions, 0444,
+ iommu_group_show_resv_regions, NULL);
+
static void iommu_group_release(struct kobject *kobj)
{
struct iommu_group *group = to_iommu_group(kobj);
@@ -227,6 +251,8 @@ static void iommu_group_release(struct kobject *kobj)
if (group->default_domain)
iommu_domain_free(group->default_domain);

+ iommu_group_remove_file(group, &iommu_group_attr_reserved_regions);
+
kfree(group->name);
kfree(group);
}
@@ -290,6 +316,11 @@ struct iommu_group *iommu_group_alloc(void)
*/
kobject_put(&group->kobj);

+ ret = iommu_group_create_file(group,
+ &iommu_group_attr_reserved_regions);
+ if (ret)
+ return ERR_PTR(ret);
+
pr_debug("Allocated group %d\n", group->id);

return group;
--
1.9.1

2016-12-13 20:48:08

by Eric Auger

[permalink] [raw]
Subject: [RFC v4 09/16] iommu/arm-smmu: Implement reserved region get/put callbacks

The get() populates the list with the MSI IOVA reserved window.

At the moment an arbitray MSI IOVA window is set at 0x8000000
of size 1MB. This will allow to report those info in iommu-group
sysfs.

Signed-off-by: Eric Auger <[email protected]>

---

v3 -> v4:
- do not handle PCI host bridge windows anymore
- encode prot

RFC v2 -> v3:
- use existing get/put_resv_regions

RFC v1 -> v2:
- use defines for MSI IOVA base and length
---
drivers/iommu/arm-smmu.c | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 8f72814..d24a646 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -278,6 +278,9 @@ enum arm_smmu_s2cr_privcfg {

#define FSYNR0_WNR (1 << 4)

+#define MSI_IOVA_BASE 0x8000000
+#define MSI_IOVA_LENGTH 0x100000
+
static int force_stage;
module_param(force_stage, int, S_IRUGO);
MODULE_PARM_DESC(force_stage,
@@ -1545,6 +1548,29 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
return iommu_fwspec_add_ids(dev, &fwid, 1);
}

+static void arm_smmu_get_resv_regions(struct device *dev,
+ struct list_head *head)
+{
+ struct iommu_resv_region *region;
+ int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
+
+ region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
+ prot, IOMMU_RESV_MSI);
+ if (!region)
+ return;
+
+ list_add_tail(&region->list, head);
+}
+
+static void arm_smmu_put_resv_regions(struct device *dev,
+ struct list_head *head)
+{
+ struct iommu_resv_region *entry, *next;
+
+ list_for_each_entry_safe(entry, next, head, list)
+ kfree(entry);
+}
+
static struct iommu_ops arm_smmu_ops = {
.capable = arm_smmu_capable,
.domain_alloc = arm_smmu_domain_alloc,
@@ -1560,6 +1586,8 @@ static int arm_smmu_of_xlate(struct device *dev, struct of_phandle_args *args)
.domain_get_attr = arm_smmu_domain_get_attr,
.domain_set_attr = arm_smmu_domain_set_attr,
.of_xlate = arm_smmu_of_xlate,
+ .get_resv_regions = arm_smmu_get_resv_regions,
+ .put_resv_regions = arm_smmu_put_resv_regions,
.pgsize_bitmap = -1UL, /* Restricted during device attach */
};

--
1.9.1

2016-12-20 14:11:37

by Bharat Bhushan

[permalink] [raw]
Subject: RE: [RFC v4 00/16] KVM PCIe/MSI passthrough on ARM/ARM64 and IOVA reserved regions

Hi Eric,

Tested this series on NXP ARMv8 platform.

Thanks
-Bharat

> -----Original Message-----
> From: Eric Auger [mailto:[email protected]]
> Sent: Wednesday, December 14, 2016 2:00 AM
> To: [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Cc: [email protected]; [email protected]; linux-
> [email protected]; [email protected];
> [email protected]; [email protected]; Diana Madalina
> Craciun <[email protected]>; [email protected];
> [email protected]; Bharat Bhushan <[email protected]>
> Subject: [RFC v4 00/16] KVM PCIe/MSI passthrough on ARM/ARM64 and
> IOVA reserved regions
>
> Following LPC discussions, we now report reserved regions through iommu-
> group sysfs reserved_regions attribute file.
>
> Reserved regions are populated through the IOMMU get_resv_region
> callback (former get_dm_regions), now implemented by amd-iommu, intel-
> iommu and arm-smmu:
> - the amd-iommu reports device direct mapped regions.
> - the intel-iommu reports the [0xfee00000 - 0xfeefffff] MSI window
> as an IOMMU_RESV_NOMAP reserved region.
> - the arm-smmu reports the MSI window (arbitrarily located at
> 0x8000000 and 1MB large).
>
> Unsafe interrupt assignment is tested by enumerating all MSI irq domains
> and checking they support MSI remapping. This check is done in case we
> detect the iommu translates MSI (an IOMMU_RESV_MSI window exists).
> Otherwise the IRQ remapping capability is checked at IOMMU level.
> Obviously this is a defensive IRQ safety assessment.
> Assuming there are several MSI controllers in the system and at least one
> does not implement IRQ remapping, the assignment will be considered as
> unsafe (even if this controller is not acessible from the assigned devices).
>
> The series integrates a not officially posted patch from Robin:
> "iommu/dma: Allow MSI-only cookies".
>
> Best Regards
>
> Eric
>
> Git: complete series available at
> https://github.com/eauger/linux/tree/v4.9-reserved-v4
>
> History:
>
> RFCv3 -> RFCv4:
> - arm-smmu driver does not register PCI host bridge windows as
> reserved regions anymore
> - Implement reserved region get/put callbacks also in arm-smmuv3
> - take the iommu_group lock on iommu_get_group_resv_regions
> - add a type field in iommu_resv_region instead of using prot
> - init the region list_head in iommu_alloc_resv_region, also
> add type parameter
> - iommu_insert_resv_region manage overlaps and sort reserved
> windows
> - address IRQ safety assessment by enumerating all the MSI irq
> domains and checking the MSI_REMAP flag
> - update Documentation/ABI/testing/sysfs-kernel-iommu_groups
> - Did not add T-b since the code has significantly changed
>
> RFCv2 -> RFCv3:
> - switch to an iommu-group sysfs API
> - use new dummy allocator provided by Robin
> - dummy allocator initialized by vfio-iommu-type1 after enumerating
> the reserved regions
> - at the moment ARM MSI base address/size is left unchanged compared
> to v2
> - we currently report reserved regions and not usable IOVA regions as
> requested by Alex
>
> RFC v1 -> v2:
> - fix intel_add_reserved_regions
> - add mutex lock/unlock in vfio_iommu_type1
>
>
> Eric Auger (16):
> iommu/dma: Allow MSI-only cookies
> iommu: Rename iommu_dm_regions into iommu_resv_regions
> iommu: Add a new type field in iommu_resv_region
> iommu: iommu_alloc_resv_region
> iommu: Only map direct mapped regions
> iommu: iommu_get_group_resv_regions
> iommu: Implement reserved_regions iommu-group sysfs file
> iommu/vt-d: Implement reserved region get/put callbacks
> iommu/arm-smmu: Implement reserved region get/put callbacks
> iommu/arm-smmu-v3: Implement reserved region get/put callbacks
> irqdomain: Add IRQ_DOMAIN_FLAG_MSI_REMAP value
> irqdomain: irq_domain_check_msi_remap
> irqchip/gicv3-its: Sets IRQ_DOMAIN_FLAG_MSI_REMAP
> vfio/type1: Allow transparent MSI IOVA allocation
> vfio/type1: Check MSI remapping at irq domain level
> iommu/arm-smmu: Do not advertise IOMMU_CAP_INTR_REMAP anymore
>
> .../ABI/testing/sysfs-kernel-iommu_groups | 9 ++
> drivers/iommu/amd_iommu.c | 21 +--
> drivers/iommu/arm-smmu-v3.c | 30 +++-
> drivers/iommu/arm-smmu.c | 30 +++-
> drivers/iommu/dma-iommu.c | 116 +++++++++++++---
> drivers/iommu/intel-iommu.c | 50 +++++--
> drivers/iommu/iommu.c | 152 +++++++++++++++++++--
> drivers/irqchip/irq-gic-v3-its.c | 1 +
> drivers/vfio/vfio_iommu_type1.c | 37 ++++-
> include/linux/dma-iommu.h | 7 +
> include/linux/iommu.h | 46 +++++--
> include/linux/irqdomain.h | 8 ++
> kernel/irq/irqdomain.c | 24 ++++
> 13 files changed, 455 insertions(+), 76 deletions(-)
>
> --
> 1.9.1


2016-12-22 13:02:49

by Eric Auger

[permalink] [raw]
Subject: Re: [RFC v4 15/16] vfio/type1: Check MSI remapping at irq domain level

Hi Diana,

On 22/12/2016 13:41, Diana Madalina Craciun wrote:
> Hi Eric,
>
> On 12/13/2016 10:32 PM, Eric Auger wrote:
>> In case the IOMMU does not bypass MSI transactions (typical
>> case on ARM), we check all MSI controllers are IRQ remapping
>> capable. If not the IRQ assignment may be unsafe.
>>
>> At this stage the arm-smmu-(v3) still advertise the
>> IOMMU_CAP_INTR_REMAP capability at IOMMU level. This will be
>> removed in subsequent patches.
>>
>> Signed-off-by: Eric Auger <[email protected]>
>> ---
>> drivers/vfio/vfio_iommu_type1.c | 9 ++++++---
>> 1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index d07fe73..a05648b 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -37,6 +37,7 @@
>> #include <linux/vfio.h>
>> #include <linux/workqueue.h>
>> #include <linux/dma-iommu.h>
>> +#include <linux/irqdomain.h>
>>
>> #define DRIVER_VERSION "0.2"
>> #define DRIVER_AUTHOR "Alex Williamson <[email protected]>"
>> @@ -765,7 +766,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>> struct vfio_domain *domain, *d;
>> struct bus_type *bus = NULL;
>> int ret;
>> - bool resv_msi;
>> + bool resv_msi, msi_remap;
>> phys_addr_t resv_msi_base;
>>
>> mutex_lock(&iommu->lock);
>> @@ -818,8 +819,10 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
>> INIT_LIST_HEAD(&domain->group_list);
>> list_add(&group->next, &domain->group_list);
>>
>> - if (!allow_unsafe_interrupts &&
>> - !iommu_capable(bus, IOMMU_CAP_INTR_REMAP)) {
>> + msi_remap = resv_msi ? irq_domain_check_msi_remap() :
>> + iommu_capable(bus, IOMMU_CAP_INTR_REMAP);
>> +
>> + if (!allow_unsafe_interrupts && !msi_remap) {
>> pr_warn("%s: No interrupt remapping support. Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
>> __func__);
>> ret = -EPERM;
>
> I tested your v4.9-reserved-v4 branch on a ITS capable hardware (NXP
> LS2080), so I did not set allow_unsafe_interrupts. It fails here
> complaining that the there is no interrupt remapping support. The
> irq_domain_check_msi_remap function returns false as none of the checked
> domains has the IRQ_DOMAIN_FLAG_MSI_REMAP flag set. I think the reason
> is that the flags are not propagated through the domain hierarchy when
> the domain is created.

Hum OK. Please apologize for the inconvenience, all the more so this is
the second time you report the same issue for different cause :-( At the
moment I can't test on a GICv3 ITS based system. I will try to fix that
though.

I would like to get the confirmation introducing this flag is the right
direction though.

Thanks

Eric
>
> Thanks,
>
> Diana
>
>
>

2016-12-22 14:17:21

by Diana Madalina Craciun

[permalink] [raw]
Subject: Re: [RFC v4 15/16] vfio/type1: Check MSI remapping at irq domain level

Hi Eric,

On 12/13/2016 10:32 PM, Eric Auger wrote:
> In case the IOMMU does not bypass MSI transactions (typical
> case on ARM), we check all MSI controllers are IRQ remapping
> capable. If not the IRQ assignment may be unsafe.
>
> At this stage the arm-smmu-(v3) still advertise the
> IOMMU_CAP_INTR_REMAP capability at IOMMU level. This will be
> removed in subsequent patches.
>
> Signed-off-by: Eric Auger <[email protected]>
> ---
> drivers/vfio/vfio_iommu_type1.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index d07fe73..a05648b 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -37,6 +37,7 @@
> #include <linux/vfio.h>
> #include <linux/workqueue.h>
> #include <linux/dma-iommu.h>
> +#include <linux/irqdomain.h>
>
> #define DRIVER_VERSION "0.2"
> #define DRIVER_AUTHOR "Alex Williamson <[email protected]>"
> @@ -765,7 +766,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
> struct vfio_domain *domain, *d;
> struct bus_type *bus = NULL;
> int ret;
> - bool resv_msi;
> + bool resv_msi, msi_remap;
> phys_addr_t resv_msi_base;
>
> mutex_lock(&iommu->lock);
> @@ -818,8 +819,10 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
> INIT_LIST_HEAD(&domain->group_list);
> list_add(&group->next, &domain->group_list);
>
> - if (!allow_unsafe_interrupts &&
> - !iommu_capable(bus, IOMMU_CAP_INTR_REMAP)) {
> + msi_remap = resv_msi ? irq_domain_check_msi_remap() :
> + iommu_capable(bus, IOMMU_CAP_INTR_REMAP);
> +
> + if (!allow_unsafe_interrupts && !msi_remap) {
> pr_warn("%s: No interrupt remapping support. Use the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU support on this platform\n",
> __func__);
> ret = -EPERM;

I tested your v4.9-reserved-v4 branch on a ITS capable hardware (NXP
LS2080), so I did not set allow_unsafe_interrupts. It fails here
complaining that the there is no interrupt remapping support. The
irq_domain_check_msi_remap function returns false as none of the checked
domains has the IRQ_DOMAIN_FLAG_MSI_REMAP flag set. I think the reason
is that the flags are not propagated through the domain hierarchy when
the domain is created.

Thanks,

Diana




2016-12-23 15:13:33

by Eric Auger

[permalink] [raw]
Subject: Re: [RFC v4 15/16] vfio/type1: Check MSI remapping at irq domain level

Hi Geetha,

On 23/12/2016 14:33, Geetha Akula wrote:
> Hi Eric,
>
> Seeing same issue reported by Diana on ThunderX with you
> v4.9-reserved-v4 branch.
> Vfio passthough work fine when allow_unsafe_interrupts is set.
Thank you for testing! I will fix the security assessment by better
studying flag propagation in domain hierarchy.

Best Regards

Eric

>
>
> Thank you,
> Geetha.
>
> On Thu, Dec 22, 2016 at 6:32 PM, Auger Eric <[email protected]
> <mailto:[email protected]>> wrote:
>
> Hi Diana,
>
> On 22/12/2016 13:41, Diana Madalina Craciun wrote:
> > Hi Eric,
> >
> > On 12/13/2016 10:32 PM, Eric Auger wrote:
> >> In case the IOMMU does not bypass MSI transactions (typical
> >> case on ARM), we check all MSI controllers are IRQ remapping
> >> capable. If not the IRQ assignment may be unsafe.
> >>
> >> At this stage the arm-smmu-(v3) still advertise the
> >> IOMMU_CAP_INTR_REMAP capability at IOMMU level. This will be
> >> removed in subsequent patches.
> >>
> >> Signed-off-by: Eric Auger <[email protected]
> <mailto:[email protected]>>
> >> ---
> >> drivers/vfio/vfio_iommu_type1.c | 9 ++++++---
> >> 1 file changed, 6 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c
> b/drivers/vfio/vfio_iommu_type1.c
> >> index d07fe73..a05648b 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -37,6 +37,7 @@
> >> #include <linux/vfio.h>
> >> #include <linux/workqueue.h>
> >> #include <linux/dma-iommu.h>
> >> +#include <linux/irqdomain.h>
> >>
> >> #define DRIVER_VERSION "0.2"
> >> #define DRIVER_AUTHOR "Alex Williamson
> <[email protected] <mailto:[email protected]>>"
> >> @@ -765,7 +766,7 @@ static int vfio_iommu_type1_attach_group(void
> *iommu_data,
> >> struct vfio_domain *domain, *d;
> >> struct bus_type *bus = NULL;
> >> int ret;
> >> - bool resv_msi;
> >> + bool resv_msi, msi_remap;
> >> phys_addr_t resv_msi_base;
> >>
> >> mutex_lock(&iommu->lock);
> >> @@ -818,8 +819,10 @@ static int
> vfio_iommu_type1_attach_group(void *iommu_data,
> >> INIT_LIST_HEAD(&domain->group_list);
> >> list_add(&group->next, &domain->group_list);
> >>
> >> - if (!allow_unsafe_interrupts &&
> >> - !iommu_capable(bus, IOMMU_CAP_INTR_REMAP)) {
> >> + msi_remap = resv_msi ? irq_domain_check_msi_remap() :
> >> + iommu_capable(bus, IOMMU_CAP_INTR_REMAP);
> >> +
> >> + if (!allow_unsafe_interrupts && !msi_remap) {
> >> pr_warn("%s: No interrupt remapping support. Use
> the module param \"allow_unsafe_interrupts\" to enable VFIO IOMMU
> support on this platform\n",
> >> __func__);
> >> ret = -EPERM;
> >
> > I tested your v4.9-reserved-v4 branch on a ITS capable hardware (NXP
> > LS2080), so I did not set allow_unsafe_interrupts. It fails here
> > complaining that the there is no interrupt remapping support. The
> > irq_domain_check_msi_remap function returns false as none of the
> checked
> > domains has the IRQ_DOMAIN_FLAG_MSI_REMAP flag set. I think the reason
> > is that the flags are not propagated through the domain hierarchy when
> > the domain is created.
>
> Hum OK. Please apologize for the inconvenience, all the more so this is
> the second time you report the same issue for different cause :-( At the
> moment I can't test on a GICv3 ITS based system. I will try to fix that
> though.
>
> I would like to get the confirmation introducing this flag is the right
> direction though.
>
> Thanks
>
> Eric
> >
> > Thanks,
> >
> > Diana
> >
> >
> >
>
> _______________________________________________
> linux-arm-kernel mailing list
> [email protected]
> <mailto:[email protected]>
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> <http://lists.infradead.org/mailman/listinfo/linux-arm-kernel>
>
>