2019-03-04 16:56:54

by James Sewart

Subject: [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

Hey,

This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c.
This avoids the use of find_or_alloc_domain whose domain assignment is
inconsistent with the iommu grouping as determined by pci_device_group.

Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the
iommu_ops api, allowing the default_domain of an iommu group to be set in
iommu.c. This domain will be attached to every device that is brought up
with an iommu group, and the device's reserved regions will be mapped using
regions returned by get_resv_regions.
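
For reference, here is a rough sketch of what the generic layer then does
with the regions returned by get_resv_regions. This is paraphrased from
iommu.c:iommu_group_create_direct_mappings rather than the exact kernel
code; the helper name is made up, and error handling and page-size
alignment are trimmed:

/* Identity-map every IOMMU_RESV_DIRECT region (e.g. RMRRs) into the
 * group's default domain, so IOVA == physical address. */
static void sketch_direct_mappings(struct iommu_domain *domain,
				   struct device *dev)
{
	struct iommu_resv_region *entry;
	LIST_HEAD(mappings);

	iommu_get_resv_regions(dev, &mappings);
	list_for_each_entry(entry, &mappings, list) {
		/* the real code also calls ops->apply_resv_region()
		 * here, which patch 2 implements for vt-d */
		if (entry->type != IOMMU_RESV_DIRECT)
			continue;
		iommu_map(domain, entry->start, entry->start,
			  entry->length, entry->prot);
	}
	iommu_put_resv_regions(dev, &mappings);
}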

In intel_iommu_domain_alloc we don’t know which IOMMU this domain will be
associated with, so we defer full initialisation until
intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will
try to map a device's reserved regions before attaching the domain, which
would cause issues if the domain is not fully initialised. This is
addressed in patch 1 by moving the mapping to after attaching.

Patch 2 implements the apply_resv_region callback, used in
iommu_group_create_direct_mappings to mark the reserved regions as
non-mappable for the dma_map_ops api.
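
As a hedged illustration of what this enables (the helper name here is
hypothetical; IOVA_PFN() and the iovad member come from intel-iommu.c),
reserving an IOVA range means the dma_map_ops allocator can never hand
out addresses inside it:

/* Carve [start, start + length) out of a domain's IOVA allocator;
 * alloc_iova()/alloc_iova_fast() will then skip this range. */
static int sketch_reserve_range(struct iova_domain *iovad,
				phys_addr_t start, size_t length)
{
	unsigned long pfn_lo = IOVA_PFN(start);
	unsigned long pfn_hi = IOVA_PFN(start + length - 1);

	return reserve_iova(iovad, pfn_lo, pfn_hi) ? 0 : -ENOMEM;
}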

Patch 4 removes the domain lazy allocation logic. Before this patch the
lazy allocation logic would not be used, as any domain allocated via
these paths would be replaced when attaching the group default domain.
Default domain allocation has been tested with and without this patch on
4.19.

Cheers,
James.

James Sewart (4):
iommu: Move iommu_group_create_direct_mappings to after device_attach
iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
iommu/vt-d: Remove lazy allocation of domains

drivers/iommu/intel-iommu.c | 329 ++++++++++++------------------------
drivers/iommu/iommu.c | 4 +-
2 files changed, 108 insertions(+), 225 deletions(-)

--
2.17.1



2019-03-04 16:00:57

by James Sewart

Subject: [PATCH 1/4] iommu: Move iommu_group_create_direct_mappings to after device_attach

If an IOMMU driver needs to know which IOMMU a domain is associated with
in order to initialise that domain, it will do so in device_attach;
before this point it is not safe to call iommu_ops on the domain.

Signed-off-by: James Sewart <[email protected]>
---
drivers/iommu/iommu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3ed4db334341..1c6ffbb2d20e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -652,8 +652,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)

dev->iommu_group = group;

- iommu_group_create_direct_mappings(group, dev);
-
mutex_lock(&group->mutex);
list_add_tail(&device->list, &group->devices);
if (group->domain)
@@ -662,6 +660,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
if (ret)
goto err_put_group;

+ iommu_group_create_direct_mappings(group, dev);
+
/* Notify any listeners about change to group. */
blocking_notifier_call_chain(&group->notifier,
IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
--
2.17.1




2019-03-04 16:05:08

by James Sewart

Subject: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated

Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the
default_domain of an iommu_group to be set. This delegates device-domain
relationships to the generic IOMMU code.

Signed-off-by: James Sewart <[email protected]>
---
drivers/iommu/intel-iommu.c | 113 +++++++++++++++++++++++++++---------
1 file changed, 84 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8e0a4e2ff77f..71cd6bbfec05 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -309,6 +309,14 @@ static int hw_pass_through = 1;
/* si_domain contains mulitple devices */
#define DOMAIN_FLAG_STATIC_IDENTITY (1 << 1)

+/* Domain managed externally, don't cleanup if it isn't attached
+ * to any devices. */
+#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2)
+
+/* Set after domain initialisation. Used when allocating dma domains to
+ * defer domain initialisation until it is attached to a device */
+#define DOMAIN_FLAG_INITIALISED (1 << 4)
+
#define for_each_domain_iommu(idx, domain) \
for (idx = 0; idx < g_num_of_iommus; idx++) \
if (domain->iommu_refcnt[idx])
@@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
DOMAIN_FLAG_STATIC_IDENTITY);
}

+static inline int domain_managed_externally(struct dmar_domain *domain)
+{
+ return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
+}
+
+static inline int domain_is_initialised(struct dmar_domain *domain)
+{
+ return domain->flags & DOMAIN_FLAG_INITIALISED;
+}
+
static inline int domain_pfn_supported(struct dmar_domain *domain,
unsigned long pfn)
{
@@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)

__dmar_remove_one_dev_info(info);

- if (!domain_type_is_vm_or_si(domain)) {
+ if (!domain_managed_externally(domain)) {
/*
* The domain_exit() function can't be called under
* device_domain_lock, as it takes this lock itself.
@@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+ domain->flags |= DOMAIN_FLAG_INITIALISED;
__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
return 0;
}
@@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void)

static int md_domain_init(struct dmar_domain *domain, int guest_width);

-static int __init si_domain_init(int hw)
+static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
{
int nid, ret = 0;

- si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
- if (!si_domain)
- return -EFAULT;
-
if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
domain_exit(si_domain);
return -EFAULT;
@@ -3417,9 +3432,16 @@ static int __init init_dmars(void)
check_tylersburg_isoch();

if (iommu_identity_mapping) {
- ret = si_domain_init(hw_pass_through);
- if (ret)
+ si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
+ if (!si_domain) {
+ ret = -EFAULT;
goto free_iommu;
+ }
+ ret = si_domain_init(si_domain, hw_pass_through);
+ if (ret) {
+ domain_exit(si_domain);
+ goto free_iommu;
+ }
}


@@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block *nb,
return 0;

dmar_remove_one_dev_info(domain, dev);
- if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
+ if (!domain_managed_externally(domain) && list_empty(&domain->devices))
domain_exit(domain);

return 0;
@@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+ domain->flags |= DOMAIN_FLAG_INITIALISED;
domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
return 0;
}
@@ -5028,28 +5051,54 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
{
struct dmar_domain *dmar_domain;
struct iommu_domain *domain;
+ int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;

- if (type != IOMMU_DOMAIN_UNMANAGED)
- return NULL;
-
- dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
- if (!dmar_domain) {
- pr_err("Can't allocate dmar_domain\n");
- return NULL;
- }
- if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
- pr_err("Domain initialization failed\n");
- domain_exit(dmar_domain);
+ switch (type) {
+ case IOMMU_DOMAIN_UNMANAGED:
+ flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
+ dmar_domain = alloc_domain(flags);
+ if (!dmar_domain) {
+ pr_err("Can't allocate dmar_domain\n");
+ return NULL;
+ }
+ if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
+ pr_err("Domain initialization failed\n");
+ domain_exit(dmar_domain);
+ return NULL;
+ }
+ domain_update_iommu_cap(dmar_domain);
+ domain = &dmar_domain->domain;
+ domain->geometry.aperture_start = 0;
+ domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
+ domain->geometry.force_aperture = true;
+ break;
+ case IOMMU_DOMAIN_DMA:
+ dmar_domain = alloc_domain(flags);
+ if (!dmar_domain) {
+ pr_err("Can't allocate dmar_domain\n");
+ return NULL;
+ }
+ // init domain in device attach when we know IOMMU capabilities
+ break;
+ case IOMMU_DOMAIN_IDENTITY:
+ flags |= DOMAIN_FLAG_STATIC_IDENTITY | DOMAIN_FLAG_INITIALISED;
+ dmar_domain = alloc_domain(flags);
+ if (!dmar_domain) {
+ pr_err("Can't allocate dmar_domain\n");
+ return NULL;
+ }
+ if (si_domain_init(dmar_domain, 0)) {
+ pr_err("Domain initialization failed\n");
+ domain_exit(dmar_domain);
+ return NULL;
+ }
+ break;
+ default:
return NULL;
}
- domain_update_iommu_cap(dmar_domain);
-
- domain = &dmar_domain->domain;
- domain->geometry.aperture_start = 0;
- domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
- domain->geometry.force_aperture = true;

- return domain;
+ dmar_domain->domain.type = type;
+ return &dmar_domain->domain;
}

static void intel_iommu_domain_free(struct iommu_domain *domain)
@@ -5080,8 +5129,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
dmar_remove_one_dev_info(old_domain, dev);
rcu_read_unlock();

- if (!domain_type_is_vm_or_si(old_domain) &&
- list_empty(&old_domain->devices))
+ if (list_empty(&old_domain->devices) &&
+ !domain_managed_externally(old_domain))
domain_exit(old_domain);
}
}
@@ -5090,6 +5139,12 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
if (!iommu)
return -ENODEV;

+ // Initialise domain with IOMMU capabilities if it isn't already initialised
+ if (!domain_is_initialised(dmar_domain)) {
+ if (domain_init(dmar_domain, iommu, DEFAULT_DOMAIN_ADDRESS_WIDTH))
+ return -ENOMEM;
+ }
+
/* check if this iommu agaw is sufficient for max mapped address */
addr_width = agaw_to_width(iommu->agaw);
if (addr_width > cap_mgaw(iommu->cap))
--
2.17.1


2019-03-04 16:58:28

by James Sewart

Subject: [PATCH 2/4] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges

This is used by iommu.c, before it creates identity mappings for reserved
ranges, to ensure the dma_map_ops api will never remap these addresses.

Signed-off-by: James Sewart <[email protected]>
---
drivers/iommu/intel-iommu.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 78188bf7e90d..8e0a4e2ff77f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5299,6 +5299,19 @@ static void intel_iommu_put_resv_regions(struct device *dev,
}
}

+static void intel_iommu_apply_resv_region(struct device *dev,
+ struct iommu_domain *domain,
+ struct iommu_resv_region *region)
+{
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+ unsigned long start, end;
+
+ start = IOVA_PFN(region->start);
+ end = IOVA_PFN(region->start + region->length - 1);
+
+ WARN_ON_ONCE(reserve_iova(&dmar_domain->iovad, start, end) == NULL);
+}
+
#ifdef CONFIG_INTEL_IOMMU_SVM
int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
{
@@ -5392,6 +5405,7 @@ const struct iommu_ops intel_iommu_ops = {
.remove_device = intel_iommu_remove_device,
.get_resv_regions = intel_iommu_get_resv_regions,
.put_resv_regions = intel_iommu_put_resv_regions,
+ .apply_resv_region = intel_iommu_apply_resv_region,
.device_group = pci_device_group,
.pgsize_bitmap = INTEL_IOMMU_PGSIZES,
};
--
2.17.1


2019-03-04 16:58:45

by James Sewart

Subject: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

The generic IOMMU code will allocate and attach a dma ops domain to each
device that comes online, replacing any lazily allocated domain. This
patch also removes RMRR application at iommu init time, as at that point
no domain will be attached to any device; iommu.c will apply these
mappings after attaching a device, via
iommu_group_create_direct_mappings.

Signed-off-by: James Sewart <[email protected]>
---
drivers/iommu/intel-iommu.c | 202 ++----------------------------------
1 file changed, 8 insertions(+), 194 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 71cd6bbfec05..282257e2628d 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
return domain;
}

-static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
-{
- *(u16 *)opaque = alias;
- return 0;
-}
-
-static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
-{
- struct device_domain_info *info = NULL;
- struct dmar_domain *domain = NULL;
- struct intel_iommu *iommu;
- u16 dma_alias;
- unsigned long flags;
- u8 bus, devfn;
-
- iommu = device_to_iommu(dev, &bus, &devfn);
- if (!iommu)
- return NULL;
-
- if (dev_is_pci(dev)) {
- struct pci_dev *pdev = to_pci_dev(dev);
-
- pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
- spin_lock_irqsave(&device_domain_lock, flags);
- info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
- PCI_BUS_NUM(dma_alias),
- dma_alias & 0xff);
- if (info) {
- iommu = info->iommu;
- domain = info->domain;
- }
- spin_unlock_irqrestore(&device_domain_lock, flags);
-
- /* DMA alias already has a domain, use it */
- if (info)
- goto out;
- }
-
- /* Allocate and initialize new domain for the device */
- domain = alloc_domain(0);
- if (!domain)
- return NULL;
- if (domain_init(domain, iommu, gaw)) {
- domain_exit(domain);
- return NULL;
- }
-
-out:
-
- return domain;
-}
-
-static struct dmar_domain *set_domain_for_dev(struct device *dev,
- struct dmar_domain *domain)
-{
- struct intel_iommu *iommu;
- struct dmar_domain *tmp;
- u16 req_id, dma_alias;
- u8 bus, devfn;
-
- iommu = device_to_iommu(dev, &bus, &devfn);
- if (!iommu)
- return NULL;
-
- req_id = ((u16)bus << 8) | devfn;
-
- if (dev_is_pci(dev)) {
- struct pci_dev *pdev = to_pci_dev(dev);
-
- pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
- /* register PCI DMA alias device */
- if (req_id != dma_alias) {
- tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
- dma_alias & 0xff, NULL, domain);
-
- if (!tmp || tmp != domain)
- return tmp;
- }
- }
-
- tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
- if (!tmp || tmp != domain)
- return tmp;
-
- return domain;
-}
-
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
- struct dmar_domain *domain, *tmp;
-
- domain = find_domain(dev);
- if (domain)
- goto out;
-
- domain = find_or_alloc_domain(dev, gaw);
- if (!domain)
- goto out;
-
- tmp = set_domain_for_dev(dev, domain);
- if (!tmp || domain != tmp) {
- domain_exit(domain);
- domain = tmp;
- }
-
-out:
-
- return domain;
-}
-
static int iommu_domain_identity_map(struct dmar_domain *domain,
unsigned long long start,
unsigned long long end)
@@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
struct dmar_domain *domain;
int ret;

- domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
+ domain = find_domain(dev);
if (!domain)
return -ENOMEM;

@@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
static int __init init_dmars(void)
{
struct dmar_drhd_unit *drhd;
- struct dmar_rmrr_unit *rmrr;
bool copied_tables = false;
- struct device *dev;
struct intel_iommu *iommu;
- int i, ret;
+ int ret;

/*
* for each drhd
@@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
goto free_iommu;
}
}
- /*
- * For each rmrr
- * for each dev attached to rmrr
- * do
- * locate drhd for dev, alloc domain for dev
- * allocate free domain
- * allocate page table entries for rmrr
- * if context not allocated for bus
- * allocate and init context
- * set present in root table for this bus
- * init context with domain, translation etc
- * endfor
- * endfor
- */
- pr_info("Setting RMRR:\n");
- for_each_rmrr_units(rmrr) {
- /* some BIOS lists non-exist devices in DMAR table. */
- for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
- i, dev) {
- ret = iommu_prepare_rmrr_dev(rmrr, dev);
- if (ret)
- pr_err("Mapping reserved region failed\n");
- }
- }
-
- iommu_prepare_isa();

domains_done:

@@ -3580,53 +3440,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
return iova_pfn;
}

-struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
-{
- struct dmar_domain *domain, *tmp;
- struct dmar_rmrr_unit *rmrr;
- struct device *i_dev;
- int i, ret;
-
- domain = find_domain(dev);
- if (domain)
- goto out;
-
- domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
- if (!domain)
- goto out;
-
- /* We have a new domain - setup possible RMRRs for the device */
- rcu_read_lock();
- for_each_rmrr_units(rmrr) {
- for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
- i, i_dev) {
- if (i_dev != dev)
- continue;
-
- ret = domain_prepare_identity_map(dev, domain,
- rmrr->base_address,
- rmrr->end_address);
- if (ret)
- dev_err(dev, "Mapping reserved region failed\n");
- }
- }
- rcu_read_unlock();
-
- tmp = set_domain_for_dev(dev, domain);
- if (!tmp || domain != tmp) {
- domain_exit(domain);
- domain = tmp;
- }
-
-out:
-
- if (!domain)
- pr_err("Allocating domain for %s failed\n", dev_name(dev));
-
-
- return domain;
-}
-
/* Check if the dev needs to go through non-identity map and unmap process.*/
static int iommu_no_mapping(struct device *dev)
{
@@ -3689,7 +3502,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
if (iommu_no_mapping(dev))
return paddr;

- domain = get_valid_domain_for_dev(dev);
+ domain = find_domain(dev);
if (!domain)
return DMA_MAPPING_ERROR;

@@ -3753,7 +3566,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
return;

domain = find_domain(dev);
- BUG_ON(!domain);
+ if (!domain)
+ return;

iommu = domain_get_iommu(domain);

@@ -3899,7 +3713,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
if (iommu_no_mapping(dev))
return intel_nontranslate_map_sg(dev, sglist, nelems, dir);

- domain = get_valid_domain_for_dev(dev);
+ domain = find_domain(dev);
if (!domain)
return 0;

@@ -5377,9 +5191,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
u64 ctx_lo;
int ret;

- domain = get_valid_domain_for_dev(sdev->dev);
+ domain = find_domain(sdev->dev);
if (!domain)
- return -EINVAL;
+ return -ENOMEM;

spin_lock_irqsave(&device_domain_lock, flags);
spin_lock(&iommu->lock);
--
2.17.1


2019-03-05 06:11:40

by Lu Baolu

Subject: Re: [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

Hi James,

Very glad to see this. Thank you!

On 3/4/19 11:41 PM, James Sewart wrote:
> Hey,
>
> This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c.
> This avoids the use of find_or_alloc_domain whose domain assignment is
> inconsistent with the iommu grouping as determined by pci_device_group.

Is this a bug fix or an improvement? What real issue will it cause if we
go without this patch set?

Best regards,
Lu Baolu

>
> Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the
> iommu_ops api, allowing the default_domain of an iommu group to be set in
> iommu.c. This domain will be attached to every device that is brought up
> with an iommu group, and the device's reserved regions will be mapped using
> regions returned by get_resv_regions.
>
> In intel_iommu_domain_alloc we don’t know which IOMMU this domain will be
> associated with, so we defer full initialisation until
> intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will
> try to map a device's reserved regions before attaching the domain, which
> would cause issues if the domain is not fully initialised. This is
> addressed in patch 1 by moving the mapping to after attaching.
>
> Patch 2 implements the apply_resv_region callback, used in
> iommu_group_create_direct_mappings to mark the reserved regions as
> non-mappable for the dma_map_ops api.
>
> Patch 4 removes the domain lazy allocation logic. Before this patch the
> lazy allocation logic would not be used, as any domain allocated via
> these paths would be replaced when attaching the group default domain.
> Default domain allocation has been tested with and without this patch on
> 4.19.
>
> Cheers,
> James.
>
> James Sewart (4):
> iommu: Move iommu_group_create_direct_mappings to after device_attach
> iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
> iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
> iommu/vt-d: Remove lazy allocation of domains
>
> drivers/iommu/intel-iommu.c | 329 ++++++++++++------------------------
> drivers/iommu/iommu.c | 4 +-
> 2 files changed, 108 insertions(+), 225 deletions(-)
>

2019-03-05 06:52:29

by Lu Baolu

Subject: Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated

Hi,

On 3/4/19 11:46 PM, James Sewart wrote:
> Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the
> default_domain of an iommu_group to be set. This delegates device-domain
> relationships to the generic IOMMU code.
>
> Signed-off-by: James Sewart <[email protected]>
> ---
> drivers/iommu/intel-iommu.c | 113 +++++++++++++++++++++++++++---------
> 1 file changed, 84 insertions(+), 29 deletions(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 8e0a4e2ff77f..71cd6bbfec05 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -309,6 +309,14 @@ static int hw_pass_through = 1;
> /* si_domain contains mulitple devices */
> #define DOMAIN_FLAG_STATIC_IDENTITY (1 << 1)
>
> +/* Domain managed externally, don't cleanup if it isn't attached
> + * to any devices. */
> +#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2)
> +
> +/* Set after domain initialisation. Used when allocating dma domains to
> + * defer domain initialisation until it is attached to a device */
> +#define DOMAIN_FLAG_INITIALISED (1 << 4)

Why do you skip bit 3?

> +
> #define for_each_domain_iommu(idx, domain) \
> for (idx = 0; idx < g_num_of_iommus; idx++) \
> if (domain->iommu_refcnt[idx])
> @@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
> DOMAIN_FLAG_STATIC_IDENTITY);
> }
>
> +static inline int domain_managed_externally(struct dmar_domain *domain)
> +{
> + return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
> +}
> +
> +static inline int domain_is_initialised(struct dmar_domain *domain)
> +{
> + return domain->flags & DOMAIN_FLAG_INITIALISED;
> +}
> +
> static inline int domain_pfn_supported(struct dmar_domain *domain,
> unsigned long pfn)
> {
> @@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
>
> __dmar_remove_one_dev_info(info);
>
> - if (!domain_type_is_vm_or_si(domain)) {
> + if (!domain_managed_externally(domain)) {
> /*
> * The domain_exit() function can't be called under
> * device_domain_lock, as it takes this lock itself.
> @@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
> domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
> if (!domain->pgd)
> return -ENOMEM;
> + domain->flags |= DOMAIN_FLAG_INITIALISED;
> __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
> return 0;
> }
> @@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void)
>
> static int md_domain_init(struct dmar_domain *domain, int guest_width);
>
> -static int __init si_domain_init(int hw)
> +static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
> {
> int nid, ret = 0;
>
> - si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
> - if (!si_domain)
> - return -EFAULT;
> -
> if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
> domain_exit(si_domain);
> return -EFAULT;
> @@ -3417,9 +3432,16 @@ static int __init init_dmars(void)
> check_tylersburg_isoch();
>
> if (iommu_identity_mapping) {
> - ret = si_domain_init(hw_pass_through);
> - if (ret)
> + si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);

Where will this si_domain be used? We still need to keep the global
si_domain?

> + if (!si_domain) {
> + ret = -EFAULT;
> goto free_iommu;
> + }
> + ret = si_domain_init(si_domain, hw_pass_through);
> + if (ret) {
> + domain_exit(si_domain);
> + goto free_iommu;
> + }
> }
>
>
> @@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block *nb,
> return 0;
>
> dmar_remove_one_dev_info(domain, dev);
> - if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
> + if (!domain_managed_externally(domain) && list_empty(&domain->devices))
> domain_exit(domain);
>
> return 0;
> @@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
> domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
> if (!domain->pgd)
> return -ENOMEM;
> + domain->flags |= DOMAIN_FLAG_INITIALISED;
> domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
> return 0;
> }
> @@ -5028,28 +5051,54 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
> {
> struct dmar_domain *dmar_domain;
> struct iommu_domain *domain;
> + int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
>
> - if (type != IOMMU_DOMAIN_UNMANAGED)
> - return NULL;
> -
> - dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
> - if (!dmar_domain) {
> - pr_err("Can't allocate dmar_domain\n");
> - return NULL;
> - }
> - if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
> - pr_err("Domain initialization failed\n");
> - domain_exit(dmar_domain);
> + switch (type) {
> + case IOMMU_DOMAIN_UNMANAGED:
> + flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
> + dmar_domain = alloc_domain(flags);
> + if (!dmar_domain) {
> + pr_err("Can't allocate dmar_domain\n");
> + return NULL;
> + }
> + if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
> + pr_err("Domain initialization failed\n");
> + domain_exit(dmar_domain);
> + return NULL;
> + }
> + domain_update_iommu_cap(dmar_domain);
> + domain = &dmar_domain->domain;
> + domain->geometry.aperture_start = 0;
> + domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
> + domain->geometry.force_aperture = true;
> + break;
> + case IOMMU_DOMAIN_DMA:
> + dmar_domain = alloc_domain(flags);
> + if (!dmar_domain) {
> + pr_err("Can't allocate dmar_domain\n");

Not necessary for memory allocation failure.

> + return NULL;
> + }
> + // init domain in device attach when we know IOMMU capabilities

Use /* */ for comments, please.

> + break;
> + case IOMMU_DOMAIN_IDENTITY:
> + flags |= DOMAIN_FLAG_STATIC_IDENTITY | DOMAIN_FLAG_INITIALISED;
> + dmar_domain = alloc_domain(flags);
> + if (!dmar_domain) {
> + pr_err("Can't allocate dmar_domain\n");
> + return NULL;
> + }
> + if (si_domain_init(dmar_domain, 0)) {
> + pr_err("Domain initialization failed\n");
> + domain_exit(dmar_domain);
> + return NULL;
> + }
> + break;

How about moving the si_domain allocation into a separate patch? There
are two significant changes here, so it is worth a new patch for
discussion:

1) a shared single si_domain vs. a per-group si_domain;
2) previously the vt-d driver determined what kind of devices should use
the identity domain; now that decision moves to the generic iommu layer,
so behavior will change.

> + default:
> return NULL;
> }
> - domain_update_iommu_cap(dmar_domain);
> -
> - domain = &dmar_domain->domain;
> - domain->geometry.aperture_start = 0;
> - domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
> - domain->geometry.force_aperture = true;
>
> - return domain;
> + dmar_domain->domain.type = type;

Is it necessary to set this?

> + return &dmar_domain->domain;
> }
>
> static void intel_iommu_domain_free(struct iommu_domain *domain)
> @@ -5080,8 +5129,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
> dmar_remove_one_dev_info(old_domain, dev);
> rcu_read_unlock();
>
> - if (!domain_type_is_vm_or_si(old_domain) &&
> - list_empty(&old_domain->devices))
> + if (list_empty(&old_domain->devices) &&
> + !domain_managed_externally(old_domain))
> domain_exit(old_domain);
> }
> }
> @@ -5090,6 +5139,12 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
> if (!iommu)
> return -ENODEV;
>
> + // Initialise domain with IOMMU capabilities if it isn't already initialised

Ditto.

> + if (!domain_is_initialised(dmar_domain)) {
> + if (domain_init(dmar_domain, iommu, DEFAULT_DOMAIN_ADDRESS_WIDTH))
> + return -ENOMEM;
> + }

Each domain, no matter whether a DMA domain or an UNMANAGED domain, is
allocated first and then gets initialized during device attach, right?
If so, why do we need a special DOMAIN_FLAG_INITIALISED flag for DMA
domains?

> +
> /* check if this iommu agaw is sufficient for max mapped address */
> addr_width = agaw_to_width(iommu->agaw);
> if (addr_width > cap_mgaw(iommu->cap))
>

Best regards,
Lu Baolu

2019-03-05 07:05:54

by Lu Baolu

Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hi,

It's hard for me to understand why we remove the RMRR-related code in
this patch.

Also, we now have two places that hold a domain for a device: the group
and dev->info. We could consider optimizing the use of the per-device
iommu arch data. That should be later work anyway.

More comments inline.

On 3/4/19 11:47 PM, James Sewart wrote:
> The generic IOMMU code will allocate and attach a dma ops domain to each
> device that comes online, replacing any lazily allocated domain. This
> patch also removes RMRR application at iommu init time, as at that point
> no domain will be attached to any device; iommu.c will apply these
> mappings after attaching a device, via
> iommu_group_create_direct_mappings.
>
> Signed-off-by: James Sewart <[email protected]>
> ---
> drivers/iommu/intel-iommu.c | 202 ++----------------------------------
> 1 file changed, 8 insertions(+), 194 deletions(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 71cd6bbfec05..282257e2628d 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
> return domain;
> }
>
> -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
> -{
> - *(u16 *)opaque = alias;
> - return 0;
> -}
> -
> -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
> -{
> - struct device_domain_info *info = NULL;
> - struct dmar_domain *domain = NULL;
> - struct intel_iommu *iommu;
> - u16 dma_alias;
> - unsigned long flags;
> - u8 bus, devfn;
> -
> - iommu = device_to_iommu(dev, &bus, &devfn);
> - if (!iommu)
> - return NULL;
> -
> - if (dev_is_pci(dev)) {
> - struct pci_dev *pdev = to_pci_dev(dev);
> -
> - pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
> -
> - spin_lock_irqsave(&device_domain_lock, flags);
> - info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
> - PCI_BUS_NUM(dma_alias),
> - dma_alias & 0xff);
> - if (info) {
> - iommu = info->iommu;
> - domain = info->domain;
> - }
> - spin_unlock_irqrestore(&device_domain_lock, flags);
> -
> - /* DMA alias already has a domain, use it */
> - if (info)
> - goto out;
> - }
> -
> - /* Allocate and initialize new domain for the device */
> - domain = alloc_domain(0);
> - if (!domain)
> - return NULL;
> - if (domain_init(domain, iommu, gaw)) {
> - domain_exit(domain);
> - return NULL;
> - }
> -
> -out:
> -
> - return domain;
> -}
> -
> -static struct dmar_domain *set_domain_for_dev(struct device *dev,
> - struct dmar_domain *domain)
> -{
> - struct intel_iommu *iommu;
> - struct dmar_domain *tmp;
> - u16 req_id, dma_alias;
> - u8 bus, devfn;
> -
> - iommu = device_to_iommu(dev, &bus, &devfn);
> - if (!iommu)
> - return NULL;
> -
> - req_id = ((u16)bus << 8) | devfn;
> -
> - if (dev_is_pci(dev)) {
> - struct pci_dev *pdev = to_pci_dev(dev);
> -
> - pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
> -
> - /* register PCI DMA alias device */
> - if (req_id != dma_alias) {
> - tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
> - dma_alias & 0xff, NULL, domain);
> -
> - if (!tmp || tmp != domain)
> - return tmp;
> - }
> - }
> -
> - tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
> - if (!tmp || tmp != domain)
> - return tmp;
> -
> - return domain;
> -}
> -
> -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
> -{
> - struct dmar_domain *domain, *tmp;
> -
> - domain = find_domain(dev);
> - if (domain)
> - goto out;
> -
> - domain = find_or_alloc_domain(dev, gaw);
> - if (!domain)
> - goto out;
> -
> - tmp = set_domain_for_dev(dev, domain);
> - if (!tmp || domain != tmp) {
> - domain_exit(domain);
> - domain = tmp;
> - }
> -
> -out:
> -
> - return domain;
> -}
> -
> static int iommu_domain_identity_map(struct dmar_domain *domain,
> unsigned long long start,
> unsigned long long end)
> @@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
> struct dmar_domain *domain;
> int ret;
>
> - domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
> + domain = find_domain(dev);
> if (!domain)
> return -ENOMEM;
>
> @@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
> static int __init init_dmars(void)
> {
> struct dmar_drhd_unit *drhd;
> - struct dmar_rmrr_unit *rmrr;
> bool copied_tables = false;
> - struct device *dev;
> struct intel_iommu *iommu;
> - int i, ret;
> + int ret;
>
> /*
> * for each drhd
> @@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
> goto free_iommu;
> }
> }
> - /*
> - * For each rmrr
> - * for each dev attached to rmrr
> - * do
> - * locate drhd for dev, alloc domain for dev
> - * allocate free domain
> - * allocate page table entries for rmrr
> - * if context not allocated for bus
> - * allocate and init context
> - * set present in root table for this bus
> - * init context with domain, translation etc
> - * endfor
> - * endfor
> - */
> - pr_info("Setting RMRR:\n");
> - for_each_rmrr_units(rmrr) {
> - /* some BIOS lists non-exist devices in DMAR table. */
> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
> - i, dev) {
> - ret = iommu_prepare_rmrr_dev(rmrr, dev);
> - if (ret)
> - pr_err("Mapping reserved region failed\n");
> - }
> - }
> -
> - iommu_prepare_isa();

Why do you want to remove this segment of code?

>
> domains_done:
>
> @@ -3580,53 +3440,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
> return iova_pfn;
> }
>
> -struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
> -{
> - struct dmar_domain *domain, *tmp;
> - struct dmar_rmrr_unit *rmrr;
> - struct device *i_dev;
> - int i, ret;
> -
> - domain = find_domain(dev);
> - if (domain)
> - goto out;
> -
> - domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
> - if (!domain)
> - goto out;
> -
> - /* We have a new domain - setup possible RMRRs for the device */
> - rcu_read_lock();
> - for_each_rmrr_units(rmrr) {
> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
> - i, i_dev) {
> - if (i_dev != dev)
> - continue;
> -
> - ret = domain_prepare_identity_map(dev, domain,
> - rmrr->base_address,
> - rmrr->end_address);
> - if (ret)
> - dev_err(dev, "Mapping reserved region failed\n");

We can't simply remove this segment of code, right? At least it should
be moved to the domain allocation interface.

> - }
> - }
> - rcu_read_unlock();
> -
> - tmp = set_domain_for_dev(dev, domain);
> - if (!tmp || domain != tmp) {
> - domain_exit(domain);
> - domain = tmp;
> - }
> -
> -out:
> -
> - if (!domain)
> - pr_err("Allocating domain for %s failed\n", dev_name(dev));
> -
> -
> - return domain;
> -}
> -
> /* Check if the dev needs to go through non-identity map and unmap process.*/
> static int iommu_no_mapping(struct device *dev)
> {
> @@ -3689,7 +3502,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
> if (iommu_no_mapping(dev))
> return paddr;
>
> - domain = get_valid_domain_for_dev(dev);
> + domain = find_domain(dev);
> if (!domain)
> return DMA_MAPPING_ERROR;
>
> @@ -3753,7 +3566,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
> return;
>
> domain = find_domain(dev);
> - BUG_ON(!domain);
> + if (!domain)
> + return;
>

This is not related to this patch. Let's do it in a separate patch.

> iommu = domain_get_iommu(domain);
>
> @@ -3899,7 +3713,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
> if (iommu_no_mapping(dev))
> return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>
> - domain = get_valid_domain_for_dev(dev);
> + domain = find_domain(dev);
> if (!domain)
> return 0;
>
> @@ -5377,9 +5191,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
> u64 ctx_lo;
> int ret;
>
> - domain = get_valid_domain_for_dev(sdev->dev);
> + domain = find_domain(sdev->dev);
> if (!domain)
> - return -EINVAL;
> + return -ENOMEM;

This is not related to this patch. Let's do it in a separate patch.

>
> spin_lock_irqsave(&device_domain_lock, flags);
> spin_lock(&iommu->lock);
>

Best regards,
Lu Baolu

2019-03-05 11:36:21

by James Sewart

Subject: Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated

Hey Lu,

> On 5 Mar 2019, at 06:46, Lu Baolu <[email protected]> wrote:
>
> Hi,
>
> On 3/4/19 11:46 PM, James Sewart wrote:
>> Allowing IOMMU_DOMAIN_DMA type IOMMU domain to be allocated allows the
>> default_domain of an iommu_group to be set. This delegates device-domain
>> relationships to the generic IOMMU code.
>> Signed-off-by: James Sewart <[email protected]>
>> ---
>> drivers/iommu/intel-iommu.c | 113 +++++++++++++++++++++++++++---------
>> 1 file changed, 84 insertions(+), 29 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 8e0a4e2ff77f..71cd6bbfec05 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -309,6 +309,14 @@ static int hw_pass_through = 1;
>> /* si_domain contains mulitple devices */
>> #define DOMAIN_FLAG_STATIC_IDENTITY (1 << 1)
>> +/* Domain managed externally, don't cleanup if it isn't attached
>> + * to any devices. */
>> +#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2)
>> +
>> +/* Set after domain initialisation. Used when allocating dma domains to
>> + * defer domain initialisation until it is attached to a device */
>> +#define DOMAIN_FLAG_INITIALISED (1 << 4)
>
> Why do you skip bit 3?

This was an oversight; I will update to use bit 3.

>
>> +
>> #define for_each_domain_iommu(idx, domain) \
>> for (idx = 0; idx < g_num_of_iommus; idx++) \
>> if (domain->iommu_refcnt[idx])
>> @@ -558,6 +566,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
>> DOMAIN_FLAG_STATIC_IDENTITY);
>> }
>> +static inline int domain_managed_externally(struct dmar_domain *domain)
>> +{
>> + return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
>> +}
>> +
>> +static inline int domain_is_initialised(struct dmar_domain *domain)
>> +{
>> + return domain->flags & DOMAIN_FLAG_INITIALISED;
>> +}
>> +
>> static inline int domain_pfn_supported(struct dmar_domain *domain,
>> unsigned long pfn)
>> {
>> @@ -1662,7 +1680,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)
>> __dmar_remove_one_dev_info(info);
>> - if (!domain_type_is_vm_or_si(domain)) {
>> + if (!domain_managed_externally(domain)) {
>> /*
>> * The domain_exit() function can't be called under
>> * device_domain_lock, as it takes this lock itself.
>> @@ -1895,6 +1913,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
>> domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>> if (!domain->pgd)
>> return -ENOMEM;
>> + domain->flags |= DOMAIN_FLAG_INITIALISED;
>> __iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
>> return 0;
>> }
>> @@ -2807,14 +2826,10 @@ static inline void iommu_prepare_isa(void)
>> static int md_domain_init(struct dmar_domain *domain, int guest_width);
>> -static int __init si_domain_init(int hw)
>> +static int __init si_domain_init(struct dmar_domain *si_domain, int hw)
>> {
>> int nid, ret = 0;
>> - si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
>> - if (!si_domain)
>> - return -EFAULT;
>> -
>> if (md_domain_init(si_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
>> domain_exit(si_domain);
>> return -EFAULT;
>> @@ -3417,9 +3432,16 @@ static int __init init_dmars(void)
>> check_tylersburg_isoch();
>> if (iommu_identity_mapping) {
>> - ret = si_domain_init(hw_pass_through);
>> - if (ret)
>> + si_domain = alloc_domain(DOMAIN_FLAG_STATIC_IDENTITY);
>
> Where will this si_domain be used? We still need to keep the global
> si_domain?

I am unsure of the best thing to do here. The si_domain can be initialised
as a hardware passthrough, which means it doesn’t have any mappings
applied. I think any user allocating a domain here will always want a
software passthrough domain. I’m not sure if/how to consolidate the two.
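
To be clear on terminology: by “software passthrough” I mean what
si_domain_init() builds when hw is 0, roughly (paraphrased, with error
handling trimmed):

	int nid, i;
	unsigned long start_pfn, end_pfn;

	for_each_online_node(nid) {
		for_each_mem_pfn_range(i, nid, &start_pfn, &end_pfn, NULL) {
			/* identity-map every usable page: IOVA == PA */
			if (iommu_domain_identity_map(si_domain,
						      PFN_PHYS(start_pfn),
						      PFN_PHYS(end_pfn)))
				return -EFAULT;
		}
	}

Hardware passthrough instead leaves the page tables empty and relies on
the context entries being set to pass-through mode.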

>
>> + if (!si_domain) {
>> + ret = -EFAULT;
>> goto free_iommu;
>> + }
>> + ret = si_domain_init(si_domain, hw_pass_through);
>> + if (ret) {
>> + domain_exit(si_domain);
>> + goto free_iommu;
>> + }
>> }
>> @@ -4575,7 +4597,7 @@ static int device_notifier(struct notifier_block *nb,
>> return 0;
>> dmar_remove_one_dev_info(domain, dev);
>> - if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
>> + if (!domain_managed_externally(domain) && list_empty(&domain->devices))
>> domain_exit(domain);
>> return 0;
>> @@ -5020,6 +5042,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
>> domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
>> if (!domain->pgd)
>> return -ENOMEM;
>> + domain->flags |= DOMAIN_FLAG_INITIALISED;
>> domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
>> return 0;
>> }
>> @@ -5028,28 +5051,54 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
>> {
>> struct dmar_domain *dmar_domain;
>> struct iommu_domain *domain;
>> + int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;
>> - if (type != IOMMU_DOMAIN_UNMANAGED)
>> - return NULL;
>> -
>> - dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
>> - if (!dmar_domain) {
>> - pr_err("Can't allocate dmar_domain\n");
>> - return NULL;
>> - }
>> - if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
>> - pr_err("Domain initialization failed\n");
>> - domain_exit(dmar_domain);
>> + switch (type) {
>> + case IOMMU_DOMAIN_UNMANAGED:
>> + flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
>> + dmar_domain = alloc_domain(flags);
>> + if (!dmar_domain) {
>> + pr_err("Can't allocate dmar_domain\n");
>> + return NULL;
>> + }
>> + if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
>> + pr_err("Domain initialization failed\n");
>> + domain_exit(dmar_domain);
>> + return NULL;
>> + }
>> + domain_update_iommu_cap(dmar_domain);
>> + domain = &dmar_domain->domain;
>> + domain->geometry.aperture_start = 0;
>> + domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
>> + domain->geometry.force_aperture = true;
>> + break;
>> + case IOMMU_DOMAIN_DMA:
>> + dmar_domain = alloc_domain(flags);
>> + if (!dmar_domain) {
>> + pr_err("Can't allocate dmar_domain\n");
>
> Not necessary for memory allocation failure.

I’ll update in the next revision

>
>> + return NULL;
>> + }
>> + // init domain in device attach when we know IOMMU capabilities
>
> Use /* */ for comments, please.

I’ll update in the next revision

>
>> + break;
>> + case IOMMU_DOMAIN_IDENTITY:
>> + flags |= DOMAIN_FLAG_STATIC_IDENTITY | DOMAIN_FLAG_INITIALISED;
>> + dmar_domain = alloc_domain(flags);
>> + if (!dmar_domain) {
>> + pr_err("Can't allocate dmar_domain\n");
>> + return NULL;
>> + }
>> + if (si_domain_init(dmar_domain, 0)) {
>> + pr_err("Domain initialization failed\n");
>> + domain_exit(dmar_domain);
>> + return NULL;
>> + }
>> + break;
>
> How about moving the si_domain allocation into a separate patch? There
> are two significant changes here, so it is worth a new patch for
> discussion:
>
> 1) a shared single si_domain vs. a per-group si_domain;
> 2) previously the vt-d driver determined what kind of devices should use
> the identity domain; now that decision moves to the generic iommu layer,
> so behavior will change.

I’ll move it into another patch. Do you think the global si_domain should
be returned here?

>> + default:
>> return NULL;
>> }
>> - domain_update_iommu_cap(dmar_domain);
>> -
>> - domain = &dmar_domain->domain;
>> - domain->geometry.aperture_start = 0;
>> - domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
>> - domain->geometry.force_aperture = true;
>> - return domain;
>> + dmar_domain->domain.type = type;
>
> Is it necessary to set this?

It isn’t; I’ll remove this in the next revision.

>
>> + return &dmar_domain->domain;
>> }
>> static void intel_iommu_domain_free(struct iommu_domain *domain)
>> @@ -5080,8 +5129,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>> dmar_remove_one_dev_info(old_domain, dev);
>> rcu_read_unlock();
>> - if (!domain_type_is_vm_or_si(old_domain) &&
>> - list_empty(&old_domain->devices))
>> + if (list_empty(&old_domain->devices) &&
>> + !domain_managed_externally(old_domain))
>> domain_exit(old_domain);
>> }
>> }
>> @@ -5090,6 +5139,12 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
>> if (!iommu)
>> return -ENODEV;
>> + // Initialise domain with IOMMU capabilities if it isn't already initialised
>
> Ditto.
>
>> + if (!domain_is_initialised(dmar_domain)) {
>> + if (domain_init(dmar_domain, iommu, DEFAULT_DOMAIN_ADDRESS_WIDTH))
>> + return -ENOMEM;
>> + }
>
> Each domain, no matter whether a DMA domain or an UNMANAGED domain, is
> allocated first and then gets initialized during device attach, right?
> If so, why do we need a special DOMAIN_FLAG_INITIALISED flag for DMA
> domains?

Only the DMA domain is initialised on attach; UNMANAGED and IDENTITY
domains are initialised with the lowest common denominator of the
parameters of all attached IOMMUs, via domain_update_iommu_cap.

>
>> +
>> /* check if this iommu agaw is sufficient for max mapped address */
>> addr_width = agaw_to_width(iommu->agaw);
>> if (addr_width > cap_mgaw(iommu->cap))
>
> Best regards,
> Lu Baolu

Cheers,
James.

2019-03-05 12:58:11

by James Sewart

Subject: Re: [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

Hey Lu,

The motivation for this is the buggy domain <-> IOMMU group relationship
when using find_or_alloc_domain. From what I have read it should be the
case that an IOMMU domain is shared by all devices in the same group, and
thus the same mappings. This is because the IOMMU group is determined by
things like ACS settings on pci switches, which determine whether devices
can talk to each other.

In find_or_alloc_domain we determine domain sharing based only on the
devices returned by pci_for_each_dma_alias, whereas there are many more
checks in pci_device_group (e.g. the ACS settings of intermediary pci
switches), which is what determines which IOMMU group a device is in.
This discontinuity means each device within an IOMMU group can end up
with a different domain.

We see issues because it is supposed to be safe to assume that devices
within a group are in the same address space, but if each device has its
own domain then any mapping created won’t be valid for the other devices,
and could even be mapped differently for each device. This could also
cause issues with a device whose RMRRs are not applied to the domains of
the other devices within its group: those regions could be remapped for
the other devices, resulting in unexpected behaviour during peer-to-peer
transfers.
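
To illustrate the mismatch, consider a hypothetical check (find_domain()
is internal to intel-iommu.c; pdev_a and pdev_b stand for two functions
behind a switch without ACS, so pci_device_group puts them in one group):

	struct iommu_group *ga = iommu_group_get(&pdev_a->dev);
	struct iommu_group *gb = iommu_group_get(&pdev_b->dev);

	/* one group implies one address space, yet the lazily
	 * allocated dmar_domains may differ */
	if (ga && ga == gb &&
	    find_domain(&pdev_a->dev) != find_domain(&pdev_b->dev))
		dev_warn(&pdev_a->dev,
			 "same iommu_group but different dmar_domains\n");

	iommu_group_put(ga);
	iommu_group_put(gb);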

Cheers,
James


> On 5 Mar 2019, at 06:05, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> Very glad to see this. Thank you!
>
> On 3/4/19 11:41 PM, James Sewart wrote:
>> Hey,
>> This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c.
>> This avoids the use of find_or_alloc_domain whose domain assignment is
>> inconsistent with the iommu grouping as determined by pci_device_group.
>
> Is this a bug fix or an improvement? What real issue will it cause if
> we go without this patch set?
>
> Best regards,
> Lu Baolu
>
>> Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the
>> iommu_ops api, allowing the default_domain of an iommu group to be set in
>> iommu.c. This domain will be attached to every device that is brought up
>> with an iommu group, and the device's reserved regions will be mapped using
>> regions returned by get_resv_regions.
>> In intel_iommu_domain_alloc we don’t know which IOMMU this domain will be
>> associated with, so we defer full initialisation until
>> intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will
>> try to map a device's reserved regions before attaching the domain, which
>> would cause issues if the domain is not fully initialised. This is
>> addressed in patch 1 by moving the mapping to after attaching.
>> Patch 2 implements the apply_resv_region callback, used in
>> iommu_group_create_direct_mappings to mark the reserved regions as
>> non-mappable for the dma_map_ops api.
>> Patch 4 removes the domain lazy allocation logic. Before this patch the
>> lazy allocation logic would not be used, as any domain allocated via
>> these paths would be replaced when attaching the group default domain.
>> Default domain allocation has been tested with and without this patch on
>> 4.19.
>> Cheers,
>> James.
>> James Sewart (4):
>> iommu: Move iommu_group_create_direct_mappings to after device_attach
>> iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
>> iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
>> iommu/vt-d: Remove lazy allocation of domains
>> drivers/iommu/intel-iommu.c | 329 ++++++++++++------------------------
>> drivers/iommu/iommu.c | 4 +-
>> 2 files changed, 108 insertions(+), 225 deletions(-)


2019-03-05 13:30:35

by James Sewart

Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hey Lu,

> On 5 Mar 2019, at 06:59, Lu Baolu <[email protected]> wrote:
>
> Hi,
>
> It's hard for me to understand why we remove the RMRR-related code in
> this patch.

The RMRR code removed here requires the lazy allocation of domains to
exist, as it is run before iommu.c would assign IOMMU groups and attach a
domain. Without patch 3, removing this code would mean the RMRR regions
are never mapped for any domain. With patch 3, iommu.c will allocate a
default domain for the group that a device is about to be put in, attach
the domain to the device, and then create an identity map for each region
returned by get_resv_regions; this is where the RMRRs are set up for the
default domain.
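
Roughly, the flow after this series becomes (paraphrasing iommu.c rather
than quoting it verbatim):

	/* on device add, for the device's group: */
	domain = group->default_domain;	/* from ops->domain_alloc(IOMMU_DOMAIN_DMA) */
	__iommu_attach_device(domain, dev);	/* vt-d finishes domain init here (patch 3) */
	iommu_group_create_direct_mappings(group, dev);	/* RMRRs mapped here (patches 1 and 2) */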

>
> Also, we now have two places that hold a domain for a device: the group
> and dev->info. We could consider optimizing the use of the per-device
> iommu arch data. That should be later work anyway.
>
> More comments inline.
>
> On 3/4/19 11:47 PM, James Sewart wrote:
>> The generic IOMMU code will allocate and attach a dma ops domain to each
>> device that comes online, replacing any lazily allocated domain. This
>> patch also removes RMRR application at iommu init time, as at that point
>> no domain will be attached to any device; iommu.c will apply these
>> mappings after attaching a device, via
>> iommu_group_create_direct_mappings.
>> Signed-off-by: James Sewart <[email protected]>
>> ---
>> drivers/iommu/intel-iommu.c | 202 ++----------------------------------
>> 1 file changed, 8 insertions(+), 194 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 71cd6bbfec05..282257e2628d 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>> return domain;
>> }
>> -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>> -{
>> - *(u16 *)opaque = alias;
>> - return 0;
>> -}
>> -
>> -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
>> -{
>> - struct device_domain_info *info = NULL;
>> - struct dmar_domain *domain = NULL;
>> - struct intel_iommu *iommu;
>> - u16 dma_alias;
>> - unsigned long flags;
>> - u8 bus, devfn;
>> -
>> - iommu = device_to_iommu(dev, &bus, &devfn);
>> - if (!iommu)
>> - return NULL;
>> -
>> - if (dev_is_pci(dev)) {
>> - struct pci_dev *pdev = to_pci_dev(dev);
>> -
>> - pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>> -
>> - spin_lock_irqsave(&device_domain_lock, flags);
>> - info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
>> - PCI_BUS_NUM(dma_alias),
>> - dma_alias & 0xff);
>> - if (info) {
>> - iommu = info->iommu;
>> - domain = info->domain;
>> - }
>> - spin_unlock_irqrestore(&device_domain_lock, flags);
>> -
>> - /* DMA alias already has a domain, use it */
>> - if (info)
>> - goto out;
>> - }
>> -
>> - /* Allocate and initialize new domain for the device */
>> - domain = alloc_domain(0);
>> - if (!domain)
>> - return NULL;
>> - if (domain_init(domain, iommu, gaw)) {
>> - domain_exit(domain);
>> - return NULL;
>> - }
>> -
>> -out:
>> -
>> - return domain;
>> -}
>> -
>> -static struct dmar_domain *set_domain_for_dev(struct device *dev,
>> - struct dmar_domain *domain)
>> -{
>> - struct intel_iommu *iommu;
>> - struct dmar_domain *tmp;
>> - u16 req_id, dma_alias;
>> - u8 bus, devfn;
>> -
>> - iommu = device_to_iommu(dev, &bus, &devfn);
>> - if (!iommu)
>> - return NULL;
>> -
>> - req_id = ((u16)bus << 8) | devfn;
>> -
>> - if (dev_is_pci(dev)) {
>> - struct pci_dev *pdev = to_pci_dev(dev);
>> -
>> - pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>> -
>> - /* register PCI DMA alias device */
>> - if (req_id != dma_alias) {
>> - tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
>> - dma_alias & 0xff, NULL, domain);
>> -
>> - if (!tmp || tmp != domain)
>> - return tmp;
>> - }
>> - }
>> -
>> - tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
>> - if (!tmp || tmp != domain)
>> - return tmp;
>> -
>> - return domain;
>> -}
>> -
>> -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
>> -{
>> - struct dmar_domain *domain, *tmp;
>> -
>> - domain = find_domain(dev);
>> - if (domain)
>> - goto out;
>> -
>> - domain = find_or_alloc_domain(dev, gaw);
>> - if (!domain)
>> - goto out;
>> -
>> - tmp = set_domain_for_dev(dev, domain);
>> - if (!tmp || domain != tmp) {
>> - domain_exit(domain);
>> - domain = tmp;
>> - }
>> -
>> -out:
>> -
>> - return domain;
>> -}
>> -
>> static int iommu_domain_identity_map(struct dmar_domain *domain,
>> unsigned long long start,
>> unsigned long long end)
>> @@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
>> struct dmar_domain *domain;
>> int ret;
>> - domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>> + domain = find_domain(dev);
>> if (!domain)
>> return -ENOMEM;
>> @@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
>> static int __init init_dmars(void)
>> {
>> struct dmar_drhd_unit *drhd;
>> - struct dmar_rmrr_unit *rmrr;
>> bool copied_tables = false;
>> - struct device *dev;
>> struct intel_iommu *iommu;
>> - int i, ret;
>> + int ret;
>> /*
>> * for each drhd
>> @@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
>> goto free_iommu;
>> }
>> }
>> - /*
>> - * For each rmrr
>> - * for each dev attached to rmrr
>> - * do
>> - * locate drhd for dev, alloc domain for dev
>> - * allocate free domain
>> - * allocate page table entries for rmrr
>> - * if context not allocated for bus
>> - * allocate and init context
>> - * set present in root table for this bus
>> - * init context with domain, translation etc
>> - * endfor
>> - * endfor
>> - */
>> - pr_info("Setting RMRR:\n");
>> - for_each_rmrr_units(rmrr) {
>> - /* some BIOS lists non-exist devices in DMAR table. */
>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>> - i, dev) {
>> - ret = iommu_prepare_rmrr_dev(rmrr, dev);
>> - if (ret)
>> - pr_err("Mapping reserved region failed\n");
>> - }
>> - }
>> -
>> - iommu_prepare_isa();
>
> Why do you want to remove this segment of code?

This will only work if the lazy allocation of domains exists; these
mappings will disappear once a default domain is attached to a device,
and are then remade by iommu_group_create_direct_mappings. This code is
redundant, and removing it allows us to remove all the lazy allocation
logic.

iommu_prepare_isa does need moving into get_resv_regions for its mappings
to be applied; this will need some refactoring.
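
Something along these lines could work (an untested sketch; the 16MB
range mirrors what iommu_prepare_isa() maps today, and the class check
is an assumption on my part):

	static void intel_iommu_get_resv_regions(struct device *dev,
						 struct list_head *head)
	{
		struct iommu_resv_region *reg;

		/* ... existing RMRR regions are added here ... */

		/* expose the ISA range so iommu.c identity-maps it */
		if (dev_is_pci(dev) &&
		    (to_pci_dev(dev)->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
			reg = iommu_alloc_resv_region(0, 16UL << 20, 0,
						      IOMMU_RESV_DIRECT);
			if (reg)
				list_add_tail(&reg->list, head);
		}
	}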

>
>> domains_done:
>> @@ -3580,53 +3440,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
>> return iova_pfn;
>> }
>> -struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
>> -{
>> - struct dmar_domain *domain, *tmp;
>> - struct dmar_rmrr_unit *rmrr;
>> - struct device *i_dev;
>> - int i, ret;
>> -
>> - domain = find_domain(dev);
>> - if (domain)
>> - goto out;
>> -
>> - domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>> - if (!domain)
>> - goto out;
>> -
>> - /* We have a new domain - setup possible RMRRs for the device */
>> - rcu_read_lock();
>> - for_each_rmrr_units(rmrr) {
>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>> - i, i_dev) {
>> - if (i_dev != dev)
>> - continue;
>> -
>> - ret = domain_prepare_identity_map(dev, domain,
>> - rmrr->base_address,
>> - rmrr->end_address);
>> - if (ret)
>> - dev_err(dev, "Mapping reserved region failed\n");
>
> We can't simply remove this segment of code, right? At least it should
> be moved to the domain allocation interface.

iommu_group_create_direct_mappings will take care of these mappings; this
code is not used once an externally managed domain (the group domain) is
attached to the device.
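
Concretely, the driver advertises the RMRRs through its get_resv_regions
callback, so iommu.c can recreate the identity maps with no VT-d-specific
RMRR loop. The in-tree callback differs in details, but its shape is
roughly this:

static void intel_iommu_get_resv_regions(struct device *device,
					 struct list_head *head)
{
	struct dmar_rmrr_unit *rmrr;
	struct device *i_dev;
	int i;

	rcu_read_lock();
	for_each_rmrr_units(rmrr) {
		for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
					  i, i_dev) {
			struct iommu_resv_region *reg;

			if (i_dev != device)
				continue;

			/* RMRRs must stay identity mapped, hence
			 * IOMMU_RESV_DIRECT */
			reg = iommu_alloc_resv_region(rmrr->base_address,
					rmrr->end_address -
						rmrr->base_address + 1,
					IOMMU_READ | IOMMU_WRITE,
					IOMMU_RESV_DIRECT);
			if (reg)
				list_add_tail(&reg->list, head);
		}
	}
	rcu_read_unlock();
}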

>
>> - }
>> - }
>> - rcu_read_unlock();
>> -
>> - tmp = set_domain_for_dev(dev, domain);
>> - if (!tmp || domain != tmp) {
>> - domain_exit(domain);
>> - domain = tmp;
>> - }
>> -
>> -out:
>> -
>> - if (!domain)
>> - pr_err("Allocating domain for %s failed\n", dev_name(dev));
>> -
>> -
>> - return domain;
>> -}
>> -
>> /* Check if the dev needs to go through non-identity map and unmap process.*/
>> static int iommu_no_mapping(struct device *dev)
>> {
>> @@ -3689,7 +3502,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
>> if (iommu_no_mapping(dev))
>> return paddr;
>> - domain = get_valid_domain_for_dev(dev);
>> + domain = find_domain(dev);
>> if (!domain)
>> return DMA_MAPPING_ERROR;
>> @@ -3753,7 +3566,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
>> return;
>> domain = find_domain(dev);
>> - BUG_ON(!domain);
>> + if (!domain)
>> + return;
>>
>
> This is not related to this patch. Let's do it in a separate patch.
>
>> iommu = domain_get_iommu(domain);
>> @@ -3899,7 +3713,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
>> if (iommu_no_mapping(dev))
>> return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>> - domain = get_valid_domain_for_dev(dev);
>> + domain = find_domain(dev);
>> if (!domain)
>> return 0;
>> @@ -5377,9 +5191,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
>> u64 ctx_lo;
>> int ret;
>> - domain = get_valid_domain_for_dev(sdev->dev);
>> + domain = find_domain(sdev->dev);
>> if (!domain)
>> - return -EINVAL;
>> + return -ENOMEM;
>
> This is not related to this patch. Let's do it in a separate patch.
>
>> spin_lock_irqsave(&device_domain_lock, flags);
>> spin_lock(&iommu->lock);
>
> Best regards,
> Lu Baolu


2019-03-06 06:33:42

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 0/4] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

Hi,

On 3/5/19 7:14 PM, James Sewart wrote:
> Hey Lu,
>
> The motivation for this is the buggy domain <-> IOMMU group relationship
> that results from using find_or_alloc_domain. From what I have read it
> should be the case that an IOMMU domain is shared by all devices in the
> same group, and thus the same mappings. This is because the IOMMU group is
> determined by things like ACS settings on pci switches, which determine
> whether devices can talk to each other.
Yes, exactly. This is a known issue.

>
> In find_or_alloc_domain we only determine domain sharing based on devices
> returned by pci_for_each_dma_alias, whereas there are many more checks in
> pci_device_group (e.g. ACS settings of intermediary pci switches), which
> is used for determining which IOMMU group a device is in. This
> discontinuity means that the devices within an IOMMU group can each end up
> with a different domain.

Yes. Agree again.

>
> We see issues because it is supposed to be safe to assume that devices
> within a group are in the same address space, but if each device has its
> own domain then any mapping created won’t be valid for the other devices,
> and could even be mapped differently for each device. This can also cause
> issues for a device whose RMRRs are not applied to the domains of the
> other devices within its group; those regions could be remapped for the
> other devices, resulting in unexpected behaviour during peer-to-peer
> transfers.

Yes, fair enough.

I asked this question because I am interested to know whether multiple
domains per group, due to the lack of ACS consideration, will cause any
issues that we need to fix in the stable kernels.

Best regards,
Lu Baolu

>
> Cheers,
> James
>
>
>> On 5 Mar 2019, at 06:05, Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> Very glad to see this. Thank you!
>>
>> On 3/4/19 11:41 PM, James Sewart wrote:
>>> Hey,
>>> This patchset moves IOMMU_DOMAIN_DMA iommu domain management to iommu.c.
>>> This avoids the use of find_or_alloc_domain whose domain assignment is
>>> inconsistent with the iommu grouping as determined by pci_device_group.
>>
>> Is this a bug fix or an improvement? What real issue will it cause
>> if we go without this patch set?
>>
>> Best regards,
>> Lu Baolu
>>
>>> Patch 3 permits domain type IOMMU_DOMAIN_DMA to be allocated via the
>>> iommu_ops api, allowing the default_domain of an iommu group to be set in
>>> iommu.c. This domain will be attached to every device that is brought up
>>> with an iommu group, and the devices reserved regions will be mapped using
>>> regions returned by get_resv_regions.
>>> In intel_iommu_domain_alloc we don’t know the IOMMU this domain will be
>>> associated with so we defer full initialisation until
>>> intel_iommu_attach_device. Currently iommu.c:iommu_group_add_device will
>>> try to map a devices reserved regions before attaching the domain which
>>> would cause issue if the domain is not fully initialised. This is
>>> addressed in patch 1 by moving the mapping to after attaching.
>>> Patch 2 implements function apply_resv_region, used in
>>> iommu_group_create_direct_mappings to mark the reserved regions as non
>>> mappable for the dma_map_ops api.
>>> Patch 4 removes the domain lazy allocation logic. Before this patch the
>>> lazy allocation logic would not be used as any domain allocated using
>>> these paths would be replaced when attaching the group default domain.
>>> Default domain allocation has been tested with and without this patch on
>>> 4.19.
>>> Cheers,
>>> James.
>>> James Sewart (4):
>>> iommu: Move iommu_group_create_direct_mappings to after device_attach
>>> iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
>>> iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated
>>> iommu/vt-d: Remove lazy allocation of domains
>>> drivers/iommu/intel-iommu.c | 329 ++++++++++++------------------------
>>> drivers/iommu/iommu.c | 4 +-
>>> 2 files changed, 108 insertions(+), 225 deletions(-)
>
>

2019-03-06 07:15:06

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hi,

On 3/5/19 7:46 PM, James Sewart wrote:
> Hey Lu,
>
>> On 5 Mar 2019, at 06:59, Lu Baolu <[email protected]> wrote:
>>
>> Hi,
>>
>> It's hard for me to understand why we remove the rmrr related
>> code in this patch.
>
> The RMRR code removed here requires the lazy allocation of domains to
> exist, as it is run before iommu.c would assign IOMMU groups and attach a
> domain. Before patch 3, removing this code would mean the RMRR regions are
> never mapped for a domain: iommu.c will allocate a default domain for the
> group that a device is about to be put in, attach the domain to the
> device, and then create an identity map for each region returned by
> get_resv_regions; this is where the RMRRs are set up for the default
> domain.
>>
>> And now we have two places that hold a domain for a device: the group and
>> dev->info. We can consider optimizing the use of per-device iommu
>> arch data. This can be later work anyway.
>>
>> More comments inline.
>>
>> On 3/4/19 11:47 PM, James Sewart wrote:
>>> The generic IOMMU code will allocate and attach a dma ops domain to each
>>> device that comes online, replacing any lazy allocated domain. Removes
>>> RMRR application at iommu init time as we won't have a domain attached
>>> to any device. iommu.c will do this after attaching a device using
>>> create_direct_mappings.
>>> Signed-off-by: James Sewart <[email protected]>
>>> ---
>>> drivers/iommu/intel-iommu.c | 202 ++----------------------------------
>>> 1 file changed, 8 insertions(+), 194 deletions(-)
>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>> index 71cd6bbfec05..282257e2628d 100644
>>> --- a/drivers/iommu/intel-iommu.c
>>> +++ b/drivers/iommu/intel-iommu.c
>>> @@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>>> return domain;
>>> }
>>> -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>>> -{
>>> - *(u16 *)opaque = alias;
>>> - return 0;
>>> -}
>>> -
>>> -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
>>> -{
>>> - struct device_domain_info *info = NULL;
>>> - struct dmar_domain *domain = NULL;
>>> - struct intel_iommu *iommu;
>>> - u16 dma_alias;
>>> - unsigned long flags;
>>> - u8 bus, devfn;
>>> -
>>> - iommu = device_to_iommu(dev, &bus, &devfn);
>>> - if (!iommu)
>>> - return NULL;
>>> -
>>> - if (dev_is_pci(dev)) {
>>> - struct pci_dev *pdev = to_pci_dev(dev);
>>> -
>>> - pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>>> -
>>> - spin_lock_irqsave(&device_domain_lock, flags);
>>> - info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
>>> - PCI_BUS_NUM(dma_alias),
>>> - dma_alias & 0xff);
>>> - if (info) {
>>> - iommu = info->iommu;
>>> - domain = info->domain;
>>> - }
>>> - spin_unlock_irqrestore(&device_domain_lock, flags);
>>> -
>>> - /* DMA alias already has a domain, use it */
>>> - if (info)
>>> - goto out;
>>> - }
>>> -
>>> - /* Allocate and initialize new domain for the device */
>>> - domain = alloc_domain(0);
>>> - if (!domain)
>>> - return NULL;
>>> - if (domain_init(domain, iommu, gaw)) {
>>> - domain_exit(domain);
>>> - return NULL;
>>> - }
>>> -
>>> -out:
>>> -
>>> - return domain;
>>> -}
>>> -
>>> -static struct dmar_domain *set_domain_for_dev(struct device *dev,
>>> - struct dmar_domain *domain)
>>> -{
>>> - struct intel_iommu *iommu;
>>> - struct dmar_domain *tmp;
>>> - u16 req_id, dma_alias;
>>> - u8 bus, devfn;
>>> -
>>> - iommu = device_to_iommu(dev, &bus, &devfn);
>>> - if (!iommu)
>>> - return NULL;
>>> -
>>> - req_id = ((u16)bus << 8) | devfn;
>>> -
>>> - if (dev_is_pci(dev)) {
>>> - struct pci_dev *pdev = to_pci_dev(dev);
>>> -
>>> - pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>>> -
>>> - /* register PCI DMA alias device */
>>> - if (req_id != dma_alias) {
>>> - tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
>>> - dma_alias & 0xff, NULL, domain);
>>> -
>>> - if (!tmp || tmp != domain)
>>> - return tmp;
>>> - }
>>> - }
>>> -
>>> - tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
>>> - if (!tmp || tmp != domain)
>>> - return tmp;
>>> -
>>> - return domain;
>>> -}
>>> -
>>> -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
>>> -{
>>> - struct dmar_domain *domain, *tmp;
>>> -
>>> - domain = find_domain(dev);
>>> - if (domain)
>>> - goto out;
>>> -
>>> - domain = find_or_alloc_domain(dev, gaw);
>>> - if (!domain)
>>> - goto out;
>>> -
>>> - tmp = set_domain_for_dev(dev, domain);
>>> - if (!tmp || domain != tmp) {
>>> - domain_exit(domain);
>>> - domain = tmp;
>>> - }
>>> -
>>> -out:
>>> -
>>> - return domain;
>>> -}
>>> -
>>> static int iommu_domain_identity_map(struct dmar_domain *domain,
>>> unsigned long long start,
>>> unsigned long long end)
>>> @@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
>>> struct dmar_domain *domain;
>>> int ret;
>>> - domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>>> + domain = find_domain(dev);
>>> if (!domain)
>>> return -ENOMEM;
>>> @@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
>>> static int __init init_dmars(void)
>>> {
>>> struct dmar_drhd_unit *drhd;
>>> - struct dmar_rmrr_unit *rmrr;
>>> bool copied_tables = false;
>>> - struct device *dev;
>>> struct intel_iommu *iommu;
>>> - int i, ret;
>>> + int ret;
>>> /*
>>> * for each drhd
>>> @@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
>>> goto free_iommu;
>>> }
>>> }
>>> - /*
>>> - * For each rmrr
>>> - * for each dev attached to rmrr
>>> - * do
>>> - * locate drhd for dev, alloc domain for dev
>>> - * allocate free domain
>>> - * allocate page table entries for rmrr
>>> - * if context not allocated for bus
>>> - * allocate and init context
>>> - * set present in root table for this bus
>>> - * init context with domain, translation etc
>>> - * endfor
>>> - * endfor
>>> - */
>>> - pr_info("Setting RMRR:\n");
>>> - for_each_rmrr_units(rmrr) {
>>> - /* some BIOS lists non-exist devices in DMAR table. */
>>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>> - i, dev) {
>>> - ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>> - if (ret)
>>> - pr_err("Mapping reserved region failed\n");
>>> - }
>>> - }
>>> -
>>> - iommu_prepare_isa();
>>
>> Why do you want to remove this segment of code?
>
> This will only work if the lazy allocation of domains exists; these
> mappings will disappear once a default domain is attached to a device and
> will then be remade by iommu_group_create_direct_mappings. This code is
> redundant, and removing it allows us to remove all the lazy allocation logic.

Not exactly.

We need to set up the rmrr mapping before enabling dma remapping;
otherwise, there will be a window between dma remapping being enabled and
the rmrr getting mapped, during which people might see dma faults.
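
To illustrate the window (pseudo-code only, not the actual call flow):

	iommu_enable_translation(iommu);   /* remapping live, rmrr unmapped */
	/* ... any rmrr DMA issued in here faults ... */
	map_rmrr_identity_regions(dev);    /* hypothetical; arrives too late */

The rmrr identity maps have to be in place before translation is
switched on.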

>
> iommu_prepare_isa does need moving to get_resv_regions for its mappings to
> be applied; this will need some refactoring.

Yes, agree.

>
>>
>>> domains_done:
>>> @@ -3580,53 +3440,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
>>> return iova_pfn;
>>> }
>>> -struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
>>> -{
>>> - struct dmar_domain *domain, *tmp;
>>> - struct dmar_rmrr_unit *rmrr;
>>> - struct device *i_dev;
>>> - int i, ret;
>>> -
>>> - domain = find_domain(dev);
>>> - if (domain)
>>> - goto out;
>>> -
>>> - domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>>> - if (!domain)
>>> - goto out;
>>> -
>>> - /* We have a new domain - setup possible RMRRs for the device */
>>> - rcu_read_lock();
>>> - for_each_rmrr_units(rmrr) {
>>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>> - i, i_dev) {
>>> - if (i_dev != dev)
>>> - continue;
>>> -
>>> - ret = domain_prepare_identity_map(dev, domain,
>>> - rmrr->base_address,
>>> - rmrr->end_address);
>>> - if (ret)
>>> - dev_err(dev, "Mapping reserved region failed\n");
>>
>> We can't simply remove this segment of code, right? At least it should
>> be moved to the domain allocation interface.
>
> iommu_group_create_direct_mappings will take care of these mappings; this
> code is not used once an externally managed domain (the group domain) is
> attached to the device.

Yes, this seems to be duplicated even with lazy domain allocation.

Best regards,
Lu Baolu

>
>>
>>> - }
>>> - }
>>> - rcu_read_unlock();
>>> -
>>> - tmp = set_domain_for_dev(dev, domain);
>>> - if (!tmp || domain != tmp) {
>>> - domain_exit(domain);
>>> - domain = tmp;
>>> - }
>>> -
>>> -out:
>>> -
>>> - if (!domain)
>>> - pr_err("Allocating domain for %s failed\n", dev_name(dev));
>>> -
>>> -
>>> - return domain;
>>> -}
>>> -
>>> /* Check if the dev needs to go through non-identity map and unmap process.*/
>>> static int iommu_no_mapping(struct device *dev)
>>> {
>>> @@ -3689,7 +3502,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
>>> if (iommu_no_mapping(dev))
>>> return paddr;
>>> - domain = get_valid_domain_for_dev(dev);
>>> + domain = find_domain(dev);
>>> if (!domain)
>>> return DMA_MAPPING_ERROR;
>>> @@ -3753,7 +3566,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
>>> return;
>>> domain = find_domain(dev);
>>> - BUG_ON(!domain);
>>> + if (!domain)
>>> + return;
>>>
>>
>> This is not related to this patch. Let's do it in a separate patch.
>>
>>> iommu = domain_get_iommu(domain);
>>> @@ -3899,7 +3713,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
>>> if (iommu_no_mapping(dev))
>>> return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>>> - domain = get_valid_domain_for_dev(dev);
>>> + domain = find_domain(dev);
>>> if (!domain)
>>> return 0;
>>> @@ -5377,9 +5191,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
>>> u64 ctx_lo;
>>> int ret;
>>> - domain = get_valid_domain_for_dev(sdev->dev);
>>> + domain = find_domain(sdev->dev);
>>> if (!domain)
>>> - return -EINVAL;
>>> + return -ENOMEM;
>>
>> This is not related to this patch. Let's do it in a separate patch.
>>
>>> spin_lock_irqsave(&device_domain_lock, flags);
>>> spin_lock(&iommu->lock);
>>
>> Best regards,
>> Lu Baolu
>
>

2019-03-06 18:36:42

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hey,

> On 6 Mar 2019, at 07:00, Lu Baolu <[email protected]> wrote:
>
> Hi,
>
> On 3/5/19 7:46 PM, James Sewart wrote:
>> Hey Lu,
>>> On 5 Mar 2019, at 06:59, Lu Baolu <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> It's hard for me to understand why we remove the rmrr related
>>> code in this patch.
>> The RMRR code removed here requires the lazy allocation of domains to
>> exist, as it is run before iommu.c would assign IOMMU groups and attach a
>> domain. Before patch 3, removing this code would mean the RMRR regions are
>> never mapped for a domain: iommu.c will allocate a default domain for the
>> group that a device is about to be put in, attach the domain to the
>> device, and then create an identity map for each region returned by
>> get_resv_regions; this is where the RMRRs are set up for the default
>> domain.
>>>
>>> And now we have two places that hold a domain for a device: the group and
>>> dev->info. We can consider optimizing the use of per-device iommu
>>> arch data. This can be later work anyway.
>>>
>>> More comments inline.
>>>
>>> On 3/4/19 11:47 PM, James Sewart wrote:
>>>> The generic IOMMU code will allocate and attach a dma ops domain to each
>>>> device that comes online, replacing any lazy allocated domain. Removes
>>>> RMRR application at iommu init time as we won't have a domain attached
>>>> to any device. iommu.c will do this after attaching a device using
>>>> create_direct_mappings.
>>>> Signed-off-by: James Sewart <[email protected]>
>>>> ---
>>>> drivers/iommu/intel-iommu.c | 202 ++----------------------------------
>>>> 1 file changed, 8 insertions(+), 194 deletions(-)
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index 71cd6bbfec05..282257e2628d 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -2595,118 +2595,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
>>>> return domain;
>>>> }
>>>> -static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
>>>> -{
>>>> - *(u16 *)opaque = alias;
>>>> - return 0;
>>>> -}
>>>> -
>>>> -static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
>>>> -{
>>>> - struct device_domain_info *info = NULL;
>>>> - struct dmar_domain *domain = NULL;
>>>> - struct intel_iommu *iommu;
>>>> - u16 dma_alias;
>>>> - unsigned long flags;
>>>> - u8 bus, devfn;
>>>> -
>>>> - iommu = device_to_iommu(dev, &bus, &devfn);
>>>> - if (!iommu)
>>>> - return NULL;
>>>> -
>>>> - if (dev_is_pci(dev)) {
>>>> - struct pci_dev *pdev = to_pci_dev(dev);
>>>> -
>>>> - pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>>>> -
>>>> - spin_lock_irqsave(&device_domain_lock, flags);
>>>> - info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
>>>> - PCI_BUS_NUM(dma_alias),
>>>> - dma_alias & 0xff);
>>>> - if (info) {
>>>> - iommu = info->iommu;
>>>> - domain = info->domain;
>>>> - }
>>>> - spin_unlock_irqrestore(&device_domain_lock, flags);
>>>> -
>>>> - /* DMA alias already has a domain, use it */
>>>> - if (info)
>>>> - goto out;
>>>> - }
>>>> -
>>>> - /* Allocate and initialize new domain for the device */
>>>> - domain = alloc_domain(0);
>>>> - if (!domain)
>>>> - return NULL;
>>>> - if (domain_init(domain, iommu, gaw)) {
>>>> - domain_exit(domain);
>>>> - return NULL;
>>>> - }
>>>> -
>>>> -out:
>>>> -
>>>> - return domain;
>>>> -}
>>>> -
>>>> -static struct dmar_domain *set_domain_for_dev(struct device *dev,
>>>> - struct dmar_domain *domain)
>>>> -{
>>>> - struct intel_iommu *iommu;
>>>> - struct dmar_domain *tmp;
>>>> - u16 req_id, dma_alias;
>>>> - u8 bus, devfn;
>>>> -
>>>> - iommu = device_to_iommu(dev, &bus, &devfn);
>>>> - if (!iommu)
>>>> - return NULL;
>>>> -
>>>> - req_id = ((u16)bus << 8) | devfn;
>>>> -
>>>> - if (dev_is_pci(dev)) {
>>>> - struct pci_dev *pdev = to_pci_dev(dev);
>>>> -
>>>> - pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
>>>> -
>>>> - /* register PCI DMA alias device */
>>>> - if (req_id != dma_alias) {
>>>> - tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
>>>> - dma_alias & 0xff, NULL, domain);
>>>> -
>>>> - if (!tmp || tmp != domain)
>>>> - return tmp;
>>>> - }
>>>> - }
>>>> -
>>>> - tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
>>>> - if (!tmp || tmp != domain)
>>>> - return tmp;
>>>> -
>>>> - return domain;
>>>> -}
>>>> -
>>>> -static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
>>>> -{
>>>> - struct dmar_domain *domain, *tmp;
>>>> -
>>>> - domain = find_domain(dev);
>>>> - if (domain)
>>>> - goto out;
>>>> -
>>>> - domain = find_or_alloc_domain(dev, gaw);
>>>> - if (!domain)
>>>> - goto out;
>>>> -
>>>> - tmp = set_domain_for_dev(dev, domain);
>>>> - if (!tmp || domain != tmp) {
>>>> - domain_exit(domain);
>>>> - domain = tmp;
>>>> - }
>>>> -
>>>> -out:
>>>> -
>>>> - return domain;
>>>> -}
>>>> -
>>>> static int iommu_domain_identity_map(struct dmar_domain *domain,
>>>> unsigned long long start,
>>>> unsigned long long end)
>>>> @@ -2779,7 +2667,7 @@ static int iommu_prepare_identity_map(struct device *dev,
>>>> struct dmar_domain *domain;
>>>> int ret;
>>>> - domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>>>> + domain = find_domain(dev);
>>>> if (!domain)
>>>> return -ENOMEM;
>>>> @@ -3301,11 +3189,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
>>>> static int __init init_dmars(void)
>>>> {
>>>> struct dmar_drhd_unit *drhd;
>>>> - struct dmar_rmrr_unit *rmrr;
>>>> bool copied_tables = false;
>>>> - struct device *dev;
>>>> struct intel_iommu *iommu;
>>>> - int i, ret;
>>>> + int ret;
>>>> /*
>>>> * for each drhd
>>>> @@ -3466,32 +3352,6 @@ static int __init init_dmars(void)
>>>> goto free_iommu;
>>>> }
>>>> }
>>>> - /*
>>>> - * For each rmrr
>>>> - * for each dev attached to rmrr
>>>> - * do
>>>> - * locate drhd for dev, alloc domain for dev
>>>> - * allocate free domain
>>>> - * allocate page table entries for rmrr
>>>> - * if context not allocated for bus
>>>> - * allocate and init context
>>>> - * set present in root table for this bus
>>>> - * init context with domain, translation etc
>>>> - * endfor
>>>> - * endfor
>>>> - */
>>>> - pr_info("Setting RMRR:\n");
>>>> - for_each_rmrr_units(rmrr) {
>>>> - /* some BIOS lists non-exist devices in DMAR table. */
>>>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>> - i, dev) {
>>>> - ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>> - if (ret)
>>>> - pr_err("Mapping reserved region failed\n");
>>>> - }
>>>> - }
>>>> -
>>>> - iommu_prepare_isa();
>>>
>>> Why do you want to remove this segment of code?
>> This will only work if the lazy allocation of domains exists; these
>> mappings will disappear once a default domain is attached to a device and
>> will then be remade by iommu_group_create_direct_mappings. This code is
>> redundant, and removing it allows us to remove all the lazy allocation logic.
>
> Not exactly.
>
> We need to set up the rmrr mapping before enabling dma remapping;
> otherwise, there will be a window between dma remapping being enabled and
> the rmrr getting mapped, during which people might see dma faults.

Do you think this patch instead should be a refactoring to simplify this
initial domain setup before the default domain takes over? It seems like
duplicated effort to have both lazily allocated domains and externally
managed domains. We could allocate a domain here for any devices with RMRR
and call get_resv_regions to avoid duplicating the RMRR loop.
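
Something like this, perhaps (rough, untested sketch; the function name
is made up):

/* Identity map a device's reserved regions into @domain via the
 * driver's own resv-region callbacks, instead of open-coding the RMRR
 * loop again. */
static int intel_map_resv_regions(struct device *dev,
				  struct dmar_domain *domain)
{
	struct iommu_resv_region *entry;
	LIST_HEAD(resv_regions);
	int ret = 0;

	intel_iommu_get_resv_regions(dev, &resv_regions);
	list_for_each_entry(entry, &resv_regions, list) {
		if (entry->type != IOMMU_RESV_DIRECT)
			continue;

		ret = iommu_domain_identity_map(domain, entry->start,
				entry->start + entry->length - 1);
		if (ret)
			break;
	}
	intel_iommu_put_resv_regions(dev, &resv_regions);

	return ret;
}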

>
>> iommu_prepare_isa does need moving to get_resv_regions for its mappings to
>> be applied; this will need some refactoring.
>
> Yes, agree.
>
>>>
>>>> domains_done:
>>>> @@ -3580,53 +3440,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
>>>> return iova_pfn;
>>>> }
>>>> -struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
>>>> -{
>>>> - struct dmar_domain *domain, *tmp;
>>>> - struct dmar_rmrr_unit *rmrr;
>>>> - struct device *i_dev;
>>>> - int i, ret;
>>>> -
>>>> - domain = find_domain(dev);
>>>> - if (domain)
>>>> - goto out;
>>>> -
>>>> - domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
>>>> - if (!domain)
>>>> - goto out;
>>>> -
>>>> - /* We have a new domain - setup possible RMRRs for the device */
>>>> - rcu_read_lock();
>>>> - for_each_rmrr_units(rmrr) {
>>>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>> - i, i_dev) {
>>>> - if (i_dev != dev)
>>>> - continue;
>>>> -
>>>> - ret = domain_prepare_identity_map(dev, domain,
>>>> - rmrr->base_address,
>>>> - rmrr->end_address);
>>>> - if (ret)
>>>> - dev_err(dev, "Mapping reserved region failed\n");
>>>
>>> We can't simply remove this segment of code, right? At least it should
>>> be moved to the domain allocation interface.
>> iommu_group_create_direct_mappings will take care of these mappings; this
>> code is not used once an externally managed domain (the group domain) is
>> attached to the device.
>
> Yes, this seems to be duplicated even with lazy domain allocation.
>
> Best regards,
> Lu Baolu
>
>>>
>>>> - }
>>>> - }
>>>> - rcu_read_unlock();
>>>> -
>>>> - tmp = set_domain_for_dev(dev, domain);
>>>> - if (!tmp || domain != tmp) {
>>>> - domain_exit(domain);
>>>> - domain = tmp;
>>>> - }
>>>> -
>>>> -out:
>>>> -
>>>> - if (!domain)
>>>> - pr_err("Allocating domain for %s failed\n", dev_name(dev));
>>>> -
>>>> -
>>>> - return domain;
>>>> -}
>>>> -
>>>> /* Check if the dev needs to go through non-identity map and unmap process.*/
>>>> static int iommu_no_mapping(struct device *dev)
>>>> {
>>>> @@ -3689,7 +3502,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
>>>> if (iommu_no_mapping(dev))
>>>> return paddr;
>>>> - domain = get_valid_domain_for_dev(dev);
>>>> + domain = find_domain(dev);
>>>> if (!domain)
>>>> return DMA_MAPPING_ERROR;
>>>> @@ -3753,7 +3566,8 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)
>>>> return;
>>>> domain = find_domain(dev);
>>>> - BUG_ON(!domain);
>>>> + if (!domain)
>>>> + return;
>>>>
>>>
>>> This is not related to this patch. Let's do it in a separate patch.
>>>
>>>> iommu = domain_get_iommu(domain);
>>>> @@ -3899,7 +3713,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
>>>> if (iommu_no_mapping(dev))
>>>> return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>>>> - domain = get_valid_domain_for_dev(dev);
>>>> + domain = find_domain(dev);
>>>> if (!domain)
>>>> return 0;
>>>> @@ -5377,9 +5191,9 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
>>>> u64 ctx_lo;
>>>> int ret;
>>>> - domain = get_valid_domain_for_dev(sdev->dev);
>>>> + domain = find_domain(sdev->dev);
>>>> if (!domain)
>>>> - return -EINVAL;
>>>> + return -ENOMEM;
>>>
>>> This is not related to this patch. Let's do it in a separate patch.
>>>
>>>> spin_lock_irqsave(&device_domain_lock, flags);
>>>> spin_lock(&iommu->lock);
>>>
>>> Best regards,
>>> Lu Baolu

Cheers,
James.


2019-03-07 06:37:21

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hi James,

On 3/7/19 2:08 AM, James Sewart wrote:
>>>>> - /*
>>>>> - * For each rmrr
>>>>> - * for each dev attached to rmrr
>>>>> - * do
>>>>> - * locate drhd for dev, alloc domain for dev
>>>>> - * allocate free domain
>>>>> - * allocate page table entries for rmrr
>>>>> - * if context not allocated for bus
>>>>> - * allocate and init context
>>>>> - * set present in root table for this bus
>>>>> - * init context with domain, translation etc
>>>>> - * endfor
>>>>> - * endfor
>>>>> - */
>>>>> - pr_info("Setting RMRR:\n");
>>>>> - for_each_rmrr_units(rmrr) {
>>>>> - /* some BIOS lists non-exist devices in DMAR table. */
>>>>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>>> - i, dev) {
>>>>> - ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>>> - if (ret)
>>>>> - pr_err("Mapping reserved region failed\n");
>>>>> - }
>>>>> - }
>>>>> -
>>>>> - iommu_prepare_isa();
>>>> Why do you want to remove this segment of code?
>>> This will only work if the lazy allocation of domains exists; these
>>> mappings will disappear once a default domain is attached to a device and
>>> will then be remade by iommu_group_create_direct_mappings. This code is
>>> redundant, and removing it allows us to remove all the lazy allocation logic.
>> Not exactly.
>>
>> We need to set up the rmrr mapping before enabling dma remapping;
>> otherwise, there will be a window between dma remapping being enabled and
>> the rmrr getting mapped, during which people might see dma faults.
> Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazily allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating the RMRR loop.
>

Agree. We should replace the lazily allocated domains with the default
domain in a clean way. Actually, your patches look great to me. But I do
think we need further cleanups. The rmrr code is one example, and the
identity domain (si_domain) is another.

Do you mind if I work on top of your patches for further cleanups and
sign off a v2 together with you?

Best regards,
Lu Baolu

2019-03-07 10:23:26

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hey Lu,

> On 7 Mar 2019, at 06:31, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> On 3/7/19 2:08 AM, James Sewart wrote:
>>>>>> - /*
>>>>>> - * For each rmrr
>>>>>> - * for each dev attached to rmrr
>>>>>> - * do
>>>>>> - * locate drhd for dev, alloc domain for dev
>>>>>> - * allocate free domain
>>>>>> - * allocate page table entries for rmrr
>>>>>> - * if context not allocated for bus
>>>>>> - * allocate and init context
>>>>>> - * set present in root table for this bus
>>>>>> - * init context with domain, translation etc
>>>>>> - * endfor
>>>>>> - * endfor
>>>>>> - */
>>>>>> - pr_info("Setting RMRR:\n");
>>>>>> - for_each_rmrr_units(rmrr) {
>>>>>> - /* some BIOS lists non-exist devices in DMAR table. */
>>>>>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>>>> - i, dev) {
>>>>>> - ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>>>> - if (ret)
>>>>>> - pr_err("Mapping reserved region failed\n");
>>>>>> - }
>>>>>> - }
>>>>>> -
>>>>>> - iommu_prepare_isa();
>>>>> Why do you want to remove this segment of code?
>>>> This will only work if the lazy allocation of domains exists; these
>>>> mappings will disappear once a default domain is attached to a device and
>>>> will then be remade by iommu_group_create_direct_mappings. This code is
>>>> redundant, and removing it allows us to remove all the lazy allocation logic.
>>> Not exactly.
>>>
>>> We need to set up the rmrr mapping before enabling dma remapping;
>>> otherwise, there will be a window between dma remapping being enabled and
>>> the rmrr getting mapped, during which people might see dma faults.
>> Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazily allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating the RMRR loop.
>
> Agree. We should replace the lazily allocated domains with the default
> domain in a clean way. Actually, your patches look great to me. But I do
> think we need further cleanups. The rmrr code is one example, and the
> identity domain (si_domain) is another.
>
> Do you mind if I work on top of your patches for further cleanups and
> sign off a v2 together with you?

Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
iommu_prepare_isa into get_resv_regions. This should make the initial
domain logic here quite concise.
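
For the ISA part I'm picturing roughly this inside
intel_iommu_get_resv_regions (sketch only; the exact prot flags and
bounds still need checking):

	if (dev_is_pci(device)) {
		struct pci_dev *pdev = to_pci_dev(device);

		/* Replaces iommu_prepare_isa(): ISA bridges get the first
		 * 16MB as a direct-mapped region. */
		if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
			struct iommu_resv_region *reg;

			reg = iommu_alloc_resv_region(0, 16UL << 20,
						IOMMU_READ | IOMMU_WRITE,
						IOMMU_RESV_DIRECT);
			if (reg)
				list_add_tail(&reg->list, head);
		}
	}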

>
> Best regards,
> Lu Baolu

Cheers,
James.

2019-03-08 01:15:56

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hi,

On 3/7/19 6:21 PM, James Sewart wrote:
> Hey Lu,
>
>> On 7 Mar 2019, at 06:31, Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> On 3/7/19 2:08 AM, James Sewart wrote:
>>>>>>> - /*
>>>>>>> - * For each rmrr
>>>>>>> - * for each dev attached to rmrr
>>>>>>> - * do
>>>>>>> - * locate drhd for dev, alloc domain for dev
>>>>>>> - * allocate free domain
>>>>>>> - * allocate page table entries for rmrr
>>>>>>> - * if context not allocated for bus
>>>>>>> - * allocate and init context
>>>>>>> - * set present in root table for this bus
>>>>>>> - * init context with domain, translation etc
>>>>>>> - * endfor
>>>>>>> - * endfor
>>>>>>> - */
>>>>>>> - pr_info("Setting RMRR:\n");
>>>>>>> - for_each_rmrr_units(rmrr) {
>>>>>>> - /* some BIOS lists non-exist devices in DMAR table. */
>>>>>>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>>>>> - i, dev) {
>>>>>>> - ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>>>>> - if (ret)
>>>>>>> - pr_err("Mapping reserved region failed\n");
>>>>>>> - }
>>>>>>> - }
>>>>>>> -
>>>>>>> - iommu_prepare_isa();
>>>>>> Why do you want to remove this segment of code?
>>>>> This will only work if the lazy allocation of domains exists; these
>>>>> mappings will disappear once a default domain is attached to a device and
>>>>> will then be remade by iommu_group_create_direct_mappings. This code is
>>>>> redundant, and removing it allows us to remove all the lazy allocation logic.
>>>> Not exactly.
>>>>
>>>> We need to set up the rmrr mapping before enabling dma remapping;
>>>> otherwise, there will be a window between dma remapping being enabled and
>>>> the rmrr getting mapped, during which people might see dma faults.
>>> Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazily allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating the RMRR loop.
>>
>> Agree. We should replace the lazily allocated domains with the default
>> domain in a clean way. Actually, your patches look great to me. But I do
>> think we need further cleanups. The rmrr code is one example, and the
>> identity domain (si_domain) is another.
>>
>> Do you mind if I work on top of your patches for further cleanups and
>> sign off a v2 together with you?
>
> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
> iommu_prepare_isa into get_resv_regions. This should make the initial
> domain logic here quite concise.

Much appreciated! Thank you!

Best regards,
Lu Baolu

2019-03-08 01:23:06

by Dmitry Safonov

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated

On 3/4/19 3:46 PM, James Sewart wrote:
> +static inline int domain_is_initialised(struct dmar_domain *domain)
> +{
> + return domain->flags & DOMAIN_FLAG_INITIALISED;
> +}

Maybe check it in intel_iommu_domain_free(), eh?
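
I.e. something like this (untested; assumes a never-initialised domain
can be handed straight back with free_domain_mem()):

static void intel_iommu_domain_free(struct iommu_domain *domain)
{
	struct dmar_domain *dmar_domain = to_dmar_domain(domain);

	/* A DMA domain that was never attached has no iommu state behind
	 * it yet, so the full teardown would be wrong. */
	if (domain_is_initialised(dmar_domain))
		domain_exit(dmar_domain);
	else
		free_domain_mem(dmar_domain);
}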

Thanks,
Dmitry

2019-03-08 03:15:59

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hi James,

On 3/7/19 6:21 PM, James Sewart wrote:
> Hey Lu,
>
>> On 7 Mar 2019, at 06:31, Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> On 3/7/19 2:08 AM, James Sewart wrote:
>>>>>>> - /*
>>>>>>> - * For each rmrr
>>>>>>> - * for each dev attached to rmrr
>>>>>>> - * do
>>>>>>> - * locate drhd for dev, alloc domain for dev
>>>>>>> - * allocate free domain
>>>>>>> - * allocate page table entries for rmrr
>>>>>>> - * if context not allocated for bus
>>>>>>> - * allocate and init context
>>>>>>> - * set present in root table for this bus
>>>>>>> - * init context with domain, translation etc
>>>>>>> - * endfor
>>>>>>> - * endfor
>>>>>>> - */
>>>>>>> - pr_info("Setting RMRR:\n");
>>>>>>> - for_each_rmrr_units(rmrr) {
>>>>>>> - /* some BIOS lists non-exist devices in DMAR table. */
>>>>>>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>>>>> - i, dev) {
>>>>>>> - ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>>>>> - if (ret)
>>>>>>> - pr_err("Mapping reserved region failed\n");
>>>>>>> - }
>>>>>>> - }
>>>>>>> -
>>>>>>> - iommu_prepare_isa();
>>>>>> Why do you want to remove this segment of code?
>>>>> This will only work if the lazy allocation of domains exists; these
>>>>> mappings will disappear once a default domain is attached to a device and
>>>>> will then be remade by iommu_group_create_direct_mappings. This code is
>>>>> redundant, and removing it allows us to remove all the lazy allocation logic.
>>>> Not exactly.
>>>>
>>>> We need to set up the rmrr mapping before enabling dma remapping;
>>>> otherwise, there will be a window between dma remapping being enabled and
>>>> the rmrr getting mapped, during which people might see dma faults.
>>> Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazily allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating the RMRR loop.
>>
>> Agree. We should replace the lazily allocated domains with the default
>> domain in a clean way. Actually, your patches look great to me. But I do
>> think we need further cleanups. The rmrr code is one example, and the
>> identity domain (si_domain) is another.
>>
>> Do you mind if I work on top of your patches for further cleanups and
>> sign off a v2 together with you?
>
> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
> iommu_prepare_isa into get_resv_regions. This should make the initial
> domain logic here quite concise.
>

Attached are three extra patches which I think should be added before
PATCH 3/4, and some further cleanup changes which you can merge with
PATCH 4/4.

----------------

0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch

These two patches aim to add a generic method for vendor-specific iommu
drivers to specify the type of the default domain for a device. The Intel
iommu driver will register an ops entry for this, since it has had its own
identity-map adjudicator for a long time.
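
The rough shape of the idea (the names below are placeholders, not
necessarily what the attached patches use):

/* hypothetical new entry in struct iommu_ops: */
	int (*def_domain_type)(struct device *dev);

/* VT-d can then reuse its existing adjudicator, roughly: */
static int intel_iommu_def_domain_type(struct device *dev)
{
	if (iommu_no_mapping(dev))
		return IOMMU_DOMAIN_IDENTITY;

	return IOMMU_DOMAIN_DMA;
}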

----------------

0008-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch

This aims to address the requirement that the rmrr identity maps be in
place before DMA remapping is enabled. With the default domain mechanism,
the default domain will be allocated and attached to the device in
bus_set_iommu() during boot, so moving the enabling of the DMA remapping
engines to after bus_set_iommu() fixes the rmrr issue.

----------------

0009-return-si_domain-directly.patch

I suggest that we keep the current si_domain implementation, since
changing si_domain is not the purpose of this patch set. Please merge
this with PATCH 3/4 if you like it.

----------------

0010-iommu-vt-d-remove-floopy-workaround.patch
0011-iommu-vt-d-remove-prepare-rmrr-helpers.patch
0012-iommu-vt-d-remove-prepare-identity-map-during-boot.patch

The above patches are further cleanups that result from switching to the
default domain. Please consider merging them with PATCH 4/4.

----------------

I have done some sanity checks on my local machine against all patches.
I can do more tests after you submit a v2.


Best regards,
Lu Baolu



Attachments:
0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch (2.51 kB)
0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch (1.31 kB)
0008-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch (1.95 kB)
0009-return-si_domain-directly.patch (2.21 kB)
0010-iommu-vt-d-remove-floopy-workaround.patch (1.34 kB)
0011-iommu-vt-d-remove-prepare-rmrr-helpers.patch (2.87 kB)
0012-iommu-vt-d-remove-prepare-identity-map-during-boot.patch (1.09 kB)

2019-03-08 16:58:50

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hey Lu,

> On 8 Mar 2019, at 03:09, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> On 3/7/19 6:21 PM, James Sewart wrote:
>> Hey Lu,
>>> On 7 Mar 2019, at 06:31, Lu Baolu <[email protected]> wrote:
>>>
>>> Hi James,
>>>
>>> On 3/7/19 2:08 AM, James Sewart wrote:
>>>>>>>> - /*
>>>>>>>> - * For each rmrr
>>>>>>>> - * for each dev attached to rmrr
>>>>>>>> - * do
>>>>>>>> - * locate drhd for dev, alloc domain for dev
>>>>>>>> - * allocate free domain
>>>>>>>> - * allocate page table entries for rmrr
>>>>>>>> - * if context not allocated for bus
>>>>>>>> - * allocate and init context
>>>>>>>> - * set present in root table for this bus
>>>>>>>> - * init context with domain, translation etc
>>>>>>>> - * endfor
>>>>>>>> - * endfor
>>>>>>>> - */
>>>>>>>> - pr_info("Setting RMRR:\n");
>>>>>>>> - for_each_rmrr_units(rmrr) {
>>>>>>>> - /* some BIOS lists non-exist devices in DMAR table. */
>>>>>>>> - for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
>>>>>>>> - i, dev) {
>>>>>>>> - ret = iommu_prepare_rmrr_dev(rmrr, dev);
>>>>>>>> - if (ret)
>>>>>>>> - pr_err("Mapping reserved region failed\n");
>>>>>>>> - }
>>>>>>>> - }
>>>>>>>> -
>>>>>>>> - iommu_prepare_isa();
>>>>>>> Why do you want to remove this segment of code?
>>>>>> This will only work if the lazy allocation of domains exists; these
>>>>>> mappings will disappear once a default domain is attached to a device and
>>>>>> will then be remade by iommu_group_create_direct_mappings. This code is
>>>>>> redundant, and removing it allows us to remove all the lazy allocation logic.
>>>>> Not exactly.
>>>>>
>>>>> We need to set up the rmrr mapping before enabling dma remapping;
>>>>> otherwise, there will be a window between dma remapping being enabled and
>>>>> the rmrr getting mapped, during which people might see dma faults.
>>>> Do you think this patch instead should be a refactoring to simplify this initial domain setup before the default domain takes over? It seems like duplicated effort to have both lazily allocated domains and externally managed domains. We could allocate a domain here for any devices with RMRR and call get_resv_regions to avoid duplicating the RMRR loop.
>>>
>>> Agree. We should replace the lazily allocated domains with the default
>>> domain in a clean way. Actually, your patches look great to me. But I do
>>> think we need further cleanups. The rmrr code is one example, and the
>>> identity domain (si_domain) is another.
>>>
>>> Do you mind if I work on top of your patches for further cleanups and
>>> sign off a v2 together with you?
>> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
>> iommu_prepare_isa into get_resv_regions. This should make the initial
>> domain logic here quite concise.
>
> Attached are three extra patches which I think should be added before
> PATCH 3/4, and some further cleanup changes which you can merge with
> PATCH 4/4.
>
> ----------------
>
> 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
> 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch
>
> These two patches aim to add a generic method for vendor-specific iommu
> drivers to specify the type of the default domain for a device. The Intel
> iommu driver will register an ops entry for this, since it has had its own
> identity-map adjudicator for a long time.

This seems like a good idea, but as domain alloc is only called for the
default domain on the first device attached to a group, we may miss
checking whether a device added later should have an identity domain.
Should there be paths to downgrade a group's domain if one of its devices
requires one?

>
> ----------------
>
> 0008-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch
>
> This aims to address the requirement that the rmrr identity maps be in
> place before DMA remapping is enabled. With the default domain mechanism,
> the default domain will be allocated and attached to the device in
> bus_set_iommu() during boot, so moving the enabling of the DMA remapping
> engines to after bus_set_iommu() fixes the rmrr issue.

That's pretty neat, avoids any temporary domain allocation, nice!

>
> ----------------
>
> 0009-return-si_domain-directly.patch
>
> I suggest that we keep the current si_domain implementation, since
> changing si_domain is not the purpose of this patch set. Please merge
> this with PATCH 3/4 if you like it.

Seems reasonable.

>
> ----------------
>
> 0010-iommu-vt-d-remove-floopy-workaround.patch
> 0011-iommu-vt-d-remove-prepare-rmrr-helpers.patch
> 0012-iommu-vt-d-remove-prepare-identity-map-during-boot.patch
>
> The above patches are further cleanups that result from switching to the
> default domain. Please consider merging them with PATCH 4/4.

Nice, good catch.

>
> ----------------
>
> I have done some sanity checks on my local machine against all patches.
> I can do more tests after you submit a v2.
>
>
> Best regards,
> Lu Baolu
>
>
> <0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch><0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch><0008-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch><0009-return-si_domain-directly.patch><0010-iommu-vt-d-remove-floopy-workaround.patch><0011-iommu-vt-d-remove-prepare-rmrr-helpers.patch><0012-iommu-vt-d-remove-prepare-identity-map-during-boot.patch>

I have revised my patches and hope to do some testing before reposting.

Cheers,
James.


2019-03-09 01:59:33

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hi James,

On 3/9/19 12:57 AM, James Sewart wrote:
> Hey Lu,
>
>> On 8 Mar 2019, at 03:09, Lu Baolu<[email protected]> wrote:
>>>>
>>>> Do you mind if I work on top of your patches for further cleanups and
>>>> sign off a v2 together with you?
>>> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
>>> iommu_prepare_isa into get_resv_regions. This should make the initial
>>> domain logic here quite concise.
>> Attached are three extra patches which I think should be added before
>> PATCH 3/4, and some further cleanup changes which you can merge with
>> PATCH 4/4.
>>
>> ----------------
>>
>> 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
>> 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch
>>
>> These two patches aim to add a generic method for vendor-specific iommu
>> drivers to specify the type of the default domain for a device. The Intel
>> iommu driver will register an ops entry for this, since it has had its own
>> identity-map adjudicator for a long time.
> This seems like a good idea, but as domain alloc is only called for the
> default domain on the first device attached to a group, we may miss
> checking whether a device added later should have an identity domain.
> Should there be paths to downgrade a group's domain if one of its devices
> requires one?
>

Good catch!

This is supposed to be handled in iommu_no_mapping(). But obviously the
current code sticks to lazy domain allocation. I'm not sure whether any
such cases exist in practice, but we should handle them in a clean way.
My idea is that we could downgrade to multiple domains per group (just
like what we have now) in this case and print a kernel message for it.

Or any other thoughts?

Best regards,
Lu Baolu

2019-03-09 11:51:42

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hey Lu,

> On 9 Mar 2019, at 01:53, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> On 3/9/19 12:57 AM, James Sewart wrote:
>> Hey Lu,
>>> On 8 Mar 2019, at 03:09, Lu Baolu<[email protected]> wrote:
>>>>>
>>>>> Do you mind if I work on top of your patches for further cleanups and
>>>>> sign off a v2 together with you?
>>>> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
>>>> iommu_prepare_isa into get_resv_regions. This should make the initial
>>>> domain logic here quite concise.
>>> Attached are three extra patches which I think should be added before
>>> PATCH 3/4, and some further cleanup changes which you can merge with
>>> PATCH 4/4.
>>>
>>> ----------------
>>>
>>> 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
>>> 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch
>>>
>>> These two patches aim to add a generic method for vendor-specific iommu
>>> drivers to specify the type of the default domain for a device. The Intel
>>> iommu driver will register an ops entry for this, since it has had its own
>>> identity-map adjudicator for a long time.
>> This seems like a good idea, but as domain alloc is only called for the
>> default domain on the first device attached to a group, we may miss
>> checking whether a device added later should have an identity domain.
>> Should there be paths to downgrade a group's domain if one of its devices
>> requires one?
>
> Good catch!
>
> This is supposed to be handled in iommu_no_mapping(). But obviously the
> current code sticks to lazy domain allocation. I'm not sure whether any
> such cases exist in practice, but we should handle them in a clean way.
> My idea is that we could downgrade to multiple domains per group (just
> like what we have now) in this case and print a kernel message for it.

I think if a device requires an identity domain, then it should ignore
attempts to attach to something else. A print to warn a user about this
would be a good idea.

I figure during attach: if iommu_no_mapping() then attach to si_domain and
print, else continue with the given domain.

>
> Or any other thoughts?
>
> Best regards,
> Lu Baolu

Cheers,
James.

2019-03-09 11:59:12

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/vt-d: Allow IOMMU_DOMAIN_DMA and IOMMU_DOMAIN_IDENTITY to be allocated

Hey Dmitry,

> On 8 Mar 2019, at 01:20, Dmitry Safonov <[email protected]> wrote:
>
> On 3/4/19 3:46 PM, James Sewart wrote:
>> +static inline int domain_is_initialised(struct dmar_domain *domain)
>> +{
>> + return domain->flags & DOMAIN_FLAG_INITIALISED;
>> +}
>
> Maybe check it in intel_iommu_domain_free(), eh?

Good shout; freeing a non-attached IOMMU_DOMAIN_DMA domain is now
different to freeing an attached one.

>
> Thanks,
> Dmitry


Cheers,
James.

2019-03-10 02:58:37

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 4/4] iommu/vt-d: Remove lazy allocation of domains

Hi,

On 3/9/19 7:49 PM, James Sewart wrote:
> Hey Lu,
>
>> On 9 Mar 2019, at 01:53, Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> On 3/9/19 12:57 AM, James Sewart wrote:
>>> Hey Lu,
>>>> On 8 Mar 2019, at 03:09, Lu Baolu<[email protected]> wrote:
>>>>>>
>>>>>> Do you mind if I work on top of your patches for further cleanups and
>>>>>> sign off a v2 together with you?
>>>>> Sure, sounds good. I’ll fixup patch 3 and have a go at integrating
>>>>> iommu_prepare_isa into get_resv_regions. This should make the initial
>>>>> domain logic here quite concise.
>>>> Attached are three extra patches which I think should be added before
>>>> PATCH 3/4, and some further cleanup changes which you can merge with
>>>> PATCH 4/4.
>>>>
>>>> ----------------
>>>>
>>>> 0006-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch
>>>> 0007-iommu-vt-d-Add-is_identity_map-ops-entry.patch
>>>>
>>>> These two patches aim to add a generic method for vendor-specific iommu
>>>> drivers to specify the type of the default domain for a device. The Intel
>>>> iommu driver will register an ops entry for this, since it has had its own
>>>> identity-map adjudicator for a long time.
>>> This seems like a good idea, but as domain alloc is only called for the
>>> default domain on the first device attached to a group, we may miss
>>> checking whether a device added later should have an identity domain.
>>> Should there be paths to downgrade a group's domain if one of its devices
>>> requires one?
>>
>> Good catch!
>>
>> This is supposed to be handled in iommu_no_mapping(). But obviously the
>> current code sticks to lazy domain allocation. I'm not sure whether any
>> such cases exist in practice, but we should handle them in a clean way.
>> My idea is that we could downgrade to multiple domains per group (just
>> like what we have now) in this case and print a kernel message for it.
>
> I think if a device requires an identity domain, then it should ignore
> attempts to attach to something else. A print to warn a user about this
> would be a good idea.
>
> I figure during attach: if iommu_no_mapping() then attach to si_domain and
> print, else continue with the given domain.

Yes, agree.

Best regards,
Lu Baolu

2019-03-14 12:03:14

by James Sewart

[permalink] [raw]
Subject: [PATCH v2 2/7] iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges

Used by iommu.c before creating identity mappings for reserved ranges, to
ensure dma-ops won't ever remap these ranges.

Signed-off-by: James Sewart <[email protected]>
---
drivers/iommu/intel-iommu.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 78188bf7e90d..8e0a4e2ff77f 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5299,6 +5299,19 @@ static void intel_iommu_put_resv_regions(struct device *dev,
}
}

+static void intel_iommu_apply_resv_region(struct device *dev,
+ struct iommu_domain *domain,
+ struct iommu_resv_region *region)
+{
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+ unsigned long start, end;
+
+ start = IOVA_PFN(region->start);
+ end = IOVA_PFN(region->start + region->length - 1);
+
+ WARN_ON_ONCE(reserve_iova(&dmar_domain->iovad, start, end) == NULL);
+}
+
#ifdef CONFIG_INTEL_IOMMU_SVM
int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev)
{
@@ -5392,6 +5405,7 @@ const struct iommu_ops intel_iommu_ops = {
.remove_device = intel_iommu_remove_device,
.get_resv_regions = intel_iommu_get_resv_regions,
.put_resv_regions = intel_iommu_put_resv_regions,
+ .apply_resv_region = intel_iommu_apply_resv_region,
.device_group = pci_device_group,
.pgsize_bitmap = INTEL_IOMMU_PGSIZES,
};
--
2.17.1


2019-03-14 12:04:02

by James Sewart

[permalink] [raw]
Subject: [PATCH v2 4/7] iommu/vt-d: Ignore domain parameter in attach_device if device requires identity map

If a device requires an identity map then it is not safe to attach a
domain that can remap addresses. Warn the user if this occurs.

Signed-off-by: James Sewart <[email protected]>
---
drivers/iommu/intel-iommu.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2e00e8708f06..104d36f225a7 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5101,6 +5101,11 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
}
}

+ if (iommu_no_mapping(dev)) {
+ dmar_domain = si_domain;
+ dev_warn(dev, "VT-d: Device is required to use identity IOMMU mapping, ignoring domain attached\n");
+ }
+
iommu = device_to_iommu(dev, &bus, &devfn);
if (!iommu)
return -ENODEV;
--
2.17.1


2019-03-14 12:04:09

by James Sewart

[permalink] [raw]
Subject: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

Patches 1 and 2 are the same as v1.

v1-v2:
Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
reserved regions.
Integrated patches by Lu to remove more unused code in cleanup.

Lu: I didn't integrate your patch to set the default domain type as it
isn't directly related to the aim of this patchset. Instead, patch 4
addresses the issue of a device requiring an identity domain by ignoring
the domain param in attach_device and printing a warning.

I booted some of our devices with this patchset and haven't seen any
issues. It doesn't look like we have any devices with RMRRs, though, so
those codepaths aren't tested.

James Sewart (7):
iommu: Move iommu_group_create_direct_mappings to after device_attach
iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
iommu/vt-d: Expose ISA direct mapping region via
iommu_get_resv_regions
iommu/vt-d: Ignore domain parameter in attach_device if device
requires identity map
iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops
iommu/vt-d: Remove lazy allocation of domains

Lu Baolu (1):
iommu/vt-d: Enable DMA remapping after rmrr mapped

drivers/iommu/intel-iommu.c | 444 +++++++++++-------------------------
drivers/iommu/iommu.c | 4 +-
2 files changed, 131 insertions(+), 317 deletions(-)

--
2.17.1


2019-03-14 12:04:20

by James Sewart

[permalink] [raw]
Subject: [PATCH v2 1/7] iommu: Move iommu_group_create_direct_mappings to after device_attach

If an IOMMU driver needs to know which IOMMU a domain is associated with
in order to initialise that domain, it will do so in device_attach, before
which it is not safe to call iommu_ops.

Signed-off-by: James Sewart <[email protected]>
---
drivers/iommu/iommu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 3ed4db334341..1c6ffbb2d20e 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -652,8 +652,6 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)

dev->iommu_group = group;

- iommu_group_create_direct_mappings(group, dev);
-
mutex_lock(&group->mutex);
list_add_tail(&device->list, &group->devices);
if (group->domain)
@@ -662,6 +660,8 @@ int iommu_group_add_device(struct iommu_group *group, struct device *dev)
if (ret)
goto err_put_group;

+ iommu_group_create_direct_mappings(group, dev);
+
/* Notify any listeners about change to group. */
blocking_notifier_call_chain(&group->notifier,
IOMMU_GROUP_NOTIFY_ADD_DEVICE, dev);
--
2.17.1


2019-03-14 12:04:31

by James Sewart

[permalink] [raw]
Subject: [PATCH v2 5/7] iommu/vt-d: Enable DMA remapping after rmrr mapped

The rmrr devices require an identity map of the rmrr regions before
enabling DMA remapping. Otherwise, there will be a window during which
DMA from/to the rmrr regions will be blocked. In order to alleviate
this, we move enabling DMA remapping to after all rmrr regions get
mapped.

Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel-iommu.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 104d36f225a7..35821df70f78 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3518,11 +3518,6 @@ static int __init init_dmars(void)
ret = dmar_set_interrupt(iommu);
if (ret)
goto free_iommu;
-
- if (!translation_pre_enabled(iommu))
- iommu_enable_translation(iommu);
-
- iommu_disable_protect_mem_regions(iommu);
}

return 0;
@@ -4912,7 +4907,6 @@ int __init intel_iommu_init(void)
goto out_free_reserved_range;
}
up_write(&dmar_global_lock);
- pr_info("Intel(R) Virtualization Technology for Directed I/O\n");

#if defined(CONFIG_X86) && defined(CONFIG_SWIOTLB)
swiotlb = 0;
@@ -4935,6 +4929,16 @@ int __init intel_iommu_init(void)
register_memory_notifier(&intel_iommu_memory_nb);
cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL,
intel_iommu_cpu_dead);
+
+ /* Finally, we enable the DMA remapping hardware. */
+ for_each_iommu(iommu, drhd) {
+ if (!translation_pre_enabled(iommu))
+ iommu_enable_translation(iommu);
+
+ iommu_disable_protect_mem_regions(iommu);
+ }
+ pr_info("Intel(R) Virtualization Technology for Directed I/O\n");
+
intel_iommu_enabled = 1;
intel_iommu_debugfs_init();

--
2.17.1


2019-03-14 12:04:51

by James Sewart

[permalink] [raw]
Subject: [PATCH v2 6/7] iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops

Allowing IOMMU_DOMAIN_DMA type IOMMU domains to be allocated allows the
default_domain of an iommu_group to be set. This delegates device-domain
relationships to the generic IOMMU code.

Signed-off-by: James Sewart <[email protected]>
---
drivers/iommu/intel-iommu.c | 99 +++++++++++++++++++++++++++++--------
1 file changed, 78 insertions(+), 21 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 35821df70f78..2c9d793af394 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -309,6 +309,18 @@ static int hw_pass_through = 1;
/* si_domain contains mulitple devices */
#define DOMAIN_FLAG_STATIC_IDENTITY (1 << 1)

+/*
+ * Domain managed externally, don't cleanup if it isn't attached
+ * to any devices.
+ */
+#define DOMAIN_FLAG_MANAGED_EXTERNALLY (1 << 2)
+
+/*
+ * Set after domain initialisation. Used when allocating dma domains to
+ * defer domain initialisation until it is attached to a device
+ */
+#define DOMAIN_FLAG_INITIALISED (1 << 3)
+
#define for_each_domain_iommu(idx, domain) \
for (idx = 0; idx < g_num_of_iommus; idx++) \
if (domain->iommu_refcnt[idx])
@@ -560,6 +572,16 @@ static inline int domain_type_is_vm_or_si(struct dmar_domain *domain)
DOMAIN_FLAG_STATIC_IDENTITY);
}

+static inline int domain_managed_externally(struct dmar_domain *domain)
+{
+ return domain->flags & DOMAIN_FLAG_MANAGED_EXTERNALLY;
+}
+
+static inline int domain_is_initialised(struct dmar_domain *domain)
+{
+ return domain->flags & DOMAIN_FLAG_INITIALISED;
+}
+
static inline int domain_pfn_supported(struct dmar_domain *domain,
unsigned long pfn)
{
@@ -1664,7 +1686,7 @@ static void disable_dmar_iommu(struct intel_iommu *iommu)

__dmar_remove_one_dev_info(info);

- if (!domain_type_is_vm_or_si(domain)) {
+ if (!domain_managed_externally(domain)) {
/*
* The domain_exit() function can't be called under
* device_domain_lock, as it takes this lock itself.
@@ -1897,6 +1919,7 @@ static int domain_init(struct dmar_domain *domain, struct intel_iommu *iommu,
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+ domain->flags |= DOMAIN_FLAG_INITIALISED;
__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
return 0;
}
@@ -1909,6 +1932,9 @@ static void domain_exit(struct dmar_domain *domain)
if (!domain)
return;

+ if (!domain_is_initialised(domain))
+ goto free_mem;
+
/* Remove associated devices and clear attached or cached domains */
rcu_read_lock();
domain_remove_dev_info(domain);
@@ -1921,6 +1947,7 @@ static void domain_exit(struct dmar_domain *domain)

dma_free_pagelist(freelist);

+free_mem:
free_domain_mem(domain);
}

@@ -4585,7 +4612,7 @@ static int device_notifier(struct notifier_block *nb,
return 0;

dmar_remove_one_dev_info(domain, dev);
- if (!domain_type_is_vm_or_si(domain) && list_empty(&domain->devices))
+ if (!domain_managed_externally(domain) && list_empty(&domain->devices))
domain_exit(domain);

return 0;
@@ -5039,6 +5066,7 @@ static int md_domain_init(struct dmar_domain *domain, int guest_width)
domain->pgd = (struct dma_pte *)alloc_pgtable_page(domain->nid);
if (!domain->pgd)
return -ENOMEM;
+ domain->flags |= DOMAIN_FLAG_INITIALISED;
domain_flush_cache(domain, domain->pgd, PAGE_SIZE);
return 0;
}
@@ -5047,28 +5075,43 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
{
struct dmar_domain *dmar_domain;
struct iommu_domain *domain;
+ int flags = DOMAIN_FLAG_MANAGED_EXTERNALLY;

- if (type != IOMMU_DOMAIN_UNMANAGED)
- return NULL;
+ switch (type) {
+ case IOMMU_DOMAIN_UNMANAGED:
+ flags |= DOMAIN_FLAG_VIRTUAL_MACHINE | DOMAIN_FLAG_INITIALISED;
+ dmar_domain = alloc_domain(flags);
+ if (!dmar_domain)
+ return NULL;

- dmar_domain = alloc_domain(DOMAIN_FLAG_VIRTUAL_MACHINE);
- if (!dmar_domain) {
- pr_err("Can't allocate dmar_domain\n");
- return NULL;
- }
- if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
- pr_err("Domain initialization failed\n");
- domain_exit(dmar_domain);
+ if (md_domain_init(dmar_domain, DEFAULT_DOMAIN_ADDRESS_WIDTH)) {
+ pr_err("Domain initialization failed\n");
+ domain_exit(dmar_domain);
+ return NULL;
+ }
+ domain_update_iommu_cap(dmar_domain);
+ domain = &dmar_domain->domain;
+ domain->geometry.aperture_start = 0;
+ domain->geometry.aperture_end =
+ __DOMAIN_MAX_ADDR(dmar_domain->gaw);
+ domain->geometry.force_aperture = true;
+ break;
+ case IOMMU_DOMAIN_DMA:
+ dmar_domain = alloc_domain(flags);
+ if (!dmar_domain)
+ return NULL;
+ /*
+ * init domain in device attach when we know IOMMU
+ * capabilities
+ */
+ break;
+ case IOMMU_DOMAIN_IDENTITY:
+ return &si_domain->domain;
+ default:
return NULL;
}
- domain_update_iommu_cap(dmar_domain);
-
- domain = &dmar_domain->domain;
- domain->geometry.aperture_start = 0;
- domain->geometry.aperture_end = __DOMAIN_MAX_ADDR(dmar_domain->gaw);
- domain->geometry.force_aperture = true;

- return domain;
+ return &dmar_domain->domain;
}

static void intel_iommu_domain_free(struct iommu_domain *domain)
@@ -5099,8 +5142,8 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
dmar_remove_one_dev_info(old_domain, dev);
rcu_read_unlock();

- if (!domain_type_is_vm_or_si(old_domain) &&
- list_empty(&old_domain->devices))
+ if (list_empty(&old_domain->devices) &&
+ !domain_managed_externally(old_domain))
domain_exit(old_domain);
}
}
@@ -5114,6 +5157,16 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
if (!iommu)
return -ENODEV;

+ /*
+ * Initialise domain with IOMMU capabilities if it isn't already
+ * initialised
+ */
+ if (!domain_is_initialised(dmar_domain)) {
+ if (domain_init(dmar_domain, iommu,
+ DEFAULT_DOMAIN_ADDRESS_WIDTH))
+ return -ENOMEM;
+ }
+
/* check if this iommu agaw is sufficient for max mapped address */
addr_width = agaw_to_width(iommu->agaw);
if (addr_width > cap_mgaw(iommu->cap))
@@ -5160,6 +5213,10 @@ static int intel_iommu_map(struct iommu_domain *domain,
int prot = 0;
int ret;

+ /* Don't bother if hardware passthrough used. */
+ if (dmar_domain == si_domain && hw_pass_through)
+ return 0;
+
if (iommu_prot & IOMMU_READ)
prot |= DMA_PTE_READ;
if (iommu_prot & IOMMU_WRITE)
--
2.17.1


2019-03-14 12:05:08

by James Sewart

[permalink] [raw]
Subject: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

To support mapping the ISA region via iommu_group_create_direct_mappings,
make sure it is exposed by iommu_get_resv_regions. This allows
deduplication of reserved region mappings.

Signed-off-by: James Sewart <[email protected]>
---
drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
1 file changed, 33 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 8e0a4e2ff77f..2e00e8708f06 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
#define for_each_rmrr_units(rmrr) \
list_for_each_entry(rmrr, &dmar_rmrr_units, list)

+static struct iommu_resv_region *isa_resv_region;
+
/* bitmap for indexing intel_iommus */
static int g_num_of_iommus;

@@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
rmrr->end_address);
}

+static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
+{
+ if (!isa_resv_region)
+ isa_resv_region = iommu_alloc_resv_region(0,
+ 16*1024*1024,
+ 0, IOMMU_RESV_DIRECT);
+
+ return isa_resv_region;
+}
+
#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
-static inline void iommu_prepare_isa(void)
+static inline void iommu_prepare_isa(struct pci_dev *pdev)
{
- struct pci_dev *pdev;
int ret;
+ struct iommu_resv_region *reg = iommu_get_isa_resv_region();

- pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
- if (!pdev)
+ if (!reg)
return;

pr_info("Prepare 0-16MiB unity mapping for LPC\n");
- ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
+ ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
+ reg->start + reg->length - 1);

if (ret)
pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
-
- pci_dev_put(pdev);
}
#else
-static inline void iommu_prepare_isa(void)
+static inline void iommu_prepare_isa(struct pci_dev *pdev)
{
return;
}
@@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
struct dmar_rmrr_unit *rmrr;
bool copied_tables = false;
struct device *dev;
+ struct pci_dev *pdev;
struct intel_iommu *iommu;
int i, ret;

@@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
}
}

- iommu_prepare_isa();
+ pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
+ if (pdev) {
+ iommu_prepare_isa(pdev);
+ pci_dev_put(pdev);
+ }

domains_done:

@@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
struct iommu_resv_region *reg;
struct dmar_rmrr_unit *rmrr;
struct device *i_dev;
+ struct pci_dev *pdev;
int i;

rcu_read_lock();
@@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
}
rcu_read_unlock();

+ if (dev_is_pci(device)) {
+ pdev = to_pci_dev(device);
+ if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
+ reg = iommu_get_isa_resv_region();
+ list_add_tail(&reg->list, head);
+ }
+ }
+
reg = iommu_alloc_resv_region(IOAPIC_RANGE_START,
IOAPIC_RANGE_END - IOAPIC_RANGE_START + 1,
0, IOMMU_RESV_MSI);
--
2.17.1


2019-03-14 12:06:51

by James Sewart

[permalink] [raw]
Subject: [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains

The generic IOMMU code will allocate and attach a default domain to each
device that comes online. This patch removes the lazy domain allocation
and early reserved region mapping that are now redundant.

Signed-off-by: James Sewart <[email protected]>
---
drivers/iommu/intel-iommu.c | 300 +-----------------------------------
1 file changed, 5 insertions(+), 295 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 2c9d793af394..f8c0c3e16935 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2605,118 +2605,6 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
return domain;
}

-static int get_last_alias(struct pci_dev *pdev, u16 alias, void *opaque)
-{
- *(u16 *)opaque = alias;
- return 0;
-}
-
-static struct dmar_domain *find_or_alloc_domain(struct device *dev, int gaw)
-{
- struct device_domain_info *info = NULL;
- struct dmar_domain *domain = NULL;
- struct intel_iommu *iommu;
- u16 dma_alias;
- unsigned long flags;
- u8 bus, devfn;
-
- iommu = device_to_iommu(dev, &bus, &devfn);
- if (!iommu)
- return NULL;
-
- if (dev_is_pci(dev)) {
- struct pci_dev *pdev = to_pci_dev(dev);
-
- pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
- spin_lock_irqsave(&device_domain_lock, flags);
- info = dmar_search_domain_by_dev_info(pci_domain_nr(pdev->bus),
- PCI_BUS_NUM(dma_alias),
- dma_alias & 0xff);
- if (info) {
- iommu = info->iommu;
- domain = info->domain;
- }
- spin_unlock_irqrestore(&device_domain_lock, flags);
-
- /* DMA alias already has a domain, use it */
- if (info)
- goto out;
- }
-
- /* Allocate and initialize new domain for the device */
- domain = alloc_domain(0);
- if (!domain)
- return NULL;
- if (domain_init(domain, iommu, gaw)) {
- domain_exit(domain);
- return NULL;
- }
-
-out:
-
- return domain;
-}
-
-static struct dmar_domain *set_domain_for_dev(struct device *dev,
- struct dmar_domain *domain)
-{
- struct intel_iommu *iommu;
- struct dmar_domain *tmp;
- u16 req_id, dma_alias;
- u8 bus, devfn;
-
- iommu = device_to_iommu(dev, &bus, &devfn);
- if (!iommu)
- return NULL;
-
- req_id = ((u16)bus << 8) | devfn;
-
- if (dev_is_pci(dev)) {
- struct pci_dev *pdev = to_pci_dev(dev);
-
- pci_for_each_dma_alias(pdev, get_last_alias, &dma_alias);
-
- /* register PCI DMA alias device */
- if (req_id != dma_alias) {
- tmp = dmar_insert_one_dev_info(iommu, PCI_BUS_NUM(dma_alias),
- dma_alias & 0xff, NULL, domain);
-
- if (!tmp || tmp != domain)
- return tmp;
- }
- }
-
- tmp = dmar_insert_one_dev_info(iommu, bus, devfn, dev, domain);
- if (!tmp || tmp != domain)
- return tmp;
-
- return domain;
-}
-
-static struct dmar_domain *get_domain_for_dev(struct device *dev, int gaw)
-{
- struct dmar_domain *domain, *tmp;
-
- domain = find_domain(dev);
- if (domain)
- goto out;
-
- domain = find_or_alloc_domain(dev, gaw);
- if (!domain)
- goto out;
-
- tmp = set_domain_for_dev(dev, domain);
- if (!tmp || domain != tmp) {
- domain_exit(domain);
- domain = tmp;
- }
-
-out:
-
- return domain;
-}
-
static int iommu_domain_identity_map(struct dmar_domain *domain,
unsigned long long start,
unsigned long long end)
@@ -2742,73 +2630,6 @@ static int iommu_domain_identity_map(struct dmar_domain *domain,
DMA_PTE_READ|DMA_PTE_WRITE);
}

-static int domain_prepare_identity_map(struct device *dev,
- struct dmar_domain *domain,
- unsigned long long start,
- unsigned long long end)
-{
- /* For _hardware_ passthrough, don't bother. But for software
- passthrough, we do it anyway -- it may indicate a memory
- range which is reserved in E820, so which didn't get set
- up to start with in si_domain */
- if (domain == si_domain && hw_pass_through) {
- pr_warn("Ignoring identity map for HW passthrough device %s [0x%Lx - 0x%Lx]\n",
- dev_name(dev), start, end);
- return 0;
- }
-
- pr_info("Setting identity map for device %s [0x%Lx - 0x%Lx]\n",
- dev_name(dev), start, end);
-
- if (end < start) {
- WARN(1, "Your BIOS is broken; RMRR ends before it starts!\n"
- "BIOS vendor: %s; Ver: %s; Product Version: %s\n",
- dmi_get_system_info(DMI_BIOS_VENDOR),
- dmi_get_system_info(DMI_BIOS_VERSION),
- dmi_get_system_info(DMI_PRODUCT_VERSION));
- return -EIO;
- }
-
- if (end >> agaw_to_width(domain->agaw)) {
- WARN(1, "Your BIOS is broken; RMRR exceeds permitted address width (%d bits)\n"
- "BIOS vendor: %s; Ver: %s; Product Version: %s\n",
- agaw_to_width(domain->agaw),
- dmi_get_system_info(DMI_BIOS_VENDOR),
- dmi_get_system_info(DMI_BIOS_VERSION),
- dmi_get_system_info(DMI_PRODUCT_VERSION));
- return -EIO;
- }
-
- return iommu_domain_identity_map(domain, start, end);
-}
-
-static int iommu_prepare_identity_map(struct device *dev,
- unsigned long long start,
- unsigned long long end)
-{
- struct dmar_domain *domain;
- int ret;
-
- domain = get_domain_for_dev(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
- if (!domain)
- return -ENOMEM;
-
- ret = domain_prepare_identity_map(dev, domain, start, end);
- if (ret)
- domain_exit(domain);
-
- return ret;
-}
-
-static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
- struct device *dev)
-{
- if (dev->archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO)
- return 0;
- return iommu_prepare_identity_map(dev, rmrr->base_address,
- rmrr->end_address);
-}
-
static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
{
if (!isa_resv_region)
@@ -2819,29 +2640,6 @@ static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
return isa_resv_region;
}

-#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
-static inline void iommu_prepare_isa(struct pci_dev *pdev)
-{
- int ret;
- struct iommu_resv_region *reg = iommu_get_isa_resv_region();
-
- if (!reg)
- return;
-
- pr_info("Prepare 0-16MiB unity mapping for LPC\n");
- ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
- reg->start + reg->length - 1);
-
- if (ret)
- pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
-}
-#else
-static inline void iommu_prepare_isa(struct pci_dev *pdev)
-{
- return;
-}
-#endif /* !CONFIG_INTEL_IOMMU_FLPY_WA */
-
static int md_domain_init(struct dmar_domain *domain, int guest_width);

static int __init si_domain_init(int hw)
@@ -3065,18 +2863,10 @@ static int __init dev_prepare_static_identity_mapping(struct device *dev, int hw

static int __init iommu_prepare_static_identity_mapping(int hw)
{
- struct pci_dev *pdev = NULL;
struct dmar_drhd_unit *drhd;
struct intel_iommu *iommu;
struct device *dev;
- int i;
- int ret = 0;
-
- for_each_pci_dev(pdev) {
- ret = dev_prepare_static_identity_mapping(&pdev->dev, hw);
- if (ret)
- return ret;
- }
+ int i, ret = 0;

for_each_active_iommu(iommu, drhd)
for_each_active_dev_scope(drhd->devices, drhd->devices_cnt, i, dev) {
@@ -3323,12 +3113,9 @@ static int copy_translation_tables(struct intel_iommu *iommu)
static int __init init_dmars(void)
{
struct dmar_drhd_unit *drhd;
- struct dmar_rmrr_unit *rmrr;
bool copied_tables = false;
- struct device *dev;
- struct pci_dev *pdev;
struct intel_iommu *iommu;
- int i, ret;
+ int ret;

/*
* for each drhd
@@ -3482,36 +3269,6 @@ static int __init init_dmars(void)
goto free_iommu;
}
}
- /*
- * For each rmrr
- * for each dev attached to rmrr
- * do
- * locate drhd for dev, alloc domain for dev
- * allocate free domain
- * allocate page table entries for rmrr
- * if context not allocated for bus
- * allocate and init context
- * set present in root table for this bus
- * init context with domain, translation etc
- * endfor
- * endfor
- */
- pr_info("Setting RMRR:\n");
- for_each_rmrr_units(rmrr) {
- /* some BIOS lists non-exist devices in DMAR table. */
- for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
- i, dev) {
- ret = iommu_prepare_rmrr_dev(rmrr, dev);
- if (ret)
- pr_err("Mapping reserved region failed\n");
- }
- }
-
- pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
- if (pdev) {
- iommu_prepare_isa(pdev);
- pci_dev_put(pdev);
- }

domains_done:

@@ -3595,53 +3352,6 @@ static unsigned long intel_alloc_iova(struct device *dev,
return iova_pfn;
}

-struct dmar_domain *get_valid_domain_for_dev(struct device *dev)
-{
- struct dmar_domain *domain, *tmp;
- struct dmar_rmrr_unit *rmrr;
- struct device *i_dev;
- int i, ret;
-
- domain = find_domain(dev);
- if (domain)
- goto out;
-
- domain = find_or_alloc_domain(dev, DEFAULT_DOMAIN_ADDRESS_WIDTH);
- if (!domain)
- goto out;
-
- /* We have a new domain - setup possible RMRRs for the device */
- rcu_read_lock();
- for_each_rmrr_units(rmrr) {
- for_each_active_dev_scope(rmrr->devices, rmrr->devices_cnt,
- i, i_dev) {
- if (i_dev != dev)
- continue;
-
- ret = domain_prepare_identity_map(dev, domain,
- rmrr->base_address,
- rmrr->end_address);
- if (ret)
- dev_err(dev, "Mapping reserved region failed\n");
- }
- }
- rcu_read_unlock();
-
- tmp = set_domain_for_dev(dev, domain);
- if (!tmp || domain != tmp) {
- domain_exit(domain);
- domain = tmp;
- }
-
-out:
-
- if (!domain)
- pr_err("Allocating domain for %s failed\n", dev_name(dev));
-
-
- return domain;
-}
-
/* Check if the dev needs to go through non-identity map and unmap process.*/
static int iommu_no_mapping(struct device *dev)
{
@@ -3704,7 +3414,7 @@ static dma_addr_t __intel_map_page(struct device *dev, struct page *page,
if (iommu_no_mapping(dev))
return paddr;

- domain = get_valid_domain_for_dev(dev);
+ domain = find_domain(dev);
if (!domain)
return DMA_MAPPING_ERROR;

@@ -3914,7 +3624,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nele
if (iommu_no_mapping(dev))
return intel_nontranslate_map_sg(dev, sglist, nelems, dir);

- domain = get_valid_domain_for_dev(dev);
+ domain = find_domain(dev);
if (!domain)
return 0;

@@ -5412,7 +5122,7 @@ int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sd
u64 ctx_lo;
int ret;

- domain = get_valid_domain_for_dev(sdev->dev);
+ domain = find_domain(sdev->dev);
if (!domain)
return -EINVAL;

--
2.17.1


2019-03-14 23:33:32

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains

On Thu, 14 Mar 2019 11:59:36 +0000
James Sewart <[email protected]> wrote:

>
> - domain = get_valid_domain_for_dev(dev);
> + domain = find_domain(dev);
> if (!domain)
> return DMA_MAPPING_ERROR;
>
> @@ -3914,7 +3624,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nelems,
> 	if (iommu_no_mapping(dev))
> 		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
> - domain = get_valid_domain_for_dev(dev);
> + domain = find_domain(dev);
This patchset looks like a very good cleanup. I am wondering why we
can't use the generic iommu_get_domain_for_dev() here, since VT-d has a
default DMA domain after your patch.

2019-03-15 02:27:10

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hi James,

On 3/14/19 7:58 PM, James Sewart wrote:
> To support mapping ISA region via iommu_group_create_direct_mappings,
> make sure its exposed by iommu_get_resv_regions. This allows
> deduplication of reserved region mappings
>
> Signed-off-by: James Sewart <[email protected]>
> ---
> drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
> 1 file changed, 33 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 8e0a4e2ff77f..2e00e8708f06 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
> #define for_each_rmrr_units(rmrr) \
> list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>
> +static struct iommu_resv_region *isa_resv_region;
> +
> /* bitmap for indexing intel_iommus */
> static int g_num_of_iommus;
>
> @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
> rmrr->end_address);
> }
>
> +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
> +{
> + if (!isa_resv_region)
> + isa_resv_region = iommu_alloc_resv_region(0,
> + 16*1024*1024,
> + 0, IOMMU_RESV_DIRECT);
> +
> + return isa_resv_region;
> +}
> +
> #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
> -static inline void iommu_prepare_isa(void)
> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
> {
> - struct pci_dev *pdev;
> int ret;
> + struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>
> - pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
> - if (!pdev)
> + if (!reg)
> return;
>
> pr_info("Prepare 0-16MiB unity mapping for LPC\n");
> - ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
> + ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
> + reg->start + reg->length - 1);
>
> if (ret)
> pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
> -
> - pci_dev_put(pdev);
> }
> #else
> -static inline void iommu_prepare_isa(void)
> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
> {
> return;
> }
> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
> struct dmar_rmrr_unit *rmrr;
> bool copied_tables = false;
> struct device *dev;
> + struct pci_dev *pdev;
> struct intel_iommu *iommu;
> int i, ret;
>
> @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
> }
> }
>
> - iommu_prepare_isa();
> + pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
> + if (pdev) {
> + iommu_prepare_isa(pdev);
> + pci_dev_put(pdev);
> + }
>
> domains_done:
>
> @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
> struct iommu_resv_region *reg;
> struct dmar_rmrr_unit *rmrr;
> struct device *i_dev;
> + struct pci_dev *pdev;
> int i;
>
> rcu_read_lock();
> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
> }
> rcu_read_unlock();
>
> + if (dev_is_pci(device)) {
> + pdev = to_pci_dev(device);
> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
> + reg = iommu_get_isa_resv_region();
> + list_add_tail(&reg->list, head);
> + }
> + }
> +

Just wondering why not just

+#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
+ if (dev_is_pci(device)) {
+ pdev = to_pci_dev(device);
+ if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
+ reg = iommu_alloc_resv_region(0,
+ 16*1024*1024,
+ 0, IOMMU_RESV_DIRECT);
+ if (reg)
+ list_add_tail(&reg->list, head);
+ }
+ }
+#endif

and remove all other related code?

Best regards,
Lu Baolu

2019-03-15 02:37:22

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 4/7] iommu/vt-d: Ignore domain parameter in attach_device if device requires identity map

Hi,

On 3/14/19 7:58 PM, James Sewart wrote:
> If a device requires an identity map then it is not safe to attach a
> domain that can remap addresses. Warn the user if this occurs.
>
> Signed-off-by: James Sewart <[email protected]>
> ---
> drivers/iommu/intel-iommu.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 2e00e8708f06..104d36f225a7 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -5101,6 +5101,11 @@ static int intel_iommu_attach_device(struct iommu_domain *domain,
> }
> }
>
> + if (iommu_no_mapping(dev)) {
> + dmar_domain = si_domain;
> + dev_warn(dev, "VT-d: Device is required to use identity IOMMU mapping, ignoring domain attached\n");
> + }

This is awful. :-)

The IOMMU generic layer allocates a @domain and tries to bind it to a
device. This code replaces @domain with si_domain without notifying the
upper layer.
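For illustration, the alternative implied here would be to refuse the
attach so that the generic layer actually sees the failure. A rough,
hypothetical sketch (not a patch from this thread; the errno choice is
arbitrary):

	if (iommu_no_mapping(dev)) {
		/* Make the failure visible to the caller instead of
		 * silently swapping in si_domain behind its back.
		 */
		dev_warn(dev, "Device requires an identity IOMMU mapping, rejecting attach\n");
		return -EPERM;
	}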

Best regards,
Lu Baolu

2019-03-15 03:19:49

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

Hi James,

On 3/14/19 7:56 PM, James Sewart wrote:
> Patches 1 and 2 are the same as v1.
>
> v1-v2:
> Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
> Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
> reserved regions.
> Integrated patches by Lu to remove more unused code in cleanup.
>
> Lu: I didn't integrate your patch to set the default domain type as it
> isn't directly related to the aim of this patchset. Instead patch 4

Without those patches, the user experience will be affected and some
devices will not work on Intel platforms anymore.

For a long time, the Intel IOMMU driver has had its own logic to determine
whether a device requires an identity domain. For example, when the user
specifies "iommu=pt" on the kernel command line, all devices will be
attached to the identity domain. Furthermore, some quirky devices require
an identity domain to be used before enabling DMA remapping; otherwise,
they will not work. This was done by adding quirk bits in the Intel IOMMU
driver.

So from my point of view, one way is to port all those quirks and kernel
parameters into the IOMMU generic layer; the other is to open a door for
the vendor IOMMU driver to determine the default domain type on its own. I
prefer the latter option since it will not impact behaviour on other
architectures.
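For concreteness, one shape such a door could take; both the callback
name and its wiring into iommu.c are hypothetical here, not something
posted in this thread:

	/* New iommu_ops callback: let the vendor driver pick the type. */
	int (*def_domain_type)(struct device *dev);

	/* A possible VT-d implementation reusing the existing quirk logic. */
	static int intel_iommu_def_domain_type(struct device *dev)
	{
		return iommu_no_mapping(dev) ? IOMMU_DOMAIN_IDENTITY :
					       IOMMU_DOMAIN_DMA;
	}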

> addresses the issue of a device requiring an identity domain by ignoring
> the domain param in attach_device and printing a warning.

This will not work, as I commented in that thread.

>
> I booted some of our devices with this patchset and haven't seen any
> issues. It doesn't look like we have any devices with RMRR's though so
> those codepaths aren't tested.
>
> James Sewart (7):
> iommu: Move iommu_group_create_direct_mappings to after device_attach
> iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
> iommu/vt-d: Expose ISA direct mapping region via
> iommu_get_resv_regions
> iommu/vt-d: Ignore domain parameter in attach_device if device
> requires identity map
> iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops
> iommu/vt-d: Remove lazy allocation of domains
>
> Lu Baolu (1):
> iommu/vt-d: Enable DMA remapping after rmrr mapped
>
> drivers/iommu/intel-iommu.c | 444 +++++++++++-------------------------
> drivers/iommu/iommu.c | 4 +-
> 2 files changed, 131 insertions(+), 317 deletions(-)
>

Best regards,
Lu Baolu

2019-03-19 13:37:07

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

Hey Lu,

> On 15 Mar 2019, at 03:13, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> On 3/14/19 7:56 PM, James Sewart wrote:
>> Patches 1 and 2 are the same as v1.
>> v1-v2:
>> Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
>> Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
>> reserved regions.
>> Integrated patches by Lu to remove more unused code in cleanup.
>> Lu: I didn't integrate your patch to set the default domain type as it
>> isn't directly related to the aim of this patchset. Instead patch 4
>
> Without those patches, user experience will be affected and some devices
> will not work on Intel platforms anymore.
>
> For a long time, Intel IOMMU driver has its own logic to determine
> whether a device requires an identity domain. For example, when user
> specifies "iommu=pt" in kernel parameter, all device will be attached
> with the identity domain. Further more, some quirky devices require
> an identity domain to be used before enabling DMA remapping, otherwise,
> it will not work. This was done by adding quirk bits in Intel IOMMU
> driver.
>
> So from my point of view, one way is porting all those quirks and kernel
> parameters into IOMMU generic layer, or opening a door for vendor IOMMU
> driver to determine the default domain type by their own. I prefer the
> latter option since it will not impact any behaviors on other
> architectures.

I see your point. I’m not confident that using the proposed door to set a
group’s default domain has the desired behaviour. As discussed before, the
default domain type will be set based on the desired type of only the
first device attached to a group. I think to change the default domain
type you would need a slightly different door that wasn’t conditioned on
the device.

For situations where individual devices require an identity domain because
of quirks, calling is_identity_map per device in iommu_group_get_for_dev
may be a better solution than the one I proposed.

>
>> addresses the issue of a device requiring an identity domain by ignoring
>> the domain param in attach_device and printing a warning.
>
> This will not work as I commented in that thread.
>
>> I booted some of our devices with this patchset and haven't seen any
>> issues. It doesn't look like we have any devices with RMRR's though so
>> those codepaths aren't tested.
>> James Sewart (7):
>> iommu: Move iommu_group_create_direct_mappings to after device_attach
>> iommu/vt-d: Implement apply_resv_region for reserving IOVA ranges
>> iommu/vt-d: Expose ISA direct mapping region via
>> iommu_get_resv_regions
>> iommu/vt-d: Ignore domain parameter in attach_device if device
>> requires identity map
>> iommu/vt-d: Allow IOMMU_DOMAIN_DMA to be allocated by iommu_ops
>> iommu/vt-d: Remove lazy allocation of domains
>> Lu Baolu (1):
>> iommu/vt-d: Enable DMA remapping after rmrr mapped
>> drivers/iommu/intel-iommu.c | 444 +++++++++++-------------------------
>> drivers/iommu/iommu.c | 4 +-
>> 2 files changed, 131 insertions(+), 317 deletions(-)
>
> Best regards,
> Lu Baolu

Cheers,
James.


2019-03-20 01:33:42

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

Hi James,

On 3/19/19 9:35 PM, James Sewart wrote:
> Hey Lu,
>
>> On 15 Mar 2019, at 03:13, Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> On 3/14/19 7:56 PM, James Sewart wrote:
>>> Patches 1 and 2 are the same as v1.
>>> v1-v2:
>>> Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
>>> Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
>>> reserved regions.
>>> Integrated patches by Lu to remove more unused code in cleanup.
>>> Lu: I didn't integrate your patch to set the default domain type as it
>>> isn't directly related to the aim of this patchset. Instead patch 4
>>
>> Without those patches, user experience will be affected and some devices
>> will not work on Intel platforms anymore.
>>
>> For a long time, Intel IOMMU driver has its own logic to determine
>> whether a device requires an identity domain. For example, when user
>> specifies "iommu=pt" in kernel parameter, all device will be attached
>> with the identity domain. Further more, some quirky devices require
>> an identity domain to be used before enabling DMA remapping, otherwise,
>> it will not work. This was done by adding quirk bits in Intel IOMMU
>> driver.
>>
>> So from my point of view, one way is porting all those quirks and kernel
>> parameters into IOMMU generic layer, or opening a door for vendor IOMMU
>> driver to determine the default domain type by their own. I prefer the
>> latter option since it will not impact any behaviors on other
>> architectures.
>
> I see your point. I’m not confident that using the proposed door to set a
> groups default domain has the desired behaviour. As discussed before the
> default domain type will be set based on the desired type for only the
> first device attached to a group. I think to change the default domain
> type you would need a slightly different door that wasn’t conditioned on
> device.

I think of this as another problem. Just a summary for ease of
discussion. We saw two problems:

1. When allocating a new group for a device, how should we determine the
type of the default domain? This is what my proposed patches are trying
to address.

2. If we need to put a device into an existing group which uses a
different type of domain from what the device desires to use, we might
break the functionality of the device. For this problem, I'd second your
proposal below if I get your point correctly.

>
> For situations where individual devices require an identity domain because
> of quirks then maybe calling is_identity_map per device in
> iommu_group_get_for_dev is a better solution than the one I proposed.
>

Do you mean that if we see a quirky device that requires a domain type
other than the group's default domain type, we will assign a new group to
it? That looks good to me as far as I can see. I suppose this should be
done in vt-d's ops callback.

Best regards,
Lu Baolu

2019-03-22 09:59:36

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hey Lu,

> On 15 Mar 2019, at 02:19, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> On 3/14/19 7:58 PM, James Sewart wrote:
>> To support mapping ISA region via iommu_group_create_direct_mappings,
>> make sure its exposed by iommu_get_resv_regions. This allows
>> deduplication of reserved region mappings
>> Signed-off-by: James Sewart <[email protected]>
>> ---
>> drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
>> 1 file changed, 33 insertions(+), 9 deletions(-)
>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>> index 8e0a4e2ff77f..2e00e8708f06 100644
>> --- a/drivers/iommu/intel-iommu.c
>> +++ b/drivers/iommu/intel-iommu.c
>> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
>> #define for_each_rmrr_units(rmrr) \
>> list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>> +static struct iommu_resv_region *isa_resv_region;
>> +
>> /* bitmap for indexing intel_iommus */
>> static int g_num_of_iommus;
>> @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
>> rmrr->end_address);
>> }
>> +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
>> +{
>> + if (!isa_resv_region)
>> + isa_resv_region = iommu_alloc_resv_region(0,
>> + 16*1024*1024,
>> + 0, IOMMU_RESV_DIRECT);
>> +
>> + return isa_resv_region;
>> +}
>> +
>> #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>> -static inline void iommu_prepare_isa(void)
>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>> {
>> - struct pci_dev *pdev;
>> int ret;
>> + struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>> - pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>> - if (!pdev)
>> + if (!reg)
>> return;
>> pr_info("Prepare 0-16MiB unity mapping for LPC\n");
>> - ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
>> + ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
>> + reg->start + reg->length - 1);
>> if (ret)
>> pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
>> -
>> - pci_dev_put(pdev);
>> }
>> #else
>> -static inline void iommu_prepare_isa(void)
>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>> {
>> return;
>> }
>> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
>> struct dmar_rmrr_unit *rmrr;
>> bool copied_tables = false;
>> struct device *dev;
>> + struct pci_dev *pdev;
>> struct intel_iommu *iommu;
>> int i, ret;
>> @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
>> }
>> }
>> - iommu_prepare_isa();
>> + pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>> + if (pdev) {
>> + iommu_prepare_isa(pdev);
>> + pci_dev_put(pdev);
>> + }
>> domains_done:
>> @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
>> struct iommu_resv_region *reg;
>> struct dmar_rmrr_unit *rmrr;
>> struct device *i_dev;
>> + struct pci_dev *pdev;
>> int i;
>> rcu_read_lock();
>> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
>> }
>> rcu_read_unlock();
>> + if (dev_is_pci(device)) {
>> + pdev = to_pci_dev(device);
>> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>> + reg = iommu_get_isa_resv_region();
>> + list_add_tail(&reg->list, head);
>> + }
>> + }
>> +
>
> Just wondering why not just
>
> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
> + if (dev_is_pci(device)) {
> + pdev = to_pci_dev(device);
> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
> + reg = iommu_alloc_resv_region(0,
> + 16*1024*1024,
> + 0, IOMMU_RESV_DIRECT);
> + if (reg)
> + list_add_tail(&reg->list, head);
> + }
> + }
> +#endif
>
> and, remove all other related code?

At this point in the patchset, if we remove iommu_prepare_isa then the
ISA region won’t be mapped to the device. Only once the dma domain is
allocatable will the reserved regions be mapped by
iommu_group_create_direct_mappings.

There’s an issue that if we choose to allocate a new resv_region with type
IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
to free this entry type, which means refactoring the rmrr regions in
get_resv_regions. Should this work be in this patchset?

>
> Best regards,
> Lu Baolu

Cheers,
James.


2019-03-22 10:06:46

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH v2 0/7] iommu/vt-d: Fix-up device-domain relationship by refactoring to use iommu group default domain.

Hey Lu,

> On 20 Mar 2019, at 01:26, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> On 3/19/19 9:35 PM, James Sewart wrote:
>> Hey Lu,
>>> On 15 Mar 2019, at 03:13, Lu Baolu <[email protected]> wrote:
>>>
>>> Hi James,
>>>
>>> On 3/14/19 7:56 PM, James Sewart wrote:
>>>> Patches 1 and 2 are the same as v1.
>>>> v1-v2:
>>>> Refactored ISA direct mappings to be returned by iommu_get_resv_regions.
>>>> Integrated patch by Lu to defer turning on DMAR until iommu.c has mapped
>>>> reserved regions.
>>>> Integrated patches by Lu to remove more unused code in cleanup.
>>>> Lu: I didn't integrate your patch to set the default domain type as it
>>>> isn't directly related to the aim of this patchset. Instead patch 4
>>>
>>> Without those patches, user experience will be affected and some devices
>>> will not work on Intel platforms anymore.
>>>
>>> For a long time, Intel IOMMU driver has its own logic to determine
>>> whether a device requires an identity domain. For example, when user
>>> specifies "iommu=pt" in kernel parameter, all device will be attached
>>> with the identity domain. Further more, some quirky devices require
>>> an identity domain to be used before enabling DMA remapping, otherwise,
>>> it will not work. This was done by adding quirk bits in Intel IOMMU
>>> driver.
>>>
>>> So from my point of view, one way is porting all those quirks and kernel
>>> parameters into IOMMU generic layer, or opening a door for vendor IOMMU
>>> driver to determine the default domain type by their own. I prefer the
>>> latter option since it will not impact any behaviors on other
>>> architectures.
>> I see your point. I’m not confident that using the proposed door to set a
>> groups default domain has the desired behaviour. As discussed before the
>> default domain type will be set based on the desired type for only the
>> first device attached to a group. I think to change the default domain
>> type you would need a slightly different door that wasn’t conditioned on
>> device.
>
> I think this as another problem. Just a summarize for the ease of
> discussion. We saw two problems:
>
> 1. When allocating a new group for a device, how should we determine the
> type of the default domain? This is what my proposal patches trying to
> address.

This will work as expected only if all devices within a group get the
same result from is_identity_map. Otherwise we see issue 2.

>
> 2. If we need to put a device into an existing group which uses a
> different type of domain from what the device desires to use, we might
> break the functionality of the device. For this problem I'd second your
> proposal below if I get your point correctly.
>
>> For situations where individual devices require an identity domain because
>> of quirks then maybe calling is_identity_map per device in
>> iommu_group_get_for_dev is a better solution than the one I proposed.
>
> Do you mean if we see a quirky device requires a different domain type
> other than the default domain type of the group, we will assign a new
> group to it? That looks good to me as far as I can see. I suppose this
> should be done in vt-d's ops callback.

I have thought about this a bit and I think the cleanest approach is to
put devices that require an identity domain into their own group. This can
be done in the device_group callback, avoiding any situation where we have
to deal with creating a new group based on domain type in
iommu_group_get_for_dev. This way we should never end up with multiple
different domain types per group, and your patches will work as expected.
See below for a possible implementation.

>
> Best regards,
> Lu Baolu

Cheers,
James.

Devices that require an identity map because of quirks or other reasons
should be put in their own IOMMU group so as not to end up with multiple
different domains per group.

Signed-off-by: James Sewart <[email protected]>

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 3cb8b36abf50..0f5a121d31a0 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -5421,6 +5421,22 @@ struct intel_iommu *intel_svm_device_to_iommu(struct device *dev)
}
#endif /* CONFIG_INTEL_IOMMU_SVM */

+static struct iommu_group *intel_identity_group;
+
+struct iommu_group *intel_iommu_pci_device_group(struct device *dev)
+{
+ if (iommu_no_mapping(dev)) {
+ if (!intel_identity_group) {
+ intel_identity_group = iommu_group_alloc();
+ if (IS_ERR(intel_identity_group))
+ return NULL;
+ }
+ return intel_identity_group;
+ }
+
+ return pci_device_group(dev);
+}
+
const struct iommu_ops intel_iommu_ops = {
.capable = intel_iommu_capable,
.domain_alloc = intel_iommu_domain_alloc,
@@ -5435,7 +5451,7 @@ const struct iommu_ops intel_iommu_ops = {
.get_resv_regions = intel_iommu_get_resv_regions,
.put_resv_regions = intel_iommu_put_resv_regions,
.apply_resv_region = intel_iommu_apply_resv_region,
- .device_group = pci_device_group,
+ .device_group = intel_iommu_pci_device_group,
.pgsize_bitmap = INTEL_IOMMU_PGSIZES,
};

--
2.17.1


2019-03-22 10:09:36

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH v2 7/7] iommu/vt-d: Remove lazy allocation of domains

Hey Jacob,

> On 14 Mar 2019, at 23:35, Jacob Pan <[email protected]> wrote:
>
> On Thu, 14 Mar 2019 11:59:36 +0000
> James Sewart <[email protected]> wrote:
>
>>
>> - domain = get_valid_domain_for_dev(dev);
>> + domain = find_domain(dev);
>> if (!domain)
>> return DMA_MAPPING_ERROR;
>>
>> @@ -3914,7 +3624,7 @@ static int intel_map_sg(struct device *dev, struct scatterlist *sglist, int nelems,
>> 	if (iommu_no_mapping(dev))
>> 		return intel_nontranslate_map_sg(dev, sglist, nelems, dir);
>> - domain = get_valid_domain_for_dev(dev);
>> + domain = find_domain(dev);
> This patchset looks like a very good clean up, I am wondering why we
> can't use the generic iommu_get_domain_for_dev() here, since VT-d has a
> default DMA domain after your patch.

This should be possible; the only downside is that we get an iommu_domain
from iommu_get_domain_for_dev and will have to check it’s not NULL before
getting the dmar_domain from it. We would be able to remove find_domain,
though.
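Something like the following minimal sketch; the helper name is made up
for illustration:

	static struct dmar_domain *dmar_domain_for_dev(struct device *dev)
	{
		struct iommu_domain *domain = iommu_get_domain_for_dev(dev);

		/* No default domain attached yet, bail out. */
		if (!domain)
			return NULL;

		return to_dmar_domain(domain);
	}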

Cheers,
James.

2019-03-25 02:09:48

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hi James,

On 3/22/19 5:57 PM, James Sewart wrote:
> Hey Lu,
>
>> On 15 Mar 2019, at 02:19, Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> On 3/14/19 7:58 PM, James Sewart wrote:
>>> To support mapping ISA region via iommu_group_create_direct_mappings,
>>> make sure its exposed by iommu_get_resv_regions. This allows
>>> deduplication of reserved region mappings
>>> Signed-off-by: James Sewart <[email protected]>
>>> ---
>>> drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
>>> 1 file changed, 33 insertions(+), 9 deletions(-)
>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>> index 8e0a4e2ff77f..2e00e8708f06 100644
>>> --- a/drivers/iommu/intel-iommu.c
>>> +++ b/drivers/iommu/intel-iommu.c
>>> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
>>> #define for_each_rmrr_units(rmrr) \
>>> list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>>> +static struct iommu_resv_region *isa_resv_region;
>>> +
>>> /* bitmap for indexing intel_iommus */
>>> static int g_num_of_iommus;
>>> @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
>>> rmrr->end_address);
>>> }
>>> +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
>>> +{
>>> + if (!isa_resv_region)
>>> + isa_resv_region = iommu_alloc_resv_region(0,
>>> + 16*1024*1024,
>>> + 0, IOMMU_RESV_DIRECT);
>>> +
>>> + return isa_resv_region;
>>> +}
>>> +
>>> #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>> -static inline void iommu_prepare_isa(void)
>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>> {
>>> - struct pci_dev *pdev;
>>> int ret;
>>> + struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>>> - pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>> - if (!pdev)
>>> + if (!reg)
>>> return;
>>> pr_info("Prepare 0-16MiB unity mapping for LPC\n");
>>> - ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
>>> + ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
>>> + reg->start + reg->length - 1);
>>> if (ret)
>>> pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
>>> -
>>> - pci_dev_put(pdev);
>>> }
>>> #else
>>> -static inline void iommu_prepare_isa(void)
>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>> {
>>> return;
>>> }
>>> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
>>> struct dmar_rmrr_unit *rmrr;
>>> bool copied_tables = false;
>>> struct device *dev;
>>> + struct pci_dev *pdev;
>>> struct intel_iommu *iommu;
>>> int i, ret;
>>> @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
>>> }
>>> }
>>> - iommu_prepare_isa();
>>> + pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>> + if (pdev) {
>>> + iommu_prepare_isa(pdev);
>>> + pci_dev_put(pdev);
>>> + }
>>> domains_done:
>>> @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>> struct iommu_resv_region *reg;
>>> struct dmar_rmrr_unit *rmrr;
>>> struct device *i_dev;
>>> + struct pci_dev *pdev;
>>> int i;
>>> rcu_read_lock();
>>> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>> }
>>> rcu_read_unlock();
>>> + if (dev_is_pci(device)) {
>>> + pdev = to_pci_dev(device);
>>> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>> + reg = iommu_get_isa_resv_region();
>>> + list_add_tail(&reg->list, head);
>>> + }
>>> + }
>>> +
>>
>> Just wondering why not just
>>
>> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>> + if (dev_is_pci(device)) {
>> + pdev = to_pci_dev(device);
>> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>> + reg = iommu_alloc_resv_region(0,
>> + 16*1024*1024,
>> + 0, IOMMU_RESV_DIRECT);
>> + if (reg)
>> + list_add_tail(&reg->list, head);
>> + }
>> + }
>> +#endif
>>
>> and, remove all other related code?
>
> At this point in the patchset if we remove iommu_prepare_isa then the ISA
> region won’t be mapped to the device. Only once the dma domain is allocable
> will the reserved regions be mapped by iommu_group_create_direct_mappings.

Yes. So if we put the allocation code here, it won't impact anything and
will take effect as soon as the dma domain is allocatable.

>
> Theres an issue that if we choose to alloc a new resv_region with type
> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
> to free this entry type which means refactoring the rmrr regions in
> get_resv_regions. Should this work be in this patchset?

Do you mean the rmrr regions are not allocated in get_resv_regions, but
are freed in put_resv_regions? I think we should fix this in this patch
set since this might impact the device passthrough if we don't do it.

Best regards,
Lu Baolu

2019-03-25 12:58:58

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hey Lu,

> On 25 Mar 2019, at 02:03, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> On 3/22/19 5:57 PM, James Sewart wrote:
>> Hey Lu,
>>> On 15 Mar 2019, at 02:19, Lu Baolu <[email protected]> wrote:
>>>
>>> Hi James,
>>>
>>> On 3/14/19 7:58 PM, James Sewart wrote:
>>>> To support mapping ISA region via iommu_group_create_direct_mappings,
>>>> make sure its exposed by iommu_get_resv_regions. This allows
>>>> deduplication of reserved region mappings
>>>> Signed-off-by: James Sewart <[email protected]>
>>>> ---
>>>> drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
>>>> 1 file changed, 33 insertions(+), 9 deletions(-)
>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>> index 8e0a4e2ff77f..2e00e8708f06 100644
>>>> --- a/drivers/iommu/intel-iommu.c
>>>> +++ b/drivers/iommu/intel-iommu.c
>>>> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
>>>> #define for_each_rmrr_units(rmrr) \
>>>> list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>>>> +static struct iommu_resv_region *isa_resv_region;
>>>> +
>>>> /* bitmap for indexing intel_iommus */
>>>> static int g_num_of_iommus;
>>>> @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
>>>> rmrr->end_address);
>>>> }
>>>> +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
>>>> +{
>>>> + if (!isa_resv_region)
>>>> + isa_resv_region = iommu_alloc_resv_region(0,
>>>> + 16*1024*1024,
>>>> + 0, IOMMU_RESV_DIRECT);
>>>> +
>>>> + return isa_resv_region;
>>>> +}
>>>> +
>>>> #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>>> -static inline void iommu_prepare_isa(void)
>>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>>> {
>>>> - struct pci_dev *pdev;
>>>> int ret;
>>>> + struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>>>> - pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>>> - if (!pdev)
>>>> + if (!reg)
>>>> return;
>>>> pr_info("Prepare 0-16MiB unity mapping for LPC\n");
>>>> - ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
>>>> + ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
>>>> + reg->start + reg->length - 1);
>>>> if (ret)
>>>> pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
>>>> -
>>>> - pci_dev_put(pdev);
>>>> }
>>>> #else
>>>> -static inline void iommu_prepare_isa(void)
>>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>>> {
>>>> return;
>>>> }
>>>> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
>>>> struct dmar_rmrr_unit *rmrr;
>>>> bool copied_tables = false;
>>>> struct device *dev;
>>>> + struct pci_dev *pdev;
>>>> struct intel_iommu *iommu;
>>>> int i, ret;
>>>> @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
>>>> }
>>>> }
>>>> - iommu_prepare_isa();
>>>> + pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>>> + if (pdev) {
>>>> + iommu_prepare_isa(pdev);
>>>> + pci_dev_put(pdev);
>>>> + }
>>>> domains_done:
>>>> @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>>> struct iommu_resv_region *reg;
>>>> struct dmar_rmrr_unit *rmrr;
>>>> struct device *i_dev;
>>>> + struct pci_dev *pdev;
>>>> int i;
>>>> rcu_read_lock();
>>>> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>>> }
>>>> rcu_read_unlock();
>>>> + if (dev_is_pci(device)) {
>>>> + pdev = to_pci_dev(device);
>>>> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>>> + reg = iommu_get_isa_resv_region();
>>>> + list_add_tail(&reg->list, head);
>>>> + }
>>>> + }
>>>> +
>>>
>>> Just wondering why not just
>>>
>>> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>> + if (dev_is_pci(device)) {
>>> + pdev = to_pci_dev(device);
>>> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>> + reg = iommu_alloc_resv_region(0,
>>> + 16*1024*1024,
>>> + 0, IOMMU_RESV_DIRECT);
>>> + if (reg)
>>> + list_add_tail(&reg->list, head);
>>> + }
>>> + }
>>> +#endif
>>>
>>> and, remove all other related code?
>> At this point in the patchset if we remove iommu_prepare_isa then the ISA
>> region won’t be mapped to the device. Only once the dma domain is allocable
>> will the reserved regions be mapped by iommu_group_create_direct_mappings.
>
> Yes. So if we put the allocation code here, it won't impact anything and
> will take effect as soon as the dma domain is allocatable.
>
>> Theres an issue that if we choose to alloc a new resv_region with type
>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>> to free this entry type which means refactoring the rmrr regions in
>> get_resv_regions. Should this work be in this patchset?
>
> Do you mean the rmrr regions are not allocated in get_resv_regions, but
> are freed in put_resv_regions? I think we should fix this in this patch
> set since this might impact the device passthrough if we don't do it.

They’re currently neither allocated nor freed; only type IOMMU_RESV_MSI is
freed in put_resv_regions. If we allocate a new resv_region with type
IOMMU_RESV_DIRECT for the ISA region, then it won’t be freed. If we modify
put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
the static RMRR regions.

So either the ISA region stays static and is never freed, as in my
implementation, or the RMRR regions are converted to be allocated on each
call to get_resv_regions and freed in put_resv_regions.
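
For reference, a minimal sketch of the asymmetry described above, assuming
the driver's put_resv_regions looks essentially like this (condensed, not
verbatim):

    static void intel_iommu_put_resv_regions(struct device *dev,
                                             struct list_head *head)
    {
            struct iommu_resv_region *entry, *next;

            list_for_each_entry_safe(entry, next, head, list) {
                    /* RMRR/ISA entries (IOMMU_RESV_DIRECT) point at
                     * static storage and must not be freed; only the
                     * per-call allocated MSI window is. */
                    if (entry->type == IOMMU_RESV_MSI)
                            kfree(entry);
            }
    }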

>
> Best regards,
> Lu Baolu

Cheers,
James.

2019-03-26 01:17:36

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hi,

On 3/25/19 8:57 PM, James Sewart wrote:
> Hey Lu,
>
>> On 25 Mar 2019, at 02:03, Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> On 3/22/19 5:57 PM, James Sewart wrote:
>>> Hey Lu,
>>>> On 15 Mar 2019, at 02:19, Lu Baolu <[email protected]> wrote:
>>>>
>>>> Hi James,
>>>>
>>>> On 3/14/19 7:58 PM, James Sewart wrote:
>>>>> To support mapping ISA region via iommu_group_create_direct_mappings,
>>>>> make sure its exposed by iommu_get_resv_regions. This allows
>>>>> deduplication of reserved region mappings
>>>>> Signed-off-by: James Sewart <[email protected]>
>>>>> ---
>>>>> drivers/iommu/intel-iommu.c | 42 +++++++++++++++++++++++++++++--------
>>>>> 1 file changed, 33 insertions(+), 9 deletions(-)
>>>>> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
>>>>> index 8e0a4e2ff77f..2e00e8708f06 100644
>>>>> --- a/drivers/iommu/intel-iommu.c
>>>>> +++ b/drivers/iommu/intel-iommu.c
>>>>> @@ -337,6 +337,8 @@ static LIST_HEAD(dmar_rmrr_units);
>>>>> #define for_each_rmrr_units(rmrr) \
>>>>> list_for_each_entry(rmrr, &dmar_rmrr_units, list)
>>>>> +static struct iommu_resv_region *isa_resv_region;
>>>>> +
>>>>> /* bitmap for indexing intel_iommus */
>>>>> static int g_num_of_iommus;
>>>>> @@ -2780,26 +2782,34 @@ static inline int iommu_prepare_rmrr_dev(struct dmar_rmrr_unit *rmrr,
>>>>> rmrr->end_address);
>>>>> }
>>>>> +static inline struct iommu_resv_region *iommu_get_isa_resv_region(void)
>>>>> +{
>>>>> + if (!isa_resv_region)
>>>>> + isa_resv_region = iommu_alloc_resv_region(0,
>>>>> + 16*1024*1024,
>>>>> + 0, IOMMU_RESV_DIRECT);
>>>>> +
>>>>> + return isa_resv_region;
>>>>> +}
>>>>> +
>>>>> #ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>>>> -static inline void iommu_prepare_isa(void)
>>>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>>>> {
>>>>> - struct pci_dev *pdev;
>>>>> int ret;
>>>>> + struct iommu_resv_region *reg = iommu_get_isa_resv_region();
>>>>> - pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>>>> - if (!pdev)
>>>>> + if (!reg)
>>>>> return;
>>>>> pr_info("Prepare 0-16MiB unity mapping for LPC\n");
>>>>> - ret = iommu_prepare_identity_map(&pdev->dev, 0, 16*1024*1024 - 1);
>>>>> + ret = iommu_prepare_identity_map(&pdev->dev, reg->start,
>>>>> + reg->start + reg->length - 1);
>>>>> if (ret)
>>>>> pr_err("Failed to create 0-16MiB identity map - floppy might not work\n");
>>>>> -
>>>>> - pci_dev_put(pdev);
>>>>> }
>>>>> #else
>>>>> -static inline void iommu_prepare_isa(void)
>>>>> +static inline void iommu_prepare_isa(struct pci_dev *pdev)
>>>>> {
>>>>> return;
>>>>> }
>>>>> @@ -3289,6 +3299,7 @@ static int __init init_dmars(void)
>>>>> struct dmar_rmrr_unit *rmrr;
>>>>> bool copied_tables = false;
>>>>> struct device *dev;
>>>>> + struct pci_dev *pdev;
>>>>> struct intel_iommu *iommu;
>>>>> int i, ret;
>>>>> @@ -3469,7 +3480,11 @@ static int __init init_dmars(void)
>>>>> }
>>>>> }
>>>>> - iommu_prepare_isa();
>>>>> + pdev = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, NULL);
>>>>> + if (pdev) {
>>>>> + iommu_prepare_isa(pdev);
>>>>> + pci_dev_put(pdev);
>>>>> + }
>>>>> domains_done:
>>>>> @@ -5266,6 +5281,7 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>>>> struct iommu_resv_region *reg;
>>>>> struct dmar_rmrr_unit *rmrr;
>>>>> struct device *i_dev;
>>>>> + struct pci_dev *pdev;
>>>>> int i;
>>>>> rcu_read_lock();
>>>>> @@ -5280,6 +5296,14 @@ static void intel_iommu_get_resv_regions(struct device *device,
>>>>> }
>>>>> rcu_read_unlock();
>>>>> + if (dev_is_pci(device)) {
>>>>> + pdev = to_pci_dev(device);
>>>>> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>>>> + reg = iommu_get_isa_resv_region();
>>>>> + list_add_tail(&reg->list, head);
>>>>> + }
>>>>> + }
>>>>> +
>>>>
>>>> Just wondering why not just
>>>>
>>>> +#ifdef CONFIG_INTEL_IOMMU_FLOPPY_WA
>>>> + if (dev_is_pci(device)) {
>>>> + pdev = to_pci_dev(device);
>>>> + if ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
>>>> + reg = iommu_alloc_resv_region(0,
>>>> + 16*1024*1024,
>>>> + 0, IOMMU_RESV_DIRECT);
>>>> + if (reg)
>>>> + list_add_tail(&reg->list, head);
>>>> + }
>>>> + }
>>>> +#endif
>>>>
>>>> and, remove all other related code?
>>> At this point in the patchset if we remove iommu_prepare_isa then the ISA
>>> region won’t be mapped to the device. Only once the dma domain is allocable
>>> will the reserved regions be mapped by iommu_group_create_direct_mappings.
>>
>> Yes. So if we put the allocation code here, it won't impact anything and
>> will take effect as soon as the dma domain is allocatable.
>>
>>> Theres an issue that if we choose to alloc a new resv_region with type
>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>> to free this entry type which means refactoring the rmrr regions in
>>> get_resv_regions. Should this work be in this patchset?
>>
>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>> are freed in put_resv_regions? I think we should fix this in this patch
>> set since this might impact the device passthrough if we don't do it.
>
> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
> freed in put_resv_regions. If we allocate a new resv_region with type
> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
> the static RMRR regions.

Ah yes! Thanks for the explanation.

>
> Either the ISA region is static and not freed as with my implementation,
> or the RMRR regions are converted to be allocated on each call to
> get_resv_regions and freed in put_resv_regions.

Okay, I'm in favour of your implementation now.

>
>>
>> Best regards,
>> Lu Baolu
>
> Cheers,
> James.
>

Best regards,
Lu Baolu

2019-03-26 01:31:10

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hi James,

On 3/25/19 8:57 PM, James Sewart wrote:
>>> Theres an issue that if we choose to alloc a new resv_region with type
>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>> to free this entry type which means refactoring the rmrr regions in
>>> get_resv_regions. Should this work be in this patchset?
>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>> are freed in put_resv_regions? I think we should fix this in this patch
>> set since this might impact the device passthrough if we don't do it.
> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
> freed in put_resv_regions. If we allocate a new resv_region with type
> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
> the static RMRR regions.
>
> Either the ISA region is static and not freed as with my implementation,
> or the RMRR regions are converted to be allocated on each call to
> get_resv_regions and freed in put_resv_regions.
>

By the way, there's another option in my mind. Let's add a new region type
for LPC devices, e.g. IOMMU_RESV_LPC, and then handle it in the same way
as the MSI regions. Just FYI.
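
A rough sketch of that idea, assuming a hypothetical IOMMU_RESV_LPC type
(neither the type nor these hunks exist upstream):

    /* In intel_iommu_get_resv_regions(): allocate the ISA window per
     * call, like the MSI window, instead of keeping a static region. */
    if (dev_is_pci(device) &&
        (to_pci_dev(device)->class >> 8) == PCI_CLASS_BRIDGE_ISA) {
            reg = iommu_alloc_resv_region(0, 16 * 1024 * 1024, 0,
                                          IOMMU_RESV_LPC);
            if (reg)
                    list_add_tail(&reg->list, head);
    }

    /* ...and in intel_iommu_put_resv_regions(), free it symmetrically
     * together with the MSI window: */
    if (entry->type == IOMMU_RESV_MSI || entry->type == IOMMU_RESV_LPC)
            kfree(entry);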

Best regards,
Lu Baolu

2019-03-28 18:38:48

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hey Lu,

> On 26 Mar 2019, at 01:24, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> On 3/25/19 8:57 PM, James Sewart wrote:
>>>> Theres an issue that if we choose to alloc a new resv_region with type
>>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>>> to free this entry type which means refactoring the rmrr regions in
>>>> get_resv_regions. Should this work be in this patchset?
>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>> are freed in put_resv_regions? I think we should fix this in this patch
>>> set since this might impact the device passthrough if we don't do it.
>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>> freed in put_resv_regions. If we allocate a new resv_region with type
>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>> the static RMRR regions.
>> Either the ISA region is static and not freed as with my implementation,
>> or the RMRR regions are converted to be allocated on each call to
>> get_resv_regions and freed in put_resv_regions.
>
> By the way, there's another way in my mind. Let's add a new region type
> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
> as those MSI regions. Just FYI.

This solution would require adding some extra code to
iommu_group_create_direct_mappings, as currently only type
IOMMU_RESV_DIRECT is identity mapped; other types are only reserved.
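
The distinction comes from the mapping loop in iommu.c, roughly as follows
(condensed sketch of iommu_group_create_direct_mappings(), not the
verbatim function):

    list_for_each_entry(entry, &mappings, list) {
            dma_addr_t start, end, addr;

            /* Every region type gets its IOVA range reserved... */
            if (domain->ops->apply_resv_region)
                    domain->ops->apply_resv_region(dev, domain, entry);

            start = ALIGN(entry->start, pg_size);
            end   = ALIGN(entry->start + entry->length, pg_size);

            /* ...but only IOMMU_RESV_DIRECT is identity mapped; a new
             * IOMMU_RESV_LPC type would need adding to this check. */
            if (entry->type != IOMMU_RESV_DIRECT)
                    continue;

            for (addr = start; addr < end; addr += pg_size)
                    iommu_map(domain, addr, addr, pg_size, entry->prot);
    }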


>
> Best regards,
> Lu Baolu

Cheers,
James.

2019-03-29 15:28:20

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hey Lu,

I’ve attached a preliminary v3; if you could take a look and run some
tests, that would be great.

Since v2 I’ve added your default domain type patches and the new
device_group handler, and incorporated Jacob’s feedback.

> On 28 Mar 2019, at 18:37, James Sewart <[email protected]> wrote:
>
> Hey Lu,
>
>> On 26 Mar 2019, at 01:24, Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> On 3/25/19 8:57 PM, James Sewart wrote:
>>>>> Theres an issue that if we choose to alloc a new resv_region with type
>>>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>>>> to free this entry type which means refactoring the rmrr regions in
>>>>> get_resv_regions. Should this work be in this patchset?
>>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>>> are freed in put_resv_regions? I think we should fix this in this patch
>>>> set since this might impact the device passthrough if we don't do it.
>>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>>> freed in put_resv_regions. If we allocate a new resv_region with type
>>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>>> the static RMRR regions.
>>> Either the ISA region is static and not freed as with my implementation,
>>> or the RMRR regions are converted to be allocated on each call to
>>> get_resv_regions and freed in put_resv_regions.
>>
>> By the way, there's another way in my mind. Let's add a new region type
>> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
>> as those MSI regions. Just FYI.
>
> This solution would require adding some extra code to
> iommu_group_create_direct_mappings as currently only type
> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.
>
>
>>
>> Best regards,
>> Lu Baolu
>
> Cheers,
> James.

Cheers,
James.


Attachments:
0001-iommu-Move-iommu_group_create_direct_mappings-to-aft.patch (1.22 kB)
0002-iommu-vt-d-Implement-apply_resv_region-for-reserving.patch (1.53 kB)
0003-iommu-vt-d-Expose-ISA-direct-mapping-region-via-iomm.patch (3.14 kB)
0004-iommu-vt-d-Create-an-IOMMU-group-for-devices-that-re.patch (1.61 kB)
0005-iommu-Add-ops-entry-for-vendor-specific-default-doma.patch (2.36 kB)
0006-iommu-vt-d-Add-is_identity_map-ops-entry.patch (1.20 kB)
0007-iommu-vt-d-Enable-DMA-remapping-after-rmrr-mapped.patch (1.79 kB)
0008-iommu-vt-d-Allow-IOMMU_DOMAIN_DMA-to-be-allocated-by.patch (6.30 kB)
0009-iommu-vt-d-Remove-lazy-allocation-of-domains.patch (13.70 kB)

2019-04-04 06:57:14

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hi James,

I did a sanity test on my end. The test machine fails to boot: it results
in a purple screen, and I haven't seen any useful kernel log.

Best regards,
Lu Baolu

On 3/29/19 11:26 PM, James Sewart wrote:
> Hey Lu,
>
> I’ve attached a preliminary v3, if you could take a look and run some tests
> that would be great.
>
> Since v2 i’ve added your default domain type patches, the new device_group
> handler, and incorporated Jacob’s feedback.
>
>> On 28 Mar 2019, at 18:37, James Sewart <[email protected]> wrote:
>>
>> Hey Lu,
>>
>>> On 26 Mar 2019, at 01:24, Lu Baolu <[email protected]> wrote:
>>>
>>> Hi James,
>>>
>>> On 3/25/19 8:57 PM, James Sewart wrote:
>>>>>> Theres an issue that if we choose to alloc a new resv_region with type
>>>>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>>>>> to free this entry type which means refactoring the rmrr regions in
>>>>>> get_resv_regions. Should this work be in this patchset?
>>>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>>>> are freed in put_resv_regions? I think we should fix this in this patch
>>>>> set since this might impact the device passthrough if we don't do it.
>>>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>>>> freed in put_resv_regions. If we allocate a new resv_region with type
>>>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>>>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>>>> the static RMRR regions.
>>>> Either the ISA region is static and not freed as with my implementation,
>>>> or the RMRR regions are converted to be allocated on each call to
>>>> get_resv_regions and freed in put_resv_regions.
>>>
>>> By the way, there's another way in my mind. Let's add a new region type
>>> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
>>> as those MSI regions. Just FYI.
>>
>> This solution would require adding some extra code to
>> iommu_group_create_direct_mappings as currently only type
>> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.
>>
>>
>>>
>>> Best regards,
>>> Lu Baolu
>>
>> Cheers,
>> James.
>
> Cheers,
> James.
>

2019-04-05 18:04:05

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

> On 4 Apr 2019, at 07:49, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> I did a sanity test from my end. The test machine fails to boot. I
> haven't seen any valuable kernel log. It results in a purple screen.
>
> Best regards,
> Lu Baolu
>
> On 3/29/19 11:26 PM, James Sewart wrote:
>> Hey Lu,
>> I’ve attached a preliminary v3, if you could take a look and run some tests
>> that would be great.
>> Since v2 i’ve added your default domain type patches, the new device_group
>> handler, and incorporated Jacob’s feedback.
>>> On 28 Mar 2019, at 18:37, James Sewart <[email protected]> wrote:
>>>
>>> Hey Lu,
>>>
>>>> On 26 Mar 2019, at 01:24, Lu Baolu <[email protected]> wrote:
>>>>
>>>> Hi James,
>>>>
>>>> On 3/25/19 8:57 PM, James Sewart wrote:
>>>>>>> Theres an issue that if we choose to alloc a new resv_region with type
>>>>>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>>>>>> to free this entry type which means refactoring the rmrr regions in
>>>>>>> get_resv_regions. Should this work be in this patchset?
>>>>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>>>>> are freed in put_resv_regions? I think we should fix this in this patch
>>>>>> set since this might impact the device passthrough if we don't do it.
>>>>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>>>>> freed in put_resv_regions. If we allocate a new resv_region with type
>>>>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>>>>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>>>>> the static RMRR regions.
>>>>> Either the ISA region is static and not freed as with my implementation,
>>>>> or the RMRR regions are converted to be allocated on each call to
>>>>> get_resv_regions and freed in put_resv_regions.
>>>>
>>>> By the way, there's another way in my mind. Let's add a new region type
>>>> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
>>>> as those MSI regions. Just FYI.
>>>
>>> This solution would require adding some extra code to
>>> iommu_group_create_direct_mappings as currently only type
>>> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.
>>>
>>>
>>>>
>>>> Best regards,
>>>> Lu Baolu
>>>
>>> Cheers,
>>> James.
>> Cheers,
>> James.


Attachments:
0009-iommu-vt-d-Remove-lazy-allocation-of-domains.patch (10.12 kB)

2019-04-08 02:51:00

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hi,

On 4/6/19 2:02 AM, James Sewart wrote:
> Hey Lu,
>
> My bad, did some debugging on my end. The issue was swapping out
> find_domain for iommu_get_domain_for_dev. It seems in some situations the
> domain is not attached to the group but the device is expected to have the
> domain still stored in its archdata.
>
> I’ve attached the final patch with find_domain unremoved which seems to
> work in my testing.

This version works for me now.

>
> Cheers,
> James.

Best regards,
Lu Baolu

>
>
>
>
>
>> On 4 Apr 2019, at 07:49, Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> I did a sanity test from my end. The test machine fails to boot. I
>> haven't seen any valuable kernel log. It results in a purple screen.
>>
>> Best regards,
>> Lu Baolu
>>
>> On 3/29/19 11:26 PM, James Sewart wrote:
>>> Hey Lu,
>>> I’ve attached a preliminary v3, if you could take a look and run some tests
>>> that would be great.
>>> Since v2 i’ve added your default domain type patches, the new device_group
>>> handler, and incorporated Jacob’s feedback.
>>>> On 28 Mar 2019, at 18:37, James Sewart <[email protected]> wrote:
>>>>
>>>> Hey Lu,
>>>>
>>>>> On 26 Mar 2019, at 01:24, Lu Baolu <[email protected]> wrote:
>>>>>
>>>>> Hi James,
>>>>>
>>>>> On 3/25/19 8:57 PM, James Sewart wrote:
>>>>>>>> Theres an issue that if we choose to alloc a new resv_region with type
>>>>>>>> IOMMU_RESV_DIRECT, we will need to refactor intel_iommu_put_resv_regions
>>>>>>>> to free this entry type which means refactoring the rmrr regions in
>>>>>>>> get_resv_regions. Should this work be in this patchset?
>>>>>>> Do you mean the rmrr regions are not allocated in get_resv_regions, but
>>>>>>> are freed in put_resv_regions? I think we should fix this in this patch
>>>>>>> set since this might impact the device passthrough if we don't do it.
>>>>>> They’re not allocated and not freed currently, only type IOMMU_RESV_MSI is
>>>>>> freed in put_resv_regions. If we allocate a new resv_region with type
>>>>>> IOMMU_RESV_DIRECT for the isa region, then it won’t be freed. If we modify
>>>>>> put_resv_regions to free type IOMMU_RESV_DIRECT, then we will try to free
>>>>>> the static RMRR regions.
>>>>>> Either the ISA region is static and not freed as with my implementation,
>>>>>> or the RMRR regions are converted to be allocated on each call to
>>>>>> get_resv_regions and freed in put_resv_regions.
>>>>>
>>>>> By the way, there's another way in my mind. Let's add a new region type
>>>>> for LPC devices, e.x. IOMMU_RESV_LPC, and then handle it in the same way
>>>>> as those MSI regions. Just FYI.
>>>>
>>>> This solution would require adding some extra code to
>>>> iommu_group_create_direct_mappings as currently only type
>>>> IOMMU_RESV_DIRECT is identity mapped, other types are only reserved.
>>>>
>>>>
>>>>>
>>>>> Best regards,
>>>>> Lu Baolu
>>>>
>>>> Cheers,
>>>> James.
>>> Cheers,
>>> James.
>

2019-04-10 06:08:26

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hi James,

On 4/6/19 2:02 AM, James Sewart wrote:
> Hey Lu,
>
> My bad, did some debugging on my end. The issue was swapping out
> find_domain for iommu_get_domain_for_dev. It seems in some situations the
> domain is not attached to the group but the device is expected to have the
> domain still stored in its archdata.
>
> I’ve attached the final patch with find_domain unremoved which seems to
> work in my testing.
>

I just looked into your v3 patch set; posting some thoughts from my end
here just for your information.

Let me post the problems we want to address.

1. When allocating a new group for a device, how should we determine the
type of the default domain?

2. If we need to put a device into an existing group which uses a
different type of domain from what the device desires to use, we might
break the functionality of the device.

My new thought is to let the generic iommu code determine the default
domain type (hence my proposed vendor-specific default domain type patches
could be dropped).

If the default domain type is dynamic mapping, and later in
iommu_no_mapping() we determine that we must use an identity domain,
we then call iommu_request_dm_for_dev(dev).

If the default domain type is identity mapping, and later in
iommu_no_mapping() we determine that we must use a dynamic (DMA) domain,
we then call iommu_request_dma_domain_for_dev(dev).

We already have iommu_request_dm_for_dev() in iommu.c. We only need to
implement iommu_request_dma_domain_for_dev().
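
A sketch of the proposed flow (iommu_request_dma_domain_for_dev() is the
function to be added; the hook point and the simplified one-argument
iommu_should_identity_map() are assumptions for illustration):

    /* Re-evaluate a device's default domain once its real requirements
     * are known, e.g. from iommu_no_mapping(). */
    static int intel_iommu_fixup_default_domain(struct device *dev)
    {
            struct iommu_domain *domain = iommu_get_domain_for_dev(dev);

            if (domain->type == IOMMU_DOMAIN_DMA &&
                iommu_should_identity_map(dev))
                    return iommu_request_dm_for_dev(dev);

            if (domain->type == IOMMU_DOMAIN_IDENTITY &&
                !iommu_should_identity_map(dev))
                    return iommu_request_dma_domain_for_dev(dev);

            return 0;
    }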

With this done, your patch titled "Create an IOMMU group for devices
that require an identity map" could also be dropped.

Any thoughts?

> Cheers,
> James.

Best regards,
Lu Baolu

2019-04-15 14:18:32

by James Sewart

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hey Lu,

> On 10 Apr 2019, at 06:22, Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> On 4/6/19 2:02 AM, James Sewart wrote:
>> Hey Lu,
>> My bad, did some debugging on my end. The issue was swapping out
>> find_domain for iommu_get_domain_for_dev. It seems in some situations the
>> domain is not attached to the group but the device is expected to have the
>> domain still stored in its archdata.
>> I’ve attached the final patch with find_domain unremoved which seems to
>> work in my testing.
>
> Just looked into your v3 patch set and some thoughts from my end posted
> here just for your information.
>
> Let me post the problems we want to address.
>
> 1. When allocating a new group for a device, how should we determine the
> type of the default domain?
>
> 2. If we need to put a device into an existing group which uses a
> different type of domain from what the device desires to use, we might
> break the functionality of the device.
>
> My new thought is letting the iommu generic code to determine the
> default domain type (hence my proposed vendor specific default domain
> type patches could be dropped).
>
> If the default domain type is dynamical mapping, and later in iommu_no_mapping(), we determines that we must use an identity domain,
> we then call iommu_request_dm_for_dev(dev).
>
> If the default domain type is identity mapping, and later in
> iommu_no_mapping(), we determined that we must use a dynamical domain,
> we then call iommu_request_dma_domain_for_dev(dev).
>
> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
> implement iommu_request_dma_domain_for_dev().
>
> With this done, your patch titled "Create an IOMMU group for devices
> that require an identity map" could also be dropped.
>
> Any thoughts?

There are a couple of issues I can think of. iommu_request_dm_for_dev
changes the domain for all devices within the device’s group; if we have
devices with different domain requirements in the same group, then only
the last initialised device will get the domain it wants. If we add
iommu_request_dma_domain_for_dev then we would still need another IOMMU
group for identity domain devices.

Both with v3 and the method proposed here, iommu_should_identity_map
determines whether the device requires an identity map. In v3 it is called
once up front by the generic code to determine which domain type the IOMMU
group should use. In the proposed method I think this would happen after
the initial domain allocation and would trigger the domain to be replaced
with a different one.

I prefer the solution in v3. What do you think?

>
>> Cheers,
>> James.
>
> Best regards,
> Lu Baolu

Cheers,
James.

2019-04-16 02:27:33

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hi James,

On 4/15/19 10:16 PM, James Sewart wrote:
> Hey Lu,
>
>> On 10 Apr 2019, at 06:22, Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> On 4/6/19 2:02 AM, James Sewart wrote:
>>> Hey Lu,
>>> My bad, did some debugging on my end. The issue was swapping out
>>> find_domain for iommu_get_domain_for_dev. It seems in some situations the
>>> domain is not attached to the group but the device is expected to have the
>>> domain still stored in its archdata.
>>> I’ve attached the final patch with find_domain unremoved which seems to
>>> work in my testing.
>>
>> Just looked into your v3 patch set and some thoughts from my end posted
>> here just for your information.
>>
>> Let me post the problems we want to address.
>>
>> 1. When allocating a new group for a device, how should we determine the
>> type of the default domain?
>>
>> 2. If we need to put a device into an existing group which uses a
>> different type of domain from what the device desires to use, we might
>> break the functionality of the device.
>>
>> My new thought is letting the iommu generic code to determine the
>> default domain type (hence my proposed vendor specific default domain
>> type patches could be dropped).
>>
>> If the default domain type is dynamical mapping, and later in iommu_no_mapping(), we determines that we must use an identity domain,
>> we then call iommu_request_dm_for_dev(dev).
>>
>> If the default domain type is identity mapping, and later in
>> iommu_no_mapping(), we determined that we must use a dynamical domain,
>> we then call iommu_request_dma_domain_for_dev(dev).
>>
>> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
>> implement iommu_request_dma_domain_for_dev().
>>
>> With this done, your patch titled "Create an IOMMU group for devices
>> that require an identity map" could also be dropped.
>>
>> Any thoughts?
>
> Theres a couple issues I can think of. iommu_request_dm_for_dev changes
> the domain for all devices within the devices group, if we may have
> devices with different domain requirements in the same group then only the
> last initialised device will get the domain it wants. If we want to add
> iommu_request_dma_domain_for_dev then we would still need another IOMMU
> group for identity domain devices.

The current solution (a.k.a. v3) for this is to assign a new group to a
device which requires an identity mapped domain. This will break the
functionality of direct device pass-through (to user level). The iommu
group represents the minimum granularity of iommu isolation and
protection; the requirement of an identity mapped domain should not be a
factor in creating a new group.

Both iommu_request_dma_domain_for_dev() and iommu_request_dm_for_dev()
only support groups with a single device in them.

Users can choose between a DMA type or IDENTITY type domain for a group.
If multiple devices share a single group and neither type works for all
of them, they have to disable the IOMMU. That isn't a valid configuration
from the IOMMU's point of view.

>
> Both with v3 and the proposed method here, iommu_should_identity_map is
> determining whether the device requires an identity map. In v3 this is
> called once up front by the generic code to determine for the IOMMU group
> which domain type to use. In the proposed method I think this would happen
> after initial domain allocation and would trigger the domain to be
> replaced with a different domain.
>
> I prefer the solution in v3. What do you think?

The only concern I have with the solution in v3 is some duplication. Even
if we can ask the Intel IOMMU driver to tell us the default domain type,
we might have to change it after boot. For example, for 32-bit pci devices
we don't know whether they are 64-bit or 32-bit capable during IOMMU
initialization, so we might tell iommu.c to use an identity mapped domain.
After boot we check again, find out that the device is only 32-bit capable
(hence can only access physical memory below 4G) and the default identity
domain doesn't work, and we have to change it to a DMA domain via
iommu_request_dma_domain_for_dev().
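
For reference, the existing check being referred to is roughly this
fragment of the driver's iommu_should_identity_map() (condensed sketch,
not verbatim):

    /* At boot ("startup") we can't yet tell whether the device is
     * 64-bit capable, so assume it is; on later calls the real DMA
     * mask decides whether the identity mapping can stay. */
    if (!startup) {
            u64 dma_mask = *dev->dma_mask;

            if (dev->coherent_dma_mask &&
                dev->coherent_dma_mask < dma_mask)
                    dma_mask = dev->coherent_dma_mask;

            /* Identity map only if the device can reach all memory. */
            return dma_mask >= dma_get_required_mask(dev);
    }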

So to keep it simple and straightforward, we can just let iommu.c
determine the default domain type, and after that the Intel IOMMU driver
can tweak it according to the quirk bits in the driver.

Best regards,
Lu Baolu

2019-04-25 07:52:14

by Tom Murphy

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

I can see two potential problems with these patches that should be addressed:

The default domain of a group can be changed to type
IOMMU_DOMAIN_IDENTITY via the command line. With these patches we are
returning the shared si_domain for type IOMMU_DOMAIN_IDENTITY, so there's
a chance the si_domain could be freed while it is still in use elsewhere
when a group is freed, for example here in iommu_group_release:
https://github.com/torvalds/linux/blob/cd8dead0c39457e58ec1d36db93aedca811d48f1/drivers/iommu/iommu.c#L376
"if (group->default_domain)
iommu_domain_free(group->default_domain);"
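
One possible guard on the driver side, assuming domain_free is routed
through intel_iommu_domain_free() (a sketch, not a vetted fix):

    static void intel_iommu_domain_free(struct iommu_domain *domain)
    {
            /* si_domain is the single shared identity domain; freeing
             * it through one group's release would leave every other
             * group's default_domain dangling, so never free it here. */
            if (to_dmar_domain(domain) == si_domain)
                    return;

            domain_exit(to_dmar_domain(domain));
    }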

Also, now that a domain is attached to a device earlier, we should
implement the is_attach_deferred callback and use it to defer the domain
attach from iommu driver init to device driver init when the iommu is
pre-enabled in a kdump kernel, as in this commit:
https://github.com/torvalds/linux/commit/df3f7a6e8e855e4ff533508807cd7c3723faa51f
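
Modelled on that commit, the callback could look roughly like this (the
defer_attach flag in the per-device info is an assumption about where
VT-d would record the deferred state), wired up via an
.is_attach_deferred entry in intel_iommu_ops:

    static bool intel_iommu_is_attach_deferred(struct iommu_domain *domain,
                                               struct device *dev)
    {
            struct device_domain_info *info = dev->archdata.iommu;

            /* Hypothetical flag: set during driver init when the IOMMU
             * is found pre-enabled (e.g. in a kdump kernel), cleared
             * once the device driver actually attaches. */
            return info && info->defer_attach;
    }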


On Tue, Apr 16, 2019 at 3:24 AM Lu Baolu <[email protected]> wrote:
>
> Hi James,
>
> On 4/15/19 10:16 PM, James Sewart wrote:
> > Hey Lu,
> >
> >> On 10 Apr 2019, at 06:22, Lu Baolu <[email protected]> wrote:
> >>
> >> Hi James,
> >>
> >> On 4/6/19 2:02 AM, James Sewart wrote:
> >>> Hey Lu,
> >>> My bad, did some debugging on my end. The issue was swapping out
> >>> find_domain for iommu_get_domain_for_dev. It seems in some situations the
> >>> domain is not attached to the group but the device is expected to have the
> >>> domain still stored in its archdata.
> >>> I’ve attached the final patch with find_domain unremoved which seems to
> >>> work in my testing.
> >>
> >> Just looked into your v3 patch set and some thoughts from my end posted
> >> here just for your information.
> >>
> >> Let me post the problems we want to address.
> >>
> >> 1. When allocating a new group for a device, how should we determine the
> >> type of the default domain?
> >>
> >> 2. If we need to put a device into an existing group which uses a
> >> different type of domain from what the device desires to use, we might
> >> break the functionality of the device.
> >>
> >> My new thought is letting the iommu generic code to determine the
> >> default domain type (hence my proposed vendor specific default domain
> >> type patches could be dropped).
> >>
> >> If the default domain type is dynamical mapping, and later in iommu_no_mapping(), we determines that we must use an identity domain,
> >> we then call iommu_request_dm_for_dev(dev).
> >>
> >> If the default domain type is identity mapping, and later in
> >> iommu_no_mapping(), we determined that we must use a dynamical domain,
> >> we then call iommu_request_dma_domain_for_dev(dev).
> >>
> >> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
> >> implement iommu_request_dma_domain_for_dev().
> >>
> >> With this done, your patch titled "Create an IOMMU group for devices
> >> that require an identity map" could also be dropped.
> >>
> >> Any thoughts?
> >
> > Theres a couple issues I can think of. iommu_request_dm_for_dev changes
> > the domain for all devices within the devices group, if we may have
> > devices with different domain requirements in the same group then only the
> > last initialised device will get the domain it wants. If we want to add
> > iommu_request_dma_domain_for_dev then we would still need another IOMMU
> > group for identity domain devices.
>
> Current solution (a.k.a. v3) for this is to assign a new group to the
> device which requires an identity mapped domain. This will break the
> functionality of device direct pass-through (to user level). The iommu
> group represents the minimum granularity of iommu isolation and
> protection. The requirement of identity mapped domain should not be a
> factor for a new group.
>
> Both iomm_request_dma_domain_for_dev() or iommu_request_dm_for_dev()
> only support groups with a single device in it.
>
> The users could choose between domains of DMA type or IDENTITY type for
> a group. If multiple devices share a single group and both don't work
> for them, they have to disable the IOMMU. This isn't a valid
> configuration from IOMMU's point of view.
>
> >
> > Both with v3 and the proposed method here, iommu_should_identity_map is
> > determining whether the device requires an identity map. In v3 this is
> > called once up front by the generic code to determine for the IOMMU group
> > which domain type to use. In the proposed method I think this would happen
> > after initial domain allocation and would trigger the domain to be
> > replaced with a different domain.
> >
> > I prefer the solution in v3. What do you think?
>
> The only concern of solution in v3 from my point of view is some kind of
> duplication. Even we can ask the Intel IOMMU driver to tell the default
> domain type, we might change it after boot up. For example, for 32-bit
> pci devices, we don't know whether it's 64-bit or 32-bit capable during
> IOMMU initialization, so we might tell iommu.c to use identity mapped
> domain. After boot up, we check it again, and find out that it's only
> 32-bit capable (hence only can access physical memory below 4G) and the
> default identity domain doesn't work. And we have to change it to DMA
> domain via iommu_request_dma_domain_for_dev().
>
> So to make it simple and straight-forward, we can just let iommu.c to
> determine the default domain type and after that the Intel IOMMU driver
> could tweak it according to the quirk bits in the driver.
>
> Best regards,
> Lu Baolu

2019-04-25 07:56:54

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 3/7] iommu/vt-d: Expose ISA direct mapping region via iommu_get_resv_regions

Hi Tom,

On 4/25/19 7:47 AM, Tom Murphy wrote:
> I can see two potential problems with these patches that should be addressed:
>
> The default domain of a group can be changed to type
> IOMMU_DOMAIN_IDENTITY via the command line. With these patches we are
> returning the si_domain for type IOMMU_DOMAIN_IDENTITY. There's a
> chance the shared si_domain could be freed while it is still being
> used when a group is freed. For example here in the
> iommu_group_release:
> https://github.com/torvalds/linux/blob/cd8dead0c39457e58ec1d36db93aedca811d48f1/drivers/iommu/iommu.c#L376
> "if (group->default_domain)
> iommu_domain_free(group->default_domain);"
>
> Also now that a domain is attached to a device earlier we should
> implement the is_attach_deferred call-back and use it to defer the
> domain attach from iommu driver init to device driver init when iommu
> is pre-enabled in kdump kernel. like in this commit:
> https://github.com/torvalds/linux/commit/df3f7a6e8e855e4ff533508807cd7c3723faa51f
>

Agreed on both points; they should be considered during development.

Best regards,
Lu Baolu

>
> On Tue, Apr 16, 2019 at 3:24 AM Lu Baolu <[email protected]> wrote:
>>
>> Hi James,
>>
>> On 4/15/19 10:16 PM, James Sewart wrote:
>>> Hey Lu,
>>>
>>>> On 10 Apr 2019, at 06:22, Lu Baolu <[email protected]> wrote:
>>>>
>>>> Hi James,
>>>>
>>>> On 4/6/19 2:02 AM, James Sewart wrote:
>>>>> Hey Lu,
>>>>> My bad, did some debugging on my end. The issue was swapping out
>>>>> find_domain for iommu_get_domain_for_dev. It seems in some situations the
>>>>> domain is not attached to the group but the device is expected to have the
>>>>> domain still stored in its archdata.
>>>>> I’ve attached the final patch with find_domain unremoved which seems to
>>>>> work in my testing.
>>>>
>>>> Just looked into your v3 patch set and some thoughts from my end posted
>>>> here just for your information.
>>>>
>>>> Let me post the problems we want to address.
>>>>
>>>> 1. When allocating a new group for a device, how should we determine the
>>>> type of the default domain?
>>>>
>>>> 2. If we need to put a device into an existing group which uses a
>>>> different type of domain from what the device desires to use, we might
>>>> break the functionality of the device.
>>>>
>>>> My new thought is letting the iommu generic code to determine the
>>>> default domain type (hence my proposed vendor specific default domain
>>>> type patches could be dropped).
>>>>
>>>> If the default domain type is dynamical mapping, and later in iommu_no_mapping(), we determines that we must use an identity domain,
>>>> we then call iommu_request_dm_for_dev(dev).
>>>>
>>>> If the default domain type is identity mapping, and later in
>>>> iommu_no_mapping(), we determined that we must use a dynamical domain,
>>>> we then call iommu_request_dma_domain_for_dev(dev).
>>>>
>>>> We already have iommu_request_dm_for_dev() in iommu.c. We only need to
>>>> implement iommu_request_dma_domain_for_dev().
>>>>
>>>> With this done, your patch titled "Create an IOMMU group for devices
>>>> that require an identity map" could also be dropped.
>>>>
>>>> Any thoughts?
>>>
>>> Theres a couple issues I can think of. iommu_request_dm_for_dev changes
>>> the domain for all devices within the devices group, if we may have
>>> devices with different domain requirements in the same group then only the
>>> last initialised device will get the domain it wants. If we want to add
>>> iommu_request_dma_domain_for_dev then we would still need another IOMMU
>>> group for identity domain devices.
>>
>> Current solution (a.k.a. v3) for this is to assign a new group to the
>> device which requires an identity mapped domain. This will break the
>> functionality of device direct pass-through (to user level). The iommu
>> group represents the minimum granularity of iommu isolation and
>> protection. The requirement of identity mapped domain should not be a
>> factor for a new group.
>>
>> Both iomm_request_dma_domain_for_dev() or iommu_request_dm_for_dev()
>> only support groups with a single device in it.
>>
>> The users could choose between domains of DMA type or IDENTITY type for
>> a group. If multiple devices share a single group and both don't work
>> for them, they have to disable the IOMMU. This isn't a valid
>> configuration from IOMMU's point of view.
>>
>>>
>>> Both with v3 and the proposed method here, iommu_should_identity_map is
>>> determining whether the device requires an identity map. In v3 this is
>>> called once up front by the generic code to determine for the IOMMU group
>>> which domain type to use. In the proposed method I think this would happen
>>> after initial domain allocation and would trigger the domain to be
>>> replaced with a different domain.
>>>
>>> I prefer the solution in v3. What do you think?
>>
>> The only concern of solution in v3 from my point of view is some kind of
>> duplication. Even we can ask the Intel IOMMU driver to tell the default
>> domain type, we might change it after boot up. For example, for 32-bit
>> pci devices, we don't know whether it's 64-bit or 32-bit capable during
>> IOMMU initialization, so we might tell iommu.c to use identity mapped
>> domain. After boot up, we check it again, and find out that it's only
>> 32-bit capable (hence only can access physical memory below 4G) and the
>> default identity domain doesn't work. And we have to change it to DMA
>> domain via iommu_request_dma_domain_for_dev().
>>
>> So to make it simple and straight-forward, we can just let iommu.c to
>> determine the default domain type and after that the Intel IOMMU driver
>> could tweak it according to the quirk bits in the driver.
>>
>> Best regards,
>> Lu Baolu
>