2023-08-09 13:30:36

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 00/13] [PULL REQUEST] Intel IOMMU updates for Linux v6.6

Hi Joerg,

This includes patches queued for v6.6. They aim to:

- Enable idxd device DMA with pasid through iommu dma ops.
- Lift RESV_DIRECT check from VT-d driver to core.
- Miscellaneous cleanups and fixes.

All patches are based on top of the next branch and vt-d patches can
apply to v6.5-rc5 as well.

The series is also available at:
https://github.com/LuBaolu/intel-iommu/commits/vtd-update-for-v6.6

Please consider them for v6.6-rc1.

Best regards,
Baolu

Jacob Pan (3):
iommu: Generalize PASID 0 for normal DMA w/o PASID
iommu: Move global PASID allocation from SVA to core
dmaengine/idxd: Re-enable kernel workqueue under DMA API

Lu Baolu (7):
iommu/vt-d: Add domain_flush_pasid_iotlb()
iommu/vt-d: Remove pasid_mutex
iommu/vt-d: Make prq draining code generic
iommu/vt-d: Prepare for set_dev_pasid callback
iommu/vt-d: Add set_dev_pasid callback for dma domain
iommu: Prevent RESV_DIRECT devices from blocking domains
iommu/vt-d: Remove rmrr check in domain attaching device path

Yanfei Xu (2):
iommu/vt-d: Fix to flush cache of PASID directory table
iommu/vt-d: Fix to convert mm pfn to dma pfn

YueHaibing (1):
iommu/vt-d: Remove unused extern declaration dmar_parse_dev_scope()

include/linux/dmar.h | 2 -
include/linux/iommu.h | 13 +
drivers/dma/idxd/idxd.h | 9 +
drivers/iommu/intel/iommu.h | 9 +
drivers/iommu/intel/pasid.h | 2 -
drivers/dma/idxd/device.c | 39 ++-
drivers/dma/idxd/dma.c | 5 +-
drivers/dma/idxd/init.c | 54 +++-
drivers/dma/idxd/sysfs.c | 7 -
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 +-
drivers/iommu/intel/iommu.c | 237 +++++++++++-------
drivers/iommu/intel/pasid.c | 4 +-
drivers/iommu/intel/svm.c | 62 +----
drivers/iommu/iommu-sva.c | 29 +--
drivers/iommu/iommu.c | 65 ++++-
16 files changed, 330 insertions(+), 225 deletions(-)

--
2.34.1



2023-08-09 13:31:06

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 02/13] iommu: Move global PASID allocation from SVA to core

From: Jacob Pan <[email protected]>

Intel ENQCMD requires a single PASID to be shared between multiple
devices, as the PASID is stored in a single MSR register per-process
and userspace can use only that one PASID.

This means that the PASID allocation for any ENQCMD using device driver
must always come from a shared global pool, regardless of what kind of
domain the PASID will be used with.

Split the code for the global PASID allocator into
iommu_alloc/free_global_pasid() so that drivers can attach non-SVA
domains to PASIDs as well.

This patch moves global PASID allocation APIs from SVA to IOMMU APIs.
Reserved PASIDs, currently only RID_PASID, are excluded from the global
PASID allocation.

It is expected that device drivers will use the allocated PASIDs to
attach to appropriate IOMMU domains for use.

Reviewed-by: Lu Baolu <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/iommu.h | 10 ++++++++++
drivers/iommu/iommu-sva.c | 29 ++++++++++-------------------
drivers/iommu/iommu.c | 28 ++++++++++++++++++++++++++++
3 files changed, 48 insertions(+), 19 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e71a4153041d..4cb584b08b81 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -197,6 +197,7 @@ enum iommu_dev_features {
};

#define IOMMU_NO_PASID (0U) /* Reserved for DMA w/o PASID */
+#define IOMMU_FIRST_GLOBAL_PASID (1U) /*starting range for allocation */
#define IOMMU_PASID_INVALID (-1U)
typedef unsigned int ioasid_t;

@@ -733,6 +734,8 @@ void iommu_detach_device_pasid(struct iommu_domain *domain,
struct iommu_domain *
iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid,
unsigned int type);
+ioasid_t iommu_alloc_global_pasid(struct device *dev);
+void iommu_free_global_pasid(ioasid_t pasid);
#else /* CONFIG_IOMMU_API */

struct iommu_ops {};
@@ -1094,6 +1097,13 @@ iommu_get_domain_for_dev_pasid(struct device *dev, ioasid_t pasid,
{
return NULL;
}
+
+static inline ioasid_t iommu_alloc_global_pasid(struct device *dev)
+{
+ return IOMMU_PASID_INVALID;
+}
+
+static inline void iommu_free_global_pasid(ioasid_t pasid) {}
#endif /* CONFIG_IOMMU_API */

/**
diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 05c0fb2acbc4..b78671a8a914 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -10,34 +10,30 @@
#include "iommu-sva.h"

static DEFINE_MUTEX(iommu_sva_lock);
-static DEFINE_IDA(iommu_global_pasid_ida);

/* Allocate a PASID for the mm within range (inclusive) */
-static int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max)
+static int iommu_sva_alloc_pasid(struct mm_struct *mm, struct device *dev)
{
+ ioasid_t pasid;
int ret = 0;

- if (min == IOMMU_PASID_INVALID ||
- max == IOMMU_PASID_INVALID ||
- min == 0 || max < min)
- return -EINVAL;
-
if (!arch_pgtable_dma_compat(mm))
return -EBUSY;

mutex_lock(&iommu_sva_lock);
/* Is a PASID already associated with this mm? */
if (mm_valid_pasid(mm)) {
- if (mm->pasid < min || mm->pasid > max)
+ if (mm->pasid >= dev->iommu->max_pasids)
ret = -EOVERFLOW;
goto out;
}

- ret = ida_alloc_range(&iommu_global_pasid_ida, min, max, GFP_KERNEL);
- if (ret < 0)
+ pasid = iommu_alloc_global_pasid(dev);
+ if (pasid == IOMMU_PASID_INVALID) {
+ ret = -ENOSPC;
goto out;
-
- mm->pasid = ret;
+ }
+ mm->pasid = pasid;
ret = 0;
out:
mutex_unlock(&iommu_sva_lock);
@@ -64,15 +60,10 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
{
struct iommu_domain *domain;
struct iommu_sva *handle;
- ioasid_t max_pasids;
int ret;

- max_pasids = dev->iommu->max_pasids;
- if (!max_pasids)
- return ERR_PTR(-EOPNOTSUPP);
-
/* Allocate mm->pasid if necessary. */
- ret = iommu_sva_alloc_pasid(mm, 1, max_pasids - 1);
+ ret = iommu_sva_alloc_pasid(mm, dev);
if (ret)
return ERR_PTR(ret);

@@ -217,5 +208,5 @@ void mm_pasid_drop(struct mm_struct *mm)
if (likely(!mm_valid_pasid(mm)))
return;

- ida_free(&iommu_global_pasid_ida, mm->pasid);
+ iommu_free_global_pasid(mm->pasid);
}
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 25d7327e8013..738fee1a24f7 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -39,6 +39,7 @@

static struct kset *iommu_group_kset;
static DEFINE_IDA(iommu_group_ida);
+static DEFINE_IDA(iommu_global_pasid_ida);

static unsigned int iommu_def_domain_type __read_mostly;
static bool iommu_dma_strict __read_mostly = IS_ENABLED(CONFIG_IOMMU_DEFAULT_DMA_STRICT);
@@ -3453,3 +3454,30 @@ struct iommu_domain *iommu_sva_domain_alloc(struct device *dev,

return domain;
}
+
+ioasid_t iommu_alloc_global_pasid(struct device *dev)
+{
+ int ret;
+
+ /* max_pasids == 0 means that the device does not support PASID */
+ if (!dev->iommu->max_pasids)
+ return IOMMU_PASID_INVALID;
+
+ /*
+ * max_pasids is set up by vendor driver based on number of PASID bits
+ * supported but the IDA allocation is inclusive.
+ */
+ ret = ida_alloc_range(&iommu_global_pasid_ida, IOMMU_FIRST_GLOBAL_PASID,
+ dev->iommu->max_pasids - 1, GFP_KERNEL);
+ return ret < 0 ? IOMMU_PASID_INVALID : ret;
+}
+EXPORT_SYMBOL_GPL(iommu_alloc_global_pasid);
+
+void iommu_free_global_pasid(ioasid_t pasid)
+{
+ if (WARN_ON(pasid == IOMMU_PASID_INVALID))
+ return;
+
+ ida_free(&iommu_global_pasid_ida, pasid);
+}
+EXPORT_SYMBOL_GPL(iommu_free_global_pasid);
--
2.34.1


2023-08-09 13:31:48

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 01/13] iommu: Generalize PASID 0 for normal DMA w/o PASID

From: Jacob Pan <[email protected]>

PCIe Process address space ID (PASID) is used to tag DMA traffic, it
provides finer grained isolation than requester ID (RID).

For each device/RID, 0 is a special PASID for the normal DMA (no
PASID). This is universal across all architectures that supports PASID,
therefore warranted to be reserved globally and declared in the common
header. Consequently, we can avoid the conflict between different PASID
use cases in the generic code. e.g. SVA and DMA API with PASIDs.

This paved away for device drivers to choose global PASID policy while
continue doing normal DMA.

Noting that VT-d could support none-zero RID/NO_PASID, but currently not
used.

Reviewed-by: Lu Baolu <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Reviewed-by: Jean-Philippe Brucker <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/iommu.h | 1 +
drivers/iommu/intel/pasid.h | 2 --
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 2 +-
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 16 ++++++-------
drivers/iommu/intel/iommu.c | 24 +++++++++----------
drivers/iommu/intel/pasid.c | 2 +-
6 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index f1e18e81fca7..e71a4153041d 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -196,6 +196,7 @@ enum iommu_dev_features {
IOMMU_DEV_FEAT_IOPF,
};

+#define IOMMU_NO_PASID (0U) /* Reserved for DMA w/o PASID */
#define IOMMU_PASID_INVALID (-1U)
typedef unsigned int ioasid_t;

diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index d6b7d21244b1..4e9e68c3c388 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -10,8 +10,6 @@
#ifndef __INTEL_PASID_H
#define __INTEL_PASID_H

-#define PASID_RID2PASID 0x0
-#define PASID_MIN 0x1
#define PASID_MAX 0x100000
#define PASID_PTE_MASK 0x3F
#define PASID_PTE_PRESENT 1
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
index a5a63b1c947e..5e6b39881c04 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c
@@ -80,7 +80,7 @@ arm_smmu_share_asid(struct mm_struct *mm, u16 asid)
* be some overlap between use of both ASIDs, until we invalidate the
* TLB.
*/
- arm_smmu_write_ctx_desc(smmu_domain, 0, cd);
+ arm_smmu_write_ctx_desc(smmu_domain, IOMMU_NO_PASID, cd);

/* Invalidate TLB entries previously associated with that context */
arm_smmu_tlb_inv_asid(smmu, asid);
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9b0dc3505601..ee70687f060b 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1059,7 +1059,7 @@ int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, int ssid,
/*
* This function handles the following cases:
*
- * (1) Install primary CD, for normal DMA traffic (SSID = 0).
+ * (1) Install primary CD, for normal DMA traffic (SSID = IOMMU_NO_PASID = 0).
* (2) Install a secondary CD, for SID+SSID traffic.
* (3) Update ASID of a CD. Atomically write the first 64 bits of the
* CD, then invalidate the old entry and mappings.
@@ -1607,7 +1607,7 @@ static void arm_smmu_handle_ppr(struct arm_smmu_device *smmu, u64 *evt)

sid = FIELD_GET(PRIQ_0_SID, evt[0]);
ssv = FIELD_GET(PRIQ_0_SSID_V, evt[0]);
- ssid = ssv ? FIELD_GET(PRIQ_0_SSID, evt[0]) : 0;
+ ssid = ssv ? FIELD_GET(PRIQ_0_SSID, evt[0]) : IOMMU_NO_PASID;
last = FIELD_GET(PRIQ_0_PRG_LAST, evt[0]);
grpid = FIELD_GET(PRIQ_1_PRG_IDX, evt[1]);

@@ -1748,7 +1748,7 @@ arm_smmu_atc_inv_to_cmd(int ssid, unsigned long iova, size_t size,
*/
*cmd = (struct arm_smmu_cmdq_ent) {
.opcode = CMDQ_OP_ATC_INV,
- .substream_valid = !!ssid,
+ .substream_valid = (ssid != IOMMU_NO_PASID),
.atc.ssid = ssid,
};

@@ -1795,7 +1795,7 @@ static int arm_smmu_atc_inv_master(struct arm_smmu_master *master)
struct arm_smmu_cmdq_ent cmd;
struct arm_smmu_cmdq_batch cmds;

- arm_smmu_atc_inv_to_cmd(0, 0, 0, &cmd);
+ arm_smmu_atc_inv_to_cmd(IOMMU_NO_PASID, 0, 0, &cmd);

cmds.num = 0;
for (i = 0; i < master->num_streams; i++) {
@@ -1875,7 +1875,7 @@ static void arm_smmu_tlb_inv_context(void *cookie)
cmd.tlbi.vmid = smmu_domain->s2_cfg.vmid;
arm_smmu_cmdq_issue_cmd_with_sync(smmu, &cmd);
}
- arm_smmu_atc_inv_domain(smmu_domain, 0, 0, 0);
+ arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, 0, 0);
}

static void __arm_smmu_tlb_inv_range(struct arm_smmu_cmdq_ent *cmd,
@@ -1968,7 +1968,7 @@ static void arm_smmu_tlb_inv_range_domain(unsigned long iova, size_t size,
* Unfortunately, this can't be leaf-only since we may have
* zapped an entire table.
*/
- arm_smmu_atc_inv_domain(smmu_domain, 0, iova, size);
+ arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, iova, size);
}

void arm_smmu_tlb_inv_range_asid(unsigned long iova, size_t size, int asid,
@@ -2142,7 +2142,7 @@ static int arm_smmu_domain_finalise_s1(struct arm_smmu_domain *smmu_domain,
* the master has been added to the devices list for this domain.
* This isn't an issue because the STE hasn't been installed yet.
*/
- ret = arm_smmu_write_ctx_desc(smmu_domain, 0, &cfg->cd);
+ ret = arm_smmu_write_ctx_desc(smmu_domain, IOMMU_NO_PASID, &cfg->cd);
if (ret)
goto out_free_cd_tables;

@@ -2328,7 +2328,7 @@ static void arm_smmu_enable_ats(struct arm_smmu_master *master)
pdev = to_pci_dev(master->dev);

atomic_inc(&smmu_domain->nr_ats_masters);
- arm_smmu_atc_inv_domain(smmu_domain, 0, 0, 0);
+ arm_smmu_atc_inv_domain(smmu_domain, IOMMU_NO_PASID, 0, 0);
if (pci_enable_ats(pdev, stu))
dev_err(master->dev, "Failed to enable ATS (STU %zu)\n", stu);
}
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d5ca2387e65c..89013a2913af 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -877,7 +877,7 @@ void dmar_fault_dump_ptes(struct intel_iommu *iommu, u16 source_id,
}
/* For request-without-pasid, get the pasid from context entry */
if (intel_iommu_sm && pasid == IOMMU_PASID_INVALID)
- pasid = PASID_RID2PASID;
+ pasid = IOMMU_NO_PASID;

dir_index = pasid >> PASID_PDE_SHIFT;
pde = &dir[dir_index];
@@ -1449,7 +1449,7 @@ static void __iommu_flush_dev_iotlb(struct device_domain_info *info,
qdep = info->ats_qdep;
qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
qdep, addr, mask);
- quirk_extra_dev_tlb_flush(info, addr, mask, PASID_RID2PASID, qdep);
+ quirk_extra_dev_tlb_flush(info, addr, mask, IOMMU_NO_PASID, qdep);
}

static void iommu_flush_dev_iotlb(struct dmar_domain *domain,
@@ -1484,7 +1484,7 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
ih = 1 << 6;

if (domain->use_first_level) {
- qi_flush_piotlb(iommu, did, PASID_RID2PASID, addr, pages, ih);
+ qi_flush_piotlb(iommu, did, IOMMU_NO_PASID, addr, pages, ih);
} else {
unsigned long bitmask = aligned_pages - 1;

@@ -1554,7 +1554,7 @@ static void intel_flush_iotlb_all(struct iommu_domain *domain)
u16 did = domain_id_iommu(dmar_domain, iommu);

if (dmar_domain->use_first_level)
- qi_flush_piotlb(iommu, did, PASID_RID2PASID, 0, -1, 0);
+ qi_flush_piotlb(iommu, did, IOMMU_NO_PASID, 0, -1, 0);
else
iommu->flush.flush_iotlb(iommu, did, 0, 0,
DMA_TLB_DSI_FLUSH);
@@ -1940,7 +1940,7 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
context_pdts(pds);

/* Setup the RID_PASID field: */
- context_set_sm_rid2pasid(context, PASID_RID2PASID);
+ context_set_sm_rid2pasid(context, IOMMU_NO_PASID);

/*
* Setup the Device-TLB enable bit and Page request
@@ -2420,13 +2420,13 @@ static int dmar_domain_attach_device(struct dmar_domain *domain,
/* Setup the PASID entry for requests without PASID: */
if (hw_pass_through && domain_type_is_si(domain))
ret = intel_pasid_setup_pass_through(iommu, domain,
- dev, PASID_RID2PASID);
+ dev, IOMMU_NO_PASID);
else if (domain->use_first_level)
ret = domain_setup_first_level(iommu, domain, dev,
- PASID_RID2PASID);
+ IOMMU_NO_PASID);
else
ret = intel_pasid_setup_second_level(iommu, domain,
- dev, PASID_RID2PASID);
+ dev, IOMMU_NO_PASID);
if (ret) {
dev_err(dev, "Setup RID2PASID failed\n");
device_block_translation(dev);
@@ -3961,7 +3961,7 @@ static void dmar_remove_one_dev_info(struct device *dev)
if (!dev_is_real_dma_subdevice(info->dev)) {
if (dev_is_pci(info->dev) && sm_supported(iommu))
intel_pasid_tear_down_entry(iommu, info->dev,
- PASID_RID2PASID, false);
+ IOMMU_NO_PASID, false);

iommu_disable_pci_caps(info);
domain_context_clear(info);
@@ -3990,7 +3990,7 @@ static void device_block_translation(struct device *dev)
if (!dev_is_real_dma_subdevice(dev)) {
if (sm_supported(iommu))
intel_pasid_tear_down_entry(iommu, dev,
- PASID_RID2PASID, false);
+ IOMMU_NO_PASID, false);
else
domain_context_clear(info);
}
@@ -4324,7 +4324,7 @@ static void domain_set_force_snooping(struct dmar_domain *domain)

list_for_each_entry(info, &domain->devices, link)
intel_pasid_setup_page_snoop_control(info->iommu, info->dev,
- PASID_RID2PASID);
+ IOMMU_NO_PASID);
}

static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
@@ -4980,7 +4980,7 @@ void quirk_extra_dev_tlb_flush(struct device_domain_info *info,
return;

sid = PCI_DEVID(info->bus, info->devfn);
- if (pasid == PASID_RID2PASID) {
+ if (pasid == IOMMU_NO_PASID) {
qi_flush_dev_iotlb(info->iommu, sid, info->pfsid,
qdep, address, mask);
} else {
diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index c5d479770e12..23dca3bc319d 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -438,7 +438,7 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu,
* SVA usage, device could do DMA with multiple PASIDs. It is more
* efficient to flush devTLB specific to the PASID.
*/
- if (pasid == PASID_RID2PASID)
+ if (pasid == IOMMU_NO_PASID)
qi_flush_dev_iotlb(iommu, sid, pfsid, qdep, 0, 64 - VTD_PAGE_SHIFT);
else
qi_flush_dev_iotlb_pasid(iommu, sid, pfsid, pasid, qdep, 0, 64 - VTD_PAGE_SHIFT);
--
2.34.1


2023-08-09 13:32:46

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 09/13] iommu: Prevent RESV_DIRECT devices from blocking domains

The IOMMU_RESV_DIRECT flag indicates that a memory region must be mapped
1:1 at all times. This means that the region must always be accessible to
the device, even if the device is attached to a blocking domain. This is
equal to saying that IOMMU_RESV_DIRECT flag prevents devices from being
attached to blocking domains.

This also implies that devices that implement RESV_DIRECT regions will be
prevented from being assigned to user space since taking the DMA ownership
immediately switches to a blocking domain.

The rule of preventing devices with the IOMMU_RESV_DIRECT regions from
being assigned to user space has existed in the Intel IOMMU driver for
a long time. Now, this rule is being lifted up to a general core rule,
as other architectures like AMD and ARM also have RMRR-like reserved
regions. This has been discussed in the community mailing list and refer
to below link for more details.

Other places using unmanaged domains for kernel DMA must follow the
iommu_get_resv_regions() and setup IOMMU_RESV_DIRECT - we do not restrict
them in the core code.

Cc: Robin Murphy <[email protected]>
Cc: Alex Williamson <[email protected]>
Cc: Kevin Tian <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Link: https://lore.kernel.org/linux-iommu/BN9PR11MB5276E84229B5BD952D78E9598C639@BN9PR11MB5276.namprd11.prod.outlook.com
Signed-off-by: Lu Baolu <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Acked-by: Joerg Roedel <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
include/linux/iommu.h | 2 ++
drivers/iommu/iommu.c | 37 +++++++++++++++++++++++++++----------
2 files changed, 29 insertions(+), 10 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 4cb584b08b81..9ed139bf111f 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -414,6 +414,7 @@ struct iommu_fault_param {
* @max_pasids: number of PASIDs this device can consume
* @attach_deferred: the dma domain attachment is deferred
* @pci_32bit_workaround: Limit DMA allocations to 32-bit IOVAs
+ * @require_direct: device requires IOMMU_RESV_DIRECT regions
*
* TODO: migrate other per device data pointers under iommu_dev_data, e.g.
* struct iommu_group *iommu_group;
@@ -428,6 +429,7 @@ struct dev_iommu {
u32 max_pasids;
u32 attach_deferred:1;
u32 pci_32bit_workaround:1;
+ u32 require_direct:1;
};

int iommu_device_register(struct iommu_device *iommu,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 738fee1a24f7..49e96a267a2b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1032,14 +1032,12 @@ static int iommu_create_device_direct_mappings(struct iommu_domain *domain,
unsigned long pg_size;
int ret = 0;

- if (!iommu_is_dma_domain(domain))
- return 0;
-
- BUG_ON(!domain->pgsize_bitmap);
-
- pg_size = 1UL << __ffs(domain->pgsize_bitmap);
+ pg_size = domain->pgsize_bitmap ? 1UL << __ffs(domain->pgsize_bitmap) : 0;
INIT_LIST_HEAD(&mappings);

+ if (WARN_ON_ONCE(iommu_is_dma_domain(domain) && !pg_size))
+ return -EINVAL;
+
iommu_get_resv_regions(dev, &mappings);

/* We need to consider overlapping regions for different devices */
@@ -1047,13 +1045,17 @@ static int iommu_create_device_direct_mappings(struct iommu_domain *domain,
dma_addr_t start, end, addr;
size_t map_size = 0;

+ if (entry->type == IOMMU_RESV_DIRECT)
+ dev->iommu->require_direct = 1;
+
+ if ((entry->type != IOMMU_RESV_DIRECT &&
+ entry->type != IOMMU_RESV_DIRECT_RELAXABLE) ||
+ !iommu_is_dma_domain(domain))
+ continue;
+
start = ALIGN(entry->start, pg_size);
end = ALIGN(entry->start + entry->length, pg_size);

- if (entry->type != IOMMU_RESV_DIRECT &&
- entry->type != IOMMU_RESV_DIRECT_RELAXABLE)
- continue;
-
for (addr = start; addr <= end; addr += pg_size) {
phys_addr_t phys_addr;

@@ -2182,6 +2184,21 @@ static int __iommu_device_set_domain(struct iommu_group *group,
{
int ret;

+ /*
+ * If the device requires IOMMU_RESV_DIRECT then we cannot allow
+ * the blocking domain to be attached as it does not contain the
+ * required 1:1 mapping. This test effectively excludes the device
+ * being used with iommu_group_claim_dma_owner() which will block
+ * vfio and iommufd as well.
+ */
+ if (dev->iommu->require_direct &&
+ (new_domain->type == IOMMU_DOMAIN_BLOCKED ||
+ new_domain == group->blocking_domain)) {
+ dev_warn(dev,
+ "Firmware has requested this device have a 1:1 IOMMU mapping, rejecting configuring the device without a 1:1 mapping. Contact your platform vendor.\n");
+ return -EINVAL;
+ }
+
if (dev->iommu->attach_deferred) {
if (new_domain == group->default_domain)
return 0;
--
2.34.1


2023-08-09 13:52:21

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 11/13] iommu/vt-d: Fix to flush cache of PASID directory table

From: Yanfei Xu <[email protected]>

Even the PCI devices don't support pasid capability, PASID table is
mandatory for a PCI device in scalable mode. However flushing cache
of pasid directory table for these devices are not taken after pasid
table is allocated as the "size" of table is zero. Fix it by
calculating the size by page order.

Found this when reading the code, no real problem encountered for now.

Fixes: 194b3348bdbb ("iommu/vt-d: Fix PASID directory pointer coherency")
Suggested-by: Lu Baolu <[email protected]>
Signed-off-by: Yanfei Xu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/pasid.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 23dca3bc319d..8f92b92f3d2a 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -129,7 +129,7 @@ int intel_pasid_alloc_table(struct device *dev)
info->pasid_table = pasid_table;

if (!ecap_coherent(info->iommu->ecap))
- clflush_cache_range(pasid_table->table, size);
+ clflush_cache_range(pasid_table->table, (1 << order) * PAGE_SIZE);

return 0;
}
--
2.34.1


2023-08-09 13:52:51

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 04/13] iommu/vt-d: Remove pasid_mutex

The pasid_mutex was used to protect the paths of set/remove_dev_pasid().
It's duplicate with iommu_sva_lock. Remove it to avoid duplicate code.

Signed-off-by: Lu Baolu <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/iommu/intel/svm.c | 45 +++++----------------------------------
1 file changed, 5 insertions(+), 40 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index e95b339e9cdc..2a82864e9d57 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -259,8 +259,6 @@ static const struct mmu_notifier_ops intel_mmuops = {
.invalidate_range = intel_invalidate_range,
};

-static DEFINE_MUTEX(pasid_mutex);
-
static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
struct intel_svm **rsvm,
struct intel_svm_dev **rsdev)
@@ -268,10 +266,6 @@ static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
struct intel_svm_dev *sdev = NULL;
struct intel_svm *svm;

- /* The caller should hold the pasid_mutex lock */
- if (WARN_ON(!mutex_is_locked(&pasid_mutex)))
- return -EINVAL;
-
if (pasid == IOMMU_PASID_INVALID || pasid >= PASID_MAX)
return -EINVAL;

@@ -371,22 +365,19 @@ static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev,
return ret;
}

-/* Caller must hold pasid_mutex */
-static int intel_svm_unbind_mm(struct device *dev, u32 pasid)
+void intel_svm_remove_dev_pasid(struct device *dev, u32 pasid)
{
struct intel_svm_dev *sdev;
struct intel_iommu *iommu;
struct intel_svm *svm;
struct mm_struct *mm;
- int ret = -EINVAL;

iommu = device_to_iommu(dev, NULL, NULL);
if (!iommu)
- goto out;
+ return;

- ret = pasid_to_svm_sdev(dev, pasid, &svm, &sdev);
- if (ret)
- goto out;
+ if (pasid_to_svm_sdev(dev, pasid, &svm, &sdev))
+ return;
mm = svm->mm;

if (sdev) {
@@ -418,8 +409,6 @@ static int intel_svm_unbind_mm(struct device *dev, u32 pasid)
kfree(svm);
}
}
-out:
- return ret;
}

/* Page request queue descriptor */
@@ -520,19 +509,7 @@ static void intel_svm_drain_prq(struct device *dev, u32 pasid)
goto prq_retry;
}

- /*
- * A work in IO page fault workqueue may try to lock pasid_mutex now.
- * Holding pasid_mutex while waiting in iopf_queue_flush_dev() for
- * all works in the workqueue to finish may cause deadlock.
- *
- * It's unnecessary to hold pasid_mutex in iopf_queue_flush_dev().
- * Unlock it to allow the works to be handled while waiting for
- * them to finish.
- */
- lockdep_assert_held(&pasid_mutex);
- mutex_unlock(&pasid_mutex);
iopf_queue_flush_dev(dev);
- mutex_lock(&pasid_mutex);

/*
* Perform steps described in VT-d spec CH7.10 to drain page
@@ -827,26 +804,14 @@ int intel_svm_page_response(struct device *dev,
return ret;
}

-void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid)
-{
- mutex_lock(&pasid_mutex);
- intel_svm_unbind_mm(dev, pasid);
- mutex_unlock(&pasid_mutex);
-}
-
static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
struct mm_struct *mm = domain->mm;
- int ret;

- mutex_lock(&pasid_mutex);
- ret = intel_svm_bind_mm(iommu, dev, mm);
- mutex_unlock(&pasid_mutex);
-
- return ret;
+ return intel_svm_bind_mm(iommu, dev, mm);
}

static void intel_svm_domain_free(struct iommu_domain *domain)
--
2.34.1


2023-08-09 13:55:18

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 06/13] iommu/vt-d: Prepare for set_dev_pasid callback

The domain_flush_pasid_iotlb() helper function is used to flush the IOTLB
entries for a given PASID. Previously, this function assumed that
RID2PASID was only used for the first-level DMA translation. However, with
the introduction of the set_dev_pasid callback, this assumption is no
longer valid.

Add a check before using the RID2PASID for PASID invalidation. This check
ensures that the domain has been attached to a physical device before
using RID2PASID.

Signed-off-by: Lu Baolu <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/iommu/intel/iommu.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index bdde38a5e43a..4a41aca6a2ba 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1475,7 +1475,8 @@ static void domain_flush_pasid_iotlb(struct intel_iommu *iommu,
unsigned long flags;

spin_lock_irqsave(&domain->lock, flags);
- qi_flush_piotlb(iommu, did, IOMMU_NO_PASID, addr, npages, ih);
+ if (!list_empty(&domain->devices))
+ qi_flush_piotlb(iommu, did, IOMMU_NO_PASID, addr, npages, ih);
spin_unlock_irqrestore(&domain->lock, flags);
}

--
2.34.1


2023-08-09 13:55:57

by Baolu Lu

[permalink] [raw]
Subject: [PATCH 13/13] iommu/vt-d: Remove unused extern declaration dmar_parse_dev_scope()

From: YueHaibing <[email protected]>

Since commit 2e4552893038 ("iommu/vt-d: Unify the way to process DMAR
device scope array") this is not used anymore, so can remove it.

Signed-off-by: YueHaibing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
include/linux/dmar.h | 2 --
1 file changed, 2 deletions(-)

diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index 27dbd4c64860..e34b601b71fd 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -106,8 +106,6 @@ static inline bool dmar_rcu_check(void)
extern int dmar_table_init(void);
extern int dmar_dev_scope_init(void);
extern void dmar_register_bus_notifier(void);
-extern int dmar_parse_dev_scope(void *start, void *end, int *cnt,
- struct dmar_dev_scope **devices, u16 segment);
extern void *dmar_alloc_dev_scope(void *start, void *end, int *cnt);
extern void dmar_free_dev_scope(struct dmar_dev_scope **devices, int *cnt);
extern int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
--
2.34.1


2023-08-09 17:16:07

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH 00/13] [PULL REQUEST] Intel IOMMU updates for Linux v6.6

On Wed, Aug 09, 2023 at 08:47:53PM +0800, Lu Baolu wrote:
> This includes patches queued for v6.6. They aim to:
>
> - Enable idxd device DMA with pasid through iommu dma ops.
> - Lift RESV_DIRECT check from VT-d driver to core.
> - Miscellaneous cleanups and fixes.
>
> All patches are based on top of the next branch and vt-d patches can
> apply to v6.5-rc5 as well.

Pulled, thanks Baolu. There was a minor conflict with the differences
between next and v6.5-rc5, but I resolved that.


Joerg