2023-04-13 04:08:25

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 00/17] [PULL REQUEST] Intel IOMMU updates for Linux v6.4

Hi Joerg,

This includes patches queued for v6.4. They are:

- Allow the VT-d driver to support non-PRI IOPF
- Remove PASID supervisor request support
- Various small and misc cleanups

This series also includes an idxd patch to avoid driver regression after
changes in the IOMMU driver. It has been reviewed and acked by the
maintainers.

The whole series is based on v6.3-rc6 and also available at:
https://github.com/LuBaolu/intel-iommu/commits/vtd-update-for-v6.4

Please pull them for x86/vt-d branch.

Best regards,
Baolu

Change log:
- v2:
- Avoid using "fix" wording in the commit message of a cleanup patch,
no functional change. [David Laight]
- Add Acked-by from Vinod Koul to the idxd patch.

- v1: https://lore.kernel.org/linux-iommu/[email protected]/

Christophe JAILLET (1):
iommu/vt-d: Do not use GFP_ATOMIC when not needed

Jacob Pan (2):
iommu/vt-d: Use non-privileged mode for all PASIDs
iommu/vt-d: Remove PASID supervisor request support

Lu Baolu (7):
dmaengine: idxd: Add enable/disable device IOPF feature
iommu/vt-d: Allow SVA with device-specific IOPF
iommu/vt-d: Move iopf code from SVA to IOPF enabling path
iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path
iommu/vt-d: Move PRI handling to IOPF feature path
iommu/vt-d: Remove unnecessary checks in iopf disabling path
iommu/vt-d: Remove extern from function prototypes

Tina Zhang (7):
iommu/vt-d: Make size of operands same in bitwise operations
iommu/vt-d: Remove BUG_ON on checking valid pfn range
iommu/vt-d: Remove BUG_ON in handling iotlb cache invalidation
iommu/vt-d: Remove BUG_ON when domain->pgd is NULL
iommu/vt-d: Remove BUG_ON in map/unmap()
iommu/vt-d: Remove a useless BUG_ON(dev->is_virtfn)
iommu/vt-d: Remove BUG_ON in dmar_insert_dev_scope()

drivers/iommu/intel/iommu.h | 36 +++---
drivers/iommu/intel/pasid.h | 7 --
drivers/dma/idxd/init.c | 31 ++++-
drivers/iommu/intel/dmar.c | 7 +-
drivers/iommu/intel/iommu.c | 173 +++++++++++++++++-----------
drivers/iommu/intel/irq_remapping.c | 2 +-
drivers/iommu/intel/pasid.c | 43 -------
7 files changed, 155 insertions(+), 144 deletions(-)

--
2.34.1


2023-04-13 04:08:25

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 02/17] iommu/vt-d: Allow SVA with device-specific IOPF

Currently enabling SVA requires IOPF support from the IOMMU and device
PCI PRI. However, some devices can handle IOPF by itself without ever
sending PCI page requests nor advertising PRI capability.

Allow SVA support with IOPF handled either by IOMMU (PCI PRI) or device
driver (device-specific IOPF). As long as IOPF could be handled, SVA
should continue to work.

Reviewed-by: Kevin Tian <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/iommu/intel/iommu.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7c2f4bd33582..caf664448ee9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4650,7 +4650,21 @@ static int intel_iommu_enable_sva(struct device *dev)
if (!(iommu->flags & VTD_FLAG_SVM_CAPABLE))
return -ENODEV;

- if (!info->pasid_enabled || !info->pri_enabled || !info->ats_enabled)
+ if (!info->pasid_enabled || !info->ats_enabled)
+ return -EINVAL;
+
+ /*
+ * Devices having device-specific I/O fault handling should not
+ * support PCI/PRI. The IOMMU side has no means to check the
+ * capability of device-specific IOPF. Therefore, IOMMU can only
+ * default that if the device driver enables SVA on a non-PRI
+ * device, it will handle IOPF in its own way.
+ */
+ if (!info->pri_supported)
+ return 0;
+
+ /* Devices supporting PRI should have it enabled. */
+ if (!info->pri_enabled)
return -EINVAL;

ret = iopf_queue_add_device(iommu->iopf_queue, dev);
--
2.34.1

2023-04-13 04:08:25

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 04/17] iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path

They should be part of the per-device iommu private data initialization.

Reviewed-by: Jacob Pan <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/iommu/intel/iommu.c | 26 +++++++++++---------------
1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index a6f07c74da2d..6d77b4072fdd 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1406,20 +1406,6 @@ static void iommu_enable_pci_caps(struct device_domain_info *info)
return;

pdev = to_pci_dev(info->dev);
- /* For IOMMU that supports device IOTLB throttling (DIT), we assign
- * PFSID to the invalidation desc of a VF such that IOMMU HW can gauge
- * queue depth at PF level. If DIT is not set, PFSID will be treated as
- * reserved, which should be set to 0.
- */
- if (!ecap_dit(info->iommu->ecap))
- info->pfsid = 0;
- else {
- struct pci_dev *pf_pdev;
-
- /* pdev will be returned if device is not a vf */
- pf_pdev = pci_physfn(pdev);
- info->pfsid = pci_dev_id(pf_pdev);
- }

/* The PCIe spec, in its wisdom, declares that the behaviour of
the device if you enable PASID support after ATS support is
@@ -1438,7 +1424,6 @@ static void iommu_enable_pci_caps(struct device_domain_info *info)
!pci_enable_ats(pdev, VTD_PAGE_SHIFT)) {
info->ats_enabled = 1;
domain_update_iotlb(info->domain);
- info->ats_qdep = pci_ats_queue_depth(pdev);
}
}

@@ -4521,6 +4506,17 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
dmar_ats_supported(pdev, iommu)) {
info->ats_supported = 1;
info->dtlb_extra_inval = dev_needs_extra_dtlb_flush(pdev);
+
+ /*
+ * For IOMMU that supports device IOTLB throttling
+ * (DIT), we assign PFSID to the invalidation desc
+ * of a VF such that IOMMU HW can gauge queue depth
+ * at PF level. If DIT is not set, PFSID will be
+ * treated as reserved, which should be set to 0.
+ */
+ if (ecap_dit(iommu->ecap))
+ info->pfsid = pci_dev_id(pci_physfn(pdev));
+ info->ats_qdep = pci_ats_queue_depth(pdev);
}
if (sm_supported(iommu)) {
if (pasid_supported(iommu)) {
--
2.34.1

2023-04-13 04:08:44

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 01/17] dmaengine: idxd: Add enable/disable device IOPF feature

The iommu subsystem requires IOMMU_DEV_FEAT_IOPF must be enabled before
and disabled after IOMMU_DEV_FEAT_SVA, if device's I/O page faults rely
on the IOMMU. Add explicit IOMMU_DEV_FEAT_IOPF enabling/disabling in this
driver.

At present, missing IOPF enabling/disabling doesn't cause any real issue,
because the IOMMU driver places the IOPF enabling/disabling in the path
of SVA feature handling. But this may change.

Reviewed-by: Dave Jiang <[email protected]>
Reviewed-by: Fenghua Yu <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Acked-by: Vinod Koul <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/dma/idxd/init.c | 31 +++++++++++++++++++++++++------
1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 640d3048368e..09ef62aa0635 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -516,6 +516,27 @@ static void idxd_disable_system_pasid(struct idxd_device *idxd)
idxd->sva = NULL;
}

+static int idxd_enable_sva(struct pci_dev *pdev)
+{
+ int ret;
+
+ ret = iommu_dev_enable_feature(&pdev->dev, IOMMU_DEV_FEAT_IOPF);
+ if (ret)
+ return ret;
+
+ ret = iommu_dev_enable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA);
+ if (ret)
+ iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_IOPF);
+
+ return ret;
+}
+
+static void idxd_disable_sva(struct pci_dev *pdev)
+{
+ iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA);
+ iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_IOPF);
+}
+
static int idxd_probe(struct idxd_device *idxd)
{
struct pci_dev *pdev = idxd->pdev;
@@ -530,7 +551,7 @@ static int idxd_probe(struct idxd_device *idxd)
dev_dbg(dev, "IDXD reset complete\n");

if (IS_ENABLED(CONFIG_INTEL_IDXD_SVM) && sva) {
- if (iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA)) {
+ if (idxd_enable_sva(pdev)) {
dev_warn(dev, "Unable to turn on user SVA feature.\n");
} else {
set_bit(IDXD_FLAG_USER_PASID_ENABLED, &idxd->flags);
@@ -578,21 +599,19 @@ static int idxd_probe(struct idxd_device *idxd)
if (device_pasid_enabled(idxd))
idxd_disable_system_pasid(idxd);
if (device_user_pasid_enabled(idxd))
- iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
+ idxd_disable_sva(pdev);
return rc;
}

static void idxd_cleanup(struct idxd_device *idxd)
{
- struct device *dev = &idxd->pdev->dev;
-
perfmon_pmu_remove(idxd);
idxd_cleanup_interrupts(idxd);
idxd_cleanup_internals(idxd);
if (device_pasid_enabled(idxd))
idxd_disable_system_pasid(idxd);
if (device_user_pasid_enabled(idxd))
- iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
+ idxd_disable_sva(idxd->pdev);
}

static int idxd_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
@@ -710,7 +729,7 @@ static void idxd_remove(struct pci_dev *pdev)
pci_free_irq_vectors(pdev);
pci_iounmap(pdev, idxd->reg_base);
if (device_user_pasid_enabled(idxd))
- iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA);
+ idxd_disable_sva(pdev);
pci_disable_device(pdev);
destroy_workqueue(idxd->wq);
perfmon_pmu_remove(idxd);
--
2.34.1

2023-04-13 04:08:45

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 05/17] iommu/vt-d: Move PRI handling to IOPF feature path

PRI is only used for IOPF. With this move, the PCI/PRI feature could be
controlled by the device driver through iommu_dev_enable/disable_feature()
interfaces.

Reviewed-by: Jacob Pan <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/iommu/intel/iommu.c | 53 ++++++++++++++++++++++++++++---------
1 file changed, 41 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 6d77b4072fdd..cd3a3c4b5e64 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1415,11 +1415,6 @@ static void iommu_enable_pci_caps(struct device_domain_info *info)
if (info->pasid_supported && !pci_enable_pasid(pdev, info->pasid_supported & ~1))
info->pasid_enabled = 1;

- if (info->pri_supported &&
- (info->pasid_enabled ? pci_prg_resp_pasid_required(pdev) : 1) &&
- !pci_reset_pri(pdev) && !pci_enable_pri(pdev, PRQ_DEPTH))
- info->pri_enabled = 1;
-
if (info->ats_supported && pci_ats_page_aligned(pdev) &&
!pci_enable_ats(pdev, VTD_PAGE_SHIFT)) {
info->ats_enabled = 1;
@@ -1442,11 +1437,6 @@ static void iommu_disable_pci_caps(struct device_domain_info *info)
domain_update_iotlb(info->domain);
}

- if (info->pri_enabled) {
- pci_disable_pri(pdev);
- info->pri_enabled = 0;
- }
-
if (info->pasid_enabled) {
pci_disable_pasid(pdev);
info->pasid_enabled = 0;
@@ -4667,23 +4657,48 @@ static int intel_iommu_enable_sva(struct device *dev)

static int intel_iommu_enable_iopf(struct device *dev)
{
+ struct pci_dev *pdev = dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu;
int ret;

- if (!info || !info->ats_enabled || !info->pri_enabled)
+ if (!pdev || !info || !info->ats_enabled || !info->pri_supported)
return -ENODEV;
+
+ if (info->pri_enabled)
+ return -EBUSY;
+
iommu = info->iommu;
if (!iommu)
return -EINVAL;

+ /* PASID is required in PRG Response Message. */
+ if (info->pasid_enabled && !pci_prg_resp_pasid_required(pdev))
+ return -EINVAL;
+
+ ret = pci_reset_pri(pdev);
+ if (ret)
+ return ret;
+
ret = iopf_queue_add_device(iommu->iopf_queue, dev);
if (ret)
return ret;

ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);
if (ret)
- iopf_queue_remove_device(iommu->iopf_queue, dev);
+ goto iopf_remove_device;
+
+ ret = pci_enable_pri(pdev, PRQ_DEPTH);
+ if (ret)
+ goto iopf_unregister_handler;
+ info->pri_enabled = 1;
+
+ return 0;
+
+iopf_unregister_handler:
+ iommu_unregister_device_fault_handler(dev);
+iopf_remove_device:
+ iopf_queue_remove_device(iommu->iopf_queue, dev);

return ret;
}
@@ -4694,6 +4709,20 @@ static int intel_iommu_disable_iopf(struct device *dev)
struct intel_iommu *iommu = info->iommu;
int ret;

+ if (!info->pri_enabled)
+ return -EINVAL;
+
+ /*
+ * PCIe spec states that by clearing PRI enable bit, the Page
+ * Request Interface will not issue new page requests, but has
+ * outstanding page requests that have been transmitted or are
+ * queued for transmission. This is supposed to be called after
+ * the device driver has stopped DMA, all PASIDs have been
+ * unbound and the outstanding PRQs have been drained.
+ */
+ pci_disable_pri(to_pci_dev(dev));
+ info->pri_enabled = 0;
+
ret = iommu_unregister_device_fault_handler(dev);
if (ret)
return ret;
--
2.34.1

2023-04-13 04:08:48

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 06/17] iommu/vt-d: Remove unnecessary checks in iopf disabling path

iommu_unregister_device_fault_handler() and iopf_queue_remove_device()
are called after device has stopped issuing new page falut requests and
all outstanding page requests have been drained. They should never fail.
Trigger a warning if it happens unfortunately.

Reviewed-by: Kevin Tian <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/iommu/intel/iommu.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index cd3a3c4b5e64..c771233d6f2a 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4707,7 +4707,6 @@ static int intel_iommu_disable_iopf(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
- int ret;

if (!info->pri_enabled)
return -EINVAL;
@@ -4723,15 +4722,15 @@ static int intel_iommu_disable_iopf(struct device *dev)
pci_disable_pri(to_pci_dev(dev));
info->pri_enabled = 0;

- ret = iommu_unregister_device_fault_handler(dev);
- if (ret)
- return ret;
-
- ret = iopf_queue_remove_device(iommu->iopf_queue, dev);
- if (ret)
- iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);
+ /*
+ * With PRI disabled and outstanding PRQs drained, unregistering
+ * fault handler and removing device from iopf queue should never
+ * fail.
+ */
+ WARN_ON(iommu_unregister_device_fault_handler(dev));
+ WARN_ON(iopf_queue_remove_device(iommu->iopf_queue, dev));

- return ret;
+ return 0;
}

static int
--
2.34.1

2023-04-13 04:08:56

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 08/17] iommu/vt-d: Remove extern from function prototypes

The kernel coding style does not require 'extern' in function prototypes
in .h files, so remove them from drivers/iommu/intel/iommu.h as they are
not needed.

Signed-off-by: Lu Baolu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
---
drivers/iommu/intel/iommu.h | 36 ++++++++++++++++++------------------
1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 694ab9b7d3e9..19494713d6b3 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -798,18 +798,18 @@ static inline bool context_present(struct context_entry *context)
return (context->lo & 1);
}

-extern struct dmar_drhd_unit * dmar_find_matched_drhd_unit(struct pci_dev *dev);
-
-extern int dmar_enable_qi(struct intel_iommu *iommu);
-extern void dmar_disable_qi(struct intel_iommu *iommu);
-extern int dmar_reenable_qi(struct intel_iommu *iommu);
-extern void qi_global_iec(struct intel_iommu *iommu);
-
-extern void qi_flush_context(struct intel_iommu *iommu, u16 did, u16 sid,
- u8 fm, u64 type);
-extern void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
- unsigned int size_order, u64 type);
-extern void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
+struct dmar_drhd_unit *dmar_find_matched_drhd_unit(struct pci_dev *dev);
+
+int dmar_enable_qi(struct intel_iommu *iommu);
+void dmar_disable_qi(struct intel_iommu *iommu);
+int dmar_reenable_qi(struct intel_iommu *iommu);
+void qi_global_iec(struct intel_iommu *iommu);
+
+void qi_flush_context(struct intel_iommu *iommu, u16 did,
+ u16 sid, u8 fm, u64 type);
+void qi_flush_iotlb(struct intel_iommu *iommu, u16 did, u64 addr,
+ unsigned int size_order, u64 type);
+void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
u16 qdep, u64 addr, unsigned mask);

void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
@@ -832,7 +832,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
*/
#define QI_OPT_WAIT_DRAIN BIT(0)

-extern int dmar_ir_support(void);
+int dmar_ir_support(void);

void *alloc_pgtable_page(int node, gfp_t gfp);
void free_pgtable_page(void *vaddr);
@@ -840,9 +840,9 @@ void iommu_flush_write_buffer(struct intel_iommu *iommu);
struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn);

#ifdef CONFIG_INTEL_IOMMU_SVM
-extern void intel_svm_check(struct intel_iommu *iommu);
-extern int intel_svm_enable_prq(struct intel_iommu *iommu);
-extern int intel_svm_finish_prq(struct intel_iommu *iommu);
+void intel_svm_check(struct intel_iommu *iommu);
+int intel_svm_enable_prq(struct intel_iommu *iommu);
+int intel_svm_finish_prq(struct intel_iommu *iommu);
int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
struct iommu_page_response *msg);
struct iommu_domain *intel_svm_domain_alloc(void);
@@ -889,8 +889,8 @@ extern const struct iommu_ops intel_iommu_ops;

#ifdef CONFIG_INTEL_IOMMU
extern int intel_iommu_sm;
-extern int iommu_calculate_agaw(struct intel_iommu *iommu);
-extern int iommu_calculate_max_sagaw(struct intel_iommu *iommu);
+int iommu_calculate_agaw(struct intel_iommu *iommu);
+int iommu_calculate_max_sagaw(struct intel_iommu *iommu);
int ecmd_submit_sync(struct intel_iommu *iommu, u8 ecmd, u64 oa, u64 ob);

static inline bool ecmd_has_pmu_essential(struct intel_iommu *iommu)
--
2.34.1

2023-04-13 04:08:56

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 07/17] iommu/vt-d: Do not use GFP_ATOMIC when not needed

From: Christophe JAILLET <[email protected]>

There is no need to use GFP_ATOMIC here. GFP_KERNEL is already used for
some other memory allocations just a few lines above.

Commit e3a981d61d15 ("iommu/vt-d: Convert allocations to GFP_KERNEL") has
changed the other memory allocation flags.

Signed-off-by: Christophe JAILLET <[email protected]>
Link: https://lore.kernel.org/r/e2a8a1019ffc8a86b4b4ed93def3623f60581274.1675542576.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/irq_remapping.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index df9e261af0b5..a1b987335b31 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -548,7 +548,7 @@ static int intel_setup_irq_remapping(struct intel_iommu *iommu)
goto out_free_table;
}

- bitmap = bitmap_zalloc(INTR_REMAP_TABLE_ENTRIES, GFP_ATOMIC);
+ bitmap = bitmap_zalloc(INTR_REMAP_TABLE_ENTRIES, GFP_KERNEL);
if (bitmap == NULL) {
pr_err("IR%d: failed to allocate bitmap\n", iommu->seq_id);
goto out_free_pages;
--
2.34.1

2023-04-13 04:09:00

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 09/17] iommu/vt-d: Use non-privileged mode for all PASIDs

From: Jacob Pan <[email protected]>

Supervisor Request Enable (SRE) bit in a PASID entry is for permission
checking on DMA requests. When SRE = 0, DMA with supervisor privilege
will be blocked. However, for in-kernel DMA this is not necessary in that
we are targeting kernel memory anyway. There's no need to differentiate
user and kernel for in-kernel DMA.

Let's use non-privileged (user) permission for all PASIDs used in kernel,
it will be consistent with DMA without PASID (RID_PASID) as well.

Signed-off-by: Jacob Pan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index c771233d6f2a..f4e536fd5a28 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2316,8 +2316,6 @@ static int domain_setup_first_level(struct intel_iommu *iommu,
if (level != 4 && level != 5)
return -EINVAL;

- if (pasid != PASID_RID2PASID)
- flags |= PASID_FLAG_SUPERVISOR_MODE;
if (level == 5)
flags |= PASID_FLAG_FL5LP;

--
2.34.1

2023-04-13 04:09:08

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 10/17] iommu/vt-d: Remove PASID supervisor request support

From: Jacob Pan <[email protected]>

There's no more usage, remove PASID supervisor support.

Suggested-by: Lu Baolu <[email protected]>
Reviewed-by: Lu Baolu <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/pasid.h | 7 ------
drivers/iommu/intel/pasid.c | 43 -------------------------------------
2 files changed, 50 deletions(-)

diff --git a/drivers/iommu/intel/pasid.h b/drivers/iommu/intel/pasid.h
index 20c54e50f533..d6b7d21244b1 100644
--- a/drivers/iommu/intel/pasid.h
+++ b/drivers/iommu/intel/pasid.h
@@ -41,13 +41,6 @@
#define FLPT_DEFAULT_DID 1
#define NUM_RESERVED_DID 2

-/*
- * The SUPERVISOR_MODE flag indicates a first level translation which
- * can be used for access to kernel addresses. It is valid only for
- * access to the kernel's static 1:1 mapping of physical memory — not
- * to vmalloc or even module mappings.
- */
-#define PASID_FLAG_SUPERVISOR_MODE BIT(0)
#define PASID_FLAG_NESTED BIT(1)
#define PASID_FLAG_PAGE_SNOOP BIT(2)

diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c
index 633e0a4a01e7..c5d479770e12 100644
--- a/drivers/iommu/intel/pasid.c
+++ b/drivers/iommu/intel/pasid.c
@@ -335,15 +335,6 @@ static inline void pasid_set_fault_enable(struct pasid_entry *pe)
pasid_set_bits(&pe->val[0], 1 << 1, 0);
}

-/*
- * Setup the SRE(Supervisor Request Enable) field (Bit 128) of a
- * scalable mode PASID entry.
- */
-static inline void pasid_set_sre(struct pasid_entry *pe)
-{
- pasid_set_bits(&pe->val[2], 1 << 0, 1);
-}
-
/*
* Setup the WPE(Write Protect Enable) field (Bit 132) of a
* scalable mode PASID entry.
@@ -521,23 +512,6 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,
return -EINVAL;
}

- if (flags & PASID_FLAG_SUPERVISOR_MODE) {
-#ifdef CONFIG_X86
- unsigned long cr0 = read_cr0();
-
- /* CR0.WP is normally set but just to be sure */
- if (unlikely(!(cr0 & X86_CR0_WP))) {
- pr_err("No CPU write protect!\n");
- return -EINVAL;
- }
-#endif
- if (!ecap_srs(iommu->ecap)) {
- pr_err("No supervisor request support on %s\n",
- iommu->name);
- return -EINVAL;
- }
- }
-
if ((flags & PASID_FLAG_FL5LP) && !cap_fl5lp_support(iommu->cap)) {
pr_err("No 5-level paging support for first-level on %s\n",
iommu->name);
@@ -560,10 +534,6 @@ int intel_pasid_setup_first_level(struct intel_iommu *iommu,

/* Setup the first level page table pointer: */
pasid_set_flptr(pte, (u64)__pa(pgd));
- if (flags & PASID_FLAG_SUPERVISOR_MODE) {
- pasid_set_sre(pte);
- pasid_set_wpe(pte);
- }

if (flags & PASID_FLAG_FL5LP)
pasid_set_flpm(pte, 1);
@@ -658,12 +628,6 @@ int intel_pasid_setup_second_level(struct intel_iommu *iommu,
pasid_set_fault_enable(pte);
pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));

- /*
- * Since it is a second level only translation setup, we should
- * set SRE bit as well (addresses are expected to be GPAs).
- */
- if (pasid != PASID_RID2PASID && ecap_srs(iommu->ecap))
- pasid_set_sre(pte);
pasid_set_present(pte);
spin_unlock(&iommu->lock);

@@ -700,13 +664,6 @@ int intel_pasid_setup_pass_through(struct intel_iommu *iommu,
pasid_set_translation_type(pte, PASID_ENTRY_PGTT_PT);
pasid_set_fault_enable(pte);
pasid_set_page_snoop(pte, !!ecap_smpwc(iommu->ecap));
-
- /*
- * We should set SRE bit as well since the addresses are expected
- * to be GPAs.
- */
- if (ecap_srs(iommu->ecap))
- pasid_set_sre(pte);
pasid_set_present(pte);
spin_unlock(&iommu->lock);

--
2.34.1

2023-04-13 04:09:22

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 11/17] iommu/vt-d: Make size of operands same in bitwise operations

From: Tina Zhang <[email protected]>

This addresses the following issue reported by klocwork tool:

- operands of different size in bitwise operations

Suggested-by: Yongwei Ma <[email protected]>
Signed-off-by: Tina Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/dmar.c | 2 +-
drivers/iommu/intel/iommu.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 23828d189c2a..f0f51c957ccb 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1690,7 +1690,7 @@ static void __dmar_enable_qi(struct intel_iommu *iommu)
* is present.
*/
if (ecap_smts(iommu->ecap))
- val |= (1 << 11) | 1;
+ val |= BIT_ULL(11) | BIT_ULL(0);

raw_spin_lock_irqsave(&iommu->register_lock, flags);

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f4e536fd5a28..acbf82fa90e7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1870,7 +1870,7 @@ context_set_sm_rid2pasid(struct context_entry *context, unsigned long pasid)
*/
static inline void context_set_sm_dte(struct context_entry *context)
{
- context->lo |= (1 << 2);
+ context->lo |= BIT_ULL(2);
}

/*
@@ -1879,7 +1879,7 @@ static inline void context_set_sm_dte(struct context_entry *context)
*/
static inline void context_set_sm_pre(struct context_entry *context)
{
- context->lo |= (1 << 4);
+ context->lo |= BIT_ULL(4);
}

/* Convert value to context PASID directory size field coding. */
--
2.34.1

2023-04-13 04:09:40

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 12/17] iommu/vt-d: Remove BUG_ON on checking valid pfn range

From: Tina Zhang <[email protected]>

When encountering an unexpected invalid pfn range, the kernel should
attempt recovery and proceed with execution. Therefore, using WARN_ON to
replace BUG_ON to avoid halting the machine.

Besides, one redundant checking is reduced.

Signed-off-by: Tina Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index acbf82fa90e7..c4847a5aaf52 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1005,9 +1005,9 @@ static void dma_pte_clear_range(struct dmar_domain *domain,
unsigned int large_page;
struct dma_pte *first_pte, *pte;

- BUG_ON(!domain_pfn_supported(domain, start_pfn));
- BUG_ON(!domain_pfn_supported(domain, last_pfn));
- BUG_ON(start_pfn > last_pfn);
+ if (WARN_ON(!domain_pfn_supported(domain, last_pfn)) ||
+ WARN_ON(start_pfn > last_pfn))
+ return;

/* we don't need lock here; nobody else touches the iova range */
do {
@@ -1166,9 +1166,9 @@ static void dma_pte_clear_level(struct dmar_domain *domain, int level,
static void domain_unmap(struct dmar_domain *domain, unsigned long start_pfn,
unsigned long last_pfn, struct list_head *freelist)
{
- BUG_ON(!domain_pfn_supported(domain, start_pfn));
- BUG_ON(!domain_pfn_supported(domain, last_pfn));
- BUG_ON(start_pfn > last_pfn);
+ if (WARN_ON(!domain_pfn_supported(domain, last_pfn)) ||
+ WARN_ON(start_pfn > last_pfn))
+ return;

/* we don't need lock here; nobody else touches the iova range */
dma_pte_clear_level(domain, agaw_to_level(domain->agaw),
--
2.34.1

2023-04-13 04:10:05

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 13/17] iommu/vt-d: Remove BUG_ON in handling iotlb cache invalidation

From: Tina Zhang <[email protected]>

VT-d iotlb cache invalidation request with unexpected type is considered
as a bug to developers, which can be fixed. So, when such kind of issue
comes out, it needs to be reported through the kernel log, instead of
halting the system. Replacing BUG_ON with warning reporting.

Signed-off-by: Tina Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index c4847a5aaf52..dd61bb554aa7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1272,7 +1272,9 @@ static void __iommu_flush_context(struct intel_iommu *iommu,
| DMA_CCMD_SID(source_id) | DMA_CCMD_FM(function_mask);
break;
default:
- BUG();
+ pr_warn("%s: Unexpected context-cache invalidation type 0x%llx\n",
+ iommu->name, type);
+ return;
}
val |= DMA_CCMD_ICC;

@@ -1308,7 +1310,9 @@ static void __iommu_flush_iotlb(struct intel_iommu *iommu, u16 did,
val_iva = size_order | addr;
break;
default:
- BUG();
+ pr_warn("%s: Unexpected iotlb invalidation type 0x%llx\n",
+ iommu->name, type);
+ return;
}
/* Note: set drain read/write */
#if 0
@@ -1483,7 +1487,8 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
uint64_t addr = (uint64_t)pfn << VTD_PAGE_SHIFT;
u16 did = domain_id_iommu(domain, iommu);

- BUG_ON(pages == 0);
+ if (WARN_ON(!pages))
+ return;

if (ih)
ih = 1 << 6;
--
2.34.1

2023-04-13 04:10:16

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 15/17] iommu/vt-d: Remove BUG_ON in map/unmap()

From: Tina Zhang <[email protected]>

Domain map/unmap with invalid parameters shouldn't crash the kernel.
Therefore, using if() replaces the BUG_ON.

Signed-off-by: Tina Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f11347a590d7..ab21ef1ddb3c 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2159,7 +2159,8 @@ __domain_mapping(struct dmar_domain *domain, unsigned long iov_pfn,
phys_addr_t pteval;
u64 attr;

- BUG_ON(!domain_pfn_supported(domain, iov_pfn + nr_pages - 1));
+ if (unlikely(!domain_pfn_supported(domain, iov_pfn + nr_pages - 1)))
+ return -EINVAL;

if ((prot & (DMA_PTE_READ|DMA_PTE_WRITE)) == 0)
return -EINVAL;
@@ -4314,8 +4315,9 @@ static size_t intel_iommu_unmap(struct iommu_domain *domain,

/* Cope with horrid API which requires us to unmap more than the
size argument if it happens to be a large-page mapping. */
- BUG_ON(!pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT, &level,
- GFP_ATOMIC));
+ if (unlikely(!pfn_to_dma_pte(dmar_domain, iova >> VTD_PAGE_SHIFT,
+ &level, GFP_ATOMIC)))
+ return 0;

if (size < VTD_PAGE_SIZE << level_to_offset_bits(level))
size = VTD_PAGE_SIZE << level_to_offset_bits(level);
--
2.34.1

2023-04-13 04:10:19

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 14/17] iommu/vt-d: Remove BUG_ON when domain->pgd is NULL

From: Tina Zhang <[email protected]>

When performing domain_context_mapping or getting dma_pte of a pfn, the
availability of the domain page table directory is ensured. Therefore,
the domain->pgd checkings are unnecessary.

Signed-off-by: Tina Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 4 ----
1 file changed, 4 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index dd61bb554aa7..f11347a590d7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -915,8 +915,6 @@ static struct dma_pte *pfn_to_dma_pte(struct dmar_domain *domain,
int level = agaw_to_level(domain->agaw);
int offset;

- BUG_ON(!domain->pgd);
-
if (!domain_pfn_supported(domain, pfn))
/* Address beyond IOMMU's addressing capabilities. */
return NULL;
@@ -1910,8 +1908,6 @@ static int domain_context_mapping_one(struct dmar_domain *domain,
pr_debug("Set context mapping for %02x:%02x.%d\n",
bus, PCI_SLOT(devfn), PCI_FUNC(devfn));

- BUG_ON(!domain->pgd);
-
spin_lock(&iommu->lock);
ret = -ENOMEM;
context = iommu_context_addr(iommu, bus, devfn, 1);
--
2.34.1

2023-04-13 04:10:20

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 17/17] iommu/vt-d: Remove BUG_ON in dmar_insert_dev_scope()

From: Tina Zhang <[email protected]>

The dmar_insert_dev_scope() could fail if any unexpected condition is
encountered. However, in this situation, the kernel should attempt
recovery and proceed with execution. Remove BUG_ON with WARN_ON, so that
kernel can avoid being crashed when an unexpected condition occurs.

Signed-off-by: Tina Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/dmar.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 9346c6e7ebae..e35be3786bde 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -262,7 +262,8 @@ int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
get_device(dev));
return 1;
}
- BUG_ON(i >= devices_cnt);
+ if (WARN_ON(i >= devices_cnt))
+ return -EINVAL;
}

return 0;
--
2.34.1

2023-04-13 04:10:32

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 16/17] iommu/vt-d: Remove a useless BUG_ON(dev->is_virtfn)

From: Tina Zhang <[email protected]>

When dmar_alloc_pci_notify_info() is being invoked, the invoker has
ensured the dev->is_virtfn is false. So, remove the useless BUG_ON in
dmar_alloc_pci_notify_info().

Signed-off-by: Tina Zhang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/dmar.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index f0f51c957ccb..9346c6e7ebae 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -127,8 +127,6 @@ dmar_alloc_pci_notify_info(struct pci_dev *dev, unsigned long event)
struct pci_dev *tmp;
struct dmar_pci_notify_info *info;

- BUG_ON(dev->is_virtfn);
-
/*
* Ignore devices that have a domain number higher than what can
* be looked up in DMAR, e.g. VMD subdevices with domain 0x10000
--
2.34.1

2023-04-13 10:10:47

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH v2 00/17] [PULL REQUEST] Intel IOMMU updates for Linux v6.4

On Thu, Apr 13, 2023 at 12:06:28PM +0800, Lu Baolu wrote:
> Lu Baolu (7):
> dmaengine: idxd: Add enable/disable device IOPF feature
> iommu/vt-d: Allow SVA with device-specific IOPF
> iommu/vt-d: Move iopf code from SVA to IOPF enabling path
> iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path
> iommu/vt-d: Move PRI handling to IOPF feature path
> iommu/vt-d: Remove unnecessary checks in iopf disabling path
> iommu/vt-d: Remove extern from function prototypes
>
> Tina Zhang (7):
> iommu/vt-d: Make size of operands same in bitwise operations
> iommu/vt-d: Remove BUG_ON on checking valid pfn range
> iommu/vt-d: Remove BUG_ON in handling iotlb cache invalidation
> iommu/vt-d: Remove BUG_ON when domain->pgd is NULL
> iommu/vt-d: Remove BUG_ON in map/unmap()
> iommu/vt-d: Remove a useless BUG_ON(dev->is_virtfn)
> iommu/vt-d: Remove BUG_ON in dmar_insert_dev_scope()

Applied, thanks.