2023-03-09 02:57:50

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 0/5] Refactor code for non-PRI IOPF

The existing SVA and IOPF implementation assumes that devices report I/O
page faults via PCI PRI. This is not always true as some emerging
devices are designed to handle the I/O page faults by themselves without
ever sending PCI page requests nor advertising PRI capability.

Refactor the SVA and IOPF code to allow SVA support with IOPF handled
either by IOMMU (PCI PRI) or device driver (device-specific IOPF).

This series is based on v6.3-rc1 and also available at github:
https://github.com/LuBaolu/intel-iommu/commits/intel-iommu-refactor-iopf-v2

Change log:
v2:
- Separate a fix patch and merge it into v6. 3-rc1 [commit 60b1daa3b168
("iommu/vt-d: Fix error handling in sva enable/disable paths")]
- Disallow device-specific IOPF is device advertising PRI capability;
- Add some extra patches to move all IOPF related code to the IOPF
enabling/disabling paths.

v1: Initial post
- https://lore.kernel.org/linux-iommu/[email protected]/

Lu Baolu (5):
dmaengine: idxd: Add enable/disable device IOPF feature
iommu/vt-d: Allow SVA with device-specific IOPF
iommu/vt-d: Move iopf code from SVA to IOPF enabling path
iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path
iommu/vt-d: Move PRI handling to IOPF feature path

drivers/dma/idxd/init.c | 31 ++++++++--
drivers/iommu/intel/iommu.c | 120 ++++++++++++++++++++++--------------
2 files changed, 100 insertions(+), 51 deletions(-)

--
2.34.1



2023-03-09 02:57:55

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 1/5] dmaengine: idxd: Add enable/disable device IOPF feature

The iommu subsystem requires IOMMU_DEV_FEAT_IOPF must be enabled before
and disabled after IOMMU_DEV_FEAT_SVA, if device's I/O page faults rely
on the IOMMU. Add explicit IOMMU_DEV_FEAT_IOPF enabling/disabling in this
driver.

At present, missing IOPF enabling/disabling doesn't cause any real issue,
because the IOMMU driver places the IOPF enabling/disabling in the path
of SVA feature handling. But this may change.

Reviewed-by: Dave Jiang <[email protected]>
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/dma/idxd/init.c | 31 +++++++++++++++++++++++++------
1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index 640d3048368e..09ef62aa0635 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -516,6 +516,27 @@ static void idxd_disable_system_pasid(struct idxd_device *idxd)
idxd->sva = NULL;
}

+static int idxd_enable_sva(struct pci_dev *pdev)
+{
+ int ret;
+
+ ret = iommu_dev_enable_feature(&pdev->dev, IOMMU_DEV_FEAT_IOPF);
+ if (ret)
+ return ret;
+
+ ret = iommu_dev_enable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA);
+ if (ret)
+ iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_IOPF);
+
+ return ret;
+}
+
+static void idxd_disable_sva(struct pci_dev *pdev)
+{
+ iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA);
+ iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_IOPF);
+}
+
static int idxd_probe(struct idxd_device *idxd)
{
struct pci_dev *pdev = idxd->pdev;
@@ -530,7 +551,7 @@ static int idxd_probe(struct idxd_device *idxd)
dev_dbg(dev, "IDXD reset complete\n");

if (IS_ENABLED(CONFIG_INTEL_IDXD_SVM) && sva) {
- if (iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA)) {
+ if (idxd_enable_sva(pdev)) {
dev_warn(dev, "Unable to turn on user SVA feature.\n");
} else {
set_bit(IDXD_FLAG_USER_PASID_ENABLED, &idxd->flags);
@@ -578,21 +599,19 @@ static int idxd_probe(struct idxd_device *idxd)
if (device_pasid_enabled(idxd))
idxd_disable_system_pasid(idxd);
if (device_user_pasid_enabled(idxd))
- iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
+ idxd_disable_sva(pdev);
return rc;
}

static void idxd_cleanup(struct idxd_device *idxd)
{
- struct device *dev = &idxd->pdev->dev;
-
perfmon_pmu_remove(idxd);
idxd_cleanup_interrupts(idxd);
idxd_cleanup_internals(idxd);
if (device_pasid_enabled(idxd))
idxd_disable_system_pasid(idxd);
if (device_user_pasid_enabled(idxd))
- iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA);
+ idxd_disable_sva(idxd->pdev);
}

static int idxd_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
@@ -710,7 +729,7 @@ static void idxd_remove(struct pci_dev *pdev)
pci_free_irq_vectors(pdev);
pci_iounmap(pdev, idxd->reg_base);
if (device_user_pasid_enabled(idxd))
- iommu_dev_disable_feature(&pdev->dev, IOMMU_DEV_FEAT_SVA);
+ idxd_disable_sva(pdev);
pci_disable_device(pdev);
destroy_workqueue(idxd->wq);
perfmon_pmu_remove(idxd);
--
2.34.1


2023-03-09 02:57:59

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 2/5] iommu/vt-d: Allow SVA with device-specific IOPF

Currently enabling SVA requires IOPF support from the IOMMU and device
PCI PRI. However, some devices can handle IOPF by itself without ever
sending PCI page requests nor advertising PRI capability.

Allow SVA support with IOPF handled either by IOMMU (PCI PRI) or device
driver (device-specific IOPF). As long as IOPF could be handled, SVA
should continue to work.

Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 7c2f4bd33582..d2fcab9d8f61 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4650,7 +4650,18 @@ static int intel_iommu_enable_sva(struct device *dev)
if (!(iommu->flags & VTD_FLAG_SVM_CAPABLE))
return -ENODEV;

- if (!info->pasid_enabled || !info->pri_enabled || !info->ats_enabled)
+ if (!info->pasid_enabled)
+ return -EINVAL;
+
+ /*
+ * Devices having device-specific I/O fault handling should not
+ * support PCI/PRI.
+ */
+ if (!info->pri_supported)
+ return 0;
+
+ /* Devices supporting ATS/PRI should have it enabled. */
+ if (!info->pri_enabled || !info->ats_enabled)
return -EINVAL;

ret = iopf_queue_add_device(iommu->iopf_queue, dev);
--
2.34.1


2023-03-09 02:58:10

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 3/5] iommu/vt-d: Move iopf code from SVA to IOPF enabling path

Generally enabling IOMMU_DEV_FEAT_SVA requires IOMMU_DEV_FEAT_IOPF, but
some devices manage I/O Page Faults themselves instead of relying on the
IOMMU. Move IOPF related code from SVA to IOPF enabling path.

For the device drivers that relies on the IOMMU for IOPF through PCI/PRI,
IOMMU_DEV_FEAT_IOPF must be enabled before and disabled after
IOMMU_DEV_FEAT_SVA.

Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 32 ++++++++++++++++++--------------
1 file changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d2fcab9d8f61..9ada12bf38dd 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4638,7 +4638,6 @@ static int intel_iommu_enable_sva(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu;
- int ret;

if (!info || dmar_disabled)
return -EINVAL;
@@ -4664,6 +4663,21 @@ static int intel_iommu_enable_sva(struct device *dev)
if (!info->pri_enabled || !info->ats_enabled)
return -EINVAL;

+ return 0;
+}
+
+static int intel_iommu_enable_iopf(struct device *dev)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct intel_iommu *iommu;
+ int ret;
+
+ if (!info || !info->ats_enabled || !info->pri_enabled)
+ return -ENODEV;
+ iommu = info->iommu;
+ if (!iommu)
+ return -EINVAL;
+
ret = iopf_queue_add_device(iommu->iopf_queue, dev);
if (ret)
return ret;
@@ -4675,7 +4689,7 @@ static int intel_iommu_enable_sva(struct device *dev)
return ret;
}

-static int intel_iommu_disable_sva(struct device *dev)
+static int intel_iommu_disable_iopf(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
@@ -4692,16 +4706,6 @@ static int intel_iommu_disable_sva(struct device *dev)
return ret;
}

-static int intel_iommu_enable_iopf(struct device *dev)
-{
- struct device_domain_info *info = dev_iommu_priv_get(dev);
-
- if (info && info->pri_supported)
- return 0;
-
- return -ENODEV;
-}
-
static int
intel_iommu_dev_enable_feat(struct device *dev, enum iommu_dev_features feat)
{
@@ -4722,10 +4726,10 @@ intel_iommu_dev_disable_feat(struct device *dev, enum iommu_dev_features feat)
{
switch (feat) {
case IOMMU_DEV_FEAT_IOPF:
- return 0;
+ return intel_iommu_disable_iopf(dev);

case IOMMU_DEV_FEAT_SVA:
- return intel_iommu_disable_sva(dev);
+ return 0;

default:
return -ENODEV;
--
2.34.1


2023-03-09 02:58:13

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 4/5] iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path

They should be part of the per-device iommu private data initialization.

Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 26 +++++++++++---------------
1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 9ada12bf38dd..fb64ab8358a9 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1406,20 +1406,6 @@ static void iommu_enable_pci_caps(struct device_domain_info *info)
return;

pdev = to_pci_dev(info->dev);
- /* For IOMMU that supports device IOTLB throttling (DIT), we assign
- * PFSID to the invalidation desc of a VF such that IOMMU HW can gauge
- * queue depth at PF level. If DIT is not set, PFSID will be treated as
- * reserved, which should be set to 0.
- */
- if (!ecap_dit(info->iommu->ecap))
- info->pfsid = 0;
- else {
- struct pci_dev *pf_pdev;
-
- /* pdev will be returned if device is not a vf */
- pf_pdev = pci_physfn(pdev);
- info->pfsid = pci_dev_id(pf_pdev);
- }

/* The PCIe spec, in its wisdom, declares that the behaviour of
the device if you enable PASID support after ATS support is
@@ -1438,7 +1424,6 @@ static void iommu_enable_pci_caps(struct device_domain_info *info)
!pci_enable_ats(pdev, VTD_PAGE_SHIFT)) {
info->ats_enabled = 1;
domain_update_iotlb(info->domain);
- info->ats_qdep = pci_ats_queue_depth(pdev);
}
}

@@ -4521,6 +4506,17 @@ static struct iommu_device *intel_iommu_probe_device(struct device *dev)
dmar_ats_supported(pdev, iommu)) {
info->ats_supported = 1;
info->dtlb_extra_inval = dev_needs_extra_dtlb_flush(pdev);
+
+ /*
+ * For IOMMU that supports device IOTLB throttling
+ * (DIT), we assign PFSID to the invalidation desc
+ * of a VF such that IOMMU HW can gauge queue depth
+ * at PF level. If DIT is not set, PFSID will be
+ * treated as reserved, which should be set to 0.
+ */
+ if (ecap_dit(iommu->ecap))
+ info->pfsid = pci_dev_id(pci_physfn(pdev));
+ info->ats_qdep = pci_ats_queue_depth(pdev);
}
if (sm_supported(iommu)) {
if (pasid_supported(iommu)) {
--
2.34.1


2023-03-09 02:58:18

by Lu Baolu

[permalink] [raw]
Subject: [PATCH v2 5/5] iommu/vt-d: Move PRI handling to IOPF feature path

PRI is only used for IOPF. With this move, the PCI/PRI feature could be
controlled by the device driver through iommu_dev_enable/disable_feature()
interfaces.

Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/iommu.c | 59 ++++++++++++++++++++++++-------------
1 file changed, 39 insertions(+), 20 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index fb64ab8358a9..4ed32bde4287 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -1415,11 +1415,6 @@ static void iommu_enable_pci_caps(struct device_domain_info *info)
if (info->pasid_supported && !pci_enable_pasid(pdev, info->pasid_supported & ~1))
info->pasid_enabled = 1;

- if (info->pri_supported &&
- (info->pasid_enabled ? pci_prg_resp_pasid_required(pdev) : 1) &&
- !pci_reset_pri(pdev) && !pci_enable_pri(pdev, PRQ_DEPTH))
- info->pri_enabled = 1;
-
if (info->ats_supported && pci_ats_page_aligned(pdev) &&
!pci_enable_ats(pdev, VTD_PAGE_SHIFT)) {
info->ats_enabled = 1;
@@ -1442,11 +1437,6 @@ static void iommu_disable_pci_caps(struct device_domain_info *info)
domain_update_iotlb(info->domain);
}

- if (info->pri_enabled) {
- pci_disable_pri(pdev);
- info->pri_enabled = 0;
- }
-
if (info->pasid_enabled) {
pci_disable_pasid(pdev);
info->pasid_enabled = 0;
@@ -4664,23 +4654,48 @@ static int intel_iommu_enable_sva(struct device *dev)

static int intel_iommu_enable_iopf(struct device *dev)
{
+ struct pci_dev *pdev = dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu;
int ret;

- if (!info || !info->ats_enabled || !info->pri_enabled)
+ if (!pdev || !info || !info->ats_enabled || !info->pri_supported)
return -ENODEV;
+
+ if (info->pri_enabled)
+ return -EBUSY;
+
iommu = info->iommu;
if (!iommu)
return -EINVAL;

+ /* PASID is required in PRG Response Message. */
+ if (info->pasid_enabled && !pci_prg_resp_pasid_required(pdev))
+ return -EINVAL;
+
+ ret = pci_reset_pri(pdev);
+ if (ret)
+ return ret;
+
ret = iopf_queue_add_device(iommu->iopf_queue, dev);
if (ret)
return ret;

ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);
if (ret)
- iopf_queue_remove_device(iommu->iopf_queue, dev);
+ goto iopf_remove_device;
+
+ ret = pci_enable_pri(pdev, PRQ_DEPTH);
+ if (ret)
+ goto iopf_unregister_handler;
+ info->pri_enabled = 1;
+
+ return 0;
+
+iopf_unregister_handler:
+ iommu_unregister_device_fault_handler(dev);
+iopf_remove_device:
+ iopf_queue_remove_device(iommu->iopf_queue, dev);

return ret;
}
@@ -4689,17 +4704,21 @@ static int intel_iommu_disable_iopf(struct device *dev)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
- int ret;

- ret = iommu_unregister_device_fault_handler(dev);
- if (ret)
- return ret;
+ if (!info->pri_enabled)
+ return -EINVAL;

- ret = iopf_queue_remove_device(iommu->iopf_queue, dev);
- if (ret)
- iommu_register_device_fault_handler(dev, iommu_queue_iopf, dev);
+ pci_disable_pri(to_pci_dev(dev));
+ info->pri_enabled = 0;

- return ret;
+ /*
+ * With pri_enabled checked, unregistering fault handler and
+ * removing device from iopf queue should never fail.
+ */
+ iommu_unregister_device_fault_handler(dev);
+ iopf_queue_remove_device(iommu->iopf_queue, dev);
+
+ return 0;
}

static int
--
2.34.1


2023-03-09 03:03:05

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 0/5] Refactor code for non-PRI IOPF

On 3/9/23 10:56 AM, Lu Baolu wrote:
> The existing SVA and IOPF implementation assumes that devices report I/O
> page faults via PCI PRI. This is not always true as some emerging
> devices are designed to handle the I/O page faults by themselves without
> ever sending PCI page requests nor advertising PRI capability.
>
> Refactor the SVA and IOPF code to allow SVA support with IOPF handled
> either by IOMMU (PCI PRI) or device driver (device-specific IOPF).
>
> This series is based on v6.3-rc1 and also available at github:
> https://github.com/LuBaolu/intel-iommu/commits/intel-iommu-refactor-iopf-v2

This is iommu/vt-d specific. I should make it explicit in the
cover-letter title.

iommu/vt-d: Refactor code for non-PRI IOPF

Sorry about it.

Best regards,
baolu

2023-03-09 03:51:55

by Fenghua Yu

[permalink] [raw]
Subject: Re: [PATCH v2 1/5] dmaengine: idxd: Add enable/disable device IOPF feature



On 3/8/23 18:56, Lu Baolu wrote:
> The iommu subsystem requires IOMMU_DEV_FEAT_IOPF must be enabled before
> and disabled after IOMMU_DEV_FEAT_SVA, if device's I/O page faults rely
> on the IOMMU. Add explicit IOMMU_DEV_FEAT_IOPF enabling/disabling in this
> driver.
>
> At present, missing IOPF enabling/disabling doesn't cause any real issue,
> because the IOMMU driver places the IOPF enabling/disabling in the path
> of SVA feature handling. But this may change.
>
> Reviewed-by: Dave Jiang <[email protected]>
> Signed-off-by: Lu Baolu <[email protected]>

Reviewed-by: Fenghua Yu <[email protected]>

Thanks.

-Fenghua

2023-03-16 07:08:21

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 1/5] dmaengine: idxd: Add enable/disable device IOPF feature

> From: Lu Baolu <[email protected]>
> Sent: Thursday, March 9, 2023 10:57 AM
>
> The iommu subsystem requires IOMMU_DEV_FEAT_IOPF must be enabled
> before
> and disabled after IOMMU_DEV_FEAT_SVA, if device's I/O page faults rely
> on the IOMMU. Add explicit IOMMU_DEV_FEAT_IOPF enabling/disabling in
> this
> driver.
>
> At present, missing IOPF enabling/disabling doesn't cause any real issue,
> because the IOMMU driver places the IOPF enabling/disabling in the path
> of SVA feature handling. But this may change.
>
> Reviewed-by: Dave Jiang <[email protected]>
> Signed-off-by: Lu Baolu <[email protected]>

Reviewed-by: Kevin Tian <[email protected]>

2023-03-16 07:10:24

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 2/5] iommu/vt-d: Allow SVA with device-specific IOPF

> From: Lu Baolu <[email protected]>
> Sent: Thursday, March 9, 2023 10:57 AM
>
> @@ -4650,7 +4650,18 @@ static int intel_iommu_enable_sva(struct device
> *dev)
> if (!(iommu->flags & VTD_FLAG_SVM_CAPABLE))
> return -ENODEV;
>
> - if (!info->pasid_enabled || !info->pri_enabled || !info->ats_enabled)
> + if (!info->pasid_enabled)
> + return -EINVAL;
> +

I think you still want to check ats_enabled even for device specific IOPF.

2023-03-16 07:14:31

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 3/5] iommu/vt-d: Move iopf code from SVA to IOPF enabling path

> From: Lu Baolu <[email protected]>
> Sent: Thursday, March 9, 2023 10:57 AM
>
> Generally enabling IOMMU_DEV_FEAT_SVA requires
> IOMMU_DEV_FEAT_IOPF, but
> some devices manage I/O Page Faults themselves instead of relying on the
> IOMMU. Move IOPF related code from SVA to IOPF enabling path.
>
> For the device drivers that relies on the IOMMU for IOPF through PCI/PRI,
> IOMMU_DEV_FEAT_IOPF must be enabled before and disabled after
> IOMMU_DEV_FEAT_SVA.
>
> Signed-off-by: Lu Baolu <[email protected]>

Reviewed-by: Kevin Tian <[email protected]>

2023-03-16 07:14:42

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 4/5] iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path

> From: Lu Baolu <[email protected]>
> Sent: Thursday, March 9, 2023 10:57 AM
>
> They should be part of the per-device iommu private data initialization.
>
> Signed-off-by: Lu Baolu <[email protected]>

Reviewed-by: Kevin Tian <[email protected]>

2023-03-16 07:19:21

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 5/5] iommu/vt-d: Move PRI handling to IOPF feature path

> From: Lu Baolu <[email protected]>
> Sent: Thursday, March 9, 2023 10:57 AM
>
> @@ -4689,17 +4704,21 @@ static int intel_iommu_disable_iopf(struct device
> *dev)
> {
> struct device_domain_info *info = dev_iommu_priv_get(dev);
> struct intel_iommu *iommu = info->iommu;
> - int ret;
>
> - ret = iommu_unregister_device_fault_handler(dev);
> - if (ret)
> - return ret;
> + if (!info->pri_enabled)
> + return -EINVAL;
>
> - ret = iopf_queue_remove_device(iommu->iopf_queue, dev);
> - if (ret)
> - iommu_register_device_fault_handler(dev,
> iommu_queue_iopf, dev);
> + pci_disable_pri(to_pci_dev(dev));
> + info->pri_enabled = 0;
>
> - return ret;
> + /*
> + * With pri_enabled checked, unregistering fault handler and
> + * removing device from iopf queue should never fail.
> + */
> + iommu_unregister_device_fault_handler(dev);
> + iopf_queue_remove_device(iommu->iopf_queue, dev);
> +
> + return 0;
> }

PCIe spec says that clearing the enable bit doesn't mean in-fly
page requests are completed:
--
Enable (E) - This field, when set, indicates that the Page Request
Interface is allowed to make page requests. If this field is Clear,
the Page Request Interface is not allowed to issue page requests.
If both this field and the Stopped field are Clear, then the Page
Request Interface will not issue new page requests, but has
outstanding page requests that have been transmitted or are
queued for transmission

2023-03-16 07:35:13

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 2/5] iommu/vt-d: Allow SVA with device-specific IOPF

On 2023/3/16 15:09, Tian, Kevin wrote:
>> From: Lu Baolu <[email protected]>
>> Sent: Thursday, March 9, 2023 10:57 AM
>>
>> @@ -4650,7 +4650,18 @@ static int intel_iommu_enable_sva(struct device
>> *dev)
>> if (!(iommu->flags & VTD_FLAG_SVM_CAPABLE))
>> return -ENODEV;
>>
>> - if (!info->pasid_enabled || !info->pri_enabled || !info->ats_enabled)
>> + if (!info->pasid_enabled)
>> + return -EINVAL;
>> +
>
> I think you still want to check ats_enabled even for device specific IOPF.

Yeah! Updated.

Best regards,
baolu

2023-03-16 08:18:15

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] iommu/vt-d: Move PRI handling to IOPF feature path

On 2023/3/16 15:17, Tian, Kevin wrote:
>> From: Lu Baolu <[email protected]>
>> Sent: Thursday, March 9, 2023 10:57 AM
>>
>> @@ -4689,17 +4704,21 @@ static int intel_iommu_disable_iopf(struct device
>> *dev)
>> {
>> struct device_domain_info *info = dev_iommu_priv_get(dev);
>> struct intel_iommu *iommu = info->iommu;
>> - int ret;
>>
>> - ret = iommu_unregister_device_fault_handler(dev);
>> - if (ret)
>> - return ret;
>> + if (!info->pri_enabled)
>> + return -EINVAL;
>>
>> - ret = iopf_queue_remove_device(iommu->iopf_queue, dev);
>> - if (ret)
>> - iommu_register_device_fault_handler(dev,
>> iommu_queue_iopf, dev);
>> + pci_disable_pri(to_pci_dev(dev));
>> + info->pri_enabled = 0;
>>
>> - return ret;
>> + /*
>> + * With pri_enabled checked, unregistering fault handler and
>> + * removing device from iopf queue should never fail.
>> + */
>> + iommu_unregister_device_fault_handler(dev);
>> + iopf_queue_remove_device(iommu->iopf_queue, dev);
>> +
>> + return 0;
>> }
>
> PCIe spec says that clearing the enable bit doesn't mean in-fly
> page requests are completed:
> --
> Enable (E) - This field, when set, indicates that the Page Request
> Interface is allowed to make page requests. If this field is Clear,
> the Page Request Interface is not allowed to issue page requests.
> If both this field and the Stopped field are Clear, then the Page
> Request Interface will not issue new page requests, but has
> outstanding page requests that have been transmitted or are
> queued for transmission

Yes. So the iommu driver should drain the in-fly PRQs.

The Intel VT-d implementation drains the PRQs when any PASID is unbound
from the iommu domain (see intel_svm_drain_prq()) before reuse. Before
disabling iopf, the device driver should unbind pasid and disable sva,
so when it comes here, the PRQ should have been drained.

Perhaps I can add below comments to make this clear:

/*
* PCIe spec states that by clearing PRI enable bit, the Page
* Request Interface will not issue new page requests, but has
* outstanding page requests that have been transmitted or are
* queued for transmission. This is supposed to be called after
* the device driver has stopped DMA, all PASIDs have been
* unbound and the outstanding PRQs have been drained.
*/

Best regards,
baolu

2023-03-17 00:06:29

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH v2 5/5] iommu/vt-d: Move PRI handling to IOPF feature path

> From: Baolu Lu <[email protected]>
> Sent: Thursday, March 16, 2023 4:17 PM
>
> On 2023/3/16 15:17, Tian, Kevin wrote:
> >> From: Lu Baolu <[email protected]>
> >> Sent: Thursday, March 9, 2023 10:57 AM
> >>
> >> @@ -4689,17 +4704,21 @@ static int intel_iommu_disable_iopf(struct
> device
> >> *dev)
> >> {
> >> struct device_domain_info *info = dev_iommu_priv_get(dev);
> >> struct intel_iommu *iommu = info->iommu;
> >> - int ret;
> >>
> >> - ret = iommu_unregister_device_fault_handler(dev);
> >> - if (ret)
> >> - return ret;
> >> + if (!info->pri_enabled)
> >> + return -EINVAL;
> >>
> >> - ret = iopf_queue_remove_device(iommu->iopf_queue, dev);
> >> - if (ret)
> >> - iommu_register_device_fault_handler(dev,
> >> iommu_queue_iopf, dev);
> >> + pci_disable_pri(to_pci_dev(dev));
> >> + info->pri_enabled = 0;
> >>
> >> - return ret;
> >> + /*
> >> + * With pri_enabled checked, unregistering fault handler and
> >> + * removing device from iopf queue should never fail.
> >> + */
> >> + iommu_unregister_device_fault_handler(dev);
> >> + iopf_queue_remove_device(iommu->iopf_queue, dev);
> >> +
> >> + return 0;
> >> }
> >
> > PCIe spec says that clearing the enable bit doesn't mean in-fly
> > page requests are completed:
> > --
> > Enable (E) - This field, when set, indicates that the Page Request
> > Interface is allowed to make page requests. If this field is Clear,
> > the Page Request Interface is not allowed to issue page requests.
> > If both this field and the Stopped field are Clear, then the Page
> > Request Interface will not issue new page requests, but has
> > outstanding page requests that have been transmitted or are
> > queued for transmission
>
> Yes. So the iommu driver should drain the in-fly PRQs.
>
> The Intel VT-d implementation drains the PRQs when any PASID is unbound
> from the iommu domain (see intel_svm_drain_prq()) before reuse. Before
> disabling iopf, the device driver should unbind pasid and disable sva,
> so when it comes here, the PRQ should have been drained.
>
> Perhaps I can add below comments to make this clear:
>
> /*
> * PCIe spec states that by clearing PRI enable bit, the Page
> * Request Interface will not issue new page requests, but has
> * outstanding page requests that have been transmitted or are
> * queued for transmission. This is supposed to be called after
> * the device driver has stopped DMA, all PASIDs have been
> * unbound and the outstanding PRQs have been drained.
> */
>

this is fine. But it should be a separate patch which removes
check of return value. It's not caused by this PRI handling move
patch.

2023-03-17 00:47:46

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] iommu/vt-d: Move PRI handling to IOPF feature path

On 2023/3/17 8:06, Tian, Kevin wrote:
>> From: Baolu Lu <[email protected]>
>> Sent: Thursday, March 16, 2023 4:17 PM
>>
>> On 2023/3/16 15:17, Tian, Kevin wrote:
>>>> From: Lu Baolu <[email protected]>
>>>> Sent: Thursday, March 9, 2023 10:57 AM
>>>>
>>>> @@ -4689,17 +4704,21 @@ static int intel_iommu_disable_iopf(struct
>> device
>>>> *dev)
>>>> {
>>>> struct device_domain_info *info = dev_iommu_priv_get(dev);
>>>> struct intel_iommu *iommu = info->iommu;
>>>> - int ret;
>>>>
>>>> - ret = iommu_unregister_device_fault_handler(dev);
>>>> - if (ret)
>>>> - return ret;
>>>> + if (!info->pri_enabled)
>>>> + return -EINVAL;
>>>>
>>>> - ret = iopf_queue_remove_device(iommu->iopf_queue, dev);
>>>> - if (ret)
>>>> - iommu_register_device_fault_handler(dev,
>>>> iommu_queue_iopf, dev);
>>>> + pci_disable_pri(to_pci_dev(dev));
>>>> + info->pri_enabled = 0;
>>>>
>>>> - return ret;
>>>> + /*
>>>> + * With pri_enabled checked, unregistering fault handler and
>>>> + * removing device from iopf queue should never fail.
>>>> + */
>>>> + iommu_unregister_device_fault_handler(dev);
>>>> + iopf_queue_remove_device(iommu->iopf_queue, dev);
>>>> +
>>>> + return 0;
>>>> }
>>>
>>> PCIe spec says that clearing the enable bit doesn't mean in-fly
>>> page requests are completed:
>>> --
>>> Enable (E) - This field, when set, indicates that the Page Request
>>> Interface is allowed to make page requests. If this field is Clear,
>>> the Page Request Interface is not allowed to issue page requests.
>>> If both this field and the Stopped field are Clear, then the Page
>>> Request Interface will not issue new page requests, but has
>>> outstanding page requests that have been transmitted or are
>>> queued for transmission
>>
>> Yes. So the iommu driver should drain the in-fly PRQs.
>>
>> The Intel VT-d implementation drains the PRQs when any PASID is unbound
>> from the iommu domain (see intel_svm_drain_prq()) before reuse. Before
>> disabling iopf, the device driver should unbind pasid and disable sva,
>> so when it comes here, the PRQ should have been drained.
>>
>> Perhaps I can add below comments to make this clear:
>>
>> /*
>> * PCIe spec states that by clearing PRI enable bit, the Page
>> * Request Interface will not issue new page requests, but has
>> * outstanding page requests that have been transmitted or are
>> * queued for transmission. This is supposed to be called after
>> * the device driver has stopped DMA, all PASIDs have been
>> * unbound and the outstanding PRQs have been drained.
>> */
>>
>
> this is fine. But it should be a separate patch which removes
> check of return value. It's not caused by this PRI handling move
> patch.

Okay, that will be clearer.

Best regards,
baolu

2023-03-20 16:07:15

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH v2 2/5] iommu/vt-d: Allow SVA with device-specific IOPF

Hi BaoLu,

On Thu, 9 Mar 2023 10:56:36 +0800, Lu Baolu <[email protected]>
wrote:

> Currently enabling SVA requires IOPF support from the IOMMU and device
> PCI PRI. However, some devices can handle IOPF by itself without ever
> sending PCI page requests nor advertising PRI capability.
>
> Allow SVA support with IOPF handled either by IOMMU (PCI PRI) or device
> driver (device-specific IOPF). As long as IOPF could be handled, SVA
> should continue to work.
>
> Signed-off-by: Lu Baolu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 7c2f4bd33582..d2fcab9d8f61 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4650,7 +4650,18 @@ static int intel_iommu_enable_sva(struct device
> *dev) if (!(iommu->flags & VTD_FLAG_SVM_CAPABLE))
> return -ENODEV;
>
> - if (!info->pasid_enabled || !info->pri_enabled ||
> !info->ats_enabled)
> + if (!info->pasid_enabled)
> + return -EINVAL;
> +
> + /*
> + * Devices having device-specific I/O fault handling should not
> + * support PCI/PRI.
> + */
> + if (!info->pri_supported)
> + return 0;
If you put this check at the very beginning, everything else should it be
the same, right?

Still feel a little weird that, SVA is tied to PRI for PCI PRI but not for
device specific IOPF.

> + /* Devices supporting ATS/PRI should have it enabled. */
> + if (!info->pri_enabled || !info->ats_enabled)
> return -EINVAL;
>
> ret = iopf_queue_add_device(iommu->iopf_queue, dev);


Thanks,

Jacob

2023-03-20 16:20:14

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH v2 4/5] iommu/vt-d: Move pfsid and ats_qdep calculation to device probe path

Hi BaoLu,

On Thu, 9 Mar 2023 10:56:38 +0800, Lu Baolu <[email protected]>
wrote:

> They should be part of the per-device iommu private data initialization.
>
Reviewed-by: Jacob Pan <[email protected]>

> Signed-off-by: Lu Baolu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 26 +++++++++++---------------
> 1 file changed, 11 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 9ada12bf38dd..fb64ab8358a9 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1406,20 +1406,6 @@ static void iommu_enable_pci_caps(struct
> device_domain_info *info) return;
>
> pdev = to_pci_dev(info->dev);
> - /* For IOMMU that supports device IOTLB throttling (DIT), we
> assign
> - * PFSID to the invalidation desc of a VF such that IOMMU HW can
> gauge
> - * queue depth at PF level. If DIT is not set, PFSID will be
> treated as
> - * reserved, which should be set to 0.
> - */
> - if (!ecap_dit(info->iommu->ecap))
> - info->pfsid = 0;
> - else {
> - struct pci_dev *pf_pdev;
> -
> - /* pdev will be returned if device is not a vf */
> - pf_pdev = pci_physfn(pdev);
> - info->pfsid = pci_dev_id(pf_pdev);
> - }
>
> /* The PCIe spec, in its wisdom, declares that the behaviour of
> the device if you enable PASID support after ATS support is
> @@ -1438,7 +1424,6 @@ static void iommu_enable_pci_caps(struct
> device_domain_info *info) !pci_enable_ats(pdev, VTD_PAGE_SHIFT)) {
> info->ats_enabled = 1;
> domain_update_iotlb(info->domain);
> - info->ats_qdep = pci_ats_queue_depth(pdev);
> }
> }
>
> @@ -4521,6 +4506,17 @@ static struct iommu_device
> *intel_iommu_probe_device(struct device *dev) dmar_ats_supported(pdev,
> iommu)) { info->ats_supported = 1;
> info->dtlb_extra_inval =
> dev_needs_extra_dtlb_flush(pdev); +
> + /*
> + * For IOMMU that supports device IOTLB
> throttling
> + * (DIT), we assign PFSID to the invalidation
> desc
> + * of a VF such that IOMMU HW can gauge queue
> depth
> + * at PF level. If DIT is not set, PFSID will be
> + * treated as reserved, which should be set to 0.
> + */
> + if (ecap_dit(iommu->ecap))
> + info->pfsid =
> pci_dev_id(pci_physfn(pdev));
> + info->ats_qdep = pci_ats_queue_depth(pdev);
> }
> if (sm_supported(iommu)) {
> if (pasid_supported(iommu)) {


Thanks,

Jacob

2023-03-20 16:32:19

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH v2 5/5] iommu/vt-d: Move PRI handling to IOPF feature path

Hi BaoLu,

On Thu, 9 Mar 2023 10:56:39 +0800, Lu Baolu <[email protected]>
wrote:

> PRI is only used for IOPF. With this move, the PCI/PRI feature could be
> controlled by the device driver through iommu_dev_enable/disable_feature()
> interfaces.
This move is good for DMA API PASID as well, it will not turn on PRI when
enabling PASID, ATS cap.

Reviewed-by: Jacob Pan <[email protected]>

> Signed-off-by: Lu Baolu <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 59 ++++++++++++++++++++++++-------------
> 1 file changed, 39 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index fb64ab8358a9..4ed32bde4287 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -1415,11 +1415,6 @@ static void iommu_enable_pci_caps(struct
> device_domain_info *info) if (info->pasid_supported &&
> !pci_enable_pasid(pdev, info->pasid_supported & ~1)) info->pasid_enabled
> = 1;
> - if (info->pri_supported &&
> - (info->pasid_enabled ? pci_prg_resp_pasid_required(pdev) :
> 1) &&
> - !pci_reset_pri(pdev) && !pci_enable_pri(pdev, PRQ_DEPTH))
> - info->pri_enabled = 1;
> -
> if (info->ats_supported && pci_ats_page_aligned(pdev) &&
> !pci_enable_ats(pdev, VTD_PAGE_SHIFT)) {
> info->ats_enabled = 1;
> @@ -1442,11 +1437,6 @@ static void iommu_disable_pci_caps(struct
> device_domain_info *info) domain_update_iotlb(info->domain);
> }
>
> - if (info->pri_enabled) {
> - pci_disable_pri(pdev);
> - info->pri_enabled = 0;
> - }
> -
> if (info->pasid_enabled) {
> pci_disable_pasid(pdev);
> info->pasid_enabled = 0;
> @@ -4664,23 +4654,48 @@ static int intel_iommu_enable_sva(struct device
> *dev)
> static int intel_iommu_enable_iopf(struct device *dev)
> {
> + struct pci_dev *pdev = dev_is_pci(dev) ? to_pci_dev(dev) : NULL;
> struct device_domain_info *info = dev_iommu_priv_get(dev);
> struct intel_iommu *iommu;
> int ret;
>
> - if (!info || !info->ats_enabled || !info->pri_enabled)
> + if (!pdev || !info || !info->ats_enabled || !info->pri_supported)
> return -ENODEV;
> +
> + if (info->pri_enabled)
> + return -EBUSY;
> +
> iommu = info->iommu;
> if (!iommu)
> return -EINVAL;
>
> + /* PASID is required in PRG Response Message. */
> + if (info->pasid_enabled && !pci_prg_resp_pasid_required(pdev))
> + return -EINVAL;
> +
> + ret = pci_reset_pri(pdev);
> + if (ret)
> + return ret;
> +
> ret = iopf_queue_add_device(iommu->iopf_queue, dev);
> if (ret)
> return ret;
>
> ret = iommu_register_device_fault_handler(dev, iommu_queue_iopf,
> dev); if (ret)
> - iopf_queue_remove_device(iommu->iopf_queue, dev);
> + goto iopf_remove_device;
> +
> + ret = pci_enable_pri(pdev, PRQ_DEPTH);
> + if (ret)
> + goto iopf_unregister_handler;
> + info->pri_enabled = 1;
> +
> + return 0;
> +
> +iopf_unregister_handler:
> + iommu_unregister_device_fault_handler(dev);
> +iopf_remove_device:
> + iopf_queue_remove_device(iommu->iopf_queue, dev);
>
> return ret;
> }
> @@ -4689,17 +4704,21 @@ static int intel_iommu_disable_iopf(struct device
> *dev) {
> struct device_domain_info *info = dev_iommu_priv_get(dev);
> struct intel_iommu *iommu = info->iommu;
> - int ret;
>
> - ret = iommu_unregister_device_fault_handler(dev);
> - if (ret)
> - return ret;
> + if (!info->pri_enabled)
> + return -EINVAL;
>
> - ret = iopf_queue_remove_device(iommu->iopf_queue, dev);
> - if (ret)
> - iommu_register_device_fault_handler(dev,
> iommu_queue_iopf, dev);
> + pci_disable_pri(to_pci_dev(dev));
> + info->pri_enabled = 0;
>
> - return ret;
> + /*
> + * With pri_enabled checked, unregistering fault handler and
> + * removing device from iopf queue should never fail.
> + */
> + iommu_unregister_device_fault_handler(dev);
> + iopf_queue_remove_device(iommu->iopf_queue, dev);
> +
> + return 0;
> }
>
> static int


Thanks,

Jacob

2023-03-21 05:44:21

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH v2 2/5] iommu/vt-d: Allow SVA with device-specific IOPF

On 3/21/23 12:00 AM, Jacob Pan wrote:
> Hi BaoLu,

Hi Jacob,

>
> On Thu, 9 Mar 2023 10:56:36 +0800, Lu Baolu<[email protected]>
> wrote:
>
>> Currently enabling SVA requires IOPF support from the IOMMU and device
>> PCI PRI. However, some devices can handle IOPF by itself without ever
>> sending PCI page requests nor advertising PRI capability.
>>
>> Allow SVA support with IOPF handled either by IOMMU (PCI PRI) or device
>> driver (device-specific IOPF). As long as IOPF could be handled, SVA
>> should continue to work.
>>
>> Signed-off-by: Lu Baolu<[email protected]>
>> ---
>> drivers/iommu/intel/iommu.c | 13 ++++++++++++-
>> 1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
>> index 7c2f4bd33582..d2fcab9d8f61 100644
>> --- a/drivers/iommu/intel/iommu.c
>> +++ b/drivers/iommu/intel/iommu.c
>> @@ -4650,7 +4650,18 @@ static int intel_iommu_enable_sva(struct device
>> *dev) if (!(iommu->flags & VTD_FLAG_SVM_CAPABLE))
>> return -ENODEV;
>>
>> - if (!info->pasid_enabled || !info->pri_enabled ||
>> !info->ats_enabled)
>> + if (!info->pasid_enabled)
>> + return -EINVAL;
>> +
>> + /*
>> + * Devices having device-specific I/O fault handling should not
>> + * support PCI/PRI.
>> + */
>> + if (!info->pri_supported)
>> + return 0;
> If you put this check at the very beginning, everything else should it be
> the same, right?

Even for device specific IOPF, PASID/ATS are still required on the IOMMU
side.

>
> Still feel a little weird that, SVA is tied to PRI for PCI PRI but not for
> device specific IOPF.

PCI PRI and device specific IOPF *should* be equivalent. But the IOMMU
side has no means to check the capability of device-specific IOPF.
Therefore, IOMMU can only default that if the device driver enables SVA
on a non-PRI device, it will handle IOPF in its own way.

Best regards,
baolu