2023-03-02 00:56:21

by Jacob Pan

[permalink] [raw]
Subject: [PATCH 0/4] Re-enable IDXD kernel workqueue under DMA API

Hi all,

IDXD kernel work queues were disabled due to the flawed use of kernel VA
and SVA API.
Link: https://lore.kernel.org/linux-iommu/[email protected]/

The solution is to enable it under DMA API where IDXD shared workqueue users
can use ENQCMDS to submit work on buffers mapped by DMA API.

This patchset adds support for attaching PASID to the device's default
domain and the ability to reserve global PASIDs from SVA APIs. We can then
re-enable the kernel work queues and use them under DMA API.

This depends on the IOASID removal series.
https://lore.kernel.org/linux-iommu/[email protected]/T/#t

Thanks,

Jacob

Jacob Pan (4):
iommu/vt-d: Implement set device pasid op for default domain
iommu/vt-d: Use non-privileged mode for all PASIDs
iommu/sva: Support reservation of global PASIDs
dmaengine/idxd: Re-enable kernel workqueue under DMA API

drivers/dma/idxd/device.c | 30 ++++-------------------
drivers/dma/idxd/init.c | 47 +++++++++++++++++++++++++++++++++----
drivers/dma/idxd/sysfs.c | 7 ------
drivers/iommu/intel/iommu.c | 34 +++++++++++++++++++++++++--
drivers/iommu/iommu-sva.c | 25 ++++++++++++++++++++
include/linux/iommu.h | 14 +++++++++++
6 files changed, 119 insertions(+), 38 deletions(-)

--
2.25.1



2023-03-02 00:56:24

by Jacob Pan

[permalink] [raw]
Subject: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

On VT-d platforms, legacy DMA requests without PASID use device’s
default domain, where RID_PASID is always attached. Device drivers
can then use the DMA API for all in-kernel DMA on the RID.

Ideally, devices capable of using ENQCMDS can also transparently use the
default domain, consequently DMA API. However, VT-d architecture
dictates that the PASID used by ENQCMDS must be different from the
RID_PASID value.

To provide support for transparent use of DMA API with non-RID_PASID
value, this patch implements the set_dev_pasid() function for the
default domain. The idea is that device drivers wishing to use ENQCMDS
to submit work on buffers mapped by DMA API will call
iommu_attach_device_pasid() beforehand.

Signed-off-by: Jacob Pan <[email protected]>
---
drivers/iommu/intel/iommu.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 10f657828d3a..a0cb3bc851ac 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4665,6 +4665,10 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
case IOMMU_DOMAIN_SVA:
intel_svm_remove_dev_pasid(dev, pasid);
break;
+ case IOMMU_DOMAIN_DMA:
+ case IOMMU_DOMAIN_DMA_FQ:
+ case IOMMU_DOMAIN_IDENTITY:
+ break;
default:
/* should never reach here */
WARN_ON(1);
@@ -4675,6 +4679,33 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
intel_pasid_tear_down_entry(iommu, dev, pasid, false);
}

+static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid)
+{
+ struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+ struct intel_iommu *iommu = info->iommu;
+ int ret = 0;
+
+ if (!sm_supported(iommu) || !info)
+ return -ENODEV;
+
+ if (WARN_ON(pasid == PASID_RID2PASID))
+ return -EINVAL;
+
+ if (hw_pass_through && domain_type_is_si(dmar_domain))
+ ret = intel_pasid_setup_pass_through(iommu, dmar_domain,
+ dev, pasid);
+ else if (dmar_domain->use_first_level)
+ ret = domain_setup_first_level(iommu, dmar_domain,
+ dev, pasid);
+ else
+ ret = intel_pasid_setup_second_level(iommu, dmar_domain,
+ dev, pasid);
+
+ return ret;
+}
+
const struct iommu_ops intel_iommu_ops = {
.capable = intel_iommu_capable,
.domain_alloc = intel_iommu_domain_alloc,
@@ -4702,6 +4733,7 @@ const struct iommu_ops intel_iommu_ops = {
.iova_to_phys = intel_iommu_iova_to_phys,
.free = intel_iommu_domain_free,
.enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
+ .set_dev_pasid = intel_iommu_set_dev_pasid,
}
};

--
2.25.1


2023-03-02 00:56:26

by Jacob Pan

[permalink] [raw]
Subject: [PATCH 2/4] iommu/vt-d: Use non-privileged mode for all PASIDs

For in-kernel DMA, use non-privileged access for all PASIDs to be
consistent with RID_PASID.
There's no need to differentiate user and kernel for in-kernel DMA.

Signed-off-by: Jacob Pan <[email protected]>
---
drivers/iommu/intel/iommu.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index a0cb3bc851ac..9e3c056e392d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2334,8 +2334,6 @@ static int domain_setup_first_level(struct intel_iommu *iommu,
if (level != 4 && level != 5)
return -EINVAL;

- if (pasid != PASID_RID2PASID)
- flags |= PASID_FLAG_SUPERVISOR_MODE;
if (level == 5)
flags |= PASID_FLAG_FL5LP;

--
2.25.1


2023-03-02 00:56:29

by Jacob Pan

[permalink] [raw]
Subject: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

Global PASID allocation is under IOMMU SVA code since it is the primary
use case. However, some architecture such as VT-d, global PASIDs are
necessary for its internal use of DMA API with PASID.

This patch introduces SVA APIs to reserve and release global PASIDs.

Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Jacob Pan <[email protected]>
---
drivers/iommu/iommu-sva.c | 25 +++++++++++++++++++++++++
include/linux/iommu.h | 14 ++++++++++++++
2 files changed, 39 insertions(+)

diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 8c92a145e15d..cfdeafde88a9 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -149,6 +149,31 @@ u32 iommu_sva_get_pasid(struct iommu_sva *handle)
}
EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);

+ioasid_t iommu_sva_reserve_pasid(ioasid_t min, ioasid_t max)
+{
+ int ret;
+
+ if (min == IOMMU_PASID_INVALID || max == IOMMU_PASID_INVALID ||
+ min == 0 || max < min)
+ return IOMMU_PASID_INVALID;
+
+ ret = ida_alloc_range(&iommu_global_pasid_ida, min, max, GFP_KERNEL);
+ if (ret < 0)
+ return IOMMU_PASID_INVALID;
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_sva_reserve_pasid);
+
+void iommu_sva_unreserve_pasid(ioasid_t pasid)
+{
+ if (!pasid_valid(pasid))
+ return;
+
+ ida_free(&iommu_global_pasid_ida, pasid);
+}
+EXPORT_SYMBOL_GPL(iommu_sva_unreserve_pasid);
+
/*
* I/O page fault handler for SVA
*/
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 39a97bd8f04a..8ba07eb03d32 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -1192,6 +1192,9 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev,
struct mm_struct *mm);
void iommu_sva_unbind_device(struct iommu_sva *handle);
u32 iommu_sva_get_pasid(struct iommu_sva *handle);
+ioasid_t iommu_sva_reserve_pasid(ioasid_t min, ioasid_t max);
+void iommu_sva_unreserve_pasid(ioasid_t pasid);
+
#else
static inline struct iommu_sva *
iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
@@ -1207,6 +1210,17 @@ static inline u32 iommu_sva_get_pasid(struct iommu_sva *handle)
{
return IOMMU_PASID_INVALID;
}
+
+static inline ioasid_t iommu_sva_reserve_pasid(ioasid_t min, ioasid_t max);
+{
+ return IOMMU_PASID_INVALID;
+}
+
+static inline void iommu_sva_unreserve_pasid(ioasid_t pasid)
+{
+
+}
+
#endif /* CONFIG_IOMMU_SVA */

#endif /* __LINUX_IOMMU_H */
--
2.25.1


2023-03-02 00:56:31

by Jacob Pan

[permalink] [raw]
Subject: [PATCH 4/4] dmaengine/idxd: Re-enable kernel workqueue under DMA API

Kernel workqueues were disabled due to flawed use of kernel VA and SVA
API. Now That we have the support for attaching PASID to the device's
default domain and the ability to reserve global PASIDs from SVA APIs,
we can re-enable the kernel work queues and use them under DMA API.

We also use non-privileged access for in-kernel DMA to be consistent
with the IOMMU settings. Consequently, interrupt for user privilege is
enabled for work completion IRQs.

Link:https://lore.kernel.org/linux-iommu/[email protected]/
Signed-off-by: Jacob Pan <[email protected]>
---
drivers/dma/idxd/device.c | 30 +++++--------------------
drivers/dma/idxd/init.c | 47 +++++++++++++++++++++++++++++++++++----
drivers/dma/idxd/sysfs.c | 7 ------
3 files changed, 48 insertions(+), 36 deletions(-)

diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c
index 125652a8bb29..96faf4d3445e 100644
--- a/drivers/dma/idxd/device.c
+++ b/drivers/dma/idxd/device.c
@@ -299,21 +299,6 @@ void idxd_wqs_unmap_portal(struct idxd_device *idxd)
}
}

-static void __idxd_wq_set_priv_locked(struct idxd_wq *wq, int priv)
-{
- struct idxd_device *idxd = wq->idxd;
- union wqcfg wqcfg;
- unsigned int offset;
-
- offset = WQCFG_OFFSET(idxd, wq->id, WQCFG_PRIVL_IDX);
- spin_lock(&idxd->dev_lock);
- wqcfg.bits[WQCFG_PRIVL_IDX] = ioread32(idxd->reg_base + offset);
- wqcfg.priv = priv;
- wq->wqcfg->bits[WQCFG_PRIVL_IDX] = wqcfg.bits[WQCFG_PRIVL_IDX];
- iowrite32(wqcfg.bits[WQCFG_PRIVL_IDX], idxd->reg_base + offset);
- spin_unlock(&idxd->dev_lock);
-}
-
static void __idxd_wq_set_pasid_locked(struct idxd_wq *wq, int pasid)
{
struct idxd_device *idxd = wq->idxd;
@@ -1324,15 +1309,14 @@ int drv_enable_wq(struct idxd_wq *wq)
}

/*
- * In the event that the WQ is configurable for pasid and priv bits.
- * For kernel wq, the driver should setup the pasid, pasid_en, and priv bit.
- * However, for non-kernel wq, the driver should only set the pasid_en bit for
- * shared wq. A dedicated wq that is not 'kernel' type will configure pasid and
+ * In the event that the WQ is configurable for pasid, the driver
+ * should setup the pasid, pasid_en bit. This is true for both kernel
+ * and user shared workqueues. There is no need to setup priv bit in
+ * that in-kernel DMA will also do user privileged requests.
+ * A dedicated wq that is not 'kernel' type will configure pasid and
* pasid_en later on so there is no need to setup.
*/
if (test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags)) {
- int priv = 0;
-
if (wq_pasid_enabled(wq)) {
if (is_idxd_wq_kernel(wq) || wq_shared(wq)) {
u32 pasid = wq_dedicated(wq) ? idxd->pasid : 0;
@@ -1340,10 +1324,6 @@ int drv_enable_wq(struct idxd_wq *wq)
__idxd_wq_set_pasid_locked(wq, pasid);
}
}
-
- if (is_idxd_wq_kernel(wq))
- priv = 1;
- __idxd_wq_set_priv_locked(wq, priv);
}

rc = 0;
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
index f30eef701970..dadc908318aa 100644
--- a/drivers/dma/idxd/init.c
+++ b/drivers/dma/idxd/init.c
@@ -501,14 +501,52 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d

static int idxd_enable_system_pasid(struct idxd_device *idxd)
{
- return -EOPNOTSUPP;
+ struct pci_dev *pdev = idxd->pdev;
+ struct device *dev = &pdev->dev;
+ struct iommu_domain *domain;
+ union gencfg_reg gencfg;
+ ioasid_t pasid;
+ int ret;
+
+ domain = iommu_get_domain_for_dev(dev);
+ if (!domain || domain->type == IOMMU_DOMAIN_BLOCKED)
+ return -EPERM;
+
+ pasid = iommu_sva_reserve_pasid(1, dev->iommu->max_pasids);
+ if (pasid == IOMMU_PASID_INVALID)
+ return -ENOSPC;
+
+ ret = iommu_attach_device_pasid(domain, dev, pasid);
+ if (ret) {
+ dev_err(dev, "failed to attach device pasid %d, domain type %d",
+ pasid, domain->type);
+ iommu_sva_unreserve_pasid(pasid);
+ return ret;
+ }
+
+ /* Since we set user privilege for kernel DMA, enable completion IRQ */
+ gencfg.bits = ioread32(idxd->reg_base + IDXD_GENCFG_OFFSET);
+ gencfg.user_int_en = 1;
+ iowrite32(gencfg.bits, idxd->reg_base + IDXD_GENCFG_OFFSET);
+ idxd->pasid = pasid;
+
+ return ret;
}

static void idxd_disable_system_pasid(struct idxd_device *idxd)
{
+ struct pci_dev *pdev = idxd->pdev;
+ struct device *dev = &pdev->dev;
+ struct iommu_domain *domain;
+
+ domain = iommu_get_domain_for_dev(dev);
+ if (!domain || domain->type == IOMMU_DOMAIN_BLOCKED)
+ return;

- iommu_sva_unbind_device(idxd->sva);
+ iommu_detach_device_pasid(domain, dev, idxd->pasid);
+ iommu_sva_unreserve_pasid(idxd->pasid);
idxd->sva = NULL;
+ idxd->pasid = IOMMU_PASID_INVALID;
}

static int idxd_probe(struct idxd_device *idxd)
@@ -530,8 +568,9 @@ static int idxd_probe(struct idxd_device *idxd)
} else {
set_bit(IDXD_FLAG_USER_PASID_ENABLED, &idxd->flags);

- if (idxd_enable_system_pasid(idxd))
- dev_warn(dev, "No in-kernel DMA with PASID.\n");
+ rc = idxd_enable_system_pasid(idxd);
+ if (rc)
+ dev_warn(dev, "No in-kernel DMA with PASID. %d\n", rc);
else
set_bit(IDXD_FLAG_PASID_ENABLED, &idxd->flags);
}
diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
index 3229dfc78650..09f5c3f2a992 100644
--- a/drivers/dma/idxd/sysfs.c
+++ b/drivers/dma/idxd/sysfs.c
@@ -944,13 +944,6 @@ static ssize_t wq_name_store(struct device *dev,
if (strlen(buf) > WQ_NAME_SIZE || strlen(buf) == 0)
return -EINVAL;

- /*
- * This is temporarily placed here until we have SVM support for
- * dmaengine.
- */
- if (wq->type == IDXD_WQT_KERNEL && device_pasid_enabled(wq->idxd))
- return -EOPNOTSUPP;
-
input = kstrndup(buf, count, GFP_KERNEL);
if (!input)
return -ENOMEM;
--
2.25.1


2023-03-02 01:03:23

by Dave Jiang

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine/idxd: Re-enable kernel workqueue under DMA API



On 3/1/23 5:59 PM, Jacob Pan wrote:
> Kernel workqueues were disabled due to flawed use of kernel VA and SVA
> API. Now That we have the support for attaching PASID to the device's
> default domain and the ability to reserve global PASIDs from SVA APIs,
> we can re-enable the kernel work queues and use them under DMA API.
>
> We also use non-privileged access for in-kernel DMA to be consistent
> with the IOMMU settings. Consequently, interrupt for user privilege is
> enabled for work completion IRQs.
>
> Link:https://lore.kernel.org/linux-iommu/[email protected]/
> Signed-off-by: Jacob Pan <[email protected]>

Reviewed-by: Dave Jiang <[email protected]>

> ---
> drivers/dma/idxd/device.c | 30 +++++--------------------
> drivers/dma/idxd/init.c | 47 +++++++++++++++++++++++++++++++++++----
> drivers/dma/idxd/sysfs.c | 7 ------
> 3 files changed, 48 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c
> index 125652a8bb29..96faf4d3445e 100644
> --- a/drivers/dma/idxd/device.c
> +++ b/drivers/dma/idxd/device.c
> @@ -299,21 +299,6 @@ void idxd_wqs_unmap_portal(struct idxd_device *idxd)
> }
> }
>
> -static void __idxd_wq_set_priv_locked(struct idxd_wq *wq, int priv)
> -{
> - struct idxd_device *idxd = wq->idxd;
> - union wqcfg wqcfg;
> - unsigned int offset;
> -
> - offset = WQCFG_OFFSET(idxd, wq->id, WQCFG_PRIVL_IDX);
> - spin_lock(&idxd->dev_lock);
> - wqcfg.bits[WQCFG_PRIVL_IDX] = ioread32(idxd->reg_base + offset);
> - wqcfg.priv = priv;
> - wq->wqcfg->bits[WQCFG_PRIVL_IDX] = wqcfg.bits[WQCFG_PRIVL_IDX];
> - iowrite32(wqcfg.bits[WQCFG_PRIVL_IDX], idxd->reg_base + offset);
> - spin_unlock(&idxd->dev_lock);
> -}
> -
> static void __idxd_wq_set_pasid_locked(struct idxd_wq *wq, int pasid)
> {
> struct idxd_device *idxd = wq->idxd;
> @@ -1324,15 +1309,14 @@ int drv_enable_wq(struct idxd_wq *wq)
> }
>
> /*
> - * In the event that the WQ is configurable for pasid and priv bits.
> - * For kernel wq, the driver should setup the pasid, pasid_en, and priv bit.
> - * However, for non-kernel wq, the driver should only set the pasid_en bit for
> - * shared wq. A dedicated wq that is not 'kernel' type will configure pasid and
> + * In the event that the WQ is configurable for pasid, the driver
> + * should setup the pasid, pasid_en bit. This is true for both kernel
> + * and user shared workqueues. There is no need to setup priv bit in
> + * that in-kernel DMA will also do user privileged requests.
> + * A dedicated wq that is not 'kernel' type will configure pasid and
> * pasid_en later on so there is no need to setup.
> */
> if (test_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags)) {
> - int priv = 0;
> -
> if (wq_pasid_enabled(wq)) {
> if (is_idxd_wq_kernel(wq) || wq_shared(wq)) {
> u32 pasid = wq_dedicated(wq) ? idxd->pasid : 0;
> @@ -1340,10 +1324,6 @@ int drv_enable_wq(struct idxd_wq *wq)
> __idxd_wq_set_pasid_locked(wq, pasid);
> }
> }
> -
> - if (is_idxd_wq_kernel(wq))
> - priv = 1;
> - __idxd_wq_set_priv_locked(wq, priv);
> }
>
> rc = 0;
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index f30eef701970..dadc908318aa 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -501,14 +501,52 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
>
> static int idxd_enable_system_pasid(struct idxd_device *idxd)
> {
> - return -EOPNOTSUPP;
> + struct pci_dev *pdev = idxd->pdev;
> + struct device *dev = &pdev->dev;
> + struct iommu_domain *domain;
> + union gencfg_reg gencfg;
> + ioasid_t pasid;
> + int ret;
> +
> + domain = iommu_get_domain_for_dev(dev);
> + if (!domain || domain->type == IOMMU_DOMAIN_BLOCKED)
> + return -EPERM;
> +
> + pasid = iommu_sva_reserve_pasid(1, dev->iommu->max_pasids);
> + if (pasid == IOMMU_PASID_INVALID)
> + return -ENOSPC;
> +
> + ret = iommu_attach_device_pasid(domain, dev, pasid);
> + if (ret) {
> + dev_err(dev, "failed to attach device pasid %d, domain type %d",
> + pasid, domain->type);
> + iommu_sva_unreserve_pasid(pasid);
> + return ret;
> + }
> +
> + /* Since we set user privilege for kernel DMA, enable completion IRQ */
> + gencfg.bits = ioread32(idxd->reg_base + IDXD_GENCFG_OFFSET);
> + gencfg.user_int_en = 1;
> + iowrite32(gencfg.bits, idxd->reg_base + IDXD_GENCFG_OFFSET);
> + idxd->pasid = pasid;
> +
> + return ret;
> }
>
> static void idxd_disable_system_pasid(struct idxd_device *idxd)
> {
> + struct pci_dev *pdev = idxd->pdev;
> + struct device *dev = &pdev->dev;
> + struct iommu_domain *domain;
> +
> + domain = iommu_get_domain_for_dev(dev);
> + if (!domain || domain->type == IOMMU_DOMAIN_BLOCKED)
> + return;
>
> - iommu_sva_unbind_device(idxd->sva);
> + iommu_detach_device_pasid(domain, dev, idxd->pasid);
> + iommu_sva_unreserve_pasid(idxd->pasid);
> idxd->sva = NULL;
> + idxd->pasid = IOMMU_PASID_INVALID;
> }
>
> static int idxd_probe(struct idxd_device *idxd)
> @@ -530,8 +568,9 @@ static int idxd_probe(struct idxd_device *idxd)
> } else {
> set_bit(IDXD_FLAG_USER_PASID_ENABLED, &idxd->flags);
>
> - if (idxd_enable_system_pasid(idxd))
> - dev_warn(dev, "No in-kernel DMA with PASID.\n");
> + rc = idxd_enable_system_pasid(idxd);
> + if (rc)
> + dev_warn(dev, "No in-kernel DMA with PASID. %d\n", rc);
> else
> set_bit(IDXD_FLAG_PASID_ENABLED, &idxd->flags);
> }
> diff --git a/drivers/dma/idxd/sysfs.c b/drivers/dma/idxd/sysfs.c
> index 3229dfc78650..09f5c3f2a992 100644
> --- a/drivers/dma/idxd/sysfs.c
> +++ b/drivers/dma/idxd/sysfs.c
> @@ -944,13 +944,6 @@ static ssize_t wq_name_store(struct device *dev,
> if (strlen(buf) > WQ_NAME_SIZE || strlen(buf) == 0)
> return -EINVAL;
>
> - /*
> - * This is temporarily placed here until we have SVM support for
> - * dmaengine.
> - */
> - if (wq->type == IDXD_WQT_KERNEL && device_pasid_enabled(wq->idxd))
> - return -EOPNOTSUPP;
> -
> input = kstrndup(buf, count, GFP_KERNEL);
> if (!input)
> return -ENOMEM;

2023-03-02 03:07:17

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

Hi Jacob,

I love your patch! Yet something to improve:

[auto build test ERROR on vkoul-dmaengine/next]
[also build test ERROR on v6.2]
[cannot apply to joro-iommu/next linus/master next-20230302]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Jacob-Pan/iommu-vt-d-Implement-set-device-pasid-op-for-default-domain/20230302-085748
base: https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git next
patch link: https://lore.kernel.org/r/20230302005959.2695267-4-jacob.jun.pan%40linux.intel.com
patch subject: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs
config: arm64-randconfig-r013-20230302 (https://download.01.org/0day-ci/archive/20230302/[email protected]/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project 67409911353323ca5edf2049ef0df54132fa1ca7)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# https://github.com/intel-lab-lkp/linux/commit/b27170369658e99a0aafd84985de0ce48c1b2539
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Jacob-Pan/iommu-vt-d-Implement-set-device-pasid-op-for-default-domain/20230302-085748
git checkout b27170369658e99a0aafd84985de0ce48c1b2539
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm64 olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/gpu/drm/exynos/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

In file included from drivers/gpu/drm/exynos/exynos_drm_dma.c:8:
>> include/linux/iommu.h:1215:1: error: expected identifier or '('
{
^
drivers/gpu/drm/exynos/exynos_drm_dma.c:54:35: warning: implicit conversion from 'unsigned long long' to 'unsigned int' changes value from 18446744073709551615 to 4294967295 [-Wconstant-conversion]
dma_set_max_seg_size(subdrv_dev, DMA_BIT_MASK(32));
~~~~~~~~~~~~~~~~~~~~ ^~~~~~~~~~~~~~~~
include/linux/dma-mapping.h:76:40: note: expanded from macro 'DMA_BIT_MASK'
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
^~~~~
1 warning and 1 error generated.


vim +1215 include/linux/iommu.h

1213
1214 static inline ioasid_t iommu_sva_reserve_pasid(ioasid_t min, ioasid_t max);
> 1215 {
1216 return IOMMU_PASID_INVALID;
1217 }
1218

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-03-02 03:20:27

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

Hi Jacob,

I love your patch! Yet something to improve:

[auto build test ERROR on vkoul-dmaengine/next]
[also build test ERROR on v6.2]
[cannot apply to joro-iommu/next linus/master next-20230302]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/Jacob-Pan/iommu-vt-d-Implement-set-device-pasid-op-for-default-domain/20230302-085748
base: https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git next
patch link: https://lore.kernel.org/r/20230302005959.2695267-4-jacob.jun.pan%40linux.intel.com
patch subject: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs
config: hexagon-buildonly-randconfig-r005-20230302 (https://download.01.org/0day-ci/archive/20230302/[email protected]/config)
compiler: clang version 17.0.0 (https://github.com/llvm/llvm-project 67409911353323ca5edf2049ef0df54132fa1ca7)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/b27170369658e99a0aafd84985de0ce48c1b2539
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Jacob-Pan/iommu-vt-d-Implement-set-device-pasid-op-for-default-domain/20230302-085748
git checkout b27170369658e99a0aafd84985de0ce48c1b2539
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon olddefconfig
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash drivers/crypto/hisilicon/sec/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/

All errors (new ones prefixed by >>):

In file included from drivers/crypto/hisilicon/sec/sec_drv.c:11:
In file included from include/linux/dma-mapping.h:10:
In file included from include/linux/scatterlist.h:9:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __raw_readb(PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
#define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
^
In file included from drivers/crypto/hisilicon/sec/sec_drv.c:11:
In file included from include/linux/dma-mapping.h:10:
In file included from include/linux/scatterlist.h:9:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
#define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
^
In file included from drivers/crypto/hisilicon/sec/sec_drv.c:11:
In file included from include/linux/dma-mapping.h:10:
In file included from include/linux/scatterlist.h:9:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writeb(value, PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
In file included from drivers/crypto/hisilicon/sec/sec_drv.c:14:
>> include/linux/iommu.h:1215:1: error: expected identifier or '('
{
^
drivers/crypto/hisilicon/sec/sec_drv.c:1209:39: warning: shift count >= width of type [-Wshift-count-overflow]
ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
^~~~~~~~~~~~~~~~
include/linux/dma-mapping.h:76:54: note: expanded from macro 'DMA_BIT_MASK'
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1))
^ ~~~
7 warnings and 1 error generated.


vim +1215 include/linux/iommu.h

1213
1214 static inline ioasid_t iommu_sva_reserve_pasid(ioasid_t min, ioasid_t max);
> 1215 {
1216 return IOMMU_PASID_INVALID;
1217 }
1218

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests

2023-03-02 09:38:01

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

> From: Jacob Pan <[email protected]>
> Sent: Thursday, March 2, 2023 9:00 AM
>
> @@ -4675,6 +4679,33 @@ static void
> intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
> }
>
> +static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
> + struct device *dev, ioasid_t pasid)
> +{
> + struct device_domain_info *info = dev_iommu_priv_get(dev);
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> + struct intel_iommu *iommu = info->iommu;
> + int ret = 0;
> +
> + if (!sm_supported(iommu) || !info)
> + return -ENODEV;
> +
> + if (WARN_ON(pasid == PASID_RID2PASID))
> + return -EINVAL;
> +
> + if (hw_pass_through && domain_type_is_si(dmar_domain))
> + ret = intel_pasid_setup_pass_through(iommu, dmar_domain,
> + dev, pasid);
> + else if (dmar_domain->use_first_level)
> + ret = domain_setup_first_level(iommu, dmar_domain,
> + dev, pasid);
> + else
> + ret = intel_pasid_setup_second_level(iommu, dmar_domain,
> + dev, pasid);
> +

check whether pasid capability has been enabled on this device.

it's unlike SVA which has separate capability to tell.

2023-03-02 09:43:22

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

> From: Jacob Pan <[email protected]>
> Sent: Thursday, March 2, 2023 9:00 AM
>
> Global PASID allocation is under IOMMU SVA code since it is the primary
> use case. However, some architecture such as VT-d, global PASIDs are
> necessary for its internal use of DMA API with PASID.

No, global PASID is not a VT-d restriction. It's from ENQCMD/S hence a
device requirement.

>
> This patch introduces SVA APIs to reserve and release global PASIDs.
>
> Link: https://lore.kernel.org/all/20230301235646.2692846-4-
> [email protected]/
> Signed-off-by: Jacob Pan <[email protected]>
> ---
> drivers/iommu/iommu-sva.c | 25 +++++++++++++++++++++++++
> include/linux/iommu.h | 14 ++++++++++++++
> 2 files changed, 39 insertions(+)
>
> diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
> index 8c92a145e15d..cfdeafde88a9 100644
> --- a/drivers/iommu/iommu-sva.c
> +++ b/drivers/iommu/iommu-sva.c
> @@ -149,6 +149,31 @@ u32 iommu_sva_get_pasid(struct iommu_sva
> *handle)
> }
> EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
>
> +ioasid_t iommu_sva_reserve_pasid(ioasid_t min, ioasid_t max)
> +{
> + int ret;
> +
> + if (min == IOMMU_PASID_INVALID || max ==
> IOMMU_PASID_INVALID ||
> + min == 0 || max < min)
> + return IOMMU_PASID_INVALID;
> +
> + ret = ida_alloc_range(&iommu_global_pasid_ida, min, max,
> GFP_KERNEL);
> + if (ret < 0)
> + return IOMMU_PASID_INVALID;
> +
> + return ret;
> +}
> +EXPORT_SYMBOL_GPL(iommu_sva_reserve_pasid);
> +

I'm not sure it's the right way. It's not related to SVA.

We should move iommu_global_pasid_ida to iomm.c and then have
another interface for allocation.

Above is pretty generic so probably a general one like:

ioasid_t iommu_allocate_global_pasid(struct device *dev)

internally it can use [1, dev->iommu->max_pasids] as min/max instead
of passed in from the caller.



2023-03-02 09:47:13

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 4/4] dmaengine/idxd: Re-enable kernel workqueue under DMA API

> From: Jacob Pan <[email protected]>
> Sent: Thursday, March 2, 2023 9:00 AM
>
> static int idxd_enable_system_pasid(struct idxd_device *idxd)
> {
> - return -EOPNOTSUPP;
> + struct pci_dev *pdev = idxd->pdev;
> + struct device *dev = &pdev->dev;
> + struct iommu_domain *domain;
> + union gencfg_reg gencfg;
> + ioasid_t pasid;
> + int ret;
> +
> + domain = iommu_get_domain_for_dev(dev);
> + if (!domain || domain->type == IOMMU_DOMAIN_BLOCKED)
> + return -EPERM;

what about UNMANAGED?

> +
> + pasid = iommu_sva_reserve_pasid(1, dev->iommu->max_pasids);
> + if (pasid == IOMMU_PASID_INVALID)
> + return -ENOSPC;

as commented in last patch we can just pass a device pointer to a
general allocation interface.

> +
> + ret = iommu_attach_device_pasid(domain, dev, pasid);
> + if (ret) {
> + dev_err(dev, "failed to attach device pasid %d, domain
> type %d",
> + pasid, domain->type);
> + iommu_sva_unreserve_pasid(pasid);
> + return ret;
> + }
> +
> + /* Since we set user privilege for kernel DMA, enable completion IRQ
> */
> + gencfg.bits = ioread32(idxd->reg_base + IDXD_GENCFG_OFFSET);
> + gencfg.user_int_en = 1;
> + iowrite32(gencfg.bits, idxd->reg_base + IDXD_GENCFG_OFFSET);
> + idxd->pasid = pasid;

Why does user privilege requires a completion interrupt?

Or instead it's more due to doing kernel DMA itself then we certainly
don't want to poll in the kernel?

2023-03-02 12:58:41

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine/idxd: Re-enable kernel workqueue under DMA API

On Thu, Mar 02, 2023 at 09:47:00AM +0000, Tian, Kevin wrote:
> > From: Jacob Pan <[email protected]>
> > Sent: Thursday, March 2, 2023 9:00 AM
> >
> > static int idxd_enable_system_pasid(struct idxd_device *idxd)
> > {
> > - return -EOPNOTSUPP;
> > + struct pci_dev *pdev = idxd->pdev;
> > + struct device *dev = &pdev->dev;
> > + struct iommu_domain *domain;
> > + union gencfg_reg gencfg;
> > + ioasid_t pasid;
> > + int ret;
> > +
> > + domain = iommu_get_domain_for_dev(dev);
> > + if (!domain || domain->type == IOMMU_DOMAIN_BLOCKED)
> > + return -EPERM;
>
> what about UNMANAGED?

Why are we checking this anyhow?

Get the domain the DMA API is using and feed it to
iommu_attach_device_pasid(). If the driver can't mirror the DMA API's
domain onto the PASID then it will just fail the attach. A domain
cannot even be NULL on x86.

Jason

2023-03-02 14:08:48

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

On 2023/3/2 8:59, Jacob Pan wrote:
> On VT-d platforms, legacy DMA requests without PASID use device’s
> default domain, where RID_PASID is always attached. Device drivers
> can then use the DMA API for all in-kernel DMA on the RID.
>
> Ideally, devices capable of using ENQCMDS can also transparently use the
> default domain, consequently DMA API. However, VT-d architecture
> dictates that the PASID used by ENQCMDS must be different from the
> RID_PASID value.
>
> To provide support for transparent use of DMA API with non-RID_PASID
> value, this patch implements the set_dev_pasid() function for the
> default domain. The idea is that device drivers wishing to use ENQCMDS
> to submit work on buffers mapped by DMA API will call
> iommu_attach_device_pasid() beforehand.
>
> Signed-off-by: Jacob Pan <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 32 ++++++++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 10f657828d3a..a0cb3bc851ac 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4665,6 +4665,10 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
> case IOMMU_DOMAIN_SVA:
> intel_svm_remove_dev_pasid(dev, pasid);
> break;
> + case IOMMU_DOMAIN_DMA:
> + case IOMMU_DOMAIN_DMA_FQ:
> + case IOMMU_DOMAIN_IDENTITY:
> + break;
> default:
> /* should never reach here */
> WARN_ON(1);
> @@ -4675,6 +4679,33 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
> intel_pasid_tear_down_entry(iommu, dev, pasid, false);
> }
>
> +static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
> + struct device *dev, ioasid_t pasid)
> +{
> + struct device_domain_info *info = dev_iommu_priv_get(dev);
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> + struct intel_iommu *iommu = info->iommu;
> + int ret = 0;
> +
> + if (!sm_supported(iommu) || !info)

@info has been referenced. !info check makes no sense.

Add pasid_supported(iommu).

Do you need to check whether the domain is compatible for this rid
pasid?

> + return -ENODEV;
> +
> + if (WARN_ON(pasid == PASID_RID2PASID))
> + return -EINVAL;

Add a call to domain_attach_iommu() here to get a refcount of the domain
ID. And call domain_detach_iommu() in intel_iommu_remove_dev_pasid().

> +
> + if (hw_pass_through && domain_type_is_si(dmar_domain))
> + ret = intel_pasid_setup_pass_through(iommu, dmar_domain,
> + dev, pasid);
> + else if (dmar_domain->use_first_level)
> + ret = domain_setup_first_level(iommu, dmar_domain,
> + dev, pasid);
> + else
> + ret = intel_pasid_setup_second_level(iommu, dmar_domain,
> + dev, pasid);
> +
> + return ret;
> +}

Do you need to consider pasid cache invalidation?

> +
> const struct iommu_ops intel_iommu_ops = {
> .capable = intel_iommu_capable,
> .domain_alloc = intel_iommu_domain_alloc,
> @@ -4702,6 +4733,7 @@ const struct iommu_ops intel_iommu_ops = {
> .iova_to_phys = intel_iommu_iova_to_phys,
> .free = intel_iommu_domain_free,
> .enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
> + .set_dev_pasid = intel_iommu_set_dev_pasid,
> }
> };
>

Best regards,
baolu

2023-03-02 14:20:32

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 2/4] iommu/vt-d: Use non-privileged mode for all PASIDs

On 2023/3/2 8:59, Jacob Pan wrote:
> For in-kernel DMA, use non-privileged access for all PASIDs to be
> consistent with RID_PASID.
> There's no need to differentiate user and kernel for in-kernel DMA. >
> Signed-off-by: Jacob Pan <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index a0cb3bc851ac..9e3c056e392d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -2334,8 +2334,6 @@ static int domain_setup_first_level(struct intel_iommu *iommu,
> if (level != 4 && level != 5)
> return -EINVAL;
>
> - if (pasid != PASID_RID2PASID)
> - flags |= PASID_FLAG_SUPERVISOR_MODE;

With above removed, PASID_FLAG_SUPERVISOR_MODE is not used anywhere?
Perhaps you can cleanup it to avoid dead code?

> if (level == 5)
> flags |= PASID_FLAG_FL5LP;
>

Best regards,
baolu

2023-03-02 19:21:33

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

Hi Kevin,

On Thu, 2 Mar 2023 09:37:30 +0000, "Tian, Kevin" <[email protected]>
wrote:

> > From: Jacob Pan <[email protected]>
> > Sent: Thursday, March 2, 2023 9:00 AM
> >
> > @@ -4675,6 +4679,33 @@ static void
> > intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
> > intel_pasid_tear_down_entry(iommu, dev, pasid, false);
> > }
> >
> > +static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
> > + struct device *dev, ioasid_t pasid)
> > +{
> > + struct device_domain_info *info = dev_iommu_priv_get(dev);
> > + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > + struct intel_iommu *iommu = info->iommu;
> > + int ret = 0;
> > +
> > + if (!sm_supported(iommu) || !info)
> > + return -ENODEV;
> > +
> > + if (WARN_ON(pasid == PASID_RID2PASID))
> > + return -EINVAL;
> > +
> > + if (hw_pass_through && domain_type_is_si(dmar_domain))
> > + ret = intel_pasid_setup_pass_through(iommu,
> > dmar_domain,
> > + dev, pasid);
> > + else if (dmar_domain->use_first_level)
> > + ret = domain_setup_first_level(iommu, dmar_domain,
> > + dev, pasid);
> > + else
> > + ret = intel_pasid_setup_second_level(iommu,
> > dmar_domain,
> > + dev, pasid);
> > +
>
> check whether pasid capability has been enabled on this device.
>
> it's unlike SVA which has separate capability to tell.
yes, it is a little tricky in that enabling PASID should be done before ATS
but for now anything uses ENQCMDS should support ATS. I will add
/*
* PASID should be enabled as part of PCI caps enabling where
* ordering is required relative to ATS.
*/
if (WARN_ON(!pdev->pasid_enabled))
return -ENODEV;

Thanks,

Jacob

2023-03-03 01:20:49

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine/idxd: Re-enable kernel workqueue under DMA API

On 3/2/23 8:59 AM, Jacob Pan wrote:
> diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> index f30eef701970..dadc908318aa 100644
> --- a/drivers/dma/idxd/init.c
> +++ b/drivers/dma/idxd/init.c
> @@ -501,14 +501,52 @@ static struct idxd_device *idxd_alloc(struct pci_dev *pdev, struct idxd_driver_d
>
> static int idxd_enable_system_pasid(struct idxd_device *idxd)
> {
> - return -EOPNOTSUPP;
> + struct pci_dev *pdev = idxd->pdev;
> + struct device *dev = &pdev->dev;
> + struct iommu_domain *domain;
> + union gencfg_reg gencfg;
> + ioasid_t pasid;
> + int ret;
> +
> + domain = iommu_get_domain_for_dev(dev);
> + if (!domain || domain->type == IOMMU_DOMAIN_BLOCKED)
> + return -EPERM;

The idxd driver has claimed the DMA ownership of this device. Unless the
idxd driver itself attached another domain, iommu_get_domain_for_dev()
should never return a blocking domain.

"domain == NULL" happens when CONFIG_IOMMU_API is not set.

Furthermore, iommu_get_domain_for_dev() doesn't hold any refcount from
the domain, so in theory it's not safe here because it possibly causes
use-after-release case.

I would say iommu_get_dma_domain() or something similar is more suitable
for use here. It directly returns the device's default domain and the
iommu core guarantees that default domain will always valid during the
life cycle of any device driver.

Best regards,
baolu

2023-03-03 02:36:37

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

> From: Baolu Lu <[email protected]>
> Sent: Thursday, March 2, 2023 10:07 PM
> > +
> > + if (!sm_supported(iommu) || !info)
>
> @info has been referenced. !info check makes no sense.
>
> Add pasid_supported(iommu).
>
> Do you need to check whether the domain is compatible for this rid
> pasid?

what kind of compatibility is concerned here? In concept a pasid
can be attached to any domain if it has been successfully attached
to rid. Probably we can add a check here that RID2PASID must
point to the domain already.

>
> > + return -ENODEV;
> > +
> > + if (WARN_ON(pasid == PASID_RID2PASID))
> > + return -EINVAL;
>
> Add a call to domain_attach_iommu() here to get a refcount of the domain
> ID. And call domain_detach_iommu() in intel_iommu_remove_dev_pasid().
>

Is it necessary? iommu core doesn't allow taking ownership
if !xa_empty(&group->pasid_array) so if this pasid attach succeeds
this device cannot be attached to another domain before pasid
detach is done on the current domain.

2023-03-03 02:49:37

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

On 3/3/23 10:36 AM, Tian, Kevin wrote:
>> From: Baolu Lu <[email protected]>
>> Sent: Thursday, March 2, 2023 10:07 PM
>>> +
>>> + if (!sm_supported(iommu) || !info)
>>
>> @info has been referenced. !info check makes no sense.
>>
>> Add pasid_supported(iommu).
>>
>> Do you need to check whether the domain is compatible for this rid
>> pasid?
>
> what kind of compatibility is concerned here? In concept a pasid
> can be attached to any domain if it has been successfully attached
> to rid. Probably we can add a check here that RID2PASID must
> point to the domain already.

"...if it has been successfully attached to rid..."

We should not have this assumption in iommu driver's callback. The iommu
driver has no (and should not have) knowledge about the history of any
domain.

>
>>
>>> + return -ENODEV;
>>> +
>>> + if (WARN_ON(pasid == PASID_RID2PASID))
>>> + return -EINVAL;
>>
>> Add a call to domain_attach_iommu() here to get a refcount of the domain
>> ID. And call domain_detach_iommu() in intel_iommu_remove_dev_pasid().
>>
>
> Is it necessary? iommu core doesn't allow taking ownership
> if !xa_empty(&group->pasid_array) so if this pasid attach succeeds
> this device cannot be attached to another domain before pasid
> detach is done on the current domain.

It's not about the pasid, but the domain id.

This domain's id will be set to a field of the device's pasid entry. It
must get a refcount of that domain id to avoid use after free.

Best regards,
baolu

2023-03-03 03:02:48

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

> From: Baolu Lu <[email protected]>
> Sent: Friday, March 3, 2023 10:49 AM
>
> On 3/3/23 10:36 AM, Tian, Kevin wrote:
> >> From: Baolu Lu <[email protected]>
> >> Sent: Thursday, March 2, 2023 10:07 PM
> >>> +
> >>> + if (!sm_supported(iommu) || !info)
> >>
> >> @info has been referenced. !info check makes no sense.
> >>
> >> Add pasid_supported(iommu).
> >>
> >> Do you need to check whether the domain is compatible for this rid
> >> pasid?
> >
> > what kind of compatibility is concerned here? In concept a pasid
> > can be attached to any domain if it has been successfully attached
> > to rid. Probably we can add a check here that RID2PASID must
> > point to the domain already.
>
> "...if it has been successfully attached to rid..."
>
> We should not have this assumption in iommu driver's callback. The iommu
> driver has no (and should not have) knowledge about the history of any
> domain.

but this is an op for default domain which must have been attached
to RID2PASID and any compatibility check between this domain and device
should be passed.

We can have another set_pasid for unmanaged which then need similar
check as prepare_domain_attach_device() does.

>
> >
> >>
> >>> + return -ENODEV;
> >>> +
> >>> + if (WARN_ON(pasid == PASID_RID2PASID))
> >>> + return -EINVAL;
> >>
> >> Add a call to domain_attach_iommu() here to get a refcount of the
> domain
> >> ID. And call domain_detach_iommu() in
> intel_iommu_remove_dev_pasid().
> >>
> >
> > Is it necessary? iommu core doesn't allow taking ownership
> > if !xa_empty(&group->pasid_array) so if this pasid attach succeeds
> > this device cannot be attached to another domain before pasid
> > detach is done on the current domain.
>
> It's not about the pasid, but the domain id.
>
> This domain's id will be set to a field of the device's pasid entry. It
> must get a refcount of that domain id to avoid use after free.
>

If the domain still has attached device (due to this pasid usage) how could
domain id be freed?

2023-03-03 04:39:27

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

On 3/3/23 11:02 AM, Tian, Kevin wrote:
>> From: Baolu Lu <[email protected]>
>> Sent: Friday, March 3, 2023 10:49 AM
>>
>> On 3/3/23 10:36 AM, Tian, Kevin wrote:
>>>> From: Baolu Lu <[email protected]>
>>>> Sent: Thursday, March 2, 2023 10:07 PM
>>>>> +
>>>>> + if (!sm_supported(iommu) || !info)
>>>>
>>>> @info has been referenced. !info check makes no sense.
>>>>
>>>> Add pasid_supported(iommu).
>>>>
>>>> Do you need to check whether the domain is compatible for this rid
>>>> pasid?
>>>
>>> what kind of compatibility is concerned here? In concept a pasid
>>> can be attached to any domain if it has been successfully attached
>>> to rid. Probably we can add a check here that RID2PASID must
>>> point to the domain already.
>>
>> "...if it has been successfully attached to rid..."
>>
>> We should not have this assumption in iommu driver's callback. The iommu
>> driver has no (and should not have) knowledge about the history of any
>> domain.
>
> but this is an op for default domain which must have been attached
> to RID2PASID and any compatibility check between this domain and device
> should be passed.

This is an op for DMA, DMA-FQ and UNMANAGED domain. The IOMMU driver
doesn't need to interpret the default domain concept. :-)

>
> We can have another set_pasid for unmanaged which then need similar
> check as prepare_domain_attach_device() does.

From the perspective of the iommu driver, there's no essential
difference between DMA and UNMANAGED domains. So almost all IOMMU
drivers maintain a single set of domain ops for them.

>>
>>>
>>>>
>>>>> + return -ENODEV;
>>>>> +
>>>>> + if (WARN_ON(pasid == PASID_RID2PASID))
>>>>> + return -EINVAL;
>>>>
>>>> Add a call to domain_attach_iommu() here to get a refcount of the
>> domain
>>>> ID. And call domain_detach_iommu() in
>> intel_iommu_remove_dev_pasid().
>>>>
>>>
>>> Is it necessary? iommu core doesn't allow taking ownership
>>> if !xa_empty(&group->pasid_array) so if this pasid attach succeeds
>>> this device cannot be attached to another domain before pasid
>>> detach is done on the current domain.
>>
>> It's not about the pasid, but the domain id.
>>
>> This domain's id will be set to a field of the device's pasid entry. It
>> must get a refcount of that domain id to avoid use after free.
>>
>
> If the domain still has attached device (due to this pasid usage) how could
> domain id be freed?

The Intel IOMMU driver uses a user counter to determine when the domain
id could be freed.

Best regards,
baolu

2023-03-03 05:36:12

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

> From: Baolu Lu <[email protected]>
> Sent: Friday, March 3, 2023 12:38 PM
>
> On 3/3/23 11:02 AM, Tian, Kevin wrote:
> >> From: Baolu Lu <[email protected]>
> >> Sent: Friday, March 3, 2023 10:49 AM
> >>
> >> On 3/3/23 10:36 AM, Tian, Kevin wrote:
> >>>> From: Baolu Lu <[email protected]>
> >>>> Sent: Thursday, March 2, 2023 10:07 PM
> >>>>> +
> >>>>> + if (!sm_supported(iommu) || !info)
> >>>>
> >>>> @info has been referenced. !info check makes no sense.
> >>>>
> >>>> Add pasid_supported(iommu).
> >>>>
> >>>> Do you need to check whether the domain is compatible for this rid
> >>>> pasid?
> >>>
> >>> what kind of compatibility is concerned here? In concept a pasid
> >>> can be attached to any domain if it has been successfully attached
> >>> to rid. Probably we can add a check here that RID2PASID must
> >>> point to the domain already.
> >>
> >> "...if it has been successfully attached to rid..."
> >>
> >> We should not have this assumption in iommu driver's callback. The
> iommu
> >> driver has no (and should not have) knowledge about the history of any
> >> domain.
> >
> > but this is an op for default domain which must have been attached
> > to RID2PASID and any compatibility check between this domain and device
> > should be passed.
>
> This is an op for DMA, DMA-FQ and UNMANAGED domain. The IOMMU
> driver
> doesn't need to interpret the default domain concept. :-)
>

yes if we target a general callback for all those domain types.

and probably this is the right thing to do as in the end DMA type will
be removed with Jason's cleanup

2023-03-03 05:38:39

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

> From: Baolu Lu <[email protected]>
> Sent: Thursday, March 2, 2023 10:07 PM
>
> > +
> > + if (hw_pass_through && domain_type_is_si(dmar_domain))
> > + ret = intel_pasid_setup_pass_through(iommu, dmar_domain,
> > + dev, pasid);
> > + else if (dmar_domain->use_first_level)
> > + ret = domain_setup_first_level(iommu, dmar_domain,
> > + dev, pasid);
> > + else
> > + ret = intel_pasid_setup_second_level(iommu, dmar_domain,
> > + dev, pasid);
> > +
> > + return ret;
> > +}
>
> Do you need to consider pasid cache invalidation?
>

To avoid confusion this is not about invalidation of pasid cache itself
which should be covered by above setup functions already.

Here actually means per-PASID invalidation in iotlb and devtlb. Today
only RID is tracked per domain for invalidation. it needs extension to
walk attached pasid too.

2023-03-03 16:31:29

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

Hi Kevin,

On Fri, 3 Mar 2023 05:38:19 +0000, "Tian, Kevin" <[email protected]>
wrote:

> > From: Baolu Lu <[email protected]>
> > Sent: Thursday, March 2, 2023 10:07 PM
> >
> > > +
> > > + if (hw_pass_through && domain_type_is_si(dmar_domain))
> > > + ret = intel_pasid_setup_pass_through(iommu,
> > > dmar_domain,
> > > + dev, pasid);
> > > + else if (dmar_domain->use_first_level)
> > > + ret = domain_setup_first_level(iommu, dmar_domain,
> > > + dev, pasid);
> > > + else
> > > + ret = intel_pasid_setup_second_level(iommu,
> > > dmar_domain,
> > > + dev, pasid);
> > > +
> > > + return ret;
> > > +}
> >
> > Do you need to consider pasid cache invalidation?
> >
>
> To avoid confusion this is not about invalidation of pasid cache itself
> which should be covered by above setup functions already.
>
> Here actually means per-PASID invalidation in iotlb and devtlb. Today
> only RID is tracked per domain for invalidation. it needs extension to
> walk attached pasid too.

Yes, will add.

For the set up path, there is no need to flush IOTLBs, because we're going
from non present to present.

On the remove path, IOTLB flush should be covered when device driver
calls iommu_detach_device_pasid(). Covered with this patch.


Thanks,

Jacob

2023-03-03 21:36:25

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 2/4] iommu/vt-d: Use non-privileged mode for all PASIDs

Hi Baolu,

On Thu, 2 Mar 2023 22:11:36 +0800, Baolu Lu <[email protected]>
wrote:

> On 2023/3/2 8:59, Jacob Pan wrote:
> > For in-kernel DMA, use non-privileged access for all PASIDs to be
> > consistent with RID_PASID.
> > There's no need to differentiate user and kernel for in-kernel DMA. >
> > Signed-off-by: Jacob Pan <[email protected]>
> > ---
> > drivers/iommu/intel/iommu.c | 2 --
> > 1 file changed, 2 deletions(-)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index a0cb3bc851ac..9e3c056e392d 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -2334,8 +2334,6 @@ static int domain_setup_first_level(struct
> > intel_iommu *iommu, if (level != 4 && level != 5)
> > return -EINVAL;
> >
> > - if (pasid != PASID_RID2PASID)
> > - flags |= PASID_FLAG_SUPERVISOR_MODE;
>
> With above removed, PASID_FLAG_SUPERVISOR_MODE is not used anywhere?
> Perhaps you can cleanup it to avoid dead code?
good point, we could remove pasid_set_sre() related code for FL,SL, and PT.

Thanks,

Jacob

2023-03-03 21:52:32

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine/idxd: Re-enable kernel workqueue under DMA API

Hi Jason,

On Thu, 2 Mar 2023 08:57:48 -0400, Jason Gunthorpe <[email protected]> wrote:

> On Thu, Mar 02, 2023 at 09:47:00AM +0000, Tian, Kevin wrote:
> > > From: Jacob Pan <[email protected]>
> > > Sent: Thursday, March 2, 2023 9:00 AM
> > >
> > > static int idxd_enable_system_pasid(struct idxd_device *idxd)
> > > {
> > > - return -EOPNOTSUPP;
> > > + struct pci_dev *pdev = idxd->pdev;
> > > + struct device *dev = &pdev->dev;
> > > + struct iommu_domain *domain;
> > > + union gencfg_reg gencfg;
> > > + ioasid_t pasid;
> > > + int ret;
> > > +
> > > + domain = iommu_get_domain_for_dev(dev);
> > > + if (!domain || domain->type == IOMMU_DOMAIN_BLOCKED)
> > > + return -EPERM;
> >
> > what about UNMANAGED?
>
> Why are we checking this anyhow?
>
> Get the domain the DMA API is using and feed it to
> iommu_attach_device_pasid(). If the driver can't mirror the DMA API's
> domain onto the PASID then it will just fail the attach. A domain
> cannot even be NULL on x86.
makes sense,

Thanks,

Jacob

2023-03-03 21:52:55

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

Hi Kevin,

On Thu, 2 Mar 2023 09:43:03 +0000, "Tian, Kevin" <[email protected]>
wrote:

> > From: Jacob Pan <[email protected]>
> > Sent: Thursday, March 2, 2023 9:00 AM
> >
> > Global PASID allocation is under IOMMU SVA code since it is the primary
> > use case. However, some architecture such as VT-d, global PASIDs are
> > necessary for its internal use of DMA API with PASID.
>
> No, global PASID is not a VT-d restriction. It's from ENQCMD/S hence a
> device requirement.
I meant VT-d based platforms, it is kind of intertwined in that ENQCMDS
does not restrict RIDPASID!=DMA PASID, vt-d does. Without this
restriction, there wouldn't be a need for this patch. Let me reword.
> >
> > This patch introduces SVA APIs to reserve and release global PASIDs.
> >
> > Link: https://lore.kernel.org/all/20230301235646.2692846-4-
> > [email protected]/
> > Signed-off-by: Jacob Pan <[email protected]>
> > ---
> > drivers/iommu/iommu-sva.c | 25 +++++++++++++++++++++++++
> > include/linux/iommu.h | 14 ++++++++++++++
> > 2 files changed, 39 insertions(+)
> >
> > diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
> > index 8c92a145e15d..cfdeafde88a9 100644
> > --- a/drivers/iommu/iommu-sva.c
> > +++ b/drivers/iommu/iommu-sva.c
> > @@ -149,6 +149,31 @@ u32 iommu_sva_get_pasid(struct iommu_sva
> > *handle)
> > }
> > EXPORT_SYMBOL_GPL(iommu_sva_get_pasid);
> >
> > +ioasid_t iommu_sva_reserve_pasid(ioasid_t min, ioasid_t max)
> > +{
> > + int ret;
> > +
> > + if (min == IOMMU_PASID_INVALID || max ==
> > IOMMU_PASID_INVALID ||
> > + min == 0 || max < min)
> > + return IOMMU_PASID_INVALID;
> > +
> > + ret = ida_alloc_range(&iommu_global_pasid_ida, min, max,
> > GFP_KERNEL);
> > + if (ret < 0)
> > + return IOMMU_PASID_INVALID;
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_sva_reserve_pasid);
> > +
>
> I'm not sure it's the right way. It's not related to SVA.
>
> We should move iommu_global_pasid_ida to iomm.c and then have
> another interface for allocation.
>
> Above is pretty generic so probably a general one like:
>
> ioasid_t iommu_allocate_global_pasid(struct device *dev)
>
> internally it can use [1, dev->iommu->max_pasids] as min/max instead
> of passed in from the caller.
sounds good to me, will do.


Thanks,

Jacob

2023-03-03 22:03:56

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine/idxd: Re-enable kernel workqueue under DMA API

Hi Baolu,

On Fri, 3 Mar 2023 09:19:48 +0800, Baolu Lu <[email protected]>
wrote:

> On 3/2/23 8:59 AM, Jacob Pan wrote:
> > diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
> > index f30eef701970..dadc908318aa 100644
> > --- a/drivers/dma/idxd/init.c
> > +++ b/drivers/dma/idxd/init.c
> > @@ -501,14 +501,52 @@ static struct idxd_device *idxd_alloc(struct
> > pci_dev *pdev, struct idxd_driver_d
> > static int idxd_enable_system_pasid(struct idxd_device *idxd)
> > {
> > - return -EOPNOTSUPP;
> > + struct pci_dev *pdev = idxd->pdev;
> > + struct device *dev = &pdev->dev;
> > + struct iommu_domain *domain;
> > + union gencfg_reg gencfg;
> > + ioasid_t pasid;
> > + int ret;
> > +
> > + domain = iommu_get_domain_for_dev(dev);
> > + if (!domain || domain->type == IOMMU_DOMAIN_BLOCKED)
> > + return -EPERM;
>
> The idxd driver has claimed the DMA ownership of this device. Unless the
> idxd driver itself attached another domain, iommu_get_domain_for_dev()
> should never return a blocking domain.
>
> "domain == NULL" happens when CONFIG_IOMMU_API is not set.
>
> Furthermore, iommu_get_domain_for_dev() doesn't hold any refcount from
> the domain, so in theory it's not safe here because it possibly causes
> use-after-release case.
>
> I would say iommu_get_dma_domain() or something similar is more suitable
> for use here. It directly returns the device's default domain and the
> iommu core guarantees that default domain will always valid during the
> life cycle of any device driver.
>
will do, same as Jason's comments.

Thanks,

Jacob

2023-03-03 22:18:50

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 4/4] dmaengine/idxd: Re-enable kernel workqueue under DMA API

Hi Kevin,

On Thu, 2 Mar 2023 09:47:00 +0000, "Tian, Kevin" <[email protected]>
wrote:

> > From: Jacob Pan <[email protected]>
> > Sent: Thursday, March 2, 2023 9:00 AM
> >
> > static int idxd_enable_system_pasid(struct idxd_device *idxd)
> > {
> > - return -EOPNOTSUPP;
> > + struct pci_dev *pdev = idxd->pdev;
> > + struct device *dev = &pdev->dev;
> > + struct iommu_domain *domain;
> > + union gencfg_reg gencfg;
> > + ioasid_t pasid;
> > + int ret;
> > +
> > + domain = iommu_get_domain_for_dev(dev);
> > + if (!domain || domain->type == IOMMU_DOMAIN_BLOCKED)
> > + return -EPERM;
>
> what about UNMANAGED?
will fix this by getting the dma domain.
>
> > +
> > + pasid = iommu_sva_reserve_pasid(1, dev->iommu->max_pasids);
> > + if (pasid == IOMMU_PASID_INVALID)
> > + return -ENOSPC;
>
> as commented in last patch we can just pass a device pointer to a
> general allocation interface.
will do
>
> > +
> > + ret = iommu_attach_device_pasid(domain, dev, pasid);
> > + if (ret) {
> > + dev_err(dev, "failed to attach device pasid %d, domain
> > type %d",
> > + pasid, domain->type);
> > + iommu_sva_unreserve_pasid(pasid);
> > + return ret;
> > + }
> > +
> > + /* Since we set user privilege for kernel DMA, enable
> > completion IRQ */
> > + gencfg.bits = ioread32(idxd->reg_base + IDXD_GENCFG_OFFSET);
> > + gencfg.user_int_en = 1;
> > + iowrite32(gencfg.bits, idxd->reg_base + IDXD_GENCFG_OFFSET);
> > + idxd->pasid = pasid;
>
> Why does user privilege requires a completion interrupt?
>
> Or instead it's more due to doing kernel DMA itself then we certainly
> don't want to poll in the kernel?
yes, kernel wq does not support polling, therefore it needs interrupts.
Without user_int_en bit set, there would be no interrupts if we use user
privilege for kernel wq.

Thanks,

Jacob

2023-03-05 03:06:56

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

On 3/4/23 12:35 AM, Jacob Pan wrote:
>>> From: Baolu Lu<[email protected]>
>>> Sent: Thursday, March 2, 2023 10:07 PM
>>>
>>>> +
>>>> + if (hw_pass_through && domain_type_is_si(dmar_domain))
>>>> + ret = intel_pasid_setup_pass_through(iommu,
>>>> dmar_domain,
>>>> + dev, pasid);
>>>> + else if (dmar_domain->use_first_level)
>>>> + ret = domain_setup_first_level(iommu, dmar_domain,
>>>> + dev, pasid);
>>>> + else
>>>> + ret = intel_pasid_setup_second_level(iommu,
>>>> dmar_domain,
>>>> + dev, pasid);
>>>> +
>>>> + return ret;
>>>> +}
>>> Do you need to consider pasid cache invalidation?
>>>
>> To avoid confusion this is not about invalidation of pasid cache itself
>> which should be covered by above setup functions already.
>>
>> Here actually means per-PASID invalidation in iotlb and devtlb. Today
>> only RID is tracked per domain for invalidation. it needs extension to
>> walk attached pasid too.
> Yes, will add.
>
> For the set up path, there is no need to flush IOTLBs, because we're going
> from non present to present.
>
> On the remove path, IOTLB flush should be covered when device driver
> calls iommu_detach_device_pasid(). Covered with this patch.

It's not only for the PASID teardown path, but also for unmap(). As the
device has issued DMA requests with PASID, the IOMMU probably will cache
the DMA translation with PASID tagged. Hence, we need to invalidate the
PASID-specific IOTLB and device TLB in the unmap() path.

I once had a patch for this:

https://lore.kernel.org/linux-iommu/[email protected]/

Probably you can use it as a starting point.

Best regards,
baolu

2023-03-06 08:18:52

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

> From: Baolu Lu <[email protected]>
> Sent: Sunday, March 5, 2023 11:06 AM
>
> On 3/4/23 12:35 AM, Jacob Pan wrote:
> >>> From: Baolu Lu<[email protected]>
> >>> Sent: Thursday, March 2, 2023 10:07 PM
> >>>
> >>>> +
> >>>> + if (hw_pass_through && domain_type_is_si(dmar_domain))
> >>>> + ret = intel_pasid_setup_pass_through(iommu,
> >>>> dmar_domain,
> >>>> + dev, pasid);
> >>>> + else if (dmar_domain->use_first_level)
> >>>> + ret = domain_setup_first_level(iommu, dmar_domain,
> >>>> + dev, pasid);
> >>>> + else
> >>>> + ret = intel_pasid_setup_second_level(iommu,
> >>>> dmar_domain,
> >>>> + dev, pasid);
> >>>> +
> >>>> + return ret;
> >>>> +}
> >>> Do you need to consider pasid cache invalidation?
> >>>
> >> To avoid confusion this is not about invalidation of pasid cache itself
> >> which should be covered by above setup functions already.
> >>
> >> Here actually means per-PASID invalidation in iotlb and devtlb. Today
> >> only RID is tracked per domain for invalidation. it needs extension to
> >> walk attached pasid too.
> > Yes, will add.
> >
> > For the set up path, there is no need to flush IOTLBs, because we're going
> > from non present to present.
> >
> > On the remove path, IOTLB flush should be covered when device driver
> > calls iommu_detach_device_pasid(). Covered with this patch.
>
> It's not only for the PASID teardown path, but also for unmap(). As the
> device has issued DMA requests with PASID, the IOMMU probably will cache
> the DMA translation with PASID tagged. Hence, we need to invalidate the
> PASID-specific IOTLB and device TLB in the unmap() path.
>
> I once had a patch for this:
>
> https://lore.kernel.org/linux-iommu/20220614034411.1634238-1-
> [email protected]/
>
> Probably you can use it as a starting point.
>

just that we should not have a sub-device term there. Just name
the tracking information per pasid.

2023-03-06 12:58:05

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

On Wed, Mar 01, 2023 at 04:59:56PM -0800, Jacob Pan wrote:
> On VT-d platforms, legacy DMA requests without PASID use device’s
> default domain, where RID_PASID is always attached. Device drivers
> can then use the DMA API for all in-kernel DMA on the RID.
>
> Ideally, devices capable of using ENQCMDS can also transparently use the
> default domain, consequently DMA API. However, VT-d architecture
> dictates that the PASID used by ENQCMDS must be different from the
> RID_PASID value.
>
> To provide support for transparent use of DMA API with non-RID_PASID
> value, this patch implements the set_dev_pasid() function for the
> default domain. The idea is that device drivers wishing to use ENQCMDS
> to submit work on buffers mapped by DMA API will call
> iommu_attach_device_pasid() beforehand.
>
> Signed-off-by: Jacob Pan <[email protected]>
> ---
> drivers/iommu/intel/iommu.c | 32 ++++++++++++++++++++++++++++++++
> 1 file changed, 32 insertions(+)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 10f657828d3a..a0cb3bc851ac 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4665,6 +4665,10 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
> case IOMMU_DOMAIN_SVA:
> intel_svm_remove_dev_pasid(dev, pasid);
> break;
> + case IOMMU_DOMAIN_DMA:
> + case IOMMU_DOMAIN_DMA_FQ:
> + case IOMMU_DOMAIN_IDENTITY:

Why do we need this switch statement anyhow? Something seems to have
gone wrong here.. SVM shouldn't be special, and why does this call
intel_pasid_tear_down_entry() twice on the SVA path?

It seems like all this is doing is flushing the PRI queue.

A domain should have a dedicated flag unrelated to the type if it is
using PRI and all PRI using domains should have the PRI queue flushed
here, using the same code as flushing the PRI for a RID attachment.

Jason

2023-03-06 13:02:20

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

On Fri, Mar 03, 2023 at 01:47:53PM -0800, Jacob Pan wrote:
> Hi Kevin,
>
> On Thu, 2 Mar 2023 09:43:03 +0000, "Tian, Kevin" <[email protected]>
> wrote:
>
> > > From: Jacob Pan <[email protected]>
> > > Sent: Thursday, March 2, 2023 9:00 AM
> > >
> > > Global PASID allocation is under IOMMU SVA code since it is the primary
> > > use case. However, some architecture such as VT-d, global PASIDs are
> > > necessary for its internal use of DMA API with PASID.
> >
> > No, global PASID is not a VT-d restriction. It's from ENQCMD/S hence a
> > device requirement.
> I meant VT-d based platforms, it is kind of intertwined in that ENQCMDS
> does not restrict RIDPASID!=DMA PASID, vt-d does. Without this
> restriction, there wouldn't be a need for this patch. Let me reword.

No, Kevin is right, there is nothing about VT-d that needs global
PASID values.

The driver should be managing RID2PASID itself to avoid conflicting
with any in-use PASID, either by changing RID2PASID on demand or by
setting it to a value that is not part of the PASID number space, eg
we can make 0 entirely invalid, or the driver can reduce max_pasid of
the devices it controls and use PASID_MAX.

Jason

2023-03-06 17:35:14

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

Hi Jason,

On Mon, 6 Mar 2023 08:57:57 -0400, Jason Gunthorpe <[email protected]> wrote:

> On Wed, Mar 01, 2023 at 04:59:56PM -0800, Jacob Pan wrote:
> > On VT-d platforms, legacy DMA requests without PASID use device’s
> > default domain, where RID_PASID is always attached. Device drivers
> > can then use the DMA API for all in-kernel DMA on the RID.
> >
> > Ideally, devices capable of using ENQCMDS can also transparently use the
> > default domain, consequently DMA API. However, VT-d architecture
> > dictates that the PASID used by ENQCMDS must be different from the
> > RID_PASID value.
> >
> > To provide support for transparent use of DMA API with non-RID_PASID
> > value, this patch implements the set_dev_pasid() function for the
> > default domain. The idea is that device drivers wishing to use ENQCMDS
> > to submit work on buffers mapped by DMA API will call
> > iommu_attach_device_pasid() beforehand.
> >
> > Signed-off-by: Jacob Pan <[email protected]>
> > ---
> > drivers/iommu/intel/iommu.c | 32 ++++++++++++++++++++++++++++++++
> > 1 file changed, 32 insertions(+)
> >
> > diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> > index 10f657828d3a..a0cb3bc851ac 100644
> > --- a/drivers/iommu/intel/iommu.c
> > +++ b/drivers/iommu/intel/iommu.c
> > @@ -4665,6 +4665,10 @@ static void intel_iommu_remove_dev_pasid(struct
> > device *dev, ioasid_t pasid) case IOMMU_DOMAIN_SVA:
> > intel_svm_remove_dev_pasid(dev, pasid);
> > break;
> > + case IOMMU_DOMAIN_DMA:
> > + case IOMMU_DOMAIN_DMA_FQ:
> > + case IOMMU_DOMAIN_IDENTITY:
>
> Why do we need this switch statement anyhow?
For DMA API pasid, there is nothing special just let it fall through and
call
intel_pasid_tear_down_entry(iommu, dev, pasid, false);

> Something seems to have
> gone wrong here.. SVM shouldn't be special,
I think all the trouble is caused by the asymmetrical setup of
iommu_op.remove_dev_pasid() and iommu_domain_ops.set_dev_pasid()
Perhaps, we should "demote" remove_dev_pasid to iommu_domain_ops then we
don't have to check SVA specific things.

> and why does this call intel_pasid_tear_down_entry() twice on the SVA
> path?
Good catch, that seems to be unnecessary.

> It seems like all this is doing is flushing the PRI queue.
> A domain should have a dedicated flag unrelated to the type if it is
> using PRI and all PRI using domains should have the PRI queue flushed
> here, using the same code as flushing the PRI for a RID attachment.
Yes, or if the teardown op is domain-specific, then it works too?


Thanks,

Jacob

2023-03-06 17:42:04

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

Hi Jason,

On Mon, 6 Mar 2023 09:01:32 -0400, Jason Gunthorpe <[email protected]> wrote:

> On Fri, Mar 03, 2023 at 01:47:53PM -0800, Jacob Pan wrote:
> > Hi Kevin,
> >
> > On Thu, 2 Mar 2023 09:43:03 +0000, "Tian, Kevin" <[email protected]>
> > wrote:
> >
> > > > From: Jacob Pan <[email protected]>
> > > > Sent: Thursday, March 2, 2023 9:00 AM
> > > >
> > > > Global PASID allocation is under IOMMU SVA code since it is the
> > > > primary use case. However, some architecture such as VT-d, global
> > > > PASIDs are necessary for its internal use of DMA API with PASID.
> > >
> > > No, global PASID is not a VT-d restriction. It's from ENQCMD/S hence a
> > > device requirement.
> > I meant VT-d based platforms, it is kind of intertwined in that ENQCMDS
> > does not restrict RIDPASID!=DMA PASID, vt-d does. Without this
> > restriction, there wouldn't be a need for this patch. Let me reword.
>
> No, Kevin is right, there is nothing about VT-d that needs global
> PASID values.
>
> The driver should be managing RID2PASID itself to avoid conflicting
> with any in-use PASID, either by changing RID2PASID on demand or by
> setting it to a value that is not part of the PASID number space, eg
> we can make 0 entirely invalid, or the driver can reduce max_pasid of
> the devices it controls and use PASID_MAX.
>
I see, thank you both. how about
"This patch provide an API for device drivers to request global PASIDs as
needed. The device drivers will then gain the flexibility of choosing
PASIDs not conflicting with anyone in-use."


Thanks,

Jacob

2023-03-06 17:43:14

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

On Mon, Mar 06, 2023 at 09:36:26AM -0800, Jacob Pan wrote:

> > It seems like all this is doing is flushing the PRI queue.
> > A domain should have a dedicated flag unrelated to the type if it is
> > using PRI and all PRI using domains should have the PRI queue flushed
> > here, using the same code as flushing the PRI for a RID attachment.
> Yes, or if the teardown op is domain-specific, then it works too?

That could only sense if we end up creating a PRI domain type too..

Right now PRI is messy because it doesn't really work with unmanaged
domains and that is something that will have to get fixed sort of
soonish

Once PRI is a proper thing then "SVA" is just a PRI domain that has a
non-managed from-the-mm page table pointer.

Most of the driver code marked svm should entirely disappear.

Jason

2023-03-06 17:45:02

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

On Mon, Mar 06, 2023 at 09:44:08AM -0800, Jacob Pan wrote:
> Hi Jason,
>
> On Mon, 6 Mar 2023 09:01:32 -0400, Jason Gunthorpe <[email protected]> wrote:
>
> > On Fri, Mar 03, 2023 at 01:47:53PM -0800, Jacob Pan wrote:
> > > Hi Kevin,
> > >
> > > On Thu, 2 Mar 2023 09:43:03 +0000, "Tian, Kevin" <[email protected]>
> > > wrote:
> > >
> > > > > From: Jacob Pan <[email protected]>
> > > > > Sent: Thursday, March 2, 2023 9:00 AM
> > > > >
> > > > > Global PASID allocation is under IOMMU SVA code since it is the
> > > > > primary use case. However, some architecture such as VT-d, global
> > > > > PASIDs are necessary for its internal use of DMA API with PASID.
> > > >
> > > > No, global PASID is not a VT-d restriction. It's from ENQCMD/S hence a
> > > > device requirement.
> > > I meant VT-d based platforms, it is kind of intertwined in that ENQCMDS
> > > does not restrict RIDPASID!=DMA PASID, vt-d does. Without this
> > > restriction, there wouldn't be a need for this patch. Let me reword.
> >
> > No, Kevin is right, there is nothing about VT-d that needs global
> > PASID values.
> >
> > The driver should be managing RID2PASID itself to avoid conflicting
> > with any in-use PASID, either by changing RID2PASID on demand or by
> > setting it to a value that is not part of the PASID number space, eg
> > we can make 0 entirely invalid, or the driver can reduce max_pasid of
> > the devices it controls and use PASID_MAX.
> >
> I see, thank you both. how about
> "This patch provide an API for device drivers to request global PASIDs as
> needed. The device drivers will then gain the flexibility of choosing
> PASIDs not conflicting with anyone in-use."

Stil no, this functionality should be clearly and unambiguously tied
to ENQCMD:

Devices that rely on Intel ENQCMD have a single CPU register to store
the current thread's PASID in. This necessarily makes the PASID a
system-global value shared by all ENQCMD using devices.

This matches the current allocator being used for the SVA PASID so for
now allow ENQCMD drivers to access this PASID allocator for other
uses.

Jason

2023-03-06 17:55:37

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

Hi Jason,

On Mon, 6 Mar 2023 13:43:39 -0400, Jason Gunthorpe <[email protected]> wrote:

> On Mon, Mar 06, 2023 at 09:44:08AM -0800, Jacob Pan wrote:
> > Hi Jason,
> >
> > On Mon, 6 Mar 2023 09:01:32 -0400, Jason Gunthorpe <[email protected]>
> > wrote:
> > > On Fri, Mar 03, 2023 at 01:47:53PM -0800, Jacob Pan wrote:
> > > > Hi Kevin,
> > > >
> > > > On Thu, 2 Mar 2023 09:43:03 +0000, "Tian, Kevin"
> > > > <[email protected]> wrote:
> > > >
> > > > > > From: Jacob Pan <[email protected]>
> > > > > > Sent: Thursday, March 2, 2023 9:00 AM
> > > > > >
> > > > > > Global PASID allocation is under IOMMU SVA code since it is the
> > > > > > primary use case. However, some architecture such as VT-d,
> > > > > > global PASIDs are necessary for its internal use of DMA API
> > > > > > with PASID.
> > > > >
> > > > > No, global PASID is not a VT-d restriction. It's from ENQCMD/S
> > > > > hence a device requirement.
> > > > I meant VT-d based platforms, it is kind of intertwined in that
> > > > ENQCMDS does not restrict RIDPASID!=DMA PASID, vt-d does. Without
> > > > this restriction, there wouldn't be a need for this patch. Let me
> > > > reword.
> > >
> > > No, Kevin is right, there is nothing about VT-d that needs global
> > > PASID values.
> > >
> > > The driver should be managing RID2PASID itself to avoid conflicting
> > > with any in-use PASID, either by changing RID2PASID on demand or by
> > > setting it to a value that is not part of the PASID number space, eg
> > > we can make 0 entirely invalid, or the driver can reduce max_pasid of
> > > the devices it controls and use PASID_MAX.
> > >
> > I see, thank you both. how about
> > "This patch provide an API for device drivers to request global PASIDs
> > as needed. The device drivers will then gain the flexibility of choosing
> > PASIDs not conflicting with anyone in-use."
>
> Stil no, this functionality should be clearly and unambiguously tied
> to ENQCMD:
>
> Devices that rely on Intel ENQCMD have a single CPU register to store
> the current thread's PASID in. This necessarily makes the PASID a
> system-global value shared by all ENQCMD using devices.
>
> This matches the current allocator being used for the SVA PASID so for
> now allow ENQCMD drivers to access this PASID allocator for other
> uses.
>
ENQCMDS does not have the restriction of using a single CPU MSR to store
PASIDs, PASID is supplied to the instruction operand. Here we are adding
API for ENQCMDS. Should we explain this as well? i.e. due the unforgiving
nature of ENQCMD that requires global PASIDs, ENQCMDS has no choice but to
allocate from the same numberspace to avoid conflict.


Thanks,

Jacob

2023-03-06 18:20:14

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

On Mon, Mar 06, 2023 at 09:57:59AM -0800, Jacob Pan wrote:

> ENQCMDS does not have the restriction of using a single CPU MSR to store
> PASIDs, PASID is supplied to the instruction operand.

Huh? That isn't what it says in the programming manual. It says the
PASID only comes from the IA32_PASID msr and the only two operands are
the destination MMIO and the memory source for the rest of the payload.

Jason

2023-03-06 18:26:31

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

Hi Baolu,

On Sun, 5 Mar 2023 11:05:50 +0800, Baolu Lu <[email protected]>
wrote:

> On 3/4/23 12:35 AM, Jacob Pan wrote:
> >>> From: Baolu Lu<[email protected]>
> >>> Sent: Thursday, March 2, 2023 10:07 PM
> >>>
> >>>> +
> >>>> + if (hw_pass_through && domain_type_is_si(dmar_domain))
> >>>> + ret = intel_pasid_setup_pass_through(iommu,
> >>>> dmar_domain,
> >>>> + dev, pasid);
> >>>> + else if (dmar_domain->use_first_level)
> >>>> + ret = domain_setup_first_level(iommu, dmar_domain,
> >>>> + dev, pasid);
> >>>> + else
> >>>> + ret = intel_pasid_setup_second_level(iommu,
> >>>> dmar_domain,
> >>>> + dev, pasid);
> >>>> +
> >>>> + return ret;
> >>>> +}
> >>> Do you need to consider pasid cache invalidation?
> >>>
> >> To avoid confusion this is not about invalidation of pasid cache itself
> >> which should be covered by above setup functions already.
> >>
> >> Here actually means per-PASID invalidation in iotlb and devtlb. Today
> >> only RID is tracked per domain for invalidation. it needs extension to
> >> walk attached pasid too.
> > Yes, will add.
> >
> > For the set up path, there is no need to flush IOTLBs, because we're
> > going from non present to present.
> >
> > On the remove path, IOTLB flush should be covered when device driver
> > calls iommu_detach_device_pasid(). Covered with this patch.
>
> It's not only for the PASID teardown path, but also for unmap(). As the
> device has issued DMA requests with PASID, the IOMMU probably will cache
> the DMA translation with PASID tagged. Hence, we need to invalidate the
> PASID-specific IOTLB and device TLB in the unmap() path.
>
> I once had a patch for this:
>
> https://lore.kernel.org/linux-iommu/[email protected]/
>
> Probably you can use it as a starting point.
understood, actually my previous version had unmap flush, based on yours as
well.
https://lore.kernel.org/lkml/[email protected]/T/
>
> Best regards,
> baolu


Thanks,

Jacob

2023-03-06 18:40:19

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

Hi Kevin,

On Mon, 6 Mar 2023 08:18:37 +0000, "Tian, Kevin" <[email protected]>
wrote:

> > From: Baolu Lu <[email protected]>
> > Sent: Sunday, March 5, 2023 11:06 AM
> >
> > On 3/4/23 12:35 AM, Jacob Pan wrote:
> > >>> From: Baolu Lu<[email protected]>
> > >>> Sent: Thursday, March 2, 2023 10:07 PM
> > >>>
> > >>>> +
> > >>>> + if (hw_pass_through && domain_type_is_si(dmar_domain))
> > >>>> + ret = intel_pasid_setup_pass_through(iommu,
> > >>>> dmar_domain,
> > >>>> + dev, pasid);
> > >>>> + else if (dmar_domain->use_first_level)
> > >>>> + ret = domain_setup_first_level(iommu, dmar_domain,
> > >>>> + dev, pasid);
> > >>>> + else
> > >>>> + ret = intel_pasid_setup_second_level(iommu,
> > >>>> dmar_domain,
> > >>>> + dev, pasid);
> > >>>> +
> > >>>> + return ret;
> > >>>> +}
> > >>> Do you need to consider pasid cache invalidation?
> > >>>
> > >> To avoid confusion this is not about invalidation of pasid cache
> > >> itself which should be covered by above setup functions already.
> > >>
> > >> Here actually means per-PASID invalidation in iotlb and devtlb. Today
> > >> only RID is tracked per domain for invalidation. it needs extension
> > >> to walk attached pasid too.
> > > Yes, will add.
> > >
> > > For the set up path, there is no need to flush IOTLBs, because we're
> > > going from non present to present.
> > >
> > > On the remove path, IOTLB flush should be covered when device driver
> > > calls iommu_detach_device_pasid(). Covered with this patch.
> >
> > It's not only for the PASID teardown path, but also for unmap(). As the
> > device has issued DMA requests with PASID, the IOMMU probably will cache
> > the DMA translation with PASID tagged. Hence, we need to invalidate the
> > PASID-specific IOTLB and device TLB in the unmap() path.
> >
> > I once had a patch for this:
> >
> > https://lore.kernel.org/linux-iommu/20220614034411.1634238-1-
> > [email protected]/
> >
> > Probably you can use it as a starting point.
> >
>
> just that we should not have a sub-device term there. Just name
> the tracking information per pasid.
Sounds good, I should be enable to use Baolu's patch for the most part.

Thanks,

Jacob

2023-03-06 18:49:19

by Luck, Tony

[permalink] [raw]
Subject: RE: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

>> ENQCMDS does not have the restriction of using a single CPU MSR to store
>> PASIDs, PASID is supplied to the instruction operand.
>
> Huh? That isn't what it says in the programming manual. It says the
> PASID only comes from the IA32_PASID msr and the only two operands are
> the destination MMIO and the memory source for the rest of the payload.

Jason,

Two different instructions with only one letter different in the name.

ENQCMD - ring 3 instruction. The PASID is inserted into the descriptor
pushed to the device from the IA32_PASID MSR.

ENQCMDS - ring 0 instruction (see that trailing "S" for Supervisor mode).
In this case the submitter can include any PASID value they want in the
in-memory copy of the descriptor and ENQCMDS will pass that to the
device.

-Tony

2023-03-06 19:01:38

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

Hi Kevin,

On Fri, 3 Mar 2023 05:35:58 +0000, "Tian, Kevin" <[email protected]>
wrote:

> > From: Baolu Lu <[email protected]>
> > Sent: Friday, March 3, 2023 12:38 PM
> >
> > On 3/3/23 11:02 AM, Tian, Kevin wrote:
> > >> From: Baolu Lu <[email protected]>
> > >> Sent: Friday, March 3, 2023 10:49 AM
> > >>
> > >> On 3/3/23 10:36 AM, Tian, Kevin wrote:
> > >>>> From: Baolu Lu <[email protected]>
> > >>>> Sent: Thursday, March 2, 2023 10:07 PM
> > >>>>> +
> > >>>>> + if (!sm_supported(iommu) || !info)
> > >>>>
> > >>>> @info has been referenced. !info check makes no sense.
> > >>>>
> > >>>> Add pasid_supported(iommu).
> > >>>>
> > >>>> Do you need to check whether the domain is compatible for this rid
> > >>>> pasid?
> > >>>
> > >>> what kind of compatibility is concerned here? In concept a pasid
> > >>> can be attached to any domain if it has been successfully attached
> > >>> to rid. Probably we can add a check here that RID2PASID must
> > >>> point to the domain already.
> > >>
> > >> "...if it has been successfully attached to rid..."
> > >>
> > >> We should not have this assumption in iommu driver's callback. The
> > iommu
> > >> driver has no (and should not have) knowledge about the history of
> > >> any domain.
> > >
> > > but this is an op for default domain which must have been attached
> > > to RID2PASID and any compatibility check between this domain and
> > > device should be passed.
> >
> > This is an op for DMA, DMA-FQ and UNMANAGED domain. The IOMMU
> > driver
> > doesn't need to interpret the default domain concept. :-)
> >
>
> yes if we target a general callback for all those domain types.
>
> and probably this is the right thing to do as in the end DMA type will
> be removed with Jason's cleanup
so, let me recap. set_dev_pasid() should make no assumptions of
ordering, i.e. it is equal to iommu_domain_ops.attach_dev().
It will be kind of the same as Baolu's old patch
https://lore.kernel.org/linux-iommu/[email protected]/


Thanks,

Jacob

2023-03-06 19:03:05

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

On Mon, Mar 06, 2023 at 11:04:43AM -0800, Jacob Pan wrote:

> > and probably this is the right thing to do as in the end DMA type will
> > be removed with Jason's cleanup
>
> so, let me recap. set_dev_pasid() should make no assumptions of
> ordering, i.e. it is equal to iommu_domain_ops.attach_dev().

Absolutely yes.

You should factor out all the "prepare the domain to be used" code and
call it in both places.

Jason

2023-03-06 19:05:36

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

On Mon, Mar 06, 2023 at 06:48:43PM +0000, Luck, Tony wrote:
> >> ENQCMDS does not have the restriction of using a single CPU MSR to store
> >> PASIDs, PASID is supplied to the instruction operand.
> >
> > Huh? That isn't what it says in the programming manual. It says the
> > PASID only comes from the IA32_PASID msr and the only two operands are
> > the destination MMIO and the memory source for the rest of the payload.
>
> Jason,
>
> Two different instructions with only one letter different in the name.
>
> ENQCMD - ring 3 instruction. The PASID is inserted into the descriptor
> pushed to the device from the IA32_PASID MSR.
>
> ENQCMDS - ring 0 instruction (see that trailing "S" for Supervisor mode).
> In this case the submitter can include any PASID value they want in the
> in-memory copy of the descriptor and ENQCMDS will pass that to the
> device.

Ah, well, my comment wasn't talking about ENQCMDS :)

If ENQCMDS can take in an arbitary PASID then there is no
justification here to use the global allocator.

The rational is more like:

IDXD uses PASIDs that come from the SVA allocator. It needs to create
an internal kernel-only PASID that is non-overlapping so allow the SVA
allocator to reserve PASIDs for driver use.

IDXD has to use the global SVA PASID allocator beacuse its userspace
will use ENQCMD which requires global PASIDs.

Jason

2023-03-06 23:41:20

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

Hi Jason,

On Mon, 6 Mar 2023 15:02:37 -0400, Jason Gunthorpe <[email protected]> wrote:

> On Mon, Mar 06, 2023 at 11:04:43AM -0800, Jacob Pan wrote:
>
> > > and probably this is the right thing to do as in the end DMA type will
> > > be removed with Jason's cleanup
> >
> > so, let me recap. set_dev_pasid() should make no assumptions of
> > ordering, i.e. it is equal to iommu_domain_ops.attach_dev().
>
> Absolutely yes.
>
> You should factor out all the "prepare the domain to be used" code and
> call it in both places.
>
I think this has been done by Baolu
https://lore.kernel.org/linux-iommu/[email protected]/T/#m8c980357a39dc75dc360ff0f71dc7ebe1e49f9a6
iommu/vt-d: Move common code out of iommu_attch_device()

This part of code could be used by both normal and aux
domain specific attach entries. Hence move them into a
common function to avoid duplication.

set_dev_pasid() will call prepare_domain_attach_device() as well.

Thanks,

Jacob

2023-03-07 00:42:05

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

Hi Jacob,

On Mon, 6 Mar 2023 15:45:04 -0800, Jacob Pan
<[email protected]> wrote:

> Hi Jason,
>
> On Mon, 6 Mar 2023 15:02:37 -0400, Jason Gunthorpe <[email protected]> wrote:
>
> > On Mon, Mar 06, 2023 at 11:04:43AM -0800, Jacob Pan wrote:
> >
> > > > and probably this is the right thing to do as in the end DMA type
> > > > will be removed with Jason's cleanup
> > >
> > > so, let me recap. set_dev_pasid() should make no assumptions of
> > > ordering, i.e. it is equal to iommu_domain_ops.attach_dev().
> >
> > Absolutely yes.
> >
> > You should factor out all the "prepare the domain to be used" code and
> > call it in both places.
> >
> I think this has been done by Baolu
> https://lore.kernel.org/linux-iommu/[email protected]/T/#m8c980357a39dc75dc360ff0f71dc7ebe1e49f9a6
> iommu/vt-d: Move common code out of iommu_attch_device()
>
> This part of code could be used by both normal and aux
> domain specific attach entries. Hence move them into a
> common function to avoid duplication.
>
> set_dev_pasid() will call prepare_domain_attach_device() as well.
Actually, there are more to be factored to common code if we take that
assumption away. attach_dev() can be viewed as a special case for
set_dev_pasid() except the PASID is RID_PASID.

Thanks,

Jacob

2023-03-07 02:18:58

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH 1/4] iommu/vt-d: Implement set device pasid op for default domain

On 3/7/23 1:41 AM, Jason Gunthorpe wrote:
> On Mon, Mar 06, 2023 at 09:36:26AM -0800, Jacob Pan wrote:
>
>>> It seems like all this is doing is flushing the PRI queue.
>>> A domain should have a dedicated flag unrelated to the type if it is
>>> using PRI and all PRI using domains should have the PRI queue flushed
>>> here, using the same code as flushing the PRI for a RID attachment.
>> Yes, or if the teardown op is domain-specific, then it works too?
> That could only sense if we end up creating a PRI domain type too..
>
> Right now PRI is messy because it doesn't really work with unmanaged
> domains and that is something that will have to get fixed sort of
> soonish
>
> Once PRI is a proper thing then "SVA" is just a PRI domain that has a
> non-managed from-the-mm page table pointer.
>
> Most of the driver code marked svm should entirely disappear.

Agreed.

There is also another code consolidation we discussed earlier, that is
moving mmu_notifier_register/unregister to drivers/iommu/iommu-sva.c.
It's common for all SVA-capable iommu drivers.

With PRI domain and mm_notifier addressed, we could safely remove the
switch here. SVA domain will be nothing special. :-)

Best regards,
baolu

2023-03-09 17:08:37

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

Hi Jason,

On Mon, 6 Mar 2023 15:05:27 -0400, Jason Gunthorpe <[email protected]> wrote:

> On Mon, Mar 06, 2023 at 06:48:43PM +0000, Luck, Tony wrote:
> > >> ENQCMDS does not have the restriction of using a single CPU MSR to
> > >> store PASIDs, PASID is supplied to the instruction operand.
> > >
> > > Huh? That isn't what it says in the programming manual. It says the
> > > PASID only comes from the IA32_PASID msr and the only two operands are
> > > the destination MMIO and the memory source for the rest of the
> > > payload.
> >
> > Jason,
> >
> > Two different instructions with only one letter different in the name.
> >
> > ENQCMD - ring 3 instruction. The PASID is inserted into the descriptor
> > pushed to the device from the IA32_PASID MSR.
> >
> > ENQCMDS - ring 0 instruction (see that trailing "S" for Supervisor
> > mode). In this case the submitter can include any PASID value they want
> > in the in-memory copy of the descriptor and ENQCMDS will pass that to
> > the device.
>
> Ah, well, my comment wasn't talking about ENQCMDS :)
>
> If ENQCMDS can take in an arbitary PASID then there is no
> justification here to use the global allocator.
>
> The rational is more like:
>
> IDXD uses PASIDs that come from the SVA allocator. It needs to create
> an internal kernel-only PASID that is non-overlapping so allow the SVA
> allocator to reserve PASIDs for driver use.
>
> IDXD has to use the global SVA PASID allocator beacuse its userspace
> will use ENQCMD which requires global PASIDs.
>
yes, great summary. I think that is the same as what I was trying to say
earlier :)
"due the unforgiving nature of ENQCMD that requires global PASIDs, ENQCMDS
has no choice but to allocate from the same numberspace to avoid conflict."

In that sense, I feel the global allocator should be staying with SVA
instead of moving to iommu core (as Kevin suggested). Because we are trying
to have non-overlapping pasid with SVA.

Thanks,

Jacob

2023-03-16 07:25:28

by Tian, Kevin

[permalink] [raw]
Subject: RE: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

> From: Jacob Pan <[email protected]>
> Sent: Friday, March 10, 2023 1:06 AM
>
> Hi Jason,
>
> On Mon, 6 Mar 2023 15:05:27 -0400, Jason Gunthorpe <[email protected]>
> wrote:
>
> > On Mon, Mar 06, 2023 at 06:48:43PM +0000, Luck, Tony wrote:
> > > >> ENQCMDS does not have the restriction of using a single CPU MSR to
> > > >> store PASIDs, PASID is supplied to the instruction operand.
> > > >
> > > > Huh? That isn't what it says in the programming manual. It says the
> > > > PASID only comes from the IA32_PASID msr and the only two operands
> are
> > > > the destination MMIO and the memory source for the rest of the
> > > > payload.
> > >
> > > Jason,
> > >
> > > Two different instructions with only one letter different in the name.
> > >
> > > ENQCMD - ring 3 instruction. The PASID is inserted into the descriptor
> > > pushed to the device from the IA32_PASID MSR.
> > >
> > > ENQCMDS - ring 0 instruction (see that trailing "S" for Supervisor
> > > mode). In this case the submitter can include any PASID value they want
> > > in the in-memory copy of the descriptor and ENQCMDS will pass that to
> > > the device.
> >
> > Ah, well, my comment wasn't talking about ENQCMDS :)
> >
> > If ENQCMDS can take in an arbitary PASID then there is no
> > justification here to use the global allocator.
> >
> > The rational is more like:
> >
> > IDXD uses PASIDs that come from the SVA allocator. It needs to create
> > an internal kernel-only PASID that is non-overlapping so allow the SVA
> > allocator to reserve PASIDs for driver use.
> >
> > IDXD has to use the global SVA PASID allocator beacuse its userspace
> > will use ENQCMD which requires global PASIDs.
> >
> yes, great summary. I think that is the same as what I was trying to say
> earlier :)
> "due the unforgiving nature of ENQCMD that requires global PASIDs,
> ENQCMDS
> has no choice but to allocate from the same numberspace to avoid conflict."
>
> In that sense, I feel the global allocator should be staying with SVA
> instead of moving to iommu core (as Kevin suggested). Because we are trying
> to have non-overlapping pasid with SVA.
>

I still doubt 'reserve' is the right interface to define.

for DMA domain probably yes as it's static and one-off.

but thinking louder when the same driver starts to support SIOV we
need allocating additional PASIDs on demand which is hardly to be
fit in a reservation interface.

2023-03-20 17:28:24

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH 3/4] iommu/sva: Support reservation of global PASIDs

On Thu, Mar 16, 2023 at 07:25:17AM +0000, Tian, Kevin wrote:

> I still doubt 'reserve' is the right interface to define.
>
> for DMA domain probably yes as it's static and one-off.
>
> but thinking louder when the same driver starts to support SIOV we
> need allocating additional PASIDs on demand which is hardly to be
> fit in a reservation interface.

Sure, it is the same thing.

It is "reserve" in the sense they are not assigned to mm structs and
not used for SVA.

Jason