2023-10-17 00:49:02

by Zhang, Tina

Subject: [PATCH v8 0/5] Share sva domains with all devices bound to a mm

This series shares sva (shared virtual addressing) domains among all
devices bound to one mm.

Problem
-------
In the current iommu core code, an sva domain is allocated per IOMMU group
when a device driver binds a process address space to a device (which is
handled in iommu_sva_bind_device()). If more than one device is bound to
the same process address space, there must be more than one sva domain
instance, with each device having one. In other words, the sva domain
isn't shared between the devices bound to the same process address
space, and that leads to two problems:
1) The device driver has to duplicate sva domains with enqcmd, as those sva
domains have the same PASID and are relevant to one virtual address space.
This makes the sva domain handling complex in device drivers.
2) The IOMMU driver cannot get sufficient info about the IOMMUs that have
devices behind them bound to the same virtual address space when handling
mmu_notifier_ops callbacks. As a result, IOMMU IOTLB invalidation is
performed per device instead of per IOMMU, and that may lead to
superfluous IOTLB invalidations, especially in a virtualization
environment where all devices may be behind one virtual IOMMU.

Solution
--------
This patch set aims to fix those two problems by sharing sva domains among
all devices bound to a mm. To achieve this, a new structure pointer is
added to mm to replace the old PASID field; the structure keeps the PASID
as well as the corresponding shared sva domains. In addition,
iommu_sva_bind_device() is updated so that a new sva domain is allocated
only when none of the existing ones works for the IOMMU. With these
changes, a device driver can expect one sva domain to work per PASID
instance (e.g., an enqcmd PASID instance), and therefore may no longer
need to handle sva domain duplication. Likewise, an IOMMU driver (e.g.,
the Intel VT-d driver) can get sufficient info (e.g., which IOMMUs have
devices bound to one virtual address space) when handling
mmu_notifier_ops callbacks, and can remove the redundant IOTLB
invalidations.

Arguably there shouldn't be more than one sva_domain with the same PASID,
and in any sane configuration there should be only 1 type of IOMMU driver
that needs only 1 SVA domain. However, in reality, IOMMUs on one platform
may not be identical to each other. Thus, attaching a sva domain that has
been successfully bound to device A behind IOMMU A to device B behind
IOMMU B may fail due to the differences between IOMMU A and IOMMU B. In
this case, a new sva domain with the same PASID needs to be allocated to
work with IOMMU B. That's why we need a list to keep the sva domains of
one PASID. On a platform where the IOMMUs are compatible with each other,
there should be only one sva domain in the list.
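
The list-based reuse described above can be modelled in a few lines of
userspace C. This is only a sketch, not the kernel code: struct
sva_domain, struct mm_data, struct device, try_attach() and the
iommu_type compatibility key are all hypothetical stand-ins, and the
real attach is done by iommu_attach_device_pasid() under iommu_sva_lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the real definitions. */
struct sva_domain {
	int iommu_type;			/* assumed compatibility key */
	int users;			/* number of devices attached */
	struct sva_domain *next;	/* next domain sharing the same PASID */
};

struct mm_data {
	unsigned int pasid;
	struct sva_domain *domains;	/* head of the per-PASID domain list */
};

struct device {
	int iommu_type;			/* type of the IOMMU the device sits behind */
};

/* Models an attach that fails when the domain does not fit the IOMMU. */
static int try_attach(struct sva_domain *d, const struct device *dev)
{
	return d->iommu_type == dev->iommu_type ? 0 : -1;
}

/* Reuse the first domain that attaches; allocate a new one otherwise. */
static struct sva_domain *bind_device(struct mm_data *mm,
				      const struct device *dev)
{
	struct sva_domain *d;

	for (d = mm->domains; d; d = d->next) {
		if (try_attach(d, dev) == 0) {
			d->users++;
			return d;
		}
	}

	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	d->iommu_type = dev->iommu_type;
	d->users = 1;
	d->next = mm->domains;
	mm->domains = d;
	return d;
}

/* Count the domains currently kept for this mm. */
static size_t domain_count(const struct mm_data *mm)
{
	const struct sva_domain *d;
	size_t n = 0;

	for (d = mm->domains; d; d = d->next)
		n++;
	return n;
}
```

In this model, binding two devices behind compatible IOMMUs yields one
shared domain with users == 2, and a device behind a different IOMMU type
adds a second domain to the same list.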

v8:
- CC more people
- CC [email protected] mailing list.
When sending version 7, an issue with my CC list meant that version 7
wasn't sent to [email protected].
- Rebase to v6.6-rc6 and make a few format changes.

v7: https://lore.kernel.org/lkml/[email protected]/
- Add mm_pasid_init() back and zero the mm->iommu_mm pointer in
mm_pasid_init() to avoid the use-after-free/double-free problem.
- Update the commit message of patch "iommu: Add mm_get_enqcmd_pasid()
helper function".

v6: https://lore.kernel.org/linux-iommu/[email protected]/
- Rename iommu_sva_alloc_pasid() to iommu_alloc_mm_data().
- Hold the iommu_sva_lock before invoking iommu_alloc_mm_data().
- Remove "iommu: Introduce mm_get_pasid() helper function" patch, because
SMMUv3 decides to use mm_get_enqcmd_pasid() instead and other users are
using iommu_sva_get_pasid() to get the pasid value. Besides, the iommu
core accesses iommu_mm_data in the critical section protected by
iommu_sva_lock. So no need to add another helper to retrieve PASID
atomically.

v5: https://lore.kernel.org/linux-iommu/[email protected]/
- Order patch "iommu/vt-d: Remove mm->pasid in intel_sva_bind_mm()"
first in this series.
- Update commit message of patch "iommu: Introduce mm_get_pasid()
helper function"
- Use smp_store_release() & READ_ONCE() in storing and loading mm's
pasid value.

v4: https://lore.kernel.org/linux-iommu/[email protected]/
- Rebase to v6.6-rc1.

v3: https://lore.kernel.org/linux-iommu/[email protected]/
- Add a comment describing domain->next.
- Expand explanation of why PASID isn't released in
iommu_sva_unbind_device().
- Add a patch to remove mm->pasid in intel_sva_bind_mm()

v2: https://lore.kernel.org/linux-iommu/[email protected]/
- Add mm_get_enqcmd_pasid().
- Update commit message.

v1: https://lore.kernel.org/linux-iommu/[email protected]/

RFC: https://lore.kernel.org/linux-iommu/[email protected]/

Tina Zhang (5):
iommu/vt-d: Remove mm->pasid in intel_sva_bind_mm()
iommu: Add mm_get_enqcmd_pasid() helper function
mm: Add structure to keep sva information
iommu: Support mm PASID 1:n with sva domains
mm: Deprecate pasid field

arch/x86/kernel/traps.c | 2 +-
.../iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c | 12 +--
drivers/iommu/intel/svm.c | 14 +--
drivers/iommu/iommu-sva.c | 94 +++++++++++--------
include/linux/iommu.h | 38 +++++++-
include/linux/mm_types.h | 3 +-
mm/init-mm.c | 3 -
7 files changed, 107 insertions(+), 59 deletions(-)

--
2.39.3


2023-10-17 00:49:04

by Zhang, Tina

Subject: [PATCH v8 4/5] iommu: Support mm PASID 1:n with sva domains

Each mm bound to devices gets a PASID and corresponding sva domains
allocated in iommu_sva_bind_device(), which are referenced by the iommu_mm
field of the mm. The PASID is released in __mmdrop(), while a sva domain
is released when no one is using it (the reference count is decremented
in iommu_sva_unbind_device()). Although sva domains and their PASID are
separate objects whose life cycles could be handled independently, an
enqcmd use case may require that the PASID be released only when the mm
is released (i.e., once a PASID is allocated for a mm, it is used by the
mm permanently and isn't released until the end of the mm), and that the
PASID be dropped only after the sva domains are released. To this end,
mmgrab() is called in iommu_sva_domain_alloc() to increment the mm
reference count and mmdrop() is invoked in iommu_domain_free() to
decrement the mm reference count.
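
The ordering constraint above (the PASID outlives every sva domain) can
be illustrated with a small userspace model. It is a sketch under
assumptions: struct mm, struct domain and the helper names below are
hypothetical, with mm_grab()/mm_drop() standing in for mmgrab()/mmdrop()
and the pasid_live flag standing in for the PASID released from
__mmdrop().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an mm: a reference count plus its PASID state. */
struct mm {
	int refcount;		/* models the mm reference count */
	bool pasid_live;	/* true while the PASID is still allocated */
};

/* Models mmgrab(): take a reference on the mm. */
static void mm_grab(struct mm *mm)
{
	mm->refcount++;
}

/* Models mmdrop(): the last reference releases the PASID. */
static void mm_drop(struct mm *mm)
{
	if (--mm->refcount == 0)
		mm->pasid_live = false;
}

/* Hypothetical model of an sva domain that pins its mm. */
struct domain {
	struct mm *mm;
};

/* Allocating a domain grabs the mm, as iommu_sva_domain_alloc() does. */
static void domain_alloc(struct domain *d, struct mm *mm)
{
	mm_grab(mm);
	d->mm = mm;
}

/* Freeing a domain drops the mm, as iommu_domain_free() does. */
static void domain_free(struct domain *d)
{
	mm_drop(d->mm);
	d->mm = NULL;
}
```

Because each domain holds its own reference, the PASID in this model
stays allocated after the owner's reference is dropped and goes away only
once the last domain is freed.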

Since the required info of PASID and sva domains is kept in struct
iommu_mm_data of a mm, use mm->iommu_mm field instead of the old pasid
field in mm struct. The sva domain list is protected by iommu_sva_lock.

Besides, mm_pasid_init() no longer initializes mm->pasid; it now zeroes
the mm->iommu_mm pointer so that a new mm created in dup_mm() does not
share the iommu_mm instance of the mm it was copied from.

Reviewed-by: Lu Baolu <[email protected]>
Reviewed-by: Vasant Hegde <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Tested-by: Nicolin Chen <[email protected]>
Signed-off-by: Tina Zhang <[email protected]>
---

Change in v7:
- Add mm_pasid_init() back and zero the mm->iommu_mm pointer in
mm_pasid_init() to avoid the use-after-free/double-free problem.
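
The use-after-free/double-free hazard that this change addresses can be
reproduced in miniature in userspace. The sketch below is illustrative
only: struct mm and struct mm_data are hypothetical reductions of
mm_struct and iommu_mm_data, and dup_mm() here is just a memcpy followed
by the reset.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reductions of iommu_mm_data and mm_struct. */
struct mm_data {
	unsigned int pasid;
};

struct mm {
	struct mm_data *iommu_mm;
};

/* Reset the pointer so a copied mm does not alias its parent's data. */
static void mm_pasid_init(struct mm *mm)
{
	mm->iommu_mm = NULL;
}

/* Models dup_mm(): a byte copy of the parent followed by re-initialization. */
static void dup_mm(struct mm *child, const struct mm *parent)
{
	memcpy(child, parent, sizeof(*child));
	mm_pasid_init(child);
}

/* Models mm_pasid_drop(): free the per-mm data if there is any. */
static void mm_pasid_drop(struct mm *mm)
{
	if (!mm->iommu_mm)
		return;
	free(mm->iommu_mm);
	mm->iommu_mm = NULL;
}
```

Without the mm_pasid_init() call in dup_mm(), both structures would hold
the same pointer and dropping both would free it twice.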

Changes in v6:
- Rename iommu_sva_alloc_pasid() to iommu_alloc_mm_data().
- Hold the iommu_sva_lock before invoking iommu_alloc_mm_data().

Change in v5:
- Use smp_store_release() & READ_ONCE() in storing and loading mm's
pasid value.
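
The publish/read ordering behind this item can be approximated in
userspace with C11 atomics. This is a sketch under assumptions, not the
kernel primitives: a release store stands in for smp_store_release(), an
acquire load stands in for READ_ONCE() (which in the kernel relies on
address-dependency ordering), and struct mm, struct mm_data,
publish_pasid(), get_pasid() and PASID_INVALID are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define PASID_INVALID UINT32_MAX	/* hypothetical "no PASID" sentinel */

struct mm_data {
	uint32_t pasid;
};

struct mm {
	_Atomic(struct mm_data *) iommu_mm;
};

/*
 * Initialize the data completely, then publish the pointer with release
 * semantics so the pasid store cannot be reordered after the pointer store.
 */
static int publish_pasid(struct mm *mm, uint32_t pasid)
{
	struct mm_data *d = malloc(sizeof(*d));

	if (!d)
		return -1;
	d->pasid = pasid;
	atomic_store_explicit(&mm->iommu_mm, d, memory_order_release);
	return 0;
}

/* A reader sees either no data at all or fully initialized data. */
static uint32_t get_pasid(struct mm *mm)
{
	struct mm_data *d =
		atomic_load_explicit(&mm->iommu_mm, memory_order_acquire);

	return d ? d->pasid : PASID_INVALID;
}
```

A reader that runs before publication gets PASID_INVALID; one that runs
afterwards gets the stored PASID, never an uninitialized value.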

Change in v4:
- Rebase to v6.6-rc1.

drivers/iommu/iommu-sva.c | 92 +++++++++++++++++++++++----------------
include/linux/iommu.h | 23 ++++++++--
2 files changed, 74 insertions(+), 41 deletions(-)

diff --git a/drivers/iommu/iommu-sva.c b/drivers/iommu/iommu-sva.c
index 4a2f5699747f..5175e8d85247 100644
--- a/drivers/iommu/iommu-sva.c
+++ b/drivers/iommu/iommu-sva.c
@@ -12,32 +12,42 @@
static DEFINE_MUTEX(iommu_sva_lock);

/* Allocate a PASID for the mm within range (inclusive) */
-static int iommu_sva_alloc_pasid(struct mm_struct *mm, struct device *dev)
+static struct iommu_mm_data *iommu_alloc_mm_data(struct mm_struct *mm, struct device *dev)
{
+ struct iommu_mm_data *iommu_mm;
ioasid_t pasid;
- int ret = 0;
+
+ lockdep_assert_held(&iommu_sva_lock);

if (!arch_pgtable_dma_compat(mm))
- return -EBUSY;
+ return ERR_PTR(-EBUSY);

- mutex_lock(&iommu_sva_lock);
+ iommu_mm = mm->iommu_mm;
/* Is a PASID already associated with this mm? */
- if (mm_valid_pasid(mm)) {
- if (mm->pasid >= dev->iommu->max_pasids)
- ret = -EOVERFLOW;
- goto out;
+ if (iommu_mm) {
+ if (iommu_mm->pasid >= dev->iommu->max_pasids)
+ return ERR_PTR(-EOVERFLOW);
+ return iommu_mm;
}

+ iommu_mm = kzalloc(sizeof(struct iommu_mm_data), GFP_KERNEL);
+ if (!iommu_mm)
+ return ERR_PTR(-ENOMEM);
+
pasid = iommu_alloc_global_pasid(dev);
if (pasid == IOMMU_PASID_INVALID) {
- ret = -ENOSPC;
- goto out;
+ kfree(iommu_mm);
+ return ERR_PTR(-ENOSPC);
}
- mm->pasid = pasid;
- ret = 0;
-out:
- mutex_unlock(&iommu_sva_lock);
- return ret;
+ iommu_mm->pasid = pasid;
+ INIT_LIST_HEAD(&iommu_mm->sva_domains);
+ /*
+ * Make sure the write to mm->iommu_mm is not reordered in front of
+ * initialization to iommu_mm fields. If it does, readers may see a
+ * valid iommu_mm with uninitialized values.
+ */
+ smp_store_release(&mm->iommu_mm, iommu_mm);
+ return iommu_mm;
}

/**
@@ -58,31 +68,33 @@ static int iommu_sva_alloc_pasid(struct mm_struct *mm, struct device *dev)
*/
struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm)
{
+ struct iommu_mm_data *iommu_mm;
struct iommu_domain *domain;
struct iommu_sva *handle;
int ret;

+ mutex_lock(&iommu_sva_lock);
+
/* Allocate mm->pasid if necessary. */
- ret = iommu_sva_alloc_pasid(mm, dev);
- if (ret)
- return ERR_PTR(ret);
+ iommu_mm = iommu_alloc_mm_data(mm, dev);
+ if (IS_ERR(iommu_mm)) {
+ ret = PTR_ERR(iommu_mm);
+ goto out_unlock;
+ }

handle = kzalloc(sizeof(*handle), GFP_KERNEL);
- if (!handle)
- return ERR_PTR(-ENOMEM);
-
- mutex_lock(&iommu_sva_lock);
- /* Search for an existing domain. */
- domain = iommu_get_domain_for_dev_pasid(dev, mm->pasid,
- IOMMU_DOMAIN_SVA);
- if (IS_ERR(domain)) {
- ret = PTR_ERR(domain);
+ if (!handle) {
+ ret = -ENOMEM;
goto out_unlock;
}

- if (domain) {
- domain->users++;
- goto out;
+ /* Search for an existing domain. */
+ list_for_each_entry(domain, &mm->iommu_mm->sva_domains, next) {
+ ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid);
+ if (!ret) {
+ domain->users++;
+ goto out;
+ }
}

/* Allocate a new domain and set it on device pasid. */
@@ -92,23 +104,23 @@ struct iommu_sva *iommu_sva_bind_device(struct device *dev, struct mm_struct *mm
goto out_unlock;
}

- ret = iommu_attach_device_pasid(domain, dev, mm->pasid);
+ ret = iommu_attach_device_pasid(domain, dev, iommu_mm->pasid);
if (ret)
goto out_free_domain;
domain->users = 1;
+ list_add(&domain->next, &mm->iommu_mm->sva_domains);
+
out:
mutex_unlock(&iommu_sva_lock);
handle->dev = dev;
handle->domain = domain;
-
return handle;

out_free_domain:
iommu_domain_free(domain);
+ kfree(handle);
out_unlock:
mutex_unlock(&iommu_sva_lock);
- kfree(handle);
-
return ERR_PTR(ret);
}
EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
@@ -124,12 +136,13 @@ EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
void iommu_sva_unbind_device(struct iommu_sva *handle)
{
struct iommu_domain *domain = handle->domain;
- ioasid_t pasid = domain->mm->pasid;
+ struct iommu_mm_data *iommu_mm = domain->mm->iommu_mm;
struct device *dev = handle->dev;

mutex_lock(&iommu_sva_lock);
+ iommu_detach_device_pasid(domain, dev, iommu_mm->pasid);
if (--domain->users == 0) {
- iommu_detach_device_pasid(domain, dev, pasid);
+ list_del(&domain->next);
iommu_domain_free(domain);
}
mutex_unlock(&iommu_sva_lock);
@@ -205,8 +218,11 @@ iommu_sva_handle_iopf(struct iommu_fault *fault, void *data)

void mm_pasid_drop(struct mm_struct *mm)
{
- if (likely(!mm_valid_pasid(mm)))
+ struct iommu_mm_data *iommu_mm = mm->iommu_mm;
+
+ if (!iommu_mm)
return;

- iommu_free_global_pasid(mm->pasid);
+ iommu_free_global_pasid(iommu_mm->pasid);
+ kfree(iommu_mm);
}
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e350520e3a35..19b5ae2303ff 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -109,6 +109,11 @@ struct iommu_domain {
struct { /* IOMMU_DOMAIN_SVA */
struct mm_struct *mm;
int users;
+ /*
+ * Next iommu_domain in mm->iommu_mm->sva_domains list
+ * protected by iommu_sva_lock.
+ */
+ struct list_head next;
};
};
};
@@ -1188,16 +1193,28 @@ static inline bool tegra_dev_iommu_get_stream_id(struct device *dev, u32 *stream
#ifdef CONFIG_IOMMU_SVA
static inline void mm_pasid_init(struct mm_struct *mm)
{
- mm->pasid = IOMMU_PASID_INVALID;
+ /*
+ * During dup_mm(), a new mm will be memcpy'd from an old one and that makes
+ * the new mm and the old one point to a same iommu_mm instance. When either
+ * one of the two mms gets released, the iommu_mm instance is freed, leaving
+ * the other mm running into a use-after-free/double-free problem. To avoid
+ * the problem, zeroing the iommu_mm pointer of a new mm is needed here.
+ */
+ mm->iommu_mm = NULL;
}
+
static inline bool mm_valid_pasid(struct mm_struct *mm)
{
- return mm->pasid != IOMMU_PASID_INVALID;
+ return READ_ONCE(mm->iommu_mm);
}

static inline u32 mm_get_enqcmd_pasid(struct mm_struct *mm)
{
- return mm->pasid;
+ struct iommu_mm_data *iommu_mm = READ_ONCE(mm->iommu_mm);
+
+ if (!iommu_mm)
+ return IOMMU_PASID_INVALID;
+ return iommu_mm->pasid;
}

void mm_pasid_drop(struct mm_struct *mm);
--
2.39.3

2023-10-17 00:49:14

by Zhang, Tina

Subject: [PATCH v8 5/5] mm: Deprecate pasid field

Drop the pasid field, as all the information needed for sva domain
management has been moved to the newly added iommu_mm field.

Reviewed-by: Lu Baolu <[email protected]>
Reviewed-by: Vasant Hegde <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Tina Zhang <[email protected]>
---
include/linux/mm_types.h | 1 -
mm/init-mm.c | 3 ---
2 files changed, 4 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 9f4efed85f74..37f049c4b059 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -883,7 +883,6 @@ struct mm_struct {
struct work_struct async_put_work;

#ifdef CONFIG_IOMMU_SVA
- u32 pasid;
struct iommu_mm_data *iommu_mm;
#endif
#ifdef CONFIG_KSM
diff --git a/mm/init-mm.c b/mm/init-mm.c
index cfd367822cdd..24c809379274 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -44,9 +44,6 @@ struct mm_struct init_mm = {
#endif
.user_ns = &init_user_ns,
.cpu_bitmap = CPU_BITS_NONE,
-#ifdef CONFIG_IOMMU_SVA
- .pasid = IOMMU_PASID_INVALID,
-#endif
INIT_MM_CONTEXT(init_mm)
};

--
2.39.3

2023-10-17 02:19:58

by Nicolin Chen

Subject: Re: [PATCH v8 0/5] Share sva domains with all devices bound to a mm

On Tue, Oct 17, 2023 at 08:47:57AM +0800, Tina Zhang wrote:

> v8:
> - CC more people
> - CC [email protected] mailing list.
> When sending version 7, some issue happened in my CC list and that caused
> version 7 wasn't sent to [email protected].
> - Rebase to v6.6-rc6 and make a few format changes.

Ran a sanity test with SMMUv3, though there is a merge conflict
against Michael's SMMU series:
https://lore.kernel.org/linux-iommu/[email protected]/

Tested-by: Nicolin Chen <[email protected]>

2023-10-17 16:42:09

by Jason Gunthorpe

Subject: Re: [PATCH v8 0/5] Share sva domains with all devices bound to a mm

On Tue, Oct 17, 2023 at 08:47:57AM +0800, Tina Zhang wrote:
> This series is to share sva(shared virtual addressing) domains with all
> devices bound to one mm.
>
> Problem
> -------
> In the current iommu core code, sva domain is allocated per IOMMU group,
> when device driver is binding a process address space to a device (which is
> handled in iommu_sva_bind_device()). If more than one device is bound to
> the same process address space, there must be more than one sva domain
> instance, with each device having one. In other words, the sva domain
> doesn't share between those devices bound to the same process address
> space, and that leads to two problems:
> 1) device driver has to duplicate sva domains with enqcmd, as those sva
> domains have the same PASID and are relevant to one virtual address space.
> This makes the sva domain handling complex in device drivers.
> 2) IOMMU driver cannot get sufficient info of the IOMMUs that have
> devices behind them bound to the same virtual address space, when handling
> mmu_notifier_ops callbacks. As a result, IOMMU IOTLB invalidation is
> performed per device instead of per IOMMU, and that may lead to
> superfluous IOTLB invalidation issue, especially in a virtualization
> environment where all devices may be behind one virtual IOMMU.
>
> Solution
> --------
> This patch-set tries to fix those two problems by allowing sharing sva
> domains with all devices bound to a mm. To achieve this, a new structure
> pointer is introduced to mm to replace the old PASID field, which can keep
> the info of PASID as well as the corresponding shared sva domains.
> Besides, function iommu_sva_bind_device() is updated to ensure a new sva
> domain can only be allocated when the old ones cannot work for the IOMMU.
> With these changes, a device driver can expect one sva domain could work
> for per PASID instance(e.g., enqcmd PASID instance), and therefore may get
> rid of handling sva domain duplication. Besides, IOMMU driver (e.g., intel
> vt-d driver) can get sufficient info (e.g., the info of the IOMMUs having
> their devices bound to one virtual address space) when handling
> mmu_notifier_ops callbacks, to remove the redundant IOTLB invalidations.
>
> Arguably there shouldn't be more than one sva_domain with the same PASID,
> and in any sane configuration there should be only 1 type of IOMMU driver
> that needs only 1 SVA domain. However, in reality, IOMMUs on one platform
> may not be identical to each other. Thus, attaching a sva domain that has
> been successfully bound to device A behind a IOMMU A, to device B behind
> IOMMU B may get failed due to the difference between IOMMU A and IOMMU
> B. In this case, a new sva domain with the same PASID needs to be
> allocated to work with IOMMU B. That's why we need a list to keep sva
> domains of one PASID. For the platform where IOMMUs are compatible to each
> other, there should be one sva domain in the list.
>
> v8:
> - CC more people
> - CC [email protected] mailing list.
> When sending version 7, some issue happened in my CC list and that caused
> version 7 wasn't sent to [email protected].
> - Rebase to v6.6-rc6 and make a few format changes.

You should base it on Joerg's tree so he can take it without
conflicts.

The conflicts are trivial though (Take Michael's version and switch
mm->pasid with mm_get_enqcmd_pasid(mm))

It looks fine, please let's get it in this cycle, the ARM and AMD SVA
series depend on it.

Jason

2023-10-18 02:17:54

by Baolu Lu

Subject: Re: [PATCH v8 0/5] Share sva domains with all devices bound to a mm

On 10/18/23 12:41 AM, Jason Gunthorpe wrote:
> On Tue, Oct 17, 2023 at 08:47:57AM +0800, Tina Zhang wrote:
>> This series is to share sva(shared virtual addressing) domains with all
>> devices bound to one mm.
>>
>> Problem
>> -------
>> In the current iommu core code, sva domain is allocated per IOMMU group,
>> when device driver is binding a process address space to a device (which is
>> handled in iommu_sva_bind_device()). If more than one device is bound to
>> the same process address space, there must be more than one sva domain
>> instance, with each device having one. In other words, the sva domain
>> doesn't share between those devices bound to the same process address
>> space, and that leads to two problems:
>> 1) device driver has to duplicate sva domains with enqcmd, as those sva
>> domains have the same PASID and are relevant to one virtual address space.
>> This makes the sva domain handling complex in device drivers.
>> 2) IOMMU driver cannot get sufficient info of the IOMMUs that have
>> devices behind them bound to the same virtual address space, when handling
>> mmu_notifier_ops callbacks. As a result, IOMMU IOTLB invalidation is
>> performed per device instead of per IOMMU, and that may lead to
>> superfluous IOTLB invalidation issue, especially in a virtualization
>> environment where all devices may be behind one virtual IOMMU.
>>
>> Solution
>> --------
>> This patch-set tries to fix those two problems by allowing sharing sva
>> domains with all devices bound to a mm. To achieve this, a new structure
>> pointer is introduced to mm to replace the old PASID field, which can keep
>> the info of PASID as well as the corresponding shared sva domains.
>> Besides, function iommu_sva_bind_device() is updated to ensure a new sva
>> domain can only be allocated when the old ones cannot work for the IOMMU.
>> With these changes, a device driver can expect one sva domain could work
>> for per PASID instance(e.g., enqcmd PASID instance), and therefore may get
>> rid of handling sva domain duplication. Besides, IOMMU driver (e.g., intel
>> vt-d driver) can get sufficient info (e.g., the info of the IOMMUs having
>> their devices bound to one virtual address space) when handling
>> mmu_notifier_ops callbacks, to remove the redundant IOTLB invalidations.
>>
>> Arguably there shouldn't be more than one sva_domain with the same PASID,
>> and in any sane configuration there should be only 1 type of IOMMU driver
>> that needs only 1 SVA domain. However, in reality, IOMMUs on one platform
>> may not be identical to each other. Thus, attaching a sva domain that has
>> been successfully bound to device A behind a IOMMU A, to device B behind
>> IOMMU B may get failed due to the difference between IOMMU A and IOMMU
>> B. In this case, a new sva domain with the same PASID needs to be
>> allocated to work with IOMMU B. That's why we need a list to keep sva
>> domains of one PASID. For the platform where IOMMUs are compatible to each
>> other, there should be one sva domain in the list.
>>
>> v8:
>> - CC more people
>> - CC [email protected] mailing list.
>> When sending version 7, some issue happened in my CC list and that caused
>> version 7 wasn't sent to [email protected].
>> - Rebase to v6.6-rc6 and make a few format changes.
> You should base it on Joerg's tree so he can take it without
> conflicts.
>
> The conflicts are trivial though (Take Michael's version and switch
> mm->pasid with mm_get_enqcmd_pasid(mm))
>
> It looks fine, please let's get it in this cycle, the ARM and AMD SVA
> series depend on it.

The VT-d driver also has a series depending on it.

https://lore.kernel.org/linux-iommu/[email protected]/

Best regards,
baolu

2023-10-18 04:52:07

by Zhang, Tina

Subject: RE: [PATCH v8 0/5] Share sva domains with all devices bound to a mm

Hi,

> -----Original Message-----
> From: Jason Gunthorpe <[email protected]>
> Sent: Wednesday, October 18, 2023 12:42 AM
> To: Zhang, Tina <[email protected]>
> Cc: [email protected]; [email protected]; David Woodhouse
> <[email protected]>; Lu Baolu <[email protected]>; Joerg
> Roedel <[email protected]>; Will Deacon <[email protected]>; Robin Murphy
> <[email protected]>; Tian, Kevin <[email protected]>; Nicolin Chen
> <[email protected]>; Michael Shavit <[email protected]>; Vasant
> Hegde <[email protected]>
> Subject: Re: [PATCH v8 0/5] Share sva domains with all devices bound to a
> mm
>
> On Tue, Oct 17, 2023 at 08:47:57AM +0800, Tina Zhang wrote:
> > This series is to share sva(shared virtual addressing) domains with
> > all devices bound to one mm.
> >
> > Problem
> > -------
> > In the current iommu core code, sva domain is allocated per IOMMU
> > group, when device driver is binding a process address space to a
> > device (which is handled in iommu_sva_bind_device()). If more than one
> > device is bound to the same process address space, there must be more
> > than one sva domain instance, with each device having one. In other
> > words, the sva domain doesn't share between those devices bound to the
> > same process address space, and that leads to two problems:
> > 1) device driver has to duplicate sva domains with enqcmd, as those
> > sva domains have the same PASID and are relevant to one virtual address
> space.
> > This makes the sva domain handling complex in device drivers.
> > 2) IOMMU driver cannot get sufficient info of the IOMMUs that have
> > devices behind them bound to the same virtual address space, when
> > handling mmu_notifier_ops callbacks. As a result, IOMMU IOTLB
> > invalidation is performed per device instead of per IOMMU, and that
> > may lead to superfluous IOTLB invalidation issue, especially in a
> > virtualization environment where all devices may be behind one virtual
> IOMMU.
> >
> > Solution
> > --------
> > This patch-set tries to fix those two problems by allowing sharing sva
> > domains with all devices bound to a mm. To achieve this, a new
> > structure pointer is introduced to mm to replace the old PASID field,
> > which can keep the info of PASID as well as the corresponding shared sva
> domains.
> > Besides, function iommu_sva_bind_device() is updated to ensure a new
> > sva domain can only be allocated when the old ones cannot work for the
> IOMMU.
> > With these changes, a device driver can expect one sva domain could
> > work for per PASID instance(e.g., enqcmd PASID instance), and
> > therefore may get rid of handling sva domain duplication. Besides,
> > IOMMU driver (e.g., intel vt-d driver) can get sufficient info (e.g.,
> > the info of the IOMMUs having their devices bound to one virtual
> > address space) when handling mmu_notifier_ops callbacks, to remove the
> redundant IOTLB invalidations.
> >
> > Arguably there shouldn't be more than one sva_domain with the same
> > PASID, and in any sane configuration there should be only 1 type of
> > IOMMU driver that needs only 1 SVA domain. However, in reality, IOMMUs
> > on one platform may not be identical to each other. Thus, attaching a
> > sva domain that has been successfully bound to device A behind a IOMMU
> > A, to device B behind IOMMU B may get failed due to the difference
> > between IOMMU A and IOMMU B. In this case, a new sva domain with the
> > same PASID needs to be allocated to work with IOMMU B. That's why we
> > need a list to keep sva domains of one PASID. For the platform where
> > IOMMUs are compatible to each other, there should be one sva domain in
> the list.
> >
> > v8:
> > - CC more people
> > - CC [email protected] mailing list.
> > When sending version 7, some issue happened in my CC list and that
> caused
> > version 7 wasn't sent to [email protected].
> > - Rebase to v6.6-rc6 and make a few format changes.
>
> You should base it on Joerg's tree so he can take it without conflicts.
>
> The conflicts are trivial though (Take Michael's version and switch
> mm->pasid with mm_get_enqcmd_pasid(mm))
>
> It looks fine, please let's get it in this cycle, the ARM and AMD SVA series
> depend on it.
V9 will be based on the next branch of Joerg's tree.

As Baolu mentioned, besides the ARM and AMD SVA series, we also have a VT-d series waiting for it.

Regards,
-Tina
>
> Jason