2023-10-17 03:21:42

by Zhang, Tina

Subject: [RFC PATCH 00/12] iommu/vt-d: Remove superfluous IOMMU IOTLB invalidations

This series, based on the "Share sva" patch-set[1], aims to remove
superfluous IOMMU IOTLB invalidations in the VT-d driver.

In the current VT-d driver, IOMMU IOTLB invalidation commands and
device-TLB invalidation commands are issued per device, which leads to
superfluous IOTLB invalidations. For example, if four devices behind one
IOMMU are attached to the same sva domain (a common case in virtualization
scenarios, where one virtual IOMMU serves all the virtual devices), four
IOTLB invalidation commands and four device-IOTLB invalidation commands
will be issued. However, only one IOTLB invalidation command and four
device-IOTLB invalidation commands are necessary. The superfluous IOMMU
IOTLB invalidations hurt run-time performance.
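The command-count arithmetic above can be sketched in a small userspace C
program (illustrative only, not kernel code; the function names here are
made up for the example):

```c
#include <assert.h>

/*
 * Count invalidation commands for devices sharing one sva domain,
 * where ndevs[i] is the number of devices behind IOMMU i.
 */

/* Per-device scheme: one IOTLB + one device-IOTLB command per device. */
static int count_per_device_cmds(const int *ndevs, int niommus)
{
	int cmds = 0;

	for (int i = 0; i < niommus; i++)
		cmds += 2 * ndevs[i];
	return cmds;
}

/* Per-IOMMU scheme: one IOTLB per IOMMU, one device-IOTLB per device. */
static int count_per_iommu_cmds(const int *ndevs, int niommus)
{
	int cmds = 0;

	for (int i = 0; i < niommus; i++)
		cmds += 1 + ndevs[i];
	return cmds;
}
```

With four devices behind one IOMMU this gives 8 commands today versus the
5 that are actually needed; the device-IOTLB count is unchanged because
each device still caches its own translations.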

Although the goal could be achieved straightforwardly in the current VT-d
driver, some refactoring work is considered necessary before landing the
patch that solves the problem of redundant IOTLB invalidations:

1) The VT-d driver uses different structures to keep attached device info.
For the default domain, it uses struct dev_pasid_info and some related
fields of struct dmar_domain. For the sva domain, it uses struct intel_svm
and struct intel_svm_dev. The reason is that, previously, an sva domain
was not shared among devices behind different IOMMUs, and therefore
dmar_domain and its fields could not be used globally to keep all attached
device info. Once patch-set[1] is accepted, the sva domain becomes global
to the attached devices. Thus, retiring struct intel_svm/intel_svm_dev is
the main refactoring work of this patch-set.

2) Most of the logic of intel_svm_set_dev_pasid() can be covered by
intel_iommu_set_dev_pasid(). Refactor both intel_svm_set_dev_pasid()
and intel_iommu_set_dev_pasid(), and let the former call the latter for
the set_dev_pasid operation, to avoid duplicated code.

3) Last but not least, adding a struct mmu_notifier to struct iommu_domain
is proposed. This is a change to the IOMMU core, which helps centralize the
info required by sva in the sva domain and therefore helps clean up the
code in IOMMU drivers.

This patchset is available on the per-iommu_IOTLB_invalidation branch on
GitHub[2].

[1]: https://lore.kernel.org/linux-iommu/[email protected]/
[2]: https://github.com/TinaZhangZW/linux/tree/per-iommu_IOTLB_invalidation

Tina Zhang (12):
iommu/vt-d: Retire the treatment for revoking PASIDs with pending
pgfaults
iommu/vt-d: Remove initialization for dynamically heap-allocated
rcu_head
iommu/vt-d: Retire intel_svm_bind_mm()
iommu/vt-d: Make dev_to_intel_iommu() helper global
iommu/vt-d: Retire struct intel_svm_dev
iommu: Add mmu_notifier to sva domain
iommu/vt-d: Retire struct intel_svm
iommu/vt-d: Use RCU for dev_pasids list updates in
set/remove_dev_pasid()
iommu/vt-d: Refactor intel_iommu_set_dev_pasid()
iommu/vt-d: Refactor intel_iommu_remove_dev_pasid()
iommu/vt-d: Use intel_iommu_set_dev_pasid() for sva domain
iommu/vt-d: Remove superfluous IOMMU IOTLB invalidations

drivers/iommu/intel/iommu.c | 91 +++++++++----
drivers/iommu/intel/iommu.h | 30 ++---
drivers/iommu/intel/svm.c | 261 ++++++++----------------------------
include/linux/iommu.h | 2 +
4 files changed, 128 insertions(+), 256 deletions(-)

--
2.39.3


2023-10-17 03:21:45

by Zhang, Tina

Subject: [RFC PATCH 01/12] iommu/vt-d: Retire the treatment for revoking PASIDs with pending pgfaults

Revoking PASIDs with pending page faults is now handled by clearing the
PASID entry and draining the pending page faults in both hardware and
software. The temporary treatment can therefore be retired.

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/svm.c | 7 -------
1 file changed, 7 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 3c531af58658..9bf79cf88aec 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -387,13 +387,6 @@ void intel_svm_remove_dev_pasid(struct device *dev, u32 pasid)
if (svm->notifier.ops)
mmu_notifier_unregister(&svm->notifier, mm);
pasid_private_remove(svm->pasid);
- /*
- * We mandate that no page faults may be outstanding
- * for the PASID when intel_svm_unbind_mm() is called.
- * If that is not obeyed, subtle errors will happen.
- * Let's make them less subtle...
- */
- memset(svm, 0x6b, sizeof(*svm));
kfree(svm);
}
}
--
2.39.3

2023-10-17 03:21:45

by Zhang, Tina

Subject: [RFC PATCH 02/12] iommu/vt-d: Remove initialization for dynamically heap-allocated rcu_head

The rcu_head structures allocated dynamically on the heap don't need any
initialization. Therefore, remove the init_rcu_head() call.

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/svm.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 9bf79cf88aec..f9b1f13bd068 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -334,7 +334,6 @@ static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev,
sdev->iommu = iommu;
sdev->did = FLPT_DEFAULT_DID;
sdev->sid = PCI_DEVID(info->bus, info->devfn);
- init_rcu_head(&sdev->rcu);
if (info->ats_enabled) {
sdev->qdep = info->ats_qdep;
if (sdev->qdep >= QI_DEV_EIOTLB_MAX_INVS)
--
2.39.3

2023-10-17 03:21:56

by Zhang, Tina

Subject: [RFC PATCH 04/12] iommu/vt-d: Make dev_to_intel_iommu() helper global

Make dev_to_intel_iommu() global so that it can be used by other files.
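The helper being moved is a standard container_of() pattern: recover the
enclosing intel_iommu from a pointer to its embedded iommu_device member.
A minimal userspace sketch (simplified stand-in structs, not the real
kernel types):

```c
#include <assert.h>
#include <stddef.h>

/* Same arithmetic as the kernel macro: member pointer -> container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct iommu_device { int id; };

struct intel_iommu {
	int seq_id;
	struct iommu_device iommu;	/* embedded member, as in the kernel */
};

static struct intel_iommu *to_intel_iommu(struct iommu_device *iommu_dev)
{
	return container_of(iommu_dev, struct intel_iommu, iommu);
}
```

Because the conversion is pure pointer arithmetic, making the helper a
static inline in the header costs nothing at the call sites.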

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/iommu.c | 7 -------
drivers/iommu/intel/iommu.h | 6 ++++++
2 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 3685ba90ec88..eeb658d3bc6e 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3609,13 +3609,6 @@ void intel_iommu_shutdown(void)
up_write(&dmar_global_lock);
}

-static inline struct intel_iommu *dev_to_intel_iommu(struct device *dev)
-{
- struct iommu_device *iommu_dev = dev_to_iommu_device(dev);
-
- return container_of(iommu_dev, struct intel_iommu, iommu);
-}
-
static ssize_t version_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 7dac94f62b4e..1e972e7edeca 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -842,6 +842,12 @@ void *alloc_pgtable_page(int node, gfp_t gfp);
void free_pgtable_page(void *vaddr);
void iommu_flush_write_buffer(struct intel_iommu *iommu);
struct intel_iommu *device_to_iommu(struct device *dev, u8 *bus, u8 *devfn);
+static inline struct intel_iommu *dev_to_intel_iommu(struct device *dev)
+{
+ struct iommu_device *iommu_dev = dev_to_iommu_device(dev);
+
+ return container_of(iommu_dev, struct intel_iommu, iommu);
+}

#ifdef CONFIG_INTEL_IOMMU_SVM
void intel_svm_check(struct intel_iommu *iommu);
--
2.39.3

2023-10-17 03:22:06

by Zhang, Tina

Subject: [RFC PATCH 05/12] iommu/vt-d: Retire struct intel_svm_dev

struct dev_pasid_info is used by the IOMMU domain to keep the PASID info
of attached devices. For the sva domain, a separate structure, struct
intel_svm_dev, keeps the attached device info. Instead of using two
structures to keep attached device info separately, the sva domain should
use struct dev_pasid_info to centralize the info of attached devices. To
achieve this, the rcu/sid/qdep fields are moved from struct intel_svm_dev
into struct dev_pasid_info, and the sva domain then switches to
dev_pasid_info to keep attached device info. In this way, struct
intel_svm_dev gets retired.

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/iommu.h | 11 +----
drivers/iommu/intel/svm.c | 94 ++++++++++++++++++-------------------
2 files changed, 49 insertions(+), 56 deletions(-)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 1e972e7edeca..bd7210980fb2 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -722,6 +722,8 @@ struct dev_pasid_info {
struct list_head link_domain; /* link to domain siblings */
struct device *dev;
ioasid_t pasid;
+ struct rcu_head rcu;
+ u16 sid, qdep;
};

static inline void __iommu_flush_cache(
@@ -859,15 +861,6 @@ struct iommu_domain *intel_svm_domain_alloc(void);
void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid);
void intel_drain_pasid_prq(struct device *dev, u32 pasid);

-struct intel_svm_dev {
- struct list_head list;
- struct rcu_head rcu;
- struct device *dev;
- struct intel_iommu *iommu;
- u16 did;
- u16 sid, qdep;
-};
-
struct intel_svm {
struct mmu_notifier notifier;
struct mm_struct *mm;
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 03406395ac5b..e0373586811e 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -44,21 +44,21 @@ static void *pasid_private_find(ioasid_t pasid)
return xa_load(&pasid_private_array, pasid);
}

-static struct intel_svm_dev *
-svm_lookup_device_by_dev(struct intel_svm *svm, struct device *dev)
+static struct dev_pasid_info *
+svm_lookup_dev_pasid_info_by_dev(struct intel_svm *svm, struct device *dev)
{
- struct intel_svm_dev *sdev = NULL, *t;
+ struct dev_pasid_info *dev_pasid = NULL, *t;

rcu_read_lock();
- list_for_each_entry_rcu(t, &svm->devs, list) {
+ list_for_each_entry_rcu(t, &svm->devs, link_domain) {
if (t->dev == dev) {
- sdev = t;
+ dev_pasid = t;
break;
}
}
rcu_read_unlock();

- return sdev;
+ return dev_pasid;
}

int intel_svm_enable_prq(struct intel_iommu *iommu)
@@ -170,27 +170,28 @@ void intel_svm_check(struct intel_iommu *iommu)
}

static void __flush_svm_range_dev(struct intel_svm *svm,
- struct intel_svm_dev *sdev,
+ struct dev_pasid_info *dev_pasid,
unsigned long address,
unsigned long pages, int ih)
{
- struct device_domain_info *info = dev_iommu_priv_get(sdev->dev);
+ struct device_domain_info *info = dev_iommu_priv_get(dev_pasid->dev);
+ struct intel_iommu *iommu = dev_to_intel_iommu(dev_pasid->dev);

if (WARN_ON(!pages))
return;

- qi_flush_piotlb(sdev->iommu, sdev->did, svm->pasid, address, pages, ih);
+ qi_flush_piotlb(iommu, FLPT_DEFAULT_DID, svm->pasid, address, pages, ih);
if (info->ats_enabled) {
- qi_flush_dev_iotlb_pasid(sdev->iommu, sdev->sid, info->pfsid,
- svm->pasid, sdev->qdep, address,
+ qi_flush_dev_iotlb_pasid(iommu, dev_pasid->sid, info->pfsid,
+ svm->pasid, dev_pasid->qdep, address,
order_base_2(pages));
quirk_extra_dev_tlb_flush(info, address, order_base_2(pages),
- svm->pasid, sdev->qdep);
+ svm->pasid, dev_pasid->qdep);
}
}

static void intel_flush_svm_range_dev(struct intel_svm *svm,
- struct intel_svm_dev *sdev,
+ struct dev_pasid_info *dev_pasid,
unsigned long address,
unsigned long pages, int ih)
{
@@ -200,7 +201,7 @@ static void intel_flush_svm_range_dev(struct intel_svm *svm,
unsigned long end = ALIGN(address + (pages << VTD_PAGE_SHIFT), align);

while (start < end) {
- __flush_svm_range_dev(svm, sdev, start, align >> VTD_PAGE_SHIFT, ih);
+ __flush_svm_range_dev(svm, dev_pasid, start, align >> VTD_PAGE_SHIFT, ih);
start += align;
}
}
@@ -208,11 +209,11 @@ static void intel_flush_svm_range_dev(struct intel_svm *svm,
static void intel_flush_svm_range(struct intel_svm *svm, unsigned long address,
unsigned long pages, int ih)
{
- struct intel_svm_dev *sdev;
+ struct dev_pasid_info *dev_pasid;

rcu_read_lock();
- list_for_each_entry_rcu(sdev, &svm->devs, list)
- intel_flush_svm_range_dev(svm, sdev, address, pages, ih);
+ list_for_each_entry_rcu(dev_pasid, &svm->devs, link_domain)
+ intel_flush_svm_range_dev(svm, dev_pasid, address, pages, ih);
rcu_read_unlock();
}

@@ -230,7 +231,8 @@ static void intel_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
- struct intel_svm_dev *sdev;
+ struct dev_pasid_info *dev_pasid;
+ struct intel_iommu *iommu;

/* This might end up being called from exit_mmap(), *before* the page
* tables are cleared. And __mmu_notifier_release() will delete us from
@@ -245,9 +247,11 @@ static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
* *has* to handle gracefully without affecting other processes.
*/
rcu_read_lock();
- list_for_each_entry_rcu(sdev, &svm->devs, list)
- intel_pasid_tear_down_entry(sdev->iommu, sdev->dev,
+ list_for_each_entry_rcu(dev_pasid, &svm->devs, link_domain) {
+ iommu = dev_to_intel_iommu(dev_pasid->dev);
+ intel_pasid_tear_down_entry(iommu, dev_pasid->dev,
svm->pasid, true);
+ }
rcu_read_unlock();

}
@@ -257,11 +261,11 @@ static const struct mmu_notifier_ops intel_mmuops = {
.arch_invalidate_secondary_tlbs = intel_arch_invalidate_secondary_tlbs,
};

-static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
+static int pasid_to_dev_pasid_info(struct device *dev, unsigned int pasid,
struct intel_svm **rsvm,
- struct intel_svm_dev **rsdev)
+ struct dev_pasid_info **rsdev_pasid_info)
{
- struct intel_svm_dev *sdev = NULL;
+ struct dev_pasid_info *dev_pasid = NULL;
struct intel_svm *svm;

if (pasid == IOMMU_PASID_INVALID || pasid >= PASID_MAX)
@@ -280,11 +284,11 @@ static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
*/
if (WARN_ON(list_empty(&svm->devs)))
return -EINVAL;
- sdev = svm_lookup_device_by_dev(svm, dev);
+ dev_pasid = svm_lookup_dev_pasid_info_by_dev(svm, dev);

out:
*rsvm = svm;
- *rsdev = sdev;
+ *rsdev_pasid_info = dev_pasid;

return 0;
}
@@ -295,7 +299,7 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
struct mm_struct *mm = domain->mm;
- struct intel_svm_dev *sdev;
+ struct dev_pasid_info *dev_pasid;
struct intel_svm *svm;
unsigned long sflags;
int ret = 0;
@@ -325,20 +329,18 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
}
}

- sdev = kzalloc(sizeof(*sdev), GFP_KERNEL);
- if (!sdev) {
+ dev_pasid = kzalloc(sizeof(*dev_pasid), GFP_KERNEL);
+ if (!dev_pasid) {
ret = -ENOMEM;
goto free_svm;
}

- sdev->dev = dev;
- sdev->iommu = iommu;
- sdev->did = FLPT_DEFAULT_DID;
- sdev->sid = PCI_DEVID(info->bus, info->devfn);
+ dev_pasid->dev = dev;
+ dev_pasid->sid = PCI_DEVID(info->bus, info->devfn);
if (info->ats_enabled) {
- sdev->qdep = info->ats_qdep;
- if (sdev->qdep >= QI_DEV_EIOTLB_MAX_INVS)
- sdev->qdep = 0;
+ dev_pasid->qdep = info->ats_qdep;
+ if (dev_pasid->qdep >= QI_DEV_EIOTLB_MAX_INVS)
+ dev_pasid->qdep = 0;
}

/* Setup the pasid table: */
@@ -346,14 +348,14 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, pasid,
FLPT_DEFAULT_DID, sflags);
if (ret)
- goto free_sdev;
+ goto free_dev_pasid;

- list_add_rcu(&sdev->list, &svm->devs);
+ list_add_rcu(&dev_pasid->link_domain, &svm->devs);

return 0;

-free_sdev:
- kfree(sdev);
+free_dev_pasid:
+ kfree(dev_pasid);
free_svm:
if (list_empty(&svm->devs)) {
mmu_notifier_unregister(&svm->notifier, mm);
@@ -366,26 +368,24 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,

void intel_svm_remove_dev_pasid(struct device *dev, u32 pasid)
{
- struct intel_svm_dev *sdev;
+ struct dev_pasid_info *dev_pasid;
struct intel_iommu *iommu;
struct intel_svm *svm;
- struct mm_struct *mm;

iommu = device_to_iommu(dev, NULL, NULL);
if (!iommu)
return;

- if (pasid_to_svm_sdev(dev, pasid, &svm, &sdev))
+ if (pasid_to_dev_pasid_info(dev, pasid, &svm, &dev_pasid))
return;
- mm = svm->mm;

- if (sdev) {
- list_del_rcu(&sdev->list);
- kfree_rcu(sdev, rcu);
+ if (dev_pasid) {
+ list_del_rcu(&dev_pasid->link_domain);
+ kfree_rcu(dev_pasid, rcu);

if (list_empty(&svm->devs)) {
if (svm->notifier.ops)
- mmu_notifier_unregister(&svm->notifier, mm);
+ mmu_notifier_unregister(&svm->notifier, svm->mm);
pasid_private_remove(svm->pasid);
kfree(svm);
}
--
2.39.3

2023-10-17 03:22:14

by Zhang, Tina

Subject: [RFC PATCH 06/12] iommu: Add mmu_notifier to sva domain

Devices attached to a shared virtual addressing (SVA) domain are allowed
to use the same virtual addresses as the processor; this functionality is
called shared virtual memory. When shared virtual memory is in use, it is
the sva domain's responsibility to keep the device TLB caches and the CPU
caches in sync. Hence, add an mmu_notifier to the sva domain.

Signed-off-by: Tina Zhang <[email protected]>
---
include/linux/iommu.h | 2 ++
1 file changed, 2 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 19b5ae2303ff..afb566230427 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -7,6 +7,7 @@
#ifndef __LINUX_IOMMU_H
#define __LINUX_IOMMU_H

+#include <linux/mmu_notifier.h>
#include <linux/scatterlist.h>
#include <linux/device.h>
#include <linux/types.h>
@@ -114,6 +115,7 @@ struct iommu_domain {
* protected by iommu_sva_lock.
*/
struct list_head next;
+ struct mmu_notifier notifier;
};
};
};
--
2.39.3

2023-10-17 03:22:33

by Zhang, Tina

Subject: [RFC PATCH 07/12] iommu/vt-d: Retire struct intel_svm

The sva domain can be shared among devices behind different IOMMUs, and
therefore the sva domain should centrally keep the info needed for shared
virtual memory to avoid duplicated code.

Chain the attached devices onto the sva domain's dev_pasids list, and
retire struct intel_svm.
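The per-domain list that replaces the intel_svm lookup can be sketched in
plain userspace C. Plain pointers stand in for the kernel's RCU-protected
list, and the types are simplified stand-ins, not the real definitions:

```c
#include <assert.h>
#include <stddef.h>

struct dev_pasid_info {
	int dev;			/* stand-in for struct device * */
	unsigned int pasid;
	struct dev_pasid_info *next;
};

struct dmar_domain {
	struct dev_pasid_info *dev_pasids;	/* attached-device list head */
};

/* Attach a device: prepend its dev_pasid_info onto the domain's list. */
static void domain_attach(struct dmar_domain *d, struct dev_pasid_info *p)
{
	p->next = d->dev_pasids;
	d->dev_pasids = p;
}

/* Models domain_lookup_dev_pasid_info_by_dev(): walk and match on dev. */
static struct dev_pasid_info *
domain_lookup_by_dev(struct dmar_domain *d, int dev)
{
	for (struct dev_pasid_info *t = d->dev_pasids; t; t = t->next)
		if (t->dev == dev)
			return t;
	return NULL;
}
```

With the list hanging off the domain itself, the pasid_private_array
side-table and its lookups become unnecessary.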

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/iommu.h | 7 --
drivers/iommu/intel/svm.c | 152 ++++++++++--------------------------
2 files changed, 42 insertions(+), 117 deletions(-)

diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index bd7210980fb2..a9e9301b2713 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -860,13 +860,6 @@ int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
struct iommu_domain *intel_svm_domain_alloc(void);
void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid);
void intel_drain_pasid_prq(struct device *dev, u32 pasid);
-
-struct intel_svm {
- struct mmu_notifier notifier;
- struct mm_struct *mm;
- u32 pasid;
- struct list_head devs;
-};
#else
static inline void intel_svm_check(struct intel_iommu *iommu) {}
static inline void intel_drain_pasid_prq(struct device *dev, u32 pasid) {}
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index e0373586811e..fd61fbf7593a 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -27,30 +27,13 @@

static irqreturn_t prq_event_thread(int irq, void *d);

-static DEFINE_XARRAY_ALLOC(pasid_private_array);
-static int pasid_private_add(ioasid_t pasid, void *priv)
-{
- return xa_alloc(&pasid_private_array, &pasid, priv,
- XA_LIMIT(pasid, pasid), GFP_ATOMIC);
-}
-
-static void pasid_private_remove(ioasid_t pasid)
-{
- xa_erase(&pasid_private_array, pasid);
-}
-
-static void *pasid_private_find(ioasid_t pasid)
-{
- return xa_load(&pasid_private_array, pasid);
-}
-
static struct dev_pasid_info *
-svm_lookup_dev_pasid_info_by_dev(struct intel_svm *svm, struct device *dev)
+domain_lookup_dev_pasid_info_by_dev(struct dmar_domain *domain, struct device *dev)
{
struct dev_pasid_info *dev_pasid = NULL, *t;

rcu_read_lock();
- list_for_each_entry_rcu(t, &svm->devs, link_domain) {
+ list_for_each_entry_rcu(t, &domain->dev_pasids, link_domain) {
if (t->dev == dev) {
dev_pasid = t;
break;
@@ -169,28 +152,29 @@ void intel_svm_check(struct intel_iommu *iommu)
iommu->flags |= VTD_FLAG_SVM_CAPABLE;
}

-static void __flush_svm_range_dev(struct intel_svm *svm,
+static void __flush_svm_range_dev(struct dmar_domain *domain,
struct dev_pasid_info *dev_pasid,
unsigned long address,
unsigned long pages, int ih)
{
struct device_domain_info *info = dev_iommu_priv_get(dev_pasid->dev);
struct intel_iommu *iommu = dev_to_intel_iommu(dev_pasid->dev);
+ u32 pasid = mm_get_enqcmd_pasid(domain->domain.mm);

if (WARN_ON(!pages))
return;

- qi_flush_piotlb(iommu, FLPT_DEFAULT_DID, svm->pasid, address, pages, ih);
+ qi_flush_piotlb(iommu, FLPT_DEFAULT_DID, pasid, address, pages, ih);
if (info->ats_enabled) {
qi_flush_dev_iotlb_pasid(iommu, dev_pasid->sid, info->pfsid,
- svm->pasid, dev_pasid->qdep, address,
+ pasid, dev_pasid->qdep, address,
order_base_2(pages));
quirk_extra_dev_tlb_flush(info, address, order_base_2(pages),
- svm->pasid, dev_pasid->qdep);
+ pasid, dev_pasid->qdep);
}
}

-static void intel_flush_svm_range_dev(struct intel_svm *svm,
+static void intel_flush_svm_range_dev(struct dmar_domain *domain,
struct dev_pasid_info *dev_pasid,
unsigned long address,
unsigned long pages, int ih)
@@ -201,19 +185,19 @@ static void intel_flush_svm_range_dev(struct intel_svm *svm,
unsigned long end = ALIGN(address + (pages << VTD_PAGE_SHIFT), align);

while (start < end) {
- __flush_svm_range_dev(svm, dev_pasid, start, align >> VTD_PAGE_SHIFT, ih);
+ __flush_svm_range_dev(domain, dev_pasid, start, align >> VTD_PAGE_SHIFT, ih);
start += align;
}
}

-static void intel_flush_svm_range(struct intel_svm *svm, unsigned long address,
+static void intel_flush_svm_range(struct dmar_domain *domain, unsigned long address,
unsigned long pages, int ih)
{
struct dev_pasid_info *dev_pasid;

rcu_read_lock();
- list_for_each_entry_rcu(dev_pasid, &svm->devs, link_domain)
- intel_flush_svm_range_dev(svm, dev_pasid, address, pages, ih);
+ list_for_each_entry_rcu(dev_pasid, &domain->dev_pasids, link_domain)
+ intel_flush_svm_range_dev(domain, dev_pasid, address, pages, ih);
rcu_read_unlock();
}

@@ -222,15 +206,15 @@ static void intel_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start, unsigned long end)
{
- struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
+ struct iommu_domain *domain = container_of(mn, struct iommu_domain, notifier);

- intel_flush_svm_range(svm, start,
+ intel_flush_svm_range(to_dmar_domain(domain), start,
(end - start + PAGE_SIZE - 1) >> VTD_PAGE_SHIFT, 0);
}

static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
{
- struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
+ struct iommu_domain *domain = container_of(mn, struct iommu_domain, notifier);
struct dev_pasid_info *dev_pasid;
struct intel_iommu *iommu;

@@ -247,10 +231,10 @@ static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
* *has* to handle gracefully without affecting other processes.
*/
rcu_read_lock();
- list_for_each_entry_rcu(dev_pasid, &svm->devs, link_domain) {
+ list_for_each_entry_rcu(dev_pasid, &to_dmar_domain(domain)->dev_pasids, link_domain) {
iommu = dev_to_intel_iommu(dev_pasid->dev);
intel_pasid_tear_down_entry(iommu, dev_pasid->dev,
- svm->pasid, true);
+ mm_get_enqcmd_pasid(domain->mm), true);
}
rcu_read_unlock();

@@ -261,78 +245,32 @@ static const struct mmu_notifier_ops intel_mmuops = {
.arch_invalidate_secondary_tlbs = intel_arch_invalidate_secondary_tlbs,
};

-static int pasid_to_dev_pasid_info(struct device *dev, unsigned int pasid,
- struct intel_svm **rsvm,
- struct dev_pasid_info **rsdev_pasid_info)
-{
- struct dev_pasid_info *dev_pasid = NULL;
- struct intel_svm *svm;
-
- if (pasid == IOMMU_PASID_INVALID || pasid >= PASID_MAX)
- return -EINVAL;
-
- svm = pasid_private_find(pasid);
- if (IS_ERR(svm))
- return PTR_ERR(svm);
-
- if (!svm)
- goto out;
-
- /*
- * If we found svm for the PASID, there must be at least one device
- * bond.
- */
- if (WARN_ON(list_empty(&svm->devs)))
- return -EINVAL;
- dev_pasid = svm_lookup_dev_pasid_info_by_dev(svm, dev);
-
-out:
- *rsvm = svm;
- *rsdev_pasid_info = dev_pasid;
-
- return 0;
-}
-
static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid)
{
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
struct device_domain_info *info = dev_iommu_priv_get(dev);
struct intel_iommu *iommu = info->iommu;
struct mm_struct *mm = domain->mm;
struct dev_pasid_info *dev_pasid;
- struct intel_svm *svm;
unsigned long sflags;
int ret = 0;

- svm = pasid_private_find(pasid);
- if (!svm) {
- svm = kzalloc(sizeof(*svm), GFP_KERNEL);
- if (!svm)
- return -ENOMEM;
-
- svm->pasid = pasid;
- svm->mm = mm;
- INIT_LIST_HEAD_RCU(&svm->devs);
-
- svm->notifier.ops = &intel_mmuops;
- ret = mmu_notifier_register(&svm->notifier, mm);
- if (ret) {
- kfree(svm);
- return ret;
- }
-
- ret = pasid_private_add(svm->pasid, svm);
- if (ret) {
- mmu_notifier_unregister(&svm->notifier, mm);
- kfree(svm);
+ /*
+ * The sva domain can be shared among multiple devices. Make sure
+ * to add the notifier only once to a sva domain.
+ */
+ if (domain->notifier.ops != &intel_mmuops) {
+ domain->notifier.ops = &intel_mmuops;
+ ret = mmu_notifier_register(&domain->notifier, mm);
+ if (ret)
return ret;
- }
}

dev_pasid = kzalloc(sizeof(*dev_pasid), GFP_KERNEL);
if (!dev_pasid) {
ret = -ENOMEM;
- goto free_svm;
+ goto out_unregister;
}

dev_pasid->dev = dev;
@@ -348,46 +286,39 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, pasid,
FLPT_DEFAULT_DID, sflags);
if (ret)
- goto free_dev_pasid;
+ goto out_free_dev_pasid;

- list_add_rcu(&dev_pasid->link_domain, &svm->devs);
+ list_add_rcu(&dev_pasid->link_domain, &dmar_domain->dev_pasids);

return 0;

-free_dev_pasid:
+out_free_dev_pasid:
kfree(dev_pasid);
-free_svm:
- if (list_empty(&svm->devs)) {
- mmu_notifier_unregister(&svm->notifier, mm);
- pasid_private_remove(pasid);
- kfree(svm);
- }
+out_unregister:
+ if (list_empty(&dmar_domain->dev_pasids))
+ mmu_notifier_unregister(&domain->notifier, mm);

return ret;
}

void intel_svm_remove_dev_pasid(struct device *dev, u32 pasid)
{
+ struct iommu_domain *domain;
struct dev_pasid_info *dev_pasid;
- struct intel_iommu *iommu;
- struct intel_svm *svm;
-
- iommu = device_to_iommu(dev, NULL, NULL);
- if (!iommu)
- return;

- if (pasid_to_dev_pasid_info(dev, pasid, &svm, &dev_pasid))
+ domain = iommu_get_domain_for_dev_pasid(dev, pasid,
+ IOMMU_DOMAIN_SVA);
+ if (WARN_ON_ONCE(IS_ERR_OR_NULL(domain)))
return;

+ dev_pasid = domain_lookup_dev_pasid_info_by_dev(to_dmar_domain(domain), dev);
if (dev_pasid) {
list_del_rcu(&dev_pasid->link_domain);
kfree_rcu(dev_pasid, rcu);

- if (list_empty(&svm->devs)) {
- if (svm->notifier.ops)
- mmu_notifier_unregister(&svm->notifier, svm->mm);
- pasid_private_remove(svm->pasid);
- kfree(svm);
+ if (list_empty(&to_dmar_domain(domain)->dev_pasids)) {
+ if (domain->notifier.ops)
+ mmu_notifier_unregister(&domain->notifier, domain->mm);
}
}
}
@@ -803,6 +734,7 @@ struct iommu_domain *intel_svm_domain_alloc(void)
if (!domain)
return NULL;
domain->domain.ops = &intel_svm_domain_ops;
+ INIT_LIST_HEAD(&domain->dev_pasids);

return &domain->domain;
}
--
2.39.3

2023-10-17 03:22:51

by Zhang, Tina

Subject: [RFC PATCH 08/12] iommu/vt-d: Use RCU for dev_pasids list updates in set/remove_dev_pasid()

Extend intel_iommu_remove_dev_pasid() and intel_iommu_set_dev_pasid() to
support updating the dev_pasids list concurrently with readers.

For default domain operations, accesses to the dev_pasids list are
protected by domain->lock, so all reads and writes from default domain
operations are performed sequentially. To extend
intel_iommu_set/remove_dev_pasid() so that they can update the dev_pasids
list concurrently with multiple readers (which is required by the sva
domain), the RCU mechanism is used here.
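The update discipline can be sketched in userspace C: writers serialize on
a lock while modifying the list, and readers may walk it without the lock.
This is only a shape-alike under a spinlock, not an RCU implementation; in
the kernel, list_add_rcu()/list_del_rcu() plus kfree_rcu() are what make
the unlocked read side safe:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct node { int pasid; struct node *next; };

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;

static void writers_lock(void)
{
	while (atomic_flag_test_and_set(&lock_flag))
		;	/* spin: writers exclude each other, like domain->lock */
}

static void writers_unlock(void)
{
	atomic_flag_clear(&lock_flag);
}

static void add_locked(struct node **head, struct node *n)
{
	writers_lock();
	n->next = *head;
	*head = n;		/* kernel: list_add_rcu() */
	writers_unlock();
}

static struct node *del_locked(struct node **head, int pasid)
{
	struct node *n, **pp;

	writers_lock();
	for (pp = head; (n = *pp) != NULL; pp = &n->next) {
		if (n->pasid == pasid) {
			*pp = n->next;	/* kernel: list_del_rcu() */
			break;		/* caller frees with kfree_rcu() */
		}
	}
	writers_unlock();
	return n;
}
```

The key point the patch relies on is that removal and freeing are split:
the node is unlinked under the lock, but memory is reclaimed only after a
grace period, so concurrent readers never dereference freed memory.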

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/iommu.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index eeb658d3bc6e..fe063e1250fa 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4694,7 +4694,7 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
spin_lock_irqsave(&dmar_domain->lock, flags);
list_for_each_entry(curr, &dmar_domain->dev_pasids, link_domain) {
if (curr->dev == dev && curr->pasid == pasid) {
- list_del(&curr->link_domain);
+ list_del_rcu(&curr->link_domain);
dev_pasid = curr;
break;
}
@@ -4703,7 +4703,7 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
spin_unlock_irqrestore(&dmar_domain->lock, flags);

domain_detach_iommu(dmar_domain, iommu);
- kfree(dev_pasid);
+ kfree_rcu(dev_pasid, rcu);
out_tear_down:
intel_pasid_tear_down_entry(iommu, dev, pasid, false);
intel_drain_pasid_prq(dev, pasid);
@@ -4751,8 +4751,14 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,

dev_pasid->dev = dev;
dev_pasid->pasid = pasid;
+
+ /*
+ * Spin lock protects dev_pasids list from being updated concurrently with
+ * multiple updaters, while rcu ensures concurrency between one updater
+ * and multiple readers
+ */
spin_lock_irqsave(&dmar_domain->lock, flags);
- list_add(&dev_pasid->link_domain, &dmar_domain->dev_pasids);
+ list_add_rcu(&dev_pasid->link_domain, &dmar_domain->dev_pasids);
spin_unlock_irqrestore(&dmar_domain->lock, flags);

return 0;
--
2.39.3

2023-10-17 03:23:03

by Zhang, Tina

Subject: [RFC PATCH 03/12] iommu/vt-d: Retire intel_svm_bind_mm()

Refactor intel_svm_set_dev_pasid() by moving the logic from
intel_svm_bind_mm() into intel_svm_set_dev_pasid(), and retire
intel_svm_bind_mm().

No functional change intended.

The motivation is to let intel_iommu_set_dev_pasid() serve the
set_dev_pasid operation for both the sva domain and the default domain,
instead of duplicating code for them. In later patches, we will first do
some refactoring of intel_svm_set_dev_pasid(), and then extend
intel_iommu_set_dev_pasid() to support the sva domain. After that,
intel_svm_set_dev_pasid() will call intel_iommu_set_dev_pasid() for the
set_dev_pasid purpose.

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/svm.c | 14 +++-----------
1 file changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index f9b1f13bd068..03406395ac5b 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -289,10 +289,11 @@ static int pasid_to_svm_sdev(struct device *dev, unsigned int pasid,
return 0;
}

-static int intel_svm_bind_mm(struct intel_iommu *iommu, struct device *dev,
- struct iommu_domain *domain, ioasid_t pasid)
+static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
+ struct intel_iommu *iommu = info->iommu;
struct mm_struct *mm = domain->mm;
struct intel_svm_dev *sdev;
struct intel_svm *svm;
@@ -784,15 +785,6 @@ int intel_svm_page_response(struct device *dev,
return ret;
}

-static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
- struct device *dev, ioasid_t pasid)
-{
- struct device_domain_info *info = dev_iommu_priv_get(dev);
- struct intel_iommu *iommu = info->iommu;
-
- return intel_svm_bind_mm(iommu, dev, domain, pasid);
-}
-
static void intel_svm_domain_free(struct iommu_domain *domain)
{
kfree(to_dmar_domain(domain));
--
2.39.3

2023-10-17 03:23:06

by Zhang, Tina

Subject: [RFC PATCH 12/12] iommu/vt-d: Remove superfluous IOMMU IOTLB invalidations

Devices behind different IOMMUs can be bound to one sva domain. When a
range of an sva domain's address space is being invalidated, the VT-d
driver needs to issue IOMMU IOTLB and device-TLB invalidation commands so
that the IOMMU hardware and the attached devices invalidate their caches.

The current logic issues both the IOTLB invalidation command and the
device-TLB invalidation command per device, which leads to superfluous
IOTLB invalidations. For example, if four devices behind an IOMMU are
attached to one sva domain, then while handling
intel_invalidate_range(), four IOTLB invalidation commands and four
device-TLB invalidation commands are issued, although one IOTLB
invalidation command and four device-TLB invalidation commands would
suffice. This impacts run-time performance.

This patch removes the redundant IOMMU IOTLB invalidations by issuing the
IOMMU IOTLB invalidation command per iommu instead of per device.

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/svm.c | 56 +++++++++++++++++++--------------------
1 file changed, 27 insertions(+), 29 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index c9a703935908..f684b92a1241 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -135,32 +135,41 @@ void intel_svm_check(struct intel_iommu *iommu)
iommu->flags |= VTD_FLAG_SVM_CAPABLE;
}

-static void __flush_svm_range_dev(struct dmar_domain *domain,
- struct dev_pasid_info *dev_pasid,
+static void __flush_svm_range(struct iommu_domain *domain,
unsigned long address,
unsigned long pages, int ih)
{
- struct device_domain_info *info = dev_iommu_priv_get(dev_pasid->dev);
- struct intel_iommu *iommu = dev_to_intel_iommu(dev_pasid->dev);
- u32 pasid = mm_get_enqcmd_pasid(domain->domain.mm);
+ u32 pasid = mm_get_enqcmd_pasid(domain->mm);
+ struct device_domain_info *dev_info;
+ struct iommu_domain_info *iommu_info;
+ struct dev_pasid_info *dev_pasid;
+ struct intel_iommu *iommu;
+ unsigned long idx;

if (WARN_ON(!pages))
return;

- qi_flush_piotlb(iommu, FLPT_DEFAULT_DID, pasid, address, pages, ih);
- if (info->ats_enabled) {
- qi_flush_dev_iotlb_pasid(iommu, dev_pasid->sid, info->pfsid,
- pasid, dev_pasid->qdep, address,
- order_base_2(pages));
- quirk_extra_dev_tlb_flush(info, address, order_base_2(pages),
- pasid, dev_pasid->qdep);
+ rcu_read_lock();
+ xa_for_each(&to_dmar_domain(domain)->iommu_array, idx, iommu_info)
+ qi_flush_piotlb(iommu_info->iommu, FLPT_DEFAULT_DID,
+ pasid, address, pages, ih);
+
+ list_for_each_entry_rcu(dev_pasid, &to_dmar_domain(domain)->dev_pasids, link_domain) {
+ dev_info = dev_iommu_priv_get(dev_pasid->dev);
+ iommu = dev_to_intel_iommu(dev_pasid->dev);
+ if (dev_info->ats_enabled) {
+ qi_flush_dev_iotlb_pasid(iommu, dev_pasid->sid, dev_info->pfsid,
+ pasid, dev_pasid->qdep, address,
+ order_base_2(pages));
+ quirk_extra_dev_tlb_flush(dev_info, address, order_base_2(pages),
+ pasid, dev_pasid->qdep);
+ }
}
+ rcu_read_unlock();
}

-static void intel_flush_svm_range_dev(struct dmar_domain *domain,
- struct dev_pasid_info *dev_pasid,
- unsigned long address,
- unsigned long pages, int ih)
+static void intel_flush_svm_range(struct iommu_domain *domain, unsigned long address,
+ unsigned long pages, int ih)
{
unsigned long shift = ilog2(__roundup_pow_of_two(pages));
unsigned long align = (1ULL << (VTD_PAGE_SHIFT + shift));
@@ -168,22 +177,11 @@ static void intel_flush_svm_range_dev(struct dmar_domain *domain,
unsigned long end = ALIGN(address + (pages << VTD_PAGE_SHIFT), align);

while (start < end) {
- __flush_svm_range_dev(domain, dev_pasid, start, align >> VTD_PAGE_SHIFT, ih);
+ __flush_svm_range(domain, start, align >> VTD_PAGE_SHIFT, ih);
start += align;
}
}

-static void intel_flush_svm_range(struct dmar_domain *domain, unsigned long address,
- unsigned long pages, int ih)
-{
- struct dev_pasid_info *dev_pasid;
-
- rcu_read_lock();
- list_for_each_entry_rcu(dev_pasid, &domain->dev_pasids, link_domain)
- intel_flush_svm_range_dev(domain, dev_pasid, address, pages, ih);
- rcu_read_unlock();
-}
-
/* Pages have been freed at this point */
static void intel_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
struct mm_struct *mm,
@@ -191,7 +189,7 @@ static void intel_arch_invalidate_secondary_tlbs(struct mmu_notifier *mn,
{
struct iommu_domain *domain = container_of(mn, struct iommu_domain, notifier);

- intel_flush_svm_range(to_dmar_domain(domain), start,
+ intel_flush_svm_range(domain, start,
(end - start + PAGE_SIZE - 1) >> VTD_PAGE_SHIFT, 0);
}

--
2.39.3

2023-10-17 03:23:07

by Zhang, Tina

Subject: [RFC PATCH 09/12] iommu/vt-d: Refactor intel_iommu_set_dev_pasid()

Refactor the code in several aspects:
1) Add domain_type_is_sva() helper.
2) Extend domain_attach_iommu() to support sva domain.
3) Extend intel_iommu_set_dev_pasid() to support sva domain.
4) Make intel_iommu_set_dev_pasid() global so that it can be used by
others.

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/iommu.c | 58 +++++++++++++++++++++++++++++++------
drivers/iommu/intel/iommu.h | 2 ++
2 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index fe063e1250fa..57f6bbf33205 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -388,6 +388,11 @@ static inline int domain_type_is_si(struct dmar_domain *domain)
return domain->domain.type == IOMMU_DOMAIN_IDENTITY;
}

+static inline int domain_type_is_sva(struct dmar_domain *domain)
+{
+ return domain->domain.type == IOMMU_DOMAIN_SVA;
+}
+
static inline int domain_pfn_supported(struct dmar_domain *domain,
unsigned long pfn)
{
@@ -562,6 +567,14 @@ static unsigned long domain_super_pgsize_bitmap(struct dmar_domain *domain)
/* Some capabilities may be different across iommus */
static void domain_update_iommu_cap(struct dmar_domain *domain)
{
+ /*
+ * Capabilities for maintaining the I/O page table are unnecessary
+ * for an SVA domain, since the device shares the virtual page table
+ * with the processor and the processor owns it.
+ */
+ if (domain_type_is_sva(domain))
+ return;
+
domain_update_iommu_coherency(domain);
domain->iommu_superpage = domain_update_iommu_superpage(domain, NULL);

@@ -1798,14 +1811,18 @@ static int domain_attach_iommu(struct dmar_domain *domain,
return 0;
}

- ndomains = cap_ndoms(iommu->cap);
- num = find_first_zero_bit(iommu->domain_ids, ndomains);
- if (num >= ndomains) {
- pr_err("%s: No free domain ids\n", iommu->name);
- goto err_unlock;
+ if (!domain_type_is_sva(domain)) {
+ ndomains = cap_ndoms(iommu->cap);
+ num = find_first_zero_bit(iommu->domain_ids, ndomains);
+ if (num >= ndomains) {
+ pr_err("%s: No free domain ids\n", iommu->name);
+ goto err_unlock;
+ }
+ set_bit(num, iommu->domain_ids);
+ } else {
+ num = FLPT_DEFAULT_DID;
}

- set_bit(num, iommu->domain_ids);
info->refcnt = 1;
info->did = num;
info->iommu = iommu;
@@ -1821,7 +1838,8 @@ static int domain_attach_iommu(struct dmar_domain *domain,
return 0;

err_clear:
- clear_bit(info->did, iommu->domain_ids);
+ if (!domain_type_is_sva(domain))
+ clear_bit(info->did, iommu->domain_ids);
err_unlock:
spin_unlock(&iommu->lock);
kfree(info);
@@ -4064,6 +4082,14 @@ static int prepare_domain_attach_device(struct iommu_domain *domain,
struct intel_iommu *iommu;
int addr_width;

+ /*
+ * In the SVA case, there is no need to check the fields for maintaining
+ * the I/O page table, as the device shares the virtual page table with
+ * the processor and the processor owns it.
+ */
+ if (domain_type_is_sva(dmar_domain))
+ return 0;
+
iommu = device_to_iommu(dev, NULL, NULL);
if (!iommu)
return -ENODEV;
@@ -4685,7 +4711,7 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
* notification. Before consolidating that code into iommu core, let
* the intel sva code handle it.
*/
- if (domain->type == IOMMU_DOMAIN_SVA) {
+ if (domain_type_is_sva(dmar_domain)) {
intel_svm_remove_dev_pasid(dev, pasid);
goto out_tear_down;
}
@@ -4709,7 +4735,7 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
intel_drain_pasid_prq(dev, pasid);
}

-static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
+int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid)
{
struct device_domain_info *info = dev_iommu_priv_get(dev);
@@ -4733,6 +4759,16 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
if (!dev_pasid)
return -ENOMEM;

+ if (domain_type_is_sva(dmar_domain)) {
+ dev_pasid->sid = PCI_DEVID(info->bus, info->devfn);
+ init_rcu_head(&dev_pasid->rcu);
+ if (info->ats_enabled) {
+ dev_pasid->qdep = info->ats_qdep;
+ if (dev_pasid->qdep >= QI_DEV_EIOTLB_MAX_INVS)
+ dev_pasid->qdep = 0;
+ }
+ }
+
ret = domain_attach_iommu(dmar_domain, iommu);
if (ret)
goto out_free;
@@ -4743,6 +4779,10 @@ static int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
else if (dmar_domain->use_first_level)
ret = domain_setup_first_level(iommu, dmar_domain,
dev, pasid);
+ else if (domain_type_is_sva(dmar_domain))
+ ret = intel_pasid_setup_first_level(iommu, dev,
+ domain->mm->pgd, pasid, FLPT_DEFAULT_DID,
+ cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0);
else
ret = intel_pasid_setup_second_level(iommu, dmar_domain,
dev, pasid);
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index a9e9301b2713..b6dff0da9e8e 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -850,6 +850,8 @@ static inline struct intel_iommu *dev_to_intel_iommu(struct device *dev)

return container_of(iommu_dev, struct intel_iommu, iommu);
}
+int intel_iommu_set_dev_pasid(struct iommu_domain *domain,
+ struct device *dev, ioasid_t pasid);

#ifdef CONFIG_INTEL_IOMMU_SVM
void intel_svm_check(struct intel_iommu *iommu);
--
2.39.3

2023-10-17 03:23:17

by Zhang, Tina

Subject: [RFC PATCH 11/12] iommu/vt-d: Use intel_iommu_set_dev_pasid() for sva domain

Let intel_svm_set_dev_pasid() use intel_iommu_set_dev_pasid() for
set_dev_pasid operation.

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/svm.c | 44 +++++++--------------------------------
1 file changed, 8 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index 7f98c8acd04f..c9a703935908 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -231,13 +231,9 @@ static const struct mmu_notifier_ops intel_mmuops = {
static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
struct device *dev, ioasid_t pasid)
{
- struct dmar_domain *dmar_domain = to_dmar_domain(domain);
- struct device_domain_info *info = dev_iommu_priv_get(dev);
- struct intel_iommu *iommu = info->iommu;
struct mm_struct *mm = domain->mm;
- struct dev_pasid_info *dev_pasid;
- unsigned long sflags;
- int ret = 0;
+ struct dmar_domain *dmar_domain = to_dmar_domain(domain);
+ int ret;

/*
* The sva domain can be shared among multiple devices. Make sure
@@ -250,37 +246,11 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
return ret;
}

- dev_pasid = kzalloc(sizeof(*dev_pasid), GFP_KERNEL);
- if (!dev_pasid) {
- ret = -ENOMEM;
- goto out_unregister;
- }
-
- dev_pasid->dev = dev;
- dev_pasid->sid = PCI_DEVID(info->bus, info->devfn);
- if (info->ats_enabled) {
- dev_pasid->qdep = info->ats_qdep;
- if (dev_pasid->qdep >= QI_DEV_EIOTLB_MAX_INVS)
- dev_pasid->qdep = 0;
+ ret = intel_iommu_set_dev_pasid(domain, dev, pasid);
+ if (ret) {
+ if (list_empty(&dmar_domain->dev_pasids))
+ mmu_notifier_unregister(&domain->notifier, mm);
}
-
- /* Setup the pasid table: */
- sflags = cpu_feature_enabled(X86_FEATURE_LA57) ? PASID_FLAG_FL5LP : 0;
- ret = intel_pasid_setup_first_level(iommu, dev, mm->pgd, pasid,
- FLPT_DEFAULT_DID, sflags);
- if (ret)
- goto out_free_dev_pasid;
-
- list_add_rcu(&dev_pasid->link_domain, &dmar_domain->dev_pasids);
-
- return 0;
-
-out_free_dev_pasid:
- kfree(dev_pasid);
-out_unregister:
- if (list_empty(&dmar_domain->dev_pasids))
- mmu_notifier_unregister(&domain->notifier, mm);
-
return ret;
}

@@ -696,6 +666,8 @@ struct iommu_domain *intel_svm_domain_alloc(void)
return NULL;
domain->domain.ops = &intel_svm_domain_ops;
INIT_LIST_HEAD(&domain->dev_pasids);
+ spin_lock_init(&domain->lock);
+ xa_init(&domain->iommu_array);

return &domain->domain;
}
--
2.39.3

2023-10-17 03:23:21

by Zhang, Tina

Subject: [RFC PATCH 10/12] iommu/vt-d: Refactor intel_iommu_remove_dev_pasid()

Since the domain now centralizes the info required for the sva domain in
itself, intel_iommu_remove_dev_pasid() can handle the device-detach work
of the sva domain without delegating to intel_svm_remove_dev_pasid().

Signed-off-by: Tina Zhang <[email protected]>
---
drivers/iommu/intel/iommu.c | 16 +++++----------
drivers/iommu/intel/iommu.h | 4 ----
drivers/iommu/intel/svm.c | 39 -------------------------------------
3 files changed, 5 insertions(+), 54 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 57f6bbf33205..2aea5201de3d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4703,19 +4703,9 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)
unsigned long flags;

domain = iommu_get_domain_for_dev_pasid(dev, pasid, 0);
- if (WARN_ON_ONCE(!domain))
+ if (WARN_ON_ONCE(IS_ERR_OR_NULL(domain)))
goto out_tear_down;

- /*
- * The SVA implementation needs to handle its own stuffs like the mm
- * notification. Before consolidating that code into iommu core, let
- * the intel sva code handle it.
- */
- if (domain_type_is_sva(dmar_domain)) {
- intel_svm_remove_dev_pasid(dev, pasid);
- goto out_tear_down;
- }
-
dmar_domain = to_dmar_domain(domain);
spin_lock_irqsave(&dmar_domain->lock, flags);
list_for_each_entry(curr, &dmar_domain->dev_pasids, link_domain) {
@@ -4730,6 +4720,10 @@ static void intel_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid)

domain_detach_iommu(dmar_domain, iommu);
kfree_rcu(dev_pasid, rcu);
+ if (list_empty(&to_dmar_domain(domain)->dev_pasids)) {
+ if (domain->notifier.ops)
+ mmu_notifier_unregister(&domain->notifier, domain->mm);
+ }
out_tear_down:
intel_pasid_tear_down_entry(iommu, dev, pasid, false);
intel_drain_pasid_prq(dev, pasid);
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index b6dff0da9e8e..84e942a16666 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -860,7 +860,6 @@ int intel_svm_finish_prq(struct intel_iommu *iommu);
int intel_svm_page_response(struct device *dev, struct iommu_fault_event *evt,
struct iommu_page_response *msg);
struct iommu_domain *intel_svm_domain_alloc(void);
-void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid);
void intel_drain_pasid_prq(struct device *dev, u32 pasid);
#else
static inline void intel_svm_check(struct intel_iommu *iommu) {}
@@ -870,9 +869,6 @@ static inline struct iommu_domain *intel_svm_domain_alloc(void)
return NULL;
}

-static inline void intel_svm_remove_dev_pasid(struct device *dev, ioasid_t pasid)
-{
-}
#endif

#ifdef CONFIG_INTEL_IOMMU_DEBUGFS
diff --git a/drivers/iommu/intel/svm.c b/drivers/iommu/intel/svm.c
index fd61fbf7593a..7f98c8acd04f 100644
--- a/drivers/iommu/intel/svm.c
+++ b/drivers/iommu/intel/svm.c
@@ -27,23 +27,6 @@

static irqreturn_t prq_event_thread(int irq, void *d);

-static struct dev_pasid_info *
-domain_lookup_dev_pasid_info_by_dev(struct dmar_domain *domain, struct device *dev)
-{
- struct dev_pasid_info *dev_pasid = NULL, *t;
-
- rcu_read_lock();
- list_for_each_entry_rcu(t, &domain->dev_pasids, link_domain) {
- if (t->dev == dev) {
- dev_pasid = t;
- break;
- }
- }
- rcu_read_unlock();
-
- return dev_pasid;
-}
-
int intel_svm_enable_prq(struct intel_iommu *iommu)
{
struct iopf_queue *iopfq;
@@ -301,28 +284,6 @@ static int intel_svm_set_dev_pasid(struct iommu_domain *domain,
return ret;
}

-void intel_svm_remove_dev_pasid(struct device *dev, u32 pasid)
-{
- struct iommu_domain *domain;
- struct dev_pasid_info *dev_pasid;
-
- domain = iommu_get_domain_for_dev_pasid(dev, pasid,
- IOMMU_DOMAIN_SVA);
- if (WARN_ON_ONCE(IS_ERR_OR_NULL(domain)))
- return;
-
- dev_pasid = domain_lookup_dev_pasid_info_by_dev(to_dmar_domain(domain), dev);
- if (dev_pasid) {
- list_del_rcu(&dev_pasid->link_domain);
- kfree_rcu(dev_pasid, rcu);
-
- if (list_empty(&to_dmar_domain(domain)->dev_pasids)) {
- if (domain->notifier.ops)
- mmu_notifier_unregister(&domain->notifier, domain->mm);
- }
- }
-}
-
/* Page request queue descriptor */
struct page_req_dsc {
union {
--
2.39.3

2023-10-23 18:36:08

by Jason Gunthorpe

Subject: Re: [RFC PATCH 06/12] iommu: Add mmu_notifier to sva domain


On Tue, Oct 17, 2023 at 11:20:39AM +0800, Tina Zhang wrote:
> Devices attached to shared virtual addressing (SVA) domain are allowed to
> use the same virtual addresses with processor, and this functionality is
> called shared virtual memory. When shared virtual memory is being used,
> it's the sva domain's responsibility to keep device TLB cache and the CPU
> cache in sync. Hence add mmu_notifier to sva domain.
>
> Signed-off-by: Tina Zhang <[email protected]>
> ---
> include/linux/iommu.h | 2 ++
> 1 file changed, 2 insertions(+)

You should look at how arm smmuv3 ended up after I went over it to
make similar changes, I think you should take this patch

https://lore.kernel.org/linux-iommu/[email protected]/

into this series (maybe drop the arm part)

And copy the same basic structure for how the mmu notifier works.

It would also be nice if a lot of the 'if_sva' tests could be avoided;
smmu didn't end up with those..

In the guts of the pasid handling sva shouldn't be special beyond a
different source for the pgd.

Jason

2023-11-01 11:59:06

by Zhang, Tina

Subject: RE: [RFC PATCH 06/12] iommu: Add mmu_notifier to sva domain

Hi Jason,

> -----Original Message-----
> From: Jason Gunthorpe <[email protected]>
> Sent: Tuesday, October 24, 2023 2:35 AM
> To: Zhang, Tina <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> David Woodhouse <[email protected]>; Lu Baolu
> <[email protected]>; Joerg Roedel <[email protected]>; Tian, Kevin
> <[email protected]>
> Subject: Re: [RFC PATCH 06/12] iommu: Add mmu_notifier to sva domain
>
>
> On Tue, Oct 17, 2023 at 11:20:39AM +0800, Tina Zhang wrote:
> > Devices attached to shared virtual addressing (SVA) domain are allowed
> > to use the same virtual addresses with processor, and this
> > functionality is called shared virtual memory. When shared virtual
> > memory is being used, it's the sva domain's responsibility to keep
> > device TLB cache and the CPU cache in sync. Hence add mmu_notifier to sva
> domain.
> >
> > Signed-off-by: Tina Zhang <[email protected]>
> > ---
> > include/linux/iommu.h | 2 ++
> > 1 file changed, 2 insertions(+)
>
> You should look at how arm smmuv3 ended up after I went over it to make
> similar changes, I think you should take this patch
>
> https://lore.kernel.org/linux-iommu/20-v1-afbb86647bbd+5-
> [email protected]/
>
> into this series (maybe drop the arm part)
Good suggestion. The new domain_alloc_sva() callback allows drivers to fully initialize the sva domain, which is what we need. I'll take the domain_alloc_sva() part as a patch and include it in this patch-set. Thanks.

>
> And copy the same basic structure for how the mmu notifier works.
>
> It would also be nice if alot of the 'if_sva' tests could be avoided, smmu didn't
> end up with those..
Agree. With the help of the domain_alloc_sva() callback, I think most of the if_sva checks introduced by this RFC version can be removed, as we can reuse fields from dmar_domain (or add new ones if necessary) and initialize them in domain_alloc_sva().

Regards,
-Tina
>
> In the guts of the pasid handling sva shouldn't be special beyond a different
> source for the pgd.
>
> Jason