2023-06-25 15:27:38

by Jingqi Liu

Subject: [PATCH 0/5] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs

The original debugfs dumps all IOMMU page tables but has no PASID
support. It traverses all devices on the PCI bus and dumps the page
tables based on device domains, i.e. the traversal is done from a
software perspective.

This series dumps page tables by traversing root tables, context tables,
PASID directories and PASID tables, i.e. from a hardware perspective. By
specifying a source identifier and a PASID, it supports dumping either a
specified page table or all page tables, in legacy mode or scalable mode.

For a device that only supports legacy mode, specify the source
identifier; the root table and context table are searched to dump its
page table. Specifying a PASID is not supported in this case.

For a device that supports scalable mode, specify a
{source identifier, PASID} pair; the root table, context table and PASID
table are searched to dump its page table. If the PASID is not
specified, it defaults to RID2PASID.

To switch back to dumping all page tables, write "auto".

Examples are as follows:
1) Dump the page table of device "00:1f.0" that only supports legacy
mode.

$ echo 00:1f.0 | sudo tee \
    /sys/kernel/debug/iommu/intel/domain_translation_struct
$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
Device 0000:00:1f.0 @0x105407000
IOVA_PFN PML5E PML4E
0x0000000000000 | 0x0000000000000000 0x0000000105408003
0x0000000000001 | 0x0000000000000000 0x0000000105408003
0x0000000000002 | 0x0000000000000000 0x0000000105408003
0x0000000000003 | 0x0000000000000000 0x0000000105408003

PDPE PDE PTE
0x0000000105409003 0x000000010540a003 0x0000000000000003
0x0000000105409003 0x000000010540a003 0x0000000000001003
0x0000000105409003 0x000000010540a003 0x0000000000002003
0x0000000105409003 0x000000010540a003 0x0000000000003003

[...]

2) Dump the page table of device "00:0a.0" with pasid "2".

$ echo 00:0a.0,2 | sudo tee \
    /sys/kernel/debug/iommu/intel/domain_translation_struct
$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
Device 0000:00:0a.0 with pasid 2 @0x1083d7000
IOVA_PFN PML5E PML4E
0x0000000000000 | 0x0000000000000000 0x0000000106aaa003
0x0000000000001 | 0x0000000000000000 0x0000000106aaa003
0x0000000000002 | 0x0000000000000000 0x0000000106aaa003
0x0000000000003 | 0x0000000000000000 0x0000000106aaa003

PDPE PDE PTE
0x000000010a819003 0x000000010a7aa003 0x0000000129800003
0x000000010a819003 0x000000010a7aa003 0x0000000129801003
0x000000010a819003 0x000000010a7aa003 0x0000000129802003
0x000000010a819003 0x000000010a7aa003 0x0000000129803003

[...]

3) Dump all page tables:
$ echo "auto" | sudo tee \
    /sys/kernel/debug/iommu/intel/domain_translation_struct
$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
[...]

Device 0000:00:02.0 @0x103072000
IOVA_PFN PML5E PML4E
0x000000008d800 | 0x0000000000000000 0x0000000103073003
0x000000008d801 | 0x0000000000000000 0x0000000103073003

PDPE PDE PTE
0x0000000103074003 0x0000000103075003 0x000000008d800003
0x0000000103074003 0x0000000103075003 0x000000008d801003

[...]

Device 0000:00:0a.0 with pasid 2 @0x10a0b6000
IOVA_PFN PML5E PML4E
0x0000000000000 | 0x0000000000000000 0x00000001072d2003
0x0000000000001 | 0x0000000000000000 0x00000001072d2003

PDPE PDE PTE
0x0000000107d6e003 0x00000001161d4003 0x00000001bdc00003
0x0000000107d6e003 0x00000001161d4003 0x00000001bdc01003

[...]
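
To help read the dumps above, here is a minimal user-space sketch of how
one printed entry could be decoded. It assumes the second-stage entry
layout (bit 0 = read, bit 1 = write, bit 7 = large page, address in bits
51:12); first-stage entries use different flag meanings. All SKETCH_*
names, struct entry_info, decode_entry() and pfn_to_iova() are
hypothetical helpers, not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed second-stage entry layout; names are illustrative only. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PTE_READ   (1ULL << 0)
#define SKETCH_PTE_WRITE  (1ULL << 1)
#define SKETCH_PTE_LARGE  (1ULL << 7)
#define SKETCH_ADDR_MASK  0x000ffffffffff000ULL /* bits 51:12 */

struct entry_info {
	uint64_t addr;   /* physical address of next table or page */
	int readable;
	int writable;
	int large_page;
};

/* Split one dumped 64-bit entry into its address and flag bits. */
static struct entry_info decode_entry(uint64_t val)
{
	struct entry_info e;

	e.addr = val & SKETCH_ADDR_MASK;
	e.readable = !!(val & SKETCH_PTE_READ);
	e.writable = !!(val & SKETCH_PTE_WRITE);
	e.large_page = !!(val & SKETCH_PTE_LARGE);
	return e;
}

/* Convert the IOVA_PFN column back to a byte address (4 KiB pages). */
static uint64_t pfn_to_iova(uint64_t pfn)
{
	assert(pfn <= (UINT64_MAX >> SKETCH_PAGE_SHIFT));
	return pfn << SKETCH_PAGE_SHIFT;
}
```

Under these assumptions, the PML4E value 0x0000000105408003 from the
first example would decode to a readable and writable entry pointing at
0x105408000.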

Thanks,
Jingqi

Jingqi Liu (5):
iommu/vt-d: debugfs: Define domain_translation_struct file ops
iommu/vt-d: debugfs: Support specifying source identifier and PASID
iommu/vt-d: debugfs: Dump the corresponding page table of a pasid
iommu/vt-d: debugfs: Support dumping a specified page table
iommu/vt-d: debugfs: Dump entry pointing to huge page

drivers/iommu/intel/debugfs.c | 361 ++++++++++++++++++++++++++++++----
1 file changed, 326 insertions(+), 35 deletions(-)

--
2.21.3



2023-06-25 15:36:46

by Jingqi Liu

Subject: [PATCH 1/5] iommu/vt-d: debugfs: Define domain_translation_struct file ops

Define domain_translation_struct file_operations instead of using
DEFINE_SHOW_ATTRIBUTE() in order to specify source identifier and pasid
to dump the specified page table.

Signed-off-by: Jingqi Liu <[email protected]>
---
drivers/iommu/intel/debugfs.c | 20 +++++++++++++++++++-
1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index 1f925285104e..072cfef19175 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -391,7 +391,25 @@ static int domain_translation_struct_show(struct seq_file *m, void *unused)
return bus_for_each_dev(&pci_bus_type, NULL, m,
show_device_domain_translation);
}
-DEFINE_SHOW_ATTRIBUTE(domain_translation_struct);
+
+static int domain_translation_struct_open(struct inode *inode,
+ struct file *filp)
+{
+ /*
+ * Allocate one 1Mbyte buffer to save sequential file output,
+ * since the default size of input buffer is 1Mbyte when the
+ * user reads.
+ */
+ return single_open_size(filp, domain_translation_struct_show,
+ inode->i_private, SZ_1M);
+}
+
+static const struct file_operations domain_translation_struct_fops = {
+ .open = domain_translation_struct_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};

static void invalidation_queue_entry_show(struct seq_file *m,
struct intel_iommu *iommu)
--
2.21.3


2023-06-25 15:39:37

by Jingqi Liu

Subject: [PATCH 5/5] iommu/vt-d: debugfs: Dump entry pointing to huge page

For the page table entry pointing to a huge page, the data below the
level of the huge page is meaningless and does not need to be dumped.
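
The effect can be sketched in user space as follows. This is only an
illustration of the row-truncation idea, assuming (as the patch appears
to) that the slots below a huge-page entry are left as zero in the path
array; dump_row() and SKETCH_PAGE_SHIFT are hypothetical names.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB base page */

/*
 * Print one row: PML5E, PML4E and PDPE are always printed, PDE and PTE
 * only when the walk actually reached those levels (non-zero slot).
 * path[5]..path[1] hold PML5E..PTE. Returns the number of entry
 * columns printed.
 */
static int dump_row(FILE *out, uint64_t iova, const uint64_t path[6])
{
	int cols = 3;

	assert(out && path);
	fprintf(out, "0x%013" PRIx64 " |\t0x%016" PRIx64
		"\t0x%016" PRIx64 "\t0x%016" PRIx64,
		iova >> SKETCH_PAGE_SHIFT, path[5], path[4], path[3]);
	if (path[2]) {
		fprintf(out, "\t0x%016" PRIx64, path[2]);
		cols++;
		if (path[1]) {
			fprintf(out, "\t0x%016" PRIx64, path[1]);
			cols++;
		}
	}
	fputc('\n', out);
	return cols;
}
```

With this shape, a 1 GiB mapping yields three entry columns, a 2 MiB
mapping four, and a 4 KiB mapping all five.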

Signed-off-by: Jingqi Liu <[email protected]>
---
drivers/iommu/intel/debugfs.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index e4d3b7836076..f5e26cc21905 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -326,9 +326,14 @@ static inline unsigned long level_to_directory_size(int level)
static inline void
dump_page_info(struct seq_file *m, unsigned long iova, u64 *path)
{
- seq_printf(m, "0x%013lx |\t0x%016llx\t0x%016llx\t0x%016llx\t0x%016llx\t0x%016llx\n",
- iova >> VTD_PAGE_SHIFT, path[5], path[4],
- path[3], path[2], path[1]);
+ seq_printf(m, "0x%013lx |\t0x%016llx\t0x%016llx\t0x%016llx",
+ iova >> VTD_PAGE_SHIFT, path[5], path[4], path[3]);
+ if (path[2]) {
+ seq_printf(m, "\t0x%016llx", path[2]);
+ if (path[1])
+ seq_printf(m, "\t0x%016llx", path[1]);
+ }
+ seq_putc(m, '\n');
}

static void pgtable_walk_level(struct seq_file *m, struct dma_pte *pde,
--
2.21.3


2023-06-25 15:41:06

by Jingqi Liu

Subject: [PATCH 3/5] iommu/vt-d: debugfs: Dump the corresponding page table of a pasid

Add a generic helper to dump the page table contained in a pasid table entry.

For implementations supporting Scalable Mode Translation, the PASID-table
entries contain pointers to both first-stage and second-stage translation
structures, along with the PASID Granular Translation Type (PGTT) field that
specifies which translation process the request undergoes.

The original debugfs only dumps the contents of pasid table entry when
traversing the pasid table. Add a check to decide whether to dump the page
table contained by a pasid table entry.
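
As a rough user-space illustration of the selection described above:
the sketch below extracts PGTT (bits 8:6) and the address width field
(bits 4:2) from the first quadword of a PASID-table entry and picks the
page-table pointer accordingly, mirroring what the patch does with
val[0] and val[2]. The SKETCH_* values and both helper functions are
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed PGTT encodings; names are illustrative only. */
enum sketch_pgtt {
	SKETCH_PGTT_FL_ONLY = 1, /* first-stage only */
	SKETCH_PGTT_SL_ONLY = 2, /* second-stage only */
	SKETCH_PGTT_NESTED  = 3,
	SKETCH_PGTT_PT      = 4, /* pass-through, no page table */
};

#define SKETCH_PAGE_MASK (~0xfffULL)

/* Extract bits hi..lo (inclusive) of val; at most 32 bits wide. */
static unsigned int field(uint64_t val, unsigned int hi, unsigned int lo)
{
	assert(hi >= lo && hi < 64 && hi - lo < 32);
	return (unsigned int)((val >> lo) & ((1ULL << (hi - lo + 1)) - 1));
}

/*
 * Pick the page-table root from a PASID-table entry (8 quadwords).
 * Returns 1 and fills *pgd and *agaw when there is a table to walk,
 * 0 otherwise (e.g. pass-through).
 */
static int pasid_entry_pgd(const uint64_t val[8], uint64_t *pgd,
			   unsigned int *agaw)
{
	assert(val && pgd && agaw);
	*agaw = field(val[0], 4, 2);

	switch (field(val[0], 8, 6)) {
	case SKETCH_PGTT_FL_ONLY:
		*pgd = val[2] & SKETCH_PAGE_MASK;
		return 1;
	case SKETCH_PGTT_SL_ONLY:
	case SKETCH_PGTT_NESTED:
		*pgd = val[0] & SKETCH_PAGE_MASK;
		return 1;
	default:
		return 0;
	}
}
```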

Signed-off-by: Jingqi Liu <[email protected]>
---
drivers/iommu/intel/debugfs.c | 59 ++++++++++++++++++++++++++++++++++-
1 file changed, 58 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index 6d02cd91718a..212d33598de9 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -19,9 +19,11 @@
#include "perf.h"

struct tbl_walk {
+ u16 segment; /* PCI segment# */
u16 bus;
u16 devfn;
u32 pasid;
+ bool dump_page_table;
struct root_entry *rt_entry;
struct context_entry *ctx_entry;
struct pasid_entry *pasid_tbl_entry;
@@ -118,6 +120,8 @@ static const struct iommu_regset iommu_regs_64[] = {
IOMMU_REGSET_ENTRY(VCRSP),
};

+static void dump_translation_page_table(struct seq_file *m);
+
static int iommu_regset_show(struct seq_file *m, void *unused)
{
struct dmar_drhd_unit *drhd;
@@ -199,7 +203,11 @@ static void pasid_tbl_walk(struct seq_file *m, struct pasid_entry *tbl_entry,
if (pasid_pte_is_present(tbl_entry)) {
tbl_wlk->pasid_tbl_entry = tbl_entry;
tbl_wlk->pasid = (dir_idx << PASID_PDE_SHIFT) + tbl_idx;
- print_tbl_walk(m);
+
+ if (tbl_wlk->dump_page_table)
+ dump_translation_page_table(m);
+ else
+ print_tbl_walk(m);
}

tbl_entry++;
@@ -347,6 +355,55 @@ static void pgtable_walk_level(struct seq_file *m, struct dma_pte *pde,
}
}

+/*
+ * Dump the page table contained in a pasid table entry.
+ * There are two consumers of this helper, as follows:
+ * 1) When traversing the pasid table, dump the page table
+ * contained in the pasid table entry.
+ * 2) Find the pasid table entry with a specified pasid,
+ * and dump the page table it contains.
+ */
+static void dump_translation_page_table(struct seq_file *m)
+{
+ struct tbl_walk *tbl_wlk = m->private;
+ u64 pgd, path[6] = { 0 };
+ u16 pgtt;
+ u8 agaw;
+
+ if (!tbl_wlk->pasid_tbl_entry)
+ return;
+
+ /*
+ * According to PASID Granular Translation Type(PGTT),
+ * get the page table pointer.
+ */
+ pgtt = (u16)(tbl_wlk->pasid_tbl_entry->val[0] & GENMASK_ULL(8, 6)) >> 6;
+ agaw = (u8)(tbl_wlk->pasid_tbl_entry->val[0] & GENMASK_ULL(4, 2)) >> 2;
+
+ switch (pgtt) {
+ case PASID_ENTRY_PGTT_FL_ONLY:
+ pgd = tbl_wlk->pasid_tbl_entry->val[2];
+ break;
+ case PASID_ENTRY_PGTT_SL_ONLY:
+ case PASID_ENTRY_PGTT_NESTED:
+ pgd = tbl_wlk->pasid_tbl_entry->val[0];
+ break;
+ default:
+ return;
+ }
+
+ pgd &= VTD_PAGE_MASK;
+ seq_printf(m, "Device %04x:%02x:%02x.%x with pasid %x @0x%llx\n",
+ tbl_wlk->segment, tbl_wlk->bus, PCI_SLOT(tbl_wlk->devfn),
+ PCI_FUNC(tbl_wlk->devfn), tbl_wlk->pasid, pgd);
+ seq_printf(m, "%-17s\t%-18s\t%-18s\t%-18s\t%-18s\t%-s\n",
+ "IOVA_PFN", "PML5E", "PML4E", "PDPE", "PDE", "PTE");
+ pgtable_walk_level(m, phys_to_virt(pgd), agaw + 2, 0, path);
+ seq_putc(m, '\n');
+
+ return;
+}
+
static int __show_device_domain_translation(struct device *dev, void *data)
{
struct dmar_domain *domain;
--
2.21.3


2023-06-25 15:41:14

by Jingqi Liu

Subject: [PATCH 2/5] iommu/vt-d: debugfs: Support specifying source identifier and PASID

The original debugfs only dumps the IOMMU page tables of all domains.
Developers usually want to dump one specific page table rather than all
of them.

This patch lets users specify a source identifier and a PASID to select
the page table to dump.

For a device that only supports legacy mode, specify the source
identifier to dump its page table. For a device that supports scalable
mode, specify a {source identifier, PASID} pair to dump its page table.

To switch back to dumping all page tables, write "auto".

Examples are as follows:

1) Specify device "00:1f.0" that only supports legacy mode.
$ echo 00:1f.0 | sudo tee \
    /sys/kernel/debug/iommu/intel/domain_translation_struct

2) Specify device "00:0a.0" with PASID "1".
$ echo 00:0a.0,1 | sudo tee \
    /sys/kernel/debug/iommu/intel/domain_translation_struct

3) Specify all page tables:
$ echo "auto" | sudo tee \
    /sys/kernel/debug/iommu/intel/domain_translation_struct
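
The accepted input format can be sketched in user space as below. This
is not the kernel code from the patch; it is a simplified stand-in that
accepts "[<domain>:]<bus>:<device>.<func>[,<pasid>]" or "auto". The
struct, parse_selector(), and the SKETCH_* constants are hypothetical,
and the 20-bit PASID limit is an assumption. Unlike the patch, it only
strips a trailing newline when one is actually present.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PASID_MAX (1u << 20) /* assumed 20-bit PASID space */
#define SKETCH_NO_PASID  UINT32_MAX /* stand-in for "no PASID given" */

struct srcid_pasid {
	unsigned int seg, bus, slot, func;
	uint32_t pasid;
	int dump_all; /* set when the input is "auto" */
};

/* Returns 0 on success, -1 on malformed input. */
static int parse_selector(const char *input, struct srcid_pasid *out)
{
	char buf[64], tail, *comma, *end;
	size_t len;

	assert(input && out);
	len = strlen(input);
	if (len == 0 || len >= sizeof(buf))
		return -1;
	memcpy(buf, input, len + 1);
	if (buf[len - 1] == '\n') /* echo appends a newline */
		buf[--len] = '\0';

	memset(out, 0, sizeof(*out));
	out->pasid = SKETCH_NO_PASID;
	if (strcmp(buf, "auto") == 0) {
		out->dump_all = 1;
		return 0;
	}

	comma = strchr(buf, ',');
	if (comma)
		*comma++ = '\0';

	/* Try the form with a PCI segment first, then without. */
	if (sscanf(buf, "%x:%x:%x.%x%c", &out->seg, &out->bus,
		   &out->slot, &out->func, &tail) != 4) {
		out->seg = 0;
		if (sscanf(buf, "%x:%x.%x%c", &out->bus, &out->slot,
			   &out->func, &tail) != 3)
			return -1;
	}
	if (out->bus > 0xff || out->slot > 0x1f || out->func > 0x7)
		return -1;

	if (comma) {
		unsigned long v;

		errno = 0;
		v = strtoul(comma, &end, 0);
		if (errno || end == comma || *end != '\0' ||
		    v >= SKETCH_PASID_MAX)
			return -1;
		out->pasid = (uint32_t)v;
	}
	return 0;
}
```

The trailing "%c" conversion is there only to detect garbage after the
function number; a clean match leaves it unassigned.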

Signed-off-by: Jingqi Liu <[email protected]>
---
drivers/iommu/intel/debugfs.c | 86 ++++++++++++++++++++++++++++++++++-
1 file changed, 85 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index 072cfef19175..6d02cd91718a 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -32,6 +32,13 @@ struct iommu_regset {
const char *regs;
};

+#define BUF_SIZE 64
+
+static struct show_domain_info {
+ struct pci_dev *pdev;
+ ioasid_t pasid;
+} *show_domain_info;
+
#define DEBUG_BUFFER_SIZE 1024
static char debug_buf[DEBUG_BUFFER_SIZE];

@@ -392,6 +399,82 @@ static int domain_translation_struct_show(struct seq_file *m, void *unused)
show_device_domain_translation);
}

+static ssize_t domain_translation_struct_write(struct file *filp,
+ const char __user *ubuf,
+ size_t cnt, loff_t *ppos)
+{
+ char buf[BUF_SIZE], *srcid_ptr = NULL, *pasid_ptr = NULL;
+ unsigned int seg, bus, slot, func;
+ struct pci_dev *pdev = NULL;
+ u32 pasid = INVALID_IOASID;
+ char *key, *pbuf;
+ int i = 0;
+
+ if (cnt >= BUF_SIZE)
+ return -EINVAL;
+
+ if (copy_from_user(buf, ubuf, cnt))
+ return -EFAULT;
+
+ buf[cnt - 1] = 0;
+ if (!strcmp(buf, "auto")) {
+ if (show_domain_info)
+ show_domain_info->pdev = NULL;
+ *ppos += cnt;
+ return cnt;
+ }
+
+ pbuf = buf;
+
+ /* Separate the input: one {source identifier, PASID} pair */
+ while ((key = strsep(&pbuf, ", ")) != NULL) {
+ if (!*key)
+ continue;
+ if (i >= 2) /* too many fields */
+ return -EINVAL;
+ if (i++ == 0)
+ srcid_ptr = key;
+ else
+ pasid_ptr = key;
+ }
+
+ if (!srcid_ptr) /* no source identifier */
+ return -EINVAL;
+
+ /*
+ * The string of source identifier must be of the form:
+ * [<domain>:]<bus>:<device>.<func>
+ */
+ i = sscanf(srcid_ptr, "%x:%x:%x.%x", &seg, &bus, &slot, &func);
+ if (i != 4) {
+ seg = 0;
+ i = sscanf(srcid_ptr, "%x:%x.%x", &bus, &slot, &func);
+ if (i != 3)
+ return -EINVAL;
+ }
+
+ pdev = pci_get_domain_bus_and_slot(seg, bus, PCI_DEVFN(slot, func));
+ if (!pdev)
+ return -EINVAL;
+
+ if (pasid_ptr &&
+ ((kstrtou32(pasid_ptr, 0, &pasid) < 0) || (pasid >= PASID_MAX)))
+ return -EINVAL;
+
+ if (!show_domain_info) {
+ show_domain_info = kzalloc(sizeof(*show_domain_info),
+ GFP_KERNEL);
+ if (!show_domain_info)
+ return -ENOMEM;
+ }
+
+ show_domain_info->pdev = pdev;
+ show_domain_info->pasid = pasid;
+
+ *ppos += cnt;
+ return cnt;
+}
+
static int domain_translation_struct_open(struct inode *inode,
struct file *filp)
{
@@ -406,6 +489,7 @@ static int domain_translation_struct_open(struct inode *inode,

static const struct file_operations domain_translation_struct_fops = {
.open = domain_translation_struct_open,
+ .write = domain_translation_struct_write,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
@@ -691,7 +775,7 @@ void __init intel_iommu_debugfs_init(void)
&iommu_regset_fops);
debugfs_create_file("dmar_translation_struct", 0444, intel_iommu_debug,
NULL, &dmar_translation_struct_fops);
- debugfs_create_file("domain_translation_struct", 0444,
+ debugfs_create_file("domain_translation_struct", 0644,
intel_iommu_debug, NULL,
&domain_translation_struct_fops);
debugfs_create_file("invalidation_queue", 0444, intel_iommu_debug,
--
2.21.3


2023-06-25 15:41:14

by Jingqi Liu

Subject: [PATCH 4/5] iommu/vt-d: debugfs: Support dumping a specified page table

The original debugfs dumps all page tables but ignores PASIDs. With
PASID support, the page tables associated with a PASID also need to be
dumped.

This patch supports dumping a specified page table or all page tables in
legacy mode or scalable mode.

For legacy mode, use the bus number and DEVFN to traverse the root
table and context table, get the page table pointer from the context
table entry, then dump the page table.
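
A minimal user-space sketch of the legacy-mode decode step, assuming the
context entry layout the patch relies on (present bit in lo bit 0, page
table pointer in the page-aligned part of lo, address width in hi bits
2:0). The struct and function names are hypothetical, and the
30 + 9 * agaw width formula is an assumption stated here for
illustration.

```c
#include <assert.h>
#include <stdint.h>

struct legacy_ctx {
	uint64_t pgd;        /* physical address of the top-level table */
	unsigned int agaw;   /* address width field from the entry */
	unsigned int levels; /* page-table levels: agaw + 2 */
	unsigned int width;  /* address width in bits, capped at 64 */
};

/*
 * Decode a legacy-mode context entry given its two quadwords.
 * Returns 1 when the entry is present, 0 otherwise.
 */
static int decode_legacy_context(uint64_t lo, uint64_t hi,
				 struct legacy_ctx *out)
{
	assert(out);
	if (!(lo & 1ULL)) /* present bit clear: nothing to dump */
		return 0;

	out->pgd = lo & ~0xfffULL;
	out->agaw = (unsigned int)(hi & 7);
	out->levels = out->agaw + 2;
	out->width = 30 + 9 * out->agaw;
	if (out->width > 64)
		out->width = 64;
	return 1;
}
```

For example, an entry with agaw 2 would describe a 4-level, 48-bit
table, which matches the zero PML5E column in the sample output.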

For scalable mode, use the bus number, DEVFN and PASID to traverse
the root table, context table, PASID directory and PASID table, get
the page table pointer from the PASID table entry, then dump it.
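
The PASID lookup step can be sketched as a split of the PASID into a
directory index and a table index. The shift of 6 and mask of 0x3f are
assumptions standing in for PASID_PDE_SHIFT and PASID_PTE_MASK; the
struct and helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PASID_PDE_SHIFT 6    /* assumed: 64 entries per table */
#define SKETCH_PASID_PTE_MASK  0x3f
#define SKETCH_PASID_MAX       (1u << 20)

struct pasid_index {
	unsigned int dir_idx; /* index into the PASID directory */
	unsigned int tbl_idx; /* index into the selected PASID table */
};

/* Split a PASID into its directory and table indices. */
static struct pasid_index split_pasid(uint32_t pasid)
{
	struct pasid_index idx;

	assert(pasid < SKETCH_PASID_MAX);
	idx.dir_idx = pasid >> SKETCH_PASID_PDE_SHIFT;
	idx.tbl_idx = pasid & SKETCH_PASID_PTE_MASK;
	return idx;
}

/* Inverse operation, as used when walking every table entry. */
static uint32_t join_pasid(struct pasid_index idx)
{
	return ((uint32_t)idx.dir_idx << SKETCH_PASID_PDE_SHIFT) +
	       idx.tbl_idx;
}
```

Under these assumptions, PASID 2 lands in directory slot 0, table
slot 2.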

Examples are as follows:
1) Dump the page table of device "00:1f.0" that only supports legacy mode.
$ echo 00:1f.0 | sudo tee \
    /sys/kernel/debug/iommu/intel/domain_translation_struct
$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct

2) Dump the page table of device "00:0a.0" with PASID "1".
$ echo 00:0a.0,1 | sudo tee \
    /sys/kernel/debug/iommu/intel/domain_translation_struct
$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct

3) Dump all page tables.
$ echo "auto" | sudo tee \
    /sys/kernel/debug/iommu/intel/domain_translation_struct
$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct

Signed-off-by: Jingqi Liu <[email protected]>
---
drivers/iommu/intel/debugfs.c | 191 ++++++++++++++++++++++++++++------
1 file changed, 159 insertions(+), 32 deletions(-)

diff --git a/drivers/iommu/intel/debugfs.c b/drivers/iommu/intel/debugfs.c
index 212d33598de9..e4d3b7836076 100644
--- a/drivers/iommu/intel/debugfs.c
+++ b/drivers/iommu/intel/debugfs.c
@@ -404,56 +404,183 @@ static void dump_translation_page_table(struct seq_file *m)
return;
}

-static int __show_device_domain_translation(struct device *dev, void *data)
+/*
+ * Dump the page table with the specified device and pasid.
+ * For legacy mode, search root and context tables to find
+ * the page table.
+ * For scalable mode, search root, context, pasid directory
+ * and pasid tables to find the page table.
+ * If no device is specified, it traverses all devices and
+ * pasid tables, and then dumps all page tables.
+ */
+static int show_device_domain_translation(struct show_domain_info *sinfo,
+ void *data)
{
- struct dmar_domain *domain;
+ bool walk_tbl = false, found = false;
+ u16 s_devfn = 0, e_devfn = 255, devfn;
+ u16 s_bus = 0, e_bus = 255, bus, seg;
+ struct dmar_drhd_unit *drhd;
+ struct intel_iommu *iommu;
struct seq_file *m = data;
- u64 path[6] = { 0 };
+ bool scalable;

- domain = to_dmar_domain(iommu_get_domain_for_dev(dev));
- if (!domain)
- return 0;
+ if (sinfo && sinfo->pdev) {
+ s_bus = sinfo->pdev->bus->number;
+ e_bus = sinfo->pdev->bus->number;
+ s_devfn = sinfo->pdev->devfn;
+ e_devfn = sinfo->pdev->devfn;
+ seg = pci_domain_nr(sinfo->pdev->bus);
+ } else
+ walk_tbl = true;

- seq_printf(m, "Device %s @0x%llx\n", dev_name(dev),
- (u64)virt_to_phys(domain->pgd));
- seq_puts(m, "IOVA_PFN\t\tPML5E\t\t\tPML4E\t\t\tPDPE\t\t\tPDE\t\t\tPTE\n");
-
- pgtable_walk_level(m, domain->pgd, domain->agaw + 2, 0, path);
- seq_putc(m, '\n');
+ rcu_read_lock();
+ for_each_active_iommu(iommu, drhd) {
+ struct context_entry *context;
+ u64 pgd, path[6] = { 0 };
+ u32 sts, agaw;

- /* Don't iterate */
- return 1;
-}
+ if (sinfo && sinfo->pdev && (seg != iommu->segment))
+ continue;

-static int show_device_domain_translation(struct device *dev, void *data)
-{
- struct iommu_group *group;
+ sts = dmar_readl(iommu->reg + DMAR_GSTS_REG);
+ if (!(sts & DMA_GSTS_TES)) {
+ seq_printf(m, "DMA Remapping is not enabled on %s\n",
+ iommu->name);
+ continue;
+ }
+ if (dmar_readq(iommu->reg + DMAR_RTADDR_REG) & DMA_RTADDR_SMT)
+ scalable = true;
+ else
+ scalable = false;

- group = iommu_group_get(dev);
- if (group) {
/*
- * The group->mutex is held across the callback, which will
- * block calls to iommu_attach/detach_group/device. Hence,
+ * The iommu->lock is held across the callback, which will
+ * block calls to domain_attach/domain_detach. Hence,
* the domain of the device will not change during traversal.
*
- * All devices in an iommu group share a single domain, hence
- * we only dump the domain of the first device. Even though,
- * this code still possibly races with the iommu_unmap()
- * interface. This could be solved by RCU-freeing the page
- * table pages in the iommu_unmap() path.
+ * Traversing page table possibly races with the iommu_unmap()
+ * interface. This could be solved by incrementing the
+ * reference count of page table page before traversal and
+ * decrementing the reference count after traversal.
*/
- iommu_group_for_each_dev(group, data,
- __show_device_domain_translation);
- iommu_group_put(group);
+ spin_lock(&iommu->lock);
+ for (bus = s_bus; bus <= e_bus; bus++) {
+ for (devfn = s_devfn; devfn <= e_devfn; devfn++) {
+ context = iommu_context_addr(iommu, bus, devfn, 0);
+ if (!context || !context_present(context))
+ continue;
+
+ if (!scalable) { /* legacy mode */
+ pgd = context->lo & VTD_PAGE_MASK;
+ agaw = context->hi & 7;
+
+ seq_printf(m, "Device %04x:%02x:%02x.%x @0x%llx\n",
+ iommu->segment, bus, PCI_SLOT(devfn), PCI_FUNC(devfn), pgd);
+ seq_printf(m, "%-17s\t%-18s\t%-18s\t%-18s\t%-18s\t%-s\n",
+ "IOVA_PFN", "PML5E", "PML4E", "PDPE", "PDE", "PTE");
+ pgtable_walk_level(m, phys_to_virt(pgd), agaw + 2, 0, path);
+ seq_putc(m, '\n');
+
+ found = true;
+ } else { /* scalable mode */
+ struct tbl_walk tbl_wlk = {0};
+ struct pasid_dir_entry *dir_tbl, *dir_entry;
+ struct pasid_entry *pasid_tbl, *pasid_tbl_entry;
+ u16 pasid_dir_size, dir_idx, tbl_idx;
+ u64 pasid_dir_ptr;
+
+ tbl_wlk.segment = iommu->segment;
+ tbl_wlk.bus = bus;
+ tbl_wlk.devfn = devfn;
+ tbl_wlk.rt_entry = &iommu->root_entry[bus];
+ tbl_wlk.ctx_entry = context;
+ tbl_wlk.dump_page_table = true;
+ m->private = &tbl_wlk;
+
+ pasid_dir_ptr = context->lo & VTD_PAGE_MASK;
+ pasid_dir_size = get_pasid_dir_size(context);
+
+ if (walk_tbl) {
+ pasid_dir_walk(m, pasid_dir_ptr, pasid_dir_size);
+ continue;
+ }
+
+ if (sinfo && sinfo->pasid == INVALID_IOASID) {
+ spin_unlock(&iommu->lock);
+ goto unlock_out;
+ }
+
+ /* Dump specified device domain mappings with PASID. */
+ dir_idx = sinfo->pasid >> PASID_PDE_SHIFT;
+ tbl_idx = sinfo->pasid & PASID_PTE_MASK;
+
+ dir_tbl = phys_to_virt(pasid_dir_ptr);
+ dir_entry = &dir_tbl[dir_idx];
+
+ pasid_tbl = get_pasid_table_from_pde(dir_entry);
+ if (!pasid_tbl)
+ continue;
+
+ pasid_tbl_entry = &pasid_tbl[tbl_idx];
+ if (!pasid_pte_is_present(pasid_tbl_entry))
+ continue;
+
+ tbl_wlk.pasid = sinfo->pasid;
+ tbl_wlk.pasid_tbl_entry = pasid_tbl_entry;
+ dump_translation_page_table(m);
+
+ found = true;
+ }
+ }
+ }
+
+ spin_unlock(&iommu->lock);
+ if (!walk_tbl && found)
+ break;
}

+unlock_out:
+ rcu_read_unlock();
+
+ if (!walk_tbl && !found && (sinfo->pasid != INVALID_IOASID))
+ seq_printf(m, "No mappings found on device %s with pasid %x.\n",
+ dev_name(&sinfo->pdev->dev), sinfo->pasid);
return 0;
}

static int domain_translation_struct_show(struct seq_file *m, void *unused)
{
- return bus_for_each_dev(&pci_bus_type, NULL, m,
- show_device_domain_translation);
+ int ret;
+
+ if (show_domain_info && show_domain_info->pdev) {
+ struct device_domain_info *info =
+ dev_iommu_priv_get(&show_domain_info->pdev->dev);
+
+ if (info) {
+ /*
+ * The domain has already exited, and will
+ * switch to the default domain next.
+ */
+ if (!info->domain)
+ return 0;
+
+ if (info->pasid_enabled &&
+ (show_domain_info->pasid == INVALID_IOASID))
+ show_domain_info->pasid = PASID_RID2PASID;
+ else if (!info->pasid_enabled &&
+ (show_domain_info->pasid != INVALID_IOASID)) {
+ seq_printf(m, "Device %s does not support PASID.\n",
+ dev_name(&show_domain_info->pdev->dev));
+ return 0;
+ }
+ } else
+ show_domain_info->pasid = PASID_RID2PASID;
+
+ ret = show_device_domain_translation(show_domain_info, m);
+ } else
+ ret = show_device_domain_translation(NULL, m);
+
+ return ret;
}

static ssize_t domain_translation_struct_write(struct file *filp,
--
2.21.3


2023-07-03 07:33:34

by Tian, Kevin

Subject: RE: [PATCH 0/5] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs

> From: Liu, Jingqi <[email protected]>
> Sent: Sunday, June 25, 2023 11:05 PM
>
> The original debugfs only dumps all IOMMU page tables without pasid
> supported. It traverses all devices on the pci bus, then dumps all page
> tables based on device domains. This traversal is from software
> perspective.
>
> This series dumps page tables by traversing root tables, context tables,
> pasid directories and pasid tables from hardware perspective. By
> specifying source identifier and PASID, it supports dumping specified
> page table or all page tables in legacy mode or scalable mode.
>
> For a device that only supports legacy mode, specify the source
> identifier, and search the root table and context table to dump its
> page table. It does not support to specify PASID.
>
> For a device that supports scalable mode, specify a
> {source identifier, PASID} pair and search the root table, context table
> and pasid table to dump its page table. If the pasid is not specified,
> it is set to RID_PASID.
>
> Switch to dump all page tables by specifying "auto".
>
> Examples are as follows:
> 1) Dump the page table of device "00:1f.0" that only supports legacy
> mode.
>
> $ sudo echo 00:1f.0 >
> /sys/kernel/debug/iommu/intel/domain_translation_struct
> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
> Device 0000:00:1f.0 @0x105407000
> IOVA_PFN PML5E PML4E
> 0x0000000000000 | 0x0000000000000000 0x0000000105408003
> 0x0000000000001 | 0x0000000000000000 0x0000000105408003
> 0x0000000000002 | 0x0000000000000000 0x0000000105408003
> 0x0000000000003 | 0x0000000000000000 0x0000000105408003
>
> PDPE PDE PTE
> 0x0000000105409003 0x000000010540a003 0x0000000000000003
> 0x0000000105409003 0x000000010540a003 0x0000000000001003
> 0x0000000105409003 0x000000010540a003 0x0000000000002003
> 0x0000000105409003 0x000000010540a003 0x0000000000003003
>
> [...]
>
> 2) Dump the page table of device "00:0a.0" with pasid "2".
>
> $ sudo echo 00:0a.0,2 >
> /sys/kernel/debug/iommu/intel/domain_translation_struct
> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct

What about creating a directory layout per {dev, pasid} so the user can
easily figure out and dump?

e.g.

/sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
/sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct

> Device 0000:00:0a.0 with pasid 2 @0x1083d7000
> IOVA_PFN PML5E PML4E
> 0x0000000000000 | 0x0000000000000000 0x0000000106aaa003
> 0x0000000000001 | 0x0000000000000000 0x0000000106aaa003
> 0x0000000000002 | 0x0000000000000000 0x0000000106aaa003
> 0x0000000000003 | 0x0000000000000000 0x0000000106aaa003
>
> PDPE PDE PTE
> 0x000000010a819003 0x000000010a7aa003 0x0000000129800003
> 0x000000010a819003 0x000000010a7aa003 0x0000000129801003
> 0x000000010a819003 0x000000010a7aa003 0x0000000129802003
> 0x000000010a819003 0x000000010a7aa003 0x0000000129803003
>
> [...]
>
> 3) Dump all page tables:
> $ sudo echo "auto" >
> /sys/kernel/debug/iommu/intel/domain_translation_struct
> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
> [...]
>
> Device 0000:00:02.0 @0x103072000
> IOVA_PFN PML5E PML4E
> 0x000000008d800 | 0x0000000000000000 0x0000000103073003
> 0x000000008d801 | 0x0000000000000000 0x0000000103073003
>
> PDPE PDE PTE
> 0x0000000103074003 0x0000000103075003 0x000000008d800003
> 0x0000000103074003 0x0000000103075003 0x000000008d801003
>
> [...]
>
> Device 0000:00:0a.0 with pasid 2 @0x10a0b6000
> IOVA_PFN PML5E PML4E
> 0x0000000000000 | 0x0000000000000000 0x00000001072d2003
> 0x0000000000001 | 0x0000000000000000 0x00000001072d2003
>
> PDPE PDE PTE
> 0x0000000107d6e003 0x00000001161d4003 0x00000001bdc00003
> 0x0000000107d6e003 0x00000001161d4003 0x00000001bdc01003
>
> [...]
>
> Thanks,
> Jingqi
>
> Jingqi Liu (5):
> iommu/vt-d: debugfs: Define domain_translation_struct file ops
> iommu/vt-d: debugfs: Support specifying source identifier and PASID
> iommu/vt-d: debugfs: Dump the corresponding page table of a pasid
> iommu/vt-d: debugfs: Support dumping a specified page table
> iommu/vt-d: debugfs: Dump entry pointing to huge page
>
> drivers/iommu/intel/debugfs.c | 361 ++++++++++++++++++++++++++++++----
> 1 file changed, 326 insertions(+), 35 deletions(-)
>
> --
> 2.21.3


2023-07-03 14:43:13

by Jingqi Liu

Subject: Re: [PATCH 0/5] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs

On 7/3/2023 3:15 PM, Tian, Kevin wrote:
>> From: Liu, Jingqi <[email protected]>
>> Sent: Sunday, June 25, 2023 11:05 PM
>>
>> The original debugfs only dumps all IOMMU page tables without pasid
>> supported. It traverses all devices on the pci bus, then dumps all page
>> tables based on device domains. This traversal is from software
>> perspective.
>>
>> This series dumps page tables by traversing root tables, context tables,
>> pasid directories and pasid tables from hardware perspective. By
>> specifying source identifier and PASID, it supports dumping specified
>> page table or all page tables in legacy mode or scalable mode.
>>
>> For a device that only supports legacy mode, specify the source
>> identifier, and search the root table and context table to dump its
>> page table. It does not support to specify PASID.
>>
>> For a device that supports scalable mode, specify a
>> {source identifier, PASID} pair and search the root table, context table
>> and pasid table to dump its page table. If the pasid is not specified,
>> it is set to RID_PASID.
>>
>> Switch to dump all page tables by specifying "auto".
>>
>> Examples are as follows:
>> 1) Dump the page table of device "00:1f.0" that only supports legacy
>> mode.
>>
>> $ sudo echo 00:1f.0 >
>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
>> Device 0000:00:1f.0 @0x105407000
>> IOVA_PFN PML5E PML4E
>> 0x0000000000000 | 0x0000000000000000 0x0000000105408003
>> 0x0000000000001 | 0x0000000000000000 0x0000000105408003
>> 0x0000000000002 | 0x0000000000000000 0x0000000105408003
>> 0x0000000000003 | 0x0000000000000000 0x0000000105408003
>>
>> PDPE PDE PTE
>> 0x0000000105409003 0x000000010540a003 0x0000000000000003
>> 0x0000000105409003 0x000000010540a003 0x0000000000001003
>> 0x0000000105409003 0x000000010540a003 0x0000000000002003
>> 0x0000000105409003 0x000000010540a003 0x0000000000003003
>>
>> [...]
>>
>> 2) Dump the page table of device "00:0a.0" with pasid "2".
>>
>> $ sudo echo 00:0a.0,2 >
>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
> What about creating a directory layout per {dev, pasid} so the user can
> easily figure out and dump?
>
> e.g.
>
> /sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
> /sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
Thanks.

Do you mean creating a directory for each device, whether it supports
PASID or not?
It seems the PASID can be assigned at runtime, so the IOMMU driver would
need to support creating debugfs files at runtime.
It looks like this requires modifying the IOMMU driver.

BR,
Jingqi

2023-07-04 08:37:51

by Tian, Kevin

Subject: RE: [PATCH 0/5] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs

> From: Liu, Jingqi <[email protected]>
> Sent: Monday, July 3, 2023 10:37 PM
>
> On 7/3/2023 3:15 PM, Tian, Kevin wrote:
> >> From: Liu, Jingqi <[email protected]>
> >> Sent: Sunday, June 25, 2023 11:05 PM
> >>
> >> The original debugfs only dumps all IOMMU page tables without pasid
> >> supported. It traverses all devices on the pci bus, then dumps all page
> >> tables based on device domains. This traversal is from software
> >> perspective.
> >>
> >> This series dumps page tables by traversing root tables, context tables,
> >> pasid directories and pasid tables from hardware perspective. By
> >> specifying source identifier and PASID, it supports dumping specified
> >> page table or all page tables in legacy mode or scalable mode.
> >>
> >> For a device that only supports legacy mode, specify the source
> >> identifier, and search the root table and context table to dump its
> >> page table. It does not support to specify PASID.
> >>
> >> For a device that supports scalable mode, specify a
> >> {source identifier, PASID} pair and search the root table, context table
> >> and pasid table to dump its page table. If the pasid is not specified,
> >> it is set to RID_PASID.
> >>
> >> Switch to dump all page tables by specifying "auto".
> >>
> >> Examples are as follows:
> >> 1) Dump the page table of device "00:1f.0" that only supports legacy
> >> mode.
> >>
> >> $ sudo echo 00:1f.0 >
> >> /sys/kernel/debug/iommu/intel/domain_translation_struct
> >> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
> >> Device 0000:00:1f.0 @0x105407000
> >> IOVA_PFN PML5E PML4E
> >> 0x0000000000000 | 0x0000000000000000 0x0000000105408003
> >> 0x0000000000001 | 0x0000000000000000 0x0000000105408003
> >> 0x0000000000002 | 0x0000000000000000 0x0000000105408003
> >> 0x0000000000003 | 0x0000000000000000 0x0000000105408003
> >>
> >> PDPE PDE PTE
> >> 0x0000000105409003 0x000000010540a003 0x0000000000000003
> >> 0x0000000105409003 0x000000010540a003 0x0000000000001003
> >> 0x0000000105409003 0x000000010540a003 0x0000000000002003
> >> 0x0000000105409003 0x000000010540a003 0x0000000000003003
> >>
> >> [...]
> >>
> >> 2) Dump the page table of device "00:0a.0" with pasid "2".
> >>
> >> $ sudo echo 00:0a.0,2 >
> >> /sys/kernel/debug/iommu/intel/domain_translation_struct
> >> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
> > What about creating a directory layout per {dev, pasid} so the user can
> > easily figure out and dump?
> >
> > e.g.
> >
> > /sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
> > /sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
> Thanks.
>
> Do you mean create a directory for each device, whether it supports
> PASID or not ?

every device has PASID#0 valid, i.e. RID2PASID.

> Seems the PASID can be assigned at runtime.
> So it needs to support creating debugfs file at runtime in IOMMU driver.
> Looks like this requires modifying IOMMU driver.
>

Isn't this patch trying to modify the driver?

2023-07-11 01:49:52

by Jingqi Liu

Subject: Re: [PATCH 0/5] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs

On 7/4/2023 3:54 PM, Tian, Kevin wrote:
>> From: Liu, Jingqi <[email protected]>
>> Sent: Monday, July 3, 2023 10:37 PM
>>
>> On 7/3/2023 3:15 PM, Tian, Kevin wrote:
>>>> From: Liu, Jingqi <[email protected]>
>>>> Sent: Sunday, June 25, 2023 11:05 PM
>>>>
>>>> The original debugfs only dumps all IOMMU page tables without pasid
>>>> supported. It traverses all devices on the pci bus, then dumps all page
>>>> tables based on device domains. This traversal is from software
>>>> perspective.
>>>>
>>>> This series dumps page tables by traversing root tables, context tables,
>>>> pasid directories and pasid tables from hardware perspective. By
>>>> specifying source identifier and PASID, it supports dumping specified
>>>> page table or all page tables in legacy mode or scalable mode.
>>>>
>>>> For a device that only supports legacy mode, specify the source
>>>> identifier, and search the root table and context table to dump its
>>>> page table. It does not support to specify PASID.
>>>>
>>>> For a device that supports scalable mode, specify a
>>>> {source identifier, PASID} pair and search the root table, context table
>>>> and pasid table to dump its page table. If the pasid is not specified,
>>>> it is set to RID_PASID.
>>>>
>>>> Switch to dump all page tables by specifying "auto".
>>>>
>>>> Examples are as follows:
>>>> 1) Dump the page table of device "00:1f.0" that only supports legacy
>>>> mode.
>>>>
>>>> $ sudo echo 00:1f.0 >
>>>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>> Device 0000:00:1f.0 @0x105407000
>>>> IOVA_PFN PML5E PML4E
>>>> 0x0000000000000 | 0x0000000000000000 0x0000000105408003
>>>> 0x0000000000001 | 0x0000000000000000 0x0000000105408003
>>>> 0x0000000000002 | 0x0000000000000000 0x0000000105408003
>>>> 0x0000000000003 | 0x0000000000000000 0x0000000105408003
>>>>
>>>> PDPE PDE PTE
>>>> 0x0000000105409003 0x000000010540a003 0x0000000000000003
>>>> 0x0000000105409003 0x000000010540a003 0x0000000000001003
>>>> 0x0000000105409003 0x000000010540a003 0x0000000000002003
>>>> 0x0000000105409003 0x000000010540a003 0x0000000000003003
>>>>
>>>> [...]
>>>>
>>>> 2) Dump the page table of device "00:0a.0" with pasid "2".
>>>>
>>>> $ sudo echo 00:0a.0,2 >
>>>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
>>> What about creating a directory layout per {dev, pasid} so the user can
>>> easily figure out and dump?
>>>
>>> e.g.
>>>
>>> /sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
>>> /sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
>> Thanks.
>>
>> Do you mean create a directory for each device, whether it supports
>> PASID or not ?
> every device has PASID#0 valid, i.e. RID2PASID.
Sorry for the late response.
Got it. Thanks.
>> Seems the PASID can be assigned at runtime.
>> So it needs to support creating debugfs file at runtime in IOMMU driver.
>> Looks like this requires modifying IOMMU driver.
>>
> Isn't this patch trying to modify the driver?
I just tried not to modify the driver except debugfs.
I'll try this implementation.

Thanks,
Jingqi


2023-07-11 03:00:44

by Baolu Lu

Subject: Re: [PATCH 0/5] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs

On 2023/7/11 9:40, Liu, Jingqi wrote:
> On 7/4/2023 3:54 PM, Tian, Kevin wrote:
>>> From: Liu, Jingqi <[email protected]>
>>> Sent: Monday, July 3, 2023 10:37 PM
>>>
>>> On 7/3/2023 3:15 PM, Tian, Kevin wrote:
>>>>> From: Liu, Jingqi <[email protected]>
>>>>> Sent: Sunday, June 25, 2023 11:05 PM
>>>>>
>>>>> The original debugfs only dumps all IOMMU page tables without pasid
>>>>> supported. It traverses all devices on the pci bus, then dumps all
>>>>> page
>>>>> tables based on device domains. This traversal is from software
>>>>> perspective.
>>>>>
>>>>> This series dumps page tables by traversing root tables, context
>>>>> tables,
>>>>> pasid directories and pasid tables from hardware perspective. By
>>>>> specifying source identifier and PASID, it supports dumping specified
>>>>> page table or all page tables in legacy mode or scalable mode.
>>>>>
>>>>> For a device that only supports legacy mode, specify the source
>>>>> identifier, and search the root table and context table to dump its
>>>>> page table. It does not support to specify PASID.
>>>>>
>>>>> For a device that supports scalable mode, specify a
>>>>> {source identifier, PASID} pair and search the root table, context
>>>>> table
>>>>> and pasid table to dump its page table.  If the pasid is not
>>>>> specified,
>>>>> it is set to RID_PASID.
>>>>>
>>>>> Switch to dump all page tables by specifying "auto".
>>>>>
>>>>> Examples are as follows:
>>>>> 1) Dump the page table of device "00:1f.0" that only supports legacy
>>>>> mode.
>>>>>
>>>>> $ sudo echo 00:1f.0 >
>>>>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>>> Device 0000:00:1f.0 @0x105407000
>>>>> IOVA_PFN                PML5E                   PML4E
>>>>> 0x0000000000000 |       0x0000000000000000      0x0000000105408003
>>>>> 0x0000000000001 |       0x0000000000000000      0x0000000105408003
>>>>> 0x0000000000002 |       0x0000000000000000      0x0000000105408003
>>>>> 0x0000000000003 |       0x0000000000000000      0x0000000105408003
>>>>>
>>>>> PDPE                    PDE                     PTE
>>>>> 0x0000000105409003      0x000000010540a003      0x0000000000000003
>>>>> 0x0000000105409003      0x000000010540a003      0x0000000000001003
>>>>> 0x0000000105409003      0x000000010540a003      0x0000000000002003
>>>>> 0x0000000105409003      0x000000010540a003      0x0000000000003003
>>>>>
>>>>> [...]
>>>>>
>>>>> 2) Dump the page table of device "00:0a.0" with pasid "2".
>>>>>
>>>>> $ sudo echo 00:0a.0,2 >
>>>>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>> What about creating a directory layout per {dev, pasid} so the user can
>>>> easily figure out and dump?
>>>>
>>>> e.g.
>>>>
>>>> /sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
>>>> /sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
>>> Thanks.
>>>
>>> Do you mean create a directory for each device, whether it supports
>>> PASID or not ?
>> every device has PASID#0 valid, i.e. RID2PASID.
> Sorry for the late response.
> Got it. Thanks.
>>> Seems the PASID can be assigned at runtime.
>>> So it needs to support creating debugfs file at runtime in IOMMU driver.
>>> Looks like this requires modifying IOMMU driver.
>>>
>> Isn't this patch trying to modify the driver?
> I just tried not to modify the driver except debugfs.
> I'll try this implementation.

I'd second Kevin's suggestion.

If you check how usb xhci dumps its contexts for devices, you can see
the similar scheme.

# ls /sys/kernel/debug/usb/xhci/0000:00:14.0/devices
01 02 03 04 05

In our case, pasid 0 is special which denotes the domain attached to the
RID.
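
As a rough sketch only (nothing below is part of the posted series): assuming
the layout proposed above, i.e.
<root>/<BDF>/<PASID>/domain_translation_struct, every per-PASID dump could be
walked with a small shell helper. The function name dump_all_pasids and its
optional root argument are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: walk a debugfs tree laid out as
#   <root>/<BDF>/<PASID>/domain_translation_struct
# This is the layout proposed in this thread, not an existing kernel interface.
dump_all_pasids() {
    root=${1:-/sys/kernel/debug/iommu/intel}
    for f in "$root"/*/*/domain_translation_struct; do
        # Skip the unexpanded glob pattern when nothing matches.
        [ -e "$f" ] || continue
        pasid_dir=${f%/*}          # <root>/<BDF>/<PASID>
        dev_dir=${pasid_dir%/*}    # <root>/<BDF>
        printf '== device %s pasid %s ==\n' "${dev_dir##*/}" "${pasid_dir##*/}"
        cat "$f"
    done
}
```

With such a layout, PASID 0 would simply show up as the "0" directory under
each device, so the RID-attached domain needs no special syntax.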

Best regards,
baolu

2023-07-11 06:37:45

by Jingqi Liu

Subject: Re: [PATCH 0/5] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs

On 7/11/2023 10:52 AM, Baolu Lu wrote:
> On 2023/7/11 9:40, Liu, Jingqi wrote:
>> On 7/4/2023 3:54 PM, Tian, Kevin wrote:
>>>> From: Liu, Jingqi <[email protected]>
>>>> Sent: Monday, July 3, 2023 10:37 PM
>>>>
>>>> On 7/3/2023 3:15 PM, Tian, Kevin wrote:
>>>>>> From: Liu, Jingqi <[email protected]>
>>>>>> Sent: Sunday, June 25, 2023 11:05 PM
>>>>>>
>>>>>> The original debugfs only dumps all IOMMU page tables without pasid
>>>>>> supported. It traverses all devices on the pci bus, then dumps
>>>>>> all page
>>>>>> tables based on device domains. This traversal is from software
>>>>>> perspective.
>>>>>>
>>>>>> This series dumps page tables by traversing root tables, context
>>>>>> tables,
>>>>>> pasid directories and pasid tables from hardware perspective. By
>>>>>> specifying source identifier and PASID, it supports dumping
>>>>>> specified
>>>>>> page table or all page tables in legacy mode or scalable mode.
>>>>>>
>>>>>> For a device that only supports legacy mode, specify the source
>>>>>> identifier, and search the root table and context table to dump its
>>>>>> page table. It does not support to specify PASID.
>>>>>>
>>>>>> For a device that supports scalable mode, specify a
>>>>>> {source identifier, PASID} pair and search the root table,
>>>>>> context table
>>>>>> and pasid table to dump its page table.  If the pasid is not
>>>>>> specified,
>>>>>> it is set to RID_PASID.
>>>>>>
>>>>>> Switch to dump all page tables by specifying "auto".
>>>>>>
>>>>>> Examples are as follows:
>>>>>> 1) Dump the page table of device "00:1f.0" that only supports legacy
>>>>>> mode.
>>>>>>
>>>>>> $ sudo echo 00:1f.0 >
>>>>>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>>>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>>>> Device 0000:00:1f.0 @0x105407000
>>>>>> IOVA_PFN                PML5E                   PML4E
>>>>>> 0x0000000000000 |       0x0000000000000000      0x0000000105408003
>>>>>> 0x0000000000001 |       0x0000000000000000      0x0000000105408003
>>>>>> 0x0000000000002 |       0x0000000000000000      0x0000000105408003
>>>>>> 0x0000000000003 |       0x0000000000000000      0x0000000105408003
>>>>>>
>>>>>> PDPE                    PDE                     PTE
>>>>>> 0x0000000105409003      0x000000010540a003      0x0000000000000003
>>>>>> 0x0000000105409003      0x000000010540a003      0x0000000000001003
>>>>>> 0x0000000105409003      0x000000010540a003      0x0000000000002003
>>>>>> 0x0000000105409003      0x000000010540a003      0x0000000000003003
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>> 2) Dump the page table of device "00:0a.0" with pasid "2".
>>>>>>
>>>>>> $ sudo echo 00:0a.0,2 >
>>>>>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>>>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>>> What about creating a directory layout per {dev, pasid} so the
>>>>> user can
>>>>> easily figure out and dump?
>>>>>
>>>>> e.g.
>>>>>
>>>>> /sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
>>>>> /sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
>>>> Thanks.
>>>>
>>>> Do you mean create a directory for each device, whether it supports
>>>> PASID or not ?
>>> every device has PASID#0 valid, i.e. RID2PASID.
>> Sorry for the late response.
>> Got it. Thanks.
>>>> Seems the PASID can be assigned at runtime.
>>>> So it needs to support creating debugfs file at runtime in IOMMU
>>>> driver.
>>>> Looks like this requires modifying IOMMU driver.
>>>>
>>> Isn't this patch trying to modify the driver?
>> I just tried not to modify the driver except debugfs.
>> I'll try this implementation.
>
> I'd second Kevin's suggestion.
>
> If you check how usb xhci dumps its contexts for devices, you can see
> the similar scheme.
>
> # ls /sys/kernel/debug/usb/xhci/0000:00:14.0/devices
> 01  02  03  04  05
>
> In our case, pasid 0 is special which denotes the domain attached to the
> RID.
Thanks for the info.
This layout is friendlier for users.
I'll implement it that way.

BR,
Jingqi