Hi All,
This series aims to add debugfs support for Intel IOMMU. It exposes IOMMU
registers, internal context and dumps individual table entries to help debug
Intel IOMMUs.
The first patch does the ground work for the following patches by reorganizing
some Intel IOMMU data structures. The following patches create a new Kconfig
option - INTEL_IOMMU_DEBUG and add debugfs support for IOMMU context internals,
register contents, PASID internals, and Interrupt remapping in that order. The
information can be accessed in sysfs at '/sys/kernel/debug/intel_iommu/'.
Regards,
Sohil
Changes since v6:
- Split patch 1/5 and 2/5 differently
- Simplify and improve code formatting
- Use macro for register set definitions
- Fix compiler warning for readq
- Add Co-Developed-by tag to commit messages
Changes since v5:
- Change the order of includes to an alphabetical order
- Change seq_printf and seq_puts formatting
Changes since v4:
- Change to a SPDX license tag
- Fix seq_printf formatting and remove leading '\n's
Changes since v3:
- Remove an unused function parameter from some of the functions
- Fix checkpatch.pl warnings
- Remove error reporting for debugfs_create_file functions
- Fix unnecessary reprogramming of the context entries
- Simplify and merge the show context and extended context patch into one
- Remove redundant IOMMU null check under for_each_active_iommu
- Update the commit title to be consistent
Changes since v2:
- Added a macro for seq file operations based on recommendation by Andy
Shevchenko. The marco can be moved to seq_file.h at a future point
- Changed the debugfs file names to more relevant ones
- Added information for MTRR registers in the regset file
Changes since v1:
- Fixed seq_printf formatting
- Handled the case when Interrupt remapping is not enabled
Gayatri Kammela (4):
iommu/vt-d: Relocate struct/function declarations to its header files
iommu/vt-d: Enable debugfs support to show context internals
iommu/vt-d: Add debugfs support to show register contents
iommu/vt-d: Add debugfs support to show Pasid table contents
Sohil Mehta (1):
iommu/vt-d: Add debugfs support for Interrupt remapping
drivers/iommu/Kconfig | 8 +
drivers/iommu/Makefile | 1 +
drivers/iommu/intel-iommu-debug.c | 338 ++++++++++++++++++++++++++++++++++++++
drivers/iommu/intel-iommu.c | 34 +---
drivers/iommu/intel-svm.c | 8 -
include/linux/intel-iommu.h | 39 +++++
include/linux/intel-svm.h | 10 +-
7 files changed, 400 insertions(+), 38 deletions(-)
create mode 100644 drivers/iommu/intel-iommu-debug.c
--
2.7.4
From: Gayatri Kammela <[email protected]>
Debugfs extension to dump all the register contents for each IOMMU
device to the user space via debugfs.
Example:
root@OTC-KBLH-01:~# cat /sys/kernel/debug/intel_iommu/iommu_regset
DMAR: dmar0: Register Base Address fed90000
Name Offset Contents
VER 0x00 0x0000000000000010
CAP 0x08 0x01c0000c40660462
ECAP 0x10 0x0000000000f0101a
GCMD 0x18 0x0000000000000000
GSTS 0x1c 0x00000000c7000000
RTADDR 0x20 0x00000004071d3800
CCMD 0x28 0x0800000000000000
FSTS 0x34 0x0000000000000000
FECTL 0x38 0x0000000000000000
FEDATA 0x3c 0xfee0100400004021
Cc: Fenghua Yu <[email protected]>
Cc: Jacob Pan <[email protected]>
Cc: Ashok Raj <[email protected]>
Co-Developed-by: Sohil Mehta <[email protected]>
Signed-off-by: Sohil Mehta <[email protected]>
Signed-off-by: Gayatri Kammela <[email protected]>
---
v7: Use macro for register set definitions
Fix compiler warning for readq with 32bit architecture
Remove leading '\n'
v6: No change
v5: No change
v4: Fix checkpatch.pl warnings
Remove error reporting for debugfs_create_file function
Remove redundant IOMMU null check under for_each_active_iommu
v3: Use a macro for seq file operations
Change the intel_iommu_regset file name to iommu_regset
Add information for MTRR registers
v2: Fix seq_printf formatting
drivers/iommu/intel-iommu-debug.c | 84 +++++++++++++++++++++++++++++++++++++++
include/linux/intel-iommu.h | 2 +
2 files changed, 86 insertions(+)
diff --git a/drivers/iommu/intel-iommu-debug.c b/drivers/iommu/intel-iommu-debug.c
index 8253503..38651ad 100644
--- a/drivers/iommu/intel-iommu-debug.c
+++ b/drivers/iommu/intel-iommu-debug.c
@@ -38,6 +38,49 @@ static const struct file_operations __name ## _fops = \
.owner = THIS_MODULE, \
}
+struct iommu_regset {
+ int offset;
+ const char *regs;
+};
+
+#define IOMMU_REGSET_ENTRY(_reg_) \
+ { DMAR_##_reg_##_REG, __stringify(_reg_) }
+static const struct iommu_regset iommu_regs[] = {
+ IOMMU_REGSET_ENTRY(VER),
+ IOMMU_REGSET_ENTRY(CAP),
+ IOMMU_REGSET_ENTRY(ECAP),
+ IOMMU_REGSET_ENTRY(GCMD),
+ IOMMU_REGSET_ENTRY(GSTS),
+ IOMMU_REGSET_ENTRY(RTADDR),
+ IOMMU_REGSET_ENTRY(CCMD),
+ IOMMU_REGSET_ENTRY(FSTS),
+ IOMMU_REGSET_ENTRY(FECTL),
+ IOMMU_REGSET_ENTRY(FEDATA),
+ IOMMU_REGSET_ENTRY(FEADDR),
+ IOMMU_REGSET_ENTRY(FEUADDR),
+ IOMMU_REGSET_ENTRY(AFLOG),
+ IOMMU_REGSET_ENTRY(PMEN),
+ IOMMU_REGSET_ENTRY(PLMBASE),
+ IOMMU_REGSET_ENTRY(PLMLIMIT),
+ IOMMU_REGSET_ENTRY(PHMBASE),
+ IOMMU_REGSET_ENTRY(PHMLIMIT),
+ IOMMU_REGSET_ENTRY(IQH),
+ IOMMU_REGSET_ENTRY(IQT),
+ IOMMU_REGSET_ENTRY(IQA),
+ IOMMU_REGSET_ENTRY(ICS),
+ IOMMU_REGSET_ENTRY(IRTA),
+ IOMMU_REGSET_ENTRY(PQH),
+ IOMMU_REGSET_ENTRY(PQT),
+ IOMMU_REGSET_ENTRY(PQA),
+ IOMMU_REGSET_ENTRY(PRS),
+ IOMMU_REGSET_ENTRY(PECTL),
+ IOMMU_REGSET_ENTRY(PEDATA),
+ IOMMU_REGSET_ENTRY(PEADDR),
+ IOMMU_REGSET_ENTRY(PEUADDR),
+ IOMMU_REGSET_ENTRY(MTRRCAP),
+ IOMMU_REGSET_ENTRY(MTRRDEF)
+};
+
static void ctx_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu,
int bus, bool ext)
{
@@ -116,6 +159,45 @@ static int dmar_translation_struct_show(struct seq_file *m, void *unused)
}
DEFINE_SHOW_ATTRIBUTE(dmar_translation_struct);
+static int iommu_regset_show(struct seq_file *m, void *unused)
+{
+ struct dmar_drhd_unit *drhd;
+ struct intel_iommu *iommu;
+ unsigned long long base;
+ int i, ret = 0;
+ u64 value;
+
+ rcu_read_lock();
+ for_each_active_iommu(iommu, drhd) {
+ if (!drhd->reg_base_addr) {
+ seq_puts(m, "IOMMU: Invalid base address\n");
+ ret = -EINVAL;
+ goto out;
+ }
+
+ base = drhd->reg_base_addr;
+ seq_printf(m, "DMAR: %s: Register Base Address %llx\n",
+ iommu->name, base);
+ seq_puts(m, "Name\t\t\tOffset\t\tContents\n");
+ /*
+ * Publish the contents of the 64-bit hardware registers
+ * by adding the offset to the pointer (virtual address).
+ */
+ for (i = 0 ; i < ARRAY_SIZE(iommu_regs); i++) {
+ value = dmar_readq(iommu->reg + iommu_regs[i].offset);
+ seq_printf(m, "%-8s\t\t0x%02x\t\t0x%016llx\n",
+ iommu_regs[i].regs, iommu_regs[i].offset,
+ value);
+ }
+ seq_putc(m, '\n');
+ }
+out:
+ rcu_read_unlock();
+
+ return ret;
+}
+DEFINE_SHOW_ATTRIBUTE(iommu_regset);
+
void __init intel_iommu_debugfs_init(void)
{
struct dentry *iommu_debug_root;
@@ -126,4 +208,6 @@ void __init intel_iommu_debugfs_init(void)
debugfs_create_file("dmar_translation_struct", 0444, iommu_debug_root,
NULL, &dmar_translation_struct_fops);
+ debugfs_create_file("iommu_regset", 0444, iommu_debug_root, NULL,
+ &iommu_regset_fops);
}
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 8968afa..1044a14 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -71,6 +71,8 @@
#define DMAR_PEDATA_REG 0xe4 /* Page request event interrupt data register */
#define DMAR_PEADDR_REG 0xe8 /* Page request event interrupt addr register */
#define DMAR_PEUADDR_REG 0xec /* Page request event Upper address register */
+#define DMAR_MTRRCAP_REG 0x100 /* Memory type range register capability register */
+#define DMAR_MTRRDEF_REG 0x108 /* Memory type range register default type register */
#define OFFSET_STRIDE (9)
--
2.7.4
From: Gayatri Kammela <[email protected]>
Add a new config option CONFIG_INTEL_IOMMU_DEBUG and export Intel IOMMU
internals states, such as root and context in debugfs to the userspace.
Example of such dump in Kabylake:
root@OTC-KBLH-01:~# cat
/sys/kernel/debug/intel_iommu/dmar_translation_struct
IOMMU dmar1: Extended Root Table Address:4071d3800
Extended Root Table Entries:
Bus 0 L: 4071d7001 H: 0
Lower Context Table Entries for Bus: 0
[entry] Device B:D.F Low High
[16] 0000:00:02.0 4071d6005 102
Higher Context Table Entries for Bus: 0
[16] 0000:00:02.0 0 0
IOMMU dmar0: Extended Root Table Address:4071d4800
IOMMU dmar2: Root Table Address:4071d5000
Root Table Entries:
Bus 0 L: 406d13001 H: 0
Context Table Entries for Bus: 0
[entry] Device B:D.F Low High
[160] 0000:00:14.0 406d12001 102
[184] 0000:00:17.0 405756001 302
[248] 0000:00:1f.0 406d3b001 202
[251] 0000:00:1f.3 405497001 402
[254] 0000:00:1f.6 40662e001 502
Root Table Entries:
Bus 1 L: 401e03001 H: 0
Context Table Entries for Bus: 1
[entry] Device B:D.F Low High
[0] 0000:01:00.0 401e04001 602
Cc: Fenghua Yu <[email protected]>
Cc: Ashok Raj <[email protected]>
Co-Developed-by: Sohil Mehta <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Signed-off-by: Sohil Mehta <[email protected]>
Signed-off-by: Gayatri Kammela <[email protected]>
---
v7: Split patch 1/5 and 2/5 differently
Update commit message and copyright year
Fix typo in a comment
Simplify code
v6: Change the order of includes to an alphabetical order
Change seq_printf formatting
v5: Change to a SPDX license tag
Fix seq_printf formatting
v4: Remove the unused function parameter
Fix checkpatch.pl warnings
Remove error reporting for debugfs_create_file function
Fix unnecessary reprogramming of the context entries
Simplify and merge the show context and extended context patch into one
Remove redundant IOMMU null check under for_each_active_iommu
v3: Add a macro for seq file operations
Change the intel_iommu_ctx file name to dmar_translation_struct
v2: No change
drivers/iommu/Kconfig | 8 +++
drivers/iommu/Makefile | 1 +
drivers/iommu/intel-iommu-debug.c | 129 ++++++++++++++++++++++++++++++++++++++
drivers/iommu/intel-iommu.c | 1 +
include/linux/intel-iommu.h | 6 ++
5 files changed, 145 insertions(+)
create mode 100644 drivers/iommu/intel-iommu-debug.c
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index f3a2134..332648f 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -152,6 +152,14 @@ config INTEL_IOMMU
and include PCI device scope covered by these DMA
remapping devices.
+config INTEL_IOMMU_DEBUG
+ bool "Export Intel IOMMU internals in Debugfs"
+ depends on INTEL_IOMMU && DEBUG_FS
+ help
+ Debugfs support to export IOMMU context internals, register contents,
+ PASID internals and interrupt remapping. To access this information in
+ sysfs, say Y.
+
config INTEL_IOMMU_SVM
bool "Support for Shared Virtual Memory with Intel IOMMU"
depends on INTEL_IOMMU && X86
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 1fb6958..fdbaf46 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_ARM_SMMU) += arm-smmu.o
obj-$(CONFIG_ARM_SMMU_V3) += arm-smmu-v3.o
obj-$(CONFIG_DMAR_TABLE) += dmar.o
obj-$(CONFIG_INTEL_IOMMU) += intel-iommu.o
+obj-$(CONFIG_INTEL_IOMMU_DEBUG) += intel-iommu-debug.o
obj-$(CONFIG_INTEL_IOMMU_SVM) += intel-svm.o
obj-$(CONFIG_IPMMU_VMSA) += ipmmu-vmsa.o
obj-$(CONFIG_IRQ_REMAP) += intel_irq_remapping.o irq_remapping.o
diff --git a/drivers/iommu/intel-iommu-debug.c b/drivers/iommu/intel-iommu-debug.c
new file mode 100644
index 0000000..8253503
--- /dev/null
+++ b/drivers/iommu/intel-iommu-debug.c
@@ -0,0 +1,129 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright © 2018 Intel Corporation.
+ *
+ * Authors: Gayatri Kammela <[email protected]>
+ * Jacob Pan <[email protected]>
+ * Sohil Mehta <[email protected]>
+ */
+
+#define pr_fmt(fmt) "INTEL_IOMMU: " fmt
+#include <linux/debugfs.h>
+#include <linux/dmar.h>
+#include <linux/err.h>
+#include <linux/intel-iommu.h>
+#include <linux/intel-svm.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/pci.h>
+#include <linux/pm_runtime.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/uaccess.h>
+
+#include "irq_remapping.h"
+
+#define TOTAL_BUS_NR 256 /* full bus range */
+#define DEFINE_SHOW_ATTRIBUTE(__name) \
+static int __name ## _open(struct inode *inode, struct file *file) \
+{ \
+ return single_open(file, __name ## _show, inode->i_private); \
+} \
+static const struct file_operations __name ## _fops = \
+{ \
+ .open = __name ## _open, \
+ .read = seq_read, \
+ .llseek = seq_lseek, \
+ .release = single_release, \
+ .owner = THIS_MODULE, \
+}
+
+static void ctx_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu,
+ int bus, bool ext)
+{
+ const char *ct = ext ? "Lower Context Table" : "Context Table";
+ struct context_entry *context;
+ unsigned long flags;
+ int ctx;
+
+ seq_printf(m, "%s Entries for Bus: %d\n", ct, bus);
+ seq_puts(m, "[entry]\tDevice B:D.F\tLow\t\tHigh\n");
+
+ spin_lock_irqsave(&iommu->lock, flags);
+
+ /* Publish either context entries or extended context entries */
+ for (ctx = 0; ctx < (ext ? 128 : 256); ctx++) {
+ context = iommu_context_addr(iommu, bus, ctx, 0);
+ if (!context)
+ goto out;
+
+ if (!context_present(context))
+ continue;
+
+ seq_printf(m, "[%d]\t%04x:%02x:%02x.%x\t%llx\t%llx\n", ctx,
+ iommu->segment, bus, PCI_SLOT(ctx), PCI_FUNC(ctx),
+ context[0].lo, context[0].hi);
+
+ if (!ecap_ecs(iommu->ecap))
+ continue;
+
+ seq_printf(m, "Higher Context Table Entries for Bus: %d\n",
+ bus);
+ seq_printf(m, "[%d]\t%04x:%02x:%02x.%x\t%llx\t%llx\n", ctx,
+ iommu->segment, bus, PCI_SLOT(ctx), PCI_FUNC(ctx),
+ context[1].lo, context[1].hi);
+ }
+out:
+ spin_unlock_irqrestore(&iommu->lock, flags);
+}
+
+static void root_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu)
+{
+ u64 rtaddr_reg = dmar_readq(iommu->reg + DMAR_RTADDR_REG);
+ bool ext = !!(rtaddr_reg & DMA_RTADDR_RTT);
+ const char *rt = ext ? "Extended Root Table" : "Root Table";
+ int bus;
+
+ seq_printf(m, "IOMMU %s: %s Address:%llx\n", iommu->name, rt,
+ rtaddr_reg);
+ /* Publish extended root table entries or root table entries here */
+ for (bus = 0; bus < TOTAL_BUS_NR; bus++) {
+ if (!iommu->root_entry[bus].lo)
+ continue;
+
+ seq_printf(m, "%s Entries:\n", rt);
+ seq_printf(m, "Bus %d L: %llx H: %llx\n", bus,
+ iommu->root_entry[bus].lo,
+ iommu->root_entry[bus].hi);
+
+ ctx_tbl_entry_show(m, iommu, bus, ext);
+ }
+}
+
+static int dmar_translation_struct_show(struct seq_file *m, void *unused)
+{
+ struct dmar_drhd_unit *drhd;
+ struct intel_iommu *iommu;
+
+ rcu_read_lock();
+ for_each_active_iommu(iommu, drhd) {
+ root_tbl_entry_show(m, iommu);
+ seq_putc(m, '\n');
+ }
+ rcu_read_unlock();
+
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(dmar_translation_struct);
+
+void __init intel_iommu_debugfs_init(void)
+{
+ struct dentry *iommu_debug_root;
+
+ iommu_debug_root = debugfs_create_dir("intel_iommu", NULL);
+ if (!iommu_debug_root)
+ return;
+
+ debugfs_create_file("dmar_translation_struct", 0444, iommu_debug_root,
+ NULL, &dmar_translation_struct_fops);
+}
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index f6241f6..e23f31d3 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -4805,6 +4805,7 @@ int __init intel_iommu_init(void)
cpuhp_setup_state(CPUHP_IOMMU_INTEL_DEAD, "iommu/intel:dead", NULL,
intel_iommu_cpu_dead);
intel_iommu_enabled = 1;
+ intel_iommu_debugfs_init();
return 0;
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 78ec85a..8968afa 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -518,6 +518,12 @@ extern struct intel_iommu *intel_svm_device_to_iommu(struct device *dev);
extern unsigned long intel_iommu_get_pts(struct intel_iommu *iommu);
#endif
+#ifdef CONFIG_INTEL_IOMMU_DEBUG
+extern void intel_iommu_debugfs_init(void);
+#else
+static inline void intel_iommu_debugfs_init(void) {}
+#endif /* CONFIG_INTEL_IOMMU_DEBUG */
+
extern const struct attribute_group *intel_iommu_groups[];
extern bool context_present(struct context_entry *context);
extern struct context_entry *iommu_context_addr(struct intel_iommu *iommu,
--
2.7.4
From: Gayatri Kammela <[email protected]>
To reuse the static functions and the struct declarations, move them to
corresponding header files and export the needed functions.
Cc: Sohil Mehta <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Ashok Raj <[email protected]>
Signed-off-by: Jacob Pan <[email protected]>
Signed-off-by: Gayatri Kammela <[email protected]>
---
v7: Split patch 1/5 and 2/5 differently
Update the commit message
v6: No change
v5: No change
v4: No change
v3: No change
v2: No change
drivers/iommu/intel-iommu.c | 33 ++++-----------------------------
include/linux/intel-iommu.h | 31 +++++++++++++++++++++++++++++++
include/linux/intel-svm.h | 2 +-
3 files changed, 36 insertions(+), 30 deletions(-)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 4a2de34..f6241f6 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -183,16 +183,6 @@ static int rwbf_quirk;
static int force_on = 0;
int intel_iommu_tboot_noforce;
-/*
- * 0: Present
- * 1-11: Reserved
- * 12-63: Context Ptr (12 - (haw-1))
- * 64-127: Reserved
- */
-struct root_entry {
- u64 lo;
- u64 hi;
-};
#define ROOT_ENTRY_NR (VTD_PAGE_SIZE/sizeof(struct root_entry))
/*
@@ -218,21 +208,6 @@ static phys_addr_t root_entry_uctp(struct root_entry *re)
return re->hi & VTD_PAGE_MASK;
}
-/*
- * low 64 bits:
- * 0: present
- * 1: fault processing disable
- * 2-3: translation type
- * 12-63: address space root
- * high 64 bits:
- * 0-2: address width
- * 3-6: aval
- * 8-23: domain id
- */
-struct context_entry {
- u64 lo;
- u64 hi;
-};
static inline void context_clear_pasid_enable(struct context_entry *context)
{
@@ -259,7 +234,7 @@ static inline bool __context_present(struct context_entry *context)
return (context->lo & 1);
}
-static inline bool context_present(struct context_entry *context)
+bool context_present(struct context_entry *context)
{
return context_pasid_enabled(context) ?
__context_present(context) :
@@ -819,8 +794,8 @@ static void domain_update_iommu_cap(struct dmar_domain *domain)
domain->iommu_superpage = domain_update_iommu_superpage(NULL);
}
-static inline struct context_entry *iommu_context_addr(struct intel_iommu *iommu,
- u8 bus, u8 devfn, int alloc)
+struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus,
+ u8 devfn, int alloc)
{
struct root_entry *root = &iommu->root_entry[bus];
struct context_entry *context;
@@ -5208,7 +5183,7 @@ static void intel_iommu_put_resv_regions(struct device *dev,
#ifdef CONFIG_INTEL_IOMMU_SVM
#define MAX_NR_PASID_BITS (20)
-static inline unsigned long intel_iommu_get_pts(struct intel_iommu *iommu)
+unsigned long intel_iommu_get_pts(struct intel_iommu *iommu)
{
/*
* Convert ecap_pss to extend context entry pts encoding, also
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index f3274d9..78ec85a 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -383,6 +383,33 @@ struct pasid_entry;
struct pasid_state_entry;
struct page_req_dsc;
+/*
+ * 0: Present
+ * 1-11: Reserved
+ * 12-63: Context Ptr (12 - (haw-1))
+ * 64-127: Reserved
+ */
+struct root_entry {
+ u64 lo;
+ u64 hi;
+};
+
+/*
+ * low 64 bits:
+ * 0: present
+ * 1: fault processing disable
+ * 2-3: translation type
+ * 12-63: address space root
+ * high 64 bits:
+ * 0-2: address width
+ * 3-6: aval
+ * 8-23: domain id
+ */
+struct context_entry {
+ u64 lo;
+ u64 hi;
+};
+
struct intel_iommu {
void __iomem *reg; /* Pointer to hardware regs, virtual addr */
u64 reg_phys; /* physical address of hw register set */
@@ -488,8 +515,12 @@ struct intel_svm {
extern int intel_iommu_enable_pasid(struct intel_iommu *iommu, struct intel_svm_dev *sdev);
extern struct intel_iommu *intel_svm_device_to_iommu(struct device *dev);
+extern unsigned long intel_iommu_get_pts(struct intel_iommu *iommu);
#endif
extern const struct attribute_group *intel_iommu_groups[];
+extern bool context_present(struct context_entry *context);
+extern struct context_entry *iommu_context_addr(struct intel_iommu *iommu,
+ u8 bus, u8 devfn, int alloc);
#endif
diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index 99bc5b3..733eaf9 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -130,7 +130,7 @@ static inline int intel_svm_unbind_mm(struct device *dev, int pasid)
BUG();
}
-static int intel_svm_is_pasid_valid(struct device *dev, int pasid)
+static inline int intel_svm_is_pasid_valid(struct device *dev, int pasid)
{
return -EINVAL;
}
--
2.7.4
From: Gayatri Kammela <[email protected]>
Debugfs extension to dump the internals such as pasid table entries for
each IOMMU to the userspace.
Example of such dump in Kabylake:
root@OTC-KBLH-01:~# cat
/sys/kernel/debug/intel_iommu/dmar_translation_struct
IOMMU dmar1: Extended Root Table Address:4071d3800
Extended Root Table Entries:
Bus 0 L: 4071d7001 H: 0
Lower Context Table Entries for Bus: 0
[entry] Device B:D.F Low High
[16] 0000:00:02.0 4071d6005 102
Higher Context Table Entries for Bus: 0
[16] 0000:00:02.0 0 0
Pasid Table Address: 00000000746cb0af
Pasid Table Entries for domain 0:
[Entry] Contents
[0] 12c409801
Cc: Fenghua Yu <[email protected]>
Cc: Jacob Pan <[email protected]>
Cc: Ashok Raj <[email protected]>
Co-Developed-by: Sohil Mehta <[email protected]>
Signed-off-by: Sohil Mehta <[email protected]>
Signed-off-by: Gayatri Kammela <[email protected]>
---
v7: Improve code indentation and formatting
v6: No change
v5: No change
v4: Remove the unused function parameter
Fix checkpatch.pl warnings
v3: No change
v2: Fix seq_printf formatting
drivers/iommu/intel-iommu-debug.c | 31 +++++++++++++++++++++++++++++++
drivers/iommu/intel-svm.c | 8 --------
include/linux/intel-svm.h | 8 ++++++++
3 files changed, 39 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/intel-iommu-debug.c b/drivers/iommu/intel-iommu-debug.c
index 38651ad..a9a99aa 100644
--- a/drivers/iommu/intel-iommu-debug.c
+++ b/drivers/iommu/intel-iommu-debug.c
@@ -81,6 +81,36 @@ static const struct iommu_regset iommu_regs[] = {
IOMMU_REGSET_ENTRY(MTRRDEF)
};
+#ifdef CONFIG_INTEL_IOMMU_SVM
+static void pasid_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu)
+{
+ int pasid_size = 0, i;
+
+ if (!ecap_pasid(iommu->ecap))
+ return;
+
+ pasid_size = intel_iommu_get_pts(iommu);
+ seq_printf(m, "Pasid Table Address: %p\n", iommu->pasid_table);
+
+ if (!iommu->pasid_table)
+ return;
+
+ seq_printf(m, "Pasid Table Entries for domain %d:\n", iommu->segment);
+ seq_puts(m, "[Entry]\t\tContents\n");
+
+ /* Publish the pasid table entries here */
+ for (i = 0; i < pasid_size; i++) {
+ if (!iommu->pasid_table[i].val)
+ continue;
+
+ seq_printf(m, "[%d]\t\t%04llx\n", i, iommu->pasid_table[i].val);
+ }
+}
+#else /* CONFIG_INTEL_IOMMU_SVM */
+static inline void
+pasid_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu) {}
+#endif /* CONFIG_INTEL_IOMMU_SVM */
+
static void ctx_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu,
int bus, bool ext)
{
@@ -116,6 +146,7 @@ static void ctx_tbl_entry_show(struct seq_file *m, struct intel_iommu *iommu,
iommu->segment, bus, PCI_SLOT(ctx), PCI_FUNC(ctx),
context[1].lo, context[1].hi);
}
+ pasid_tbl_entry_show(m, iommu);
out:
spin_unlock_irqrestore(&iommu->lock, flags);
}
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index ed1cf7c..c646724 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -28,14 +28,6 @@
static irqreturn_t prq_event_thread(int irq, void *d);
-struct pasid_entry {
- u64 val;
-};
-
-struct pasid_state_entry {
- u64 val;
-};
-
int intel_svm_alloc_pasid_tables(struct intel_iommu *iommu)
{
struct page *pages;
diff --git a/include/linux/intel-svm.h b/include/linux/intel-svm.h
index 733eaf9..a8abad6 100644
--- a/include/linux/intel-svm.h
+++ b/include/linux/intel-svm.h
@@ -18,6 +18,14 @@
struct device;
+struct pasid_entry {
+ u64 val;
+};
+
+struct pasid_state_entry {
+ u64 val;
+};
+
struct svm_dev_ops {
void (*fault_cb)(struct device *dev, int pasid, u64 address,
u32 private, int rwxp, int response);
--
2.7.4
Debugfs extension for Intel IOMMU to dump Interrupt remapping table
entries for Interrupt remapping and Interrupt posting.
The file /sys/kernel/debug/intel_iommu/ir_translation_struct provides
detailed information, such as Index, Source Id, Destination Id, Vector
and the IRTE values for entries with the present bit set, in the format
shown.
Remapped Interrupt supported on IOMMU: dmar7
IR table address:85e500000
Index SrcID DstID Vct IRTE_high IRTE_low
1 f0f8 00000100 30 000000000004f0f8 000001000030000d
7 f0f8 00000400 22 000000000004f0f8 000004000022000d
Posted Interrupt supported on IOMMU: dmar5
IR table address:85ec00000
Index SrcID PDA_high PDA_low Vct IRTE_high IRTE_low
4 4300 0000000f ff765980 41 0000000f00044300 ff76598000418001
5 4300 0000000f ff765980 51 0000000f00044300 ff76598000518001
Cc: Jacob Pan <[email protected]>
Cc: Fenghua Yu <[email protected]>
Cc: Ashok Raj <[email protected]>
Co-Developed-by: Gayatri Kammela <[email protected]>
Signed-off-by: Gayatri Kammela <[email protected]>
Signed-off-by: Sohil Mehta <[email protected]>
---
v7: Print the IR table physical base address
Simplify IR table formatting
v6: Change a couple of seq_puts to seq_putc
v5: Fix seq_puts formatting and remove leading '\n's
v4: Remove the unused function parameter
Fix checkpatch.pl warnings
Remove error reporting for debugfs_create_file function
Remove redundant IOMMU null check under for_each_active_iommu
v3: Use a macro for seq file operations
Change the intel_iommu_interrupt_remap file name to ir_translation_struct
v2: Handle the case when IR is not enabled. Fix seq_printf formatting
drivers/iommu/intel-iommu-debug.c | 94 +++++++++++++++++++++++++++++++++++++++
1 file changed, 94 insertions(+)
diff --git a/drivers/iommu/intel-iommu-debug.c b/drivers/iommu/intel-iommu-debug.c
index a9a99aa..b66a073 100644
--- a/drivers/iommu/intel-iommu-debug.c
+++ b/drivers/iommu/intel-iommu-debug.c
@@ -229,6 +229,96 @@ static int iommu_regset_show(struct seq_file *m, void *unused)
}
DEFINE_SHOW_ATTRIBUTE(iommu_regset);
+#ifdef CONFIG_IRQ_REMAP
+static void ir_tbl_remap_entry_show(struct seq_file *m,
+ struct intel_iommu *iommu)
+{
+ struct irte *ri_entry;
+ int idx;
+
+ seq_puts(m, " Index SrcID DstID Vct IRTE_high\t\tIRTE_low\n");
+
+ for (idx = 0; idx < INTR_REMAP_TABLE_ENTRIES; idx++) {
+ ri_entry = &iommu->ir_table->base[idx];
+ if (!ri_entry->present || ri_entry->p_pst)
+ continue;
+
+ seq_printf(m, " %d\t%04x %08x %02x %016llx\t%016llx\n", idx,
+ ri_entry->sid, ri_entry->dest_id, ri_entry->vector,
+ ri_entry->high, ri_entry->low);
+ }
+}
+
+static void ir_tbl_posted_entry_show(struct seq_file *m,
+ struct intel_iommu *iommu)
+{
+ struct irte *pi_entry;
+ int idx;
+
+ seq_puts(m, " Index SrcID PDA_high PDA_low Vct IRTE_high\t\tIRTE_low\n");
+
+ for (idx = 0; idx < INTR_REMAP_TABLE_ENTRIES; idx++) {
+ pi_entry = &iommu->ir_table->base[idx];
+ if (!pi_entry->present || !pi_entry->p_pst)
+ continue;
+
+ seq_printf(m, " %d\t%04x %08x %08x %02x %016llx\t%016llx\n",
+ idx, pi_entry->sid, pi_entry->pda_h,
+ pi_entry->pda_l << 6, pi_entry->vector,
+ pi_entry->high, pi_entry->low);
+ }
+}
+
+/*
+ * For active IOMMUs go through the Interrupt remapping
+ * table and print valid entries in a table format for
+ * Remapped and Posted Interrupts.
+ */
+static int ir_translation_struct_show(struct seq_file *m, void *unused)
+{
+ struct dmar_drhd_unit *drhd;
+ struct intel_iommu *iommu;
+ u64 irta;
+
+ rcu_read_lock();
+ for_each_active_iommu(iommu, drhd) {
+ if (!ecap_ir_support(iommu->ecap))
+ continue;
+
+ irta = dmar_readq(iommu->reg + DMAR_IRTA_REG) & VTD_PAGE_MASK;
+ seq_printf(m, "Remapped Interrupt supported on IOMMU: %s\n"
+ " IR table address:%llx\n", iommu->name, irta);
+
+ if (iommu->ir_table && irta)
+ ir_tbl_remap_entry_show(m, iommu);
+ else
+ seq_puts(m, "Interrupt Remapping is not enabled\n");
+ seq_putc(m, '\n');
+ }
+
+ seq_puts(m, "****\n\n");
+
+ for_each_active_iommu(iommu, drhd) {
+ if (!cap_pi_support(iommu->cap))
+ continue;
+
+ irta = dmar_readq(iommu->reg + DMAR_IRTA_REG) & VTD_PAGE_MASK;
+ seq_printf(m, "Posted Interrupt supported on IOMMU: %s\n"
+ " IR table address:%llx\n", iommu->name, irta);
+
+ if (iommu->ir_table && irta)
+ ir_tbl_posted_entry_show(m, iommu);
+ else
+ seq_puts(m, "Interrupt Remapping is not enabled\n");
+ seq_putc(m, '\n');
+ }
+ rcu_read_unlock();
+
+ return 0;
+}
+DEFINE_SHOW_ATTRIBUTE(ir_translation_struct);
+#endif
+
void __init intel_iommu_debugfs_init(void)
{
struct dentry *iommu_debug_root;
@@ -241,4 +331,8 @@ void __init intel_iommu_debugfs_init(void)
NULL, &dmar_translation_struct_fops);
debugfs_create_file("iommu_regset", 0444, iommu_debug_root, NULL,
&iommu_regset_fops);
+#ifdef CONFIG_IRQ_REMAP
+ debugfs_create_file("ir_translation_struct", 0444, iommu_debug_root,
+ NULL, &ir_translation_struct_fops);
+#endif
}
--
2.7.4
On Fri, 2018-02-02 at 16:49 -0800, Sohil Mehta wrote:
> Hi All,
>
> This series aims to add debugfs support for Intel IOMMU. It exposes
> IOMMU
> registers, internal context and dumps individual table entries to help
> debug
> Intel IOMMUs.
>
> The first patch does the ground work for the following patches by
> reorganizing
> some Intel IOMMU data structures. The following patches create a new
> Kconfig
> option - INTEL_IOMMU_DEBUG and add debugfs support for IOMMU context
> internals,
> register contents, PASID internals, and Interrupt remapping in that
> order. The
> information can be accessed in sysfs at
> '/sys/kernel/debug/intel_iommu/'.
>
>
Nice and clean in comparison to v1.
Reviewed-by: Andy Shevchenko <[email protected]>
Joerg, note, that macro, which patch 2 defines privately, likely will
make v4.16-rc1, thus patch 2 might need to be rebased. So, please, wait
till v4.16-rc1 before applying this.
> Regards,
> Sohil
>
> Changes since v6:
> - Split patch 1/5 and 2/5 differently
> - Simplify and improve code formatting
> - Use macro for register set definitions
> - Fix compiler warning for readq
> - Add Co-Developed-by tag to commit messages
>
> Changes since v5:
> - Change the order of includes to an alphabetical order
> - Change seq_printf and seq_puts formatting
>
> Changes since v4:
> - Change to a SPDX license tag
> - Fix seq_printf formatting and remove leading '\n's
>
> Changes since v3:
> - Remove an unused function parameter from some of the functions
> - Fix checkpatch.pl warnings
> - Remove error reporting for debugfs_create_file functions
> - Fix unnecessary reprogramming of the context entries
> - Simplify and merge the show context and extended context patch into
> one
> - Remove redundant IOMMU null check under for_each_active_iommu
> - Update the commit title to be consistent
>
> Changes since v2:
> - Added a macro for seq file operations based on recommendation by
> Andy
> Shevchenko. The marco can be moved to seq_file.h at a future point
> - Changed the debugfs file names to more relevant ones
> - Added information for MTRR registers in the regset file
>
> Changes since v1:
> - Fixed seq_printf formatting
> - Handled the case when Interrupt remapping is not enabled
>
> Gayatri Kammela (4):
> iommu/vt-d: Relocate struct/function declarations to its header
> files
> iommu/vt-d: Enable debugfs support to show context internals
> iommu/vt-d: Add debugfs support to show register contents
> iommu/vt-d: Add debugfs support to show Pasid table contents
>
> Sohil Mehta (1):
> iommu/vt-d: Add debugfs support for Interrupt remapping
>
> drivers/iommu/Kconfig | 8 +
> drivers/iommu/Makefile | 1 +
> drivers/iommu/intel-iommu-debug.c | 338
> ++++++++++++++++++++++++++++++++++++++
> drivers/iommu/intel-iommu.c | 34 +---
> drivers/iommu/intel-svm.c | 8 -
> include/linux/intel-iommu.h | 39 +++++
> include/linux/intel-svm.h | 10 +-
> 7 files changed, 400 insertions(+), 38 deletions(-)
> create mode 100644 drivers/iommu/intel-iommu-debug.c
>
--
Andy Shevchenko <[email protected]>
Intel Finland Oy
On Fri, Feb 02, 2018 at 04:49:56PM -0800, Sohil Mehta wrote:
> This series aims to add debugfs support for Intel IOMMU. It exposes IOMMU
> registers, internal context and dumps individual table entries to help debug
> Intel IOMMUs.
>
> The first patch does the ground work for the following patches by reorganizing
> some Intel IOMMU data structures. The following patches create a new Kconfig
> option - INTEL_IOMMU_DEBUG and add debugfs support for IOMMU context internals,
> register contents, PASID internals, and Interrupt remapping in that order. The
> information can be accessed in sysfs at '/sys/kernel/debug/intel_iommu/'.
This looks like it only presents data from the iommu-hardware (register
state) or from in-memory data structures used by the hardware. Can't all
this be read out from user-space with libpci and /dev/mem access?
Things are different for kernel-defined data structures, as they might
change between releases and can be presented to user-space via debugfs
is needed, but the data structures used by the hardware should be pretty
stable.
Joerg
Hi Joerg,
On Tue, Feb 13, 2018 at 03:03:03PM +0100, Joerg Roedel wrote:
> On Fri, Feb 02, 2018 at 04:49:56PM -0800, Sohil Mehta wrote:
> > This series aims to add debugfs support for Intel IOMMU. It exposes IOMMU
> > registers, internal context and dumps individual table entries to help debug
> > Intel IOMMUs.
> >
> > The first patch does the ground work for the following patches by reorganizing
> > some Intel IOMMU data structures. The following patches create a new Kconfig
> > option - INTEL_IOMMU_DEBUG and add debugfs support for IOMMU context internals,
> > register contents, PASID internals, and Interrupt remapping in that order. The
> > information can be accessed in sysfs at '/sys/kernel/debug/intel_iommu/'.
>
> This looks like it only presents data from the iommu-hardware (register
> state) or from in-memory data structures used by the hardware. Can't all
> this be read out from user-space with libpci and /dev/mem access?
True, we can do the tools and keep it strictly user space, but find it very
convenient to keep the kernel code and debugfs together. When there are
bug-reports its rather easy for the user to collect some data and report it,
and all the data-structures we need are readily available instead of finding
a round-about way to capture the same data from user-space.
This version has only hw dumps for now, but we plan to add some other things
like walking 2nd level page-tables, or get some SVM specific data from the
driver in the future.
>
> Things are different for kernel-defined data structures, as they might
> change between releases and can be presented to user-space via debugfs
> is needed, but the data structures used by the hardware should be pretty
> stable.
Cheers,
Ashok
On Tue, 13 Feb 2018 13:40:02 -0800
"Raj, Ashok" <[email protected]> wrote:
> Hi Joerg,
>
> On Tue, Feb 13, 2018 at 03:03:03PM +0100, Joerg Roedel wrote:
> > On Fri, Feb 02, 2018 at 04:49:56PM -0800, Sohil Mehta wrote:
> > > This series aims to add debugfs support for Intel IOMMU. It
> > > exposes IOMMU registers, internal context and dumps individual
> > > table entries to help debug Intel IOMMUs.
> > >
> > > The first patch does the ground work for the following patches by
> > > reorganizing some Intel IOMMU data structures. The following
> > > patches create a new Kconfig option - INTEL_IOMMU_DEBUG and add
> > > debugfs support for IOMMU context internals, register contents,
> > > PASID internals, and Interrupt remapping in that order. The
> > > information can be accessed in sysfs at
> > > '/sys/kernel/debug/intel_iommu/'.
> >
> > This looks like it only presents data from the iommu-hardware
> > (register state) or from in-memory data structures used by the
> > hardware. Can't all this be read out from user-space with libpci
> > and /dev/mem access?
>
> True, we can do the tools and keep it strictly user space, but find
> it very convenient to keep the kernel code and debugfs together. When
> there are bug-reports its rather easy for the user to collect some
> data and report it, and all the data-structures we need are readily
> available instead of finding a round-about way to capture the same
> data from user-space.
>
> This version has only hw dumps for now, but we plan to add some other
> things like walking 2nd level page-tables, or get some SVM specific
> data from the driver in the future.
>
We did start out with /dev/mem but run into CONFIG_STRICT_DEVMEM
requirement which is turned on by default.
libpci is only limited to PCI config space access, right?
> >
> > Things are different for kernel-defined data structures, as they
> > might change between releases and can be presented to user-space
> > via debugfs is needed, but the data structures used by the hardware
> > should be pretty stable.
>
> Cheers,
> Ashok
[Jacob Pan]
On Tue, Feb 13, 2018 at 02:53:32PM -0800, Jacob Pan wrote:
> We did start out with /dev/mem but run into CONFIG_STRICT_DEVMEM
> requirement which is turned on by default.
> libpci is only limited to PCI config space access, right?
Even if /dev/mem is not an option, I am still not convinced that this is
a good idea. How should this be used? Will you just use it to debug
Intel IOMMUs when you guys work on the driver or should users report the
data back when they hit problems?
Joerg
On Thu, 15 Feb 2018 10:53:38 +0100
Joerg Roedel <[email protected]> wrote:
> On Tue, Feb 13, 2018 at 02:53:32PM -0800, Jacob Pan wrote:
> > We did start out with /dev/mem but run into CONFIG_STRICT_DEVMEM
> > requirement which is turned on by default.
> > libpci is only limited to PCI config space access, right?
>
> Even if /dev/mem is not an option, I am still not convinced that this
> is a good idea. How should this be used? Will you just use it to debug
> Intel IOMMUs when you guys work on the driver or should users report
> the data back when they hit problems?
>
It is for both but mainly the latter. when we have customers/users
reporting Intel IOMMU issues, it would be extremely helpful if we can
ask them to turn debugfs on and report the state. As we move to SVA, we
may also include io page table similar to /sys/kernel/debug/page_table
for PASID etc.
Just wondering if your concern is on the implementation or the debugfs
idea in general. Perhaps have some common IOMMU debugfs?
On Tue, 2018-02-13 at 13:40 -0800, Raj, Ashok wrote:
> This version has only hw dumps for now, but we plan to add some other things
> like walking 2nd level page-tables, or get some SVM specific data from the
> driver in the future.
Hi,
I'm not sure how much you know about chipsec [1], but with a oneliner [2] you
can have it print a lot of data from Vt-d configuration, including page
tables. As far as I remember, it doesn't need access to /dev/mem (but I gidn't
dig to check how it does it).
It might still be useful to have everything from debugfs because it'll always
be consistent with the current running kernel, but at least it might help
users reporting bugs.
[1] https://github.com/chipsec/chipsec
[2] http://paste.debian.net/1010105
--
Yves-Alexis
On Sun, 18 Feb 2018 23:15:32 +0100
Yves-Alexis Perez <[email protected]> wrote:
> On Tue, 2018-02-13 at 13:40 -0800, Raj, Ashok wrote:
> > This version has only hw dumps for now, but we plan to add some
> > other things like walking 2nd level page-tables, or get some SVM
> > specific data from the driver in the future.
>
> Hi,
>
> I'm not sure how much you know about chipsec [1], but with a oneliner
> [2] you can have it print a lot of data from Vt-d configuration,
> including page tables. As far as I remember, it doesn't need access
> to /dev/mem (but I gidn't dig to check how it does it).
>
> It might still be useful to have everything from debugfs because
> it'll always be consistent with the current running kernel, but at
> least it might help users reporting bugs.
>
> [1] https://github.com/chipsec/chipsec
> [2] http://paste.debian.net/1010105
I didn't know about chipsec but reading the code seems to rely on an
out-of-tree kernel module. I don't think it matches what we need here.
On Tue, 2018-02-20 at 14:25 -0800, Jacob Pan wrote:
> I didn't know about chipsec but reading the code seems to rely on an
> out-of-tree kernel module. I don't think it matches what we need here.
Yes good indeed, I had forgot about that. Maybe the userland part is still
useful, but there's definitely a need for (protected) access to privileged
memory (and access to /dev/mem is less practical than debugfs, I guess).
Regards,
--
Yves-Alexis
On Thu, 22 Feb 2018 08:48:37 +0100
Yves-Alexis Perez <[email protected]> wrote:
> On Tue, 2018-02-20 at 14:25 -0800, Jacob Pan wrote:
> > I didn't know about chipsec but reading the code seems to rely on an
> > out-of-tree kernel module. I don't think it matches what we need
> > here.
>
> Yes good indeed, I had forgot about that. Maybe the userland part is
> still useful, but there's definitely a need for (protected) access to
> privileged memory (and access to /dev/mem is less practical than
> debugfs, I guess).
>
Another reason we can't use /dev/mem is that the context entries are
not static, they are created for each device when the first map()
comes. So we need to rely on the list in the intel iommu driver to dump
the context.
> Regards,
[Jacob Pan]
On Thu, Feb 15, 2018 at 08:38:11AM -0800, Jacob Pan wrote:
> Just wondering if your concern is on the implementation or the debugfs
> idea in general. Perhaps have some common IOMMU debugfs?
My concern mainly is that we add interfaces which reveal
potentially security relevant information to user-space and that tools
come up using it so that this also becomes kABI and we can't easily
change it anymore and this whole stuff turns into a maintence nightmare.
So that is definitly not something I'd like to see enabled in the
distros, and its better to avoid it at all and search for better ways to
debug upcoming issues.
BPF tracers and tracing in general comes to mind here...
Joerg
On Thu, 15 Mar 2018 14:18:54 +0100
Joerg Roedel <[email protected]> wrote:
> On Thu, Feb 15, 2018 at 08:38:11AM -0800, Jacob Pan wrote:
> > Just wondering if your concern is on the implementation or the
> > debugfs idea in general. Perhaps have some common IOMMU debugfs?
>
> My concern mainly is that we add interfaces which reveal
> potentially security relevant information
I don;t think security is any worse than existing kernel page table in
debugfs. i.e. /sys/kernel/debug/page_tables
This is a debug feature.
> to user-space and that tools
> come up using it so that this also becomes kABI and we can't easily
> change it anymore and this whole stuff turns into a maintence
> nightmare.
>
Agreed, perhaps we can address that by only dumping user readable data
which avoid having a parser tool that relies on stable kABI?
> So that is definitly not something I'd like to see enabled in the
> distros, and its better to avoid it at all and search for better ways
> to debug upcoming issues.
>
We can make it "def_bool n" so only used by advanced customers who can
recompile kernel.
> BPF tracers and tracing in general comes to mind here...
>
my concern is that tracing is suitable for dynamic debugging, but these
context info are mostly static. Perhaps I am missing some tracing
features.
Thanks,
Jacob
>
> Joerg
>
[Jacob Pan]
[ Adding Gary from AMD to Cc ]
On Mon, Mar 19, 2018 at 09:37:14AM -0700, Jacob Pan wrote:
> On Thu, 15 Mar 2018 14:18:54 +0100
> Joerg Roedel <[email protected]> wrote:
>
> > On Thu, Feb 15, 2018 at 08:38:11AM -0800, Jacob Pan wrote:
> > > Just wondering if your concern is on the implementation or the
> > > debugfs idea in general. Perhaps have some common IOMMU debugfs?
> >
> > My concern mainly is that we add interfaces which reveal
> > potentially security relevant information
> I don;t think security is any worse than existing kernel page table in
> debugfs. i.e. /sys/kernel/debug/page_tables
> This is a debug feature.
Okay, so here is the way to go: Please introduce a basic debugfs
facility to the core iommu code. It should basically only create a
'iommu/' directory in debugfs where drivers can create their own
sub-directories. This must be enabled by a new kconfig option
(CONFIG_IOMMU_DEBUGFS) and the kernel should print a big fat warning at
boot when it is enabled. This hopefully prevents anyone from enabling it
for production kernels.
Then in the next cycle I will review again more closely what information
about VT-d and AMD-Vi is revealed there and will probably apply what I
can live with.
Thanks,
Joerg
On 03/29/2018 03:48 AM, Joerg Roedel wrote:
> [ Adding Gary from AMD to Cc ]
>
> On Mon, Mar 19, 2018 at 09:37:14AM -0700, Jacob Pan wrote:
>> On Thu, 15 Mar 2018 14:18:54 +0100
>> Joerg Roedel <[email protected]> wrote:
>>
>>> On Thu, Feb 15, 2018 at 08:38:11AM -0800, Jacob Pan wrote:
>>>> Just wondering if your concern is on the implementation or the
>>>> debugfs idea in general. Perhaps have some common IOMMU debugfs?
>>>
>>> My concern mainly is that we add interfaces which reveal
>>> potentially security relevant information
>> I don;t think security is any worse than existing kernel page table in
>> debugfs. i.e. /sys/kernel/debug/page_tables
>> This is a debug feature.
>
> Okay, so here is the way to go: Please introduce a basic debugfs
> facility to the core iommu code. It should basically only create a
> 'iommu/' directory in debugfs where drivers can create their own
> sub-directories. This must be enabled by a new kconfig option
> (CONFIG_IOMMU_DEBUGFS) and the kernel should print a big fat warning at
> boot when it is enabled. This hopefully prevents anyone from enabling it
> for production kernels.
I'm halfway through this. Where would you like to place the invocation
of the initialization function?
There's an iommu_init() in iommu.c, But it's a core_initcall, which
doesn't seem like a good spot. Not knowing enough about bring-up here,
Would adding another __init function be suitable?
Gary
On Thu, 29 Mar 2018 10:48:24 +0200
Joerg Roedel <[email protected]> wrote:
> [ Adding Gary from AMD to Cc ]
>
> On Mon, Mar 19, 2018 at 09:37:14AM -0700, Jacob Pan wrote:
> > On Thu, 15 Mar 2018 14:18:54 +0100
> > Joerg Roedel <[email protected]> wrote:
> >
> > > On Thu, Feb 15, 2018 at 08:38:11AM -0800, Jacob Pan wrote:
> > > > Just wondering if your concern is on the implementation or the
> > > > debugfs idea in general. Perhaps have some common IOMMU
> > > > debugfs?
> > >
> > > My concern mainly is that we add interfaces which reveal
> > > potentially security relevant information
> > I don;t think security is any worse than existing kernel page table
> > in debugfs. i.e. /sys/kernel/debug/page_tables
> > This is a debug feature.
>
> Okay, so here is the way to go: Please introduce a basic debugfs
> facility to the core iommu code. It should basically only create a
> 'iommu/' directory in debugfs where drivers can create their own
> sub-directories. This must be enabled by a new kconfig option
> (CONFIG_IOMMU_DEBUGFS) and the kernel should print a big fat warning
> at boot when it is enabled. This hopefully prevents anyone from
> enabling it for production kernels.
>
> Then in the next cycle I will review again more closely what
> information about VT-d and AMD-Vi is revealed there and will probably
> apply what I can live with.
>
sounds great. we will provide vt-d info for both current and
potential extensions so that you can consider if there can be any
abstractions.
> Thanks,
>
> Joerg
>
[Jacob Pan]