2019-04-07 12:49:38

by Zhen Lei

Subject: [PATCH v4 0/6] normalize IOMMU dma mode boot options

As Robin Murphy suggested:
"It's also not necessarily obvious to the user how this interacts with
IOMMU_DEFAULT_PASSTHROUGH, so if we really do go down this route, maybe it
would be better to refactor the whole lot into a single selection of something
like IOMMU_DEFAULT_MODE anyway."

In this version, I try to normalize the IOMMU dma mode boot options across all
architectures. When the IOMMU is enabled, there are 3 dma modes: passthrough
(bypass), lazy (map, but defer the IOTLB invalidation), and strict. Currently,
however, each architecture defines its own private boot options, which differ
from one another. For example, to enable/disable "passthrough", ARM64 uses
iommu.passthrough=1/0, X86 uses iommu=pt/nopt, and PPC/POWERNV uses
iommu=nobypass.


Zhen Lei (6):
iommu: use iommu.dma_mode to replace iommu.passthrough and
iommu.strict
iommu: keep dma mode build options consistent with cmdline options
iommu: add iommu_default_dma_mode_get() helper
s390/pci: use common boot option iommu.dma_mode
powernv/iommu: use common boot option iommu.dma_mode
x86/iommu: use common boot option iommu.dma_mode

Documentation/admin-guide/kernel-parameters.txt | 42 +++++++-------------
arch/ia64/include/asm/iommu.h | 2 -
arch/ia64/kernel/pci-dma.c | 2 -
arch/powerpc/platforms/powernv/pci-ioda.c | 23 +----------
arch/s390/pci/pci_dma.c | 20 +++-------
arch/x86/include/asm/iommu.h | 1 -
arch/x86/kernel/pci-dma.c | 20 ----------
drivers/iommu/Kconfig | 36 ++++++++++++++---
drivers/iommu/amd_iommu.c | 12 +++---
drivers/iommu/amd_iommu_init.c | 4 --
drivers/iommu/amd_iommu_types.h | 6 ---
drivers/iommu/intel-iommu.c | 7 +---
drivers/iommu/iommu.c | 52 ++++++++++++++++---------
include/linux/iommu.h | 16 ++++++++
14 files changed, 108 insertions(+), 135 deletions(-)

--
1.8.3
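
To make the scope of the normalization concrete, the legacy options named in
this series can be tabulated against the three modes. What follows is only an
illustrative userspace sketch, not code from the series: the enum, the table
and legacy_to_dma_mode() are hypothetical names. The mappings themselves are
the ones stated in the commit messages of patches 4/6 to 6/6 and in the
iommu.strict/iommu.passthrough documentation that patch 1/6 removes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mode values as defined by patch 1/6 in include/linux/iommu.h. */
enum dma_mode {
	DMA_MODE_UNKNOWN     = -1,	/* hypothetical: not part of the series */
	DMA_MODE_STRICT      = 0x0,
	DMA_MODE_LAZY        = 0x1,
	DMA_MODE_PASSTHROUGH = 0x2,
};

struct legacy_option {
	const char *cmdline;	/* legacy boot option, exactly as typed */
	enum dma_mode mode;	/* equivalent iommu.dma_mode value */
};

static const struct legacy_option legacy_options[] = {
	{ "iommu.passthrough=1", DMA_MODE_PASSTHROUGH },	/* ARM64 */
	{ "iommu.strict=0",      DMA_MODE_LAZY },		/* ARM64 */
	{ "iommu.strict=1",      DMA_MODE_STRICT },		/* ARM64 */
	{ "iommu=pt",            DMA_MODE_PASSTHROUGH },	/* x86 */
	{ "iommu=nopt",          DMA_MODE_LAZY },		/* x86 */
	{ "intel_iommu=strict",  DMA_MODE_STRICT },		/* x86 */
	{ "amd_iommu=fullflush", DMA_MODE_STRICT },		/* x86 */
	{ "s390_iommu=strict",   DMA_MODE_STRICT },		/* s390 */
	{ "iommu=nobypass",      DMA_MODE_STRICT },		/* powernv */
};

/* Return the unified mode for a legacy option, or DMA_MODE_UNKNOWN. */
enum dma_mode legacy_to_dma_mode(const char *opt)
{
	size_t i;

	assert(opt != NULL);
	for (i = 0; i < sizeof(legacy_options) / sizeof(legacy_options[0]); i++)
		if (strcmp(opt, legacy_options[i].cmdline) == 0)
			return legacy_options[i].mode;
	return DMA_MODE_UNKNOWN;
}
```

A lookup such as legacy_to_dma_mode("iommu=nopt") yields DMA_MODE_LAZY.
iommu.passthrough=0 is left out on purpose, because on its own it does not
say whether lazy or strict is wanted.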



2019-04-07 12:49:38

by Zhen Lei

Subject: [PATCH v4 3/6] iommu: add iommu_default_dma_mode_get() helper

Add the iommu_default_dma_mode_get() helper, together with the
IOMMU_DMA_MODE_IS_LAZY() and IOMMU_DMA_MODE_IS_PASSTHROUGH() macros built on
it, to make the code look cleaner.

There is no functional change; this only prepares for the following patches.

Signed-off-by: Zhen Lei <[email protected]>
---
drivers/iommu/iommu.c | 12 ++++++++----
include/linux/iommu.h | 11 +++++++++++
2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b75e23a2ea08da8..876c0966db2481a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -168,6 +168,11 @@ static int __init iommu_dma_mode_setup(char *str)
}
early_param("iommu.dma_mode", iommu_dma_mode_setup);

+int iommu_default_dma_mode_get(void)
+{
+ return iommu_default_dma_mode;
+}
+
static ssize_t iommu_group_attr_show(struct kobject *kobj,
struct attribute *__attr, char *buf)
{
@@ -1109,9 +1114,8 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
*/
if (!group->default_domain) {
struct iommu_domain *dom;
- int def_domain_type =
- (iommu_default_dma_mode == IOMMU_DMA_MODE_PASSTHROUGH)
- ? IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;
+ int def_domain_type = IOMMU_DMA_MODE_IS_PASSTHROUGH() \
+ ? IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;

dom = __iommu_domain_alloc(dev->bus, def_domain_type);
if (!dom && def_domain_type != IOMMU_DOMAIN_DMA) {
@@ -1127,7 +1131,7 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
if (!group->domain)
group->domain = dom;

- if (dom && (iommu_default_dma_mode == IOMMU_DMA_MODE_LAZY)) {
+ if (dom && IOMMU_DMA_MODE_IS_LAZY()) {
int attr = 1;
iommu_domain_set_attr(dom,
DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index c3f4e3416176496..1b35ae3f6382e4a 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -46,6 +46,10 @@
#define IOMMU_DMA_MODE_STRICT 0x0
#define IOMMU_DMA_MODE_LAZY 0x1
#define IOMMU_DMA_MODE_PASSTHROUGH 0x2
+#define IOMMU_DMA_MODE_IS_LAZY() \
+ (iommu_default_dma_mode_get() == IOMMU_DMA_MODE_LAZY)
+#define IOMMU_DMA_MODE_IS_PASSTHROUGH() \
+ (iommu_default_dma_mode_get() == IOMMU_DMA_MODE_PASSTHROUGH)

struct iommu_ops;
struct iommu_group;
@@ -421,6 +425,8 @@ static inline void dev_iommu_fwspec_set(struct device *dev,
int iommu_probe_device(struct device *dev);
void iommu_release_device(struct device *dev);

+extern int iommu_default_dma_mode_get(void);
+
#else /* CONFIG_IOMMU_API */

struct iommu_ops {};
@@ -705,6 +711,11 @@ const struct iommu_ops *iommu_ops_from_fwnode(struct fwnode_handle *fwnode)
return NULL;
}

+static inline int iommu_default_dma_mode_get(void)
+{
+ return IOMMU_DMA_MODE_PASSTHROUGH;
+}
+
#endif /* CONFIG_IOMMU_API */

#ifdef CONFIG_IOMMU_DEBUGFS
--
1.8.3
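
The getter and the two predicate macros added above can be tried outside the
kernel. The following is a minimal userspace sketch, assuming placeholder
domain type values and two hypothetical helpers (set_mode_for_test() and
wants_flush_queue()) that do not exist in the series. It only mirrors how
iommu_group_get_for_dev() picks the default domain type and decides whether
to request the flush queue.

```c
#include <assert.h>
#include <stdbool.h>

#define IOMMU_DMA_MODE_STRICT		0x0
#define IOMMU_DMA_MODE_LAZY		0x1
#define IOMMU_DMA_MODE_PASSTHROUGH	0x2

/* Placeholder values for this sketch, not the kernel's definitions. */
enum domain_type { DOMAIN_IDENTITY, DOMAIN_DMA };

static int iommu_default_dma_mode = IOMMU_DMA_MODE_STRICT;

int iommu_default_dma_mode_get(void)
{
	return iommu_default_dma_mode;
}

#define IOMMU_DMA_MODE_IS_LAZY() \
	(iommu_default_dma_mode_get() == IOMMU_DMA_MODE_LAZY)
#define IOMMU_DMA_MODE_IS_PASSTHROUGH() \
	(iommu_default_dma_mode_get() == IOMMU_DMA_MODE_PASSTHROUGH)

/* Hypothetical hook standing in for the iommu.dma_mode= boot option. */
void set_mode_for_test(int mode)
{
	assert(mode >= IOMMU_DMA_MODE_STRICT &&
	       mode <= IOMMU_DMA_MODE_PASSTHROUGH);
	iommu_default_dma_mode = mode;
}

/* Passthrough selects an identity domain, everything else a DMA domain. */
enum domain_type default_domain_type(void)
{
	return IOMMU_DMA_MODE_IS_PASSTHROUGH() ? DOMAIN_IDENTITY : DOMAIN_DMA;
}

/* Only lazy mode asks for deferred IOTLB invalidation. */
bool wants_flush_queue(void)
{
	return IOMMU_DMA_MODE_IS_LAZY();
}
```

Because callers go through the getter, the mode variable itself can stay
static to iommu.c, which is presumably why the macros do not read it
directly.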


2019-04-07 12:49:38

by Zhen Lei

Subject: [PATCH v4 4/6] s390/pci: use common boot option iommu.dma_mode

s390_iommu=strict can be replaced with iommu.dma_mode=strict.

Signed-off-by: Zhen Lei <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 7 -------
arch/s390/pci/pci_dma.c | 20 +++++---------------
drivers/iommu/Kconfig | 1 +
3 files changed, 6 insertions(+), 22 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 60409ad23b2ac8b..a2df11945b33fc9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4095,13 +4095,6 @@

S [KNL] Run init in single mode

- s390_iommu= [HW,S390]
- Set s390 IOTLB flushing mode
- strict
- With strict flushing every unmap operation will result in
- an IOTLB flush. Default is lazy flushing before reuse,
- which is faster.
-
sa1100ir [NET]
See drivers/net/irda/sa1100_ir.c.

diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index 9e52d1527f71495..6f1615c16f33995 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -17,7 +17,6 @@

static struct kmem_cache *dma_region_table_cache;
static struct kmem_cache *dma_page_table_cache;
-static int s390_iommu_strict;

static int zpci_refresh_global(struct zpci_dev *zdev)
{
@@ -193,13 +192,13 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
if (!zdev->tlb_refresh)
return 0;
} else {
- if (!s390_iommu_strict)
+ if (IOMMU_DMA_MODE_IS_LAZY())
return 0;
}

ret = zpci_refresh_trans((u64) zdev->fh << 32, dma_addr,
PAGE_ALIGN(size));
- if (ret == -ENOMEM && !s390_iommu_strict) {
+ if (ret == -ENOMEM && IOMMU_DMA_MODE_IS_LAZY()) {
/* enable the hypervisor to free some resources */
if (zpci_refresh_global(zdev))
goto out;
@@ -278,7 +277,7 @@ static dma_addr_t dma_alloc_address(struct device *dev, int size)
spin_lock_irqsave(&zdev->iommu_bitmap_lock, flags);
offset = __dma_alloc_iommu(dev, zdev->next_bit, size);
if (offset == -1) {
- if (!s390_iommu_strict) {
+ if (IOMMU_DMA_MODE_IS_LAZY()) {
/* global flush before DMA addresses are reused */
if (zpci_refresh_global(zdev))
goto out_error;
@@ -313,7 +312,7 @@ static void dma_free_address(struct device *dev, dma_addr_t dma_addr, int size)
if (!zdev->iommu_bitmap)
goto out;

- if (s390_iommu_strict)
+ if (!IOMMU_DMA_MODE_IS_LAZY())
bitmap_clear(zdev->iommu_bitmap, offset, size);
else
bitmap_set(zdev->lazy_bitmap, offset, size);
@@ -584,7 +583,7 @@ int zpci_dma_init_device(struct zpci_dev *zdev)
rc = -ENOMEM;
goto free_dma_table;
}
- if (!s390_iommu_strict) {
+ if (IOMMU_DMA_MODE_IS_LAZY()) {
zdev->lazy_bitmap = vzalloc(zdev->iommu_pages / 8);
if (!zdev->lazy_bitmap) {
rc = -ENOMEM;
@@ -671,12 +670,3 @@ void zpci_dma_exit(void)
/* dma_supported is unconditionally true without a callback */
};
EXPORT_SYMBOL_GPL(s390_pci_dma_ops);
-
-static int __init s390_iommu_setup(char *str)
-{
- if (!strncmp(str, "strict", 6))
- s390_iommu_strict = 1;
- return 0;
-}
-
-__setup("s390_iommu=", s390_iommu_setup);
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 19118cfdea1c335..d88dc44d60d88ea 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -77,6 +77,7 @@ config IOMMU_DEBUGFS
choice
prompt "IOMMU dma mode"
depends on IOMMU_API
+ default IOMMU_DMA_MODE_LAZY if S390_IOMMU
default IOMMU_DMA_MODE_STRICT
help
IOMMU dma mode, such as: passthrough, lazy, strict.
--
1.8.3
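
For readers unfamiliar with the s390 code, the difference between the two
modes in dma_free_address() and dma_alloc_address() can be modelled with two
bitmaps. This is a heavily simplified userspace sketch under stated
assumptions: a single 64-slot address space, one slot per allocation, and a
flush counter in place of zpci_refresh_trans()/zpci_refresh_global(). The
struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct iova_space {
	uint64_t in_use;	/* like iommu_bitmap: slot cannot be handed out */
	uint64_t lazy;		/* like lazy_bitmap: freed, flush still pending */
	bool lazy_mode;		/* IOMMU_DMA_MODE_IS_LAZY() in the patch */
	unsigned int flushes;	/* number of simulated IOTLB flushes */
};

/* Return the lowest clear bit in map, or -1 if all 64 bits are set. */
static int find_free(uint64_t map)
{
	int i;

	for (i = 0; i < 64; i++)
		if (!(map & (UINT64_C(1) << i)))
			return i;
	return -1;
}

int iova_alloc(struct iova_space *s)
{
	int slot = find_free(s->in_use);

	if (slot < 0 && s->lazy_mode) {
		/* Global flush before DMA addresses are reused. */
		s->flushes++;
		s->in_use &= ~s->lazy;
		s->lazy = 0;
		slot = find_free(s->in_use);
	}
	if (slot < 0)
		return -1;
	s->in_use |= UINT64_C(1) << slot;
	return slot;
}

void iova_free(struct iova_space *s, int slot)
{
	uint64_t bit;

	assert(slot >= 0 && slot < 64);
	bit = UINT64_C(1) << slot;
	if (s->lazy_mode) {
		s->lazy |= bit;		/* defer: slot stays unavailable */
	} else {
		s->flushes++;		/* strict: flush on every unmap */
		s->in_use &= ~bit;
	}
}
```

In lazy mode a freed slot is not reusable until the next global flush, which
is the trade-off the removed s390_iommu= documentation describes as "lazy
flushing before reuse, which is faster".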


2019-04-07 12:49:38

by Zhen Lei

Subject: [PATCH v4 5/6] powernv/iommu: use common boot option iommu.dma_mode

iommu=nobypass can be replaced with iommu.dma_mode=strict.

Signed-off-by: Zhen Lei <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 2 --
arch/powerpc/platforms/powernv/pci-ioda.c | 23 +----------------------
drivers/iommu/Kconfig | 1 +
3 files changed, 2 insertions(+), 24 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a2df11945b33fc9..f88a8bff3c0caa0 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1789,8 +1789,6 @@
soft
pt [x86]
nopt [x86]
- nobypass [PPC/POWERNV]
- Disable IOMMU bypass, using IOMMU for PCI devices.


iommu.dma_mode= [ARM64] Configure default dma mode. if unset, use the
diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
index 3ead4c237ed0ec9..be0234c170316bc 100644
--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -85,29 +85,8 @@ void pe_level_printk(const struct pnv_ioda_pe *pe, const char *level,
va_end(args);
}

-static bool pnv_iommu_bypass_disabled __read_mostly;
static bool pci_reset_phbs __read_mostly;

-static int __init iommu_setup(char *str)
-{
- if (!str)
- return -EINVAL;
-
- while (*str) {
- if (!strncmp(str, "nobypass", 8)) {
- pnv_iommu_bypass_disabled = true;
- pr_info("PowerNV: IOMMU bypass window disabled.\n");
- break;
- }
- str += strcspn(str, ",");
- if (*str == ',')
- str++;
- }
-
- return 0;
-}
-early_param("iommu", iommu_setup);
-
static int __init pci_reset_phbs_setup(char *str)
{
pci_reset_phbs = true;
@@ -2456,7 +2435,7 @@ static long pnv_pci_ioda2_setup_default_config(struct pnv_ioda_pe *pe)
return rc;
}

- if (!pnv_iommu_bypass_disabled)
+ if (IOMMU_DMA_MODE_IS_PASSTHROUGH())
pnv_pci_ioda2_set_bypass(pe, true);

return 0;
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d88dc44d60d88ea..b053eeaa82ebca8 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -77,6 +77,7 @@ config IOMMU_DEBUGFS
choice
prompt "IOMMU dma mode"
depends on IOMMU_API
+ default IOMMU_DMA_MODE_PASSTHROUGH if (PPC_POWERNV && PCI)
default IOMMU_DMA_MODE_LAZY if S390_IOMMU
default IOMMU_DMA_MODE_STRICT
help
--
1.8.3


2019-04-07 12:49:38

by Zhen Lei

Subject: [PATCH v4 6/6] x86/iommu: use common boot option iommu.dma_mode

iommu=pt can be replaced with iommu.dma_mode=passthrough.
iommu=nopt can be replaced with iommu.dma_mode=lazy.
intel_iommu=strict can be replaced with iommu.dma_mode=strict.
amd_iommu=fullflush can be replaced with iommu.dma_mode=strict.

Note: intel_iommu_strict is not deleted because it can also be assigned
in quirk_calpella_no_shadow_gtt().

Signed-off-by: Zhen Lei <[email protected]>
---
arch/ia64/include/asm/iommu.h | 2 --
arch/ia64/kernel/pci-dma.c | 2 --
arch/x86/include/asm/iommu.h | 1 -
arch/x86/kernel/pci-dma.c | 20 --------------------
drivers/iommu/Kconfig | 14 ++++++--------
drivers/iommu/amd_iommu.c | 12 ++++++------
drivers/iommu/amd_iommu_init.c | 4 ----
drivers/iommu/amd_iommu_types.h | 6 ------
drivers/iommu/intel-iommu.c | 7 ++-----
9 files changed, 14 insertions(+), 54 deletions(-)

diff --git a/arch/ia64/include/asm/iommu.h b/arch/ia64/include/asm/iommu.h
index 7429a72f3f92199..92aceef63710861 100644
--- a/arch/ia64/include/asm/iommu.h
+++ b/arch/ia64/include/asm/iommu.h
@@ -8,10 +8,8 @@
extern void no_iommu_init(void);
#ifdef CONFIG_INTEL_IOMMU
extern int force_iommu, no_iommu;
-extern int iommu_pass_through;
extern int iommu_detected;
#else
-#define iommu_pass_through (0)
#define no_iommu (1)
#define iommu_detected (0)
#endif
diff --git a/arch/ia64/kernel/pci-dma.c b/arch/ia64/kernel/pci-dma.c
index fe988c49f01ce6a..f5d49cd3fbb01a9 100644
--- a/arch/ia64/kernel/pci-dma.c
+++ b/arch/ia64/kernel/pci-dma.c
@@ -22,8 +22,6 @@
int force_iommu __read_mostly;
#endif

-int iommu_pass_through;
-
static int __init pci_iommu_init(void)
{
if (iommu_detected)
diff --git a/arch/x86/include/asm/iommu.h b/arch/x86/include/asm/iommu.h
index baedab8ac5385f7..b91623d521d9f0f 100644
--- a/arch/x86/include/asm/iommu.h
+++ b/arch/x86/include/asm/iommu.h
@@ -4,7 +4,6 @@

extern int force_iommu, no_iommu;
extern int iommu_detected;
-extern int iommu_pass_through;

/* 10 seconds */
#define DMAR_OPERATION_TIMEOUT ((cycles_t) tsc_khz*10*1000)
diff --git a/arch/x86/kernel/pci-dma.c b/arch/x86/kernel/pci-dma.c
index d460998ae828514..bd63d80597ae6d0 100644
--- a/arch/x86/kernel/pci-dma.c
+++ b/arch/x86/kernel/pci-dma.c
@@ -34,21 +34,6 @@
/* Set this to 1 if there is a HW IOMMU in the system */
int iommu_detected __read_mostly = 0;

-/*
- * This variable becomes 1 if iommu=pt is passed on the kernel command line.
- * If this variable is 1, IOMMU implementations do no DMA translation for
- * devices and allow every device to access to whole physical memory. This is
- * useful if a user wants to use an IOMMU only for KVM device assignment to
- * guests and not for driver dma translation.
- * It is also possible to disable by default in kernel config, and enable with
- * iommu=nopt at boot time.
- */
-#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
-int iommu_pass_through __read_mostly = 1;
-#else
-int iommu_pass_through __read_mostly;
-#endif
-
extern struct iommu_table_entry __iommu_table[], __iommu_table_end[];

/* Dummy device used for NULL arguments (normally ISA). */
@@ -139,11 +124,6 @@ static __init int iommu_setup(char *p)
if (!strncmp(p, "soft", 4))
swiotlb = 1;
#endif
- if (!strncmp(p, "pt", 2))
- iommu_pass_through = 1;
- if (!strncmp(p, "nopt", 4))
- iommu_pass_through = 0;
-
gart_parse_options(p);

#ifdef CONFIG_CALGARY_IOMMU
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b053eeaa82ebca8..ff8c5d0d435cf58 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -78,7 +78,7 @@ choice
prompt "IOMMU dma mode"
depends on IOMMU_API
default IOMMU_DMA_MODE_PASSTHROUGH if (PPC_POWERNV && PCI)
- default IOMMU_DMA_MODE_LAZY if S390_IOMMU
+ default IOMMU_DMA_MODE_LAZY if (AMD_IOMMU || INTEL_IOMMU || S390_IOMMU)
default IOMMU_DMA_MODE_STRICT
help
IOMMU dma mode, such as: passthrough, lazy, strict.
@@ -87,9 +87,8 @@ config IOMMU_DMA_MODE_PASSTHROUGH
bool "Configure DMA to bypass the IOMMU"
help
Enable passthrough by default, removing the need to pass in
- iommu.dma_mode=passthrough or iommu=pt through command line. If this
- is enabled, you can still disable with iommu.dma_mode={lazy|strict}
- or iommu=nopt depending on the architecture.
+ iommu.dma_mode=passthrough through command line. If this is enabled,
+ you can still disable with iommu.dma_mode={lazy|strict}.

config IOMMU_DMA_MODE_LAZY
bool "IOMMU DMA use lazy mode to flush IOTLB and free IOVA"
@@ -97,10 +96,9 @@ config IOMMU_DMA_MODE_LAZY
Support lazy mode, where for every IOMMU DMA unmap operation, the
flush operation of IOTLB and the free operation of IOVA are deferred.
They are only guaranteed to be done before the related IOVA will be
- reused. Removing the need to pass in kernel parameters through
- command line. For example, iommu.dma_mode=lazy on ARM64. If this is
- enabled, you can still disable with kernel parameters, such as
- iommu.dma_mode=strict depending on the architecture.
+ reused. Removing the need to pass in iommu.dma_mode=lazy through
+ command line. If this is enabled, you can still disable with
+ iommu.dma_mode=strict.

config IOMMU_DMA_MODE_STRICT
bool "IOMMU DMA use strict mode to flush IOTLB and free IOVA"
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index f7cdd2ab7f11f6c..361fa6d5561a0ff 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -94,7 +94,7 @@

/*
* Domain for untranslated devices - only allocated
- * if iommu=pt passed on kernel cmd line.
+ * if iommu.dma_mode=passthrough passed on kernel cmd line.
*/
const struct iommu_ops amd_iommu_ops;

@@ -448,7 +448,7 @@ static int iommu_init_device(struct device *dev)
* invalid address), we ignore the capability for the device so
* it'll be forced to go into translation mode.
*/
- if ((iommu_pass_through || !amd_iommu_force_isolation) &&
+ if ((IOMMU_DMA_MODE_IS_PASSTHROUGH() || !amd_iommu_force_isolation) &&
dev_is_pci(dev) && pci_iommuv2_capable(to_pci_dev(dev))) {
struct amd_iommu *iommu;

@@ -2274,7 +2274,7 @@ static int amd_iommu_add_device(struct device *dev)

BUG_ON(!dev_data);

- if (iommu_pass_through || dev_data->iommu_v2)
+ if (IOMMU_DMA_MODE_IS_PASSTHROUGH() || dev_data->iommu_v2)
iommu_request_dm_for_dev(dev);

/* Domains are initialized for this device - have a look what we ended up with */
@@ -2479,7 +2479,7 @@ static void __unmap_single(struct dma_ops_domain *dma_dom,
start += PAGE_SIZE;
}

- if (amd_iommu_unmap_flush) {
+ if (!IOMMU_DMA_MODE_IS_LAZY()) {
domain_flush_tlb(&dma_dom->domain);
domain_flush_complete(&dma_dom->domain);
dma_ops_free_iova(dma_dom, dma_addr, pages);
@@ -2853,10 +2853,10 @@ int __init amd_iommu_init_api(void)

int __init amd_iommu_init_dma_ops(void)
{
- swiotlb = (iommu_pass_through || sme_me_mask) ? 1 : 0;
+ swiotlb = (IOMMU_DMA_MODE_IS_PASSTHROUGH() || sme_me_mask) ? 1 : 0;
iommu_detected = 1;

- if (amd_iommu_unmap_flush)
+ if (!IOMMU_DMA_MODE_IS_LAZY())
pr_info("IO/TLB flush on unmap enabled\n");
else
pr_info("Lazy IO/TLB flushing enabled\n");
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 1b1378619fc9ec2..68ab83327997e7b 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -166,8 +166,6 @@ struct ivmd_header {
to handle */
LIST_HEAD(amd_iommu_unity_map); /* a list of required unity mappings
we find in ACPI */
-bool amd_iommu_unmap_flush; /* if true, flush on every unmap */
-
LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
system */

@@ -2856,8 +2854,6 @@ static int __init parse_amd_iommu_intr(char *str)
static int __init parse_amd_iommu_options(char *str)
{
for (; *str; ++str) {
- if (strncmp(str, "fullflush", 9) == 0)
- amd_iommu_unmap_flush = true;
if (strncmp(str, "off", 3) == 0)
amd_iommu_disabled = true;
if (strncmp(str, "force_isolation", 15) == 0)
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 87965e4d964771b..724182f158523a1 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -743,12 +743,6 @@ struct unity_map_entry {
/* allocation bitmap for domain ids */
extern unsigned long *amd_iommu_pd_alloc_bitmap;

-/*
- * If true, the addresses will be flushed on unmap time, not when
- * they are reused
- */
-extern bool amd_iommu_unmap_flush;
-
/* Smallest max PASID supported by any IOMMU in the system */
extern u32 amd_iommu_max_pasid;

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 28cb713d728ceef..500b3182d0a9280 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -451,9 +451,6 @@ static int __init intel_iommu_setup(char *str)
} else if (!strncmp(str, "forcedac", 8)) {
pr_info("Forcing DAC for PCI devices\n");
dmar_forcedac = 1;
- } else if (!strncmp(str, "strict", 6)) {
- pr_info("Disable batched IOTLB flush\n");
- intel_iommu_strict = 1;
} else if (!strncmp(str, "sp_off", 6)) {
pr_info("Disable supported super page\n");
intel_iommu_superpage = 0;
@@ -3408,7 +3405,7 @@ static int __init init_dmars(void)
iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH);
}

- if (iommu_pass_through)
+ if (IOMMU_DMA_MODE_IS_PASSTHROUGH())
iommu_identity_mapping |= IDENTMAP_ALL;

#ifdef CONFIG_INTEL_IOMMU_BROKEN_GFX_WA
@@ -3749,7 +3746,7 @@ static void intel_unmap(struct device *dev, dma_addr_t dev_addr, size_t size)

freelist = domain_unmap(domain, start_pfn, last_pfn);

- if (intel_iommu_strict) {
+ if (!IOMMU_DMA_MODE_IS_LAZY() || intel_iommu_strict) {
iommu_flush_iotlb_psi(iommu, domain, start_pfn,
nrpages, !freelist, 0);
/* free iova */
--
1.8.3


2019-04-07 12:49:50

by Zhen Lei

Subject: [PATCH v4 1/6] iommu: use iommu.dma_mode to replace iommu.passthrough and iommu.strict

Currently the IOMMU dma has 3 modes: passthrough, lazy, and strict. The
passthrough mode bypasses the IOMMU, the lazy mode defers the invalidation
of hardware TLBs, and the strict mode invalidates IOMMU hardware TLBs
synchronously. The three modes are mutually exclusive, so people may be
confused by iommu.passthrough and iommu.strict, because they cannot
coexist. It is better to replace them with iommu.dma_mode.

Signed-off-by: Zhen Lei <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 33 ++++++++---------
drivers/iommu/Kconfig | 4 +--
drivers/iommu/iommu.c | 48 ++++++++++++++-----------
include/linux/iommu.h | 5 +++
4 files changed, 50 insertions(+), 40 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2b8ee90bb64470d..60409ad23b2ac8b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1792,24 +1792,21 @@
nobypass [PPC/POWERNV]
Disable IOMMU bypass, using IOMMU for PCI devices.

- iommu.strict= [ARM64] Configure TLB invalidation behaviour
- Format: { "0" | "1" }
- 0 - Lazy mode.
- Request that DMA unmap operations use deferred
- invalidation of hardware TLBs, for increased
- throughput at the cost of reduced device isolation.
- Will fall back to strict mode if not supported by
- the relevant IOMMU driver.
- 1 - Strict mode (default).
- DMA unmap operations invalidate IOMMU hardware TLBs
- synchronously.
-
- iommu.passthrough=
- [ARM64] Configure DMA to bypass the IOMMU by default.
- Format: { "0" | "1" }
- 0 - Use IOMMU translation for DMA.
- 1 - Bypass the IOMMU for DMA.
- unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
+
+ iommu.dma_mode= [ARM64] Configure default dma mode. if unset, use the
+ value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
+ passthrough
+ Configure DMA to bypass the IOMMU by default.
+ lazy
+ Request that DMA unmap operations use deferred
+ invalidation of hardware TLBs, for increased
+ throughput at the cost of reduced device isolation.
+ Will fall back to strict mode if not supported by
+ the relevant IOMMU driver.
+ strict
+ Default. DMA unmap operations invalidate IOMMU hardware
+ TLBs synchronously.
+

io7= [HW] IO7 for Marvel based alpha systems
See comment before marvel_specify_io7 in
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6f07f3b21816c64..b67fcabd668f7b6 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -79,8 +79,8 @@ config IOMMU_DEFAULT_PASSTHROUGH
depends on IOMMU_API
help
Enable passthrough by default, removing the need to pass in
- iommu.passthrough=on or iommu=pt through command line. If this
- is enabled, you can still disable with iommu.passthrough=off
+ iommu.dma_mode=passthrough or iommu=pt through command line. If this
+ is enabled, you can still disable with iommu.dma_mode={lazy|strict}
or iommu=nopt depending on the architecture.

If unsure, say N here.
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 109de67d5d727c2..e4d581e6cb8d210 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -38,12 +38,13 @@

static struct kset *iommu_group_kset;
static DEFINE_IDA(iommu_group_ida);
+
#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
-static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_IDENTITY;
+#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_PASSTHROUGH
#else
-static unsigned int iommu_def_domain_type = IOMMU_DOMAIN_DMA;
+#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_STRICT
#endif
-static bool iommu_dma_strict __read_mostly = true;
+static int iommu_default_dma_mode __read_mostly = IOMMU_DEFAULT_DMA_MODE;

struct iommu_callback_data {
const struct iommu_ops *ops;
@@ -141,25 +142,29 @@ static int __iommu_attach_group(struct iommu_domain *domain,
static void __iommu_detach_group(struct iommu_domain *domain,
struct iommu_group *group);

-static int __init iommu_set_def_domain_type(char *str)
+static int __init iommu_dma_mode_setup(char *str)
{
- bool pt;
- int ret;
+ if (!str)
+ goto fail;

- ret = kstrtobool(str, &pt);
- if (ret)
- return ret;
+ if (!strncmp(str, "passthrough", 11))
+ iommu_default_dma_mode = IOMMU_DMA_MODE_PASSTHROUGH;
+ else if (!strncmp(str, "lazy", 4))
+ iommu_default_dma_mode = IOMMU_DMA_MODE_LAZY;
+ else if (!strncmp(str, "strict", 6))
+ iommu_default_dma_mode = IOMMU_DMA_MODE_STRICT;
+ else
+ goto fail;
+
+ pr_info("Force dma mode to be %d\n", iommu_default_dma_mode);

- iommu_def_domain_type = pt ? IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;
return 0;
-}
-early_param("iommu.passthrough", iommu_set_def_domain_type);

-static int __init iommu_dma_setup(char *str)
-{
- return kstrtobool(str, &iommu_dma_strict);
+fail:
+ pr_debug("Boot option iommu.dma_mode is incorrect, ignored\n");
+ return -EINVAL;
}
-early_param("iommu.strict", iommu_dma_setup);
+early_param("iommu.dma_mode", iommu_dma_mode_setup);

static ssize_t iommu_group_attr_show(struct kobject *kobj,
struct attribute *__attr, char *buf)
@@ -1102,14 +1107,17 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
*/
if (!group->default_domain) {
struct iommu_domain *dom;
+ int def_domain_type =
+ (iommu_default_dma_mode == IOMMU_DMA_MODE_PASSTHROUGH)
+ ? IOMMU_DOMAIN_IDENTITY : IOMMU_DOMAIN_DMA;

- dom = __iommu_domain_alloc(dev->bus, iommu_def_domain_type);
- if (!dom && iommu_def_domain_type != IOMMU_DOMAIN_DMA) {
+ dom = __iommu_domain_alloc(dev->bus, def_domain_type);
+ if (!dom && def_domain_type != IOMMU_DOMAIN_DMA) {
dom = __iommu_domain_alloc(dev->bus, IOMMU_DOMAIN_DMA);
if (dom) {
dev_warn(dev,
"failed to allocate default IOMMU domain of type %u; falling back to IOMMU_DOMAIN_DMA",
- iommu_def_domain_type);
+ def_domain_type);
}
}

@@ -1117,7 +1125,7 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
if (!group->domain)
group->domain = dom;

- if (dom && !iommu_dma_strict) {
+ if (dom && (iommu_default_dma_mode == IOMMU_DMA_MODE_LAZY)) {
int attr = 1;
iommu_domain_set_attr(dom,
DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE,
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ffbbc7e39ceeba3..c3f4e3416176496 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -42,6 +42,11 @@
*/
#define IOMMU_PRIV (1 << 5)

+
+#define IOMMU_DMA_MODE_STRICT 0x0
+#define IOMMU_DMA_MODE_LAZY 0x1
+#define IOMMU_DMA_MODE_PASSTHROUGH 0x2
+
struct iommu_ops;
struct iommu_group;
struct bus_type;
--
1.8.3
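
The string matching done by iommu_dma_mode_setup() in this patch can be
exercised in userspace. The sketch below copies the comparison logic but is
otherwise an assumption: parse_dma_mode(), the dma_mode variable and the
example function are hypothetical names, and the pr_info()/pr_debug() calls
are dropped.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define IOMMU_DMA_MODE_STRICT		0x0
#define IOMMU_DMA_MODE_LAZY		0x1
#define IOMMU_DMA_MODE_PASSTHROUGH	0x2

/* Userspace stand-in for the kernel's static iommu_default_dma_mode. */
static int dma_mode = IOMMU_DMA_MODE_STRICT;

/* Returns 0 and updates dma_mode, or -EINVAL and leaves it untouched. */
int parse_dma_mode(const char *str)
{
	if (!str)
		return -EINVAL;

	if (!strncmp(str, "passthrough", 11))
		dma_mode = IOMMU_DMA_MODE_PASSTHROUGH;
	else if (!strncmp(str, "lazy", 4))
		dma_mode = IOMMU_DMA_MODE_LAZY;
	else if (!strncmp(str, "strict", 6))
		dma_mode = IOMMU_DMA_MODE_STRICT;
	else
		return -EINVAL;

	return 0;
}

/* Usage: an unknown value is rejected and the previous mode is kept. */
void parse_dma_mode_example(void)
{
	assert(parse_dma_mode("lazy") == 0);
	assert(dma_mode == IOMMU_DMA_MODE_LAZY);
	assert(parse_dma_mode("bogus") == -EINVAL);
	assert(dma_mode == IOMMU_DMA_MODE_LAZY);
}
```

One thing the sketch makes visible: because strncmp() only compares a prefix,
a value such as "strictly" would also be accepted as strict.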


2019-04-07 12:50:12

by Zhen Lei

Subject: [PATCH v4 2/6] iommu: keep dma mode build options consistent with cmdline options

First, add the build option IOMMU_DMA_MODE_LAZY, so that we have the
opportunity to set lazy mode as the default at build time. Then put the
three config options in a choice, so that people can only choose one of the
three at a time, the same as with the boot option iommu.dma_mode.

Signed-off-by: Zhen Lei <[email protected]>
---
drivers/iommu/Kconfig | 30 +++++++++++++++++++++++++++---
drivers/iommu/iommu.c | 4 +++-
2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index b67fcabd668f7b6..19118cfdea1c335 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -74,16 +74,40 @@ config IOMMU_DEBUGFS
debug/iommu directory, and then populate a subdirectory with
entries as required.

-config IOMMU_DEFAULT_PASSTHROUGH
- bool "IOMMU passthrough by default"
+choice
+ prompt "IOMMU dma mode"
depends on IOMMU_API
+ default IOMMU_DMA_MODE_STRICT
+ help
+ IOMMU dma mode, such as: passthrough, lazy, strict.
+
+config IOMMU_DMA_MODE_PASSTHROUGH
+ bool "Configure DMA to bypass the IOMMU"
help
Enable passthrough by default, removing the need to pass in
iommu.dma_mode=passthrough or iommu=pt through command line. If this
is enabled, you can still disable with iommu.dma_mode={lazy|strict}
or iommu=nopt depending on the architecture.

- If unsure, say N here.
+config IOMMU_DMA_MODE_LAZY
+ bool "IOMMU DMA use lazy mode to flush IOTLB and free IOVA"
+ help
+ Support lazy mode, where for every IOMMU DMA unmap operation, the
+ flush operation of IOTLB and the free operation of IOVA are deferred.
+ They are only guaranteed to be done before the related IOVA will be
+ reused. Removing the need to pass in kernel parameters through
+ command line. For example, iommu.dma_mode=lazy on ARM64. If this is
+ enabled, you can still disable with kernel parameters, such as
+ iommu.dma_mode=strict depending on the architecture.
+
+config IOMMU_DMA_MODE_STRICT
+ bool "IOMMU DMA use strict mode to flush IOTLB and free IOVA"
+ help
+ For every IOMMU DMA unmap operation, the flush operation of IOTLB and
+ the free operation of IOVA are guaranteed to be done in the unmap
+ function.
+
+endchoice

config OF_IOMMU
def_bool y
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index e4d581e6cb8d210..b75e23a2ea08da8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -39,8 +39,10 @@
static struct kset *iommu_group_kset;
static DEFINE_IDA(iommu_group_ida);

-#ifdef CONFIG_IOMMU_DEFAULT_PASSTHROUGH
+#if defined(CONFIG_IOMMU_DMA_MODE_PASSTHROUGH)
#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_PASSTHROUGH
+#elif defined(CONFIG_IOMMU_DMA_MODE_LAZY)
+#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_LAZY
#else
#define IOMMU_DEFAULT_DMA_MODE IOMMU_DMA_MODE_STRICT
#endif
--
1.8.3
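
The effect of the Kconfig choice on the build-time default can be shown with
the same #if chain as in the iommu.c hunk above. This is only a sketch: the
CONFIG_ symbol is defined by hand to imitate a .config where the lazy entry
was chosen, and default_dma_mode(), dma_mode_name() and the example function
are hypothetical helpers.

```c
#include <assert.h>
#include <string.h>

/* Imitate a .config in which the choice selected the lazy entry. */
#define CONFIG_IOMMU_DMA_MODE_LAZY 1

#define IOMMU_DMA_MODE_STRICT		0x0
#define IOMMU_DMA_MODE_LAZY		0x1
#define IOMMU_DMA_MODE_PASSTHROUGH	0x2

#if defined(CONFIG_IOMMU_DMA_MODE_PASSTHROUGH)
#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_PASSTHROUGH
#elif defined(CONFIG_IOMMU_DMA_MODE_LAZY)
#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_LAZY
#else
#define IOMMU_DEFAULT_DMA_MODE	IOMMU_DMA_MODE_STRICT
#endif

int default_dma_mode(void)
{
	return IOMMU_DEFAULT_DMA_MODE;
}

/* Map a mode value back to the string used by iommu.dma_mode=. */
const char *dma_mode_name(int mode)
{
	switch (mode) {
	case IOMMU_DMA_MODE_PASSTHROUGH:
		return "passthrough";
	case IOMMU_DMA_MODE_LAZY:
		return "lazy";
	case IOMMU_DMA_MODE_STRICT:
		return "strict";
	default:
		return "unknown";
	}
}

/* Usage: with the define above, the build-time default is lazy. */
void default_dma_mode_example(void)
{
	assert(strcmp(dma_mode_name(default_dma_mode()), "lazy") == 0);
}
```

Since a choice allows exactly one entry, at most one branch of the #if chain
can match, and strict remains the fallback when neither of the other two
symbols is set.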


2019-04-08 01:16:12

by Hanjun Guo

Subject: Re: [PATCH v4 0/6] normalize IOMMU dma mode boot options

Hi Zhen,

On 2019/4/7 20:41, Zhen Lei wrote:
> As Robin Murphy's suggestion:
> "It's also not necessarily obvious to the user how this interacts with
> IOMMU_DEFAULT_PASSTHROUGH, so if we really do go down this route, maybe it
> would be better to refactor the whole lot into a single selection of something
> like IOMMU_DEFAULT_MODE anyway."
>
> In this version, I tried to normalize the IOMMU dma mode boot options for all
> ARCHs. When IOMMU is enabled, there are 3 dma modes: passthrough(bypass),
> lazy(mapping but defer the IOTLB invalidation), strict. But currently each
> ARCHs defined their private boot options, different with each other. For
> example, to enable/disable "passthrough", ARM64 use iommu.passthrough=1/0,
> X86 use iommu=pt/nopt, PPC/POWERNV use iommu=nobypass.
>
>
> Zhen Lei (6):
> iommu: use iommu.dma_mode to replace iommu.passthrough and
> iommu.strict
> iommu: keep dma mode build options consistent with cmdline options
> iommu: add iommu_default_dma_mode_get() helper
> s390/pci: use common boot option iommu.dma_mode
> powernv/iommu: use common boot option iommu.dma_mode
> x86/iommu: use common boot option iommu.dma_mode

This will break systems that rely on the current boot options, and I think
this is unacceptable. If you want to do so, just introduce iommu.dma_mode
on top of the existing iommu boot options, leaving those dma mode boot options
unchanged, so that iommu.dma_mode covers all archs but stays compatible with
them.

Thanks
Hanjun

2019-04-08 03:17:59

by Zhen Lei

Subject: Re: [PATCH v4 0/6] normalize IOMMU dma mode boot options



On 2019/4/8 9:14, Hanjun Guo wrote:
> Hi Zhen,
>
> On 2019/4/7 20:41, Zhen Lei wrote:
>> As Robin Murphy's suggestion:
>> "It's also not necessarily obvious to the user how this interacts with
>> IOMMU_DEFAULT_PASSTHROUGH, so if we really do go down this route, maybe it
>> would be better to refactor the whole lot into a single selection of something
>> like IOMMU_DEFAULT_MODE anyway."
>>
>> In this version, I tried to normalize the IOMMU dma mode boot options for all
>> ARCHs. When IOMMU is enabled, there are 3 dma modes: passthrough(bypass),
>> lazy(mapping but defer the IOTLB invalidation), strict. But currently each
>> ARCHs defined their private boot options, different with each other. For
>> example, to enable/disable "passthrough", ARM64 use iommu.passthrough=1/0,
>> X86 use iommu=pt/nopt, PPC/POWERNV use iommu=nobypass.
>>
>>
>> Zhen Lei (6):
>> iommu: use iommu.dma_mode to replace iommu.passthrough and
>> iommu.strict
>> iommu: keep dma mode build options consistent with cmdline options
>> iommu: add iommu_default_dma_mode_get() helper
>> s390/pci: use common boot option iommu.dma_mode
>> powernv/iommu: use common boot option iommu.dma_mode
>> x86/iommu: use common boot option iommu.dma_mode
>
> This will break systems using boot options as now, and I think
> this is unacceptable. If you want to do so, just introduce iommu.dma_mode
> on top of those iommu boot options with dma mode boot options unchanged,
> and iommu.dma_mode is for all archs but compatible with them.

I only changed the names of the boot options; their function is unchanged. I added
all related maintainers/supporters to the "to=" list, so maybe we can discuss this.
Should I add some "obsolete" warnings for the old options and keep them for a while?
But I think this kind of thing is best done in one go.

>
> Thanks
> Hanjun
>
>
> .
>

--
Thanks!
BestRegards

2019-04-08 06:34:38

by Thomas Gleixner

Subject: Re: [PATCH v4 0/6] normalize IOMMU dma mode boot options

On Mon, 8 Apr 2019, Leizhen (ThunderTown) wrote:
> >
> > This will break systems using boot options as now, and I think
> > this is unacceptable. If you want to do so, just introduce iommu.dma_mode
> > on top of those iommu boot options with dma mode boot options unchanged,
> > and iommu.dma_mode is for all archs but compatible with them.
>
> I just changed the boot options name, but keep the function no change. I added
> all related maintainers/supporters in the "to=" list, maybe we can discuss this.

Changing the name _IS_ the problem. Think about unattended updates.

> Should I add some "obsoleted" warnings for old options and keep them for a while?

No, just keep the old options around for backwards compatibility's sake. We
just do not add new arch specific options in the future. New options need
to use the generic iommu.dma_mode name space.

Thanks,

tglx

2019-04-08 09:53:35

by Zhen Lei

Subject: Re: [PATCH v4 0/6] normalize IOMMU dma mode boot options



On 2019/4/8 14:32, Thomas Gleixner wrote:
> On Mon, 8 Apr 2019, Leizhen (ThunderTown) wrote:
>>>
>>> This will break systems using boot options as now, and I think
>>> this is unacceptable. If you want to do so, just introduce iommu.dma_mode
>>> on top of those iommu boot options with dma mode boot options unchanged,
>>> and iommu.dma_mode is for all archs but compatible with them.
>>
>> I just changed the boot options name, but keep the function no change. I added
>> all related maintainers/supporters in the "to=" list, maybe we can discuss this.
>
> Changing the name _IS_ the problem. Think about unattended updates.
>
>> Should I add some "obsoleted" warnings for old options and keep them for a while?
>
> No, just keep the old options around for backwards compatibility's sake. We
> just do not add new arch specific options in the future. New options need
> to use the generic iommu.dma_mode name space.

OK, thanks for your advice.


>
> Thanks,
>
> tglx
>
> .
>

--
Thanks!
BestRegards