Hi all,
As part of the work to enable a "Generic Kernel Image" across multiple
Android devices, there is a need to seperate shared, core kernel code
from modular driver code that may not be needed by all SoCs. This means
building IOMMU drivers as modules.
It turns out that most of the groundwork has already been done to enable
the ARM SMMU drivers to be 'tristate' options in drivers/iommu/Kconfig;
with a few symbols exported from the IOMMU/PCI core, everything builds
nicely out of the box. The one exception is support for the legacy SMMU
DT binding, which is not in widespread use and has never worked with
modules, so we can simply remove that when building as a module rather
than try to paper over it with even more hacks.
Obviously you need to be careful about using IOMMU drivers as modules,
since late loading of the driver for an IOMMU serving active DMA masters
is going to end badly in many cases. On Android, we're using device links
to ensure that the IOMMU probes first.
Comments welcome,
Will
Cc: Robin Murphy <[email protected]>
Cc: Joerg Roedel <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Lorenzo Pieralisi <[email protected]>
--->8
Will Deacon (7):
drivers/iommu: Export core IOMMU API symbols to permit modular drivers
iommu/of: Request ACS from the PCI core when configuring IOMMU linkage
PCI: Export pci_ats_disabled() as a GPL symbol to modules
Revert "iommu/arm-smmu: Make arm-smmu-v3 explicitly non-modular"
iommu/arm-smmu-v3: Allow building as a module
Revert "iommu/arm-smmu: Make arm-smmu explicitly non-modular"
iommu/arm-smmu: Allow building as a module
drivers/iommu/Kconfig | 16 ++++++-
drivers/iommu/arm-smmu-impl.c | 6 +++
drivers/iommu/arm-smmu-v3.c | 26 +++++++----
drivers/iommu/arm-smmu.c | 86 +++++++++++++++++++++--------------
drivers/iommu/iommu-sysfs.c | 5 ++
drivers/iommu/iommu.c | 8 ++++
drivers/iommu/of_iommu.c | 1 +
drivers/pci/pci.c | 1 +
8 files changed, 102 insertions(+), 47 deletions(-)
--
2.24.0.rc0.303.g954a862665-goog
To avoid having to export 'pci_request_acs()' to modular IOMMU drivers,
move the call into the 'of_dma_configure()' path in a similar manner to
the way in which ACS is configured when probing via ACPI/IORT.
Signed-off-by: Will Deacon <[email protected]>
---
drivers/iommu/of_iommu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 614a93aa5305..78faa9f73a91 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -177,6 +177,7 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
.np = master_np,
};
+ pci_request_acs();
err = pci_for_each_dma_alias(to_pci_dev(dev),
of_pci_iommu_init, &info);
} else if (dev_is_fsl_mc(dev)) {
--
2.24.0.rc0.303.g954a862665-goog
This reverts commit c07b6426df922d21a13a959cf785d46e9c531941.
Let's get the SMMUv3 driver building as a module, which means putting
back some dead code that we used to carry.
Signed-off-by: Will Deacon <[email protected]>
---
drivers/iommu/arm-smmu-v3.c | 25 ++++++++++++++++---------
1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 8da93e730d6f..2ad8e2ca0583 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -21,8 +21,7 @@
#include <linux/io-pgtable.h>
#include <linux/iommu.h>
#include <linux/iopoll.h>
-#include <linux/init.h>
-#include <linux/moduleparam.h>
+#include <linux/module.h>
#include <linux/msi.h>
#include <linux/of.h>
#include <linux/of_address.h>
@@ -384,10 +383,6 @@
#define MSI_IOVA_BASE 0x8000000
#define MSI_IOVA_LENGTH 0x100000
-/*
- * not really modular, but the easiest way to keep compat with existing
- * bootargs behaviour is to continue using module_param_named here.
- */
static bool disable_bypass = 1;
module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
MODULE_PARM_DESC(disable_bypass,
@@ -3683,25 +3678,37 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
return 0;
}
-static void arm_smmu_device_shutdown(struct platform_device *pdev)
+static int arm_smmu_device_remove(struct platform_device *pdev)
{
struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
arm_smmu_device_disable(smmu);
+
+ return 0;
+}
+
+static void arm_smmu_device_shutdown(struct platform_device *pdev)
+{
+ arm_smmu_device_remove(pdev);
}
static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,smmu-v3", },
{ },
};
+MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
static struct platform_driver arm_smmu_driver = {
.driver = {
.name = "arm-smmu-v3",
.of_match_table = of_match_ptr(arm_smmu_of_match),
- .suppress_bind_attrs = true,
},
.probe = arm_smmu_device_probe,
+ .remove = arm_smmu_device_remove,
.shutdown = arm_smmu_device_shutdown,
};
-builtin_platform_driver(arm_smmu_driver);
+module_platform_driver(arm_smmu_driver);
+
+MODULE_DESCRIPTION("IOMMU API for ARM architected SMMUv3 implementations");
+MODULE_AUTHOR("Will Deacon <[email protected]>");
+MODULE_LICENSE("GPL v2");
--
2.24.0.rc0.303.g954a862665-goog
By conditionally dropping support for the legacy binding and exporting
the newly introduced 'arm_smmu_impl_init()' function we can allow the
ARM SMMU driver to be built as a module.
Signed-off-by: Will Deacon <[email protected]>
---
drivers/iommu/Kconfig | 14 ++++++++-
drivers/iommu/arm-smmu-impl.c | 6 ++++
drivers/iommu/arm-smmu.c | 54 +++++++++++++++++++++--------------
3 files changed, 51 insertions(+), 23 deletions(-)
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 7583d47fc4d5..02703f51e533 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -350,7 +350,7 @@ config SPAPR_TCE_IOMMU
# ARM IOMMU support
config ARM_SMMU
- bool "ARM Ltd. System MMU (SMMU) Support"
+ tristate "ARM Ltd. System MMU (SMMU) Support"
depends on (ARM64 || ARM) && MMU
select IOMMU_API
select IOMMU_IO_PGTABLE_LPAE
@@ -362,6 +362,18 @@ config ARM_SMMU
Say Y here if your SoC includes an IOMMU device implementing
the ARM SMMU architecture.
+config ARM_SMMU_LEGACY_DT_BINDINGS
+ bool "Support the legacy \"mmu-masters\" devicetree bindings"
+ depends on ARM_SMMU=y && OF
+ help
+ Support for the badly designed and deprecated \"mmu-masters\"
+ devicetree bindings. This allows some DMA masters to attach
+ to the SMMU but does not provide any support via the DMA API.
+ If you're lucky, you might be able to get VFIO up and running.
+
+ If you say Y here then you'll make me very sad. Instead, say N
+ and move your firmware to the utopian future that was 2016.
+
config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT
bool "Default to disabling bypass on ARM SMMU v1 and v2"
depends on ARM_SMMU
diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
index 5c87a38620c4..2f82d40317d6 100644
--- a/drivers/iommu/arm-smmu-impl.c
+++ b/drivers/iommu/arm-smmu-impl.c
@@ -5,6 +5,7 @@
#define pr_fmt(fmt) "arm-smmu: " fmt
#include <linux/bitfield.h>
+#include <linux/module.h>
#include <linux/of.h>
#include "arm-smmu.h"
@@ -172,3 +173,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
return smmu;
}
+EXPORT_SYMBOL_GPL(arm_smmu_impl_init);
+
+MODULE_DESCRIPTION("IOMMU quirks for ARM architected SMMU implementations");
+MODULE_AUTHOR("Robin Murphy <[email protected]>");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 53bbe0663b9e..9ef14830546d 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -125,6 +125,12 @@ static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
return container_of(dom, struct arm_smmu_domain, domain);
}
+static struct platform_driver arm_smmu_driver;
+static struct iommu_ops arm_smmu_ops;
+
+#ifdef CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS
+static void arm_smmu_bus_init(void);
+
static struct device_node *dev_get_dev_node(struct device *dev)
{
if (dev_is_pci(dev)) {
@@ -160,9 +166,6 @@ static int __find_legacy_master_phandle(struct device *dev, void *data)
return err == -ENOENT ? 0 : err;
}
-static struct platform_driver arm_smmu_driver;
-static struct iommu_ops arm_smmu_ops;
-
static int arm_smmu_register_legacy_master(struct device *dev,
struct arm_smmu_device **smmu)
{
@@ -214,6 +217,27 @@ static int arm_smmu_register_legacy_master(struct device *dev,
return err;
}
+/*
+ * With the legacy DT binding in play, we have no guarantees about
+ * probe order, but then we're also not doing default domains, so we can
+ * delay setting bus ops until we're sure every possible SMMU is ready,
+ * and that way ensure that no add_device() calls get missed.
+ */
+static int arm_smmu_legacy_bus_init(void)
+{
+ if (using_legacy_binding)
+ arm_smmu_bus_init();
+ return 0;
+}
+device_initcall_sync(arm_smmu_legacy_bus_init);
+#else
+static int arm_smmu_register_legacy_master(struct device *dev,
+ struct arm_smmu_device **smmu)
+{
+ return -ENODEV;
+}
+#endif /* CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS */
+
static int __arm_smmu_alloc_bitmap(unsigned long *map, int start, int end)
{
int idx;
@@ -1960,8 +1984,10 @@ static int arm_smmu_device_dt_probe(struct platform_device *pdev,
legacy_binding = of_find_property(dev->of_node, "mmu-masters", NULL);
if (legacy_binding && !using_generic_binding) {
- if (!using_legacy_binding)
- pr_notice("deprecated \"mmu-masters\" DT property in use; DMA API support unavailable\n");
+ if (!using_legacy_binding) {
+ pr_notice("deprecated \"mmu-masters\" DT property in use; %s support unavailable\n",
+ IS_ENABLED(CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS) ? "DMA API" : "SMMU");
+ }
using_legacy_binding = true;
} else if (!legacy_binding && !using_legacy_binding) {
using_generic_binding = true;
@@ -1986,10 +2012,8 @@ static void arm_smmu_bus_init(void)
bus_set_iommu(&amba_bustype, &arm_smmu_ops);
#endif
#ifdef CONFIG_PCI
- if (!iommu_present(&pci_bus_type)) {
- pci_request_acs();
+ if (!iommu_present(&pci_bus_type))
bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
- }
#endif
#ifdef CONFIG_FSL_MC_BUS
if (!iommu_present(&fsl_mc_bus_type))
@@ -2147,20 +2171,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
return 0;
}
-/*
- * With the legacy DT binding in play, though, we have no guarantees about
- * probe order, but then we're also not doing default domains, so we can
- * delay setting bus ops until we're sure every possible SMMU is ready,
- * and that way ensure that no add_device() calls get missed.
- */
-static int arm_smmu_legacy_bus_init(void)
-{
- if (using_legacy_binding)
- arm_smmu_bus_init();
- return 0;
-}
-device_initcall_sync(arm_smmu_legacy_bus_init);
-
static int arm_smmu_device_remove(struct platform_device *pdev)
{
struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
--
2.24.0.rc0.303.g954a862665-goog
This reverts commit addb672f200f4e99368270da205320b83efe01a0.
Let's get the SMMU driver building as a module, which means putting
back some dead code that we used to carry.
Signed-off-by: Will Deacon <[email protected]>
---
drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++-------------
1 file changed, 19 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 7c503a6bc585..53bbe0663b9e 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -27,8 +27,7 @@
#include <linux/interrupt.h>
#include <linux/io.h>
#include <linux/iopoll.h>
-#include <linux/init.h>
-#include <linux/moduleparam.h>
+#include <linux/module.h>
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
@@ -59,10 +58,6 @@
#define MSI_IOVA_LENGTH 0x100000
static int force_stage;
-/*
- * not really modular, but the easiest way to keep compat with existing
- * bootargs behaviour is to continue using module_param() here.
- */
module_param(force_stage, int, S_IRUGO);
MODULE_PARM_DESC(force_stage,
"Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation.");
@@ -1878,6 +1873,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
{ },
};
+MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
#ifdef CONFIG_ACPI
static int acpi_smmu_get_data(u32 model, struct arm_smmu_device *smmu)
@@ -2165,12 +2161,12 @@ static int arm_smmu_legacy_bus_init(void)
}
device_initcall_sync(arm_smmu_legacy_bus_init);
-static void arm_smmu_device_shutdown(struct platform_device *pdev)
+static int arm_smmu_device_remove(struct platform_device *pdev)
{
struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
if (!smmu)
- return;
+ return -ENODEV;
if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
dev_err(&pdev->dev, "removing device with active domains!\n");
@@ -2186,6 +2182,12 @@ static void arm_smmu_device_shutdown(struct platform_device *pdev)
clk_bulk_disable(smmu->num_clks, smmu->clks);
clk_bulk_unprepare(smmu->num_clks, smmu->clks);
+ return 0;
+}
+
+static void arm_smmu_device_shutdown(struct platform_device *pdev)
+{
+ arm_smmu_device_remove(pdev);
}
static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
@@ -2235,12 +2237,16 @@ static const struct dev_pm_ops arm_smmu_pm_ops = {
static struct platform_driver arm_smmu_driver = {
.driver = {
- .name = "arm-smmu",
- .of_match_table = of_match_ptr(arm_smmu_of_match),
- .pm = &arm_smmu_pm_ops,
- .suppress_bind_attrs = true,
+ .name = "arm-smmu",
+ .of_match_table = of_match_ptr(arm_smmu_of_match),
+ .pm = &arm_smmu_pm_ops,
},
.probe = arm_smmu_device_probe,
+ .remove = arm_smmu_device_remove,
.shutdown = arm_smmu_device_shutdown,
};
-builtin_platform_driver(arm_smmu_driver);
+module_platform_driver(arm_smmu_driver);
+
+MODULE_DESCRIPTION("IOMMU API for ARM architected SMMU implementations");
+MODULE_AUTHOR("Will Deacon <[email protected]>");
+MODULE_LICENSE("GPL v2");
--
2.24.0.rc0.303.g954a862665-goog
By removing the redundant call to 'pci_request_acs()' we can allow the
ARM SMMUv3 driver to be built as a module.
Signed-off-by: Will Deacon <[email protected]>
---
drivers/iommu/Kconfig | 2 +-
drivers/iommu/arm-smmu-v3.c | 1 -
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index e3842eabcfdd..7583d47fc4d5 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -388,7 +388,7 @@ config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT
config.
config ARM_SMMU_V3
- bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
+ tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
depends on ARM64
select IOMMU_API
select IOMMU_IO_PGTABLE_LPAE
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 2ad8e2ca0583..56ce4ba2fcbe 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -3657,7 +3657,6 @@ static int arm_smmu_device_probe(struct platform_device *pdev)
#ifdef CONFIG_PCI
if (pci_bus_type.iommu_ops != &arm_smmu_ops) {
- pci_request_acs();
ret = bus_set_iommu(&pci_bus_type, &arm_smmu_ops);
if (ret)
return ret;
--
2.24.0.rc0.303.g954a862665-goog
Building IOMMU drivers as modules requires that the core IOMMU API
symbols are exported as GPL symbols.
Signed-off-by: Will Deacon <[email protected]>
---
drivers/iommu/iommu-sysfs.c | 5 +++++
drivers/iommu/iommu.c | 8 ++++++++
2 files changed, 13 insertions(+)
diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
index e436ff813e7e..99869217fbec 100644
--- a/drivers/iommu/iommu-sysfs.c
+++ b/drivers/iommu/iommu-sysfs.c
@@ -87,6 +87,7 @@ int iommu_device_sysfs_add(struct iommu_device *iommu,
put_device(iommu->dev);
return ret;
}
+EXPORT_SYMBOL_GPL(iommu_device_sysfs_add);
void iommu_device_sysfs_remove(struct iommu_device *iommu)
{
@@ -94,6 +95,8 @@ void iommu_device_sysfs_remove(struct iommu_device *iommu)
device_unregister(iommu->dev);
iommu->dev = NULL;
}
+EXPORT_SYMBOL_GPL(iommu_device_sysfs_remove);
+
/*
* IOMMU drivers can indicate a device is managed by a given IOMMU using
* this interface. A link to the device will be created in the "devices"
@@ -119,6 +122,7 @@ int iommu_device_link(struct iommu_device *iommu, struct device *link)
return ret;
}
+EXPORT_SYMBOL_GPL(iommu_device_link);
void iommu_device_unlink(struct iommu_device *iommu, struct device *link)
{
@@ -128,3 +132,4 @@ void iommu_device_unlink(struct iommu_device *iommu, struct device *link)
sysfs_remove_link(&link->kobj, "iommu");
sysfs_remove_link_from_group(&iommu->dev->kobj, "devices", dev_name(link));
}
+EXPORT_SYMBOL_GPL(iommu_device_unlink);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d658c7c6a2ab..c1aadb570145 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -141,6 +141,7 @@ int iommu_device_register(struct iommu_device *iommu)
spin_unlock(&iommu_device_lock);
return 0;
}
+EXPORT_SYMBOL_GPL(iommu_device_register);
void iommu_device_unregister(struct iommu_device *iommu)
{
@@ -148,6 +149,7 @@ void iommu_device_unregister(struct iommu_device *iommu)
list_del(&iommu->list);
spin_unlock(&iommu_device_lock);
}
+EXPORT_SYMBOL_GPL(iommu_device_unregister);
static struct iommu_param *iommu_get_dev_param(struct device *dev)
{
@@ -886,6 +888,7 @@ struct iommu_group *iommu_group_ref_get(struct iommu_group *group)
kobject_get(group->devices_kobj);
return group;
}
+EXPORT_SYMBOL_GPL(iommu_group_ref_get);
/**
* iommu_group_put - Decrement group reference
@@ -1259,6 +1262,7 @@ struct iommu_group *generic_device_group(struct device *dev)
{
return iommu_group_alloc();
}
+EXPORT_SYMBOL_GPL(generic_device_group);
/*
* Use standard PCI bus topology, isolation features, and DMA alias quirks
@@ -1326,6 +1330,7 @@ struct iommu_group *pci_device_group(struct device *dev)
/* No shared group found, allocate new */
return iommu_group_alloc();
}
+EXPORT_SYMBOL_GPL(pci_device_group);
/* Get the IOMMU group for device on fsl-mc bus */
struct iommu_group *fsl_mc_device_group(struct device *dev)
@@ -1338,6 +1343,7 @@ struct iommu_group *fsl_mc_device_group(struct device *dev)
group = iommu_group_alloc();
return group;
}
+EXPORT_SYMBOL_GPL(fsl_mc_device_group);
/**
* iommu_group_get_for_dev - Find or create the IOMMU group for a device
@@ -1406,6 +1412,7 @@ struct iommu_group *iommu_group_get_for_dev(struct device *dev)
return group;
}
+EXPORT_SYMBOL(iommu_group_get_for_dev);
struct iommu_domain *iommu_group_default_domain(struct iommu_group *group)
{
@@ -2185,6 +2192,7 @@ struct iommu_resv_region *iommu_alloc_resv_region(phys_addr_t start,
region->type = type;
return region;
}
+EXPORT_SYMBOL_GPL(iommu_alloc_resv_region);
static int
request_default_domain_for_dev(struct device *dev, unsigned long type)
--
2.24.0.rc0.303.g954a862665-goog
Building drivers for ATS-aware IOMMUs as modules requires access to
pci_ats_disabled(). Export it as a GPL symbol to get things working.
Signed-off-by: Will Deacon <[email protected]>
---
drivers/pci/pci.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a97e2571a527..4fbe5b576dd8 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -123,6 +123,7 @@ bool pci_ats_disabled(void)
{
return pcie_ats_disabled;
}
+EXPORT_SYMBOL_GPL(pci_ats_disabled);
/* Disable bridge_d3 for all PCIe ports */
static bool pci_bridge_d3_disable;
--
2.24.0.rc0.303.g954a862665-goog
On Wed, Oct 30, 2019 at 02:51:12PM +0000, Will Deacon wrote:
> By conditionally dropping support for the legacy binding and exporting
> the newly introduced 'arm_smmu_impl_init()' function we can allow the
> ARM SMMU driver to be built as a module.
>
> Signed-off-by: Will Deacon <[email protected]>
> ---
> drivers/iommu/Kconfig | 14 ++++++++-
> drivers/iommu/arm-smmu-impl.c | 6 ++++
> drivers/iommu/arm-smmu.c | 54 +++++++++++++++++++++--------------
> 3 files changed, 51 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 7583d47fc4d5..02703f51e533 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -350,7 +350,7 @@ config SPAPR_TCE_IOMMU
>
> # ARM IOMMU support
> config ARM_SMMU
> - bool "ARM Ltd. System MMU (SMMU) Support"
> + tristate "ARM Ltd. System MMU (SMMU) Support"
> depends on (ARM64 || ARM) && MMU
> select IOMMU_API
> select IOMMU_IO_PGTABLE_LPAE
> @@ -362,6 +362,18 @@ config ARM_SMMU
> Say Y here if your SoC includes an IOMMU device implementing
> the ARM SMMU architecture.
>
> +config ARM_SMMU_LEGACY_DT_BINDINGS
> + bool "Support the legacy \"mmu-masters\" devicetree bindings"
Can't we just remove this now? The only user is Seattle. Is anyone still
using Seattle AND DT? There's been no real dts change since Feb '16.
There's a bit of clean-up needed in the Seattle dts files, so I'd like
to remove them if there's not users.
If there are users, can't we just make them move to the new binding?
Yes compatibility, but that really depends on the users caring.
I though Calxeda was using this too, but I guess we didn't get that
finished. We should probably remove that secure mode flag as well.
Rob
Hi Rob,
On Wed, Oct 30, 2019 at 10:22:12AM -0500, Rob Herring wrote:
> On Wed, Oct 30, 2019 at 02:51:12PM +0000, Will Deacon wrote:
> > By conditionally dropping support for the legacy binding and exporting
> > the newly introduced 'arm_smmu_impl_init()' function we can allow the
> > ARM SMMU driver to be built as a module.
> >
> > Signed-off-by: Will Deacon <[email protected]>
> > ---
> > drivers/iommu/Kconfig | 14 ++++++++-
> > drivers/iommu/arm-smmu-impl.c | 6 ++++
> > drivers/iommu/arm-smmu.c | 54 +++++++++++++++++++++--------------
> > 3 files changed, 51 insertions(+), 23 deletions(-)
> >
> > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> > index 7583d47fc4d5..02703f51e533 100644
> > --- a/drivers/iommu/Kconfig
> > +++ b/drivers/iommu/Kconfig
> > @@ -350,7 +350,7 @@ config SPAPR_TCE_IOMMU
> >
> > # ARM IOMMU support
> > config ARM_SMMU
> > - bool "ARM Ltd. System MMU (SMMU) Support"
> > + tristate "ARM Ltd. System MMU (SMMU) Support"
> > depends on (ARM64 || ARM) && MMU
> > select IOMMU_API
> > select IOMMU_IO_PGTABLE_LPAE
> > @@ -362,6 +362,18 @@ config ARM_SMMU
> > Say Y here if your SoC includes an IOMMU device implementing
> > the ARM SMMU architecture.
> >
> > +config ARM_SMMU_LEGACY_DT_BINDINGS
> > + bool "Support the legacy \"mmu-masters\" devicetree bindings"
>
> Can't we just remove this now? The only user is Seattle. Is anyone still
> using Seattle AND DT? There's been no real dts change since Feb '16.
> There's a bit of clean-up needed in the Seattle dts files, so I'd like
> to remove them if there's not users.
>
> If there are users, can't we just make them move to the new binding?
> Yes compatibility, but that really depends on the users caring.
>
> I though Calxeda was using this too, but I guess we didn't get that
> finished. We should probably remove that secure mode flag as well.
There was a recent mail from somebody using TX1 with this binding, although
it didn't actually appear to be working (but I'm not sure how much of that
is the bindings fault):
http://lists.infradead.org/pipermail/linux-arm-kernel/2019-October/683992.html
However, I agree with you, which is why the new Kconfig option actually
disables the legacy bindings by default in the hope that we can remove it
in a few releases time, with an easy "get things sorta working" option in
the meantime.
Will
On 30/10/2019 15:22, Rob Herring wrote:
> On Wed, Oct 30, 2019 at 02:51:12PM +0000, Will Deacon wrote:
>> By conditionally dropping support for the legacy binding and exporting
>> the newly introduced 'arm_smmu_impl_init()' function we can allow the
>> ARM SMMU driver to be built as a module.
>>
>> Signed-off-by: Will Deacon <[email protected]>
>> ---
>> drivers/iommu/Kconfig | 14 ++++++++-
>> drivers/iommu/arm-smmu-impl.c | 6 ++++
>> drivers/iommu/arm-smmu.c | 54 +++++++++++++++++++++--------------
>> 3 files changed, 51 insertions(+), 23 deletions(-)
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index 7583d47fc4d5..02703f51e533 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -350,7 +350,7 @@ config SPAPR_TCE_IOMMU
>>
>> # ARM IOMMU support
>> config ARM_SMMU
>> - bool "ARM Ltd. System MMU (SMMU) Support"
>> + tristate "ARM Ltd. System MMU (SMMU) Support"
>> depends on (ARM64 || ARM) && MMU
>> select IOMMU_API
>> select IOMMU_IO_PGTABLE_LPAE
>> @@ -362,6 +362,18 @@ config ARM_SMMU
>> Say Y here if your SoC includes an IOMMU device implementing
>> the ARM SMMU architecture.
>>
>> +config ARM_SMMU_LEGACY_DT_BINDINGS
>> + bool "Support the legacy \"mmu-masters\" devicetree bindings"
>
> Can't we just remove this now? The only user is Seattle. Is anyone still
> using Seattle AND DT? There's been no real dts change since Feb '16.
> There's a bit of clean-up needed in the Seattle dts files, so I'd like
> to remove them if there's not users.
>
> If there are users, can't we just make them move to the new binding?
> Yes compatibility, but that really depends on the users caring.
Apparently it's also in the wild on Cavium ThunderX/OcteonTX machines as
well :(
> I though Calxeda was using this too, but I guess we didn't get that
> finished. We should probably remove that secure mode flag as well.
FWIW the secure quirk still comes in useful every now and then when
people prototype stuff on 32-bit VExpress, where it turns out an SMMU is
about the only thing which cares whether you're running Linux in Secure
mode or not.
Robin.
On 30/10/2019 14:51, Will Deacon wrote:
> Hi all,
>
> As part of the work to enable a "Generic Kernel Image" across multiple
> Android devices, there is a need to seperate shared, core kernel code
> from modular driver code that may not be needed by all SoCs. This means
> building IOMMU drivers as modules.
>
> It turns out that most of the groundwork has already been done to enable
> the ARM SMMU drivers to be 'tristate' options in drivers/iommu/Kconfig;
> with a few symbols exported from the IOMMU/PCI core, everything builds
> nicely out of the box. The one exception is support for the legacy SMMU
> DT binding, which is not in widespread use and has never worked with
> modules, so we can simply remove that when building as a module rather
> than try to paper over it with even more hacks.
>
> Obviously you need to be careful about using IOMMU drivers as modules,
> since late loading of the driver for an IOMMU serving active DMA masters
> is going to end badly in many cases. On Android, we're using device links
> to ensure that the IOMMU probes first.
Out of curiosity, which device links are those? Clearly not the RPM
links created by the IOMMU drivers themselves... Is this some special
Android magic, or is there actually a chance of replacing all the
of_iommu_configure() machinery with something more generic?
Robin.
>
> Comments welcome,
>
> Will
>
> Cc: Robin Murphy <[email protected]>
> Cc: Joerg Roedel <[email protected]>
> Cc: Bjorn Helgaas <[email protected]>
> Cc: Lorenzo Pieralisi <[email protected]>
>
> --->8
>
> Will Deacon (7):
> drivers/iommu: Export core IOMMU API symbols to permit modular drivers
> iommu/of: Request ACS from the PCI core when configuring IOMMU linkage
> PCI: Export pci_ats_disabled() as a GPL symbol to modules
> Revert "iommu/arm-smmu: Make arm-smmu-v3 explicitly non-modular"
> iommu/arm-smmu-v3: Allow building as a module
> Revert "iommu/arm-smmu: Make arm-smmu explicitly non-modular"
> iommu/arm-smmu: Allow building as a module
>
> drivers/iommu/Kconfig | 16 ++++++-
> drivers/iommu/arm-smmu-impl.c | 6 +++
> drivers/iommu/arm-smmu-v3.c | 26 +++++++----
> drivers/iommu/arm-smmu.c | 86 +++++++++++++++++++++--------------
> drivers/iommu/iommu-sysfs.c | 5 ++
> drivers/iommu/iommu.c | 8 ++++
> drivers/iommu/of_iommu.c | 1 +
> drivers/pci/pci.c | 1 +
> 8 files changed, 102 insertions(+), 47 deletions(-)
>
Hi Robin,
On Wed, Oct 30, 2019 at 03:35:55PM +0000, Robin Murphy wrote:
> On 30/10/2019 14:51, Will Deacon wrote:
> > As part of the work to enable a "Generic Kernel Image" across multiple
> > Android devices, there is a need to seperate shared, core kernel code
> > from modular driver code that may not be needed by all SoCs. This means
> > building IOMMU drivers as modules.
> >
> > It turns out that most of the groundwork has already been done to enable
> > the ARM SMMU drivers to be 'tristate' options in drivers/iommu/Kconfig;
> > with a few symbols exported from the IOMMU/PCI core, everything builds
> > nicely out of the box. The one exception is support for the legacy SMMU
> > DT binding, which is not in widespread use and has never worked with
> > modules, so we can simply remove that when building as a module rather
> > than try to paper over it with even more hacks.
> >
> > Obviously you need to be careful about using IOMMU drivers as modules,
> > since late loading of the driver for an IOMMU serving active DMA masters
> > is going to end badly in many cases. On Android, we're using device links
> > to ensure that the IOMMU probes first.
>
> Out of curiosity, which device links are those? Clearly not the RPM links
> created by the IOMMU drivers themselves... Is this some special Android
> magic, or is there actually a chance of replacing all the
> of_iommu_configure() machinery with something more generic?
I'll admit that I haven't used them personally yet, but I'm referring to
this series from Saravana [CC'd]:
https://lore.kernel.org/linux-acpi/[email protected]/
which is currently sitting in linux-next now that we're upstreaming the
"special Android magic" ;)
Will
Hi Will,
On Wed, Oct 30, 2019 at 02:51:10PM +0000, Will Deacon wrote:
> By removing the redundant call to 'pci_request_acs()' we can allow the
> ARM SMMUv3 driver to be built as a module.
>
> Signed-off-by: Will Deacon <[email protected]>
> ---
> drivers/iommu/Kconfig | 2 +-
> drivers/iommu/arm-smmu-v3.c | 1 -
> 2 files changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index e3842eabcfdd..7583d47fc4d5 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -388,7 +388,7 @@ config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT
> config.
>
> config ARM_SMMU_V3
> - bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
> + tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
> depends on ARM64
> select IOMMU_API
> select IOMMU_IO_PGTABLE_LPAEa
Sorry for the stupid question, but what prevents the iommu module from
being unloaded when there are active users? There are no symbol
dependencies to endpoint device drivers, because the interface is only
exposed through the iommu-api, right? Is some sort of manual module
reference counting needed?
Joerg
On Wed, Oct 30, 2019 at 02:51:08PM +0000, Will Deacon wrote:
> Building drivers for ATS-aware IOMMUs as modules requires access to
> pci_ats_disabled(). Export it as a GPL symbol to get things working.
>
> Signed-off-by: Will Deacon <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
> ---
> drivers/pci/pci.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index a97e2571a527..4fbe5b576dd8 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -123,6 +123,7 @@ bool pci_ats_disabled(void)
> {
> return pcie_ats_disabled;
> }
> +EXPORT_SYMBOL_GPL(pci_ats_disabled);
>
> /* Disable bridge_d3 for all PCIe ports */
> static bool pci_bridge_d3_disable;
> --
> 2.24.0.rc0.303.g954a862665-goog
>
On Wed, Oct 30, 2019 at 02:51:11PM +0000, Will Deacon wrote:
> This reverts commit addb672f200f4e99368270da205320b83efe01a0.
>
> Let's get the SMMU driver building as a module, which means putting
> back some dead code that we used to carry.
>
> Signed-off-by: Will Deacon <[email protected]>
> ---
> drivers/iommu/arm-smmu.c | 32 +++++++++++++++++++-------------
> 1 file changed, 19 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 7c503a6bc585..53bbe0663b9e 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -27,8 +27,7 @@
> #include <linux/interrupt.h>
> #include <linux/io.h>
> #include <linux/iopoll.h>
> -#include <linux/init.h>
> -#include <linux/moduleparam.h>
> +#include <linux/module.h>
> #include <linux/of.h>
> #include <linux/of_address.h>
> #include <linux/of_device.h>
> @@ -59,10 +58,6 @@
> #define MSI_IOVA_LENGTH 0x100000
>
> static int force_stage;
> -/*
> - * not really modular, but the easiest way to keep compat with existing
> - * bootargs behaviour is to continue using module_param() here.
> - */
> module_param(force_stage, int, S_IRUGO);
> MODULE_PARM_DESC(force_stage,
> "Force SMMU mappings to be installed at a particular stage of translation. A value of '1' or '2' forces the corresponding stage. All other values are ignored (i.e. no stage is forced). Note that selecting a specific stage will disable support for nested translation.");
> @@ -1878,6 +1873,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
> { .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
> { },
> };
> +MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
>
> #ifdef CONFIG_ACPI
> static int acpi_smmu_get_data(u32 model, struct arm_smmu_device *smmu)
> @@ -2165,12 +2161,12 @@ static int arm_smmu_legacy_bus_init(void)
> }
> device_initcall_sync(arm_smmu_legacy_bus_init);
>
> -static void arm_smmu_device_shutdown(struct platform_device *pdev)
> +static int arm_smmu_device_remove(struct platform_device *pdev)
> {
> struct arm_smmu_device *smmu = platform_get_drvdata(pdev);
>
> if (!smmu)
> - return;
> + return -ENODEV;
>
> if (!bitmap_empty(smmu->context_map, ARM_SMMU_MAX_CBS))
> dev_err(&pdev->dev, "removing device with active domains!\n");
> @@ -2186,6 +2182,12 @@ static void arm_smmu_device_shutdown(struct platform_device *pdev)
> clk_bulk_disable(smmu->num_clks, smmu->clks);
>
> clk_bulk_unprepare(smmu->num_clks, smmu->clks);
> + return 0;
> +}
> +
> +static void arm_smmu_device_shutdown(struct platform_device *pdev)
> +{
> + arm_smmu_device_remove(pdev);
> }
>
> static int __maybe_unused arm_smmu_runtime_resume(struct device *dev)
> @@ -2235,12 +2237,16 @@ static const struct dev_pm_ops arm_smmu_pm_ops = {
>
> static struct platform_driver arm_smmu_driver = {
> .driver = {
> - .name = "arm-smmu",
> - .of_match_table = of_match_ptr(arm_smmu_of_match),
> - .pm = &arm_smmu_pm_ops,
> - .suppress_bind_attrs = true,
> + .name = "arm-smmu",
> + .of_match_table = of_match_ptr(arm_smmu_of_match),
> + .pm = &arm_smmu_pm_ops,
> },
> .probe = arm_smmu_device_probe,
> + .remove = arm_smmu_device_remove,
> .shutdown = arm_smmu_device_shutdown,
> };
> -builtin_platform_driver(arm_smmu_driver);
> +module_platform_driver(arm_smmu_driver);
I know this is a revert, but wouldn't you still want to be at device_init()
level for built in drivers? It always preferable to not defer if given the
choice to do so and device_init() is the right level for this driver IMO.
Jordan
--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
On Wed, Oct 30, 2019 at 8:54 AM Will Deacon <[email protected]> wrote:
>
> Hi Robin,
>
> On Wed, Oct 30, 2019 at 03:35:55PM +0000, Robin Murphy wrote:
> > On 30/10/2019 14:51, Will Deacon wrote:
> > > As part of the work to enable a "Generic Kernel Image" across multiple
> > > Android devices, there is a need to seperate shared, core kernel code
> > > from modular driver code that may not be needed by all SoCs. This means
> > > building IOMMU drivers as modules.
> > >
> > > It turns out that most of the groundwork has already been done to enable
> > > the ARM SMMU drivers to be 'tristate' options in drivers/iommu/Kconfig;
> > > with a few symbols exported from the IOMMU/PCI core, everything builds
> > > nicely out of the box. The one exception is support for the legacy SMMU
> > > DT binding, which is not in widespread use and has never worked with
> > > modules, so we can simply remove that when building as a module rather
> > > than try to paper over it with even more hacks.
> > >
> > > Obviously you need to be careful about using IOMMU drivers as modules,
> > > since late loading of the driver for an IOMMU serving active DMA masters
> > > is going to end badly in many cases. On Android, we're using device links
> > > to ensure that the IOMMU probes first.
> >
> > Out of curiosity, which device links are those? Clearly not the RPM links
> > created by the IOMMU drivers themselves... Is this some special Android
> > magic, or is there actually a chance of replacing all the
> > of_iommu_configure() machinery with something more generic?
>
> I'll admit that I haven't used them personally yet, but I'm referring to
> this series from Saravana [CC'd]:
>
> https://lore.kernel.org/linux-acpi/[email protected]/
>
> which is currently sitting in linux-next now that we're upstreaming the
> "special Android magic" ;)
Hi Robin,
Actually, none of this is special Android magic. Will is talking about
the of_devlink feature that's been merged into driver-core-next.
A one line summary of of_devlink: the driver core + firmware (DT in
this case) automatically add the device links during device addition
based on the firmware properties of each device. The link that Will
gave has more details.
Wrt IOMMUs, the only missing piece in upstream is a trivial change
that does something like this in drivers/of/property.c
+static struct device_node *parse_iommus(struct device_node *np,
+ const char *prop_name, int index)
+{
+ return parse_prop_cells(np, prop_name, index, "iommus",
+ "#iommu-cells");
+}
static const struct supplier_bindings of_supplier_bindings[] = {
{ .parse_prop = parse_clocks, },
{ .parse_prop = parse_interconnects, },
{ .parse_prop = parse_regulators, },
+ { .parse_prop = parse_iommus, },
{},
};
I plan to upstream this pretty soon, but I have other patches in
flight that touch the same file and I'm waiting for those to get
accepted. I also want to clean up the code a bit to reduce some
repetition before I add support for more bindings.
Thanks,
Saravana
On Wed, Oct 30, 2019 at 05:09:41PM -0600, Jordan Crouse wrote:
> On Wed, Oct 30, 2019 at 02:51:11PM +0000, Will Deacon wrote:
> > @@ -2235,12 +2237,16 @@ static const struct dev_pm_ops arm_smmu_pm_ops = {
> >
> > static struct platform_driver arm_smmu_driver = {
> > .driver = {
> > - .name = "arm-smmu",
> > - .of_match_table = of_match_ptr(arm_smmu_of_match),
> > - .pm = &arm_smmu_pm_ops,
> > - .suppress_bind_attrs = true,
> > + .name = "arm-smmu",
> > + .of_match_table = of_match_ptr(arm_smmu_of_match),
> > + .pm = &arm_smmu_pm_ops,
> > },
> > .probe = arm_smmu_device_probe,
> > + .remove = arm_smmu_device_remove,
> > .shutdown = arm_smmu_device_shutdown,
> > };
> > -builtin_platform_driver(arm_smmu_driver);
> > +module_platform_driver(arm_smmu_driver);
>
> I know this is a revert, but wouldn't you still want to be at device_init()
> level for built in drivers? It always preferable to not defer if given the
> choice to do so and device_init() is the right level for this driver IMO.
Hmm, not sure I'm following you completely here. With this change,
module_init() is used to invoke platform_driver_register(). For builtin
drivers, module_initx() expands to __initcall(x), which itself expands
to device_initcall(x). Or are you referrring to something else?
Will
On Thu, Oct 31, 2019 at 12:03:28PM +0000, Will Deacon wrote:
> On Wed, Oct 30, 2019 at 05:09:41PM -0600, Jordan Crouse wrote:
> > On Wed, Oct 30, 2019 at 02:51:11PM +0000, Will Deacon wrote:
> > > @@ -2235,12 +2237,16 @@ static const struct dev_pm_ops arm_smmu_pm_ops = {
> > >
> > > static struct platform_driver arm_smmu_driver = {
> > > .driver = {
> > > - .name = "arm-smmu",
> > > - .of_match_table = of_match_ptr(arm_smmu_of_match),
> > > - .pm = &arm_smmu_pm_ops,
> > > - .suppress_bind_attrs = true,
> > > + .name = "arm-smmu",
> > > + .of_match_table = of_match_ptr(arm_smmu_of_match),
> > > + .pm = &arm_smmu_pm_ops,
> > > },
> > > .probe = arm_smmu_device_probe,
> > > + .remove = arm_smmu_device_remove,
> > > .shutdown = arm_smmu_device_shutdown,
> > > };
> > > -builtin_platform_driver(arm_smmu_driver);
> > > +module_platform_driver(arm_smmu_driver);
> >
> > I know this is a revert, but wouldn't you still want to be at device_init()
> > level for built in drivers? It always preferable to not defer if given the
> > choice to do so and device_init() is the right level for this driver IMO.
>
> Hmm, not sure I'm following you completely here. With this change,
> module_init() is used to invoke platform_driver_register(). For builtin
> drivers, module_initx() expands to __initcall(x), which itself expands
> to device_initcall(x). Or are you referrring to something else?
Oh, yep, I was off. For whatever reason I thought device and module were at
different levels. Sorry for the noise.
Jordan
--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
Hi Joerg,
On Wed, Oct 30, 2019 at 08:31:48PM +0100, Joerg Roedel wrote:
> On Wed, Oct 30, 2019 at 02:51:10PM +0000, Will Deacon wrote:
> > By removing the redundant call to 'pci_request_acs()' we can allow the
> > ARM SMMUv3 driver to be built as a module.
> >
> > Signed-off-by: Will Deacon <[email protected]>
> > ---
> > drivers/iommu/Kconfig | 2 +-
> > drivers/iommu/arm-smmu-v3.c | 1 -
> > 2 files changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> > index e3842eabcfdd..7583d47fc4d5 100644
> > --- a/drivers/iommu/Kconfig
> > +++ b/drivers/iommu/Kconfig
> > @@ -388,7 +388,7 @@ config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT
> > config.
> >
> > config ARM_SMMU_V3
> > - bool "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
> > + tristate "ARM Ltd. System MMU Version 3 (SMMUv3) Support"
> > depends on ARM64
> > select IOMMU_API
> > select IOMMU_IO_PGTABLE_LPAEa
>
> Sorry for the stupid question, but what prevents the iommu module from
> being unloaded when there are active users? There are no symbol
> dependencies to endpoint device drivers, because the interface is only
> exposed through the iommu-api, right? Is some sort of manual module
> reference counting needed?
Generally, I think unloading the IOMMU driver module while there are
active users is a pretty bad idea, much like unbinding the driver via
/sys in the same situation would also be fairly daft. However, I *think*
the code in __device_release_driver() tries to deal with this by
iterating over the active consumers and ->remove()ing them first.
I'm without hardware access at the moment, so I haven't been able to
test this myself. We could nobble the module_exit() hook, but there's
still the "force unload" option depending on the .config.
Will
Hi Saravana, Will,
On Wed, Oct 30, 2019 at 05:57:44PM -0700, Saravana Kannan via iommu wrote:
> > > > Obviously you need to be careful about using IOMMU drivers as modules,
> > > > since late loading of the driver for an IOMMU serving active DMA masters
> > > > is going to end badly in many cases. On Android, we're using device links
> > > > to ensure that the IOMMU probes first.
> > >
> > > Out of curiosity, which device links are those? Clearly not the RPM links
> > > created by the IOMMU drivers themselves... Is this some special Android
> > > magic, or is there actually a chance of replacing all the
> > > of_iommu_configure() machinery with something more generic?
> >
> > I'll admit that I haven't used them personally yet, but I'm referring to
> > this series from Saravana [CC'd]:
> >
> > https://lore.kernel.org/linux-acpi/[email protected]/
> >
> > which is currently sitting in linux-next now that we're upstreaming the
> > "special Android magic" ;)
Neat, I'm trying to do the same for virtio-iommu. It needs to be modular
because it depends on the virtio transport, which distributions usually
build as a module. So far I've been managing the device links in
virtio-iommu's add_device() and remove_device() callbacks [1]. Since it
relies on the existing probe deferral, I had to make a special case for
virtio-iommu to avoid giving up after initcalls_done [2].
Currently buggy, it explodes on the second modprobe.
[1] http://jpbrucker.net/git/linux/commit/?h=virtio-iommu/module-2019-10-31&id=f72978be18cb52eaa2d46dc762711bacbfab5039
[2] http://jpbrucker.net/git/linux/commit/?h=virtio-iommu/module-2019-10-31&id=f5fe188bb7fde33422ef08b9aad956dc3c77ec39
[...]
> Wrt IOMMUs, the only missing piece in upstream is a trivial change
> that does something like this in drivers/of/property.c
>
> +static struct device_node *parse_iommus(struct device_node *np,
> + const char *prop_name, int index)
> +{
> + return parse_prop_cells(np, prop_name, index, "iommus",
> + "#iommu-cells");
> +}
The 'iommus' property only applies to platform devices, do you have any
plan for PCI? PCI devices generally don't have a DT node. Only their root
bridge has a node, with an 'iommu-map' property instead of 'iommus', so
I don't think add_links() would get called for them.
I'm also unsure about distro vendors agreeing to a mandatory kernel
parameter (of_devlink). Do you plan to eventually enable it by default?
> static const struct supplier_bindings of_supplier_bindings[] = {
> { .parse_prop = parse_clocks, },
> { .parse_prop = parse_interconnects, },
> { .parse_prop = parse_regulators, },
> + { .parse_prop = parse_iommus, },
> {},
> };
>
> I plan to upstream this pretty soon, but I have other patches in
> flight that touch the same file and I'm waiting for those to get
> accepted. I also want to clean up the code a bit to reduce some
> repetition before I add support for more bindings.
I'm also wondering about ACPI support. IOMMU already has a sort of
canonical code path that links endpoints to their IOMMU
(iommu_probe_device()), after the firmware descriptions have been parsed.
So if we created the device links in the iommu core, for example
iommu_bus_notifier(), we would support all firmware interface flavors.
Otherwise we'll have to create those device links in the IORT driver as
well (plus DMAR and IVRS if they want it).
Robin pointed out that the SMMU drivers already have device links for
power management, can those be reused for module support as well? I think
the device_link_add() flags might need to differ but I haven't looked
closely at the implementation yet.
Thanks,
Jean
On Thu, Oct 31, 2019 at 12:38 PM Jean-Philippe Brucker
<[email protected]> wrote:
>
> Hi Saravana, Will,
>
> On Wed, Oct 30, 2019 at 05:57:44PM -0700, Saravana Kannan via iommu wrote:
> > > > > Obviously you need to be careful about using IOMMU drivers as modules,
> > > > > since late loading of the driver for an IOMMU serving active DMA masters
> > > > > is going to end badly in many cases. On Android, we're using device links
> > > > > to ensure that the IOMMU probes first.
> > > >
> > > > Out of curiosity, which device links are those? Clearly not the RPM links
> > > > created by the IOMMU drivers themselves... Is this some special Android
> > > > magic, or is there actually a chance of replacing all the
> > > > of_iommu_configure() machinery with something more generic?
> > >
> > > I'll admit that I haven't used them personally yet, but I'm referring to
> > > this series from Saravana [CC'd]:
> > >
> > > https://lore.kernel.org/linux-acpi/[email protected]/
> > >
> > > which is currently sitting in linux-next now that we're upstreaming the
> > > "special Android magic" ;)
>
> Neat, I'm trying to do the same for virtio-iommu. It needs to be modular
> because it depends on the virtio transport, which distributions usually
> build as a module. So far I've been managing the device links in
> virtio-iommu's add_device() and remove_device() callbacks [1]. Since it
> relies on the existing probe deferral, I had to make a special case for
> virtio-iommu to avoid giving up after initcalls_done [2].
>
> Currently buggy, it explodes on the second modprobe.
>
> [1] http://jpbrucker.net/git/linux/commit/?h=virtio-iommu/module-2019-10-31&id=f72978be18cb52eaa2d46dc762711bacbfab5039
> [2] http://jpbrucker.net/git/linux/commit/?h=virtio-iommu/module-2019-10-31&id=f5fe188bb7fde33422ef08b9aad956dc3c77ec39
>
> [...]
> > Wrt IOMMUs, the only missing piece in upstream is a trivial change
> > that does something like this in drivers/of/property.c
> >
> > +static struct device_node *parse_iommus(struct device_node *np,
> > + const char *prop_name, int index)
> > +{
> > + return parse_prop_cells(np, prop_name, index, "iommus",
> > + "#iommu-cells");
> > +}
>
> The 'iommus' property only applies to platform devices,
An early version of this patch series was limited to platform device,
but that's not true with the version that Will pointed to and was
merged into driver-core-next. The iommu parsing and creating device
links applies to all devices that use DT. That's why this code is in
of/property.c opposed to of/platform.c.
> do you have any
> plan for PCI? PCI devices generally don't have a DT node. Only their root
> bridge has a node, with an 'iommu-map' property instead of 'iommus', so
> I don't think add_links() would get called for them.
I looked into the iommu-map property and it shouldn't be too hard to
add support for it. Looks like we can simply hold off on probing the
root bridge device till all the iommus in its iommu-map are probed and
we should be fine.
> I'm also unsure about distro vendors agreeing to a mandatory kernel
> parameter (of_devlink). Do you plan to eventually enable it by default?
>
> > static const struct supplier_bindings of_supplier_bindings[] = {
> > { .parse_prop = parse_clocks, },
> > { .parse_prop = parse_interconnects, },
> > { .parse_prop = parse_regulators, },
> > + { .parse_prop = parse_iommus, },
> > {},
> > };
> >
> > I plan to upstream this pretty soon, but I have other patches in
> > flight that touch the same file and I'm waiting for those to get
> > accepted. I also want to clean up the code a bit to reduce some
> > repetition before I add support for more bindings.
>
> I'm also wondering about ACPI support.
I'd love to add ACPI support too, but I have zero knowledge of ACPI.
I'd be happy to help anyone who wants to add ACPI support that allows
ACPI to add device links.
> IOMMU already has a sort of
> canonical code path that links endpoints to their IOMMU
> (iommu_probe_device()), after the firmware descriptions have been parsed.
> So if we created the device links in the iommu core, for example
> iommu_bus_notifier(), we would support all firmware interface flavors.
> Otherwise we'll have to create those device links in the IORT driver as
> well (plus DMAR and IVRS if they want it).
IOMMU driver/framework or whoever else can create device links as
necessary. That's not mutually exclusive to the firmware adding device
links (the device links APIs handle this nicely). While device probe
ordering is one benefit of my patch series, that's not all of it
though. It also deals with making sure suppliers known when they can
clean up the boot state of their device even when the drivers for all
their consumers are loaded as modules (so late initcall won't work). I
can go into more details on this if needed, but that latter part is
not very relevant in this context and you can find most of the details
in my patch series/documentation I added.
Thanks,
Saravana
On 31/10/2019 23:34, Saravana Kannan via iommu wrote:
> I looked into the iommu-map property and it shouldn't be too hard to
> add support for it. Looks like we can simply hold off on probing the
> root bridge device till all the iommus in its iommu-map are probed and
> we should be fine.
>
>> I'm also unsure about distro vendors agreeing to a mandatory kernel
>> parameter (of_devlink). Do you plan to eventually enable it by default?
>>
>>> static const struct supplier_bindings of_supplier_bindings[] = {
>>> { .parse_prop = parse_clocks, },
>>> { .parse_prop = parse_interconnects, },
>>> { .parse_prop = parse_regulators, },
>>> + { .parse_prop = parse_iommus, },
>>> {},
>>> };
>>>
>>> I plan to upstream this pretty soon, but I have other patches in
>>> flight that touch the same file and I'm waiting for those to get
>>> accepted. I also want to clean up the code a bit to reduce some
>>> repetition before I add support for more bindings.
>> I'm also wondering about ACPI support.
> I'd love to add ACPI support too, but I have zero knowledge of ACPI.
> I'd be happy to help anyone who wants to add ACPI support that allows
> ACPI to add device links.
If possible to add, that may be useful for remedying this:
https://lore.kernel.org/linux-iommu/[email protected]/
Thanks,
John
>
>> IOMMU already has a sort of
>> canonical code path that links endpoints to their IOMMU
>> (iommu_probe_device()), after the firmware descriptions have been parsed.
>> So if we created the device links in the iommu core, for example
>> iommu_bus_notifier(), we would support all firmware interface fl
On Thu, Oct 31, 2019 at 04:34:14PM -0700, Saravana Kannan wrote:
> > Neat, I'm trying to do the same for virtio-iommu. It needs to be modular
> > because it depends on the virtio transport, which distributions usually
> > build as a module. So far I've been managing the device links in
> > virtio-iommu's add_device() and remove_device() callbacks [1]. Since it
> > relies on the existing probe deferral, I had to make a special case for
> > virtio-iommu to avoid giving up after initcalls_done [2].
> >
> > Currently buggy, it explodes on the second modprobe.
> >
> > [1] http://jpbrucker.net/git/linux/commit/?h=virtio-iommu/module-2019-10-31&id=f72978be18cb52eaa2d46dc762711bacbfab5039
> > [2] http://jpbrucker.net/git/linux/commit/?h=virtio-iommu/module-2019-10-31&id=f5fe188bb7fde33422ef08b9aad956dc3c77ec39
> >
> > [...]
> > > Wrt IOMMUs, the only missing piece in upstream is a trivial change
> > > that does something like this in drivers/of/property.c
> > >
> > > +static struct device_node *parse_iommus(struct device_node *np,
> > > + const char *prop_name, int index)
> > > +{
> > > + return parse_prop_cells(np, prop_name, index, "iommus",
> > > + "#iommu-cells");
> > > +}
> >
> > The 'iommus' property only applies to platform devices,
>
> An early version of this patch series was limited to platform device,
> but that's not true with the version that Will pointed to and was
> merged into driver-core-next. The iommu parsing and creating device
> links applies to all devices that use DT. That's why this code is in
> of/property.c opposed to of/platform.c.
>
> > do you have any
> > plan for PCI? PCI devices generally don't have a DT node. Only their root
> > bridge has a node, with an 'iommu-map' property instead of 'iommus', so
> > I don't think add_links() would get called for them.
>
> I looked into the iommu-map property and it shouldn't be too hard to
> add support for it. Looks like we can simply hold off on probing the
> root bridge device till all the iommus in its iommu-map are probed and
> we should be fine.
Virtio-iommu can use the PCI transport [3], in which case you need a
finer-grained approach. In some implementations you'll have the IOMMU
interface and endpoints managed by that IOMMU under the same RC. I'm not
proud of it.
[3] 6c9e92ef8bdd ("dt-bindings: virtio: Add virtio-pci-iommu node")
> > I'm also unsure about distro vendors agreeing to a mandatory kernel
> > parameter (of_devlink). Do you plan to eventually enable it by default?
> >
> > > static const struct supplier_bindings of_supplier_bindings[] = {
> > > { .parse_prop = parse_clocks, },
> > > { .parse_prop = parse_interconnects, },
> > > { .parse_prop = parse_regulators, },
> > > + { .parse_prop = parse_iommus, },
> > > {},
> > > };
> > >
> > > I plan to upstream this pretty soon, but I have other patches in
> > > flight that touch the same file and I'm waiting for those to get
> > > accepted. I also want to clean up the code a bit to reduce some
> > > repetition before I add support for more bindings.
> >
> > I'm also wondering about ACPI support.
>
> I'd love to add ACPI support too, but I have zero knowledge of ACPI.
> I'd be happy to help anyone who wants to add ACPI support that allows
> ACPI to add device links.
It's not as generic as device-tree, each vendor has their own table to
describe the IOMMU topology. I don't see a nice way to transpose the
add_links() callback there. Links need to be created either in a common
path (iommu_probe_device()) or in the APCI IORT driver.
> > IOMMU already has a sort of
> > canonical code path that links endpoints to their IOMMU
> > (iommu_probe_device()), after the firmware descriptions have been parsed.
> > So if we created the device links in the iommu core, for example
> > iommu_bus_notifier(), we would support all firmware interface flavors.
> > Otherwise we'll have to create those device links in the IORT driver as
> > well (plus DMAR and IVRS if they want it).
>
> IOMMU driver/framework or whoever else can create device links as
> necessary. That's not mutually exclusive to the firmware adding device
> links (the device links APIs handle this nicely).
Oh right, device_link_add() would reuse an existing link. It even seems to
work for IOMMU drivers that want to manage PM links themselves
(DL_FLAG_STATELESS).
> While device probe
> ordering is one benefit of my patch series, that's not all of it
> though. It also deals with making sure suppliers known when they can
> clean up the boot state of their device even when the drivers for all
> their consumers are loaded as modules (so late initcall won't work). I
> can go into more details on this if needed, but that latter part is
> not very relevant in this context and you can find most of the details
> in my patch series/documentation I added.
Yes I admit I only glanced over it, but it seems like a nice solution. At
the moment of_iommu_xlate() handles the dependency itself. It currently
give up after initcalls but than can be tweaked to wait indefinitely (my
patch [2]). If of_devlink was enabled by default and handled iommu-map it
could probably replace the of_iommu probe deferral. For now it might be
best to manage device links somewhere in the IOMMU core, so ACPI
implementations (which do probe deferral their own way) can benefit from
it as well.
Thanks,
Jean
On Fri, Nov 01, 2019 at 12:41:48PM +0100, Jean-Philippe Brucker wrote:
[...]
> > > I'm also wondering about ACPI support.
> >
> > I'd love to add ACPI support too, but I have zero knowledge of ACPI.
> > I'd be happy to help anyone who wants to add ACPI support that allows
> > ACPI to add device links.
>
> It's not as generic as device-tree, each vendor has their own table to
> describe the IOMMU topology. I don't see a nice way to transpose the
> add_links() callback there. Links need to be created either in a common
> path (iommu_probe_device()) or in the APCI IORT driver.
We can create a generic stub that calls into respective firmware
handling paths (eg iort_dma_setup() in acpi_dma_configure()).
There are three arches booting with ACPI so stubbing it out in
specific firmware handlers is not such a big deal, less generic
sure, but not catastrophically bad.
Obviously this works for IOMMU masters links - for resources
dependencies (eg power domains) it deserves some thought, keeping in
mind that IOMMUs are static table entries in ACPI and not device objects
so they are not even capable of expressing eg power resources and
suchlike.
Long story short: adding IOMMU masters links in ACPI should be
reasonably simple, everything else requires further thought.
Lorenzo
Hi Jean-Philippe,
Quick question while you figure out the devlink stuff with Saravana...
On Thu, Oct 31, 2019 at 08:37:58PM +0100, Jean-Philippe Brucker wrote:
> On Wed, Oct 30, 2019 at 05:57:44PM -0700, Saravana Kannan via iommu wrote:
> > > > > Obviously you need to be careful about using IOMMU drivers as modules,
> > > > > since late loading of the driver for an IOMMU serving active DMA masters
> > > > > is going to end badly in many cases. On Android, we're using device links
> > > > > to ensure that the IOMMU probes first.
> > > >
> > > > Out of curiosity, which device links are those? Clearly not the RPM links
> > > > created by the IOMMU drivers themselves... Is this some special Android
> > > > magic, or is there actually a chance of replacing all the
> > > > of_iommu_configure() machinery with something more generic?
> > >
> > > I'll admit that I haven't used them personally yet, but I'm referring to
> > > this series from Saravana [CC'd]:
> > >
> > > https://lore.kernel.org/linux-acpi/[email protected]/
> > >
> > > which is currently sitting in linux-next now that we're upstreaming the
> > > "special Android magic" ;)
>
> Neat, I'm trying to do the same for virtio-iommu. It needs to be modular
> because it depends on the virtio transport, which distributions usually
> build as a module. So far I've been managing the device links in
> virtio-iommu's add_device() and remove_device() callbacks [1]. Since it
> relies on the existing probe deferral, I had to make a special case for
> virtio-iommu to avoid giving up after initcalls_done [2].
As far as symbols exported from the IOMMU and PCI layers, did you find you
needed anything on top of the stuff I'm exporting in patches 1 and 3?
Cheers,
Will
On Fri, Nov 1, 2019 at 3:28 AM John Garry <[email protected]> wrote:
>
> On 31/10/2019 23:34, Saravana Kannan via iommu wrote:
> > I looked into the iommu-map property and it shouldn't be too hard to
> > add support for it. Looks like we can simply hold off on probing the
> > root bridge device till all the iommus in its iommu-map are probed and
> > we should be fine.
> >
> >> I'm also unsure about distro vendors agreeing to a mandatory kernel
> >> parameter (of_devlink). Do you plan to eventually enable it by default?
> >>
> >>> static const struct supplier_bindings of_supplier_bindings[] = {
> >>> { .parse_prop = parse_clocks, },
> >>> { .parse_prop = parse_interconnects, },
> >>> { .parse_prop = parse_regulators, },
> >>> + { .parse_prop = parse_iommus, },
> >>> {},
> >>> };
> >>>
> >>> I plan to upstream this pretty soon, but I have other patches in
> >>> flight that touch the same file and I'm waiting for those to get
> >>> accepted. I also want to clean up the code a bit to reduce some
> >>> repetition before I add support for more bindings.
> >> I'm also wondering about ACPI support.
> > I'd love to add ACPI support too, but I have zero knowledge of ACPI.
> > I'd be happy to help anyone who wants to add ACPI support that allows
> > ACPI to add device links.
>
> If possible to add, that may be useful for remedying this:
>
> https://lore.kernel.org/linux-iommu/[email protected]/
I'm happy that this change might fix that problem, but isn't the
problem reported in that thread more to do with child devices getting
added before the parent probes successfully? That doesn't make sense
to me. Can't the piceport driver not add its child devices before it
probes successfully? Or more specifically, who adds the child devices
of the pcieport before the pcieport itself probes?
Thanks,
Saravana
On Fri, Nov 1, 2019 at 5:28 AM Lorenzo Pieralisi
<[email protected]> wrote:
>
> On Fri, Nov 01, 2019 at 12:41:48PM +0100, Jean-Philippe Brucker wrote:
>
> [...]
>
> > > > I'm also wondering about ACPI support.
> > >
> > > I'd love to add ACPI support too, but I have zero knowledge of ACPI.
> > > I'd be happy to help anyone who wants to add ACPI support that allows
> > > ACPI to add device links.
> >
> > It's not as generic as device-tree, each vendor has their own table to
> > describe the IOMMU topology. I don't see a nice way to transpose the
> > add_links() callback there. Links need to be created either in a common
> > path (iommu_probe_device()) or in the APCI IORT driver.
>
> We can create a generic stub that calls into respective firmware
> handling paths (eg iort_dma_setup() in acpi_dma_configure()).
>
> There are three arches booting with ACPI so stubbing it out in
> specific firmware handlers is not such a big deal, less generic
> sure, but not catastrophically bad.
Ok, good to know.
> Obviously this works for IOMMU masters links
It's unclear to me what you are referring to here and it's throwing me
off on the rest of the email.
Did you mean to say "IOMMU master's links"? As in the bus masters
whose accesses go through IOMMUs? And "links" as in device links?
OR
Do you mean device links from bus master devices to IOMMUs here?
> - for resources
> dependencies (eg power domains) it deserves some thought, keeping in
> mind that IOMMUs are static table entries in ACPI and not device objects
> so they are not even capable of expressing eg power resources and
> suchlike.
If you can reword this sentence for me with more context or split it
into separate sentences, I'd appreciate that very much. I'd help me
understand this better and allow me to try to help out.
> Long story short: adding IOMMU masters links in ACPI should be
> reasonably simple, everything else requires further thought.
-Saravana
Hi Will,
On Fri, Nov 01, 2019 at 05:21:46PM +0000, Will Deacon wrote:
> As far as symbols exported from the IOMMU and PCI layers, did you find you
> needed anything on top of the stuff I'm exporting in patches 1 and 3?
No, I needed the same symbols (minus fsl_mc_device_group and
iommu_group_ref_get).
Thanks,
Jean
On Fri, Nov 01, 2019 at 02:26:05PM -0700, Saravana Kannan wrote:
> On Fri, Nov 1, 2019 at 5:28 AM Lorenzo Pieralisi
> <[email protected]> wrote:
> >
> > On Fri, Nov 01, 2019 at 12:41:48PM +0100, Jean-Philippe Brucker wrote:
> >
> > [...]
> >
> > > > > I'm also wondering about ACPI support.
> > > >
> > > > I'd love to add ACPI support too, but I have zero knowledge of ACPI.
> > > > I'd be happy to help anyone who wants to add ACPI support that allows
> > > > ACPI to add device links.
> > >
> > > It's not as generic as device-tree, each vendor has their own table to
> > > describe the IOMMU topology. I don't see a nice way to transpose the
> > > add_links() callback there. Links need to be created either in a common
> > > path (iommu_probe_device()) or in the APCI IORT driver.
> >
> > We can create a generic stub that calls into respective firmware
> > handling paths (eg iort_dma_setup() in acpi_dma_configure()).
> >
> > There are three arches booting with ACPI so stubbing it out in
> > specific firmware handlers is not such a big deal, less generic
> > sure, but not catastrophically bad.
>
> Ok, good to know.
>
> > Obviously this works for IOMMU masters links
>
> It's unclear to me what you are referring to here and it's throwing me
> off on the rest of the email.
>
> Did you mean to say "IOMMU master's links"? As in the bus masters
> whose accesses go through IOMMUs? And "links" as in device links?
>
> OR
>
> Do you mean device links from bus master devices to IOMMUs here?
I meant associating endpoints devices to the IOMMU they are connected
to.
In DT you do it through "iommus", "iommu-map" properties, in ACPI
it is arch specific, doable nonetheless through ACPI (IORT on ARM)
static tables data.
> > - for resources
> > dependencies (eg power domains) it deserves some thought, keeping in
> > mind that IOMMUs are static table entries in ACPI and not device objects
> > so they are not even capable of expressing eg power resources and
> > suchlike.
>
> If you can reword this sentence for me with more context or split it
> into separate sentences, I'd appreciate that very much. I'd help me
> understand this better and allow me to try to help out.
In ACPI (at least on ARM but on x86 I suspect that's the same story with
the DMAR table) an SMMU is presented in FW as an entry in a static
table (eg IORT on ARM). I noticed that your patch series takes into
account for instance eg clock dependencies in DT; this way the OS knows
the clock(s) the SMMU depends on to be activated.
In ACPI there is not a notion of "clock" (hopefully - unless someone
sneaked that in using _DSD properties) but rather every device in the
ACPI namespace (which is part of tables containing code that needs the
ACPI interpreter to be used such as SSDT/DSDT - it is AML code) has ACPI
objects describing power resources (ie ACPI specification 6.3, 7.2).
The SMMU, since it is not itself an ACPI object in the ACPI namespace
but rather an entry in a static ACPI table (IORT on ARM), can't have
PowerResource object in it which means that at the moment there is no
way you can detect a dependency on other power resources to be ON to
build the device links you require to sort out the probe dependencies,
which I *assume* that's the reason why you require to detect
clock dependencies in DT.
Maybe it is not even needed at all but in case it is I was giving
a heads-up to say that clocks (or rather an all encompassing "power
resource" dependency) dependencies in ACPI to build an SMMU as
a module are not straightforward and most certainly will require
firmware specifications updates.
*Hopefully* in the short term all you need to detect is how endpoint
devices are connected to an IOMMU and build device links to describe
that probe dependency, if we need to throw power management into
the picture there is more work to be done.
I hope that's clearer, if it is not please let me know and I will
try to be more precise.
Thanks,
Lorenzo
On 01/11/2019 21:13, Saravana Kannan wrote:
> On Fri, Nov 1, 2019 at 3:28 AM John Garry <[email protected]> wrote:
>>
>> On 31/10/2019 23:34, Saravana Kannan via iommu wrote:
>>> I looked into the iommu-map property and it shouldn't be too hard to
>>> add support for it. Looks like we can simply hold off on probing the
>>> root bridge device till all the iommus in its iommu-map are probed and
>>> we should be fine.
>>>
>>>> I'm also unsure about distro vendors agreeing to a mandatory kernel
>>>> parameter (of_devlink). Do you plan to eventually enable it by default?
>>>>
>>>>> static const struct supplier_bindings of_supplier_bindings[] = {
>>>>> { .parse_prop = parse_clocks, },
>>>>> { .parse_prop = parse_interconnects, },
>>>>> { .parse_prop = parse_regulators, },
>>>>> + { .parse_prop = parse_iommus, },
>>>>> {},
>>>>> };
>>>>>
>>>>> I plan to upstream this pretty soon, but I have other patches in
>>>>> flight that touch the same file and I'm waiting for those to get
>>>>> accepted. I also want to clean up the code a bit to reduce some
>>>>> repetition before I add support for more bindings.
>>>> I'm also wondering about ACPI support.
>>> I'd love to add ACPI support too, but I have zero knowledge of ACPI.
>>> I'd be happy to help anyone who wants to add ACPI support that allows
>>> ACPI to add device links.
>>
>> If possible to add, that may be useful for remedying this:
>>
>> https://lore.kernel.org/linux-iommu/[email protected]/
>
> I'm happy that this change might fix that problem, but isn't the
> problem reported in that thread more to do with child devices getting
> added before the parent probes successfully? That doesn't make sense
> to me.
So the pcieport device and then the child device are added in the PCI
scan, but only some time later do the device drivers probe for these
devices; so it's not that the that pcieport driver creates the child device.
The problem then occurs in that the ordering the of device driver probe
is such that we have this: pcieport probe + defer (as no IOMMU group
registered), SMMU probe (registers the IOMMU group), child device probe,
pcieport really probe.
Can't the piceport driver not add its child devices before it
> probes successfully? Or more specifically, who adds the child devices
> of the pcieport before the pcieport itself probes?
The devices are actually added in order pcieport, child device, but not
really probed in that same order, as above.
I'll add you to that thread if you want to discuss further.
Thanks,
John
On 04/11/2019 12:16, John Garry wrote:
> On 01/11/2019 21:13, Saravana Kannan wrote:
>> On Fri, Nov 1, 2019 at 3:28 AM John Garry <[email protected]> wrote:
>>>
>>> On 31/10/2019 23:34, Saravana Kannan via iommu wrote:
>>>> I looked into the iommu-map property and it shouldn't be too hard to
>>>> add support for it. Looks like we can simply hold off on probing the
>>>> root bridge device till all the iommus in its iommu-map are probed and
>>>> we should be fine.
>>>>
>>>>> I'm also unsure about distro vendors agreeing to a mandatory kernel
>>>>> parameter (of_devlink). Do you plan to eventually enable it by
>>>>> default?
>>>>>
>>>>>> static const struct supplier_bindings of_supplier_bindings[] = {
>>>>>> { .parse_prop = parse_clocks, },
>>>>>> { .parse_prop = parse_interconnects, },
>>>>>> { .parse_prop = parse_regulators, },
>>>>>> + { .parse_prop = parse_iommus, },
>>>>>> {},
>>>>>> };
>>>>>>
>>>>>> I plan to upstream this pretty soon, but I have other patches in
>>>>>> flight that touch the same file and I'm waiting for those to get
>>>>>> accepted. I also want to clean up the code a bit to reduce some
>>>>>> repetition before I add support for more bindings.
>>>>> I'm also wondering about ACPI support.
>>>> I'd love to add ACPI support too, but I have zero knowledge of ACPI.
>>>> I'd be happy to help anyone who wants to add ACPI support that allows
>>>> ACPI to add device links.
>>>
>>> If possible to add, that may be useful for remedying this:
>>>
>>> https://lore.kernel.org/linux-iommu/[email protected]/
>>>
>>
>> I'm happy that this change might fix that problem, but isn't the
>> problem reported in that thread more to do with child devices getting
>> added before the parent probes successfully? That doesn't make sense
>> to me.
>
> So the pcieport device and then the child device are added in the PCI
> scan, but only some time later do the device drivers probe for these
> devices; so it's not that the that pcieport driver creates the child
> device.
>
> The problem then occurs in that the ordering the of device driver probe
> is such that we have this: pcieport probe + defer (as no IOMMU group
> registered), SMMU probe (registers the IOMMU group), child device probe,
> pcieport really probe.
>
> Can't the piceport driver not add its child devices before it
>> probes successfully? Or more specifically, who adds the child devices
>> of the pcieport before the pcieport itself probes?
>
> The devices are actually added in order pcieport, child device, but not
> really probed in that same order, as above.
Right, in short the fundamental problem is that of_iommu_configure() now
does the wrong thing. Deferring probe of the entire host bridge/root
complex based on "iommu-map" would indeed happen to solve the problem by
brute force, I think, but could lead to a dependency cycle for PCI-based
IOMMUs as Jean points out. I hope to have time this week to work a bit
more on pulling of_iommu_configure() apart to fix it properly, after
which of_devlink *should* only have to worry about the child devices
themselves...
Robin.
> I'll add you to that thread if you want to discuss further.
>
> Thanks,
> John
>
On Thu, Oct 31, 2019 at 03:42:47PM +0000, Will Deacon wrote:
> > Sorry for the stupid question, but what prevents the iommu module from
> > being unloaded when there are active users? There are no symbol
> > dependencies to endpoint device drivers, because the interface is only
> > exposed through the iommu-api, right? Is some sort of manual module
> > reference counting needed?
>
> Generally, I think unloading the IOMMU driver module while there are
> active users is a pretty bad idea, much like unbinding the driver via
> /sys in the same situation would also be fairly daft. However, I *think*
> the code in __device_release_driver() tries to deal with this by
> iterating over the active consumers and ->remove()ing them first.
> I'm without hardware access at the moment, so I haven't been able to
> test this myself. We could nobble the module_exit() hook, but there's
> still the "force unload" option depending on the .config.
Shame that we can't completely prevent module unloading, because handling
rmmod cleanly is tricky.
On module unload we also need to tidy up the bus->iommu_ops installed by
bus_set_iommu(), and remove the IOMMU groups (and probably other leaks I
missed). I have a solution for the bus->iommu_ops, which is simply adding
a bus_unset_iommu() counterpart with a refcount, but it doesn't deal with
the IOMMU groups cleanly. If there are multiple IOMMU instances managing
one bus, then we should only remove the IOMMU groups belonging to the
instance that is being removed.
I'll think about this more, but the simple solution is attached if you
want to test. It at least works with a single IOMMU now:
$ modprobe virtio-iommu
[ 25.180965] virtio_iommu virtio0: input address: 64 bits
[ 25.181437] virtio_iommu virtio0: page mask: 0xfffffffffffff000
[ 25.214493] virtio-pci 0000:00:03.0: Adding to iommu group 0
[ 25.233252] virtio-pci 0000:00:03.0: enabling device (0000 -> 0003)
[ 25.334810] e1000e 0000:00:02.0: Adding to iommu group 1
[ 25.348997] e1000e 0000:00:02.0: enabling device (0000 -> 0002)
... net test etc
$ rmmod virtio-iommu
[ 34.084816] e1000e: eth1 NIC Link is Down
[ 34.212152] pci 0000:00:02.0: Removing from iommu group 1
[ 34.250558] pci 0000:00:03.0: Removing from iommu group 0
[ 34.261570] virtio_iommu virtio0: device removed
$ modprobe virtio-iommu
[ 34.828982] virtio_iommu virtio0: input address: 64 bits
[ 34.829442] virtio_iommu virtio0: page mask: 0xfffffffffffff000
[ 34.844576] virtio-pci 0000:00:03.0: Adding to iommu group 0
[ 34.916449] e1000e 0000:00:02.0: Adding to iommu group 1
Thanks,
Jean
On Wed, Oct 30, 2019 at 02:51:12PM +0000, Will Deacon wrote:
> By conditionally dropping support for the legacy binding and exporting
> the newly introduced 'arm_smmu_impl_init()' function we can allow the
> ARM SMMU driver to be built as a module.
>
> Signed-off-by: Will Deacon <[email protected]>
> ---
> drivers/iommu/Kconfig | 14 ++++++++-
> drivers/iommu/arm-smmu-impl.c | 6 ++++
> drivers/iommu/arm-smmu.c | 54 +++++++++++++++++++++--------------
> 3 files changed, 51 insertions(+), 23 deletions(-)
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 7583d47fc4d5..02703f51e533 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -350,7 +350,7 @@ config SPAPR_TCE_IOMMU
>
> # ARM IOMMU support
> config ARM_SMMU
> - bool "ARM Ltd. System MMU (SMMU) Support"
> + tristate "ARM Ltd. System MMU (SMMU) Support"
> depends on (ARM64 || ARM) && MMU
> select IOMMU_API
> select IOMMU_IO_PGTABLE_LPAE
> @@ -362,6 +362,18 @@ config ARM_SMMU
> Say Y here if your SoC includes an IOMMU device implementing
> the ARM SMMU architecture.
>
> +config ARM_SMMU_LEGACY_DT_BINDINGS
> + bool "Support the legacy \"mmu-masters\" devicetree bindings"
> + depends on ARM_SMMU=y && OF
> + help
> + Support for the badly designed and deprecated \"mmu-masters\"
> + devicetree bindings. This allows some DMA masters to attach
> + to the SMMU but does not provide any support via the DMA API.
> + If you're lucky, you might be able to get VFIO up and running.
> +
> + If you say Y here then you'll make me very sad. Instead, say N
> + and move your firmware to the utopian future that was 2016.
> +
> config ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT
> bool "Default to disabling bypass on ARM SMMU v1 and v2"
> depends on ARM_SMMU
> diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
> index 5c87a38620c4..2f82d40317d6 100644
> --- a/drivers/iommu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm-smmu-impl.c
> @@ -5,6 +5,7 @@
> #define pr_fmt(fmt) "arm-smmu: " fmt
>
> #include <linux/bitfield.h>
> +#include <linux/module.h>
> #include <linux/of.h>
>
> #include "arm-smmu.h"
> @@ -172,3 +173,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
>
> return smmu;
> }
> +EXPORT_SYMBOL_GPL(arm_smmu_impl_init);
> +
> +MODULE_DESCRIPTION("IOMMU quirks for ARM architected SMMU implementations");
> +MODULE_AUTHOR("Robin Murphy <[email protected]>");
> +MODULE_LICENSE("GPL v2");
Hi Will,
A minor comment: I was curious about why arm-smmu.c and arm-smmu-impl.c
were being compiled as two separate modules, as opposed to combining
them into one module? The latter approach seemed more appropriate, given
that arm-smmu-impl doesn't offer much as a module on its own. Thoughts?
--Isaac
Hi Will,
On Thu, Oct 31, 2019 at 03:42:47PM +0000, Will Deacon wrote:
> Generally, I think unloading the IOMMU driver module while there are
> active users is a pretty bad idea, much like unbinding the driver via
> /sys in the same situation would also be fairly daft. However, I *think*
> the code in __device_release_driver() tries to deal with this by
> iterating over the active consumers and ->remove()ing them first.
>
> I'm without hardware access at the moment, so I haven't been able to
> test this myself. We could nobble the module_exit() hook, but there's
> still the "force unload" option depending on the .config.
Okay, but besides the force-unload case, can we prevent accidential
unloading by taking a reference to the module in add_device() and release
it in remove_device()?
Regards,
Joerg
On Mon, Nov 4, 2019 at 3:43 AM Lorenzo Pieralisi
<[email protected]> wrote:
>
> On Fri, Nov 01, 2019 at 02:26:05PM -0700, Saravana Kannan wrote:
> > On Fri, Nov 1, 2019 at 5:28 AM Lorenzo Pieralisi
> > <[email protected]> wrote:
> > >
> > > On Fri, Nov 01, 2019 at 12:41:48PM +0100, Jean-Philippe Brucker wrote:
> > >
> > > [...]
> > >
> > > > > > I'm also wondering about ACPI support.
> > > > >
> > > > > I'd love to add ACPI support too, but I have zero knowledge of ACPI.
> > > > > I'd be happy to help anyone who wants to add ACPI support that allows
> > > > > ACPI to add device links.
> > > >
> > > > It's not as generic as device-tree, each vendor has their own table to
> > > > describe the IOMMU topology. I don't see a nice way to transpose the
> > > > add_links() callback there. Links need to be created either in a common
> > > > path (iommu_probe_device()) or in the APCI IORT driver.
> > >
> > > We can create a generic stub that calls into respective firmware
> > > handling paths (eg iort_dma_setup() in acpi_dma_configure()).
> > >
> > > There are three arches booting with ACPI so stubbing it out in
> > > specific firmware handlers is not such a big deal, less generic
> > > sure, but not catastrophically bad.
> >
> > Ok, good to know.
> >
> > > Obviously this works for IOMMU masters links
> >
> > It's unclear to me what you are referring to here and it's throwing me
> > off on the rest of the email.
> >
> > Did you mean to say "IOMMU master's links"? As in the bus masters
> > whose accesses go through IOMMUs? And "links" as in device links?
> >
> > OR
> >
> > Do you mean device links from bus master devices to IOMMUs here?
>
> I meant associating endpoints devices to the IOMMU they are connected
> to.
>
> In DT you do it through "iommus", "iommu-map" properties, in ACPI
> it is arch specific, doable nonetheless through ACPI (IORT on ARM)
> static tables data.
>
> > > - for resources
> > > dependencies (eg power domains) it deserves some thought, keeping in
> > > mind that IOMMUs are static table entries in ACPI and not device objects
> > > so they are not even capable of expressing eg power resources and
> > > suchlike.
> >
> > If you can reword this sentence for me with more context or split it
> > into separate sentences, I'd appreciate that very much. I'd help me
> > understand this better and allow me to try to help out.
>
> In ACPI (at least on ARM but on x86 I suspect that's the same story with
> the DMAR table) an SMMU is presented in FW as an entry in a static
> table (eg IORT on ARM). I noticed that your patch series takes into
> account for instance eg clock dependencies in DT; this way the OS knows
> the clock(s) the SMMU depends on to be activated.
>
> In ACPI there is not a notion of "clock" (hopefully - unless someone
> sneaked that in using _DSD properties) but rather every device in the
> ACPI namespace (which is part of tables containing code that needs the
> ACPI interpreter to be used such as SSDT/DSDT - it is AML code) has ACPI
> objects describing power resources (ie ACPI specification 6.3, 7.2).
>
> The SMMU, since it is not itself an ACPI object in the ACPI namespace
> but rather an entry in a static ACPI table (IORT on ARM), can't have
> PowerResource object in it which means that at the moment there is no
> way you can detect a dependency on other power resources to be ON to
> build the device links you require to sort out the probe dependencies,
> which I *assume* that's the reason why you require to detect
> clock dependencies in DT.
>
> Maybe it is not even needed at all but in case it is I was giving
> a heads-up to say that clocks (or rather an all encompassing "power
> resource" dependency) dependencies in ACPI to build an SMMU as
> a module are not straightforward and most certainly will require
> firmware specifications updates.
>
> *Hopefully* in the short term all you need to detect is how endpoint
> devices are connected to an IOMMU and build device links to describe
> that probe dependency, if we need to throw power management into
> the picture there is more work to be done.
>
> I hope that's clearer, if it is not please let me know and I will
> try to be more precise.
Thanks for the detailed explanation and context Lorenzo. I understand
it better now.
-Saravana
On Mon, Nov 4, 2019 at 4:16 AM John Garry <[email protected]> wrote:
>
> On 01/11/2019 21:13, Saravana Kannan wrote:
> > On Fri, Nov 1, 2019 at 3:28 AM John Garry <[email protected]> wrote:
> >>
> >> On 31/10/2019 23:34, Saravana Kannan via iommu wrote:
> >>> I looked into the iommu-map property and it shouldn't be too hard to
> >>> add support for it. Looks like we can simply hold off on probing the
> >>> root bridge device till all the iommus in its iommu-map are probed and
> >>> we should be fine.
> >>>
> >>>> I'm also unsure about distro vendors agreeing to a mandatory kernel
> >>>> parameter (of_devlink). Do you plan to eventually enable it by default?
> >>>>
> >>>>> static const struct supplier_bindings of_supplier_bindings[] = {
> >>>>> { .parse_prop = parse_clocks, },
> >>>>> { .parse_prop = parse_interconnects, },
> >>>>> { .parse_prop = parse_regulators, },
> >>>>> + { .parse_prop = parse_iommus, },
> >>>>> {},
> >>>>> };
> >>>>>
> >>>>> I plan to upstream this pretty soon, but I have other patches in
> >>>>> flight that touch the same file and I'm waiting for those to get
> >>>>> accepted. I also want to clean up the code a bit to reduce some
> >>>>> repetition before I add support for more bindings.
> >>>> I'm also wondering about ACPI support.
> >>> I'd love to add ACPI support too, but I have zero knowledge of ACPI.
> >>> I'd be happy to help anyone who wants to add ACPI support that allows
> >>> ACPI to add device links.
> >>
> >> If possible to add, that may be useful for remedying this:
> >>
> >> https://lore.kernel.org/linux-iommu/[email protected]/
> >
> > I'm happy that this change might fix that problem, but isn't the
> > problem reported in that thread more to do with child devices getting
> > added before the parent probes successfully? That doesn't make sense
> > to me.
>
> So the pcieport device and then the child device are added in the PCI
> scan, but only some time later do the device drivers probe for these
> devices; so it's not that the that pcieport driver creates the child device.
My lack of PCI knowledge might be showing, but without the pcieport
device itself being functional (I'm assuming this is the PCIe
controller) how does a PCI scan work? In anyway, I understand that all
devices are added at the same time as of today.
> The problem then occurs in that the ordering the of device driver probe
> is such that we have this: pcieport probe + defer (as no IOMMU group
> registered), SMMU probe (registers the IOMMU group), child device probe,
> pcieport really probe.
>
> Can't the piceport driver not add its child devices before it
> > probes successfully? Or more specifically, who adds the child devices
> > of the pcieport before the pcieport itself probes?
>
> The devices are actually added in order pcieport, child device, but not
> really probed in that same order, as above.
Thanks. Got it.
I guess one nice thing that came out of devicetree world is that the
child devices are never probed without the parent probing first (I'm
ignoring the simple-bus devices). It also kinda makes sense to me --
without a USB port device being added and probed, it doesn't make
sense to add the (keyboard, mouse, USB-HDD, etc) devices connected to
the USB port. But this is an orthogonal debate.
-Saravana
On Mon, Nov 4, 2019 at 5:29 AM Robin Murphy <[email protected]> wrote:
>
> On 04/11/2019 12:16, John Garry wrote:
> > On 01/11/2019 21:13, Saravana Kannan wrote:
> >> On Fri, Nov 1, 2019 at 3:28 AM John Garry <[email protected]> wrote:
> >>>
> >>> On 31/10/2019 23:34, Saravana Kannan via iommu wrote:
> >>>> I looked into the iommu-map property and it shouldn't be too hard to
> >>>> add support for it. Looks like we can simply hold off on probing the
> >>>> root bridge device till all the iommus in its iommu-map are probed and
> >>>> we should be fine.
> >>>>
> >>>>> I'm also unsure about distro vendors agreeing to a mandatory kernel
> >>>>> parameter (of_devlink). Do you plan to eventually enable it by
> >>>>> default?
> >>>>>
> >>>>>> static const struct supplier_bindings of_supplier_bindings[] = {
> >>>>>> { .parse_prop = parse_clocks, },
> >>>>>> { .parse_prop = parse_interconnects, },
> >>>>>> { .parse_prop = parse_regulators, },
> >>>>>> + { .parse_prop = parse_iommus, },
> >>>>>> {},
> >>>>>> };
> >>>>>>
> >>>>>> I plan to upstream this pretty soon, but I have other patches in
> >>>>>> flight that touch the same file and I'm waiting for those to get
> >>>>>> accepted. I also want to clean up the code a bit to reduce some
> >>>>>> repetition before I add support for more bindings.
> >>>>> I'm also wondering about ACPI support.
> >>>> I'd love to add ACPI support too, but I have zero knowledge of ACPI.
> >>>> I'd be happy to help anyone who wants to add ACPI support that allows
> >>>> ACPI to add device links.
> >>>
> >>> If possible to add, that may be useful for remedying this:
> >>>
> >>> https://lore.kernel.org/linux-iommu/[email protected]/
> >>>
> >>
> >> I'm happy that this change might fix that problem, but isn't the
> >> problem reported in that thread more to do with child devices getting
> >> added before the parent probes successfully? That doesn't make sense
> >> to me.
> >
> > So the pcieport device and then the child device are added in the PCI
> > scan, but only some time later do the device drivers probe for these
> > devices; so it's not that the that pcieport driver creates the child
> > device.
> >
> > The problem then occurs in that the ordering the of device driver probe
> > is such that we have this: pcieport probe + defer (as no IOMMU group
> > registered), SMMU probe (registers the IOMMU group), child device probe,
> > pcieport really probe.
> >
> > Can't the piceport driver not add its child devices before it
> >> probes successfully? Or more specifically, who adds the child devices
> >> of the pcieport before the pcieport itself probes?
> >
> > The devices are actually added in order pcieport, child device, but not
> > really probed in that same order, as above.
>
> Right, in short the fundamental problem is that of_iommu_configure() now
> does the wrong thing. Deferring probe of the entire host bridge/root
> complex based on "iommu-map" would indeed happen to solve the problem by
> brute force, I think, but could lead to a dependency cycle for PCI-based
> IOMMUs as Jean points out.
Sorry for the late reply. Got caught up on other work.
I didn't think the SMMU itself was PCI based in the example Jean gave.
I thought it just happened to be the case where the SMMU probes after
the pcieport but before the other children. If the SMMU itself is a
child of the pcieport, how can it be required for the parent to
function? How will suspend/resume even work?! I feel like I'm missing
some context wrt to PCI that everyone else seems to know (which isn't
surprising).
> I hope to have time this week to work a bit
> more on pulling of_iommu_configure() apart to fix it properly, after
> which of_devlink *should* only have to worry about the child devices
> themselves...
Worry about child devices in what sense? From a non-iommu perspective?
-Saravana
On Wed, Oct 30, 2019 at 5:57 PM Saravana Kannan <[email protected]> wrote:
>
> On Wed, Oct 30, 2019 at 8:54 AM Will Deacon <[email protected]> wrote:
> >
> > Hi Robin,
> >
> > On Wed, Oct 30, 2019 at 03:35:55PM +0000, Robin Murphy wrote:
> > > On 30/10/2019 14:51, Will Deacon wrote:
> > > > As part of the work to enable a "Generic Kernel Image" across multiple
> > > > Android devices, there is a need to seperate shared, core kernel code
> > > > from modular driver code that may not be needed by all SoCs. This means
> > > > building IOMMU drivers as modules.
> > > >
> > > > It turns out that most of the groundwork has already been done to enable
> > > > the ARM SMMU drivers to be 'tristate' options in drivers/iommu/Kconfig;
> > > > with a few symbols exported from the IOMMU/PCI core, everything builds
> > > > nicely out of the box. The one exception is support for the legacy SMMU
> > > > DT binding, which is not in widespread use and has never worked with
> > > > modules, so we can simply remove that when building as a module rather
> > > > than try to paper over it with even more hacks.
> > > >
> > > > Obviously you need to be careful about using IOMMU drivers as modules,
> > > > since late loading of the driver for an IOMMU serving active DMA masters
> > > > is going to end badly in many cases. On Android, we're using device links
> > > > to ensure that the IOMMU probes first.
> > >
> > > Out of curiosity, which device links are those? Clearly not the RPM links
> > > created by the IOMMU drivers themselves... Is this some special Android
> > > magic, or is there actually a chance of replacing all the
> > > of_iommu_configure() machinery with something more generic?
> >
> > I'll admit that I haven't used them personally yet, but I'm referring to
> > this series from Saravana [CC'd]:
> >
> > https://lore.kernel.org/linux-acpi/[email protected]/
> >
> > which is currently sitting in linux-next now that we're upstreaming the
> > "special Android magic" ;)
>
> Hi Robin,
>
> Actually, none of this is special Android magic. Will is talking about
> the of_devlink feature that's been merged into driver-core-next.
>
> A one line summary of of_devlink: the driver core + firmware (DT in
> this case) automatically add the device links during device addition
> based on the firmware properties of each device. The link that Will
> gave has more details.
>
> Wrt IOMMUs, the only missing piece in upstream is a trivial change
> that does something like this in drivers/of/property.c
>
> +static struct device_node *parse_iommus(struct device_node *np,
> + const char *prop_name, int index)
> +{
> + return parse_prop_cells(np, prop_name, index, "iommus",
> + "#iommu-cells");
> +}
>
> static const struct supplier_bindings of_supplier_bindings[] = {
> { .parse_prop = parse_clocks, },
> { .parse_prop = parse_interconnects, },
> { .parse_prop = parse_regulators, },
> + { .parse_prop = parse_iommus, },
> {},
> };
>
> I plan to upstream this pretty soon, but I have other patches in
> flight that touch the same file and I'm waiting for those to get
> accepted. I also want to clean up the code a bit to reduce some
> repetition before I add support for more bindings.
As promised:
https://lore.kernel.org/lkml/[email protected]/
-Saravana
On Wed, Nov 06, 2019 at 10:11:55PM -0800, Saravana Kannan wrote:
> > Right, in short the fundamental problem is that of_iommu_configure() now
> > does the wrong thing. Deferring probe of the entire host bridge/root
> > complex based on "iommu-map" would indeed happen to solve the problem by
> > brute force, I think, but could lead to a dependency cycle for PCI-based
> > IOMMUs as Jean points out.
>
> Sorry for the late reply. Got caught up on other work.
>
> I didn't think the SMMU itself was PCI based in the example Jean gave.
> I thought it just happened to be the case where the SMMU probes after
> the pcieport but before the other children. If the SMMU itself is a
> child of the pcieport, how can it be required for the parent to
> function? How will suspend/resume even work?! I feel like I'm missing
> some context wrt to PCI that everyone else seems to know (which isn't
> surprising).
The Arm SMMU isn't PCI based, it always appears as an independent MMIO
device. John's problem is something different if I understand correctly,
where the probe order between pcieport and endpoint shouldn't affect the
IOMMU grouping, but currently does.
Two other IOMMU models are PCI based, though, AMD IOMMU and virtio-iommu
(which is a purely virtual device). In theory they can have their
programming interface anywhere in the PCI config space, but to ensure
proper software support they should be at the top of the PCI hierarchy.
AMD strongly recommends that the IOMMU is a root-complex device (4.5 -
Software and Platform Firmware Implementation Issues). Within a PCIe
system the IOMMU would have to be a Root Complex integrated Endpoint, not
be a child of a root port.
Thanks,
Jean
Hi Isaac,
On Mon, Nov 04, 2019 at 11:34:00AM -0800, Isaac J. Manjarres wrote:
> On Wed, Oct 30, 2019 at 02:51:12PM +0000, Will Deacon wrote:
> > diff --git a/drivers/iommu/arm-smmu-impl.c b/drivers/iommu/arm-smmu-impl.c
> > index 5c87a38620c4..2f82d40317d6 100644
> > --- a/drivers/iommu/arm-smmu-impl.c
> > +++ b/drivers/iommu/arm-smmu-impl.c
> > @@ -5,6 +5,7 @@
> > #define pr_fmt(fmt) "arm-smmu: " fmt
> >
> > #include <linux/bitfield.h>
> > +#include <linux/module.h>
> > #include <linux/of.h>
> >
> > #include "arm-smmu.h"
> > @@ -172,3 +173,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
> >
> > return smmu;
> > }
> > +EXPORT_SYMBOL_GPL(arm_smmu_impl_init);
> > +
> > +MODULE_DESCRIPTION("IOMMU quirks for ARM architected SMMU implementations");
> > +MODULE_AUTHOR("Robin Murphy <[email protected]>");
> > +MODULE_LICENSE("GPL v2");
>
> A minor comment: I was curious about why arm-smmu.c and arm-smmu-impl.c
> were being compiled as two separate modules, as opposed to combining
> them into one module? The latter approach seemed more appropriate, given
> that arm-smmu-impl doesn't offer much as a module on its own. Thoughts?
Yes, you're right. The simple answer is that I couldn't come up with a good
name for the combined module, since "arm-smmu" is already taken by the core
part of the driver and I don't want to rename that file. Looking at what a
few other drivers do, it seems that "arm-smmu-mod" might be the best bet
so I'll incorporate that change for v2 and put you on cc.
Thanks!
Will
Hi Joerg,
On Tue, Nov 05, 2019 at 01:15:08PM +0100, Joerg Roedel wrote:
> On Thu, Oct 31, 2019 at 03:42:47PM +0000, Will Deacon wrote:
> > Generally, I think unloading the IOMMU driver module while there are
> > active users is a pretty bad idea, much like unbinding the driver via
> > /sys in the same situation would also be fairly daft. However, I *think*
> > the code in __device_release_driver() tries to deal with this by
> > iterating over the active consumers and ->remove()ing them first.
> >
> > I'm without hardware access at the moment, so I haven't been able to
> > test this myself. We could nobble the module_exit() hook, but there's
> > still the "force unload" option depending on the .config.
>
> Okay, but besides the force-unload case, can we prevent accidential
> unloading by taking a reference to the module in add_device() and release
> it in remove_device()?
That's probably a sensible starting point, yes. In conjunction with the
patch from Jean-Philippe to introduce bus_unset_iommu(), we might have
a fighting chance of getting this to work.
I'll spin a v2.
Thanks!
Will
Hi Jean-Philippe,
On Mon, Nov 04, 2019 at 08:15:24PM +0100, Jean-Philippe Brucker wrote:
> On Thu, Oct 31, 2019 at 03:42:47PM +0000, Will Deacon wrote:
> > > Sorry for the stupid question, but what prevents the iommu module from
> > > being unloaded when there are active users? There are no symbol
> > > dependencies to endpoint device drivers, because the interface is only
> > > exposed through the iommu-api, right? Is some sort of manual module
> > > reference counting needed?
> >
> > Generally, I think unloading the IOMMU driver module while there are
> > active users is a pretty bad idea, much like unbinding the driver via
> > /sys in the same situation would also be fairly daft. However, I *think*
> > the code in __device_release_driver() tries to deal with this by
> > iterating over the active consumers and ->remove()ing them first.
>
> > I'm without hardware access at the moment, so I haven't been able to
> > test this myself. We could nobble the module_exit() hook, but there's
> > still the "force unload" option depending on the .config.
>
> Shame that we can't completely prevent module unloading, because handling
> rmmod cleanly is tricky.
>
> On module unload we also need to tidy up the bus->iommu_ops installed by
> bus_set_iommu(), and remove the IOMMU groups (and probably other leaks I
> missed). I have a solution for the bus->iommu_ops, which is simply adding
> a bus_unset_iommu() counterpart with a refcount, but it doesn't deal with
> the IOMMU groups cleanly. If there are multiple IOMMU instances managing
> one bus, then we should only remove the IOMMU groups belonging to the
> instance that is being removed.
Hmm, but all of those IOMMU instances must be driven by the same driver,
right, since bus_set_iommu() can only take one set of callbacks for a given
bus? In which case, removing the driver module effectively removes all
instances of the IOMMU for that bus and I think we're ok.
If we couple that with Joerg's suggestion to take a reference to the driver
module in add_device(), then I think that actually it's harmless to leave
the bus ops installed and the groups should be sorted too. It means it's
pretty difficult to unload the module, but that's probably not a bad thing.
I'll post a v2 shortly...
> I'll think about this more, but the simple solution is attached if you
> want to test. It at least works with a single IOMMU now:
>
> $ modprobe virtio-iommu
> [ 25.180965] virtio_iommu virtio0: input address: 64 bits
> [ 25.181437] virtio_iommu virtio0: page mask: 0xfffffffffffff000
> [ 25.214493] virtio-pci 0000:00:03.0: Adding to iommu group 0
> [ 25.233252] virtio-pci 0000:00:03.0: enabling device (0000 -> 0003)
> [ 25.334810] e1000e 0000:00:02.0: Adding to iommu group 1
> [ 25.348997] e1000e 0000:00:02.0: enabling device (0000 -> 0002)
> ... net test etc
>
> $ rmmod virtio-iommu
> [ 34.084816] e1000e: eth1 NIC Link is Down
> [ 34.212152] pci 0000:00:02.0: Removing from iommu group 1
> [ 34.250558] pci 0000:00:03.0: Removing from iommu group 0
> [ 34.261570] virtio_iommu virtio0: device removed
>
> $ modprobe virtio-iommu
> [ 34.828982] virtio_iommu virtio0: input address: 64 bits
> [ 34.829442] virtio_iommu virtio0: page mask: 0xfffffffffffff000
> [ 34.844576] virtio-pci 0000:00:03.0: Adding to iommu group 0
> [ 34.916449] e1000e 0000:00:02.0: Adding to iommu group 1
>
> Thanks,
> Jean
> From 5437fcaabe1d4671e2dc5b90b7898c0bf698111b Mon Sep 17 00:00:00 2001
> From: Jean-Philippe Brucker <[email protected]>
> Date: Mon, 4 Nov 2019 15:52:36 +0100
> Subject: [PATCH] iommu: Add bus_unset_iommu()
>
> Let modular IOMMU drivers undo bus_set_iommu(). Keep track of bus
> registrations with a list and refcount, and remove the iommu_ops from
> the bus when there are no IOMMU providers anymore.
>
> Signed-off-by: Jean-Philippe Brucker <[email protected]>
> ---
> drivers/iommu/iommu.c | 101 ++++++++++++++++++++++++++++++++++--------
> include/linux/iommu.h | 1 +
> 2 files changed, 84 insertions(+), 18 deletions(-)
To be honest, I think we should be trying to move *away* from the bus-ops
abstraction rather than extending it. We already don't need it for DMA
domains on arm64, and I think it's really just a bit of a wart now because
iommu_domain_alloc() takes a 'struct bus_type *' as its argument.
Will