From: Thierry Reding <[email protected]>
This series adds support for the IOMMU found on Tegra124 SoCs. The SMMU
groups memory clients into SWGROUPs and each SWGROUP can be assigned to
one I/O virtual address space. Translation of virtual addresses can be
enabled per memory client.
Patch 1 adds an IOMMU device registry. The driver in patch 4 will add
the IOMMU device with this registry, which will in turn be used by the
client drivers to attach to the IOMMU device. Note that the API that is
introduced in this patch may not be sufficient in the long term (f.e.
when multiple master interfaces need to be supported).
Patch 2 is v3 of the generic IOMMU device tree binding that has been
discussed previously. Patch 3 defines the device tree binding for the
NVIDIA Tegra124 memory controller (and references the generic IOMMU
binding).
Patch 4 implements a memory controller driver for NVIDIA Tegra124. It
initializes the latency allowance programming to sensible defaults and
registers an IOMMU device. Note that this is still somewhat work in
progress. The page tables aren't properly cleaned up yet and other
features of the memory controller may be useful to implement
subsequently.
Patches 5 through 8 add the device tree node for the memory controller
and enable IOMMU support in the display and SDMMC controllers as
examples.
Patches 9 and 10 add support for IOMMU to the DRM and SDMMC drivers.
SDMMC uses the DMA mapping API, which will make use of ARM's DMA/IOMMU
integration. DRM has special needs (buffers that are mapped can be
scanned out by either display controller) and not a good fit for the
DMA mapping API, so it uses the IOMMU API directly.
This has been tested using both SDMMC and DRM drivers via the IOMMU. For
DRM when an IOMMU is detected it will use shmem as backing store, which
removes the need for CMA. Importing from gk20a via the Nouveau driver
also works, but buffers occasionally have some kind of offset that I
haven't been able to track down yet.
Thierry
Thierry Reding (10):
iommu: Add IOMMU device registry
devicetree: Add generic IOMMU device tree bindings
of: Add NVIDIA Tegra124 memory controller binding
memory: Add Tegra124 memory controller support
ARM: tegra: Add memory controller on Tegra124
ARM: tegra: tegra124: Enable IOMMU for display controllers
ARM: tegra: tegra124: Enable IOMMU for SDMMC controllers
ARM: tegra: Select ARM_DMA_USE_IOMMU
drm/tegra: Add IOMMU support
mmc: sdhci-tegra: Add IOMMU support
Documentation/devicetree/bindings/iommu/iommu.txt | 156 ++
.../memory-controllers/nvidia,tegra124-mc.txt | 12 +
arch/arm/boot/dts/tegra124.dtsi | 18 +
arch/arm/mach-tegra/Kconfig | 1 +
drivers/gpu/drm/tegra/dc.c | 21 +
drivers/gpu/drm/tegra/drm.c | 17 +
drivers/gpu/drm/tegra/drm.h | 3 +
drivers/gpu/drm/tegra/fb.c | 16 +-
drivers/gpu/drm/tegra/gem.c | 236 ++-
drivers/gpu/drm/tegra/gem.h | 4 +
drivers/iommu/iommu.c | 93 +
drivers/memory/Kconfig | 9 +
drivers/memory/Makefile | 1 +
drivers/memory/tegra124-mc.c | 1945 ++++++++++++++++++++
drivers/mmc/host/sdhci-tegra.c | 8 +
include/dt-bindings/memory/tegra124-mc.h | 30 +
include/linux/iommu.h | 27 +
17 files changed, 2573 insertions(+), 24 deletions(-)
create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt
create mode 100644 Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt
create mode 100644 drivers/memory/tegra124-mc.c
create mode 100644 include/dt-bindings/memory/tegra124-mc.h
--
2.0.0
From: Thierry Reding <[email protected]>
Add an IOMMU device registry for drivers to register with and implement
a method for users of the IOMMU API to attach to an IOMMU device. This
allows to support deferred probing and gives the IOMMU API a convenient
hook to perform early initialization of a device if necessary.
Signed-off-by: Thierry Reding <[email protected]>
---
drivers/iommu/iommu.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/iommu.h | 27 +++++++++++++++
2 files changed, 120 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 806b55d056b7..5e9e82c73bbf 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -29,8 +29,12 @@
#include <linux/idr.h>
#include <linux/notifier.h>
#include <linux/err.h>
+#include <linux/of.h>
#include <trace/events/iommu.h>
+static DEFINE_MUTEX(iommus_lock);
+static LIST_HEAD(iommus);
+
static struct kset *iommu_group_kset;
static struct ida iommu_group_ida;
static struct mutex iommu_group_mutex;
@@ -1004,3 +1008,92 @@ int iommu_domain_set_attr(struct iommu_domain *domain,
return ret;
}
EXPORT_SYMBOL_GPL(iommu_domain_set_attr);
+
+int iommu_add(struct iommu *iommu)
+{
+ mutex_lock(&iommus_lock);
+ list_add_tail(&iommu->list, &iommus);
+ mutex_unlock(&iommus_lock);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(iommu_add);
+
+void iommu_remove(struct iommu *iommu)
+{
+ mutex_lock(&iommus_lock);
+ list_del_init(&iommu->list);
+ mutex_unlock(&iommus_lock);
+}
+EXPORT_SYMBOL_GPL(iommu_remove);
+
+static int of_iommu_attach(struct device *dev)
+{
+ struct of_phandle_iter iter;
+ struct iommu *iommu;
+
+ mutex_lock(&iommus_lock);
+
+ of_property_for_each_phandle_with_args(iter, dev->of_node, "iommus",
+ "#iommu-cells", 0) {
+ bool found = false;
+ int err;
+
+ /* skip disabled IOMMUs */
+ if (!of_device_is_available(iter.out_args.np))
+ continue;
+
+ list_for_each_entry(iommu, &iommus, list) {
+ if (iommu->dev->of_node == iter.out_args.np) {
+ err = iommu->ops->attach(iommu, dev);
+ if (err < 0) {
+ }
+
+ found = true;
+ }
+ }
+
+ if (!found) {
+ mutex_unlock(&iommus_lock);
+ return -EPROBE_DEFER;
+ }
+ }
+
+ mutex_unlock(&iommus_lock);
+
+ return 0;
+}
+
+static int of_iommu_detach(struct device *dev)
+{
+ /* TODO: implement */
+ return -ENOSYS;
+}
+
+int iommu_attach(struct device *dev)
+{
+ int err = 0;
+
+ if (IS_ENABLED(CONFIG_OF) && dev->of_node) {
+ err = of_iommu_attach(dev);
+ if (!err)
+ return 0;
+ }
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(iommu_attach);
+
+int iommu_detach(struct device *dev)
+{
+ int err = 0;
+
+ if (IS_ENABLED(CONFIG_OF) && dev->of_node) {
+ err = of_iommu_detach(dev);
+ if (!err)
+ return 0;
+ }
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(iommu_detach);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 284a4683fdc1..ac2ceef194d4 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -43,6 +43,17 @@ struct notifier_block;
typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
struct device *, unsigned long, int, void *);
+struct iommu {
+ struct device *dev;
+
+ struct list_head list;
+
+ const struct iommu_ops *ops;
+};
+
+int iommu_add(struct iommu *iommu);
+void iommu_remove(struct iommu *iommu);
+
struct iommu_domain_geometry {
dma_addr_t aperture_start; /* First address that can be mapped */
dma_addr_t aperture_end; /* Last address that can be mapped */
@@ -130,6 +141,9 @@ struct iommu_ops {
/* Get the numer of window per domain */
u32 (*domain_get_windows)(struct iommu_domain *domain);
+ int (*attach)(struct iommu *iommu, struct device *dev);
+ int (*detach)(struct iommu *iommu, struct device *dev);
+
unsigned long pgsize_bitmap;
};
@@ -192,6 +206,10 @@ extern int iommu_domain_window_enable(struct iommu_domain *domain, u32 wnd_nr,
phys_addr_t offset, u64 size,
int prot);
extern void iommu_domain_window_disable(struct iommu_domain *domain, u32 wnd_nr);
+
+int iommu_attach(struct device *dev);
+int iommu_detach(struct device *dev);
+
/**
* report_iommu_fault() - report about an IOMMU fault to the IOMMU framework
* @domain: the iommu domain where the fault has happened
@@ -396,6 +414,15 @@ static inline int iommu_domain_set_attr(struct iommu_domain *domain,
return -EINVAL;
}
+static inline int iommu_attach(struct device *dev)
+{
+ return 0;
+}
+
+static inline int iommu_detach(struct device *dev)
+{
+ return 0;
+}
#endif /* CONFIG_IOMMU_API */
#endif /* __LINUX_IOMMU_H */
--
2.0.0
From: Thierry Reding <[email protected]>
Attach to the device's master interface of the IOMMU at .probe() time.
IOMMU support becomes available via the DMA mapping API interoperation
code, but this explicit attachment is necessary to ensure proper probe
order.
Signed-off-by: Thierry Reding <[email protected]>
---
drivers/mmc/host/sdhci-tegra.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
index 33100d10d176..b884614fa4e6 100644
--- a/drivers/mmc/host/sdhci-tegra.c
+++ b/drivers/mmc/host/sdhci-tegra.c
@@ -15,6 +15,7 @@
#include <linux/err.h>
#include <linux/module.h>
#include <linux/init.h>
+#include <linux/iommu.h>
#include <linux/platform_device.h>
#include <linux/clk.h>
#include <linux/io.h>
@@ -237,6 +238,11 @@ static int sdhci_tegra_probe(struct platform_device *pdev)
match = of_match_device(sdhci_tegra_dt_match, &pdev->dev);
if (!match)
return -EINVAL;
+
+ rc = iommu_attach(&pdev->dev);
+ if (rc < 0)
+ return rc;
+
soc_data = match->data;
host = sdhci_pltfm_init(pdev, soc_data->pdata, 0);
@@ -310,6 +316,8 @@ static int sdhci_tegra_remove(struct platform_device *pdev)
clk_disable_unprepare(pltfm_host->clk);
clk_put(pltfm_host->clk);
+ iommu_detach(&pdev->dev);
+
sdhci_pltfm_free(pdev);
return 0;
--
2.0.0
From: Thierry Reding <[email protected]>
When an IOMMU device is available on the platform bus, allocate an IOMMU
domain and attach the display controllers to it. The display controllers
can then scan out non-contiguous buffers by mapping them through the
IOMMU.
Signed-off-by: Thierry Reding <[email protected]>
---
drivers/gpu/drm/tegra/dc.c | 21 ++++
drivers/gpu/drm/tegra/drm.c | 17 ++++
drivers/gpu/drm/tegra/drm.h | 3 +
drivers/gpu/drm/tegra/fb.c | 16 ++-
drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
drivers/gpu/drm/tegra/gem.h | 4 +
6 files changed, 273 insertions(+), 24 deletions(-)
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index afcca04f5367..0f7452d04811 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -9,6 +9,7 @@
#include <linux/clk.h>
#include <linux/debugfs.h>
+#include <linux/iommu.h>
#include <linux/reset.h>
#include "dc.h"
@@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
{
struct drm_device *drm = dev_get_drvdata(client->parent);
struct tegra_dc *dc = host1x_client_to_dc(client);
+ struct tegra_drm *tegra = drm->dev_private;
int err;
+ if (tegra->domain) {
+ err = iommu_attach_device(tegra->domain, dc->dev);
+ if (err < 0) {
+ dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
+ err);
+ return err;
+ }
+ }
+
drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
drm_mode_crtc_set_gamma_size(&dc->base, 256);
drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
@@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
static int tegra_dc_exit(struct host1x_client *client)
{
+ struct drm_device *drm = dev_get_drvdata(client->parent);
struct tegra_dc *dc = host1x_client_to_dc(client);
+ struct tegra_drm *tegra = drm->dev_private;
int err;
devm_free_irq(dc->dev, dc->irq, dc);
@@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
return err;
}
+ iommu_detach_device(tegra->domain, dc->dev);
+
return 0;
}
@@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
return -ENXIO;
}
+ err = iommu_attach(&pdev->dev);
+ if (err < 0) {
+ dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
+ return err;
+ }
+
INIT_LIST_HEAD(&dc->client.list);
dc->client.ops = &dc_client_ops;
dc->client.dev = &pdev->dev;
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 59736bb810cd..1d2bbafad982 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -8,6 +8,7 @@
*/
#include <linux/host1x.h>
+#include <linux/iommu.h>
#include "drm.h"
#include "gem.h"
@@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
if (!tegra)
return -ENOMEM;
+ if (iommu_present(&platform_bus_type)) {
+ tegra->domain = iommu_domain_alloc(&platform_bus_type);
+ if (IS_ERR(tegra->domain)) {
+ kfree(tegra);
+ return PTR_ERR(tegra->domain);
+ }
+
+ drm_mm_init(&tegra->mm, 0, SZ_2G);
+ }
+
mutex_init(&tegra->clients_lock);
INIT_LIST_HEAD(&tegra->clients);
drm->dev_private = tegra;
@@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
static int tegra_drm_unload(struct drm_device *drm)
{
struct host1x_device *device = to_host1x_device(drm->dev);
+ struct tegra_drm *tegra = drm->dev_private;
int err;
drm_kms_helper_poll_fini(drm);
@@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
if (err < 0)
return err;
+ if (tegra->domain) {
+ iommu_domain_free(tegra->domain);
+ drm_mm_takedown(&tegra->mm);
+ }
+
return 0;
}
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index 96d754e7b3eb..a07c796b7edc 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -39,6 +39,9 @@ struct tegra_fbdev {
struct tegra_drm {
struct drm_device *drm;
+ struct iommu_domain *domain;
+ struct drm_mm mm;
+
struct mutex clients_lock;
struct list_head clients;
diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
index 7790d43ad082..21c65dd817c3 100644
--- a/drivers/gpu/drm/tegra/fb.c
+++ b/drivers/gpu/drm/tegra/fb.c
@@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
for (i = 0; i < fb->num_planes; i++) {
struct tegra_bo *bo = fb->planes[i];
- if (bo)
+ if (bo) {
+ if (bo->pages && bo->virt)
+ vunmap(bo->virt);
+
drm_gem_object_unreference_unlocked(&bo->gem);
+ }
}
drm_framebuffer_cleanup(framebuffer);
@@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
offset = info->var.xoffset * bytes_per_pixel +
info->var.yoffset * fb->pitches[0];
+ if (bo->pages) {
+ bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
+ pgprot_writecombine(PAGE_KERNEL));
+ if (!bo->vaddr) {
+ dev_err(drm->dev, "failed to vmap() framebuffer\n");
+ err = -ENOMEM;
+ goto destroy;
+ }
+ }
+
drm->mode_config.fb_base = (resource_size_t)bo->paddr;
info->screen_base = (void __iomem *)bo->vaddr + offset;
info->screen_size = size;
diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index c1e4e8b6e5ca..2912e61a2599 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -14,8 +14,10 @@
*/
#include <linux/dma-buf.h>
+#include <linux/iommu.h>
#include <drm/tegra_drm.h>
+#include "drm.h"
#include "gem.h"
static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
@@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
.kunmap = tegra_bo_kunmap,
};
+static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
+ dma_addr_t iova, int prot)
+{
+ unsigned long offset = 0;
+ struct scatterlist *sg;
+ unsigned int i, j;
+ int err;
+
+ for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+ dma_addr_t phys = sg_phys(sg);
+ size_t length = sg->offset;
+
+ phys = sg_phys(sg) - sg->offset;
+ length = sg->length + sg->offset;
+
+ err = iommu_map(domain, iova + offset, phys, length, prot);
+ if (err < 0)
+ goto unmap;
+
+ offset += length;
+ }
+
+ return 0;
+
+unmap:
+ offset = 0;
+
+ for_each_sg(sgt->sgl, sg, i, j) {
+ size_t length = sg->length + sg->offset;
+ iommu_unmap(domain, iova + offset, length);
+ offset += length;
+ }
+
+ return err;
+}
+
+static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
+ dma_addr_t iova)
+{
+ unsigned long offset = 0;
+ struct scatterlist *sg;
+ unsigned int i;
+
+ for_each_sg(sgt->sgl, sg, sgt->nents, i) {
+ dma_addr_t phys = sg_phys(sg);
+ size_t length = sg->offset;
+
+ phys = sg_phys(sg) - sg->offset;
+ length = sg->length + sg->offset;
+
+ iommu_unmap(domain, iova + offset, length);
+ offset += length;
+ }
+
+ return 0;
+}
+
+static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
+{
+ int prot = IOMMU_READ | IOMMU_WRITE;
+ int err;
+
+ if (bo->mm)
+ return -EBUSY;
+
+ bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
+ if (!bo->mm)
+ return -ENOMEM;
+
+ err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
+ PAGE_SIZE, 0, 0, 0);
+ if (err < 0) {
+ dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
+ return err;
+ }
+
+ bo->paddr = bo->mm->start;
+
+ err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
+ if (err < 0) {
+ dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
+ return err;
+ }
+
+ return 0;
+}
+
+static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
+{
+ if (!bo->mm)
+ return 0;
+
+ iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
+ drm_mm_remove_node(bo->mm);
+
+ kfree(bo->mm);
+ return 0;
+}
+
static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
{
- dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
+ if (!bo->pages)
+ dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
+ bo->paddr);
+ else
+ drm_gem_put_pages(&bo->gem, bo->pages, true, true);
+}
+
+static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo,
+ size_t size)
+{
+ bo->pages = drm_gem_get_pages(&bo->gem, GFP_KERNEL);
+ if (!bo->pages)
+ return -ENOMEM;
+
+ bo->num_pages = size >> PAGE_SHIFT;
+
+ return 0;
+}
+
+static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo,
+ size_t size)
+{
+ bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
+ GFP_KERNEL | __GFP_NOWARN);
+ if (!bo->vaddr) {
+ dev_err(drm->dev, "failed to allocate buffer of size %zu\n",
+ size);
+ return -ENOMEM;
+ }
+
+ return 0;
}
struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
unsigned long flags)
{
+ struct tegra_drm *tegra = drm->dev_private;
struct tegra_bo *bo;
int err;
@@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
host1x_bo_init(&bo->base, &tegra_bo_ops);
size = round_up(size, PAGE_SIZE);
- bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
- GFP_KERNEL | __GFP_NOWARN);
- if (!bo->vaddr) {
- dev_err(drm->dev, "failed to allocate buffer with size %u\n",
- size);
- err = -ENOMEM;
- goto err_dma;
- }
-
err = drm_gem_object_init(drm, &bo->gem, size);
if (err)
- goto err_init;
+ goto free;
err = drm_gem_create_mmap_offset(&bo->gem);
if (err)
- goto err_mmap;
+ goto release;
+
+ if (tegra->domain) {
+ err = tegra_bo_get_pages(drm, bo, size);
+ if (err < 0)
+ goto release;
+
+ bo->sgt = drm_prime_pages_to_sg(bo->pages, bo->num_pages);
+ if (IS_ERR(bo->sgt)) {
+ err = PTR_ERR(bo->sgt);
+ goto release;
+ }
+
+ err = tegra_bo_iommu_map(tegra, bo);
+ if (err < 0)
+ goto release;
+ } else {
+ err = tegra_bo_alloc(drm, bo, size);
+ if (err < 0)
+ goto release;
+ }
if (flags & DRM_TEGRA_GEM_CREATE_TILED)
bo->tiling.mode = TEGRA_BO_TILING_MODE_TILED;
@@ -133,11 +276,10 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
return bo;
-err_mmap:
+release:
drm_gem_object_release(&bo->gem);
-err_init:
tegra_bo_destroy(drm, bo);
-err_dma:
+free:
kfree(bo);
return ERR_PTR(err);
@@ -172,6 +314,7 @@ err:
static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
struct dma_buf *buf)
{
+ struct tegra_drm *tegra = drm->dev_private;
struct dma_buf_attachment *attach;
struct tegra_bo *bo;
ssize_t size;
@@ -211,12 +354,19 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
goto detach;
}
- if (bo->sgt->nents > 1) {
- err = -EINVAL;
- goto detach;
+ if (tegra->domain) {
+ err = tegra_bo_iommu_map(tegra, bo);
+ if (err < 0)
+ goto detach;
+ } else {
+ if (bo->sgt->nents > 1) {
+ err = -EINVAL;
+ goto detach;
+ }
+
+ bo->paddr = sg_dma_address(bo->sgt->sgl);
}
- bo->paddr = sg_dma_address(bo->sgt->sgl);
bo->gem.import_attach = attach;
return bo;
@@ -239,8 +389,12 @@ free:
void tegra_bo_free_object(struct drm_gem_object *gem)
{
+ struct tegra_drm *tegra = gem->dev->dev_private;
struct tegra_bo *bo = to_tegra_bo(gem);
+ if (tegra->domain)
+ tegra_bo_iommu_unmap(tegra, bo);
+
if (gem->import_attach) {
dma_buf_unmap_attachment(gem->import_attach, bo->sgt,
DMA_TO_DEVICE);
@@ -301,7 +455,38 @@ int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
return 0;
}
+static int tegra_bo_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+ struct drm_gem_object *gem = vma->vm_private_data;
+ struct tegra_bo *bo = to_tegra_bo(gem);
+ struct page *page;
+ pgoff_t offset;
+ int err;
+
+ if (!bo->pages)
+ return VM_FAULT_SIGBUS;
+
+ offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >> PAGE_SHIFT;
+ page = bo->pages[offset];
+
+ err = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
+ switch (err) {
+ case -EAGAIN:
+ case 0:
+ case -ERESTARTSYS:
+ case -EINTR:
+ case -EBUSY:
+ return VM_FAULT_NOPAGE;
+
+ case -ENOMEM:
+ return VM_FAULT_OOM;
+ }
+
+ return VM_FAULT_SIGBUS;
+}
+
const struct vm_operations_struct tegra_bo_vm_ops = {
+ .fault = tegra_bo_fault,
.open = drm_gem_vm_open,
.close = drm_gem_vm_close,
};
@@ -316,13 +501,18 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
if (ret)
return ret;
+ vma->vm_flags |= VM_MIXEDMAP;
+ vma->vm_flags &= ~VM_PFNMAP;
+
gem = vma->vm_private_data;
bo = to_tegra_bo(gem);
- ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
- vma->vm_end - vma->vm_start, vma->vm_page_prot);
- if (ret)
- drm_gem_vm_close(vma);
+ if (!bo->pages) {
+ ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
+ vma->vm_end - vma->vm_start, vma->vm_page_prot);
+ if (ret)
+ drm_gem_vm_close(vma);
+ }
return ret;
}
diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
index 43a25c853357..c2e3f43e4b3f 100644
--- a/drivers/gpu/drm/tegra/gem.h
+++ b/drivers/gpu/drm/tegra/gem.h
@@ -37,6 +37,10 @@ struct tegra_bo {
dma_addr_t paddr;
void *vaddr;
+ struct drm_mm_node *mm;
+ unsigned long num_pages;
+ struct page **pages;
+
struct tegra_bo_tiling tiling;
};
--
2.0.0
From: Thierry Reding <[email protected]>
Add an iommus property to each of the display controllers and encode the
SWGROUP in the specifier.
Signed-off-by: Thierry Reding <[email protected]>
---
arch/arm/boot/dts/tegra124.dtsi | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index efa0f0c519be..82751d2878c4 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -3,6 +3,7 @@
#include <dt-bindings/pinctrl/pinctrl-tegra.h>
#include <dt-bindings/pinctrl/pinctrl-tegra-xusb.h>
#include <dt-bindings/interrupt-controller/arm-gic.h>
+#include <dt-bindings/memory/tegra124-mc.h>
#include "skeleton.dtsi"
@@ -104,6 +105,8 @@
reset-names = "dc";
nvidia,head = <0>;
+
+ iommus = <&mc TEGRA_SWGROUP_DC>;
};
dc@0,54240000 {
@@ -117,6 +120,8 @@
reset-names = "dc";
nvidia,head = <1>;
+
+ iommus = <&mc TEGRA_SWGROUP_DCB>;
};
hdmi@0,54280000 {
--
2.0.0
From: Thierry Reding <[email protected]>
This enables IOMMU interoperation with the DMA mapping API so that
clients that use the DMA mapping API can seemlessly make use of an
existing IOMMU.
Signed-off-by: Thierry Reding <[email protected]>
---
arch/arm/mach-tegra/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/arm/mach-tegra/Kconfig b/arch/arm/mach-tegra/Kconfig
index a52d96366919..20bc43975bde 100644
--- a/arch/arm/mach-tegra/Kconfig
+++ b/arch/arm/mach-tegra/Kconfig
@@ -2,6 +2,7 @@ menuconfig ARCH_TEGRA
bool "NVIDIA Tegra" if ARCH_MULTI_V7
select ARCH_REQUIRE_GPIOLIB
select ARCH_SUPPORTS_TRUSTED_FOUNDATIONS
+ select ARM_DMA_USE_IOMMU
select ARM_GIC
select CLKSRC_MMIO
select HAVE_ARM_SCU if SMP
--
2.0.0
From: Thierry Reding <[email protected]>
The memory controller on NVIDIA Tegra124 exposes various knobs that can
be used to tune the behaviour of the clients attached to it.
In addition, the memory controller implements an SMMU (IOMMU) which can
translate I/O virtual addresses to physical addresses for clients. This
is useful for scatter-gather operation on devices that don't support it
natively and for virtualization or process separation.
Signed-off-by: Thierry Reding <[email protected]>
---
.../bindings/memory-controllers/nvidia,tegra124-mc.txt | 12 ++++++++++++
1 file changed, 12 insertions(+)
create mode 100644 Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt
diff --git a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt
new file mode 100644
index 000000000000..4c922e839059
--- /dev/null
+++ b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra124-mc.txt
@@ -0,0 +1,12 @@
+NVIDIA Tegra124 Memory Controller device tree bindings
+======================================================
+
+Required properties:
+- compatible: Should be "nvidia,tegra124-mc"
+- reg: Physical base address and length of the controller's registers.
+- interrupts: The interrupt outputs from the controller.
+- #iommu-cells: Should be 1. The single cell of the IOMMU specifier defines
+ the SWGROUP of the master.
+
+This device implements an IOMMU that complies with the generic IOMMU binding.
+See ../iommu/iommu.txt for details.
--
2.0.0
From: Thierry Reding <[email protected]>
The SDMMC controllers can use the IOMMU to avoid the need for bounce
buffers.
Signed-off-by: Thierry Reding <[email protected]>
---
arch/arm/boot/dts/tegra124.dtsi | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index 82751d2878c4..bfffb4c102fb 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -607,6 +607,7 @@
resets = <&tegra_car 14>;
reset-names = "sdhci";
status = "disabled";
+ iommus = <&mc TEGRA_SWGROUP_SDMMC1A>;
};
sdhci@0,700b0200 {
@@ -618,6 +619,7 @@
resets = <&tegra_car 9>;
reset-names = "sdhci";
status = "disabled";
+ iommus = <&mc TEGRA_SWGROUP_SDMMC2A>;
};
sdhci@0,700b0400 {
@@ -629,6 +631,7 @@
resets = <&tegra_car 69>;
reset-names = "sdhci";
status = "disabled";
+ iommus = <&mc TEGRA_SWGROUP_SDMMC3A>;
};
sdhci@0,700b0600 {
@@ -640,6 +643,7 @@
resets = <&tegra_car 15>;
reset-names = "sdhci";
status = "disabled";
+ iommus = <&mc TEGRA_SWGROUP_SDMMC4A>;
};
ahub@0,70300000 {
--
2.0.0
From: Thierry Reding <[email protected]>
Add the memory controller and wire up the interrupt that is used to
report errors. Also add an #iommu-cells property to make the device
as an IOMMU.
Signed-off-by: Thierry Reding <[email protected]>
---
arch/arm/boot/dts/tegra124.dtsi | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
index 0bf050696186..efa0f0c519be 100644
--- a/arch/arm/boot/dts/tegra124.dtsi
+++ b/arch/arm/boot/dts/tegra124.dtsi
@@ -560,6 +560,15 @@
reset-names = "fuse";
};
+ mc: memory-controller@0,70019000 {
+ compatible = "nvidia,tegra124-mc";
+ reg = <0x0 0x70019000 0x0 0x1000>;
+
+ interrupts = <GIC_SPI 77 IRQ_TYPE_LEVEL_HIGH>;
+
+ #iommu-cells = <1>;
+ };
+
hda@0,70030000 {
compatible = "nvidia,tegra124-hda", "nvidia,tegra30-hda";
reg = <0x0 0x70030000 0x0 0x10000>;
--
2.0.0
From: Thierry Reding <[email protected]>
The memory controller on NVIDIA Tegra124 exposes various knobs that can
be used to tune the behaviour of the clients attached to it.
Currently this driver sets up the latency allowance registers to the HW
defaults. Eventually an API should be exported by this driver (via a
custom API or a generic subsystem) to allow clients to register latency
requirements.
This driver also registers an IOMMU (SMMU) that's implemented by the
memory controller.
Signed-off-by: Thierry Reding <[email protected]>
---
drivers/memory/Kconfig | 9 +
drivers/memory/Makefile | 1 +
drivers/memory/tegra124-mc.c | 1945 ++++++++++++++++++++++++++++++
include/dt-bindings/memory/tegra124-mc.h | 30 +
4 files changed, 1985 insertions(+)
create mode 100644 drivers/memory/tegra124-mc.c
create mode 100644 include/dt-bindings/memory/tegra124-mc.h
diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
index c59e9c96e86d..d0f0e6781570 100644
--- a/drivers/memory/Kconfig
+++ b/drivers/memory/Kconfig
@@ -61,6 +61,15 @@ config TEGRA30_MC
analysis, especially for IOMMU/SMMU(System Memory Management
Unit) module.
+config TEGRA124_MC
+ bool "Tegra124 Memory Controller driver"
+ depends on ARCH_TEGRA
+ select IOMMU_API
+ help
+ This driver is for the Memory Controller module available on
+ Tegra124 SoCs. It provides an IOMMU that can be used for I/O
+ virtual address translation.
+
config FSL_IFC
bool
depends on FSL_SOC
diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
index 71160a2b7313..03143927abab 100644
--- a/drivers/memory/Makefile
+++ b/drivers/memory/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_FSL_IFC) += fsl_ifc.o
obj-$(CONFIG_MVEBU_DEVBUS) += mvebu-devbus.o
obj-$(CONFIG_TEGRA20_MC) += tegra20-mc.o
obj-$(CONFIG_TEGRA30_MC) += tegra30-mc.o
+obj-$(CONFIG_TEGRA124_MC) += tegra124-mc.o
diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c
new file mode 100644
index 000000000000..741755b6785d
--- /dev/null
+++ b/drivers/memory/tegra124-mc.c
@@ -0,0 +1,1945 @@
+/*
+ * Copyright (C) 2014 NVIDIA CORPORATION. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+
+#include <dt-bindings/memory/tegra124-mc.h>
+
+#include <asm/cacheflush.h>
+#ifndef CONFIG_ARM64
+#include <asm/dma-iommu.h>
+#endif
+
+#define MC_INTSTATUS 0x000
+#define MC_INT_DECERR_MTS (1 << 16)
+#define MC_INT_SECERR_SEC (1 << 13)
+#define MC_INT_DECERR_VPR (1 << 12)
+#define MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
+#define MC_INT_INVALID_SMMU_PAGE (1 << 10)
+#define MC_INT_ARBITRATION_EMEM (1 << 9)
+#define MC_INT_SECURITY_VIOLATION (1 << 8)
+#define MC_INT_DECERR_EMEM (1 << 6)
+#define MC_INTMASK 0x004
+#define MC_ERR_STATUS 0x08
+#define MC_ERR_ADR 0x0c
+
+struct latency_allowance {
+ unsigned int reg;
+ unsigned int shift;
+ unsigned int mask;
+ unsigned int def;
+};
+
+struct smmu_enable {
+ unsigned int reg;
+ unsigned int bit;
+};
+
+struct tegra_mc_client {
+ unsigned int id;
+ const char *name;
+ unsigned int swgroup;
+
+ struct smmu_enable smmu;
+ struct latency_allowance latency;
+};
+
+static const struct tegra_mc_client tegra124_mc_clients[] = {
+ {
+ .id = 0x01,
+ .name = "display0a",
+ .swgroup = TEGRA_SWGROUP_DC,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 1,
+ },
+ .latency = {
+ .reg = 0x2e8,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0xc2,
+ },
+ }, {
+ .id = 0x02,
+ .name = "display0ab",
+ .swgroup = TEGRA_SWGROUP_DCB,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 2,
+ },
+ .latency = {
+ .reg = 0x2f4,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0xc6,
+ },
+ }, {
+ .id = 0x03,
+ .name = "display0b",
+ .swgroup = TEGRA_SWGROUP_DC,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 3,
+ },
+ .latency = {
+ .reg = 0x2e8,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x50,
+ },
+ }, {
+ .id = 0x04,
+ .name = "display0bb",
+ .swgroup = TEGRA_SWGROUP_DCB,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 4,
+ },
+ .latency = {
+ .reg = 0x2f4,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x50,
+ },
+ }, {
+ .id = 0x05,
+ .name = "display0c",
+ .swgroup = TEGRA_SWGROUP_DC,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 5,
+ },
+ .latency = {
+ .reg = 0x2ec,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x50,
+ },
+ }, {
+ .id = 0x06,
+ .name = "display0cb",
+ .swgroup = TEGRA_SWGROUP_DCB,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 6,
+ },
+ .latency = {
+ .reg = 0x2f8,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x50,
+ },
+ }, {
+ .id = 0x0e,
+ .name = "afir",
+ .swgroup = TEGRA_SWGROUP_AFI,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 14,
+ },
+ .latency = {
+ .reg = 0x2e0,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x13,
+ },
+ }, {
+ .id = 0x0f,
+ .name = "avpcarm7r",
+ .swgroup = TEGRA_SWGROUP_AVPC,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 15,
+ },
+ .latency = {
+ .reg = 0x2e4,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x04,
+ },
+ }, {
+ .id = 0x10,
+ .name = "displayhc",
+ .swgroup = TEGRA_SWGROUP_DC,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 16,
+ },
+ .latency = {
+ .reg = 0x2f0,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x50,
+ },
+ }, {
+ .id = 0x11,
+ .name = "displayhcb",
+ .swgroup = TEGRA_SWGROUP_DCB,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 17,
+ },
+ .latency = {
+ .reg = 0x2fc,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x50,
+ },
+ }, {
+ .id = 0x15,
+ .name = "hdar",
+ .swgroup = TEGRA_SWGROUP_HDA,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 21,
+ },
+ .latency = {
+ .reg = 0x318,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x24,
+ },
+ }, {
+ .id = 0x16,
+ .name = "host1xdmar",
+ .swgroup = TEGRA_SWGROUP_HC,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 22,
+ },
+ .latency = {
+ .reg = 0x310,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x1e,
+ },
+ }, {
+ .id = 0x17,
+ .name = "host1xr",
+ .swgroup = TEGRA_SWGROUP_HC,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 23,
+ },
+ .latency = {
+ .reg = 0x310,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x50,
+ },
+ }, {
+ .id = 0x1c,
+ .name = "msencsrd",
+ .swgroup = TEGRA_SWGROUP_MSENC,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 28,
+ },
+ .latency = {
+ .reg = 0x328,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x23,
+ },
+ }, {
+ .id = 0x1d,
+ .name = "ppcsahbdmarhdar",
+ .swgroup = TEGRA_SWGROUP_PPCS,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 29,
+ },
+ .latency = {
+ .reg = 0x344,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x49,
+ },
+ }, {
+ .id = 0x1e,
+ .name = "ppcsahbslvr",
+ .swgroup = TEGRA_SWGROUP_PPCS,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 30,
+ },
+ .latency = {
+ .reg = 0x344,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x1a,
+ },
+ }, {
+ .id = 0x1f,
+ .name = "satar",
+ .swgroup = TEGRA_SWGROUP_SATA,
+ .smmu = {
+ .reg = 0x228,
+ .bit = 31,
+ },
+ .latency = {
+ .reg = 0x350,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x65,
+ },
+ }, {
+ .id = 0x22,
+ .name = "vdebsevr",
+ .swgroup = TEGRA_SWGROUP_VDE,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 2,
+ },
+ .latency = {
+ .reg = 0x354,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x4f,
+ },
+ }, {
+ .id = 0x23,
+ .name = "vdember",
+ .swgroup = TEGRA_SWGROUP_VDE,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 3,
+ },
+ .latency = {
+ .reg = 0x354,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x3d,
+ },
+ }, {
+ .id = 0x24,
+ .name = "vdemcer",
+ .swgroup = TEGRA_SWGROUP_VDE,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 4,
+ },
+ .latency = {
+ .reg = 0x358,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x66,
+ },
+ }, {
+ .id = 0x25,
+ .name = "vdetper",
+ .swgroup = TEGRA_SWGROUP_VDE,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 5,
+ },
+ .latency = {
+ .reg = 0x358,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0xa5,
+ },
+ }, {
+ .id = 0x26,
+ .name = "mpcorelpr",
+ .swgroup = TEGRA_SWGROUP_MPCORELP,
+ .latency = {
+ .reg = 0x324,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x04,
+ },
+ }, {
+ .id = 0x27,
+ .name = "mpcorer",
+ .swgroup = TEGRA_SWGROUP_MPCORE,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 2,
+ },
+ .latency = {
+ .reg = 0x320,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x04,
+ },
+ }, {
+ .id = 0x2b,
+ .name = "msencswr",
+ .swgroup = TEGRA_SWGROUP_MSENC,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 11,
+ },
+ .latency = {
+ .reg = 0x328,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x31,
+ .name = "afiw",
+ .swgroup = TEGRA_SWGROUP_AFI,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 17,
+ },
+ .latency = {
+ .reg = 0x2e0,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x32,
+ .name = "avpcarm7w",
+ .swgroup = TEGRA_SWGROUP_AVPC,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 18,
+ },
+ .latency = {
+ .reg = 0x2e4,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x35,
+ .name = "hdaw",
+ .swgroup = TEGRA_SWGROUP_HDA,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 21,
+ },
+ .latency = {
+ .reg = 0x318,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x36,
+ .name = "host1xw",
+ .swgroup = TEGRA_SWGROUP_HC,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 22,
+ },
+ .latency = {
+ .reg = 0x314,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x38,
+ .name = "mpcorelpw",
+ .swgroup = TEGRA_SWGROUP_MPCORELP,
+ .latency = {
+ .reg = 0x324,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x39,
+ .name = "mpcorew",
+ .swgroup = TEGRA_SWGROUP_MPCORE,
+ .latency = {
+ .reg = 0x320,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x3b,
+ .name = "ppcsahbdmaw",
+ .swgroup = TEGRA_SWGROUP_PPCS,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 27,
+ },
+ .latency = {
+ .reg = 0x348,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x3c,
+ .name = "ppcsahbslvw",
+ .swgroup = TEGRA_SWGROUP_PPCS,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 28,
+ },
+ .latency = {
+ .reg = 0x348,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x3d,
+ .name = "sataw",
+ .swgroup = TEGRA_SWGROUP_SATA,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 29,
+ },
+ .latency = {
+ .reg = 0x350,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x65,
+ },
+ }, {
+ .id = 0x3e,
+ .name = "vdebsevw",
+ .swgroup = TEGRA_SWGROUP_VDE,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 30,
+ },
+ .latency = {
+ .reg = 0x35c,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x3f,
+ .name = "vdedbgw",
+ .swgroup = TEGRA_SWGROUP_VDE,
+ .smmu = {
+ .reg = 0x22c,
+ .bit = 31,
+ },
+ .latency = {
+ .reg = 0x35c,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x40,
+ .name = "vdembew",
+ .swgroup = TEGRA_SWGROUP_VDE,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 0,
+ },
+ .latency = {
+ .reg = 0x360,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x41,
+ .name = "vdetpmw",
+ .swgroup = TEGRA_SWGROUP_VDE,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 1,
+ },
+ .latency = {
+ .reg = 0x360,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x44,
+ .name = "ispra",
+ .swgroup = TEGRA_SWGROUP_ISP2,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 4,
+ },
+ .latency = {
+ .reg = 0x370,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x18,
+ },
+ }, {
+ .id = 0x46,
+ .name = "ispwa",
+ .swgroup = TEGRA_SWGROUP_ISP2,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 6,
+ },
+ .latency = {
+ .reg = 0x374,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x47,
+ .name = "ispwb",
+ .swgroup = TEGRA_SWGROUP_ISP2,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 7,
+ },
+ .latency = {
+ .reg = 0x374,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x4a,
+ .name = "xusb_hostr",
+ .swgroup = TEGRA_SWGROUP_XUSB_HOST,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 10,
+ },
+ .latency = {
+ .reg = 0x37c,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x39,
+ },
+ }, {
+ .id = 0x4b,
+ .name = "xusb_hostw",
+ .swgroup = TEGRA_SWGROUP_XUSB_HOST,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 11,
+ },
+ .latency = {
+ .reg = 0x37c,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x4c,
+ .name = "xusb_devr",
+ .swgroup = TEGRA_SWGROUP_XUSB_DEV,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 12,
+ },
+ .latency = {
+ .reg = 0x380,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x39,
+ },
+ }, {
+ .id = 0x4d,
+ .name = "xusb_devw",
+ .swgroup = TEGRA_SWGROUP_XUSB_DEV,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 13,
+ },
+ .latency = {
+ .reg = 0x380,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x4e,
+ .name = "isprab",
+ .swgroup = TEGRA_SWGROUP_ISP2B,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 14,
+ },
+ .latency = {
+ .reg = 0x384,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x18,
+ },
+ }, {
+ .id = 0x50,
+ .name = "ispwab",
+ .swgroup = TEGRA_SWGROUP_ISP2B,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 16,
+ },
+ .latency = {
+ .reg = 0x388,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x51,
+ .name = "ispwbb",
+ .swgroup = TEGRA_SWGROUP_ISP2B,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 17,
+ },
+ .latency = {
+ .reg = 0x388,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x54,
+ .name = "tsecsrd",
+ .swgroup = TEGRA_SWGROUP_TSEC,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 20,
+ },
+ .latency = {
+ .reg = 0x390,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x9b,
+ },
+ }, {
+ .id = 0x55,
+ .name = "tsecswr",
+ .swgroup = TEGRA_SWGROUP_TSEC,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 21,
+ },
+ .latency = {
+ .reg = 0x390,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x56,
+ .name = "a9avpscr",
+ .swgroup = TEGRA_SWGROUP_A9AVP,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 22,
+ },
+ .latency = {
+ .reg = 0x3a4,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x04,
+ },
+ }, {
+ .id = 0x57,
+ .name = "a9avpscw",
+ .swgroup = TEGRA_SWGROUP_A9AVP,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 23,
+ },
+ .latency = {
+ .reg = 0x3a4,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x58,
+ .name = "gpusrd",
+ .swgroup = TEGRA_SWGROUP_GPU,
+ .smmu = {
+ /* read-only */
+ .reg = 0x230,
+ .bit = 24,
+ },
+ .latency = {
+ .reg = 0x3c8,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x1a,
+ },
+ }, {
+ .id = 0x59,
+ .name = "gpuswr",
+ .swgroup = TEGRA_SWGROUP_GPU,
+ .smmu = {
+ /* read-only */
+ .reg = 0x230,
+ .bit = 25,
+ },
+ .latency = {
+ .reg = 0x3c8,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x5a,
+ .name = "displayt",
+ .swgroup = TEGRA_SWGROUP_DC,
+ .smmu = {
+ .reg = 0x230,
+ .bit = 26,
+ },
+ .latency = {
+ .reg = 0x2f0,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x50,
+ },
+ }, {
+ .id = 0x60,
+ .name = "sdmmcra",
+ .swgroup = TEGRA_SWGROUP_SDMMC1A,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 0,
+ },
+ .latency = {
+ .reg = 0x3b8,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x49,
+ },
+ }, {
+ .id = 0x61,
+ .name = "sdmmcraa",
+ .swgroup = TEGRA_SWGROUP_SDMMC2A,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 1,
+ },
+ .latency = {
+ .reg = 0x3bc,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x49,
+ },
+ }, {
+ .id = 0x62,
+ .name = "sdmmcr",
+ .swgroup = TEGRA_SWGROUP_SDMMC3A,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 2,
+ },
+ .latency = {
+ .reg = 0x3c0,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x49,
+ },
+ }, {
+ .id = 0x63,
+ .swgroup = TEGRA_SWGROUP_SDMMC4A,
+ .name = "sdmmcrab",
+ .smmu = {
+ .reg = 0x234,
+ .bit = 3,
+ },
+ .latency = {
+ .reg = 0x3c4,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x49,
+ },
+ }, {
+ .id = 0x64,
+ .name = "sdmmcwa",
+ .swgroup = TEGRA_SWGROUP_SDMMC1A,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 4,
+ },
+ .latency = {
+ .reg = 0x3b8,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x65,
+ .name = "sdmmcwaa",
+ .swgroup = TEGRA_SWGROUP_SDMMC2A,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 5,
+ },
+ .latency = {
+ .reg = 0x3bc,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x66,
+ .name = "sdmmcw",
+ .swgroup = TEGRA_SWGROUP_SDMMC3A,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 6,
+ },
+ .latency = {
+ .reg = 0x3c0,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x67,
+ .name = "sdmmcwab",
+ .swgroup = TEGRA_SWGROUP_SDMMC4A,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 7,
+ },
+ .latency = {
+ .reg = 0x3c4,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x6c,
+ .name = "vicsrd",
+ .swgroup = TEGRA_SWGROUP_VIC,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 12,
+ },
+ .latency = {
+ .reg = 0x394,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x1a,
+ },
+ }, {
+ .id = 0x6d,
+ .name = "vicswr",
+ .swgroup = TEGRA_SWGROUP_VIC,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 13,
+ },
+ .latency = {
+ .reg = 0x394,
+ .shift = 16,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x72,
+ .name = "viw",
+ .swgroup = TEGRA_SWGROUP_VI,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 18,
+ },
+ .latency = {
+ .reg = 0x398,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x80,
+ },
+ }, {
+ .id = 0x73,
+ .name = "displayd",
+ .swgroup = TEGRA_SWGROUP_DC,
+ .smmu = {
+ .reg = 0x234,
+ .bit = 19,
+ },
+ .latency = {
+ .reg = 0x3c8,
+ .shift = 0,
+ .mask = 0xff,
+ .def = 0x50,
+ },
+ },
+};
+
+struct tegra_smmu_swgroup {
+ unsigned int swgroup;
+ unsigned int reg;
+};
+
+static const struct tegra_smmu_swgroup tegra124_swgroups[] = {
+ { .swgroup = TEGRA_SWGROUP_DC, .reg = 0x240 },
+ { .swgroup = TEGRA_SWGROUP_DCB, .reg = 0x244 },
+ { .swgroup = TEGRA_SWGROUP_AFI, .reg = 0x238 },
+ { .swgroup = TEGRA_SWGROUP_AVPC, .reg = 0x23c },
+ { .swgroup = TEGRA_SWGROUP_HDA, .reg = 0x254 },
+ { .swgroup = TEGRA_SWGROUP_HC, .reg = 0x250 },
+ { .swgroup = TEGRA_SWGROUP_MSENC, .reg = 0x264 },
+ { .swgroup = TEGRA_SWGROUP_PPCS, .reg = 0x270 },
+ { .swgroup = TEGRA_SWGROUP_SATA, .reg = 0x274 },
+ { .swgroup = TEGRA_SWGROUP_VDE, .reg = 0x27c },
+ { .swgroup = TEGRA_SWGROUP_ISP2, .reg = 0x258 },
+ { .swgroup = TEGRA_SWGROUP_XUSB_HOST, .reg = 0x288 },
+ { .swgroup = TEGRA_SWGROUP_XUSB_DEV, .reg = 0x28c },
+ { .swgroup = TEGRA_SWGROUP_ISP2B, .reg = 0xaa4 },
+ { .swgroup = TEGRA_SWGROUP_TSEC, .reg = 0x294 },
+ { .swgroup = TEGRA_SWGROUP_A9AVP, .reg = 0x290 },
+ { .swgroup = TEGRA_SWGROUP_GPU, .reg = 0xaa8 },
+ { .swgroup = TEGRA_SWGROUP_SDMMC1A, .reg = 0xa94 },
+ { .swgroup = TEGRA_SWGROUP_SDMMC2A, .reg = 0xa98 },
+ { .swgroup = TEGRA_SWGROUP_SDMMC3A, .reg = 0xa9c },
+ { .swgroup = TEGRA_SWGROUP_SDMMC4A, .reg = 0xaa0 },
+ { .swgroup = TEGRA_SWGROUP_VIC, .reg = 0x284 },
+ { .swgroup = TEGRA_SWGROUP_VI, .reg = 0x280 },
+};
+
+struct tegra_smmu_group_init {
+ unsigned int asid;
+ const char *name;
+
+ const struct of_device_id *matches;
+};
+
+struct tegra_smmu_soc {
+ const struct tegra_smmu_group_init *groups;
+ unsigned int num_groups;
+
+ const struct tegra_mc_client *clients;
+ unsigned int num_clients;
+
+ const struct tegra_smmu_swgroup *swgroups;
+ unsigned int num_swgroups;
+
+ unsigned int num_asids;
+ unsigned int atom_size;
+
+ const struct tegra_smmu_ops *ops;
+};
+
+struct tegra_smmu_ops {
+ void (*flush_dcache)(struct page *page, unsigned long offset,
+ size_t size);
+};
+
+struct tegra_smmu_master {
+ struct list_head list;
+ struct device *dev;
+};
+
+struct tegra_smmu_group {
+ const char *name;
+ const struct of_device_id *matches;
+ unsigned int asid;
+
+#ifndef CONFIG_ARM64
+ struct dma_iommu_mapping *mapping;
+#endif
+ struct list_head masters;
+};
+
+static const struct of_device_id tegra124_periph_matches[] = {
+ { .compatible = "nvidia,tegra124-sdhci", },
+ { }
+};
+
+static const struct tegra_smmu_group_init tegra124_smmu_groups[] = {
+ { 0, "peripherals", tegra124_periph_matches },
+};
+
+static void tegra_smmu_group_release(void *data)
+{
+ kfree(data);
+}
+
+struct tegra_smmu {
+ void __iomem *regs;
+ struct iommu iommu;
+ struct device *dev;
+
+ const struct tegra_smmu_soc *soc;
+
+ struct iommu_group **groups;
+ unsigned int num_groups;
+
+ unsigned long *asids;
+ struct mutex lock;
+};
+
+struct tegra_smmu_address_space {
+ struct iommu_domain *domain;
+ struct tegra_smmu *smmu;
+ struct page *pd;
+ unsigned id;
+ u32 attr;
+};
+
+static inline void smmu_writel(struct tegra_smmu *smmu, u32 value,
+ unsigned long offset)
+{
+ writel(value, smmu->regs + offset);
+}
+
+static inline u32 smmu_readl(struct tegra_smmu *smmu, unsigned long offset)
+{
+ return readl(smmu->regs + offset);
+}
+
+#define SMMU_CONFIG 0x010
+#define SMMU_CONFIG_ENABLE (1 << 0)
+
+#define SMMU_PTB_ASID 0x01c
+#define SMMU_PTB_ASID_VALUE(x) ((x) & 0x7f)
+
+#define SMMU_PTB_DATA 0x020
+#define SMMU_PTB_DATA_VALUE(page, attr) (page_to_phys(page) >> 12 | (attr))
+
+#define SMMU_MK_PDE(page, attr) (page_to_phys(page) >> SMMU_PTE_SHIFT | (attr))
+
+#define SMMU_TLB_FLUSH 0x030
+#define SMMU_TLB_FLUSH_VA_MATCH_ALL (0 << 0)
+#define SMMU_TLB_FLUSH_VA_MATCH_SECTION (2 << 0)
+#define SMMU_TLB_FLUSH_VA_MATCH_GROUP (3 << 0)
+#define SMMU_TLB_FLUSH_ASID(x) (((x) & 0x7f) << 24)
+#define SMMU_TLB_FLUSH_VA_SECTION(addr) ((((addr) & 0xffc00000) >> 12) | \
+ SMMU_TLB_FLUSH_VA_MATCH_SECTION)
+#define SMMU_TLB_FLUSH_VA_GROUP(addr) ((((addr) & 0xffffc000) >> 12) | \
+ SMMU_TLB_FLUSH_VA_MATCH_GROUP)
+#define SMMU_TLB_FLUSH_ASID_MATCH (1 << 31)
+
+#define SMMU_PTC_FLUSH 0x034
+#define SMMU_PTC_FLUSH_TYPE_ALL (0 << 0)
+#define SMMU_PTC_FLUSH_TYPE_ADR (1 << 0)
+
+#define SMMU_PTC_FLUSH_HI 0x9b8
+#define SMMU_PTC_FLUSH_HI_MASK 0x3
+
+/* per-SWGROUP SMMU_*_ASID register */
+#define SMMU_ASID_ENABLE (1 << 31)
+#define SMMU_ASID_MASK 0x7f
+#define SMMU_ASID_VALUE(x) ((x) & SMMU_ASID_MASK)
+
+/* page table definitions */
+#define SMMU_NUM_PDE 1024
+#define SMMU_NUM_PTE 1024
+
+#define SMMU_SIZE_PD (SMMU_NUM_PDE * 4)
+#define SMMU_SIZE_PT (SMMU_NUM_PTE * 4)
+
+#define SMMU_PDE_SHIFT 22
+#define SMMU_PTE_SHIFT 12
+
+#define SMMU_PFN_MASK 0x000fffff
+
+#define SMMU_PD_READABLE (1 << 31)
+#define SMMU_PD_WRITABLE (1 << 30)
+#define SMMU_PD_NONSECURE (1 << 29)
+
+#define SMMU_PDE_READABLE (1 << 31)
+#define SMMU_PDE_WRITABLE (1 << 30)
+#define SMMU_PDE_NONSECURE (1 << 29)
+#define SMMU_PDE_NEXT (1 << 28)
+
+#define SMMU_PTE_READABLE (1 << 31)
+#define SMMU_PTE_WRITABLE (1 << 30)
+#define SMMU_PTE_NONSECURE (1 << 29)
+
+#define SMMU_PDE_ATTR (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
+ SMMU_PDE_NONSECURE)
+#define SMMU_PTE_ATTR (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
+ SMMU_PTE_NONSECURE)
+
+#define SMMU_PDE_VACANT(n) (((n) << 10) | SMMU_PDE_ATTR)
+#define SMMU_PTE_VACANT(n) (((n) << 12) | SMMU_PTE_ATTR)
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static void tegra124_flush_dcache(struct page *page, unsigned long offset,
+ size_t size)
+{
+ phys_addr_t phys = page_to_phys(page) + offset;
+ void *virt = page_address(page) + offset;
+
+ __cpuc_flush_dcache_area(virt, size);
+ outer_flush_range(phys, phys + size);
+}
+
+static const struct tegra_smmu_ops tegra124_smmu_ops = {
+ .flush_dcache = tegra124_flush_dcache,
+};
+#endif
+
+static void tegra132_flush_dcache(struct page *page, unsigned long offset,
+ size_t size)
+{
+ /* TODO: implement */
+}
+
+static const struct tegra_smmu_ops tegra132_smmu_ops = {
+ .flush_dcache = tegra132_flush_dcache,
+};
+
+static inline void smmu_flush_ptc(struct tegra_smmu *smmu, struct page *page,
+ unsigned long offset)
+{
+ phys_addr_t phys = page ? page_to_phys(page) : 0;
+ u32 value;
+
+ if (page) {
+ offset &= ~(smmu->soc->atom_size - 1);
+
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+ value = (phys >> 32) & SMMU_PTC_FLUSH_HI_MASK;
+#else
+ value = 0;
+#endif
+ smmu_writel(smmu, value, SMMU_PTC_FLUSH_HI);
+
+ value = (phys + offset) | SMMU_PTC_FLUSH_TYPE_ADR;
+ } else {
+ value = SMMU_PTC_FLUSH_TYPE_ALL;
+ }
+
+ smmu_writel(smmu, value, SMMU_PTC_FLUSH);
+}
+
+static inline void smmu_flush_tlb(struct tegra_smmu *smmu)
+{
+ smmu_writel(smmu, SMMU_TLB_FLUSH_VA_MATCH_ALL, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_asid(struct tegra_smmu *smmu,
+ unsigned long asid)
+{
+ u32 value;
+
+ value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+ SMMU_TLB_FLUSH_VA_MATCH_ALL;
+ smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_section(struct tegra_smmu *smmu,
+ unsigned long asid,
+ unsigned long iova)
+{
+ u32 value;
+
+ value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+ SMMU_TLB_FLUSH_VA_SECTION(iova);
+ smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush_tlb_group(struct tegra_smmu *smmu,
+ unsigned long asid,
+ unsigned long iova)
+{
+ u32 value;
+
+ value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
+ SMMU_TLB_FLUSH_VA_GROUP(iova);
+ smmu_writel(smmu, value, SMMU_TLB_FLUSH);
+}
+
+static inline void smmu_flush(struct tegra_smmu *smmu)
+{
+ smmu_readl(smmu, SMMU_CONFIG);
+}
+
+static inline struct tegra_smmu *to_tegra_smmu(struct iommu *iommu)
+{
+ return container_of(iommu, struct tegra_smmu, iommu);
+}
+
+static struct tegra_smmu *smmu_handle = NULL;
+
+static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
+{
+ unsigned long id;
+
+ mutex_lock(&smmu->lock);
+
+ id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
+ if (id >= smmu->soc->num_asids) {
+ mutex_unlock(&smmu->lock);
+ return -ENOSPC;
+ }
+
+ set_bit(id, smmu->asids);
+ *idp = id;
+
+ mutex_unlock(&smmu->lock);
+ return 0;
+}
+
+static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
+{
+ mutex_lock(&smmu->lock);
+ clear_bit(id, smmu->asids);
+ mutex_unlock(&smmu->lock);
+}
+
+struct tegra_smmu_address_space *foo = NULL;
+
+static int tegra_smmu_domain_init(struct iommu_domain *domain)
+{
+ struct tegra_smmu *smmu = smmu_handle;
+ struct tegra_smmu_address_space *as;
+ uint32_t *pd, value;
+ unsigned int i;
+ int err = 0;
+
+ as = kzalloc(sizeof(*as), GFP_KERNEL);
+ if (!as) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
+ as->smmu = smmu_handle;
+ as->domain = domain;
+
+ err = tegra_smmu_alloc_asid(smmu, &as->id);
+ if (err < 0) {
+ kfree(as);
+ goto out;
+ }
+
+ as->pd = alloc_page(GFP_KERNEL | __GFP_DMA);
+ if (!as->pd) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ pd = page_address(as->pd);
+ SetPageReserved(as->pd);
+
+ for (i = 0; i < SMMU_NUM_PDE; i++)
+ pd[i] = SMMU_PDE_VACANT(i);
+
+ smmu->soc->ops->flush_dcache(as->pd, 0, SMMU_SIZE_PD);
+ smmu_flush_ptc(smmu, as->pd, 0);
+ smmu_flush_tlb_asid(smmu, as->id);
+
+ smmu_writel(smmu, as->id & 0x7f, SMMU_PTB_ASID);
+ value = SMMU_PTB_DATA_VALUE(as->pd, as->attr);
+ smmu_writel(smmu, value, SMMU_PTB_DATA);
+ smmu_flush(smmu);
+
+ domain->priv = as;
+
+ return 0;
+
+out:
+ return err;
+}
+
+static void tegra_smmu_domain_destroy(struct iommu_domain *domain)
+{
+ struct tegra_smmu_address_space *as = domain->priv;
+
+ /* TODO: free page directory and page tables */
+
+ tegra_smmu_free_asid(as->smmu, as->id);
+ kfree(as);
+}
+
+static const struct tegra_smmu_swgroup *
+tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup)
+{
+ const struct tegra_smmu_swgroup *group = NULL;
+ unsigned int i;
+
+ for (i = 0; i < smmu->soc->num_swgroups; i++) {
+ if (smmu->soc->swgroups[i].swgroup == swgroup) {
+ group = &smmu->soc->swgroups[i];
+ break;
+ }
+ }
+
+ return group;
+}
+
+static int tegra_smmu_enable(struct tegra_smmu *smmu, unsigned int swgroup,
+ unsigned int asid)
+{
+ const struct tegra_smmu_swgroup *group;
+ unsigned int i;
+ u32 value;
+
+ for (i = 0; i < smmu->soc->num_clients; i++) {
+ const struct tegra_mc_client *client = &smmu->soc->clients[i];
+
+ if (client->swgroup != swgroup)
+ continue;
+
+ value = smmu_readl(smmu, client->smmu.reg);
+ value |= BIT(client->smmu.bit);
+ smmu_writel(smmu, value, client->smmu.reg);
+ }
+
+ group = tegra_smmu_find_swgroup(smmu, swgroup);
+ if (group) {
+ value = smmu_readl(smmu, group->reg);
+ value &= ~SMMU_ASID_MASK;
+ value |= SMMU_ASID_VALUE(asid);
+ value |= SMMU_ASID_ENABLE;
+ smmu_writel(smmu, value, group->reg);
+ }
+
+ return 0;
+}
+
+static int tegra_smmu_disable(struct tegra_smmu *smmu, unsigned int swgroup,
+ unsigned int asid)
+{
+ const struct tegra_smmu_swgroup *group;
+ unsigned int i;
+ u32 value;
+
+ group = tegra_smmu_find_swgroup(smmu, swgroup);
+ if (group) {
+ value = smmu_readl(smmu, group->reg);
+ value &= ~SMMU_ASID_MASK;
+ value |= SMMU_ASID_VALUE(asid);
+ value &= ~SMMU_ASID_ENABLE;
+ smmu_writel(smmu, value, group->reg);
+ }
+
+ for (i = 0; i < smmu->soc->num_clients; i++) {
+ const struct tegra_mc_client *client = &smmu->soc->clients[i];
+
+ if (client->swgroup != swgroup)
+ continue;
+
+ value = smmu_readl(smmu, client->smmu.reg);
+ value &= ~BIT(client->smmu.bit);
+ smmu_writel(smmu, value, client->smmu.reg);
+ }
+
+ return 0;
+}
+
+static int tegra_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
+{
+ struct tegra_smmu_address_space *as = domain->priv;
+ struct tegra_smmu *smmu = as->smmu;
+ struct of_phandle_iter entry;
+ int err;
+
+ of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
+ "#iommu-cells", 0) {
+ unsigned int swgroup = entry.out_args.args[0];
+
+ if (entry.out_args.np != smmu->dev->of_node)
+ continue;
+
+ err = tegra_smmu_enable(smmu, swgroup, as->id);
+ if (err < 0)
+ pr_err("failed to enable SWGROUP#%u\n", swgroup);
+ }
+
+ return 0;
+}
+
+static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
+{
+ struct tegra_smmu_address_space *as = domain->priv;
+ struct tegra_smmu *smmu = as->smmu;
+ struct of_phandle_iter entry;
+ int err;
+
+ of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
+ "#iommu-cells", 0) {
+ unsigned int swgroup;
+
+ if (entry.out_args.np != smmu->dev->of_node)
+ continue;
+
+ swgroup = entry.out_args.args[0];
+
+ err = tegra_smmu_disable(smmu, swgroup, as->id);
+ if (err < 0) {
+ pr_err("failed to enable SWGROUP#%u\n", swgroup);
+ }
+ }
+}
+
+static u32 *as_get_pte(struct tegra_smmu_address_space *as, dma_addr_t iova,
+ struct page **pagep)
+{
+ struct tegra_smmu *smmu = smmu_handle;
+ u32 *pd = page_address(as->pd), *pt;
+ u32 pde = (iova >> SMMU_PDE_SHIFT) & 0x3ff;
+ u32 pte = (iova >> SMMU_PTE_SHIFT) & 0x3ff;
+ struct page *page;
+ unsigned int i;
+
+ if (pd[pde] != SMMU_PDE_VACANT(pde)) {
+ page = pfn_to_page(pd[pde] & SMMU_PFN_MASK);
+ pt = page_address(page);
+ } else {
+ page = alloc_page(GFP_KERNEL | __GFP_DMA);
+ if (!page)
+ return NULL;
+
+ pt = page_address(page);
+ SetPageReserved(page);
+
+ for (i = 0; i < SMMU_NUM_PTE; i++)
+ pt[i] = SMMU_PTE_VACANT(i);
+
+ smmu->soc->ops->flush_dcache(page, 0, SMMU_SIZE_PT);
+
+ pd[pde] = SMMU_MK_PDE(page, SMMU_PDE_ATTR | SMMU_PDE_NEXT);
+
+ smmu->soc->ops->flush_dcache(as->pd, pde << 2, 4);
+ smmu_flush_ptc(smmu, as->pd, pde << 2);
+ smmu_flush_tlb_section(smmu, as->id, iova);
+ smmu_flush(smmu);
+ }
+
+ *pagep = page;
+
+ return &pt[pte];
+}
+
+static int tegra_smmu_map(struct iommu_domain *domain, unsigned long iova,
+ phys_addr_t paddr, size_t size, int prot)
+{
+ struct tegra_smmu_address_space *as = domain->priv;
+ struct tegra_smmu *smmu = smmu_handle;
+ unsigned long offset;
+ struct page *page;
+ u32 *pte;
+
+ pte = as_get_pte(as, iova, &page);
+ if (!pte)
+ return -ENOMEM;
+
+ offset = offset_in_page(pte);
+
+ *pte = __phys_to_pfn(paddr) | SMMU_PTE_ATTR;
+
+ smmu->soc->ops->flush_dcache(page, offset, 4);
+ smmu_flush_ptc(smmu, page, offset);
+ smmu_flush_tlb_group(smmu, as->id, iova);
+ smmu_flush(smmu);
+
+ return 0;
+}
+
+static size_t tegra_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
+ size_t size)
+{
+ struct tegra_smmu_address_space *as = domain->priv;
+ struct tegra_smmu *smmu = smmu_handle;
+ unsigned long offset;
+ struct page *page;
+ u32 *pte;
+
+ pte = as_get_pte(as, iova, &page);
+ if (!pte)
+ return 0;
+
+ offset = offset_in_page(pte);
+ *pte = 0;
+
+ smmu->soc->ops->flush_dcache(page, offset, 4);
+ smmu_flush_ptc(smmu, page, offset);
+ smmu_flush_tlb_group(smmu, as->id, iova);
+ smmu_flush(smmu);
+
+ return size;
+}
+
+static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
+ dma_addr_t iova)
+{
+ struct tegra_smmu_address_space *as = domain->priv;
+ struct page *page;
+ unsigned long pfn;
+ u32 *pte;
+
+ pte = as_get_pte(as, iova, &page);
+ pfn = *pte & SMMU_PFN_MASK;
+
+ return PFN_PHYS(pfn);
+}
+
+static int tegra_smmu_attach(struct iommu *iommu, struct device *dev)
+{
+ struct tegra_smmu *smmu = to_tegra_smmu(iommu);
+ struct tegra_smmu_group *group;
+ unsigned int i;
+
+ for (i = 0; i < smmu->soc->num_groups; i++) {
+ group = iommu_group_get_iommudata(smmu->groups[i]);
+
+ if (of_match_node(group->matches, dev->of_node)) {
+ pr_debug("adding device %s to group %s\n",
+ dev_name(dev), group->name);
+ iommu_group_add_device(smmu->groups[i], dev);
+ break;
+ }
+ }
+
+ if (i == smmu->soc->num_groups)
+ return 0;
+
+#ifndef CONFIG_ARM64
+ return arm_iommu_attach_device(dev, group->mapping);
+#else
+ return 0;
+#endif
+}
+
+static int tegra_smmu_detach(struct iommu *iommu, struct device *dev)
+{
+ return 0;
+}
+
+static const struct iommu_ops tegra_smmu_ops = {
+ .domain_init = tegra_smmu_domain_init,
+ .domain_destroy = tegra_smmu_domain_destroy,
+ .attach_dev = tegra_smmu_attach_dev,
+ .detach_dev = tegra_smmu_detach_dev,
+ .map = tegra_smmu_map,
+ .unmap = tegra_smmu_unmap,
+ .iova_to_phys = tegra_smmu_iova_to_phys,
+ .attach = tegra_smmu_attach,
+ .detach = tegra_smmu_detach,
+
+ .pgsize_bitmap = SZ_4K,
+};
+
+static struct tegra_smmu *tegra_smmu_probe(struct device *dev,
+ const struct tegra_smmu_soc *soc,
+ void __iomem *regs)
+{
+ struct tegra_smmu *smmu;
+ unsigned int i;
+ size_t size;
+ u32 value;
+ int err;
+
+ smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
+ if (!smmu)
+ return ERR_PTR(-ENOMEM);
+
+ size = BITS_TO_LONGS(soc->num_asids) * sizeof(long);
+
+ smmu->asids = devm_kzalloc(dev, size, GFP_KERNEL);
+ if (!smmu->asids)
+ return ERR_PTR(-ENOMEM);
+
+ INIT_LIST_HEAD(&smmu->iommu.list);
+ mutex_init(&smmu->lock);
+
+ smmu->iommu.ops = &tegra_smmu_ops;
+ smmu->iommu.dev = dev;
+
+ smmu->regs = regs;
+ smmu->soc = soc;
+ smmu->dev = dev;
+
+ smmu_handle = smmu;
+ bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
+
+ smmu->num_groups = soc->num_groups;
+
+ smmu->groups = devm_kcalloc(dev, smmu->num_groups, sizeof(*smmu->groups),
+ GFP_KERNEL);
+ if (!smmu->groups)
+ return ERR_PTR(-ENOMEM);
+
+ for (i = 0; i < smmu->num_groups; i++) {
+ struct tegra_smmu_group *group;
+
+ smmu->groups[i] = iommu_group_alloc();
+ if (IS_ERR(smmu->groups[i]))
+ return ERR_CAST(smmu->groups[i]);
+
+ err = iommu_group_set_name(smmu->groups[i], soc->groups[i].name);
+ if (err < 0) {
+ }
+
+ group = kzalloc(sizeof(*group), GFP_KERNEL);
+ if (!group)
+ return ERR_PTR(-ENOMEM);
+
+ group->matches = soc->groups[i].matches;
+ group->asid = soc->groups[i].asid;
+ group->name = soc->groups[i].name;
+
+ iommu_group_set_iommudata(smmu->groups[i], group,
+ tegra_smmu_group_release);
+
+#ifndef CONFIG_ARM64
+ group->mapping = arm_iommu_create_mapping(&platform_bus_type,
+ 0, SZ_2G);
+ if (IS_ERR(group->mapping)) {
+ dev_err(dev, "failed to create mapping for group %s: %ld\n",
+ group->name, PTR_ERR(group->mapping));
+ return ERR_CAST(group->mapping);
+ }
+#endif
+ }
+
+ value = (1 << 29) | (8 << 24) | 0x3f;
+ smmu_writel(smmu, value, 0x18);
+
+ value = (1 << 29) | (1 << 28) | 0x20;
+ smmu_writel(smmu, value, 0x014);
+
+ smmu_flush_ptc(smmu, NULL, 0);
+ smmu_flush_tlb(smmu);
+ smmu_writel(smmu, SMMU_CONFIG_ENABLE, SMMU_CONFIG);
+ smmu_flush(smmu);
+
+ err = iommu_add(&smmu->iommu);
+ if (err < 0)
+ return ERR_PTR(err);
+
+ return smmu;
+}
+
+static int tegra_smmu_remove(struct tegra_smmu *smmu)
+{
+ iommu_remove(&smmu->iommu);
+
+ return 0;
+}
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static const struct tegra_smmu_soc tegra124_smmu_soc = {
+ .groups = tegra124_smmu_groups,
+ .num_groups = ARRAY_SIZE(tegra124_smmu_groups),
+ .clients = tegra124_mc_clients,
+ .num_clients = ARRAY_SIZE(tegra124_mc_clients),
+ .swgroups = tegra124_swgroups,
+ .num_swgroups = ARRAY_SIZE(tegra124_swgroups),
+ .num_asids = 128,
+ .atom_size = 32,
+ .ops = &tegra124_smmu_ops,
+};
+#endif
+
+static const struct tegra_smmu_soc tegra132_smmu_soc = {
+ .groups = tegra124_smmu_groups,
+ .num_groups = ARRAY_SIZE(tegra124_smmu_groups),
+ .clients = tegra124_mc_clients,
+ .num_clients = ARRAY_SIZE(tegra124_mc_clients),
+ .swgroups = tegra124_swgroups,
+ .num_swgroups = ARRAY_SIZE(tegra124_swgroups),
+ .num_asids = 128,
+ .atom_size = 32,
+ .ops = &tegra132_smmu_ops,
+};
+
+struct tegra_mc {
+ struct device *dev;
+ struct tegra_smmu *smmu;
+ void __iomem *regs;
+ int irq;
+
+ const struct tegra_mc_soc *soc;
+};
+
+static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
+{
+ return readl(mc->regs + offset);
+}
+
+static inline void mc_writel(struct tegra_mc *mc, u32 value, unsigned long offset)
+{
+ writel(value, mc->regs + offset);
+}
+
+struct tegra_mc_soc {
+ const struct tegra_mc_client *clients;
+ unsigned int num_clients;
+
+ const struct tegra_smmu_soc *smmu;
+};
+
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+static const struct tegra_mc_soc tegra124_mc_soc = {
+ .clients = tegra124_mc_clients,
+ .num_clients = ARRAY_SIZE(tegra124_mc_clients),
+ .smmu = &tegra124_smmu_soc,
+};
+#endif
+
+static const struct tegra_mc_soc tegra132_mc_soc = {
+ .clients = tegra124_mc_clients,
+ .num_clients = ARRAY_SIZE(tegra124_mc_clients),
+ .smmu = &tegra132_smmu_soc,
+};
+
+static const struct of_device_id tegra_mc_of_match[] = {
+#ifdef CONFIG_ARCH_TEGRA_124_SOC
+ { .compatible = "nvidia,tegra124-mc", .data = &tegra124_mc_soc },
+#endif
+ { .compatible = "nvidia,tegra132-mc", .data = &tegra132_mc_soc },
+ { }
+};
+
+static irqreturn_t tegra124_mc_irq(int irq, void *data)
+{
+ struct tegra_mc *mc = data;
+ u32 value, status, mask;
+
+ /* mask all interrupts to avoid flooding */
+ mask = mc_readl(mc, MC_INTMASK);
+ mc_writel(mc, 0, MC_INTMASK);
+
+ status = mc_readl(mc, MC_INTSTATUS);
+ mc_writel(mc, status, MC_INTSTATUS);
+
+ dev_dbg(mc->dev, "INTSTATUS: %08x\n", status);
+
+ if (status & MC_INT_DECERR_MTS)
+ dev_dbg(mc->dev, " DECERR_MTS\n");
+
+ if (status & MC_INT_SECERR_SEC)
+ dev_dbg(mc->dev, " SECERR_SEC\n");
+
+ if (status & MC_INT_DECERR_VPR)
+ dev_dbg(mc->dev, " DECERR_VPR\n");
+
+ if (status & MC_INT_INVALID_APB_ASID_UPDATE)
+ dev_dbg(mc->dev, " INVALID_APB_ASID_UPDATE\n");
+
+ if (status & MC_INT_INVALID_SMMU_PAGE)
+ dev_dbg(mc->dev, " INVALID_SMMU_PAGE\n");
+
+ if (status & MC_INT_ARBITRATION_EMEM)
+ dev_dbg(mc->dev, " ARBITRATION_EMEM\n");
+
+ if (status & MC_INT_SECURITY_VIOLATION)
+ dev_dbg(mc->dev, " SECURITY_VIOLATION\n");
+
+ if (status & MC_INT_DECERR_EMEM)
+ dev_dbg(mc->dev, " DECERR_EMEM\n");
+
+ value = mc_readl(mc, MC_ERR_STATUS);
+
+ dev_dbg(mc->dev, "ERR_STATUS: %08x\n", value);
+ dev_dbg(mc->dev, " type: %x\n", (value >> 28) & 0x7);
+ dev_dbg(mc->dev, " protection: %x\n", (value >> 25) & 0x7);
+ dev_dbg(mc->dev, " adr_hi: %x\n", (value >> 20) & 0x3);
+ dev_dbg(mc->dev, " swap: %x\n", (value >> 18) & 0x1);
+ dev_dbg(mc->dev, " security: %x\n", (value >> 17) & 0x1);
+ dev_dbg(mc->dev, " r/w: %x\n", (value >> 16) & 0x1);
+ dev_dbg(mc->dev, " adr1: %x\n", (value >> 12) & 0x7);
+ dev_dbg(mc->dev, " client: %x\n", value & 0x7f);
+
+ value = mc_readl(mc, MC_ERR_ADR);
+ dev_dbg(mc->dev, "ERR_ADR: %08x\n", value);
+
+ mc_writel(mc, mask, MC_INTMASK);
+
+ return IRQ_HANDLED;
+}
+
+static int tegra_mc_probe(struct platform_device *pdev)
+{
+ const struct of_device_id *match;
+ struct resource *res;
+ struct tegra_mc *mc;
+ unsigned int i;
+ u32 value;
+ int err;
+
+ match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
+ if (!match)
+ return -ENODEV;
+
+ mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
+ if (!mc)
+ return -ENOMEM;
+
+ platform_set_drvdata(pdev, mc);
+ mc->soc = match->data;
+ mc->dev = &pdev->dev;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ mc->regs = devm_ioremap_resource(&pdev->dev, res);
+ if (IS_ERR(mc->regs))
+ return PTR_ERR(mc->regs);
+
+ for (i = 0; i < mc->soc->num_clients; i++) {
+ const struct latency_allowance *la = &mc->soc->clients[i].latency;
+ u32 value;
+
+ value = readl(mc->regs + la->reg);
+ value &= ~(la->mask << la->shift);
+ value |= (la->def & la->mask) << la->shift;
+ writel(value, mc->regs + la->reg);
+ }
+
+ mc->smmu = tegra_smmu_probe(&pdev->dev, mc->soc->smmu, mc->regs);
+ if (IS_ERR(mc->smmu)) {
+ dev_err(&pdev->dev, "failed to probe SMMU: %ld\n",
+ PTR_ERR(mc->smmu));
+ return PTR_ERR(mc->smmu);
+ }
+
+ mc->irq = platform_get_irq(pdev, 0);
+ if (mc->irq < 0) {
+ dev_err(&pdev->dev, "interrupt not specified\n");
+ return mc->irq;
+ }
+
+ err = devm_request_irq(&pdev->dev, mc->irq, tegra124_mc_irq,
+ IRQF_SHARED, dev_name(&pdev->dev), mc);
+ if (err < 0) {
+ dev_err(&pdev->dev, "failed to request IRQ#%u: %d\n", mc->irq,
+ err);
+ return err;
+ }
+
+ value = MC_INT_DECERR_MTS | MC_INT_SECERR_SEC | MC_INT_DECERR_VPR |
+ MC_INT_INVALID_APB_ASID_UPDATE | MC_INT_INVALID_SMMU_PAGE |
+ MC_INT_ARBITRATION_EMEM | MC_INT_SECURITY_VIOLATION |
+ MC_INT_DECERR_EMEM;
+ mc_writel(mc, value, MC_INTMASK);
+
+ return 0;
+}
+
+static int tegra_mc_remove(struct platform_device *pdev)
+{
+ struct tegra_mc *mc = platform_get_drvdata(pdev);
+ int err;
+
+ err = tegra_smmu_remove(mc->smmu);
+ if (err < 0)
+ dev_err(&pdev->dev, "failed to remove SMMU: %d\n", err);
+
+ return 0;
+}
+
+static struct platform_driver tegra_mc_driver = {
+ .driver = {
+ .name = "tegra124-mc",
+ .of_match_table = tegra_mc_of_match,
+ },
+ .probe = tegra_mc_probe,
+ .remove = tegra_mc_remove,
+};
+module_platform_driver(tegra_mc_driver);
+
+MODULE_AUTHOR("Thierry Reding <[email protected]>");
+MODULE_DESCRIPTION("NVIDIA Tegra124 Memory Controller driver");
+MODULE_LICENSE("GPL v2");
diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
new file mode 100644
index 000000000000..6b1617ce022f
--- /dev/null
+++ b/include/dt-bindings/memory/tegra124-mc.h
@@ -0,0 +1,30 @@
+#ifndef DT_BINDINGS_MEMORY_TEGRA124_MC_H
+#define DT_BINDINGS_MEMORY_TEGRA124_MC_H
+
+#define TEGRA_SWGROUP_DC 0
+#define TEGRA_SWGROUP_DCB 1
+#define TEGRA_SWGROUP_AFI 2
+#define TEGRA_SWGROUP_AVPC 3
+#define TEGRA_SWGROUP_HDA 4
+#define TEGRA_SWGROUP_HC 5
+#define TEGRA_SWGROUP_MSENC 6
+#define TEGRA_SWGROUP_PPCS 7
+#define TEGRA_SWGROUP_SATA 8
+#define TEGRA_SWGROUP_VDE 9
+#define TEGRA_SWGROUP_MPCORELP 10
+#define TEGRA_SWGROUP_MPCORE 11
+#define TEGRA_SWGROUP_ISP2 12
+#define TEGRA_SWGROUP_XUSB_HOST 13
+#define TEGRA_SWGROUP_XUSB_DEV 14
+#define TEGRA_SWGROUP_ISP2B 15
+#define TEGRA_SWGROUP_TSEC 16
+#define TEGRA_SWGROUP_A9AVP 17
+#define TEGRA_SWGROUP_GPU 18
+#define TEGRA_SWGROUP_SDMMC1A 19
+#define TEGRA_SWGROUP_SDMMC2A 20
+#define TEGRA_SWGROUP_SDMMC3A 21
+#define TEGRA_SWGROUP_SDMMC4A 22
+#define TEGRA_SWGROUP_VIC 23
+#define TEGRA_SWGROUP_VI 24
+
+#endif
--
2.0.0
From: Thierry Reding <[email protected]>
This commit introduces a generic device tree binding for IOMMU devices.
Only a very minimal subset is described here, but it is enough to cover
the requirements of both the Exynos System MMU and Tegra SMMU as
discussed here:
https://lkml.org/lkml/2014/4/27/346
Signed-off-by: Thierry Reding <[email protected]>
---
Changes in v3:
- use #iommu-cells instead of #address-cells/#size-cells
- drop optional iommu-names property
Changes in v2:
- add notes about "dma-ranges" property (drop note from commit message)
- document priorities of "iommus" property vs. "dma-ranges" property
- drop #iommu-cells in favour of #address-cells and #size-cells
- remove multiple-master device example
Documentation/devicetree/bindings/iommu/iommu.txt | 156 ++++++++++++++++++++++
1 file changed, 156 insertions(+)
create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt
diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt
new file mode 100644
index 000000000000..f8f03f057156
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/iommu.txt
@@ -0,0 +1,156 @@
+This document describes the generic device tree binding for IOMMUs and their
+master(s).
+
+
+IOMMU device node:
+==================
+
+An IOMMU can provide the following services:
+
+* Remap address space to allow devices to access physical memory ranges that
+ they otherwise wouldn't be capable of accessing.
+
+ Example: 32-bit DMA to 64-bit physical addresses
+
+* Implement scatter-gather at page level granularity so that the device does
+ not have to.
+
+* Provide system protection against "rogue" DMA by forcing all accesses to go
+ through the IOMMU and faulting when encountering accesses to unmapped
+ address regions.
+
+* Provide address space isolation between multiple contexts.
+
+ Example: Virtualization
+
+Device nodes compatible with this binding represent hardware with some of the
+above capabilities.
+
+IOMMUs can be single-master or multiple-master. Single-master IOMMU devices
+typically have a fixed association to the master device, whereas multiple-
+master IOMMU devices can translate accesses from more than one master.
+
+The device tree node of the IOMMU device's parent bus must contain a valid
+"dma-ranges" property that describes how the physical address space of the
+IOMMU maps to memory. An empty "dma-ranges" property means that there is a
+1:1 mapping from IOMMU to memory.
+
+Required properties:
+--------------------
+- #iommu-cells: The number of cells in an IOMMU specifier needed to encode an
+ address.
+
+Typical values for the above include:
+- #iommu-cells = <0>: Single master IOMMU devices are not configurable and
+ therefore no additional information needs to be encoded in the specifier.
+ This may also apply to multiple master IOMMU devices that do not allow the
+ association of masters to be configured.
+- #iommu-cells = <1>: Multiple master IOMMU devices may need to be configured
+ in order to enable translation for a given master. In such cases the single
+ address cell corresponds to the master device's ID.
+- #iommu-cells = <4>: Some IOMMU devices allow the DMA window for masters to
+ be configured. The first cell of the address in this may contain the master
+ device's ID for example, while the second cell could contain the start of
+ the DMA window for the given device. The length of the DMA window is given
+ by the third and fourth cells.
+
+
+IOMMU master node:
+==================
+
+Devices that access memory through an IOMMU are called masters. A device can
+have multiple master interfaces (to one or more IOMMU devices).
+
+Required properties:
+--------------------
+- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
+ master interfaces of the device. One entry in the list describes one master
+ interface of the device.
+
+When an "iommus" property is specified in a device tree node, the IOMMU will
+be used for address translation. If a "dma-ranges" property exists in the
+device's parent node it will be ignored. An exception to this rule is if the
+referenced IOMMU is disabled, in which case the "dma-ranges" property of the
+parent shall take effect.
+
+
+Notes:
+======
+
+One possible extension to the above is to use an "iommus" property along with
+a "dma-ranges" property in a bus device node (such as PCI host bridges). This
+can be useful to describe how children on the bus relate to the IOMMU if they
+are not explicitly listed in the device tree (e.g. PCI devices). However, the
+requirements of that use-case haven't been fully determined yet. Implementing
+this is therefore not recommended without further discussion and extension of
+this binding.
+
+
+Examples:
+=========
+
+Single-master IOMMU:
+--------------------
+
+ iommu {
+ #iommu-cells = <0>;
+ };
+
+ master {
+ iommus = <&/iommu>;
+ };
+
+Multiple-master IOMMU with fixed associations:
+----------------------------------------------
+
+ /* multiple-master IOMMU */
+ iommu {
+ /*
+ * Masters are statically associated with this IOMMU and
+ * address translation is always enabled.
+ */
+ #iommu-cells = <0>;
+ };
+
+ /* static association with IOMMU */
+ master@1 {
+ reg = <1>;
+ iommus = <&/iommu>;
+ };
+
+ /* static association with IOMMU */
+ master@2 {
+ reg = <2>;
+ iommus = <&/iommu>;
+ };
+
+Multiple-master IOMMU:
+----------------------
+
+ iommu {
+ /* the specifier represents the ID of the master */
+ #iommu-cells = <1>;
+ };
+
+ master {
+ /* device has master ID 42 in the IOMMU */
+ iommus = <&/iommu 42>;
+ };
+
+Multiple-master IOMMU with configurable DMA window:
+---------------------------------------------------
+
+ / {
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ iommu {
+ /* master ID, address and length of DMA window */
+ #iommu-cells = <4>;
+ };
+
+ master {
+ /* master ID 42, 4 GiB DMA window starting at 0 */
+ iommus = <&/iommu 42 0 0x1 0x0>;
+ };
+ };
--
2.0.0
On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> From: Thierry Reding <[email protected]>
>
> Add an IOMMU device registry for drivers to register with and implement
> a method for users of the IOMMU API to attach to an IOMMU device. This
> allows to support deferred probing and gives the IOMMU API a convenient
> hook to perform early initialization of a device if necessary.
>
> Signed-off-by: Thierry Reding <[email protected]>
> ---
> drivers/iommu/iommu.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++++++
> include/linux/iommu.h | 27 +++++++++++++++
> 2 files changed, 120 insertions(+)
I thought that perhaps I should elaborate on this a bit since I have a
few ideas on how the API could be enhanced.
> +static int of_iommu_attach(struct device *dev)
> +{
> + struct of_phandle_iter iter;
> + struct iommu *iommu;
> +
> + mutex_lock(&iommus_lock);
> +
> + of_property_for_each_phandle_with_args(iter, dev->of_node, "iommus",
> + "#iommu-cells", 0) {
> + bool found = false;
> + int err;
> +
> + /* skip disabled IOMMUs */
> + if (!of_device_is_available(iter.out_args.np))
> + continue;
> +
> + list_for_each_entry(iommu, &iommus, list) {
> + if (iommu->dev->of_node == iter.out_args.np) {
> + err = iommu->ops->attach(iommu, dev);
> + if (err < 0) {
> + }
> +
> + found = true;
> + }
> + }
> +
> + if (!found) {
> + mutex_unlock(&iommus_lock);
> + return -EPROBE_DEFER;
> + }
> + }
> +
> + mutex_unlock(&iommus_lock);
> +
> + return 0;
> +}
> +
> +static int of_iommu_detach(struct device *dev)
> +{
> + /* TODO: implement */
> + return -ENOSYS;
> +}
> +
> +int iommu_attach(struct device *dev)
> +{
> + int err = 0;
> +
> + if (IS_ENABLED(CONFIG_OF) && dev->of_node) {
> + err = of_iommu_attach(dev);
> + if (!err)
> + return 0;
> + }
> +
> + return err;
> +}
> +EXPORT_SYMBOL_GPL(iommu_attach);
I think it might make sense to introduce an explicit object for an IOMMU
master attachment. Maybe something like:
struct iommu_master {
struct iommu *iommu;
struct device *dev;
...
};
iommu_attach() could then return a pointer to that attachment and the
IOMMU user driver could subsequently use that as a handle to access
other parts of the API.
The reason is that if we ever need to support more than a single master
interface (and perhaps even multiple master interfaces on different
IOMMUs) for a single device, then we need a way for the IOMMU user to
differentiate between its master interfaces.
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 284a4683fdc1..ac2ceef194d4 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -43,6 +43,17 @@ struct notifier_block;
> typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
> struct device *, unsigned long, int, void *);
>
> +struct iommu {
> + struct device *dev;
> +
> + struct list_head list;
> +
> + const struct iommu_ops *ops;
> +};
For reasons explained above, I also think that it would be a good idea
to modify the iommu_ops functions to take a struct iommu * as their
first argument. This may become important when one driver needs to
support multiple IOMMU devices. With the current API drivers have to
rely on global variables to track the driver-specific context. As far as
I can tell, only .domain_init(), .add_device(), .remove_device() and
.device_group(). .domain_init() could set up a pointer to struct iommu
in struct iommu_domain so the functions dealing with domains could gain
access to the IOMMU device via that pointer.
Thierry
Hi Thierry,
On 06/27/2014 04:49 AM, Thierry Reding wrote:
[snip]
> +
> +#define MC_INTSTATUS 0x000
> +#define MC_INT_DECERR_MTS (1 << 16)
> +#define MC_INT_SECERR_SEC (1 << 13)
> +#define MC_INT_DECERR_VPR (1 << 12)
> +#define MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
> +#define MC_INT_INVALID_SMMU_PAGE (1 << 10)
> +#define MC_INT_ARBITRATION_EMEM (1 << 9)
> +#define MC_INT_SECURITY_VIOLATION (1 << 8)
> +#define MC_INT_DECERR_EMEM (1 << 6)
> +#define MC_INTMASK 0x004
> +#define MC_ERR_STATUS 0x08
> +#define MC_ERR_ADR 0x0c
> +
[snip]
> +
> +#define SMMU_PDE_ATTR (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
> + SMMU_PDE_NONSECURE)
> +#define SMMU_PTE_ATTR (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
> + SMMU_PTE_NONSECURE)
> +
> +#define SMMU_PDE_VACANT(n) (((n) << 10) | SMMU_PDE_ATTR)
> +#define SMMU_PTE_VACANT(n) (((n) << 12) | SMMU_PTE_ATTR)
There is an ISR to catch the invalid SMMU translation. Do you want to
modify the identity mapping with read/write attribute of the unused SMMU
pages?
This can make sure we capture the invalid SMMU translation. And helps
for driver to capture issues when using SMMU.
-joseph
> +static irqreturn_t tegra124_mc_irq(int irq, void *data)
> +{
> + struct tegra_mc *mc = data;
> + u32 value, status, mask;
> +
> + /* mask all interrupts to avoid flooding */
> + mask = mc_readl(mc, MC_INTMASK);
> + mc_writel(mc, 0, MC_INTMASK);
> +
> + status = mc_readl(mc, MC_INTSTATUS);
> + mc_writel(mc, status, MC_INTSTATUS);
> +
> + dev_dbg(mc->dev, "INTSTATUS: %08x\n", status);
> +
> + if (status & MC_INT_DECERR_MTS)
> + dev_dbg(mc->dev, " DECERR_MTS\n");
> +
> + if (status & MC_INT_SECERR_SEC)
> + dev_dbg(mc->dev, " SECERR_SEC\n");
> +
> + if (status & MC_INT_DECERR_VPR)
> + dev_dbg(mc->dev, " DECERR_VPR\n");
> +
> + if (status & MC_INT_INVALID_APB_ASID_UPDATE)
> + dev_dbg(mc->dev, " INVALID_APB_ASID_UPDATE\n");
> +
> + if (status & MC_INT_INVALID_SMMU_PAGE)
> + dev_dbg(mc->dev, " INVALID_SMMU_PAGE\n");
> +
> + if (status & MC_INT_ARBITRATION_EMEM)
> + dev_dbg(mc->dev, " ARBITRATION_EMEM\n");
> +
> + if (status & MC_INT_SECURITY_VIOLATION)
> + dev_dbg(mc->dev, " SECURITY_VIOLATION\n");
> +
> + if (status & MC_INT_DECERR_EMEM)
> + dev_dbg(mc->dev, " DECERR_EMEM\n");
> +
> + value = mc_readl(mc, MC_ERR_STATUS);
> +
> + dev_dbg(mc->dev, "ERR_STATUS: %08x\n", value);
> + dev_dbg(mc->dev, " type: %x\n", (value >> 28) & 0x7);
> + dev_dbg(mc->dev, " protection: %x\n", (value >> 25) & 0x7);
> + dev_dbg(mc->dev, " adr_hi: %x\n", (value >> 20) & 0x3);
> + dev_dbg(mc->dev, " swap: %x\n", (value >> 18) & 0x1);
> + dev_dbg(mc->dev, " security: %x\n", (value >> 17) & 0x1);
> + dev_dbg(mc->dev, " r/w: %x\n", (value >> 16) & 0x1);
> + dev_dbg(mc->dev, " adr1: %x\n", (value >> 12) & 0x7);
> + dev_dbg(mc->dev, " client: %x\n", value & 0x7f);
> +
> + value = mc_readl(mc, MC_ERR_ADR);
> + dev_dbg(mc->dev, "ERR_ADR: %08x\n", value);
> +
> + mc_writel(mc, mask, MC_INTMASK);
> +
> + return IRQ_HANDLED;
> +}
> +
On Fri, Jun 27, 2014 at 03:41:20PM +0800, Joseph Lo wrote:
> Hi Thierry,
>
> On 06/27/2014 04:49 AM, Thierry Reding wrote:
> [snip]
> >+
> >+#define MC_INTSTATUS 0x000
> >+#define MC_INT_DECERR_MTS (1 << 16)
> >+#define MC_INT_SECERR_SEC (1 << 13)
> >+#define MC_INT_DECERR_VPR (1 << 12)
> >+#define MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
> >+#define MC_INT_INVALID_SMMU_PAGE (1 << 10)
> >+#define MC_INT_ARBITRATION_EMEM (1 << 9)
> >+#define MC_INT_SECURITY_VIOLATION (1 << 8)
> >+#define MC_INT_DECERR_EMEM (1 << 6)
> >+#define MC_INTMASK 0x004
> >+#define MC_ERR_STATUS 0x08
> >+#define MC_ERR_ADR 0x0c
> >+
> [snip]
> >+
> >+#define SMMU_PDE_ATTR (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
> >+ SMMU_PDE_NONSECURE)
> >+#define SMMU_PTE_ATTR (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
> >+ SMMU_PTE_NONSECURE)
> >+
> >+#define SMMU_PDE_VACANT(n) (((n) << 10) | SMMU_PDE_ATTR)
> >+#define SMMU_PTE_VACANT(n) (((n) << 12) | SMMU_PTE_ATTR)
>
> There is an ISR to catch the invalid SMMU translation. Do you want to modify
> the identity mapping with read/write attribute of the unused SMMU pages?
I'm not sure I understand what you mean by "identity mapping". None of
the public documentation seems to describe the exact layout of PDEs or
PTEs, so it's somewhat hard to tell what to set them to when pages are
unmapped.
> This can make sure we capture the invalid SMMU translation. And helps for
> driver to capture issues when using SMMU.
That certainly sounds like a useful thing to have. Like I said this is
an RFC and I'm not even sure if it's acceptable in the current form, so
I wanted to get feedback early on to avoid wasting effort on something
that turn out to be a wild-goose chase.
Thierry
Thierry Reding <[email protected]> writes:
> From: Thierry Reding <[email protected]>
>
> Attach to the device's master interface of the IOMMU at .probe() time.
> IOMMU support becomes available via the DMA mapping API interoperation
> code, but this explicit attachment is necessary to ensure proper probe
> order.
>
> Signed-off-by: Thierry Reding <[email protected]>
> ---
> drivers/mmc/host/sdhci-tegra.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
> index 33100d10d176..b884614fa4e6 100644
> --- a/drivers/mmc/host/sdhci-tegra.c
> +++ b/drivers/mmc/host/sdhci-tegra.c
> @@ -15,6 +15,7 @@
> #include <linux/err.h>
> #include <linux/module.h>
> #include <linux/init.h>
> +#include <linux/iommu.h>
> #include <linux/platform_device.h>
> #include <linux/clk.h>
> #include <linux/io.h>
> @@ -237,6 +238,11 @@ static int sdhci_tegra_probe(struct platform_device *pdev)
> match = of_match_device(sdhci_tegra_dt_match, &pdev->dev);
> if (!match)
> return -EINVAL;
> +
> + rc = iommu_attach(&pdev->dev);
> + if (rc < 0)
> + return rc;
> +
I thought that, if we consider that ->probe() should include minimal H/W
probing so that DMA API call in ->probe() could be deferred after
->probe() and till it's in use actually, like opening a device node. For
me this decision(minimal h/w probe) seemed logical but it would add a
new restriction. One advantage is that we could still keep all drivers
wihtout any IOMMU code if it doesn't call DMA API in ->probe().
> soc_data = match->data;
>
> host = sdhci_pltfm_init(pdev, soc_data->pdata, 0);
> @@ -310,6 +316,8 @@ static int sdhci_tegra_remove(struct platform_device *pdev)
> clk_disable_unprepare(pltfm_host->clk);
> clk_put(pltfm_host->clk);
>
> + iommu_detach(&pdev->dev);
> +
> sdhci_pltfm_free(pdev);
>
> return 0;
Thierry Reding <[email protected]> writes:
> From: Thierry Reding <[email protected]>
>
> When an IOMMU device is available on the platform bus, allocate an IOMMU
> domain and attach the display controllers to it. The display controllers
> can then scan out non-contiguous buffers by mapping them through the
> IOMMU.
>
> Signed-off-by: Thierry Reding <[email protected]>
> ---
> drivers/gpu/drm/tegra/dc.c | 21 ++++
> drivers/gpu/drm/tegra/drm.c | 17 ++++
> drivers/gpu/drm/tegra/drm.h | 3 +
> drivers/gpu/drm/tegra/fb.c | 16 ++-
> drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
> drivers/gpu/drm/tegra/gem.h | 4 +
> 6 files changed, 273 insertions(+), 24 deletions(-)
>
> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> index afcca04f5367..0f7452d04811 100644
> --- a/drivers/gpu/drm/tegra/dc.c
> +++ b/drivers/gpu/drm/tegra/dc.c
> @@ -9,6 +9,7 @@
>
> #include <linux/clk.h>
> #include <linux/debugfs.h>
> +#include <linux/iommu.h>
> #include <linux/reset.h>
>
> #include "dc.h"
> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
> {
> struct drm_device *drm = dev_get_drvdata(client->parent);
> struct tegra_dc *dc = host1x_client_to_dc(client);
> + struct tegra_drm *tegra = drm->dev_private;
> int err;
>
> + if (tegra->domain) {
> + err = iommu_attach_device(tegra->domain, dc->dev);
I wanted to keep device drivers iommu-free with the following:
http://patchwork.ozlabs.org/patch/354074/
> + if (err < 0) {
> + dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> + err);
> + return err;
> + }
> + }
> +
> drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
> drm_mode_crtc_set_gamma_size(&dc->base, 256);
> drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
>
> static int tegra_dc_exit(struct host1x_client *client)
> {
> + struct drm_device *drm = dev_get_drvdata(client->parent);
> struct tegra_dc *dc = host1x_client_to_dc(client);
> + struct tegra_drm *tegra = drm->dev_private;
> int err;
>
> devm_free_irq(dc->dev, dc->irq, dc);
> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
> return err;
> }
>
> + iommu_detach_device(tegra->domain, dc->dev);
> +
> return 0;
> }
>
> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
> return -ENXIO;
> }
>
> + err = iommu_attach(&pdev->dev);
> + if (err < 0) {
> + dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
> + return err;
> + }
> +
> INIT_LIST_HEAD(&dc->client.list);
> dc->client.ops = &dc_client_ops;
> dc->client.dev = &pdev->dev;
> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> index 59736bb810cd..1d2bbafad982 100644
> --- a/drivers/gpu/drm/tegra/drm.c
> +++ b/drivers/gpu/drm/tegra/drm.c
> @@ -8,6 +8,7 @@
> */
>
> #include <linux/host1x.h>
> +#include <linux/iommu.h>
>
> #include "drm.h"
> #include "gem.h"
> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> if (!tegra)
> return -ENOMEM;
>
> + if (iommu_present(&platform_bus_type)) {
> + tegra->domain = iommu_domain_alloc(&platform_bus_type);
Can we use "dma_iommu_mapping" instead of domain?
I thought that DMA API is on the top of IOMMU API so that it may be
cleaner to use only DMA API.
> + if (IS_ERR(tegra->domain)) {
> + kfree(tegra);
> + return PTR_ERR(tegra->domain);
> + }
> +
> + drm_mm_init(&tegra->mm, 0, SZ_2G);
> + }
> +
> mutex_init(&tegra->clients_lock);
> INIT_LIST_HEAD(&tegra->clients);
> drm->dev_private = tegra;
> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> static int tegra_drm_unload(struct drm_device *drm)
> {
> struct host1x_device *device = to_host1x_device(drm->dev);
> + struct tegra_drm *tegra = drm->dev_private;
> int err;
>
> drm_kms_helper_poll_fini(drm);
> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
> if (err < 0)
> return err;
>
> + if (tegra->domain) {
> + iommu_domain_free(tegra->domain);
> + drm_mm_takedown(&tegra->mm);
> + }
> +
> return 0;
> }
>
> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> index 96d754e7b3eb..a07c796b7edc 100644
> --- a/drivers/gpu/drm/tegra/drm.h
> +++ b/drivers/gpu/drm/tegra/drm.h
> @@ -39,6 +39,9 @@ struct tegra_fbdev {
> struct tegra_drm {
> struct drm_device *drm;
>
> + struct iommu_domain *domain;
> + struct drm_mm mm;
> +
> struct mutex clients_lock;
> struct list_head clients;
>
> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
> index 7790d43ad082..21c65dd817c3 100644
> --- a/drivers/gpu/drm/tegra/fb.c
> +++ b/drivers/gpu/drm/tegra/fb.c
> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
> for (i = 0; i < fb->num_planes; i++) {
> struct tegra_bo *bo = fb->planes[i];
>
> - if (bo)
> + if (bo) {
> + if (bo->pages && bo->virt)
> + vunmap(bo->virt);
> +
> drm_gem_object_unreference_unlocked(&bo->gem);
> + }
> }
>
> drm_framebuffer_cleanup(framebuffer);
> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
> offset = info->var.xoffset * bytes_per_pixel +
> info->var.yoffset * fb->pitches[0];
>
> + if (bo->pages) {
> + bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
> + pgprot_writecombine(PAGE_KERNEL));
> + if (!bo->vaddr) {
> + dev_err(drm->dev, "failed to vmap() framebuffer\n");
> + err = -ENOMEM;
> + goto destroy;
> + }
> + }
> +
> drm->mode_config.fb_base = (resource_size_t)bo->paddr;
> info->screen_base = (void __iomem *)bo->vaddr + offset;
> info->screen_size = size;
> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> index c1e4e8b6e5ca..2912e61a2599 100644
> --- a/drivers/gpu/drm/tegra/gem.c
> +++ b/drivers/gpu/drm/tegra/gem.c
> @@ -14,8 +14,10 @@
> */
>
> #include <linux/dma-buf.h>
> +#include <linux/iommu.h>
> #include <drm/tegra_drm.h>
>
> +#include "drm.h"
> #include "gem.h"
>
> static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
> .kunmap = tegra_bo_kunmap,
> };
iommu_map_sg() could be implemented as iommu_ops->map_sg() for the
better perf since iommu_map() needs some pagetable cache operations. If
we do those cache operations at once, it would bring some perf benefit.
> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
> + dma_addr_t iova, int prot)
> +{
> + unsigned long offset = 0;
> + struct scatterlist *sg;
> + unsigned int i, j;
> + int err;
> +
> + for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> + dma_addr_t phys = sg_phys(sg);
> + size_t length = sg->offset;
> +
> + phys = sg_phys(sg) - sg->offset;
> + length = sg->length + sg->offset;
> +
> + err = iommu_map(domain, iova + offset, phys, length, prot);
> + if (err < 0)
> + goto unmap;
> +
> + offset += length;
> + }
> +
> + return 0;
> +
> +unmap:
> + offset = 0;
> +
> + for_each_sg(sgt->sgl, sg, i, j) {
> + size_t length = sg->length + sg->offset;
> + iommu_unmap(domain, iova + offset, length);
> + offset += length;
> + }
> +
> + return err;
> +}
I think that we don't need unmap_sg(), instead normal iommu_unmap() for
a whole area could do the same at once?
> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> + dma_addr_t iova)
> +{
> + unsigned long offset = 0;
> + struct scatterlist *sg;
> + unsigned int i;
> +
> + for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> + dma_addr_t phys = sg_phys(sg);
> + size_t length = sg->offset;
> +
> + phys = sg_phys(sg) - sg->offset;
> + length = sg->length + sg->offset;
> +
> + iommu_unmap(domain, iova + offset, length);
> + offset += length;
> + }
> +
> + return 0;
> +}
Can the rest of IOMMU API be replaced with DMA API too?
Thierry Reding <[email protected]> writes:
> From: Thierry Reding <[email protected]>
>
> The memory controller on NVIDIA Tegra124 exposes various knobs that can
> be used to tune the behaviour of the clients attached to it.
>
> Currently this driver sets up the latency allowance registers to the HW
> defaults. Eventually an API should be exported by this driver (via a
> custom API or a generic subsystem) to allow clients to register latency
> requirements.
>
> This driver also registers an IOMMU (SMMU) that's implemented by the
> memory controller.
>
> Signed-off-by: Thierry Reding <[email protected]>
> ---
> drivers/memory/Kconfig | 9 +
> drivers/memory/Makefile | 1 +
> drivers/memory/tegra124-mc.c | 1945 ++++++++++++++++++++++++++++++
> include/dt-bindings/memory/tegra124-mc.h | 30 +
> 4 files changed, 1985 insertions(+)
> create mode 100644 drivers/memory/tegra124-mc.c
> create mode 100644 include/dt-bindings/memory/tegra124-mc.h
I prefer reusing the existing SMMU and having MC and SMMU separated
since most of SMMU code are not different from functionality POV, and
new MC features are quite independent of SMMU.
If it's really convenient to combine MC and SMMU into one driver, we
could move "drivers/iomm/tegra-smmu.c" here first, and add MC features
on the top of it.
On Friday 27 June 2014 12:46:14 Hiroshi DOyu wrote:
>
> Thierry Reding <[email protected]> writes:
>
> > From: Thierry Reding <[email protected]>
> >
> > When an IOMMU device is available on the platform bus, allocate an IOMMU
> > domain and attach the display controllers to it. The display controllers
> > can then scan out non-contiguous buffers by mapping them through the
> > IOMMU.
> >
> > Signed-off-by: Thierry Reding <[email protected]>
> > ---
> > @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
> > {
> > struct drm_device *drm = dev_get_drvdata(client->parent);
> > struct tegra_dc *dc = host1x_client_to_dc(client);
> > + struct tegra_drm *tegra = drm->dev_private;
> > int err;
> >
> > + if (tegra->domain) {
> > + err = iommu_attach_device(tegra->domain, dc->dev);
>
> I wanted to keep device drivers iommu-free with the following:
>
> http://patchwork.ozlabs.org/patch/354074/
>
We definitely need something like your series to make iommus work transparently
on ARM for normal devices, using the of_dma_configure() to look up the correct
iommu per device and initialize it.
However, any devices that work with multiple iommu domains cannot do that
and still need to use the iommu API directy. I believe the tegra drm code
falls into this category.
Arnd
On Fri, Jun 27, 2014 at 12:46:14PM +0300, Hiroshi DOyu wrote:
> Thierry Reding <[email protected]> writes:
[...]
> > diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
[...]
> > + if (tegra->domain) {
> > + err = iommu_attach_device(tegra->domain, dc->dev);
>
> I wanted to keep device drivers iommu-free with the following:
>
> http://patchwork.ozlabs.org/patch/354074/
That patch only addresses the probe ordering problem that happens if the
user of an IOMMU is probed before the IOMMU. What this patch does is a
whole lot more.
> > diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> > index 59736bb810cd..1d2bbafad982 100644
> > --- a/drivers/gpu/drm/tegra/drm.c
> > +++ b/drivers/gpu/drm/tegra/drm.c
> > @@ -8,6 +8,7 @@
> > */
> >
> > #include <linux/host1x.h>
> > +#include <linux/iommu.h>
> >
> > #include "drm.h"
> > #include "gem.h"
> > @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> > if (!tegra)
> > return -ENOMEM;
> >
> > + if (iommu_present(&platform_bus_type)) {
> > + tegra->domain = iommu_domain_alloc(&platform_bus_type);
>
> Can we use "dma_iommu_mapping" instead of domain?
>
> I thought that DMA API is on the top of IOMMU API so that it may be
> cleaner to use only DMA API.
Using the DMA API doesn't work for Tegra DRM because it assumes a 1:1
mapping between a device and an IOMMU domain. For Tegra DRM we have two
devices (two display controllers) that need to be able to access the
same buffers, therefore they need to share one IOMMU domain. This can't
be done using the DMA API.
The DMA API is fine to be used by devices that operate on "private" DMA
buffers (SDMMC, USB, ...).
> iommu_map_sg() could be implemented as iommu_ops->map_sg() for the
> better perf since iommu_map() needs some pagetable cache operations. If
> we do those cache operations at once, it would bring some perf benefit.
Yes, I agree that eventually this should be moved into the IOMMU core.
We could add a .map_sg() to IOMMU ops for devices where mapping a whole
sg_table at once would have significant performance benefits and change
this generic implementation to be used by devices that don't implement
.map_sg(). Then the IOMMU core's iommu_map_sg() can call into the driver
directly or fallback to the generic implementation.
> I think that we don't need unmap_sg(), instead normal iommu_unmap() for
> a whole area could do the same at once?
Yes, I suppose that's true. I'll see if it can be safely dropped. It
might give us the same benefit as the iommu_map_sg() regarding cache
maintenance, though.
> > +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> > + dma_addr_t iova)
> > +{
> > + unsigned long offset = 0;
> > + struct scatterlist *sg;
> > + unsigned int i;
> > +
> > + for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> > + dma_addr_t phys = sg_phys(sg);
> > + size_t length = sg->offset;
> > +
> > + phys = sg_phys(sg) - sg->offset;
> > + length = sg->length + sg->offset;
> > +
> > + iommu_unmap(domain, iova + offset, length);
> > + offset += length;
> > + }
> > +
> > + return 0;
> > +}
>
> Can the rest of IOMMU API be replaced with DMA API too?
As I explained above, I don't see how it could be done for this driver.
But I don't think it has to. After all the IOMMU API does exist, so we
shouldn't shy away from using it when appropriate.
Thierry
On Fri, Jun 27, 2014 at 12:46:02PM +0300, Hiroshi DOyu wrote:
>
> Thierry Reding <[email protected]> writes:
>
> > From: Thierry Reding <[email protected]>
> >
> > Attach to the device's master interface of the IOMMU at .probe() time.
> > IOMMU support becomes available via the DMA mapping API interoperation
> > code, but this explicit attachment is necessary to ensure proper probe
> > order.
> >
> > Signed-off-by: Thierry Reding <[email protected]>
> > ---
> > drivers/mmc/host/sdhci-tegra.c | 8 ++++++++
> > 1 file changed, 8 insertions(+)
> >
> > diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
> > index 33100d10d176..b884614fa4e6 100644
> > --- a/drivers/mmc/host/sdhci-tegra.c
> > +++ b/drivers/mmc/host/sdhci-tegra.c
> > @@ -15,6 +15,7 @@
> > #include <linux/err.h>
> > #include <linux/module.h>
> > #include <linux/init.h>
> > +#include <linux/iommu.h>
> > #include <linux/platform_device.h>
> > #include <linux/clk.h>
> > #include <linux/io.h>
> > @@ -237,6 +238,11 @@ static int sdhci_tegra_probe(struct platform_device *pdev)
> > match = of_match_device(sdhci_tegra_dt_match, &pdev->dev);
> > if (!match)
> > return -EINVAL;
> > +
> > + rc = iommu_attach(&pdev->dev);
> > + if (rc < 0)
> > + return rc;
> > +
>
> I thought that, if we consider that ->probe() should include minimal H/W
> probing so that DMA API call in ->probe() could be deferred after
> ->probe() and till it's in use actually, like opening a device node. For
> me this decision(minimal h/w probe) seemed logical but it would add a
> new restriction. One advantage is that we could still keep all drivers
> wihtout any IOMMU code if it doesn't call DMA API in ->probe().
This isn't immediately apparent in this case, but I think that in the
future we may need to have this kind of explicit attachment to an IOMMU
for example once devices start to appear that have multiple master
interfaces (possibly on different IOMMUs). For easy cases like this
SDMMC driver we may be able to get away more easily by hooking this up
within the driver core for example. I'd have to look into how exactly
this would work, though.
Thierry
On Thursday 26 June 2014 22:49:44 Thierry Reding wrote:
> +static const struct tegra_mc_client tegra124_mc_clients[] = {
> + {
> + .id = 0x01,
> + .name = "display0a",
> + .swgroup = TEGRA_SWGROUP_DC,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 1,
> + },
> + .latency = {
> + .reg = 0x2e8,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0xc2,
> + },
> + }, {
This is a rather long table that I assume would need to get duplicated
and modified for each specific SoC. Have you considered to put the information
into DT instead, as auxiliary data in the iommu specifier as provided by
the device?
Arnd
On Fri, Jun 27, 2014 at 12:46:38PM +0300, Hiroshi DOyu wrote:
>
> Thierry Reding <[email protected]> writes:
>
> > From: Thierry Reding <[email protected]>
> >
> > The memory controller on NVIDIA Tegra124 exposes various knobs that can
> > be used to tune the behaviour of the clients attached to it.
> >
> > Currently this driver sets up the latency allowance registers to the HW
> > defaults. Eventually an API should be exported by this driver (via a
> > custom API or a generic subsystem) to allow clients to register latency
> > requirements.
> >
> > This driver also registers an IOMMU (SMMU) that's implemented by the
> > memory controller.
> >
> > Signed-off-by: Thierry Reding <[email protected]>
> > ---
> > drivers/memory/Kconfig | 9 +
> > drivers/memory/Makefile | 1 +
> > drivers/memory/tegra124-mc.c | 1945 ++++++++++++++++++++++++++++++
> > include/dt-bindings/memory/tegra124-mc.h | 30 +
> > 4 files changed, 1985 insertions(+)
> > create mode 100644 drivers/memory/tegra124-mc.c
> > create mode 100644 include/dt-bindings/memory/tegra124-mc.h
>
> I prefer reusing the existing SMMU and having MC and SMMU separated
> since most of SMMU code are not different from functionality POV, and
> new MC features are quite independent of SMMU.
>
> If it's really convenient to combine MC and SMMU into one driver, we
> could move "drivers/iomm/tegra-smmu.c" here first, and add MC features
> on the top of it.
I'm not sure if we can do that, since the tegra-smmu driver is
technically used by Tegra30 and Tegra114. We've never really made use of
it, but there are device trees in mainline releases that contain the
separate SMMU node.
Perhaps one of the DT folks can comment on whether it would be possible
to break compatibility with existing DTs in this case, given that the
SMMU on Tegra30 and Tegra114 have never been used.
Either way, I do see advantages in incremental patches, but at the same
time the old driver and architecture was never enabled (therefore not
tested either) upstream and as shown by the Tegra DRM example can't cope
with more complex cases. So I'm not completely convinced that an
incremental approach would be the best here.
Thierry
On Fri, Jun 27, 2014 at 01:07:04PM +0200, Arnd Bergmann wrote:
> On Thursday 26 June 2014 22:49:44 Thierry Reding wrote:
> > +static const struct tegra_mc_client tegra124_mc_clients[] = {
> > + {
> > + .id = 0x01,
> > + .name = "display0a",
> > + .swgroup = TEGRA_SWGROUP_DC,
> > + .smmu = {
> > + .reg = 0x228,
> > + .bit = 1,
> > + },
> > + .latency = {
> > + .reg = 0x2e8,
> > + .shift = 0,
> > + .mask = 0xff,
> > + .def = 0xc2,
> > + },
> > + }, {
>
> This is a rather long table that I assume would need to get duplicated
> and modified for each specific SoC. Have you considered to put the information
> into DT instead, as auxiliary data in the iommu specifier as provided by
> the device?
Most of this data really is register information and I don't think that
belongs in DT. Also since this is fixed for a given SoC and in no way
configurable (well, with the exception of the .def field above) I don't
see any point in parsing this from device tree.
Also only the .smmu substruct is immediately relevant to the IOMMU part
of the driver. The .swgroup field could possibly also be moved into that
substructure since it is only relevant to the IOMMU.
So essentially what this table does is map SWGROUPs (which are provided
in the IOMMU specifier) to the clients and registers that the IOMMU
programming needs. As an analogy it corresponds roughly to the pins and
pingroups tables of pinctrl drivers. Those don't belong in device tree
either.
Thierry
In the future, the EMC driver will also want to write and read quite
many registers in the MC block.. MC_EMEM_*, the latency allowance
registers and a couple others. Downstream just uses __raw_writel with
values from the EMC tables. A fun thing here is that during the point
that the values are written, the code cannot do some things like reading
registers (I believe) without hanging, so calling into the MC driver to
write the changes might not be very nice either. Related to that,
reading from MC_EMEM_ADR_CFG is used as a barrier in the sequence.
On 26/06/14 23:49, Thierry Reding wrote:
> From: Thierry Reding <[email protected]>
>
> The memory controller on NVIDIA Tegra124 exposes various knobs that can
> be used to tune the behaviour of the clients attached to it.
>
> Currently this driver sets up the latency allowance registers to the HW
> defaults. Eventually an API should be exported by this driver (via a
> custom API or a generic subsystem) to allow clients to register latency
> requirements.
I cannot see where the downstream latency allowance code is reloading
the latency allowance registers after a EMC clock rate change. Strange.
>
> This driver also registers an IOMMU (SMMU) that's implemented by the
> memory controller.
>
> Signed-off-by: Thierry Reding <[email protected]>
> ---
> drivers/memory/Kconfig | 9 +
> drivers/memory/Makefile | 1 +
> drivers/memory/tegra124-mc.c | 1945 ++++++++++++++++++++++++++++++
> include/dt-bindings/memory/tegra124-mc.h | 30 +
> 4 files changed, 1985 insertions(+)
> create mode 100644 drivers/memory/tegra124-mc.c
> create mode 100644 include/dt-bindings/memory/tegra124-mc.h
>
> diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
> index c59e9c96e86d..d0f0e6781570 100644
> --- a/drivers/memory/Kconfig
> +++ b/drivers/memory/Kconfig
> @@ -61,6 +61,15 @@ config TEGRA30_MC
> analysis, especially for IOMMU/SMMU(System Memory Management
> Unit) module.
>
> +config TEGRA124_MC
> + bool "Tegra124 Memory Controller driver"
> + depends on ARCH_TEGRA
> + select IOMMU_API
> + help
> + This driver is for the Memory Controller module available on
> + Tegra124 SoCs. It provides an IOMMU that can be used for I/O
> + virtual address translation.
> +
> config FSL_IFC
> bool
> depends on FSL_SOC
> diff --git a/drivers/memory/Makefile b/drivers/memory/Makefile
> index 71160a2b7313..03143927abab 100644
> --- a/drivers/memory/Makefile
> +++ b/drivers/memory/Makefile
> @@ -11,3 +11,4 @@ obj-$(CONFIG_FSL_IFC) += fsl_ifc.o
> obj-$(CONFIG_MVEBU_DEVBUS) += mvebu-devbus.o
> obj-$(CONFIG_TEGRA20_MC) += tegra20-mc.o
> obj-$(CONFIG_TEGRA30_MC) += tegra30-mc.o
> +obj-$(CONFIG_TEGRA124_MC) += tegra124-mc.o
> diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c
> new file mode 100644
> index 000000000000..741755b6785d
> --- /dev/null
> +++ b/drivers/memory/tegra124-mc.c
> @@ -0,0 +1,1945 @@
> +/*
> + * Copyright (C) 2014 NVIDIA CORPORATION. All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +
> +#include <linux/interrupt.h>
> +#include <linux/io.h>
> +#include <linux/iommu.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +
> +#include <dt-bindings/memory/tegra124-mc.h>
> +
> +#include <asm/cacheflush.h>
> +#ifndef CONFIG_ARM64
> +#include <asm/dma-iommu.h>
> +#endif
> +
> +#define MC_INTSTATUS 0x000
> +#define MC_INT_DECERR_MTS (1 << 16)
> +#define MC_INT_SECERR_SEC (1 << 13)
> +#define MC_INT_DECERR_VPR (1 << 12)
> +#define MC_INT_INVALID_APB_ASID_UPDATE (1 << 11)
> +#define MC_INT_INVALID_SMMU_PAGE (1 << 10)
> +#define MC_INT_ARBITRATION_EMEM (1 << 9)
> +#define MC_INT_SECURITY_VIOLATION (1 << 8)
> +#define MC_INT_DECERR_EMEM (1 << 6)
> +#define MC_INTMASK 0x004
> +#define MC_ERR_STATUS 0x08
> +#define MC_ERR_ADR 0x0c
> +
> +struct latency_allowance {
> + unsigned int reg;
> + unsigned int shift;
> + unsigned int mask;
> + unsigned int def;
> +};
> +
> +struct smmu_enable {
> + unsigned int reg;
> + unsigned int bit;
> +};
> +
> +struct tegra_mc_client {
> + unsigned int id;
> + const char *name;
> + unsigned int swgroup;
> +
> + struct smmu_enable smmu;
> + struct latency_allowance latency;
> +};
> +
> +static const struct tegra_mc_client tegra124_mc_clients[] = {
> + {
> + .id = 0x01,
> + .name = "display0a",
> + .swgroup = TEGRA_SWGROUP_DC,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 1,
> + },
> + .latency = {
> + .reg = 0x2e8,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0xc2,
> + },
> + }, {
> + .id = 0x02,
> + .name = "display0ab",
> + .swgroup = TEGRA_SWGROUP_DCB,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 2,
> + },
> + .latency = {
> + .reg = 0x2f4,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0xc6,
> + },
> + }, {
> + .id = 0x03,
> + .name = "display0b",
> + .swgroup = TEGRA_SWGROUP_DC,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 3,
> + },
> + .latency = {
> + .reg = 0x2e8,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x50,
> + },
> + }, {
> + .id = 0x04,
> + .name = "display0bb",
> + .swgroup = TEGRA_SWGROUP_DCB,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 4,
> + },
> + .latency = {
> + .reg = 0x2f4,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x50,
> + },
> + }, {
> + .id = 0x05,
> + .name = "display0c",
> + .swgroup = TEGRA_SWGROUP_DC,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 5,
> + },
> + .latency = {
> + .reg = 0x2ec,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x50,
> + },
> + }, {
> + .id = 0x06,
> + .name = "display0cb",
> + .swgroup = TEGRA_SWGROUP_DCB,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 6,
> + },
> + .latency = {
> + .reg = 0x2f8,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x50,
> + },
> + }, {
> + .id = 0x0e,
> + .name = "afir",
> + .swgroup = TEGRA_SWGROUP_AFI,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 14,
> + },
> + .latency = {
> + .reg = 0x2e0,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x13,
> + },
> + }, {
> + .id = 0x0f,
> + .name = "avpcarm7r",
> + .swgroup = TEGRA_SWGROUP_AVPC,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 15,
> + },
> + .latency = {
> + .reg = 0x2e4,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x04,
> + },
> + }, {
> + .id = 0x10,
> + .name = "displayhc",
> + .swgroup = TEGRA_SWGROUP_DC,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 16,
> + },
> + .latency = {
> + .reg = 0x2f0,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x50,
> + },
> + }, {
> + .id = 0x11,
> + .name = "displayhcb",
> + .swgroup = TEGRA_SWGROUP_DCB,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 17,
> + },
> + .latency = {
> + .reg = 0x2fc,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x50,
> + },
> + }, {
> + .id = 0x15,
> + .name = "hdar",
> + .swgroup = TEGRA_SWGROUP_HDA,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 21,
> + },
> + .latency = {
> + .reg = 0x318,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x24,
> + },
> + }, {
> + .id = 0x16,
> + .name = "host1xdmar",
> + .swgroup = TEGRA_SWGROUP_HC,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 22,
> + },
> + .latency = {
> + .reg = 0x310,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x1e,
> + },
> + }, {
> + .id = 0x17,
> + .name = "host1xr",
> + .swgroup = TEGRA_SWGROUP_HC,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 23,
> + },
> + .latency = {
> + .reg = 0x310,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x50,
> + },
> + }, {
> + .id = 0x1c,
> + .name = "msencsrd",
> + .swgroup = TEGRA_SWGROUP_MSENC,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 28,
> + },
> + .latency = {
> + .reg = 0x328,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x23,
> + },
> + }, {
> + .id = 0x1d,
> + .name = "ppcsahbdmarhdar",
> + .swgroup = TEGRA_SWGROUP_PPCS,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 29,
> + },
> + .latency = {
> + .reg = 0x344,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x49,
> + },
> + }, {
> + .id = 0x1e,
> + .name = "ppcsahbslvr",
> + .swgroup = TEGRA_SWGROUP_PPCS,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 30,
> + },
> + .latency = {
> + .reg = 0x344,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x1a,
> + },
> + }, {
> + .id = 0x1f,
> + .name = "satar",
> + .swgroup = TEGRA_SWGROUP_SATA,
> + .smmu = {
> + .reg = 0x228,
> + .bit = 31,
> + },
> + .latency = {
> + .reg = 0x350,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x65,
> + },
> + }, {
> + .id = 0x22,
> + .name = "vdebsevr",
> + .swgroup = TEGRA_SWGROUP_VDE,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 2,
> + },
> + .latency = {
> + .reg = 0x354,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x4f,
> + },
> + }, {
> + .id = 0x23,
> + .name = "vdember",
> + .swgroup = TEGRA_SWGROUP_VDE,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 3,
> + },
> + .latency = {
> + .reg = 0x354,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x3d,
> + },
> + }, {
> + .id = 0x24,
> + .name = "vdemcer",
> + .swgroup = TEGRA_SWGROUP_VDE,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 4,
> + },
> + .latency = {
> + .reg = 0x358,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x66,
> + },
> + }, {
> + .id = 0x25,
> + .name = "vdetper",
> + .swgroup = TEGRA_SWGROUP_VDE,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 5,
> + },
> + .latency = {
> + .reg = 0x358,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0xa5,
> + },
> + }, {
> + .id = 0x26,
> + .name = "mpcorelpr",
> + .swgroup = TEGRA_SWGROUP_MPCORELP,
> + .latency = {
> + .reg = 0x324,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x04,
> + },
> + }, {
> + .id = 0x27,
> + .name = "mpcorer",
> + .swgroup = TEGRA_SWGROUP_MPCORE,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 2,
> + },
> + .latency = {
> + .reg = 0x320,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x04,
> + },
> + }, {
> + .id = 0x2b,
> + .name = "msencswr",
> + .swgroup = TEGRA_SWGROUP_MSENC,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 11,
> + },
> + .latency = {
> + .reg = 0x328,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x31,
> + .name = "afiw",
> + .swgroup = TEGRA_SWGROUP_AFI,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 17,
> + },
> + .latency = {
> + .reg = 0x2e0,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x32,
> + .name = "avpcarm7w",
> + .swgroup = TEGRA_SWGROUP_AVPC,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 18,
> + },
> + .latency = {
> + .reg = 0x2e4,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x35,
> + .name = "hdaw",
> + .swgroup = TEGRA_SWGROUP_HDA,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 21,
> + },
> + .latency = {
> + .reg = 0x318,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x36,
> + .name = "host1xw",
> + .swgroup = TEGRA_SWGROUP_HC,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 22,
> + },
> + .latency = {
> + .reg = 0x314,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x38,
> + .name = "mpcorelpw",
> + .swgroup = TEGRA_SWGROUP_MPCORELP,
> + .latency = {
> + .reg = 0x324,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x39,
> + .name = "mpcorew",
> + .swgroup = TEGRA_SWGROUP_MPCORE,
> + .latency = {
> + .reg = 0x320,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x3b,
> + .name = "ppcsahbdmaw",
> + .swgroup = TEGRA_SWGROUP_PPCS,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 27,
> + },
> + .latency = {
> + .reg = 0x348,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x3c,
> + .name = "ppcsahbslvw",
> + .swgroup = TEGRA_SWGROUP_PPCS,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 28,
> + },
> + .latency = {
> + .reg = 0x348,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x3d,
> + .name = "sataw",
> + .swgroup = TEGRA_SWGROUP_SATA,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 29,
> + },
> + .latency = {
> + .reg = 0x350,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x65,
> + },
> + }, {
> + .id = 0x3e,
> + .name = "vdebsevw",
> + .swgroup = TEGRA_SWGROUP_VDE,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 30,
> + },
> + .latency = {
> + .reg = 0x35c,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x3f,
> + .name = "vdedbgw",
> + .swgroup = TEGRA_SWGROUP_VDE,
> + .smmu = {
> + .reg = 0x22c,
> + .bit = 31,
> + },
> + .latency = {
> + .reg = 0x35c,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x40,
> + .name = "vdembew",
> + .swgroup = TEGRA_SWGROUP_VDE,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 0,
> + },
> + .latency = {
> + .reg = 0x360,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x41,
> + .name = "vdetpmw",
> + .swgroup = TEGRA_SWGROUP_VDE,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 1,
> + },
> + .latency = {
> + .reg = 0x360,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x44,
> + .name = "ispra",
> + .swgroup = TEGRA_SWGROUP_ISP2,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 4,
> + },
> + .latency = {
> + .reg = 0x370,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x18,
> + },
> + }, {
> + .id = 0x46,
> + .name = "ispwa",
> + .swgroup = TEGRA_SWGROUP_ISP2,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 6,
> + },
> + .latency = {
> + .reg = 0x374,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x47,
> + .name = "ispwb",
> + .swgroup = TEGRA_SWGROUP_ISP2,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 7,
> + },
> + .latency = {
> + .reg = 0x374,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x4a,
> + .name = "xusb_hostr",
> + .swgroup = TEGRA_SWGROUP_XUSB_HOST,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 10,
> + },
> + .latency = {
> + .reg = 0x37c,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x39,
> + },
> + }, {
> + .id = 0x4b,
> + .name = "xusb_hostw",
> + .swgroup = TEGRA_SWGROUP_XUSB_HOST,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 11,
> + },
> + .latency = {
> + .reg = 0x37c,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x4c,
> + .name = "xusb_devr",
> + .swgroup = TEGRA_SWGROUP_XUSB_DEV,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 12,
> + },
> + .latency = {
> + .reg = 0x380,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x39,
> + },
> + }, {
> + .id = 0x4d,
> + .name = "xusb_devw",
> + .swgroup = TEGRA_SWGROUP_XUSB_DEV,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 13,
> + },
> + .latency = {
> + .reg = 0x380,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x4e,
> + .name = "isprab",
> + .swgroup = TEGRA_SWGROUP_ISP2B,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 14,
> + },
> + .latency = {
> + .reg = 0x384,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x18,
> + },
> + }, {
> + .id = 0x50,
> + .name = "ispwab",
> + .swgroup = TEGRA_SWGROUP_ISP2B,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 16,
> + },
> + .latency = {
> + .reg = 0x388,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x51,
> + .name = "ispwbb",
> + .swgroup = TEGRA_SWGROUP_ISP2B,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 17,
> + },
> + .latency = {
> + .reg = 0x388,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x54,
> + .name = "tsecsrd",
> + .swgroup = TEGRA_SWGROUP_TSEC,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 20,
> + },
> + .latency = {
> + .reg = 0x390,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x9b,
> + },
> + }, {
> + .id = 0x55,
> + .name = "tsecswr",
> + .swgroup = TEGRA_SWGROUP_TSEC,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 21,
> + },
> + .latency = {
> + .reg = 0x390,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x56,
> + .name = "a9avpscr",
> + .swgroup = TEGRA_SWGROUP_A9AVP,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 22,
> + },
> + .latency = {
> + .reg = 0x3a4,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x04,
> + },
> + }, {
> + .id = 0x57,
> + .name = "a9avpscw",
> + .swgroup = TEGRA_SWGROUP_A9AVP,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 23,
> + },
> + .latency = {
> + .reg = 0x3a4,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x58,
> + .name = "gpusrd",
> + .swgroup = TEGRA_SWGROUP_GPU,
> + .smmu = {
> + /* read-only */
> + .reg = 0x230,
> + .bit = 24,
> + },
> + .latency = {
> + .reg = 0x3c8,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x1a,
> + },
> + }, {
> + .id = 0x59,
> + .name = "gpuswr",
> + .swgroup = TEGRA_SWGROUP_GPU,
> + .smmu = {
> + /* read-only */
> + .reg = 0x230,
> + .bit = 25,
> + },
> + .latency = {
> + .reg = 0x3c8,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x5a,
> + .name = "displayt",
> + .swgroup = TEGRA_SWGROUP_DC,
> + .smmu = {
> + .reg = 0x230,
> + .bit = 26,
> + },
> + .latency = {
> + .reg = 0x2f0,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x50,
> + },
> + }, {
> + .id = 0x60,
> + .name = "sdmmcra",
> + .swgroup = TEGRA_SWGROUP_SDMMC1A,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 0,
> + },
> + .latency = {
> + .reg = 0x3b8,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x49,
> + },
> + }, {
> + .id = 0x61,
> + .name = "sdmmcraa",
> + .swgroup = TEGRA_SWGROUP_SDMMC2A,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 1,
> + },
> + .latency = {
> + .reg = 0x3bc,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x49,
> + },
> + }, {
> + .id = 0x62,
> + .name = "sdmmcr",
> + .swgroup = TEGRA_SWGROUP_SDMMC3A,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 2,
> + },
> + .latency = {
> + .reg = 0x3c0,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x49,
> + },
> + }, {
> + .id = 0x63,
> + .swgroup = TEGRA_SWGROUP_SDMMC4A,
> + .name = "sdmmcrab",
> + .smmu = {
> + .reg = 0x234,
> + .bit = 3,
> + },
> + .latency = {
> + .reg = 0x3c4,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x49,
> + },
> + }, {
> + .id = 0x64,
> + .name = "sdmmcwa",
> + .swgroup = TEGRA_SWGROUP_SDMMC1A,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 4,
> + },
> + .latency = {
> + .reg = 0x3b8,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x65,
> + .name = "sdmmcwaa",
> + .swgroup = TEGRA_SWGROUP_SDMMC2A,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 5,
> + },
> + .latency = {
> + .reg = 0x3bc,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x66,
> + .name = "sdmmcw",
> + .swgroup = TEGRA_SWGROUP_SDMMC3A,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 6,
> + },
> + .latency = {
> + .reg = 0x3c0,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x67,
> + .name = "sdmmcwab",
> + .swgroup = TEGRA_SWGROUP_SDMMC4A,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 7,
> + },
> + .latency = {
> + .reg = 0x3c4,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x6c,
> + .name = "vicsrd",
> + .swgroup = TEGRA_SWGROUP_VIC,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 12,
> + },
> + .latency = {
> + .reg = 0x394,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x1a,
> + },
> + }, {
> + .id = 0x6d,
> + .name = "vicswr",
> + .swgroup = TEGRA_SWGROUP_VIC,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 13,
> + },
> + .latency = {
> + .reg = 0x394,
> + .shift = 16,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x72,
> + .name = "viw",
> + .swgroup = TEGRA_SWGROUP_VI,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 18,
> + },
> + .latency = {
> + .reg = 0x398,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x80,
> + },
> + }, {
> + .id = 0x73,
> + .name = "displayd",
> + .swgroup = TEGRA_SWGROUP_DC,
> + .smmu = {
> + .reg = 0x234,
> + .bit = 19,
> + },
> + .latency = {
> + .reg = 0x3c8,
> + .shift = 0,
> + .mask = 0xff,
> + .def = 0x50,
> + },
> + },
> +};
> +
> +struct tegra_smmu_swgroup {
> + unsigned int swgroup;
> + unsigned int reg;
> +};
> +
> +static const struct tegra_smmu_swgroup tegra124_swgroups[] = {
> + { .swgroup = TEGRA_SWGROUP_DC, .reg = 0x240 },
> + { .swgroup = TEGRA_SWGROUP_DCB, .reg = 0x244 },
> + { .swgroup = TEGRA_SWGROUP_AFI, .reg = 0x238 },
> + { .swgroup = TEGRA_SWGROUP_AVPC, .reg = 0x23c },
> + { .swgroup = TEGRA_SWGROUP_HDA, .reg = 0x254 },
> + { .swgroup = TEGRA_SWGROUP_HC, .reg = 0x250 },
> + { .swgroup = TEGRA_SWGROUP_MSENC, .reg = 0x264 },
> + { .swgroup = TEGRA_SWGROUP_PPCS, .reg = 0x270 },
> + { .swgroup = TEGRA_SWGROUP_SATA, .reg = 0x274 },
> + { .swgroup = TEGRA_SWGROUP_VDE, .reg = 0x27c },
> + { .swgroup = TEGRA_SWGROUP_ISP2, .reg = 0x258 },
> + { .swgroup = TEGRA_SWGROUP_XUSB_HOST, .reg = 0x288 },
> + { .swgroup = TEGRA_SWGROUP_XUSB_DEV, .reg = 0x28c },
> + { .swgroup = TEGRA_SWGROUP_ISP2B, .reg = 0xaa4 },
> + { .swgroup = TEGRA_SWGROUP_TSEC, .reg = 0x294 },
> + { .swgroup = TEGRA_SWGROUP_A9AVP, .reg = 0x290 },
> + { .swgroup = TEGRA_SWGROUP_GPU, .reg = 0xaa8 },
> + { .swgroup = TEGRA_SWGROUP_SDMMC1A, .reg = 0xa94 },
> + { .swgroup = TEGRA_SWGROUP_SDMMC2A, .reg = 0xa98 },
> + { .swgroup = TEGRA_SWGROUP_SDMMC3A, .reg = 0xa9c },
> + { .swgroup = TEGRA_SWGROUP_SDMMC4A, .reg = 0xaa0 },
> + { .swgroup = TEGRA_SWGROUP_VIC, .reg = 0x284 },
> + { .swgroup = TEGRA_SWGROUP_VI, .reg = 0x280 },
> +};
> +
> +struct tegra_smmu_group_init {
> + unsigned int asid;
> + const char *name;
> +
> + const struct of_device_id *matches;
> +};
> +
> +struct tegra_smmu_soc {
> + const struct tegra_smmu_group_init *groups;
> + unsigned int num_groups;
> +
> + const struct tegra_mc_client *clients;
> + unsigned int num_clients;
> +
> + const struct tegra_smmu_swgroup *swgroups;
> + unsigned int num_swgroups;
> +
> + unsigned int num_asids;
> + unsigned int atom_size;
> +
> + const struct tegra_smmu_ops *ops;
> +};
> +
> +struct tegra_smmu_ops {
> + void (*flush_dcache)(struct page *page, unsigned long offset,
> + size_t size);
> +};
> +
> +struct tegra_smmu_master {
> + struct list_head list;
> + struct device *dev;
> +};
> +
> +struct tegra_smmu_group {
> + const char *name;
> + const struct of_device_id *matches;
> + unsigned int asid;
> +
> +#ifndef CONFIG_ARM64
> + struct dma_iommu_mapping *mapping;
> +#endif
> + struct list_head masters;
> +};
> +
> +static const struct of_device_id tegra124_periph_matches[] = {
> + { .compatible = "nvidia,tegra124-sdhci", },
> + { }
> +};
> +
> +static const struct tegra_smmu_group_init tegra124_smmu_groups[] = {
> + { 0, "peripherals", tegra124_periph_matches },
> +};
> +
> +static void tegra_smmu_group_release(void *data)
> +{
> + kfree(data);
> +}
> +
> +struct tegra_smmu {
> + void __iomem *regs;
> + struct iommu iommu;
> + struct device *dev;
> +
> + const struct tegra_smmu_soc *soc;
> +
> + struct iommu_group **groups;
> + unsigned int num_groups;
> +
> + unsigned long *asids;
> + struct mutex lock;
> +};
> +
> +struct tegra_smmu_address_space {
> + struct iommu_domain *domain;
> + struct tegra_smmu *smmu;
> + struct page *pd;
> + unsigned id;
> + u32 attr;
> +};
> +
> +static inline void smmu_writel(struct tegra_smmu *smmu, u32 value,
> + unsigned long offset)
> +{
> + writel(value, smmu->regs + offset);
> +}
> +
> +static inline u32 smmu_readl(struct tegra_smmu *smmu, unsigned long offset)
> +{
> + return readl(smmu->regs + offset);
> +}
> +
> +#define SMMU_CONFIG 0x010
> +#define SMMU_CONFIG_ENABLE (1 << 0)
> +
> +#define SMMU_PTB_ASID 0x01c
> +#define SMMU_PTB_ASID_VALUE(x) ((x) & 0x7f)
> +
> +#define SMMU_PTB_DATA 0x020
> +#define SMMU_PTB_DATA_VALUE(page, attr) (page_to_phys(page) >> 12 | (attr))
> +
> +#define SMMU_MK_PDE(page, attr) (page_to_phys(page) >> SMMU_PTE_SHIFT | (attr))
> +
> +#define SMMU_TLB_FLUSH 0x030
> +#define SMMU_TLB_FLUSH_VA_MATCH_ALL (0 << 0)
> +#define SMMU_TLB_FLUSH_VA_MATCH_SECTION (2 << 0)
> +#define SMMU_TLB_FLUSH_VA_MATCH_GROUP (3 << 0)
> +#define SMMU_TLB_FLUSH_ASID(x) (((x) & 0x7f) << 24)
> +#define SMMU_TLB_FLUSH_VA_SECTION(addr) ((((addr) & 0xffc00000) >> 12) | \
> + SMMU_TLB_FLUSH_VA_MATCH_SECTION)
> +#define SMMU_TLB_FLUSH_VA_GROUP(addr) ((((addr) & 0xffffc000) >> 12) | \
> + SMMU_TLB_FLUSH_VA_MATCH_GROUP)
> +#define SMMU_TLB_FLUSH_ASID_MATCH (1 << 31)
> +
> +#define SMMU_PTC_FLUSH 0x034
> +#define SMMU_PTC_FLUSH_TYPE_ALL (0 << 0)
> +#define SMMU_PTC_FLUSH_TYPE_ADR (1 << 0)
> +
> +#define SMMU_PTC_FLUSH_HI 0x9b8
> +#define SMMU_PTC_FLUSH_HI_MASK 0x3
> +
> +/* per-SWGROUP SMMU_*_ASID register */
> +#define SMMU_ASID_ENABLE (1 << 31)
> +#define SMMU_ASID_MASK 0x7f
> +#define SMMU_ASID_VALUE(x) ((x) & SMMU_ASID_MASK)
> +
> +/* page table definitions */
> +#define SMMU_NUM_PDE 1024
> +#define SMMU_NUM_PTE 1024
> +
> +#define SMMU_SIZE_PD (SMMU_NUM_PDE * 4)
> +#define SMMU_SIZE_PT (SMMU_NUM_PTE * 4)
> +
> +#define SMMU_PDE_SHIFT 22
> +#define SMMU_PTE_SHIFT 12
> +
> +#define SMMU_PFN_MASK 0x000fffff
> +
> +#define SMMU_PD_READABLE (1 << 31)
> +#define SMMU_PD_WRITABLE (1 << 30)
> +#define SMMU_PD_NONSECURE (1 << 29)
> +
> +#define SMMU_PDE_READABLE (1 << 31)
> +#define SMMU_PDE_WRITABLE (1 << 30)
> +#define SMMU_PDE_NONSECURE (1 << 29)
> +#define SMMU_PDE_NEXT (1 << 28)
> +
> +#define SMMU_PTE_READABLE (1 << 31)
> +#define SMMU_PTE_WRITABLE (1 << 30)
> +#define SMMU_PTE_NONSECURE (1 << 29)
> +
> +#define SMMU_PDE_ATTR (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
> + SMMU_PDE_NONSECURE)
> +#define SMMU_PTE_ATTR (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
> + SMMU_PTE_NONSECURE)
> +
> +#define SMMU_PDE_VACANT(n) (((n) << 10) | SMMU_PDE_ATTR)
> +#define SMMU_PTE_VACANT(n) (((n) << 12) | SMMU_PTE_ATTR)
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static void tegra124_flush_dcache(struct page *page, unsigned long offset,
> + size_t size)
> +{
> + phys_addr_t phys = page_to_phys(page) + offset;
> + void *virt = page_address(page) + offset;
> +
> + __cpuc_flush_dcache_area(virt, size);
> + outer_flush_range(phys, phys + size);
> +}
> +
> +static const struct tegra_smmu_ops tegra124_smmu_ops = {
> + .flush_dcache = tegra124_flush_dcache,
> +};
> +#endif
> +
> +static void tegra132_flush_dcache(struct page *page, unsigned long offset,
> + size_t size)
> +{
> + /* TODO: implement */
> +}
> +
> +static const struct tegra_smmu_ops tegra132_smmu_ops = {
> + .flush_dcache = tegra132_flush_dcache,
> +};
> +
> +static inline void smmu_flush_ptc(struct tegra_smmu *smmu, struct page *page,
> + unsigned long offset)
> +{
> + phys_addr_t phys = page ? page_to_phys(page) : 0;
> + u32 value;
> +
> + if (page) {
> + offset &= ~(smmu->soc->atom_size - 1);
> +
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> + value = (phys >> 32) & SMMU_PTC_FLUSH_HI_MASK;
> +#else
> + value = 0;
> +#endif
> + smmu_writel(smmu, value, SMMU_PTC_FLUSH_HI);
> +
> + value = (phys + offset) | SMMU_PTC_FLUSH_TYPE_ADR;
> + } else {
> + value = SMMU_PTC_FLUSH_TYPE_ALL;
> + }
> +
> + smmu_writel(smmu, value, SMMU_PTC_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb(struct tegra_smmu *smmu)
> +{
> + smmu_writel(smmu, SMMU_TLB_FLUSH_VA_MATCH_ALL, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_asid(struct tegra_smmu *smmu,
> + unsigned long asid)
> +{
> + u32 value;
> +
> + value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> + SMMU_TLB_FLUSH_VA_MATCH_ALL;
> + smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_section(struct tegra_smmu *smmu,
> + unsigned long asid,
> + unsigned long iova)
> +{
> + u32 value;
> +
> + value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> + SMMU_TLB_FLUSH_VA_SECTION(iova);
> + smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush_tlb_group(struct tegra_smmu *smmu,
> + unsigned long asid,
> + unsigned long iova)
> +{
> + u32 value;
> +
> + value = SMMU_TLB_FLUSH_ASID_MATCH | SMMU_TLB_FLUSH_ASID(asid) |
> + SMMU_TLB_FLUSH_VA_GROUP(iova);
> + smmu_writel(smmu, value, SMMU_TLB_FLUSH);
> +}
> +
> +static inline void smmu_flush(struct tegra_smmu *smmu)
> +{
> + smmu_readl(smmu, SMMU_CONFIG);
> +}
> +
> +static inline struct tegra_smmu *to_tegra_smmu(struct iommu *iommu)
> +{
> + return container_of(iommu, struct tegra_smmu, iommu);
> +}
> +
> +static struct tegra_smmu *smmu_handle = NULL;
> +
> +static int tegra_smmu_alloc_asid(struct tegra_smmu *smmu, unsigned int *idp)
> +{
> + unsigned long id;
> +
> + mutex_lock(&smmu->lock);
> +
> + id = find_first_zero_bit(smmu->asids, smmu->soc->num_asids);
> + if (id >= smmu->soc->num_asids) {
> + mutex_unlock(&smmu->lock);
> + return -ENOSPC;
> + }
> +
> + set_bit(id, smmu->asids);
> + *idp = id;
> +
> + mutex_unlock(&smmu->lock);
> + return 0;
> +}
> +
> +static void tegra_smmu_free_asid(struct tegra_smmu *smmu, unsigned int id)
> +{
> + mutex_lock(&smmu->lock);
> + clear_bit(id, smmu->asids);
> + mutex_unlock(&smmu->lock);
> +}
> +
> +struct tegra_smmu_address_space *foo = NULL;
> +
> +static int tegra_smmu_domain_init(struct iommu_domain *domain)
> +{
> + struct tegra_smmu *smmu = smmu_handle;
> + struct tegra_smmu_address_space *as;
> + uint32_t *pd, value;
> + unsigned int i;
> + int err = 0;
> +
> + as = kzalloc(sizeof(*as), GFP_KERNEL);
> + if (!as) {
> + err = -ENOMEM;
> + goto out;
> + }
> +
> + as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
> + as->smmu = smmu_handle;
> + as->domain = domain;
> +
> + err = tegra_smmu_alloc_asid(smmu, &as->id);
> + if (err < 0) {
> + kfree(as);
> + goto out;
> + }
> +
> + as->pd = alloc_page(GFP_KERNEL | __GFP_DMA);
> + if (!as->pd) {
> + err = -ENOMEM;
> + goto out;
> + }
> +
> + pd = page_address(as->pd);
> + SetPageReserved(as->pd);
> +
> + for (i = 0; i < SMMU_NUM_PDE; i++)
> + pd[i] = SMMU_PDE_VACANT(i);
> +
> + smmu->soc->ops->flush_dcache(as->pd, 0, SMMU_SIZE_PD);
> + smmu_flush_ptc(smmu, as->pd, 0);
> + smmu_flush_tlb_asid(smmu, as->id);
> +
> + smmu_writel(smmu, as->id & 0x7f, SMMU_PTB_ASID);
> + value = SMMU_PTB_DATA_VALUE(as->pd, as->attr);
> + smmu_writel(smmu, value, SMMU_PTB_DATA);
> + smmu_flush(smmu);
> +
> + domain->priv = as;
> +
> + return 0;
> +
> +out:
> + return err;
> +}
> +
> +static void tegra_smmu_domain_destroy(struct iommu_domain *domain)
> +{
> + struct tegra_smmu_address_space *as = domain->priv;
> +
> + /* TODO: free page directory and page tables */
> +
> + tegra_smmu_free_asid(as->smmu, as->id);
> + kfree(as);
> +}
> +
> +static const struct tegra_smmu_swgroup *
> +tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup)
> +{
> + const struct tegra_smmu_swgroup *group = NULL;
> + unsigned int i;
> +
> + for (i = 0; i < smmu->soc->num_swgroups; i++) {
> + if (smmu->soc->swgroups[i].swgroup == swgroup) {
> + group = &smmu->soc->swgroups[i];
> + break;
> + }
> + }
> +
> + return group;
> +}
> +
> +static int tegra_smmu_enable(struct tegra_smmu *smmu, unsigned int swgroup,
> + unsigned int asid)
> +{
> + const struct tegra_smmu_swgroup *group;
> + unsigned int i;
> + u32 value;
> +
> + for (i = 0; i < smmu->soc->num_clients; i++) {
> + const struct tegra_mc_client *client = &smmu->soc->clients[i];
> +
> + if (client->swgroup != swgroup)
> + continue;
> +
> + value = smmu_readl(smmu, client->smmu.reg);
> + value |= BIT(client->smmu.bit);
> + smmu_writel(smmu, value, client->smmu.reg);
> + }
> +
> + group = tegra_smmu_find_swgroup(smmu, swgroup);
> + if (group) {
> + value = smmu_readl(smmu, group->reg);
> + value &= ~SMMU_ASID_MASK;
> + value |= SMMU_ASID_VALUE(asid);
> + value |= SMMU_ASID_ENABLE;
> + smmu_writel(smmu, value, group->reg);
> + }
> +
> + return 0;
> +}
> +
> +static int tegra_smmu_disable(struct tegra_smmu *smmu, unsigned int swgroup,
> + unsigned int asid)
> +{
> + const struct tegra_smmu_swgroup *group;
> + unsigned int i;
> + u32 value;
> +
> + group = tegra_smmu_find_swgroup(smmu, swgroup);
> + if (group) {
> + value = smmu_readl(smmu, group->reg);
> + value &= ~SMMU_ASID_MASK;
> + value |= SMMU_ASID_VALUE(asid);
> + value &= ~SMMU_ASID_ENABLE;
> + smmu_writel(smmu, value, group->reg);
> + }
> +
> + for (i = 0; i < smmu->soc->num_clients; i++) {
> + const struct tegra_mc_client *client = &smmu->soc->clients[i];
> +
> + if (client->swgroup != swgroup)
> + continue;
> +
> + value = smmu_readl(smmu, client->smmu.reg);
> + value &= ~BIT(client->smmu.bit);
> + smmu_writel(smmu, value, client->smmu.reg);
> + }
> +
> + return 0;
> +}
> +
> +static int tegra_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> + struct tegra_smmu_address_space *as = domain->priv;
> + struct tegra_smmu *smmu = as->smmu;
> + struct of_phandle_iter entry;
> + int err;
> +
> + of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
> + "#iommu-cells", 0) {
> + unsigned int swgroup = entry.out_args.args[0];
> +
> + if (entry.out_args.np != smmu->dev->of_node)
> + continue;
> +
> + err = tegra_smmu_enable(smmu, swgroup, as->id);
> + if (err < 0)
> + pr_err("failed to enable SWGROUP#%u\n", swgroup);
> + }
> +
> + return 0;
> +}
> +
> +static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device *dev)
> +{
> + struct tegra_smmu_address_space *as = domain->priv;
> + struct tegra_smmu *smmu = as->smmu;
> + struct of_phandle_iter entry;
> + int err;
> +
> + of_property_for_each_phandle_with_args(entry, dev->of_node, "iommus",
> + "#iommu-cells", 0) {
> + unsigned int swgroup;
> +
> + if (entry.out_args.np != smmu->dev->of_node)
> + continue;
> +
> + swgroup = entry.out_args.args[0];
> +
> + err = tegra_smmu_disable(smmu, swgroup, as->id);
> + if (err < 0) {
> + pr_err("failed to enable SWGROUP#%u\n", swgroup);
> + }
> + }
> +}
> +
> +static u32 *as_get_pte(struct tegra_smmu_address_space *as, dma_addr_t iova,
> + struct page **pagep)
> +{
> + struct tegra_smmu *smmu = smmu_handle;
> + u32 *pd = page_address(as->pd), *pt;
> + u32 pde = (iova >> SMMU_PDE_SHIFT) & 0x3ff;
> + u32 pte = (iova >> SMMU_PTE_SHIFT) & 0x3ff;
> + struct page *page;
> + unsigned int i;
> +
> + if (pd[pde] != SMMU_PDE_VACANT(pde)) {
> + page = pfn_to_page(pd[pde] & SMMU_PFN_MASK);
> + pt = page_address(page);
> + } else {
> + page = alloc_page(GFP_KERNEL | __GFP_DMA);
> + if (!page)
> + return NULL;
> +
> + pt = page_address(page);
> + SetPageReserved(page);
> +
> + for (i = 0; i < SMMU_NUM_PTE; i++)
> + pt[i] = SMMU_PTE_VACANT(i);
> +
> + smmu->soc->ops->flush_dcache(page, 0, SMMU_SIZE_PT);
> +
> + pd[pde] = SMMU_MK_PDE(page, SMMU_PDE_ATTR | SMMU_PDE_NEXT);
> +
> + smmu->soc->ops->flush_dcache(as->pd, pde << 2, 4);
> + smmu_flush_ptc(smmu, as->pd, pde << 2);
> + smmu_flush_tlb_section(smmu, as->id, iova);
> + smmu_flush(smmu);
> + }
> +
> + *pagep = page;
> +
> + return &pt[pte];
> +}
> +
> +static int tegra_smmu_map(struct iommu_domain *domain, unsigned long iova,
> + phys_addr_t paddr, size_t size, int prot)
> +{
> + struct tegra_smmu_address_space *as = domain->priv;
> + struct tegra_smmu *smmu = smmu_handle;
> + unsigned long offset;
> + struct page *page;
> + u32 *pte;
> +
> + pte = as_get_pte(as, iova, &page);
> + if (!pte)
> + return -ENOMEM;
> +
> + offset = offset_in_page(pte);
> +
> + *pte = __phys_to_pfn(paddr) | SMMU_PTE_ATTR;
> +
> + smmu->soc->ops->flush_dcache(page, offset, 4);
> + smmu_flush_ptc(smmu, page, offset);
> + smmu_flush_tlb_group(smmu, as->id, iova);
> + smmu_flush(smmu);
> +
> + return 0;
> +}
> +
> +static size_t tegra_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
> + size_t size)
> +{
> + struct tegra_smmu_address_space *as = domain->priv;
> + struct tegra_smmu *smmu = smmu_handle;
> + unsigned long offset;
> + struct page *page;
> + u32 *pte;
> +
> + pte = as_get_pte(as, iova, &page);
> + if (!pte)
> + return 0;
> +
> + offset = offset_in_page(pte);
> + *pte = 0;
> +
> + smmu->soc->ops->flush_dcache(page, offset, 4);
> + smmu_flush_ptc(smmu, page, offset);
> + smmu_flush_tlb_group(smmu, as->id, iova);
> + smmu_flush(smmu);
> +
> + return size;
> +}
> +
> +static phys_addr_t tegra_smmu_iova_to_phys(struct iommu_domain *domain,
> + dma_addr_t iova)
> +{
> + struct tegra_smmu_address_space *as = domain->priv;
> + struct page *page;
> + unsigned long pfn;
> + u32 *pte;
> +
> + pte = as_get_pte(as, iova, &page);
> + pfn = *pte & SMMU_PFN_MASK;
> +
> + return PFN_PHYS(pfn);
> +}
> +
> +static int tegra_smmu_attach(struct iommu *iommu, struct device *dev)
> +{
> + struct tegra_smmu *smmu = to_tegra_smmu(iommu);
> + struct tegra_smmu_group *group;
> + unsigned int i;
> +
> + for (i = 0; i < smmu->soc->num_groups; i++) {
> + group = iommu_group_get_iommudata(smmu->groups[i]);
> +
> + if (of_match_node(group->matches, dev->of_node)) {
> + pr_debug("adding device %s to group %s\n",
> + dev_name(dev), group->name);
> + iommu_group_add_device(smmu->groups[i], dev);
> + break;
> + }
> + }
> +
> + if (i == smmu->soc->num_groups)
> + return 0;
> +
> +#ifndef CONFIG_ARM64
> + return arm_iommu_attach_device(dev, group->mapping);
> +#else
> + return 0;
> +#endif
> +}
> +
> +static int tegra_smmu_detach(struct iommu *iommu, struct device *dev)
> +{
> + return 0;
> +}
> +
> +static const struct iommu_ops tegra_smmu_ops = {
> + .domain_init = tegra_smmu_domain_init,
> + .domain_destroy = tegra_smmu_domain_destroy,
> + .attach_dev = tegra_smmu_attach_dev,
> + .detach_dev = tegra_smmu_detach_dev,
> + .map = tegra_smmu_map,
> + .unmap = tegra_smmu_unmap,
> + .iova_to_phys = tegra_smmu_iova_to_phys,
> + .attach = tegra_smmu_attach,
> + .detach = tegra_smmu_detach,
> +
> + .pgsize_bitmap = SZ_4K,
> +};
> +
> +static struct tegra_smmu *tegra_smmu_probe(struct device *dev,
> + const struct tegra_smmu_soc *soc,
> + void __iomem *regs)
> +{
> + struct tegra_smmu *smmu;
> + unsigned int i;
> + size_t size;
> + u32 value;
> + int err;
> +
> + smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
> + if (!smmu)
> + return ERR_PTR(-ENOMEM);
> +
> + size = BITS_TO_LONGS(soc->num_asids) * sizeof(long);
> +
> + smmu->asids = devm_kzalloc(dev, size, GFP_KERNEL);
> + if (!smmu->asids)
> + return ERR_PTR(-ENOMEM);
> +
> + INIT_LIST_HEAD(&smmu->iommu.list);
> + mutex_init(&smmu->lock);
> +
> + smmu->iommu.ops = &tegra_smmu_ops;
> + smmu->iommu.dev = dev;
> +
> + smmu->regs = regs;
> + smmu->soc = soc;
> + smmu->dev = dev;
> +
> + smmu_handle = smmu;
> + bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
> +
> + smmu->num_groups = soc->num_groups;
> +
> + smmu->groups = devm_kcalloc(dev, smmu->num_groups, sizeof(*smmu->groups),
> + GFP_KERNEL);
> + if (!smmu->groups)
> + return ERR_PTR(-ENOMEM);
> +
> + for (i = 0; i < smmu->num_groups; i++) {
> + struct tegra_smmu_group *group;
> +
> + smmu->groups[i] = iommu_group_alloc();
> + if (IS_ERR(smmu->groups[i]))
> + return ERR_CAST(smmu->groups[i]);
> +
> + err = iommu_group_set_name(smmu->groups[i], soc->groups[i].name);
> + if (err < 0) {
> + }
> +
> + group = kzalloc(sizeof(*group), GFP_KERNEL);
> + if (!group)
> + return ERR_PTR(-ENOMEM);
> +
> + group->matches = soc->groups[i].matches;
> + group->asid = soc->groups[i].asid;
> + group->name = soc->groups[i].name;
> +
> + iommu_group_set_iommudata(smmu->groups[i], group,
> + tegra_smmu_group_release);
> +
> +#ifndef CONFIG_ARM64
> + group->mapping = arm_iommu_create_mapping(&platform_bus_type,
> + 0, SZ_2G);
> + if (IS_ERR(group->mapping)) {
> + dev_err(dev, "failed to create mapping for group %s: %ld\n",
> + group->name, PTR_ERR(group->mapping));
> + return ERR_CAST(group->mapping);
> + }
> +#endif
> + }
> +
> + value = (1 << 29) | (8 << 24) | 0x3f;
> + smmu_writel(smmu, value, 0x18);
> +
> + value = (1 << 29) | (1 << 28) | 0x20;
> + smmu_writel(smmu, value, 0x014);
> +
> + smmu_flush_ptc(smmu, NULL, 0);
> + smmu_flush_tlb(smmu);
> + smmu_writel(smmu, SMMU_CONFIG_ENABLE, SMMU_CONFIG);
> + smmu_flush(smmu);
> +
> + err = iommu_add(&smmu->iommu);
> + if (err < 0)
> + return ERR_PTR(err);
> +
> + return smmu;
> +}
> +
> +static int tegra_smmu_remove(struct tegra_smmu *smmu)
> +{
> + iommu_remove(&smmu->iommu);
> +
> + return 0;
> +}
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static const struct tegra_smmu_soc tegra124_smmu_soc = {
> + .groups = tegra124_smmu_groups,
> + .num_groups = ARRAY_SIZE(tegra124_smmu_groups),
> + .clients = tegra124_mc_clients,
> + .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> + .swgroups = tegra124_swgroups,
> + .num_swgroups = ARRAY_SIZE(tegra124_swgroups),
> + .num_asids = 128,
> + .atom_size = 32,
> + .ops = &tegra124_smmu_ops,
> +};
> +#endif
> +
> +static const struct tegra_smmu_soc tegra132_smmu_soc = {
> + .groups = tegra124_smmu_groups,
> + .num_groups = ARRAY_SIZE(tegra124_smmu_groups),
> + .clients = tegra124_mc_clients,
> + .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> + .swgroups = tegra124_swgroups,
> + .num_swgroups = ARRAY_SIZE(tegra124_swgroups),
> + .num_asids = 128,
> + .atom_size = 32,
> + .ops = &tegra132_smmu_ops,
> +};
> +
> +struct tegra_mc {
> + struct device *dev;
> + struct tegra_smmu *smmu;
> + void __iomem *regs;
> + int irq;
> +
> + const struct tegra_mc_soc *soc;
> +};
> +
> +static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
> +{
> + return readl(mc->regs + offset);
> +}
> +
> +static inline void mc_writel(struct tegra_mc *mc, u32 value, unsigned long offset)
> +{
> + writel(value, mc->regs + offset);
> +}
> +
> +struct tegra_mc_soc {
> + const struct tegra_mc_client *clients;
> + unsigned int num_clients;
> +
> + const struct tegra_smmu_soc *smmu;
> +};
> +
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> +static const struct tegra_mc_soc tegra124_mc_soc = {
> + .clients = tegra124_mc_clients,
> + .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> + .smmu = &tegra124_smmu_soc,
> +};
> +#endif
> +
> +static const struct tegra_mc_soc tegra132_mc_soc = {
> + .clients = tegra124_mc_clients,
> + .num_clients = ARRAY_SIZE(tegra124_mc_clients),
> + .smmu = &tegra132_smmu_soc,
> +};
> +
> +static const struct of_device_id tegra_mc_of_match[] = {
> +#ifdef CONFIG_ARCH_TEGRA_124_SOC
> + { .compatible = "nvidia,tegra124-mc", .data = &tegra124_mc_soc },
> +#endif
> + { .compatible = "nvidia,tegra132-mc", .data = &tegra132_mc_soc },
> + { }
> +};
> +
> +static irqreturn_t tegra124_mc_irq(int irq, void *data)
> +{
> + struct tegra_mc *mc = data;
> + u32 value, status, mask;
> +
> + /* mask all interrupts to avoid flooding */
> + mask = mc_readl(mc, MC_INTMASK);
> + mc_writel(mc, 0, MC_INTMASK);
> +
> + status = mc_readl(mc, MC_INTSTATUS);
> + mc_writel(mc, status, MC_INTSTATUS);
> +
> + dev_dbg(mc->dev, "INTSTATUS: %08x\n", status);
> +
> + if (status & MC_INT_DECERR_MTS)
> + dev_dbg(mc->dev, " DECERR_MTS\n");
> +
> + if (status & MC_INT_SECERR_SEC)
> + dev_dbg(mc->dev, " SECERR_SEC\n");
> +
> + if (status & MC_INT_DECERR_VPR)
> + dev_dbg(mc->dev, " DECERR_VPR\n");
> +
> + if (status & MC_INT_INVALID_APB_ASID_UPDATE)
> + dev_dbg(mc->dev, " INVALID_APB_ASID_UPDATE\n");
> +
> + if (status & MC_INT_INVALID_SMMU_PAGE)
> + dev_dbg(mc->dev, " INVALID_SMMU_PAGE\n");
> +
> + if (status & MC_INT_ARBITRATION_EMEM)
> + dev_dbg(mc->dev, " ARBITRATION_EMEM\n");
> +
> + if (status & MC_INT_SECURITY_VIOLATION)
> + dev_dbg(mc->dev, " SECURITY_VIOLATION\n");
> +
> + if (status & MC_INT_DECERR_EMEM)
> + dev_dbg(mc->dev, " DECERR_EMEM\n");
> +
> + value = mc_readl(mc, MC_ERR_STATUS);
> +
> + dev_dbg(mc->dev, "ERR_STATUS: %08x\n", value);
> + dev_dbg(mc->dev, " type: %x\n", (value >> 28) & 0x7);
> + dev_dbg(mc->dev, " protection: %x\n", (value >> 25) & 0x7);
> + dev_dbg(mc->dev, " adr_hi: %x\n", (value >> 20) & 0x3);
> + dev_dbg(mc->dev, " swap: %x\n", (value >> 18) & 0x1);
> + dev_dbg(mc->dev, " security: %x\n", (value >> 17) & 0x1);
> + dev_dbg(mc->dev, " r/w: %x\n", (value >> 16) & 0x1);
> + dev_dbg(mc->dev, " adr1: %x\n", (value >> 12) & 0x7);
> + dev_dbg(mc->dev, " client: %x\n", value & 0x7f);
> +
> + value = mc_readl(mc, MC_ERR_ADR);
> + dev_dbg(mc->dev, "ERR_ADR: %08x\n", value);
> +
> + mc_writel(mc, mask, MC_INTMASK);
> +
> + return IRQ_HANDLED;
> +}
> +
> +static int tegra_mc_probe(struct platform_device *pdev)
> +{
> + const struct of_device_id *match;
> + struct resource *res;
> + struct tegra_mc *mc;
> + unsigned int i;
> + u32 value;
> + int err;
> +
> + match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
> + if (!match)
> + return -ENODEV;
> +
> + mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
> + if (!mc)
> + return -ENOMEM;
> +
> + platform_set_drvdata(pdev, mc);
> + mc->soc = match->data;
> + mc->dev = &pdev->dev;
> +
> + res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> + mc->regs = devm_ioremap_resource(&pdev->dev, res);
> + if (IS_ERR(mc->regs))
> + return PTR_ERR(mc->regs);
> +
> + for (i = 0; i < mc->soc->num_clients; i++) {
> + const struct latency_allowance *la = &mc->soc->clients[i].latency;
> + u32 value;
> +
> + value = readl(mc->regs + la->reg);
> + value &= ~(la->mask << la->shift);
> + value |= (la->def & la->mask) << la->shift;
> + writel(value, mc->regs + la->reg);
> + }
> +
> + mc->smmu = tegra_smmu_probe(&pdev->dev, mc->soc->smmu, mc->regs);
> + if (IS_ERR(mc->smmu)) {
> + dev_err(&pdev->dev, "failed to probe SMMU: %ld\n",
> + PTR_ERR(mc->smmu));
> + return PTR_ERR(mc->smmu);
> + }
> +
> + mc->irq = platform_get_irq(pdev, 0);
> + if (mc->irq < 0) {
> + dev_err(&pdev->dev, "interrupt not specified\n");
> + return mc->irq;
> + }
> +
> + err = devm_request_irq(&pdev->dev, mc->irq, tegra124_mc_irq,
> + IRQF_SHARED, dev_name(&pdev->dev), mc);
> + if (err < 0) {
> + dev_err(&pdev->dev, "failed to request IRQ#%u: %d\n", mc->irq,
> + err);
> + return err;
> + }
> +
> + value = MC_INT_DECERR_MTS | MC_INT_SECERR_SEC | MC_INT_DECERR_VPR |
> + MC_INT_INVALID_APB_ASID_UPDATE | MC_INT_INVALID_SMMU_PAGE |
> + MC_INT_ARBITRATION_EMEM | MC_INT_SECURITY_VIOLATION |
> + MC_INT_DECERR_EMEM;
> + mc_writel(mc, value, MC_INTMASK);
> +
> + return 0;
> +}
> +
> +static int tegra_mc_remove(struct platform_device *pdev)
> +{
> + struct tegra_mc *mc = platform_get_drvdata(pdev);
> + int err;
> +
> + err = tegra_smmu_remove(mc->smmu);
> + if (err < 0)
> + dev_err(&pdev->dev, "failed to remove SMMU: %d\n", err);
> +
> + return 0;
> +}
> +
> +static struct platform_driver tegra_mc_driver = {
> + .driver = {
> + .name = "tegra124-mc",
> + .of_match_table = tegra_mc_of_match,
> + },
> + .probe = tegra_mc_probe,
> + .remove = tegra_mc_remove,
> +};
> +module_platform_driver(tegra_mc_driver);
> +
> +MODULE_AUTHOR("Thierry Reding <[email protected]>");
> +MODULE_DESCRIPTION("NVIDIA Tegra124 Memory Controller driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
> new file mode 100644
> index 000000000000..6b1617ce022f
> --- /dev/null
> +++ b/include/dt-bindings/memory/tegra124-mc.h
> @@ -0,0 +1,30 @@
> +#ifndef DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +#define DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +
> +#define TEGRA_SWGROUP_DC 0
> +#define TEGRA_SWGROUP_DCB 1
> +#define TEGRA_SWGROUP_AFI 2
> +#define TEGRA_SWGROUP_AVPC 3
> +#define TEGRA_SWGROUP_HDA 4
> +#define TEGRA_SWGROUP_HC 5
> +#define TEGRA_SWGROUP_MSENC 6
> +#define TEGRA_SWGROUP_PPCS 7
> +#define TEGRA_SWGROUP_SATA 8
> +#define TEGRA_SWGROUP_VDE 9
> +#define TEGRA_SWGROUP_MPCORELP 10
> +#define TEGRA_SWGROUP_MPCORE 11
> +#define TEGRA_SWGROUP_ISP2 12
> +#define TEGRA_SWGROUP_XUSB_HOST 13
> +#define TEGRA_SWGROUP_XUSB_DEV 14
> +#define TEGRA_SWGROUP_ISP2B 15
> +#define TEGRA_SWGROUP_TSEC 16
> +#define TEGRA_SWGROUP_A9AVP 17
> +#define TEGRA_SWGROUP_GPU 18
> +#define TEGRA_SWGROUP_SDMMC1A 19
> +#define TEGRA_SWGROUP_SDMMC2A 20
> +#define TEGRA_SWGROUP_SDMMC3A 21
> +#define TEGRA_SWGROUP_SDMMC4A 22
> +#define TEGRA_SWGROUP_VIC 23
> +#define TEGRA_SWGROUP_VI 24
> +
> +#endif
> --
> 2.0.0
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Hi Thierry,
On Thu, Jun 26, 2014 at 09:49:42PM +0100, Thierry Reding wrote:
> From: Thierry Reding <[email protected]>
>
> This commit introduces a generic device tree binding for IOMMU devices.
> Only a very minimal subset is described here, but it is enough to cover
> the requirements of both the Exynos System MMU and Tegra SMMU as
> discussed here:
>
> https://lkml.org/lkml/2014/4/27/346
>
> Signed-off-by: Thierry Reding <[email protected]>
[...]
> +Required properties:
> +--------------------
> +- #iommu-cells: The number of cells in an IOMMU specifier needed to encode an
> + address.
> +
> +Typical values for the above include:
> +- #iommu-cells = <0>: Single master IOMMU devices are not configurable and
> + therefore no additional information needs to be encoded in the specifier.
> + This may also apply to multiple master IOMMU devices that do not allow the
> + association of masters to be configured.
A multiple-master capable IOMMU could be built with a single master, but
we'd still need #iommu-cells > 0 here. I appreciate this is just an example,
but the wording sounds like it's enforced.
> +- #iommu-cells = <1>: Multiple master IOMMU devices may need to be configured
> + in order to enable translation for a given master. In such cases the single
> + address cell corresponds to the master device's ID.
Again, we will definitely need more than one cell in this case, as I fully
expect multiple StreamIDs for each master (e.g. Qualcomm mentioned on the
list the other day that they have a master emitting 43 unique IDs).
Anyway, the actual binding looks great, I just don't want people to think
they need to do something different because they don't fit your example
use-cases.
> +Multiple-master IOMMU:
> +----------------------
> +
> + iommu {
> + /* the specifier represents the ID of the master */
> + #iommu-cells = <1>;
> + };
> +
> + master {
> + /* device has master ID 42 in the IOMMU */
> + iommus = <&/iommu 42>;
> + };
> +
> +Multiple-master IOMMU with configurable DMA window:
> +---------------------------------------------------
> +
> + / {
> + #address-cells = <1>;
> + #size-cells = <1>;
> +
> + iommu {
> + /* master ID, address and length of DMA window */
> + #iommu-cells = <4>;
> + };
> +
> + master {
> + /* master ID 42, 4 GiB DMA window starting at 0 */
> + iommus = <&/iommu 42 0 0x1 0x0>;
> + };
> + };
Could you also please include an example of a master with multiple IDs?
Will
On 06/27/2014 05:08 AM, Thierry Reding wrote:
> On Fri, Jun 27, 2014 at 12:46:38PM +0300, Hiroshi DOyu wrote:
>>
>> Thierry Reding <[email protected]> writes:
>>
>>> From: Thierry Reding <[email protected]>
>>>
>>> The memory controller on NVIDIA Tegra124 exposes various knobs that can
>>> be used to tune the behaviour of the clients attached to it.
>>>
>>> Currently this driver sets up the latency allowance registers to the HW
>>> defaults. Eventually an API should be exported by this driver (via a
>>> custom API or a generic subsystem) to allow clients to register latency
>>> requirements.
>>>
>>> This driver also registers an IOMMU (SMMU) that's implemented by the
>>> memory controller.
>>>
>>> Signed-off-by: Thierry Reding <[email protected]>
>>> ---
>>> drivers/memory/Kconfig | 9 +
>>> drivers/memory/Makefile | 1 +
>>> drivers/memory/tegra124-mc.c | 1945 ++++++++++++++++++++++++++++++
>>> include/dt-bindings/memory/tegra124-mc.h | 30 +
>>> 4 files changed, 1985 insertions(+)
>>> create mode 100644 drivers/memory/tegra124-mc.c
>>> create mode 100644 include/dt-bindings/memory/tegra124-mc.h
>>
>> I prefer reusing the existing SMMU and having MC and SMMU separated
>> since most of SMMU code are not different from functionality POV, and
>> new MC features are quite independent of SMMU.
>>
>> If it's really convenient to combine MC and SMMU into one driver, we
>> could move "drivers/iomm/tegra-smmu.c" here first, and add MC features
>> on the top of it.
>
> I'm not sure if we can do that, since the tegra-smmu driver is
> technically used by Tegra30 and Tegra114. We've never really made use of
> it, but there are device trees in mainline releases that contain the
> separate SMMU node.
The existing DT nodes do nothing more than instantiate the driver.
However, IIUC nothing actually uses the driver for any purpose, so if we
simply deleted those nodes or changed them incompatibly, there'd be no
functional difference. Perhaps this is stretching DT ABIness very
slightly, but I think it makes no practical difference.
On 06/27/2014 05:15 AM, Thierry Reding wrote:
> On Fri, Jun 27, 2014 at 01:07:04PM +0200, Arnd Bergmann wrote:
>> On Thursday 26 June 2014 22:49:44 Thierry Reding wrote:
>>> +static const struct tegra_mc_client tegra124_mc_clients[] = {
>>> + {
>>> + .id = 0x01,
>>> + .name = "display0a",
>>> + .swgroup = TEGRA_SWGROUP_DC,
>>> + .smmu = {
>>> + .reg = 0x228,
>>> + .bit = 1,
>>> + },
>>> + .latency = {
>>> + .reg = 0x2e8,
>>> + .shift = 0,
>>> + .mask = 0xff,
>>> + .def = 0xc2,
>>> + },
>>> + }, {
>>
>> This is a rather long table that I assume would need to get duplicated
>> and modified for each specific SoC. Have you considered to put the information
>> into DT instead, as auxiliary data in the iommu specifier as provided by
>> the device?
>
> Most of this data really is register information and I don't think that
> belongs in DT.
I agree. I think it's quite inappropriate to put information into DT
that could simply be put into a table in the driver. If the information
is put into DT, you have to define a fixed binding for it, munge the
table and data representation to fit DT's much less flexible (than C
structs/arrays) syntax, write a whole bunch of code to parse it back out
(at probably not do a good job with error-checking), all only to end up
with exactly the same C structs in the driver at the end of the process.
Oh, and if multiple SoCs use the same data values, you have to duplicate
those tables into at least the DTBs if not in the .dts files, whereas
with C you can just point at the same struct.
SoCs come out much less frequently than new boards (perhaps ignoring the
fact that we support a small subset of boards in mainline, so the
frequency isn't too dissimilar there). It makes good sense to put
board-to-board differences in DT, but I see little point in putting
static SoC information into DT.
On 06/26/2014 02:49 PM, Thierry Reding wrote:
> From: Thierry Reding <[email protected]>
>
> This commit introduces a generic device tree binding for IOMMU devices.
> Only a very minimal subset is described here, but it is enough to cover
> the requirements of both the Exynos System MMU and Tegra SMMU as
> discussed here:
>
> https://lkml.org/lkml/2014/4/27/346
> diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt
> +When an "iommus" property is specified in a device tree node, the IOMMU will
> +be used for address translation. If a "dma-ranges" property exists in the
> +device's parent node it will be ignored. An exception to this rule is if the
> +referenced IOMMU is disabled, in which case the "dma-ranges" property of the
> +parent shall take effect.
I wonder how useful that paragraph is. The fact that someone disabled a
particular IOMMU's node doesn't necessarily mean that the HW can
actually do that; an IOMMU might always be active in HW and always
translate accesses by some master. In that case, the fallback to
dma-ranges wouldn't correlate with what the HW actually does. Perhaps
all we need is to add a note to that effect here?
On 06/26/2014 02:49 PM, Thierry Reding wrote:
> From: Thierry Reding <[email protected]>
>
> The memory controller on NVIDIA Tegra124 exposes various knobs that can
> be used to tune the behaviour of the clients attached to it.
>
> Currently this driver sets up the latency allowance registers to the HW
> defaults. Eventually an API should be exported by this driver (via a
> custom API or a generic subsystem) to allow clients to register latency
> requirements.
>
> This driver also registers an IOMMU (SMMU) that's implemented by the
> memory controller.
> diff --git a/drivers/memory/Kconfig b/drivers/memory/Kconfig
> +config TEGRA124_MC
> + bool "Tegra124 Memory Controller driver"
> + depends on ARCH_TEGRA
Does it make sense to default to y for system-level drivers like this?
> diff --git a/drivers/memory/tegra124-mc.c b/drivers/memory/tegra124-mc.c
As a general comment, I wonder why the Tegra124 code/data here is
ifdef'd based on CONFIG_ARCH_TEGRA_124_SOC but the Tegra132 code isn't
ifdef'd at all. I'd assert that the Tegra124 code is small enough it's
hardly worth worrying about ifdefs.
> +static inline void smmu_flush_ptc(struct tegra_smmu *smmu, struct page *page,
> + unsigned long offset)
> +{
> + phys_addr_t phys = page ? page_to_phys(page) : 0;
> + u32 value;
> +
> + if (page) {
> + offset &= ~(smmu->soc->atom_size - 1);
> +
> +#ifdef CONFIG_PHYS_ADDR_T_64BIT
> + value = (phys >> 32) & SMMU_PTC_FLUSH_HI_MASK;
> +#else
> + value = 0;
> +#endif
Shouldn't Tegra124 have CONFIG_PHYS_ADDR_T_64BIT defined, such that
there's no need for this ifdef? Certainly Tegra124 {has,can have} RAM
above 4GB physical, for some memory map layouts (i.e. non swiss cheese).
(I assume most of this code matches the existing Tegra30 SMMU driver, so
I didn't look at all of it that closely).
> +static int tegra_smmu_attach(struct iommu *iommu, struct device *dev)
...
> +#ifndef CONFIG_ARM64
> + return arm_iommu_attach_device(dev, group->mapping);
> +#else
> + return 0;
> +#endif
Hmm. Why must an SMMU driver for the exact same HW operate differently
depending on the CPU that's attached to the SoC? Surely the requirements
for how IOMMU drives should work should be the same for all architectures?
> +static int tegra_mc_probe(struct platform_device *pdev)
> + err = devm_request_irq(&pdev->dev, mc->irq, tegra124_mc_irq,
> + IRQF_SHARED, dev_name(&pdev->dev), mc);
I don't see any code in tegra_mc_remove() that guarantees that the IRQ
won't fire between tegra_mc_remove() returning, and the devm cleanup
code running to unhook that IRQ handler.
> diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
This file is part of the DT binding, so should be added in the patch
that adds the binding.
Thierry Reding <[email protected]> writes:
> diff --git a/include/dt-bindings/memory/tegra124-mc.h b/include/dt-bindings/memory/tegra124-mc.h
> new file mode 100644
> index 000000000000..6b1617ce022f
> --- /dev/null
> +++ b/include/dt-bindings/memory/tegra124-mc.h
> @@ -0,0 +1,30 @@
> +#ifndef DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +#define DT_BINDINGS_MEMORY_TEGRA124_MC_H
> +
> +#define TEGRA_SWGROUP_DC 0
> +#define TEGRA_SWGROUP_DCB 1
> +#define TEGRA_SWGROUP_AFI 2
> +#define TEGRA_SWGROUP_AVPC 3
> +#define TEGRA_SWGROUP_HDA 4
> +#define TEGRA_SWGROUP_HC 5
> +#define TEGRA_SWGROUP_MSENC 6
> +#define TEGRA_SWGROUP_PPCS 7
> +#define TEGRA_SWGROUP_SATA 8
> +#define TEGRA_SWGROUP_VDE 9
> +#define TEGRA_SWGROUP_MPCORELP 10
> +#define TEGRA_SWGROUP_MPCORE 11
> +#define TEGRA_SWGROUP_ISP2 12
> +#define TEGRA_SWGROUP_XUSB_HOST 13
> +#define TEGRA_SWGROUP_XUSB_DEV 14
> +#define TEGRA_SWGROUP_ISP2B 15
> +#define TEGRA_SWGROUP_TSEC 16
> +#define TEGRA_SWGROUP_A9AVP 17
> +#define TEGRA_SWGROUP_GPU 18
> +#define TEGRA_SWGROUP_SDMMC1A 19
> +#define TEGRA_SWGROUP_SDMMC2A 20
> +#define TEGRA_SWGROUP_SDMMC3A 21
> +#define TEGRA_SWGROUP_SDMMC4A 22
> +#define TEGRA_SWGROUP_VIC 23
> +#define TEGRA_SWGROUP_VI 24
> +
> +#endif
In the SMMUv8 patch series, I have assigned unique IDs for all those
HWAs among Tegra SoC generations so that DT can provide which HWAs are
attached to that SoC. The SMMUv8 driver would be unified among Tegra
SoCs, then.
> -----Original Message-----
> From: [email protected] [mailto:iommu-
> [email protected]] On Behalf Of Thierry Reding
> Sent: Friday, June 27, 2014 12:29 PM
> To: Rob Herring; Pawel Moll; Mark Rutland; Ian Campbell; Kumar Gala;
> Stephen Warren; Arnd Bergmann; Will Deacon; Joerg Roedel
> Cc: Olav Haugan; [email protected]; Grant Grundler; Rhyland
> Klein; [email protected]; [email protected];
> Marc Zyngier; Allen Martin; Paul Walmsley; [email protected];
> Cho KyongHo; Dave Martin; [email protected]
> Subject: Re: [RFC 01/10] iommu: Add IOMMU device registry
>
> On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> > From: Thierry Reding <[email protected]>
> >
> > Add an IOMMU device registry for drivers to register with and
> > implement a method for users of the IOMMU API to attach to an IOMMU
> > device. This allows to support deferred probing and gives the IOMMU
> > API a convenient hook to perform early initialization of a device if
> necessary.
> >
> > Signed-off-by: Thierry Reding <[email protected]>
> > ---
> > drivers/iommu/iommu.c | 93
> > +++++++++++++++++++++++++++++++++++++++++++++++++++
> > include/linux/iommu.h | 27 +++++++++++++++
> > 2 files changed, 120 insertions(+)
>
> I thought that perhaps I should elaborate on this a bit since I have a
> few ideas on how the API could be enhanced.
>
> > +static int of_iommu_attach(struct device *dev) {
> > + struct of_phandle_iter iter;
> > + struct iommu *iommu;
> > +
> > + mutex_lock(&iommus_lock);
> > +
> > + of_property_for_each_phandle_with_args(iter, dev->of_node,
> "iommus",
> > + "#iommu-cells", 0) {
> > + bool found = false;
> > + int err;
> > +
> > + /* skip disabled IOMMUs */
> > + if (!of_device_is_available(iter.out_args.np))
> > + continue;
> > +
> > + list_for_each_entry(iommu, &iommus, list) {
> > + if (iommu->dev->of_node == iter.out_args.np) {
> > + err = iommu->ops->attach(iommu, dev);
> > + if (err < 0) {
> > + }
> > +
> > + found = true;
> > + }
> > + }
> > +
> > + if (!found) {
> > + mutex_unlock(&iommus_lock);
> > + return -EPROBE_DEFER;
> > + }
> > + }
> > +
> > + mutex_unlock(&iommus_lock);
> > +
> > + return 0;
> > +}
> > +
> > +static int of_iommu_detach(struct device *dev) {
> > + /* TODO: implement */
> > + return -ENOSYS;
> > +}
> > +
> > +int iommu_attach(struct device *dev)
> > +{
> > + int err = 0;
> > +
> > + if (IS_ENABLED(CONFIG_OF) && dev->of_node) {
> > + err = of_iommu_attach(dev);
> > + if (!err)
> > + return 0;
> > + }
> > +
> > + return err;
> > +}
> > +EXPORT_SYMBOL_GPL(iommu_attach);
>
> I think it might make sense to introduce an explicit object for an IOMMU
> master attachment. Maybe something like:
>
> struct iommu_master {
> struct iommu *iommu;
> struct device *dev;
>
> ...
> };
>
> iommu_attach() could then return a pointer to that attachment and the
> IOMMU user driver could subsequently use that as a handle to access other
> parts of the API.
>
> The reason is that if we ever need to support more than a single master
> interface (and perhaps even multiple master interfaces on different
> IOMMUs) for a single device, then we need a way for the IOMMU user to
> differentiate between its master interfaces.
>
> > diff --git a/include/linux/iommu.h b/include/linux/iommu.h index
> > 284a4683fdc1..ac2ceef194d4 100644
> > --- a/include/linux/iommu.h
> > +++ b/include/linux/iommu.h
> > @@ -43,6 +43,17 @@ struct notifier_block; typedef int
> > (*iommu_fault_handler_t)(struct iommu_domain *,
> > struct device *, unsigned long, int, void *);
> >
> > +struct iommu {
> > + struct device *dev;
> > +
> > + struct list_head list;
> > +
> > + const struct iommu_ops *ops;
> > +};
>
> For reasons explained above, I also think that it would be a good idea to
> modify the iommu_ops functions to take a struct iommu * as their first
> argument. This may become important when one driver needs to support
> multiple IOMMU devices. With the current API drivers have to rely on
> global variables to track the driver-specific context. As far as I can
> tell, only .domain_init(), .add_device(), .remove_device() and
> .device_group(). .domain_init() could set up a pointer to struct iommu in
> struct iommu_domain so the functions dealing with domains could gain
> access to the IOMMU device via that pointer.
Would the proposed interface be an alternate to the add_device interface? Also, how would the iommu group creation work? We are dependent on device driver initialization to attach device an IOMMU, but the add_device allows creation of iommu_groups during bus probing.
Can't the same thing be achieved using the add device interface where an IOMMU driver can determine (in add_device) if the device is attached to a particular IOMMU. If the device is attached to that IOMMU then it can create the corresponding IOMMU group. IOMMU information can be stored in archdata.
-Varun
> -----Original Message-----
> From: [email protected] [mailto:iommu-
> [email protected]] On Behalf Of Thierry Reding
> Sent: Friday, June 27, 2014 2:20 AM
> To: Rob Herring; Pawel Moll; Mark Rutland; Ian Campbell; Kumar Gala;
> Stephen Warren; Arnd Bergmann; Will Deacon; Joerg Roedel
> Cc: Olav Haugan; [email protected]; Grant Grundler; Rhyland
> Klein; [email protected]; [email protected];
> Marc Zyngier; Allen Martin; Paul Walmsley; [email protected];
> Cho KyongHo; Dave Martin; [email protected]
> Subject: [PATCH v3 02/10] devicetree: Add generic IOMMU device tree
> bindings
>
> From: Thierry Reding <[email protected]>
>
> This commit introduces a generic device tree binding for IOMMU devices.
> Only a very minimal subset is described here, but it is enough to cover
> the requirements of both the Exynos System MMU and Tegra SMMU as
> discussed here:
>
> https://lkml.org/lkml/2014/4/27/346
>
> Signed-off-by: Thierry Reding <[email protected]>
> ---
> Changes in v3:
> - use #iommu-cells instead of #address-cells/#size-cells
> - drop optional iommu-names property
>
> Changes in v2:
> - add notes about "dma-ranges" property (drop note from commit message)
> - document priorities of "iommus" property vs. "dma-ranges" property
> - drop #iommu-cells in favour of #address-cells and #size-cells
> - remove multiple-master device example
>
> Documentation/devicetree/bindings/iommu/iommu.txt | 156
> ++++++++++++++++++++++
> 1 file changed, 156 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt
>
> diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt
> b/Documentation/devicetree/bindings/iommu/iommu.txt
> new file mode 100644
> index 000000000000..f8f03f057156
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/iommu/iommu.txt
> @@ -0,0 +1,156 @@
> +This document describes the generic device tree binding for IOMMUs and
> +their master(s).
> +
> +
> +IOMMU device node:
> +==================
> +
> +An IOMMU can provide the following services:
> +
> +* Remap address space to allow devices to access physical memory ranges
> +that
> + they otherwise wouldn't be capable of accessing.
> +
> + Example: 32-bit DMA to 64-bit physical addresses
> +
> +* Implement scatter-gather at page level granularity so that the device
> +does
> + not have to.
> +
> +* Provide system protection against "rogue" DMA by forcing all accesses
> +to go
> + through the IOMMU and faulting when encountering accesses to unmapped
> + address regions.
> +
> +* Provide address space isolation between multiple contexts.
> +
> + Example: Virtualization
> +
> +Device nodes compatible with this binding represent hardware with some
> +of the above capabilities.
> +
> +IOMMUs can be single-master or multiple-master. Single-master IOMMU
> +devices typically have a fixed association to the master device,
> +whereas multiple- master IOMMU devices can translate accesses from more
> than one master.
> +
> +The device tree node of the IOMMU device's parent bus must contain a
> +valid "dma-ranges" property that describes how the physical address
> +space of the IOMMU maps to memory. An empty "dma-ranges" property means
> +that there is a
> +1:1 mapping from IOMMU to memory.
> +
> +Required properties:
> +--------------------
> +- #iommu-cells: The number of cells in an IOMMU specifier needed to
> +encode an
> + address.
> +
> +Typical values for the above include:
> +- #iommu-cells = <0>: Single master IOMMU devices are not configurable
> +and
> + therefore no additional information needs to be encoded in the
> specifier.
> + This may also apply to multiple master IOMMU devices that do not
> +allow the
> + association of masters to be configured.
> +- #iommu-cells = <1>: Multiple master IOMMU devices may need to be
> +configured
> + in order to enable translation for a given master. In such cases the
> +single
> + address cell corresponds to the master device's ID.
> +- #iommu-cells = <4>: Some IOMMU devices allow the DMA window for
> +masters to
> + be configured. The first cell of the address in this may contain the
> +master
> + device's ID for example, while the second cell could contain the
> +start of
> + the DMA window for the given device. The length of the DMA window is
> +given
> + by the third and fourth cells.
> +
> +
> +IOMMU master node:
> +==================
> +
> +Devices that access memory through an IOMMU are called masters. A
> +device can have multiple master interfaces (to one or more IOMMU
> devices).
> +
> +Required properties:
> +--------------------
> +- iommus: A list of phandle and IOMMU specifier pairs that describe the
> +IOMMU
> + master interfaces of the device. One entry in the list describes one
> +master
> + interface of the device.
> +
> +When an "iommus" property is specified in a device tree node, the IOMMU
> +will be used for address translation. If a "dma-ranges" property exists
> +in the device's parent node it will be ignored. An exception to this
> +rule is if the referenced IOMMU is disabled, in which case the
> +"dma-ranges" property of the parent shall take effect.
> +
> +
> +Notes:
> +======
> +
> +One possible extension to the above is to use an "iommus" property
> +along with a "dma-ranges" property in a bus device node (such as PCI
> +host bridges). This can be useful to describe how children on the bus
> +relate to the IOMMU if they are not explicitly listed in the device
> +tree (e.g. PCI devices). However, the requirements of that use-case
> +haven't been fully determined yet. Implementing this is therefore not
> +recommended without further discussion and extension of this binding.
> +
> +
> +Examples:
> +=========
> +
> +Single-master IOMMU:
> +--------------------
> +
> + iommu {
> + #iommu-cells = <0>;
> + };
> +
> + master {
> + iommus = <&/iommu>;
> + };
> +
> +Multiple-master IOMMU with fixed associations:
> +----------------------------------------------
> +
> + /* multiple-master IOMMU */
> + iommu {
> + /*
> + * Masters are statically associated with this IOMMU and
> + * address translation is always enabled.
> + */
> + #iommu-cells = <0>;
> + };
> +
> + /* static association with IOMMU */
> + master@1 {
> + reg = <1>;
> + iommus = <&/iommu>;
> + };
> +
> + /* static association with IOMMU */
> + master@2 {
> + reg = <2>;
> + iommus = <&/iommu>;
> + };
> +
> +Multiple-master IOMMU:
> +----------------------
> +
> + iommu {
> + /* the specifier represents the ID of the master */
> + #iommu-cells = <1>;
> + };
> +
> + master {
> + /* device has master ID 42 in the IOMMU */
> + iommus = <&/iommu 42>;
> + };
> +
Master node corresponds to the device node, right? Master ID would correspond to Stream ID? We are already using "iommu-parent" property to link a device to its corresponding IOMMU. We can use the same property instead of using "iommus".
-Varun
On Friday 04 July 2014 06:42:48 Varun Sethi wrote:
> Master node corresponds to the device node, right? Master ID would correspond
> to Stream ID? We are already using "iommu-parent" property to link a device
> to its corresponding IOMMU. We can use the same property instead of using "iommus".
I don't see "iommu-parent" used anywhere, just "fsl,iommu-parent". We can
probably allow "fsl,iommu-parent" as an alias for "iommus" for backwards-
compatibility if that helps you on PowerPC. For ARM, I'd prefer to mandate
that we use just "iommus".
Arnd
On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> Add an IOMMU device registry for drivers to register with and implement
> a method for users of the IOMMU API to attach to an IOMMU device. This
> allows to support deferred probing and gives the IOMMU API a convenient
> hook to perform early initialization of a device if necessary.
Can you elaborate on why exactly you need this? The IOMMU-API is
designed to hide any details from the user about the available IOMMUs in
the system and which IOMMU handles which device. This looks like it is
going in a completly different direction from that.
Joerg
On Fri, Jul 04, 2014 at 01:05:30PM +0200, Joerg Roedel wrote:
> On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> > Add an IOMMU device registry for drivers to register with and implement
> > a method for users of the IOMMU API to attach to an IOMMU device. This
> > allows to support deferred probing and gives the IOMMU API a convenient
> > hook to perform early initialization of a device if necessary.
>
> Can you elaborate on why exactly you need this? The IOMMU-API is
> designed to hide any details from the user about the available IOMMUs in
> the system and which IOMMU handles which device. This looks like it is
> going in a completly different direction from that.
I need this primarily to properly serialize device probing order.
Without it the IOMMU may be probed later than its clients, in which case
the client drivers will assume that there is no IOMMU (iommu_present()
for the parent bus fails).
There are other ways around this, but I think we'll need to eventually
come up with something like this anyway. Consider for example what
happens when a device has master interfaces on two different IOMMUs. Not
only does the current model of having one and one only IOMMU per struct
bus_type break down, but also IOMMU masters will need a way to specify
which IOMMU they're talking to.
Thierry
On Fri, Jul 04, 2014 at 02:47:10PM +0100, Thierry Reding wrote:
> On Fri, Jul 04, 2014 at 01:05:30PM +0200, Joerg Roedel wrote:
> > On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> > > Add an IOMMU device registry for drivers to register with and implement
> > > a method for users of the IOMMU API to attach to an IOMMU device. This
> > > allows to support deferred probing and gives the IOMMU API a convenient
> > > hook to perform early initialization of a device if necessary.
> >
> > Can you elaborate on why exactly you need this? The IOMMU-API is
> > designed to hide any details from the user about the available IOMMUs in
> > the system and which IOMMU handles which device. This looks like it is
> > going in a completly different direction from that.
>
> I need this primarily to properly serialize device probing order.
> Without it the IOMMU may be probed later than its clients, in which case
> the client drivers will assume that there is no IOMMU (iommu_present()
> for the parent bus fails).
I can also vouch for needing *a* solution to this problem. The ARM SMMU (and
I think others) rely on initcall ordering rather than the driver probing
model to ensure the IOMMU is probed before any of its masters.
Will
On Friday 04 July 2014, Will Deacon wrote:
> On Fri, Jul 04, 2014 at 02:47:10PM +0100, Thierry Reding wrote:
> > On Fri, Jul 04, 2014 at 01:05:30PM +0200, Joerg Roedel wrote:
> > > On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> > > > Add an IOMMU device registry for drivers to register with and implement
> > > > a method for users of the IOMMU API to attach to an IOMMU device. This
> > > > allows to support deferred probing and gives the IOMMU API a convenient
> > > > hook to perform early initialization of a device if necessary.
> > >
> > > Can you elaborate on why exactly you need this? The IOMMU-API is
> > > designed to hide any details from the user about the available IOMMUs in
> > > the system and which IOMMU handles which device. This looks like it is
> > > going in a completly different direction from that.
> >
> > I need this primarily to properly serialize device probing order.
> > Without it the IOMMU may be probed later than its clients, in which case
> > the client drivers will assume that there is no IOMMU (iommu_present()
> > for the parent bus fails).
>
> I can also vouch for needing a solution to this problem. The ARM SMMU (and
> I think others) rely on initcall ordering rather than the driver probing
> model to ensure the IOMMU is probed before any of its masters.
I think it would be best to attach platform devices to IOMMUs from the
of_dma_configure() we just introduced. That still requires handling
IOMMUs special though, and I don't know how we should best deal
with that. It would not be too hard to scan for IOMMUs in DT first
and register them all in a way that we can later look them up
by phandle, but that would break down if we ever get nested IOMMUs.
Another possibility might be to register all devices as we do today,
including IOMMU devices, but return -EPROBE_DEFER from
platform_drv_probe() before we call into the driver's probe function
if the IOMMU has not been set up at that point.
For PCI devices, we need a different way of dealing with the IOMMUs,
some generic PCI code needs to be added to attach the correct IOMMU
to a newly added PCI device based on how the host bridge is configured.
We can probably for now get away with not worrying about any bus type
other than platform, amba or PCI: we don't use any other DMA master
capable bus on ARM, and other architectures can probably rely on
having only a single IOMMU implementation in the system.
Arnd
On Sun, Jul 06, 2014 at 08:17:22PM +0200, Arnd Bergmann wrote:
> On Friday 04 July 2014, Will Deacon wrote:
> > On Fri, Jul 04, 2014 at 02:47:10PM +0100, Thierry Reding wrote:
> > > On Fri, Jul 04, 2014 at 01:05:30PM +0200, Joerg Roedel wrote:
> > > > On Thu, Jun 26, 2014 at 10:49:41PM +0200, Thierry Reding wrote:
> > > > > Add an IOMMU device registry for drivers to register with and implement
> > > > > a method for users of the IOMMU API to attach to an IOMMU device. This
> > > > > allows to support deferred probing and gives the IOMMU API a convenient
> > > > > hook to perform early initialization of a device if necessary.
> > > >
> > > > Can you elaborate on why exactly you need this? The IOMMU-API is
> > > > designed to hide any details from the user about the available IOMMUs in
> > > > the system and which IOMMU handles which device. This looks like it is
> > > > going in a completly different direction from that.
> > >
> > > I need this primarily to properly serialize device probing order.
> > > Without it the IOMMU may be probed later than its clients, in which case
> > > the client drivers will assume that there is no IOMMU (iommu_present()
> > > for the parent bus fails).
> >
> > I can also vouch for needing a solution to this problem. The ARM SMMU (and
> > I think others) rely on initcall ordering rather than the driver probing
> > model to ensure the IOMMU is probed before any of its masters.
>
> I think it would be best to attach platform devices to IOMMUs from the
> of_dma_configure() we just introduced. That still requires handling
> IOMMUs special though, and I don't know how we should best deal
> with that. It would not be too hard to scan for IOMMUs in DT first
> and register them all in a way that we can later look them up
> by phandle, but that would break down if we ever get nested IOMMUs.
But even for nested IOMMUs each will have an associated device node, so
we could scan the tree up front. But given that it only solves the
problem partially I don't think that's a big advantage.
> Another possibility might be to register all devices as we do today,
> including IOMMU devices, but return -EPROBE_DEFER from
> platform_drv_probe() before we call into the driver's probe function
> if the IOMMU has not been set up at that point.
Right, Hiroshi already proposed a patch for that, but it was more or
less NAK'ed because people didn't want to have that functionality in the
device driver core.
> For PCI devices, we need a different way of dealing with the IOMMUs,
> some generic PCI code needs to be added to attach the correct IOMMU
> to a newly added PCI device based on how the host bridge is configured.
I'm curious. Without device tree, how do we find out what IOMMU a device
is connected to? Will it always be an ancestor of the device in the PCI
hierarchy?
> We can probably for now get away with not worrying about any bus type
> other than platform, amba or PCI: we don't use any other DMA master
> capable bus on ARM, and other architectures can probably rely on
> having only a single IOMMU implementation in the system.
Neither of the above proposals will work for cases where more than a
single IOMMU exists in the system. Currently we can only register one
IOMMU per bus and if we try to register a second IOMMU it will fail
(bus_set_iommu() returns -EBUSY).
Also, struct bus_type has only a pointer to a struct iommu_ops, but no
associated context. Hence my proposal, which I only posted partially
here since it didn't seem immediately relevant. But I guess to better
illustrate how I envisioned this to work, here goes:
The idea was to allow each device to have zero or more master on zero or
more IOMMUs. That's as general a case as it gets. Now to make this work
we'd need something like this:
struct iommu_master {
struct device *dev; /* the master device */
struct iommu *iommu; /* the IOMMU that dev masters */
struct list_head list; /* link in a list of all master
interfaces of dev */
};
Then we could store a list in struct device:
struct device {
...
struct list_head iommu_masters;
...
};
It was already mentioned in other threads that if a device does indeed
have more than one master interface, then it needs to control access to
them explicitly via the IOMMU API. Since we only have an API to allocate
an IRQ domain (which automatically forwards calls to the global IOMMU)
we'd need something new, such as:
master = iommu_get(dev, "foo");
or
master = iommu_get(dev, 0);
Or whichever variant we prefer. That could return a pointer to a struct
iommu_master, which could then be used to obtain a domain, like so:
domain = iommu_master_alloc_domain(master);
To make that work, as far as I can tell only very minimal changes would
have to be done to iommu_ops. Most of the functions take a pointer to a
struct iommu_domain anyway, we could extend it with a reference to the
parent of a domain. For that we'll need a structure that represents the
IOMMU device's context (which is what this patch introduces as struct
iommu).
The only functions in struct iommu_ops that deal with an IOMMU directly
are .add_device(), .remove_device() and .device_group(), although they
may become obsolete with the new APIs. Currently .add_device() and
.remove_device() are only used to register devices from a bus notifier
and that would be replaced by something more explicit like above. As for
device_group(), I don't see it being used at all currently.
Now for DMA mapping API integration we could make that use the first (or
only) IOMMU device registered. Perhaps we could even reject using this
layer of integration for multi-master devices, since it would be
difficult to tell whether or not the selected device is the correct one.
We still have the option to handle things mostly transparently with the
above by moving calls to iommu_get() into the core. But we also gain the
flexibility to work with multiple IOMMU contexts explicitly if required.
Thierry
On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <[email protected]> wrote:
> On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
> <[email protected]> wrote:
>> From: Thierry Reding <[email protected]>
>>
>> When an IOMMU device is available on the platform bus, allocate an IOMMU
>> domain and attach the display controllers to it. The display controllers
>> can then scan out non-contiguous buffers by mapping them through the
>> IOMMU.
>>
>
> Hi Thierry,
> A few comments from Stéphane and myself that came up while we were
> reviewing this for our tree.
>
>> Signed-off-by: Thierry Reding <[email protected]>
>> ---
>> drivers/gpu/drm/tegra/dc.c | 21 ++++
>> drivers/gpu/drm/tegra/drm.c | 17 ++++
>> drivers/gpu/drm/tegra/drm.h | 3 +
>> drivers/gpu/drm/tegra/fb.c | 16 ++-
>> drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
>> drivers/gpu/drm/tegra/gem.h | 4 +
>> 6 files changed, 273 insertions(+), 24 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
>> index afcca04f5367..0f7452d04811 100644
>> --- a/drivers/gpu/drm/tegra/dc.c
>> +++ b/drivers/gpu/drm/tegra/dc.c
>> @@ -9,6 +9,7 @@
>>
>> #include <linux/clk.h>
>> #include <linux/debugfs.h>
>> +#include <linux/iommu.h>
>> #include <linux/reset.h>
>>
>> #include "dc.h"
>> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
>> {
>> struct drm_device *drm = dev_get_drvdata(client->parent);
>> struct tegra_dc *dc = host1x_client_to_dc(client);
>> + struct tegra_drm *tegra = drm->dev_private;
>> int err;
>>
>> + if (tegra->domain) {
>> + err = iommu_attach_device(tegra->domain, dc->dev);
>> + if (err < 0) {
>> + dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
>> + err);
>> + return err;
>> + }
>
> [from Stéphane]
>
> shouldn't we call detach in the error paths below?
>
>
>> + }
>> +
>> drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
>> drm_mode_crtc_set_gamma_size(&dc->base, 256);
>> drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
>> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
>>
>> static int tegra_dc_exit(struct host1x_client *client)
>> {
>> + struct drm_device *drm = dev_get_drvdata(client->parent);
>> struct tegra_dc *dc = host1x_client_to_dc(client);
>> + struct tegra_drm *tegra = drm->dev_private;
>> int err;
>>
>> devm_free_irq(dc->dev, dc->irq, dc);
>> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
>> return err;
>> }
>>
>> + iommu_detach_device(tegra->domain, dc->dev);
>> +
>> return 0;
>> }
>>
>> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
>> return -ENXIO;
>> }
>>
>> + err = iommu_attach(&pdev->dev);
>> + if (err < 0) {
>> + dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
>> + return err;
>> + }
>> +
>> INIT_LIST_HEAD(&dc->client.list);
>> dc->client.ops = &dc_client_ops;
>> dc->client.dev = &pdev->dev;
>> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
>> index 59736bb810cd..1d2bbafad982 100644
>> --- a/drivers/gpu/drm/tegra/drm.c
>> +++ b/drivers/gpu/drm/tegra/drm.c
>> @@ -8,6 +8,7 @@
>> */
>>
>> #include <linux/host1x.h>
>> +#include <linux/iommu.h>
>>
>> #include "drm.h"
>> #include "gem.h"
>> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>> if (!tegra)
>> return -ENOMEM;
>>
>> + if (iommu_present(&platform_bus_type)) {
>> + tegra->domain = iommu_domain_alloc(&platform_bus_type);
>> + if (IS_ERR(tegra->domain)) {
>> + kfree(tegra);
>> + return PTR_ERR(tegra->domain);
>> + }
>> +
>> + drm_mm_init(&tegra->mm, 0, SZ_2G);
>
>
> [from Stéphane]:
>
> none of these are freed in the error path below (iommu_domain_free and
> drm_mm_takedown)
>
> also |tegra| isn't freed either?
>
>
>
>> + }
>> +
>> mutex_init(&tegra->clients_lock);
>> INIT_LIST_HEAD(&tegra->clients);
>> drm->dev_private = tegra;
>> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
>> static int tegra_drm_unload(struct drm_device *drm)
>> {
>> struct host1x_device *device = to_host1x_device(drm->dev);
>> + struct tegra_drm *tegra = drm->dev_private;
>> int err;
>>
>> drm_kms_helper_poll_fini(drm);
>> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
>> if (err < 0)
>> return err;
>>
>> + if (tegra->domain) {
>> + iommu_domain_free(tegra->domain);
>> + drm_mm_takedown(&tegra->mm);
>> + }
>> +
>> return 0;
>> }
>>
>> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
>> index 96d754e7b3eb..a07c796b7edc 100644
>> --- a/drivers/gpu/drm/tegra/drm.h
>> +++ b/drivers/gpu/drm/tegra/drm.h
>> @@ -39,6 +39,9 @@ struct tegra_fbdev {
>> struct tegra_drm {
>> struct drm_device *drm;
>>
>> + struct iommu_domain *domain;
>> + struct drm_mm mm;
>> +
>> struct mutex clients_lock;
>> struct list_head clients;
>>
>> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
>> index 7790d43ad082..21c65dd817c3 100644
>> --- a/drivers/gpu/drm/tegra/fb.c
>> +++ b/drivers/gpu/drm/tegra/fb.c
>> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
>> for (i = 0; i < fb->num_planes; i++) {
>> struct tegra_bo *bo = fb->planes[i];
>>
>> - if (bo)
>> + if (bo) {
>> + if (bo->pages && bo->virt)
>> + vunmap(bo->virt);
>> +
>> drm_gem_object_unreference_unlocked(&bo->gem);
>> + }
>> }
>>
>> drm_framebuffer_cleanup(framebuffer);
>> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
>> offset = info->var.xoffset * bytes_per_pixel +
>> info->var.yoffset * fb->pitches[0];
>>
>> + if (bo->pages) {
>> + bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
>> + pgprot_writecombine(PAGE_KERNEL));
>> + if (!bo->vaddr) {
>> + dev_err(drm->dev, "failed to vmap() framebuffer\n");
>> + err = -ENOMEM;
>> + goto destroy;
>> + }
>> + }
>> +
>> drm->mode_config.fb_base = (resource_size_t)bo->paddr;
>> info->screen_base = (void __iomem *)bo->vaddr + offset;
>> info->screen_size = size;
>> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
>> index c1e4e8b6e5ca..2912e61a2599 100644
>> --- a/drivers/gpu/drm/tegra/gem.c
>> +++ b/drivers/gpu/drm/tegra/gem.c
>> @@ -14,8 +14,10 @@
>> */
>>
>> #include <linux/dma-buf.h>
>> +#include <linux/iommu.h>
>> #include <drm/tegra_drm.h>
>>
>> +#include "drm.h"
>> #include "gem.h"
>>
>> static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
>> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
>> .kunmap = tegra_bo_kunmap,
>> };
>>
>> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
>> + dma_addr_t iova, int prot)
>> +{
>> + unsigned long offset = 0;
>> + struct scatterlist *sg;
>> + unsigned int i, j;
>> + int err;
>> +
>> + for_each_sg(sgt->sgl, sg, sgt->nents, i) {
>> + dma_addr_t phys = sg_phys(sg);
>> + size_t length = sg->offset;
>> +
>> + phys = sg_phys(sg) - sg->offset;
>> + length = sg->length + sg->offset;
>> +
>> + err = iommu_map(domain, iova + offset, phys, length, prot);
>> + if (err < 0)
>> + goto unmap;
>> +
>> + offset += length;
>> + }
>> +
>> + return 0;
>> +
>> +unmap:
>> + offset = 0;
>> +
>> + for_each_sg(sgt->sgl, sg, i, j) {
>> + size_t length = sg->length + sg->offset;
>> + iommu_unmap(domain, iova + offset, length);
>> + offset += length;
>> + }
>> +
>> + return err;
>> +}
>> +
>> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
>> + dma_addr_t iova)
>> +{
>> + unsigned long offset = 0;
>> + struct scatterlist *sg;
>> + unsigned int i;
>> +
>> + for_each_sg(sgt->sgl, sg, sgt->nents, i) {
>> + dma_addr_t phys = sg_phys(sg);
>> + size_t length = sg->offset;
>> +
>> + phys = sg_phys(sg) - sg->offset;
>> + length = sg->length + sg->offset;
>> +
>> + iommu_unmap(domain, iova + offset, length);
>> + offset += length;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
>> +{
>> + int prot = IOMMU_READ | IOMMU_WRITE;
>> + int err;
>> +
>> + if (bo->mm)
>> + return -EBUSY;
>> +
>> + bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
>> + if (!bo->mm)
>> + return -ENOMEM;
>> +
>> + err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
>> + PAGE_SIZE, 0, 0, 0);
>> + if (err < 0) {
>> + dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
>> + return err;
>> + }
>> +
>> + bo->paddr = bo->mm->start;
>> +
>> + err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
>> + if (err < 0) {
>> + dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
>> + return err;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
>> +{
>> + if (!bo->mm)
>> + return 0;
>> +
>> + iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
>> + drm_mm_remove_node(bo->mm);
>> +
>> + kfree(bo->mm);
>> + return 0;
>> +}
>> +
>> static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
>> {
>> - dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
>> + if (!bo->pages)
>> + dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
>> + bo->paddr);
One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
and bo->paddr == ~0 here, which causes a crash.
I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
condition in the mm code, but it seems like reviewer consensus is to
check for this before calling free.
As such, we'll need to make sure bo->vaddr != NULL before calling
dma_free_writecombine to avoid this situation.
Would you prefer I send a patch up to fix this separately, or would
you like to roll this into your next version?
Sean
>> + else
>> + drm_gem_put_pages(&bo->gem, bo->pages, true, true);
>> +}
>> +
>> +static int tegra_bo_get_pages(struct drm_device *drm, struct tegra_bo *bo,
>> + size_t size)
>> +{
>> + bo->pages = drm_gem_get_pages(&bo->gem, GFP_KERNEL);
>> + if (!bo->pages)
>> + return -ENOMEM;
>> +
>> + bo->num_pages = size >> PAGE_SHIFT;
>> +
>> + return 0;
>> +}
>> +
>> +static int tegra_bo_alloc(struct drm_device *drm, struct tegra_bo *bo,
>> + size_t size)
>> +{
>> + bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
>> + GFP_KERNEL | __GFP_NOWARN);
>> + if (!bo->vaddr) {
>> + dev_err(drm->dev, "failed to allocate buffer of size %zu\n",
>> + size);
>> + return -ENOMEM;
>> + }
>> +
>> + return 0;
>> }
>>
>> struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>> unsigned long flags)
>> {
>> + struct tegra_drm *tegra = drm->dev_private;
>> struct tegra_bo *bo;
>> int err;
>>
>> @@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>> host1x_bo_init(&bo->base, &tegra_bo_ops);
>> size = round_up(size, PAGE_SIZE);
>>
>> - bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
>> - GFP_KERNEL | __GFP_NOWARN);
>> - if (!bo->vaddr) {
>> - dev_err(drm->dev, "failed to allocate buffer with size %u\n",
>> - size);
>> - err = -ENOMEM;
>> - goto err_dma;
>> - }
>> -
>> err = drm_gem_object_init(drm, &bo->gem, size);
>> if (err)
>> - goto err_init;
>> + goto free;
>>
>> err = drm_gem_create_mmap_offset(&bo->gem);
>
> We need to call drm_gem_free_mmap_offset if one of the calls below
> fails, otherwise we'll try to free the mmap_offset on an already
> destroyed bo.
>
>
> Sean
>
>
>
>> if (err)
>> - goto err_mmap;
>> + goto release;
>> +
>> + if (tegra->domain) {
>> + err = tegra_bo_get_pages(drm, bo, size);
>> + if (err < 0)
>> + goto release;
>> +
>> + bo->sgt = drm_prime_pages_to_sg(bo->pages, bo->num_pages);
>> + if (IS_ERR(bo->sgt)) {
>> + err = PTR_ERR(bo->sgt);
>> + goto release;
>> + }
>> +
>> + err = tegra_bo_iommu_map(tegra, bo);
>> + if (err < 0)
>> + goto release;
>> + } else {
>> + err = tegra_bo_alloc(drm, bo, size);
>> + if (err < 0)
>> + goto release;
>> + }
>>
>> if (flags & DRM_TEGRA_GEM_CREATE_TILED)
>> bo->tiling.mode = TEGRA_BO_TILING_MODE_TILED;
>> @@ -133,11 +276,10 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
>>
>> return bo;
>>
>> -err_mmap:
>> +release:
>> drm_gem_object_release(&bo->gem);
>> -err_init:
>> tegra_bo_destroy(drm, bo);
>> -err_dma:
>> +free:
>> kfree(bo);
>>
>> return ERR_PTR(err);
>> @@ -172,6 +314,7 @@ err:
>> static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>> struct dma_buf *buf)
>> {
>> + struct tegra_drm *tegra = drm->dev_private;
>> struct dma_buf_attachment *attach;
>> struct tegra_bo *bo;
>> ssize_t size;
>> @@ -211,12 +354,19 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm,
>> goto detach;
>> }
>>
>> - if (bo->sgt->nents > 1) {
>> - err = -EINVAL;
>> - goto detach;
>> + if (tegra->domain) {
>> + err = tegra_bo_iommu_map(tegra, bo);
>> + if (err < 0)
>> + goto detach;
>> + } else {
>> + if (bo->sgt->nents > 1) {
>> + err = -EINVAL;
>> + goto detach;
>> + }
>> +
>> + bo->paddr = sg_dma_address(bo->sgt->sgl);
>> }
>>
>> - bo->paddr = sg_dma_address(bo->sgt->sgl);
>> bo->gem.import_attach = attach;
>>
>> return bo;
>> @@ -239,8 +389,12 @@ free:
>>
>> void tegra_bo_free_object(struct drm_gem_object *gem)
>> {
>> + struct tegra_drm *tegra = gem->dev->dev_private;
>> struct tegra_bo *bo = to_tegra_bo(gem);
>>
>> + if (tegra->domain)
>> + tegra_bo_iommu_unmap(tegra, bo);
>> +
>> if (gem->import_attach) {
>> dma_buf_unmap_attachment(gem->import_attach, bo->sgt,
>> DMA_TO_DEVICE);
>> @@ -301,7 +455,38 @@ int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
>> return 0;
>> }
>>
>> +static int tegra_bo_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>> +{
>> + struct drm_gem_object *gem = vma->vm_private_data;
>> + struct tegra_bo *bo = to_tegra_bo(gem);
>> + struct page *page;
>> + pgoff_t offset;
>> + int err;
>> +
>> + if (!bo->pages)
>> + return VM_FAULT_SIGBUS;
>> +
>> + offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >> PAGE_SHIFT;
>> + page = bo->pages[offset];
>> +
>> + err = vm_insert_page(vma, (unsigned long)vmf->virtual_address, page);
>> + switch (err) {
>> + case -EAGAIN:
>> + case 0:
>> + case -ERESTARTSYS:
>> + case -EINTR:
>> + case -EBUSY:
>> + return VM_FAULT_NOPAGE;
>> +
>> + case -ENOMEM:
>> + return VM_FAULT_OOM;
>> + }
>> +
>> + return VM_FAULT_SIGBUS;
>> +}
>> +
>> const struct vm_operations_struct tegra_bo_vm_ops = {
>> + .fault = tegra_bo_fault,
>> .open = drm_gem_vm_open,
>> .close = drm_gem_vm_close,
>> };
>> @@ -316,13 +501,18 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
>> if (ret)
>> return ret;
>>
>> + vma->vm_flags |= VM_MIXEDMAP;
>> + vma->vm_flags &= ~VM_PFNMAP;
>> +
>> gem = vma->vm_private_data;
>> bo = to_tegra_bo(gem);
>>
>> - ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
>> - vma->vm_end - vma->vm_start, vma->vm_page_prot);
>> - if (ret)
>> - drm_gem_vm_close(vma);
>> + if (!bo->pages) {
>> + ret = remap_pfn_range(vma, vma->vm_start, bo->paddr >> PAGE_SHIFT,
>> + vma->vm_end - vma->vm_start, vma->vm_page_prot);
>> + if (ret)
>> + drm_gem_vm_close(vma);
>> + }
>>
>> return ret;
>> }
>> diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
>> index 43a25c853357..c2e3f43e4b3f 100644
>> --- a/drivers/gpu/drm/tegra/gem.h
>> +++ b/drivers/gpu/drm/tegra/gem.h
>> @@ -37,6 +37,10 @@ struct tegra_bo {
>> dma_addr_t paddr;
>> void *vaddr;
>>
>> + struct drm_mm_node *mm;
>> + unsigned long num_pages;
>> + struct page **pages;
>> +
>> struct tegra_bo_tiling tiling;
>> };
>>
>> --
>> 2.0.0
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Please read the FAQ at http://www.tux.org/lkml/
On Wed, Oct 01, 2014 at 11:54:11AM -0400, Sean Paul wrote:
> On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <[email protected]> wrote:
> > On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
> > <[email protected]> wrote:
> >> From: Thierry Reding <[email protected]>
> >>
> >> When an IOMMU device is available on the platform bus, allocate an IOMMU
> >> domain and attach the display controllers to it. The display controllers
> >> can then scan out non-contiguous buffers by mapping them through the
> >> IOMMU.
> >>
> >
> > Hi Thierry,
> > A few comments from Stéphane and myself that came up while we were
> > reviewing this for our tree.
> >
> >> Signed-off-by: Thierry Reding <[email protected]>
> >> ---
> >> drivers/gpu/drm/tegra/dc.c | 21 ++++
> >> drivers/gpu/drm/tegra/drm.c | 17 ++++
> >> drivers/gpu/drm/tegra/drm.h | 3 +
> >> drivers/gpu/drm/tegra/fb.c | 16 ++-
> >> drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
> >> drivers/gpu/drm/tegra/gem.h | 4 +
> >> 6 files changed, 273 insertions(+), 24 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> >> index afcca04f5367..0f7452d04811 100644
> >> --- a/drivers/gpu/drm/tegra/dc.c
> >> +++ b/drivers/gpu/drm/tegra/dc.c
> >> @@ -9,6 +9,7 @@
> >>
> >> #include <linux/clk.h>
> >> #include <linux/debugfs.h>
> >> +#include <linux/iommu.h>
> >> #include <linux/reset.h>
> >>
> >> #include "dc.h"
> >> @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
> >> {
> >> struct drm_device *drm = dev_get_drvdata(client->parent);
> >> struct tegra_dc *dc = host1x_client_to_dc(client);
> >> + struct tegra_drm *tegra = drm->dev_private;
> >> int err;
> >>
> >> + if (tegra->domain) {
> >> + err = iommu_attach_device(tegra->domain, dc->dev);
> >> + if (err < 0) {
> >> + dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> >> + err);
> >> + return err;
> >> + }
> >
> > [from Stéphane]
> >
> > shouldn't we call detach in the error paths below?
> >
> >
> >> + }
> >> +
> >> drm_crtc_init(drm, &dc->base, &tegra_crtc_funcs);
> >> drm_mode_crtc_set_gamma_size(&dc->base, 256);
> >> drm_crtc_helper_add(&dc->base, &tegra_crtc_helper_funcs);
> >> @@ -1318,7 +1329,9 @@ static int tegra_dc_init(struct host1x_client *client)
> >>
> >> static int tegra_dc_exit(struct host1x_client *client)
> >> {
> >> + struct drm_device *drm = dev_get_drvdata(client->parent);
> >> struct tegra_dc *dc = host1x_client_to_dc(client);
> >> + struct tegra_drm *tegra = drm->dev_private;
> >> int err;
> >>
> >> devm_free_irq(dc->dev, dc->irq, dc);
> >> @@ -1335,6 +1348,8 @@ static int tegra_dc_exit(struct host1x_client *client)
> >> return err;
> >> }
> >>
> >> + iommu_detach_device(tegra->domain, dc->dev);
> >> +
> >> return 0;
> >> }
> >>
> >> @@ -1462,6 +1477,12 @@ static int tegra_dc_probe(struct platform_device *pdev)
> >> return -ENXIO;
> >> }
> >>
> >> + err = iommu_attach(&pdev->dev);
> >> + if (err < 0) {
> >> + dev_err(&pdev->dev, "failed to attach to IOMMU: %d\n", err);
> >> + return err;
> >> + }
> >> +
> >> INIT_LIST_HEAD(&dc->client.list);
> >> dc->client.ops = &dc_client_ops;
> >> dc->client.dev = &pdev->dev;
> >> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
> >> index 59736bb810cd..1d2bbafad982 100644
> >> --- a/drivers/gpu/drm/tegra/drm.c
> >> +++ b/drivers/gpu/drm/tegra/drm.c
> >> @@ -8,6 +8,7 @@
> >> */
> >>
> >> #include <linux/host1x.h>
> >> +#include <linux/iommu.h>
> >>
> >> #include "drm.h"
> >> #include "gem.h"
> >> @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >> if (!tegra)
> >> return -ENOMEM;
> >>
> >> + if (iommu_present(&platform_bus_type)) {
> >> + tegra->domain = iommu_domain_alloc(&platform_bus_type);
> >> + if (IS_ERR(tegra->domain)) {
> >> + kfree(tegra);
> >> + return PTR_ERR(tegra->domain);
> >> + }
> >> +
> >> + drm_mm_init(&tegra->mm, 0, SZ_2G);
> >
> >
> > [from Stéphane]:
> >
> > none of these are freed in the error path below (iommu_domain_free and
> > drm_mm_takedown)
> >
> > also |tegra| isn't freed either?
> >
> >
> >
> >> + }
> >> +
> >> mutex_init(&tegra->clients_lock);
> >> INIT_LIST_HEAD(&tegra->clients);
> >> drm->dev_private = tegra;
> >> @@ -71,6 +82,7 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> >> static int tegra_drm_unload(struct drm_device *drm)
> >> {
> >> struct host1x_device *device = to_host1x_device(drm->dev);
> >> + struct tegra_drm *tegra = drm->dev_private;
> >> int err;
> >>
> >> drm_kms_helper_poll_fini(drm);
> >> @@ -82,6 +94,11 @@ static int tegra_drm_unload(struct drm_device *drm)
> >> if (err < 0)
> >> return err;
> >>
> >> + if (tegra->domain) {
> >> + iommu_domain_free(tegra->domain);
> >> + drm_mm_takedown(&tegra->mm);
> >> + }
> >> +
> >> return 0;
> >> }
> >>
> >> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
> >> index 96d754e7b3eb..a07c796b7edc 100644
> >> --- a/drivers/gpu/drm/tegra/drm.h
> >> +++ b/drivers/gpu/drm/tegra/drm.h
> >> @@ -39,6 +39,9 @@ struct tegra_fbdev {
> >> struct tegra_drm {
> >> struct drm_device *drm;
> >>
> >> + struct iommu_domain *domain;
> >> + struct drm_mm mm;
> >> +
> >> struct mutex clients_lock;
> >> struct list_head clients;
> >>
> >> diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
> >> index 7790d43ad082..21c65dd817c3 100644
> >> --- a/drivers/gpu/drm/tegra/fb.c
> >> +++ b/drivers/gpu/drm/tegra/fb.c
> >> @@ -65,8 +65,12 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
> >> for (i = 0; i < fb->num_planes; i++) {
> >> struct tegra_bo *bo = fb->planes[i];
> >>
> >> - if (bo)
> >> + if (bo) {
> >> + if (bo->pages && bo->virt)
> >> + vunmap(bo->virt);
> >> +
> >> drm_gem_object_unreference_unlocked(&bo->gem);
> >> + }
> >> }
> >>
> >> drm_framebuffer_cleanup(framebuffer);
> >> @@ -252,6 +256,16 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
> >> offset = info->var.xoffset * bytes_per_pixel +
> >> info->var.yoffset * fb->pitches[0];
> >>
> >> + if (bo->pages) {
> >> + bo->vaddr = vmap(bo->pages, bo->num_pages, VM_MAP,
> >> + pgprot_writecombine(PAGE_KERNEL));
> >> + if (!bo->vaddr) {
> >> + dev_err(drm->dev, "failed to vmap() framebuffer\n");
> >> + err = -ENOMEM;
> >> + goto destroy;
> >> + }
> >> + }
> >> +
> >> drm->mode_config.fb_base = (resource_size_t)bo->paddr;
> >> info->screen_base = (void __iomem *)bo->vaddr + offset;
> >> info->screen_size = size;
> >> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
> >> index c1e4e8b6e5ca..2912e61a2599 100644
> >> --- a/drivers/gpu/drm/tegra/gem.c
> >> +++ b/drivers/gpu/drm/tegra/gem.c
> >> @@ -14,8 +14,10 @@
> >> */
> >>
> >> #include <linux/dma-buf.h>
> >> +#include <linux/iommu.h>
> >> #include <drm/tegra_drm.h>
> >>
> >> +#include "drm.h"
> >> #include "gem.h"
> >>
> >> static inline struct tegra_bo *host1x_to_tegra_bo(struct host1x_bo *bo)
> >> @@ -90,14 +92,144 @@ static const struct host1x_bo_ops tegra_bo_ops = {
> >> .kunmap = tegra_bo_kunmap,
> >> };
> >>
> >> +static int iommu_map_sg(struct iommu_domain *domain, struct sg_table *sgt,
> >> + dma_addr_t iova, int prot)
> >> +{
> >> + unsigned long offset = 0;
> >> + struct scatterlist *sg;
> >> + unsigned int i, j;
> >> + int err;
> >> +
> >> + for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> >> + dma_addr_t phys = sg_phys(sg);
> >> + size_t length = sg->offset;
> >> +
> >> + phys = sg_phys(sg) - sg->offset;
> >> + length = sg->length + sg->offset;
> >> +
> >> + err = iommu_map(domain, iova + offset, phys, length, prot);
> >> + if (err < 0)
> >> + goto unmap;
> >> +
> >> + offset += length;
> >> + }
> >> +
> >> + return 0;
> >> +
> >> +unmap:
> >> + offset = 0;
> >> +
> >> + for_each_sg(sgt->sgl, sg, i, j) {
> >> + size_t length = sg->length + sg->offset;
> >> + iommu_unmap(domain, iova + offset, length);
> >> + offset += length;
> >> + }
> >> +
> >> + return err;
> >> +}
> >> +
> >> +static int iommu_unmap_sg(struct iommu_domain *domain, struct sg_table *sgt,
> >> + dma_addr_t iova)
> >> +{
> >> + unsigned long offset = 0;
> >> + struct scatterlist *sg;
> >> + unsigned int i;
> >> +
> >> + for_each_sg(sgt->sgl, sg, sgt->nents, i) {
> >> + dma_addr_t phys = sg_phys(sg);
> >> + size_t length = sg->offset;
> >> +
> >> + phys = sg_phys(sg) - sg->offset;
> >> + length = sg->length + sg->offset;
> >> +
> >> + iommu_unmap(domain, iova + offset, length);
> >> + offset += length;
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int tegra_bo_iommu_map(struct tegra_drm *tegra, struct tegra_bo *bo)
> >> +{
> >> + int prot = IOMMU_READ | IOMMU_WRITE;
> >> + int err;
> >> +
> >> + if (bo->mm)
> >> + return -EBUSY;
> >> +
> >> + bo->mm = kzalloc(sizeof(*bo->mm), GFP_KERNEL);
> >> + if (!bo->mm)
> >> + return -ENOMEM;
> >> +
> >> + err = drm_mm_insert_node_generic(&tegra->mm, bo->mm, bo->gem.size,
> >> + PAGE_SIZE, 0, 0, 0);
> >> + if (err < 0) {
> >> + dev_err(tegra->drm->dev, "out of virtual memory: %d\n", err);
> >> + return err;
> >> + }
> >> +
> >> + bo->paddr = bo->mm->start;
> >> +
> >> + err = iommu_map_sg(tegra->domain, bo->sgt, bo->paddr, prot);
> >> + if (err < 0) {
> >> + dev_err(tegra->drm->dev, "failed to map buffer: %d\n", err);
> >> + return err;
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static int tegra_bo_iommu_unmap(struct tegra_drm *tegra, struct tegra_bo *bo)
> >> +{
> >> + if (!bo->mm)
> >> + return 0;
> >> +
> >> + iommu_unmap_sg(tegra->domain, bo->sgt, bo->paddr);
> >> + drm_mm_remove_node(bo->mm);
> >> +
> >> + kfree(bo->mm);
> >> + return 0;
> >> +}
> >> +
> >> static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
> >> {
> >> - dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> >> + if (!bo->pages)
> >> + dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> >> + bo->paddr);
>
> One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
> and bo->paddr == ~0 here, which causes a crash.
>
> I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
> condition in the mm code, but it seems like reviewer consensus is to
> check for this before calling free.
>
> As such, we'll need to make sure bo->vaddr != NULL before calling
> dma_free_writecombine to avoid this situation.
>
> Would you prefer I send a patch up to fix this separately, or would
> you like to roll this into your next version?
Thanks for pointing all of these out. I'm going to trace the failure
code path anyway since there seem to be a couple of loose ends here and
there, so I'll probably roll in a fix for this anyway.
Thierry
On Wed, Oct 01, 2014 at 11:54:11AM -0400, Sean Paul wrote:
> On Tue, Sep 30, 2014 at 2:48 PM, Sean Paul <[email protected]> wrote:
> > On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding <[email protected]> wrote:
> >> diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
[...]
> >> static void tegra_bo_destroy(struct drm_device *drm, struct tegra_bo *bo)
> >> {
> >> - dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr, bo->paddr);
> >> + if (!bo->pages)
> >> + dma_free_writecombine(drm->dev, bo->gem.size, bo->vaddr,
> >> + bo->paddr);
>
> One more thing. If tegra_bo_alloc fails, we'll have bo->vaddr == NULL
> and bo->paddr == ~0 here, which causes a crash.
>
> I posted https://lkml.org/lkml/2014/9/30/659 to check for the error
> condition in the mm code, but it seems like reviewer consensus is to
> check for this before calling free.
>
> As such, we'll need to make sure bo->vaddr != NULL before calling
> dma_free_writecombine to avoid this situation.
>
> Would you prefer I send a patch up to fix this separately, or would
> you like to roll this into your next version?
I've rolled this check into my series because I touch that area of code
anyway.
Thanks for bringing it up.
Thierry
On Tue, Sep 30, 2014 at 02:48:35PM -0400, Sean Paul wrote:
> On Thu, Jun 26, 2014 at 4:49 PM, Thierry Reding
> <[email protected]> wrote:
> > From: Thierry Reding <[email protected]>
> >
> > When an IOMMU device is available on the platform bus, allocate an IOMMU
> > domain and attach the display controllers to it. The display controllers
> > can then scan out non-contiguous buffers by mapping them through the
> > IOMMU.
> >
>
> Hi Thierry,
> A few comments from Stéphane and myself that came up while we were
> reviewing this for our tree.
I just realized that I hadn't integrated these comments completely yet,
but I've done so now in my local tree. I'm running a couple of tests to
verify that it's all handled correctly.
> > Signed-off-by: Thierry Reding <[email protected]>
> > ---
> > drivers/gpu/drm/tegra/dc.c | 21 ++++
> > drivers/gpu/drm/tegra/drm.c | 17 ++++
> > drivers/gpu/drm/tegra/drm.h | 3 +
> > drivers/gpu/drm/tegra/fb.c | 16 ++-
> > drivers/gpu/drm/tegra/gem.c | 236 +++++++++++++++++++++++++++++++++++++++-----
> > drivers/gpu/drm/tegra/gem.h | 4 +
> > 6 files changed, 273 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
> > index afcca04f5367..0f7452d04811 100644
> > --- a/drivers/gpu/drm/tegra/dc.c
> > +++ b/drivers/gpu/drm/tegra/dc.c
> > @@ -9,6 +9,7 @@
> >
> > #include <linux/clk.h>
> > #include <linux/debugfs.h>
> > +#include <linux/iommu.h>
> > #include <linux/reset.h>
> >
> > #include "dc.h"
> > @@ -1283,8 +1284,18 @@ static int tegra_dc_init(struct host1x_client *client)
> > {
> > struct drm_device *drm = dev_get_drvdata(client->parent);
> > struct tegra_dc *dc = host1x_client_to_dc(client);
> > + struct tegra_drm *tegra = drm->dev_private;
> > int err;
> >
> > + if (tegra->domain) {
> > + err = iommu_attach_device(tegra->domain, dc->dev);
> > + if (err < 0) {
> > + dev_err(dc->dev, "failed to attach to IOMMU: %d\n",
> > + err);
> > + return err;
> > + }
>
> [from Stéphane]
>
> shouldn't we call detach in the error paths below?
This was mostly rewritten for universal plane support, but I've made
sure that the DC properly detaches from the IOMMU in case of failure
during the code below.
> > diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
[...]
> > @@ -8,6 +8,7 @@
> > */
> >
> > #include <linux/host1x.h>
> > +#include <linux/iommu.h>
> >
> > #include "drm.h"
> > #include "gem.h"
> > @@ -33,6 +34,16 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
> > if (!tegra)
> > return -ENOMEM;
> >
> > + if (iommu_present(&platform_bus_type)) {
> > + tegra->domain = iommu_domain_alloc(&platform_bus_type);
> > + if (IS_ERR(tegra->domain)) {
> > + kfree(tegra);
> > + return PTR_ERR(tegra->domain);
> > + }
> > +
> > + drm_mm_init(&tegra->mm, 0, SZ_2G);
>
>
> [from Stéphane]:
>
> none of these are freed in the error path below (iommu_domain_free and
> drm_mm_takedown)
>
> also |tegra| isn't freed either?
None of the resources were actually being cleaned up, but I think I have
it all handled properly now.
> > @@ -108,22 +240,33 @@ struct tegra_bo *tegra_bo_create(struct drm_device *drm, unsigned int size,
> > host1x_bo_init(&bo->base, &tegra_bo_ops);
> > size = round_up(size, PAGE_SIZE);
> >
> > - bo->vaddr = dma_alloc_writecombine(drm->dev, size, &bo->paddr,
> > - GFP_KERNEL | __GFP_NOWARN);
> > - if (!bo->vaddr) {
> > - dev_err(drm->dev, "failed to allocate buffer with size %u\n",
> > - size);
> > - err = -ENOMEM;
> > - goto err_dma;
> > - }
> > -
> > err = drm_gem_object_init(drm, &bo->gem, size);
> > if (err)
> > - goto err_init;
> > + goto free;
> >
> > err = drm_gem_create_mmap_offset(&bo->gem);
>
> We need to call drm_gem_free_mmap_offset if one of the calls below
> fails, otherwise we'll try to free the mmap_offset on an already
> destroyed bo.
drm_gem_object_release() (below) already calls drm_gem_free_mmap_offset()
for us implicitly.
Thierry