Hi all,
After 5 years or so of intending to get round to this, the time has
finally come! The changes themselves actually turn out to be relatively
mechanical; the bigger concern appears to be how to get everything
merged across about 5 different trees given the dependencies.
I've lightly boot-tested things on Rockchip RK3288 and Exynos 4412
(Odroid-U3), to the degree that their display drivers should be using
IOMMU-backed buffers and don't explode (the Odroid doesn't manage to
send a working HDMI signal to the one monitor I have that it actually
detects, but that's a pre-existing condition...). Confirmation that the
Mediatek, OMAP and Tegra changes work will be most welcome.
Patches are based on 5.9-rc1, branch available here:
git://linux-arm.org/linux-rm arm/dma
Robin.
Robin Murphy (18):
ARM/dma-mapping: Drop .dma_supported for IOMMU ops
ARM/dma-mapping: Consolidate IOMMU ops callbacks
ARM/dma-mapping: Merge IOMMU ops
iommu/dma: Add temporary hacks for arch/arm
ARM/dma-mapping: Switch to iommu_dma_ops
ARM/dma-mapping: Support IOMMU default domains
iommu/arm-smmu: Remove arch/arm workaround
iommu/renesas: Remove arch/arm workaround
iommu/mediatek-v1: Add IOMMU_DOMAIN_DMA support
iommu/msm: Add IOMMU_DOMAIN_DMA support
iommu/omap: Add IOMMU_DOMAIN_DMA support
iommu/tegra-gart: Add IOMMU_DOMAIN_DMA support
iommu/tegra: Add IOMMU_DOMAIN_DMA support
drm/exynos: Consolidate IOMMU mapping code
drm/nouveau/tegra: Clean up IOMMU workaround
staging/media/tegra-vde: Clean up IOMMU workaround
media/omap3isp: Clean up IOMMU workaround
ARM/dma-mapping: Remove legacy dma-iommu API
arch/arm/Kconfig | 28 +-
arch/arm/common/dmabounce.c | 1 -
arch/arm/include/asm/device.h | 9 -
arch/arm/include/asm/dma-iommu.h | 37 -
arch/arm/mm/dma-mapping.c | 1198 +----------------
drivers/gpu/drm/exynos/exynos5433_drm_decon.c | 5 +-
drivers/gpu/drm/exynos/exynos7_drm_decon.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_dma.c | 61 +-
drivers/gpu/drm/exynos/exynos_drm_drv.h | 6 +-
drivers/gpu/drm/exynos/exynos_drm_fimc.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_fimd.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_gsc.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_rotator.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_scaler.c | 6 +-
drivers/gpu/drm/exynos/exynos_mixer.c | 7 +-
.../drm/nouveau/nvkm/engine/device/tegra.c | 13 -
drivers/iommu/Kconfig | 8 -
drivers/iommu/arm/arm-smmu/arm-smmu.c | 10 -
drivers/iommu/ipmmu-vmsa.c | 69 -
drivers/iommu/msm_iommu.c | 7 +-
drivers/iommu/mtk_iommu.h | 2 -
drivers/iommu/mtk_iommu_v1.c | 153 +--
drivers/iommu/omap-iommu.c | 22 +-
drivers/iommu/tegra-gart.c | 17 +-
drivers/iommu/tegra-smmu.c | 37 +-
drivers/media/platform/Kconfig | 1 -
drivers/media/platform/omap3isp/isp.c | 68 +-
drivers/media/platform/omap3isp/isp.h | 3 -
drivers/staging/media/tegra-vde/iommu.c | 12 -
30 files changed, 150 insertions(+), 1660 deletions(-)
delete mode 100644 arch/arm/include/asm/dma-iommu.h
--
2.28.0.dirty
When an IOMMU is present, we trust that it should be capable
of remapping any physical memory, and since the device masks
represent the input (virtual) addresses to the IOMMU it makes
no sense to validate them against physical PFNs anyway.
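For context, the check being dropped amounts to validating the device's
DMA mask against the highest DMA-able physical PFN, roughly as follows
(a simplified sketch of the arm_dma_supported() logic, not the exact
kernel code):

static int arm_dma_supported_sketch(struct device *dev, u64 mask)
{
	/* Highest physical PFN the platform can DMA to */
	unsigned long max_dma_pfn = min(max_pfn - 1, arm_dma_pfn_limit);

	/*
	 * Compare the mask, translated to a PFN limit, against that
	 * physical ceiling. Behind an IOMMU the mask constrains input
	 * (virtual) addresses instead, so this tells us nothing.
	 */
	return dma_to_pfn(dev, mask) >= max_dma_pfn;
}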
Signed-off-by: Robin Murphy <[email protected]>
---
arch/arm/mm/dma-mapping.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 8a8949174b1c..ffa387f343dc 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1997,8 +1997,6 @@ static const struct dma_map_ops iommu_ops = {
.map_resource = arm_iommu_map_resource,
.unmap_resource = arm_iommu_unmap_resource,
-
- .dma_supported = arm_dma_supported,
};
static const struct dma_map_ops iommu_coherent_ops = {
@@ -2015,8 +2013,6 @@ static const struct dma_map_ops iommu_coherent_ops = {
.map_resource = arm_iommu_map_resource,
.unmap_resource = arm_iommu_unmap_resource,
-
- .dma_supported = arm_dma_supported,
};
/**
--
2.28.0.dirty
With the IOMMU ops now looking much the same shape as iommu_dma_ops,
switch them out in favour of the iommu-dma library, currently enhanced
with temporary workarounds that allow it to also sit underneath the
arch-specific API. With that in place, we can now start converting the
remaining IOMMU drivers and consumers to work with IOMMU API default
domains instead.
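For reference, the legacy arch-side entry points keep their signatures,
so existing callers are unaffected; a minimal usage sketch (error paths
abbreviated, IOVA range borrowed from the Exynos DRM driver purely for
illustration):

	struct dma_iommu_mapping *mapping;
	int err;

	/* The mapping is now just an IOMMU domain with an iommu-dma cookie */
	mapping = arm_iommu_create_mapping(&platform_bus_type,
					   0x20000000, 0x40000000);
	if (IS_ERR(mapping))
		return PTR_ERR(mapping);

	/* Attaching now also installs iommu_dma_ops for the device */
	err = arm_iommu_attach_device(dev, mapping);
	if (err)
		arm_iommu_release_mapping(mapping);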
Signed-off-by: Robin Murphy <[email protected]>
---
arch/arm/Kconfig | 24 +-
arch/arm/include/asm/dma-iommu.h | 8 -
arch/arm/mm/dma-mapping.c | 887 +------------------------------
drivers/iommu/Kconfig | 8 -
drivers/media/platform/Kconfig | 1 -
5 files changed, 22 insertions(+), 906 deletions(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index b91273f9fd43..79406fe5cd6b 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -133,31 +133,11 @@ config ARM_HAS_SG_CHAIN
bool
config ARM_DMA_USE_IOMMU
- bool
+ def_bool IOMMU_SUPPORT
select ARM_HAS_SG_CHAIN
+ select IOMMU_DMA
select NEED_SG_DMA_LENGTH
-if ARM_DMA_USE_IOMMU
-
-config ARM_DMA_IOMMU_ALIGNMENT
- int "Maximum PAGE_SIZE order of alignment for DMA IOMMU buffers"
- range 4 9
- default 8
- help
- DMA mapping framework by default aligns all buffers to the smallest
- PAGE_SIZE order which is greater than or equal to the requested buffer
- size. This works well for buffers up to a few hundreds kilobytes, but
- for larger buffers it just a waste of address space. Drivers which has
- relatively small addressing window (like 64Mib) might run out of
- virtual space with just a few allocations.
-
- With this parameter you can specify the maximum PAGE_SIZE order for
- DMA IOMMU buffers. Larger buffers will be aligned only to this
- specified order. The order is expressed as a power of two multiplied
- by the PAGE_SIZE.
-
-endif
-
config SYS_SUPPORTS_APM_EMULATION
bool
diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h
index 86405cc81385..f39cfa509fe4 100644
--- a/arch/arm/include/asm/dma-iommu.h
+++ b/arch/arm/include/asm/dma-iommu.h
@@ -13,14 +13,6 @@ struct dma_iommu_mapping {
/* iommu specific data */
struct iommu_domain *domain;
- unsigned long **bitmaps; /* array of bitmaps */
- unsigned int nr_bitmaps; /* nr of elements in array */
- unsigned int extensions;
- size_t bitmap_size; /* size of a single bitmap */
- size_t bits; /* per bitmap */
- dma_addr_t base;
-
- spinlock_t lock;
struct kref kref;
};
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 0537c97cebe1..0f69ede44cd7 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -15,6 +15,7 @@
#include <linux/init.h>
#include <linux/device.h>
#include <linux/dma-direct.h>
+#include <linux/dma-iommu.h>
#include <linux/dma-mapping.h>
#include <linux/dma-noncoherent.h>
#include <linux/dma-contiguous.h>
@@ -1074,812 +1075,9 @@ static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
#ifdef CONFIG_ARM_DMA_USE_IOMMU
-static int __dma_info_to_prot(enum dma_data_direction dir, unsigned long attrs)
-{
- int prot = 0;
-
- if (attrs & DMA_ATTR_PRIVILEGED)
- prot |= IOMMU_PRIV;
-
- switch (dir) {
- case DMA_BIDIRECTIONAL:
- return prot | IOMMU_READ | IOMMU_WRITE;
- case DMA_TO_DEVICE:
- return prot | IOMMU_READ;
- case DMA_FROM_DEVICE:
- return prot | IOMMU_WRITE;
- default:
- return prot;
- }
-}
-
-/* IOMMU */
-
-static int extend_iommu_mapping(struct dma_iommu_mapping *mapping);
-
-static inline dma_addr_t __alloc_iova(struct dma_iommu_mapping *mapping,
- size_t size)
-{
- unsigned int order = get_order(size);
- unsigned int align = 0;
- unsigned int count, start;
- size_t mapping_size = mapping->bits << PAGE_SHIFT;
- unsigned long flags;
- dma_addr_t iova;
- int i;
-
- if (order > CONFIG_ARM_DMA_IOMMU_ALIGNMENT)
- order = CONFIG_ARM_DMA_IOMMU_ALIGNMENT;
-
- count = PAGE_ALIGN(size) >> PAGE_SHIFT;
- align = (1 << order) - 1;
-
- spin_lock_irqsave(&mapping->lock, flags);
- for (i = 0; i < mapping->nr_bitmaps; i++) {
- start = bitmap_find_next_zero_area(mapping->bitmaps[i],
- mapping->bits, 0, count, align);
-
- if (start > mapping->bits)
- continue;
-
- bitmap_set(mapping->bitmaps[i], start, count);
- break;
- }
-
- /*
- * No unused range found. Try to extend the existing mapping
- * and perform a second attempt to reserve an IO virtual
- * address range of size bytes.
- */
- if (i == mapping->nr_bitmaps) {
- if (extend_iommu_mapping(mapping)) {
- spin_unlock_irqrestore(&mapping->lock, flags);
- return DMA_MAPPING_ERROR;
- }
-
- start = bitmap_find_next_zero_area(mapping->bitmaps[i],
- mapping->bits, 0, count, align);
-
- if (start > mapping->bits) {
- spin_unlock_irqrestore(&mapping->lock, flags);
- return DMA_MAPPING_ERROR;
- }
-
- bitmap_set(mapping->bitmaps[i], start, count);
- }
- spin_unlock_irqrestore(&mapping->lock, flags);
-
- iova = mapping->base + (mapping_size * i);
- iova += start << PAGE_SHIFT;
-
- return iova;
-}
-
-static inline void __free_iova(struct dma_iommu_mapping *mapping,
- dma_addr_t addr, size_t size)
-{
- unsigned int start, count;
- size_t mapping_size = mapping->bits << PAGE_SHIFT;
- unsigned long flags;
- dma_addr_t bitmap_base;
- u32 bitmap_index;
-
- if (!size)
- return;
-
- bitmap_index = (u32) (addr - mapping->base) / (u32) mapping_size;
- BUG_ON(addr < mapping->base || bitmap_index > mapping->extensions);
-
- bitmap_base = mapping->base + mapping_size * bitmap_index;
-
- start = (addr - bitmap_base) >> PAGE_SHIFT;
-
- if (addr + size > bitmap_base + mapping_size) {
- /*
- * The address range to be freed reaches into the iova
- * range of the next bitmap. This should not happen as
- * we don't allow this in __alloc_iova (at the
- * moment).
- */
- BUG();
- } else
- count = size >> PAGE_SHIFT;
-
- spin_lock_irqsave(&mapping->lock, flags);
- bitmap_clear(mapping->bitmaps[bitmap_index], start, count);
- spin_unlock_irqrestore(&mapping->lock, flags);
-}
-
-/* We'll try 2M, 1M, 64K, and finally 4K; array must end with 0! */
-static const int iommu_order_array[] = { 9, 8, 4, 0 };
-
-static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
- gfp_t gfp, unsigned long attrs,
- int coherent_flag)
-{
- struct page **pages;
- int count = size >> PAGE_SHIFT;
- int array_size = count * sizeof(struct page *);
- int i = 0;
- int order_idx = 0;
-
- if (array_size <= PAGE_SIZE)
- pages = kzalloc(array_size, GFP_KERNEL);
- else
- pages = vzalloc(array_size);
- if (!pages)
- return NULL;
-
- if (attrs & DMA_ATTR_FORCE_CONTIGUOUS)
- {
- unsigned long order = get_order(size);
- struct page *page;
-
- page = dma_alloc_from_contiguous(dev, count, order,
- gfp & __GFP_NOWARN);
- if (!page)
- goto error;
-
- __dma_clear_buffer(page, size, coherent_flag);
-
- for (i = 0; i < count; i++)
- pages[i] = page + i;
-
- return pages;
- }
-
- /* Go straight to 4K chunks if caller says it's OK. */
- if (attrs & DMA_ATTR_ALLOC_SINGLE_PAGES)
- order_idx = ARRAY_SIZE(iommu_order_array) - 1;
-
- /*
- * IOMMU can map any pages, so himem can also be used here
- */
- gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
-
- while (count) {
- int j, order;
-
- order = iommu_order_array[order_idx];
-
- /* Drop down when we get small */
- if (__fls(count) < order) {
- order_idx++;
- continue;
- }
-
- if (order) {
- /* See if it's easy to allocate a high-order chunk */
- pages[i] = alloc_pages(gfp | __GFP_NORETRY, order);
-
- /* Go down a notch at first sign of pressure */
- if (!pages[i]) {
- order_idx++;
- continue;
- }
- } else {
- pages[i] = alloc_pages(gfp, 0);
- if (!pages[i])
- goto error;
- }
-
- if (order) {
- split_page(pages[i], order);
- j = 1 << order;
- while (--j)
- pages[i + j] = pages[i] + j;
- }
-
- __dma_clear_buffer(pages[i], PAGE_SIZE << order, coherent_flag);
- i += 1 << order;
- count -= 1 << order;
- }
-
- return pages;
-error:
- while (i--)
- if (pages[i])
- __free_pages(pages[i], 0);
- kvfree(pages);
- return NULL;
-}
-
-static int __iommu_free_buffer(struct device *dev, struct page **pages,
- size_t size, unsigned long attrs)
-{
- int count = size >> PAGE_SHIFT;
- int i;
-
- if (attrs & DMA_ATTR_FORCE_CONTIGUOUS) {
- dma_release_from_contiguous(dev, pages[0], count);
- } else {
- for (i = 0; i < count; i++)
- if (pages[i])
- __free_pages(pages[i], 0);
- }
-
- kvfree(pages);
- return 0;
-}
-
-/*
- * Create a mapping in device IO address space for specified pages
- */
-static dma_addr_t
-__iommu_create_mapping(struct device *dev, struct page **pages, size_t size,
- unsigned long attrs)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
- unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
- dma_addr_t dma_addr, iova;
- int i;
-
- dma_addr = __alloc_iova(mapping, size);
- if (dma_addr == DMA_MAPPING_ERROR)
- return dma_addr;
-
- iova = dma_addr;
- for (i = 0; i < count; ) {
- int ret;
-
- unsigned int next_pfn = page_to_pfn(pages[i]) + 1;
- phys_addr_t phys = page_to_phys(pages[i]);
- unsigned int len, j;
-
- for (j = i + 1; j < count; j++, next_pfn++)
- if (page_to_pfn(pages[j]) != next_pfn)
- break;
-
- len = (j - i) << PAGE_SHIFT;
- ret = iommu_map(mapping->domain, iova, phys, len,
- __dma_info_to_prot(DMA_BIDIRECTIONAL, attrs));
- if (ret < 0)
- goto fail;
- iova += len;
- i = j;
- }
- return dma_addr;
-fail:
- iommu_unmap(mapping->domain, dma_addr, iova-dma_addr);
- __free_iova(mapping, dma_addr, size);
- return DMA_MAPPING_ERROR;
-}
-
-static int __iommu_remove_mapping(struct device *dev, dma_addr_t iova, size_t size)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
-
- /*
- * add optional in-page offset from iova to size and align
- * result to page size
- */
- size = PAGE_ALIGN((iova & ~PAGE_MASK) + size);
- iova &= PAGE_MASK;
-
- iommu_unmap(mapping->domain, iova, size);
- __free_iova(mapping, iova, size);
- return 0;
-}
-
-static struct page **__atomic_get_pages(void *addr)
-{
- struct page *page;
- phys_addr_t phys;
-
- phys = gen_pool_virt_to_phys(atomic_pool, (unsigned long)addr);
- page = phys_to_page(phys);
-
- return (struct page **)page;
-}
-
-static struct page **__iommu_get_pages(void *cpu_addr, unsigned long attrs)
-{
- if (__in_atomic_pool(cpu_addr, PAGE_SIZE))
- return __atomic_get_pages(cpu_addr);
-
- if (attrs & DMA_ATTR_NO_KERNEL_MAPPING)
- return cpu_addr;
-
- return dma_common_find_pages(cpu_addr);
-}
-
-static void *__iommu_alloc_simple(struct device *dev, size_t size, gfp_t gfp,
- dma_addr_t *handle, int coherent_flag,
- unsigned long attrs)
-{
- struct page *page;
- void *addr;
-
- if (coherent_flag == COHERENT)
- addr = __alloc_simple_buffer(dev, size, gfp, &page);
- else
- addr = __alloc_from_pool(size, &page);
- if (!addr)
- return NULL;
-
- *handle = __iommu_create_mapping(dev, &page, size, attrs);
- if (*handle == DMA_MAPPING_ERROR)
- goto err_mapping;
-
- return addr;
-
-err_mapping:
- __free_from_pool(addr, size);
- return NULL;
-}
-
-static void __iommu_free_atomic(struct device *dev, void *cpu_addr,
- dma_addr_t handle, size_t size, int coherent_flag)
-{
- __iommu_remove_mapping(dev, handle, size);
- if (coherent_flag == COHERENT)
- __dma_free_buffer(virt_to_page(cpu_addr), size);
- else
- __free_from_pool(cpu_addr, size);
-}
-
-static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
- dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
-{
- pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL);
- struct page **pages;
- void *addr = NULL;
- int coherent_flag = dev->dma_coherent ? COHERENT : NORMAL;
-
- *handle = DMA_MAPPING_ERROR;
- size = PAGE_ALIGN(size);
-
- if (coherent_flag == COHERENT || !gfpflags_allow_blocking(gfp))
- return __iommu_alloc_simple(dev, size, gfp, handle,
- coherent_flag, attrs);
-
- /*
- * Following is a work-around (a.k.a. hack) to prevent pages
- * with __GFP_COMP being passed to split_page() which cannot
- * handle them. The real problem is that this flag probably
- * should be 0 on ARM as it is not supported on this
- * platform; see CONFIG_HUGETLBFS.
- */
- gfp &= ~(__GFP_COMP);
-
- pages = __iommu_alloc_buffer(dev, size, gfp, attrs, coherent_flag);
- if (!pages)
- return NULL;
-
- *handle = __iommu_create_mapping(dev, pages, size, attrs);
- if (*handle == DMA_MAPPING_ERROR)
- goto err_buffer;
-
- if (attrs & DMA_ATTR_NO_KERNEL_MAPPING)
- return pages;
-
- addr = dma_common_pages_remap(pages, size, prot,
- __builtin_return_address(0));
- if (!addr)
- goto err_mapping;
-
- return addr;
-
-err_mapping:
- __iommu_remove_mapping(dev, *handle, size);
-err_buffer:
- __iommu_free_buffer(dev, pages, size, attrs);
- return NULL;
-}
-
-static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
- void *cpu_addr, dma_addr_t dma_addr, size_t size,
- unsigned long attrs)
-{
- struct page **pages = __iommu_get_pages(cpu_addr, attrs);
- unsigned long nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
- int err;
-
- if (!pages)
- return -ENXIO;
-
- if (vma->vm_pgoff >= nr_pages)
- return -ENXIO;
-
- if (!dev->dma_coherent)
- vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
-
- err = vm_map_pages(vma, pages, nr_pages);
- if (err)
- pr_err("Remapping memory failed: %d\n", err);
-
- return err;
-}
-
-/*
- * free a page as defined by the above mapping.
- * Must not be called with IRQs disabled.
- */
-static void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
- dma_addr_t handle, unsigned long attrs)
-{
- int coherent_flag = dev->dma_coherent ? COHERENT : NORMAL;
- struct page **pages;
- size = PAGE_ALIGN(size);
-
- if (coherent_flag == COHERENT || __in_atomic_pool(cpu_addr, size)) {
- __iommu_free_atomic(dev, cpu_addr, handle, size, coherent_flag);
- return;
- }
-
- pages = __iommu_get_pages(cpu_addr, attrs);
- if (!pages) {
- WARN(1, "trying to free invalid coherent area: %p\n", cpu_addr);
- return;
- }
-
- if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) == 0)
- dma_common_free_remap(cpu_addr, size);
-
- __iommu_remove_mapping(dev, handle, size);
- __iommu_free_buffer(dev, pages, size, attrs);
-}
-
-static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
- void *cpu_addr, dma_addr_t dma_addr,
- size_t size, unsigned long attrs)
-{
- unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
- struct page **pages = __iommu_get_pages(cpu_addr, attrs);
-
- if (!pages)
- return -ENXIO;
-
- return sg_alloc_table_from_pages(sgt, pages, count, 0, size,
- GFP_KERNEL);
-}
-
-/*
- * Map a part of the scatter-gather list into contiguous io address space
- */
-static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
- size_t size, dma_addr_t *handle,
- enum dma_data_direction dir, unsigned long attrs)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
- dma_addr_t iova, iova_base;
- int ret = 0;
- unsigned int count;
- struct scatterlist *s;
- int prot;
-
- size = PAGE_ALIGN(size);
- *handle = DMA_MAPPING_ERROR;
-
- iova_base = iova = __alloc_iova(mapping, size);
- if (iova == DMA_MAPPING_ERROR)
- return -ENOMEM;
-
- for (count = 0, s = sg; count < (size >> PAGE_SHIFT); s = sg_next(s)) {
- phys_addr_t phys = page_to_phys(sg_page(s));
- unsigned int len = PAGE_ALIGN(s->offset + s->length);
-
- if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
- __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
-
- prot = __dma_info_to_prot(dir, attrs);
-
- ret = iommu_map(mapping->domain, iova, phys, len, prot);
- if (ret < 0)
- goto fail;
- count += len >> PAGE_SHIFT;
- iova += len;
- }
- *handle = iova_base;
-
- return 0;
-fail:
- iommu_unmap(mapping->domain, iova_base, count * PAGE_SIZE);
- __free_iova(mapping, iova_base, size);
- return ret;
-}
-
-/**
- * arm_iommu_map_sg - map a set of SG buffers for streaming mode DMA
- * @dev: valid struct device pointer
- * @sg: list of buffers
- * @nents: number of buffers to map
- * @dir: DMA transfer direction
- *
- * Map a set of buffers described by scatterlist in streaming mode for DMA.
- * The scatter gather list elements are merged together (if possible) and
- * tagged with the appropriate dma address and length. They are obtained via
- * sg_dma_{address,length}.
- */
-static int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg,
- int nents, enum dma_data_direction dir, unsigned long attrs)
-{
- struct scatterlist *s = sg, *dma = sg, *start = sg;
- int i, count = 0;
- unsigned int offset = s->offset;
- unsigned int size = s->offset + s->length;
- unsigned int max = dma_get_max_seg_size(dev);
-
- for (i = 1; i < nents; i++) {
- s = sg_next(s);
-
- s->dma_address = DMA_MAPPING_ERROR;
- s->dma_length = 0;
-
- if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) {
- if (__map_sg_chunk(dev, start, size, &dma->dma_address,
- dir, attrs) < 0)
- goto bad_mapping;
-
- dma->dma_address += offset;
- dma->dma_length = size - offset;
-
- size = offset = s->offset;
- start = s;
- dma = sg_next(dma);
- count += 1;
- }
- size += s->length;
- }
- if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir, attrs) < 0)
- goto bad_mapping;
-
- dma->dma_address += offset;
- dma->dma_length = size - offset;
-
- return count+1;
-
-bad_mapping:
- for_each_sg(sg, s, count, i)
- __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
- return 0;
-}
-
-/**
- * arm_iommu_unmap_sg - unmap a set of SG buffers mapped by dma_map_sg
- * @dev: valid struct device pointer
- * @sg: list of buffers
- * @nents: number of buffers to unmap (same as was passed to dma_map_sg)
- * @dir: DMA transfer direction (same as was passed to dma_map_sg)
- *
- * Unmap a set of streaming mode DMA translations. Again, CPU access
- * rules concerning calls here are the same as for dma_unmap_single().
- */
-static void arm_iommu_unmap_sg(struct device *dev,
- struct scatterlist *sg, int nents,
- enum dma_data_direction dir,
- unsigned long attrs)
-{
- struct scatterlist *s;
- int i;
-
- for_each_sg(sg, s, nents, i) {
- if (sg_dma_len(s))
- __iommu_remove_mapping(dev, sg_dma_address(s),
- sg_dma_len(s));
- if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
- __dma_page_dev_to_cpu(sg_page(s), s->offset,
- s->length, dir);
- }
-}
-
-/**
- * arm_iommu_sync_sg_for_cpu
- * @dev: valid struct device pointer
- * @sg: list of buffers
- * @nents: number of buffers to map (returned from dma_map_sg)
- * @dir: DMA transfer direction (same as was passed to dma_map_sg)
- */
-static void arm_iommu_sync_sg_for_cpu(struct device *dev,
- struct scatterlist *sg,
- int nents, enum dma_data_direction dir)
-{
- struct scatterlist *s;
- int i;
-
- if (dev->dma_coherent)
- return;
-
- for_each_sg(sg, s, nents, i)
- __dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);
-
-}
-
-/**
- * arm_iommu_sync_sg_for_device
- * @dev: valid struct device pointer
- * @sg: list of buffers
- * @nents: number of buffers to map (returned from dma_map_sg)
- * @dir: DMA transfer direction (same as was passed to dma_map_sg)
- */
-static void arm_iommu_sync_sg_for_device(struct device *dev,
- struct scatterlist *sg,
- int nents, enum dma_data_direction dir)
-{
- struct scatterlist *s;
- int i;
-
- if (dev->dma_coherent)
- return;
-
- for_each_sg(sg, s, nents, i)
- __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
-}
-
-/**
- * arm_iommu_map_page
- * @dev: valid struct device pointer
- * @page: page that buffer resides in
- * @offset: offset into page for start of buffer
- * @size: size of buffer to map
- * @dir: DMA transfer direction
- *
- * IOMMU aware version of arm_dma_map_page()
- */
-static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
- unsigned long offset, size_t size, enum dma_data_direction dir,
- unsigned long attrs)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
- dma_addr_t dma_addr;
- int ret, prot, len = PAGE_ALIGN(size + offset);
-
- if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
- __dma_page_cpu_to_dev(page, offset, size, dir);
-
- dma_addr = __alloc_iova(mapping, len);
- if (dma_addr == DMA_MAPPING_ERROR)
- return dma_addr;
-
- prot = __dma_info_to_prot(dir, attrs);
-
- ret = iommu_map(mapping->domain, dma_addr, page_to_phys(page), len, prot);
- if (ret < 0)
- goto fail;
-
- return dma_addr + offset;
-fail:
- __free_iova(mapping, dma_addr, len);
- return DMA_MAPPING_ERROR;
-}
-
-/**
- * arm_iommu_unmap_page
- * @dev: valid struct device pointer
- * @handle: DMA address of buffer
- * @size: size of buffer (same as passed to dma_map_page)
- * @dir: DMA transfer direction (same as passed to dma_map_page)
- *
- * IOMMU aware version of arm_dma_unmap_page()
- */
-static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir, unsigned long attrs)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
- dma_addr_t iova = handle & PAGE_MASK;
- struct page *page;
- int offset = handle & ~PAGE_MASK;
- int len = PAGE_ALIGN(size + offset);
-
- if (!iova)
- return;
-
- if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
- page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
- __dma_page_dev_to_cpu(page, offset, size, dir);
- }
-
- iommu_unmap(mapping->domain, iova, len);
- __free_iova(mapping, iova, len);
-}
-
-/**
- * arm_iommu_map_resource - map a device resource for DMA
- * @dev: valid struct device pointer
- * @phys_addr: physical address of resource
- * @size: size of resource to map
- * @dir: DMA transfer direction
- */
-static dma_addr_t arm_iommu_map_resource(struct device *dev,
- phys_addr_t phys_addr, size_t size,
- enum dma_data_direction dir, unsigned long attrs)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
- dma_addr_t dma_addr;
- int ret, prot;
- phys_addr_t addr = phys_addr & PAGE_MASK;
- unsigned int offset = phys_addr & ~PAGE_MASK;
- size_t len = PAGE_ALIGN(size + offset);
-
- dma_addr = __alloc_iova(mapping, len);
- if (dma_addr == DMA_MAPPING_ERROR)
- return dma_addr;
-
- prot = __dma_info_to_prot(dir, attrs) | IOMMU_MMIO;
-
- ret = iommu_map(mapping->domain, dma_addr, addr, len, prot);
- if (ret < 0)
- goto fail;
-
- return dma_addr + offset;
-fail:
- __free_iova(mapping, dma_addr, len);
- return DMA_MAPPING_ERROR;
-}
-
-/**
- * arm_iommu_unmap_resource - unmap a device DMA resource
- * @dev: valid struct device pointer
- * @dma_handle: DMA address to resource
- * @size: size of resource to map
- * @dir: DMA transfer direction
- */
-static void arm_iommu_unmap_resource(struct device *dev, dma_addr_t dma_handle,
- size_t size, enum dma_data_direction dir,
- unsigned long attrs)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
- dma_addr_t iova = dma_handle & PAGE_MASK;
- unsigned int offset = dma_handle & ~PAGE_MASK;
- size_t len = PAGE_ALIGN(size + offset);
-
- if (!iova)
- return;
-
- iommu_unmap(mapping->domain, iova, len);
- __free_iova(mapping, iova, len);
-}
-
-static void arm_iommu_sync_single_for_cpu(struct device *dev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
- dma_addr_t iova = handle & PAGE_MASK;
- struct page *page;
- unsigned int offset = handle & ~PAGE_MASK;
-
- if (dev->dma_coherent || !iova)
- return;
-
- page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
- __dma_page_dev_to_cpu(page, offset, size, dir);
-}
-
-static void arm_iommu_sync_single_for_device(struct device *dev,
- dma_addr_t handle, size_t size, enum dma_data_direction dir)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
- dma_addr_t iova = handle & PAGE_MASK;
- struct page *page;
- unsigned int offset = handle & ~PAGE_MASK;
-
- if (dev->dma_coherent || !iova)
- return;
-
- page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
- __dma_page_cpu_to_dev(page, offset, size, dir);
-}
-
-static const struct dma_map_ops iommu_ops = {
- .alloc = arm_iommu_alloc_attrs,
- .free = arm_iommu_free_attrs,
- .mmap = arm_iommu_mmap_attrs,
- .get_sgtable = arm_iommu_get_sgtable,
-
- .map_page = arm_iommu_map_page,
- .unmap_page = arm_iommu_unmap_page,
- .sync_single_for_cpu = arm_iommu_sync_single_for_cpu,
- .sync_single_for_device = arm_iommu_sync_single_for_device,
-
- .map_sg = arm_iommu_map_sg,
- .unmap_sg = arm_iommu_unmap_sg,
- .sync_sg_for_cpu = arm_iommu_sync_sg_for_cpu,
- .sync_sg_for_device = arm_iommu_sync_sg_for_device,
-
- .map_resource = arm_iommu_map_resource,
- .unmap_resource = arm_iommu_unmap_resource,
-};
-
+extern const struct dma_map_ops iommu_dma_ops;
+extern int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
+ u64 size, struct device *dev);
/**
* arm_iommu_create_mapping
* @bus: pointer to the bus holding the client device (for IOMMU calls)
@@ -1896,55 +1094,31 @@ static const struct dma_map_ops iommu_ops = {
struct dma_iommu_mapping *
arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, u64 size)
{
- unsigned int bits = size >> PAGE_SHIFT;
- unsigned int bitmap_size = BITS_TO_LONGS(bits) * sizeof(long);
struct dma_iommu_mapping *mapping;
- int extensions = 1;
int err = -ENOMEM;
- /* currently only 32-bit DMA address space is supported */
- if (size > DMA_BIT_MASK(32) + 1)
- return ERR_PTR(-ERANGE);
-
- if (!bitmap_size)
- return ERR_PTR(-EINVAL);
-
- if (bitmap_size > PAGE_SIZE) {
- extensions = bitmap_size / PAGE_SIZE;
- bitmap_size = PAGE_SIZE;
- }
-
- mapping = kzalloc(sizeof(struct dma_iommu_mapping), GFP_KERNEL);
+ mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
if (!mapping)
goto err;
- mapping->bitmap_size = bitmap_size;
- mapping->bitmaps = kcalloc(extensions, sizeof(unsigned long *),
- GFP_KERNEL);
- if (!mapping->bitmaps)
- goto err2;
-
- mapping->bitmaps[0] = kzalloc(bitmap_size, GFP_KERNEL);
- if (!mapping->bitmaps[0])
- goto err3;
-
- mapping->nr_bitmaps = 1;
- mapping->extensions = extensions;
- mapping->base = base;
- mapping->bits = BITS_PER_BYTE * bitmap_size;
-
- spin_lock_init(&mapping->lock);
-
mapping->domain = iommu_domain_alloc(bus);
if (!mapping->domain)
+ goto err2;
+
+ err = iommu_get_dma_cookie(mapping->domain);
+ if (err)
+ goto err3;
+
+ err = iommu_dma_init_domain(mapping->domain, base, size, NULL);
+ if (err)
goto err4;
kref_init(&mapping->kref);
return mapping;
err4:
- kfree(mapping->bitmaps[0]);
+ iommu_put_dma_cookie(mapping->domain);
err3:
- kfree(mapping->bitmaps);
+ iommu_domain_free(mapping->domain);
err2:
kfree(mapping);
err:
@@ -1954,35 +1128,14 @@ EXPORT_SYMBOL_GPL(arm_iommu_create_mapping);
static void release_iommu_mapping(struct kref *kref)
{
- int i;
struct dma_iommu_mapping *mapping =
container_of(kref, struct dma_iommu_mapping, kref);
+ iommu_put_dma_cookie(mapping->domain);
iommu_domain_free(mapping->domain);
- for (i = 0; i < mapping->nr_bitmaps; i++)
- kfree(mapping->bitmaps[i]);
- kfree(mapping->bitmaps);
kfree(mapping);
}
-static int extend_iommu_mapping(struct dma_iommu_mapping *mapping)
-{
- int next_bitmap;
-
- if (mapping->nr_bitmaps >= mapping->extensions)
- return -EINVAL;
-
- next_bitmap = mapping->nr_bitmaps;
- mapping->bitmaps[next_bitmap] = kzalloc(mapping->bitmap_size,
- GFP_ATOMIC);
- if (!mapping->bitmaps[next_bitmap])
- return -ENOMEM;
-
- mapping->nr_bitmaps++;
-
- return 0;
-}
-
void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping)
{
if (mapping)
@@ -2028,7 +1181,7 @@ int arm_iommu_attach_device(struct device *dev,
if (err)
return err;
- set_dma_ops(dev, &iommu_ops);
+ set_dma_ops(dev, &iommu_dma_ops);
return 0;
}
EXPORT_SYMBOL_GPL(arm_iommu_attach_device);
@@ -2126,7 +1279,7 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
return;
if (arm_setup_iommu_dma_ops(dev, dma_base, size, iommu))
- dma_ops = &iommu_ops;
+ dma_ops = &iommu_dma_ops;
else
dma_ops = arm_get_dma_map_ops(coherent);
@@ -2149,7 +1302,7 @@ void arch_teardown_dma_ops(struct device *dev)
set_dma_ops(dev, NULL);
}
-#if defined(CONFIG_SWIOTLB) || defined(CONFIG_ARM_DMA_USE_IOMMU)
+#if defined(CONFIG_SWIOTLB) || defined(CONFIG_IOMMU_DMA)
void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
enum dma_data_direction dir)
{
@@ -2177,4 +1330,4 @@ void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
{
__arm_dma_free(dev, size, cpu_addr, dma_handle, attrs, false);
}
-#endif /* CONFIG_SWIOTLB || CONFIG_ARM_DMA_USE_IOMMU */
+#endif /* CONFIG_SWIOTLB || CONFIG_IOMMU_DMA */
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index bef5d75e306b..ca0bdf826e9d 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -163,7 +163,6 @@ config ROCKCHIP_IOMMU
bool "Rockchip IOMMU Support"
depends on ARCH_ROCKCHIP || COMPILE_TEST
select IOMMU_API
- select ARM_DMA_USE_IOMMU
help
Support for IOMMUs found on Rockchip rk32xx SOCs.
These IOMMUs allow virtualization of the address space used by most
@@ -175,7 +174,6 @@ config SUN50I_IOMMU
bool "Allwinner H6 IOMMU Support"
depends on HAS_DMA
depends on ARCH_SUNXI || COMPILE_TEST
- select ARM_DMA_USE_IOMMU
select IOMMU_API
help
Support for the IOMMU introduced in the Allwinner H6 SoCs.
@@ -206,7 +204,6 @@ config EXYNOS_IOMMU
depends on ARCH_EXYNOS || COMPILE_TEST
depends on !CPU_BIG_ENDIAN # revisit driver if we can enable big-endian ptes
select IOMMU_API
- select ARM_DMA_USE_IOMMU
help
Support for the IOMMU (System MMU) of Samsung Exynos application
processor family. This enables H/W multimedia accelerators to see
@@ -229,7 +226,6 @@ config IPMMU_VMSA
depends on ARCH_RENESAS || (COMPILE_TEST && !GENERIC_ATOMIC64)
select IOMMU_API
select IOMMU_IO_PGTABLE_LPAE
- select ARM_DMA_USE_IOMMU
help
Support for the Renesas VMSA-compatible IPMMU found in the R-Mobile
APE6, R-Car Gen2, and R-Car Gen3 SoCs.
@@ -250,7 +246,6 @@ config ARM_SMMU
depends on ARM64 || ARM || (COMPILE_TEST && !GENERIC_ATOMIC64)
select IOMMU_API
select IOMMU_IO_PGTABLE_LPAE
- select ARM_DMA_USE_IOMMU if ARM
help
Support for implementations of the ARM System MMU architecture
versions 1 and 2.
@@ -334,7 +329,6 @@ config S390_AP_IOMMU
config MTK_IOMMU
bool "MTK IOMMU Support"
depends on ARCH_MEDIATEK || COMPILE_TEST
- select ARM_DMA_USE_IOMMU
select IOMMU_API
select IOMMU_IO_PGTABLE_ARMV7S
select MEMORY
@@ -350,7 +344,6 @@ config MTK_IOMMU_V1
bool "MTK IOMMU Version 1 (M4U gen1) Support"
depends on ARM
depends on ARCH_MEDIATEK || COMPILE_TEST
- select ARM_DMA_USE_IOMMU
select IOMMU_API
select MEMORY
select MTK_SMI
@@ -367,7 +360,6 @@ config QCOM_IOMMU
depends on ARCH_QCOM || (COMPILE_TEST && !GENERIC_ATOMIC64)
select IOMMU_API
select IOMMU_IO_PGTABLE_LPAE
- select ARM_DMA_USE_IOMMU
help
Support for IOMMU on certain Qualcomm SoCs.
diff --git a/drivers/media/platform/Kconfig b/drivers/media/platform/Kconfig
index c57ee78fa99d..9d6ff7b5f7f1 100644
--- a/drivers/media/platform/Kconfig
+++ b/drivers/media/platform/Kconfig
@@ -76,7 +76,6 @@ config VIDEO_OMAP3
depends on VIDEO_V4L2 && I2C
depends on (ARCH_OMAP3 && OMAP_IOMMU) || COMPILE_TEST
depends on COMMON_CLK && OF
- select ARM_DMA_USE_IOMMU if OMAP_IOMMU
select MEDIA_CONTROLLER
select VIDEO_V4L2_SUBDEV_API
select VIDEOBUF2_DMA_CONTIG
--
2.28.0.dirty
Now that arch/arm is wired up for default domains and iommu-dma,
implement the corresponding driver-side support for DMA domains.
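The driver-side contract is small: accept IOMMU_DOMAIN_DMA in
domain_alloc, give such domains an iommu-dma cookie, and release the
cookie on free. A generic sketch of the pattern (the "foo" names are
placeholders, not a real driver):

struct foo_domain {
	struct iommu_domain domain;
	/* driver-private state ... */
};

static struct iommu_domain *foo_domain_alloc(unsigned type)
{
	struct foo_domain *dom;

	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
		return NULL;

	dom = kzalloc(sizeof(*dom), GFP_KERNEL);
	if (!dom)
		return NULL;

	/* DMA domains need a cookie for the iommu-dma IOVA allocator */
	if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(&dom->domain)) {
		kfree(dom);
		return NULL;
	}

	return &dom->domain;
}

static void foo_domain_free(struct iommu_domain *domain)
{
	iommu_put_dma_cookie(domain);	/* safe no-op without a cookie */
	kfree(container_of(domain, struct foo_domain, domain));
}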
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/iommu/msm_iommu.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index 3615cd6241c4..f34efcbb0b2b 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -8,6 +8,7 @@
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/platform_device.h>
+#include <linux/dma-iommu.h>
#include <linux/errno.h>
#include <linux/io.h>
#include <linux/io-pgtable.h>
@@ -314,13 +315,16 @@ static struct iommu_domain *msm_iommu_domain_alloc(unsigned type)
{
struct msm_priv *priv;
- if (type != IOMMU_DOMAIN_UNMANAGED)
+ if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
return NULL;
priv = kzalloc(sizeof(*priv), GFP_KERNEL);
if (!priv)
goto fail_nomem;
+ if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(&priv->domain))
+ goto fail_nomem;
+
INIT_LIST_HEAD(&priv->list_attached);
priv->domain.geometry.aperture_start = 0;
@@ -339,6 +343,7 @@ static void msm_iommu_domain_free(struct iommu_domain *domain)
struct msm_priv *priv;
unsigned long flags;
+ iommu_put_dma_cookie(domain);
spin_lock_irqsave(&msm_iommu_lock, flags);
priv = to_msm_priv(domain);
kfree(priv);
--
2.28.0.dirty
Now that arch/arm is wired up for default domains and iommu-dma, we no
longer need to work around the arch-private mapping.
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/staging/media/tegra-vde/iommu.c | 12 ------------
1 file changed, 12 deletions(-)
diff --git a/drivers/staging/media/tegra-vde/iommu.c b/drivers/staging/media/tegra-vde/iommu.c
index 6af863d92123..4f770189ed34 100644
--- a/drivers/staging/media/tegra-vde/iommu.c
+++ b/drivers/staging/media/tegra-vde/iommu.c
@@ -10,10 +10,6 @@
#include <linux/kernel.h>
#include <linux/platform_device.h>
-#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
-#include <asm/dma-iommu.h>
-#endif
-
#include "vde.h"
int tegra_vde_iommu_map(struct tegra_vde *vde,
@@ -70,14 +66,6 @@ int tegra_vde_iommu_init(struct tegra_vde *vde)
if (!vde->group)
return 0;
-#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
- if (dev->archdata.mapping) {
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
-
- arm_iommu_detach_device(dev);
- arm_iommu_release_mapping(mapping);
- }
-#endif
vde->domain = iommu_domain_alloc(&platform_bus_type);
if (!vde->domain) {
err = -ENOMEM;
--
2.28.0.dirty
Now that arch/arm is wired up for default domains and iommu-dma, we no
longer need to work around the arch-private mapping.
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c | 13 -------------
1 file changed, 13 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
index d0d52c1d4aee..410ee1f83e0b 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c
@@ -23,10 +23,6 @@
#ifdef CONFIG_NOUVEAU_PLATFORM_DRIVER
#include "priv.h"
-#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
-#include <asm/dma-iommu.h>
-#endif
-
static int
nvkm_device_tegra_power_up(struct nvkm_device_tegra *tdev)
{
@@ -109,15 +105,6 @@ nvkm_device_tegra_probe_iommu(struct nvkm_device_tegra *tdev)
unsigned long pgsize_bitmap;
int ret;
-#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
- if (dev->archdata.mapping) {
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
-
- arm_iommu_detach_device(dev);
- arm_iommu_release_mapping(mapping);
- }
-#endif
-
if (!tdev->func->iommu_bit)
return;
--
2.28.0.dirty
Now that arch/arm is wired up for default domains and iommu-dma, we can
consolidate the shared mapping code onto the generic IOMMU API version,
and retire the arch-specific implementation.
Signed-off-by: Robin Murphy <[email protected]>
---
This is a cheeky revert of 07dc3678bacc ("drm/exynos: Fix cleanup of
IOMMU related objects"), plus removal of the remaining arm_iommu_*
references on top.
---
drivers/gpu/drm/exynos/exynos5433_drm_decon.c | 5 +-
drivers/gpu/drm/exynos/exynos7_drm_decon.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_dma.c | 61 +++----------------
drivers/gpu/drm/exynos/exynos_drm_drv.h | 6 +-
drivers/gpu/drm/exynos/exynos_drm_fimc.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_fimd.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_g2d.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_gsc.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_rotator.c | 5 +-
drivers/gpu/drm/exynos/exynos_drm_scaler.c | 6 +-
drivers/gpu/drm/exynos/exynos_mixer.c | 7 +--
11 files changed, 29 insertions(+), 86 deletions(-)
diff --git a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
index 1f79bc2a881e..8428ae12dfa5 100644
--- a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
+++ b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
@@ -55,7 +55,6 @@ static const char * const decon_clks_name[] = {
struct decon_context {
struct device *dev;
struct drm_device *drm_dev;
- void *dma_priv;
struct exynos_drm_crtc *crtc;
struct exynos_drm_plane planes[WINDOWS_NR];
struct exynos_drm_plane_config configs[WINDOWS_NR];
@@ -645,7 +644,7 @@ static int decon_bind(struct device *dev, struct device *master, void *data)
decon_clear_channels(ctx->crtc);
- return exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
+ return exynos_drm_register_dma(drm_dev, dev);
}
static void decon_unbind(struct device *dev, struct device *master, void *data)
@@ -655,7 +654,7 @@ static void decon_unbind(struct device *dev, struct device *master, void *data)
decon_atomic_disable(ctx->crtc);
/* detach this sub driver from iommu mapping if supported. */
- exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev, &ctx->dma_priv);
+ exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev);
}
static const struct component_ops decon_component_ops = {
diff --git a/drivers/gpu/drm/exynos/exynos7_drm_decon.c b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
index f2d87a7445c7..e7b58097ccdc 100644
--- a/drivers/gpu/drm/exynos/exynos7_drm_decon.c
+++ b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
@@ -40,7 +40,6 @@
struct decon_context {
struct device *dev;
struct drm_device *drm_dev;
- void *dma_priv;
struct exynos_drm_crtc *crtc;
struct exynos_drm_plane planes[WINDOWS_NR];
struct exynos_drm_plane_config configs[WINDOWS_NR];
@@ -128,13 +127,13 @@ static int decon_ctx_initialize(struct decon_context *ctx,
decon_clear_channels(ctx->crtc);
- return exynos_drm_register_dma(drm_dev, ctx->dev, &ctx->dma_priv);
+ return exynos_drm_register_dma(drm_dev, ctx->dev);
}
static void decon_ctx_remove(struct decon_context *ctx)
{
/* detach this sub driver from iommu mapping if supported. */
- exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev, &ctx->dma_priv);
+ exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev);
}
static u32 decon_calc_clkdiv(struct decon_context *ctx,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dma.c b/drivers/gpu/drm/exynos/exynos_drm_dma.c
index 58b89ec11b0e..fd5f9bcf1857 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dma.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dma.c
@@ -14,19 +14,6 @@
#include "exynos_drm_drv.h"
-#if defined(CONFIG_ARM_DMA_USE_IOMMU)
-#include <asm/dma-iommu.h>
-#else
-#define arm_iommu_create_mapping(...) ({ NULL; })
-#define arm_iommu_attach_device(...) ({ -ENODEV; })
-#define arm_iommu_release_mapping(...) ({ })
-#define arm_iommu_detach_device(...) ({ })
-#define to_dma_iommu_mapping(dev) NULL
-#endif
-
-#if !defined(CONFIG_IOMMU_DMA)
-#define iommu_dma_init_domain(...) ({ -EINVAL; })
-#endif
#define EXYNOS_DEV_ADDR_START 0x20000000
#define EXYNOS_DEV_ADDR_SIZE 0x40000000
@@ -58,7 +45,7 @@ static inline void clear_dma_max_seg_size(struct device *dev)
* mapping.
*/
static int drm_iommu_attach_device(struct drm_device *drm_dev,
- struct device *subdrv_dev, void **dma_priv)
+ struct device *subdrv_dev)
{
struct exynos_drm_private *priv = drm_dev->dev_private;
int ret = 0;
@@ -73,22 +60,7 @@ static int drm_iommu_attach_device(struct drm_device *drm_dev,
if (ret)
return ret;
- if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)) {
- /*
- * Keep the original DMA mapping of the sub-device and
- * restore it on Exynos DRM detach, otherwise the DMA
- * framework considers it as IOMMU-less during the next
- * probe (in case of deferred probe or modular build)
- */
- *dma_priv = to_dma_iommu_mapping(subdrv_dev);
- if (*dma_priv)
- arm_iommu_detach_device(subdrv_dev);
-
- ret = arm_iommu_attach_device(subdrv_dev, priv->mapping);
- } else if (IS_ENABLED(CONFIG_IOMMU_DMA)) {
- ret = iommu_attach_device(priv->mapping, subdrv_dev);
- }
-
+ ret = iommu_attach_device(priv->mapping, subdrv_dev);
if (ret)
clear_dma_max_seg_size(subdrv_dev);
@@ -105,21 +77,15 @@ static int drm_iommu_attach_device(struct drm_device *drm_dev,
* mapping
*/
static void drm_iommu_detach_device(struct drm_device *drm_dev,
- struct device *subdrv_dev, void **dma_priv)
+ struct device *subdrv_dev)
{
struct exynos_drm_private *priv = drm_dev->dev_private;
- if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)) {
- arm_iommu_detach_device(subdrv_dev);
- arm_iommu_attach_device(subdrv_dev, *dma_priv);
- } else if (IS_ENABLED(CONFIG_IOMMU_DMA))
- iommu_detach_device(priv->mapping, subdrv_dev);
-
+ iommu_detach_device(priv->mapping, subdrv_dev);
clear_dma_max_seg_size(subdrv_dev);
}
-int exynos_drm_register_dma(struct drm_device *drm, struct device *dev,
- void **dma_priv)
+int exynos_drm_register_dma(struct drm_device *drm, struct device *dev)
{
struct exynos_drm_private *priv = drm->dev_private;
@@ -133,27 +99,20 @@ int exynos_drm_register_dma(struct drm_device *drm, struct device *dev,
return 0;
if (!priv->mapping) {
- void *mapping;
-
- if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU))
- mapping = arm_iommu_create_mapping(&platform_bus_type,
- EXYNOS_DEV_ADDR_START, EXYNOS_DEV_ADDR_SIZE);
- else if (IS_ENABLED(CONFIG_IOMMU_DMA))
- mapping = iommu_get_domain_for_dev(priv->dma_dev);
+ void *mapping = iommu_get_domain_for_dev(priv->dma_dev);
if (IS_ERR(mapping))
return PTR_ERR(mapping);
priv->mapping = mapping;
}
- return drm_iommu_attach_device(drm, dev, dma_priv);
+ return drm_iommu_attach_device(drm, dev);
}
-void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev,
- void **dma_priv)
+void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev)
{
if (IS_ENABLED(CONFIG_EXYNOS_IOMMU))
- drm_iommu_detach_device(drm, dev, dma_priv);
+ drm_iommu_detach_device(drm, dev);
}
void exynos_drm_cleanup_dma(struct drm_device *drm)
@@ -163,7 +122,5 @@ void exynos_drm_cleanup_dma(struct drm_device *drm)
if (!IS_ENABLED(CONFIG_EXYNOS_IOMMU))
return;
- arm_iommu_release_mapping(priv->mapping);
- priv->mapping = NULL;
priv->dma_dev = NULL;
}
diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h b/drivers/gpu/drm/exynos/exynos_drm_drv.h
index 6ae9056e7a18..d4d21d8cfb90 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
@@ -223,10 +223,8 @@ static inline bool is_drm_iommu_supported(struct drm_device *drm_dev)
return priv->mapping ? true : false;
}
-int exynos_drm_register_dma(struct drm_device *drm, struct device *dev,
- void **dma_priv);
-void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev,
- void **dma_priv);
+int exynos_drm_register_dma(struct drm_device *drm, struct device *dev);
+void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev);
void exynos_drm_cleanup_dma(struct drm_device *drm);
#ifdef CONFIG_DRM_EXYNOS_DPI
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimc.c b/drivers/gpu/drm/exynos/exynos_drm_fimc.c
index 29ab8be8604c..8ea2e1d77802 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimc.c
@@ -97,7 +97,6 @@ struct fimc_scaler {
struct fimc_context {
struct exynos_drm_ipp ipp;
struct drm_device *drm_dev;
- void *dma_priv;
struct device *dev;
struct exynos_drm_ipp_task *task;
struct exynos_drm_ipp_formats *formats;
@@ -1134,7 +1133,7 @@ static int fimc_bind(struct device *dev, struct device *master, void *data)
ctx->drm_dev = drm_dev;
ipp->drm_dev = drm_dev;
- exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
+ exynos_drm_register_dma(drm_dev, dev);
exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE |
@@ -1154,7 +1153,7 @@ static void fimc_unbind(struct device *dev, struct device *master,
struct exynos_drm_ipp *ipp = &ctx->ipp;
exynos_drm_ipp_unregister(dev, ipp);
- exynos_drm_unregister_dma(drm_dev, dev, &ctx->dma_priv);
+ exynos_drm_unregister_dma(drm_dev, dev);
}
static const struct component_ops fimc_component_ops = {
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
index bb67cad8371f..21aec38702fc 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
@@ -167,7 +167,6 @@ static struct fimd_driver_data exynos5420_fimd_driver_data = {
struct fimd_context {
struct device *dev;
struct drm_device *drm_dev;
- void *dma_priv;
struct exynos_drm_crtc *crtc;
struct exynos_drm_plane planes[WINDOWS_NR];
struct exynos_drm_plane_config configs[WINDOWS_NR];
@@ -1091,7 +1090,7 @@ static int fimd_bind(struct device *dev, struct device *master, void *data)
if (is_drm_iommu_supported(drm_dev))
fimd_clear_channels(ctx->crtc);
- return exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
+ return exynos_drm_register_dma(drm_dev, dev);
}
static void fimd_unbind(struct device *dev, struct device *master,
@@ -1101,7 +1100,7 @@ static void fimd_unbind(struct device *dev, struct device *master,
fimd_atomic_disable(ctx->crtc);
- exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev, &ctx->dma_priv);
+ exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev);
if (ctx->encoder)
exynos_dpi_remove(ctx->encoder);
diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
index 03be31427181..256ceafd9945 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
@@ -232,7 +232,6 @@ struct g2d_runqueue_node {
struct g2d_data {
struct device *dev;
- void *dma_priv;
struct clk *gate_clk;
void __iomem *regs;
int irq;
@@ -1410,7 +1409,7 @@ static int g2d_bind(struct device *dev, struct device *master, void *data)
return ret;
}
- ret = exynos_drm_register_dma(drm_dev, dev, &g2d->dma_priv);
+ ret = exynos_drm_register_dma(drm_dev, dev);
if (ret < 0) {
dev_err(dev, "failed to enable iommu.\n");
g2d_fini_cmdlist(g2d);
@@ -1435,7 +1434,7 @@ static void g2d_unbind(struct device *dev, struct device *master, void *data)
priv->g2d_dev = NULL;
cancel_work_sync(&g2d->runqueue_work);
- exynos_drm_unregister_dma(g2d->drm_dev, dev, &g2d->dma_priv);
+ exynos_drm_unregister_dma(g2d->drm_dev, dev);
}
static const struct component_ops g2d_component_ops = {
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gsc.c b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
index 45e9aee8366a..88b6fcaa20be 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gsc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
@@ -97,7 +97,6 @@ struct gsc_scaler {
struct gsc_context {
struct exynos_drm_ipp ipp;
struct drm_device *drm_dev;
- void *dma_priv;
struct device *dev;
struct exynos_drm_ipp_task *task;
struct exynos_drm_ipp_formats *formats;
@@ -1170,7 +1169,7 @@ static int gsc_bind(struct device *dev, struct device *master, void *data)
ctx->drm_dev = drm_dev;
ctx->drm_dev = drm_dev;
- exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
+ exynos_drm_register_dma(drm_dev, dev);
exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE |
@@ -1190,7 +1189,7 @@ static void gsc_unbind(struct device *dev, struct device *master,
struct exynos_drm_ipp *ipp = &ctx->ipp;
exynos_drm_ipp_unregister(dev, ipp);
- exynos_drm_unregister_dma(drm_dev, dev, &ctx->dma_priv);
+ exynos_drm_unregister_dma(drm_dev, dev);
}
static const struct component_ops gsc_component_ops = {
diff --git a/drivers/gpu/drm/exynos/exynos_drm_rotator.c b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
index 2d94afba031e..f22fa0d2621a 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_rotator.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
@@ -56,7 +56,6 @@ struct rot_variant {
struct rot_context {
struct exynos_drm_ipp ipp;
struct drm_device *drm_dev;
- void *dma_priv;
struct device *dev;
void __iomem *regs;
struct clk *clock;
@@ -244,7 +243,7 @@ static int rotator_bind(struct device *dev, struct device *master, void *data)
rot->drm_dev = drm_dev;
ipp->drm_dev = drm_dev;
- exynos_drm_register_dma(drm_dev, dev, &rot->dma_priv);
+ exynos_drm_register_dma(drm_dev, dev);
exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE,
@@ -262,7 +261,7 @@ static void rotator_unbind(struct device *dev, struct device *master,
struct exynos_drm_ipp *ipp = &rot->ipp;
exynos_drm_ipp_unregister(dev, ipp);
- exynos_drm_unregister_dma(rot->drm_dev, rot->dev, &rot->dma_priv);
+ exynos_drm_unregister_dma(rot->drm_dev, rot->dev);
}
static const struct component_ops rotator_component_ops = {
diff --git a/drivers/gpu/drm/exynos/exynos_drm_scaler.c b/drivers/gpu/drm/exynos/exynos_drm_scaler.c
index ce1857138f89..0c560566bd2e 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_scaler.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_scaler.c
@@ -39,7 +39,6 @@ struct scaler_data {
struct scaler_context {
struct exynos_drm_ipp ipp;
struct drm_device *drm_dev;
- void *dma_priv;
struct device *dev;
void __iomem *regs;
struct clk *clock[SCALER_MAX_CLK];
@@ -451,7 +450,7 @@ static int scaler_bind(struct device *dev, struct device *master, void *data)
scaler->drm_dev = drm_dev;
ipp->drm_dev = drm_dev;
- exynos_drm_register_dma(drm_dev, dev, &scaler->dma_priv);
+ exynos_drm_register_dma(drm_dev, dev);
exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE |
@@ -471,8 +470,7 @@ static void scaler_unbind(struct device *dev, struct device *master,
struct exynos_drm_ipp *ipp = &scaler->ipp;
exynos_drm_ipp_unregister(dev, ipp);
- exynos_drm_unregister_dma(scaler->drm_dev, scaler->dev,
- &scaler->dma_priv);
+ exynos_drm_unregister_dma(scaler->drm_dev, scaler->dev);
}
static const struct component_ops scaler_component_ops = {
diff --git a/drivers/gpu/drm/exynos/exynos_mixer.c b/drivers/gpu/drm/exynos/exynos_mixer.c
index af192e5a16ef..18972e605c5d 100644
--- a/drivers/gpu/drm/exynos/exynos_mixer.c
+++ b/drivers/gpu/drm/exynos/exynos_mixer.c
@@ -94,7 +94,6 @@ struct mixer_context {
struct platform_device *pdev;
struct device *dev;
struct drm_device *drm_dev;
- void *dma_priv;
struct exynos_drm_crtc *crtc;
struct exynos_drm_plane planes[MIXER_WIN_NR];
unsigned long flags;
@@ -895,14 +894,12 @@ static int mixer_initialize(struct mixer_context *mixer_ctx,
}
}
- return exynos_drm_register_dma(drm_dev, mixer_ctx->dev,
- &mixer_ctx->dma_priv);
+ return exynos_drm_register_dma(drm_dev, mixer_ctx->dev);
}
static void mixer_ctx_remove(struct mixer_context *mixer_ctx)
{
- exynos_drm_unregister_dma(mixer_ctx->drm_dev, mixer_ctx->dev,
- &mixer_ctx->dma_priv);
+ exynos_drm_unregister_dma(mixer_ctx->drm_dev, mixer_ctx->dev);
}
static int mixer_enable_vblank(struct exynos_drm_crtc *crtc)
--
2.28.0.dirty
Now that iommu-dma is wired up, we can let it work as normal
without the dma_iommu_mapping hacks if the IOMMU driver already
supports default domains.
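Condensed, the setup logic in arm_setup_iommu_dma_ops() becomes a
two-way dispatch (a sketch of the change below, details elided):

	if (iommu_get_domain_for_dev(dev)) {
		/* Converted driver: a default domain exists, use iommu-dma */
		iommu_setup_dma_ops(dev, dma_base, size);
		return true;
	}
	/* Unconverted driver: fall back to arm_iommu_create_mapping() */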
Signed-off-by: Robin Murphy <[email protected]>
---
arch/arm/mm/dma-mapping.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 0f69ede44cd7..2ef0afc17645 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1220,6 +1220,13 @@ static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
if (!iommu)
return false;
+ /* If a default domain exists, just let iommu-dma work normally */
+ if (iommu_get_domain_for_dev(dev)) {
+ iommu_setup_dma_ops(dev, dma_base, size);
+ return true;
+ }
+
+ /* Otherwise, use the workaround until the IOMMU driver is updated */
mapping = arm_iommu_create_mapping(dev->bus, dma_base, size);
if (IS_ERR(mapping)) {
pr_warn("Failed to create %llu-byte IOMMU mapping for device %s\n",
@@ -1234,6 +1241,7 @@ static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
return false;
}
+ set_dma_ops(dev, &iommu_dma_ops);
return true;
}
@@ -1263,8 +1271,6 @@ static void arm_teardown_iommu_dma_ops(struct device *dev) { }
void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
const struct iommu_ops *iommu, bool coherent)
{
- const struct dma_map_ops *dma_ops;
-
dev->archdata.dma_coherent = coherent;
#ifdef CONFIG_SWIOTLB
dev->dma_coherent = coherent;
@@ -1278,12 +1284,9 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
if (dev->dma_ops)
return;
- if (arm_setup_iommu_dma_ops(dev, dma_base, size, iommu))
- dma_ops = &iommu_dma_ops;
- else
- dma_ops = arm_get_dma_map_ops(coherent);
+ set_dma_ops(dev, arm_get_dma_map_ops(coherent));
- set_dma_ops(dev, dma_ops);
+ arm_setup_iommu_dma_ops(dev, dma_base, size, iommu);
#ifdef CONFIG_XEN
if (xen_initial_domain())
--
2.28.0.dirty
Now that arch/arm is wired up for default domains and iommu-dma, remove
the add_device workaround.
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/iommu/arm/arm-smmu/arm-smmu.c | 10 ----------
1 file changed, 10 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 09c42af9f31e..4e52d8cb67dd 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1164,17 +1164,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
return -ENXIO;
}
- /*
- * FIXME: The arch/arm DMA API code tries to attach devices to its own
- * domains between of_xlate() and probe_device() - we have no way to cope
- * with that, so until ARM gets converted to rely on groups and default
- * domains, just say no (but more politely than by dereferencing NULL).
- * This should be at least a WARN_ON once that's sorted.
- */
cfg = dev_iommu_priv_get(dev);
- if (!cfg)
- return -ENODEV;
-
smmu = cfg->smmu;
ret = arm_smmu_rpm_get(smmu);
--
2.28.0.dirty
With no users left and generic iommu-dma now doing all the work,
clean up the last traces of the arch-specific API, plus the temporary
workarounds that you'd forgotten about because you were thinking about
zebras instead.
Signed-off-by: Robin Murphy <[email protected]>
---
arch/arm/common/dmabounce.c | 1 -
arch/arm/include/asm/device.h | 9 --
arch/arm/include/asm/dma-iommu.h | 29 -----
arch/arm/mm/dma-mapping.c | 200 +------------------------------
drivers/iommu/dma-iommu.c | 38 ++----
5 files changed, 11 insertions(+), 266 deletions(-)
delete mode 100644 arch/arm/include/asm/dma-iommu.h
diff --git a/arch/arm/common/dmabounce.c b/arch/arm/common/dmabounce.c
index f4b719bde763..064349df7bbf 100644
--- a/arch/arm/common/dmabounce.c
+++ b/arch/arm/common/dmabounce.c
@@ -30,7 +30,6 @@
#include <linux/scatterlist.h>
#include <asm/cacheflush.h>
-#include <asm/dma-iommu.h>
#undef STATS
diff --git a/arch/arm/include/asm/device.h b/arch/arm/include/asm/device.h
index be666f58bf7a..db33f389c94e 100644
--- a/arch/arm/include/asm/device.h
+++ b/arch/arm/include/asm/device.h
@@ -8,9 +8,6 @@
struct dev_archdata {
#ifdef CONFIG_DMABOUNCE
struct dmabounce_device_info *dmabounce;
-#endif
-#ifdef CONFIG_ARM_DMA_USE_IOMMU
- struct dma_iommu_mapping *mapping;
#endif
unsigned int dma_coherent:1;
unsigned int dma_ops_setup:1;
@@ -24,10 +21,4 @@ struct pdev_archdata {
#endif
};
-#ifdef CONFIG_ARM_DMA_USE_IOMMU
-#define to_dma_iommu_mapping(dev) ((dev)->archdata.mapping)
-#else
-#define to_dma_iommu_mapping(dev) NULL
-#endif
-
#endif
diff --git a/arch/arm/include/asm/dma-iommu.h b/arch/arm/include/asm/dma-iommu.h
deleted file mode 100644
index f39cfa509fe4..000000000000
--- a/arch/arm/include/asm/dma-iommu.h
+++ /dev/null
@@ -1,29 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef ASMARM_DMA_IOMMU_H
-#define ASMARM_DMA_IOMMU_H
-
-#ifdef __KERNEL__
-
-#include <linux/mm_types.h>
-#include <linux/scatterlist.h>
-#include <linux/dma-debug.h>
-#include <linux/kref.h>
-
-struct dma_iommu_mapping {
- /* iommu specific data */
- struct iommu_domain *domain;
-
- struct kref kref;
-};
-
-struct dma_iommu_mapping *
-arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, u64 size);
-
-void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping);
-
-int arm_iommu_attach_device(struct device *dev,
- struct dma_iommu_mapping *mapping);
-void arm_iommu_detach_device(struct device *dev);
-
-#endif /* __KERNEL__ */
-#endif
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 2ef0afc17645..ff6c4962161a 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -33,7 +33,6 @@
#include <asm/cacheflush.h>
#include <asm/tlbflush.h>
#include <asm/mach/arch.h>
-#include <asm/dma-iommu.h>
#include <asm/mach/map.h>
#include <asm/system_info.h>
#include <asm/dma-contiguous.h>
@@ -1073,201 +1072,6 @@ static const struct dma_map_ops *arm_get_dma_map_ops(bool coherent)
return coherent ? &arm_coherent_dma_ops : &arm_dma_ops;
}
-#ifdef CONFIG_ARM_DMA_USE_IOMMU
-
-extern const struct dma_map_ops iommu_dma_ops;
-extern int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
- u64 size, struct device *dev);
-/**
- * arm_iommu_create_mapping
- * @bus: pointer to the bus holding the client device (for IOMMU calls)
- * @base: start address of the valid IO address space
- * @size: maximum size of the valid IO address space
- *
- * Creates a mapping structure which holds information about used/unused
- * IO address ranges, which is required to perform memory allocation and
- * mapping with IOMMU aware functions.
- *
- * The client device need to be attached to the mapping with
- * arm_iommu_attach_device function.
- */
-struct dma_iommu_mapping *
-arm_iommu_create_mapping(struct bus_type *bus, dma_addr_t base, u64 size)
-{
- struct dma_iommu_mapping *mapping;
- int err = -ENOMEM;
-
- mapping = kzalloc(sizeof(*mapping), GFP_KERNEL);
- if (!mapping)
- goto err;
-
- mapping->domain = iommu_domain_alloc(bus);
- if (!mapping->domain)
- goto err2;
-
- err = iommu_get_dma_cookie(mapping->domain);
- if (err)
- goto err3;
-
- err = iommu_dma_init_domain(mapping->domain, base, size, NULL);
- if (err)
- goto err4;
-
- kref_init(&mapping->kref);
- return mapping;
-err4:
- iommu_put_dma_cookie(mapping->domain);
-err3:
- iommu_domain_free(mapping->domain);
-err2:
- kfree(mapping);
-err:
- return ERR_PTR(err);
-}
-EXPORT_SYMBOL_GPL(arm_iommu_create_mapping);
-
-static void release_iommu_mapping(struct kref *kref)
-{
- struct dma_iommu_mapping *mapping =
- container_of(kref, struct dma_iommu_mapping, kref);
-
- iommu_put_dma_cookie(mapping->domain);
- iommu_domain_free(mapping->domain);
- kfree(mapping);
-}
-
-void arm_iommu_release_mapping(struct dma_iommu_mapping *mapping)
-{
- if (mapping)
- kref_put(&mapping->kref, release_iommu_mapping);
-}
-EXPORT_SYMBOL_GPL(arm_iommu_release_mapping);
-
-static int __arm_iommu_attach_device(struct device *dev,
- struct dma_iommu_mapping *mapping)
-{
- int err;
-
- err = iommu_attach_device(mapping->domain, dev);
- if (err)
- return err;
-
- kref_get(&mapping->kref);
- to_dma_iommu_mapping(dev) = mapping;
-
- pr_debug("Attached IOMMU controller to %s device.\n", dev_name(dev));
- return 0;
-}
-
-/**
- * arm_iommu_attach_device
- * @dev: valid struct device pointer
- * @mapping: io address space mapping structure (returned from
- * arm_iommu_create_mapping)
- *
- * Attaches specified io address space mapping to the provided device.
- * This replaces the dma operations (dma_map_ops pointer) with the
- * IOMMU aware version.
- *
- * More than one client might be attached to the same io address space
- * mapping.
- */
-int arm_iommu_attach_device(struct device *dev,
- struct dma_iommu_mapping *mapping)
-{
- int err;
-
- err = __arm_iommu_attach_device(dev, mapping);
- if (err)
- return err;
-
- set_dma_ops(dev, &iommu_dma_ops);
- return 0;
-}
-EXPORT_SYMBOL_GPL(arm_iommu_attach_device);
-
-/**
- * arm_iommu_detach_device
- * @dev: valid struct device pointer
- *
- * Detaches the provided device from a previously attached map.
- * This overwrites the dma_ops pointer with appropriate non-IOMMU ops.
- */
-void arm_iommu_detach_device(struct device *dev)
-{
- struct dma_iommu_mapping *mapping;
-
- mapping = to_dma_iommu_mapping(dev);
- if (!mapping) {
- dev_warn(dev, "Not attached\n");
- return;
- }
-
- iommu_detach_device(mapping->domain, dev);
- kref_put(&mapping->kref, release_iommu_mapping);
- to_dma_iommu_mapping(dev) = NULL;
- set_dma_ops(dev, arm_get_dma_map_ops(dev->archdata.dma_coherent));
-
- pr_debug("Detached IOMMU controller from %s device.\n", dev_name(dev));
-}
-EXPORT_SYMBOL_GPL(arm_iommu_detach_device);
-
-static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
- const struct iommu_ops *iommu)
-{
- struct dma_iommu_mapping *mapping;
-
- if (!iommu)
- return false;
-
- /* If a default domain exists, just let iommu-dma work normally */
- if (iommu_get_domain_for_dev(dev)) {
- iommu_setup_dma_ops(dev, dma_base, size);
- return true;
- }
-
- /* Otherwise, use the workaround until the IOMMU driver is updated */
- mapping = arm_iommu_create_mapping(dev->bus, dma_base, size);
- if (IS_ERR(mapping)) {
- pr_warn("Failed to create %llu-byte IOMMU mapping for device %s\n",
- size, dev_name(dev));
- return false;
- }
-
- if (__arm_iommu_attach_device(dev, mapping)) {
- pr_warn("Failed to attached device %s to IOMMU_mapping\n",
- dev_name(dev));
- arm_iommu_release_mapping(mapping);
- return false;
- }
-
- set_dma_ops(dev, &iommu_dma_ops);
- return true;
-}
-
-static void arm_teardown_iommu_dma_ops(struct device *dev)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
-
- if (!mapping)
- return;
-
- arm_iommu_detach_device(dev);
- arm_iommu_release_mapping(mapping);
-}
-
-#else
-
-static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
- const struct iommu_ops *iommu)
-{
- return false;
-}
-
-static void arm_teardown_iommu_dma_ops(struct device *dev) { }
-
-#endif /* CONFIG_ARM_DMA_USE_IOMMU */
-
void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
const struct iommu_ops *iommu, bool coherent)
{
@@ -1286,7 +1090,8 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
set_dma_ops(dev, arm_get_dma_map_ops(coherent));
- arm_setup_iommu_dma_ops(dev, dma_base, size, iommu);
+ if (iommu)
+ iommu_setup_dma_ops(dev, dma_base, size);
#ifdef CONFIG_XEN
if (xen_initial_domain())
@@ -1300,7 +1105,6 @@ void arch_teardown_dma_ops(struct device *dev)
if (!dev->archdata.dma_ops_setup)
return;
- arm_teardown_iommu_dma_ops(dev);
/* Let arch_setup_dma_ops() start again from scratch upon re-probe */
set_dma_ops(dev, NULL);
}
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index ab157d155bf7..4959f5df21bd 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -25,19 +25,6 @@
#include <linux/vmalloc.h>
#include <linux/crash_dump.h>
-#ifdef CONFIG_ARM
-#include <asm/dma-iommu.h>
-#endif
-static struct iommu_domain *__iommu_get_dma_domain(struct device *dev)
-{
-#ifdef CONFIG_ARM
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
- if (mapping)
- return mapping->domain;
-#endif
- return iommu_get_dma_domain(dev);
-}
-
struct iommu_dma_msi_page {
struct list_head list;
dma_addr_t iova;
@@ -311,11 +298,8 @@ static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad)
* avoid rounding surprises. If necessary, we reserve the page at address 0
* to ensure it is an invalid IOVA. It is safe to reinitialise a domain, but
* any change which could make prior IOVAs invalid will fail.
- *
- * XXX: Not formally exported, but needs to be referenced
- * from arch/arm/mm/dma-mapping.c temporarily
*/
-int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
+static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
u64 size, struct device *dev)
{
struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -472,7 +456,7 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
static void __iommu_dma_unmap(struct device *dev, dma_addr_t dma_addr,
size_t size)
{
- struct iommu_domain *domain = __iommu_get_dma_domain(dev);
+ struct iommu_domain *domain = iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
size_t iova_off = iova_offset(iovad, dma_addr);
@@ -494,7 +478,7 @@ static void __iommu_dma_unmap(struct device *dev, dma_addr_t dma_addr,
static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
size_t size, int prot, u64 dma_mask)
{
- struct iommu_domain *domain = __iommu_get_dma_domain(dev);
+ struct iommu_domain *domain = iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
size_t iova_off = iova_offset(iovad, phys);
@@ -598,7 +582,7 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
{
- struct iommu_domain *domain = __iommu_get_dma_domain(dev);
+ struct iommu_domain *domain = iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
bool coherent = dev_is_dma_coherent(dev);
@@ -694,7 +678,7 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev,
if (dev_is_dma_coherent(dev))
return;
- phys = iommu_iova_to_phys(__iommu_get_dma_domain(dev), dma_handle);
+ phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
arch_sync_dma_for_cpu(phys, size, dir);
}
@@ -706,7 +690,7 @@ static void iommu_dma_sync_single_for_device(struct device *dev,
if (dev_is_dma_coherent(dev))
return;
- phys = iommu_iova_to_phys(__iommu_get_dma_domain(dev), dma_handle);
+ phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
arch_sync_dma_for_device(phys, size, dir);
}
@@ -847,7 +831,7 @@ static void __invalidate_sg(struct scatterlist *sg, int nents)
static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
int nents, enum dma_data_direction dir, unsigned long attrs)
{
- struct iommu_domain *domain = __iommu_get_dma_domain(dev);
+ struct iommu_domain *domain = iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
struct scatterlist *s, *prev = NULL;
@@ -1128,16 +1112,12 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
{
- struct iommu_domain *domain = __iommu_get_dma_domain(dev);
+ struct iommu_domain *domain = iommu_get_dma_domain(dev);
return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
}
-/*
- * XXX: Not formally exported, but needs to be referenced
- * from arch/arm/mm/dma-mapping.c temporarily
- */
-const struct dma_map_ops iommu_dma_ops = {
+static const struct dma_map_ops iommu_dma_ops = {
.alloc = iommu_dma_alloc,
.free = iommu_dma_free,
.mmap = iommu_dma_mmap,
--
2.28.0.dirty
Now that arch/arm is wired up for default domains and iommu-dma, remove
the shared mapping workaround and rely on groups there as well.
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/iommu/ipmmu-vmsa.c | 69 --------------------------------------
1 file changed, 69 deletions(-)
diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 0f18abda0e20..8ad74a76f402 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -26,15 +26,6 @@
#include <linux/slab.h>
#include <linux/sys_soc.h>
-#if defined(CONFIG_ARM) && !defined(CONFIG_IOMMU_DMA)
-#include <asm/dma-iommu.h>
-#else
-#define arm_iommu_create_mapping(...) NULL
-#define arm_iommu_attach_device(...) -ENODEV
-#define arm_iommu_release_mapping(...) do {} while (0)
-#define arm_iommu_detach_device(...) do {} while (0)
-#endif
-
#define IPMMU_CTX_MAX 8U
#define IPMMU_CTX_INVALID -1
@@ -67,7 +58,6 @@ struct ipmmu_vmsa_device {
s8 utlb_ctx[IPMMU_UTLB_MAX];
struct iommu_group *group;
- struct dma_iommu_mapping *mapping;
};
struct ipmmu_vmsa_domain {
@@ -805,50 +795,6 @@ static int ipmmu_of_xlate(struct device *dev,
return ipmmu_init_platform_device(dev, spec);
}
-static int ipmmu_init_arm_mapping(struct device *dev)
-{
- struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
- int ret;
-
- /*
- * Create the ARM mapping, used by the ARM DMA mapping core to allocate
- * VAs. This will allocate a corresponding IOMMU domain.
- *
- * TODO:
- * - Create one mapping per context (TLB).
- * - Make the mapping size configurable ? We currently use a 2GB mapping
- * at a 1GB offset to ensure that NULL VAs will fault.
- */
- if (!mmu->mapping) {
- struct dma_iommu_mapping *mapping;
-
- mapping = arm_iommu_create_mapping(&platform_bus_type,
- SZ_1G, SZ_2G);
- if (IS_ERR(mapping)) {
- dev_err(mmu->dev, "failed to create ARM IOMMU mapping\n");
- ret = PTR_ERR(mapping);
- goto error;
- }
-
- mmu->mapping = mapping;
- }
-
- /* Attach the ARM VA mapping to the device. */
- ret = arm_iommu_attach_device(dev, mmu->mapping);
- if (ret < 0) {
- dev_err(dev, "Failed to attach device to VA mapping\n");
- goto error;
- }
-
- return 0;
-
-error:
- if (mmu->mapping)
- arm_iommu_release_mapping(mmu->mapping);
-
- return ret;
-}
-
static struct iommu_device *ipmmu_probe_device(struct device *dev)
{
struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
@@ -862,20 +808,8 @@ static struct iommu_device *ipmmu_probe_device(struct device *dev)
return &mmu->iommu;
}
-static void ipmmu_probe_finalize(struct device *dev)
-{
- int ret = 0;
-
- if (IS_ENABLED(CONFIG_ARM) && !IS_ENABLED(CONFIG_IOMMU_DMA))
- ret = ipmmu_init_arm_mapping(dev);
-
- if (ret)
- dev_err(dev, "Can't create IOMMU mapping - DMA-OPS will not work\n");
-}
-
static void ipmmu_release_device(struct device *dev)
{
- arm_iommu_detach_device(dev);
}
static struct iommu_group *ipmmu_find_group(struct device *dev)
@@ -905,7 +839,6 @@ static const struct iommu_ops ipmmu_ops = {
.iova_to_phys = ipmmu_iova_to_phys,
.probe_device = ipmmu_probe_device,
.release_device = ipmmu_release_device,
- .probe_finalize = ipmmu_probe_finalize,
.device_group = IS_ENABLED(CONFIG_ARM) && !IS_ENABLED(CONFIG_IOMMU_DMA)
? generic_device_group : ipmmu_find_group,
.pgsize_bitmap = SZ_1G | SZ_2M | SZ_4K,
@@ -1118,8 +1051,6 @@ static int ipmmu_remove(struct platform_device *pdev)
iommu_device_sysfs_remove(&mmu->iommu);
iommu_device_unregister(&mmu->iommu);
- arm_iommu_release_mapping(mmu->mapping);
-
ipmmu_device_reset(mmu);
return 0;
--
2.28.0.dirty
Now that arch/arm is wired up for default domains and iommu-dma,
implement the corresponding driver-side support for DMA domains.
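For reference, this and the subsequent driver conversions all boil down
to the same minimal pattern; a condensed sketch, with "foo" as a
placeholder driver name and the domain_free side as per the tegra-smmu
and omap patches later in the series:

        static struct iommu_domain *foo_domain_alloc(unsigned type)
        {
                struct iommu_domain *domain;

                if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
                        return NULL;

                domain = kzalloc(sizeof(*domain), GFP_KERNEL);
                if (!domain)
                        return NULL;

                /* a DMA domain needs an IOVA cookie for iommu-dma to use */
                if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(domain)) {
                        kfree(domain);
                        return NULL;
                }

                /* ... driver-specific setup, aperture geometry, etc. ... */
                return domain;
        }

        static void foo_domain_free(struct iommu_domain *domain)
        {
                iommu_put_dma_cookie(domain);
                /* ... driver-specific teardown ... */
                kfree(domain);
        }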
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/iommu/tegra-gart.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index fac720273889..e081387080f6 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -9,6 +9,7 @@
#define dev_fmt(fmt) "gart: " fmt
+#include <linux/dma-iommu.h>
#include <linux/io.h>
#include <linux/iommu.h>
#include <linux/moduleparam.h>
@@ -145,16 +146,22 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
{
struct iommu_domain *domain;
- if (type != IOMMU_DOMAIN_UNMANAGED)
+ if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
return NULL;
domain = kzalloc(sizeof(*domain), GFP_KERNEL);
- if (domain) {
- domain->geometry.aperture_start = gart_handle->iovmm_base;
- domain->geometry.aperture_end = gart_handle->iovmm_end - 1;
- domain->geometry.force_aperture = true;
+ if (!domain)
+ return NULL;
+
+ if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(domain)) {
+ kfree(domain);
+ return NULL;
}
+ domain->geometry.aperture_start = gart_handle->iovmm_base;
+ domain->geometry.aperture_end = gart_handle->iovmm_end - 1;
+ domain->geometry.force_aperture = true;
+
return domain;
}
--
2.28.0.dirty
Now that arch/arm is wired up for default domains and iommu-dma,
implement the corresponding driver-side support for DMA domains.
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/iommu/tegra-smmu.c | 37 +++++++++++++++++++++----------------
1 file changed, 21 insertions(+), 16 deletions(-)
diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 124c8848ab7e..8e276eac84d9 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -5,6 +5,7 @@
#include <linux/bitops.h>
#include <linux/debugfs.h>
+#include <linux/dma-iommu.h>
#include <linux/err.h>
#include <linux/iommu.h>
#include <linux/kernel.h>
@@ -278,7 +279,7 @@ static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
{
struct tegra_smmu_as *as;
- if (type != IOMMU_DOMAIN_UNMANAGED)
+ if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
return NULL;
as = kzalloc(sizeof(*as), GFP_KERNEL);
@@ -288,25 +289,19 @@ static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
as->pd = alloc_page(GFP_KERNEL | __GFP_DMA | __GFP_ZERO);
- if (!as->pd) {
- kfree(as);
- return NULL;
- }
+ if (!as->pd)
+ goto out_free_as;
as->count = kcalloc(SMMU_NUM_PDE, sizeof(u32), GFP_KERNEL);
- if (!as->count) {
- __free_page(as->pd);
- kfree(as);
- return NULL;
- }
+ if (!as->count)
+ goto out_free_all;
as->pts = kcalloc(SMMU_NUM_PDE, sizeof(*as->pts), GFP_KERNEL);
- if (!as->pts) {
- kfree(as->count);
- __free_page(as->pd);
- kfree(as);
- return NULL;
- }
+ if (!as->pts)
+ goto out_free_all;
+
+ if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(&as->domain))
+ goto out_free_all;
/* setup aperture */
as->domain.geometry.aperture_start = 0;
@@ -314,12 +309,22 @@ static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
as->domain.geometry.force_aperture = true;
return &as->domain;
+
+out_free_all:
+ kfree(as->pts);
+ kfree(as->count);
+ __free_page(as->pd);
+out_free_as:
+ kfree(as);
+ return NULL;
}
static void tegra_smmu_domain_free(struct iommu_domain *domain)
{
struct tegra_smmu_as *as = to_smmu_as(domain);
+ iommu_put_dma_cookie(domain);
+
/* TODO: free page directory and page tables */
WARN_ON_ONCE(as->use_count);
--
2.28.0.dirty
Now that arch/arm is wired up for default domains and iommu-dma, devices
behind IOMMUs will get mappings set up automatically as appropriate, so
there is no need for drivers to do so manually.
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/media/platform/omap3isp/isp.c | 68 ++-------------------------
drivers/media/platform/omap3isp/isp.h | 3 --
2 files changed, 3 insertions(+), 68 deletions(-)
diff --git a/drivers/media/platform/omap3isp/isp.c b/drivers/media/platform/omap3isp/isp.c
index b91e472ee764..196522883231 100644
--- a/drivers/media/platform/omap3isp/isp.c
+++ b/drivers/media/platform/omap3isp/isp.c
@@ -56,10 +56,6 @@
#include <linux/sched.h>
#include <linux/vmalloc.h>
-#ifdef CONFIG_ARM_DMA_USE_IOMMU
-#include <asm/dma-iommu.h>
-#endif
-
#include <media/v4l2-common.h>
#include <media/v4l2-fwnode.h>
#include <media/v4l2-device.h>
@@ -1942,51 +1938,6 @@ static int isp_initialize_modules(struct isp_device *isp)
return ret;
}
-static void isp_detach_iommu(struct isp_device *isp)
-{
-#ifdef CONFIG_ARM_DMA_USE_IOMMU
- arm_iommu_detach_device(isp->dev);
- arm_iommu_release_mapping(isp->mapping);
- isp->mapping = NULL;
-#endif
-}
-
-static int isp_attach_iommu(struct isp_device *isp)
-{
-#ifdef CONFIG_ARM_DMA_USE_IOMMU
- struct dma_iommu_mapping *mapping;
- int ret;
-
- /*
- * Create the ARM mapping, used by the ARM DMA mapping core to allocate
- * VAs. This will allocate a corresponding IOMMU domain.
- */
- mapping = arm_iommu_create_mapping(&platform_bus_type, SZ_1G, SZ_2G);
- if (IS_ERR(mapping)) {
- dev_err(isp->dev, "failed to create ARM IOMMU mapping\n");
- return PTR_ERR(mapping);
- }
-
- isp->mapping = mapping;
-
- /* Attach the ARM VA mapping to the device. */
- ret = arm_iommu_attach_device(isp->dev, mapping);
- if (ret < 0) {
- dev_err(isp->dev, "failed to attach device to VA mapping\n");
- goto error;
- }
-
- return 0;
-
-error:
- arm_iommu_release_mapping(isp->mapping);
- isp->mapping = NULL;
- return ret;
-#else
- return -ENODEV;
-#endif
-}
-
/*
* isp_remove - Remove ISP platform device
* @pdev: Pointer to ISP platform device
@@ -2002,10 +1953,6 @@ static int isp_remove(struct platform_device *pdev)
isp_cleanup_modules(isp);
isp_xclk_cleanup(isp);
- __omap3isp_get(isp, false);
- isp_detach_iommu(isp);
- __omap3isp_put(isp, false);
-
media_entity_enum_cleanup(&isp->crashed);
v4l2_async_notifier_cleanup(&isp->notifier);
@@ -2383,18 +2330,11 @@ static int isp_probe(struct platform_device *pdev)
isp->mmio_hist_base_phys =
mem->start + isp_res_maps[m].offset[OMAP3_ISP_IOMEM_HIST];
- /* IOMMU */
- ret = isp_attach_iommu(isp);
- if (ret < 0) {
- dev_err(&pdev->dev, "unable to attach to IOMMU\n");
- goto error_isp;
- }
-
/* Interrupt */
ret = platform_get_irq(pdev, 0);
if (ret <= 0) {
ret = -ENODEV;
- goto error_iommu;
+ goto error_isp;
}
isp->irq_num = ret;
@@ -2402,13 +2342,13 @@ static int isp_probe(struct platform_device *pdev)
"OMAP3 ISP", isp)) {
dev_err(isp->dev, "Unable to request IRQ\n");
ret = -EINVAL;
- goto error_iommu;
+ goto error_isp;
}
/* Entities */
ret = isp_initialize_modules(isp);
if (ret < 0)
- goto error_iommu;
+ goto error_isp;
ret = isp_register_entities(isp);
if (ret < 0)
@@ -2433,8 +2373,6 @@ static int isp_probe(struct platform_device *pdev)
isp_unregister_entities(isp);
error_modules:
isp_cleanup_modules(isp);
-error_iommu:
- isp_detach_iommu(isp);
error_isp:
isp_xclk_cleanup(isp);
__omap3isp_put(isp, false);
diff --git a/drivers/media/platform/omap3isp/isp.h b/drivers/media/platform/omap3isp/isp.h
index a9d760fbf349..b50459106d89 100644
--- a/drivers/media/platform/omap3isp/isp.h
+++ b/drivers/media/platform/omap3isp/isp.h
@@ -145,7 +145,6 @@ struct isp_xclk {
* @syscon: Regmap for the syscon register space
* @syscon_offset: Offset of the CSIPHY control register in syscon
* @phy_type: ISP_PHY_TYPE_{3430,3630}
- * @mapping: IOMMU mapping
* @stat_lock: Spinlock for handling statistics
* @isp_mutex: Mutex for serializing requests to ISP.
* @stop_failure: Indicates that an entity failed to stop.
@@ -185,8 +184,6 @@ struct isp_device {
u32 syscon_offset;
u32 phy_type;
- struct dma_iommu_mapping *mapping;
-
/* ISP Obj */
spinlock_t stat_lock; /* common lock for statistic drivers */
struct mutex isp_mutex; /* For handling ref_count field */
--
2.28.0.dirty
Merge the coherent and non-coherent callbacks down to a single
implementation each, relying on the generic dev->dma_coherent
flag at the points where the difference matters.
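The recurring shape of the change, condensed from the diff below: where
the coherent/non-coherent distinction used to be baked into separate
entry points, the single implementation now derives it from the device:

        static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
                        dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
        {
                /* replaces the coherent_flag argument threaded through wrappers */
                int coherent_flag = dev->dma_coherent ? COHERENT : NORMAL;
                /* ... single shared implementation follows ... */
        }

and similarly for the streaming ops, where cache maintenance is simply
skipped for coherent devices:

        if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
                __dma_page_cpu_to_dev(page, offset, size, dir);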
Signed-off-by: Robin Murphy <[email protected]>
---
arch/arm/Kconfig | 4 +-
arch/arm/mm/dma-mapping.c | 281 +++++++++++---------------------------
2 files changed, 79 insertions(+), 206 deletions(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e00d94b16658..b91273f9fd43 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -19,8 +19,8 @@ config ARM
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_STRICT_KERNEL_RWX if MMU && !XIP_KERNEL
select ARCH_HAS_STRICT_MODULE_RWX if MMU
- select ARCH_HAS_SYNC_DMA_FOR_DEVICE if SWIOTLB
- select ARCH_HAS_SYNC_DMA_FOR_CPU if SWIOTLB
+ select ARCH_HAS_SYNC_DMA_FOR_DEVICE if SWIOTLB || ARM_DMA_USE_IOMMU
+ select ARCH_HAS_SYNC_DMA_FOR_CPU if SWIOTLB || ARM_DMA_USE_IOMMU
select ARCH_HAS_TEARDOWN_DMA_OPS if MMU
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAVE_CUSTOM_GPIO_H
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ffa387f343dc..1bb7e9608f75 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1418,13 +1418,13 @@ static void __iommu_free_atomic(struct device *dev, void *cpu_addr,
__free_from_pool(cpu_addr, size);
}
-static void *__arm_iommu_alloc_attrs(struct device *dev, size_t size,
- dma_addr_t *handle, gfp_t gfp, unsigned long attrs,
- int coherent_flag)
+static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
+ dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
{
pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL);
struct page **pages;
void *addr = NULL;
+ int coherent_flag = dev->dma_coherent ? COHERENT : NORMAL;
*handle = DMA_MAPPING_ERROR;
size = PAGE_ALIGN(size);
@@ -1467,19 +1467,7 @@ static void *__arm_iommu_alloc_attrs(struct device *dev, size_t size,
return NULL;
}
-static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
- dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
-{
- return __arm_iommu_alloc_attrs(dev, size, handle, gfp, attrs, NORMAL);
-}
-
-static void *arm_coherent_iommu_alloc_attrs(struct device *dev, size_t size,
- dma_addr_t *handle, gfp_t gfp, unsigned long attrs)
-{
- return __arm_iommu_alloc_attrs(dev, size, handle, gfp, attrs, COHERENT);
-}
-
-static int __arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
+static int arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma,
void *cpu_addr, dma_addr_t dma_addr, size_t size,
unsigned long attrs)
{
@@ -1493,35 +1481,24 @@ static int __arm_iommu_mmap_attrs(struct device *dev, struct vm_area_struct *vma
if (vma->vm_pgoff >= nr_pages)
return -ENXIO;
+ if (!dev->dma_coherent)
+ vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
+
err = vm_map_pages(vma, pages, nr_pages);
if (err)
pr_err("Remapping memory failed: %d\n", err);
return err;
}
-static int arm_iommu_mmap_attrs(struct device *dev,
- struct vm_area_struct *vma, void *cpu_addr,
- dma_addr_t dma_addr, size_t size, unsigned long attrs)
-{
- vma->vm_page_prot = __get_dma_pgprot(attrs, vma->vm_page_prot);
-
- return __arm_iommu_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, attrs);
-}
-
-static int arm_coherent_iommu_mmap_attrs(struct device *dev,
- struct vm_area_struct *vma, void *cpu_addr,
- dma_addr_t dma_addr, size_t size, unsigned long attrs)
-{
- return __arm_iommu_mmap_attrs(dev, vma, cpu_addr, dma_addr, size, attrs);
-}
/*
* free a page as defined by the above mapping.
* Must not be called with IRQs disabled.
*/
-static void __arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
- dma_addr_t handle, unsigned long attrs, int coherent_flag)
+static void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
+ dma_addr_t handle, unsigned long attrs)
{
+ int coherent_flag = dev->dma_coherent ? COHERENT : NORMAL;
struct page **pages;
size = PAGE_ALIGN(size);
@@ -1543,19 +1520,6 @@ static void __arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_ad
__iommu_free_buffer(dev, pages, size, attrs);
}
-static void arm_iommu_free_attrs(struct device *dev, size_t size,
- void *cpu_addr, dma_addr_t handle,
- unsigned long attrs)
-{
- __arm_iommu_free_attrs(dev, size, cpu_addr, handle, attrs, NORMAL);
-}
-
-static void arm_coherent_iommu_free_attrs(struct device *dev, size_t size,
- void *cpu_addr, dma_addr_t handle, unsigned long attrs)
-{
- __arm_iommu_free_attrs(dev, size, cpu_addr, handle, attrs, COHERENT);
-}
-
static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
void *cpu_addr, dma_addr_t dma_addr,
size_t size, unsigned long attrs)
@@ -1575,8 +1539,7 @@ static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
*/
static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
size_t size, dma_addr_t *handle,
- enum dma_data_direction dir, unsigned long attrs,
- bool is_coherent)
+ enum dma_data_direction dir, unsigned long attrs)
{
struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
dma_addr_t iova, iova_base;
@@ -1596,7 +1559,7 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
phys_addr_t phys = page_to_phys(sg_page(s));
unsigned int len = PAGE_ALIGN(s->offset + s->length);
- if (!is_coherent && (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
+ if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
prot = __dma_info_to_prot(dir, attrs);
@@ -1616,70 +1579,6 @@ static int __map_sg_chunk(struct device *dev, struct scatterlist *sg,
return ret;
}
-static int __iommu_map_sg(struct device *dev, struct scatterlist *sg, int nents,
- enum dma_data_direction dir, unsigned long attrs,
- bool is_coherent)
-{
- struct scatterlist *s = sg, *dma = sg, *start = sg;
- int i, count = 0;
- unsigned int offset = s->offset;
- unsigned int size = s->offset + s->length;
- unsigned int max = dma_get_max_seg_size(dev);
-
- for (i = 1; i < nents; i++) {
- s = sg_next(s);
-
- s->dma_address = DMA_MAPPING_ERROR;
- s->dma_length = 0;
-
- if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) {
- if (__map_sg_chunk(dev, start, size, &dma->dma_address,
- dir, attrs, is_coherent) < 0)
- goto bad_mapping;
-
- dma->dma_address += offset;
- dma->dma_length = size - offset;
-
- size = offset = s->offset;
- start = s;
- dma = sg_next(dma);
- count += 1;
- }
- size += s->length;
- }
- if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir, attrs,
- is_coherent) < 0)
- goto bad_mapping;
-
- dma->dma_address += offset;
- dma->dma_length = size - offset;
-
- return count+1;
-
-bad_mapping:
- for_each_sg(sg, s, count, i)
- __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
- return 0;
-}
-
-/**
- * arm_coherent_iommu_map_sg - map a set of SG buffers for streaming mode DMA
- * @dev: valid struct device pointer
- * @sg: list of buffers
- * @nents: number of buffers to map
- * @dir: DMA transfer direction
- *
- * Map a set of i/o coherent buffers described by scatterlist in streaming
- * mode for DMA. The scatter gather list elements are merged together (if
- * possible) and tagged with the appropriate dma address and length. They are
- * obtained via sg_dma_{address,length}.
- */
-static int arm_coherent_iommu_map_sg(struct device *dev, struct scatterlist *sg,
- int nents, enum dma_data_direction dir, unsigned long attrs)
-{
- return __iommu_map_sg(dev, sg, nents, dir, attrs, true);
-}
-
/**
* arm_iommu_map_sg - map a set of SG buffers for streaming mode DMA
* @dev: valid struct device pointer
@@ -1695,41 +1594,45 @@ static int arm_coherent_iommu_map_sg(struct device *dev, struct scatterlist *sg,
static int arm_iommu_map_sg(struct device *dev, struct scatterlist *sg,
int nents, enum dma_data_direction dir, unsigned long attrs)
{
- return __iommu_map_sg(dev, sg, nents, dir, attrs, false);
-}
+ struct scatterlist *s = sg, *dma = sg, *start = sg;
+ int i, count = 0;
+ unsigned int offset = s->offset;
+ unsigned int size = s->offset + s->length;
+ unsigned int max = dma_get_max_seg_size(dev);
-static void __iommu_unmap_sg(struct device *dev, struct scatterlist *sg,
- int nents, enum dma_data_direction dir,
- unsigned long attrs, bool is_coherent)
-{
- struct scatterlist *s;
- int i;
+ for (i = 1; i < nents; i++) {
+ s = sg_next(s);
- for_each_sg(sg, s, nents, i) {
- if (sg_dma_len(s))
- __iommu_remove_mapping(dev, sg_dma_address(s),
- sg_dma_len(s));
- if (!is_coherent && (attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
- __dma_page_dev_to_cpu(sg_page(s), s->offset,
- s->length, dir);
+ s->dma_address = DMA_MAPPING_ERROR;
+ s->dma_length = 0;
+
+ if (s->offset || (size & ~PAGE_MASK) || size + s->length > max) {
+ if (__map_sg_chunk(dev, start, size, &dma->dma_address,
+ dir, attrs) < 0)
+ goto bad_mapping;
+
+ dma->dma_address += offset;
+ dma->dma_length = size - offset;
+
+ size = offset = s->offset;
+ start = s;
+ dma = sg_next(dma);
+ count += 1;
+ }
+ size += s->length;
}
-}
+ if (__map_sg_chunk(dev, start, size, &dma->dma_address, dir, attrs) < 0)
+ goto bad_mapping;
-/**
- * arm_coherent_iommu_unmap_sg - unmap a set of SG buffers mapped by dma_map_sg
- * @dev: valid struct device pointer
- * @sg: list of buffers
- * @nents: number of buffers to unmap (same as was passed to dma_map_sg)
- * @dir: DMA transfer direction (same as was passed to dma_map_sg)
- *
- * Unmap a set of streaming mode DMA translations. Again, CPU access
- * rules concerning calls here are the same as for dma_unmap_single().
- */
-static void arm_coherent_iommu_unmap_sg(struct device *dev,
- struct scatterlist *sg, int nents, enum dma_data_direction dir,
- unsigned long attrs)
-{
- __iommu_unmap_sg(dev, sg, nents, dir, attrs, true);
+ dma->dma_address += offset;
+ dma->dma_length = size - offset;
+
+ return count+1;
+
+bad_mapping:
+ for_each_sg(sg, s, count, i)
+ __iommu_remove_mapping(dev, sg_dma_address(s), sg_dma_len(s));
+ return 0;
}
/**
@@ -1747,7 +1650,17 @@ static void arm_iommu_unmap_sg(struct device *dev,
enum dma_data_direction dir,
unsigned long attrs)
{
- __iommu_unmap_sg(dev, sg, nents, dir, attrs, false);
+ struct scatterlist *s;
+ int i;
+
+ for_each_sg(sg, s, nents, i) {
+ if (sg_dma_len(s))
+ __iommu_remove_mapping(dev, sg_dma_address(s),
+ sg_dma_len(s));
+ if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ __dma_page_dev_to_cpu(sg_page(s), s->offset,
+ s->length, dir);
+ }
}
/**
@@ -1787,18 +1700,17 @@ static void arm_iommu_sync_sg_for_device(struct device *dev,
__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
}
-
/**
- * arm_coherent_iommu_map_page
+ * arm_iommu_map_page
* @dev: valid struct device pointer
* @page: page that buffer resides in
* @offset: offset into page for start of buffer
* @size: size of buffer to map
* @dir: DMA transfer direction
*
- * Coherent IOMMU aware version of arm_dma_map_page()
+ * IOMMU aware version of arm_dma_map_page()
*/
-static dma_addr_t arm_coherent_iommu_map_page(struct device *dev, struct page *page,
+static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
unsigned long offset, size_t size, enum dma_data_direction dir,
unsigned long attrs)
{
@@ -1806,6 +1718,9 @@ static dma_addr_t arm_coherent_iommu_map_page(struct device *dev, struct page *p
dma_addr_t dma_addr;
int ret, prot, len = PAGE_ALIGN(size + offset);
+ if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
+ __dma_page_cpu_to_dev(page, offset, size, dir);
+
dma_addr = __alloc_iova(mapping, len);
if (dma_addr == DMA_MAPPING_ERROR)
return dma_addr;
@@ -1822,50 +1737,6 @@ static dma_addr_t arm_coherent_iommu_map_page(struct device *dev, struct page *p
return DMA_MAPPING_ERROR;
}
-/**
- * arm_iommu_map_page
- * @dev: valid struct device pointer
- * @page: page that buffer resides in
- * @offset: offset into page for start of buffer
- * @size: size of buffer to map
- * @dir: DMA transfer direction
- *
- * IOMMU aware version of arm_dma_map_page()
- */
-static dma_addr_t arm_iommu_map_page(struct device *dev, struct page *page,
- unsigned long offset, size_t size, enum dma_data_direction dir,
- unsigned long attrs)
-{
- if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
- __dma_page_cpu_to_dev(page, offset, size, dir);
-
- return arm_coherent_iommu_map_page(dev, page, offset, size, dir, attrs);
-}
-
-/**
- * arm_coherent_iommu_unmap_page
- * @dev: valid struct device pointer
- * @handle: DMA address of buffer
- * @size: size of buffer (same as passed to dma_map_page)
- * @dir: DMA transfer direction (same as passed to dma_map_page)
- *
- * Coherent IOMMU aware version of arm_dma_unmap_page()
- */
-static void arm_coherent_iommu_unmap_page(struct device *dev, dma_addr_t handle,
- size_t size, enum dma_data_direction dir, unsigned long attrs)
-{
- struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
- dma_addr_t iova = handle & PAGE_MASK;
- int offset = handle & ~PAGE_MASK;
- int len = PAGE_ALIGN(size + offset);
-
- if (!iova)
- return;
-
- iommu_unmap(mapping->domain, iova, len);
- __free_iova(mapping, iova, len);
-}
-
/**
* arm_iommu_unmap_page
* @dev: valid struct device pointer
@@ -1880,15 +1751,17 @@ static void arm_iommu_unmap_page(struct device *dev, dma_addr_t handle,
{
struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
dma_addr_t iova = handle & PAGE_MASK;
- struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
+ struct page *page;
int offset = handle & ~PAGE_MASK;
int len = PAGE_ALIGN(size + offset);
if (!iova)
return;
- if ((attrs & DMA_ATTR_SKIP_CPU_SYNC) == 0)
+ if (!dev->dma_coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
+ page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
__dma_page_dev_to_cpu(page, offset, size, dir);
+ }
iommu_unmap(mapping->domain, iova, len);
__free_iova(mapping, iova, len);
@@ -2000,16 +1873,16 @@ static const struct dma_map_ops iommu_ops = {
};
static const struct dma_map_ops iommu_coherent_ops = {
- .alloc = arm_coherent_iommu_alloc_attrs,
- .free = arm_coherent_iommu_free_attrs,
- .mmap = arm_coherent_iommu_mmap_attrs,
+ .alloc = arm_iommu_alloc_attrs,
+ .free = arm_iommu_free_attrs,
+ .mmap = arm_iommu_mmap_attrs,
.get_sgtable = arm_iommu_get_sgtable,
- .map_page = arm_coherent_iommu_map_page,
- .unmap_page = arm_coherent_iommu_unmap_page,
+ .map_page = arm_iommu_map_page,
+ .unmap_page = arm_iommu_unmap_page,
- .map_sg = arm_coherent_iommu_map_sg,
- .unmap_sg = arm_coherent_iommu_unmap_sg,
+ .map_sg = arm_iommu_map_sg,
+ .unmap_sg = arm_iommu_unmap_sg,
.map_resource = arm_iommu_map_resource,
.unmap_resource = arm_iommu_unmap_resource,
@@ -2291,7 +2164,7 @@ void arch_teardown_dma_ops(struct device *dev)
set_dma_ops(dev, NULL);
}
-#ifdef CONFIG_SWIOTLB
+#if defined(CONFIG_SWIOTLB) || defined(CONFIG_ARM_DMA_USE_IOMMU)
void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
enum dma_data_direction dir)
{
@@ -2319,4 +2192,4 @@ void arch_dma_free(struct device *dev, size_t size, void *cpu_addr,
{
__arm_dma_free(dev, size, cpu_addr, dma_handle, attrs, false);
}
-#endif /* CONFIG_SWIOTLB */
+#endif /* CONFIG_SWIOTLB || CONFIG_ARM_DMA_USE_IOMMU */
--
2.28.0.dirty
The dma_sync_* operations are now the only difference between the
coherent and non-coherent IOMMU ops. With some minor tweaks to make
those safe for coherent devices at minimal overhead, we can condense
everything down to a single set of DMA ops.
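Those tweaks amount to early returns in the sync callbacks, e.g.
(condensed from the diff below):

        static void arm_iommu_sync_sg_for_device(struct device *dev,
                        struct scatterlist *sg, int nents, enum dma_data_direction dir)
        {
                struct scatterlist *s;
                int i;

                /* coherent devices need no cache maintenance; bail out early */
                if (dev->dma_coherent)
                        return;

                for_each_sg(sg, s, nents, i)
                        __dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
        }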
Signed-off-by: Robin Murphy <[email protected]>
---
arch/arm/mm/dma-mapping.c | 41 +++++++++++++--------------------------
1 file changed, 13 insertions(+), 28 deletions(-)
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 1bb7e9608f75..0537c97cebe1 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1677,6 +1677,9 @@ static void arm_iommu_sync_sg_for_cpu(struct device *dev,
struct scatterlist *s;
int i;
+ if (dev->dma_coherent)
+ return;
+
for_each_sg(sg, s, nents, i)
__dma_page_dev_to_cpu(sg_page(s), s->offset, s->length, dir);
@@ -1696,6 +1699,9 @@ static void arm_iommu_sync_sg_for_device(struct device *dev,
struct scatterlist *s;
int i;
+ if (dev->dma_coherent)
+ return;
+
for_each_sg(sg, s, nents, i)
__dma_page_cpu_to_dev(sg_page(s), s->offset, s->length, dir);
}
@@ -1829,12 +1835,13 @@ static void arm_iommu_sync_single_for_cpu(struct device *dev,
{
struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
dma_addr_t iova = handle & PAGE_MASK;
- struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
+ struct page *page;
unsigned int offset = handle & ~PAGE_MASK;
- if (!iova)
+ if (dev->dma_coherent || !iova)
return;
+ page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
__dma_page_dev_to_cpu(page, offset, size, dir);
}
@@ -1843,12 +1850,13 @@ static void arm_iommu_sync_single_for_device(struct device *dev,
{
struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
dma_addr_t iova = handle & PAGE_MASK;
- struct page *page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
+ struct page *page;
unsigned int offset = handle & ~PAGE_MASK;
- if (!iova)
+ if (dev->dma_coherent || !iova)
return;
+ page = phys_to_page(iommu_iova_to_phys(mapping->domain, iova));
__dma_page_cpu_to_dev(page, offset, size, dir);
}
@@ -1872,22 +1880,6 @@ static const struct dma_map_ops iommu_ops = {
.unmap_resource = arm_iommu_unmap_resource,
};
-static const struct dma_map_ops iommu_coherent_ops = {
- .alloc = arm_iommu_alloc_attrs,
- .free = arm_iommu_free_attrs,
- .mmap = arm_iommu_mmap_attrs,
- .get_sgtable = arm_iommu_get_sgtable,
-
- .map_page = arm_iommu_map_page,
- .unmap_page = arm_iommu_unmap_page,
-
- .map_sg = arm_iommu_map_sg,
- .unmap_sg = arm_iommu_unmap_sg,
-
- .map_resource = arm_iommu_map_resource,
- .unmap_resource = arm_iommu_unmap_resource,
-};
-
/**
* arm_iommu_create_mapping
* @bus: pointer to the bus holding the client device (for IOMMU calls)
@@ -2067,11 +2059,6 @@ void arm_iommu_detach_device(struct device *dev)
}
EXPORT_SYMBOL_GPL(arm_iommu_detach_device);
-static const struct dma_map_ops *arm_get_iommu_dma_map_ops(bool coherent)
-{
- return coherent ? &iommu_coherent_ops : &iommu_ops;
-}
-
static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
const struct iommu_ops *iommu)
{
@@ -2118,8 +2105,6 @@ static bool arm_setup_iommu_dma_ops(struct device *dev, u64 dma_base, u64 size,
static void arm_teardown_iommu_dma_ops(struct device *dev) { }
-#define arm_get_iommu_dma_map_ops arm_get_dma_map_ops
-
#endif /* CONFIG_ARM_DMA_USE_IOMMU */
void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
@@ -2141,7 +2126,7 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
return;
if (arm_setup_iommu_dma_ops(dev, dma_base, size, iommu))
- dma_ops = arm_get_iommu_dma_map_ops(coherent);
+ dma_ops = &iommu_ops;
else
dma_ops = arm_get_dma_map_ops(coherent);
--
2.28.0.dirty
Now that arch/arm is wired up for default domains and iommu-dma,
implement the corresponding driver-side support for groups and DMA
domains to replace the shared mapping workaround.
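The one-domain-for-everything hardware constraint is now expressed via
a single iommu_group allocated at probe time, which every client then
shares (and hence they all share one default DMA domain); condensed
from the diff below:

        /* probe: all masters funnel into one group */
        data->m4u_group = iommu_group_alloc();
        if (IS_ERR(data->m4u_group))
                return PTR_ERR(data->m4u_group);

        /* .device_group callback: hand out a reference to that group */
        static struct iommu_group *mtk_iommu_device_group(struct device *dev)
        {
                struct mtk_iommu_data *data = dev_iommu_priv_get(dev);

                return iommu_group_ref_get(data->m4u_group);
        }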
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/iommu/mtk_iommu.h | 2 -
drivers/iommu/mtk_iommu_v1.c | 153 +++++++++++------------------------
2 files changed, 48 insertions(+), 107 deletions(-)
diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
index 122925dbe547..6253e98d810c 100644
--- a/drivers/iommu/mtk_iommu.h
+++ b/drivers/iommu/mtk_iommu.h
@@ -67,8 +67,6 @@ struct mtk_iommu_data {
struct iommu_device iommu;
const struct mtk_iommu_plat_data *plat_data;
- struct dma_iommu_mapping *mapping; /* For mtk_iommu_v1.c */
-
struct list_head list;
struct mtk_smi_larb_iommu larb_imu[MTK_LARB_NR_MAX];
};
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 82ddfe9170d4..40c89b8d3ac4 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -28,7 +28,6 @@
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <asm/barrier.h>
-#include <asm/dma-iommu.h>
#include <linux/init.h>
#include <dt-bindings/memory/mt2701-larb-port.h>
#include <soc/mediatek/smi.h>
@@ -240,13 +239,18 @@ static struct iommu_domain *mtk_iommu_domain_alloc(unsigned type)
{
struct mtk_iommu_domain *dom;
- if (type != IOMMU_DOMAIN_UNMANAGED)
+ if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
return NULL;
dom = kzalloc(sizeof(*dom), GFP_KERNEL);
if (!dom)
return NULL;
+ if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(&dom->domain)) {
+ kfree(dom);
+ return NULL;
+ }
+
return &dom->domain;
}
@@ -257,6 +261,7 @@ static void mtk_iommu_domain_free(struct iommu_domain *domain)
dma_free_coherent(data->dev, M2701_IOMMU_PGT_SIZE,
dom->pgt_va, dom->pgt_pa);
+ iommu_put_dma_cookie(domain);
kfree(to_mtk_domain(domain));
}
@@ -265,14 +270,8 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
{
struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
- struct dma_iommu_mapping *mtk_mapping;
int ret;
- /* Only allow the domain created internally. */
- mtk_mapping = data->mapping;
- if (mtk_mapping->domain != domain)
- return 0;
-
if (!data->m4u_dom) {
data->m4u_dom = dom;
ret = mtk_iommu_domain_finalise(data);
@@ -358,18 +357,39 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
static const struct iommu_ops mtk_iommu_ops;
-/*
- * MTK generation one iommu HW only support one iommu domain, and all the client
- * sharing the same iova address space.
- */
-static int mtk_iommu_create_mapping(struct device *dev,
- struct of_phandle_args *args)
+static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
{
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
+
+ if (!fwspec || fwspec->ops != &mtk_iommu_ops)
+ return ERR_PTR(-ENODEV); /* Not an iommu client device */
+
+ data = dev_iommu_priv_get(dev);
+
+ return &data->iommu;
+}
+
+static void mtk_iommu_release_device(struct device *dev)
+{
+ struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+
+ if (!fwspec || fwspec->ops != &mtk_iommu_ops)
+ return;
+
+ iommu_fwspec_free(dev);
+}
+
+static struct iommu_group *mtk_iommu_device_group(struct device *dev)
+{
+ struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
+
+ return iommu_group_ref_get(data->m4u_group);
+}
+
+static int mtk_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
+{
struct platform_device *m4updev;
- struct dma_iommu_mapping *mtk_mapping;
- int ret;
if (args->args_count != 1) {
dev_err(dev, "invalid #iommu-cells(%d) property for IOMMU\n",
@@ -377,15 +397,6 @@ static int mtk_iommu_create_mapping(struct device *dev,
return -EINVAL;
}
- if (!fwspec) {
- ret = iommu_fwspec_init(dev, &args->np->fwnode, &mtk_iommu_ops);
- if (ret)
- return ret;
- fwspec = dev_iommu_fwspec_get(dev);
- } else if (dev_iommu_fwspec_get(dev)->ops != &mtk_iommu_ops) {
- return -EINVAL;
- }
-
if (!dev_iommu_priv_get(dev)) {
/* Get the m4u device */
m4updev = of_find_device_by_node(args->np);
@@ -395,83 +406,7 @@ static int mtk_iommu_create_mapping(struct device *dev,
dev_iommu_priv_set(dev, platform_get_drvdata(m4updev));
}
- ret = iommu_fwspec_add_ids(dev, args->args, 1);
- if (ret)
- return ret;
-
- data = dev_iommu_priv_get(dev);
- mtk_mapping = data->mapping;
- if (!mtk_mapping) {
- /* MTK iommu support 4GB iova address space. */
- mtk_mapping = arm_iommu_create_mapping(&platform_bus_type,
- 0, 1ULL << 32);
- if (IS_ERR(mtk_mapping))
- return PTR_ERR(mtk_mapping);
-
- data->mapping = mtk_mapping;
- }
-
- return 0;
-}
-
-static int mtk_iommu_def_domain_type(struct device *dev)
-{
- return IOMMU_DOMAIN_UNMANAGED;
-}
-
-static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
-{
- struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
- struct of_phandle_args iommu_spec;
- struct of_phandle_iterator it;
- struct mtk_iommu_data *data;
- int err;
-
- of_for_each_phandle(&it, err, dev->of_node, "iommus",
- "#iommu-cells", -1) {
- int count = of_phandle_iterator_args(&it, iommu_spec.args,
- MAX_PHANDLE_ARGS);
- iommu_spec.np = of_node_get(it.node);
- iommu_spec.args_count = count;
-
- mtk_iommu_create_mapping(dev, &iommu_spec);
-
- /* dev->iommu_fwspec might have changed */
- fwspec = dev_iommu_fwspec_get(dev);
-
- of_node_put(iommu_spec.np);
- }
-
- if (!fwspec || fwspec->ops != &mtk_iommu_ops)
- return ERR_PTR(-ENODEV); /* Not a iommu client device */
-
- data = dev_iommu_priv_get(dev);
-
- return &data->iommu;
-}
-
-static void mtk_iommu_probe_finalize(struct device *dev)
-{
- struct dma_iommu_mapping *mtk_mapping;
- struct mtk_iommu_data *data;
- int err;
-
- data = dev_iommu_priv_get(dev);
- mtk_mapping = data->mapping;
-
- err = arm_iommu_attach_device(dev, mtk_mapping);
- if (err)
- dev_err(dev, "Can't create IOMMU mapping - DMA-OPS will not work\n");
-}
-
-static void mtk_iommu_release_device(struct device *dev)
-{
- struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
-
- if (!fwspec || fwspec->ops != &mtk_iommu_ops)
- return;
-
- iommu_fwspec_free(dev);
+ return iommu_fwspec_add_ids(dev, args->args, 1);
}
static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
@@ -524,10 +459,9 @@ static const struct iommu_ops mtk_iommu_ops = {
.unmap = mtk_iommu_unmap,
.iova_to_phys = mtk_iommu_iova_to_phys,
.probe_device = mtk_iommu_probe_device,
- .probe_finalize = mtk_iommu_probe_finalize,
.release_device = mtk_iommu_release_device,
- .def_domain_type = mtk_iommu_def_domain_type,
- .device_group = generic_device_group,
+ .device_group = mtk_iommu_device_group,
+ .of_xlate = mtk_iommu_of_xlate,
.pgsize_bitmap = ~0UL << MT2701_IOMMU_PAGE_SHIFT,
};
@@ -626,6 +560,14 @@ static int mtk_iommu_probe(struct platform_device *pdev)
if (ret)
return ret;
+ /*
+ * MTK generation one IOMMU HW only supports one IOMMU domain,
+ * and all clients share the same IOVA address space.
+ */
+ data->m4u_group = iommu_group_alloc();
+ if (IS_ERR(data->m4u_group))
+ return PTR_ERR(data->m4u_group);
+
if (!iommu_present(&platform_bus_type))
bus_set_iommu(&platform_bus_type, &mtk_iommu_ops);
@@ -636,6 +578,7 @@ static int mtk_iommu_remove(struct platform_device *pdev)
{
struct mtk_iommu_data *data = platform_get_drvdata(pdev);
+ iommu_group_put(data->m4u_group);
iommu_device_sysfs_remove(&data->iommu);
iommu_device_unregister(&data->iommu);
--
2.28.0.dirty
Side note: I suspect we'll end up needing something like
0e764a01015dfebff8a8ffd297d74663772e248a; if someone can dig a 32-bit
device out of the closet and dust it off, the fix is easy enough.
Just wanted to mention it here so that anyone with a 32-bit device can
find what is needed.
BR,
-R
On Thu, Aug 20, 2020 at 8:09 AM Robin Murphy <[email protected]> wrote:
>
> Now that arch/arm is wired up for default domains and iommu-dma,
> implement the corresponding driver-side support for DMA domains.
>
> Signed-off-by: Robin Murphy <[email protected]>
> ---
> drivers/iommu/msm_iommu.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
> index 3615cd6241c4..f34efcbb0b2b 100644
> --- a/drivers/iommu/msm_iommu.c
> +++ b/drivers/iommu/msm_iommu.c
> @@ -8,6 +8,7 @@
> #include <linux/kernel.h>
> #include <linux/init.h>
> #include <linux/platform_device.h>
> +#include <linux/dma-iommu.h>
> #include <linux/errno.h>
> #include <linux/io.h>
> #include <linux/io-pgtable.h>
> @@ -314,13 +315,16 @@ static struct iommu_domain *msm_iommu_domain_alloc(unsigned type)
> {
> struct msm_priv *priv;
>
> - if (type != IOMMU_DOMAIN_UNMANAGED)
> + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> return NULL;
>
> priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> if (!priv)
> goto fail_nomem;
>
> + if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(&priv->domain))
> + goto fail_nomem;
> +
> INIT_LIST_HEAD(&priv->list_attached);
>
> priv->domain.geometry.aperture_start = 0;
> @@ -339,6 +343,7 @@ static void msm_iommu_domain_free(struct iommu_domain *domain)
> struct msm_priv *priv;
> unsigned long flags;
>
> + iommu_put_dma_cookie(domain);
> spin_lock_irqsave(&msm_iommu_lock, flags);
> priv = to_msm_priv(domain);
> kfree(priv);
> --
> 2.28.0.dirty
>
> _______________________________________________
> dri-devel mailing list
> [email protected]
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
Now that arch/arm is wired up for default domains and iommu-dma,
implement the corresponding driver-side support for DMA domains.
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/iommu/omap-iommu.c | 22 +++++++++++++++++++++-
1 file changed, 21 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index 71f29c0927fc..ea25c2fe0418 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -9,6 +9,7 @@
* Paul Mundt and Toshihiro Kobayashi
*/
+#include <linux/dma-iommu.h>
#include <linux/dma-mapping.h>
#include <linux/err.h>
#include <linux/slab.h>
@@ -1574,13 +1575,19 @@ static struct iommu_domain *omap_iommu_domain_alloc(unsigned type)
{
struct omap_iommu_domain *omap_domain;
- if (type != IOMMU_DOMAIN_UNMANAGED)
+ if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
return NULL;
omap_domain = kzalloc(sizeof(*omap_domain), GFP_KERNEL);
if (!omap_domain)
return NULL;
+ if (type == IOMMU_DOMAIN_DMA &&
+ iommu_get_dma_cookie(&omap_domain->domain)) {
+ kfree(omap_domain);
+ return NULL;
+ }
+
spin_lock_init(&omap_domain->lock);
omap_domain->domain.geometry.aperture_start = 0;
@@ -1601,6 +1608,7 @@ static void omap_iommu_domain_free(struct iommu_domain *domain)
if (omap_domain->dev)
_omap_iommu_detach_dev(omap_domain, omap_domain->dev);
+ iommu_put_dma_cookie(&omap_domain->domain);
kfree(omap_domain);
}
@@ -1736,6 +1744,17 @@ static struct iommu_group *omap_iommu_device_group(struct device *dev)
return group;
}
+static int omap_iommu_of_xlate(struct device *dev,
+ struct of_phandle_args *args)
+{
+ /*
+ * Logically, some of the housekeeping from _omap_iommu_add_device()
+ * should probably move here, but the minimum we *need* is simply to
+ * cooperate with of_iommu at all to let iommu-dma work.
+ */
+ return 0;
+}
+
static const struct iommu_ops omap_iommu_ops = {
.domain_alloc = omap_iommu_domain_alloc,
.domain_free = omap_iommu_domain_free,
@@ -1747,6 +1766,7 @@ static const struct iommu_ops omap_iommu_ops = {
.probe_device = omap_iommu_probe_device,
.release_device = omap_iommu_release_device,
.device_group = omap_iommu_device_group,
+ .of_xlate = omap_iommu_of_xlate,
.pgsize_bitmap = OMAP_IOMMU_PGSIZES,
};
--
2.28.0.dirty
In order to wrangle arch/arm over to iommu_dma_ops, we first need to
convert all its associated IOMMU drivers over to default domains, and
deal with users of its public dma_iommu_mapping API. Since that can't
reasonably be done in a single patch, we've no choice but to go through
an ugly transitional phase. That starts with exposing some hooks into
iommu-dma's internals so that it can begin to do most of the heavy
lifting.
Before you start thinking about how horrible that is, here's a zebra:
,
c@
`||||)\
< /
Signed-off-by: Robin Murphy <[email protected]>
---
drivers/iommu/dma-iommu.c | 38 +++++++++++++++++++++++++++++---------
1 file changed, 29 insertions(+), 9 deletions(-)
diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 4959f5df21bd..ab157d155bf7 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -25,6 +25,19 @@
#include <linux/vmalloc.h>
#include <linux/crash_dump.h>
+#ifdef CONFIG_ARM
+#include <asm/dma-iommu.h>
+#endif
+static struct iommu_domain *__iommu_get_dma_domain(struct device *dev)
+{
+#ifdef CONFIG_ARM
+ struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
+ if (mapping)
+ return mapping->domain;
+#endif
+ return iommu_get_dma_domain(dev);
+}
+
struct iommu_dma_msi_page {
struct list_head list;
dma_addr_t iova;
@@ -298,8 +311,11 @@ static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad)
* avoid rounding surprises. If necessary, we reserve the page at address 0
* to ensure it is an invalid IOVA. It is safe to reinitialise a domain, but
* any change which could make prior IOVAs invalid will fail.
+ *
+ * XXX: Not formally exported, but needs to be referenced
+ * from arch/arm/mm/dma-mapping.c temporarily
*/
-static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
+int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base,
u64 size, struct device *dev)
{
struct iommu_dma_cookie *cookie = domain->iova_cookie;
@@ -456,7 +472,7 @@ static void iommu_dma_free_iova(struct iommu_dma_cookie *cookie,
static void __iommu_dma_unmap(struct device *dev, dma_addr_t dma_addr,
size_t size)
{
- struct iommu_domain *domain = iommu_get_dma_domain(dev);
+ struct iommu_domain *domain = __iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
size_t iova_off = iova_offset(iovad, dma_addr);
@@ -478,7 +494,7 @@ static void __iommu_dma_unmap(struct device *dev, dma_addr_t dma_addr,
static dma_addr_t __iommu_dma_map(struct device *dev, phys_addr_t phys,
size_t size, int prot, u64 dma_mask)
{
- struct iommu_domain *domain = iommu_get_dma_domain(dev);
+ struct iommu_domain *domain = __iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
size_t iova_off = iova_offset(iovad, phys);
@@ -582,7 +598,7 @@ static struct page **__iommu_dma_alloc_pages(struct device *dev,
static void *iommu_dma_alloc_remap(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
{
- struct iommu_domain *domain = iommu_get_dma_domain(dev);
+ struct iommu_domain *domain = __iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
bool coherent = dev_is_dma_coherent(dev);
@@ -678,7 +694,7 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev,
if (dev_is_dma_coherent(dev))
return;
- phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
+ phys = iommu_iova_to_phys(__iommu_get_dma_domain(dev), dma_handle);
arch_sync_dma_for_cpu(phys, size, dir);
}
@@ -690,7 +706,7 @@ static void iommu_dma_sync_single_for_device(struct device *dev,
if (dev_is_dma_coherent(dev))
return;
- phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
+ phys = iommu_iova_to_phys(__iommu_get_dma_domain(dev), dma_handle);
arch_sync_dma_for_device(phys, size, dir);
}
@@ -831,7 +847,7 @@ static void __invalidate_sg(struct scatterlist *sg, int nents)
static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
int nents, enum dma_data_direction dir, unsigned long attrs)
{
- struct iommu_domain *domain = iommu_get_dma_domain(dev);
+ struct iommu_domain *domain = __iommu_get_dma_domain(dev);
struct iommu_dma_cookie *cookie = domain->iova_cookie;
struct iova_domain *iovad = &cookie->iovad;
struct scatterlist *s, *prev = NULL;
@@ -1112,12 +1128,16 @@ static int iommu_dma_get_sgtable(struct device *dev, struct sg_table *sgt,
static unsigned long iommu_dma_get_merge_boundary(struct device *dev)
{
- struct iommu_domain *domain = iommu_get_dma_domain(dev);
+ struct iommu_domain *domain = __iommu_get_dma_domain(dev);
return (1UL << __ffs(domain->pgsize_bitmap)) - 1;
}
-static const struct dma_map_ops iommu_dma_ops = {
+/*
+ * XXX: Not formally exported, but needs to be referenced
+ * from arch/arm/mm/dma-mapping.c temporarily
+ */
+const struct dma_map_ops iommu_dma_ops = {
.alloc = iommu_dma_alloc,
.free = iommu_dma_free,
.mmap = iommu_dma_mmap,
--
2.28.0.dirty
Hi Robin,
On Thu, Aug 20, 2020 at 04:08:36PM +0100, Robin Murphy wrote:
> Now that arch/arm is wired up for default domains and iommu-dma, devices
> behind IOMMUs will get mappings set up automatically as appropriate, so
> there is no need for drivers to do so manually.
>
> Signed-off-by: Robin Murphy <[email protected]>
Thanks for the patch.
I haven't looked at the details but it seems that this causes the buffer
memory allocation to be physically contiguous, which causes a failure to
allocate video buffers of entirely normal size. I guess that was not
intentional?
-----------------8<---------------------------
[ 218.934448] WARNING: CPU: 0 PID: 1994 at mm/page_alloc.c:4859 __alloc_pages_nodemask+0x9c/0xb1c
[ 218.943847] Modules linked in: omap3_isp videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common leds_as3645a smiapp v4l2_flash_led_class led_class_flash v4l2_fwnode smiapp_pll videodev leds_gpio mc led_class
[ 218.964660] CPU: 0 PID: 1994 Comm: yavta Not tainted 5.9.0-rc1-dirty #1818
[ 218.972442] Hardware name: Generic OMAP36xx (Flattened Device Tree)
[ 218.978973] Backtrace:
[ 218.981842] [<c010bf90>] (dump_backtrace) from [<c010c350>] (show_stack+0x20/0x24)
[ 218.989715] r7:00000000 r6:00000009 r5:c08f03bc r4:c08f2fef
[ 218.995880] [<c010c330>] (show_stack) from [<c03d3328>] (dump_stack+0x28/0x30)
[ 219.003631] [<c03d3300>] (dump_stack) from [<c012e324>] (__warn+0x100/0x118)
[ 219.010955] r5:c08f03bc r4:00000000
[ 219.014953] [<c012e224>] (__warn) from [<c012e6f4>] (warn_slowpath_fmt+0x84/0xa8)
[ 219.022949] r9:c0232090 r8:c08f03bc r7:c0b08a88 r6:00000009 r5:000012fb r4:00000000
[ 219.031036] [<c012e674>] (warn_slowpath_fmt) from [<c0232090>] (__alloc_pages_nodemask+0x9c/0xb1c)
[ 219.040557] r9:c0185c3c r8:00000000 r7:010ec000 r6:00000000 r5:0000000d r4:00000000
[ 219.048858] [<c0231ff4>] (__alloc_pages_nodemask) from [<c01108f0>] (__dma_alloc_buffer.constprop.14+0x3c/0x90)
[ 219.059570] r10:00000cc0 r9:c0185c3c r8:00000000 r7:010ec000 r6:0000000d r5:c0b08a88
[ 219.067901] r4:00000cc0
[ 219.070587] [<c01108b4>] (__dma_alloc_buffer.constprop.14) from [<c0110a6c>] (remap_allocator_alloc+0x34/0x7c)
[ 219.081207] r9:c0185c3c r8:00000247 r7:e6d7fb84 r6:010ec000 r5:c0b08a88 r4:00000001
[ 219.089263] [<c0110a38>] (remap_allocator_alloc) from [<c010f4f4>] (__dma_alloc+0x124/0x21c)
[ 219.098236] r9:ed99fc10 r8:e69aa890 r7:00000000 r6:ffffffff r5:c0b08a88 r4:e6fdd680
[ 219.106536] [<c010f3d0>] (__dma_alloc) from [<c010f69c>] (arm_dma_alloc+0x68/0x74)
[ 219.114654] r10:00000cc0 r9:c0185c3c r8:00000cc0 r7:e69aa890 r6:010ec000 r5:ed99fc10
[ 219.122985] r4:00000000
[ 219.125671] [<c010f634>] (arm_dma_alloc) from [<c0185c3c>] (dma_alloc_attrs+0xe4/0x120)
[ 219.134216] r9:00000000 r8:e69aa890 r7:010ec000 r6:c0b08a88 r5:ed99fc10 r4:c010f634
[ 219.142517] [<c0185b58>] (dma_alloc_attrs) from [<bf095c3c>] (vb2_dc_alloc+0xcc/0x108 [videobuf2_dma_contig])
[ 219.153076] r10:e6885ca8 r9:e6abfc48 r8:00000002 r7:00000000 r6:010ec000 r5:ed99fc10
[ 219.161407] r4:e69aa880
[ 219.164184] [<bf095b70>] (vb2_dc_alloc [videobuf2_dma_contig]) from [<bf080fd0>] (__vb2_queue_alloc+0x258/0x4a4 [videobuf2_common])
[ 219.176696] r8:bf095b70 r7:010ec000 r6:00000000 r5:e6885ca8 r4:e6abfc00
[ 219.183959] [<bf080d78>] (__vb2_queue_alloc [videobuf2_common]) from [<bf0833a0>] (vb2_core_reqbufs+0x408/0x498 [videobuf2_common])
[ 219.196533] r10:e6885ce8 r9:00000000 r8:e6d7fe24 r7:e6d7fcec r6:bf09ced4 r5:bf088580
[ 219.204895] r4:e6885ca8
[ 219.207672] [<bf082f98>] (vb2_core_reqbufs [videobuf2_common]) from [<bf08e1cc>] (vb2_reqbufs+0x64/0x70 [videobuf2_v4l2])
[ 219.219268] r10:00000000 r9:bf032bc0 r8:c0145608 r7:bf0ad4a4 r6:e6885ca8 r5:00000000
[ 219.227600] r4:e6d7fe24
[ 219.230499] [<bf08e168>] (vb2_reqbufs [videobuf2_v4l2]) from [<bf09d7b4>] (isp_video_reqbufs+0x40/0x54 [omap3_isp])
[ 219.241607] r7:bf0ad4a4 r6:e6d7fe24 r5:e6885c00 r4:e6cca928
[ 219.247924] [<bf09d774>] (isp_video_reqbufs [omap3_isp]) from [<bf01de4c>] (v4l_reqbufs+0x4c/0x50 [videodev])
[ 219.258514] r7:bf0ad4a4 r6:e6885c00 r5:e6d7fe24 r4:e7efbec0
[ 219.264984] [<bf01de00>] (v4l_reqbufs [videodev]) from [<bf01eeb4>] (__video_do_ioctl+0x2d8/0x414 [videodev])
[ 219.275512] r7:bf01de00 r6:00000000 r5:00000000 r4:e6cca2e0
[ 219.281982] [<bf01ebdc>] (__video_do_ioctl [videodev]) from [<bf01fa1c>] (video_usercopy+0x144/0x508 [videodev])
[ 219.292816] r10:e7efbec0 r9:c0145608 r8:e6d7fe24 r7:00000000 r6:00000000 r5:bf01ebdc
[ 219.300933] r4:c0145608
[ 219.304168] [<bf01f8d8>] (video_usercopy [videodev]) from [<bf01fdfc>] (video_ioctl2+0x1c/0x24 [videodev])
[ 219.314453] r10:e7fbfda0 r9:e7efbec0 r8:00000003 r7:00000000 r6:bee658f4 r5:c0145608
[ 219.322784] r4:e7efbec0
[ 219.325775] [<bf01fde0>] (video_ioctl2 [videodev]) from [<bf01814c>] (v4l2_ioctl+0x50/0x64 [videodev])
[ 219.335845] [<bf0180fc>] (v4l2_ioctl [videodev]) from [<c02654a0>] (vfs_ioctl+0x30/0x44)
[ 219.344482] r7:00000000 r6:e7efbec0 r5:bee658f4 r4:c0145608
[ 219.350402] [<c0265470>] (vfs_ioctl) from [<c0265e9c>] (sys_ioctl+0xdc/0x7ec)
[ 219.358062] [<c0265dc0>] (sys_ioctl) from [<c0100080>] (ret_fast_syscall+0x0/0x28)
[ 219.366149] Exception stack(0xe6d7ffa8 to 0xe6d7fff0)
[ 219.371673] ffa0: 00000000 bee65c1a 00000003 c0145608 bee658f4 00000001
[ 219.380157] ffc0: 00000000 bee65c1a 00000000 00000036 000009a0 00000000 0000ef30 010eb400
[ 219.388885] ffe0: 0001716c bee65104 0000b588 b6e413ac
[ 219.394409] r10:00000036 r9:e6d7e000 r8:c0100244 r7:00000036 r6:00000000 r5:bee65c1a
[ 219.402740] r4:00000000
[ 219.405426] irq event stamp: 5075
[ 219.408905] hardirqs last enabled at (5083): [<c01778b0>] console_unlock+0x4cc/0x524
[ 219.417297] hardirqs last disabled at (5092): [<c01777ac>] console_unlock+0x3c8/0x524
[ 219.425628] softirqs last enabled at (4532): [<c01017d8>] __do_softirq+0x1f0/0x490
[ 219.433837] softirqs last disabled at (4493): [<c0132c20>] irq_exit+0xe4/0x160
[ 219.441558] ---[ end trace 8c56810633cf24db ]---
[ 219.446502] omap3isp 480bc000.isp: dma_alloc_coherent of size 17743872 failed
-----------------8<---------------------------
--
Kind regards,
Sakari Ailus
On 2020-08-20 16:55, Rob Clark wrote:
> Side note, I suspect we'll end up needing something like
> 0e764a01015dfebff8a8ffd297d74663772e248a .. if someone can dig a 32b
> device out of the closet and dust it off, the fix is easy enough.
> Just wanted to mention that here so anyone with a 32b device could
> find what is needed.
FWIW there shouldn't be any material change here - the generic default
domain is installed under the same circumstances as the Arm
dma_iommu_mapping was, so if any platform does have an issue, then it
should already have started 4 years ago with f78ebca8ff3d ("iommu/msm: Add
support for generic master bindings").
Robin.
>
> BR,
> -R
>
> On Thu, Aug 20, 2020 at 8:09 AM Robin Murphy <[email protected]> wrote:
>>
>> Now that arch/arm is wired up for default domains and iommu-dma,
>> implement the corresponding driver-side support for DMA domains.
>>
>> Signed-off-by: Robin Murphy <[email protected]>
>> ---
>> drivers/iommu/msm_iommu.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
>> index 3615cd6241c4..f34efcbb0b2b 100644
>> --- a/drivers/iommu/msm_iommu.c
>> +++ b/drivers/iommu/msm_iommu.c
>> @@ -8,6 +8,7 @@
>> #include <linux/kernel.h>
>> #include <linux/init.h>
>> #include <linux/platform_device.h>
>> +#include <linux/dma-iommu.h>
>> #include <linux/errno.h>
>> #include <linux/io.h>
>> #include <linux/io-pgtable.h>
>> @@ -314,13 +315,16 @@ static struct iommu_domain *msm_iommu_domain_alloc(unsigned type)
>> {
>> struct msm_priv *priv;
>>
>> - if (type != IOMMU_DOMAIN_UNMANAGED)
>> + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
>> return NULL;
>>
>> priv = kzalloc(sizeof(*priv), GFP_KERNEL);
>> if (!priv)
>> goto fail_nomem;
>>
>> + if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(&priv->domain))
>> + goto fail_nomem;
>> +
>> INIT_LIST_HEAD(&priv->list_attached);
>>
>> priv->domain.geometry.aperture_start = 0;
>> @@ -339,6 +343,7 @@ static void msm_iommu_domain_free(struct iommu_domain *domain)
>> struct msm_priv *priv;
>> unsigned long flags;
>>
>> + iommu_put_dma_cookie(domain);
>> spin_lock_irqsave(&msm_iommu_lock, flags);
>> priv = to_msm_priv(domain);
>> kfree(priv);
>> --
>> 2.28.0.dirty
>>
>> _______________________________________________
>> dri-devel mailing list
>> [email protected]
>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
On Thu, Aug 20, 2020 at 9:58 AM Robin Murphy <[email protected]> wrote:
>
> On 2020-08-20 16:55, Rob Clark wrote:
> > Side note, I suspect we'll end up needing something like
> > 0e764a01015dfebff8a8ffd297d74663772e248a .. if someone can dig a 32b
> > device out of the closet and dust it off, the fix is easy enough.
> > Just wanted to mention that here so anyone with a 32b device could
> > find what is needed.
>
> FWIW there shouldn't be any material change here - the generic default
> domain is installed under the same circumstances as the Arm
> dma_iommu_mapping was, so if any platform does have an issue, then it
> should already have started 4 years ago with f78ebca8ff3d ("iommu/msm: Add
> support for generic master bindings").
ok, it has, I guess, been a while since playing with 32b things..
someone on IRC had mentioned a problem that sounded like what
0e764a01015dfebff8a8ffd297d74663772e248a solved, unless they disabled
some ARCH_HAS_xyz thing (IIRC), which I guess is related..
BR,
-R
> Robin.
>
> >
> > BR,
> > -R
> >
> > On Thu, Aug 20, 2020 at 8:09 AM Robin Murphy <[email protected]> wrote:
> >>
> >> Now that arch/arm is wired up for default domains and iommu-dma,
> >> implement the corresponding driver-side support for DMA domains.
> >>
> >> Signed-off-by: Robin Murphy <[email protected]>
> >> ---
> >> drivers/iommu/msm_iommu.c | 7 ++++++-
> >> 1 file changed, 6 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
> >> index 3615cd6241c4..f34efcbb0b2b 100644
> >> --- a/drivers/iommu/msm_iommu.c
> >> +++ b/drivers/iommu/msm_iommu.c
> >> @@ -8,6 +8,7 @@
> >> #include <linux/kernel.h>
> >> #include <linux/init.h>
> >> #include <linux/platform_device.h>
> >> +#include <linux/dma-iommu.h>
> >> #include <linux/errno.h>
> >> #include <linux/io.h>
> >> #include <linux/io-pgtable.h>
> >> @@ -314,13 +315,16 @@ static struct iommu_domain *msm_iommu_domain_alloc(unsigned type)
> >> {
> >> struct msm_priv *priv;
> >>
> >> - if (type != IOMMU_DOMAIN_UNMANAGED)
> >> + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> >> return NULL;
> >>
> >> priv = kzalloc(sizeof(*priv), GFP_KERNEL);
> >> if (!priv)
> >> goto fail_nomem;
> >>
> >> + if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(&priv->domain))
> >> + goto fail_nomem;
> >> +
> >> INIT_LIST_HEAD(&priv->list_attached);
> >>
> >> priv->domain.geometry.aperture_start = 0;
> >> @@ -339,6 +343,7 @@ static void msm_iommu_domain_free(struct iommu_domain *domain)
> >> struct msm_priv *priv;
> >> unsigned long flags;
> >>
> >> + iommu_put_dma_cookie(domain);
> >> spin_lock_irqsave(&msm_iommu_lock, flags);
> >> priv = to_msm_priv(domain);
> >> kfree(priv);
> >> --
> >> 2.28.0.dirty
> >>
> >> _______________________________________________
> >> dri-devel mailing list
> >> [email protected]
> >> https://lists.freedesktop.org/mailman/listinfo/dri-devel
On 2020-08-20 17:53, Sakari Ailus wrote:
> Hi Robin,
>
> On Thu, Aug 20, 2020 at 04:08:36PM +0100, Robin Murphy wrote:
>> Now that arch/arm is wired up for default domains and iommu-dma, devices
>> behind IOMMUs will get mappings set up automatically as appropriate, so
>> there is no need for drivers to do so manually.
>>
>> Signed-off-by: Robin Murphy <[email protected]>
>
> Thanks for the patch.
Many thanks for testing so quickly!
> I haven't looked at the details but it seems that this causes the buffer
> memory allocation to be physically contiguous, which causes a failure to
> allocate video buffers of entirely normal size. I guess that was not
> intentional?
Hmm, it looks like the device ends up with the wrong DMA ops, which
implies something didn't go as expected with the earlier IOMMU setup and
default domain creation. Chances are that either I missed some subtlety
in the omap_iommu change, or I've fundamentally misjudged how the ISP
probing works and it never actually goes down the of_iommu_configure()
path in the first place. Do you get any messages from the IOMMU layer
earlier on during boot?
Robin.
> -----------------8<---------------------------
> [ 218.934448] WARNING: CPU: 0 PID: 1994 at mm/page_alloc.c:4859 __alloc_pages_nodemask+0x9c/0xb1c
> [ 218.943847] Modules linked in: omap3_isp videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 videobuf2_common leds_as3645a smiapp v4l2_flash_led_class led_class_flash v4l2_fwnode smiapp_pll videodev leds_gpio mc led_class
> [ 218.964660] CPU: 0 PID: 1994 Comm: yavta Not tainted 5.9.0-rc1-dirty #1818
> [ 218.972442] Hardware name: Generic OMAP36xx (Flattened Device Tree)
> [ 218.978973] Backtrace:
> [ 218.981842] [<c010bf90>] (dump_backtrace) from [<c010c350>] (show_stack+0x20/0x24)
> [ 218.989715] r7:00000000 r6:00000009 r5:c08f03bc r4:c08f2fef
> [ 218.995880] [<c010c330>] (show_stack) from [<c03d3328>] (dump_stack+0x28/0x30)
> [ 219.003631] [<c03d3300>] (dump_stack) from [<c012e324>] (__warn+0x100/0x118)
> [ 219.010955] r5:c08f03bc r4:00000000
> [ 219.014953] [<c012e224>] (__warn) from [<c012e6f4>] (warn_slowpath_fmt+0x84/0xa8)
> [ 219.022949] r9:c0232090 r8:c08f03bc r7:c0b08a88 r6:00000009 r5:000012fb r4:00000000
> [ 219.031036] [<c012e674>] (warn_slowpath_fmt) from [<c0232090>] (__alloc_pages_nodemask+0x9c/0xb1c)
> [ 219.040557] r9:c0185c3c r8:00000000 r7:010ec000 r6:00000000 r5:0000000d r4:00000000
> [ 219.048858] [<c0231ff4>] (__alloc_pages_nodemask) from [<c01108f0>] (__dma_alloc_buffer.constprop.14+0x3c/0x90)
> [ 219.059570] r10:00000cc0 r9:c0185c3c r8:00000000 r7:010ec000 r6:0000000d r5:c0b08a88
> [ 219.067901] r4:00000cc0
> [ 219.070587] [<c01108b4>] (__dma_alloc_buffer.constprop.14) from [<c0110a6c>] (remap_allocator_alloc+0x34/0x7c)
> [ 219.081207] r9:c0185c3c r8:00000247 r7:e6d7fb84 r6:010ec000 r5:c0b08a88 r4:00000001
> [ 219.089263] [<c0110a38>] (remap_allocator_alloc) from [<c010f4f4>] (__dma_alloc+0x124/0x21c)
> [ 219.098236] r9:ed99fc10 r8:e69aa890 r7:00000000 r6:ffffffff r5:c0b08a88 r4:e6fdd680
> [ 219.106536] [<c010f3d0>] (__dma_alloc) from [<c010f69c>] (arm_dma_alloc+0x68/0x74)
> [ 219.114654] r10:00000cc0 r9:c0185c3c r8:00000cc0 r7:e69aa890 r6:010ec000 r5:ed99fc10
> [ 219.122985] r4:00000000
> [ 219.125671] [<c010f634>] (arm_dma_alloc) from [<c0185c3c>] (dma_alloc_attrs+0xe4/0x120)
> [ 219.134216] r9:00000000 r8:e69aa890 r7:010ec000 r6:c0b08a88 r5:ed99fc10 r4:c010f634
> [ 219.142517] [<c0185b58>] (dma_alloc_attrs) from [<bf095c3c>] (vb2_dc_alloc+0xcc/0x108 [videobuf2_dma_contig])
> [ 219.153076] r10:e6885ca8 r9:e6abfc48 r8:00000002 r7:00000000 r6:010ec000 r5:ed99fc10
> [ 219.161407] r4:e69aa880
> [ 219.164184] [<bf095b70>] (vb2_dc_alloc [videobuf2_dma_contig]) from [<bf080fd0>] (__vb2_queue_alloc+0x258/0x4a4 [videobuf2_common])
> [ 219.176696] r8:bf095b70 r7:010ec000 r6:00000000 r5:e6885ca8 r4:e6abfc00
> [ 219.183959] [<bf080d78>] (__vb2_queue_alloc [videobuf2_common]) from [<bf0833a0>] (vb2_core_reqbufs+0x408/0x498 [videobuf2_common])
> [ 219.196533] r10:e6885ce8 r9:00000000 r8:e6d7fe24 r7:e6d7fcec r6:bf09ced4 r5:bf088580
> [ 219.204895] r4:e6885ca8
> [ 219.207672] [<bf082f98>] (vb2_core_reqbufs [videobuf2_common]) from [<bf08e1cc>] (vb2_reqbufs+0x64/0x70 [videobuf2_v4l2])
> [ 219.219268] r10:00000000 r9:bf032bc0 r8:c0145608 r7:bf0ad4a4 r6:e6885ca8 r5:00000000
> [ 219.227600] r4:e6d7fe24
> [ 219.230499] [<bf08e168>] (vb2_reqbufs [videobuf2_v4l2]) from [<bf09d7b4>] (isp_video_reqbufs+0x40/0x54 [omap3_isp])
> [ 219.241607] r7:bf0ad4a4 r6:e6d7fe24 r5:e6885c00 r4:e6cca928
> [ 219.247924] [<bf09d774>] (isp_video_reqbufs [omap3_isp]) from [<bf01de4c>] (v4l_reqbufs+0x4c/0x50 [videodev])
> [ 219.258514] r7:bf0ad4a4 r6:e6885c00 r5:e6d7fe24 r4:e7efbec0
> [ 219.264984] [<bf01de00>] (v4l_reqbufs [videodev]) from [<bf01eeb4>] (__video_do_ioctl+0x2d8/0x414 [videodev])
> [ 219.275512] r7:bf01de00 r6:00000000 r5:00000000 r4:e6cca2e0
> [ 219.281982] [<bf01ebdc>] (__video_do_ioctl [videodev]) from [<bf01fa1c>] (video_usercopy+0x144/0x508 [videodev])
> [ 219.292816] r10:e7efbec0 r9:c0145608 r8:e6d7fe24 r7:00000000 r6:00000000 r5:bf01ebdc
> [ 219.300933] r4:c0145608
> [ 219.304168] [<bf01f8d8>] (video_usercopy [videodev]) from [<bf01fdfc>] (video_ioctl2+0x1c/0x24 [videodev])
> [ 219.314453] r10:e7fbfda0 r9:e7efbec0 r8:00000003 r7:00000000 r6:bee658f4 r5:c0145608
> [ 219.322784] r4:e7efbec0
> [ 219.325775] [<bf01fde0>] (video_ioctl2 [videodev]) from [<bf01814c>] (v4l2_ioctl+0x50/0x64 [videodev])
> [ 219.335845] [<bf0180fc>] (v4l2_ioctl [videodev]) from [<c02654a0>] (vfs_ioctl+0x30/0x44)
> [ 219.344482] r7:00000000 r6:e7efbec0 r5:bee658f4 r4:c0145608
> [ 219.350402] [<c0265470>] (vfs_ioctl) from [<c0265e9c>] (sys_ioctl+0xdc/0x7ec)
> [ 219.358062] [<c0265dc0>] (sys_ioctl) from [<c0100080>] (ret_fast_syscall+0x0/0x28)
> [ 219.366149] Exception stack(0xe6d7ffa8 to 0xe6d7fff0)
> [ 219.371673] ffa0: 00000000 bee65c1a 00000003 c0145608 bee658f4 00000001
> [ 219.380157] ffc0: 00000000 bee65c1a 00000000 00000036 000009a0 00000000 0000ef30 010eb400
> [ 219.388885] ffe0: 0001716c bee65104 0000b588 b6e413ac
> [ 219.394409] r10:00000036 r9:e6d7e000 r8:c0100244 r7:00000036 r6:00000000 r5:bee65c1a
> [ 219.402740] r4:00000000
> [ 219.405426] irq event stamp: 5075
> [ 219.408905] hardirqs last enabled at (5083): [<c01778b0>] console_unlock+0x4cc/0x524
> [ 219.417297] hardirqs last disabled at (5092): [<c01777ac>] console_unlock+0x3c8/0x524
> [ 219.425628] softirqs last enabled at (4532): [<c01017d8>] __do_softirq+0x1f0/0x490
> [ 219.433837] softirqs last disabled at (4493): [<c0132c20>] irq_exit+0xe4/0x160
> [ 219.441558] ---[ end trace 8c56810633cf24db ]---
> [ 219.446502] omap3isp 480bc000.isp: dma_alloc_coherent of size 17743872 failed
> -----------------8<---------------------------
>
20.08.2020 18:08, Robin Murphy wrote:
> Now that arch/arm is wired up for default domains and iommu-dma, we no
> longer need to work around the arch-private mapping.
>
> Signed-off-by: Robin Murphy <[email protected]>
> ---
> drivers/staging/media/tegra-vde/iommu.c | 12 ------------
> 1 file changed, 12 deletions(-)
>
> diff --git a/drivers/staging/media/tegra-vde/iommu.c b/drivers/staging/media/tegra-vde/iommu.c
> index 6af863d92123..4f770189ed34 100644
> --- a/drivers/staging/media/tegra-vde/iommu.c
> +++ b/drivers/staging/media/tegra-vde/iommu.c
> @@ -10,10 +10,6 @@
> #include <linux/kernel.h>
> #include <linux/platform_device.h>
>
> -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> -#include <asm/dma-iommu.h>
> -#endif
> -
> #include "vde.h"
>
> int tegra_vde_iommu_map(struct tegra_vde *vde,
> @@ -70,14 +66,6 @@ int tegra_vde_iommu_init(struct tegra_vde *vde)
> if (!vde->group)
> return 0;
>
> -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
> - if (dev->archdata.mapping) {
> - struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
> -
> - arm_iommu_detach_device(dev);
> - arm_iommu_release_mapping(mapping);
> - }
> -#endif
> vde->domain = iommu_domain_alloc(&platform_bus_type);
> if (!vde->domain) {
> err = -ENOMEM;
>
Hello, Robin! Thank you for your work!
Some drivers, like this Tegra VDE (Video Decoder Engine) driver for
example, do not want to use implicit IOMMU domain. Tegra VDE driver
relies on explicit IOMMU domain in a case of Tegra SMMU because VDE
hardware can't access last page of the AS and because driver wants to
reserve some fixed addresses [1].
[1]
https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/iommu.c#L100
The Tegra30 SoC supports up to 4 domains, hence we can't afford to
waste any on unused implicit domains. I think this needs to be
addressed before this patch can be applied.
Would it be possible for IOMMU drivers to gain support for filtering out
devices in iommu_domain_alloc(dev, type)? Then perhaps Tegra SMMU driver
could simply return NULL in a case of type=IOMMU_DOMAIN_DMA and
dev=tegra-vde.
Alternatively, the Tegra SMMU could be changed such that the devices
will be attached to a domain at the time of a first IOMMU mapping
invocation instead of attaching at the time of attach_dev() callback
invocation.
Or maybe even IOMMU core could be changed to attach devices at the time
of the first IOMMU mapping invocation? This could be a universal
solution for all drivers.
On Thu, Aug 20, 2020 at 06:25:19PM +0100, Robin Murphy wrote:
> On 2020-08-20 17:53, Sakari Ailus wrote:
> > Hi Robin,
> >
> > On Thu, Aug 20, 2020 at 04:08:36PM +0100, Robin Murphy wrote:
> > > Now that arch/arm is wired up for default domains and iommu-dma, devices
> > > behind IOMMUs will get mappings set up automatically as appropriate, so
> > > there is no need for drivers to do so manually.
> > >
> > > Signed-off-by: Robin Murphy <[email protected]>
> >
> > Thanks for the patch.
>
> Many thanks for testing so quickly!
>
> > I haven't looked at the details but it seems that this causes the buffer
> > memory allocation to be physically contiguous, which causes a failure to
> > allocate video buffers of entirely normal size. I guess that was not
> > intentional?
>
> Hmm, it looks like the device ends up with the wrong DMA ops, which implies
> something didn't go as expected with the earlier IOMMU setup and default
> domain creation. Chances are that either I missed some subtlety in the
> omap_iommu change, or I've fundamentally misjudged how the ISP probing works
> and it never actually goes down the of_iommu_configure() path in the first
> place. Do you get any messages from the IOMMU layer earlier on during boot?
I do get these:
[ 2.934936] iommu: Default domain type: Translated
[ 2.940917] omap-iommu 480bd400.mmu: 480bd400.mmu registered
[ 2.946899] platform 480bc000.isp: Adding to iommu group 0
--
Sakari Ailus
20.08.2020 18:08, Robin Murphy wrote:
> Now that arch/arm is wired up for default domains and iommu-dma,
> implement the corresponding driver-side support for DMA domains.
>
> Signed-off-by: Robin Murphy <[email protected]>
> ---
> drivers/iommu/tegra-gart.c | 17 ++++++++++++-----
> 1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
> index fac720273889..e081387080f6 100644
> --- a/drivers/iommu/tegra-gart.c
> +++ b/drivers/iommu/tegra-gart.c
> @@ -9,6 +9,7 @@
>
> #define dev_fmt(fmt) "gart: " fmt
>
> +#include <linux/dma-iommu.h>
> #include <linux/io.h>
> #include <linux/iommu.h>
> #include <linux/moduleparam.h>
> @@ -145,16 +146,22 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
> {
> struct iommu_domain *domain;
Hello, Robin!
The Tegra20 GART isn't a real IOMMU, but a small relocation aperture. We
would only want to use it for temporary mappings (managed by the GPU
driver) while the GPU hardware is busy working with sparse DMA buffers;
the driver will take care of unmapping the sparse buffers once the GPU
work is finished [1]. In the case of contiguous DMA buffers, we want to
bypass the IOMMU and use the buffer's phys address, because the GART
aperture is small and all the buffers simply can't fit into it for
complex GPU operations that involve multiple buffers [2][3]. The
upstream GPU driver still doesn't support the GART, but eventually that
needs to change.
[1]
https://github.com/grate-driver/linux/blob/master/drivers/gpu/drm/tegra/gart.c#L489
[2]
https://github.com/grate-driver/linux/blob/master/drivers/gpu/drm/tegra/gart.c#L542
[3]
https://github.com/grate-driver/linux/blob/master/drivers/gpu/drm/tegra/uapi/patching.c#L90
> - if (type != IOMMU_DOMAIN_UNMANAGED)
> + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> return NULL;
Will a returned NULL tell the IOMMU core that an implicit domain
shouldn't be used? Is it possible to leave this driver as-is?
20.08.2020 22:51, Dmitry Osipenko wrote:
> Alternatively, the Tegra SMMU could be changed such that the devices
> will be attached to a domain at the time of a first IOMMU mapping
> invocation instead of attaching at the time of attach_dev() callback
> invocation.
>
> Or maybe even IOMMU core could be changed to attach devices at the time
> of the first IOMMU mapping invocation? This could be a universal
> solution for all drivers.
Although, please scratch this :) I'll need to revisit how the DMA
mapping API works.
On 2020-08-20 20:55, Sakari Ailus wrote:
> On Thu, Aug 20, 2020 at 06:25:19PM +0100, Robin Murphy wrote:
>> On 2020-08-20 17:53, Sakari Ailus wrote:
>>> Hi Robin,
>>>
>>> On Thu, Aug 20, 2020 at 04:08:36PM +0100, Robin Murphy wrote:
>>>> Now that arch/arm is wired up for default domains and iommu-dma, devices
>>>> behind IOMMUs will get mappings set up automatically as appropriate, so
>>>> there is no need for drivers to do so manually.
>>>>
>>>> Signed-off-by: Robin Murphy <[email protected]>
>>>
>>> Thanks for the patch.
>>
>> Many thanks for testing so quickly!
>>
>>> I haven't looked at the details but it seems that this causes the buffer
>>> memory allocation to be physically contiguous, which causes a failure to
>>> allocate video buffers of entirely normal size. I guess that was not
>>> intentional?
>>
>> Hmm, it looks like the device ends up with the wrong DMA ops, which implies
>> something didn't go as expected with the earlier IOMMU setup and default
>> domain creation. Chances are that either I missed some subtlety in the
>> omap_iommu change, or I've fundamentally misjudged how the ISP probing works
>> and it never actually goes down the of_iommu_configure() path in the first
>> place. Do you get any messages from the IOMMU layer earlier on during boot?
>
> I do get these:
>
> [ 2.934936] iommu: Default domain type: Translated
> [ 2.940917] omap-iommu 480bd400.mmu: 480bd400.mmu registered
> [ 2.946899] platform 480bc000.isp: Adding to iommu group 0
>
So that much looks OK, if there are no obvious errors. Unfortunately
there's no easy way to tell exactly what of_iommu_configure() is doing
(beyond enabling a couple of vague debug messages). The first thing I'll
do tomorrow is double-check whether it's really working on my boards
here, or whether I was just getting lucky with CMA... (I assume you
don't have CMA enabled if you're ending up in remap_allocator_alloc())
Robin.
On 2020-08-20 20:51, Dmitry Osipenko wrote:
> 20.08.2020 18:08, Robin Murphy wrote:
>> Now that arch/arm is wired up for default domains and iommu-dma, we no
>> longer need to work around the arch-private mapping.
>>
>> Signed-off-by: Robin Murphy <[email protected]>
>> ---
>> drivers/staging/media/tegra-vde/iommu.c | 12 ------------
>> 1 file changed, 12 deletions(-)
>>
>> diff --git a/drivers/staging/media/tegra-vde/iommu.c b/drivers/staging/media/tegra-vde/iommu.c
>> index 6af863d92123..4f770189ed34 100644
>> --- a/drivers/staging/media/tegra-vde/iommu.c
>> +++ b/drivers/staging/media/tegra-vde/iommu.c
>> @@ -10,10 +10,6 @@
>> #include <linux/kernel.h>
>> #include <linux/platform_device.h>
>>
>> -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
>> -#include <asm/dma-iommu.h>
>> -#endif
>> -
>> #include "vde.h"
>>
>> int tegra_vde_iommu_map(struct tegra_vde *vde,
>> @@ -70,14 +66,6 @@ int tegra_vde_iommu_init(struct tegra_vde *vde)
>> if (!vde->group)
>> return 0;
>>
>> -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)
>> - if (dev->archdata.mapping) {
>> - struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev);
>> -
>> - arm_iommu_detach_device(dev);
>> - arm_iommu_release_mapping(mapping);
>> - }
>> -#endif
>> vde->domain = iommu_domain_alloc(&platform_bus_type);
>> if (!vde->domain) {
>> err = -ENOMEM;
>>
>
> Hello, Robin! Thank you for your work!
>
> Some drivers, like this Tegra VDE (Video Decoder Engine) driver for
> example, do not want to use implicit IOMMU domain.
That isn't (intentionally) changing here - the only difference should be
that instead of having the ARM-special implicit domain, which you have
to kick out of the way with the ARM-specific API before you're able to
attach your own domain, the implicit domain is now a proper IOMMU API
default domain, which automatically gets bumped by your attach. The
default domains should still only be created in the same cases that the
ARM dma_iommu_mappings were.
> Tegra VDE driver
> relies on explicit IOMMU domain in a case of Tegra SMMU because VDE
> hardware can't access last page of the AS and because driver wants to
> reserve some fixed addresses [1].
>
> [1]
> https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/iommu.c#L100
>
> The Tegra30 SoC supports up to 4 domains, hence we can't afford to
> waste any on unused implicit domains. I think this needs to be
> addressed before this patch can be applied.
Yeah, there is one subtle change in behaviour from removing the ARM
layer on top of the core API, in that the IOMMU driver will no longer
see an explicit detach call. Thus it does stand to benefit from being a
bit cleverer about noticing devices being moved from one domain to
another by an attach call, either by releasing the hardware context for
the inactive domain once the device(s) are moved across to the new one,
or by simply reprogramming the hardware context in-place for the new
domain's address space without allocating a new one at all (most of the
drivers that don't have multiple contexts already handle the latter
approach quite well).
> Would it be possible for IOMMU drivers to gain support for filtering out
> devices in iommu_domain_alloc(dev, type)? Then perhaps Tegra SMMU driver
> could simply return NULL in a case of type=IOMMU_DOMAIN_DMA and
> dev=tegra-vde.
If you can implement IOMMU_DOMAIN_IDENTITY by allowing the relevant
devices to bypass translation entirely without needing a hardware
context (or at worst, can spare one context which all identity-mapped
logical domains can share), then you could certainly do that kind of
filtering with the .def_domain_type callback if you really wanted to. As
above, the intent is that that shouldn't be necessary for this
particular case, since only one of a group's default domain and
explicitly attached domain can be live at any given time, so the driver
should be able to take advantage of that.
If you simply have more active devices (groups) than available contexts
then yes, you probably would want to do some filtering to decide who
deserves a translation domain and who doesn't, but in that case you
should already have had a long-standing problem with the ARM implicit
domains.
> Alternatively, the Tegra SMMU could be changed such that the devices
> will be attached to a domain at the time of a first IOMMU mapping
> invocation instead of attaching at the time of attach_dev() callback
> invocation.
>
> Or maybe even IOMMU core could be changed to attach devices at the time
> of the first IOMMU mapping invocation? This could be a universal
> solution for all drivers.
I suppose technically you could do that within an IOMMU driver already
(similar to how some defer most of setup that logically belongs to
->domain_alloc until the first ->attach_dev). It's a bit grim from the
caller's PoV though, in terms of the failure mode being non-obvious and
having no real way to recover. Again, you'd be better off simply making
decisions up-front at domain_alloc or attach time based on the domain type.
Robin.
On 2020-08-20 21:16, Dmitry Osipenko wrote:
> 20.08.2020 18:08, Robin Murphy wrote:
>> Now that arch/arm is wired up for default domains and iommu-dma,
>> implement the corresponding driver-side support for DMA domains.
>>
>> Signed-off-by: Robin Murphy <[email protected]>
>> ---
>> drivers/iommu/tegra-gart.c | 17 ++++++++++++-----
>> 1 file changed, 12 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
>> index fac720273889..e081387080f6 100644
>> --- a/drivers/iommu/tegra-gart.c
>> +++ b/drivers/iommu/tegra-gart.c
>> @@ -9,6 +9,7 @@
>>
>> #define dev_fmt(fmt) "gart: " fmt
>>
>> +#include <linux/dma-iommu.h>
>> #include <linux/io.h>
>> #include <linux/iommu.h>
>> #include <linux/moduleparam.h>
>> @@ -145,16 +146,22 @@ static struct iommu_domain *gart_iommu_domain_alloc(unsigned type)
>> {
>> struct iommu_domain *domain;
>
> Hello, Robin!
>
> The Tegra20 GART isn't a real IOMMU, but a small relocation aperture. We
> would only want to use it for temporary mappings (managed by the GPU
> driver) while the GPU hardware is busy working with sparse DMA buffers;
> the driver will take care of unmapping the sparse buffers once the GPU
> work is finished [1]. In the case of contiguous DMA buffers, we want to
> bypass the IOMMU and use the buffer's phys address, because the GART
> aperture is small and all the buffers simply can't fit into it for
> complex GPU operations that involve multiple buffers [2][3]. The
> upstream GPU driver still doesn't support the GART, but eventually that
> needs to change.
>
> [1]
> https://github.com/grate-driver/linux/blob/master/drivers/gpu/drm/tegra/gart.c#L489
>
> [2]
> https://github.com/grate-driver/linux/blob/master/drivers/gpu/drm/tegra/gart.c#L542
>
> [3]
> https://github.com/grate-driver/linux/blob/master/drivers/gpu/drm/tegra/uapi/patching.c#L90
>
>> - if (type != IOMMU_DOMAIN_UNMANAGED)
>> + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
>> return NULL;
>
> Will a returned NULL tell the IOMMU core that an implicit domain
> shouldn't be used? Is it possible to leave this driver as-is?
The aim of this patch was just to make the conversion without functional
changes wherever possible, i.e. maintain an equivalent to the existing
ARM behaviour of allocating its own implicit domains for everything. It
doesn't represent any judgement of whether that was ever appropriate for
this driver in the first place ;)
Hopefully my other reply already covered the degree of control drivers
can have with proper default domains, but do shout if anything wasn't clear.
Cheers,
Robin.
On Thu, Aug 20, 2020 at 04:08:26PM +0100, Robin Murphy wrote:
> Now that arch/arm is wired up for default domains and iommu-dma, remove
> the add_device workaround.
>
> Signed-off-by: Robin Murphy <[email protected]>
> ---
> drivers/iommu/arm/arm-smmu/arm-smmu.c | 10 ----------
> 1 file changed, 10 deletions(-)
>
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> index 09c42af9f31e..4e52d8cb67dd 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
> @@ -1164,17 +1164,7 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
> return -ENXIO;
> }
>
> - /*
> - * FIXME: The arch/arm DMA API code tries to attach devices to its own
> - * domains between of_xlate() and probe_device() - we have no way to cope
> - * with that, so until ARM gets converted to rely on groups and default
> - * domains, just say no (but more politely than by dereferencing NULL).
> - * This should be at least a WARN_ON once that's sorted.
> - */
> cfg = dev_iommu_priv_get(dev);
> - if (!cfg)
> - return -ENODEV;
> -
> smmu = cfg->smmu;
>
> ret = arm_smmu_rpm_get(smmu);
Acked-by: Will Deacon <[email protected]>
Will
21.08.2020 03:11, Robin Murphy wrote:
...
>> Hello, Robin! Thank you for your work!
>>
>> Some drivers, like this Tegra VDE (Video Decoder Engine) driver for
>> example, do not want to use implicit IOMMU domain.
>
> That isn't (intentionally) changing here - the only difference should be
> that instead of having the ARM-special implicit domain, which you have
> to kick out of the way with the ARM-specific API before you're able to
> attach your own domain, the implicit domain is now a proper IOMMU API
> default domain, which automatically gets bumped by your attach. The
> default domains should still only be created in the same cases that the
> ARM dma_iommu_mappings were.
>
>> Tegra VDE driver
>> relies on explicit IOMMU domain in a case of Tegra SMMU because VDE
>> hardware can't access last page of the AS and because driver wants to
>> reserve some fixed addresses [1].
>>
>> [1]
>> https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/iommu.c#L100
>>
>>
>> The Tegra30 SoC supports up to 4 domains, hence we can't afford to
>> waste any on unused implicit domains. I think this needs to be
>> addressed before this patch can be applied.
>
> Yeah, there is one subtle change in behaviour from removing the ARM
> layer on top of the core API, in that the IOMMU driver will no longer
> see an explicit detach call. Thus it does stand to benefit from being a
> bit cleverer about noticing devices being moved from one domain to
> another by an attach call, either by releasing the hardware context for
> the inactive domain once the device(s) are moved across to the new one,
> or by simply reprogramming the hardware context in-place for the new
> domain's address space without allocating a new one at all (most of the
> drivers that don't have multiple contexts already handle the latter
> approach quite well).
>
>> Would it be possible for IOMMU drivers to gain support for filtering out
>> devices in iommu_domain_alloc(dev, type)? Then perhaps Tegra SMMU driver
>> could simply return NULL in a case of type=IOMMU_DOMAIN_DMA and
>> dev=tegra-vde.
>
> If you can implement IOMMU_DOMAIN_IDENTITY by allowing the relevant
> devices to bypass translation entirely without needing a hardware
> context (or at worst, can spare one context which all identity-mapped
> logical domains can share), then you could certainly do that kind of
> filtering with the .def_domain_type callback if you really wanted to. As
> above, the intent is that that shouldn't be necessary for this
> particular case, since only one of a group's default domain and
> explicitly attached domain can be live at any given time, so the driver
> should be able to take advantage of that.
>
> If you simply have more active devices (groups) than available contexts
> then yes, you probably would want to do some filtering to decide who
> deserves a translation domain and who doesn't, but in that case you
> should already have had a long-standing problem with the ARM implicit
> domains.
>
>> Alternatively, the Tegra SMMU could be changed such that the devices
>> will be attached to a domain at the time of a first IOMMU mapping
>> invocation instead of attaching at the time of attach_dev() callback
>> invocation.
>>
>> Or maybe even IOMMU core could be changed to attach devices at the time
>> of the first IOMMU mapping invocation? This could be a universal
>> solution for all drivers.
>
> I suppose technically you could do that within an IOMMU driver already
> (similar to how some defer most of setup that logically belongs to
> ->domain_alloc until the first ->attach_dev). It's a bit grim from the
> caller's PoV though, in terms of the failure mode being non-obvious and
> having no real way to recover. Again, you'd be better off simply making
> decisions up-front at domain_alloc or attach time based on the domain type.
Robin, thank you very much for the clarifications!
In accordance with your comments, this patch can't be applied until
Tegra SMMU supports IOMMU_DOMAIN_IDENTITY and implements a
def_domain_type() callback that returns IOMMU_DOMAIN_IDENTITY for the
VDE device.
Otherwise you're breaking the VDE driver, because
dma_buf_map_attachment() [1] returns the IOMMU SGT of the implicit
domain, which is then mapped into the VDE's explicit domain [2], and
that is nonsense.
[1]
https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/dmabuf-cache.c#L102
[2]
https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/dmabuf-cache.c#L122
Hence, either the VDE driver should bypass iommu_dma_ops from the start
or it needs a way to kick out the ops, like it currently does using
ARM's arm_iommu_detach_device().
The same applies to the Tegra GPU devices, otherwise you're breaking
them as well, because Tegra DRM is sensitive to implicit vs explicit domains.
BTW, I tried to apply this series and T30 doesn't boot anymore. I don't
have more info for now.
21.08.2020 03:28, Robin Murphy wrote:
...
>> Will a returned NULL tell the IOMMU core that an implicit domain
>> shouldn't be used? Is it possible to leave this driver as-is?
>
> The aim of this patch was just to make the conversion without functional
> changes wherever possible, i.e. maintain an equivalent to the existing
> ARM behaviour of allocating its own implicit domains for everything. It
> doesn't represent any judgement of whether that was ever appropriate for
> this driver in the first place ;)
>
> Hopefully my other reply already covered the degree of control drivers
> can have with proper default domains, but do shout if anything wasn't
> clear.
Thank you for the detailed comments! I haven't been watching all the
recent iommu/ changes closely, and your clarification is very helpful!
My current understanding is that the GART driver will need to support
IOMMU_DOMAIN_IDENTITY and set def_domain_type to IOMMU_DOMAIN_IDENTITY
for all devices.
Meanwhile, today's upstream drivers don't use the GART, hence this patch
should be okay. That said, it's a bit unlikely that the IOMMU_DOMAIN_DMA
type will ever be useful for the GART, and thus I think it would be a
bit nicer to keep the GART driver as-is for now.
Hi Robin,
On 20.08.2020 17:08, Robin Murphy wrote:
> Hi all,
>
> After 5 years or so of intending to get round to this, finally the
> time comes! The changes themselves actualy turn out to be relatively
> mechanical; the bigger concern appears to be how to get everything
> merged across about 5 diffferent trees given the dependencies.
>
> I've lightly boot-tested things on Rockchip RK3288 and Exynos 4412
> (Odroid-U3), to the degree that their display drivers should be using
> IOMMU-backed buffers and don't explode (the Odroid doesn't manage to
> send a working HDMI signal to the one monitor I have that it actually
> detects, but that's a pre-existing condition...) Confirmation that the
> Mediatek, OMAP and Tegra changes work will be most welcome.
>
> Patches are based on 5.9-rc1, branch available here:
>
> git://linux-arm.org/linux-rm arm/dma
Well, my first proposal for the ARM and ARM64 DMA-mapping unification
was posted a long time ago: https://lkml.org/lkml/2016/2/19/79
Thanks for resurrecting it! :)
I've tested this patchset on various ARM 32bit Exynos based boards (not
only Exynos4412) and most of them work fine after your conversion.
However, there are issues you cannot learn from the code.
Conversion of the Exynos DRM was straightforward (thanks!), but there
are other Exynos drivers that depend on the old ARM implementation. The
S5P-MFC (only for the v5 hardware) and Exynos4 FIMC-ISP drivers depend
on the first-fit IOVA allocation algorithm in the old ARM DMA-mapping
code. This was the main reason I didn't continue my initial conversion
attempt.
Both drivers allocate a buffer for their firmware and then, in the
hardware registers, address video buffers as offsets from the beginning
of the firmware. This doesn't work when the underlying DMA-mapping code
allocates IOVAs with a last-fit algorithm, which is what
drivers/iommu/dma-iommu.c does. So far I haven't found a good solution
for that issue.
I'm open to suggestions. One more limitation for the S5P-MFC driver is
that the hardware is only capable of addressing 128MiB. It will
probably need to call the IOMMU API directly, but I would like to keep
as much of the IOMMU/DMA-mapping code as possible.
Anyway, we need to move ARM 32bit forward, so for the ARM DMA-mapping
and Exynos DRM changes, feel free to add:
Acked-by: Marek Szyprowski <[email protected]>
Tested-by: Marek Szyprowski <[email protected]>
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
On 2020-08-23 22:34, Dmitry Osipenko wrote:
> 21.08.2020 03:11, Robin Murphy wrote:
> ...
>>> Hello, Robin! Thank you for your work!
>>>
>>> Some drivers, like this Tegra VDE (Video Decoder Engine) driver for
>>> example, do not want to use implicit IOMMU domain.
>>
>> That isn't (intentionally) changing here - the only difference should be
>> that instead of having the ARM-special implicit domain, which you have
>> to kick out of the way with the ARM-specific API before you're able to
>> attach your own domain, the implicit domain is now a proper IOMMU API
>> default domain, which automatically gets bumped by your attach. The
>> default domains should still only be created in the same cases that the
>> ARM dma_iommu_mappings were.
>>
>>> Tegra VDE driver
>>> relies on explicit IOMMU domain in a case of Tegra SMMU because VDE
>>> hardware can't access last page of the AS and because driver wants to
>>> reserve some fixed addresses [1].
>>>
>>> [1]
>>> https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/iommu.c#L100
>>>
>>>
>>> The Tegra30 SoC supports up to 4 domains, hence we can't afford to
>>> waste any on unused implicit domains. I think this needs to be
>>> addressed before this patch can be applied.
>>
>> Yeah, there is one subtle change in behaviour from removing the ARM
>> layer on top of the core API, in that the IOMMU driver will no longer
>> see an explicit detach call. Thus it does stand to benefit from being a
>> bit cleverer about noticing devices being moved from one domain to
>> another by an attach call, either by releasing the hardware context for
>> the inactive domain once the device(s) are moved across to the new one,
>> or by simply reprogramming the hardware context in-place for the new
>> domain's address space without allocating a new one at all (most of the
>> drivers that don't have multiple contexts already handle the latter
>> approach quite well).
>>
>>> Would it be possible for IOMMU drivers to gain support for filtering out
>>> devices in iommu_domain_alloc(dev, type)? Then perhaps Tegra SMMU driver
>>> could simply return NULL in a case of type=IOMMU_DOMAIN_DMA and
>>> dev=tegra-vde.
>>
>> If you can implement IOMMU_DOMAIN_IDENTITY by allowing the relevant
>> devices to bypass translation entirely without needing a hardware
>> context (or at worst, can spare one context which all identity-mapped
>> logical domains can share), then you could certainly do that kind of
>> filtering with the .def_domain_type callback if you really wanted to. As
>> above, the intent is that that shouldn't be necessary for this
>> particular case, since only one of a group's default domain and
>> explicitly attached domain can be live at any given time, so the driver
>> should be able to take advantage of that.
>>
>> If you simply have more active devices (groups) than available contexts
>> then yes, you probably would want to do some filtering to decide who
>> deserves a translation domain and who doesn't, but in that case you
>> should already have had a long-standing problem with the ARM implicit
>> domains.
>>
>>> Alternatively, the Tegra SMMU could be changed such that the devices
>>> will be attached to a domain at the time of a first IOMMU mapping
>>> invocation instead of attaching at the time of attach_dev() callback
>>> invocation.
>>>
>>> Or maybe even IOMMU core could be changed to attach devices at the time
>>> of the first IOMMU mapping invocation? This could be a universal
>>> solution for all drivers.
>>
>> I suppose technically you could do that within an IOMMU driver already
>> (similar to how some defer most of setup that logically belongs to
>> ->domain_alloc until the first ->attach_dev). It's a bit grim from the
>> caller's PoV though, in terms of the failure mode being non-obvious and
>> having no real way to recover. Again, you'd be better off simply making
>> decisions up-front at domain_alloc or attach time based on the domain type.
>
> Robin, thank you very much for the clarifications!
>
> In accordance with your comments, this patch can't be applied until
> Tegra SMMU supports IOMMU_DOMAIN_IDENTITY and implements a
> def_domain_type() callback that returns IOMMU_DOMAIN_IDENTITY for the
> VDE device.
>
> Otherwise you're breaking the VDE driver, because
> dma_buf_map_attachment() [1] returns the IOMMU SGT of the implicit
> domain, which is then mapped into the VDE's explicit domain [2], and
> that is nonsense.
It's true that iommu_dma_ops will do some work in the unattached default
domain, but non-coherent cache maintenance will still be performed
correctly on the underlying memory, which is really all that you care
about for this case. As for tegra_vde_iommu_map(), that seems to do the
right thing in only referencing the physical side of the scatterlist
(via iommu_map_sg()) and ignoring the DMA side, so things ought to work
out OK even if it is a little non-obvious.
> [1]
> https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/dmabuf-cache.c#L102
>
> [2]
> https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/dmabuf-cache.c#L122
>
> Hence, either the VDE driver should bypass iommu_dma_ops from the start
> or it needs a way to kick out the ops, like it currently does using
> ARM's arm_iommu_detach_device().
>
>
> The same applies to the Tegra GPU devices, otherwise you're breaking
> them as well, because Tegra DRM is sensitive to implicit vs explicit domains.
Note that Tegra DRM will only be as broken as its current state on
arm64, and I was under the impression that that was OK now - at least I
don't recall seeing any complaints since 43c5bf11a610. Although that
commit and the one before it are resolving the scalability issue that
they describe, it was very much in my mind at the time that they also
have the happy side-effect described above - the default domain isn't
*completely* out of the way, but it's far enough that sensible cases
should be able to work as expected.
> BTW, I tried to apply this series and T30 doesn't boot anymore. I don't
> have more info for now.
Yeah, I'm still trying to get to the bottom of whether it's actually
working as intended at all, even on my RK3288. So far my debugging
instrumentation has been confusingly inconclusive :/
Robin.
Hi Robin,
On 8/20/20 10:08 AM, Robin Murphy wrote:
> Now that arch/arm is wired up for default domains and iommu-dma,
> implement the corresponding driver-side support for DMA domains.
>
> Signed-off-by: Robin Murphy <[email protected]>
> ---
> drivers/iommu/omap-iommu.c | 22 +++++++++++++++++++++-
> 1 file changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
> index 71f29c0927fc..ea25c2fe0418 100644
> --- a/drivers/iommu/omap-iommu.c
> +++ b/drivers/iommu/omap-iommu.c
> @@ -9,6 +9,7 @@
> * Paul Mundt and Toshihiro Kobayashi
> */
>
> +#include <linux/dma-iommu.h>
> #include <linux/dma-mapping.h>
> #include <linux/err.h>
> #include <linux/slab.h>
> @@ -1574,13 +1575,19 @@ static struct iommu_domain *omap_iommu_domain_alloc(unsigned type)
> {
> struct omap_iommu_domain *omap_domain;
>
> - if (type != IOMMU_DOMAIN_UNMANAGED)
> + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> return NULL;
>
> omap_domain = kzalloc(sizeof(*omap_domain), GFP_KERNEL);
> if (!omap_domain)
> return NULL;
>
> + if (type == IOMMU_DOMAIN_DMA &&
> + iommu_get_dma_cookie(&omap_domain->domain)) {
> + kfree(omap_domain);
> + return NULL;
> + }
> +
> spin_lock_init(&omap_domain->lock);
>
> omap_domain->domain.geometry.aperture_start = 0;
> @@ -1601,6 +1608,7 @@ static void omap_iommu_domain_free(struct iommu_domain *domain)
> if (omap_domain->dev)
> _omap_iommu_detach_dev(omap_domain, omap_domain->dev);
>
> + iommu_put_dma_cookie(&omap_domain->domain);
> kfree(omap_domain);
> }
>
> @@ -1736,6 +1744,17 @@ static struct iommu_group *omap_iommu_device_group(struct device *dev)
> return group;
> }
>
> +static int omap_iommu_of_xlate(struct device *dev,
> + struct of_phandle_args *args)
> +{
> + /*
> + * Logically, some of the housekeeping from _omap_iommu_add_device()
> + * should probably move here, but the minimum we *need* is simply to
> + * cooperate with of_iommu at all to let iommu-dma work.
> + */
> + return 0;
> +}
> +
I have tested this series, and it is breaking the OMAP remoteproc functionality.
We definitely need some more plumbing. I am currently getting MMU faults and
also the DMA-allocated addresses are not coming from the device-specific CMA
pools (the opposite of what Sakari has reported with OMAP3 ISP). Just removing the
of_xlate gets me back the expected allocations, and no MMU faults, but I don't
see any valid traces.
The MMU devices that the OMAP IOMMU driver deals with are not traditional
bus-level IOMMU devices, but local MMU devices that are present within a remote
processor sub-system or hardware accelerator (eg: OMAP3 ISP). The usage is also
slightly different between remoteprocs and OMAP3 ISP. The former uses the CMA
pools and iommu_map/unmap API (UNMANAGED iommu domain), as the allocated regions
need to be mapped using specific device addresses adhering to the firmware
linker map, while OMAP3 ISP uses it like a traditional DMA pool.
regards
Suman
> static const struct iommu_ops omap_iommu_ops = {
> .domain_alloc = omap_iommu_domain_alloc,
> .domain_free = omap_iommu_domain_free,
> @@ -1747,6 +1766,7 @@ static const struct iommu_ops omap_iommu_ops = {
> .probe_device = omap_iommu_probe_device,
> .release_device = omap_iommu_release_device,
> .device_group = omap_iommu_device_group,
> + .of_xlate = omap_iommu_of_xlate,
> .pgsize_bitmap = OMAP_IOMMU_PGSIZES,
> };
>
>
On 8/20/20 6:01 PM, Robin Murphy wrote:
> On 2020-08-20 20:55, Sakari Ailus wrote:
>> On Thu, Aug 20, 2020 at 06:25:19PM +0100, Robin Murphy wrote:
>>> On 2020-08-20 17:53, Sakari Ailus wrote:
>>>> Hi Robin,
>>>>
>>>> On Thu, Aug 20, 2020 at 04:08:36PM +0100, Robin Murphy wrote:
>>>>> Now that arch/arm is wired up for default domains and iommu-dma, devices
>>>>> behind IOMMUs will get mappings set up automatically as appropriate, so
>>>>> there is no need for drivers to do so manually.
>>>>>
>>>>> Signed-off-by: Robin Murphy <[email protected]>
>>>>
>>>> Thanks for the patch.
>>>
>>> Many thanks for testing so quickly!
>>>
>>>> I haven't looked at the details but it seems that this causes the buffer
>>>> memory allocation to be physically contiguous, which causes a failure to
>>>> allocate video buffers of entirely normal size. I guess that was not
>>>> intentional?
>>>
>>> Hmm, it looks like the device ends up with the wrong DMA ops, which implies
>>> something didn't go as expected with the earlier IOMMU setup and default
>>> domain creation. Chances are that either I missed some subtlety in the
>>> omap_iommu change, or I've fundamentally misjudged how the ISP probing works
>>> and it never actually goes down the of_iommu_configure() path in the first
>>> place. Do you get any messages from the IOMMU layer earlier on during boot?
Yeah, I don't think we go through the of_iommu_configure() path; the setup is
mostly done using the .probe_device() and .attach_dev() ops. Since the MMUs
are present directly in the respective sub-systems and rely on the sub-system
clocking and power, the MMU itself is turned on and enabled during .attach_dev().
regards
Suman
>>
>> I do get these:
>>
>> [ 2.934936] iommu: Default domain type: Translated
>> [ 2.940917] omap-iommu 480bd400.mmu: 480bd400.mmu registered
>> [ 2.946899] platform 480bc000.isp: Adding to iommu group 0
>>
>
> So that much looks OK, if there are no obvious errors. Unfortunately there's no
> easy way to tell exactly what of_iommu_configure() is doing (beyond enabling a
> couple of vague debug messages). The first thing I'll do tomorrow is
> double-check whether it's really working on my boards here, or whether I was
> just getting lucky with CMA... (I assume you don't have CMA enabled if you're
> ending up in remap_allocator_alloc())
>
> Robin.
On 24.08.2020 17:01, Robin Murphy wrote:
...
>> Robin, thank you very much for the clarifications!
>>
>> In accordance with your comments, this patch can't be applied until Tegra
>> SMMU supports IOMMU_DOMAIN_IDENTITY and implements a def_domain_type()
>> callback that returns IOMMU_DOMAIN_IDENTITY for the VDE device.
>>
>> Otherwise you're breaking the VDE driver because
>> dma_buf_map_attachment() [1] returns the IOMMU SGT of the implicit
>> domain, which is then mapped into the VDE's explicit domain [2], and this
>> is nonsense.
>
> It's true that iommu_dma_ops will do some work in the unattached default
> domain, but non-coherent cache maintenance will still be performed
> correctly on the underlying memory, which is really all that you care
> about for this case. As for tegra_vde_iommu_map(), that seems to do the
> right thing in only referencing the physical side of the scatterlist
> (via iommu_map_sg()) and ignoring the DMA side, so things ought to work
> out OK even if it is a little non-obvious.
I'll need to double-check this; it's indeed not clear to me right now.
I see that if the Tegra DRM driver uses the implicit IOMMU domain, then when
the VDE driver imports a DMA-buf from Tegra DRM, the imported buffer will be
auto-mapped to the implicit VDE IOVA [1].
[1]
https://elixir.bootlin.com/linux/v5.9-rc2/source/drivers/gpu/drm/tegra/gem.c#L574
>> Hence, either the VDE driver should bypass iommu_dma_ops from the start or
>> it needs a way to kick out the ops, as it currently does using ARM's
>> arm_iommu_detach_device().
>>
>>
>> The same applies to the Tegra GPU devices, otherwise you're breaking
>> them as well, because Tegra DRM is sensitive to implicit vs explicit
>> domain.
>
> Note that Tegra DRM will only be as broken as its current state on
> arm64, and I was under the impression that that was OK now - at least I
> don't recall seeing any complaints since 43c5bf11a610. Although that
> commit and the one before it are resolving the scalability issue that
> they describe, it was very much in my mind at the time that they also
> have the happy side-effect described above - the default domain isn't
> *completely* out of the way, but it's far enough that sensible cases
> should be able to work as expected.
The Tegra DRM has a very special quirk for ARM32 that was added in this
commit [2], and the driver relies on checking whether an explicit or implicit
IOMMU is used in order to activate the quirk.
[2]
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=273da5a046965ccf0ec79eb63f2d5173467e20fa
Once the implicit IOMMU is used for the DRM driver, the quirk no longer
works (if I'm not missing something). This problem needs to be resolved
before the implicit IOMMU can be used by the Tegra DRM on ARM32.
>> BTW, I tried to apply this series and T30 doesn't boot anymore. I don't
>> have more info for now.
>
> Yeah, I'm still trying to get to the bottom of whether it's actually
> working as intended at all, even on my RK3288. So far my debugging
> instrumentation has been confusingly inconclusive :/
Surely it will take some time to resolve all the problems, and it's great
that you're pushing this work!
I'll try to help with fixing the ARM32 Tegra side of the problems. I've
added this to my TODO list and should be able to take a closer look
during this or next week!
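For illustration, a minimal sketch of the def_domain_type() callback
suggested above; identifying the VDE by its DT compatible is an assumption
here, not something the driver currently does:

#include <linux/of.h>
#include <linux/iommu.h>

static int tegra_smmu_def_domain_type(struct device *dev)
{
	/* Assumption: the VDE can be recognised by this compatible. */
	if (of_device_is_compatible(dev->of_node, "nvidia,tegra20-vde"))
		return IOMMU_DOMAIN_IDENTITY;

	return 0;	/* no preference; fall back to the global default */
}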
Tested the full series on Bananapi R2 (mt7623/mt2701, 5.9-rc1 + HDMI patches); fbcon+X works so far without issues.
Tested-by: Frank Wunderlich <[email protected]>
regards Frank
On 27/08/2020 14:31, Frank Wunderlich wrote:
> Tested the full series on Bananapi R2 (mt7623/mt2701, 5.9-rc1 + HDMI patches); fbcon+X works so far without issues.
>
> Tested-by: Frank Wunderlich <[email protected]>
>
Thanks for testing.
Robin, this is especially relevant for:
[PATCH 09/18] iommu/mediatek-v1: Add IOMMU_DOMAIN_DMA support
Regards,
Matthias
On Thu, Aug 20, 2020 at 04:08:32PM +0100, Robin Murphy wrote:
> Now that arch/arm is wired up for default domains and iommu-dma,
> implement the corresponding driver-side support for DMA domains.
>
> Signed-off-by: Robin Murphy <[email protected]>
> ---
> drivers/iommu/tegra-smmu.c | 37 +++++++++++++++++++++----------------
> 1 file changed, 21 insertions(+), 16 deletions(-)
We can't do that yet because it will currently still break for use-cases
such as display where we don't properly set up identity mappings during
boot. The result is that the dma-iommu code will enable translations
before the driver gets a chance to set up any mappings and if the
display controller was left on by the bootloader, scanning out a splash
screen, this causes faults between the point where dma-iommu is being
set up for the display controller and where the display controller
starts mapping its own buffers (rather than the ones mapped by the
bootloader).
That said, I do have a series that I've been carrying around for longer
than I've wanted that does exactly this for Tegra SMMU and I'd prefer if
you could drop this particular change from your series so that I can
keep working on resolving the identity mapping issues first.
Thierry
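For context, a rough sketch of one way such boot-time identity mappings can
be expressed through the existing reserved-region machinery; the
tegra_smmu_get_resv_regions() name and the fb_base/fb_size discovery are
hypothetical, not Thierry's actual series:

#include <linux/iommu.h>
#include <linux/list.h>

static phys_addr_t fb_base;	/* hypothetical: scanout buffer left live */
static size_t fb_size;		/* by the bootloader, discovered somehow */

static void tegra_smmu_get_resv_regions(struct device *dev,
					struct list_head *head)
{
	struct iommu_resv_region *region;

	region = iommu_alloc_resv_region(fb_base, fb_size, IOMMU_READ,
					 IOMMU_RESV_DIRECT);
	if (!region)
		return;

	list_add_tail(&region->list, head);
}

The core code identity-maps IOMMU_RESV_DIRECT regions in the default domain
before DMA ops take over, which is roughly the guarantee the display case
needs.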
On Thu, Aug 27, 2020 at 10:05:14AM +0300, Dmitry Osipenko wrote:
> On 24.08.2020 17:01, Robin Murphy wrote:
> ...
> >> Robin, thank you very much for the clarifications!
> >>
> >> In accordance with your comments, this patch can't be applied until Tegra
> >> SMMU supports IOMMU_DOMAIN_IDENTITY and implements a def_domain_type()
> >> callback that returns IOMMU_DOMAIN_IDENTITY for the VDE device.
> >>
> >> Otherwise you're breaking the VDE driver because
> >> dma_buf_map_attachment() [1] returns the IOMMU SGT of the implicit
> >> domain, which is then mapped into the VDE's explicit domain [2], and this
> >> is nonsense.
> >
> > It's true that iommu_dma_ops will do some work in the unattached default
> > domain, but non-coherent cache maintenance will still be performed
> > correctly on the underlying memory, which is really all that you care
> > about for this case. As for tegra_vde_iommu_map(), that seems to do the
> > right thing in only referencing the physical side of the scatterlist
> > (via iommu_map_sg()) and ignoring the DMA side, so things ought to work
> > out OK even if it is a little non-obvious.
>
> I'll need to double-check this; it's indeed not clear to me right now.
>
> I see that if the Tegra DRM driver uses the implicit IOMMU domain, then when
> the VDE driver imports a DMA-buf from Tegra DRM, the imported buffer will be
> auto-mapped to the implicit VDE IOVA [1].
>
> [1]
> https://elixir.bootlin.com/linux/v5.9-rc2/source/drivers/gpu/drm/tegra/gem.c#L574
>
> >> Hence, either the VDE driver should bypass iommu_dma_ops from the start or
> >> it needs a way to kick out the ops, as it currently does using ARM's
> >> arm_iommu_detach_device().
> >>
> >>
> >> The same applies to the Tegra GPU devices, otherwise you're breaking
> >> them as well, because Tegra DRM is sensitive to implicit vs explicit
> >> domain.
> >
> > Note that Tegra DRM will only be as broken as its current state on
> > arm64, and I was under the impression that that was OK now - at least I
> > don't recall seeing any complaints since 43c5bf11a610. Although that
> > commit and the one before it are resolving the scalability issue that
> > they describe, it was very much in my mind at the time that they also
> > have the happy side-effect described above - the default domain isn't
> > *completely* out of the way, but it's far enough that sensible cases
> > should be able to work as expected.
>
> The Tegra DRM has a very special quirk for ARM32 that was added in this
> commit [2], and the driver relies on checking whether an explicit or implicit
> IOMMU is used in order to activate the quirk.
>
> [2]
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=273da5a046965ccf0ec79eb63f2d5173467e20fa
>
> Once the implicit IOMMU is used for the DRM driver, the quirk no longer
> works (if I'm not missing something). This problem needs to be resolved
> before the implicit IOMMU can be used by the Tegra DRM on ARM32.
>
> >> BTW, I tried to apply this series and T30 doesn't boot anymore. I don't
> >> have more info for now.
> >
> > Yeah, I'm still trying to get to the bottom of whether it's actually
> > working as intended at all, even on my RK3288. So far my debugging
> > instrumentation has been confusingly inconclusive :/
>
> Surely it will take some time to resolve all the problems, and it's great
> that you're pushing this work!
>
> I'll try to help with fixing the ARM32 Tegra side of the problems. I've
> added this to my TODO list and should be able to take a closer look
> during this or next week!
I do have a patch lying around somewhere that implements the mapping
cache that was referenced in the above commit. Let me know if I should
dig that up and send it out.
Thierry
On 2020-08-27 16:45, Thierry Reding wrote:
> On Thu, Aug 20, 2020 at 04:08:32PM +0100, Robin Murphy wrote:
>> Now that arch/arm is wired up for default domains and iommu-dma,
>> implement the corresponding driver-side support for DMA domains.
>>
>> Signed-off-by: Robin Murphy <[email protected]>
>> ---
>> drivers/iommu/tegra-smmu.c | 37 +++++++++++++++++++++----------------
>> 1 file changed, 21 insertions(+), 16 deletions(-)
>
> We can't do that yet because it will currently still break for use-cases
> such as display where we don't properly set up identity mappings during
> boot. The result is that the dma-iommu code will enable translations
> before the driver gets a chance to set up any mappings and if the
> display controller was left on by the bootloader, scanning out a splash
> screen, this causes faults between the point where dma-iommu is being
> set up for the display controller and where the display controller
> starts mapping its own buffers (rather than the ones mapped by the
> bootloader).
Rest assured that I understand the situation all too well ;) As with
tegra-gart, the unspoken point here is that since tegra-smmu implements
of_xlate(), arm_setup_iommu_dma_ops() must already be causing the
exact same problem, no? This patch only seeks to move any existing
behaviour over to the common backend, regardless of whether it was ever
really appropriate in the first place.
> That said, I do have a series that I've been carrying around for longer
> than I've wanted that does exactly this for Tegra SMMU and I'd prefer if
> you could drop this particular change from your series so that I can
> keep working on resolving the identity mapping issues first.
That would mean you'd see a functional change from the final patch,
wherein nothing would ever be able to get translation unless drivers do
their own explicit IOMMU API management. If you definitely want that
change then OK, but it would still be nice to do it "properly" with
IOMMU_DOMAIN_IDENTITY support, rather than just forcibly failing default
domain allocation. I'm having a go at reworking the tegra-gart patch in
that direction, so I can certainly try it for tegra-smmu as well.
Robin.
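A rough sketch of that "proper" shape, with a hypothetical driver prefix and
the private state elided:

#include <linux/iommu.h>
#include <linux/slab.h>

struct foo_domain {
	struct iommu_domain domain;
	/* driver-private state would live here */
};

static struct iommu_domain *foo_domain_alloc(unsigned type)
{
	struct foo_domain *dom;

	if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_IDENTITY)
		return NULL;

	dom = kzalloc(sizeof(*dom), GFP_KERNEL);
	if (!dom)
		return NULL;

	/* An identity domain carries no pagetables; attach just bypasses. */
	return &dom->domain;
}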
On Thu, 2020-08-20 at 16:08 +0100, Robin Murphy wrote:
> Now that arch/arm is wired up for default domains and iommu-dma,
> implement the corresponding driver-side support for groups and DMA
> domains to replace the shared mapping workaround.
>
> Signed-off-by: Robin Murphy <[email protected]>
> ---
> drivers/iommu/mtk_iommu.h | 2 -
> drivers/iommu/mtk_iommu_v1.c | 153 +++++++++++------------------------
> 2 files changed, 48 insertions(+), 107 deletions(-)
Hi Robin,
Thanks very much for this patch; it makes the code much cleaner.
Please squash the following small change into this patch:
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -555,6 +555,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
return ret;
iommu_device_set_ops(&data->iommu, &mtk_iommu_ops);
+ iommu_device_set_fwnode(&data->iommu, &dev->of_node->fwnode);
ret = iommu_device_register(&data->iommu);
if (ret)
Then,
Tested-by: Yong Wu <[email protected]>
>
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index 122925dbe547..6253e98d810c 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -67,8 +67,6 @@ struct mtk_iommu_data {
> struct iommu_device iommu;
> const struct mtk_iommu_plat_data *plat_data;
>
> - struct dma_iommu_mapping *mapping; /* For mtk_iommu_v1.c */
> -
> struct list_head list;
> struct mtk_smi_larb_iommu larb_imu[MTK_LARB_NR_MAX];
> };
> diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
> index 82ddfe9170d4..40c89b8d3ac4 100644
> --- a/drivers/iommu/mtk_iommu_v1.c
> +++ b/drivers/iommu/mtk_iommu_v1.c
> @@ -28,7 +28,6 @@
> #include <linux/slab.h>
> #include <linux/spinlock.h>
> #include <asm/barrier.h>
> -#include <asm/dma-iommu.h>
> #include <linux/init.h>
> #include <dt-bindings/memory/mt2701-larb-port.h>
> #include <soc/mediatek/smi.h>
> @@ -240,13 +239,18 @@ static struct iommu_domain *mtk_iommu_domain_alloc(unsigned type)
> {
> struct mtk_iommu_domain *dom;
>
> - if (type != IOMMU_DOMAIN_UNMANAGED)
> + if (type != IOMMU_DOMAIN_UNMANAGED && type != IOMMU_DOMAIN_DMA)
> return NULL;
>
> dom = kzalloc(sizeof(*dom), GFP_KERNEL);
> if (!dom)
> return NULL;
>
> + if (type == IOMMU_DOMAIN_DMA && iommu_get_dma_cookie(&dom->domain)) {
> + kfree(dom);
> + return NULL;
> + }
> +
> return &dom->domain;
> }
>
> @@ -257,6 +261,7 @@ static void mtk_iommu_domain_free(struct iommu_domain *domain)
>
> dma_free_coherent(data->dev, M2701_IOMMU_PGT_SIZE,
> dom->pgt_va, dom->pgt_pa);
> + iommu_put_dma_cookie(domain);
> kfree(to_mtk_domain(domain));
> }
>
> @@ -265,14 +270,8 @@ static int mtk_iommu_attach_device(struct iommu_domain *domain,
> {
> struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
> struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> - struct dma_iommu_mapping *mtk_mapping;
> int ret;
>
> - /* Only allow the domain created internally. */
> - mtk_mapping = data->mapping;
> - if (mtk_mapping->domain != domain)
> - return 0;
> -
> if (!data->m4u_dom) {
> data->m4u_dom = dom;
> ret = mtk_iommu_domain_finalise(data);
> @@ -358,18 +357,39 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct iommu_domain *domain,
>
> static const struct iommu_ops mtk_iommu_ops;
>
> -/*
> - * MTK generation one iommu HW only support one iommu domain, and all the client
> - * sharing the same iova address space.
> - */
> -static int mtk_iommu_create_mapping(struct device *dev,
> - struct of_phandle_args *args)
> +static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
> {
> struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> struct mtk_iommu_data *data;
> +
> + if (!fwspec || fwspec->ops != &mtk_iommu_ops)
> + return ERR_PTR(-ENODEV); /* Not a iommu client device */
> +
> + data = dev_iommu_priv_get(dev);
> +
> + return &data->iommu;
> +}
> +
> +static void mtk_iommu_release_device(struct device *dev)
> +{
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> +
> + if (!fwspec || fwspec->ops != &mtk_iommu_ops)
> + return;
> +
> + iommu_fwspec_free(dev);
> +}
> +
> +static struct iommu_group *mtk_iommu_device_group(struct device *dev)
> +{
> + struct mtk_iommu_data *data = dev_iommu_priv_get(dev);
> +
> + return iommu_group_ref_get(data->m4u_group);
> +}
> +
> +static int mtk_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
> +{
> struct platform_device *m4updev;
> - struct dma_iommu_mapping *mtk_mapping;
> - int ret;
>
> if (args->args_count != 1) {
> dev_err(dev, "invalid #iommu-cells(%d) property for IOMMU\n",
> @@ -377,15 +397,6 @@ static int mtk_iommu_create_mapping(struct device *dev,
> return -EINVAL;
> }
>
> - if (!fwspec) {
> - ret = iommu_fwspec_init(dev, &args->np->fwnode, &mtk_iommu_ops);
> - if (ret)
> - return ret;
> - fwspec = dev_iommu_fwspec_get(dev);
> - } else if (dev_iommu_fwspec_get(dev)->ops != &mtk_iommu_ops) {
> - return -EINVAL;
> - }
> -
> if (!dev_iommu_priv_get(dev)) {
> /* Get the m4u device */
> m4updev = of_find_device_by_node(args->np);
> @@ -395,83 +406,7 @@ static int mtk_iommu_create_mapping(struct device *dev,
> dev_iommu_priv_set(dev, platform_get_drvdata(m4updev));
> }
>
> - ret = iommu_fwspec_add_ids(dev, args->args, 1);
> - if (ret)
> - return ret;
> -
> - data = dev_iommu_priv_get(dev);
> - mtk_mapping = data->mapping;
> - if (!mtk_mapping) {
> - /* MTK iommu support 4GB iova address space. */
> - mtk_mapping = arm_iommu_create_mapping(&platform_bus_type,
> - 0, 1ULL << 32);
> - if (IS_ERR(mtk_mapping))
> - return PTR_ERR(mtk_mapping);
> -
> - data->mapping = mtk_mapping;
> - }
> -
> - return 0;
> -}
> -
> -static int mtk_iommu_def_domain_type(struct device *dev)
> -{
> - return IOMMU_DOMAIN_UNMANAGED;
> -}
> -
> -static struct iommu_device *mtk_iommu_probe_device(struct device *dev)
> -{
> - struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> - struct of_phandle_args iommu_spec;
> - struct of_phandle_iterator it;
> - struct mtk_iommu_data *data;
> - int err;
> -
> - of_for_each_phandle(&it, err, dev->of_node, "iommus",
> - "#iommu-cells", -1) {
> - int count = of_phandle_iterator_args(&it, iommu_spec.args,
> - MAX_PHANDLE_ARGS);
> - iommu_spec.np = of_node_get(it.node);
> - iommu_spec.args_count = count;
> -
> - mtk_iommu_create_mapping(dev, &iommu_spec);
> -
> - /* dev->iommu_fwspec might have changed */
> - fwspec = dev_iommu_fwspec_get(dev);
> -
> - of_node_put(iommu_spec.np);
> - }
> -
> - if (!fwspec || fwspec->ops != &mtk_iommu_ops)
> - return ERR_PTR(-ENODEV); /* Not a iommu client device */
> -
> - data = dev_iommu_priv_get(dev);
> -
> - return &data->iommu;
> -}
> -
> -static void mtk_iommu_probe_finalize(struct device *dev)
> -{
> - struct dma_iommu_mapping *mtk_mapping;
> - struct mtk_iommu_data *data;
> - int err;
> -
> - data = dev_iommu_priv_get(dev);
> - mtk_mapping = data->mapping;
> -
> - err = arm_iommu_attach_device(dev, mtk_mapping);
> - if (err)
> - dev_err(dev, "Can't create IOMMU mapping - DMA-OPS will not work\n");
> -}
> -
> -static void mtk_iommu_release_device(struct device *dev)
> -{
> - struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> -
> - if (!fwspec || fwspec->ops != &mtk_iommu_ops)
> - return;
> -
> - iommu_fwspec_free(dev);
> + return iommu_fwspec_add_ids(dev, args->args, 1);
> }
>
> static int mtk_iommu_hw_init(const struct mtk_iommu_data *data)
> @@ -524,10 +459,9 @@ static const struct iommu_ops mtk_iommu_ops = {
> .unmap = mtk_iommu_unmap,
> .iova_to_phys = mtk_iommu_iova_to_phys,
> .probe_device = mtk_iommu_probe_device,
> - .probe_finalize = mtk_iommu_probe_finalize,
> .release_device = mtk_iommu_release_device,
> - .def_domain_type = mtk_iommu_def_domain_type,
> - .device_group = generic_device_group,
> + .device_group = mtk_iommu_device_group,
> + .of_xlate = mtk_iommu_of_xlate,
> .pgsize_bitmap = ~0UL << MT2701_IOMMU_PAGE_SHIFT,
> };
>
> @@ -626,6 +560,14 @@ static int mtk_iommu_probe(struct platform_device *pdev)
> if (ret)
> return ret;
>
> + /*
> + * MTK generation one iommu HW only support one iommu domain,
> + * and all the client sharing the same iova address space.
> + */
> + data->m4u_group = iommu_group_alloc();
> + if (IS_ERR(data->m4u_group))
> + return PTR_ERR(data->m4u_group);
> +
> if (!iommu_present(&platform_bus_type))
> bus_set_iommu(&platform_bus_type, &mtk_iommu_ops);
>
> @@ -636,6 +578,7 @@ static int mtk_iommu_remove(struct platform_device *pdev)
> {
> struct mtk_iommu_data *data = platform_get_drvdata(pdev);
>
> + iommu_group_put(data->m4u_group);
> iommu_device_sysfs_remove(&data->iommu);
> iommu_device_unregister(&data->iommu);
>
On 27.08.2020 18:54, Thierry Reding wrote:
...
>> The Tegra DRM has a very special quirk for ARM32 that was added in this
>> commit [2] and driver relies on checking of whether explicit or implicit
>> IOMMU is used in order to activate the quirk.
>>
>> [2]
>> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=273da5a046965ccf0ec79eb63f2d5173467e20fa
>>
>> Once the implicit IOMMU is used for the DRM driver, the quirk no longer
>> works (if I'm not missing something). This problem needs to be resolved
>> before implicit IOMMU could be used by the Tegra DRM on ARM32.
...
> I do have a patch lying around somewhere that implements the mapping
> cache that was referenced in the above commit. Let me know if I should
> dig that up and send it out.
Hello, Thierry! It will certainly be interesting to take a look at your
patch!
I think the caching shouldn't be strictly necessary for keeping the
current workaround working; it should be possible to keep the code
as-is by replacing the domain-type checking with a SoC-generation
check in the Tegra DRM driver.
In general, IMO it would be better to stash the complex changes until
we get closer to adopting the new UAPI, as it will certainly touch the
aspect of per-device mappings. But if your patch is less than 100
LOC, then maybe we could consider applying it right now!
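For illustration, a sketch of the SoC-generation check mentioned above; the
exact set of compatibles and the helper name are assumptions:

#include <linux/of.h>

static bool tegra_drm_arm32_quirk_needed(void)
{
	/* Assumption: the quirk only matters on Tegra20/Tegra30-class SoCs. */
	return of_machine_is_compatible("nvidia,tegra20") ||
	       of_machine_is_compatible("nvidia,tegra30");
}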
Hi Robin,
On 20.08.2020 17:08, Robin Murphy wrote:
> Now that arch/arm is wired up for default domains and iommu-dma, we can
> consolidate the shared mapping code onto the generic IOMMU API version,
> and retire the arch-specific implementation.
>
> Signed-off-by: Robin Murphy <[email protected]>
>
> ---
> This is a cheeky revert of 07dc3678bacc ("drm/exynos: Fix cleanup of
> IOMMU related objects"), plus removal of the remaining arm_iommu_*
> references on top.
I have one more suggestion for this patch. Please rename
exynos_drm_private->mapping to exynos_drm_private->domain and change its
type from "void *" to "struct iommu_domain *". The "void *" was there to
support both the old ARM and the ARM64 IOMMU/DMA-mapping frameworks, but now
we can use the proper pointer types.
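Something along these lines, as a sketch (only the affected member is shown;
the surrounding fields are elided):

#include <linux/iommu.h>

struct exynos_drm_private {
	/* ... */
	struct device *dma_dev;
	struct iommu_domain *domain;	/* was: void *mapping */
	/* ... */
};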
> ---
> drivers/gpu/drm/exynos/exynos5433_drm_decon.c | 5 +-
> drivers/gpu/drm/exynos/exynos7_drm_decon.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_dma.c | 61 +++----------------
> drivers/gpu/drm/exynos/exynos_drm_drv.h | 6 +-
> drivers/gpu/drm/exynos/exynos_drm_fimc.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_fimd.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_g2d.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_gsc.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_rotator.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_scaler.c | 6 +-
> drivers/gpu/drm/exynos/exynos_mixer.c | 7 +--
> 11 files changed, 29 insertions(+), 86 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
> index 1f79bc2a881e..8428ae12dfa5 100644
> --- a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
> +++ b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
> @@ -55,7 +55,6 @@ static const char * const decon_clks_name[] = {
> struct decon_context {
> struct device *dev;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct exynos_drm_crtc *crtc;
> struct exynos_drm_plane planes[WINDOWS_NR];
> struct exynos_drm_plane_config configs[WINDOWS_NR];
> @@ -645,7 +644,7 @@ static int decon_bind(struct device *dev, struct device *master, void *data)
>
> decon_clear_channels(ctx->crtc);
>
> - return exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
> + return exynos_drm_register_dma(drm_dev, dev);
> }
>
> static void decon_unbind(struct device *dev, struct device *master, void *data)
> @@ -655,7 +654,7 @@ static void decon_unbind(struct device *dev, struct device *master, void *data)
> decon_atomic_disable(ctx->crtc);
>
> /* detach this sub driver from iommu mapping if supported. */
> - exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev, &ctx->dma_priv);
> + exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev);
> }
>
> static const struct component_ops decon_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos7_drm_decon.c b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
> index f2d87a7445c7..e7b58097ccdc 100644
> --- a/drivers/gpu/drm/exynos/exynos7_drm_decon.c
> +++ b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
> @@ -40,7 +40,6 @@
> struct decon_context {
> struct device *dev;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct exynos_drm_crtc *crtc;
> struct exynos_drm_plane planes[WINDOWS_NR];
> struct exynos_drm_plane_config configs[WINDOWS_NR];
> @@ -128,13 +127,13 @@ static int decon_ctx_initialize(struct decon_context *ctx,
>
> decon_clear_channels(ctx->crtc);
>
> - return exynos_drm_register_dma(drm_dev, ctx->dev, &ctx->dma_priv);
> + return exynos_drm_register_dma(drm_dev, ctx->dev);
> }
>
> static void decon_ctx_remove(struct decon_context *ctx)
> {
> /* detach this sub driver from iommu mapping if supported. */
> - exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev, &ctx->dma_priv);
> + exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev);
> }
>
> static u32 decon_calc_clkdiv(struct decon_context *ctx,
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_dma.c b/drivers/gpu/drm/exynos/exynos_drm_dma.c
> index 58b89ec11b0e..fd5f9bcf1857 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_dma.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_dma.c
> @@ -14,19 +14,6 @@
>
> #include "exynos_drm_drv.h"
>
> -#if defined(CONFIG_ARM_DMA_USE_IOMMU)
> -#include <asm/dma-iommu.h>
> -#else
> -#define arm_iommu_create_mapping(...) ({ NULL; })
> -#define arm_iommu_attach_device(...) ({ -ENODEV; })
> -#define arm_iommu_release_mapping(...) ({ })
> -#define arm_iommu_detach_device(...) ({ })
> -#define to_dma_iommu_mapping(dev) NULL
> -#endif
> -
> -#if !defined(CONFIG_IOMMU_DMA)
> -#define iommu_dma_init_domain(...) ({ -EINVAL; })
> -#endif
>
> #define EXYNOS_DEV_ADDR_START 0x20000000
> #define EXYNOS_DEV_ADDR_SIZE 0x40000000
> @@ -58,7 +45,7 @@ static inline void clear_dma_max_seg_size(struct device *dev)
> * mapping.
> */
> static int drm_iommu_attach_device(struct drm_device *drm_dev,
> - struct device *subdrv_dev, void **dma_priv)
> + struct device *subdrv_dev)
> {
> struct exynos_drm_private *priv = drm_dev->dev_private;
> int ret = 0;
> @@ -73,22 +60,7 @@ static int drm_iommu_attach_device(struct drm_device *drm_dev,
> if (ret)
> return ret;
>
> - if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)) {
> - /*
> - * Keep the original DMA mapping of the sub-device and
> - * restore it on Exynos DRM detach, otherwise the DMA
> - * framework considers it as IOMMU-less during the next
> - * probe (in case of deferred probe or modular build)
> - */
> - *dma_priv = to_dma_iommu_mapping(subdrv_dev);
> - if (*dma_priv)
> - arm_iommu_detach_device(subdrv_dev);
> -
> - ret = arm_iommu_attach_device(subdrv_dev, priv->mapping);
> - } else if (IS_ENABLED(CONFIG_IOMMU_DMA)) {
> - ret = iommu_attach_device(priv->mapping, subdrv_dev);
> - }
> -
> + ret = iommu_attach_device(priv->mapping, subdrv_dev);
> if (ret)
> clear_dma_max_seg_size(subdrv_dev);
>
> @@ -105,21 +77,15 @@ static int drm_iommu_attach_device(struct drm_device *drm_dev,
> * mapping
> */
> static void drm_iommu_detach_device(struct drm_device *drm_dev,
> - struct device *subdrv_dev, void **dma_priv)
> + struct device *subdrv_dev)
> {
> struct exynos_drm_private *priv = drm_dev->dev_private;
>
> - if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)) {
> - arm_iommu_detach_device(subdrv_dev);
> - arm_iommu_attach_device(subdrv_dev, *dma_priv);
> - } else if (IS_ENABLED(CONFIG_IOMMU_DMA))
> - iommu_detach_device(priv->mapping, subdrv_dev);
> -
> + iommu_detach_device(priv->mapping, subdrv_dev);
> clear_dma_max_seg_size(subdrv_dev);
> }
>
> -int exynos_drm_register_dma(struct drm_device *drm, struct device *dev,
> - void **dma_priv)
> +int exynos_drm_register_dma(struct drm_device *drm, struct device *dev)
> {
> struct exynos_drm_private *priv = drm->dev_private;
>
> @@ -133,27 +99,20 @@ int exynos_drm_register_dma(struct drm_device *drm, struct device *dev,
> return 0;
>
> if (!priv->mapping) {
> - void *mapping;
> -
> - if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU))
> - mapping = arm_iommu_create_mapping(&platform_bus_type,
> - EXYNOS_DEV_ADDR_START, EXYNOS_DEV_ADDR_SIZE);
> - else if (IS_ENABLED(CONFIG_IOMMU_DMA))
> - mapping = iommu_get_domain_for_dev(priv->dma_dev);
> + void *mapping = iommu_get_domain_for_dev(priv->dma_dev);
>
> if (IS_ERR(mapping))
> return PTR_ERR(mapping);
> priv->mapping = mapping;
> }
>
> - return drm_iommu_attach_device(drm, dev, dma_priv);
> + return drm_iommu_attach_device(drm, dev);
> }
>
> -void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev,
> - void **dma_priv)
> +void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev)
> {
> if (IS_ENABLED(CONFIG_EXYNOS_IOMMU))
> - drm_iommu_detach_device(drm, dev, dma_priv);
> + drm_iommu_detach_device(drm, dev);
> }
>
> void exynos_drm_cleanup_dma(struct drm_device *drm)
> @@ -163,7 +122,5 @@ void exynos_drm_cleanup_dma(struct drm_device *drm)
> if (!IS_ENABLED(CONFIG_EXYNOS_IOMMU))
> return;
>
> - arm_iommu_release_mapping(priv->mapping);
> - priv->mapping = NULL;
> priv->dma_dev = NULL;
> }
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h b/drivers/gpu/drm/exynos/exynos_drm_drv.h
> index 6ae9056e7a18..d4d21d8cfb90 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
> @@ -223,10 +223,8 @@ static inline bool is_drm_iommu_supported(struct drm_device *drm_dev)
> return priv->mapping ? true : false;
> }
>
> -int exynos_drm_register_dma(struct drm_device *drm, struct device *dev,
> - void **dma_priv);
> -void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev,
> - void **dma_priv);
> +int exynos_drm_register_dma(struct drm_device *drm, struct device *dev);
> +void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev);
> void exynos_drm_cleanup_dma(struct drm_device *drm);
>
> #ifdef CONFIG_DRM_EXYNOS_DPI
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimc.c b/drivers/gpu/drm/exynos/exynos_drm_fimc.c
> index 29ab8be8604c..8ea2e1d77802 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fimc.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimc.c
> @@ -97,7 +97,6 @@ struct fimc_scaler {
> struct fimc_context {
> struct exynos_drm_ipp ipp;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct device *dev;
> struct exynos_drm_ipp_task *task;
> struct exynos_drm_ipp_formats *formats;
> @@ -1134,7 +1133,7 @@ static int fimc_bind(struct device *dev, struct device *master, void *data)
>
> ctx->drm_dev = drm_dev;
> ipp->drm_dev = drm_dev;
> - exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
> + exynos_drm_register_dma(drm_dev, dev);
>
> exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
> DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE |
> @@ -1154,7 +1153,7 @@ static void fimc_unbind(struct device *dev, struct device *master,
> struct exynos_drm_ipp *ipp = &ctx->ipp;
>
> exynos_drm_ipp_unregister(dev, ipp);
> - exynos_drm_unregister_dma(drm_dev, dev, &ctx->dma_priv);
> + exynos_drm_unregister_dma(drm_dev, dev);
> }
>
> static const struct component_ops fimc_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> index bb67cad8371f..21aec38702fc 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> @@ -167,7 +167,6 @@ static struct fimd_driver_data exynos5420_fimd_driver_data = {
> struct fimd_context {
> struct device *dev;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct exynos_drm_crtc *crtc;
> struct exynos_drm_plane planes[WINDOWS_NR];
> struct exynos_drm_plane_config configs[WINDOWS_NR];
> @@ -1091,7 +1090,7 @@ static int fimd_bind(struct device *dev, struct device *master, void *data)
> if (is_drm_iommu_supported(drm_dev))
> fimd_clear_channels(ctx->crtc);
>
> - return exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
> + return exynos_drm_register_dma(drm_dev, dev);
> }
>
> static void fimd_unbind(struct device *dev, struct device *master,
> @@ -1101,7 +1100,7 @@ static void fimd_unbind(struct device *dev, struct device *master,
>
> fimd_atomic_disable(ctx->crtc);
>
> - exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev, &ctx->dma_priv);
> + exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev);
>
> if (ctx->encoder)
> exynos_dpi_remove(ctx->encoder);
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index 03be31427181..256ceafd9945 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -232,7 +232,6 @@ struct g2d_runqueue_node {
>
> struct g2d_data {
> struct device *dev;
> - void *dma_priv;
> struct clk *gate_clk;
> void __iomem *regs;
> int irq;
> @@ -1410,7 +1409,7 @@ static int g2d_bind(struct device *dev, struct device *master, void *data)
> return ret;
> }
>
> - ret = exynos_drm_register_dma(drm_dev, dev, &g2d->dma_priv);
> + ret = exynos_drm_register_dma(drm_dev, dev);
> if (ret < 0) {
> dev_err(dev, "failed to enable iommu.\n");
> g2d_fini_cmdlist(g2d);
> @@ -1435,7 +1434,7 @@ static void g2d_unbind(struct device *dev, struct device *master, void *data)
> priv->g2d_dev = NULL;
>
> cancel_work_sync(&g2d->runqueue_work);
> - exynos_drm_unregister_dma(g2d->drm_dev, dev, &g2d->dma_priv);
> + exynos_drm_unregister_dma(g2d->drm_dev, dev);
> }
>
> static const struct component_ops g2d_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gsc.c b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
> index 45e9aee8366a..88b6fcaa20be 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gsc.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
> @@ -97,7 +97,6 @@ struct gsc_scaler {
> struct gsc_context {
> struct exynos_drm_ipp ipp;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct device *dev;
> struct exynos_drm_ipp_task *task;
> struct exynos_drm_ipp_formats *formats;
> @@ -1170,7 +1169,7 @@ static int gsc_bind(struct device *dev, struct device *master, void *data)
>
> ctx->drm_dev = drm_dev;
> ctx->drm_dev = drm_dev;
> - exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
> + exynos_drm_register_dma(drm_dev, dev);
>
> exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
> DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE |
> @@ -1190,7 +1189,7 @@ static void gsc_unbind(struct device *dev, struct device *master,
> struct exynos_drm_ipp *ipp = &ctx->ipp;
>
> exynos_drm_ipp_unregister(dev, ipp);
> - exynos_drm_unregister_dma(drm_dev, dev, &ctx->dma_priv);
> + exynos_drm_unregister_dma(drm_dev, dev);
> }
>
> static const struct component_ops gsc_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_rotator.c b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
> index 2d94afba031e..f22fa0d2621a 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_rotator.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
> @@ -56,7 +56,6 @@ struct rot_variant {
> struct rot_context {
> struct exynos_drm_ipp ipp;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct device *dev;
> void __iomem *regs;
> struct clk *clock;
> @@ -244,7 +243,7 @@ static int rotator_bind(struct device *dev, struct device *master, void *data)
>
> rot->drm_dev = drm_dev;
> ipp->drm_dev = drm_dev;
> - exynos_drm_register_dma(drm_dev, dev, &rot->dma_priv);
> + exynos_drm_register_dma(drm_dev, dev);
>
> exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
> DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE,
> @@ -262,7 +261,7 @@ static void rotator_unbind(struct device *dev, struct device *master,
> struct exynos_drm_ipp *ipp = &rot->ipp;
>
> exynos_drm_ipp_unregister(dev, ipp);
> - exynos_drm_unregister_dma(rot->drm_dev, rot->dev, &rot->dma_priv);
> + exynos_drm_unregister_dma(rot->drm_dev, rot->dev);
> }
>
> static const struct component_ops rotator_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_scaler.c b/drivers/gpu/drm/exynos/exynos_drm_scaler.c
> index ce1857138f89..0c560566bd2e 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_scaler.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_scaler.c
> @@ -39,7 +39,6 @@ struct scaler_data {
> struct scaler_context {
> struct exynos_drm_ipp ipp;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct device *dev;
> void __iomem *regs;
> struct clk *clock[SCALER_MAX_CLK];
> @@ -451,7 +450,7 @@ static int scaler_bind(struct device *dev, struct device *master, void *data)
>
> scaler->drm_dev = drm_dev;
> ipp->drm_dev = drm_dev;
> - exynos_drm_register_dma(drm_dev, dev, &scaler->dma_priv);
> + exynos_drm_register_dma(drm_dev, dev);
>
> exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
> DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE |
> @@ -471,8 +470,7 @@ static void scaler_unbind(struct device *dev, struct device *master,
> struct exynos_drm_ipp *ipp = &scaler->ipp;
>
> exynos_drm_ipp_unregister(dev, ipp);
> - exynos_drm_unregister_dma(scaler->drm_dev, scaler->dev,
> - &scaler->dma_priv);
> + exynos_drm_unregister_dma(scaler->drm_dev, scaler->dev);
> }
>
> static const struct component_ops scaler_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos_mixer.c b/drivers/gpu/drm/exynos/exynos_mixer.c
> index af192e5a16ef..18972e605c5d 100644
> --- a/drivers/gpu/drm/exynos/exynos_mixer.c
> +++ b/drivers/gpu/drm/exynos/exynos_mixer.c
> @@ -94,7 +94,6 @@ struct mixer_context {
> struct platform_device *pdev;
> struct device *dev;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct exynos_drm_crtc *crtc;
> struct exynos_drm_plane planes[MIXER_WIN_NR];
> unsigned long flags;
> @@ -895,14 +894,12 @@ static int mixer_initialize(struct mixer_context *mixer_ctx,
> }
> }
>
> - return exynos_drm_register_dma(drm_dev, mixer_ctx->dev,
> - &mixer_ctx->dma_priv);
> + return exynos_drm_register_dma(drm_dev, mixer_ctx->dev);
> }
>
> static void mixer_ctx_remove(struct mixer_context *mixer_ctx)
> {
> - exynos_drm_unregister_dma(mixer_ctx->drm_dev, mixer_ctx->dev,
> - &mixer_ctx->dma_priv);
> + exynos_drm_unregister_dma(mixer_ctx->drm_dev, mixer_ctx->dev);
> }
>
> static int mixer_enable_vblank(struct exynos_drm_crtc *crtc)
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
Hi all,
On 24.08.2020 13:40, Marek Szyprowski wrote:
> On 20.08.2020 17:08, Robin Murphy wrote:
>> After 5 years or so of intending to get round to this, finally the
>> time comes! The changes themselves actualy turn out to be relatively
>> mechanical; the bigger concern appears to be how to get everything
>> merged across about 5 diffferent trees given the dependencies.
>>
>> I've lightly boot-tested things on Rockchip RK3288 and Exynos 4412
>> (Odroid-U3), to the degree that their display drivers should be using
>> IOMMU-backed buffers and don't explode (the Odroid doesn't manage to
>> send a working HDMI signal to the one monitor I have that it actually
>> detects, but that's a pre-existing condition...) Confirmation that the
>> Mediatek, OMAP and Tegra changes work will be most welcome.
>>
>> Patches are based on 5.9-rc1, branch available here:
>>
>> git://linux-arm.org/linux-rm arm/dma
>
> Well, my first proposal for the ARM and ARM64 DMA-mapping unification
> was posted a long time ago: https://lkml.org/lkml/2016/2/19/79
>
> Thanks for resurrecting it! :)
>
> I've tested this patchset on various ARM 32-bit Exynos-based boards (not
> only Exynos4412) and most of them work fine after your conversion.
> However, there are issues you cannot learn from the code.
>
> Conversion of the Exynos DRM was straightforward (thanks!), but there
> are other Exynos drivers that depend on the old ARM implementation.
> The S5P-MFC (only for the v5 hardware) and Exynos4 FIMC-ISP drivers
> depend on the first-fit IOVA allocation algorithm in the old ARM
> DMA-mapping. This was the main reason I didn't continue my initial
> conversion attempt.
>
> Both drivers allocate a buffer for their firmware and then, in the
> hardware registers, address video buffers as an offset from the
> beginning of the firmware. This doesn't work when the underlying
> DMA-mapping allocates IOVAs with the last-fit algorithm, which is what
> drivers/iommu/dma-iommu.c does. So far I haven't found a good solution
> for that issue.
>
> I'm open to suggestions. One more limitation of the S5P-MFC driver
> is that the hardware is only capable of addressing 128 MiB. They will
> probably need to call the IOMMU API directly, but I would like to keep as
> much of the IOMMU/DMA-mapping code as possible.
Just for the record: I've finally managed to add the needed workarounds to
both problematic Exynos4 drivers, so they work fine with this
patchset. It turned out that it wasn't that hard:
https://lore.kernel.org/linux-samsung-soc/[email protected]/T/#t
So from my side you have a green light to go ahead and switch ARM 32-bit
to the generic code. Time to say goodbye to one of my biggest
architecture-related things ever merged to mainline Linux. ;)
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland
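For illustration, a sketch of the fixed-base mapping such workarounds can
use; the window base, the size check and the helper name are hypothetical:

#include <linux/errno.h>
#include <linux/iommu.h>
#include <linux/sizes.h>

#define MFC_FW_IOVA	0x20000000UL	/* hypothetical window base */

/* Pin the firmware at the bottom of the device's 128 MiB window so that
 * buffer registers can be programmed as offsets from this base. */
static int mfc_map_firmware(struct iommu_domain *domain,
			    phys_addr_t fw_phys, size_t fw_size)
{
	if (fw_size > SZ_128M)
		return -EINVAL;

	return iommu_map(domain, MFC_FW_IOVA, fw_phys, fw_size,
			 IOMMU_READ | IOMMU_WRITE);
}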
On 21.08.2020 00:08, Robin Murphy wrote:
> Now that arch/arm is wired up for default domains and iommu-dma, we can
> consolidate the shared mapping code onto the generic IOMMU API version,
> and retire the arch-specific implementation.
>
> Signed-off-by: Robin Murphy <[email protected]>
>
> ---
> This is a cheeky revert of 07dc3678bacc ("drm/exynos: Fix cleanup of
> IOMMU related objects"), plus removal of the remaining arm_iommu_*
> references on top.
Thanks for the cleanup.
Acked-by: Inki Dae <[email protected]>
Thanks,
Inki Dae
> ---
> drivers/gpu/drm/exynos/exynos5433_drm_decon.c | 5 +-
> drivers/gpu/drm/exynos/exynos7_drm_decon.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_dma.c | 61 +++----------------
> drivers/gpu/drm/exynos/exynos_drm_drv.h | 6 +-
> drivers/gpu/drm/exynos/exynos_drm_fimc.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_fimd.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_g2d.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_gsc.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_rotator.c | 5 +-
> drivers/gpu/drm/exynos/exynos_drm_scaler.c | 6 +-
> drivers/gpu/drm/exynos/exynos_mixer.c | 7 +--
> 11 files changed, 29 insertions(+), 86 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
> index 1f79bc2a881e..8428ae12dfa5 100644
> --- a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
> +++ b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
> @@ -55,7 +55,6 @@ static const char * const decon_clks_name[] = {
> struct decon_context {
> struct device *dev;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct exynos_drm_crtc *crtc;
> struct exynos_drm_plane planes[WINDOWS_NR];
> struct exynos_drm_plane_config configs[WINDOWS_NR];
> @@ -645,7 +644,7 @@ static int decon_bind(struct device *dev, struct device *master, void *data)
>
> decon_clear_channels(ctx->crtc);
>
> - return exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
> + return exynos_drm_register_dma(drm_dev, dev);
> }
>
> static void decon_unbind(struct device *dev, struct device *master, void *data)
> @@ -655,7 +654,7 @@ static void decon_unbind(struct device *dev, struct device *master, void *data)
> decon_atomic_disable(ctx->crtc);
>
> /* detach this sub driver from iommu mapping if supported. */
> - exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev, &ctx->dma_priv);
> + exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev);
> }
>
> static const struct component_ops decon_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos7_drm_decon.c b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
> index f2d87a7445c7..e7b58097ccdc 100644
> --- a/drivers/gpu/drm/exynos/exynos7_drm_decon.c
> +++ b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
> @@ -40,7 +40,6 @@
> struct decon_context {
> struct device *dev;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct exynos_drm_crtc *crtc;
> struct exynos_drm_plane planes[WINDOWS_NR];
> struct exynos_drm_plane_config configs[WINDOWS_NR];
> @@ -128,13 +127,13 @@ static int decon_ctx_initialize(struct decon_context *ctx,
>
> decon_clear_channels(ctx->crtc);
>
> - return exynos_drm_register_dma(drm_dev, ctx->dev, &ctx->dma_priv);
> + return exynos_drm_register_dma(drm_dev, ctx->dev);
> }
>
> static void decon_ctx_remove(struct decon_context *ctx)
> {
> /* detach this sub driver from iommu mapping if supported. */
> - exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev, &ctx->dma_priv);
> + exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev);
> }
>
> static u32 decon_calc_clkdiv(struct decon_context *ctx,
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_dma.c b/drivers/gpu/drm/exynos/exynos_drm_dma.c
> index 58b89ec11b0e..fd5f9bcf1857 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_dma.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_dma.c
> @@ -14,19 +14,6 @@
>
> #include "exynos_drm_drv.h"
>
> -#if defined(CONFIG_ARM_DMA_USE_IOMMU)
> -#include <asm/dma-iommu.h>
> -#else
> -#define arm_iommu_create_mapping(...) ({ NULL; })
> -#define arm_iommu_attach_device(...) ({ -ENODEV; })
> -#define arm_iommu_release_mapping(...) ({ })
> -#define arm_iommu_detach_device(...) ({ })
> -#define to_dma_iommu_mapping(dev) NULL
> -#endif
> -
> -#if !defined(CONFIG_IOMMU_DMA)
> -#define iommu_dma_init_domain(...) ({ -EINVAL; })
> -#endif
>
> #define EXYNOS_DEV_ADDR_START 0x20000000
> #define EXYNOS_DEV_ADDR_SIZE 0x40000000
> @@ -58,7 +45,7 @@ static inline void clear_dma_max_seg_size(struct device *dev)
> * mapping.
> */
> static int drm_iommu_attach_device(struct drm_device *drm_dev,
> - struct device *subdrv_dev, void **dma_priv)
> + struct device *subdrv_dev)
> {
> struct exynos_drm_private *priv = drm_dev->dev_private;
> int ret = 0;
> @@ -73,22 +60,7 @@ static int drm_iommu_attach_device(struct drm_device *drm_dev,
> if (ret)
> return ret;
>
> - if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)) {
> - /*
> - * Keep the original DMA mapping of the sub-device and
> - * restore it on Exynos DRM detach, otherwise the DMA
> - * framework considers it as IOMMU-less during the next
> - * probe (in case of deferred probe or modular build)
> - */
> - *dma_priv = to_dma_iommu_mapping(subdrv_dev);
> - if (*dma_priv)
> - arm_iommu_detach_device(subdrv_dev);
> -
> - ret = arm_iommu_attach_device(subdrv_dev, priv->mapping);
> - } else if (IS_ENABLED(CONFIG_IOMMU_DMA)) {
> - ret = iommu_attach_device(priv->mapping, subdrv_dev);
> - }
> -
> + ret = iommu_attach_device(priv->mapping, subdrv_dev);
> if (ret)
> clear_dma_max_seg_size(subdrv_dev);
>
> @@ -105,21 +77,15 @@ static int drm_iommu_attach_device(struct drm_device *drm_dev,
> * mapping
> */
> static void drm_iommu_detach_device(struct drm_device *drm_dev,
> - struct device *subdrv_dev, void **dma_priv)
> + struct device *subdrv_dev)
> {
> struct exynos_drm_private *priv = drm_dev->dev_private;
>
> - if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU)) {
> - arm_iommu_detach_device(subdrv_dev);
> - arm_iommu_attach_device(subdrv_dev, *dma_priv);
> - } else if (IS_ENABLED(CONFIG_IOMMU_DMA))
> - iommu_detach_device(priv->mapping, subdrv_dev);
> -
> + iommu_detach_device(priv->mapping, subdrv_dev);
> clear_dma_max_seg_size(subdrv_dev);
> }
>
> -int exynos_drm_register_dma(struct drm_device *drm, struct device *dev,
> - void **dma_priv)
> +int exynos_drm_register_dma(struct drm_device *drm, struct device *dev)
> {
> struct exynos_drm_private *priv = drm->dev_private;
>
> @@ -133,27 +99,20 @@ int exynos_drm_register_dma(struct drm_device *drm, struct device *dev,
> return 0;
>
> if (!priv->mapping) {
> - void *mapping;
> -
> - if (IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU))
> - mapping = arm_iommu_create_mapping(&platform_bus_type,
> - EXYNOS_DEV_ADDR_START, EXYNOS_DEV_ADDR_SIZE);
> - else if (IS_ENABLED(CONFIG_IOMMU_DMA))
> - mapping = iommu_get_domain_for_dev(priv->dma_dev);
> + void *mapping = iommu_get_domain_for_dev(priv->dma_dev);
>
> if (IS_ERR(mapping))
> return PTR_ERR(mapping);
> priv->mapping = mapping;
> }
>
> - return drm_iommu_attach_device(drm, dev, dma_priv);
> + return drm_iommu_attach_device(drm, dev);
> }
>
> -void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev,
> - void **dma_priv)
> +void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev)
> {
> if (IS_ENABLED(CONFIG_EXYNOS_IOMMU))
> - drm_iommu_detach_device(drm, dev, dma_priv);
> + drm_iommu_detach_device(drm, dev);
> }
>
> void exynos_drm_cleanup_dma(struct drm_device *drm)
> @@ -163,7 +122,5 @@ void exynos_drm_cleanup_dma(struct drm_device *drm)
> if (!IS_ENABLED(CONFIG_EXYNOS_IOMMU))
> return;
>
> - arm_iommu_release_mapping(priv->mapping);
> - priv->mapping = NULL;
> priv->dma_dev = NULL;
> }
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h b/drivers/gpu/drm/exynos/exynos_drm_drv.h
> index 6ae9056e7a18..d4d21d8cfb90 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
> @@ -223,10 +223,8 @@ static inline bool is_drm_iommu_supported(struct drm_device *drm_dev)
> return priv->mapping ? true : false;
> }
>
> -int exynos_drm_register_dma(struct drm_device *drm, struct device *dev,
> - void **dma_priv);
> -void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev,
> - void **dma_priv);
> +int exynos_drm_register_dma(struct drm_device *drm, struct device *dev);
> +void exynos_drm_unregister_dma(struct drm_device *drm, struct device *dev);
> void exynos_drm_cleanup_dma(struct drm_device *drm);
>
> #ifdef CONFIG_DRM_EXYNOS_DPI
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimc.c b/drivers/gpu/drm/exynos/exynos_drm_fimc.c
> index 29ab8be8604c..8ea2e1d77802 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fimc.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimc.c
> @@ -97,7 +97,6 @@ struct fimc_scaler {
> struct fimc_context {
> struct exynos_drm_ipp ipp;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct device *dev;
> struct exynos_drm_ipp_task *task;
> struct exynos_drm_ipp_formats *formats;
> @@ -1134,7 +1133,7 @@ static int fimc_bind(struct device *dev, struct device *master, void *data)
>
> ctx->drm_dev = drm_dev;
> ipp->drm_dev = drm_dev;
> - exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
> + exynos_drm_register_dma(drm_dev, dev);
>
> exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
> DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE |
> @@ -1154,7 +1153,7 @@ static void fimc_unbind(struct device *dev, struct device *master,
> struct exynos_drm_ipp *ipp = &ctx->ipp;
>
> exynos_drm_ipp_unregister(dev, ipp);
> - exynos_drm_unregister_dma(drm_dev, dev, &ctx->dma_priv);
> + exynos_drm_unregister_dma(drm_dev, dev);
> }
>
> static const struct component_ops fimc_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> index bb67cad8371f..21aec38702fc 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
> @@ -167,7 +167,6 @@ static struct fimd_driver_data exynos5420_fimd_driver_data = {
> struct fimd_context {
> struct device *dev;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct exynos_drm_crtc *crtc;
> struct exynos_drm_plane planes[WINDOWS_NR];
> struct exynos_drm_plane_config configs[WINDOWS_NR];
> @@ -1091,7 +1090,7 @@ static int fimd_bind(struct device *dev, struct device *master, void *data)
> if (is_drm_iommu_supported(drm_dev))
> fimd_clear_channels(ctx->crtc);
>
> - return exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
> + return exynos_drm_register_dma(drm_dev, dev);
> }
>
> static void fimd_unbind(struct device *dev, struct device *master,
> @@ -1101,7 +1100,7 @@ static void fimd_unbind(struct device *dev, struct device *master,
>
> fimd_atomic_disable(ctx->crtc);
>
> - exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev, &ctx->dma_priv);
> + exynos_drm_unregister_dma(ctx->drm_dev, ctx->dev);
>
> if (ctx->encoder)
> exynos_dpi_remove(ctx->encoder);
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_g2d.c b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> index 03be31427181..256ceafd9945 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_g2d.c
> @@ -232,7 +232,6 @@ struct g2d_runqueue_node {
>
> struct g2d_data {
> struct device *dev;
> - void *dma_priv;
> struct clk *gate_clk;
> void __iomem *regs;
> int irq;
> @@ -1410,7 +1409,7 @@ static int g2d_bind(struct device *dev, struct device *master, void *data)
> return ret;
> }
>
> - ret = exynos_drm_register_dma(drm_dev, dev, &g2d->dma_priv);
> + ret = exynos_drm_register_dma(drm_dev, dev);
> if (ret < 0) {
> dev_err(dev, "failed to enable iommu.\n");
> g2d_fini_cmdlist(g2d);
> @@ -1435,7 +1434,7 @@ static void g2d_unbind(struct device *dev, struct device *master, void *data)
> priv->g2d_dev = NULL;
>
> cancel_work_sync(&g2d->runqueue_work);
> - exynos_drm_unregister_dma(g2d->drm_dev, dev, &g2d->dma_priv);
> + exynos_drm_unregister_dma(g2d->drm_dev, dev);
> }
>
> static const struct component_ops g2d_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gsc.c b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
> index 45e9aee8366a..88b6fcaa20be 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gsc.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gsc.c
> @@ -97,7 +97,6 @@ struct gsc_scaler {
> struct gsc_context {
> struct exynos_drm_ipp ipp;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct device *dev;
> struct exynos_drm_ipp_task *task;
> struct exynos_drm_ipp_formats *formats;
> @@ -1170,7 +1169,7 @@ static int gsc_bind(struct device *dev, struct device *master, void *data)
>
> ctx->drm_dev = drm_dev;
> ctx->drm_dev = drm_dev;
> - exynos_drm_register_dma(drm_dev, dev, &ctx->dma_priv);
> + exynos_drm_register_dma(drm_dev, dev);
>
> exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
> DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE |
> @@ -1190,7 +1189,7 @@ static void gsc_unbind(struct device *dev, struct device *master,
> struct exynos_drm_ipp *ipp = &ctx->ipp;
>
> exynos_drm_ipp_unregister(dev, ipp);
> - exynos_drm_unregister_dma(drm_dev, dev, &ctx->dma_priv);
> + exynos_drm_unregister_dma(drm_dev, dev);
> }
>
> static const struct component_ops gsc_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_rotator.c b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
> index 2d94afba031e..f22fa0d2621a 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_rotator.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_rotator.c
> @@ -56,7 +56,6 @@ struct rot_variant {
> struct rot_context {
> struct exynos_drm_ipp ipp;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct device *dev;
> void __iomem *regs;
> struct clk *clock;
> @@ -244,7 +243,7 @@ static int rotator_bind(struct device *dev, struct device *master, void *data)
>
> rot->drm_dev = drm_dev;
> ipp->drm_dev = drm_dev;
> - exynos_drm_register_dma(drm_dev, dev, &rot->dma_priv);
> + exynos_drm_register_dma(drm_dev, dev);
>
> exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
> DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE,
> @@ -262,7 +261,7 @@ static void rotator_unbind(struct device *dev, struct device *master,
> struct exynos_drm_ipp *ipp = &rot->ipp;
>
> exynos_drm_ipp_unregister(dev, ipp);
> - exynos_drm_unregister_dma(rot->drm_dev, rot->dev, &rot->dma_priv);
> + exynos_drm_unregister_dma(rot->drm_dev, rot->dev);
> }
>
> static const struct component_ops rotator_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_scaler.c b/drivers/gpu/drm/exynos/exynos_drm_scaler.c
> index ce1857138f89..0c560566bd2e 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_scaler.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_scaler.c
> @@ -39,7 +39,6 @@ struct scaler_data {
> struct scaler_context {
> struct exynos_drm_ipp ipp;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct device *dev;
> void __iomem *regs;
> struct clk *clock[SCALER_MAX_CLK];
> @@ -451,7 +450,7 @@ static int scaler_bind(struct device *dev, struct device *master, void *data)
>
> scaler->drm_dev = drm_dev;
> ipp->drm_dev = drm_dev;
> - exynos_drm_register_dma(drm_dev, dev, &scaler->dma_priv);
> + exynos_drm_register_dma(drm_dev, dev);
>
> exynos_drm_ipp_register(dev, ipp, &ipp_funcs,
> DRM_EXYNOS_IPP_CAP_CROP | DRM_EXYNOS_IPP_CAP_ROTATE |
> @@ -471,8 +470,7 @@ static void scaler_unbind(struct device *dev, struct device *master,
> struct exynos_drm_ipp *ipp = &scaler->ipp;
>
> exynos_drm_ipp_unregister(dev, ipp);
> - exynos_drm_unregister_dma(scaler->drm_dev, scaler->dev,
> - &scaler->dma_priv);
> + exynos_drm_unregister_dma(scaler->drm_dev, scaler->dev);
> }
>
> static const struct component_ops scaler_component_ops = {
> diff --git a/drivers/gpu/drm/exynos/exynos_mixer.c b/drivers/gpu/drm/exynos/exynos_mixer.c
> index af192e5a16ef..18972e605c5d 100644
> --- a/drivers/gpu/drm/exynos/exynos_mixer.c
> +++ b/drivers/gpu/drm/exynos/exynos_mixer.c
> @@ -94,7 +94,6 @@ struct mixer_context {
> struct platform_device *pdev;
> struct device *dev;
> struct drm_device *drm_dev;
> - void *dma_priv;
> struct exynos_drm_crtc *crtc;
> struct exynos_drm_plane planes[MIXER_WIN_NR];
> unsigned long flags;
> @@ -895,14 +894,12 @@ static int mixer_initialize(struct mixer_context *mixer_ctx,
> }
> }
>
> - return exynos_drm_register_dma(drm_dev, mixer_ctx->dev,
> - &mixer_ctx->dma_priv);
> + return exynos_drm_register_dma(drm_dev, mixer_ctx->dev);
> }
>
> static void mixer_ctx_remove(struct mixer_context *mixer_ctx)
> {
> - exynos_drm_unregister_dma(mixer_ctx->drm_dev, mixer_ctx->dev,
> - &mixer_ctx->dma_priv);
> + exynos_drm_unregister_dma(mixer_ctx->drm_dev, mixer_ctx->dev);
> }
>
> static int mixer_enable_vblank(struct exynos_drm_crtc *crtc)
>
Hi Robin,
On 20.08.2020 17:08, Robin Murphy wrote:
> With the IOMMU ops now looking much the same shape as iommu_dma_ops,
> switch them out in favour of the iommu-dma library, currently enhanced
> with temporary workarounds that allow it to also sit underneath the
> arch-specific API. With that in place, we can now start converting the
> remaining IOMMU drivers and consumers to work with IOMMU API default
> domains instead.
>
> Signed-off-by: Robin Murphy <[email protected]>
I've played a bit longer with this and found that reading buffers
allocated via dma_alloc_attrs() through their kernel virtual address,
with the dma-iommu ops in place, returns garbage from time to time. It
took me a while to debug this...
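The symptom is easy to demonstrate with something along these lines
(an illustrative sketch only, not my actual test code; "dev" stands in
for any device attached to the IOMMU, and the fragment assumes the
usual <linux/dma-mapping.h>, <linux/sizes.h> and <linux/string.h>):

	dma_addr_t dma;
	void *cpu;

	/* dma-iommu is expected to hand back a zeroed buffer... */
	cpu = dma_alloc_attrs(dev, SZ_64K, &dma, GFP_KERNEL, 0);

	/*
	 * ...but without the flush the zeros can sit in the CPU cache,
	 * while a read through the non-cacheable remap sees whatever
	 * stale data is still in DRAM.
	 */
	if (cpu && memchr_inv(cpu, 0, SZ_64K))
		dev_err(dev, "fresh buffer reads back non-zero\n");

	if (cpu)
		dma_free_attrs(dev, SZ_64K, cpu, dma, 0);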
Your conversion misses adding arch_dma_prep_coherent() for arch/arm,
so the buffers are zeroed by the mm allocator, but the CPU caches are
NOT flushed for the newly allocated buffers.
This fixes the issue:
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index fec3e59215b8..8b60bcc5b14f 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -2,6 +2,7 @@
 config ARM
 	bool
 	default y
 	select ARCH_32BIT_OFF_T
 	select ARCH_HAS_BINFMT_FLAT
 	select ARCH_HAS_DEBUG_VIRTUAL if MMU
+	select ARCH_HAS_DMA_PREP_COHERENT
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index ff6c4962161a..6954681b73da 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -266,6 +266,19 @@ static void __dma_clear_buffer(struct page *page, size_t size, int coherent_flag
}
}
+void arch_dma_prep_coherent(struct page *page, size_t size)
+{
+ if (PageHighMem(page)) {
+ phys_addr_t base = __pfn_to_phys(page_to_pfn(page));
+ phys_addr_t end = base + size;
+ outer_flush_range(base, end);
+ } else {
+ void *ptr = page_address(page);
+ dmac_flush_range(ptr, ptr + size);
+ outer_flush_range(__pa(ptr), __pa(ptr) + size);
+ }
+}
+
/*
* Allocate a DMA buffer for 'dev' of size 'size' using the
* specified gfp mask. Note that 'size' must be page aligned.
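(Note the two-level cache maintenance above: the inner cache is
flushed by virtual address with dmac_flush_range() first, then the
outer cache by physical address with outer_flush_range(), the same
pattern __dma_clear_buffer() already uses for normal allocations.)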
I also wonder if it would be better to use a per-arch
__dma_clear_buffer() instead of setting __GFP_ZERO unconditionally in
dma-iommu.c. That should be faster on ARM with highmem, since zeroing
and cache flushing could then be done in a single pass over each page;
a rough sketch of what I mean follows...
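Purely to illustrate the direction - untested, and the hook name is
made up rather than an existing API (needs <linux/mm.h> and
<linux/string.h>):

/*
 * Hypothetical hook for dma-iommu.c: let the arch zero the buffer and
 * do the cache maintenance in a single pass, instead of dma-iommu
 * forcing __GFP_ZERO on the page allocator.
 */
#ifndef arch_dma_clear_buffer
static inline void arch_dma_clear_buffer(struct page *page, size_t size)
{
	memset(page_address(page), 0, size);	/* lowmem-only default */
}
#endif

arch/arm could then override this with a kmap_atomic()-based memset
plus flush per page, much like __dma_clear_buffer() does today, so
highmem buffers are never mapped and walked twice.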
> ...
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland