2014-06-17 01:39:31

by Laura Abbott

Subject: [PATCHv3 0/5] Atomic pool for arm64

Hi,

This is a series to add a pool for atomic allocations on arm64. It was
previously suggested to try to share more code with arm, so I did some
refactoring to have arm use genalloc and pulled out some of the remapping
code. The end result is an overall negative diffstat for arm's dma-mapping.c.

There might still be room for further refactoring of the atomic functions into
common dma-mapping.c and for integration with dma-coherent.c, but there should
be less overlap now.

Reviews and testing welcome.

Thanks,
Laura

v3: Now a patch series due to refactoring of arm code. arm and arm64 now both
use genalloc for atomic pool management. genalloc extensions added.
DMA remapping code factored out as well.

v2: Various bug fixes pointed out by David and Ritesh (CMA dependency, swapping
coherent, noncoherent). I'm still not sure how to address the devicetree
suggestion by Will [1][2]. I added the devicetree mailing list this time around
to get more input on this.

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-April/249180.html
[2] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-April/249528.html


Laura Abbott (5):
lib/genalloc.c: Add power aligned algorithm
lib/genalloc.c: Add genpool range check function
common: dma-mapping: Introduce common remapping functions
arm: use genalloc for the atomic pool
arm64: Add atomic pool for non-coherent and CMA allocations.

arch/arm/Kconfig | 1 +
arch/arm/mm/dma-mapping.c | 200 ++++++++-----------------------
arch/arm64/Kconfig | 1 +
arch/arm64/mm/dma-mapping.c | 154 +++++++++++++++++++++---
drivers/base/dma-mapping.c | 66 ++++++++++
include/asm-generic/dma-mapping-common.h | 9 ++
include/linux/genalloc.h | 7 ++
lib/genalloc.c | 50 ++++++++
8 files changed, 323 insertions(+), 165 deletions(-)

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


2014-06-17 01:39:36

by Laura Abbott

Subject: [PATCHv3 2/5] lib/genalloc.c: Add genpool range check function

After allocating an address from a particular genpool,
there is no good way to verify whether that address actually
belongs to the genpool. Introduce addr_in_gen_pool, which
returns whether an address plus size falls completely
within the genpool range.
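
For illustration, a minimal sketch of how a caller might use the new
helper to decide whether a buffer was handed out by its pool (the
function and pool names here are hypothetical, not part of this patch):

#include <linux/genalloc.h>

/* Free vaddr only if it was allocated from my_pool. */
static bool my_free_if_pooled(struct gen_pool *my_pool, void *vaddr,
			      size_t size)
{
	if (!addr_in_gen_pool(my_pool, (unsigned long)vaddr, size))
		return false;

	gen_pool_free(my_pool, (unsigned long)vaddr, size);
	return true;
}

This mirrors what the arm and arm64 __free_from_pool() paths do later in
the series.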

Signed-off-by: Laura Abbott <[email protected]>
---
include/linux/genalloc.h | 3 +++
lib/genalloc.c | 29 +++++++++++++++++++++++++++++
2 files changed, 32 insertions(+)

diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h
index 3cd0934..1ccaab4 100644
--- a/include/linux/genalloc.h
+++ b/include/linux/genalloc.h
@@ -121,6 +121,9 @@ extern struct gen_pool *devm_gen_pool_create(struct device *dev,
int min_alloc_order, int nid);
extern struct gen_pool *dev_get_gen_pool(struct device *dev);

+bool addr_in_gen_pool(struct gen_pool *pool, unsigned long start,
+ size_t size);
+
#ifdef CONFIG_OF
extern struct gen_pool *of_get_named_gen_pool(struct device_node *np,
const char *propname, int index);
diff --git a/lib/genalloc.c b/lib/genalloc.c
index 9758529..66edf93 100644
--- a/lib/genalloc.c
+++ b/lib/genalloc.c
@@ -403,6 +403,35 @@ void gen_pool_for_each_chunk(struct gen_pool *pool,
EXPORT_SYMBOL(gen_pool_for_each_chunk);

/**
+ * addr_in_gen_pool - checks if an address falls within the range of a pool
+ * @pool: the generic memory pool
+ * @start: start address
+ * @size: size of the region
+ *
+ * Check if the range of addresses falls within the specified pool. Takes
+ * the rcu_read_lock for the duration of the check.
+ */
+bool addr_in_gen_pool(struct gen_pool *pool, unsigned long start,
+ size_t size)
+{
+ bool found = false;
+ unsigned long end = start + size;
+ struct gen_pool_chunk *chunk;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(chunk, &(pool)->chunks, next_chunk) {
+ if (start >= chunk->start_addr && start <= chunk->end_addr) {
+ if (end <= chunk->end_addr) {
+ found = true;
+ break;
+ }
+ }
+ }
+ rcu_read_unlock();
+ return found;
+}
+
+/**
* gen_pool_avail - get available free space of the pool
* @pool: pool to get available free space
*
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2014-06-17 01:39:39

by Laura Abbott

Subject: [PATCHv3 1/5] lib/genalloc.c: Add power aligned algorithm

One of the more common algorithms used for allocation
is to align the start address of the allocation to
the order of size requested. Add this as an algorithm
option for genalloc.
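
As a usage sketch (assuming a pool whose minimum allocation order is
PAGE_SHIFT, which is how the later patches in this series use it), a
caller selects the algorithm like this:

#include <linux/genalloc.h>

/* Hypothetical setup: make allocations align to the order of their size. */
static void my_pool_setup(struct gen_pool *pool)
{
	/* data carries the pool's minimum allocation order, here one page */
	gen_pool_set_algo(pool, gen_pool_first_fit_order_align,
			  (void *)PAGE_SHIFT);
}

With this in place a 16K allocation from such a pool starts on a 16K
boundary, which is what the DMA atomic pool wants.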

Signed-off-by: Laura Abbott <[email protected]>
---
include/linux/genalloc.h | 4 ++++
lib/genalloc.c | 21 +++++++++++++++++++++
2 files changed, 25 insertions(+)

diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h
index 1c2fdaa..3cd0934 100644
--- a/include/linux/genalloc.h
+++ b/include/linux/genalloc.h
@@ -110,6 +110,10 @@ extern void gen_pool_set_algo(struct gen_pool *pool, genpool_algo_t algo,
extern unsigned long gen_pool_first_fit(unsigned long *map, unsigned long size,
unsigned long start, unsigned int nr, void *data);

+extern unsigned long gen_pool_first_fit_order_align(unsigned long *map,
+ unsigned long size, unsigned long start, unsigned int nr,
+ void *data);
+
extern unsigned long gen_pool_best_fit(unsigned long *map, unsigned long size,
unsigned long start, unsigned int nr, void *data);

diff --git a/lib/genalloc.c b/lib/genalloc.c
index bdb9a45..9758529 100644
--- a/lib/genalloc.c
+++ b/lib/genalloc.c
@@ -481,6 +481,27 @@ unsigned long gen_pool_first_fit(unsigned long *map, unsigned long size,
EXPORT_SYMBOL(gen_pool_first_fit);

/**
+ * gen_pool_first_fit_order_align - find the first available region
+ * of memory matching the size requirement. The region will be aligned
+ * to the order of the size specified.
+ * @map: The address to base the search on
+ * @size: The bitmap size in bits
+ * @start: The bitnumber to start searching at
+ * @nr: The number of zeroed bits we're looking for
+ * @data: additional data - unused
+ */
+unsigned long gen_pool_first_fit_order_align(unsigned long *map,
+ unsigned long size, unsigned long start,
+ unsigned int nr, void *data)
+{
+ unsigned long order = (unsigned long) data;
+ unsigned long align_mask = (1 << get_order(nr << order)) - 1;
+
+ return bitmap_find_next_zero_area(map, size, start, nr, align_mask);
+}
+EXPORT_SYMBOL(gen_pool_first_fit_order_align);
+
+/**
* gen_pool_best_fit - find the best fitting region of memory
* macthing the size requirement (no alignment constraint)
* @map: The address to base the search on
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2014-06-17 01:39:59

by Laura Abbott

Subject: [PATCHv3 5/5] arm64: Add atomic pool for non-coherent and CMA allocations.

Neither the CMA path nor the noncoherent path supports allocation from
atomic context. Add a dedicated atomic pool to cover this case.
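
For context, a rough sketch of the call path this enables (the driver
function is hypothetical; the key point is that GFP_ATOMIC lacks
__GFP_WAIT, so the noncoherent path below is served from the
preallocated pool instead of CMA or vmap):

/* Hypothetical driver context: allocating DMA memory with IRQs off. */
static void *my_irq_safe_alloc(struct device *dev, dma_addr_t *handle)
{
	return dma_alloc_coherent(dev, SZ_4K, handle, GFP_ATOMIC);
}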

Signed-off-by: Laura Abbott <[email protected]>
---
arch/arm64/Kconfig | 1 +
arch/arm64/mm/dma-mapping.c | 155 +++++++++++++++++++++++++++++++++++++++-----
2 files changed, 139 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 7295419..9de71a26 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -14,6 +14,7 @@ config ARM64
select COMMON_CLK
select CPU_PM if (SUSPEND || CPU_IDLE)
select DCACHE_WORD_ACCESS
+ select GENERIC_ALLOCATOR
select GENERIC_CLOCKEVENTS
select GENERIC_CLOCKEVENTS_BROADCAST if SMP
select GENERIC_CPU_AUTOPROBE
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
index 4164c5a..8e8049b 100644
--- a/arch/arm64/mm/dma-mapping.c
+++ b/arch/arm64/mm/dma-mapping.c
@@ -27,6 +27,7 @@
#include <linux/vmalloc.h>
#include <linux/swiotlb.h>
#include <linux/amba/bus.h>
+#include <linux/genalloc.h>

#include <asm/cacheflush.h>

@@ -41,6 +42,55 @@ static pgprot_t __get_dma_pgprot(struct dma_attrs *attrs, pgprot_t prot,
return prot;
}

+static struct gen_pool *atomic_pool;
+
+#define DEFAULT_DMA_COHERENT_POOL_SIZE SZ_256K
+static size_t atomic_pool_size = DEFAULT_DMA_COHERENT_POOL_SIZE;
+
+static int __init early_coherent_pool(char *p)
+{
+ atomic_pool_size = memparse(p, &p);
+ return 0;
+}
+early_param("coherent_pool", early_coherent_pool);
+
+static void *__alloc_from_pool(size_t size, struct page **ret_page)
+{
+ unsigned long val;
+ void *ptr = NULL;
+
+ if (!atomic_pool) {
+ WARN(1, "coherent pool not initialised!\n");
+ return NULL;
+ }
+
+ val = gen_pool_alloc(atomic_pool, size);
+ if (val) {
+ phys_addr_t phys = gen_pool_virt_to_phys(atomic_pool, val);
+
+ *ret_page = phys_to_page(phys);
+ ptr = (void *)val;
+ }
+
+ return ptr;
+}
+
+static bool __in_atomic_pool(void *start, size_t size)
+{
+ return addr_in_gen_pool(atomic_pool, (unsigned long)start, size);
+}
+
+static int __free_from_pool(void *start, size_t size)
+{
+ if (!__in_atomic_pool(start, size))
+ return 0;
+
+ gen_pool_free(atomic_pool, (unsigned long)start, size);
+
+ return 1;
+}
+
+
static void *__dma_alloc_coherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t flags,
struct dma_attrs *attrs)
@@ -53,7 +103,8 @@ static void *__dma_alloc_coherent(struct device *dev, size_t size,
if (IS_ENABLED(CONFIG_ZONE_DMA) &&
dev->coherent_dma_mask <= DMA_BIT_MASK(32))
flags |= GFP_DMA;
- if (IS_ENABLED(CONFIG_DMA_CMA)) {
+
+ if ((flags & __GFP_WAIT) && IS_ENABLED(CONFIG_DMA_CMA)) {
struct page *page;

size = PAGE_ALIGN(size);
@@ -73,50 +124,56 @@ static void __dma_free_coherent(struct device *dev, size_t size,
void *vaddr, dma_addr_t dma_handle,
struct dma_attrs *attrs)
{
+ bool freed;
+ phys_addr_t paddr = dma_to_phys(dev, dma_handle);
+
if (dev == NULL) {
WARN_ONCE(1, "Use an actual device structure for DMA allocation\n");
return;
}

- if (IS_ENABLED(CONFIG_DMA_CMA)) {
- phys_addr_t paddr = dma_to_phys(dev, dma_handle);

- dma_release_from_contiguous(dev,
+ freed = dma_release_from_contiguous(dev,
phys_to_page(paddr),
size >> PAGE_SHIFT);
- } else {
+ if (!freed)
swiotlb_free_coherent(dev, size, vaddr, dma_handle);
- }
}

static void *__dma_alloc_noncoherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t flags,
struct dma_attrs *attrs)
{
- struct page *page, **map;
+ struct page *page;
void *ptr, *coherent_ptr;
- int order, i;

size = PAGE_ALIGN(size);
- order = get_order(size);
+
+ if (!(flags & __GFP_WAIT)) {
+ struct page *page = NULL;
+ void *addr = __alloc_from_pool(size, &page);
+
+ if (addr)
+ *dma_handle = phys_to_dma(dev, page_to_phys(page));
+
+ return addr;
+
+ }

ptr = __dma_alloc_coherent(dev, size, dma_handle, flags, attrs);
if (!ptr)
goto no_mem;
- map = kmalloc(sizeof(struct page *) << order, flags & ~GFP_DMA);
- if (!map)
- goto no_map;

/* remove any dirty cache lines on the kernel alias */
__dma_flush_range(ptr, ptr + size);

+
/* create a coherent mapping */
page = virt_to_page(ptr);
- for (i = 0; i < (size >> PAGE_SHIFT); i++)
- map[i] = page + i;
- coherent_ptr = vmap(map, size >> PAGE_SHIFT, VM_MAP,
- __get_dma_pgprot(attrs, __pgprot(PROT_NORMAL_NC), false));
- kfree(map);
+ coherent_ptr = dma_common_contiguous_remap(page, size, VM_USERMAP,
+ __get_dma_pgprot(attrs,
+ __pgprot(PROT_NORMAL_NC), false),
+ NULL);
if (!coherent_ptr)
goto no_map;

@@ -135,6 +192,8 @@ static void __dma_free_noncoherent(struct device *dev, size_t size,
{
void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle));

+ if (__free_from_pool(vaddr, size))
+ return;
vunmap(vaddr);
__dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs);
}
@@ -332,6 +391,68 @@ static struct notifier_block amba_bus_nb = {

extern int swiotlb_late_init_with_default_size(size_t default_size);

+static int __init atomic_pool_init(void)
+{
+ pgprot_t prot = __pgprot(PROT_NORMAL_NC);
+ unsigned long nr_pages = atomic_pool_size >> PAGE_SHIFT;
+ struct page *page;
+ void *addr;
+
+
+ if (dev_get_cma_area(NULL))
+ page = dma_alloc_from_contiguous(NULL, nr_pages,
+ get_order(atomic_pool_size));
+ else
+ page = alloc_pages(GFP_KERNEL, get_order(atomic_pool_size));
+
+
+ if (page) {
+ int ret;
+
+ atomic_pool = gen_pool_create(PAGE_SHIFT, -1);
+ if (!atomic_pool)
+ goto free_page;
+
+ addr = dma_common_contiguous_remap(page, atomic_pool_size,
+ VM_USERMAP, prot, atomic_pool_init);
+
+ if (!addr)
+ goto destroy_genpool;
+
+ memset(addr, 0, atomic_pool_size);
+ __dma_flush_range(addr, addr + atomic_pool_size);
+
+ ret = gen_pool_add_virt(atomic_pool, (unsigned long)addr,
+ page_to_phys(page),
+ atomic_pool_size, -1);
+ if (ret)
+ goto remove_mapping;
+
+ gen_pool_set_algo(atomic_pool,
+ gen_pool_first_fit_order_align,
+ (void *)PAGE_SHIFT);
+
+ pr_info("DMA: preallocated %zd KiB pool for atomic allocations\n",
+ atomic_pool_size / 1024);
+ return 0;
+ }
+ goto out;
+
+remove_mapping:
+ dma_common_free_remap(addr, atomic_pool_size, VM_USERMAP);
+destroy_genpool:
+ gen_pool_destroy(atomic_pool);
+ atomic_pool = NULL;
+free_page:
+ if (!dma_release_from_contiguous(NULL, page, nr_pages))
+ __free_pages(page, get_order(atomic_pool_size));
+out:
+ pr_err("DMA: failed to allocate %zx KiB pool for atomic coherent allocation\n",
+ atomic_pool_size / 1024);
+ return -ENOMEM;
+}
+postcore_initcall(atomic_pool_init);
+
static int __init swiotlb_late_init(void)
{
size_t swiotlb_size = min(SZ_64M, MAX_ORDER_NR_PAGES << PAGE_SHIFT);
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2014-06-17 01:40:23

by Laura Abbott

Subject: [PATCHv3 4/5] arm: use genalloc for the atomic pool

ARM currently uses a bitmap for tracking atomic allocations.
genalloc already handles this type of memory pool allocation,
so switch to using that instead.
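
The bitmap and page-array bookkeeping boil down to the following
genalloc sequence (a simplified sketch of what this patch does, error
handling omitted):

	pool = gen_pool_create(PAGE_SHIFT, -1);	/* page-sized minimum */
	gen_pool_add_virt(pool, (unsigned long)vaddr, page_to_phys(page),
			  pool_size, -1);
	gen_pool_set_algo(pool, gen_pool_first_fit_order_align,
			  (void *)PAGE_SHIFT);

	/* these replace the open-coded bitmap search and clear */
	addr = gen_pool_alloc(pool, size);
	gen_pool_free(pool, addr, size);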

Signed-off-by: Laura Abbott <[email protected]>
---
arch/arm/Kconfig | 1 +
arch/arm/mm/dma-mapping.c | 144 ++++++++++++++--------------------------------
2 files changed, 45 insertions(+), 100 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 87b63fd..71899da 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -13,6 +13,7 @@ config ARM
select CLONE_BACKWARDS
select CPU_PM if (SUSPEND || CPU_IDLE)
select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
+ select GENERIC_ALLOCATOR
select GENERIC_ATOMIC64 if (CPU_V7M || CPU_V6 || !CPU_32v6K || !AEABI)
select GENERIC_CLOCKEVENTS_BROADCAST if SMP
select GENERIC_IDLE_POLL_SETUP
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index f5190ac..30edbd4 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -26,6 +26,7 @@
#include <linux/io.h>
#include <linux/vmalloc.h>
#include <linux/sizes.h>
+#include <linux/genalloc.h>

#include <asm/memory.h>
#include <asm/highmem.h>
@@ -313,40 +314,31 @@ static void __dma_free_remap(void *cpu_addr, size_t size)
}

#define DEFAULT_DMA_COHERENT_POOL_SIZE SZ_256K
+static struct gen_pool *atomic_pool;

-struct dma_pool {
- size_t size;
- spinlock_t lock;
- unsigned long *bitmap;
- unsigned long nr_pages;
- void *vaddr;
- struct page **pages;
-};
-
-static struct dma_pool atomic_pool = {
- .size = DEFAULT_DMA_COHERENT_POOL_SIZE,
-};
+static size_t atomic_pool_size = DEFAULT_DMA_COHERENT_POOL_SIZE;

static int __init early_coherent_pool(char *p)
{
- atomic_pool.size = memparse(p, &p);
+ atomic_pool_size = memparse(p, &p);
return 0;
}
early_param("coherent_pool", early_coherent_pool);

+
void __init init_dma_coherent_pool_size(unsigned long size)
{
/*
* Catch any attempt to set the pool size too late.
*/
- BUG_ON(atomic_pool.vaddr);
+ BUG_ON(atomic_pool);

/*
* Set architecture specific coherent pool size only if
* it has not been changed by kernel command line parameter.
*/
- if (atomic_pool.size == DEFAULT_DMA_COHERENT_POOL_SIZE)
- atomic_pool.size = size;
+ if (atomic_pool_size == DEFAULT_DMA_COHERENT_POOL_SIZE)
+ atomic_pool_size = size;
}

/*
@@ -354,52 +346,44 @@ void __init init_dma_coherent_pool_size(unsigned long size)
*/
static int __init atomic_pool_init(void)
{
- struct dma_pool *pool = &atomic_pool;
pgprot_t prot = pgprot_dmacoherent(PAGE_KERNEL);
gfp_t gfp = GFP_KERNEL | GFP_DMA;
- unsigned long nr_pages = pool->size >> PAGE_SHIFT;
- unsigned long *bitmap;
struct page *page;
- struct page **pages;
void *ptr;
- int bitmap_size = BITS_TO_LONGS(nr_pages) * sizeof(long);

- bitmap = kzalloc(bitmap_size, GFP_KERNEL);
- if (!bitmap)
- goto no_bitmap;
-
- pages = kzalloc(nr_pages * sizeof(struct page *), GFP_KERNEL);
- if (!pages)
- goto no_pages;
+ atomic_pool = gen_pool_create(PAGE_SHIFT, -1);
+ if (!atomic_pool)
+ goto out;

if (dev_get_cma_area(NULL))
- ptr = __alloc_from_contiguous(NULL, pool->size, prot, &page,
- atomic_pool_init);
+ ptr = __alloc_from_contiguous(NULL, atomic_pool_size, prot,
+ &page, atomic_pool_init);
else
- ptr = __alloc_remap_buffer(NULL, pool->size, gfp, prot, &page,
- atomic_pool_init);
+ ptr = __alloc_remap_buffer(NULL, atomic_pool_size, gfp, prot,
+ &page, atomic_pool_init);
if (ptr) {
- int i;
-
- for (i = 0; i < nr_pages; i++)
- pages[i] = page + i;
-
- spin_lock_init(&pool->lock);
- pool->vaddr = ptr;
- pool->pages = pages;
- pool->bitmap = bitmap;
- pool->nr_pages = nr_pages;
- pr_info("DMA: preallocated %u KiB pool for atomic coherent allocations\n",
- (unsigned)pool->size / 1024);
+ int ret;
+
+ ret = gen_pool_add_virt(atomic_pool, (unsigned long)ptr,
+ page_to_phys(page),
+ atomic_pool_size, -1);
+ if (ret)
+ goto destroy_genpool;
+
+ gen_pool_set_algo(atomic_pool,
+ gen_pool_first_fit_order_align,
+ (void *)PAGE_SHIFT);
+ pr_info("DMA: preallocated %zd KiB pool for atomic coherent allocations\n",
+ atomic_pool_size / 1024);
return 0;
}

- kfree(pages);
-no_pages:
- kfree(bitmap);
-no_bitmap:
- pr_err("DMA: failed to allocate %u KiB pool for atomic coherent allocation\n",
- (unsigned)pool->size / 1024);
+destroy_genpool:
+ gen_pool_destroy(atomic_pool);
+ atomic_pool = NULL;
+out:
+ pr_err("DMA: failed to allocate %zx KiB pool for atomic coherent allocation\n",
+ atomic_pool_size / 1024);
return -ENOMEM;
}
/*
@@ -494,76 +478,36 @@ static void *__alloc_remap_buffer(struct device *dev, size_t size, gfp_t gfp,

static void *__alloc_from_pool(size_t size, struct page **ret_page)
{
- struct dma_pool *pool = &atomic_pool;
- unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
- unsigned int pageno;
- unsigned long flags;
+ unsigned long val;
void *ptr = NULL;
- unsigned long align_mask;

- if (!pool->vaddr) {
+ if (!atomic_pool) {
WARN(1, "coherent pool not initialised!\n");
return NULL;
}

- /*
- * Align the region allocation - allocations from pool are rather
- * small, so align them to their order in pages, minimum is a page
- * size. This helps reduce fragmentation of the DMA space.
- */
- align_mask = (1 << get_order(size)) - 1;
-
- spin_lock_irqsave(&pool->lock, flags);
- pageno = bitmap_find_next_zero_area(pool->bitmap, pool->nr_pages,
- 0, count, align_mask);
- if (pageno < pool->nr_pages) {
- bitmap_set(pool->bitmap, pageno, count);
- ptr = pool->vaddr + PAGE_SIZE * pageno;
- *ret_page = pool->pages[pageno];
- } else {
- pr_err_once("ERROR: %u KiB atomic DMA coherent pool is too small!\n"
- "Please increase it with coherent_pool= kernel parameter!\n",
- (unsigned)pool->size / 1024);
+ val = gen_pool_alloc(atomic_pool, size);
+ if (val) {
+ phys_addr_t phys = gen_pool_virt_to_phys(atomic_pool, val);
+
+ *ret_page = phys_to_page(phys);
+ ptr = (void *)val;
}
- spin_unlock_irqrestore(&pool->lock, flags);

return ptr;
}

static bool __in_atomic_pool(void *start, size_t size)
{
- struct dma_pool *pool = &atomic_pool;
- void *end = start + size;
- void *pool_start = pool->vaddr;
- void *pool_end = pool->vaddr + pool->size;
-
- if (start < pool_start || start >= pool_end)
- return false;
-
- if (end <= pool_end)
- return true;
-
- WARN(1, "Wrong coherent size(%p-%p) from atomic pool(%p-%p)\n",
- start, end - 1, pool_start, pool_end - 1);
-
- return false;
+ return addr_in_gen_pool(atomic_pool, (unsigned long)start, size);
}

static int __free_from_pool(void *start, size_t size)
{
- struct dma_pool *pool = &atomic_pool;
- unsigned long pageno, count;
- unsigned long flags;
-
if (!__in_atomic_pool(start, size))
return 0;

- pageno = (start - pool->vaddr) >> PAGE_SHIFT;
- count = size >> PAGE_SHIFT;
-
- spin_lock_irqsave(&pool->lock, flags);
- bitmap_clear(pool->bitmap, pageno, count);
- spin_unlock_irqrestore(&pool->lock, flags);
+ gen_pool_free(atomic_pool, (unsigned long)start, size);

return 1;
}
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2014-06-17 01:40:45

by Laura Abbott

Subject: [PATCHv3 3/5] common: dma-mapping: Introduce common remapping functions

For architectures without coherent DMA, memory for DMA may
need to be remapped with coherent attributes. Factor out
the remapping code from arm and put it in a
common location to reduce code duplication.
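
A minimal sketch of how an architecture uses the new helpers (the prot
and vm_flags values are placeholders; arm passes
VM_ARM_DMA_CONSISTENT | VM_USERMAP and its DMA pgprot):

	void *va;

	/* map a physically contiguous allocation with the given attributes */
	va = dma_common_contiguous_remap(page, size, VM_USERMAP, prot,
					 __builtin_return_address(0));
	if (!va)
		return NULL;

	/* ... use the mapping ... */

	dma_common_free_remap(va, size, VM_USERMAP);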

Signed-off-by: Laura Abbott <[email protected]>
---
arch/arm/mm/dma-mapping.c | 57 +++++----------------------
drivers/base/dma-mapping.c | 67 ++++++++++++++++++++++++++++++++
include/asm-generic/dma-mapping-common.h | 9 +++++
3 files changed, 85 insertions(+), 48 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 4c88935..f5190ac 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -297,37 +297,19 @@ static void *
__dma_alloc_remap(struct page *page, size_t size, gfp_t gfp, pgprot_t prot,
const void *caller)
{
- struct vm_struct *area;
- unsigned long addr;
-
/*
* DMA allocation can be mapped to user space, so lets
* set VM_USERMAP flags too.
*/
- area = get_vm_area_caller(size, VM_ARM_DMA_CONSISTENT | VM_USERMAP,
- caller);
- if (!area)
- return NULL;
- addr = (unsigned long)area->addr;
- area->phys_addr = __pfn_to_phys(page_to_pfn(page));
-
- if (ioremap_page_range(addr, addr + size, area->phys_addr, prot)) {
- vunmap((void *)addr);
- return NULL;
- }
- return (void *)addr;
+ return dma_common_contiguous_remap(page, size,
+ VM_ARM_DMA_CONSISTENT | VM_USERMAP,
+ prot, caller);
}

static void __dma_free_remap(void *cpu_addr, size_t size)
{
- unsigned int flags = VM_ARM_DMA_CONSISTENT | VM_USERMAP;
- struct vm_struct *area = find_vm_area(cpu_addr);
- if (!area || (area->flags & flags) != flags) {
- WARN(1, "trying to free invalid coherent area: %p\n", cpu_addr);
- return;
- }
- unmap_kernel_range((unsigned long)cpu_addr, size);
- vunmap(cpu_addr);
+ dma_common_free_remap(cpu_addr, size,
+ VM_ARM_DMA_CONSISTENT | VM_USERMAP);
}

#define DEFAULT_DMA_COHERENT_POOL_SIZE SZ_256K
@@ -1261,29 +1243,8 @@ static void *
__iommu_alloc_remap(struct page **pages, size_t size, gfp_t gfp, pgprot_t prot,
const void *caller)
{
- unsigned int i, nr_pages = PAGE_ALIGN(size) >> PAGE_SHIFT;
- struct vm_struct *area;
- unsigned long p;
-
- area = get_vm_area_caller(size, VM_ARM_DMA_CONSISTENT | VM_USERMAP,
- caller);
- if (!area)
- return NULL;
-
- area->pages = pages;
- area->nr_pages = nr_pages;
- p = (unsigned long)area->addr;
-
- for (i = 0; i < nr_pages; i++) {
- phys_addr_t phys = __pfn_to_phys(page_to_pfn(pages[i]));
- if (ioremap_page_range(p, p + PAGE_SIZE, phys, prot))
- goto err;
- p += PAGE_SIZE;
- }
- return area->addr;
-err:
- unmap_kernel_range((unsigned long)area->addr, size);
- vunmap(area->addr);
+ return dma_common_pages_remap(pages, size,
+ VM_ARM_DMA_CONSISTENT | VM_USERMAP, prot, caller);
return NULL;
}

@@ -1491,8 +1452,8 @@ void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
}

if (!dma_get_attr(DMA_ATTR_NO_KERNEL_MAPPING, attrs)) {
- unmap_kernel_range((unsigned long)cpu_addr, size);
- vunmap(cpu_addr);
+ dma_common_free_remap(cpu_addr, size,
+ VM_ARM_DMA_CONSISTENT | VM_USERMAP);
}

__iommu_remove_mapping(dev, handle, size);
diff --git a/drivers/base/dma-mapping.c b/drivers/base/dma-mapping.c
index 6cd08e1..34265b5 100644
--- a/drivers/base/dma-mapping.c
+++ b/drivers/base/dma-mapping.c
@@ -10,6 +10,8 @@
#include <linux/dma-mapping.h>
#include <linux/export.h>
#include <linux/gfp.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
#include <asm-generic/dma-coherent.h>

/*
@@ -267,3 +269,68 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
return ret;
}
EXPORT_SYMBOL(dma_common_mmap);
+
+/*
+ * remaps an allocated contiguous region into another vm_area.
+ * Cannot be used in non-sleeping contexts
+ */
+
+void *dma_common_contiguous_remap(struct page *page, size_t size,
+ unsigned long vm_flags,
+ pgprot_t prot, const void *caller)
+{
+ int i;
+ struct page **pages;
+ void *ptr;
+
+ pages = kmalloc(sizeof(struct page *) << get_order(size), GFP_KERNEL);
+ if (!pages)
+ return NULL;
+
+ for (i = 0; i < (size >> PAGE_SHIFT); i++)
+ pages[i] = page + i;
+
+ ptr = dma_common_pages_remap(pages, size, vm_flags, prot, caller);
+
+ kfree(pages);
+
+ return ptr;
+}
+
+/*
+ * remaps an array of PAGE_SIZE pages into another vm_area
+ * Cannot be used in non-sleeping contexts
+ */
+void *dma_common_pages_remap(struct page **pages, size_t size,
+ unsigned long vm_flags, pgprot_t prot,
+ const void *caller)
+{
+ struct vm_struct *area;
+
+ area = get_vm_area_caller(size, vm_flags, caller);
+ if (!area)
+ return NULL;
+
+ if (map_vm_area(area, prot, &pages)) {
+ vunmap(area->addr);
+ return NULL;
+ }
+
+ return area->addr;
+}
+
+/*
+ * unmaps a range previously mapped by dma_common_*_remap
+ */
+void dma_common_free_remap(void *cpu_addr, size_t size, unsigned long vm_flags)
+{
+ struct vm_struct *area = find_vm_area(cpu_addr);
+
+ if (!area || (area->flags & vm_flags) != vm_flags) {
+ WARN(1, "trying to free invalid coherent area: %p\n", cpu_addr);
+ return;
+ }
+
+ unmap_kernel_range((unsigned long)cpu_addr, size);
+ vunmap(cpu_addr);
+}
diff --git a/include/asm-generic/dma-mapping-common.h b/include/asm-generic/dma-mapping-common.h
index de8bf89..a9fd248 100644
--- a/include/asm-generic/dma-mapping-common.h
+++ b/include/asm-generic/dma-mapping-common.h
@@ -179,6 +179,15 @@ dma_sync_sg_for_device(struct device *dev, struct scatterlist *sg,
extern int dma_common_mmap(struct device *dev, struct vm_area_struct *vma,
void *cpu_addr, dma_addr_t dma_addr, size_t size);

+void *dma_common_contiguous_remap(struct page *page, size_t size,
+ unsigned long vm_flags,
+ pgprot_t prot, const void *caller);
+
+void *dma_common_pages_remap(struct page **pages, size_t size,
+ unsigned long vm_flags, pgprot_t prot,
+ const void *caller);
+void dma_common_free_remap(void *cpu_addr, size_t size, unsigned long vm_flags);
+
/**
* dma_mmap_attrs - map a coherent DMA allocation into user space
* @dev: valid struct device pointer, or NULL for ISA and EISA-like devices
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation

2014-06-20 09:34:15

by Will Deacon

Subject: Re: [PATCHv3 1/5] lib/genalloc.c: Add power aligned algorithm

Hi Laura,

On Tue, Jun 17, 2014 at 02:39:21AM +0100, Laura Abbott wrote:
> One of the more common algorithms used for allocation
> is to align the start address of the allocation to
> the order of size requested. Add this as an algorithm
> option for genalloc.

Good idea, I didn't know this even existed!

> Signed-off-by: Laura Abbott <[email protected]>
> ---
> include/linux/genalloc.h | 4 ++++
> lib/genalloc.c | 21 +++++++++++++++++++++
> 2 files changed, 25 insertions(+)
>
> diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h
> index 1c2fdaa..3cd0934 100644
> --- a/include/linux/genalloc.h
> +++ b/include/linux/genalloc.h
> @@ -110,6 +110,10 @@ extern void gen_pool_set_algo(struct gen_pool *pool, genpool_algo_t algo,
> extern unsigned long gen_pool_first_fit(unsigned long *map, unsigned long size,
> unsigned long start, unsigned int nr, void *data);
>
> +extern unsigned long gen_pool_first_fit_order_align(unsigned long *map,
> + unsigned long size, unsigned long start, unsigned int nr,
> + void *data);
> +
> extern unsigned long gen_pool_best_fit(unsigned long *map, unsigned long size,
> unsigned long start, unsigned int nr, void *data);
>
> diff --git a/lib/genalloc.c b/lib/genalloc.c
> index bdb9a45..9758529 100644
> --- a/lib/genalloc.c
> +++ b/lib/genalloc.c
> @@ -481,6 +481,27 @@ unsigned long gen_pool_first_fit(unsigned long *map, unsigned long size,
> EXPORT_SYMBOL(gen_pool_first_fit);
>
> /**
> + * gen_pool_first_fit_order_align - find the first available region
> + * of memory matching the size requirement. The region will be aligned
> + * to the order of the size specified.
> + * @map: The address to base the search on
> + * @size: The bitmap size in bits
> + * @start: The bitnumber to start searching at
> + * @nr: The number of zeroed bits we're looking for
> + * @data: additional data - unused

It doesn't look unused to me.

> + */
> +unsigned long gen_pool_first_fit_order_align(unsigned long *map,
> + unsigned long size, unsigned long start,
> + unsigned int nr, void *data)
> +{
> + unsigned long order = (unsigned long) data;
> + unsigned long align_mask = (1 << get_order(nr << order)) - 1;

Why isn't the order just order?

Will

2014-06-20 09:39:31

by Will Deacon

Subject: Re: [PATCHv3 2/5] lib/genalloc.c: Add genpool range check function

On Tue, Jun 17, 2014 at 02:39:22AM +0100, Laura Abbott wrote:
> After allocating an address from a particular genpool,
> there is no good way to verify whether that address actually
> belongs to the genpool. Introduce addr_in_gen_pool, which
> returns whether an address plus size falls completely
> within the genpool range.
>
> Signed-off-by: Laura Abbott <[email protected]>
> ---
> include/linux/genalloc.h | 3 +++
> lib/genalloc.c | 29 +++++++++++++++++++++++++++++
> 2 files changed, 32 insertions(+)
>
> diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h
> index 3cd0934..1ccaab4 100644
> --- a/include/linux/genalloc.h
> +++ b/include/linux/genalloc.h
> @@ -121,6 +121,9 @@ extern struct gen_pool *devm_gen_pool_create(struct device *dev,
> int min_alloc_order, int nid);
> extern struct gen_pool *dev_get_gen_pool(struct device *dev);
>
> +bool addr_in_gen_pool(struct gen_pool *pool, unsigned long start,
> + size_t size);
> +
> #ifdef CONFIG_OF
> extern struct gen_pool *of_get_named_gen_pool(struct device_node *np,
> const char *propname, int index);
> diff --git a/lib/genalloc.c b/lib/genalloc.c
> index 9758529..66edf93 100644
> --- a/lib/genalloc.c
> +++ b/lib/genalloc.c
> @@ -403,6 +403,35 @@ void gen_pool_for_each_chunk(struct gen_pool *pool,
> EXPORT_SYMBOL(gen_pool_for_each_chunk);
>
> /**
> + * addr_in_gen_pool - checks if an address falls within the range of a pool
> + * @pool: the generic memory pool
> + * @start: start address
> + * @size: size of the region
> + *
> + * Check if the range of addresses falls within the specified pool. Takes
> + * the rcu_read_lock for the duration of the check.
> + */
> +bool addr_in_gen_pool(struct gen_pool *pool, unsigned long start,
> + size_t size)
> +{
> + bool found = false;
> + unsigned long end = start + size;
> + struct gen_pool_chunk *chunk;
> +
> + rcu_read_lock();
> + list_for_each_entry_rcu(chunk, &(pool)->chunks, next_chunk) {
> + if (start >= chunk->start_addr && start <= chunk->end_addr) {

Why do you need to check start against the end of the chunk? Is that in case
of overflow?

Will

2014-06-29 19:33:53

by Laura Abbott

Subject: Re: [PATCHv3 1/5] lib/genalloc.c: Add power aligned algorithm

On 6/20/2014 2:33 AM, Will Deacon wrote:
> Hi Laura,
>
> On Tue, Jun 17, 2014 at 02:39:21AM +0100, Laura Abbott wrote:
>> One of the more common algorithms used for allocation
>> is to align the start address of the allocation to
>> the order of size requested. Add this as an algorithm
>> option for genalloc.
>
> Good idea, I didn't know this even existed!
>
>> Signed-off-by: Laura Abbott <[email protected]>
>> ---
>> include/linux/genalloc.h | 4 ++++
>> lib/genalloc.c | 21 +++++++++++++++++++++
>> 2 files changed, 25 insertions(+)
>>
>> diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h
>> index 1c2fdaa..3cd0934 100644
>> --- a/include/linux/genalloc.h
>> +++ b/include/linux/genalloc.h
>> @@ -110,6 +110,10 @@ extern void gen_pool_set_algo(struct gen_pool *pool, genpool_algo_t algo,
>> extern unsigned long gen_pool_first_fit(unsigned long *map, unsigned long size,
>> unsigned long start, unsigned int nr, void *data);
>>
>> +extern unsigned long gen_pool_first_fit_order_align(unsigned long *map,
>> + unsigned long size, unsigned long start, unsigned int nr,
>> + void *data);
>> +
>> extern unsigned long gen_pool_best_fit(unsigned long *map, unsigned long size,
>> unsigned long start, unsigned int nr, void *data);
>>
>> diff --git a/lib/genalloc.c b/lib/genalloc.c
>> index bdb9a45..9758529 100644
>> --- a/lib/genalloc.c
>> +++ b/lib/genalloc.c
>> @@ -481,6 +481,27 @@ unsigned long gen_pool_first_fit(unsigned long *map, unsigned long size,
>> EXPORT_SYMBOL(gen_pool_first_fit);
>>
>> /**
>> + * gen_pool_first_fit_order_align - find the first available region
>> + * of memory matching the size requirement. The region will be aligned
>> + * to the order of the size specified.
>> + * @map: The address to base the search on
>> + * @size: The bitmap size in bits
>> + * @start: The bitnumber to start searching at
>> + * @nr: The number of zeroed bits we're looking for
>> + * @data: additional data - unused
>
> It doesn't look unused to me.
>
>> + */
>> +unsigned long gen_pool_first_fit_order_align(unsigned long *map,
>> + unsigned long size, unsigned long start,
>> + unsigned int nr, void *data)
>> +{
>> + unsigned long order = (unsigned long) data;
>> + unsigned long align_mask = (1 << get_order(nr << order)) - 1;
>
> Why isn't the order just order?
>

I did some bad math somewhere. All we really need is

unsigned long align_mask = roundup_pow_of_two(nr) - 1;

Which means the data would actually be unused. I'll fix it in the next
version.
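
For reference, the corrected helper would then reduce to something like
this (taking the roundup_pow_of_two() line above), with @data genuinely
unused:

unsigned long gen_pool_first_fit_order_align(unsigned long *map,
		unsigned long size, unsigned long start,
		unsigned int nr, void *data)
{
	unsigned long align_mask = roundup_pow_of_two(nr) - 1;

	return bitmap_find_next_zero_area(map, size, start, nr, align_mask);
}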

Thanks,
Laura

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation

2014-06-29 19:38:12

by Laura Abbott

Subject: Re: [PATCHv3 2/5] lib/genalloc.c: Add genpool range check function

On 6/20/2014 2:38 AM, Will Deacon wrote:
> On Tue, Jun 17, 2014 at 02:39:22AM +0100, Laura Abbott wrote:
>> After allocating an address from a particular genpool,
>> there is no good way to verify whether that address actually
>> belongs to the genpool. Introduce addr_in_gen_pool, which
>> returns whether an address plus size falls completely
>> within the genpool range.
>>
>> Signed-off-by: Laura Abbott <[email protected]>
>> ---
>> include/linux/genalloc.h | 3 +++
>> lib/genalloc.c | 29 +++++++++++++++++++++++++++++
>> 2 files changed, 32 insertions(+)
>>
>> diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h
>> index 3cd0934..1ccaab4 100644
>> --- a/include/linux/genalloc.h
>> +++ b/include/linux/genalloc.h
>> @@ -121,6 +121,9 @@ extern struct gen_pool *devm_gen_pool_create(struct device *dev,
>> int min_alloc_order, int nid);
>> extern struct gen_pool *dev_get_gen_pool(struct device *dev);
>>
>> +bool addr_in_gen_pool(struct gen_pool *pool, unsigned long start,
>> + size_t size);
>> +
>> #ifdef CONFIG_OF
>> extern struct gen_pool *of_get_named_gen_pool(struct device_node *np,
>> const char *propname, int index);
>> diff --git a/lib/genalloc.c b/lib/genalloc.c
>> index 9758529..66edf93 100644
>> --- a/lib/genalloc.c
>> +++ b/lib/genalloc.c
>> @@ -403,6 +403,35 @@ void gen_pool_for_each_chunk(struct gen_pool *pool,
>> EXPORT_SYMBOL(gen_pool_for_each_chunk);
>>
>> /**
>> + * addr_in_gen_pool - checks if an address falls within the range of a pool
>> + * @pool: the generic memory pool
>> + * @start: start address
>> + * @size: size of the region
>> + *
>> + * Check if the range of addresses falls within the specified pool. Takes
>> + * the rcu_read_lock for the duration of the check.
>> + */
>> +bool addr_in_gen_pool(struct gen_pool *pool, unsigned long start,
>> + size_t size)
>> +{
>> + bool found = false;
>> + unsigned long end = start + size;
>> + struct gen_pool_chunk *chunk;
>> +
>> + rcu_read_lock();
>> + list_for_each_entry_rcu(chunk, &(pool)->chunks, next_chunk) {
>> + if (start >= chunk->start_addr && start <= chunk->end_addr) {
>
> Why do you need to check start against the end of the chunk? Is that in case
> of overflow?
>

Yes, this provides an extra check for overflow and also matches similar logic for
gen_pool_virt_to_phys.
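
To make the overflow case concrete (numbers are made up):

/*
 * Suppose a chunk spans [0x1000, 0x1fff] and a caller passes
 * start = 0x3000 with a size so large that start + size wraps around
 * to 0x17ff.  "end <= chunk->end_addr" alone would then be true and
 * the region would wrongly be reported as inside the pool; the extra
 * "start <= chunk->end_addr" test rejects it.
 */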

Thanks,
Laura

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation