2017-07-03 14:51:45

by Vitaly Kuzmichev

Subject: [PATCH 0/2] Implement default DMA coherent pool

From: Vitaly Kuzmichev <[email protected]>

Here is an alternate version implementing a default DMA coherent pool,
which we have been using in our custom kernel since 2015.

Unlike Vladimir Murzin's patch [1] "drivers: dma-coherent: Introduce
default DMA pool", this variant stores a pointer to the "rmem->priv"
pointer and thus does not need an additional function
(dma_init_reserved_memory) to explicitly call device_init
(=rmem_dma_device_init) in order to get a valid address in "rmem->priv".

The default DMA pool attribute for a memory region is provided via the
device tree.
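
For illustration, a reserved-memory node carrying the new property could
look like the following sketch (addresses and labels are made up for the
example):

```dts
reserved-memory {
	#address-cells = <1>;
	#size-cells = <1>;
	ranges;

	coherent_dma: linux,dma {
		compatible = "shared-dma-pool";
		no-map;
		reg = <0x78000000 0x800000>;
		linux,dma-default;
	};
};
```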

Patch 2/2 adds a 'dmainfo' file to procfs to show the available DMA regions.

The patchset is based on driver-core-next branch.

[1] https://patchwork.kernel.org/patch/9615465/

George G. Davis (2):
drivers: dma-coherent: Add support for default DMA coherent pool
drivers: dma-coherent: show per-device DMA region utilization via
procfs

.../bindings/reserved-memory/reserved-memory.txt | 2 +
drivers/base/dma-coherent.c | 250 +++++++++++++++++++--
2 files changed, 237 insertions(+), 15 deletions(-)

--
1.9.1


2017-07-03 14:51:28

by Vitaly Kuzmichev

Subject: [PATCH 1/2] drivers: dma-coherent: Add support for default DMA coherent pool

From: "George G. Davis" <[email protected]>

Use a concept similar to the default CMA region for DMA coherent pools.

Cc: Rob Herring <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: George G. Davis <[email protected]>
Signed-off-by: Jiada Wang <[email protected]>
Signed-off-by: Mark Craske <[email protected]>
Signed-off-by: Vitaly Kuzmichev <[email protected]>
---
.../bindings/reserved-memory/reserved-memory.txt | 2 ++
drivers/base/dma-coherent.c | 29 ++++++++++++++++------
2 files changed, 24 insertions(+), 7 deletions(-)

diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
index 3da0ebd..ed9a051 100644
--- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
@@ -67,6 +67,8 @@ reusable (optional) - empty property
Linux implementation note:
- If a "linux,cma-default" property is present, then Linux will use the
region for the default pool of the contiguous memory allocator.
+- If a "linux,dma-default" property is present, then Linux will use the
+ region for the default pool of the DMA coherent memory allocator.

Device node references to reserved memory
-----------------------------------------
diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index 640a7e6..a0b0f2b 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -18,6 +18,19 @@ struct dma_coherent_mem {
spinlock_t spinlock;
};

+static struct dma_coherent_mem **dma_coherent_default_area;
+
+static inline struct dma_coherent_mem *dev_get_dma_area(struct device *dev)
+{
+ if (dev && dev->dma_mem)
+ return dev->dma_mem;
+#ifdef CONFIG_CMA
+ if (dev && dev->cma_area)
+ return NULL;
+#endif
+ return dma_coherent_default_area ? *dma_coherent_default_area : NULL;
+}
+
static bool dma_init_coherent_memory(
phys_addr_t phys_addr, dma_addr_t device_addr, size_t size, int flags,
struct dma_coherent_mem **mem)
@@ -111,7 +124,7 @@ int dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,

void dma_release_declared_memory(struct device *dev)
{
- struct dma_coherent_mem *mem = dev->dma_mem;
+ struct dma_coherent_mem *mem = dev_get_dma_area(dev);

if (!mem)
return;
@@ -123,7 +136,7 @@ void dma_release_declared_memory(struct device *dev)
void *dma_mark_declared_memory_occupied(struct device *dev,
dma_addr_t device_addr, size_t size)
{
- struct dma_coherent_mem *mem = dev->dma_mem;
+ struct dma_coherent_mem *mem = dev_get_dma_area(dev);
unsigned long flags;
int pos, err;

@@ -167,9 +180,7 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
int pageno;
int dma_memory_map;

- if (!dev)
- return 0;
- mem = dev->dma_mem;
+ mem = dev_get_dma_area(dev);
if (!mem)
return 0;

@@ -223,7 +234,7 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
*/
int dma_release_from_coherent(struct device *dev, int order, void *vaddr)
{
- struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
+ struct dma_coherent_mem *mem = dev_get_dma_area(dev);

if (mem && vaddr >= mem->virt_base && vaddr <
(mem->virt_base + (mem->size << PAGE_SHIFT))) {
@@ -257,7 +268,7 @@ int dma_release_from_coherent(struct device *dev, int order, void *vaddr)
int dma_mmap_from_coherent(struct device *dev, struct vm_area_struct *vma,
void *vaddr, size_t size, int *ret)
{
- struct dma_coherent_mem *mem = dev ? dev->dma_mem : NULL;
+ struct dma_coherent_mem *mem = dev_get_dma_area(dev);

if (mem && vaddr >= mem->virt_base && vaddr + size <=
(mem->virt_base + (mem->size << PAGE_SHIFT))) {
@@ -329,6 +340,10 @@ static int __init rmem_dma_setup(struct reserved_mem *rmem)
}
#endif

+ if (of_get_flat_dt_prop(node, "linux,dma-default", NULL))
+ dma_coherent_default_area =
+ (struct dma_coherent_mem **)&rmem->priv;
+
rmem->ops = &rmem_dma_ops;
pr_info("Reserved memory: created DMA memory pool at %pa, size %ld MiB\n",
&rmem->base, (unsigned long)rmem->size / SZ_1M);
--
1.9.1

2017-07-03 14:51:50

by Vitaly Kuzmichev

Subject: [PATCH 2/2] drivers: dma-coherent: show per-device DMA region utilization via procfs

From: "George G. Davis" <[email protected]>

Add hooks to allow displaying per-device DMA region utilization via
/proc/dmainfo similar to how free blocks are displayed in
/proc/pagetypeinfo.

This also adds /proc/dmainfo reporting for regions defined via
dma_declare_coherent_memory().
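
The per-order free-block counts are derived by walking the region's
allocation bitmap run by run: each free run is first aligned upward to
power-of-two block boundaries with small blocks, and the remainder is
then carved into the largest blocks that still fit. A rough Python model
of that walk (the function name and the MAX_ORDER default here are
illustrative, not taken from the kernel source):

```python
def count_free_blocks(bitmap, max_order=11):
    """Count free power-of-two blocks per order in an allocation bitmap.

    bitmap[i] is True when page i is allocated, False when it is free.
    Mirrors the "align start, then middle and end" walk performed over
    mem->bitmap when printing /proc/dmainfo.
    """
    size = len(bitmap)
    blocks = [0] * max_order
    offset = 0
    while offset < size:
        # Find the next run of free pages: [start, end).
        try:
            start = bitmap.index(False, offset)
        except ValueError:
            break  # no free pages left
        end = start + 1
        while end < size and not bitmap[end]:
            end += 1
        offset = end
        pages = end - start
        # Align the start of the run with small blocks until it sits on
        # a larger power-of-two boundary.
        for order in range(max_order):
            if start >= end or pages < (1 << order):
                break
            if start & (1 << order):
                blocks[order] += 1
                start += 1 << order
                pages -= 1 << order
        # Carve the rest of the run with the largest blocks that fit.
        order = max_order - 1
        while order >= 0 and start < end:
            if pages >= (1 << order):
                blocks[order] += 1
                start += 1 << order
                pages -= 1 << order
            else:
                order -= 1
    return blocks
```

For example, a fully free 16-page region counts as one order-4 block,
while free pages hemmed in by allocations show up as smaller blocks.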

Cc: Greg Kroah-Hartman <[email protected]>
Signed-off-by: George G. Davis <[email protected]>
Signed-off-by: Mark Craske <[email protected]>
Signed-off-by: Vitaly Kuzmichev <[email protected]>
---
drivers/base/dma-coherent.c | 221 ++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 213 insertions(+), 8 deletions(-)

diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index a0b0f2b..5671422 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -8,6 +8,12 @@
#include <linux/module.h>
#include <linux/dma-mapping.h>

+#ifdef CONFIG_PROC_FS
+#include <linux/fs.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#endif
+
struct dma_coherent_mem {
void *virt_base;
dma_addr_t device_base;
@@ -16,8 +22,57 @@ struct dma_coherent_mem {
int flags;
unsigned long *bitmap;
spinlock_t spinlock;
+ int used;
+ int highwatermark;
+ int errs;
};

+#ifdef CONFIG_PROC_FS
+struct dmacoherent_region {
+ struct list_head list;
+ struct device *dev;
+};
+
+static LIST_HEAD(dmacoherent_region_list);
+static DEFINE_MUTEX(dmacoherent_region_list_lock);
+
+static int dmacoherent_region_add(struct device *dev)
+{
+ struct dmacoherent_region *rp;
+
+ rp = kzalloc(sizeof(*rp), GFP_KERNEL);
+ if (!rp)
+ return -ENOMEM;
+
+ rp->dev = dev;
+
+ mutex_lock(&dmacoherent_region_list_lock);
+ list_add(&rp->list, &dmacoherent_region_list);
+ mutex_unlock(&dmacoherent_region_list_lock);
+ dev_info(dev, "Registered DMA-coherent pool with /proc/dmainfo accounting\n");
+
+ return 0;
+}
+
+static void dmacoherent_region_del(struct device *dev)
+{
+ struct dmacoherent_region *rp;
+
+ mutex_lock(&dmacoherent_region_list_lock);
+ list_for_each_entry(rp, &dmacoherent_region_list, list) {
+ if (rp->dev == dev) {
+ list_del(&rp->list);
+ kfree(rp);
+ break;
+ }
+ }
+ mutex_unlock(&dmacoherent_region_list_lock);
+}
+#else
+static int dmacoherent_region_add(struct device *dev) { return 0; }
+static void dmacoherent_region_del(struct device *dev) { return; }
+#endif
+
static struct dma_coherent_mem **dma_coherent_default_area;

static inline struct dma_coherent_mem *dev_get_dma_area(struct device *dev)
@@ -109,14 +164,22 @@ int dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
dma_addr_t device_addr, size_t size, int flags)
{
struct dma_coherent_mem *mem;
+ int ret;

if (!dma_init_coherent_memory(phys_addr, device_addr, size, flags,
&mem))
return 0;

- if (dma_assign_coherent_memory(dev, mem) == 0)
- return flags & DMA_MEMORY_MAP ? DMA_MEMORY_MAP : DMA_MEMORY_IO;
+ if (dma_assign_coherent_memory(dev, mem) != 0)
+ goto errout;
+
+ ret = (flags & DMA_MEMORY_MAP ? DMA_MEMORY_MAP : DMA_MEMORY_IO);
+
+ if (dmacoherent_region_add(dev) == 0)
+ return ret;

+ dev->dma_mem = NULL;
+errout:
dma_release_coherent_memory(mem);
return 0;
}
@@ -128,6 +191,8 @@ void dma_release_declared_memory(struct device *dev)

if (!mem)
return;
+
+ dmacoherent_region_del(dev);
dma_release_coherent_memory(mem);
dev->dma_mem = NULL;
}
@@ -139,19 +204,25 @@ void *dma_mark_declared_memory_occupied(struct device *dev,
struct dma_coherent_mem *mem = dev_get_dma_area(dev);
unsigned long flags;
int pos, err;
-
- size += device_addr & ~PAGE_MASK;
+ int order;

if (!mem)
return ERR_PTR(-EINVAL);

- spin_lock_irqsave(&mem->spinlock, flags);
+ size += device_addr & ~PAGE_MASK;
+ order = get_order(size);
pos = (device_addr - mem->device_base) >> PAGE_SHIFT;
- err = bitmap_allocate_region(mem->bitmap, pos, get_order(size));
- spin_unlock_irqrestore(&mem->spinlock, flags);

- if (err != 0)
+ spin_lock_irqsave(&mem->spinlock, flags);
+ err = bitmap_allocate_region(mem->bitmap, pos, order);
+ if (err != 0) {
+ spin_unlock_irqrestore(&mem->spinlock, flags);
return ERR_PTR(err);
+ }
+ mem->used += 1 << order;
+ if (mem->highwatermark < mem->used)
+ mem->highwatermark = mem->used;
+ spin_unlock_irqrestore(&mem->spinlock, flags);
return mem->virt_base + (pos << PAGE_SHIFT);
}
EXPORT_SYMBOL(dma_mark_declared_memory_occupied);
@@ -194,6 +265,10 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
if (unlikely(pageno < 0))
goto err;

+ mem->used += 1 << order;
+ if (mem->highwatermark < mem->used)
+ mem->highwatermark = mem->used;
+
/*
* Memory was found in the per-device area.
*/
@@ -209,6 +284,7 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
return 1;

err:
+ mem->errs++;
spin_unlock_irqrestore(&mem->spinlock, flags);
/*
* In the case where the allocation can not be satisfied from the
@@ -243,6 +319,7 @@ int dma_release_from_coherent(struct device *dev, int order, void *vaddr)

spin_lock_irqsave(&mem->spinlock, flags);
bitmap_release_region(mem->bitmap, page, order);
+ mem->used -= 1 << order;
spin_unlock_irqrestore(&mem->spinlock, flags);
return 1;
}
@@ -311,6 +388,10 @@ static int rmem_dma_device_init(struct reserved_mem *rmem, struct device *dev)
return -ENODEV;
}
rmem->priv = mem;
+
+ if (dmacoherent_region_add(dev))
+ return -ENOMEM;
+
dma_assign_coherent_memory(dev, mem);
return 0;
}
@@ -318,6 +399,7 @@ static int rmem_dma_device_init(struct reserved_mem *rmem, struct device *dev)
static void rmem_dma_device_release(struct reserved_mem *rmem,
struct device *dev)
{
+ dmacoherent_region_del(dev);
dev->dma_mem = NULL;
}

@@ -351,3 +433,126 @@ static int __init rmem_dma_setup(struct reserved_mem *rmem)
}
RESERVEDMEM_OF_DECLARE(dma, "shared-dma-pool", rmem_dma_setup);
#endif
+
+#ifdef CONFIG_PROC_FS
+
+static int dmainfo_proc_show_dma_mem(struct seq_file *m, void *v,
+ struct device *dev)
+{
+ struct dma_coherent_mem *mem = dev_get_dma_area(dev);
+ int offset;
+ int start;
+ int end;
+ int pages;
+ int order;
+ int free = 0;
+ int blocks[MAX_ORDER];
+
+ memset(blocks, 0, sizeof(blocks));
+
+ spin_lock(&mem->spinlock);
+
+ for (offset = 0; offset < mem->size; offset = end) {
+ start = find_next_zero_bit(mem->bitmap, mem->size, offset);
+ if (start >= mem->size)
+ break;
+ end = find_next_bit(mem->bitmap, mem->size, start + 1);
+ pages = end - start;
+
+ /* Align start: */
+ for (order = 0; order < MAX_ORDER; order += 1) {
+ if (start >= end)
+ break;
+ if (pages < (1 << order))
+ break;
+ if (start & (1 << order)) {
+ blocks[order] += 1;
+ start += 1 << order;
+ pages -= 1 << order;
+ free += 1 << order;
+ }
+ }
+
+ if (start >= end)
+ continue;
+
+ /* Align middle and end: */
+ order = MAX_ORDER - 1;
+ while (order >= 0) {
+ if (start >= end)
+ break;
+ if (pages >= (1 << order)) {
+ blocks[order] += 1;
+ start += 1 << order;
+ pages -= 1 << order;
+ free += 1 << order;
+ } else {
+ order -= 1;
+ }
+ }
+ }
+
+ seq_printf(m, "%-30s", dev_name(dev));
+
+ for (order = 0; order < MAX_ORDER; order += 1)
+ seq_printf(m, " %6d", blocks[order]);
+
+ seq_printf(m, " %6d %6d %6d %6d %6d\n",
+ mem->size,
+ mem->used,
+ free,
+ mem->highwatermark,
+ mem->errs);
+
+ spin_unlock(&mem->spinlock);
+
+ return 0;
+}
+
+static int dmainfo_proc_show(struct seq_file *m, void *v)
+{
+ struct dmacoherent_region *rp;
+ int order;
+
+ seq_puts(m, "DMA-coherent region information:\n");
+ seq_printf(m, "%-30s", "Free block count at order");
+
+ for (order = 0; order < MAX_ORDER; ++order)
+ seq_printf(m, " %6d", order);
+
+ seq_printf(m, " %6s %6s %6s %6s %6s\n",
+ "Size",
+ "Used",
+ "Free",
+ "High",
+ "Errs");
+
+ mutex_lock(&dmacoherent_region_list_lock);
+ list_for_each_entry(rp, &dmacoherent_region_list, list) {
+ dmainfo_proc_show_dma_mem(m, v, rp->dev);
+ }
+ mutex_unlock(&dmacoherent_region_list_lock);
+
+ return 0;
+}
+
+static int dmainfo_proc_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, dmainfo_proc_show, NULL);
+}
+
+static const struct file_operations dmainfo_proc_fops = {
+ .open = dmainfo_proc_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static int __init proc_dmainfo_init(void)
+{
+ proc_create("dmainfo", 0, NULL, &dmainfo_proc_fops);
+ return 0;
+}
+module_init(proc_dmainfo_init);
+
+#endif
--
1.9.1

2017-07-04 21:02:51

by kernel test robot

Subject: Re: [PATCH 1/2] drivers: dma-coherent: Add support for default DMA coherent pool

Hi George,

[auto build test ERROR on staging/staging-testing]
[also build test ERROR on v4.12]
[cannot apply to next-20170704]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/vitaly_kuzmichev-mentor-com/drivers-dma-coherent-Add-support-for-default-DMA-coherent-pool/20170705-040238
config: i386-randconfig-x075-07041126 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

All errors (new ones prefixed by >>):

drivers/base/dma-coherent.c: In function 'dev_get_dma_area':
>> drivers/base/dma-coherent.c:28:16: error: 'struct device' has no member named 'cma_area'; did you mean 'dma_parms'?
if (dev && dev->cma_area)
^~

vim +28 drivers/base/dma-coherent.c

22
23 static inline struct dma_coherent_mem *dev_get_dma_area(struct device *dev)
24 {
25 if (dev && dev->dma_mem)
26 return dev->dma_mem;
27 #ifdef CONFIG_CMA
> 28 if (dev && dev->cma_area)
29 return NULL;
30 #endif
31 return dma_coherent_default_area ? *dma_coherent_default_area : NULL;

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation



2017-07-05 05:56:01

by kernel test robot

Subject: Re: [PATCH 1/2] drivers: dma-coherent: Add support for default DMA coherent pool

Hi George,

[auto build test ERROR on staging/staging-testing]
[also build test ERROR on v4.12]
[cannot apply to next-20170704]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/vitaly_kuzmichev-mentor-com/drivers-dma-coherent-Add-support-for-default-DMA-coherent-pool/20170705-040238
config: i386-randconfig-c0-07051154 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

All error/warnings (new ones prefixed by >>):

In file included from include/uapi/linux/stddef.h:1:0,
from include/linux/stddef.h:4,
from include/uapi/linux/posix_types.h:4,
from include/uapi/linux/types.h:13,
from include/linux/types.h:5,
from include/linux/io.h:21,
from drivers//base/dma-coherent.c:5:
drivers//base/dma-coherent.c: In function 'dev_get_dma_area':
>> drivers//base/dma-coherent.c:28:16: error: 'struct device' has no member named 'cma_area'
if (dev && dev->cma_area)
^
include/linux/compiler.h:160:30: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^
>> drivers//base/dma-coherent.c:28:2: note: in expansion of macro 'if'
if (dev && dev->cma_area)
^
>> drivers//base/dma-coherent.c:28:16: error: 'struct device' has no member named 'cma_area'
if (dev && dev->cma_area)
^
include/linux/compiler.h:160:42: note: in definition of macro '__trace_if'
if (__builtin_constant_p(!!(cond)) ? !!(cond) : \
^
>> drivers//base/dma-coherent.c:28:2: note: in expansion of macro 'if'
if (dev && dev->cma_area)
^
>> drivers//base/dma-coherent.c:28:16: error: 'struct device' has no member named 'cma_area'
if (dev && dev->cma_area)
^
include/linux/compiler.h:171:16: note: in definition of macro '__trace_if'
______r = !!(cond); \
^
>> drivers//base/dma-coherent.c:28:2: note: in expansion of macro 'if'
if (dev && dev->cma_area)
^

vim +28 drivers//base/dma-coherent.c

1 /*
2 * Coherent per-device memory handling.
3 * Borrowed from i386
4 */
> 5 #include <linux/io.h>
6 #include <linux/slab.h>
7 #include <linux/kernel.h>
8 #include <linux/module.h>
9 #include <linux/dma-mapping.h>
10
11 struct dma_coherent_mem {
12 void *virt_base;
13 dma_addr_t device_base;
14 unsigned long pfn_base;
15 int size;
16 int flags;
17 unsigned long *bitmap;
18 spinlock_t spinlock;
19 };
20
21 static struct dma_coherent_mem **dma_coherent_default_area;
22
23 static inline struct dma_coherent_mem *dev_get_dma_area(struct device *dev)
24 {
25 if (dev && dev->dma_mem)
26 return dev->dma_mem;
27 #ifdef CONFIG_CMA
> 28 if (dev && dev->cma_area)
29 return NULL;
30 #endif
31 return dma_coherent_default_area ? *dma_coherent_default_area : NULL;

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation



2017-07-07 13:22:46

by Vitaly Kuzmichev

Subject: [PATCH v2 0/2] Additions to default DMA coherent pool

v2:
Since linux-next now includes Vladimir Murzin's version of the default DMA
coherent pool [1], our version is no longer required and even causes a
merge conflict. The difference between the two versions is not really
significant, except for one serious problem with CONFIG_DMA_CMA; please
see patch v2 1/2 for details.
I have therefore rebased our work onto the linux-next/master branch and am
sending this patchset with the necessary additions for the default DMA
pool feature.

Patch v2 2/2 adds a 'dmainfo' file to procfs to show the available DMA regions.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=93228b44c33a572cb36cec2dbed42e9bdbc88d79

George G. Davis (2):
drivers: dma-coherent: Fix dev->cma_area vs dev->dma_mem breakage
drivers: dma-coherent: show per-device DMA region utilization via
procfs

drivers/base/dma-coherent.c | 228 ++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 219 insertions(+), 9 deletions(-)

--
1.9.1

2017-07-07 13:23:14

by Vitaly Kuzmichev

Subject: [PATCH v2 1/2] drivers: dma-coherent: Fix dev->cma_area vs dev->dma_mem breakage

From: "George G. Davis" <[email protected]>

When a "linux,dma-default" DMA coherent region is defined, the
dma_coherent_default_memory pointer is returned by function
dev_get_coherent_memory() for any struct device *dev which has not
explicitly assigned a dev->dma_mem memory region, i.e. dev->dma_mem is
the NULL pointer. Unfortunately this overlooks the fact that for the
CONFIG_DMA_CMA case, it is also possible that a device may have assigned
a CMA memory region via the dev->cma_area pointer in which case,
the "linux,dma-default" DMA coherent region should not be used.
Since the current code does not consider this case, dev->cma_area regions
are not used when a "linux,dma-default" DMA coherent region is defined.
Instead, memory is allocated from the "linux,dma-default" DMA coherent
region. This omission could lead to DMA memory allocation failures for
devices such as the "viv,galcore", which require a large contiguous
address space that cannot be supplied by the "linux,dma-default" region
if the device has been reconfigured to use a CMA memory region. Similar
DMA allocation failures are likely to occur for other devices which
require large memory regions and/or whose overall allocation requests
exceed the size of the "linux,dma-default" DMA coherent region.

Fix this by updating the dev_get_coherent_memory() function to return
the NULL pointer if a dev->cma_area region is assigned to a device.
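
The resulting lookup precedence can be summarized as: a per-device
dma_mem region wins; an assigned CMA area opts the device out of the
default pool entirely (returning NULL so the allocation falls through to
CMA); only otherwise is the global default pool used. A toy model of
that precedence (the function and dictionary keys are illustrative
stand-ins, not the kernel API):

```python
def resolve_pool(dev, default_pool, dma_cma_enabled=True):
    """Model of the coherent-memory lookup order after the fix.

    dev is a dict standing in for struct device; returning None means
    "no coherent pool here -- let the allocation fall through to CMA".
    """
    if dev and dev.get("dma_mem"):
        return dev["dma_mem"]  # an explicitly assigned region always wins
    if dma_cma_enabled and dev and dev.get("cma_area"):
        return None  # a CMA-assigned device must skip the default pool
    return default_pool  # everyone else shares the default pool
```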

Cc: Greg Kroah-Hartman <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Robin Murphy <[email protected]>
Signed-off-by: George G. Davis <[email protected]>
Signed-off-by: Vitaly Kuzmichev <[email protected]>
---
drivers/base/dma-coherent.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index 2ae24c2..acfe140 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -25,6 +25,10 @@ static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *de
{
if (dev && dev->dma_mem)
return dev->dma_mem;
+#ifdef CONFIG_DMA_CMA
+ if (dev && dev->cma_area)
+ return NULL;
+#endif
return dma_coherent_default_memory;
}

--
1.9.1

2017-07-07 13:23:36

by Vitaly Kuzmichev

Subject: [PATCH v2 2/2] drivers: dma-coherent: show per-device DMA region utilization via procfs

From: "George G. Davis" <[email protected]>

Add hooks to allow displaying per-device DMA region utilization via
/proc/dmainfo similar to how free blocks are displayed in
/proc/pagetypeinfo.

This also adds /proc/dmainfo reporting for regions defined via
dma_declare_coherent_memory().

Cc: Greg Kroah-Hartman <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Marek Szyprowski <[email protected]>
Cc: Robin Murphy <[email protected]>
Signed-off-by: George G. Davis <[email protected]>
Signed-off-by: Mark Craske <[email protected]>
Signed-off-by: Vitaly Kuzmichev <[email protected]>
---
drivers/base/dma-coherent.c | 224 ++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 215 insertions(+), 9 deletions(-)

diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index acfe140..2692e6d 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -8,6 +8,12 @@
#include <linux/module.h>
#include <linux/dma-mapping.h>

+#ifdef CONFIG_PROC_FS
+#include <linux/fs.h>
+#include <linux/proc_fs.h>
+#include <linux/seq_file.h>
+#endif
+
struct dma_coherent_mem {
void *virt_base;
dma_addr_t device_base;
@@ -17,8 +23,57 @@ struct dma_coherent_mem {
unsigned long *bitmap;
spinlock_t spinlock;
bool use_dev_dma_pfn_offset;
+ int used;
+ int highwatermark;
+ int errs;
};

+#ifdef CONFIG_PROC_FS
+struct dmacoherent_region {
+ struct list_head list;
+ struct device *dev;
+};
+
+static LIST_HEAD(dmacoherent_region_list);
+static DEFINE_MUTEX(dmacoherent_region_list_lock);
+
+static int dmacoherent_region_add(struct device *dev)
+{
+ struct dmacoherent_region *rp;
+
+ rp = kzalloc(sizeof(*rp), GFP_KERNEL);
+ if (!rp)
+ return -ENOMEM;
+
+ rp->dev = dev;
+
+ mutex_lock(&dmacoherent_region_list_lock);
+ list_add(&rp->list, &dmacoherent_region_list);
+ mutex_unlock(&dmacoherent_region_list_lock);
+ dev_info(dev, "Registered DMA-coherent pool with /proc/dmainfo accounting\n");
+
+ return 0;
+}
+
+static void dmacoherent_region_del(struct device *dev)
+{
+ struct dmacoherent_region *rp;
+
+ mutex_lock(&dmacoherent_region_list_lock);
+ list_for_each_entry(rp, &dmacoherent_region_list, list) {
+ if (rp->dev == dev) {
+ list_del(&rp->list);
+ kfree(rp);
+ break;
+ }
+ }
+ mutex_unlock(&dmacoherent_region_list_lock);
+}
+#else
+static int dmacoherent_region_add(struct device *dev) { return 0; }
+static void dmacoherent_region_del(struct device *dev) { return; }
+#endif
+
static struct dma_coherent_mem *dma_coherent_default_memory __ro_after_init;

static inline struct dma_coherent_mem *dev_get_coherent_memory(struct device *dev)
@@ -122,14 +177,22 @@ int dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
dma_addr_t device_addr, size_t size, int flags)
{
struct dma_coherent_mem *mem;
+ int ret;

if (!dma_init_coherent_memory(phys_addr, device_addr, size, flags,
&mem))
return 0;

- if (dma_assign_coherent_memory(dev, mem) == 0)
- return flags & DMA_MEMORY_MAP ? DMA_MEMORY_MAP : DMA_MEMORY_IO;
+ if (dma_assign_coherent_memory(dev, mem) != 0)
+ goto errout;
+
+ ret = (flags & DMA_MEMORY_MAP ? DMA_MEMORY_MAP : DMA_MEMORY_IO);

+ if (dmacoherent_region_add(dev) == 0)
+ return ret;
+
+ dev->dma_mem = NULL;
+errout:
dma_release_coherent_memory(mem);
return 0;
}
@@ -141,6 +204,8 @@ void dma_release_declared_memory(struct device *dev)

if (!mem)
return;
+
+ dmacoherent_region_del(dev);
dma_release_coherent_memory(mem);
dev->dma_mem = NULL;
}
@@ -152,19 +217,25 @@ void *dma_mark_declared_memory_occupied(struct device *dev,
struct dma_coherent_mem *mem = dev->dma_mem;
unsigned long flags;
int pos, err;
-
- size += device_addr & ~PAGE_MASK;
+ int order;

if (!mem)
return ERR_PTR(-EINVAL);

- spin_lock_irqsave(&mem->spinlock, flags);
+ size += device_addr & ~PAGE_MASK;
+ order = get_order(size);
pos = PFN_DOWN(device_addr - dma_get_device_base(dev, mem));
- err = bitmap_allocate_region(mem->bitmap, pos, get_order(size));
- spin_unlock_irqrestore(&mem->spinlock, flags);

- if (err != 0)
+ spin_lock_irqsave(&mem->spinlock, flags);
+ err = bitmap_allocate_region(mem->bitmap, pos, order);
+ if (err != 0) {
+ spin_unlock_irqrestore(&mem->spinlock, flags);
return ERR_PTR(err);
+ }
+ mem->used += 1 << order;
+ if (mem->highwatermark < mem->used)
+ mem->highwatermark = mem->used;
+ spin_unlock_irqrestore(&mem->spinlock, flags);
return mem->virt_base + (pos << PAGE_SHIFT);
}
EXPORT_SYMBOL(dma_mark_declared_memory_occupied);
@@ -206,6 +277,10 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
if (unlikely(pageno < 0))
goto err;

+ mem->used += 1 << order;
+ if (mem->highwatermark < mem->used)
+ mem->highwatermark = mem->used;
+
/*
* Memory was found in the per-device area.
*/
@@ -221,6 +296,7 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
return 1;

err:
+ mem->errs++;
spin_unlock_irqrestore(&mem->spinlock, flags);
/*
* In the case where the allocation can not be satisfied from the
@@ -255,6 +331,7 @@ int dma_release_from_coherent(struct device *dev, int order, void *vaddr)

spin_lock_irqsave(&mem->spinlock, flags);
bitmap_release_region(mem->bitmap, page, order);
+ mem->used -= 1 << order;
spin_unlock_irqrestore(&mem->spinlock, flags);
return 1;
}
@@ -326,6 +403,10 @@ static int rmem_dma_device_init(struct reserved_mem *rmem, struct device *dev)
}
mem->use_dev_dma_pfn_offset = true;
rmem->priv = mem;
+
+ if (dmacoherent_region_add(dev))
+ return -ENOMEM;
+
dma_assign_coherent_memory(dev, mem);
return 0;
}
@@ -333,8 +414,10 @@ static int rmem_dma_device_init(struct reserved_mem *rmem, struct device *dev)
static void rmem_dma_device_release(struct reserved_mem *rmem,
struct device *dev)
{
- if (dev)
+ if (dev) {
+ dmacoherent_region_del(dev);
dev->dma_mem = NULL;
+ }
}

static const struct reserved_mem_ops rmem_dma_ops = {
@@ -396,3 +479,126 @@ static int __init dma_init_reserved_memory(void)

RESERVEDMEM_OF_DECLARE(dma, "shared-dma-pool", rmem_dma_setup);
#endif
+
+#ifdef CONFIG_PROC_FS
+
+static int dmainfo_proc_show_dma_mem(struct seq_file *m, void *v,
+ struct device *dev)
+{
+ struct dma_coherent_mem *mem = dev_get_coherent_memory(dev);
+ int offset;
+ int start;
+ int end;
+ int pages;
+ int order;
+ int free = 0;
+ int blocks[MAX_ORDER];
+
+ memset(blocks, 0, sizeof(blocks));
+
+ spin_lock(&mem->spinlock);
+
+ for (offset = 0; offset < mem->size; offset = end) {
+ start = find_next_zero_bit(mem->bitmap, mem->size, offset);
+ if (start >= mem->size)
+ break;
+ end = find_next_bit(mem->bitmap, mem->size, start + 1);
+ pages = end - start;
+
+ /* Align start: */
+ for (order = 0; order < MAX_ORDER; order += 1) {
+ if (start >= end)
+ break;
+ if (pages < (1 << order))
+ break;
+ if (start & (1 << order)) {
+ blocks[order] += 1;
+ start += 1 << order;
+ pages -= 1 << order;
+ free += 1 << order;
+ }
+ }
+
+ if (start >= end)
+ continue;
+
+ /* Align middle and end: */
+ order = MAX_ORDER - 1;
+ while (order >= 0) {
+ if (start >= end)
+ break;
+ if (pages >= (1 << order)) {
+ blocks[order] += 1;
+ start += 1 << order;
+ pages -= 1 << order;
+ free += 1 << order;
+ } else {
+ order -= 1;
+ }
+ }
+ }
+
+ seq_printf(m, "%-30s", dev_name(dev));
+
+ for (order = 0; order < MAX_ORDER; order += 1)
+ seq_printf(m, " %6d", blocks[order]);
+
+ seq_printf(m, " %6d %6d %6d %6d %6d\n",
+ mem->size,
+ mem->used,
+ free,
+ mem->highwatermark,
+ mem->errs);
+
+ spin_unlock(&mem->spinlock);
+
+ return 0;
+}
+
+static int dmainfo_proc_show(struct seq_file *m, void *v)
+{
+ struct dmacoherent_region *rp;
+ int order;
+
+ seq_puts(m, "DMA-coherent region information:\n");
+ seq_printf(m, "%-30s", "Free block count at order");
+
+ for (order = 0; order < MAX_ORDER; ++order)
+ seq_printf(m, " %6d", order);
+
+ seq_printf(m, " %6s %6s %6s %6s %6s\n",
+ "Size",
+ "Used",
+ "Free",
+ "High",
+ "Errs");
+
+ mutex_lock(&dmacoherent_region_list_lock);
+ list_for_each_entry(rp, &dmacoherent_region_list, list) {
+ dmainfo_proc_show_dma_mem(m, v, rp->dev);
+ }
+ mutex_unlock(&dmacoherent_region_list_lock);
+
+ return 0;
+}
+
+static int dmainfo_proc_open(struct inode *inode, struct file *file)
+{
+ return single_open(file, dmainfo_proc_show, NULL);
+}
+
+static const struct file_operations dmainfo_proc_fops = {
+ .open = dmainfo_proc_open,
+ .read = seq_read,
+ .llseek = seq_lseek,
+ .release = single_release,
+};
+
+static int __init proc_dmainfo_init(void)
+{
+ proc_create("dmainfo", 0, NULL, &dmainfo_proc_fops);
+ return 0;
+}
+module_init(proc_dmainfo_init);
+
+#endif
--
1.9.1

2017-07-07 13:56:49

by Stephen Rothwell

Subject: Re: [PATCH v2 0/2] Additions to default DMA coherent pool

Hi Vitaly,

On Fri, 7 Jul 2017 16:22:39 +0300 Vitaly Kuzmichev <[email protected]> wrote:
>
> v2:
> Since linux-next now includes Vladimir Murzin's version of default DMA
> coherent pool [1] our version is not required now and even causes merge
> conflict. The difference between two versions is not really significant
> except one serious problem with CONFIG_DMA_CMA. Please see patch v2 1/2
> for details.
> So that I have rebased our work on linux-next/master branch, and sending
> this patchset with necessary additions for default DMA pool feature.
>
> Patch v2 2/2 adds 'dmainfo' to ProcFS to show available DMA regions.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=93228b44c33a572cb36cec2dbed42e9bdbc88d79

This is now also in Linus' tree. You should rebase your work on that
commit (or just Linus' tree), not linux-next.

--
Cheers,
Stephen Rothwell

2017-07-07 14:27:49

by Christoph Hellwig

Subject: Re: [PATCH v2 1/2] drivers: dma-coherent: Fix dev->cma_area vs dev->dma_mem breakage

Vladimir,

this is why I really didn't like overloading the current
dma coherent infrastructure with the global pool.

And this new patch seems like piling hacks over hacks. I think we
should go back and make sure allocations from the global coherent
pool are done by the dma ops implementation, and not before calling
into them - preferably still reusing the common code for it.

Vladimir or Vitaly - can you look into that?

2017-07-07 14:28:30

by Christoph Hellwig

Subject: Re: [PATCH v2 2/2] drivers: dma-coherent: show per-device DMA region utilization via procfs

This should at least go into debugfs. I'd also prefer it was a
separate file instead of the ifdefs.
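
For context, a debugfs variant along the lines Christoph suggests is a
small change. The sketch below is hypothetical (not part of the posted
patch); it assumes the dmainfo_proc_show() seq_file callback from patch
2/2 is reused unchanged, only the registration moving from procfs to
debugfs:

```c
/*
 * Hypothetical sketch only: register the dmainfo seq_file output in
 * debugfs instead of procfs. Assumes dmainfo_proc_show() from patch
 * 2/2 exists unchanged.
 */
#include <linux/debugfs.h>
#include <linux/module.h>
#include <linux/seq_file.h>

static int dmainfo_open(struct inode *inode, struct file *file)
{
	return single_open(file, dmainfo_proc_show, NULL);
}

static const struct file_operations dmainfo_debugfs_fops = {
	.open		= dmainfo_open,
	.read		= seq_read,
	.llseek		= seq_lseek,
	.release	= single_release,
};

static int __init dmainfo_debugfs_init(void)
{
	/* Creates <debugfs>/dmainfo, read-only. */
	debugfs_create_file("dmainfo", 0444, NULL, NULL,
			    &dmainfo_debugfs_fops);
	return 0;
}
module_init(dmainfo_debugfs_init);
```

Keeping this in its own file, as suggested, would also drop the
CONFIG_PROC_FS ifdefs from dma-coherent.c entirely.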

2017-07-07 15:40:21

by Vladimir Murzin

Subject: Re: [PATCH v2 1/2] drivers: dma-coherent: Fix dev->cma_area vs dev->dma_mem breakage

Christoph,

On 07/07/17 15:27, Christoph Hellwig wrote:
> Vladimir,
>
> this is why I really didn't like overloading the current
> dma coherent infrastructure with the global pool.
>
> And this new patch seems like piling hacks over hacks. I think we
> should go back and make sure allocations from the global coherent
> pool are done by the dma ops implementation, and not before calling
> into them - preferably still reusing the common code for it.
>
> Vladimir or Vitaly - can you look into that?
>

It is really sad that Vitaly and George did not join the discussion earlier,
so we could have avoided ending up in a situation like this.

Likely I'm missing something, but what should happen if a device relies on
dma_contiguous_default_area?

Originally, the intention behind dma-default was to simplify things, so instead of

reserved-memory {
#address-cells = <1>;
#size-cells = <1>;
ranges;

coherent_dma: linux,dma {
compatible = "shared-dma-pool";
no-map;
reg = <0x78000000 0x800000>;
};
};


dev0: dev@12300000 {
memory-region = <&coherent_dma>;
/* ... */
};

dev1: dev@12500000 {
memory-region = <&coherent_dma>;
/* ... */
};

dev2: dev@12600000 {
memory-region = <&coherent_dma>;
/* ... */
};

in the device tree we could simply have

reserved-memory {
#address-cells = <1>;
#size-cells = <1>;
ranges;

coherent_dma: linux,dma {
compatible = "shared-dma-pool";
no-map;
reg = <0x78000000 0x800000>;
linux,dma-default;
};
};

and that just works in my (NOMMU) case because there is no CMA there...

However, given that dma-default is being overloaded and there are no device
tree users merged yet, I would not object to stepping back, reverting "drivers:
dma-coherent: Introduce default DMA pool" and cooperatively rethinking the
design/implementation so that every party is happy.

The rest of my original patch set should be enough to keep NOMMU working.

Cheers
Vladimir

2017-07-07 16:08:13

by Robin Murphy

Subject: Re: [PATCH v2 1/2] drivers: dma-coherent: Fix dev->cma_area vs dev->dma_mem breakage

On 07/07/17 16:40, Vladimir Murzin wrote:
> Christoph,
>
> On 07/07/17 15:27, Christoph Hellwig wrote:
>> Vladimir,
>>
>> this is why I really didn't like overloading the current
>> dma coherent infrastructure with the global pool.
>>
>> And this new patch seems like piling hacks over hacks. I think we
>> should go back and make sure allocations from the global coherent
>> pool are done by the dma ops implementation, and not before calling
>> into them - preferably still reusing the common code for it.
>>
>> Vladimir or Vitaly - can you look into that?
>>
>
> It is really sad that Vitaly and George did not join to discussions earlier,
> so we could avoid being in situation like this.
>
> Likely I'm missing something, but what should happen if device relies on
> dma_contiguous_default_area?
>
> Originally, intention behind dma-default was to simplify things, so instead of
>
> reserved-memory {
> #address-cells = <1>;
> #size-cells = <1>;
> ranges;
>
> coherent_dma: linux,dma {
> compatible = "shared-dma-pool";
> no-map;
> reg = <0x78000000 0x800000>;
> };
> };
>
>
> dev0: dev@12300000 {
> memory-region = <&coherent_dma>;
> /* ... */
> };
>
> dev1: dev@12500000 {
> memory-region = <&coherent_dma>;
> /* ... */
> };
>
> dev2: dev@12600000 {
> memory-region = <&coherent_dma>;
> /* ... */
> };
>
> in device tree we could simply have
>
> reserved-memory {
> #address-cells = <1>;
> #size-cells = <1>;
> ranges;
>
> coherent_dma: linux,dma {
> compatible = "shared-dma-pool";
> no-map;
> reg = <0x78000000 0x800000>;
> linux,dma-default;
> };
> };
>
> and that just work in my (NOMMU) case because there is no CMA there...
>
> However, given that dma-default is being overloaded and there are no device
> tree users merged yet, I would not object stepping back, reverting "drivers:
> dma-coherent: Introduce default DMA pool" and cooperatively rethinking
> design/implementation, so every party gets happy.

I don't think we need to go that far; I reckon it would be clear enough
to just split the per-device vs. global pool interfaces, something like
I've sketched out below (such that the ops->alloc implementation calls
dma_alloc_from_global_coherent() if dma_alloc_from_contiguous() fails).

If anyone wants to take that and run with it, feel free.

Robin.

----->8-----
diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index 640a7e63c453..e6393c6d8359 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -143,6 +143,44 @@ void *dma_mark_declared_memory_occupied(struct
device *dev,
}
EXPORT_SYMBOL(dma_mark_declared_memory_occupied);

+static void *__dma_alloc_from_coherent(struct dma_coherent_mem *mem,
ssize_t size,
+ dma_addr_t *dma_handle)
+{
+ int order = get_order(size);
+ unsigned long flags;
+ int pageno;
+ int dma_memory_map;
+ void *ret;
+
+ spin_lock_irqsave(&mem->spinlock, flags);
+
+ if (unlikely(size > (mem->size << PAGE_SHIFT)))
+ goto err;
+
+ pageno = bitmap_find_free_region(mem->bitmap, mem->size, order);
+ if (unlikely(pageno < 0))
+ goto err;
+
+ /*
+ * Memory was found in the coherent area.
+ */
+ *dma_handle = mem->device_base + (pageno << PAGE_SHIFT);
+ ret = mem->virt_base + (pageno << PAGE_SHIFT);
+ dma_memory_map = (mem->flags & DMA_MEMORY_MAP);
+ spin_unlock_irqrestore(&mem->spinlock, flags);
+ if (dma_memory_map)
+ memset(ret, 0, size);
+ else
+ memset_io(ret, 0, size);
+
+ return ret;
+
+err:
+ spin_unlock_irqrestore(&mem->spinlock, flags);
+ return NULL;
+}
+EXPORT_SYMBOL(dma_alloc_from_coherent);
+
/**
* dma_alloc_from_coherent() - try to allocate memory from the
per-device coherent area
*
@@ -162,10 +200,6 @@ int dma_alloc_from_coherent(struct device *dev,
ssize_t size,
dma_addr_t *dma_handle, void **ret)
{
struct dma_coherent_mem *mem;
- int order = get_order(size);
- unsigned long flags;
- int pageno;
- int dma_memory_map;

if (!dev)
return 0;
@@ -173,32 +207,10 @@ int dma_alloc_from_coherent(struct device *dev,
ssize_t size,
if (!mem)
return 0;

- *ret = NULL;
- spin_lock_irqsave(&mem->spinlock, flags);
+ *ret = __dma_alloc_from_coherent(mem, size, dma_handle);
+ if (*ret)
+ return 1;

- if (unlikely(size > (mem->size << PAGE_SHIFT)))
- goto err;
-
- pageno = bitmap_find_free_region(mem->bitmap, mem->size, order);
- if (unlikely(pageno < 0))
- goto err;
-
- /*
- * Memory was found in the per-device area.
- */
- *dma_handle = mem->device_base + (pageno << PAGE_SHIFT);
- *ret = mem->virt_base + (pageno << PAGE_SHIFT);
- dma_memory_map = (mem->flags & DMA_MEMORY_MAP);
- spin_unlock_irqrestore(&mem->spinlock, flags);
- if (dma_memory_map)
- memset(*ret, 0, size);
- else
- memset_io(*ret, 0, size);
-
- return 1;
-
-err:
- spin_unlock_irqrestore(&mem->spinlock, flags);
/*
* In the case where the allocation can not be satisfied from the
* per-device area, try to fall back to generic memory if the
@@ -208,6 +220,15 @@ int dma_alloc_from_coherent(struct device *dev,
ssize_t size,
}
EXPORT_SYMBOL(dma_alloc_from_coherent);

+void *dma_alloc_from_global_coherent(ssize_t size, dma_addr_t *dma_handle)
+{
+ if (!dma_coherent_default_memory)
+ return NULL;
+
+ return __dma_alloc_from_coherent(dma_coherent_default_memory, size,
+ handle);
+}
+
/**
* dma_release_from_coherent() - try to free the memory allocated from
per-device coherent memory pool
* @dev: device from which the memory was allocated
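
Stripped of the locking and memset_io() details, the core of
__dma_alloc_from_coherent() in the sketch above is a power-of-two bitmap
allocator. The following userspace model (hypothetical code, not kernel
API; names and the 32-page pool size are made up for illustration) shows
the contract it gets from bitmap_find_free_region(): 2^order contiguous
pages, naturally aligned, marked busy on success, or a negative result
when the pool is exhausted:

```c
#include <string.h>

#define POOL_PAGES 32			/* model a 32-page coherent pool */

static unsigned char bitmap[POOL_PAGES];	/* 1 = page in use */

/*
 * Find 2^order contiguous free pages, aligned to 2^order (stepping
 * pageno by the run length gives the natural alignment), mark them
 * used, and return the first page number; -1 if nothing fits.
 */
static int find_free_region(int order)
{
	int run = 1 << order;
	int pageno, i;

	for (pageno = 0; pageno + run <= POOL_PAGES; pageno += run) {
		int free = 1;

		for (i = 0; i < run; i++)
			if (bitmap[pageno + i])
				free = 0;
		if (free) {
			memset(&bitmap[pageno], 1, run);
			return pageno;
		}
	}
	return -1;	/* pool exhausted, like bitmap_find_free_region() */
}
```

In the real code the returned page number is then turned into the pair
(mem->device_base + (pageno << PAGE_SHIFT), mem->virt_base +
(pageno << PAGE_SHIFT)) under mem->spinlock.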

2017-07-07 16:44:41

by Vladimir Murzin

Subject: Re: [PATCH v2 1/2] drivers: dma-coherent: Fix dev->cma_area vs dev->dma_mem breakage

On 07/07/17 17:06, Robin Murphy wrote:
> On 07/07/17 16:40, Vladimir Murzin wrote:
>> Christoph,
>>
>> On 07/07/17 15:27, Christoph Hellwig wrote:
>>> Vladimir,
>>>
>>> this is why I really didn't like overloading the current
>>> dma coherent infrastructure with the global pool.
>>>
>>> And this new patch seems like piling hacks over hacks. I think we
>>> should go back and make sure allocations from the global coherent
>>> pool are done by the dma ops implementation, and not before calling
>>> into them - preferably still reusing the common code for it.
>>>
>>> Vladimir or Vitaly - can you look into that?
>>>
>>
>> It is really sad that Vitaly and George did not join to discussions earlier,
>> so we could avoid being in situation like this.
>>
>> Likely I'm missing something, but what should happen if device relies on
>> dma_contiguous_default_area?
>>
>> Originally, intention behind dma-default was to simplify things, so instead of
>>
>> reserved-memory {
>> #address-cells = <1>;
>> #size-cells = <1>;
>> ranges;
>>
>> coherent_dma: linux,dma {
>> compatible = "shared-dma-pool";
>> no-map;
>> reg = <0x78000000 0x800000>;
>> };
>> };
>>
>>
>> dev0: dev@12300000 {
>> memory-region = <&coherent_dma>;
>> /* ... */
>> };
>>
>> dev1: dev@12500000 {
>> memory-region = <&coherent_dma>;
>> /* ... */
>> };
>>
>> dev2: dev@12600000 {
>> memory-region = <&coherent_dma>;
>> /* ... */
>> };
>>
>> in device tree we could simply have
>>
>> reserved-memory {
>> #address-cells = <1>;
>> #size-cells = <1>;
>> ranges;
>>
>> coherent_dma: linux,dma {
>> compatible = "shared-dma-pool";
>> no-map;
>> reg = <0x78000000 0x800000>;
>> linux,dma-default;
>> };
>> };
>>
>> and that just work in my (NOMMU) case because there is no CMA there...
>>
>> However, given that dma-default is being overloaded and there are no device
>> tree users merged yet, I would not object stepping back, reverting "drivers:
>> dma-coherent: Introduce default DMA pool" and cooperatively rethinking
>> design/implementation, so every party gets happy.
>
> I don't think we need to go that far, I reckon it would be clear enough
> to just split the per-device vs. global pool interfaces, something like
> I've sketched out below (such that the ops->alloc implementation calls
> dma_alloc_from_global_coherent() if dma_alloc_from_contiguous() fails).

Wouldn't we also need release and mmap variants?

>
> If anyone wants to take that and run with it, feel free.
>
> Robin.
>
> ----->8-----
> diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
> index 640a7e63c453..e6393c6d8359 100644
> --- a/drivers/base/dma-coherent.c
> +++ b/drivers/base/dma-coherent.c
> @@ -143,6 +143,44 @@ void *dma_mark_declared_memory_occupied(struct
> device *dev,
> }
> EXPORT_SYMBOL(dma_mark_declared_memory_occupied);
>
> +static void *__dma_alloc_from_coherent(struct dma_coherent_mem *mem,
> ssize_t size,
> + dma_addr_t *dma_handle)
> +{
> + int order = get_order(size);
> + unsigned long flags;
> + int pageno;
> + int dma_memory_map;
> + void *ret;
> +
> + spin_lock_irqsave(&mem->spinlock, flags);
> +
> + if (unlikely(size > (mem->size << PAGE_SHIFT)))
> + goto err;
> +
> + pageno = bitmap_find_free_region(mem->bitmap, mem->size, order);
> + if (unlikely(pageno < 0))
> + goto err;
> +
> + /*
> + * Memory was found in the coherent area.
> + */
> + *dma_handle = mem->device_base + (pageno << PAGE_SHIFT);
> + ret = mem->virt_base + (pageno << PAGE_SHIFT);
> + dma_memory_map = (mem->flags & DMA_MEMORY_MAP);
> + spin_unlock_irqrestore(&mem->spinlock, flags);
> + if (dma_memory_map)
> + memset(ret, 0, size);
> + else
> + memset_io(ret, 0, size);
> +
> + return ret;
> +
> +err:
> + spin_unlock_irqrestore(&mem->spinlock, flags);
> + return NULL;
> +}
> +EXPORT_SYMBOL(dma_alloc_from_coherent);

We already export dma_alloc_from_coherent

> +
> /**
> * dma_alloc_from_coherent() - try to allocate memory from the
> per-device coherent area
> *
> @@ -162,10 +200,6 @@ int dma_alloc_from_coherent(struct device *dev,
> ssize_t size,
> dma_addr_t *dma_handle, void **ret)
> {
> struct dma_coherent_mem *mem;
> - int order = get_order(size);
> - unsigned long flags;
> - int pageno;
> - int dma_memory_map;
>
> if (!dev)
> return 0;
> @@ -173,32 +207,10 @@ int dma_alloc_from_coherent(struct device *dev,
> ssize_t size,
> if (!mem)
> return 0;
>
> - *ret = NULL;
> - spin_lock_irqsave(&mem->spinlock, flags);
> + *ret = __dma_alloc_from_coherent(mem, size, dma_handle);
> + if (*ret)
> + return 1;
>
> - if (unlikely(size > (mem->size << PAGE_SHIFT)))
> - goto err;
> -
> - pageno = bitmap_find_free_region(mem->bitmap, mem->size, order);
> - if (unlikely(pageno < 0))
> - goto err;
> -
> - /*
> - * Memory was found in the per-device area.
> - */
> - *dma_handle = mem->device_base + (pageno << PAGE_SHIFT);
> - *ret = mem->virt_base + (pageno << PAGE_SHIFT);
> - dma_memory_map = (mem->flags & DMA_MEMORY_MAP);
> - spin_unlock_irqrestore(&mem->spinlock, flags);
> - if (dma_memory_map)
> - memset(*ret, 0, size);
> - else
> - memset_io(*ret, 0, size);
> -
> - return 1;
> -
> -err:
> - spin_unlock_irqrestore(&mem->spinlock, flags);
> /*
> * In the case where the allocation can not be satisfied from the
> * per-device area, try to fall back to generic memory if the
> @@ -208,6 +220,15 @@ int dma_alloc_from_coherent(struct device *dev,
> ssize_t size,
> }
> EXPORT_SYMBOL(dma_alloc_from_coherent);
>
> +void *dma_alloc_from_global_coherent(ssize_t size, dma_addr_t *dma_handle)
> +{
> + if (!dma_coherent_default_memory)
> + return NULL;
> +
> + return __dma_alloc_from_coherent(dma_coherent_default_memory, size,
> + handle);
^^^^^^
dma_handle
> +}
> +

EXPORT_SYMBOL(dma_release_from_coherent); ?


> /**
> * dma_release_from_coherent() - try to free the memory allocated from
> per-device coherent memory pool
> * @dev: device from which the memory was allocated
>

Cheers
Vladimir

2017-07-07 17:55:22

by Robin Murphy

Subject: Re: [PATCH v2 1/2] drivers: dma-coherent: Fix dev->cma_area vs dev->dma_mem breakage

On 07/07/17 17:44, Vladimir Murzin wrote:
> On 07/07/17 17:06, Robin Murphy wrote:
>> On 07/07/17 16:40, Vladimir Murzin wrote:
>>> Christoph,
>>>
>>> On 07/07/17 15:27, Christoph Hellwig wrote:
>>>> Vladimir,
>>>>
>>>> this is why I really didn't like overloading the current
>>>> dma coherent infrastructure with the global pool.
>>>>
>>>> And this new patch seems like piling hacks over hacks. I think we
>>>> should go back and make sure allocations from the global coherent
>>>> pool are done by the dma ops implementation, and not before calling
>>>> into them - preferably still reusing the common code for it.
>>>>
>>>> Vladimir or Vitaly - can you look into that?
>>>>
>>>
>>> It is really sad that Vitaly and George did not join to discussions earlier,
>>> so we could avoid being in situation like this.
>>>
>>> Likely I'm missing something, but what should happen if device relies on
>>> dma_contiguous_default_area?
>>>
>>> Originally, intention behind dma-default was to simplify things, so instead of
>>>
>>> reserved-memory {
>>> #address-cells = <1>;
>>> #size-cells = <1>;
>>> ranges;
>>>
>>> coherent_dma: linux,dma {
>>> compatible = "shared-dma-pool";
>>> no-map;
>>> reg = <0x78000000 0x800000>;
>>> };
>>> };
>>>
>>>
>>> dev0: dev@12300000 {
>>> memory-region = <&coherent_dma>;
>>> /* ... */
>>> };
>>>
>>> dev1: dev@12500000 {
>>> memory-region = <&coherent_dma>;
>>> /* ... */
>>> };
>>>
>>> dev2: dev@12600000 {
>>> memory-region = <&coherent_dma>;
>>> /* ... */
>>> };
>>>
>>> in device tree we could simply have
>>>
>>> reserved-memory {
>>> #address-cells = <1>;
>>> #size-cells = <1>;
>>> ranges;
>>>
>>> coherent_dma: linux,dma {
>>> compatible = "shared-dma-pool";
>>> no-map;
>>> reg = <0x78000000 0x800000>;
>>> linux,dma-default;
>>> };
>>> };
>>>
>>> and that just work in my (NOMMU) case because there is no CMA there...
>>>
>>> However, given that dma-default is being overloaded and there are no device
>>> tree users merged yet, I would not object stepping back, reverting "drivers:
>>> dma-coherent: Introduce default DMA pool" and cooperatively rethinking
>>> design/implementation, so every party gets happy.
>>
>> I don't think we need to go that far, I reckon it would be clear enough
>> to just split the per-device vs. global pool interfaces, something like
>> I've sketched out below (such that the ops->alloc implementation calls
>> dma_alloc_from_global_coherent() if dma_alloc_from_contiguous() fails).
>
> Would not we need also release and mmap variants?

Sure, that was just bashed out in 2 minutes and diffed into an email on
the assumption that code would help illustrate the general idea I had in
mind more clearly than prose alone. I'm certain it won't even compile
as-is ;)

Robin.

2017-07-10 00:36:13

by Rob Herring

Subject: Re: [PATCH 1/2] drivers: dma-coherent: Add support for default DMA coherent pool

On Mon, Jul 03, 2017 at 05:51:14PM +0300, [email protected] wrote:
> From: "George G. Davis" <[email protected]>
>
> Use concept similar to the default CMA region for DMA coherent pools.

Why do we need this in DT? CMA is a carveout and has to be reserved
early, but DMA coherent memory is just different MMU attributes, right?

Also, does this still apply with DMA mapping changes in 4.13?

> Cc: Rob Herring <[email protected]>
> Cc: Mark Rutland <[email protected]>
> Cc: Greg Kroah-Hartman <[email protected]>
> Signed-off-by: George G. Davis <[email protected]>
> Signed-off-by: Jiada Wang <[email protected]>
> Signed-off-by: Mark Craske <[email protected]>
> Signed-off-by: Vitaly Kuzmichev <[email protected]>
> ---
> .../bindings/reserved-memory/reserved-memory.txt | 2 ++
> drivers/base/dma-coherent.c | 29 ++++++++++++++++------
> 2 files changed, 24 insertions(+), 7 deletions(-)

2017-07-10 13:42:25

by Vladimir Murzin

Subject: Re: [PATCH v2 1/2] drivers: dma-coherent: Fix dev->cma_area vs dev->dma_mem breakage

On 07/07/17 18:55, Robin Murphy wrote:
> On 07/07/17 17:44, Vladimir Murzin wrote:
>> On 07/07/17 17:06, Robin Murphy wrote:
>>> On 07/07/17 16:40, Vladimir Murzin wrote:
>>>> Christoph,
>>>>
>>>> On 07/07/17 15:27, Christoph Hellwig wrote:
>>>>> Vladimir,
>>>>>
>>>>> this is why I really didn't like overloading the current
>>>>> dma coherent infrastructure with the global pool.
>>>>>
>>>>> And this new patch seems like piling hacks over hacks. I think we
>>>>> should go back and make sure allocations from the global coherent
>>>>> pool are done by the dma ops implementation, and not before calling
>>>>> into them - preferably still reusing the common code for it.
>>>>>
>>>>> Vladimir or Vitaly - can you look into that?
>>>>>
>>>>
>>>> It is really sad that Vitaly and George did not join to discussions earlier,
>>>> so we could avoid being in situation like this.
>>>>
>>>> Likely I'm missing something, but what should happen if device relies on
>>>> dma_contiguous_default_area?
>>>>
>>>> Originally, intention behind dma-default was to simplify things, so instead of
>>>>
>>>> reserved-memory {
>>>> #address-cells = <1>;
>>>> #size-cells = <1>;
>>>> ranges;
>>>>
>>>> coherent_dma: linux,dma {
>>>> compatible = "shared-dma-pool";
>>>> no-map;
>>>> reg = <0x78000000 0x800000>;
>>>> };
>>>> };
>>>>
>>>>
>>>> dev0: dev@12300000 {
>>>> memory-region = <&coherent_dma>;
>>>> /* ... */
>>>> };
>>>>
>>>> dev1: dev@12500000 {
>>>> memory-region = <&coherent_dma>;
>>>> /* ... */
>>>> };
>>>>
>>>> dev2: dev@12600000 {
>>>> memory-region = <&coherent_dma>;
>>>> /* ... */
>>>> };
>>>>
>>>> in device tree we could simply have
>>>>
>>>> reserved-memory {
>>>> #address-cells = <1>;
>>>> #size-cells = <1>;
>>>> ranges;
>>>>
>>>> coherent_dma: linux,dma {
>>>> compatible = "shared-dma-pool";
>>>> no-map;
>>>> reg = <0x78000000 0x800000>;
>>>> linux,dma-default;
>>>> };
>>>> };
>>>>
>>>> and that just work in my (NOMMU) case because there is no CMA there...
>>>>
>>>> However, given that dma-default is being overloaded and there are no device
>>>> tree users merged yet, I would not object stepping back, reverting "drivers:
>>>> dma-coherent: Introduce default DMA pool" and cooperatively rethinking
>>>> design/implementation, so every party gets happy.
>>>
>>> I don't think we need to go that far, I reckon it would be clear enough
>>> to just split the per-device vs. global pool interfaces, something like
>>> I've sketched out below (such that the ops->alloc implementation calls
>>> dma_alloc_from_global_coherent() if dma_alloc_from_contiguous() fails).
>>
>> Would not we need also release and mmap variants?
>
> Sure, that was just bashed out in 2 minutes and diffed into an email on
> the assumption that code would help illustrate the general idea I had in
> mind more clearly than prose alone. I'm certain it won't even compile
> as-is ;)

Ok. I've added the missing pieces and even wired that up with ARM NOMMU, and
it works fine for me, but before I go further it'd be handy to know:
1. what does Christoph think of that idea?
2. what is Vitaly's use case for dma-default?

Cheers
Vladimir

>
> Robin.
>

2017-07-11 14:19:39

by Christoph Hellwig

Subject: Re: [PATCH v2 1/2] drivers: dma-coherent: Fix dev->cma_area vs dev->dma_mem breakage

On Fri, Jul 07, 2017 at 05:06:52PM +0100, Robin Murphy wrote:
> I don't think we need to go that far, I reckon it would be clear enough
> to just split the per-device vs. global pool interfaces, something like
> I've sketched out below (such that the ops->alloc implementation calls
> dma_alloc_from_global_coherent() if dma_alloc_from_contiguous() fails).
>
> If anyone wants to take that and run with it, feel free.

I like this basic idea. It also fits into one of my plans for the
4.14 merge window - I want to enhance lib/dma-noop.c so that
it can use different allocators and mapping helpers, e.g. for
the allocators what makes sense is:

(1) simple page allocator (as-is)
(2) CMA
(3) swiotlb
(4) the OF coherent allocator from your draft patch

and then for the mapping into phys space we can use

(1) virt_to_phys (as-is)
(2) arch helper (e.g. like done in mips plat support)
(3) maybe some common form of ioremap / vmap instead of various
duplicates

With that we should be able to consolidate most direct-mapped
dma_ops for architectures that do not require cache flushing into
common code. As a next step we could think about useful cache
flushing hooks.