2015-06-30 21:29:32

by Lorenzo Nava

[permalink] [raw]
Subject: [PATCH v5] arm DMA: Fix allocation from CMA for coherent DMA

This patch allows the use of CMA for DMA coherent memory allocation.
At the moment if the input parameter "is_coherent" is set to true
the allocation is not made using the CMA, which I think is not the
desired behaviour.

Signed-off-by: Lorenzo Nava <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
Changes in v2:
correct __arm_dma_free() according to __dma_alloc() allocation
---
Changes in v3:
now __dma_alloc(), if 'is_coherent' is true, returns memory from CMA
if there is no need for atomic allocation. If CMA is not available
the function returns the result of __alloc_simple_buffer().
__arm_dma_free() frees memory according to the new alloc function
avoiding __dma_free_remap() for coherent DMA if CMA is not enable.
arm_dma_alloc() mark pages as cacheable if attrs are set by default
to NULL. If attrs is not NULL, attributes are preserved in the allocation.

Coherent allocation tested on Xilinx Zynq processor.
---
Changes in v4:
back to "if..else" code style for __dma_alloc()
avoided unnecessary __free_from_pool() call in __arm_dma_free()
---
Changes in v5:
changed coherent allocation attributes in arm_coherent_dma_alloc()
---
arch/arm/mm/dma-mapping.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 1ced8a0..8f3f173 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -648,14 +648,18 @@ static void *__dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
size = PAGE_ALIGN(size);
want_vaddr = !dma_get_attr(DMA_ATTR_NO_KERNEL_MAPPING, attrs);

- if (is_coherent || nommu())
+ if (nommu())
+ addr = __alloc_simple_buffer(dev, size, gfp, &page);
+ else if (dev_get_cma_area(dev) && (gfp & __GFP_WAIT))
+ addr = __alloc_from_contiguous(dev, size, prot, &page,
+ caller, want_vaddr);
+ else if (is_coherent)
addr = __alloc_simple_buffer(dev, size, gfp, &page);
else if (!(gfp & __GFP_WAIT))
addr = __alloc_from_pool(size, &page);
- else if (!dev_get_cma_area(dev))
- addr = __alloc_remap_buffer(dev, size, gfp, prot, &page, caller, want_vaddr);
else
- addr = __alloc_from_contiguous(dev, size, prot, &page, caller, want_vaddr);
+ addr = __alloc_remap_buffer(dev, size, gfp, prot, &page,
+ caller, want_vaddr);

if (page)
*handle = pfn_to_dma(dev, page_to_pfn(page));
@@ -683,13 +687,12 @@ void *arm_dma_alloc(struct device *dev, size_t size, dma_addr_t *handle,
static void *arm_coherent_dma_alloc(struct device *dev, size_t size,
dma_addr_t *handle, gfp_t gfp, struct dma_attrs *attrs)
{
- pgprot_t prot = __get_dma_pgprot(attrs, PAGE_KERNEL);
void *memory;

if (dma_alloc_from_coherent(dev, size, handle, &memory))
return memory;

- return __dma_alloc(dev, size, handle, gfp, prot, true,
+ return __dma_alloc(dev, size, handle, gfp, PAGE_KERNEL, true,
attrs, __builtin_return_address(0));
}

@@ -753,12 +756,12 @@ static void __arm_dma_free(struct device *dev, size_t size, void *cpu_addr,

size = PAGE_ALIGN(size);

- if (is_coherent || nommu()) {
+ if (nommu()) {
__dma_free_buffer(page, size);
- } else if (__free_from_pool(cpu_addr, size)) {
+ } else if (!is_coherent && __free_from_pool(cpu_addr, size)) {
return;
} else if (!dev_get_cma_area(dev)) {
- if (want_vaddr)
+ if (want_vaddr && !is_coherent)
__dma_free_remap(cpu_addr, size);
__dma_free_buffer(page, size);
} else {
--
1.7.10.4


2015-07-01 11:13:03

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v5] arm DMA: Fix allocation from CMA for coherent DMA

On Tue, Jun 30, 2015 at 11:29:06PM +0200, Lorenzo Nava wrote:
> This patch allows the use of CMA for DMA coherent memory allocation.
> At the moment if the input parameter "is_coherent" is set to true
> the allocation is not made using the CMA, which I think is not the
> desired behaviour.
>
> Signed-off-by: Lorenzo Nava <[email protected]>
> Reviewed-by: Catalin Marinas <[email protected]>

If Russell doesn't have any objections, you can send the patch to
his patch system. See here for more information:

http://www.arm.linux.org.uk/developer/patches/info.php

Or, if you are into git aliases (I haven't used this alias recently):

[alias]
send-rmk-patch = !GIT_EDITOR=\"sed -i -e \\\"s/^---$/---\\nKernelVersion: $(git describe --abbrev=0 --match=\"v*\")\\n/\\\"\" git send-email --annotate --no-thread --suppress-cc=all --to="[email protected]"

Just make sure you send one at a time if you have a series. E.g. for the
top commit:

git send-rmk-patch HEAD^

--
Catalin

2015-07-03 09:03:31

by Russell King - ARM Linux

[permalink] [raw]
Subject: Re: [PATCH v5] arm DMA: Fix allocation from CMA for coherent DMA

On Wed, Jul 01, 2015 at 12:12:51PM +0100, Catalin Marinas wrote:
> On Tue, Jun 30, 2015 at 11:29:06PM +0200, Lorenzo Nava wrote:
> > This patch allows the use of CMA for DMA coherent memory allocation.
> > At the moment if the input parameter "is_coherent" is set to true
> > the allocation is not made using the CMA, which I think is not the
> > desired behaviour.
> >
> > Signed-off-by: Lorenzo Nava <[email protected]>
> > Reviewed-by: Catalin Marinas <[email protected]>
>
> If Russell doesn't have any objections, you can send the patch to
> his patch system. See here for more information:

I'm left wondering whether this patch is really want Lorenzo wants.
>From my reading of it, while this has the effect of allocating from
CMA for coherent devices, it's no different from the non-coherent
case, because by calling __alloc_from_contiguous(), we end up
remapping the allocated memory, removing the cacheability status
from the allocated pages.

This brings up an interesting point: presumably, it's been tested, and
people are happy with the performance it's giving, inspite of it not
returning cacheable memory... or maybe it hasn't been tested that much?

--
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.

2015-07-03 10:26:18

by Catalin Marinas

[permalink] [raw]
Subject: Re: [PATCH v5] arm DMA: Fix allocation from CMA for coherent DMA

On Fri, Jul 03, 2015 at 10:03:06AM +0100, Russell King - ARM Linux wrote:
> On Wed, Jul 01, 2015 at 12:12:51PM +0100, Catalin Marinas wrote:
> > On Tue, Jun 30, 2015 at 11:29:06PM +0200, Lorenzo Nava wrote:
> > > This patch allows the use of CMA for DMA coherent memory allocation.
> > > At the moment if the input parameter "is_coherent" is set to true
> > > the allocation is not made using the CMA, which I think is not the
> > > desired behaviour.
> > >
> > > Signed-off-by: Lorenzo Nava <[email protected]>
> > > Reviewed-by: Catalin Marinas <[email protected]>
> >
> > If Russell doesn't have any objections, you can send the patch to
> > his patch system. See here for more information:
>
> I'm left wondering whether this patch is really want Lorenzo wants.
> From my reading of it, while this has the effect of allocating from
> CMA for coherent devices, it's no different from the non-coherent
> case, because by calling __alloc_from_contiguous(), we end up
> remapping the allocated memory, removing the cacheability status
> from the allocated pages.

The arm_coherent_dma_alloc() function is changed by this patch to pass
PAGE_KERNEL as prot, so the cacheability attributes should be preserved.
We have a superfluous __dma_remap() call for lowmem CMA pages which
changes the protection from PAGE_KERNEL to PAGE_KERNEL but, as you
mentioned in another reply, the overhead of CMA is big enough to not
make this noticeable.

The alternative (as I suggested in one of the earlier versions of this
patch) was to change __get_dma_pgprot() to take a "coherent" argument
and return PAGE_KERNEL. But since the coherent mmap was fixed in a
similar way to this patch (and merged already), I gave up on my initial
suggestion.

> This brings up an interesting point: presumably, it's been tested, and
> people are happy with the performance it's giving, inspite of it not
> returning cacheable memory... or maybe it hasn't been tested that much?

It must cacheable memory, otherwise it's not a performance but a data
corruption issue. For example, a cacheable device access may allocate in
an outer cache but non-cacheable CPU access doesn't look into it.

--
Catalin

2015-07-03 11:23:33

by Lorenzo Nava

[permalink] [raw]
Subject: Re: [PATCH v5] arm DMA: Fix allocation from CMA for coherent DMA

On Fri, Jul 3, 2015 at 11:03 AM, Russell King - ARM Linux
<[email protected]> wrote:
> On Wed, Jul 01, 2015 at 12:12:51PM +0100, Catalin Marinas wrote:
>> On Tue, Jun 30, 2015 at 11:29:06PM +0200, Lorenzo Nava wrote:
>> > This patch allows the use of CMA for DMA coherent memory allocation.
>> > At the moment if the input parameter "is_coherent" is set to true
>> > the allocation is not made using the CMA, which I think is not the
>> > desired behaviour.
>> >
>> > Signed-off-by: Lorenzo Nava <[email protected]>
>> > Reviewed-by: Catalin Marinas <[email protected]>
>>
>> If Russell doesn't have any objections, you can send the patch to
>> his patch system. See here for more information:
>
> I'm left wondering whether this patch is really want Lorenzo wants.
> From my reading of it, while this has the effect of allocating from
> CMA for coherent devices, it's no different from the non-coherent
> case, because by calling __alloc_from_contiguous(), we end up
> remapping the allocated memory, removing the cacheability status
> from the allocated pages.
>
> This brings up an interesting point: presumably, it's been tested, and
> people are happy with the performance it's giving, inspite of it not
> returning cacheable memory... or maybe it hasn't been tested that much?
>

As Catalin correctly pointed out, I always consider that this patch:


> --
> FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
> according to speedtest.net.

2015-07-03 11:27:52

by Lorenzo Nava

[permalink] [raw]
Subject: Re: [PATCH v5] arm DMA: Fix allocation from CMA for coherent DMA

On Fri, Jul 3, 2015 at 11:03 AM, Russell King - ARM Linux
<[email protected]> wrote:
> On Wed, Jul 01, 2015 at 12:12:51PM +0100, Catalin Marinas wrote:
>> On Tue, Jun 30, 2015 at 11:29:06PM +0200, Lorenzo Nava wrote:
>> > This patch allows the use of CMA for DMA coherent memory allocation.
>> > At the moment if the input parameter "is_coherent" is set to true
>> > the allocation is not made using the CMA, which I think is not the
>> > desired behaviour.
>> >
>> > Signed-off-by: Lorenzo Nava <[email protected]>
>> > Reviewed-by: Catalin Marinas <[email protected]>
>>
>> If Russell doesn't have any objections, you can send the patch to
>> his patch system. See here for more information:
>
> I'm left wondering whether this patch is really want Lorenzo wants.
> From my reading of it, while this has the effect of allocating from
> CMA for coherent devices, it's no different from the non-coherent
> case, because by calling __alloc_from_contiguous(), we end up
> remapping the allocated memory, removing the cacheability status
> from the allocated pages.
>
> This brings up an interesting point: presumably, it's been tested, and
> people are happy with the performance it's giving, inspite of it not
> returning cacheable memory... or maybe it hasn't been tested that much?
>

As Catalin correctly pointed out, I always consider that this patch:

https://lkml.org/lkml/2015/5/7/512

has been applied. The memory returned by mmap is then cacheable and
preserve the attributes used during allocation.
Of course, without that patch, mine is not working at all.

(Sorry for the previous mail which has been sent accidentally).

Lorenzo

2015-07-06 19:49:59

by Lorenzo Nava

[permalink] [raw]
Subject: Re: [PATCH v5] arm DMA: Fix allocation from CMA for coherent DMA

On Fri, Jul 3, 2015 at 11:03 AM, Russell King - ARM Linux
<[email protected]> wrote:
> On Wed, Jul 01, 2015 at 12:12:51PM +0100, Catalin Marinas wrote:
>> On Tue, Jun 30, 2015 at 11:29:06PM +0200, Lorenzo Nava wrote:
>> > This patch allows the use of CMA for DMA coherent memory allocation.
>> > At the moment if the input parameter "is_coherent" is set to true
>> > the allocation is not made using the CMA, which I think is not the
>> > desired behaviour.
>> >
>> > Signed-off-by: Lorenzo Nava <[email protected]>
>> > Reviewed-by: Catalin Marinas <[email protected]>
>>
>> If Russell doesn't have any objections, you can send the patch to
>> his patch system. See here for more information:
>
> I'm left wondering whether this patch is really want Lorenzo wants.
> From my reading of it, while this has the effect of allocating from
> CMA for coherent devices, it's no different from the non-coherent
> case, because by calling __alloc_from_contiguous(), we end up
> remapping the allocated memory, removing the cacheability status
> from the allocated pages.
>
> This brings up an interesting point: presumably, it's been tested, and
> people are happy with the performance it's giving, inspite of it not
> returning cacheable memory... or maybe it hasn't been tested that much?

In the weekend I've run the test once again and what I got is that:

allocation non-coherent memory (CMA) -> prot = 0x647 (bufferable, non cacheable)
allocation coherent memory (with patch, CMA) -> prot = 0x65F (writealloc)

mmap of coherent memory (without mmap patched) -> prot = 0x707
(bufferable, non cacheable) -> ERROR
mmap of coherent memory (with mmap patch) -> prot = 0x71F (writealloc) -> OK

So, it looks like the page attributes are ok from the test I ran. I
didn't repeat the performance tests on the mapped memory, but they
were ok for me as well.

I've submitted the patch to your system.
If you have any other objections or any feedback, please let me know.

Lorenzo

2015-07-18 11:54:45

by Lorenzo Nava

[permalink] [raw]
Subject: Re: [PATCH v5] arm DMA: Fix allocation from CMA for coherent DMA

On Fri, Jul 3, 2015 at 11:03 AM, Russell King - ARM Linux
<[email protected]> wrote:
> On Wed, Jul 01, 2015 at 12:12:51PM +0100, Catalin Marinas wrote:
>> On Tue, Jun 30, 2015 at 11:29:06PM +0200, Lorenzo Nava wrote:
>> > This patch allows the use of CMA for DMA coherent memory allocation.
>> > At the moment if the input parameter "is_coherent" is set to true
>> > the allocation is not made using the CMA, which I think is not the
>> > desired behaviour.
>> >
>> > Signed-off-by: Lorenzo Nava <[email protected]>
>> > Reviewed-by: Catalin Marinas <[email protected]>
>>
>> If Russell doesn't have any objections, you can send the patch to
>> his patch system. See here for more information:
>
> I'm left wondering whether this patch is really want Lorenzo wants.
> From my reading of it, while this has the effect of allocating from
> CMA for coherent devices, it's no different from the non-coherent
> case, because by calling __alloc_from_contiguous(), we end up
> remapping the allocated memory, removing the cacheability status
> from the allocated pages.
>
> This brings up an interesting point: presumably, it's been tested, and
> people are happy with the performance it's giving, inspite of it not
> returning cacheable memory... or maybe it hasn't been tested that much?
>
Hello Russell,

I was wondering if you have any other observations about this patch or
if you need any further investigation about some aspects.
It's not clear to me if the last comments from me and Catalin answer
your questions or if you see any other problem that we didn't
consider.

Thank you.
Lorenzo