Hello,
Some devices with an IOMMU might, for certain use cases, require DMA
buffers that are contiguous in physical memory. Such use cases appear,
for example, in the DRM subsystem when one wants to improve performance
or use secure buffer protection.
I would like to ask whether adding a new attribute, as proposed in this
RFC, is a good idea. I feel that it might end up being an attribute used
by just a single driver, but I would like to know your opinion. Should
we look for another solution?
Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
Marek Szyprowski (2):
common: DMA-mapping: add DMA_ATTR_FORCE_CONTIGUOUS attribute
ARM: dma-mapping: add support for DMA_ATTR_FORCE_CONTIGUOUS attribute
Documentation/DMA-attributes.txt | 9 +++++++++
arch/arm/mm/dma-mapping.c | 41 ++++++++++++++++++++++++++++++--------
include/linux/dma-attrs.h | 1 +
3 files changed, 43 insertions(+), 8 deletions(-)
--
1.7.9.5
This patch adds support for the DMA_ATTR_FORCE_CONTIGUOUS attribute in
the IOMMU-aware implementation of dma_alloc_attrs(). The Contiguous
Memory Allocator (CMA) is used to allocate physically contiguous buffers.
Signed-off-by: Marek Szyprowski <[email protected]>
---
arch/arm/mm/dma-mapping.c | 41 +++++++++++++++++++++++++++++++++--------
1 file changed, 33 insertions(+), 8 deletions(-)
diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
index 477a2d2..583a302 100644
--- a/arch/arm/mm/dma-mapping.c
+++ b/arch/arm/mm/dma-mapping.c
@@ -1036,7 +1036,8 @@ static inline void __free_iova(struct dma_iommu_mapping *mapping,
spin_unlock_irqrestore(&mapping->lock, flags);
}
-static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t gfp)
+static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
+ gfp_t gfp, struct dma_attrs *attrs)
{
struct page **pages;
int count = size >> PAGE_SHIFT;
@@ -1050,6 +1051,23 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size, gfp_t
if (!pages)
return NULL;
+	if (dma_get_attr(DMA_ATTR_FORCE_CONTIGUOUS, attrs)) {
+ unsigned long order = get_order(size);
+ struct page *page;
+
+ page = dma_alloc_from_contiguous(dev, count, order);
+ if (!page)
+ goto error;
+
+ __dma_clear_buffer(page, size);
+
+ for (i = 0; i < count; i++)
+ pages[i] = page + i;
+
+ return pages;
+ }
+
while (count) {
int j, order = __fls(count);
@@ -1083,14 +1101,21 @@ error:
return NULL;
}
-static int __iommu_free_buffer(struct device *dev, struct page **pages, size_t size)
+static int __iommu_free_buffer(struct device *dev, struct page **pages,
+ size_t size, struct dma_attrs *attrs)
{
int count = size >> PAGE_SHIFT;
int array_size = count * sizeof(struct page *);
int i;
- for (i = 0; i < count; i++)
- if (pages[i])
- __free_pages(pages[i], 0);
+
+ if (dma_get_attr(DMA_ATTR_FORCE_CONTIGUOUS, attrs)) {
+ dma_release_from_contiguous(dev, pages[0], count);
+ } else {
+ for (i = 0; i < count; i++)
+ if (pages[i])
+ __free_pages(pages[i], 0);
+ }
+
if (array_size <= PAGE_SIZE)
kfree(pages);
else
@@ -1252,7 +1277,7 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
if (gfp & GFP_ATOMIC)
return __iommu_alloc_atomic(dev, size, handle);
- pages = __iommu_alloc_buffer(dev, size, gfp);
+ pages = __iommu_alloc_buffer(dev, size, gfp, attrs);
if (!pages)
return NULL;
@@ -1273,7 +1298,7 @@ static void *arm_iommu_alloc_attrs(struct device *dev, size_t size,
err_mapping:
__iommu_remove_mapping(dev, *handle, size);
err_buffer:
- __iommu_free_buffer(dev, pages, size);
+ __iommu_free_buffer(dev, pages, size, attrs);
return NULL;
}
@@ -1329,7 +1354,7 @@ void arm_iommu_free_attrs(struct device *dev, size_t size, void *cpu_addr,
}
__iommu_remove_mapping(dev, handle, size);
- __iommu_free_buffer(dev, pages, size);
+ __iommu_free_buffer(dev, pages, size, attrs);
}
static int arm_iommu_get_sgtable(struct device *dev, struct sg_table *sgt,
--
1.7.9.5
This patch adds the DMA_ATTR_FORCE_CONTIGUOUS attribute to the
DMA-mapping subsystem.
By default the DMA-mapping subsystem is allowed to assemble the buffer
allocated by the dma_alloc_attrs() function from individual pages if it
can be mapped as a contiguous chunk into the device's DMA address
space. By specifying this attribute, the allocated buffer is forced to
be contiguous in physical memory as well.
Signed-off-by: Marek Szyprowski <[email protected]>
---
Documentation/DMA-attributes.txt | 9 +++++++++
include/linux/dma-attrs.h | 1 +
2 files changed, 10 insertions(+)
diff --git a/Documentation/DMA-attributes.txt b/Documentation/DMA-attributes.txt
index f503090..e59480d 100644
--- a/Documentation/DMA-attributes.txt
+++ b/Documentation/DMA-attributes.txt
@@ -91,3 +91,12 @@ transferred to 'device' domain. This attribute can be also used for
dma_unmap_{single,page,sg} functions family to force buffer to stay in
device domain after releasing a mapping for it. Use this attribute with
care!
+
+DMA_ATTR_FORCE_CONTIGUOUS
+-------------------------
+
+By default the DMA-mapping subsystem is allowed to assemble the buffer
+allocated by the dma_alloc_attrs() function from individual pages if it
+can be mapped as a contiguous chunk into the device's DMA address
+space. By specifying this attribute, the allocated buffer is forced to
+be contiguous in physical memory as well.
diff --git a/include/linux/dma-attrs.h b/include/linux/dma-attrs.h
index f83f793..c8e1831 100644
--- a/include/linux/dma-attrs.h
+++ b/include/linux/dma-attrs.h
@@ -17,6 +17,7 @@ enum dma_attr {
DMA_ATTR_NON_CONSISTENT,
DMA_ATTR_NO_KERNEL_MAPPING,
DMA_ATTR_SKIP_CPU_SYNC,
+ DMA_ATTR_FORCE_CONTIGUOUS,
DMA_ATTR_MAX,
};
--
1.7.9.5
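For a driver author, the intended use of the new attribute would look
roughly like this. This is a minimal, untested sketch against the
struct dma_attrs API used by this series; the device pointer, size and
error handling are placeholders:

```c
#include <linux/dma-attrs.h>
#include <linux/dma-mapping.h>

static void *alloc_contig_dma_buffer(struct device *dev, size_t size,
				     dma_addr_t *dma_handle)
{
	DEFINE_DMA_ATTRS(attrs);

	/* Ask the IOMMU-aware allocator for physically contiguous memory. */
	dma_set_attr(DMA_ATTR_FORCE_CONTIGUOUS, &attrs);

	/* With the attribute set, the buffer is allocated from CMA and
	 * is contiguous in both the device's DMA space and physical memory. */
	return dma_alloc_attrs(dev, size, dma_handle, GFP_KERNEL, &attrs);
}
```

Note that the same attrs would have to be passed to dma_free_attrs()
when releasing the buffer, so that the IOMMU implementation frees it
back to CMA rather than page by page.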
2012/10/15 Marek Szyprowski <[email protected]>:
> Hello,
>
> Some devices with an IOMMU might, for certain use cases, require DMA
> buffers that are contiguous in physical memory. Such use cases appear,
> for example, in the DRM subsystem when one wants to improve performance
> or use secure buffer protection.
>
> I would like to ask whether adding a new attribute, as proposed in this
> RFC, is a good idea. I feel that it might end up being an attribute used
> by just a single driver, but I would like to know your opinion. Should
> we look for another solution?
>
In addition, we have implemented dma-mapping-based IOMMU support for
the Exynos DRM driver on top of this patch set, so it has been tested
with the IOMMU-enabled Exynos DRM driver and works fine. This feature
is actually needed for secure modes such as TrustZone: on Exynos SoCs
(and possibly OMAP as well) the memory region used for secure mode must
be physically contiguous, but the dma-mapping framework currently does
not guarantee physically contiguous allocation, so this patch set makes
it possible.
Tested-by: Inki Dae <[email protected]>
Reviewed-by: Inki Dae <[email protected]>
Thanks,
Inki Dae
> [...]
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to [email protected]. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: [email protected]
Hi Inki/Marek,
On Tue, 16 Oct 2012 02:50:16 +0200
Inki Dae <[email protected]> wrote:
> [...]
>
> In addition, we have implemented dma-mapping-based IOMMU support for
> the Exynos DRM driver on top of this patch set, so it has been tested
> with the IOMMU-enabled Exynos DRM driver and works fine. This feature
> is actually needed for secure modes such as TrustZone: on Exynos SoCs
> (and possibly OMAP as well) the memory region used for secure mode must
> be physically contiguous, but the dma-mapping framework currently does
> not guarantee physically contiguous allocation, so this patch set makes
> it possible.
Agreed that contiguous memory allocation is necessary for us too.
In addition to those contiguous/discontiguous page allocations, is
there any way to _import_ anonymous pages allocated by a process so
they can be used with the dma-mapping API later?
I'm considering the following scenario: a user process allocates a
buffer with malloc() in advance, and then asks some driver to convert
that buffer into IOMMU'able/DMA'able pages later. In this case, the
pages are discontiguous, and they may not even be allocated yet at
malloc()/mmap() time.
On Tue, Oct 16, 2012 at 09:04:34AM +0300, Hiroshi Doyu wrote:
> In addition to those contiguous/discontiguous page allocations, is
> there any way to _import_ anonymous pages allocated by a process so
> they can be used with the dma-mapping API later?
>
> I'm considering the following scenario: a user process allocates a
> buffer with malloc() in advance, and then asks some driver to convert
> that buffer into IOMMU'able/DMA'able pages later. In this case, the
> pages are discontiguous, and they may not even be allocated yet at
> malloc()/mmap() time.
That situation is covered. It's the streaming API you're wanting for that.
dma_map_sg() - but you may need additional cache handling via
flush_dcache_page() to ensure that your code is safe for all CPU cache
architectures.
Remember that pages allocated into userspace will be cacheable, so a cache
flush is required before they can be DMA'd. Hence the streaming API.
On Mon, Oct 15, 2012 at 4:03 PM, Marek Szyprowski
<[email protected]> wrote:
> Some devices with an IOMMU might, for certain use cases, require DMA
> buffers that are contiguous in physical memory. Such use cases appear,
> for example, in the DRM subsystem when one wants to improve performance
> or use secure buffer protection.
>
> I would like to ask whether adding a new attribute, as proposed in this
> RFC, is a good idea. I feel that it might end up being an attribute used
> by just a single driver, but I would like to know your opinion. Should
> we look for another solution?
One thing to consider is that up to now all allocation constraints
have been stored somewhere in struct device, either in the dma
attributes (for the more generic stuff) or somewhere in platform
specific data (e.g. for special cma pools). The design of dma_buf
relies on this: the exporter/buffer allocator only sees the struct
device *devs that want to take part in sharing a given buffer. With
this proposal some of these allocation constraints get moved to alloc
time and aren't visible in the struct device any more. Now, I know that
dma_buf isn't really there yet and no one has yet implemented a
generic exporter that would allocate the dma_buf at the right spot for
all cases, but I think we should consider this so that we don't paint
ourselves into an ugly API corner.
Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
On 16 October 2012 09:59, Russell King - ARM Linux
<[email protected]> wrote:
> On Tue, Oct 16, 2012 at 09:04:34AM +0300, Hiroshi Doyu wrote:
>> In addition to those contiguous/discontiguous page allocations, is
>> there any way to _import_ anonymous pages allocated by a process so
>> they can be used with the dma-mapping API later?
>>
>> I'm considering the following scenario: a user process allocates a
>> buffer with malloc() in advance, and then asks some driver to convert
>> that buffer into IOMMU'able/DMA'able pages later. In this case, the
>> pages are discontiguous, and they may not even be allocated yet at
>> malloc()/mmap() time.
>
> That situation is covered. It's the streaming API you're wanting for that.
> dma_map_sg() - but you may need additional cache handling via
> flush_dcache_page() to ensure that your code is safe for all CPU cache
> architectures.
For user-allocated pages you first need get_user_pages() to make sure
they are in memory (and will stay there). This function also calls
flush_dcache_page(). Then you can build the sg list for dma_map_sg().
--
Catalin
Hi Hiroshi,
2012/10/16 Hiroshi Doyu <[email protected]>:
> Hi Inki/Marek,
>
> [...]
>
> Agreed that contiguous memory allocation is necessary for us too.
>
> In addition to those contiguous/discontiguous page allocations, is
> there any way to _import_ anonymous pages allocated by a process so
> they can be used with the dma-mapping API later?
>
> I'm considering the following scenario: a user process allocates a
> buffer with malloc() in advance, and then asks some driver to convert
> that buffer into IOMMU'able/DMA'able pages later. In this case, the
> pages are discontiguous, and they may not even be allocated yet at
> malloc()/mmap() time.
>
I'm not sure I understand what you mean but we had already tried this
way and for this, you can refer to below link,
http://www.mail-archive.com/[email protected]/msg22555.html
but this way had been pointed out by drm guys because the pages could
be used through gem object after that pages had been freed by free()
anyway their pointing was reasonable and I'm trying another way, this
is the way that the pages to user space has same life time with dma
operation. in other word, if dma completed access to that pages then
also that pages will be freed. actually drm-based via driver of
mainline kernel is using same way
Hi Russell,
Russell King - ARM Linux <[email protected]> wrote @ Tue, 16 Oct 2012 10:59:28 +0200:
> On Tue, Oct 16, 2012 at 09:04:34AM +0300, Hiroshi Doyu wrote:
> > In addition to those contiguous/discontiguous page allocations, is
> > there any way to _import_ anonymous pages allocated by a process so
> > they can be used with the dma-mapping API later?
> >
> > I'm considering the following scenario: a user process allocates a
> > buffer with malloc() in advance, and then asks some driver to convert
> > that buffer into IOMMU'able/DMA'able pages later. In this case, the
> > pages are discontiguous, and they may not even be allocated yet at
> > malloc()/mmap() time.
>
> That situation is covered. It's the streaming API you're wanting for that.
> dma_map_sg() - but you may need additional cache handling via
> flush_dcache_page() to ensure that your code is safe for all CPU cache
> architectures.
>
> Remember that pages allocated into userspace will be cacheable, so a cache
> flush is required before they can be DMA'd. Hence the streaming
> API.
Is the syscall cacheflush() supposed to be the knob for that?
Or are there other ones that give more precise control ("clean",
"invalidate", and "flush") from userland in a generic way?
On Tue, Oct 16, 2012 at 07:12:49PM +0900, Inki Dae wrote:
> Hi Hiroshi,
>
> I'm not sure I understand what you mean but we had already tried this
> way and for this, you can refer to below link,
> http://www.mail-archive.com/[email protected]/msg22555.html
>
> but this way had been pointed out by drm guys because the pages could
> be used through gem object after that pages had been freed by free()
> anyway their pointing was reasonable and I'm trying another way, this
> is the way that the pages to user space has same life time with dma
> operation. in other word, if dma completed access to that pages then
> also that pages will be freed. actually drm-based via driver of
> mainline kernel is using same way
I don't know about Hiroshi, but the above "sentence" - and I mean the 7
line sentence - is very difficult to understand and wears readers out.
If your GPU hardware has a MMU, then the problem of dealing with userspace
pages is very easy. Do it the same way that the i915 driver and the rest
of DRM does. Use shmem backed memory.
I'm doing that for the Dove DRM driver and it works a real treat, and as
the pages are backed by page cache pages, you can use all the normal
page refcounting on them to prevent them being freed until your DMA has
completed. All my X pixmaps are shmem backed drm objects, except for
the scanout buffers which are dumb drm objects (because they must be
contiguous.)
In fact, get_user_pages() will take the reference for you before you pass
them over to dma_map_sg(). On completion of DMA, you just need to use
dma_unmap_sg() and release each page.
If you don't want to use get_user_pages() (which other drivers don't) then
you need to following the i915 example and get each page out of shmem
individually.
(My situation on the Dove hardware is a little different, because the
kernel DRM driver isn't involved with the GPU - it merely provides the
memory for pixmaps. The GPU software stack, being a chunk of closed
source userspace library with open source kernel driver, means that
things are more complicated; the kernel side GPU driver uses
get_user_pages() to pin them prior to building the GPU's MMU table.)
On Tue, Oct 16, 2012 at 12:27:55PM +0200, Hiroshi Doyu wrote:
> Hi Russell,
>
> Russell King - ARM Linux <[email protected]> wrote @ Tue, 16 Oct 2012 10:59:28 +0200:
>
> > On Tue, Oct 16, 2012 at 09:04:34AM +0300, Hiroshi Doyu wrote:
> > > In addition to those contiguous/discontiguous page allocations, is
> > > there any way to _import_ anonymous pages allocated by a process so
> > > they can be used with the dma-mapping API later?
> > >
> > > I'm considering the following scenario: a user process allocates a
> > > buffer with malloc() in advance, and then asks some driver to convert
> > > that buffer into IOMMU'able/DMA'able pages later. In this case, the
> > > pages are discontiguous, and they may not even be allocated yet at
> > > malloc()/mmap() time.
> >
> > That situation is covered. It's the streaming API you're wanting for that.
> > dma_map_sg() - but you may need additional cache handling via
> > flush_dcache_page() to ensure that your code is safe for all CPU cache
> > architectures.
> >
> > Remember that pages allocated into userspace will be cacheable, so a cache
> > flush is required before they can be DMA'd. Hence the streaming
> > API.
>
> Is the syscall cacheflush() supposed to be the knob for that?
>
> Or are there other ones that give more precise control ("clean",
> "invalidate", and "flush") from userland in a generic way?
No other syscalls are required - this sequence will do everything you
need to perform DMA on pages mapped into userspace:
get_user_pages()
convert array of struct page * to scatterlist
dma_map_sg()
perform DMA
dma_unmap_sg()
for each page in the sg list
page_cache_release(page);
If you get the list of pages some other way (eg, via shmem_read_mapping_page_gfp)
then additional maintenance may be required (though that may be a bug in
shmem - remember this stuff hasn't been well tested on ARM before.)
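As a rough illustration of the sequence above, a driver-side sketch
might look like the following. This is untested and written against the
kernel API of this era; the device, user address and page count are
caller-supplied placeholders, and error unwinding is elided for brevity:

```c
#include <linux/mm.h>
#include <linux/pagemap.h>
#include <linux/scatterlist.h>
#include <linux/dma-mapping.h>

static int dma_from_user_buffer(struct device *dev, unsigned long uaddr,
				int nr_pages, struct page **pages,
				struct scatterlist *sg)
{
	int i, got, mapped;

	/* Pin the anonymous pages; this takes a reference on each page
	 * and calls flush_dcache_page() for us. */
	down_read(&current->mm->mmap_sem);
	got = get_user_pages(current, current->mm, uaddr, nr_pages,
			     1 /* write */, 0, pages, NULL);
	up_read(&current->mm->mmap_sem);
	if (got < nr_pages)
		return -EFAULT;	/* (real code must release the 'got' pages) */

	/* Convert the array of struct page * to a scatterlist ... */
	sg_init_table(sg, nr_pages);
	for (i = 0; i < nr_pages; i++)
		sg_set_page(&sg[i], pages[i], PAGE_SIZE, 0);

	/* ... and hand it to the streaming API, which performs the
	 * necessary cache maintenance. */
	mapped = dma_map_sg(dev, sg, nr_pages, DMA_FROM_DEVICE);

	/* ... perform the DMA transfer here ... */

	dma_unmap_sg(dev, sg, nr_pages, DMA_FROM_DEVICE);
	for (i = 0; i < nr_pages; i++)
		page_cache_release(pages[i]);
	return 0;
}
```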
Hi Russell,
2012/10/16 Russell King - ARM Linux <[email protected]>:
> On Tue, Oct 16, 2012 at 07:12:49PM +0900, Inki Dae wrote:
>> Hi Hiroshi,
>>
>> I'm not sure I understand what you mean but we had already tried this
>> way and for this, you can refer to below link,
>> http://www.mail-archive.com/[email protected]/msg22555.html
>>
>> but this way had been pointed out by drm guys because the pages could
>> be used through gem object after that pages had been freed by free()
>> anyway their pointing was reasonable and I'm trying another way, this
>> is the way that the pages to user space has same life time with dma
>> operation. in other word, if dma completed access to that pages then
>> also that pages will be freed. actually drm-based via driver of
>> mainline kernel is using same way
>
> I don't know about Hiroshi, but the above "sentence" - and I mean the 7
> line sentence - is very difficult to understand and wears readers out.
>
Sorry for this. Please see below comments.
> If your GPU hardware has a MMU, then the problem of dealing with userspace
> pages is very easy. Do it the same way that the i915 driver and the rest
> of DRM does. Use shmem backed memory.
>
> I'm doing that for the Dove DRM driver and it works a real treat, and as
> the pages are backed by page cache pages, you can use all the normal
> page refcounting on them to prevent them being freed until your DMA has
> completed. All my X pixmaps are shmem backed drm objects, except for
> the scanout buffers which are dumb drm objects (because they must be
> contiguous.)
>
> In fact, get_user_pages() will take the reference for you before you pass
> them over to dma_map_sg(). On completion of DMA, you just need to use
> dma_unmap_sg() and release each page.
>
It's exactly the same as ours. Besides, as I understand it,
get_user_pages() leaves a page with two reference counts if the user
process has never accessed the malloc()'d region before. Then, if the
user calls free(), the page reference count drops to 1, and it drops
to 0 with the put_page() call (and the reverse holds as well). This is
how the backing pages are used by DMA and then freed. dma_map_sg()
just performs the proper cache maintenance and maps these pages into
the IOMMU page table. I may be missing something here.
Thanks,
Inki Dae
> If you don't want to use get_user_pages() (which other drivers don't) then
> you need to following the i915 example and get each page out of shmem
> individually.
>
> (My situation on the Dove hardware is a little different, because the
> kernel DRM driver isn't involved with the GPU - it merely provides the
> memory for pixmaps. The GPU software stack, being a chunk of closed
> source userspace library with open source kernel driver, means that
> things are more complicated; the kernel side GPU driver uses
> get_user_pages() to pin them prior to building the GPU's MMU table.)
>
Hi Inki,
Inki Dae <[email protected]> wrote @ Tue, 16 Oct 2012 12:12:49 +0200:
> Hi Hiroshi,
>
> 2012/10/16 Hiroshi Doyu <[email protected]>:
> > [...]
>
> I'm not sure I understand what you mean but we had already tried this
> way and for this, you can refer to below link,
> http://www.mail-archive.com/[email protected]/msg22555.html
The above patch doesn't seem to contain much platform/SoC-specific
code; rather, it could be common across other SoCs as well. Is there
any plan to make it more generic so that it can be used by other DRM
drivers?
Hi Hiroshi,
On 16 Oct 2012, at 11:13, Hiroshi Doyu <[email protected]> wrote:
> Hi Inki,
>
> Inki Dae <[email protected]> wrote @ Tue, 16 Oct 2012 12:12:49 +0200:
>
>> [...]
>> I'm not sure I understand what you mean but we had already tried this
>> way and for this, you can refer to below link,
>> http://www.mail-archive.com/[email protected]/msg22555.html
>
> The above patch doesn't seem to contain much platform/SoC-specific
> code; rather, it could be common across other SoCs as well. Is there
> any plan to make it more generic so that it can be used by other DRM
> drivers?
>
Right, the above patch has no platform/SoC-specific code, but it
doesn't use the dma-mapping API. In any case, we should refrain from
that approach, because the GEM object could still be used and shared
with other processes even after the user process has freed the region
allocated by malloc().
Our new patch in progress resolves this issue, in a way similar to the
DRM-based via driver in the mainline kernel. That patch is quite
platform/SoC-specific rather than intended for common use; the backing
pages can be used only by the 2D GPU's DMA.
Thanks,
Inki Dae