2019-05-24 04:09:57

by Nicolin Chen

Subject: [PATCH v3 0/2] Optimize dma_*_from_contiguous calls

[ Per the discussion at v1, we decided to add two new functions and start
replacing callers one by one. This series only touches the dma-direct
part. And instead of merging the two PATCHes, I still keep them separate
so that we may easily revert PATCH-2 if anything bad happens again, as
it did last time -- PATCH-1 is supposed to be a safe cleanup. ]

This series of patches tries to optimize dma_*_from_contiguous calls:
PATCH-1 abstracts two new functions and applies them to the dma-direct.c file.
PATCH-2 keeps single-page allocations out of the CMA area and reduces
fragmentation.

Please check their commit messages for detailed changelogs.

Nicolin Chen (2):
dma-contiguous: Abstract dma_{alloc,free}_contiguous()
dma-contiguous: Use fallback alloc_pages for single pages

include/linux/dma-contiguous.h | 11 +++++++
kernel/dma/contiguous.c | 57 ++++++++++++++++++++++++++++++++++
kernel/dma/direct.c | 24 +++-----------
3 files changed, 72 insertions(+), 20 deletions(-)

--
2.17.1


2019-05-24 04:10:10

by Nicolin Chen

Subject: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()

Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
have very simple implementations, but they require callers to pass
certain parameters like count and align, plus a boolean derived from
__GFP_NOWARN in the allocation flags. So every call site duplicates
similar work:
  /* A piece of example */
  unsigned long order = get_order(size);
  size_t count = size >> PAGE_SHIFT;
  page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
  [...]
  dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);

Additionally, as CMA can be used only in a context that permits
sleeping, most callers do a gfpflags_allow_blocking() check and fall
back to a normal page allocation when it returns false:
  /* A piece of example */
  if (gfpflags_allow_blocking(flag))
      page = dma_alloc_from_contiguous();
  if (!page)
      page = alloc_pages();
  [...]
  if (!dma_release_from_contiguous(dev, page, count))
      __free_pages(page, get_order(size));

So this patch simplifies those function calls by abstracting these
operations into the two new functions: dma_{alloc,free}_contiguous.
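
With them, the call sites above reduce to roughly the following (just
a sketch of the intended usage; the dma-direct hunk below shows the
actual conversion):
  /* A piece of example */
  page = dma_alloc_contiguous(dev, size, gfp);
  [...]
  dma_free_contiguous(dev, page, size);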

As some callers of dma_{alloc,release}_from_contiguous() might be
complicated, this patch only applies the two new functions to
kernel/dma/direct.c as an initial step.

Suggested-by: Christoph Hellwig <[email protected]>
Signed-off-by: Nicolin Chen <[email protected]>
---
Changelog
v2->v3:
* Added missing "static inline" in header file to fix build error.
v1->v2:
* Added new functions beside the old ones so we can replace callers
one by one later.
* Applied new functions to dma/direct.c only, because it's the best
example caller to apply and should be safe with the new functions.

include/linux/dma-contiguous.h | 11 ++++++++
kernel/dma/contiguous.c | 48 ++++++++++++++++++++++++++++++++++
kernel/dma/direct.c | 24 +++--------------
3 files changed, 63 insertions(+), 20 deletions(-)

diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
index f247e8aa5e3d..00a370c1c140 100644
--- a/include/linux/dma-contiguous.h
+++ b/include/linux/dma-contiguous.h
@@ -115,6 +115,8 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
unsigned int order, bool no_warn);
bool dma_release_from_contiguous(struct device *dev, struct page *pages,
int count);
+struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp);
+void dma_free_contiguous(struct device *dev, struct page *page, size_t size);

#else

@@ -157,6 +159,15 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
return false;
}

+static inline
+struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
+{
+ return NULL;
+}
+
+static inline
+void dma_free_contiguous(struct device *dev, struct page *page, size_t size) { }
+
#endif

#endif
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index b2a87905846d..21f39a6cb04f 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -214,6 +214,54 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
return cma_release(dev_get_cma_area(dev), pages, count);
}

+/**
+ * dma_alloc_contiguous() - allocate contiguous pages
+ * @dev: Pointer to device for which the allocation is performed.
+ * @size: Requested allocation size.
+ * @gfp: Allocation flags.
+ *
+ * This function allocates contiguous memory buffer for specified device. It
+ * first tries to use device specific contiguous memory area if available or
+ * the default global one, then tries a fallback allocation of normal pages.
+ */
+struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
+{
+ int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
+ size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+ size_t align = get_order(PAGE_ALIGN(size));
+ struct cma *cma = dev_get_cma_area(dev);
+ struct page *page = NULL;
+
+ /* CMA can be used only in the context which permits sleeping */
+ if (cma && gfpflags_allow_blocking(gfp)) {
+ align = min_t(size_t, align, CONFIG_CMA_ALIGNMENT);
+ page = cma_alloc(cma, count, align, gfp & __GFP_NOWARN);
+ }
+
+ /* Fallback allocation of normal pages */
+ if (!page)
+ page = alloc_pages_node(node, gfp, align);
+
+ return page;
+}
+
+/**
+ * dma_free_contiguous() - release allocated pages
+ * @dev: Pointer to device for which the pages were allocated.
+ * @page: Pointer to the allocated pages.
+ * @size: Size of allocated pages.
+ *
+ * This function releases memory allocated by dma_alloc_contiguous(). As the
+ * cma_release returns false when provided pages do not belong to contiguous
+ * area and true otherwise, this function then does a fallback __free_pages()
+ * upon a false-return.
+ */
+void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
+{
+ if (!cma_release(dev_get_cma_area(dev), page, size >> PAGE_SHIFT))
+ __free_pages(page, get_order(size));
+}
+
/*
* Support for reserved memory regions defined in device tree
*/
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 2c2772e9702a..0816c1e8b05a 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -96,8 +96,6 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs)
{
- unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
- int page_order = get_order(size);
struct page *page = NULL;
u64 phys_mask;

@@ -109,20 +107,9 @@ struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
gfp |= __dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
&phys_mask);
again:
- /* CMA can be used only in the context which permits sleeping */
- if (gfpflags_allow_blocking(gfp)) {
- page = dma_alloc_from_contiguous(dev, count, page_order,
- gfp & __GFP_NOWARN);
- if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
- dma_release_from_contiguous(dev, page, count);
- page = NULL;
- }
- }
- if (!page)
- page = alloc_pages_node(dev_to_node(dev), gfp, page_order);
-
+ page = dma_alloc_contiguous(dev, size, gfp);
if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
- __free_pages(page, page_order);
+ dma_free_contiguous(dev, page, size);
page = NULL;

if (IS_ENABLED(CONFIG_ZONE_DMA32) &&
@@ -154,7 +141,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,
if (PageHighMem(page)) {
/*
* Depending on the cma= arguments and per-arch setup
- * dma_alloc_from_contiguous could return highmem pages.
+ * dma_alloc_contiguous could return highmem pages.
* Without remapping there is no way to return them here,
* so log an error and fail.
*/
@@ -176,10 +163,7 @@ void *dma_direct_alloc_pages(struct device *dev, size_t size,

void __dma_direct_free_pages(struct device *dev, size_t size, struct page *page)
{
- unsigned int count = PAGE_ALIGN(size) >> PAGE_SHIFT;
-
- if (!dma_release_from_contiguous(dev, page, count))
- __free_pages(page, get_order(size));
+ dma_free_contiguous(dev, page, size);
}

void dma_direct_free_pages(struct device *dev, size_t size, void *cpu_addr,
--
2.17.1

2019-05-24 04:11:15

by Nicolin Chen

Subject: [PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages

The addresses within a single page are always contiguous, so it is
not really necessary to allocate one single page from the CMA area.
Since the CMA area has a limited, predefined size, it may run out of
space in heavy use cases, where quite a lot of CMA pages might
otherwise be consumed by single-page allocations.

However, there is also a concern that a device might care where a
page comes from -- it might expect the page to come from its CMA area
and act differently if it doesn't.

This patch uses the fallback alloc_pages path, instead of one-page
allocations from the global CMA area, in case a device does not have
its own CMA area. This saves space in the global CMA area for larger
CMA allocations, and also reduces the CMA fragmentation that results
from trivial allocations.

Signed-off-by: Nicolin Chen <[email protected]>
---
kernel/dma/contiguous.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 21f39a6cb04f..6914b92d5c88 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -223,14 +223,23 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
* This function allocates contiguous memory buffer for specified device. It
* first tries to use device specific contiguous memory area if available or
* the default global one, then tries a fallback allocation of normal pages.
+ *
+ * Note that it bypasses one-page allocations from the global area, as the
+ * addresses within one page are always contiguous, so there is no need to
+ * waste CMA pages on those; it also helps reduce fragmentation.
*/
struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
{
int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
size_t align = get_order(PAGE_ALIGN(size));
- struct cma *cma = dev_get_cma_area(dev);
struct page *page = NULL;
+ struct cma *cma = NULL;
+
+ if (dev && dev->cma_area)
+ cma = dev->cma_area;
+ else if (count > 1)
+ cma = dma_contiguous_default_area;

/* CMA can be used only in the context which permits sleeping */
if (cma && gfpflags_allow_blocking(gfp)) {
--
2.17.1

2019-05-24 16:17:09

by Ira Weiny

Subject: Re: [PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages

On Thu, May 23, 2019 at 09:06:33PM -0700, Nicolin Chen wrote:
> The addresses within a single page are always contiguous, so it's
> not so necessary to always allocate one single page from CMA area.
> Since the CMA area has a limited predefined size of space, it may
> run out of space in heavy use cases, where there might be quite a
> lot CMA pages being allocated for single pages.
>
> However, there is also a concern that a device might care where a
> page comes from -- it might expect the page from CMA area and act
> differently if the page doesn't.

How does a device know, after this call, if a CMA area was used? From the
patches I figured a device should not care.

>
> This patch tries to use the fallback alloc_pages path, instead of
> one-page size allocations from the global CMA area in case that a
> device does not have its own CMA area. This'd save resources from
> the CMA global area for more CMA allocations, and also reduce CMA
> fragmentations resulted from trivial allocations.
>
> Signed-off-by: Nicolin Chen <[email protected]>
> ---
> kernel/dma/contiguous.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> index 21f39a6cb04f..6914b92d5c88 100644
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -223,14 +223,23 @@ bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> * This function allocates contiguous memory buffer for specified device. It
> * first tries to use device specific contiguous memory area if available or
> * the default global one, then tries a fallback allocation of normal pages.
> + *
> + * Note that it bypasses one-page allocations from the global area, as the
> + * addresses within one page are always contiguous, so there is no need to
> + * waste CMA pages on those; it also helps reduce fragmentation.
> */
> struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> {
> int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;
> size_t count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> size_t align = get_order(PAGE_ALIGN(size));
> - struct cma *cma = dev_get_cma_area(dev);
> struct page *page = NULL;
> + struct cma *cma = NULL;
> +
> + if (dev && dev->cma_area)
> + cma = dev->cma_area;
> + else if (count > 1)
> + cma = dma_contiguous_default_area;

Doesn't dev_get_cma_area() already do this?

Ira

>
> /* CMA can be used only in the context which permits sleeping */
> if (cma && gfpflags_allow_blocking(gfp)) {
> --
> 2.17.1
>

2019-05-24 20:07:23

by dann frazier

Subject: Re: [PATCH v3 0/2] Optimize dma_*_from_contiguous calls

On Thu, May 23, 2019 at 10:08 PM Nicolin Chen <[email protected]> wrote:
>
> [ Per discussion at v1, we decide to add two new functions and start
> replacing callers one by one. For this series, it only touches the
> dma-direct part. And instead of merging two PATCHes, I still keep
> them separate so that we may easily revert PATCH-2 if anything bad
> happens as last time -- PATCH-1 is supposed to be a safe cleanup. ]
>
> This series of patches try to optimize dma_*_from_contiguous calls:
> PATCH-1 abstracts two new functions and applies to dma-direct.c file.
> PATCH-2 saves single pages and reduce fragmentations from CMA area.
>
> Please check their commit messages for detail changelog.
>
> Nicolin Chen (2):
> dma-contiguous: Abstract dma_{alloc,free}_contiguous()
> dma-contiguous: Use fallback alloc_pages for single pages
>
> include/linux/dma-contiguous.h | 11 +++++++
> kernel/dma/contiguous.c | 57 ++++++++++++++++++++++++++++++++++
> kernel/dma/direct.c | 24 +++-----------
> 3 files changed, 72 insertions(+), 20 deletions(-)

Thanks Nicolin. Tested on a HiSilicon D06 system.

Tested-by: dann frazier <[email protected]>

2019-05-27 10:59:59

by Nicolin Chen

Subject: Re: [PATCH v3 2/2] dma-contiguous: Use fallback alloc_pages for single pages

Hi Ira,

On Fri, May 24, 2019 at 09:16:19AM -0700, Ira Weiny wrote:
> On Thu, May 23, 2019 at 09:06:33PM -0700, Nicolin Chen wrote:
> > The addresses within a single page are always contiguous, so it's
> > not so necessary to always allocate one single page from CMA area.
> > Since the CMA area has a limited predefined size of space, it may
> > run out of space in heavy use cases, where there might be quite a
> > lot CMA pages being allocated for single pages.
> >
> > However, there is also a concern that a device might care where a
> > page comes from -- it might expect the page from CMA area and act
> > differently if the page doesn't.
>
> How does a device know, after this call, if a CMA area was used? From the
> patches I figured a device should not care.

A device doesn't know. But that doesn't mean a device won't care
at all. There was a concern from Robin and Christoph about a corner
case where a device might act differently if the memory isn't in its
own CMA region. That's why we still let it use its device-specific
CMA area.

> > + if (dev && dev->cma_area)
> > + cma = dev->cma_area;
> > + else if (count > 1)
> > + cma = dma_contiguous_default_area;
>
> Doesn't dev_get_cma_area() already do this?

Partially, yes. But unwrapping it makes the program flow clearer in
my opinion. Actually, I should have mentioned that this patch was
also suggested by Christoph.
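
For context, dev_get_cma_area() is roughly:

    static inline struct cma *dev_get_cma_area(struct device *dev)
    {
            /* Prefer the device-specific area, else the global default */
            if (dev && dev->cma_area)
                    return dev->cma_area;
            return dma_contiguous_default_area;
    }

so it picks the per-device area or the global default, but it has no
notion of the count > 1 policy.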

Otherwise, it would need an override like:
    cma = dev_get_cma_area(dev);
    if (count > 1 && cma == dma_contiguous_default_area)
            cma = NULL;

Which doesn't look that bad though..

Thanks
Nicolin

2019-05-28 06:06:20

by Christoph Hellwig

Subject: Re: [PATCH v3 0/2] Optimize dma_*_from_contiguous calls

Thanks,

applied to dma-mapping for-next.

Can you also send a conversion of drivers/iommu/dma-iommu.c to your
new helpers against this tree?

http://git.infradead.org/users/hch/dma-mapping.git/shortlog/refs/heads/for-next

2019-05-29 18:37:27

by Nathan Chancellor

Subject: Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()

Hi Nicolin,

On Thu, May 23, 2019 at 09:06:32PM -0700, Nicolin Chen wrote:
> Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
> are very simply implemented, but requiring callers to pass certain
> parameters like count and align, and taking a boolean parameter to
> check __GFP_NOWARN in the allocation flags. So every function call
> duplicates similar work:
> /* A piece of example */
> unsigned long order = get_order(size);
> size_t count = size >> PAGE_SHIFT;
> page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
> [...]
> dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
>
> Additionally, as CMA can be used only in the context which permits
> sleeping, most of callers do a gfpflags_allow_blocking() check and
> a corresponding fallback allocation of normal pages upon any false
> result:
> /* A piece of example */
> if (gfpflags_allow_blocking(flag))
> page = dma_alloc_from_contiguous();
> if (!page)
> page = alloc_pages();
> [...]
> if (!dma_release_from_contiguous(dev, page, count))
> __free_pages(page, get_order(size));
>
> So this patch simplifies those function calls by abstracting these
> operations into the two new functions: dma_{alloc,free}_contiguous.
>
> As some callers of dma_{alloc,release}_from_contiguous() might be
> complicated, this patch just implements these two new functions to
> kernel/dma/direct.c only as an initial step.
>
> Suggested-by: Christoph Hellwig <[email protected]>
> Signed-off-by: Nicolin Chen <[email protected]>
> ---

This commit is causing boot failures in QEMU on x86_64 defconfig:

https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/203825363

Attached is a bisect log and a boot log with GCC (just to show it is not
a compiler thing).

My QEMU command line is:

qemu-system-x86_64 -m 512m \
-drive file=images/x86_64/rootfs.ext4,format=raw,if=ide \
-append 'console=ttyS0 root=/dev/sda' \
-nographic \
-kernel arch/x86_64/boot/bzImage

and the rootfs is available here:

https://github.com/ClangBuiltLinux/continuous-integration/raw/master/images/x86_64/rootfs.ext4

I haven't seen a report on this yet, so apologies if there is already a
fix in the works. Let me know if you need anything from me.

Cheers,
Nathan


Attachments:
bisect.log (2.36 kB)
boot.log (21.84 kB)

2019-05-29 22:51:45

by Nicolin Chen

Subject: Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()

Hi Nathan,

On Wed, May 29, 2019 at 11:35:46AM -0700, Nathan Chancellor wrote:
> This commit is causing boot failures in QEMU on x86_64 defconfig:
>
> https://travis-ci.com/ClangBuiltLinux/continuous-integration/jobs/203825363
>
> Attached is a bisect log and a boot log with GCC (just to show it is not
> a compiler thing).
>
> My QEMU command line is:
>
> qemu-system-x86_64 -m 512m \
> -drive file=images/x86_64/rootfs.ext4,format=raw,if=ide \
> -append 'console=ttyS0 root=/dev/sda' \
> -nographic \
> -kernel arch/x86_64/boot/bzImage
>
> and the rootfs is available here:
>
> https://github.com/ClangBuiltLinux/continuous-integration/raw/master/images/x86_64/rootfs.ext4

Thanks for reporting the bug.

I am able to repro the issue with the given command and rootfs. The
problem is that x86_64 has CONFIG_DMA_CMA=n, so the helper function is
an empty stub on x86_64, while dma-direct should be platform independent.

A simple fix is to add an alloc_pages_node() fallback for
!CONFIG_DMA_CMA. I'll submit a fix soon -- I need to figure out a good
way though. It seems that adding the fallback to the !CONFIG_DMA_CMA
version would cause
some recipe errors when building the kernel...
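
The rough idea for the !CONFIG_DMA_CMA stubs would be something like
this (a sketch only, not necessarily what gets merged; as noted, the
naive version still needs work to build cleanly):

    static inline
    struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
    {
            int node = dev ? dev_to_node(dev) : NUMA_NO_NODE;

            /* No CMA compiled in, so always fall back to normal pages */
            return alloc_pages_node(node, gfp, get_order(size));
    }

    static inline
    void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
    {
            __free_pages(page, get_order(size));
    }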

2019-05-29 23:09:45

by Nicolin Chen

Subject: Re: [PATCH v3 0/2] Optimize dma_*_from_contiguous calls

On Tue, May 28, 2019 at 08:04:24AM +0200, Christoph Hellwig wrote:
> Thanks,
>
> applied to dma-mapping for-next.
>
> Can you also send a conversion of drivers/iommu/dma-iommu.c to your
> new helpers against this tree?
>
> http://git.infradead.org/users/hch/dma-mapping.git/shortlog/refs/heads/for-next

I can. There is a reported regression with !CONFIG_DMA_CMA now
so I will do that after a fix is merged and the whole thing is
stable.
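
The conversion should look roughly like the dma-direct change in
PATCH-1 -- just a sketch of the shape, not the actual dma-iommu code:

    /* before: open-coded CMA allocation plus fallback */
    if (gfpflags_allow_blocking(gfp))
            page = dma_alloc_from_contiguous(dev, count, page_order,
                                             gfp & __GFP_NOWARN);
    if (!page)
            page = alloc_pages_node(node, gfp, page_order);
    [...]
    if (!dma_release_from_contiguous(dev, page, count))
            __free_pages(page, get_order(size));

    /* after: the new helpers */
    page = dma_alloc_contiguous(dev, size, gfp);
    [...]
    dma_free_contiguous(dev, page, size);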

Thank you

2019-07-25 18:07:27

by Ezequiel Garcia

Subject: Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()

I can't find a way to forward-redirect from Gmail, so I'm Ccing Dafna
who found a regression caused by this commit. Dafna, can you give all
the details, including the log and how you are reproducing it?


2019-07-25 18:10:36

by Nicolin Chen

Subject: Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()

On Thu, Jul 25, 2019 at 01:06:42PM -0300, Ezequiel Garcia wrote:
> I can't find a way to forward-redirect from Gmail, so I'm Ccing Dafna
> who found a regression caused by this commit. Dafna, can you give all
> the details, including the log and how you are reproducing it?

I saw the conversation there. Sorry for not replying yet.
May we discuss there since there are full logs available?

Thanks
Nicolin


2019-07-25 18:13:29

by Dafna Hirschfeld

Subject: Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()

On Thu, 2019-07-25 at 09:50 -0700, Nicolin Chen wrote:
> On Thu, Jul 25, 2019 at 01:06:42PM -0300, Ezequiel Garcia wrote:
> > I can't find a way to forward-redirect from Gmail, so I'm Ccing Dafna
> > who found a regression caused by this commit. Dafna, can you give all
> > the details, including the log and how you are reproducing it?
>
> I saw the conversation there. Sorry for not replying yet.
> May we discuss there since there are full logs available?
>
> Thanks
> Nicolin
>

Hi,
I compiled vivid as built-in into the kernel (not as a separate module) for a nitrogen8m device (imx8) and
set it to use contig dma for mem_ops by adding the kernel param
vivid.allocators=1,1,...

I use this devicetree patch for the dtb file: https://lkml.org/lkml/2019/7/24/789, although it should
be the same on any AArch64 platform.

Then, on the board I run the command:

v4l2-ctl -d3 -v width=2592,height=1944,pixelformat=UYVY,bytesperline=5184 --stream-mmap --stream-to video.UYVY

In every run there is a different crash. Here is one of them: https://pastebin.com/xXgbXMAN

Dafna

2019-07-25 23:43:57

by Nicolin Chen

Subject: Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()

On Thu, Jul 25, 2019 at 07:31:05PM +0200, Dafna Hirschfeld wrote:
> On Thu, 2019-07-25 at 09:50 -0700, Nicolin Chen wrote:
> > On Thu, Jul 25, 2019 at 01:06:42PM -0300, Ezequiel Garcia wrote:
> > > I can't find a way to forward-redirect from Gmail, so I'm Ccing Dafna
> > > who found a regression caused by this commit. Dafna, can you give all
> > > the details, including the log and how you are reproducing it?
> >
> > I saw the conversation there. Sorry for not replying yet.
> > May we discuss there since there are full logs available?
> >
> > Thanks
> > Nicolin
> >
>
> Hi,
> I compiled vivid as built-in into the kernel (not as a separate module) for nitrogen8m device (imx8) and
> set it to use contig dma for mem_ops by adding the kernel param
> vivid.allocators=1,1,...
>
> I use this devicetree patch for the dtb file: https://lkml.org/lkml/2019/7/24/789. Although it should
> be the same on any Aarch64 platform.
>
> Then, on the board I run the command:
>
> v4l2-ctl -d3 -v width=2592,height=1944,pixelformat=UYVY,bytesperline=5184 --stream-mmap --stream-to video.UYVY
>
> In every run there is a different crash. Here is one of them: https://pastebin.com/xXgbXMAN

This probably should be a fix: https://lkml.org/lkml/2019/7/25/1432

I also sent it to you. Would it be possible for you to test it?

Thanks
Nicolin
