2024-06-12 08:13:01

by Zhai He

[permalink] [raw]
Subject: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

From: He Zhai <[email protected]>

In the current code logic, if the device-specified CMA memory
allocation fails, memory will not be allocated from the default CMA area.
This patch will use the default cma region when the device's
specified CMA is not enough.

In addition, the log level of allocation failure is changed to debug.
Because these logs will be printed when memory allocation from the
device specified CMA fails, but if the allocation fails, it will be
allocated from the default cma area. It can easily mislead developers'
judgment.

Signed-off-by: He Zhai <[email protected]>
---
kernel/dma/contiguous.c | 11 +++++++++--
mm/cma.c | 4 ++--
2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 055da410ac71..e45cfb24500f 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -357,8 +357,13 @@ struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
/* CMA can be used only in the context which permits sleeping */
if (!gfpflags_allow_blocking(gfp))
return NULL;
- if (dev->cma_area)
- return cma_alloc_aligned(dev->cma_area, size, gfp);
+ if (dev->cma_area) {
+ struct page *page = NULL;
+
+ page = cma_alloc_aligned(dev->cma_area, size, gfp);
+ if (page)
+ return page;
+ }
if (size <= PAGE_SIZE)
return NULL;

@@ -406,6 +411,8 @@ void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
if (dev->cma_area) {
if (cma_release(dev->cma_area, page, count))
return;
+ if (cma_release(dma_contiguous_default_area, page, count))
+ return;
} else {
/*
* otherwise, page is from either per-numa cma or default cma
diff --git a/mm/cma.c b/mm/cma.c
index 3e9724716bad..6e12faf1bea7 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -495,8 +495,8 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
}

if (ret && !no_warn) {
- pr_err_ratelimited("%s: %s: alloc failed, req-size: %lu pages, ret: %d\n",
- __func__, cma->name, count, ret);
+ pr_debug("%s: alloc failed, req-size: %lu pages, ret: %d, try to use default cma\n",
+ cma->name, count, ret);
cma_debug_show_areas(cma);
}

--
2.34.1



2024-06-12 19:13:51

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he" <[email protected]> wrote:

> From: He Zhai <[email protected]>

(cc Barry & Christoph)

What was your reason for adding cc:stable to the email headers? Does
this address some serious problem? If so, please fully describe that
problem.

> In the current code logic, if the device-specified CMA memory
> allocation fails, memory will not be allocated from the default CMA area.
> This patch will use the default cma region when the device's
> specified CMA is not enough.
>
> In addition, the log level of allocation failure is changed to debug.
> Because these logs will be printed when memory allocation from the
> device specified CMA fails, but if the allocation fails, it will be
> allocated from the default cma area. It can easily mislead developers'
> judgment.
>
> ...
>
> --- a/kernel/dma/contiguous.c
> +++ b/kernel/dma/contiguous.c
> @@ -357,8 +357,13 @@ struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> /* CMA can be used only in the context which permits sleeping */
> if (!gfpflags_allow_blocking(gfp))
> return NULL;
> - if (dev->cma_area)
> - return cma_alloc_aligned(dev->cma_area, size, gfp);
> + if (dev->cma_area) {
> + struct page *page = NULL;
> +
> + page = cma_alloc_aligned(dev->cma_area, size, gfp);
> + if (page)
> + return page;
> + }
> if (size <= PAGE_SIZE)
> return NULL;

The dma_alloc_contiguous() kerneldoc should be updated for this.

The patch prompts the question "why does the device-specified CMA area
exist?". Why not always allocate from the global pool? If the
device-specified area exists to prevent one device from going crazy and
consuming too much contiguous memory, this patch violates that intent?

> @@ -406,6 +411,8 @@ void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
> if (dev->cma_area) {
> if (cma_release(dev->cma_area, page, count))
> return;
> + if (cma_release(dma_contiguous_default_area, page, count))
> + return;
> } else {
> /*
> * otherwise, page is from either per-numa cma or default cma
> diff --git a/mm/cma.c b/mm/cma.c
> index 3e9724716bad..6e12faf1bea7 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -495,8 +495,8 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
> }
>
> if (ret && !no_warn) {
> - pr_err_ratelimited("%s: %s: alloc failed, req-size: %lu pages, ret: %d\n",
> - __func__, cma->name, count, ret);
> + pr_debug("%s: alloc failed, req-size: %lu pages, ret: %d, try to use default cma\n",
> + cma->name, count, ret);
> cma_debug_show_areas(cma);
> }


2024-06-12 21:37:39

by Barry Song

[permalink] [raw]
Subject: Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton <[email protected]> wrote:
>
> On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he" <[email protected]> wrote:
>
> > From: He Zhai <[email protected]>
>
> (cc Barry & Christoph)
>
> What was your reason for adding cc:stable to the email headers? Does
> this address some serious problem? If so, please fully describe that
> problem.
>
> > In the current code logic, if the device-specified CMA memory
> > allocation fails, memory will not be allocated from the default CMA area.
> > This patch will use the default cma region when the device's
> > specified CMA is not enough.
> >
> > In addition, the log level of allocation failure is changed to debug.
> > Because these logs will be printed when memory allocation from the
> > device specified CMA fails, but if the allocation fails, it will be
> > allocated from the default cma area. It can easily mislead developers'
> > judgment.

I am not convinced that this patch is correct. If device-specific CMA
is too small,
why not increase it in the device tree? Conversely, if the default CMA
size is too
large, why not reduce it via the cmdline? CMA offers all kinds of
flexible configuration
options based on users’ needs.

One significant benefit of device-specific CMA is that it helps
decrease fragmentation
in the common CMA pool. While many devices allocate memory from the same pool,
they have different memory requirements in terms of sizes and
alignments. Occasions
of memory allocation and release can lead to situations where the CMA
pool has enough
free space, yet someone fails to obtain contiguous memory from it.

This patch entirely negates the advantage we gain from device-specific CMA.
My point is that instead of modifying the core code, please consider correcting
your device tree or cmdline configurations.

> >
> > ...
> >
> > --- a/kernel/dma/contiguous.c
> > +++ b/kernel/dma/contiguous.c
> > @@ -357,8 +357,13 @@ struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
> > /* CMA can be used only in the context which permits sleeping */
> > if (!gfpflags_allow_blocking(gfp))
> > return NULL;
> > - if (dev->cma_area)
> > - return cma_alloc_aligned(dev->cma_area, size, gfp);
> > + if (dev->cma_area) {
> > + struct page *page = NULL;
> > +
> > + page = cma_alloc_aligned(dev->cma_area, size, gfp);
> > + if (page)
> > + return page;
> > + }
> > if (size <= PAGE_SIZE)
> > return NULL;
>
> The dma_alloc_contiguous() kerneldoc should be updated for this.
>
> The patch prompts the question "why does the device-specified CMA area
> exist?". Why not always allocate from the global pool? If the
> device-specified area exists to prevent one device from going crazy and
> consuming too much contiguous memory, this patch violates that intent?
>
> > @@ -406,6 +411,8 @@ void dma_free_contiguous(struct device *dev, struct page *page, size_t size)
> > if (dev->cma_area) {
> > if (cma_release(dev->cma_area, page, count))
> > return;
> > + if (cma_release(dma_contiguous_default_area, page, count))
> > + return;
> > } else {
> > /*
> > * otherwise, page is from either per-numa cma or default cma
> > diff --git a/mm/cma.c b/mm/cma.c
> > index 3e9724716bad..6e12faf1bea7 100644
> > --- a/mm/cma.c
> > +++ b/mm/cma.c
> > @@ -495,8 +495,8 @@ struct page *cma_alloc(struct cma *cma, unsigned long count,
> > }
> >
> > if (ret && !no_warn) {
> > - pr_err_ratelimited("%s: %s: alloc failed, req-size: %lu pages, ret: %d\n",
> > - __func__, cma->name, count, ret);
> > + pr_debug("%s: alloc failed, req-size: %lu pages, ret: %d, try to use default cma\n",
> > + cma->name, count, ret);
> > cma_debug_show_areas(cma);
> > }
>

Thanks
Barry

2024-06-13 02:13:08

by Zhai He

[permalink] [raw]
Subject: RE: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

> -----Original Message-----
> From: Andrew Morton <[email protected]>
> Sent: Thursday, June 13, 2024 2:48 AM
> To: Zhai He <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; Zhipeng Wang <[email protected]>; Jindong
> Yue <[email protected]>; Barry Song <[email protected]>; Christoph
> Hellwig <[email protected]>
> Subject: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> device-specified CMA memory is not enough.
>
> Caution: This is an external email. Please take care when clicking links
or
> opening attachments. When in doubt, report the message using the 'Report
this
> email' button
>
>
> On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he" <[email protected]> wrote:
>
> > From: He Zhai <[email protected]>
>
> (cc Barry & Christoph)
>
> What was your reason for adding cc:stable to the email headers? Does this
> address some serious problem? If so, please fully describe that problem.
>
Sorry, I don't think this is probably a serious problem, just something I
discovered while developing that I think might exist.
I will not continue to cc: stable.

> > In the current code logic, if the device-specified CMA memory
> > allocation fails, memory will not be allocated from the default CMA
area.
> > This patch will use the default cma region when the device's specified
> > CMA is not enough.
> >
> > In addition, the log level of allocation failure is changed to debug.
> > Because these logs will be printed when memory allocation from the
> > device specified CMA fails, but if the allocation fails, it will be
> > allocated from the default cma area. It can easily mislead developers'
> > judgment.
> >
> > ...
> >
> > --- a/kernel/dma/contiguous.c
> > +++ b/kernel/dma/contiguous.c
> > @@ -357,8 +357,13 @@ struct page *dma_alloc_contiguous(struct device
> *dev, size_t size, gfp_t gfp)
> > /* CMA can be used only in the context which permits sleeping */
> > if (!gfpflags_allow_blocking(gfp))
> > return NULL;
> > - if (dev->cma_area)
> > - return cma_alloc_aligned(dev->cma_area, size, gfp);
> > + if (dev->cma_area) {
> > + struct page *page = NULL;
> > +
> > + page = cma_alloc_aligned(dev->cma_area, size, gfp);
> > + if (page)
> > + return page;
> > + }
> > if (size <= PAGE_SIZE)
> > return NULL;
>
> The dma_alloc_contiguous() kerneldoc should be updated for this.
>
> The patch prompts the question "why does the device-specified CMA area
> exist?". Why not always allocate from the global pool? If the
device-specified
> area exists to prevent one device from going crazy and consuming too much
> contiguous memory, this patch violates that intent?
>

In our environment, because of Widevine DRM, we enable secure heap. When the
VPU decodes 4K secure video,
the VPU needs to allocate a large amount of secure contiguous memory. This
will cause us to need to shrink the CMA
in order not to affect other devices that require contiguous memory. So I
specified CMA memory for VPU, But currently, in addition
to the secure heap and CMA, the remaining continuous memory in our memory
configuration is not enough to support the VPU decoding
high-resolution code streams, so I added this patch so that when the CMA
specified by the device is not enough,
allocated in default CMA.

Another reason is that the secure heap requires secure configuration, so the
start address of the secure heap cannot be specified arbitrarily.
Therefore, in order to reduce the impact of shrinking CMA, I used
multi-segment CMA and assign one of the CMA
to the VPU. When the VPU decodes the high-resolution stream, if the
specified CMA is not enough, it will continue to be allocated from the
default CMA.

The VPU will not consume a crazy amount of memory, it just requires more
memory when playing high-resolution streams.
Therefore, this patch is mainly to prevent the situation where the global
CMA size cannot be satisfied.

> > @@ -406,6 +411,8 @@ void dma_free_contiguous(struct device *dev, struct
> page *page, size_t size)
> > if (dev->cma_area) {
> > if (cma_release(dev->cma_area, page, count))
> > return;
> > + if (cma_release(dma_contiguous_default_area, page,
> count))
> > + return;
> > } else {
> > /*
> > * otherwise, page is from either per-numa cma or
> > default cma diff --git a/mm/cma.c b/mm/cma.c index
> > 3e9724716bad..6e12faf1bea7 100644
> > --- a/mm/cma.c
> > +++ b/mm/cma.c
> > @@ -495,8 +495,8 @@ struct page *cma_alloc(struct cma *cma, unsigned
> long count,
> > }
> >
> > if (ret && !no_warn) {
> > - pr_err_ratelimited("%s: %s: alloc failed, req-size: %lu
pages,
> ret: %d\n",
> > - __func__, cma->name, count, ret);
> > + pr_debug("%s: alloc failed, req-size: %lu pages, ret: %d,
try to
> use default cma\n",
> > + cma->name, count, ret);
> > cma_debug_show_areas(cma);
> > }


Attachments:
smime.p7s (9.56 kB)

2024-06-13 02:35:02

by Zhai He

[permalink] [raw]
Subject: RE: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

> -----Original Message-----
> From: Barry Song <[email protected]>
> Sent: Thursday, June 13, 2024 5:37 AM
> To: Andrew Morton <[email protected]>
> Cc: Zhai He <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; Zhipeng Wang
> <[email protected]>; Jindong Yue <[email protected]>; Christoph
> Hellwig <[email protected]>
> Subject: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> device-specified CMA memory is not enough.
>
> Caution: This is an external email. Please take care when clicking links or
> opening attachments. When in doubt, report the message using the 'Report this
> email' button
>
>
> On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton <[email protected]>
> wrote:
> >
> > On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he" <[email protected]> wrote:
> >
> > > From: He Zhai <[email protected]>
> >
> > (cc Barry & Christoph)
> >
> > What was your reason for adding cc:stable to the email headers? Does
> > this address some serious problem? If so, please fully describe that
> > problem.
> >
> > > In the current code logic, if the device-specified CMA memory
> > > allocation fails, memory will not be allocated from the default CMA area.
> > > This patch will use the default cma region when the device's
> > > specified CMA is not enough.
> > >
> > > In addition, the log level of allocation failure is changed to debug.
> > > Because these logs will be printed when memory allocation from the
> > > device specified CMA fails, but if the allocation fails, it will be
> > > allocated from the default cma area. It can easily mislead developers'
> > > judgment.
>
> I am not convinced that this patch is correct. If device-specific CMA is too small,
> why not increase it in the device tree? Conversely, if the default CMA size is too
> large, why not reduce it via the cmdline? CMA offers all kinds of flexible
> configuration options based on users’ needs.
>
> One significant benefit of device-specific CMA is that it helps decrease
> fragmentation in the common CMA pool. While many devices allocate memory
> from the same pool, they have different memory requirements in terms of sizes
> and alignments. Occasions of memory allocation and release can lead to
> situations where the CMA pool has enough free space, yet someone fails to
> obtain contiguous memory from it.
>
> This patch entirely negates the advantage we gain from device-specific CMA.
> My point is that instead of modifying the core code, please consider correcting
> your device tree or cmdline configurations.
>
Because we enabled secure heap to support widevine DRM, and secure heap requires security configuration, its starting
address cannot be specified arbitrarily, which causes the default CMA to be reduced. So we reduced the CMA, but in order
to avoid the impact of reducing the CMA, we used a multi-segment CMA and gave one segment to the VPU.

However, under our memory configuration, the device-specific CMA is not enough to support the VPU decoding high-resolution code streams, so this patch is added so that the VPU can work properly.
Thanks.
> > >
> > > ...
> > >
> > > --- a/kernel/dma/contiguous.c
> > > +++ b/kernel/dma/contiguous.c
> > > @@ -357,8 +357,13 @@ struct page *dma_alloc_contiguous(struct device
> *dev, size_t size, gfp_t gfp)
> > > /* CMA can be used only in the context which permits sleeping */
> > > if (!gfpflags_allow_blocking(gfp))
> > > return NULL;
> > > - if (dev->cma_area)
> > > - return cma_alloc_aligned(dev->cma_area, size, gfp);
> > > + if (dev->cma_area) {
> > > + struct page *page = NULL;
> > > +
> > > + page = cma_alloc_aligned(dev->cma_area, size, gfp);
> > > + if (page)
> > > + return page;
> > > + }
> > > if (size <= PAGE_SIZE)
> > > return NULL;
> >
> > The dma_alloc_contiguous() kerneldoc should be updated for this.
> >
> > The patch prompts the question "why does the device-specified CMA area
> > exist?". Why not always allocate from the global pool? If the
> > device-specified area exists to prevent one device from going crazy
> > and consuming too much contiguous memory, this patch violates that intent?
> >
> > > @@ -406,6 +411,8 @@ void dma_free_contiguous(struct device *dev,
> struct page *page, size_t size)
> > > if (dev->cma_area) {
> > > if (cma_release(dev->cma_area, page, count))
> > > return;
> > > + if (cma_release(dma_contiguous_default_area, page,
> count))
> > > + return;
> > > } else {
> > > /*
> > > * otherwise, page is from either per-numa cma or
> > > default cma diff --git a/mm/cma.c b/mm/cma.c index
> > > 3e9724716bad..6e12faf1bea7 100644
> > > --- a/mm/cma.c
> > > +++ b/mm/cma.c
> > > @@ -495,8 +495,8 @@ struct page *cma_alloc(struct cma *cma, unsigned
> long count,
> > > }
> > >
> > > if (ret && !no_warn) {
> > > - pr_err_ratelimited("%s: %s: alloc failed, req-size: %lu pages,
> ret: %d\n",
> > > - __func__, cma->name, count, ret);
> > > + pr_debug("%s: alloc failed, req-size: %lu pages, ret: %d, try
> to use default cma\n",
> > > + cma->name, count, ret);
> > > cma_debug_show_areas(cma);
> > > }
> >
>
> Thanks
> Barry


Attachments:
smime.p7s (9.56 kB)

2024-06-13 03:28:48

by Barry Song

[permalink] [raw]
Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

On Thu, Jun 13, 2024 at 2:34 PM Zhai He <[email protected]> wrote:
>
> > -----Original Message-----
> > From: Barry Song <[email protected]>
> > Sent: Thursday, June 13, 2024 5:37 AM
> > To: Andrew Morton <[email protected]>
> > Cc: Zhai He <[email protected]>; [email protected]; [email protected];
> > [email protected]; [email protected]; Zhipeng Wang
> > <[email protected]>; Jindong Yue <[email protected]>; Christoph
> > Hellwig <[email protected]>
> > Subject: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> > device-specified CMA memory is not enough.
> >
> > Caution: This is an external email. Please take care when clicking links or
> > opening attachments. When in doubt, report the message using the 'Report this
> > email' button
> >
> >
> > On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton <[email protected]>
> > wrote:
> > >
> > > On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he" <[email protected]> wrote:
> > >
> > > > From: He Zhai <[email protected]>
> > >
> > > (cc Barry & Christoph)
> > >
> > > What was your reason for adding cc:stable to the email headers? Does
> > > this address some serious problem? If so, please fully describe that
> > > problem.
> > >
> > > > In the current code logic, if the device-specified CMA memory
> > > > allocation fails, memory will not be allocated from the default CMA area.
> > > > This patch will use the default cma region when the device's
> > > > specified CMA is not enough.
> > > >
> > > > In addition, the log level of allocation failure is changed to debug.
> > > > Because these logs will be printed when memory allocation from the
> > > > device specified CMA fails, but if the allocation fails, it will be
> > > > allocated from the default cma area. It can easily mislead developers'
> > > > judgment.
> >
> > I am not convinced that this patch is correct. If device-specific CMA is too small,
> > why not increase it in the device tree? Conversely, if the default CMA size is too
> > large, why not reduce it via the cmdline? CMA offers all kinds of flexible
> > configuration options based on users’ needs.
> >
> > One significant benefit of device-specific CMA is that it helps decrease
> > fragmentation in the common CMA pool. While many devices allocate memory
> > from the same pool, they have different memory requirements in terms of sizes
> > and alignments. Occasions of memory allocation and release can lead to
> > situations where the CMA pool has enough free space, yet someone fails to
> > obtain contiguous memory from it.
> >
> > This patch entirely negates the advantage we gain from device-specific CMA.
> > My point is that instead of modifying the core code, please consider correcting
> > your device tree or cmdline configurations.
> >
> Because we enabled secure heap to support widevine DRM, and secure heap requires security configuration, its starting
> address cannot be specified arbitrarily, which causes the default CMA to be reduced. So we reduced the CMA, but in order
> to avoid the impact of reducing the CMA, we used a multi-segment CMA and gave one segment to the VPU.
>
> However, under our memory configuration, the device-specific CMA is not enough to support the VPU decoding high-resolution code streams, so this patch is added so that the VPU can work properly.
> Thanks.

I don’t quite understand what you are saying. Why can’t you increase
VPU’s CMA size?
It seems you mean that only in some corner cases do you need a large
CMA, but most
of the time, you don’t need it to be this big? So you have to "borrow"
memory from the
default CMA. but why not move that portion from the default CMA to
your VPU’s CMA?

Thanks
Barry

2024-06-13 05:33:07

by Zhai He

[permalink] [raw]
Subject: RE: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

> -----Original Message-----
> From: Barry Song <[email protected]>
> Sent: Thursday, June 13, 2024 11:28 AM
> To: Zhai He <[email protected]>
> Cc: Andrew Morton <[email protected]>; [email protected];
> [email protected]; [email protected]; Zhipeng Wang
> <[email protected]>; Jindong Yue <[email protected]>; Christoph
> Hellwig <[email protected]>
> Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> device-specified CMA memory is not enough.
>
> Caution: This is an external email. Please take care when clicking links or
> opening attachments. When in doubt, report the message using the 'Report this
> email' button
>
>
> On Thu, Jun 13, 2024 at 2:34 PM Zhai He <[email protected]> wrote:
> >
> > > -----Original Message-----
> > > From: Barry Song <[email protected]>
> > > Sent: Thursday, June 13, 2024 5:37 AM
> > > To: Andrew Morton <[email protected]>
> > > Cc: Zhai He <[email protected]>; [email protected]; [email protected];
> > > [email protected]; [email protected]; Zhipeng Wang
> > > <[email protected]>; Jindong Yue <[email protected]>;
> > > Christoph Hellwig <[email protected]>
> > > Subject: [EXT] Re: [PATCH v2] Supports to use the default CMA when
> > > the device-specified CMA memory is not enough.
> > >
> > > Caution: This is an external email. Please take care when clicking
> > > links or opening attachments. When in doubt, report the message
> > > using the 'Report this email' button
> > >
> > >
> > > On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton
> > > <[email protected]>
> > > wrote:
> > > >
> > > > On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he" <[email protected]>
> wrote:
> > > >
> > > > > From: He Zhai <[email protected]>
> > > >
> > > > (cc Barry & Christoph)
> > > >
> > > > What was your reason for adding cc:stable to the email headers?
> > > > Does this address some serious problem? If so, please fully
> > > > describe that problem.
> > > >
> > > > > In the current code logic, if the device-specified CMA memory
> > > > > allocation fails, memory will not be allocated from the default CMA area.
> > > > > This patch will use the default cma region when the device's
> > > > > specified CMA is not enough.
> > > > >
> > > > > In addition, the log level of allocation failure is changed to debug.
> > > > > Because these logs will be printed when memory allocation from
> > > > > the device specified CMA fails, but if the allocation fails, it
> > > > > will be allocated from the default cma area. It can easily mislead
> developers'
> > > > > judgment.
> > >
> > > I am not convinced that this patch is correct. If device-specific
> > > CMA is too small, why not increase it in the device tree?
> > > Conversely, if the default CMA size is too large, why not reduce it
> > > via the cmdline? CMA offers all kinds of flexible configuration options based
> on users’ needs.
> > >
> > > One significant benefit of device-specific CMA is that it helps
> > > decrease fragmentation in the common CMA pool. While many devices
> > > allocate memory from the same pool, they have different memory
> > > requirements in terms of sizes and alignments. Occasions of memory
> > > allocation and release can lead to situations where the CMA pool has
> > > enough free space, yet someone fails to obtain contiguous memory from it.
> > >
> > > This patch entirely negates the advantage we gain from device-specific CMA.
> > > My point is that instead of modifying the core code, please consider
> > > correcting your device tree or cmdline configurations.
> > >
> > Because we enabled secure heap to support widevine DRM, and secure
> > heap requires security configuration, its starting address cannot be
> > specified arbitrarily, which causes the default CMA to be reduced. So we
> reduced the CMA, but in order to avoid the impact of reducing the CMA, we
> used a multi-segment CMA and gave one segment to the VPU.
> >
> > However, under our memory configuration, the device-specific CMA is not
> enough to support the VPU decoding high-resolution code streams, so this patch
> is added so that the VPU can work properly.
> > Thanks.
>
> I don’t quite understand what you are saying. Why can’t you increase VPU’s
> CMA size?
Thanks for your quick reply.
Because we added a secure heap to support Widevine DRM, this heap requires hardware protection, so its starting address cannot be specified arbitrarily. This causes the secure heap to occupy part of the default CMA, and the default CMA is therefore reduced, so in order to avoid default CMA Shrinking introduces other problems. We added a specific CMA area for the VPU. However, due to the large size of the secure heap and default CMA, There is no remaining memory available to increase specific CMA for VPU.

> It seems you mean that only in some corner cases do you need a large CMA, but
> most of the time, you don’t need it to be this big? So you have to "borrow"
> memory from the
> default CMA. but why not move that portion from the default CMA to your
> VPU’s CMA?
>
This is a method, but because for VPU, the continuous memory size allocated by the driver is based on the video stream, we cannot determine the maximum size of memory required by the VPU. This makes it impossible for us to determine the size of the specific CMA assigned to the VPU. Thanks.
> Thanks
> Barry


Attachments:
smime.p7s (9.56 kB)

2024-06-13 06:16:35

by Barry Song

[permalink] [raw]
Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

On Thu, Jun 13, 2024 at 5:32 PM Zhai He <[email protected]> wrote:
>
> > -----Original Message-----
> > From: Barry Song <[email protected]>
> > Sent: Thursday, June 13, 2024 11:28 AM
> > To: Zhai He <[email protected]>
> > Cc: Andrew Morton <[email protected]>; [email protected];
> > [email protected]; [email protected]; Zhipeng Wang
> > <[email protected]>; Jindong Yue <[email protected]>; Christoph
> > Hellwig <[email protected]>
> > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> > device-specified CMA memory is not enough.
> >
> > Caution: This is an external email. Please take care when clicking links or
> > opening attachments. When in doubt, report the message using the 'Report this
> > email' button
> >
> >
> > On Thu, Jun 13, 2024 at 2:34 PM Zhai He <[email protected]> wrote:
> > >
> > > > -----Original Message-----
> > > > From: Barry Song <[email protected]>
> > > > Sent: Thursday, June 13, 2024 5:37 AM
> > > > To: Andrew Morton <[email protected]>
> > > > Cc: Zhai He <[email protected]>; [email protected]; [email protected];
> > > > [email protected]; [email protected]; Zhipeng Wang
> > > > <[email protected]>; Jindong Yue <[email protected]>;
> > > > Christoph Hellwig <[email protected]>
> > > > Subject: [EXT] Re: [PATCH v2] Supports to use the default CMA when
> > > > the device-specified CMA memory is not enough.
> > > >
> > > > Caution: This is an external email. Please take care when clicking
> > > > links or opening attachments. When in doubt, report the message
> > > > using the 'Report this email' button
> > > >
> > > >
> > > > On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton
> > > > <[email protected]>
> > > > wrote:
> > > > >
> > > > > On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he" <[email protected]>
> > wrote:
> > > > >
> > > > > > From: He Zhai <[email protected]>
> > > > >
> > > > > (cc Barry & Christoph)
> > > > >
> > > > > What was your reason for adding cc:stable to the email headers?
> > > > > Does this address some serious problem? If so, please fully
> > > > > describe that problem.
> > > > >
> > > > > > In the current code logic, if the device-specified CMA memory
> > > > > > allocation fails, memory will not be allocated from the default CMA area.
> > > > > > This patch will use the default cma region when the device's
> > > > > > specified CMA is not enough.
> > > > > >
> > > > > > In addition, the log level of allocation failure is changed to debug.
> > > > > > Because these logs will be printed when memory allocation from
> > > > > > the device specified CMA fails, but if the allocation fails, it
> > > > > > will be allocated from the default cma area. It can easily mislead
> > developers'
> > > > > > judgment.
> > > >
> > > > I am not convinced that this patch is correct. If device-specific
> > > > CMA is too small, why not increase it in the device tree?
> > > > Conversely, if the default CMA size is too large, why not reduce it
> > > > via the cmdline? CMA offers all kinds of flexible configuration options based
> > on users’ needs.
> > > >
> > > > One significant benefit of device-specific CMA is that it helps
> > > > decrease fragmentation in the common CMA pool. While many devices
> > > > allocate memory from the same pool, they have different memory
> > > > requirements in terms of sizes and alignments. Occasions of memory
> > > > allocation and release can lead to situations where the CMA pool has
> > > > enough free space, yet someone fails to obtain contiguous memory from it.
> > > >
> > > > This patch entirely negates the advantage we gain from device-specific CMA.
> > > > My point is that instead of modifying the core code, please consider
> > > > correcting your device tree or cmdline configurations.
> > > >
> > > Because we enabled secure heap to support widevine DRM, and secure
> > > heap requires security configuration, its starting address cannot be
> > > specified arbitrarily, which causes the default CMA to be reduced. So we
> > reduced the CMA, but in order to avoid the impact of reducing the CMA, we
> > used a multi-segment CMA and gave one segment to the VPU.
> > >
> > > However, under our memory configuration, the device-specific CMA is not
> > enough to support the VPU decoding high-resolution code streams, so this patch
> > is added so that the VPU can work properly.
> > > Thanks.
> >
> > I don’t quite understand what you are saying. Why can’t you increase VPU’s
> > CMA size?
> Thanks for your quick reply.
> Because we added a secure heap to support Widevine DRM, this heap requires hardware protection, so its starting address cannot be specified arbitrarily. This causes the secure heap to occupy part of the default CMA, and the default CMA is therefore reduced, so in order to avoid default CMA Shrinking introduces other problems. We added a specific CMA area for the VPU. However, due to the large size of the secure heap and default CMA, There is no remaining memory available to increase specific CMA for VPU.

I assume the secure heap you are referring to is a section of memory
that should only be accessed by TrustZone and not be visible to Linux
running in non-secure mode. How do you allocate this secure heap from the
default CMA? Do you use the cma_alloc() APIs or the dma_alloc_coherent()
APIs? Given that the VPU has its own device-specific CMA, why is this
secure heap allocated from the default CMA instead of the VPU's CMA?

If this secure heap was allocated before the kernel booted, why did the
kernel(your dts) fail to mark this area as nomap/reserved to prevent
the default CMA from intersecting with it?

>
> > It seems you mean that only in some corner cases do you need a large CMA, but
> > most of the time, you don’t need it to be this big? So you have to "borrow"
> > memory from the
> > default CMA. but why not move that portion from the default CMA to your
> > VPU’s CMA?
> >
> This is a method, but because for VPU, the continuous memory size allocated by the driver is based on the video stream, we cannot determine the maximum size of memory required by the VPU. This makes it impossible for us to determine the size of the specific CMA assigned to the VPU. Thanks.

I don't understand how this can happen. You should precisely know the
maximum size required for the VPU based on your multimedia pipeline
and resolutions.

I still don't understand your scenarios or the problem you are facing.

Thanks
Barry

2024-06-13 07:11:27

by Zhai He

[permalink] [raw]
Subject: RE: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

> -----Original Message-----
> From: Barry Song <[email protected]>
> Sent: Thursday, June 13, 2024 2:15 PM
> To: Zhai He <[email protected]>
> Cc: Andrew Morton <[email protected]>; [email protected];
> [email protected]; [email protected]; Zhipeng Wang
> <[email protected]>; Jindong Yue <[email protected]>; Christoph
> Hellwig <[email protected]>
> Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> device-specified CMA memory is not enough.
>
> Caution: This is an external email. Please take care when clicking links or
> opening attachments. When in doubt, report the message using the 'Report
> this email' button
>
>
> On Thu, Jun 13, 2024 at 5:32 PM Zhai He <[email protected]> wrote:
> >
> > > -----Original Message-----
> > > From: Barry Song <[email protected]>
> > > Sent: Thursday, June 13, 2024 11:28 AM
> > > To: Zhai He <[email protected]>
> > > Cc: Andrew Morton <[email protected]>; [email protected];
> > > [email protected]; [email protected]; Zhipeng Wang
> > > <[email protected]>; Jindong Yue <[email protected]>;
> > > Christoph Hellwig <[email protected]>
> > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA
> > > when the device-specified CMA memory is not enough.
> > >
> > > Caution: This is an external email. Please take care when clicking
> > > links or opening attachments. When in doubt, report the message
> > > using the 'Report this email' button
> > >
> > >
> > > On Thu, Jun 13, 2024 at 2:34 PM Zhai He <[email protected]> wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Barry Song <[email protected]>
> > > > > Sent: Thursday, June 13, 2024 5:37 AM
> > > > > To: Andrew Morton <[email protected]>
> > > > > Cc: Zhai He <[email protected]>; [email protected];
> > > > > [email protected]; [email protected];
> > > > > [email protected]; Zhipeng Wang <[email protected]>;
> > > > > Jindong Yue <[email protected]>; Christoph Hellwig
> > > > > <[email protected]>
> > > > > Subject: [EXT] Re: [PATCH v2] Supports to use the default CMA
> > > > > when the device-specified CMA memory is not enough.
> > > > >
> > > > > Caution: This is an external email. Please take care when
> > > > > clicking links or opening attachments. When in doubt, report the
> > > > > message using the 'Report this email' button
> > > > >
> > > > >
> > > > > On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton
> > > > > <[email protected]>
> > > > > wrote:
> > > > > >
> > > > > > On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he" <[email protected]>
> > > wrote:
> > > > > >
> > > > > > > From: He Zhai <[email protected]>
> > > > > >
> > > > > > (cc Barry & Christoph)
> > > > > >
> > > > > > What was your reason for adding cc:stable to the email headers?
> > > > > > Does this address some serious problem? If so, please fully
> > > > > > describe that problem.
> > > > > >
> > > > > > > In the current code logic, if the device-specified CMA
> > > > > > > memory allocation fails, memory will not be allocated from the
> default CMA area.
> > > > > > > This patch will use the default cma region when the device's
> > > > > > > specified CMA is not enough.
> > > > > > >
> > > > > > > In addition, the log level of allocation failure is changed to debug.
> > > > > > > Because these logs will be printed when memory allocation
> > > > > > > from the device specified CMA fails, but if the allocation
> > > > > > > fails, it will be allocated from the default cma area. It
> > > > > > > can easily mislead
> > > developers'
> > > > > > > judgment.
> > > > >
> > > > > I am not convinced that this patch is correct. If
> > > > > device-specific CMA is too small, why not increase it in the device tree?
> > > > > Conversely, if the default CMA size is too large, why not reduce
> > > > > it via the cmdline? CMA offers all kinds of flexible
> > > > > configuration options based
> > > on users’ needs.
> > > > >
> > > > > One significant benefit of device-specific CMA is that it helps
> > > > > decrease fragmentation in the common CMA pool. While many
> > > > > devices allocate memory from the same pool, they have different
> > > > > memory requirements in terms of sizes and alignments. Occasions
> > > > > of memory allocation and release can lead to situations where
> > > > > the CMA pool has enough free space, yet someone fails to obtain
> contiguous memory from it.
> > > > >
> > > > > This patch entirely negates the advantage we gain from device-specific
> CMA.
> > > > > My point is that instead of modifying the core code, please
> > > > > consider correcting your device tree or cmdline configurations.
> > > > >
> > > > Because we enabled secure heap to support widevine DRM, and secure
> > > > heap requires security configuration, its starting address cannot
> > > > be specified arbitrarily, which causes the default CMA to be
> > > > reduced. So we
> > > reduced the CMA, but in order to avoid the impact of reducing the
> > > CMA, we used a multi-segment CMA and gave one segment to the VPU.
> > > >
> > > > However, under our memory configuration, the device-specific CMA
> > > > is not
> > > enough to support the VPU decoding high-resolution code streams, so
> > > this patch is added so that the VPU can work properly.
> > > > Thanks.
> > >
> > > I don’t quite understand what you are saying. Why can’t you increase
> > > VPU’s CMA size?
> > Thanks for your quick reply.
> > Because we added a secure heap to support Widevine DRM, this heap
> requires hardware protection, so its starting address cannot be specified
> arbitrarily. This causes the secure heap to occupy part of the default CMA, and
> the default CMA is therefore reduced, so in order to avoid default CMA
> Shrinking introduces other problems. We added a specific CMA area for the
> VPU. However, due to the large size of the secure heap and default CMA, There
> is no remaining memory available to increase specific CMA for VPU.
>
> I assume the secure heap you are referring to is a section of memory that
> should only be accessed by TrustZone and not be visible to Linux running in
> non-secure mode. How do you allocate this secure heap from the default CMA?

No, secure heap is a reserved memory, secure heap is not allocated from CMA, secure heap has been reserved during the kernel startup phase.
And this reserved memory is protected by hardware. Only specific hardware and secure world can accessed it.
For example:
&{/reserved-memory/} {
secure_region: secure {
compatible = "imx-secure-ion-pool";
reg = <0x0 0xA0000000 0 0x1EF00000>;
};
};

> Do you use the cma_alloc() APIs or the dma_alloc_coherent() APIs? Given that
> the VPU has its own device-specific CMA, why is this secure heap allocated
> from the default CMA instead of the VPU's CMA?
>
The VPU driver will use dma_alloc_coherent() to allocate contiguous memory. The secure heap is not allocated from the CMA, but because the secure heap is enabled, it occupies some contiguous memory, causing the default CMA to be reduced.

> If this secure heap was allocated before the kernel booted, why did the
> kernel(your dts) fail to mark this area as nomap/reserved to prevent the
> default CMA from intersecting with it?
>
Secure heap does not intersect with the CMA.
for example:
before secure heap enabled:
0xA000 0000 ~ 0xFFFFFFFF: default CMA
after secure heap enabled:
0x9000 0000 ~0x9FFF FFFF is the CMA specified by VPU,
0xA000 0000 ~0xAFFF FFFF is secure heap, (the start address cannot be specified arbitrarily, because this memory is protected by hardware, if the start address is 0x9000 0000, uboot will use this memory, but uboot can't access this memory because of hardware protection. So we find a section of memory that UBOOT will not use as secure heap.
Note: The memory of uboot can be adjusted, but avoiding the secure heap will limit the memory range that uboot can use, causing problems such as the uboot stack)
0xB000 0000 ~0xFFFFFFFF is default CMA.
So default CMA is reduced.

> >
> > > It seems you mean that only in some corner cases do you need a large
> > > CMA, but most of the time, you don’t need it to be this big? So you have
> to "borrow"
> > > memory from the
> > > default CMA. but why not move that portion from the default CMA to
> > > your VPU’s CMA?
> > >
> > This is a method, but because for VPU, the continuous memory size allocated
> by the driver is based on the video stream, we cannot determine the maximum
> size of memory required by the VPU. This makes it impossible for us to
> determine the size of the specific CMA assigned to the VPU. Thanks.
>
> I don't understand how this can happen. You should precisely know the
> maximum size required for the VPU based on your multimedia pipeline and
> resolutions.
>
We cannot estimate the maximum contiguous memory required by the VPU because it depends on how the video is encoded.
Thanks very much.

> I still don't understand your scenarios or the problem you are facing.


> Thanks
> Barry


Attachments:
smime.p7s (9.56 kB)

2024-06-13 07:38:13

by Barry Song

[permalink] [raw]
Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

On Thu, Jun 13, 2024 at 7:11 PM Zhai He <[email protected]> wrote:
>
> > -----Original Message-----
> > From: Barry Song <[email protected]>
> > Sent: Thursday, June 13, 2024 2:15 PM
> > To: Zhai He <[email protected]>
> > Cc: Andrew Morton <[email protected]>; [email protected];
> > [email protected]; [email protected]; Zhipeng Wang
> > <[email protected]>; Jindong Yue <[email protected]>; Christoph
> > Hellwig <[email protected]>
> > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> > device-specified CMA memory is not enough.
> >
> > Caution: This is an external email. Please take care when clicking links or
> > opening attachments. When in doubt, report the message using the 'Report
> > this email' button
> >
> >
> > On Thu, Jun 13, 2024 at 5:32 PM Zhai He <[email protected]> wrote:
> > >
> > > > -----Original Message-----
> > > > From: Barry Song <[email protected]>
> > > > Sent: Thursday, June 13, 2024 11:28 AM
> > > > To: Zhai He <[email protected]>
> > > > Cc: Andrew Morton <[email protected]>; [email protected];
> > > > [email protected]; [email protected]; Zhipeng Wang
> > > > <[email protected]>; Jindong Yue <[email protected]>;
> > > > Christoph Hellwig <[email protected]>
> > > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA
> > > > when the device-specified CMA memory is not enough.
> > > >
> > > > Caution: This is an external email. Please take care when clicking
> > > > links or opening attachments. When in doubt, report the message
> > > > using the 'Report this email' button
> > > >
> > > >
> > > > On Thu, Jun 13, 2024 at 2:34 PM Zhai He <[email protected]> wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Barry Song <[email protected]>
> > > > > > Sent: Thursday, June 13, 2024 5:37 AM
> > > > > > To: Andrew Morton <[email protected]>
> > > > > > Cc: Zhai He <[email protected]>; [email protected];
> > > > > > [email protected]; [email protected];
> > > > > > [email protected]; Zhipeng Wang <[email protected]>;
> > > > > > Jindong Yue <[email protected]>; Christoph Hellwig
> > > > > > <[email protected]>
> > > > > > Subject: [EXT] Re: [PATCH v2] Supports to use the default CMA
> > > > > > when the device-specified CMA memory is not enough.
> > > > > >
> > > > > > Caution: This is an external email. Please take care when
> > > > > > clicking links or opening attachments. When in doubt, report the
> > > > > > message using the 'Report this email' button
> > > > > >
> > > > > >
> > > > > > On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton
> > > > > > <[email protected]>
> > > > > > wrote:
> > > > > > >
> > > > > > > On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he" <[email protected]>
> > > > wrote:
> > > > > > >
> > > > > > > > From: He Zhai <[email protected]>
> > > > > > >
> > > > > > > (cc Barry & Christoph)
> > > > > > >
> > > > > > > What was your reason for adding cc:stable to the email headers?
> > > > > > > Does this address some serious problem? If so, please fully
> > > > > > > describe that problem.
> > > > > > >
> > > > > > > > In the current code logic, if the device-specified CMA
> > > > > > > > memory allocation fails, memory will not be allocated from the
> > default CMA area.
> > > > > > > > This patch will use the default cma region when the device's
> > > > > > > > specified CMA is not enough.
> > > > > > > >
> > > > > > > > In addition, the log level of allocation failure is changed to debug.
> > > > > > > > Because these logs will be printed when memory allocation
> > > > > > > > from the device specified CMA fails, but if the allocation
> > > > > > > > fails, it will be allocated from the default cma area. It
> > > > > > > > can easily mislead
> > > > developers'
> > > > > > > > judgment.
> > > > > >
> > > > > > I am not convinced that this patch is correct. If
> > > > > > device-specific CMA is too small, why not increase it in the device tree?
> > > > > > Conversely, if the default CMA size is too large, why not reduce
> > > > > > it via the cmdline? CMA offers all kinds of flexible
> > > > > > configuration options based
> > > > on users’ needs.
> > > > > >
> > > > > > One significant benefit of device-specific CMA is that it helps
> > > > > > decrease fragmentation in the common CMA pool. While many
> > > > > > devices allocate memory from the same pool, they have different
> > > > > > memory requirements in terms of sizes and alignments. Occasions
> > > > > > of memory allocation and release can lead to situations where
> > > > > > the CMA pool has enough free space, yet someone fails to obtain
> > contiguous memory from it.
> > > > > >
> > > > > > This patch entirely negates the advantage we gain from device-specific
> > CMA.
> > > > > > My point is that instead of modifying the core code, please
> > > > > > consider correcting your device tree or cmdline configurations.
> > > > > >
> > > > > Because we enabled secure heap to support widevine DRM, and secure
> > > > > heap requires security configuration, its starting address cannot
> > > > > be specified arbitrarily, which causes the default CMA to be
> > > > > reduced. So we
> > > > reduced the CMA, but in order to avoid the impact of reducing the
> > > > CMA, we used a multi-segment CMA and gave one segment to the VPU.
> > > > >
> > > > > However, under our memory configuration, the device-specific CMA
> > > > > is not
> > > > enough to support the VPU decoding high-resolution code streams, so
> > > > this patch is added so that the VPU can work properly.
> > > > > Thanks.
> > > >
> > > > I don’t quite understand what you are saying. Why can’t you increase
> > > > VPU’s CMA size?
> > > Thanks for your quick reply.
> > > Because we added a secure heap to support Widevine DRM, this heap
> > requires hardware protection, so its starting address cannot be specified
> > arbitrarily. This causes the secure heap to occupy part of the default CMA, and
> > the default CMA is therefore reduced, so in order to avoid default CMA
> > Shrinking introduces other problems. We added a specific CMA area for the
> > VPU. However, due to the large size of the secure heap and default CMA, There
> > is no remaining memory available to increase specific CMA for VPU.
> >
> > I assume the secure heap you are referring to is a section of memory that
> > should only be accessed by TrustZone and not be visible to Linux running in
> > non-secure mode. How do you allocate this secure heap from the default CMA?
>
> No, secure heap is a reserved memory, secure heap is not allocated from CMA, secure heap has been reserved during the kernel startup phase.
> And this reserved memory is protected by hardware. Only specific hardware and secure world can accessed it.
> For example:
> &{/reserved-memory/} {
> secure_region: secure {
> compatible = "imx-secure-ion-pool";
> reg = <0x0 0xA0000000 0 0x1EF00000>;
> };
> };
>
> > Do you use the cma_alloc() APIs or the dma_alloc_coherent() APIs? Given that
> > the VPU has its own device-specific CMA, why is this secure heap allocated
> > from the default CMA instead of the VPU's CMA?
> >
> The VPU driver will use dma_alloc_coherent() to allocate contiguous memory. The secure heap is not allocated from the CMA, but because the secure heap is enabled, it occupies some contiguous memory, causing the default CMA to be reduced.
>
> > If this secure heap was allocated before the kernel booted, why did the
> > kernel(your dts) fail to mark this area as nomap/reserved to prevent the
> > default CMA from intersecting with it?
> >
> Secure heap does not intersect with the CMA.
> for example:
> before secure heap enabled:
> 0xA000 0000 ~ 0xFFFFFFFF: default CMA
> after secure heap enabled:
> 0x9000 0000 ~0x9FFF FFFF is the CMA specified by VPU,
> 0xA000 0000 ~0xAFFF FFFF is secure heap, (the start address cannot be specified arbitrarily, because this memory is protected by hardware, if the start address is 0x9000 0000, uboot will use this memory, but uboot can't access this memory because of hardware protection. So we find a section of memory that UBOOT will not use as secure heap.
> Note: The memory of uboot can be adjusted, but avoiding the secure heap will limit the memory range that uboot can use, causing problems such as the uboot stack)
> 0xB000 0000 ~0xFFFFFFFF is default CMA.
> So default CMA is reduced.

How is that related to your patch? I assume the default CMA is reduced because
you modified it in the DTS after enabling the secure heap, as the CMA
size is set
by you. The default CMA size won't automatically decrease due to the secure
heap. To me, 0xB0000000-0xFFFFFFFF(1.25GiB) is still too large a CMA.

>
> > >
> > > > It seems you mean that only in some corner cases do you need a large
> > > > CMA, but most of the time, you don’t need it to be this big? So you have
> > to "borrow"
> > > > memory from the
> > > > default CMA. but why not move that portion from the default CMA to
> > > > your VPU’s CMA?
> > > >
> > > This is a method, but because for VPU, the continuous memory size allocated
> > by the driver is based on the video stream, we cannot determine the maximum
> > size of memory required by the VPU. This makes it impossible for us to
> > determine the size of the specific CMA assigned to the VPU. Thanks.
> >
> > I don't understand how this can happen. You should precisely know the
> > maximum size required for the VPU based on your multimedia pipeline and
> > resolutions.
> >
> We cannot estimate the maximum contiguous memory required by the VPU because it depends on how the video is encoded.
> Thanks very much.

Yes, you can. Please ask your multimedia team; they will give
you a number.

>
> > I still don't understand your scenarios or the problem you are facing.
>
>

Thanks
Barry

2024-06-13 08:53:41

by Zhai He

[permalink] [raw]
Subject: RE: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

> -----Original Message-----
> From: Barry Song <[email protected]>
> Sent: Thursday, June 13, 2024 3:38 PM
> To: Zhai He <[email protected]>
> Cc: Andrew Morton <[email protected]>; [email protected];
> [email protected]; [email protected]; Zhipeng Wang
> <[email protected]>; Jindong Yue <[email protected]>; Christoph
> Hellwig <[email protected]>
> Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> device-specified CMA memory is not enough.
>
> Caution: This is an external email. Please take care when clicking links or
> opening attachments. When in doubt, report the message using the 'Report this
> email' button
>
>
> On Thu, Jun 13, 2024 at 7:11 PM Zhai He <[email protected]> wrote:
> >
> > > -----Original Message-----
> > > From: Barry Song <[email protected]>
> > > Sent: Thursday, June 13, 2024 2:15 PM
> > > To: Zhai He <[email protected]>
> > > Cc: Andrew Morton <[email protected]>; [email protected];
> > > [email protected]; [email protected]; Zhipeng Wang
> > > <[email protected]>; Jindong Yue <[email protected]>;
> > > Christoph Hellwig <[email protected]>
> > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA
> > > when the device-specified CMA memory is not enough.
> > >
> > > Caution: This is an external email. Please take care when clicking
> > > links or opening attachments. When in doubt, report the message
> > > using the 'Report this email' button
> > >
> > >
> > > On Thu, Jun 13, 2024 at 5:32 PM Zhai He <[email protected]> wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Barry Song <[email protected]>
> > > > > Sent: Thursday, June 13, 2024 11:28 AM
> > > > > To: Zhai He <[email protected]>
> > > > > Cc: Andrew Morton <[email protected]>; [email protected];
> > > > > [email protected]; [email protected]; Zhipeng Wang
> > > > > <[email protected]>; Jindong Yue <[email protected]>;
> > > > > Christoph Hellwig <[email protected]>
> > > > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default
> > > > > CMA when the device-specified CMA memory is not enough.
> > > > >
> > > > > Caution: This is an external email. Please take care when
> > > > > clicking links or opening attachments. When in doubt, report the
> > > > > message using the 'Report this email' button
> > > > >
> > > > >
> > > > > On Thu, Jun 13, 2024 at 2:34 PM Zhai He <[email protected]> wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Barry Song <[email protected]>
> > > > > > > Sent: Thursday, June 13, 2024 5:37 AM
> > > > > > > To: Andrew Morton <[email protected]>
> > > > > > > Cc: Zhai He <[email protected]>; [email protected];
> > > > > > > [email protected]; [email protected];
> > > > > > > [email protected]; Zhipeng Wang
> > > > > > > <[email protected]>; Jindong Yue <[email protected]>;
> > > > > > > Christoph Hellwig <[email protected]>
> > > > > > > Subject: [EXT] Re: [PATCH v2] Supports to use the default
> > > > > > > CMA when the device-specified CMA memory is not enough.
> > > > > > >
> > > > > > > Caution: This is an external email. Please take care when
> > > > > > > clicking links or opening attachments. When in doubt, report
> > > > > > > the message using the 'Report this email' button
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton
> > > > > > > <[email protected]>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he"
> > > > > > > > <[email protected]>
> > > > > wrote:
> > > > > > > >
> > > > > > > > > From: He Zhai <[email protected]>
> > > > > > > >
> > > > > > > > (cc Barry & Christoph)
> > > > > > > >
> > > > > > > > What was your reason for adding cc:stable to the email headers?
> > > > > > > > Does this address some serious problem? If so, please
> > > > > > > > fully describe that problem.
> > > > > > > >
> > > > > > > > > In the current code logic, if the device-specified CMA
> > > > > > > > > memory allocation fails, memory will not be allocated
> > > > > > > > > from the
> > > default CMA area.
> > > > > > > > > This patch will use the default cma region when the
> > > > > > > > > device's specified CMA is not enough.
> > > > > > > > >
> > > > > > > > > In addition, the log level of allocation failure is changed to debug.
> > > > > > > > > Because these logs will be printed when memory
> > > > > > > > > allocation from the device specified CMA fails, but if
> > > > > > > > > the allocation fails, it will be allocated from the
> > > > > > > > > default cma area. It can easily mislead
> > > > > developers'
> > > > > > > > > judgment.
> > > > > > >
> > > > > > > I am not convinced that this patch is correct. If
> > > > > > > device-specific CMA is too small, why not increase it in the device
> tree?
> > > > > > > Conversely, if the default CMA size is too large, why not
> > > > > > > reduce it via the cmdline? CMA offers all kinds of flexible
> > > > > > > configuration options based
> > > > > on users’ needs.
> > > > > > >
> > > > > > > One significant benefit of device-specific CMA is that it
> > > > > > > helps decrease fragmentation in the common CMA pool. While
> > > > > > > many devices allocate memory from the same pool, they have
> > > > > > > different memory requirements in terms of sizes and
> > > > > > > alignments. Occasions of memory allocation and release can
> > > > > > > lead to situations where the CMA pool has enough free space,
> > > > > > > yet someone fails to obtain
> > > contiguous memory from it.
> > > > > > >
> > > > > > > This patch entirely negates the advantage we gain from
> > > > > > > device-specific
> > > CMA.
> > > > > > > My point is that instead of modifying the core code, please
> > > > > > > consider correcting your device tree or cmdline configurations.
> > > > > > >
> > > > > > Because we enabled secure heap to support widevine DRM, and
> > > > > > secure heap requires security configuration, its starting
> > > > > > address cannot be specified arbitrarily, which causes the
> > > > > > default CMA to be reduced. So we
> > > > > reduced the CMA, but in order to avoid the impact of reducing
> > > > > the CMA, we used a multi-segment CMA and gave one segment to the
> VPU.
> > > > > >
> > > > > > However, under our memory configuration, the device-specific
> > > > > > CMA is not
> > > > > enough to support the VPU decoding high-resolution code streams,
> > > > > so this patch is added so that the VPU can work properly.
> > > > > > Thanks.
> > > > >
> > > > > I don’t quite understand what you are saying. Why can’t you
> > > > > increase VPU’s CMA size?
> > > > Thanks for your quick reply.
> > > > Because we added a secure heap to support Widevine DRM, this heap
> > > requires hardware protection, so its starting address cannot be
> > > specified arbitrarily. This causes the secure heap to occupy part of
> > > the default CMA, and the default CMA is therefore reduced, so in
> > > order to avoid default CMA Shrinking introduces other problems. We
> > > added a specific CMA area for the VPU. However, due to the large
> > > size of the secure heap and default CMA, There is no remaining memory
> available to increase specific CMA for VPU.
> > >
> > > I assume the secure heap you are referring to is a section of memory
> > > that should only be accessed by TrustZone and not be visible to
> > > Linux running in non-secure mode. How do you allocate this secure heap
> from the default CMA?
> >
> > No, secure heap is a reserved memory, secure heap is not allocated from CMA,
> secure heap has been reserved during the kernel startup phase.
> > And this reserved memory is protected by hardware. Only specific hardware
> and secure world can accessed it.
> > For example:
> > &{/reserved-memory/} {
> > secure_region: secure {
> > compatible = "imx-secure-ion-pool";
> > reg = <0x0 0xA0000000 0 0x1EF00000>;
> > };
> > };
> >
> > > Do you use the cma_alloc() APIs or the dma_alloc_coherent() APIs?
> > > Given that the VPU has its own device-specific CMA, why is this
> > > secure heap allocated from the default CMA instead of the VPU's CMA?
> > >
> > The VPU driver will use dma_alloc_coherent() to allocate contiguous memory.
> The secure heap is not allocated from the CMA, but because the secure heap is
> enabled, it occupies some contiguous memory, causing the default CMA to be
> reduced.
> >
> > > If this secure heap was allocated before the kernel booted, why did
> > > the kernel(your dts) fail to mark this area as nomap/reserved to
> > > prevent the default CMA from intersecting with it?
> > >
> > Secure heap does not intersect with the CMA.
> > for example:
> > before secure heap enabled:
> > 0xA000 0000 ~ 0xFFFFFFFF: default CMA
> > after secure heap enabled:
> > 0x9000 0000 ~0x9FFF FFFF is the CMA specified by VPU,
> > 0xA000 0000 ~0xAFFF FFFF is secure heap, (the start address cannot be
> specified arbitrarily, because this memory is protected by hardware, if the start
> address is 0x9000 0000, uboot will use this memory, but uboot can't access this
> memory because of hardware protection. So we find a section of memory that
> UBOOT will not use as secure heap.
> > Note: The memory of uboot can be adjusted, but avoiding the secure
> > heap will limit the memory range that uboot can use, causing problems
> > such as the uboot stack)
> > 0xB000 0000 ~0xFFFFFFFF is default CMA.
> > So default CMA is reduced.
>
> How is that related to your patch? I assume the default CMA is reduced because
> you modified it in the DTS after enabling the secure heap, as the CMA size is set
> by you. The default CMA size won't automatically decrease due to the secure
> heap. To me, 0xB0000000-0xFFFFFFFF(1.25GiB) is still too large a CMA.
>

Sorry, This example is just an example. In fact, the size of our default CMA is less than 1.25GiB.
Our current memory distribution is as follows. Now the size of "c" (default CMA) could not meet the needs of our requirement. And "b" (reserved memory for secure) is fixed, so we couldn't expand "c" (default CMA) through modify DTS. Then we reserved "a" (specific CMA) for VPU. However, we have confirmed with the multimedia team that the maximum size required is It is uncertain, so specify "a" for VPU to use first, and "c" for other devices that require continuous memory. If "a" is not enough, use "c".
That's the purpose of this patch.
--------------------------------------------------
| a. VPU specific cma |
--------------------------------------------------
| b. reserved memory for secure |
---------------------------------------------------
| c. default CMA |
---------------------------------------------------
> >
> > > >
> > > > > It seems you mean that only in some corner cases do you need a
> > > > > large CMA, but most of the time, you don’t need it to be this
> > > > > big? So you have
> > > to "borrow"
> > > > > memory from the
> > > > > default CMA. but why not move that portion from the default CMA
> > > > > to your VPU’s CMA?
> > > > >
> > > > This is a method, but because for VPU, the continuous memory size
> > > > allocated
> > > by the driver is based on the video stream, we cannot determine the
> > > maximum size of memory required by the VPU. This makes it impossible
> > > for us to determine the size of the specific CMA assigned to the VPU. Thanks.
> > >
> > > I don't understand how this can happen. You should precisely know
> > > the maximum size required for the VPU based on your multimedia
> > > pipeline and resolutions.
> > >
> > We cannot estimate the maximum contiguous memory required by the VPU
> because it depends on how the video is encoded.
> > Thanks very much.
>
> Yes, you can. Please ask your multimedia team; they will give you a number.
>
> >
> > > I still don't understand your scenarios or the problem you are facing.
> >
> >
>
> Thanks
> Barry


Attachments:
smime.p7s (9.56 kB)

2024-06-13 09:44:23

by Barry Song

[permalink] [raw]
Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

On Thu, Jun 13, 2024 at 8:49 PM Zhai He <[email protected]> wrote:
>
> > -----Original Message-----
> > From: Barry Song <[email protected]>
> > Sent: Thursday, June 13, 2024 3:38 PM
> > To: Zhai He <[email protected]>
> > Cc: Andrew Morton <[email protected]>; [email protected];
> > [email protected]; [email protected]; Zhipeng Wang
> > <[email protected]>; Jindong Yue <[email protected]>; Christoph
> > Hellwig <[email protected]>
> > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> > device-specified CMA memory is not enough.
> >
> > Caution: This is an external email. Please take care when clicking links or
> > opening attachments. When in doubt, report the message using the 'Report this
> > email' button
> >
> >
> > On Thu, Jun 13, 2024 at 7:11 PM Zhai He <[email protected]> wrote:
> > >
> > > > -----Original Message-----
> > > > From: Barry Song <[email protected]>
> > > > Sent: Thursday, June 13, 2024 2:15 PM
> > > > To: Zhai He <[email protected]>
> > > > Cc: Andrew Morton <[email protected]>; [email protected];
> > > > [email protected]; [email protected]; Zhipeng Wang
> > > > <[email protected]>; Jindong Yue <[email protected]>;
> > > > Christoph Hellwig <[email protected]>
> > > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA
> > > > when the device-specified CMA memory is not enough.
> > > >
> > > > Caution: This is an external email. Please take care when clicking
> > > > links or opening attachments. When in doubt, report the message
> > > > using the 'Report this email' button
> > > >
> > > >
> > > > On Thu, Jun 13, 2024 at 5:32 PM Zhai He <[email protected]> wrote:
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Barry Song <[email protected]>
> > > > > > Sent: Thursday, June 13, 2024 11:28 AM
> > > > > > To: Zhai He <[email protected]>
> > > > > > Cc: Andrew Morton <[email protected]>; [email protected];
> > > > > > [email protected]; [email protected]; Zhipeng Wang
> > > > > > <[email protected]>; Jindong Yue <[email protected]>;
> > > > > > Christoph Hellwig <[email protected]>
> > > > > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default
> > > > > > CMA when the device-specified CMA memory is not enough.
> > > > > >
> > > > > > Caution: This is an external email. Please take care when
> > > > > > clicking links or opening attachments. When in doubt, report the
> > > > > > message using the 'Report this email' button
> > > > > >
> > > > > >
> > > > > > On Thu, Jun 13, 2024 at 2:34 PM Zhai He <[email protected]> wrote:
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Barry Song <[email protected]>
> > > > > > > > Sent: Thursday, June 13, 2024 5:37 AM
> > > > > > > > To: Andrew Morton <[email protected]>
> > > > > > > > Cc: Zhai He <[email protected]>; [email protected];
> > > > > > > > [email protected]; [email protected];
> > > > > > > > [email protected]; Zhipeng Wang
> > > > > > > > <[email protected]>; Jindong Yue <[email protected]>;
> > > > > > > > Christoph Hellwig <[email protected]>
> > > > > > > > Subject: [EXT] Re: [PATCH v2] Supports to use the default
> > > > > > > > CMA when the device-specified CMA memory is not enough.
> > > > > > > >
> > > > > > > > Caution: This is an external email. Please take care when
> > > > > > > > clicking links or opening attachments. When in doubt, report
> > > > > > > > the message using the 'Report this email' button
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton
> > > > > > > > <[email protected]>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he"
> > > > > > > > > <[email protected]>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > From: He Zhai <[email protected]>
> > > > > > > > >
> > > > > > > > > (cc Barry & Christoph)
> > > > > > > > >
> > > > > > > > > What was your reason for adding cc:stable to the email headers?
> > > > > > > > > Does this address some serious problem? If so, please
> > > > > > > > > fully describe that problem.
> > > > > > > > >
> > > > > > > > > > In the current code logic, if the device-specified CMA
> > > > > > > > > > memory allocation fails, memory will not be allocated
> > > > > > > > > > from the
> > > > default CMA area.
> > > > > > > > > > This patch will use the default cma region when the
> > > > > > > > > > device's specified CMA is not enough.
> > > > > > > > > >
> > > > > > > > > > In addition, the log level of allocation failure is changed to debug.
> > > > > > > > > > Because these logs will be printed when memory
> > > > > > > > > > allocation from the device specified CMA fails, but if
> > > > > > > > > > the allocation fails, it will be allocated from the
> > > > > > > > > > default cma area. It can easily mislead
> > > > > > developers'
> > > > > > > > > > judgment.
> > > > > > > >
> > > > > > > > I am not convinced that this patch is correct. If
> > > > > > > > device-specific CMA is too small, why not increase it in the device
> > tree?
> > > > > > > > Conversely, if the default CMA size is too large, why not
> > > > > > > > reduce it via the cmdline? CMA offers all kinds of flexible
> > > > > > > > configuration options based
> > > > > > on users’ needs.
> > > > > > > >
> > > > > > > > One significant benefit of device-specific CMA is that it
> > > > > > > > helps decrease fragmentation in the common CMA pool. While
> > > > > > > > many devices allocate memory from the same pool, they have
> > > > > > > > different memory requirements in terms of sizes and
> > > > > > > > alignments. Occasions of memory allocation and release can
> > > > > > > > lead to situations where the CMA pool has enough free space,
> > > > > > > > yet someone fails to obtain
> > > > contiguous memory from it.
> > > > > > > >
> > > > > > > > This patch entirely negates the advantage we gain from
> > > > > > > > device-specific
> > > > CMA.
> > > > > > > > My point is that instead of modifying the core code, please
> > > > > > > > consider correcting your device tree or cmdline configurations.
> > > > > > > >
> > > > > > > Because we enabled secure heap to support widevine DRM, and
> > > > > > > secure heap requires security configuration, its starting
> > > > > > > address cannot be specified arbitrarily, which causes the
> > > > > > > default CMA to be reduced. So we
> > > > > > reduced the CMA, but in order to avoid the impact of reducing
> > > > > > the CMA, we used a multi-segment CMA and gave one segment to the
> > VPU.
> > > > > > >
> > > > > > > However, under our memory configuration, the device-specific
> > > > > > > CMA is not
> > > > > > enough to support the VPU decoding high-resolution code streams,
> > > > > > so this patch is added so that the VPU can work properly.
> > > > > > > Thanks.
> > > > > >
> > > > > > I don’t quite understand what you are saying. Why can’t you
> > > > > > increase VPU’s CMA size?
> > > > > Thanks for your quick reply.
> > > > > Because we added a secure heap to support Widevine DRM, this heap
> > > > requires hardware protection, so its starting address cannot be
> > > > specified arbitrarily. This causes the secure heap to occupy part of
> > > > the default CMA, and the default CMA is therefore reduced, so in
> > > > order to avoid default CMA Shrinking introduces other problems. We
> > > > added a specific CMA area for the VPU. However, due to the large
> > > > size of the secure heap and default CMA, There is no remaining memory
> > available to increase specific CMA for VPU.
> > > >
> > > > I assume the secure heap you are referring to is a section of memory
> > > > that should only be accessed by TrustZone and not be visible to
> > > > Linux running in non-secure mode. How do you allocate this secure heap
> > from the default CMA?
> > >
> > > No, secure heap is a reserved memory, secure heap is not allocated from CMA,
> > secure heap has been reserved during the kernel startup phase.
> > > And this reserved memory is protected by hardware. Only specific hardware
> > and secure world can accessed it.
> > > For example:
> > > &{/reserved-memory/} {
> > > secure_region: secure {
> > > compatible = "imx-secure-ion-pool";
> > > reg = <0x0 0xA0000000 0 0x1EF00000>;
> > > };
> > > };
> > >
> > > > Do you use the cma_alloc() APIs or the dma_alloc_coherent() APIs?
> > > > Given that the VPU has its own device-specific CMA, why is this
> > > > secure heap allocated from the default CMA instead of the VPU's CMA?
> > > >
> > > The VPU driver will use dma_alloc_coherent() to allocate contiguous memory.
> > The secure heap is not allocated from the CMA, but because the secure heap is
> > enabled, it occupies some contiguous memory, causing the default CMA to be
> > reduced.
> > >
> > > > If this secure heap was allocated before the kernel booted, why did
> > > > the kernel(your dts) fail to mark this area as nomap/reserved to
> > > > prevent the default CMA from intersecting with it?
> > > >
> > > Secure heap does not intersect with the CMA.
> > > for example:
> > > before secure heap enabled:
> > > 0xA000 0000 ~ 0xFFFFFFFF: default CMA
> > > after secure heap enabled:
> > > 0x9000 0000 ~0x9FFF FFFF is the CMA specified by VPU,
> > > 0xA000 0000 ~0xAFFF FFFF is secure heap, (the start address cannot be
> > specified arbitrarily, because this memory is protected by hardware, if the start
> > address is 0x9000 0000, uboot will use this memory, but uboot can't access this
> > memory because of hardware protection. So we find a section of memory that
> > UBOOT will not use as secure heap.
> > > Note: The memory of uboot can be adjusted, but avoiding the secure
> > > heap will limit the memory range that uboot can use, causing problems
> > > such as the uboot stack)
> > > 0xB000 0000 ~0xFFFFFFFF is default CMA.
> > > So default CMA is reduced.
> >
> > How is that related to your patch? I assume the default CMA is reduced because
> > you modified it in the DTS after enabling the secure heap, as the CMA size is set
> > by you. The default CMA size won't automatically decrease due to the secure
> > heap. To me, 0xB0000000-0xFFFFFFFF(1.25GiB) is still too large a CMA.
> >
>
> Sorry, This example is just an example. In fact, the size of our default CMA is less than 1.25GiB.
> Our current memory distribution is as follows. Now the size of "c" (default CMA) could not meet the needs of our requirement. And "b" (reserved memory for secure) is fixed, so we couldn't expand "c" (default CMA) through modify DTS. Then we reserved "a" (specific CMA) for VPU. However, we have confirmed with the multimedia team that the maximum size required is It is uncertain, so specify "a" for VPU to use first, and "c" for other devices that require continuous memory. If "a" is not enough, use "c".
> That's the purpose of this patch.
> --------------------------------------------------
> | a. VPU specific cma |
> --------------------------------------------------
> | b. reserved memory for secure |
> ---------------------------------------------------
> | c. default CMA |
> ---------------------------------------------------
> > >

Ok, I understand your problem. Because B is enabled, you can't have C as large
as you would without B. So you add A, but A might have insufficient space for
high-resolution video. You also don't know the exact size needed for A. In the
corner case of encoding/decoding high-resolution video, A might be insufficient,
forcing you to borrow memory from C. Because B is situated between A and C,
creating a gap, you cannot merge A and C into a single default CMA.

This does indeed seem like a valid requirement. Please ensure to
clearly describe
the problem next time. However, as a general rule, allowing device-specific CMAs
to borrow from the default CMA is not advisable. This would undermine
the reasons
why we started supporting device-specific CMAs in the first place.

So the problem is that memory holes may prevent the formation of large CMAs.

This situation raises a question: can we have two or more default CMAs due to
memory holes like B, which might hinder the system from obtaining a sufficiently
large default CMA?

If you could define both A and C as default CMAs, you wouldn't require
these kinds of
fallbacks. Is it possible for you to make dma_contiguous_default_area
a list rather
than a global variant ?

Another option is that we allow devices to have more than one memory-region,
we can use device tree to fallback.
memory-region = <&mem1, &mem2>;

My perspective is that I acknowledge your problem as a valid requirement.
However, I find the approach to be too aggressive.

> > > > >
> > > > > > It seems you mean that only in some corner cases do you need a
> > > > > > large CMA, but most of the time, you don’t need it to be this
> > > > > > big? So you have
> > > > to "borrow"
> > > > > > memory from the
> > > > > > default CMA. but why not move that portion from the default CMA
> > > > > > to your VPU’s CMA?
> > > > > >
> > > > > This is a method, but because for VPU, the continuous memory size
> > > > > allocated
> > > > by the driver is based on the video stream, we cannot determine the
> > > > maximum size of memory required by the VPU. This makes it impossible
> > > > for us to determine the size of the specific CMA assigned to the VPU. Thanks.
> > > >
> > > > I don't understand how this can happen. You should precisely know
> > > > the maximum size required for the VPU based on your multimedia
> > > > pipeline and resolutions.
> > > >
> > > We cannot estimate the maximum contiguous memory required by the VPU
> > because it depends on how the video is encoded.
> > > Thanks very much.
> >
> > Yes, you can. Please ask your multimedia team; they will give you a number.
> >
> > >
> > > > I still don't understand your scenarios or the problem you are facing.
> > >
> > >
> >

Thanks
Barry

2024-06-13 10:33:07

by Zhai He

[permalink] [raw]
Subject: RE: [EXT] Re: [PATCH v2] Supports to use the default CMA when the device-specified CMA memory is not enough.

> -----Original Message-----
> From: Barry Song <[email protected]>
> Sent: Thursday, June 13, 2024 5:44 PM
> To: Zhai He <[email protected]>
> Cc: Andrew Morton <[email protected]>; [email protected];
> [email protected]; [email protected]; Zhipeng Wang
> <[email protected]>; Jindong Yue <[email protected]>; Christoph
> Hellwig <[email protected]>
> Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA when the
> device-specified CMA memory is not enough.
>
> Caution: This is an external email. Please take care when clicking links or
> opening attachments. When in doubt, report the message using the 'Report this
> email' button
>
>
> On Thu, Jun 13, 2024 at 8:49 PM Zhai He <[email protected]> wrote:
> >
> > > -----Original Message-----
> > > From: Barry Song <[email protected]>
> > > Sent: Thursday, June 13, 2024 3:38 PM
> > > To: Zhai He <[email protected]>
> > > Cc: Andrew Morton <[email protected]>; [email protected];
> > > [email protected]; [email protected]; Zhipeng Wang
> > > <[email protected]>; Jindong Yue <[email protected]>;
> > > Christoph Hellwig <[email protected]>
> > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default CMA
> > > when the device-specified CMA memory is not enough.
> > >
> > > Caution: This is an external email. Please take care when clicking
> > > links or opening attachments. When in doubt, report the message
> > > using the 'Report this email' button
> > >
> > >
> > > On Thu, Jun 13, 2024 at 7:11 PM Zhai He <[email protected]> wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Barry Song <[email protected]>
> > > > > Sent: Thursday, June 13, 2024 2:15 PM
> > > > > To: Zhai He <[email protected]>
> > > > > Cc: Andrew Morton <[email protected]>; [email protected];
> > > > > [email protected]; [email protected]; Zhipeng Wang
> > > > > <[email protected]>; Jindong Yue <[email protected]>;
> > > > > Christoph Hellwig <[email protected]>
> > > > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the default
> > > > > CMA when the device-specified CMA memory is not enough.
> > > > >
> > > > > Caution: This is an external email. Please take care when
> > > > > clicking links or opening attachments. When in doubt, report the
> > > > > message using the 'Report this email' button
> > > > >
> > > > >
> > > > > On Thu, Jun 13, 2024 at 5:32 PM Zhai He <[email protected]> wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Barry Song <[email protected]>
> > > > > > > Sent: Thursday, June 13, 2024 11:28 AM
> > > > > > > To: Zhai He <[email protected]>
> > > > > > > Cc: Andrew Morton <[email protected]>;
> > > > > > > [email protected]; [email protected];
> > > > > > > [email protected]; Zhipeng Wang
> > > > > > > <[email protected]>; Jindong Yue <[email protected]>;
> > > > > > > Christoph Hellwig <[email protected]>
> > > > > > > Subject: Re: [EXT] Re: [PATCH v2] Supports to use the
> > > > > > > default CMA when the device-specified CMA memory is not enough.
> > > > > > >
> > > > > > > Caution: This is an external email. Please take care when
> > > > > > > clicking links or opening attachments. When in doubt, report
> > > > > > > the message using the 'Report this email' button
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jun 13, 2024 at 2:34 PM Zhai He <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Barry Song <[email protected]>
> > > > > > > > > Sent: Thursday, June 13, 2024 5:37 AM
> > > > > > > > > To: Andrew Morton <[email protected]>
> > > > > > > > > Cc: Zhai He <[email protected]>; [email protected];
> > > > > > > > > [email protected]; [email protected];
> > > > > > > > > [email protected]; Zhipeng Wang
> > > > > > > > > <[email protected]>; Jindong Yue
> > > > > > > > > <[email protected]>; Christoph Hellwig <[email protected]>
> > > > > > > > > Subject: [EXT] Re: [PATCH v2] Supports to use the
> > > > > > > > > default CMA when the device-specified CMA memory is not
> enough.
> > > > > > > > >
> > > > > > > > > Caution: This is an external email. Please take care
> > > > > > > > > when clicking links or opening attachments. When in
> > > > > > > > > doubt, report the message using the 'Report this email'
> > > > > > > > > button
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Jun 13, 2024 at 6:47 AM Andrew Morton
> > > > > > > > > <[email protected]>
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, 12 Jun 2024 16:12:16 +0800 "zhai.he"
> > > > > > > > > > <[email protected]>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > From: He Zhai <[email protected]>
> > > > > > > > > >
> > > > > > > > > > (cc Barry & Christoph)
> > > > > > > > > >
> > > > > > > > > > What was your reason for adding cc:stable to the email
> headers?
> > > > > > > > > > Does this address some serious problem? If so, please
> > > > > > > > > > fully describe that problem.
> > > > > > > > > >
> > > > > > > > > > > In the current code logic, if the device-specified
> > > > > > > > > > > CMA memory allocation fails, memory will not be
> > > > > > > > > > > allocated from the
> > > > > default CMA area.
> > > > > > > > > > > This patch will use the default cma region when the
> > > > > > > > > > > device's specified CMA is not enough.
> > > > > > > > > > >
> > > > > > > > > > > In addition, the log level of allocation failure is changed to
> debug.
> > > > > > > > > > > Because these logs will be printed when memory
> > > > > > > > > > > allocation from the device specified CMA fails, but
> > > > > > > > > > > if the allocation fails, it will be allocated from
> > > > > > > > > > > the default cma area. It can easily mislead
> > > > > > > developers'
> > > > > > > > > > > judgment.
> > > > > > > > >
> > > > > > > > > I am not convinced that this patch is correct. If
> > > > > > > > > device-specific CMA is too small, why not increase it in
> > > > > > > > > the device
> > > tree?
> > > > > > > > > Conversely, if the default CMA size is too large, why
> > > > > > > > > not reduce it via the cmdline? CMA offers all kinds of
> > > > > > > > > flexible configuration options based
> > > > > > > on users’ needs.
> > > > > > > > >
> > > > > > > > > One significant benefit of device-specific CMA is that
> > > > > > > > > it helps decrease fragmentation in the common CMA pool.
> > > > > > > > > While many devices allocate memory from the same pool,
> > > > > > > > > they have different memory requirements in terms of
> > > > > > > > > sizes and alignments. Occasions of memory allocation and
> > > > > > > > > release can lead to situations where the CMA pool has
> > > > > > > > > enough free space, yet someone fails to obtain
> > > > > contiguous memory from it.
> > > > > > > > >
> > > > > > > > > This patch entirely negates the advantage we gain from
> > > > > > > > > device-specific
> > > > > CMA.
> > > > > > > > > My point is that instead of modifying the core code,
> > > > > > > > > please consider correcting your device tree or cmdline
> configurations.
> > > > > > > > >
> > > > > > > > Because we enabled secure heap to support widevine DRM,
> > > > > > > > and secure heap requires security configuration, its
> > > > > > > > starting address cannot be specified arbitrarily, which
> > > > > > > > causes the default CMA to be reduced. So we
> > > > > > > reduced the CMA, but in order to avoid the impact of
> > > > > > > reducing the CMA, we used a multi-segment CMA and gave one
> > > > > > > segment to the
> > > VPU.
> > > > > > > >
> > > > > > > > However, under our memory configuration, the
> > > > > > > > device-specific CMA is not
> > > > > > > enough to support the VPU decoding high-resolution code
> > > > > > > streams, so this patch is added so that the VPU can work properly.
> > > > > > > > Thanks.
> > > > > > >
> > > > > > > I don’t quite understand what you are saying. Why can’t you
> > > > > > > increase VPU’s CMA size?
> > > > > > Thanks for your quick reply.
> > > > > > Because we added a secure heap to support Widevine DRM, this
> > > > > > heap
> > > > > requires hardware protection, so its starting address cannot be
> > > > > specified arbitrarily. This causes the secure heap to occupy
> > > > > part of the default CMA, and the default CMA is therefore
> > > > > reduced, so in order to avoid default CMA Shrinking introduces
> > > > > other problems. We added a specific CMA area for the VPU.
> > > > > However, due to the large size of the secure heap and default
> > > > > CMA, There is no remaining memory
> > > available to increase specific CMA for VPU.
> > > > >
> > > > > I assume the secure heap you are referring to is a section of
> > > > > memory that should only be accessed by TrustZone and not be
> > > > > visible to Linux running in non-secure mode. How do you allocate
> > > > > this secure heap
> > > from the default CMA?
> > > >
> > > > No, secure heap is a reserved memory, secure heap is not allocated
> > > > from CMA,
> > > secure heap has been reserved during the kernel startup phase.
> > > > And this reserved memory is protected by hardware. Only specific
> > > > hardware
> > > and secure world can accessed it.
> > > > For example:
> > > > &{/reserved-memory/} {
> > > > secure_region: secure {
> > > > compatible = "imx-secure-ion-pool";
> > > > reg = <0x0 0xA0000000 0 0x1EF00000>;
> > > > };
> > > > };
> > > >
> > > > > Do you use the cma_alloc() APIs or the dma_alloc_coherent() APIs?
> > > > > Given that the VPU has its own device-specific CMA, why is this
> > > > > secure heap allocated from the default CMA instead of the VPU's CMA?
> > > > >
> > > > The VPU driver will use dma_alloc_coherent() to allocate contiguous
> memory.
> > > The secure heap is not allocated from the CMA, but because the
> > > secure heap is enabled, it occupies some contiguous memory, causing
> > > the default CMA to be reduced.
> > > >
> > > > > If this secure heap was allocated before the kernel booted, why
> > > > > did the kernel(your dts) fail to mark this area as
> > > > > nomap/reserved to prevent the default CMA from intersecting with it?
> > > > >
> > > > Secure heap does not intersect with the CMA.
> > > > for example:
> > > > before secure heap enabled:
> > > > 0xA000 0000 ~ 0xFFFFFFFF: default CMA after secure heap enabled:
> > > > 0x9000 0000 ~0x9FFF FFFF is the CMA specified by VPU,
> > > > 0xA000 0000 ~0xAFFF FFFF is secure heap, (the start address cannot
> > > > be
> > > specified arbitrarily, because this memory is protected by hardware,
> > > if the start address is 0x9000 0000, uboot will use this memory, but
> > > uboot can't access this memory because of hardware protection. So we
> > > find a section of memory that UBOOT will not use as secure heap.
> > > > Note: The memory of uboot can be adjusted, but avoiding the secure
> > > > heap will limit the memory range that uboot can use, causing
> > > > problems such as the uboot stack)
> > > > 0xB000 0000 ~0xFFFFFFFF is default CMA.
> > > > So default CMA is reduced.
> > >
> > > How is that related to your patch? I assume the default CMA is
> > > reduced because you modified it in the DTS after enabling the secure
> > > heap, as the CMA size is set by you. The default CMA size won't
> > > automatically decrease due to the secure heap. To me,
> 0xB0000000-0xFFFFFFFF(1.25GiB) is still too large a CMA.
> > >
> >
> > Sorry, This example is just an example. In fact, the size of our default CMA is
> less than 1.25GiB.
> > Our current memory distribution is as follows. Now the size of "c" (default
> CMA) could not meet the needs of our requirement. And "b" (reserved memory
> for secure) is fixed, so we couldn't expand "c" (default CMA) through modify DTS.
> Then we reserved "a" (specific CMA) for VPU. However, we have confirmed with
> the multimedia team that the maximum size required is It is uncertain, so
> specify "a" for VPU to use first, and "c" for other devices that require continuous
> memory. If "a" is not enough, use "c".
> > That's the purpose of this patch.
> > --------------------------------------------------
> > | a. VPU specific cma |
> > --------------------------------------------------
> > | b. reserved memory for secure |
> > ---------------------------------------------------
> > | c. default CMA |
> > ---------------------------------------------------
> > > >
>
> Ok, I understand your problem. Because B is enabled, you can't have C as large
> as you would without B. So you add A, but A might have insufficient space for
> high-resolution video. You also don't know the exact size needed for A. In the
> corner case of encoding/decoding high-resolution video, A might be insufficient,
> forcing you to borrow memory from C. Because B is situated between A and C,
> creating a gap, you cannot merge A and C into a single default CMA.
>
> This does indeed seem like a valid requirement. Please ensure to clearly describe
> the problem next time. However, as a general rule, allowing device-specific
> CMAs to borrow from the default CMA is not advisable. This would undermine
> the reasons why we started supporting device-specific CMAs in the first place.
>
> So the problem is that memory holes may prevent the formation of large CMAs.
>
> This situation raises a question: can we have two or more default CMAs due to
> memory holes like B, which might hinder the system from obtaining a sufficiently
> large default CMA?
>
> If you could define both A and C as default CMAs, you wouldn't require these
> kinds of fallbacks. Is it possible for you to make dma_contiguous_default_area a
> list rather than a global variant ?
>
> Another option is that we allow devices to have more than one memory-region,
> we can use device tree to fallback.
> memory-region = <&mem1, &mem2>;
>
> My perspective is that I acknowledge your problem as a valid requirement.
> However, I find the approach to be too aggressive.
>
OK, thanks very much for your suggestions.
I will try the method you proposed but I'm not sure I can implement it, it's a bit difficult for me. ^_^
All in all, thank you for your patient reply.

> > > > > >
> > > > > > > It seems you mean that only in some corner cases do you need
> > > > > > > a large CMA, but most of the time, you don’t need it to be
> > > > > > > this big? So you have
> > > > > to "borrow"
> > > > > > > memory from the
> > > > > > > default CMA. but why not move that portion from the default
> > > > > > > CMA to your VPU’s CMA?
> > > > > > >
> > > > > > This is a method, but because for VPU, the continuous memory
> > > > > > size allocated
> > > > > by the driver is based on the video stream, we cannot determine
> > > > > the maximum size of memory required by the VPU. This makes it
> > > > > impossible for us to determine the size of the specific CMA assigned to
> the VPU. Thanks.
> > > > >
> > > > > I don't understand how this can happen. You should precisely
> > > > > know the maximum size required for the VPU based on your
> > > > > multimedia pipeline and resolutions.
> > > > >
> > > > We cannot estimate the maximum contiguous memory required by the
> > > > VPU
> > > because it depends on how the video is encoded.
> > > > Thanks very much.
> > >
> > > Yes, you can. Please ask your multimedia team; they will give you a number.
> > >
> > > >
> > > > > I still don't understand your scenarios or the problem you are facing.
> > > >
> > > >
> > >
>
> Thanks
> Barry


Attachments:
smime.p7s (9.56 kB)