2024-06-13 15:09:31

by Devarsh Thakkar

Subject: [PATCH 0/3] Add global CMA reserve area

Add a global CMA reserve area for the AM62x, AM62A and AM62P SoCs.
These SoCs do not have an IOMMU and hence require a contiguous memory pool to
support various multimedia use-cases.

Brandon Brnich (1):
arm64: dts: ti: k3-am62p5-sk: Reserve 576 MiB of global CMA

Devarsh Thakkar (2):
arm64: dts: ti: k3-am62x-sk-common: Reserve 128MiB of global CMA
arm64: dts: ti: k3-am62a7-sk: Reserve 576MiB of global CMA

arch/arm64/boot/dts/ti/k3-am62a7-sk.dts | 9 +++++++++
arch/arm64/boot/dts/ti/k3-am62p5-sk.dts | 7 +++++++
arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 8 ++++++++
3 files changed, 24 insertions(+)

--
2.39.1



2024-06-13 15:09:32

by Devarsh Thakkar

Subject: [PATCH 1/3] arm64: dts: ti: k3-am62x-sk-common: Reserve 128MiB of global CMA

Reserve 128MiB of global CMA, marked as reusable so that the OS can use
the memory whenever peripheral drivers are not using it.

AM62x supports multimedia components such as GPU, dual display and camera.
Assuming the worst-case scenario where all three run in parallel, the
calculation is as follows:

1) OV5640 camera sensor supports 1920x1080 resolution
-> 1920 width x 1080 height x 2 bytesperpixel x 8 buffers
(default in yavta) : 32MiB

2) 1920x1200 Microtips LVDS panel supported
-> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
16 MiB

3) 1920x1080 HDMI display supported
-> 1920 width x 1080 height x 4 bytesperpixel x 2 buffers :
15.82 MiB which is ~16 MiB

4) The IMG GPU shares display-allocated buffers while rendering,
but in case a dedicated operation such as color conversion needs its
own buffers, the same ~16 MiB window is kept for the GPU too.

The total is 80 MiB. Adding 32 MiB for other peripherals and an extra
16 MiB as a buffer against fragmentation rounds the total to 128
MiB.
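
The arithmetic above can be re-checked with a short script. This is only a sketch, not part of the patch: the helper name buffers_mib is hypothetical, each item is rounded up to a whole MiB as the message does, and the last figure is an extra cross-check for the LVDS panel using its stated 1200 lines instead of the 1080 used in item 2.

```python
import math

MIB = 1024 * 1024

def buffers_mib(width, height, bytes_per_pixel, count):
    """Return the size in MiB of `count` frame buffers of the given geometry."""
    return width * height * bytes_per_pixel * count / MIB

# Figures exactly as written in the commit message.
camera = buffers_mib(1920, 1080, 2, 8)  # OV5640, 8 buffers
lvds = buffers_mib(1920, 1080, 4, 2)    # LVDS panel, computed with 1080 lines
hdmi = buffers_mib(1920, 1080, 4, 2)    # HDMI display
gpu = hdmi                              # same window kept for the GPU

# Round each item up to a whole MiB, as the message does.
subtotal = sum(math.ceil(x) for x in (camera, lvds, hdmi, gpu))
total = subtotal + 32 + 16  # other peripherals + fragmentation margin

# Cross-check (assumption): the same panel computed with its full 1200 lines.
lvds_1200 = buffers_mib(1920, 1200, 4, 2)

print(f"camera {camera:.2f} MiB, display {hdmi:.2f} MiB, LVDS@1200 {lvds_1200:.2f} MiB")
print(f"subtotal {subtotal} MiB, total {total} MiB, size cell {hex(total * MIB)}")
```

Computed with 1200 lines, the LVDS buffers come to about 17.58 MiB rather than 16 MiB, a difference of under 2 MiB that the 16 MiB fragmentation margin would absorb.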

Signed-off-by: Devarsh Thakkar <[email protected]>
---
arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
index f4948b937627..52231bfe60fe 100644
--- a/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
+++ b/arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi
@@ -48,6 +48,14 @@ ramoops@9ca00000 {
 			pmsg-size = <0x8000>;
 		};

+		/* global cma region */
+		linux,cma {
+			compatible = "shared-dma-pool";
+			reusable;
+			size = <0x00 0x8000000>;
+			linux,cma-default;
+		};
+
 		secure_tfa_ddr: tfa@9e780000 {
 			reg = <0x00 0x9e780000 0x00 0x80000>;
 			alignment = <0x1000>;
--
2.39.1


2024-06-13 15:09:53

by Devarsh Thakkar

Subject: [PATCH 3/3] arm64: dts: ti: k3-am62p5-sk: Reserve 576 MiB of global CMA

From: Brandon Brnich <[email protected]>

AM62p has different multimedia components such as Camera, Display and H264
Video Codec, which use CMA for buffer allocations. We require 576MiB for the
12-channel decode-to-encode 720x480@30FPS use case.

Signed-off-by: Brandon Brnich <[email protected]>
Signed-off-by: Devarsh Thakkar <[email protected]>
---
arch/arm64/boot/dts/ti/k3-am62p5-sk.dts | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts b/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts
index fb980d46e304..5ef74d9f8eea 100644
--- a/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts
+++ b/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts
@@ -48,6 +48,13 @@ reserved-memory {
 		#size-cells = <2>;
 		ranges;

+		linux,cma {
+			compatible = "shared-dma-pool";
+			reusable;
+			size = <0x00 0x24000000>;
+			linux,cma-default;
+		};
+
 		secure_tfa_ddr: tfa@9e780000 {
 			reg = <0x00 0x9e780000 0x00 0x80000>;
 			no-map;
--
2.39.1


2024-06-13 15:10:13

by Devarsh Thakkar

Subject: [PATCH 2/3] arm64: dts: ti: k3-am62a7-sk: Reserve 576MiB of global CMA

Reserve 576MiB of CMA as the global CMA pool, starting after the initial
1GiB of DDR.

AM62ax has different multimedia components such as Camera, Display, H.264
VPU and JPEG Encoder which use CMA for buffer allocations.

The 12x 720x480 realtime VPU decode use-case requires 544MiB of CMA; an
additional 32MiB is kept as a buffer in case some other peripheral also
requires it while the VPU is running.

The latter 1GiB is chosen so as not to overlap with the existing memory map,
which uses the initial 1GiB for remoteproc firmwares as shared here
[1].

Also, some drivers such as JPEG require 32-bit addressing, so the pool is
not allocated from higher DDR addresses.
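
The placement described above can be sanity-checked against the alloc-ranges cells used in the diff. This is a sketch only: the DDR base of 0x80000000 is an assumption (it is not stated in this message), and the variable names are illustrative.

```python
GIB = 1 << 30
MIB = 1 << 20

DDR_BASE = 0x80000000  # assumption: start of DDR on this board

# alloc-ranges cells from the patch: <addr_hi addr_lo size_hi size_lo>
addr_hi, addr_lo, size_hi, size_lo = 0x00, 0xC0000000, 0x00, 0x24000000

start = (addr_hi << 32) | addr_lo
length = (size_hi << 32) | size_lo
end = start + length  # exclusive end of the window

# The pool starts right after the first 1GiB of DDR.
starts_after_first_gib = start == DDR_BASE + GIB
# 544MiB for the VPU use-case plus the 32MiB margin.
matches_budget = length == (544 + 32) * MIB
# The whole window stays below 4GiB, i.e. 32-bit addressable.
fits_32bit = end <= (1 << 32)

print(hex(start), hex(end), length // MIB)
print(starts_after_first_gib, matches_budget, fits_32bit)
```

Under that assumption the window is 0xc0000000 to 0xe4000000, which leaves the first 1GiB to the remoteproc carveouts and keeps every CMA address within 32 bits.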

Link: https://lore.kernel.org/all/[email protected] [1]
Signed-off-by: Devarsh Thakkar <[email protected]>
---
arch/arm64/boot/dts/ti/k3-am62a7-sk.dts | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts b/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
index e026f65738b3..67faf46d7a35 100644
--- a/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
+++ b/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
@@ -40,6 +40,15 @@ reserved-memory {
 		#size-cells = <2>;
 		ranges;

+		/* global cma region */
+		linux,cma {
+			compatible = "shared-dma-pool";
+			reusable;
+			size = <0x00 0x24000000>;
+			alloc-ranges = <0x00 0xc0000000 0x00 0x24000000>;
+			linux,cma-default;
+		};
+
 		secure_tfa_ddr: tfa@9e780000 {
 			reg = <0x00 0x9e780000 0x00 0x80000>;
 			alignment = <0x1000>;
--
2.39.1


2024-06-14 17:05:17

by Brandon Brnich

Subject: Re: [PATCH 3/3] arm64: dts: ti: k3-am62p5-sk: Reserve 576 MiB of global CMA

Hi Devarsh,


On 20:39-20240613, Devarsh Thakkar wrote:
> From: Brandon Brnich <[email protected]>
>
> AM62p has different multimedia components such as Camera, Display, H264
> Video Codec which uses CMA for buffer allocations. We require 576MiB for 12
> channel decode-to-encode 720x480@30FPS use case.
>
> Signed-off-by: Brandon Brnich <[email protected]>
> Signed-off-by: Devarsh Thakkar <[email protected]>
> ---
> arch/arm64/boot/dts/ti/k3-am62p5-sk.dts | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts b/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts
> index fb980d46e304..5ef74d9f8eea 100644
> --- a/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts
> +++ b/arch/arm64/boot/dts/ti/k3-am62p5-sk.dts
> @@ -48,6 +48,13 @@ reserved-memory {
>  		#size-cells = <2>;
>  		ranges;
>
> +		linux,cma {
> +			compatible = "shared-dma-pool";
> +			reusable;
> +			size = <0x00 0x24000000>;
> +			linux,cma-default;
> +		};

Since AM62p has 8GB of memory, this allocation can come from the upper
portion. Doing so breaks Wave5 encoding/decoding, as the driver cannot yet
handle 48-bit addressing. 48-bit support is scheduled to be upstreamed, but
I am unsure when it will actually make it in.

Could we force this into the lower 32 bits using the same
alloc-ranges as done in your AM62a patch[0]?


[0]: https://patchwork.kernel.org/project/linux-arm-kernel/patch/[email protected]/

Best,
Brandon

> +
>  		secure_tfa_ddr: tfa@9e780000 {
>  			reg = <0x00 0x9e780000 0x00 0x80000>;
>  			no-map;
> --
> 2.39.1
>

2024-06-14 17:28:04

by Brandon Brnich

Subject: Re: [PATCH 2/3] arm64: dts: ti: k3-am62a7-sk: Reserve 576MiB of global CMA

Hi Devarsh,

On 20:39-20240613, Devarsh Thakkar wrote:
> Reserve 576MiB of CMA as global CMA pool starting after initial 1GiB of
> DDR.
>
> AM62ax has different multimedia components such as Camera, Display, H.264
> VPU and JPEG Encoder which use CMA for buffer allocations.
>
> The 12x 720x480 realtime VPU decode use-case requires 544MiB of CMA,
> additional 32MiB is kept as buffer in case some other peripheral also
> require it while VPU is running.
>
> The reason to choose latter 1GiB is to not overlap with existing memory map
> which is utilizing initial 1GiB for remoteproc firmwares as shared here
> [1].
>
> Also some drivers such as JPEG require 32bit addressing so not allocating
> from higher DDR address.
>
> Link: https://lore.kernel.org/all/[email protected] [1]
> Signed-off-by: Devarsh Thakkar <[email protected]>

I have validated that this patch works with VPU.

Tested-by: Brandon Brnich <[email protected]>

Best,
Brandon

> ---
> arch/arm64/boot/dts/ti/k3-am62a7-sk.dts | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts b/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
> index e026f65738b3..67faf46d7a35 100644
> --- a/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
> +++ b/arch/arm64/boot/dts/ti/k3-am62a7-sk.dts
> @@ -40,6 +40,15 @@ reserved-memory {
>  		#size-cells = <2>;
>  		ranges;
>
> +		/* global cma region */
> +		linux,cma {
> +			compatible = "shared-dma-pool";
> +			reusable;
> +			size = <0x00 0x24000000>;
> +			alloc-ranges = <0x00 0xc0000000 0x00 0x24000000>;
> +			linux,cma-default;
> +		};
> +
>  		secure_tfa_ddr: tfa@9e780000 {
>  			reg = <0x00 0x9e780000 0x00 0x80000>;
>  			alignment = <0x1000>;
> --
> 2.39.1
>

2024-06-14 18:02:10

by Randolph Sapp

Subject: Re: [PATCH 0/3] Add global CMA reserve area

On Thu Jun 13, 2024 at 10:08 AM CDT, Devarsh Thakkar wrote:
> Add global CMA reserve area for AM62x, AM62A and AM62P SoCs.
> These SoCs do not have MMU and hence require contiguous memory pool to
> support various multimedia use-cases.
>
> Brandon Brnich (1):
> arm64: dts: ti: k3-am62p5-sk: Reserve 576 MiB of global CMA
>
> Devarsh Thakkar (2):
> arm64: dts: ti: k3-am62x-sk-common: Reserve 128MiB of global CMA
> arm64: dts: ti: k3-am62a7-sk: Reserve 576MiB of global CMA
>
> arch/arm64/boot/dts/ti/k3-am62a7-sk.dts | 9 +++++++++
> arch/arm64/boot/dts/ti/k3-am62p5-sk.dts | 7 +++++++
> arch/arm64/boot/dts/ti/k3-am62x-sk-common.dtsi | 8 ++++++++
> 3 files changed, 24 insertions(+)

I'm still a little torn about putting this allocation into the device tree
directly as the actual required allocation size depends on the task.

If it's allowed though, this series is fine for introducing those changes. This
uses the long-tested values we've been using on our tree for a bit now. The only
thing that's a little worrying is the missing range definitions for devices with
more than 32bits of addressable memory as Brandon has pointed out. Once that's
addressed:

Reviewed-by: Randolph Sapp <[email protected]>

Specifying these regions using the kernel cmdline parameter via u-boot was
brought up as a potential workaround. This is fine until you get into distro
boot methods which will almost certainly attempt to override those. I don't
know. Still a little odd. Curious how the community feels about it.

Technically the user or distro can still override it with the cmdline parameter
if necessary, so this may be the best way to have a useful default.