2020-08-18 07:48:48

by Hyesoo Yu

[permalink] [raw]
Subject: [PATCH 0/3] Chunk Heap Support on DMA-HEAP

These patch series to introduce a new dma heap, chunk heap.
That heap is needed for special HW that requires bulk allocation of
fixed high order pages. For example, 64MB dma-buf pages are made up
to fixed order-4 pages * 1024.

The chunk heap uses alloc_pages_bulk to allocate high order page.
https://lore.kernel.org/linux-mm/[email protected]

The chunk heap is registered by device tree with alignment and memory node
of contiguous memory allocator(CMA). Alignment defines chunk page size.
For example, alignment 0x1_0000 means chunk page size is 64KB.
The phandle to memory node indicates contiguous memory allocator(CMA).
If device node doesn't have cma, the registration of chunk heap fails.

The patchset includes the following:
- export dma-heap API to register kernel module dma heap.
- add chunk heap implementation.
- document of device tree to register chunk heap

Hyesoo Yu (3):
dma-buf: add missing EXPORT_SYMBOL_GPL() for dma heaps
dma-buf: heaps: add chunk heap to dmabuf heaps
dma-heap: Devicetree binding for chunk heap

.../devicetree/bindings/dma-buf/chunk_heap.yaml | 46 +++++
drivers/dma-buf/dma-heap.c | 2 +
drivers/dma-buf/heaps/Kconfig | 9 +
drivers/dma-buf/heaps/Makefile | 1 +
drivers/dma-buf/heaps/chunk_heap.c | 222 +++++++++++++++++++++
drivers/dma-buf/heaps/heap-helpers.c | 2 +
6 files changed, 282 insertions(+)
create mode 100644 Documentation/devicetree/bindings/dma-buf/chunk_heap.yaml
create mode 100644 drivers/dma-buf/heaps/chunk_heap.c

--
2.7.4


2020-08-18 07:49:55

by Hyesoo Yu

[permalink] [raw]
Subject: [PATCH 3/3] dma-heap: Devicetree binding for chunk heap

Document devicetree binding for chunk heap on dma heap framework

Signed-off-by: Hyesoo Yu <[email protected]>
---
.../devicetree/bindings/dma-buf/chunk_heap.yaml | 46 ++++++++++++++++++++++
1 file changed, 46 insertions(+)
create mode 100644 Documentation/devicetree/bindings/dma-buf/chunk_heap.yaml

diff --git a/Documentation/devicetree/bindings/dma-buf/chunk_heap.yaml b/Documentation/devicetree/bindings/dma-buf/chunk_heap.yaml
new file mode 100644
index 0000000..1ee8fad
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma-buf/chunk_heap.yaml
@@ -0,0 +1,46 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/dma-buf/chunk_heap.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Device tree binding for chunk heap on DMA HEAP FRAMEWORK
+
+maintainers:
+ - Sumit Semwal <[email protected]>
+
+description: |
+ The chunk heap is backed by the Contiguous Memory Allocator (CMA) and
+ allocates the buffers that are made up to a list of fixed size chunks
+ taken from CMA. Chunk sizes are configurated when the heaps are created.
+
+properties:
+ compatible:
+ enum:
+ - dma_heap,chunk
+
+required:
+ - compatible
+ - memory-region
+ - alignment
+
+additionalProperties: false
+
+examples:
+ - |
+ reserved-memory {
+ #address-cells = <2>;
+ #size-cells = <1>;
+
+ chunk_memory: chunk_memory {
+ compatible = "shared-dma-pool";
+ reusable;
+ size = <0x10000000>;
+ };
+ };
+
+ chunk_default_heap: chunk_default_heap {
+ compatible = "dma_heap,chunk";
+ memory-region = <&chunk_memory>;
+ alignment = <0x10000>;
+ };
--
2.7.4

2020-08-18 11:00:08

by Brian Starkey

[permalink] [raw]
Subject: Re: [PATCH 0/3] Chunk Heap Support on DMA-HEAP

Hi,

On Tue, Aug 18, 2020 at 05:04:12PM +0900, Hyesoo Yu wrote:
> These patch series to introduce a new dma heap, chunk heap.
> That heap is needed for special HW that requires bulk allocation of
> fixed high order pages. For example, 64MB dma-buf pages are made up
> to fixed order-4 pages * 1024.
>
> The chunk heap uses alloc_pages_bulk to allocate high order page.
> https://lore.kernel.org/linux-mm/[email protected]
>
> The chunk heap is registered by device tree with alignment and memory node
> of contiguous memory allocator(CMA). Alignment defines chunk page size.
> For example, alignment 0x1_0000 means chunk page size is 64KB.
> The phandle to memory node indicates contiguous memory allocator(CMA).
> If device node doesn't have cma, the registration of chunk heap fails.

This reminds me of an ion heap developed at Arm several years ago:
https://git.linaro.org/landing-teams/working/arm/kernel.git/tree/drivers/staging/android/ion/ion_compound_page.c

Some more descriptive text here:
https://github.com/ARM-software/CPA

It maintains a pool of high-order pages with a worker thread to
attempt compaction and allocation to keep the pool filled, with high
and low watermarks to trigger freeing/allocating of chunks.
It implements a shrinker to allow the system to reclaim the pool under
high memory pressure.

Is maintaining a pool something you considered? From the
alloc_pages_bulk thread it sounds like you want to allocate 300M at a
time, so I expect if you tuned the pool size to match that it could
work quite well.

That implementation isn't using a CMA region, but a similar approach
could definitely be applied.

Thanks,
-Brian

>
> The patchset includes the following:
> - export dma-heap API to register kernel module dma heap.
> - add chunk heap implementation.
> - document of device tree to register chunk heap
>
> Hyesoo Yu (3):
> dma-buf: add missing EXPORT_SYMBOL_GPL() for dma heaps
> dma-buf: heaps: add chunk heap to dmabuf heaps
> dma-heap: Devicetree binding for chunk heap
>
> .../devicetree/bindings/dma-buf/chunk_heap.yaml | 46 +++++
> drivers/dma-buf/dma-heap.c | 2 +
> drivers/dma-buf/heaps/Kconfig | 9 +
> drivers/dma-buf/heaps/Makefile | 1 +
> drivers/dma-buf/heaps/chunk_heap.c | 222 +++++++++++++++++++++
> drivers/dma-buf/heaps/heap-helpers.c | 2 +
> 6 files changed, 282 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/dma-buf/chunk_heap.yaml
> create mode 100644 drivers/dma-buf/heaps/chunk_heap.c
>
> --
> 2.7.4
>

2020-08-18 16:49:14

by Rob Herring

[permalink] [raw]
Subject: Re: [PATCH 3/3] dma-heap: Devicetree binding for chunk heap

On Tue, 18 Aug 2020 17:04:15 +0900, Hyesoo Yu wrote:
> Document devicetree binding for chunk heap on dma heap framework
>
> Signed-off-by: Hyesoo Yu <[email protected]>
> ---
> .../devicetree/bindings/dma-buf/chunk_heap.yaml | 46 ++++++++++++++++++++++
> 1 file changed, 46 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/dma-buf/chunk_heap.yaml
>


My bot found errors running 'make dt_binding_check' on your patch:

/builds/robherring/linux-dt-review/Documentation/devicetree/bindings/dma-buf/chunk_heap.example.dt.yaml: chunk_default_heap: 'alignment', 'memory-region' do not match any of the regexes: 'pinctrl-[0-9]+'


See https://patchwork.ozlabs.org/patch/1346687

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure dt-schema is up to date:

pip3 install git+https://github.com/devicetree-org/dt-schema.git@master --upgrade

Please check and re-submit.

2020-08-18 21:21:06

by John Stultz

[permalink] [raw]
Subject: Re: [PATCH 0/3] Chunk Heap Support on DMA-HEAP

On Tue, Aug 18, 2020 at 12:45 AM Hyesoo Yu <[email protected]> wrote:
>
> These patch series to introduce a new dma heap, chunk heap.
> That heap is needed for special HW that requires bulk allocation of
> fixed high order pages. For example, 64MB dma-buf pages are made up
> to fixed order-4 pages * 1024.
>
> The chunk heap uses alloc_pages_bulk to allocate high order page.
> https://lore.kernel.org/linux-mm/[email protected]
>
> The chunk heap is registered by device tree with alignment and memory node
> of contiguous memory allocator(CMA). Alignment defines chunk page size.
> For example, alignment 0x1_0000 means chunk page size is 64KB.
> The phandle to memory node indicates contiguous memory allocator(CMA).
> If device node doesn't have cma, the registration of chunk heap fails.
>
> The patchset includes the following:
> - export dma-heap API to register kernel module dma heap.
> - add chunk heap implementation.
> - document of device tree to register chunk heap
>
> Hyesoo Yu (3):
> dma-buf: add missing EXPORT_SYMBOL_GPL() for dma heaps
> dma-buf: heaps: add chunk heap to dmabuf heaps
> dma-heap: Devicetree binding for chunk heap

Hey! Thanks so much for sending this out! I'm really excited to see
these heaps be submitted and reviewed on the list!

The first general concern I have with your series is that it adds a dt
binding for the chunk heap, which we've gotten a fair amount of
pushback on.

A possible alternative might be something like what Kunihiko Hayashi
proposed for non-default CMA heaps:
https://lore.kernel.org/lkml/[email protected]/

This approach would insteal allow a driver to register a CMA area with
the chunk heap implementation.

However, (and this was the catch Kunihiko Hayashi's patch) this
requires that the driver also be upstream, as we need an in-tree user
of such code.

Also, it might be good to provide some further rationale on why this
heap is beneficial over the existing CMA heap? In general focusing
the commit messages more on the why we might want the patch, rather
than what the patch does, is helpful.

"Special hardware" that doesn't have upstream drivers isn't very
compelling for most maintainers.

That said, I'm very excited to see these sorts of submissions, as I
know lots of vendors have historically had very custom out of tree ION
heaps, and I think it would be a great benefit to the community to
better understand the experience vendors have in optimizing
performance on their devices, so we can create good common solutions
upstream. So I look forward to your insights on future revisions of
this patch series!

thanks
-john

2020-08-19 13:26:13

by Brian Starkey

[permalink] [raw]
Subject: Re: [PATCH 0/3] Chunk Heap Support on DMA-HEAP

Hi KyongHo,

On Wed, Aug 19, 2020 at 12:46:26PM +0900, Cho KyongHo wrote:
> I have seriously considered CPA in our product but we developed our own
> because of the pool in CPA.

Oh good, I'm glad you considered it :-)

> The high-order pages are required by some specific users like Netflix
> app. Moreover required number of bytes are dramatically increasing
> because of high resolution videos and displays in these days.
>
> Gathering lots of free high-order pages in the background during
> run-time means reserving that amount of pages from the entier available
> system memory. Moreover the gathered pages are soon reclaimed whenever
> the system is sufferring from memory pressure (i.e. camera recording,
> heavy games).

Aren't these two things in contradiction? If they're easily reclaimed
then they aren't "reserved" in any detrimental way. And if you don't
want them to be reclaimed, then you need them to be reserved...

The approach you have here assigns the chunk of memory as a reserved
CMA region which the kernel is going to try not to use too - similar
to the CPA pool.

I suppose it's a balance depending on how much you're willing to wait
for migration on the allocation path. CPA has the potential to get you
faster allocations, but the downside is you need to make it a little
more "greedy".

Cheers,
-Brian