2023-04-14 07:58:53

by Zhiqiang Hou

Subject: [RFC PATCH] dma: coherent: respect to device 'dma-coherent' property

From: Hou Zhiqiang <[email protected]>

Currently, coherent DMA memory is always mapped as write-combine and
uncached, ignoring the 'dma-coherent' property in the device node.
This patch maps the memory as write-back and cached when the device
has the 'dma-coherent' property.

Signed-off-by: Hou Zhiqiang <[email protected]>
---
kernel/dma/coherent.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/kernel/dma/coherent.c b/kernel/dma/coherent.c
index c21abc77c53e..f15ba6c6358e 100644
--- a/kernel/dma/coherent.c
+++ b/kernel/dma/coherent.c
@@ -36,7 +36,8 @@ static inline dma_addr_t dma_get_device_base(struct device *dev,
 }
 
 static struct dma_coherent_mem *dma_init_coherent_memory(phys_addr_t phys_addr,
-		dma_addr_t device_addr, size_t size, bool use_dma_pfn_offset)
+		dma_addr_t device_addr, size_t size, bool use_dma_pfn_offset,
+		bool cacheable)
 {
 	struct dma_coherent_mem *dma_mem;
 	int pages = size >> PAGE_SHIFT;
@@ -45,7 +46,8 @@ static struct dma_coherent_mem *dma_init_coherent_memory(phys_addr_t phys_addr,
 	if (!size)
 		return ERR_PTR(-EINVAL);
 
-	mem_base = memremap(phys_addr, size, MEMREMAP_WC);
+	mem_base = memremap(phys_addr, size, cacheable ? MEMREMAP_WB :
+			    MEMREMAP_WC);
 	if (!mem_base)
 		return ERR_PTR(-EINVAL);
 
@@ -119,8 +121,10 @@ int dma_declare_coherent_memory(struct device *dev, phys_addr_t phys_addr,
 {
 	struct dma_coherent_mem *mem;
 	int ret;
+	bool cacheable = dev_is_dma_coherent(dev);
 
-	mem = dma_init_coherent_memory(phys_addr, device_addr, size, false);
+	mem = dma_init_coherent_memory(phys_addr, device_addr, size, false,
+				       cacheable);
 	if (IS_ERR(mem))
 		return PTR_ERR(mem);
 
@@ -310,7 +314,7 @@ int dma_init_global_coherent(phys_addr_t phys_addr, size_t size)
 {
 	struct dma_coherent_mem *mem;
 
-	mem = dma_init_coherent_memory(phys_addr, phys_addr, size, true);
+	mem = dma_init_coherent_memory(phys_addr, phys_addr, size, true, false);
 	if (IS_ERR(mem))
 		return PTR_ERR(mem);
 	dma_coherent_default_memory = mem;
@@ -335,9 +339,10 @@ static int rmem_dma_device_init(struct reserved_mem *rmem, struct device *dev)
 {
 	if (!rmem->priv) {
 		struct dma_coherent_mem *mem;
+		bool cacheable = dev_is_dma_coherent(dev);
 
 		mem = dma_init_coherent_memory(rmem->base, rmem->base,
-					       rmem->size, true);
+					       rmem->size, true, cacheable);
 		if (IS_ERR(mem))
 			return PTR_ERR(mem);
 		rmem->priv = mem;
--
2.17.1
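
The selection logic in the patch above can be sketched in plain userspace C.
This is only a model, not kernel code: `enum remap_type`, `struct model_dev`
and the three helper functions are hypothetical stand-ins invented for
illustration. The enum stands in for the kernel's MEMREMAP_WB and MEMREMAP_WC
flags, and the `dma_coherent` field stands in for what
dev_is_dma_coherent(dev) would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for MEMREMAP_WB / MEMREMAP_WC; the values are illustrative only. */
enum remap_type { REMAP_WB, REMAP_WC };

/* Hypothetical device model: only the coherency attribute matters here. */
struct model_dev {
	const char *name;
	bool dma_coherent;	/* models the 'dma-coherent' DT property */
};

/* Mirrors the ternary added to dma_init_coherent_memory() by the patch. */
static enum remap_type pick_remap_type(bool cacheable)
{
	return cacheable ? REMAP_WB : REMAP_WC;
}

/*
 * Per-device pools (dma_declare_coherent_memory() and
 * rmem_dma_device_init() in the patch) follow the device attribute.
 */
static enum remap_type per_device_pool_type(const struct model_dev *dev)
{
	assert(dev != NULL);
	return pick_remap_type(dev->dma_coherent);
}

/*
 * The global pool (dma_init_global_coherent() in the patch) passes
 * 'false', so it stays write-combine as before.
 */
static enum remap_type global_pool_type(void)
{
	return pick_remap_type(false);
}
```

In this model, a device marked `dma_coherent` gets REMAP_WB for its per-device
pool, a device without the flag keeps REMAP_WC, and the global pool is
unchanged, which matches the three call sites touched by the diff.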


2023-04-16 06:42:29

by Christoph Hellwig

Subject: Re: [RFC PATCH] dma: coherent: respect to device 'dma-coherent' property

On Fri, Apr 14, 2023 at 04:03:07PM +0800, Zhiqiang Hou wrote:
> From: Hou Zhiqiang <[email protected]>
>
> Currently, the coherent DMA memory is always mapped as writecombine
> and uncached, ignored the 'dma-coherent' property in device node,
> this patch is to map the memory as writeback and cached when the
> device has 'dma-coherent' property.

What is the use case here? The somewhat misnamed per-device coherent
memory is intended for small per-device pools of sram or such
used for staging memory.

2023-04-17 02:21:16

by Zhiqiang Hou

Subject: RE: [RFC PATCH] dma: coherent: respect to device 'dma-coherent' property

Hi Christoph,

> -----Original Message-----
> From: Christoph Hellwig <[email protected]>
> Sent: Sunday, April 16, 2023 2:30 PM
> To: Z.Q. Hou <[email protected]>
> Cc: [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: Re: [RFC PATCH] dma: coherent: respect to device 'dma-coherent'
> property
>
> What is the use case here? The somewhat misnamed per-device coherent
> memory is intended for small per-device pools of sram or such used for
> staging memory.

In my case, there are multiple Cortex-A cores within a cache-coherent cluster,
split into two islands that run Linux and an RTOS respectively.
I created a virtual device for communication between Linux and the RTOS over
shared memory. On the Linux side, I created a per-device DMA memory pool and
added 'dma-coherent' to the virtual device, but the data in shared memory could
not be kept in sync. I eventually found that the per-device DMA pool is always
mapped as uncached, so I submitted this fix.

Thanks,
Zhiqiang
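
The symptom described above can be illustrated with a toy model in userspace
C. This is a deliberately simplified sketch, not a description of how any
particular interconnect behaves: it assumes one side writes through a
write-back cacheable mapping while the other side reads through an uncached
mapping that goes straight to memory. All names (`struct toy_system`,
`cached_write`, `cached_read`, `uncached_read`, `cache_clean`) are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one word of shared memory plus one write-back cache line. */
struct toy_system {
	uint32_t dram;		/* backing shared memory */
	uint32_t cache_data;	/* contents of the cache line */
	bool cache_valid;	/* the line holds data */
	bool cache_dirty;	/* the line is newer than dram */
};

/* Write through a cacheable (write-back) mapping: only the cache changes. */
static void cached_write(struct toy_system *s, uint32_t value)
{
	assert(s != NULL);
	s->cache_data = value;
	s->cache_valid = true;
	s->cache_dirty = true;
}

/* Read through a cacheable mapping: served from the cache when valid. */
static uint32_t cached_read(const struct toy_system *s)
{
	assert(s != NULL);
	return s->cache_valid ? s->cache_data : s->dram;
}

/* Read through an uncached mapping: in this model it bypasses the cache. */
static uint32_t uncached_read(const struct toy_system *s)
{
	assert(s != NULL);
	return s->dram;
}

/* Explicit clean: write a dirty line back to memory. */
static void cache_clean(struct toy_system *s)
{
	assert(s != NULL);
	if (s->cache_valid && s->cache_dirty) {
		s->dram = s->cache_data;
		s->cache_dirty = false;
	}
}
```

In this model, a value written with `cached_write()` is visible to
`cached_read()` immediately but not to `uncached_read()` until
`cache_clean()` runs. When both sides use cacheable mappings inside the same
coherent domain, the hardware is expected to keep them in sync without such
explicit maintenance, which is the situation the patch aims for.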

2023-04-17 12:36:34

by Robin Murphy

Subject: Re: [RFC PATCH] dma: coherent: respect to device 'dma-coherent' property

On 2023-04-17 03:06, Z.Q. Hou wrote:
> Hi Christoph,
>
> In my case, there are multiple Cortex-A cores within the cluster, in which it is
> cache coherent, they are split into 2 island for running Linux and RTOS respectively.
> I created a virtual device for Linux and RTOS communication using shared memory.
> In Linux side, I created a per-device dma memory pool and added 'dma-coherent'
> for the virtual device, but the data in shared memory can't be sync up, finally found
> the per-device dma pool is always mapped as uncached, so submitted this fix patch.

Yes, in principle this should apply similarly to restricted DMA or
confidential compute VMs where DMA buffers are to be allocated from a
predetermined shared memory area, and a DT reserved-memory region is
used as a coherent pool to achieve that. Quite likely that so far this
has only been done with non-coherent hardware or in software models
where a mismatch in nominal cacheability wasn't noticeable.

It's a bit niche, but not entirely unreasonable.

Thanks,
Robin.

2023-04-18 09:56:59

by Zhiqiang Hou

Subject: RE: [RFC PATCH] dma: coherent: respect to device 'dma-coherent' property

Hi Robin,

> -----Original Message-----
> From: Robin Murphy <[email protected]>
> Sent: Monday, April 17, 2023 8:28 PM
> To: Z.Q. Hou <[email protected]>; Christoph Hellwig <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]
> Subject: Re: [RFC PATCH] dma: coherent: respect to device 'dma-coherent'
> property
>
> Yes, in principle this should apply similarly to restricted DMA or confidential
> compute VMs where DMA buffers are to be allocated from a predetermined
> shared memory area, and a DT reserved-memory region is used as a coherent
> pool to achieve that. Quite likely that so far this has only been done with
> non-coherent hardware or in software models where a mismatch in nominal
> cacheability wasn't noticeable.
>
> It's a bit niche, but not entirely unreasonable.
>

Understood. This change doesn't affect devices without 'dma-coherent', and it
can improve performance by leveraging hardware cache coherency.
CMA also maps the memory as cacheable when the device node has 'dma-coherent',
and as non-cacheable otherwise, so this change aligns the behavior of the
per-device DMA pool with CMA.

Thanks,
Zhiqiang

2023-05-26 00:43:18

by Zhiqiang Hou

Subject: RE: [RFC PATCH] dma: coherent: respect to device 'dma-coherent' property



> -----Original Message-----
> From: Z.Q. Hou
> Sent: Tuesday, April 18, 2023 5:56 PM
> To: Robin Murphy <[email protected]>; Christoph Hellwig <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]
> Subject: RE: [RFC PATCH] dma: coherent: respect to device 'dma-coherent'
> property
>
> Hi Robin,
>
> Understand, this change doesn't affect the ones without 'dma-coherent', and it
> can improve the performance leveraging the hardware cache coherent feature.
> And in the CMA, it maps the memory as cacheable when the device node has
> 'dma-coherent', otherwise non-cacheable.
> So this change aligns the behavior of the per-device dma pool to the CMA.

Any comments? Is this change acceptable?

Thanks,
Zhiqiang