When the DMA_MEMORY_MAP flag is used, memory which can be accessed
directly should be returned, so use ioremap_wc() instead of ioremap().
Also, ensure that the correct memset operation is used in
dma_alloc_from_coherent() with respect to the region's flags.
This fixes the below alignment fault on arm64, caused by invalid use
of memset() on Device memory.
Unhandled fault: alignment fault (0x96000061) at 0xffffff8000380000
Internal error: : 96000061 [#1] PREEMPT SMP
Modules linked in: hdlcd(+) clk_scpi
CPU: 4 PID: 1355 Comm: systemd-udevd Not tainted 4.4.0-rc1+ #5
Hardware name: ARM Juno development board (r0) (DT)
task: ffffffc9763eee00 ti: ffffffc9758c4000 task.ti: ffffffc9758c4000
PC is at __efistub_memset+0x1ac/0x200
LR is at dma_alloc_from_coherent+0xb0/0x120
pc : [<ffffffc00030ff2c>] lr : [<ffffffc00042a918>] pstate: 400001c5
sp : ffffffc9758c79a0
x29: ffffffc9758c79a0 x28: ffffffc000635cd0
x27: 0000000000000124 x26: ffffffc000119ef4
x25: 0000000000010000 x24: 0000000000000140
x23: ffffffc07e9ac3a8 x22: ffffffc9758c7a58
x21: ffffffc9758c7a68 x20: 0000000000000004
x19: ffffffc07e9ac380 x18: 0000000000000001
x17: 0000007fae1bbba8 x16: ffffffc0001b2d1c
x15: ffffffffffffffff x14: 0ffffffffffffffe
x13: 0000000000000010 x12: ffffff800837ffff
x11: ffffff800837ffff x10: 0000000040000000
x9 : 0000000000000000 x8 : ffffff8000380000
x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000000040 x4 : 0000000000000000
x3 : 0000000000000004 x2 : 000000000000ffc0
x1 : 0000000000000000 x0 : ffffff8000380000
Signed-off-by: Brian Starkey <[email protected]>
Reviewed-by: Catalin Marinas <[email protected]>
---
drivers/base/dma-coherent.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
index 55b8398..45358d0 100644
--- a/drivers/base/dma-coherent.c
+++ b/drivers/base/dma-coherent.c
@@ -31,7 +31,10 @@ static int dma_init_coherent_memory(phys_addr_t phys_addr, dma_addr_t device_add
if (!size)
goto out;
- mem_base = ioremap(phys_addr, size);
+ if (flags & DMA_MEMORY_MAP)
+ mem_base = ioremap_wc(phys_addr, size);
+ else
+ mem_base = ioremap(phys_addr, size);
if (!mem_base)
goto out;
@@ -181,7 +184,10 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
*/
*dma_handle = mem->device_base + (pageno << PAGE_SHIFT);
*ret = mem->virt_base + (pageno << PAGE_SHIFT);
- memset(*ret, 0, size);
+ if (mem->flags & DMA_MEMORY_MAP)
+ memset(*ret, 0, size);
+ else
+ memset_io(*ret, 0, size);
spin_unlock_irqrestore(&mem->spinlock, flags);
return 1;
--
1.7.9.5
Hi,
Is there anything I can do to help get this merged?
It fixes a real problem we have on our arm64 development boards.
>From my reading of Documentation/DMA-API.txt it seems like this was
the intended behavior in the first place.
Best regards,
Brian
On Fri, Nov 20, 2015 at 02:20:26PM +0000, Brian Starkey wrote:
>When the DMA_MEMORY_MAP flag is used, memory which can be accessed
>directly should be returned, so use ioremap_wc() instead of ioremap().
>Also, ensure that the correct memset operation is used in
>dma_alloc_from_coherent() with respect to the region's flags.
>
>This fixes the below alignment fault on arm64, caused by invalid use
>of memset() on Device memory.
>
> Unhandled fault: alignment fault (0x96000061) at 0xffffff8000380000
> Internal error: : 96000061 [#1] PREEMPT SMP
> Modules linked in: hdlcd(+) clk_scpi
> CPU: 4 PID: 1355 Comm: systemd-udevd Not tainted 4.4.0-rc1+ #5
> Hardware name: ARM Juno development board (r0) (DT)
> task: ffffffc9763eee00 ti: ffffffc9758c4000 task.ti: ffffffc9758c4000
> PC is at __efistub_memset+0x1ac/0x200
> LR is at dma_alloc_from_coherent+0xb0/0x120
> pc : [<ffffffc00030ff2c>] lr : [<ffffffc00042a918>] pstate: 400001c5
> sp : ffffffc9758c79a0
> x29: ffffffc9758c79a0 x28: ffffffc000635cd0
> x27: 0000000000000124 x26: ffffffc000119ef4
> x25: 0000000000010000 x24: 0000000000000140
> x23: ffffffc07e9ac3a8 x22: ffffffc9758c7a58
> x21: ffffffc9758c7a68 x20: 0000000000000004
> x19: ffffffc07e9ac380 x18: 0000000000000001
> x17: 0000007fae1bbba8 x16: ffffffc0001b2d1c
> x15: ffffffffffffffff x14: 0ffffffffffffffe
> x13: 0000000000000010 x12: ffffff800837ffff
> x11: ffffff800837ffff x10: 0000000040000000
> x9 : 0000000000000000 x8 : ffffff8000380000
> x7 : 0000000000000000 x6 : 000000000000003f
> x5 : 0000000000000040 x4 : 0000000000000000
> x3 : 0000000000000004 x2 : 000000000000ffc0
> x1 : 0000000000000000 x0 : ffffff8000380000
>
>Signed-off-by: Brian Starkey <[email protected]>
>Reviewed-by: Catalin Marinas <[email protected]>
>---
> drivers/base/dma-coherent.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
>diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
>index 55b8398..45358d0 100644
>--- a/drivers/base/dma-coherent.c
>+++ b/drivers/base/dma-coherent.c
>@@ -31,7 +31,10 @@ static int dma_init_coherent_memory(phys_addr_t phys_addr, dma_addr_t device_add
> if (!size)
> goto out;
>
>- mem_base = ioremap(phys_addr, size);
>+ if (flags & DMA_MEMORY_MAP)
>+ mem_base = ioremap_wc(phys_addr, size);
>+ else
>+ mem_base = ioremap(phys_addr, size);
> if (!mem_base)
> goto out;
>
>@@ -181,7 +184,10 @@ int dma_alloc_from_coherent(struct device *dev, ssize_t size,
> */
> *dma_handle = mem->device_base + (pageno << PAGE_SHIFT);
> *ret = mem->virt_base + (pageno << PAGE_SHIFT);
>- memset(*ret, 0, size);
>+ if (mem->flags & DMA_MEMORY_MAP)
>+ memset(*ret, 0, size);
>+ else
>+ memset_io(*ret, 0, size);
> spin_unlock_irqrestore(&mem->spinlock, flags);
>
> return 1;
>--
>1.7.9.5
>
>
>_______________________________________________
>linux-arm-kernel mailing list
>[email protected]
>http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
On Fri, Nov 20, 2015 at 02:20:26PM +0000, Brian Starkey wrote:
> When the DMA_MEMORY_MAP flag is used, memory which can be accessed
> directly should be returned, so use ioremap_wc() instead of ioremap().
> Also, ensure that the correct memset operation is used in
> dma_alloc_from_coherent() with respect to the region's flags.
>
> This fixes the below alignment fault on arm64, caused by invalid use
> of memset() on Device memory.
This is indeed affecting both arm32 and arm64 systems.
> diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
> index 55b8398..45358d0 100644
> --- a/drivers/base/dma-coherent.c
> +++ b/drivers/base/dma-coherent.c
> @@ -31,7 +31,10 @@ static int dma_init_coherent_memory(phys_addr_t phys_addr, dma_addr_t device_add
> if (!size)
> goto out;
>
> - mem_base = ioremap(phys_addr, size);
> + if (flags & DMA_MEMORY_MAP)
> + mem_base = ioremap_wc(phys_addr, size);
> + else
> + mem_base = ioremap(phys_addr, size);
I wonder whether a memremap() approach for the DMA_MEMORY_MAP case would
be better. This API was added recently by commit 92281dee825f ("arch:
introduce memremap()"). It only supports write-back and write-through
but we could add a MEMREMAP_WC flag for this case.
--
Catalin
On Fri, Dec 4, 2015 at 2:50 AM, Catalin Marinas <[email protected]> wrote:
> On Fri, Nov 20, 2015 at 02:20:26PM +0000, Brian Starkey wrote:
>> When the DMA_MEMORY_MAP flag is used, memory which can be accessed
>> directly should be returned, so use ioremap_wc() instead of ioremap().
>> Also, ensure that the correct memset operation is used in
>> dma_alloc_from_coherent() with respect to the region's flags.
>>
>> This fixes the below alignment fault on arm64, caused by invalid use
>> of memset() on Device memory.
>
> This is indeed affecting both arm32 and arm64 systems.
>
>> diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
>> index 55b8398..45358d0 100644
>> --- a/drivers/base/dma-coherent.c
>> +++ b/drivers/base/dma-coherent.c
>> @@ -31,7 +31,10 @@ static int dma_init_coherent_memory(phys_addr_t phys_addr, dma_addr_t device_add
>> if (!size)
>> goto out;
>>
>> - mem_base = ioremap(phys_addr, size);
>> + if (flags & DMA_MEMORY_MAP)
>> + mem_base = ioremap_wc(phys_addr, size);
>> + else
>> + mem_base = ioremap(phys_addr, size);
>
> I wonder whether a memremap() approach for the DMA_MEMORY_MAP case would
> be better. This API was added recently by commit 92281dee825f ("arch:
> introduce memremap()"). It only supports write-back and write-through
> but we could add a MEMREMAP_WC flag for this case.
I originally included both MEMREMAP_WC and MEMREAMP_UC as potential
flags to this api, but ultimately decided against it. The memremap()
api is meant for memory that is known to have no i/o side effects. As
far as I can see WC and UC usages are a muddy mix of "sometimes
there's I/O side effects, but it depends by arch and driver". In
other words we can't drop the "__iomem" annotation from WC and UC
mappings by default.
On Fri, Dec 04, 2015 at 08:59:10AM -0800, Dan Williams wrote:
> On Fri, Dec 4, 2015 at 2:50 AM, Catalin Marinas <[email protected]> wrote:
> > On Fri, Nov 20, 2015 at 02:20:26PM +0000, Brian Starkey wrote:
> >> When the DMA_MEMORY_MAP flag is used, memory which can be accessed
> >> directly should be returned, so use ioremap_wc() instead of ioremap().
> >> Also, ensure that the correct memset operation is used in
> >> dma_alloc_from_coherent() with respect to the region's flags.
> >>
> >> This fixes the below alignment fault on arm64, caused by invalid use
> >> of memset() on Device memory.
> >
> > This is indeed affecting both arm32 and arm64 systems.
> >
> >> diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
> >> index 55b8398..45358d0 100644
> >> --- a/drivers/base/dma-coherent.c
> >> +++ b/drivers/base/dma-coherent.c
> >> @@ -31,7 +31,10 @@ static int dma_init_coherent_memory(phys_addr_t phys_addr, dma_addr_t device_add
> >> if (!size)
> >> goto out;
> >>
> >> - mem_base = ioremap(phys_addr, size);
> >> + if (flags & DMA_MEMORY_MAP)
> >> + mem_base = ioremap_wc(phys_addr, size);
> >> + else
> >> + mem_base = ioremap(phys_addr, size);
> >
> > I wonder whether a memremap() approach for the DMA_MEMORY_MAP case would
> > be better. This API was added recently by commit 92281dee825f ("arch:
> > introduce memremap()"). It only supports write-back and write-through
> > but we could add a MEMREMAP_WC flag for this case.
>
> I originally included both MEMREMAP_WC and MEMREAMP_UC as potential
> flags to this api, but ultimately decided against it. The memremap()
> api is meant for memory that is known to have no i/o side effects. As
> far as I can see WC and UC usages are a muddy mix of "sometimes
> there's I/O side effects, but it depends by arch and driver". In
> other words we can't drop the "__iomem" annotation from WC and UC
> mappings by default.
In this context, the dma_declare_coherent_memory(DMA_MEMORY_MAP)
implementation is aimed at normal RAM with no side effects as later
returned by dma_alloc_coherent(). To me it looks like memremap is better
suited here for the DMA_MEMORY_MAP case. As per the
Documentation/DMA-API.txt:
DMA_MEMORY_MAP - request that the memory returned from
dma_alloc_coherent() be directly writable.
which means no __iomem. Of course, we still need ioremap for
DMA_MEMORY_IO which is supposed to be written with memcpy_toio() etc.
Which memory type should be used is left to the driver and it should
pass the corresponding DMA_MEMORY_* flag.
--
Catalin
On Fri, Dec 04, 2015 at 05:15:54PM +0000, Catalin Marinas wrote:
>On Fri, Dec 04, 2015 at 08:59:10AM -0800, Dan Williams wrote:
>> On Fri, Dec 4, 2015 at 2:50 AM, Catalin Marinas <[email protected]> wrote:
>> > On Fri, Nov 20, 2015 at 02:20:26PM +0000, Brian Starkey wrote:
>> >> When the DMA_MEMORY_MAP flag is used, memory which can be accessed
>> >> directly should be returned, so use ioremap_wc() instead of ioremap().
>> >> Also, ensure that the correct memset operation is used in
>> >> dma_alloc_from_coherent() with respect to the region's flags.
>> >>
>> >> This fixes the below alignment fault on arm64, caused by invalid use
>> >> of memset() on Device memory.
>> >
>> > This is indeed affecting both arm32 and arm64 systems.
>> >
>> >> diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
>> >> index 55b8398..45358d0 100644
>> >> --- a/drivers/base/dma-coherent.c
>> >> +++ b/drivers/base/dma-coherent.c
>> >> @@ -31,7 +31,10 @@ static int dma_init_coherent_memory(phys_addr_t phys_addr, dma_addr_t device_add
>> >> if (!size)
>> >> goto out;
>> >>
>> >> - mem_base = ioremap(phys_addr, size);
>> >> + if (flags & DMA_MEMORY_MAP)
>> >> + mem_base = ioremap_wc(phys_addr, size);
>> >> + else
>> >> + mem_base = ioremap(phys_addr, size);
>> >
>> > I wonder whether a memremap() approach for the DMA_MEMORY_MAP case would
>> > be better. This API was added recently by commit 92281dee825f ("arch:
>> > introduce memremap()"). It only supports write-back and write-through
>> > but we could add a MEMREMAP_WC flag for this case.
>>
>> I originally included both MEMREMAP_WC and MEMREAMP_UC as potential
>> flags to this api, but ultimately decided against it. The memremap()
>> api is meant for memory that is known to have no i/o side effects. As
>> far as I can see WC and UC usages are a muddy mix of "sometimes
>> there's I/O side effects, but it depends by arch and driver". In
>> other words we can't drop the "__iomem" annotation from WC and UC
>> mappings by default.
The DMA_MEMORY_MAP flag is pretty much a statement of "no side-
effects", so as Catalin says it would fit OK here. That said, if it's
not possible to deprecate ioremap_wc() in the same way as
ioremap_cache() then I wonder if there's even much benefit in adding
it to memremap().
>
>In this context, the dma_declare_coherent_memory(DMA_MEMORY_MAP)
>implementation is aimed at normal RAM with no side effects as later
>returned by dma_alloc_coherent(). To me it looks like memremap is better
>suited here for the DMA_MEMORY_MAP case. As per the
>Documentation/DMA-API.txt:
>
> DMA_MEMORY_MAP - request that the memory returned from
> dma_alloc_coherent() be directly writable.
>
>which means no __iomem. Of course, we still need ioremap for
>DMA_MEMORY_IO which is supposed to be written with memcpy_toio() etc.
>
>Which memory type should be used is left to the driver and it should
>pass the corresponding DMA_MEMORY_* flag.
>
This can still be achieved without adding _WC to memremap(). I can
look at adding _WC to memremap() if that's deemed the preferred
approach, but if that isn't clear-cut, then it would be nice to get
this bug fixed now and worry about adding it to memremap() later.
-Brian
>-- Catalin
>
On Mon, Dec 7, 2015 at 5:28 AM, Brian Starkey <[email protected]> wrote:
> On Fri, Dec 04, 2015 at 05:15:54PM +0000, Catalin Marinas wrote:
>>
>> On Fri, Dec 04, 2015 at 08:59:10AM -0800, Dan Williams wrote:
>>>
>>> On Fri, Dec 4, 2015 at 2:50 AM, Catalin Marinas <[email protected]>
>>> wrote:
>>> > On Fri, Nov 20, 2015 at 02:20:26PM +0000, Brian Starkey wrote:
>>> >> When the DMA_MEMORY_MAP flag is used, memory which can be accessed
>>> >> directly should be returned, so use ioremap_wc() instead of ioremap().
>>> >> Also, ensure that the correct memset operation is used in
>>> >> dma_alloc_from_coherent() with respect to the region's flags.
>>> >>
>>> >> This fixes the below alignment fault on arm64, caused by invalid use
>>> >> of memset() on Device memory.
>>> >
>>> > This is indeed affecting both arm32 and arm64 systems.
>>> >
>>> >> diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
>>> >> index 55b8398..45358d0 100644
>>> >> --- a/drivers/base/dma-coherent.c
>>> >> +++ b/drivers/base/dma-coherent.c
>>> >> @@ -31,7 +31,10 @@ static int dma_init_coherent_memory(phys_addr_t
>>> >> phys_addr, dma_addr_t device_add
>>> >> if (!size)
>>> >> goto out;
>>> >>
>>> >> - mem_base = ioremap(phys_addr, size);
>>> >> + if (flags & DMA_MEMORY_MAP)
>>> >> + mem_base = ioremap_wc(phys_addr, size);
>>> >> + else
>>> >> + mem_base = ioremap(phys_addr, size);
>>> >
>>> > I wonder whether a memremap() approach for the DMA_MEMORY_MAP case
>>> > would
>>> > be better. This API was added recently by commit 92281dee825f ("arch:
>>> > introduce memremap()"). It only supports write-back and write-through
>>> > but we could add a MEMREMAP_WC flag for this case.
>>>
>>> I originally included both MEMREMAP_WC and MEMREAMP_UC as potential
>>> flags to this api, but ultimately decided against it. The memremap()
>>> api is meant for memory that is known to have no i/o side effects. As
>>> far as I can see WC and UC usages are a muddy mix of "sometimes
>>> there's I/O side effects, but it depends by arch and driver". In
>>> other words we can't drop the "__iomem" annotation from WC and UC
>>> mappings by default.
>
>
> The DMA_MEMORY_MAP flag is pretty much a statement of "no side-
> effects", so as Catalin says it would fit OK here. That said, if it's
> not possible to deprecate ioremap_wc() in the same way as
> ioremap_cache() then I wonder if there's even much benefit in adding
> it to memremap().
I don't see a problem adding a _WC option to memremap.
The only difference is that it can't replace ioremap_wc. I.e. unlike
_WB, and _WT case where ioremap_cache and ioremap_wt are now
deprecated we'd have ioremap_wc continuing to live alongside
memremap(..., MEMREMAP_WC).
On Mon, Dec 07, 2015 at 08:19:27AM -0800, Dan Williams wrote:
> On Mon, Dec 7, 2015 at 5:28 AM, Brian Starkey <[email protected]> wrote:
> > On Fri, Dec 04, 2015 at 05:15:54PM +0000, Catalin Marinas wrote:
> >>
> >> On Fri, Dec 04, 2015 at 08:59:10AM -0800, Dan Williams wrote:
> >>>
> >>> On Fri, Dec 4, 2015 at 2:50 AM, Catalin Marinas <[email protected]>
> >>> wrote:
> >>> > On Fri, Nov 20, 2015 at 02:20:26PM +0000, Brian Starkey wrote:
> >>> >> When the DMA_MEMORY_MAP flag is used, memory which can be accessed
> >>> >> directly should be returned, so use ioremap_wc() instead of ioremap().
> >>> >> Also, ensure that the correct memset operation is used in
> >>> >> dma_alloc_from_coherent() with respect to the region's flags.
> >>> >>
> >>> >> This fixes the below alignment fault on arm64, caused by invalid use
> >>> >> of memset() on Device memory.
> >>> >
> >>> > This is indeed affecting both arm32 and arm64 systems.
> >>> >
> >>> >> diff --git a/drivers/base/dma-coherent.c b/drivers/base/dma-coherent.c
> >>> >> index 55b8398..45358d0 100644
> >>> >> --- a/drivers/base/dma-coherent.c
> >>> >> +++ b/drivers/base/dma-coherent.c
> >>> >> @@ -31,7 +31,10 @@ static int dma_init_coherent_memory(phys_addr_t
> >>> >> phys_addr, dma_addr_t device_add
> >>> >> if (!size)
> >>> >> goto out;
> >>> >>
> >>> >> - mem_base = ioremap(phys_addr, size);
> >>> >> + if (flags & DMA_MEMORY_MAP)
> >>> >> + mem_base = ioremap_wc(phys_addr, size);
> >>> >> + else
> >>> >> + mem_base = ioremap(phys_addr, size);
> >>> >
> >>> > I wonder whether a memremap() approach for the DMA_MEMORY_MAP case
> >>> > would
> >>> > be better. This API was added recently by commit 92281dee825f ("arch:
> >>> > introduce memremap()"). It only supports write-back and write-through
> >>> > but we could add a MEMREMAP_WC flag for this case.
> >>>
> >>> I originally included both MEMREMAP_WC and MEMREAMP_UC as potential
> >>> flags to this api, but ultimately decided against it. The memremap()
> >>> api is meant for memory that is known to have no i/o side effects. As
> >>> far as I can see WC and UC usages are a muddy mix of "sometimes
> >>> there's I/O side effects, but it depends by arch and driver". In
> >>> other words we can't drop the "__iomem" annotation from WC and UC
> >>> mappings by default.
> >
> >
> > The DMA_MEMORY_MAP flag is pretty much a statement of "no side-
> > effects", so as Catalin says it would fit OK here. That said, if it's
> > not possible to deprecate ioremap_wc() in the same way as
> > ioremap_cache() then I wonder if there's even much benefit in adding
> > it to memremap().
>
> I don't see a problem adding a _WC option to memremap.
>
> The only difference is that it can't replace ioremap_wc. I.e. unlike
> _WB, and _WT case where ioremap_cache and ioremap_wt are now
> deprecated we'd have ioremap_wc continuing to live alongside
> memremap(..., MEMREMAP_WC).
I think that's fine. The difference is that memory returned by
ioremap_wc() should (in theory) only be accessed with I/O accessors
while the range returned by memremap(MEMREMAP_WC) will be directly
accessible.
--
Catalin