2008-12-26 15:31:21

by Atsushi Nemoto

[permalink] [raw]
Subject: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On non-coherent architectures, such as MIPS, the dmatest reports
mismatches on just before and after the DMA area. This is because
test patterns in the dstbuf discarded from cache without writing to
main memory.

Signed-off-by: Atsushi Nemoto <[email protected]>
---
drivers/dma/dmatest.c | 7 +++++++
1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/dma/dmatest.c b/drivers/dma/dmatest.c
index ed9636b..6b747dc 100644
--- a/drivers/dma/dmatest.c
+++ b/drivers/dma/dmatest.c
@@ -202,6 +202,7 @@ static int dmatest_func(void *data)
dma_cookie_t cookie;
enum dma_status status;
int ret;
+ dma_addr_t dma_dest;

thread_name = current->comm;

@@ -226,6 +227,12 @@ static int dmatest_func(void *data)

dmatest_init_srcbuf(thread->srcbuf, src_off, len);
dmatest_init_dstbuf(thread->dstbuf, dst_off, len);
+ /* flush and invalidate caches for whole dstbuf */
+ dma_dest = dma_map_single(chan->device->dev,
+ thread->dstbuf,
+ test_buf_size, DMA_BIDIRECTIONAL);
+ dma_unmap_single(chan->device->dev, dma_dest,
+ test_buf_size, DMA_BIDIRECTIONAL);

cookie = dma_async_memcpy_buf_to_buf(chan,
thread->dstbuf + dst_off,
--
1.5.6.3


2008-12-27 10:11:18

by Haavard Skinnemoen

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

Atsushi Nemoto wrote:
> @@ -226,6 +227,12 @@ static int dmatest_func(void *data)
>
> dmatest_init_srcbuf(thread->srcbuf, src_off, len);
> dmatest_init_dstbuf(thread->dstbuf, dst_off, len);
> + /* flush and invalidate caches for whole dstbuf */
> + dma_dest = dma_map_single(chan->device->dev,
> + thread->dstbuf,
> + test_buf_size, DMA_BIDIRECTIONAL);
> + dma_unmap_single(chan->device->dev, dma_dest,
> + test_buf_size, DMA_BIDIRECTIONAL);

You're supposed to unmap after the DMA operation is done, not before
it's submitted.

In this case, the DMA engine framework will do the unmapping for you
(probably using the wrong primitive, but they're really all the same in
practice, right?) so you can just drop the unmap call.

Now, I suspect the dw_dmac driver maps the buffer when it's not
supposed to, masking this kind of error...should probably get that
fixed too.

Haavard

2008-12-28 17:54:06

by Atsushi Nemoto

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Sat, 27 Dec 2008 11:10:37 +0100, Haavard Skinnemoen <[email protected]> wrote:
> Atsushi Nemoto wrote:
> > @@ -226,6 +227,12 @@ static int dmatest_func(void *data)
> >
> > dmatest_init_srcbuf(thread->srcbuf, src_off, len);
> > dmatest_init_dstbuf(thread->dstbuf, dst_off, len);
> > + /* flush and invalidate caches for whole dstbuf */
> > + dma_dest = dma_map_single(chan->device->dev,
> > + thread->dstbuf,
> > + test_buf_size, DMA_BIDIRECTIONAL);
> > + dma_unmap_single(chan->device->dev, dma_dest,
> > + test_buf_size, DMA_BIDIRECTIONAL);
>
> You're supposed to unmap after the DMA operation is done, not before
> it's submitted.
>
> In this case, the DMA engine framework will do the unmapping for you
> (probably using the wrong primitive, but they're really all the same in
> practice, right?) so you can just drop the unmap call.

Well, let me explain more.

On nono-coherent MIPS platforms, dma_map_single() for DMA_TO_DEVICE
writeback the cache, dma_map_single() for DMA_FROM_DEVICE invalidated
(without writeback) the cache. dma_unmap_simgle() is a nop.

If dst_off was not cacheline aligned, dma_map_single(...,
DMA_FROM_DEVICE) in dma_async_memcpy_buf_to_buf() invalidate whole
cachelines including dst_off. So, for example, the initialized data
at dst_off - 1 will be just discarded. This result mismatch error of
course. Same error can be happen at end of the real DMA area.

I added dma_map_single/dma_unmap_single to just flush all dstbuf to
main memory.

> Now, I suspect the dw_dmac driver maps the buffer when it's not
> supposed to, masking this kind of error...should probably get that
> fixed too.

I don't think so. But I'm not sure dma_map_sg() case in
dwc_prep_slave_sg().

---
Atsushi Nemoto

2009-01-05 18:32:17

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Sun, Dec 28, 2008 at 10:53 AM, Atsushi Nemoto <[email protected]> wrote:
> On Sat, 27 Dec 2008 11:10:37 +0100, Haavard Skinnemoen <[email protected]> wrote:
>> Atsushi Nemoto wrote:
>> > @@ -226,6 +227,12 @@ static int dmatest_func(void *data)
>> >
>> > dmatest_init_srcbuf(thread->srcbuf, src_off, len);
>> > dmatest_init_dstbuf(thread->dstbuf, dst_off, len);
>> > + /* flush and invalidate caches for whole dstbuf */
>> > + dma_dest = dma_map_single(chan->device->dev,
>> > + thread->dstbuf,
>> > + test_buf_size, DMA_BIDIRECTIONAL);
>> > + dma_unmap_single(chan->device->dev, dma_dest,
>> > + test_buf_size, DMA_BIDIRECTIONAL);
>>
>> You're supposed to unmap after the DMA operation is done, not before
>> it's submitted.
>>
>> In this case, the DMA engine framework will do the unmapping for you
>> (probably using the wrong primitive, but they're really all the same in
>> practice, right?) so you can just drop the unmap call.
>
> Well, let me explain more.
>
> On nono-coherent MIPS platforms, dma_map_single() for DMA_TO_DEVICE
> writeback the cache, dma_map_single() for DMA_FROM_DEVICE invalidated
> (without writeback) the cache. dma_unmap_simgle() is a nop.
>
> If dst_off was not cacheline aligned, dma_map_single(...,
> DMA_FROM_DEVICE) in dma_async_memcpy_buf_to_buf() invalidate whole
> cachelines including dst_off. So, for example, the initialized data
> at dst_off - 1 will be just discarded. This result mismatch error of
> course. Same error can be happen at end of the real DMA area.
>
> I added dma_map_single/dma_unmap_single to just flush all dstbuf to
> main memory.
>

I am tempted to say this is a bug in the MIPS dma_map_single
implementation. It is at least a difference in behavior from ARM
where partial cachelines are cleaned+invalidated in the
DMA_FROM_DEVICE case [1].

--
Dan

[1] http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=arch/arm/mm/cache-xsc3l2.c#l82

2009-01-06 01:30:15

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

Atsushi Nemoto wrote:
> On Mon, 5 Jan 2009 11:31:57 -0700, "Dan Williams" <[email protected]> wrote:
> Yes, MIPS and ARM do different thing on partial cache line. But I
> suppose this belongs to "implementation dependent" area of DMA API so
> users of the API should not depend on it. (Well, maybe I'm biased to
> MIPS ;-))
>
> In general, drivers must not put normal data and DMA buffer on same
> cacheline anyway to avoid unexpected writeback and data loss. So this
> ambiguity is not a problem. IMHO writeback of the partial line for
> DMA_FROM_DEVICE just _hides_ abusing of the DMA API and potential data
> loss.

Hmm... one implementation does the right thing in all cases. The other
silently allows data corruption unless each callsite that accidentally
or purposely passes cacheline-unaligned buffers adds extra maintenance
code. Which implementation are you biased towards now? :-).

--
Dan

2009-01-06 01:35:55

by Atsushi Nemoto

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Mon, 5 Jan 2009 11:31:57 -0700, "Dan Williams" <[email protected]> wrote:
> > On nono-coherent MIPS platforms, dma_map_single() for DMA_TO_DEVICE
> > writeback the cache, dma_map_single() for DMA_FROM_DEVICE invalidated
> > (without writeback) the cache. dma_unmap_simgle() is a nop.
> >
> > If dst_off was not cacheline aligned, dma_map_single(...,
> > DMA_FROM_DEVICE) in dma_async_memcpy_buf_to_buf() invalidate whole
> > cachelines including dst_off. So, for example, the initialized data
> > at dst_off - 1 will be just discarded. This result mismatch error of
> > course. Same error can be happen at end of the real DMA area.
> >
> > I added dma_map_single/dma_unmap_single to just flush all dstbuf to
> > main memory.
>
> I am tempted to say this is a bug in the MIPS dma_map_single
> implementation. It is at least a difference in behavior from ARM
> where partial cachelines are cleaned+invalidated in the
> DMA_FROM_DEVICE case [1].

Yes, MIPS and ARM do different thing on partial cache line. But I
suppose this belongs to "implementation dependent" area of DMA API so
users of the API should not depend on it. (Well, maybe I'm biased to
MIPS ;-))

In general, drivers must not put normal data and DMA buffer on same
cacheline anyway to avoid unexpected writeback and data loss. So this
ambiguity is not a problem. IMHO writeback of the partial line for
DMA_FROM_DEVICE just _hides_ abusing of the DMA API and potential data
loss.

---
Atsushi Nemoto

2009-01-06 02:06:56

by Atsushi Nemoto

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Mon, 05 Jan 2009 18:29:56 -0700, Dan Williams <[email protected]> wrote:
> > Yes, MIPS and ARM do different thing on partial cache line. But I
> > suppose this belongs to "implementation dependent" area of DMA API so
> > users of the API should not depend on it. (Well, maybe I'm biased to
> > MIPS ;-))
> > In general, drivers must not put normal data and DMA buffer on same
> > cacheline anyway to avoid unexpected writeback and data loss. So this
> > ambiguity is not a problem. IMHO writeback of the partial line for
> > DMA_FROM_DEVICE just _hides_ abusing of the DMA API and potential data
> > loss.
>
> Hmm... one implementation does the right thing in all cases.

Yes, with additional checking in generic path :-)

> The other silently allows data corruption unless each callsite that
> accidentally or purposely passes cacheline-unaligned buffers adds
> extra maintenance code. Which implementation are you biased towards
> now? :-).

Hmm... this is generic DMA mapping API (or linux-arch) issue. Let's
wait comments from arch maintainers. Ralf ?

---
Atsushi Nemoto

2009-01-08 04:43:52

by Atsushi Nemoto

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Mon, 5 Jan 2009 11:31:57 -0700, "Dan Williams" <[email protected]> wrote:
> >> > @@ -226,6 +227,12 @@ static int dmatest_func(void *data)
> >> >
> >> > dmatest_init_srcbuf(thread->srcbuf, src_off, len);
> >> > dmatest_init_dstbuf(thread->dstbuf, dst_off, len);
> >> > + /* flush and invalidate caches for whole dstbuf */
> >> > + dma_dest = dma_map_single(chan->device->dev,
> >> > + thread->dstbuf,
> >> > + test_buf_size, DMA_BIDIRECTIONAL);
> >> > + dma_unmap_single(chan->device->dev, dma_dest,
> >> > + test_buf_size, DMA_BIDIRECTIONAL);
...
> > Well, let me explain more.
> >
> > On nono-coherent MIPS platforms, dma_map_single() for DMA_TO_DEVICE
> > writeback the cache, dma_map_single() for DMA_FROM_DEVICE invalidated
> > (without writeback) the cache. dma_unmap_simgle() is a nop.
> >
> > If dst_off was not cacheline aligned, dma_map_single(...,
> > DMA_FROM_DEVICE) in dma_async_memcpy_buf_to_buf() invalidate whole
> > cachelines including dst_off. So, for example, the initialized data
> > at dst_off - 1 will be just discarded. This result mismatch error of
> > course. Same error can be happen at end of the real DMA area.
> >
> > I added dma_map_single/dma_unmap_single to just flush all dstbuf to
> > main memory.
>
> I am tempted to say this is a bug in the MIPS dma_map_single
> implementation. It is at least a difference in behavior from ARM
> where partial cachelines are cleaned+invalidated in the
> DMA_FROM_DEVICE case [1].

I should argue another reason to do writeback/invalidate here in
dmatest for non-coherent archs (not only for MIPS).

The dmatest compares dstbuf outside the real DMA region to detect
illegal overwriting. But on non-coherent archs, this comparison just
validates contents of cache, not main memory. So dmatest might not
able to detect illegal overwriting. Doing writeback/invalidate
_whole_ dstbuf are before dma_async_memcpy_buf_to_buf() will make
dmatest more strict.

This is why I used DMA_BIDIRECTIONAL for whole dstbuf area rather than
DMA_TO_DEVICE for two cache lines on both edges of the real DMA area.

Doesn't this make sense?

---
Atsushi Nemoto

2009-01-08 08:36:29

by Haavard Skinnemoen

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

Atsushi Nemoto wrote:
> I should argue another reason to do writeback/invalidate here in
> dmatest for non-coherent archs (not only for MIPS).
>
> The dmatest compares dstbuf outside the real DMA region to detect
> illegal overwriting. But on non-coherent archs, this comparison just
> validates contents of cache, not main memory. So dmatest might not
> able to detect illegal overwriting. Doing writeback/invalidate
> _whole_ dstbuf are before dma_async_memcpy_buf_to_buf() will make
> dmatest more strict.
>
> This is why I used DMA_BIDIRECTIONAL for whole dstbuf area rather than
> DMA_TO_DEVICE for two cache lines on both edges of the real DMA area.
>
> Doesn't this make sense?

I think it does. The dmatest driver should definitely use
DMA_BIDIRECTIONAL on the destination buffer to ensure that the poison
values are written to RAM and not just written to cache and discarded.

Now, this probably means that the destination buffer must be _unmapped_
with DMA_BIDIRECTIONAL too, which is difficult to do with the current
asymmetrical API...

In the general case, however, I think MIPS has a bug: I've seen drivers
DMA to/from tiny buffers stored inside another struct. This is legal
because the driver can guarantee that the other fields in the struct
aren't accessed in the mean time, but any fields sharing a cacheline
with the buffer must be written back before the lines are invalidated.

Haavard

2009-01-08 17:21:16

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Thu, Jan 8, 2009 at 1:36 AM, Haavard Skinnemoen
<[email protected]> wrote:
> I think it does. The dmatest driver should definitely use
> DMA_BIDIRECTIONAL on the destination buffer to ensure that the poison
> values are written to RAM and not just written to cache and discarded.
>

True.

> Now, this probably means that the destination buffer must be _unmapped_
> with DMA_BIDIRECTIONAL too, which is difficult to do with the current
> asymmetrical API...
>

A map and unmap should work for the current platforms with dma
drivers, but you are right I think this is a violation of the api.
For correctness we would need the entire operation covered by
DMA_BIDIRECTIONAL to support platforms that may implement a dma bounce
buffer. Or, we could change dmatest to not go through the dma-memcpy
api, allowing dma_alloc to be used for the destination.

> In the general case, however, I think MIPS has a bug: I've seen drivers
> DMA to/from tiny buffers stored inside another struct. This is legal
> because the driver can guarantee that the other fields in the struct
> aren't accessed in the mean time, but any fields sharing a cacheline
> with the buffer must be written back before the lines are invalidated.
>

*nod*

--
Dan

2009-01-09 08:31:18

by Atsushi Nemoto

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Thu, 8 Jan 2009 10:20:56 -0700, "Dan Williams" <[email protected]> wrote:
> On Thu, Jan 8, 2009 at 1:36 AM, Haavard Skinnemoen
> > I think it does. The dmatest driver should definitely use
> > DMA_BIDIRECTIONAL on the destination buffer to ensure that the poison
> > values are written to RAM and not just written to cache and discarded.
>
> True.
>
> > Now, this probably means that the destination buffer must be _unmapped_
> > with DMA_BIDIRECTIONAL too, which is difficult to do with the current
> > asymmetrical API...
>
> A map and unmap should work for the current platforms with dma
> drivers, but you are right I think this is a violation of the api.
> For correctness we would need the entire operation covered by
> DMA_BIDIRECTIONAL to support platforms that may implement a dma bounce
> buffer. Or, we could change dmatest to not go through the dma-memcpy
> api, allowing dma_alloc to be used for the destination.

Do you mean something like this?

dmatest_init_srcbuf(thread->srcbuf, src_off, len);
dmatest_init_dstbuf(thread->dstbuf, dst_off, len);

dma_src = dma_map_single(dev->dev, src + src_off, len,
DMA_TO_DEVICE);
dma_dest = dma_map_single(dev->dev, dest_buf, test_buf_size,
DMA_BIDIRECTIONAL);
tx = dev->device_prep_dma_memcpy(chan, dma_dest + dst_off,
dma_src, len,
DMA_CTRL_ACK | DMA_COMPL_SKIP_DEST_UNMAP);
if (!tx) {
/* error */
}
tx->callback = NULL;
cookie = tx->tx_submit(tx);
...
/* wait for completion */
...
dma_unmap_single(dev->dev, dma_dest, test_buf_size,
DMA_BIDIRECTIONAL);

It will make dmatest more aggressive, for example, a option for
testing DMA_PREP_INTERRUPT handling of a lowlevel driver can be added
easily.

---
Atsushi Nemoto

2009-01-09 12:16:50

by Ralf Baechle

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Thu, Jan 08, 2009 at 09:36:03AM +0100, Haavard Skinnemoen wrote:

> In the general case, however, I think MIPS has a bug: I've seen drivers
> DMA to/from tiny buffers stored inside another struct. This is legal
> because the driver can guarantee that the other fields in the struct
> aren't accessed in the mean time, but any fields sharing a cacheline
> with the buffer must be written back before the lines are invalidated.

Depending on the implementation details, the use of such a struct might be
relying on implementation-specific behaviour. This is what
Documentation/DMA-API.txt has to say:

[...]
int
dma_get_cache_alignment(void)

Returns the processor cache alignment. This is the absolute minimum
alignment *and* width that you must observe when either mapping
memory or doing partial flushes.

Notes: This API may return a number *larger* than the actual cache
line, but it will guarantee that one or more cache lines fit exactly
into the width returned by this call. It will also always be a power
of two for easy alignment.
[...]

Since dma_get_cache_alignment() is a function not a constant its result
can't be used in the definition of a struct unless possibly excessive
padding is used.

The debate has shown that we problably need BUG_ON() assertions in the
DMA API implementations to catch this sort of dangerous use.

Ralf

2009-01-09 22:32:41

by Russell King

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Fri, Jan 09, 2009 at 11:19:36AM +0000, Ralf Baechle wrote:
> On Thu, Jan 08, 2009 at 09:36:03AM +0100, Haavard Skinnemoen wrote:
> > In the general case, however, I think MIPS has a bug: I've seen drivers
> > DMA to/from tiny buffers stored inside another struct. This is legal
> > because the driver can guarantee that the other fields in the struct
> > aren't accessed in the mean time, but any fields sharing a cacheline
> > with the buffer must be written back before the lines are invalidated.
>
> Depending on the implementation details, the use of such a struct might be
> relying on implementation-specific behaviour. This is what
> Documentation/DMA-API.txt has to say:
>
> [...]
> int
> dma_get_cache_alignment(void)
>
> Returns the processor cache alignment. This is the absolute minimum
> alignment *and* width that you must observe when either mapping
> memory or doing partial flushes.
>
> Notes: This API may return a number *larger* than the actual cache
> line, but it will guarantee that one or more cache lines fit exactly
> into the width returned by this call. It will also always be a power
> of two for easy alignment.
> [...]
>
> Since dma_get_cache_alignment() is a function not a constant its result
> can't be used in the definition of a struct unless possibly excessive
> padding is used.
>
> The debate has shown that we problably need BUG_ON() assertions in the
> DMA API implementations to catch this sort of dangerous use.

I really don't think that's a realistic option. You're asking for
every call to the DMA API to ensure that the buffer and length are
a multiple of the cache line size.

So, what happens if, eg, SPI wants to send a 16 byte buffer, and your
cache lines are 64 bytes? Does the SPI driver have to kmalloc a new
chunk of memory 64 bytes long and copy the data into that before
passing it into the DMA API?

If you start enforcing that kind of thing, I think the cache coherent
people will take violent exception and refuse to play such games - and
quite rightly so.

--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:

2009-01-10 00:40:10

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Fri, Jan 9, 2009 at 1:30 AM, Atsushi Nemoto <[email protected]> wrote:
> Do you mean something like this?

Yes, exactly.

>
> dmatest_init_srcbuf(thread->srcbuf, src_off, len);
> dmatest_init_dstbuf(thread->dstbuf, dst_off, len);
>
> dma_src = dma_map_single(dev->dev, src + src_off, len,
> DMA_TO_DEVICE);
> dma_dest = dma_map_single(dev->dev, dest_buf, test_buf_size,
> DMA_BIDIRECTIONAL);
> tx = dev->device_prep_dma_memcpy(chan, dma_dest + dst_off,
> dma_src, len,
> DMA_CTRL_ACK | DMA_COMPL_SKIP_DEST_UNMAP);
> if (!tx) {
> /* error */
> }
> tx->callback = NULL;
> cookie = tx->tx_submit(tx);
> ...
> /* wait for completion */
> ...
> dma_unmap_single(dev->dev, dma_dest, test_buf_size,
> DMA_BIDIRECTIONAL);
>
> It will make dmatest more aggressive, for example, a option for
> testing DMA_PREP_INTERRUPT handling of a lowlevel driver can be added
> easily.
>

Yes, and it will also allow testing of corner cases like dependency
chains and other async_tx api driver-assumptions.

Thanks,
Dan

2009-01-11 18:44:34

by Ralf Baechle

[permalink] [raw]
Subject: Re: [PATCH] dmatest: flush and invalidate destination buffer before DMA

On Fri, Jan 09, 2009 at 10:27:21PM +0000, Russell King wrote:

> On Fri, Jan 09, 2009 at 11:19:36AM +0000, Ralf Baechle wrote:
> > On Thu, Jan 08, 2009 at 09:36:03AM +0100, Haavard Skinnemoen wrote:
> > > In the general case, however, I think MIPS has a bug: I've seen drivers
> > > DMA to/from tiny buffers stored inside another struct. This is legal
> > > because the driver can guarantee that the other fields in the struct
> > > aren't accessed in the mean time, but any fields sharing a cacheline
> > > with the buffer must be written back before the lines are invalidated.
> >
> > Depending on the implementation details, the use of such a struct might be
> > relying on implementation-specific behaviour. This is what
> > Documentation/DMA-API.txt has to say:
> >
> > [...]
> > int
> > dma_get_cache_alignment(void)
> >
> > Returns the processor cache alignment. This is the absolute minimum
> > alignment *and* width that you must observe when either mapping
> > memory or doing partial flushes.
> >
> > Notes: This API may return a number *larger* than the actual cache
> > line, but it will guarantee that one or more cache lines fit exactly
> > into the width returned by this call. It will also always be a power
> > of two for easy alignment.
> > [...]
> >
> > Since dma_get_cache_alignment() is a function not a constant its result
> > can't be used in the definition of a struct unless possibly excessive
> > padding is used.
> >
> > The debate has shown that we problably need BUG_ON() assertions in the
> > DMA API implementations to catch this sort of dangerous use.
>
> I really don't think that's a realistic option. You're asking for
> every call to the DMA API to ensure that the buffer and length are
> a multiple of the cache line size.
>
> So, what happens if, eg, SPI wants to send a 16 byte buffer, and your
> cache lines are 64 bytes? Does the SPI driver have to kmalloc a new
> chunk of memory 64 bytes long and copy the data into that before
> passing it into the DMA API?
>
> If you start enforcing that kind of thing, I think the cache coherent
> people will take violent exception and refuse to play such games - and
> quite rightly so.

I only want to force people to be aware of what they're doing. So far I've
seen cache lines of up to 256 bytes in size on non-coherent systems. Be
paranoid, very paranoid ...

Below patch should solve Dan William's concerns. It will peform a writeback
and invalidation operation on the first and last cacheline worth of data.
The instruction costs around a dozen cycles so I won't even try to optimize
possible double cache operations away; that'd probably be more expensive.

Ralf

Signed-off-by: Ralf Baechle <[email protected]>

diff --git a/arch/mips/mm/c-r4k.c b/arch/mips/mm/c-r4k.c
index 6e99665..56290a7 100644
--- a/arch/mips/mm/c-r4k.c
+++ b/arch/mips/mm/c-r4k.c
@@ -618,8 +618,11 @@ static void r4k_dma_cache_inv(unsigned long addr, unsigned long size)
if (cpu_has_inclusive_pcaches) {
if (size >= scache_size)
r4k_blast_scache();
- else
+ else {
+ cache_op(Hit_Writeback_Inv_SD, addr);
+ cache_op(Hit_Writeback_Inv_SD, addr + size - 1);
blast_inv_scache_range(addr, addr + size);
+ }
return;
}

@@ -627,6 +630,8 @@ static void r4k_dma_cache_inv(unsigned long addr, unsigned long size)
r4k_blast_dcache();
} else {
R4600_HIT_CACHEOP_WAR_IMPL;
+ cache_op(Hit_Writeback_Inv_D, addr);
+ cache_op(Hit_Writeback_Inv_D, addr + size - 1);
blast_inv_dcache_range(addr, addr + size);
}