2011-07-11 00:52:50

by Michał Mirosław

[permalink] [raw]
Subject: [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify
assumptions --- dma_sync_single_for_device() call can be removed.

Signed-off-by: Michał Mirosław <[email protected]>
---
drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++--
drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +-
drivers/net/wireless/ath/ath9k/recv.c | 10 +++-------
3 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/ar9003_mac.c b/drivers/net/wireless/ath/ath9k/ar9003_mac.c
index 575e185..2d211b6 100644
--- a/drivers/net/wireless/ath/ath9k/ar9003_mac.c
+++ b/drivers/net/wireless/ath/ath9k/ar9003_mac.c
@@ -524,9 +524,9 @@ void ath9k_hw_addrxbuf_edma(struct ath_hw *ah, u32 rxdp,
EXPORT_SYMBOL(ath9k_hw_addrxbuf_edma);

int ath9k_hw_process_rxdesc_edma(struct ath_hw *ah, struct ath_rx_status *rxs,
- void *buf_addr)
+ const void *buf_addr)
{
- struct ar9003_rxs *rxsp = (struct ar9003_rxs *) buf_addr;
+ const struct ar9003_rxs *rxsp = buf_addr;
unsigned int phyerr;

/* TODO: byte swap on big endian for ar9300_10 */
diff --git a/drivers/net/wireless/ath/ath9k/ar9003_mac.h b/drivers/net/wireless/ath/ath9k/ar9003_mac.h
index c504493..c310edc 100644
--- a/drivers/net/wireless/ath/ath9k/ar9003_mac.h
+++ b/drivers/net/wireless/ath/ath9k/ar9003_mac.h
@@ -114,7 +114,7 @@ void ath9k_hw_addrxbuf_edma(struct ath_hw *ah, u32 rxdp,

int ath9k_hw_process_rxdesc_edma(struct ath_hw *ah,
struct ath_rx_status *rxs,
- void *buf_addr);
+ const void *buf_addr);
void ath9k_hw_reset_txstatus_ring(struct ath_hw *ah);
void ath9k_hw_setup_statusring(struct ath_hw *ah, void *ts_start,
u32 ts_paddr_start,
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index 70dc8ec..c5f46d5 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -156,7 +156,7 @@ static bool ath_rx_edma_buf_link(struct ath_softc *sc,
ATH_RXBUF_RESET(bf);
memset(skb->data, 0, ah->caps.rx_status_len);
dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
- ah->caps.rx_status_len, DMA_TO_DEVICE);
+ ah->caps.rx_status_len, DMA_BIDIRECTIONAL);

SKB_CB_ATHBUF(skb) = bf;
ath9k_hw_addrxbuf_edma(ah, bf->bf_buf_addr, qtype);
@@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
BUG_ON(!bf);

dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
- common->rx_bufsize, DMA_FROM_DEVICE);
+ common->rx_bufsize, DMA_BIDIRECTIONAL);

ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
- if (ret == -EINPROGRESS) {
- /*let device gain the buffer again*/
- dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
- common->rx_bufsize, DMA_FROM_DEVICE);
+ if (ret == -EINPROGRESS)
return false;
- }

__skb_unlink(skb, &rx_edma->rx_fifo);
if (ret == -EINVAL) {
--
1.7.5.4



2011-07-12 19:13:46

by Michał Mirosław

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On Wed, Jul 13, 2011 at 12:04:50AM +0800, Felix Fietkau wrote:
> On 12.07.2011, at 23:58, Micha? Miros?aw <[email protected]> wrote:
>
> > On Tue, Jul 12, 2011 at 10:21:05PM +0800, Felix Fietkau wrote:
> >> On 12.07.2011, at 21:03, Micha? Miros?aw <[email protected]> wrote:
> >>
> >>> On Tue, Jul 12, 2011 at 08:54:32PM +0800, Felix Fietkau wrote:
> >>>> On 12.07.2011, at 17:55, Micha? Miros?aw <[email protected]> wrote:
> >>>>
> >>>>> On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote:
> >>>>>> On 2011-07-11 8:52 AM, Micha? Miros?aw wrote:
> >>>>>>> Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify
> >>>>>>> assumptions --- dma_sync_single_for_device() call can be removed.
> >>>>>>>
> >>>>>>> Signed-off-by: Micha? Miros?aw<[email protected]>
> >>>>>>> ---
> >>>>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++--
> >>>>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +-
> >>>>>>> drivers/net/wireless/ath/ath9k/recv.c | 10 +++-------
> >>>>>>> 3 files changed, 6 insertions(+), 10 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
> >>>>>>> index 70dc8ec..c5f46d5 100644
> >>>>>>> --- a/drivers/net/wireless/ath/ath9k/recv.c
> >>>>>>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
> >>>>>>> @@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
> >>>>>>> BUG_ON(!bf);
> >>>>>>>
> >>>>>>> dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
> >>>>>>> - common->rx_bufsize, DMA_FROM_DEVICE);
> >>>>>>> + common->rx_bufsize, DMA_BIDIRECTIONAL);
> >>>>>>>
> >>>>>>> ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
> >>>>>>> - if (ret == -EINPROGRESS) {
> >>>>>>> - /*let device gain the buffer again*/
> >>>>>>> - dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
> >>>>>>> - common->rx_bufsize, DMA_FROM_DEVICE);
> >>>>>>> + if (ret == -EINPROGRESS)
> >>>>>>> return false;
> >>>>>>> - }
> >>>>>>>
> >>>>>>> __skb_unlink(skb,&rx_edma->rx_fifo);
> >>>>>>> if (ret == -EINVAL) {
> >>>>>> I have strong doubts about this change. On most MIPS devices,
> >>>>>> dma_sync_single_for_cpu is a no-op, whereas
> >>>>>> dma_sync_single_for_device flushes the cache range. With this
> >>>>>> change, the CPU could cache the DMA status part behind skb->data and
> >>>>>> that cache entry would not be flushed inbetween calls to this
> >>>>>> functions on the same buffer, likely leading to rx stalls.
> >>>>> You're suggesting a platform implementation bug then. If the platform is not
> >>>>> cache-coherent, it should invalidate relevant CPU cache lines for sync_to_cpu
> >>>>> and unmap cases. Do other devices show such symptoms on MIPS systems?
> >>>>>
> >>>>> I'm not familiar with the platform internals, so we should ask MIPS people.
> >>>> I only mentioned MIPS to describe the potential side effect of this change. From my current understanding of the DMA API, it would be wrong on other platforms as well. I believe the _for_device function needs to be used to transfer ownership of the buffer back to the device, before calling _for_cpu again later for another read.
> >>> What you're saying reminds the wording in DMA-API-HOWTO.txt that I find
> >>> wrong (or at least misleading) compared to what DMA-API.txt describes.
> >>> DMA sync calls do not transfer the ownership of the buffer - they are
> >>> cache synchronization points, ownership passing is handled entirely by
> >>> the driver.
> >> What I meant was that the DMA sync calls reflect the ownership transfer of the memory regions. In this case ownership is transferred between device and CPU multiple times and the code reflects that.
> >>>> This is definitely required in this case, because when the return code is -EINPROGRESS, the driver waits for the hardware to complete this buffer, and the next call has to fetch the memory again after the device has updated it.
> >>> Correctness of this access should be provided by sync_to_cpu() call.
> >> At least in MIPS I'm sure it isn't. If I remember correctly, it also isn't on ARM, so I'm pretty sure that either your understanding of the API is incorrect, or arch code does not implement it properly. In either case, this change (and probably also the p54 one) should not be merged.
> >
> > I briefly looked through DMA API implementation in MIPS, and except
> > for R10k and R12k both sync_for_cpu and sync_for_device are no-ops
> > (see: arch/mips/mm/dma-default.c). For R10k and R12k the syncs are
> > in both points, and exactly like I described before - CPU cachelines
> > are invalidated for DMA_FROM_DEVICE mappings, written back for
> > DMA_TO_DEVICE, both for DMA_BIDIRECTIONAL (including redundant
> > mapping+sync direction).
> >
> > So doing that sync_to_device you are just invalidating the same cachelines
> > twice for no gain (or do nothing twice in some cases) - they are not read
> > by CPU between sync_to_device -> sync_to_cpu (unless you have other bugs
> > in the driver).
> I think you're missing something. It works like this: In the AR9380 rx path, the descriptor is part of the skb. The rx tasklet checks for rx frame completion by calling the sync for cpu, reading the completion flag and (in case of a not completed frame) flushes the cache for that location again (for device). If you remove the for_device call, the next call to this function can see stale data, as the for_cpu call can be a no-op.

Is the descriptor modified in any way before being checked again? Looks like
it isn't. That is my assumption - if this doesn't hold, then we're talking
about different things.

When I looked through the DMA API implementation for MIPS, I saw that whenever
sync_to_cpu is a no-op, sync_to_device is also a no-op. So the bug you're
seeing is not related to those calls. It might be that despite no-op sync
primitives, the platform is not cache-coherent --- that is DMA writes by
device do not cause corresponding CPU cache lines to be invalidated.

BTW, cache flush (other name: invalidation) is needed just before reading the
value. Doing it once more earlier does not really matter. Unless you're
modifying some data in the same cache line as mapped buffer --- then this
is a BUG in the driver and should either use DMA_BIDIRECTIONAL if the modified
value is part of the buffer, or move the modified data away.

Best Regards,
Micha? Miros?aw

2011-07-12 15:58:50

by Michał Mirosław

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On Tue, Jul 12, 2011 at 10:21:05PM +0800, Felix Fietkau wrote:
> On 12.07.2011, at 21:03, Micha? Miros?aw <[email protected]> wrote:
>
> > On Tue, Jul 12, 2011 at 08:54:32PM +0800, Felix Fietkau wrote:
> >> On 12.07.2011, at 17:55, Micha? Miros?aw <[email protected]> wrote:
> >>
> >>> On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote:
> >>>> On 2011-07-11 8:52 AM, Micha? Miros?aw wrote:
> >>>>> Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify
> >>>>> assumptions --- dma_sync_single_for_device() call can be removed.
> >>>>>
> >>>>> Signed-off-by: Micha? Miros?aw<[email protected]>
> >>>>> ---
> >>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++--
> >>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +-
> >>>>> drivers/net/wireless/ath/ath9k/recv.c | 10 +++-------
> >>>>> 3 files changed, 6 insertions(+), 10 deletions(-)
> >>>>>
> >>>>> diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
> >>>>> index 70dc8ec..c5f46d5 100644
> >>>>> --- a/drivers/net/wireless/ath/ath9k/recv.c
> >>>>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
> >>>>> @@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
> >>>>> BUG_ON(!bf);
> >>>>>
> >>>>> dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
> >>>>> - common->rx_bufsize, DMA_FROM_DEVICE);
> >>>>> + common->rx_bufsize, DMA_BIDIRECTIONAL);
> >>>>>
> >>>>> ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
> >>>>> - if (ret == -EINPROGRESS) {
> >>>>> - /*let device gain the buffer again*/
> >>>>> - dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
> >>>>> - common->rx_bufsize, DMA_FROM_DEVICE);
> >>>>> + if (ret == -EINPROGRESS)
> >>>>> return false;
> >>>>> - }
> >>>>>
> >>>>> __skb_unlink(skb,&rx_edma->rx_fifo);
> >>>>> if (ret == -EINVAL) {
> >>>> I have strong doubts about this change. On most MIPS devices,
> >>>> dma_sync_single_for_cpu is a no-op, whereas
> >>>> dma_sync_single_for_device flushes the cache range. With this
> >>>> change, the CPU could cache the DMA status part behind skb->data and
> >>>> that cache entry would not be flushed inbetween calls to this
> >>>> functions on the same buffer, likely leading to rx stalls.
> >>> You're suggesting a platform implementation bug then. If the platform is not
> >>> cache-coherent, it should invalidate relevant CPU cache lines for sync_to_cpu
> >>> and unmap cases. Do other devices show such symptoms on MIPS systems?
> >>>
> >>> I'm not familiar with the platform internals, so we should ask MIPS people.
> >> I only mentioned MIPS to describe the potential side effect of this change. From my current understanding of the DMA API, it would be wrong on other platforms as well. I believe the _for_device function needs to be used to transfer ownership of the buffer back to the device, before calling _for_cpu again later for another read.
> > What you're saying reminds the wording in DMA-API-HOWTO.txt that I find
> > wrong (or at least misleading) compared to what DMA-API.txt describes.
> > DMA sync calls do not transfer the ownership of the buffer - they are
> > cache synchronization points, ownership passing is handled entirely by
> > the driver.
> What I meant was that the DMA sync calls reflect the ownership transfer of the memory regions. In this case ownership is transferred between device and CPU multiple times and the code reflects that.
> >> This is definitely required in this case, because when the return code is -EINPROGRESS, the driver waits for the hardware to complete this buffer, and the next call has to fetch the memory again after the device has updated it.
> > Correctness of this access should be provided by sync_to_cpu() call.
> At least in MIPS I'm sure it isn't. If I remember correctly, it also isn't on ARM, so I'm pretty sure that either your understanding of the API is incorrect, or arch code does not implement it properly. In either case, this change (and probably also the p54 one) should not be merged.

I briefly looked through DMA API implementation in MIPS, and except
for R10k and R12k both sync_for_cpu and sync_for_device are no-ops
(see: arch/mips/mm/dma-default.c). For R10k and R12k the syncs are
in both points, and exactly like I described before - CPU cachelines
are invalidated for DMA_FROM_DEVICE mappings, written back for
DMA_TO_DEVICE, both for DMA_BIDIRECTIONAL (including redundant
mapping+sync direction).

So doing that sync_to_device you are just invalidating the same cachelines
twice for no gain (or do nothing twice in some cases) - they are not read
by CPU between sync_to_device -> sync_to_cpu (unless you have other bugs
in the driver).

Best Regards,
Micha? Miros?aw

2011-07-12 14:22:58

by Felix Fietkau

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On 12.07.2011, at 21:03, Michał Mirosław <[email protected]> wrote:

> On Tue, Jul 12, 2011 at 08:54:32PM +0800, Felix Fietkau wrote:
>> On 12.07.2011, at 17:55, Michał Mirosław <[email protected]> wrote:
>>
>>> On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote:
>>>> On 2011-07-11 8:52 AM, Michał Mirosław wrote:
>>>>> Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify
>>>>> assumptions --- dma_sync_single_for_device() call can be removed.
>>>>>
>>>>> Signed-off-by: Michał Mirosław<[email protected]>
>>>>> ---
>>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++--
>>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +-
>>>>> drivers/net/wireless/ath/ath9k/recv.c | 10 +++-------
>>>>> 3 files changed, 6 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
>>>>> index 70dc8ec..c5f46d5 100644
>>>>> --- a/drivers/net/wireless/ath/ath9k/recv.c
>>>>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
>>>>> @@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
>>>>> BUG_ON(!bf);
>>>>>
>>>>> dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
>>>>> - common->rx_bufsize, DMA_FROM_DEVICE);
>>>>> + common->rx_bufsize, DMA_BIDIRECTIONAL);
>>>>>
>>>>> ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
>>>>> - if (ret == -EINPROGRESS) {
>>>>> - /*let device gain the buffer again*/
>>>>> - dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
>>>>> - common->rx_bufsize, DMA_FROM_DEVICE);
>>>>> + if (ret == -EINPROGRESS)
>>>>> return false;
>>>>> - }
>>>>>
>>>>> __skb_unlink(skb,&rx_edma->rx_fifo);
>>>>> if (ret == -EINVAL) {
>>>> I have strong doubts about this change. On most MIPS devices,
>>>> dma_sync_single_for_cpu is a no-op, whereas
>>>> dma_sync_single_for_device flushes the cache range. With this
>>>> change, the CPU could cache the DMA status part behind skb->data and
>>>> that cache entry would not be flushed inbetween calls to this
>>>> functions on the same buffer, likely leading to rx stalls.
>>>
>>> You're suggesting a platform implementation bug then. If the platform is not
>>> cache-coherent, it should invalidate relevant CPU cache lines for sync_to_cpu
>>> and unmap cases. Do other devices show such symptoms on MIPS systems?
>>>
>>> I'm not familiar with the platform internals, so we should ask MIPS people.
>> I only mentioned MIPS to describe the potential side effect of this change. From my current understanding of the DMA API, it would be wrong on other platforms as well. I believe the _for_device function needs to be used to transfer ownership of the buffer back to the device, before calling _for_cpu again later for another read.
>
> What you're saying reminds the wording in DMA-API-HOWTO.txt that I find
> wrong (or at least misleading) compared to what DMA-API.txt describes.
> DMA sync calls do not transfer the ownership of the buffer - they are
> cache synchronization points, ownership passing is handled entirely by
> the driver.
What I meant was that the DMA sync calls reflect the ownership transfer of the memory regions. In this case ownership is transferred between device and CPU multiple times and the code reflects that.

>
>> This is definitely required in this case, because when the return code is -EINPROGRESS, the driver waits for the hardware to complete this buffer, and the next call has to fetch the memory again after the device has updated it.
>
> Correctness of this access should be provided by sync_to_cpu() call.
At least in MIPS I'm sure it isn't. If I remember correctly, it also isn't on ARM, so I'm pretty sure that either your understanding of the API is incorrect, or arch code does not implement it properly. In either case, this change (and probably also the p54 one) should not be merged.

- Felix

2011-07-12 20:59:49

by Michał Mirosław

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On Tue, Jul 12, 2011 at 10:53:16PM +0200, Micha? Miros?aw wrote:
> In the case we're analysing, the ownership is on the device's side
> until it stops writing the buffer.

Just to be clear, this should say: ... until it marks the buffer as done.

Best Regards,
Micha? Miros?aw

2011-07-12 16:10:09

by Felix Fietkau

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On 12.07.2011, at 23:58, Michał Mirosław <[email protected]> wrote:

> On Tue, Jul 12, 2011 at 10:21:05PM +0800, Felix Fietkau wrote:
>> On 12.07.2011, at 21:03, Michał Mirosław <[email protected]> wrote:
>>
>>> On Tue, Jul 12, 2011 at 08:54:32PM +0800, Felix Fietkau wrote:
>>>> On 12.07.2011, at 17:55, Michał Mirosław <[email protected]> wrote:
>>>>
>>>>> On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote:
>>>>>> On 2011-07-11 8:52 AM, Michał Mirosław wrote:
>>>>>>> Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify
>>>>>>> assumptions --- dma_sync_single_for_device() call can be removed.
>>>>>>>
>>>>>>> Signed-off-by: Michał Mirosław<[email protected]>
>>>>>>> ---
>>>>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++--
>>>>>>> drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +-
>>>>>>> drivers/net/wireless/ath/ath9k/recv.c | 10 +++-------
>>>>>>> 3 files changed, 6 insertions(+), 10 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
>>>>>>> index 70dc8ec..c5f46d5 100644
>>>>>>> --- a/drivers/net/wireless/ath/ath9k/recv.c
>>>>>>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
>>>>>>> @@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
>>>>>>> BUG_ON(!bf);
>>>>>>>
>>>>>>> dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
>>>>>>> - common->rx_bufsize, DMA_FROM_DEVICE);
>>>>>>> + common->rx_bufsize, DMA_BIDIRECTIONAL);
>>>>>>>
>>>>>>> ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
>>>>>>> - if (ret == -EINPROGRESS) {
>>>>>>> - /*let device gain the buffer again*/
>>>>>>> - dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
>>>>>>> - common->rx_bufsize, DMA_FROM_DEVICE);
>>>>>>> + if (ret == -EINPROGRESS)
>>>>>>> return false;
>>>>>>> - }
>>>>>>>
>>>>>>> __skb_unlink(skb,&rx_edma->rx_fifo);
>>>>>>> if (ret == -EINVAL) {
>>>>>> I have strong doubts about this change. On most MIPS devices,
>>>>>> dma_sync_single_for_cpu is a no-op, whereas
>>>>>> dma_sync_single_for_device flushes the cache range. With this
>>>>>> change, the CPU could cache the DMA status part behind skb->data and
>>>>>> that cache entry would not be flushed inbetween calls to this
>>>>>> functions on the same buffer, likely leading to rx stalls.
>>>>> You're suggesting a platform implementation bug then. If the platform is not
>>>>> cache-coherent, it should invalidate relevant CPU cache lines for sync_to_cpu
>>>>> and unmap cases. Do other devices show such symptoms on MIPS systems?
>>>>>
>>>>> I'm not familiar with the platform internals, so we should ask MIPS people.
>>>> I only mentioned MIPS to describe the potential side effect of this change. From my current understanding of the DMA API, it would be wrong on other platforms as well. I believe the _for_device function needs to be used to transfer ownership of the buffer back to the device, before calling _for_cpu again later for another read.
>>> What you're saying reminds the wording in DMA-API-HOWTO.txt that I find
>>> wrong (or at least misleading) compared to what DMA-API.txt describes.
>>> DMA sync calls do not transfer the ownership of the buffer - they are
>>> cache synchronization points, ownership passing is handled entirely by
>>> the driver.
>> What I meant was that the DMA sync calls reflect the ownership transfer of the memory regions. In this case ownership is transferred between device and CPU multiple times and the code reflects that.
>>>> This is definitely required in this case, because when the return code is -EINPROGRESS, the driver waits for the hardware to complete this buffer, and the next call has to fetch the memory again after the device has updated it.
>>> Correctness of this access should be provided by sync_to_cpu() call.
>> At least in MIPS I'm sure it isn't. If I remember correctly, it also isn't on ARM, so I'm pretty sure that either your understanding of the API is incorrect, or arch code does not implement it properly. In either case, this change (and probably also the p54 one) should not be merged.
>
> I briefly looked through DMA API implementation in MIPS, and except
> for R10k and R12k both sync_for_cpu and sync_for_device are no-ops
> (see: arch/mips/mm/dma-default.c). For R10k and R12k the syncs are
> in both points, and exactly like I described before - CPU cachelines
> are invalidated for DMA_FROM_DEVICE mappings, written back for
> DMA_TO_DEVICE, both for DMA_BIDIRECTIONAL (including redundant
> mapping+sync direction).
>
> So doing that sync_to_device you are just invalidating the same cachelines
> twice for no gain (or do nothing twice in some cases) - they are not read
> by CPU between sync_to_device -> sync_to_cpu (unless you have other bugs
> in the driver).
I think you're missing something. It works like this: In the AR9380 rx path, the descriptor is part of the skb. The rx tasklet checks for rx frame completion by calling the sync for cpu, reading the completion flag and (in case of a not completed frame) flushes the cache for that location again (for device). If you remove the for_device call, the next call to this function can see stale data, as the for_cpu call can be a no-op.

- Felix

2011-07-12 20:53:18

by Michał Mirosław

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On Tue, Jul 12, 2011 at 08:32:04PM +0100, Ralf Baechle wrote:
> On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote:
>
> > >diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
> > >index 70dc8ec..c5f46d5 100644
> > >--- a/drivers/net/wireless/ath/ath9k/recv.c
> > >+++ b/drivers/net/wireless/ath/ath9k/recv.c
> > >@@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
> > > BUG_ON(!bf);
> > >
> > > dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
> > >- common->rx_bufsize, DMA_FROM_DEVICE);
> > >+ common->rx_bufsize, DMA_BIDIRECTIONAL);
> > >
> > > ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
> > >- if (ret == -EINPROGRESS) {
> > >- /*let device gain the buffer again*/
> > >- dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
> > >- common->rx_bufsize, DMA_FROM_DEVICE);
> > >+ if (ret == -EINPROGRESS)
> > > return false;
> > >- }
> > >
> > > __skb_unlink(skb,&rx_edma->rx_fifo);
> > > if (ret == -EINVAL) {
> > I have strong doubts about this change. On most MIPS devices,
> > dma_sync_single_for_cpu is a no-op, whereas
> > dma_sync_single_for_device flushes the cache range. With this
> > change, the CPU could cache the DMA status part behind skb->data and
> > that cache entry would not be flushed inbetween calls to this
> > functions on the same buffer, likely leading to rx stalls.
>
> The code was already broken before. By the time dma_sync_single_for_cpu
> and ath9k_hw_process_rxdesc_edma are called, the DMA engine may still be
> active in the buffer, yet the driver is looking at it.
>
> dma_sync_single_for_cpu() is part of changing the buffer ownership from
> the device to the CPU. When it is being called, DMA into the buffer should
> already have been completed ... or else the shit may hit the jet engine.

Let's get rid of the "buffer ownership" misunderstanding from the picture.
Ownership is about who is expected to be writing (or is assured the data
is not being changed under his foot). This has nothing to do with DMA API.

DMA API is there for two purposes: to make part of memory visible to
both CPU and device (map/unmap) and to ensure memory consistency in
presence of caches (sync; implicitly done in map/unmap).

In the case we're analysing, the ownership is on the device's side
until it stops writing the buffer. sync_to_cpu doesn't change that.
It only allows the CPU to see more recent data (in case the CPU cached
something earlier).

> Imagine what would happen on a hypothetic cache architecture which does not
> have a dirty bit, that is which would have to write back every cache line -
> even clean lines - to memory in order to evict it. Corruption.

dma_map_whatever() would mark the memory uncachable on such an architecture.
Otherwise this would violate assumptions on DMA_FROM_DEVICE mappings (or
"device owned buffers") that the CPU does not write to the mapped memory.

Best Regards,
Micha? Miros?aw

2011-07-12 13:03:21

by Michał Mirosław

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On Tue, Jul 12, 2011 at 08:54:32PM +0800, Felix Fietkau wrote:
> On 12.07.2011, at 17:55, Micha? Miros?aw <[email protected]> wrote:
>
> > On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote:
> >> On 2011-07-11 8:52 AM, Micha? Miros?aw wrote:
> >>> Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify
> >>> assumptions --- dma_sync_single_for_device() call can be removed.
> >>>
> >>> Signed-off-by: Micha? Miros?aw<[email protected]>
> >>> ---
> >>> drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++--
> >>> drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +-
> >>> drivers/net/wireless/ath/ath9k/recv.c | 10 +++-------
> >>> 3 files changed, 6 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
> >>> index 70dc8ec..c5f46d5 100644
> >>> --- a/drivers/net/wireless/ath/ath9k/recv.c
> >>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
> >>> @@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
> >>> BUG_ON(!bf);
> >>>
> >>> dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
> >>> - common->rx_bufsize, DMA_FROM_DEVICE);
> >>> + common->rx_bufsize, DMA_BIDIRECTIONAL);
> >>>
> >>> ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
> >>> - if (ret == -EINPROGRESS) {
> >>> - /*let device gain the buffer again*/
> >>> - dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
> >>> - common->rx_bufsize, DMA_FROM_DEVICE);
> >>> + if (ret == -EINPROGRESS)
> >>> return false;
> >>> - }
> >>>
> >>> __skb_unlink(skb,&rx_edma->rx_fifo);
> >>> if (ret == -EINVAL) {
> >> I have strong doubts about this change. On most MIPS devices,
> >> dma_sync_single_for_cpu is a no-op, whereas
> >> dma_sync_single_for_device flushes the cache range. With this
> >> change, the CPU could cache the DMA status part behind skb->data and
> >> that cache entry would not be flushed inbetween calls to this
> >> functions on the same buffer, likely leading to rx stalls.
> >
> > You're suggesting a platform implementation bug then. If the platform is not
> > cache-coherent, it should invalidate relevant CPU cache lines for sync_to_cpu
> > and unmap cases. Do other devices show such symptoms on MIPS systems?
> >
> > I'm not familiar with the platform internals, so we should ask MIPS people.
> I only mentioned MIPS to describe the potential side effect of this change. From my current understanding of the DMA API, it would be wrong on other platforms as well. I believe the _for_device function needs to be used to transfer ownership of the buffer back to the device, before calling _for_cpu again later for another read.

What you're saying reminds the wording in DMA-API-HOWTO.txt that I find
wrong (or at least misleading) compared to what DMA-API.txt describes.
DMA sync calls do not transfer the ownership of the buffer - they are
cache synchronization points, ownership passing is handled entirely by
the driver.

> This is definitely required in this case, because when the return code is -EINPROGRESS, the driver waits for the hardware to complete this buffer, and the next call has to fetch the memory again after the device has updated it.

Correctness of this access should be provided by sync_to_cpu() call.

Best Regards,
Micha? Miros?aw

2011-07-12 19:32:26

by Ralf Baechle

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote:

> >diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
> >index 70dc8ec..c5f46d5 100644
> >--- a/drivers/net/wireless/ath/ath9k/recv.c
> >+++ b/drivers/net/wireless/ath/ath9k/recv.c
> >@@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
> > BUG_ON(!bf);
> >
> > dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
> >- common->rx_bufsize, DMA_FROM_DEVICE);
> >+ common->rx_bufsize, DMA_BIDIRECTIONAL);
> >
> > ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
> >- if (ret == -EINPROGRESS) {
> >- /*let device gain the buffer again*/
> >- dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
> >- common->rx_bufsize, DMA_FROM_DEVICE);
> >+ if (ret == -EINPROGRESS)
> > return false;
> >- }
> >
> > __skb_unlink(skb,&rx_edma->rx_fifo);
> > if (ret == -EINVAL) {
> I have strong doubts about this change. On most MIPS devices,
> dma_sync_single_for_cpu is a no-op, whereas
> dma_sync_single_for_device flushes the cache range. With this
> change, the CPU could cache the DMA status part behind skb->data and
> that cache entry would not be flushed inbetween calls to this
> functions on the same buffer, likely leading to rx stalls.

The code was already broken before. By the time dma_sync_single_for_cpu
and ath9k_hw_process_rxdesc_edma are called, the DMA engine may still be
active in the buffer, yet the driver is looking at it.

dma_sync_single_for_cpu() is part of changing the buffer ownership from
the device to the CPU. When it is being called, DMA into the buffer should
already have been completed ... or else the shit may hit the jet engine.

Imagine what would happen on a hypothetic cache architecture which does not
have a dirty bit, that is which would have to write back every cache line -
even clean lines - to memory in order to evict it. Corruption.

And don't argue with what the actual MIPS implementation of dma_sync_single_-
for-{cpu,device} is doing. It's meant to bee treated as a black box; that
abstraction is the whole point of the ABI. And it seems the driver is also
being used on other architectures than MIPS …

Ralf

2011-07-12 09:55:43

by Michał Mirosław

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote:
> On 2011-07-11 8:52 AM, Micha? Miros?aw wrote:
> >Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify
> >assumptions --- dma_sync_single_for_device() call can be removed.
> >
> >Signed-off-by: Micha? Miros?aw<[email protected]>
> >---
> > drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++--
> > drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +-
> > drivers/net/wireless/ath/ath9k/recv.c | 10 +++-------
> > 3 files changed, 6 insertions(+), 10 deletions(-)
> >
> >diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
> >index 70dc8ec..c5f46d5 100644
> >--- a/drivers/net/wireless/ath/ath9k/recv.c
> >+++ b/drivers/net/wireless/ath/ath9k/recv.c
> >@@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
> > BUG_ON(!bf);
> >
> > dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
> >- common->rx_bufsize, DMA_FROM_DEVICE);
> >+ common->rx_bufsize, DMA_BIDIRECTIONAL);
> >
> > ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
> >- if (ret == -EINPROGRESS) {
> >- /*let device gain the buffer again*/
> >- dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
> >- common->rx_bufsize, DMA_FROM_DEVICE);
> >+ if (ret == -EINPROGRESS)
> > return false;
> >- }
> >
> > __skb_unlink(skb,&rx_edma->rx_fifo);
> > if (ret == -EINVAL) {
> I have strong doubts about this change. On most MIPS devices,
> dma_sync_single_for_cpu is a no-op, whereas
> dma_sync_single_for_device flushes the cache range. With this
> change, the CPU could cache the DMA status part behind skb->data and
> that cache entry would not be flushed inbetween calls to this
> functions on the same buffer, likely leading to rx stalls.

You're suggesting a platform implementation bug then. If the platform is not
cache-coherent, it should invalidate relevant CPU cache lines for sync_to_cpu
and unmap cases. Do other devices show such symptoms on MIPS systems?

I'm not familiar with the platform internals, so we should ask MIPS people.

[added Cc: linux-mips]

Best Regards,
Micha? Miros?aw

2011-07-12 04:36:18

by Felix Fietkau

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On 2011-07-11 8:52 AM, Michał Mirosław wrote:
> Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify
> assumptions --- dma_sync_single_for_device() call can be removed.
>
> Signed-off-by: Michał Mirosław<[email protected]>
> ---
> drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++--
> drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +-
> drivers/net/wireless/ath/ath9k/recv.c | 10 +++-------
> 3 files changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
> index 70dc8ec..c5f46d5 100644
> --- a/drivers/net/wireless/ath/ath9k/recv.c
> +++ b/drivers/net/wireless/ath/ath9k/recv.c
> @@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
> BUG_ON(!bf);
>
> dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
> - common->rx_bufsize, DMA_FROM_DEVICE);
> + common->rx_bufsize, DMA_BIDIRECTIONAL);
>
> ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
> - if (ret == -EINPROGRESS) {
> - /*let device gain the buffer again*/
> - dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
> - common->rx_bufsize, DMA_FROM_DEVICE);
> + if (ret == -EINPROGRESS)
> return false;
> - }
>
> __skb_unlink(skb,&rx_edma->rx_fifo);
> if (ret == -EINVAL) {
I have strong doubts about this change. On most MIPS devices,
dma_sync_single_for_cpu is a no-op, whereas dma_sync_single_for_device
flushes the cache range. With this change, the CPU could cache the DMA
status part behind skb->data and that cache entry would not be flushed
inbetween calls to this functions on the same buffer, likely leading to
rx stalls.

- Felix

2011-07-12 05:30:37

by Ben Greear

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On 07/11/2011 09:36 PM, Felix Fietkau wrote:
> On 2011-07-11 8:52 AM, Michał Mirosław wrote:
>> Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify
>> assumptions --- dma_sync_single_for_device() call can be removed.
>>
>> Signed-off-by: Michał Mirosław<[email protected]>
>> ---
>> drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++--
>> drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +-
>> drivers/net/wireless/ath/ath9k/recv.c | 10 +++-------
>> 3 files changed, 6 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
>> index 70dc8ec..c5f46d5 100644
>> --- a/drivers/net/wireless/ath/ath9k/recv.c
>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
>> @@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
>> BUG_ON(!bf);
>>
>> dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
>> - common->rx_bufsize, DMA_FROM_DEVICE);
>> + common->rx_bufsize, DMA_BIDIRECTIONAL);
>>
>> ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
>> - if (ret == -EINPROGRESS) {
>> - /*let device gain the buffer again*/
>> - dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
>> - common->rx_bufsize, DMA_FROM_DEVICE);
>> + if (ret == -EINPROGRESS)
>> return false;
>> - }
>>
>> __skb_unlink(skb,&rx_edma->rx_fifo);
>> if (ret == -EINVAL) {
> I have strong doubts about this change. On most MIPS devices, dma_sync_single_for_cpu is a no-op, whereas dma_sync_single_for_device flushes the cache range.
> With this change, the CPU could cache the DMA status part behind skb->data and that cache entry would not be flushed inbetween calls to this functions on the
> same buffer, likely leading to rx stalls.

At the very least, it would need heavy testing. It took a very long time to get
the ath9k DMA issues (mostly?) resolved...so we shouldn't go mucking in this
code on theory...

Thanks,
Ben

>
> - Felix
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2011-07-12 12:56:19

by Felix Fietkau

[permalink] [raw]
Subject: Re: [ath9k-devel] [PATCH v2 07/46] net/wireless: ath9k: fix DMA API usage

On 12.07.2011, at 17:55, Michał Mirosław <[email protected]> wrote:

> On Tue, Jul 12, 2011 at 12:36:06PM +0800, Felix Fietkau wrote:
>> On 2011-07-11 8:52 AM, Michał Mirosław wrote:
>>> Also constify buf_addr for ath9k_hw_process_rxdesc_edma() to verify
>>> assumptions --- dma_sync_single_for_device() call can be removed.
>>>
>>> Signed-off-by: Michał Mirosław<[email protected]>
>>> ---
>>> drivers/net/wireless/ath/ath9k/ar9003_mac.c | 4 ++--
>>> drivers/net/wireless/ath/ath9k/ar9003_mac.h | 2 +-
>>> drivers/net/wireless/ath/ath9k/recv.c | 10 +++-------
>>> 3 files changed, 6 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
>>> index 70dc8ec..c5f46d5 100644
>>> --- a/drivers/net/wireless/ath/ath9k/recv.c
>>> +++ b/drivers/net/wireless/ath/ath9k/recv.c
>>> @@ -684,15 +684,11 @@ static bool ath_edma_get_buffers(struct ath_softc *sc,
>>> BUG_ON(!bf);
>>>
>>> dma_sync_single_for_cpu(sc->dev, bf->bf_buf_addr,
>>> - common->rx_bufsize, DMA_FROM_DEVICE);
>>> + common->rx_bufsize, DMA_BIDIRECTIONAL);
>>>
>>> ret = ath9k_hw_process_rxdesc_edma(ah, NULL, skb->data);
>>> - if (ret == -EINPROGRESS) {
>>> - /*let device gain the buffer again*/
>>> - dma_sync_single_for_device(sc->dev, bf->bf_buf_addr,
>>> - common->rx_bufsize, DMA_FROM_DEVICE);
>>> + if (ret == -EINPROGRESS)
>>> return false;
>>> - }
>>>
>>> __skb_unlink(skb,&rx_edma->rx_fifo);
>>> if (ret == -EINVAL) {
>> I have strong doubts about this change. On most MIPS devices,
>> dma_sync_single_for_cpu is a no-op, whereas
>> dma_sync_single_for_device flushes the cache range. With this
>> change, the CPU could cache the DMA status part behind skb->data and
>> that cache entry would not be flushed inbetween calls to this
>> functions on the same buffer, likely leading to rx stalls.
>
> You're suggesting a platform implementation bug then. If the platform is not
> cache-coherent, it should invalidate relevant CPU cache lines for sync_to_cpu
> and unmap cases. Do other devices show such symptoms on MIPS systems?
>
> I'm not familiar with the platform internals, so we should ask MIPS people.
I only mentioned MIPS to describe the potential side effect of this change. From my current understanding of the DMA API, it would be wrong on other platforms as well. I believe the _for_device function needs to be used to transfer ownership of the buffer back to the device, before calling _for_cpu again later for another read.
This is definitely required in this case, because when the return code is -EINPROGRESS, the driver waits for the hardware to complete this buffer, and the next call has to fetch the memory again after the device has updated it.
This is handled properly by the current code without your change.

- Felix