2010-10-20 23:07:11

by Luis R. Rodriguez

[permalink] [raw]
Subject: [PATCH 0/6] ath9k: fix RX poison issue

This series addresses the RX poison issue reported by Ben.
I've split the patches up into as many atomic pieces to make
it easier for review and understanding. Some are stable, the
last few patches are for debugging.

Luis R. Rodriguez (6):
ath9k: add locking for stopping RX
ath9k: add locking for starting the PCU on RX
ath9k: rename rxflushlock to pcu_lock
ath9k: lock reset and PCU start/stopping
ath: add a ATH_DBG_WARN()
ath9k: add a debug warning when we cannot stop RX

drivers/net/wireless/ath/ath9k/ath9k.h | 2 +-
drivers/net/wireless/ath/ath9k/main.c | 31 +++++++++++++++++++++++++++++--
drivers/net/wireless/ath/ath9k/recv.c | 17 +++++++++--------
drivers/net/wireless/ath/debug.h | 2 ++
4 files changed, 41 insertions(+), 11 deletions(-)



2010-10-22 00:32:36

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [PATCH 0/6] ath9k: fix RX poison issue

On Thu, Oct 21, 2010 at 5:30 PM, Bruno Randolf <[email protected]> wrote:
> On Fri October 22 2010 02:47:21 Ben Greear wrote:
>> I tested this code when it was combined into a single patch.
>> There were still some issues, but I have no reason to believe
>> this patch set caused them.  This patch set significantly
>> improved stability in my testing with 130+ STA interfaces.
>
> one question: do you see similar issues with ath5k? i remember one bug is
> filed with a rx poision issue.

Hardware is similar so the I'd do the same locking.

Luis

2010-10-20 23:07:20

by Luis R. Rodriguez

[permalink] [raw]
Subject: [PATCH 6/6] ath9k: add a debug warning when we cannot stop RX

We have seen several DMA races when we race against
stopping and starting the PCU. I suspect that when
we cannot stop the PCU we may hit some of these same
races so warn against them for now but only when
debugging (CONFIG_ATH_DEBUG) is enabled.

If you run into this warning and are a developer,
please fix the cause of the warning. The potential
here, although I cannot prove yet, is that the DMA
engine can be confused and start writing to a buffer
that was already DMA'd before and at least the kernel
assumes is not being accessed by hardware anymore.

Cc: Ben Greear <[email protected]>
Cc: Kyungwan Nam <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/net/wireless/ath/ath9k/recv.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index c04a940..87fabf8 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -528,6 +528,8 @@ bool ath_stoprecv(struct ath_softc *sc)
sc->rx.rxlink = NULL;
spin_unlock_bh(&sc->rx.rxbuflock);

+ ATH_DBG_WARN(!stopped, "Could not stop RX, we could be "
+ "confusing the DMA engine when we start RX up\n");
return stopped;
}

--
1.7.0.4


2010-10-20 23:07:12

by Luis R. Rodriguez

[permalink] [raw]
Subject: [PATCH 1/6] ath9k: add locking for stopping RX

ath9k locks for starting RX but not for stopping RX. We could
potentially run into a situation where tried to stop RX
but immediately started RX. This allows for races on the
the RX engine deciding what buffer we last left off on
and could potentially cause ath9k to DMA into already
free'd memory or in the worst case at a later time to
already given memory to other drivers.

Fix this by locking stopping RX.

This is part of a series that will help resolve the bug:

https://bugzilla.kernel.org/show_bug.cgi?id=14624

For more details about this issue refer to:

http://marc.info/?l=linux-wireless&m=128629803703756&w=2

Cc: [email protected]
Cc: Ben Greear <[email protected]>
Cc: Kyungwan Nam <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/net/wireless/ath/ath9k/recv.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index 2b2c318..3b19bbb 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -306,10 +306,8 @@ static void ath_edma_start_recv(struct ath_softc *sc)

static void ath_edma_stop_recv(struct ath_softc *sc)
{
- spin_lock_bh(&sc->rx.rxbuflock);
ath_rx_remove_buffer(sc, ATH9K_RX_QUEUE_HP);
ath_rx_remove_buffer(sc, ATH9K_RX_QUEUE_LP);
- spin_unlock_bh(&sc->rx.rxbuflock);
}

int ath_rx_init(struct ath_softc *sc, int nbufs)
@@ -518,6 +516,7 @@ bool ath_stoprecv(struct ath_softc *sc)
struct ath_hw *ah = sc->sc_ah;
bool stopped;

+ spin_lock_bh(&sc->rx.rxbuflock);
ath9k_hw_stoppcurecv(ah);
ath9k_hw_setrxfilter(ah, 0);
stopped = ath9k_hw_stopdmarecv(ah);
@@ -526,6 +525,7 @@ bool ath_stoprecv(struct ath_softc *sc)
ath_edma_stop_recv(sc);
else
sc->rx.rxlink = NULL;
+ spin_unlock_bh(&sc->rx.rxbuflock);

return stopped;
}
--
1.7.0.4


2010-10-20 23:07:18

by Luis R. Rodriguez

[permalink] [raw]
Subject: [PATCH 5/6] ath: add a ATH_DBG_WARN()

To be used to throw out warnings only for developers.
This can be used by some corner cases that developers
already know can be hit but developers want to address
so to avoid spewing out a warning this can only be
enabled with CONFIG_ATH_DEBUG enabled.

Cc: Ben Greear <[email protected]>
Cc: Kyungwan Nam <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/net/wireless/ath/debug.h | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/wireless/ath/debug.h b/drivers/net/wireless/ath/debug.h
index 64e4af2..15fc8ff 100644
--- a/drivers/net/wireless/ath/debug.h
+++ b/drivers/net/wireless/ath/debug.h
@@ -70,11 +70,13 @@ enum ATH_DEBUG {
#ifdef CONFIG_ATH_DEBUG
void ath_print(struct ath_common *common, int dbg_mask, const char *fmt, ...)
__attribute__ ((format (printf, 3, 4)));
+#define ATH_DBG_WARN(foo, arg...) WARN(foo, arg)
#else
static inline void __attribute__ ((format (printf, 3, 4)))
ath_print(struct ath_common *common, int dbg_mask, const char *fmt, ...)
{
}
+#define ATH_DBG_WARN(foo)
#endif /* CONFIG_ATH_DEBUG */

/** Returns string describing opmode, or NULL if unknown mode. */
--
1.7.0.4


2010-10-21 19:14:27

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [PATCH 5/6] ath: add a ATH_DBG_WARN()

On Wed, Oct 20, 2010 at 4:07 PM, Luis R. Rodriguez
<[email protected]> wrote:
> To be used to throw out warnings only for developers.
> This can be used by some corner cases that developers
> already know can be hit but developers want to address
> so to avoid spewing out a warning this can only be
> enabled with CONFIG_ATH_DEBUG enabled.
>
> Cc: Ben Greear <[email protected]>
> Cc: Kyungwan Nam <[email protected]>
> Signed-off-by: Luis R. Rodriguez <[email protected]>

Sorry, please use my v2.

Luis

2010-10-21 09:03:34

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [PATCH 3/6] ath9k: rename rxflushlock to pcu_lock

On Thu, Oct 21, 2010 at 12:05:15AM -0700, Senthilkumar Balasubramanian wrote:
> On Thu, Oct 21, 2010 at 11:43:50AM +0530, Luis R. Rodriguez wrote:
> > On Wed, Oct 20, 2010 at 11:03 PM, Senthil Balasubramanian
> > <[email protected]> wrote:
> > > On Thu, Oct 21, 2010 at 04:37:05AM +0530, Luis R. Rodriguez wrote:
> > >> The real way to lock RX is to contend on the PCU
> > >> and reset, this will be fixed in the next patch but for
> > >> now just do the renames so that the next patch which changes
> > >> the locking order is crystal clear.
> > >>
> > >> This is part of a series that will help resolve the bug:
> > >>
> > >> https://bugzilla.kernel.org/show_bug.cgi?id=14624
> > >>
> > >> For more details about this issue refer to:
> > >>
> > >> http://marc.info/?l=linux-wireless&m=128629803703756&w=2
> > >>
> > >> Cc: [email protected]
> > >> Cc: Ben Greear <[email protected]>
> > >> Cc: Kyungwan Nam <[email protected]>
> > >> Signed-off-by: Luis R. Rodriguez <[email protected]>
> > >> ---
> > >> ?drivers/net/wireless/ath/ath9k/ath9k.h | ? ?2 +-
> > >> ?drivers/net/wireless/ath/ath9k/main.c ?| ? ?4 ++--
> > >> ?drivers/net/wireless/ath/ath9k/recv.c ?| ? ?6 +++---
> > >> ?3 files changed, 6 insertions(+), 6 deletions(-)
> > >>
> > >> diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
> > >> index 0f0bc54..81fed83 100644
> > >> --- a/drivers/net/wireless/ath/ath9k/ath9k.h
> > >> +++ b/drivers/net/wireless/ath/ath9k/ath9k.h
> > >> @@ -309,7 +309,7 @@ struct ath_rx {
> > >> ? ? ? u8 rxotherant;
> > >> ? ? ? u32 *rxlink;
> > >> ? ? ? unsigned int rxfilter;
> > >> - ? ? spinlock_t rxflushlock;
> > >> + ? ? spinlock_t pcu_lock;
> > >> ? ? ? spinlock_t rxbuflock;
> > >> ? ? ? struct list_head rxbuf;
> > >> ? ? ? struct ath_descdma rxdma;
> > >> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
> > >> index dcd94ba..37f18ef 100644
> > >> --- a/drivers/net/wireless/ath/ath9k/main.c
> > >> +++ b/drivers/net/wireless/ath/ath9k/main.c
> > >> @@ -609,7 +609,7 @@ void ath9k_tasklet(unsigned long data)
> > >> ? ? ? ? ? ? ? rxmask = (ATH9K_INT_RX | ATH9K_INT_RXEOL | ATH9K_INT_RXORN);
> > >>
> > >> ? ? ? if (status & rxmask) {
> > >> - ? ? ? ? ? ? spin_lock_bh(&sc->rx.rxflushlock);
> > >> + ? ? ? ? ? ? spin_lock_bh(&sc->rx.pcu_lock);
> > > no need to disable bh as you are already in tasklet. I understand that the
> > > existing code itself was doing this and I thought of fixing that anyway.
> > > Can you plese fix this as well with this patch?
> >
> > Nah, I rather this be a separate patch, which you can test and send yourself :)
> Since we are sending this to stable, I thought we can send a clean patch.

Hrm. Have you tested it?

Luis

2010-10-21 06:14:12

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [PATCH 3/6] ath9k: rename rxflushlock to pcu_lock

On Wed, Oct 20, 2010 at 11:03 PM, Senthil Balasubramanian
<[email protected]> wrote:
> On Thu, Oct 21, 2010 at 04:37:05AM +0530, Luis R. Rodriguez wrote:
>> The real way to lock RX is to contend on the PCU
>> and reset, this will be fixed in the next patch but for
>> now just do the renames so that the next patch which changes
>> the locking order is crystal clear.
>>
>> This is part of a series that will help resolve the bug:
>>
>> https://bugzilla.kernel.org/show_bug.cgi?id=14624
>>
>> For more details about this issue refer to:
>>
>> http://marc.info/?l=linux-wireless&m=128629803703756&w=2
>>
>> Cc: [email protected]
>> Cc: Ben Greear <[email protected]>
>> Cc: Kyungwan Nam <[email protected]>
>> Signed-off-by: Luis R. Rodriguez <[email protected]>
>> ---
>>  drivers/net/wireless/ath/ath9k/ath9k.h |    2 +-
>>  drivers/net/wireless/ath/ath9k/main.c  |    4 ++--
>>  drivers/net/wireless/ath/ath9k/recv.c  |    6 +++---
>>  3 files changed, 6 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
>> index 0f0bc54..81fed83 100644
>> --- a/drivers/net/wireless/ath/ath9k/ath9k.h
>> +++ b/drivers/net/wireless/ath/ath9k/ath9k.h
>> @@ -309,7 +309,7 @@ struct ath_rx {
>>       u8 rxotherant;
>>       u32 *rxlink;
>>       unsigned int rxfilter;
>> -     spinlock_t rxflushlock;
>> +     spinlock_t pcu_lock;
>>       spinlock_t rxbuflock;
>>       struct list_head rxbuf;
>>       struct ath_descdma rxdma;
>> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
>> index dcd94ba..37f18ef 100644
>> --- a/drivers/net/wireless/ath/ath9k/main.c
>> +++ b/drivers/net/wireless/ath/ath9k/main.c
>> @@ -609,7 +609,7 @@ void ath9k_tasklet(unsigned long data)
>>               rxmask = (ATH9K_INT_RX | ATH9K_INT_RXEOL | ATH9K_INT_RXORN);
>>
>>       if (status & rxmask) {
>> -             spin_lock_bh(&sc->rx.rxflushlock);
>> +             spin_lock_bh(&sc->rx.pcu_lock);
> no need to disable bh as you are already in tasklet. I understand that the
> existing code itself was doing this and I thought of fixing that anyway.
> Can you plese fix this as well with this patch?

Nah, I rather this be a separate patch, which you can test and send yourself :)

>>
>>               /* Check for high priority Rx first */
>>               if ((ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) &&
>> @@ -617,7 +617,7 @@ void ath9k_tasklet(unsigned long data)
>>                       ath_rx_tasklet(sc, 0, true);
>>
>>               ath_rx_tasklet(sc, 0, false);
>> -             spin_unlock_bh(&sc->rx.rxflushlock);
>> +             spin_unlock_bh(&sc->rx.pcu_lock);
> here also. spin_lock() is sufficient.

Ditto.

Luis

2010-10-21 17:47:34

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH 0/6] ath9k: fix RX poison issue

On 10/20/2010 04:07 PM, Luis R. Rodriguez wrote:
> This series addresses the RX poison issue reported by Ben.
> I've split the patches up into as many atomic pieces to make
> it easier for review and understanding. Some are stable, the
> last few patches are for debugging.
>
> Luis R. Rodriguez (6):
> ath9k: add locking for stopping RX
> ath9k: add locking for starting the PCU on RX
> ath9k: rename rxflushlock to pcu_lock
> ath9k: lock reset and PCU start/stopping
> ath: add a ATH_DBG_WARN()
> ath9k: add a debug warning when we cannot stop RX
>
> drivers/net/wireless/ath/ath9k/ath9k.h | 2 +-
> drivers/net/wireless/ath/ath9k/main.c | 31 +++++++++++++++++++++++++++++--
> drivers/net/wireless/ath/ath9k/recv.c | 17 +++++++++--------
> drivers/net/wireless/ath/debug.h | 2 ++
> 4 files changed, 41 insertions(+), 11 deletions(-)

I tested this code when it was combined into a single patch.
There were still some issues, but I have no reason to believe
this patch set caused them. This patch set significantly
improved stability in my testing with 130+ STA interfaces.

So:

Tested-by: Ben Greear <[email protected]>


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2010-10-22 00:30:01

by Bruno Randolf

[permalink] [raw]
Subject: Re: [PATCH 0/6] ath9k: fix RX poison issue

On Fri October 22 2010 02:47:21 Ben Greear wrote:
> I tested this code when it was combined into a single patch.
> There were still some issues, but I have no reason to believe
> this patch set caused them. This patch set significantly
> improved stability in my testing with 130+ STA interfaces.

one question: do you see similar issues with ath5k? i remember one bug is
filed with a rx poision issue.

thanks,
bruno

2010-10-22 04:51:24

by Ben Greear

[permalink] [raw]
Subject: Re: [PATCH 0/6] ath9k: fix RX poison issue

On 10/21/2010 05:32 PM, Luis R. Rodriguez wrote:
> On Thu, Oct 21, 2010 at 5:30 PM, Bruno Randolf<[email protected]> wrote:
>> On Fri October 22 2010 02:47:21 Ben Greear wrote:
>>> I tested this code when it was combined into a single patch.
>>> There were still some issues, but I have no reason to believe
>>> this patch set caused them. This patch set significantly
>>> improved stability in my testing with 130+ STA interfaces.
>>
>> one question: do you see similar issues with ath5k? i remember one bug is
>> filed with a rx poision issue.
>
> Hardware is similar so the I'd do the same locking.

We did similar testing with ath5k, and did not see any obvious
problems. But, maybe we were just getting lucky.

Thanks,
Ben

>
> Luis


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2010-10-20 23:07:15

by Luis R. Rodriguez

[permalink] [raw]
Subject: [PATCH 3/6] ath9k: rename rxflushlock to pcu_lock

The real way to lock RX is to contend on the PCU
and reset, this will be fixed in the next patch but for
now just do the renames so that the next patch which changes
the locking order is crystal clear.

This is part of a series that will help resolve the bug:

https://bugzilla.kernel.org/show_bug.cgi?id=14624

For more details about this issue refer to:

http://marc.info/?l=linux-wireless&m=128629803703756&w=2

Cc: [email protected]
Cc: Ben Greear <[email protected]>
Cc: Kyungwan Nam <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/net/wireless/ath/ath9k/ath9k.h | 2 +-
drivers/net/wireless/ath/ath9k/main.c | 4 ++--
drivers/net/wireless/ath/ath9k/recv.c | 6 +++---
3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
index 0f0bc54..81fed83 100644
--- a/drivers/net/wireless/ath/ath9k/ath9k.h
+++ b/drivers/net/wireless/ath/ath9k/ath9k.h
@@ -309,7 +309,7 @@ struct ath_rx {
u8 rxotherant;
u32 *rxlink;
unsigned int rxfilter;
- spinlock_t rxflushlock;
+ spinlock_t pcu_lock;
spinlock_t rxbuflock;
struct list_head rxbuf;
struct ath_descdma rxdma;
diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index dcd94ba..37f18ef 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -609,7 +609,7 @@ void ath9k_tasklet(unsigned long data)
rxmask = (ATH9K_INT_RX | ATH9K_INT_RXEOL | ATH9K_INT_RXORN);

if (status & rxmask) {
- spin_lock_bh(&sc->rx.rxflushlock);
+ spin_lock_bh(&sc->rx.pcu_lock);

/* Check for high priority Rx first */
if ((ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) &&
@@ -617,7 +617,7 @@ void ath9k_tasklet(unsigned long data)
ath_rx_tasklet(sc, 0, true);

ath_rx_tasklet(sc, 0, false);
- spin_unlock_bh(&sc->rx.rxflushlock);
+ spin_unlock_bh(&sc->rx.pcu_lock);
}

if (status & ATH9K_INT_TX) {
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index 944fb59..b099827 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -317,7 +317,7 @@ int ath_rx_init(struct ath_softc *sc, int nbufs)
struct ath_buf *bf;
int error = 0;

- spin_lock_init(&sc->rx.rxflushlock);
+ spin_lock_init(&sc->rx.pcu_lock);
sc->sc_flags &= ~SC_OP_RXFLUSH;
spin_lock_init(&sc->rx.rxbuflock);

@@ -533,13 +533,13 @@ bool ath_stoprecv(struct ath_softc *sc)

void ath_flushrecv(struct ath_softc *sc)
{
- spin_lock_bh(&sc->rx.rxflushlock);
+ spin_lock_bh(&sc->rx.pcu_lock);
sc->sc_flags |= SC_OP_RXFLUSH;
if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA)
ath_rx_tasklet(sc, 1, true);
ath_rx_tasklet(sc, 1, false);
sc->sc_flags &= ~SC_OP_RXFLUSH;
- spin_unlock_bh(&sc->rx.rxflushlock);
+ spin_unlock_bh(&sc->rx.pcu_lock);
}

static bool ath_beacon_dtim_pending_cab(struct sk_buff *skb)
--
1.7.0.4


2010-10-20 23:07:17

by Luis R. Rodriguez

[permalink] [raw]
Subject: [PATCH 4/6] ath9k: lock reset and PCU start/stopping

Apart from locking the start and stop PCU we need
to ensure we also content starting and stopping the PCU
between hardware resets.

This is part of a series that will help resolve the bug:

https://bugzilla.kernel.org/show_bug.cgi?id=14624

For more details about this issue refer to:

http://marc.info/?l=linux-wireless&m=128629803703756&w=2

Cc: [email protected]
Cc: Ben Greear <[email protected]>
Cc: Kyungwan Nam <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/net/wireless/ath/ath9k/main.c | 27 +++++++++++++++++++++++++++
drivers/net/wireless/ath/ath9k/recv.c | 2 --
2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
index 37f18ef..dac11eb 100644
--- a/drivers/net/wireless/ath/ath9k/main.c
+++ b/drivers/net/wireless/ath/ath9k/main.c
@@ -238,6 +238,9 @@ int ath_set_channel(struct ath_softc *sc, struct ieee80211_hw *hw,
*/
ath9k_hw_disable_interrupts(ah);
ath_drain_all_txq(sc, false);
+
+ spin_lock_bh(&sc->rx.pcu_lock);
+
stopped = ath_stoprecv(sc);

/* XXX: do not flush receive queue here. We don't want
@@ -265,6 +268,7 @@ int ath_set_channel(struct ath_softc *sc, struct ieee80211_hw *hw,
"reset status %d\n",
channel->center_freq, r);
spin_unlock_bh(&sc->sc_resetlock);
+ spin_unlock_bh(&sc->rx.pcu_lock);
goto ps_restore;
}
spin_unlock_bh(&sc->sc_resetlock);
@@ -273,9 +277,12 @@ int ath_set_channel(struct ath_softc *sc, struct ieee80211_hw *hw,
ath_print(common, ATH_DBG_FATAL,
"Unable to restart recv logic\n");
r = -EIO;
+ spin_unlock_bh(&sc->rx.pcu_lock);
goto ps_restore;
}

+ spin_unlock_bh(&sc->rx.pcu_lock);
+
ath_update_txpow(sc);
ath9k_hw_set_interrupts(ah, ah->imask);

@@ -875,6 +882,7 @@ void ath_radio_enable(struct ath_softc *sc, struct ieee80211_hw *hw)
if (!ah->curchan)
ah->curchan = ath_get_curchannel(sc, sc->hw);

+ spin_lock_bh(&sc->rx.pcu_lock);
spin_lock_bh(&sc->sc_resetlock);
r = ath9k_hw_reset(ah, ah->curchan, ah->caldata, false);
if (r) {
@@ -889,8 +897,10 @@ void ath_radio_enable(struct ath_softc *sc, struct ieee80211_hw *hw)
if (ath_startrecv(sc) != 0) {
ath_print(common, ATH_DBG_FATAL,
"Unable to restart recv logic\n");
+ spin_unlock_bh(&sc->rx.pcu_lock);
return;
}
+ spin_unlock_bh(&sc->rx.pcu_lock);

if (sc->sc_flags & SC_OP_BEACONS)
ath_beacon_config(sc, NULL); /* restart beacons */
@@ -929,6 +939,9 @@ void ath_radio_disable(struct ath_softc *sc, struct ieee80211_hw *hw)
ath9k_hw_disable_interrupts(ah);

ath_drain_all_txq(sc, false); /* clear pending tx frames */
+
+ spin_lock_bh(&sc->rx.pcu_lock);
+
ath_stoprecv(sc); /* turn off frame recv */
ath_flushrecv(sc); /* flush recv queue */

@@ -946,6 +959,9 @@ void ath_radio_disable(struct ath_softc *sc, struct ieee80211_hw *hw)
spin_unlock_bh(&sc->sc_resetlock);

ath9k_hw_phy_disable(ah);
+
+ spin_unlock_bh(&sc->rx.pcu_lock);
+
ath9k_hw_configpcipowersave(ah, 1, 1);
ath9k_ps_restore(sc);
ath9k_setpower(sc, ATH9K_PM_FULL_SLEEP);
@@ -965,6 +981,9 @@ int ath_reset(struct ath_softc *sc, bool retry_tx)

ath9k_hw_disable_interrupts(ah);
ath_drain_all_txq(sc, retry_tx);
+
+ spin_lock_bh(&sc->rx.pcu_lock);
+
ath_stoprecv(sc);
ath_flushrecv(sc);

@@ -979,6 +998,8 @@ int ath_reset(struct ath_softc *sc, bool retry_tx)
ath_print(common, ATH_DBG_FATAL,
"Unable to start recv logic\n");

+ spin_unlock_bh(&sc->rx.pcu_lock);
+
/*
* We may be doing a reset in response to a request
* that changes the channel so update any state that
@@ -1141,6 +1162,7 @@ static int ath9k_start(struct ieee80211_hw *hw)
* be followed by initialization of the appropriate bits
* and then setup of the interrupt mask.
*/
+ spin_lock_bh(&sc->rx.pcu_lock);
spin_lock_bh(&sc->sc_resetlock);
r = ath9k_hw_reset(ah, init_channel, ah->caldata, false);
if (r) {
@@ -1149,6 +1171,7 @@ static int ath9k_start(struct ieee80211_hw *hw)
"(freq %u MHz)\n", r,
curchan->center_freq);
spin_unlock_bh(&sc->sc_resetlock);
+ spin_unlock_bh(&sc->rx.pcu_lock);
goto mutex_unlock;
}
spin_unlock_bh(&sc->sc_resetlock);
@@ -1170,8 +1193,10 @@ static int ath9k_start(struct ieee80211_hw *hw)
ath_print(common, ATH_DBG_FATAL,
"Unable to start recv logic\n");
r = -EIO;
+ spin_unlock_bh(&sc->rx.pcu_lock);
goto mutex_unlock;
}
+ spin_unlock_bh(&sc->rx.pcu_lock);

/* Setup our intr mask. */
ah->imask = ATH9K_INT_TX | ATH9K_INT_RXEOL |
@@ -1370,12 +1395,14 @@ static void ath9k_stop(struct ieee80211_hw *hw)
* before setting the invalid flag. */
ath9k_hw_disable_interrupts(ah);

+ spin_lock_bh(&sc->rx.pcu_lock);
if (!(sc->sc_flags & SC_OP_INVALID)) {
ath_drain_all_txq(sc, false);
ath_stoprecv(sc);
ath9k_hw_phy_disable(ah);
} else
sc->rx.rxlink = NULL;
+ spin_unlock_bh(&sc->rx.pcu_lock);

/* disable HAL and put h/w to sleep */
ath9k_hw_disable(ah);
diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index b099827..c04a940 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -533,13 +533,11 @@ bool ath_stoprecv(struct ath_softc *sc)

void ath_flushrecv(struct ath_softc *sc)
{
- spin_lock_bh(&sc->rx.pcu_lock);
sc->sc_flags |= SC_OP_RXFLUSH;
if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA)
ath_rx_tasklet(sc, 1, true);
ath_rx_tasklet(sc, 1, false);
sc->sc_flags &= ~SC_OP_RXFLUSH;
- spin_unlock_bh(&sc->rx.pcu_lock);
}

static bool ath_beacon_dtim_pending_cab(struct sk_buff *skb)
--
1.7.0.4


2010-10-20 23:07:14

by Luis R. Rodriguez

[permalink] [raw]
Subject: [PATCH 2/6] ath9k: add locking for starting the PCU on RX

There was some locking for starting some parts of
RX but not for starting the PCU. Include this otherwise
we can content against stopping the PCU.

This can potentially lead to races against different
buffers on the PCU which can lead to to the DMA RX
engine writing to buffers which are already freed.

This is part of a series that will help resolve the bug:

https://bugzilla.kernel.org/show_bug.cgi?id=14624

For more details about this issue refer to:

http://marc.info/?l=linux-wireless&m=128629803703756&w=2

Cc: [email protected]
Cc: Ben Greear <[email protected]>
Cc: Kyungwan Nam <[email protected]>
Signed-off-by: Luis R. Rodriguez <[email protected]>
---
drivers/net/wireless/ath/ath9k/recv.c | 7 ++++---
1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index 3b19bbb..944fb59 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -297,11 +297,11 @@ static void ath_edma_start_recv(struct ath_softc *sc)
ath_rx_addbuffer_edma(sc, ATH9K_RX_QUEUE_LP,
sc->rx.rx_edma[ATH9K_RX_QUEUE_LP].rx_fifo_hwsize);

- spin_unlock_bh(&sc->rx.rxbuflock);
-
ath_opmode_init(sc);

ath9k_hw_startpcureceive(sc->sc_ah, (sc->sc_flags & SC_OP_OFFCHANNEL));
+
+ spin_unlock_bh(&sc->rx.rxbuflock);
}

static void ath_edma_stop_recv(struct ath_softc *sc)
@@ -504,10 +504,11 @@ int ath_startrecv(struct ath_softc *sc)
ath9k_hw_rxena(ah);

start_recv:
- spin_unlock_bh(&sc->rx.rxbuflock);
ath_opmode_init(sc);
ath9k_hw_startpcureceive(ah, (sc->sc_flags & SC_OP_OFFCHANNEL));

+ spin_unlock_bh(&sc->rx.rxbuflock);
+
return 0;
}

--
1.7.0.4


2010-10-21 06:03:39

by Senthil Balasubramanian

[permalink] [raw]
Subject: Re: [PATCH 3/6] ath9k: rename rxflushlock to pcu_lock

On Thu, Oct 21, 2010 at 04:37:05AM +0530, Luis R. Rodriguez wrote:
> The real way to lock RX is to contend on the PCU
> and reset, this will be fixed in the next patch but for
> now just do the renames so that the next patch which changes
> the locking order is crystal clear.
>
> This is part of a series that will help resolve the bug:
>
> https://bugzilla.kernel.org/show_bug.cgi?id=14624
>
> For more details about this issue refer to:
>
> http://marc.info/?l=linux-wireless&m=128629803703756&w=2
>
> Cc: [email protected]
> Cc: Ben Greear <[email protected]>
> Cc: Kyungwan Nam <[email protected]>
> Signed-off-by: Luis R. Rodriguez <[email protected]>
> ---
> drivers/net/wireless/ath/ath9k/ath9k.h | 2 +-
> drivers/net/wireless/ath/ath9k/main.c | 4 ++--
> drivers/net/wireless/ath/ath9k/recv.c | 6 +++---
> 3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
> index 0f0bc54..81fed83 100644
> --- a/drivers/net/wireless/ath/ath9k/ath9k.h
> +++ b/drivers/net/wireless/ath/ath9k/ath9k.h
> @@ -309,7 +309,7 @@ struct ath_rx {
> u8 rxotherant;
> u32 *rxlink;
> unsigned int rxfilter;
> - spinlock_t rxflushlock;
> + spinlock_t pcu_lock;
> spinlock_t rxbuflock;
> struct list_head rxbuf;
> struct ath_descdma rxdma;
> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
> index dcd94ba..37f18ef 100644
> --- a/drivers/net/wireless/ath/ath9k/main.c
> +++ b/drivers/net/wireless/ath/ath9k/main.c
> @@ -609,7 +609,7 @@ void ath9k_tasklet(unsigned long data)
> rxmask = (ATH9K_INT_RX | ATH9K_INT_RXEOL | ATH9K_INT_RXORN);
>
> if (status & rxmask) {
> - spin_lock_bh(&sc->rx.rxflushlock);
> + spin_lock_bh(&sc->rx.pcu_lock);
no need to disable bh as you are already in tasklet. I understand that the
existing code itself was doing this and I thought of fixing that anyway.
Can you plese fix this as well with this patch?
>
> /* Check for high priority Rx first */
> if ((ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) &&
> @@ -617,7 +617,7 @@ void ath9k_tasklet(unsigned long data)
> ath_rx_tasklet(sc, 0, true);
>
> ath_rx_tasklet(sc, 0, false);
> - spin_unlock_bh(&sc->rx.rxflushlock);
> + spin_unlock_bh(&sc->rx.pcu_lock);
here also. spin_lock() is sufficient.
> }
>
> if (status & ATH9K_INT_TX) {
> diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
> index 944fb59..b099827 100644
> --- a/drivers/net/wireless/ath/ath9k/recv.c
> +++ b/drivers/net/wireless/ath/ath9k/recv.c
> @@ -317,7 +317,7 @@ int ath_rx_init(struct ath_softc *sc, int nbufs)
> struct ath_buf *bf;
> int error = 0;
>
> - spin_lock_init(&sc->rx.rxflushlock);
> + spin_lock_init(&sc->rx.pcu_lock);
> sc->sc_flags &= ~SC_OP_RXFLUSH;
> spin_lock_init(&sc->rx.rxbuflock);
>
> @@ -533,13 +533,13 @@ bool ath_stoprecv(struct ath_softc *sc)
>
> void ath_flushrecv(struct ath_softc *sc)
> {
> - spin_lock_bh(&sc->rx.rxflushlock);
> + spin_lock_bh(&sc->rx.pcu_lock);
> sc->sc_flags |= SC_OP_RXFLUSH;
> if (sc->sc_ah->caps.hw_caps & ATH9K_HW_CAP_EDMA)
> ath_rx_tasklet(sc, 1, true);
> ath_rx_tasklet(sc, 1, false);
> sc->sc_flags &= ~SC_OP_RXFLUSH;
> - spin_unlock_bh(&sc->rx.rxflushlock);
> + spin_unlock_bh(&sc->rx.pcu_lock);
> }
>
> static bool ath_beacon_dtim_pending_cab(struct sk_buff *skb)
> --
> 1.7.0.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

2010-10-21 07:05:17

by Senthil Balasubramanian

[permalink] [raw]
Subject: Re: [PATCH 3/6] ath9k: rename rxflushlock to pcu_lock

On Thu, Oct 21, 2010 at 11:43:50AM +0530, Luis R. Rodriguez wrote:
> On Wed, Oct 20, 2010 at 11:03 PM, Senthil Balasubramanian
> <[email protected]> wrote:
> > On Thu, Oct 21, 2010 at 04:37:05AM +0530, Luis R. Rodriguez wrote:
> >> The real way to lock RX is to contend on the PCU
> >> and reset, this will be fixed in the next patch but for
> >> now just do the renames so that the next patch which changes
> >> the locking order is crystal clear.
> >>
> >> This is part of a series that will help resolve the bug:
> >>
> >> https://bugzilla.kernel.org/show_bug.cgi?id=14624
> >>
> >> For more details about this issue refer to:
> >>
> >> http://marc.info/?l=linux-wireless&m=128629803703756&w=2
> >>
> >> Cc: [email protected]
> >> Cc: Ben Greear <[email protected]>
> >> Cc: Kyungwan Nam <[email protected]>
> >> Signed-off-by: Luis R. Rodriguez <[email protected]>
> >> ---
> >> ?drivers/net/wireless/ath/ath9k/ath9k.h | ? ?2 +-
> >> ?drivers/net/wireless/ath/ath9k/main.c ?| ? ?4 ++--
> >> ?drivers/net/wireless/ath/ath9k/recv.c ?| ? ?6 +++---
> >> ?3 files changed, 6 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/net/wireless/ath/ath9k/ath9k.h b/drivers/net/wireless/ath/ath9k/ath9k.h
> >> index 0f0bc54..81fed83 100644
> >> --- a/drivers/net/wireless/ath/ath9k/ath9k.h
> >> +++ b/drivers/net/wireless/ath/ath9k/ath9k.h
> >> @@ -309,7 +309,7 @@ struct ath_rx {
> >> ? ? ? u8 rxotherant;
> >> ? ? ? u32 *rxlink;
> >> ? ? ? unsigned int rxfilter;
> >> - ? ? spinlock_t rxflushlock;
> >> + ? ? spinlock_t pcu_lock;
> >> ? ? ? spinlock_t rxbuflock;
> >> ? ? ? struct list_head rxbuf;
> >> ? ? ? struct ath_descdma rxdma;
> >> diff --git a/drivers/net/wireless/ath/ath9k/main.c b/drivers/net/wireless/ath/ath9k/main.c
> >> index dcd94ba..37f18ef 100644
> >> --- a/drivers/net/wireless/ath/ath9k/main.c
> >> +++ b/drivers/net/wireless/ath/ath9k/main.c
> >> @@ -609,7 +609,7 @@ void ath9k_tasklet(unsigned long data)
> >> ? ? ? ? ? ? ? rxmask = (ATH9K_INT_RX | ATH9K_INT_RXEOL | ATH9K_INT_RXORN);
> >>
> >> ? ? ? if (status & rxmask) {
> >> - ? ? ? ? ? ? spin_lock_bh(&sc->rx.rxflushlock);
> >> + ? ? ? ? ? ? spin_lock_bh(&sc->rx.pcu_lock);
> > no need to disable bh as you are already in tasklet. I understand that the
> > existing code itself was doing this and I thought of fixing that anyway.
> > Can you plese fix this as well with this patch?
>
> Nah, I rather this be a separate patch, which you can test and send yourself :)
Since we are sending this to stable, I thought we can send a clean patch.
>
> >>
> >> ? ? ? ? ? ? ? /* Check for high priority Rx first */
> >> ? ? ? ? ? ? ? if ((ah->caps.hw_caps & ATH9K_HW_CAP_EDMA) &&
> >> @@ -617,7 +617,7 @@ void ath9k_tasklet(unsigned long data)
> >> ? ? ? ? ? ? ? ? ? ? ? ath_rx_tasklet(sc, 0, true);
> >>
> >> ? ? ? ? ? ? ? ath_rx_tasklet(sc, 0, false);
> >> - ? ? ? ? ? ? spin_unlock_bh(&sc->rx.rxflushlock);
> >> + ? ? ? ? ? ? spin_unlock_bh(&sc->rx.pcu_lock);
> > here also. spin_lock() is sufficient.
>
> Ditto.
>
> Luis