2010-03-31 19:40:30

by Johan Hovold

Subject: ath9k: receive stops working in AP-mode and 802.11n

Hi,

I'm having a problem with ath9k running in AP-mode where receive seems to stop
working during large transfers. When this happens I can, for example, see
ping requests from the AP and replies from STA in the air but the
replies seem to get lost somewhere in the AP (I see the frames being
acked). Disconnecting and reconnecting from the STA brings the
connection back up. A group-rekeying event is also sufficient to restore
the connection.

Note also that while the connection is stalled, I can still associate using a
second STA and everything works fine.

The problem is easily reproducible in my test setup with an x86-based STA using
iwlagn (5300) and a 2.6.33.1 kernel, and a powerpc-based AP running 2.6.32.10
and ath9k (AR9280) from a recent compat-wireless (e.g. 2010-03-23).

My customer Excito producing the AP has noted similar problems with STAs using
Intel 5100-chipsets under Windows. I have also been able to reproduce it (once)
with an STA running ath9k (AR5008) from 2.6.33.1 on x86. Perhaps more
importantly, I can also trigger the same behaviour when I run the latter system
in AP-mode.

The x86-AP uses hostapd 0.7.1 and the powerpc-AP a git-build from somewhere
between 0.7.0 and 0.7.1.

In my test setup the problem is triggered by running iperf as a client on the
AP and thus _sending_ a lot of data to the STA. Reversing this relation does
not seem to trigger the problem (as easily). Also, everything seems to work
fine in 802.11g (in both directions).

Some observations:

- Normally only takes a few seconds at 30-40Mbits/s to trigger, but can
sometimes take longer.
- Seems to be related to throughput:
  - takes longer to trigger when using many small writes (e.g. 1MB at a time).
  - takes longer to trigger when competing for bandwidth with a second
    connection or when the transfer is CPU-bound in the AP.
- Possibly triggers faster the second time after having disconnected and
re-associated without restarting hostapd (but this is more a feeling
I've got).
- I have seen occasional crashes in iwlagn in the STA, but can't say for sure
that it is related.

When the connection has stalled I can see TCP and ARP traffic in the air which
seems to indicate that the problem is in AP receive:

- AP (running iperf client) retransmits a TCP packet over and over, although
the STA is acking
- ping sent from AP is answered by STA, but answer seems to get lost in AP
(frame is acked)
- ping sent from STA is not answered by AP
- AP sending repeated ARP-requests even though STA is answering them
- STA sending unanswered ARP-requests

The corresponding frames sent by the STA are all acked by the AP. There are
also management frames being sent and answered (delete and add block acks).

As I mentioned above, I can associate using a second STA and establish a
TCP-connection while the first one is stalled. I can also have two connected
STAs and trigger the stall on one of them without it affecting the other.

Any ideas about what may be the problem here, or suggestions on further steps
that can be taken to identify it?

Thanks,
Johan Hovold
Lundinova AB



2010-04-30 20:38:46

by Daniel Yingqiang Ma

Subject: Re: [ath9k-devel] ath9k: corrupt frames forwarded to mac80211 as decrypted

This patch indeed improves the link quality of my AP powered by ath9k.

There seem to be plenty of these corrupted packets. Does anyone
know the reason?

Thanks,
Daniel

2010/4/20 Johan Hovold <[email protected]>:
> On Tue, Apr 20, 2010 at 01:06:57PM +0200, Johan Hovold wrote:
>> On Tue, Apr 20, 2010 at 02:40:51PM +0530, Ranga Rao Ravuri wrote:
>> > Can you please set register 0x8120 bit 28 to 1 and test this again to
>> > see that helps ?
>>
>> I'll try that.
>
> I'm afraid it does not seem to have any effect on the corrupt frame
> status.
>
> Anything else I can try?
>
> Thanks,
> Johan
>
> _______________________________________________
> ath9k-devel mailing list
> [email protected]
> https://lists.ath9k.org/mailman/listinfo/ath9k-devel
>

2010-04-20 08:28:44

by Johan Hovold

Subject: [RFC][PATCH 2/2] ath9k: use also AR_DecryptBusyErr to determine decrypt errors

Prevent non-decrypted frames with DecryptBusyErr flag set from being marked
decrypted.

Have seen such a frame with status 0x40030bc1 (due to bit error?).

Signed-off-by: Johan Hovold <[email protected]>
---
drivers/net/wireless/ath/ath9k/mac.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/mac.c b/drivers/net/wireless/ath/ath9k/mac.c
index 2afe72f..780da87 100644
--- a/drivers/net/wireless/ath/ath9k/mac.c
+++ b/drivers/net/wireless/ath/ath9k/mac.c
@@ -930,7 +930,8 @@ int ath9k_hw_rxprocdesc(struct ath_hw *ah, struct ath_desc *ds,
rs->rs_status |= ATH9K_RXERR_PHY;
phyerr = MS(ads.ds_rxstatus8, AR_PHYErrCode);
rs->rs_phyerr = phyerr;
- } else if (ads.ds_rxstatus8 & AR_DecryptCRCErr)
+ } else if (ads.ds_rxstatus8 & AR_DecryptCRCErr ||
+ ads.ds_rxstatus8 & AR_DecryptBusyErr)
rs->rs_status |= ATH9K_RXERR_DECRYPT;
else if (ads.ds_rxstatus8 & AR_MichaelErr)
rs->rs_status |= ATH9K_RXERR_MIC;
--
1.7.0.3


2010-04-20 11:35:09

by Johan Hovold

Subject: Re: [ath9k-devel] ath9k: corrupt frames forwarded to mac80211 as decrypted

On Tue, Apr 20, 2010 at 01:06:57PM +0200, Johan Hovold wrote:
> On Tue, Apr 20, 2010 at 02:40:51PM +0530, Ranga Rao Ravuri wrote:
> > Can you please set register 0x8120 bit 28 to 1 and test this again to
> > see that helps ?
>
> I'll try that.

I'm afraid it does not seem to have any effect on the corrupt frame
status.

Anything else I can try?

Thanks,
Johan


2010-04-16 10:52:43

by Johan Hovold

Subject: [RFC][PATCH 2/6] ath9k: do not mark frames with RXKEY_IX_INVALID as decrypted

Frames tagged by hardware with ATH9K_RXKEYIX_INVALID should not
incorrectly be marked decrypted (even if key index in frame is valid).

Signed-off-by: Johan Hovold <[email protected]>
---

The current code overrides the hardware flag indicating that the key index is
invalid and falsely marks this frame as decrypted.

00000000: 88 41 30 00 00 80 48 68 08 0f 00 21 6a 56 2c 36
00000010: 00 22 02 00 0b 63 20 0d 00 00 20 00 d1 20 00 20
00000020: b1 5b a7 c5 96 be cf a3 16 3b 35 a0 bb 59 ea d2
00000030: 17 10 28 b0 07 67 14 ff d7 6f 77 5c f1 01 f0 04
00000040: 8d 03 47 68 9d b2 bd b4 64 bb cd 58 e9 ff 82 d2
00000050: f3 d0 38 b1 75 a2 2f d2 d6 b7 70 ec 95 22 71 32
00000060: 54 c0 c4 6d 1f 0d 19 32 22 e9 c2 9c
rxstatus8 = 3bbc20a3

set: AR_RxFrameOK | AR_MichaelErr
cleared: AR_CRCErr | AR_DecryptCRCErr | AR_PHYErr | AR_RxKeyIdxValid


drivers/net/wireless/ath/ath9k/common.c | 8 +-------
1 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/common.c b/drivers/net/wireless/ath/ath9k/common.c
index 0cd10dc..af22e7a 100644
--- a/drivers/net/wireless/ath/ath9k/common.c
+++ b/drivers/net/wireless/ath/ath9k/common.c
@@ -256,14 +256,8 @@ void ath9k_cmn_rx_skb_postprocess(struct ath_common *common,
keyix = rx_stats->rs_keyix;

if (ieee80211_has_protected(fc) && !decrypt_error) {
- if (keyix != ATH9K_RXKEYIX_INVALID) {
+ if (keyix != ATH9K_RXKEYIX_INVALID)
rxs->flag |= RX_FLAG_DECRYPTED;
- } else if (skb->len >= hdrlen + 4) {
- keyix = skb->data[hdrlen + 3] >> 6;
-
- if (test_bit(keyix, common->keymap))
- rxs->flag |= RX_FLAG_DECRYPTED;
- }
}
if (ah->sw_mgmt_crypto &&
(rxs->flag & RX_FLAG_DECRYPTED) &&
--
1.7.0.3


2010-04-16 10:52:45

by Johan Hovold

Subject: [RFC][PATCH 4/6] ath9k: do not mark frames with RX_KEY_MISS as decrypted

Frames tagged by hardware with ATH9K_RX_KEY_MISS should not
incorrectly be marked decrypted.

Signed-off-by: Johan Hovold <[email protected]>
---

The frame below has no error flags set besides KeyMiss and would be marked
decrypted by the current code.

00000000: 88 41 30 00 00 80 48 68 08 0f 00 21 6a 56 2c 36
00000010: 00 22 02 00 0b 63 10 a4 00 00 10 00 40 0a 00 20
00000020: 81 62 6d de 46 89 96 96 97 ec cf aa e3 77 73 07
00000030: 17 e8 76 91 73 fd ea a9 29 62 e4 c3 17 46 39 0a
00000040: 52 7f 26 39 2d 4c 22 7c 0e 91 78 95 ff 1d d0 18
00000050: ef 6a af 99 42 74 70 c1 d4 8b 56 16 9b 90 f3 9a
00000060: ff 52 5d 1c 77 ee 83 34 f6 14 1c da
rxstatus8 = b9accfc3

set: AR_RxFrameOK | AR_RxKeyIdxValid | AR_KeyMiss
cleared: AR_CRCErr | AR_DecryptCRCErr | AR_PHYErr | AR_MichaelErr | AR_DecryptBusyErr


drivers/net/wireless/ath/ath9k/common.c | 3 ++-
drivers/net/wireless/ath/ath9k/mac.c | 2 ++
drivers/net/wireless/ath/ath9k/mac.h | 1 +
3 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/common.c b/drivers/net/wireless/ath/ath9k/common.c
index dd4be54..1623af1 100644
--- a/drivers/net/wireless/ath/ath9k/common.c
+++ b/drivers/net/wireless/ath/ath9k/common.c
@@ -256,7 +256,8 @@ void ath9k_cmn_rx_skb_postprocess(struct ath_common *common,
keyix = rx_stats->rs_keyix;

if (ieee80211_has_protected(fc) && !decrypt_error &&
- !(rx_stats->rs_flags & ATH9K_RX_DECRYPT_BUSY)) {
+ !(rx_stats->rs_flags & ATH9K_RX_DECRYPT_BUSY) &&
+ !(rx_stats->rs_flags & ATH9K_RX_KEY_MISS)) {
if (keyix != ATH9K_RXKEYIX_INVALID)
rxs->flag |= RX_FLAG_DECRYPTED;
}
diff --git a/drivers/net/wireless/ath/ath9k/mac.c b/drivers/net/wireless/ath/ath9k/mac.c
index 4a2060e..891a294 100644
--- a/drivers/net/wireless/ath/ath9k/mac.c
+++ b/drivers/net/wireless/ath/ath9k/mac.c
@@ -922,6 +922,8 @@ int ath9k_hw_rxprocdesc(struct ath_hw *ah, struct ath_desc *ds,
rs->rs_flags |= ATH9K_RX_DELIM_CRC_POST;
if (ads.ds_rxstatus8 & AR_DecryptBusyErr)
rs->rs_flags |= ATH9K_RX_DECRYPT_BUSY;
+ if (ads.ds_rxstatus8 & AR_KeyMiss)
+ rs->rs_flags |= ATH9K_RX_KEY_MISS;

if ((ads.ds_rxstatus8 & AR_RxFrameOK) == 0) {
if (ads.ds_rxstatus8 & AR_CRCErr)
diff --git a/drivers/net/wireless/ath/ath9k/mac.h b/drivers/net/wireless/ath/ath9k/mac.h
index 68dbd7a..ba3c98e 100644
--- a/drivers/net/wireless/ath/ath9k/mac.h
+++ b/drivers/net/wireless/ath/ath9k/mac.h
@@ -189,6 +189,7 @@ struct ath_htc_rx_status {
#define ATH9K_RX_DELIM_CRC_PRE 0x10
#define ATH9K_RX_DELIM_CRC_POST 0x20
#define ATH9K_RX_DECRYPT_BUSY 0x40
+#define ATH9K_RX_KEY_MISS 0x80

#define ATH9K_RXKEYIX_INVALID ((u8)-1)
#define ATH9K_TXKEYIX_INVALID ((u32)-1)
--
1.7.0.3


2010-04-16 10:52:44

by Johan Hovold

Subject: [RFC][PATCH 3/6] ath9k: do not mark frames with RX_DECRYPT_BUSY as decrypted

Frames tagged by hardware with ATH9K_RX_DECRYPT_BUSY should not
incorrectly be marked decrypted.

Signed-off-by: Johan Hovold <[email protected]>
---

Some corrupt frames such as the one below have DecryptBusyErr flag set even
though frame is marked ok and without DecryptCRCErr set.

00000000: 88 41 30 00 00 80 48 68 08 0f 00 21 6a 56 2c 36
00000010: 00 22 02 00 0b 63 e0 2b 00 00 e0 00 bd 42 00 20
00000020: ef 44 5c a5 45 62 c2 2d af c3 cc ef ec cb d0 83
00000030: a7 7f fd bc 7d f1 c4 5e 72 82 81 fc ff 1a 9d 85
00000040: 63 cd 36 ae a4 12 6e fb b7 6a 77 71 4a 06 e6 ae
00000050: a6 40 ad b1 76 b7 de ff 7c bd cf b1 ef 3d 93 bf
00000060: 68 a0 af c1 a2 14 84 a4 4c 9e 5e 3e
rxstatus8 = de461103

set: AR_RxFrameOK | AR_RxKeyIdxValid | AR_DecryptBusyErr | AR_KeyMiss
cleared: AR_CRCErr | AR_DecryptCRCErr | AR_PHYErr | AR_MichaelErr


drivers/net/wireless/ath/ath9k/common.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/common.c b/drivers/net/wireless/ath/ath9k/common.c
index af22e7a..dd4be54 100644
--- a/drivers/net/wireless/ath/ath9k/common.c
+++ b/drivers/net/wireless/ath/ath9k/common.c
@@ -255,7 +255,8 @@ void ath9k_cmn_rx_skb_postprocess(struct ath_common *common,

keyix = rx_stats->rs_keyix;

- if (ieee80211_has_protected(fc) && !decrypt_error) {
+ if (ieee80211_has_protected(fc) && !decrypt_error &&
+ !(rx_stats->rs_flags & ATH9K_RX_DECRYPT_BUSY)) {
if (keyix != ATH9K_RXKEYIX_INVALID)
rxs->flag |= RX_FLAG_DECRYPTED;
}
--
1.7.0.3


2010-04-20 08:35:48

by Johan Hovold

Subject: Re: [ath9k-devel] [RFC][PATCH 2/6] ath9k: do not mark frames with RXKEY_IX_INVALID as decrypted

On Fri, Apr 16, 2010 at 02:32:41PM +0300, Jouni Malinen wrote:
> On Fri, 2010-04-16 at 03:52 -0700, Johan Hovold wrote:
> > Frames tagged by hardware with ATH9K_RXKEYIX_INVALID should not
> > incorrectly be marked decrypted (even if key index in frame is valid).
>
> Have you tested this with static WEP configuration? Or broadcast RX with
> WPA? There must be a reason for that odd looking code being there in the
> first place and I can now only think of it being needed when the default
> keys are used.

You are correct. That piece of code is indeed required for WEP
(KeyIdxValid is not set and sometimes also KeyMiss is set even though
frame has been decrypted).
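
For readers following along, the fallback under discussion can be sketched in isolation. This is only an illustration of the logic that patch 2/6 removed, not driver code: `frame_key_id()` and `fallback_is_decrypted()` are hypothetical helpers, and the plain `keymap` bitmask is an assumed stand-in for `common->keymap`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: key ID from the IV field that follows the
 * 802.11 header. The key ID lives in the top two bits of the fourth
 * octet, which is what `skb->data[hdrlen + 3] >> 6` reads in the diff. */
static int frame_key_id(const uint8_t *data, size_t len, size_t hdrlen)
{
	if (len < hdrlen + 4)
		return -1;	/* too short to carry an IV */
	return data[hdrlen + 3] >> 6;
}

/* Hypothetical stand-in for the fallback: treat the frame as decrypted
 * if a key is installed for the key ID found in the frame itself.
 * `keymap` is a plain bitmask here, one bit per installed key slot. */
static int fallback_is_decrypted(const uint8_t *data, size_t len,
				 size_t hdrlen, uint32_t keymap)
{
	int keyix = frame_key_id(data, len, hdrlen);

	if (keyix < 0)
		return 0;
	return (keymap >> keyix) & 1;
}
```

With default (static WEP or group) keys the hardware does not report a valid key index, so a check of this kind is the only way the frame gets marked decrypted. The downside is that it trusts two bits of a possibly corrupt frame.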

I just posted a patch which seems to catch the corrupt frames without
messing anything else up. Perhaps you could take a quick look at it?

Thanks,
Johan


2010-04-16 10:52:39

by Johan Hovold

Subject: [RFC][PATCH 1/6] ath9k: clean up rx skb post-process logic

Refactor IEEE80211_FCTL_PROTECTED and decryption error test.

Signed-off-by: Johan Hovold <[email protected]>
---
drivers/net/wireless/ath/ath9k/common.c | 16 ++++++++--------
1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/common.c b/drivers/net/wireless/ath/ath9k/common.c
index 09effde..0cd10dc 100644
--- a/drivers/net/wireless/ath/ath9k/common.c
+++ b/drivers/net/wireless/ath/ath9k/common.c
@@ -255,15 +255,15 @@ void ath9k_cmn_rx_skb_postprocess(struct ath_common *common,

keyix = rx_stats->rs_keyix;

- if (!(keyix == ATH9K_RXKEYIX_INVALID) && !decrypt_error &&
- ieee80211_has_protected(fc)) {
- rxs->flag |= RX_FLAG_DECRYPTED;
- } else if (ieee80211_has_protected(fc)
- && !decrypt_error && skb->len >= hdrlen + 4) {
- keyix = skb->data[hdrlen + 3] >> 6;
-
- if (test_bit(keyix, common->keymap))
+ if (ieee80211_has_protected(fc) && !decrypt_error) {
+ if (keyix != ATH9K_RXKEYIX_INVALID) {
rxs->flag |= RX_FLAG_DECRYPTED;
+ } else if (skb->len >= hdrlen + 4) {
+ keyix = skb->data[hdrlen + 3] >> 6;
+
+ if (test_bit(keyix, common->keymap))
+ rxs->flag |= RX_FLAG_DECRYPTED;
+ }
}
if (ah->sw_mgmt_crypto &&
(rxs->flag & RX_FLAG_DECRYPTED) &&
--
1.7.0.3


2010-04-16 10:52:46

by Johan Hovold

Subject: [RFC][PATCH 5/6] ath9k: check error flags even if rx frame is marked ok

Check error flags even if frame is marked ok by hardware as this flag
may have been incorrectly set.

Signed-off-by: Johan Hovold <[email protected]>
---

00000000: 88 41 30 00 00 80 48 68 08 af 00 b9 49 c3 e1 3f
00000010: 5a 9d 51 71 09 63 00 50 00 00 00 00 ff 54 00 20
00000020: 36 9d 46 02 90 31 2c e8 68 06 84 6e b5 00 29 e8
00000030: ef e3 6f a0 ee 99 7c 7e d8 7d 12 aa de 5c 20 69
00000040: d6 6a ad c4 99 bb c1 e4 c3 ba bd 77 51 7f a2 a5
00000050: 01 e4 81 a0 be 40 54 45 70 e4 cc 11 58 f8 ad 45
00000060: 84 1c 72 36 a1 fd b7 33 ad aa 4f 8b
rxstatus8 = 5994daab

Here AR_RxFrameOK is set even though AR_DecryptCRCErr and AR_MichaelErr are
set.

Note that this also happens for frames with non-corrupt PNs, e.g.:

FrameOK with error: c79e7573
FrameOK with error: 264786bf
FrameOK with error: f3a2446f
FrameOK with error: 2c054c4f

drivers/net/wireless/ath/ath9k/mac.c | 22 ++++++++++------------
1 files changed, 10 insertions(+), 12 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/mac.c b/drivers/net/wireless/ath/ath9k/mac.c
index 891a294..2b4295b 100644
--- a/drivers/net/wireless/ath/ath9k/mac.c
+++ b/drivers/net/wireless/ath/ath9k/mac.c
@@ -925,18 +925,16 @@ int ath9k_hw_rxprocdesc(struct ath_hw *ah, struct ath_desc *ds,
if (ads.ds_rxstatus8 & AR_KeyMiss)
rs->rs_flags |= ATH9K_RX_KEY_MISS;

- if ((ads.ds_rxstatus8 & AR_RxFrameOK) == 0) {
- if (ads.ds_rxstatus8 & AR_CRCErr)
- rs->rs_status |= ATH9K_RXERR_CRC;
- else if (ads.ds_rxstatus8 & AR_PHYErr) {
- rs->rs_status |= ATH9K_RXERR_PHY;
- phyerr = MS(ads.ds_rxstatus8, AR_PHYErrCode);
- rs->rs_phyerr = phyerr;
- } else if (ads.ds_rxstatus8 & AR_DecryptCRCErr)
- rs->rs_status |= ATH9K_RXERR_DECRYPT;
- else if (ads.ds_rxstatus8 & AR_MichaelErr)
- rs->rs_status |= ATH9K_RXERR_MIC;
- }
+ if (ads.ds_rxstatus8 & AR_CRCErr)
+ rs->rs_status |= ATH9K_RXERR_CRC;
+ else if (ads.ds_rxstatus8 & AR_PHYErr) {
+ rs->rs_status |= ATH9K_RXERR_PHY;
+ phyerr = MS(ads.ds_rxstatus8, AR_PHYErrCode);
+ rs->rs_phyerr = phyerr;
+ } else if (ads.ds_rxstatus8 & AR_DecryptCRCErr)
+ rs->rs_status |= ATH9K_RXERR_DECRYPT;
+ else if (ads.ds_rxstatus8 & AR_MichaelErr)
+ rs->rs_status |= ATH9K_RXERR_MIC;

return 0;
}
--
1.7.0.3


2010-04-20 11:06:59

by Johan Hovold

Subject: Re: [ath9k-devel] ath9k: corrupt frames forwarded to mac80211 as decrypted

On Tue, Apr 20, 2010 at 02:40:51PM +0530, Ranga Rao Ravuri wrote:
> Can you please tell us more about your test,
> what exactly is your test is ?
> What kind of traffic are you running
> what is your AP ?.

I have a powerpc-based platform running AR9280 in AP mode. My STA is an
Intel 5300 (but I have been able to reproduce it also with an STA using
AR5008).

After connecting I run iperf in client mode on the AP, thus sending data
to the STA. After a short while the AP stops receiving any data
(including TCP-acks) and the transfer stalls.

I've verified that frames that look ok in the air can still be corrupt
in ath9k. In particular, the status field is trashed so that frames are
incorrectly forwarded to mac80211 marked as decrypted which can trigger
countermeasures or update CCMP PN incorrectly (thereby dropping later,
correct frames).

Discarding frames which appear to have trashed status makes 802.11n
work.

> Are you testing with aggregated traffic or non-aggregate legacy
> traffic ?

Aggregated.

> Are you seeing only when CCMP is enabled or in WEP/TKIP also ?

Only with 802.11n (CCMP). Everything works perfectly with 802.11g and
WEP/TKIP/CCMP and I never see any corrupt frame status.

> Can you please set register 0x8120 bit 28 to 1 and test this again to
> see that helps ?

I'll try that.

Thanks,
Johan


2010-04-20 09:05:44

by Ranga Rao Ravuri

Subject: Re: [ath9k-devel] ath9k: corrupt frames forwarded to mac80211 as decrypted

Hi

Can you please tell us more about your test?
What exactly is your test?
What kind of traffic are you running?
What is your AP?
Are you testing with aggregated traffic or non-aggregated legacy
traffic?
Are you seeing this only when CCMP is enabled, or with WEP/TKIP also?
Can you please set register 0x8120 bit 28 to 1 and test this again to
see if that helps?

thanks
Ranga

On Tue, 2010-04-20 at 13:55 +0530, Johan Hovold wrote:
> Hi again,
>
> On Fri, Apr 16, 2010 at 12:48:50PM +0200, Johan Hovold wrote:
> > I now know why 802.11n receive stalls; ath9k is passing corrupt frames to
> > mac80211.
> [...]
> > An example of such a frame is:
> >
> > 00000000: 88 41 30 00 00 80 48 68 08 0f 00 21 6a 56 2c 36
> > 00000010: 00 22 02 00 0b 63 20 52 00 00 20 21 21 05 00 20
> > 00000020: 8a 39 7b 1f 0f 11 07 9e bd 53 80 33 3b 8c 98 00
> > 00000030: ef 5f da 7c 9a d6 3d d7 59 ac e0 21 44 88 63 d7
> > 00000040: 21 34 b7 9a 89 8e cf 9e 46 1c ee d6 81 56 25 59
> > 00000050: d2 ec ac 33 e6 12 3d c5 02 61 2d 80 8d 30 44 1e
> > 00000060: 79 74 79 79 62 25 ba ec 04 4d 54 dc
> >
> > with associated status
> >
> > rxstatus8 = 1e989103
> >
> > Here nothing in the frame status indicates an error; the frame has no error
> > flags set, the frame-ok flag is set, and so on. Still the frame is indeed
> > corrupt; the last four octets of the CCMP-header (bytes 0x20..0x23) should
> > be {00,00,00,00} rather than {8a,39,7b,1f} as the correct PN is 0x0521 (not
> > 0x1f7b398a0521).
> >
> > The corrupt frames all seem to have the upper half of the CCMP-header, data
> > and MIC corrupted, whereas the FCS (last four bytes) seem to be correct in the
> > sense that they match what I see in the air (and is verified by wireshark).
> >
> > One explanation for all of this could be that the corrupt packet is what the
> > hardware is expected to return should it's processing fail (e.g. due to
> > checksum error). Then the problem is merely that the status field sometimes
> > get corrupted (some frames with corrupt PN do indeed come with matching
> > rxstatus). Comments in the code concerning corrupt status fields also point in
> > this direction.
> >
> > Another explanation could be that the status is actually correct but for some
> > reason the returned frame is corrupted. Perhaps it's a combination of both
> > corrupt status and frame.
> >
> > Any ideas about what may be going on here?
>
> I'll answer my own question -- the status field is clearly corrupted.
> The bad frames all have status such as
>
> 19fdb637
> 9131f6d7
> 87b3f0c1
> 29de5e7b
> 10c68ff3
> 4c1a33c7
> 9abd470b
>
> where it should look something like 00030943. From inspection of
> returned frame status, I've come up with at way to catch these; I
> discard frames marked ok but with errors flags set or which have any of
> the reserved bits 19 through 28 (0x1ff80000) set.
>
> My question is: Is the latter assumption correct? Can I expect bits
> 19..28 to be zero?
>
> On my AR9820 hardware this seems to be case and I am able to catch all
> the corrupt frames and have 802.11n work perfectly. It also works fine
> with 802.11g/WEP/TKIP/CCMP.
>
> As a side note, I also seem able to confirm my observation above that
> frames returned with non-corrupt status and decrypt error flag set
> indeed do have corrupt CCMP header, data and MIC. That is, this seem to
> be what hw is expected to return on (such) errors.
>
> I'm responding to this mail with my fix (workaround).
>
> I have also seen non-decrypted frames with decrypt busy flag set but
> without decrypt crc err set (frame is marked not ok). I'm not sure
> whether this is due to a bug or bit error (decrypt crc somehow got
> cleared) but I also propose a change to remedy this.
>
> Thanks,
> Johan Hovold


2010-04-20 08:38:10

by Johan Hovold

Subject: Re: [RFC][PATCH 1/2] ath9k: fix corrupt frames being forwarded to mac80211

On Tue, Apr 20, 2010 at 10:28:07AM +0200, Johan Hovold wrote:
> Verify frame status by making sure that no error flags are set if frame
> is marked ok and that reserved bits 19 through 28 (0x1ff80000) are all
> zero, otherwise discarded frame (as ATH9K_PHYERR_HT_CRC_ERROR).

I used ATH9K_PHYERR_HT_CRC_ERROR for now, but perhaps there is a better
choice?


2010-04-20 08:28:36

by Johan Hovold

Subject: [RFC][PATCH 1/2] ath9k: fix corrupt frames being forwarded to mac80211

Frame status frequently gets corrupted in hardware with 802.11n which
can lead to corrupt frames being forwarded to mac80211. This in turn may
trigger countermeasures in response to false MIC-errors or cause receive
to stall when frames are dropped after processing a frame with corrupt PN.

Verify frame status by making sure that no error flags are set if frame
is marked ok and that reserved bits 19 through 28 (0x1ff80000) are all
zero; otherwise discard the frame (as ATH9K_PHYERR_HT_CRC_ERROR).

Tested with AR9280 (WEP, TKIP and CCMP).

Signed-off-by: Johan Hovold <[email protected]>
---
drivers/net/wireless/ath/ath9k/mac.c | 13 +++++++++++++
1 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/mac.c b/drivers/net/wireless/ath/ath9k/mac.c
index 4a2060e..2afe72f 100644
--- a/drivers/net/wireless/ath/ath9k/mac.c
+++ b/drivers/net/wireless/ath/ath9k/mac.c
@@ -934,9 +934,22 @@ int ath9k_hw_rxprocdesc(struct ath_hw *ah, struct ath_desc *ds,
rs->rs_status |= ATH9K_RXERR_DECRYPT;
else if (ads.ds_rxstatus8 & AR_MichaelErr)
rs->rs_status |= ATH9K_RXERR_MIC;
+ } else {
+ if (ads.ds_rxstatus8 & (AR_CRCErr |
+ AR_PHYErr |
+ AR_DecryptCRCErr |
+ AR_MichaelErr |
+ AR_DecryptBusyErr))
+ goto corrupt;
}
+ if (ads.ds_rxstatus8 & 0x1ff80000)
+ goto corrupt;

return 0;
+corrupt:
+ rs->rs_status |= ATH9K_RXERR_PHY;
+ rs->rs_phyerr = ATH9K_PHYERR_HT_CRC_ERROR;
+ return 0;
}
EXPORT_SYMBOL(ath9k_hw_rxprocdesc);

--
1.7.0.3


2010-04-16 10:48:52

by Johan Hovold

Subject: ath9k: corrupt frames forwarded to mac80211 as decrypted (was: ath9k: receive stops working in AP-mode and 802.11n)

Hi,

I now know why 802.11n receive stalls; ath9k is passing corrupt frames to
mac80211.

The corrupt frames are marked as decrypted so the receive PN is updated to a
random number. Later non-corrupt frames with correct PNs are consequently
deemed out-of-sequence and are dropped. Connection is restored at re-keying as
this resets the queue PN.
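
The stall mechanism described above can be sketched as a simplified model. This is not mac80211's code: `struct rx_key_state`, `replay_check()` and `rekey()` are hypothetical names, and a single counter is assumed where the real stack tracks the PN per key and per queue.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-key receive state: last accepted packet number. */
struct rx_key_state {
	uint64_t last_pn;
};

/* Accept a frame only if its PN is strictly greater than the last
 * accepted one, and remember the new PN on acceptance. */
static bool replay_check(struct rx_key_state *ks, uint64_t pn)
{
	if (pn <= ks->last_pn)
		return false;	/* treated as a replay and dropped */
	ks->last_pn = pn;
	return true;
}

/* Re-keying installs a fresh key, so the counter starts over. */
static void rekey(struct rx_key_state *ks)
{
	ks->last_pn = 0;
}
```

In this model, one accepted frame carrying a bogus PN such as 0x1f7b398a0521 raises `last_pn` far above anything the STA will send for a very long time, so every later frame with a correct PN fails the check until `rekey()` is called.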

I noted that some of the corrupt frames may be caught in the driver by closer
inspection of the associated rx status. By modifying the receive processing I
am able to catch most corrupt frames. Unfortunately, there are still some that
seem impossible to identify without looking at the actual frames.

An example of such a frame is:

00000000: 88 41 30 00 00 80 48 68 08 0f 00 21 6a 56 2c 36
00000010: 00 22 02 00 0b 63 20 52 00 00 20 21 21 05 00 20
00000020: 8a 39 7b 1f 0f 11 07 9e bd 53 80 33 3b 8c 98 00
00000030: ef 5f da 7c 9a d6 3d d7 59 ac e0 21 44 88 63 d7
00000040: 21 34 b7 9a 89 8e cf 9e 46 1c ee d6 81 56 25 59
00000050: d2 ec ac 33 e6 12 3d c5 02 61 2d 80 8d 30 44 1e
00000060: 79 74 79 79 62 25 ba ec 04 4d 54 dc

with associated status

rxstatus8 = 1e989103

Here nothing in the frame status indicates an error; the frame has no error
flags set, the frame-ok flag is set, and so on. Still the frame is indeed
corrupt; the last four octets of the CCMP-header (bytes 0x20..0x23) should
be {00,00,00,00} rather than {8a,39,7b,1f} as the correct PN is 0x0521 (not
0x1f7b398a0521).
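
The PN arithmetic above can be checked with a short sketch. It assumes the standard CCMP header layout (PN0, PN1, reserved, key ID octet, PN2..PN5) and that the header occupies bytes 0x1c..0x23 of the dump; `ccmp_hdr_to_pn()` and the two arrays are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* CCMP header layout (8 octets): PN0 PN1 Rsvd KeyID PN2 PN3 PN4 PN5.
 * Assemble the 48-bit packet number, PN5 being the most significant. */
static uint64_t ccmp_hdr_to_pn(const uint8_t hdr[8])
{
	return (uint64_t)hdr[0] |
	       (uint64_t)hdr[1] << 8 |
	       (uint64_t)hdr[4] << 16 |
	       (uint64_t)hdr[5] << 24 |
	       (uint64_t)hdr[6] << 32 |
	       (uint64_t)hdr[7] << 40;
}

/* Bytes 0x1c..0x23 of the dump above, as received. */
static const uint8_t corrupt_hdr[8] = {
	0x21, 0x05, 0x00, 0x20, 0x8a, 0x39, 0x7b, 0x1f
};

/* The same header with the upper four PN octets zeroed, as expected. */
static const uint8_t expected_hdr[8] = {
	0x21, 0x05, 0x00, 0x20, 0x00, 0x00, 0x00, 0x00
};
```

`ccmp_hdr_to_pn(corrupt_hdr)` evaluates to 0x1f7b398a0521, while `ccmp_hdr_to_pn(expected_hdr)` gives 0x0521, matching the two values quoted above.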

The corrupt frames all seem to have the upper half of the CCMP-header, data
and MIC corrupted, whereas the FCS (last four bytes) seems to be correct in the
sense that it matches what I see in the air (and is verified by wireshark).

One explanation for all of this could be that the corrupt packet is what the
hardware is expected to return should its processing fail (e.g. due to
checksum error). Then the problem is merely that the status field sometimes
gets corrupted (some frames with corrupt PN do indeed come with matching
rxstatus). Comments in the code concerning corrupt status fields also point in
this direction.

Another explanation could be that the status is actually correct but for some
reason the returned frame is corrupted. Perhaps it's a combination of both
corrupt status and frame.

Any ideas about what may be going on here?

As I mentioned above I can catch most corrupt frames with the following changes
to the rx processing:

ath9k: clean up rx skb post-process logic
ath9k: do not mark frames with RXKEY_IX_INVALID as decrypted
ath9k: do not mark frames with RX_DECRYPT_BUSY as decrypted
ath9k: do not mark frames with RX_KEY_MISS as decrypted
ath9k: check error flags even if rx frame is marked ok
ath9k: clear mic error flag on encrypted frames

drivers/net/wireless/ath/ath9k/common.c | 16 ++++++++--------
drivers/net/wireless/ath/ath9k/mac.c | 26 +++++++++++++-------------
drivers/net/wireless/ath/ath9k/mac.h | 1 +
3 files changed, 22 insertions(+), 21 deletions(-)

The last change reduces the number of false MIC-errors that lead hostapd to
trigger countermeasures.

I might be violating the semantics of the error flags with some of these
changes, but it does make sense if indeed the status flags are getting
corrupted. For instance, if the FrameOK flag is erroneously set the remaining
error flags would never be checked. My change makes sure the error flags are
always checked. Of course this may also, if the error flags get set due to status
corruption, lead to occasional false negatives which would have to be resent,
but this is better than passing false positives to mac80211 which breaks
communication completely.
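
The difference between the two behaviours can be sketched as follows. The `EX_*` bit values are assumptions made for illustration (they agree with the decoded rxstatus8 examples posted with the patches), and `has_error_gated()` / `has_error_always()` are hypothetical helpers, not functions from the driver.

```c
#include <stdint.h>

/* Bit values assumed for illustration; they are consistent with the
 * decoded rxstatus8 examples posted with the patches in this thread. */
#define EX_RX_FRAME_OK		0x00000002u
#define EX_CRC_ERR		0x00000004u
#define EX_DECRYPT_CRC_ERR	0x00000008u
#define EX_PHY_ERR		0x00000010u
#define EX_MICHAEL_ERR		0x00000020u

#define EX_RX_ERR_MASK (EX_CRC_ERR | EX_DECRYPT_CRC_ERR | \
			EX_PHY_ERR | EX_MICHAEL_ERR)

/* Current behaviour: error flags are only consulted if FrameOK is clear. */
static int has_error_gated(uint32_t rxstatus8)
{
	if (rxstatus8 & EX_RX_FRAME_OK)
		return 0;
	return (rxstatus8 & EX_RX_ERR_MASK) != 0;
}

/* Proposed behaviour: error flags are consulted regardless of FrameOK. */
static int has_error_always(uint32_t rxstatus8)
{
	return (rxstatus8 & EX_RX_ERR_MASK) != 0;
}
```

For the status 0x5994daab from patch 5/6, `has_error_gated()` reports no error because the frame-ok bit is set, whereas `has_error_always()` sees the decrypt CRC and Michael error bits. A clean status such as 0x00030943 passes both.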

I'm responding to this mail with the aforementioned patches against linux-next
from 20100413.

I'm still using AR9280.

Thanks,
Johan Hovold


2010-04-20 08:25:22

by Johan Hovold

Subject: Re: [ath9k-devel] ath9k: corrupt frames forwarded to mac80211 as decrypted

Hi again,

On Fri, Apr 16, 2010 at 12:48:50PM +0200, Johan Hovold wrote:
> I now know why 802.11n receive stalls; ath9k is passing corrupt frames to
> mac80211.
[...]
> An example of such a frame is:
>
> 00000000: 88 41 30 00 00 80 48 68 08 0f 00 21 6a 56 2c 36
> 00000010: 00 22 02 00 0b 63 20 52 00 00 20 21 21 05 00 20
> 00000020: 8a 39 7b 1f 0f 11 07 9e bd 53 80 33 3b 8c 98 00
> 00000030: ef 5f da 7c 9a d6 3d d7 59 ac e0 21 44 88 63 d7
> 00000040: 21 34 b7 9a 89 8e cf 9e 46 1c ee d6 81 56 25 59
> 00000050: d2 ec ac 33 e6 12 3d c5 02 61 2d 80 8d 30 44 1e
> 00000060: 79 74 79 79 62 25 ba ec 04 4d 54 dc
>
> with associated status
>
> rxstatus8 = 1e989103
>
> Here nothing in the frame status indicates an error; the frame has no error
> flags set, the frame-ok flag is set, and so on. Still the frame is indeed
> corrupt; the last four octets of the CCMP-header (bytes 0x20..0x23) should
> be {00,00,00,00} rather than {8a,39,7b,1f} as the correct PN is 0x0521 (not
> 0x1f7b398a0521).
>
> The corrupt frames all seem to have the upper half of the CCMP-header, data
> and MIC corrupted, whereas the FCS (last four bytes) seem to be correct in the
> sense that they match what I see in the air (and is verified by wireshark).
>
> One explanation for all of this could be that the corrupt packet is what the
> hardware is expected to return should it's processing fail (e.g. due to
> checksum error). Then the problem is merely that the status field sometimes
> get corrupted (some frames with corrupt PN do indeed come with matching
> rxstatus). Comments in the code concerning corrupt status fields also point in
> this direction.
>
> Another explanation could be that the status is actually correct but for some
> reason the returned frame is corrupted. Perhaps it's a combination of both
> corrupt status and frame.
>
> Any ideas about what may be going on here?

I'll answer my own question -- the status field is clearly corrupted.
The bad frames all have status such as

19fdb637
9131f6d7
87b3f0c1
29de5e7b
10c68ff3
4c1a33c7
9abd470b

where it should look something like 00030943. From inspection of
returned frame status, I've come up with a way to catch these; I
discard frames marked ok but with error flags set or which have any of
the reserved bits 19 through 28 (0x1ff80000) set.

My question is: Is the latter assumption correct? Can I expect bits
19..28 to be zero?
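
As a quick sanity check of the reserved-bits half of this test, the sketch below applies the mask to the status words listed above. It rests on the same unconfirmed assumption that bits 19 through 28 are always zero in a valid status; `status_has_reserved_bits()`, `count_flagged()` and `bad_status` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Bits 19 through 28 of rxstatus8, assumed reserved (always zero). */
#define EX_RX_RESERVED_MASK 0x1ff80000u

/* Non-zero if any of the assumed-reserved bits is set. */
static int status_has_reserved_bits(uint32_t rxstatus8)
{
	return (rxstatus8 & EX_RX_RESERVED_MASK) != 0;
}

/* Status words reported for bad frames in this mail. */
static const uint32_t bad_status[] = {
	0x19fdb637, 0x9131f6d7, 0x87b3f0c1, 0x29de5e7b,
	0x10c68ff3, 0x4c1a33c7, 0x9abd470b,
};

/* Count how many of the given status words the check would flag. */
static size_t count_flagged(const uint32_t *status, size_t n)
{
	size_t i, flagged = 0;

	for (i = 0; i < n; i++)
		if (status_has_reserved_bits(status[i]))
			flagged++;
	return flagged;
}
```

All seven listed values are flagged by the mask alone, as is the earlier example 0x1e989103, while the expected-looking 0x00030943 is not.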

On my AR9280 hardware this seems to be the case and I am able to catch all
the corrupt frames and have 802.11n work perfectly. It also works fine
with 802.11g/WEP/TKIP/CCMP.

As a side note, I also seem able to confirm my observation above that
frames returned with non-corrupt status and decrypt error flag set
indeed do have corrupt CCMP header, data and MIC. That is, this seems to
be what hw is expected to return on (such) errors.

I'm responding to this mail with my fix (workaround).

I have also seen non-decrypted frames with decrypt busy flag set but
without decrypt crc err set (frame is marked not ok). I'm not sure
whether this is due to a bug or bit error (decrypt crc somehow got
cleared) but I also propose a change to remedy this.

Thanks,
Johan Hovold


2010-04-16 11:32:46

by Jouni Malinen

Subject: Re: [ath9k-devel] [RFC][PATCH 2/6] ath9k: do not mark frames with RXKEY_IX_INVALID as decrypted

On Fri, 2010-04-16 at 03:52 -0700, Johan Hovold wrote:
> Frames tagged by hardware with ATH9K_RXKEYIX_INVALID should not
> incorrectly be marked decrypted (even if key index in frame is valid).

Have you tested this with static WEP configuration? Or broadcast RX with
WPA? There must be a reason for that odd looking code being there in the
first place and I can now only think of it being needed when the default
keys are used.

- Jouni



2010-04-16 10:52:47

by Johan Hovold

Subject: [RFC][PATCH 6/6] ath9k: clear mic error flag on encrypted frames

It does not make sense to forward MIC-errors on non-decrypted frames.

Signed-off-by: Johan Hovold <[email protected]>
---

drivers/net/wireless/ath/ath9k/common.c | 12 ++++++++----
1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/common.c b/drivers/net/wireless/ath/ath9k/common.c
index 1623af1..6efb388 100644
--- a/drivers/net/wireless/ath/ath9k/common.c
+++ b/drivers/net/wireless/ath/ath9k/common.c
@@ -255,11 +255,15 @@ void ath9k_cmn_rx_skb_postprocess(struct ath_common *common,

keyix = rx_stats->rs_keyix;

- if (ieee80211_has_protected(fc) && !decrypt_error &&
- !(rx_stats->rs_flags & ATH9K_RX_DECRYPT_BUSY) &&
- !(rx_stats->rs_flags & ATH9K_RX_KEY_MISS)) {
- if (keyix != ATH9K_RXKEYIX_INVALID)
+ if (ieee80211_has_protected(fc)) {
+ if (!decrypt_error &&
+ !(rx_stats->rs_flags & ATH9K_RX_DECRYPT_BUSY) &&
+ !(rx_stats->rs_flags & ATH9K_RX_KEY_MISS) &&
+ keyix != ATH9K_RXKEYIX_INVALID) {
rxs->flag |= RX_FLAG_DECRYPTED;
+ }
+ if (!(rxs->flag & RX_FLAG_DECRYPTED))
+ rxs->flag &= ~RX_FLAG_MMIC_ERROR;
}
if (ah->sw_mgmt_crypto &&
(rxs->flag & RX_FLAG_DECRYPTED) &&
--
1.7.0.3