2009-01-07 13:52:21

by Jiri Slaby

[permalink] [raw]
Subject: [PATCH 1/1] ath5k: fix hw rate index condition

Dhaval Giani wrote:
> I see this on current git. Not sure how to reproduce it, has happened on
> two random occasions. At both times, I was not connected to a wireless
> network, but to wired networks.
>
> ------------[ cut here ]------------
> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> ...
> Call Trace:
> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> ...

Hmm, maybe ath5k is culprit. Could you apply the attached patch and
use the kernel till the problem appears again?

--

Make sure we print out a warning when the index is out of bounds,
i.e. even on hw_rix == AR5K_MAX_RATES.

Also change to WARN and print text with the reported hw_rix.

Signed-off-by: Jiri Slaby <[email protected]>
Cc: Nick Kossifidis <[email protected]>
Cc: Luis R. Rodriguez <[email protected]>
Cc: Bob Copeland <[email protected]>
---
drivers/net/wireless/ath5k/base.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/ath5k/base.c b/drivers/net/wireless/ath5k/base.c
index 4af2607..0e65e25 100644
--- a/drivers/net/wireless/ath5k/base.c
+++ b/drivers/net/wireless/ath5k/base.c
@@ -1088,7 +1088,8 @@ ath5k_mode_setup(struct ath5k_softc *sc)
static inline int
ath5k_hw_to_driver_rix(struct ath5k_softc *sc, int hw_rix)
{
- WARN_ON(hw_rix < 0 || hw_rix > AR5K_MAX_RATES);
+ WARN(hw_rix < 0 || hw_rix >= AR5K_MAX_RATES,
+ "hw_rix out of bounds: %x\n", hw_rix);
return sc->rate_idx[sc->curband->band][hw_rix];
}

--
1.6.0.6



2009-01-07 15:30:58

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On 01/07/2009 04:22 PM, Dhaval Giani wrote:
> I will get back to you in about a day or two.

No problem. Thanks.

2009-01-07 14:37:04

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> Dhaval Giani wrote:
>> I see this on current git. Not sure how to reproduce it, has happened on
>> two random occasions. At both times, I was not connected to a wireless
>> network, but to wired networks.
>>
>> ------------[ cut here ]------------
>> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
>> ...
>> Call Trace:
>> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
>> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
>> ...
>
> Hmm, maybe ath5k is culprit. Could you apply the attached patch and
> use the kernel till the problem appears again?

I don't think this will print anything, the rate won't be 32, it's rather
too high. Could you apply also the appended debug one?

---
net/mac80211/rx.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 7175ae8..5e17e57 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
* MCS aware. */
rate = &sband->bitrates[sband->n_bitrates - 1];
} else {
- if (WARN_ON(status->rate_idx < 0 ||
- status->rate_idx >= sband->n_bitrates))
+ if (WARN(status->rate_idx < 0 ||
+ status->rate_idx >= sband->n_bitrates,
+ "RATE=%u, BAND=%x\n", status->rate_idx,
+ sband->n_bitrates))
return;
rate = &sband->bitrates[status->rate_idx];
}
--
1.6.0.6


2009-01-07 15:22:29

by Dhaval Giani

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Wed, Jan 07, 2009 at 03:36:05PM +0100, Jiri Slaby wrote:
> On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > Dhaval Giani wrote:
> >> I see this on current git. Not sure how to reproduce it, has happened on
> >> two random occasions. At both times, I was not connected to a wireless
> >> network, but to wired networks.
> >>
> >> ------------[ cut here ]------------
> >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> >> ...
> >> Call Trace:
> >> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> >> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> >> ...
> >
> > Hmm, maybe ath5k is culprit. Could you apply the attached patch and
> > use the kernel till the problem appears again?
>
> I don't think this will print anything, the rate won't be 32, it's rather
> too high. Could you apply also the appended debug one?
>

I will apply both the patches and try it out again. As I mentioned
earlier, I am not sure how to reproduce the WARN_ON. I will get back to
you in about a day or two.

> ---
> net/mac80211/rx.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
> index 7175ae8..5e17e57 100644
> --- a/net/mac80211/rx.c
> +++ b/net/mac80211/rx.c
> @@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
> * MCS aware. */
> rate = &sband->bitrates[sband->n_bitrates - 1];
> } else {
> - if (WARN_ON(status->rate_idx < 0 ||
> - status->rate_idx >= sband->n_bitrates))
> + if (WARN(status->rate_idx < 0 ||
> + status->rate_idx >= sband->n_bitrates,
> + "RATE=%u, BAND=%x\n", status->rate_idx,
> + sband->n_bitrates))
> return;
> rate = &sband->bitrates[status->rate_idx];
> }
> --
> 1.6.0.6

--
regards,
Dhaval

2009-02-27 02:40:03

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Thu, Feb 26, 2009 at 06:27:04PM -0800, Bob Copeland wrote:
> On Fri, Feb 27, 2009 at 12:32:55AM +0100, Jiri Slaby wrote:
> > On 27.2.2009 00:28, Bob Copeland wrote:
> >> hw_to_driver_rix() returns sc->rate_idx[x][y] as an int, and that
> >> array is initialized to (u8)-1 for invalid rates. So, it can
> >> return 255 if the hardware rate index (y) is bad, then the check
> >> "rxs.rate_idx>= 0" would always be true, right? If it's not a
> >> real bug yet, it likely will be one day :)
> >
> > Ah, yes, it really is a bug(tm), care to post a fix?
>
> Actually, I remembered in the dark recesses of my moldering brain
> that someone had a lost patch for this a while ago, so I searched
> the archives. Pavel, ok to add your s-o-b?
>
> From: Pavel Roskin <[email protected]>
> Subject: [PATCH] ath5k: use signed elements for rate index table
>
> A lookup table is used to convert from hardware rate indexes back
> to driver-based rate indexes. For unknown hardware rates, we
> initialize these values to -1, but since the array elements are of
> type u8, they will be in the range 0-255. This can cause array
> overruns because subsequent sanity checks only check for negative
> values.
>
> Signed-off-by: Bob Copeland <[email protected]>
> ---
> drivers/net/wireless/ath5k/base.h | 2 +-
> 1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/net/wireless/ath5k/base.h b/drivers/net/wireless/ath5k/base.h
> index 20e0d14..8229561 100644
> --- a/drivers/net/wireless/ath5k/base.h
> +++ b/drivers/net/wireless/ath5k/base.h
> @@ -112,7 +112,7 @@ struct ath5k_softc {
> struct ieee80211_supported_band sbands[IEEE80211_NUM_BANDS];
> struct ieee80211_channel channels[ATH_CHAN_MAX];
> struct ieee80211_rate rates[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
> - u8 rate_idx[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
> + s8 rate_idx[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];

Might be worth adding a note why this is the case. Can't we simply avoid
this by checking earlier for the error or simply assigning it an actual
default _good_ hw rate value?

Luis

2009-02-26 23:28:56

by Bob Copeland

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Thu, Feb 26, 2009 at 6:19 PM, Jiri Slaby <[email protected]> wrote=
:
> On 27.2.2009 00:15, Bob Copeland wrote:
>> Speaking of, I think there's another potential oob array access at:
>>
>> if (rxs.rate_idx>=3D 0&& =A0rs.rs_rate =3D=3D
>> =A0 =A0 =A0sc->curband->bitrates[rxs.rate_idx].hw_value_short)
>> =A0 =A0 =A0 =A0 =A0rxs.flag |=3D RX_FLAG_SHORTPRE;
>>
>> because sc->rate_idx is u8 instead of s8.
>
> strcmp("sc->rate_idx", "rxs.rate_idx") !=3D 0 :)
>
> Or did I miss something?

:) Sorry, I should've been clearer.

hw_to_driver_rix() returns sc->rate_idx[x][y] as an int, and that
array is initialized to (u8)-1 for invalid rates. So, it can
return 255 if the hardware rate index (y) is bad, then the check
"rxs.rate_idx >=3D 0" would always be true, right? If it's not a
real bug yet, it likely will be one day :)

--=20
Bob Copeland %% http://www.bobcopeland.com

2009-02-27 02:27:42

by Bob Copeland

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Fri, Feb 27, 2009 at 12:32:55AM +0100, Jiri Slaby wrote:
> On 27.2.2009 00:28, Bob Copeland wrote:
>> hw_to_driver_rix() returns sc->rate_idx[x][y] as an int, and that
>> array is initialized to (u8)-1 for invalid rates. So, it can
>> return 255 if the hardware rate index (y) is bad, then the check
>> "rxs.rate_idx>= 0" would always be true, right? If it's not a
>> real bug yet, it likely will be one day :)
>
> Ah, yes, it really is a bug(tm), care to post a fix?

Actually, I remembered in the dark recesses of my moldering brain
that someone had a lost patch for this a while ago, so I searched
the archives. Pavel, ok to add your s-o-b?

From: Pavel Roskin <[email protected]>
Subject: [PATCH] ath5k: use signed elements for rate index table

A lookup table is used to convert from hardware rate indexes back
to driver-based rate indexes. For unknown hardware rates, we
initialize these values to -1, but since the array elements are of
type u8, they will be in the range 0-255. This can cause array
overruns because subsequent sanity checks only check for negative
values.

Signed-off-by: Bob Copeland <[email protected]>
---
drivers/net/wireless/ath5k/base.h | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/ath5k/base.h b/drivers/net/wireless/ath5k/base.h
index 20e0d14..8229561 100644
--- a/drivers/net/wireless/ath5k/base.h
+++ b/drivers/net/wireless/ath5k/base.h
@@ -112,7 +112,7 @@ struct ath5k_softc {
struct ieee80211_supported_band sbands[IEEE80211_NUM_BANDS];
struct ieee80211_channel channels[ATH_CHAN_MAX];
struct ieee80211_rate rates[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
- u8 rate_idx[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
+ s8 rate_idx[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
enum nl80211_iftype opmode;
struct ath5k_hw *ah; /* Atheros HW */

--
1.5.4.1


--
Bob Copeland %% http://www.bobcopeland.com


2009-02-27 03:06:49

by Bob Copeland

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Thu, Feb 26, 2009 at 06:39:12PM -0800, Luis R. Rodriguez wrote:
> Might be worth adding a note why this is the case. Can't we simply avoid
> this by checking earlier for the error or simply assigning it an actual
> default _good_ hw rate value?

I guess an alternative is to initialize to 0, that would count any rx
packets whose hw rate we don't know about as the base rate, so it would
probably bias the RC to 1mb, but this is already one of those 'should
never happen' cases. Also I can't forsee having a rate index > 127 so
changing the sign is pretty low risk.

--
Bob Copeland %% http://www.bobcopeland.com


2009-02-26 23:19:16

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On 27.2.2009 00:15, Bob Copeland wrote:
> Speaking of, I think there's another potential oob array access at:
>
> if (rxs.rate_idx>= 0&& rs.rs_rate ==
> sc->curband->bitrates[rxs.rate_idx].hw_value_short)
> rxs.flag |= RX_FLAG_SHORTPRE;
>
> because sc->rate_idx is u8 instead of s8.

strcmp("sc->rate_idx", "rxs.rate_idx") != 0 :)

Or did I miss something?

2009-02-26 23:33:00

by Jiri Slaby

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On 27.2.2009 00:28, Bob Copeland wrote:
> hw_to_driver_rix() returns sc->rate_idx[x][y] as an int, and that
> array is initialized to (u8)-1 for invalid rates. So, it can
> return 255 if the hardware rate index (y) is bad, then the check
> "rxs.rate_idx>= 0" would always be true, right? If it's not a
> real bug yet, it likely will be one day :)

Ah, yes, it really is a bug(tm), care to post a fix?

2009-02-27 03:16:15

by Luis R. Rodriguez

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Thu, Feb 26, 2009 at 07:06:08PM -0800, Bob Copeland wrote:
> On Thu, Feb 26, 2009 at 06:39:12PM -0800, Luis R. Rodriguez wrote:
> > Might be worth adding a note why this is the case. Can't we simply avoid
> > this by checking earlier for the error or simply assigning it an actual
> > default _good_ hw rate value?
>
> I guess an alternative is to initialize to 0, that would count any rx
> packets whose hw rate we don't know about as the base rate, so it would
> probably bias the RC to 1mb, but this is already one of those 'should
> never happen' cases.

Understood.

> Also I can't forsee having a rate index > 127 so
> changing the sign is pretty low risk.

Sure, it just seems a bit strange to see a signed rate index,
that's all. And if its to deal with an error I think it may
be nicer to actually use a rate that works and then warn
rather than warn and not use a valid rate at all.

Mind you I haven't checked this code in while.

Luis

2009-02-02 07:57:43

by Dhaval Giani

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

Hi Jiri,

On Wed, Jan 7, 2009 at 9:00 PM, Jiri Slaby <[email protected]> wrote:
> On 01/07/2009 04:22 PM, Dhaval Giani wrote:
>> I will get back to you in about a day or two.
>
> No problem. Thanks.
>

So I finally managed to hit this on 2.6.29-rc3. It is hard to
reproduce, so I hope so much information is enough to give you a good
guess. This time it hit while trying to connect to an open network at
the airport.

Thanks,
Dhaval


Attachments:
dmesg.jiri (43.12 kB)

2009-02-15 13:47:37

by Bob Copeland

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Feb 02, 2009 at 01:27:39PM +0530, Dhaval Giani wrote:
> So I finally managed to hit this on 2.6.29-rc3. It is hard to
> reproduce, so I hope so much information is enough to give you a good
> guess. This time it hit while trying to connect to an open network at
> the airport.

> WARNING: at net/mac80211/rx.c:2236 __ieee80211_rx+0x96/0x571 [mac80211]()
> Hardware name: 2007CS3
> RATE=255, BAND=8

band is supposed to be sc->curband? 8 is way wrong. rate could be 255
if, for some reason, the hardware rate wasn't in the rate table.

> Pid: 2634, comm: X Not tainted 2.6.29-rc3 #18
> Call Trace:
> [<c0430636>] warn_slowpath+0x76/0xad
> [<c04508d7>] ? __lock_acquire+0xb36/0xb45
> [<f7dd0205>] __ieee80211_rx+0x96/0x571 [mac80211]
> [<f7e37976>] ath5k_tasklet_rx+0x4fb/0x53d [ath5k]
> [<c06fa5c3>] ? _spin_unlock_irq+0x27/0x34
> [<c0434f91>] tasklet_action+0x85/0xf0

Interestingly I hit this just the other day -- I was debugging something
else and had a serial console hooked up, otherwise I wouldn't have
noticed at all. It happened at some point during a suspend-to-ram
sequence, which makes at least my version sound like a race condition
of some sort.




2009-03-03 13:03:11

by Bob Copeland

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Tue, Mar 03, 2009 at 06:31:05AM +0200, Nick Kossifidis wrote:
>
> According to docs the rate field is only valid if more flag is clear
> (we have the last descriptor) and only if the receive ok flag is set
> or both receive ok and phy error flags are cleared. We never do such
> checks so we might actually try to process this field when we already
> know we shouldn't...

Well, we do skip rs_more packets without getting the rate, hopefully
phy error packets too. The warning would definitely show if we have
any bugs there.

> Also the following rate codes are reserved (except XR codes that we
> already know):
[snip]
> and i don't believe i've ever seen them, so we can warn on them too
> and print something like "Reserved rate code: %x", also it would be
> nice to warn on XR rates (1,2,3,6,7) in case we want to debug this in
> the future.

Good idea, though I'm somewhat of the mind that we should let the
current patch go in for a bit and see if any of these pop up. But
that's because I'm lazy :)

--
Bob Copeland %% http://www.bobcopeland.com


2009-03-23 08:21:09

by Nick Kossifidis

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

2009/3/23 Bob Copeland <[email protected]>:
> On Tue, Mar 03, 2009 at 06:31:05AM +0200, Nick Kossifidis wrote:
>> and i don't believe i've ever seen them, so we can warn on them too
>> and print something like "Reserved rate code: %x", also it would be
>> nice to warn on XR rates (1,2,3,6,7) in case we want to debug this i=
n
>> the future.
>
> Nick, are you ok with taking this patch as-is and adding the other st=
uff
> in later? =C2=A0It does fix a warning in mac80211 for unknown rates a=
nd
> instead warns with the actual hw rate code we get.
>

Sure ;-)


--=20
GPG ID: 0xD21DB2DB
As you read this post global entropy rises. Have Fun ;-)
Nick

2009-03-23 20:00:45

by John W. Linville

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Sun, Mar 22, 2009 at 11:04:19PM -0400, Bob Copeland wrote:
> On Tue, Mar 03, 2009 at 06:31:05AM +0200, Nick Kossifidis wrote:
> > and i don't believe i've ever seen them, so we can warn on them too
> > and print something like "Reserved rate code: %x", also it would be
> > nice to warn on XR rates (1,2,3,6,7) in case we want to debug this in
> > the future.
>
> Nick, are you ok with taking this patch as-is and adding the other stuff
> in later? It does fix a warning in mac80211 for unknown rates and
> instead warns with the actual hw rate code we get.
>
> Here's a repost (rebased, if it matters):
>
> From: Bob Copeland <[email protected]>
> Date: Mon, 2 Mar 2009 21:55:18 -0500
> Subject: [PATCH] ath5k: warn and correct rate for unknown hw rate indexes

I've got this one now. Just FYI, I'll be prone to losing any
patches posted in the middle of a thread if the Subject isn't changed
appropriately. FWIW, 'git send-email' can accept an "--in-reply-to"
argument. So when replying in a thread you can say "Oh yeah, I see
the problem...patch to follow" and then send the patch in a separate
email with it's own subject so that I will see it even if I'm buried
(at least most of the time).

John
--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.

2009-03-15 21:27:21

by Stefan Lippers-Hollmann

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

Hi

On Mittwoch, 7. Januar 2009, Jiri Slaby wrote:
> On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > Dhaval Giani wrote:
> >> I see this on current git. Not sure how to reproduce it, has happened on
> >> two random occasions. At both times, I was not connected to a wireless
> >> network, but to wired networks.
> >>
> >> ------------[ cut here ]------------
> >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> >> ...
> >> Call Trace:
> >> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> >> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> >> ...
> >
> > Hmm, maybe ath5k is culprit. Could you apply the attached patch and
> > use the kernel till the problem appears again?

It seems as if this problem wouldn't be restricted to ath5k, I just
triggered something very similar on b43 and 2.6.29-rc8-git1 (i386, hard
preemption):

b43-phy0: Broadcom 4306 WLAN found (core revision 5)
wmaster0 (b43): not using net_device_ops yet
phy0: Selected rate control algorithm 'minstrel'
wlan0 (b43): not using net_device_ops yet
Broadcom 43xx driver loaded [ Features: PMLR, Firmware-ID: FW13 ]
udev: renamed network interface wlan0 to wlan1
[...]
input: b43-phy0 as /devices/virtual/input/input8
b43 ssb0:0: firmware: requesting b43/ucode5.fw
b43 ssb0:0: firmware: requesting b43/pcm5.fw
b43 ssb0:0: firmware: requesting b43/b0g0initvals5.fw
b43 ssb0:0: firmware: requesting b43/b0g0bsinitvals5.fw
b43-phy0: Loading firmware version 410.2160 (2007-05-26 15:32:10)
Registered led device: b43-phy0::tx
Registered led device: b43-phy0::rx
Registered led device: b43-phy0::radio
b43-phy0: Radio turned on by software
[...]
ADDRCONF(NETDEV_UP): wlan1: link is not ready
wlan1: authenticate with AP 00:15:f2:7e:9b:7d
wlan1: authenticated
wlan1: associate with AP 00:15:f2:7e:9b:7d
wlan1: RX AssocResp from 00:15:f2:7e:9b:7d (capab=0x411 status=0 aid=2)
wlan1: associated
ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
[...]
wlan1: no IPv6 routers present
b43-phy0 ERROR: PHY transmission error
b43-phy0 ERROR: PHY transmission error

[ lots of these, likely to be caused by minstrel being a little too
optimistic about the possible wlan rates (it was more conservative in
2.6.28 and didn't happen there); the distance between both stations is
on the upper end ]

b43-phy0 ERROR: PHY transmission error
__ratelimit: 9 callbacks suppressed
b43-phy0 ERROR: PHY transmission error
b43-phy0 ERROR: PHY transmission error
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
Hardware name: Amilo D-Series
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 0, comm: swapper Not tainted 2.6.29-rc8-sidux-686 #1
Call Trace:
[<c01319d7>] warn_slowpath+0x87/0xe0
[<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
[<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
[<c0124fc3>] __wake_up_common+0x43/0x70
[<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
[<c013692c>] tasklet_action+0x6c/0xf0
[<c0137147>] __do_softirq+0x87/0x140
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<c0137255>] do_softirq+0x55/0x60
[<c0137495>] irq_exit+0x75/0x90
[<c0106378>] do_IRQ+0x48/0x90
[<c0104527>] common_interrupt+0x27/0x2c
[<cf8372e4>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
[<c02fd3bf>] cpuidle_idle_call+0x6f/0xc0
[<c0102de6>] cpu_idle+0x66/0xa0
---[ end trace c754f566bbe5ac47 ]---
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
Hardware name: Amilo D-Series
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 0, comm: swapper Tainted: G W 2.6.29-rc8-sidux-686 #1
Call Trace:
[<c01319d7>] warn_slowpath+0x87/0xe0
[<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
[<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
[<d0055f15>] b43_led_turn_off+0x55/0x90 [b43]
[<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
[<c013692c>] tasklet_action+0x6c/0xf0
[<c0137147>] __do_softirq+0x87/0x140
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<c0137255>] do_softirq+0x55/0x60
[<c0137495>] irq_exit+0x75/0x90
[<c0106378>] do_IRQ+0x48/0x90
[<c0104527>] common_interrupt+0x27/0x2c
[<cf8372e4>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
[<c02fd3bf>] cpuidle_idle_call+0x6f/0xc0
[<c0102de6>] cpu_idle+0x66/0xa0
---[ end trace c754f566bbe5ac48 ]---
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
Hardware name: Amilo D-Series
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 1873, comm: kjournald Tainted: G W 2.6.29-rc8-sidux-686 #1
Call Trace:
[<c01319d7>] warn_slowpath+0x87/0xe0
[<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
[<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
[<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
[<c013692c>] tasklet_action+0x6c/0xf0
[<c0137147>] __do_softirq+0x87/0x140
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<c0137255>] do_softirq+0x55/0x60
[<c0137495>] irq_exit+0x75/0x90
[<c0106378>] do_IRQ+0x48/0x90
[<c01d3f44>] generic_block_bmap+0x54/0x70
[<c0104527>] common_interrupt+0x27/0x2c
[<cfbf723c>] __journal_file_buffer+0xdc/0x1d0 [jbd]
[<cfbf7397>] journal_file_buffer+0x67/0xc0 [jbd]
[<cfbfe102>] journal_write_metadata_buffer+0x1e2/0x3dc [jbd]
[<cfbf9e26>] journal_commit_transaction+0x806/0x1120 [jbd]
[<c013bcc7>] lock_timer_base+0x27/0x60
[<cfbfd82c>] kjournald+0xac/0x1f0 [jbd]
[<c01464b0>] autoremove_wake_function+0x0/0x50
[<cfbfd780>] kjournald+0x0/0x1f0 [jbd]
[<c01460e9>] kthread+0x39/0x70
[<c01460b0>] kthread+0x0/0x70
[<c0104793>] kernel_thread_helper+0x7/0x14
---[ end trace c754f566bbe5ac49 ]---
__ratelimit: 21 callbacks suppressed
b43-phy0 ERROR: PHY transmission error
[...]

Sometimes even the firmware crashes and gets reloaded continously.

wlan1 IEEE 802.11bg ESSID:"soyuz"
Mode:Managed Frequency:2.422 GHz Access Point: 00:15:F2:7E:9B:7D
Bit Rate=18 Mb/s Tx-Power=20 dBm
Retry min limit:7 RTS thr:off Fragment thr=2352 B
Encryption key:<wpa2psk> [3] Security mode:open
Power Management:off
Link Quality=53/100 Signal level:-75 dBm Noise level=-65 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0

Setting a fixed wlan rate (like 11M) seems to avoid this problem.

> I don't think this will print anything, the rate won't be 32, it's rather
> too high. Could you apply also the appended debug one?

I will apply this patch and give it some more testing tomorrow evening,
this problem is almost 100% reproducable for me at the end of my router's
range and doesn't happen in closer proximity.

> ---
> net/mac80211/rx.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
> index 7175ae8..5e17e57 100644
> --- a/net/mac80211/rx.c
> +++ b/net/mac80211/rx.c
> @@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
> * MCS aware. */
> rate = &sband->bitrates[sband->n_bitrates - 1];
> } else {
> - if (WARN_ON(status->rate_idx < 0 ||
> - status->rate_idx >= sband->n_bitrates))
> + if (WARN(status->rate_idx < 0 ||
> + status->rate_idx >= sband->n_bitrates,
> + "RATE=%u, BAND=%x\n", status->rate_idx,
> + sband->n_bitrates))
> return;
> rate = &sband->bitrates[status->rate_idx];
> }

Regards
Stefan Lippers-Hollmann

2009-03-01 14:37:01

by Bob Copeland

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Sun, Mar 1, 2009 at 12:07 AM, Pavel Roskin <[email protected]> wrote:
> On Thu, 2009-02-26 at 21:27 -0500, Bob Copeland wrote:
>
>> Actually, I remembered in the dark recesses of my moldering brain
>> that someone had a lost patch for this a while ago, so I searched
>> the archives. =A0Pavel, ok to add your s-o-b?
>
> Since my patch was dropped and the new patch was implemented without =
my
> participation, it makes no sense to put my s-o-b on the code I didn't
> write (even though I wrote something similar before).

Ok, I just wanted to be sure to maintain proper credit, the "From" shou=
ld
suffice. I did rewrite the patch but it actually had an identical diff=
=2E
=46WIW, the thread didn't give a clue why it didn't make it upstream, j=
ust
missed I guess (http://marc.info/?l=3Dlinux-wireless&m=3D12248000251962=
7&w=3D2,
ultimately that problem was fixed by correctly setting the rs_more flag=
).

Anyway, the patch, while IMO correct, will still result in mac80211
warning in ieee80211_rx with -1 just as 255 will; it just fixes the
subsequent out of bound read. If we want to tell mac80211 a real rate,
I think we should change it to s8 then hw_to_driver_rix should do
something like:

idx =3D array[x][y];
if (WARN_ON(idx < 0))
idx =3D 0;

return idx;

Then we get the warning in the driver and we also return a real rate up
the stack. I'll prep a patch for this unless there are any objections.

--=20
Bob Copeland %% http://www.bobcopeland.com

2009-03-24 03:38:37

by Bob Copeland

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 23, 2009 at 03:53:28PM -0400, John W. Linville wrote:
> I've got this one now. Just FYI, I'll be prone to losing any
> patches posted in the middle of a thread if the Subject isn't changed
> appropriately. FWIW, 'git send-email' can accept an "--in-reply-to"
> argument.

Great, thanks -- and thanks for the tip!

--
Bob Copeland %% http://www.bobcopeland.com


2009-03-03 03:46:49

by Bob Copeland

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Sun, Mar 01, 2009 at 12:21:52AM -0500, Pavel Roskin wrote:
> I would prefer that we don't hide problems.
>
> If we don't know why we cannot get a valid rate, we should use WARN_ON
> and find out why and when it happens. I'm fine with using a bogus rate
> with WARN_ON.

So here is at least stage one of this, not yet the global "unknown rate"
infrastructure, but hopefully it will allow us to track down the issue.

It makes hw_to_driver_rix a little uglier, but oh well. Thoughts?

From: Bob Copeland <[email protected]>
Date: Mon, 2 Mar 2009 21:55:18 -0500
Subject: [PATCH] ath5k: warn and correct rate for unknown hw rate indexes

ath5k sets up a mapping table from the hardware rate index to
the rate index used by mac80211; however, we have seen some
received frames with incorrect rate indexes. Such frames
normally get dropped with a warning in __ieee80211_rx(), but the
warning doesn't include enough context to track down the error.

This patch adds a warning to hw_to_driver_rix for any lookups
that result in a rate index of -1, then returns a valid rate so
the frame can be processed.

This also includes the bug fix suggested by Pavel Roskin, in which
the mapping table is made signed, so rates initialized to -1 stay
that way.

Signed-off-by: Bob Copeland <[email protected]>
---
drivers/net/wireless/ath5k/base.c | 15 ++++++++++++---
drivers/net/wireless/ath5k/base.h | 2 +-
2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath5k/base.c b/drivers/net/wireless/ath5k/base.c
index f7c424d..8d4b11c 100644
--- a/drivers/net/wireless/ath5k/base.c
+++ b/drivers/net/wireless/ath5k/base.c
@@ -1100,9 +1100,18 @@ ath5k_mode_setup(struct ath5k_softc *sc)
static inline int
ath5k_hw_to_driver_rix(struct ath5k_softc *sc, int hw_rix)
{
- WARN(hw_rix < 0 || hw_rix >= AR5K_MAX_RATES,
- "hw_rix out of bounds: %x\n", hw_rix);
- return sc->rate_idx[sc->curband->band][hw_rix];
+ int rix;
+
+ /* return base rate on errors */
+ if (WARN(hw_rix < 0 || hw_rix >= AR5K_MAX_RATES,
+ "hw_rix out of bounds: %x\n", hw_rix))
+ return 0;
+
+ rix = sc->rate_idx[sc->curband->band][hw_rix];
+ if (WARN(rix < 0, "invalid hw_rix: %x\n", hw_rix))
+ rix = 0;
+
+ return rix;
}

/***************\
diff --git a/drivers/net/wireless/ath5k/base.h b/drivers/net/wireless/ath5k/base.h
index 20e0d14..8229561 100644
--- a/drivers/net/wireless/ath5k/base.h
+++ b/drivers/net/wireless/ath5k/base.h
@@ -112,7 +112,7 @@ struct ath5k_softc {
struct ieee80211_supported_band sbands[IEEE80211_NUM_BANDS];
struct ieee80211_channel channels[ATH_CHAN_MAX];
struct ieee80211_rate rates[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
- u8 rate_idx[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
+ s8 rate_idx[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
enum nl80211_iftype opmode;
struct ath5k_hw *ah; /* Atheros HW */

--
1.6.0.6



--
Bob Copeland %% http://www.bobcopeland.com


2009-03-01 05:21:56

by Pavel Roskin

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Thu, 2009-02-26 at 22:06 -0500, Bob Copeland wrote:
> On Thu, Feb 26, 2009 at 06:39:12PM -0800, Luis R. Rodriguez wrote:
> > Might be worth adding a note why this is the case. Can't we simply avoid
> > this by checking earlier for the error or simply assigning it an actual
> > default _good_ hw rate value?
>
> I guess an alternative is to initialize to 0, that would count any rx
> packets whose hw rate we don't know about as the base rate, so it would
> probably bias the RC to 1mb, but this is already one of those 'should
> never happen' cases.

I would prefer that we don't hide problems.

If we don't know why we cannot get a valid rate, we should use WARN_ON
and find out why and when it happens. I'm fine with using a bogus rate
with WARN_ON.

If we decide that we indeed cannot find the actual rate, then WARN_ON
should be removed and the bogus rate replaced with an "unknown rate",
that is, a special value that is never translated to a valid rate and
never given to any rate control algorithm. Using a bogus rate without a
warning is wrong in my opinion.

It should be possible to represent "unknown rate" as such. That applies
to all drivers. I remember that b43 also failed to report the rate in
some cases (for the first received packet or something like that).

--
Regards,
Pavel Roskin

2009-03-23 00:46:08

by Stefan Lippers-Hollmann

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

Hi

On Sonntag, 15. M=C3=A4rz 2009, Stefan Lippers-Hollmann wrote:
> Hi
>=20
> On Mittwoch, 7. Januar 2009, Jiri Slaby wrote:
> > On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > > Dhaval Giani wrote:
> > >> I see this on current git. Not sure how to reproduce it, has hap=
pened on
> > >> two random occasions. At both times, I was not connected to a wi=
reless
> > >> network, but to wired networks.
> > >>
> > >> ------------[ cut here ]------------
> > >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> > >> ...
> > >> Call Trace:
> > >> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> > >> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> > >> ...
> > >=20
> > > Hmm, maybe ath5k is culprit. Could you apply the attached patch a=
nd
> > > use the kernel till the problem appears again?
>=20
> It seems as if this problem wouldn't be restricted to ath5k, I just=20
> triggered something very similar on b43 and 2.6.29-rc8-git1 (i386, ha=
rd=20
> preemption):
>=20
> b43-phy0: Broadcom 4306 WLAN found (core revision 5)
[...]
> wlan1: no IPv6 routers present
> b43-phy0 ERROR: PHY transmission error
> b43-phy0 ERROR: PHY transmission error
>=20
> [ lots of these, likely to be caused by minstrel being a little too=20
> optimistic about the possible wlan rates (it was more conservative =
in=20
> 2.6.28 and didn't happen there); the distance between both stations=
is=20
> on the upper end ]
>=20
> b43-phy0 ERROR: PHY transmission error
> __ratelimit: 9 callbacks suppressed
> b43-phy0 ERROR: PHY transmission error
> b43-phy0 ERROR: PHY transmission error
> ------------[ cut here ]------------
> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac8021=
1]()
> Hardware name: Amilo D-Series
> Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkil=
l_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_=
polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus =
snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_u=
art snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr=
psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonst=
atic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_=
pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generi=
c pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tuli=
p firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
> Pid: 0, comm: swapper Not tainted 2.6.29-rc8-sidux-686 #1
> Call Trace:
> [<c01319d7>] warn_slowpath+0x87/0xe0
> [<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
> [<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
> [<c0124fc3>] __wake_up_common+0x43/0x70
> [<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
> [<c011e9a5>] default_spin_lock_flags+0x5/0x10
> [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
> [<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
> [<c013692c>] tasklet_action+0x6c/0xf0
> [<c0137147>] __do_softirq+0x87/0x140
> [<c011e9a5>] default_spin_lock_flags+0x5/0x10
> [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
> [<c0137255>] do_softirq+0x55/0x60
> [<c0137495>] irq_exit+0x75/0x90
> [<c0106378>] do_IRQ+0x48/0x90
> [<c0104527>] common_interrupt+0x27/0x2c
> [<cf8372e4>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
> [<c02fd3bf>] cpuidle_idle_call+0x6f/0xc0
> [<c0102de6>] cpu_idle+0x66/0xa0
> ---[ end trace c754f566bbe5ac47 ]---
[...]
> Sometimes even the firmware crashes and gets reloaded continously.
[...]
> Setting a fixed wlan rate (like 11M) seems to avoid this problem.
>=20
> > I don't think this will print anything, the rate won't be 32, it's =
rather
> > too high. Could you apply also the appended debug one?
>=20
> I will apply this patch and give it some more testing tomorrow evenin=
g,=20
> this problem is almost 100% reproducable for me at the end of my rout=
er's
> range and doesn't happen in closer proximity.
[...]

Sorry for the late response, but I've been unexpectedly away from my=20
BCM4306 system until today.

Thanks to the following (not yet mainline) patches by Michael Buesch an=
d=20
Lorenzo Nava on top of 2.6.29-rc8-git5, these problems seem to be "fixe=
d"=20
(well, the PHY errors are basically just hidden, but as they don't=20
trigger the firmware watchdog anymore, it's much less of a problem and=20
isn't actually a user visible problem anymore).

[PATCH] b43: Mask PHY TX error interrupt, if not debugging
http://marc.info/?l=3Dlinux-wireless&m=3D123748731831778&w=3D2

[PATCH] b43: fix b43_plcp_get_bitrate_idx_ofdm return type
http://marc.info/?l=3Dlinux-wireless&m=3D123774585529189&w=3D2


Confirming the patch descriptions, Jiri Slaby's debugging patch did rev=
eal=20
a signedness problem of the return value of in=20
b43_plcp_get_bitrate_idx_ofdm(), which has been fixed by the patch abov=
e:

[ this trace happened *without* "b43: fix b43_plcp_get_bitrate_idx_ofdm=
=20
return type", and only "b43: Mask PHY TX error interrupt, if not=20
debugging" applied on top of 2.6.29-rc8-git5 ]
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2236 __ieee80211_rx+0xab/0x6b0 [mac80211]=
()
Hardware name: Amilo D-Series
RATE=3D255, BAND=3Dc
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_=
input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_po=
lldev ssb joydev snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_o=
ss snd_mixer_oss pcmcia snd_pcm snd_timer snd_page_alloc snd_mpu401_uar=
t snd_rawmidi i2c_viapro serio_raw snd_seq_device pcspkr i2c_core psmou=
se snd evdev soundcore via686a shpchp yenta_socket rsrc_nonstatic pcmci=
a_core via_agp pci_hotplug rtc_cmos parport_pc battery rtc_core rtc_lib=
parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic =
pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip =
firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 0, comm: swapper Not tainted 2.6.29-rc8-sidux-686 #1
Call Trace:
[<c0131a67>] warn_slowpath+0x87/0xe0
[<d002d377>] op32_set_current_rxslot+0x27/0x40 [b43]
[<d002dd53>] b43_dma_rx+0x193/0x420 [b43]
[<c01ae229>] add_partial+0x19/0x70
[<cfcd834f>] ieee80211_tasklet_handler+0x11f/0x130 [mac80211]
[<c03a4195>] _spin_unlock+0x5/0x20
[<cfce9c6b>] __ieee80211_rx+0xab/0x6b0 [mac80211]
[<c011ea35>] default_spin_lock_flags+0x5/0x10
[<c03a3d7e>] _spin_lock_irqsave+0x3e/0x60
[<cfcd8337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
[<c01369bc>] tasklet_action+0x6c/0xf0
[<c01371d7>] __do_softirq+0x87/0x140
[<c011ea35>] default_spin_lock_flags+0x5/0x10
[<c03a3d7e>] _spin_lock_irqsave+0x3e/0x60
[<c01372e5>] do_softirq+0x55/0x60
[<c0137525>] irq_exit+0x75/0x90
[<c0106378>] do_IRQ+0x48/0x90
[<c0104527>] common_interrupt+0x27/0x2c
[<cf8372cb>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
[<c02fcfcf>] cpuidle_idle_call+0x6f/0xc0
[<c0102de6>] cpu_idle+0x66/0xa0
---[ end trace ba8601a4d52a20d2 ]---
------------[ cut here ]------------ =09

So far (after 2.9 GB continuous kernel tarball downloads from a local=20
mirror) b43 seems to be fine again:

wlan1 IEEE 802.11bg ESSID:"gemini"
Mode:Managed Frequency:2.412 GHz Access Point: 00:21:27:FF:=
51:A8
Bit Rate=3D54 Mb/s Tx-Power=3D20 dBm
Retry min limit:7 RTS thr:off Fragment thr=3D2352 B
Power Management:off
Link Quality=3D54/100 Signal level:-82 dBm Noise level=3D-6=
9 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0

wlan1 Link encap:Ethernet HWaddr 00:0f:66:d8:67:ca
inet addr:192.168.0.70 Bcast:192.168.0.255 Mask:255.255.255=
=2E0
inet6 addr: fe80::20f:66ff:fed8:67ca/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2090104 errors:0 dropped:0 overruns:0 frame:0
TX packets:1082081 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3146865411 (2.9 GiB) TX bytes:93054386 (88.7 MiB)

=46etched 83.2MB in 1min18s (1058kB/s)
[...]
=46etched 83.2MB in 1min1s (1362kB/s)

Thank you and sorry about the late response.

Regards
Stefan Lippers-Hollmann


Post scriptum: I'm not able to trigger this trace with ath5k/ AR2425.
--=20
> > net/mac80211/rx.c | 6 ++++--
> > 1 files changed, 4 insertions(+), 2 deletions(-)
> >=20
> > diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
> > index 7175ae8..5e17e57 100644
> > --- a/net/mac80211/rx.c
> > +++ b/net/mac80211/rx.c
> > @@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw,=
struct sk_buff *skb,
> > * MCS aware. */
> > rate =3D &sband->bitrates[sband->n_bitrates - 1];
> > } else {
> > - if (WARN_ON(status->rate_idx < 0 ||
> > - status->rate_idx >=3D sband->n_bitrates))
> > + if (WARN(status->rate_idx < 0 ||
> > + status->rate_idx >=3D sband->n_bitrates,
> > + "RATE=3D%u, BAND=3D%x\n", status->rate_idx,
> > + sband->n_bitrates))
> > return;
> > rate =3D &sband->bitrates[status->rate_idx];
> > }

2009-03-03 04:31:09

by Nick Kossifidis

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

2009/3/3 Bob Copeland <[email protected]>:
> On Sun, Mar 01, 2009 at 12:21:52AM -0500, Pavel Roskin wrote:
>> I would prefer that we don't hide problems.
>>
>> If we don't know why we cannot get a valid rate, we should use WARN_=
ON
>> and find out why and when it happens. =C2=A0I'm fine with using a bo=
gus rate
>> with WARN_ON.
>
> So here is at least stage one of this, not yet the global "unknown ra=
te"
> infrastructure, but hopefully it will allow us to track down the issu=
e.
>
> It makes hw_to_driver_rix a little uglier, but oh well. =C2=A0Thought=
s?
>
> From: Bob Copeland <[email protected]>
> Date: Mon, 2 Mar 2009 21:55:18 -0500
> Subject: [PATCH] ath5k: warn and correct rate for unknown hw rate ind=
exes
>
> ath5k sets up a mapping table from the hardware rate index to
> the rate index used by mac80211; however, we have seen some
> received frames with incorrect rate indexes. =C2=A0Such frames
> normally get dropped with a warning in __ieee80211_rx(), but the
> warning doesn't include enough context to track down the error.
>
> This patch adds a warning to hw_to_driver_rix for any lookups
> that result in a rate index of -1, then returns a valid rate so
> the frame can be processed.
>
> This also includes the bug fix suggested by Pavel Roskin, in which
> the mapping table is made signed, so rates initialized to -1 stay
> that way.
>
> Signed-off-by: Bob Copeland <[email protected]>
> ---
> =C2=A0drivers/net/wireless/ath5k/base.c | =C2=A0 15 ++++++++++++---
> =C2=A0drivers/net/wireless/ath5k/base.h | =C2=A0 =C2=A02 +-
> =C2=A02 files changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/wireless/ath5k/base.c b/drivers/net/wireless=
/ath5k/base.c
> index f7c424d..8d4b11c 100644
> --- a/drivers/net/wireless/ath5k/base.c
> +++ b/drivers/net/wireless/ath5k/base.c
> @@ -1100,9 +1100,18 @@ ath5k_mode_setup(struct ath5k_softc *sc)
> =C2=A0static inline int
> =C2=A0ath5k_hw_to_driver_rix(struct ath5k_softc *sc, int hw_rix)
> =C2=A0{
> - =C2=A0 =C2=A0 =C2=A0 WARN(hw_rix < 0 || hw_rix >=3D AR5K_MAX_RATES,
> - =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 "hw_rix out of bounds: %x\n", hw_rix);
> - =C2=A0 =C2=A0 =C2=A0 return sc->rate_idx[sc->curband->band][hw_rix]=
;
> + =C2=A0 =C2=A0 =C2=A0 int rix;
> +
> + =C2=A0 =C2=A0 =C2=A0 /* return base rate on errors */
> + =C2=A0 =C2=A0 =C2=A0 if (WARN(hw_rix < 0 || hw_rix >=3D AR5K_MAX_RA=
TES,
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
=C2=A0 "hw_rix out of bounds: %x\n", hw_rix))
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 return 0;
> +
> + =C2=A0 =C2=A0 =C2=A0 rix =3D sc->rate_idx[sc->curband->band][hw_rix=
];
> + =C2=A0 =C2=A0 =C2=A0 if (WARN(rix < 0, "invalid hw_rix: %x\n", hw_r=
ix))
> + =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 rix =3D 0;
> +
> + =C2=A0 =C2=A0 =C2=A0 return rix;
> =C2=A0}
>
> =C2=A0/***************\
> diff --git a/drivers/net/wireless/ath5k/base.h b/drivers/net/wireless=
/ath5k/base.h
> index 20e0d14..8229561 100644
> --- a/drivers/net/wireless/ath5k/base.h
> +++ b/drivers/net/wireless/ath5k/base.h
> @@ -112,7 +112,7 @@ struct ath5k_softc {
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct ieee80211_supported_band sbands[IEE=
E80211_NUM_BANDS];
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct ieee80211_channel channels[ATH_CHAN=
_MAX];
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct ieee80211_rate =C2=A0 rates[IEEE802=
11_NUM_BANDS][AR5K_MAX_RATES];
> - =C2=A0 =C2=A0 =C2=A0 u8 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0rate_idx[IEEE80211_NUM_BANDS][AR5K_MAX_R=
ATES];
> + =C2=A0 =C2=A0 =C2=A0 s8 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=
=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0rate_idx[IEEE80211_NUM_BANDS][AR5K_MAX_R=
ATES];
> =C2=A0 =C2=A0 =C2=A0 =C2=A0enum nl80211_iftype =C2=A0 =C2=A0 opmode;
> =C2=A0 =C2=A0 =C2=A0 =C2=A0struct ath5k_hw =C2=A0 =C2=A0 =C2=A0 =C2=A0=
*ah; =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0/* Atheros HW */
>
> --
> 1.6.0.6
>

Another thought...

According to docs the rate field is only valid if more flag is clear
(we have the last descriptor) and only if the receive ok flag is set
or both receive ok and phy error flags are cleared. We never do such
checks so we might actually try to process this field when we already
know we shouldn't...

Also the following rate codes are reserved (except XR codes that we
already know):

0x00
0x04 -> the short preamble flag
0x05
0x10 - 0x17
0x1f

and i don't believe i've ever seen them, so we can warn on them too
and print something like "Reserved rate code: %x", also it would be
nice to warn on XR rates (1,2,3,6,7) in case we want to debug this in
the future.


--=20
GPG ID: 0xD21DB2DB
As you read this post global entropy rises. Have Fun ;-)
Nick

2009-03-01 05:07:31

by Pavel Roskin

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Thu, 2009-02-26 at 21:27 -0500, Bob Copeland wrote:

> Actually, I remembered in the dark recesses of my moldering brain
> that someone had a lost patch for this a while ago, so I searched
> the archives. Pavel, ok to add your s-o-b?

Since my patch was dropped and the new patch was implemented without my
participation, it makes no sense to put my s-o-b on the code I didn't
write (even though I wrote something similar before).

--
Regards,
Pavel Roskin

2009-02-28 23:08:14

by Jiri Slaby

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On 15.2.2009 14:47, Bob Copeland wrote:
> On Mon, Feb 02, 2009 at 01:27:39PM +0530, Dhaval Giani wrote:
>> So I finally managed to hit this on 2.6.29-rc3. It is hard to
>> reproduce, so I hope so much information is enough to give you a good
>> guess. This time it hit while trying to connect to an open network at
>> the airport.
>
>> WARNING: at net/mac80211/rx.c:2236 __ieee80211_rx+0x96/0x571 [mac80211]()
>> Hardware name: 2007CS3
>> RATE=255, BAND=8
>
> band is supposed to be sc->curband? 8 is way wrong.

If you look into the patch which outputs this (backtrace in this
thread), sband->n_bitrates is 8. I have no idea what I have been smoking
the day I wrote it, but BAND= for sure isn't the right name for that
thing. Sorry for the confusion.

> rate could be 255
> if, for some reason, the hardware rate wasn't in the rate table.

So, we have a fix for this, right? I mean the u8->s8 sc->rate_idx
conversion or alike...

2009-03-23 03:04:36

by Bob Copeland

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Tue, Mar 03, 2009 at 06:31:05AM +0200, Nick Kossifidis wrote:
> and i don't believe i've ever seen them, so we can warn on them too
> and print something like "Reserved rate code: %x", also it would be
> nice to warn on XR rates (1,2,3,6,7) in case we want to debug this in
> the future.

Nick, are you ok with taking this patch as-is and adding the other stuff
in later? It does fix a warning in mac80211 for unknown rates and
instead warns with the actual hw rate code we get.

Here's a repost (rebased, if it matters):

From: Bob Copeland <[email protected]>
Date: Mon, 2 Mar 2009 21:55:18 -0500
Subject: [PATCH] ath5k: warn and correct rate for unknown hw rate indexes

ath5k sets up a mapping table from the hardware rate index to
the rate index used by mac80211; however, we have seen some
received frames with incorrect rate indexes. Such frames
normally get dropped with a warning in __ieee80211_rx(),
but it doesn't include enough information to track down the
error.

This patch adds a warning to hw_to_driver_rix for any lookups
that result in a rate index of -1, then returns a valid rate so
the frame can be processed.

Changes-licensed-under: 3-Clause-BSD

Signed-off-by: Bob Copeland <[email protected]>
---
drivers/net/wireless/ath5k/base.c | 15 ++++++++++++---
drivers/net/wireless/ath5k/base.h | 2 +-
2 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/ath5k/base.c b/drivers/net/wireless/ath5k/base.c
index f28b86c..6580df2 100644
--- a/drivers/net/wireless/ath5k/base.c
+++ b/drivers/net/wireless/ath5k/base.c
@@ -1088,9 +1088,18 @@ ath5k_mode_setup(struct ath5k_softc *sc)
static inline int
ath5k_hw_to_driver_rix(struct ath5k_softc *sc, int hw_rix)
{
- WARN(hw_rix < 0 || hw_rix >= AR5K_MAX_RATES,
- "hw_rix out of bounds: %x\n", hw_rix);
- return sc->rate_idx[sc->curband->band][hw_rix];
+ int rix;
+
+ /* return base rate on errors */
+ if (WARN(hw_rix < 0 || hw_rix >= AR5K_MAX_RATES,
+ "hw_rix out of bounds: %x\n", hw_rix))
+ return 0;
+
+ rix = sc->rate_idx[sc->curband->band][hw_rix];
+ if (WARN(rix < 0, "invalid hw_rix: %x\n", hw_rix))
+ rix = 0;
+
+ return rix;
}

/***************\
diff --git a/drivers/net/wireless/ath5k/base.h b/drivers/net/wireless/ath5k/base.h
index 20e0d14..8229561 100644
--- a/drivers/net/wireless/ath5k/base.h
+++ b/drivers/net/wireless/ath5k/base.h
@@ -112,7 +112,7 @@ struct ath5k_softc {
struct ieee80211_supported_band sbands[IEEE80211_NUM_BANDS];
struct ieee80211_channel channels[ATH_CHAN_MAX];
struct ieee80211_rate rates[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
- u8 rate_idx[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
+ s8 rate_idx[IEEE80211_NUM_BANDS][AR5K_MAX_RATES];
enum nl80211_iftype opmode;
struct ath5k_hw *ah; /* Atheros HW */

--
1.6.0.6


--
Bob Copeland %% http://www.bobcopeland.com


2009-03-15 21:36:49

by Michael Büsch

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Sunday 15 March 2009 22:27:13 Stefan Lippers-Hollmann wrote:
> Hi
>
> On Mittwoch, 7. Januar 2009, Jiri Slaby wrote:
> > On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > > Dhaval Giani wrote:
> > >> I see this on current git. Not sure how to reproduce it, has happened on
> > >> two random occasions. At both times, I was not connected to a wireless
> > >> network, but to wired networks.
> > >>
> > >> ------------[ cut here ]------------
> > >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559

I also see this triggering frequently on b43.
I'm not quite sure why it happens.

> Sometimes even the firmware crashes and gets reloaded continously.

Nah, that's most likely a separate bug.

--
Greetings, Michael.

2009-03-23 02:32:23

by Bob Copeland

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 23, 2009 at 01:45:58AM +0100, Stefan Lippers-Hollmann wrote:
>
> Post scriptum: I'm not able to trigger this trace with ath5k/ AR2425.

Okay, well just to be clear ath5k had the same issue (I posted a patch
a couple of weeks ago - I think it got lost and I need to repost it).

But this is separate from the problem where the rate controller is
choosing a bad rate index for TX in adhoc mode, that's still an unknown,
unsolved problem.

--
Bob Copeland %% http://www.bobcopeland.com