2009-02-26 22:45:21

by Jiri Slaby

[permalink] [raw]
Subject: [PATCH 1/1] ath5k: fix hw rate index condition

Make sure we print out a warning when the index is out of bounds,
i.e. even on hw_rix == AR5K_MAX_RATES.

Also change to WARN and print text with the reported hw_rix.

Signed-off-by: Jiri Slaby <[email protected]>
Cc: Nick Kossifidis <[email protected]>
Cc: Luis R. Rodriguez <[email protected]>
Cc: Bob Copeland <[email protected]>
---
drivers/net/wireless/ath5k/base.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/ath5k/base.c b/drivers/net/wireless/ath5k/base.c
index c06d1cc..08d691d 100644
--- a/drivers/net/wireless/ath5k/base.c
+++ b/drivers/net/wireless/ath5k/base.c
@@ -1100,7 +1100,8 @@ ath5k_mode_setup(struct ath5k_softc *sc)
static inline int
ath5k_hw_to_driver_rix(struct ath5k_softc *sc, int hw_rix)
{
- WARN_ON(hw_rix < 0 || hw_rix > AR5K_MAX_RATES);
+ WARN(hw_rix < 0 || hw_rix >= AR5K_MAX_RATES,
+ "hw_rix out of bounds: %x\n", hw_rix);
return sc->rate_idx[sc->curband->band][hw_rix];
}

--
1.6.1.3



2009-02-26 23:15:37

by Bob Copeland

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Thu, 26 Feb 2009 23:44:31 +0100, Jiri Slaby wrote
> Make sure we print out a warning when the index is out of bounds,
> i.e. even on hw_rix == AR5K_MAX_RATES.
>
> Also change to WARN and print text with the reported hw_rix.

ACK.

Speaking of, I think there's another potential oob array access at:

if (rxs.rate_idx >= 0 && rs.rs_rate ==
sc->curband->bitrates[rxs.rate_idx].hw_value_short)
rxs.flag |= RX_FLAG_SHORTPRE;

because sc->rate_idx is u8 instead of s8.

--
Bob Copeland %% http://www.bobcopeland.com



2009-03-30 09:00:56

by Dhaval Giani

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Sun, Mar 01, 2009 at 12:08:07AM +0100, Jiri Slaby wrote:
> On 15.2.2009 14:47, Bob Copeland wrote:
>> On Mon, Feb 02, 2009 at 01:27:39PM +0530, Dhaval Giani wrote:
>>> So I finally managed to hit this on 2.6.29-rc3. It is hard to
>>> reproduce, so I hope so much information is enough to give you a good
>>> guess. This time it hit while trying to connect to an open network at
>>> the airport.
>>
>>> WARNING: at net/mac80211/rx.c:2236 __ieee80211_rx+0x96/0x571 [mac80211]()
>>> Hardware name: 2007CS3
>>> RATE=255, BAND=8
>>
>> band is supposed to be sc->curband? 8 is way wrong.
>
> If you look into the patch which outputs this (backtrace in this
> thread), sband->n_bitrates is 8. I have no idea what I have been smoking
> the day I wrote it, but BAND= for sure isn't the right name for that
> thing. Sorry for the confusion.
>
>> rate could be 255
>> if, for some reason, the hardware rate wasn't in the rate table.
>
> So, we have a fix for this, right? I mean the u8->s8 sc->rate_idx
> conversion or alike...

Where is the fix? Is it merged in? I still see this happen on 2.6.29

thanks,
--
regards,
Dhaval

2009-03-30 18:00:54

by Dhaval Giani

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 30, 2009 at 12:58:28PM -0400, Bob Copeland wrote:
> On Mon, Mar 30, 2009 at 4:59 AM, Dhaval Giani <[email protected]> wrote:
> > Where is the fix? Is it merged in? I still see this happen on 2.6.29
> >
> > thanks,
>
> It's in b726604706ad88d8b28bc487e45e710f58cc19ee in Linus' tree, after
> 2.6.29. You still might get a warning, but this time from the driver
> side instead of higher up the stack -- if you do please post it.
>

ok, so my kernel does hve this patch applied, and this is what I get,

------------[ cut here ]------------
WARNING: at include/net/mac80211.h:1956 minstrel_get_rate+0xa1/0x4b9 [mac80211]()
Hardware name: 2007CS3
Modules linked in: fuse radeon drm ipt_MASQUERADE iptable_nat nf_nat bridge stp bnep sco l2cap bluetooth ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath kvm_intel kvm uinput snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy arc4 snd_seq_oss snd_seq_midi_event snd_seq ecb ath5k nsc_ircc snd_seq_device snd_pcm_oss video snd_mixer_oss snd_pcm mac80211 snd_timer snd yenta_socket i2c_i801 thinkpad_acpi rfkill irda output iTCO_wdt rsrc_nonstatic pcspkr hwmon cfg80211 joydev i2c_core iTCO_vendor_support soundcore crc_ccitt snd_page_alloc [last unloaded: scsi_wait_scan]
Pid: 2389, comm: wpa_supplicant Tainted: G W 2.6.29-tip #28
Call Trace:
[<c0431b0e>] warn_slowpath+0x76/0xad
[<c04523e1>] ? print_lock_contention_bug+0x14/0xd7
[<c042e874>] ? default_wake_function+0x10/0x12
[<c04523e1>] ? print_lock_contention_bug+0x14/0xd7
[<f7d31879>] minstrel_get_rate+0xa1/0x4b9 [mac80211]
[<c0450fa4>] ? trace_hardirqs_on+0xb/0xd
[<c0424909>] ? __wake_up+0x36/0x40
[<f7d272fe>] ? invoke_tx_handlers+0x3b1/0xa50 [mac80211]
[<f7d21b1e>] rate_control_get_rate+0x7e/0xbe [mac80211]
[<f7d27330>] invoke_tx_handlers+0x3e3/0xa50 [mac80211]
[<c0450e61>] ? trace_hardirqs_on_caller+0x18/0x150
[<f7d26c03>] ? __ieee80211_tx_prepare+0x24b/0x288 [mac80211]
[<f7d286ad>] ieee80211_master_start_xmit+0x38b/0x4b2 [mac80211]
[<c069d1f4>] dev_hard_start_xmit+0x219/0x280
[<c06ac17e>] __qdisc_run+0xca/0x1b0
[<c069d6de>] dev_queue_xmit+0x398/0x4bf
[<f7d2a116>] ieee80211_tx_skb+0x53/0x56 [mac80211]
[<f7d1dac4>] ieee80211_send_deauth_disassoc+0xd7/0xdf [mac80211]
[<f7d1dbc1>] ieee80211_set_disassoc+0xf5/0x209 [mac80211]
[<f7d1ddc6>] ieee80211_sta_req_auth+0x47/0x69 [mac80211]
[<f7d17c5a>] ieee80211_ioctl_siwgenie+0x50/0x5d [mac80211]
[<c06f9720>] ioctl_standard_call+0x1b4/0x268
[<c069b3ce>] ? dev_name_hash+0x1b/0x47
[<c06f92e7>] wext_handle_ioctl+0xe7/0x17d
[<f7d17c0a>] ? ieee80211_ioctl_siwgenie+0x0/0x5d [mac80211]
[<c04937ba>] ? might_fault+0x83/0x85
[<c069f06f>] dev_ioctl+0x5c6/0x5e6
[<c0690bf3>] ? sockfd_lookup_light+0x1b/0x4e
[<c0691b65>] ? sys_sendto+0xa9/0xc8
[<c04cf997>] ? dnotify_parent+0x22/0x63
[<c0690746>] ? sock_ioctl+0x0/0x1f0
[<c069092a>] sock_ioctl+0x1e4/0x1f0
[<c0690746>] ? sock_ioctl+0x0/0x1f0
[<c04b6d55>] vfs_ioctl+0x27/0x6e
[<c04b72d4>] do_vfs_ioctl+0x46f/0x4a8
[<c0691ba1>] ? sys_send+0x1d/0x1f
[<c04b7352>] sys_ioctl+0x45/0x5f
[<c04032a4>] sysenter_do_call+0x12/0x38
---[ end trace 0e3d1a2e9037b74b ]---


> --
> Bob Copeland %% http://www.bobcopeland.com

--
regards,
Dhaval

2009-03-30 18:13:38

by Bob Copeland

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 30, 2009 at 1:59 PM, Dhaval Giani <[email protected]=
m> wrote:
> ok, so my kernel does hve this patch applied, and this is what I get,
>
> =A0------------[ cut here ]------------
> =A0WARNING: at include/net/mac80211.h:1956 minstrel_get_rate+0xa1/0x4=
b9 [mac80211]()

I believe this is something different (tx path not rx). I think it's
that minstrel rate table bug again, which we never solved for ath5k.

Are you using adhoc or managed mode? Do you have the slab/slub debuggi=
ng
options turned on? Any steps that consistently reproduce it? Do you
get any warnings with PID controller?

--=20
Bob Copeland %% http://www.bobcopeland.com

2009-03-31 12:24:13

by Bob Copeland

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Tue, Mar 31, 2009 at 09:21:40AM +0530, Dhaval Giani wrote:
> Am not sure what the PID controller is, and google gave me a number of
> results, which did not make too much sense in the context.

CONFIG_MAC80211_RC_PID -- unfortunately I recall having to jump through
a few config hoops to enable it.

> One way I have found of reproducing it is to connect to open networks,
> but it does not happen always. At home, when my network is set to open,
> I do not see this issue, whereas at the airport, kaboom.

Ok - that is a useful data point. Perhaps something to do with the rates
the peer supports; it would help if you could grab a scan next time you
are in the area. Turn off auto-connect to open networks, then do:

# iw dev wlan0 scan trigger
# iw dev wlan0 scan dump >> dump.out # do this a few times

Then if a particular peer triggers the problem, we can look at the
advertised rates to see if anything jumps out.

--
Bob Copeland %% http://www.bobcopeland.com


2009-03-30 16:58:31

by Bob Copeland

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 30, 2009 at 4:59 AM, Dhaval Giani <[email protected]> wrote:
> Where is the fix? Is it merged in? I still see this happen on 2.6.29
>
> thanks,

It's in b726604706ad88d8b28bc487e45e710f58cc19ee in Linus' tree, after
2.6.29. You still might get a warning, but this time from the driver
side instead of higher up the stack -- if you do please post it.

--
Bob Copeland %% http://www.bobcopeland.com

2009-03-31 03:52:45

by Dhaval Giani

[permalink] [raw]
Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 30, 2009 at 02:13:35PM -0400, Bob Copeland wrote:
> On Mon, Mar 30, 2009 at 1:59 PM, Dhaval Giani <[email protected].=
com> wrote:
> > ok, so my kernel does hve this patch applied, and this is what I ge=
t,
> >
> > =A0------------[ cut here ]------------
> > =A0WARNING: at include/net/mac80211.h:1956 minstrel_get_rate+0xa1/0=
x4b9 [mac80211]()
>=20
> I believe this is something different (tx path not rx). I think it's
> that minstrel rate table bug again, which we never solved for ath5k.
>=20
> Are you using adhoc or managed mode? Do you have the slab/slub debug=
ging
> options turned on? Any steps that consistently reproduce it? Do you
> get any warnings with PID controller?
>=20

[dhaval@gondor ~]$ iwconfig wlan0
wlan0 IEEE 802.11abg ESSID:"linksys_SES_62338"
Mode:Managed Frequency:2.462 GHz Access Point: 00:1A:70:D6:=
2D:06
Bit Rate=3D36 Mb/s Tx-Power=3D23 dBm
Retry min limit:7 RTS thr:off Fragment thr=3D2352 B
Power Management:off
Link Quality=3D100/100 Signal level:-49 dBm Noise level=3D-=
96 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0

[dhaval@gondor ~]$

[dhaval@gondor linux-2.6]$ grep -i slub .config
CONFIG_SLUB_DEBUG=3Dy
CONFIG_SLUB=3Dy
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
[dhaval@gondor linux-2.6]$

Am not sure what the PID controller is, and google gave me a number of
results, which did not make too much sense in the context.

Yes, I think I know how to reproduce it, but I am not sure what is the
real cause.

One way I have found of reproducing it is to connect to open networks,
but it does not happen always. At home, when my network is set to open,
I do not see this issue, whereas at the airport, kaboom.

I've also seen it on LEAP networks, but there were also a few open
networks around. This warning is generally accompanied by a disconnect
from the LEAP connected network, and then the system reconnects. Let me
know if you have patches, I can give them a run and report back.

Thanks,
--=20
regards,
Dhaval

2009-04-08 15:22:36

by Bob Copeland

[permalink] [raw]
Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Tue, Mar 31, 2009 at 8:23 AM, Bob Copeland <[email protected]> wrot=
e:
> Ok - that is a useful data point. =A0Perhaps something to do with the=
rates
> the peer supports; it would help if you could grab a scan next time y=
ou
> are in the area. =A0Turn off auto-connect to open networks, then do:

Hi Dhaval,

Would you mind trying this patch and report the warnings it triggers?

http://marc.info/?l=3Dlinux-kernel&m=3D123915183521347&q=3Draw

--=20
Bob Copeland %% http://www.bobcopeland.com