2009-01-06 17:08:10

by Dhaval Giani

Subject: WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559 [mac80211]()

Hi,

I see this on current git. I'm not sure how to reproduce it; it has
happened on two random occasions. Both times, I was connected to a wired
network, not a wireless one.

------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559 [mac80211]()
Hardware name: 2007CS3
Modules linked in: tun fuse radeon drm ipt_MASQUERADE iptable_nat nf_nat
bridge stp bnep sco l2cap bluetooth ip6t_REJECT nf_conntrack_ipv6
ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq
dm_multipath kvm_intel kvm uinput snd_hda_codec_analog snd_hda_intel
snd_hda_codec snd_hwdep snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq
arc4 snd_seq_device ecb snd_pcm_oss ath5k mac80211 snd_mixer_oss
thinkpad_acpi snd_pcm rfkill hwmon snd_timer i2c_i801 snd iTCO_wdt
yenta_socket pcspkr joydev nsc_ircc i2c_core iTCO_vendor_support
rsrc_nonstatic cfg80211 soundcore irda snd_page_alloc video output crc_ccitt
[last unloaded: scsi_wait_scan]
Pid: 0, comm: swapper Not tainted 2.6.28 #5
Call Trace:
[<c043180f>] warn_slowpath+0x76/0xad
[<c04512d8>] ? __lock_acquire+0xb3b/0xb4a
[<c044f3bd>] ? trace_hardirqs_off+0xb/0xd
[<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
[<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
[<c06f6edb>] ? _spin_unlock_irq+0x27/0x34
[<c04360fc>] tasklet_action+0x84/0xef
[<c0436802>] __do_softirq+0x9d/0x157
[<c0436765>] ? __do_softirq+0x0/0x157
<IRQ> [<c0470e78>] ? handle_fasteoi_irq+0x0/0xb6
[<c0436493>] ? irq_exit+0x49/0x85
[<c04055df>] ? do_IRQ+0xf5/0x10b
[<c04042ac>] ? common_interrupt+0x2c/0x34
[<c045017e>] ? trace_hardirqs_on+0xb/0xd
[<c045007b>] ? trace_hardirqs_on_caller+0x5c/0x154
[<c058cf89>] ? acpi_idle_enter_bm+0x281/0x2d0
[<c0665273>] ? cpuidle_idle_call+0x65/0x96
[<c0402c96>] ? cpu_idle+0x84/0xae
[<c06e2e1b>] ? rest_init+0x53/0x55
---[ end trace e5a3692e59279535 ]---

Thanks,
--
regards,
Dhaval


2009-01-07 13:52:32

by Jiri Slaby

Subject: [PATCH 1/1] ath5k: fix hw rate index condition

Dhaval Giani wrote:
> I see this on current git. Not sure how to reproduce it, has happened on
> two random occasions. At both times, I was not connected to a wireless
> network, but to wired networks.
>
> ------------[ cut here ]------------
> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> ...
> Call Trace:
> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> ...

Hmm, maybe ath5k is culprit. Could you apply the attached patch and
use the kernel till the problem appears again?

--

Make sure we print out a warning when the index is out of bounds,
i.e. even on hw_rix == AR5K_MAX_RATES.

Also change to WARN and print text with the reported hw_rix.

Signed-off-by: Jiri Slaby <[email protected]>
Cc: Nick Kossifidis <[email protected]>
Cc: Luis R. Rodriguez <[email protected]>
Cc: Bob Copeland <[email protected]>
---
drivers/net/wireless/ath5k/base.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/net/wireless/ath5k/base.c b/drivers/net/wireless/ath5k/base.c
index 4af2607..0e65e25 100644
--- a/drivers/net/wireless/ath5k/base.c
+++ b/drivers/net/wireless/ath5k/base.c
@@ -1088,7 +1088,8 @@ ath5k_mode_setup(struct ath5k_softc *sc)
static inline int
ath5k_hw_to_driver_rix(struct ath5k_softc *sc, int hw_rix)
{
- WARN_ON(hw_rix < 0 || hw_rix > AR5K_MAX_RATES);
+ WARN(hw_rix < 0 || hw_rix >= AR5K_MAX_RATES,
+ "hw_rix out of bounds: %x\n", hw_rix);
return sc->rate_idx[sc->curband->band][hw_rix];
}

--
1.6.0.6
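
To illustrate the off-by-one that the patch above closes, here is a minimal
userspace sketch, not the driver code. MAX_RATES and both helper names are
hypothetical stand-ins for AR5K_MAX_RATES and the check in
ath5k_hw_to_driver_rix(); the value 32 is only inferred from the follow-up
message. The point is that an index equal to the table size passes a ">"
test even though it is already one past the last valid element.

```c
#include <stdbool.h>

/* Hypothetical stand-in for AR5K_MAX_RATES. A table with this many
 * entries has valid indices 0 .. MAX_RATES - 1. */
#define MAX_RATES 32

/* Old condition: only flags indices strictly greater than the table
 * size, so idx == MAX_RATES is not reported. */
static bool old_check_flags(int idx)
{
	return idx < 0 || idx > MAX_RATES;
}

/* New condition: flags every index outside 0 .. MAX_RATES - 1. */
static bool new_check_flags(int idx)
{
	return idx < 0 || idx >= MAX_RATES;
}
```

With these definitions, old_check_flags(32) is false while
new_check_flags(32) is true; both agree on every other index.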

2009-01-07 14:37:25

by Jiri Slaby

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> Dhaval Giani wrote:
>> I see this on current git. Not sure how to reproduce it, has happened on
>> two random occasions. At both times, I was not connected to a wireless
>> network, but to wired networks.
>>
>> ------------[ cut here ]------------
>> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
>> ...
>> Call Trace:
>> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
>> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
>> ...
>
> Hmm, maybe ath5k is culprit. Could you apply the attached patch and
> use the kernel till the problem appears again?

I don't think this will print anything: the rate index won't be exactly 32,
it is more likely far too high. Could you also apply the appended debug patch?

---
net/mac80211/rx.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 7175ae8..5e17e57 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
* MCS aware. */
rate = &sband->bitrates[sband->n_bitrates - 1];
} else {
- if (WARN_ON(status->rate_idx < 0 ||
- status->rate_idx >= sband->n_bitrates))
+ if (WARN(status->rate_idx < 0 ||
+ status->rate_idx >= sband->n_bitrates,
+ "RATE=%u, BAND=%x\n", status->rate_idx,
+ sband->n_bitrates))
return;
rate = &sband->bitrates[status->rate_idx];
}
--
1.6.0.6

2009-01-07 15:22:44

by Dhaval Giani

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Wed, Jan 07, 2009 at 03:36:05PM +0100, Jiri Slaby wrote:
> On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > Dhaval Giani wrote:
> >> I see this on current git. Not sure how to reproduce it, has happened on
> >> two random occasions. At both times, I was not connected to a wireless
> >> network, but to wired networks.
> >>
> >> ------------[ cut here ]------------
> >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> >> ...
> >> Call Trace:
> >> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> >> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> >> ...
> >
> > Hmm, maybe ath5k is culprit. Could you apply the attached patch and
> > use the kernel till the problem appears again?
>
> I don't think this will print anything, the rate won't be 32, it's rather
> too high. Could you apply also the appended debug one?
>

I will apply both the patches and try it out again. As I mentioned
earlier, I am not sure how to reproduce the WARN_ON. I will get back to
you in about a day or two.

> ---
> net/mac80211/rx.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
> index 7175ae8..5e17e57 100644
> --- a/net/mac80211/rx.c
> +++ b/net/mac80211/rx.c
> @@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
> * MCS aware. */
> rate = &sband->bitrates[sband->n_bitrates - 1];
> } else {
> - if (WARN_ON(status->rate_idx < 0 ||
> - status->rate_idx >= sband->n_bitrates))
> + if (WARN(status->rate_idx < 0 ||
> + status->rate_idx >= sband->n_bitrates,
> + "RATE=%u, BAND=%x\n", status->rate_idx,
> + sband->n_bitrates))
> return;
> rate = &sband->bitrates[status->rate_idx];
> }
> --
> 1.6.0.6

--
regards,
Dhaval

2009-01-07 15:31:33

by Jiri Slaby

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On 01/07/2009 04:22 PM, Dhaval Giani wrote:
> I will get back to you in about a day or two.

No problem. Thanks.

2009-03-15 21:27:34

by Stefan Lippers-Hollmann

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

Hi

On Wednesday, 7 January 2009, Jiri Slaby wrote:
> On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > Dhaval Giani wrote:
> >> I see this on current git. Not sure how to reproduce it, has happened on
> >> two random occasions. At both times, I was not connected to a wireless
> >> network, but to wired networks.
> >>
> >> ------------[ cut here ]------------
> >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> >> ...
> >> Call Trace:
> >> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> >> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> >> ...
> >
> > Hmm, maybe ath5k is culprit. Could you apply the attached patch and
> > use the kernel till the problem appears again?

It seems this problem isn't restricted to ath5k; I just triggered
something very similar on b43 with 2.6.29-rc8-git1 (i386, hard
preemption):

b43-phy0: Broadcom 4306 WLAN found (core revision 5)
wmaster0 (b43): not using net_device_ops yet
phy0: Selected rate control algorithm 'minstrel'
wlan0 (b43): not using net_device_ops yet
Broadcom 43xx driver loaded [ Features: PMLR, Firmware-ID: FW13 ]
udev: renamed network interface wlan0 to wlan1
[...]
input: b43-phy0 as /devices/virtual/input/input8
b43 ssb0:0: firmware: requesting b43/ucode5.fw
b43 ssb0:0: firmware: requesting b43/pcm5.fw
b43 ssb0:0: firmware: requesting b43/b0g0initvals5.fw
b43 ssb0:0: firmware: requesting b43/b0g0bsinitvals5.fw
b43-phy0: Loading firmware version 410.2160 (2007-05-26 15:32:10)
Registered led device: b43-phy0::tx
Registered led device: b43-phy0::rx
Registered led device: b43-phy0::radio
b43-phy0: Radio turned on by software
[...]
ADDRCONF(NETDEV_UP): wlan1: link is not ready
wlan1: authenticate with AP 00:15:f2:7e:9b:7d
wlan1: authenticated
wlan1: associate with AP 00:15:f2:7e:9b:7d
wlan1: RX AssocResp from 00:15:f2:7e:9b:7d (capab=0x411 status=0 aid=2)
wlan1: associated
ADDRCONF(NETDEV_CHANGE): wlan1: link becomes ready
[...]
wlan1: no IPv6 routers present
b43-phy0 ERROR: PHY transmission error
b43-phy0 ERROR: PHY transmission error

[ lots of these, probably caused by minstrel being a little too
optimistic about the achievable wlan rates (it was more conservative in
2.6.28, where this didn't happen); the distance between the two stations
is at the upper end of the range ]

b43-phy0 ERROR: PHY transmission error
__ratelimit: 9 callbacks suppressed
b43-phy0 ERROR: PHY transmission error
b43-phy0 ERROR: PHY transmission error
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
Hardware name: Amilo D-Series
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 0, comm: swapper Not tainted 2.6.29-rc8-sidux-686 #1
Call Trace:
[<c01319d7>] warn_slowpath+0x87/0xe0
[<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
[<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
[<c0124fc3>] __wake_up_common+0x43/0x70
[<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
[<c013692c>] tasklet_action+0x6c/0xf0
[<c0137147>] __do_softirq+0x87/0x140
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<c0137255>] do_softirq+0x55/0x60
[<c0137495>] irq_exit+0x75/0x90
[<c0106378>] do_IRQ+0x48/0x90
[<c0104527>] common_interrupt+0x27/0x2c
[<cf8372e4>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
[<c02fd3bf>] cpuidle_idle_call+0x6f/0xc0
[<c0102de6>] cpu_idle+0x66/0xa0
---[ end trace c754f566bbe5ac47 ]---
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
Hardware name: Amilo D-Series
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 0, comm: swapper Tainted: G W 2.6.29-rc8-sidux-686 #1
Call Trace:
[<c01319d7>] warn_slowpath+0x87/0xe0
[<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
[<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
[<d0055f15>] b43_led_turn_off+0x55/0x90 [b43]
[<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
[<c013692c>] tasklet_action+0x6c/0xf0
[<c0137147>] __do_softirq+0x87/0x140
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<c0137255>] do_softirq+0x55/0x60
[<c0137495>] irq_exit+0x75/0x90
[<c0106378>] do_IRQ+0x48/0x90
[<c0104527>] common_interrupt+0x27/0x2c
[<cf8372e4>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
[<c02fd3bf>] cpuidle_idle_call+0x6f/0xc0
[<c0102de6>] cpu_idle+0x66/0xa0
---[ end trace c754f566bbe5ac48 ]---
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
Hardware name: Amilo D-Series
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 1873, comm: kjournald Tainted: G W 2.6.29-rc8-sidux-686 #1
Call Trace:
[<c01319d7>] warn_slowpath+0x87/0xe0
[<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
[<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
[<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
[<c013692c>] tasklet_action+0x6c/0xf0
[<c0137147>] __do_softirq+0x87/0x140
[<c011e9a5>] default_spin_lock_flags+0x5/0x10
[<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
[<c0137255>] do_softirq+0x55/0x60
[<c0137495>] irq_exit+0x75/0x90
[<c0106378>] do_IRQ+0x48/0x90
[<c01d3f44>] generic_block_bmap+0x54/0x70
[<c0104527>] common_interrupt+0x27/0x2c
[<cfbf723c>] __journal_file_buffer+0xdc/0x1d0 [jbd]
[<cfbf7397>] journal_file_buffer+0x67/0xc0 [jbd]
[<cfbfe102>] journal_write_metadata_buffer+0x1e2/0x3dc [jbd]
[<cfbf9e26>] journal_commit_transaction+0x806/0x1120 [jbd]
[<c013bcc7>] lock_timer_base+0x27/0x60
[<cfbfd82c>] kjournald+0xac/0x1f0 [jbd]
[<c01464b0>] autoremove_wake_function+0x0/0x50
[<cfbfd780>] kjournald+0x0/0x1f0 [jbd]
[<c01460e9>] kthread+0x39/0x70
[<c01460b0>] kthread+0x0/0x70
[<c0104793>] kernel_thread_helper+0x7/0x14
---[ end trace c754f566bbe5ac49 ]---
__ratelimit: 21 callbacks suppressed
b43-phy0 ERROR: PHY transmission error
[...]

Sometimes even the firmware crashes and gets reloaded continuously.

wlan1 IEEE 802.11bg ESSID:"soyuz"
Mode:Managed Frequency:2.422 GHz Access Point: 00:15:F2:7E:9B:7D
Bit Rate=18 Mb/s Tx-Power=20 dBm
Retry min limit:7 RTS thr:off Fragment thr=2352 B
Encryption key:<wpa2psk> [3] Security mode:open
Power Management:off
Link Quality=53/100 Signal level:-75 dBm Noise level=-65 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0

Setting a fixed wlan rate (like 11M) seems to avoid this problem.

> I don't think this will print anything, the rate won't be 32, it's rather
> too high. Could you apply also the appended debug one?

I will apply this patch and give it some more testing tomorrow evening.
This problem is almost 100% reproducible for me at the edge of my router's
range and doesn't happen in closer proximity.

> ---
> net/mac80211/rx.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
> index 7175ae8..5e17e57 100644
> --- a/net/mac80211/rx.c
> +++ b/net/mac80211/rx.c
> @@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
> * MCS aware. */
> rate = &sband->bitrates[sband->n_bitrates - 1];
> } else {
> - if (WARN_ON(status->rate_idx < 0 ||
> - status->rate_idx >= sband->n_bitrates))
> + if (WARN(status->rate_idx < 0 ||
> + status->rate_idx >= sband->n_bitrates,
> + "RATE=%u, BAND=%x\n", status->rate_idx,
> + sband->n_bitrates))
> return;
> rate = &sband->bitrates[status->rate_idx];
> }

Regards
Stefan Lippers-Hollmann

2009-03-15 21:37:00

by Michael Büsch

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Sunday 15 March 2009 22:27:13 Stefan Lippers-Hollmann wrote:
> Hi
>
> On Wednesday, 7 January 2009, Jiri Slaby wrote:
> > On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > > Dhaval Giani wrote:
> > >> I see this on current git. Not sure how to reproduce it, has happened on
> > >> two random occasions. At both times, I was not connected to a wireless
> > >> network, but to wired networks.
> > >>
> > >> ------------[ cut here ]------------
> > >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559

I also see this triggering frequently on b43.
I'm not quite sure why it happens.

> Sometimes even the firmware crashes and gets reloaded continuously.

Nah, that's most likely a separate bug.

--
Greetings, Michael.

2009-03-23 00:46:23

by Stefan Lippers-Hollmann

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

Hi

On Sunday, 15 March 2009, Stefan Lippers-Hollmann wrote:
> Hi
>
> On Wednesday, 7 January 2009, Jiri Slaby wrote:
> > On 01/07/2009 02:51 PM, Jiri Slaby wrote:
> > > Dhaval Giani wrote:
> > >> I see this on current git. Not sure how to reproduce it, has happened on
> > >> two random occasions. At both times, I was not connected to a wireless
> > >> network, but to wired networks.
> > >>
> > >> ------------[ cut here ]------------
> > >> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0x7f/0x559
> > >> ...
> > >> Call Trace:
> > >> [<f80d4192>] __ieee80211_rx+0x7f/0x559 [mac80211]
> > >> [<f80a19f4>] ath5k_tasklet_rx+0x4f7/0x53b [ath5k]
> > >> ...
> > >
> > > Hmm, maybe ath5k is culprit. Could you apply the attached patch and
> > > use the kernel till the problem appears again?
>
> It seems as if this problem wouldn't be restricted to ath5k, I just
> triggered something very similar on b43 and 2.6.29-rc8-git1 (i386, hard
> preemption):
>
> b43-phy0: Broadcom 4306 WLAN found (core revision 5)
[...]
> wlan1: no IPv6 routers present
> b43-phy0 ERROR: PHY transmission error
> b43-phy0 ERROR: PHY transmission error
>
> [ lots of these, likely to be caused by minstrel being a little too
> optimistic about the possible wlan rates (it was more conservative in
> 2.6.28 and didn't happen there); the distance between both stations is
> on the upper end ]
>
> b43-phy0 ERROR: PHY transmission error
> __ratelimit: 9 callbacks suppressed
> b43-phy0 ERROR: PHY transmission error
> b43-phy0 ERROR: PHY transmission error
> ------------[ cut here ]------------
> WARNING: at net/mac80211/rx.c:2234 __ieee80211_rx+0xa2/0x6a0 [mac80211]()
> Hardware name: Amilo D-Series
> Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev pcmcia snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device i2c_viapro serio_raw snd i2c_core pcspkr psmouse evdev soundcore via686a via_agp shpchp yenta_socket rsrc_nonstatic pcmcia_core pci_hotplug rtc_cmos battery rtc_core rtc_lib parport_pc parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
> Pid: 0, comm: swapper Not tainted 2.6.29-rc8-sidux-686 #1
> Call Trace:
> [<c01319d7>] warn_slowpath+0x87/0xe0
> [<d00523b7>] op32_set_current_rxslot+0x27/0x40 [b43]
> [<d0052d93>] b43_dma_rx+0x193/0x420 [b43]
> [<c0124fc3>] __wake_up_common+0x43/0x70
> [<cfffcc62>] __ieee80211_rx+0xa2/0x6a0 [mac80211]
> [<c011e9a5>] default_spin_lock_flags+0x5/0x10
> [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
> [<cffeb337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
> [<c013692c>] tasklet_action+0x6c/0xf0
> [<c0137147>] __do_softirq+0x87/0x140
> [<c011e9a5>] default_spin_lock_flags+0x5/0x10
> [<c03a3f2e>] _spin_lock_irqsave+0x3e/0x60
> [<c0137255>] do_softirq+0x55/0x60
> [<c0137495>] irq_exit+0x75/0x90
> [<c0106378>] do_IRQ+0x48/0x90
> [<c0104527>] common_interrupt+0x27/0x2c
> [<cf8372e4>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
> [<c02fd3bf>] cpuidle_idle_call+0x6f/0xc0
> [<c0102de6>] cpu_idle+0x66/0xa0
> ---[ end trace c754f566bbe5ac47 ]---
[...]
> Sometimes even the firmware crashes and gets reloaded continuously.
[...]
> Setting a fixed wlan rate (like 11M) seems to avoid this problem.
>
> > I don't think this will print anything, the rate won't be 32, it's rather
> > too high. Could you apply also the appended debug one?
>
> I will apply this patch and give it some more testing tomorrow evening.
> This problem is almost 100% reproducible for me at the edge of my router's
> range and doesn't happen in closer proximity.
[...]

Sorry for the late response, but I've been unexpectedly away from my
BCM4306 system until today.

Thanks to the following (not yet mainline) patches by Michael Buesch and
Lorenzo Nava on top of 2.6.29-rc8-git5, these problems seem to be "fixed"
(well, the PHY errors are basically just hidden, but as they don't
trigger the firmware watchdog anymore, it's much less of a problem and
isn't actually a user visible problem anymore).

[PATCH] b43: Mask PHY TX error interrupt, if not debugging
http://marc.info/?l=linux-wireless&m=123748731831778&w=2

[PATCH] b43: fix b43_plcp_get_bitrate_idx_ofdm return type
http://marc.info/?l=linux-wireless&m=123774585529189&w=2


Confirming the patch descriptions, Jiri Slaby's debugging patch did reveal
a signedness problem in the return value of
b43_plcp_get_bitrate_idx_ofdm(), which has been fixed by the second patch
listed above:

[ this trace happened *without* "b43: fix b43_plcp_get_bitrate_idx_ofdm
return type", and only "b43: Mask PHY TX error interrupt, if not
debugging" applied on top of 2.6.29-rc8-git5 ]
------------[ cut here ]------------
WARNING: at net/mac80211/rx.c:2236 __ieee80211_rx+0xab/0x6b0 [mac80211]()
Hardware name: Amilo D-Series
RATE=255, BAND=c
Modules linked in: ppdev lp aes_i586 aes_generic ipv6 af_packet rfkill_input arc4 ecb b43 rfkill rng_core mac80211 cfg80211 led_class input_polldev ssb joydev snd_via82xx gameport snd_ac97_codec ac97_bus snd_pcm_oss snd_mixer_oss pcmcia snd_pcm snd_timer snd_page_alloc snd_mpu401_uart snd_rawmidi i2c_viapro serio_raw snd_seq_device pcspkr i2c_core psmouse snd evdev soundcore via686a shpchp yenta_socket rsrc_nonstatic pcmcia_core via_agp pci_hotplug rtc_cmos parport_pc battery rtc_core rtc_lib parport ac button ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi pata_via uhci_hcd ehci_hcd floppy firewire_ohci libata tulip firewire_core crc_itu_t usbcore scsi_mod thermal processor fan
Pid: 0, comm: swapper Not tainted 2.6.29-rc8-sidux-686 #1
Call Trace:
[<c0131a67>] warn_slowpath+0x87/0xe0
[<d002d377>] op32_set_current_rxslot+0x27/0x40 [b43]
[<d002dd53>] b43_dma_rx+0x193/0x420 [b43]
[<c01ae229>] add_partial+0x19/0x70
[<cfcd834f>] ieee80211_tasklet_handler+0x11f/0x130 [mac80211]
[<c03a4195>] _spin_unlock+0x5/0x20
[<cfce9c6b>] __ieee80211_rx+0xab/0x6b0 [mac80211]
[<c011ea35>] default_spin_lock_flags+0x5/0x10
[<c03a3d7e>] _spin_lock_irqsave+0x3e/0x60
[<cfcd8337>] ieee80211_tasklet_handler+0x107/0x130 [mac80211]
[<c01369bc>] tasklet_action+0x6c/0xf0
[<c01371d7>] __do_softirq+0x87/0x140
[<c011ea35>] default_spin_lock_flags+0x5/0x10
[<c03a3d7e>] _spin_lock_irqsave+0x3e/0x60
[<c01372e5>] do_softirq+0x55/0x60
[<c0137525>] irq_exit+0x75/0x90
[<c0106378>] do_IRQ+0x48/0x90
[<c0104527>] common_interrupt+0x27/0x2c
[<cf8372cb>] acpi_idle_enter_simple+0x17a/0x1f4 [processor]
[<c02fcfcf>] cpuidle_idle_call+0x6f/0xc0
[<c0102de6>] cpu_idle+0x66/0xa0
---[ end trace ba8601a4d52a20d2 ]---
------------[ cut here ]------------
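
The RATE=255 in the trace above is consistent with a -1 "not found" value
being squeezed through an unsigned 8-bit type. A minimal userspace sketch of
that effect follows. It is not the b43 code: the function names, the
placeholder index 3 and the table size 12 (BAND=c, read as hex) are
illustrative assumptions only.

```c
#include <stdint.h>

/* Hypothetical lookup that signals "rate not found" with -1, but whose
 * return type is unsigned 8-bit: the -1 is converted to 255. */
static uint8_t lookup_idx_u8(int found)
{
	return found ? 3 : -1;
}

/* Same lookup with a signed return type: the -1 survives intact. */
static int lookup_idx_int(int found)
{
	return found ? 3 : -1;
}

/* Range check in the style of the one quoted in this thread:
 * valid indices are 0 .. n_bitrates - 1. */
static int idx_is_valid(int idx, int n_bitrates)
{
	return idx >= 0 && idx < n_bitrates;
}
```

With the unsigned return type, a caller's test for a negative value can never
fire, so the bogus index 255 is only caught later by the upper-bound check.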

So far (after 2.9 GB continuous kernel tarball downloads from a local
mirror) b43 seems to be fine again:

wlan1 IEEE 802.11bg ESSID:"gemini"
Mode:Managed Frequency:2.412 GHz Access Point: 00:21:27:FF:51:A8
Bit Rate=54 Mb/s Tx-Power=20 dBm
Retry min limit:7 RTS thr:off Fragment thr=2352 B
Power Management:off
Link Quality=54/100 Signal level:-82 dBm Noise level=-69 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0

wlan1 Link encap:Ethernet HWaddr 00:0f:66:d8:67:ca
inet addr:192.168.0.70 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: fe80::20f:66ff:fed8:67ca/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:2090104 errors:0 dropped:0 overruns:0 frame:0
TX packets:1082081 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3146865411 (2.9 GiB) TX bytes:93054386 (88.7 MiB)

Fetched 83.2MB in 1min18s (1058kB/s)
[...]
Fetched 83.2MB in 1min1s (1362kB/s)

Thank you and sorry about the late response.

Regards
Stefan Lippers-Hollmann


Post scriptum: I'm not able to trigger this trace with ath5k/ AR2425.
--
> > net/mac80211/rx.c | 6 ++++--
> > 1 files changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
> > index 7175ae8..5e17e57 100644
> > --- a/net/mac80211/rx.c
> > +++ b/net/mac80211/rx.c
> > @@ -2230,8 +2230,10 @@ void __ieee80211_rx(struct ieee80211_hw *hw, struct sk_buff *skb,
> > * MCS aware. */
> > rate = &sband->bitrates[sband->n_bitrates - 1];
> > } else {
> > - if (WARN_ON(status->rate_idx < 0 ||
> > - status->rate_idx >= sband->n_bitrates))
> > + if (WARN(status->rate_idx < 0 ||
> > + status->rate_idx >= sband->n_bitrates,
> > + "RATE=%u, BAND=%x\n", status->rate_idx,
> > + sband->n_bitrates))
> > return;
> > rate = &sband->bitrates[status->rate_idx];
> > }

2009-03-23 02:32:32

by Bob Copeland

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 23, 2009 at 01:45:58AM +0100, Stefan Lippers-Hollmann wrote:
>
> Post scriptum: I'm not able to trigger this trace with ath5k/ AR2425.

Okay, well just to be clear ath5k had the same issue (I posted a patch
a couple of weeks ago - I think it got lost and I need to repost it).

But this is separate from the problem where the rate controller chooses
a bad rate index for TX in adhoc mode; that one is still an unknown,
unsolved problem.

--
Bob Copeland %% http://www.bobcopeland.com

2009-03-30 09:01:15

by Dhaval Giani

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Sun, Mar 01, 2009 at 12:08:07AM +0100, Jiri Slaby wrote:
> On 15.2.2009 14:47, Bob Copeland wrote:
>> On Mon, Feb 02, 2009 at 01:27:39PM +0530, Dhaval Giani wrote:
>>> So I finally managed to hit this on 2.6.29-rc3. It is hard to
>>> reproduce, so I hope so much information is enough to give you a good
>>> guess. This time it hit while trying to connect to an open network at
>>> the airport.
>>
>>> WARNING: at net/mac80211/rx.c:2236 __ieee80211_rx+0x96/0x571 [mac80211]()
>>> Hardware name: 2007CS3
>>> RATE=255, BAND=8
>>
>> band is supposed to be sc->curband? 8 is way wrong.
>
> If you look into the patch which outputs this (backtrace in this
> thread), sband->n_bitrates is 8. I have no idea what I have been smoking
> the day I wrote it, but BAND= for sure isn't the right name for that
> thing. Sorry for the confusion.
>
>> rate could be 255
>> if, for some reason, the hardware rate wasn't in the rate table.
>
> So, we have a fix for this, right? I mean the u8->s8 sc->rate_idx
> conversion or alike...

Where is the fix? Is it merged in? I still see this happen on 2.6.29

thanks,
--
regards,
Dhaval

2009-03-30 16:58:44

by Bob Copeland

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 30, 2009 at 4:59 AM, Dhaval Giani <[email protected]> wrote:
> Where is the fix? Is it merged in? I still see this happen on 2.6.29
>
> thanks,

It's in b726604706ad88d8b28bc487e45e710f58cc19ee in Linus' tree, after
2.6.29. You still might get a warning, but this time from the driver
side instead of higher up the stack -- if you do please post it.

--
Bob Copeland %% http://www.bobcopeland.com

2009-03-30 18:01:13

by Dhaval Giani

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 30, 2009 at 12:58:28PM -0400, Bob Copeland wrote:
> On Mon, Mar 30, 2009 at 4:59 AM, Dhaval Giani <[email protected]> wrote:
> > Where is the fix? Is it merged in? I still see this happen on 2.6.29
> >
> > thanks,
>
> It's in b726604706ad88d8b28bc487e45e710f58cc19ee in Linus' tree, after
> 2.6.29. You still might get a warning, but this time from the driver
> side instead of higher up the stack -- if you do please post it.
>

OK, so my kernel does have this patch applied, and this is what I get:

------------[ cut here ]------------
WARNING: at include/net/mac80211.h:1956 minstrel_get_rate+0xa1/0x4b9 [mac80211]()
Hardware name: 2007CS3
Modules linked in: fuse radeon drm ipt_MASQUERADE iptable_nat nf_nat bridge stp bnep sco l2cap bluetooth ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables ipv6 cpufreq_ondemand acpi_cpufreq dm_multipath kvm_intel kvm uinput snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep snd_seq_dummy arc4 snd_seq_oss snd_seq_midi_event snd_seq ecb ath5k nsc_ircc snd_seq_device snd_pcm_oss video snd_mixer_oss snd_pcm mac80211 snd_timer snd yenta_socket i2c_i801 thinkpad_acpi rfkill irda output iTCO_wdt rsrc_nonstatic pcspkr hwmon cfg80211 joydev i2c_core iTCO_vendor_support soundcore crc_ccitt snd_page_alloc [last unloaded: scsi_wait_scan]
Pid: 2389, comm: wpa_supplicant Tainted: G W 2.6.29-tip #28
Call Trace:
[<c0431b0e>] warn_slowpath+0x76/0xad
[<c04523e1>] ? print_lock_contention_bug+0x14/0xd7
[<c042e874>] ? default_wake_function+0x10/0x12
[<c04523e1>] ? print_lock_contention_bug+0x14/0xd7
[<f7d31879>] minstrel_get_rate+0xa1/0x4b9 [mac80211]
[<c0450fa4>] ? trace_hardirqs_on+0xb/0xd
[<c0424909>] ? __wake_up+0x36/0x40
[<f7d272fe>] ? invoke_tx_handlers+0x3b1/0xa50 [mac80211]
[<f7d21b1e>] rate_control_get_rate+0x7e/0xbe [mac80211]
[<f7d27330>] invoke_tx_handlers+0x3e3/0xa50 [mac80211]
[<c0450e61>] ? trace_hardirqs_on_caller+0x18/0x150
[<f7d26c03>] ? __ieee80211_tx_prepare+0x24b/0x288 [mac80211]
[<f7d286ad>] ieee80211_master_start_xmit+0x38b/0x4b2 [mac80211]
[<c069d1f4>] dev_hard_start_xmit+0x219/0x280
[<c06ac17e>] __qdisc_run+0xca/0x1b0
[<c069d6de>] dev_queue_xmit+0x398/0x4bf
[<f7d2a116>] ieee80211_tx_skb+0x53/0x56 [mac80211]
[<f7d1dac4>] ieee80211_send_deauth_disassoc+0xd7/0xdf [mac80211]
[<f7d1dbc1>] ieee80211_set_disassoc+0xf5/0x209 [mac80211]
[<f7d1ddc6>] ieee80211_sta_req_auth+0x47/0x69 [mac80211]
[<f7d17c5a>] ieee80211_ioctl_siwgenie+0x50/0x5d [mac80211]
[<c06f9720>] ioctl_standard_call+0x1b4/0x268
[<c069b3ce>] ? dev_name_hash+0x1b/0x47
[<c06f92e7>] wext_handle_ioctl+0xe7/0x17d
[<f7d17c0a>] ? ieee80211_ioctl_siwgenie+0x0/0x5d [mac80211]
[<c04937ba>] ? might_fault+0x83/0x85
[<c069f06f>] dev_ioctl+0x5c6/0x5e6
[<c0690bf3>] ? sockfd_lookup_light+0x1b/0x4e
[<c0691b65>] ? sys_sendto+0xa9/0xc8
[<c04cf997>] ? dnotify_parent+0x22/0x63
[<c0690746>] ? sock_ioctl+0x0/0x1f0
[<c069092a>] sock_ioctl+0x1e4/0x1f0
[<c0690746>] ? sock_ioctl+0x0/0x1f0
[<c04b6d55>] vfs_ioctl+0x27/0x6e
[<c04b72d4>] do_vfs_ioctl+0x46f/0x4a8
[<c0691ba1>] ? sys_send+0x1d/0x1f
[<c04b7352>] sys_ioctl+0x45/0x5f
[<c04032a4>] sysenter_do_call+0x12/0x38
---[ end trace 0e3d1a2e9037b74b ]---


> --
> Bob Copeland %% http://www.bobcopeland.com

--
regards,
Dhaval

2009-03-30 18:13:49

by Bob Copeland

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 30, 2009 at 1:59 PM, Dhaval Giani <[email protected]> wrote:
> ok, so my kernel does have this patch applied, and this is what I get,
>
> ------------[ cut here ]------------
> WARNING: at include/net/mac80211.h:1956 minstrel_get_rate+0xa1/0x4b9 [mac80211]()

I believe this is something different (tx path not rx). I think it's
that minstrel rate table bug again, which we never solved for ath5k.
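
One way to check the "tx path not rx" reading is to keep only the frames of the call trace that are not prefixed with "?" (the kernel marks addresses it found on the stack but could not confirm as part of the call chain that way). The following is a minimal sketch, assuming the frame layout seen in the traces in this thread; the name reliable_frames and the regular expression are illustrative and not part of any kernel tool.

```python
import re

# One frame of a call trace as printed in this thread, for example:
#   [<f7d31879>] minstrel_get_rate+0xa1/0x4b9 [mac80211]
#   [<c0450fa4>] ? trace_hardirqs_on+0xb/0xd
FRAME = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<guess>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)


def reliable_frames(trace_text):
    """Return (symbol, module) for every frame that is not marked with '?'.

    module is None for symbols that belong to the core kernel image.
    """
    frames = []
    for line in trace_text.splitlines():
        m = FRAME.search(line)
        if m and not m.group("guess"):
            frames.append((m.group("sym"), m.group("mod")))
    return frames


if __name__ == "__main__":
    # A few lines copied from the trace quoted in this thread.
    sample = """\
[<c0431b0e>] warn_slowpath+0x76/0xad
[<c04523e1>] ? print_lock_contention_bug+0x14/0xd7
[<f7d31879>] minstrel_get_rate+0xa1/0x4b9 [mac80211]
[<f7d21b1e>] rate_control_get_rate+0x7e/0xbe [mac80211]
[<f7d27330>] invoke_tx_handlers+0x3e3/0xa50 [mac80211]
"""
    for sym, mod in reliable_frames(sample):
        print(sym, mod or "(kernel)")
```

Run against the full trace, the surviving frames go from sys_ioctl down through ieee80211_tx_skb and invoke_tx_handlers to minstrel_get_rate, which is a transmit-side chain.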

Are you using adhoc or managed mode? Do you have the slab/slub debugging
options turned on? Any steps that consistently reproduce it? Do you
get any warnings with PID controller?

--
Bob Copeland %% http://www.bobcopeland.com

2009-03-31 03:52:58

by Dhaval Giani

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Mon, Mar 30, 2009 at 02:13:35PM -0400, Bob Copeland wrote:
> On Mon, Mar 30, 2009 at 1:59 PM, Dhaval Giani <[email protected]> wrote:
> > ok, so my kernel does have this patch applied, and this is what I get,
> >
> > ------------[ cut here ]------------
> > WARNING: at include/net/mac80211.h:1956 minstrel_get_rate+0xa1/0x4b9 [mac80211]()
>
> I believe this is something different (tx path not rx). I think it's
> that minstrel rate table bug again, which we never solved for ath5k.
>
> Are you using adhoc or managed mode? Do you have the slab/slub debugging
> options turned on? Any steps that consistently reproduce it? Do you
> get any warnings with PID controller?
>

[dhaval@gondor ~]$ iwconfig wlan0
wlan0 IEEE 802.11abg ESSID:"linksys_SES_62338"
Mode:Managed Frequency:2.462 GHz Access Point: 00:1A:70:D6:2D:06
Bit Rate=36 Mb/s Tx-Power=23 dBm
Retry min limit:7 RTS thr:off Fragment thr=2352 B
Power Management:off
Link Quality=100/100 Signal level:-49 dBm Noise level=-96 dBm
Rx invalid nwid:0 Rx invalid crypt:0 Rx invalid frag:0
Tx excessive retries:0 Invalid misc:0 Missed beacon:0

[dhaval@gondor ~]$

[dhaval@gondor linux-2.6]$ grep -i slub .config
CONFIG_SLUB_DEBUG=y
CONFIG_SLUB=y
# CONFIG_SLUB_DEBUG_ON is not set
# CONFIG_SLUB_STATS is not set
[dhaval@gondor linux-2.6]$

Am not sure what the PID controller is, and google gave me a number of
results, which did not make too much sense in the context.

Yes, I think I know how to reproduce it, but I am not sure what the
real cause is.

One way I have found of reproducing it is to connect to open networks,
but it does not happen always. At home, when my network is set to open,
I do not see this issue, whereas at the airport, kaboom.

I've also seen it on LEAP networks, but there were also a few open
networks around. This warning is generally accompanied by a disconnect
from the LEAP connected network, and then the system reconnects. Let me
know if you have patches, I can give them a run and report back.

Thanks,
--
regards,
Dhaval

2009-03-31 12:24:25

by Bob Copeland

Subject: Re: [PATCH 1/1] ath5k: fix hw rate index condition

On Tue, Mar 31, 2009 at 09:21:40AM +0530, Dhaval Giani wrote:
> Am not sure what the PID controller is, and google gave me a number of
> results, which did not make too much sense in the context.

CONFIG_MAC80211_RC_PID -- unfortunately I recall having to jump through
a few config hoops to enable it.
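
To see whether a given build has the option enabled, the .config file can be checked the same way as the SLUB options earlier in the thread. Below is a minimal sketch in Python, assuming the usual .config conventions ("CONFIG_FOO=y" for set options, "# CONFIG_FOO is not set" for disabled ones); the helper name kconfig_state is hypothetical.

```python
def kconfig_state(config_text, option):
    """Look up one option in the text of a kernel .config file.

    Returns the value after '=' (for example 'y' or 'm') if the option is set,
    'n' if the file says '# <option> is not set', and None if the option does
    not appear at all (for instance because a dependency is not enabled).
    """
    for line in config_text.splitlines():
        line = line.strip()
        if line == f"# {option} is not set":
            return "n"
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
    return None


if __name__ == "__main__":
    # Path is an assumption: run from the top of the kernel build tree.
    with open(".config") as fh:
        text = fh.read()
    print("CONFIG_MAC80211_RC_PID:", kconfig_state(text, "CONFIG_MAC80211_RC_PID"))
```

A result of None rather than 'n' suggests the option is hidden behind another setting, which would match the "config hoops" mentioned above.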

> One way I have found of reproducing it is to connect to open networks,
> but it does not happen always. At home, when my network is set to open,
> I do not see this issue, whereas at the airport, kaboom.

Ok - that is a useful data point. Perhaps something to do with the rates
the peer supports; it would help if you could grab a scan next time you
are in the area. Turn off auto-connect to open networks, then do:

# iw dev wlan0 scan trigger
# iw dev wlan0 scan dump >> dump.out # do this a few times

Then if a particular peer triggers the problem, we can look at the
advertised rates to see if anything jumps out.
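
For comparing the advertised rates across several appended dumps, something like the following could summarize dump.out per BSSID. This is only a sketch: it assumes the human-readable layout that iw commonly prints ("BSS <mac> ..." header lines followed by indented "Supported rates:" and "Extended supported rates:" lines, with "*" marking basic rates). That layout is not guaranteed to be stable between iw versions, and the function name advertised_rates is made up for illustration.

```python
import re
from collections import defaultdict

# Assumed line shapes in the output of "iw dev wlan0 scan dump".
BSS_LINE = re.compile(r"^BSS\s+([0-9a-fA-F:]{17})")
RATES_LINE = re.compile(r"^\s*(?:Extended supported|Supported) rates:\s*(.*)$")


def advertised_rates(dump_text):
    """Map each BSSID to the sorted list of rates (Mb/s) it advertised.

    Rates from repeated dumps of the same BSS are merged, so a file built
    with '>>' over several scans can be passed in as one string.
    """
    rates = defaultdict(set)
    bssid = None
    for line in dump_text.splitlines():
        m = BSS_LINE.match(line)
        if m:
            bssid = m.group(1).lower()
            continue
        m = RATES_LINE.match(line)
        if m and bssid:
            for tok in m.group(1).split():
                # A trailing '*' marks a basic rate; strip it for the number.
                rates[bssid].add(float(tok.rstrip("*")))
    return {b: sorted(r) for b, r in rates.items()}


if __name__ == "__main__":
    with open("dump.out") as fh:
        for bss, r in sorted(advertised_rates(fh.read()).items()):
            print(bss, " ".join(f"{x:g}" for x in r))
```

A peer whose list is unusually short, or lacks the rates the other access points share, would be the first candidate to look at.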

--
Bob Copeland %% http://www.bobcopeland.com

2009-04-08 15:22:49

by Bob Copeland

Subject: Re: [ath5k-devel] [PATCH 1/1] ath5k: fix hw rate index condition

On Tue, Mar 31, 2009 at 8:23 AM, Bob Copeland <[email protected]> wrote:
> Ok - that is a useful data point. Perhaps something to do with the rates
> the peer supports; it would help if you could grab a scan next time you
> are in the area. Turn off auto-connect to open networks, then do:

Hi Dhaval,

Would you mind trying this patch and reporting the warnings it triggers?

http://marc.info/?l=linux-kernel&m=123915183521347&q=raw

--
Bob Copeland %% http://www.bobcopeland.com