2011-05-01 19:46:09

by Russell Senior

Subject: kernel panic on MIPS + ath5k + Wistron CM9 radio


While recently trying to abandon madwifi for good on some embedded
devices running OpenWrt, I encountered kernel panics. The same radio on
an x86 platform (Soekris net4826) does not panic, and a different
Atheros radio does not panic either. I tried two different MIPS boards
(Netgear WGT634U and Ubiquiti RouterStation), and both panicked.

Netgear WGT634U: http://pastebin.ca/2051127
Ubiquiti RouterStation: http://pastebin.ca/2052610

Both cases were firmwares built from r26771 of OpenWrt, and both seem
to have panic'd from hostapd. While researching this, I encountered a
similar report from last June:

http://www.mail-archive.com/[email protected]/msg01001.html
http://www.spinics.net/lists/linux-wireless/msg52502.html

Another person reported some issues with the CM9 earlier this year in
client mode:

http://permalink.gmane.org/gmane.linux.drivers.ath5k.devel/4722

I am happy to do some leg work on getting this fixed, if someone can
provide some guidance. Thanks!


--
Russell Senior ``I have nine fingers; you have ten.''
[email protected]


2011-05-04 19:55:53

by Russell Senior

Subject: Re: [ath5k-devel] kernel panic on MIPS + ath5k + Wistron CM9 radio

>>>>> "Felix" == Felix Fietkau <[email protected]> writes:

>> Well, I see the same on FreeBSD/MIPS whenever an Atheros device is
>> fondled incorrectly. Either because the chip isn't yet fully awake
>> or the register plainly doesn't exist.
>>
>> See if you can add some debugging in ath5k_hw_reset_tx_queue() to
>> see which register is being read/written before the PCI bus error
>> occurs.

Felix> If I read the trace correctly, the accessed register is 0x111c,
Felix> which is AR5K_QUEUE_DFS_MISC(7). If I remember correctly, queue
Felix> 7 is the beacon queue. Access to this register should never
Felix> fail unless the hardware is in sleep mode, or there is some
Felix> other PCI related issue. Unfortunately, this issue might be
Felix> caused by something entirely different that is not visible in
Felix> the stack trace. I have observed that messing up the internal
Felix> state of a PCI card can trigger an error that only shows up
Felix> much later and thus can only be found by doing a thorough code
Felix> review or by analyzing PCI traces.

I wonder what is special about MIPS (relative to x86, where I don't
see the panics).


--
Russell Senior ``I have nine fingers; you have ten.''
[email protected]

2011-05-03 19:19:30

by Felix Fietkau

Subject: Re: [ath5k-devel] kernel panic on MIPS + ath5k + Wistron CM9 radio

On 2011-05-03 8:18 AM, Adrian Chadd wrote:
> Well, I see the same on FreeBSD/MIPS whenever an Atheros device is
> fondled incorrectly. Either because the chip isn't yet fully awake or
> the register plainly doesn't exist.
>
> See if you can add some debugging in ath5k_hw_reset_tx_queue() to see
> which register is being read/written before the PCI bus error occurs.
If I read the trace correctly, the accessed register is 0x111c, which is
AR5K_QUEUE_DFS_MISC(7). If I remember correctly, queue 7 is the beacon
queue. Access to this register should never fail unless the hardware is
in sleep mode, or there is some other PCI related issue. Unfortunately,
this issue might be caused by something entirely different that is not
visible in the stack trace. I have observed that messing up the internal
state of a PCI card can trigger an error that only shows up much later
and thus can only be found by doing a thorough code review or by
analyzing PCI traces.

- Felix
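
The mapping from 0x111c to queue 7 can be checked with a few lines of arithmetic. This is only a sketch: the base offset 0x1100, the 4-byte stride, and the queue count of 10 are assumptions recalled from ath5k's reg.h and are not stated anywhere in this thread; the helper names are hypothetical.

```python
# Sketch of the per-queue register arithmetic behind AR5K_QUEUE_DFS_MISC(q).
# ASSUMPTIONS (not from this thread): the DCU "misc" register block starts
# at 0x1100, holds one 32-bit register per queue, and there are 10 queues.
DCU_MISC_BASE = 0x1100   # assumed base offset of the per-queue block
QUEUE_STRIDE = 4         # assumed: one 32-bit register per queue
NUM_QUEUES = 10          # assumed number of hardware tx queues


def queue_dfs_misc(queue):
    """Return the register offset for the given queue index."""
    return DCU_MISC_BASE + (queue << 2)


def queue_for_offset(offset):
    """Return the queue index for a register offset, or None if the
    offset does not fall on a register inside the assumed block."""
    index, remainder = divmod(offset - DCU_MISC_BASE, QUEUE_STRIDE)
    if remainder != 0 or not 0 <= index < NUM_QUEUES:
        return None
    return index


if __name__ == "__main__":
    print(hex(queue_dfs_misc(7)))
    print(queue_for_offset(0x111C))
```

Under those assumptions, queue 7 lands on offset 0x111c, which agrees with the reading of the trace above.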

2011-05-03 06:18:27

by Adrian Chadd

Subject: Re: kernel panic on MIPS + ath5k + Wistron CM9 radio

Well, I see the same on FreeBSD/MIPS whenever an Atheros device is
fondled incorrectly. Either because the chip isn't yet fully awake or
the register plainly doesn't exist.

See if you can add some debugging in ath5k_hw_reset_tx_queue() to see
which register is being read/written before the PCI bus error occurs.

I have that hardware (routerstation/rspro) and the CM-9 card but I
don't currently have any spare time to run it up and test, sorry.


Adrian

On 2 May 2011 03:39, Russell Senior <[email protected]> wrote:
>
> Recently trying to abandon madwifi for good on some embedded devices
> with OpenWrt, encountered kernel panics. Same radio on an x86
> platform (Soekris net4826), no kernel panic. Different atheros radio,
> no kernel panic. Tried two different MIPS boards (Netgear WGT634U and
> Ubiquiti RouterStation), both panic'd.
>
> Netgear WGT634U: http://pastebin.ca/2051127
> Ubiquiti RouterStation: http://pastebin.ca/2052610
>
> Both cases were firmwares built from r26771 of OpenWrt, and both seem
> to have panic'd from hostapd. While researching this, I encountered a
> similar report from last June:
>
> http://www.mail-archive.com/[email protected]/msg01001.html
> http://www.spinics.net/lists/linux-wireless/msg52502.html
>
> Another person reported some issues with the CM9 earlier this year in
> client mode:
>
> http://permalink.gmane.org/gmane.linux.drivers.ath5k.devel/4722
>
> I am happy to do some leg work on getting this fixed, if someone can
> provide some guidance. Thanks!
>
>
> --
> Russell Senior ``I have nine fingers; you have ten.''
> [email protected]
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2011-05-04 14:39:24

by Russell Senior

Subject: Re: [ath5k-devel] kernel panic on MIPS + ath5k + Wistron CM9 radio

>>>>> "Russell" == Russell Senior <[email protected]> writes:

Russell> Recently trying to abandon madwifi for good on some embedded
Russell> devices with OpenWrt, encountered kernel panics. Same radio
Russell> on an x86 platform (Soekris net4826), no kernel panic.
Russell> Different atheros radio, no kernel panic. Tried two
Russell> different MIPS boards (Netgear WGT634U and Ubiquiti
Russell> RouterStation), both panic'd.

Russell> Netgear WGT634U: http://pastebin.ca/2051127
Russell> Ubiquiti RouterStation: http://pastebin.ca/2052610

Here is another panic, this time not resulting from hostapd, but from
an adhoc vif:

http://pastebin.ca/2053827

The initial panic appears to occur in a different function from the
hostapd initiated panics:

PCI error 1 at PCI addr 0x10008020
Data bus error, epc == 83006b24, ra == 83006b20
Oops[#1]:
Cpu 0
$ 0 : 00000000 802f0000 deadc0de 00000002
$ 4 : b0008020 00000002 018003e8 00000000
$ 8 : 830a8228 00000015 83ad8df0 0000006c
$12 : 00000001 0000000c 00000005 00000008
$16 : 8387a000 01000000 00008020 00008024
$20 : 0000802c 00008030 018003e8 000003e9
$24 : 00000008 0000021c
$28 : 83832000 83833b70 0000003b 83006b20
Hi : 00000000
Lo : 00000020
epc : 83006b24 ath5k_hw_reset_tsf+0x48/0x98 [ath5k]
Not tainted
ra : 83006b20 ath5k_hw_reset_tsf+0x44/0x98 [ath5k]
Status: 1000f403 KERNEL EXL IE
Cause : 1080001c
PrId : 00019374 (MIPS 24Kc)
Modules linked in: batman_adv ath5k ath mac80211 cfg80211 compat arc4 aes_generic crypto_algapi ipv6 leds_gpio gpio_keys_polled input_polldev input_core
Process kworker/u:0 (pid: 5, threadinfo=83832000, task=83819100, tls=00000000)
Stack : 00000000 00000001 00000000 00000000 000000d0 8387a000 00008034 00001f30
00001ef0 83006d28 00000008 000200d0 0000ed54 800ade3c 83ad8da0 83ad8da0
018003e8 00000000 00000000 000003e8 00000000 0000ed54 00000000 830113cc
00000002 802c74fc 00000000 80099364 00000000 00000000 00000000 000252d0
000000d0 00000001 802c74f8 003fffff 8387a000 00000000 0000021c 83845300
...
Call Trace:
[<83006b24>] ath5k_hw_reset_tsf+0x48/0x98 [ath5k]
[<83006d28>] ath5k_hw_init_beacon+0x1b4/0x250 [ath5k]
[<830113cc>] ath5k_beacon_update_timers+0x248/0xc8c [ath5k]
[<83bc81f8>] ieee80211_process_addba_request+0x520/0xa44 [mac80211]


Code: 0c058bb8 00442021 8e0500e4 <00518825> 02601021 0245100b 00402821 8e0200d4 02202021
[...]
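
The faulting register offset can be read straight off the dump. A minimal sketch follows, assuming the card's register window starts at PCI address 0x10000000 (inferred from the dump, not confirmed anywhere in this thread); the function names are hypothetical. The KSEG1 conversion itself is standard MIPS32: the unmapped, uncached segment at 0xa0000000 mirrors the first 512 MiB of physical memory, so the 0xb0008020 in register $4 and the reported PCI address 0x10008020 refer to the same location.

```python
import re

KSEG1_BASE = 0xA0000000       # MIPS32 unmapped, uncached segment
KSEG1_SIZE = 0x20000000       # covers the first 512 MiB of physical memory
PCI_WINDOW_BASE = 0x10000000  # ASSUMPTION: where the card's registers start


def kseg1_to_phys(vaddr):
    """Translate a KSEG1 virtual address to its physical address."""
    if not KSEG1_BASE <= vaddr < KSEG1_BASE + KSEG1_SIZE:
        raise ValueError("address is not in KSEG1")
    return vaddr - KSEG1_BASE


def pci_error_offset(line):
    """Extract the register offset from a 'PCI error' log line,
    or return None if the line does not match."""
    match = re.search(r"PCI error \d+ at PCI addr (0x[0-9a-fA-F]+)", line)
    if match is None:
        return None
    return int(match.group(1), 16) - PCI_WINDOW_BASE


if __name__ == "__main__":
    print(hex(kseg1_to_phys(0xB0008020)))
    print(hex(pci_error_offset("PCI error 1 at PCI addr 0x10008020")))
```

With that assumed window base, the access that faulted is at register offset 0x8020, which also appears in $18 in the register dump.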


--
Russell Senior ``I have nine fingers; you have ten.''
[email protected]

2011-05-06 01:57:33

by Peizhao Hu

Subject: Re: [ath5k-devel] kernel panic on MIPS + ath5k + Wistron CM9 radio

Well, the CPU architecture is different, but that is not really the cause.

As far as I understand the problem, the ath5k driver (or the entire
mac80211 framework) is ported to OpenWrt by the OpenWrt team, so there
are differences between the version you use on the Intel system and the
one on the OpenWrt boxes.

The best place to request a fix is the OpenWrt mailing list.

regards;

Peizhao


On 05/05/11 05:55, Russell Senior wrote:
>>>>>> "Felix" == Felix Fietkau<[email protected]> writes:
>>> Well, I see the same on FreeBSD/MIPS whenever an Atheros device is
>>> fondled incorrectly. Either because the chip isn't yet fully awake
>>> or the register plainly doesn't exist.
>>>
>>> See if you can add some debugging in ath5k_hw_reset_tx_queue() to
>>> see which register is being read/written before the PCI bus error
>>> occurs.
> Felix> If I read the trace correctly, the accessed register is 0x111c,
> Felix> which is AR5K_QUEUE_DFS_MISC(7). If I remember correctly, queue
> Felix> 7 is the beacon queue. Access to this register should never
> Felix> fail unless the hardware is in sleep mode, or there is some
> Felix> other PCI related issue. Unfortunately, this issue might be
> Felix> caused by something entirely different that is not visible in
> Felix> the stack trace. I have observed that messing up the internal
> Felix> state of a PCI card can trigger an error that only shows up
> Felix> much later and thus can only be found by doing a thorough code
> Felix> review or by analyzing PCI traces.
>
> I wonder what is special about MIPS (relative to x86, where I don't
> see the panics).
>
>

2011-05-04 14:48:03

by Felix Fietkau

Subject: Re: [ath5k-devel] kernel panic on MIPS + ath5k + Wistron CM9 radio

On 2011-05-04 4:39 PM, Russell Senior wrote:
>>>>>> "Russell" == Russell Senior<[email protected]> writes:
>
> Russell> Recently trying to abandon madwifi for good on some embedded
> Russell> devices with OpenWrt, encountered kernel panics. Same radio
> Russell> on an x86 platform (Soekris net4826), no kernel panic.
> Russell> Different atheros radio, no kernel panic. Tried two
> Russell> different MIPS boards (Netgear WGT634U and Ubiquiti
> Russell> RouterStation), both panic'd.
>
> Russell> Netgear WGT634U: http://pastebin.ca/2051127
> Russell> Ubiquiti RouterStation: http://pastebin.ca/2052610
>
> Here is another panic, this time not resulting from hostapd, but from
> an adhoc vif:
>
> http://pastebin.ca/2053827
>
> The initial panic appears to occur in a different function from the
> hostapd initiated panics:
Again a beacon-related register, this time AR5K_BEACON.

- Felix

2011-05-03 06:56:58

by Peizhao Hu

Subject: Re: [ath5k-devel] kernel panic on MIPS + ath5k + Wistron CM9 radio

Hi

Would the issue I posted help to resolve this?
http://www.mail-archive.com/[email protected]/msg01001.html

regards;

Peizhao


On 03/05/11 16:18, Adrian Chadd wrote:
> Well, I see the same on FreeBSD/MIPS whenever an Atheros device is
> fondled incorrectly. Either because the chip isn't yet fully awake or
> the register plainly doesn't exist.
>
> See if you can add some debugging in ath5k_hw_reset_tx_queue() to see
> which register is being read/written before the PCI bus error occurs.
>
> I have that hardware (routerstation/rspro) and the CM-9 card but I
> don't currently have any spare time to run it up and test, sorry.
>
>
> Adrian
>
> On 2 May 2011 03:39, Russell Senior<[email protected]> wrote:
>> Recently trying to abandon madwifi for good on some embedded devices
>> with OpenWrt, encountered kernel panics. Same radio on an x86
>> platform (Soekris net4826), no kernel panic. Different atheros radio,
>> no kernel panic. Tried two different MIPS boards (Netgear WGT634U and
>> Ubiquiti RouterStation), both panic'd.
>>
>> Netgear WGT634U: http://pastebin.ca/2051127
>> Ubiquiti RouterStation: http://pastebin.ca/2052610
>>
>> Both cases were firmwares built from r26771 of OpenWrt, and both seem
>> to have panic'd from hostapd. While researching this, I encountered a
>> similar report from last June:
>>
>> http://www.mail-archive.com/[email protected]/msg01001.html
>> http://www.spinics.net/lists/linux-wireless/msg52502.html
>>
>> Another person reported some issues with the CM9 earlier this year in
>> client mode:
>>
>> http://permalink.gmane.org/gmane.linux.drivers.ath5k.devel/4722
>>
>> I am happy to do some leg work on getting this fixed, if someone can
>> provide some guidance. Thanks!
>>
>>
>> --
>> Russell Senior ``I have nine fingers; you have ten.''
>> [email protected]
>>
> _______________________________________________
> ath5k-devel mailing list
> [email protected]
> https://lists.ath5k.org/mailman/listinfo/ath5k-devel