After the latest git pull from wireless-testing
master-2010-01-14 to master-2010-01-19
the start of hostapd causes kernel panic.
Tested with wireless-testing master-2010-01-19
and hostapd 0.6.9 / 0.7.0
---------------------------------------------
BUG: unable to handle kernel NULL pointer dereference at 00000193
IP: [<c126afc9>] invoke_tx_handlers+0x909/0xf40
*pde = 00000000
Oops: 0000 [#1]
last sysfs file: /sys/devices/virtual/net/br0/bridge/topology_change_detected
Modules linked in: rt61pci crc_itu_t rt2x00pci rt2x00lib eeprom_93cx6
Pid: 4411, comm: hostapd Not tainted 2.6.33-rc4-wl-47289-gd602bbd #27 CN700-8237/
EIP: 0060:[<c126afc9>] EFLAGS: 00210246 CPU: 0
EIP is at invoke_tx_handlers+0x909/0xf40
EAX: 00000040 EBX: 00000000 ECX: f6dfc000 EDX: 00000000
ESI: f6c03c00 EDI: f6c07c2c EBP: f6c07c00 ESP: f6c07b34
DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process hostapd (pid: 4411, ti=f6c06000 task=f7bb4000 task.ti=f6c06000)
Stack:
c108763f 00000000 00000200 0000000a 00000304 f6c07e44 c1088368 0098966f
<0> b7a55af2 0000002d 00000000 f6c07bd0 f6c03c00 f6c0805e f6c07f60 f6c07e2c
<0> 00000000 f6c03c20 000000c0 0098966f f6c07e6c f6c07e70 f6c07e74 f6c07e60
Call Trace:
[<c108763f>] ? poll_freewait+0x3f/0xa0
[<c1088368>] ? do_select+0x608/0x680
[<c1269ee5>] ? ieee80211_tx_prepare+0x105/0x310
[<c1088860>] ? __pollwait+0x0/0xd0
[<c126b7b3>] ? ieee80211_tx+0x53/0x180
[<c11cc278>] ? skb_release_data+0x68/0xa0
[<c11cc398>] ? pskb_expand_head+0xe8/0x170
[<c126b96c>] ? ieee80211_xmit+0x8c/0x180
[<c126bb44>] ? ieee80211_monitor_start_xmit+0x94/0xc0
[<c11d3c0d>] ? dev_hard_start_xmit+0x20d/0x2c0
[<c11cce89>] ? __alloc_skb+0x49/0x130
[<c11e297c>] ? sch_direct_xmit+0xec/0x140
[<c11c860a>] ? sock_alloc_send_pskb+0x17a/0x260
[<c11e2060>] ? pfifo_fast_enqueue+0x0/0x90
[<c11d3ebd>] ? dev_queue_xmit+0xdd/0x4a0
[<c12314c3>] ? packet_sendmsg+0x213/0x250
[<c11c565f>] ? sock_sendmsg+0xaf/0xe0
[<c11c5539>] ? sock_recvmsg+0xb9/0xe0
[<c11ce19c>] ? verify_iovec+0x2c/0xa0
[<c11c5b31>] ? sys_sendmsg+0x111/0x230
[<c1056c6f>] ? find_get_page+0x1f/0x70
[<c1057499>] ? filemap_fault+0x69/0x340
[<c1056f6d>] ? unlock_page+0x3d/0x40
[<c1066fe0>] ? __do_fault+0x2a0/0x380
[<c106804b>] ? handle_mm_fault+0x13b/0x850
[<c11c6f1c>] ? sys_socketcall+0xdc/0x290
[<c1078467>] ? filp_close+0x47/0x70
[<c1002990>] ? sysenter_do_call+0x12/0x26
Code: 3d a0 00 00 00 0f 84 1d 05 00 00 3d c0 00 00 00 0f 84 12 05 00 00 3d d0
00 00 00 0f 84 e6 04 00 00 90 c7 47 10 00 00 00 00 31 db <0f> b6 93 93 01 00
00 f6 c2 10 0f 84 e0 f8 ff ff 8b 8d 68 ff ff
EIP: [<c126afc9>] invoke_tx_handlers+0x909/0xf40 SS:ESP 0068:f6c07b34
CR2: 0000000000000193
---[ end trace bc184f73743b5879 ]---
Kernel panic - not syncing: Fatal exception in interrupt
Pid: 4411, comm: hostapd Tainted: G D 2.6.33-rc4-wl-47289-gd602bbd #27
Call Trace:
[<c1277d45>] ? printk+0x18/0x1b
[<c1277c7e>] panic+0x43/0xf2
[<c10054ee>] oops_end+0x7e/0x90
[<c101a8ae>] no_context+0xbe/0x150
[<c101a98f>] __bad_area_nosemaphore+0x4f/0x180
[<c101f056>] ? update_curr+0x116/0x160
[<c1020787>] ? dequeue_entity+0x17/0x1b0
[<c1020ff0>] ? dequeue_task_fair+0x30/0x80
[<c101aad2>] bad_area_nosemaphore+0x12/0x20
[<c101aeb4>] do_page_fault+0x254/0x2f0
[<c101ac60>] ? do_page_fault+0x0/0x2f0
[<c12799f6>] error_code+0x5e/0x64
[<c101ac60>] ? do_page_fault+0x0/0x2f0
[<c126afc9>] ? invoke_tx_handlers+0x909/0xf40
[<c108763f>] ? poll_freewait+0x3f/0xa0
[<c1088368>] ? do_select+0x608/0x680
[<c1269ee5>] ? ieee80211_tx_prepare+0x105/0x310
[<c1088860>] ? __pollwait+0x0/0xd0
[<c126b7b3>] ieee80211_tx+0x53/0x180
[<c11cc278>] ? skb_release_data+0x68/0xa0
[<c11cc398>] ? pskb_expand_head+0xe8/0x170
[<c126b96c>] ieee80211_xmit+0x8c/0x180
[<c126bb44>] ieee80211_monitor_start_xmit+0x94/0xc0
[<c11d3c0d>] dev_hard_start_xmit+0x20d/0x2c0
[<c11cce89>] ? __alloc_skb+0x49/0x130
[<c11e297c>] sch_direct_xmit+0xec/0x140
[<c11c860a>] ? sock_alloc_send_pskb+0x17a/0x260
[<c11e2060>] ? pfifo_fast_enqueue+0x0/0x90
[<c11d3ebd>] dev_queue_xmit+0xdd/0x4a0
[<c12314c3>] packet_sendmsg+0x213/0x250
[<c11c565f>] sock_sendmsg+0xaf/0xe0
[<c11c5539>] ? sock_recvmsg+0xb9/0xe0
[<c11ce19c>] ? verify_iovec+0x2c/0xa0
[<c11c5b31>] sys_sendmsg+0x111/0x230
[<c1056c6f>] ? find_get_page+0x1f/0x70
[<c1057499>] ? filemap_fault+0x69/0x340
[<c1056f6d>] ? unlock_page+0x3d/0x40
[<c1066fe0>] ? __do_fault+0x2a0/0x380
[<c106804b>] ? handle_mm_fault+0x13b/0x850
[<c11c6f1c>] sys_socketcall+0xdc/0x290
[<c1078467>] ? filp_close+0x47/0x70
[<c1002990>] sysenter_do_call+0x12/0x26
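A note on reading the fault address: CR2 is 0x193 rather than 0 because the kernel read a structure member through a NULL base pointer. The bytes at the marked position in the Code: line, `0f b6 93 93 01 00 00`, decode to a byte load from EBX+0x193, and the register dump shows EBX = 0. The sketch below only illustrates that arithmetic. `struct demo_key` and `flags_address` are made-up names, and the layout is an assumption, not the real mac80211 structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real mac80211 struct: only the 0x193
 * offset is taken from the oops (CR2 = 0x193 with EBX = 0). */
struct demo_key {
    unsigned char other_fields[0x193];
    unsigned char flags;
};

/* Address the CPU would touch when reading key->flags.
 * Nothing is dereferenced here, so passing NULL is safe. */
uintptr_t flags_address(const struct demo_key *key)
{
    return (uintptr_t)key + offsetof(struct demo_key, flags);
}
```

With a NULL `key` on a typical platform, the computed address equals the member offset, which is why a small non-zero CR2 usually points at a NULL structure pointer rather than a wild one.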
On 01/24/2010 03:42 AM, Johannes Berg wrote:
> On Sun, 2010-01-24 at 00:14 -0800, Philip A. Prindeville wrote:
>> On 01/23/2010 09:58 PM, Kalle Valo wrote:
>>> "Philip A. Prindeville" <[email protected]> writes:
>>>
>>>>>> Whatever you prefer. Either way, the panic is fixed now!
>>>>>>
>>>>> Great, and sorry about that! I'll send a patch to insert the else too.
>>>>>
>>>>> johannes
>>>>>
>>>>
>>>> Did you send that patch? I'd like to apply it. Please copy me when
>>>> you send it out.
>>>
>>> Johannes was busy and I sent the patch instead. It's here:
>>>
>>> http://marc.info/?l=linux-wireless&m=126427124317427&w=2
>>>
>>
>> I just applied it to compat-wireless-2010-01-20 and ran it on an AR5413 but it still panics:
>>
>>
>> BUG: unable to handle kernel NULL pointer dereference at 0000019f
>> IP: [<e0993e7f>] :mac80211:invoke_tx_handlers+0x5be/0xe6a
>> *pde = 00000000
>> Oops: 0000 [#1] PREEMPT
>
>> Pid: 1652, comm: hostapd Tainted: P (2.6.27.42-astlinux #1)
>> EIP: 0060:[<e0993e7f>] EFLAGS: 00010246 CPU: 0
>> EIP is at invoke_tx_handlers+0x5be/0xe6a [mac80211]
>
> Are you sure you reloaded the modules etc. correctly? Kinda looks like
> the same issue. Otherwise can you send me in private your mac80211.ko
> and hostapd config file?
>
> johannes
Just to be sure I applied all of the correct patches, can you please send me the list again?
Kalle mentioned there being more than one.
Thanks.
"Philip A. Prindeville" <[email protected]> writes:
>>> Did you send that patch? I'd like to apply it. Please copy me when
>>> you send it out.
>>
>> Johannes was busy and I sent the patch instead. It's here:
>>
>> http://marc.info/?l=linux-wireless&m=126427124317427&w=2
>>
>
> I just applied it to compat-wireless-2010-01-20 and ran it on an
> AR5413 but it still panics:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000019f
> IP: [<e0993e7f>] :mac80211:invoke_tx_handlers+0x5be/0xe6a
Is there any way you could test a wireless-testing kernel? I'm not
familiar with compat-wireless, so it's difficult for me to comment
on it.
But remember that there were multiple fixes related to this crash;
make sure that you have all of them in your compat-wireless tree.
--
Kalle Valo
On Fri, 2010-01-22 at 20:14 +0000, Markus Baier wrote:
> After the latest git pull from wireless-testing
> master-2010-01-14 to master-2010-01-19
> the start of hostapd causes kernel panic.
>
> Tested with wireless-testing master-2010-01-19
> and hostapd 0.6.9 / 0.7.0
Alright, managed to reproduce it in kvm -- hostapd was useful for that.
Try this please.
johannes
--- wireless-testing.orig/net/mac80211/tx.c 2010-01-22 21:44:40.000000000 +0100
+++ wireless-testing/net/mac80211/tx.c 2010-01-22 21:49:50.000000000 +0100
@@ -557,7 +557,7 @@ ieee80211_tx_h_select_key(struct ieee802
 			break;
 		}
 
-		if (!skip_hw &&
+		if (!skip_hw && tx->key &&
 		    tx->key->conf.flags & KEY_FLAG_UPLOADED_TO_HARDWARE)
 			info->control.hw_key = &tx->key->conf;
 	}
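For readers following the patch, here is a minimal userspace sketch of the guarded condition. The `demo_*` types and `DEMO_FLAG_UPLOADED` are simplified stand-ins invented for illustration and are not the mac80211 definitions. The point is only that `tx->key` is tested before its flags are read, and that `&&` short-circuits so the dereference never happens when the key is NULL.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for KEY_FLAG_UPLOADED_TO_HARDWARE; the value is assumed. */
#define DEMO_FLAG_UPLOADED 0x1u

struct demo_key_conf { unsigned int flags; };
struct demo_key { struct demo_key_conf conf; };
struct demo_tx {
    struct demo_key *key;               /* may have been cleared earlier */
    const struct demo_key_conf *hw_key; /* set only for hardware crypto */
};

/* Mirrors the shape of the patched condition: check the pointer first. */
void demo_pick_hw_key(struct demo_tx *tx, bool skip_hw)
{
    tx->hw_key = NULL;
    if (!skip_hw && tx->key &&
        (tx->key->conf.flags & DEMO_FLAG_UPLOADED))
        tx->hw_key = &tx->key->conf;
}
```

The extra parentheses around the flag test are not needed for correctness, since `&` binds tighter than `&&`, but they make the intent easier to read.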
On Sun, 2010-01-24 at 00:14 -0800, Philip A. Prindeville wrote:
> On 01/23/2010 09:58 PM, Kalle Valo wrote:
> > "Philip A. Prindeville" <[email protected]> writes:
> >
> >>>> Whatever you prefer. Either way, the panic is fixed now!
> >>>>
> >>> Great, and sorry about that! I'll send a patch to insert the else too.
> >>>
> >>> johannes
> >>>
> >>
> >> Did you send that patch? I'd like to apply it. Please copy me when
> >> you send it out.
> >
> > Johannes was busy and I sent the patch instead. It's here:
> >
> > http://marc.info/?l=linux-wireless&m=126427124317427&w=2
> >
>
> I just applied it to compat-wireless-2010-01-20 and ran it on an AR5413 but it still panics:
>
>
> BUG: unable to handle kernel NULL pointer dereference at 0000019f
> IP: [<e0993e7f>] :mac80211:invoke_tx_handlers+0x5be/0xe6a
> *pde = 00000000
> Oops: 0000 [#1] PREEMPT
> Pid: 1652, comm: hostapd Tainted: P (2.6.27.42-astlinux #1)
> EIP: 0060:[<e0993e7f>] EFLAGS: 00010246 CPU: 0
> EIP is at invoke_tx_handlers+0x5be/0xe6a [mac80211]
Are you sure you reloaded the modules etc. correctly? Kinda looks like
the same issue. Otherwise can you send me in private your mac80211.ko
and hostapd config file?
johannes
On Fri, 2010-01-22 at 22:53 +0100, Johannes Berg wrote:
> > I assume it's another case where tx->key should be checked for being
> > NULL. In fact, it's set to NULL on the preceding line!
>
> or an else inserted.
Whatever you prefer. Either way, the panic is fixed now!
--
Regards,
Pavel Roskin
On Fri, 2010-01-22 at 20:14 +0000, Markus Baier wrote:
> After the latest git pull from wireless-testing
> master-2010-01-14 to master-2010-01-19
> the start of hostapd causes kernel panic.
>
> Tested with wireless-testing master-2010-01-19
> and hostapd 0.6.9 / 0.7.0
> EIP: [<c126afc9>] invoke_tx_handlers+0x909/0xf40 SS:ESP 0068:f6c07b34
Would you compile with CONFIG_MAC80211_NOINLINE (may need to enable
CONFIG_MAC80211_DEBUG_MENU) and give me the stack trace then? But maybe
I can reproduce it this way.
johannes
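Some background on the request above, as a hedged sketch: the first oops only names invoke_tx_handlers+0x909, presumably because the individual TX handlers were inlined into it, so the trace cannot say which handler faulted. Keeping the handlers out of line gives each one its own symbol. The example uses the GCC/Clang `noinline` attribute; `DEMO_DEBUG_NOINLINE`, `demo_noinline`, and the handler names are hypothetical and do not come from the kernel source.

```c
/* Hypothetical switch, standing in for what a debug config option
 * could control. Build with -DDEMO_DEBUG_NOINLINE to keep the
 * handlers as separate symbols in a backtrace. */
#ifdef DEMO_DEBUG_NOINLINE
#define demo_noinline __attribute__((noinline))
#else
#define demo_noinline
#endif

static demo_noinline int demo_handler_a(int frame) { return frame + 1; }
static demo_noinline int demo_handler_b(int frame) { return frame * 2; }

/* Without the attribute an optimizing compiler may fold both handlers
 * into this function, so a fault inside one of them would be reported
 * as demo_invoke_handlers+0x... instead of the handler's own name. */
int demo_invoke_handlers(int frame)
{
    frame = demo_handler_a(frame);
    return demo_handler_b(frame);
}
```

The result is the same either way; only the symbol that shows up in a stack trace changes.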
Pavel Roskin wrote:
> On Fri, 2010-01-22 at 21:53 +0100, Johannes Berg wrote:
>
>> Try this please.
>
> I'm still getting a panic in ieee80211_tx_h_select_key():
>
> BUG: unable to handle kernel NULL pointer dereference at 00000000000001cf
> IP: [<ffffffffa0167e1a>] ieee80211_tx_h_select_key+0x26a/0x300 [mac80211]
> PGD 12a7f8067 PUD 126450067 PMD 0
> Oops: 0000 [#1] SMP
> last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:02:02.0/irq
> CPU 0
> Pid: 19396, comm: hostapd Not tainted 2.6.33-rc4-wl #239 G31T-M/G31T-M
> RIP: 0010:[<ffffffffa0167e1a>] [<ffffffffa0167e1a>] ieee80211_tx_h_select_key+0x26a/0x300 [mac80211]
>
> (gdb) l *(ieee80211_tx_h_select_key+0x26a)
> 0x16e4a is in ieee80211_tx_h_select_key (/home/proski/src/linux-2.6/net/mac80211/tx.c:550).
> 545		case ALG_CCMP:
> 546			if (!ieee80211_is_data_present(hdr->frame_control) &&
> 547			    !ieee80211_use_mfp(hdr->frame_control, tx->sta,
> 548						tx->skb))
> 549				tx->key = NULL;
> 550			skip_hw = (tx->key->conf.flags &
> 551				   IEEE80211_KEY_FLAG_SW_MGMT) &&
> 552				ieee80211_is_mgmt(hdr->frame_control);
> 553			break;
> 554		case ALG_AES_CMAC:
>
> I assume it's another case where tx->key should be checked for being
> NULL. In fact, it's set to NULL on the preceding line!
or an else inserted.
> --
> Regards,
> Pavel Roskin
>
>
"Philip A. Prindeville" <[email protected]> writes:
>>> Whatever you prefer. Either way, the panic is fixed now!
>>>
>> Great, and sorry about that! I'll send a patch to insert the else too.
>>
>> johannes
>>
>
> Did you send that patch? I'd like to apply it. Please copy me when
> you send it out.
Johannes was busy and I sent the patch instead. It's here:
http://marc.info/?l=linux-wireless&m=126427124317427&w=2
--
Kalle Valo
<pat-lkml@...> writes:
> On my system, it doesn't fail until
> something actively scans and we receive the probe.
I think it's the same here.
That would explain the following behavior.
When I started the hostapd daemon in the cellar,
where I can access the console server to capture the trace,
the service started fine and the panic only appeared
when I stopped the hostapd daemon.
When I start it on the upper floor, where the AP can
hear many WLAN stations, the kernel panic appears immediately
after the daemon starts.
On Fri, 2010-01-22 at 17:06 -0500, Pavel Roskin wrote:
> On Fri, 2010-01-22 at 22:53 +0100, Johannes Berg wrote:
> > > I assume it's another case where tx->key should be checked for being
> > > NULL. In fact, it's set to NULL on the preceding line!
> >
> > or an else inserted.
>
> Whatever you prefer. Either way, the panic is fixed now!
Great, and sorry about that! I'll send a patch to insert the else too.
johannes
On 01/23/2010 09:58 PM, Kalle Valo wrote:
> "Philip A. Prindeville" <[email protected]> writes:
>
>>>> Whatever you prefer. Either way, the panic is fixed now!
>>>>
>>> Great, and sorry about that! I'll send a patch to insert the else too.
>>>
>>> johannes
>>>
>>
>> Did you send that patch? I'd like to apply it. Please copy me when
>> you send it out.
>
> Johannes was busy and I sent the patch instead. It's here:
>
> http://marc.info/?l=linux-wireless&m=126427124317427&w=2
>
I just applied it to compat-wireless-2010-01-20 and ran it on an AR5413 but it still panics:
BUG: unable to handle kernel NULL pointer dereference at 0000019f
IP: [<e0993e7f>] :mac80211:invoke_tx_handlers+0x5be/0xe6a
*pde = 00000000
Oops: 0000 [#1] PREEMPT
Modules linked in: aes_i586 aes_generic pc87360 hwmon_vid hwmon bridge stp llc dummy ath5k mac80211 ath cfg80211 rfkill_backport compat dahdi_dummy dahdi sha512_generic sha256_generic deflate zlib_deflate arc4 ecb sha1_generic blowfish des_generic cbc cryptosoft cryptodev(P) ocf(P) geodewdt geode_rng geode_aes crypto_blkcipher via_rhine rtc cs5535_gpio
Pid: 1652, comm: hostapd Tainted: P (2.6.27.42-astlinux #1)
EIP: 0060:[<e0993e7f>] EFLAGS: 00010246 CPU: 0
EIP is at invoke_tx_handlers+0x5be/0xe6a [mac80211]
EAX: 00000000 EBX: df0b7cac ECX: 00000000 EDX: df0b7cac
ESI: df22dce0 EDI: df22dcc0 EBP: df22dce0 ESP: df0b7c10
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
Process hostapd (pid: 1652, ti=df0b6000 task=dfa43be0 task.ti=df0b6000)
Stack: df032920 00000000 dfa43be0 df0b7cac df22dcc0 df22dce0 df5658dc dfa75640
dfaa5b60 df22dce0 df0b7cac e099362e df0b7cac df0b7c5c df27a45e df204260
dfa43be0 c0112426 00100100 df27a450 0000000e 0000000f df27a45c 00000012
Call Trace:
[<e099362e>] ieee80211_tx_prepare+0x2ed/0x327 [mac80211]
[<c0112426>] default_wake_function+0x0/0x8
[<e0994904>] ieee80211_tx+0x94/0x21b [mac80211]
[<c02511cd>] pskb_expand_head+0xe7/0x14d
[<e0994bfd>] ieee80211_xmit+0x172/0x196 [mac80211]
[<e0994f3e>] ieee80211_monitor_start_xmit+0x8e/0xa0 [mac80211]
[<c02559d0>] dev_hard_start_xmit+0x196/0x1ef
[<c02611a8>] __qdisc_run+0xa1/0x183
[<c0257aee>] dev_queue_xmit+0x161/0x283
[<c0252311>] memcpy_fromiovec+0x28/0x4b
[<c02a3efc>] packet_sendmsg+0x1ba/0x200
[<c024c174>] sock_sendmsg+0xb7/0xd0
[<c0123bde>] autoremove_wake_function+0x0/0x2b
[<c0123bde>] autoremove_wake_function+0x0/0x2b
[<c015abe6>] core_sys_select+0x260/0x285
[<c0252579>] verify_iovec+0x3e/0x6d
[<c024c31a>] sys_sendmsg+0x18d/0x1f0
[<c013c232>] mark_page_accessed+0x18/0x27
[<c013778a>] filemap_fault+0x202/0x364
[<c01b7abc>] unionfs_fault+0x50/0x58
[<c0140071>] __do_fault+0x2b7/0x2e9
[<c01411fb>] handle_mm_fault+0x219/0x4a3
[<c024d209>] sys_socketcall+0x15b/0x193
[<c02ad674>] do_page_fault+0x0/0x60d
[<c01037e6>] syscall_call+0x7/0xb
[<c02a0000>] unix_dgram_disconnected+0x39/0x4e
=======================
Code: 00 00 10 74 1f 0f b7 03 a8 0c 0f 84 52 08 00 00 eb 12 0f b7 03 a8 0c 74 0b 8b 4c 24 0c c7 41 10 00 00 00 00 8b 5c 24 0c 8b 43 10 <f6> 80 9f 01 00 00 01 0f 84 2a 08 00 00 05 98 01 00 00 89 45 1c
EIP: [<e0993e7f>] invoke_tx_handlers+0x5be/0xe6a [mac80211] SS:ESP 0068:df0b7c10
Kernel panic - not syncing: Fatal exception in interrupt
Slightly different from Markus's trace.
On 01/23/2010 04:59 AM, Johannes Berg wrote:
> On Fri, 2010-01-22 at 17:06 -0500, Pavel Roskin wrote:
>
>> On Fri, 2010-01-22 at 22:53 +0100, Johannes Berg wrote:
>>
>>>> I assume it's another case where tx->key should be checked for being
>>>> NULL. In fact, it's set to NULL on the preceding line!
>>>>
>>> or an else inserted.
>>>
>> Whatever you prefer. Either way, the panic is fixed now!
>>
> Great, and sorry about that! I'll send a patch to insert the else too.
>
> johannes
>
Did you send that patch? I'd like to apply it. Please copy me when you send it out.
Thanks.
On Fri, 22 Jan 2010 20:14:36 +0000 (UTC), Markus Baier
<[email protected]> wrote:
> After the latest git pull from wireless-testing
> master-2010-01-14 to master-2010-01-19
> the start of hostapd causes kernel panic.
>
> Tested with wireless-testing master-2010-01-19
> and hostapd 0.6.9 / 0.7.0
>
>
> ---------------------------------------------
<SNIP>
I'm seeing this with git-tip of hostapd as well (using ath9k
instead of rt61pci). I hadn't caught a full trace, so I
hadn't reported it yet. On my system, it doesn't fail until
something actively scans and we receive the probe. I can't
provide more info than that yet, though.
Pat Erley
On Fri, 2010-01-22 at 21:53 +0100, Johannes Berg wrote:
> Try this please.
I'm still getting a panic in ieee80211_tx_h_select_key():
BUG: unable to handle kernel NULL pointer dereference at 00000000000001cf
IP: [<ffffffffa0167e1a>] ieee80211_tx_h_select_key+0x26a/0x300 [mac80211]
PGD 12a7f8067 PUD 126450067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:02:02.0/irq
CPU 0
Pid: 19396, comm: hostapd Not tainted 2.6.33-rc4-wl #239 G31T-M/G31T-M
RIP: 0010:[<ffffffffa0167e1a>] [<ffffffffa0167e1a>] ieee80211_tx_h_select_key+0x26a/0x300 [mac80211]
(gdb) l *(ieee80211_tx_h_select_key+0x26a)
0x16e4a is in ieee80211_tx_h_select_key (/home/proski/src/linux-2.6/net/mac80211/tx.c:550).
545		case ALG_CCMP:
546			if (!ieee80211_is_data_present(hdr->frame_control) &&
547			    !ieee80211_use_mfp(hdr->frame_control, tx->sta,
548						tx->skb))
549				tx->key = NULL;
550			skip_hw = (tx->key->conf.flags &
551				   IEEE80211_KEY_FLAG_SW_MGMT) &&
552				ieee80211_is_mgmt(hdr->frame_control);
553			break;
554		case ALG_AES_CMAC:
I assume it's another case where tx->key should be checked for being
NULL. In fact, it's set to NULL on the preceding line!
--
Regards,
Pavel Roskin
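To make the two repair options discussed in this thread concrete (check `tx->key` for NULL, or put the flag read in an `else`), here is a small userspace model of the ALG_CCMP case. Everything prefixed `demo_`/`DEMO_` is hypothetical, the flag value is assumed, and `drop_key` stands in for the combined frame-type test on lines 546 to 548 of the listing. Both variants assume the key pointer is non-NULL on entry, as it appears to be in the listing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for IEEE80211_KEY_FLAG_SW_MGMT; the value is assumed. */
#define DEMO_FLAG_SW_MGMT 0x10u

struct demo_key { unsigned int flags; };

/* Variant 1: test the pointer before reading through it. */
bool demo_skip_hw_checked(struct demo_key **key, bool drop_key, bool is_mgmt)
{
    if (drop_key)
        *key = NULL;
    return *key && ((*key)->flags & DEMO_FLAG_SW_MGMT) && is_mgmt;
}

/* Variant 2: only read the flags on the branch that kept the key. */
bool demo_skip_hw_else(struct demo_key **key, bool drop_key, bool is_mgmt)
{
    bool skip_hw = false;

    if (drop_key)
        *key = NULL;
    else
        skip_hw = ((*key)->flags & DEMO_FLAG_SW_MGMT) && is_mgmt;
    return skip_hw;
}
```

Both variants return the same result for every input; the `else` form simply makes it structurally impossible to reach the flag read after the key has been cleared.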