Hello!
After upgrading a PowerPC system from Fedora 10 to Fedora 11, I started
getting a BUG on startup. I'm using a self-compiled kernel from
wireless-testing.git. The current source is affected, and so is an
older revision identified as 2.6.30-rc6-wl.
The BUG causes a long wait on startup (about 3 minutes). Perhaps it's
caused by udev waiting for something.
mac80211 is compiled into the kernel, but ath5k is not:
$ lsmod
Module Size Used by
hfs 52980 1
ath5k 130428 0
ath 7564 1 ath5k
The system has 2 Atheros devices:
01:02.0 Ethernet controller: Atheros Communications Inc. Atheros AR5001X+ Wireless Network Adapter (rev 01)
01:03.0 Network controller: Atheros Communications Inc. Atheros AR5001X+ Wireless Network Adapter (rev 01)
Here is the relevant part of the kernel log:
udevd version 127 started
ath5k 0000:01:02.0: enabling device (0014 -> 0016)
ath5k 0000:01:02.0: registered as 'phy0'
ath: EEPROM regdomain: 0x0
ath: EEPROM indicates default country code should be used
ath: doing EEPROM country->regdmn map search
ath: country maps to regdmn code: 0x3a
ath: Country alpha2 being used: US
ath: Regpair used: 0x3a
phy0: Selected rate control algorithm 'minstrel'
ath5k phy0: Atheros AR2414 chip found (MAC: 0x79, PHY: 0x45)
ath5k 0000:01:03.0: enabling device (0014 -> 0016)
ath5k 0000:01:03.0: registered as 'phy1'
ath: EEPROM regdomain: 0x0
ath: EEPROM indicates default country code should be used
ath: doing EEPROM country->regdmn map search
ath: country maps to regdmn code: 0x3a
ath: Country alpha2 being used: US
ath: Regpair used: 0x3a
cfg80211: Calling CRDA for country: US
phy1: Selected rate control algorithm 'minstrel'
ath5k phy1: Atheros AR5213A chip found (MAC: 0x59, PHY: 0x43)
ath5k phy1: RF2112B 2GHz radio found (0x46)
udev: renamed network interface wlan0 to wlan1
udev: renamed network interface wlan1_rename to wlan0
------------[ cut here ]------------
kernel BUG at /home/proski/src/linux-2.6/net/wireless/reg.c:2132!
Oops: Exception in kernel mode, sig: 5 [#1]
PowerMac
Modules linked in: ath5k ath [last unloaded: scsi_wait_scan]
NIP: c02f3eac LR: c02f3d08 CTR: 00000000
REGS: ef107aa0 TRAP: 0700 Not tainted (2.6.30-rc8-wl)
MSR: 00029032 <EE,ME,CE,IR,DR> CR: 88002442 XER: 20000000
TASK = ef84acb0[834] 'crda' THREAD: ef106000
GPR00: ef953840 ef107b50 ef84acb0 ef1380bc 00000006 c035a5c8 ef107b90 c035a5c8
GPR08: 00080005 efb68980 c0445628 ef130004 28002422 10019ce0 10012d3c 00000001
GPR16: 1070b2ac 00000005 48023558 1070b380 4802304c 00000000 ef107ddc c035a5c8
GPR24: ef107b78 c0443350 ef8bcb00 00000005 ef138080 c04a6a70 c04a0000 ef8bcb00
NIP [c02f3eac] set_regdom+0x4c4/0x4ec
LR [c02f3d08] set_regdom+0x320/0x4ec
Call Trace:
[ef107b50] [c02f3d08] set_regdom+0x320/0x4ec (unreliable)
[ef107b70] [c02f9d10] nl80211_set_reg+0x140/0x2d0
[ef107bc0] [c02aa2b8] genl_rcv_msg+0x204/0x228
[ef107c10] [c02a97cc] netlink_rcv_skb+0xe8/0x10c
[ef107c30] [c02aa094] genl_rcv+0x3c/0x5c
[ef107c40] [c02a9050] netlink_unicast+0x308/0x36c
[ef107c80] [c02a92bc] netlink_sendmsg+0x208/0x2f0
[ef107cd0] [c0282048] sock_sendmsg+0xac/0xe4
[ef107db0] [c02822b4] sys_sendmsg+0x234/0x2d8
[ef107f00] [c0283a88] sys_socketcall+0x108/0x258
[ef107f40] [c0012790] ret_from_syscall+0x0/0x38
--- Exception: c01 at 0xfa6b3b4
LR = 0xfb4b5d4
Instruction dump:
80690000 4bffc1e9 2c030000 4182003c 88a30039 88830038 3c60c03c 38632a18
4802c20d 4bfffed0 7f83e378 8403003c <0f000000> 7fe4fb78 4bffe6d1 7c7b1b79
---[ end trace c8eebdfbe3eb31de ]---
net/wireless/reg.c:2132 is:
BUG_ON(request_wiphy->regd);
An x86_64 system running Fedora 11 and the current wireless-testing
kernel does not exhibit this behavior.
--
Regards,
Pavel Roskin
On Mon, Jun 8, 2009 at 6:17 PM, Pavel Roskin <[email protected]> wrote:
> On Mon, 2009-06-08 at 19:19 -0400, Luis R. Rodriguez wrote:
>
>> Thank you for reporting this. Try booting the system without ath5k present
>> (mv ath5k.ko ath5k.ignore is an easy way), then, before loading it, get
>> 'iw event' running in a window.
>
> I'm sorry, but I managed to "fix" the problem without having a way to
> recreate it again. I emptied /etc/udev/rules.d/70-persistent-net.rules
> and removed /etc/sysconfig/network-scripts/ifcfg-wlan0, and the problem
> is gone, but I didn't preserve the original files, and I cannot make the
> crash happen again.
>
> Also, it turns out that pciutils failed to upgrade to the Fedora 11
> version, and "lspci -v" was crashing. After upgrading pciutils the BUG
> was gone, but downgrading it didn't make it reappear :-(
>
> I tried what you said, and that's what I get:
>
> # iw event
> phy #0: regulatory domain change: set to US by a driver request on phy0
> phy #1: regulatory domain change: set to US by a driver request on phy1
>
>>
>> - BUG_ON(request_wiphy->regd);
>> + /*
>> + * Userspace could have sent two replies with only
>> + * one kernel request.
>> + */
>> + if (request_wiphy->regd)
>> + return -EALREADY;
>
> I think it's a good idea. Stupid userspace shouldn't cause kernel bugs.
Agreed.
Luis
On Mon, Jun 08, 2009 at 06:59:48PM -0400, Pavel Roskin wrote:
> ------------[ cut here ]------------
> kernel BUG at /home/proski/src/linux-2.6/net/wireless/reg.c:2132!
> Oops: Exception in kernel mode, sig: 5 [#1]
> PowerMac
> Modules linked in: ath5k ath [last unloaded: scsi_wait_scan]
> NIP: c02f3eac LR: c02f3d08 CTR: 00000000
> REGS: ef107aa0 TRAP: 0700 Not tainted (2.6.30-rc8-wl)
> MSR: 00029032 <EE,ME,CE,IR,DR> CR: 88002442 XER: 20000000
> TASK = ef84acb0[834] 'crda' THREAD: ef106000
> GPR00: ef953840 ef107b50 ef84acb0 ef1380bc 00000006 c035a5c8 ef107b90 c035a5c8
> GPR08: 00080005 efb68980 c0445628 ef130004 28002422 10019ce0 10012d3c 00000001
> GPR16: 1070b2ac 00000005 48023558 1070b380 4802304c 00000000 ef107ddc c035a5c8
> GPR24: ef107b78 c0443350 ef8bcb00 00000005 ef138080 c04a6a70 c04a0000 ef8bcb00
> NIP [c02f3eac] set_regdom+0x4c4/0x4ec
> LR [c02f3d08] set_regdom+0x320/0x4ec
> Call Trace:
> [ef107b50] [c02f3d08] set_regdom+0x320/0x4ec (unreliable)
> [ef107b70] [c02f9d10] nl80211_set_reg+0x140/0x2d0
> [ef107bc0] [c02aa2b8] genl_rcv_msg+0x204/0x228
> [ef107c10] [c02a97cc] netlink_rcv_skb+0xe8/0x10c
> [ef107c30] [c02aa094] genl_rcv+0x3c/0x5c
> [ef107c40] [c02a9050] netlink_unicast+0x308/0x36c
> [ef107c80] [c02a92bc] netlink_sendmsg+0x208/0x2f0
> [ef107cd0] [c0282048] sock_sendmsg+0xac/0xe4
> [ef107db0] [c02822b4] sys_sendmsg+0x234/0x2d8
> [ef107f00] [c0283a88] sys_socketcall+0x108/0x258
> [ef107f40] [c0012790] ret_from_syscall+0x0/0x38
> --- Exception: c01 at 0xfa6b3b4
> LR = 0xfb4b5d4
> Instruction dump:
> 80690000 4bffc1e9 2c030000 4182003c 88a30039 88830038 3c60c03c 38632a18
> 4802c20d 4bfffed0 7f83e378 8403003c <0f000000> 7fe4fb78 4bffe6d1 7c7b1b79
> ---[ end trace c8eebdfbe3eb31de ]---
>
> net/wireless/reg.c:2132 is:
>
> BUG_ON(request_wiphy->regd);
Thank you for reporting this. Try booting the system without ath5k present
(mv ath5k.ko ath5k.ignore is an easy way), then, before loading it, get
'iw event' running in a window.
I'm pretty sure this is another case of userspace sending two replies
to a single kernel request. In such a case we should just drop the extra
reply, as follows. 'iw event' will tell us more.
diff --git a/net/wireless/reg.c b/net/wireless/reg.c
index ea4c299..5e14371 100644
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -2129,7 +2129,12 @@ static int __set_regdom(const struct ieee80211_regdomain *rd)
* driver wanted to the wiphy to deal with conflicts
*/
- BUG_ON(request_wiphy->regd);
+ /*
+ * Userspace could have sent two replies with only
+ * one kernel request.
+ */
+ if (request_wiphy->regd)
+ return -EALREADY;
r = reg_copy_regd(&request_wiphy->regd, rd);
if (r)
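
For illustration only, the intended behavior can be modeled in user
space roughly as below. This is a sketch, not kernel code: the struct
and function names (regdomain_model, wiphy_model, apply_regdom_model)
are made up for the example and do not exist in net/wireless/reg.c.
It only shows that a second reply for the same request gets -EALREADY
back instead of tripping a BUG_ON().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; not the real ones. */
struct regdomain_model {
    char alpha2[3];
};

struct wiphy_model {
    struct regdomain_model *regd; /* NULL until a domain has been applied */
};

/*
 * Apply rd to the wiphy. A duplicate reply finds regd already set and
 * is rejected with -EALREADY rather than treated as a fatal condition.
 */
int apply_regdom_model(struct wiphy_model *w, const struct regdomain_model *rd)
{
    assert(w != NULL && rd != NULL);

    if (w->regd)
        return -EALREADY;

    w->regd = malloc(sizeof(*w->regd));
    if (!w->regd)
        return -ENOMEM;
    memcpy(w->regd, rd, sizeof(*rd));
    return 0;
}
```

Calling apply_regdom_model() twice on the same wiphy_model returns 0
the first time and -EALREADY the second time; the caller owns the
allocated regd and must free it.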
On Mon, 2009-06-08 at 19:19 -0400, Luis R. Rodriguez wrote:
> Thank you for reporting this. Try booting the system without ath5k present
> (mv ath5k.ko ath5k.ignore is an easy way), then, before loading it, get
> 'iw event' running in a window.
I'm sorry, but I managed to "fix" the problem without having a way to
recreate it again. I emptied /etc/udev/rules.d/70-persistent-net.rules
and removed /etc/sysconfig/network-scripts/ifcfg-wlan0, and the problem
is gone, but I didn't preserve the original files, and I cannot make the
crash happen again.
Also, it turns out that pciutils failed to upgrade to the Fedora 11
version, and "lspci -v" was crashing. After upgrading pciutils the BUG
was gone, but downgrading it didn't make it reappear :-(
I tried what you said, and that's what I get:
# iw event
phy #0: regulatory domain change: set to US by a driver request on phy0
phy #1: regulatory domain change: set to US by a driver request on phy1
>
> - BUG_ON(request_wiphy->regd);
> + /*
> + * Userspace could have sent two replies with only
> + * one kernel request.
> + */
> + if (request_wiphy->regd)
> + return -EALREADY;
I think it's a good idea. Stupid userspace shouldn't cause kernel bugs.
--
Regards,
Pavel Roskin
On Mon, 2009-06-08 at 18:13 -0500, Larry Finger wrote:
> I don't know what is triggering the kernel BUG, but you have an error
> in your udev rules. The pertinent file is
> /etc/udev/rules.d/70-persistent-net.rules. Any rule that renames a
> wireless device should look like the following:
>
> SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", \
> ATTR{address}=="00:90:4b:d2:1f:cd", ATTR{type}=="1", \
> KERNEL=="wlan*", NAME="wlan0"
>
> The ATTR{address} should match the MAC address of the device, but the
> ATTR{type}=="1" is really important as it keeps the master device from
> being renamed, which is the usual cause of the presence of a name like
> wlanX_rename.
I actually removed 70-persistent-net.rules, but I think the
ATTR{type}=="1" match was there. The failed rename was due to
/etc/sysconfig/network-scripts/ifcfg-wlan0, which specified a MAC
address that must have conflicted with the udev rules.
--
Regards,
Pavel Roskin
Pavel Roskin wrote:
> Hello!
>
> After upgrading a PowerPC system from Fedora 10 to Fedora 11, I started
> getting a BUG on startup. I'm using a self-compiled kernel from
> wireless-testing.git. The current source is affected, and so is an
> older revision identified as 2.6.30-rc6-wl.
>
> The BUG causes a long wait on startup (about 3 minutes). Perhaps it's
> caused by udev waiting for something.
>
> mac80211 is compiled into the kernel, but ath5k is not:
>
> $ lsmod
> Module Size Used by
> hfs 52980 1
> ath5k 130428 0
> ath 7564 1 ath5k
>
> The system has 2 Atheros devices:
>
> 01:02.0 Ethernet controller: Atheros Communications Inc. Atheros AR5001X+ Wireless Network Adapter (rev 01)
> 01:03.0 Network controller: Atheros Communications Inc. Atheros AR5001X+ Wireless Network Adapter (rev 01)
>
> Here is the relevant part of the kernel log:
>
> udevd version 127 started
> ath5k 0000:01:02.0: enabling device (0014 -> 0016)
> ath5k 0000:01:02.0: registered as 'phy0'
> ath: EEPROM regdomain: 0x0
> ath: EEPROM indicates default country code should be used
> ath: doing EEPROM country->regdmn map search
> ath: country maps to regdmn code: 0x3a
> ath: Country alpha2 being used: US
> ath: Regpair used: 0x3a
> phy0: Selected rate control algorithm 'minstrel'
> ath5k phy0: Atheros AR2414 chip found (MAC: 0x79, PHY: 0x45)
> ath5k 0000:01:03.0: enabling device (0014 -> 0016)
> ath5k 0000:01:03.0: registered as 'phy1'
> ath: EEPROM regdomain: 0x0
> ath: EEPROM indicates default country code should be used
> ath: doing EEPROM country->regdmn map search
> ath: country maps to regdmn code: 0x3a
> ath: Country alpha2 being used: US
> ath: Regpair used: 0x3a
> cfg80211: Calling CRDA for country: US
> phy1: Selected rate control algorithm 'minstrel'
> ath5k phy1: Atheros AR5213A chip found (MAC: 0x59, PHY: 0x43)
> ath5k phy1: RF2112B 2GHz radio found (0x46)
> udev: renamed network interface wlan0 to wlan1
> udev: renamed network interface wlan1_rename to wlan0
I don't know what is triggering the kernel BUG, but you have an error
in your udev rules. The pertinent file is
/etc/udev/rules.d/70-persistent-net.rules. Any rule that renames a
wireless device should look like the following:
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", \
ATTR{address}=="00:90:4b:d2:1f:cd", ATTR{type}=="1", \
KERNEL=="wlan*", NAME="wlan0"
The ATTR{address} should match the MAC address of the device, but the
ATTR{type}=="1" is really important as it keeps the master device from
being renamed, which is the usual cause of the presence of a name like
wlanX_rename.
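
As a rough way to spot rules that lack this match, a check like the
one below could be run over each logical rule line (with the
backslash continuations already joined). It is a minimal sketch using
plain substring matching, not a udev rule parser, and the function
name wlan_rename_rule_has_type_check is made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true if a rule that renames a wlan* interface also restricts
 * itself to ATTR{type}=="1". Rules that do not rename wlan* pass.
 * Assumption: the rule is a single line with continuations joined.
 */
bool wlan_rename_rule_has_type_check(const char *rule)
{
    assert(rule != NULL);

    /* NAME=" (single '=') is an assignment, i.e. a rename. */
    bool renames_wlan = strstr(rule, "KERNEL==\"wlan*\"") != NULL &&
                        strstr(rule, "NAME=\"") != NULL;
    if (!renames_wlan)
        return true;

    return strstr(rule, "ATTR{type}==\"1\"") != NULL;
}
```

The rule shown above passes this check; the same rule with the
ATTR{type}=="1" match removed does not.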