2013-03-12 15:42:03

by Patrik, Kluba

[permalink] [raw]
Subject: bug: deadlock in rtl8192cu


Hi!

After killing a wpa_supplicant which was set up to connect to an
invalid (non-existent) SSID, it goes to 'D' state, and brings down
every process trying to do low-level network operations (anything
involving an rtnl_lock) to their knees. The same happens when instead
of killing wpa_supplicant, I do 'ifconfig wlan0 down'.

See the task state dump at the end. Unfortunately it's not very helpful
as there are a plenty of static functions called, and due to this a
couple of addresses are resolved to incorrect symbol names, which
makes the dump confusing. BTW this is on an ARM system.

The blocking usb_control_msg was actually called from
_usbctrl_vendorreq_sync_read(). See the following fragment:


do {
status = usb_control_msg(udev, pipe, request, reqtype, value,
index, pdata, len, 0); /*max. timeout*/
if (status < 0) {


Seems like the blocking is intentional (max timeout specified in the
comment), but I don't know why the transfer cannot finish.

The caller of _usbctrl_vendorreq_sync_read() is _DisableGPIO().


/* 1. Disable GPIO[7:0] */
rtl_write_word(rtlpriv, REG_GPIO_PIN_CTRL+2, 0x0000);
value32 = rtl_read_dword(rtlpriv, REG_GPIO_PIN_CTRL) & 0xFFFF00FF;


rtl_write_word() completes, and rtl_read_dword() is being blocked.

The caller of _DisableGPIO() is _CardDisableHWSM(), which was called
from rtl92cu_card_disable(), which was called from rtl_usb_stop().
rtl_usb_stop was called with rtnl_lock held.

This was first observed on a 3.2.34 kernel. Today I have tried
compat-wireless-02-22 on the same kernel, with the patches from OpenWrt,
but nothing changed. I have checked the wireless-next git tree, and
_usbctrl_vendorreq_sync_read() is the same.

I have no idea what can be the actual problem, but can do a bit of debugging and
information gathering if you need more, just tell me what should I do.
usb_control_msg() has not completed for at least 30 minutes now. As a quick
workaround, is it enough to set a timeout of say, 5 seconds in
usbctrl_vendorreq_sync_read() ? Could this cause problems at different places?

Regards,
Patrik

/ # cat /sys/class/net/wlan0/device/modalias
usb:v0BDAp8176d0200dc00dsc00dp00icFFiscFFipFF

T: Bus=02 Lev=02 Prnt=02 Port=02 Cnt=02 Dev#= 4 Spd=12 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=0bda ProdID=8176 Rev= 2.00
S: Manufacturer=Realtek
S: Product=802.11n WLAN Adapter
S: SerialNumber=00e04c000001
C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 4 Cls=ff(vend.) Sub=ff Prot=ff Driver=rtl8192cu
E: Ad=81(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms
E: Ad=84(I) Atr=03(Int.) MxPS= 64 Ivl=1ms

[ 707.022567] SysRq : Show Blocked State
[ 707.026155] task PC stack pid father
[ 707.026218] heartbeat D c040d5a4 0 1523 1 0x00000000
[ 707.026255] Backtrace:
[ 707.026339] [<c040d390>] (__schedule+0x0/0x350) from [<c040d7fc>] (schedule+0x50/0x68)
[ 707.026388] [<c040d7ac>] (schedule+0x0/0x68) from [<c040df64>] (__mutex_lock_slowpath+0x88/0xc8)
[ 707.026429] [<c040dedc>] (__mutex_lock_slowpath+0x0/0xc8) from [<c040dfec>] (mutex_lock+0x48/0x4c)
[ 707.026455] r8:c64fa900 r7:c78c682c r6:00000014 r5:00000014 r4:c64fa900
[ 707.026514] [<c040dfa4>] (mutex_lock+0x0/0x4c) from [<c03af49c>] (rtnetlink_rcv+0x18/0x34)
[ 707.026571] [<c03af484>] (rtnetlink_rcv+0x0/0x34) from [<c03b9d00>] (netlink_unicast+0x2c0/0x2d8)
[ 707.026595] r5:00000014 r4:c78c6800
[ 707.026632] [<c03b9a40>] (netlink_unicast+0x0/0x2d8) from [<c03ba610>] (netlink_sendmsg+0x2c4/0x314)
[ 707.026673] [<c03ba34c>] (netlink_sendmsg+0x0/0x314) from [<c038ecd4>] (sock_sendmsg+0x98/0xb8)
[ 707.026712] [<c038ec3c>] (sock_sendmsg+0x0/0xb8) from [<c038f49c>] (sys_sendto+0xc4/0x100)
[ 707.026732] r7:00000000 r6:41dff194 r5:c7115840 r4:00000014
[ 707.026807] [<c038f3d8>] (sys_sendto+0x0/0x100) from [<c010c820>] (ret_fast_syscall+0x0/0x30)

[ 707.026833] ifplugd D c040d5a4 0 755 1 0x00000000
[ 707.026860] Backtrace:
[ 707.026907] [<c040d390>] (__schedule+0x0/0x350) from [<c040d7fc>] (schedule+0x50/0x68)
[ 707.026946] [<c040d7ac>] (schedule+0x0/0x68) from [<c040df64>] (__mutex_lock_slowpath+0x88/0xc8)
[ 707.026986] [<c040dedc>] (__mutex_lock_slowpath+0x0/0xc8) from [<c040dfec>] (mutex_lock+0x48/0x4c)
[ 707.027008] r8:c0528008 r7:c05410cc r6:be9c3bc8 r5:c0f70000 r4:00008946
[ 707.027061] [<c040dfa4>] (mutex_lock+0x0/0x4c) from [<c03af47c>] (rtnl_lock+0x14/0x1c)
[ 707.027119] [<c03af468>] (rtnl_lock+0x0/0x1c) from [<c03a3c08>] (dev_ioctl+0x520/0x600)
[ 707.027158] [<c03a36e8>] (dev_ioctl+0x0/0x600) from [<c038e608>] (sock_ioctl+0x94/0x258)
[ 707.027190] [<c038e574>] (sock_ioctl+0x0/0x258) from [<c0195bac>] (do_vfs_ioctl+0x8c/0x36c)
[ 707.027210] r7:be9c3bc8 r6:00000003 r5:c6a6d500 r4:00008946
[ 707.027251] [<c0195b20>] (do_vfs_ioctl+0x0/0x36c) from [<c0195ecc>] (sys_ioctl+0x40/0x68)
[ 707.027270] r7:c6a6d500 r6:00008946 r5:be9c3bc8 r4:00000003
[ 707.027328] [<c0195e8c>] (sys_ioctl+0x0/0x68) from [<c010c820>] (ret_fast_syscall+0x0/0x30)
[ 707.027350] r7:00000036 r6:00000001 r5:0006a677 r4:00000001

[ 707.027387] wpa_supplicant D c040d5a4 0 901 1 0x00000000
[ 707.027412] Backtrace:
[ 707.027449] [<c040d390>] (__schedule+0x0/0x350) from [<c040d7fc>] (schedule+0x50/0x68)
[ 707.027484] [<c040d7ac>] (schedule+0x0/0x68) from [<c040daf8>] (schedule_timeout+0x154/0x1a0)
[ 707.027523] [<c040d9a4>] (schedule_timeout+0x0/0x1a0) from [<c040d214>] (wait_for_common+0x104/0x13c)
[ 707.027546] r7:c65b2000 r6:c65b3c70 r5:7fffffff r4:c65b3c74
[ 707.027589] [<c040d110>] (wait_for_common+0x0/0x13c) from [<c040d2d8>] (wait_for_completion_timeout+0x14/0x18)
[ 707.027630] [<c040d2c4>] (wait_for_completion_timeout+0x0/0x18) from [<c031f84c>] (usb_start_wait_urb+0x64/0xcc)
[ 707.027668] [<c031f7e8>] (usb_start_wait_urb+0x0/0xcc) from [<c031fab0>] (usb_control_msg+0xd4/0xf8)
[ 707.027689] r8:000000c0 r7:00000000 r6:00000044 r5:00000004 r4:c0f9dae0
[ 707.027782] [<c031f9dc>] (usb_control_msg+0x0/0xf8) from [<bf14f240>] (_usb_writeN_sync+0x140/0x1bc [rtlwifi])
[ 707.027849] [<bf14f190>] (_usb_writeN_sync+0x90/0x1bc [rtlwifi]) from [<bf14f2d0>] (_usb_read32_sync+0x14/0x18 [rtlwifi])
[ 707.027923] [<bf14f2bc>] (_usb_read32_sync+0x0/0x18 [rtlwifi]) from [<bf166e70>] (rtl92cu_enable_hw_security_config+0x444/0xbdc [rtl8192cu])
[ 707.027983] [<bf166e28>] (rtl92cu_enable_hw_security_config+0x3fc/0xbdc [rtl8192cu]) from [<bf169abc>] (rtl92cu_card_disable+0x8c/0xa44 [rtl8192cu])
[ 707.028013] r5:c0ed8aa0 r4:c0ed9760
[ 707.028070] [<bf169a30>] (rtl92cu_card_disable+0x0/0xa44 [rtl8192cu]) from [<bf14db9c>] (rtl_usb_stop+0x44/0x4c [rtlwifi])
[ 707.028096] r7:c0ed8aa0 r6:c0ed9760 r5:c0ed8aa0 r4:c0ed9760
[ 707.028154] [<bf14db58>] (rtl_usb_stop+0x0/0x4c [rtlwifi]) from [<bf149960>] (rtl_op_stop+0x7c/0xa4 [rtlwifi])
[ 707.028178] r5:c0ed9794 r4:00000000
[ 707.028321] [<bf1498e4>] (rtl_op_stop+0x0/0xa4 [rtlwifi]) from [<bf0c5a50>] (ieee80211_stop_device+0x48/0x80 [mac80211])
[ 707.028348] r7:c78e5340 r6:c78e5968 r5:00000000 r4:c0ed8aa0
[ 707.028479] [<bf0c5a08>] (ieee80211_stop_device+0x0/0x80 [mac80211]) from [<bf0b364c>] (ieee80211_recalc_idle+0x448/0x838 [mac80211])
[ 707.028507] r5:00000000 r4:00000060
[ 707.028598] [<bf0b333c>] (ieee80211_recalc_idle+0x138/0x838 [mac80211]) from [<bf0b3ab8>] (ieee80211_stop+0x18/0x20 [mac80211])
[ 707.028687] [<bf0b3aa0>] (ieee80211_stop+0x0/0x20 [mac80211]) from [<c03a1730>] (__dev_close_many+0x90/0xd8)
[ 707.028733] [<c03a16a0>] (__dev_close_many+0x0/0xd8) from [<c03a17a8>] (__dev_close+0x30/0x48)
[ 707.028755] r5:c78e5000 r4:00000001
[ 707.028795] [<c03a1778>] (__dev_close+0x0/0x48) from [<c039f3e0>] (__dev_change_flags+0x84/0x148)
[ 707.028837] [<c039f35c>] (__dev_change_flags+0x0/0x148) from [<c03a2184>] (dev_change_flags+0x18/0x50)
[ 707.028859] r7:c05410cc r6:bec2abf0 r5:00001003 r4:c78e5000
[ 707.028928] [<c03a216c>] (dev_change_flags+0x0/0x50) from [<c03ee03c>] (devinet_ioctl+0x758/0x790)
[ 707.028952] r7:c05410cc r6:bec2abf0 r5:00000000 r4:c65b2000
[ 707.029000] [<c03ed8e4>] (devinet_ioctl+0x0/0x790) from [<c03ef0ec>] (inet_ioctl+0x1ac/0x1c4)
[ 707.029041] [<c03eef40>] (inet_ioctl+0x0/0x1c4) from [<c038e5e8>] (sock_ioctl+0x74/0x258)
[ 707.029078] [<c038e574>] (sock_ioctl+0x0/0x258) from [<c0195bac>] (do_vfs_ioctl+0x8c/0x36c)
[ 707.029097] r7:bec2abf0 r6:00000004 r5:c783d180 r4:00008914
[ 707.029136] [<c0195b20>] (do_vfs_ioctl+0x0/0x36c) from [<c0195ecc>] (sys_ioctl+0x40/0x68)
[ 707.029155] r7:c783d180 r6:00008914 r5:bec2abf0 r4:00000004
[ 707.029224] [<c0195e8c>] (sys_ioctl+0x0/0x68) from [<c010c820>] (ret_fast_syscall+0x0/0x30)
[ 707.029246] r7:00000036 r6:00000000 r5:018655c8 r4:018655b8

[ 707.029277] ifconfig D c040d5a4 0 1522 1521 0x00000000
[ 707.029303] Backtrace:
[ 707.029352] [<c040d390>] (__schedule+0x0/0x350) from [<c040d7fc>] (schedule+0x50/0x68)
[ 707.029392] [<c040d7ac>] (schedule+0x0/0x68) from [<c040df64>] (__mutex_lock_slowpath+0x88/0xc8)
[ 707.029432] [<c040dedc>] (__mutex_lock_slowpath+0x0/0xc8) from [<c040dfec>] (mutex_lock+0x48/0x4c)
[ 707.029455] r8:00008916 r7:c05410cc r6:bedbbb50 r5:00000000 r4:c78d4000
[ 707.029516] [<c040dfa4>] (mutex_lock+0x0/0x4c) from [<c03af47c>] (rtnl_lock+0x14/0x1c)
[ 707.029572] [<c03af468>] (rtnl_lock+0x0/0x1c) from [<c03eda08>] (devinet_ioctl+0x124/0x790)
[ 707.029616] [<c03ed8e4>] (devinet_ioctl+0x0/0x790) from [<c03ef0ec>] (inet_ioctl+0x1ac/0x1c4)
[ 707.029651] [<c03eef40>] (inet_ioctl+0x0/0x1c4) from [<c038e5e8>] (sock_ioctl+0x74/0x258)
[ 707.029684] [<c038e574>] (sock_ioctl+0x0/0x258) from [<c0195bac>] (do_vfs_ioctl+0x8c/0x36c)
[ 707.029707] r7:bedbbb50 r6:00000004 r5:c64b0c80 r4:00008916
[ 707.029747] [<c0195b20>] (do_vfs_ioctl+0x0/0x36c) from [<c0195ecc>] (sys_ioctl+0x40/0x68)
[ 707.029768] r7:c64b0c80 r6:00008916 r5:bedbbb50 r4:00000004
[ 707.029823] [<c0195e8c>] (sys_ioctl+0x0/0x68) from [<c010c820>] (ret_fast_syscall+0x0/0x30)
[ 707.029843] r7:00000036 r6:000001c3 r5:00000015 r4:bedbbb50

--
Patrik KLUBA
Software Developer at

DENSION Audio Systems Ltd.
H-1116 Budapest, Sztregova u. 1
Phone: +36 1 463 0470
Fax: +36 1 463 0479
Web: http://www.dension.com


2013-03-12 16:35:01

by Larry Finger

[permalink] [raw]
Subject: Re: bug: deadlock in rtl8192cu

On 03/12/2013 10:30 AM, Patrik, Kluba wrote:
>
> Hi!
>
> After killing a wpa_supplicant which was set up to connect to an
> invalid (non-existent) SSID, it goes to 'D' state, and brings down
> every process trying to do low-level network operations (anything
> involving an rtnl_lock) to their knees. The same happens when instead
> of killing wpa_supplicant, I do 'ifconfig wlan0 down'.
>
> See the task state dump at the end. Unfortunately it's not very helpful
> as there are a plenty of static functions called, and due to this a
> couple of addresses are resolved to incorrect symbol names, which
> makes the dump confusing. BTW this is on an ARM system.
>
> The blocking usb_control_msg was actually called from
> _usbctrl_vendorreq_sync_read(). See the following fragment:
>
>
> do {
> status = usb_control_msg(udev, pipe, request, reqtype, value,
> index, pdata, len, 0); /*max. timeout*/
> if (status < 0) {
>
>
> Seems like the blocking is intentional (max timeout specified in the
> comment), but I don't know why the transfer cannot finish.
>
> The caller of _usbctrl_vendorreq_sync_read() is _DisableGPIO().
>
>
> /* 1. Disable GPIO[7:0] */
> rtl_write_word(rtlpriv, REG_GPIO_PIN_CTRL+2, 0x0000);
> value32 = rtl_read_dword(rtlpriv, REG_GPIO_PIN_CTRL) & 0xFFFF00FF;
>
>
> rtl_write_word() completes, and rtl_read_dword() is being blocked.
>
> The caller of _DisableGPIO() is _CardDisableHWSM(), which was called
> from rtl92cu_card_disable(), which was called from rtl_usb_stop().
> rtl_usb_stop was called with rtnl_lock held.
>
> This was first observed on a 3.2.34 kernel. Today I have tried
> compat-wireless-02-22 on the same kernel, with the patches from OpenWrt,
> but nothing changed. I have checked the wireless-next git tree, and
> _usbctrl_vendorreq_sync_read() is the same.
>
> I have no idea what can be the actual problem, but can do a bit of debugging and
> information gathering if you need more, just tell me what should I do.
> usb_control_msg() has not completed for at least 30 minutes now. As a quick
> workaround, is it enough to set a timeout of say, 5 seconds in
> usbctrl_vendorreq_sync_read() ? Could this cause problems at different places?

I do not think that a timeout of 5 seconds would cause any problems. Any USB I/O
that has not completed in that time should return an error status.

Please try it with

status = usb_control_msg(udev, pipe, request, reqtype, value,
index, pdata, len, USB_CTRL_SET_TIMEOUT);

That symbol is set to 5000 (milliseconds).

Let me know if that helps. I have not seen this problem on x86 or ppc
architecture. Perhaps these are fundamentally different than ARM.

Larry



2013-03-13 15:13:05

by Larry Finger

[permalink] [raw]
Subject: Re: bug: deadlock in rtl8192cu

On 03/13/2013 09:25 AM, Patrik, Kluba wrote:
> On Tue, 12 Mar 2013 11:34:57 -0500
> Larry Finger <[email protected]> wrote:
>
>>
>> Please try it with
>>
>> status = usb_control_msg(udev, pipe, request, reqtype, value,
>> index, pdata, len, USB_CTRL_SET_TIMEOUT);
>>
>> That symbol is set to 5000 (milliseconds).
>>
>> Let me know if that helps. I have not seen this problem on x86 or ppc
>> architecture. Perhaps these are fundamentally different than ARM.
>>
>> Larry
>>
>>
>
> Well, at least it avoids the deadlock, but the device is unusable until
> a power cycle has been done. Even scanning reports no results. All I
> can see after an ifconfig wlan0 down + ifconfig wlan0 up is:
>
> [ 29.412736] rtl8192cu: MAC auto ON okay!
> [ 29.979279] rtl8192cu: Tx queue select: 0x05
>
> rmmod + modprobe does not help also.
>
> I have turned on lock debugging in the hope of catching something, and
> a 'sleeping in invalid context' has turned up at a different place.
>
> [ 35.821233] wlan0: RX AssocResp from xx:xx:xx:xx:xx:xx (capab=0x431 status=0 aid=9)
> [ 35.852506] wlan0: associated
> [ 37.857611] BUG: sleeping function called from invalid context at mm/dmapool.c:315
> [ 37.857663] in_atomic(): 0, irqs_disabled(): 0, pid: 695, name: kworker/0:2
> [ 37.857697] 3 locks held by kworker/0:2/695:
> [ 37.857718] #0: (rtlpriv->cfg->name){.+.+..}, at: [<c013cd24>] process_one_work+0x1cc/0x3f8
> [ 37.857810] #1: ((&(&rtlpriv->works.watchdog_wq)->work)){+.+...}, at: [<c013cd24>] process_one_work+0x1cc/0x3f8
> [ 37.857884] #2: (rcu_read_lock){.+.+..}, at: [<bf0eac38>] rtl92c_dm_dynamic_txpower+0x1a0/0xfac [rtl8192c_common]
> [ 37.857978] Backtrace:
> [ 37.858039] [<c010fe28>] (dump_backtrace+0x0/0xfc) from [<c041b64c>] (dump_stack+0x18/0x1c)
> [ 37.858070] r7:0000013b r6:c04dd1e6 r5:00000000 r4:c6d5a000
> [ 37.858153] [<c041b634>] (dump_stack+0x0/0x1c) from [<c012067c>] (__might_sleep+0x19c/0x1d4)
> [ 37.858219] [<c01204e0>] (__might_sleep+0x0/0x1d4) from [<c01a0c5c>] (dma_pool_alloc+0x30/0x17c)
> [ 37.858252] r7:c6d08c80 r6:c6a39f00 r5:c671ee20 r4:00000000
> [ 37.858351] [<c01a0c2c>] (dma_pool_alloc+0x0/0x17c) from [<c0334f04>] (td_alloc+0x1c/0x48)
> [ 37.858405] [<c0334ee8>] (td_alloc+0x0/0x48) from [<c03352e0>] (ohci_urb_enqueue+0x11c/0x260)
> [ 37.858620] r4:00000000
> [ 37.858700] [<c03351c4>] (ohci_urb_enqueue+0x0/0x260) from [<c03205f8>] (usb_hcd_submit_urb+0xac/0x138)
> [ 37.858751] [<c032054c>] (usb_hcd_submit_urb+0x0/0x138) from [<c0321030>] (usb_submit_urb+0x2b0/0x2cc)
> [ 37.858783] r9:c6cbe000 r8:c6d5bd1c r7:00000010 r6:00000000 r5:c6cbe000
> [ 37.858839] r4:c6cbe038
> [ 37.858879] [<c0320d80>] (usb_submit_urb+0x0/0x2cc) from [<c0322264>] (usb_start_wait_urb+0x54/0xdc)
> [ 37.858908] r7:00001388 r6:c6d08c80 r5:00000000 r4:c6d5bcb4
> [ 37.858978] [<c0322210>] (usb_start_wait_urb+0x0/0xdc) from [<c032247c>] (usb_internal_control_msg+0x6c/0x80)
> [ 37.859009] r8:000000c0 r7:80000480 r6:c6cbe000 r5:c6059820 r4:c671ef20
> [ 37.859090] [<c0322410>] (usb_internal_control_msg+0x0/0x80) from [<c032252c>] (usb_control_msg+0x9c/0xb8)
> [ 37.859118] r7:00000000 r6:00000444 r5:00000004 r4:c671ef20
> [ 37.859218] [<c0322490>] (usb_control_msg+0x0/0xb8) from [<bf0ff240>] (_usb_writeN_sync+0xfc/0x200 [rtlwifi])
> [ 37.859290] [<bf0ff1d4>] (_usb_writeN_sync+0x90/0x200 [rtlwifi]) from [<bf0ff334>] (_usb_writeN_sync+0x1f0/0x200 [rtlwifi])
> [ 37.859359] [<bf0ff2b0>] (_usb_writeN_sync+0x16c/0x200 [rtlwifi]) from [<bf0ff358>] (_usb_read32_sync+0x14/0x18 [rtlwifi])
> [ 37.859391] r8:c6088d40 r7:00001f05 r6:00000000 r5:00000000 r4:c608a160
> [ 37.859608] [<bf0ff344>] (_usb_read32_sync+0x0/0x18 [rtlwifi]) from [<bf116098>] (rtl92cu_update_hal_rate_table+0x158/0x17c [rtl8192cu])
> [ 37.859684] [<bf115f40>] (rtl92cu_update_hal_rate_table+0x0/0x17c [rtl8192cu]) from [<bf0eac98>] (rtl92c_dm_dynamic_txpower+0x200/0xfac [rtl8192c_common])
> [ 37.859720] r7:00001f05 r6:c608a160 r5:00000001 r4:00000000
> [ 37.859802] [<bf0eab84>] (rtl92c_dm_dynamic_txpower+0xec/0xfac [rtl8192c_common]) from [<bf0ebb20>] (rtl92c_dm_watchdog+0xc8/0x708 [rtl8192c_common])
> [ 37.859869] [<bf0eba58>] (rtl92c_dm_watchdog+0x0/0x708 [rtl8192c_common]) from [<bf0f7334>] (rtl_watchdog_wq_callback+0x2ac/0x2f0 [rtlwifi])
> [ 37.859902] r6:c608c51c r5:00000020 r4:c608c4e0
> [ 37.859982] [<bf0f7088>] (rtl_watchdog_wq_callback+0x0/0x2f0 [rtlwifi]) from [<c013cda8>] (process_one_work+0x250/0x3f8)
> [ 37.860033] [<c013cb58>] (process_one_work+0x0/0x3f8) from [<c013d36c>] (worker_thread+0x148/0x23c)
> [ 37.860090] [<c013d224>] (worker_thread+0x0/0x23c) from [<c0142c78>] (kthread+0x98/0xa4)
> [ 37.860141] [<c0142be0>] (kthread+0x0/0xa4) from [<c012a0a0>] (do_exit+0x0/0x2cc)
> [ 37.860168] r7:00000013 r6:c012a0a0 r5:c0142be0 r4:c7881e78
>
> If I have tracked it down correctly, the problem is with the following
> segment from rtl92c_dm_refresh_rate_adaptive_mask():
>
> rcu_read_lock();
> sta = ieee80211_find_sta(mac->vif, mac->bssid);
> rtlpriv->cfg->ops->update_rate_tbl(hw, sta, p_ra->ratr_state);
> p_ra->pre_ratr_state = p_ra->ratr_state;
> rcu_read_unlock();
>
> (again from compat-wireless-02-22, but wireless-next has the same)
>
> According to http://lwn.net/Articles/37889/ no sleeping functions
> should be called inside an rcu_read_lock() region. No sleeping can
> not be guaranteed for USB transfers.
> The comment for ieee80211_find_sta() says that the returned pointer
> is only valid under RCU lock, which leads to an interesting situation.

I think that is the problem that was fixed in wireless-testing commit
664899786cb4. In that case, we got a scheduling while atomic when the debug
level was 3 or higher. Check routine rtl92cu_update_hal_rate_table() to see in
the following statement is the last one in that routine.

RT_TRACE(rtlpriv, COMP_RATR, DBG_DMESG, "%x\n",
rtl_read_dword(rtlpriv, REG_ARFR0));

The patch in question removed that RT_TRACE statement.

Yesterday, Jussi Kivilinna and I found a problem that prevented rtl8192cu from
reconnecting once it disconnected. That patch is attached.

Larry


Attachments:
01-rtl8192cu_set_network_type_with_new_set_check_bssid.patch (5.56 kB)

2013-03-13 14:25:11

by Patrik, Kluba

[permalink] [raw]
Subject: Re: bug: deadlock in rtl8192cu

On Tue, 12 Mar 2013 11:34:57 -0500
Larry Finger <[email protected]> wrote:

>
> Please try it with
>
> status = usb_control_msg(udev, pipe, request, reqtype, value,
> index, pdata, len, USB_CTRL_SET_TIMEOUT);
>
> That symbol is set to 5000 (milliseconds).
>
> Let me know if that helps. I have not seen this problem on x86 or ppc
> architecture. Perhaps these are fundamentally different than ARM.
>
> Larry
>
>

Well, at least it avoids the deadlock, but the device is unusable until
a power cycle has been done. Even scanning reports no results. All I
can see after an ifconfig wlan0 down + ifconfig wlan0 up is:

[ 29.412736] rtl8192cu: MAC auto ON okay!
[ 29.979279] rtl8192cu: Tx queue select: 0x05

rmmod + modprobe does not help also.

I have turned on lock debugging in the hope of catching something, and
a 'sleeping in invalid context' has turned up at a different place.

[ 35.821233] wlan0: RX AssocResp from xx:xx:xx:xx:xx:xx (capab=0x431 status=0 aid=9)
[ 35.852506] wlan0: associated
[ 37.857611] BUG: sleeping function called from invalid context at mm/dmapool.c:315
[ 37.857663] in_atomic(): 0, irqs_disabled(): 0, pid: 695, name: kworker/0:2
[ 37.857697] 3 locks held by kworker/0:2/695:
[ 37.857718] #0: (rtlpriv->cfg->name){.+.+..}, at: [<c013cd24>] process_one_work+0x1cc/0x3f8
[ 37.857810] #1: ((&(&rtlpriv->works.watchdog_wq)->work)){+.+...}, at: [<c013cd24>] process_one_work+0x1cc/0x3f8
[ 37.857884] #2: (rcu_read_lock){.+.+..}, at: [<bf0eac38>] rtl92c_dm_dynamic_txpower+0x1a0/0xfac [rtl8192c_common]
[ 37.857978] Backtrace:
[ 37.858039] [<c010fe28>] (dump_backtrace+0x0/0xfc) from [<c041b64c>] (dump_stack+0x18/0x1c)
[ 37.858070] r7:0000013b r6:c04dd1e6 r5:00000000 r4:c6d5a000
[ 37.858153] [<c041b634>] (dump_stack+0x0/0x1c) from [<c012067c>] (__might_sleep+0x19c/0x1d4)
[ 37.858219] [<c01204e0>] (__might_sleep+0x0/0x1d4) from [<c01a0c5c>] (dma_pool_alloc+0x30/0x17c)
[ 37.858252] r7:c6d08c80 r6:c6a39f00 r5:c671ee20 r4:00000000
[ 37.858351] [<c01a0c2c>] (dma_pool_alloc+0x0/0x17c) from [<c0334f04>] (td_alloc+0x1c/0x48)
[ 37.858405] [<c0334ee8>] (td_alloc+0x0/0x48) from [<c03352e0>] (ohci_urb_enqueue+0x11c/0x260)
[ 37.858620] r4:00000000
[ 37.858700] [<c03351c4>] (ohci_urb_enqueue+0x0/0x260) from [<c03205f8>] (usb_hcd_submit_urb+0xac/0x138)
[ 37.858751] [<c032054c>] (usb_hcd_submit_urb+0x0/0x138) from [<c0321030>] (usb_submit_urb+0x2b0/0x2cc)
[ 37.858783] r9:c6cbe000 r8:c6d5bd1c r7:00000010 r6:00000000 r5:c6cbe000
[ 37.858839] r4:c6cbe038
[ 37.858879] [<c0320d80>] (usb_submit_urb+0x0/0x2cc) from [<c0322264>] (usb_start_wait_urb+0x54/0xdc)
[ 37.858908] r7:00001388 r6:c6d08c80 r5:00000000 r4:c6d5bcb4
[ 37.858978] [<c0322210>] (usb_start_wait_urb+0x0/0xdc) from [<c032247c>] (usb_internal_control_msg+0x6c/0x80)
[ 37.859009] r8:000000c0 r7:80000480 r6:c6cbe000 r5:c6059820 r4:c671ef20
[ 37.859090] [<c0322410>] (usb_internal_control_msg+0x0/0x80) from [<c032252c>] (usb_control_msg+0x9c/0xb8)
[ 37.859118] r7:00000000 r6:00000444 r5:00000004 r4:c671ef20
[ 37.859218] [<c0322490>] (usb_control_msg+0x0/0xb8) from [<bf0ff240>] (_usb_writeN_sync+0xfc/0x200 [rtlwifi])
[ 37.859290] [<bf0ff1d4>] (_usb_writeN_sync+0x90/0x200 [rtlwifi]) from [<bf0ff334>] (_usb_writeN_sync+0x1f0/0x200 [rtlwifi])
[ 37.859359] [<bf0ff2b0>] (_usb_writeN_sync+0x16c/0x200 [rtlwifi]) from [<bf0ff358>] (_usb_read32_sync+0x14/0x18 [rtlwifi])
[ 37.859391] r8:c6088d40 r7:00001f05 r6:00000000 r5:00000000 r4:c608a160
[ 37.859608] [<bf0ff344>] (_usb_read32_sync+0x0/0x18 [rtlwifi]) from [<bf116098>] (rtl92cu_update_hal_rate_table+0x158/0x17c [rtl8192cu])
[ 37.859684] [<bf115f40>] (rtl92cu_update_hal_rate_table+0x0/0x17c [rtl8192cu]) from [<bf0eac98>] (rtl92c_dm_dynamic_txpower+0x200/0xfac [rtl8192c_common])
[ 37.859720] r7:00001f05 r6:c608a160 r5:00000001 r4:00000000
[ 37.859802] [<bf0eab84>] (rtl92c_dm_dynamic_txpower+0xec/0xfac [rtl8192c_common]) from [<bf0ebb20>] (rtl92c_dm_watchdog+0xc8/0x708 [rtl8192c_common])
[ 37.859869] [<bf0eba58>] (rtl92c_dm_watchdog+0x0/0x708 [rtl8192c_common]) from [<bf0f7334>] (rtl_watchdog_wq_callback+0x2ac/0x2f0 [rtlwifi])
[ 37.859902] r6:c608c51c r5:00000020 r4:c608c4e0
[ 37.859982] [<bf0f7088>] (rtl_watchdog_wq_callback+0x0/0x2f0 [rtlwifi]) from [<c013cda8>] (process_one_work+0x250/0x3f8)
[ 37.860033] [<c013cb58>] (process_one_work+0x0/0x3f8) from [<c013d36c>] (worker_thread+0x148/0x23c)
[ 37.860090] [<c013d224>] (worker_thread+0x0/0x23c) from [<c0142c78>] (kthread+0x98/0xa4)
[ 37.860141] [<c0142be0>] (kthread+0x0/0xa4) from [<c012a0a0>] (do_exit+0x0/0x2cc)
[ 37.860168] r7:00000013 r6:c012a0a0 r5:c0142be0 r4:c7881e78

If I have tracked it down correctly, the problem is with the following
segment from rtl92c_dm_refresh_rate_adaptive_mask():

rcu_read_lock();
sta = ieee80211_find_sta(mac->vif, mac->bssid);
rtlpriv->cfg->ops->update_rate_tbl(hw, sta, p_ra->ratr_state);
p_ra->pre_ratr_state = p_ra->ratr_state;
rcu_read_unlock();

(again from compat-wireless-02-22, but wireless-next has the same)

According to http://lwn.net/Articles/37889/ no sleeping functions
should be called inside an rcu_read_lock() region. No sleeping can
not be guaranteed for USB transfers.
The comment for ieee80211_find_sta() says that the returned pointer
is only valid under RCU lock, which leads to an interesting situation.

Regards,
Patrik

--
Patrik KLUBA
Software Developer at

DENSION Audio Systems Ltd.
H-1116 Budapest, Sztregova u. 1
Phone: +36 1 463 0470
Fax: +36 1 463 0479
Web: http://www.dension.com

2013-03-13 15:51:49

by Patrik, Kluba

[permalink] [raw]
Subject: Re: bug: deadlock in rtl8192cu

On Wed, 13 Mar 2013 10:13:01 -0500
Larry Finger <[email protected]> wrote:

>
> I think that is the problem that was fixed in wireless-testing commit
> 664899786cb4. In that case, we got a scheduling while atomic when the
> debug level was 3 or higher. Check routine
> rtl92cu_update_hal_rate_table() to see in the following statement is
> the last one in that routine.
>
> RT_TRACE(rtlpriv, COMP_RATR, DBG_DMESG, "%x\n",
> rtl_read_dword(rtlpriv, REG_ARFR0));
>
> The patch in question removed that RT_TRACE statement.
>

Yep, that helped.

> Yesterday, Jussi Kivilinna and I found a problem that prevented
> rtl8192cu from reconnecting once it disconnected. That patch is
> attached.

And this one too, thanks.

Regards,
Patrik

--
Patrik KLUBA
Software Developer at

DENSION Audio Systems Ltd.
H-1116 Budapest, Sztregova u. 1
Phone: +36 1 463 0470
Fax: +36 1 463 0479
Web: http://www.dension.com

2013-03-13 15:11:25

by Patrik, Kluba

[permalink] [raw]
Subject: Re: bug: deadlock in rtl8192cu

On Wed, 13 Mar 2013 15:25:05 +0100
"Patrik, Kluba" <[email protected]> wrote:

> According to http://lwn.net/Articles/37889/ no sleeping functions
> should be called inside an rcu_read_lock() region. No sleeping can
> not be guaranteed for USB transfers.

To correct myself: for synchronous USB transfers. usb_submit_urb() and
friends are safe. Rewriting things in rtlwifi to do async transfers is
a tremendous job, and might not help at all as the termination of the
callback has still be waited under RCU.

But how about something like this?


rcu_read_lock();
sta = ...
increase_reference_count();
rcu_read_unlock();

... do all the might-sleeping things here ...

decrease_reference_count_and_call_callback_if_necessary();



I have found an example for rcu_read_lock() + kref_get() here:
https://lists.open-mesh.org/pipermail/b.a.t.m.a.n/2011-February/004234.html

Regards,
Patrik

--
Patrik KLUBA
Software Developer at

DENSION Audio Systems Ltd.
H-1116 Budapest, Sztregova u. 1
Phone: +36 1 463 0470
Fax: +36 1 463 0479
Web: http://www.dension.com

2013-03-12 15:42:03

by Patrik, Kluba

[permalink] [raw]
Subject: Re: bug: deadlock in rtl8192cu

On Tue, 12 Mar 2013 16:30:20 +0100
"Patrik, Kluba" <[email protected]> wrote:

>
> Hi!
>
> After killing a wpa_supplicant which was set up to connect to an

Forgot to tell that WPA2 was enabled, if that counts.

Regards,
Patrik

2013-03-13 15:26:35

by John W. Linville

[permalink] [raw]
Subject: Re: bug: deadlock in rtl8192cu

On Wed, Mar 13, 2013 at 10:13:01AM -0500, Larry Finger wrote:

> Yesterday, Jussi Kivilinna and I found a problem that prevented
> rtl8192cu from reconnecting once it disconnected. That patch is
> attached.
>
> Larry

Is this your patch submission? Or will you be posting separately
(and with a clearer subject/intent)? :-)

> The driver was failing to clear the BSSID when a disconnect happened. That
> prevented a reconnection. This problem is reported at
> https://bugzilla.redhat.com/show_bug.cgi?id=789605,
> https://bugzilla.redhat.com/show_bug.cgi?id=866786,
> https://bugzilla.redhat.com/show_bug.cgi?id=906734, and
> https://bugzilla.kernel.org/show_bug.cgi?id=46171.
>
> Thanks to Jussi Kivilinna for making the critical observation
> that led to the solution.
>
> Reported-by: Jussi Kivilinna <[email protected]>
> Tested-by: Jussi Kivilinna <[email protected]>
> Signed-off-by: Larry Finger <[email protected]>
> Cc: Stable <[email protected]>
> ---
>
> John,
>
> As you can see by the number of bug reports, this patch should be
> pushed as soon as possible.
>
> Thanks,
>
> Larry
> ---
>
> base.h | 3 +
> pci.c | 2 -
> rtl8192cu/hw.c | 87 ++++++++++++++++++++++-----------------------------------
> 3 files changed, 39 insertions(+), 53 deletions(-)
>
> Index: linux-2.6/drivers/net/wireless/rtlwifi/rtl8192cu/hw.c
> ===================================================================
> --- linux-2.6.orig/drivers/net/wireless/rtlwifi/rtl8192cu/hw.c
> +++ linux-2.6/drivers/net/wireless/rtlwifi/rtl8192cu/hw.c
> @@ -1377,74 +1377,57 @@ void rtl92cu_card_disable(struct ieee802
>
> void rtl92cu_set_check_bssid(struct ieee80211_hw *hw, bool check_bssid)
> {
> - /* dummy routine needed for callback from rtl_op_configure_filter() */
> -}
> -
> -/*========================================================================== */
> -
> -static void _rtl92cu_set_check_bssid(struct ieee80211_hw *hw,
> - enum nl80211_iftype type)
> -{
> struct rtl_priv *rtlpriv = rtl_priv(hw);
> - u32 reg_rcr = rtl_read_dword(rtlpriv, REG_RCR);
> struct rtl_hal *rtlhal = rtl_hal(rtlpriv);
> - struct rtl_phy *rtlphy = &(rtlpriv->phy);
> - u8 filterout_non_associated_bssid = false;
> + u32 reg_rcr = rtl_read_dword(rtlpriv, REG_RCR);
>
> - switch (type) {
> - case NL80211_IFTYPE_ADHOC:
> - case NL80211_IFTYPE_STATION:
> - filterout_non_associated_bssid = true;
> - break;
> - case NL80211_IFTYPE_UNSPECIFIED:
> - case NL80211_IFTYPE_AP:
> - default:
> - break;
> - }
> - if (filterout_non_associated_bssid) {
> + if (rtlpriv->psc.rfpwr_state != ERFON)
> + return;
> +
> + if (check_bssid) {
> + u8 tmp;
> if (IS_NORMAL_CHIP(rtlhal->version)) {
> - switch (rtlphy->current_io_type) {
> - case IO_CMD_RESUME_DM_BY_SCAN:
> - reg_rcr |= (RCR_CBSSID_DATA | RCR_CBSSID_BCN);
> - rtlpriv->cfg->ops->set_hw_reg(hw,
> - HW_VAR_RCR, (u8 *)(&reg_rcr));
> - /* enable update TSF */
> - _rtl92cu_set_bcn_ctrl_reg(hw, 0, BIT(4));
> - break;
> - case IO_CMD_PAUSE_DM_BY_SCAN:
> - reg_rcr &= ~(RCR_CBSSID_DATA | RCR_CBSSID_BCN);
> - rtlpriv->cfg->ops->set_hw_reg(hw,
> - HW_VAR_RCR, (u8 *)(&reg_rcr));
> - /* disable update TSF */
> - _rtl92cu_set_bcn_ctrl_reg(hw, BIT(4), 0);
> - break;
> - }
> + reg_rcr |= (RCR_CBSSID_DATA | RCR_CBSSID_BCN);
> + tmp = BIT(4);
> } else {
> - reg_rcr |= (RCR_CBSSID);
> - rtlpriv->cfg->ops->set_hw_reg(hw, HW_VAR_RCR,
> - (u8 *)(&reg_rcr));
> - _rtl92cu_set_bcn_ctrl_reg(hw, 0, (BIT(4)|BIT(5)));
> + reg_rcr |= RCR_CBSSID;
> + tmp = BIT(4) | BIT(5);
> }
> - } else if (filterout_non_associated_bssid == false) {
> + rtlpriv->cfg->ops->set_hw_reg(hw, HW_VAR_RCR,
> + (u8 *) (&reg_rcr));
> + _rtl92cu_set_bcn_ctrl_reg(hw, 0, tmp);
> + } else {
> + u8 tmp;
> if (IS_NORMAL_CHIP(rtlhal->version)) {
> - reg_rcr &= (~(RCR_CBSSID_DATA | RCR_CBSSID_BCN));
> - rtlpriv->cfg->ops->set_hw_reg(hw, HW_VAR_RCR,
> - (u8 *)(&reg_rcr));
> - _rtl92cu_set_bcn_ctrl_reg(hw, BIT(4), 0);
> + reg_rcr &= ~(RCR_CBSSID_DATA | RCR_CBSSID_BCN);
> + tmp = BIT(4);
> } else {
> - reg_rcr &= (~RCR_CBSSID);
> - rtlpriv->cfg->ops->set_hw_reg(hw, HW_VAR_RCR,
> - (u8 *)(&reg_rcr));
> - _rtl92cu_set_bcn_ctrl_reg(hw, (BIT(4)|BIT(5)), 0);
> + reg_rcr &= ~RCR_CBSSID;
> + tmp = BIT(4) | BIT(5);
> }
> + reg_rcr &= (~(RCR_CBSSID_DATA | RCR_CBSSID_BCN));
> + rtlpriv->cfg->ops->set_hw_reg(hw,
> + HW_VAR_RCR, (u8 *) (&reg_rcr));
> + _rtl92cu_set_bcn_ctrl_reg(hw, tmp, 0);
> }
> }
>
> +/*========================================================================== */
> +
> int rtl92cu_set_network_type(struct ieee80211_hw *hw, enum nl80211_iftype type)
> {
> + struct rtl_priv *rtlpriv = rtl_priv(hw);
> +
> if (_rtl92cu_set_media_status(hw, type))
> return -EOPNOTSUPP;
> - _rtl92cu_set_check_bssid(hw, type);
> +
> + if (rtlpriv->mac80211.link_state == MAC80211_LINKED) {
> + if (type != NL80211_IFTYPE_AP)
> + rtl92cu_set_check_bssid(hw, true);
> + } else {
> + rtl92cu_set_check_bssid(hw, false);
> + }
> +
> return 0;
> }
>
> Index: linux-2.6/drivers/net/wireless/rtlwifi/base.h
> ===================================================================
> --- linux-2.6.orig/drivers/net/wireless/rtlwifi/base.h
> +++ linux-2.6/drivers/net/wireless/rtlwifi/base.h
> @@ -143,5 +143,8 @@ extern struct attribute_group rtl_attrib
> int rtlwifi_rate_mapping(struct ieee80211_hw *hw,
> bool isht, u8 desc_rate, bool first_ampdu);
> bool rtl_tx_mgmt_proc(struct ieee80211_hw *hw, struct sk_buff *skb);
> +struct sk_buff *rtl_make_del_ba(struct ieee80211_hw *hw,
> + u8 *sa, u8 *bssid, u16 tid);
> +void rtl_lps_change_work_callback(struct work_struct *work);
>
> #endif
> Index: linux-2.6/drivers/net/wireless/rtlwifi/pci.c
> ===================================================================
> --- linux-2.6.orig/drivers/net/wireless/rtlwifi/pci.c
> +++ linux-2.6/drivers/net/wireless/rtlwifi/pci.c
> @@ -939,7 +939,7 @@ static void _rtl_pci_prepare_bcn_tasklet
> return;
> }
>
> -static void rtl_lps_leave_work_callback(struct work_struct *work)
> +void rtl_lps_leave_work_callback(struct work_struct *work)
> {
> struct rtl_works *rtlworks =
> container_of(work, struct rtl_works, lps_leave_work);


--
John W. Linville Someday the world will need a hero, and you
[email protected] might be all we have. Be ready.