=================================================
skb_push called from rt2x00 panics occasionally
=================================================
We have been experiencing occasional skb_push panics with the rt2x00 driver
throughout the stable v3.10 series. The problems occur on hosts acting as
access points, managed by hostapd 2.0. Some hosts reportedly panic
several times per week, some almost never. All hosts have identical
kernels and WiFi adapters.

We have not been able to reproduce the panics in a controlled manner.

I'd be very happy to get some advice on how to proceed in debugging and
fixing this problem and, even better, to understand the whole issue
thoroughly. Unfortunately, all I know now is that the skb headroom is
not big enough for the frame the AP is trying to transmit, right?
Hopefully someone can shed some light on this problem.
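
To make the headroom question concrete, here is a minimal userspace sketch of the condition involved. It assumes only the general sk_buff behaviour that skb_push moves the data pointer backwards and panics if it would pass the start of the buffer. The names toy_skb, toy_headroom, toy_push and toy_push_demo are hypothetical and exist only for illustration; this is not the kernel implementation and says nothing about where rt2x00 runs out of headroom.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the sk_buff fields involved here. */
struct toy_skb {
    unsigned char *head; /* start of the allocated buffer */
    unsigned char *data; /* start of the frame data */
    size_t len;          /* bytes of frame data */
};

/* Bytes available in front of the frame data. */
static size_t toy_headroom(const struct toy_skb *skb)
{
    return (size_t)(skb->data - skb->head);
}

/*
 * Prepend n bytes. Returns the new data pointer, or NULL in the case
 * where the kernel's skb_push would panic instead of returning.
 */
static unsigned char *toy_push(struct toy_skb *skb, size_t n)
{
    if (n > toy_headroom(skb))
        return NULL; /* not enough headroom */
    skb->data -= n;
    skb->len += n;
    return skb->data;
}

/* Example: 8 bytes of headroom allow one 8-byte push, then nothing more. */
static void toy_push_demo(void)
{
    unsigned char buf[64];
    struct toy_skb skb = { buf, buf + 8, 32 };

    assert(toy_push(&skb, 8) == buf);  /* uses up all the headroom */
    assert(toy_push(&skb, 4) == NULL); /* the case that panics in the kernel */
}
```

The point of the model is only that every push must be covered by headroom reserved earlier; which push in the TX path is not covered is exactly the open question.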
Kernel version
==============
The latest tested and affected kernel is stable v3.10.23; earlier v3.10
kernels are affected as well. We do not have reliable test results from
newer kernels due to the random nature of this bug.
Output
======
Unfortunately, we don't have textual traces, just photos. Here is one:
http://i.imgur.com/2XD8X3r.jpg
Environment
===========
USB Wifi adapter
----------------
Bus 001 Device 003: ID 148f:3070 Ralink Technology, Corp. RT2870/RT3070 Wireless Adapter
wlan0     Link encap:Ethernet HWaddr 00:1e:ab:20:56:30
          UP BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
Wiphy phy0
    Band 1:
        Capabilities: 0x172
            HT20/HT40
            Static SM Power Save
            RX Greenfield
            RX HT20 SGI
            RX HT40 SGI
            RX STBC 1-stream
            Max AMSDU length: 3839 bytes
            No DSSS/CCK HT40
        Maximum RX AMPDU length 65535 bytes (exponent: 0x003)
        Minimum RX AMPDU time spacing: 2 usec (0x04)
        HT RX MCS rate indexes supported: 0-7, 32
        TX unequal modulation not supported
        HT TX Max spatial streams: 1
        HT TX MCS rate indexes supported may differ
        Frequencies:
            * 2412 MHz [1] (20.0 dBm)
            * 2417 MHz [2] (20.0 dBm)
            * 2422 MHz [3] (20.0 dBm)
            * 2427 MHz [4] (20.0 dBm)
            * 2432 MHz [5] (20.0 dBm)
            * 2437 MHz [6] (20.0 dBm)
            * 2442 MHz [7] (20.0 dBm)
            * 2447 MHz [8] (20.0 dBm)
            * 2452 MHz [9] (20.0 dBm)
            * 2457 MHz [10] (20.0 dBm)
            * 2462 MHz [11] (20.0 dBm)
            * 2467 MHz [12] (20.0 dBm)
            * 2472 MHz [13] (20.0 dBm)
            * 2484 MHz [14] (disabled)
        Bitrates (non-HT):
            * 1.0 Mbps
            * 2.0 Mbps (short preamble supported)
            * 5.5 Mbps (short preamble supported)
            * 11.0 Mbps (short preamble supported)
            * 6.0 Mbps
            * 9.0 Mbps
            * 12.0 Mbps
            * 18.0 Mbps
            * 24.0 Mbps
            * 36.0 Mbps
            * 48.0 Mbps
            * 54.0 Mbps
    max # scan SSIDs: 4
    max scan IEs length: 2257 bytes
    Coverage class: 0 (up to 0m)
    Supported Ciphers:
        * WEP40 (00-0f-ac:1)
        * WEP104 (00-0f-ac:5)
        * TKIP (00-0f-ac:2)
        * CCMP (00-0f-ac:4)
    Available Antennas: TX 0 RX 0
    Supported interface modes:
        * IBSS
        * managed
        * AP
        * AP/VLAN
        * WDS
        * monitor
        * mesh point
    software interface modes (can always be added):
        * AP/VLAN
        * monitor
    valid interface combinations:
        * #{ AP, mesh point } <= 8,
          total <= 8, #channels <= 1
    Supported commands:
        * new_interface
        * set_interface
        * new_key
        * new_beacon
        * new_station
        * new_mpath
        * set_mesh_params
        * set_bss
        * authenticate
        * associate
        * deauthenticate
        * disassociate
        * join_ibss
        * join_mesh
        * set_tx_bitrate_mask
        * action
        * frame_wait_cancel
        * set_wiphy_netns
        * set_channel
        * set_wds_peer
        * Unknown command (84)
        * Unknown command (87)
        * Unknown command (85)
        * Unknown command (89)
        * Unknown command (92)
        * testmode
        * connect
        * disconnect
    Supported TX frame types:
        * IBSS: 0x00 0x10 0x20 0x30 0x40 0x50 0x60 0x70 0x80 0x90 0xa0 0xb0 0xc0 0xd0 0xe0 0xf0
        * managed: 0x00 0x10 0x20 0x30 0x40 0x50 0x60 0x70 0x80 0x90 0xa0 0xb0 0xc0 0xd0 0xe0 0xf0
        * AP: 0x00 0x10 0x20 0x30 0x40 0x50 0x60 0x70 0x80 0x90 0xa0 0xb0 0xc0 0xd0 0xe0 0xf0
        * AP/VLAN: 0x00 0x10 0x20 0x30 0x40 0x50 0x60 0x70 0x80 0x90 0xa0 0xb0 0xc0 0xd0 0xe0 0xf0
        * mesh point: 0x00 0x10 0x20 0x30 0x40 0x50 0x60 0x70 0x80 0x90 0xa0 0xb0 0xc0 0xd0 0xe0 0xf0
        * P2P-client: 0x00 0x10 0x20 0x30 0x40 0x50 0x60 0x70 0x80 0x90 0xa0 0xb0 0xc0 0xd0 0xe0 0xf0
        * P2P-GO: 0x00 0x10 0x20 0x30 0x40 0x50 0x60 0x70 0x80 0x90 0xa0 0xb0 0xc0 0xd0 0xe0 0xf0
        * Unknown mode (10): 0x00 0x10 0x20 0x30 0x40 0x50 0x60 0x70 0x80 0x90 0xa0 0xb0 0xc0 0xd0 0xe0 0xf0
    Supported RX frame types:
        * IBSS: 0x40 0xb0 0xc0 0xd0
        * managed: 0x40 0xd0
        * AP: 0x00 0x20 0x40 0xa0 0xb0 0xc0 0xd0
        * AP/VLAN: 0x00 0x20 0x40 0xa0 0xb0 0xc0 0xd0
        * mesh point: 0xb0 0xc0 0xd0
        * P2P-client: 0x40 0xd0
        * P2P-GO: 0x00 0x20 0x40 0xa0 0xb0 0xc0 0xd0
        * Unknown mode (10): 0x40 0xd0
    Device supports RSN-IBSS.
    HT Capability overrides:
        * MCS: ff ff ff ff ff ff ff ff ff ff
        * maximum A-MSDU length
        * supported channel width
        * short GI for 40 MHz
        * max A-MPDU length exponent
        * min MPDU start spacing
    Device supports TX status socket option.
    Device supports HT-IBSS.
--
Tuomas
On Thu, Jan 30, 2014 at 12:33:42PM +0000, Tuomas Räsänen wrote:
> =================================================
> skb_push called from rt2x00 panics occasionally
> =================================================
>
> We have been experiencing occasional skb_push panics with rt2x00 driver
> throughout the stable v3.10 -series. Problems occur with hosts acting as
> accesspoints, managed by hostapd 2.0. Some hosts panic reportedly
> several times per week, some almost never. All hosts have identical
> kernels and WiFi adapters.
>
> We have not been able to reproduce panics in a controlled manner.
>
> I'd be very happy to get some advice how to proceed in debugging and fixing
> this problem and even better, to understand the whole issue
> thoroughly. Unfortunately, all I know now is that the skb headroom is
> not big enough for the frame the AP is trying to transmit, right?
> Hopefully someone can shed some light on this problem.
Yes, it is possible that we have no headroom left for a frame that is
retransmitted several times, but I do not see a bug in the code for that
so far. We remove and re-add the L2 pad for a retransmitted frame, which
should not use more headroom than the first transmission did (as long as
mac80211 does not increase the header length, but that seems neither
possible nor sensible).
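
The headroom accounting behind that argument can be sketched as a small userspace model. It assumes the L2 pad rounds the 802.11 header length up to the next multiple of 4 bytes, and it tracks nothing but a headroom counter. The names toy_frame, toy_l2pad_size, toy_insert_l2pad, toy_remove_l2pad and toy_retransmit_demo are hypothetical; this is not the driver code and is not meant as a diagnosis.

```c
#include <assert.h>

/* Assumption: the header is padded up to the next multiple of 4 bytes. */
static unsigned int toy_l2pad_size(unsigned int hdrlen)
{
    return (0u - hdrlen) & 3u;
}

/* Hypothetical model: only the headroom counter and header length. */
struct toy_frame {
    unsigned int headroom; /* bytes free in front of the frame */
    unsigned int hdrlen;   /* current 802.11 header length */
};

/* Returns 0 on success, -1 where a real skb_push would panic. */
static int toy_insert_l2pad(struct toy_frame *f)
{
    unsigned int pad = toy_l2pad_size(f->hdrlen);

    if (pad > f->headroom)
        return -1;
    f->headroom -= pad;
    return 0;
}

/* Gives back the pad computed from the current header length. */
static void toy_remove_l2pad(struct toy_frame *f)
{
    f->headroom += toy_l2pad_size(f->hdrlen);
}

/* Example: a 26-byte QoS data header needs 2 bytes of pad. */
static void toy_retransmit_demo(void)
{
    struct toy_frame f = { 2, 26 };
    int i;

    /* Same header length every time: insert/remove is headroom-neutral. */
    for (i = 0; i < 5; i++) {
        assert(toy_insert_l2pad(&f) == 0);
        toy_remove_l2pad(&f);
    }
    assert(f.headroom == 2);

    /* Insert with a 26-byte header, remove as if it were 24 bytes:
     * nothing is given back, so the next insert no longer fits. */
    assert(toy_insert_l2pad(&f) == 0);
    f.hdrlen = 24;
    toy_remove_l2pad(&f);
    f.hdrlen = 26;
    assert(toy_insert_l2pad(&f) == -1);
}
```

In this model the insert/remove pair leaks headroom only when the header length seen at removal differs from the one seen at insertion, which is the caveat stated above.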

For now you can try to increase the headroom as in the patch below, or
load the module with nohwcrypt=1. Please check whether either of those
stops the panics or just makes them less frequent.

You can also configure kdump to dump memory, to see what the skb looks
like when the kernel crashes.
Stanislaw
diff --git a/drivers/net/wireless/rt2x00/rt2x00.h b/drivers/net/wireless/rt2x00/rt2x00.h
index e4ba2ce..50fc3e7 100644
--- a/drivers/net/wireless/rt2x00/rt2x00.h
+++ b/drivers/net/wireless/rt2x00/rt2x00.h
@@ -115,7 +115,7 @@
  * Constants for extra TX headroom for alignment purposes.
  */
 #define RT2X00_ALIGN_SIZE 4 /* Only whole frame needs alignment */
-#define RT2X00_L2PAD_SIZE 8 /* Both header & payload need alignment */
+#define RT2X00_L2PAD_SIZE 12 /* Both header & payload need alignment */
 
 /*
  * Standard timing and size defines.