2011-12-09 14:50:22

by Wolfgang Breyha

Subject: iwlwifi havoc on some APs (rekeying?)

Hi!

I recently upgraded my HP 2540p laptop with an Intel 6200 AGN to Fedora 16 and
suddenly had big trouble keeping the machine stable. After a while I
narrowed it down to iwlagn in all available F16 kernels (3.1.0 - 3.1.4).

Meanwhile I'm running compat-wireless iwlwifi from today (with the
regression patch from Nikolay), but the trouble is still there. The kernel is
the latest from F16 (3.1.4).

Card's PCI ID: 8086:4239 (rev 35)
dmesg from iwlwifi:
> [ 11.091363] Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:d
> [ 11.091366] Copyright(c) 2003-2011 Intel Corporation
> [ 11.091429] iwlwifi 0000:43:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
> [ 11.091438] iwlwifi 0000:43:00.0: setting latency timer to 64
> [ 11.091459] iwlwifi 0000:43:00.0: pci_resource_len = 0x00002000
> [ 11.091461] iwlwifi 0000:43:00.0: pci_resource_base = fb444000
> [ 11.091464] iwlwifi 0000:43:00.0: HW Revision ID = 0x35
> [ 11.091537] iwlwifi 0000:43:00.0: irq 49 for MSI/MSI-X
> [ 11.091592] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEBUG enabled
> [ 11.091594] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEBUGFS enabled
> [ 11.091596] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEVICE_TRACING disabled
> [ 11.091598] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEVICE_SVTOOL enabled
> [ 11.091605] iwlwifi 0000:43:00.0: Detected Intel(R) Centrino(R) Advanced-N 6200 AGN, REV=0x74
> [ 11.091724] iwlwifi 0000:43:00.0: L1 Disabled; Enabling L0S
> [ 11.107567] iwlwifi 0000:43:00.0: device EEPROM VER=0x436, CALIB=0x6
> [ 11.107570] iwlwifi 0000:43:00.0: Device SKU: 0X1f0
> [ 11.107596] iwlwifi 0000:43:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
> [ 11.116014] iwlwifi 0000:43:00.0: loaded firmware version 9.221.4.1 build 25532
> [ 11.116327] Registered led device: phy0-led

wpa_supplicant is: wpa_supplicant-0.7.3-11.fc16.i686
driven by NetworkManager: NetworkManager-0.9.2-1.fc16.i686
via wext.

I've two APs for tests. One is a Ubiquiti UniFi bgn (also Atheros/hostapd
based). No problems with this one so far. Works like a charm.

The other one is my Soekris 4801 equipped with an
00:0e.0 168c:001a Ethernet controller: Atheros Communications Inc. AR2413
802.11bg NIC (rev 01)
running on openwrt 10.x with hostapd.

If I connect to this one and start e.g. video streaming, it takes only a few
minutes until my machine gets completely unstable. Various applications crash,
sound starts crackling, video shows artifacts. Large downloads also
help to kill various components: gnome-shell, firefox, ... It looks like
iwlwifi writes to random memory.

Scanresults from "iw wlan0 scan" for this AP:
BSS 00:80:48:xx:xx:xx (on wlan0)
TSF: 58415136285 usec (0d, 16:13:35)
freq: 2437
beacon interval: 100
capability: ESS Privacy ShortPreamble ShortSlotTime (0x0431)
signal: -45.00 dBm
last seen: 4490 ms ago
SSID: xxxx
Supported rates: 1.0* 2.0* 5.5* 6.0 9.0 11.0* 12.0 18.0
DS Parameter set: channel 6
ERP: <no flags>
Extended supported rates: 24.0 36.0 48.0 54.0
WMM: * Parameter version 1
* BE: CW 7-1023, AIFSN 2, TXOP 2048 usec
* BK: CW 15-1023, AIFSN 7
* VI: CW 7-15, AIFSN 2, TXOP 3008 usec
* VO: acm CW 3-7, AIFSN 2, TXOP 1504 usec
RSN: * Version: 1
* Group cipher: TKIP
* Pairwise ciphers: CCMP TKIP
* Authentication suites: PSK
* Capabilities: (0x0000)

Usually I use CCMP only, but activated TKIP for testing purposes. But it
makes no difference at all.

I activated debug output from iwlwifi and besides the recurring
> Dec 9 14:42:17 hpwb kernel: [ 616.473854] iwlwifi 0000:43:00.0: U iwlagn_request_scan Scanning while associated...
> Dec 9 14:42:17 hpwb kernel: [ 616.473866] iwlwifi 0000:43:00.0: U iwl_send_cmd_sync Attempting to send sync command REPLY_SCAN_CMD
> Dec 9 14:42:17 hpwb kernel: [ 616.473869] iwlwifi 0000:43:00.0: U iwl_send_cmd_sync Setting HCMD_ACTIVE for command REPLY_SCAN_CMD
> Dec 9 14:42:17 hpwb kernel: [ 616.476081] iwlwifi 0000:43:00.0: I iwl_tx_cmd_complete Clearing HCMD_ACTIVE for command REPLY_SCAN_CMD
... I very often see stuff like...
> Dec 9 14:42:07 hpwb kernel: [ 606.751372] iwlwifi 0000:43:00.0: I iwl_force_reset force reset rejected
> Dec 9 14:42:13 hpwb kernel: [ 612.907705] iwlwifi 0000:43:00.0: I is_lq_table_valid Channel 6 is not an HT channel
> Dec 9 14:42:13 hpwb kernel: [ 612.932685] iwlwifi 0000:43:00.0: I is_lq_table_valid Channel 6 is not an HT channel
> Dec 9 14:42:17 hpwb kernel: [ 616.473816] iwlwifi 0000:43:00.0: I iwl_force_reset perform force reset (0)
> Dec 9 14:42:17 hpwb kernel: [ 616.473825] iwlwifi 0000:43:00.0: I iwl_force_rf_reset perform radio reset.

Another difference between the two APs is that my soekris uses rekeying. I
don't have proof yet that it's the cause, but troubles often start after I
see....
> Dec 9 14:47:23 hpwb kernel: [ 922.682961] iwlwifi 0000:43:00.0: I iwl_tx_cmd_complete Clearing HCMD_ACTIVE for command REPLY_SCAN_CMD
> Dec 9 14:47:23 hpwb kernel: [ 922.885333] iwlwifi 0000:43:00.0: I iwl_force_reset force reset rejected
> Dec 9 14:47:27 hpwb kernel: [ 926.672222] iwlwifi 0000:43:00.0: I iwl_force_reset perform force reset (0)
> Dec 9 14:47:27 hpwb kernel: [ 926.672233] iwlwifi 0000:43:00.0: I iwl_force_rf_reset perform radio reset.
> Dec 9 14:47:27 hpwb kernel: [ 926.672254] iwlwifi 0000:43:00.0: U iwlagn_request_scan Scanning while associated...
> Dec 9 14:47:27 hpwb kernel: [ 926.672265] iwlwifi 0000:43:00.0: U iwl_send_cmd_sync Attempting to send sync command REPLY_SCAN_CMD
> Dec 9 14:47:27 hpwb kernel: [ 926.672271] iwlwifi 0000:43:00.0: U iwl_send_cmd_sync Setting HCMD_ACTIVE for command REPLY_SCAN_CMD
> Dec 9 14:47:27 hpwb kernel: [ 926.674473] iwlwifi 0000:43:00.0: I iwl_tx_cmd_complete Clearing HCMD_ACTIVE for command REPLY_SCAN_CMD
> Dec 9 14:47:33 hpwb kernel: [ 932.686199] iwlwifi 0000:43:00.0: U iwl_send_add_sta Adding sta 0 (00:80:48:xx:xx:xx) synchronously
> Dec 9 14:47:33 hpwb kernel: [ 932.686209] iwlwifi 0000:43:00.0: U iwl_send_cmd_sync Attempting to send sync command REPLY_ADD_STA
> Dec 9 14:47:33 hpwb kernel: [ 932.686214] iwlwifi 0000:43:00.0: U iwl_send_cmd_sync Setting HCMD_ACTIVE for command REPLY_ADD_STA
> Dec 9 14:47:33 hpwb kernel: [ 932.686271] iwlwifi 0000:43:00.0: I iwl_process_add_sta_resp Processing response for adding station 0
> Dec 9 14:47:33 hpwb kernel: [ 932.686278] iwlwifi 0000:43:00.0: I iwl_process_add_sta_resp REPLY_ADD_STA PASSED
> Dec 9 14:47:33 hpwb kernel: [ 932.686285] iwlwifi 0000:43:00.0: I iwl_sta_ucode_activate STA id 0 addr 00:80:48:xx:xx:xx already present in uCode (according to driver)
> Dec 9 14:47:33 hpwb kernel: [ 932.686294] iwlwifi 0000:43:00.0: I iwl_process_add_sta_resp Added station id 0 addr 00:80:48:xx:xx:xx
> Dec 9 14:47:33 hpwb kernel: [ 932.686302] iwlwifi 0000:43:00.0: I iwl_process_add_sta_resp Added station according to cmd buffer 00:80:48:xx:xx:xx
> Dec 9 14:47:33 hpwb kernel: [ 932.686313] iwlwifi 0000:43:00.0: I iwl_tx_cmd_complete Clearing HCMD_ACTIVE for command REPLY_ADD_STA
> Dec 9 14:47:33 hpwb kernel: [ 932.686423] iwlwifi 0000:43:00.0: U iwl_send_add_sta Adding sta 0 (00:80:48:xx:xx:xx) synchronously
> Dec 9 14:47:33 hpwb kernel: [ 932.686430] iwlwifi 0000:43:00.0: U iwl_send_cmd_sync Attempting to send sync command REPLY_ADD_STA
> Dec 9 14:47:33 hpwb kernel: [ 932.686437] iwlwifi 0000:43:00.0: U iwl_send_cmd_sync Setting HCMD_ACTIVE for command REPLY_ADD_STA
> Dec 9 14:47:33 hpwb kernel: [ 932.686487] iwlwifi 0000:43:00.0: I iwl_process_add_sta_resp Processing response for adding station 0
> Dec 9 14:47:33 hpwb kernel: [ 932.686499] iwlwifi 0000:43:00.0: I iwl_process_add_sta_resp REPLY_ADD_STA PASSED
> Dec 9 14:47:33 hpwb kernel: [ 932.686509] iwlwifi 0000:43:00.0: I iwl_sta_ucode_activate STA id 0 addr 00:80:48:xx:xx:xx already present in uCode (according to driver)
> Dec 9 14:47:33 hpwb kernel: [ 932.686522] iwlwifi 0000:43:00.0: I iwl_process_add_sta_resp Added station id 0 addr 00:80:48:xx:xx:xx
> Dec 9 14:47:33 hpwb kernel: [ 932.686533] iwlwifi 0000:43:00.0: I iwl_process_add_sta_resp Added station according to cmd buffer 00:80:48:xx:xx:xx
> Dec 9 14:47:33 hpwb kernel: [ 932.686546] iwlwifi 0000:43:00.0: I iwl_tx_cmd_complete Clearing HCMD_ACTIVE for command REPLY_ADD_STA
> Dec 9 14:47:34 hpwb kernel: [ 933.835898] iwlwifi 0000:43:00.0: I iwl_force_reset perform force reset (0)
> Dec 9 14:47:34 hpwb kernel: [ 933.835908] iwlwifi 0000:43:00.0: I iwl_force_rf_reset perform radio reset.

wpa_supplicant said:
WPA: Group rekeying completed with 00:80:48:xx:xx:xx [GTK=TKIP]
at that time.

There are no crash dumps in dmesg!

If you need more information or debug output please let me know.

With kind regards,
Wolfgang Breyha
--
Wolfgang Breyha <[email protected]> | http://www.blafasel.at/
Vienna University Computer Center | Austria


2011-12-09 18:22:38

by Wey-Yi Guy

Subject: Re: iwlwifi havoc on some APs (rekeying?)

Johannes,

Anything changed in this area?

Thanks
Wey

On Fri, 2011-12-09 at 10:05 -0800, Wolfgang Breyha wrote:
> On 09/12/11 17:02, Guy, Wey-Yi wrote:
> > Hi Wolfgang,
> >
> > I don't have a similar AP here for testing. Looking at the log,
> > it looks like there is an issue while you scan.
> >
> > Could you please try this patch and see if it makes any difference? It is a
> > long shot, but I'd just like to see if it makes any difference at all.
> >
> > Once you apply the patch, please make sure P2P is disabled in .config.
>
> Done that... iwlwifi says...
> > [ 833.360213] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEBUG enabled
> > [ 833.360215] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEBUGFS enabled
> > [ 833.360217] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEVICE_TRACING disabled
> > [ 833.360219] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEVICE_SVTOOL enabled
> > [ 833.360222] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_P2P disabled
> > [ 833.360228] iwlwifi 0000:43:00.0: Detected Intel(R) Centrino(R) Advanced-N 6200 AGN, REV=0x74
>
> But it still occurs. This time it started again right after the second
> rekeying event. wpa_supplicant logged:
> > Associated with 00:80:48:xx:xx:xx
> > WPA: Key negotiation completed with 00:80:48:xx:xx:xx [PTK=CCMP GTK=TKIP]
> > CTRL-EVENT-CONNECTED - Connection to 00:80:48:xx:xx:xx completed (reauth) [id=0 id_str=]
> > WPA: Group rekeying completed with 00:80:48:xx:xx:xx [GTK=TKIP]
> > WPA: Group rekeying completed with 00:80:48:xx:xx:xx [GTK=TKIP]
>
> I failed to mention in my first mail that, before upgrading to F16, I used
> F14 with compat-wireless from around March 2011. With that
> version I had no trouble with the very same AP and my 2540p.
>
> Greetings, Wolfgang
> ..
>
>
>
> >
> > Thanks
> > Wey



2011-12-12 11:21:55

by Wolfgang Breyha

Subject: Re: iwlwifi havoc on some APs (rekeying?)

Johannes Berg wrote, on 12.12.2011 11:12:
> Interesting. We'll have to play with this. Can you try running a 64-bit
> kernel maybe and see if that fixes things?

I can try with an F16 x86_64 live image. It's not the latest iwlwifi then, but
it happened with the F16 iwlagn too anyway ;-) I'll do that in the evening and
report.

Greetings, Wolfgang
--
Wolfgang Breyha <[email protected]> | http://www.blafasel.at/
Vienna University Computer Center | Austria


2011-12-09 18:26:58

by Berg, Johannes

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Fri, 2011-12-09 at 09:23 -0800, Guy, Wey-Yi wrote:
> Johannes,
>
> Anything changed in this area?

What's "this area"? This isn't exactly the first report of random memory
corruption ... I think the last one was something with 32/64 bit but I
don't remember.

johannes


2011-12-10 13:02:52

by Wolfgang Breyha

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On 09/12/11 19:26, Johannes Berg wrote:
> What's "this area"? This isn't exactly the first report of random memory
> corruption ... I think the last one was something with 32/64 bit but I
> don't remember.

To narrow down "this area" I tried the following:

*) I reconfigured my soekris AP hostapd to use wpa_group_rekey=0
.... and it has worked like a charm since then

*) I reconfigured the ubiquiti UniFi AP to use wpa_group_rekey=120
... and I'm pretty sure now that it is exactly the second rekeying
that starts messing up the system. Right after wpa_supplicant logs
WPA: Group rekeying completed with 06:27:22:xx:xx:xx [GTK=TKIP]
the second time, video streaming with VLC immediately leads to all the
mentioned side effects.

Maybe it's possible to reproduce it with any hostapd based AP with
active wpa_group_rekey.
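
For anyone trying to reproduce this, a hostapd.conf fragment along the
following lines should turn on periodic group rekeying. It is only a sketch:
wpa_group_rekey and its value of 120 seconds come from the test above, while
the other settings are assumptions chosen to match the scan output (RSN with
PSK, CCMP and TKIP as pairwise ciphers). It is not the actual configuration of
either AP, and the interface, SSID and passphrase settings are deliberately
left out.

```
# hostapd.conf fragment (illustrative only, not the poster's real config)
wpa=2                     # RSN/WPA2, as shown in the scan output
wpa_key_mgmt=WPA-PSK      # PSK authentication
wpa_pairwise=TKIP CCMP    # mixed pairwise ciphers, so the group cipher becomes TKIP
wpa_group_rekey=120       # renew the group key (GTK) every 120 seconds
```

With an interval this short, the second "Group rekeying completed" message
should show up in the wpa_supplicant log about four minutes after association.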

Greetings, Wolfgang

2011-12-14 22:10:33

by Johannes Berg

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Wed, 2011-12-14 at 15:22 +0100, Wolfgang Breyha wrote:
> Johannes Berg wrote, on 13.12.2011 16:38:
> > The program will allocate 2GiB memory (edit to suit, should be OK on
> > your machine), fill them with 0x94 and then continually scan them for
> > corruption. Identifying what kind of corruption happened will hopefully
> > allow me to figure out where it's coming from.
> >
> > It prints out the wrong data & resets the memory so new corruption later
> > is also identified.
>
> Ok, I had only 20 minutes yesterday evening, but the results are not very
> pleasant, because there are no results:(

Ouch! So we aren't actually just dealing with random memory corruption?!

> I did the usual steps to reproduce the case on my laptop:
>
> *) stay connected on the "working" AP (no rekeying)
> --> *) new here: start "mc"
> *) echo "1" >...iwlwifi/debug
> *) open multitail, firefox, vlc
> *) connect to the other AP (rekeying every 20 seconds currently)
> *) start video stream
> *) wait for the second "group rekey finished"
> *) watch the artifacts, closing applications and listen to crackling sound
>
> Everything happened exactly the same way as always. BUT "mc" didn't show any
> corrupted memory regions.

Hmmm. Yeah if it was random memory corruption that should have done
something.

> I already tried to remember which applications crashed, but currently I'm not
> able to give them a clear category like "all (network-)IO" or "all
> audio/video". Allocating memory seems not to be enough to trigger "something".
> <brainstorm mode>Maybe mmap'ed regions are affected?</brainstorm mode>

But my tool was using mmap ;-) I can't think why mmap would make a
difference though.

> Watching a video is only one way to notice that issue. Simply starting firefox
> with a group of tabs open is an other and has a high probability to
> immediately crash firefox while fetching the contents.
>
> Watching a video shows the coincidence with the second rekeying event best.
>
> I'll try to give "mc" some more time and start it after/before the others as
> soon as my pre-x-mas schedule allows it.

I wouldn't. If it was really memory corruption, this should've caught
it. No way firefox would crash while "mc" would run fine. Back to square
1!

Maybe we somehow invented the best fuzzer ever? Wrongly decrypted
packets being sent up without being dropped, and your video stream and
firefox hating random binary data in the middle of the input?

johannes


2011-12-14 22:51:22

by Daniel Halperin

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Wed, Dec 14, 2011 at 2:10 PM, Johannes Berg
<[email protected]> wrote:
>
> Maybe we somehow invented the best fuzzer ever? Wrongly decrypted
> packets being sent up without being dropped, and your video stream and
> firefox hating random binary data in the middle of the input?
>

Seems unlikely---how would random binary data get properly put into
the right TCP streams...

Dan

2011-12-09 17:00:55

by Wey-Yi Guy

Subject: Re: iwlwifi havoc on some APs (rekeying?)

Hi Wolfgang,

I don't have a similar AP here for testing. Looking at the log,
it looks like there is an issue while you scan.

Could you please try this patch and see if it makes any difference? It is a
long shot, but I'd just like to see if it makes any difference at all.

Once you apply the patch, please make sure P2P is disabled in .config.

Thanks
Wey



Attachments:
0008-iwlwifi-P2P-is-not-enabled-by-default.patch (2.91 kB)

2011-12-15 19:47:08

by Wolfgang Breyha

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On 2011-12-15 14:01, Johannes Berg wrote:
> Ok, thanks. That's a long time since known good, probably won't help
> much unfortunately.
>
> I'm not even sure tracing would help here since evidently something is
> going completely wrong. I think we'll have to try harder to reproduce it
> in our lab.

I was just testing your suggestion to use swcrypto=1 with iwlwifi from last
week and again the result is more than surprising.

If I set it the issue appears immediately on *BOTH* APs regardless of
rekeying. I expected the exact opposite at best!?

I will continue trying with F15 now... stay tuned;-)

Greetings, Wolfgang
--
Wolfgang Breyha <[email protected]> | http://www.blafasel.at/
Vienna University Computer Center | Austria

2011-12-12 17:48:06

by Johannes Berg

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Mon, 2011-12-12 at 12:21 +0100, Wolfgang Breyha wrote:
> Johannes Berg wrote, on 12.12.2011 11:12:
> > Interesting. We'll have to play with this. Can you try running a 64-bit
> > kernel maybe and see if that fixes things?
>
> I can try with a F16 x86_64 live image. It's not the latest iwlwifi then, but
> it happened with the F16 iwlagn too anyway ;-) I'll do that in the evening and
> report.

A colleague pointed out that due to the PCI address space hole we might
be seeing larger addresses -- could you try booting the 32 bit kernel
with "mem=3G" on the command line (edit it in grub maybe?) and see if
you can still reproduce it then?

johannes


2011-12-12 18:17:10

by Wolfgang Breyha

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On 2011-12-12 18:48, Johannes Berg wrote:
> A colleague pointed out that due to the PCI address space hole we might
> be seeing larger addresses -- could you try booting the 32 bit kernel
> with "mem=3G" on the command line (edit it in grub maybe?) and see if
> you can still reproduce it then?

mem=3G changes nothing except the amount of memory;-) Exactly the same
issues occur.

Trying 64bits now...

Greetings, Wolfgang
--
Wolfgang Breyha <[email protected]> | http://www.blafasel.at/
Vienna University Computer Center | Austria

2011-12-14 22:55:14

by Johannes Berg

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Wed, 2011-12-14 at 14:51 -0800, Daniel Halperin wrote:
> On Wed, Dec 14, 2011 at 2:10 PM, Johannes Berg
> <[email protected]> wrote:
> >
> > Maybe we somehow invented the best fuzzer ever? Wrongly decrypted
> > packets being sent up without being dropped, and your video stream and
> > firefox hating random binary data in the middle of the input?
> >
>
> Seems unlikely---how would random binary data get properly put into
> the right TCP streams...

Hm, good point. So memory corruption in the applications using the
network? Why would it be restricted to those applications? (not to even
mention why would it be restricted to 32-bit kernels on 64-bit systems
then?)

I wish I could reproduce it, maybe it would be possible to bisect?

johannes


2011-12-15 00:46:59

by Wolfgang Breyha

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On 14/12/11 23:55, Johannes Berg wrote:
> I wish I could reproduce it, maybe it would be possible to bisect?

I was able to build and test the following releases of compat-wireless:
compat-wireless-2011-10-12
compat-wireless-2011-09-27
compat-wireless-2011-08-27
compat-wireless-2011-08-26

All are affected in the same way. 2011-10-12 gave me a hard system
freeze without output to messages. Most likely the "worst case scenario"
of the issue.

compat-wireless-2011-08-23 and some older releases I tried do not build
on F16 3.1.4.

IIRC 2011-03-25 was the release I used with F14.

Greetings, Wolfgang

2011-12-09 18:42:50

by Wey-Yi Guy

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Fri, 2011-12-09 at 10:26 -0800, Berg, Johannes wrote:
> On Fri, 2011-12-09 at 09:23 -0800, Guy, Wey-Yi wrote:
> > Johannes,
> >
> > Anything changed in this area?
>
> What's "this area"? This isn't exactly the first report of random memory
> corruption ... I think the last one was something with 32/64 bit but I
> don't remember.
>

Yup, what I mean is: do you have any thoughts on this? I am not sure how
to reproduce this issue with the APs we have here.

Something happened between March 2011 and now; bisect it? But we need to
be able to reproduce the failure case first.

Thanks
Wey
>



2011-12-13 15:38:13

by Johannes Berg

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Mon, 2011-12-12 at 19:17 +0100, Wolfgang Breyha wrote:
> On 2011-12-12 18:48, Johannes Berg wrote:
> > A colleague pointed out that due to the PCI address space hole we might
> > be seeing larger addresses -- could you try booting the 32 bit kernel
> > with "mem=3G" on the command line (edit it in grub maybe?) and see if
> > you can still reproduce it then?
>
> mem=3G changes nothing except the amount of memory;-) Exactly the same
> issues occur.

Hm, unfortunately, I can't reproduce this problem, even with 2 minute
GTK rekey intervals. Could you run this program on your machine while
you reproduce it?

http://johannes.sipsolutions.net/files/mc.c.txt

(just save as "mc.c" and compile with gcc mc.c -o mc)

The program will allocate 2 GiB of memory (edit to suit, should be OK on
your machine), fill it with 0x94 and then continually scan it for
corruption. Identifying what kind of corruption happened will hopefully
allow me to figure out where it's coming from.

It prints out the wrong data & resets the memory so new corruption later
is also identified.
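
For readers without access to the linked file, the idea can be sketched
roughly as follows. This is NOT the original mc.c: the function names
region_create() and region_scan() are made up for illustration, and the only
details taken from the thread are the 0x94 fill pattern, the use of mmap, and
the "print the wrong data, then reset it" behaviour.

```c
/* Sketch of a memory-corruption canary; not the original mc.c. */
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define FILL_BYTE 0x94           /* pattern named in the mail */

/* Map `len` bytes of anonymous memory and fill them with the pattern.
 * Returns NULL if the mapping fails. (Hypothetical helper name.) */
unsigned char *region_create(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    memset(p, FILL_BYTE, len);
    return p;
}

/* One pass over the region: print every byte that no longer holds the
 * pattern, restore it so later corruption is seen again, and return the
 * number of bad bytes found. (Hypothetical helper name.) */
size_t region_scan(unsigned char *buf, size_t len)
{
    size_t bad = 0;

    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != FILL_BYTE) {
            printf("offset 0x%zx: found 0x%02x\n", i, (unsigned)buf[i]);
            buf[i] = FILL_BYTE;
            bad++;
        }
    }
    return bad;
}
```

A driver would call region_create() once with the desired size (2 GiB in the
mail) and then call region_scan() in an endless loop, perhaps with a short
sleep between passes.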

Thanks!

johannes


2011-12-12 10:09:27

by Wolfgang Breyha

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On 12/12/11 10:04, Johannes Berg wrote:
> Are you running a 32-bit kernel on a 64-bit machine by any chance?
> Somebody else reported that using a 64-bit kernel helped similar issues.

That's right. It's F16.i686 on a
# cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 37
model name : Intel(R) Core(TM) i5 CPU M 540 @ 2.53GHz
stepping : 5
cpu MHz : 1199.000
cache size : 3072 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx rdtscp lm
constant_tsc arch_perfmon pebs bts xtopology nonstop_tsc aperfmperf pni
pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm
sse4_1 sse4_2 popcnt aes lahf_lm ida arat dts tpr_shadow vnmi
flexpriority ept vpid
bogomips : 5053.69
clflush size : 64
cache_alignment : 64
address sizes : 36 bits physical, 48 bits virtual
power management:

VT is enabled in BIOS (which is not the default on that model).
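
In the flags list above, lm marks a CPU capable of 64-bit long mode, and aes
marks AES-NI support, which becomes relevant in the closing messages of this
thread. A minimal sketch of checking both, assuming /proc/cpuinfo-style text
with the flags on a single line as the kernel prints them (the listing above
is wrapped by the mail client) and a machine string such as the one returned
by os.uname().machine. The function names are hypothetical.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no flags line found


def describe(cpuinfo_text, kernel_machine):
    """Summarise the CPU/kernel combination discussed in this thread."""
    flags = cpu_flags(cpuinfo_text)
    return {
        # "lm" (long mode) means the CPU can run a 64-bit kernel
        "cpu_is_64bit": "lm" in flags,
        # assumption: 32-bit x86 kernels report one of these machine names
        "kernel_is_32bit": kernel_machine in ("i386", "i486", "i586", "i686"),
        # "aes" means the CPU offers the AES-NI instructions
        "has_aes_ni": "aes" in flags,
    }
```

On a live system this could be called as
describe(open("/proc/cpuinfo").read(), os.uname().machine) after importing os.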

Greetings, Wolfgang

2011-12-12 10:12:38

by Johannes Berg

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Mon, 2011-12-12 at 11:09 +0100, Wolfgang Breyha wrote:
> On 12/12/11 10:04, Johannes Berg wrote:
> > Are you running a 32-bit kernel on a 64-bit machine by any chance?
> > Somebody else reported that using a 64-bit kernel helped similar issues.
>
> That's right. It's F16.i686 on a
> # cat /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 37
> model name : Intel(R) Core(TM) i5 CPU M 540 @ 2.53GHz

Interesting. We'll have to play with this. Can you try running a 64-bit
kernel maybe and see if that fixes things?

johannes


2011-12-15 13:01:10

by Johannes Berg

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Thu, 2011-12-15 at 01:46 +0100, Wolfgang Breyha wrote:
> On 14/12/11 23:55, Johannes Berg wrote:
> > I wish I could reproduce it, maybe it would be possible to bisect?
>
> I was able to build and test the following releases of compat-wireless:
> compat-wireless-2011-10-12
> compat-wireless-2011-09-27
> compat-wireless-2011-08-27
> compat-wireless-2011-08-26
>
> All are affected in the same way. 2011-10-12 gave me a hard system
> freeze without output to messages. Most likely the "worst case scenario"
> of the issue.
>
> compat-wireless-2011-08-23 and some older releases I tried do not build
> on F16 3.1.4.
>
> IIRC 2011-03-25 was the release I used with F14.

Ok, thanks. That's a long time since known good, probably won't help
much unfortunately.

I'm not even sure tracing would help here since evidently something is
going completely wrong. I think we'll have to try harder to reproduce it
in our lab.

johannes


2011-12-14 14:22:40

by Wolfgang Breyha

Subject: Re: iwlwifi havoc on some APs (rekeying?)

Johannes Berg wrote, on 13.12.2011 16:38:
> The program will allocate 2 GiB of memory (edit to suit, should be OK on
> your machine), fill it with 0x94 and then continually scan it for
> corruption. Identifying what kind of corruption happened will hopefully
> allow me to figure out where it's coming from.
>
> It prints out the wrong data & resets the memory so new corruption later
> is also identified.

Ok, I had only 20 minutes yesterday evening, but the results are not very
pleasant, because there are no results:(

I did the usual steps to reproduce the case on my laptop:

*) stay connected on the "working" AP (no rekeying)
--> *) new here: start "mc"
*) echo "1" >...iwlwifi/debug
*) open multitail, firefox, vlc
*) connect to the other AP (rekeying every 20 seconds currently)
*) start video stream
*) wait for the second "group rekey finished"
*) watch the artifacts, closing applications and listen to crackling sound

Everything happened exactly the same way as always. BUT "mc" didn't show any
corrupted memory regions.

I already tried to remember which applications crashed, but currently I'm not
able to give them a clear category like "all (network-)IO" or "all
audio/video". Allocating memory seems not to be enough to trigger "something".
<brainstorm mode>Maybe mmap'ed regions are affected?</brainstorm mode>

Watching a video is only one way to notice the issue. Simply starting firefox
with a group of tabs open is another, and has a high probability of
immediately crashing firefox while it fetches the contents.

Watching a video shows the coincidence with the second rekeying event best.

I'll try to give "mc" some more time and start it after/before the others as
soon as my pre-x-mas schedule allows it.

Greetings, Wolfgang
--
Wolfgang Breyha <[email protected]> | http://www.blafasel.at/
Vienna University Computer Center | Austria

2011-12-12 09:04:20

by Johannes Berg

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Sat, 2011-12-10 at 14:02 +0100, Wolfgang Breyha wrote:
> On 09/12/11 19:26, Johannes Berg wrote:
> > What's "this area"? This isn't exactly the first report of random memory
> > corruption ... I think the last one was something with 32/64 bit but I
> > don't remember.
>
> To narrow down "this area" I tried the following:
>
> *) I reconfigured my soekris AP hostapd to use wpa_group_rekey=0
> .... and it works like a charm since then
>
> *) I reconfigured the Ubiquiti UniFi AP to use wpa_group_rekey=120
> ... and I'm pretty sure now that it is exactly the second rekeying
> that starts messing up the system. Exactly after wpa_supplicant logs
> WPA: Group rekeying completed with 06:27:22:xx:xx:xx [GTK=TKIP]
> the second time, video streaming with VLC immediately leads to all the
> mentioned side effects.
>
> Maybe it's possible to reproduce it with any hostapd based AP with
> active wpa_group_rekey.

Ok, thanks.

Are you running a 32-bit kernel on a 64-bit machine by any chance?
Somebody else reported that using a 64-bit kernel helped similar issues.

johannes


2011-12-14 23:07:36

by Wolfgang Breyha

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On 14/12/11 23:55, Johannes Berg wrote:
> I wish I could reproduce it, maybe it would be possible to bisect?

I can try my luck building older compat-wireless releases as far as they
are compatible with 3.1.x.

And I'll try my best to check which applications are prone to crash
while transmitting.

Greetings, Wolfgang


2011-12-09 18:05:43

by Wolfgang Breyha

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On 09/12/11 17:02, Guy, Wey-Yi wrote:
> Hi Wolfgang,
>
> I don't have a similar AP here for testing. Looking at the log, it
> looks like there is an issue while you scan.
>
> Could you please try this patch and see if it makes any difference? It is
> a long shot, but I'd just like to see if it makes any difference at all.
>
> Once you apply the patch, please make sure P2P is disabled in .config

Done that... iwlwifi says...
> [ 833.360213] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEBUG enabled
> [ 833.360215] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEBUGFS enabled
> [ 833.360217] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEVICE_TRACING disabled
> [ 833.360219] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_DEVICE_SVTOOL enabled
> [ 833.360222] iwlwifi 0000:43:00.0: CONFIG_IWLWIFI_P2P disabled
> [ 833.360228] iwlwifi 0000:43:00.0: Detected Intel(R) Centrino(R) Advanced-N 6200 AGN, REV=0x74

But it still occurs. This time it started again right after the second
rekeying event. wpa_supplicant logged:
> Associated with 00:80:48:xx:xx:xx
> WPA: Key negotiation completed with 00:80:48:xx:xx:xx [PTK=CCMP GTK=TKIP]
> CTRL-EVENT-CONNECTED - Connection to 00:80:48:xx:xx:xx completed (reauth) [id=0 id_str=]
> WPA: Group rekeying completed with 00:80:48:xx:xx:xx [GTK=TKIP]
> WPA: Group rekeying completed with 00:80:48:xx:xx:xx [GTK=TKIP]
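
The correlation with the second group rekey can be checked mechanically. A
minimal sketch, assuming wpa_supplicant log lines in the format quoted above.
group_rekeys is a hypothetical helper, and the regular expression is derived
only from those sample lines.

```python
import re

# Matches e.g. "WPA: Group rekeying completed with 00:80:48:xx:xx:xx [GTK=TKIP]"
REKEY = re.compile(r"WPA: Group rekeying completed with (\S+) \[GTK=(\w+)\]")


def group_rekeys(lines):
    """Yield (count, bssid, cipher) for each group-rekey line, in order."""
    count = 0
    for line in lines:
        match = REKEY.search(line)
        if match:
            count += 1
            yield count, match.group(1), match.group(2)
```

Feeding it the log while streaming and noting when the count reaches 2 gives a
marker to compare against the first crash or video artifact.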

I failed to mention in my first mail that, before upgrading to F16, I used
F14 with compat-wireless from around March 2011. With that version I had
no trouble with exactly the same AP and my 2540p.

Greetings, Wolfgang



>
> Thanks
> Wey

2011-12-12 19:40:02

by Wolfgang Breyha

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On 2011-12-12 18:48, Johannes Berg wrote:
> A colleague pointed out that due to the PCI address space hole we might
> be seeing larger addresses -- could you try booting the 32 bit kernel
> with "mem=3G" on the command line (edit it in grub maybe?) and see if
> you can still reproduce it then?

F16 kernel-3.1.0-7.x86_64 works with and without rekeying.

Greetings, Wolfgang
--
Wolfgang Breyha <[email protected]> | http://www.blafasel.at/
Vienna University Computer Center | Austria

2012-02-24 07:25:08

by Johannes Berg

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Sat, 2012-02-18 at 14:09 +0100, Johannes Berg wrote:
> On Fri, 2011-12-09 at 15:50 +0100, Wolfgang Breyha wrote:
>
> > If I connect to this one and start e.g. video streaming it takes only a few
> > minutes and my machine gets completely unstable. Various applications crash,
> > sound starts crackling, video gets artefacts. Also doing large downloads
> > helps to kill various components. gnome-shell, firefox, ... looks like
> > iwlwifi writes random memory.
>
> Just to give this some public closure as we've been discussing this in
> private for a while now -- the problem was caused by the AES-NI FPU
> state corruption problem that's been discussed elsewhere for a while.

Wolfgang, all,

Greg is asking for testers in conjunction with this problem to release
3.2.8: https://plus.google.com/109995262342451767357/posts/TVTyT1DtQKJ

johannes


2012-02-18 13:09:45

by Johannes Berg

Subject: Re: iwlwifi havoc on some APs (rekeying?)

On Fri, 2011-12-09 at 15:50 +0100, Wolfgang Breyha wrote:

> If I connect to this one and start e.g. video streaming it takes only a few
> minutes and my machine gets completely unstable. Various applications crash,
> sound starts crackling, video gets artefacts. Also doing large downloads
> helps to kill various components. gnome-shell, firefox, ... looks like
> iwlwifi writes random memory.

Just to give this some public closure as we've been discussing this in
private for a while now -- the problem was caused by the AES-NI FPU
state corruption problem that's been discussed elsewhere for a while.

johannes